In Java streams is peek really only for debugging?
Solution 1:
The important thing you have to understand is that streams are driven by the terminal operation. The terminal operation determines whether all elements have to be processed or any at all. So collect
is an operation that processes each item, whereas findAny
may stop processing items once it encountered a matching element.
And count()
may not process any elements at all when it can determine the size of the stream without processing the items. Since this is an optimization not made in Java 8, but which will be in Java 9, there might be surprises when you switch to Java 9 and have code relying on count()
processing all items. This is also connected to other implementation-dependent details, e.g. even in Java 9, the reference implementation will not be able to predict the size of an infinite stream source combined with limit
while there is no fundamental limitation preventing such prediction.
Since peek
allows “performing the provided action on each element as elements are consumed from the resulting stream”, it does not mandate processing of elements but will perform the action depending on what the terminal operation needs. This implies that you have to use it with great care if you need a particular processing, e.g. want to apply an action on all elements. It works if the terminal operation is guaranteed to process all items, but even then, you must be sure that not the next developer changes the terminal operation (or you forget that subtle aspect).
Further, while streams guarantee to maintain the encounter order for a certain combination of operations even for parallel streams, these guarantees do not apply to peek
. When collecting into a list, the resulting list will have the right order for ordered parallel streams, but the peek
action may get invoked in an arbitrary order and concurrently.
So the most useful thing you can do with peek
is to find out whether a stream element has been processed which is exactly what the API documentation says:
This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline
Solution 2:
The key takeaway from this:
Don't use the API in an unintended way, even if it accomplishes your immediate goal. That approach may break in the future, and it is also unclear to future maintainers.
There is no harm in breaking this out to multiple operations, as they are distinct operations. There is harm in using the API in an unclear and unintended way, which may have ramifications if this particular behavior is modified in future versions of Java.
Using forEach
on this operation would make it clear to the maintainer that there is an intended side effect on each element of accounts
, and that you are performing some operation that can mutate it.
It's also more conventional in the sense that peek
is an intermediate operation which doesn't operate on the entire collection until the terminal operation runs, but forEach
is indeed a terminal operation. This way, you can make strong arguments around the behavior and the flow of your code as opposed to asking questions about if peek
would behave the same as forEach
does in this context.
accounts.forEach(a -> a.login());
List<Account> loggedInAccounts = accounts.stream()
.filter(Account::loggedIn)
.collect(Collectors.toList());