Are Java streams able to lazily reduce from map/filter conditions?
I am using a functional programming style to solve the LeetCode easy question Count the Number of Consistent Strings. The premise of this question is simple: count the number of strings for which the predicate "all characters are in another set" holds.
I have two approaches, one which I am fairly certain behaves as I want it to, and the other which I am less sure about. Both produce the correct output, but ideally they would stop evaluating other elements after the output is in a final state.
public int countConsistentStrings(String allowed, String[] words) {
    final Set<Character> set = allowed.chars()
            .mapToObj(c -> (char)c)
            .collect(Collectors.toCollection(HashSet::new));
    return (int)Arrays.stream(words)
            .filter(word ->
                    word.chars()
                            .allMatch(c -> set.contains((char)c))
            )
            .count();
}
In this solution, to the best of my knowledge, the allMatch call will terminate and evaluate to false at the first c for which the predicate does not hold, skipping the remaining values in that stream.
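For example, this is the behavior I expect (a rough sketch just to observe it; the peek call, the literal word and the hand-built allowedSet are only for illustration, not part of my solution):

Set<Character> allowedSet = Set.of('a', 'b', 'c');
// peek prints every character that allMatch actually inspects
boolean consistent = "zabc".chars()
        .peek(c -> System.out.println("checking: " + (char) c))
        .allMatch(c -> allowedSet.contains((char) c));
// only "checking: z" is printed, because allMatch gives up at the first mismatch
System.out.println(consistent);   // false

My second approach, which I am less sure about, is: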
public int countConsistentStrings(String allowed, String[] words) {
    Set<Character> set = allowed.chars()
            .mapToObj(c -> (char)c)
            .collect(Collectors.toCollection(HashSet::new));
    return (int)Arrays.stream(words)
            .filter(word ->
                    word.chars()
                            .mapToObj(c -> set.contains((char)c))
                            .reduce((a,b) -> a&&b)
                            .orElse(false)
            )
            .count();
}
In this solution, the same logic is used, but instead of allMatch I use map and then reduce. Logically, after a single false value comes from the map stage, reduce will always evaluate to false. I know Java streams are lazy, but I am unsure when they "know" just how lazy they can be. Will this be less efficient than using allMatch, or will laziness ensure the same operation?
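To make the question concrete, this is the kind of check I had in mind (a sketch only; the AtomicInteger counter and the peek call are pure instrumentation, and the word and set are placeholders):

Set<Character> allowedSet = Set.of('a', 'b', 'c');
AtomicInteger pulled = new AtomicInteger();
boolean viaReduce = "zaaaa".chars()
        .peek(c -> pulled.incrementAndGet())              // counts how many characters the pipeline pulls
        .mapToObj(c -> allowedSet.contains((char) c))
        .reduce((a, b) -> a && b)
        .orElse(false);
// pulled.get() would be 1 if this short-circuits like allMatch, 5 if every character is processed
System.out.println(viaReduce + " after " + pulled.get() + " characters");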
Lastly, in this code, we can see that the value of x will always be 0: after filtering for only non-negative numbers, their sum will always be non-negative (assume no overflow), so taking the minimum of that sum and a hardcoded 0 will be 0. Will the stream be lazy enough to always evaluate this to 0, or will it work to reduce every element after the filter anyway?
List<Integer> list = new ArrayList<>();
...
/*Some values added to list*/
...
int x = list.stream()
        .filter(i -> i >= 0)
        .reduce((a,b) -> Math.min(a+b, 0))
        .orElse(0);
To summarize the above, how does one know when the Java stream will be lazy? There are lazy opportunities that I see in the code, but how can I guarantee that my code will be as lazy as possible?
The actual term you’re asking for is short-circuiting. From the java.util.stream package documentation:
Further, some operations are deemed short-circuiting operations. An intermediate operation is short-circuiting if, when presented with infinite input, it may produce a finite stream as a result. A terminal operation is short-circuiting if, when presented with infinite input, it may terminate in finite time. Having a short-circuiting operation in the pipeline is a necessary, but not sufficient, condition for the processing of an infinite stream to terminate normally in finite time.
The term “lazy” only applies to intermediate operations and means that they only perform work when being requested by the terminal operation. This is always the case, so when you don’t chain a terminal operation, no intermediate operation will ever process any element.
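For example (a minimal sketch; the print statement only exists to make the processing visible):

Stream<String> pipeline = Stream.of("a", "b", "c")
        .map(s -> {
            System.out.println("mapping " + s);   // not printed yet: map is lazy
            return s.toUpperCase();
        });
// nothing has been printed so far, because no terminal operation has been chained

pipeline.forEach(s -> { });   // now "mapping a", "mapping b", "mapping c" appear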
Finding out whether a terminal operation is short-circuiting is rather easy. Go to the Stream API documentation and check whether the particular terminal operation’s documentation contains the sentence “This is a short-circuiting terminal operation.” allMatch has it; reduce does not.
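To illustrate what that sentence means in practice (a sketch only; the infinite stream exists purely to make the difference visible):

// allMatch is short-circuiting: it terminates even on an infinite stream,
// because the first odd number already decides the result.
boolean allEven = IntStream.iterate(0, i -> i + 1)
        .allMatch(i -> i % 2 == 0);               // false, after looking at 0 and 1

// reduce is not short-circuiting: the equivalent pipeline keeps asking for elements
// and would never terminate, so it is left commented out here.
// boolean allEvenViaReduce = IntStream.iterate(0, i -> i + 1)
//         .mapToObj(i -> i % 2 == 0)
//         .reduce((a, b) -> a && b)
//         .orElse(false);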
This does not mean that such optimizations based on logic or algebra are impossible, but the responsibility lies with the JVM’s optimizer, which might do the same for loops. However, this requires inlining of all involved methods to be sure that this condition always applies and that there are no side effects which must be retained. This behavioral compatibility implies that even if the processing gets optimized away, a peek(System.out::println) would keep printing all elements as if they were processed. In practice, you should not expect such optimizations, as the Stream implementation code is too complex for the optimizer.