Java 8: performance of Streams vs Collections
Stop using
LinkedList
for anything but heavy removing from the middle of the list using iterator.Stop writing benchmarking code by hand, use JMH.
Proper benchmarks:
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(StreamVsVanilla.N)
public class StreamVsVanilla {
public static final int N = 10000;
static List<Integer> sourceList = new ArrayList<>();
static {
for (int i = 0; i < N; i++) {
sourceList.add(i);
}
}
@Benchmark
public List<Double> vanilla() {
List<Double> result = new ArrayList<>(sourceList.size() / 2 + 1);
for (Integer i : sourceList) {
if (i % 2 == 0){
result.add(Math.sqrt(i));
}
}
return result;
}
@Benchmark
public List<Double> stream() {
return sourceList.stream()
.filter(i -> i % 2 == 0)
.map(Math::sqrt)
.collect(Collectors.toCollection(
() -> new ArrayList<>(sourceList.size() / 2 + 1)));
}
}
Result:
Benchmark Mode Samples Mean Mean error Units
StreamVsVanilla.stream avgt 10 17.588 0.230 ns/op
StreamVsVanilla.vanilla avgt 10 10.796 0.063 ns/op
Just as I expected stream implementation is fairly slower. JIT is able to inline all lambda stuff but doesn't produce as perfectly concise code as vanilla version.
Generally, Java 8 streams are not magic. They couldn't speedup already well-implemented things (with, probably, plain iterations or Java 5's for-each statements replaced with Iterable.forEach()
and Collection.removeIf()
calls). Streams are more about coding convenience and safety. Convenience -- speed tradeoff is working here.
1) You see time less than 1 second using you benchmark. That means there can be strong influence of side effects on your results. So, I increased your task 10 times
int max = 10_000_000;
and ran your benchmark. My results:
Collections: Elapsed time: 8592999350 ns (8.592999 seconds)
Streams: Elapsed time: 2068208058 ns (2.068208 seconds)
Parallel streams: Elapsed time: 7186967071 ns (7.186967 seconds)
without edit (int max = 1_000_000
) results were
Collections: Elapsed time: 113373057 ns (0.113373 seconds)
Streams: Elapsed time: 135570440 ns (0.135570 seconds)
Parallel streams: Elapsed time: 104091980 ns (0.104092 seconds)
It's like your results: stream is slower than collection. Conclusion: much time were spent for stream initialization/values transmitting.
2) After increasing task stream became faster (that's OK), but parallel stream remained too slow. What's wrong? Note: you have collect(Collectors.toList())
in you command. Collecting to single collection essentially introduces performance bottleneck and overhead in case of concurrent execution. It is possible to estimate the relative cost of overhead by replacing
collecting to collection -> counting the element count
For streams it can be done by collect(Collectors.counting())
. I got results:
Collections: Elapsed time: 41856183 ns (0.041856 seconds)
Streams: Elapsed time: 546590322 ns (0.546590 seconds)
Parallel streams: Elapsed time: 1540051478 ns (1.540051 seconds)
That' s for a big task! (int max = 10000000
) Conclusion: collecting items to collection took majority of time. The slowest part is adding to list. BTW, simple ArrayList
is used for Collectors.toList()
.