Is .collect guaranteed to be ordered on parallel streams?
TL;DR
Yes, the order is guaranteed.
Stream.collect() API documentation
The starting place is to look at what determines whether a reduction is concurrent or not. Stream.collect()
's description says the following:
If the stream is parallel, and the
Collector
is concurrent, and either the stream is unordered or the collector is unordered, then a concurrent reduction will be performed (seeCollector
for details on concurrent reduction.)
The first condition is satisfied: the stream is parallel. How about the second and third: is the Collector
concurrent and unordered?
Collectors.toList() API documentation
toList()
's documentation reads:
Returns a
Collector
that accumulates the input elements into a newList
. There are no guarantees on the type, mutability, serializability, or thread-safety of theList
returned; if more control over the returnedList
is required, usetoCollection(Supplier)
.Returns:
a Collector which collects all the input elements into a List, in encounter order
An operation that works in encounter order operates on the elements in their original order. This overrides parallelness.
Implementation code
Inspecting the implementation of Collectors.java
confirms that toList()
does not include the CONCURRENT
or UNORDERED
traits.
public static <T>
Collector<T, ?, List<T>> toList() {
return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
(left, right) -> { left.addAll(right); return left; },
CH_ID);
}
// ...
static final Set<Collector.Characteristics> CH_ID
= Collections.unmodifiableSet(EnumSet.of(Collector.Characteristics.IDENTITY_FINISH));
Notice how the collector has the CH_ID
trait set, which has only the single IDENTITY_FINISH
trait. CONCURRENT
and UNORDERED
are not there, so the reduction cannot be concurrent.
A non-concurrent reduction means that, if the stream is parallel, collection can proceed in parallel, but it will be split into several thread-confined intermediate results which are then combined. This ensures the combined result is in encounter order.
See also: Why parallel stream get collected sequentially in Java 8
You are guaranteed to get the elements in encounter order.
From documentation of toList
:
Returns: a Collector which collects all the input elements into a List, in encounter order
See java.util.streams summary for more information on the term "encounter order".
Furthermore, List#spliterator
documentation requires that the all implementations of List
produce spliterators that are ORDERED
:
The Spliterator reports Spliterator.SIZED and Spliterator.ORDERED. Implementations should document the reporting of additional characteristic values.
Oddly enough, while List
interface requires iterator()
to produce elements in "proper sequence", spliterator()
is only required to be ordered but not specifically required to follow the list's natural ordering.
So, to answer your question, the list produced by toList
is guaranteed to contain the elements exactly as the source list's spliterator orders them. It does not matter whether the stream is parallel or sequential.