Difference between Java 8 streams and RxJava observables
Solution 1:
Short answer
All sequence/stream processing libraries offer very similar APIs for building pipelines. The differences lie in the APIs for handling multi-threading and composing pipelines.
Long answer
RxJava is quite different from Stream. Of all JDK things, the closest to rx.Observable is perhaps the java.util.stream.Collector Stream + CompletableFuture combo (which comes at the cost of dealing with an extra monad layer, i.e. having to handle conversion between Stream<CompletableFuture<T>> and CompletableFuture<Stream<T>>).
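To make the "extra monad layer" concrete, here is a minimal sketch of that conversion using only the JDK. The class name FutureSequencing and the helper sequence are hypothetical names chosen for illustration; nothing like them ships with the JDK, and other ways of writing the conversion exist.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FutureSequencing {

    // Hypothetical helper (not part of the JDK): turns a stream of futures
    // into one future that completes with a stream of all the results.
    static <T> CompletableFuture<Stream<T>> sequence(Stream<CompletableFuture<T>> futures) {
        // Materialize first: a Stream can be traversed only once, and the
        // futures are needed twice (for allOf and to read the results).
        List<CompletableFuture<T>> list = futures.collect(Collectors.toList());
        return CompletableFuture
                .allOf(list.toArray(new CompletableFuture<?>[0]))
                // join() does not block here: every future is already complete.
                .thenApply(done -> list.stream().map(CompletableFuture::join));
    }

    public static void main(String[] args) {
        Stream<CompletableFuture<Integer>> pending =
                Stream.of(1, 2, 3).map(n -> CompletableFuture.supplyAsync(() -> n * n));
        List<Integer> squares = sequence(pending).join().collect(Collectors.toList());
        System.out.println(squares); // prints [1, 4, 9]
    }
}
```

The bookkeeping (collecting to a list, allOf, re-streaming) is the cost the answer refers to; an Observable models "many values, possibly later" in a single type.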
There are significant differences between Observable and Stream:
- Streams are pull-based, Observables are push-based. This may sound too abstract, but it has significant consequences that are very concrete.
- A Stream can only be used once; an Observable can be subscribed to many times.
- Stream#parallel() splits the sequence into partitions; Observable#subscribeOn() and Observable#observeOn() do not. It is tricky to emulate Stream#parallel() behavior with Observable. It once had a .parallel() method, but this method caused so much confusion that .parallel() support was moved to a separate repository: ReactiveX/RxJavaParallel: Experimental Parallel Extensions for RxJava. More details are in another answer.
- Stream#parallel() does not allow you to specify a thread pool to use, unlike most RxJava methods, which accept an optional Scheduler. Since all stream instances in a JVM use the same fork-join pool, adding .parallel() can accidentally affect the behaviour of another module of your program.
- Streams lack time-related operations like Observable#interval(), Observable#window() and many others; this is mostly because Streams are pull-based, and upstream has no control over when to emit the next element downstream.
- Streams offer a restricted set of operations in comparison with RxJava. E.g. Streams lack cut-off operations (takeWhile(), takeUntil()); the workaround using Stream#anyMatch() is limited: it is a terminal operation, so you can't use it more than once per stream.
- As of JDK 8, there's no Stream#zip() operation, which is quite useful sometimes.
- Streams are hard to construct by yourself, Observable can be constructed in many ways. EDIT: As noted in comments, there are ways to construct a Stream. However, since there's no non-terminal short-circuiting, you can't e.g. easily generate a Stream of lines in a file (JDK provides Files#lines() and BufferedReader#lines() out of the box though, and other similar scenarios can be managed by constructing a Stream from an Iterator).
- Observable offers a resource management facility (Observable#using()); you can wrap an IO stream or a mutex with it and be sure that the user will not forget to free the resource: it will be disposed of automatically on subscription termination. Stream has an onClose(Runnable) method, but the handler only runs when close() is called, either manually or via try-with-resources. E.g. you have to keep in mind that Files#lines() must be enclosed in a try-with-resources block.
- Observables are synchronized all the way through (I didn't actually check whether the same is true for Streams). This spares you from thinking about whether basic operations are thread-safe (the answer is always 'yes', unless there's a bug), but the concurrency-related overhead will be there, whether your code needs it or not.
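Two of the Stream-side points in the list above (single use, and close handlers that only fire on an explicit close) can be checked with plain JDK code. This is a small sketch; the class and method names are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

final class StreamLifecycle {

    // Returns true if traversing the same Stream a second time is rejected.
    static boolean secondTraversalFails() {
        Stream<String> words = Stream.of("a", "b", "c");
        words.forEach(w -> { });          // first terminal operation: fine
        try {
            words.forEach(w -> { });      // second one on the same instance
            return false;
        } catch (IllegalStateException expected) {
            return true;                  // the stream was already consumed
        }
    }

    // Returns true if the onClose handler ran only once the stream was closed.
    static boolean closeHandlerRunsOnClose() {
        AtomicBoolean released = new AtomicBoolean(false);
        try (Stream<String> words = Stream.of("a", "b").onClose(() -> released.set(true))) {
            words.count();                // the terminal operation does not close the stream
            if (released.get()) {
                return false;
            }
        }                                 // try-with-resources calls close() here
        return released.get();
    }

    public static void main(String[] args) {
        System.out.println(secondTraversalFails());     // prints true
        System.out.println(closeHandlerRunsOnClose());  // prints true
    }
}
```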
Round-up
RxJava differs from Streams significantly. Real RxJava alternatives are other implementations of Reactive Streams, e.g. the relevant part of Akka.
Update
There's a trick to use a non-default fork-join pool for Stream#parallel, see Custom thread pool in Java 8 parallel stream.
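The trick is usually sketched as follows: the parallel stream is started from a task submitted to a dedicated ForkJoinPool. Note that this relies on how the stream implementation schedules its work rather than on documented behaviour, so treat it as an assumption that may not hold everywhere. The class and method names are hypothetical.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

final class CustomPoolParallelStream {

    // Sums 1..n with a parallel stream whose terminal operation is invoked
    // from a task running inside `pool` instead of from the calling thread.
    static int parallelSum(ForkJoinPool pool, int n) {
        return pool
                .submit(() -> IntStream.rangeClosed(1, n).parallel().sum())
                .join();                  // wait for the task and return its result
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(4);   // pool size chosen arbitrarily
        try {
            System.out.println(parallelSum(pool, 100)); // prints 5050
        } finally {
            pool.shutdown();
        }
    }
}
```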
Update
All of the above is based on the experience with RxJava 1.x. Now that RxJava 2.x is here, this answer may be out-of-date.
Solution 2:
Java 8 Stream and RxJava look pretty similar. They have look-alike operators (filter, map, flatMap...) but are not built for the same usage.
You can perform asynchronous tasks using RxJava.
With a Java 8 stream, you traverse the items of your collection.
You can do pretty much the same thing in RxJava (traverse the items of a collection) but, as RxJava is focused on concurrent tasks, it uses synchronization, latches, ... So the same task may be slower with RxJava than with a Java 8 stream.
RxJava can be compared to CompletableFuture, but one that can compute more than just one value.
Solution 3:
There are a few technical and conceptual differences. For example, Java 8 streams are single-use, pull-based, synchronous sequences of values, whereas RxJava Observables are re-observable, adaptively push-pull-based, potentially asynchronous sequences of values. RxJava is aimed at Java 6+ and works on Android as well.
Solution 4:
Java 8 Streams are pull based. You iterate over a Java 8 stream consuming each item. And it could be an endless stream.
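A small sketch of what "pull-based" and "endless" mean in practice: the source below never ends, yet it only produces as many values as the consumer asks for. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class PullBasedStream {

    // Takes the first `count` even numbers from an endless stream and records
    // in `generated` every value that was actually pulled from the source.
    static List<Integer> firstEvens(int count, List<Integer> generated) {
        return Stream.iterate(1, n -> n + 1)   // endless source: 1, 2, 3, ...
                .peek(generated::add)           // log each value the source hands out
                .filter(n -> n % 2 == 0)
                .limit(count)                   // the consumer decides when to stop
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> generated = new ArrayList<>();
        System.out.println(firstEvens(3, generated)); // prints [2, 4, 6]
        System.out.println(generated);                // prints [1, 2, 3, 4, 5, 6]
    }
}
```

Nothing is produced until the terminal collect() starts pulling, and on this sequential pipeline the source is not asked for anything past 6.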
An RxJava Observable is by default push-based. You subscribe to an Observable and you get notified when the next item arrives (onNext), when the stream is completed (onCompleted), or when an error occurred (onError).
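To show the shape of that contract without pulling in the library, here is a deliberately tiny, hand-rolled stand-in. It is not rx.Observable: the Observer and Source interfaces and the fromList and recording helpers are hypothetical, and operators, unsubscription and schedulers are all left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PushSketch {

    // Hypothetical callback contract, modelled on the three events named above.
    interface Observer<T> {
        void onNext(T item);
        void onCompleted();
        void onError(Throwable error);
    }

    // Hypothetical minimal source; a stand-in for illustration, NOT RxJava's API.
    interface Source<T> {
        void subscribe(Observer<T> observer);
    }

    // A source that pushes each item to the subscriber and then completes.
    static <T> Source<T> fromList(List<T> items) {
        return observer -> {
            for (T item : items) {
                try {
                    observer.onNext(item);      // the producer drives the flow
                } catch (RuntimeException e) {
                    observer.onError(e);        // terminal event: stop pushing
                    return;
                }
            }
            observer.onCompleted();
        };
    }

    // An observer that writes every event it receives into `log`.
    static <T> Observer<T> recording(List<String> log) {
        return new Observer<T>() {
            @Override public void onNext(T item) { log.add("next:" + item); }
            @Override public void onCompleted() { log.add("completed"); }
            @Override public void onError(Throwable error) { log.add("error:" + error); }
        };
    }

    public static void main(String[] args) {
        Source<Integer> digits = fromList(Arrays.asList(1, 2, 3));
        List<String> log = new ArrayList<>();
        digits.subscribe(PushSketch.<Integer>recording(log));
        digits.subscribe(PushSketch.<Integer>recording(log)); // a second subscription replays the items
        System.out.println(log);
    }
}
```

The consumer never asks for the next element; it only reacts to whatever the source pushes, which is the inverse of iterating a Stream.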
Because with Observable you receive onNext, onCompleted and onError events, you can do some powerful things like combining different Observables into a new one (zip, merge, concat). Other things you could do are caching, throttling, ...
And it uses more or less the same API in different languages (RxJava, Rx in C#, RxJS, ...).
By default RxJava is single threaded. Unless you start using Schedulers, everything will happen on the same thread.
Solution 5:
The existing answers are comprehensive and correct, but a clear example for beginners is lacking. Allow me to put some concrete behind terms like "push/pull-based" and "re-observable". Note: I hate the term Observable (it's a stream for heaven's sake), so I will simply refer to J8 vs RX streams.
Consider a list of integers,
digits = [1,2,3,4,5]
A J8 Stream is a utility to process the collection. For example, even digits can be extracted as,
evens = digits.stream().filter(x -> x % 2 == 0).collect(Collectors.toList())
This is basically Python's map, filter, reduce, a very nice (and long overdue) addition to Java. But what if digits weren't collected ahead of time? What if the digits were streaming in while the app was running? Could we filter the evens in real time?
Imagine a separate thread process is outputting integers at random times while the app is running (--- denotes time)
digits = 12345---6------7--8--9-10--------11--12
In RX, even can react to each new digit and apply the filter in real time
even = -2-4-----6---------8----10------------12
There's no need to store input and output lists. If you want an output list, no problem that's streamable too. In fact, everything is a stream.
evens_stored = even.collect()
This is why terms like "stateless" and "functional" are more associated with RX.