What is Streams3 in Node.js and how does it differ from Streams2?

I've often heard of Streams2 and old-streams, but what is Streams3? It gets mentioned in this talk by Thorsten Lorenz.

Where can I read about it, and what is the difference between Streams2 and Streams3?

Doing a search on Google, I also see it mentioned in the Changelog of Node 0.11.5,

stream: Simplify flowing, passive data listening (streams3) (isaacs)


Solution 1:

I'm going to give this a shot, but I've probably got it wrong. Having never written Streams1 (old-streams) or Streams2 code, I'm probably not the right guy to self-answer this one, but here goes. It seems that a Streams1 API still persists to some degree. In Streams2, streams have two modes: flowing (legacy) and non-flowing. In short, the shim that supported flowing mode is going away. This was the message that led to the patch now called Streams3:

Same API as streams2, but remove the confusing modality of flowing/old mode switch.

  1. Every time read() is called, and returns some data, a data event fires.
  2. resume() will make it call read() repeatedly. Otherwise, no change.
  3. pause() will make it stop calling read() repeatedly.
  4. pipe(dest) and on('data', fn) will automatically call resume().
  5. No switches into old-mode. There's only flowing, and paused. Streams start out paused.
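To make those rules a bit more concrete, here is a minimal sketch of how they look from the outside on a current Node.js release. It assumes the `readableFlowing` property (which was added long after Streams3 itself) as a way to peek at the mode; the in-memory `source` stream and the `states` array are made up for illustration.

```javascript
const { Readable } = require('stream');

// Hypothetical in-memory source; read() is a no-op because we push by hand.
const source = new Readable({ objectMode: true, read() {} });
source.push('a');
source.push(null); // signal end-of-stream

const states = [];
states.push(source.readableFlowing); // null: a fresh stream is not flowing

// Attaching a 'data' listener calls resume() for us.
source.on('data', (chunk) => console.log('data:', chunk));
states.push(source.readableFlowing); // true

source.pause();
states.push(source.readableFlowing); // false: explicitly paused

source.resume();
states.push(source.readableFlowing); // true: flowing again

console.log(states); // [ null, true, false, true ]
```

Note that the chunk itself is only delivered on a later tick, so the `data: a` line prints after the state list.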

Unfortunately, to understand any of this description, which defines Streams3 pretty well, you first need to understand Streams1 and the legacy streams.

Backstory

First, let's take a look at what the Node v0.10.25 docs say about the two modes,

Readable streams have two "modes": a flowing mode and a non-flowing mode. When in flowing mode, data is read from the underlying system and provided to your program as fast as possible. In non-flowing mode, you must explicitly call stream.read() to get chunks of data out. — Node v0.10.25 Docs

Isaac Z. Schlueter said in November slides I dug up:

streams2

  • "suck streams"
  • Instead of 'data' events spewing, call read() to pull data from source
  • Solves all problems (that we know of)

So it seems that in Streams1, you'd create an object and call .on('data', cb) on it. This would register the handler for the event, and then you were at the mercy of the stream. In Streams2, streams have internal buffers and you request data from them explicitly (using .read()). Isaac goes on to specify how backwards compatibility works in Streams2 to keep Streams1 (old-stream) modules functioning:
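As a rough illustration of the pull style described above, here is a sketch that waits for 'readable' and then calls read() until the buffer is empty. The `pullAll` helper and the hand-fed `source` stream are hypothetical names of mine, not something from the slides.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: pull everything that is currently buffered.
// read() returns null once there is nothing left to hand out.
function pullAll(stream) {
  const chunks = [];
  let chunk;
  while ((chunk = stream.read()) !== null) {
    chunks.push(chunk);
  }
  return chunks;
}

// Hypothetical in-memory source, filled by hand for the demo.
const source = new Readable({ objectMode: true, read() {} });
source.push('one');
source.push('two');
source.push(null); // signal end-of-stream

// The consumer decides when to pull; nothing is thrown at it.
source.on('readable', () => {
  for (const chunk of pullAll(source)) {
    console.log('pulled:', chunk);
  }
});
```

The point is that the consumer, not the stream, picks the moment at which data leaves the buffer.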

old-mode streams1 shim

  • New streams can switch into old-mode, where they spew 'data'
  • If you add a 'data' event handler, or call pause() or resume(), then switch
  • Making minimal changes to existing tests to keep us honest

So in Streams2, adding a 'data' handler or calling .pause() or .resume() triggers the shim. And it should, right? In Streams2 you have control over when to .read(), and you're not catching stuff being thrown at you. Triggering the shim switches the stream into a legacy mode that acts independently of Streams2.

Let's take an example from Isaac's slide,

createServer(function(q,s) {
  // ADVISORY only!
  q.pause()
  session(q, function(ses) {
    q.on('data', handler)
    q.resume()
  })
})
  • In Streams1, q starts reading and emitting right away (likely losing data) until the call to q.pause() advises q to stop pulling in data, though it may still emit events to clear what it has already read.
  • In Streams2, q starts off paused; the call to .pause() signals it to emulate the old mode.
  • In Streams3, q starts off paused, having never read from the file handle, which makes q.pause() a no-op. The call to q.on('data', cb) calls q.resume(), which reads until there is no more data in the buffer. The explicit q.resume() that follows then does the same thing.
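Here is a sketch of the same pattern that can be run without a server, using an in-memory stream in place of the request. `handleRequest`, `fakeSession`, and the stream contents are all stand-ins I made up. One caveat: in current Node.js releases, as far as I can tell, an explicit pause() is remembered, so a later on('data') does not resume by itself and the explicit q.resume() is what restarts the flow.

```javascript
const { Readable } = require('stream');

// Hypothetical wrapper around the slide's handler body.
function handleRequest(q, session, handler) {
  q.pause(); // hold the body while the session is looked up
  session(q, function (ses) {
    q.on('data', handler);
    q.resume(); // start delivering the buffered body
  });
}

// Stand-in for the request: an in-memory stream with one chunk.
const q = new Readable({ read() {} });
q.push('body');
q.push(null); // signal end-of-stream

// Stand-in for an asynchronous session lookup.
const fakeSession = (req, done) => setImmediate(done, { user: 'demo' });

handleRequest(q, fakeSession, (chunk) => {
  console.log('got:', chunk.toString());
});
```

Because the chunk sits in the stream's internal buffer while paused, the handler still sees it even though it was attached late.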

Solution 2:

Seems like Streams3 landed in the unstable Node 0.11 line (the 0.11.5 changelog mentions it) and then shipped in io.js 1.0.

Streams 1 supported data being pushed to a stream. There was no consumer control; data was thrown at the consumer whether it was ready or not.

Streams 2 allows data to be pushed to a stream as per Streams 1, or a consumer to pull data from a stream as needed. The consumer can control the flow of data in pull mode (using stream.read() when notified of available data). The stream cannot support both push and pull at the same time.

Streams 3 allows both pull and push on the same stream.
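A small sketch of what that can look like, as I understand it: the stream is kept paused and pulled from with read(), while a 'data' listener on the same stream still gets to see each chunk. The `source`, `observed`, and `pulled` names are just for illustration.

```javascript
const { Readable } = require('stream');

// Hypothetical in-memory source with a single chunk.
const source = new Readable({ objectMode: true, read() {} });
source.push('x');
source.push(null); // signal end-of-stream

const observed = [];

source.pause(); // stay in paused (pull) mode
// Passive listener: because of the explicit pause() above,
// attaching it does not switch the stream into flowing mode.
source.on('data', (chunk) => observed.push(chunk));

// Explicit pull; the same chunk is also emitted as a 'data' event.
const pulled = source.read();

console.log(pulled, observed); // x [ 'x' ]
```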

Great overview here:

https://strongloop.com/strongblog/whats-new-io-js-beta-streams3/

A cached version (accessed 8/2020) is here: https://hackerfall.com/story/whats-new-in-iojs-10-beta-streams-3