Building a promise chain recursively in javascript - memory considerations

In this answer, a promise chain is built recursively.

Simplified slightly, we have :

function foo() {
    function doo() {
        // always return a promise
        if (/* more to do */) {
            return doSomethingAsync().then(doo);
        } else {
            return Promise.resolve();
        }
    }
    return doo(); // returns a promise
}

Presumably this would give rise to a call stack and a promise chain - ie "deep" and "wide".

I would anticipate a memory spike larger than either performing a recursion or building a promise chain alone.

Is this so?
Has anyone considered the memory issues of building a chain in this way?
Would memory consumption differ between promise libs?

Solution 1:

a call stack and a promise chain - ie "deep" and "wide".

Actually, no. There is no promise chain here as we know it from doSomeThingAsynchronous.then(doSomethingAsynchronous).then(doSomethingAsynchronous).… (which is what Promise.each or Promise.reduce might do to sequentially execute handlers if it was written this way).

What we are facing here is a resolve chain¹ - what happens in the end, when the base case of the recursion is met, is something like Promise.resolve(Promise.resolve(Promise.resolve(…))). This is only "deep", not "wide", if you want to call it that.

I would anticipate a memory spike larger than either performing a recursion or building a promise chain alone.

Not a spike actually. You'd slowly, over time, build a bulk of promises that are resolved with the innermost one, all representing the same result. When, at the end of your task, the condition is fulfilled and the innermost promise resolved with an actual value, all of these promises should be resolved with the same value. That would end up with O(n) cost for walking up the resolve chain (if implemented naively, this might even be done recursively and cause a stack overflow). After that, all the promises except for the outermost can become garbage collected.

In contrast, a promise chain that is built by something like

[…].reduce(function(prev, val) {
    // successive execution of fn for all vals in array
    return prev.then(() => fn(val));
}, Promise.resolve())

would show a spike, allocating n promise objects at the same time, and then slowly resolve them one by one, garbage-collecting the previous ones until only the settled end promise is alive.

memory
  ^     resolve      promise "then"    (tail)
  |      chain          chain         recursion
  |        /|           |\
  |       / |           | \
  |      /  |           |  \
  |  ___/   |___     ___|   \___     ___________
  |
  +----------------------------------------------> time

Is this so?

Not necessarily. As said above, all the promises in that bulk are in the end resolved with the same value², so all we would need is to store the outermost and the innermost promise at one time. All intermediate promises may become garbage-collected as soon as possible, and we want to run this recursion in constant space and time.

In fact, this recursive construct is totally necessary for asynchronous loops with a dynamic condition (no fixed number of steps), you cannot really avoid it. In Haskell, where this is used all the time for the IO monad, an optimisation for it is implemented just because of this case. It is very similar to tail call recursion, which is routinely eliminated by compilers.

Has anyone considered the memory issues of building a chain in this way?

Yes. This was discussed at promises/aplus for example, though with no outcome yet.

Many promise libraries do support iteration helpers to avoid the spike of promise then chains, like Bluebird's each and map methods.

My own promise library^3,4 does feature resolve chains without introducing memory or runtime overhead. When one promise adopts another (even if still pending), they become indistinguishable, and intermediate promises are no longer referenced anywhere.

Would memory consumption differ between promise libs?

Yes. While this case can be optimised, it seldom is. Specifically, the ES6 spec does require Promises to inspect the value at every resolve call, so collapsing the chain is not possible. The promises in the chain might even be resolved with different values (by constructing an example object that abuses getters, not in real life). The issue was raised on esdiscuss but remains unresolved.

So if you use a leaking implementation, but need asynchronous recursion, then you better switch back to callbacks and use the deferred antipattern to propagate the innermost promise result to a single result promise.

_{[1]: no official terminology

[2]: well, they are resolved with each other. But we want to resolve them with the same value, we expect that

[3]: undocumented playground, passes aplus. Read the code at your own peril: https://github.com/bergus/F-Promise

[4]: also implemented for Creed in this pull request}

Solution 2:

Disclaimer: premature optimization is bad, the real way to find out about performance differences is to benchmark your code, and you shouldn't worry about this (I've only had to once and I've used promises for at least 100 projects).

Is this so?

Yes, the promises would have to "remember" what they're following, if you do this for 10000 promises you'd have a 10000 long promise chain, if you don't then you won't (for example, with recursion) - this is true for any queueing flow control.

If you have to keep track of 10000 extra things (the operations) then you need to keep memory for it and that takes time, if that number is a million it might not be viable. This varies among libraries.

Has anyone considered the memory issues of building a chain in this way?

Of course, this is a big issue, and a use case for using something like Promise.each in libraries like bluebird over thenable chaining.

I've personally had in my code to avoid this style for a quick app that traverses all the files in a VM once - but in the vast majority of cases it's a non issue.

Would memory consumption differ between promise libs?

Yes, greatly. For example bluebird 3.0 will not allocate an extra queue if it detects a promise operation is already asynchronous (for example if it starts with a Promise.delay) and will just execute things synchronously (because the async guarantees are already preserved).

This means that what I claimed in my answer to the first question isn't always true (but is true in the regular use case) :) Native promises will never be able to do this unless internal support is provided.

Then again - it's no surprise since promise libraries differ by orders of magnitude from one another.

Solution 3:

I just came out a hack that may help solving the problem: don't do recursion in the last then, rather, do it in the last catch, since catch is out of the resolve chain. Using your example, it would be like this:

function foo() {
    function doo() {
        // always return a promise
        if (/* more to do */) {
            return doSomethingAsync().then(function(){
                        throw "next";
                    }).catch(function(err) {
                        if (err == "next") doo();
                    })
        } else {
            return Promise.resolve();
        }
    }
    return doo(); // returns a promise
}

Solution 4:

To complement the awesome existing answers I'd like to illustrate the expression, which is the result of such an asynchronous recursion. For the sake of simplicity I use a simple function that computes the power of a given base and exponent. The recursive and base case are equivalent to those of the OP's example:

const powerp = (base, exp) => exp === 0 
 ? Promise.resolve(1)
 : new Promise(res => setTimeout(res, 0, exp)).then(
   exp => power(base, exp - 1).then(x => x * base)
 );

powerp(2, 8); // Promise {...[[PromiseValue]]: 256}

With the help of some substitution steps the recursive portion can be replaced. Please note that this expression can be evaluated in your browser:

// apply powerp with 2 and 8 and substitute the recursive case:

8 === 0 ? Promise.resolve(1) : new Promise(res => setTimeout(res, 0, 8)).then(
  res => 7 === 0 ? Promise.resolve(1) : new Promise(res => setTimeout(res, 0, 7)).then(
    res => 6 === 0 ? Promise.resolve(1) : new Promise(res => setTimeout(res, 0, 6)).then(
      res => 5 === 0 ? Promise.resolve(1) : new Promise(res => setTimeout(res, 0, 5)).then(
        res => 4 === 0 ? Promise.resolve(1) : new Promise(res => setTimeout(res, 0, 4)).then(
          res => 3 === 0 ? Promise.resolve(1) : new Promise(res => setTimeout(res, 0, 3)).then(
            res => 2 === 0 ? Promise.resolve(1) : new Promise(res => setTimeout(res, 0, 2)).then(
              res => 1 === 0 ? Promise.resolve(1) : new Promise(res => setTimeout(res, 0, 1)).then(
                res => Promise.resolve(1)
              ).then(x => x * 2)
            ).then(x => x * 2)
          ).then(x => x * 2)
        ).then(x => x * 2)
      ).then(x => x * 2)
    ).then(x => x * 2)
  ).then(x => x * 2)
).then(x => x * 2); // Promise {...[[PromiseValue]]: 256}

Interpretation:

With new Promise(res => setTimeout(res, 0, 8)) the executor is invoked immediately and executes a non-bllocking computation (mimicked with setTimeout). Then an unsettled Promise is returned. This is equivalent with doSomethingAsync() of the OP's example.
A resolve callback is associated with this Promise via .then(.... Note: The body of this callback was substituted with the body of powerp.
Point 2) is repeated and a nested then handler structure is build up until the base case of the recursion is reached. The base case returns a Promise resolved with 1.
The nested then handler structure is "unwound" by calling the associated callback correspondingly.

Why is the generated structure nested and not chained? Because the recursive case within the then handlers prevents them from returning a value until the base case is reached.

How can this work without a stack? The associated callbacks form a "chain", which bridges the successive microtasks of the main event loop.