tc39 / proposal-iterator-helpers

Methods for working with iterators in ECMAScript

Home Page:https://tc39.es/proposal-iterator-helpers

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Allow interleaved mapping in async iterators

bakkot opened this issue · comments

This is reviving #128, basically.

It would be nice if code like

x = asyncIteratorOfUrls
  .map(u => fetch(u))

await Promise.all([
  x.next(),
  x.next(),
])

could perform the fetches in parallel. Right now, because async iterator helpers are essentially "implemented" as async generators, it can't - the second call to .next will be queued until the first one finishes, rather than immediately being forwarded to the underlying iterator.

If the implementation of map were different

like this
AsyncIteratorProto.map =
  function(fn) {
    return {
      __proto__: AsyncIteratorProto,
      next: async () => {
        let { done, value } = await this.next();
        if (done) return { done: true };
        return {
          done: false,
          value: await fn(value),
        };
      },
    };
  };
then the above snippet would just work. I think we should consider revising this.

It is less clear how, and whether, to allow parallelism in other helpers. I think they all have pretty natural semantics, but I have not yet worked through all of them in detail.

More speculatively, at a later date this would allow us to add a helper (say .bufferAhead(N)) to eagerly pump an async iterator and buffer the results. That would let you make any async iterator parallel with bounded concurrency, assuming the iterator was capable of supporting parallelism (so e.g. .map applied to the result of an async generator, but not an async generator itself), without changing the ordering semantics of the result.

See slides here and some discussion here.

commented

I'd also love if we could even change the semantics of async generators so that yield does no longer implicitly await but instead can be resumed as soon as .next() is called again…

@bergus See some discussion of that here, though of course such a change would not be in scope for this proposal in particular.

I note that, with the change I'm proposing in this issue, you could get the same effect by doing yield { v: promise } inside the async generator and then doing .map(box => box.v) on the result of the async generator. Which is slightly silly, but does let you get the thing you want without web compat risk.

If the implementation of map were different

It's technically allowed by the protocol, but is it a problem that this design allows for a { done: false, value ... } to come AFTER a { done: true }? Currently all spec iterators ensure that { done: true } mean that all successive calls to .next() produce { done: true }.

Note that this would be a difference from the synchronous version:

const results = [
    { done: false, value: "A" },
    { done: true },
    { done: false, value: "B" },
];

class CustomSyncIterator extends Iterator {
    [Symbol.iterator]() { return this; }
    
    #index = 0;
    
    next() {
        return results[this.#index++] ?? { done: true };
    }
}

class CustomAsyncIterator extends Iterator {
    [Symbol.asyncIterator]() { return this; }
    
    #index = 0;
    
    async next() {
        return results[this.#index++] ?? { done: true };
    }
}

const syncIterator = new CustomSyncIterator().map(value => value.repeat(5));
const syncResults = [
    syncIterator.next(),
    syncIterator.next(),
    syncIterator.next(),
];

const asyncIterator = new CustomAsyncIterator().map(value => value.repeat(5));
const asyncResults = await Promise.all([
    asyncIterator.next(),
    asyncIterator.next(),
    asyncIterator.next(),
]);

console.log(syncResults); // [{ done: false, value: "AAAAA" }, { done: true }, { done: true }]
console.log(asyncResults); // [{ done: false, value: "AAAAA" }, { done: true }, { done: false, value: "BBBBB" }]

but is it a problem that this design allows for a { done: false, value ... } to come AFTER a { done: true }?

To be clear I'm not saying that my sample code would be literally the implementation, just demonstrating how it's possible to get parallelism. (Note that the sample code creates a new next method each time - it's definitely not intended to be a high-fidelity implementation.) We would almost certainly want to keep a bit indicating whether the iterator has been closed, at the very least.

Yeah, this is all down the the level of the async iterator protocol. My initial impression is also that you won't be able to do this, because according to the async iterator protocol the only way to find out if an iterable is done is to await its next value.

My implementation there isn't meant to be complete, but it does support concurrency if you call .next multiple times. Here is a snippet you can run today which demonstrates that the map callback can run concurrently with itself and with the underlying async generator.

Yeah you're right. I withdraw that criticism.

Closing this issue as "we expect to do this and need to work out the details". For discussion of the details, follow along and contribute at https://github.com/tc39/proposal-async-iterator-helpers.