Allow interleaved mapping in async iterators
bakkot opened this issue
This is reviving #128, basically.
It would be nice if code like
let x = asyncIteratorOfUrls
  .map(u => fetch(u));
await Promise.all([
  x.next(),
  x.next(),
]);
could perform the fetches in parallel. Right now, because async iterator helpers are essentially "implemented" as async generators, it can't: the second call to .next will be queued until the first one finishes, rather than immediately being forwarded to the underlying iterator.
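To make the queuing concrete, here is a minimal sketch you can run in Node. It assumes a hand-written mapViaGenerator (a hypothetical stand-in for a helper implemented as an async generator), a fakeFetch that just sleeps, and placeholder URLs "a" and "b"; none of these names come from the proposal.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical stand-in for a map helper written as an async generator.
async function* mapViaGenerator(source, fn) {
  for await (const value of source) {
    // yield inside an async generator awaits its operand, so the generator
    // stays suspended here until fn(value) settles.
    yield fn(value);
  }
}

// Placeholder for asyncIteratorOfUrls.
async function* urls() {
  yield "a";
  yield "b";
}

async function demoSerial() {
  const events = [];
  // Placeholder for fetch: records when each call starts and ends.
  const fakeFetch = async (u) => {
    events.push(`start ${u}`);
    await sleep(20);
    events.push(`end ${u}`);
    return u.toUpperCase();
  };
  const x = mapViaGenerator(urls(), fakeFetch);
  // The second next() is queued behind the first by the generator.
  const results = await Promise.all([x.next(), x.next()]);
  return { events, results };
}

demoSerial().then(({ events }) => console.log(events));
// logs [ 'start a', 'end a', 'start b', 'end b' ]
```

The second "fetch" does not start until the first has ended, even though both .next() calls were made up front.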
If the implementation of map were different, like this, the two calls could proceed in parallel:
AsyncIteratorProto.map = function(fn) {
  return {
    __proto__: AsyncIteratorProto,
    next: async () => {
      let { done, value } = await this.next();
      if (done) return { done: true };
      return {
        done: false,
        value: await fn(value),
      };
    },
  };
};
It is less clear how, and whether, to allow parallelism in other helpers. I think they all have pretty natural semantics, but I have not yet worked through all of them in detail.
More speculatively, at a later date this would allow us to add a helper (say .bufferAhead(N)) to eagerly pump an async iterator and buffer the results. That would let you make any async iterator parallel with bounded concurrency, assuming the iterator was capable of supporting parallelism (so e.g. .map applied to the result of an async generator, but not an async generator itself), without changing the ordering semantics of the result.
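As a rough illustration of what such a helper might look like, here is a sketch under several assumptions: bufferAhead is written as a free function rather than a prototype method, n is at least 1, pumping starts on the first .next() call, and mapConcurrent is a standalone copy of the sample map from this issue. None of this is specified anywhere; it only shows that ordering can be kept while concurrency stays bounded.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Standalone copy of the sample map: each next() is forwarded immediately.
function mapConcurrent(source, fn) {
  return {
    [Symbol.asyncIterator]() { return this; },
    next: async () => {
      const { done, value } = await source.next();
      if (done) return { done: true, value: undefined };
      return { done: false, value: await fn(value) };
    },
  };
}

// Hypothetical bufferAhead: keeps up to n requests to the source outstanding
// and hands results back in the order they were requested. Assumes n >= 1.
function bufferAhead(source, n) {
  const buffer = []; // promises for results already requested, oldest first
  let finished = false; // set once the source reports done or rejects
  const fill = () => {
    while (!finished && buffer.length < n) {
      const request = Promise.resolve(source.next());
      request.then(
        (result) => { if (result.done) finished = true; },
        () => { finished = true; }
      );
      buffer.push(request);
    }
  };
  return {
    [Symbol.asyncIterator]() { return this; },
    next() {
      fill();
      return buffer.length > 0
        ? buffer.shift()
        : Promise.resolve({ done: true, value: undefined });
    },
  };
}

async function demoBufferAhead() {
  let active = 0;
  let maxActive = 0;
  const slowDouble = async (n) => {
    active++;
    maxActive = Math.max(maxActive, active);
    await sleep(20);
    active--;
    return n * 2;
  };
  async function* numbers() {
    for (const n of [1, 2, 3, 4, 5]) yield n;
  }
  const values = [];
  const buffered = bufferAhead(mapConcurrent(numbers(), slowDouble), 2);
  for await (const v of buffered) values.push(v); // plain sequential consumer
  return { values, maxActive };
}

demoBufferAhead().then((r) => console.log(r));
// logs { values: [ 2, 4, 6, 8, 10 ], maxActive: 2 }
```

The consumer is an ordinary for await loop, yet two callbacks run at a time and the values still arrive in source order.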
I'd also love it if we could even change the semantics of async generators so that yield no longer implicitly awaits, but instead the generator can be resumed as soon as .next() is called again…
@bergus See some discussion of that here, though of course such a change would not be in scope for this proposal in particular.
I note that, with the change I'm proposing in this issue, you could get the same effect by doing yield { v: promise } inside the async generator and then doing .map(box => box.v) on the result of the async generator. Which is slightly silly, but does let you get the thing you want without web compat risk.
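The boxing trick can be tried out with a small sketch. It assumes mapConcurrent, a standalone copy of the sample map from this issue, and a hypothetical work function that sleeps; the point is only that yielding a plain object does not make the generator wait for the promise inside it.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Standalone copy of the sample map: each next() is forwarded immediately.
function mapConcurrent(source, fn) {
  return {
    [Symbol.asyncIterator]() { return this; },
    next: async () => {
      const { done, value } = await source.next();
      if (done) return { done: true, value: undefined };
      return { done: false, value: await fn(value) };
    },
  };
}

async function demoBoxed() {
  const events = [];
  // Hypothetical slow task that records when it starts and ends.
  const work = async (n) => {
    events.push(`start ${n}`);
    await sleep(30);
    events.push(`end ${n}`);
    return n * 10;
  };
  async function* boxed() {
    for (const n of [1, 2]) {
      // The box is not a thenable, so yield does not wait for work(n).
      yield { v: work(n) };
    }
  }
  // Unboxing happens in the map, which awaits the promise returned by fn.
  const it = mapConcurrent(boxed(), (box) => box.v);
  const results = await Promise.all([it.next(), it.next()]);
  return { events, results };
}

demoBoxed().then(({ events }) => console.log(events));
// logs [ 'start 1', 'start 2', 'end 1', 'end 2' ]
```

Both tasks start before either finishes, which is the overlap that yielding the bare promise would prevent.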
If the implementation of map were different
It's technically allowed by the protocol, but is it a problem that this design allows for a { done: false, value ... } to come AFTER a { done: true }? Currently all spec iterators ensure that once { done: true } has been returned, all successive calls to .next() also produce { done: true }.
Note that this would be a difference from the synchronous version:
const results = [
  { done: false, value: "A" },
  { done: true },
  { done: false, value: "B" },
];
class CustomSyncIterator extends Iterator {
  [Symbol.iterator]() { return this; }
  #index = 0;
  next() {
    return results[this.#index++] ?? { done: true };
  }
}
// AsyncIterator is the global from the async iterator helpers proposal;
// extending Iterator here would pick up the synchronous map instead.
class CustomAsyncIterator extends AsyncIterator {
  [Symbol.asyncIterator]() { return this; }
  #index = 0;
  async next() {
    return results[this.#index++] ?? { done: true };
  }
}
const syncIterator = new CustomSyncIterator().map(value => value.repeat(5));
const syncResults = [
  syncIterator.next(),
  syncIterator.next(),
  syncIterator.next(),
];
const asyncIterator = new CustomAsyncIterator().map(value => value.repeat(5));
const asyncResults = await Promise.all([
  asyncIterator.next(),
  asyncIterator.next(),
  asyncIterator.next(),
]);
console.log(syncResults); // [{ done: false, value: "AAAAA" }, { done: true }, { done: true }]
console.log(asyncResults); // [{ done: false, value: "AAAAA" }, { done: true }, { done: false, value: "BBBBB" }]
but is it a problem that this design allows for a { done: false, value ... } to come AFTER a { done: true }?
To be clear, I'm not saying that my sample code would be literally the implementation, just demonstrating how it's possible to get parallelism. (Note that the sample code creates a new next method each time - it's definitely not intended to be a high-fidelity implementation.) We would almost certainly want to keep a bit indicating whether the iterator has been closed, at the very least.
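One way such a bit could work is sketched below. This is only an illustration, not something the thread settles on: mapWithDoneFlag and misbehavingSource are hypothetical names, and the sketch makes the extra assumption that results are delivered in call order, so that a { done: true } can suppress anything requested after it. Calls are still forwarded to the source immediately.

```javascript
// Hypothetical variant of the sample map that remembers when it has finished.
function mapWithDoneFlag(source, fn) {
  let finished = false; // set once a { done: true } has been delivered
  let previous = Promise.resolve(); // settles when the prior result is delivered
  return {
    [Symbol.asyncIterator]() { return this; },
    next() {
      if (finished) return Promise.resolve({ done: true, value: undefined });
      // Forward to the source right away so that work can overlap.
      const pending = (async () => {
        const { done, value } = await source.next();
        if (done) return { done: true, value: undefined };
        return { done: false, value: await fn(value) };
      })();
      pending.catch(() => {}); // avoid an unhandled rejection while queued
      // Deliver in call order; anything after a done result becomes done.
      const result = previous.then(async () => {
        if (finished) return { done: true, value: undefined };
        const r = await pending;
        if (r.done) finished = true;
        return r;
      });
      previous = result.catch(() => {});
      return result;
    },
  };
}

// Source that yields a value after reporting done, as in the example above.
function misbehavingSource() {
  const results = [
    { done: false, value: "A" },
    { done: true },
    { done: false, value: "B" },
  ];
  let index = 0;
  return {
    async next() { return results[index++] ?? { done: true }; },
  };
}

async function demoDoneFlag() {
  const it = mapWithDoneFlag(misbehavingSource(), (v) => v.repeat(5));
  return Promise.all([it.next(), it.next(), it.next()]);
}

demoDoneFlag().then((r) => console.log(r.map((x) => x.done)));
// logs [ false, true, true ]
```

The "B" result is still requested from the source, but it is discarded rather than surfaced after the done result. Error handling is left out: a rejection does not set the flag in this sketch.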
Yeah, this is all down to the level of the async iterator protocol. My initial impression is also that you won't be able to do this, because according to the async iterator protocol the only way to find out if an iterable is done is to await its next value.
My implementation there isn't meant to be complete, but it does support concurrency if you call .next multiple times. Here is a snippet you can run today which demonstrates that the map callback can run concurrently with itself and with the underlying async generator.
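A minimal sketch of that kind of demonstration (not the original snippet) is shown here. It assumes mapConcurrent, a standalone copy of the sample map, plus a hypothetical slowDouble callback and a source generator that log their progress.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Standalone copy of the sample map: each next() is forwarded immediately.
function mapConcurrent(source, fn) {
  return {
    [Symbol.asyncIterator]() { return this; },
    next: async () => {
      const { done, value } = await source.next();
      if (done) return { done: true, value: undefined };
      return { done: false, value: await fn(value) };
    },
  };
}

async function demoConcurrent() {
  const events = [];
  // Underlying async generator: takes 10ms to produce each value.
  async function* source() {
    for (const n of [1, 2]) {
      events.push(`produce ${n}`);
      await sleep(10);
      yield n;
    }
  }
  // Hypothetical slow callback: takes 50ms per value.
  const slowDouble = async (n) => {
    events.push(`map start ${n}`);
    await sleep(50);
    events.push(`map end ${n}`);
    return n * 2;
  };
  const it = mapConcurrent(source(), slowDouble);
  const results = await Promise.all([it.next(), it.next()]);
  return { events, results };
}

demoConcurrent().then(({ events }) => console.log(events));
```

In the logged events, "map start 2" and "produce 2" both appear before "map end 1": the callback overlaps with itself, and the generator keeps producing while the first callback is still running.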
Yeah you're right. I withdraw that criticism.
Closing this issue as "we expect to do this and need to work out the details". For discussion of the details, follow along and contribute at https://github.com/tc39/proposal-async-iterator-helpers.