kriskowal / gtor

A General Theory of Reactivity

Compare and contrast GToR Stream/Observable vs Rx Observable

domenic opened this issue · comments

The section on async generators seems confused about what Jafar's slides propose. They propose Rx-style observables (your signals, I believe); they do not propose anything like the async iterators in GTOR.

I think Rx conflates streams and observables in a lot of ways and confuses more than illuminates in that respect. For example, Jafar does point out that async generators can use await yield to pass back pressure. For the purposes of GToR, observables and signals are strictly for lossy (a word I need to use in this article) time-series data, and pressure is not involved.

Yeah, I kind of got that from your approach, but you never really came out and said "the slides are wrong"; in fact you said something more like "the slides are right."

Yes, that is my mistake. Parts seem right to me; other parts I don't understand well enough to judge, but they don't match what I've described here.

Is there a link to Jafar's slides?

Indeed. Copied here…

https://docs.google.com/file/d/0B4PVbLpUIdzoMDR5dWstRllXblU/edit


I am still absorbing these slides and the Rx mentality. I am certain at this point that the interface Rx describes for observables is what I describe as a readable output stream or promise iterator.

Regardless, I like, no…love, the idea of an on operator that is effectively a combination of the synchronous for…of operator with an implied await on the promise for an iteration that stream.next() would return.
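As a sketch of that reading (makeStream and forOn are hypothetical names, not GToR or slide APIs; a "stream" here is just anything whose next() returns a promised { value, done }):

```javascript
// Illustrative stand-in: a stream whose next() returns
// Promise<{ value, done }>.
function makeStream(values) {
  let i = 0;
  return {
    next() {
      return Promise.resolve(
        i < values.length
          ? { value: values[i++], done: false }
          : { value: undefined, done: true }
      );
    }
  };
}

// What an `on` loop could desugar to: a plain loop over next()
// with an implied await on each promised iteration.
async function forOn(stream, callback) {
  for (;;) {
    const iteration = await stream.next();
    if (iteration.done) return iteration.value;
    callback(iteration.value);
  }
}

const seen = [];
forOn(makeStream([1, 2, 3]), v => seen.push(v))
  .then(() => console.log(seen.join(","))); // logs "1,2,3"
```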

Hypothesis: The semantics of Rx style observables are multi-faceted, capturing both lossy time series data semantics and lossless pressure semantics depending on some factor I have yet to discover.

Hypothesis: Rx appears to have per-iteration error handling built-in whereas GToR proposes orthogonal idioms for per-iteration and per-stream error handling.

rxObservable.subscribe(function () {
    console.log("rxObservable produced an iteration normally");
}, function () {
    console.log("rxObservable was unable to produce a value for this iteration");
});

gtorStream.forEach(function (n) {
    return blah().then(function (value) {
        console.log("iteration produced normally");
    }, function (error) {
        console.log("failed to produce an iteration");
    });
    // the resolution of either handler is the iteration,
    // and if an error escapes, it will terminate the stream
    // prematurely.
})
.then(function (value) {
    console.log("stream terminated normally");
}, function (error) {
    console.log("stream terminated prematurely");
});

I am still absorbing the slides as well. It will probably take me another week of consideration, but at this point I'm somewhat disappointed that I don't find a clear argument against what seems like the obvious path: that an async generator returns an iterable of Promises.

More explicitly:

  • Generators are a pleasant way to create iterators (mapping sequences, really).
  • Async functions are a pleasant way to create promise chains.
  • Async generators are a pleasant way to create iterators (mapping sequences) of promise chains.

Can observables not be constructed from iterables of Promises? I haven't tried to prove it yet, but it seems like observe is merely creating an async pipe. If so, then why is piping essential to the concept of the async generator?

@zenparsing can you specify what you mean by iterable of promises? Which of the following does next() return on a promise iterator:

  1. Promise<Iteration<T>>
  2. Iteration<Promise<T>>

Near as I can tell, what Jafar describes as iterator.observe(generator) is equivalent to GToR iterator.forEach(generator.yield).then(generator.return, generator.throw), which is to say…a pipe of some kind. I have named the method copy in Q-IO, but it is spiritually equivalent to what Domenic calls pipeTo. (I don’t think pipeThrough is sound, so I would just call it pipe and have a reader-centric view of streams at this juncture).
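A sketch of that equivalence; the stream and generator here are toy stand-ins, not the Q-IO or GToR implementations, and I use next where the line above says yield (naming variants of the same method):

```javascript
// Copying a stream into a generator-shaped consumer using only
// forEach — a pipe of some kind.
function copy(stream, generator) {
  return stream.forEach(value => generator.next(value)).then(
    value => generator.return(value), // stream completed normally
    error => generator.throw(error)   // stream terminated prematurely
  );
}

// Minimal stand-in: forEach visits each value and returns a
// promise for the stream's completion.
const toyStream = {
  forEach(callback) {
    return Promise.resolve().then(() => [1, 2, 3].forEach(callback));
  }
};

const log = [];
copy(toyStream, {
  next(value) { log.push("next:" + value); },
  return()    { log.push("return"); },
  throw()     { log.push("throw"); }
});
// log ends up ["next:1", "next:2", "next:3", "return"]
```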

Right - that was a confusing way to put it. I meant that next(val) would return a promise for the next element of the sequence, which corresponds to the async generator returning Iterable<Promise<T>> in the slides.

But I'm a little unclear on this: what is Iteration in your previous comment? What kind of interface is that?

I think I can answer myself: Iteration is just { value, done }, correct?

In that case next(val) would return Promise<Iteration<T>>

Yes, though be aware that it is a term I fabricated and hasn’t by any means caught on. So I believe that we are all in agreement of the type signature of an async generator, including Jafar.

https://github.com/kriskowal/gtor#iterators

https://github.com/kriskowal/gtor/blob/master/iteration.js

Yes, although Jafar makes the async (EDIT: generator) function, when called, return an Observable, which is essentially a pipe factory. If I understand his design correctly, there's no way to iterate over the (async) sequence without piping.

Oh, that is interesting. I did not notice that, and @domenic hinted at this distinction with #9, the distinction between an iterable and an iterator.

In Soviet JavaScript, generators return iterators, not iterables. I would expect an asynchronous generator to keep with the rustic aesthetic.

Looking over the slides again, I believe Jafar’s async generators do return async iterators, not async iterables. The JavaScript of operator accepts iterables, and Jafar’s slides do suggest that the on operator accepts async iterables, and by extension async iterators, but when he calls an async generator function, either interpretation would work.

I will have to write an article just going over this slide by slide and pinpointing any specific disagreement.

A lot of this looks algebraically equivalent to what I’m proposing. He is proposing that observe be the asynchronous analogue of iterate, but I propose that the iterate symbol can be reused for asynchronous iterables and just return an async iterator instead of an iterator. The async iterator would implement next and what Jafar calls observe would be implemented in terms thereof, and I would call it pipe or copy (reusing the synchronous analogue again). copy(generator) would be implemented in terms of forEach(generator.next).then(generator.return, generator.throw) and forEach would be implemented in terms of next(). Apart from these little architectural differences, we appear to agree about the semantics.

He says that Object.observe is an async generator. This is closer to the true sense of an observable, because it disregards back pressure, but the interface is a subset for all intents and purposes, so I can’t say we disagree.

And this analysis has made it much clearer to me that next is the pragmatic name for yield, while yield is the ideal name for next if you want a transparent analogy between generators and iterators, but they are otherwise equivalent in all positions.

My design inclinations are in line with yours. I'm curious why Jafar chose to not expose a next(val) method directly. Is there anyone we can bother that might shed more light on that choice?

My take on things was that his observables are fundamentally push: once you call the async generator function, its behavior is out of your control entirely, and you get notified if/when it wants you to, via the subscription callback.

This seems in conflict with the normal generator next() model, where the consumer, via calling next, gets some measure of control over the body of the function. So I think that's why he doesn't have a next().

I also want to second @zenparsing that I don't see the iteration object exposed anywhere in his design, so the type signature is not in agreement with the Promise<Iteration<T>> idea (which also seems natural to me).

Thanks @domenic - that makes sense when looking at the slides. It seems an odd choice though - I don't think it's difficult to come up with scenarios where you want to write an async generator where the consumer is in control.

I tried writing a readFile and writeFile using async generators here:

https://gist.github.com/zenparsing/26b200543bb8ae0ca4df

One thing that jumps out at me that I didn't really consider before is that there is no way to access the value of the first call to next on an iterator. It kind of just disappears forever. Is that right?

If so, then it makes writing these streams awkward because you have to artificially "pump" the iterator with a useless call to next.

For example, if your async generator is writing buffers to a file, then there's no way to capture the buffer that is passed to the first call to next.

@domenic @kriskowal what do you think?

This appears to be an issue for Jafar's design as well. The observe method is specified as taking a Generator as argument, but you can't use a generator "newborn" from a (sync) generator function - the argument first submitted to next will be inaccessible.

I've posed the question on es-discuss: http://esdiscuss.org/topic/that-first-next-argument
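The problem can be demonstrated with a plain sync generator today (writer is a hypothetical name): the argument to the first next() call has nowhere to land, so writer-style consumers need a priming call.

```javascript
// A generator that consumes values fed in through next().
function* writer() {
  const received = [];
  for (;;) {
    const chunk = yield; // resumes with the argument of the NEXT next()
    received.push(chunk);
    if (received.length === 2) return received;
  }
}

const w = writer();
w.next("lost"); // just advances to the first yield; "lost" vanishes
w.next("a");    // delivered to the first yield
const result = w.next("b");
console.log(result.value); // ["a", "b"] — "lost" never arrived
```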

When composing an async function and a generator function, we have two precedents. A generator function does not resume until we first call next. An async function begins working immediately and returns a promise. A generator function iterator is pull only. However, an async generator, as I describe here, is "pressurized", effectively pull (in the sense that you call next) and push (in the sense that next returns a promise). I’d suggest that an async generator would begin work immediately and only pause on await. The async generator function and its async iterator would be mediated by a stream/pipe/buffer.

I think I disagree here.

Consider this. If an async function does not have an await, then it behaves exactly like a regular function, except that its return value is converted to a promise:

async function af() {
    console.log("inside call");
}

console.log("before call");
af();
console.log("after call");

/* 
> "before call"
> "inside call"
> "after call"
*/

By symmetry, I would expect that an async generator without await would behave exactly like a regular generator, except that its IteratorResults would be converted to promises:

async function *ag() {
    console.log("1");
    yield 1;
    console.log("2");
}

console.log("before call");
let iter = ag();
console.log("after call");
console.log("before next");
iter.next();
console.log("after next");
// ...etc

/*
> "before call"
> "after call"
> "before next"
> "1"
> "after next"
...etc
*/

I find @zenparsing's argument compelling.

Yes, it’s good and I buy it.

I think I finally have a handle on the crux of the difference between Observables and Async Iterators.

Iterators transmit data from callee to caller, down the call stack.

Observables transmit data from caller to callee, up the call stack.

Not coincidentally, event-subscriber systems also transmit data up the stack. I think that explains why the slides focus more heavily on representing event streams (as opposed to data processing streams).

Does that sound right?
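A miniature of that contrast (both sources here are toy stand-ins):

```javascript
// Pull: an iterator transmits data DOWN the stack — the consumer
// calls next() and the producer answers, on the consumer's schedule.
function* pullSource() {
  yield 1;
  yield 2;
}
const pulled = [];
for (const value of pullSource()) {
  pulled.push(value); // consumer decides when each value is taken
}

// Push: an observable/event source transmits data UP the stack —
// the producer invokes the consumer's callback whenever it pleases.
function pushSource(onNext) {
  [1, 2].forEach(onNext); // producer decides when each value is delivered
}
const pushed = [];
pushSource(value => pushed.push(value));

// Same values either way; what differs is who holds the bottom of
// the stack (and so the control) at the moment of delivery.
```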

Re-reading this thread ... it's a goldmine.

This came up at the last TC39 meeting again. The only compelling argument given for async generators vending observables instead of async iterators was efficiency reasons. I.e., allocating a Promise<IterationResult<T>> for every piece of data was argued to be inefficient.

That seems possible, but again, more relevant to event streams than to data streams.

The other argument was somewhat of a pragmatic one. Namely, that observables (and their focus on events) are more useful as a pattern in general than async iterators. And thus, if we're willing to bless something with syntax, it should be observables. That way, you can use async function* to create easy observable combinators, similarly to how you use function* to create easy iterator combinators.

I still think observables are a bad match for async function*. I guess it would make more sense to introduce something completely new, e.g. observable function or something. Alternately I'd be interested in some sort of "sync counterpart" of observables, which we could then package up under function^ (strawman) and then observables would be async function^. Maybe that sync counterpart is forEach-ables?

Indeed, a theoretically sound design seems to me something like:

  • iterables + async -> async iterables
  • forEachables + async -> observables

If we were to bless them with syntax I'd do something like

  • for (x of iterable); produced by function*s
  • for (await x of asyncIterable); produced by async function*s
  • for (x on forEachable); produced by function^s??
  • for (await x on observable); produced by async function^s??

Of course this is a pretty ridiculous expansion in complexity for unclear use cases: pragmatically forEachables are pretty silly, and it's questionable whether we have room for async iterables and observables. Still, it strengthens my belief that async function* is not the right syntax.
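One way to make "forEachables" concrete as the push dual of an iterable (every name here is an illustrative sketch, not proposed API):

```javascript
// An ordinary iterable: three values behind the iteration protocol.
const iterable = {
  [Symbol.iterator]() {
    let i = 0;
    return {
      next: () =>
        i < 3 ? { value: i++, done: false }
              : { value: undefined, done: true }
    };
  }
};

// Pull: the consumer extracts values one next() at a time.
const viaPull = [...iterable];

// Push: the consumer can only hand over a callback; the
// forEachable drives delivery itself.
const forEachable = {
  forEach(callback) {
    for (const value of iterable) callback(value);
  }
};
const viaPush = [];
forEachable.forEach(value => viaPush.push(value));
// viaPull and viaPush both hold [0, 1, 2]
```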

Saw that this was briefly discussed at the tail end of the meeting notes. Glad that this is being discussed. I do think there is grounds for both push and pull abstractions at this layer.

This reminds me of the layers of arithmetic operators. At layer one you have the dichotomy of plus and minus, then at layer two you have multiplication and division which would be symmetric alone, but then you get the hangers-on of modulo and remainder. At the third layer you have exponentiation. The analogue of subtraction at this layer is the radical, but the logarithm is a dual in another sense. Seems that each time you move up a layer, more variations reveal themselves.

Yeah, as you move up in abstraction, you lose simplifying symmetries, and so you gain more complexity when you invert or interact with the now-fractured operators. Handling that sort of thing well requires a very well-thought-out system for expressing the new operations; just taking the first manifestations and throwing ad-hoc syntax at them won't do very well unless you get lucky.

We need to do some in-depth study on this, see how many of the axes are relevant enough to be worth addressing, and find some unifying patterns that make them all make sense and carve up the idea-space in a way that's reasonably easy to understand and works well in syntax.

@tabatkins That is the charter for this living document.

Yeah, I know. ^_^ Just saying, for Domenic's benefit.

Alternately I'd be interested in some sort of "sync counterpart" of observables, which we could then package up under function^ (strawman) and then observables would be async function^. Maybe that sync counterpart is forEach-ables?

🧟 🧙‍♂️ RESURRECTED (sorry)

@domenic I've thought about this a lot over the years, and I've always thought this sort of thing has some merit, but it would require something different from yield that behaved a little differently. Basically something that pushed values out and continued, rather than pushing the value out and blocking the execution of the function. So in these strawmen I've called this push and push*, which behave the same as yield, generally, but don't provide a value to the LHS of any expression and don't stop the execution of the current code.

function^ pushThings() {
  push 1;
  push 2;
  push 3;
}

const values = pushThings(); // Observable<number>

values.forEach(console.log); // logs 1, then 2, then 3 synchronously (could be scheduled)

// or this is equivalent. (which couldn't be scheduled?)
for (const value on values) {
   console.log(value); 
}

Where an async function^ would be more like this:

async function^ pushAsyncThings() {
   for (let n = 0; n < 100; n++) {
     await sleep(1000);
     push n;
   }
}

const values = pushAsyncThings();

for await (const value on values) {
  console.log(value);
}

The interesting thing this could provide, though, is the ability, along with the WHATWG proposed addition to EventTarget, to build some interesting things. In this example, I'm assuming that push* behaves similarly to push in that it wouldn't wait/block anything; instead it would start pushing values out (asynchronously) and move to the next line immediately:

async function^ pushedPricesFor(symbol, signal) {
  const socket = new WebSocket('wss://priceinfo/socket/endpoint');
  
  // Start a subscription to errors that causes
  // it to throw. Since `push` (or `push*`) doesn't
  // block execution, maybe this just runs until the 
  // code block is complete
  push* socket.on('error').map((e) => {
    throw e;
  })
  
  // Wait for our socket to open
  await socket.on('open').first();
  
  // Then send our symbol to start getting streaming data
  socket.send(symbol);
  
  // Push our data from the backend
  push* socket.on('message')
    .map(e => JSON.parse(e.data))
    .takeUntil(signal.on('abort'));
    
  const closeEvent = await socket.on('close').first();
  
  if (!closeEvent.wasClean) {
    throw new Error('Socket closed dirty.') // Network error maybe?
  }
}