consecutive version diff'ing

Question

dret opened this issue 8 years ago

i am wondering if this has been implemented, and if so, how robust it is: "The hub MAY reduce the payload to a diff between two consecutive versions if its format allows it."
even if we assume that a hub never misses a topic update, the same cannot be said for subscribers, so for them getting diffs may be problematic. they might not even notice that there is a problem, unless the versions carry ETags strong enough to verify both the versions and their suitability as the base for a diff.
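to make the concern concrete, here is a rough sketch of what a subscriber would need in order to apply diffs safely. nothing here comes from the spec: the `base_etag` and `new_etag` fields, the append-only diff format, and the use of a SHA-256 digest as a strong ETag are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


def strong_etag(content: str) -> str:
    """Hypothetical strong validator: SHA-256 over the exact bytes."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


@dataclass
class DiffNotification:
    base_etag: str  # version the diff was computed against (assumed field)
    new_etag: str   # version the diff should produce (assumed field)
    appended: str   # toy diff format: text appended to the base version


class NeedsFullFetch(Exception):
    """The local copy cannot be patched; refetch the whole topic."""


def apply_diff(current: str, note: DiffNotification) -> str:
    # Refuse the diff if it was computed against a version we do not hold,
    # e.g. because an earlier notification was missed.
    if strong_etag(current) != note.base_etag:
        raise NeedsFullFetch("diff base does not match local copy")
    updated = current + note.appended
    # Verify that patching produced the version the hub advertised.
    if strong_etag(updated) != note.new_etag:
        raise NeedsFullFetch("result does not match advertised version")
    return updated
```

without something like the base check, a subscriber that missed one notification would apply the next diff to the wrong version and never find out.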

Julien Genestoux commented 8 years ago

NP!

Aaron Parecki · Answer 1 · Wed Oct 26 2016 00:23:52 GMT+0800 (China Standard Time)

afaik the implementation of this has been around sending individual items from a feed, such as entries in an RSS feed.

Tony Garnock-Jones · Answer 2 · Wed Oct 26 2016 00:46:54 GMT+0800 (China Standard Time)

This question touches on thorny issues of content model, "reliability" level, consistency properties etc.

What is PubSub for? Transient, best-effort notifications where occasional message loss is fine and expected (atop which something else may be built), or "reliable", high-value notifications, where message loss is frowned upon and serious energy is expended on making sure notifications aren't missed, or duplicated?

Since "guaranteed delivery" isn't a thing, it might make sense to explicitly go for a best-effort/unreliable framing, and to leave "reliability" to upper levels. However, it could be a big help if some indication of lost or duplicate notifications could be provided! Could a hub be asked to assign a sequence number to its notifications? (It would have to be scoped very carefully.) That way at least consumers could detect gaps and duplications in the stream, and take appropriate application-specific action.

Julien Genestoux · Answer 3 · Wed Oct 26 2016 20:36:54 GMT+0800 (China Standard Time)

@tonyg I believe enforcing reliability can very well happen at the "content" level and not at the protocol level. The best solution I found for this was to have some kind of sequential id in each "payload" that the hub passes through. The subscriber can then easily identify if it's missing parts.
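A minimal sketch of the subscriber side of that idea, assuming each payload carries an integer sequence id that increases by one per update for a given topic. The `SequenceTracker` class and the status strings are hypothetical names for illustration, not part of any hub or spec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SequenceTracker:
    """Tracks the last sequence id seen for one topic (hypothetical helper)."""

    last_seen: Optional[int] = None

    def observe(self, seq: int) -> str:
        # First payload for this topic: nothing to compare against yet.
        if self.last_seen is None:
            self.last_seen = seq
            return "ok"
        # Same or older id: a redelivery or a stale payload.
        if seq <= self.last_seen:
            return "duplicate"
        # A jump of more than one means at least one payload was missed.
        missed = seq > self.last_seen + 1
        self.last_seen = seq
        return "gap" if missed else "ok"
```

On a "gap" the subscriber might refetch the full topic, and on a "duplicate" it can drop the payload; what to do is left to the application, which fits keeping the protocol itself best-effort.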

Tony Garnock-Jones · Answer 4 · Wed Oct 26 2016 22:59:08 GMT+0800 (China Standard Time)

@julien51 Thank you! Your remark helped me realise that what I was thinking of is best left for later, and out of scope for this round of the spec. Sorry for muddying the water. Keeping this spec best-effort/unreliable seems very much the best option.