bincode-org / bincode

A binary encoder / decoder implementation in Rust.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Recovering from errors during writes

jonhoo opened this issue · comments

In most cases, a write to an io::Write will only fail if the writer is no longer valid. In those cases, the is no reason to continue writing, and the serialization should simply be aborted. However, in other cases, it would be useful to resume a write into an io::Write following an error. This applies particularly to transient errors like ErrorKind::TimedOut, ErrorKind::WouldBlock, and ErrorKind::Interrupted.

Unfortunately, since bincode currently writes things out piece by piece (and calls into the writer multiple times) without keeping track of how far it has gotten, resuming from these errors is basically impossible. One has to first serialize the value into a Vec<u8>, and then write from that value into the underlying writer, which introduces some unfortunate double-buffering.

What's the proposed solution for handling these kinds of errors?

If you can't afford buffering, my proposed solution would be to create your own Write implementating struct that keeps track of how many bytes were written into it, and when it is retried, it drops the first n bytes, but continues to write the bytes past the last successful write.

This does mean that you do wind up going through the effort of serializing a lot of your content into the void when you don't really need to, but I don't think that there's any other way to solve this given Bincodes constraints.

I guess such a wrapper would also need to know when serialization starts and finishes (i.e., the first and last write of a serialization) so that it knows when to start dropping, and when to reset n. Though managing that would then presumably be the responsibility of the thing calling bincode.

The alternative proposal here would be for bincode to keep track of its serialization progress somehow, but that would likely also come with some pretty substantial API changes that we'd want to avoid. This should probably be mentioned in the docs for bincode::serialize_into, since ErrorKind::Interrupted is an error that can relatively easily come up in normal execution, and would completely break most implementations that aren't prepared for this.

Though managing that would then presumably be the responsibility of the thing calling bincode.

Yep!

The alternative proposal here would be for bincode to keep track of its serialization progress somehow, but that would likely also come with some pretty substantial API changes that we'd want to avoid.

Also, because bincode uses Serde for serialization/deserialization, it would require breaking Serde changes. Last time I talked to them, they have no plans to ever break API compat.

ErrorKind::Interrupted is an error that can relatively easily come up in normal execution

For this error in particular, could you have Write wrapper that retries the internal write call on Interrupted "errors"?

struct RetryWriteOnInterrupted<W: Write>(W);
impl Write for ......

bincode::serialize_into(&my_struct, RetryOnInterrupted(my_writer));

And yeah, this would be a good thing to document, thanks for bringing it up!

Yes, that would work. This is basically what write_all does, but given that bincode does many small writes, you don't actually get that benefit. This sounds like a good idea for crate (both retry and "drop-n") :)

Closing since it seems that this is best solved by an external Write wrapper.