Feature request: Stream decoding with "stop" as the error behavior
chris-martin opened this issue
I am considering a situation where I have a ByteString stream that may be UTF-8 up to some unknown point, and I'd like to be able to do a streaming decode of as much Text as possible for as long as the input is valid, and then stop at the first sign of trouble, obtaining both the decoded Text and the non-UTF8 ByteString remainder.
I envision something like this:
streamDecodeUtf8' :: ByteString -> Decoding'
data Decoding' = Some'
    Text        -- ^ What was decoded
    ByteString  -- ^ Remainder that was not decoded
    (Maybe UnicodeException)
      -- ^ 'Just' an exception if the remainder is non-empty
      --   because it begins with invalid input.
      --   'Nothing' if the remainder is empty or is non-empty
      --   but could become valid with more input.
This is being worked on in #448.
The API there is more complicated, because (1) returning a Text forces you to do a copy and (2) returning the remainder as a ByteString forces you to append it to the next chunk to resume. But I think it's still possible to make it look closer to what you are proposing while leaving the user in control of how the copying to Text is done.
> returning the remainder as a ByteString forces you to append to the next chunk to resume
Yes, the existing stream API gives you, in addition to the ByteString remainder, a function that lets you continue without having to concatenate, and there's no reason I should have proposed changing that aspect. A better attempt would be:
streamDecodeUtf8Strict :: ByteString -> StrictDecoding
data StrictDecoding = StrictDecoding
    Text        -- ^ What was decoded
    ByteString  -- ^ Remainder that was not decoded
    (Either UnicodeException (ByteString -> StrictDecoding))
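
To make the intended behavior concrete, here is a minimal sketch of how such a function could behave. It is not the API from #448 and not how the text library implements decoding. It uses only `decodeUtf8'` and the `DecodeError` constructor from the existing library. The helpers `leadShape` and `scanUtf8`, and the error message string, are hypothetical names made up for illustration. For simplicity the continuation prepends the leftover (at most 3 bytes) to the next chunk, which copies that chunk; a real implementation would presumably avoid this.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8')
import Data.Text.Encoding.Error (UnicodeException (..))
import Data.Word (Word8)

data StrictDecoding = StrictDecoding
  Text        -- What was decoded
  ByteString  -- Remainder that was not decoded
  (Either UnicodeException (ByteString -> StrictDecoding))

-- Hypothetical helper: for a lead byte, the total sequence length and the
-- allowed range of the second byte (well-formed UTF-8 byte sequences).
leadShape :: Word8 -> Maybe (Int, Word8, Word8)
leadShape b
  | b <= 0x7F              = Just (1, 0x80, 0xBF) -- range unused for ASCII
  | b >= 0xC2 && b <= 0xDF = Just (2, 0x80, 0xBF)
  | b == 0xE0              = Just (3, 0xA0, 0xBF) -- excludes overlong forms
  | b == 0xED              = Just (3, 0x80, 0x9F) -- excludes surrogates
  | b >= 0xE1 && b <= 0xEF = Just (3, 0x80, 0xBF)
  | b == 0xF0              = Just (4, 0x90, 0xBF) -- excludes overlong forms
  | b >= 0xF1 && b <= 0xF3 = Just (4, 0x80, 0xBF)
  | b == 0xF4              = Just (4, 0x80, 0x8F) -- excludes > U+10FFFF
  | otherwise              = Nothing

-- Hypothetical helper: length of the longest well-formed prefix, plus the
-- offending byte if what follows the prefix can never become valid.
-- 'Nothing' means the rest is empty or a truncated but so-far-valid sequence.
scanUtf8 :: ByteString -> (Int, Maybe Word8)
scanUtf8 bs = go 0
  where
    n = BS.length bs
    go i
      | i >= n = (i, Nothing)
      | otherwise = case leadShape (BS.index bs i) of
          Nothing            -> (i, Just (BS.index bs i))
          Just (len, lo, hi) -> check i len lo hi 1
    -- Validate continuation bytes 1 .. len-1 of the sequence starting at i.
    check i len lo hi k
      | k >= len   = go (i + len)
      | i + k >= n = (i, Nothing) -- input ended mid-sequence
      | otherwise  =
          let b = BS.index bs (i + k)
              (l, h) = if k == 1 then (lo, hi) else (0x80, 0xBF)
          in if b >= l && b <= h
               then check i len lo hi (k + 1)
               else (i, Just b)

streamDecodeUtf8Strict :: ByteString -> StrictDecoding
streamDecodeUtf8Strict = feed BS.empty
  where
    feed pending chunk =
      let input           = pending <> chunk -- pending is at most 3 bytes
          (k, offending)  = scanUtf8 input
          (good, rest)    = BS.splitAt k input
      in case decodeUtf8' good of
           -- Not expected, since 'good' was validated above; kept for totality.
           Left e    -> StrictDecoding T.empty input (Left e)
           Right txt -> StrictDecoding txt rest $ case offending of
             Just b  -> Left (DecodeError "invalid UTF-8 in stream" (Just b))
             Nothing -> Right (feed rest)
```

For example, feeding the bytes `68 69 E2 82` yields the text "hi", the remainder `E2 82`, and a continuation. Passing `AC FF 41` to that continuation completes the euro sign, yields "€", and stops with the remainder `FF 41` and a `Left` error, which is the "stop at the first sign of trouble" behavior described above.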