haskell / text

Haskell library for space- and time-efficient operations over Unicode text.

Home Page: http://hackage.haskell.org/package/text

The haddock for the function `streamDecodeUtf8With` says that it takes a byte stream, but in truth it takes a byte array!

kindaro opened this issue:

See:

λ import Data.Text.Encoding
λ import Data.Text.Encoding.Error
λ import Data.ByteString.Lazy (pack)
λ :type pack
pack ∷ [GHC.Word.Word8] → Data.ByteString.Lazy.Internal.ByteString
λ :type streamDecodeUtf8With lenientDecode . pack

<interactive>:1:38: error:
    • Couldn't match type ‘Data.ByteString.Lazy.Internal.ByteString’
                     with ‘bytestring-0.11.4.0:Data.ByteString.Internal.Type.ByteString’
      Expected: [GHC.Word.Word8]
                → bytestring-0.11.4.0:Data.ByteString.Internal.Type.ByteString
        Actual: [GHC.Word.Word8]
                → Data.ByteString.Lazy.Internal.ByteString
      NB: ‘bytestring-0.11.4.0:Data.ByteString.Internal.Type.ByteString’
            is defined in ‘Data.ByteString.Internal.Type’
          ‘Data.ByteString.Lazy.Internal.ByteString’
            is defined in ‘Data.ByteString.Lazy.Internal’
    • In the second argument of ‘(.)’, namely ‘pack’
      In the expression: streamDecodeUtf8With lenientDecode . pack

This is what the haddock says:

Decode, in a stream oriented way, a lazy ByteString containing UTF-8 encoded text.

As I understand it, my code should work: the pack imported from Data.ByteString.Lazy should compose with streamDecodeUtf8With lenientDecode. It is confusing that the same name, ByteString, is used for two different things: a byte stream (lazy) and a byte array (strict). Maybe I am terribly confused. Please help me understand this situation.

(bytestring offers the type synonyms StrictByteString and LazyByteString; let me use them to reduce confusion.)

The documentation for streamDecodeUtf8With has simply been wrong since inception: the function does take a StrictByteString. "Stream-oriented" means that you are supposed to feed further StrictByteStrings, one by one, to the continuation carried in the returned Decoding value.
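To illustrate the chunk-by-chunk usage, here is a minimal sketch of how a LazyByteString could be decoded by feeding its strict chunks to the continuation inside Decoding. The helper name decodeChunksLenient is hypothetical and not part of text; the sketch also assumes that silently dropping an incomplete trailing sequence is acceptable.

```haskell
module Main (main) where

import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text as T
import Data.Text.Encoding (Decoding (..), streamDecodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical helper (not provided by the text package): decode a lazy
-- ByteString by passing its strict chunks, one at a time, to the decoder.
decodeChunksLenient :: BL.ByteString -> [T.Text]
decodeChunksLenient = go (streamDecodeUtf8With lenientDecode) . BL.toChunks
  where
    -- Input exhausted. Any undecoded bytes still held by the decoder are
    -- dropped here; that is a simplification made by this sketch.
    go _ [] = []
    go decode (chunk : rest) =
      case decode chunk of
        -- 'txt' is what could be decoded so far, '_leftover' holds the
        -- bytes of a sequence that is not complete yet, and 'continue'
        -- expects the next strict chunk.
        Some txt _leftover continue -> txt : go continue rest

main :: IO ()
main = do
  -- 0xC3 0xA9 is the UTF-8 encoding of U+00E9; it is deliberately split
  -- across two strict chunks.
  let input = BL.fromChunks [B.pack [0x68, 0xC3], B.pack [0xA9, 0x21]]
  print (decodeChunksLenient input)
```

The point of the design is that each call consumes one StrictByteString and returns a new continuation, so a multi-byte character split across chunk boundaries is completed by the next call rather than reported as an error.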

Could you possibly raise a PR to fix the haddocks?