sindresorhus / strip-bom

Strip UTF-8 byte order mark (BOM) from a string

the stream version does not reliably run 'is-utf8'

amitport opened this issue · comments

Line 20 of index.js reads only a chunk of the file (which can be as small as 3 bytes). The 'is-utf8' check on line 9 should receive a larger input in order to determine whether the file is UTF-8.
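To illustrate why a small chunk is not enough, here is a minimal sketch. It assumes a hypothetical `decodesAsUtf8` helper built on the standard `TextDecoder` in fatal mode; this is a stand-in for illustration, not the 'is-utf8' package and not the code in index.js.

```javascript
// Hypothetical stand-in for a UTF-8 validity check (NOT the 'is-utf8' package).
function decodesAsUtf8(bytes) {
  try {
    // In fatal mode, decode() throws on malformed or truncated sequences.
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch {
    return false;
  }
}

// A valid UTF-8 file: BOM followed by "é" (0xC3 0xA9).
const textFile = Buffer.from([0xef, 0xbb, 0xbf, 0xc3, 0xa9]);
// A non-UTF-8 file that happens to start with the same three bytes.
const binaryFile = Buffer.from([0xef, 0xbb, 0xbf, 0xff, 0xfe]);

const results = {
  // The whole text file validates.
  textWhole: decodesAsUtf8(textFile),
  // A 4-byte chunk ends in the middle of "é", so the same file fails.
  textFirstFour: decodesAsUtf8(textFile.subarray(0, 4)),
  // A 3-byte chunk of the binary file validates, since it holds only the BOM.
  binaryFirstThree: decodesAsUtf8(binaryFile.subarray(0, 3)),
  // The whole binary file does not validate.
  binaryWhole: decodesAsUtf8(binaryFile),
};

console.log(results);
```

A check on a single chunk can therefore be wrong in both directions: it can reject valid text when the chunk ends mid-character, and it can accept a binary file whose first bytes happen to be valid.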

Good catch :)

Are you able to infer how many bytes it needs?

I don't know of any common practice, and there is always a chance that a file is a binary that just starts with some text.

To me, it sounds reasonable to require that library users send only text files.
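If callers guarantee UTF-8 text, the stream only has to look at the first three bytes and needs no validity check at all. The following is a rough sketch under that assumption; `createBomFilter` and `bomStripStream` are hypothetical names for illustration, not the library's implementation.

```javascript
const { Transform } = require('node:stream');

const BOM = Buffer.from([0xef, 0xbb, 0xbf]);

// Hypothetical helper: returns a function that is fed chunks in order and
// returns the bytes to pass on, with a leading BOM removed.
function createBomFilter() {
  let head = Buffer.alloc(0); // bytes held back until 3 have been seen
  let decided = false;        // true once the BOM decision has been made

  return function filter(chunk, isLast = false) {
    if (decided) return chunk;
    head = Buffer.concat([head, chunk]);
    // Too few bytes to decide, and more input may follow: keep buffering.
    if (head.length < BOM.length && !isLast) return Buffer.alloc(0);
    decided = true;
    const hasBom =
      head.length >= BOM.length && head.subarray(0, BOM.length).equals(BOM);
    return hasBom ? head.subarray(BOM.length) : head;
  };
}

// Hypothetical stream wrapper around the filter above.
function bomStripStream() {
  const filter = createBomFilter();
  return new Transform({
    transform(chunk, _encoding, callback) {
      callback(null, filter(chunk));
    },
    flush(callback) {
      // Emit any bytes still held back when the input was shorter than 3 bytes.
      callback(null, filter(Buffer.alloc(0), true));
    },
  });
}
```

Buffering until three bytes have arrived means the result does not depend on how the source happens to split its chunks, which is the part of the original problem that remains even when only text files are sent.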

Agreed. I'll get the readme updated.