tr: Illegal byte sequence
sureshjoshi opened this issue · comments
On almost every invocation of ./pwd.sh w someusername 30, I receive a "tr: Illegal byte sequence" response. On the very few invocations that work (one in every 20), my random safe/filename is only 1-2 characters long.
It appears to be a macOS problem (I'm on 11.6), and the workaround described here appears to work: https://unix.stackexchange.com/questions/45404/why-cant-tr-read-from-dev-urandom-on-osx
LC_CTYPE=C tr -dc "[:lower:]" < /dev/urandom | fold -w8 | head -n1
I don't have any locales in my environment by default, and my zshrc doesn't set any by default - so I think the data pulled from urandom is going haywire.
I'm also not sure if my workaround above is "the" solution, or just a workaround.
Thanks for reporting this issue.
Can you let me know if it works with LC_CTYPE=en_US.UTF-8? I recommend setting it in your zshrc for now, like https://github.com/drduh/config/blob/master/zshrc#L25-L29
Actually, it turns out that no, using that CTYPE doesn't work - which has me all kinds of surprised. I wasn't expecting it to fail. I've tried setting both LC_CTYPE and LC_ALL to that locale, and I still get the illegal byte sequence.
I am also having this issue on macOS with zsh.
Here's the offending command pipeline and sample output:
$ tr -dc "foobar" < /dev/urandom | fold -w8 | head -n1
tr: Illegal byte sequence
Like @sureshjoshi, I was unable to get this line working by prefixing LC_CTYPE=en_US.UTF-8:
$ LC_CTYPE=en_US.UTF-8 tr -dc "foobar" < /dev/urandom | fold -w8 | head -n1
tr: Illegal byte sequence
LC_CTYPE=en_US.UTF8 seems to work though, for some reason:
$ LC_CTYPE=en_US.UTF8 tr -dc "foobar" < /dev/urandom | fold -w8 | head -n1
orbrrrba
LC_CTYPE=C also works:
$ LC_CTYPE=C tr -dc "foobar" < /dev/urandom | fold -w8 | head -n1
ffrbrooo
As @drduh mentioned, this is probably something that should be configured in our .zshrc.
Edit: This probably explains the above:
$ LANG=en_US.utf8 locale
LANG="en_US.utf8"
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL="C"
en_US.utf8 is probably invalid, so it just defaults to C for the locale, which works above. I can't seem to get en_US.utf-8 to work, though.
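One way to test the "invalid name falls back to C" guess is to compare each spelling against the locales the system actually lists. The following is only a rough sketch: locale_exists is a hypothetical helper name, and it assumes that locale -a is available and that names must match its output exactly.

```shell
# Hypothetical helper (not part of pwd.sh): succeeds only if the exact
# locale name appears in the output of `locale -a`.
locale_exists() {
  locale -a 2>/dev/null | grep -qxF -- "$1"
}

# Compare the spellings tried in this thread against the installed locales.
for name in C en_US.UTF-8 en_US.UTF8 en_US.utf8; do
  if locale_exists "$name"; then
    echo "$name: installed"
  else
    echo "$name: not listed"
  fi
done
```

A spelling reported as "not listed" would be consistent with the fallback to C seen in the locale output above, although this check alone does not prove that is what tr is doing.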
By default, my locale looks like this (nothing in .zshrc):
$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"
So I'm kind of stumped now. The only thing I can think to do is to patch this line in my copy of pwd.sh by prefixing it with LC_ALL=C. This works for me for the time being.
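For anyone wanting to apply the same local patch, here is a minimal sketch of the idea. random_name is a hypothetical function name and the character class is taken from the workaround earlier in this thread; the actual line in pwd.sh may look different.

```shell
# Hypothetical helper illustrating the LC_ALL=C prefix; not the actual
# code in pwd.sh.
random_name() {
  # With LC_ALL=C, tr treats its input as single bytes, so arbitrary
  # bytes from /dev/urandom cannot form an invalid multibyte sequence.
  LC_ALL=C tr -dc '[:lower:]' < /dev/urandom | fold -w "${1:-8}" | head -n1
}

name="$(random_name 8)"
printf '%s\n' "$name"
```

The prefix only affects the tr process, so the rest of the script and the interactive shell keep their normal locale settings.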
We can make the same fix as drduh/Purse#3
Fix pushed, thanks for reporting and investigating!