nijel / enca

Extremely Naive Charset Analyser

Home Page: https://cihar.com/software/enca/

enca fails to convert many Simplified Chinese texts

erjiang opened this issue

I have a bunch of Simplified Chinese texts that fail to convert when I run enca -L chinese -x utf8 example.txt. I think the problem is that the files use characters that the newer GB18030 can encode but the older GB2312 cannot. That is, iconv -f gb2312 -t utf8 example.txt fails, but iconv -f gb18030 -t utf8 example.txt works.
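
The mismatch can be reproduced outside of enca with a minimal Python sketch. It uses Python's built-in gb2312 and gb18030 codecs, not enca or iconv, so it only illustrates the encoding difference. The helper name decode_with_fallback and the sample string are hypothetical, chosen for illustration: the middle character (U+9555) is assumed to be representative of the characters in the failing files, since it is outside GB2312 but covered by GB18030.

```python
def decode_with_fallback(data: bytes, encodings=("gb2312", "gb18030")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper for illustration; this is not how enca works internally.
    """
    for name in encodings:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            # This candidate cannot represent some byte sequence; try the next.
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Hypothetical sample: the middle character (U+9555) is not part of GB2312,
# so a strict GB2312 decoder rejects the bytes while GB18030 accepts them.
sample = "朱镕基".encode("gb18030")
text, used = decode_with_fallback(sample)
print(used, text)
```

Text that stays within GB2312 still decodes with the first candidate, because GB18030 encodes those characters with the same bytes. That is why treating the input as GB18030 should not break files that already convert correctly.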

I suggest replacing GB2312 with the newer GB18030.