When reading (Chinese) .msg files, HTML converted from RTF is completely garbled (encoding issue)
bbottema opened this issue
Benny Bottema commented
The problem is that the code page declared in the RTF is ignored and the hex-escaped bytes for text are converted one at a time. However, code page 936 (a Chinese charset) is a double-byte character set (DBCS) in which Chinese characters take two bytes each, so decoding byte by byte produces garbage. More generally, any code page defined in the RTF header should be honored when parsing user text.
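To illustrate the idea, here is a minimal sketch in Python (not the library's actual Java code). It assumes the code page is declared with the standard `\ansicpg` control word and that text bytes appear as `\'xx` hex escapes. The helper name `decode_rtf_hex_escapes` and the fallback to code page 1252 are hypothetical choices made for this example. Consecutive escapes are collected into a single byte run and decoded together, so a two-byte character is never split.

```python
import re

# Hypothetical helper for illustration only; not part of outlook-message-parser.
_HEX_RUN = re.compile(r"(?:\\'[0-9a-fA-F]{2})+")   # one or more consecutive \'xx escapes
_ANSICPG = re.compile(r"\\ansicpg(\d+)")           # code page declared in the RTF header


def decode_rtf_hex_escapes(rtf: str, default_codepage: int = 1252) -> str:
    """Replace runs of \\'xx escapes with text decoded using the RTF's code page."""
    match = _ANSICPG.search(rtf)
    codepage = int(match.group(1)) if match else default_codepage
    # Assumption: Python ships a codec named "cp<number>" for this code page
    # (true for cp936 and cp1252; an unknown code page raises LookupError).
    encoding = f"cp{codepage}"

    def decode_run(run: re.Match) -> str:
        # Join the whole run into one byte string before decoding, so that
        # multi-byte characters are decoded as a unit.
        raw = bytes.fromhex(run.group(0).replace("\\'", ""))
        return raw.decode(encoding, errors="replace")

    return _HEX_RUN.sub(decode_run, rtf)


sample = r"{\rtf1\ansi\ansicpg936 \'c4\'e3\'ba\'c3}"
print(decode_rtf_hex_escapes(sample))  # {\rtf1\ansi\ansicpg936 你好}
```

Decoding the same four bytes one at a time as single-byte characters would instead give four unrelated Latin characters, which is the garbling described in this issue.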
Benny Bottema commented
Solved by bbottema/outlook-message-parser#3.
Benny Bottema commented
Released in 6.0.0-rc1.
Benny Bottema commented
6.0.0 has been released as well, finally.