cpg command not superseding fcharset command
abLoftware opened this issue · comments
Japanese_UTF8.rtf.txt
Japanese_JIS.rtf.txt
It looks like fcharset is not being overridden when cpg is also provided.
From RTF Specification version 1.9.1 pg 20:
"If the \cpgN does appear, it supersedes the code page corresponding to the \fcharsetN."
test cases attached
TestBug2.java.txt
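For readers following along, the precedence rule quoted from the specification can be sketched roughly as below. This is only an illustration under stated assumptions, not the parser's actual code: the class `FontCodePage`, the method `resolveCodePage`, and the `defaultCodePage` fallback are hypothetical names, and the `\fcharsetN` table is deliberately partial.

```java
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical helper (not part of any real library): picks the code page
// for a font table entry, letting \cpgN win over \fcharsetN.
final class FontCodePage {
    // Partial \fcharsetN -> Windows code page table (a few common entries only).
    private static final Map<Integer, Integer> FCHARSET_TO_CODE_PAGE = Map.of(
        0, 1252,    // ANSI
        128, 932,   // Shift JIS
        129, 949,   // Hangul
        134, 936,   // GB2312
        136, 950,   // Big5
        161, 1253,  // Greek
        162, 1254,  // Turkish
        204, 1251   // Russian
    );

    private FontCodePage() {
    }

    static int resolveCodePage(OptionalInt fcharset, OptionalInt cpg, int defaultCodePage) {
        // If \cpgN is present it supersedes whatever \fcharsetN implies.
        if (cpg.isPresent()) {
            return cpg.getAsInt();
        }
        // Otherwise fall back to the code page that corresponds to \fcharsetN.
        if (fcharset.isPresent()) {
            Integer mapped = FCHARSET_TO_CODE_PAGE.get(fcharset.getAsInt());
            if (mapped != null) {
                return mapped;
            }
        }
        // Assumption: with neither keyword usable, the caller's default applies.
        return defaultCodePage;
    }
}
```

With this sketch, a font entry carrying both `\fcharset128` and `\cpg65001` resolves to 65001 (UTF-8) rather than 932 (Shift JIS), which is the behaviour the quoted sentence describes.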
Thanks for the bug report. Could you provide the two samples as standalone RTF files which I can open with Microsoft Word or Wordpad? Thanks!
Hi Andre, unfortunately I can't see any attachments - could you link them directly to the GitHub issue?
Thanks!
Jon
I've had a chance to take a quick look. Before I make any changes to the code, I wanted to validate what Microsoft products made of the sample RTF files you provided.
This is what Wordpad makes of the JIS file
and here's what Wordpad makes of the UTF8 file
Here's what Word makes of the JIS file
and here's what Word makes of the UTF8
Based on these results I'm inclined to think that the UTF8 version of the file isn't correct as it stands. If we can get to the point where the UTF8 file renders consistently when opened in a Microsoft product and uses the cpg command, I can take a stab at getting the parser to work with it appropriately.
Thanks for the reply. Unfortunately emailing responses back to this issue drops any embedded images or files. Can you add the images via the GitHub UI?
Hi! That was interesting: I got different results from Wordpad in Windows 8.1 and Windows 10. With the Windows 10 version, both files rendered the same. Anyway, I've applied a fix and released a new version - hopefully that'll work for you!