Improper encoding / decoding of some special 7-bit values in cp437, macintosh
rossj opened this issue
Hi there.
I've noticed that cp437 does not properly encode / decode the special symbols that are assigned to bytes 0x01-0x1F and 0x7F. Instead, when decoding, these bytes are treated as-is and passed through as control characters. Similarly, when encoding, the special characters in this range are replaced with question marks.
I've noticed a similar issue with the macintosh encoding, which has special symbols defined at 0x11-0x14.
As an example, the two tests below are currently failing:
import { decode, encode } from 'iconv-lite';

describe('encodings', () => {
  it('should encode special cp437 symbols that map to bytes 0x01-0x1F', () => {
    const input = '\u263A'; // A smiley face, byte 0x01 in cp437
    const result = encode(input, 'cp437');
    expect(result[0]).toEqual(1);
  });

  it('should decode cp437 bytes in range 0x01-0x1F', () => {
    const input = Buffer.from([1]);
    const result = decode(input, 'cp437');
    expect(result).toEqual('\u263A');
  });
});
hmm yeah I think you're right. Thank you for filing this issue and the tests, really helpful!
My current encoding generation code uses the iconv project as the source, so it seems that it's wrong there too. Strange to see this in a relatively widely known encoding.
I'll fix this soon.
Came here to log exactly this. Any ETA? This would help a lot with enigma-bbs as well as a text mode RPG I'm working on!
I double-checked, and the issue does indeed exist. I looked at the source code and found that the cp437 table is generated from a remote resource, but I guess that resource is missing part of the data. How about we add special handling for these special characters?
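A minimal sketch of what such special handling could look like as a workaround on the caller's side. The helper names (`controlsToGlyphs`, `glyphsToControls`) are hypothetical and not part of iconv-lite. The glyph table is the conventional set of CP437 display glyphs for bytes 0x01-0x1F and 0x7F. It assumes the current behaviour described in this issue, where the decoder passes these bytes through as control characters, and it further assumes that the encoder writes control characters out as the same byte values. Note that the mapping is inherently ambiguous: bytes such as 0x0A or 0x0D may be real line breaks rather than glyphs, so the sketch takes a list of code points to leave untouched.

```typescript
// Hypothetical helpers, not part of iconv-lite.
// Display glyphs for CP437 bytes 0x01..0x1F, in byte order.
const CP437_LOW_GLYPHS =
  "\u263A\u263B\u2665\u2666\u2663\u2660\u2022\u25D8\u25CB\u25D9\u2642\u2640" +
  "\u266A\u266B\u263C\u25BA\u25C4\u2195\u203C\u00B6\u00A7\u25AC\u21A8\u2191" +
  "\u2193\u2192\u2190\u221F\u2194\u25B2\u25BC";
const CP437_DEL_GLYPH = "\u2302"; // house symbol, byte 0x7F

// Lookup tables built once from the glyph string above.
const BYTE_TO_GLYPH: { [byte: number]: string | undefined } = Object.create(null);
const GLYPH_TO_BYTE: { [glyph: string]: number | undefined } = Object.create(null);
for (let i = 0; i < CP437_LOW_GLYPHS.length; i++) {
  const glyph = CP437_LOW_GLYPHS.charAt(i);
  BYTE_TO_GLYPH[i + 1] = glyph;
  GLYPH_TO_BYTE[glyph] = i + 1;
}
BYTE_TO_GLYPH[0x7f] = CP437_DEL_GLYPH;
GLYPH_TO_BYTE[CP437_DEL_GLYPH] = 0x7f;

// Post-process a string that was already decoded as cp437: replace control
// characters with their display glyphs, except the code points listed in `keep`.
function controlsToGlyphs(decoded: string, keep: ReadonlyArray<number> = []): string {
  let out = "";
  for (let i = 0; i < decoded.length; i++) {
    const code = decoded.charCodeAt(i);
    const glyph = keep.indexOf(code) === -1 ? BYTE_TO_GLYPH[code] : undefined;
    out += glyph === undefined ? decoded.charAt(i) : glyph;
  }
  return out;
}

// Pre-process a string before encoding as cp437: replace display glyphs with
// the control characters whose code points equal the target byte values.
function glyphsToControls(text: string): string {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    const byte = GLYPH_TO_BYTE[ch];
    out += byte === undefined ? ch : String.fromCharCode(byte);
  }
  return out;
}
```

Under those assumptions, usage would be along the lines of `controlsToGlyphs(decode(buf, 'cp437'), [0x0a, 0x0d])` on the way in and `encode(glyphsToControls(text), 'cp437')` on the way out. A proper fix in the encoding tables themselves would still be preferable, since a wrapper like this cannot tell a glyph from a genuine control character.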