Please double-check the end of the EUDC range
hsivonen opened this issue
In the Shift_JIS decoder, the inclusive end pointer 10528 looks suspicious, since it means that only one trail byte (the lowest possible) is allowed for the lead byte F9. One would expect the special case to run either to the end of the pointers whose lead byte is F8 (making 10528 an exclusive bound) or to the end of the pointers whose lead byte is F9.
Please double-check that the range is correct and, if it is, please add a note saying that the range is weird on purpose.
cc @vyv03354
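For anyone who wants to reproduce the observation, here is a minimal sketch of the pointer computation. The function name `shift_jis_pointer` is hypothetical, and the offsets for lead bytes below 0xA0 (0x81) and trail bytes below 0x7F (0x40) are assumptions taken from the Shift_JIS decoder as I understand it; only the 0xC1 and 0x41 offsets and the factor 188 appear in this thread.

```python
def shift_jis_pointer(lead: int, trail: int) -> int:
    """Return the index pointer for a Shift_JIS lead/trail byte pair.

    Assumed offsets (not all stated in this thread):
      lead  < 0xA0 -> 0x81, otherwise 0xC1
      trail < 0x7F -> 0x40, otherwise 0x41
    Each lead byte covers 188 consecutive pointers.
    """
    lead_offset = 0x81 if lead < 0xA0 else 0xC1
    trail_offset = 0x40 if trail < 0x7F else 0x41
    return (lead - lead_offset) * 188 + trail - trail_offset


# Lowest trail byte for lead F9: this is the suspicious inclusive bound.
print(shift_jis_pointer(0xF9, 0x40))
# Highest trail byte for lead F8: one less than the bound above.
print(shift_jis_pointer(0xF8, 0xFC))
```

Under these assumptions, F9 40 is the only F9 pair whose pointer falls at or below 10528, which matches the concern above.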
This range originated from https://www.w3.org/Bugs/Public/show_bug.cgi?id=24130. @vyv03354 can probably clear this up.
The entire F9 range is valid as the removed commit indicates:
651f672
This is a bug in the current decoder algorithm.
Okay, so it should be 10715 ((0xF9 - 0xC1) x 188 + 0xFC - 0x41 in decimal)?
@r12a not sure if you test this range, but you might have to update a few things here.
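The arithmetic for the proposed bound can be checked with a short sketch. The lower bound of 8836 (lead byte F0) and the mapping to U+E000 onward are assumptions based on my reading of the decoder's EUDC special case, not something stated in this thread; `eudc_code_point` is a hypothetical helper name.

```python
TRAILS_PER_LEAD = 188   # pointers covered by one lead byte
LEAD_OFFSET = 0xC1      # offset for lead bytes >= 0xA0

# Assumption: the EUDC range starts at the first pointer of lead byte F0.
FIRST_POINTER = (0xF0 - LEAD_OFFSET) * TRAILS_PER_LEAD
# Proposed inclusive end: last pointer of lead byte F9 (trail byte FC).
LAST_POINTER = (0xF9 - LEAD_OFFSET) * TRAILS_PER_LEAD + (0xFC - 0x41)


def eudc_code_point(pointer: int) -> int:
    """Map an EUDC pointer to a Private Use Area code point (assumed mapping)."""
    if not FIRST_POINTER <= pointer <= LAST_POINTER:
        raise ValueError(f"pointer {pointer} is outside the EUDC range")
    return 0xE000 - FIRST_POINTER + pointer


print(FIRST_POINTER, LAST_POINTER)
print(hex(eudc_code_point(LAST_POINTER)))
```

With the full F9 row included, the range spans ten lead bytes (F0 through F9), that is 10 × 188 = 1880 pointers.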
> Okay, so it should be 10715 ((0xF9 - 0xC1) x 188 + 0xFC - 0x41 in decimal)?
I think so.
So 236196e introduced the current range and it seems I just made an error there by not including the full range of 0xF9.