deNULL / utf-c

A compact way to encode Unicode strings

Typo in rangesLatin declaration

mdmt1 opened this issue

commented

In both the Go and JS versions, the range for the comma character is specified incorrectly: {0x2D, 0x2C} instead of {0x2C, 0x2D}.
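To illustrate why the order matters, here is a minimal sketch. It assumes each pair is a half-open range [start, end), which is what the pairs quoted in this issue suggest. `inRange` is a hypothetical helper written for this example, not a function from the utf-c source.

```javascript
// Hypothetical helper, not taken from utf-c: returns true when codePoint
// lies in the half-open range [start, end). The half-open reading is an
// assumption based on the pairs quoted in this issue.
function inRange(codePoint, [start, end]) {
  return start <= codePoint && codePoint < end;
}

const reversed = [0x2D, 0x2C]; // pair as currently declared
const comma = [0x2C, 0x2D];    // corrected pair, covers only ","

// With start > end the range is empty, so neither "," nor "-" matches.
console.log(inRange(0x2C, reversed)); // false
console.log(inRange(0x2D, reversed)); // false

// The corrected pair matches "," and nothing else.
console.log(inRange(0x2C, comma)); // true
console.log(inRange(0x2D, comma)); // false
```

Under that assumption, the reversed pair matches no character at all, so the typo effectively drops the character from the Latin set rather than swapping it for another one.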

commented

Also, the auxOffset declaration has an outdated comment:
// 0x0000, Latin is a special case, it merges A-Z, a-z, 0-9, "-" and " " characters.
Note that it lists "-" ({0x2D, 0x2E}) instead of ",".
README.md has the same error, assuming the Habr post is the source of truth.

commented

Actually, I think that was a mistake in the article; it's supposed to be "-" (i.e. {0x2D, 0x2E}). Obviously, one can choose "," instead in their own implementation (and tweak the code accordingly), depending on the context. For example, a dash can be more useful when storing words in a dictionary (or for texts with a lot of negative numbers? :)