Some Unicode characters aren't iterated correctly

Question

cbbfcd opened this issue 5 years ago

Thanks! I like this repo, but some Unicode strings cause trouble, for example:

type(target, 'café', 1000, 'cafe', 1000, loop);

type(target, '👨‍👩‍👦', 1000, '👨‍👩‍👧‍👦', 1000, loop);

Cam Wiegert · Answer 1 · Mon Jan 13 2020 23:10:32 GMT+0800 (China Standard Time)

I know that some Unicode sequences aren't handled as a single character by String.prototype[Symbol.iterator], which is how this library iterates over strings. The iterator walks the string one code point at a time, so a surrogate pair stays together, but anything built from several code points is split.

Many emoji and other Unicode characters are handled correctly, but combining diacritics and emoji joined with zero-width joiners cause an issue. I'd like to handle as many cases as possible without greatly increasing the complexity of the library or introducing a dependency.
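
To make the failing cases concrete, here is a minimal sketch of what spreading a string (which goes through String.prototype[Symbol.iterator]) produces for the two inputs from the issue. The strings are written with escapes so the code points are unambiguous; the variable names are only illustrative and are not part of the library.

```javascript
// Decomposed "café": 'e' followed by U+0301 COMBINING ACUTE ACCENT.
const decomposed = 'cafe\u0301';

// Family emoji: man + ZWJ + woman + ZWJ + boy (ZWJ is U+200D).
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F466}';

// A single emoji outside the Basic Multilingual Plane (one surrogate pair).
const man = '\u{1F468}';

// Spreading a string iterates by code point, not by UTF-16 code unit.
console.log(man.length);             // 2 (UTF-16 code units)
console.log([...man].length);        // 1 (the surrogate pair stays together)

// Sequences made of several code points are split into their parts.
console.log([...decomposed].length); // 5 (the accent is a separate step)
console.log(family.length);          // 8 (UTF-16 code units)
console.log([...family].length);     // 5 (three emoji and two joiners)
```

So a typing animation driven by this iteration would show the bare "e" before the accent appears, and would step through each family member and joiner separately.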

One thing I haven't done yet is use String.prototype.normalize on string arguments. I'm going to use this issue to discuss options. Thank you!

Cam Wiegert · Answer 2 · Mon Jan 13 2020 23:13:47 GMT+0800 (China Standard Time)

An example of using the String normalize method:

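[...'cafe\u0301'].length === 5 // 'e' followed by U+0301 COMBINING ACUTE ACCENT
[...'cafe\u0301'.normalize()].length === 4 // NFC composes the pair into a single 'é'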
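
As a rough sketch of how far normalize gets with the two inputs from the issue: it composes the accent in "café", but it has no effect on emoji joined with zero-width joiners. The second half shows one possible dependency-free alternative, Intl.Segmenter, assuming a runtime that ships it (it was not widely available when this issue was opened). The graphemes helper is hypothetical and is not part of the library.

```javascript
// Decomposed "café" and the family emoji (man + ZWJ + woman + ZWJ + boy).
const decomposed = 'cafe\u0301';
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F466}';

// normalize() defaults to NFC, which composes 'e' + U+0301 into 'é'.
console.log([...decomposed.normalize()].length); // 4

// There is no precomposed form for a ZWJ sequence, so nothing changes.
console.log([...family.normalize()].length);     // 5

// Hypothetical helper: split a string into user-perceived characters
// (grapheme clusters) using the built-in Intl.Segmenter.
function graphemes(str) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  return Array.from(segmenter.segment(str), (part) => part.segment);
}

console.log(graphemes(decomposed).length); // 4
console.log(graphemes(family).length);     // 1
```

Whether this trade-off fits the library's goals (no dependency, small size, browser support) is a separate question from the normalize approach discussed here.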

波比小金刚 · Answer 3 · Tue Jan 14 2020 11:06:05 GMT+0800 (China Standard Time)

@camwiegert Thanks for your reply.
I totally agree with you, and String.prototype.normalize is a good idea.