Problems with ️FE0F character
oriolbcn opened this issue · comments
It seems that the ️️FE0F character that some emojis have is not properly taken into account, which causes rendering problems in Safari because it leaves the FE0F character there when, in fact, it should be taken as part of the emoji.
Emojis are a critical part of our business. We recently migrated our old implementation to this library and are even paying for a license, and this is a really big blocker for us right now: we can't use the library as it is, because it would give our users a bad UX.
How to reproduce
Open the toImage demo page in Safari and try with the ❤️ emoji. This is what happens:
What I have discovered so far
The problem comes from a combination of the unicode maps that are constructed and the contents of the emojiList object. Specifically, these 2 lines:
https://github.com/emojione/emojione/blob/master/lib/js/emojione.js#L409
https://github.com/emojione/emojione/blob/master/lib/js/emojione.js#L400
This uses the uc_output property; however, for all the emojis that have FE0F at the end, the uc_output is missing that part. For instance, the uc_output of ❤️ is just 2764. Therefore, when converting U+2764 U+FE0F, it just replaces U+2764 and leaves the U+FE0F hanging.
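A minimal sketch of the failure described above. The entry shape, the `:heart:` shortname key, and the `replaceWithShortname` helper are hypothetical stand-ins for illustration, not the library's actual code; only the uc_output value 2764 comes from this report.

```javascript
// Hypothetical, simplified entry: only the uc_output value is taken from the report.
const heartEntry = { shortname: ":heart:", uc_output: "2764" };

// Convert a hyphen-separated list of hex code points into a string.
function fromHexSequence(hex) {
  return String.fromCodePoint(...hex.split("-").map((h) => parseInt(h, 16)));
}

// Illustrative replacement keyed on uc_output only (not the library's real function).
function replaceWithShortname(text, entry) {
  return text.split(fromHexSequence(entry.uc_output)).join(entry.shortname);
}

const input = "I \u2764\uFE0F you"; // U+2764 followed by U+FE0F
const output = replaceWithShortname(input, heartEntry);

// U+2764 is replaced, but U+FE0F is left behind right after the shortname.
const leftover = output.includes("\uFE0F");
```

Here `leftover` ends up true: the variation selector survives the replacement, which matches the stray character seen in Safari.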
I have noticed that the uc_match property has the proper combination (2764-FE0F), so I tried to use that, but then it fails for complex emojis like Family and Couple that use the 200D joiner. For those, the uc_output has the joiner, but the uc_match does not.
In summary:
uc_output has the 200D joiner character but not the FE0F character. Fails with ❤️, succeeds with 👨👧.
uc_match has the FE0F character but not the 200D joiner. Succeeds with ❤️, fails with 👨👧.
So I think the underlying problem is in the way the emojiList object is built. I have tried to find the script that generates it but can't find it; I guess you keep it private.
Possible solutions I can think of
Best (more difficult)
The emoji.json file has an array of multiple matches for each emoji. I think the library should consider all of these for a match. They include all variations with and without the FE0F and 200D characters, and maybe even other optional characters. This ensures that the emoji will be matched no matter how it is written.
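A rough sketch of what matching against every variant could look like. The field name `matches`, the shortnames, and the example values are assumptions made for illustration; the real emoji.json layout may differ.

```javascript
// Hypothetical entries: the `matches` field name and these values are assumed.
const entries = [
  { shortname: ":heart:", matches: ["2764-fe0f", "2764"] },
  { shortname: ":family_man_girl:", matches: ["1f468-200d-1f467", "1f468-1f467"] },
];

// Convert a hyphen-separated list of hex code points into a string.
function fromHexSequence(hex) {
  return String.fromCodePoint(...hex.split("-").map((h) => parseInt(h, 16)));
}

// Escape regex syntax characters (needed for emojis such as keycaps built on "*" or "#").
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Map every written variant to its shortname.
const variantToShortname = new Map();
for (const entry of entries) {
  for (const hex of entry.matches) {
    variantToShortname.set(fromHexSequence(hex), entry.shortname);
  }
}

// Longest variants first, so the sequence with FE0F wins over the bare code point.
const variants = [...variantToShortname.keys()].sort((a, b) => b.length - a.length);
const pattern = new RegExp(variants.map(escapeRegExp).join("|"), "gu");

function toShortnames(text) {
  return text.replace(pattern, (match) => variantToShortname.get(match));
}
```

With this approach the same shortname comes out whether the input carries FE0F or 200D or not, at the cost of a larger alternation.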
Good
Standardize the values of uc_match and uc_output. Always have the full unicode sequence (with FE0F and 200D) in one property and the sequence without them in the other.
Hacky
I guess that if, with the current emojiList, we take both the uc_match and uc_output to build the maps (always placing the longer of the two first), it would work, but I have not tested it yet.
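An untested-in-the-library sketch of this idea. The emojiList excerpt is hypothetical: the heart values come from this report, while the shortname keys and the family values are assumed to follow the pattern described above.

```javascript
// Hypothetical emojiList excerpt (keys and family values are assumptions).
const emojiList = {
  ":heart:": { uc_match: "2764-fe0f", uc_output: "2764" },
  ":family_man_girl:": { uc_match: "1f468-1f467", uc_output: "1f468-200d-1f467" },
};

// Convert a hyphen-separated list of hex code points into a string.
function fromHexSequence(hex) {
  return String.fromCodePoint(...hex.split("-").map((h) => parseInt(h, 16)));
}

// Collect [sequence, shortname] pairs from both properties of every entry.
const pairs = [];
for (const [shortname, entry] of Object.entries(emojiList)) {
  for (const hex of new Set([entry.uc_match, entry.uc_output])) {
    pairs.push([fromHexSequence(hex), shortname]);
  }
}

// Longest sequence first, so 2764-FE0F is tried before the bare 2764.
pairs.sort((a, b) => b[0].length - a[0].length);

function toShortnames(text) {
  return pairs.reduce(
    (acc, [sequence, shortname]) => acc.split(sequence).join(shortname),
    text
  );
}
```

Ordering by length is what keeps the bare U+2764 entry from consuming the heart before the FE0F variant has had a chance to match.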
@oriolbcn we've published a solution for this that I think works fairly well. Please try this out and let me know how it goes.
As you'll see, trailing VS16 characters will simply be removed after replacement has occurred which has proven successful in variations of this library that we've used in our applications. If you run into any issues with it we can revisit the alternatives.
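The published change itself is not shown in this thread, so the following is only one possible reading of "remove trailing VS16 after replacement". The function name, the `:heart:` shortname, and the shortname pattern are all assumptions for illustration.

```javascript
// Illustrative only: replace the bare heart, then drop a VS16 (U+FE0F)
// that is left directly behind the inserted shortname.
function replaceThenStripVS16(text) {
  // Step 1: replacement keyed on the sequence without FE0F (as uc_output is today).
  const replaced = text.split("\u2764").join(":heart:");
  // Step 2: remove a VS16 that immediately trails a shortname.
  return replaced.replace(/(:[a-z0-9_+-]+:)\uFE0F/g, "$1");
}

const withSelector = replaceThenStripVS16("I \u2764\uFE0F you");
const withoutSelector = replaceThenStripVS16("I \u2764 you");
```

Both inputs end up as the same string, so the stray character no longer reaches the renderer.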
Tried it and it works great 👍 I don't love the solution though; to me the character should be taken as part of the emoji replacement. But it does the job.