Emoji convert to Unicode codepoints
guitarrapc opened this issue
Description
yamlfmt converts emoji to Unicode code point escape sequences.
This is a breaking change to the YAML.
Reproduce steps
- Prepare a YAML file.
ascii: this is acsii
# https://emojipedia.org/smiling-face-with-smiling-eyes/ https://codepoints.net/U+1F60A
# https://emojipedia.org/party-popper/ https://codepoints.net/U+1F389
emoji: 😊 🎉
- Run yamlfmt.
- See the change: the emoji are converted to their Unicode escape sequences.
ascii: this is acsii
# https://emojipedia.org/smiling-face-with-smiling-eyes/ https://codepoints.net/U+1F60A
# https://emojipedia.org/party-popper/ https://codepoints.net/U+1F389
emoji: "\U0001F60A \U0001F389"
-dry diff
$ ls
foo.yaml
$ yamlfmt -dry
foo.yaml:
ascii: this is acsii
# https://emojipedia.org/smiling-face-with-smiling-eyes/ https://codepoints.net/U+1F60A
# https://emojipedia.org/party-popper/ https://codepoints.net/U+1F389
- emoji: 😊 🎉
+ emoji: "\U0001F60A \U0001F389"
Version Info
- yamlfmt v0.3.0
- OS: Windows 10 and 11
Thank you for opening an issue!
This is an upstream bug in the yaml.v3 package. The issue go-yaml/yaml#737 and the PR to fix it, go-yaml/yaml#738, have been open for some time and met with silence. The problem seems to be that some characters in UTF-16 sequences are parsed properly but, on rewriting, are not detected as printable. I will try to bump the PR and the issue; hopefully the fix can get merged soon.
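To make the character class involved concrete, here is a small Go sketch that inspects the two emoji from the report. It does not reproduce yaml.v3's own printable check. It only shows, using the standard library, that both code points sit outside the Basic Multilingual Plane (4 bytes in UTF-8, a surrogate pair in UTF-16) while Go's `unicode.IsPrint` still treats them as printable. The `describe` helper is a hypothetical name used for illustration.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf16"
	"unicode/utf8"
)

// describe is a hypothetical helper (not part of yamlfmt or yaml.v3).
// It summarizes how a rune is encoded and whether the Go standard
// library considers it printable.
func describe(r rune) string {
	hi, lo := utf16.EncodeRune(r) // surrogate pair for runes above U+FFFF
	return fmt.Sprintf("U+%04X utf8=%d bytes utf16=%04X %04X printable=%t",
		r, utf8.RuneLen(r), hi, lo, unicode.IsPrint(r))
}

func main() {
	// The two emoji from the reproduction steps.
	for _, r := range []rune{0x1F60A, 0x1F389} {
		fmt.Println(describe(r))
	}
}
```

Any printable check that only covers code points up to U+FFFF would miss both of these characters, which is consistent with the behaviour described above.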
If it takes too long, I may have to fork it and pull it in as a submodule in this repo, but I would really like to avoid that if possible.
I managed to create a hotfix that will parse the literal unicode codepoints and re-encode them properly. It's pretty overkill, but all signs point to this not being fixed in yaml.v3 any time soon.
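For readers curious what such a workaround could look like, below is a minimal sketch, not the actual yamlfmt hotfix. It assumes the escapes always use the 8-hex-digit `\UXXXXXXXX` form shown in the diff above; the names `escapedRune` and `decodeEscapedRunes` are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"unicode/utf8"
)

// escapedRune matches the 8-hex-digit escape form from the report,
// e.g. \U0001F60A. (Assumption: no other escape forms are handled.)
var escapedRune = regexp.MustCompile(`\\U([0-9A-Fa-f]{8})`)

// decodeEscapedRunes is a hypothetical helper that replaces each
// \UXXXXXXXX escape in s with the UTF-8 encoding of that code point.
// Escapes that do not name a valid rune are left untouched.
func decodeEscapedRunes(s string) string {
	return escapedRune.ReplaceAllStringFunc(s, func(m string) string {
		n, err := strconv.ParseUint(m[2:], 16, 32) // skip the leading `\U`
		if err != nil || !utf8.ValidRune(rune(n)) {
			return m
		}
		return string(rune(n))
	})
}

func main() {
	fmt.Println(decodeEscapedRunes(`emoji: "\U0001F60A \U0001F389"`))
}
```

This sketch keeps the double quotes that the encoder added around the value, and it would also rewrite an escape that follows an escaped backslash (`\\U0001F60A`), so a real fix would need more care around quoting.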
Thanks for bringing this to my attention!