Compiling with strange characters
aleixpellicer opened this issue · comments
I've just installed and run Eleventy and it is compiling the example with strange characters:
<p>��#� �P�a�g�e� �h�e�a�d�e�r�
�
�</p>
OS: Windows 10
Eleventy: 0.12.1
Node: 10.23.0
@aleixpellicer Odd. What happens if you edit the README.md locally and resave+rebuild?
Exactly the same.
This:
# Page header
# Hello again
Converts into this:
<p>��#� �P�a�g�e� �h�e�a�d�e�r� � �#� �H�e�l�l�o� �a�g�a�i�n�</p>
Very strange. I haven't seen that before, and I don't use Windows often enough to be much help troubleshooting. I can probably reinstall Windows 10 on a VM and poke at it tomorrow.
const str = "��#� �P�a�g�e� �h�e�a�d�e�r� � �#� �H�e�l�l�o� �a�g�a�i�n�";
for (const char of str) {
  console.log(`char="${char}"; code=${char.charCodeAt(0)}`);
}
OUTPUT
char="�"; code=65533
char="�"; code=65533
char="#"; code=35
char="�"; code=65533
char=" "; code=32
char="�"; code=65533
char="P"; code=80
char="�"; code=65533
char="a"; code=97
char="�"; code=65533
char="g"; code=103
char="�"; code=65533
char="e"; code=101
char="�"; code=65533
char=" "; code=32
char="�"; code=65533
char="h"; code=104
char="�"; code=65533
char="e"; code=101
char="�"; code=65533
char="a"; code=97
char="�"; code=65533
char="d"; code=100
char="�"; code=65533
char="e"; code=101
char="�"; code=65533
char="r"; code=114
char="�"; code=65533
char=" "; code=32
char="�"; code=65533
char=" "; code=32
char="�"; code=65533
char="#"; code=35
char="�"; code=65533
char=" "; code=32
char="�"; code=65533
char="H"; code=72
char="�"; code=65533
char="e"; code=101
char="�"; code=65533
char="l"; code=108
char="�"; code=65533
char="l"; code=108
char="�"; code=65533
char="o"; code=111
char="�"; code=65533
char=" "; code=32
char="�"; code=65533
char="a"; code=97
char="�"; code=65533
char="g"; code=103
char="�"; code=65533
char="a"; code=97
char="�"; code=65533
char="i"; code=105
char="�"; code=65533
char="n"; code=110
char="�"; code=65533
As close as I can figure, char code 65533 is the "Unicode Character 'REPLACEMENT CHARACTER' (U+FFFD)" (per https://www.fileformat.info/info/unicode/char/fffd/index.htm).
Comments: used to replace an incoming character whose value is unknown or unrepresentable in Unicode.
compare the use of U+001A as a control character to indicate the substitute function
I wonder if it's an encoding issue. If you open the code in VS Code (or whatever), is the encoding set to UTF-8, or something else?
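If opening the file in an editor isn't convenient, the first few bytes of the file usually give the encoding away. Here is a minimal sketch of that check in Node. The `detectBom` helper is hypothetical (it is not part of Eleventy or Node), and it only looks at the byte order mark, so a file without a BOM is simply reported as "no BOM".

```javascript
// Hypothetical helper: guess a file's encoding from its byte order mark (BOM).
// It only inspects the first bytes; it does not validate the rest of the file.
function detectBom(bytes) {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "utf-8 (with BOM)";
  }
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return "utf-16le";
  }
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    return "utf-16be";
  }
  return "no BOM";
}

// In-memory sample standing in for a README.md saved as UTF-16LE with a BOM.
const utf16Sample = Buffer.from("\ufeff# Page header", "utf16le");
console.log(detectBom(utf16Sample)); // "utf-16le"

// For a real file: detectBom(require("fs").readFileSync("README.md"))
```

A result of "utf-16le" would be consistent with the two leading replacement characters in the output above.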
I'm having a hard time reproducing locally here on macOS 11.6.1/Big Sur, w/ Eleventy v0.12.1 and Node v14.18.1.
# Hello World
Produces the following output:
<h1>Hello World</h1>
This issue has come up before. Apparently it happens to Windows users when they copy the example from the Eleventy website:
I will close this, but this will keep happening to Windows users, so maybe the fix could be another way of generating the example files.
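One possible "other way" of generating the example file, sketched here as an assumption rather than anything the Eleventy docs prescribe, is to write it from Node with an explicit encoding, so the result does not depend on the shell. The temporary directory below is only for illustration; in a real project the target would be README.md in the project folder.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Illustration only: write into a fresh temp directory instead of a real project.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "eleventy-demo-"));
const target = path.join(demoDir, "README.md");

// Write the example content as UTF-8, regardless of the shell's defaults.
fs.writeFileSync(target, "# Page header\n", { encoding: "utf8" });

// Read the raw bytes back: UTF-8 uses one byte per ASCII character and no BOM.
const bytes = fs.readFileSync(target);
console.log(bytes.length, bytes[0].toString(16));
```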
I also hit this issue while following the quick start example. It happens when the file is created with the 'echo' command in PowerShell (version 5.1 or earlier), which writes the characters as UTF-16 by default.
It goes away if you create the file with an editor such as VS Code, which writes the characters as UTF-8, the most commonly used encoding.
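For anyone curious why UTF-16 produces exactly this pattern, here is a rough reproduction in Node. It assumes the file was UTF-16LE with a byte order mark and that the Markdown parser replaces NUL characters with U+FFFD (the CommonMark spec asks for this, but I have not traced it through Eleventy itself). The last step shows one way to repair the content by decoding it with the right encoding.

```javascript
// Simulate a file written as UTF-16LE with a BOM (assumed PowerShell 5.1 behaviour).
const utf16Bytes = Buffer.from("\ufeff# Page header", "utf16le");

// Misread the bytes as UTF-8: the BOM bytes FF FE are invalid and each becomes
// U+FFFD, and the zero high byte of every ASCII character becomes a NUL.
const misread = utf16Bytes.toString("utf8");

// Assumption: the Markdown parser swaps each NUL for U+FFFD.
const rendered = misread.replace(/\u0000/g, "\ufffd");
console.log(rendered);

// Repair: decode with the correct encoding and drop the BOM.
const repaired = utf16Bytes.toString("utf16le").replace(/^\ufeff/, "");
console.log(repaired); // "# Page header"
```

The first line printed has two replacement characters up front and one after every real character, which matches the char code dump earlier in the thread.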
Please upvote #2323