Accentuation changed to weird HTML entity with gmail_quote
Savageman opened this issue
Julian Espérat commented
é is transformed to ĂŠ when a gmail_quote div is present:
// Good
quotations.extract_from_html("<div>accentuation é</div>");
// <div>accentuation é</div>
// Bad: é is transformed to ĂŠ
quotations.extract_from_html("<div>accentuation é</div><div class='gmail_quote'></div>");
// <html><head></head><body><div>accentuation ĂŠ</div></body></html>
My input string is in UTF-8.
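For reference, the garbled pair matches what you get when the two UTF-8 bytes of é are each read as a separate ISO-8859-2 (Latin-2) character. The sketch below only shows that byte-level coincidence. It does not call the library, and where the wrong decode actually happens is not established in this thread. The helper names `mojibake` and `repair` are made up for illustration.

```python
# Illustration only: reproduces the garbled output at the byte level.
# It makes no claim about which component performs the wrong decode.

def mojibake(text: str, wrong_encoding: str = "iso-8859-2") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong single-byte codec."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled: str, wrong_encoding: str = "iso-8859-2") -> str:
    """Undo mojibake(): re-encode with the wrong codec, then decode as UTF-8."""
    return garbled.encode(wrong_encoding).decode("utf-8")


original = "\u00e9"                  # é
utf8_bytes = original.encode("utf-8")  # two bytes: 0xC3 0xA9
garbled = mojibake(original)         # 0xC3 -> Ă (U+0102), 0xA9 -> Š (U+0160)

assert utf8_bytes == b"\xc3\xa9"
assert garbled == "\u0102\u0160"     # ĂŠ, as in the bad output above
assert repair(garbled) == original
```

If that is what is going on, the input is valid UTF-8 and the damage comes from a later step reading those bytes with a different charset.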
Julian Espérat commented
Found a weird fix: if I wrap <html>...</html> tags around the content, it works!
So <html><div>accentuation é</div><div class='gmail_quote'></div></html> is fine :)
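A minimal sketch of that workaround as a helper, assuming the extractor takes an HTML string and returns one. The name `extract_with_html_wrapper` and the `extract` parameter are hypothetical; in practice `extract` would be the `quotations.extract_from_html` function used above.

```python
from typing import Callable


def extract_with_html_wrapper(fragment: str, extract: Callable[[str], str]) -> str:
    """Wrap `fragment` in <html> tags unless it already starts with one, then call `extract`.

    This is a simple prefix check, not an HTML parser: input that begins with
    a doctype or a comment is wrapped as well.
    """
    if not fragment.lstrip().lower().startswith("<html"):
        fragment = "<html>" + fragment + "</html>"
    return extract(fragment)


# Stand-in extractor so the sketch runs without the library; it returns its input.
def identity(html: str) -> str:
    return html


wrapped = extract_with_html_wrapper(
    "<div>accentuation \u00e9</div><div class='gmail_quote'></div>", identity
)
assert wrapped.startswith("<html><div>") and wrapped.endswith("</html>")
```

This only automates the wrapping described above; it does not explain why the wrapped form avoids the garbling.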