Encoding changes when premailer is run
engwan opened this issue
By default, premailer replaces HTML entities (see premailer/premailer#181). This can sometimes change a message body's encoding. For example, &mdash; is changed to — and is no longer valid for 7bit transfer encoding.
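A minimal sketch of why the substitution matters, in plain Ruby with no premailer involved. The helper name seven_bit_safe? is hypothetical and checks only the one property relevant here (a 7bit body must be pure US-ASCII); line-length limits are ignored.

```ruby
# Hypothetical helper, not part of premailer or premailer-rails:
# a body can only use 7bit transfer encoding if every byte is US-ASCII.
def seven_bit_safe?(body)
  body.ascii_only?
end

before = "Hello &mdash; world"  # entity form, plain ASCII
after  = "Hello \u2014 world"   # literal em dash after entity replacement

puts seven_bit_safe?(before)  # true
puts seven_bit_safe?(after)   # false
```

Once the entity becomes a literal character, the body needs quoted-printable or base64 (or 8bit) instead of 7bit.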
The problem is that premailer-rails forces encoding and transfer-encoding to be that of the old body.
I think we should just remove lines 76 and 79 from https://github.com/fphilipe/premailer-rails/blob/master/lib/premailer/rails/hook.rb
This makes Mail::Part detect the right encoding and transfer encoding to use.
Note: I saw #166 and I think the problem back then was that we were setting the transfer-encoding but not the encoding, causing text to be mangled. If we remove both, I think it should work fine.
Also, in our case, since we don't want premailer to replace HTML entities, we pass the output_encoding: 'US-ASCII' option as described here: https://github.com/premailer/premailer/wiki/Premailer-Parameters-and-Options
This turns non-ASCII characters in the original body to HTML-encoded ones (&#xxxx;). The resulting body is now valid for 7bit transfer encoding but premailer-rails forces this to be quoted-printable (the transfer encoding of the old body).
Removing the setting of encoding and transfer-encoding would fix both cases where the transformed body needs a more complex or simpler transfer encoding.
Here's the issue on our issue tracker for reference: https://gitlab.com/gitlab-org/gitlab-ce/issues/55964
@fphilipe any thoughts on this? I'm seeing GMail show "[Message clipped]" despite my emails being well under the 102kb limit, I believe because they contain the © symbol (which is &copy; in my code, but seems to get transformed to © by premailer)
@fphilipe I removed the Premailer::Rails.config[:output_encoding] = 'US-ASCII' initializer & used the gem as follows: gem 'premailer-rails', github: 'fphilipe/premailer-rails', branch: 'fix-encoding'
The message clipping is unfortunately back with this branch. It seems this fix doesn't work 😞
Would you be able to share the resulting email as an eml file?
Even better would be if you could also share an eml file generated with the latest release for comparison.
I emailed you eml files. The key difference seems to be:
clipped: =A9 2019
not clipped: © 2019
Also, clipped emails now have =0D before every newline character
Thanks, @swrobel!
Just to clarify:
- clipped: no output_encoding specified and using the fix-encoding branch
- not clipped: output_encoding set to US-ASCII and using the latest available version of the gem from RubyGems
First, both emails render correctly for me in Mail.app and in Thunderbird. Switching to plain text in Thunderbird also renders everything correctly.
The difference between the way © is encoded is because of the removal of output_encoding. When output_encoding is set to US-ASCII, premailer (or nokogiri, not sure) will convert non-ASCII characters to the equivalent HTML entity. If you were to keep output_encoding: 'US-ASCII' you'd still get ©.
On the other hand, =A9 is the encoding for the Unicode character © (this is the case when Content-Transfer-Encoding is set to quoted-printable; here's a good explanation of the different transfer encodings). I'm unsure though why it is being encoded as =A9 and not =C2=A9 as is the case in the GitHub notification I received for your comment that included that character:
Content-Type: text/plain;
charset=UTF-8
Content-Transfer-Encoding: quoted-printable
@fphilipe any thoughts on this? I'm seeing GMail show "[Message clipped]"=
despite my emails being well under the 102kb limit, I believe because th=
ey contain the =C2=A9 symbol (which is `&copy;` in my code, but seems to =
get transformed to =C2=A9 by premailer)=0D
=0D
-- =0D
You are receiving this because you were mentioned.=0D
Reply to this email directly or view it on GitHub:=0D
https://github.com/fphilipe/premailer-rails/issues/240#issuecomment-53142=
4966=
Regarding =0D, that's simply the encoding of a carriage return. Look at the GitHub notification email above and you'll spot those as well. These get encoded because the non-encoded carriage return has a special meaning in the encoded email. The above gets rendered as follows:
@fphilipe any thoughts on this? I'm seeing GMail show "[Message clipped]" despite my emails being well under the 102kb limit, I believe because they contain the © symbol (which is `&copy;` in my code, but seems to get transformed to © by premailer)
--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
https://github.com/fphilipe/premailer-rails/issues/240#issuecomment-531424966
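The decoding described above can be reproduced with Ruby's built-in quoted-printable support (String#unpack1 with the "M" directive). This is only a sketch on a shortened, made-up fragment in the style of the notification email, assuming the charset is UTF-8 as declared in its headers.

```ruby
# Shortened fragment in the style of the notification email above.
# "=C2=A9" is the UTF-8 byte pair for the copyright sign, "=0D" is a
# carriage return, and a trailing "=" is a soft line break.
raw = "contain the =C2=A9 symbol=0D\n" \
      "-- =0D\n" \
      "issuecomment-53142=\n" \
      "4966=\n"

# unpack1("M") returns raw bytes; label them with the declared charset.
decoded = raw.unpack1("M").force_encoding("UTF-8")

puts decoded.inspect
```

The soft line breaks disappear, so the URL fragment split across two lines is joined back into one.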
Questions:
- If you leave out the © character, does that make any difference in Gmail's clipping?
- Both emails you shared contain a blob at the end, which I couldn't read or figure out what it is. Can you explain? That's also adding to your email's size.
- Would you be able to share the email before premailer is run by disabling premailer? That would help in clarifying some of the charsets, which look off to me. I'm curious why the email body is encoded as ISO-8859-1 rather than UTF-8.
@fphilipe apologies for the delay in responding but this took a lot of troubleshooting. The key thing that you noticed was:
I'm curious why the email body is encoded as ISO-8859-1 rather than UTF-8
I looked at the original message being sent by Rails and it indeed should've been UTF-8 so either GMail or Sendgrid was changing it. After changing from Sendgrid to Mailgun, the emails stopped being clipped and GMail's raw view showed the correct UTF-8 encoding (this is just using the currently released gem & no US-ASCII override). Bizarre!!!
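For anyone comparing eml files later, here is a small plain-Ruby sketch of how the charset shows up in the quoted-printable output. It only illustrates why the same © character appears as =A9 under ISO-8859-1 and as =C2=A9 under UTF-8; it does not reproduce what Sendgrid or GMail did to the message.

```ruby
copyright = "\u00A9 2019"  # "© 2019" as a UTF-8 string

# pack("M") quoted-printable-encodes the raw bytes and appends a soft
# line break ("=\n"), which is stripped here for readability.
utf8_qp   = [copyright].pack("M").sub(/=\n\z/, "")
latin1_qp = [copyright.encode("ISO-8859-1")].pack("M").sub(/=\n\z/, "")

puts utf8_qp    # =C2=A9 2019
puts latin1_qp  # =A9 2019

# A lone 0xA9 byte is not valid UTF-8, so the declared charset has to
# match the bytes actually sent.
lone_byte_valid = "=A9".unpack1("M").force_encoding("UTF-8").valid_encoding?
puts lone_byte_valid  # false
```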
Hopefully @engwan can weigh in on whether your branch fixes his issue.
Is there any update on this?
Hey guys, we recently activated SMIME signing for emails within gitlab, and noticed that as soon as any unicode character is included in the email, the calculated signature failed. This pointed to encoding/conversion issues between signature calculation and delivery, and since premailer-rails is also active there, I was suspecting it could be the cause.
But I did some local tests and disabling premailer didn't show any effect. Funny thing, just downgrading the mail gem to 2.6.6 solved the problems. This open issue might be the cause: mikel/mail#1190
We'll be opening an issue in gitlab later but wanted to give you a heads up in case this could be related.
premailer-rails 1.11.0 is out, which should address this. Thanks everyone!