fphilipe / premailer-rails

CSS styled emails without the hassle.

Encoding changes when premailer is run

engwan opened this issue

By default, premailer replaces HTML entities (see premailer/premailer#181). This can sometimes change a message body's encoding. For example, &mdash; is changed to — and the body is no longer valid for 7bit transfer encoding.
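To illustrate why replacing an entity with its literal character matters for 7bit, here is a minimal sketch. The helper name seven_bit_safe? is made up for this example, and it only checks the byte range; real 7bit validity also involves line-length limits.

```ruby
# 7bit transfer encoding only allows bytes below 128.
# seven_bit_safe? is a hypothetical helper, not part of premailer or the mail gem.
def seven_bit_safe?(body)
  body.bytes.all? { |byte| byte < 128 }
end

entity_form  = "Hello &mdash; world"  # entity kept as plain ASCII text
literal_form = "Hello \u2014 world"   # entity replaced by the literal em dash

puts seven_bit_safe?(entity_form)   # true
puts seven_bit_safe?(literal_form)  # false, the em dash is three bytes >= 128 in UTF-8
```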

The problem is premailer-rails forces encoding and transfer-encoding to be that of the old body.

I think we should just remove lines 76 and 79 from https://github.com/fphilipe/premailer-rails/blob/master/lib/premailer/rails/hook.rb

This makes Mail::Part detect the right encoding and transfer encoding to use.

Note: I saw #166 and I think the problem back then was that we were setting the transfer-encoding but not the encoding, causing text to be mangled. If we remove both, I think it should work fine.

Also, in our case, since we don't want premailer to replace HTML entities, we pass the output_encoding: 'US-ASCII' option as described here: https://github.com/premailer/premailer/wiki/Premailer-Parameters-and-Options

This turns non-ASCII characters in the original body to HTML-encoded ones (&#xxxx;). The resulting body is now valid for 7bit transfer encoding but premailer-rails forces this to be quoted-printable (the transfer encoding of the old body).

Removing the setting of encoding and transfer-encoding would fix both cases where the transformed body needs a more complex or simpler transfer encoding.
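The idea of picking the transfer encoding from the transformed body, rather than copying it from the old one, could be sketched like this. pick_transfer_encoding is a hypothetical helper and is not the mail gem's actual detection logic; it assumes only two candidates and uses the 998-character line limit from RFC 5322.

```ruby
# Hypothetical sketch: choose a transfer encoding from the body content itself.
def pick_transfer_encoding(body)
  longest_line = body.each_line.map { |line| line.chomp.bytesize }.max || 0

  if body.ascii_only? && longest_line <= 998
    "7bit"              # simple enough to send as-is
  else
    "quoted-printable"  # needs escaping of non-ASCII bytes or long lines
  end
end

puts pick_transfer_encoding("&#169; 2019\n")  # 7bit
puts pick_transfer_encoding("\u00A9 2019\n")  # quoted-printable
```

With this approach a body that became simpler (entities only) is no longer forced to quoted-printable, and a body that became more complex is no longer forced to 7bit.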

Here's the issue on our issue tracker for reference: https://gitlab.com/gitlab-org/gitlab-ce/issues/55964

@fphilipe any thoughts on this? I'm seeing GMail show "[Message clipped]" despite my emails being well under the 102kb limit, I believe because they contain the © symbol (which is &copy; in my code, but seems to get transformed to © by premailer)

@engwan, @swrobel, sorry for not getting back earlier. I've looked into this. Thanks for the suggestion, @engwan. I've opened a PR and tested the output of different encodings in different mail clients.

Would you mind taking this for a spin? #248

Let me know if you need a beta release.

@fphilipe I removed the Premailer::Rails.config[:output_encoding] = 'US-ASCII' initializer & used the gem as follows: gem 'premailer-rails', github: 'fphilipe/premailer-rails', branch: 'fix-encoding'

The message clipping is unfortunately back with this branch. It seems this fix doesn't work 😞

Would you be able to share the resulting email as an eml file?

Even better would be if you could also share an eml file generated with the latest release for comparison.

I emailed you eml files. The key difference seems to be:

clipped: =A9 2019
not clipped: &copy; 2019

Also, clipped emails now have =0D before every newline character

Thanks, @swrobel!

Just to clarify:

  • clipped: no output_encoding specified and using the fix-encoding branch
  • not clipped: output_encoding set to US-ASCII and using latest available version of the gem from RubyGems

First, both emails render correctly for me in Mail.app and in Thunderbird. Switching to plain text in Thunderbird also renders everything correctly.

The difference in the way © is encoded comes from the removal of output_encoding. When output_encoding is set to US-ASCII, premailer (or nokogiri, not sure) will convert non-ASCII characters to the equivalent HTML entity. If you were to keep output_encoding: 'US-ASCII' you'd still get &copy;.

On the other hand, =A9 is the encoding for the Unicode character © (this is the case when Content-Transfer-Encoding is set to quoted-printable; here's a good explanation of the different transfer encodings). I'm unsure, though, why it is being encoded as =A9 and not =C2=A9, as is the case in the GitHub notification I received for your comment that included that character:

Content-Type: text/plain;
 charset=UTF-8
Content-Transfer-Encoding: quoted-printable

@fphilipe any thoughts on this? I'm seeing GMail show "[Message clipped]"=
 despite my emails being well under the 102kb limit, I believe because th=
ey contain the =C2=A9 symbol (which is `&copy;` in my code, but seems to =
get transformed to =C2=A9 by premailer)=0D
=0D
-- =0D
You are receiving this because you were mentioned.=0D
Reply to this email directly or view it on GitHub:=0D
https://github.com/fphilipe/premailer-rails/issues/240#issuecomment-53142=
4966=
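The =A9 versus =C2=A9 difference can be reproduced with Ruby's built-in quoted-printable packing. This is a small sketch; the helper name quoted_printable is made up for this example. It shows that the same character yields different quoted-printable output depending on the charset of the string being encoded.

```ruby
# Hypothetical helper around Array#pack("M"), Ruby's quoted-printable encoder.
def quoted_printable(text)
  # pack("M") appends a soft line break ("=\n") when the input does not end
  # with a newline; strip it so only the encoded bytes remain.
  [text].pack("M").sub(/=\n\z/, "")
end

copyright = "\u00A9"

puts quoted_printable(copyright)                       # =C2=A9 (two bytes in UTF-8)
puts quoted_printable(copyright.encode("ISO-8859-1"))  # =A9    (one byte in ISO-8859-1)
```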

Regarding =0D, that's simply the encoding of a carriage return. Look at the GitHub notification email above and you'll spot those as well. These get encoded because the non-encoded carriage return has a special meaning in the encoded email. The above gets rendered as follows:

@fphilipe any thoughts on this? I'm seeing GMail show "[Message clipped]" despite my emails being well under the 102kb limit, I believe because they contain the © symbol (which is `&copy;` in my code, but seems to get transformed to © by premailer)

-- 
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
https://github.com/fphilipe/premailer-rails/issues/240#issuecomment-531424966
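For anyone who wants to check the decoding themselves, here is a short sketch using Ruby's built-in quoted-printable decoder on a fragment of the notification above. The variable names are arbitrary. It shows that =0D becomes a carriage return and that a trailing = is a soft line break that joins two lines.

```ruby
raw = "view it on GitHub:=0D\n" \
      "https://github.com/fphilipe/premailer-rails/issues/240#issuecomment-53142=\n" \
      "4966=\n"

# unpack1("M") decodes quoted-printable; the result is a binary string,
# so tag it as UTF-8 before displaying it.
decoded = raw.unpack1("M").force_encoding("UTF-8")

p decoded
# "=0D\n" decodes to "\r\n", and "=\n" (soft line break) disappears entirely.
```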

Questions:

  • If you leave out the © character, does that make any difference in Gmail's clipping?
  • Both emails you shared contain a blob at the end, which I couldn't read or identify. Can you explain what it is? It also adds to your emails' size.
  • Would you be able to share the email before premailer is run by disabling premailer? That would help clarify some of the charsets, which look off to me. I'm curious why the email body is encoded as ISO-8859-1 rather than UTF-8.

@fphilipe apologies for the delay in responding but this took a lot of troubleshooting. The key thing that you noticed was:

I'm curious why the email body is encoded as ISO-8859-1 rather than UTF-8

I looked at the original message being sent by Rails and it indeed should've been UTF-8 so either GMail or Sendgrid was changing it. After changing from Sendgrid to Mailgun, the emails stopped being clipped and GMail's raw view showed the correct UTF-8 encoding (this is just using the currently released gem & no US-ASCII override). Bizarre!!!

Hopefully @engwan can weigh in on whether your branch fixes his issue.

Is there any update on this?

Hey guys, we recently activated SMIME signing for emails within GitLab, and noticed that as soon as any Unicode character is included in the email, the calculated signature fails. This pointed to encoding/conversion issues between signature calculation and delivery, and since premailer-rails is also active there, I suspected it could be the cause.

But I did some local tests and disabling premailer didn't show any effect. Funny thing, just downgrading the mail gem to 2.6.6 solved the problems. This open issue might be the cause: mikel/mail#1190

We'll be opening an issue in GitLab later but wanted to give you a heads up in case this could be related.

premailer-rails 1.11.0 is out, which should address this. Thanks everyone!