jhillyerd / enmime

MIME mail encoding and decoding package for Go

Long filenames are not extracted correctly.

rudyeeee opened this issue

What I did:

If a filename is too long, it is split across several continuation parameters (filename*0*, filename*1*, ...).
But when I read (*enmime.Part).FileName, it contains only the last segment.

example)

To: diix109847@daum.net
From: =?UTF-8?B?7KCA64SQ?= <journal@comtrue.com>
Subject: 11111111111
Message-ID: <dda70b89-f9bd-575c-2db0-0addfa422f53@comtrue.com>
Date: Wed, 12 Dec 2018 17:05:46 +0900
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:60.0) Gecko/20100101
 Thunderbird/60.3.2
MIME-Version: 1.0
Content-Type: multipart/mixed;
 boundary="------------BA281A40060E21A764F8C2BD"
Content-Language: ko

This is a multi-part message in MIME format.
--------------BA281A40060E21A764F8C2BD
Content-Type: text/plain; charset=euc-kr; format=flowed
Content-Transfer-Encoding: 7bit

2222222222222222222


--------------BA281A40060E21A764F8C2BD
Content-Type: application/x-zip-compressed;
 name="=?UTF-8?B?6rCc7J247KCV67O0X+qwgTMw6rCcLnppcC56aXA=?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename*0*=euc-kr''%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%2E%7A%69;
 filename*1*=%70%2E%7A%69%70

UEsDBBQAAAAIAGl1h02mqssRCgAAAA4AAAAMAAAAMjIyMjIyMjIudHh0MzQEAV0jMGUIAFBL
AQIUABQAAAAIAGl1h02mqssRCgAAAA4AAAAMACQAAAAAAAAAIAAAAAAAAAAyMjIyMjIyMi50
eHQKACAAAAAAAAEAGABpkKrA743UAcTHx6AekNQBxMfHoB6Q1AFQSwUGAAAAAAEAAQBeAAAA
NAAAAAAA
--------------BA281A40060E21A764F8C2BD--

In the above example, the filename should be the concatenation of filename*0* and filename*1*, but only filename*1* is read.
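To make the expected behavior concrete, here is a minimal sketch of the reassembly step, using only the standard library. joinContinuations is a hypothetical helper, not part of enmime or the mime package. It assumes the parameters are already split into a map, handles only the percent-encoded name*N* form, and returns the raw bytes still in the declared charset (converting euc-kr to UTF-8 would need a separate charset decoder).

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// joinContinuations is a hypothetical helper for illustration only.
// It concatenates name*0*, name*1*, ... in index order, strips the
// charset'language' prefix from segment 0, and percent-decodes the result.
// The returned bytes are still encoded in the returned charset.
func joinContinuations(params map[string]string, name string) (string, []byte, error) {
	var charset string
	var sb strings.Builder
	for n := 0; ; n++ {
		seg, ok := params[fmt.Sprintf("%s*%d*", name, n)]
		if !ok {
			break // no more segments
		}
		if n == 0 {
			// Segment 0 has the form charset'language'data.
			parts := strings.SplitN(seg, "'", 3)
			if len(parts) != 3 {
				return "", nil, fmt.Errorf("%s*0* lacks a charset'language' prefix", name)
			}
			charset, seg = strings.ToLower(parts[0]), parts[2]
		}
		sb.WriteString(seg)
	}
	if sb.Len() == 0 {
		return "", nil, fmt.Errorf("no %s*N* segments found", name)
	}
	// Decode after joining, so every segment contributes to the result.
	decoded, err := url.PathUnescape(sb.String())
	if err != nil {
		return "", nil, err
	}
	return charset, []byte(decoded), nil
}

func main() {
	params := map[string]string{
		"filename*0*": "euc-kr''%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%2E%7A%69",
		"filename*1*": "%70%2E%7A%69%70",
	}
	charset, raw, err := joinContinuations(params, "filename")
	if err != nil {
		panic(err)
	}
	fmt.Printf("charset=%s raw=% x\n", charset, raw)
}
```

The raw bytes from both segments would then be passed to a charset decoder to obtain the final filename.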

What I expected:
The concatenation of filename*0* and filename*1*

What I got:
Only the value of filename*1*

Release or branch I am using:
both master and release v0.4.0

A longer filename produces more segments, for example:

Content-Disposition: attachment;
 filename*0*=euc-kr''%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0;
 filename*1*=%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8;
 filename*2*=%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30;
 filename*3*=%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0;
 filename*4*=%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8;
 filename*5*=%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30;
 filename*6*=%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0;
 filename*7*=%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8;
 filename*8*=%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30;
 filename*9*=%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0;
 filename*10*=%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA;
 filename*11*=%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2;
 filename*12*=%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3;
 filename*13*=%2E%7A%69%70

Thanks, I don't remember seeing that in the RFCs; I will have to look into it more.

commented

This is currently implemented in the mime package: https://golang.org/src/mime/mediatype.go?s=4428:4473#L167
However, only 'us-ascii' and 'utf-8' are supported at this time: https://golang.org/src/mime/mediatype.go?s=5620:5663#L223

As is, the later segments are percent-decoded, but if the charset is not us-ascii or utf-8, the filename data on the same line as the charset declaration (filename*0*) is omitted.
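For anyone who wants to check this locally, a small sketch along these lines compares a utf-8 header with the euc-kr one from the report. parseFilename is a hypothetical wrapper around mime.ParseMediaType, and the utf-8 header is made up for the comparison. The euc-kr result is printed rather than asserted, since it depends on the Go version in use.

```go
package main

import (
	"fmt"
	"mime"
)

// parseFilename is a hypothetical wrapper that returns the "filename"
// parameter exactly as mime.ParseMediaType reports it.
func parseFilename(header string) (string, error) {
	_, params, err := mime.ParseMediaType(header)
	if err != nil {
		return "", err
	}
	return params["filename"], nil
}

func main() {
	headers := []string{
		// utf-8 is a supported charset, so both segments are stitched together.
		"attachment; filename*0*=utf-8''%61%62; filename*1*=%63%2E%7A%69%70",
		// euc-kr is the charset from the report above.
		"attachment; filename*0*=euc-kr''%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%2E%7A%69; filename*1*=%70%2E%7A%69%70",
	}
	for _, h := range headers {
		name, err := parseFilename(h)
		fmt.Printf("filename=%q err=%v\n", name, err)
	}
}
```

With the utf-8 header the two segments come back joined as one name, which is the behavior the report expects for euc-kr as well.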

commented

@jhillyerd If we want to implement this fix, we have two options: open a PR against the golang mime package, which would bring "enmime/internal/coding/charsets.go" along for the ride, or copy out ParseMediaType so that we can amend it.

Reasons against an upstream PR, which favor copying and amending:

  • code quality: I don't have high confidence that our code is up to the scrutiny of the Go developers. I also doubt it would be acceptable to link the mime pkg to the transform pkg in such a way.
  • level of effort to contribute to golang/go: Sometimes these changes just sit there for a year or more

Positive reasons to support copying and amending:

  • Gain the ability to consolidate our various ParseMediaType fixes
  • Is an immediate solution
  • Provides a staging ground for getting our copy of ParseMediaType up to the quality standards needed for a future contribution to golang/go

commented

Here is a playground exposing the result: https://play.golang.org/p/zOX_CQAV23l

Just tested on the playground: this is still broken in the current version of Go (only us-ascii and utf-8 are supported).

I think copying ParseMediaType into our codebase is the better option. I'm not sure how frequently this comes up in the wild. If others run into this, please comment.