Long filenames are not extracted correctly.
rudyeeee opened this issue
What I did:
If a filename is too long, it is split across several numbered parameters (filename*0*, filename*1*, ...).
But when I read (*enmime.Part).FileName, it contains only the last segment.
Example:
To: diix109847@daum.net
From: =?UTF-8?B?7KCA64SQ?= <journal@comtrue.com>
Subject: 11111111111
Message-ID: <dda70b89-f9bd-575c-2db0-0addfa422f53@comtrue.com>
Date: Wed, 12 Dec 2018 17:05:46 +0900
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:60.0) Gecko/20100101
 Thunderbird/60.3.2
MIME-Version: 1.0
Content-Type: multipart/mixed;
 boundary="------------BA281A40060E21A764F8C2BD"
Content-Language: ko

This is a multi-part message in MIME format.
--------------BA281A40060E21A764F8C2BD
Content-Type: text/plain; charset=euc-kr; format=flowed
Content-Transfer-Encoding: 7bit

2222222222222222222
--------------BA281A40060E21A764F8C2BD
Content-Type: application/x-zip-compressed;
 name="=?UTF-8?B?6rCc7J247KCV67O0X+qwgTMw6rCcLnppcC56aXA=?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename*0*=euc-kr''%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%2E%7A%69;
 filename*1*=%70%2E%7A%69%70

UEsDBBQAAAAIAGl1h02mqssRCgAAAA4AAAAMAAAAMjIyMjIyMjIudHh0MzQEAV0jMGUIAFBL
AQIUABQAAAAIAGl1h02mqssRCgAAAA4AAAAMACQAAAAAAAAAIAAAAAAAAAAyMjIyMjIyMi50
eHQKACAAAAAAAAEAGABpkKrA743UAcTHx6AekNQBxMfHoB6Q1AFQSwUGAAAAAAEAAQBeAAAA
NAAAAAAA
--------------BA281A40060E21A764F8C2BD--
In the above example, the filename should be the concatenation of filename*0* and filename*1*, but only filename*1* is read.
What I expected:
The concatenation of filename*0* and filename*1*
What I got:
Only the value of filename*1*
Release or branch I am using:
Both master and release v0.4.0
If the filename is longer, the header can look like this:
Content-Disposition: attachment;
 filename*0*=euc-kr''%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0;
 filename*1*=%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8;
 filename*2*=%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30;
 filename*3*=%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0;
 filename*4*=%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8;
 filename*5*=%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30;
 filename*6*=%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0;
 filename*7*=%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8;
 filename*8*=%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30;
 filename*9*=%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0;
 filename*10*=%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA;
 filename*11*=%B8%5F%B0%A2%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2;
 filename*12*=%33%30%B0%B3%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3;
 filename*13*=%2E%7A%69%70
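For anyone following along, here is a rough sketch of how the numbered segments could be stitched back together. It is not enmime's code: `joinContinuations` is a hypothetical helper, and it assumes the parameters have already been split into a map keyed by the raw parameter name (for example `filename*0*`). Segments are ordered numerically, since a plain string sort would put `filename*10*` before `filename*2*`. The sketch stops at the raw bytes and the declared charset; converting those bytes to UTF-8 is left out.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
	"strconv"
	"strings"
)

// joinContinuations is a hypothetical helper (not part of enmime or the Go
// standard library). It collects the name*N and name*N* segments from params,
// orders them by N, percent-decodes the encoded ones, and returns the charset
// declared on segment 0 together with the concatenated raw bytes.
func joinContinuations(params map[string]string, name string) (charset string, raw []byte, err error) {
	type segment struct {
		index   int
		value   string
		encoded bool // true when the key ends in "*", i.e. name*N*
	}
	var segs []segment
	prefix := name + "*"
	for key, value := range params {
		if !strings.HasPrefix(key, prefix) {
			continue
		}
		rest := strings.TrimPrefix(key, prefix) // "0*", "13*" or "2"
		encoded := strings.HasSuffix(rest, "*")
		n, convErr := strconv.Atoi(strings.TrimSuffix(rest, "*"))
		if convErr != nil {
			continue // not a numbered segment, e.g. a single "name*"
		}
		segs = append(segs, segment{n, value, encoded})
	}
	if len(segs) == 0 {
		return "", nil, fmt.Errorf("no %s*N segments found", name)
	}
	// Numeric order: segment 10 must come after segment 9, not after 1.
	sort.Slice(segs, func(i, j int) bool { return segs[i].index < segs[j].index })
	for i, s := range segs {
		if s.index != i {
			return "", nil, fmt.Errorf("segment %d of %s is missing or duplicated", i, name)
		}
		v := s.value
		if i == 0 && s.encoded {
			// The first encoded segment has the form charset'language'value.
			parts := strings.SplitN(v, "'", 3)
			if len(parts) != 3 {
				return "", nil, fmt.Errorf("segment 0 of %s lacks a charset prefix", name)
			}
			charset = strings.ToLower(parts[0])
			v = parts[2]
		}
		if s.encoded {
			if v, err = url.PathUnescape(v); err != nil {
				return "", nil, err
			}
		}
		raw = append(raw, v...)
	}
	return charset, raw, nil
}

func main() {
	// The two segments from the first example in this issue.
	params := map[string]string{
		"filename*0*": "euc-kr''%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%2E%7A%69",
		"filename*1*": "%70%2E%7A%69%70",
	}
	charset, raw, err := joinContinuations(params, "filename")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("charset=%s bytes=% x\n", charset, raw)
}
```

The returned bytes are still in the declared charset (euc-kr here). Turning them into a UTF-8 string would need a charset decoder, for instance something along the lines of golang.org/x/text/encoding/korean, which this sketch deliberately leaves out to stay within the standard library.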
Thanks, I don't remember seeing that in the RFCs; I will have to look into it more.
This is currently implemented in the mime package: https://golang.org/src/mime/mediatype.go?s=4428:4473#L167
However, only 'us-ascii' and 'utf-8' are supported at this time: https://golang.org/src/mime/mediatype.go?s=5620:5663#L223
As is, you will get a "hex percent-encoded" result, and if the charset is not us-ascii or utf-8, the filename data on the same line as the charset declaration is omitted.
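A small program along these lines can be used to compare the two cases. `filenameFromDisposition` is a hypothetical wrapper introduced only for this illustration; the only library call it relies on is `mime.ParseMediaType`. The first header uses utf-8 segments, which the standard library stitches together. The second is the euc-kr header from this issue; its output is not stated here because it depends on what the Go version in use does with unsupported charsets.

```go
package main

import (
	"fmt"
	"mime"
)

// filenameFromDisposition is a hypothetical wrapper: it parses a
// Content-Disposition value with the standard library and returns whatever
// ends up in the "filename" parameter.
func filenameFromDisposition(header string) (string, error) {
	_, params, err := mime.ParseMediaType(header)
	if err != nil {
		return "", err
	}
	return params["filename"], nil
}

func main() {
	// utf-8 segments: supported by the mime package.
	name, err := filenameFromDisposition(
		"attachment; filename*0*=utf-8''%EA%B0%9C%EC%9D%B8; filename*1*=%2Ezip")
	fmt.Printf("utf-8:  %q err=%v\n", name, err)

	// euc-kr segments from this issue: compare the result with the header.
	name, err = filenameFromDisposition(
		"attachment; " +
			"filename*0*=euc-kr''%B0%B3%C0%CE%C1%A4%BA%B8%5F%B0%A2%33%30%B0%B3%2E%7A%69; " +
			"filename*1*=%70%2E%7A%69%70")
	fmt.Printf("euc-kr: %q err=%v\n", name, err)
}
```

Printing with `%q` makes any undecoded or non-UTF-8 bytes visible as escapes, which is handy when checking which segments actually made it into the result.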
@jhillyerd If we want to implement this fix, we have two options: submit a PR to the Go mime package, which would bring "enmime/internal/coding/charsets.go" along for the ride, or copy out ParseMediaType so that we can amend it.
Drawbacks of the upstream PR, which favor copying and amending:
- Code quality: I am not confident that our code would stand up to the scrutiny of the Go developers. I also doubt it would be acceptable to link the mime pkg to the transform pkg in such a way.
- Level of effort to contribute to golang/go: these submissions sometimes sit for a year or more.
Benefits of copying and amending:
- We gain the ability to consolidate our various ParseMediaType fixes.
- It is an immediate solution.
- It provides a staging ground for getting our copy of ParseMediaType up to the quality standards needed to contribute it to golang/go in the future.
Here is a playground exposing the result: https://play.golang.org/p/zOX_CQAV23l
Just tested on the playground: this is still broken in the current version of Go (only us-ascii and utf-8 are supported).
I think copying ParseMediaType into our codebase is the better option. I'm not sure how frequently this comes up in the wild. If others run into this, please comment.