jhillyerd / enmime

MIME mail encoding and decoding package for Go

Content-Type: message/* not parsed

daogan opened this issue · comments

What I did:

Parsed a sample email taken from RFC 1521, Appendix C, with enmime and compared the output against the Python email parser and the Gmail API.

What I expected:

The results should be broadly similar, though not necessarily identical.

What I got:

The nested part (Content-Type: message/*) is not parsed; it is treated as a single part.

Release or branch I am using:

Master

(Please attach a sample message if you feel it will help reproduce the issue)

Sample MIME I'm using:

MIME-Version: 1.0
From: Nathaniel Borenstein <nsb@bellcore.com>
To: Ned Freed <ned@innosoft.com>
Subject: A multipart example
Content-Type: multipart/mixed;
     boundary=unique-boundary-1

This is the preamble area of a multipart message.
--unique-boundary-1

   ...Some text appears here...

--unique-boundary-1
Content-Type: message/rfc822

From: (mailbox in US-ASCII)
To: (address in US-ASCII)
Subject: (subject in US-ASCII)
Content-Type: Text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: Quoted-printable

   ... Additional text in ISO-8859-1 goes here ...

--unique-boundary-1--

Results (the message/rfc822 part) parsed by Gmail APIs:

{
    "partId": "4",
    "mimeType": "message/rfc822",
    "filename": "",
    "headers": [
     {
      "name": "Content-Type",
      "value": "message/rfc822"
     }
    ],
    "body": {
     "size": 0
    },
    "parts": [
     {
      "partId": "4.0",
      "mimeType": "text/plain",
      "filename": "",
      "headers": [
       {
        "name": "From",
        "value": "(mailbox in US-ASCII)"
       },
       {
        "name": "To",
        "value": "(address in US-ASCII)"
       },
       {
        "name": "Subject",
        "value": "(subject in US-ASCII)"
       },
       {
        "name": "Content-Type",
        "value": "Text/plain; charset=ISO-8859-1"
       },
       {
        "name": "Content-Transfer-Encoding",
        "value": "Quoted-printable"
       }
      ],
      "body": {
       "size": 52,
       "data": "ICAgLi4uIEFkZGl0aW9uYWwgdGV4dCBpbiBJU08tODg1OS0xIGdvZXMgaGVyZSAuLi4NCg=="
      }
     }
    ]
   }

Results parsed by enmime:

{
  "part_id": "2",
  "mime_type": "message/rfc822",
  "body": {
    "data": "RnJvbTogKG1haWxib3ggaW4gVVMtQVNDSUkpClRvOiAoYWRkcmVzcyBpbiBVUy1BU0NJSSkKU3ViamVjdDogKHN1YmplY3QgaW4gVVMtQVNDSUkpCkNvbnRlbnQtVHlwZTogVGV4dC9wbGFpbjsgY2hhcnNldD1JU08tODg1OS0xCkNvbnRlbnQtVHJhbnNmZXItRW5jb2Rpbmc6IFF1b3RlZC1wcmludGFibGUKCiAgIC4uLiBBZGRpdGlvbmFsIHRleHQgaW4gSVNPLTg4NTktMSBnb2VzIGhlcmUgLi4uCg==",
    "size": 226
  },
  "headers": [
    {
      "name": "Content-Type",
      "value": "message/rfc822"
    }
  ]
}

Nested parts with Content-Type: message/* do not appear to be parsed. The Python email parser handles message/* separately, before it handles multipart types (link, link).

commented

This is expected, as enmime does not recursively parse nested message/rfc822 parts. To parse this nested part, use the part content to generate a new envelope. As a side note, the textproto library renders headers as a map[string][]string and not as []struct{name string, value string}.

Agree with Neil, this is working as intended right now. I'm going to treat this as a feature request; we could add an option to enable it after #90 is implemented.

For now I've worked around it by generating a new envelope from the part content and changing the enclosing PartIDs. It would be useful to have this enhancement included in a future version. Thanks for the help.