bbottema / simple-java-mail

Simple API, Complex Emails (Jakarta Mail smtp wrapper)

Home Page:http://www.simplejavamail.org

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

[bug] Attachment body parts should separately parse Content-Disposition and ContentID, possible resulting in an downloadable attachment that is also embedded

roy20021 opened this issue · comments

Hello,

I am trying to use simple-java-mail for parsing and showing mails, however I am finding out an issue regards parts that should be treated as CID instead of Attachments.

More precisely, the library is not handling at all the different multipart subtype (mainly: mixed, alternative and related) and relies on the content-disposition header to determine if the part should be a CID or an attachment.
As per my trials, at least a message sent via gmail, the eml might be composed of multipart/related (https://www.ietf.org/rfc/rfc2387.txt ) and the CID parts declared with disposition 'attachment'.
As per https://www.ietf.org/rfc/rfc2183.txt (2.1, 2.2 paragraph) the disposition meaning is about automatic or manual actions from the user.

Example:
Example.zip

There is an image "doclife.jpg" that should appear in the cidMap (but it is not).

I propose to handle that more loosely: roy20021@eed9bf9

I haven't tested yet the change since #490 is a blocker.
Please provide a formatter (I am using eclipse) I should use if necessary.

Hello roy, thank you for contributing. I haven't had much time to dive into this, but it's difficult to see what you changed exactly, with all the formatting changes. Would it be possible to split out the formatting from the actual change?

Somewhat related to #179

Can you apply a specific formatter to all the code and reference it into the repo? It will be extremely easier to contribute.
As an example: https://github.com/google/google-java-format

I describe below my changes:

  1. made MimeDataSource implements DataSource
  2. at MimeMessageParser changed the parseMimePartTree method in order to assume every attachment as a CID
  3. at MimeMessageParser renamed moveInvalidEmbeddedResourcesToAttachments to moveNonEmbeddedResourcesToAttachments changing its logic in order to iterate over all cids and move to attachment if not into the html content

Thank you for the suggestion, but I don't want to format my whole code base right now.

I did a little bit of research, but it seems there is no clear behaviour for this. Sure, resources should be used as embedded based on CID, but it is not clear these same resources should not also be downloadable attachments. Meaning, a resource can both be an embedded image and a downloadable file.

I never realized this, but this seems to be the reason why I sometimes get emails in gmail where many of the embedded images are also listed as downloadable files. Apparently, gmail does this commonly, but this is not universal behaviour and the RFC's don't explicitly define this use case. It's more of a by-product due to the flexible nature of the RFC's.

What are your thoughts on this?

I'm going to treat the handling of Content-Disposition separate from any embedding/CID references. This means I'm going to assume the resource should both be embedded and and attachment which in the case of Simple Java Mail means, the resources occurs in two list at the same time (embeddedImages and attachments).

I'm going to treat the handling of Content-Disposition separate from any embedding/CID references. This means I'm going to assume the resource should both be embedded and and attachment which in the case of Simple Java Mail means, the resources occurs in two list at the same time (embeddedImages and attachments).

Great, I think the same.

Are you going to take care of the changes? If you want me doing any changes let me know.

I'm already having a crack at it 👍

I'll release the fix tonight (my time)

Released in 8.6.3. If you could verify it works as you expect, that would be very helpful.

@bbottema I confirm it works as expected. Thank you!