bbottema / outlook-message-parser

A Java parser for Outlook messages (.msg files)

When from name/address are not available (unsent emails), these fields are filled with binary garbage

bbottema opened this issue · comments

Originally reported by @Faelean in #25 (comment)

Fix released in 1.7.3, thanks again @Faelean.

Hey, I just noticed that this fix does not work for names and addresses with characters outside the ASCII range. Also, I think the parsing algorithm for the sender name is not fully correct. Please have a look at the parsing code in MSGReader, which handles this case correctly and works with all encodings. It also does not parse the MAPITag 0x8008, which causes the problem in "unsent draft.msg". It is a great project. I learned a lot by reading through their code! https://github.com/Sicos1977/MSGReader/blob/c11e890908d329c8d3892ff27fbe2da0ce049f4b/MsgReaderCore/Outlook/Message.cs#L1727

You can also refer to my gist with all of the MAPI property tags I could find on the internet:
https://gist.github.com/derrohrbach/54de2e88b99fd67bd0db4a3057439a63

I think this issue should be reopened...
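
For illustration only, here is a minimal Java sketch of decoding a MAPI string property by its property type rather than assuming ASCII. It is not the code of outlook-message-parser or MSGReader. The class and method names are hypothetical; the type codes `0x001E` (8-bit string) and `0x001F` (UTF-16LE string) are the standard MAPI property types, and the fallback charset for 8-bit strings is an assumption the caller has to supply.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library mentioned in this thread.
final class MsgStringDecoding {
    static final int PT_STRING8 = 0x001E; // 8-bit string in some code page
    static final int PT_UNICODE = 0x001F; // UTF-16LE string

    // Decodes the raw bytes of a string property according to its type.
    // 'fallback' is the charset assumed for 8-bit strings (an assumption
    // of this sketch; a real parser would derive it from the message).
    static String decode(int propertyType, byte[] raw, Charset fallback) {
        switch (propertyType) {
            case PT_UNICODE:
                return stripTrailingNul(new String(raw, StandardCharsets.UTF_16LE));
            case PT_STRING8:
                return stripTrailingNul(new String(raw, fallback));
            default:
                throw new IllegalArgumentException(
                        "Not a string property type: 0x" + Integer.toHexString(propertyType));
        }
    }

    // Stored strings are often NUL-terminated; drop the terminator(s).
    private static String stripTrailingNul(String s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == '\0') {
            end--;
        }
        return s.substring(0, end);
    }
}
```

With this approach a name such as "Jürgen" survives decoding, whereas a check that only accepts ASCII would treat it like garbage.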

Hey, thanks for the reference material. I'll take a closer look when I have some time. In the meantime, I'll happily accept pull requests if you feel up to it.

This seems like a larger fix and I am short on time right now. Also, my tests are still not working correctly. I'll have to pass for now, sorry!

It must be my misunderstanding, but I don't see where 0x8008 is handled in the project you referenced.

Exactly, that's the point ;) It does NOT parse it!

oh crap, I misread your comment :D

Hmm, but man, holy cow, how complex should getting a sender name and address be? That C# way is not a piece of cake either (100 lines of code).

Maybe it does not have to be that complex. On the other hand, the msg format is not really known for its simplicity either :D

I think our implementation is close to working well enough with the current setup. Doing it any other way requires reworking the code completely.

What do you think about replacing the current check with this approach (SO)? Would have to test it for sure, but I'm not at home currently.

Hm, I thought maybe it would be enough to simply ignore 0x8008 or have a look at what that value really means... I don't like checks like the one you posted; they always feel like a workaround instead of addressing the problem directly.

Hm, I thought maybe it would be enough to simply ignore 0x8008 (..)

Surprisingly (to me) this appears to be the case. Then I have no idea why 8008 is included there.
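
As a rough sketch of what "simply ignore 0x8008" could look like, assuming the parsed properties are available as a map from property tag to decoded value. The class, method, and map layout are hypothetical and not the actual structure of outlook-message-parser. `0x0C1A` and `0x0C1F` are the standard MAPI tags for sender name and sender email address.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical filter, shown only to illustrate skipping a property tag.
final class IgnoredTagFilter {
    // Tags that should never be interpreted as sender information.
    static final Set<Integer> IGNORED_TAGS = Collections.singleton(0x8008);

    // Returns a copy of 'properties' without the ignored tags,
    // preserving the original iteration order.
    static Map<Integer, String> withoutIgnored(Map<Integer, String> properties) {
        Map<Integer, String> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> entry : properties.entrySet()) {
            if (!IGNORED_TAGS.contains(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }
}
```

Skipping the tag up front means the sender fields simply stay empty for unsent drafts, so no after-the-fact check on the decoded text is needed.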

Thanks for pushing the issue @derrohrbach, I just released 1.7.4 with both the work-around and the 0x8008 handling removed.