postalsys / postal-mime

Email parser for browser and serverless environments

Problem with base64 encoded message

Juergen-29482 opened this issue · comments

I sent an email with the following text content:
BC0JGO6L+BPNCXOZytv7eGPMtwcbs1WuY8Bbg1yG+bWNXRdOoeG9dYTy10Z+5Q0EMmLkhAIr6X/6itw7cGn+wsCuiCha+5xcDvTqu2Y+Siiq+3zRCh6Ha6MTr2MeNb2z91slRjyFgifriHOhBwuXj8Egf+sCIANY3pURmGxcy/oaeDP+jm8KrsjvGIur1zdt9Kli28eLYPpjIzlt7XMTQQjw/s/y81NCJto/xA3XTWPmblrl1zXzStvdcNJdzv/w/d8SJkOOOIkLv7TYCDLJFLz604bD2CZErnW0TwVn4n9nMh0LsVyp0/f1zIo+G3yaBfnKou2Ae5dkn7NsiCQv8N+HK8a/2kmxLreujkw0VwVvDCxgQkalEOyh3OP+VvnscTY4yTPfz41l9WPPjlzYvr92+eSxYjU5Ej0DRITXb7kslM+1T4vzF61Bmn7+Ex/+y6kSGMLea1/t//f21tsTJDdX1qL8ZKh0mpm3oZyCPoxX

If I read the text message after the Cloudflare raw content is parsed, I get the following result:
BC0JGO6L BPNCXOZytv7eGPMtwcbs1WuY8Bbg1yG bWNXRdOoeG9dYTy10Z 5Q0EMmLkhAIr6X/6itw7cGn wsCuiCha 5xcDvTqu2Y Siiq 3zRCh6Ha6MTr2MeNb2z91slRjyFgifriHOhBwuXj8Egf sCIANY3pURmGxcy/oaeDP jm8KrsjvGIur1zdt9Kli28eLYPpjIzlt7XMTQQjw/s/y81NCJto/xA3XTWPmblrl1zXzStvdcNJdzv/w/d8SJkOOOIkLv7TYCDLJFLz604bD2CZErnW0TwVn4n9nMh0LsVyp0/f1zIo G3yaBfnKou2Ae5dkn7NsiCQv8N HK8a/2kmxLreujkw0VwVvDCxgQkalEOyh3OP VvnscTY4yTPfz41l9WPPjlzYvr92 eSxYjU5Ej0DRITXb7kslM 1T4vzF61Bmn7 Ex/ y6kSGMLea1/t//f21tsTJDdX1qL8ZKh0mpm3oZyCPoxX

Every + sign is replaced by a space, which breaks the base64 encoding.
Can you check that please?
Thank you,
Hans-Jürgen

Can you provide the full email source you are parsing? The “+” sign is commonly used as a placeholder for the space character in several encoding schemes. How exactly the parser should process that input depends on the specific content node headers.
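To illustrate the point about encoding schemes: in `application/x-www-form-urlencoded` data (HTML form bodies and URL query strings) a `+` decodes to a space, while plain percent-decoding leaves it untouched. The following is a minimal sketch using the standard `URLSearchParams`, `decodeURIComponent` and `encodeURIComponent` functions. The `payload` key and the sample value are made up for illustration and do not come from PostalMime.

```javascript
// Hypothetical sample value containing a "+" character.
const sample = 'BC0JGO6L+BPNCXOZ';

// Form-urlencoded parsing treats "+" as an encoded space.
const viaForm = new URLSearchParams('payload=' + sample).get('payload');

// Plain percent-decoding leaves "+" as it is.
const viaPercent = decodeURIComponent(sample);

// Percent-encoding the value first ("+" becomes "%2B") keeps it intact
// through form-urlencoded parsing.
const roundTrip = new URLSearchParams(
  'payload=' + encodeURIComponent(sample)
).get('payload');

console.log(viaForm);    // "BC0JGO6L BPNCXOZ"
console.log(viaPercent); // "BC0JGO6L+BPNCXOZ"
console.log(roundTrip);  // "BC0JGO6L+BPNCXOZ"
```

Which of these rules applies to a given piece of text depends on how that text was encoded in the first place, which is why the headers of the content node matter.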

The + sign is used in base64-encoded binary files. Attached are two files: one is a chrome.png file and the other is the base64 encoding of that PNG file.
encoded-20240229154027.txt
chrome
If you replace the + sign with a space, you will break the base64 encoding.
You can play around with it on: https://www.base64encode.org

Additional information: To make sure that certain information arrives intact, our system sends it as base64-encoded text in an email. When our server receives the mail, it decodes the base64 text and processes it.
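The effect described here can be checked mechanically on the receiving side. The sketch below is not part of PostalMime; `isBase64` and `restorePlusSigns` are hypothetical helper names. It assumes the payload is a single line of standard base64 (alphabet `A-Z`, `a-z`, `0-9`, `+`, `/`, with optional `=` padding), so that a space can never be legitimate content. The sample strings are the first 12 characters of the sent and received text quoted in this issue.

```javascript
// Standard base64: groups of 4 characters, optionally ending in "=" padding.
const BASE64_PATTERN =
  /^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$/;

// Returns true only if the text is non-empty, well-formed standard base64.
function isBase64(text) {
  return text.length > 0 && BASE64_PATTERN.test(text);
}

// Assumption: every space in the damaged text was originally a "+".
function restorePlusSigns(text) {
  return text.replace(/ /g, '+');
}

const original = 'BC0JGO6L+BPN';
const damaged = 'BC0JGO6L BPN';

console.log(isBase64(original));                  // true
console.log(isBase64(damaged));                   // false
console.log(restorePlusSigns(damaged));           // "BC0JGO6L+BPN"
console.log(isBase64(restorePlusSigns(damaged))); // true
```

Restoring the plus signs this way is only a workaround under the stated assumption. It does not explain where the substitution happens.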

The contents do not matter. PostalMime parses emails, and I assume you used these files as attachments in an email. PostalMime returns attachments as ArrayBuffer values. To see how exactly PostalMime parses that email, I would need the raw .eml file.

No, we don't attach any file to an email. The user can copy and paste text from an application into an email and send it to us. The email has one single line:
BC0JGO6L+BPNCXOZytv7eGPMtwcbs1WuY8Bbg1yG+bWNXRdOoeG9dYTy10Z+5Q0EMmLkhAIr6X/6itw7cGn+wsCuiCha+5xcDvTqu2Y+Siiq+3zRCh6Ha6MTr2MeNb2z91slRjyFgifriHOhBwuXj8Egf+sCIANY3pURmGxcy/oaeDP+jm8KrsjvGIur1zdt9Kli28eLYPpjIzlt7XMTQQjw/s/y81NCJto/xA3XTWPmblrl1zXzStvdcNJdzv/w/d8SJkOOOIkLv7TYCDLJFLz604bD2CZErnW0TwVn4n9nMh0LsVyp0/f1zIo+G3yaBfnKou2Ae5dkn7NsiCQv8N+HK8a/2kmxLreujkw0VwVvDCxgQkalEOyh3OP+VvnscTY4yTPfz41l9WPPjlzYvr92+eSxYjU5Ej0DRITXb7kslM+1T4vzF61Bmn7+Ex/+y6kSGMLea1/t//f21tsTJDdX1qL8ZKh0mpm3oZyCPoxX

When the email comes in, we simply want to extract this single line of text.
Here is a log report from Cloudflare. Maybe this will help:
{ "outcome": "ok", "scriptName": "emailtext", "diagnosticsChannelEvents": [], "exceptions": [], "logs": [ { "message": [ "Subject: ", "Testmail" ], "level": "log", "timestamp": 1709135867324 }, { "message": [ "HTML: ", "<html><head><style type=\"text/css\"><!--#xb8a7f630edc34a5 #xa7d53e261db244449de3241525731239\n{font-size: 12pt;}\n#xb8a7f630edc34a5 #xa7d53e261db244449de3241525731239\n{font-family: \"Segoe UI\"; font-size: 12pt;}\n#xb8a7f630edc34a5 #xa7d53e261db244449de3241525731239\n{font-family: \"Segoe UI\"; font-size: 11pt;}\n#xb8a7f630edc34a5\n{font-family: \"Segoe UI\"; font-size: 11pt;}\n--></style><style id=\"css_styles\" type=\"text/css\"><!--blockquote.cite { margin-left: 5px; margin-right: 0px; padding-left: 10px; padding-right:0px; border-left: 1px solid #cccccc }\nblockquote.cite2 {margin-left: 5px; margin-right: 0px; padding-left: 10px; padding-right:0px; border-left: 1px solid #cccccc; margin-top: 3px; padding-top: 0px; }\na img { border: 0px; }\nli[style='text-align: center;'], li[style='text-align: center; '], li[style='text-align: right;'], li[style='text-align: right; '] { list-style-position: inside;}\nbody { font-family: 'Segoe UI'; font-size: 11pt; }\n.quote { margin-left: 1em; margin-right: 1em; border-left: 5px #ebebeb solid; padding-left: 0.3em; }\n--></style>\n\n\n</head>\n<body><div><span>BC0JGO6L+BPNCXOZytv7eGPMtwcbs1WuY8Bbg1yG+bWNXRdOoeG9dYTy10Z+5Q0EMmLkhAIr6X/6itw7cGn+wsCuiCha+5xcDvTqu2Y+Siiq+3zRCh6Ha6MTr2MeNb2z91slRjyFgifriHOhBwuXj8Egf+sCIANY3pURmGxcy/oaeDP+jm8KrsjvGIur1zdt9Kli28eLYPpjIzlt7XMTQQjw/s/y81NCJto/xA3XTWPmblrl1zXzStvdcNJdzv/w/d8SJkOOOIkLv7TYCDLJFLz604bD2CZErnW0TwVn4n9nMh0LsVyp0/f1zIo+G3yaBfnKou2Ae5dkn7NsiCQv8N+HK8a/2kmxLreujkw0VwVvDCxgQkalEOyh3OP+VvnscTY4yTPfz41l9WPPjlzYvr92+eSxYjU5Ej0DRITXb7kslM+1T4vzF61Bmn7+Ex/+y6kSGMLea1/t//f21tsTJDdX1qL8ZKh0mpm3oZyCPoxX</span></div></body></html>\n" ], "level": "log", "timestamp": 1709135867324 }, { "message": [ "Text: ", 
"BC0JGO6L+BPNCXOZytv7eGPMtwcbs1WuY8Bbg1yG+bWNXRdOoeG9dYTy10Z+5Q0EMmLkhAIr6X/6itw7cGn+wsCuiCha+5xcDvTqu2Y+Siiq+3zRCh6Ha6MTr2MeNb2z91slRjyFgifriHOhBwuXj8Egf+sCIANY3pURmGxcy/oaeDP+jm8KrsjvGIur1zdt9Kli28eLYPpjIzlt7XMTQQjw/s/y81NCJto/xA3XTWPmblrl1zXzStvdcNJdzv/w/d8SJkOOOIkLv7TYCDLJFLz604bD2CZErnW0TwVn4n9nMh0LsVyp0/f1zIo+G3yaBfnKou2Ae5dkn7NsiCQv8N+HK8a/2kmxLreujkw0VwVvDCxgQkalEOyh3OP+VvnscTY4yTPfz41l9WPPjlzYvr92+eSxYjU5Ej0DRITXb7kslM+1T4vzF61Bmn7+Ex/+y6kSGMLea1/t//f21tsTJDdX1qL8ZKh0mpm3oZyCPoxX\n" ], "level": "log", "timestamp": 1709135867324 } ], "eventTimestamp": 1709135867305, "event": { "rawSize": 8348, "rcptTo": "test@xx.com", "mailFrom": "hjw@test.de" }, "id": 0 }

The email content:
`Content-Type: multipart/alternative;
boundary="------=_MB849CF2B7-B4F0-44F4-ADA0-D9F926568ED5"

--------=_MB849CF2B7-B4F0-44F4-ADA0-D9F926568ED5
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: quoted-printable

BC0JGO6L+BPNCXOZytv7eGPMtwcbs1WuY8Bbg1yG+bWNXRdOoeG9dYTy10Z+5Q0EMmLkhAIr6X/=
6itw7cGn+wsCuiCha+5xcDvTqu2Y+Siiq+3zRCh6Ha6MTr2MeNb2z91slRjyFgifriHOhBwuXj8=
Egf+sCIANY3pURmGxcy/oaeDP+jm8KrsjvGIur1zdt9Kli28eLYPpjIzlt7XMTQQjw/s/y81NCJ=
to/xA3XTWPmblrl1zXzStvdcNJdzv/w/d8SJkOOOIkLv7TYCDLJFLz604bD2CZErnW0TwVn4n9n=
Mh0LsVyp0/f1zIo+G3yaBfnKou2Ae5dkn7NsiCQv8N+HK8a/2kmxLreujkw0VwVvDCxgQkalEOy=
h3OP+VvnscTY4yTPfz41l9WPPjlzYvr92+eSxYjU5Ej0DRITXb7kslM+1T4vzF61Bmn7+Ex/+y6=
kSGMLea1/t//f21tsTJDdX1qL8ZKh0mpm3oZyCPoxX
--------=_MB849CF2B7-B4F0-44F4-ADA0-D9F926568ED5
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: quoted-printable

<style type=3D"text/css"></style><style id=3D"css_styles" type=3D"text/css"></style>
BC0JGO6L+BPNCXOZytv7eGPMtwcbs1WuY8Bbg1yG+bWNXRdOoeG9dYTy10= Z+5Q0EMmLkhAIr6X/6itw7cGn+wsCuiCha+5xcDvTqu2Y+Siiq+3zRCh6Ha6MTr2MeNb2z91slR= jyFgifriHOhBwuXj8Egf+sCIANY3pURmGxcy/oaeDP+jm8KrsjvGIur1zdt9Kli28eLYPpjIzlt= 7XMTQQjw/s/y81NCJto/xA3XTWPmblrl1zXzStvdcNJdzv/w/d8SJkOOOIkLv7TYCDLJFLz604b= D2CZErnW0TwVn4n9nMh0LsVyp0/f1zIo+G3yaBfnKou2Ae5dkn7NsiCQv8N+HK8a/2kmxLreujk= w0VwVvDCxgQkalEOyh3OP+VvnscTY4yTPfz41l9WPPjlzYvr92+eSxYjU5Ej0DRITXb7kslM+1T= 4vzF61Bmn7+Ex/+y6kSGMLea1/t//f21tsTJDdX1qL8ZKh0mpm3oZyCPoxX
--------=_MB849CF2B7-B4F0-44F4-ADA0-D9F926568ED5-- `
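Both parts of this message use the quoted-printable transfer encoding. In that encoding, a `=` at the end of a line is a soft line break that joins the line with the next one, `=XX` is a hex-encoded byte (for example `=3D` is a literal `=`), and `+` has no special meaning. The sketch below shows these rules in isolation. It is not PostalMime's implementation, `decodeQuotedPrintable` is a hypothetical name, and it assumes a UTF-8 charset as declared in the headers above. The sample input is a fragment of the first two body lines.

```javascript
// Minimal quoted-printable decoder, assuming the decoded bytes are UTF-8.
function decodeQuotedPrintable(input) {
  // A "=" directly before a line ending is a soft line break: drop both.
  const joined = input.replace(/=\r?\n/g, '');

  const bytes = [];
  for (let i = 0; i < joined.length; i++) {
    const hex = joined.slice(i + 1, i + 3);
    if (joined[i] === '=' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      // "=XX" encodes one byte as two hex digits.
      bytes.push(parseInt(hex, 16));
      i += 2;
    } else {
      // Every other character, including "+", is passed through unchanged.
      bytes.push(joined.charCodeAt(i) & 0xff);
    }
  }
  return new TextDecoder('utf-8').decode(new Uint8Array(bytes));
}

const rawBody = 'dYTy10Z+5Q0EMmLkhAIr6X/=\r\n6itw7cGn+wsCuiCha';
console.log(decodeQuotedPrintable(rawBody));
// "dYTy10Z+5Q0EMmLkhAIr6X/6itw7cGn+wsCuiCha"

console.log(decodeQuotedPrintable('type=3D"text/css"'));
// 'type="text/css"'
```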

I tested parsing this email in a Cloudflare Worker, and it worked fine.

This is the worker code:

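import PostalMime from 'postal-mime';

export default {
	// Cloudflare Email Worker handler, called for each incoming message.
	async email(message, env, ctx) {
		const parser = new PostalMime();
		// message.raw is the raw message source; parse() resolves to the parsed email.
		const email = await parser.parse(message.raw);

		// Log the decoded HTML body.
		console.log(email.html);
	},
};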

And this is the parsed HTML:

"<html><head><style type=\"text/css\"><!--#x8a16e6036b634fa\n{font-family: \"Segoe UI\"; font-size: 11pt;}\n--></style><style id=\"css_styles\" type=\"text/css\"><!--blockquote.cite { margin-left: 5px; margin-right: 0px; padding-left: 10px; padding-right:0px; border-left: 1px solid #cccccc }\nblockquote.cite2 {margin-left: 5px; margin-right: 0px; padding-left: 10px; padding-right:0px; border-left: 1px solid #cccccc; margin-top: 3px; padding-top: 0px; }\na img { border: 0px; }\nli[style='text-align: center;'], li[style='text-align: center; '], li[style='text-align: right;'], li[style='text-align: right; '] {  list-style-position: inside;}\nbody { font-family: 'Segoe UI'; font-size: 11pt; }\n.quote { margin-left: 1em; margin-right: 1em; border-left: 5px #ebebeb solid; padding-left: 0.3em; }\n--></style>\n\n\n</head>\n<body><div><span>BC0JGO6L+BPNCXOZytv7eGPMtwcbs1WuY8Bbg1yG+bWNXRdOoeG9dYTy10Z+5Q0EMmLkhAIr6X/6itw7cGn+wsCuiCha+5xcDvTqu2Y+Siiq+3zRCh6Ha6MTr2MeNb2z91slRjyFgifriHOhBwuXj8Egf+sCIANY3pURmGxcy/oaeDP+jm8KrsjvGIur1zdt9Kli28eLYPpjIzlt7XMTQQjw/s/y81NCJto/xA3XTWPmblrl1zXzStvdcNJdzv/w/d8SJkOOOIkLv7TYCDLJFLz604bD2CZErnW0TwVn4n9nMh0LsVyp0/f1zIo+G3yaBfnKou2Ae5dkn7NsiCQv8N+HK8a/2kmxLreujkw0VwVvDCxgQkalEOyh3OP+VvnscTY4yTPfz41l9WPPjlzYvr92+eSxYjU5Ej0DRITXb7kslM+1T4vzF61Bmn7+Ex/+y6kSGMLea1/t//f21tsTJDdX1qL8ZKh0mpm3oZyCPoxX</span></div></body></html>\n"

All the plus signs are still in place, so I think the issue is somewhere else.
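To narrow down where the substitution happens, the text can be compared with the expected string at each stage of processing (directly after parsing, after any forwarding or storage step, and so on). The helper below is a hypothetical sketch for that comparison and is not part of PostalMime. It trims both inputs first, because the parsed body ends with a trailing newline. The sample strings are shortened versions of the text from this issue.

```javascript
// Lists every position where the received text differs from what was sent.
function findAlteredCharacters(expected, actual) {
  const a = expected.trim();
  const b = actual.trim();
  const differences = [];
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    if (a[i] !== b[i]) {
      differences.push({ index: i, expected: a[i], actual: b[i] });
    }
  }
  return differences;
}

const sent = 'BC0JGO6L+BPNCXOZ';
const parsedIntact = 'BC0JGO6L+BPNCXOZ\n';
const parsedDamaged = 'BC0JGO6L BPNCXOZ\n';

console.log(findAlteredCharacters(sent, parsedIntact));
// []
console.log(findAlteredCharacters(sent, parsedDamaged));
// [ { index: 8, expected: '+', actual: ' ' } ]
```

An empty result directly after parsing, followed by differences at a later stage, would point to that later stage rather than to the parser.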

We need the plain text, not the HTML code, so I call email.text instead of email.html. Have you checked that too?

Yes, it looks correct. You can also run these checks yourself, btw.

"BC0JGO6L+BPNCXOZytv7eGPMtwcbs1WuY8Bbg1yG+bWNXRdOoeG9dYTy10Z+5Q0EMmLkhAIr6X/6itw7cGn+wsCuiCha+5xcDvTqu2Y+Siiq+3zRCh6Ha6MTr2MeNb2z91slRjyFgifriHOhBwuXj8Egf+sCIANY3pURmGxcy/oaeDP+jm8KrsjvGIur1zdt9Kli28eLYPpjIzlt7XMTQQjw/s/y81NCJto/xA3XTWPmblrl1zXzStvdcNJdzv/w/d8SJkOOOIkLv7TYCDLJFLz604bD2CZErnW0TwVn4n9nMh0LsVyp0/f1zIo+G3yaBfnKou2Ae5dkn7NsiCQv8N+HK8a/2kmxLreujkw0VwVvDCxgQkalEOyh3OP+VvnscTY4yTPfz41l9WPPjlzYvr92+eSxYjU5Ej0DRITXb7kslM+1T4vzF61Bmn7+Ex/+y6kSGMLea1/t//f21tsTJDdX1qL8ZKh0mpm3oZyCPoxX\n"