cesanta / mongoose

Embedded Web Server

Home Page: https://mongoose.ws

Reading post data from https connection fails now and then

jcorporation opened this issue · comments

  • My goal is: Read post requests from a https socket issued by javascript
  • My actions were: read the data in MG_EV_HTTP_MSG with struct mg_http_message *hm = (struct mg_http_message *) ev_data; as used in your examples
  • My expectation was: mongoose reads the complete post data
  • The result I saw:
694221 3 sock.c:441:accept_conn         16 6 accepted 192.168.88.144:53682 -> 0.0.0.0:8443
694221 3 tls_openssl.c:111:mg_tls_init  16 Setting TLS
694223 3 tls_openssl.c:190:mg_tls_init  16 SSL accept OK
694259 3 tls_openssl.c:200:mg_tls_hands 16 success
694259 3 sock.c:298:read_conn           16 0x6 snd 0/0 rcv 0/2048 n=-2 err=0
694259 3 sock.c:298:read_conn           16 0x6 snd 0/0 rcv 0/2048 n=557 err=0

Mongoose does not read the post data from the socket.

The problem occurs only:

  • with latest master
  • with an iPhone as client (tested with Safari and Firefox mobile)
  • only with SSL enabled (OpenSSL in my case)

Without SSL or on my Linux Desktop it works without problems.

The issue is reproducible with your http-restful-server example. I added an index.html that posts data with the JavaScript fetch API to the /api/stats endpoint.

<html>
    <body>
        <p id="status"></p>
        <a href="javascript:post()">Post</a>
        <script>
            async function post() {
                document.getElementById('status').textContent = "Sending...";
                const body = 'test';
                const response = await fetch('/api/stats', {
                    method: 'POST',
                    mode: 'same-origin',
                    credentials: 'same-origin',
                    cache: 'no-store',
                    redirect: 'follow',
                    headers: {
                        'Content-Type': 'text/plain',
                        'Content-Length': body.length.toString()
                    },
                    body: body
                });
                if (response) {
                    document.getElementById('status').textContent = await response.text();
                }
                return false;
            }
        </script>
    </body>
</html>

Your example is compiled with: make CFLAGS_EXTRA="-DMG_TLS=MG_TLS_OPENSSL -lssl -lcrypto"

Some requests are answered as expected, but for others there are no replies from mongoose.

  • My question is: is this a bug?

Environment

  • mongoose version: latest master
  • Compiler/IDE and SDK: gcc version 13.2.0 (Ubuntu 13.2.0-4ubuntu3)
  • Target hardware/board: Linux t14 6.5.0-15-generic #15-Ubuntu SMP PREEMPT_DYNAMIC Tue Jan 9 17:03:36 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Could you please try to get a Wireshark capture of that traffic ?
See howto here: #2570 (comment)
Thanks

Attached is the tcpdump with the keylog file. Dump was created with tcpdump -vvv -s0 -nni any -w dump.dmp port 8443.

dump.zip

Log:

11bf075 1 sock.c:227:mg_open_listener   bind: 98
11bf075 1 net.c:190:mg_listen           Failed: http://0.0.0.0:8000, errno 98
11bf075 3 net.c:202:mg_listen           2 4 https://0.0.0.0:8443
11c5cfe 3 sock.c:441:accept_conn        3 5 accepted 192.168.88.144:53747 -> 0.0.0.0:8443
11c5cff 3 tls_openssl.c:119:mg_tls_init 3 Setting TLS
11c5d01 3 tls_openssl.c:202:mg_tls_init 3 SSL accept OK
11c5d13 3 tls_openssl.c:212:mg_tls_hand 3 success
11c5d13 3 sock.c:298:read_conn          3 0x5 snd 0/0 rcv 0/2048 n=-2 err=0
11c5d13 3 sock.c:298:read_conn          3 0x5 snd 0/0 rcv 0/2048 n=436 err=0
11c5d13 3 sock.c:309:write_conn         3 5 snd 1074/2048 rcv 0/2048 n=1074 err=0
11c646c 3 sock.c:298:read_conn          3 0x5 snd 0/2048 rcv 0/2048 n=557 err=0
11c7e98 3 sock.c:441:accept_conn        4 6 accepted 192.168.88.144:53748 -> 0.0.0.0:8443
11c7e98 3 tls_openssl.c:119:mg_tls_init 4 Setting TLS
11c7e9a 3 tls_openssl.c:202:mg_tls_init 4 SSL accept OK
11c7eb7 3 tls_openssl.c:212:mg_tls_hand 4 success
11c7eb7 3 sock.c:298:read_conn          4 0x6 snd 0/0 rcv 0/2048 n=-2 err=0
11c7eb7 3 sock.c:298:read_conn          4 0x6 snd 0/0 rcv 0/2048 n=557 err=0
11c7eb9 3 sock.c:298:read_conn          4 0x6 snd 0/0 rcv 557/2048 n=4 err=0
11c7eb9 3 sock.c:309:write_conn         4 6 snd 268/2048 rcv 0/2048 n=268 err=0
11c8677 3 sock.c:298:read_conn          4 0x6 snd 0/2048 rcv 0/2048 n=557 err=0

@cpq Apparently the client sends headers first, then the POST data (4 bytes: test). The server receives two frames.
I forced that behavior by copying the OP request data and adding a pause (attached, run ./request | openssl s_client -ign_eof -connect localhost:8443)
request.zip
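
For readers without the attachment, the "headers, pause, data" pattern described above can be sketched roughly as follows. This is a hypothetical helper, not the contents of request.zip: the function name `write_split_request`, the `Host` value, and the pause length are assumptions, and the headers are modelled on the fetch() call in the original report.

```c
#define _POSIX_C_SOURCE 200809L  // For nanosleep() under strict -std modes
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

// Hypothetical helper (not the code in request.zip): writes the HTTP headers,
// flushes them, sleeps for pause_ms, then writes the body separately.
// Returns the total number of bytes written, or -1 on error.
static long write_split_request(FILE *out, const char *body, long pause_ms) {
  assert(out != NULL && body != NULL && pause_ms >= 0);
  long n = fprintf(out,
                   "POST /api/stats HTTP/1.1\r\n"
                   "Host: localhost\r\n"
                   "Content-Type: text/plain\r\n"
                   "Content-Length: %zu\r\n"
                   "\r\n",
                   strlen(body));
  if (n < 0 || fflush(out) != 0) return -1;  // Push the headers out on their own

  struct timespec ts = {pause_ms / 1000, (pause_ms % 1000) * 1000000L};
  nanosleep(&ts, NULL);  // Gap between the headers and the body

  if (fputs(body, out) == EOF || fflush(out) != 0) return -1;
  return n + (long) strlen(body);
}
```

A `main` that calls `write_split_request(stdout, "test", 100)` could then be piped into `openssl s_client -ign_eof -connect localhost:8443` as in the comment above; the 100 ms pause is an arbitrary choice.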

Unfortunately, I can't reproduce it here; maybe it is my OpenSSL version (1.1.1k)

Can't reproduce it either, ran dozens of requests, all successful
My environment is macOS, OpenSSL 3.1

Tried with built-in TLS, also works fine.

@jcorporation , could you make a test with a built-in TLS please? Here's how to do it:

  1. Apply a patch (attached) to change certificates
  2. Build with make clean all CFLAGS_EXTRA="-DMG_TLS=MG_TLS_BUILTIN"

certs.diff.txt

Can't reproduce it either, ran dozens of requests, all successful

Issue occurs only with Safari on iOS as client.

I tested with OpenSSL versions: 3.0.10 and 3.1.4

could you make a test with a built-in TLS please?

The failure is gone with built-in TLS.

Issue occurs only with Safari on iOS as client.

Yes, but that leaves two possibilities: the way the request is sent and its contents, or the OpenSSL version
(setting aside differences on the server side).
I took the request you sent from your iOS client and tried my best to send it in a similar way (headers, pause, data).
So we seem to be left with OpenSSL.
Maybe the way the data is encrypted and sent triggers something on our side... perhaps you could add some hexdumps in Mongoose? @cpq WDYT, is it worth trying to hexdump before and after calling OpenSSL and check why HTTP_MSG is not triggered?

I tested with OpenSSL versions: 3.0.10 and 3.1.4

I'd point to something in the iOS version as most likely... based on what we've tried to reproduce and failed.

The failure is gone with built-in TLS.

That is so cool ...

Activating hexdump is a good idea

@jcorporation, do you mind doing that please? Essentially, if (ev == MG_EV_OPEN) { c->is_hexdumping = 1; }. Then, please share server logs for both good and bad cases. I think we'll hunt down what's going on.

no problem, attached are the logs with hexdump. the first two posts failed, the third one succeeded.

hexdump.txt

Mongoose never sees the content...
In 1st and 2nd the test data is missing.
I remember seeing it at the network level... "we need to go deeper" with our dumps.
Check out this: from the log above that belongs to the dump, where I saw that data:

11c646c 3 sock.c:298:read_conn          3 0x5 snd 0/2048 rcv 0/2048 n=557 err=0
11c7e98 3 sock.c:441:accept_conn        4 6 accepted 192.168.88.144:53748 -> 0.0.0.0:8443
11c7e98 3 tls_openssl.c:119:mg_tls_init 4 Setting TLS
11c7e9a 3 tls_openssl.c:202:mg_tls_init 4 SSL accept OK
11c7eb7 3 tls_openssl.c:212:mg_tls_hand 4 success
11c7eb7 3 sock.c:298:read_conn          4 0x6 snd 0/0 rcv 0/2048 n=-2 err=0
11c7eb7 3 sock.c:298:read_conn          4 0x6 snd 0/0 rcv 0/2048 n=557 err=0
11c7eb9 3 sock.c:298:read_conn          4 0x6 snd 0/0 rcv 557/2048 n=4 err=0

Connection 3: 557 bytes are read, but the 4-byte read never happens.
Connection 4 succeeds: 557 bytes, then 4.

in the hexdump:

5a6fc3 3 sock.c:298:read_conn           5 0x6 snd 0/2048 rcv 0/2048 n=559 err=0
5a8cc5 3 sock.c:441:accept_conn         6 7 accepted 192.168.88.144:59076 -> 0.0.0.0:8443
5a8d90 3 sock.c:298:read_conn           6 0x7 snd 0/0 rcv 0/2048 n=559 err=0
5aace0 3 sock.c:441:accept_conn         7 8 accepted 192.168.88.144:59077 -> 0.0.0.0:8443
5aad46 3 sock.c:298:read_conn           7 0x8 snd 0/0 rcv 0/2048 n=559 err=0
5aad48 3 sock.c:298:read_conn           7 0x8 snd 0/0 rcv 559/2048 n=4 err=0
5aad48 2 sock.c:105:iolog               
-- 7 0.0.0.0:8443 <- 192.168.88.144:59077 4
0000   74 65 73 74                                       test            

1st and 2nd (connections 5 and 6): 559 bytes and no 4-byte read.
The 3rd one (connection 7) succeeds: 559 bytes, then 4.

We are missing that data. I've seen it on the tcpdump capture, it is on a separate Ethernet frame right after the first one.
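
As an aside for readers unfamiliar with the iolog row in the log above (`0000   74 65 73 74 ... test`), here is a minimal sketch of how such a row can be produced. It is not Mongoose's implementation: the function name `hexdump_row` and the exact column spacing are assumptions, only loosely modelled on the log.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

// Formats up to 16 bytes of buf as one hexdump row into out.
// Layout (an assumption): 4-digit hex offset, 16 hex columns padded with
// spaces, then the printable ASCII form of the bytes ('.' for the rest).
static void hexdump_row(const unsigned char *buf, size_t len, size_t offset,
                        char *out, size_t outsz) {
  assert(buf != NULL && out != NULL && outsz >= 80 && offset <= 0xffff);
  if (len > 16) len = 16;
  int pos = snprintf(out, outsz, "%04zx  ", offset);
  for (size_t i = 0; i < 16; i++) {
    if (i < len) {
      pos += snprintf(out + pos, outsz - pos, " %02x", buf[i]);
    } else {
      pos += snprintf(out + pos, outsz - pos, "   ");  // Keep columns aligned
    }
  }
  pos += snprintf(out + pos, outsz - pos, "  ");
  for (size_t i = 0; i < len; i++) {
    out[pos++] = isprint(buf[i]) ? (char) buf[i] : '.';
  }
  out[pos] = '\0';
}
```

Calling it on the 4-byte body `test` at offset 0 gives a row that starts with `0000   74 65 73 74` and ends with `test`, which is the content missing from the failing connections.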

Side note - @scaprile , does it make sense for this example (actually, for all of them) to reuse certs from device dashboard?

Side note - @scaprile , does it make sense for this example (actually, for all of them) to reuse certs from device dashboard?

@cpq ... Long ago I thought short examples were better kept short, so I used the simplest embedded-like way to load certificates, and the TLS tutorial shows different examples with different options to do that (embedded in a string, read from packed_fs on the fly, read from a file and kept in RAM). So..., I think this is fine as it is now, we have simple ways, we have options, and examples.
I personally prefer reading from packed_fs, keeps them out of the source code space.

The thing is those certs cannot be used with our built-in TLS. Anyway, that's offtopic.

@scaprile your observation is right! I managed to reproduce it on macOS + Safari. Here is the good case. We read 536 bytes, then 4 bytes:

3731ea5e 3 sock.c:300:read_conn         139 0x5 snd 0/0 rcv 0/2048 n=-2 err=0      // Tried to read immediately after handshake, got EAGAIN, hence -2. Will try later
3731ea5e 4 sock.c:699:mg_mgr_poll       2 -- Tchrc
3731ea5e 4 sock.c:699:mg_mgr_poll       1 -- tchrc
3731ea5e 4 sock.c:699:mg_mgr_poll       139 r- Tchrc             // Note: connection 139 is readable, so we'll read 
3731ea5e 3 sock.c:300:read_conn         139 0x5 snd 0/0 rcv 0/2048 n=536 err=0      // And here we read 536 bytes
3731ea5e 4 sock.c:699:mg_mgr_poll       2 -- Tchrc
3731ea5e 4 sock.c:699:mg_mgr_poll       1 -- tchrc  
3731ea5f 4 sock.c:699:mg_mgr_poll       139 r- Tchrc            // Connection 139 is readable again
3731ea5f 3 sock.c:300:read_conn         139 0x5 snd 0/0 rcv 536/2048 n=4 err=0   // And yeah, we read next 4 bytes

After hitting "Post" many times, it eventually got stuck with this log. Here is the bad case: we read 536 bytes, but never see 4 bytes:

3731eb9d 3 tls_openssl.c:190:mg_tls_ini 141 SSL accept OK
3731eb9d 4 sock.c:699:mg_mgr_poll       1 -- tchrc
3731eb9d 4 sock.c:699:mg_mgr_poll       141 r- TcHrc
3731eb9d 4 sock.c:699:mg_mgr_poll       2 -- Tchrc
3731eb9d 4 sock.c:699:mg_mgr_poll       1 -- tchrc
3731eba2 4 sock.c:699:mg_mgr_poll       141 r- TcHrc
3731eba2 3 tls_openssl.c:200:mg_tls_han 141 success
3731eba2 3 sock.c:300:read_conn         141 0x5 snd 0/0 rcv 0/2048 n=536 err=0   // Read 536
3731eba2 4 sock.c:699:mg_mgr_poll       2 -- Tchrc
3731eba2 4 sock.c:699:mg_mgr_poll       1 -- tchrc
3731ef8b 4 sock.c:699:mg_mgr_poll       141 -- Tchrc        // And Doh!!! socket not readable?? Why ??

So in the "bad" case, the socket is not readable for some reason. Either the browser really does not send more data, or Mongoose has a bug and does not identify the socket as readable. The Wireshark log should be the judge here. If, for the "bad" case, we see a 4-byte packet coming to us, then Mongoose is guilty. If the 4-byte packet is not in the Wireshark dump, then the browser is guilty.

Here goes tcpdump.

Good case:
image

Bad case:
image

So it looks like, in the bad case, the browser just does not send us all the data for some reason. The stuff circled in red is all I see; there is nothing coming after that. I'd blame Safari.
Any other thoughts?

dump

It is in the OP dump.
Judging by the time difference between frames, looks like a race condition, it arrives while we are processing, so somehow we say we read it all but we've only read a part of it...

Yeah I viewed OP's dump, but can't make much out of it.
I need two cases, good and bad, side by side. That's what I see clearly in my dump.
So apparently, we're reading TLS data, and not getting all of it - because the browser does not send all of it.
If you have counter-evidence, please speak up

Second one is good, though my way of dumping TLS keylog seems to only work for the first connection...
In the first connection I can see clearly the data is there, and we do not answer back.
Our log shows that we did not read those 4 bytes from the second frame, in fact there is no event for that.

It could be that the different set of ciphers, or different certs, could change the codepath on safari side.

@jcorporation if you use different certs provided by patch above, but still use openssl (or mbedtls), does it trigger the issue?

if you use different certs provided by patch above, but still use openssl (or mbedtls), does it trigger the issue?

yes, I see no behavior change

@cpq watching your log in detail:
good case: 120 --> GET / ; 295 x 2 <-- response ; 614 + 82 bytes --> POST + data, 334 <-- response
bad case: 120 --> GET / ; 295 x 2 <-- response ; 614 + 82 bytes --> POST + data, no response
The only thing that is different is that there is no 334 byte response from Mongoose.
I'd bet the 4-byte test is in that 82 byte frame that is in blue.

I managed to decrypt all three requests. In the meantime I disabled tls1.3, but that does not change anything.

The first two are failing the last one succeeds.

Attached are the log, dump and keylog.txt. I hope that helps!

debug.zip

Patch to decrypt all requests was:

if (!s_initialised) {                 // <-- ADD AND CLAIM "initialised" HERE
    s_initialised++;
}
SSL_CTX_set_keylog_callback(tls->ctx, cb);

@scaprile you're correct, I missed the fact that the "334 Application data" frame is not something the browser sends, but what Mongoose should send.
@jcorporation thanks for the log. Now we need to understand what's in that "334 application data"

1st and 2nd ones fail: header and data are sent in different frames, 1 and 2 µs apart.
3rd one succeeds: header and data are sent in different frames, 1.4 ms apart.
I'd say this smells like a race condition, but where...

@scaprile yep, you're correct. There is a race. When the gap between packets is very small, both frames land in the socket buffer almost at once, and the application gets the raw encrypted data for both in a single read. In the slower case, that takes two reads. Our code uses the mg_tls_recv() function, which decrypts raw data, but it does so only once per raw read. If one raw read contains two encrypted chunks, we never decrypt the second one, because we're waiting for more encrypted data to arrive.
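
The race described above can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not Mongoose or OpenSSL code, and it is not the content of the tls3/tls4 patches (which are not shown in this thread). Records here are plain length-prefixed chunks standing in for encrypted TLS records, and the names `raw_buf`, `append_record`, `decode_one` and `decode_all` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Toy stand-in for buffered TLS records: each record is a 2-byte big-endian
// length followed by the payload. Real records are encrypted; this models
// only the framing.
struct raw_buf {
  unsigned char data[256];
  size_t len;  // Bytes of undecoded record data currently buffered
};

// Simulates one record arriving from the network.
static void append_record(struct raw_buf *raw, const void *p, size_t n) {
  assert(n <= 0xffff && raw->len + 2 + n <= sizeof(raw->data));
  raw->data[raw->len] = (unsigned char) (n >> 8);
  raw->data[raw->len + 1] = (unsigned char) (n & 0xff);
  memcpy(raw->data + raw->len + 2, p, n);
  raw->len += 2 + n;
}

// Decodes at most one complete record, appending its payload at out[*out_len].
// Returns 1 if a record was consumed, 0 if more raw data is needed.
static int decode_one(struct raw_buf *raw, unsigned char *out, size_t out_cap,
                      size_t *out_len) {
  if (raw->len < 2) return 0;
  size_t n = ((size_t) raw->data[0] << 8) | raw->data[1];
  if (raw->len < 2 + n) return 0;  // Partial record: wait for more raw data
  assert(*out_len + n <= out_cap);
  memcpy(out + *out_len, raw->data + 2, n);
  *out_len += n;
  memmove(raw->data, raw->data + 2 + n, raw->len - 2 - n);
  raw->len -= 2 + n;
  return 1;
}

// Keeps decoding until no complete record is left, so a second record that
// arrived in the same raw read is not stranded. Returns the record count.
static size_t decode_all(struct raw_buf *raw, unsigned char *out,
                         size_t out_cap, size_t *out_len) {
  size_t records = 0;
  while (decode_one(raw, out, out_cap, out_len)) records++;
  return records;
}
```

If two records (headers, then `test`) arrive in one raw read, a single `decode_one` call yields only the headers and leaves the body buffered; with no further socket activity, nothing triggers another decode. Looping as in `decode_all` drains both.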

@jcorporation could you apply the following patch and see if the issue still remains, please? (You need to run make mongoose.c in the repo root to re-amalgamate it.)

tls3.diff.txt

Or, better, this patch - it is more correct
tls4.diff.txt

Or, better, this patch - it is more correct
tls4.diff.txt

Great, this patch solves the issue!