cesanta / mongoose

Embedded Web Server

Home Page: https://mongoose.ws

MG_EV_CLOSE is fired before MG_EV_HTTP_MSG

jcorporation opened this issue

  • My goal is: use mongoose as http client
  • My actions were: used the http-client example with enabled ssl
  • My expectation was: The MG_EV_HTTP_MSG fires before the MG_EV_CLOSE event
  • The result I saw: The MG_EV_CLOSE fires before the MG_EV_HTTP_MSG event.

Environment

  • mongoose version: current master
  • Compiler/IDE and SDK: gcc linux (ubuntu)
  • Target hardware/board: x86_64
  • Connectivity chip/module: ?
  • Target RTOS/OS (if applicable): linux

I use mongoose in myMPD as a reverse proxy. Since commit 144c2f4, the MG_EV_CLOSE event fires before the MG_EV_HTTP_MSG event when compiled with OpenSSL. Reverting to a previous commit fixes that. I access some connection-specific fn_data in the MG_EV_HTTP_MSG handler, and it segfaults because I have already freed it in the MG_EV_CLOSE handler.

This issue can be reproduced with the http-client example.

Log

9a8168 4 sock.c:697:mg_mgr_poll         1 r- Tchrc
9a8169 3 sock.c:299:read_conn           1 0x5 snd 0/2048 rcv 36864/38912 n=1053 err=0
9a8169 4 sock.c:697:mg_mgr_poll         2 -- tchrc
9a8169 4 sock.c:697:mg_mgr_poll         1 r- Tchrc
9a8169 3 sock.c:299:read_conn           1 0x5 snd 0/2048 rcv 37917/38912 n=-1 err=0
CLOSED
HTTP MSG
9a8169 3 net.c:151:mg_close_conn        1 5 closed
9a816b 4 sock.c:697:mg_mgr_poll         2 -- tchrC
9a816b 3 net.c:151:mg_close_conn        2 4 closed
9a816b 3 net.c:249:mg_mgr_free          All connections closed

Source code

// Copyright (c) 2021 Cesanta Software Limited
// All rights reserved
//
// Example HTTP client. Connect to `s_url`, send request, wait for a response,
// print the response and exit.
// You can change `s_url` from the command line by executing: ./example YOUR_URL
//
// To enable SSL/TLS, see https://mongoose.ws/tutorials/tls/#how-to-build

#include "mongoose.h"

// The very first web page in history. You can replace it from command line
static const char *s_url = "https://www.google.com/";
static const char *s_post_data = NULL;      // POST data
static const uint64_t s_timeout_ms = 1500;  // Connect timeout in milliseconds

// Print HTTP response and signal that we're done
static void fn(struct mg_connection *c, int ev, void *ev_data, void *fn_data) {
  if (ev == MG_EV_OPEN) {
    // Connection created. Store connect expiration time in c->data
    *(uint64_t *) c->data = mg_millis() + s_timeout_ms;
  } else if (ev == MG_EV_POLL) {
    if (mg_millis() > *(uint64_t *) c->data &&
        (c->is_connecting || c->is_resolving)) {
      mg_error(c, "Connect timeout");
    }
  } else if (ev == MG_EV_CONNECT) {
    // Connected to server. Extract host name from URL
    struct mg_str host = mg_url_host(s_url);

    if (mg_url_is_ssl(s_url)) {
      struct mg_tls_opts opts = {.ca = mg_unpacked("/certs/ca.pem"),
                                 .name = mg_url_host(s_url)};
      mg_tls_init(c, &opts);
    }

    // Send request
    int content_length = s_post_data ? strlen(s_post_data) : 0;
    mg_printf(c,
              "%s %s HTTP/1.0\r\n"
              "Host: %.*s\r\n"
              "Content-Type: octet-stream\r\n"
              "Content-Length: %d\r\n"
              "\r\n",
              s_post_data ? "POST" : "GET", mg_url_uri(s_url), (int) host.len,
              host.ptr, content_length);
    mg_send(c, s_post_data, content_length);
  } else if (ev == MG_EV_HTTP_MSG) {
    // Response is received. Print it
    struct mg_http_message *hm = (struct mg_http_message *) ev_data;
    //printf("%.*s", (int) hm->message.len, hm->message.ptr);
    printf("HTTP MSG\n");
    c->is_draining = 1;        // Tell mongoose to close this connection
    *(bool *) fn_data = true;  // Tell event loop to stop
  } else if (ev == MG_EV_ERROR) {
    *(bool *) fn_data = true;  // Error, tell event loop to stop
  } else if (ev == MG_EV_CLOSE) {
    printf("CLOSED\n");
  }
}

int main(int argc, char *argv[]) {
  const char *log_level = getenv("LOG_LEVEL");  // Allow user to set log level
  if (log_level == NULL) log_level = "4";       // Default is verbose

  struct mg_mgr mgr;              // Event manager
  bool done = false;              // Event handler flips it to true
  if (argc > 1) s_url = argv[1];  // Use URL provided in the command line
  mg_log_set(atoi(log_level));    // Set to 0 to disable debug
  mg_mgr_init(&mgr);              // Initialise event manager
  mg_http_connect(&mgr, s_url, fn, &done);  // Create client connection
  while (!done) mg_mgr_poll(&mgr, 50);      // Event manager loops until 'done'
  mg_mgr_free(&mgr);                        // Free resources
  return 0;
}

It takes me several minutes to compare your code against ours. Can you please let us know whether you changed something in our example code, and where? Thank you.
You can always refer to our code with permalinks.

I'll check this later.

Thanks for the fast response, this is really awesome!
I only added handling of the MG_EV_CLOSE event in the connection handler and replaced the output of the response with a simple printf("HTTP MSG").

Diff:

diff --git a/examples/http-client/main.c b/examples/http-client/main.c
index a1f76cdd..464416c8 100644
--- a/examples/http-client/main.c
+++ b/examples/http-client/main.c
@@ -10,7 +10,7 @@
 #include "mongoose.h"
 
 // The very first web page in history. You can replace it from command line
-static const char *s_url = "http://info.cern.ch/";
+static const char *s_url = "https://www.google.com/";
 static const char *s_post_data = NULL;      // POST data
 static const uint64_t s_timeout_ms = 1500;  // Connect timeout in milliseconds
 
@@ -48,11 +48,14 @@ static void fn(struct mg_connection *c, int ev, void *ev_data, void *fn_data) {
   } else if (ev == MG_EV_HTTP_MSG) {
     // Response is received. Print it
     struct mg_http_message *hm = (struct mg_http_message *) ev_data;
-    printf("%.*s", (int) hm->message.len, hm->message.ptr);
+    //printf("%.*s", (int) hm->message.len, hm->message.ptr);
+    printf("HTTP MSG\n");
     c->is_draining = 1;        // Tell mongoose to close this connection
     *(bool *) fn_data = true;  // Tell event loop to stop
   } else if (ev == MG_EV_ERROR) {
     *(bool *) fn_data = true;  // Error, tell event loop to stop
+  } else if (ev == MG_EV_CLOSE) {
+    printf("CLOSED\n");
   }
 }

I can reproduce your issue, though I still do it with 7.12.
I can't point to that commit as the culprit.
We'll check this and fix it.

Wireshark capture attached. This is with 7.12

  • MG_EV_CLOSE seems to fire for some reason; WE close the connection
    24aa097e 3 sock.c:276:read_conn         1 5 snd 0/2048 rcv 18432/20480 n=861 err=0
    24aa097e 3 sock.c:276:read_conn         1 5 snd 0/2048 rcv 19293/20480 n=-1 err=11
    24aa097e 2 main.c:58:fn                 CLOSED
    24aa097e 2 main.c:52:fn                 DATA
    24aa097f 3 net.c:148:mg_close_conn      1 5 closed
    24aa097f 3 net.c:148:mg_close_conn      2 4 closed
    24aa097f 3 net.c:234:mg_mgr_free        All connections closed
    
    • Notice the n=-1 err=11. This is with MbedTLS and 7.12
      • with HEAD I get n=-1 err=115
      • with OpenSSL I get n=-1 err=0
  • then we deliver MG_EV_HTTP_MSG anyway and main tells the manager to stop (after requesting to close but we've already closed)

cesantadotcom.zip

@jcorporation Thank you

This one is weird. It turns out that Google does not send the response chunked, and does not indicate Content-Length either.
If you change our client example to:

  1. HTTP/1.1 instead of HTTP/1.0
  2. Add Accept: */* header

Then the message is delivered before the CLOSE event, as it was previously. Still, there might be a problem with Mongoose.
Google's response has no Content-Length, but it is still a standard one: the end of the message is determined by the connection close. MSG must come before the CLOSE. Let me look deeper.

Yeah, this is it.

Mongoose calls the user-defined handler first, before the protocol handler:

mongoose/src/event.c

Lines 19 to 23 in 5826d0e

// Run user-defined handler first, in order to give it an ability
// to intercept processing (e.g. clean input buffer) before the
// protocol handler kicks in
if (c->fn != NULL) c->fn(c, ev, ev_data);
if (c->pfn != NULL) c->pfn(c, ev, ev_data);

That means that if the user handler catches MG_EV_CLOSE, it is delivered before MG_EV_HTTP_MSG. This happens only when the server does not specify Content-Length, so the end of the message is inferred from the connection being closed. Weird corner case, but it is what it is. I think you can check this condition by looking at c->recv.len. If it is non-zero, meaning there is buffered data, then you might still receive a message.

My test above was with cesanta.com.
Please notice that we are closing, not the other end.
Do we close because there is no Content-Length indication?

@scaprile it is the same with cesanta.com: do curl -si https://cesanta.com and you'll see that there is no Content-Length set. That means that the client reads data until the socket gets closed. When it does, MG_EV_CLOSE is sent, and both the user's event handler and the HTTP event handler get it, but the user's handler gets it first.

I wonder whether that strategy of calling the user handler first is correct.

It is correct to me, user first, default last.

Yeah but if we call user handler first, we always can have that weird confusing side-effect.
MG_EV_CLOSE gets delivered, caught by the user handlers, and then protocol handler can decide to send more events to the user handler - just like HTTP handler does.

Maybe we should check if there is data in the socket buffer before believing the closure point blank.
I still don't see that closure mechanism... the client is closing... and Mongoose logs n=-1. Shouldn't we stay waiting forever for more data instead of closing right after getting the last piece of data?

24aa097e 3 sock.c:276:read_conn         1 5 snd 0/2048 rcv 18432/20480 n=861 err=0
24aa097e 3 sock.c:276:read_conn         1 5 snd 0/2048 rcv 19293/20480 n=-1 err=11
24aa097e 2 main.c:58:fn                 CLOSED
24aa097e 2 main.c:52:fn                 DATA

We've got -1 reading from socket - so it is done, closed. What's the point of waiting? We must close the connection, and we do.

Mongoose's architecture is such that it has two event handler functions: protocol c->pfn, and user c->fn.
Protocol is kinda lower level. User is higher level. But both handlers receive all events.

From that point of view, c->pfn must be called first.
I don't really recall why we decided to call c->fn first. The comment says that c->fn should be able to tweak the receive buffer before c->pfn gets a chance to process it. Why was that? That sounds wrong to me now.

As far as I remember, other systems seem to do that, too...
Changing the order of handlers can have many unexpected and undesired effects. We'd have to probably re-write a lot of stuff.
I'd try to prevent sending EV_CLOSE when there is outstanding data, fire an EV_READ that will in this case end up being an EV_HTTP_MSG and then fire the close event.

@cpq Thanks for the fast investigation.

I use the following snippet in my code to send the request:

mg_printf(nc, "GET %s HTTP/1.1\r\n"
        "Host: %.*s\r\n"
        "User-Agent: myMPD/"MYMPD_VERSION"\r\n"
        "Accept: */*\r\n"
        "Connection: close\r\n"
        "\r\n",
        mg_url_uri(backend_nc_data->uri),
        (int)host.len, host.ptr
);

And I get the MG_EV_CLOSE event before the MG_EV_HTTP_MSG. This occurs only since the commit above. It works without problems with the latest release.

I link mongoose with OpenSSL.

Log:

21:31:55 INFO     webserver myMPD/src/web_server/proxy.c:119: Creating new http backend connection to "https://jcorporation.github.io/webradiodb/db/index/webradiodb-combined.min.json"
21:31:55 INFO     webserver myMPD/src/web_server/proxy.c:144: Forwarding client connection "3" to http backend connection "4"
21:31:55 INFO     webserver myMPD/src/web_server/proxy.c:91: Backend connection "4" established, host "jcorporation.github.io"
21:31:55 DEBUG    webserver myMPD/src/web_server/proxy.c:106: Sending GET /webradiodb/db/index/webradiodb-combined.min.json HTTP/1.1 to backend connection "4"
21:31:56 INFO     webserver myMPD/src/web_server/proxy.c:66: Backend tcp connection "4" closed
21:31:56 DEBUG    webserver myMPD/src/web_server/webradiodb.c:190: Got response from connection "4", response code 200: 200003 bytes

That has been reported before in #1475.
I think calling the user handler first is confusing, wrong, and incorrect in principle, per my comment above. I am filing a PR to revert the order back to normal: protocol first, user second.

I pulled latest master and I can confirm that b804217 fixes the issue. Thanks for your support and this great piece of software!