erlang / otp

Erlang/OTP

Home Page:http://erlang.org

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

gen_tcp socket based implementation do not work with FDs passed from systemd

hauleth opened this issue · comments

Describe the bug

gen_tcp socket do not receive any messages when passing file descriptor as an inet6 socket.

To Reproduce

Example application can be found in systemd repo.

To be able to test it out you need to be running systemd-enabled Linux distribution. You can use Multipass to get one (Ubuntu) if you do not have one.

  1. Run application with socket activation
    $ systemd-socket-activate -l 8888 --fdname=tcp -E ERL_FLAGS='-kernel inet_backend socket' rebar3 shell
  2. In separate shell connect to the echo service nc localhost 8888
  3. No message is delivered to the echo_tcp process.

For comparing with default implementation works as expected. Steps:

  1. Run application with socket activation
    $ systemd-socket-activate -l 8888 --fdname=tcp rebar3 shell
  2. In separate shell connect to the echo service nc localhost 8888
  3. Messages are delivered to listening process and are responded to.

Expected behavior

When running process with -kernel inet_backend socket messages should be delivered to listening process and such process should respond to the client.

Affected versions

I have tested on OTP 23.3, but I assume that all 23.x line is affected.

Additional context

This come out when I was trying to test whether socket based implementation of gen_tcp will allow fixing problem with dropping the connections when using systemd socket activation.

Ref hauleth/erlang-systemd#24 (comment)

The goal there is to allow full VM restarts without dropping socket connections.

Could you test this with 24.0-rc2?
There has been lot of bug fixes and changes in the master (24.0) branch.

@bmk I will try to do so.

@bmk on OTP 24.0-rc2 it completely fails with {error, einval}.

I have ported the code from using gen_tcp:

-module(echo_tcp).

-include_lib("kernel/include/logger.hrl").

-export([start_link/1,
         listen/1]).

start_link(Opt) ->
    Pid = spawn_link(?MODULE, listen, [Opt]),
    {ok, Pid}.

listen(Opt) ->
    Opts = [Opt,
            binary,
            {nodelay, true},
            {packet, 0},
            {active, false},
            {reuseaddr, true}],
    {ok, LSock} = gen_tcp:listen(0, Opts),
    accept(LSock).

accept(LSock) ->
    {ok, Sock} = gen_tcp:accept(LSock),
    spawn(fun() -> loop(Sock) end),
    accept(LSock).

loop(Sock) ->
    case gen_tcp:recv(Sock, 0) of
        {ok, Data} ->
            ?LOG_NOTICE(#{data => Data, protocol => tcp}),
            gen_tcp:send(Sock, Data),
            loop(Sock);
        {error, closed} ->
            ok
    end.

to using raw socket:

-module(echo_tcp).

-include_lib("kernel/include/logger.hrl").

-export([start_link/1,
         listen/1]).

start_link(Opt) ->
    Pid = spawn_link(?MODULE, listen, [Opt]),
    {ok, Pid}.

listen({fd, Fd}) ->
    {ok, LSocket} = socket:open(Fd),
    ok = socket:listen(LSocket),
    accept(LSocket);
listen({port, Port}) ->
    {ok, LSocket} = socket:open(inet6, stream, tcp),
    ok = socket:bind(LSocket, #{family => inet6,
                               port => Port,
                               addr => any}),
    ok = socket:listen(LSocket),
    accept(LSocket).

accept(LSock) ->
    {ok, Sock} = socket:accept(LSock),
    spawn(fun() -> loop(Sock) end),
    accept(LSock).

loop(Sock) ->
    case socket:recv(Sock) of
        {ok, Data} ->
            ?LOG_NOTICE(#{data => Data, protocol => tcp}),
            socket:send(Sock, Data),
            loop(Sock);
        {error, closed} ->
            ok
    end.

And with raw socket everything works as expected. It seems like gen_tcp tries to do some additional work (probably call bind on already bound FD), but I still haven't traced where and why it is happening.

Ok, I have found that it may me related to JIT, as strace shows that only EINVAL value is returned from arch_prctl(0x3001, /* different values there */). Full system trace:

https://gist.github.com/hauleth/3012efda82c3ed2e3ee1c46edb2eb639

And this does not happen when running without JIT?

I am still trying to build OTP 24.0-rc2 without JIT. I tried to build version with both, but it failed #4722

@bmk still there. I will generate the strace report in a moment.

Ok, I have found that it may me related to JIT, as strace shows that only EINVAL value is returned from arch_prctl(0x3001, /* different values there */). Full system trace:

That's far too early to have anything to do with JIT.

I am still trying to build OTP 24.0-rc2 without JIT. I tried to build version with both, but it failed #4722

Try building with ./configure --disable-jit

I wonder if this has got something to do with Linux's "interesting" handling of IPv6 scoped addresses?
Can you give dead simple example of how it is supposed to work. I try the following; what am I doing wrong?

$ systemd-socket-activate -l [::]:12345 --inetd -a cat
Listening on [::]:12345 as 3.

from another shell:

$ nc 127.0.0.1 12345
hello
hello
^C

and get this in shell 1:

Communication attempt on fd 3.
Connection from 127.0.0.1:35178 to [::ffff:127.0.0.1]:12345
Spawned cat (cat) as PID 6304.
Execing cat (cat)
Child 6304 died with code 0

So far so good. But how about IPv6?

$ nc ::1 12345

nothing happens

^C

If I change the argument to systemd-... into -l [::1]:12345 I get immediate connection refused when trying to connect to 127.0.0.1 12345, and nothing happens for ::1 12345.

So, how is this supposed to work?

@IngelaAndin @bmk https://gist.github.com/hauleth/086f94d31eb260f84880caabf42b7446

@RaimoNiskanen both connections works for me (systemd 245 on Ubuntu 20.04 LTS). However there is huge difference between using inetd-compatible behaviour and "new-style daemons" behaviour. In former the socket connection is accepted by the systemd (in case of TCP sockets), and then there is new process spawned for each connection. In latter case systemd only opens port, and then passes it as a raw file descriptor to the spawned process, since then the process is fully responsible for accepting and responding to the socket connections in whatever manner they seem to be feasible.

I have a hard time figuring out how this systemd stuff looks for the Erlang node. What is it that does not work, from the Erlang node's point of view?
For starters, I have still not managed to create a connection, with Erlang or with any tool, over IPv6 loopback in any way; and I suspect the scope id is the problem.

How is this raw file descriptor passed to the spawned process, exactly, and how is it used in your Erlang code. We need to minimize this example.

I have systemd 237 on Ubuntu 18.04 LTS.

@RaimoNiskanen the way it works is pretty simple in fact. What doesn't from Erlang viewpoint is using existing, bounded, file descriptor with gen_tcp when using socket backend (it works perfectly fine with default inet backend).

How is this raw file descriptor passed to the spawned process, exactly

It is super simple from C viewpoint (simplified "pseudocodish" solution):

int socket_fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP);
bind(socket_fd, (struct sockaddr *)&addr, sizeof(addr));

// fork current process
dup2(socket_fd, 3);
setenv("LISTEN_FDS", "1", 1); // amount of sockets passed via the environment, in this simplified case - always 1
exec(prog_path, prog_args);

Now in the spawned process (in our case Erlang) we should be able to do:

FDsCount = list_to_integer(os:getenv("LISTEN_FDS")),
[FD | _] = [2 + N || N <- lists:seq(1, FDsCount)],
{ok, Socket} = gen_tcp:listen(0, [{fd, FD}]).

In my systemd library it is of course hidden by function, but that does more or less exactly what I have shown above (with small addition of also parsing LISTEN_FDNAMES for socket passed with names, which is irrelevant there, and check if the OS PID match, which also is irrelevant for the discussion there).

I can try to prepare minimal C runner, that will just open socket and pass given socket to new process for sake of testing this out.

Simplest C runner that can provide behaviour similar to what systemd does:

#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <strings.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
        int fd;
        struct sockaddr_in6 addr;

        char pid[255] = {0};

        bzero((char *) &addr, sizeof(addr));
        addr.sin6_family = AF_INET6;
        addr.sin6_addr = in6addr_any;
        addr.sin6_port = htons(8080);

        fd = socket(AF_INET6, SOCK_STREAM, 0);
        if (fd == -1) {
                perror("Failed opening socket");
                return 1;
        }
        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
                perror("Failed binding socket");
                exit(1);
        }

        if (listen(fd, 10) < 0) {
                perror("Failed to listen");
                exit(1);
        }

        if (dup2(fd, 3) < 0) {
                perror("Cannot duplicate FD");
                exit(1);
        }
        sprintf(pid, "%u", getpid());
        setenv("LISTEN_FDS", "1", 1);
        setenv("LISTEN_FDNAMES", "", 1);
        setenv("LISTEN_PID", pid, 1);

        execv(argv[1], &argv[1]);

        // Not reached on success
        perror("Cannot start process");
        exit(1);
}

It is called as ./a.out $(which erl) and then you have 1 IPv6 TCP socket passed as FD3 to the Erlang shell. So it should be possible to do gen_tcp:listen(0, [{fd, 3}, inet6]) within shell. It will be successful when using default inet_backend but will return {error,einval} when using socket backend.

I have tested it on OTP 23 and it successfully open sockets, however gen_tcp:accept/1 never returns. With default backend everything works as expected.

It is worth mentioning that it isn't problem in the socket module itself, as I can successfully run socket:open(3) and accept/recv content of the socket. It seems like problem lies somewhere in the backing module for gen_tcp.

Some further debugging showed that the problem comes probably from the fact that when we use inet_backend = socket then gen_tcp_socket will try to bind the socket, which will obviously fail, as the socket is already bound:

> gen_tcp:listen(0, [{fd,3}, inet6]).
(<0.90.0>) call gen_tcp_socket:listen(0,[{fd,3},inet6]) ({erl_eval,do_apply,6})
(<0.102.0>) call gen_tcp_socket:init({open,inet6,#{fd => 3},<0.90.0>}) ({gen_statem,
                                                                         init_it,
                                                                         6})
(<0.102.0>) call gen_tcp_socket:callback_mode() ({gen_statem,
                                                  loop_state_callback,11})
(<0.102.0>) call gen_tcp_socket:handle_event({call,{<0.90.0>,#Ref<0.2888407712.2773745665.251950>}},{setopts,[{start_opts,[{timeout,infinity}]},{active,true}]},connect,{{params,{'$socket',#Ref<0.2888407712.2773876737.251947>},
         <0.90.0>,#Ref<0.2888407712.2773745665.251946>},
 #{active => true,buffer => <<>>,delay_send => false,deliver => term,
   exit_on_close => true,header => 0,line_delimiter => 10,mode => list,
   packet => raw,packet_size => 67108864,send_timeout => infinity,
   send_timeout_close => false,show_econnreset => false,start_opts => [],
   type => undefined}}) ({gen_statem,loop_state_callback,11})
(<0.102.0>) call gen_tcp_socket:handle_event({call,{<0.90.0>,#Ref<0.2888407712.2773745665.251952>}},{bind,#{addr => {0,0,0,0,0,0,0,0},family => inet6,port => 0}},connect,{{params,{'$socket',#Ref<0.2888407712.2773876737.251947>},
         <0.90.0>,#Ref<0.2888407712.2773745665.251946>},
 #{active => true,buffer => <<>>,delay_send => false,deliver => term,
   exit_on_close => true,header => 0,line_delimiter => 10,mode => list,
   packet => raw,packet_size => 67108864,send_timeout => infinity,
   send_timeout_close => false,show_econnreset => false,
   start_opts => [{timeout,infinity}],
   type => undefined}}) ({gen_statem,loop_state_callback,11})
(<0.102.0>) call gen_tcp_socket:handle_event({call,{<0.90.0>,#Ref<0.2888407712.2773745665.251953>}},close,connect,{{params,{'$socket',#Ref<0.2888407712.2773876737.251947>},
         <0.90.0>,#Ref<0.2888407712.2773745665.251946>},
 #{active => true,buffer => <<>>,delay_send => false,deliver => term,
   exit_on_close => true,header => 0,line_delimiter => 10,mode => list,
   packet => raw,packet_size => 67108864,send_timeout => infinity,
   send_timeout_close => false,show_econnreset => false,
   start_opts => [{timeout,infinity}],
   type => undefined}}) ({gen_statem,loop_state_callback,11})
(<0.102.0>) call gen_tcp_socket:terminate(normal,closed,{{params,{'$socket',#Ref<0.2888407712.2773876737.251947>},
         <0.90.0>,#Ref<0.2888407712.2773745665.251946>},
 #{active => false,buffer => <<>>,delay_send => false,deliver => term,
   exit_on_close => true,header => 0,line_delimiter => 10,mode => list,
   packet => raw,packet_size => 67108864,send_timeout => infinity,
   send_timeout_close => false,show_econnreset => false,
   start_opts => [{timeout,infinity}],
   type => undefined}}) ({gen_statem,terminate,7})
{error,einval}

That may be the source of weird behaviour when using different backend.

This is strange. I could have sworn we fixed this...

@bmk provided fix in #4787

It was fixed, for the inet backend, and also for the socket backend (we thought). connect should have worked since there a bind address of undefined was handled correctly and the fix was to change the default bind address from any to undefined, but we forgot to implement the special treatment implemented in the inet backend for listen, that ignores the bind address (default any) when an {fd,Fd} option is used.

We'll merge your PR shortly.

It seems that it is still occurs on OTP 24.1.4:

$ systemd-socket-activate -l 8888 --fdname=tcp -E ERL_FLAGS='-kernel inet_backend socket' erl
Listening on [::]:8888 as 3.
Communication attempt on fd 3.
Execing erl (erl)
Erlang/OTP 24 [erts-12.1.4] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:1] [jit]

Eshell V12.1.4  (abort with ^G)
1> dbg:tracer().
{ok,<0.82.0>}
2> dbg:tp(gen_tcp_socket, cx).
{ok,[{matched,nonode@nohost,28},{saved,cx}]}
3> dbg:p(all, [call, return_to]).
{ok,[{matched,nonode@nohost,42}]}
4> gen_tcp:listen(0, [{fd, 3}, inet6]).
(<0.80.0>) call gen_tcp_socket:listen(0,[{fd,3},inet6]) ({erl_eval,do_apply,6})
(<0.87.0>) call gen_tcp_socket:init({open,inet6,#{fd => 3},<0.80.0>}) ({gen_statem,
                                                                        init_it,
                                                                        6})
(<0.87.0>) returned from gen_tcp_socket:init/1 -> {ok,connect,
                                                   {{params,
                                                     {'$socket',
                                                      #Ref<0.1210326129.4165599233.188528>},
                                                     <0.80.0>,
                                                     #Ref<0.1210326129.4165468161.188523>},
                                                    #{active => false,
                                                      buffer => <<>>,
                                                      delay_send => false,
                                                      deliver => term,
                                                      exit_on_close => true,
                                                      header => 0,
                                                      line_delimiter => 10,
                                                      mode => list,
                                                      packet => raw,
                                                      packet_size => 67108864,
                                                      send_timeout => infinity,
                                                      send_timeout_close =>
                                                       false,
                                                      show_econnreset => false,
                                                      start_opts => [],
                                                      type => undefined}}}
(<0.87.0>) call gen_tcp_socket:callback_mode() ({gen_statem,
                                                 loop_state_callback,11})
(<0.87.0>) returned from gen_tcp_socket:callback_mode/0 -> handle_event_function
(<0.87.0>) call gen_tcp_socket:handle_event({call,{<0.80.0>,#Ref<0.1210326129.4165468161.188533>}},{setopts,[{start_opts,[]},{active,true}]},connect,{{params,{'$socket',#Ref<0.1210326129.4165599233.188528>},
         <0.80.0>,#Ref<0.1210326129.4165468161.188523>},
 #{active => false,buffer => <<>>,delay_send => false,deliver => term,
   exit_on_close => true,header => 0,line_delimiter => 10,mode => list,
   packet => raw,packet_size => 67108864,send_timeout => infinity,
   send_timeout_close => false,show_econnreset => false,start_opts => [],
   type => undefined}}) ({gen_statem,loop_state_callback,11})
(<0.87.0>) returned from gen_tcp_socket:handle_event/4 -> {keep_state,
                                                           {{params,
                                                             {'$socket',
                                                              #Ref<0.1210326129.4165599233.188528>},
                                                             <0.80.0>,
                                                             #Ref<0.1210326129.4165468161.188523>},
                                                            #{active => true,
                                                              buffer => <<>>,
                                                              delay_send =>
                                                               false,
                                                              deliver => term,
                                                              exit_on_close =>
                                                               true,
                                                              header => 0,
                                                              line_delimiter =>
                                                               10,
                                                              mode => list,
                                                              packet => raw,
                                                              packet_size =>
                                                               67108864,
                                                              send_timeout =>
                                                               infinity,
                                                              send_timeout_close =>
                                                               false,
                                                              show_econnreset =>
                                                               false,
                                                              start_opts => [],
                                                              type =>
                                                               undefined}},
                                                           [{reply,
                                                             {<0.80.0>,
                                                              #Ref<0.1210326129.4165468161.188533>},
                                                             ok}]}
(<0.87.0>) call gen_tcp_socket:handle_event({call,{<0.80.0>,#Ref<0.1210326129.4165468161.188534>}},{bind,#{addr => any,family => inet6,port => 0}},connect,{{params,{'$socket',#Ref<0.1210326129.4165599233.188528>},
         <0.80.0>,#Ref<0.1210326129.4165468161.188523>},
 #{active => true,buffer => <<>>,delay_send => false,deliver => term,
   exit_on_close => true,header => 0,line_delimiter => 10,mode => list,
   packet => raw,packet_size => 67108864,send_timeout => infinity,
   send_timeout_close => false,show_econnreset => false,start_opts => [],
   type => undefined}}) ({gen_statem,loop_state_callback,11})
(<0.87.0>) returned from gen_tcp_socket:handle_event/4 -> {keep_state_and_data,
                                                           [{reply,
                                                             {<0.80.0>,
                                                              #Ref<0.1210326129.4165468161.188534>},
                                                             {error,einval}}]}
(<0.87.0>) call gen_tcp_socket:handle_event({call,{<0.80.0>,#Ref<0.1210326129.4165468161.188537>}},close,connect,{{params,{'$socket',#Ref<0.1210326129.4165599233.188528>},
         <0.80.0>,#Ref<0.1210326129.4165468161.188523>},
 #{active => true,buffer => <<>>,delay_send => false,deliver => term,
   exit_on_close => true,header => 0,line_delimiter => 10,mode => list,
   packet => raw,packet_size => 67108864,send_timeout => infinity,
   send_timeout_close => false,show_econnreset => false,start_opts => [],
   type => undefined}}) ({gen_statem,loop_state_callback,11})
(<0.87.0>) returned from gen_tcp_socket:handle_event/4 -> {next_state,closed,
                                                           {{params,
                                                             {'$socket',
                                                              #Ref<0.1210326129.4165599233.188528>},
                                                             <0.80.0>,
                                                             #Ref<0.1210326129.4165468161.188523>},
                                                            #{active => false,
                                                              buffer => <<>>,
                                                              delay_send =>
                                                               false,
                                                              deliver => term,
                                                              exit_on_close =>
                                                               true,
                                                              header => 0,
                                                              line_delimiter =>
                                                               10,
                                                              mode => list,
                                                              packet => raw,
                                                              packet_size =>
                                                               67108864,
                                                              send_timeout =>
                                                               infinity,
                                                              send_timeout_close =>
                                                               false,
                                                              show_econnreset =>
                                                               false,
                                                              start_opts => [],
                                                              type =>
                                                               undefined}},
                                                           [{reply,
                                                             {<0.80.0>,
                                                              #Ref<0.1210326129.4165468161.188537>},
                                                             ok}]}
(<0.87.0>) call gen_tcp_socket:terminate(normal,closed,{{params,{'$socket',#Ref<0.1210326129.4165599233.188528>},
         <0.80.0>,#Ref<0.1210326129.4165468161.188523>},
 #{active => false,buffer => <<>>,delay_send => false,deliver => term,
   exit_on_close => true,header => 0,line_delimiter => 10,mode => list,
   packet => raw,packet_size => 67108864,send_timeout => infinity,
   send_timeout_close => false,show_econnreset => false,start_opts => [],
   type => undefined}}) ({gen_statem,terminate,7})
{error,einval}
(<0.87.0>) returned from gen_tcp_socket:terminate/3 -> void
(<0.80.0>) returned from gen_tcp_socket:listen/2 -> {error,einval}
5>

It seems that in this code:

listen_open(Domain, ListenOpts, StartOpts, ExtraOpts, Backlog, BindAddr) ->
    case
        start_server(Domain, [{timeout, infinity} | StartOpts], ExtraOpts)
    of
        {ok, Server} ->
            ErrRef = make_ref(),
            try
                Setopts =
                    default_active_true(
                      [{start_opts, StartOpts} |
                       setopts_opts(ErrRef, ListenOpts)]),
                ok(ErrRef, call(Server, {setopts, Setopts})),
                ok(ErrRef, call_bind(Server, default_any(Domain, BindAddr))),
                Socket = val(ErrRef, call(Server, {listen, Backlog})),
                {ok, ?MODULE_socket(Server, Socket)}
            catch
                throw : {ErrRef, Reason} ->
                    close_server(Server),
                    ?badarg_exit({error, Reason})
            end;
        {error, {shutdown, Reason}} ->
            ?badarg_exit({error, Reason});
        {error, _} = Error ->
            ?badarg_exit(Error)
    end.

The call to default_any/2 will not return undefined if Domain is inet*:

default_any(Domain, undefined = Undefined) ->
    if
        Domain =:= inet;
        Domain =:= inet6 ->
            #{family => Domain,
              addr   => any,
              port   => 0};
        true ->
            Undefined
    end;
default_any(_Domain, BindAddr) ->
    BindAddr.

Which is different behaviour from "classic" inet backend and against expected behaviour, as I am trying to use INET6 socket that is already bound by the supervisor.

A combination I did not get right...

Thank you for fixing this issue!