gramineproject / graphene

Graphene / Graphene-SGX - a library OS for Linux multi-process applications, with Intel SGX support

Home Page:https://grapheneproject.io

With Go program, inside a docker container, bind fails with permission denied error, invalid handle error.

sudharkrish opened this issue · comments

Description of the problem

With a Go program running inside a docker container, bind fails with a permission denied error.

Steps to reproduce

Reproducible on a recent Graphene pull (Aug 30th, 2021), commit c321726.
Also reproducible on https://github.com/oscarlab/graphene/releases/tag/v1.2-rc1

A sample Go program and scripts to reproduce the issue are provided.

In the Graphene repo, copy the attached zip file (go_sample.zip) into the /graphene/Examples directory and unzip it there,
which creates the directory /graphene/Examples/go_sample.

Run the following steps under /graphene/Examples/go_sample.
Note: You may need to use sudo for the script below, because it runs a docker build.

  1. Run the script ./launch_main_in_graphene_container.sh
    This script builds the sample Go program (in a docker container),
    then uses Dockerfile_graphene to set up a docker container
    with Graphene, does the sgx-build for the Go program inside
    that container, and launches the container (main_gsgx) with a shell.
  2. Open a terminal in the container named main_gsgx:
    Examples/go_sample$ docker exec -it main_gsgx /bin/bash
    root@e879ba553088:/graphene/Examples/go_sample#
  3. Inside the container's terminal, launch the Go program using Graphene:
    Examples/go_sample# ./run_main.sh

Expected results

Output when running the same Go program outside of Graphene:
Examples/go_sample$ ./main
SK_DBG: listening on 172.17.0.1:8805
client: wrote: hello
server: read: hello

Actual results

When running in a container using graphene:

[P1:T1:main] debug: Allocating stack at 0x0 (size = 8388608)
[P1:T1:main] debug: loading "file:./main"
[P1:T1:main] debug: append_r_debug: adding file:./main at 0x0
[P1:T1:main] debug: Creating pipe: pipe.srv:1
debug: sock_getopt (fd = 11, sockopt addr = 0x7ffef14e4360) is not implemented and always returns 0
[P1:T1:main] debug: Shim process initialized
[P1:shim] debug: IPC worker started
[P1:T1:main] debug: Created sigframe for sig: 23 at 0xb0009390 (handler: 0x460be0, restorer: 0x460d20)
[P1:T1:main] error: bind: invalid handle returned
SK_DBG: ListenUDP->error listen udp 172.17.0.1:8805: bind: permission denied
2021/09/04 01:35:40 listen udp 172.17.0.1:8805: bind: permission denied

Additional information

The Go sample code is in the gopro2 folder of the attached zip file.
I debugged this Go program with Graphene's GDB while running it inside the docker container.
When the Go program calls net.ListenUDP, this API invokes two syscalls: first socket, to create the socket, and then bind.
In this case socket creation succeeds, but the socket handle that is then passed to bind is reported as invalid in shim_do_bind in the LibOS code.
This is shown by the error logged by the LibOS code:
[P1:T1:main] error: bind: invalid handle returned
The call to bind still goes through, however, and the application sees the bind: permission denied error.
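
For readers without the zip file, here is a minimal sketch of the kind of program described above. It is not the attached sample; the function name udpRoundTrip and the loopback address 127.0.0.1:0 are assumptions chosen so that it runs on any machine. It binds a UDP socket with net.ListenUDP, sends "hello" from a second socket, and reads it back on the listening side.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"time"
)

// udpRoundTrip binds a UDP socket on addr, sends msg to it from a second
// socket, and returns what the listening side read.
// (Hypothetical helper, not taken from the attached sample.)
func udpRoundTrip(addr, msg string) (string, error) {
	laddr, err := net.ResolveUDPAddr("udp", addr)
	if err != nil {
		return "", err
	}

	// net.ListenUDP creates the socket and binds it; a bind failure such as
	// the one in this report is returned from this call.
	server, err := net.ListenUDP("udp", laddr)
	if err != nil {
		return "", fmt.Errorf("ListenUDP: %w", err)
	}
	defer server.Close()

	// Dial the address the server actually bound to (the kernel picks the
	// port when addr uses port 0).
	client, err := net.DialUDP("udp", nil, server.LocalAddr().(*net.UDPAddr))
	if err != nil {
		return "", fmt.Errorf("DialUDP: %w", err)
	}
	defer client.Close()

	if _, err := client.Write([]byte(msg)); err != nil {
		return "", fmt.Errorf("client write: %w", err)
	}

	// Bound the wait so a lost datagram cannot hang the program.
	if err := server.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return "", err
	}
	buf := make([]byte, 1024)
	n, _, err := server.ReadFromUDP(buf)
	if err != nil {
		return "", fmt.Errorf("server read: %w", err)
	}
	return string(buf[:n]), nil
}

func main() {
	// 127.0.0.1:0 is an assumption; the report used 172.17.0.1:8805.
	got, err := udpRoundTrip("127.0.0.1:0", "hello")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("server: read:", got)
}
```

Replacing the address in main with the one from the report should exercise the same socket-then-bind path that fails in the logs above.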

May be fixed by this PR: #2678

@dimakuv, the PR #2678 that you mentioned only changes Graphene-direct, not Graphene-SGX. I am testing with Graphene-SGX, and I am using a non-zero port number.
In any case, I tested with a TCP Go server program, and it still fails to bind when launched inside a docker container using Graphene-SGX. Here is a zip file with the program and instructions for testing it: (gopro_tcp_testing_container.zip)

And here is the log:
[P1:T1:main] debug: loading "file:./main"
[P1:T1:main] debug: append_r_debug: adding file:./main at 0x0
[P1:T1:main] debug: Creating pipe: pipe.srv:1
debug: sock_getopt (fd = 11, sockopt addr = 0x7ffc7c4c4740) is not implemented and always returns 0
[P1:T1:main] debug: Shim process initialized
[P1:shim] debug: IPC worker started
[P1:T1:main] debug: Created sigframe for sig: 23 at 0x90009390 (handler: 0x460b80, restorer: 0x460cc0)
[P1:T1:main] error: bind: invalid handle returned
Error listening: listen tcp 172.17.0.1:8805: bind: permission denied
[P1:T1:main] debug: ---- shim_exit_group (returning 1)

Can you run it with loader.log_level = "all" and attach the resulting log?

The relevant part of this log is:

[P1:T1:main] trace: ---- shim_socket(INET, SOCK_NONBLOCK|SOCK_CLOEXEC|DGRAM, 0) = 0x3
[P1:T1:main] trace: ---- shim_setsockopt(3, 1, 6, 0xa41047e4, 4) = 0x0
[P1:T1:main] error: bind: invalid handle returned
[P1:T1:main] trace: ---- shim_bind(3, 0xa404002c, 16) = -13

To be honest, this doesn't help much. Apparently, there is some issue with the address parameter used by bind() of the UDP server. Could you maybe debug it with GDB?

Given that we didn't debug UDP properly, and the UDP code in PAL is very old, it's no surprise it is so buggy... We need to refactor it completely.

@dimakuv I did some debugging with GDB, and it turns out that this is a configuration issue, not a Graphene issue.
When this Go application was run using Graphene inside a docker container, the host bind failed with EADDRNOTAVAIL (99), Cannot assign requested address.
But PAL's unix_to_pal_error_positive does NOT check for EADDRNOTAVAIL, and instead returns the default value PAL_ERROR_DENIED.
Later, in LibOS, shim_do_bind calls pal_to_unix_errno(PAL_ERROR_DENIED), which converts it to EACCES (Permission denied).
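
The two-step translation described above can be modeled in a few lines. This is only a sketch of the behavior, written in Go rather than Graphene's C: the type palError, the functions hostToPAL, palToUnix and seenByApp, and the EINVAL entry are all hypothetical stand-ins, and the real tables contain many more codes. It shows how a host errno without its own PAL code comes back to the application as EACCES.

```go
package main

import (
	"fmt"
	"syscall"
)

// palError is a stand-in for Graphene's PAL error codes (hypothetical type).
type palError int

const (
	palErrorInval  palError = iota + 1 // illustrative entry with a 1:1 mapping
	palErrorDenied                     // the default value named in the thread
)

// Errno values named so the example reads like the thread.
const (
	hostEINVAL        = syscall.EINVAL
	hostEADDRNOTAVAIL = syscall.EADDRNOTAVAIL
	appEACCES         = syscall.EACCES
)

// hostToPAL models the host-errno-to-PAL step: known codes map 1:1,
// everything else falls through to "denied".
func hostToPAL(e syscall.Errno) palError {
	switch e {
	case syscall.EINVAL:
		return palErrorInval
	case syscall.EACCES:
		return palErrorDenied
	default:
		// No dedicated entry (e.g. EADDRNOTAVAIL): fall back to "denied".
		return palErrorDenied
	}
}

// palToUnix models the LibOS step that turns a PAL error back into an errno.
func palToUnix(p palError) syscall.Errno {
	switch p {
	case palErrorInval:
		return syscall.EINVAL
	default:
		return syscall.EACCES
	}
}

// seenByApp chains both steps, as happens on the bind path in this report.
func seenByApp(host syscall.Errno) syscall.Errno {
	return palToUnix(hostToPAL(host))
}

func main() {
	for _, e := range []syscall.Errno{hostEINVAL, hostEADDRNOTAVAIL} {
		fmt.Printf("host: %v -> application: %v\n", e, seenByApp(e))
	}
}
```

In this model the information is lost in the first step, so the second step cannot recover the original errno.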

If possible, we could add a change in Graphene to ensure that the application gets the REAL error for this case: EADDRNOTAVAIL (99), Cannot assign requested address.
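
To illustrate why the real errno matters to a Go application, here is a small sketch; classifyBindError and wrapBindErrno are hypothetical helpers, not part of the attached sample. It classifies a listen error with errors.Is, using errors built in the same shape that net.ListenUDP returns for a failed bind, so it runs without touching the network.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// classifyBindError returns a short label for the errno wrapped inside a
// listen/bind error. (Hypothetical helper for illustration.)
func classifyBindError(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, syscall.EADDRNOTAVAIL):
		return "EADDRNOTAVAIL"
	case errors.Is(err, syscall.EACCES):
		return "EACCES"
	default:
		return "other"
	}
}

// wrapBindErrno builds an error shaped like the one net.ListenUDP returns
// for a failed bind: *net.OpError wrapping *os.SyscallError wrapping errno.
func wrapBindErrno(errno syscall.Errno) error {
	return &net.OpError{Op: "listen", Net: "udp", Err: os.NewSyscallError("bind", errno)}
}

var (
	// What the application would see if the host errno were passed through.
	errRealCause = wrapBindErrno(syscall.EADDRNOTAVAIL)
	// What the application sees in this report.
	errCollapsed = wrapBindErrno(syscall.EACCES)
)

func main() {
	fmt.Println("with the real errno:     ", classifyBindError(errRealCause))
	fmt.Println("with the collapsed errno:", classifyBindError(errCollapsed))
}
```

With the real errno, an application could report that the address is not assigned to a local interface; with EACCES it can only report a permission problem, which points debugging in the wrong direction.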

Otherwise, this issue can be closed.

Thanks for debugging this. Looking at https://github.com/gramineproject/gramine/blob/40e942db6b02555cf2414f20bbd313b63f50e400/Pal/src/host/Linux/pal_linux_error.h and https://github.com/gramineproject/gramine/blob/40e942db6b02555cf2414f20bbd313b63f50e400/common/include/pal_error.h, I don't see any suitable PAL error code to correspond to EADDRNOTAVAIL. In other words, we would have to add a new PAL error code to get a 1:1 conversion, and we try not to increase the number of error codes in Gramine.

But noted. If we hit something similar again, we'll strongly consider adding this error code. Closing the issue.