gristlabs / grist-core

Grist is the evolution of spreadsheets.

Home Page: https://www.getgrist.com/

gvisor sandbox fails when container storage on XFS: `OSError: [Errno 38] Function not implemented`

gabriel-v opened this issue · comments

After the unrelated #914, I ran into the following error when the container fails to boot:

2024-04-01 14:42:31.591 - debug: 3-pipe Sandbox started sandboxPid=69, flavor=gvisor, command=undefined, entryPoint=(default)
2024-04-01 14:42:31.630 - info: Sandbox stderr: run.py: sandbox/gvisor/run.py -E PYTHONPATH=/grist/sandbox/grist -E PIPE_MODE=minimal -m /grist/sandbox --checkpoint /tmp/engine__grist python3 -- /grist/sandbox/grist/main.py sandboxPid=69, flavor=gvisor, command=undefined, entryPoint=(default)
2024-04-01 14:42:31.889 - info: Sandbox stderr: Problem: Traceback (most recent call last): sandboxPid=69, flavor=gvisor, command=undefined, entryPoint=(default)
2024-04-01 14:42:31.890 - info: Sandbox stderr: Problem:   File "/grist/sandbox/grist/main.py", line 11, in <module> sandboxPid=69, flavor=gvisor, command=undefined, entryPoint=(default)
2024-04-01 14:42:31.890 - info: Sandbox stderr: Problem:     import logging sandboxPid=69, flavor=gvisor, command=undefined, entryPoint=(default)
2024-04-01 14:42:31.891 - info: Sandbox stderr: Problem:   File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
Problem:   File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
Problem:   File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
Problem:   File "<frozen importlib._bootstrap_external>", line 936, in exec_module
Problem:   File "<frozen importlib._bootstrap_external>", line 1073, in get_code
Problem:   File "<frozen importlib._bootstrap_external>", line 1131, in get_data sandboxPid=69, flavor=gvisor, command=undefined, entryPoint=(default)
2024-04-01 14:42:31.891 - info: Sandbox stderr: Problem: OSError: [Errno 38] Function not implemented sandboxPid=69, flavor=gvisor, command=undefined, entryPoint=(default)

grist Container Version: 1.1.12
OS: RHEL 7.9
XFS version : 4.5
Docker Version: 24

The source of the problem seems to be the XFS filesystem that the volume data is mounted on (oserror).

The container runs fine with the sandbox turned off on the same XFS storage. Running runsc with the debug command do ls also works, as described in #914.

Is XFS known to be unsupported with gVisor?

Any idea how to further debug this? What operation is it trying to do on the storage to get "not implemented"?

Hmm, searching for XFS in the gvisor repo brings up issues with various unrelated problems that got resolved, where people happened to be using XFS but XFS wasn't the cause. So I would tentatively guess that XFS can work.

I'm not sure how to debug this. Could you check what File "<frozen importlib._bootstrap_external>", line 1131, in get_data is doing for the given version of importlib?

Ah good idea. In the container:

# find / -name '_bootstrap_external.py'
/usr/local/lib/python3.11/importlib/_bootstrap_external.py
# python3 --version
Python 3.11.4

The file on github is identical to the file at the path above.

The failing operation is f.read() where f was opened with _io.open_code(...).
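To check whether the failure is specific to the import machinery, the same read can be attempted directly. This is a minimal sketch, assuming it is run with python3 inside the gVisor sandbox; the helper name try_read is hypothetical, and on Linux errno 38 is reported under the name ENOSYS.

```python
import errno
import io
import os


def try_read(path):
    """Open path with io.open_code (as importlib's get_data does) and read it.

    Returns (True, byte_count) on success, or (False, errno_name) on OSError.
    """
    try:
        # io.open_code expects an absolute path.
        with io.open_code(os.path.abspath(path)) as f:
            data = f.read()
    except OSError as e:
        return False, errno.errorcode.get(e.errno, str(e.errno))
    return True, len(data)


# Example, using the file named in the traceback:
# print(try_read("/usr/local/lib/python3.11/logging/__init__.py"))
```

If this returns (False, 'ENOSYS') for a file that reads fine outside the sandbox, the problem lies in how the sandbox serves reads rather than in importlib itself.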

I will try to get the runsc arguments --strace --debug into the command that runs the sandbox, as per the gvisor docs. Is there some env var that I can override to add these arguments everywhere, or do I need to fork and add them manually? (Edit: GVISOR_FLAGS.)

Running:

docker run -it --rm --privileged --name grist_debug -v /storage/_grist_debug:/persist  --cap-add=SYS_PTRACE -e GVISOR_FLAGS='-unprivileged -ignore-cgroups --debug --strace --debug-log=/persist/gvisor-debug.log' -e 'GRIST_SANDBOX_FLAVOR=gvisor' gristlabs/grist:1.1.12

Where /storage is XFS.

The log output is the same as above.

Runsc output with and without --strace: https://gist.github.com/gabriel-v/fbfbe486487ca6c4d3ea20629ccf7575

(edit) first revision on gist was missing --privileged and was getting different errors (operation not permitted) - running with --privileged yields the expected "function not implemented" error.

Running grep on "-with-strace.log" yields this:

I0401 17:00:18.349049      77 strace.go:631] [   1:   1] python3.11 X read(0x3 /usr/local/lib/python3.11/logging/__pycache__/__init__.cpython-311.pyc, 0x55c4895c7800, 0x18241) = 0 (0x0) errno=38 (function not implemented) (10.322µs)
I0401 17:00:18.350852      77 strace.go:631] [   1:   1] python3.11 X read(0x3 /usr/local/lib/python3.11/logging/__init__.py, 0x55c4895c7800, 0x13bb8) = 0 (0x0) errno=38 (function not implemented) (21.373µs)
I0401 17:00:18.354833      77 strace.go:593] [   1:   1] python3.11 E write(0x2 host:[3], 0x55c489551160 "OSError: [Errno 38] Function not implemented\n", 0x2d)

So reading both the .pyc and the .py of the standard Python logging library fails with this error?
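For larger logs, the grep step can be scripted so that every failing call is listed with its target file. This is a rough sketch: the regular expression is inferred only from the three log lines quoted above, so other runsc versions may need a different pattern, and the names FAILED_CALL and failed_syscalls are hypothetical.

```python
import re

# Matches syscall exit lines ("X") that carry an errno, e.g.
#   python3.11 X read(0x3 /path/to/file, ...) = 0 (0x0) errno=38 (...)
# Format assumed from the quoted log lines only.
FAILED_CALL = re.compile(
    r"(?P<proc>\S+) X (?P<syscall>\w+)\(0x[0-9a-f]+ (?P<target>[^,)]+)"
    r".*?errno=(?P<errno>\d+) \((?P<message>[^)]*)\)"
)


def failed_syscalls(lines, errno_code=38):
    """Return (process, syscall, target) for each exit line with errno_code."""
    results = []
    for line in lines:
        match = FAILED_CALL.search(line)
        if match and int(match.group("errno")) == errno_code:
            results.append(
                (match.group("proc"), match.group("syscall"), match.group("target"))
            )
    return results


# The three lines quoted above, used as sample input.
SAMPLE = [
    "I0401 17:00:18.349049      77 strace.go:631] [   1:   1] python3.11 X read(0x3 /usr/local/lib/python3.11/logging/__pycache__/__init__.cpython-311.pyc, 0x55c4895c7800, 0x18241) = 0 (0x0) errno=38 (function not implemented) (10.322µs)",
    "I0401 17:00:18.350852      77 strace.go:631] [   1:   1] python3.11 X read(0x3 /usr/local/lib/python3.11/logging/__init__.py, 0x55c4895c7800, 0x13bb8) = 0 (0x0) errno=38 (function not implemented) (21.373µs)",
    'I0401 17:00:18.354833      77 strace.go:593] [   1:   1] python3.11 E write(0x2 host:[3], 0x55c489551160 "OSError: [Errno 38] Function not implemented\\n", 0x2d)',
]

for proc, syscall, target in failed_syscalls(SAMPLE):
    print(proc, syscall, target)
```

To run it over a real log, pass an open file object instead of SAMPLE, since iterating a file yields its lines.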


Some of the comments linked from upstream issues mention SELinux as well as CentOS and XFS. I tried setting SELinux to permissive mode, but the problem remains. Later I will also try disabling SELinux and rebooting the machine to see if anything changes.

Hmmm one thing I remember running into was a limit on number of open files. Don't know if symptom would be exactly what you are seeing though. I see on our SaaS that we have the following setting:

      ulimits:
        - name: nofile
          softLimit: 65535
          hardLimit: 65535

Could be good to just double-check there is no file-related ulimit being hit.
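One way to see the limit that actually applies to a process is to query it from Python with the standard resource module. This is a small sketch, assuming it is run inside the container as the same user as the Grist process; the helper names are hypothetical.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


soft, hard = nofile_limits()
print("nofile soft limit:", describe(soft))
print("nofile hard limit:", describe(hard))
```

Running the same snippet on the host and in the container shows whether the configured value actually reaches the process.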

I set the nofile limit to 65k, both for the host users and in the container config: same error.

I set SELINUX=disabled on the host: same error.

We will obtain some ext4 volumes on those same machines and mount those instead of XFS, to see if the sandbox works then.

Edit: I added an ext4 mount point (a 16GB file, zeroed out and mounted into the /persist folder) and I am getting the same "not implemented" error. So it's either because the host root fs (and the docker installation and storage) are still XFS, or the issue is something else entirely.
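To confirm which filesystem actually backs a given path (for example the docker storage directory on the host versus the /persist mount), the mount table can be inspected. This is a sketch that assumes the Linux /proc/mounts format and ignores escaped characters in mount points; the function names are hypothetical.

```python
import os


def parse_mounts(text):
    """Parse /proc/mounts-style text into (mount_point, fs_type) pairs."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mounts.append((fields[1], fields[2]))
    return mounts


def fs_type_for(path, mounts):
    """Return the (mount_point, fs_type) of the deepest mount containing path."""
    path = os.path.abspath(path)
    best = None
    for mount_point, fs_type in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            # On equal length, prefer the later entry: it is mounted on top.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


# Usage on a Linux host:
# with open("/proc/mounts") as f:
#     print(fs_type_for("/var/lib/docker", parse_mounts(f.read())))
```

Inside a container, paths in the image layers typically report an overlay filesystem, so the check for the image contents is more informative when run on the host.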

Do you have any plans for alternative sandbox environments? Even a bash script with an old-school chroot jail and env var dropping might help at this point.

@gabriel-v there's a pyodide sandbox, used for desktop app and grist-static, it needs a little setup that isn't in the standard image: https://github.com/gristlabs/grist-core/tree/main/sandbox/pyodide

There are a few other sandboxes, and it is relatively easy to add more:

const spawners = {
  pynbox,             // Grist's "classic" sandbox - python2 within NaCl.
  unsandboxed,        // No sandboxing, straight to host python.
                      // This offers no protection to the host.
  docker,             // Run sandboxes in distinct docker containers.
  gvisor,             // Gvisor's runsc sandbox.
  macSandboxExec,     // Use "sandbox-exec" on Mac.
  pyodide,            // Run data engine using pyodide.
};

PRs welcome in this area. At Grist Labs all our development efforts are currently focused elsewhere unfortunately.