Bug: ArchiveBox doesn't work on NFS/SMB/FUSE drives that disallow root ownership or have `root_squash` set
gnattu opened this issue
Describe the bug
If the Docker entrypoint fails to chown the archive folder, it quits immediately and the container does not run at all, which makes it impossible to run ArchiveBox with an NFS-mounted volume as the archive folder.
I'm currently working around this with a modified entrypoint, from which I removed the line that chmods everything under the DATA dir.
Steps to reproduce
- Create an NFS volume with docker:
    docker volume create --driver local --opt type=nfs --opt o=addr=[ip-address],rw --opt device=:[path-to-directory] [volume-name]
- Use this as the archive folder in docker-compose.yml:
    - volume-name:/data/archive
- Start the container. The container quits right after it fails to chown the NFS-mounted folder.
Screenshots or log output
[+] Creating 1/0
✔ Container archivebox-sonic-1 Running 0.0s
chown: changing ownership of '/data/archive': Operation not permitted
ArchiveBox version
[+] Creating 1/0
✔ Container archivebox-sonic-1 Running 0.0s
0.7.1
ArchiveBox v0.7.1+editable BUILD_TIME=2023-12-18 06:57:08 1702882628
IN_DOCKER=True IN_QEMU=False ARCH=aarch64 OS=Linux PLATFORM=Linux-6.5.13-orbstack-00121-ge428743e4e98-aarch64-with-glibc2.36 PYTHON=Cpython
FS_ATOMIC=True FS_REMOTE=True FS_USER=1000:1000 FS_PERMS=644
DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND=sonic LDAP=False
[i] Dependency versions:
√ PYTHON_BINARY v3.11.7 valid /usr/local/bin/python3.11
√ SQLITE_BINARY v2.6.0 valid /usr/local/lib/python3.11/sqlite3/dbapi2.py
√ DJANGO_BINARY v3.1.14 valid /usr/local/lib/python3.11/site-packages/django/__init__.py
√ ARCHIVEBOX_BINARY v0.7.1 valid /usr/local/bin/archivebox
√ CURL_BINARY v8.4.0 valid /usr/bin/curl
√ WGET_BINARY v1.21.3 valid /usr/bin/wget
√ NODE_BINARY v21.4.0 valid /usr/bin/node
√ SINGLEFILE_BINARY v1.1.18 valid /app/node_modules/single-file-cli/single-file
√ READABILITY_BINARY v0.0.9 valid /app/node_modules/readability-extractor/readability-extractor
√ MERCURY_BINARY v1.0.0 valid /app/node_modules/@postlight/parser/cli.js
√ GIT_BINARY v2.39.2 valid /usr/bin/git
√ YOUTUBEDL_BINARY v2023.11.16 valid /usr/local/bin/yt-dlp
√ CHROME_BINARY v120.0.6099.28 valid /browsers/chromium-1091/chrome-linux/chrome
√ RIPGREP_BINARY v13.0.0 valid /usr/bin/rg
[i] Source-code locations:
√ PACKAGE_DIR 24 files valid /app/archivebox
√ TEMPLATES_DIR 4 files valid /app/archivebox/templates
- CUSTOM_TEMPLATES_DIR - disabled None
[i] Secrets locations:
- CHROME_USER_DATA_DIR - disabled None
- COOKIES_FILE - disabled None
[i] Data locations:
√ OUTPUT_DIR 9 files @ valid /data
√ SOURCES_DIR 77 files valid ./sources
√ LOGS_DIR 1 files valid ./logs
√ ARCHIVE_DIR 361 files @ valid ./archive
√ CONFIG_FILE 162.0 Bytes valid ./ArchiveBox.conf
√ SQL_INDEX 4.7 MB valid ./index.sqlite3
This only happens if your NFS volume disables or remaps permissions. The solution is not to remove the chmod but rather to set PUID & PGID environment variables to the same values that your NFS server enforces.
> set PUID & PGID environment variables to the same values that your NFS server enforces.
This is not always doable. My server forces the owner group ID to 0 for all files, and in that case you cannot just set PGID=0 in the environment to fix the issue, because you would also have to allow root access for this NFS client. The PUID is already set to the correct value, and that PUID does have read/write permission in the mounted dir (it is actually the owner of the folder).
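To see which numeric IDs a mount actually presents, it can help to read them straight off the mounted directory. The following is a minimal sketch, assuming GNU `stat` is available on the Docker host; `suggest_ids` is a hypothetical helper name and is not part of ArchiveBox.

```sh
# suggest_ids DIR
# Hypothetical helper (not part of ArchiveBox): prints the numeric owner
# UID and GID of DIR in the form of PUID/PGID assignments.
# Assumes GNU stat, where %u is the owner UID and %g is the owner GID.
suggest_ids() {
    dir="$1"
    uid=$(stat -c '%u' "$dir") || return 1
    gid=$(stat -c '%g' "$dir") || return 1
    printf 'PUID=%s\nPGID=%s\n' "$uid" "$gid"
}

# Example (the path is a placeholder):
#   suggest_ids /path/to/nfs/mount
```

The output only reflects what the client sees for that one directory. The server may still remap IDs on write, so the values are a starting point rather than a guarantee.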
Hmm, that's somewhat unusual, but I guess I can allow PGID=0.
I just pushed a commit to `:dev` to remove the `PGID!=0` check, can you try it out?
You can get it with `docker pull archivebox/archivebox:dev` or follow these instructions: https://github.com/ArchiveBox/ArchiveBox#install-and-run-a-specific-github-branch
The strange thing is, even after setting PGID=0 and using `archivebox/archivebox:dev`, the container still fails with `chown: changing ownership of '/data/archive': Operation not permitted`.
Ah, it looks like you're using the `root_squash` option on the server. This prevents all `chown` calls from working, and is unfortunately not easily supported by ArchiveBox at the moment.
AB in Docker starts running as root to create the data dir and set it up correctly, so when it drops down to a sub-user with fewer permissions it won't be able to modify the root-squashed files it just created, as they'll now be owned by the NFS anonymous/nobody user. If you're able to disable that option on the server or set `no_root_squash`, it should work.
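For anyone maintaining a modified entrypoint in the meantime, one possible pattern is to treat a failed `chown` as a warning whenever the path is still writable. This is only a sketch: `try_chown` is a hypothetical helper, not ArchiveBox's actual entrypoint code, and the writability check runs as the current user, so it only approximates what the unprivileged user will be able to do later.

```sh
# try_chown OWNER PATH
# Hypothetical helper (not ArchiveBox's entrypoint code).
# Attempts the chown; if it fails, falls back to checking whether PATH is
# writable by the current user and only reports failure if it is not.
try_chown() {
    owner="$1"
    path="$2"
    if chown "$owner" "$path" 2>/dev/null; then
        return 0
    fi
    if [ -w "$path" ]; then
        echo "warning: could not chown $path (root_squash?), but it is writable; continuing" >&2
        return 0
    fi
    echo "error: could not chown $path and it is not writable" >&2
    return 1
}

# Example (placeholder values):
#   try_chown "$PUID:$PGID" /data/archive || exit 1
```

This does not make `root_squash` shares fully work. Files created while still running as root would still end up owned by the anonymous user.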
The security benefit that `root_squash` provides is limited anyway, as root on NFS is effectively able to impersonate any UID. `root_squash` only prevents a specific SUID escalation attack but doesn't stop clients from impersonating other users.
https://superuser.com/questions/1737302/root-squashing-for-nfs-and-smb-clarification
https://serverfault.com/questions/88114/cant-modify-or-chown-files-on-readynas-nfs-share
In my case, the fix here was to disable `root_squash` on my NFS-via-ZFS share (on the NAS server), like so:
    zfs set sharenfs="no_root_squash,rw=@10.8.1.0/24,rw=@10.222.0.0/24" tank/nas
and then verify as root that the NFS server exports reflect the new option with `exportfs -v`. I then recreated the archivebox docker container and stopped getting this permission error.
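The change can also be checked from the client side by creating a file on the mount and looking at who ends up owning it. The sketch below assumes GNU `stat`; `squash_check` is a hypothetical helper name. When run as root on a share with `root_squash`, the new file would be expected to belong to the anonymous user rather than UID 0.

```sh
# squash_check DIR
# Hypothetical helper: creates a probe file in DIR and compares its owner
# UID with the UID of the current user.
# Returns 0 if they match, 1 if the owner was remapped, 2 if the probe
# file could not be created at all.
squash_check() {
    dir="$1"
    probe="$dir/.squash_probe.$$"
    touch "$probe" 2>/dev/null || return 2
    owner=$(stat -c '%u' "$probe")
    rm -f "$probe"
    if [ "$owner" = "$(id -u)" ]; then
        echo "not squashed: new files are owned by UID $owner"
        return 0
    fi
    echo "squashed: created as UID $(id -u) but owned by UID $owner"
    return 1
}

# Example, run as root on the NFS client (placeholder path):
#   squash_check /path/to/nfs/mount
```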