spotify / dh-virtualenv

Python virtualenvs in Debian packages

Home Page:http://dh-virtualenv.readthedocs.io/en/latest/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

On package upgrades, sometimes the system Python library is, well, destroyed

jhermann opened this issue · comments

Layout changes in the lib of virtualenvs (v16.4.0+) damage system files because symlinks are followed.

Right now, this is usually only triggered if, as part of your build process, you update virtualenv to latest (e.g. when building in Docker). See https://github.com/1and1/debianized-jupyterhub/blob/67faf583d0f20215c7d0e639efd399668659d513/Dockerfile.build#L85 for an example.

Symptoms: On package install, all symlinked folders in the stdlib get their contents replaced by symlinks to self, i.e.

…venv/…/importlib -> /usr/lib/python3.6/importlib/
/usr/lib/python3.6/importlib/abc.py -> /usr/lib/python3.6/importlib/abc.py  ← self-link

This happens on mere unpacking a package with new layout over an installed one with the old layout.

At Yelp we are hitting this too, but we are just using dh-venv 1.0.1 ... ?
I bet something else changed (wheel?) that is making these symlinks instead of normal files.

I think it is related to pypa/virtualenv#1309, specifically the comment about symlinks. I'm not sure I totally understand why the presence of symlinks cause this issue though.

Adding a little more detail about what we're seeing.

We start on a working system with an old version of our dh-virtualenv package installed:

# The file in the venv is a normal file, not a symlink
root@8cc6e4b95eb9:/# ls -l /opt/venvs/tron/lib/python3.6/encodings/punycode.py
-rw-r--r-- 1 root root 6881 Jan 13  2017 /opt/venvs/tron/lib/python3.6/encodings/punycode.py

# So is the system file
root@8cc6e4b95eb9:/# ls -l /usr/lib/python3.6/encodings/punycode.py
-rw-r--r-- 1 root root 6881 Jan 13  2017 /usr/lib/python3.6/encodings/punycode.py

# Just unpack the "bad" deb (a newer version of the same package), don't even run the scripts
root@8cc6e4b95eb9:/# dpkg --unpack /mnt/bad.deb
(Reading database ... 17554 files and directories currently installed.)
Preparing to unpack /mnt/bad.deb ...
Unpacking tron (0.9.9.14.2) over (0.9.9.14) ...

# Now the system file is a symlink to itself
root@8cc6e4b95eb9:/# ls -l /usr/lib/python3.6/encodings/punycode.py
lrwxrwxrwx 1 root root 40 Feb 26 21:07 /usr/lib/python3.6/encodings/punycode.py -> /usr/lib/python3.6/encodings/punycode.py

so the root cause appears to be that virtualenv changed the lib layout from "symlink to directory" to "directory containing symlinks"

the unpacking above causes a write-through as /opt/venvs/tron/lib/python3.6/encodings used to be a symlink-to-directory (which ends up splatting on the system /lib/python3.6/encodings/*)

I'm working around this by ensuring we are using an older version of virtualeenv.

I'm also using ls /opt/venvs/*/lib/python3.6/encodings/punycode.py | xargs -n 1 test -f to detect if there are any timebombs waiting.

Workaround / Solution: use Python3 venv, then there is no copy of the stdlib that could overwrite anything. Also makes for smaller packages.

Root cause:
If a new package comes with physical lib subdirs that in turn contain file symlinks, but the old package has symlinks into the host stdlib for these subdirs, those symlinks are followed and thus the new file symlinks are applied to the host stdlib. Thus instant carnage.

Solution: Use venv (see above). Or remove symlinks in a "preinst" script (and check that is not too late in the package upgrade lifecycle, and the preinst is taken from the NEW package).

The safe version of virtualenv to use seems to be 16.3.0, so installing that version also prevents the defect.

Note that normally, you get the system's virtualenv which is usually older – but e.g. building in Docker can be done in a way that virtualenv is upgraded before dh-virtualenv is called.

With Tron we use python3.6 already, and it affects us
https://github.com/Yelp/Tron/blob/master/debian/rules#L22
Also we don't have any extra preinst scripts, only the postinst script that dh-venv 1.0 comes with.

With Tron we use python3.6 already, and it affects us
https://github.com/Yelp/Tron/blob/master/debian/rules#L22
Also we don't have any extra preinst scripts, only the postinst script that dh-venv 1.0 comes with.

The safe route is to use --builtin-venv and remove the triggers file (see commits in the project linked in 1st post).

By the way, the ticket is referencing --always-copy flag. Would that work in this case instead?

By the way, the ticket is referencing --always-copy flag. Would that work in this case instead?

That would probably be the best route for projects still stuck to Python2. Else venv.

Yeah, maybe we should default to use that flag when setting up the virtualenv when not using built-in venv (and default to built-in venv in future).

FYI: Opened a new issue for virtualenv pypa/virtualenv#1327

--always-copy is no solution and only makes things worse (symlinks are still followed). The only viable solution for classic virtualenv is to remove (all?) symlinks in an existing venv directory, before overwriting with new files.

BTW, restoring a damaged lib (2.7 in this case):

dpkg -S /usr/lib/python2.7/ \
    | sed -r s/:amd64//g | cut -d: -f1 | tr -d ' ' | tr , \\n \
    | xargs apt install --reinstall

If Python stopped working altogether, you might need to copy files from another machine beforehand. Or call the above command several times.

mam-dev/debianized-sentry@99f0c5e is the core of a solution. To be viable, the symlinks should be tarballed before deletion, and possibly restored by postrm abort-upgrade for a 100% quality solution.

As in the above commit, your app install might be broken if the upgrade fails. An explicit reinstall of the old version fixes that.

See https://wiki.debian.org/MaintainerScripts#Upgrading

@jhermann Just to get some clarification: Does this also happen if system python upgrades (i.e. the trigger files fires?)

No, the trigger is just useless since venv symlinks the Python binary, and references the original stdlib.

Change on virtualenv was reverted, so I'm closing this ticket.

In any case, I think this ticket popped up a few thoughts I think should be addressed for dh-virtualenv moving forward

  1. Default to built-in venv. Less likely moving target, of course with possibility still to use regular virtualenv. This is obviously a backwards incompatible change eventually
  2. Default to Python 3. This is very much given, especially when Python 2 is EOL January 2020. Also backwards incompatible.

Created #275 and #274