tox-dev / filelock

A platform-independent file lock for Python.

Home Page:https://py-filelock.readthedocs.io

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

filelock 3.10.0 changes umask to 0 in multithreaded processes, causing world-writable files to be created

chriskuehl opened this issue · comments

We are encountering an issue where a recent change in filelock (#192) is sometimes causing virtualenvs we create to be created with world-writable files, e.g.:

$ ls -l venv/bin
total 56K
-rw-rw-rw- 1 ckuehl users 2.2K Mar 22 10:35 activate
-rw-rw-rw- 1 ckuehl users 1.5K Mar 22 10:35 activate.csh
-rw-rw-rw- 1 ckuehl users 3.0K Mar 22 10:35 activate.fish
-rw-rw-rw- 1 ckuehl users 3.3K Mar 22 10:35 activate.nu
-rw-rw-rw- 1 ckuehl users 1.8K Mar 22 10:35 activate.ps1
-rw-rw-rw- 1 ckuehl users 1.2K Mar 22 10:35 activate_this.py
[...]

$ ls -l venv/lib/python3.8/site-packages
total 48K
[...]
-rw-rw-rw- 1 ckuehl users  151 Mar 22 10:35 distutils-precedence.pth
[...]
-rw-rw-rw- 1 ckuehl users    0 Mar 22 10:35 setuptools-65.6.3.virtualenv

This is causing issues for us because tools like https://github.com/Yelp/aactivator refuse to automatically activate world-writable activate files. On multi-user systems this may also be a security concern.

I believe I've tracked it down to a change in #192 to call umask(0) before acquiring the lock:
https://github.com/tox-dev/py-filelock/blob/66b2d49720a215c6c03bd5ab1a840defaa00436a/src/filelock/_api.py#L182-L186

My hypothesis is that code from another thread is being run in between the call to umask(0) and the call to reset the umask back to its original value, which causes files to be created while umask 0 is active.

Reproduction while creating a virtualenv

I can reproduce this even without tox in a virtualenv with these dependencies:

$ venv/bin/pip freeze --all
distlib==0.3.6
filelock==3.10.0
pip==23.0.1
platformdirs==2.5.2
setuptools==65.6.3
virtualenv==20.17.1
wheel==0.38.4

...and by calling this command:

$ venv/bin/python -m virtualenv --no-download --python /usr/bin/python3.8 test-venv
created virtual environment CPython3.8.13.final.0-64 in 104ms
  creator CPython3Posix(dest=/tmp/tmp.juMXqvrFzt/test-venv, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/nail/home/ckuehl/.local/share/virtualenv)
    added seed packages: pip==23.0.1, setuptools==65.6.3, wheel==0.38.4
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
$ ls -l test-venv/bin/activate
-rw-rw-rw- 1 ckuehl users 2.1K Mar 22 10:44 test-venv/bin/activate

Note that this reproduces for me only about 50% of the time; other times it has the normal (-rw-r--r--) permissions as expected since my umask is 0o22. I'm assuming it only happens some of the time due to getting lucky with thread scheduling.

If I manually patch virtualenv by inserting this into the import path early on:

import os
import threading
import traceback

real_umask = os.umask

def fake_umask(new_umask):
    pid = os.getpid()
    tid = threading.get_ident()
    print(f"{pid=} {tid=} Changing umask to {oct(new_umask)} from {traceback.format_stack()[-2]!r}")
    return real_umask(new_umask)

os.umask = fake_umask

...then you can see the calls to os.umask from different interleaved threads:

$ rm -rf test-venv && venv/bin/python -m virtualenv --no-download --python /usr/bin/python3.8 test-venv && ls -l test-venv/bin/activate
pid=3823951 tid=140034962657728 Changing umask to 0o0 from '  File "/tmp/tmp.juMXqvrFzt/venv/lib/python3.8/site-packages/filelock/_api.py", line 182, in acquire\n    previous_umask = os.umask(0)\n'
pid=3823951 tid=140034962657728 Changing umask to 0o22 from '  File "/tmp/tmp.juMXqvrFzt/venv/lib/python3.8/site-packages/filelock/_api.py", line 186, in acquire\n    os.umask(previous_umask)  # reset umask to initial value\n'
pid=3823951 tid=140034850154048 Changing umask to 0o0 from '  File "/tmp/tmp.juMXqvrFzt/venv/lib/python3.8/site-packages/filelock/_api.py", line 182, in acquire\n    previous_umask = os.umask(0)\n'
pid=3823951 tid=140034858546752 Changing umask to 0o0 from '  File "/tmp/tmp.juMXqvrFzt/venv/lib/python3.8/site-packages/filelock/_api.py", line 182, in acquire\n    previous_umask = os.umask(0)\n'
pid=3823951 tid=140034850154048 Changing umask to 0o22 from '  File "/tmp/tmp.juMXqvrFzt/venv/lib/python3.8/site-packages/filelock/_api.py", line 186, in acquire\n    os.umask(previous_umask)  # reset umask to initial value\n'
pid=3823951 tid=140034934462016 Changing umask to 0o0 from '  File "/tmp/tmp.juMXqvrFzt/venv/lib/python3.8/site-packages/filelock/_api.py", line 182, in acquire\n    previous_umask = os.umask(0)\n'
pid=3823951 tid=140034934462016 Changing umask to 0o22 from '  File "/tmp/tmp.juMXqvrFzt/venv/lib/python3.8/site-packages/filelock/_api.py", line 186, in acquire\n    os.umask(previous_umask)  # reset umask to initial value\n'
pid=3823951 tid=140034858546752 Changing umask to 0o0 from '  File "/tmp/tmp.juMXqvrFzt/venv/lib/python3.8/site-packages/filelock/_api.py", line 186, in acquire\n    os.umask(previous_umask)  # reset umask to initial value\n'
created virtual environment CPython3.8.13.final.0-64 in 112ms
  creator CPython3Posix(dest=/tmp/tmp.juMXqvrFzt/test-venv, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/nail/home/ckuehl/.local/share/virtualenv)
    added seed packages: pip==23.0.1, setuptools==65.6.3, wheel==0.38.4
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
-rw-rw-rw- 1 ckuehl users 2.1K Mar 22 10:53 test-venv/bin/activate

Downgrading to filelock 3.9.1 makes the problem go away.

Can you put in a PR to have a thread lock here; to make it thread safe? 😊

I'm not sure what the fix should look like -- I don't think that just putting a thread lock around the code snippet I highlighted is sufficient. That would prevent multiple threads from entering the umask checking code at once, but it doesn't prevent an issue where thread A changes the umask and then thread B takes over execution and starts writing files with the new umask.

Wouldn't #205 work? two threads can't be at the same time within the umask changing code; so that part will become atomic 🤔

My understanding is that a lock creates a critical section which two threads cannot execute at the same time, but it doesn't prevent a second thread from running entirely. So if I have two threads which interleave like this:

  • Thread A: Enter critical section, change umask to 0
  • Thread B: Create a new file
  • Thread A: Change umask back to the original value

...then thread B will have created a file using the wrong umask.

I could be wrong on this though, I don't know all of the subtleties of how Python handles threading.

Threads are never parallel, because they are not processes. So only one thread can run at any time. A thread lock ensures that only one thread can be in that code segment at any time. So it should work.

Guess this should ensure that two parallel locks will create correct file permissions. However, if there's another thread creating a file (not a lock file) in another thread, that can still be incorrect. I'm not sure that's solve-able 🤔

@jahrules cc on this as is change you've added 🤔

Very interesting. I'm researching this now.

Guess we can use something like https://stackoverflow.com/a/53357635 to read at first invocation the state, and reuse it. That would make most cases work (assuming most of the time users don't change this, so we don't need to keep updating it).

Threads are never parallel, because they are not processes. So only one thread can run at any time. A thread lock ensures that only one thread can be in that code segment at any time. So it should work.

Are you sure? I see thread interleaving when I test it out:

import os
import threading
import time

def check_umask():
    while True:
        with threading.Lock():
            old_umask = os.umask(0)
            print("Changed umask to 0")
            time.sleep(1)
            os.umask(old_umask)
            print("Reset umask back")
        time.sleep(0.1)

def print_forever():
    while True:
        print("X")
        time.sleep(0.1)

if __name__ == '__main__':
    t1 = threading.Thread(target=check_umask)
    t2 = threading.Thread(target=print_forever)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
$ python3 test.py
Changed umask to 0                  
X                                                                              
X            
X                                                                              
X                               
X                                                                              
X                                                                              
X                
X                                                                              
X                               
X                                                                              
Reset umask back                      
X                
Changed umask to 0
X
X                                                                                                                                                             
X                                 
X                                                                              
X                 
X                  
X
X                                    
X                                                                              
X        
Reset umask back
X  
[...]

All of those X's between "Changed" and "Reset" are places I could have written a file with the wrong umask.

You're creating a new thread lock per thread there. Move the thread lock to a global and will work as described.

You're creating a new thread lock per thread there. Move the thread lock to a global and will work as described.

Oh, good point. It's still the same result, though. I don't think threading locks prevent execution of other threads, they just create a critical section. https://docs.python.org/3/library/threading.html#lock-objects seems to agree as well.

I don't think threading locks prevent execution of other threads, they just create a critical section.

Note my comment above:

Guess this should ensure that two parallel locks will create correct file permissions. However, if there's another thread creating a file (not a lock file) in another thread, that can still be incorrect. I'm not sure that's solve-able 🤔

sorry, catching up here. @chriskuehl I 100% agree that it's the threads writing between os.umask(0) and the revert. This is an unfortunate side effect of threads sharing the global namespace with the process that spawned the thread.

Im currently pondering a solution.

@gaborbernat @chriskuehl

Here's my suggestion:

Since I don't think there is a convenient workaround for setting os.umask not affecting files potentially being written in other threads; I would propose updating the code here.

removing both os.umask here:

                        previous_umask = os.umask(0)
                        try:
                            self._acquire()
                        finally:
                            os.umask(previous_umask)  # reset umask to initial value

and replacing with a call to os.fchown() inside of _unix.py/_windows.py files right after the call to os.open()

The proposed change works for UNIX based systems but has some issues with Windows. Im currently researching a solution

Sorry for a silly question, but why does acquire() need to worry about permission bits?
_acqure() already uses the mode argument to os.open to set permission bits.

https://github.com/tox-dev/py-filelock/blob/0b102873247c37ccbcc30ce9d1e67e4eb3e7f5d1/src/filelock/_unix.py#L33-L41

EDIT: I see os.open() is affected by os.umask(). (docs)
Could a solution be to not force the mode with chmod or otherwise, and allow the permissions of the file to be restricted by os.umask() if set?

That is a bad API because other people would run into the same issue as we here.

The fix in #206 is causing issues, today our CI started failing trying to install poetry in Docker/Podman container once 3.10.2 was released, 3.10.0 is working fine

Added details to the PR #206 (comment)

There is also another issue with #206

Added details to PR comment

There is also another issue with #206

By calling open() without mode, and then following it with os.chmod() this leads to a situation where os.chmod() may set stricter permissions than open(). This means there is an opportunity between file creation and file permission set where less privileged users can influence the file. This is a problem.

import os, filelock

# Program makes os.umask more permissive
os.umask(0)

with filelock.FileLock("/path/to/lock"):
    # Between lock creation and destruction, there is a moment where the permissions are rwxrwxrwx.
    ...

https://github.com/tox-dev/py-filelock/blob/b85b5cbb3db00bfc87d75b8551be341ea8696092/src/filelock/_unix.py#L33-L36

I think the permissions should be set by os.open() and os.chmod() to avoid an issue.
Also, it may be a good idea have a way for the permissions of the lock file to be set by os.umask() should the user wish. Perhaps the default permissions for FileLock should be None, and signals that chmod() is not called and permissions are whatever os.umask() makes them to be.

Related Stack post: https://stackoverflow.com/questions/36745577/how-do-you-create-in-python-a-file-with-permissions-other-users-can-write#comment105490504_36745806

I don't think there's really an issue here. The default umask is 0o022 which will default to permissions of 644 (-rw-r--r--). If the user has requested different permissions, then they will be set following that going from less stringent to more stringent in 99.9% of cases.

If the user has changed their umask themselves; then yes, it will create the filelock with 0o777 and then chmod will make it 0o644. But it should be noted that before my changes, this was how the lock files already worked.

We just wanted to expose the ability to have permissions other than 0o644 so that users could have lockfiles which are usable by multiple usercodes without having to know about umask or take a manual action to change the permissions.

Albeit, before this commit, the prior solution was not thread-safe which is not any better.
And you are right that so long as os.umask() is a reasonable value, there is no issue.

Before this change (I mean before the multi-user locks) you are right that setting os.umask(0) would give the locks 0o777 perms. The difference is, now that there is an explicit mode argument, there is now a situation where a user may think the lock will be securely and properly held, but then has an issue due to this moment between file creation and permission set.

However, I think that 0.1% case can be covered by passing self._mode to os.open() as well.
Just because its not "common" doesn't make it unimportant.

Perhaps this is better as a separate PR?