pypa / distutils

distutils as found in cpython

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

New test_set_include_dirs leaves files in working directory

jaraco opened this issue · comments

In #153, I added test_set_include_dirs. At least on my mac, this operation leaves a file in the working directory:

private/var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pytest-of-jaraco/pytest-32/test_set_include_dirs0/foo.o

It seems something about invoking a compiler on a file causes the path for that file to get built relative to the current directory :/

So apparently, the C compiler class intentionally replaces absolute paths with relative ones.

Removing that line (and the one above it), the tests still pass, so there's no good way to tell how important that behavior is. Do all known use-cases rely on relative paths? Or do some pass absolute paths and rely on this rewriting?

In my opinion, the behavior should be consistent, to place the compiled file alongside its source regardless of whether it was specified absolute or relative. It's going to take some work to avoid potential breakage.

Those lines were added in 5daaf7c, apparently for bpo-668662python/cpython#37775.

On my Ubuntu 22.04.1 LTS system, this deprecation warning behaves in a very odd way when using Cython's pyximport on Python 3.10.5.

If I create a small test python file, pyxtest.py:

import pyximport
pyximport.install()
import foo  # noqa
foo.hello()

and a small foo.pyx file next to it:

#cython: language_level=3

def hello():
    print("Hello!")

then running the script produces:

$ rm -rf ~/.pyxbld ; python -W default::DeprecationWarning: pyxtest.py
/home/user/miniconda3/envs/qutip/lib/python3.10/site-packages/pyximport/pyximport.py:51: DeprecationWarning: the imp module is deprecated in favour of importlib and slated for removal in Python 3.12; see the module's documentation for alternative uses
  import imp
/home/user/miniconda3/envs/qutip/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py:956: DeprecationWarning: Absolute path '/home/user/.pyxbld/temp.linux-x86_64-cpython-310/pyrex/foo' is being replaced with a relative path 'home/user/.pyxbld/temp.linux-x86_64-cpython-310/pyrex/foo' for outputs. This behavior is deprecated. If this behavior is desired, please comment in pypa/distutils#169.
  warnings.warn(msg, DeprecationWarning)
Hello!

AND the .pyx file is not built in the relative path, but in the absolute one:

tree ~/.pyxbld 
/home/user/.pyxbld
├── lib.linux-x86_64-cpython-310
│   └── foo.cpython-310-x86_64-linux-gnu.so
└── temp.linux-x86_64-cpython-310
    ├── home
    │   └── simon
    └── pyrex
        └── foo.c

The rm ~/.pyxbuild just removes any cache .so files. The -W ... is just to make sure the warning are printed. The warning about the use of imp has already been fixed for pyximport, but not released yet.

I'm not sure how pyximport can remove this deprecation warning -- they want to build in an absolute path, not a relative one. Any advice for what tools like pyximport should do? Or can this somehow be addressed in setuptools / distutils?

I tested on a few different versions of setuptools, with the following results:

  • No deprecaton warning:
    • setuptools-64.0.0
    • setuptools-64.0.3
  • Deprecation warning:
    • setuptools-65.0.0
    • setuptools-65.0.2
    • setuptools-65.1.0

Small clarification: The behaviour described as the future behaviour in the deprecation warning (i.e. that the absolute path will be used in future) is already the actual behaviour that occurs (at least in how pyximport uses this).

This is the part that is very confusing. Maybe the "deprecated" behaviour only ever occurred on some specific platform?

I suspect what's happening is pyximport is triggering the behavior because it's specifying an absolute path, and distutils is passing that absolute path to the .compile() method, which compiles to an object file in a path relative to the current directory, but it then uses that artifact to link and somehow ends up with the linked module back in the original path. If that's the case, the pyximport can ignore the deprecation warning. I'll try to replicate the issue and see if I can trace the behavior and confirm my suspicion.

In an attempt to replicate the issue, I created this Dockerfile:

# syntax = docker/dockerfile:1.4.0

from jaraco/multipy-tox
RUN <<EOF cat >> pyxtest.py
import pyximport
pyximport.install()
import foo  # noqa
foo.hello()
EOF
RUN <<EOF cat >> foo.py
# cython: language_level=3

def hello():
	print("hello")
EOF
CMD pip-run cython setuptools -- -W default::DeprecationWarning pyxtest.py

When I run that image, it completes but doesn't emit the reported warning:

 draft $ docker run -it @$(docker build -q .)
Collecting cython
  Downloading Cython-0.29.32-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.manylinux_2_24_aarch64.whl (1.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 6.1 MB/s eta 0:00:00
Collecting setuptools
  Downloading setuptools-65.1.0-py3-none-any.whl (1.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 4.8 MB/s eta 0:00:00
Installing collected packages: setuptools, cython
Successfully installed cython-0.29.32 setuptools-65.1.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
/tmp/pip-run-urgoe9au/pyximport/pyximport.py:51: DeprecationWarning: the imp module is deprecated in favour of importlib and slated for removal in Python 3.12; see the module's documentation for alternative uses
  import imp
hello

As you can see, the latest version of Setuptools is getting installed. I also confirmed that "import distutils" does resolve to the Setuptools-local version of distutils.

Any idea why I'm not replicating the failure?

@jaraco Would it be easy to try with Python 3.10? I don't see it on Python 3.9 and earlier, so maybe 3.11 is fine too?

I did try Python 3.10 also. Same result:

$ docker run -it @$(docker build -q .) py -3.10 -m pip-run cython setuptools -- -W default::DeprecationWarning pyxtest.py
Collecting cython
  Downloading Cython-0.29.32-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.manylinux_2_24_aarch64.whl (1.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 11.6 MB/s eta 0:00:00
Collecting setuptools
  Using cached setuptools-65.1.0-py3-none-any.whl (1.2 MB)
Installing collected packages: setuptools, cython
Successfully installed cython-0.29.32 setuptools-65.1.0
/tmp/pip-run-jxe_ve8z/pyximport/pyximport.py:51: DeprecationWarning: the imp module is deprecated in favour of importlib and slated for removal in Python 3.12; see the module's documentation for alternative uses
  import imp
hello

I suspect there's another factor - perhaps something to do with using the native Ubuntu Python.

aside: I've updated jaraco/multipy-tox so it no longer emits the pip as root warning.

I figured out my mistake. I left the x off of foo.pyx.

This Dockerfile replicates the issue nicely:

# syntax = docker/dockerfile:1.4.0

from jaraco/multipy-tox
RUN <<EOF cat >> pyxtest.py
import pyximport
pyximport.install()
import foo  # noqa
foo.hello()
EOF
RUN mkdir test
RUN <<EOF cat >> foo.pyx
# cython: language_level=3

def hello():
	print("hello")
EOF
CMD py -3.10 -m pip-run -q cython setuptools -- -W default::DeprecationWarning -W 'ignore:the imp module' pyxtest.py

I'm able to more directly trigger the error with this:

# syntax = docker/dockerfile:1.4.0

from jaraco/multipy-tox
RUN <<EOF cat >> pyxtest.py
import pyximport
pyximport.install()
pyximport.build_module('foo', 'foo.pyx')
EOF
RUN mkdir test
RUN <<EOF cat >> foo.pyx
# cython: language_level=3

def hello():
	print("hello")
EOF
ENV PYTHONWARNINGS="default::DeprecationWarning,ignore:the imp module"
CMD py -3.10 -m pip-run -q cython setuptools -- pyxtest.py

Moving the warnings to an environment variable makes it easier not to have to supply it. And invoking build_module bypasses the import behavior, getting a little closer to the cause (and bypassing the need to execute the module).

Even better, this installs the dependencies and pins to Python 3.10 by default:

# syntax = docker/dockerfile:1.4.0

from jaraco/multipy-tox
RUN <<EOF cat >> pyxtest.py
import pyximport
pyximport.install()
pyximport.build_module('foo', 'foo.pyx')
EOF
RUN mkdir test
RUN <<EOF cat >> foo.pyx
# cython: language_level=3

def hello():
	print("hello")
EOF
ENV PYTHONWARNINGS="default::DeprecationWarning,ignore:the imp module"
ENV PY_PYTHON=3.10
RUN py -m pip install cython setuptools
CMD py pyxtest.py

Aha, so calling pyx_to_dll with an absolute filename is what leads to the warning:

 draft $ docker run -it @$(docker build -q .) py -c "from pyximport import pyxbuild; pyxbuild.pyx_to_dll('foo.pyx')"
 draft $ docker run -it @$(docker build -q .) py -c "from pyximport import pyxbuild; pyxbuild.pyx_to_dll('/foo.pyx')"
/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/ccompiler.py:956: DeprecationWarning: Absolute path '/foo' is being replaced with a relative path 'foo' for outputs. This behavior is deprecated. If this behavior is desired, please comment in pypa/distutils#169.
  warnings.warn(msg, DeprecationWarning)

Passing build_in_temp=True also triggers the warning:

 draft $ docker run -it @$(docker build -q .) py -c "from pyximport import pyxbuild; pyxbuild.pyx_to_dll('foo.pyx', build_in_temp=True)"
/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/ccompiler.py:956: DeprecationWarning: Absolute path '/_pyxbld/temp.linux-aarch64-cpython-310/pyrex/foo' is being replaced with a relative path '_pyxbld/temp.linux-aarch64-cpython-310/pyrex/foo' for outputs. This behavior is deprecated. If this behavior is desired, please comment in pypa/distutils#169.
  warnings.warn(msg, DeprecationWarning)

The absolute path gets generated by this logic when cython_c_in_temp.

When I do some debugging, I can see the effect of the path getting made relative because the outputs show the path twice:

 draft $ docker run -it @$(docker build -q .) py -m pdb pyxtest.py
> /pyxtest.py(1)<module>()
-> import pyximport
(Pdb) b setuptools/_distutils/command/build_ext.py:559
Breakpoint 1 at /usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py:559
(Pdb) c
/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/ccompiler.py:956: DeprecationWarning: Absolute path '/_pyxbld/temp.linux-aarch64-cpython-310/pyrex/foo' is being replaced with a relative path '_pyxbld/temp.linux-aarch64-cpython-310/pyrex/foo' for outputs. This behavior is deprecated. If this behavior is desired, please comment in pypa/distutils#169.
  warnings.warn(msg, DeprecationWarning)
> /usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py(559)build_extension()
-> self._built_objects = objects[:]
(Pdb) objects
['/_pyxbld/temp.linux-aarch64-cpython-310/_pyxbld/temp.linux-aarch64-cpython-310/pyrex/foo.o']
(Pdb) sources
['/_pyxbld/temp.linux-aarch64-cpython-310/pyrex/foo.c']

Now that I look at the object_filenames code again, I observe that the mangling is probably important. If the mangling isn't done, when os.path.join is invoked, the outdir will be disregarded.

So maybe the mangling should be brought back, but documented more clearly... or maybe this code should stop trying so hard to provide every possible option such that it does unexpected things when no options are supplied.

After thinking about this overnight, I'm pretty sure this deprecation will need to get removed.

I think what this function is really trying to achieve is to ensure that object filenames are stored somewhere in the "output_dir", even if the source files are indicated by an absolute path. According to the docs, the files should be stored relative to their originally specified path in the output dir, but that example only considers relative paths. And to make matters worse, the default output dir is "", meaning the current working directory.

I think three things need to happen:

  • Remove the deprecation warning.
  • Instead of simply replacing an absolute path with a relative one, calculate the best path relative to the output dir (so a temp path doesn't appear twice).
  • If an output dir is not specified, the output dir should default to the dir where the source file is found.

Ugh. I see object_filenames is a mess. It has its own implementation in every compiler (unixccompiler inherits the default, but every other derived compiler (bcpp, msvc*, cygwin) re-implements the logic, each in their own subtly different way. BBCP and cygwin do not patch the dir, so will have problems if an absolute dir is passed. MSVC does its own patching.

Arg. Thank you for digging into this morass over the weekend!

In 1e89985 and some of the parent commits, I've done some refactoring so at least _msvccompiler and unixcompiler share the implementation for the shared logic. I've deprecated bcpp (similar to msvc*), so I'm not going to bother with those. In 610da0b, I gave cygwincompiler a similar treatment.

Setuptools 65.2 has should have the release that no longer emits a warning. I'm going to open a new issue to track the possible follow-up actions.

Thanks! I will give 65.2 a try locally and in the QuTiP CI (which covers Python 3.6-3.10 and Ubuntu, Mac OS and Windows).