indygreg / PyOxidizer

A modern Python application packaging and distribution tool

Segfault loading Frozen Modules

dstufft opened this issue · comments

This may be too esoteric for you all to support, and I suspect the problem may be elsewhere, but I'm not entirely sure, so I figured I'd start here.

I'm attempting to use pyembed directly, with a self-compiled CPython 3.11.4 which I am statically linking (through Bazel), and whenever I turn on the oxidized importer I get a segfault. Running under lldb I get this:

Process 2508996 launched: '/home/dstufft/projects/pypi/code/.bazel/bin/apps/demo/demo' (x86_64)
Process 2508996 stopped
* thread #1, name = 'demo', stop reason = signal SIGSEGV: invalid address (fault address: 0x0)
    frame #0: 0x0000555557784a9c demo`oxidized_importer::python_resources::PythonResourcesState$LT$u8$GT$::index_interpreter_frozen_modules::hd1bedcc40b6742b4(self=0x00007fffffff95e8) at python_resources.rs:577:25

That appears to mean that PyImport_FrozenModules is an empty array, so the unsafe code that fetches items from it crashes.

What's confusing to me is that the program works fine without the oxidized importer, which suggests that Python has the frozen modules it needs to start up, but when I turn on the oxidized importer it segfaults trying to read them.

Poking around some more, it appears that PyImport_FrozenModules is NULL by default in Python, e.g. if I run a script like this:

import ctypes

class struct_frozen(ctypes.Structure):
    _fields_ = [("name", ctypes.c_char_p),
                ("code", ctypes.POINTER(ctypes.c_ubyte)),
                ("size", ctypes.c_int),
                ("get_code", ctypes.POINTER(ctypes.c_ubyte)),  # Function pointer
               ]


FrozenTable = ctypes.POINTER(struct_frozen)

table = FrozenTable.in_dll(ctypes.pythonapi, "PyImport_FrozenModules")
print(table[0].name)

With just a normal python3, I get:

ValueError: NULL pointer access

If I change that to _PyImport_FrozenBootstrap, as shown in the example in the ctypes documentation, I get the default frozen modules.
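For anyone who wants to reproduce the check without depending on the struct layout, here is a minimal sketch that reads each symbol as a bare pointer and only reports whether it is NULL. The helper name pointer_state is made up for illustration, and the result depends on which Python version and build runs it.

```python
import ctypes


def pointer_state(symbol):
    """Classify a pointer-valued global in the running interpreter."""
    try:
        ptr = ctypes.c_void_p.in_dll(ctypes.pythonapi, symbol)
    except ValueError:
        # in_dll raises ValueError when the symbol is not exported.
        return "missing"
    # c_void_p.value is None when the stored pointer is NULL.
    return "NULL" if ptr.value is None else "non-NULL"


for symbol in ("PyImport_FrozenModules", "_PyImport_FrozenBootstrap"):
    print(symbol, pointer_state(symbol))
```

Because nothing is dereferenced, this should be safe to run even when the table pointer is NULL.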

It appears that something in PyOxidizer's toolchain might be causing PyImport_FrozenModules to be non-NULL when it normally wouldn't be, and oxidized_importer assumes that it is an array?

Digging even further into it, this appears to be a change in Python 3.11: python/cpython@074fa57#diff-7247d35d315a26d853c8597ef32be4c8f8c2c7f9836b9fac0e727e6560394d78R132

It looks like previously PyImport_FrozenModules was initialized to be the same as _PyImport_FrozenModules, but now it is initialized to be NULL.

From what I can tell, PyImport_FrozenModules was primarily designed as a way for people to add additional frozen modules to the interpreter, but because of the way it was implemented, it held the entire list of frozen modules, so people embedding CPython who wanted to add their own had to duplicate the default list.

The above change removes _PyImport_FrozenModules and defaults PyImport_FrozenModules to NULL, and the default bootstrap modules are now stored in _PyImport_FrozenBootstrap.
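To illustrate the kind of guard that seems to be missing, here is a rough Python sketch (not the actual Rust code in oxidized_importer) that treats a NULL table as "no extra frozen modules" instead of indexing into it. The struct layout is copied from my script above, the table is hand-built rather than read from the interpreter, and frozen_names is a hypothetical helper.

```python
import ctypes


class struct_frozen(ctypes.Structure):
    # Same illustrative layout as the script above; only `name` is read here.
    _fields_ = [
        ("name", ctypes.c_char_p),
        ("code", ctypes.POINTER(ctypes.c_ubyte)),
        ("size", ctypes.c_int),
        ("get_code", ctypes.POINTER(ctypes.c_ubyte)),  # Function pointer
    ]


FrozenTable = ctypes.POINTER(struct_frozen)


def frozen_names(table):
    """Return module names from a sentinel-terminated table, or [] if NULL."""
    if not table:  # a NULL ctypes pointer is falsy
        return []
    names = []
    i = 0
    while table[i].name is not None:  # the sentinel entry has name == NULL
        names.append(table[i].name.decode("ascii"))
        i += 1
    return names


# A hand-built table standing in for one supplied by an embedder.
demo = (struct_frozen * 3)()
demo[0].name = b"alpha"
demo[1].name = b"beta"
# demo[2] is left zeroed and acts as the sentinel.
demo_table = ctypes.cast(demo, FrozenTable)

# A default-constructed pointer is NULL, like PyImport_FrozenModules on 3.11.
null_table = FrozenTable()

print(frozen_names(demo_table))
print(frozen_names(null_table))
```

The same check in the importer would presumably be a null test on the pointer before the loop that walks the entries.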