Segfault loading Frozen Modules
dstufft opened this issue · comments
This may be too esoteric for you all to support, and I suspect the problem may be elsewhere but I'm not entirely sure so I figured I'd start here.
I'm attempting to use pyembed directly, with a self-compiled CPython 3.11.4 that I am statically linking (through Bazel). Whenever I turn on the oxidized importer, I get a segfault. Running under lldb I get this:
Process 2508996 launched: '/home/dstufft/projects/pypi/code/.bazel/bin/apps/demo/demo' (x86_64)
Process 2508996 stopped
* thread #1, name = 'demo', stop reason = signal SIGSEGV: invalid address (fault address: 0x0)
frame #0: 0x0000555557784a9c demo`oxidized_importer::python_resources::PythonResourcesState$LT$u8$GT$::index_interpreter_frozen_modules::hd1bedcc40b6742b4(self=0x00007fffffff95e8) at python_resources.rs:577:25
It appears that means that PyImport_FrozenModules is an empty array, so the unsafe code that is fetching items from it is crashing.
What's confusing to me is that the program works fine without the oxidized importer, which suggests that Python has the frozen modules it needs to start up, but when I turn the oxidized importer on it segfaults trying to read them.
Poking around some more, it appears that PyImport_FrozenModules is NULL by default in Python, e.g. if I run a script like this:
import ctypes

class struct_frozen(ctypes.Structure):
    _fields_ = [("name", ctypes.c_char_p),
                ("code", ctypes.POINTER(ctypes.c_ubyte)),
                ("size", ctypes.c_int),
                ("get_code", ctypes.POINTER(ctypes.c_ubyte)),  # Function pointer
               ]

FrozenTable = ctypes.POINTER(struct_frozen)
table = FrozenTable.in_dll(ctypes.pythonapi, "PyImport_FrozenModules")
print(table[0].name)
With just a normal python3, I get:
ValueError: NULL pointer access
If I change that to _PyImport_FrozenBootstrap, as in the example in the ctypes documentation, I get the default frozen modules.
It appears that something in PyOxidizer's toolchain might be causing PyImport_FrozenModules to be non-NULL when it otherwise normally wouldn't be, and oxidized_importer assumes that it is an array?
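To illustrate the kind of guard I'd expect before walking the table, here is a minimal Python sketch. It is not oxidized_importer's actual code (that is Rust); `frozen_names` is a hypothetical helper, and the demo table is synthetic so that nothing depends on the struct layout of the running interpreter. It assumes the table ends with an entry whose name is NULL, as the ctypes documentation example does.

```python
import ctypes


class struct_frozen(ctypes.Structure):
    # Same field layout as the script above.
    _fields_ = [
        ("name", ctypes.c_char_p),
        ("code", ctypes.POINTER(ctypes.c_ubyte)),
        ("size", ctypes.c_int),
        ("get_code", ctypes.POINTER(ctypes.c_ubyte)),  # Function pointer
    ]


FrozenTable = ctypes.POINTER(struct_frozen)


def frozen_names(table):
    """Return the names in a frozen table, or [] if the pointer is NULL.

    Hypothetical helper for illustration only.
    """
    names = []
    if not table:  # a NULL ctypes pointer (or None) is falsy
        return names
    i = 0
    # Assumption: the table is terminated by an entry whose name is NULL.
    while table[i].name is not None:
        names.append(table[i].name.decode("ascii"))
        i += 1
    return names


# Synthetic table: two entries followed by a zeroed sentinel entry.
demo = (struct_frozen * 3)()
demo[0].name = b"alpha"
demo[1].name = b"beta"

print(frozen_names(ctypes.cast(demo, FrozenTable)))  # ['alpha', 'beta']
print(frozen_names(FrozenTable()))                   # [] (NULL pointer, no dereference)
```

The point is only the early return: a NULL table is treated as "no extra frozen modules" instead of being indexed.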
Digging even further into it, this appears to be a change in Python 3.11: python/cpython@074fa57#diff-7247d35d315a26d853c8597ef32be4c8f8c2c7f9836b9fac0e727e6560394d78R132
It looks like previously PyImport_FrozenModules was initialized to be the same as _PyImport_FrozenModules, but now it is initialized to be NULL.
From what I can tell, PyImport_FrozenModules was primarily designed as a way for people to add additional frozen modules to the interpreter. The way it was implemented, though, it held the entire list of frozen modules, so people embedding CPython who wanted to add their own had to duplicate the default list.
The above change removes _PyImport_FrozenModules and defaults PyImport_FrozenModules to NULL, and the default bootstrap modules are now stored in _PyImport_FrozenBootstrap.
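A quick way to check which situation a given interpreter is in, without assuming anything about the struct layout, is to read each symbol as a raw pointer. This is a rough sketch: `frozen_symbol_state` is a hypothetical helper name, and it assumes the symbols are pointer-sized globals reachable through `ctypes.pythonapi`.

```python
import ctypes


def frozen_symbol_state(symbol):
    """Classify a pointer-typed global exported by the running interpreter.

    Hypothetical helper: returns "missing", "NULL", or "set".
    """
    try:
        ptr = ctypes.c_void_p.in_dll(ctypes.pythonapi, symbol)
    except ValueError:
        return "missing"  # this build does not export the symbol
    # c_void_p.value is None when the stored pointer is NULL.
    return "NULL" if ptr.value is None else "set"


for symbol in ("PyImport_FrozenModules", "_PyImport_FrozenBootstrap"):
    print(symbol, frozen_symbol_state(symbol))
```

Based on what I saw above, I would expect a stock 3.11 python3 to report PyImport_FrozenModules as NULL and _PyImport_FrozenBootstrap as set, but I have only checked that by hand with the earlier script.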