docker-library / python

Docker Official Image packaging for Python

Home Page:https://www.python.org/

Debugging a `-slim` image?

hterik opened this issue

We are hitting a rare deadlock in production that can't be reproduced using debug images.
The only way to debug it is to attach to the production container while the problem is happening.
The problem is that our production images are based on the -slim images.

Installing gdb and python3-dbg with apt is not enough, because the Python running in the image is built from source and doesn't match the packages available from apt.

I've tried starting the corresponding non-slim image, but its binaries don't seem to match closely enough to make gdb happy:

Start Python inside the -slim image

➡️ My end goal is to run gdb py-bt on this process:
docker run -ti python:3.11-slim-bookworm python3 -c "import time; time.sleep(1000)"

Inside this image there is no chance to debug at all due to missing gdb and debug symbols. This is expected.

Build a debugger image from the corresponding non-slim image.

FROM python:3.11-bookworm

# Install gdb; the BuildKit cache mounts keep apt's cache and lists out of the image layer.
RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt \
    apt-get update && \
    apt-get install -y gdb

docker build -t mydebugpython .

Use the newly built debug image to attach to the first running container

Find the container ID and use it to join the same PID namespace.

docker run \
    --cap-add SYS_PTRACE \
    --pid container:15aab7ea5f57 \
    --privileged  \
    -ti \
    --entrypoint bash docker.io/library/mydebugpython
root@bb90d04f8722:/# gdb -q python --pid 1 -ex "py-bt"

Reading symbols from python...
Attaching to program: /usr/local/bin/python, process 1

warning: Build ID mismatch between current exec-file /usr/local/bin/python
and automatically determined exec-file /usr/local/bin/python3.11
exec-file-mismatch handling is currently "ask"

Load new symbol table from "/usr/local/bin/python3.11"? (y or n) y
Reading symbols from /usr/local/bin/python3.11...
Reading symbols from target:/usr/local/bin/../lib/libpython3.11.so.1.0...
(No debugging symbols found in target:/usr/local/bin/../lib/libpython3.11.so.1.0)
Reading symbols from target:/lib/x86_64-linux-gnu/libc.so.6...
Reading symbols from /usr/lib/debug/.build-id/82/ce4e6e4ef08fa58a3535f7437bd3e592db5ac0.debug...
Reading symbols from target:/lib/x86_64-linux-gnu/libm.so.6...
Reading symbols from /usr/lib/debug/.build-id/ea/87e1b3daf095cd53f1f99ab34a88827eccce80.debug...
Reading symbols from target:/lib64/ld-linux-x86-64.so.2...
Reading symbols from /usr/lib/debug/.build-id/38/e7d4a67acf053c794b3b8094e6900b5163f37d.debug...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
__GI___clock_nanosleep (clock_id=1, flags=1, req=0x7ffd9d9ec458, rem=0x0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:71
71	../sysdeps/unix/sysv/linux/clock_nanosleep.c: No such file or directory.
Traceback (most recent call first):
  (unable to read python frame information)    # <--------------   :(
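
The "Build ID mismatch" warning in the output above can also be checked without gdb, since Python itself is available in both images. The following is a minimal sketch that reads the GNU build ID note straight from an ELF file. It assumes a 64-bit ELF binary that carries such a note; `read_build_id` and `build_id_of` are hypothetical helper names made up for this example, not part of any library.

```python
import struct

PT_NOTE = 4           # program header type for note segments
NT_GNU_BUILD_ID = 3   # note type the GNU toolchain uses for build IDs


def read_build_id(data):
    """Return the GNU build ID of an ELF image as a hex string, or None."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2:
        raise ValueError("only 64-bit ELF is handled in this sketch")
    end = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    (e_phoff,) = struct.unpack_from(end + "Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(end + "I", data, base)
        if p_type != PT_NOTE:
            continue
        (p_offset,) = struct.unpack_from(end + "Q", data, base + 8)
        (p_filesz,) = struct.unpack_from(end + "Q", data, base + 32)
        pos, stop = p_offset, p_offset + p_filesz
        # Each note: namesz, descsz, type, then name and desc padded to 4 bytes.
        while pos + 12 <= stop:
            namesz, descsz, ntype = struct.unpack_from(end + "III", data, pos)
            name_start = pos + 12
            desc_start = name_start + (namesz + 3) // 4 * 4
            name = data[name_start:name_start + namesz]
            if ntype == NT_GNU_BUILD_ID and name == b"GNU\0":
                return data[desc_start:desc_start + descsz].hex()
            pos = desc_start + (descsz + 3) // 4 * 4
    return None


def build_id_of(path):
    """Read a file from disk and return its build ID (or None)."""
    with open(path, "rb") as f:
        return read_build_id(f.read())
```

Calling `build_id_of("/usr/local/bin/python3.11")` in each of the two containers and comparing the results would show whether the executables really come from different builds. The same check can be applied to `libpython3.11.so.1.0`.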

So it seems that the difference between slim and non-slim isn't only the presence or absence of debug symbols; the binary also appears to be built differently?
Is it possible to solve this somehow? My knowledge of how Python is built and how to operate gdb is only moderate.
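
One way to check whether the two images were configured differently is to dump the build settings each interpreter recorded about itself and diff the results. This is a minimal sketch using only the standard library. `build_fingerprint` is a hypothetical helper name, and the chosen config variables are just an assumption about which ones are interesting here. It only reports configure and compiler settings; it says nothing about whether symbols were stripped after the build.

```python
import json
import platform
import sys
import sysconfig


def build_fingerprint():
    """Collect build-related facts about the running interpreter."""
    info = {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "version": sys.version,
        "compiler": platform.python_compiler(),
    }
    # Values recorded at build time; a variable that is unknown comes back as None.
    for key in ("CONFIG_ARGS", "CFLAGS", "LDFLAGS", "Py_ENABLE_SHARED", "Py_DEBUG"):
        info[key] = sysconfig.get_config_var(key)
    return info


if __name__ == "__main__":
    print(json.dumps(build_fingerprint(), indent=2, sort_keys=True))
```

Saved as, say, `fingerprint.py` (a made-up filename), it could be fed to each image over stdin, for example `docker run --rm -i python:3.11-slim-bookworm python3 - < fingerprint.py`, and again with `python:3.11-bookworm`, and the two outputs compared.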