Broken image extraction if no filters and CMYK colorspace
stefan6419846 opened this issue
Image extraction is broken when isinstance(lfilters, NullObject) and mode == "CMYK", in the code at lines 818 to 826 in 0106904.
Environment
Which environment were you using when you encountered the problem?
$ python -m platform
Linux-5.14.21-150400.24.100-default-x86_64-with-glibc2.31
$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==4.1.0, crypt_provider=('local_crypt_fallback', '0.0.0'), PIL=10.2.0
Code + PDF
This is a minimal, complete example that shows the issue:
from pypdf import PdfReader

reader = PdfReader('file.pdf')
for page in reader.pages:
    print(page)
    for key in page.images.keys():
        print(key)
        print(page.images[key])
An anonymized version of the file is out3.pdf.
Traceback
This is the complete traceback I see:
Traceback (most recent call last):
  File "/home/stefan/tmp/venv/lib64/python3.9/site-packages/PIL/PngImagePlugin.py", line 1279, in _save
    rawmode, mode = _OUTMODES[mode]
KeyError: 'CMYK'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/stefan/tmp/venv/lib/python3.9/site-packages/pypdf/filters.py", line 876, in _xobj_to_image
    img.save(img_byte_arr, format=image_format)
  File "/home/stefan/tmp/venv/lib64/python3.9/site-packages/PIL/Image.py", line 2439, in save
    save_handler(self, fp, filename)
  File "/home/stefan/tmp/venv/lib64/python3.9/site-packages/PIL/PngImagePlugin.py", line 1282, in _save
    raise OSError(msg) from e
OSError: cannot write mode CMYK as PNG

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/stefan/tmp/venv/lib64/python3.9/site-packages/PIL/PngImagePlugin.py", line 1279, in _save
    rawmode, mode = _OUTMODES[mode]
KeyError: 'CMYK'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/stefan/tmp/run.py", line 9, in <module>
    print(page.images[key])
  File "/home/stefan/tmp/venv/lib/python3.9/site-packages/pypdf/_page.py", line 2420, in __getitem__
    return self.get_function(index)
  File "/home/stefan/tmp/venv/lib/python3.9/site-packages/pypdf/_page.py", line 501, in _get_image
    imgd = _xobj_to_image(cast(DictionaryObject, xobjs[id]))
  File "/home/stefan/tmp/venv/lib/python3.9/site-packages/pypdf/filters.py", line 880, in _xobj_to_image
    img.save(img_byte_arr, format=image_format)
  File "/home/stefan/tmp/venv/lib64/python3.9/site-packages/PIL/Image.py", line 2439, in save
    save_handler(self, fp, filename)
  File "/home/stefan/tmp/venv/lib64/python3.9/site-packages/PIL/PngImagePlugin.py", line 1282, in _save
    raise OSError(msg) from e
OSError: cannot write mode CMYK as PNG
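The failure itself can be reproduced with Pillow alone, without pypdf or the PDF. The sketch below builds a small CMYK image, shows that saving it as PNG raises the OSError from the traceback, and falls back to TIFF, which can store CMYK data. The helper name save_image_bytes and the choice of TIFF as the fallback are assumptions made for illustration; this is not the code in pypdf/filters.py and not necessarily how the issue should be fixed there.

```python
from io import BytesIO

from PIL import Image


def save_image_bytes(img, image_format="PNG"):
    """Hypothetical helper: serialize img, falling back to TIFF on failure.

    Pillow's PNG writer has no CMYK output mode, so img.save() raises
    OSError for CMYK images. TIFF is used here as an assumed fallback
    because it can store CMYK without a colorspace conversion.
    """
    buf = BytesIO()
    try:
        img.save(buf, format=image_format)
    except OSError:
        # Discard any partial output and retry with a CMYK-capable format.
        buf = BytesIO()
        image_format = "TIFF"
        img.save(buf, format=image_format)
    return image_format, buf.getvalue()


# A tiny synthetic CMYK image stands in for the image from the PDF.
cmyk = Image.new("CMYK", (4, 4), (0, 255, 255, 0))

# Saving directly as PNG fails in the same way as in the traceback.
try:
    cmyk.save(BytesIO(), format="PNG")
    png_failed = False
except OSError:
    png_failed = True

fmt, data = save_image_bytes(cmyk)
roundtrip = Image.open(BytesIO(data))
```

An alternative would be to call img.convert("RGB") before saving as PNG, at the cost of changing the colorspace of the extracted image.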