unidoc / unipdf-cli

CLI for PDF processing using unipdf

Unexpected behaviour when extracting images from a PDF file

ahall opened this issue

Description

I just tried to extract images from a PDF file. The command run was:

$ ./unicli extract -r images ~/Downloads/heimadaemi_VI_E2_2015_2016_nr_09.pdf
Images successfully extracted to /Users/ahall/Downloads/heimadaemi_VI_E2_2015_2016_nr_09.zip

When unzipping the images I got:

$ unzip /Users/ahall/Downloads/heimadaemi_VI_E2_2015_2016_nr_09.zip
Archive:  /Users/ahall/Downloads/heimadaemi_VI_E2_2015_2016_nr_09.zip
  inflating: p1_0.jpg
  inflating: p1_1.jpg
  inflating: p1_2.jpg
  inflating: p1_3.jpg
  inflating: p1_4.jpg
  inflating: p1_5.jpg
  inflating: p1_6.jpg
  inflating: p1_7.jpg
  inflating: p1_8.jpg
  inflating: p1_9.jpg
  inflating: p1_10.jpg
$ ls -la p1*
-rw-r--r--@ 1 ahall  staff  599 Dec 31  1979 p1_0.jpg
-rw-r--r--  1 ahall  staff  599 Dec 31  1979 p1_1.jpg
-rw-r--r--  1 ahall  staff  599 Dec 31  1979 p1_10.jpg
-rw-r--r--  1 ahall  staff  599 Dec 31  1979 p1_2.jpg
-rw-r--r--  1 ahall  staff  599 Dec 31  1979 p1_3.jpg
-rw-r--r--  1 ahall  staff  599 Dec 31  1979 p1_4.jpg
-rw-r--r--  1 ahall  staff  599 Dec 31  1979 p1_5.jpg
-rw-r--r--  1 ahall  staff  599 Dec 31  1979 p1_6.jpg
-rw-r--r--  1 ahall  staff  599 Dec 31  1979 p1_7.jpg
-rw-r--r--  1 ahall  staff  599 Dec 31  1979 p1_8.jpg
-rw-r--r--  1 ahall  staff  599 Dec 31  1979 p1_9.jpg

Expected Behavior

The file contains no images, so no ZIP file should be produced at all; the command should just state that the file has no images. Any extracted files should also have a more recent timestamp than 1979.

Actual Behavior

A ZIP file is returned containing 11 .jpg files of 599 bytes each that appear to be empty, even though the PDF has no images. The files also have a 1979 timestamp.

File causing issue

heimadaemi_VI_E2_2015_2016_nr_09.pdf