Incorrect order of PDF pages with --modified_pdf
daleonpz opened this issue
Daniel P. commented
It generates the PDF with the annotations, but the pages come out in arbitrary order. With --combined-pdf it works: the annotations are there and the pages are in order. I tried it with two books, one with a two-column layout and one with a single column.
Check the order of the page indexes:
Book Writers (2017, Createspace Independent Publishing Platform) - libgen.lc.pdf"
PDF in-device directory: .
-------PAGE IDX #104
-------PAGE IDX #114
-------PAGE IDX #8
-------PAGE IDX #132
-------PAGE IDX #26
-------PAGE IDX #43
-------PAGE IDX #79
-------PAGE IDX #88
-------PAGE IDX #14
-------PAGE IDX #107
-------PAGE IDX #119
-------PAGE IDX #115
-------PAGE IDX #52
If we record the page indexes in a list as the pages are inserted, the document could be sorted before saving.
Maybe something like this:
pages_order = []
....
# at remarks.py: 180
if modified_pdf:
    mod_pdf.insertPDF(ann_doc, start_at=-1)
    pages_order.append(page_idx)  # remember which original page was appended
# at remarks.py: 203
if modified_pdf:
    mod_pdf = _sort_document(mod_pdf, pages_order)
    mod_pdf.save(f"{output_dir}/{name} _remarks-only.pdf")
    mod_pdf.close()
Or insert each page at its own index and delete the blank pages afterwards, for example:
at remarks.py: 180
if modified_pdf:
    mod_pdf.insertPDF(ann_doc, start_at=page_idx)
and
at remarks.py:203
if modified_pdf:
    # keep only the pages that contain text
    pages_with_text = [i for i in range(mod_pdf.pageCount) if mod_pdf.getPageText(i)]
    mod_pdf.select(pages_with_text)  # select remaining pages from the PDF
    mod_pdf.save(f"{output_dir}/{name} _remarks-only.pdf")
    mod_pdf.close()