exiftool / exiftool

ExifTool meta information reader/writer

Home Page: https://exiftool.org/

Add PDF 2.0 support

skylord123 opened this issue

Since PDF 2.0 is now a thing I have been getting a bunch of errors in my processes using exiftool:

Detail: Error: Can't yet write PDF version 2.0

PDF 2.0 test files can be found here:
https://github.com/pdf-association/pdf20examples

Thanks for the great software!
~Skylar

Thanks for the suggestion. I'll look into this as soon as I get a chance.

Damnit! The ISO has absconded with the PDF 2.0 specification. They are demanding a ransom of 198 Swiss francs. :( If anyone has a copy of the specification they want to share, my email is philharvey66 at gmail.com

OK. In the absence of access to the specification, we must resort to trial and error. All I've done is to allow the version checking to be bypassed with the -m option. I tested this with all of the sample PDF 2.0 files from the URL you provided. They all seemed to work fine with the exception of the one with text before the PDF file header, which ExifTool refused to edit (I think this is fine -- it says right in the text that this file may be rejected by some PDF processors).

I've just released ExifTool 11.79 with this update. Let me know how it works for you. I may remove the warnings and the need to use the -m option if we become confident enough that this is working OK.
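For anyone scripting this, here is a minimal Python sketch of how a call with the -m option might be assembled. The helper name `build_exiftool_cmd`, the file name, and the title value are hypothetical and only for illustration; the sketch assumes an `exiftool` executable (11.79 or later) is on the PATH and uses only the -m option and the usual `-TAG=VALUE` assignment syntax.

```python
import subprocess


def build_exiftool_cmd(pdf_path, title, ignore_minor=True):
    """Build an argument list that writes the Title tag of a PDF.

    ignore_minor adds -m, which (as of ExifTool 11.79) is what allows
    the PDF 2.0 version check to be bypassed when writing.
    """
    cmd = ["exiftool"]
    if ignore_minor:
        cmd.append("-m")
    cmd.append(f"-Title={title}")
    cmd.append(pdf_path)
    return cmd


def run_exiftool(cmd):
    """Run the command and return (exit code, stdout, stderr)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Hypothetical file name; substitute one of the PDF 2.0 sample files.
    print(" ".join(build_exiftool_cmd("example-2.0.pdf", "Test")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces. Dropping `ignore_minor` should reproduce the "Can't yet write PDF version 2.0" error on 11.79.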

I'm closing this issue in the absence of any feedback about 11.79.

@boardhead I am running this now and it seems to be working fine.

I may be able to get my company to buy the spec. It just depends on whether we can own the spec and share it with you, or whether we would have to buy you your own copy.

According to the strict ISO licensing, you must purchase a multi-user license to share a publication with multiple users. Did I mention that I hate the ISO?: https://exiftool.org/commentary.html#ISO

@boardhead Yeah I had no clue that ISO charged for specs. It is pretty insane that it costs around $200 just to get access to the PDF 2.0 spec. Wouldn't really call that "open" myself. It definitely conflicts with their whole mission.

I just spent a couple hours trying to find if anyone posted it online but to no avail. $200 is a lot of money even for the company I am working for. I will have to see if they will pay for it or not.

On the bright side the Chairman of the PDF Association Matt Kuznicki left this comment on an issue (This is quoted from here: pdf-association/pdf20examples#7 (comment)):

Most everything in PDF 2.0 is shared with earlier versions of PDF, so your best resources are mostly going to be written for earlier versions of PDF. Many people learn by reading the spec, examining PDFs that are generated from other libraries or programs, and experimenting.
Two books I can recommend are Leonard Rosenthal's "Developing with PDF: Dive Into the Portable Document Format" and John Whitington's "PDF Explained: The ISO Standard for Document Exchange". https://brendanzagaeski.appspot.com/0004.html has some good introduction and starting points as well.
Most everything you learn about making PDF files from versions before PDF 2.0 will apply to PDF 2.0 as well, so don't worry about looking for PDF 2.0 - specific tutorials.

It seems like they literally want us to learn this way (trial and error) which is completely against the purpose of having a spec. Gatekeeping the spec will just lead to people building bad PDF code.

Anyways, I'm running various PDF 2.0 tests today and will let you know if I run into any issues. So far it is looking good and everything is working flawlessly. If we run into issues and the spec would be good for you to have I can bring it up and try and get you a copy.

Thanks for this. Basically, the only thing that could be a problem is if the PDF 2.0 spec defines some new feature that wouldn't be parsed properly by a 1.x reader. I was really expecting a new encryption algorithm or something similar that would need to be supported by ExifTool. However, it sounds like there were no substantial changes to the structure that would be incompatible with a PDF 1.x parser, which is good. Your testing (i.e. trial and error) is very useful, so please let me know how this goes. I will consider removing the -m requirement in ExifTool 11.82 (due to be released late January) unless you find a problem.
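To sort test files by version before running them through a 1.x parser, something like the following Python sketch could help. It is not how ExifTool checks the version; the function name `pdf_header_version` and the 1024-byte search window are assumptions made for illustration. It looks for the `%PDF-x.y` header and reports its offset, so a file with text before the header (like the sample mentioned earlier in this thread) shows a non-zero offset.

```python
import re

# Matches the PDF header marker, e.g. b"%PDF-1.7" or b"%PDF-2.0".
HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def pdf_header_version(data, search_window=1024):
    """Return (major, minor, offset) for the first header marker, or None.

    search_window is an arbitrary limit on how far into the data to look.
    """
    match = HEADER_RE.search(data[:search_window])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.start()


def read_pdf_header_version(path):
    """Convenience wrapper that reads only the start of the file."""
    with open(path, "rb") as handle:
        return pdf_header_version(handle.read(1024))
```

A result with major version 2 and offset 0 would mark a well-formed PDF 2.0 header; a non-zero offset flags leading junk that some processors may reject.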

I've released 11.82 ahead of schedule. Please let me know if you find any problems.

@boardhead I just opened an issue on our project to update. It should get done later tonight. I will definitely let you know if I run into anything. :)

No issues reported, so it seems this is working OK.