pdfcpu / pdfcpu

A PDF processor written in Go.

Home Page: http://pdfcpu.io/

Inconsistent configuration

quetz opened this issue

Right now code like this does not work:

conf := &model.Configuration{} // assuming sane defaults for zero object here
// do any kind of writing with this configuration

It turns out that at least Eol must be set for any PDF generation; otherwise an empty string is used (is an empty string ever useful there?).
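
The effect of a zero-value Eol can be reproduced outside pdfcpu. The following is a minimal sketch, not pdfcpu code: writerConfig, defaultWriterConfig and writeLines are hypothetical stand-ins, and "\n" is an assumed default chosen only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// writerConfig is a hypothetical stand-in for a writer configuration.
// It is NOT pdfcpu's model.Configuration.
type writerConfig struct {
	Eol string // end-of-line sequence appended after each written line
}

// defaultWriterConfig returns a configuration with a usable Eol.
// The "\n" default is an assumption made for this sketch.
func defaultWriterConfig() *writerConfig {
	return &writerConfig{Eol: "\n"}
}

// writeLines writes each line followed by the configured Eol.
func writeLines(conf *writerConfig, lines ...string) string {
	var sb strings.Builder
	for _, l := range lines {
		sb.WriteString(l)
		sb.WriteString(conf.Eol)
	}
	return sb.String()
}

func main() {
	// Zero value: Eol is "", so the lines run together.
	fmt.Printf("%q\n", writeLines(&writerConfig{}, "1 0 obj", "endobj"))
	// prints "1 0 objendobj"

	// Defaults: each line is terminated.
	fmt.Printf("%q\n", writeLines(defaultWriterConfig(), "1 0 obj", "endobj"))
	// prints "1 0 obj\nendobj\n"
}
```

With a Go struct, every field that is not set explicitly takes its zero value, which is why a constructor that fills in defaults behaves differently from a bare struct literal.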

Also there are several places related to PDF optimization:

  • conf.Cmd == model.OPTIMIZE
  • conf.Optimize = true
  • conf.WriteObjectStream
  • conf.WriteXRefStream

Is there any reason to have the optimization configuration split like this?

Don't do this.
Just leave the config nil unless you have a meaningful one; follow along with the tests in the codebase.

This does not always work: for example, the ReadValidateAndOptimize() function panics on a nil configuration.

And anyway, what's the idea behind conf.Cmd? Is this a leftover from the command-line tool that I, as a user of the API, should not touch at all?

Another example: ReadValidateAndOptimize calls OptimizeContext only if conf.Cmd == model.OPTIMIZE or conf.Optimize == true. This is strange: if I call this function, I expect it to validate and optimize (as the name says) without extra steps.
Further down the call stack, OptimizeXRefTable calls optimizeResourceDicts only if conf.Cmd == model.OPTIMIZE (setting conf.Optimize = true has no effect there).
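
The two conditions described above can be modelled in isolation to see which combinations reach which step. This is a sketch of the gating as described in this comment only, not the library's code: the types command and config, the constants, and stepsRun are hypothetical names.

```go
package main

import "fmt"

// command is a hypothetical stand-in for the command mode.
type command int

const (
	cmdNone     command = iota // any command other than optimize
	cmdOptimize                // stands in for model.OPTIMIZE
)

// config holds only the two fields relevant to the gating.
type config struct {
	Cmd      command
	Optimize bool
}

// stepsRun reports which steps would run under the conditions
// described in the comment above.
func stepsRun(conf config) (optimizeContext, resourceDicts bool) {
	// Outer gate: Cmd is the optimize command OR Optimize is set.
	optimizeContext = conf.Cmd == cmdOptimize || conf.Optimize
	// Inner gate: only the optimize command passes.
	resourceDicts = optimizeContext && conf.Cmd == cmdOptimize
	return optimizeContext, resourceDicts
}

func main() {
	cases := []struct {
		name string
		conf config
	}{
		{"zero config", config{}},
		{"Optimize=true only", config{Optimize: true}},
		{"Cmd=OPTIMIZE", config{Cmd: cmdOptimize}},
	}
	for _, c := range cases {
		ctx, dicts := stepsRun(c.conf)
		fmt.Printf("%-20s optimizeContext=%v resourceDicts=%v\n", c.name, ctx, dicts)
	}
}
```

Under this model, setting only Optimize reaches the outer step but never the inner one, which is the inconsistency being asked about.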

Could you please explain why the configuration is structured in this way?
Is there a historical or technical reason for this separation?

I am aiming to ensure that I'm using the library correctly and that I understand the design decisions that could affect the way I implement PDF processing capabilities in my application.

Thank you for your time and assistance.

Optimization became an optional feature in one of the latest commits, as requested, except for the optimize command.

These latest configuration changes are yet to be officially released and will be documented properly.

Always make sure you use the latest commit.

If you are using the API, you have a choice of commands for processing your PDF.

All commands will provide a corresponding default configuration internally based on config.yml in your pdfcpu config.
This means that, in general, all you have to do is pass nil for the config parameter.
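
The nil-means-defaults convention can be sketched roughly as follows. This is an illustration of the pattern only, not pdfcpu's implementation: settings, defaultSettings and effective are hypothetical names, and the defaults are hard-coded here rather than loaded from config.yml.

```go
package main

import "fmt"

// settings is a hypothetical stand-in for a library configuration.
type settings struct {
	Eol      string
	Optimize bool
}

// defaultSettings returns hard-coded defaults for this sketch.
// A real library might load them from a configuration file instead.
func defaultSettings() *settings {
	return &settings{Eol: "\n", Optimize: false}
}

// effective returns the settings a command would work with:
// the caller's settings when given, the defaults when nil is passed.
func effective(conf *settings) *settings {
	if conf == nil {
		return defaultSettings()
	}
	return conf
}

func main() {
	// Passing nil selects the defaults.
	fmt.Printf("%q\n", effective(nil).Eol) // prints "\n"

	// Start from the defaults and override a single field.
	custom := defaultSettings()
	custom.Eol = "\r\n"
	fmt.Printf("%q\n", effective(custom).Eol) // prints "\r\n"
}
```

A function that resolves nil up front like this never dereferences a nil pointer later, which is the kind of guard an entry point needs if nil is meant to be an accepted argument.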

Of course you are free to pass your own configuration.
The recommended way of doing so is something like:

conf := model.NewDefaultConfiguration()
conf.Eol = types.EolCRLF

You will never ever have to do something like conf := &model.Configuration{}

I added some inline documentation for ReadValidateAndOptimize.
Please check back for the next release, which will include updated docs for the pdfcpu configuration.