Inconsistent configuration

Question

quetz opened this issue 2 months ago

Right now code like this does not work:

conf := &model.Configuration{} // assuming sane defaults for zero object here
// do any kind of writing with this configuration

It turns out that at least Eol must be set for any PDF generation; otherwise an empty string is used. (Is an empty string ever useful there?)
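To illustrate the zero-value problem in isolation, here is a minimal sketch. The Configuration type, NewDefaultConfiguration and writeLine below are hypothetical stand-ins defined only for this example; they are not pdfcpu's actual code, and the default values are assumptions.

```go
package main

import "fmt"

// Configuration is a hypothetical stand-in for a library config struct.
// It is NOT pdfcpu's model.Configuration.
type Configuration struct {
	Eol      string // end-of-line sequence appended when writing
	Optimize bool
}

// NewDefaultConfiguration returns a config with usable defaults filled in.
// The chosen defaults are assumptions made for this sketch.
func NewDefaultConfiguration() *Configuration {
	return &Configuration{Eol: "\n", Optimize: true}
}

// writeLine appends the configured end-of-line sequence to s.
func writeLine(c *Configuration, s string) string {
	return s + c.Eol
}

func main() {
	zero := &Configuration{} // Go zero value: Eol is "", Optimize is false
	def := NewDefaultConfiguration()

	// With the zero value, no line terminator is appended at all.
	fmt.Printf("%q\n", writeLine(zero, "%PDF-1.7"))
	fmt.Printf("%q\n", writeLine(def, "%PDF-1.7"))
}
```

The zero value compiles and runs, but silently writes lines without a terminator, which is why a constructor with defaults is the safer entry point.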

Also there are several places related to PDF optimization:

conf.Cmd == model.OPTIMIZE
conf.Optimize = true
conf.WriteObjectStream
conf.WriteXRefStream

Is there any reason to have the optimization configuration split like this?

Horst Rutter · Answer 1 · Thu Apr 04 2024 05:42:21 GMT+0800 (China Standard Time)

Don't do this.
Just leave the config nil unless you have a meaningful one - just follow along the tests in the codebase.

Quet Zal · Answer 2 · Thu Apr 04 2024 14:25:53 GMT+0800 (China Standard Time)

This does not always work: for example, the ReadValidateAndOptimize() function panics on a nil configuration.

And anyway, what's the idea behind conf.Cmd? Is this a leftover from the command-line tool that I, as a user of the API, should not touch at all?

Quet Zal · Answer 3 · Thu Apr 04 2024 14:37:05 GMT+0800 (China Standard Time)

Another example: ReadValidateAndOptimize calls OptimizeContext only if conf.Cmd == model.OPTIMIZE or conf.Optimize == true. That is strange: if I call this function, I expect it to validate and optimize (as the name says) without extra steps.
Further down the call stack, OptimizeXRefTable calls optimizeResourceDicts only if conf.Cmd == model.OPTIMIZE (setting conf.Optimize = true has no effect there).
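The gating described above can be modelled as a small truth table. This is a sketch of the behaviour as reported in this comment, not a copy of the library source: CommandMode, Configuration and stepsRun are hypothetical names defined locally for the example.

```go
package main

import "fmt"

// CommandMode and its constants are local stand-ins for this sketch.
type CommandMode int

const (
	VALIDATE CommandMode = iota
	OPTIMIZE
)

// Configuration holds only the two fields relevant to the gating.
type Configuration struct {
	Cmd      CommandMode
	Optimize bool
}

// stepsRun reports which optimization steps would run for a config,
// following the behaviour described in the comment above:
//   - the context is optimized if Cmd == OPTIMIZE or Optimize is set
//   - resource dicts are optimized only if Cmd == OPTIMIZE
func stepsRun(c *Configuration) (optimizeContext, resourceDicts bool) {
	optimizeContext = c.Cmd == OPTIMIZE || c.Optimize
	resourceDicts = optimizeContext && c.Cmd == OPTIMIZE
	return optimizeContext, resourceDicts
}

func main() {
	cases := []Configuration{
		{Cmd: VALIDATE, Optimize: false},
		{Cmd: VALIDATE, Optimize: true},
		{Cmd: OPTIMIZE, Optimize: false},
	}
	for _, c := range cases {
		ctx, res := stepsRun(&c)
		fmt.Printf("Cmd=%d Optimize=%t -> context=%t resourceDicts=%t\n",
			c.Cmd, c.Optimize, ctx, res)
	}
}
```

The middle case is the surprising one: setting only Optimize triggers the outer step but skips the resource dictionary step.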

Could you please explain why the configuration is structured this way?
Is there a historical or technical reason for this separation?

I am aiming to ensure that I'm using the library correctly and that I understand the design decisions that could affect the way I implement PDF processing capabilities in my application.

Thank you for your time and assistance.

Horst Rutter · Answer 4 · Fri Apr 05 2024 20:41:51 GMT+0800 (China Standard Time)

Optimization has turned into an optional feature in one of the latest commits as requested, with the exception of the optimize command.

These latest configuration changes are yet to be officially released and will be documented properly.

Always make sure you use the latest commit.

Horst Rutter · Answer 5 · Tue Apr 09 2024 07:48:35 GMT+0800 (China Standard Time)

If you are using the API, you have a choice of commands for processing your PDF.

All commands will provide a corresponding default configuration internally based on config.yml in your pdfcpu config.
This means in general all you have to do is pass nil for the config parameter.

Of course you are free to pass your own configuration.
The recommended way of doing so is something like:

conf := model.NewDefaultConfiguration()
conf.Eol = types.EolCRLF

You will never ever have to do something like conf := &model.Configuration{}
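For readers wondering how a nil config can be acceptable, the usual pattern is a fallback at the API boundary. The sketch below shows that pattern with hypothetical local types; resolve is an illustrative helper, not a pdfcpu function, and the default Eol value is an assumption.

```go
package main

import "fmt"

// Configuration is a hypothetical stand-in, not pdfcpu's model.Configuration.
type Configuration struct {
	Eol string
}

// NewDefaultConfiguration returns a config with assumed defaults.
func NewDefaultConfiguration() *Configuration {
	return &Configuration{Eol: "\n"}
}

// resolve returns the caller's config, or the defaults when conf is nil.
// An API entry point would call this before touching any field.
func resolve(conf *Configuration) *Configuration {
	if conf == nil {
		return NewDefaultConfiguration()
	}
	return conf
}

func main() {
	// Passing nil picks up the defaults.
	fmt.Printf("%q\n", resolve(nil).Eol)

	// Start from the defaults and override only what is needed.
	custom := NewDefaultConfiguration()
	custom.Eol = "\r\n"
	fmt.Printf("%q\n", resolve(custom).Eol)
}
```

Starting from the defaults and overriding single fields keeps every other setting valid, which a bare struct literal does not guarantee.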

Horst Rutter · Answer 6 · Tue Apr 09 2024 07:55:04 GMT+0800 (China Standard Time)

I added some inline documentation for ReadValidateAndOptimize.
Please check back for the next release, which will include updated docs for the pdfcpu configuration.