trufanov-nok / minidjvu-mod

A multipage DjVu encoder. This is a fork of minidjvu with full-scale shared dictionary (djbz) optimization and a few tricks (multi-threading etc.) to compensate for the resulting performance drop.

Use DjVuLibre's miniexp.cpp for settings parsing

trufanov-nok opened this issue:

As suggested by Leon Bottou:
https://sourceforge.net/p/djvu/discussion/103286/thread/1b0de7aa93/?page=1&limit=25#2ae7

... I just found your 'settings reader' using s-expressions. If I had realized you wanted to do this, I could have saved you a lot of work by pointing out that the files "miniexp.h" and "miniexp.cpp" are actually standalone and can be used without the rest of libdjvu as in the "minilisp" example of https://sourceforge.net/p/djvu/djvulibre-git/ci/master/tree/doc/minilisp/.

The doc is in the h file (https://sourceforge.net/p/djvu/djvulibre-git/ci/master/tree/libdjvu/miniexp.h). To read the settings file, you just have to create an io structure with miniexp_io_init() and miniexp_io_set_input() and then call miniexp_read_r(io). This returns an s-expression data structure that contains the whole settings and that you can navigate with the miniexp.h functions.

You can even define a macrochar to implement the # comments you have in the settings. See https://sourceforge.net/p/djvu/djvulibre-git/ci/master/tree/doc/minilisp/minilisp.cpp#l1111 which defines semicolon comments (instead of # comments).

...

The annotation processing code in djvused is older. I never changed it because I want to preserve the original chunk indentation. So instead of really decoding it, there is a state machine that filters things out and fixes compatibility issues. The real nasty parsing code is in DjVuAnno.cpp. Also it is tightly connected to the Lizardtech XML annotation tools that many people said they wanted. Otherwise I would have removed that long ago. I could not but have no time....
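
To make the quoted suggestion concrete, here is a minimal standalone sketch of the two ideas it describes: reading a settings file into one nested s-expression structure that is then navigated, and handling `#` comments through a "macro character" hook in the reader. This is an illustration only and does not use the miniexp API; with miniexp, the reading step would go through miniexp_io_init(), miniexp_io_set_input() and miniexp_read_r() as described above. The types `Node` and `MacroTable`, the helpers `read_node`, `find_entry` and `read_settings`, and the setting names (`settings`, `dpi`, `threads`) are all hypothetical and are not taken from minidjvu-mod or DjVuLibre.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical miniature s-expression tree; NOT the miniexp data structure.
struct Node {
    std::string atom;         // text of a symbol or number (empty for lists)
    std::vector<Node> items;  // children when this node is a list
    bool is_list = false;
};

// A "macro character" maps a character to a handler that the reader calls
// whenever it meets that character between tokens.
using MacroTable = std::map<char, std::function<void(std::istream&)>>;

// Skips whitespace and runs macro handlers (for example, comment skippers).
void skip_blanks(std::istream& in, const MacroTable& macros) {
    for (;;) {
        int c = in.peek();
        if (c == EOF) return;
        if (std::isspace(c)) { in.get(); continue; }
        auto it = macros.find(static_cast<char>(c));
        if (it == macros.end()) return;
        in.get();        // consume the macro character itself
        it->second(in);  // let the handler consume whatever it needs
    }
}

// Reads one expression into `out`. Returns false at end of input,
// on a closing parenthesis, or on an unterminated list.
bool read_node(std::istream& in, const MacroTable& macros, Node& out) {
    skip_blanks(in, macros);
    int c = in.peek();
    if (c == EOF || c == ')') return false;
    out = Node{};
    if (c == '(') {
        in.get();
        out.is_list = true;
        Node child;
        while (read_node(in, macros, child)) out.items.push_back(child);
        if (in.peek() != ')') return false;  // list was never closed
        in.get();
        return true;
    }
    // Atom: read up to the next whitespace or parenthesis.
    while ((c = in.peek()) != EOF && !std::isspace(c) && c != '(' && c != ')')
        out.atom.push_back(static_cast<char>(in.get()));
    return true;
}

// Finds the first child list whose head atom equals `key`, e.g. (dpi 300).
const Node* find_entry(const Node& list, const std::string& key) {
    for (const Node& item : list.items)
        if (item.is_list && !item.items.empty() && item.items[0].atom == key)
            return &item;
    return nullptr;
}

// Reads a whole settings text, treating '#' as a comment-to-end-of-line macro.
Node read_settings(const std::string& text) {
    MacroTable macros;
    macros['#'] = [](std::istream& in) {
        std::string rest;
        std::getline(in, rest);  // discard the remainder of the line
    };
    std::istringstream in(text);
    Node root;
    read_node(in, macros, root);
    return root;
}
```

Calling `read_settings` on a text such as `(settings (dpi 300) (threads 4))` with `#` comments mixed in yields a single tree, and `find_entry(root, "dpi")` then returns the `(dpi 300)` sublist. Supporting semicolon comments instead would only require registering the same handler under `';'`, which is the appeal of a macro-character table over hard-coding comment syntax in the reader.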