OCR-D / core

Collection of OCR-related python tools and wrappers from @OCR-D

Home Page: https://ocr-d.de/core/

`ocrd network processing-worker` can only invoke cli

joschrew opened this issue

To start workers with docker-compose, I use commands like this:
command: ocrd network processing-worker --database $MONGODB_URL --queue $RABBITMQ_URL --create-queue --queue-connect-attempts 5 ocrd-anybaseocr-crop
This creates a ProcessingWorker instance with processor_class set to None, which ultimately leads to run_cli being called from invoke_processor() in process_helpers.py.
Maybe it is possible to provide the class (string with class-name is not sufficient) somehow with the ocrd network processing-worker command.
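
To illustrate the two code paths, here is a minimal sketch. It is not the actual code in process_helpers.py: `invoke_sketch` and `EchoProcessor` are hypothetical names, and the real `invoke_processor()` takes different arguments. It only shows the idea that a known class can be called in-process, while a bare executable name forces a subprocess call.

```python
import subprocess
import sys


class EchoProcessor:
    """Hypothetical stand-in for a Python processor class."""

    def process(self, args):
        return " ".join(args)


def invoke_sketch(executable, args, processor_class=None):
    """Return (mode, output); mode shows which path was taken."""
    if processor_class is not None:
        # The class is known: call it in-process. A real worker could keep
        # the instance (and any loaded models) cached between jobs.
        return "python", processor_class().process(args)
    # Only the executable name is known: fall back to a subprocess call.
    done = subprocess.run([executable, *args], capture_output=True,
                          text=True, check=True)
    return "cli", done.stdout.strip()


if __name__ == "__main__":
    # In-process path, then CLI path (the Python interpreter serves as the "CLI").
    print(invoke_sketch("unused", ["a", "b"], EchoProcessor))
    print(invoke_sketch(sys.executable, ["-c", "print('a b')"]))
```

With processor_class left at None, every job pays the cost of a fresh process, so nothing can be cached between jobs.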

> Maybe it is possible to provide the class (string with class-name is not sufficient) somehow with the ocrd network processing-worker command.

This is possible only for processors that are installed in the same Python environment as the processing worker; it is not possible in general, IIUC.
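
A rough sketch of why the environment matters, assuming a hypothetical `module:ClassName` spec string (nothing in ocrd network accepts this format as far as this thread shows; `resolve_class` is an illustrative name). The import only succeeds when the processor package is importable by the worker's own interpreter.

```python
import importlib


def resolve_class(spec):
    """Resolve 'package.module:ClassName' to a class, or None if unavailable."""
    module_name, _, class_name = spec.partition(":")
    try:
        module = importlib.import_module(module_name)
    except (ImportError, ValueError):
        # Not installed in this environment (e.g. it lives in another
        # container or venv), or the spec has no module part.
        return None
    # The module exists, but the class may still be missing.
    return getattr(module, class_name, None)
```

Note that this needs the full module path, not just the CLI name, so it does not remove the need for extra information beyond ocrd-anybaseocr-crop.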

Recap of the discussion yesterday:

  • We do need a generic solution to add the worker CLI to the processor CLI because that is the only way we can reasonably access the processor class which allows for caching etc.
  • The processor API should be extended in such a way that the worker CLI is programmatically added without requiring changes to the processors themselves.
  • If that is not possible without hacking click's internals, then we need to adapt all the processors to add the additional CLI themselves.

There is no way for ocrd network processing-worker to reliably deduce the Python class of a processor just from the CLI name. For Pythonic processors, ocrd-* --agent-type {worker,server} (or ocrd-* [server|worker]) is the way to go; for bashlib processors, ocrd network processing-worker is the only way.
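
As a sketch of what a shared `--agent-type` option could look like: the example below uses argparse from the standard library as a stand-in for click, purely to stay self-contained. `add_agent_type` and `describe_mode` are hypothetical helpers, not part of the OCR-D API. The point is that the option is attached in one shared place, so individual processors would not need to change.

```python
import argparse


def add_agent_type(parser):
    """Attach the shared option; processors themselves stay unchanged."""
    parser.add_argument("--agent-type", choices=["worker", "server"],
                        default=None,
                        help="run as a network agent instead of processing once")
    return parser


def describe_mode(argv, processor_name):
    """Parse argv and report which mode the processor CLI would start in."""
    parser = add_agent_type(argparse.ArgumentParser(prog=processor_name))
    opts = parser.parse_args(argv)
    if opts.agent_type is None:
        # No agent type given: behave like the ordinary processor CLI.
        return f"{processor_name}: one-off processing"
    # Agent type given: the processor's own CLI knows its class, so it can
    # hand that class to the worker or server.
    return f"{processor_name}: starting as {opts.agent_type}"
```

For example, `describe_mode(["--agent-type", "worker"], "ocrd-dummy")` takes the agent branch, while an empty argument list falls through to normal processing.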