Argument "val.preload" documented but not known
wrznr opened this issue
According to https://calamari-ocr.readthedocs.io/en/latest/doc.command-line-usage.html#preloading-data-load-data-on-the-fly, there is an argument val.preload that is used to prevent validation images from being loaded into RAM. However, applying this argument leads to an UnknownArgumentError:
$ calamari-train --train.preload False --val.preload False --trainer.gen SplitTrain --trainer.gen.validation_split_ratio=0.2 --train PageXML --train.images *.tif
2022-03-24 14:16:12.989906: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-03-24 14:16:12.989929: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
INFO 2022-03-24 14:16:14,206 calamari_ocr.ocr.training.pipe: Splitting training and validation files with ratio 0.2: 89/358 for validation/training.
CRITICAL 2022-03-24 14:16:14,231 tfaip.util.logging: Uncaught exception
Traceback (most recent call last):
  File "/home/kmw/Documents/Work/OCR-D/ocrd_all/venv/bin/calamari-train", line 8, in <module>
    sys.exit(run())
  File "/home/kmw/Documents/Work/OCR-D/ocrd_all/venv/lib/python3.8/site-packages/calamari_ocr/scripts/train.py", line 17, in run
    main(parse_args())
  File "/home/kmw/Documents/Work/OCR-D/ocrd_all/venv/lib/python3.8/site-packages/calamari_ocr/scripts/train.py", line 40, in parse_args
    params = parser.parse_args(args).trainer
  File "/home/kmw/Documents/Work/OCR-D/ocrd_all/venv/lib/python3.8/site-packages/paiargparse/main_parser.py", line 93, in parse_args
    raise UnknownArgumentError(f"Unknown Arguments {' '.join(argv)}. Possible alternatives:{''.join(help_str)}")
paiargparse.dataclass_parser.UnknownArgumentError: Unknown Arguments --val.preload False. Possible alternatives:
--val.preload ==> --train.preload, --val.prefetch, --val.limit
You're right, that is confusing; the documentation needs some clarification. I'm guessing from the code that --val.preload only works when you use a separate validation dataset via --val, --val.images, and so on. In your case, the validation dataset is derived directly from the --train dataset with SplitTrain, so the parameter could probably be --trainer.gen.val.preload (would need to test if this has any effect).
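
To illustrate the guess above, here is a toy sketch of how a dataclass-driven parser could end up exposing different dotted flags depending on which generator is selected. It is not Calamari or paiargparse code: the classes `DataParams`, `SeparateValGen`, and `SplitTrainGen` and the helper `dotted_flags` are hypothetical names made up for this example, and the real parameter layout may well differ.

```python
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class DataParams:
    # Hypothetical stand-in for a dataset's parameters.
    preload: bool = True


@dataclass
class SeparateValGen:
    # Hypothetical generator with its own validation dataset.
    train: DataParams = field(default_factory=DataParams)
    val: DataParams = field(default_factory=DataParams)


@dataclass
class SplitTrainGen:
    # Hypothetical generator that splits validation data off the training set,
    # so it has no separate "val" dataset field.
    train: DataParams = field(default_factory=DataParams)
    validation_split_ratio: float = 0.2


def dotted_flags(obj, prefix=""):
    """Collect dotted flag names by walking nested dataclass fields."""
    names = set()
    for f in fields(obj):
        value = getattr(obj, f.name)
        name = f"{prefix}{f.name}"
        if is_dataclass(value):
            # Nested parameter group: recurse with an extended prefix.
            names |= dotted_flags(value, prefix=name + ".")
        else:
            names.add(name)
    return names


if __name__ == "__main__":
    print(sorted(dotted_flags(SeparateValGen())))
    print(sorted(dotted_flags(SplitTrainGen())))
```

In this toy model, "val.preload" only exists when the generator has a separate validation dataset, which would match the behaviour reported here. Whether the real parser works this way, and whether --trainer.gen.val.preload is accepted with SplitTrain, still needs to be tested.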