Can we properly solve the reason for `tf1_disable_interactive_logs` existence?
kba opened this issue
In ocrd_network/utils we have:
    from os import environ

    def tf_disable_interactive_logs():
        try:
            # This env variable must be set before importing from Keras
            environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
            from tensorflow.keras.utils import disable_interactive_logging
            # Enabled interactive logging throws an exception
            # due to a call of sys.stdout.flush()
            disable_interactive_logging()
        except Exception:
            # Nothing should be handled here if TF is not available
            pass
Why did we do that, and how can we get rid of it? I ask because importing tensorflow is expensive, and the cost is felt most strongly with the bashlib processors/tests: they create new Python sessions (each paying the full penalty of importing tensorflow) many times during a single run.
There are other bottlenecks, such as parsing YAML and globally importing modules that are only needed in a single if-else clause, but this is the lowest-hanging fruit.
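To illustrate the "module only needed in a single if-else clause" point, here is a minimal sketch of deferring an import into the branch that uses it. The function name `load_config` and the use of `json` are assumptions made for the example; this is not code from core.

```python
def load_config(path, text):
    """Parse `text` according to the extension of `path` (hypothetical helper)."""
    if path.endswith(".json"):
        # Deferred import: the cost is only paid when this branch is taken,
        # not at module import time for every caller.
        import json
        return json.loads(text)
    # The plain-text fallback needs no extra modules.
    return text.strip()
```

To see what an import actually costs, `python -X importtime -c "import tensorflow"` (Python 3.7+) prints per-module import times to stderr.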
Keras thinks the shell is interactive, but it is not in the case of the Processing Worker. Check here as well. Potentially this should be resolved at the processor level, so we do not have to do that manually in ocrd network.
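The failure mode can be reproduced without TensorFlow. The sketch below is a simplified illustration, not Keras's actual code: `DetachedStream`, `print_msg` and `try_print` are hypothetical names, and the stream simply simulates a stdout whose `flush()` fails with an I/O error, as it does in the worker.

```python
import errno
import logging


class DetachedStream:
    """Hypothetical stand-in for a stdout that can no longer be flushed."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)
        return len(text)

    def flush(self):
        # Simulates the error a detached stdout raises on flush.
        raise OSError(errno.EIO, "Input/output error")


def print_msg(message, stream, interactive=True):
    """Simplified print helper with an interactive and a logging mode."""
    if interactive:
        stream.write(message + "\n")
        stream.flush()  # fails on a detached stream
    else:
        # Non-interactive mode goes through the logging module instead.
        logging.getLogger("sketch").info(message)


def try_print(message, stream, interactive):
    """Return "ok", or a short description of the I/O failure."""
    try:
        print_msg(message, stream, interactive)
        return "ok"
    except OSError as exc:
        return f"failed: errno {exc.errno}"
```

With `interactive=True` the call fails on the detached stream, while `interactive=False` never touches the stream. That is the effect disabling interactive logging has in the worker.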
2023-02-17 15:11:54,788 - ocrd.network.processing_worker - DEBUG - Starting to process the received message: <ocrd.network.rabbitmq_utils.ocrd_messages.OcrdProcessingMessage object at 0x7f6db9a54050>
2023-02-17 15:11:54,789 - ocrd.network.processing_worker - DEBUG - Invoking the pythonic processor: ocrd-calamari-recognize
2023-02-17 15:11:54,789 - ocrd.network.processing_worker - DEBUG - Invoking the processor_class: <class 'ocrd_calamari.recognize.CalamariRecognize'>
2023-02-17 15:11:55,233 - ocrd.network.processing_worker - ERROR - [Errno 5] Input/output error
Traceback (most recent call last):
  File "/home/mm/Desktop/core/ocrd/ocrd/network/processing_worker.py", line 234, in run_processor_from_worker
    instance_caching=False
  File "/home/mm/Desktop/core/ocrd/ocrd/processor/helpers.py", line 95, in run_processor
    instance_caching=instance_caching
  File "/home/mm/Desktop/core/ocrd/ocrd/processor/helpers.py", line 332, in get_processor
    parameter=parameter
  File "/home/mm/venv37-ocrd-new/lib/python3.7/site-packages/ocrd_calamari/recognize.py", line 44, in __init__
    self.setup()
  File "/home/mm/venv37-ocrd-new/lib/python3.7/site-packages/ocrd_calamari/recognize.py", line 52, in setup
    self.predictor = MultiPredictor(checkpoints=checkpoints)
  File "/home/mm/venv37-ocrd-new/lib/python3.7/site-packages/calamari_ocr/ocr/predictor.py", line 228, in __init__
    data_preproc=data_preproc, processes=processes) for cp in checkpoints]
  File "/home/mm/venv37-ocrd-new/lib/python3.7/site-packages/calamari_ocr/ocr/predictor.py", line 228, in <listcomp>
    data_preproc=data_preproc, processes=processes) for cp in checkpoints]
  File "/home/mm/venv37-ocrd-new/lib/python3.7/site-packages/calamari_ocr/ocr/predictor.py", line 116, in __init__
    graph_type="predict", batch_size=batch_size)
  File "/home/mm/venv37-ocrd-new/lib/python3.7/site-packages/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_backend.py", line 17, in create_net
    processes=self.processes,
  File "/home/mm/venv37-ocrd-new/lib/python3.7/site-packages/calamari_ocr/ocr/backends/tensorflow_backend/tensorflow_model.py", line 59, in __init__
    print(self.model.summary())
  File "/home/mm/venv37-ocrd-new/lib/python3.7/site-packages/keras/engine/training.py", line 3304, in summary
    layer_range=layer_range,
  File "/home/mm/venv37-ocrd-new/lib/python3.7/site-packages/keras/utils/layer_utils.py", line 319, in print_summary
    print_fn(f'Model: "{model.name}"')
  File "/home/mm/venv37-ocrd-new/lib/python3.7/site-packages/keras/utils/io_utils.py", line 80, in print_msg
    sys.stdout.flush()
OSError: [Errno 5] Input/output error
2023-02-17 15:11:55,233 - ocrd.network.processing_worker - ERROR - <class 'ocrd_calamari.recognize.CalamariRecognize'> failed with an exception.
We can start by fixing this in ocrd_calamari. I'll drop the actual calls to the method from core and add them to ocrd_calamari.
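Resolving this at the processor level could look roughly like the sketch below. It is not the actual ocrd_calamari change: the class `SketchRecognizer` and the helper `disable_keras_interactive_logs` are hypothetical names. The idea is that only a processor that really uses TensorFlow pays for the import, and it does so right before loading its model.

```python
from os import environ


def disable_keras_interactive_logs():
    """Return True if Keras interactive logging was switched off (hypothetical helper)."""
    # Must be set before TensorFlow is imported to take effect.
    environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
    try:
        from tensorflow.keras.utils import disable_interactive_logging
    except ImportError:
        # TensorFlow is missing or too old to provide the function.
        return False
    disable_interactive_logging()
    return True


class SketchRecognizer:
    """Hypothetical TensorFlow-based processor; not a real OCR-D class."""

    def setup(self):
        # Called once per processor instance, before any model is loaded,
        # so that a later model.summary() does not write to a detached stdout.
        self.logs_disabled = disable_keras_interactive_logs()
```

Processors that never touch TensorFlow would simply not call the helper, so core and the bashlib-based processors no longer trigger the import.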
@MehmedGIT Can you check whether #1091 combined with OCR-D/ocrd_calamari#90 solves the issue? Then I can check which other processors need this.
@kba, I have just tested and I see no problems.