VikParuchuri / surya

OCR, layout analysis, reading order, line detection in 90+ languages

Home Page: https://www.datalab.to

Return confidences on run_recognition

bobbydmartino opened this issue

Hi,

Currently, run_recognition in surya/ocr.py does not return confidences. This would be an easy fix. Line 30 is currently:
rec_predictions, _ = batch_recognition(all_slices, all_langs, rec_model, rec_processor, batch_size=batch_size)
Just change it to:
rec_predictions, confidence_scores = batch_recognition(all_slices, all_langs, rec_model, rec_processor, batch_size=batch_size)
and add the confidence scores to the OCRResult. This would give more information to anyone who uses surya as the recognizer together with a different detector or with ground-truth bounding boxes.
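
To illustrate the second half of the change, here is a rough sketch of how the scores could be attached to each line once they are no longer discarded. It does not use surya's actual schema: the TextLine and OCRResult dataclasses, the confidence field, and the build_results helper are all hypothetical stand-ins, and it assumes batch_recognition returns two flat lists (texts and scores) aligned with the slices in image order.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TextLine:
    # Hypothetical stand-in for a recognized line; `confidence` is the proposed new field.
    text: str
    bbox: List[float]
    confidence: Optional[float] = None


@dataclass
class OCRResult:
    # Hypothetical stand-in for the per-image result container.
    text_lines: List[TextLine] = field(default_factory=list)


def build_results(rec_predictions, confidence_scores, bboxes_per_image):
    """Pair each predicted line with its score and box, one OCRResult per image.

    Assumes `rec_predictions` and `confidence_scores` are flat lists in the
    same order as the boxes in `bboxes_per_image` (an assumption, not
    something confirmed against the library).
    """
    results = []
    start = 0
    for bboxes in bboxes_per_image:
        end = start + len(bboxes)
        lines = [
            TextLine(text=text, bbox=bbox, confidence=conf)
            for text, conf, bbox in zip(
                rec_predictions[start:end], confidence_scores[start:end], bboxes
            )
        ]
        results.append(OCRResult(text_lines=lines))
        start = end
    return results


# Made-up values standing in for the two return values of batch_recognition.
rec_predictions = ["Hello", "world", "Total: 42"]
confidence_scores = [0.98, 0.91, 0.77]
bboxes_per_image = [[[0, 0, 100, 20], [0, 25, 100, 45]], [[10, 10, 200, 40]]]

results = build_results(rec_predictions, confidence_scores, bboxes_per_image)
for image_result in results:
    for line in image_result.text_lines:
        print(line.text, line.confidence)
```

With a per-line score available, a caller supplying their own boxes could filter out or flag low-confidence lines instead of trusting every recognized string equally.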