Simple code that converts one or more PDFs to image files and runs Tesseract OCR on those images to extract their text. It focuses on extracting the Batch No. from pharmacy bills using regular expressions. The actual PDFs and output files are not included because all of the data used was real, sensitive data.
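Since the real bills cannot be shared, here is a minimal sketch of what such a pipeline might look like. It is not the repository's actual code: the use of the `pdf2image` and `pytesseract` packages, the function names `ocr_pdf` and `extract_batch_numbers`, the 300 DPI setting, and the `BATCH_RE` pattern are all assumptions made for illustration. Real bills will likely need a pattern tuned to their layout.

```python
import re

# Hypothetical pattern: "Batch No.", "Batch No:", "Batch Number" or "Batch #",
# followed by an alphanumeric code that may contain "-" or "/".
# The actual label and code format on a given bill may differ.
BATCH_RE = re.compile(
    r"Batch\s*(?:No\.?|Number|#)\s*[:\-]?\s*([A-Z0-9][A-Z0-9/\-]*)",
    re.IGNORECASE,
)


def extract_batch_numbers(text):
    """Return every batch code found in the OCR text, in order of appearance."""
    return BATCH_RE.findall(text)


def ocr_pdf(pdf_path, dpi=300):
    """Render each PDF page to an image and return the OCR text per page.

    Assumes the third-party packages pdf2image (which needs Poppler) and
    pytesseract (which needs the Tesseract binary) are installed. They are
    imported here so the regex helper above works without them.
    """
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(pdf_path, dpi=dpi)  # list of PIL images
    return [pytesseract.image_to_string(page) for page in pages]
```

With both packages installed, something like `[extract_batch_numbers(t) for t in ocr_pdf("bill.pdf")]` would give the batch codes per page, where `bill.pdf` is a placeholder file name. Keeping the regex step separate from the OCR step makes it possible to test the pattern on plain strings without any sensitive documents.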