Understand

Understand is a Google Chrome extension that runs optical character recognition (OCR) on the images inside PDFs and converts them to text. This makes scanned books searchable.

Chrome extension for PDF optical character recognition

TODO

Stitch all images into one to reduce the number of API requests
PDF download progress
Re-enable pages > 1
Fix the "All images processed" indicator
Order recognized text properly (vertically)
Improve highlight positioning
Understand local files (file://....): ask the user to select the file using <input type="file">, or alternatively use a native app (these two options are needed when the user has not checked "allow file URLs" for the extension)
Offer to load normally (especially if there was an error)
Loading progress bars (instead of just a spinner)
Port to other browsers
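
As a rough illustration of the image-stitching item above, here is a minimal sketch of how the layout for a vertically stitched image could be planned, and how a position in the stitched image could be mapped back to its source image so that highlights land in the right place. Every name here (ImageSize, Placement, StitchLayout, planVerticalStitch, locateInSource) is hypothetical and not taken from the extension's code; actually drawing the images (for example onto a canvas) and calling the OCR API are left out.

```typescript
// Hypothetical types for illustration only; not part of the extension's source.
interface ImageSize { width: number; height: number; }
interface Placement { index: number; y: number; width: number; height: number; }
interface StitchLayout { width: number; height: number; placements: Placement[]; }

// Stack the images top to bottom, separated by `gap` pixels, and record
// where each one starts in the stitched image.
function planVerticalStitch(images: ImageSize[], gap: number = 0): StitchLayout {
  const placements: Placement[] = [];
  let y = 0;
  let width = 0;
  images.forEach((img, index) => {
    if (index > 0) {
      y += gap; // spacing between consecutive images only
    }
    placements.push({ index, y, width: img.width, height: img.height });
    y += img.height;
    width = Math.max(width, img.width);
  });
  return { width, height: y, placements };
}

// Map a y coordinate in the stitched image back to the source image it
// falls in, together with the y coordinate local to that image.
function locateInSource(
  layout: StitchLayout,
  stitchedY: number
): { index: number; y: number } | null {
  for (const p of layout.placements) {
    if (stitchedY >= p.y && stitchedY < p.y + p.height) {
      return { index: p.index, y: stitchedY - p.y };
    }
  }
  return null; // inside a gap or outside the stitched image
}
```

With a layout like this, one request could carry the whole stitched image, and each recognized word's position could be translated back with locateInSource before it is highlighted. A non-zero gap may help keep text from adjacent images from being merged into one line, though that depends on the OCR service.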

Apache License 2.0