aman-17 / Reader-for-Blind


Powered by the Google Vision API (API key required)

This project is an automatic document reader and image describer for visually impaired people, built on a Raspberry Pi. It uses the Google Cloud Vision API to recognize printed characters in images captured by a camera, converting pictures of typed, handwritten, or printed text into machine-encoded text. The recognized text is then turned into audio output (speech) through OCR and text-to-speech synthesis. The capture and conversion pipeline runs on the Raspberry Pi in Python, with the OpenCV library handling image processing before the audio output is produced.
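A minimal sketch of this OCR-to-speech flow, assuming the `google-cloud-vision` and `pyttsx3` packages are installed and a service-account key is configured via `GOOGLE_APPLICATION_CREDENTIALS`; the file name `page.jpg` is a placeholder, not part of this repo:

```python
# pip install google-cloud-vision pyttsx3
from google.cloud import vision
import pyttsx3

# Assumes GOOGLE_APPLICATION_CREDENTIALS points at a service-account key file.
client = vision.ImageAnnotatorClient()

# "page.jpg" is a placeholder for an image captured by the Pi camera.
with open("page.jpg", "rb") as f:
    image = vision.Image(content=f.read())

# Run OCR on the image; the first annotation holds the full detected text.
response = client.text_detection(image=image)
annotations = response.text_annotations
text = annotations[0].description if annotations else ""

# Speak the recognized text through the audio output.
engine = pyttsx3.init()
engine.say(text)
engine.runAndWait()
```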

We have built a prototype system that reads printed text and handheld objects to assist blind users. To extract text regions from complex backgrounds, we use a text localization algorithm based on models of stroke orientation and edge distributions. The corresponding feature maps estimate the global structural features of text at each pixel. Block patterns project the feature maps of an image patch into a feature vector, and adjacent character grouping computes candidate text patches for classification. An AdaBoost learning model is used to localize text in camera-captured images. Google's Vision API then performs word and image recognition on the localized text regions, and the result is transformed into audio output for the blind user.

When the Raspberry Pi board is powered on, the camera starts streaming, and the stream is displayed on screen through a GUI application. Once the object or document to be read is placed in front of the camera, the capture button is clicked to send the image to the board. The Tesseract library converts the captured image into text, the detected text is shown on the status bar, and finally it is spoken through earphones using text-to-speech synthesis.
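A minimal sketch of this capture-and-read loop, assuming OpenCV, `pytesseract`, and `pyttsx3` are installed; a keypress stands in for the capture button and the console for the status bar:

```python
# pip install opencv-python pytesseract pyttsx3  (Tesseract itself must also be installed)
import cv2
import pytesseract
import pyttsx3

engine = pyttsx3.init()

# The Pi camera is assumed to be exposed as video device 0.
cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("Reader-for-Blind", frame)  # live preview (the GUI stream)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("c"):  # 'c' stands in for the capture button
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        text = pytesseract.image_to_string(gray).strip()
        print(text)  # stand-in for the status-bar display
        if text:
            # Speak the detected text through the earphones.
            engine.say(text)
            engine.runAndWait()
    elif key == ord("q"):  # quit
        break

cap.release()
cv2.destroyAllWindows()
```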

Languages

Language: Python 100.0%