vikesh8860 / Multitasker

Multitasker is a Python-based machine learning project that implements Photo Ocr, a photos-to-PDF converter, a text-to-speech converter and a speech-to-text converter.

Multitasker

This repository contains a Python-based application that implements Photo Ocr, a photos-to-PDF converter, a text-to-speech converter and a speech-to-text converter.

The Multitasker project is a machine learning project written in Python, with a UI built on the PyQt framework. The parts of the app are:

  • Photo Ocr

    Photo Ocr is implemented with machine learning, using the SVM module of scikit-learn and OpenCV for image manipulation. It achieves nearly 80% accuracy on clean, digitally printed text. Prediction accuracy decreases as the noise level increases.

    The dataset for the OCR is not taken from an external source; it is synthesized from different Windows fonts at different sizes, with some distortion applied.

  • Pdf Scanner

    The Pdf Scanner takes images from the user in .png, .jpg, .jpeg and .gif format and converts them to a PDF file using the Py2Pdf Python library. You can add, remove and reorder images dynamically. After the conversion, you can save the PDF in any directory.

  • Speech To Text

    Sometimes you want to write something down quickly but have to write everything by hand. Speech To Text transforms your spoken words directly into text, which you can then edit in the editor. It uses the Python SpeechRecognition library to convert the spoken words to editable text.

  • Text To Speech

    This application converts any editable text document to an audio file that you can play instantly or save for later use. It currently supports three accents: English-US, English-UK and English-Indian. It uses the Python gTTS module to do the conversion.
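
For the Photo Ocr part, the idea of training an SVM on synthesized, distorted characters can be sketched roughly as follows. This is a minimal illustration and not the project's code: the three 5x5 glyph templates stand in for characters rendered from Windows fonts, the random offset and Gaussian noise stand in for the distortion, and the names `GLYPHS` and `synthesize`, the linear kernel and all sizes are assumptions. It uses only NumPy and scikit-learn, and leaves out the OpenCV image handling.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical 5x5 templates standing in for characters rendered from fonts.
GLYPHS = {
    "I": ["..#..", "..#..", "..#..", "..#..", "..#.."],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "O": ["#####", "#...#", "#...#", "#...#", "#####"],
}


def synthesize(n_samples, noise, rng):
    """Return (X, y): glyphs pasted at random offsets on a 7x7 canvas, plus Gaussian noise."""
    labels = sorted(GLYPHS)
    X = np.zeros((n_samples, 49))
    y = []
    for i in range(n_samples):
        label = labels[rng.integers(len(labels))]
        glyph = np.array(
            [[ch == "#" for ch in row] for row in GLYPHS[label]], dtype=float
        )
        canvas = np.zeros((7, 7))
        dy, dx = rng.integers(0, 3, size=2)      # random placement (assumed distortion)
        canvas[dy:dy + 5, dx:dx + 5] = glyph
        canvas += rng.normal(0.0, noise, size=canvas.shape)  # pixel noise
        X[i] = canvas.ravel()                    # flatten the image into a feature vector
        y.append(label)
    return X, np.array(y)


rng = np.random.default_rng(0)
X_train, y_train = synthesize(600, 0.1, rng)
clf = SVC(kernel="linear").fit(X_train, y_train)

# Evaluate on a low-noise set and on a much noisier set.
X_clean, y_clean = synthesize(200, 0.1, rng)
X_noisy, y_noisy = synthesize(200, 1.0, rng)
acc_clean = clf.score(X_clean, y_clean)
acc_noisy = clf.score(X_noisy, y_noisy)
print(f"low noise: {acc_clean:.2f}, high noise: {acc_noisy:.2f}")
```

Changing the `noise` argument of the evaluation sets is a quick way to observe how the classifier behaves on noisier input.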
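
For the Pdf Scanner part, the add, remove and reorder behaviour can be pictured as an ordered queue of image paths that is filtered by file extension. The sketch below is only an assumption about how such a queue might look; `ImageQueue` and its methods are hypothetical names, and the actual PDF conversion step is left out because it depends on the library the app uses.

```python
from pathlib import Path

# Image types the README says the scanner accepts.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}


class ImageQueue:
    """Ordered list of image paths waiting to be converted into one PDF."""

    def __init__(self):
        self._paths = []

    def add(self, path):
        """Append an image, rejecting unsupported file types."""
        path = Path(path)
        if path.suffix.lower() not in ALLOWED_EXTENSIONS:
            raise ValueError(f"unsupported image type: {path.suffix!r}")
        self._paths.append(path)

    def remove(self, index):
        """Remove and return the image at the given position."""
        return self._paths.pop(index)

    def move(self, old_index, new_index):
        """Move an image to a new position; the others keep their relative order."""
        self._paths.insert(new_index, self._paths.pop(old_index))

    def pages(self):
        """Return the images in the order they would appear in the PDF."""
        return list(self._paths)


queue = ImageQueue()
for name in ["cover.png", "page2.jpg", "page1.jpeg"]:
    queue.add(name)
queue.move(2, 1)   # put page1 before page2
queue.remove(0)    # drop the cover
print([p.name for p in queue.pages()])  # ['page1.jpeg', 'page2.jpg']
```

Keeping the order in a plain list means the page order of the resulting PDF is simply the list order at conversion time.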
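
For the Text To Speech part, the three accents could be selected with the gTTS package, which is a text-to-speech library, through its `tld` parameter. This is a hedged sketch rather than the app's code: the `ACCENT_TLD` mapping, `tts_options` and `save_speech` are hypothetical names, the `tld` values follow the localized accents listed in the gTTS documentation and should be checked against the installed version, and older gTTS releases selected accents differently.

```python
# Assumed mapping from the app's accent names to gTTS top-level domains.
ACCENT_TLD = {
    "English-US": "us",
    "English-UK": "co.uk",
    "English-Indian": "co.in",
}


def tts_options(accent):
    """Return the gTTS keyword arguments for one of the supported accents."""
    try:
        return {"lang": "en", "tld": ACCENT_TLD[accent]}
    except KeyError:
        raise ValueError(f"unsupported accent: {accent!r}") from None


def save_speech(text, accent, out_path):
    """Synthesize text to an MP3 file (needs the gTTS package and network access)."""
    # Imported lazily so the accent mapping can be used without gTTS installed.
    from gtts import gTTS

    gTTS(text=text, **tts_options(accent)).save(out_path)


print(tts_options("English-UK"))  # {'lang': 'en', 'tld': 'co.uk'}
```

Calling `save_speech("Hello", "English-Indian", "hello.mp3")` would then write the audio file, which the app could play back or keep for later use.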

The app is designed with the Qt4 framework and has been tested successfully on Windows 10 and Windows 8.1.

If you want to contact me, feel free to ping me here: https://kvikesh800.wixsite.com/learner/contact

If you want to contribute to the project, you are welcome to.

Screenshots

  • Main Window
  • Photo Ocr Window
  • Pdf Scanner Window
  • Speech To Text Window
  • Text To Speech Window
  • About Us Window

Languages

Python 100.0%