BibratRanjan / Denoising-dirty-documents

Denoising-dirty-documents

Overview

Optical Character Recognition (OCR) is the process of converting typed or handwritten documents into a digitized format. If you've read a classic novel on a digital reading device or had your doctor pull up old healthcare records via the hospital computer system, you've probably benefited from OCR.

OCR makes previously static content editable, searchable, and much easier to share. But a lot of documents eager for digitization are being held back. Coffee stains, faded sun spots, dog-eared pages, and lots of wrinkles are keeping some printed documents offline and in the past.

This project was built for the Denoising Dirty Documents challenge on Kaggle.

Requirements

  • Keras 2.2 with TensorFlow backend

Sample

Original

Denoised
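
To make the original/denoised pairing concrete, here is a minimal sketch of a classical baseline: a median filter that removes isolated specks from a grayscale page. This is not the method used in this repository, which relies on Keras; the function name median_filter, the radius parameter, and the toy 5x5 page are all assumptions made for illustration.

```python
def median_filter(image, radius=1):
    """Return a copy of `image` with each pixel replaced by its neighbourhood median.

    `image` is a list of rows of grayscale values (0 = black, 255 = white).
    Windows are clipped at the page borders instead of being padded.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect every pixel within `radius` of (x, y) that lies on the page.
            window = sorted(
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            )
            # Take the (upper) median so the output stays an integer pixel value.
            row.append(window[len(window) // 2])
        result.append(row)
    return result


# Hypothetical toy input: a white 5x5 page with a single dark speck in the centre.
page = [[255] * 5 for _ in range(5)]
page[2][2] = 0
cleaned = median_filter(page)
```

On this toy page the speck is outvoted by its white neighbours and disappears. A filter this simple would also erode thin pen strokes and cannot model stains or wrinkles, which is the kind of gap a learned model is meant to close.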
