Handwriting OCR
breta hajek
Open source OCR software for recognition of handwritten text
Project status: Under Development
Groups
Student Developers for AI
Overview / Usage
The project aims to build software that recognizes handwritten text in photos. The process is divided into four main steps, from detection of the page to recognition and output of the detected words. It uses computer vision (OpenCV) and machine learning (TensorFlow), and it also tests and experiments with different approaches to the individual steps.
The project is open source and anyone is welcome to join the development.
Methodology / Approach
The recognition process is divided into four steps. The initial input is a photo of a page with text.
1. Detection of page and removal of background
2. Detection and separation of words
3. Normalization of words
4. Separation and recognition of characters (recognition of words)
Each step is tested and developed independently. For the last step, I gathered over 190,000 images of words. All new models are trained and tested on these images.
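To make step 2 (detection and separation of words) more concrete, here is a minimal sketch of one common approach, a column projection profile over a binarized line of text. It is an illustration under stated assumptions, not the project's actual implementation: the function name `split_words`, the `min_gap` parameter, and the nested-list image format are all hypothetical, and a real pipeline would more likely work on OpenCV/NumPy arrays.

```python
from typing import List, Tuple


def split_words(binary: List[List[int]], min_gap: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges of words in a binarized text line.

    Assumptions (hypothetical, for illustration only):
      binary  -- rows of pixels, 0 = background, 1 = ink
      min_gap -- a run of at least this many blank columns separates two
                 words; shorter runs are treated as gaps between letters
    The end index is exclusive, so a word spans columns start..end-1.
    """
    if not binary:
        return []

    # Column projection: True where a column contains at least one ink pixel.
    has_ink = [any(column) for column in zip(*binary)]

    spans = []
    start = None      # first column of the word currently being built
    last_ink = None   # most recent column that contained ink
    for x, ink in enumerate(has_ink):
        if not ink:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_gap:
            # The blank run since the last ink column is wide enough:
            # close the current word and start a new one here.
            spans.append((start, last_ink + 1))
            start = x
        last_ink = x

    if start is not None:
        spans.append((start, last_ink + 1))
    return spans


# Usage: one row with a narrow gap (inside a word) and a wide gap (between words).
line = [[1, 1, 0, 1, 0, 0, 0, 1, 1, 0]]
print(split_words(line, min_gap=2))
```

Each returned range could then be cropped out and passed on to normalization (step 3). Handwriting is often slanted or unevenly spaced, so a fixed `min_gap` is a simplification that would need tuning or a different method on real photos.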
Technologies Used
The code is written in Python 3.6 using the following libraries:
TensorFlow
OpenCV
NumPy
pandas
Repository
https://github.com/Breta01/handwriting-ocr