ONLINE HANDWRITTEN GURMUKHI SCRIPT RECOGNITION AND ITS CHALLENGES R. K. SHARMA THAPAR UNIVERSITY, PATIALA

Post on 11-Jan-2016


Page 1:

ONLINE HANDWRITTEN GURMUKHI SCRIPT

RECOGNITION AND ITS CHALLENGES

R. K. SHARMA

THAPAR UNIVERSITY, PATIALA

Page 2:

Handwriting Recognition System

Handwriting recognition (HWR) is the technique by which a computer system recognizes characters and other symbols written by hand in natural handwriting.

Page 3:

Types of HWR systems

HWR

Off-line HWR: A handwritten document is scanned and then recognized by the machine.

On-line HWR: Handwriting is recognized while it is being written.

Page 4:

Increasing complexity

Page 5:

Handwriting Recognition System

Writer dependent

Writer independent

Closed-vocabulary

Open-vocabulary

Page 6:

A general recognition procedure for On-line HWR

Data Collection & Preprocessing

Features Extraction & Segmentation

Recognition Methods & Post-processing

Page 7:

Data Collection

Input Pen Writing

Store pen movements

Text/Other file created

Text/Other file to be converted to a suitable format
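
The data collection flow above can be sketched roughly as follows. This is only an illustration: the `PenPoint` and `Stroke` types, their field names, and the choice of JSON as the stored text format are assumptions, not the format used by the application described in these slides.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class PenPoint:
    """One sample reported by the pen device (field names are hypothetical)."""
    x: float
    y: float
    t: float               # timestamp, assumed to be in seconds
    pressure: float = 0.0  # assumed to be reported by the device


@dataclass
class Stroke:
    """All points captured between one pen-down and the following pen-up."""
    points: List[PenPoint] = field(default_factory=list)


def strokes_to_json(strokes: List[Stroke]) -> str:
    """Store the pen movements as text (JSON is an assumed format)."""
    return json.dumps([asdict(s) for s in strokes])


def strokes_from_json(text: str) -> List[Stroke]:
    """Convert the stored text back into stroke objects for later stages."""
    return [Stroke([PenPoint(**p) for p in s["points"]])
            for s in json.loads(text)]
```

Keeping every sampled point together with its timestamp preserves the information that later stages (preprocessing and feature extraction) may need.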

Page 8:

Need for an application for the selected hardware device

• Pre-developed applications do not support the features that users require, e.g., storing the pixel information for all written text, deleting and adding strokes as the user requires, and scaling the written text.

• A custom GUI therefore needs to be developed to meet these user requirements.

Page 9:

Preprocessing

• Size Normalization

• Centering of text

• Interpolating missing points

• Smoothing of Text

• Slant Correction

• Resampling of points
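
Some of the preprocessing steps listed on this slide can be sketched as below. This is a minimal sketch under stated assumptions, not the method used by the author: a stroke is assumed to be a list of `(x, y)` tuples, smoothing is a three-point moving average, and resampling uses linear interpolation along the path (which also fills in points between widely spaced samples). Slant correction is left out. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def normalize_and_center(points: List[Point], size: float = 1.0) -> List[Point]:
    """Scale so the longer bounding-box side equals `size` (aspect ratio
    kept) and move the bounding-box centre to the origin."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    cx = (min(xs) + max(xs)) / 2.0
    cy = (min(ys) + max(ys)) / 2.0
    extent = max(max(xs) - min(xs), max(ys) - min(ys))
    scale = size / extent if extent > 0 else 1.0
    return [((x - cx) * scale, (y - cy) * scale) for x, y in points]


def smooth(points: List[Point]) -> List[Point]:
    """Three-point moving average; the two end points are left unchanged."""
    if len(points) < 3:
        return list(points)
    out = [points[0]]
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        out.append(((prev[0] + cur[0] + nxt[0]) / 3.0,
                    (prev[1] + cur[1] + nxt[1]) / 3.0))
    out.append(points[-1])
    return out


def resample(points: List[Point], n: int) -> List[Point]:
    """Return n points equally spaced along the stroke's path length."""
    if n < 2 or len(points) < 2:
        return list(points)
    # Cumulative path length at every original point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0:
        return [points[0]] * n
    out = []
    seg = 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        frac = (target - cum[seg]) / span if span > 0 else 0.0
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return out
```

A typical order would be to smooth first, then normalize and centre, then resample, although the best order may depend on the device and the script.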

Page 10:

Feature Extraction

• A feature extractor designed by Govindaraju converts a chain-code image into feature vectors, which are then used in the recognition phase.

• Hu et al. worked with point-oriented features, such as stroke tangents, for handwriting recognition.

• Hu et al. also proposed a method in which high-level features were extracted and then combined with local features at each sample point. These features were capable of covering a large input pattern and had invariance properties.

• Rocha designed a feature extractor that reduced the dimension of the problem and provided a structural description of a character shape, consisting of a specification of its features and their spatial inter-relations.

Page 11:

• A feature extractor designed by S.W. Lee extracted four directional feature vectors with Kirsch masks and one global feature vector linearly compressed from the normalized input image.

• Kirsch masks were also used by Chaos in the recognition of handwritten numerals.

• Blumenstein introduced a feature extraction technique for the recognition of segmented handwritten characters.

• PiFuei proposed a hybrid feature extraction method capable of providing an effective feature set of full dimension for multiclass cases.

Page 12:

Feature Categories

Features

Low-Level or Local: directions, positions, slope, area, slant, etc.

High-Level or Global: loops, crossings, headline, straight line, dots, etc.
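
To make the two feature categories concrete, here is a small sketch with one feature of each kind: an 8-direction code per point pair (low-level) and a count of self-crossings of the stroke (high-level). The function names, the 45° quantization, and the `(x, y)` tuple representation are assumptions for illustration and are not taken from the slides.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def direction_codes(points: List[Point]) -> List[int]:
    """Low-level feature: direction of each pen movement quantized to
    8 codes (0 = +x, then every 45 degrees counter-clockwise, assuming
    the y axis points up). Repeated points are skipped."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (x0, y0) == (x1, y1):
            continue
        angle = math.atan2(y1 - y0, x1 - x0)
        codes.append(int(round(angle / (math.pi / 4))) % 8)
    return codes


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area test: > 0 if c lies to the left of the line a -> b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def count_crossings(points: List[Point]) -> int:
    """High-level feature: number of proper intersections between
    non-adjacent segments of the same stroke."""
    segs = list(zip(points, points[1:]))
    count = 0
    for i in range(len(segs)):
        for j in range(i + 2, len(segs)):
            a, b = segs[i]
            c, d = segs[j]
            if (_orient(a, b, c) * _orient(a, b, d) < 0
                    and _orient(c, d, a) * _orient(c, d, b) < 0):
                count += 1
    return count
```

The direction codes describe the stroke point by point, while the crossing count summarizes its overall shape in a single number, which is the distinction the slide draws between local and global features.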

Page 13:

Device-based features

The time the pen device takes to capture a stroke is one such feature, since each stroke has its own complexity. Collecting suitable information about the time span of each stroke may help the recognition process.

Density of points in a stroke, which is device dependent.

Directions of pen movement in a stroke, which might be helpful in recognition.

Stroke area covered.

Pressure of the pen movements.
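
A minimal sketch of how some of these device-based features might be computed from one captured stroke is given below. The sample layout `(x, y, timestamp, pressure)`, the feature names, and the use of the bounding box as the "area covered" are all assumptions; a real device may report different fields and units.

```python
import math
from typing import Dict, List, Tuple

# (x, y, timestamp in seconds, pressure): an assumed sample layout.
Sample = Tuple[float, float, float, float]


def device_features(stroke: List[Sample]) -> Dict[str, float]:
    """Compute simple device-dependent features for one stroke."""
    if not stroke:
        raise ValueError("stroke must contain at least one sample")
    xs = [s[0] for s in stroke]
    ys = [s[1] for s in stroke]
    # Path length travelled by the pen within the stroke.
    length = sum(math.hypot(b[0] - a[0], b[1] - a[1])
                 for a, b in zip(stroke, stroke[1:]))
    return {
        # Time span between the first and the last sample.
        "duration": stroke[-1][2] - stroke[0][2],
        # Samples per unit of path length (depends on the device's rate).
        "point_density": len(stroke) / length if length > 0 else 0.0,
        # Area of the bounding box, used here as the area covered.
        "bbox_area": (max(xs) - min(xs)) * (max(ys) - min(ys)),
        # Average pen pressure over the stroke.
        "mean_pressure": sum(s[3] for s in stroke) / len(stroke),
    }
```

Because point density and pressure depend on the hardware, values computed on one device may not be directly comparable with those from another.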

Page 14:

Features’ Properties

The features that give the best results may vary from one script to another.

A method that gives good results for one script may not do so for other scripts.

There is no standard method for computing features of a language.

Features should vary to a reasonable extent.

Features must be obtainable from the handwriting of different users.

Features should be measurable through algorithms.

Features should be selected so that they represent the handwriting well and emphasize inter-class differences and intra-class similarities.

Page 15:

Recognition methods

Category | Method | Researchers

Statistical | Hidden Markov Model, Support Vector Machine | Amlan Kundu and Parambir Bahl (1988); Beigi (1994); Bellegarda (1994); Beim (2001); Connell and Jain (2002); Rigoll (1996); Subrahmonia (1996)

Neural Network | TDNN | Guyon (1992); Schomaker (1993); Morasso (1995); Yeager (1998)

Syntactical and Structural | Decision Tree | Kerrick and Bovik (1988); Chan and Yeung (1999); Jung and Kim (2000)

Elastic Matching | Dynamic Programming | Palvidis (1997); Wakahara and Odaka (1997); Webster and Nakagawa (1998)
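
As an illustration of the elastic matching category, the sketch below uses dynamic time warping (DTW), a common dynamic-programming alignment, to compare a stroke with stored templates. DTW is swapped in here as a representative technique; the slides do not state which dynamic-programming formulation the cited researchers used. The function names and the template dictionary are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def dtw_distance(a: List[Point], b: List[Point]) -> float:
    """Dynamic time warping distance between two point sequences.

    cost[i][j] is the cheapest alignment of the first i points of `a`
    with the first j points of `b`.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.hypot(a[i - 1][0] - b[j - 1][0],
                           a[i - 1][1] - b[j - 1][1])
            # Extend the cheapest of: skip in a, skip in b, or match both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def classify(sample: List[Point], templates: Dict[str, List[Point]]) -> str:
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))
```

Usage, with two made-up templates:

```python
templates = {
    "horizontal": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    "vertical": [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)],
}
label = classify([(0.0, 0.1), (0.9, 0.0), (2.1, 0.1)], templates)
```

Because the alignment may repeat points, two strokes drawn at different speeds can still match closely, but the cost grows with the product of the two sequence lengths.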

Page 16:

Advantages and disadvantages of Recognition methods

Category | Advantages | Disadvantages

Statistical | Models temporal relationships well. | Requires a very large amount of training data.

Neural Network | Classification time is fast. | Does not model temporal relationships well.

Syntactical and Structural | Needs less training data and is robust for writer-independent (WI) systems. | Feature choice is manual and highly script dependent.

Elastic Matching | Powerful high-level features. | Not well suited to systems where large variations exist in handwriting.

Page 17:

Post Processing

Other important aspects:

• Language rules: Ravinder Kumar and R.K. Sharma, “An Efficient Post Processing Algorithm for Online Handwritten Gurmukhi Character Recognition using Set Theory”, International Journal of Pattern Recognition and Artificial Intelligence, 27(4), 1353002 (1-17), 2013.

• Language Models

Page 18:

Challenges

• Reverse Handwriting

• Zone-wise stroke predictions

• Confusing Strokes

• Prediction of half Akshras, for example Pairi ‘ਹ’ and Pairi ‘ਵ’

• New Classes in Handwritten Words

• New Features, Selection from existing features

• New Classifiers / Hybrid Classifiers

Page 19:

THANK YOU ALL !!!!!