![Page 1: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/1.jpg)
1
Natural Language Processing: Arabic Cursive Handwriting
Recognition
A. Belaïd
![Page 2: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/2.jpg)
2
Preamble
Arabic
– Part of the Semitic language family
One of the most spoken languages in the world
– Nearly 250 million people speak Arabic
Spoken outside Arabic countries
– Over 600,000 people in the United States speak Arabic
Gave rise to other alphabets
– Farsi, Urdu…
• spoken by millions of people in Iran, Pakistan, India…
![Page 3: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/3.jpg)
3
Preamble
The work presented here is that of numerous researchers working with me…
– Najoua Ben Amara
– Samia Maddouri
– Afef Kacem
– Imen Ben Cheikh
– Hiba Khelil
– Mohamed Yazid Boudaren
– Nazih Ouwayed
– Christophe Choisy
– Umapada Pal
![Page 4: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/4.jpg)
4
Presentation outline
Introduction
– Brief history, specific applications
Writing characteristics
– Those shared with Latin
– Those specific to Arabic
Issues of cursive word recognition
– Reading models
– Global word-based (holistic), local letter-based (analytical) and hybrid approaches
Language processing
– Some basic solutions for handwriting recognition
![Page 5: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/5.jpg)
5
Introduction
The field arose before the advent of computers
[Figure: timeline from 1900 to 2008 (marks at 1916, 1950, 1965, 1980) plotting maturity and price: patents on OCR (telegraph for the blind), working models, OCR in industry, first postal address reader, forms, handwritten forms, small devices, intelligent pen]
![Page 6: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/6.jpg)
6
Today for Latin: an industry and a real market
Hardware:
– Scanners adapted to documents
We know how to
– scan documents in huge quantities while preserving the image quality
– compress them and publish them on the net
– recognize them by OCR and identify some structural elements
But…
– only on very good quality documents
– rather recent, poorly structured, printed
– Handwriting:
• Only a few systems are available, for specific applications with a small vocabulary
![Page 7: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/7.jpg)
7
Introduction
Arabic
Started around 1980
– increasing demand for information indexing and retrieval
Pioneers
– A. Amin (Loria)
– M. Cheriet (ETS, Montreal)
– N. Ben Amara (ENIT)…
Today
– Many labs: REGIM (Sfax), LRI (Annaba), READ (Loria), ETS (Montreal), CEDAR (USA), ISI (India)
Many dedicated sessions and workshops
– IWFHR, ICFHR, CIFED, ICDAR, SACH’06
Several public datasets
– IFN/ENIT, DARPA/SAIC, CENPARMI, Farsi-City…
• See Volker Märgner's list
![Page 8: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/8.jpg)
8
Introduction
Arabic
Commercial Arabic OCR
– A number of commercial Arabic OCR engines
• Sakhr's Automatic Reader: ~$1500
• Readiris from IRIS: ~$500
• Verus from NovoDynamics: ~$1300
• OmniPage
– Evaluation by UNLV
• Sakhr (90.33%), OmniPage (86.89%)
Open-source Arabic OCR projects
– The Siragi project (started in 2005)
• Part of the Arabic Unix open-source project
![Page 9: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/9.jpg)
9
Introduction
Some specific applications
Recognition of courtesy amounts on bank checks
![Page 10: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/10.jpg)
10
Introduction
Signature recognition, verification, forgery detection…
V. K. Madasu et al., Pattern Recognition, 38 (2005)
Normalized vector angle (α) in boxes
M. A. Ismail et al., Pattern Recognition, 33 (2000)
Global and local features in boxes
Algorithms based on fuzzy concepts
![Page 11: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/11.jpg)
11
Introduction
Some specific applications
Writer identification
Comparison of Gabor-Based Features for Writer Identification of Farsi/Arabic Handwriting, F. Shahabi et al., IWFHR’06
Automatic Writer Identification Using Connected-Component Contours and Edge-Based Features of Uppercase Western Script, L. Schomaker et al., PAMI, vol. 26, no. 6, 2004
![Page 12: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/12.jpg)
12
Introduction
Word spotting by request
Spotting Words in Handwritten Arabic Documents, S. Srihari et al., SACH 2006
Ch. Choisy, A. Belaïd, Cross-learning in Analytic Word Recognition Without Segmentation, IJDAR’02
Template matching: ALMALIK
Stochastic model: MADAME
[Figure labels: prototypes, candidates]
![Page 13: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/13.jpg)
13
Introduction
Newspaper segmentation
Connected Pattern Segmentation and Title Grouping in Newspaper Images, P. E Mitchella et al, ICPR’04
Arabic Page Segmentation, Planet, K. Hadjar et al. ICDAR03
![Page 14: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/14.jpg)
14
Introduction
Some specific applications
Paleographic inspection
INSA Lyon: shape self-similarity for discriminating the writing styles of medieval manuscripts (original title: Auto-similarité de formes pour la discrimination des styles d’écriture des manuscrits médiévaux), I. Moalla, F. LeBourgois, H. Emptoz, A. M. Alimi
Progressive evolution between the 6th and the 16th century
University of Pisa: classification and identification of medieval scripts
University of Annaba:
![Page 15: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/15.jpg)
15
Introduction
The issue of recognition
Handwritten Latin recognition showed the way first
– In terms of modalities
• On-line vs off-line
– In terms of scripts
• Printed vs handwritten
– In terms of pre-processing
• Shape normalization
• Feature extraction: indices or graphemes
– In terms of methodologies, classified according to:
• use of a lexicon or not
• nature of primitives / model: structural, statistical, stochastic
• vision level: local or global
![Page 16: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/16.jpg)
16
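Introduction
Initiated by speech recognition
Today: a well-established PR system
[Figure: enrollment and recognition pipeline with the blocks preprocessing, feature extraction & vector quantization, sequence, tree-structured lexicon, HMM models database. Example inputs: the spoken sentence « Bonsoir, à demain pour une nouvelle édition du journal » ("Good evening, see you tomorrow for a new edition of the news"), handwritten Arabic words, and the lexicon entries start, stand, store]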
![Page 17: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/17.jpg)
17
Introduction
When does LP contribute?
[Figure: block diagram in two phases. Enrollment: training data, preprocessing + feature extraction, phoneme / character modeling, character models; text, language modeling, lexicon + grammar. Recognition: image input, preprocessing + feature extraction, recognition search, word sequence]
![Page 18: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/18.jpg)
18
Introduction
The bases of the process are well established
[Figure: taxonomy of approaches with the labels global, analytic, segmentation (pre-segmentation, internal), learning, recognition, sliding window over letters (a, b, … z), discriminate path, discriminate model]
![Page 19: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/19.jpg)
19
Introduction
Performance: criteria influencing the quality
[Figure: three axes. Number of writers: mono, multi, omni. Lexicon size: reduced, large. Writing: guided, unconstrained]
![Page 20: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/20.jpg)
20
Introduction
The performances are satisfactory for Arabic
Script Process Model
![Page 21: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/21.jpg)
21
Outline
1. Introduction
2. Writing characteristics
3. Issues of Segmentation
4. Natural Language Processing
![Page 22: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/22.jpg)
22
Writing characteristics
Some of them are similar to Latin
Writing lines
Alignment
– Upper line
– Mean line
– Baseline
– Lower line
Baseline
X-height
![Page 23: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/23.jpg)
23
Writing characteristics
Some of them are similar to Latin
Perceptive invariants, which J. C. Simon called regularities and singularities
Letter support
Letter peculiarities
![Page 24: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/24.jpg)
24
Writing characteristics
Some of them are similar to Latin
Perceptive invariants / regularities and singularities
![Page 25: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/25.jpg)
25
Writing characteristics
Some of them are similar to Latin
Perceptive invariants / regularities and singularities
![Page 26: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/26.jpg)
26
Writing characteristics
Some of them are similar to Latin
Perceptive invariants / regularities and singularities
![Page 27: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/27.jpg)
27
Latin vs Arabic
What changes?
Essentially the script, always complex
Arabic systems
– The difficulty is permanent: cursiveness, ligatures, tashkeel
Latin systems
– The difficulty is gradual
![Page 28: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/28.jpg)
28
Latin vs Arabic
What changes?
The gaps are significant for Latin, not always for Arabic
Latin
– Between words
Arabic
– Everywhere
![Page 29: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/29.jpg)
29
Latin vs Arabic
What changes?
The ligatures are permanent: horizontal and vertical
![Page 30: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/30.jpg)
30
Latin vs Arabic
Arabic has some peculiarities
1. Helpful: accents and diacritical dots contribute to the recognition
![Page 31: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/31.jpg)
31
Latin vs Arabic
Arabic has some peculiarities
2. Helpful: the letter elongation contributes to the segmentation
![Page 32: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/32.jpg)
32
Latin vs Arabic
Arabic has some peculiarities
3. Helpful: the position of the hamza (16) and the descenders
![Page 33: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/33.jpg)
33
Latin vs Arabic
Arabic peculiarities?
5. Helpful: PAWs (pieces of Arabic words) offer a pause in the writing, a decomposition of the writing
• They simplify the apprehension of the script and make linear recognition easier
PAW
[Al-Badr and R. M. Haralick 1998]
![Page 34: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/34.jpg)
34
Arabic Script
In conclusion
• Arabic: more global than syllabic
• PAWs: facilitate the recognition
• PAW level ~ letter level in Latin
• For recognition
– In Arabic: reach the PAW level (characteristic information)
– In Latin: reach the letter level
• The PAW level is the stable level, which makes it semi-global
![Page 35: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/35.jpg)
35
Outline
1. Introduction
2. Writing characteristics
3. Issues of Recognition
4. Natural Language Processing
![Page 36: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/36.jpg)
36
Issue of segmentation
Due to the local variability
– It is widely accepted that
• segmenting an Arabic word into letters is very delicate and not always reliable
• in most attempts, the Arabic word is segmented into graphemes (an approach copied from Latin)
– This is an error!
![Page 37: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/37.jpg)
37
Issues of recognition
Considering Arabic peculiarities
Reading models
– The recognition of a word
• implies the processing of visual data and its interpretation at the linguistic level
– Psychologists call "mental lexicon access"
• the process by which a human associates the image of a word with its meaning
Several models have emerged
![Page 38: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/38.jpg)
38
Interactive Activation Model
McClelland and Rumelhart, 1981
Important assumptions
– Perception takes place in multilevel processing:
• Feature, letter, word
A consequence:
– more abstract levels of representation are only accessed via intermediate levels
A third assumption
– Processing combines both bottom-up and top-down information
• readers can use their (top-down) knowledge of words to help identify letter sequences from (bottom-up) visual input
[Figure: three-level network (features, letters, words) with the example words MATE and MOVE]
Neurons have excitatory (+) and inhibitory (−) connections
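The bottom-up / top-down interplay described on this slide can be illustrated with a toy loop. This is only a sketch under simplifying assumptions, not the McClelland and Rumelhart model itself: it keeps just the letter and word levels, and the function name `interactive_activation`, the update rate and the 0.5 support threshold are all hypothetical choices made for the illustration.

```python
def clamp(x):
    """Keep an activation inside [0, 1]."""
    return max(0.0, min(1.0, x))


def interactive_activation(letter_evidence, lexicon, cycles=10, rate=0.2):
    """Toy letter/word interactive-activation loop (illustrative only).

    letter_evidence: one dict per letter position, mapping a candidate
        letter to its bottom-up support in [0, 1].
    lexicon: candidate words, each as long as letter_evidence.
    """
    letters = [dict(position) for position in letter_evidence]
    words = {word: 0.0 for word in lexicon}
    for _ in range(cycles):
        # Bottom-up: a word is excited when its letters are well supported
        # (the 0.5 threshold is an arbitrary assumption).
        for word in words:
            support = sum(letters[i].get(ch, 0.0)
                          for i, ch in enumerate(word)) / len(word)
            words[word] = clamp(words[word] + rate * (support - 0.5))
        # Top-down: active words excite their own letters and
        # inhibit the competing letters at the same position.
        for i, position in enumerate(letters):
            for ch in position:
                excite = sum(a for w, a in words.items() if w[i] == ch)
                inhibit = sum(a for w, a in words.items() if w[i] != ch)
                position[ch] = clamp(position[ch] + rate * (excite - inhibit))
    return words, letters
```

With ambiguous evidence for the middle letters of MATE / MOVE, the word that is slightly better supported pulls its own letters up over the cycles, which is the word-level feedback the slide refers to.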
![Page 39: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/39.jpg)
39
IA & Arabic Recognition
Arabic writing
– fits the reading principle of IA very well
• It clearly favors the superiority of the whole
• Local perceptual information is only used to help word understanding
But the corresponding model
– should be adapted to consider the PAW level and letter distortions:
• PAWs introduce an intermediate global level
Hence, the similarity is perfect once the model is adapted
![Page 40: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/40.jpg)
40
Perceptro [Côté, Cheriet 98]
Limited number of features
– Ascender, descender, loop
– A word without these features cannot be initialized
No training, no inhibition, rapid saturation
Recognition
– Perceptive cycles
– Top-down & bottom-up
![Page 41: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/41.jpg)
41
Transparent Neural Network
[Maddouri, Belaïd, Ellouze, 03]
– Input correction by FD
[Ben Cheikh, Kacem, 07]
– Slightly extended vocabulary (Tunisian city names)
– Training possibility
[Ben Cheikh, Belaïd, Kacem, 08]
– Wide vocabulary
![Page 42: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/42.jpg)
42
Arabic Recognition
Correction process
[Figure: propagation and back-propagation (?) between the original image, the reconstructed image (harmonics) and the real image]
![Page 43: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/43.jpg)
43
Arabic Recognition
Experiments
– 2100 images, 70 words, 63 PAWs
– Without perceptive cycles
• PAW RR: 68.42%
• Word RR: 90%
– With perceptive cycles
• PAW RR: 95%
• Word RR: 97%
![Page 44: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/44.jpg)
44
Arabic Recognition Methodologies
Considering the human perception of Arabic writing and the particularity of PAWs, we
– reviewed the approaches in the literature according to their degree of vision:
• Global-based vision classifiers
• Semi-global-based vision classifiers
• Local-based vision classifiers
• Hybrid-level classifiers
– and examined their proximity to IA
![Page 45: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/45.jpg)
45
Methodologies
I Global-Based Vision Classifier
The word
– regarded as a whole
The features
– do not need to be precise:
• presence and some relationships
The approach
– comparable to a segmentation-free approach
– even if a segmentation is used, no local interpretation is made
– information is gathered at the word level
Its use is limited to small vocabularies
![Page 46: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/46.jpg)
46
GBVC
Examples
Srihari et al. [2005]:
– Several preprocessing steps
– Feature extraction for PAWs and words:
• aspect measurements
– Word resemblance by NN
• 10 writers writing 10 documents each: word extraction is ~60%, RR = 70%
Noise suppression and binarization
Suppression of internal contours
Fusion of minor components
![Page 47: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/47.jpg)
47
GBVC
Examples
Al-Badr et al. [1998]
– Segmentation-free method:
• detects a set of shape primitives on the word
• matches the regions of the word with a set of symbol models
• maximizes the a posteriori probability of the arrangement of symbol models
– Word recognition scores: clean (99.39%), degraded (95.60%), scanned (73.13%)
Matching with symbol model
Correspondence regions of the model (in shades of gray)
![Page 48: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/48.jpg)
48
Local-Based Vision Classifier
Example
Shirin Saleem et al. [2008]:
– BBN Technologies, Cambridge, MA: BBN Byblos OCR system (DARPA data set):
– Locate line tops and bottoms
– Extract narrow overlapping vertical slices of the image
• measure features on each slice
• reduce the size of the feature vector using Linear Discriminant Analysis (typically 15 features)
![Page 49: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/49.jpg)
49
Simple Frame-based Features
Examples of features:– Intensity as a function of vertical position
– Vertical derivative of intensity
– Horizontal derivative of Intensity
– Local angle within a small window
– Difference of angle
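As a rough illustration of the slicing and of the first three features listed above, here is a minimal pure-Python sketch. It is not the BBN Byblos implementation: the function names `vertical_slices` and `slice_features`, the slice width and the step are assumptions, and the angle features and the LDA reduction are left out.

```python
def vertical_slices(image, width=4, step=2):
    """Yield narrow overlapping vertical slices of a 2-D image.

    image: list of rows, each row a list of pixel intensities.
    width, step: hypothetical slice width and shift, in pixels.
    """
    n_cols = len(image[0])
    for start in range(0, n_cols - width + 1, step):
        yield [row[start:start + width] for row in image]


def slice_features(slice_rows):
    """Simple frame features for one slice (width must be at least 2)."""
    # Intensity as a function of vertical position: mean intensity per row.
    profile = [sum(row) / len(row) for row in slice_rows]
    # Vertical derivative: change of the profile between consecutive rows.
    v_deriv = [b - a for a, b in zip(profile, profile[1:])]
    # Horizontal derivative: mean change between adjacent columns, per row.
    h_deriv = [sum(row[j + 1] - row[j] for j in range(len(row) - 1))
               / (len(row) - 1) for row in slice_rows]
    return profile + v_deriv + h_deriv
```

Each slice then yields one feature vector, and the sequence of vectors along the text line is what a character HMM would consume.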
![Page 50: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/50.jpg)
50
Character Hidden Markov Model
![Page 51: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/51.jpg)
51
Local-Based Vision Classifier
Example
R. Al Hadj et al. [2006]
– HMM for letters and words with sliding windows
– Windows correspond to 3 different orientations: density description
– A second system integrates all the orientations in each position
Recognition rates: 85.02% (Top 1), 91.29% (Top 2), 93.14% (Top 3)
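For readers unfamiliar with Top-N scores: a sample counts as correct at rank N when the true word appears among the N best candidates returned by the recognizer. A minimal sketch of that computation follows; the function name `top_n_rate` and the data layout are assumptions made for the example.

```python
def top_n_rate(ranked_candidates, truths, n):
    """Fraction of samples whose true label is among the n best candidates.

    ranked_candidates: one list of labels per sample, best candidate first.
    truths: the true label of each sample, in the same order.
    """
    hits = sum(1 for candidates, truth in zip(ranked_candidates, truths)
               if truth in candidates[:n])
    return hits / len(truths)
```

By construction the rate can only grow with N, which matches the increasing Top 1 to Top 3 figures reported on the slide.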
![Page 52: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/52.jpg)
52
Global-Based Vision Classifier
Examples
Khorsheed et al. [2000]
– Polar transformation coupled with a Fourier transform
– Each word: a template of Fourier coefficients
– Recognition
• normalized ED from templates
• in a multi-font approach: 95.4% correct word classification on 1700 samples of different sizes, angles and translations
Original images
Normalized images by polar transform
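The reason a polar transform pairs well with a Fourier transform is that a rotation of the word becomes a circular shift along the angle axis, and the magnitudes of DFT coefficients are unchanged by circular shifts. The sketch below illustrates only that idea on a 1-D sequence standing in for a polar-sampled profile; it is not Khorsheed's implementation. The function names and the unit-length normalization used for the "normalized ED" are assumptions.

```python
import cmath
import math


def dft_magnitudes(signal, n_coeffs):
    """Magnitudes of the first n_coeffs DFT coefficients of a 1-D sequence."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal)))
            for k in range(n_coeffs)]


def normalized_distance(a, b):
    """Euclidean distance after scaling both vectors to unit length
    (one possible normalization, assumed here)."""
    norm_a = math.sqrt(sum(x * x for x in a)) or 1.0
    norm_b = math.sqrt(sum(x * x for x in b)) or 1.0
    return math.sqrt(sum((x / norm_a - y / norm_b) ** 2
                         for x, y in zip(a, b)))


def classify(signal, templates, n_coeffs=4):
    """Return the template word closest to the signal's Fourier features."""
    feats = dft_magnitudes(signal, n_coeffs)
    return min(templates,
               key=lambda word: normalized_distance(feats, templates[word]))
```

A shifted and rescaled copy of a profile gets the same normalized features as the original, so it is matched to the same template.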
![Page 53: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/53.jpg)
53
GBVC
Synthesis
The related works
– support the word superiority
– Many feature combinations and models perform well
The proximity with IA?
– can operate
– but limited to 2 levels; needs more precision in feature extraction
Adaptation of GC to Arabic
– Possible if high-level features are used
[Figure: IA levels: input, feature, word]
![Page 54: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/54.jpg)
54
Methodologies
II Semi-Global Vision Classifier
The word
– natural concatenation of independent PAWs, which provides a natural segmentation
The features
– are numerous and varied
– require normalization of the image before extraction
The approach
– reduces the vocabulary, as only the PAWs are considered
Important to find features
![Page 55: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/55.jpg)
55
Semi-Global-Based VC
Examples
Planar HMM: Ben Amara, Belaïd, Ellouze [1996]
For the main: band width
• observation P of the S HMM
• a specific function (normal density) of the duration
For the secondary: band description
• list of B&W segments in each line of the band
– Morphology of each PAW
– 99.84% for 33168 samples, 100 PAWs
![Page 56: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/56.jpg)
56
Semi-Global-Based VC
Examples: town name recognition
Burrow [2000]
– Method
• trace the lines making up the town name and use these as a representation
– Features
• Vector angles + average length +…
– Results
• ED (converted into pseudo-probabilities) between the test feature vectors and all those in the training set
• Recognition rate: 74%
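The slide does not say how the Euclidean distances are converted into pseudo-probabilities. One common choice, assumed here purely for illustration, is to weight each class by exp(−d) of its nearest training vector and normalize the weights so they sum to one. The function name `pseudo_probabilities` and the data layout are hypothetical.

```python
import math


def pseudo_probabilities(test_vector, training_set):
    """Turn Euclidean distances to training vectors into pseudo-probabilities.

    training_set: list of (label, feature_vector) pairs.
    The exp(-d) weighting is an assumption, not the formula of the cited work.
    """
    nearest = {}
    for label, vector in training_set:
        d = math.dist(test_vector, vector)  # Euclidean distance (Python 3.8+)
        nearest[label] = min(d, nearest.get(label, float("inf")))
    weights = {label: math.exp(-d) for label, d in nearest.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}
```

The closest class receives the largest value, and the outputs can be combined with other probabilistic scores even though they are not true probabilities.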
![Page 57: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/57.jpg)
57
SGBVC
Synthesis
The related works
– similar to those for GBVC
– some are reported on PAWs
The proximity with IA is limited
– only the feature and PAW levels are considered
The adaptation of the semi-global BVC to Arabic
– fits well but is limited to PAWs
– fits better if a procedure for gathering PAWs is possible
![Page 58: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/58.jpg)
58
IA architecture for Semi-Global VC
[Figure: network levels labeled input, feature, letter, PAW]
![Page 59: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/59.jpg)
59
Methodologies
III Local-Based Vision Classifier
The word
– regarded as a list of letters or smaller entities
The features
– should be located precisely, unlike in the other approaches, where flexibility is tolerated
The approach
– should gather and compare these entities to identify the word
The interest
– can cope with large vocabularies
![Page 60: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/60.jpg)
60
LBVC
Example
Multi-level handwritten word recognition for Tunisian city names, Miled [1997]
![Page 61: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/61.jpg)
61
LBVC
The strategy: 2 perceptive levels
First perceptive level:
– takes a global view by extracting visual indices through tracing and grapheme extraction
– This global information is extracted in the main zones: (b) diacritics; (c) baseline and middle-zone characters
![Page 62: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/62.jpg)
62
LBVC
Example
Then, visual indices are extracted by tracing
![Page 63: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/63.jpg)
63
LBVC
Example
Finally, Markovian modeling is applied to the list of visual indices
Recognition: 58.9% (top 1) to 86.8% (top 10)
![Page 64: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/64.jpg)
64
LBVC
Example
The 2nd perceptive level takes an analytical approach by extracting finer features: graphemes
18 classes
• 1. A: alef
• 2. B – D: graphemes with ascenders
• 3. E – H: graphemes with both ascenders and descenders
• 4. I – M: graphemes with descenders
• 5. N – R: graphemes within the middle zone
Recognition: 69.68% (top 1) to 91.66% (top 10)
![Page 65: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/65.jpg)
65
Local-Based Vision Classifier
Example
Finally, the 3rd level performs a pseudo-analytical modeling and recognition of PAWs and words
37 words: 80.11% (Top 1), 90.79% (Top 5)
![Page 66: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/66.jpg)
66
LBVC
Synthesis
The related works
– give good results, showing that the analytic approach can perform well
– point out the drawbacks of over- and under-segmentation
• As letters or segments are recognized independently, any error can perturb the whole recognition process
The proximity with IA is far
– The word superiority effect (WSE) is not taken into account because
• there is no global vision of the word, which is seen only as a sum of small parts
![Page 67: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/67.jpg)
67
Methodologies
IV Hybrid-Level Classifier
The word
– regarded as a whole, both globally and in detail
The features
– correspond to precise locations, reinforced according to the level of detail needed
The approach
– combines different strategies to come closer to human reading:
• the analysis must be global for a good synthesis of the information
• while being based on local information suitable for bringing out this information
![Page 68: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/68.jpg)
68
Hybrid-Level Classifier
Examples
NSHP-HMM [Choisy & Belaïd 02]:
– a random field drawing its observations directly from the image
– an HMM taking into account the column observations
[Figure: pixel $X_{ij}$ and its conditioning neighborhood $X_{\Theta_{ij}}$]
![Page 69: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/69.jpg)
69
Hybrid-Level Classifier
Examples
Analytical aspect:
Local-global aspect:
![Page 70: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/70.jpg)
70
Hybrid-Level Classifier
Examples
NSHP-HMM [Vajda & Belaïd 06]:
– Combination of structural and pixel information
![Page 71: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/71.jpg)
71
Hybrid-Level Classifier
Synthesis
The related works
– seem efficient
– IA seen as a meta-model bringing together models working at different visual levels: global, local, semi-global
The proximity with IA is close
– if we add the PAW level
– it combines different levels, as proposed in IA
The interest
– to do as much as possible without segmentation
– if needed, a segmentation can be performed, guided by the context
![Page 72: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/72.jpg)
72
Outline
1. Introduction
2. Writing characteristics
3. Issues of Recognition
4. Language Processing
![Page 73: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/73.jpg)
73
NLP
The number of effective Arabic words exceeds 60 billion!
– due to its morphological complexity [K. Darwish 02]
– which makes their automatic processing unrealistic
– and handicaps dictionary building, IR, automatic spelling…
Simplifying their patterns becomes mandatory for their processing
One solution is to turn towards morphological analysis and word stemming
![Page 74: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/74.jpg)
74
NLP
Many studies
– highlight the richness and the stability of Arabic in terms of the morpho-phonology peculiar to this language [A. Ben Hamadou 93], [S. Kanoun 02], [W. Kammoun 04], [M. Cheriet 06]
Questions
– Importance of the kind of linguistic knowledge
– Most appropriate location for its incorporation
![Page 75: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/75.jpg)
75
NLP
Most of them confirm
– The morphological structure of Arabic
• can be analyzed in terms of consonantal roots, considered as independent morphological units
Tri-literal roots, the most common of them
– [Watson 06]
• give rise to up to 15 verbal forms or stems, one basic and the rest derived
– [Ben Hamadou 93]
• an average of 80 currently used words derive from a given root
– [Kanoun 02]
• 808 healthy tri-consonant roots give a lexicon of 98,413 words
![Page 76: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/76.jpg)
76
Radical
Word decomposition
An Arabic word is decomposable (i.e. it derives from a root) (school: مدرسة) or not (doctor: دڪتور)
A decomposable word is composed of morphemes: prefix, radical and suffix
The radical (or the verbal core) is
– the derivation of a root according to a given scheme by introducing “access” letters: ا, م
A root is either
– tri-consonant (three letters): كتب
– quadri-consonant (four letters): دحرج
– healthy (جرح) or non-healthy (قال contains at least one vowel)
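The derivation of a radical from a root and a scheme can be sketched with the classical pattern notation, in which ف, ع and ل stand for the first, second and third root consonants. This is a deliberately naive illustration, not the method used in the presented system: the function name `apply_scheme` is hypothetical, only tri-consonant roots are handled, and a scheme whose access letters themselves include ف, ع or ل would be substituted incorrectly.

```python
# Placeholder letters of the pattern notation:
# ف = 1st root consonant, ع = 2nd, ل = 3rd.
PLACEHOLDERS = "فعل"


def apply_scheme(root, scheme):
    """Insert the three consonants of a root into a scheme (pattern)."""
    if len(root) != 3:
        raise ValueError("this sketch only handles tri-consonant roots")
    slots = dict(zip(PLACEHOLDERS, root))
    # Placeholder letters are replaced; access letters are kept as they are.
    return "".join(slots.get(ch, ch) for ch in scheme)
```

For example, the root كتب combined with the scheme فاعل gives كاتب (writer), and with مفعول it gives مكتوب (written).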
![Page 77: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/77.jpg)
77
Arabic Morpho-phonological Concepts
Schemes can number up to 70
– تفاعل, فعل, مفعال, مفاعل, استفعال, مفعول, افتعال, منفعل
Scheme classes are:
– Verb: “كتب” / “رحل”
– Agent noun: “كاتب” / “راحل”
– Accentuated agent noun: “رحّال”
– Patient noun: “مكتوب”
– Machine noun: “كتاب”
![Page 78: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/78.jpg)
78
The approach: Transparent Neural Network
Easy to train:
– Decomposable into 3 mono-layers
– Training is rapid
But it does not allow too many outputs
![Page 79: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/79.jpg)
79
To process a wide vocabulary: several improvements
First craftiness
– Consider the word as the conjugation of a root according to a given scheme
– Separate the outputs into roots and schemes
• For 8000 words that derive from 100 roots, the maximum number of schemes is 1400
8000-size problem → 1500-size problem (still high!)
[Figure: word → TNN → 100 roots + 1400 conjugated schemes]
![Page 80: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/80.jpg)
80
Second craftiness
Consider the scheme as a brief scheme plus a set of conjugation elements (e.g. يتفاعلون is defined by the brief, non-conjugated scheme تفاعل and its conjugation elements)
– The number of brief schemes is around 75
– The number of conjugation elements is 12 (tense, gender, person, definition)
87 neurons represent 1400 conjugated schemes
8000 → 1500 → 187 (100 roots + 75 schemes + 12 conjugation elements)
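The reduction of the output layer can be checked with a few lines of arithmetic. The figures are those given on the slides; the variable names are only illustrative.

```python
# Output-layer sizes quoted on the slides.
words = 8000                 # one output per word (initial problem)
roots = 100
conjugated_schemes = 1400
brief_schemes = 75
conjugation_elements = 12

# First step: separate outputs for roots and conjugated schemes.
after_first = roots + conjugated_schemes
# Second step: brief schemes + conjugation elements replace conjugated schemes.
scheme_neurons = brief_schemes + conjugation_elements
after_second = roots + scheme_neurons

print(words, "->", after_first, "->", after_second)
```

The 1400 conjugated schemes are thus covered by 87 scheme-related outputs, which is where most of the saving comes from.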
![Page 81: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/81.jpg)
81
Third craftiness
Root and scheme trainings are independent
– These two trainings do not require the same information
• The information about a word's PAWs is:
– useless for the training of its root
  كافح and كفاح share the root كفح
– useful for the training of its scheme
  كافح: scheme فاعل; كفاح: scheme فعال
![Page 82: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/82.jpg)
Third trick
Separate the two trainings, and so lighten them, by splitting the TNN into two models:
– Word → TNN_R: 100 outputs
– Word → TNN_S: 87 outputs
These sizes are now practicable
![Page 83: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/83.jpg)
TNN_R: three-layer network
– Learns to focus on root letters and to ignore the access letters reserved for schemes; trains roots from structural primitives
Neuro-Linguistic approach
[Figure: TNN_R as three layers, Primitives (70) → Letters (117) → Roots (100), with weighted connections. Example: the primitives PD, QM, HF, RD, RF lead to the root بعد]
![Page 84: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/84.jpg)
TNN_S: four-layer network
– Learns schemes from structural primitives: ignores root letters and focuses on access letters (prefix, suffix…)
– PAWs of Arabic schemes: further reduction
Neuro-Linguistic approach
[Figure: TNN_S with layers labeled Primitives, Letters, PAWs of Schemes, and Schemes, plus Conjugation elements (10). Example: word تڪاثر, primitives PD + HM + HF + JF, access letters ت and ا, candidate schemes تفاعل and متفاعل, conjugation elements singular, accomplished, masculine (feminine also shown)]
![Page 85: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/85.jpg)
TNN_R training: words containing the root
– The information to be trained corresponds to two independent cases:
 • letter constitution / root formation: letter location and sister letters
 • and there is no local error to treat
– so we separate it into two Mono-LPs: 1 & 2
Neuro-Linguistic approach
[Figure: training words أتصرف, انصرف, تصرف, يتصرف; inputs: structural features, and letter position & sister letters]
![Page 86: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/86.jpg)
TNN_S training: same corpus as for TNN_R
In the same way, split into 3 sub-networks:
Neuro-Linguistic approach
[Figure: training words أتصرف, انصرف, تصرف, يتصرف feeding the sub-networks Mono-LP1, Mono-LP3 and Mono-LP4; inputs include letter position & sisters; some outputs are fixed manually to indicate those that should be activated]
![Page 87: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/87.jpg)
[Figure: perceptive cycles for the input primitives HI PD BM JrF BPI. TNN_R candidate roots: صرف, سرق, حرف, صرخ. TNN_S candidate schemes: تفاعل, انفعل. Candidate words: انصرف, انحرف, with primitive sequences HI PD BM JrF BPI and HI PD RM JrF BPI. Candidates marked X are rejected by linguistic restriction; recognized word: انصرف]
Recognition: perceptive cycles + linguistic restriction
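One way to picture the linguistic restriction is as a filter over the candidates proposed by the two networks: a (root, scheme) pair is kept only if the lexicon contains a word built from it. The Python sketch below assumes that each network returns scored candidates and that the lexicon is a dictionary keyed by (root, scheme). The scores are made up, and the product scoring rule and the function name `restrict` are illustrative assumptions. The perceptive cycles themselves are not modeled.

```python
def restrict(root_scores, scheme_scores, lexicon, top_k=4):
    """Rank words whose (root, scheme) pair exists in the lexicon.

    root_scores / scheme_scores: dicts mapping a candidate to a score.
    lexicon: dict mapping (root, scheme) to a surface word.
    """
    top_roots = sorted(root_scores, key=root_scores.get, reverse=True)[:top_k]
    top_schemes = sorted(scheme_scores, key=scheme_scores.get, reverse=True)[:top_k]
    candidates = []
    for root in top_roots:
        for scheme in top_schemes:
            word = lexicon.get((root, scheme))
            if word is not None:  # pairs missing from the lexicon are dropped
                candidates.append((root_scores[root] * scheme_scores[scheme], word))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [word for _, word in candidates]


# Candidate roots and words as on the slide; scores are invented for the example.
root_scores = {"صرف": 0.6, "حرف": 0.3, "سرق": 0.05, "صرخ": 0.05}
scheme_scores = {"انفعل": 0.7, "تفاعل": 0.3}
lexicon = {("صرف", "انفعل"): "انصرف", ("حرف", "انفعل"): "انحرف"}

ranked = restrict(root_scores, scheme_scores, lexicon)
```

With these invented scores, combinations such as (سرق, تفاعل) never reach the ranking because no lexicon entry exists for them, which mirrors the crossed-out candidates in the figure.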
![Page 88: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/88.jpg)
Vocabulary size
– 1531 words
– 51 roots
– 25 brief schemes
Training base
– The same training corpus for TNN_R and TNN_S
 • TNN_R corpus size: 1531 words to train 50 roots
 • TNN_S corpus size: 1531 words to train 25 schemes
Test base
– Size: 765 = 255 words × 3 samples
Experiments
![Page 89: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/89.jpg)
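Columns: Word base | Writing | Approach | Vocab. size | Top1 | Top2 | Top3 | Top4
– [Kanoun 02]: Typeset, Analytic; Top1 69%, Top2 84%, Top3 95%
– [Kammoun 06]: Typeset, Analytic; Top1 81.3%, Top2 95.7%, Top3 96.4%
– [Touj & Ben Amara 07]: Handwritten, Analytic, vocabulary of 25 words; 88.7%
– [Ben Cheikh et al. 08]: Typeset/Handwritten, Pseudo-global, vocabulary of 1531 words; TNN_R: Top1 80.7%, Top2 89.4%, Top3 91.9%; TNN_S: 93%
[Not attributable to a row in this extraction: vocabulary sizes 1423 and 545; Top4 scores 92%, 99.7% and 97%]
Comparison with approaches dedicated to wide vocabularies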
![Page 90: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/90.jpg)
Conclusion (1)
Neural model + linguistic knowledge
– Arabic writing recognition with a wide vocabulary
– Knowledge: Arabic morphological analysis
Favors the recognition of words that have never been learned
– It is only necessary that the word's root and scheme have already been learned via other words
Conclusion
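The condition for handling an unseen word can be written as a simple coverage check. The Python sketch below treats the training set as a list of (root, scheme) pairs and tests whether a new pair is covered by parts seen in other words. The function name and the sample pairs are hypothetical, and the check says nothing about recognition accuracy.

```python
def is_covered(root, scheme, training_words):
    """True if the root and the scheme each occur in training, possibly in different words."""
    seen_roots = {r for r, _ in training_words}
    seen_schemes = {s for _, s in training_words}
    return root in seen_roots and scheme in seen_schemes


# Hypothetical training set of (root, scheme) pairs; the pair tested below is absent from it.
training_words = [("كثر", "أفعل"), ("قرب", "تفاعل"), ("دخل", "تفاعل")]

covered = is_covered("كثر", "تفاعل", training_words)      # root and scheme seen separately
not_covered = is_covered("صرف", "تفاعل", training_words)  # root never seen
```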
![Page 91: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/91.jpg)
The words «أكثر», «كثير», «كثرة» and «تكثر» take part in the training of the root «كثر» (in addition to their schemes)
The words «تعانق», «تقارب», «تداخل» and «تماسك» take part in the training of the scheme «تفاعل» (in addition to their roots)
Hence, when recognizing the word «تكاثر», our model should be able to recognize the root «كثر» and the scheme «تفاعل»
Example
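The composition of a root with a scheme can be sketched as a letter substitution: the placeholder letters ف, ع and ل of the scheme are replaced by the three root letters. The Python function below is a minimal illustration under that assumption (triliteral roots, no phonological adjustments, and no scheme whose affixes contain a literal ف, ع or ل). Its name is hypothetical, and it is not the recognition model itself.

```python
def apply_scheme(root, scheme):
    """Fill the ف / ع / ل slots of a scheme with the letters of a triliteral root."""
    if len(root) != 3:
        raise ValueError("this sketch only handles triliteral roots")
    slots = dict(zip("فعل", root))  # ف -> 1st, ع -> 2nd, ل -> 3rd root letter
    # Letters of the scheme that are not slots (prefixes, infixes) are copied unchanged.
    return "".join(slots.get(ch, ch) for ch in scheme)


word = apply_scheme("كثر", "تفاعل")
```

Here `word` is the surface form obtained by combining the root and scheme of the example, even though that combination need not appear in the training corpus.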
![Page 92: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/92.jpg)
The improvements will continue:
– Knowledge
 • Considering other aspects of Arabic morphology: other kinds of roots, derivations…
– Recognition stage
 • More linguistic restriction in the perceptive cycles
– Database
 • Working on a more realistic vocabulary by further enlarging its size
Perspectives
![Page 93: Abdelwaheb BELAID](https://reader031.vdocument.in/reader031/viewer/2022030320/586dff8c1a28ab21638c074b/html5/thumbnails/93.jpg)
Thank you