Document image analysis and recognition
Songtao Huang Supervisor: Dr. Ahmadi
Electrical and Computer Engineering, University of Windsor
The Second Ph.D Seminar
Procedure of document image analysis
Post-Processing
Pre-processing
Character recognition or Object Recognition
Document acquisition
Binarization
Page Segmentation (Layout Analysis)
Overview
1. Challenges of binarization.
2. Existing binarization algorithms.
3. Proposed HMM based binarization method.
4. Edge based binarization algorithm.
5. Future work: 2D HMM based OCR.
Binarization
Convert gray-scale images into binary images for further processing.
Challenges of binarization (1)
Degraded image caused by non-uniform illumination
Challenges of binarization (2)
Image with low contrast and stroke dependent noise
Challenges of binarization (3)
Image with variable background intensity
Conventional Binarization Algorithms
1. Histogram shape information[1][2][3].
2. Histogram entropy information[4][5].
3. Clustering-based methods[6].
4. Thresholding based on attribute similarity[7-12].
5. Spatial information[13].
6. Local adaptive thresholding[14-17].
Histogram shape information
The peaks, valleys and curvatures of the smoothed histogram are analyzed.
Convex hull thresholding[1].
Peak-and-valley thresholding[2].
Shape-modeling thresholding[3].
[Figure: Original Histogram]
Histogram entropy information
Entropy-based methods result in algorithms that use the entropy of the foreground and background regions, the cross-entropy between the original and binarized image, etc.
1) Entropic thresholding[4].
2) Cross-entropic thresholding[5].
$$E_j = \sum_{i=0}^{j} y_i \log y_i$$
$$E_{total} = \log A_j - \frac{E_j}{A_j} + \log (A_n - A_j) - \frac{E_n - E_j}{A_n - A_j}$$
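A minimal sketch of entropic thresholding, for illustration only. It assumes that y_i is the normalized gray-level histogram, that A_j is its cumulative sum up to level j, and that the threshold is the level j that maximizes E_total. None of these is stated on the slide, and the function name is hypothetical.

```python
import numpy as np

def entropic_threshold(gray):
    """Gray level j that maximizes E_total(j) (assumed selection rule)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    y = hist / hist.sum()                    # assumed: y_i = normalized histogram
    safe_y = np.where(y > 0, y, 1.0)         # avoids log(0); y*log(1) = 0 anyway
    E = np.cumsum(y * np.log(safe_y))        # E_j = sum_{i<=j} y_i log y_i
    A = np.cumsum(y)                         # assumed: A_j = sum_{i<=j} y_i
    best_j, best_val = 0, -np.inf
    for j in range(255):
        a_low, a_high = A[j], A[-1] - A[j]
        if a_low <= 0 or a_high <= 0:
            continue                         # one class would be empty
        total = (np.log(a_low) - E[j] / a_low
                 + np.log(a_high) - (E[-1] - E[j]) / a_high)
        if total > best_val:
            best_j, best_val = j, total
    return best_j
```

Pixels with gray level above the returned value would then be taken as the brighter class.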
Clustering-based methods[6].
The gray-level samples are clustered into two parts, background and foreground (object), or alternatively are modeled as a mixture of two Gaussians[6].
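As a hedged illustration of the clustering idea, the sketch below splits the gray levels into two clusters by iterating a two-means update on the threshold. This is a generic scheme chosen for the example, not necessarily the method of [6]; the function name and the stopping tolerance are assumptions.

```python
import numpy as np

def two_means_threshold(gray, max_iter=100, tol=0.5):
    """Threshold halfway between the means of the two gray-level clusters."""
    values = gray.ravel().astype(float)
    t = values.mean()                         # initial guess: global mean
    for _ in range(max_iter):
        low, high = values[values <= t], values[values > t]
        if low.size == 0 or high.size == 0:
            break                             # only one cluster present
        new_t = 0.5 * (low.mean() + high.mean())
        converged = abs(new_t - t) < tol
        t = new_t
        if converged:
            break
    return t
```

A binary image would follow as `gray > two_means_threshold(gray)`, with True marking the brighter cluster.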
Thresholding Based on Attribute Similarity
These algorithms select the threshold value based on some attribute quality or similarity measure between the original image and the binarized version of the image.
Moment preserving thresholding[7].
Edge field matching thresholding[8].
Fuzzy similarity thresholding[9][10].
Topological stable-state thresholding[11].
Maximum information thresholding[12].
Spatial Thresholding Methods
This class of algorithms utilizes not only the gray value distribution but also the dependency of pixels in a neighborhood, for example in the form of context probabilities, correlation functions, co-occurrence probabilities, local linear dependence models of pixels, 2-D entropy, etc.
Cooccurrence thresholding methods[13].
Local Adaptive Thresholding
A threshold is calculated at each pixel, which depends on some local statistics like range, variance, or surface-fitting parameters of the pixel neighborhood.
Local variance methods[14].
Local contrast methods[15].
Center-surround schemes[16].
Surface-fitting thresholding[17].
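To make the per-pixel threshold concrete, here is a small sketch in the spirit of local variance methods: the threshold at each pixel is the local mean plus k times the local standard deviation. The window size, the value of k and the function name are illustrative assumptions, not values taken from [14].

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_threshold(gray, window=15, k=-0.2):
    """Boolean mask, True where a pixel is at or above its local threshold.

    `window` must be odd so that the output has the same shape as `gray`.
    """
    pad = window // 2
    padded = np.pad(gray.astype(float), pad, mode="reflect")
    windows = sliding_window_view(padded, (window, window))
    mean = windows.mean(axis=(-2, -1))        # local mean per pixel
    std = windows.std(axis=(-2, -1))          # local standard deviation
    return gray >= mean + k * std             # True = background (bright)
```

Because the statistics are recomputed around every pixel, the threshold can follow non-uniform illumination in a way a single global value cannot.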
Proposal 1: HMM based binarization algorithm
[Figure: pixels labeled A, B and C]
Neighborhood of different kinds of pixels (A), (B), (C)
[Figures: neighborhoods of pixels A, B and C]
Feature extraction
Seven elements in vertical direction feature vector
$$v_1(0) = \frac{P(-1,0) + P(-2,0) + P(-3,0)}{3} - P(0,0)$$
$$v_1(1) = \frac{P(-1,0) + P(-2,0)}{2} - P(0,0)$$
$$v_1(2) = P(-1,0) - P(0,0)$$
$$v_1(3) = P(0,0)$$
$$v_1(4) = P(1,0) - P(0,0)$$
$$v_1(5) = \frac{P(1,0) + P(2,0)}{2} - P(0,0)$$
$$v_1(6) = \frac{P(1,0) + P(2,0) + P(3,0)}{3} - P(0,0)$$
Four direction feature vectors
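$$v_1(i) = \frac{1}{|i-3|} \sum_{j=1\ \mathrm{or}\ -1}^{i-3} P(0,j) - P(0,0), \quad 0 \le i \le 6,\ i \ne 3$$
$$v_2(i) = \frac{1}{|i-3|} \sum_{j=1\ \mathrm{or}\ -1}^{i-3} P(j,j) - P(0,0), \quad 0 \le i \le 6,\ i \ne 3$$
$$v_3(i) = \frac{1}{|i-3|} \sum_{j=1\ \mathrm{or}\ -1}^{i-3} P(j,0) - P(0,0), \quad 0 \le i \le 6,\ i \ne 3$$
$$v_4(i) = \frac{1}{|i-3|} \sum_{j=1\ \mathrm{or}\ -1}^{i-3} P(j,-j) - P(0,0), \quad 0 \le i \le 6,\ i \ne 3$$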
Feature vectors
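$$\begin{bmatrix}
v_1[0] & v_1[1] & v_1[2] & v_1[3] & v_1[4] & v_1[5] & v_1[6] \\
v_2[0] & v_2[1] & v_2[2] & v_2[3] & v_2[4] & v_2[5] & v_2[6] \\
v_3[0] & v_3[1] & v_3[2] & v_3[3] & v_3[4] & v_3[5] & v_3[6] \\
v_4[0] & v_4[1] & v_4[2] & v_4[3] & v_4[4] & v_4[5] & v_4[6]
\end{bmatrix}
=
\begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \end{bmatrix}$$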
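A rough sketch of how the 4 x 7 feature matrix of one pixel could be computed. It assumes that P(a, b) is the gray value at offset (a, b) from the pixel, mapped here to `image[y + b, x + a]`, that the four directions step along P(0,j), P(j,j), P(j,0) and P(j,-j), and that element 3 of each row holds P(0,0). The function name and the coordinate mapping are assumptions made for this example.

```python
import numpy as np

# Unit steps (dx, dy) of the four directions: P(0,j), P(j,j), P(j,0), P(j,-j).
DIRECTIONS = [(0, 1), (1, 1), (1, 0), (1, -1)]

def pixel_features(image, x, y):
    """4x7 feature matrix of pixel (x, y); needs a 3-pixel margin around it."""
    img = image.astype(float)
    center = img[y, x]                        # P(0,0)
    V = np.zeros((4, 7))
    for d, (dx, dy) in enumerate(DIRECTIONS):
        for i in range(7):
            n = i - 3
            if n == 0:
                V[d, i] = center              # v_d(3) = P(0,0)
                continue
            step = 1 if n > 0 else -1         # j runs 1..n or -1..n
            samples = [img[y + j * dy, x + j * dx]
                       for j in range(step, n + step, step)]
            V[d, i] = np.mean(samples) - center
    return V
```

Each row is one direction vector V_d, so a pixel inside a stroke and a pixel on flat background give visibly different profiles.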
Quantization
Main issues using HMMs:
• Evaluation problem. Given the HMM M=(A, B, π) and the observation sequence O=o1 o2 ... oK, calculate the probability that model M has generated sequence O.
• Decoding problem. Given the HMM M=(A, B, π) and the observation sequence O=o1 o2 ... oK, calculate the most likely sequence of hidden states si that produced this observation sequence O.
• Learning problem. Given some training observation sequences O=o1 o2 ... oK and the general structure of the HMM (numbers of hidden and visible states), determine the HMM parameters M=(A, B, π) that best fit the training data. O=o1...oK denotes a sequence of observations ok∈{v1,…,vM}.
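The evaluation problem is usually solved with the forward algorithm; a minimal sketch for a discrete HMM follows. It assumes A is the state transition matrix, B the emission matrix (rows are states, columns are observation symbols) and pi the initial state distribution. The function name is hypothetical.

```python
import numpy as np

def forward_probability(A, B, pi, obs):
    """P(O | M) for a discrete HMM M = (A, B, pi), by the forward algorithm."""
    alpha = pi * B[:, obs[0]]                 # initialization with o_1
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]         # propagate, then emit o_k
    return float(alpha.sum())
```

With one model trained per class, the same observation sequence can be scored under each model and the larger probability decides the label.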
HMM based binarization
Feature vector
Hidden Markov Model of foreground
Number of observations is 10. Number of states is 3.
Hidden Markov Model of background
Comparison of the probabilities
Flow of training procedure
Pixels extraction
K-means to obtain 10 central vectors
Pixels quantization
Training HMM to obtain the parameters in the models
Input vectors [0000] to [9999]
Save the attributes of the vectors [0000] to [9999]
Classification through HMM
By comparing the distance from each vector to the ten cluster centers acquired in the training step, we derive the observation sequence for every pixel. Since the attribute of each pixel can be found in the look-up table saved in the reference set at the training step, the recognition result is obtained with minimal time cost in this stage.
Feature extraction
Look up the table of reference
Vector quantization
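The classification flow can be sketched as follows, under some assumptions: each of the four direction vectors is mapped to the nearest of the ten cluster centers, the four symbols are read as a four-digit code between 0000 and 9999, and that code indexes a table of attributes saved during training. The digit order, the array layouts and the function names are hypothetical.

```python
import numpy as np

def quantize(V, centers):
    """Index of the nearest center (10x7) for each row of V (4x7)."""
    dists = np.linalg.norm(V[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def classify_pixel(V, centers, lookup):
    """Attribute of a pixel, read from a 10000-entry look-up table."""
    s = quantize(V, centers)
    # Assumed digit order: direction 1 is the most significant digit.
    code = int(s[0]) * 1000 + int(s[1]) * 100 + int(s[2]) * 10 + int(s[3])
    return lookup[code]
```

Since the table covers every possible code, classifying a pixel at run time costs one quantization and one array access, which is why this stage is fast.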
Proposed binarization algorithm 1 (HMM based binarization algorithm)
In the first stage, a coarse global thresholding method is used to separate the bright part of the whole image from the foreground pixels, which have lower values.
In the second stage, the remaining unconfirmed pixels, which are assumed to be a mixture of foreground and part of the background, are fed into the HMM pixel classifier to get the attribute of each pixel.
A coarse global threshold.
HMM based classifier
$$Mean = \frac{\sum_{i=0}^{255} i \, h(i)}{\sum_{i=0}^{255} h(i)}$$
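A short sketch of the coarse first stage, assuming that h(i) is the gray-level histogram and that pixels brighter than Mean are confirmed as background while the rest stay unconfirmed for the second stage. The exact decision rule is not spelled out on the slide, so this split and the function name are assumptions.

```python
import numpy as np

def coarse_split(gray):
    """Histogram mean and a mask of pixels confirmed as bright background."""
    hist = np.bincount(gray.ravel(), minlength=256)   # h(i)
    levels = np.arange(256)
    mean = (levels * hist).sum() / hist.sum()         # sum i*h(i) / sum h(i)
    confirmed_background = gray > mean                # assumed decision rule
    return mean, confirmed_background
```

Only the pixels where the mask is False would need to go through the slower pixel classifier.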
[Figure: Original Histogram]
Simulation results
[Figure panels: Kittle[19], HMM, Local[21], Otsu[20]]
Simulation results
[Figure panels: Kittle[19], HMM, Local[21], Otsu[20]; one panel is marked FAIL]
Simulation results
[Figure panels: HMM, Kittle[19], Local[21], Otsu[20]]
Proposal 2: Edge based binarization algorithm
Proposed binarization algorithm 2 (Edge based binarization algorithm)
Step 1: Edge detection - Prewitt detector
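$$P[i,j] = \frac{\sum_{k=-1}^{1} I[i+k,\,j+1] - \sum_{k=-1}^{1} I[i+k,\,j-1]}{6}$$
$$Q[i,j] = \frac{\sum_{k=-1}^{1} I[i+1,\,j+k] - \sum_{k=-1}^{1} I[i-1,\,j+k]}{6}$$
$$M[i,j] = \sqrt{P[i,j]^2 + Q[i,j]^2}, \qquad O[i,j] = \arctan\left(\frac{P[i,j]}{Q[i,j]}\right)$$
HORIZONTAL ORIENTATIONAL KERNEL:
$$\begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{bmatrix}$$
VERTICAL ORIENTATIONAL KERNEL:
$$\begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix}$$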
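The Prewitt step can be sketched directly from the difference formulas for P and Q. The code below assumes that i is the first array index and j the second, leaves the one-pixel border at zero, and uses `arctan2` in place of a plain arctangent of the ratio to avoid division by zero. These choices and the function name are assumptions for illustration.

```python
import numpy as np

def prewitt(image):
    """Prewitt responses P, Q, magnitude M and orientation O (interior only)."""
    I = image.astype(float)
    h, w = I.shape
    P = np.zeros((h, w))
    Q = np.zeros((h, w))
    for k in (-1, 0, 1):
        # P[i,j] accumulates I[i+k, j+1] - I[i+k, j-1]
        P[1:-1, 1:-1] += I[1 + k:h - 1 + k, 2:] - I[1 + k:h - 1 + k, :-2]
        # Q[i,j] accumulates I[i+1, j+k] - I[i-1, j+k]
        Q[1:-1, 1:-1] += I[2:, 1 + k:w - 1 + k] - I[:-2, 1 + k:w - 1 + k]
    P /= 6.0
    Q /= 6.0
    M = np.hypot(P, Q)                        # sqrt(P^2 + Q^2)
    O = np.arctan2(P, Q)                      # stands in for arctan(P / Q)
    return P, Q, M, O
```

On an image that only varies along the second index, Q vanishes and M equals |P|.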
Thresholding for Edge detection
Here the simplest root mean square (RMS) value is used as the threshold, as shown below:
$$T_{edge} = 4 \times \frac{\sum_{i=1}^{width} \sum_{j=1}^{height} M(i,j)}{width \times height}$$
Gradient determination
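$$P(i,j):\ \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{bmatrix} \qquad Q(i,j):\ \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix}$$
$$R(i,j):\ \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & -1 \\ 0 & -1 & -1 \end{bmatrix} \qquad S(i,j):\ \begin{bmatrix} 0 & 1 & 1 \\ -1 & 0 & 1 \\ -1 & -1 & 0 \end{bmatrix}$$
$$G(i,j) = Max(|P(i,j)|, |Q(i,j)|, |R(i,j)|, |S(i,j)|)$$
Select the minimum and maximum values from the determined gradient direction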
Demo of pixels selection
155 133 122
123 122  66
100  79  67
|P(i,j)| = 164
|Q(i,j)| = 123
|R(i,j)| = 199
|S(i,j)| = 19
G(i,j) = |R(i,j)|
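The demo values can be reproduced with four 3 x 3 kernels applied to the window without normalization. The assignment of kernels to the labels P, Q, R and S below is the one that matches the numbers in the demo; the dictionary and function names are made up for this sketch.

```python
import numpy as np

KERNELS = {
    "P": np.array([[1, 1, 1], [0, 0, 0], [-1, -1, -1]]),
    "Q": np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]]),
    "R": np.array([[1, 1, 0], [1, 0, -1], [0, -1, -1]]),
    "S": np.array([[0, 1, 1], [-1, 0, 1], [-1, -1, 0]]),
}

def directional_gradient(window):
    """Absolute response of each kernel and the name of the strongest one."""
    responses = {name: abs(int((kernel * window).sum()))
                 for name, kernel in KERNELS.items()}
    best = max(responses, key=responses.get)  # direction that gives G(i,j)
    return responses, best

window = np.array([[155, 133, 122],
                   [123, 122, 66],
                   [100, 79, 67]])
responses, best = directional_gradient(window)
print(responses, best)  # {'P': 164, 'Q': 123, 'R': 199, 'S': 19} R
```

The strongest response picks the direction along which the minimum and maximum pixels are then selected.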
Foreground pixel Background pixel
Selected foreground pixels Selected background pixels
Histogram of selected pixels
The histogram of the original image
$$E_{back}(j) = \sum_{i=0}^{j} H_{back}(i), \quad 0 \le j \le 255$$
$$E_{fore}(j) = \sum_{i=j}^{255} H_{fore}(i), \quad 0 \le j \le 255$$
Determination of threshold
Then the general error can be calculated as:
$$E_{total}(i) = E_{fore}(i) + E_{back}(i)$$
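One way to turn the error curves into a threshold is to pick the gray level with the smallest total error. The sketch below assumes exactly that selection rule, which the slides do not state explicitly, and takes the selected foreground and background pixel values as integer arrays. The function name is hypothetical.

```python
import numpy as np

def min_error_threshold(fore_pixels, back_pixels):
    """Gray level that minimizes E_fore + E_back (assumed selection rule)."""
    h_fore = np.bincount(fore_pixels, minlength=256)
    h_back = np.bincount(back_pixels, minlength=256)
    e_back = np.cumsum(h_back)                # sum_{i=0}^{j} H_back(i)
    e_fore = np.cumsum(h_fore[::-1])[::-1]    # sum_{i=j}^{255} H_fore(i)
    e_total = e_fore + e_back
    return int(np.argmin(e_total))            # first level with minimal error
```

When the two selected pixel sets do not overlap, every level in the gap between them has zero error and the first such level is returned.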
Flow of edge based thresholding algorithm
Edge detection
Corner detection
Edge thresholding
Foreground and background pixels determination
Determination of threshold
Result
Enhanced edge based thresholding
Combination of Kittle and zoning methods.
Zoned edge information
My contributions and achievements
1) An HMM based binarization algorithm.
2) A new edge information based binarization algorithm.
3) Both algorithms perform well in comparison with the existing algorithms.
Future work: A new polar 2D HMM system.
• The structure of hidden states is chosen.
• Observations are feature vectors extracted from vertical slices.
Character recognition with 1D HMM[22]
Character recognition with 2D Pseudo HMM[22]