ict_danang_2014.pdf
TRANSCRIPT
-
Car License Plate Recognition Using Neural Network
Diep N. Phan, Dai V. Tran
Electronic & Telecommunication Engineering Department
University of Technology, University of Danang
Danang, Vietnam
[email protected], [email protected]
Tuan V. Pham
Center of Excellence
University of Technology, University of Danang
Danang, Vietnam
Abstract — Car license plate recognition is the extraction of car license plate information from an image or a sequence of images. In this paper, we present an improved four-step approach to car license plate recognition: preprocessing, plate localization, character segmentation and character recognition. The preprocessing stage improves the quality of the acquired image, which is a major factor in the success of car license plate recognition. The plate localization stage uses the projection method. The character segmentation algorithm uses the labeling method. For character type classification, the number of closed areas is used to separate characters into three types, and in the recognition stage three different feed-forward neural networks are trained to identify the characters of each type. Experimental results show that the proposed approach is robust to a variety of illumination conditions, view angles, sizes, and plate types under complex environments. The character recognition stage achieved 99.43% for high-quality images.
Keywords — Car license plate recognition; plate localization; character segmentation; character recognition; neural network; image projection; image labeling.
I. INTRODUCTION
Image processing techniques have been applied widely, from civilian devices to specialized equipment. Using image processing for license plate recognition helps to address part of the traffic congestion problem and to automate tasks related to the management of cars. Today, managing transport in general, and automobiles and motorcycles in particular, is extremely complex, and work such as detecting and sanctioning traffic violations or theft consumes a great deal of time and effort. This has created demand for automated systems that identify and manage motorized traffic. Such a system reduces the pressure on human resources for managing and controlling transport. This work also lays the groundwork for developing related solutions such as license plate recognition and document recognition.
License Plate Recognition (LPR) is a technology to extract the license number from vehicle images captured by a single camera or multiple cameras. It has various applications in traffic control, vehicle theft prevention, vehicle surveillance, parking lot access control, etc. An LPR system typically consists of three steps: plate extraction, character segmentation and character recognition.
The LPR system that extracts a license plate number from a given image can be composed of four stages (Fig. 1). The first stage acquires the plate image, removes noise and improves image quality [1-4]. The second stage extracts the license plate region from the image based on features such as the boundary, the color, or the existence of characters [3, 8]. Next, individual characters are segmented using connected component analysis, which is simple and straightforward [4, 5]. However, connected component analysis may fail to extract all the characters when they are joined or broken; hence, several morphological operations are utilized to improve the robustness of character segmentation. Lastly, a two-stage classifier is employed to recognize characters. The first stage categorizes plate characters into three types; the second stage consists of three different feed-forward neural networks trained to recognize the characters of each type [6-10]. The reason for this two-stage structure is to overcome the difficulties of training a single large neural network. Two training models are also proposed to improve the robustness of the neural networks: a clean model and a noisy model. In addition, three test scenarios are presented to evaluate the performance of the recognition stage.
The remainder of this paper is organized as follows. Section II describes preprocessing to improve the quality of the input image. Section III demonstrates how to extract the license plate region. Section IV describes the character segmentation method and Section V discusses the character recognition method. Section VI evaluates the proposed system, and Section VII concludes the paper and discusses future research.
Fig 1: Block diagram of License Plate Recognition (Input → Preprocessing → Plate Localization → Character Segmentation → Character Recognition; example output: 43A 03246)
II. PREPROCESSING
A. Convert RGB image or colormap to grayscale
In number plate detection, the image of a car plate may not always contain the same brightness and shades. Therefore, the given image has to be converted from RGB to gray form. RGB values are converted to grayscale values by forming a weighted sum of the R, G, and B components.
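As a minimal sketch of this conversion, the weighted sum can be implemented with the standard NTSC/ITU-R BT.601 luminance weights (the paper names the NTSC method later, in Section IV; the exact coefficients are the conventional ones and are an assumption here):

```python
import numpy as np

def rgb_to_gray(img):
    """Convert an RGB image (H x W x 3) to grayscale by a weighted
    sum of the R, G and B components, using the NTSC/BT.601 weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return img[..., :3] @ weights
```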
B. Noise filter and dilate an image
However, during this conversion from RGB to gray form, certain important details such as differences in color and lighter edges of objects may be lost [6]. The process of dilation helps to nullify such losses. Dilation improves the given image by filling holes, sharpening the edges of objects, joining broken lines and increasing the brightness of the image. Using dilation, noise within an image can also be removed. By making the edges sharper, the difference in gray value between neighboring pixels at the edge of an object is increased.
We can use a periodical convolution of the function f with specific types of matrices m to noise-filter and dilate an image:

f'(x, y) = Σ_{i=0}^{w_m−1} Σ_{j=0}^{h_m−1} f((x + i − ⌊w_m/2⌋) mod w, (y + j − ⌊h_m/2⌋) mod h) · m[i, j]    (1)

where w and h are the width and height of the image represented by the function f, and w_m and h_m are the dimensions of the convolution matrix m. Note: the expression m[i, j] represents the element in column i and row j of the matrix m.
Each image operation or filter is defined by a convolution matrix. The convolution matrix defines how a specific pixel is affected by its neighboring pixels in the process of convolution. Individual cells in the matrix represent the neighbors of the pixel situated in the center of the matrix. The pixel represented by the cell y in the destination image (Fig. 2) is affected by the pixels x, x1, …, x8 according to the formula:

y = m·x + Σ_{i=1}^{8} m_i · x_i    (2)
Fig 2: The pixel is affected by its neighbors according to the convolution matrix (input pixels x1–x8 surround the center pixel x; convolution coefficients m1–m8 surround the center coefficient m; y is the resulting output pixel).
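A short sketch of such a periodical (wrap-around) 3x3 convolution, expressed as the weighted sum of each pixel and its eight neighbours; the wrap-around indexing via `np.roll` follows the "periodical" convolution described in the text, while the specific test kernels are illustrative choices:

```python
import numpy as np

def periodic_convolve(f, m):
    """Apply a 3x3 convolution matrix m to image f with periodic
    (wrap-around) boundary handling: each output pixel is the
    weighted sum of the centre pixel and its eight neighbours."""
    out = np.zeros_like(f, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            # np.roll implements the periodic (modular) indexing
            out += m[dy + 1, dx + 1] * np.roll(np.roll(f, -dy, axis=0), -dx, axis=1)
    return out
```

With the identity kernel (1 in the centre, 0 elsewhere) the image is unchanged; with an averaging kernel of 1/9 everywhere, a constant image stays constant.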
Fig 3: Result after preprocessing.
III. LICENSE PLATE LOCALIZATION
After the series of convolution operations, we can detect the area of the number plate according to statistics of the snapshot. There are various methods of statistical analysis. One of them is the horizontal and vertical projection of an image onto the x and y axes. In the present work, we use a projection approach based on gray levels computed from vehicle images to localize significant license plate regions [6-8]. The advantage of this method is its very simple implementation.
Vertical image projection
Let an input image be defined by a discrete function f(x, y). Then, the vertical projection p_y of the function at a point y is the sum of all pixel magnitudes in the y-th row of the input image. We can mathematically define the vertical projection as:

p_y(y) = Σ_{x=0}^{w−1} f(x, y)    (3)

where 0 ≤ x < w, 0 ≤ y < h, and w and h are the dimensions of the image.
Fig 4: Vertical image projection.
Fig 5: Detected license plate.
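The projection step can be sketched in a few lines: sum each row of the (preprocessed) image, then keep the band of rows where the projection is large. The 50% threshold below is our illustrative choice, not a value from the paper:

```python
import numpy as np

def vertical_projection(f):
    """p_y(y) = sum of all pixel magnitudes in row y of image f (Eq. 3)."""
    return f.sum(axis=1)

def plate_band(f, frac=0.5):
    """Toy localisation: return the rows whose projection exceeds
    frac * max projection -- a crude stand-in for the statistical
    analysis described in the text (the threshold rule is ours)."""
    p = vertical_projection(f)
    return np.flatnonzero(p >= frac * p.max())
```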
IV. PLATE CHARACTER SEGMENTATION
In this section, we describe our proposed character segmentation approach, which is based on binary connected component detection and a processing chain for character identification using geometrical constraints of Vietnam license plates.
Character segmentation separates each letter or number so that it can subsequently be processed by optical character recognition (OCR).
A. Preprocessing
Fig.6 Block diagram of processing stage
The input image, as depicted in Fig. 7a, is initially processed to improve its quality. First, the input color images are transformed into grayscale images using the NTSC standard method.
The grayscale image is then converted into a binary image, as presented in Fig. 7b, using a local adaptive threshold method:
T(x, y) = (1/|W|) Σ_{(i, j) ∈ W(x, y)} f(i, j)    (4)

b(x, y) = { 1 if f(x, y) ≥ T(x, y); 0 otherwise }    (5)

where W(x, y) is a local window centered at the pixel (x, y) and |W| is the number of pixels in the window.
The tilted plate image has bad effects on the segmentation and recognition stages, as reported in [9]. Therefore, in our approach, the tilted plate is corrected by rotating it by an angle estimated with the Hough Transform [10]. The corrected plate is shown in Fig. 7d.
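A minimal sketch of the local adaptive thresholding described above, using the mean of a square neighbourhood as the local threshold; the window size and offset are illustrative choices, not values taken from the paper:

```python
import numpy as np

def adaptive_binarize(gray, win=15, offset=0.0):
    """Local adaptive thresholding: the threshold T(x, y) is the mean
    of a win x win neighbourhood (clipped at the borders), and a pixel
    becomes 1 where gray(x, y) >= T(x, y) - offset."""
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    r = win // 2
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        for x in range(w):
            window = gray[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
            out[y, x] = gray[y, x] >= window.mean() - offset
    return out
```

Unlike a global threshold, this keeps both the dark and the bright halves of an unevenly lit plate usable, because each pixel is compared only with its own neighbourhood.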
a. The color tilted plate image
b. The resulting plate image after being binarized
c. Plate image in Hough diagram.
d. The tilted plate image after being corrected
Fig. 7 Preprocessing plate
B. Segmentation
The purpose of this step is to find the individual characters on the plate. Pixel connectivity and projection profiles are two popular features for segmenting license plate characters [11]. In the proposed approach, the pixel connectivity feature is used because it is more robust to rotation than projection profiles. Fig. 6 shows the algorithm of the segmentation stage.
First, all connected regions are found and labeled using connected component analysis [12]. Then all regions whose bounding-box heights and areas are outliers are removed. Fig. 8 shows the intermediate results of this process.
a. All regions found and labeled using connected component analysis
b. All labels after coarse segmentation
c. All characters in the plate are segmented
Fig. 8 Character segmentation
The remaining objects, which are mostly characters, are resized into 64x32-pixel images.
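The labeling-and-filtering step above can be sketched with `scipy.ndimage`; the specific outlier rule (dropping regions much shorter than the tallest region) is our simplification of the paper's height/area criterion:

```python
import numpy as np
from scipy import ndimage

def segment_characters(binary, min_h_frac=0.3):
    """Label connected components of a binary plate image, drop
    regions whose bounding-box height is an outlier (here: shorter
    than min_h_frac of the tallest region), and return the remaining
    bounding boxes sorted left to right (reading order)."""
    labels, n = ndimage.label(binary)
    boxes = ndimage.find_objects(labels)
    heights = [b[0].stop - b[0].start for b in boxes]
    h_max = max(heights)
    kept = [b for b, h in zip(boxes, heights) if h >= min_h_frac * h_max]
    return sorted(kept, key=lambda b: b[1].start)
```

Each returned box is a pair of slices, so `binary[box]` crops one candidate character, which would then be resized to 64x32 for recognition.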
V. PLATE CHARACTER RECOGNITION
After the segmentation of elements, the final module in the license plate recognition process is character recognition. For recognition problems, the multi-layer perceptron (MLP) neural network plays an important role [13][14]. In this paper, we propose an improved method based on an MLP neural network trained with the back-propagation algorithm to recognize the characters and numbers in Vietnam license plates.
A. Character type classification
For recognition problems, MLP neural networks are among the most commonly used. In spite of their advantages, neural networks have some limitations. One crucial limitation is the difficulty of training a large neural network: training such a network requires a large dataset, which may not be easy to collect. To overcome this issue, the proposed classification process divides characters into three categories. Each category is then recognized using a small, separate neural network as shown in Fig. 9.
Fig.9 Block diagram of plate characters recognition stage
In character recognition, feature points are among the most useful features. To extract the feature points, the character image is first skeletonized, as shown in Fig. 10a,b. Then the number of its intersections, end-points and closed areas is counted (Fig. 11). In our experiments, the number of closed areas is robust for character type classification, while the other feature points are not good for this task. Using this feature, characters are pre-classified into three types: type 1 (one closed area), type 2 (two closed areas) and type 3 (three closed areas), as listed in Fig. 12.
a. Segmented character b. Skeletonized character
Fig. 10 The segmented character is skeletonized
a. Intersection b. End-point c. Closed area
Fig. 11 Three feature points
Fig. 12 All characters are divided into three types based on the closed-area feature
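One common way to count closed areas is to count the background regions enclosed by the character, i.e. background components that do not touch the image border; this is a standard technique and our own sketch, not necessarily the paper's exact procedure:

```python
import numpy as np
from scipy import ndimage

def count_closed_areas(char_bin):
    """Count enclosed background regions ("closed areas") of a binary
    character image: connected background components that do not touch
    the image border. E.g. an '8' shape has two, a '0' has one, and a
    '1' has none."""
    bg = (char_bin == 0)
    labels, n = ndimage.label(bg)
    # background labels touching the border belong to the outside
    border = set(labels[0, :]) | set(labels[-1, :]) | set(labels[:, 0]) | set(labels[:, -1])
    return len({lab for lab in range(1, n + 1) if lab not in border})
```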
B. Feature Extraction
Each character image has large dimension preventing it
from being use as the input feature. However, we can
use dimension reduction techniques to map data to a
lower dimensional space such that uninformative
variance in the data is discarded. There are many
dimension reduction techniques such as principal
components analysis (PCA), projection pursuit (PP),
principal curves (PC), self-organizing maps (SM). In the paper
proposed a hybrid method of principal components analysis
and local binary pattern (LBP). Firstly, PCA extracted the
global grayscale feature of the whole facial expression image
and reduced the data size at the same time. And LBP extracted
local neighbor texture feature of the mouth area, which
contributes most to facial expression recognition. Fusing the
global and local feature will be more effective for facial
expression recognition.
The PCA algorithm follows 6 steps:
Step 1: Given vectors x_1, x_2, …, x_m representing a set of sampled images.
Step 2: Compute the average vector μ = (1/m) Σ_{i=1}^{m} x_i.
Step 3: Stack the centered data into an m-by-n matrix X whose rows are the centered vectors (x_i − μ)^T.
Step 4: Compute the SVD of X: X = U S V^T.
Step 5: Keep the 26 rows of V^T corresponding to the largest singular values, u_1, …, u_26, as principal components.
Step 6: Project the images on these principal components to get 26-dimensional representations: y_i = (u_1^T (x_i − μ), u_2^T (x_i − μ), …, u_26^T (x_i − μ)).
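The six PCA steps can be sketched directly with an SVD (k = 26 as in the text; the matrix orientation, with one image per row, is our assumption):

```python
import numpy as np

def pca_project(X, k=26):
    """PCA via SVD: centre the stacked image vectors (one per row),
    take the SVD, keep the k principal axes with largest singular
    values, and project onto them to get k-dimensional features."""
    mu = X.mean(axis=0)                                # average vector
    Xc = X - mu                                        # centred data matrix
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # SVD step
    W = Vt[:k]                                         # top-k principal axes
    return Xc @ W.T, W, mu                             # k-D representations
```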
Local Binary Pattern
The LBP is a non-parametric operator which describes the local spatial structure of an image. At a given pixel position (x_c, y_c), LBP is defined as an ordered set of binary comparisons between the center pixel and its neighbors. The resulting LBP pattern at the pixel can be expressed as follows:

LBP(x_c, y_c) = Σ_{n=0}^{7} s(i_n − i_c) · 2^n

where i_c corresponds to the gray value of the center pixel (x_c, y_c), i_n to the gray values of the 8 surrounding pixels, and the function s(x) is defined as:

s(x) = { 1 if x ≥ 0; 0 if x < 0 }

Using the LBP operator, the whole image is transformed into an LBP map.
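A compact sketch of the 8-neighbour LBP operator (the bit ordering of the neighbours is our choice; only interior pixels are mapped, so the output shrinks by a one-pixel border):

```python
import numpy as np

def lbp_map(gray):
    """8-neighbour LBP: compare each interior pixel with its 8
    neighbours, s(i_n - i_c) = 1 when the neighbour is >= the centre,
    and pack the 8 comparison bits into one byte per pixel."""
    g = gray.astype(int)
    c = g[1:-1, 1:-1]                      # centre pixels
    out = np.zeros_like(c)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        out |= (nb >= c).astype(int) << bit
    return out
```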
LBP Histogram Sequence
We use local feature histograms to present the regional properties of the LBP patterns as follows: first, each LBP map is spatially divided into multiple non-overlapping regions, and a histogram h is computed from each region. The histograms are then concatenated into a histogram sequence H.

Here the histogram h of an image f(x, y) with gray levels in the range [0, L−1] is defined as:

h_i = Σ_{x,y} I{f(x, y) = i},  i = 0, 1, …, L−1

where i is the i-th gray level, h_i is the number of pixels in the image with gray level i, and

I{A} = { 1 if A is true; 0 otherwise }

Assume the whole map is divided into m regions; then the histogram of the r-th region can be expressed as h^(r) and the concatenated histogram sequence as H = (h^(1), h^(2), …, h^(m)).
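The region-wise histogram sequence can be sketched as follows; the 2x2 grid of regions is an illustrative choice, since the paper does not fix the number of regions here:

```python
import numpy as np

def lbp_histogram_sequence(lbp, regions=(2, 2), levels=256):
    """Divide the LBP map into a grid of non-overlapping regions,
    compute a histogram per region, and concatenate them into the
    sequence H = (h_1, ..., h_m)."""
    ry, rx = regions
    hs = []
    for by in np.array_split(np.arange(lbp.shape[0]), ry):
        for bx in np.array_split(np.arange(lbp.shape[1]), rx):
            block = lbp[np.ix_(by, bx)]
            h, _ = np.histogram(block, bins=levels, range=(0, levels))
            hs.append(h)
    return np.concatenate(hs)
```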
C. Neural network with back-propagation algorithm
In order to assign each digit signature to its corresponding ASCII representation, three feed-forward back-propagation neural networks (NNET) were designed, one assigned to Type 1, one to Type 2 and one to Type 3. The networks consist of one hidden and one output layer with log-sigmoid transfer functions, as shown in Fig. 13.
Fig. 13 Type 1: 62-50-25 NNET; Type 2: 62-50-8 NNET; Type 3: 60-50-2 NNET.
The Type 1 NNET receives a 62x1 input vector. It passes through the first log-sigmoid hidden layer, which contains 50 neurons, and finally enters the output log-sigmoid layer, which contains 25 neurons. The values of the output 25x1 vector correspond to the classes of Type 1 and lie between 0 and 1 (due to the log-sigmoid function). Each value can be seen as the probability that the input signature belongs to a specific Type 1 class. The final result is provided through a competitive transfer function which returns the index with the optimum value.
The Type 2 and Type 3 NNETs work identically to the Type 1 network, except that they use 8 and 2 neurons in the output layer, respectively.
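The Type 1 forward pass (62-50-25 with log-sigmoid layers and a competitive output) can be sketched as below; the weight matrices are placeholders, since in the paper they come from back-propagation training:

```python
import numpy as np

def logsig(z):
    """Log-sigmoid transfer function, mapping scores into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def recognise_type1(x, W1, b1, W2, b2):
    """Forward pass of the Type 1 network: a 62x1 input, a log-sigmoid
    hidden layer of 50 neurons, a log-sigmoid output layer of 25
    class scores, then a competitive (argmax) choice of the winner."""
    a1 = logsig(W1 @ x + b1)       # hidden layer, 50 units
    a2 = logsig(W2 @ a1 + b2)      # output layer, 25 scores in (0, 1)
    return int(np.argmax(a2)), a2  # competitive transfer: winning index
```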
Back-propagation algorithm. The five steps of training an MLP network with the back-propagation algorithm are as follows:

Step 1: Perform a feedforward pass, computing the activations for layers L2, L3, and so on up to the output layer L_{n_l}:
z^(l+1) = W^(l) a^(l) + b^(l)
a^(l+1) = f(z^(l+1))

Step 2: For each output unit i in layer n_l (the output layer), set:
δ_i^(n_l) = −(y_i − a_i^(n_l)) · f′(z_i^(n_l))

Step 3: For l = n_l − 1, n_l − 2, …, 2, for each node i in layer l, set:
δ_i^(l) = (Σ_j W_ji^(l) δ_j^(l+1)) · f′(z_i^(l))

Step 4: Compute the desired partial derivatives. For i = 1 to m:
a. Use back-propagation to compute ∇_{W^(l)} J(W, b; x^(i), y^(i)) = δ^(l+1) (a^(l))^T and ∇_{b^(l)} J(W, b; x^(i), y^(i)) = δ^(l+1).
b. Set ΔW^(l) := ΔW^(l) + ∇_{W^(l)} J(W, b; x^(i), y^(i)).
c. Set Δb^(l) := Δb^(l) + ∇_{b^(l)} J(W, b; x^(i), y^(i)).

Step 5: Update the parameters:
W^(l) = W^(l) − α [ (1/m) ΔW^(l) + λ W^(l) ]
b^(l) = b^(l) − α [ (1/m) Δb^(l) ]
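One full update (Steps 1–5 above) for a two-layer log-sigmoid network with squared-error cost can be sketched as follows; the layer sizes, learning rate and weight decay in the test are illustrative, not values from the paper:

```python
import numpy as np

def logsig(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(W1, b1, W2, b2, X, Y, alpha=0.2, lam=0.0):
    """One batch gradient-descent update. X holds one training example
    per column, Y the matching targets; f'(z) = f(z)(1 - f(z)) for the
    log-sigmoid. Parameters are updated in place and returned."""
    m = X.shape[1]
    # Step 1: feedforward
    Z2 = W1 @ X + b1[:, None];  A2 = logsig(Z2)
    Z3 = W2 @ A2 + b2[:, None]; A3 = logsig(Z3)
    # Step 2: output-layer error term
    d3 = -(Y - A3) * A3 * (1 - A3)
    # Step 3: back-propagate the error to the hidden layer
    d2 = (W2.T @ d3) * A2 * (1 - A2)
    # Step 4: accumulate partial derivatives over the batch
    gW2 = d3 @ A2.T; gb2 = d3.sum(axis=1)
    gW1 = d2 @ X.T;  gb1 = d2.sum(axis=1)
    # Step 5: update with learning rate alpha and weight decay lam
    W1 -= alpha * (gW1 / m + lam * W1); b1 -= alpha * gb1 / m
    W2 -= alpha * (gW2 / m + lam * W2); b2 -= alpha * gb2 / m
    return W1, b1, W2, b2
```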
VI. EVALUATION
A. Classification measure
In this paper, the performance of the proposed recognition algorithm is assessed via the true positive rate (TPR) and false positive rate (FPR), which are defined as follows:

TPR = TP / (TP + FN),  FPR = FP / (FP + TN)
where TP, FP, FN and TN are determined as follows:
True positives (TP): the number of A characters correctly recognized as character A.
False positives (FP): the number of non-A characters wrongly recognized as character A.
False negatives (FN): the number of A characters wrongly recognized as non-A characters.
True negatives (TN): the number of non-A characters correctly recognized as non-A characters.
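These definitions translate directly into code:

```python
def tpr_fpr(tp, fp, fn, tn):
    """True positive rate and false positive rate from the four
    confusion counts defined above:
    TPR = TP / (TP + FN), FPR = FP / (FP + TN)."""
    return tp / (tp + fn), fp / (fp + tn)
```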
B. License plate localization evaluation
Database of plate image input
To evaluate the performance of the proposed method, a database containing 127 plates has been built from plate images of Viet Nam collected on the Internet. The database is divided into 6 sets according to contrast, weather, position/angle of orientation, quantity, view and background conditions. The details of the results are described in Table I.
TABLE I. DETAILS OF RESULT

Test database            Amount  Segmentation  TPR (%)  FPR (%)
Total                      127       107        84.25    15.75
Contrast    Low             35        27        77.14    22.86
            High            95        79        85.87    14.13
Weather     Rainy           32        31        96.88     3.13
            Cloudy          93        78        83.87    16.13
            Sunny           25        23        92.00     8.00
Position/   Straight        60        55        91.67     8.33
Angle       Rotation        39        32        82.05    17.95
            Projection      17        11        64.71    35.29
Quantity    High           101        83        82.18    17.82
            Low             15        12        64.52    20.00
View        Front           95        86        90.53     9.47
            Back            31        20        64.52    35.48
Background  Simple          40        34        85.00    15.00
            Complex         28        20        71.43    28.57
C. Segmentation evaluation
Database of plate region
A database containing 341 plates has been built from plate images collected on the Internet and plates captured by our research team. The database is divided into 3 sets according to illumination, weather, angle of orientation and image quality conditions. The details of this database are described in Table II.
TABLE II. DATABASE OF PLATES DESCRIPTION

Test database  Quantity  Description
Set1           183       Clear data with good conditions: normal lighting, nice weather, no angle of orientation, black characters on white background, high quality.
Set2           67        Rotated plates with various angles of orientation and good lighting conditions.
Set3           91        Rotated and projected plates with large angles of orientation, plates with bad lighting conditions (too bright, too dark, night light), bad quality, blurred, noisy, small size.
Evaluation
The results of the segmentation step are shown in Table III. The performance on the three sets is 96.17%, 76.12% and 61.54% respectively.
TABLE III. THE RESULT OF SEGMENTATION STEP

       Total plates  Segmented  Ratio
Set1       183          176     96.17%
Set2        67           51     76.12%
Set3        91           56     61.54%
D. Character type classification
1) Character database
Due to the variety of image capturing conditions, which leads to differences in the size, brightness and contrast of the segmented characters, two training models are proposed, a clean model and a noisy model, to improve the robustness of the classifier. The clean model is trained on a dataset consisting of characters with good illumination conditions and no rotation. Each character in the training set has 20 samples. The noisy model is trained on a dataset which has 20 good samples and 30 noisy samples per character.
The test data is divided into 3 scenarios:
The Well Matched scenario (WM): the tested samples of each character are similar to the ones used for training.
The Medium Mismatched scenario (MM): this test set consists of samples that have moderate differences in lighting conditions, fonts and angles of orientation.
The Highly Mismatched scenario (HM): there are complete differences in fonts, angles of orientation and illumination conditions between the training and test samples of each character.
In this database, the clean training characters and WM test characters were extracted from Set 1 of the plate database mentioned above; MM test characters were extracted from Set 2, and HM test characters from Set 3.
Detailed description of the database with two training models is depicted in Table IV.
TABLE IV. CHARACTER DATABASE DESCRIPTION

              Train (samples/character)   Test (samples/character)
              Clear   Noisy               WM   MM   HM   ALL
Clean model     50      0                 20   20   20    60
Noisy model     50     30                 10   10   10    30
2) Evaluation
The test sets WM, MM, HM and ALL (WM+MM+HM) are used for testing both training models. First, the closed-area feature is used for categorizing the type of each character. The categorization results of the three subsets are shown in Table V (clean model) and Table VI (noisy model).
a) Clean model
The test sets of WM (700 chars), MM (700 chars), HM (700 chars) and ALL (2100 chars) are used to evaluate the clean model.
TABLE V. THE RESULT OF TESTING THE CLEAN MODEL
(rows: true type; columns: predicted type)

               WM              MM              HM              ALL
            T1   T2  T3     T1   T2  T3     T1   T2  T3     T1    T2   T3
Type1 (T1)  498   2   0    499    1   0    499    1   0    1496    4    0
Type2 (T2)    2 158   0      4  155   1      6  150   4      12  463    5
Type3 (T3)    0   0  40      0    0  40      1    0  39       1    0  119
b) Noisy model
The test sets of WM (350 chars), MM (350 chars), HM (350 chars) and ALL (1050 chars) are used to evaluate the noisy model.
TABLE VI. THE RESULT OF TESTING THE NOISY MODEL
(rows: true type; columns: predicted type)

               WM             MM             HM             ALL
            T1  T2  T3     T1  T2  T3     T1  T2  T3     T1   T2  T3
Type1 (T1)  250   0   0   249   1   0   249   1   0    748    2   0
Type2 (T2)    2  78   0     1  79   0     1  78   1      4  235   1
Type3 (T3)    0   0  20     0   0  20     0   0  20      0    0  60
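Overall type-classification accuracy follows directly from such a confusion matrix as the sum of the diagonal (correctly typed characters) over the total. For instance, the clean-model WM counts above give (498 + 158 + 40) / 700, i.e. about 99.4%:

```python
import numpy as np

def classification_accuracy(confusion):
    """Overall accuracy from a type-classification confusion matrix
    (rows: true type, columns: predicted type): correctly classified
    characters divided by the total number of characters."""
    confusion = np.asarray(confusion, dtype=float)
    return np.trace(confusion) / confusion.sum()
```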
E. Character Recognition evaluation
After categorizing characters into three types, the characters are recognized by three different neural networks. The average TPR and FPR of all characters are shown in Table VII (clean model) and Table VIII (noisy model).
1) Clean model
TABLE VII. RESULT OF TESTING CHARACTER RECOGNITION FOR CLEAN MODEL
Clean model  Total  TPR     FPR
WM           700    99.43%  0.018%
MM           700    93.98%  0.187%
HM           700    85.12%  0.458%
ALL          2100   92.84%  0.221%
2) Noisy model
TABLE VIII. RESULT OF TESTING CHARACTER RECOGNITION FOR NOISY MODEL

Noisy model  Total  TPR     FPR
WM           350    99.43%  0.018%
MM           350    98.29%  0.054%
HM           350    92.00%  0.249%
ALL          1050   96.57%  0.107%
According to the testing results, we can see that:
- When training with the clean model, the TPR is high and the FPR is quite low for all three scenarios. The TPRs of WM, MM and HM are 99.43%, 93.98% and 85.12% respectively, and the average TPR over the three scenarios is 92.84%.
- When training with the noisy model, the recognition performance improves considerably: the TPRs of the MM and HM scenarios increase by approximately 4% and 7% respectively.
VII. CONCLUSION
In this paper, we presented the four steps of car license plate recognition. The performance of license plate localization is not yet good, only slightly above 84%, because we use the projection method for detection, which is very simple but not highly effective. In contrast, the performance of character segmentation and character recognition is very good, with character recognition reaching 99.43% on high-quality images. In future work, we will try to improve the localization stage.
ACKNOWLEDGMENT
We would like to give special thanks to Mr. Tuan M. Nguyen and Mr. Anh Nguyen, Electronic & Telecommunication Engineering Department, Danang University of Technology, The University of Danang.
REFERENCES
[1] W. Badawy, "Automatic License Plate Recognition (ALPR): A State of the Art Review," IEEE Transactions on Circuits and Systems for Video Technology.
[2] I. K. Fodor, "A Survey of Dimension Reduction Techniques," LLNL Technical Report, June 2002.
[3] Zhenhua Guo, Lei Zhang, D. Zhang, "A Completed Modeling of Local Binary Pattern Operator for Texture Classification," IEEE Transactions on Image Processing, vol. 19, no. 6, pp. 1657-1663, June 2010.
[4] Ondrej Martinsky, Algorithmic and Mathematical Principles of Automatic Number Plate Recognition Systems, Faculty of Information Technology, Department of Intelligent Systems, Brno University of Technology, 2007.
[5] Peter Tarabek, "Morphology Image Pre-Processing for Thinning Algorithms," Journal of Information, Control and Management Systems, vol. 5, no. 1, pp. 131-138, 2007.
[6] Naikur Bharatkumar Gohil, Car License Plate Detection, B.E., Dharmsinh Desai University, India, 2006.
[7] Lu Liu, Hongjiang Yu, Kehe Cai, Jia Wang, "License Plate Recognition Using Topology Structure Features," 2011 IEEE 2nd International Conference on Computing, Control and Industrial Engineering (CCIE), vol. 2, pp. 251-254, 20-21 Aug. 2011.
[8] Truong Quoc Bao, Vo Van Phuc, "A New Algorithm for Localization and Recognition of Automobile License Plates" (in Vietnamese), Can Tho University Journal of Science, Part A: Natural Sciences, Technology and Environment, 27 (2013): 44-45.
[9] Zhenhui Zhang, Shaohong Yin, "Hough Transform and Its Application in Vehicle License Plate Tilt Correction," Computer and Information Science, vol. 1, pp. 116-119, Aug. 2008.
[10] Xiaodan Jia, Xinnian Wang, Wenju Li, Haijiao Wang, "A Novel Algorithm for Character Segmentation of Degraded License Plate Based on Prior Knowledge," 2007 IEEE International Conference on Automation and Logistics, pp. 249-253, 18-21 Aug. 2007.
[11] W. Badawy, "Automatic License Plate Recognition (ALPR): A State of the Art Review," IEEE Transactions on Circuits and Systems for Video Technology.
[12] Ni Ma, D. G. Bailey, C. T. Johnston, "Optimised Single Pass Connected Components Analysis," 2008 International Conference on ICECE Technology (FPT 2008), pp. 185-192, 8-10 Dec. 2008.
[13] B. Kroese, An Introduction to Neural Networks, University of Amsterdam, Amsterdam, 1996, 120 p.
[14] R. Q. Feitosa, M. M. B. Vellasco, D. V. Maffra, S. S. R. S. Andrade, D. T. Oliveira, "Facial Expression Classification Using RBF and Back-Propagation Neural Networks," in 4th World Multiconference on Systemics, Cybernetics and Informatics (SCI'2000) and 6th International Conference on Information Systems Analysis and Synthesis (ISAS'2000), 2000, pp. 73-77.
[15] Chun Wang, Bo Liu, Xinzhi Zhou, "Research on Vehicle Plate Character Recognition Based on BP Neural Network," China Measurement Technology, 2005, vol. 31, pp. 26-28 (in Chinese).