[ieee 2008 3rd ieee conference on industrial electronics and applications (iciea) - singapore...
Handwritten Signature Recognition using Departure
of Images from Independence
Samir Kumar Bandyopadhyay Member IEEE, Department of Computer
Science and Engineering, University of
Calcutta, Senate House, 87/1 College Street,
Kolkata -700073, India
Debnath Bhattacharyya Computer Science and Engineering
Department, Heritage Institute of
Technology, Anandapur,
Kolkata - 700107, India
Poulami Das Computer Science and Engineering
Department, Heritage Institute of
Technology, Anandapur,
Kolkata - 700107, India
Abstract-In this paper we propose a new Handwritten
Signature Recognition Algorithm. The Algorithm is based on
the pixel-to-pixel relationship between Images and rests on
statistical analysis: Standard Deviation, variance and the
Theory of Cross-Correlation. This work extends our earlier
work on Handwritten Signature Identification.
This Algorithm supports the application environment, and
we strongly believe that “User Recognition” could be a solid
platform for future research and study based on statistics and
probability theory.
Keywords: Security, authentication, watermarking,
image analysis, correlation and biometric.
I. INTRODUCTION
Various good techniques for the secure transmission of data
have been proposed and are already in practice. Data Hiding is
the process of secretly embedding information inside a data
source without changing its perceptual quality. Digital
watermarking is the process of conveying information by
imperceptibly embedding it into the digital media.
Steganography (covered writing) is the process of secretly
embedding information into a data source in such a way that
its very existence is concealed. Extraction, Authentication and
Recognition of Data are equally important for security
purposes. Researchers have proposed numerous
authentication and recognition schemes; of these,
Biometric authentication and recognition are widely used.
Biologically inspired approaches have gained popularity
in research.
The term correlation can also mean the cross-correlation
of two functions or electron correlation in molecular
systems. In probability theory and statistics, correlation, also
called correlation coefficient, indicates the strength and
direction of a linear relationship between two random
variables. In general statistical usage, correlation or co-
relation refers to the departure of two variables from
independence, although correlation does not imply
causation. In this broad sense there are several coefficients,
measuring the degree of correlation, adapted to the nature of
data. A number of different coefficients are used for
different situations. The best known is the Pearson product-
moment correlation coefficient, which is obtained by
dividing the covariance of the two variables by the product
of their standard deviations.
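As a minimal sketch of this definition (covariance divided by the product of the standard deviations), the following Python function computes the coefficient for two equal-length sequences. The function name `pearson` and the sample data are illustrative assumptions, not part of the original work.

```python
import math


def pearson(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations (divide by n).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined when a standard deviation is zero")
    return cov / (std_x * std_y)


# Illustrative data: one increasing and one decreasing linear relationship.
increasing = pearson([1, 2, 3, 4], [10, 20, 30, 40])
decreasing = pearson([1, 2, 3, 4], [8, 6, 4, 2])
```

The increasing pair gives a coefficient of about 1 and the decreasing pair about -1, up to floating-point rounding.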
II. EARLIER WORKS
Numerous approaches have been proposed for
Handwritten Signature Identification, Recognition and
Authentication systems. One approach that has
shown great promise is the use of an Artificial Neural Network
for Handwritten Signature Identification. An Artificial
Neural Network is trained to identify patterns among
different supplied handwriting samples. Handwritten
signature samples are the input to the artificial
neural network model, and typically weights are also supplied
for recognition [3].
According to Berend-Jan van der Zwaag, the method used
with a Neural Network is to teach various characters to the
network in a supervised manner. A character is presented to
the system and is assigned a particular label. Several variant
patterns of the same character are taught to the network
under the same label. Hence the network learns the various
possible variations of a single pattern and becomes adaptive
in nature [5].
Debnath Bhattacharyya, Samir Kumar Bandyopadhyay
and Deepsikha Chaudhury, 2007, proposed a scheme where
the Standard Deviation for each byte of the Training Image
Files (sample signatures) is computed and then each
corresponding byte of Test Signature is compared to check
whether it falls within the range of (Mean ± Standard
Deviation ). If 70% cases match, then the Test Signature is
accepted [4].
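The acceptance rule described for [4] can be sketched roughly as follows. This is only an illustration under stated assumptions: signatures are flattened to equal-length sequences of values, the population standard deviation is used, and "70% cases match" is read as "at least 70% of positions". The names `within_band_fraction` and `accept` and the sample data are hypothetical and do not come from [4].

```python
import math


def within_band_fraction(training, test):
    """Fraction of positions where test lies within mean +/- std of the training values."""
    n = len(training)
    hits = 0
    for k in range(len(test)):
        column = [sample[k] for sample in training]
        mean = sum(column) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
        if mean - std <= test[k] <= mean + std:
            hits += 1
    return hits / len(test)


def accept(training, test, threshold=0.70):
    """Accept the test signature if enough positions fall inside the band (assumed >= 70%)."""
    return within_band_fraction(training, test) >= threshold


# Illustrative data only: three training samples of four values each.
training = [[0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 1, 0]]
similar = accept(training, [0, 1, 1, 0])
different = accept(training, [1, 0, 0, 1])
```

With this toy data the first test sequence falls inside the band at every position and is accepted, while the second falls outside at every position and is rejected.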
F. Bartolini, A. Tefas, M. Barni and I. Pitas discussed the
problem of authenticating video surveillance images. After an
introduction motivating the need for a watermarking-based
authentication of VS (video surveillance) sequences, a brief
survey of the main watermarking-based authentication
techniques is presented and the requirements that an
authentication algorithm should satisfy for VS applications
are discussed. A novel algorithm suitable for VS
visual data authentication is proposed [6].
Rehab H. Alwan, Fadhil J. Kadhim, and Ahmad T. Al-
Taani, 2005, have explained a method with three main steps.
First, the edge of the image is detected using Sobel mask
filters. Second, the least significant bit (LSB) of each pixel
is used. Finally, a gray level connectivity is applied using
978-1-4244-1718-6/08/$25.00 ©2008 IEEE Pg 964
a fuzzy approach and the ASCII code is used for
information hiding. The prior bit of the LSB represents the
edged image after gray level connectivity, and the remaining
six bits represent the original image with very little
difference in contrast. The given method embeds three
images in one image and includes, as a special case of data
embedding, information hiding, identifying and
authenticating text embedded within digital images [7].
Yusuk Lim, Changsheng Xu and David Dagan Feng,
2001, described a web-based authentication system that
consists of two parts: a watermark embedding system
and an authentication system. The watermark
embedding system is installed on the server as application
software, so that any authorized user who has access to the
server can generate a watermarked image. The distribution can
use any kind of network transmission such as FTP, e-mail etc.
Once the image is distributed externally, a client can access
the authentication web page to verify the image [8].
Min Wu and Bede Liu, June, 2003, proposed a new
method to embed data in binary images, including scanned
text, figures, and signatures. The method manipulates
“flippable” pixels to enforce a specific block-based relationship
in order to embed a significant amount of data without
causing noticeable artifacts. Shuffling is applied before
embedding to equalize the uneven embedding capacity from
region to region. The hidden data can be extracted without
using the original image, and can also be accurately
extracted after high quality printing and scanning with the
help of a few registration marks [9].
Debnath Bhattacharyya, Samir Kumar Bandyopadhyay
and Poulami Das, 2007, conducted [10] an extensive
survey of the existing graphical password schemes and
proposed an alternative scheme. The work is divided into
three phases: a. sampling of users' passwords, processing and
storage; b. security in transmission; and c. recognition and
authentication.
III. OUR WORK
In this paper we focus mainly on ‘Recognition of
Handwritten Signature’. Before discussing it, it is
important to get some idea of our earlier work,
‘Data Hiding and Extraction of Handwritten Signature’.
Before the Embedding process (Figure-1), the image must be
processed. First, the Signature is drawn on a device by
pen, or by mouse on the screen panel. The drawn image is
captured and passed through the processes of extracting the
Region Of Interest (ROI), scaling the ROI to a specific size
and thinning it to single-pixel format [1].
The Law of Independent Assortment is used to watermark
the processed Handwritten Signature; a double line of
protection is provided during transmission of the Handwritten
Signature over the network [2].
For Handwritten Signature Identification and
Authentication, a forward propagation technique is used to
authenticate the input image against the available training
images [3]. Here we provide an alternative
Recognition Technique, as follows:
The correlation is defined only if both of the standard
deviations are finite and both of them are nonzero. The
correlation is 1 in the case of an increasing linear
relationship, -1 in the case of a decreasing linear
relationship, and some value in between (-1 < r < 1)
otherwise, indicating the degree of linear dependence between
the values. The closer the coefficient is to 1, the stronger the
correlation between the Images.
The correlation coefficient ρX,Y between two random
variables X and Y with expected values µX and µY and
standard deviations σX and σY is defined as:

$$\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{E\big((X-\mu_X)(Y-\mu_Y)\big)}{\sigma_X \sigma_Y} \tag{i}$$

where E is the expected value operator. Since
$\mu_X = E(X)$ and $\sigma_X^2 = E(X^2) - E^2(X)$, and the same
holds for Y, we can also express this as

$$\rho_{X,Y} = \frac{E(XY) - E(X)\,E(Y)}{\sqrt{E(X^2) - E^2(X)}\;\sqrt{E(Y^2) - E^2(Y)}} \tag{ii}$$

We have a series of n 2D arrays generated from the
corresponding training images. The values stored in the
arrays are 0s or 1s, and the size of each array is fixed, i.e.,
Size, S = Width of Array x Height of Array. X is one of the
training arrays and Y is the array to be checked; the
measurements of X and Y are written as Xi and Yi, where
i = 1, 2, ..., S.
The “Pearson product-moment correlation coefficient”, or
“sample correlation coefficient”, is used here to estimate the
correlation of X and Y. It is especially important if X and Y
are both normally distributed (possible if the images are
from the same training set). The Pearson correlation coefficient
is then the best estimate of the correlation of X and Y. The
Correlation Coefficient is written (in this case) from
equations (i) and (ii) as:
$$r_{X,Y} = \frac{S\sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{S\sum X_i^2 - \left(\sum X_i\right)^2}\;\sqrt{S\sum Y_i^2 - \left(\sum Y_i\right)^2}} \tag{iii}$$
rX, Y will be calculated using equation (iii) for each of the
n number of 2D arrays generated from corresponding
training images.
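The step from equation (ii) to equation (iii) can be made explicit. The derivation below is a sketch that assumes each expectation is estimated by the corresponding average over the S array elements; the symbols are the ones already used in the text.

```latex
% Replace each expectation by the average over the S elements:
E(X) \approx \frac{1}{S}\sum_{i=1}^{S} X_i, \qquad
E(XY) \approx \frac{1}{S}\sum_{i=1}^{S} X_i Y_i, \qquad
E(X^2) \approx \frac{1}{S}\sum_{i=1}^{S} X_i^2 .

% Numerator of (ii):
E(XY) - E(X)E(Y)
  = \frac{1}{S^2}\Big(S\sum X_i Y_i - \sum X_i \sum Y_i\Big).

% Each factor of the denominator of (ii):
\sqrt{E(X^2) - E^2(X)}
  = \frac{1}{S}\sqrt{S\sum X_i^2 - \Big(\sum X_i\Big)^2},
\qquad \text{and likewise for } Y.

% The common factor 1/S^2 cancels, which gives (iii):
r_{X,Y} = \frac{S\sum X_i Y_i - \sum X_i \sum Y_i}
               {\sqrt{S\sum X_i^2 - (\sum X_i)^2}\;\sqrt{S\sum Y_i^2 - (\sum Y_i)^2}} .
```

For arrays whose elements are only 0 or 1, each element equals its own square, so the sum of squares of an array equals its plain sum of elements.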
The correlation is 1 in the case of an increasing linear
relationship. If the values are independent then the
correlation is 0, but the converse is not true, because the
correlation coefficient detects only linear dependence. Suppose
the random variable X is uniformly distributed on the
interval from -1 to 1, and Y = X^2. Then Y is completely
determined by X, so that X and Y are dependent, but their
correlation is zero; they are uncorrelated.
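A small numerical check of this point, written as a sketch: X is approximated by an evenly spaced grid rather than by random samples, and the helper name `pearson_from_sums` is an illustrative assumption. When the grid is symmetric about zero (from -1 to 1), the correlation between X and X^2 vanishes even though X^2 is fully determined by X; on a one-sided grid (from 0 to 1) it does not.

```python
import math


def pearson_from_sums(xs, ys):
    """Sample correlation coefficient computed from plain sums of the data."""
    s = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = s * sum_xy - sum_x * sum_y
    denominator = math.sqrt(s * sum_xx - sum_x ** 2) * math.sqrt(s * sum_yy - sum_y ** 2)
    return numerator / denominator


# Grid symmetric about zero: dependent but uncorrelated.
xs = [k / 1000 for k in range(-1000, 1001)]
r_symmetric = pearson_from_sums(xs, [x * x for x in xs])

# One-sided grid: the same dependence now shows up as a high correlation.
xs_pos = [k / 1000 for k in range(0, 1001)]
r_positive = pearson_from_sums(xs_pos, [x * x for x in xs_pos])
```

The symmetric case yields a value that is zero up to rounding error, while the one-sided case yields a coefficient well above 0.9, which is why the choice of interval matters in this example.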
The correlation coefficient, a concept from statistics, is a
measure of how well trends in the predicted values follow
trends in past actual values. It is a measure of how well the
predicted values from a forecast model “fit” the real-life
data.
The Pearson correlation coefficient lies between -1 and 1; for
the comparisons considered here the values of interest lie
between 0 and 1. If there is no relationship between the
predicted values and the actual values, the correlation
coefficient is 0 or very low (the predicted values are no
better than random numbers). As the strength of the
relationship between the predicted values and actual values
increases, so does the correlation coefficient. A perfect fit
gives a coefficient of 1.0. Thus the higher the correlation
coefficient, the better.
We have converted the Bi-Color images into 2D
corresponding arrays, where elements of the array are the
pixel values of Bi-Color images, taken as ‘0’ for white and
‘1’ for black.
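A minimal sketch of this conversion is given below. It assumes the image is already available as rows of grayscale intensities (0 for black, 255 for white) and uses a cut-off of 128 to decide between the two colors; the input layout, the threshold and the function name `to_binary_array` are assumptions made for illustration.

```python
def to_binary_array(pixels, threshold=128):
    """Map grayscale intensities to 1 for black (dark) pixels and 0 for white pixels."""
    return [[1 if value < threshold else 0 for value in row] for row in pixels]


# Illustrative 2 x 3 bi-color image: 0 is black, 255 is white.
image = [
    [255, 0, 255],
    [0, 0, 255],
]
array = to_binary_array(image)
```

Here `array` holds 1 wherever the source pixel is black and 0 wherever it is white, matching the encoding described above.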
Handwritten Signature Recognition Algorithm (HSRA):
Input : N Training Image(s), 1 Test Image
Output : Test Image is FIT or UNFIT
Procedure HSRA()
{
1. Declare N 2D Training Arrays, each with the size of the
   Training Image(s).
   Declare a 2D Test Array with the size of the Test Image.
   Declare a one-dimensional ‘Correlation Coefficient
   Array’ of size N.
   Declare sums of squares and products, TS1, TS2, TTS.
   Declare sums of elements, T1, T2.
2. for i = 0 to (width of the Image) - 1
3.   for j = 0 to (height of the Image) - 1
4.     Store the pixel value of Image[i, j] at location
       [i, j] of the corresponding 2D Array
       (the stored value is 1 for a black pixel and 0
       for a white pixel).
5.   end for
6. end for
7. Repeat Step-2 to Step-6 for each of the N Training
   Image(s) and for the Test Image.
8. Set TS1, TS2, TTS, T1 and T2 to 0;
   for i = 0 to (width of the 2D Array) - 1
9.   for j = 0 to (height of the 2D Array) - 1
10.    Compute sum of squares, TS1 +=
       Training array element x Training array element
11.    Compute sum of squares, TS2 +=
       Test array element x Test array element
12.    Compute sum of products, TTS +=
       Training array element x Test array element
13.    Compute sum of elements, T1 += Training array element
14.    Compute sum of elements, T2 += Test array element
15.  end for
16. end for
17. Calculate r, the correlation coefficient, using
    equation (iii), and store it in the correlation
    coefficient array.
18. Repeat Step-8 to Step-17 N times, once for each of the N
    Training Array(s), each time pairing the Test Array
    with that Training Array.
19. Check the correlation coefficient array for value(s)
    between 0.9999 and 1.0.
20. If found, the Test Image is matched (FIT) with one
    of the Training Images; else it is not matched (UNFIT).
}
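The recognition part of the procedure (Steps 8 to 20) can be sketched in Python as follows. This is an illustrative reading of the listing, not the authors' implementation: the arrays are assumed to be already converted to equal-sized lists of 0/1 rows, the accumulators are reset for every training array, an undefined coefficient (a constant array) is treated as no match, and the function names and sample arrays are hypothetical.

```python
import math

FIT_THRESHOLD = 0.9999  # lower bound of the acceptance range in Step 19


def correlation(training, test):
    """Equation (iii), using the accumulators TS1, TS2, TTS, T1, T2 of the listing."""
    ts1 = ts2 = tts = t1 = t2 = 0
    size = 0
    for train_row, test_row in zip(training, test):
        for a, b in zip(train_row, test_row):
            ts1 += a * a   # sum of squares, training array
            ts2 += b * b   # sum of squares, test array
            tts += a * b   # sum of products
            t1 += a        # sum of elements, training array
            t2 += b        # sum of elements, test array
            size += 1
    # Product of the two square roots in (iii), taken as one square root.
    denominator = math.sqrt((size * ts1 - t1 * t1) * (size * ts2 - t2 * t2))
    if denominator == 0:
        return None  # undefined when either array is constant
    return (size * tts - t1 * t2) / denominator


def hsra(training_arrays, test_array):
    """Return ("FIT" or "UNFIT", list of coefficients, one per training array)."""
    coefficients = [correlation(t, test_array) for t in training_arrays]
    fit = any(r is not None and r >= FIT_THRESHOLD for r in coefficients)
    return ("FIT" if fit else "UNFIT"), coefficients


# Illustrative 3 x 3 arrays only.
training_arrays = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    [[0, 0, 1], [0, 1, 0], [1, 0, 0]],
]
label_same, coeffs_same = hsra(training_arrays, [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
label_other, coeffs_other = hsra(training_arrays, [[0, 1, 0], [1, 0, 1], [0, 1, 0]])
```

In this toy run the first test array is identical to the first training array, so one coefficient is 1.0 and the result is FIT; the second test array matches neither training array closely and is reported as UNFIT.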
IV. RESULT AND DISCUSSION
The stated Algorithm has 2 distinct divisions: a.
conversion of Image to 2D Array; and b. Recognition. All
images, including the Test Image, are taken with the same size.
This is ensured by our previous work on Extracting the Area of
Interest, Scaling and Thinning [1]. Moreover, all these
processes are carried out before the training images are
stored in the storage area and before data hiding [2, 10].
A. Complexity analysis of the stated algorithm
For conversion of Image to 2D Array:
Size of Image, S = Image width x Image height. Number
of Image(s) to convert, N. Thus the time to convert N
images of size S each is N x S. For a large problem set,
taking S to be of the same order as N, this can be written as,
N x N = N^2 ------------- (1)
Recognition:
It is necessary to access the elements of two 2D Arrays to
calculate the sums of squares and the sums of the
elements, i.e., S + S = 2S = 2N ---------- (2)
The sum of products of the elements likewise takes,
S + S = 2S = 2N ---------- (3)
Calculating and storing the correlation coefficient is
counted as N ---------- (4)
Now, from equations (2), (3) and (4), matching the Test
Image with a single Training Image takes,
2N + 2N + N = 5N ------------ (5)
This check has to be done with N Training Images, that is,
5N x N = 5N^2, i.e., of order N^2 (for a large Training Set) ----------- (6)
So, from equations (1) and (6), the Time Complexity is
N^2 + N^2 = 2N^2, i.e., O(N^2).
B. Test Results
Testing is done here with the Signatures of 131 users
(individuals), each user with 24 Handwritten Signatures and
10 trained forgery Signatures [11].
Table-1 shows the series of Correlation Coefficients
returned by each of the Training Arrays for the given Test
Array (Figure-1). Here, testing is done with the set of 24
Training Images for a single user. Only one such instance out
of 131 is shown in Figure-1. The perfect match found in Table-1
is with Signature-11 (one of the Training Images), and in this
[Figure-1: Training Images Signature-1 to Signature-24 (Training Set of the same problem space) are converted to corresponding 2D Training Arrays. The Test Image (Signature-11 from the Training Set) is converted to a 2D Test Array, and the element-by-element relation of the Test Array with each Training Array, called the Correlation Coefficient, is drawn.]
Table-1: Test Array generated from Test Image (here taken as Signature-11)

Training Arrays for User1    Correlation Coefficient generated from each Training Array with Test Array
Signature-1                  0.2004229361600069
Signature-2                  0.1790631326991368
Signature-3                  0.1052254603432329
Signature-4                  0.1942522684172455
Signature-5                  0.1648025984549539
Signature-6                  0.2710549123069337
Signature-7                  0.2858090120330544
Signature-8                  0.2270414224565152
Signature-9                  0.2410276787612588
Signature-10                 0.1849605626411834
Signature-11                 1.0000000000000000
Signature-12                 0.1855773367436578
Signature-13                 0.2383045709100168
Signature-14                 0.2396381537581523
Signature-15                 0.1763162895592169
Signature-16                 0.1614256952772022
Signature-17                 0.1788460176153321
Signature-18                 0.2141833381981659
Signature-19                 0.2549800147305654
Signature-20                 0.1863935765642522
Signature-21                 0.2127480431132642
Signature-22                 0.2261281705643858
Signature-23                 0.2035685353062401
Signature-24                 0.2116770968123705

>>>> Fit and Fitness Rate: 1.0 <<<<
Table-2: Test Array generated from Test Image (forgery Signature)

Training Arrays for User1    Correlation Coefficient generated from each Training Array with Test Array
Signature-1                  0.1603882402409401
Signature-2                  0.1308927191867935
Signature-3                  0.0216864692218872
Signature-4                  0.1152753319031659
Signature-5                  0.1145443439922423
Signature-6                  0.1435363117501333
Signature-7                  0.1788840115877254
Signature-8                  0.1803741509777574
Signature-9                  0.1187262870739589
Signature-10                 0.1294898232917472
Signature-11                 0.1903566207977627
Signature-12                 0.1209446379722159
Signature-13                 0.1758481630899941
Signature-14                 0.1294655818378044
Signature-15                 0.0812553839862523
Signature-16                 0.1195819710595384
Signature-17                 0.1810475033277922
Signature-18                 0.1655539521357283
Signature-19                 0.1833515998510241
Signature-20                 0.1694563585575498
Signature-21                 0.1628242070273894
Signature-22                 0.1772467029199628
Signature-23                 0.0918058808861437
Signature-24                 0.1548684211509291

>>>> UnFit <<<<

[Figure-2: Forgery Signature of User1]
case the Test Image is found “FIT”. A Correlation Coefficient
of 1.0 is the best match; a value between 0 and 1 indicates
that the Test Signature belongs to the same problem space,
here the same training set.
Table-2 shows the series of Correlation Coefficients
returned by each of the Training Arrays (Figure-1) for the
given Test Image (forgery Signature) of Figure-2; only one
instance of this reverse case is shown here. The
Correlation Coefficients returned here lie between 0 and 1,
but none of them is anywhere near 1.0 (exact match).
V. CONCLUSION
This work extends the Handwritten Signature
Recognition work that we started a year ago. We have already
proposed various Watermarking and Data Hiding techniques
and published them in different International Journals
and Conference Proceedings. Prior to these we worked
on Morphological Image Processing focused on Handwritten
Signature Scaling, Thinning and extraction of the area of
interest (Handwritten Signature Area within an Image). Thus a
series of works is about to be completed with this proposed
Recognition Scheme.
This Recognition scheme is based on extensive Statistical
Analysis, the Theory of Correlation Coefficient and Standard
Deviation. Various test results support this
scheme. We hope our Study and Research will attract
further attention.
REFERENCES
[1] Debnath Bhattacharyya, Samir Kumar Bandyopadhyay and
Poulami Das, “Handwritten Signature Verification System using
Morphological Image Analysis”, CATA-2007 International
Conference, A publication of International Society for Computers
and their Applications, Honolulu, Hawaii, USA, March 28-30,
2007, pp. 112-117.
[2] Debnath Bhattacharyya, Samir Kumar Bandyopadhyay and
Poulami Das, “Handwritten Signature Extraction from
Watermarked Images using Genetic Crossover”, MUE’07 IEEE CS
Conference, Seoul, Korea, April 27-30, 2007, pp. 987-991.
[3] Debnath Bhattacharyya, Samir Kumar Bandyopadhyay and
Poulami Das, “A Flexible ANN System for Handwritten Signature
Identification”, Proceedings of the International MultiConference of
Engineers and Computer Scientists 2007 Volume II, IMECS '07,
March 21 - 23, 2007, Hong Kong, Lecture Notes in Engineering
and Computer Science, pp. 1883-1887, Newswood Limited, 2007.
[4] Debnath Bhattacharyya, Samir Kumar Bandyopadhyay and
Deepsikha Chaudhury, “Handwritten Signature Authentication
Scheme using Integrated Statistical Analysis of Bi-Color Images”,
IEEE ICCSA 2007 Conference, Kuala Lumpur, Malaysia, August
26-29, 2007.
[5] Berend-Jan van der Zwaag, “Handwritten Digit Recognition: A
Neural Network Demo”, Euregio Computational Intelligence Center,
Dept. of Electrical Engineering, University of Twente, Enschede,
the Netherlands.
[6] F. Bartolini, A. Tefas, M. Barni and I. Pitas, “Image Authentication
Techniques for Surveillance Applications”, IEEE Proceedings,
Vol. 89, No. 10, October 2001.
[7] Rehab H. Alwan, Fadhil J. Kadhim, and Ahmad T. Al-Taani, “Data
Embedding Based on Better Use of Bits in Image Pixels”,
International Journal of Signal Processing Vol 2, No. 2, 2005.
[8] Yusuk Lim, Changsheng Xu and David Dagan Feng, “Web based
Image Authentication Using Invisible Fragile Watermark”, 2001,
Pan-Sydney Area Workshop on Visual Information Processing
(VIP2001), Sydney, Australia.
[9] Min Wu and Bede Liu, “Data Hiding in Binary Image for
Authentication and Annotation”, IEEE Trans. Image Processing,
vol. 12, pp. 696–705, June 2003.
[10] Debnath Bhattacharyya, Samir Kumar Bandyopadhyay and
Poulami Das, “User Authentication by Secured Graphical Password
Implementation”, IEEE Electro-Information Technology (EIT’07),
Marriott O'Hare Chicago, IL, USA, May 17-20, 2007, pp. 32-41.
[11] Handwriting Databases: http://www.gpds.ulpgc.es/