
Table of Contents

Sr. No Topic

1. Scope of the Journal

2. The Model

3. The Advisory and Editorial Board

4. Papers

First Published in the United States of America. Copyright © 2012 Foundation of Computer Science Inc. Originated and printed by Foundation of Computer Science Press, New York, USA


Scope of the Journal

International Journal of Computer Applications (IJCA) creates a place for the publication of papers covering frontier issues in Computer Science and Engineering and their applications, which will define new waves of breakthroughs. The journal is an initiative to recognize the efforts of the scientific community worldwide towards inventing new-age technologies. Our mission, as part of the research community, is to bring the highest quality research to the widest possible audience. International Journal of Computer Applications is a global effort to consolidate dispersed knowledge and aggregate it in a searchable and indexable form.

The perspectives presented in the journal range from big-picture analyses which address global and universal concerns, to detailed case studies which speak of localized applications of the principles and practices of computational algorithms. The journal is relevant for academics in computer science, applied sciences, the professions and education, research students, public administrators in local and state government, representatives of the private sector, trainers and industry consultants.

Indexing

International Journal of Computer Applications (IJCA) is covered by high-quality indexing services such as Google Scholar, CiteSeer, UlrichsWeb, DOAJ (Directory of Open Access Journals) and the Scientific Commons Index (University of St. Gallen, Switzerland). The articles are also indexed with the SAO/NASA ADS Physics Abstract Service supported by Harvard University and NASA, Informatics, and the ProQuest CSA Technology Research Database. IJCA constantly works towards expanding its content worldwide for the betterment of the scientific, research and academic communities.

Topics

International Journal of Computer Applications (IJCA) supports a wide range of topics dealing with computer science applications, such as: Embedded Systems, Pattern Recognition, Signal Processing, Robotics and Micro-Robotics, Theoretical Informatics, Quantum Computing, Software Testing, Computer Vision, Digital Systems, Pervasive Computing, etc.


The Model

Open Review

International Journal of Computer Applications' approach to peer review is open and inclusive, while at the same time being based on the most rigorous and merit-based 'blind' peer-review processes. Our referee processes are criterion-referenced, and referees are selected on the basis of subject-matter and disciplinary expertise. Ranking is based on clearly articulated criteria. The result is a refereeing process that is scrupulously fair in its assessments while also offering a carefully structured and constructive contribution to the shape of the published paper.

Intellectual Excellence

The result is a publishing process which is without prejudice to institutional affiliation, stage of career, national origin or disciplinary perspective. If a paper is excellent, and has been systematically and independently assessed as such, it will be published. This is why International Journal of Computer Applications has so much exciting new material, much of it originating from well-known research institutions, but also a considerable amount of brilliantly insightful and innovative material from academics in lesser-known institutions in the developing world, emerging researchers, people working in hard-to-classify interdisciplinary spaces, and researchers in liberal arts colleges and teaching universities.


The Advisory and Editorial Board

The current Editorial and Advisory Committee of the International Journal of Computer Applications (IJCA) includes research center heads, faculty deans, department heads, professors, research scientists, and experienced software development directors and engineers.

Dr. T. T. Al Shemmeri, Staffordshire University, UK Bhalaji N, Vels University

Dr. A.K.Banerjee, NIT, Trichy Dr. Pabitra Mohan Khilar, NIT Rourkela

Amos Omondi, Teesside University Dr. Anil Upadhyay, UPTU

Dr Amr Ahmed, University of Lincoln Cheng Luo, Coppin State University

Dr. Keith Leonard Mannock, University of London Harminder S. Bindra, PTU

Dr. Alexandra I. Cristea, University of Warwick Santosh K. Pandey, The Institute of CA of India

Dr. V. K. Anand, Punjab University Dr. S. Abdul Khader Jilani, University of Tabuk

Dr. Rakesh Mishra, University of Huddersfield Kamaljit I. Lakhtaria, Saurashtra University

Dr. S.Karthik, Anna University Dr. Anirban Kundu, West Bengal University of Technology

Amol D. Potgantwar, University of Pune Dr Pramod B Patil, RTM Nagpur University

Dr. Neeraj Kumar Nehra, SMVD University Dr. Debasis Giri, WBUT

Dr. Rajesh Kumar, National University of Singapore Deo Prakash, Shri Mata Vaishno Devi University

Dr. Sabnam Sengupta, WBUT Rakesh Lingappa, VTU

D. Jude Hemanth, Karunya University P. Vasant, University Teknologi Petornas

Dr. A.Govardhan, JNTU Yuanfeng Jin, YanBian University

Dr. R. Ponnusamy, Vinayaga Missions University Rajesh K Shukla, RGPV

Dr. Yogeshwar Kosta, CHARUSAT Dr. S. Radha Rammohan, D.G. of Technological Education

T.N.Shankar, JNTU Prof. Hari Mohan Pandey, NMIMS University

Dayashankar Singh, UPTU Prof. Kanchan Sharma, GGS Indraprastha Vishwavidyalaya

Bidyadhar Subudhi, NIT, Rourkela Dr. S. Poornachandra, Anna University

Dr. Nitin S. Choubey, NMIMS Dr. R. Uma Rani, University of Madras

Rongrong Ji, Harbin Institute of Technology, China Dr. V.B. Singh, University of Delhi

Anand Kumar, VTU Hemant Kumar Mahala, RGPV

Prof. S K Nanda, BPUT Prof. Debnath Bhattacharyya, Hannam University

Dr. A.K. Sharma, Uttar Pradesh Technical University Dr A.S.Prasad, Andhra University

Rajeshree D. Raut, RTM, Nagpur University Deepak Joshi, Hannam University

Dr. Vijay H. Mankar, Nagpur University Dr. P K Singh, U P Technical University

Atul Sajjanhar, Deakin University RK Tiwari, U P Technical University

Navneet Tiwari, RGPV Dr. Himanshu Aggarwal, Punjabi University

Ashraf Bany Mohammed, Petra University Dr. K.D. Verma, S.V. College of PG Studies & Research

Totok R Biyanto, Sepuluh Nopember R.Amirtharajan, SASTRA University

Sheti Mahendra A, Dr. B A Marathwada University Md. Rajibul Islam, University Technology Malaysia

Koushik Majumder, WBUT S.Hariharan, B.S. Abdur Rahman University

Dr.R.Geetharamani, Anna University Dr.S.Sasikumar, HCET

Rupali Bhardwaj, UPTU Dakshina Ranjan Kisku, WBUT

Gaurav Kumar, Punjab Technical University A.K.Verma, TERI

Prof. B.Nagarajan, Anna University Vikas Singla, PTU

Dr H N Suma, VTU Dr. Udai Shanker, UPTU

Anu Suneja, Maharshi Markandeshwar University Prof. Rachit Garg, GNDU

Aung Kyaw Oo, DSA, Myanmar Dr Lefteris Gortzis, University of Patras, Greece.

Suhas J Manangi, Microsoft Mahdi Jampour, Kerman Institute of Higher Education

Prof. D S Suresh, Pune University Prof.M.V.Deshpande, University of Mumbai


Dr. Vipan Kakkar, SMVD University Dr. Ian Wells, Swansea Metropolitan University, UK

Dr M Ayoub Khan, Ministry of Communications and IT, Govt. of India Yongguo Liu, University of Electronic Science and Technology of China

Prof. Surendra Rahamatkar, VIT Prof. Shishir K. Shandilya, RGPV

M.Azath, Anna University Liladhar R Rewatkar, RTM Nagpur University

R. Jagadeesh K, Anna University Amit Rathi, Jaypee University

Dr. Dilip Mali, Mekelle University, Ethiopia. Dr. Paresh Virparia, Sardar Patel University

Morteza S. Kamarposhti, Islamic Azad University of Firoozkuh, Iran Dr. D. Gunaseelan, Directorate of Technological Education, Oman

Dr. M. Azzouzi, ZA University of Djelfa, Algeria. Dr. Dhananjay Kumar, Anna University

Jayant shukla, RGPV Prof. Yuvaraju B N, VTU

Dr. Ananya Kanjilal, WBUT Daminni Grover, IILM Institute for Higher Education

Vishal Gour, Govt. Engineering College Monit Kapoor, M.M University

Dr. Binod Kumar, ISTAR Amit Kumar, Nanjing Forestry University, China.

Dr.Mallikarjun Hangarge, Gulbarga University Gursharanjeet Singh, LPU

Dr. R.Muthuraj, PSNACET Mohd.Muqeem, Integral University

Dr. Chitra A. Dhawale, Symbiosis Institute of Computer Studies and Research Dr. Abdul Jalil M. Khalaf, University of Kufa, IRAQ.

Dr. Rizwan Beg, UPTU R.Indra Gandhi, Anna University

V.B Kirubanand, Bharathiar University Mohammad Ghulam Ali, IIT, Kharagpur

Dr. D.I. George A., Jamal Mohamed College Kunjal B.Mankad, ISTAR

Raman Kumar, PTU Lei Wu, University of Houston – Clear Lake, Texas.

G. Appasami , Anna University S.Vijayalakshmi, VIT University

Dr. Gurpreet Singh Josan, PTU Dr. Seema Shah, IIIT, Allahabad

Dr. Wichian Sittiprapaporn, Mahasarakham University, Thailand. Chakresh Kumar, MRI University, India

Dr. Vishal Goyal, Punjabi University, India Dr. A.V.Senthil Kumar, Bharathiar University, India

R.C.Tripathi, IIIT-Allahabad, India Prof. R.K. Narayan , B.I.T. Mesra, India


PAPERS

System Progress Estimation in Time based Coordinated Checkpointing Protocols

Authors : P. K. Suri, Meenu Satiza

1-6

Adaptive Learning for Algorithm Selection in Classification

Authors : Nitin Pise, Parag Kulkarni

7-12

Routing Protocol for Mobile Nodes in Wireless Sensor Network

Authors : Bhagyashri Bansode, Rajesh Ingle

13-16

32-Bit NxN Matrix Multiplication: Performance Evaluation for Altera FPGA, i5 Clarkdale, and

Atom Pineview-D Intel General Purpose Processors

Authors : Izzeldin Ibrahim Mohd, Chay Chin Fatt, Muhammad N. Marsono

17-23

Recognizing and Interpreting Sign Language Gesture for Human Robot Interaction

Authors : Shekhar Singh, Akshat Jain, Deepak Kumar

24-31

Change Data Capture on OLTP Staging Area for Nearly Real Time Data Warehouse based on Database Trigger

Authors : I Made Sukarsa, Ni Wayan Wisswani, K. Gd. Darma Putra, Linawati

32-37

Decision Support System for Admission in Engineering Colleges based on Entrance Exam Marks

Authors : Miren Tanna

38-41

A Genetic Algorithm based Fuzzy C Mean Clustering Model for Segmenting Microarray Images

Authors : Biju V G, Mythili P

42-48


International Journal of Computer Applications (0975 – 8887)

Volume 52– No.11, August 2012


System Progress Estimation in Time based Coordinated Checkpointing Protocols

P. K. Suri
Dean, Research and Development; Chairman, CSE/IT/MCA, HCTM Technical Campus, Kaithal, Haryana, India

Meenu Satiza
HCTM Technical Campus

ABSTRACT

A mobile computing system consists of mobile and stationary nodes. Checkpointing is an efficient fault-tolerance technique used in distributed systems. Checkpointing in mobile systems faces many new challenges, such as low wireless bandwidth, frequent disconnections and the lack of stable storage on mobile nodes. Coordinated checkpointing that minimizes the number of processes taking useless checkpoints is a suitable approach for introducing fault tolerance in such systems. Time-based checkpointing protocols eliminate communication overhead by avoiding extra control messages and useless checkpoints; such protocols access stable storage directly when checkpoints are saved. In this paper a new probabilistic approach for evaluating system progress is devised which is suitable for mobile distributed applications. The system behavior is observed by varying system parameters such as fault rate, clock drift rate, saved-checkpoint time and checkpoint intervals. A validation of the system progress is made via a simulation technique. The simulation results show that the proposed probabilistic model is well suited for mobile computing systems.

General Terms

Checkpointing, System progress, Simulation

Keywords

Distributed system, fault tolerance, time-based checkpointing, system progress, consistent checkpoint

1. INTRODUCTION
Checkpointing is a major fault-tolerance technique in which the state of a process is saved to stable storage so that the process can be restarted in case of a fault. There are two main categories of checkpointing techniques: (i) coordinated and (ii) uncoordinated checkpointing. In coordinated checkpointing, the processes send control messages to their dependent processes so that they save their states at the same time. This results in a globally consistent state from which the system recovers when a fault occurs. In uncoordinated checkpointing, the processes save their states independently; when a fault occurs, the processes roll back to a point of recovery. Recently, a new type of time-based coordinated checkpointing technique has been introduced which avoids extra coordination messages among dependent processes. The time-based approach relies on loosely synchronized timers, and timer information is piggybacked on application messages. The performance of time-based checkpointing protocols depends on the application and system characteristics such as checkpoint intervals, save-checkpoint time, resynchronization time and clock drift. We propose a probabilistic model of system progress with a particular set of system parameter values. The model shows how system operations can affect system performance, and the simulation results show the states in which the system performs well for particular values of the defined parameters. A simulation model is also developed to validate the system progress.

1.1 Related work
In 1985, Chandy and Lamport [1] proposed a global snapshot algorithm for distributed systems. The global state is achieved by coordinating all the processors and logging the channel state at the time of checkpointing. Special messages called markers are used for coordination and for identifying the messages originating in different checkpointing intervals.

In 1987, Koo and Toueg [5] proposed a two-phase minimum-process blocking scheme for distributed systems. The result of the algorithm is a consistent global checkpointing state that involves only the participating processes and prevents the livelock problem (a single failure causing an infinite number of rollbacks).

In 1996 Ravi Prakash and Mukesh Singhal [11] presented a synchronous non-blocking snapshot collection algorithm for mobile systems that does not force every node to take a local snapshot. They also proposed a minimal rollback/recovery algorithm in which the computation at a node is rolled back only if it depends on operations that have been undone due to the failure of other node(s).

In 1996 N. Neves and W. K. Fuchs [9] presented a time-based checkpointing protocol which eliminates the communication overhead present in traditional checkpointing protocols. The protocol was implemented on the CM5 and its performance was compared using several applications.

In 2001 Guohong Cao and Mukesh Singhal [4] introduced the concept of the "mutable checkpoint", which is neither a tentative checkpoint nor a permanent checkpoint. To design efficient checkpointing algorithms for mobile computing systems, mutable checkpoints can be saved anywhere, e.g. in the main memory or on the local disk of the MHs. In this way, taking a mutable checkpoint avoids the overhead of transferring large amounts of data to the stable storage at the MSSs over the wireless network.

In 2002 Chi-Yi Lin et al. [7] proposed an improved time-based checkpointing protocol by integrating an improved timer synchronization technique. The time synchronization mechanism utilizes the accurate timers in the MSSs as an absolute reference; the timers in fixed hosts (MSSs) are more reliable than those in MHs.

In 2003 Chi-Yi Lin and Sy-Yen Kuo [8] proposed an efficient time-based non-blocking checkpointing protocol. The protocol reduces the number of checkpoints transmitted over the wireless link and uses synchronized timers to indirectly coordinate the creation of checkpoints. In this system each process first takes a soft checkpoint, which is saved in the main memory of the mobile host. If the process is irrelevant to the initiator the soft checkpoint can be discarded; otherwise it will be saved on the local disk at a later time as a hard checkpoint. As a result, the number of disk accesses on mobile hosts can be reduced. The advantage of the time-based approach is that it removes the need for explicit coordination messages.

In 2006 Men Chaoguang [2] proposed a two-phase time-based checkpointing strategy which eliminates orphan and in-transit messages. In this strategy a time-based adaptive checkpointing scheme is evaluated in which the processes need neither block their computational work nor log all messages. The inconsistency issues of the proposed strategy are also discussed.

In 2007 Awasthi and Kumar [6] proposed a probabilistic approach based on keeping track of the direct dependencies of processes. The initiator MSS collects the direct dependency vectors of all processes and sends the checkpoint request to all dependent MSSs. This step was taken to reduce the time needed to collect the coordinated checkpoint, and it also reduces the number of useless checkpoints and the blocking of processes. Selective messages are buffered at the receiver end and exact dependencies among processes are maintained; hence the useless checkpoint requests and the number of duplicate checkpoint requests are reduced.

In 2008 Suchistmita Chinara and Santanu Kumar Rath [3] proposed an energy-efficient, mobility-adaptive distributed clustering algorithm for mobile ad hoc networks, in which better cluster stability and low maintenance overhead are achieved through volunteer and non-volunteer cluster heads. The proposed algorithm is divided into parts such as cluster formation, an energy consumption model and cluster maintenance. The objective of the algorithm is to minimize the re-affiliation rate (re-affiliation being the situation in which a member node changes and searches for another head). The simulation experiment compares the IDs of members: a high-ID member acts as cluster head, and the cluster maintenance overhead is reduced from time to time.

In 2011 Anil Panghal, Sharda Panghal and Mukesh Rana [10] presented a comprehensive study of the existing techniques, namely checkpoint-based recovery and log-based recovery. Based on the study they conclude that log-based recovery techniques, which combine checkpointing and logging of nondeterministic events during pre-failure execution, are suitable for systems that frequently interact with the outside world. They also conclude that communication-induced checkpointing reduces the message overhead and, if implemented along with checkpoint staggering, can prove to be the best method for recovery in distributed systems.

2. PROBLEM FORMULATION
In this paper a number of time-based checkpointing protocols are analyzed. In [9], Neves and Fuchs introduced the concept of timers to reduce communication overhead. They made the following assumptions:

(a) The processes involved in checkpointing have loosely synchronized clocks.

(b) All the processes are approximately synchronized and have a deviation from real time in their local clock timers. The local clock drift rate between the processes is assumed to be ρ.

(c) The timers will terminate at most 2ρT/(1 − ρ²) ≈ 2ρT seconds apart, where T is the initial timer value. Normally the drift rate ρ attains values between 10^-5 and 10^-8.

(d) The clocks will show a maximum drift of 2NρT after N checkpoint intervals.
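To make assumptions (c) and (d) concrete, here is a minimal Python sketch (ours, not part of the paper) that evaluates the timer-expiration spread 2ρT/(1 − ρ²) ≈ 2ρT and the accumulated drift 2NρT for illustrative parameter values:

```python
# Illustrative sketch (not the paper's code): timer-deviation bounds
# from assumptions (c) and (d).

def timer_spread(rho: float, T: float) -> float:
    """Worst-case spread between timer expirations: 2*rho*T / (1 - rho**2) ~ 2*rho*T."""
    return 2.0 * rho * T / (1.0 - rho ** 2)

def max_drift(rho: float, T: float, N: int) -> float:
    """Maximum accumulated clock drift after N checkpoint intervals: 2*N*rho*T."""
    return 2.0 * N * rho * T

if __name__ == "__main__":
    rho, T = 1e-6, 3600.0            # illustrative drift rate and timer value
    print(timer_spread(rho, T))      # ~0.0072 s between earliest and latest expiry
    print(max_drift(rho, T, N=10))   # 0.072 s accumulated drift after 10 intervals
```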

Consider Figure 1, in which P1 and P2 are two processes. Message M1 is sent from process P1 to P2 in its Nth checkpoint interval, and message M2 is sent in the same interval of P1 but received in the (N+1)th interval of P2. Suppose a fault arrives on the timeline of process P2. Checkpoint N+1 is saved in P2 before the fault occurs, but P2 has yet to receive M2 and has no information about it. Such situations can be handled by resending unacknowledged messages.

According to Neves, too much time is wasted in storing checkpoints and a process has to block its execution for a long time, which is impractical. Such inconsistencies due to in-transit or orphan messages can be handled by using a time-based checkpointing approach in which messages are sent along with timer information. According to Men Chaoguang's approach [2], orphan messages can be eliminated by using a communication-induced approach, and in-transit messages can be stored in a message-logging queue. Figure 2 illustrates these situations.

Consider two processes P1 and P2 with timers T1 and T2 respectively. Let MD = D + 2ρT be the maximum deviation between the timers of the two processes (T1 − T2). tmax is the maximum delivery time within which process P2 should receive message M1, and tmin is the minimum delivery time of message M4. ED = MD − tmin is the effective deviation during which the processes cannot safely send or receive messages. M2 and M3 lie within the effective deviation, and they give rise to inconsistency due to orphan and in-transit messages. To handle the orphan message M3, a communication-induced checkpoint is placed before the delivery of M3, and the in-transit message M2 can be retrieved from the message-logging queue.

It is observed that the parametric values ρ, T, D, tmin, tmax, the fault rate λ, the saved-checkpoint time S and the time t at which a fault occurs all affect the performance of a mobile distributed system. In this paper a probabilistic model is developed in which the system performance is evaluated by varying these system parameter values.

Figure 1: Inconsistent state


3. SYSTEM PROGRESS EVALUATION

3.1 Probabilistic model development
When faults occur in the system, resynchronization is performed. Here the system's progress is defined as the ratio of constructive computational work to the total work during a given interval of time.

In order to perform a simulation experiment on a distributed system, a random sample of times t1, t2, t3, …, tn is generated by transforming n uniform random numbers u1, u2, u3, …, un in the interval (0, 1), where λ is a positive constant depending on the characteristics of the distributed system [12]. The general term of the time tk is

tk = −(1/λ)·loge(uk), where k ∈ [1, n]
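As a small illustration (not the paper's code), this inverse-transform sampling can be written directly in Python; the function name and seed below are our own choices:

```python
import math
import random

def fault_times(n: int, lam: float, seed: int = 1) -> list:
    """Generate n exponentially distributed times t_k = -(1/lam) * ln(u_k),
    with u_k drawn uniformly from (0, 1]."""
    rng = random.Random(seed)
    return [-(1.0 / lam) * math.log(1.0 - rng.random()) for _ in range(n)]

# Example: ten fault-arrival times for lambda = 1e-5
print(fault_times(10, lam=1e-5))
```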

Let ts be the time to store a checkpoint, tmin the minimum checkpoint delivery time, tmax the maximum checkpoint delivery time, Tdiff the maximum difference between the timers of different processes, L the length of the checkpoint intervals before resynchronization, ρ the clock drift rate between the processes, fr the fault rate, Tw the probabilistic wasted time of fault occurrence, Twr the probabilistic wasted time of fault occurrence between resynchronizations, and tr the resynchronization time.

Let T1, T2, T3, …, Tnmax be the checkpoint intervals between resynchronizations. When no fault occurs, this number equals the maximum number of checkpoint intervals nmax.

The system progress is evaluated by developing the following probabilistic model. Here

ts ≤ Tdiff + 2·nmax·L·ρ − tmin

(ts + tmin − Tdiff) / (2·L·ρ) ≤ nmax

nmax = ceil((ts + tmin − Tdiff) / (2·L·ρ))

Let Tcons be the time interval during which constructive computational work is done, given as

Tcons = L − ts − tk, where k ∈ [1, n]

The probability density function of the fault occurrence time is the negative exponential density f(t) = fr·e^(−fr·t).

Let Ir be the expected number of intervals between resynchronizations, obtained as the sum of the probability of a fault occurring in some interval k less than nmax (weighted by k) and the probability of no fault occurring in any of the nmax intervals (weighted by nmax).

In Fig. 3 a set of nmax checkpoint interval numbers ({1, 2, 3, …, k, k+1, …, nmax}) is considered on the timeline of the process. The probability Pr[k] of a fault occurring in the kth checkpoint interval since the last resynchronization is given as

Pr[k] = e^(−fr·L·k) − e^(−fr·L·(k+1))

Ir = ∑(k = 1 to nmax−1) k·Pr[k] + nmax·e^(−fr·L·nmax)

Ir = (1 − e^(−fr·L·nmax)) / (e^(fr·L) − 1)

The probabilistic wasted time of fault occurrence, Tw, is given as

Tw = (e^(−fr·Tcons)·(−fr·Tcons − 1) + 1) / (fr·(1 − e^(−fr·Tcons)))

The probabilistic wasted time of fault occurrence between resynchronizations, Twr, is given as

Twr = (1 − e^(−fr·L·nmax))·(ts + Tw) + e^(−fr·L·nmax)·tr

Let Tr be the total time between resynchronizations:

Tr = Ir·L + Twr

Let TCW be the time used for constructive work between resynchronizations:

TCW = Ir·Tcons

The system progress (SP) of a process = TCW/Tr
The system progress of all the processes = ∑ TCW/Tr
The system progress of the complete system having n processes = (∑ TCW/Tr)/n
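As an illustration only, the following Python sketch strings the formulas above together to compute the system progress of a single process; the parameter values follow Table 1, while the fault time tk is chosen arbitrarily here (the paper samples it from the exponential distribution):

```python
import math

def system_progress(L, ts, tmin, Tdiff, rho, fr, tr, tk):
    """System progress of one process, computed from the model above.
    L: checkpoint-interval length, ts: time to store a checkpoint,
    tmin: minimum delivery time, Tdiff: maximum timer difference,
    rho: clock drift rate, fr: fault rate, tr: resynchronization time,
    tk: sampled fault-occurrence time within an interval."""
    nmax = math.ceil((ts + tmin - Tdiff) / (2.0 * L * rho))   # intervals between resynchronizations
    Tcons = L - ts - tk                                       # constructive time per interval
    Ir = (1.0 - math.exp(-fr * L * nmax)) / (math.exp(fr * L) - 1.0)
    Tw = (math.exp(-fr * Tcons) * (-fr * Tcons - 1.0) + 1.0) / (fr * (1.0 - math.exp(-fr * Tcons)))
    Twr = (1.0 - math.exp(-fr * L * nmax)) * (ts + Tw) + math.exp(-fr * L * nmax) * tr
    Tr = Ir * L + Twr          # total time between resynchronizations
    TCW = Ir * Tcons           # time used in constructive work
    return TCW / Tr

# Parametric values of Table 1; tk chosen arbitrarily for illustration
print(system_progress(L=3600, ts=0.7, tmin=0.001, Tdiff=0.01,
                      rho=1e-6, fr=1e-5, tr=0.1, tk=1.0))
```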

3.2 Validation of system progress
To confirm the correctness of the system progress evaluation, a validation is carried out to establish the confidence level of the simulation. To achieve the validation with a good confidence level, first 1000 runs of the simulation experiment are made from 10 samples with 100 checkpoint intervals; then 2000, 3000, …, 10000 runs are made, and the average value of the system progress, its standard deviation (SD), and the upper (UL) and lower (LL) confidence limits of the system progress are computed. Further, the corresponding interval of interest Tcons and the corresponding optimal system progress are evaluated for the system by varying the parameters λ, ρ, ts, fr and L. The simulation technique used is as follows. Take n independent samples of time intervals of length L and, from these n samples, the corresponding values of system progress SP1, SP2, SP3, …, SPn; their mean μ and standard deviation σ are then evaluated.

Fig. 2: Elimination of inconsistent state

Fig. 3: Fault arrival in the kth checkpoint interval

The sample mean of all system progresses is evaluated using the formula:

SPmean = ∑ SPi / n

The variance σ² can be estimated as:

σ²est = (1/(n−1))·∑ (SPk − SPmean)²

The general relationship between the parameters is given as

Pr{μ − t ≤ SPmean ≤ μ + t} = 1 − α

where t is the tolerance on either side of the mean within which the estimate falls with probability 1 − α. Using the standard normal distribution function

Φ(y) = ∫ (1/√(2π))·e^(−z²/2) dz, integrated from −∞ to y, with z = √n·(SPmean − μ)/σest,

the value y1−α/2 satisfying Φ(y1−α/2) = 1 − α/2 gives the upper confidence limit UL and lower confidence limit LL of the system progress:

UL = SPmean + (y1−α/2 · σest) / √n
LL = SPmean − (y1−α/2 · σest) / √n
y1−α/2 = 2.58 (99% confidence level)

The interval (LL, UL) will contain the true mean with the specified experimental confidence [12].
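As an illustration (not the paper's code; the sample SP values below are made up), the confidence limits can be computed directly from a set of simulated system-progress values:

```python
import math

def confidence_limits(samples, y=2.58):
    """Lower/upper confidence limits of the mean system progress,
    following the formulas above (y = y_{1-alpha/2}; 2.58 for 99%)."""
    n = len(samples)
    sp_mean = sum(samples) / n
    var_est = sum((s - sp_mean) ** 2 for s in samples) / (n - 1)   # sigma^2_est
    half = y * math.sqrt(var_est) / math.sqrt(n)
    return sp_mean - half, sp_mean, sp_mean + half                 # LL, mean, UL

# Ten illustrative simulated SP values for one sample size
ll, mean, ul = confidence_limits([0.9951, 0.9949, 0.9953, 0.9950, 0.9952,
                                  0.9948, 0.9954, 0.9951, 0.9950, 0.9953])
print(ll, mean, ul)
```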

3.3 Simulation results
The simulation results show that the system performance is affected by various factors such as the number of checkpoint intervals, the clock drift rate of the processors, the fault rate of the processors, and the time to save checkpoints. In our simulation experiment the variations of these factors against system progress are shown in tabular as well as graphical form.

3.3.1 Checkpoint interval vs. system progress
Table 1 lists the parametric values used in the proposed model. The first column names the parameter being varied – fault rate (Table 4), drift rate (Table 5) or checkpoint interval (Table 2) – and the second and third columns give the names and values of the other parameters held fixed.

Table 1: Parametric values of System model

Varied parameter          Other parameter values used
common to all cases       tr = 0.1, Tdiff = 0.01, tmin = 0.001
fr (fault rate)           L = 3600, ρ = 0.000001, ts = 0.7
ρ (drift rate)            fr = 0.00001, L = 3600, ts = 0.7
L (checkpoint interval)   fr = 0.00001, ρ = 0.000001, ts = 0.7

According to these values the system progress is evaluated and the respective graphs are drawn. First, for increasing values of the checkpoint interval (L) the corresponding values of system progress decrease (Table 2, Fig. 4); that is, the system progress is affected by the number of checkpoints.

Table 2. Checkpoint intervals vs. System Progress

Checkpoint interval (L)    System progress (SP)
100        0.9925
10100      0.950282
20100      0.902832
30100      0.857017
40100      0.81285
50100      0.770316

Fig 4: Checkpoint intervals vs. System Progress

3.3.2 Saved checkpoint time vs. system progress
Table 3 shows that as the time to save checkpoints increases, the system progress decreases. This is illustrated in Fig. 5.

Table 3. Saved Checkpoint time vs. System Progress

Saved checkpoint time (ts)    System progress (SP)
1     0.98183
2     0.981554
3     0.981278
4     0.980999
5     0.980723
6     0.980441
7     0.980166
8     0.979887
9     0.979609
10    0.979331
11    0.979053
12    0.978775
13    0.978498
14    0.978221
15    0.977942



Fig 5: Saved checkpoint time vs. System Progress

3.3.3 Fault rate vs. system progress
This subsection describes how the fault rate affects the system progress. Table 4 shows that as the fault rate increases, the system progress decreases. This is illustrated in Fig. 6.

Table 4. Fault Rate vs. system progress

Fault rate (fr)    System progress (SP)
1.00E-16    0.999803
1.00E-15    0.999741
1.00E-14    0.999767
1.00E-13    0.999783
1.00E-12    0.999802
1.00E-11    0.999761
1.00E-10    0.999726
1.00E-09    0.999765
1.00E-08    0.999658
1.00E-07    0.999625

Fig 6 Fault Rate vs. System Progress

3.3.4 Drift rate vs. system progress
This subsection describes how the drift rate affects the system progress. For low values of the drift rate the system progress is slightly higher; overall, the system progress of the non-blocking protocol is not much affected by different values of the drift rate. Table 5 and Fig. 7 illustrate this.

Table 5. Drift Rate vs. System progress

Drift rate (ρ)    System progress (SP)
0.1         0.994184
0.01        0.993871
1.00E-03    0.993605
1.00E-04    0.983866
1.00E-05    0.993416
1.00E-06    0.994208
1.00E-07    0.993747
1.00E-08    0.994071
1.00E-09    0.993928
1.00E-10    0.994053

Fig 7 Drift Rate vs. System Progress

3.3.5 System progress validation
In Table 6, the first entry of the first column indicates that 10 samples of checkpoint intervals of length 100 are taken and the corresponding system progress for 100, 200, …, 1000 checkpoint intervals is evaluated; their average is shown in the second column (i.e. 0.99520). The third, fourth and fifth columns show the standard deviation and the upper and lower confidence limits respectively. Similarly, the system progress of the other samples having 2000, 3000, …, 10000 checkpoint intervals is validated. A similar validation can be applied to the other system parameters. The difference between the upper and lower confidence limits should be less than 2 × the tolerance value; here the tolerance value is 0.001 for 99% confidence.

Table 6. System progress validation

Sample No.    System progress average    σest       Upper confidence limit    Lower confidence limit
1000     0.99520    0.00114    0.9961    0.99427
2000     0.99180    0.00141    0.9929    0.99065
3000     0.98702    0.00146    0.9882    0.98583
4000     0.98215    0.00014    0.9833    0.98095
5000     0.97727    0.00014    0.9784    0.97606
6000     0.97238    0.00147    0.9735    0.97117
7000     0.96750    0.00147    0.9687    0.96629
8000     0.96263    0.00147    0.9638    0.96143
9000     0.95777    0.00146    0.9589    0.95658
10000    0.95293    0.00146    0.9541    0.95174


4. CONCLUSION
In this paper the problem of fault arrival is discussed. A probabilistic model is developed for evaluating the system progress of the processes for a particular set of parameters. The system progress is evaluated by introducing times generated by the negative exponential distribution function, and it is optimized for particular values of the system parameters. A validation of the system progress on the basis of the checkpoint interval length (L) is derived; a similar validation can be carried out for the other parameters such as drift rate, fault rate and saved checkpoint time.

5. ACKNOWLEDGMENTS
Sincere thanks to the HCTM Technical Campus Management, Kaithal-136027, Haryana, India for their constant encouragement.

6. REFERENCES
[1] Chandy K.M. and Lamport L., "Distributed Snapshots: Determining Global States of Distributed Systems," ACM Transactions on Computer Systems, vol. 3, no. 1, pp. 63-75, Feb. 1985.
[2] Chaoguang M., Yunlong Z. and Wenbin Y., "A two-phase time-based consistent checkpointing strategy," in Proc. ITNG'06, 3rd IEEE International Conference on Information Technology: New Generations, April 10-12, 2006, pp. 518-523.
[3] Chinara Suchistmita and Rath S.K., "An Energy Efficient Mobility Adaptive Distributed Clustering Algorithm for Mobile Ad-hoc Networks," 978-1-4244-2963-9/08, IEEE, 2008.
[4] Guohong Cao and Singhal Mukesh, "Mutable Checkpoints: a new checkpointing approach for Mobile Computing Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 2, pp. 157-172, February 2001.
[5] Koo R. and Toueg S., "Checkpointing and Rollback-Recovery for Distributed Systems," IEEE Transactions on Software Engineering, SE-13(1), pp. 23-31, January 1987.
[6] Kumar Lalit, Kumar Awasthi, "A Synchronous Checkpointing Protocol for Mobile Distributed Systems: Probabilistic Approach," International Journal of Information and Computer Security, vol. 1, no. 3, pp. 298-314, 2007.
[7] Lin C., Wang S., and Kuo S., "A Low Overhead Checkpointing Protocol for Mobile Computing System," in Proc. of the 2002 IEEE Pacific Rim International Symposium on Dependable Computing (PRDC'02).
[8] Lin C., Wang S., and Kuo S., "An efficient time-based checkpointing protocol for mobile computing systems over wide area networks," in Lecture Notes in Computer Science 2400, Euro-Par 2002, Springer-Verlag, 2002, pp. 978-982. Also in Mobile Networks and Applications, 2003, vol. 8, no. 6, pp. 687-697.
[9] Neves N., Fuchs W.K., "Using time to improve the performance of coordinated checkpointing," in Proceedings of the 2nd IEEE International Computer Performance and Dependability Symposium, Urbana-Champaign, USA, 1996, pp. 282-291.
[10] Panghal Anil, Panghal Sharda, Rana Mukesh, "Checkpointing Based Rollback Recovery in Distributed Systems," Journal of Current Computer Science and Technology, vol. 1, issue 6, 2011, pp. 258-266.
[11] Prakash R. and Singhal M., "Low-Cost Checkpointing and Failure Recovery in Mobile Computing Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 10, pp. 1035-1048, October 1996.
[12] Narsingh Deo, "System Simulation with Digital Computer."


International Journal of Computer Applications (0975 – 8887)

Volume 52– No.11, August 2012


Adaptive Learning for Algorithm Selection in Classification

Nitin Pise
Research Scholar, Department of Computer Engg. & IT, College of Engineering, Pune, India

Parag Kulkarni, PhD, Adjunct Professor
Department of Computer Engg. & IT, College of Engineering, Pune, India

ABSTRACT
No learner is generally better than another learner: if a learner performs better than another learner in some learning situations, then the first learner usually performs worse than the second learner in other situations. In other words, no single learning algorithm can perform well and uniformly outperform other algorithms over all learning or data mining tasks. There is an increasing number of algorithms and practices that can be used for the very same application. With the explosion of available learning algorithms, a method for helping the user select the most appropriate algorithm or combination of algorithms to solve a problem is becoming increasingly important. In this paper we use meta-learning to relate the performance of machine learning algorithms to different datasets. The paper concludes by proposing a system which can learn dynamically from the given data.

General Terms

Machine Learning, Pattern Classification

Keywords
Learning algorithms, dataset characteristics, algorithm selection

1. INTRODUCTION

Knowledge discovery [3] is an iterative process. The analyst must select the right approach for the task to be performed and, within it, the right model or algorithm, where the special morphological characteristics of the problem must always be considered. The algorithm is then invoked and its output is evaluated. If the evaluation results are poor, the process is repeated with new selections. A plethora of commercial and prototype systems with a variety of models and algorithms exists at the analyst's disposal; however, the selection among them is left to the analyst. The machine learning field has been evolving for a long time and has given us a variety of models and algorithms to perform classification, e.g. decision trees, neural networks, support vector machines [4], rule inducers, nearest neighbor, etc. The analyst must select among them the ones that better match the morphology and the special characteristics of the problem at hand. This selection is one of the most difficult problems, since there is no model or algorithm that performs better than all others independently of the particular problem characteristics. A wrong choice of model can have an even more severe impact: a hypothesis appropriate for the problem at hand might be ignored because it is not contained in the model's search space.

There is an increasing number of algorithms and practices that can be used for the very same application. Extensive research has been performed to develop appropriate machine learning techniques for different data mining tasks, and this has led to a proliferation of different learning algorithms. However, previous work has shown that no learner is generally better than another learner: if a learner performs better than another learner in some learning situations, then the first learner usually performs worse than the second learner in other situations [5]. In other words, no single learning algorithm can outperform all other algorithms over all classification tasks. This has been confirmed by the "no free lunch" theorems [6]. The major reasons are that a learning algorithm performs differently when processing different datasets and that different algorithms embody different varieties of 'inductive bias' [7]. In real-world applications, users need to select an appropriate learning algorithm according to the classification task that is to be performed [8],[9]. If the algorithm is selected inappropriately, the result is slow convergence or a sub-optimal local minimum. Meta-learning has been proposed to deal with the issue of algorithm selection [10]. One of the aims of meta-learning is to help the user determine the most suitable learning algorithm(s) for the problem at hand. The task of meta-learning is to find functions that map datasets to predicted data mining performance (e.g., predictive accuracy, execution time, etc.). To this end, meta-learning uses a set of attributes, called meta-attributes, to represent the characteristics of classification tasks, and searches for correlations between these attributes and the performance of learning algorithms. Instead of executing all learning algorithms to obtain the optimal one, meta-learning is performed on the meta-data characterizing the data mining tasks. The effectiveness of meta-learning is largely dependent on the description of tasks (i.e., meta-attributes).

Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data points by taking a vote of their predictions. Combining classifiers, or studying methods for constructing good ensembles of classifiers to achieve higher accuracy, is an important research topic [1][2]. The drawback of ensemble learning is that, in order for it to be computationally efficient, the approximation of the posterior needs to have a simple factorial structure. This means that most dependence between the various parameters cannot be estimated, and it is difficult to measure correlation between classifiers built from different types of learners. There are also learning-time and memory constraints, and the learned concept is difficult to understand.

So we propose adaptive learning. We need to propose an algorithm for selecting methods for the classification task. The datasets are identified and mapped to learning algorithms or methods, and an adaptive function is generated. Adaptive learning will be built on top of the ensemble methods.

2. RELATED WORKS

Several algorithm selection systems and strategies have been

proposed previously [3][10][11][12]. STATLOG [14] extracts

various characteristics from a set of datasets. Then it

combines these characteristics with the performance of the

algorithms. Rules are generated to guide inducer selection

based on the dataset characteristics. This method is based on

the morphological similarity between the new dataset and

existing collection of datasets. When a new dataset is

presented, it compares the characteristics of the new dataset to

the collection of the old datasets. This costs a lot of time.

Predictive clustering trees for ranking are proposed in [15]. It

uses relational descriptions of the tasks. The relative

performance of the algorithms on a given dataset is predicted

for a given relational dataset description. Results are not very

good, with most relative errors over 1.0 which are worse than

default prediction. Data Mining Advisor (DMA) [16] is a

system that already has a set of algorithms and a collection of

training datasets. The performance of the algorithms for every

subset in the training datasets is known. When the user

presents a new dataset, DMA first finds a similar subset in the

training datasets. Then it retrieves information about the

performance of algorithms and ranks the algorithms and gives

the appropriate recommendation. Our approach is inspired by

the above method used in [16].

Most work in this area is aimed at relating properties of data

to the effect of learning algorithms, including several large

scale studies such as the STATLOG (Michie et al., 1994) and

METAL (METAL-consortium, 2003) projects. We will use

this term in a broader sense, referring both to ‘manual’

analysis of learner performance, by querying, and automatic

model building, by applying learning algorithms over large

collections of meta-data. An instance based learning algorithm

(K-nearest neighbor) was used to determine which training

datasets are closest to a test dataset based on similarity of

features, and then to predict the ranking of each algorithm

based on the performance of the neighboring datasets.

3. LEARNING ALGORITHMS AND DATASET CHARACTERISTICS

In general there are two families of algorithms: the statistical ones, which are best applied by an experienced analyst since they require considerable technical skill and specific assumptions, and the data mining tools, which do not require much model specification but offer few diagnostic tools. Each family has reliable and well-tested algorithms that can be used for prediction. In the classification task [11], the most frequently encountered algorithms are logistic regression (LR), decision trees and decision rules, neural networks (NN) and discriminant analysis (DA). For regression, multiple linear regression (MLR), classification & regression trees (CART) and neural networks have been used extensively.

In the classification task the error rate is defined

straightforwardly as the percentage of the misclassified cases

in the observed versus predicted contingency table. When

NNs are used to predict a scalar quantity, the square of the

correlation for the predicted outcome with the target response

is analogous to the r-square measure of MLR. Therefore the

error rate can be defined in the prediction task as:

Error rate = 1 − correlation²(observed, predicted)

In both tasks, error rate varies from zero to one, with one

indicating bad performance of the model and zero the best

possible performance.
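As a small illustration (not the paper's code; the arrays below are made up and NumPy is our choice of library), both error-rate definitions can be computed as follows:

```python
import numpy as np

def prediction_error_rate(observed, predicted) -> float:
    """Error rate for a prediction task: 1 - correlation(observed, predicted)**2."""
    r = np.corrcoef(observed, predicted)[0, 1]
    return 1.0 - r ** 2

def classification_error_rate(actual, predicted) -> float:
    """Error rate for a classification task: fraction of misclassified cases."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return float(np.mean(actual != predicted))

print(prediction_error_rate([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
print(classification_error_rate([0, 1, 1, 0], [0, 1, 0, 0]))
```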

The dataset characteristics are related with the type of

problem. In the case of the classification task the number of

classes, the entropy of the classes and the percent of the mode

category of the class can be used as useful indicators. The

relevant ones for the regression task might be the mean value

of the dependent variable, the median, the mode, the standard

deviation, skewness and kurtosis. Some database measures

include the number of the records, the percent of the original

dataset used for training and for testing, the number of

missing values and the percent of incomplete records. Also

useful information lies on the total number of variables. For

the categorical variables of the database, the number of

dimensions in homogeneity analysis and the average gain of

the first and second Eigen values of homogeneity analysis as

well as the average attribute entropy are the corresponding

statistics. For the continuous variables, the average mean

value, the average 5% trimmed mean, the median, the

variance, the standard deviation, the range, the inter-quartile

range, skewness, kurtosis and the Huber’s M-estimator are

some of the useful statistics that can be applied to capture the

information on the data set.

The determinant of the correlation matrix is an indicator of the interdependency of the attributes in the data set. The average correlation, as captured by the Cronbach-α reliability coefficient, may still be an important statistic. By applying principal component analysis to the numerical variables of the data set, the first and second largest eigenvalues can be observed.

If the data set for a classification task has categorical

explanatory variables, then the average information gain and

the noise to signal ratio are two useful information measures,

while the average Goodman and Kruskal tau and the average

chi-square significance value are two statistical indicators.

Also in the case of continuous explanatory variables, Wilks’

lambda and the canonical correlation of the first

discrimination function may be measures for the

discriminating power within the data set.

By comparing a numeric with a nominal variable with the

student’s t-test, two important statistics are produced to

indicate the degree of their relation, namely Eta squared and

the Significance of the F-test.

Table 1. DCT dataset properties [17]

Nr_Attributes Nr_num_attributes

Nr_sym_attributes Nr_examples

Nr_classes MissingValues_Total

MissingValues_relative Mean_Absolute_Skew

MStatistic MeanKurtosis

NumAttrsWithOutliers MstatDF

MstatChiSq SDRatio

WiksLambda Fract

Cancor BarlettStatistic

Class Entropy Mutual Information

Joint Entropy Eqivalent_nr_of_attrs

Entropy Attributes NoiseSignalRatio


4. PROPOSED METHOD
Here we consider the properties of learning scenarios: we need to classify the learning scenario, and we extract features of the input data or datasets. We use the concept of meta-learning. Meta-learning relates algorithms to their area of expertise using specific problem characteristics. The idea of meta-learning is to learn about classifiers or learning algorithms in terms of the kind of data for which they actually perform well. Using dataset characteristics, called meta-features, one predicts the performance of individual learning algorithms. These features are divided into several categories:

algorithms. These features are divided into several categories:

Sample or general features: Here we need to find

out the number of classes, the number of attributes,

the number of categorical attributes, the number of

samples or instances etc.

Statistical features: Here we require to find

canonical discriminant, correlations, skew, kurtosis

etc.

Information theoretic features: Here we need to

extract class entropy, signal to noise ratio etc.
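A minimal sketch of extracting a few such meta-features is shown below; the feature names and the use of NumPy/SciPy are our assumptions and only approximate the DCT property set of Table 1:

```python
import numpy as np
from scipy.stats import entropy, kurtosis, skew

def meta_features(X_num: np.ndarray, y: np.ndarray) -> dict:
    """Compute a few illustrative meta-features.
    X_num: numeric attributes (n_samples x n_num_attrs), y: class labels."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return {
        # sample / general features
        "n_examples": int(X_num.shape[0]),
        "n_num_attributes": int(X_num.shape[1]),
        "n_classes": int(len(counts)),
        # statistical features
        "mean_abs_skew": float(np.mean(np.abs(skew(X_num, axis=0)))),
        "mean_kurtosis": float(np.mean(kurtosis(X_num, axis=0))),
        # information-theoretic features
        "class_entropy": float(entropy(p, base=2)),
    }
```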

We propose an adaptive methodology. Different aspects can be considered, e.g. parameters such as the input data, the learning methods, the learning policies and the combination of learning methods. There can be a single learner or multiple learners, and simple voting or averaging can be used when combining the outputs of the different learners.

5. EXPERIMENTS

5.1 Experimental Description
Here we need to map the characteristics of a dataset to the performance of the algorithms. We capture knowledge about the algorithms from experiments, calculating each algorithm's accuracy on every dataset. After the experiments, the accuracy of each algorithm on every dataset is saved in the knowledge base for future use. The ranking procedure is shown in Figure 1.

Given a new dataset, we use k-NN [7] to find the dataset in the knowledge base most similar to the new one. K-nearest-neighbor learning is the most basic instance-based method. The nearest neighbors of an instance are defined in terms of the standard Euclidean distance. Let an arbitrary instance x be described by the feature vector

<a1(x), a2(x), …, an(x)>

where ar(x) denotes the value of the rth attribute of instance x. Then the distance between two instances xi and xj is defined to be

d(xi, xj) = √( ∑ (ar(xi) − ar(xj))² )

where r ranges from 1 to n in the summation. 24 characteristics are used to compare the similarity of two datasets. A distance function based on the characteristics of the two datasets is used to find the most similar neighbors, whose performance is expected to be similar or relevant to the new dataset. The recommended ranking for the new dataset is built by aggregating the learning algorithms' performance on the similar datasets. The knowledge base (KB) stores each dataset's characteristics and the learning algorithms' performance on that dataset.
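A minimal sketch of this ranking step is given below, with our own function and variable names and a made-up toy knowledge base (the paper does not give implementation details):

```python
import numpy as np

def recommend(new_meta, kb_meta, kb_perf, algorithms, k=3):
    """Rank algorithms for a new dataset by k-NN over meta-features.
    new_meta: meta-feature vector of the new dataset (length d)
    kb_meta:  (m x d) meta-features of the m stored datasets
    kb_perf:  (m x a) accuracies of the a algorithms on the stored datasets
    algorithms: list of a algorithm names"""
    dists = np.linalg.norm(kb_meta - new_meta, axis=1)   # Euclidean distance to each stored dataset
    nearest = np.argsort(dists)[:k]                      # indices of the k most similar datasets
    scores = kb_perf[nearest].mean(axis=0)               # aggregate each algorithm's accuracy
    order = np.argsort(-scores)                          # best first
    return [(algorithms[i], float(scores[i])) for i in order]

# Toy knowledge base: 4 stored datasets, 3 algorithms (illustrative numbers only)
kb_meta = np.array([[2, 14, 0.8], [5, 30, 1.5], [2, 10, 0.7], [10, 60, 2.1]])
kb_perf = np.array([[0.85, 0.76, 0.80], [0.70, 0.66, 0.72],
                    [0.88, 0.75, 0.83], [0.60, 0.55, 0.65]])
print(recommend(np.array([2, 12, 0.75]), kb_meta, kb_perf,
                ["LogitBoost", "ZeroR", "J48"], k=2))
```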


Fig 1: The Ranking of Learning Algorithms (New Dataset → Calculate Dataset Characteristics → k-NN against the Knowledge base of dataset characteristics and learning algorithms' performance → k Similar Datasets → Ranking of learning algorithms for the new dataset → Decision Making → Result: Recommended learning algorithm)

6. RESULTS AND DISCUSSIONS
Here we have used the Adult dataset [13]. The Adult dataset has the following features:

- 48842 instances
- 14 attributes (6 continuous, 8 nominal)
- Contains information on adults such as age, gender, ethnicity, marital status, education, native country, etc.
- The instances are classified into either "Salary > 50K" or "Salary <= 50K"

Table 2 shows the ranking of the eight algorithms used on the Adult dataset from the UCI Repository. The table gives the highest rank to the LogitBoost algorithm, then to J48 and OneR, and finally the lowest rank to the ZeroR algorithm.


Table 2. Ranking of different algorithms on Adult Dataset

Algorithm        Rank
LogitBoost       1
J48              2
OneR             3
DecisionStump    4
IB1              5
IBK              6
NaiveBayes       7
ZeroR            8

Table 3. Correctly & Incorrectly Classified Instances for Adult Dataset

Algorithm     % correctly classified    % incorrectly classified
LogitBoost    84.68                     15.32
ZeroR         76.07                     23.93

Fig. 2: % classified instances with the top-ranked algorithm LogitBoost on the Adult dataset

Figure 2 shows the percentage of classified instances with the top-ranked algorithm, LogitBoost, on the Adult dataset; 84.68% of the instances are correctly classified. Figure 3 shows the percentage of classified instances with the lowest-ranked algorithm, ZeroR, on the Adult dataset; 76.07% of the instances are correctly classified.

Fig. 3: % classified instances with the lowest-ranked algorithm ZeroR on the Adult dataset

7. CONCLUSIONS AND FUTURE WORK
In this paper we present our preliminary work on using a meta-learning method for helping the user effectively select the most appropriate learning algorithms and give a ranking recommendation automatically. It will assist both novice and expert users. The ranking system can reduce the search space, give recommendations and guide the user to select the most suitable algorithms. Thus the system will help to learn adaptively using experience from past data. In future work, we will investigate our proposed method further and test it extensively on other datasets. Meta-learning helps improve results over the basic algorithms: using meta-characteristics on the Adult dataset to determine an appropriate algorithm, almost 85% correct classification is achieved with the LogitBoost algorithm, so out of the eight algorithms LogitBoost is recommended to the user.

8. ACKNOWLEDGMENTS

Our thanks to the experts who have contributed towards the development of the different algorithms and made them available to users.

9. REFERENCES

[1] Kuncheva, L., Bezdek, J., and Duin, R. 2001. Decision Templates for Multiple Classifier Fusion: An Experimental Comparison. Pattern Recognition, 34(2), pp. 299-314.

[2] Dietterich, T. 2002. Ensemble Methods in Machine Learning. 1st Int. Workshop on Multiple Classifier Systems, Lecture Notes in Computer Science, F. Roli and J. Kittler, Eds., Vol. 1857, pp. 1-15.

[3] Alexandros, K. and Melanie, H. 2001. Model Selection via Meta-Learning: A Comparative Study. International Journal on Artificial Intelligence Tools, Vol. 10, No. 4.

[4] Joachims, T. 1998. Text Categorization with Support Vector Machines: Learning with Many Relevant Features. Proceedings of the European Conference on Machine Learning, Springer.

[5] Schaffer, C. 1994. Cross-validation, stacking and bi-level stacking: Meta-methods for classification learning. In Cheeseman, P. and Oldford, R.W. (eds), Selecting Models from Data: Artificial Intelligence and Statistics IV, 51-59.



[6] Wolpert, D. 1996. The Lack of A Priori Distinctions between Learning Algorithms. Neural Computation, 8, 1341-1420.

[7] Mitchell, T. 1997. Machine Learning. McGraw Hill.

[8] Brodley, C. E. 1995. Recursive automatic bias selection for classifier construction. Machine Learning, 20, 63-94.

[9] Schaffer, C. 1993. Selecting a Classification Method by Cross-Validation. Machine Learning, 13, 135-143.

[10] Kalousis, A. and Hilario, M. 2000. Model Selection via Meta-learning: a Comparative Study. Proceedings of the 12th International IEEE Conference on Tools with AI, Canada, 214-220.

[11] Koliastasis, D. and Despotis, D. 2004. Rules for Comparing Predictive Data Mining Algorithms by Error Rate. OPSEARCH, Vol. 41, No. 3.

[12] Fan, L. and Lei, M. 2006. Reducing Cognitive Overload by Meta-Learning Assisted Algorithm Selection. Proceedings of the 5th IEEE International Conference on Cognitive Informatics, pp. 120-125.

[13] Frank, A. and Asuncion, A. 2010. UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

[14] Michie, D. and Spiegelhalter, D. 1994. Machine Learning, Neural and Statistical Classification. Ellis Horwood Series in Artificial Intelligence.

[15] Todorovski, L. and Blockeel, H. 2002. Ranking with Predictive Clustering Trees. Efficient Multi-Relational Data Mining.

[16] Alexandros, K. and Melanie, H. 2001. Model Selection.

[17] Peng, Y., Flach, P., Soares, C. and Brazdil, P. 2002. Improved Dataset Characterization for Meta-learning. Springer LNCS 2534, pp. 141-152.


Routing Protocol for Mobile Nodes in Wireless Sensor Network

Bhagyashri Bansode
Department of Computer Engineering, Pune Institute of Computer Technology, Pune, Maharashtra, India

Rajesh Ingle, PhD
Department of Computer Engineering, Pune Institute of Computer Technology, Pune, Maharashtra, India

ABSTRACT

A wireless sensor network is made up of sensor nodes which are either fixed or mobile. LEACH is a cluster-based protocol that uses time division multiple access and supports mobile nodes in a WSN. A mobile node changes its cluster over time. LEACH waits for two TDMA cycles to update the cluster; within these two cycles a mobile node that has changed cluster head cannot send data to any other cluster head, which causes packet loss. We propose an adaptive Low Packet Loss Routing protocol which supports mobile nodes with low packet loss. This protocol uses time division multiple access scheduling to conserve the battery of each sensor node. We form clusters, and each cluster head updates its cluster after every TDMA cycle to reduce packet loss. The proposed protocol sends data to cluster heads in an efficient manner based on received signal strength. The performance of the proposed LPLR protocol is evaluated using NS2.34 on the Linux 2.6.23.1.42.fc8 platform. It has been observed that the proposed protocol reduces packet loss compared to the LEACH-Mobile protocol.

Keywords

Cluster based routing, mobility, LEACH-Mobile, WSN

1. INTRODUCTION

A wireless sensor network (WSN) consists of spatially

distributed autonomous sensors to monitor physical or

environmental conditions, such as temperature, sound,

vibration, pressure, humidity, motion or pollutants and to

cooperatively pass their data through the network to a main

location. Modern networks are bi-directional, also enabling

control of sensor activity. The development of wireless sensor

networks was motivated by military applications such as

battlefield surveillance; today such networks are used in many

industrial and consumer applications, such as industrial

process monitoring and control, machine health monitoring.

A WSN consists of mobile or fixed sensor nodes; in some cases it consists of a hybrid of both. All nodes sense and send data to a server, which increases communication overhead because every node is transmitting to the server. Such a network may contain hundreds or thousands of sensor nodes, and the main challenges in a WSN are to reduce energy consumption and packet loss at each sensor node. There are many routing protocols such as Destination Sequenced Distance Vector (DSDV), Dynamic Source Routing (DSR), and Ad hoc On Demand Distance Vector (AODV) [1]. These protocols can be applied to WSNs, but they are not suitable for tiny, low-capacity sensor nodes because they require high power consumption. Flat-based multi-hop routing protocols designed for static WSNs [2-6] have also been applied to WSNs with mobile nodes; however, they do not support the mobility of sensor nodes.

The main challenge in a WSN is to minimize the energy consumption of each sensor node. Many researchers concentrate on routing protocols that consume less power and hence prolong the network's life span. Wireless ad hoc network routing protocols have also been proposed for routing in WSNs.

Low Energy Adaptive Clustering Hierarchy-Mobile (LEACH-Mobile) [7] is a routing protocol which supports WSNs that contain mobile nodes. LEACH-Mobile supports sensor node mobility in a WSN by adding a membership declaration to the LEACH protocol. The LEACH-Mobile protocol selects cluster heads randomly and forms clusters. Each cluster head creates a Time Division Multiple Access (TDMA) schedule, and nodes sense and send their data to the cluster head according to this schedule. The mobility of nodes makes it challenging to maintain clusters, because mobile nodes change clusters continuously. The LEACH-Mobile protocol updates a cluster only after every two cycles of the TDMA schedule, and packet loss occurs between these two cycles: a mobile node that is no longer near any cluster head cannot send data to any cluster head, which causes packet loss.

In LEACH-Mobile, a cluster head waits for two consecutive failed TDMA cycles before deciding that a sensor node has moved out of its cluster, and during these two TDMA cycles the sensor node loses packets. In LPLR, the cluster head does not need to wait for two consecutive TDMA cycles to make this decision; it decides that a member node has moved out of its cluster after a single TDMA cycle. Data loss is reduced because the moving node sends its data to a new cluster head along with a join acknowledgment message.

We propose a new low-packet-loss, power-efficient routing protocol for WSNs, called the Low Packet Loss Routing protocol for mobile nodes in wireless sensor networks (LPLR Mobile-WSN). In the proposed protocol, the cluster head sends a data request message to each of its members. When the cluster head does not receive data from a member, the packet is considered lost and the cluster head deletes that sensor node's membership from the cluster. Conversely, when a sensor node does not receive a data request message from its cluster head, it tries to gain entry into a new cluster to avoid packet loss. Cluster heads give incoming nodes from other clusters an entry in their TDMA schedule. The transmitter sends its messages according to the received signal strength of the data request message from the cluster head.


Table 1. Abbreviations

WSN           Wireless Sensor Network
DSDV          Destination Sequenced Distance Vector
DSR           Dynamic Source Routing
AODV          Ad hoc On Demand Distance Vector
LEACH-Mobile  Low Energy Adaptive Clustering Hierarchy-Mobile
TDMA          Time Division Multiple Access
LPLR          Low Packet Loss Routing
CSMA          Carrier Sense Multiple Access
CA MAC        Collision Avoidance Medium Access Control
CH            Cluster Head
SN            Sensor Node

2. LOW PACKET LOSS ROUTING

Low Packet Loss Routing (LPLR) is a routing protocol for wireless sensor networks designed to handle packet loss and use energy resources efficiently. In this protocol, a cluster head receives data not only from its members during their TDMA-allocated time slots but also from other, lost sensor nodes. The WSN consists of both mobile and fixed nodes, and after cluster formation a mobile node can change cluster. LPLR gives such a mobile node an entry in a new cluster and a TDMA slot in which to send its data.

2.1 Selection of Cluster Head

Protocols like TEEN [8] and APTEEN [9] use stationary but dynamically changing cluster heads. In some protocols where the sensor nodes are mobile, the cluster head is selected according to a mobility factor [10]: the node with the smallest mobility factor in each cluster is chosen as cluster head. In LEACH-Mobile the cluster head is assumed to be stationary and static in order to control mobility. In the proposed protocol we elect cluster heads randomly, and they are assumed to be stationary and static through the rounds.

2.2 Formation of Cluster

After a cluster head has been selected, it broadcasts an advertisement message to the rest of the sensor nodes in the network, as in LEACH and LEACH-Mobile. For these advertisement messages, cluster heads use a Carrier Sense Multiple Access with Collision Avoidance Medium Access Control (CSMA/CA MAC) protocol, and all cluster heads use the same transmit energy. Sensor nodes must keep their receivers on in order to receive the advertisement messages from the cluster heads. After a sensor node has received advertisement messages from one or more cluster heads, it compares the received signal strengths and decides which cluster it will belong to. Assuming symmetric propagation channels, the sensor node selects the cluster head for which the minimum amount of transmitted energy is needed for communication; in the case of a tie, a random cluster head is chosen. After deciding which cluster it will belong to, the node sends a registration message to inform the cluster head. These registration messages are transmitted to the cluster heads using the CSMA/CA MAC protocol, and during this phase all cluster heads must keep their receivers on.
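The cluster-joining rule just described can be sketched as follows. The message format (head id, received signal strength) and the tie-breaking helper are illustrative assumptions, not part of the paper's implementation.

```python
import random

def choose_cluster_head(advertisements):
    """advertisements: list of (cluster_head_id, received_signal_strength)."""
    best_rss = max(rss for _, rss in advertisements)
    # All heads whose advertisement arrived with the strongest signal
    candidates = [head for head, rss in advertisements if rss == best_rss]
    # Strongest signal implies least transmit energy is needed; ties broken randomly
    return random.choice(candidates)

# Example: a node hearing three cluster heads
print(choose_cluster_head([("CH1", -71.0), ("CH2", -64.5), ("CH3", -64.5)]))
```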

2.3 TDMA Schedule Creation

After the cluster head receives registration messages from the nodes that would like to join the cluster, it creates a TDMA schedule based on the number of nodes and assigns each node a time slot in which to transmit its data. This schedule is broadcast to all the sensor nodes in the cluster, and all sensor nodes then transmit data according to the TDMA schedule.
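A minimal sketch of this slot assignment is given below, assuming one slot per registered member in registration order; the slot numbering and dictionary layout are assumptions made for illustration.

```python
def build_tdma_schedule(registered_nodes):
    """registered_nodes: list of node ids that sent a registration message."""
    return {node: slot for slot, node in enumerate(registered_nodes)}

schedule = build_tdma_schedule(["S3", "S7", "S12"])
print(schedule)  # {'S3': 0, 'S7': 1, 'S12': 2} -> broadcast to the cluster
```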

2.4 Data Transmission

Once the clusters are created and the TDMA schedule is fixed, data transmission from the sensor nodes to their cluster heads begins according to the TDMA schedule. Upon receiving a data request from the cluster head, a sensor node switches on its radio transmitter, adjusts its transmission power and sends its data. At the end of the transmission the node turns off its radio, which saves the sensor node's battery. The cluster head must keep its radio on to send data request messages, receive data from the sensor nodes, and send and receive the other messages needed to maintain the network. When a sensor node receives a data request message from the cluster head, it sends its data back to the cluster head; if it does not receive a data request message from its cluster head, it sends its data to a free cluster head.

3. LPLR ALGORITHM

Cluster heads: CH1 to CH6; sensor nodes: S1 to S51.

1. Select cluster heads randomly: choose CH1 to CH6 from the sensor nodes S1 to S51.
2. All cluster heads broadcast an advertisement message to the remaining sensor nodes (S1 to S51 except the cluster heads).
3. Each sensor node receives advertisement messages from CH1 to CH6.
4. The sensor node compares the received signal strengths, selects the maximum, and accordingly selects one of CH1 to CH6 as its cluster head.
5. The sensor node sends a registration message to the selected cluster head.
6. The cluster head creates a TDMA schedule and broadcasts it to all member nodes.
7. According to the TDMA schedule, the cluster head sends a data request message to each member node.
8. The member node sends its data to the cluster head.

Figure 1: Messages of Cluster Head and Sensor Node

3.1 Cluster Head

The cluster head broadcasts an advertisement message. If the cluster head receives registration messages, it creates a TDMA schedule


and broadcasts that schedule. When the cluster head finishes receiving data messages, it checks whether it has received data messages from all its members. If any member node did not send a data message, the cluster head removes that sensor node from the cluster. The cluster head then broadcasts an advertisement message to all nodes again and updates the cluster; updating the cluster allows new mobile nodes to obtain an entry in the cluster's TDMA schedule. Figure 1 shows the messages transferred and received by the cluster head.

3.2 Sensor Node

A sensor node receives advertisement messages from one or more cluster heads and selects a head according to the received signal strength. The node sends a registration message to the selected head and gets an entry in that cluster's TDMA schedule. A member node sends data to the cluster head according to the TDMA schedule when it receives a data request message from the cluster head. If a sensor node does not receive any advertisement or any data request message, it sends its data to a free cluster head. Figure 1 shows the messages transferred and received by the sensor node. A per-cycle sketch of this behaviour is given below.
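The sketch below summarises, under the assumptions stated in the comments, the per-TDMA-cycle behaviour of Sections 3.1 and 3.2; the data structures are illustrative and radio/timing details are omitted.

```python
def cluster_head_cycle(members, reported_this_cycle):
    """Drop members that stayed silent for one full TDMA cycle (not two, as in
    LEACH-Mobile), then re-advertise so new mobile nodes can join."""
    silent = members - reported_this_cycle
    members = members - silent          # remove silent nodes from the cluster
    # broadcast_advertisement(); rebuild the TDMA schedule with current members
    return members

def sensor_node_cycle(got_data_request):
    """If no data request arrived, immediately seek a free cluster head instead
    of waiting, which is how LPLR reduces packet loss."""
    if got_data_request:
        return "send data to current cluster head"
    return "send data and a join request to a free cluster head"

print(cluster_head_cycle({"S3", "S7", "S12"}, {"S3", "S12"}))  # {'S3', 'S12'}
print(sensor_node_cycle(False))
```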

4. SIMULATION RESULT

We have simulated LPLR and LEACH-Mobile using NS2.34 on the Linux 2.6.23.1.42.fc8 platform with the parameters shown in Table 2. The basic hardware requirement for this simulation is a Pentium 4 processor, 512 MB RAM and a 10 GB hard disk.

Table 2. Performance Parameters

Parameter                           Value
Network Size (L*W)                  1800*840
Number of Sensor Nodes (N)          51
Percentage of Cluster Heads         10%
Percentage of Mobile Sensor Nodes   90%
Sensing Range                       500 m
Sensor Speed                        1-17 m/s

Figure 2. Total Number of Received Packets

Figure 2 shows the total number of packets received by each cluster head from its member nodes. We applied LEACH-Mobile to the network and observed the number of packets received by each cluster head from its member nodes in the WSN, and we then applied LPLR to the same WSN and made the same observation. We can see from Figure 2 that LPLR achieves a significant improvement compared to LEACH-Mobile; we can conclude that packet loss decreases using LPLR.

Figure 3. Remaining Energy of Nodes

Figure 3 shows the remaining energy of every sensor node in the WSN. A member node wakes up to send data according to the TDMA schedule and is otherwise in sleep mode, which reduces energy consumption. Comparing the remaining energy of the sensor nodes under LEACH-Mobile and LPLR in Figure 3, we can say that a sensor node conserves more battery using LPLR than using LEACH-Mobile.

Figure 4. Packet Delivery Ratio

Figure 4 shows the packet delivery ratio of the cluster heads under LPLR and LEACH-Mobile. From this figure we can compare the delivery ratio of both protocols; the performance of LPLR is more efficient than that of LEACH-Mobile.

5. CONCLUSION

We proposed a cluster-based routing protocol, LPLR, which is efficient compared to the LEACH-Mobile protocol. In the proposed protocol all sensor nodes are maintained by a cluster head, which also maintains the TDMA schedule, collects data from the member nodes and sends that data to the server. If a sensor node fails or its battery is discharged, the cluster head removes that node's record from the TDMA schedule and updates the schedule. Cluster head failure can affect the working of the WSN: member nodes cannot send data to the cluster head in that case, which causes packet loss. We


are working on the cluster head failure case to obtain better results than the proposed LPLR protocol.

We proposed an efficient routing protocol for mobile nodes in wireless sensor networks. The LPLR protocol is efficient in energy consumption and packet delivery in the network. It forms clusters, and each cluster head creates and maintains a TDMA schedule. A sensor node wakes up only at the time of sending data and otherwise stays in sleep mode, which conserves its battery. An important feature of this protocol is that it updates the cluster after every TDMA cycle, so every mobile node can get entry into a new cluster and continue sending data. LPLR also maintains a free cluster head: a mobile sensor node which is not in the range of any cluster, or which has moved from one cluster and is waiting for a new one, sends its data to the free cluster head, which reduces packet loss. We have simulated LEACH-Mobile and the proposed LPLR protocol and obtained efficient results for LPLR.

6. REFERENCES

[1] C. Perkins and P. Bhagwat, "Highly Dynamic Destination-Sequenced Distance-Vector Routing (DSDV) for Mobile Computers," presented at the ACM '94 Conference on Communications Architectures, Protocols and Applications, 1994.

[2] W. Heinzelman, J. Kulik, and H. Balakrishnan, "Adaptive protocols for information dissemination in wireless sensor networks," Proc. 5th ACM/IEEE Mobicom Conference (MobiCom '99), Seattle, WA, August 1999, pp. 174-185.

[3] J. Kulik, W. R. Heinzelman, and H. Balakrishnan, "Negotiation-based protocols for disseminating information in wireless sensor networks," Wireless Networks, Vol. 8, 2002, pp. 169-185.

[4] C. Intanagonwiwat, R. Govindan, and D. Estrin, "Directed diffusion: a scalable and robust communication paradigm for sensor networks," Proc. of ACM MobiCom '00, Boston, MA, 2000, pp. 56-67.

[5] D. Braginsky and D. Estrin, "Rumor routing algorithm for sensor networks," Proc. of the 1st Workshop on Sensor Networks and Applications (WSNA), Atlanta, GA, October 2002.

[6] Y. Yao and J. Gehrke, "The cougar approach to in-network query processing in sensor networks," SIGMOD Record, September 2002.

[7] Guofeng Hou and K. Wendy Tang, "Evaluation of LEACH protocol subject to different traffic models," presented at the First International Conference on Next Generation Networks (NGNCON 2006), Hyatt Regency Jeju, Korea, July 9-13, 2006.

[8] A. Manjeshwar and D. P. Agarwal, "TEEN: a routing protocol for enhanced efficiency in wireless sensor networks," presented at the 1st Int. Workshop on Parallel and Distributed Computing Issues in Wireless Networks and Mobile Computing, April 2001.

[9] A. Manjeshwar and D. P. Agarwal, "APTEEN: A hybrid protocol for efficient routing and comprehensive information retrieval in wireless sensor networks," Parallel and Distributed Processing Symposium (IPDPS 2002), pp. 195-202.

[10] M. Liliana, C. Arboleda, and N. Nasser, "Cluster-based routing protocol for mobile sensor networks," presented at the 3rd Int. Conf. on Quality of Service in Heterogeneous Wired/Wireless Networks, Waterloo, Ontario, Canada, 2006.


32-Bit NxN Matrix Multiplication: Performance Evaluation for Altera FPGA, i5 Clarkdale, and Atom Pineview-D Intel

General Purpose Processors

Izzeldin Ibrahim Mohd

Faculty of Elect. Engineering,

Universiti Teknologi Malaysia,

81310 JB, Johor

Chay Chin Fatt
Intel Technology Sdn. Bhd., PG9 (Intel U Building), Bayan Lepas, 11900 Penang

Muhammad N. Marsono

Faculty of Elect. Engineering,

Universiti Teknologi Malaysia,

81310 JB, Johor

ABSTRACT

Nowadays mobile devices represent a significant portion of the market for embedded systems and are continuously in demand in daily life. From the end-user perspective, size, weight and features are the key quality criteria. These criteria have become the usual design constraints in the embedded systems design process and have a strong impact on power consumption. This paper surveys and explores different low-power design techniques for FPGAs and processors. We compare, evaluate and analyze the power and energy consumption of three different designs, namely an Altera Cyclone II FPGA with a systolic array matrix multiplier implemented on it, and the Intel i5 Clarkdale and Atom Pineview-D general purpose processors, each multiplying two nxn 32-bit matrices and producing a 64-bit output matrix. We conclude that the FPGA is more power and energy efficient at small matrix sizes. However, general purpose processor performance approaches that of the FPGA at larger matrix sizes, as the larger cache in a general purpose processor helps reduce latency. We also conclude that the performance of the FPGA can be improved in terms of latency if more systolic array processing elements are implemented in parallel to allow more concurrency.

General Terms

Computational Mathematics

Keywords

FPGA, Matrix Multiplication, General Purpose Processor,

Systolic Array, Energy Consumption

1. INTRODUCTION

With the drastic improvement and maturity of Field Programmable Gate Array (FPGA) technology, FPGAs have become one of the choices for designers beyond traditional solutions such as the general purpose processor and the digital signal processor (DSP). Being reconfigurable and programmable to implement any digital circuit makes the FPGA a strong candidate for most computation-intensive applications, including areas such as signal processing and encryption engines which involve large amounts of real-time data processing. FPGAs can provide better throughput and latency since they can be customized to optimize the execution of a particular process or algorithm.

Traditionally, FPGA research and improvement have mainly focused on reducing area overhead and increasing speed [7-11]. With the emergence of portable and mobile devices, which have become a need for most people today, the performance metrics of a device are no longer limited to latency and throughput: energy efficiency is a key factor as well. In summary, the performance of electronic devices is not judged on speed alone; energy efficiency should be listed as a major design consideration, and designers focus on producing high-throughput solutions while keeping power consumption low.

Many studies and experiments have compared the energy efficiency of FPGAs, DSPs, embedded processors [1] and general purpose processors. However, particularly for general purpose processors, most experiments do not compare against the latest commercial processors on the market, for which manufacturers claim that several low-power design techniques have been adopted. With advances in semiconductor process technology that lead to lower leakage current, and the flexibility of software-based power saving, the performance of general purpose processors in terms of power dissipation and energy consumption has greatly improved. The key question here is how well a current general purpose processor on the market performs, in terms of energy efficiency, compared to an FPGA, particularly for signal-processing-centric applications. This paper evaluates and discusses this question in detail by executing nxn matrix multiplication on an Altera Cyclone II FPGA [17-19] and on Intel processors, and comparing the performance of the devices in terms of power dissipation and energy efficiency.

2. RELATED WORK

Ronald Scrofano et al. showed that matrix multiplication of two nxn matrices can be done most efficiently, in terms of energy and power, with an FPGA in their paper "Energy Efficiency of FPGAs and Programmable Processors for Matrix Multiplication" [1]. They compared the energy efficiency of nxn matrix multiplication on a Xilinx Virtex-II, a Texas Instruments DSP processor (TMS320C6415) and an Intel XScale PXA250, using a linear array architecture for the matrix multiplication module. In their work no actual power measurement was made on the FPGA; only the energy of each Processing Element (PE) was estimated. Each PE component (multiplier, adder, register, RAM) was modeled in VHDL and synthesized, and the design was placed and routed after synthesis. The place-and-route result and the simulation output were used as input to the Xilinx XPower tool to estimate the power. This way of estimating power assumes that the location of each component in the FPGA does not affect the power consumption. This is not true, since the location of each component has a big impact on the routing, and the capacitance of the routing is a major factor of power


consumption. Thus, the accuracy of the power estimation reported by Ronald Scrofano et al. is a concern.

Seonil Choi et al. developed energy-efficient designs for matrix multiplication based on a uniprocessor architecture and a linear array architecture, for both on-chip and off-chip storage [2-6]. No measurement was taken on an actual hardware implementation; instead, energy consumption was estimated using the Xilinx XPower tool. They showed that the linear array architecture needs 49 cycles for 6x6 matrices and up to 256 cycles for 15x15 matrices.

3. SYSTOLIC ARRAY MATRIX MULTIPLICATION ON ALTERA CYCLONE II

The traditional way of implementing a systolic array for matrix multiplication is to match the systolic array order to the problem size [12-16]. For example, an 8x8 matrix requires an 8x8-order systolic array and a 16x16 matrix requires a 16x16-order systolic array. The basic building block of the systolic array is the 2x2 array, and each 2x2 systolic array needs 4 multiply-and-accumulate units (MACs), as shown in Figure 1. The number of MACs increases rapidly as the problem size increases. Table 1 below shows the number of MACs required for each matrix size.

Table 1: Resource utilization for different matrix sizes

Matrix Size   Number of 2x2 systolic arrays   Number of MACs
2x2           1                               4
4x4           4                               16
8x8           16                              64
16x16         64                              256

Our target FPGA device for power measurement is the Cyclone II EP2C35F672C6, which has only 33,216 logic elements and 70 embedded 9-bit multipliers. Moreover, 32-bit input data and 64-bit output data imply that we need fairly large MACs with a wide bus width. If we follow the method of matching the matrix size to the order of the systolic array, resource utilization will exceed the capacity of the target device. Instead, we use a 4x4 systolic array, built by connecting four 2x2 systolic arrays as shown in Figure 2, as the basic building block to construct matrix multiplications up to 16x16. In other words, the same 4x4 systolic array is used to implement the 2x2, 4x4, 8x8 and 16x16 matrix multiplication modules, at the expense of increased latency for larger problem sizes. Figure 3 below illustrates the implementation with a single 4x4 systolic array.

Figure 1: Functional Block Diagram of 2x2 Systolic Array

Figure 2: Functional Block Diagram of 4x4 Systolic Array

Clearly, an algorithm is needed in order to use a single 4x4 systolic array for all matrix sizes. The steps below show the method of constructing the 8x8 matrix multiplication module from a single 4x4 systolic array.

Step 1: Divide the 8x8 input matrices A and B into four 4x4 sub-matrices each, named A1-A4 and B1-B4.



Step 2: The output matrix C is obtained by passing the sub-matrices of A and B through the 4x4 systolic array and adding up the results as follows:

C1 = A1×B1 + A2×B3
C2 = A1×B2 + A2×B4
C3 = A3×B1 + A4×B3
C4 = A3×B2 + A4×B4

Figure 3: Implementation of matrix multiplier with single

4x4 systolic array

The same method can be used for larger matrix sizes such as 16x16. For example, for a 16x16 matrix we first divide the input matrix into four 8x8 sub-matrices, and each 8x8 sub-matrix is further divided into four 4x4 sub-matrices. Figure 4 below illustrates the method of constructing the 8x8 matrix multiplication module from a single 4x4 systolic array; a short software sketch of the same block decomposition is given after Figure 4.
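The snippet below is only a software analogue of the block decomposition described above (the hardware design feeds 4x4 blocks through a single systolic array; the NumPy code merely demonstrates the same block arithmetic and checks it against a full multiplication).

```python
import numpy as np

def block_matmul(A, B, block=4):
    """Multiply square matrices A and B by splitting them into block x block
    sub-matrices and accumulating sub-products, as in C1 = A1*B1 + A2*B3."""
    n = A.shape[0]
    C = np.zeros((n, n), dtype=np.int64)        # 32-bit inputs, 64-bit output
    for i in range(0, n, block):
        for j in range(0, n, block):
            for k in range(0, n, block):
                C[i:i+block, j:j+block] += (
                    A[i:i+block, k:k+block].astype(np.int64)
                    @ B[k:k+block, j:j+block].astype(np.int64)
                )
    return C

A = np.random.randint(0, 2**16, size=(8, 8), dtype=np.int32)
B = np.random.randint(0, 2**16, size=(8, 8), dtype=np.int32)
assert np.array_equal(block_matmul(A, B), A.astype(np.int64) @ B.astype(np.int64))
```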

Figure 4: Constructing 8x8 matrix multiplier from single

4x4 systolic array

Figure 5 shows the high-level functional block diagram of the overall design. Both 32-bit matrices A and B are stored in Random Access Memory (RAM). A module (Reg_delay) is required to stagger the inputs of matrix A and matrix B as required by the systolic array algorithm. The 4x4 matrix multiplication module is responsible for computing the products of the elements of matrix A and matrix B. The Dataout module is responsible for adding the results of the 4x4 matrix multiplications and writing each element of the output matrix to RAM_out; the output of the matrix multiplication is thus stored in RAM.

Figure 5: High Level Functional Block Diagram of Matrix

Multiplication Module

4. MATRIX MULTIPLICATION ON GENERAL PURPOSE PROCESSORS

For the general purpose processors, we used the Intel i5 Clarkdale and Intel Atom Pineview-D processors. Matrix multiplication is run in MATLAB on these two processors; MATLAB was used because it incorporates the Linear Algebra Package (LAPACK), which greatly improves the performance of matrix multiplication. A single iteration of the matrix multiplication would not take advantage of the large cache in the Intel processors, so in order to utilize the cache, 1000 iterations of the 32-bit matrix multiplication are executed. With this, we can ensure that the processor cache is fully utilized.
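The authors ran this workload in MATLAB; the snippet below is only an analogous NumPy timing loop, and the matrix size, iteration count and use of perf_counter are illustrative choices rather than the paper's measurement setup.

```python
import time
import numpy as np

n, iterations = 16, 1000
A = np.random.randint(0, 2**31, size=(n, n), dtype=np.int64)  # 32-bit-range inputs
B = np.random.randint(0, 2**31, size=(n, n), dtype=np.int64)

start = time.perf_counter()
for _ in range(iterations):
    C = A @ B                      # 64-bit accumulation of 32-bit operands
elapsed = time.perf_counter() - start
print(f"{iterations} multiplications of {n}x{n} took {elapsed*1e3:.2f} ms")
```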

5. POWER AND ENERGY MEASUREMENTS

Power is measured using a Tektronix current probe with a current amplifier and a Tektronix TDS7104 digital oscilloscope to capture both the current and the voltage. The power is obtained by multiplying the voltage and the current; the average voltage and current are measured within the window of the matrix multiplication process. The energy consumption is obtained by multiplying the average power by the total execution time.
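A minimal sketch of this power/energy arithmetic is shown below; the sample values are placeholders, not measurements from the paper.

```python
def power_and_energy(avg_voltage_v, avg_current_a, exec_time_s):
    power_w = avg_voltage_v * avg_current_a      # P = V * I
    energy_j = power_w * exec_time_s             # E = P * t
    return power_w, energy_j

print(power_and_energy(1.2, 0.24, 0.060))        # e.g. a core rail during the run
```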



5.1 Power and Energy Measurements on the ALTERA DE2 Board

Figure 6 shows the experimental setup for the power measurement on the ALTERA board. The Altera Cyclone II FPGA (EP2C35F672C6) is powered by three power rails: the core voltage (VCC_INT, 1.2V), the IO voltage (VCCIO, 3.3V) and the PLL voltage (VCC12, 1.2V). Figure 7 below shows the power supply pins of the Cyclone II FPGA. In order to obtain the total power consumed by the entire IC, the current drawn by each of these rails is measured.

Figure 6: Hardware setup of current measurement on

Altera DE2 board

Figure 7: Power Supply Pins of Cyclone II

EP2C35F672C6

The IO voltage (VCCIO) of the Cyclone II is powered directly by the 3.3V rail on the Altera DE2 board through a 0-ohm resistor (R92), as shown in Figure 8, so the current drawn by the Cyclone II can easily be measured by replacing the 0-ohm resistor with a wire loop for the current probe to be clamped on.

Both the core voltage and the PLL voltage are connected to the output of a low dropout regulator (LDO). Since no shunt resistor exists on the output of the LDO, a workaround is needed to measure the power consumed by these two voltage rails. In order to measure the current on both rails, the output pin (pin 2) of the LDO (U24) was lifted and a wire loop was inserted for the current probe to be clamped on, as shown in Figure 8. The current is measured at the output of the LDO rather than at its input in order to exclude the efficiency loss of the LDO; measuring at the input would include that loss, which is not wanted.

Figure 8: Power Block of ALTERA DE2 Board

5.2 Power and Energy Measurements on the General Purpose Processors

In the computer system, the power supply unit supplies 5V, 3.3V and 12V to the motherboard. 12V is the main voltage used by the processor's switching voltage regulator, and it is supplied to the processor through a 2x2 or 2x4 power connector from the power supply. The 12V input is down-regulated to the processor core and IO voltages. Thus, measuring the 12V input current at the 2x2 or 2x4 power connector gives the total current consumed by the processor.

Since the computer system runs an operating system and there are background activities which consume processor resources, we have to take the current consumed by those background activities into consideration; otherwise, the measured current would include not only the current consumed by the matrix multiplication process but also that of other processes running in the background. One way to overcome this is to measure the current on the 12V rail when the system is idle, where idle refers to the situation in which the system is powered up but no active process is running except the background activities initiated by the operating system. Another set of current measurements is then taken while the system is running the matrix multiplication routine. By subtracting the idle current from the current consumed during the matrix multiplication routine, we obtain the net current used just to execute the matrix multiplication. With this methodology, we can measure both the power and the energy consumed by the processor for the particular matrix multiplication workload, as shown in Figure 9.
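The idle-subtraction arithmetic can be sketched as follows; the rail voltage and example current readings are placeholders, not values reported in the paper.

```python
def net_processor_energy(i_busy_a, i_idle_a, rail_v, exec_time_s):
    net_power_w = (i_busy_a - i_idle_a) * rail_v   # power attributable to the workload
    return net_power_w, net_power_w * exec_time_s  # (W, J)

print(net_processor_energy(i_busy_a=1.35, i_idle_a=0.92, rail_v=12.0, exec_time_s=0.0039))
```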

Figure 9: Hardware setup of current measurement on the Atom Pineview-D processor


6. RESULTS AND ANALYSIS

In this section we analyze the power and energy results obtained from the Altera DE2 board as well as from the two platforms with the Intel Core i5 and Intel Atom Pineview-D processors, and discuss the observations drawn from them. All measurements are made on 1000 iterations of the matrix multiplication on the three different designs.

6.1 Power

From the results shown in Table 2 and Figure 10, it is clear that the FPGA dissipates the least power compared to the Intel i5 Clarkdale and the Intel Atom Pineview-D. Our results show that the Intel Atom Pineview-D performs better than the Intel i5 Clarkdale in terms of power dissipation, so the Intel Atom series can be used for low-power applications. The Intel Core i5 consumes about 16 times as much power as the Altera Cyclone II, while the Intel Atom consumes about 2.6 times more power than the Cyclone II.

An interesting observation is that each of the three implementations consumes roughly the same amount of power regardless of matrix size. This is because the same hardware units operate in every case, regardless of problem size and data content.

Table 2: Power Dissipation versus Matrix Size for the three Implementations

Power Dissipated (mW)
Matrix Size   ALTERA DE2 Cyclone II   Intel Atom Pineview-D   Intel i5 Clarkdale
2x2           237.05                  794.24                  5046.38
4x4           287.70                  791.43                  5152.54
8x8           290.27                  788.36                  5095.23
16x16         293.53                  840.42                  5156.37

Figure 10: Power Dissipation versus Matrix Sizes for the

three Implementations

For matrix multiplication on the FPGA, we used a single 4x4 systolic array as the basic building block for all matrix sizes, as stated in the previous section; that is why the same amount of power is dissipated for all matrix sizes. As a result, we can conclude that for a higher-order systolic array the power consumption will increase in proportion to the increase in logic elements. For example, if we choose an 8x8 systolic array as the basic building block, the power dissipated on the FPGA will move closer to that of the Intel Atom Pineview-D; however, this comes with the advantage of reduced latency.

6.2 Energy

A device with lower power dissipation does not necessarily give longer battery life: it may dissipate less power but take significantly longer to complete a task than a higher-power device that operates faster. A more accurate measure of battery life is energy, which can be calculated by multiplying the average power by the latency.

Table 3: Energy Consumption versus Matrix Size for the three Implementations

              ALTERA DE2 Cyclone II       Intel Atom Pineview-D       Intel i5 Clarkdale
Matrix Size   Latency (ms)  Energy (mJ)   Latency (ms)  Energy (mJ)   Latency (ms)  Energy (mJ)
2x2           0.25          0.06          13            10.33         1.80          9.08
4x4           1.14          0.33          13.8          10.92         2.20          11.34
8x8           7.76          2.25          19            14.98         2.50          12.74
16x16         60.02         17.62         41.4          34.79         3.90          20.11

As shown in Table 3 and Figure 11, the ALTERA FPGA consumes the least energy for matrix sizes up to 16x16. However, for matrix sizes beyond 16x16, a linear extrapolation of the line graph reveals that the Intel Core i5 would perform better than the FPGA in terms of energy consumption. The main reason is that the latency of the FPGA implementation increases at a higher rate than that of the Intel Core i5 as the matrix size grows beyond 16x16; the scheduling we applied to the FPGA design to reduce resource utilization is the main cause of this excessive latency growth. One can reduce the latency by using a higher-order systolic array as the basic building block. The Intel Atom, by contrast, is not an economical solution for matrix-multiplication-centric applications: even though it dissipates less power than the Intel Core i5, overall it consumes more energy than the Core i5 on the same workload. Nevertheless, minimum power dissipation is still important to keep heat dissipation low and the thermal solution simple.

On the other hand, if we increase the order of the systolic array so that it matches the problem size, for example using a 16x16 systolic array for a 16x16 problem, the latency will improve significantly. In order to estimate the energy consumption in that case, we have to understand both the increase in resource utilization and the improvement in latency when a higher-order systolic array is used.

Figure 11: Energy Consumption versus Matrix Sizes for

the three Implementations


A 4x4 systolic array is formed from four 2x2 systolic arrays, an 8x8 systolic array is constructed from four 4x4 systolic arrays, and so on. With scheduling, we use a single 4x4 systolic array for a problem size of 8x8; without scheduling we would need four 4x4 systolic arrays for the same problem, so resource utilization increases by a factor of 4. If we assume that power dissipation is proportional to resource utilization, power dissipation will increase by a factor of 4 as well. As for latency, an 8x8 matrix needs 8 iterations if a single 4x4 systolic array is used, whereas four 4x4 systolic arrays reduce the latency by a factor of 8. Thus, the overall energy consumption will be reduced by 50% (power dissipation increases 4 times but latency decreases 8 times). Figure 12 shows the estimated energy consumption when the order of the systolic array matches the matrix size; if we linearly extrapolate the line graph, the FPGA is still the candidate consuming the least energy even at matrix sizes above 16x16.

Figure 12: Energy consumption versus matrix size with

estimated result on higher order systolic array

7. CONCLUSION

Due to the hardware limitations of the Altera DE2 with the Cyclone II EP2C35F672C6, which has only 33,216 logic elements, the systolic array matrix multiplication design had to be scheduled using only a 4x4 systolic array. This approach ensured that the design could be implemented and fit onto the Altera DE2 board, but it introduces significant latency. The design can be modified to use a higher-order systolic array to minimize latency and be implemented on an FPGA family with a higher logic count, such as the Cyclone IV EP4C115, which has 115,000 logic elements.

Using more resources for parallel processing will always reduce latency, but it increases power dissipation, because a more powerful chip requires more silicon area as more circuitry is needed. Comparing the Atom Pineview-D and the Core i5, the Core i5 obviously has better performance but suffers from higher power dissipation. For matrix sizes up to 16x16, we observed that the FPGA is the best candidate in terms of power and energy consumption; increasing the matrix size beyond 16x16, the Core i5 becomes the more favorable candidate. The larger data and instruction caches in the Core i5 are the main reason its latency does not increase at as high a rate as on the Atom Pineview-D and the Cyclone II when the matrix size grows. However, if we increase the order of the systolic array to match the matrix size, our estimated results show that the FPGA is still the most economical candidate for matrix multiplication.

8. ACKNOWLEDGMENTS

We would like to thank Universiti Teknologi Malaysia for funding support. We would also like to take this opportunity to express our appreciation to Intel Technology Sdn. Bhd. Malaysia, in particular the Intel Test and Tool Operation (iTTO), for making this work possible.

9. REFERENCES

[1] R. Scrofano, S. Choi, and V. K. Prasanna, "Energy Efficiency of FPGAs and Programmable Processors for Matrix Multiplication," in Proc. of IEEE Intl. Conf. on Field Programmable Technology, pp. 422-425, 2002.

[2] S. Choi, V. K. Prasanna, and J. Jang, "Minimizing energy dissipation of matrix multiplication kernel on Virtex-II," in Proc. of SPIE, Vol. 4867, pp. 98-106, 2002.

[3] J. Jang, S. Choi, and V. K. Prasanna, "Energy efficient matrix multiplication on FPGAs," in Proc. of 12th Intl. Conf. on Field Programmable Logic and Applications, pp. 534-544, 2002.

[4] J. Jang, S. Choi, and V. K. Prasanna, "Area and Time Efficient Implementations of Matrix Multiplication on FPGAs," in Proc. of IEEE Intl. Conf. on Field Programmable Technology, pp. 93-100, 2002.

[5] H. T. Kung and C. E. Leiserson, "Systolic arrays (for VLSI)," Introduction to VLSI Systems, 1980.

[6] V. K. P. Kumar and Y. Tsai, "On synthesizing optimal family of linear systolic arrays for matrix multiplication," IEEE Trans. Comput., vol. 40, no. 6, pp. 770-774, 1991.

[7] J. Lamoureux and W. Luk, "An Overview of Low-Power Techniques for Field-Programmable Gate Arrays," in Adaptive Hardware and Systems, AHS'08, NASA/ESA, 2008.

[8] G. Sutter and E. Boemo, "Experiments in low power FPGA design," Lat. Am. Appl. Res., vol. 37, no. 1, pp. 99-104, 2007.

[9] N. Dave, K. Fleming, M. King, M. Pellauer, and M. Vijayaraghavan, "Hardware Acceleration of Matrix Multiplication on a Xilinx FPGA," in Formal Methods and Models for Codesign (MEMOCODE 2007), 5th IEEE/ACM International Conference, 2007.

[10] S. Aslan, C. Desmouliers, E. Oruklu, and J. Saniie, "An Efficient Hardware Design Tool for Scalable Matrix Multiplication," in Circuits and Systems (MWSCAS), 53rd IEEE International Midwest Symposium, pp. 1262-1265, 2010.

[11] H. T. Kung, "Why Systolic Architectures?," IEEE Computer, pp. 37-46, 1982.

[12] Ju-Wook Jang, Seonil B. Choi, and Viktor K. Prasanna, "Energy and Time Efficient Matrix Multiplication on FPGAs," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 13, no. 11, November 2005.

[13] S. M. Qasim, S. A. Abbasi, and B. Almashary, "A proposed FPGA-based parallel architecture for matrix multiplication," in Circuits and Systems (APCCAS 2008), IEEE Asia Pacific Conference, pp. 1763-1766, 2008.

[14] Syed M. Qasim, Ahmed A. Telba, and Abdulhameed Y. AlMazroo, "FPGA Design and Implementation of Matrix Multiplier Architectures for Image and Signal Processing Applications," IJCSNS International Journal of Computer Science and Network Security, vol. 10, no. 2, Feb 2010.

[15] A. H. M. Shapri and N. A. Z. Rahman, "Performance Analysis of Two-Dimensional Systolic Array Matrix Multiplication with Orthogonal Interconnections," International Journal on New Computer Architectures and Their Applications (IJNCAA), 1(3): 1090-1100, 2001.

[16] Jonathan Break, "Systolic Arrays and their Applications," http://www.cs.ucf.edu/courses/cot4810/fall04/.../Systolic_Arrays.ppt

[17] Altera Inc., Cyclone II Device Handbook, Volume 1, available at www.altera.com

[18] Altera Inc., DE2 Development and Education Board User Manual, available at www.altera.com

[19] Altera Inc., DE2 Development and Education Board Schematic, available at www.altera.com


Recognizing and Interpreting Sign Language Gesture for Human Robot Interaction

Shekhar Singh
Assistant Professor, CSE Department, PIET, Samalkha, Panipat, India

Akshat Jain
Assistant Professor, CSE Department, PIET, Samalkha, Panipat, India

Deepak Kumar
Assistant Professor, CSE Department, PIET, Samalkha, Panipat, India

ABSTRACT

Visual interpretation of sign language gestures can be useful in accomplishing natural human robot interaction. This paper describes a sign language gesture based recognition, interpretation and imitation learning system using Indian Sign Language for performing human robot interaction in real time. It allows us to build convenient sign-language-gesture-based communication with a humanoid robot. The classification, recognition, learning and interpretation process is carried out by extracting features from Indian Sign Language (ISL) gestures. The chain code and Fisher score are used as feature vectors for the classification and recognition process, which is performed by two statistical approaches, the Hidden Markov Model (HMM) and the feed-forward back-propagation neural network (FNN), in order to achieve satisfactory recognition accuracy. The sensitivity, specificity and accuracy were found to be 98.60%, 97.64% and 97.52% respectively. It can be concluded that the FNN gives fast and accurate recognition and works as a promising tool for recognition and interpretation of sign language gestures for human computer interaction. The overall recognition and interpretation accuracy of the proposed system is 95.34%. Thus, this approach is suitable as an automated real-time human computer interaction tool.

General Terms

Human Robot Interaction; Gesture, Indian Sign Language;

Vector Quantization and LBG algorithm; Hidden Markov

Model; Neural Network.

Keywords

HRI, ISL, CLAHE, Chain Code, HMM, Fisher Score, FNN,

Gesture recognizing, Gesture Interpretation

1. INTRODUCTION

The growing commitment of society to reducing the barriers faced by persons with disabilities, together with advances in computing and pattern recognition methods, has motivated the development of the present system, which recognizes, learns and interprets Indian Sign Language gestures. Sign language gesture based recognition, learning and interpretation is one of the most promising research areas, with a huge range of applications [28]. An artificial intelligence framework is being built around sign language gestures in order to support accurate human robot interaction, and we have chosen suitable recognition, learning and interpretation techniques to establish gesture based communication with a humanoid robot. According to many researchers, the hidden Markov model (HMM) is a well-accepted method for gesture recognition; however, we find that it takes more time to recognize sign language gestures. For this reason we try to find a different method which gives the same accuracy in recognizing, learning and interpreting sign language gestures but within a much shorter time. Indian Sign Language (ISL) has its own grammatical and syntactical meaning in the linguistic form of signs. It is a visual-spatial language which uses the hands, arms, facial expressions and head or body postures in such a manner that linguistic information can be conveyed effectively. The construction of an Indian Sign Language (ISL) gesture [1, 28] can be defined by several parameters such as the shape of the hand, the location of the hand movements (straight or circular), the orientation of the hand, facial expressions, body or head posture and eye gaze. The sign language gesture primitives are captured as Indian Sign Language symbols, which poses a challenge in terms of their complex symbolic gestural representation and proper linguistic understanding. Such a system could be used as a helping agent for conveying knowledge among hearing impaired people in their own community and would greatly increase their conversational ability. This paper introduces a strategy for dynamic hand gesture recognition, learning and interpretation with reduced processing time, which will be helpful for designing a concrete human robot interaction (HRI) system. It addresses the challenges of extracting and interpreting dynamic features from continuous signals.

The current prototype strongly supports mimicry on a humanoid robot in real time. A further expansion of the recognition, learning and interpretation work is a translation system between verbal expression and sign language. Such a system would be useful for hearing impaired people to exchange information among themselves through the human robot interaction (HRI) approach, which demands a real-time gesture recognition system that translates sign language. Neural network and HMM techniques have been applied separately in our current system to recognize Indian Sign Language (ISL) gestures and to generate the corresponding mimicry on the humanoid robot. We then identify the more suitable recognition, learning and interpretation system according to the time taken to recognize, learn and interpret the sign language gestures.

The procedure used in this automatic recognition is based on adequately modelling the hand sign with a neural network. For this, the Fisher score is computed, extracted from an HMM [28, 31]; this HMM is fed with the chain code determined from the hand sign image. The first step of this process is the capture of samples: it is necessary to create a database with hand sign pictures of the sign text from numerous people, ideally an unlimited number. In our case it was collected from sixty different people (50 hand signs each). Subsequently, image preprocessing is applied, transforming the color image into a black-and-white one of reduced size, which defines the


hand outline. From this outline we extract a series of parameters, collected into a vector that describes the contour of the hand sign with a chain code [29, 32], which is then classified and parameterized by an HMM. From these HMMs, parameters based on the emission probabilities of the vectors are extracted; these determine the Fisher score [28, 32], which is classified with a neural network [33, 34, 39, 40]. The system is supervised, with a training process in which it learns to differentiate some hand signs from others, and a separate test process in which the models are verified. This process is summarised in Figure 1. Section 2 describes the creation of the database, Section 3 the applied image processing, Section 4 the calculation of the Fisher score by HMM, and Section 5 the neural network classifier. Section 6 presents the experiments, followed by the conclusions and references. Figure 1 shows the proposed system for recognizing and interpreting sign language gestures for human robot interaction.

2. INDIAN SIGN LANGUAGE GESTURE ACQUISITION SYSTEM

The prime focus of recording Indian sign language gestures is to create a repository of dynamic ISL video gestures (sequences of images) drawn from different classes/words of the Indian sign language dictionary. The dynamic gestures were recorded at a fixed frame rate (10 fps) and with a fixed camera position relative to the signer. These samples are used during the classification, learning and interpretation processes. All of the gestures involve various kinds of hand motion. A SONY handycam with 8 megapixel resolution was used to capture the videos. An elementary aid to image processing is background uniformity: a dark background is chosen so that the grayscale images can be handled effectively.

To obtain a controlled environment, background uniformity was maintained while recording the videos in real time. This reduces the computational complexity of background removal and increases recognition and interpretation accuracy in real time [15]. Each gesture video is restricted to 20 frames, selected equally spaced in time from the original recording. Every ISL gesture corresponds to a class or word, which is produced by waving both hands in the appropriate manner [16]. Fast and accurate hand movements are needed for the preprocessing of Indian sign language gestures to work well. Several operations are applied to all the Indian sign language (ISL) videos before the classification process; a frame-sampling sketch is given below.
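As an illustration of the 20-frame sampling described above, the following minimal Python/OpenCV sketch picks equally spaced frames from a recorded gesture video. The file name and helper name are hypothetical, and the exact sampling scheme used by the authors may differ.

import cv2
import numpy as np

def sample_frames(video_path, n_frames=20):
    # Pick n_frames equally spaced frames from a gesture video (illustrative sketch).
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, total - 1, n_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if ok:
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    cap.release()
    return frames

# Example (hypothetical file name):
# frames = sample_frames("isl_gesture_hello.avi")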

3. PREPROCESSING AND FEATURE EXTRACTION

First, all the Indian sign language videos are split into sequences of RGB image frames. The frames are converted into grayscale images and the background is subtracted in order to reduce computational complexity. The color images are then transformed into binary images of the hand sign shape (black and white) with a fixed height of 400 pixels, preserving the aspect ratio with respect to the width. The steps are as follows (a sketch of the main operations appears after this list):

1. Filter the RGB image to reduce noise and hue/saturation effects.

2. Convert the RGB image to the YCbCr color space [40].

3. Convert the YCbCr image to grayscale, discarding the chrominance (hue and saturation) information [40].

4. Filter the grayscale image to remove residual noise.

5. Enhance contrast using histogram equalization [29]. The histogram of gray levels is equalized with a linear function applied only to the brightest parts (the hand sign), not to the darkest points, sharpening the difference between the hand, its shadow and the background.

6. Trim the frame border to eliminate border effects.

7. Convert the image to a binary image by thresholding. The threshold is computed with Otsu's method, which chooses the value that minimizes the intra-class variance of the resulting black and white pixels. This step isolates the hand as an object.

8. Apply morphological operators [30]: dilation first and then erosion. This restores the original size of the hand region, closes gaps that may remain in the contour line, and removes noise introduced by the capture itself.

9. Reduction: the image size is reduced to one quarter in order to shrink the final feature vector.

10. Contour calculation [31, 32]: the hand contour separating the hand sign from the background is computed so that the connection between pixels is exactly one pixel wide, with 8-neighbor connectivity; this is the chain code.

11. Wrist and height adjustment: the image is trimmed slightly to determine the side from which the hand sign enters the frame, and the height is then fixed while maintaining proportionality with the width, so no information is lost whether the hand sign is horizontal, vertical upward or vertical downward.
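A minimal Python/OpenCV sketch of steps 3 to 10, offered only as an illustration of the kind of operations described above; kernel sizes, the border margin, the reduction factor and the use of global (rather than partial) equalization are assumptions, not the authors' exact settings.

import cv2
import numpy as np

def preprocess(frame_bgr):
    # Rough sketch of the preprocessing chain: grayscale, equalization,
    # Otsu thresholding, morphology, reduction and contour (chain-code source).
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)              # steps 2-3 (via BGR -> gray)
    gray = cv2.medianBlur(gray, 5)                                  # step 4: noise filtering
    gray = cv2.equalizeHist(gray)                                   # step 5: simplified, global equalization
    gray = gray[5:-5, 5:-5]                                         # step 6: trim border effects
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # step 7: Otsu threshold
    kernel = np.ones((5, 5), np.uint8)
    binary = cv2.dilate(binary, kernel)                             # step 8: dilation, then erosion
    binary = cv2.erode(binary, kernel)
    small = cv2.resize(binary, None, fx=0.25, fy=0.25,
                       interpolation=cv2.INTER_NEAREST)             # step 9: reduce to one quarter
    contours, _ = cv2.findContours(small, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)           # step 10: hand contour
    return small, max(contours, key=cv2.contourArea) if contours else None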

Figure 1 shows the image preprocessing and feature extraction steps of the proposed system for recognizing and interpreting sign language gestures for human robot interaction. Feature extraction for hand gesture recognition is done using the chain code and the Fisher score [17], which gives considerable robustness to changes in scene illumination, and this robustness carries over to the classification and interpretation stages. The Fisher scores of the dynamic gestures form the feature vector derived from the image features. HMMs and neural networks are robust and efficient classifiers and are used here to classify ISL dynamic gestures as a pattern classification and recognition task. The feature vector is built from the chain code and the Fisher score of the local orientation of edges in each image, and the same features are used for motion classification of the Indian sign language gestures. Computing these feature vectors over a sequence of images is fast, so the edge orientations can be calculated in real-time applications. Because edges remain essentially the same under illumination and lighting changes [18], the representation tolerates such changes well. All the Indian sign language gestures have been captured


in different lighting conditions. Another advantage of the chain code and Fisher score is translation invariance: the same hand shape at different positions in the frame produces the same feature vector. The chain code and Fisher score of the local orientations are calculated for all the frames of the moving gesture, and translating the hand within a frame does not change them. The overall algorithm [37, 38] is used to evaluate the feature vector for the recognition, learning and interpretation system.

4. ISL RECOGNITION AND INTERPRETATION TECHNIQUE

The classification and interpretation process is carried out with two statistical techniques: an HMM and a neural network. The Hidden Markov Model technique is used together with vector quantization based on the Linde-Buzo-Gray algorithm [38]. The HMM describes the construction of the model needed to generate the observation sequences; here we use a left-right HMM, defined as in [37]. In our work we have generated 10 different HMM models covering 21 different ISL gestures for classification and interpretation. Figure 1 shows the stages of image processing, chain code calculation, Fisher score calculation, HMM modeling, quantization, learning, recognition and interpretation of sign language gestures.

4.1 Fisher Score Calculation

Once all the outline images of the hand signs are obtained, the Fisher score is calculated. This process comprises three steps:

a. Extraction of parameters from the outline: the chain code [29].

b. Creation of an HMM with the chain code as input [31, 32].

c. Calculation of the Fisher score from the gradient of the logarithm of the observation symbol probability distribution [28, 32].

Fig 1: Block diagram of the proposed approach for hand tracking and gesture recognition. Processing is organized into six layers. (The diagram comprises: the Image Acquisition Process, with initialization and the color image sequence; Preprocessing of the Indian Sign Language Gesture, with color smoothing, color space transformation and contrast limited adaptive histogram equalization; Segmentation of the Indian Sign Language Gesture, with skin color probabilities assigned to pixels, computation of skin colored blobs and background subtraction; Hand Tracking in the Model Space, with particle filters for the left and right arms, hand gesture detection and an HMM for body orientation; Parameterization, with parameter extraction, chain code extraction and HMM-based Fisher score calculation; and Gesture Classification, with the Hidden Markov Model and the feed forward back propagation neural network leading to the gesture recognition decision.)


The chain code vector of the hand sign contour is obtained with the mask of Figure 2 by observing the position of each contour pixel relative to its neighbor. The vector is formed by numbers from 1 to 8 that describe the outline of the hand sign. Because all the hand signs are acquired in the same order and direction, the extracted information describes the sequence of the hand together with temporal information, which the HMM-based recognizer exploits to distinguish the different hand signs. A starting criterion is fixed for building this vector, first checking whether the hand sign is horizontal, vertical from below or vertical from above, since the representation is not rotation invariant. White pixels are then sought with the following order of priority: the leftmost column (horizontal hand sign), the last row (vertical from below) or the first row (vertical from above).

1 2 3

8 X 4

7 6 5

Fig 2: Mask of composition of the vector of the chain code
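A small, self-contained Python sketch of 8-connectivity chain code extraction using the neighbor numbering of Figure 2. The traversal details (choice of starting pixel, tie-breaking between neighbors) are simplifications and may differ from the authors' implementation.

import numpy as np

# Neighbor offsets numbered 1..8 as in the Figure 2 mask:
# 1 2 3
# 8 X 4
# 7 6 5
OFFSETS = {1: (-1, -1), 2: (-1, 0), 3: (-1, 1), 4: (0, 1),
           5: (1, 1), 6: (1, 0), 7: (1, -1), 8: (0, -1)}

def chain_code(binary, start):
    # Follow the one-pixel-wide contour of a binary image from 'start',
    # emitting the Figure 2 code of each move (illustrative sketch).
    visited = {start}
    codes, current = [], start
    while True:
        for code, (dr, dc) in OFFSETS.items():
            nxt = (current[0] + dr, current[1] + dc)
            if (0 <= nxt[0] < binary.shape[0] and 0 <= nxt[1] < binary.shape[1]
                    and binary[nxt] and nxt not in visited):
                codes.append(code)
                visited.add(nxt)
                current = nxt
                break
        else:  # no unvisited white neighbor left: contour finished
            return codes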

4.2 Transformation of Parameters with HMM

Supervised classification of the chain code with an HMM is used to determine the maximum success rate and to extract the forward and backward parameters of the HMM needed for the Fisher score calculation. The HMM employed is a Bakis (left-right) model, trained with the Baum-Welch procedure to maximize the probability of success [28], using 8 symbols per state. Creation of the HMM models has two phases, training and test. The number of states (N) and the percentage of training samples are used as parameters in the search for the highest success rate.

4.3 Fisher Score

Finally, a transformation is proposed that connects the HMM probabilities with the Fisher score approach [28, 32]. The goal is to couple the probability given by the HMM to the discriminative power of the neural network, with the Fisher score acting as the link between them. The score is the gradient with respect to the HMM parameters, in particular with respect to the probability of emitting a data vector x while in a given state q ∈ {1, ..., N}, given by the symbol probability matrix for state q, bq(x):

P(x | q, λ) = bq(x)   (Eq. 1)

Taking the derivative of the logarithm of this probability in order to compute its gradient yields the Fisher kernel:

∂ log P(x | q, λ) / ∂ p(x, q) = ζ(x, q) / bq(x) − ζ(q)   (Eq. 2)

The approximations and the computation of this expression are given in [28, 31, 32]. Here ζ(x, q) is the number of times the model is in state q while emitting symbol x during the generation of a sequence [28, 31], and ζ(q) is the number of times the model is in state q during the generation of the sequence [28, 31]. Both quantities are obtained directly and efficiently from the forward-backward algorithm applied to the HMM [28, 31]. The score Ux supplied to the neural network follows from Eq. 2, using the natural gradient technique [1]:

Ux = ∇ log P(x | q, λ), taken with respect to p(x, q)   (Eq. 3)

where Ux defines the direction of steepest ascent of the logarithm of the probability of observing a certain symbol in a state.
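To make Eqs. 1 to 3 concrete, here is a short numpy sketch (not the authors' code) that computes ζ(x, q), ζ(q) and the resulting Fisher score matrix for a discrete HMM with transition matrix A, emission matrix B and initial distribution pi. Numerical scaling and zero-probability handling are deliberately omitted, so it is only suitable for short sequences.

import numpy as np

def fisher_score(obs, A, B, pi):
    # Eq. 2: forward-backward state occupancies give zeta(x,q) and zeta(q),
    # from which U[q, x] = zeta(x,q)/B[q,x] - zeta(q).
    T, (N, M) = len(obs), B.shape
    alpha = np.zeros((T, N)); beta = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):                        # forward pass
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):               # backward pass
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)    # P(state q at time t | obs)
    zeta_q = gamma.sum(axis=0)                   # expected visits to each state
    zeta_xq = np.zeros((N, M))
    for t, o in enumerate(obs):
        zeta_xq[:, o] += gamma[t]                # visits to q while emitting symbol o
    return zeta_xq / B - zeta_q[:, None]         # Fisher score matrix U[q, x]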

4.4 Feed Forward Back Propagation Technique

The objective here is to classify the Fisher kernel data of the sign language symbols using a feed forward back propagation neural network with Levenberg-Marquardt (LM) as the training algorithm. LM is chosen because training converges quickly as the solution is approached. Sigmoid and hyperbolic tangent functions are applied in the learning process, and the network classifies sign language gestures according to their Fisher score characteristics [33, 34, 39, 40]. The feed forward back propagation network is created by generalizing the gradient descent with momentum weight and bias learning rule to multilayer networks with nonlinear differentiable transfer functions. Input vectors and the corresponding target vectors are used to train the network until it can classify the defined patterns. The training algorithm uses the gradient of the performance function to determine how to adjust the weights so as to minimize the error; this gradient is obtained by back propagation, which performs computations backwards through the network using the chain rule of calculus. The transfer functions of both the hidden and the output layers are tan-sigmoid. A simplified sketch of such a network is given below.
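A minimal numpy sketch of a single-hidden-layer feed forward network with tanh units trained by plain gradient descent. It illustrates the forward/backward passes described above but, unlike the paper, does not implement Levenberg-Marquardt; the layer size, learning rate and epoch count are arbitrary assumptions.

import numpy as np

def train_ffbp(X, Y, hidden=20, lr=0.01, epochs=500, seed=0):
    # X: (samples, features) Fisher-score vectors; Y: (samples, classes) one-hot targets.
    # Returns trained weights for a tanh hidden layer and tanh output layer.
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.1, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, Y.shape[1])); b2 = np.zeros(Y.shape[1])
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # hidden layer (tan-sigmoid)
        O = np.tanh(H @ W2 + b2)            # output layer (tan-sigmoid)
        dO = (O - Y) * (1 - O ** 2)         # backprop through output nonlinearity
        dH = (dO @ W2.T) * (1 - H ** 2)     # backprop through hidden nonlinearity
        W2 -= lr * H.T @ dO; b2 -= lr * dO.sum(axis=0)
        W1 -= lr * X.T @ dH; b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2

def predict(X, W1, b1, W2, b2):
    return np.argmax(np.tanh(np.tanh(X @ W1 + b1) @ W2 + b2), axis=1)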

Training and Testing: The proposed network was trained on the Fisher score data. Once training on the training data was complete, the final network weights were saved for the testing procedure. Training the datasets took approximately 4.60 seconds. Testing was carried out on 60 cases, which were fed to the proposed network and whose outputs were recorded.

5. VECTOR QUANTIZATION TECHNIQUE

A discrete HMM is used for the recognition of Indian sign language gestures, so the chain code feature vectors must be converted into a finite set of symbols from a codebook. The VQ technique plays a central role in the HMM-based approach by converting the continuous Indian sign language (ISL) gesture signals into the discrete sequence of symbols required by a discrete HMM. The VQ scheme is entirely determined by its codebook, which is composed of fixed prototype vectors (codewords). Fig 3 shows that the quantization process is divided into two parts: the first produces a codebook, and the second updates the codewords by training over all the vectors assigned to them. The strength of VQ lies in reducing data redundancy and the distortion between the quantized data and the original data, so the VQ method should be designed to minimize this distortion measure. To compute the minimum average distortion for a set of vectors, the iterative algorithm proposed by Linde, Buzo and Gray [22], known as the LBG vector quantization design algorithm, is used. The algorithm generates an optimized codebook (of size 16 in our case) for the isolated ISL gestures; a compact sketch follows.
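A self-contained numpy sketch of LBG-style codebook design by iterative splitting and nearest-neighbor refinement; the perturbation factor and stopping rule are illustrative choices, not the authors' exact settings.

import numpy as np

def lbg_codebook(vectors, size=16, eps=0.01, iters=20):
    # Grow a codebook by repeatedly splitting codewords and refining them
    # with nearest-neighbor (k-means style) updates.
    codebook = vectors.mean(axis=0, keepdims=True)       # start with one centroid
    while codebook.shape[0] < size:
        codebook = np.vstack([codebook * (1 + eps),      # split every codeword
                              codebook * (1 - eps)])
        for _ in range(iters):                           # refine the doubled codebook
            d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
            nearest = d.argmin(axis=1)
            for k in range(codebook.shape[0]):
                members = vectors[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
    return codebook

def quantize(vectors, codebook):
    # Map each feature vector to the index of its nearest codeword (HMM symbol).
    d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)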

6. RECOGNIZING AND INTERPRETATION BY HUMANOID ROBOT

Fig 3 shows the process of quantization, learning, recognition and interpretation of sign language gestures. A concept of gesture learning has been introduced for the humanoid robot so that it can eventually perform several tasks [27]. Integrating the humanoid robot with Indian sign language gestures provides an elegant way of communicating through mimicry. The real-time robotics simulation software WEBOTS is adopted to generate HOAP-2 actions accurately. The learning process reflects intelligent behavior of HOAP-2 that sustains its learning capability in any type of environment. Learning is handled by the HOAP-2 robot controller, which invokes a Comma Separated Value (CSV) file in order to perform the gestures in real time. All the classified gestures carry useful information about the joints of the upper body of the humanoid robot.

6.1 Learning ISL Gestures using HMM with Vector Quantization Techniques

For the humanoid robot to learn the ISL gestures, a preprocessing stage is essential. It employs the following steps:

- Capture the ISL gesture as the input gesture.

- Apply an algorithm that extracts the orientation histogram to construct the feature vector.

- Generate an initial codebook from the feature vectors of each gesture, then apply the LBG algorithm to produce an optimized codebook.

- Each row corresponds to a codeword number, which forms the quantized vector used by the HMM algorithm.

In this preprocessing step we generated 30 symbol sequences for each ISL gesture, since every gesture is captured with the same number of frames. The next stage trains each gesture with a Hidden Markov Model, whose parameters can be determined efficiently. We calculated the transition and emission probabilities of the HMM from the known states and known sequences. The training algorithm for each gesture estimates accurate transition and emission probabilities, which are then used for finding the most probable state sequence.

Five hidden states are assumed for each ISL gesture, and the state sequences are drawn from 1 to 5 with the total number of sequences matching the codebook size. The observation sequence for each state is represented by a row vector built from the calculated codebook over all the training samples, so each observation sequence of a training gesture corresponds to a single row vector. The transition and emission probabilities of each gesture model are then estimated and stored for the recognition of unknown gestures. Each HMM model is uniquely attached to one gesture and is trained with different samples of that ISL gesture. The trained models are used to recognize new gestures: a new gesture is tested against all the trained HMM models and is declared classified when it yields the maximum likelihood with respect to one of the trained gestures. For all the training samples, the percentage of probable states agreeing with the likely state sequence of the test gesture is computed, and the gesture is classified according to the maximum percentage; a small recognition sketch is given below. Fig 3 shows the process of quantization, learning, recognition and interpretation of sign language gestures using the HMM and the neural network.
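A sketch (under assumptions, not the authors' code) of maximum-likelihood recognition over a set of per-gesture discrete HMMs: each symbol sequence is scored with the scaled forward algorithm and assigned to the gesture whose model gives the highest log-likelihood.

import numpy as np

def log_likelihood(obs, A, B, pi):
    # Scaled forward algorithm for a discrete HMM; returns log P(obs | model).
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

def classify(obs, models):
    # models: dict mapping gesture name -> (A, B, pi). Pick the best-scoring gesture.
    scores = {name: log_likelihood(obs, *params) for name, params in models.items()}
    return max(scores, key=scores.get)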

Fig 3: Learning of ISL gestures using the HMM, Fisher Score and Neural Network technique. (The flowchart proceeds: capture the video; split the gesture video into a sequence of color image frames; convert it into a sequence of grayscale frames; preprocess all frames; construct the feature space for all gestures; apply vector quantization to generate the codebook; generate an HMM model for each gesture; recognize the test gesture according to maximum likelihood and generate the Fisher score; and let the humanoid robot learn according to the recognized gesture using the Fisher score and the neural network, with a database holding the chain codes, Fisher scores and codebook.)


6.2 Learning ISL Gestures using the Neural Network

The objective is to classify the Fisher kernel data of the sign language gestures using a feed forward back propagation neural network with Levenberg-Marquardt (LM) as the training algorithm. The network classifies sign language gestures according to their Fisher score characteristics [33, 34, 39, 40]. Fig 3 shows the process of quantization, learning, recognition and interpretation of sign language gestures using the neural network.

7. RESULT ANALYSIS

Only 21 ISL gestures were selected for this work, each performed by ten different persons. During the acquisition of the ISL video gestures the same number of frames was kept for each gesture; this makes the recognition process easier but reduces the freedom with which a gesture can be performed. In the HMM classification process each model is tested with the Viterbi algorithm, where the same number of frames is required to obtain the matching percentage for a gesture. We used 10 samples of each gesture for training and 10 separate samples of each gesture for testing. Using the Viterbi algorithm, the most probable path was generated from the trained hidden model for each training sample of each gesture, and the same was done for each test sample. In the recognition phase, each test sample was taken in turn and compared iteratively with all the samples of every trained gesture, producing the recognition percentage of that sample over the entire training set. Separate training and test sets of most probable paths were created in this process. A gesture is considered classified if it matches one of the training samples of that gesture with 60% or more agreement in the maximum likelihood states. We achieved up to 98% recognition accuracy with both techniques. The complete prototype was tested and simulated on an Intel Core 2 Duo system with MATLAB code. Generating the mimicry for a gesture requires special care of the dedicated joints responsible for performing that gesture. The experiments were carried out with independent samples for training and for test, and were repeated five times; the results are reported as their average and variance.

Table 1: Rate of success of the HMM as a function of the percentage of samples used for training and of the number of states (mean % ± variance)

Training samples | 20 states       | 45 states       | 45 states       | 65 states       | 100 states
20%              | 75.66% ± 22.82  | 86.60% ± 3.33   | 87.74% ± 2.47   | 85.38% ± 7.67   | 64.40% ± 45.65
40%              | 78.20% ± 12.83  | 88.70% ± 3.50   | 88.58% ± 7.24   | 92.10% ± 1.82   | 73.20% ± 26.50
60%              | 76.78% ± 11.20  | 91.90% ± 2.95   | 87.30% ± 3.90   | 87.98% ± 3.10   | 68.10% ± 25.87
80%              | 77.31% ± 31.42  | 92.75% ± 14.19  | 93.20% ± 8.30   | 92.50% ± 14.15  | 75.65% ± 16.80

The experiments were executed sequentially, first for the HMM alone in order to reach its maximum success rate by varying the number of states and the percentage of training samples. The results of varying these two parameters are shown in Table 1. Recognition accuracy improved as the number of states increased, with the highest accuracy obtained at 65 states. Accuracy improved only slightly as the number of training samples increased, the best results being obtained with 80% of the samples used for training. Using only the position of the hands, the recognition accuracy reached 94.50%. From Table 1 it follows that the best configuration is 80% training samples and 65 states, with a rate of 94.50% and a small variance. From this model the Fisher kernel is generated and applied to the neural network. The neural network gives better results still, reaching an average rate of 95.34% with the smallest variance. The overall accuracy of Indian sign language gesture recognition in the training, validation and testing modes is 98.60%, 97.64% and 97.52% respectively.

8. CONCLUSION

In this work we have used Indian sign language gestures as the communication medium for human robot interaction. This is our first step toward a prototype vision-based human robot interaction (HRI) system for speech and hearing impaired persons, who could use the humanoid robot as a translator or helping agent and communicate with it through Indian sign language gestures. Based on our observations, we select the neural network combined with the Fisher score as the best recognition tool with respect to both processing time and accuracy. In the present work the humanoid robot only performs a mimicry action. In principle these techniques could be applied to other human gestures, although classification would become harder because of the ambiguity of everyday gestures; we chose ISL precisely because its rigid vocabulary makes classification simpler. The present work also recognizes only a single gesture at a time. Recognizing multiple gestures, or a sequence of gestures one after another, remains challenging: the task would be to segment each gesture out of the sequence and then apply the recognition techniques described in this paper. In summary, a robust and novel automatic Indian sign language gesture recognition system has been presented, with an overall accuracy in the training, validation and testing modes of 98.60%, 97.64% and 97.52%. We conclude that the proposed system gives fast and accurate Indian sign language gesture recognition, and the encouraging test results make us confident that a full automatic sign language gesture recognition system can be developed.

9. REFERENCES

[1] Zhang, J., Zhao, M.: A vision-based gesture recognition

system for human-robot interaction. Robotics and

Biomimetics (ROBIO), 2009 IEEE International

Conference on, vol., no., pp.2096-2101, 19-23 Dec.

(2009). doi: 10.1109/ROBIO.2009.5420512

[2] Calinon, S., Guenter, F., Billard, A.: On Learning,

Representing and Generalizing a Task in a Humanoid

Robot. IEEE Trans. on Systems, Man and Cybernetics,

Part B, Vol. 37, No. 2, pp. 286-298 (2007). doi:

10.1109/TSMCB.2006.886952


[3] Calinon, S., Guenter, F., Billard, A.: Goal-Directed

Imitation in a Humanoid Robot. International Conference

on Robotics and Automation (ICRA), pp. 299-304

(2005).

[4] Pantic, M., Rothkrantz, L. J. M.: Toward an affect-

sensitive multimodal human-computer interaction. IEEE,

vol.91, no.9, pp. 1370- 1390, Sept. (2003).

[5] Bhuyan, M. K., Ghoah, D., Bora, P. K.: A Framework

for Hand Gesture Recognition with Applications to Sign

Language. India Conference, 2006 Annual IEEE, PP. 1-

6, Sept. (2006). doi: 10.1109/INDCON.2006.302823

[6] Prasad, J. S., Nandi, G. C.: Clustering Method

Evaluation for Hidden Markov Model Based Real-Time

Gesture Recognition. Advances in Recent Technologies

in Communication and Computing, ARTCom '09, pp.

419-423, 27-28Oct. (2009).

[7] Lee, H. J., Chung, J. H.: Hand gesture recognition using

orientation histogram. TENCON 99. Proceedings of the

IEEE Region 10 Conference, vol.2, no., pp.1355-1358

vol.2, Dec. (1999). doi: 10.1109/TENCON.1999.818681

[8] Freeman, W. T., Roth, M.: Orientation histograms for

hand gesture recognition. Intl. Workshop on Automatic

Face- and Gesture- Recognition, IEEE Computer

Society, Zurich, Switzerland, pp.296—301, June (1995).

MERL-TR94-03.

[9] Nandy, A., Prasad, J. S., Chakraborty, P., Nandi, G. C.,

Mondal, S.: Classification of Indian Sign Language In

Real Time. International Journal on Computer

Engineering and Information Technology (IJCEIT), Vol.

10, No. 15, pp. 52-57, Feb. (2010).

[10] Nandy, A., Prasad, J. S., Mondal, S., Chakraborty, P.,

Nandi, G. C.: Recognition of Isolated Indian Sign

Language gesture in Real Time. BAIP 2010, Springer

LNCS-CCIS, Vol. 70, pp. 102-107, March (2010). doi:

10.1007/978-3-642-12214-9_18.

[11] Dasgupta, T., Shukla, S., Kumar, S., Diwakar, S., Basu,

A,: A Multilingual Multimedia Indian Sign Language

Dictionary Tool. The 6’Th Workshop on Asian

Language Resources, pp. 57-64 (2008).

[12] Kim, J., Thang, N. D., Kim, T.: 3-D hand motion

tracking and gesture recognition using a data glove.

Industrial Electronics, 2009. ISIE 2009. IEEE

International Symposium on, vol., no., pp.1013-1018, 5-8

July (2009). doi: 10.1109/ISIE.2009.5221998

[13] Jiangqin, W., Wen, G., Yibo, S., Wei, L., Bo, P.: A

simple sign language recognition system based on data

glove. Signal Processing Proceedings, 1998. ICSP '98.

1998 Fourth International Conference on, vol.2, no.,

pp.1257-1260 vol.2 (1998). doi:

10.1109/ICOSP.1998.770847

[14] Ishikawa, M., Matsumura, H.: Recognition of a hand-

gesture based on self-organization using a DataGlove.

Neural Information Processing, 1999. Proceedings.

ICONIP '99. 6th International Conference on, vol.2, no.,

pp.739-745 vol.2 (1999). doi:

10.1109/ICONIP.1999.845688

[15] Swee, T. T., Ariff, A. K., Salleh, S. H., Seng, S. K.,

Huat, L. S.: Wireless data gloves Malay sign language

recognition system. Information, Communications &

Signal Processing, 2007 6th International Conference on,

vol., no., pp.1-4, 10-13 Dec. (2007). doi:

10.1109/ICICS.2007.4449599

[16] Liang, R. H., Ouhyoung, M.: A real-time continuous

gesture recognition system for sign language. Automatic

Face and Gesture Recognition, 1998. Proceedings. Third

IEEE International Conference on, vol., no., pp.558-567,

14-16 Apr (1998).

[17] Won, D., Lee, H. G., Kim, J. Y., Choi, M., Kang, M. S.:

Development of a wearable input device based on human

hand-motions recognition. Intelligent Robots and

Systems, 2004. (IROS 2004). Proceedings. 2004

IEEE/RSJ International Conference on, vol.2, no., pp.

1636- 1641 vol.2, 28 Sept.-2 Oct. (2004). doi:

10.1109/IROS.2004.1389630

[18] Kuzmanic, A., Zanchi, V.: Hand shape classification

using DTW and LCSS as similarity measures for vision-

based gesture recognition system. EUROCON, 2007.

The International Conference on "Computer as a Tool",

vol., no., pp.264-269, 9-12 Sept. (2007). doi:

10.1109/EURCON.2007.4400350

[19] Hienz, H., Grobel, K., Offner, G.: Real-time hand-arm

motion analysis using a single video camera. Automatic

Face and Gesture Recognition, 1996., Proceedings of the

Second International Conference on , vol., no., pp.323-

327, 14-16 Oct. (1996).

[20] Hasanuzzaman, M., Ampornaramveth, V., Zhang, T.,

Bhuiyan, M. A., Shirai, Y., Ueno, H.: Real-time Vision-

based Gesture Recognition for Human Robot Interaction.

Robotics and Biomimetics, 2004. ROBIO 2004. IEEE

International Conference on, vol., no., pp.413-418, 22-26

Aug. (2004). doi: 10.1109/ROBIO.2004.1521814.

[21] Rabiner, L. R.: A tutorial on hidden Markov models and

selected applications in speech recognition. IEEE, vol.77,

no.2, pp.257-286, Feb. (1989). doi: 10.1109/5.18626.

[22] Vector Quantization Technique and LBG Algorithm.

www.cs.ucf.edu/courses/cap5015/vector.ppt.

[23] Michailovich, O., Rathi, Y., Tannenbaum, A.: Image

Segmentation Using Active Contours Driven by the

Bhattacharyya Gradient Flow. IEEE Transactions on

Image Processing, vol.16, no.11, pp.2787-2801, Nov.

(2007). doi: 10.1109/TIP.2007.908073

[24] Kailath, T.: The Divergence and Bhattacharyya Distance

Measures in Signal Selection. IEEE Transactions on

Communication Technology, vol.15, no.1, pp.52-60, Feb.

(1967).

[25] Nayak, S., Sarkar, S., Loeding, B.: Distribution-Based

Dimensionality Reduction Applied to Articulated Motion

Recognition. Pattern Analysis and Machine Intelligence,

IEEE Transactions on, vol.31, no.5, pp.795-810, May

(2009). doi: 10.1109/TPAMI.2008.80.

[26] Nandy, A., Mondal, S., Prasad, J. S., Chakraborty, P.,

Nandi, G. C.: Recognizing & interpreting Indian Sign

Language gesture for Human Robot Interaction.

Computer and Communication Technology (ICCCT),

2010 International Conference on , vol., no., pp. 712-717,

17-19 Sept. (2010). doi: 10.1109/ICCCT.2010.5640434.

[27] Mitra, S., Acharya, T.: Gesture Recognition: A Survey.

IEEE Transactions on Systems, Man, and Cybernetics,

Part C: Applications and Reviews, vol.37, no.3, pp.311-

324, May (2007). doi: 10.1109/TSMCC.2007.893280.


[28] Lawrence R. Rabiner, "A tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.

[29] L. O'Gorman and R. Kasturi, Document Image Analysis, IEEE Computer Society Press, 1995.

[30] J. Serra, Image Analysis and Mathematical Morphology, Academic Press, 1982.

[31] C. Travieso, C. Morales, I. Alonso and M. Ferrer, "Handwritten digits parameterisation for HMM based recognition", Proceedings of Image Processing and its Applications, vol. 2, pp. 770-774, July 1999.

[32] E. Gomez, C. M. Travieso, J. C. Briceño, M. A. Ferrer, "Biometric Identification System by Lip Shape", in Proceedings of the 36th International Carnahan Conference on Security Technology, Atlantic City, October 2002, pp. 39-42.

[33] L. Fausett, "Fundamentals of Neural Networks: Architectures, Algorithms, and Applications", Prentice-Hall, Inc., 1994, pp. 304-315.

[34] K. Murakami, H. Taguchi: Gesture Recognition using Recurrent Neural Networks. In CHI '91 Conference Proceedings, pp. 237-242, ACM, 1991.

[35] Chang, J. Chen, W. Tai, and C. Han, "New Approach for Static Gesture Recognition", Journal of Information Science and Engineering, vol. 22, pp. 1047-1057, 2006.

[36] S. Naidoo, C. Omlin and M. Glaser, "Vision-Based Static Hand Gesture Recognition Using Support Vector Machines", 1998, pp. 88-94.

[37] Vladimir I. Pavlovic, Rajeev Sharma, Thomas S. Huang, "Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 677-695, July 1997.

[38] Davis, J., Shah, M.: Visual gesture recognition. IEE Proceedings - Vision, Image and Signal Processing, vol. 141, Issue 2, pp. 101-106, 1994.

[39] Hand gesture recognition using Radial Basis Function (RBF) Networks and Decision Trees, International Journal of Pattern Recognition and Artificial Intelligence, vol. 11, Issue 6, pp. 845-850, 1997.

[40] Face detection in complicated backgrounds and different illumination conditions by using YCbCr color space and neural network, Pattern Recognition Letters, vol. 28, Issue 16, pp. 2190-2200, 1 December 2007.


Change Data Capture on OLTP Staging Area for Nearly Real Time Data Warehouse based on Database Trigger

I Made Sukarsa
Department of Information Technology, Faculty of Engineering, Udayana University, Bali, Indonesia

Ni Wayan Wisswani
Department of Informatics Management, Politeknik Negeri Bali, Bali, Indonesia

I K. Gd. Darma Putra
Department of Information Technology, Faculty of Engineering, Udayana University, Bali, Indonesia

Linawati
Department of Electrical Engineering, Faculty of Engineering, Udayana University, Bali, Indonesia

ABSTRACT

A conventional data warehouse produces summaries from an organization's information systems over a long time period. As a result, management cannot obtain the most up-to-date data whenever it is needed. A nearly real time data warehouse, which manages the ETL process with more compact data and a shorter period, is therefore required.

The design of the nearly real time data warehouse in this research is implemented in two steps. The first step models the data collection technique so that the ETL handles more compact data. This is done by placing the staging area on the Online Transactional Processing (OLTP) system, which minimizes failures in moving data from the OLTP to the staging area. In addition, the CDC method is applied on the OLTP and implemented with active database triggers: a trigger captures the data change on the OLTP, transforms it and loads it into the staging area in a single step. The second step is the synchronization of data movement from the staging area to the nearly real time data warehouse. This is done by mapping the movement with SQLyog; the mapping is then executed by the Windows task scheduler.

General Terms

Modelling System, Data Warehouse

Keywords

Nearly real time data warehouse, Change Data Capture,

Surrogate key, Trigger.

1. INTRODUCTION

A data warehouse is a necessity for an organization. The data warehouse (DWH) can serve as the data source for all the integrated reporting needed to support decision making [1]. Data from various OLTP sources is processed through several stages consisting of Extract, Transform and Loading (ETL). ETL is built on a tier placed between the source data and the DWH, also known as a staging area [2]. The Extract part takes data from multiple sources within a specific time period so that it can be brought to the DWH. The data is cleaned, integrated and transformed into a specific format by the Transform component and then moved to the DWH by the Loading component.

A conventional ETL engine works on a time-variant basis: it stores the data periodically according to the organization's business process flow [3]. This characteristic prevents the DWH from giving the most up-to-date information on every event in the transactional system, whereas a real time data warehouse is precisely what is needed for decision making that requires the highest level of up-to-date information [4].

A real time data warehouse would show the ETL result at exactly the transaction time of the source systems [5]. However, ETL, as the core of the data warehouse [6], cannot truly work in real time [7], because it needs time to process large amounts of data from various sources and must pass through several communication components [8]. The delay that ETL needs to produce these summaries gives rise to the term Nearly Real Time Data Warehouse (NRTDWH) [7].

To produce an NRTDWH, ETL can therefore be implemented by applying Change Data Capture (CDC) [9]. CDC detects changes in the data sources and captures them so that they can be delivered to the destination databases that need them [10]. This ability lets CDC capture data changes efficiently [11], which makes an NRTDWH easier to implement. For these reasons, building an NRTDWH through CDC modeling is an important effort to undertake.

2. RELATED WORK

Several studies have addressed CDC modeling and real time data warehousing. In [12], the CDC process is modeled using log analysis, and an architecture for a semi real time DWH is introduced that builds a real time data warehouse using the CDC mechanism provided by Oracle. In [10], the data change capture process is modeled with a set of web services; web-service-based capture is also used by [13], which introduces a multi-level real time data cache architecture to facilitate a real time data warehouse. Meanwhile, [8] models ETL for a real time DWH with a scheduling algorithm that balances queries and updates through thread control on a trigger-based ETL engine.

In our research, a trigger based CDC model is developed that captures data changes on different source systems. The same trigger transforms the captured result in the same step and then loads it into a staging area placed on the OLTP.

The capture, transform and load (CTL) process designed in this way lets the DWH receive the data summary faster, because the ETL processes a smaller amount of data and the CTL result is already final data conforming to the DWH structure. Consequently, the synchronization of all the data sources into the DWH does not need any further transformation.

3. CAPTURE, TRANSFORM AND LOAD

3.1 CTL Framework

The CTL model architecture for the NRTDWH developed in this research is visualized in Figure 1:

Figure 1. General architecture of the system. (In the diagram, each MySQL ODS/OLTP detects changed data through Change Data Capture, transforms it and loads it into a staging area located on that ODS; a scheduler then loads the staged data from every source into the Nearly Realtime Data warehouse, from which end users view the results.)

In this model, the transform and load processes are carried out by each OLTP engine. This reduces the time delay, because the staging area is located on each OLTP and there is no need to build a separate staging area as in existing models. The integration is completed on the OLTP, so the data warehouse receives final data.

The NRTDWH in this research is produced by the CTL process running on the different OLTP sources. The model starts working when a user inserts new data, or changes or deletes a record or some fields on the OLTP.

An insert event makes a trigger capture the inserted data and save it as a new record in the appropriate staging area table. An update to one or more fields of a record makes a trigger capture the change; the result is used to update existing data or is saved as a new record in the corresponding staging area table. If a delete occurs, the deleted data changes some fields of the active record in the staging area, and the trigger may also insert it as new data into the appropriate staging table. CTL works as shown in figure 2 below.

Figure 2. CTL process flow. (The flow starts from a data manipulation on the OLTP; the inserted, updated or deleted data is captured, transformed in accordance with the appropriate structure, loaded into the staging area on the OLTP, and finally loaded into the DWH after the determination process.)

When the captured result is transformed, the trigger performs one of the following processes (a simplified trigger sketch is given after this list):

1. Simple Transform Process. This process adjusts fields and reformats the captured data to match the structure of the staging area. It applies when the information on the related topic in the staging area comes from a single table and does not need to be related to other tables.

2. Leveled Transform Process. This process additionally uses join queries and other look-up style operations. It applies when the information comes from several tables on the OLTP.
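The following self-contained Python sketch illustrates the trigger-based CTL idea for the Simple Transform Process. The paper uses MySQL triggers on the real OLTP schema; here sqlite3 (from the Python standard library) stands in, and the table and column names are simplified stand-ins for the schemas shown later in the paper.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified OLTP table and staging dimension (illustrative names only)
    CREATE TABLE th_thesis (id_thesis INTEGER PRIMARY KEY, judul TEXT, peneliti TEXT);
    CREATE TABLE staging_dimensi_tesis (
        id_st_thesis INTEGER PRIMARY KEY AUTOINCREMENT,
        id_thesis INTEGER, judul_penelitian TEXT, nama_peneliti TEXT, status TEXT);

    -- CTL trigger: capture the insert, transform (rename/format fields)
    -- and load it into the staging area in one step
    CREATE TRIGGER ctl_thesis_insert AFTER INSERT ON th_thesis
    BEGIN
        INSERT INTO staging_dimensi_tesis (id_thesis, judul_penelitian, nama_peneliti, status)
        VALUES (NEW.id_thesis, UPPER(NEW.judul), NEW.peneliti, 'active');
    END;
""")

# An ordinary OLTP insert now populates the staging area automatically.
conn.execute("INSERT INTO th_thesis VALUES (1, 'Nearly real time DWH', 'Sukarsa')")
print(conn.execute("SELECT * FROM staging_dimensi_tesis").fetchall())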

All CTL results saved in the staging area are then moved to the NRTDWH by the task scheduler, based on the metadata mapping design. This metadata is the basic rule for joining data from every OLTP source into the NRTDWH. To make the data warehouse easier to understand, the data in it is presented through a data mart application.

3.2 Dimensional Modelling

In this research, all of the OLTP systems use the same MySQL database platform. The OLTP provides the data that the NRTDWH needs, while the staging area loads the CTL results into dimension and fact tables that are ready to be joined into the NRTDWH. Figure 3 shows the star schema placed on each staging area on the OLTP and the dimensional model of the data warehouse.

Figure 3. Dimensional Modeling. (The figure shows three star schemas: the staging design for OLTP 1, with the tables dimensi_prodi, dimensi_disertasi, fak_pengunjung_perdisertasi, fakta_prodi_disertasi, fg_pengunjung_perprodi and fg_pengunjung_perprodi_perbulan; the DWH design, with dimensi_prodi, dimensi_tesisdisertasi, fg_pengunjungprodi, fak_pengunjung_tsds, fakta_prodi_tsds and fg_pengunjung; and the staging design for OLTP 2, with dimensi_prodi, dimensi_tesis, fak_pengunjung_pertesis, fakta_prodi_tesis and the fak_pengunjung_per_prodi tables. Each table carries a surrogate key (id_sd_*, id_dwh_* or id_st_* respectively) together with the natural keys and attributes taken from the OLTP.)


Even though the data comes from different sources, the process of joining data from the staging areas into the NRTDWH does not need an additional transformation to form new surrogate keys for the dimensions and facts. Even so, all data in the NRTDWH can still be distinguished, because the surrogate key in this research is designed to retain the identity of its OLTP source. This surrogate key model also prevents the join process from failing when the same data appears in more than one source.
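One way to read this design (an assumption about the exact scheme, which the paper does not spell out) is that each staging surrogate key embeds an identifier of its OLTP source, so keys from different OLTPs can never collide when loaded into the NRTDWH:

def surrogate_key(source_id, local_id):
    # Hypothetical illustration: prefix the OLTP source code to the local key
    # so rows from the thesis and dissertation systems stay distinguishable.
    return f"{source_id}-{local_id}"

print(surrogate_key("ST", 42))   # staging key from the thesis OLTP        -> 'ST-42'
print(surrogate_key("SD", 42))   # staging key from the dissertation OLTP  -> 'SD-42'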

3.3 Nearly Real Time Data Warehouse

The NRTDWH in this research is achieved in several ways:

a. A staging area design that is united with the OLTP database. This shortens the time needed to capture data changes from the OLTP into the staging area, so the transform process can be carried out immediately. It also minimizes communication failures, because the data source and target sit on the same host. Placing the staging area on the OLTP likewise makes the synchronization into the NRTDWH easier: since all data processing is done on the OLTP, everything saved in the staging area is already final data in the structure the NRTDWH requires.

b. Shortening the data load interval to the NRTDWH with a trigger. A trigger allows the capture to be performed in a shorter time than other CDC methods, and a shorter capture process in turn reduces the time needed for the transform and load steps on the staging area.

c. Joining the transform to the Change Data Capture. Performing the CTL process in one step with the same trigger minimizes the delay between capture and transform. This speeds up the load into the staging area, so the synchronization interval can also be made shorter.

d. Using triggers, functions and procedures as the transform engine. In this research, all of the capture, transform and load processing is run by PL/SQL triggers, functions and procedures. Triggers are chosen because the whole process runs faster and the daily transactions keep working without disturbance, since the PL/SQL code runs inside the DBMS. A trigger also knows which event changed the corresponding OLTP record, so the changed data can be transformed directly without being compared with data already saved in the DWH. This makes the NRTDWH easier to achieve.

3.4 The Synchronization Process

The synchronization process moves and joins the processed data loaded into the staging area of each OLTP. It consists of two main components. The first component performs the metadata mapping, done with SQLyog Ultimate; the metadata serves as the basic rule applied whenever synchronization takes place, and the mappings are saved in a separate job file for every source. The second component is the scheduler, which holds the schedule for moving data to the DWH and runs the job files against the prepared metadata scheme. The scheduler is implemented with the Windows operating system task scheduler and runs every minute; a simplified sketch follows.
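A rough Python sketch of the per-minute synchronization loop. The paper uses SQLyog job files driven by the Windows task scheduler, so this stand-in, with its hypothetical table names, identical source/destination structures and one-minute sleep, is only an illustration of the same idea.

import sqlite3
import time

# Hypothetical mapping: staging table on each source -> destination table in the NRTDWH
MAPPING = {"staging_dimensi_tesis": "dimensi_ts_ds",
           "staging_dimensi_disertasi": "dimensi_ts_ds"}

def synchronize(staging_conn, dwh_conn):
    # Copy any staged rows not yet present in the DWH, keyed by the surrogate key.
    for src, dst in MAPPING.items():
        rows = staging_conn.execute(f"SELECT * FROM {src}").fetchall()
        for row in rows:
            dwh_conn.execute(
                f"INSERT OR IGNORE INTO {dst} VALUES ({','.join('?' * len(row))})", row)
    dwh_conn.commit()

# One-minute cycle, mirroring the Windows task scheduler interval:
# while True:
#     synchronize(staging, dwh)
#     time.sleep(60)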

3.5 Testing and Results

The CDC model in this research was tested with three applications: the thesis system and the dissertation system, which act as OLTPs, and the Udayana University data mart application. The testing is done by manipulating dummy data spread across each OLTP; only those OLTP tables that can be sources for the DWH are manipulated.

The testing has two phases. The first phase verifies that the CTL process on the staging area completes successfully. The second phase proves that the synchronization from the staging area of each OLTP to the NRTDWH is successfully performed by the scheduler.

3.5.1 Capture, Transform and Load Testing

The trigger performs the CTL process before and after each insert, update and delete on an OLTP. These manipulations affect the fact and dimension tables of the corresponding staging area. One of the CTL processes observed is the manipulation (insert, update and delete) of the th_thesis table. An insert into th_thesis is made through the form shown in figure 4.

Figure 4: Insert form of the thesis OLTP system.

When an insert is made through this form, the CTL on the th_thesis table inserts a new row into the dimension and fact tables of the staging area, so that the dimension table looks like figure 5.

Figure 5. Insert result in the thesis dimension.

The fact table pengunjung_pertesis then looks like figure 6.

Figure 6. Insert result in the fact table pengunjung_pertesis.

Another fact table influenced by this process is prodi_tesis; when the CTL succeeds it looks like figure 7.

Figure 7. Insert result in prodi_tesis.


The pengunjung_prodi fact table also changes when the insert into th_thesis is done; the result of the CTL process on this table is shown in figure 8.

Figure 8. Insert result in the pengunjung_prodi table of the thesis OLTP system.

An insert into th_thesis also influences the pengunjung_prodi_perbulan fact table, which changes as in figure 9.

Figure 9. Insert result in the pengunjung_prodi_perbulan table of the thesis system.

Updates that influence the dimension and fact tables are made in two ways. First, the th_thesis table is updated through a form like the one in figure 10.

Figure 10. Update form of the thesis OLTP system.

An update through this form changes the researcher name field, the research title or the id_prodi field, which triggers the CTL and affects the dimension and fact tables in the staging area. If only the name and title of the entered research change, the CTL changes the thesis dimension as in figure 11.

Figure 11. Update result in the thesis dimension.

If the change is made to the id_prodi field, the prodi dimension changes as in figure 12.

Figure 12. Update result of the id_prodi field in the prodi dimension.

A change of the id_prodi field can also influence the fact tables in the staging area. The fact table that changes is pengunjung_pertesis, as shown in figure 13.

Figure 13. Update result in pengunjung_pertesis.

After the CTL has run, the prodi_tesis fact table looks like figure 14.

Figure 14. Update result of the id_prodi field in the prodi_tesis table.

As a consequence of these processes, the pengunjung_prodi fact table looks like figure 15.

Figure 15. Update result in the pengunjung_prodi table of the thesis system.

The other fact table that changes is fg_pengunjungprodibulan, as shown in figure 16.

Figure 16. Update result in the pengunjung_prodi_perbulan table of the thesis system.

The second way of updating th_thesis is through a form like figure 17.

Figure 17. Update form for the lihat field of the thesis table.

User activity through this form changes the value of the lihat field saved in the th_thesis table. This change makes the CTL work, so the pengunjung_pertesis table looks like figure 18.

Figure 18. Update result of the lihat field in the pengunjung_pertesis table.

Another table that changes because of this process is the pengunjung_prodi fact table; the result is shown in figure 19.

Figure 19. Update result in the pengunjung_prodi table.

The CTL process triggered by the lihat field also changes the pengunjung_prodi_perbulan table, as shown in figure 20.


Figure 20. Update result in the pengunjung_prodi_perbulan table.

The delete process on the thesis system is done through a form like figure 21.

Figure 21. Delete form for the th_thesis table of the thesis OLTP system.

A delete through this form triggers the CTL process, which changes records in several staging area tables. The first table to change is the thesis dimension table, as shown in figure 21.

Figure 21. Delete result in the thesis dimension table.

Another table that changes is fakta_prodi_tesis, which afterwards looks like the following figure.

Figure 22. Delete result in the fakta_prodi_tesis (study program thesis) table.

The pengunjung_prodi table then looks like figure 23.

Figure 23. Delete result in the pengunjung_prodi table.

3.5.2 Data Synchronization Process to the Data Warehouse
The data synchronization process from the OLTP sources to the NRTDWH is performed by a scheduler that runs according to the designed scheme. Data successfully moved from the staging areas is merged into the NRTDWH based on the metadata shown in Table 1 below.

Table 1. Staging area metadata of the DWH

Source staging area | Source table                | Destination table in NRTDWH
DWH disertasi       | Dimensi disertasi           | Dimensi_ts_ds
DWH disertasi       | Dimensi prodi               | Dimensi_prodi
DWH disertasi       | Fak_pengunjung_perdisertasi | Fak_pengunjungtsds
DWH disertasi       | Fakta_prodi_disertasi       | Fakta_prodi_tsds
DWH disertasi       | Fg_pengunjung_prodi         | Fakta_pengunungprodi
DWH disertasi       | Fgpengunjungprodibln        | Fakpengunjungprodbln
DWH thesis          | Dimensi tesis               | Dimensi_ts_ds
DWH thesis          | Dimensi prodi               | Dimensi_prodi
DWH thesis          | Fak_pengunjung_pertesis     | Fak_pengunjungtsds
DWH thesis          | Fakta_prodi_tesis           | Fakta_prodi_tsds
DWH thesis          | Fg_pengunjung_prodi         | Fata_pengunungprodi
DWH thesis          | Fg_kunjungprodibulan        | Fakkunjungprodbulan

This metadata forms the rule base of the synchronization process. Figure 24 shows the history of successful synchronizations performed by the capture job scheduler.

Figure 24. Job scheduler history
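To make the role of this rule base concrete, the following is a minimal sketch, not the authors' implementation, of how the Table 1 mapping could drive one scheduler tick that moves staged rows into the NRTDWH. The dictionaries stand in for the staging databases and the warehouse, and every helper name and sample value is an assumption.

# Table 1 as a rule base: (staging area, source table, destination table in the NRTDWH).
SYNC_RULES = [
    ("DWH disertasi", "Dimensi disertasi", "Dimensi_ts_ds"),
    ("DWH disertasi", "Fakta_prodi_disertasi", "Fakta_prodi_tsds"),
    ("DWH thesis", "Dimensi tesis", "Dimensi_ts_ds"),
    ("DWH thesis", "Fakta_prodi_tesis", "Fakta_prodi_tsds"),
    # ... the remaining rows of Table 1 follow the same pattern
]

def run_sync_tick(staging_areas, nrtdwh):
    """One scheduler tick: append every staged row to its destination table and clear the stage."""
    for area, source, destination in SYNC_RULES:
        staged_rows = staging_areas.get(area, {}).get(source, [])
        nrtdwh.setdefault(destination, []).extend(staged_rows)
        staged_rows.clear()

# Example: one dissertation dimension row waiting in its staging area.
staging = {
    "DWH disertasi": {"Dimensi disertasi": [{"id": 7, "judul": "..."}], "Fakta_prodi_disertasi": []},
    "DWH thesis": {"Dimensi tesis": [], "Fakta_prodi_tesis": []},
}
warehouse = {}
run_sync_tick(staging, warehouse)
print(warehouse["Dimensi_ts_ds"])    # the row has been merged into the NRTDWH table

In the real system the same loop would of course issue inserts against the warehouse tables rather than append to in-memory lists.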

One successful synchronization result is shown in Figure 25.

Figure 25. Synchronization result in the prodi_tesis_disertasi table of the NRTDWH.

The synchronization results stored in the dimension and fact tables of the NRTDWH are presented through a data mart application, which makes the data in the NRTDWH easier to read and helps the end user grasp its overall meaning. Before presentation, the data in the NRTDWH goes through a masking process, in which the prodi dimension table is combined with the related fact table. One such masking process combines the values of the prodi dimension table shown in Figure 26 with the record values shown in Figure 25.
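As a rough illustration only (the paper does not show the underlying query), the masking step can be sketched as a join between the prodi dimension and a related fact table so that program identifiers appear with readable names; the column names and sample rows below are assumptions.

# Illustrative masking step: join the prodi dimension with a fact table on id_prodi
# so the data mart shows program names instead of bare id_prodi codes.
prodi_dimension = [
    {"id_prodi": 1, "nama_prodi": "Teknik Elektro"},
    {"id_prodi": 2, "nama_prodi": "Ilmu Komputer"},
]
fakta_prodi_tesis = [
    {"id_prodi": 1, "jumlah_tesis": 12},
    {"id_prodi": 2, "jumlah_tesis": 7},
]

def mask(dimension, facts, key="id_prodi"):
    """Attach the descriptive dimension columns to each fact row."""
    lookup = {row[key]: row for row in dimension}
    masked = []
    for fact in facts:
        joined = dict(lookup[fact[key]])                              # descriptive attributes
        joined.update({k: v for k, v in fact.items() if k != key})    # measures from the fact
        masked.append(joined)
    return masked

for row in mask(prodi_dimension, fakta_prodi_tesis):
    print(row)   # e.g. {'id_prodi': 1, 'nama_prodi': 'Teknik Elektro', 'jumlah_tesis': 12}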


Figure 26. Data in the prodi dimension table of the NRTDWH.

Based on these inputs, the masking process in the test application produces the result shown in Figure 27.

Figure 27. Masking result of the dimension and fact tables.

Through this application, masking results can also be viewed as graphics. The graphic generated from the data in the prodi_disertasi fact table is shown in Figure 28.

Figure 28. Masking graphic result.

4. CONCLUSION AND FUTURE WORK
This research has developed a method to create a near real-time data warehouse from several OLTP sources that share the same platform. The NRTDWH is built by implementing trigger-based CTL, which runs the transform and load processes at the same time in a staging area placed on the OLTP side. Future research can apply CTL to build a near real-time data warehouse from data sources on different platforms and can measure the effect of the extra staging workload on OLTP performance. Data integration issues also need special attention to support more dynamic modeling. Further research along these lines should yield a data warehouse implementation model that is closer to real time by cutting processing time in the staging area.

5. ACKNOWLEDGMENTS
Our special thanks to the Divinkom Department of Udayana University, Bali, Indonesia, which contributed to the application testing of the model.

6. REFERENCES
[1] Robert M. Bruckner, Beate List, and Josef Schiefer, Striving towards Near Real-Time Data Integration for Data Warehouses, Data Warehousing and Knowledge Discovery, Lecture Notes in Computer Science, 2002, Volume 2454/2002, 173-182, DOI: 10.1007/3-540-46145-0_31

[2] Javed, Muhammad Younus, and Nawaz, Asim, 2010. Data Load Distribution by Semi Real Time Data Warehouse. In: Computer and Network Technology (ICCNT), 2010 Second International Conference on, page(s): 556-560

[3] Inmon, W.H. 2005. Building The Data Warehouse

Fourth Edition. Canada : Wiley Publishing.Inc.

[4] Simitsis, A.; Vassiliadis, P.; Sellis, T.;, Optimizing ETL

Processes in Data Warehouses.In Data Engineering,

2005. ICDE 2005. Proceedings. 21st International

Conference on Digital Object, Page(s): 564 – 575

[5] Vandermay, John., 2001. Considerations for Building a

Real-time Data Warehouse

[6] Savitri, F.N., Laksmiwati, H., Study of localized data cleansing process for ETL performance improvement in independent datamart, Electrical Engineering and Informatics (ICEEI), 2011 International Conference on. [accessed: 13 August 2011]

[7] Langseth ,Justin., 2004, Real-Time Data Warehousing:

Challenges and Solutions.

[8] Jie Song; Yubin Bao; Jingang Shi; 2010, A Triggering

and Scheduling Approach for ETL . Computer and

Information Technology (CIT), 2010 IEEE 10th

International Conference on , Page(s): 91 – 98.

[9] R. Kimball and J. Caserta, The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data. John Wiley & Sons, 2004.

[10] Mitchell J Eccles, David J Evans and Anthony J

Beaumont, True Real-Time Change Data Capture

WithWeb Service Database Encapsulation, 2010, 2010

IEEE 6th World Congress on Services

[11] Attunity Ltd, 2009, Efficient and Real Time Data Integration With Change Data Capture. Available at http://www.attunity.com/cdc_for_etl

[12] Jingang Shi, Yubin Bao, Fangling Leng, Ge

Yu.2008,Study on Log-Based Change Data Capture and

Handling Mechanism in Real-Time Data Warehouse. In

International Conference on Computer Science and

Software Engineering, CSSE 2008, Volume 4:

Embedded Programming / Database Technology / Neural

Networks and Applications / Other Applications,

December 12-14, 2008, Wuhan, China. pages 478-481,

IEEE Computer Society, 2008.

[13] Liu Jun; Hu ChaoJu; Yuan HeJin. 2010. Application of

Web Services on The Real-time Data Warehouse

Technology, Advances in Energy Engineering (ICAEE),

2010 International Conference on , Page(s): 335 – 338


Decision Support System for Admission in Engineering Colleges based on Entrance Exam Marks

Miren Tanna, B.E. Student
Dept. of Computer Engineering, Thakur College of Engineering & Technology, Mumbai, India

ABSTRACT

Making a wise career decision is very important for everyone. In recent years, decision support tools and mechanisms have assisted us in making the right career decisions. This paper attempts to enable a student who wishes to pursue Engineering to make good decisions with the help of a Decision Support System. The last three years' information has been obtained from the website of the Directorate of Technical Education, India (DTE), which makes it freely available. Using decision rules, results are computed from which a student can choose the stream and college he/she can opt for on the basis of the Entrance Exam marks he/she has scored. To make the results more relevant, a search of the already created decision system is performed first. A student has to enter his/her Entrance Exam score and the stream he/she wishes to opt for. Based on the entered information, the decision system returns colleges and streams categorized as Ambitious, Best Bargain and Safe.

General Terms

Data Mining, Decision Support System.

Keywords

Prediction, Result prediction.

1. INTRODUCTION
Universities possess a large amount of demographic data about students and colleges. This data is present without any form of analysis. Informal analysis requires one to read through each line of the data, which is not economical. Studies have been conducted in similar areas, such as understanding student data [1]. There, a decision tree algorithm is applied to and evaluated on university records, producing graphs that are useful both for predicting graduation and for finding factors that lead to graduation. Another study uses student data to predict which branch a student has a high chance of being placed into [2]; it makes use of adjacency lists, information gain theory and a confusion matrix. This proposal deals with two problems: using the available data to predict in which college, and in which stream, a student has a high chance of getting admitted, and providing relevant results by learning from previous system states and revising itself each time. Decision Support Systems are mostly interactive systems, often required by humans to provide necessary information based on specific inputs; they are also adaptable computer-based information systems. Such a decision system utilizes not only decision rules, models and a comprehensive database but also the decision maker's own insights, leading to specific, implementable decisions in solving problems that would be difficult for a human to tackle alone [3].

2. PROBLEM DEFINITION
Admission into professional colleges for the engineering degree course is based on scores in the Common Entrance Test (CET). Students are allotted colleges based on these scores. Seats are allotted on the basis of seat availability in CAP rounds. The lowest score accepted in a college for a certain CAP round is known as the cut-off score.
Universities under the DTE collect data about CET scores and admissions from each college under that particular university. Analyzing this extensive data provides an opportunity to predict the admission pattern for a particular score, branch and even CAP round. Presently there are no resources to sort out colleges based on the parameters of marks, branches and CAP rounds, so a student is less informed about the colleges he is eligible for. Here we propose a technique that uses a Decision Support System to assist in providing a student with such decisions. The decisions taken by the system should not only focus on present decisions but also take past decisions into account.
The proposal mainly discusses the use of a DSS for finding the most appropriate colleges for students based on their CET scores. However, the scope of the project can be extended to include the common entrance exam that is being envisioned to bring uniformity and fairness to the current admission system. The algorithm that has been developed can be modified so that it functions properly for the new pattern as well, for obtaining admission to different colleges. The main focus here has been given to the Engineering field, and the data has been collected accordingly. Students opting for the engineering field may enter their marks in order to get an appropriate result for the colleges suitable for them. Similarly, this system can be used for several other fields too, such as Medicine, Pharmacy, etc.

3. DATA MINING
Data mining is a process that analyses (often large) observational data sets to find relationships within them and to summarize the data in a way that can be used by humans for various purposes [4]. Techniques such as Bayes' theorem, neural networks and decision trees are an integral part of the data mining process. Data mining is the process of extracting knowledge from a set of data. It discovers new patterns from data sets using various methods from artificial intelligence, machine learning and database systems, and it is one of the steps of the Knowledge Discovery in Databases process.


In the Knowledge Discovery process, the uncovered hidden knowledge can be identified as relationships or patterns. The relationships may be between two or more different objects and may change over a period of time. Discovery of relationships is a key result of data mining [5]. If knowledge discovery is one aspect of data mining, prediction is the other: here we look for a specific association with regard to an event or condition. Pattern discovery is another outcome of data mining operations. Data mining tools mine the usage patterns of thousands of users and discover the potential patterns of usage that will produce results.
The most common example of data mining is the analysis of shopping trends among shoppers: products most commonly bought together are placed next to each other to increase sales.

4. DECISION SUPPORT SYSTEM
A DSS is designed specifically to facilitate decision processes. It should support rather than automate decision making, and should adapt quickly to the changing requirements of decision makers [6]. Decision Support Systems are found most useful because they couple human decision-making skills with the computational capability of a computer to improve the quality of decisions. A DSS is a computer-based support system for management decision makers who deal with semi-structured problems [7]. We propose implementing a combination of the data-driven and knowledge-driven types of Decision Support System. Data-driven decision support systems are based on access to and manipulation of a series of internal, external and sometimes real-time data of an organization. Simple file systems accessed by query and retrieval tools provide the most elementary level of functionality [8]. Knowledge-driven DSS suggest or recommend actions to users. They use business rules, knowledge bases and also human expertise in the form of programmed internal logic. Decisions and tasks that could be taken and performed by a human expert are taken by a knowledge-driven DSS; the generic tasks include classification, configuration, diagnosis, interpretation, planning and prediction [9]. DSS often require user involvement in the construction of the problem representation and in model verification. They also require direct user involvement in the analysis and evaluation of decision outcomes. These activities involve subjective judgments and, therefore, a DSS should focus on effective support and not on automatic selection. An effective DSS is flexible, adaptable to changing user scenarios and its environment, and learns from knowledge gained on past user scenarios [10].

5. CENTRAL TENDENCY
Central tendency refers to the ways in which the central value of a data set, such as the mean or the median, is calculated. The most common and most effective numerical measure of the “center” of a set of data is the (arithmetic) mean [11]. Let x_1, x_2, ..., x_N be a set of N values or observations of some attribute, such as salary. Equation 1 gives the mean of this set of values:

\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i    (1)

6. IMPLEMENTATION
The CET scores are stored in the database and processed to obtain the relevant results. Colleges are sorted on the basis of previous years' results as well as each CAP round. High and Low values are used to apply conditions to the database entries and categorize them. The results are categorized as Ambitious, Best Bargain and Safe colleges.

High = Score + offset    (2)
Low = Score - offset    (3)

where offset is the deciding factor that controls the relevance of the results. The algorithm is as follows:

1) Accept input from the user.
2) Search the database for results pre-existing in the database.
3) If a result is found, display it to the user. This reduces the complexity and provides a quick result to the user. Update the Low and High values for the existing result using the current input of the user.
4) Else, do the following:
   a) Search the entire database.
   b) For entries with mean greater than the input but less than or equal to High, mark them as Ambitious. This means the candidate has a slim chance of getting into that college and stream.
   c) For entries with mean less than the input but greater than or equal to Low, mark them as Best Bargain. This means the computed result is the best possible college-branch result for the candidate, who is most likely to get into it.
   d) For entries with mean less than or equal to Low, mark them as Safe. This means the candidate has the highest chance of getting into these colleges.
5) Enter the result generated in step 4 into the database to make it available for future users.

Mean is the average of the cut-off marks of all years for a particular CAP round. At steps 4 and 5 the system learns on its own on the basis of the marks and previous years' results. It revises itself each time the system is used. Therefore, each time, the system is in a new state, except when results pre-exist in the table.
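A minimal sketch of these decision rules is given below, under the assumption that each database entry is a (college, branch, CAP round, mean cut-off) record; the variable names and the dictionary used to cache earlier results (steps 3 and 5) are illustrative only.

# Minimal sketch of the categorization logic (Equations 2-3 and steps 1-5).
OFFSET = 2        # deciding factor used in the worked example of Section 7
_cache = {}       # previously computed results, keyed by the input score (steps 3 and 5)

def classify(score, entries, offset=OFFSET):
    if score in _cache:                                  # step 3: reuse a pre-existing result
        return _cache[score]
    high, low = score + offset, score - offset           # Equations (2) and (3)
    result = []
    for college, branch, cap, mean in entries:           # step 4: scan the whole database
        if mean <= low:                                  # Safe checked first, so a mean equal
            result.append((college, branch, cap, "Safe"))    # to Low is labelled Safe, as in Table 3
        elif score < mean <= high:
            result.append((college, branch, cap, "Ambitious"))
        elif low <= mean < score:
            result.append((college, branch, cap, "Best Bargain"))
    _cache[score] = result                               # step 5: store for future users
    return result

# Worked example of Section 7: score 140 and offset 2 applied to the Table 2 means.
table2 = [(1, 1, 1, 139), (1, 1, 2, 144), (1, 1, 3, 128),
          (2, 4, 1, 141), (2, 4, 2, 138), (2, 4, 3, 142)]
for row in classify(140, table2):
    print(row)

Run on the Table 2 means with a score of 140 and an offset of 2, the sketch assigns the categories shown in Table 3 to the corresponding rows.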

7. TEST RESULTS
Sample data taken from the data collected from the DTE website is used to test the working of this decision system. The DTE compiles the data every year after the admission process is over, with the help of feedback from the colleges under it, and publishes it on its website [12]. Table 1 shows the sample data. The user has to provide his/her CET score; the system then provides its decision, updates itself and is ready for future decision queries.

Table 1. Sample data of college cut-off scores for a particular branch and year

College ID  Branch ID  CAP Round  2009 Marks  2010 Marks  2011 Marks

1 1 1 144 149 126

1 1 2 142 146 0

1 1 3 148 136 101

2 4 1 150 147 128

2 4 2 147 149 120

2 4 3 142 0 0

Page 46: Table of Contents - Repositori | Universitas Udayana. Rajesh Kumar, National University of Singapore Deo Prakash, Shri Mata Vaishno Devi University Dr. Sabnam Sengupta, WBUT Rakesh

International Journal of Computer Applications (0975 – 8887)

Volume 52– No.11, August 2012

40

Decision rules have been tested using data from the years 2009-2011. MySQL is used to process the data.

Table 2. Mean scores of three years for a college for a particular CAP round

College ID  Branch ID  CAP Round  Mean

1 1 1 139

1 1 2 144

1 1 3 128

2 4 1 141

2 4 2 138

2 4 3 142

Table 3. Result shown to the user

College ID  Branch ID  CAP Round  Type

1 1 1 Best Bargain

1 1 3 Safe

2 4 1 Ambitious

2 4 2 Safe

College ID refers to a college and Branch ID refers to an engineering stream. Mumbai University offers specializations in Computer, Information Technology, Electronics, Mechanical and many more streams. CAP Rounds are the admission rounds undertaken by a college if there are seats available in a particular stream. 2009 Marks, 2010 Marks and 2011 Marks are the cut-off marks of that year for a particular college, branch and CAP round; 0 marks indicate that no seats were available for that CAP round in that year. If scores are available for all three years, the mean is calculated over three years; otherwise, if scores are available for only two years, the mean of those two years is calculated. The mean values calculated by the system are shown in Table 2. Let the offset value be equal to 2; if a higher offset value is used, the results may not be realistic. Let the user input be 140.

The system follows the algorithm as follows:
1) The system checks whether results are already available for the score of 140. Let us assume they are not.
2) The system calculates the values of High and Low as 142 and 138 on the basis of (2) and (3) respectively.
3) The system then searches through the database and performs the following updates:
   a) Entries with mean greater than the input (140) but less than or equal to High (142) are listed as Ambitious.
   b) Entries with mean less than the input (140) but greater than or equal to Low (138) are listed as Best Bargain.
   c) Entries with mean less than or equal to Low (138) are listed as Safe.
4) The results of steps a, b and c are stored in the database for future use.
5) If another search for marks of 140 is performed, the system shows the values calculated in the previous steps. This saves resources and avoids unnecessary calculation.

The results provided to the user are shown in Table 3. These results are the decisions made by the system, which it considers optimal for a user with CET marks of 140. A user can also specify the particular branch of engineering in which he/she is interested in getting decisions. This way, more user-oriented decisions can be generated.


8. CONCLUSION
The papers referred to helped in getting a better idea of the techniques to be applied in this proposed system, and of the way in which the system must work with minimum faults. They provided a clear idea of which techniques have which advantages and disadvantages when implemented. The advantage of the system proposed in this paper is that it uses past inputs, which enables the user to get a more realistic result compared to a system generating results based purely on assigned thresholds. The cut-off marks keep changing year after year due to varying difficulty levels in different sections in different years; thus, it is necessary for the system to be able to update itself. The system proposed in this paper keeps revising itself after each calculation so as to provide the user with the most accurate and up-to-date information regarding the colleges they are eligible for. As the system is put into use, more and more data will be collected each year. Thus, with more data, a stronger system can be made available to the students/users.
Choosing the right career path, coupled with the right institution, is extremely important for any student. With a large amount of data at hand, it is important that it is analyzed efficiently. Hence this work demonstrates how data mining technologies can be used to help make wise career decisions. The method can be further extended by using various probability-based prediction methods.
If the current system is kept in mind, then results from the different entrance exams such as state CETs, IIT-JEE, BIT-SAT and AIEEE can be taken into account, and a system can be developed to give proper results to students according to their scores in the different exams. However, different students may score differently in different tests, and thus the system must be robust enough to decide which is the best score among all and take it into consideration. Also, not all students appear for all the exams to gain admission to an engineering college. Keeping this in mind, options must be made available to students for selecting which entrance examination scores to enter.

9. ACKNOWLEDGMENTS
I would like to thank Prof. Shiwani Gupta for encouraging us to implement this project, Ushang Thakker for assisting me in designing the logic and Lavanya Singh for helping me in preparing this paper.

10. REFERENCES
[1] Elizabeth Murray, Using decision trees to understand student data, Proceedings of the 22nd International Conference on Machine Learning, Bonn, Germany, 2005, unpublished.

[2] Sudheep Elayidom, Dr. Sumam Mary Idikkula, Joseph

Alexander, Anurag Ojha, Applying data mining

techniques for placement chance prediction, 2009

International Conference on Advances in Computing,

Control, and Telecommunication Technologies,

Trivandrum, Kerala, 2009, published, pp 669-671.

[3] V. S. Janakiraman, K. Sarukesi, Decision Support

Systems, Chapter 6, PHI Learning Pvt. Ltd., New Delhi,

2004, p 26.


[4] D. J. Hand, Heikki Mannila, Padhraic Smyth, Principles

of data mining, Cambridge, MIT Press, 2001, p 1.

[5] Paulraj Ponniah, Data Warehousing Fundamentals, John

Wiley & Sons, 2001, p 402-403.

[6] Daniel J. Power, Decision Support Systems: Concepts

and Resources for Managers, Greenwood Publishing

Group, 2002, p 6-13.

[7] Keen, P. G. W. and M. S. Scott-Morton. Decision

Support Systems: An Organizational Perspective.

Reading, MA: Addison-Wesley, 1978.

[8] Frada Burstein, C. W. Holsapple, Handbook on Decision

Support, vol. 1, Springer-Verlag, Berlin, 2008, p 127.

[9] Daniel J. Power, Decision Support Basics, Business

Expert Press, 2009, p 41.

[10] Gregory E. Kersten, Zbigniew Mikolajuk, Anthony G. O.

Yeh, Decision support systems for sustainable

development, Kluwer Academic Publishers, Norwell,

2000, p42.

[11] Jiawei Han, Micheline Kamber, Data Mining Concepts and Techniques, Chapter 2, Morgan Kaufmann Publishers, San Francisco, 2006, p. 51

[12] Directorate of Technical Education [Online]. Available:

http://www.dte.org.in/fe2011/StaticPages/Default.aspx


A Genetic Algorithm based Fuzzy C Mean Clustering Model for Segmenting Microarray Images

Biju V G, Division of Electronics, School of Engineering, Cochin University of Science and Technology
Mythili P, Division of Electronics, School of Engineering, Cochin University of Science and Technology

ABSTRACT
A Genetic Algorithm based Fuzzy C Mean (GAFCM) technique for segmenting spots of complementary DNA (c-DNA) microarray images to find gene expression is proposed in this paper. To evaluate the performance of the algorithm, simulated microarray slides whose actual mean values were known were generated and used for testing. K-means, Fuzzy C Means (FCM) and the proposed GAFCM algorithm were applied to the simulated images to separate the foreground (FG) spot signal information from the background (BG), and the results were compared. The strength of the algorithm was tested by evaluating the segmentation matching factor, coefficient of determination, concordance correlation and gene expression values. The results show that the segmentation ability of GAFCM is better than that of the FCM and K-means algorithms.

Keywords
K-means, FCM, GAFCM, Genetic Algorithm, Segmentation, Gene expression

1. INTRODUCTION
c-DNA microarrays are one of the most fundamental and powerful tools in biotechnology and have been utilized in many biomedical applications such as cancer research, infectious disease diagnosis and treatment, toxicology research, pharmacology research and agricultural development. The enormous improvement of technology in the last decade provides the ability to simultaneously identify and quantify thousands of genes by their gene expression [1]. The spots on a microarray are segmented from the background to compute the red to green intensity ratio, which gives the gene expression. The three basic operations to compute the spot intensities are gridding, segmentation and intensity extraction. These operations are used to find the accurate location of the spot, separate the spot FG from the BG, and calculate the mean red and green intensity ratio.
In the last decade, several software packages and algorithms were developed for segmenting spots in microarray images. Fixed circle segmentation was the first algorithm, used in the ScanAlyze software [2], where all spots were considered circular with a predefined fixed radius. An adaptive circle segmentation technique was employed in the GenePix software [3], where the radius of each spot is not constant but adapts to each spot separately. The Dapple software estimated the radius of the spot using Laplacian-based edge detection [4]. An adaptive shape segmentation technique was used in the Spot software [5], and a histogram-based segmentation method in the ImaGene software [6]. Later, watershed [7] and seeded region growing [8] algorithms were employed. The disadvantage of the above-mentioned software packages and algorithms was that either the spots were assumed to be circular in shape or a priori knowledge of the precise position of the spot's centre was a prerequisite [9]. Segmentation algorithms based on the statistical Mann–Whitney test were also used [10]; these assess the statistically significant difference between the FG and BG. Lately, the K-means and FCM clustering algorithms have been used for spot segmentation [11][12].
The present work mainly focuses on the microarray spot segmentation ability of the proposed GAFCM algorithm compared with the FCM and K-means algorithms. Gridding is done automatically based on the horizontal and vertical intensity profiles, and the spots are addressed on the basis of this gridding information. The K-means, FCM and GAFCM algorithms were developed in MATLAB [13]. For the evaluation and testing of the algorithms, both simulated and real microarray images were used. The performance of the algorithms was tested by evaluating the segmentation matching factor (SMF), coefficient of determination (r2), concordance correlation (Pc) and spot gene expression values.

2. METHODS
The aim of microarray image processing is to extract each spotted DNA sequence as well as its background estimates and quality measures. This is achieved in three steps: gridding, segmentation and information extraction, as shown in Figure 1. In the gridding process the coordinates of each spot are determined. In the segmentation process the pixels are classified as BG or FG, and in the third step the intensities are extracted and the gene expressions are obtained. The results are useful for accurate microarray analysis, which involves data normalization, filtering and data mining. Clustering is the most common technique used for the segmentation of microarray images. The idea of the clustering application is to divide the pixels of the image into several clusters (usually two) and then to characterize these clusters as FG or BG. The K-means segmentation algorithm is based on the traditional K-means clustering technique [14]; it employs a square-error criterion, which is calculated for each of the two clusters. A brief idea of FCM


[15] is given in Section 3 and the proposed GAFCM is

described in detail in Section 4.

Figure 1 Block diagram of microarray image processing.
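As a point of reference for the comparison in Section 6, a two-cluster K-means pass over the spot intensities can be sketched as follows; the initialisation at the extreme intensities is an assumption, not the authors' choice.

import numpy as np

def kmeans_two_clusters(pixels, max_iter=100):
    """Two-cluster K-means on 1-D spot intensities using the square-error criterion."""
    x = np.asarray(pixels, dtype=float)
    centres = np.array([x.min(), x.max()])     # assumed initialisation at the extreme intensities
    for _ in range(max_iter):
        labels = np.argmin(np.abs(x[:, None] - centres[None, :]), axis=1)   # nearest-centre assignment
        new_centres = np.array([x[labels == j].mean() for j in (0, 1)])     # recompute cluster means
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres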

3. FUZZY C MEAN (FCM) ALGORITHM
Let x_i, i = 1, ..., N be the pixels of a single microarray spot, where N is the number of pixels in the spot. These pixels have to be clustered into two classes, BG and FG. Let c_j, j = 1, 2 be the cluster centres of the FG and BG pixels respectively. Each pixel has a membership degree u_ij for each cluster, and is assigned to a particular cluster based on the value of the membership degree function. The algorithm iteratively improves the membership degree function until there is no change in the cluster centres. The sum of the membership values of a pixel over all clusters must satisfy Equation 1:

\sum_{j=1}^{2} u_{ij} = 1, \qquad i = 1, \ldots, N    (1)

The Euclidean distance from a pixel to a cluster centre is given by

d_{ij} = \lVert x_i - c_j \rVert    (2)

The aim of this method is to minimize the absolute value of the difference between two consecutive objective functions F_t and F_{t+1}, given by Equations 3 and 4:

F_t = \sum_{i=1}^{N} \sum_{j=1}^{2} u_{ij}^{m} \, d_{ij}^{2}    (3)

\lvert F_{t+1} - F_t \rvert < \varepsilon    (4)

where m is the fuzziness parameter and ε is the error to be minimized. In each iteration, the updated membership u_{ij} and the cluster centres c_j are given by Equations 5 and 6:

u_{ij} = \frac{1}{\sum_{k=1}^{2} \left( d_{ij} / d_{ik} \right)^{2/(m-1)}}    (5)

c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} x_i}{\sum_{i=1}^{N} u_{ij}^{m}}    (6)
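The two-cluster FCM loop of Equations 1-6 can be illustrated with the following small NumPy sketch. It is not the authors' MATLAB implementation; the fuzziness parameter m = 2, the tolerance and the synthetic pixel values are assumptions.

import numpy as np

def fcm_two_clusters(pixels, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Cluster 1-D spot pixel intensities into BG/FG with fuzzy c-means (Eqs. 1-6)."""
    x = np.asarray(pixels, dtype=float)
    rng = np.random.default_rng(seed)
    u = rng.random((x.size, 2))
    u /= u.sum(axis=1, keepdims=True)                   # Eq. (1): memberships sum to 1
    f_prev = np.inf
    for _ in range(max_iter):
        c = (u**m).T @ x / (u**m).sum(axis=0)           # Eq. (6): cluster centres
        d = np.abs(x[:, None] - c[None, :]) + 1e-12     # Eq. (2): pixel-to-centre distances
        u = 1.0 / ((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1))).sum(axis=2)   # Eq. (5)
        f = ((u**m) * d**2).sum()                       # Eq. (3): objective function
        if abs(f_prev - f) < eps:                       # Eq. (4): stop when the change is small
            break
        f_prev = f
    return u.argmax(axis=1), c, u                       # hard BG/FG labels, centres, memberships

# Example on a synthetic spot: background near 40, foreground near 200 (assumed values).
rng = np.random.default_rng(1)
spot = np.concatenate([40 + 5 * rng.standard_normal(300), 200 + 10 * rng.standard_normal(80)])
labels, centres, _ = fcm_two_clusters(spot)
print(np.sort(centres))    # two intensity centres, roughly 40 and 200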

4. GENETIC ALGORITHM BASED FCM OPTIMIZATION (GAFCM)
GA is a powerful, stochastic, non-linear optimization tool based on the principles of natural selection and evolution [16][17][18][19][20]. To find the optimum fuzzy partition of a microarray spot signal, a new GA based fuzzy c mean clustering method has been proposed. Each chromosome in the GA population encodes a possible partition of the image, and the goodness of the chromosome is computed using a fitness function. Clustering using GAFCM is achieved using the following steps.

A. Population initialization
The chromosomes are made up of real numbers that represent the microarray spot BG and FG pixel intensity centres respectively. These values are randomly initialized over all possible intensity values in the search space under evaluation.

B. Fitness computation
The fitness of a chromosome is calculated in two steps. In the first step, the membership values of the image data points to the different clusters are computed using the FCM algorithm. In the second step, the fitness value is computed and used as the measure for evaluating the chromosome. The membership degree function u_ij is computed using the FCM algorithm explained in Section 3. Saha et al. have given a fitness function for the segmentation of satellite images [21][22]; this has been further modified for finding the cluster centres of c-DNA microarray spots and is given in Equation 7.

(7)

where

(8)

(9)

(10)

Ec is the same as in Equation 4: the difference between two successive objective function values in FCM, which is to be minimized. Dc is the maximum Euclidean distance between two cluster centres among all centres. E is the error matrix, and G_ij is a 2 x N reference matrix whose first row is the one-dimensional binary image corresponding to the simulated spot and whose second row is the complement of the first row. The objective is to maximize Fit so as to achieve proper clustering; to ensure this, the E and Ec values have to decrease and Dc has to increase.

(Figure 1 blocks: DNA microarray image → gridding → automatic spot cropping based on gridding → segmentation of spot from background → red and green channel intensity extraction → computation of gene expression.)


C. Selection, Crossover and Mutation
The roulette wheel selection method is applied to the population, where each chromosome receives a selection probability proportional to its fitness value. Crossover and mutation are the two genetic operators used to create new chromosomes. After repeating steps A, B and C for a fixed number of iterations, the best cluster centres are selected [23]. The flow chart for performing GAFCM is given in Figure 2; a simplified code sketch of the same loop is given after the figure.

Figure 2 Flow chart of GAFCM algorithm.
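The following is a simplified, illustrative sketch of the GAFCM loop following steps A-C and the Figure 2 flow chart; it is not the authors' MATLAB implementation. Since Equations 7-10 are not reproduced above, the fitness used here is only a stand-in that rewards a large centre separation (Dc) and a small change in the FCM objective (Ec), which is the qualitative behaviour described in step B; the population size, generation count and mutation scale are likewise assumptions.

import numpy as np

rng = np.random.default_rng(0)

def fcm_step(x, centres, m=2.0):
    """One FCM membership/centre update for 1-D pixel data and two given centres."""
    d = np.abs(x[:, None] - centres[None, :]) + 1e-12
    u = 1.0 / ((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1))).sum(axis=2)
    new_centres = (u**m).T @ x / (u**m).sum(axis=0)
    objective = ((u**m) * d**2).sum()
    return u, new_centres, objective

def fitness(x, centres):
    """Stand-in fitness: favour a large centre separation and a small objective change."""
    _, c1, f1 = fcm_step(x, centres)
    _, _, f2 = fcm_step(x, c1)
    ec = abs(f2 - f1)              # change of the FCM objective (plays the role of Ec)
    dc = abs(c1[0] - c1[1])        # separation of the two centres (plays the role of Dc)
    return dc / (1.0 + ec)

def gafcm(x, pop_size=20, generations=30, mutation_sigma=5.0):
    # A. Population initialization: random (BG, FG) intensity-centre pairs.
    pop = rng.uniform(x.min(), x.max(), size=(pop_size, 2))
    for _ in range(generations):
        # B. Fitness computation for every chromosome.
        fit = np.array([fitness(x, chrom) for chrom in pop])
        # C. Roulette wheel selection proportional to fitness.
        parents = pop[rng.choice(pop_size, size=pop_size, p=fit / fit.sum())]
        # Crossover (blend of two parents) and mutation (small Gaussian perturbation).
        partners = parents[rng.permutation(pop_size)]
        alpha = rng.random((pop_size, 1))
        pop = np.clip(alpha * parents + (1 - alpha) * partners
                      + rng.normal(0.0, mutation_sigma, (pop_size, 2)),
                      x.min(), x.max())
    best = pop[np.argmax([fitness(x, chrom) for chrom in pop])]
    u, centres, _ = fcm_step(x, best)       # final memberships for the best centres
    return u.argmax(axis=1), centres        # BG/FG labels and the two intensity centres

# Example on a synthetic spot: background near 35, foreground near 180 (assumed values).
spot = np.concatenate([rng.normal(35, 4, 400), rng.normal(180, 8, 90)])
labels, centres = gafcm(spot)
print(np.sort(centres))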

5. EVALUATION OF THE PROPOSED METHOD
To quantify the effectiveness of the proposed approach, simulated as well as real microarray images from the Stanford Microarray Database (SMD) were used. The spots were gridded and segmented using K-means, FCM and GAFCM independently for comparison purposes. Simulated microarray images were used for validation and comparison since their gene expressions are known. Spots were simulated with realistic characteristics to ensure that the result looks like a true c-DNA image consisting of more than 1000 spots. A real c-DNA image was used as a template, and its binary version was produced by employing a thresholding technique [24]. After converting it into a binary image, the spot areas were replaced by random values of mean intensities. In the simulated microarray image the mean intensity value of each spot was predefined, ranging between 0 and 255 for both the R and G channels [24]. BG intensities were replaced by a single intensity value.

The accuracy of any segmentation technique can be evaluated using three parameters: the segmentation matching factor SMF, the coefficient of determination r2 and the concordance correlation Pc. The SMF [25][26][27] for every binary spot produced by the clustering algorithm is given by

SMF = \frac{|A_{seg} \cap A_{act}|}{|A_{seg} \cup A_{act}|} \times 100\%    (11)

where Aseg is the area of the spot as determined by the proposed algorithm and Aact is the actual spot area. A perfect match is indicated by a 100% score; any score higher than 50% indicates reasonable segmentation, whereas a score below 50% indicates poor segmentation. The coefficient of determination r2 [24][28][29] indicates the strength of the linear association between the simulated and calculated spots, as well as the proportion of the variance of the calculated data:

r^2 = 1 - \frac{\sum_i (I_{seg,i} - I_{act,i})^2}{\sum_i (I_{act,i} - I_{mean})^2}    (12)

where Iseg and Iact are the mean intensity values of the calculated and simulated spots respectively and Imean is the overall mean spot intensity value of the simulated image. The algorithm whose r2 value is closer to 1 has the better performance. The concordance correlation Pc was calculated using the equation

P_c = \frac{2 S_{AB}}{S_A^2 + S_B^2 + (\bar{A} - \bar{B})^2}    (13)



where A and B are the two samples, \bar{A} and \bar{B} are their mean values, S_A and S_B are their standard deviations, and S_AB is their covariance. The higher the Pc value, the better the performance of the algorithm. Further, the proposed algorithm's performance has been tested in the presence of noise. This was done by corrupting the simulated spots with additive white Gaussian noise whose signal-to-noise ratio (SNR) ranges from 1 to 19 dB [30].
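For readers who want to reproduce the comparison, the sketch below shows one way to compute the three quality measures, to corrupt a spot with additive white Gaussian noise at a chosen SNR, and to form the gene expression log ratio discussed in Section 6. The SMF and Pc forms follow the reconstructed Equations 11 and 13 above, the noise model is a standard assumption, and a base-2 logarithm is assumed for the expression ratio.

import numpy as np

def smf(seg_mask, act_mask):
    """Segmentation matching factor (Eq. 11): overlap of segmented and actual spot areas, in %."""
    seg, act = np.asarray(seg_mask, bool), np.asarray(act_mask, bool)
    return 100.0 * np.logical_and(seg, act).sum() / np.logical_or(seg, act).sum()

def r_squared(i_seg, i_act):
    """Coefficient of determination (Eq. 12) between calculated and simulated spot means."""
    i_seg, i_act = np.asarray(i_seg, float), np.asarray(i_act, float)
    return 1.0 - ((i_seg - i_act) ** 2).sum() / ((i_act - i_act.mean()) ** 2).sum()

def concordance(a, b):
    """Concordance correlation Pc (Eq. 13) between two samples."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    s_ab = np.cov(a, b, bias=True)[0, 1]
    return 2.0 * s_ab / (a.var() + b.var() + (a.mean() - b.mean()) ** 2)

def add_awgn(signal, snr_db, rng=None):
    """Corrupt a signal with additive white Gaussian noise at the requested SNR in dB."""
    rng = rng or np.random.default_rng(0)
    signal = np.asarray(signal, float)
    noise_power = signal.var() / (10.0 ** (snr_db / 10.0))
    return signal + rng.normal(0.0, np.sqrt(noise_power), signal.shape)

def gene_expression(red_fg_mean, green_fg_mean):
    """Gene expression as the log ratio of mean red to mean green spot intensity (log2 assumed)."""
    return np.log2(red_fg_mean / green_fg_mean)

# Example: Pc between simulated and recovered spot means for three spots.
print(concordance([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))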

6. RESULTS AND DISCUSSION
The segmentation ability of K-means, FCM and the proposed GAFCM algorithm is compared by computing the SMF, r2 and Pc values explained in Section 5. The K-means, FCM and GAFCM algorithms were applied independently on these images for the classification of the BG and FG pixels. Several microarray images with different FG means were simulated and spots were randomly selected from these images. The SMF values for the three algorithms are shown in Figure 3 together with the original spots, the actual boundaries and the results obtained for the various methods. It is evident from the result that GAFCM shows an overall SMF of 98.56% compared to FCM with 97.19% and K-means with 68.78%. The average SMF, r2 and Pc values shown in Table 1 were obtained from the simulated microarray image shown in Figure 4 before corrupting it with noise.

Table 1 The SMF, r2 and Pc value for a simulated

microarray image before adding noise.

KM FCM GAFCM

SMF 82.304 98.3447 99.3357

r2 0.80188 0.968114 0.991427

Pc 0.77947 0.968089 0.991424

The segmentation ability of the proposed method in the presence of noise has also been studied. To do this, additive white Gaussian noise was gradually added to the simulated microarray images. The SMF, r2 and Pc values of the noisy images were computed using the K-means, FCM and GAFCM algorithms, with the SNR varied from 1 dB to 19 dB. Figure 5 shows the graph of SMF versus SNR for the three algorithms and Table 2 gives the corresponding numerical values. It can be seen from the graph that the SMF is considerably higher for FCM and GAFCM than for K-means. Although the GAFCM and FCM curves are close, GAFCM segmentation is better than FCM for both low- and high-noise images. The results show that, as the noise decreases, the overall SMF value ranges from 70.551% to 97.050%, from 69.645% to 96.807% and from 53.940% to 85.418% for GAFCM, FCM and K-means respectively. This reveals that GAFCM has the better SMF value.

The coefficient of determination (r2) for the simulated microarray images for K-means, FCM and GAFCM is shown in Table 3, and the graph of r2 versus SNR in dB is shown in Figure 6. The method whose r2 value is closer to 1 has the better performance; the r2 value of GAFCM is closer to 1 than those of FCM and K-means for low-noise images. As the SNR varies from 1 to 19 dB, r2 varies from 0.1296 to 0.7501 for GAFCM, from 0.1079 to 0.6935 for FCM and from 0.0036 to 0.2880 for K-means.

The concordance correlation (Pc) values obtained for K-means, FCM and GAFCM are shown in Table 4, and Figure 7 shows the graph of Pc versus SNR in dB. The higher the Pc value, the better the segmentation of that algorithm. From Table 4 it can be seen that, over the same SNR range, the Pc value varies from 0.0960 to 0.7477 for GAFCM, from 0.0796 to 0.6916 for FCM and from 0.0007 to 0.2878 for K-means. This clearly indicates that the proposed GAFCM has the better segmentation capability for the current application.

Figure 3 Comparison results for seven segmented spots

obtained from seven simulated images.


Figure 4 Simulated microarray image used to calculate the

gene expression.

Figure 5 SMF calculated for the simulated image corrupted with additive white Gaussian noise at different levels of SNR (dB) using the K-means, FCM and GAFCM algorithms.

Table 2 The comparison of K-means, FCM, GAFCM

algorithm based on segmentation matching factor

(SMF) for simulated microarray images with

different levels of additive white Gaussian noise

SNR(dB).

SNR(dB) KM FCM GAFCM

1 53.93972 69.64504 70.55050

3 58.52296 78.66445 79.11223

5 63.03961 84.53164 84.63773

7 67.87467 88.79217 89.11575

9 72.60327 92.44617 92.73175

11 77.90749 92.61146 93.02225

13 81.82369 94.17475 94.70089

15 84.01279 95.58631 96.18429

17 85.22194 96.1873 96.28328

19 85.41774 96.80675 97.05008

Table 3 The comparison of K-means, FCM,

GAFCM algorithm based on coefficient of

determination (r2) for simulated microarray images

with different levels of additive white Gaussian

noise SNR(dB).

SNR(dB) KM FCM GAFCM

1 0.003582 0.107935 0.129569

3 0.002433 0.070657 0.08278

5 0.009682 0.200522 0.217191

7 0.014513 0.380952 0.414809

9 0.034473 0.348032 0.382025

11 0.091063 0.310028 0.361558

13 0.211104 0.35561 0.454974

15 0.273211 0.613217 0.657108

17 0.301239 0.619506 0.728683

19 0.287993 0.693543 0.750119

Figure 6 r2 calculated for the simulated image corrupted with additive white Gaussian noise at different levels of SNR (dB) using the K-means, FCM and GAFCM algorithms.


Table 4 The comparison of K-means, FCM, GAFCM

algorithm based on concordance correlation (Pc) for

simulated microarray images with different levels of

additive white Gaussian noise SNR (dB).

SNR(dB) KM FCM GAFCM

1 0.0007 0.0796 0.0960

3 0.0003 0.0447 0.0497

5 0.0028 0.1813 0.1977

7 0.0052 0.3601 0.3923

9 0.0190 0.3429 0.3778

11 0.0762 0.2910 0.3412

13 0.2058 0.3551 0.4546

15 0.2730 0.6120 0.6536

17 0.3012 0.6173 0.7257

19 0.2878 0.6916 0.7477

Figure 7 Pc calculated for the simulated image corrupted with additive white Gaussian noise at different levels of SNR (dB) using the K-means, FCM and GAFCM algorithms.

The aim of microarray image processing is to find the gene expression value, which is the logarithm of the mean intensity ratio of the red and green channels in a spot. The closeness of the computed gene expression value to the actual value indicates the performance of the algorithm. To validate this, several microarray images were simulated and tested; Figure 4 shows one such simulated image and the corresponding result is shown in Table 5. The better the segmentation technique, the closer the gene expression value is to the actual value. Table 5 shows the gene expression values obtained for a simulated microarray image of 16 spots using the three segmentation methods, along with the actual gene expression values. It can be seen that the measured gene expression value is closest to the actual value for GAFCM compared to FCM and K-means. This shows that the GAFCM algorithm has better scope in the microarray image spot segmentation application.

Table 5 Comparison of gene expression values computed using K-means, FCM and GAFCM algorithms.

Spot No  KM  FCM  GAFCM  Actual

1 -0.01147 -0.06477 -0.04779 -0.04779

2 0.04617 -0.12034 -0.12034 -0.12034

3 0.03171 -0.09431 -0.09431 -0.09431

4 0.16624 0.08583 0.085828 0.091598

5 -0.12983 -0.19036 -0.17852 -0.17852

6 -0.00411 -0.11734 -0.11734 -0.10333

7 -0.05711 -0.1459 -0.13697 -0.13276

8 0.12509 -0.00511 -0.00511 -0.00386

9 -0.02495 -0.07131 -0.07716 -0.07716

10 -0.04111 -0.09078 -0.09078 -0.09078

11 -0.05853 -0.15023 -0.15023 -0.15023

12 0.06195 0.0167 0.016696 0.016696

13 -0.02509 -0.10586 -0.09059 -0.09059

14 0.03494 -0.04701 -0.04701 -0.04922

15 -0.11408 -0.2259 -0.2259 -0.2259

16 0.0467 -0.07544 -0.0705 -0.02818

7. CONCLUSION
Segmentation is an important part of microarray image processing. Microarray spot segmentation for estimating gene expression has been carried out using K-means, FCM and the proposed GAFCM. The proposed GAFCM algorithm is more efficient than FCM and K-means in clustering the signal FG and BG pixels. Errors during segmentation lead to inaccurate calculation of gene expression values in the intensity extraction step. All the above-mentioned algorithms do not perform well at high noise


levels; this can be rectified by using suitable filtering techniques. As future work, noise removal has to be addressed to obtain a smoother image, and an improved clustering algorithm is to be developed so that low signal intensity spots can be segmented more effectively.

8. REFERENCES
[1] Y. H. Yang, M. J. Buckley, S. Dudoit, and T. P. Speed (2002), “Comparison of methods for image analysis on c-DNA microarray data,” J. Comput. Graphical Statist., vol. 11, pp. 108–136.

[2] M.B. Eisen. (1999). ScanAlyze [Online]. Available: http://rana.lbl.gov/EisenSoftware.htm

[3] GenPix 4000, A User’s Guide (1999), Axon Instruments,

Inc., Foster City, CA.

[4] J. Buhler, T. Ideker, and D. Haynor, “Dapple: improved techniques for finding spots on DNA microarrays,” Technical Report UWTR 2000-08-05, UW CSE, Seattle, Washington, USA.

[5] M. J. Buckley. (2000). The spot user’s guide.

CSIRO Mathematical and Information Science [Online].

Available:

http://www.cmis.csiro.au/IAP/Spot/spotmanual.html.

[6] ImaGene, ImaGene 6.1 User Manual (2006). [Online]. Available: http://www.biodiscovery.com/index/papps-webfiles-action

[7] S. Beucher and F. Meyer (1993), “The morphological

approach to segmentation: The watershed

transformation,” Opt. Eng., vol. 34, pp. 433–481.

[8] R. Adams and L. Bischof (Jun. 1994), “Seeded region growing,” IEEE Trans. Pattern Anal. Mach. Intell., vol.

16, no. 6, pp. 641–647.

[9] D. Bozinov and J. Rahenfuhrer (2002.), “Unsupervised

technique for robust target separation and analysis of

DNA microarray spots through adaptive pixel

clustering,” J. Bioinform., vol. 18, pp. 747–756.

[10] Y. Chen, E. R. Dougherty, and M. L. Bittner (1997), “Ratio-based decisions and the quantitative analysis of c-DNA microarray images,” J. Biomed. Opt., vol. 2, pp. 264–374.

[11] S. Wu and H. Yan (2003), “Microarray Image Processing Based on Clustering and Morphological

Analysis”, Proc. Of First Asia-Pasific Bioinformatics

Conference, Adelaide, Australia, pp. 111-118.

[12] Volkan Uslan and İhsan Ömür Bucak (2010). Microarray image segmentation using clustering

methods. Mathematical and Computational

Applications, Vol. 15, No. 2, pp. 240-247, © Association

for Scientific Research

[13] The Math Works, Inc., Software, MATLABR (2010a).

Natick, MA.

[14] MacQueen, J. B. (1967). Some Methods for

classifications. In 5-th Berkeley Symposium on

Mathematical Statistics and Probability, 1, 281-297.

Berkeley:University of California Press

[15] J. C. Bezdek (1981), Pattern Recognition with Fuzzy

Objective Function Algorithms, Plenum Press, New

York.

[16] D. E. Goldberg (1989), Genetic Algorithms in Search,

Optimization & Machine Learning, Boston: Addison-

Wesley, Reading, ch. 1.

[17] L.Davis (Ed.)(1991), Handbook of Genetic Algorithms,

Van Nostrand Reinhold, New York.

[18] Z. Michalewicz (1992), Genetic Algorithms + Data Structures = Evolution Programs, Springer, New York.

[19] J.L.R. Filho, P.C. Treleaven, C. Alippi (1994), Genetic

algorithm programming environments, IEEE Comput.

27, 28-43.

[20] U. Maulik and S. Bandyopadhyay (2000),

“Genetic algorithm based clustering technique,” Pattern

Recog., vol. 33, pp. 1455–1465.

[21] Saha, S. and Bandyopadhyay, S., Accepted, (2007),

Fuzzy Symmetry Based Real-Coded Genetic Clustering

Technique for Automatic Pixel Classification in Remote

Sensing Imagery. Fundamenta Informaticae.

[22] S. Bandyopadhyay and S. Saha (2007), “GAPS: A clustering method using a new point symmetry based

distance measure,” Pattern Recog., vol. 40, pp. 3430–

3451.

[23] F. Herrera, M. Lozano, and J. L. Verdegay (Nov 1998),

“Tackling Real Coded Genetic Algorithms: Operators

and Tools for Behavioural Analysis,” Artificial

Intelligence Review, vol. 12, no. 4, pp. 265–319.

[24] O. Demirkaya, M. H. Asyali, and M.M. Shoukri (2005),

“Segmentation of c-DNA microarray spots using Markov

random field modeling,” Bioinformatics, vol. 21, no. 13,

pp. 2994–3000.

[25] D. Tran and M. Wagner (2002), “Fuzzy C-means

clustering-based speaker verification,” in Lecture Notes

in Computer Science: Advances in Soft Computing—

AFSS 2002, N. R. Pal and M. Sugeno, Eds. New York:

Springer-Verlag, pp. 318–324.

[26] D. Betal, N. Roberts, and G. H. Whitehouse (1997),

“Segmentation and numerical analysis of micro

calcifications on mammograms using mathematical

morphology,” Br. J. Radiol., vol. 70, no. 837, pp. 903–

917.

[27] E.I. Athanasiadis, D.A. Cavouras, P.P. Spyridonos,

D.Th.Glotsos, I.K. Kalatzis, G.C. Nikiforidis (July

2009), Complementary DNA microarray image

processing based on the Fuzzy Gaussian mixture

model, in: IEEE Transaction on Information

Technology in Biomedicine, vol. 13, issue 4.

[28] E.I. Athanasiadis, D.A. Cavouras, P.P. Spyridonos,

D.Th.Glotsos, I.K. Kalatzis, G.C. Nikiforidis (2011), A

Wavelet based markov random field segmentation model

in segmenting microarray experiments, in: Computer

methods and programs in biomedicine 104,307-315.

[29] A.Lehmussola, et al. (2006), Evaluating the performance

of microarray segmentation algorithms, Bioinformatics

22, 2910–2917.

[30] K. Blekas, N. Galatsanos, A. Likas, and I. E. Lagaris

(Jul. 2005.), “Mixture model analysis of DNA

microarray images,” IEEE Trans. Med. Imag., vol. 24,

no. 7, pp. 901–907.