
A Project Report on

EYE TRACKING INTERPRETATION SYSTEM

Submitted by

Name Seat No

GAVHALE NAVLESH B120283824

GAVHANE DEVENDRA B120283825

KURKUTE SIDDHESHWAR B120283843

A project report submitted in partial fulfillment of the Term-II project requirement of the

Bachelor of Electronics Engineering,

2015-16

Under the guidance of

Mrs. P. S. Kasliwal

Department of Electronics Engineering

MIT Academy of Engineering, Alandi (D),

Pune 412 105

Savitribai Phule Pune University.

2015-2016


CERTIFICATE

This is to certify that

Name Seat No

NAVLESH GAVHALE B120283824

DEVENDRA GAVHANE B120283825

SIDDHESHWAR KURKUTE B120283843

of

MIT Academy of Engineering, Alandi (D), Pune have submitted a project report on EYE TRACKING INTERPRETATION SYSTEM in partial fulfillment of Term II for the award of the degree of Bachelor of Electronics Engineering from Savitribai Phule Pune University, Pune, during the academic year 2015-16.

Project Guide                    Head of Dept

Mrs. P. S. Kasliwal              Dr. M. D. Goudar

External Examiner


Acknowledgement

We take this opportunity to thank certain people without whom this endeavor would not have been possible. We express our thanks to the Head of the Department of Electronics Engineering, Dr. M. D. Goudar. We would like to express our sincere gratitude to our guide, Mrs. P. S. Kasliwal, for her constant encouragement, help and guidance, without which this project would not have been completed.

We would also like to express our sincere gratitude to Mr. S. A. Khandekar, Mr. P. R. Ubare, Mr. G. R. Vyawhare and Mr. P. P. Kumbhar for their constant support and valuable advice throughout the progress of the project. Last but not least, we express our heartiest acknowledgement to our parents, friends and colleagues who directly or indirectly helped us in completing the project.


ABSTRACT

With a growing number of computing devices around us and the increasing time we spend interacting with them, we are interested in finding new methods of interaction that ease the use of computers or increase interaction efficiency. Eye tracking looks to be a promising technology for achieving this goal. The aim of this project is to create a low-cost Eye Tracking Interpretation System. Such a system can reach the masses, so that economically disadvantaged people can also benefit from it and gain a better quality of life. In our project, the camera is mounted on a cheap sunglass frame at a distance of 5-10 cm from the eye. We track and interpret eye movements for Human-Computer Interaction (HCI) using the infrared pupil-corneal reflection / pupil-centre method. The outcome of this project will shape future projects in which we try to integrate such a system into assistant systems for cars, language reading, music reading, human activity recognition, the perception of advertising, etc.


INDEX

1 INTRODUCTION
  1.1 Problem Statement
  1.2 Necessity of project

2 LITERATURE SURVEY
  2.1 Eye Tracking Techniques
    2.1.1 Videooculography (VOG)
    2.1.2 Video-based infrared (IR) pupil-corneal reflection (PCR)
    2.1.3 Electrooculography (EOG)

3 SYSTEM DESCRIPTION
  3.1 Related work component selection
    3.1.1 Selection of Single Board Computer
  3.2 IR Sensitive Camera
    3.2.1 The Basics
    3.2.2 Three Kinds of Infrared Lights
    3.2.3 How It Is Processed
  3.3 Raspberry Pi 2 Model B
    3.3.1 Specifications
    3.3.2 Connectors
  3.4 Costing

4 SOFTWARE DESCRIPTION
  4.1 The NOOBS installer
  4.2 Operating System
  4.3 Boot Process
  4.4 OpenCV
  4.5 Image Processing
    4.5.1 Image Thresholding
    4.5.2 Image Filtering
    4.5.3 Tracking Algorithms
  4.6 Virtual Network Computing

5 METHODOLOGY
  5.1 Block Diagram and Description
  5.2 Flowchart
  5.3 Implementation
    5.3.1 USB Camera interface with the Raspberry Pi
    5.3.2 Pupil detection
    5.3.3 Output of pupil focus
    5.3.4 Execution on Raspberry Pi
    5.3.5 Project as a standalone system

6 RESULT

7 APPLICATIONS

8 CONCLUSION AND FUTURE SCOPE

9 REFERENCES


List of Figures

1.1 Eye Anatomy [5]
2.1 VOG-based tracking
2.2 The relationship between the pupil center and the corneal reflection when the user fixates on different locations on the screen
2.3 Wearable EOG goggles
3.1 Comparison of Single Board Computers (SBCs)
3.2 IR Imaging Camera
3.3 Blocks on Raspberry Pi 2
3.4 Block Diagram of BCM2836
3.5 Block Diagram of ARM Cortex-A7
3.6 Pin-out of Raspberry Pi 2
3.7 RCA Video Connector
4.1 Raspbian
4.2 Boot process of Raspberry Pi
4.3 Otsu's Thresholding
4.4 Averaging filtering
4.5 Gaussian filtering
4.6 Median filtering
4.7 MeanShift
4.8 VNC Connections
5.1 Block Diagram [9]
5.2 Flowchart
5.3 Camera image without IR
5.4 Camera image with IR
5.5 Project as a standalone system
6.1 Samples of different subjects


List of Tables

3.1 Component Cost Table
6.1 Comparison of MSE of various filters
9.1 Project Schedule Plan


Chapter 1

INTRODUCTION

An eye has a lot of communicative power. Eye contact and gaze direction are central and very important factors in human communication. The eye has also been said to be a mirror to the soul or a window into the brain. Gaze behavior reflects cognitive processes that can give hints of our thinking and intentions. [4] We often look at things before acting on them. To see an object, we have to fixate our gaze on it long enough for the brain's visual system to perceive it. Fixations typically last between 200 and 600 ms. During any fixation, we see only a fairly narrow area of the visual scene with high acuity. To perceive the scene accurately, we need to scan it constantly with rapid movements of the eye, so-called saccades. Saccades are quick, ballistic jumps of 2 degrees or more that take about 30 to 120 ms each. In addition to saccadic movements, the eyes can follow a moving target; this is known as (smooth) pursuit movement. [1]

The fovea subtends an angle of about one degree from the eye; the diameter of this region corresponds to an area of about two degrees, which is about the size of a thumbnail viewed with the arm extended. Everything inside the fovea is perceived with high acuity, but acuity decreases rapidly towards the periphery of the eye. The reason can be seen by examining the retina (see Fig. 1.1). The lens focuses the light entering through the pupil onto the center of the retina. The fovea contains cones, photoreceptive cells that are sensitive to color and provide acuity. The peripheral area, in contrast, contains mostly rods, i.e. cells that are sensitive to light, shade and motion. For example, a sudden movement in the periphery can quickly attract the viewer's attention.

Figure 1.1: Eye Anatomy [5]

We only see a small fraction of the visual scene in front of us with high acuity at any point in time. That the user needs to move his/her eyes toward a target is the basic requirement for eye tracking: the gaze vector can be deduced by observing the line of sight.

Eye tracking is a technique in which an individual user's pupil (eye) movements are measured so that we know both where a user is looking at any given instant and the sequence in which his/her eyes shift from one location to another. Eye tracking is the process of tracking pupil (eye) movements or the absolute point of gaze (POG), i.e. the point in the visual scene on which the user's gaze is focused. Eye tracking is used in a number of application areas, from psychological research and medical diagnostics to usability studies and interactive gaze-controlled applications. We focus on the use of real-time data from human pupil (eye) movements. Just as speech and related technologies require accurate interpretation of user speech and related parameters, eye-movement data analysis requires accurate interpretation of user eye movements, i.e. mapping the observed movements to the user intentions that produced them. [12]


Eye movement recordings can provide an objective source of interface-evaluation data that can inform the design of improved interfaces. Eye movements can also be captured and used as control signals to enable people to interact with interfaces directly, without the need for mouse or keyboard input, which can be a major advantage for certain populations of users such as disabled individuals.

1.1 Problem Statement

Paralyzed or otherwise physically disabled people often cannot convey their messages to others; the aim is therefore to build a low-cost system that enables them to communicate with the outside world.

1.2 Necessity of project

The main objective of this project is to provide a useful system for paralyzed/physically disabled people that is easy to configure and handle.


Chapter 2

LITERATURE SURVEY

During the course of this project, many surveys were carried out regarding the various components and eye tracking techniques that might be implemented in our project, along with the various systems that already exist. Many websites were visited to get ideas about current working modules for controlling the parameters. Books on microcontrollers as well as on various eye behaviours were read to understand whether they would be able to sustain our requirements and whether any other options could be used.

Initially, eye movements were mainly studied by physiological inspection and observation. Basic eye movements were categorized and their durations estimated long before eye tracking technology enabled precise measurement of eye movements. The first generation of eye tracking devices was highly uncomfortable. A breakthrough in eye tracking technology was the development of the first contact-free eye tracking apparatus, based on photography and light reflected from the cornea. It can be considered the first invention in video-based, corneal-reflection eye tracking systems. The development of unobtrusive camera-based systems and the increase in computing power enabled the gathering of eye tracking data in real time, enabling the use of pupil movement as a control method for people with disabilities. [11]

As far as the camera is concerned, it is better to use an IR-sensitive camera with a resolution of not less than 640 x 480. As far as sensors are concerned, there are many IR sensors that are used in such systems to accurately track eye movements, for example the Sharp GP2Y0A02YK0F, Phidgets 1103, Dagu Compound and Fairchild QRB1134. However, all of them are too big to mount around the camera, so here we will use IR LEDs to do the job.

2.1 Eye Tracking Techniques

While a large number of different techniques have been deployed to track eye movements in the past, three of them have emerged as the predominant ones and are commonly used in research and commercial applications today. These techniques are

(1) Videooculography (VOG): video-based tracking using remote, visible-light video cameras.

(2) Video-based infrared (IR) pupil-corneal reflection (PCR).

(3) Electrooculography (EOG).

While the two video-based techniques share many common properties, each technique has its own range of application areas where it is mostly used. Plain video-based eye tracking relies on visible-light video cameras and can therefore be used for developing interfaces that do not require highly accurate POG tracking (accuracy of roughly 4 degrees). In contrast, video-based PCR provides highly accurate point-of-gaze measurements of up to 0.5 degrees of visual angle and is therefore the preferred technique in scientific domains, such as reading research or pupil (eye) movement based interaction, and in commercial applications, such as car safety research. Finally, EOG has been used for decades in ophthalmological studies, as it can measure movements of the eyes with high accuracy. In addition to having different application areas, each of these measurement techniques also has specific technical advantages and disadvantages.

2.1.1 Videooculography (VOG)

Videooculography is video-based eye tracking that can be used in either a remote or a head-mounted configuration. A typical setup consists of a video camera that records the movements of the eye(s) and a computer that saves and analyses the gaze data. In remote systems, the camera is typically placed below the computer screen (Fig. 2.1), while in head-mounted systems the camera is attached either to a frame of eyeglasses or to a separate helmet. Head-mounted systems often also include a scene camera for recording the user's point of view, which can then be used to map the user's gaze to the current visual scene.

The frame rate and resolution of the video camera have a significant effect on the accuracy of tracking; a low-cost web camera cannot compete with a high-end camera with high resolution and a high sample rate. The focal length of the lens, the viewing angle, and the distance between the eye and the camera affect the working distance and the quality of gaze tracking. With strong zoom (a large focal length), it is possible to get a close-up view of the eye, but this narrows the working angle of the camera and requires the user to sit fairly still (unless the camera follows the user's movements). In head-mounted systems, the camera is placed near the eye, which means a bigger image of the eye and thus more pixels for tracking it. If a wide-angle camera is used, it allows the user more freedom of movement but also requires a high-resolution sensor to maintain enough accuracy for tracking the pupil.

Since tracking of an eye is based on video images, it requires a clear view of the eye. Many issues may affect the quality of eye tracking, such as rapidly changing light conditions, reflections from eyeglasses, drooping eyelids, squinting of the eyes while smiling, or even heavy makeup. The video images are the basis for estimating the gaze position on the computer screen: the location of the eye(s) and the center of the pupil are detected, and changes in that position are tracked, analyzed and mapped to gaze coordinates (a minimal pupil-detection sketch is given after Fig. 2.1). Detailed surveys of video-based eye tracking cover pupil detection and gaze estimation techniques in depth. If only one reference point is used and no other reference points are available, the user has to stay still for an accurate calculation of the gaze vector (the line of sight from the user's eye to the point of view on the screen). Forcing the user to sit still may be uncomfortable, so various methods for tracking and compensating for head movement have been implemented. Head tracking methods are also required for head-mounted systems if one wishes to calculate the point of gaze in relation to the user's eye and the environment.


Figure 2.1: VOG based tracking
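As a rough illustration of the pupil-detection step described above, the following minimal Python/OpenCV sketch thresholds the dark pupil region of a single grayscale eye image and takes the centroid of the largest contour. It is not the report's actual implementation; the file name and threshold value are assumptions.

# Minimal sketch (illustrative only): estimate the pupil center in one grayscale
# eye image by thresholding the dark pupil region and taking the centroid of the
# largest contour. File name and threshold value are assumptions.
import cv2

gray = cv2.imread("eye_frame.png", cv2.IMREAD_GRAYSCALE)          # assumes the file exists
blurred = cv2.medianBlur(gray, 5)                                  # suppress sensor noise
_, mask = cv2.threshold(blurred, 40, 255, cv2.THRESH_BINARY_INV)   # dark pupil -> white in mask
# [-2] keeps this working across OpenCV versions that return 2 or 3 values
contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
if contours:
    pupil = max(contours, key=cv2.contourArea)                     # assume largest blob is the pupil
    m = cv2.moments(pupil)
    if m["m00"] > 0:
        cx, cy = int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])
        print("estimated pupil center:", (cx, cy))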

2.1.2 Video-based infrared (IR) pupil-corneal reflection (PCR)

Systems based only on visible light and pupil-center tracking tend to be inaccurate and sensitive to head movement. To address this problem, a reference point, a so-called corneal reflection or glint, can be added. Such a reference point is created by an artificial infrared (IR) light source aimed on- or off-axis at the eye. An on-axis light source results in a bright-pupil effect, making it easier for the analysis software to recognize the pupil in the image; the effect is similar to the red-eye effect caused by a flash in a photograph. Off-axis light results in dark-pupil images. Both help keep the eye area well lit, yet they neither disturb viewing nor affect pupil dilation, since IR light is invisible to the human eye.

By measuring the corneal reflection(s) from the IR source relative to the center of the pupil, the system can compensate for inaccuracies and also allow a limited degree of head movement. Gaze direction is then calculated from the changing relationship between the moving pupil center and the corneal reflection (see Fig. 2.2). Because the position of the corneal reflection remains roughly constant during eye movement, the reflection stays static during rotation of the eye and changes in gaze direction, thus giving a basic eye and head position reference. In addition, it provides a simple reference point to compare with the moving pupil, and thus enables calculation of the gaze vector (a minimal mapping sketch is given after Fig. 2.2).

Figure 2.2: The relationship between the pupil center and the corneal reflection when the user fixates on different locations on the screen
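The following minimal Python/NumPy sketch illustrates the pupil-corneal-reflection idea described above: the difference vector between the pupil center and the glint is mapped to screen coordinates with a linear fit obtained from a few calibration fixations. The calibration data, point names and the linear (rather than polynomial) mapping are assumptions for illustration, not the report's actual method.

# Minimal sketch (illustrative only): map the pupil-glint difference vector to
# screen coordinates using a linear least-squares fit from calibration samples.
import numpy as np

def fit_gaze_mapping(pcr_vectors, screen_points):
    # Fit screen = [dx, dy, 1] @ A from N calibration samples.
    X = np.hstack([pcr_vectors, np.ones((len(pcr_vectors), 1))])   # N x 3
    A, _, _, _ = np.linalg.lstsq(X, screen_points, rcond=None)     # 3 x 2
    return A

def estimate_gaze(pupil, glint, A):
    # Estimate the on-screen point of gaze from one pupil/glint pair.
    dx, dy = pupil[0] - glint[0], pupil[1] - glint[1]
    return np.array([dx, dy, 1.0]) @ A

# Example: four calibration fixations (PCR difference vector -> known screen target).
pcr = np.array([[-12, -8], [11, -7], [-13, 9], [12, 10]], dtype=float)
targets = np.array([[100, 100], [540, 100], [100, 380], [540, 380]], dtype=float)
A = fit_gaze_mapping(pcr, targets)
print(estimate_gaze(pupil=(320, 240), glint=(326, 244), A=A))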

While IR illumination enables fairly accurate remote tracking of the user, it does not work well in changing ambient light, such as in outdoor settings. There is ongoing research that tries to solve this issue. In addition, according to our personal experience, there seems to be a small number of people for whom robust and accurate eye tracking does not work even in laboratory settings. Electrooculography is neither dependent on nor disturbed by lighting conditions and can thus replace VOG-based tracking in some of these situations and for some applications.


2.1.3 Electrooculography (EOG)

The human eye can be modeled as a dipole with its positive pole at the cornea and its negative pole at the retina. Assuming a stable cornea-retinal potential difference, the eye is the origin of a steady electric potential field. The electrical signal that can be measured from this field is called the electrooculogram (EOG). The signal is measured between two pairs of surface electrodes placed in periorbital positions around the eye (see Fig. 2.3) with respect to a reference electrode (typically placed on the forehead). If the eyes move from the center position towards one of these electrodes, the retina approaches that electrode while the cornea approaches the opposing one. This change in dipole orientation causes a change in the electric potential field, which in turn can be measured to track eye movements. In contrast to video-based eye tracking, the recorded eye movements are typically split into one horizontal and one vertical EOG signal component; this split reflects the discretisation given by the electrode setup.

One drawback of EOG compared to video-based tracking is that EOG requires electrodes to be attached to the skin around the eyes. In addition, EOG provides lower spatial POG tracking accuracy and is therefore better suited for tracking relative eye movements. EOG signals are subject to noise and artifacts and are prone to drift, particularly when recorded in mobile settings. EOG signals, like other physiological signals, may be corrupted with noise from the residential power line, the measurement circuitry, the electrodes, or other interfering physiological sources.

One advantage of EOG compared to video-based eye tracking is that changing lighting conditions have little impact on EOG signals, a property that is particularly useful for mobile recordings in daily-life settings. As light falling on the eyes is not required for the electric potential field to be established, EOG can also be measured in total darkness or when the eyes are closed. It is for this reason that EOG is a well known measurement technique for recording eye movements during sleep, e.g. to identify REM phases or to diagnose sleep disorders. The second major advantage of EOG is that the signal processing is computationally lightweight and, in particular, does not require any complex video and image processing. Consequently, while EOG has traditionally been used in stationary settings, it can also be implemented as a low-power, fully embedded on-body system for mobile recordings in daily life. While state-of-the-art video-based eye trackers require additional equipment for data recording and storage, such as laptops, and are limited to recording times of a few hours, EOG allows long-term recordings that capture people's everyday life. [11]

Figure 2.3: Wearable EOG goggles
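To illustrate how lightweight EOG signal processing can be, the following minimal Python sketch detects saccades in a single EOG channel with a simple velocity threshold. The sampling rate, threshold and synthetic signal are assumptions for illustration only, not part of the report's system.

# Minimal sketch (illustrative only): velocity-threshold saccade detection on one
# EOG channel. Sampling rate, threshold and test signal are assumptions.
import numpy as np

def detect_saccades(eog, fs=250.0, threshold=50.0):
    # Return sample indices where the EOG velocity (signal units per second)
    # exceeds the threshold, i.e. candidate saccade samples.
    velocity = np.gradient(eog) * fs
    return np.flatnonzero(np.abs(velocity) > threshold)

# Synthetic horizontal channel: flat baseline with one step-like saccade at t = 1 s.
t = np.arange(0, 2, 1 / 250.0)
eog_h = np.where(t > 1.0, 100.0, 0.0) + np.random.normal(0, 1.0, t.size)
print(detect_saccades(eog_h)[:5])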


Chapter 3

SYSTEM DESCRIPTION

3.1 Related work component selection

1. Camera: IR sensitive, with a minimum resolution of 640 x 480.

2. Sunglasses: cheap and strong enough to sustain the weight of the camera.

3. Input to the system: reflections from the human eye (IR source: 4 IR LEDs).

4. Power requirement: 5 V for the single board computer.

3.1.1 Selection of Single Board Computer

Selecting the proper microcontroller unit is one of the critical decisions that is most likely to be responsible for the success or failure of the project. There are numerous criteria which should be considered while choosing a microcontroller. The first question to ask is: "What does the microcontroller need to do in your project?" Other factors that should be considered include features, pricing, availability, development tools, manufacturer support, stability, and sole-sourcing. While selecting the controller for our project, we compared various Single Board Computers (SBCs), as shown in the figure below:

Figure 3.1: Comparison of Single Board Computers(SBC’s)

3.2 IR Sensitive Camera

3.2.1 The Basics

Infrared imaging cameras allow you to do more than just see in the dark: they work regardless of the lighting conditions, giving you a sharp set of eyes even in absolutely zero light. An infrared imaging camera basically detects light. The quantity of energy carried by a light wave is correlated with its wavelength; the longer the wavelength, the lower the energy. This kind of low-energy radiation can only be detected by an infrared thermal camera. Besides, there is visible light, better known to us as colors. Among all the colors, violet has the most energy and red has the least. Because of this, the terms ultraviolet and infrared came to be. All light that is below red, i.e. carries less energy than red, is considered infrared; it is not visible to the human eye but is visible through imaging cameras. [6]

Infrared rays can also travel through fog, dust and smoke, no matter how thick, and can even travel through some materials. Although infrared imaging cameras are often referred to as night-vision cameras, do not confuse them with day-and-night cameras. A day-and-night camera has a very sensitive imaging chip that lets the camera capture a viewable picture even in low-light conditions. It does not, however, use infrared technology. A day-and-night camera is a good option for areas that have a constant source of light, such as a street light or security light, but it will not work if that light is switched off (either accidentally or deliberately). When light is available, infrared cameras give a color picture. As it gets darker, the camera automatically switches to infrared mode, in which it records in black and white.

Figure 3.2: IR Imaging Camera


3.2.2 Three Kinds of Infrared Lights

Infrared light can be categorized into three groups: near infrared, mid infrared and thermal infrared. Near infrared is the nearest to visible light and has wavelengths between about 0.7 and 1.3 microns. Mid infrared has wavelengths between about 1.3 and 3 microns. Both near infrared and mid infrared are used in common electronic devices such as remote controls, cellular phones and electronic switches. Thermal infrared occupies the largest part of this region of the spectrum, with wavelengths between about 3 and 30 microns.

3.2.3 How It Is Processed

An infrared imaging camera uses a special lens that is focused on the infrared light emitted by all the objects in its view. The emitted infrared light is processed by an array of infrared detectors, which creates a temperature pattern known as a thermogram. The entire process, from obtaining the infrared light to producing a thermogram, takes approximately one-thirtieth of a second. The thermogram is then converted into electrical impulses, which are sent to a signal-processing unit, a circuit board dedicated to translating the signal into data for the display. Once the data has been sent to the display, the viewer sees various colors whose intensity depends on the intensity of the infrared emission. Through the different combinations of the impulses produced by different objects, an infrared thermal image is created. Infrared cameras are best utilized when people want to see things that cannot be seen by the naked eye; they are also used by the military because they allow soldiers to see in the dark.

One way to check how good an IR imaging camera is, is to look at the camera's lux rating. Lux refers to the amount of light required to give a good picture; obviously, the lower the lux, the less light the camera needs. A true IR camera will have 0.0 lux in infrared mode, which means it can see in complete, utter, total darkness with no light at all. You can also compare IR cameras by how far they can see in complete darkness: some long-range cameras can see up to 150 feet in total darkness. Depending on your requirement, you can select a short-range or long-range camera that will keep you covered. [4]

3.3 Raspberry Pi 2 Model B

Figure 3.3: Blocks on Raspberry Pi 2

The Raspberry Pi 2 delivers six times the processing capacity of previous models. This second-generation Raspberry Pi has an upgraded Broadcom BCM2836 processor, a powerful quad-core ARM Cortex-A7 based processor that runs at 900 MHz. The board also features an increase in memory capacity to 1 GB.

3.3.1 Specifications

Chip - Broadcom BCM2836 SoC

Core architecture - Quad-core ARM Cortex-A7

CPU clock - 900 MHz

GPU - Dual-core VideoCore IV multimedia co-processor; provides OpenGL ES 2.0, hardware-accelerated OpenVG, and 1080p30 H.264 high-profile decode; capable of 1 Gpixel/s, 1.5 Gtexel/s or 24 GFLOPS with texture filtering and DMA infrastructure

Memory - 1 GB LPDDR2

Operating System - boots from a micro SD card running a version of the Linux operating system

Dimensions - 85 x 56 x 17 mm

Power - micro USB socket, 5 V, 2 A

3.3.2 Connectors

Ethernet - 10/100 BaseT Ethernet socket

Video Output - HDMI (rev 1.3 and 1.4)

Audio Output - 3.5 mm jack, HDMI

USB - 4 x USB 2.0 connectors

GPIO Connector - 40-pin 2.54 mm (100 mil) expansion header (2x20 strip) providing 27 GPIO pins as well as +3.3 V, +5 V and GND supply lines

Camera Connector - 15-pin MIPI Camera Serial Interface (CSI-2)

JTAG - not populated

Display Connector - Display Serial Interface (DSI): 15-way flat flex cable connector with two data lanes and one clock lane

Memory Card Slot - Micro SDIO

1. Processor

Broadcom BCM2836:

This second-generation Raspberry Pi has an upgraded Broadcom BCM2836 processor, a powerful ARM Cortex-A7 based quad-core processor that runs at 900 MHz. The processor has four ARM cores, compared with the earlier-generation BCM2835 used in the original Raspberry Pi, which has a single ARM core. The board has twice the amount of memory (RAM) and the same pocket-sized form factor.


Figure 3.4: Block Diagram of BCM2836

ARM Cortex-A7:

The Cortex-A7 MPCore processor is a high-performance, low-power processor that implements the ARMv7-A architecture. A Cortex-A7 MPCore device has one to four processors in a single multiprocessor cluster. It provides up to 20 percent more single-thread performance than the Cortex-A5. The Cortex-A7 processor builds on an energy-efficient 8-stage pipeline. Performance- and power-optimized L1 caches combine minimal access latency with techniques to maximize performance and minimize power consumption. It also benefits from an integrated L2 cache designed for low power, with lower transaction latencies and improved OS support for cache maintenance. Its NEON technology can accelerate multimedia and signal processing algorithms such as video encode/decode, 2D/3D graphics, gaming, audio and speech processing, image processing, telephony, and sound synthesis. The figure below shows the block diagram of the ARM Cortex-A7:


Figure 3.5: Block Diagram of ARM Cortex-A7

2. Power source

The device is powered by a 5 V micro USB supply. Exactly how much current (mA) the Raspberry Pi requires depends on what is connected to it. Typically, the Model B uses between 700 and 1000 mA depending on which peripherals are connected. The maximum power the Raspberry Pi can use is 1 A; if you need to connect a USB device that pushes the power requirements above 1 A, you must connect it to an externally powered USB hub.

The power requirements of the Raspberry Pi increase as you make use of its various interfaces. The GPIO pins can draw 50 mA safely, distributed across all the pins; an individual GPIO pin can only safely draw 16 mA. The HDMI port uses 50 mA, the camera module requires 250 mA, and keyboards and mice can take as little as 100 mA or over 1000 mA. With the Pi 2 Model B running idle (no HDMI, graphics, Ethernet or WiFi, just a console cable) the board draws about 200 mA; WiFi adds another 170 mA, and Ethernet instead adds about 40 mA.

BACKPOWERING:

Backpowering occurs when a USB hub does not provide a diode to stop the hub from powering against the host computer. Some hubs will provide as much power as you want out of each port, and some hubs may backfeed the Raspberry Pi. This means that the hub powers the Raspberry Pi through its USB input cable, without the need for a separate micro-USB power cable, bypassing the voltage protection. If you are using a hub that backfeeds the Raspberry Pi and the hub experiences a power surge, your Raspberry Pi could potentially be damaged.

3. SD Card

The Raspberry Pi does not have any onboard storage. The operating system is loaded onto an SD card which is inserted into the SD card slot on the Raspberry Pi. The operating system can be loaded onto the card using a card reader on any computer. On the Raspberry Pi, data is written and read much more frequently, and from differing locations on the card. Note that an SD card with a high write-speed rating is not always required.

4. GPIO

GPIO: General Purpose Input Output

General-purpose input/output (GPIO) is a generic pin on an integrated circuit whose behaviour, including whether it is an input or output pin, can be controlled by the user at run time. GPIO peripherals vary widely. In some cases, they are a simple group of pins that can be switched as a group to either input or output. In others, each pin can be set up to accept or source different logic voltages, with configurable drive strengths and pull-ups/pull-downs. Input and output voltages are typically, though not always, limited to the supply voltage of the device with the GPIOs, and the pins may be damaged by greater voltages. GPIO pins have no special purpose defined and go unused by default. The idea is that a system designer building a full system around the chip might find it useful to have a handful of additional digital control lines, and having these available from the chip can save the hassle of arranging additional circuitry to provide them. GPIO capabilities may include:

1. GPIO pins can be configured to be input or output.

2. GPIO pins can be enabled/disabled.

3. Input values are readable (typically high = 1, low = 0).

4. Output values are writable/readable.

5. Input values can often be used as IRQs (typically for wake-up events).

The Raspberry Pi 2 Model B board has a 40-pin 2.54 mm expansion header arranged as a 2x20 strip, providing 27 GPIO pins as well as +3.3 V, +5 V and GND supply lines. A minimal usage sketch is given after the pin-out figure below.

Figure 3.6: Pin out of Raspberry Pi 2
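As a rough illustration of run-time GPIO control from software, the following minimal Python sketch toggles one pin of the header shown in Fig. 3.6 using the RPi.GPIO library. The pin number and timing are illustrative assumptions, not part of the report's design.

# Minimal sketch (illustrative only): blink one GPIO pin with RPi.GPIO.
import time
import RPi.GPIO as GPIO

GPIO.setmode(GPIO.BCM)          # use Broadcom (BCM) pin numbering
GPIO.setup(17, GPIO.OUT)        # configure GPIO17 as an output (assumed free pin)

try:
    for _ in range(5):
        GPIO.output(17, GPIO.HIGH)   # drive the pin high (e.g. LED on)
        time.sleep(0.5)
        GPIO.output(17, GPIO.LOW)    # drive the pin low (LED off)
        time.sleep(0.5)
finally:
    GPIO.cleanup()              # release the pin on exit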


5. DSI Connector

The Display Serial Interface (DSI) is a specification by the Mobile Industry Processor Interface (MIPI) Alliance aimed at reducing the cost of display controllers in mobile devices. It is commonly targeted at LCD and similar display technologies. It defines a serial bus and a communication protocol between the host (the source of the image data) and the device (the destination of the image data). A DSI-compatible LCD screen can be connected through the DSI connector.

6. RCA Video

RCA video output (PAL and NTSC) is available on all models of the Raspberry Pi. Any television or screen with an RCA jack can be connected to the RPi.

Figure 3.7: RCA Video Connector

7. USB 2.0 Port

The Raspberry Pi 2 Model B is equipped with four USB 2.0 ports. These are connected to the LAN9514 combo hub/Ethernet chip IC3, which is itself a USB device connected to the single upstream USB port on the BCM2836. The USB ports enable the attachment of peripherals such as keyboards, mice and webcams that provide the Pi with additional functionality. The USB host port inside the Pi is an On-The-Go (OTG) host, because the application processor powering the Pi, the BCM2836, was originally intended for the mobile market, i.e. as the single USB port on a phone for connection to a PC or to a single device. In essence, the OTG hardware is simpler than the equivalent hardware on a PC. OTG in general supports communication with all types of USB devices, but to provide an adequate level of functionality for most of the USB devices that one might plug into a Pi, the system software has to do more work. The USB specification defines three device speeds: Low, Full and High. Most mice and keyboards are Low-speed, most USB sound devices are Full-speed, and most video devices (webcams or video capture) are High-speed.

General Limitations:

The OTG hardware on the Raspberry Pi has a simpler level of support for certain devices, which may present a higher software processing overhead. The Raspberry Pi also has only one root USB port: all traffic from all connected devices is funnelled down this bus, which operates at a maximum speed of 480 Mbps. The software overhead incurred when talking to Low- and Full-speed devices means that there are soft limits on the number of simultaneously active Low- and Full-speed devices; small numbers of these types of devices connected to a Pi will cause no issues.

8. Ethernet

An Ethernet port is available on the Model B. It can be connected to a network or to the internet using a standard LAN cable plugged into the Ethernet port. The Ethernet port is controlled by a Microchip LAN9514 LAN controller chip. Experience shows that under ideal circumstances close to 99 Mbps of useful data can be transferred on a 100 Mbps Ethernet segment. Throughput higher than 100 Mbps can be obtained by adding a gigabit USB 3.0 Ethernet adapter.

9. HDMI

HDMI: High Definition Multimedia Interface

An HDMI (rev 1.3 and 1.4) type A port is provided on the RPi to connect to HDMI screens. The HDMI port on the Raspberry Pi supports the Consumer Electronics Control (CEC) standard.


3.4 Costing

Sr. No.   Component                    Cost
1)        Raspberry Pi 2 Model B       2500
2)        iBall ROBO K20 Camera        1020
3)        Sunglass Frame                200
4)        4 IR LEDs (Transmitter)       100

Table 3.1: Component Cost Table


Chapter 4

SOFTWARE DESCRIPTION

4.1 The NOOBS installer

The Raspberry Pi package comes with only the main board and nothing else; it does not ship with an operating system. Operating systems are loaded onto an SD card from a computer, and the SD card is then inserted into the Pi, which uses it as the primary boot device.

Installing an operating system can be easy for enthusiasts, but for beginners, working with image files of operating systems can be difficult. So the Raspberry Pi Foundation created a piece of software called NOOBS (New Out Of Box Software), which eases the process of installing an operating system on the Pi.

The NOOBS installer can be downloaded from the official website. A user only needs to connect an SD card to the computer and run the setup file to install NOOBS on the SD card, then insert the card into the Raspberry Pi. On booting the first time, the NOOBS interface is loaded and the user can select from a list of operating systems to install. It is much more convenient to install the operating system this way. Also, once the operating system has been installed on the card with the NOOBS installer, a recovery mode provided by NOOBS can be accessed on any boot by holding the Shift key. NOOBS also allows editing of the config.txt file for the installed operating system.


4.2 Operating System

The Raspberry Pi primarily uses Linux kernel-based operating systems. The ARM11 used in the original Raspberry Pi is based on version 6 of the ARM architecture, which is no longer supported by several popular versions of Linux, including Ubuntu. The install manager for the Raspberry Pi is NOOBS. The operating systems included with NOOBS are:

1. Arch Linux ARM

2. OpenELEC

3. Pidora (Fedora Remix)

4. Raspbmc and the XBMC open source digital media center

5. RISC OS - the operating system of the first ARM-based computer

6. Raspbian - maintained independently of the Foundation; based on the ARM hard-float (armhf) Debian 7 'Wheezy' architecture port, which was designed for newer ARMv7 processors whose binaries would not work on the Raspberry Pi. Raspbian is instead compiled for the ARMv6 instruction set of the Raspberry Pi, making it work, though with slower performance. It provides many deb software packages: pre-compiled software bundles. A minimum 2 GB SD card is required, but a 4 GB SD card or above is recommended. There is a Pi Store for exchanging programs. The 'Raspbian Server Edition (RSEv2.4)' is a stripped-down version bundled with a different set of software packages compared with the usual desktop-oriented Raspbian.

About Raspbian

Figure 4.1: Raspbian

Raspbian is an unofficial port of Debian Wheezy armhf with compilation settings adjusted to produce optimized "hard float" code that will run on the Raspberry Pi. This provides significantly faster performance for applications that make heavy use of floating-point arithmetic operations. All other applications also gain some performance through the use of the advanced instructions of the ARMv6 CPU in the Raspberry Pi. Although Raspbian is primarily the effort of Mike Thompson and Peter Green, it has also benefited greatly from the enthusiastic support of Raspberry Pi community members who wish to get the maximum performance from their device.

Raspbian is a free operating system based on Debian, optimized for the Raspberry Pi hardware. An operating system is the set of basic programs and utilities that make your Raspberry Pi run. However, Raspbian provides more than a pure OS: it comes with over 35,000 packages, pre-compiled software bundled in a convenient format for easy installation on the Raspberry Pi. The initial build of over 35,000 Raspbian packages, optimized for best performance on the Raspberry Pi, was completed in June 2012. However, Raspbian is still under active development, with an emphasis on improving the stability and performance of as many Debian packages as possible. Note: Raspbian is not affiliated with the Raspberry Pi Foundation. Raspbian was created by a small, dedicated team of developers who are fans of the Raspberry Pi hardware, the educational goals of the Raspberry Pi Foundation and, of course, the Debian Project.

What is Debian?

Debian is a free operating system for your computer and includes the basic set of programs and utilities that make your computer run, along with many thousands of other packages. Debian has a reputation within the Linux community for being very high-quality, stable and scalable. Debian also has an extensive and friendly user community that can help new users with support for practically any problem. This makes Debian an ideal operating system for the Raspberry Pi, which will be used by children and by many others using Linux for the first time.

What is Raspbian?

Raspbian is an unofficial port of Debian Wheezy armhf with compilation settings adjusted to produce code that uses "hardware floating point", the "hard float" ABI, and will run on the Raspberry Pi. The port is necessary because the official Debian Wheezy armhf release is compatible only with versions of the ARM architecture later than the one used on the Raspberry Pi (ARMv7-A CPUs and higher, versus the Raspberry Pi's ARMv6 CPU). The Debian Squeeze image issued by the Raspberry Pi Foundation was based on Debian armel, which uses software floating point and the "soft float" ABI; the Foundation used the existing Debian port for less capable ARM devices. It therefore made no use of the Pi's floating-point hardware (reducing the Pi's performance in floating-point-intensive applications) or of the advanced instructions of the ARMv6 CPU.

4.3 Boot Process

The Raspberry Pi does not boot like a traditional computer: the VideoCore, i.e. the graphics processor, actually boots before the ARM CPU.

The boot process of the Raspberry Pi can be explained as follows:

1. When the power is turned on, the first code to run is stored in a ROM chip in the SoC and is built into the Pi during manufacture. This is called the first-stage bootloader.

2. The SoC is hardwired to run this code on startup on a small RISC (Reduced Instruction Set Computer) core. It is used to mount the FAT32 boot partition on the SD card so that the second-stage bootloader can be accessed. This second-stage bootloader stored on the SD card is bootcode.bin; the file can be seen when an operating system image is mounted on the SD card in Windows.

3. Now here is something tricky: the first-stage bootloader has not yet initialized the ARM CPU (the CPU is held in reset) or the RAM, so the second-stage bootloader also has to run on the GPU. The bootcode.bin file is loaded into the 128 KB, 4-way set-associative L2 cache of the GPU and then executed. This enables the RAM and loads start.elf, which is also on the SD card. This is the third-stage bootloader and is also the most important: it is the firmware for the GPU, meaning it contains the settings or, in our case, has instructions to load the settings from config.txt, which is also on the SD card. We can think of config.txt as the BIOS settings.

4. start.elf also splits the RAM between the GPU and the ARM CPU. The ARM only has access to the address space left over by the GPU. For example, if the GPU were allocated addresses from 0x0000F000 to 0x0000FFFF, the ARM would have access to addresses from 0x00000000 to 0x0000EFFF.

5. The physical addresses perceived by the ARM core are actually mapped to other addresses in the VideoCore (0xC0000000 and beyond) by the MMU (Memory Management Unit) of the VideoCore.

6. config.txt is loaded after the split is done, so the split amounts cannot be specified in config.txt. However, different .elf files with different splits exist on the SD card, so depending on the requirement, the appropriate file can be renamed to start.elf before booting the Pi. In the Pi, the GPU is king!

7. Other than loading config.txt and splitting the RAM, start.elf also loads cmdline.txt if it exists; it contains the command-line parameters for whatever kernel is to be loaded. This brings us to the final stage of the boot process: start.elf finally loads kernel.img, the binary file containing the OS kernel, and releases the reset on the CPU. The ARM CPU then executes the instructions in kernel.img, thereby loading the operating system.

8. After starting the operating system, the GPU code is not unloaded. In fact, start.elf is not just firmware for the GPU; it is a proprietary operating system called VideoCore OS (VCOS). When the normal OS (Linux) requires an element not directly accessible to it, it communicates with VCOS using the mailbox messaging system.

Figure 4.2: Boot process of Raspberry Pi


4.4 OpenCV

About OpenCV

The Open Source Computer Vision Library (OpenCV) is an open source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. Being a BSD-licensed product, OpenCV makes it easy to modify the code. The library has more than 2500 optimized algorithms, which include a comprehensive set of both classic and state-of-the-art computer vision and machine learning algorithms. These algorithms can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, stitch images together to produce a high-resolution image of an entire scene, find similar images in an image database, remove red eyes from images taken using flash, and so on.

Well-established companies like Google, Yahoo, Microsoft, Intel, IBM, Sony, Honda and Toyota employ the library. OpenCV's deployed uses span the range from stitching street-view images together, detecting intrusions in surveillance video in Israel, monitoring mine equipment in China, helping robots navigate and pick up objects at Willow Garage, detecting swimming pool drowning accidents in Europe, running interactive art in Spain and New York and checking runways for debris in Turkey, to inspecting labels on products in factories around the world and rapid face detection in Japan. It has C++, C, Python, Java and MATLAB interfaces and supports Windows, Linux, Android and Mac OS. OpenCV leans mostly towards real-time vision applications and takes advantage of MMX and SSE instructions when available. OpenCV is written natively in C++. OpenCV has a modular structure, which means that the package includes several shared or static libraries. The following modules are available:

1. core - a compact module defining basic data structures, including the dense multi-dimensional array Mat, and basic functions used by all other modules.

2. imgproc - an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on.

3. video - a video analysis module that includes motion estimation, background subtraction, and object tracking algorithms.

4. calib3d - basic multiple-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction.

5. features2d - salient feature detectors, descriptors, and descriptor matchers.

6. objdetect - detection of objects and instances of predefined classes.

7. highgui - an easy-to-use interface to video capturing, image and video codecs, and simple UI capabilities.

8. gpu - GPU-accelerated algorithms from different OpenCV modules.

9. Some other helper modules, such as FLANN and Google test wrappers, Python bindings, and others.

4.5 Image Processing

Images may suffer from the following degradations:

1) Poor contrast due to poor illumination or the finite sensitivity of the imaging device
2) Electronic sensor noise or atmospheric disturbances leading to broadband noise
3) Aliasing effects due to inadequate sampling
4) Finite aperture effects or motion leading to spatial blur

To avoid the above, one may use operations such as image filtering and thresholding.

4.5.1 Image Thresholding

There are various types of thresholding; some of them are:

1) Simple Thresholding
2) Adaptive Thresholding
3) Otsu's Thresholding

1. Simple Thresholding

In this type, if a pixel value is greater than a threshold value it is assigned one value (say white), else it is assigned another value (say black). The first argument is the source image, which should be a grayscale image. The second argument is the threshold value used to classify the pixel values. The third argument is the maximum value, which represents the value to be given if the pixel value is more than (sometimes less than) the threshold value. This is also called global thresholding, since the same threshold value is applied over the whole image.
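As an illustration, a minimal OpenCV-Python sketch of global thresholding is given below; the input file name and the threshold value 127 are illustrative assumptions, not values taken from the project code.

import cv2

# Illustrative input; any grayscale eye image will do.
img = cv2.imread('eye.png', cv2.IMREAD_GRAYSCALE)
# Pixels above the threshold (127) become 255 (white); all others become 0 (black).
ret, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
cv2.imwrite('eye_binary.png', binary)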

2. Adaptive Thresholding

Simple thresholding struggles when an image has different lighting conditions in different areas. In that case we go for adaptive thresholding, where the threshold is calculated for small regions of the image. We therefore get different thresholds for different regions of the same image, which gives better results for images with varying illumination.[13]
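A minimal sketch of adaptive thresholding in OpenCV-Python follows; the block size 11 and the constant C = 2 are common illustrative starting values, not the project's tuned settings.

import cv2

img = cv2.imread('eye.png', cv2.IMREAD_GRAYSCALE)
# The threshold of each pixel is the Gaussian-weighted mean of its 11x11
# neighborhood minus the constant C = 2.
adaptive = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY, 11, 2)
cv2.imwrite('eye_adaptive.png', adaptive)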

3. Otsu's Thresholding

Otsu's method automatically calculates a threshold value from the image histogram of a bimodal image (in simple words, a bimodal image is an image whose histogram has two peaks). For images which are not bimodal, Otsu's thresholding will not be accurate.

Figure 4.3: Otsu's Thresholding
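A minimal sketch of Otsu's thresholding is shown below; the threshold argument passed in (0) is ignored because the THRESH_OTSU flag tells OpenCV to compute it from the histogram. The file name is illustrative.

import cv2

img = cv2.imread('eye.png', cv2.IMREAD_GRAYSCALE)
# ret holds the threshold value chosen automatically from the histogram.
ret, otsu = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print('Threshold chosen by Otsu:', ret)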


4.5.2 Image Filtering

We will mainly focus on two types of filters:

1)Smoothing (low-pass)

2)Sharpening (high-pass)

1.Smoothing filters

Some of the Smoothing filters are listed and explained below:

a)Averaging filtering

b)Gaussian filtering

c)Median filtering

a)Averaging filtering

It simply takes the average of all the pixels under the kernel area and replaces the central element with this average. This type of filtering is also called blur filtering. The figure below shows the effect of the averaging filter on the input image, the updated matrix values, and the Mean Square Error:

Figure 4.4: Averaging filtering
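A minimal averaging-filter sketch follows; the 5x5 kernel size and the file name are illustrative assumptions.

import cv2

img = cv2.imread('eye.png')
# Each pixel is replaced by the mean of its 5x5 neighborhood.
blurred = cv2.blur(img, (5, 5))
cv2.imwrite('eye_blur.png', blurred)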


b)Gaussian filtering

Here the smoothing depends on the standard deviations in the X and Y directions, sigmaX and sigmaY respectively. The figure below shows the effect of the Gaussian filter on the input image, the updated matrix values, and the Mean Square Error:

Figure 4.5: Gaussian filtering
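A minimal Gaussian-filter sketch follows; the 5x5 kernel and sigmaX = 0 (sigma derived from the kernel size) are illustrative defaults.

import cv2

img = cv2.imread('eye.png')
# sigmaX = 0 lets OpenCV derive sigma from the kernel size; sigmaY defaults to sigmaX.
gaussian = cv2.GaussianBlur(img, (5, 5), 0)
cv2.imwrite('eye_gaussian.png', gaussian)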

c)Median filtering

Here the median of all the pixels under the kernel window is computed and the central pixel is replaced with this median value. This is highly effective in removing salt-and-pepper noise. One interesting point is that in the Gaussian and box filters the filtered value of the central element can be a value which may not exist in the original image, whereas in median filtering the central element is always replaced by some pixel value from the image. This reduces the noise effectively. The figure below shows the effect of the median filter on the input image, the updated matrix values, and the Mean Square Error:


Figure 4.6: Median filtering
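A minimal median-filter sketch follows; the aperture size 5 is an illustrative choice.

import cv2

img = cv2.imread('eye.png')
# The central pixel of each 5x5 window is replaced by the window's median value.
median = cv2.medianBlur(img, 5)
cv2.imwrite('eye_median.png', median)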

2.Sharpening filters

Some of them are listed below:

a)Unsharp masking

b)High Boost filter

c)Gradient (1st derivative)

d)Laplacian (2nd derivative)

4.5.3 Tracking Algorithms

OpenCV provides many tracking algorithms, of which we are going to use the two explained below:

1) MeanShift
2) CamShift

MeanShift

The aim of meanshift is simple. Consider you have a set of points (it can be a pixel distribution). You are given a small window (maybe a circle) and you have to move that window to the area of maximum pixel density (or maximum number of points). This can be explained using the simple image given below:

Figure 4.7: MeanShift

The initial window is shown as the blue circle named C1. Its original center is marked by the blue rectangle named C1-o. If you find the centroid of the points inside that window, you get the point C1-r (marked by the small blue circle), which is the real centroid of the window. Surely they don't match. So move the window such that the circle of the new window coincides with the previous centroid, and find the new centroid again. Most probably it won't match, so move it again, and continue the iterations until the center of the window and its centroid fall on the same location (or within a small desired error). What you finally obtain is a window with maximum pixel distribution, marked with the green circle named C2. As you can see in the image, it contains the maximum number of points.
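The sketch below follows the standard OpenCV-Python meanshift tutorial; the video file name, the initial search window and the termination criteria are illustrative assumptions rather than the project's actual values (OpenCV 3 constant names are assumed).

import cv2
import numpy as np

cap = cv2.VideoCapture('eye_video.avi')
ret, frame = cap.read()

# Illustrative initial search window (x, y, width, height) around the target.
x, y, w, h = 200, 150, 60, 60
track_window = (x, y, w, h)

# Hue histogram of the target region, back-projected in later frames.
hsv_roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

# Stop after 10 iterations or when the window moves by less than 1 pixel.
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    ret, track_window = cv2.meanShift(dst, track_window, term_crit)
    x, y, w, h = track_window
    cv2.rectangle(frame, (x, y), (x + w, y + h), 255, 2)
    cv2.imshow('meanshift', frame)
    if cv2.waitKey(30) & 0xFF == 27:   # press Esc to quit
        break

cap.release()
cv2.destroyAllWindows()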

CamShift

As seen above, in meanshift the size of the window remains constant, which is not good for a moving object. In order to adapt the window size to the size and rotation of the target, CamShift (Continuously Adaptive Meanshift) was developed by Gary Bradski. It applies meanshift first; once meanshift converges, it updates the size of the window and also calculates the orientation of the best-fitting ellipse. It then applies meanshift again with the new scaled search window and the previous window location, and the process continues until the required accuracy is met.[13]
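A corresponding CamShift sketch is given below; the set-up mirrors the meanshift sketch above (illustrative file name and initial window), the difference being that cv2.CamShift also adapts the window size and returns a rotated rectangle describing the target.

import cv2
import numpy as np

cap = cv2.VideoCapture('eye_video.avi')
ret, frame = cap.read()
x, y, w, h = 200, 150, 60, 60          # illustrative initial window
track_window = (x, y, w, h)

hsv_roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    rot_rect, track_window = cv2.CamShift(dst, track_window, term_crit)
    # Draw the adapted, rotated window returned by CamShift.
    pts = np.int32(cv2.boxPoints(rot_rect))
    cv2.polylines(frame, [pts], True, 255, 2)
    cv2.imshow('camshift', frame)
    if cv2.waitKey(30) & 0xFF == 27:
        break

cap.release()
cv2.destroyAllWindows()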

4.6 Virtual Network Computing

Figure 4.8: VNC Connections

The aim is to install the VNC server software on the Pi and the VNC viewer software on the host computer, and to configure the Pi to hand out an IP address. The Raspberry Pi is then directly connected to the host computer using a single Ethernet cable, forming a completely isolated point-to-point network between the two and therefore a more reliable connection.

Note: a crossover cable is not needed; a standard cable will work because the Pi Ethernet port automatically switches the transmit and receive pins. This setup makes the Pi Ethernet port behave in a similar way to a home router: a static IP address is assigned to it and a DHCP service is installed that responds to address requests from the host computer. It is good to use an IP address range that is very different from that of the main network. Using this, the internet can be shared from the laptop's WiFi over Ethernet; it also lets the Pi access the internet and lets the Raspberry Pi be viewed on the laptop's display.


Chapter 5

METHODOLOGY

5.1 Block Diagram and Description

Block Diagram:

Figure 5.1: Block Diagram[9]


Description:

The block diagram mainly consists of seven parts:

1) User

2) Sunglass frame

3) IR Module

4) Micro CMOS cam

5) Processor

6) Computational Software

7) Monitor

1) User: The user is the one who provides input to the system through his/her eyes.

2) Sunglass frame: Cheap, yet strong enough to sustain the weight of the camera.

3) IR Module: Four IR LEDs mounted below the camera illuminate the human eye with IR light, which helps in tracking the eye movements more precisely.

4) Micro CMOS camera: IR-sensitive, with a minimum resolution of 640 x 480, mounted on the eyeglass frame.

5) Processor: A Raspberry Pi, which processes the images captured by the camera.

6) Computational software: Raspbian and OpenCV. The output from the processor is computed in OpenCV.

7) Monitor: Used to display the output generated by the computational software.


5.2 Flowchart

Figure 5.2: Flowchart


5.3 Implementation

Visible spectrum imaging is a passive approach that uses ambient light reflected from the eye. It is often the case that the most effective feature to track in visible spectrum analysis is the contour between the iris and the sclera, known as the limbus. The three most relevant features of the eye are the pupil, the aperture that lets light into the eye; the iris, the colored muscle group that controls the diameter of the pupil; and the sclera, the white protective tissue that covers the remaining portion of the eye. Visible spectrum eye tracking is complicated by the fact that uncontrolled ambient light is used as the source, which can contain multiple specular and diffuse components. IR imaging eliminates uncontrolled specular reflection by illuminating the eye with uniform and controlled IR light that is not perceivable by the user.

A further benefit of IR imaging is that the pupil, rather than the limbus, is the strongest feature contour in the image: both the sclera and the iris strongly reflect IR light, while only the sclera strongly reflects visible light. Tracking the pupil contour is preferable given that the pupil contour is smaller and more sharply defined than the limbus. Furthermore, due to its size, the pupil is less likely to be occluded by the eyelids. The primary disadvantage of IR imaging techniques is that they cannot be used outdoors during daytime due to the ambient IR illumination.

Infrared eye tracking typically utilizes either bright-pupil or dark-pupil techniques. Bright-pupil techniques illuminate the eye with a source that is on or very near the axis of the camera; the result is that the pupil is clearly demarcated as a bright region due to the photoreflective nature of the back of the eye. Dark-pupil techniques illuminate the eye with an off-axis source such that the pupil is the darkest region in the image, while the sclera, iris and eyelids all reflect relatively more illumination. In either method, the first-surface specular reflection of the illumination source off the cornea (the outermost optical element of the eye) is also visible. The vector between the pupil center and the corneal reflection is typically used as the dependent measure rather than the pupil center alone, because the vector difference is insensitive to slippage of the head gear: both the camera and the source move simultaneously.[10]


5.3.1 USB Camera interface with the Raspberry Pi

Specifications of Camera Used:

1)Image Sensor - High quality 1/6 CMOS sensor

2)Sensor resolution - 300K pixels

3)Video Format - 24-Bit True Color

4)Max. Video Resolution - 1600 x 1200 pixels

5)Lens - High quality 5G wide angle lens

6)Max. Image Resolution - 5500 x 3640 pixels

7)Interface - USB 2.0. Backward compatible with USB 1.1

8)Focus - 5 cm to Infinity

9)Night Vision - Yes

10)Power Supply - USB bus powered

11)White Balance - Auto

12)Frame Rates - 18 frames per second

USB cameras are imaging cameras that use USB 2.0 to transfer image data. They are designed to interface easily with dedicated computer systems by using the same USB technology that is found on most computers, including the Raspberry Pi. The accessibility of USB technology as well as the 480 Mb/s transfer rate of USB 2.0 makes USB cameras ideal for many imaging applications. A minimum resolution of 640 x 480 pixels is required, and the distance between the camera and the eye is 5 to 10 cm. In this project the iBall ROBO-K20 PC webcam is used as the USB camera. There is a lot of recent motivation to do image processing and computer vision tasks on the Raspberry Pi, so OpenCV is installed on the Raspberry Pi. The USB camera is connected to the Raspberry Pi and uses the UV4L driver to work properly on the board.
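A minimal sketch of grabbing a frame from the USB webcam with OpenCV-Python is shown below; device index 0 and OpenCV 3 style property names are assumptions, and the resolution matches the 640 x 480 minimum mentioned above.

import cv2

cap = cv2.VideoCapture(0)                      # first USB camera, e.g. /dev/video0
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

ret, frame = cap.read()
if ret:
    cv2.imwrite('capture.png', frame)          # save one frame for inspection
cap.release()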


Figure 5.3: Camera image without IR

Figure 5.4: Camera image with IR

5.3.2 Pupil detection

The pupil is a hole located in the center of the iris that allows light to reach the retina. It appears black because light rays entering the pupil are either absorbed directly by the tissues inside the eye or absorbed after diffuse reflections within the eye that mostly miss exiting the narrow pupil. The pupil changes size as the amount of light changes (the more light, the smaller the hole). By measuring the corneal reflection(s) from the IR source relative to the center of the pupil, the system can compensate for inaccuracies and also allow a limited degree of head movement. Gaze direction is then calculated by measuring the changing relationship between the moving pupil center and the corneal reflection.
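A hedged sketch of one way the pupil center can be located under the dark-pupil assumption (the pupil is the darkest blob in the IR image) is given below; the file name and the threshold value 40 are illustrative, not the project's tuned parameters.

import cv2

gray = cv2.imread('eye_ir.png', cv2.IMREAD_GRAYSCALE)
gray = cv2.GaussianBlur(gray, (5, 5), 0)                  # suppress sensor noise
# Keep only very dark pixels; THRESH_BINARY_INV makes the pupil region white.
ret, mask = cv2.threshold(gray, 40, 255, cv2.THRESH_BINARY_INV)
# [-2] picks the contour list regardless of the OpenCV version in use.
contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
if contours:
    pupil = max(contours, key=cv2.contourArea)            # largest dark blob
    m = cv2.moments(pupil)
    if m['m00'] > 0:
        cx, cy = int(m['m10'] / m['m00']), int(m['m01'] / m['m00'])
        print('Pupil center (x, y):', cx, cy)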

5.3.3 Output of pupil focus

The eye and its pupil are detected from the camera frames by the OpenCV code, with the USB camera interfaced to the Raspberry Pi; the result is shown in the Raspbian OS. Different values of the X, Y coordinates of the pupil center are then obtained, and these coordinates can be used for any purpose depending on the application.

5.3.4 Execution on Raspberry Pi

The USB camera is interfaced with the Raspberry Pi, and the Raspbian OS and OpenCV are installed on the Raspberry Pi using the SD card. First, an image is captured by the USB camera. The camera is focused on the eye in the image, and the center position of the pupil is detected by the OpenCV code. Taking the position of the pupil as a reference, different values of the X, Y coordinates are then assigned to particular commands.[5]
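As a hedged illustration of how pupil coordinates could be mapped to commands, the sketch below turns a pupil-center displacement from a reference point into a coarse left/right/up/down decision; the reference point and the 15-pixel dead zone are illustrative values, not the project's calibrated settings.

def interpret(cx, cy, ref_x=320, ref_y=240, dead_zone=15):
    # Displacement of the detected pupil center from the calibrated reference.
    dx, dy = cx - ref_x, cy - ref_y
    if abs(dx) < dead_zone and abs(dy) < dead_zone:
        return 'CENTRE'
    if abs(dx) >= abs(dy):
        return 'RIGHT' if dx > 0 else 'LEFT'
    return 'DOWN' if dy > 0 else 'UP'

print(interpret(400, 245))   # prints RIGHT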

5.3.5 Project as a standalone system

Our project can be implemented as a standalone system. The cost of the system can be considerably reduced by using only those features of the Raspberry Pi which are required, such as Ethernet, one USB 2.0 port, the SD card and HDMI. For displaying the result, any kind of display unit such as an LCD or VGA monitor can be used. As a further development, the result can be transferred anywhere using IoT-like technologies, so there is no need to sit beside the patient continuously.


Figure 5.5: Project as a standalone system


Chapter 6

RESULT

1. The objective of the project is to interpret eye movements for relevant information. In view of this, it is necessary to retrieve an image with no noise, or else the introduced noise will change the interpretation. Thus a study of different filters was carried out on the image. The table below shows that the Gaussian filter, having the least MSE, is the right choice for this application. This can be further improved by also measuring the PSNR of the image.

Sr.No.  Name of the Filter    Mean Square Error (MSE)
1)      Averaging (Blur)      2.3945
2)      Bilateral             2.7921
3)      Gaussian              1.5961
4)      Median                2.0907

Table 6.1: Comparison of MSE of various filters
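The sketch below indicates how the MSE (and the PSNR mentioned above) between the original and a filtered image can be computed; the file name and the Gaussian kernel size are illustrative.

import cv2
import numpy as np

original = cv2.imread('eye.png', cv2.IMREAD_GRAYSCALE).astype(np.float64)
filtered = cv2.GaussianBlur(original, (5, 5), 0)

mse = np.mean((original - filtered) ** 2)
# PSNR in dB for 8-bit images (peak value 255).
psnr = 10 * np.log10((255.0 ** 2) / mse) if mse > 0 else float('inf')
print('MSE  =', mse)
print('PSNR =', psnr, 'dB')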


2. The eye, i.e. the pupil, moves in such a way that it draws one letter at a time; in the same way a word, and then a sentence, can be drawn. In our project we took samples from different subjects, as shown below, and then calculated the accuracy, which came to be around 65-70 percent.

Figure 6.1: Samples of different subjects


Chapter 7

APPLICATIONS

1. Explicit eye input is utilized in applications that implement gaze-based command and control. Here, people use voluntary eye movements and consciously control their gaze direction, for example to communicate or to control a computer. Gaze-based control is especially useful for people with severe disabilities, for whom the eyes may be the only, or (due to developments in technology) a reasonable, option to interact with the world.

2. In its simplest form, the eye can be used as a switch. For example, the user may blink once or twice, or use simple vertical or horizontal eye movements as an indication of agreement or disagreement (obtainable even with a low-cost web-camera-based tracker). The most common way to implement gaze-based control is to use the eyes' capability to point at the desired target (requiring a more accurate tracker).

3. Gaze-based user modeling provides a way to better understand the user's behavior, cognitive processes, and intentions.[4]

4. Passive eye monitoring is useful for diagnostic applications in which the user's visual behavior is only recorded and stored for later offline processing and analysis, with no immediate reaction or effect on the user's interaction with the world.

5. A further direction in eye-based human-computer interaction is to move beyond such purely descriptive analyses of a small set of specific eye movement characteristics toward developing holistic computational models of a user's visual behavior. These models typically rely on computational methods from machine learning and pattern recognition. The key goal of these efforts is to gain a better understanding of, and to be able to perform automatic predictions about, user behavior. For example, research has shown that a variety of visual and non-visual human activities, such as reading or common office activities, can be spotted and recognized automatically by analyzing features based solely on eye movement patterns, independent of any information on gaze.[4]

6.It can be used for automatic annotation and filtering in life logging applications.

7. Can be used by paralyzed/needy people for communication purposes.

8.Marketing research and Medical research (neurological diagnosis).

9.The car industry does research in this field too, with the aim of developing assistant systems for

cars. For example, an eye tracker in the car could warn the driver when she or he falls asleep while

driving the car.

10. Specific applications include tracking eye movements in language reading, music reading, human activity recognition, and the perception of advertising.


Chapter 8

CONCLUSION AND FUTURE SCOPE

Conclusion

The idea of eye control is of great use not only to the future of natural input but, more importantly, to the handicapped and disabled. One of the main goals of the Eye Tracking Interpretation System is to make life more accessible for completely paralyzed patients and to provide them the opportunity of independence and movement. According to our studies, this project is feasible.

Future Scope

1. In addition to gaze direction and eye movement patterns, other eye-related measurements such as pupil size and even microsaccades can contribute to the interpretation of the user's emotional and cognitive state.

2. Gaze behavior can also be combined with other measurements from the user's face and body, enabling multimodal physiological computing.

3. Gaze-based user modeling may offer a step toward truly intelligent interfaces that are able to facilitate the user in a smart way that complements the user's natural behavior.[12]


Chapter 9

REFERENCES

1. De Luca A., Weiss R., and Drewes H. Evaluation of Eye-Gaze Interaction Methods for Security Enhanced PIN-Entry. In Proceedings of the 19th Australasian Conference on Computer-Human Interaction, OZCHI 2007, vol. 51, ACM Press (2007), 199-202.

2. Ashdown M., Oka K., and Sato Y. Combining Head Tracking and Mouse Input for a GUI on Multiple Monitors. In Extended Abstracts on Human Factors in Computing Systems, CHI '05, ACM Press (2005), 1188-1191.

3. Atterer R., Schmidt A., and Wnuk M. A Proxy-Based Infrastructure for Web Application Sharing and Remote Collaboration on Web Pages. In Proceedings of the 11th IFIP TC13 International Conference on Human-Computer Interaction, INTERACT 2007, Springer (2007), 74-87.

4. Abrams R. A., Meyer D. E., and Kornblum S. Speed and Accuracy of Saccadic Eye Movements: Characteristics of Impulse Variability in the Oculomotor System. Journal of Experimental Psychology: Human Perception and Performance (1989), Vol. 15, No. 3, 529-543.

5. Santella A., Agrawala M., DeCarlo D., Salesin D., and Cohen M. Gaze-Based Interaction for Semi-Automatic Photo Cropping. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '06, ACM Press (2006), 771-780.

6. Albert W. (2002). Do web users actually look at ads? A case study of banner ads and eye-tracking technology. In Proceedings of the Eleventh Annual Conference of the Usability Professionals Association.

7. Cowen L., Ball L. J., and Delin J. (2002). An eye-movement analysis. In X. Faulkner, J. Finlay, and F. Detienne (Eds.), People and Computers XVI - Memorable yet Invisible: Proceedings of HCI 2002 (pp. 317-335). London: Springer-Verlag Ltd.

8. Dongheng Li, David Winfield, and Derrick J. Parkhurst. Starburst: A hybrid algorithm for video-based eye tracking combining feature-based and model-based approaches. Human Computer Interaction Program, Iowa State University, Ames, Iowa, 50010.

9. Brigham F. J., Zaimi E., Matkins J. J., et al. (2001). The eyes may have it: reconsidering eye-movement research in human cognition. In: Scruggs T. E., Mastropieri M. A. (Eds.), Technological Applications. Advances in Learning and Behavioral Disabilities, vol. 15. Emerald Group Publishing Limited, Bingley, pp. 39-59.

10. Heathcote A., Brown S., and Mewhort D. J. The Power Law Repealed: The Case for an Exponential Law of Practice. Psychonomic Bulletin and Review, Vol. 7, Issue 2 (2000), 185-207.

11. Majaranta P. and Raiha K. Twenty Years of Eye Typing: Systems and Design Issues. In Proceedings of the 2002 Symposium on Eye Tracking Research and Applications, ETRA '02, ACM Press (2002), 15-22.

12. Alex Poole and Linden J. Ball. Eye Tracking in Human-Computer Interaction and Usability Research: Current Status and Future Prospects.

13. Alexander Mordvintsev and Abid K. OpenCV-Python Tutorials Documentation, Release 1.


• PROJECT SCHEDULE PLAN

Sr.No.  Activity                          Plan (period)                      Execution
1)      Literature survey                 August                             Completed
2)      Coding and Software Development   September, October and November    Completed
3)      Testing                           January and February               Completed
4)      Implementation                    February and March                 Completed
5)      Report Submission                 March                              Completed
6)      Project Submission                April

Table 9.1: Project Schedule Plan

Project Guide Project Co-ordinator Head of Dept

Mrs.P.S.Kasliwal Mr.S.A.Khandekar Dr.M.D.Goudar
