Chapter 4
EMPIRICAL STUDY AND ANALYSIS
This research develops an emotion detection approach based on the automatic
monitoring of physiological signals using a microcontroller. There are
three main aspects of this study: (a) experimentation setup for the physiological sensing,
(b) signal processing to sense the affective state, and (c) affective computing using the
machine learning algorithms. This chapter focuses on the empirical study of this research.
The physiological signals were concurrently recorded and coordinated by the hardware
and software combination throughout the whole experimentation to analyze the potential
concurrent changes that occurred due to the sympathetic activation of aroused emotion.
The goal of this chapter is to define the experimental setups for the data collection, which
will be further used for emotion prediction.
A number of patients exhibit autonomic disorders. All autonomic tests include
their physiological background, indications, contra-indications, the entry conditions that
must be fulfilled before the subject is allowed to take the test, the instrumentation, the
activities flow performed during the test, and the exceptions which might cause test.
4.1 Generalities
A setup and a corresponding protocol were defined and implemented for the
experiments. The protocol requirements were to:
- Provide an appropriate stimulus, capable of eliciting stress in the subjects
  participating in the experiment.
- Provide appropriate variation in each output data.
- Provide proper coordination of all the software and hardware components that are
  involved in the experimental process.
- Record the GSR, BVP, and temperature signals with all the necessary time markers.
A drawback of the system is that this design is not suitable for people with
hyperhidrosis (Ogorevc et al., 2013), a condition that causes excessive sweating.
The complete implementation of the experimental system, with its coordinated
software and hardware components, is described in Chapter 2. For validation and data
collection, three sets of experiments were conducted in entirely different
scenarios using the same strategy and protocol.
Scenario 1 (S0_1): In the first scenario, the experiments were conducted in a multinational
company to improve the daily activities of its staff. The company's interest was to
target the weak performers. After discussion, the company granted permission for the
experiments for the betterment of the employees. This helped the staff to
work on their emotional aspect.
Scenario 2 (S0_2): In the second setup, the experiments were conducted by a doctor on
a hundred-odd patients (subjects) in a hospital, each subject having a different age, gender,
and medical background. The experiment also included paralyzed people, to
account for the fact that not everybody can express emotions. This helped the doctor to
see the exact state of mind of a patient for better treatment.
Scenario 3 (S0_3): In this scenario, a set of audio/video clips was used as
stimuli in real time. Audio and video songs in different languages were
played, some of them chosen by the subject. The songs were found experimentally
to trigger specific emotions. Subjects were asked to listen to
the clips, and subjective feedback was collected to detect emotion arousal.
This work aimed to design and develop a real-time monitoring system that can be used to
estimate different emotions, especially for people who cannot express their emotions,
such as people with paralysis. The expected values of the different
biofeedback modalities are mapped to the ranges of the emotion areas described in
Chapter 3. Different emotional expressions produce different changes in
autonomic activity; Table 4.1 gives examples:
Table 4.1: Change in Autonomic Activity (Ekman et al., 1983; Kreibig, 2010)
Emotion   | GSR       | BVP       | Temperature
Anger     | Decreases | Increased | Increases
Fear      | Increases | Increased | Decreases
Happiness | No Change | Normal    | No Change
Stress can be seen as a state of crisis that is preceded by arousal due to an external
stimulus. An external stimulus can be considered something that tends to create
distractions. Once the factor causing stress (the stressor) disappears, the body gets relaxed
and calm; and then returns to a normal state.
This study considers a simplified setting in which the person is assumed to be either in the
normal state or in a stressed state. The change between the two states can be sudden or
incremental; typically, arousal is more rapid and relaxation takes considerably
longer (Fontaine et al., 2007). Emotions are categorized by their
degree of arousal, from low to high, and by their valence, from positive
to negative, as shown in Fig. 4.1 (Schmidt and Trainor, 2001). All the features
were selected from the training data, which was extracted from the real-time experimental
data set.
Fig. 4.1: Two-dimensional emotion models with four quadrants
[Figure: axes Arousal and Valence; labels: GSR, BVP, Stress, Sadness, Joyful, Neutral, Calm]
The records were collected from experiments in which the GSR, BVP,
and temperature signals were extracted and stored within the microcontroller. According to the
emotion model (Ohme et al., 2009), the following outcomes are expected from the various
activities of the experiments:
- When stress increases, the skin conductivity (GSR) decreases
  and HR/BVP increases.
- When joyfulness decreases, the skin conductivity (GSR) increases and
  HR/BVP will (increase/decrease).
- In a calm state, there is no change.
The practical aim of these experiments was to collect data samples for state prediction.
This ensured that samples never presented to the system during the training phase were
available for testing. The data was collected and analyzed in
controlled settings with the designed hardware and with appropriate algorithms
embedded in the microcontroller. Data sets of all experiments are given in the attached
Appendix 3.
4.2 Benchmark construction of experimentation
4.2.1 Experimental Study_ S0_1
Aim:
The aim of the experiment was to determine the change in the emotional level of a subject
responding to a task given by a company and to a series of questions with emotional
content.
Participants and stimuli:
An experimental setup was established at the sitting place of the subject, where the daily
tasks were performed. As mentioned in the procedure, the readings were taken while
the subject was performing the assigned task (office work). Twenty-odd subjects of both
genders, aged 20 to 55, were considered.
Initially, the subjects were measured under neutral conditions; this served as the
reference for estimating the variance of the values in the different emotional states.
Once a state was reached, the subject tended to remain in that state for a finite amount of
time. The total stimulus time for each emotion was 2-3 minutes, with a
gap of 2 minutes between different emotions. To fulfill the criteria, each emotion was
estimated three times.
A few sample questions with different emotional content are given below:
- How long have you worked as a subordinate?
- How do you rank your overall job profile?
- Are you satisfied with your pay package?
- Does your job profile justify your hard work?
- Are you satisfied with the services provided by the company?
- Do you think that you deserve more in life?
Procedure:
1. Subjects selected by the company should be asked to wash their hands
with soap and water, and then dry them properly.
2. Subject should be healthy (that is, no fever etc.).
3. Subject should be without any alcohol intake.
4. The GSR, BVP, and temperature sensors should be attached to the distal finger
segment of two non-adjacent fingers.
5. The subject should sit comfortably without any external stimuli disturbance.
6. During the experiment, the subject should not be allowed to drink water, as it can
change the emotion.
7. Two different measurements should be taken in this experiment: (a) during the regular
daily task, and (b) during the career satisfaction interview.
8. Readings should be taken three times: (a) before the task, (b) during the task, and
(c) after the task. The average of these three values is taken as the actual
value.
9. Different emotions should be detected, as this would also help in professional
growth (by building strong emotions).
4.2.2 Experiment_ S0_2
Aim:
For the skin conductance orienting response (SCOR) in childhood, habituation is absent at
age 3 but apparent at age 4; it increases thereafter, peaks at age 6, and then levels off
(Gao et al., 2007; Kylliäinen and Hietanen, 2006).
This experiment was therefore designed for all age groups above 6, to determine the change in
the emotional level of a subject while responding to the doctor's questions with
emotional content.
Participants and stimuli:
An experimental setup was established according to the comfort level of the doctor. A total
of ten questions, five with neutral content and five of an emotional nature, were put to the
subject. The doctor instructed the subject to sit quietly and answer each question honestly
in one word. The subject was instructed not to explain any answer. Questions
were chosen according to the subject's age. Once a state was reached, the subject tended to
remain in that state for a finite amount of time. The total stimulus time for each emotion
was 2-3 minutes. The experiment was carried out for 15 days on different subjects, and
some subjects were intentionally repeated for better judgment.
Table 4.2: Questions with Emotional Content
Age 6 to 12 | Age 12 to 35 | Age above 35
Does being alone at night frighten you? | Are you in love? | Do you ever cry?
Has anyone ever beaten you? | Do you ever cry? | Do you recall your young days?
Are you scared of ghosts? | Do you feel there is someone who understands you? | Have you ever seen a tragic accident?
Do you feel scared during the exam days? | Do you have any best friend? | Are you satisfied with your achievements in life?
How do you handle the exam pressure? | Are you satisfied with your career? | Whom do you miss the most in your life and why?
Table 4.3: Questions with Neutral Content
Age 6 to 12 | Age 12 to 35 | Age above 35
Do you like burgers? | Is it Monday today? | Do you have a car?
Do you like watching TV? | Do you like holidays? | Do you have a house?
Which day is today? | What is your hobby? | Do you have kids?
Which is your favorite game? | Which sport do you like the most? | Are you a foodie?
Do you like coloring? | Who is your best friend? | Which is your favorite dish?
Procedure
1. The subjects coming for the daily checkup should be asked to wash their hands with
soap and water, and then dry them properly.
2. Subject's health should not be critical (e.g. fever etc.).
3. Subject should be ready without any alcohol intake.
4. The GSR, BVP, and temperature sensors should be attached to the surface of the distal
finger segment of two non-adjacent fingers.
5. The subject should sit comfortably without any external stimuli disturbance.
6. During the experiment, the subject should not be allowed to drink water, as it can change
the emotion.
7. Two different measurements should be performed in this experiment: (a) during the
regular daily task, and (b) during the career satisfaction interview.
8. Different emotions should be detected, as this would also help the doctor towards a better
understanding.
4.2.3 Experiment_ S0_3
Aim
The experiment was performed to determine how audio/video clips may result in
high subject agreement in terms of the elicited emotions (that is, sadness, anger,
surprise, fear, and amusement). Twenty-one movie clips, in three groups, were played for the
participants. Each group of seven clips was meant to elicit a different emotion (stress,
joyfulness, and calmness).
Participants and stimuli
The experiment was conducted on 20 undergraduate/graduate students from different streams:
electronics & communication, computer science, and civil. All the subjects participated in
the study mutually. The subjects were informed that after the experiment they would
fill out a questionnaire with demographic items. The
subjects were then informed that they would be watching various movie clips intended to elicit
emotions and that, during each clip, they would be prompted to answer questions about the
emotions they experienced while watching the scene. They were also asked to
respond according to the emotions they experienced. A slide show played the
clips and, after each clip, a slide was presented asking the participants to
answer the survey items for the previous scene. During the measurement, the
subject was advised to abstain from all physical work and to concentrate on
listening to the clips. The total stimulus time for each emotion was 4 to 5
minutes, with a minimum gap of 1 minute between different stimuli, during which the
music was turned off and the subject was advised to return to a normal state, sip water,
munch on a snack, etc. For each scene, four questions were asked:
- Which emotion did you experience from this video clip?
- How would you rate, on a five-point scale, the intensity of the emotion that you
  experienced?
- Did you experience any other emotion at the same or a higher intensity, and if
  so, what was that feeling?
- Have you seen that clip before?
Procedure
1. The subjects who volunteered for the experiment should be asked to wash their hands
with soap and water, and then dry them properly.
2. Subject should be healthy (that is, no fever etc.).
3. Subject should be ready without any alcohol intake.
4. The GSR, BVP, and temperature sensors should be attached to the surface of the
distal finger segment of two non-adjacent fingers.
5. The subject should sit comfortably without any external stimuli disturbance.
6. The readings should be taken three times: (a) before the task, (b) during the task, and
(c) after the task. The average of these three values is taken as the actual value.
7. Different emotions should be detected, as this would help in gathering accurate
training data.
4.3 DATA ANALYSIS
Data analysis is the process of reducing and filtering large amounts of collected data
so that the data makes sense. To do this, the hardware was designed and developed
with the capability to perform data analysis and data storage within the hardware itself.
Fig. 4.2 represents the general structure of the proposed system.
Fig. 4.2: Data analysis and subject assessment for emotion estimation
[Figure: block diagram with the blocks Emotion Induction, Measuring physiological
variables, Emotion Estimation, and Subject Assessment and Data Analysis]
4.3.1 Data Acquisition
The information was gathered from the experiments described above. The sensors
were attached to the fingers of the individuals to acquire the BVP, GSR,
and temperature signals simultaneously by means of a recording mechanism. The
experiments focused on two main stress-inducing tasks, namely Talk Preparation (TP) and
Hyperventilation (HV). Each experiment was divided into four steps, which are described
below:
1) The first step (FS_1) consisted of attaching the sensors to the subject. After a variable
period of time during which the subject was asked to calm down, an acquisition was
performed according to the procedure mentioned above.
2) Hyperventilation (HV): The subject was then required to breathe deeply and rapidly,
every 2-3 seconds, as indicated by the experimenter. This task was performed until
the subject clearly perceived changes in his/her bodily sensations. At
exactly that moment, GSR/BVP/temperature was sampled, representing the
behaviour of the physiological signals in a stressful situation.
3) Talk Preparation (TP): After HV, the subject was asked to take a break and then
to prepare the answers to the questions mentioned in the above experiments.
The subject was given one or two minutes to prepare the answers; signals were
sampled again over a period of 90 seconds, representing a stressful situation.
4) In the final step (FS_2), the experiment ends by acquiring the
emotions from the subject. It is significant to state that for the sake of independence in
the order of the tasks.
4.3.2 Normalization and feature extraction
The procedures described above resulted in a set of physiological records (160
in total). The differences in the number of data sets for each emotion
class are due to data loss for some participants during various segments of
the experiment. To quantify the variation in the physiological
responses, the data was normalized for every emotion, as the participants went from a
calm state to the state of experiencing a specific emotion. Normalization is also important
for minimizing the individual differences among participants in terms of their
physiological responses while experiencing a specific emotion. The collected data was
normalized using the average value of the corresponding signal type gathered
during the relaxation period for the same participant. An example of normalization for the
GSR values is as follows:
\[
\text{Normalized data} = \frac{\text{raw\_data} - \text{raw\_relaxation\_data}}{\text{raw\_relaxation\_data}} \tag{1}
\]
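The normalization of Eq. (1) can be illustrated with a minimal Python sketch. The function name normalize and the sample readings are hypothetical and chosen only for illustration; the relaxation baseline is taken as the mean of the relaxation-period samples, as described in the text.

```python
from statistics import mean


def normalize(raw_samples, relaxation_samples):
    """Apply Eq. (1): (raw - baseline) / baseline.

    The baseline is the mean of the samples recorded during the
    relaxation period for the same participant.
    """
    baseline = mean(relaxation_samples)
    if baseline == 0:
        raise ValueError("relaxation baseline must be non-zero")
    return [(x - baseline) / baseline for x in raw_samples]


# Hypothetical GSR readings (arbitrary units), for illustration only.
relaxation_gsr = [2.0, 2.0, 2.0]
stimulus_gsr = [2.0, 3.0, 1.0]
normalized_gsr = normalize(stimulus_gsr, relaxation_gsr)
print(normalized_gsr)  # [0.0, 0.5, -0.5]
```

A normalized value of 0.5 means the reading is 50% above the participant's own relaxation baseline, which is what makes values comparable across participants.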
After the data signals were normalized, features were extracted from the normalized data.
Four features were extracted for each data signal type: maximum, minimum, mean, and
variance of the normalized data. The information was stored in a three-dimensional array
of real numbers, indexed by:
1. The subjects who participated in the experiment
2. The emotion classes (stress, joyfulness, and calmness)
3. The extracted features of the data signal types (minimum, maximum, mean, and
variance of GSR, temperature, and BVP).
Every slot of the array holds one feature of one data signal type,
belonging to one participant while s/he was experiencing one emotion (e.g.,
one slot holds the mean of the normalized skin temperature of
participant number 1 while s/he was experiencing stress, whereas another slot
holds the variance of a normalized value of participant number 5 while s/he was
experiencing calmness). As mentioned, features were extracted for each data type, and
a supervised learning algorithm was then implemented that took these features as input and
interpreted them for the final prediction.
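The three-dimensional feature array described above can be sketched as nested Python lists. The names SIGNALS, EMOTIONS, extract_features, and build_feature_array, the input layout, and the sample values are all assumptions made for illustration; the population variance is also an assumption, since the text does not say which variance estimator was used.

```python
from statistics import mean, pvariance

SIGNALS = ("GSR", "temperature", "BVP")
EMOTIONS = ("stress", "joyfulness", "calmness")


def extract_features(normalized):
    """Return [minimum, maximum, mean, variance] of one normalized signal."""
    return [min(normalized), max(normalized), mean(normalized), pvariance(normalized)]


def build_feature_array(records):
    """Build features[subject][emotion][feature].

    `records` is a list with one dict per subject, laid out as
    records[subject][emotion][signal] -> list of normalized samples.
    The last axis has 12 entries: 4 features for each of 3 signals.
    """
    features = []
    for subject in records:
        per_emotion = []
        for emotion in EMOTIONS:
            row = []
            for signal in SIGNALS:
                row.extend(extract_features(subject[emotion][signal]))
            per_emotion.append(row)
        features.append(per_emotion)
    return features


# One hypothetical subject with identical dummy samples for every signal.
one_subject = {e: {s: [0.0, 0.5, -0.5, 0.0] for s in SIGNALS} for e in EMOTIONS}
features = build_feature_array([one_subject])
print(features[0][0][:4])  # GSR features of subject 0 under "stress"
```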
4.3.3 Classification Methods
Classifiers were compared on the experimental data. Naïve Bayes classifiers were
trained and tested on individual and multiple subjects. After all the features were
extracted, they were provided as input to the learning systems, which were
trained to recognize the stressed state. The training data was divided into two
sets in order to evaluate how activity information may influence the results
of stress inference. One set of training data includes only the GSR/BVP/temperature
related features, while the second set also includes the accelerometer information. We
also evaluated the classification performance for the between-subjects datasets and
within-subject datasets. A cross-validation analysis was applied to the resulting models.
The entire dataset was used to generate several types of physiological response
models, including models of the changes in all GSR/BVP/temperature
responses. In k-fold cross-validation, the original sample is randomly partitioned into k
equal-sized sub-samples; of these k sub-samples, a single sub-sample is retained as the
validation data for testing the model, and the remaining (k – 1) sub-samples are used as the
training data (Abu-Nimeh et al., 2007). The cross-validation process is then repeated k times
(the folds), with each of the k sub-samples used exactly once as the validation data.
The k results from the folds can then be averaged (or otherwise combined) to produce a
single estimate. In this study, the original data was first
divided into 10 equal subsets. Sequentially, one subset was tested using the classifier
trained on the remaining subsets. This process was repeated until every instance had been
used exactly once for testing. The overall success rate of a classifier is then evaluated as
the number of correct classifications divided by the total number of feature sets tested:
\[
\text{Accuracy rate} = \frac{\text{correct classifications}}{\text{total number of feature sets tested}} \tag{2}
\]
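The k-fold procedure and the accuracy rate of Eq. (2) can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used in the study: the function names, the seed parameter, and the placeholder majority-class learner are assumptions, and the labels are dummy data.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Randomly partition range(n_samples) into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, train, k=10, seed=0):
    """Return the accuracy rate of Eq. (2), pooled over all k folds.

    `train(X, y)` must return a function mapping one feature vector
    to a predicted label.
    """
    correct = 0
    for fold in k_fold_indices(len(features), k, seed):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(features) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        predict = train(train_x, train_y)
        # Every sample is tested exactly once, by a model that never saw it.
        correct += sum(predict(features[i]) == labels[i] for i in fold)
    return correct / len(features)


def train_majority(X, y):
    """Placeholder learner: always predicts the most frequent training label."""
    majority = max(set(y), key=y.count)
    return lambda x: majority


# Dummy data: 15 "stress" and 5 "calmness" feature sets.
dummy_features = [[i] for i in range(20)]
dummy_labels = ["stress"] * 15 + ["calmness"] * 5
accuracy = cross_validate(dummy_features, dummy_labels, train_majority, k=10)
print(accuracy)  # 0.75
```

Any learner with the same train(X, y) interface can be substituted for the placeholder.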
The features considered were the mean, minimum, maximum, and standard deviation of skin
conductance and peak height; the total number; and the cumulative amplitude, rising time, and
energy of startle responses in a segment. These features were found useful in earlier studies.
Naïve Bayes classifiers are based on probability models that incorporate class-conditional
independence assumptions (Quattoni et al., 2004). We estimate the probabilities
that an object from each class will fall in every cell of the discrete variables (every
possible discrete value of the vector variable X), and then we employ Bayes' theorem to
produce a classification. This technique computes the conditional probabilities of the
different classes given the attribute values of an unknown sample, and the
classifier then predicts that the sample belongs to the class with the maximum
posterior probability. If an instance is represented by an n-dimensional feature vector
(x1, x2, …, xn), a sample is assigned to a class c from a set of possible classes C
according to the maximum a posteriori (MAP) decision rule mentioned in Chapter 3:
\[
\operatorname{classify}(x_1, x_2, \ldots, x_n) = \arg\max_{c \in C} \; p(C = c) \prod_{i=1}^{n} p(x_i \mid C = c) \tag{3}
\]
The conditional probability in the above equation is obtained from estimates of the
probability mass function using the training data. Even though the independence
assumption may not be a realistic model of the probabilities involved, it may still permit
relatively accurate classification performance.
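The MAP rule of Eq. (3) can be sketched for discrete features as follows. This is a minimal illustration under stated assumptions, not the classifier used in the study: the function name train_naive_bayes, the Laplace smoothing constant alpha, the "low"/"high" discretization, and the four training samples are all hypothetical. Log-probabilities are summed instead of multiplying probabilities, which leaves the argmax unchanged and avoids numerical underflow.

```python
import math
from collections import Counter


def train_naive_bayes(X, y, alpha=1.0):
    """Estimate p(C=c) and p(x_i | C=c) from discrete feature vectors.

    alpha is a Laplace smoothing constant (an assumption, not from the text).
    Returns a function that applies the MAP rule of Eq. (3).
    """
    n_features = len(X[0])
    class_counts = Counter(y)
    # value_counts[c][i][v] = number of class-c samples with feature i equal to v
    value_counts = {c: [Counter() for _ in range(n_features)] for c in class_counts}
    values = [set() for _ in range(n_features)]
    for x, c in zip(X, y):
        for i, v in enumerate(x):
            value_counts[c][i][v] += 1
            values[i].add(v)

    def classify(x):
        best_class, best_score = None, -math.inf
        for c, n_c in class_counts.items():
            score = math.log(n_c / len(y))  # log prior, log p(C=c)
            for i, v in enumerate(x):
                num = value_counts[c][i][v] + alpha
                den = n_c + alpha * len(values[i])
                score += math.log(num / den)  # log p(x_i | C=c)
            if score > best_score:
                best_class, best_score = c, score
        return best_class

    return classify


# Hypothetical discretized (GSR, BVP) feature vectors, for illustration only.
train_x = [("low", "high"), ("low", "high"), ("high", "low"), ("high", "low")]
train_y = ["stress", "stress", "calmness", "calmness"]
classify = train_naive_bayes(train_x, train_y)
print(classify(("low", "high")))  # stress
```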
4.3.4 Observations
In this section, the results from all three experiments are discussed. High-arousal
situations and emotions, such as horror and melancholy, were easy to
identify, whereas lower-arousal emotions, such as joy and sadness, were barely
distinguishable. The present work is an attempt towards affective communication and
aims to identify methods for achieving it. A drawback of these experiments is
that they are not based on natural (real) emotional states; only induced
emotions were observed and analyzed. Another important factor is that
emotional responses depend on the regulation capability of the
individual. The signals from the experimental subjects were gathered and various features
were extracted. The prediction performance was evaluated using 10-fold cross-validation:
10 samples were pulled out as the test samples, and the remaining samples were used to train
the classifiers. The objective was to develop and train a system that accepts the various
physiological variables as input and predicts the participant's affective state. A few
examples of the signal variation are shown below:
GSR Variation
Fig. 4.3: Variation in GSR
BVP Variations
Fig. 4.4: Variation in Blood volume Pulse (BVP)
Temperature Variation
Fig. 4.5: Variation in Temperature
4.4 CONCLUSIONS
The results of the experiments show a promising correlation between emotional
tension and the monitored physiological signals. The tests performed with the classifiers
recognized the user's emotional states on the basis of the features extracted from the
physiological signals. These results show that, under controlled conditions,
the simultaneous monitoring and processing of three physiological signals
(BVP, GSR, and ST) was successful. This work corresponds to data collected in
controlled laboratory settings. However, the controlled laboratory setting is not
suitable for mobile emotion monitoring, because physical activity affects the
measured physiological signals. The automated induction of an accurate physiological
response was followed by the prediction models. For
all three parameters, the prediction accuracy levels were surprisingly high. The
physiological responses follow directly from changes in affect and thus can be used as
key predictors of an affective state. Although biofeedback devices can be used to
obtain actual physiological signals, it may be impractical to require users to wear
biofeedback equipment and then deploy additional hardware with the applications.