mathematical models for predicting cognitive workload · performance deterioration ( mazaeva et...
TRANSCRIPT
1
Mathematical Models for Predicting Cognitive Workload
1,2
K. Mohanavelu, 1R. Vishnupriya,
1S. Poonguzhali,
3K. Adalarasu,
4,*N. Nathiya
1Department of Electronics and Communication Engineering, College of Engineering,
Guindy, Anna University, Chennai, Tamil Nadu, India 2Defence Bio-engineering and Electromedical Laboratory (DEBEL), DRDO, Bangalore,
Karnataka, India 3School of Electrical and Electronics Engineering, SASTRA University, Thanjavur, Tamil
Nadu, India 4Division of Mathematics, School of Advanced Sciences, VIT University, Chennai, Tamil
Nadu, India *Corresponding Author E-mail: [email protected]
ABSTRACT
Cognitive workload estimation manifests in the form of the quantum of mental resources a subject has to
cater to fulfill the assigned tasks. Prior art technologies, methods and studies have laid emphasis on the
questionnaire (psychological) and physiological (bio-potentials like electrocardiogram (ECG),
electroencephalogram (EEG)) data based strategies which provided optimal results. In this paper a review
of the approaches in designing a mathematical model to link these psychological and physiological data to
examine whether we can achieve a positive correlation in estimation of different levels of cognitive
engagements is attempted. This approach paves a way for the researches and developers to estimate the
cognitive workload in a quantitative platform.
Keywords: Workload; Electroencephalogram (EEG); Feature Extraction; Classification.
INTRODUCTION
Workload can be defined as the total demand imposed on a person while performing any task
(Keller, 2002). Excess workload disturbs human ability to process information, thereby leading to
performance deterioration (Mazaeva et al., 2001). The objective of mental workload analysis is to
classify workload into various levels. Cognitive workload, which is related to the quantum of
mental resources demanded while fulfilling assigned tasks, is classified into intrinsic, germane
and extraneous workload. Among these, intrinsic load depends on the task complexity and the
person’s capacity to perform the task. Germane load arises when a learner acquires a new concept
which is challenging. Improper conception and delivery of information disrupts the learner’s
attention and results in extraneous load (Antonenko et al., 2010). Various methods have been
proposed for classifying workload namely, subjective rating based methods, task performance
based methods and physiological response based methods (Haapalainen et al., 2010; Ganesh et
al., 2011).
Commonly used subjective assessment technique is NASA TLX where workload can be
indicated in numerical or graphical scale. The overall workload in NASA TLX is categorized into
six scales namely, temporal demand, physical demand, mental demand, effort, performance and
International Journal of Pure and Applied MathematicsVolume 118 No. 18 2018, 3655-3672ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version)url: http://www.ijpam.euSpecial Issue ijpam.eu
3655
2
frustration (Cao et al., 2009). Task performance based methods are less commonly used. For
example, primary task measurements evaluate speed and accuracy; at the same time secondary
task measurements incorporate tasks as time estimation and memory search (Mazaeva et al.,
2001). Physiological signals are preferred because continuous measurement provides accurate
perception and in-depth knowledge on the effects of cognitive workload (Antonenko et al., 2010;
Geethanjali et al., 2016). Such physiological signals are Electrocardiogram (ECG),
Electroencephalogram (EEG), Electromyogram (EMG), Galvanic Skin Response (GSR), Skin
Temperature (ST), Blood Pressure (BP), Blood volume pulse (BVP) and respiratory rate
(Karthikeyan et al., 2011). Functional Magnetic Resonance Imaging (fMRI) and Positron
Emission Tomography (PET) are the recent physiological techniques. Fundamental drawback of
these imaging techniques is that the person is subjected to lie in a restricted posture (Antonenko
et al., 2010). However, cognitive workload can be accurately and non-invasively measured by
EEG. Thus EEG is one of the suitable physiological signals to monitor cognitive workload, as is
validated further.
In the literature, many researchers have examined physical and mental workload using
questionnaire method and physiological signals. This paper concentrates mainly on the
measurement of mental demand using EEG signal. To evaluate and predict cognitive workload
using questionnaire method or physiological signals, a huge data base has to be referred every
time. Thus a mathematical model can be built, where all the attributes from questionnaire and
physiological signals are fed as input. The model is trained with the attributes to classify the
workloads into low, medium and high workload levels. The aim of this paper is to discuss various
models available to classify different levels of workload and the comparative results provide
justification about the better model to be applied for cognitive workload classification.
2. MATHMETICAL MODELS
A model is used to inspect the effects of various components using mathematical formulae. In
general, mathematical models are constructed by mathematical equations, logical or
computational structures to describe a physical system (Bellomo et al., 2007). Mathematical
models may have different forms such as algebraic, probabilistic, differential equation etc. And
there are different types of models namely deterministic, stochastic, static, dynamic, continuous,
discrete, individual, structured, qualitative, quantitative, mechanistic and statistical models
(Imboden et al., 2013).
International Journal of Pure and Applied Mathematics Special Issue
3656
3
Figure 1. Generic architecture to build a mathematical model.
Mathematical models aid in prediction and provide solutions for different real life problems.
Thus, they have so many imperative applications. The areas of applications are various, like
anthropology, architecture, artificial intelligence, animations, astronomy, biology, chemistry,
forensic and cyber sciences, computer science, economics, finance, mechanical engineering,
neuroscience etc., but not limited to these. The architecture of building a mathematical model is
shown in Figure1.
The following chapter describes about the mathematical model for biological signals and
cognitive workload and the appropriateness of utilising the model in different scenarios is
discussed with example.
3. MATHEMATICAL MODELS FOR COGNITIVE WORKLOAD Fisher’s Linear Discriminant Model
Linear discriminant analysis is a classification method proposed by Fisher. The basic concept in
the back of Fisher’s Linear Discriminant (FLD) model is to project the data onto one dimensional
line. During projection the distance between the modes of the two classes is maximum, while
minimizing the variance within each class. It is used in the field of statistics, machine learning,
pattern recognition etc.
Consider, there are two different classes with samples x1 = {x11,. . ., xL1
1 } and x2 = {x1
2, . . ., xL2
2}.
The objective of FLD model is to maximize J(w) as given in Eq. (1),
International Journal of Pure and Applied Mathematics Special Issue
3657
4
wSw
wSw=J(w)
w
T
B
T
(1)
TB mmmmS 2121 (2)
Tii xx
iw mxmxSi
2,1
(3)
where, SB denotes ‘between class’ scatter matrix (Eq. (2)), Sw denotes ‘within classes’ scatter
matrix (Eq. (3)) and mi is given in Eq. (4).
il
j
i
j
i
i xl
m1
1 (4)
J(w) is maximized in order to predict the direction which maximizes the projected class
means and minimizes the classes variance (Mikat et al., 1999; Welling et al., 2008).
The main advantage of FLD model is that it saves memory and time. But they are best fit for
groups that are linearly separable. FLD model is largely preferred when the samples are small.
The discrimination between classes is poor when the overlap is large.
Application of FLD Analysis to Classify Mental Workload
Zongmin et al. (2013) established a model that predicts the mental workload of aircraft cockpit
display interface using Bayesian Fisher’s discrimination and classification method. Subjects were
requested to perform different flight simulation tasks inside the flight simulator. Heart Rate, R-R
interval and Heart Rate Variability (HRV) were taken as the indexes to reflect various levels of
workload. Three linear equations were constructed in this model namely y1, y2 and y3 with
parameters x1, x2, x3 and x4. Let x1 indicates standard deviation from the normal R-R interval
(SDNN) and x2 indicates the NASA_TLX score, x3 indicates the reaction time of performing
abnormal information, while processing abnormal operation in flight and x4 indicates the
accuracy operation. The values of y1, y2 and y3 change in accordance with x1, x2, x3 and x4. A high
value of y1 implies that the participants were at low workload. Similarly, a high value of y2
implies moderate workload and a high value of y3 implies high workload. The model was
validated by original validation and cross validation methods. The two methods that are original
check and cross validation are validated and the average discrimination is reported as 89.58% and
prediction accuracies as 85.42%.
Alexandre et al. (2006) presented a paper to analyze the speech/music classification based on
Fisher’s linear discriminant application. The features, viz., timbre-related, rhythm-related and
pitch-related was taken for classification. K-nearest neighbour (K-NN) and Fisher’s linear
discriminant algorithm were used for classification. The analysis of speech/music based on
Fisher’s linear discriminant application provided very good classification results.
Zhang et al. (2007) proposed a paper where the Fisher’s discriminant analysis was used to
International Journal of Pure and Applied Mathematics Special Issue
3658
5
analyse the attention deflects/hyperactivity disorder in resting-state of the brain. PC-FDA model
was used for classification. Initially, the dimension of feature space is reduced by the principal
compound analysis after which, FDA was achievable. The classifier performance was estimated
by leave one out cross validation method. PC-FDA showed 85% accuracy in classification.
Dornhege et al. (2007) propounded a paper to detect mental workload in drivers during real
traffic conditions. Two experimental setup had been tested, i.e., an auditory workload scheme and
mental calculation task. FLDA was used for classification. The classifier output is a scalar value;
it represents the low workload (values below zero) or high workload (values above zero) degrees
that have been estimated. It consists of two thresholds, ml < mh, with the threshold switching to
high workload, if the output exceeds mh. If the output falls below the lower value, i.e., ml, the
threshold would be switching to a low workload. The values ml and mh are subject and task
specific so these values were calibrated on the training data. Cross-validation method was used to
forecast the most relevant parameter for workload detection. Greater than 90% classification
accuracies were achieved by the best performing subject for both the classification and the
auditory task.
Linear Regression Model
Linear regression model is used for modelling the linear relationship between two continuous
variables (Yan & Su, 2009). The field of application of linear regression model are astronomy,
statistics, business, share market, finance, biology, etc.
In linear regression model there are two types of variables namely dependent variable,
indicated by y and independent variable denoted by x. Simple linear regression model contains
only one independent variable.
The simple linear regression model is represented by Eq. (5)
xy 10 (5)
where β0 is y intercept, β1 is the slope of the regression line and ε is the random error.
The second type of linear regression model named as multiple linear regression, which has
one dependant variable and many independent variables, is represented as follows (Eq. (6)).
y ppxx110 (6)
Where, y is dependent variable, β0, β1, …, βp are regression coefficients, x1, x2, …, xp are
independent variables in the model.
Linear regression model shows optimal results when the relationship between dependent
variable and independent variable are linear. The results are poor for non-linear relationships.
Application of Linear Regression Analysis to Classify Mental Workload
Liang et al. (2005) proposed a paper depending on driving performance and EEG power spectrum
analysis for monitoring the driver’s alertness. The power spectrum of electroencephalogram
International Journal of Pure and Applied Mathematics Special Issue
3659
6
(EEG) log sub band, correlation analysis, principal component analysis and linear regression
models had been used in combination, to predict driver’s sleepiness level indirectly in a virtual
reality based driving simulator. Using the available information of EEG power spectrum, to
calculate the subject’s driving performance, a 50-order linear regression model with least square
cost function was created. Correlation coefficient with highest value was taken as a factor for
selecting EEG channel. Two EEG channels with highest values were selected. Among the 10
sessions, the cross-session estimation amounted to 0.82 ± 0.07 in mean correlation coefficient and
for within- session estimation amounted to 0.85 ± 0.11 in the mean correlation coefficient
between actual driving performance time series. These results proved that a small number of EEG
channels are sufficient for the continuous estimation of EEG-based driving performance, and the
minute to minute operator alertness changes could be measured accurately.
Stikic et al. (2011) presented a research which is based on EEG to calculate the performance
of present and future cognitive abilities, using the reaction time of both single and combined
modes and also based on accuracy performance metric. The least squares approach was used for
training the models. Two distinctive approaches were used to evaluate the trained linear
regression models, by auto-validation and cross-validation. The prevailing performance model
the results were obtained rather high (sensitivity of 78%-85%, positive predictive value of 57%-
73%, specificity of 61%-92%, and negative predictive value of 75%-98%), after grouping the
discovered and estimated F-measure into classes of acceptable and unacceptable performance.
The future performance model turned into capable of group performance into acceptable and
unacceptable performance. The acquired results had been in a similar range (sensitivity of 69%-
85%, positive predictive value of 57%-70%, specificity of 58%-82%, and negative predictive
value of 74%-90%). From these result it is stated, the EEG data is effectively used to predict
future performance.
Linear Mixed Model
Linear mixed models (LMM) or mixed effects models seem to be the extensions of linear
regression models containing fixed as well as random effects. They compute the relationship
between continuous dependent variable and independent variables. This model is useful when
repeated measurements are taken on the subjects under different conditions. This model is
suitable, when the observed data contains more than one level of variation. LMM is widely used
in image processing, bio statistics (Gibbons et al., 2003) and vegetation classification (Tateishi et
al., 2004).
LMM contains two component namely fixed and random effects. Fixed effects represents the
effects in which all possible levels are present whereas, random effects contains level which
represent random draws from large population. Fixed effects are represented by constant
International Journal of Pure and Applied Mathematics Special Issue
3660
7
parameters. In contrast to the fixed effect, random effects are represented by random variables
which may follow normal distribution (Foulkes et al., 2010).
The standard form of a linear mixed-effects model is given by Eq. (7),
ZbXy (7)
where, y is a n x1 continuous response vector; n is the number of elements; X is an n x p
fixed effect design matrix; β is an p x 1 fixed effect vector; Z is an n x q random effects design
matrix; b is an q x 1 random effects vector; ε is an n x 1 vector of n residuals.
Assume that the errors b and ε are distributed independently mean zero and D and R are
variances respectively (Cnaan et al., 1997) (Eq. (8) and Eq. (9)).
XyE (8)
RZDZy Tvar (9)
The main advantages of LMM are, the model has standard errors and it provides estimates of
random effects with which sub-units can be classified. The disadvantage of LMM is that, more
distributional assumptions should be made (Maruyama, 2008).
Auto Regressive Model
In auto regressive (AR) model the present value of series Xt, is defined by a function of p past
values Xt-1, Xt-2, …, Xt-p. Auto regressive model is used to predict future samples with the previous
samples. AR model is quite simple however effective approach for time series characterization.
The order of the model was determined from the range of preceding observations and the data is
used to estimate the weights which act as the parameters of the model this in turn uniquely
characterize the time series. AR model is used for signal analysis, pattern discrimination etc.
The benchmark model, which also happens to be one of the simplest models, is concise with
the autoregressive process of order p (AR(p)) (Bontempi, 2013). An auto regressive process ϕt of
order n is given by Eq. (10),
tntntt W 11 (10)
This stated that the next value is a linear weighted sum of the past n values and a random
shock. This seems to be like a linear regression model where ϕ is regressed on its past values. On
that account, this model is prefixed with ‘auto’. Multivariate autoregressive models extend to
multiple time series, the current values of the vector of all variables will be modelled as a linear
sum of previous values. Simple time series first order auto regressive model is explained in this
section.
The series ϕt (Eq. (11)) is AR(1) only if it satisfies the Eq. (10),
ttt w 1 (11)
where, wt is zero mean white noise. By substitution ϕt-1 in Eq. (10), it can be written by Eq.
International Journal of Pure and Applied Mathematics Special Issue
3661
8
(12) and Eq. (13)
.22
112 ttttttt wwwww (12)
)1(]var[
0][
422
w
E
t
t
(13)
Then if |α| < 1 the variance is finite and the variance (Eq. 14) is equals to
)1(][
2
22
w
Vart (14)
And the autocorrelation is written by Eq. (15)
21,0)( kkk (15)
Estimation of AR(n) Parameters
For t observations in Eq. (9), the parameters of AR(n) model are estimated by least square
minimizing method (Eq. (16)),
2
11
1
][minargˆntnt
T
nt
t
(16)
The multiple least square problems can be solved in matrix form representation (Eq. (17) and
Eq. (18)).
Y = X α (17)
where,
1
1
11
232
121
,
n
T
T
nn
nTTT
nTTT
YX
(18)
In AR model, output forecasting is based on past outputs and the main drawback is, it does
not consider past disturbances (Ganesh et al., 2011).
Application of AR model to classify mental workload
Zhao et al. (2010) proposed a system for estimating driver’s mental fatigue based of EEG in a
visual reality based stimulator. The features from frontal, central and occipital lobes were
expressed using Multivariate auto regressive (MVAR) model. The technique Akaike's
information criterion (AIC) was detected as the most adequate to determine the order of AR
model. Followed to AIC, the MVAR order became decided on as 3. Three levels of mental
fatigue were identified by combined support vector machine (SVM) and kernel principal
compound analysis (KPCA). The three state classification accuracy of MVAR is 81.64%.
Saidatul et al. (2011) presented a paper based on the AR modelling techniques the EEG
signals were analyzed in the condition of relaxation and mental stress. This comparative study
International Journal of Pure and Applied Mathematics Special Issue
3662
9
measures the power spectral density (PSD) of EEG signals during two different states namely
resting state and mental stress condition. The fast Fourier transform (FFT) by Welch’s method
was used to calculate the power density spectra, Yule–Walker and Burg’s method was used to
estimate the auto regressive (AR) method. To classify relax and stressed conditions neural
network classifier was used. It is stated that maximum classification accuracy of 91.17% was
obtained for the Burg Method, while comparing the Yule Walker and Welch Method technique.
Sun et al. (2014) proposed a fatigue classification method where to extract the highly
discriminative functional connectivity information multivariate pattern analysis technique was
adopted. Overall accuracy of 81.5% (p<0.0001) was achieved from this algorithm.
Anderson et al. (2011) presented a paper based on classification of spontaneous EEG signal
during arithmetic task using AR modelling. EEG was processed by four different representations,
i.e., (i) scalar AR model coefficients, (ii) multivariate AR coefficients, (iii) Eigen values of a
correlation matrix, and the (iv) Karhunen–Lo`eve transform of the multivariate AR coefficients.
Feature vectors defined by these representations were classified with a standard, feed forward
neural network trained via the error back propagation algorithm. Similar results from these four
representations proved that, compared to EEG signals which are untrained and novel, the
multivariate AR coefficients are better to some extend with 91.4% average classification
accuracy.
Hierarchical Bayes Model
Hierarchical Bayes model is the combination of sub-groups and Bayes theorem. A hierarchical
model can be separated into different sub-models. Sub-models combine together to form the
hierarchical model. To integrate all the sub-models Bayes theorem is used. Hierarchical models
are best fit for data structured in groups, for example, demographically, spatially, temporally.
Each group of data’s is represented by different parameters. Regressions contain slope and
intercept. In hierarchical regressions, each group has different intercepts and slopes. Hierarchical
modeling is applied for observational units with different levels. The hierarchical form of analysis
and used for understanding multi parameter problems and also plays an important role in
developing computational strategies. Hierarchical Bayes model are used in marketing field,
ecological data analysis etc.
There are equal advantages and disadvantages in this model. The model is used to predict
future from the past information. It follows likelihood principles. The limitation in this model is,
there are no definite rules to select prior distribution.
Bayes’ Theorem
Bayesian methods are used to analyze hierarchical models. The posterior probability can be
International Journal of Pure and Applied Mathematics Special Issue
3663
10
formed from the conditional probability. The product of prior distribution p(θ) and sampling
distribution p (y | θ) is denoted as condition probability. The posterior distribution is given by Eq.
(18)
)(
|
)(
,|
yP
PyP
yP
yPyP
(18)
From the above equation it can be inferred that, Bayes’ theorem shows conditional
probability and individual event’s relationship.
Let y be the data, θ is the process and ϕ denotes the parameter, the Bayesian hierarchical
model is defined using a series of steps:
1. Data model: ],|[ y
2. Process model: ]|[
3. Parameter model: ][
In simple terms the posterior distribution is written by Eq. (19)
][]|[]|[]|[ yyP (19)
Posterior distribution is defined as the product of the likelihood ratio and the prior
distribution (Penny et al., 2007; Bernardo et al., 1983).
Application of Bayes model to classify mental workload
Wang et al. (2011) proposed a method for cross-subject workload based on hierarchical Bayes
model classification. Compared to other models the classifiers neural networks and support vector
machine having the advantages of modelling the inter relationship, incorporating the prior
knowledge as well as accounting for unreliability in data, through probabilistic theory. Models
are trained and tested using fivefold cross-validation setup. Results show that hierarchical Bayes
model has greater accuracy as single neural network and single native Bayes model. Instead of
using multiple classifiers for multiple subjects this one classifier can be used for multiple
subjects.
Hidden Markov Model
The Hidden Markov Model (HMM) seems to be a substantial statistical model for classifying
time varying (temporal) sequences (Blunsom, 2004). A hidden Markov model has a specific state
which is called a receiving state/accepting state/find state. Once the machine reaches this state it
cannot come out of this state. While it is in that accepting state it emits only one visible state.
HMM are well known for speech recognition and also used in the field of bioinformatics,
chemistry, genetic engineering, machine learning, finance etc.
The general definition of a HMM is given by Eq. (20):
),,( BA (20)
International Journal of Pure and Applied Mathematics Special Issue
3664
11
)|(],[ 1 itjtijij sqsqPaaA
)|()()],([ itkt sqvxPkbikbiB
where A is a transition array, which contains the probability of state j following state i.
Transition probabilities are time independent. B is the observation array, which has the
probability of observation k being produced from the state j.
Initial probability array is represented as π:
)(],[ 1 isqPii
The algorithm of HMM is of three stages namely evaluation, decoding and learning stage.
Each stage is explained below.
Evaluation
In evaluation stage, the probability of visible sequence generated by the model is to be found.
Forward algorithm is used for generating visible O from the given model λ.
The forward probability variable is given by Eq. (21),
)|,,()( 21 ittt sqoooPi (21)
The forward algorithm is as follows:
1. Initialisation:
NiObi ii 1),()( 11
2. Induction:
NjTtobaij tj
N
j
ijtt
1,11),()()( 1
1
1
3. Termination:
)()|(1
iOPN
i
T
The induction stage is very important in forward algorithm. For each state Sj, until time t, the
probability arriving to the particular state is observed and stored in αj(t).
Decoding
The most probable sequence of hidden states is discovered using decoding algorithm. To detect
the best state sequence for an observation sequence, Viterbi algorithm is used. This algorithm is
similar to forward algorithm, instead of summing each transition probabilities, they are
maximised at each step.
)|,,,(max)( 2121, 1,21
titqqq
t ooosqqqPit
International Journal of Pure and Applied Mathematics Special Issue
3665
12
The Viterbi algorithm is as follows:
1. Initialisation:
0)1(,1),()( 111 Niobi ii
2. Recursion:
,1,2),()(max)( 11
NjTtobaij tjijtNi
t
NjTtait ijtNi
t
1,2,)(maxarg)( 11
3. Termination:
)(max1
* iP TNi
)(maxarg1
* iqT T
Ni
4. Optimal state sequence backtracking:
.1,2,1,*
11
* TTtqqt tt
The maximum state are stored and used as a back pointer. Backtracking technique is used for
finding the best state sequence from the back pointers stored in recursion stage, but there is no
easy way to find the second best state sequence.
Learning
After the decoding stage, the model parameters λ = (A, B, π) are to be estimated. Supervised
training and unsupervised training are considered to be the two standard techniques for learning
stage. Supervised learning is used in cases where both inputs and outputs are known. Here, inputs
are equated as observations and outputs as states. Unsupervised training is used in cases where
training set contains input data alone.
Baum-Welch algorithm is used for supervised learning. The algorithm is as follows:
Maximum Likelihood Estimates (MLE) is used to determine the model parameters λ. For the
transition matrix we use Eq. (22):
)(
,|
i
ji
jiijtCount
ttCountttPa
(22)
where Count(ti, tj) is the number of times tj followed ti in the training data. For the
observation matrix (Eq. (23)):
)Count(t
)t,Count(W=t|wP=b
i
jk
jkj(k) (23)
Where Count(wk, tj) is the number of times wk was tagged tj in the training data. And for the
initial probability distribution (Eq. (24))
International Journal of Pure and Applied Mathematics Special Issue
3666
13
)(
)()(
1
11
qCount
tqCounttqP iii
(24)
In order to avoid zero counts during the estimation of HMM and to improve the performance
of the model, smoothing technique can be used. HMM is an effective learning algorithm and can
handle variables of different lengths but they do not explain the dependencies among hidden
states.
Application of HMM to classify EEG signals
Obermaier et al. (2001) presented a system to classify left and right hand movements during
imagination from single EEG trail using HMM classifier. Zhong & Ghosh (2002) suggested on
HMM and coupled HMM in the classification of multi-channel EEG. Although HMM can be
used for classifying EEG they are well known speech recognition (Rabiner, 1989). Lotte et al.
(2007) stated that HMM classifer would be suitable for high dimensionality feature vector and
features containing time information and also he added that LDA can be used for small data set.
4. SUMMARY
The different types of models, parameters and the levels of workload are summarized in Table 1.
The advantages and limitations of the models are depicted in Table 2.
Based on the usage of mathematical models in various applications as per (Figure 2), it is
suggested that FLDA which reduces the size of the feature vector and Bayes model which uses
probability distribution to predict uncertainties are better in their performance when compared to
other mathematical models. However HMM shall also be explored to estimate cognitive
workload as it is predominately used for EEG signal characterization and classification.
Table 1. Different types of model, parameters used in the corresponding model, and different
levels of workload.
Types of Model Parameters / Attributes Levels of Classification
Fisher Linear Discriminant
Analysis (FLDA)
Between class matrix SB
Within class matrix Sw
Vector matrix J(w)
Low
Moderate
High
Linear Regression Model β0 , y intercept
β1, the slope of the regression line
ε, the random error
Alertness
Linear Mixed Model X, fixed effect design matrix.
β, fixed effect vector.
Z, random effects design matrix.
b, random effects vector.
Low level workload
High level workload
International Journal of Pure and Applied Mathematics Special Issue
3667
14
ε, vector of n residuals.
Auto Regressive Model with
neural network classifier
Past values Xt-1, Xt-2,….Xt-p
wt is zero mean white noise
Relax
Mental stress
Hierarchical Bayes Model P(θ|y) Posterior distribution
[ϕ], Prior distribution
[y|θ],[θ|ϕ], Likelihood ratio
Low
Medium
High
Hidden Markov Model [aij], Hidden state transition probability
[bjk], Visible state transition probability
P(O|λ), Posterior probability
EEG signal classification
during right and left
imaginary movement
Table 2. Advantages and limitations of cognitive workload models.
Types of Models Advantages Limitations
Fisher Linear Discriminant
Analysis
Saves memory and time. Suitable for linearly separable
groups.
Sample size should be small.
Linear Regression Model When there is a linear
relationship between
dependent variable and
independent variable, the
results are optimal.
Poor performance when the
relationships are non-linear.
Linear Mixed Model Defined with standard
errors.
More distributional assumptions
should be made.
Auto Regressive Model with
Neural Network Classifier
Future variables can be
predicted from past
outputs.
Past disturbances are not taken into
account.
Hierarchical Bayes Model It obeys likelihood
principle
It is used to predict future
from past samples.
There are no definite rules to select
prior distribution.
Hidden Markov Model They can handle large set
of samples.
Dependencies among hidden states
are unknown.
Figure 2. Ratio chart for various applications of mathematical model (FLDA – Fisher Linear
Discriminant Analysis; LR – Linear Regression; LMM – Linear Mixed Model; AR – Auto
Regression; Bayes - Hierarchical Bayes Model; HMM - Hidden Markov Model).
International Journal of Pure and Applied Mathematics Special Issue
3668
15
CONCLUSION
The aim of this review is to highlight the significant models available to estimate the cognitive
workload. The above discussion gives a clear comparison among different features of the EEG
signal and methods used for predicting cognitive workload. The comparative study of this paper
enables researchers to develop a workload model with higher gain accuracy.
REFERENCES
Keller, J. (2002). Human performance modelling for discrete-event simulation: workload.
Proceedings of the 2002 Winter Simulation Conference, 157-162.
https://doi.org/10.1109/WSC.2002.1172879.
Mazaeva, N., Ntuen, C. and Lebby, G. (2001). Self-organizing map (SOM) model for mental
workload classification. Proceedings of IFSA World Congress and 20th NAFIPS
International Conference Joint 9th. 1822-1825.
https://doi.org/10.1109/NAFIPS.2001.943829.
Antonenko, P., Paas, F., Grabner, R. and Gog, TV. (2010). Using electroencephalography to
measure cognitive load. Educational Psychology Review, 22(4), 425-438.
https://doi.org/10.1007/s10648-010-9130-y.
Karthikeyan, P., Murugappan, M. and Yaacob, S. (2011). A review on stress inducement stimuli
for assessing human stress using physiological signals. Proceedings of IEEE 7th
International Colloquium on Signal Processing and its Applications, 420-425.
Mikat, S., Fitscht, G., Weston, J., Scholkopft, B. and Mullert, K-R. (1999). Fisher discriminant
analysis with kernels. Neural Networks for Signal Processing IX, 1999. Proceedings of
the 1999 IEEE Signal Processing Society Workshop, 41-48.
https://doi.org/10.1109/NNSP.1999.788121.
Welling, M. (2008). Fisher linear discriminant analysis. Max Welling's class notes in Machine
learning, 16(7), 817-830.
Yan, X. and Su, XG. (2009). Linear regression analysis: theory and computing. World Scientific
Publication, 5 Toh Tuck Link, Singapore, 348.
Foulkes, AS. Azzoni, L., Li, X., Johnson, MA., Smith, C., Mounzer, K. and Montaner, LJ.
(2010). Prediction based classification for longitudinal biomarkers. Annals of Applied
Statistics, 4(3), 1476-1497. https://dx.doi.org/10.1214%2F10-AOAS326.
Bontempi, G. (2013). Machine learning strategies for time series prediction. Hammamet, pp. 128.
Penny, W., Friston, K., Ashburner, J., Kiebel, S. and Nichols, T. (2007). Statistical parametric
mapping: the analysis of functional brain images. Neuroscience, Elsevier Publications,
Academic Press, 656.
Bernardo, JM., Degroot, MH., Lindley, DV. and Smith, AFM. (1983). Bayesian statistics 2.
Proceedings of Second Valencia International Meeting, 778.
Blunsom, P. (2004). Hidden Markov Model. Technical Report, Available in
www.cs.mu.oz.au/460/2004/materials/hmm-tutorial.pdf.
Zongmin, W., Damin, Z., Xiaoru, W., Chen, L. and Huan, Z. (2013). A model for discrimination
and prediction of mental workload of aircraft cookpit display interface. Chinese
Journal of Aeronautics, 27(5), 1070-1077. https://doi.org/10.1016/j.cja.2014.09.002.
Alexandre, E., Rosa, M., Cuadra, L. and Gil-Pita, R. (2006). Application of fisher linear
discriminant analysis to speech/music classification. Proceedings of International
Conference on Computer as a Tool, EUROCON 2005, 1666-1669.
https://doi.org/10.1109/EURCON.2005.1630291.
Zhang, H., Zhu, Y., Maniyeri, J. and Guan, C. (2007). Fisher discriminative analysis of resting
state brain function for attention- deflict/hyperactivity disorder. Neuroimage, 40(1),
110-120. https://doi.org/10.1007/11566489_58.
Liang, SF., Lin, CT., Wu, RC., Chen, YC., Huang, TY. and Jung, TP. (2005). Monitoring driver’s
alertness based on the driving performance estimation and the EEG power spectrum
International Journal of Pure and Applied Mathematics Special Issue
3669
16
analysis. Proceedings of IEEE Engineering in Medicine and Biology 27th Annual
Conference, 6, 5738-5741. https://doi.org/10.1109/IEMBS.2005.1615791.
Dornhege, G., Millan, J.del R., Hinterberger, T., McFarland, DJ. and Muller, K-R (2007).
Improving human performance in a real operating environment through real-time
mental workload detection. Toward Brain-Computer Interfacing, 1, MIT Press, 409-
422.
Stikic, M., Johnson, RR., Levendowski, DJ., Popovic, DP., Olmstead, RE. and Berka, C. (2011).
EEG-derived estimators of present and future cognitive performance. Frontiers in
Human Neuroscience, 5(70), 1-13.
Zhao, C., Zheng, C., Zhao, M. and Liu, J. (2010). Classifying driving mental fatigue based on
multivariate autoregressive models and kernel learning Algorithms. Proceedings of
International Conference on Biomedical Engineering and Informatics (BMEI 2010),
2330-2334. https://doi.org/10.1109/BMEI.2010.5639579.
Saidatul1, A., Paulraj, MP., Yaacob, S. and Yusnita, MA. (2011). Analysis of EEG signals during
relaxation and mental stress condition using AR modeling techniques. Proceedings of
IEEE International Conference on Control System, Computing and Engineering, 477-
481. https://doi.org/10.1109/ICCSCE.2011.6190573.
Sun, Y., Lim, J., Meng, J., Kwok, K., Thakor, N. and Bezerianos, A. (2014). Discriminative
analysis of brain functional connectivity patterns for mental fatigue classification.
Annals of Biomedical Engineering, 40(10), 2084-2094. https://doi.org/10.1007/s10439-
014-1059-8.
Anderson, CW., Stolz, V. and Shamsunder, S. (2011). Multivariate autoregressive models for
classification of spontaneous electroencephalographic signals during mental tasks.
IEEE Transactions on Biomedical Engineering, 45(3), 277-286.
https://doi.org/10.1109/10.661153.
Wang, Z., Hope, RM., Wang, Z., ji, Q. and Gray, WD. (2011). Cross subject workload
classification with hierarchical Bayes model. Neuroimage, 59(1), 64-69.
https://doi.org/10.1016/j.neuroimage.2011.07.094
Obermaier, B., Guger, C., Neuper, C. and Pfurtscheller, G. (2001). Hidden Markov models for
online classification of single trail EEG data. Pattern Recognition Letters, 22(12),
1299-1309. https://doi.org/10.1016/S0167-8655(01)00075-7.
Zhong, S. and Ghosh, J. (2002). HMMs and coupled HMMs for multi-channel EEG
classification. Proceedings of the 2002 International Joint Conference on Neural
Networks, IJCNN '02, 1154-1159. https://doi.org/10.1109/IJCNN.2002.1007657.
Rabiner, LR. (1989). A tutorial on hidden Markov models and selected applications in speech
recognition. Proceedings of the IEEE, 77(2), 257-286. https://doi.org/10.1109/5.18626.
Lotte, F., Congedo, M., Lecuyer, A., Lamarche, F. and Arnaldi, B. (2007). A review of
classification algorithms for EEG-based brain–computer interfaces. Journal of Neural
Engineering, 4(2), R1-R13. https://doi.org/10.1088/1741-2560/4/2/R01.
Bellomo, N., Elena De Angelis, and Delitala, M. (2007). Lecture notes on mathematical
modelling in applied sciences.
Imboden, DM. and Pfenninger, S. (2013). Introduction to systems analysis mathematically
modelling natural systems. Springer-Verlag Berlin Heidelberg, 2013 edition, 252.
Haapalainen, E., SeungJun Kim, Forlizzi, JF. and Dey, AK. (2010). Psycho-physiological
measures for assessing cognitive load. Proceedings of the 12th ACM International
Conference on Ubiquitous Computing, 301-310.
https://doi.org/10.1145/1864349.1864395.
Cao, A., Chintamani, KK., Pandaya, AK. and Ellis, RD. (2009). NASA TLX: Software for
assessing subjective mental workload. The Psychonomic Society, 113-117.
https://doi.org/10.3758/BRM.41.1.113.
Gibbons, RD. and Hedeker, D. (2000). Applications of mixed-effects models in biostatistics. The
Indian Journal of Statistics, 62, 70-103.
International Journal of Pure and Applied Mathematics Special Issue
3670
17
Maruyama, N. (2008). Generalized linear models using trajectories estimated from a linear mixed
model. Division of Biostatistics, Kitasato University Graduate School of
Pharmaceutical sciences, 102-110.
Cnaan, A., Laird, NM. and Slasor, P. (1997). Using the general linear mixed model to analyze
unbalanced repeated measures and longitudinal data. Statistics in Medicine, 16, 2349-
2380. https://doi.org/10.1002/(SICI)1097-0258(19971030)16:20<2349::AID-
SIM667>3.0.CO;2-E.
Tateishi, R., Shimazakiand, Y. and Gunin, PD. (2004). Spectral and temporal linear mixing
model for vegetation classification. International Journal of Remote Sensing, 25, 4203-
4218. https://doi.org/10.1080/01431160410001680437.
Ganesh, UL., Krishna, M., Hari prasad, SA. and Krishna, SR. (2011). Review on models for
generalized predictive controller. Proceedings of International Conference on
Computer Science, Engineering and Applications, 418-424.
https://doi.org/10.5121/csit.2011.1237.
Geethanjali, B., Adalarasu, K., Jagannath, M. and Rajasekaran, R. (2016). Enhancement of task
performance aided by music. Current Science, 111(11), 1794-1801.
https://doi.org/10.18520/cs%2Fv111%2Fi11%2F1794-1801.
International Journal of Pure and Applied Mathematics Special Issue
3671
3672