
Mathematical Models for Predicting Cognitive Workload

1,2 K. Mohanavelu, 1 R. Vishnupriya, 1 S. Poonguzhali, 3 K. Adalarasu, 4,* N. Nathiya

1 Department of Electronics and Communication Engineering, College of Engineering, Guindy, Anna University, Chennai, Tamil Nadu, India

2 Defence Bio-engineering and Electromedical Laboratory (DEBEL), DRDO, Bangalore, Karnataka, India

3 School of Electrical and Electronics Engineering, SASTRA University, Thanjavur, Tamil Nadu, India

4 Division of Mathematics, School of Advanced Sciences, VIT University, Chennai, Tamil Nadu, India

* Corresponding Author E-mail: [email protected]

ABSTRACT

Cognitive workload reflects the quantum of mental resources a subject must devote to fulfil the assigned tasks. Prior technologies, methods and studies have emphasized questionnaire-based (psychological) and physiological (bio-potentials such as electrocardiogram (ECG) and electroencephalogram (EEG)) strategies, which provided optimal results. This paper reviews approaches to designing a mathematical model that links these psychological and physiological data, in order to examine whether a positive correlation can be achieved in estimating different levels of cognitive engagement. This approach paves the way for researchers and developers to estimate cognitive workload on a quantitative basis.

Keywords: Workload; Electroencephalogram (EEG); Feature Extraction; Classification.

INTRODUCTION

Workload can be defined as the total demand imposed on a person while performing any task

(Keller, 2002). Excess workload disturbs human ability to process information, thereby leading to

performance deterioration (Mazaeva et al., 2001). The objective of mental workload analysis is to

classify workload into various levels. Cognitive workload, which is related to the quantum of

mental resources demanded while fulfilling assigned tasks, is classified into intrinsic, germane

and extraneous workload. Among these, intrinsic load depends on the task complexity and the

person’s capacity to perform the task. Germane load arises when a learner acquires a new concept

which is challenging. Improper conception and delivery of information disrupts the learner’s

attention and results in extraneous load (Antonenko et al., 2010). Various methods have been

proposed for classifying workload namely, subjective rating based methods, task performance

based methods and physiological response based methods (Haapalainen et al., 2010; Ganesh et

al., 2011).

A commonly used subjective assessment technique is NASA TLX, in which workload is indicated on a numerical or graphical scale. The overall workload in NASA TLX is divided into six scales, namely temporal demand, physical demand, mental demand, effort, performance and frustration (Cao et al., 2009). Task performance based methods are less commonly used. For example, primary task measurements evaluate speed and accuracy, while secondary task measurements incorporate tasks such as time estimation and memory search (Mazaeva et al., 2001). Physiological signals are preferred because continuous measurement provides an accurate picture and in-depth knowledge of the effects of cognitive workload (Antonenko et al., 2010; Geethanjali et al., 2016). Such physiological signals include the Electrocardiogram (ECG), Electroencephalogram (EEG), Electromyogram (EMG), Galvanic Skin Response (GSR), Skin Temperature (ST), Blood Pressure (BP), Blood Volume Pulse (BVP) and respiratory rate (Karthikeyan et al., 2011). Functional Magnetic Resonance Imaging (fMRI) and Positron Emission Tomography (PET) are more recent physiological techniques. A fundamental drawback of these imaging techniques is that the person has to lie in a restricted posture (Antonenko et al., 2010). In contrast, EEG can measure cognitive workload accurately and non-invasively. EEG is therefore one of the suitable physiological signals for monitoring cognitive workload, as is validated further in the sections that follow.

International Journal of Pure and Applied Mathematics, Volume 118 No. 18 2018, 3655-3672; ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version); url: http://www.ijpam.eu; Special Issue

In the literature, many researchers have examined physical and mental workload using questionnaire methods and physiological signals. This paper concentrates mainly on the measurement of mental demand using the EEG signal. To evaluate and predict cognitive workload using questionnaires or physiological signals, a large database has to be consulted every time. A mathematical model can instead be built, in which all the attributes from the questionnaire and the physiological signals are fed as input. The model is trained with these attributes to classify workload into low, medium and high levels. The aim of this paper is to discuss the various models available for classifying different levels of workload; the comparative results indicate which model is better suited to cognitive workload classification.

2. MATHEMATICAL MODELS

A model is used to inspect the effects of various components using mathematical formulae. In general, mathematical models are constructed from mathematical equations and logical or computational structures that describe a physical system (Bellomo et al., 2007). Mathematical models may take different forms, such as algebraic, probabilistic or differential equation models. There are also different types of models, namely deterministic, stochastic, static, dynamic, continuous, discrete, individual, structured, qualitative, quantitative, mechanistic and statistical models (Imboden et al., 2013).


Figure 1. Generic architecture to build a mathematical model.

Mathematical models aid in prediction and provide solutions for different real life problems, and thus have many important applications. The areas of application include, but are not limited to, anthropology, architecture, artificial intelligence, animation, astronomy, biology, chemistry, forensic and cyber sciences, computer science, economics, finance, mechanical engineering and neuroscience. The architecture for building a mathematical model is shown in Figure 1.

The following section describes mathematical models for biological signals and cognitive workload, and discusses with examples the appropriateness of each model in different scenarios.

3. MATHEMATICAL MODELS FOR COGNITIVE WORKLOAD

Fisher’s Linear Discriminant Model

Linear discriminant analysis is a classification method proposed by Fisher. The basic concept behind Fisher’s Linear Discriminant (FLD) model is to project the data onto a one-dimensional line. The projection is chosen so that the distance between the means of the two classes is maximized while the variance within each class is minimized. It is used in fields such as statistics, machine learning and pattern recognition.

Consider two different classes with samples $X_1 = \{x^1_1, \dots, x^1_{l_1}\}$ and $X_2 = \{x^2_1, \dots, x^2_{l_2}\}$.

The objective of the FLD model is to maximize J(w) as given in Eq. (1),

$$J(w) = \frac{w^{T} S_B\, w}{w^{T} S_w\, w} \tag{1}$$

$$S_B = (m_1 - m_2)(m_1 - m_2)^{T} \tag{2}$$

$$S_w = \sum_{i=1,2} \; \sum_{x \in X_i} (x - m_i)(x - m_i)^{T} \tag{3}$$

where $S_B$ denotes the ‘between class’ scatter matrix (Eq. (2)), $S_w$ denotes the ‘within class’ scatter matrix (Eq. (3)) and the class mean $m_i$ is given in Eq. (4).

$$m_i = \frac{1}{l_i} \sum_{j=1}^{l_i} x^i_j \tag{4}$$

J(w) is maximized in order to find the direction w that maximizes the separation of the projected class means and minimizes the projected within-class variance (Mikat et al., 1999; Welling, 2008).

The main advantage of the FLD model is that it saves memory and time. However, it is best suited to groups that are linearly separable. The FLD model is largely preferred when the number of samples is small. The discrimination between classes is poor when the overlap is large.
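As an illustration of Eqs. (1) to (4), the following is a minimal sketch for two-dimensional data. It relies on the standard result that J(w) is maximized by a direction proportional to S_w^{-1}(m_1 - m_2). The function names (`fisher_direction`, `project`), the toy samples and the midpoint decision threshold are assumptions made for this example and are not taken from the reviewed studies.

```python
def mean(points):
    """Component-wise mean m_i of a list of 2-D points (Eq. (4))."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, m):
    """2x2 scatter matrix: sum of (x - m)(x - m)^T over one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for r in range(2):
            for c in range(2):
                s[r][c] += d[r] * d[c]
    return s


def fisher_direction(class1, class2):
    """Return w proportional to S_w^{-1} (m_1 - m_2), which maximizes J(w)."""
    m1, m2 = mean(class1), mean(class2)
    s1, s2 = scatter(class1, m1), scatter(class2, m2)
    # Within-class scatter S_w is the sum of the per-class scatter matrices (Eq. (3)).
    sw = [[s1[r][c] + s2[r][c] for c in range(2)] for r in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("S_w is singular; the direction is not unique")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    return [inv[0][0] * d[0] + inv[0][1] * d[1],
            inv[1][0] * d[0] + inv[1][1] * d[1]]


def project(w, x):
    """Project a 2-D sample onto the one-dimensional discriminant line."""
    return w[0] * x[0] + w[1] * x[1]


# Hypothetical feature pairs for two workload classes (illustrative values only).
low = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
high = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 7.0)]

w = fisher_direction(low, high)
# Assumed decision rule: midpoint between the two projected class means.
threshold = 0.5 * (project(w, mean(low)) + project(w, mean(high)))
```

With this choice of w, samples of the first class project above the threshold and samples of the second class below it, provided the classes are linearly separable along w.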

Application of FLD Analysis to Classify Mental Workload

Zongmin et al. (2013) established a model that predicts the mental workload imposed by an aircraft cockpit display interface using the Bayesian Fisher discrimination and classification method. Subjects were requested to perform different flight simulation tasks inside a flight simulator. Heart rate, R-R interval and Heart Rate Variability (HRV) were taken as the indexes reflecting various levels of workload. Three linear equations, y1, y2 and y3, were constructed in this model with parameters x1, x2, x3 and x4. Here x1 denotes the standard deviation of the normal R-R intervals (SDNN), x2 the NASA_TLX score, x3 the reaction time for handling abnormal information during abnormal flight operation and x4 the operation accuracy. The values of y1, y2 and y3 change in accordance with x1, x2, x3 and x4. A high value of y1 implies that the participants were at low workload. Similarly, a high value of y2 implies moderate workload and a high value of y3 implies high workload. The model was validated by the original validation and cross validation methods; the average discrimination accuracy is reported as 89.58% and the prediction accuracy as 85.42%.

Alexandre et al. (2006) presented a paper analyzing speech/music classification based on Fisher’s linear discriminant. Timbre-related, rhythm-related and pitch-related features were taken for classification. K-nearest neighbour (K-NN) and Fisher’s linear discriminant algorithms were used for classification. The Fisher’s linear discriminant based analysis of speech/music provided very good classification results.

Zhang et al. (2007) proposed a paper in which Fisher’s discriminant analysis (FDA) was used to analyse attention deficit/hyperactivity disorder in the resting state of the brain. A PC-FDA model was used for classification. Initially, the dimension of the feature space was reduced by principal component analysis, after which FDA became feasible. The classifier performance was estimated by the leave-one-out cross validation method. PC-FDA showed 85% accuracy in classification.

Dornhege et al. (2007) presented a paper on detecting mental workload in drivers during real traffic conditions. Two experimental setups were tested, namely an auditory workload scheme and a mental calculation task. FLDA was used for classification. The classifier output is a scalar value that represents the estimated degree of low workload (values below zero) or high workload (values above zero). The output is compared with two thresholds, ml < mh: the estimated state switches to high workload if the output exceeds mh, and switches to low workload if the output falls below the lower value ml. The values ml and mh are subject and task specific, so they were calibrated on the training data. Cross-validation was used to identify the most relevant parameters for workload detection. Classification accuracies greater than 90% were achieved by the best performing subject for both the mental calculation and the auditory task.
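The two-threshold switching rule described above can be sketched as follows. This is only an illustration: the function name `hysteresis_states`, the initial state and the rule that the previous state is kept while the output lies between ml and mh are assumptions of this sketch, not details reported by Dornhege et al.

```python
def hysteresis_states(outputs, m_low, m_high, initial="low"):
    """Map scalar classifier outputs to workload states with two thresholds.

    The state switches to "high" when the output exceeds m_high and to "low"
    when it falls below m_low. In between, the previous state is retained
    (an assumption of this sketch).
    """
    if not m_low < m_high:
        raise ValueError("m_low must be smaller than m_high")
    state = initial
    states = []
    for value in outputs:
        if value > m_high:
            state = "high"
        elif value < m_low:
            state = "low"
        states.append(state)
    return states


# Hypothetical classifier outputs and thresholds (illustrative values only).
example_outputs = [-1.0, 0.2, 0.6, 0.2, -0.2, -0.6]
example_states = hysteresis_states(example_outputs, m_low=-0.5, m_high=0.5)
```

Using two thresholds instead of one avoids rapid toggling of the estimated state when the classifier output fluctuates around a single decision boundary.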

Linear Regression Model

A linear regression model is used for modelling the linear relationship between two continuous variables (Yan & Su, 2009). Its fields of application include astronomy, statistics, business, the share market, finance and biology.

In a linear regression model there are two types of variables, namely the dependent variable, denoted by y, and the independent variable, denoted by x. A simple linear regression model contains only one independent variable.

The simple linear regression model is represented by Eq. (5)

$$y = \beta_0 + \beta_1 x + \varepsilon \tag{5}$$

where β0 is the y intercept, β1 is the slope of the regression line and ε is the random error.

The second type of linear regression model, named multiple linear regression, has one dependent variable and many independent variables, and is represented as follows (Eq. (6)).

$$y = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p + \varepsilon \tag{6}$$

where y is the dependent variable, β0, β1, …, βp are the regression coefficients and x1, x2, …, xp are the independent variables in the model.

The linear regression model shows optimal results when the relationship between the dependent variable and the independent variables is linear. The results are poor for non-linear relationships.
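To make Eq. (5) concrete, the sketch below estimates β0 and β1 by ordinary least squares using the closed-form expressions for the simple (one-variable) case. The function name and the data values are hypothetical and serve only as an example; they do not come from any of the reviewed studies.

```python
def fit_simple_linear_regression(x, y):
    """Least-squares estimates (beta0, beta1) for y = beta0 + beta1 * x + error."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1 = sxy / sxx               # slope of the regression line
    beta0 = y_bar - beta1 * x_bar   # y intercept
    return beta0, beta1


# Hypothetical data: x could be a task difficulty index, y a workload score.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [2.1, 3.9, 6.2, 7.8, 10.0]
beta0, beta1 = fit_simple_linear_regression(x_values, y_values)

# Residuals are the estimates of the random error term in Eq. (5).
residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x_values, y_values)]
```

For a least-squares fit with an intercept, the residuals sum to zero, which offers a quick sanity check on the estimates.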

Application of Linear Regression Analysis to Classify Mental Workload

Liang et al. (2005) proposed a method based on driving performance and EEG power spectrum analysis for monitoring the driver’s alertness. The log sub-band power spectrum of the electroencephalogram (EEG), correlation analysis, principal component analysis and linear regression models were used in combination to predict the driver’s sleepiness level indirectly in a virtual reality based driving simulator. To estimate the subject’s driving performance from the EEG power spectrum, a 50-order linear regression model with a least squares cost function was created. The correlation coefficient was used as the criterion for EEG channel selection, and the two EEG channels with the highest values were selected. Across the 10 sessions, the mean correlation coefficient between the actual and estimated driving performance time series was 0.85 ± 0.11 for within-session estimation and 0.82 ± 0.07 for cross-session estimation. These results indicated that a small number of EEG channels is sufficient for continuous EEG-based estimation of driving performance, and that minute-to-minute changes in operator alertness could be measured accurately.

Stikic et al. (2011) presented EEG-based research on estimating present and future cognitive performance, using reaction time in both single and combined modes as well as an accuracy-based performance metric. The least squares approach was used for training the models. Two distinct approaches, auto-validation and cross-validation, were used to evaluate the trained linear regression models. For the present performance model, the results obtained were rather high (sensitivity of 78%-85%, positive predictive value of 57%-73%, specificity of 61%-92%, and negative predictive value of 75%-98%) after grouping the observed and estimated F-measure into classes of acceptable and unacceptable performance. The future performance model was also capable of grouping performance into acceptable and unacceptable classes, and the results were in a similar range (sensitivity of 69%-85%, positive predictive value of 57%-70%, specificity of 58%-82%, and negative predictive value of 74%-90%). These results indicate that EEG data can be used effectively to predict future performance.

Linear Mixed Model

Linear mixed models (LMM), or mixed effects models, are extensions of linear regression models that contain both fixed and random effects. They model the relationship between a continuous dependent variable and the independent variables. The model is useful when repeated measurements are taken on the subjects under different conditions, and it is suitable when the observed data contain more than one level of variation. LMM is widely used in image processing, biostatistics (Gibbons et al., 2003) and vegetation classification (Tateishi et al., 2004).

An LMM contains two components, namely fixed and random effects. Fixed effects are effects for which all possible levels are present, whereas the levels of a random effect represent random draws from a large population. Fixed effects are represented by constant parameters. In contrast, random effects are represented by random variables, which may follow a normal distribution (Foulkes et al., 2010).

The standard form of a linear mixed-effects model is given by Eq. (7),

$$y = X\beta + Zb + \varepsilon \tag{7}$$

where y is an n x 1 continuous response vector; n is the number of observations; X is an n x p fixed effects design matrix; β is a p x 1 fixed effects vector; Z is an n x q random effects design matrix; b is a q x 1 random effects vector; and ε is an n x 1 vector of residuals.

Assume that b and ε are independently distributed with mean zero and covariance matrices D and R, respectively (Cnaan et al., 1997). The mean and variance of y are then given by Eq. (8) and Eq. (9).

$$E[y] = X\beta \tag{8}$$

$$\mathrm{var}(y) = Z D Z^{T} + R \tag{9}$$

The main advantages of LMM are that the model provides standard errors and that it yields estimates of the random effects, with which sub-units can be classified. The disadvantage of LMM is that more distributional assumptions have to be made (Maruyama, 2008).

Auto Regressive Model

In an auto regressive (AR) model, the present value of the series, Xt, is defined as a function of the p past values Xt-1, Xt-2, …, Xt-p. The model is used to predict future samples from previous samples. The AR model is a simple yet effective approach to time series characterization. The order of the model is determined by the number of preceding observations used, and the data are used to estimate the weights, which act as the parameters of the model and in turn uniquely characterize the time series. The AR model is used for signal analysis, pattern discrimination etc.

The benchmark model, which also happens to be one of the simplest, is the autoregressive process of order p (AR(p)) (Bontempi, 2013). An auto regressive process ϕt of order n is given by Eq. (10),

$$\phi_t = \alpha_1 \phi_{t-1} + \dots + \alpha_n \phi_{t-n} + w_t \tag{10}$$

This states that the next value is a linear weighted sum of the past n values plus a random shock. It resembles a linear regression model in which ϕ is regressed on its own past values, which is why the model is prefixed with ‘auto’. Multivariate autoregressive models extend this to multiple time series: the current values of the vector of all variables are modelled as a linear sum of previous values. The simple first order auto regressive model for a single time series is explained in this section.

The series ϕt is AR(1) if it satisfies Eq. (11),

$$\phi_t = \alpha\, \phi_{t-1} + w_t \tag{11}$$

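where wt is zero mean white noise with variance $\sigma_w^2$. By repeatedly substituting for ϕt-1 in Eq. (11), the series can be written as Eq. (12), with the moments given in Eq. (13)

$$\phi_t = \alpha(\alpha\, \phi_{t-2} + w_{t-1}) + w_t = w_t + \alpha\, w_{t-1} + \alpha^2 w_{t-2} + \dots \tag{12}$$

$$E[\phi_t] = 0, \qquad \mathrm{var}[\phi_t] = \sigma_w^2 \left(1 + \alpha^2 + \alpha^4 + \dots\right) \tag{13}$$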

Then, if |α| < 1, the variance is finite and is equal to (Eq. (14))

$$\mathrm{Var}[\phi_t] = \frac{\sigma_w^2}{1 - \alpha^2} \tag{14}$$

The autocorrelation is given by Eq. (15)

$$\rho(k) = \alpha^{|k|}, \qquad k = 0, \pm 1, \pm 2, \dots \tag{15}$$
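The variance and autocorrelation expressions for the AR(1) process can be checked numerically. The sketch below compares a truncated version of the geometric series for the variance with its closed form for |α| < 1, and evaluates the autocorrelation α^|k|. The function names and the parameter values are chosen only for illustration.

```python
def ar1_variance_partial_sum(alpha, sigma_w2, terms):
    """Truncated series sigma_w^2 * (1 + alpha^2 + alpha^4 + ...) with `terms` terms."""
    return sigma_w2 * sum(alpha ** (2 * k) for k in range(terms))


def ar1_variance_closed_form(alpha, sigma_w2):
    """Stationary variance sigma_w^2 / (1 - alpha^2); finite only for |alpha| < 1."""
    if abs(alpha) >= 1:
        raise ValueError("the variance is finite only when |alpha| < 1")
    return sigma_w2 / (1.0 - alpha ** 2)


def ar1_autocorrelation(alpha, k):
    """Autocorrelation of a stationary AR(1) process at lag k."""
    return alpha ** abs(k)


# Illustrative parameter values (assumptions of this example).
alpha_demo, sigma_w2_demo = 0.8, 1.0
approx_var = ar1_variance_partial_sum(alpha_demo, sigma_w2_demo, terms=200)
exact_var = ar1_variance_closed_form(alpha_demo, sigma_w2_demo)
```

As the number of terms grows, the partial sum approaches the closed form, and the autocorrelation decays geometrically with the lag.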

Estimation of AR(n) Parameters

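For T observations of the process in Eq. (10), the parameters of the AR(n) model are estimated by least squares minimization (Eq. (16)),

$$\hat{\alpha} = \arg\min_{\alpha} \sum_{t=n+1}^{T} \left[\phi_t - \alpha_1 \phi_{t-1} - \dots - \alpha_n \phi_{t-n}\right]^2 \tag{16}$$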

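The multiple least squares problem can be solved in matrix form (Eq. (17) and Eq. (18)).

$$Y = X\alpha \tag{17}$$

where,

$$X = \begin{pmatrix} \phi_{T-1} & \phi_{T-2} & \cdots & \phi_{T-n} \\ \phi_{T-2} & \phi_{T-3} & \cdots & \phi_{T-n-1} \\ \vdots & \vdots & & \vdots \\ \phi_{n} & \phi_{n-1} & \cdots & \phi_{1} \end{pmatrix}, \qquad Y = \begin{pmatrix} \phi_{T} \\ \phi_{T-1} \\ \vdots \\ \phi_{n+1} \end{pmatrix} \tag{18}$$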

In the AR model, output forecasting is based on past outputs; its main drawback is that it does not consider past disturbances (Ganesh et al., 2011).
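The least squares estimation of the AR(n) parameters can be sketched as follows. The example builds the lagged design matrix row by row, forms the normal equations (X^T X) α = X^T Y and solves them by Gaussian elimination. The helper names `solve_linear_system` and `fit_ar` are assumptions of this illustration, and solving via the normal equations is only one of several possible numerical routes.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or ill-conditioned")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_ar(series, n):
    """Least-squares estimate of (alpha_1, ..., alpha_n) for an AR(n) model."""
    if len(series) <= n:
        raise ValueError("need more observations than the model order")
    # Each row holds the n values preceding the target (most recent lag first).
    rows = [[series[t - k] for k in range(1, n + 1)]
            for t in range(n, len(series))]
    targets = [series[t] for t in range(n, len(series))]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(n)]
    return solve_linear_system(xtx, xty)


# Noise-free AR(2) series with known coefficients, used only as a sanity check.
demo_series = [1.0, 0.5]
for _ in range(10):
    demo_series.append(0.5 * demo_series[-1] - 0.3 * demo_series[-2])
demo_coefficients = fit_ar(demo_series, 2)
```

On the noise-free series the estimate recovers the generating coefficients (0.5 and -0.3) up to rounding error; with real, noisy signals the estimates are only approximate.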

Application of AR model to classify mental workload

Zhao et al. (2010) proposed a system for estimating the driver’s mental fatigue based on EEG in a virtual reality based simulator. The features from the frontal, central and occipital lobes were expressed using a Multivariate auto regressive (MVAR) model. Akaike's information criterion (AIC) was found to be the most adequate technique for determining the order of the AR model. According to the AIC, the MVAR order was chosen as 3. Three levels of mental fatigue were identified by combining a support vector machine (SVM) with kernel principal component analysis (KPCA). The three-state classification accuracy of the MVAR features was 81.64%.

Saidatul et al. (2011) presented a paper in which EEG signals recorded under relaxation and mental stress were analyzed using AR modelling techniques. This comparative study measured the power spectral density (PSD) of the EEG signals during two different states, namely the resting state and the mental stress condition. The fast Fourier transform (FFT) with Welch’s method was used to calculate the power density spectra, and the Yule–Walker and Burg methods were used to estimate the auto regressive (AR) spectra. A neural network classifier was used to classify the relaxed and stressed conditions. The maximum classification accuracy of 91.17% was obtained for the Burg method, compared with the Yule–Walker and Welch methods.

Sun et al. (2014) proposed a fatigue classification method in which a multivariate pattern analysis technique was adopted to extract highly discriminative functional connectivity information. An overall accuracy of 81.5% (p<0.0001) was achieved with this algorithm.

Anderson et al. (2011) presented a paper on the classification of spontaneous EEG signals during an arithmetic task using AR modelling. The EEG was processed into four different representations, i.e., (i) scalar AR model coefficients, (ii) multivariate AR coefficients, (iii) Eigen values of a correlation matrix, and (iv) the Karhunen–Loève transform of the multivariate AR coefficients. Feature vectors defined by these representations were classified with a standard feed forward neural network trained via the error back propagation algorithm. The four representations gave similar results, with the multivariate AR coefficients performing somewhat better, at 91.4% average classification accuracy on novel, untrained EEG signals.

Hierarchical Bayes Model

The hierarchical Bayes model combines sub-group models using Bayes theorem. A hierarchical model can be separated into different sub-models, which are combined to form the hierarchical model; Bayes theorem is used to integrate all the sub-models. Hierarchical models are best suited to data structured in groups, for example demographically, spatially or temporally. Each group of data is represented by different parameters. A regression contains a slope and an intercept; in hierarchical regressions, each group has its own intercept and slope. Hierarchical modelling is applied to observational units at different levels. The hierarchical form of analysis is used for understanding multi-parameter problems and also plays an important role in developing computational strategies. Hierarchical Bayes models are used in marketing, ecological data analysis etc.

This model has both advantages and disadvantages. It is used to predict the future from past information, and it follows the likelihood principle. Its limitation is that there are no definite rules for selecting the prior distribution.

Bayes’ Theorem

Bayesian methods are used to analyze hierarchical models. The posterior probability is obtained from conditional probability. The product of the prior distribution p(θ) and the sampling distribution p(y | θ) gives the joint probability distribution p(θ, y). The posterior distribution is given by Eq. (18)

$$P(\theta \mid y) = \frac{P(\theta, y)}{P(y)} = \frac{P(y \mid \theta)\, P(\theta)}{P(y)} \tag{18}$$

From the above equation it can be inferred that Bayes’ theorem relates the conditional probabilities to the probabilities of the individual events.

Let y be the data, θ the process and ϕ the parameters. The Bayesian hierarchical model is defined using a series of steps:

1. Data model: $[y \mid \theta, \phi]$

2. Process model: $[\theta \mid \phi]$

3. Parameter model: $[\phi]$

In simple terms, the posterior distribution is written as Eq. (19)

$$[\theta, \phi \mid y] \propto [y \mid \theta, \phi]\,[\theta \mid \phi]\,[\phi] \tag{19}$$

The posterior distribution is proportional to the product of the likelihood and the prior distribution (Penny et al., 2007; Bernardo et al., 1983).
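A small discrete example may clarify how Bayes’ theorem turns a prior into a posterior. In the sketch below, θ takes one of three workload levels and y is a single observed event; the prior and likelihood values are invented for illustration and do not come from any of the cited studies. The function name `posterior` is likewise an assumption of this example.

```python
def posterior(prior, likelihood):
    """Posterior P(theta | y) from a prior P(theta) and a likelihood P(y | theta).

    Both arguments are dictionaries keyed by the possible values of theta.
    """
    joint = {theta: prior[theta] * likelihood[theta] for theta in prior}
    evidence = sum(joint.values())  # P(y), the normalizing constant
    if evidence == 0:
        raise ValueError("the observation has zero probability under the model")
    return {theta: value / evidence for theta, value in joint.items()}


# Hypothetical prior belief over workload levels.
prior_demo = {"low": 0.5, "medium": 0.3, "high": 0.2}
# Hypothetical probability of the observed feature under each workload level.
likelihood_demo = {"low": 0.1, "medium": 0.4, "high": 0.8}

posterior_demo = posterior(prior_demo, likelihood_demo)
most_probable_level = max(posterior_demo, key=posterior_demo.get)
```

Although "high" has the smallest prior probability here, the observation is much more likely under it, so it receives the largest posterior probability.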

Application of Bayes model to classify mental workload

Wang et al. (2011) proposed a method for cross-subject workload classification based on a hierarchical Bayes model. Compared with classifiers such as neural networks and support vector machines, the model has the advantages of modelling the inter-relationships, incorporating prior knowledge and accounting for unreliability in the data through probability theory. The models were trained and tested using a fivefold cross-validation setup. The results show that the hierarchical Bayes model has greater accuracy than a single neural network and a single naive Bayes model. Instead of using multiple classifiers for multiple subjects, this one classifier can be used for multiple subjects.

Hidden Markov Model

The Hidden Markov Model (HMM) is a powerful statistical model for classifying time varying (temporal) sequences (Blunsom, 2004). A hidden Markov model may have a specific state called the receiving state, accepting state or final state. Once the machine reaches this state it cannot leave it, and while it is in the accepting state it emits only one visible symbol. HMMs are well known for speech recognition and are also used in the fields of bioinformatics, chemistry, genetic engineering, machine learning, finance etc.

The general definition of an HMM is given by Eq. (20):

$$\lambda = (A, B, \pi) \tag{20}$$

$$A = [a_{ij}], \qquad a_{ij} = P(q_t = s_j \mid q_{t-1} = s_i)$$

$$B = [b_i(k)], \qquad b_i(k) = P(x_t = v_k \mid q_t = s_i)$$

where A is the transition array, which contains the probability of state j following state i. Transition probabilities are time independent. B is the observation array, which contains the probability of observation k being produced from state i.

The initial probability array is represented as π:

$$\pi = [\pi_i], \qquad \pi_i = P(q_1 = s_i)$$

Working with an HMM involves three stages, namely the evaluation, decoding and learning stages. Each stage is explained below.

Evaluation

In the evaluation stage, the probability that the visible sequence was generated by the model is to be found. The forward algorithm is used to compute the probability of the visible sequence O given the model λ.

The forward probability variable is given by Eq. (21),

$$\alpha_t(i) = P(o_1 o_2 \dots o_t,\; q_t = s_i \mid \lambda) \tag{21}$$

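The forward algorithm is as follows:

1. Initialisation:

$$\alpha_1(i) = \pi_i\, b_i(o_1), \qquad 1 \le i \le N$$

2. Induction:

$$\alpha_{t+1}(j) = \left[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\right] b_j(o_{t+1}), \qquad 1 \le t \le T-1,\; 1 \le j \le N$$

3. Termination:

$$P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$$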

The induction stage is the key step of the forward algorithm. For each state sj and each time t, the probability of having observed the sequence up to time t and of arriving in that state is computed and stored in αt(j).
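The three steps of the forward algorithm can be sketched as follows. States and observation symbols are represented by integer indices, and the two-state model parameters are invented for illustration; the function name `forward` and the variable names are assumptions of this sketch.

```python
def forward(pi, A, B, observations):
    """Forward algorithm for an HMM with parameters (A, B, pi).

    pi[i]    : initial probability of state i
    A[i][j]  : probability of moving from state i to state j
    B[i][k]  : probability of emitting symbol k from state i
    Returns P(O | lambda) and the table of forward variables alpha[t][i].
    """
    n_states = len(pi)
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [[pi[i] * B[i][observations[0]] for i in range(n_states)]]
    # Induction: sum over predecessor states, then multiply by the emission term
    for symbol in observations[1:]:
        prev = alpha[-1]
        alpha.append([
            sum(prev[i] * A[i][j] for i in range(n_states)) * B[j][symbol]
            for j in range(n_states)
        ])
    # Termination: sum the forward variables at the final time step
    return sum(alpha[-1]), alpha


# Hypothetical two-state, two-symbol model (illustrative values only).
pi_fwd = [0.6, 0.4]
A_fwd = [[0.7, 0.3], [0.4, 0.6]]
B_fwd = [[0.9, 0.1], [0.2, 0.8]]
obs_fwd = [0, 1, 0]
likelihood_fwd, alpha_table = forward(pi_fwd, A_fwd, B_fwd, obs_fwd)
```

The forward recursion needs on the order of N²T operations, whereas summing over every possible state sequence directly would require a number of terms that grows exponentially with T.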

Decoding

The most probable sequence of hidden states is found by the decoding algorithm. The Viterbi algorithm is used to find the best state sequence for an observation sequence. It is similar to the forward algorithm, except that instead of summing over the transition probabilities, they are maximised at each step.

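$$\delta_t(i) = \max_{q_1, q_2, \dots, q_{t-1}} P(q_1 q_2 \dots q_{t-1},\; q_t = s_i,\; o_1 o_2 \dots o_t \mid \lambda)$$

The Viterbi algorithm is as follows:

1. Initialisation:

$$\delta_1(i) = \pi_i\, b_i(o_1), \qquad 1 \le i \le N, \qquad \psi_1(i) = 0$$

2. Recursion:

$$\delta_t(j) = \max_{1 \le i \le N} \left[\delta_{t-1}(i)\, a_{ij}\right] b_j(o_t), \qquad 2 \le t \le T,\; 1 \le j \le N$$

$$\psi_t(j) = \arg\max_{1 \le i \le N} \left[\delta_{t-1}(i)\, a_{ij}\right], \qquad 2 \le t \le T,\; 1 \le j \le N$$

3. Termination:

$$P^{*} = \max_{1 \le i \le N} \delta_T(i)$$

$$q_T^{*} = \arg\max_{1 \le i \le N} \delta_T(i)$$

4. Optimal state sequence backtracking:

$$q_t^{*} = \psi_{t+1}(q_{t+1}^{*}), \qquad t = T-1, T-2, \dots, 1$$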

The maximizing states are stored and used as back pointers. Backtracking is used to find the best state sequence from the back pointers stored in the recursion stage, but there is no easy way to find the second best state sequence.
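The Viterbi recursion and the backtracking step can be sketched in the same style. As before, states and symbols are integer indices and the model values are invented for illustration; the function name `viterbi` is an assumption of this sketch.

```python
def viterbi(pi, A, B, observations):
    """Most probable hidden state sequence for an HMM with parameters (A, B, pi).

    Returns the probability of the best path and the path itself
    (a list of state indices, one per observation).
    """
    n_states = len(pi)
    # Initialisation: delta_1(i) = pi_i * b_i(o_1); back pointers are unused at t = 1
    delta = [[pi[i] * B[i][observations[0]] for i in range(n_states)]]
    psi = [[0] * n_states]
    # Recursion: maximise over predecessor states instead of summing
    for symbol in observations[1:]:
        prev = delta[-1]
        row_delta, row_psi = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: prev[i] * A[i][j])
            row_psi.append(best_i)
            row_delta.append(prev[best_i] * A[best_i][j] * B[j][symbol])
        delta.append(row_delta)
        psi.append(row_psi)
    # Termination: best final state and its probability
    last = max(range(n_states), key=lambda i: delta[-1][i])
    best_prob = delta[-1][last]
    # Backtracking: follow the stored back pointers from the last time step
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(psi[t][path[-1]])
    path.reverse()
    return best_prob, path


# Hypothetical two-state, two-symbol model (illustrative values only).
pi_vit = [0.6, 0.4]
A_vit = [[0.7, 0.3], [0.4, 0.6]]
B_vit = [[0.9, 0.1], [0.2, 0.8]]
obs_vit = [0, 0, 1]
best_prob_vit, best_path_vit = viterbi(pi_vit, A_vit, B_vit, obs_vit)
```

In practice the products of probabilities become very small for long sequences, so implementations often work with log probabilities; that refinement is omitted here for brevity.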

Learning

After the decoding stage, the model parameters λ = (A, B, π) are to be estimated. Supervised

training and unsupervised training are considered to be the two standard techniques for learning

stage. Supervised learning is used in cases where both inputs and outputs are known. Here, inputs

are equated as observations and outputs as states. Unsupervised training is used in cases where

training set contains input data alone.

For supervised learning, Maximum Likelihood Estimates (MLE) obtained by counting are used to determine the model parameters λ; the Baum-Welch algorithm is the standard choice for unsupervised learning. For the transition matrix we use Eq. (22):

$$a_{ij} = P(t_j \mid t_i) = \frac{\mathrm{Count}(t_i, t_j)}{\mathrm{Count}(t_i)} \tag{22}$$

where Count(ti, tj) is the number of times tj followed ti in the training data. For the observation matrix (Eq. (23)):

$$b_j(k) = P(w_k \mid t_j) = \frac{\mathrm{Count}(w_k, t_j)}{\mathrm{Count}(t_j)} \tag{23}$$

where Count(wk, tj) is the number of times wk was tagged tj in the training data. And for the initial probability distribution (Eq. (24))

$$\pi_i = P(q_1 = t_i) = \frac{\mathrm{Count}(q_1 = t_i)}{\mathrm{Count}(q_1)} \tag{24}$$

In order to avoid zero counts during the estimation of the HMM and to improve the performance of the model, a smoothing technique can be used. The HMM can be learned effectively and can handle sequences of different lengths, but it does not explain the dependencies among hidden states.
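Count-based estimation with smoothing can be sketched as follows. The example uses add-one (Laplace) smoothing, which is only one possible smoothing technique, and it normalizes each transition row by the number of transitions leaving the state so that the row sums to one. The function name `estimate_hmm`, the state labels and the training sequences are hypothetical.

```python
from collections import Counter


def estimate_hmm(sequences, states, symbols, smoothing=1.0):
    """Estimate (pi, A, B) from labelled sequences of (state, symbol) pairs.

    `smoothing` is added to every count so that unseen events keep a
    non-zero probability; it should be positive when some counts are zero.
    """
    initial, transitions, emissions = Counter(), Counter(), Counter()
    for seq in sequences:
        initial[seq[0][0]] += 1
        for state, symbol in seq:
            emissions[(state, symbol)] += 1
        for (prev_state, _), (next_state, _) in zip(seq, seq[1:]):
            transitions[(prev_state, next_state)] += 1

    n_seq, n_states, n_symbols = len(sequences), len(states), len(symbols)
    pi = {s: (initial[s] + smoothing) / (n_seq + smoothing * n_states)
          for s in states}
    A, B = {}, {}
    for s in states:
        out_total = sum(transitions[(s, t)] for t in states)
        A[s] = {t: (transitions[(s, t)] + smoothing)
                / (out_total + smoothing * n_states) for t in states}
        emit_total = sum(emissions[(s, o)] for o in symbols)
        B[s] = {o: (emissions[(s, o)] + smoothing)
                / (emit_total + smoothing * n_symbols) for o in symbols}
    return pi, A, B


# Hypothetical labelled training data: (hidden workload state, observed symbol).
train = [
    [("low", "calm"), ("low", "calm"), ("high", "busy")],
    [("high", "busy"), ("high", "busy"), ("low", "calm")],
]
pi_est, A_est, B_est = estimate_hmm(train, ["low", "high"], ["calm", "busy"])
```

Without smoothing, the symbol "busy" would receive probability zero in state "low" because that pair never occurs in the training data; with add-one smoothing it keeps a small non-zero probability.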

Application of HMM to classify EEG signals

Obermaier et al. (2001) presented a system to classify imagined left and right hand movements from a single EEG trial using an HMM classifier. Zhong & Ghosh (2002) studied HMM and coupled HMM for the classification of multi-channel EEG. Although HMMs can be used for classifying EEG, they are best known for speech recognition (Rabiner, 1989). Lotte et al. (2007) stated that the HMM classifier is suitable for high dimensional feature vectors and for features containing time information, and added that LDA can be used for small data sets.

4. SUMMARY

The different types of models, their parameters and the levels of workload are summarized in Table 1. The advantages and limitations of the models are given in Table 2.

Based on the usage of mathematical models in various applications (Figure 2), it is suggested that FLDA, which reduces the size of the feature vector, and the Bayes model, which uses probability distributions to handle uncertainty, perform better than the other mathematical models. However, HMM should also be explored for estimating cognitive workload, as it is predominantly used for EEG signal characterization and classification.

Table 1. Different types of model, parameters used in the corresponding model, and different levels of workload.

| Types of Model | Parameters / Attributes | Levels of Classification |
|---|---|---|
| Fisher Linear Discriminant Analysis (FLDA) | Between class matrix SB; Within class matrix Sw; Vector matrix J(w) | Low; Moderate; High |
| Linear Regression Model | β0, y intercept; β1, the slope of the regression line; ε, the random error | Alertness |
| Linear Mixed Model | X, fixed effect design matrix; β, fixed effect vector; Z, random effects design matrix; b, random effects vector; ε, vector of n residuals | Low level workload; High level workload |
| Auto Regressive Model with neural network classifier | Past values Xt-1, Xt-2, …, Xt-p; wt, zero mean white noise | Relax; Mental stress |
| Hierarchical Bayes Model | P(θ|y), Posterior distribution; [ϕ], Prior distribution; [y|θ], [θ|ϕ], Likelihood ratio | Low; Medium; High |
| Hidden Markov Model | [aij], Hidden state transition probability; [bjk], Visible state transition probability; P(O|λ), Posterior probability | EEG signal classification during right and left imaginary movement |

Table 2. Advantages and limitations of cognitive workload models.

| Types of Models | Advantages | Limitations |
|---|---|---|
| Fisher Linear Discriminant Analysis | Saves memory and time. | Suitable for linearly separable groups. Sample size should be small. |
| Linear Regression Model | When there is a linear relationship between dependent variable and independent variable, the results are optimal. | Poor performance when the relationships are non-linear. |
| Linear Mixed Model | Defined with standard errors. | More distributional assumptions should be made. |
| Auto Regressive Model with Neural Network Classifier | Future variables can be predicted from past outputs. | Past disturbances are not taken into account. |
| Hierarchical Bayes Model | It obeys likelihood principle. It is used to predict future from past samples. | There are no definite rules to select prior distribution. |
| Hidden Markov Model | They can handle large set of samples. | Dependencies among hidden states are unknown. |

Figure 2. Ratio chart for various applications of mathematical model (FLDA – Fisher Linear

Discriminant Analysis; LR – Linear Regression; LMM – Linear Mixed Model; AR – Auto

Regression; Bayes - Hierarchical Bayes Model; HMM - Hidden Markov Model).



CONCLUSION

The aim of this review is to highlight the significant models available for estimating cognitive workload. The above discussion compares the different features of the EEG signal and the methods used for predicting cognitive workload. This comparative study should help researchers to develop a workload model with higher accuracy.

REFERENCES

Keller, J. (2002). Human performance modelling for discrete-event simulation: workload.

Proceedings of the 2002 Winter Simulation Conference, 157-162.

https://doi.org/10.1109/WSC.2002.1172879.

Mazaeva, N., Ntuen, C. and Lebby, G. (2001). Self-organizing map (SOM) model for mental

workload classification. Proceedings of IFSA World Congress and 20th NAFIPS

International Conference Joint 9th. 1822-1825.

https://doi.org/10.1109/NAFIPS.2001.943829.

Antonenko, P., Paas, F., Grabner, R. and Gog, TV. (2010). Using electroencephalography to

measure cognitive load. Educational Psychology Review, 22(4), 425-438.

https://doi.org/10.1007/s10648-010-9130-y.

Karthikeyan, P., Murugappan, M. and Yaacob, S. (2011). A review on stress inducement stimuli

for assessing human stress using physiological signals. Proceedings of IEEE 7th

International Colloquium on Signal Processing and its Applications, 420-425.

Mika, S., Rätsch, G., Weston, J., Schölkopf, B. and Müller, K-R. (1999). Fisher discriminant

analysis with kernels. Neural Networks for Signal Processing IX, 1999. Proceedings of

the 1999 IEEE Signal Processing Society Workshop, 41-48.

https://doi.org/10.1109/NNSP.1999.788121.

Welling, M. (2008). Fisher linear discriminant analysis. Max Welling's class notes in Machine

learning, 16(7), 817-830.

Yan, X. and Su, XG. (2009). Linear regression analysis: theory and computing. World Scientific

Publication, 5 Toh Tuck Link, Singapore, 348.

Foulkes, AS., Azzoni, L., Li, X., Johnson, MA., Smith, C., Mounzer, K. and Montaner, LJ.

(2010). Prediction based classification for longitudinal biomarkers. Annals of Applied

Statistics, 4(3), 1476-1497. https://doi.org/10.1214/10-AOAS326.

Bontempi, G. (2013). Machine learning strategies for time series prediction. Hammamet, pp. 128.

Penny, W., Friston, K., Ashburner, J., Kiebel, S. and Nichols, T. (2007). Statistical parametric

mapping: the analysis of functional brain images. Neuroscience, Elsevier Publications,

Academic Press, 656.

Bernardo, JM., Degroot, MH., Lindley, DV. and Smith, AFM. (1983). Bayesian statistics 2.

Proceedings of Second Valencia International Meeting, 778.

Blunsom, P. (2004). Hidden Markov Model. Technical Report. Available at

www.cs.mu.oz.au/460/2004/materials/hmm-tutorial.pdf.

Zongmin, W., Damin, Z., Xiaoru, W., Chen, L. and Huan, Z. (2013). A model for discrimination

and prediction of mental workload of aircraft cockpit display interface. Chinese

Journal of Aeronautics, 27(5), 1070-1077. https://doi.org/10.1016/j.cja.2014.09.002.

Alexandre, E., Rosa, M., Cuadra, L. and Gil-Pita, R. (2006). Application of fisher linear

discriminant analysis to speech/music classification. Proceedings of International

Conference on Computer as a Tool, EUROCON 2005, 1666-1669.

https://doi.org/10.1109/EURCON.2005.1630291.

Zhang, H., Zhu, Y., Maniyeri, J. and Guan, C. (2007). Fisher discriminative analysis of resting

state brain function for attention-deficit/hyperactivity disorder. Neuroimage, 40(1),

110-120. https://doi.org/10.1007/11566489_58.

Liang, SF., Lin, CT., Wu, RC., Chen, YC., Huang, TY. and Jung, TP. (2005). Monitoring driver’s

alertness based on the driving performance estimation and the EEG power spectrum


analysis. Proceedings of IEEE Engineering in Medicine and Biology 27th Annual

Conference, 6, 5738-5741. https://doi.org/10.1109/IEMBS.2005.1615791.

Dornhege, G., Millan, J. del R., Hinterberger, T., McFarland, DJ. and Müller, K-R. (2007).

Improving human performance in a real operating environment through real-time

mental workload detection. Toward Brain-Computer Interfacing, 1, MIT Press, 409-422.

Stikic, M., Johnson, RR., Levendowski, DJ., Popovic, DP., Olmstead, RE. and Berka, C. (2011).

EEG-derived estimators of present and future cognitive performance. Frontiers in

Human Neuroscience, 5(70), 1-13.

Zhao, C., Zheng, C., Zhao, M. and Liu, J. (2010). Classifying driving mental fatigue based on

multivariate autoregressive models and kernel learning algorithms. Proceedings of

International Conference on Biomedical Engineering and Informatics (BMEI 2010),

2330-2334. https://doi.org/10.1109/BMEI.2010.5639579.

Saidatul, A., Paulraj, MP., Yaacob, S. and Yusnita, MA. (2011). Analysis of EEG signals during

relaxation and mental stress condition using AR modeling techniques. Proceedings of

IEEE International Conference on Control System, Computing and Engineering, 477-481.

https://doi.org/10.1109/ICCSCE.2011.6190573.

Sun, Y., Lim, J., Meng, J., Kwok, K., Thakor, N. and Bezerianos, A. (2014). Discriminative

analysis of brain functional connectivity patterns for mental fatigue classification.

Annals of Biomedical Engineering, 40(10), 2084-2094. https://doi.org/10.1007/s10439-014-1059-8.

Anderson, CW., Stolz, V. and Shamsunder, S. (2011). Multivariate autoregressive models for

classification of spontaneous electroencephalographic signals during mental tasks.

IEEE Transactions on Biomedical Engineering, 45(3), 277-286.

https://doi.org/10.1109/10.661153.

Wang, Z., Hope, RM., Wang, Z., Ji, Q. and Gray, WD. (2011). Cross subject workload

classification with hierarchical Bayes model. Neuroimage, 59(1), 64-69.

https://doi.org/10.1016/j.neuroimage.2011.07.094.

Obermaier, B., Guger, C., Neuper, C. and Pfurtscheller, G. (2001). Hidden Markov models for

online classification of single trial EEG data. Pattern Recognition Letters, 22(12),

1299-1309. https://doi.org/10.1016/S0167-8655(01)00075-7.

Zhong, S. and Ghosh, J. (2002). HMMs and coupled HMMs for multi-channel EEG

classification. Proceedings of the 2002 International Joint Conference on Neural

Networks, IJCNN '02, 1154-1159. https://doi.org/10.1109/IJCNN.2002.1007657.

Rabiner, LR. (1989). A tutorial on hidden Markov models and selected applications in speech

recognition. Proceedings of the IEEE, 77(2), 257-286. https://doi.org/10.1109/5.18626.

Lotte, F., Congedo, M., Lecuyer, A., Lamarche, F. and Arnaldi, B. (2007). A review of

classification algorithms for EEG-based brain–computer interfaces. Journal of Neural

Engineering, 4(2), R1-R13. https://doi.org/10.1088/1741-2560/4/2/R01.

Bellomo, N., De Angelis, E. and Delitala, M. (2007). Lecture notes on mathematical

modelling in applied sciences.

Imboden, DM. and Pfenninger, S. (2013). Introduction to systems analysis mathematically

modelling natural systems. Springer-Verlag Berlin Heidelberg, 2013 edition, 252.

Haapalainen, E., Kim, S., Forlizzi, JF. and Dey, AK. (2010). Psycho-physiological

measures for assessing cognitive load. Proceedings of the 12th ACM International

Conference on Ubiquitous Computing, 301-310.

https://doi.org/10.1145/1864349.1864395.

Cao, A., Chintamani, KK., Pandya, AK. and Ellis, RD. (2009). NASA TLX: Software for

assessing subjective mental workload. The Psychonomic Society, 113-117.

https://doi.org/10.3758/BRM.41.1.113.

Gibbons, RD. and Hedeker, D. (2000). Applications of mixed-effects models in biostatistics. The

Indian Journal of Statistics, 62, 70-103.


Maruyama, N. (2008). Generalized linear models using trajectories estimated from a linear mixed

model. Division of Biostatistics, Kitasato University Graduate School of

Pharmaceutical sciences, 102-110.

Cnaan, A., Laird, NM. and Slasor, P. (1997). Using the general linear mixed model to analyze

unbalanced repeated measures and longitudinal data. Statistics in Medicine, 16, 2349-2380.

https://doi.org/10.1002/(SICI)1097-0258(19971030)16:20<2349::AID-SIM667>3.0.CO;2-E.

Tateishi, R., Shimazaki, Y. and Gunin, PD. (2004). Spectral and temporal linear mixing

model for vegetation classification. International Journal of Remote Sensing, 25, 4203-4218.

https://doi.org/10.1080/01431160410001680437.

Ganesh, UL., Krishna, M., Hari prasad, SA. and Krishna, SR. (2011). Review on models for

generalized predictive controller. Proceedings of International Conference on

Computer Science, Engineering and Applications, 418-424.

https://doi.org/10.5121/csit.2011.1237.

Geethanjali, B., Adalarasu, K., Jagannath, M. and Rajasekaran, R. (2016). Enhancement of task

performance aided by music. Current Science, 111(11), 1794-1801.

https://doi.org/10.18520/cs/v111/i11/1794-1801.

