

Stress and Cognitive Load

NICTA Summer Scholarship Report ‐ 2011/2012.

Dan Conway

Supervisors: Dr Yang Wang, Dr Fang Chen

ABSTRACT

Human physiological signals have been suggested as proxies with which to non‐invasively quantify

Cognitive Load (CL) in real‐time. A major challenge for any approach to CL detection is the presence

of stress, which may affect physiological measurements in ways that confound reliable detection of

CL. In this experiment we attempted to dissociate CL from stress. GSR was used as a proxy for stress

and pre‐frontal cortical activity as measured by EEG signals as a proxy for CL. Three levels of CL were

induced in 11 participants using math tasks in both ‘no‐stress’ and ‘stress’ conditions. The

experiment used a modified version of the MIST protocol which utilizes feelings of lack of control,

task failure and self and social‐evaluation to induce stress. Using basic statistical measures for eight

subjects, GSR levels were shown to be significantly different between CL levels in the ‘no‐stress’

condition, but not in the ‘stress’ condition. This has important implications for CL quantification in

that other physiological signals may also exhibit similar patterns where a stress response over‐rides

signal variation owing to CL. Further analysis of the body of data generated by this experiment,

utilising machine learning techniques, is suggested.

INTRODUCTION

Physiological signals have previously been proposed as a method of quantifying Cognitive Load (CL).

Signals include heart‐rate, heart‐rate variability, pupil‐dilation, blood‐pressure, respiration rate and

GSR (Galvanic Skin Response). Some notable successes in CL quantification have been achieved via

signals such as speech (Chen, 2006), Heart Period (Veltman & Gaillard, 1998), Pupillary Response

(Xu, 2011) and Heart Rate Variability (Aasman, Mulder, & Mulder, 1987).

However, any given physiological signal is only a proxy for CL and is likely to be affected by countless

additional inputs from the human body (Longo, et al., 2010). Thus a major task of CL measurement

via physiological means is demonstrating the diagnosticity and construct validity of any nominated

proxy physiological signal.

One of the major contributors to change in human physiological systems is stress (Martin, 2007).

Stress has been shown to affect both the sympathetic and parasympathetic nervous systems and, in

its more extreme states, results in large changes to physiological function that may well obscure the

relationship between a physiological indicator and CL. Furthermore, stress may, in some

circumstances, be a confound for CL in that changes in CL may correlate with changes in stress‐levels

(Veltman & Gaillard, 1998). Construct validity must be established before we can safely assert that

changes in physiological indicators are the result of CL and not stress (or indeed some other factor).


The ‘Stress and Cognitive Load’ experiment (SACL) was an attempt to dissociate CL from stress. It

was hoped that this would provide a clearer picture of the relationship between these two variables, which

would be useful in the operationalisation of CL detection technologies.

Within the psychological literature there are a number of different models and definitions of stress.

Within the CL community though, there seems to be a common reliance on a particular view of

stress that emphasizes an extremely mechanistic view of what constitutes a stressor. Setz (2010)

says ‘Work related stress occurs when there is a poor match between job demands and the

capabilities, resources or needs of the worker’. Similarly Hagmueller (2006) suggests that stress is a

‘Physiological and psychological answer of a human body to a specific workload’ and according to

Hockey (1997): ‘a mismatch between required and prevailing task states’. Whilst these more

homeostatic based definitions may be sufficient for certain paradigms, they do not incorporate the

subjective element of human qualia. An alternative, and widely accepted model of stress in the

psychological field, and perhaps more appropriate to the CL paradigm, is Lazarus and Folkman’s

(1984) ‘transactional’ model of stress. They propose that stressors are extremely subjective and

what constitutes a stressor can vary widely from person to person. Specifically they propose a

model where any event can potentially be a stressor and there are two cognitive ‘appraisal’

processes that determine whether the event will be responded to as a stressor. This is an important

distinction in that it more successfully models the real‐world variability of people’s responses to potentially

stressful events. Furthermore it incorporates subjective impressions and internal cognitive

processes as potential triggers for stress.

Although definitions of stress vary, there is good consensus in the literature regarding conditions

where it is likely to arise. Failure at a task, together with feelings of lack of control, in situations

where participants are evaluated by others is a widely used paradigm for stress induction.

Figure 1. The Appraisal process in evaluating potential stressors according to Lazarus and Folkman’s ‘Transactional’ model.


Duval and Wicklund (1972) found that observation of the self via a mirror or a camera pointing at

the participant led to increased objective self‐awareness and therefore self‐evaluation. In conditions

resulting from negative self‐evaluations this mirror became a highly aversive stimulus and led to an

‘uncomfortable state’.

Aversive self‐evaluative events are also amplified by the presence of others as demonstrated by

Markus (1978). Furthermore, ‘upward‐social comparison’ where the participant compares

themselves with others who are considered higher performers, presents a threat to self‐esteem

(Tesser, 1988). Tesser (1988) also showed that when the observer is from a highly‐relevant

performance domain (i.e. someone from the same company, who is an expert in the field) social

comparison is more likely to occur and further that an observer with high social status (as relative to

the observed) increased social comparison.

A ‘feeling of lack of control’ (Dickerson & Kemeny, 2004) is also supported as a pre‐cursor to stress,

especially when resulting in task failure in highly motivated states (Setz, et al., 2010).

These various techniques were all operationalised in an experimental paradigm developed by

Dedovic (2005) called the ‘Montreal Imaging Stress Task’ (MIST). The Stress and Cognitive Load

experiment presented here (SACL) closely follows the MIST protocol with minor operational

adjustments for the lab context in which it was presented. Three blocks of math tasks, of increasing

difficulty levels were presented in both ‘no‐stress’ and ‘stressed’ conditions.

In the stress condition, task failure was induced by imposing dynamically adjusting time‐limits so

that they were just too short to allow participants to complete trials. Lack of control was induced by

the inability to pause or complete tasks in time. Self‐evaluative stress was induced by displaying a

video image of the participant directly within their field of view. Social‐evaluative stress was

induced by the presence of ‘observers’ watching the participants perform poorly at the task, one of

whom was the CEO of the company for which they worked, which also gave rise to a highly

motivated environment.

A challenge for dissociating stress from CL is the possibility that increasing CL results in a

corresponding increase in stress. Indeed when stress is not controlled for and participants are

highly motivated to perform well, this is a likely outcome (Longo, et al., 2010). Thus previous

findings (Shi, 2007) of increasing level of GSR in response to assumed workload may well be an

indication of increased stress rather than CL. Based on this assumption it was hypothesized that GSR

was actually an indicator of stress and not CL and thus was selected as an independent variable. It

was originally hoped that through strict control of the stress environment GSR would remain stable

between CL levels but show higher values in the ‘stress’ condition than the ‘no‐stress’ condition.

GSR has been shown to be a reliable indicator of the stress response (Setz, et al., 2010). Whilst

physiological responses to stressors involve both the sympathetic and para‐sympathetic nervous

system, the human skin is one of the few organs controlled only by the sympathetic nervous system,

and thereby suggests itself as a clearer indicator of stress based arousal than those organs which are

controlled by both systems (Setz, et al., 2010).

Activity in the Pre‐frontal Cortex has been shown to correlate with Working Memory tasks via fMRI

procedures (Barch, et al., 1997). EEG has also been shown to be able to detect variations in Task


Difficulty via power band analysis (Harmony, et al., 1999). The SACL experiment thus attempted to

quantify CL via PFC EEG activity. In order to additionally assess the suitability of new, commercially

available and low cost EEG systems for CL measurement, an Emotiv EEG headset and software was

used for gathering EEG data.

Unfortunately during the process of designing this experiment, I could find no conceptually valid way

of returning to a ‘No‐stress’ condition after a ‘Stress’ condition. This means that for all subjects the

Stress condition followed the ‘No‐stress’ condition. The 3 blocks of tasks of different cognitive load

levels were also always presented in order of difficulty from low to high. In future experiments it is

suggested that a randomized order of CL level within each condition is presented.

Figure 2: Experiment sequence of tasks and conditions.

METHOD

The experiment used a ‘within‐subjects’ 2 × 3 factorial design (two stress conditions by three CL levels), giving six cells.

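                          ‘No‐stress’ condition      ‘Stress’ condition
Cognitive Load Level 1    ‘No‐Stress’ CL level 1     ‘Stress’ CL level 1
Cognitive Load Level 2    ‘No‐Stress’ CL level 2     ‘Stress’ CL level 2
Cognitive Load Level 3    ‘No‐Stress’ CL level 3     ‘Stress’ CL level 3

Table 1: Experimental design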

Participants

11 male participants (ten right‐handed, one left‐handed), all employees or students at NICTA, took

part in the experiment, each at different times, in the level five laboratory at the NICTA

headquarters at ATP Sydney. Participants were offered one movie ticket and biscuits as recompense

for their participation.

Screening procedure and ethics

All participants had the voluntary nature of the experiment explained to them and were told that they

could stop and leave at any time. They signed a release form and then filled out a paper version of

the Kessler K‐10 Psychological Distress Scale to ascertain that they were unlikely to be vulnerable to

ongoing negative effects from the stress condition. Only participants who scored less than 19 (and thus

fell into the category ‘likely to be well’) were permitted to continue the experiment. Three potential


participants were rejected via this means. All procedures as specified in the UNSW ethics application

were carried out.

Materials

The math problems presented were of three different difficulty levels. For level 1 problems, three

terms were to be added together. All terms were less than 100 and the answer was always less than

100. No carrying was required for any stage of the addition except for, in some cases, the operation

that resulted in the answer, when executed left to right. Thus, when executed left to right, addition

of the 1’s always resulted in a number less than 10 except for the final operation. For level 2, each

problem consisted of four or five terms, with both addition and subtraction required. Each term was

less than 1000 and the answer was always less than 1000. For each stage of the problem, when

solved from left to right, the 1’s required a ‘carry’ operation, i.e. the 10’s had to be manipulated

whilst remembering the 1’s. Level 3 problems consisted of five terms, each less than 1000, the

answer always being less than 1000, with addition, subtraction and multiplication of two

terms required. For the multiplication terms, at least one term was more than 20, but never more

than 40. The multiplication terms were in a random position within the problem.

For all problems, an attempt was made to avoid operations that would be facilitated by at least the most

common of mathematical ‘tricks’. Thus numbers never summed to a multiple of 10, multiplications

never used numbers divisible by 10 or 5 etc…
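The Level 1 constraints described above can be illustrated with a minimal sketch. This is not the SACL generator, whose implementation is not given here; the function name, the use of rejection sampling and the lower bound of 1 on each term are assumptions made for illustration only.

```python
import random


def make_level1_problem(rng):
    """Return three addends that satisfy the Level 1 rules (hypothetical helper)."""
    while True:
        # Assumption: terms are drawn uniformly from 1..99.
        a, b, c = (rng.randint(1, 99) for _ in range(3))
        total = a + b + c
        # The answer must be less than 100.
        if total >= 100:
            continue
        # Working left to right, the first addition must not carry in the 1's.
        if (a % 10) + (b % 10) >= 10:
            continue
        # Avoid the 'trick' of sums that land on a multiple of 10.
        if (a + b) % 10 == 0 or total % 10 == 0:
            continue
        # A carry in the final addition is permitted, so it is not checked.
        return a, b, c


if __name__ == "__main__":
    a, b, c = make_level1_problem(random.Random(1))
    print(f"{a} + {b} + {c} =")
```

Rejection sampling keeps each rule visible as a separate check, at the cost of discarding most random draws.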

Difficulty Level 1 Problems    Difficulty Level 2 Problems    Difficulty Level 3 Problems
27 + 11 + 15 =                 558 ‐ 477 + 81 + 312 =         818 ‐ 631 + 556 ‐ 12 x 21 =
33 + 42 + 8 =                  129 + 786 ‐ 533 ‐ 81 =         917 ‐ 13 x 22 ‐ 407 ‐ 173 =
22 + 31 + 44 =                 774 ‐ 447 + 315 ‐ 71 =         245 + 687 ‐ 22 x 29 + 558 =

Table 2: Examples of the math problems presented.

The formulation of these problems came about through a reasonably intensive period of pilot testing

where it was found that the number of terms, carrying and having to remember numbers whilst

performing other operations (such as required by the order of operations in level 3 problems) were

the most reliable methods of increasing subjective difficulty ratings. Example math problems are

presented in Table 2 (with a full list available in Appendix A).

Apparatus

All experimental stimuli were presented on a VDU using custom software (SACL V1.0) whilst

participants were sitting comfortably at a desk.

An Emotiv ‘Epoc’ headset was fitted to the participants’ heads according to the protocol outlined in

‘Emotiv Beta EPOC Hardware Setup Guide Revision 1.0’. Saline solution was applied to the sensors

and contact was reliable and stable for all participants for all sensors with the exception of P7 and P8

which were either intermittent or made no contact at all for four participants (seemingly owing to

the shape of the participant’s head). Participants’ eyeglasses seemed to make no difference to

contact quality. EEG signals were then recorded using the Emotiv ‘Testbench’ application,

monitored visually during the experiment by the experimenter and later converted to XLSX files for

data analysis. Markers were sent by the SACL application to the Testbench software via virtual serial

ports at the end of each information screen and at the beginning and end of each block.


Marker value    Event
30              Experiment Begin button is pressed.
32              End of any information/inter‐block pause screen.
31              End of the ‘Nominate Target Score’ screen (when the experimenter is to enter the room and enable the video screens of the ‘observers’ for the beginning of the stress condition).
Three digits    Block begins/ends. Block markers are three‐digit numbers. The format is:
                  1st digit: 1 (indicates it’s a block marker)
                  2nd digit: 1 = Task begin, 0 = Task end
                  3rd digit: Block number
                  E.g. 103 = block marker, end of block, block 3.

Table 3: Marker codes sent by SACL to Testbench software for insertion into EEG output EDF file.
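As an illustration of the marker scheme in Table 3, the following sketch encodes and decodes marker values. The function names and event labels are hypothetical and are not part of the SACL or Testbench software; single‐digit block numbers are assumed, as implied by the three‐digit format.

```python
# Fixed marker values taken from Table 3; the label strings are illustrative.
FIXED_MARKERS = {
    30: "experiment_begin",
    31: "target_score_screen_end",
    32: "info_screen_end",
}


def encode_block_marker(block_number, is_begin):
    """Build a three-digit block marker: 1, then 1 (begin) or 0 (end), then block number."""
    if not 0 <= block_number <= 9:
        raise ValueError("block number must be a single digit")
    return 100 + (10 if is_begin else 0) + block_number


def decode_marker(value):
    """Return (event_label, block_number); block_number is None for non-block markers."""
    if value in FIXED_MARKERS:
        return FIXED_MARKERS[value], None
    if 100 <= value <= 119:
        is_begin = (value // 10) % 10 == 1
        return ("block_begin" if is_begin else "block_end"), value % 10
    raise ValueError(f"unknown marker value: {value}")


if __name__ == "__main__":
    print(decode_marker(103))  # the example given in Table 3
```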

GSR signals were collected using a Thought Technology ‘ProComp Infiniti’ interface and its ‘SC

Flex/Pro’ skin conductance sensor. The sensors were attached to D2 and D4 of the non‐dominant

hand for all participants. GSR signals were sampled at a rate of 256 Hz via custom software developed

by Ronnie Taib for NICTA.

Participants were asked to remain perfectly still and only move their dominant hand for mouse

control during the experiment. Participants were not asked to suppress blinking as this may have

added to the ‘stress’ during measurement of their baseline state and during the ‘No‐stress’

condition.

Procedure

All participants undertook the ‘no‐stress’ condition first.

Participants were told that they would be completing math tasks but it was emphasized that their

performance/accuracy was not important and in all likelihood that their performance and accuracy

data would not even be examined. They were told that the tasks were designed to induce different

levels of CL and the aim of the experiment was solely to try and measure CL via the GSR and EEG

systems. The experimenter maintained an informal and casual tone with the participants before

leaving the room to allow the participant to commence the experiment.

After submitting some basic demographic information, a two minute ‘baseline’ period was carried

out where the participants were told, via an on‐screen prompt, that they should just relax and let

their mind wander. Then three two‐minute blocks of math tasks were presented with 4 multiple

choice answers available for response by clicking on‐screen buttons with the mouse. Tasks were not

time‐limited and feedback was not provided. The blocks were not terminated until a participant

finished the current question, therefore some blocks were longer owing to participants finishing a

question after the prescribed two minutes.


The three blocks in the ‘no‐stress’ condition were of level 1, 2 and 3 difficulty in sequential order.

Between each block the participant was given a two minute ‘pause’ to allow physiological signals

time to return to baseline.

After block three, the participants were asked, via on screen prompts, to nominate a ‘target score’

for further tasks based on their estimation of their performance so far. Once submitted, the stress

condition ensued. They were told that their performance would now be monitored. They were

also informed of time limits for further trials. At this point the experimenter entered the lab and

switched on two large LCD televisions behind and above the participant’s computer monitor, thus

directly in the field of view of the participant, and a video projector aimed at a large screen directly

to the participant’s left. One LCD screen displayed a video feed of the participant’s face from a small

webcam mounted on the desk. The second LCD screen displayed a video feed of ‘observers’ staring

into the camera, and therefore apparently at the participant. The large screen to the participant’s

left displayed a ‘mirrored’ image of the screen that the participant was using, thus rendering their

performance highly visible. Participants were then told that they would now be able to see the

observers for the rest of the experiment, implying that they had been being watched all along from

behind the one‐way glass of the observation room. The ‘observers’ were, unbeknownst to the

participant, actually only a pre‐recorded video, but appeared to be Hugh Durrant‐Whyte, the CEO of

NICTA, a fellow student, and the experimenter.

Figure 3: A participant during the stress condition. His own image is displayed on the left hand LCD television screen, the ‘observers’ on

the right hand LCD television screen, and the experiment screen projected on the large screen to his left.

The participant was then told to continue and the experimenter returned to the observation room.

The pre‐recorded video of the ‘observers’ was timed so that the experimenter appeared to take his

place amongst the ‘observers’ at this point.


Now in the stress condition, three more blocks of level 1, 2 and 3 difficulty level math tasks were

then carried out, again with two minute pauses in between each block but with time limits now

imposed on each trial. These were calculated by taking the mean of the RT’s for the same CL level

block in the ‘no‐stress’ condition and multiplying this time by .9. This time limit was also then

dynamically updated. In cases where participants correctly answered three questions in a row

within the trial time limits, the time limit was further multiplied by .9. In cases where the participant

either ran out of time or responded incorrectly to three questions in a row, the time limit was

multiplied by 1.1. The participant’s nominated target score was displayed on the left hand side of the

screen during all trials, and their current percentage of correct answers was displayed on the right

hand side. Feedback (‘Correct’, ‘Wrong’ or ‘Out of time’) was provided for one second after each

trial.
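The dynamic time limit described above can be sketched as follows. This is an illustrative reconstruction rather than the SACL V1.0 code: the class name is hypothetical, and it is assumed that a streak counter resets after each adjustment and that timeouts and incorrect answers count towards the same failure streak.

```python
from statistics import mean


class AdaptiveTimeLimit:
    """Per-trial time limit that tightens on success and relaxes on failure."""

    def __init__(self, no_stress_rts):
        # Start at 0.9 x the mean response time from the matching no-stress block.
        self.limit = mean(no_stress_rts) * 0.9
        self._successes = 0
        self._failures = 0

    def record(self, correct, timed_out=False):
        """Update streaks after a trial and return the limit for the next trial."""
        if correct and not timed_out:
            self._successes += 1
            self._failures = 0
        else:
            self._failures += 1
            self._successes = 0
        if self._successes == 3:
            self.limit *= 0.9  # three correct in a row: shorten the limit
            self._successes = 0  # assumption: streak restarts after adjusting
        elif self._failures == 3:
            self.limit *= 1.1  # three failures in a row: lengthen the limit
            self._failures = 0
        return self.limit


if __name__ == "__main__":
    limiter = AdaptiveTimeLimit([10.0, 12.0, 8.0])  # hypothetical RTs in seconds
    for outcome in (True, True, True, False, False, False):
        print(round(limiter.record(correct=outcome), 3))
```

Because the limit shrinks after every run of successes, it converges on a value just short of what the participant needs, which is how the protocol induces task failure.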

Figure 4: The screen presentation of a typical (level 1) math task in the ‘stress’ condition.

Once all six blocks had been completed the experiment concluded and the participant was

immediately told that the video of the ‘observers’ had been pre‐recorded and ‘the boss’ had not, in

fact, been watching them. They were then debriefed verbally, encouraged to ask questions, as well

as given a text debriefing document to take away with them. Participants were also asked to fill out

a questionnaire where they could report on their experience of the experiment and they were given

the option of revoking their consent for NICTA to use their data (no participants chose to do so).

Finally participants were asked to complete a further three blocks of math tasks each of a different

difficulty level and equivalent to those in the main experiment and rate each question via a nine

point Likert scale to ascertain subjective ratings of task difficulty (F. G. Paas & Van Merrienboer,

1994).

RESULTS

EEG data was gathered for 11 participants but was only tentatively analysed owing to project time

constraints; as such, it will not be discussed here.

GSR data was collected for 11 participants, however only data from the first eight were analysed

owing to project time constraints. The analysis described below is seen as preliminary in that it does


not involve any machine learning or feature recognition which is suggested as being the most fruitful

method of analysis. The present analysis may, however, be taken as indicative.

In the Post‐Experiment questionnaire participants were asked (amongst other things) ‘Did you feel

‘stressed out’ during the experiment? If so – how much?’. All participants answered in the

affirmative with responses ranging from ‘A little bit’, through to ‘Yes, very’.

An ANOVA of pooled subjective ratings for each level showed differences (n = 8, df = 2, F = 82.32, p <

.0001). The means of each group were increasing by CL level (Level 1: M = 1.52, SD = .19. Level 2: M

= 4.17, SD = .27, Level 3: M = 7.46, SD = .47) and the difference between each group was significant

(1 to 2: p < .0001, 2 to 3: p <.0001).

Epoch generation

The ‘middle minute’ of each block was used as the basis for analysis with the aim of minimising

fatigue and practice effects. Once the epochs were extracted, a mean value of the two minute

baseline period was derived. This mean was subtracted from the GSR value at each time point

within the participant’s epochs. A mean was then calculated for the entire epoch, resulting in a

single value for each epoch (or block) for each participant.
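The epoch procedure above amounts to a few lines of arithmetic, sketched here for clarity. The function names are hypothetical, and it is assumed that the ‘middle minute’ is the 60‐second window centred on the block, which matters for blocks that ran slightly over two minutes.

```python
from statistics import mean

SAMPLE_RATE_HZ = 256  # GSR sampling rate reported in the Apparatus section


def middle_minute(block, rate=SAMPLE_RATE_HZ):
    """Return the 60-second window centred on the block (assumed definition)."""
    n = 60 * rate
    if len(block) < n:
        raise ValueError("block is shorter than one minute")
    start = (len(block) - n) // 2
    return block[start:start + n]


def epoch_value(block, baseline, rate=SAMPLE_RATE_HZ):
    """Mean of the baseline-corrected middle minute: one value per block."""
    baseline_mean = mean(baseline)
    return mean(x - baseline_mean for x in middle_minute(block, rate))


if __name__ == "__main__":
    # Synthetic signals in microsiemens, for illustration only.
    baseline = [1.0] * (120 * SAMPLE_RATE_HZ)
    block = [1.5] * (120 * SAMPLE_RATE_HZ)
    print(epoch_value(block, baseline))
```

Subtracting the baseline mean at each time point and then averaging is equivalent to subtracting it from the epoch mean; the sketch follows the order given in the text.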

Statistical Analysis

For GSR measurements within the ‘no‐stress’ condition, a repeated measures ANOVA showed

significant differences between CL level within subjects (F = 6.402, df = 2, p = .029). Group means

showed increasing GSR response with increased CL (CL level 1 = .243 µS, CL level 2 = .661 µS, CL level

3 = .9299 µS). However, for the ‘stress’ condition, a repeated measures ANOVA showed no significant

differences between CL level within subjects (F = 2.816, df = 2, p = .287). In this condition the means

of each group descended with increasing CL (CL level 1 = .3 µS, CL level 2 = .26 µS, CL level 3 = .24

µS). All measurements quoted are in Micro‐Siemens (µS), a measurement of conductance.

Figure 5. Mean of ‘Middle Minute Epochs’ of GSR for participants 2‐9 measured during the

'No‐stress' condition by Cognitive Load.


Paired‐Samples T‐Tests of GSR measurements between the ‘no‐stress’ and ‘stress’ conditions were

then carried out for each CL level.

Significant differences were found between conditions for CL level 1 (t = ‐3.786, df = 7, p = .007)

where the ‘no‐stress’ condition exhibited a lower mean GSR (.24 µS) than ‘stress’ (2.99 µS).
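For readers unfamiliar with the test, a paired‐samples t statistic can be computed from the per‐participant epoch values as sketched below. The data in the example are invented for illustration and are not the experimental values; the function name is hypothetical. In practice a statistics package (for example `scipy.stats.ttest_rel`) would also return the p value.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(no_stress, stress):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    if len(no_stress) != len(stress):
        raise ValueError("paired samples must have the same length")
    # Differences are no-stress minus stress, so t is negative when
    # the stress condition has the higher values.
    diffs = [a - b for a, b in zip(no_stress, stress)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Invented values for eight hypothetical participants (microsiemens).
    no_stress = [0.1, 0.3, 0.2, 0.4, 0.2, 0.3, 0.1, 0.3]
    stress = [0.9, 1.4, 0.6, 2.1, 1.0, 1.8, 0.7, 1.2]
    t, df = paired_t(no_stress, stress)
    print(f"t = {t:.3f}, df = {df}")
```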





Figure 6. Mean of ‘Middle Minute Epochs’ of GSR for participants 2‐9 measured during the

'Stress' condition by Cognitive Load.

Figure 7. Mean of ‘Middle Minute Epochs’ of GSR for participants 2‐9 by condition for Cognitive Load Levels 1, 2 and 3.


DISCUSSION

The methods used here to induce stress seemed to deliver reliable increases in GSR as well as

subjective ratings of stress. Thus although we cannot categorically say that the ‘no‐stress’ condition

had no stress, and importantly, we cannot say that stress levels did not vary across the different

levels of CL tasks in the ‘no‐stress’ condition (more on this later), we CAN say that the two conditions

were quantitatively different in the levels of stress that they induced.

Although the analysis presented is somewhat rudimentary and does not utilise any feature

detection, the outcomes may be considered indicative of an underlying truth: In the ‘no‐stress’

condition GSR was seen to reliably increase with CL, whereas in the stress condition, this relationship

is not present.

This has important implications for attempting to use physiological signals as means of assessing

CL. It is conceivable that similar patterns may exist for other physiological signals meaning that

stress needs to be controlled in order to accurately measure CL. Further research into other

physiological signals along similar lines as the experiment presented here seems warranted.

A challenge for CL investigations is the nature of the task used to induce CL. A language based task

was not implemented in consideration of the variability of language backgrounds in the likely

participants, i.e. NICTA staff, and this consideration is likely to be important in future experiments. A

math based task was chosen largely for its previous implementation in both the Trier Social Stress

Test (Kudielka, 2008) and Dedovic’s MIST protocol (2005). However task performance did

demonstrate variation owing to, it is assumed, individual differences in math ability. Although an

analysis of subjective ratings did show sufficient differences between the tasks, future investigations

may be better served by using a task that is less dependent on learnt abilities. An ‘n‐back’ task may

more specifically target Working Memory and thus be a more reliable method of inducing given

levels of CL across individuals of different abilities and backgrounds.

Figure 8: A typical participant’s GSR response over the duration of the experiment.


Any task that is assumed to induce different levels of CL must be assessed as to its criterion validity.

One cannot assume that a given manipulation of task parameters will monotonically increase task

difficulty. The human cognition system, with its capacity for parallel processing or even super‐

processing, variation in stopping rules for different tasks and variance in task expertise may in many

cases exhibit variation in performance for a given task that is not reflected in the assumed increasing

difficulty of a task. Altering a given parameter of a task (increasing the number of targets from 1 to

2 for example) may well not make a task twice as difficult. Indeed owing to the brain’s enormous

plasticity this is in fact quite unlikely. Thus assessment of task difficulty at the outset of the

experimental design process is critical.

Self‐reported, subjective ratings of task difficulty have been shown to be a reliable method of

quantification of task difficulty (Damos, 1991), and a number of different approaches to gathering

this data have been proposed. In the initial stages of this project the SWAT (Nygren, 1991),

NASA‐TLX (Rubio, Díaz, Martín, & Puente, 2004) and Paas (F. G. Paas & Van Merrienboer, 1994) rating

systems were all trialed and assessed for suitability for implementation. The SWAT process was

quickly found to be both slow and laborious, and also has been reported as less sensitive to

differences in low CL than NASA‐TLX (Luximon, 2001). The NASA‐TLX paradigm is highly regarded

(Damos, 1991) and demonstrates good diagnosticity and as such was operationalised in a pilot

project (indeed is also an option within the SACL experiment application) to assess its suitability.

Two of the six factors (Physical Demand and Temporal Demand), were not relevant to our purposes

so a modified version was implemented with an appropriate mathematically modified version of the

weightings matrix. This was found to have extremely poor face validity with participants expressing

a large degree of frustration and dissatisfaction with the method. Finally a simple 9 point Likert

scale, emulating Paas was implemented and this was found to be highly appropriate to the situation

being both less intrusive and possessing higher face validity. Both Paas and the modified NASA‐TLX

ratings systems are available as options within the SACL application.

The Cognitive Load model, as proposed by Sweller (1988) and expanded upon by Paas (2003) and

others, was originally developed within the Pedagogical paradigm, and as such, is a sometimes

uneasy fit for the real‐time cognition focus that typifies NICTA’s approach. Specifically, the CL

model does not incorporate the concept of the ‘central executive’ in Working Memory as proposed

by Baddeley (1974), but rather uses a conceptual model of ‘schema’ retrieval and application

(Schnotz & Kürschner, 2007). This may or may not have serious implications for our research

purposes, but a more concerning aspect of the CL model is the complete absence of any reference to

perceptual processing. Perceptual processing has been shown to require cognitive effort in that the

initial acquisition of information by the sensory system, when stimuli are indistinct or ambiguous,

induces load on the cognitive system (Eysenck, 2010). Signal Detection Theory (SDT) has been

grappling with these problems since the Second World War and an established body of knowledge

exists to allow the quantification and assessment of the nature of signal acquisition issues. The

question then becomes whether the load induced by perceptual processing falls within or outside of

the CL model. In one fMRI study, Barch et al. (1997) achieved a double‐dissociation between

activity associated with Working Memory in the dorsolateral prefrontal cortex and load induced by

Perceptual Coding which showed increased activity in the anterior cingulate as well as the frontal

cortex. This suggests that these are two separate sources of ‘load’ on the human cognition system,

and the perceptual demands of a task are likely to induce patterns of cortical activation that are

distinct from Working Memory tasks. This has important ramifications for EEG based approaches to


detecting CL in that task difficulty imposed by perceptual coding (such as degraded stimuli,

ambiguous signals, etc…) may contribute to mental load but in patterns that are different from those of

Working Memory load. With the aim of developing real‐time detection of CL via EEG, this factor

then needs to be either excluded from the CL model, or if it is to be included, must be quantified.

Further research into the relationship between stress and Cognitive Load is suggested. Whether it

will be possible to ever create a motivated, no‐stress task remains to be seen, but of particular

interest would be to determine whether stress is automatically induced when the task exceeds

operator capacity. According to Lazarus and Folkman’s model this is likely, since the appraisal

processes would result in an ‘I cannot cope’ evaluation. An experiment where carefully calibrated

tasks induce load just below and just above participants’ cognitive capacity would be informative.

It also remains to be seen whether one can ever create tasks where increasing task difficulty does

not induce some corresponding increase in stress. This experiment was based on the assumption

that one can exert mental effort and not necessarily become stressed, but this assumption is open to

challenge and more direct proxies of stress such as blood‐cortisol levels as controlled by the

hypothalamus‐pituitary‐adrenal axis would need to be assessed in order to clarify this relationship.

An interesting study with many parallels to the work presented here was carried out by Setz et al. (2010).

Unfortunately the title of the paper is somewhat misleading since they do not experimentally

manipulate cognitive load and therefore only demonstrate the ability to differentiate between

‘stress’ and ‘no‐stress’ conditions where CL is consistent between conditions (confusingly they label

the ‘no‐stress’ condition as the ‘Cognitive Load Condition’). Nonetheless, the feature detection

processes they outline appear promising, and suggest themselves as avenues with which to analyse

the body of data generated by the SACL experiment.

An additional intended outcome from this experiment was assessing the suitability of the Emotiv

Epoc headset for EEG recording. As mentioned earlier, the P7 and P8 sensors were sometimes

unable to make contact with a participant’s scalp owing to head size/shape and a lack of flexibility in

the unit. Furthermore, there were some problems with the build quality of the device (sensors

would sometimes drop out of their housing when being fitted, the plastic flanges holding sensors in

place would sometimes break, etc.). Having said this, the device seemed to perform remarkably

well given its price. Ideally, a within‐subjects comparison of signals derived from both the Epoc and

another, higher‐quality headset on an identical series of tasks would now be carried out, allowing a

more exact comparison of the derived EEG signals.

One potential flaw in the design of the current experiment is that the negative

relationship between CL and GSR levels in the stress condition may be the result of an initial spike of

GSR when the stress condition begins, followed by a gradual decline as the physiological signals

return to some other ‘baseline’ state. Future experiments should establish the recovery time of GSR

signals and ascertain that this effect is not present in the tasks within the stress condition.
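
The recovery‐time check suggested above could be sketched as follows. This is a minimal illustration, not the analysis used in this experiment: the function name `recovery_time`, the 5% tolerance and the synthetic exponentially decaying trace are all assumptions made for the example.

```python
import math


def recovery_time(gsr, fs, baseline, tolerance=0.05):
    """Seconds from the peak of a GSR trace until it first returns to within
    `tolerance` (a fraction of the peak height above baseline) of `baseline`.

    gsr      -- sequence of GSR samples (hypothetical recording)
    fs       -- sampling rate in Hz
    baseline -- resting GSR level, assumed to be measured separately
    Returns None if the trace never recovers within the recording.
    """
    # Index of the first maximum, taken as the initial stress-onset spike.
    peak_idx = max(range(len(gsr)), key=lambda i: gsr[i])
    height = gsr[peak_idx] - baseline
    if height <= 0:
        return 0.0  # the signal never rose above baseline
    threshold = baseline + tolerance * height
    for i in range(peak_idx, len(gsr)):
        if gsr[i] <= threshold:
            return (i - peak_idx) / fs
    return None


# Synthetic example: a spike of 3 units above a baseline of 2 that decays
# exponentially with a 10 s time constant, sampled at 10 Hz for 60 s.
fs = 10
trace = [2.0 + 3.0 * math.exp(-(i / fs) / 10.0) for i in range(600)]
print(recovery_time(trace, fs, baseline=2.0))  # prints 30.0
```

If the recovery time estimated in this way were comparable to the length of the stress condition, a decline in GSR across tasks could not be attributed to CL alone.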

The focus of this project (a Summer ‘Taste of Research’ scholarship) was on the experimental design

and data‐collection aspects. Thus, the body of data generated by this experiment begs for more

rigorous and deeper analysis. It is recommended that current Machine Learning approaches be

applied to both the GSR and EEG signals acquired to ascertain whether existing feature detection

methods can distinguish between CL levels in the stress condition.
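
As a starting point for such an analysis, each task recording could be reduced to a few summary features per time window before any classifier is applied. The sketch below is only illustrative: the feature set (mean, standard deviation and linear slope), the window length and the name `window_features` are assumptions, and it is not the feature detection process outlined by Setz (2010).

```python
from statistics import mean, pstdev


def window_features(signal, fs, window_s):
    """Split `signal` into non-overlapping windows of `window_s` seconds and
    return one (mean, standard deviation, slope) tuple per complete window.

    signal   -- sequence of samples (e.g. a GSR trace; hypothetical input)
    fs       -- sampling rate in Hz
    window_s -- window length in seconds (an assumed analysis parameter)
    """
    n = int(fs * window_s)
    if n < 1:
        raise ValueError("window must contain at least one sample")
    times = [i / fs for i in range(n)]
    t_mean = mean(times)
    # Denominator of the least-squares slope; zero only for one-sample windows.
    den = sum((t - t_mean) ** 2 for t in times)
    features = []
    for start in range(0, len(signal) - n + 1, n):
        window = signal[start:start + n]
        m = mean(window)
        num = sum((t - t_mean) * (v - m) for t, v in zip(times, window))
        slope = num / den if den else 0.0  # signal units per second
        features.append((m, pstdev(window), slope))
    return features


# Example: a steadily rising signal sampled at 1 Hz, in 4 s windows.
for feats in window_features([0, 1, 2, 3, 4, 5, 6, 7], fs=1, window_s=4):
    print(feats)
```

Feature tuples of this kind, labelled with the CL level and stress condition of the task they came from, would form the input to whichever classifier is chosen.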


REFERENCES

Aasman, J., Mulder, G., & Mulder, L. J. (1987). Operator effort and the measurement of heart‐rate variability. Human Factors, 29(2), 161‐170.

Baddeley, A. D., & Hitch, G. (1974). Working Memory. Oxford: Oxford University Press.

Barch, D. M., Braver, T. S., Nystrom, L. E., Forman, S. D., Noll, D. C., & Cohen, J. D. (1997). Dissociating working memory from task difficulty in human prefrontal cortex. Neuropsychologia, 35(10), 1373‐1380.

Chen, F. (2006). Designing human interface in speech technology. NY: Springer Science+Business Media, Inc.

Damos, D. L. (Ed.). (1991). Multiple Task Performance. London: Taylor and Francis.

Dedovic, K., Renwick, R., Mahani, N. K., Engert, C., Lupien, S. J., & Pruessner, J. (2005). The Montreal Imaging Stress Task: using functional imaging to investigate the effects of perceiving and processing psychosocial stress in the human brain. Journal of Psychiatry & Neuroscience, 30(5), 319‐325.

Dickerson, S. S., & Kemeny, M. E. (2004). Acute Stressors and Cortisol Responses: A Theoretical Integration and Synthesis of Laboratory Research. Psychological Bulletin, 130(3), 355‐391.

Duval, S., & Wicklund, R. A. (1972). A Theory of Objective Self Awareness. New York: Academic Press.

Eysenck, M. W., & Keane, M. T. (2010). Cognitive Psychology. New York: Taylor and Francis.

Hagmueller, M., Rank, E., & Kubin, G. (2006). Evaluation of the Human Voice for Indications of Workload‐induced Stress in the Aviation Environment. EEC Technical/Scientific Report (EEC Note 2006/18 ed., Vol. 2006‐023). Bretigny‐sur‐Orge: Eurocontrol Experimental Centre.

Harmony, T., Fernandez, T., Silva, J., Bosch, J., Valdes, P., Fernandez‐Bouzas, A., et al. (1999). Do specific EEG frequencies indicate different processes during mental calculation? Neuroscience Letters, 266(1), 25‐28.

Kudielka, B. M. (2008). The Trier Social Stress Test. SciTopics. Retrieved from

Lazarus, R. S., & Folkman, S. (1984). Stress, Appraisal and Coping. New York: Springer Pub. Co.

Longo, L., Barrett, S., Yao, Y., Sun, R., Poggio, T., Liu, J., et al. (2010). Cognitive Effort for Multi‐agent Systems. Brain Informatics (Vol. 6334, pp. 55‐66): Springer Berlin / Heidelberg.

Luximon, A., & Goonetilleke, R. S. (2001). Simplified subjective workload assessment technique. Retrieved from http://www‐ieem.ust.hk/dfaculty/ravi/papers/workload.pdf

Markus, H. (1978). The effect of mere presence on social facilitation: An unobtrusive test. Journal of Experimental Social Psychology, 14(4), 389‐397.

Martin, G. N., Carlson, N. R., & Buskist, W. (2007). Psychology. Edinburgh Gate: Pearson.

Nygren, T. E. (1991). Psychometric properties of subjective workload measurement techniques: Implications for their use in the assessment of perceived mental workload. Human Factors, 33(1), 17‐33.


Paas, F., Tuovinen, J. E., Tabbers, H., & Van Gerven, P. W. (2003). Cognitive load measurement as a means to advance cognitive load theory. Educational Psychologist, 38(1), 63‐71. doi: http://dx.doi.org/10.1207/S15326985EP3801_8

Paas, F. G., & Van Merrienboer, J. J. (1994). Variability of worked examples and transfer of geometrical problem‐solving skills: A cognitive‐load approach. Journal of Educational Psychology, 86(1), 122‐133. doi: http://dx.doi.org/10.1037/0022‐0663.86.1.122

Robert, G., & Hockey, J. (1997). Compensatory control in the regulation of human performance under stress and high workload: A cognitive‐energetical framework. Biological Psychology, 45(1‐3), 73‐93.

Rubio, S., Díaz, E., Martín, J., & Puente, J. M. (2004). Evaluation of Subjective Mental Workload: A Comparison of SWAT, NASA‐TLX, and Workload Profile Methods. Applied Psychology, 53(1), 61‐86. doi: 10.1111/j.1464‐0597.2004.00161.x

Schnotz, W., & Kürschner, C. (2007). A Reconsideration of Cognitive Load Theory. Educational Psychology Review, 19(4), 469‐508. doi: 10.1007/s10648‐007‐9053‐4

Setz, C., Arnrich, B., Schumm, J., La Marca, R., Tröster, G., & Ehlert, U. (2010). Discriminating stress from cognitive load using a wearable eda device. IEEE Transactions on Information Technology in Biomedicine, 14(2), 410‐417.

Shi, Y., Ruiz, N., Taib, R., Choi, E. H. C., & Chen, F. (2007). Galvanic Skin Response (GSR) as an Index of Cognitive Load. Paper presented at CHI 2007, San Jose, California.

Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257‐285.

Tesser, A., & Berkowitz, L. (1988). Toward a Self‐Evaluation Maintenance Model of Social Behavior. Advances in Experimental Social Psychology (Vol. 21, pp. 181‐227): Academic Press.

Veltman, J. A., & Gaillard, A. W. K. (1998). Physiological workload reactions to increasing levels of task difficulty. Ergonomics, 41(5), 656‐669.

Xu, J., Wang, Y., Chen, F., Choi, H., Li, G., Chen, S., & Hussain, S. (2011). Pupillary Response Based Cognitive Workload Index under Luminance and Emotional Changes. Paper presented at CHI 2011, Vancouver, BC, Canada.

Dan Conway: [email protected]


Appendix 1 – Complete Listing of Math problems used.

Difficulty Level 1

Question Number  Math Problem  Correct Answer  Multiple Choice Answer 1  Multiple Choice Answer 2  Multiple Choice Answer 3  Multiple Choice Answer 4

1 56 + 28 + 11 = 95 93 94 95 96

2 23 + 59 + 14 = 96 96 86 94 84

3 45 + 22 + 14 = 81 89 81 90 91

4 22 + 16 + 11 = 49 48 49 47 45

5 44 + 11 + 15 = 70 70 65 71 69

6 32 + 15 + 19 = 66 66 84 85 86

7 28 + 43 + 16 = 87 88 87 89 85

8 35 + 23 + 12 = 69 70 59 69 71

9 43 + 22 + 14 = 79 77 78 79 80

10 52 + 36 + 5 = 93 87 93 95 92

11 43 + 22 + 14 = 79 77 79 75 80

12 37 + 18 + 25 = 80 80 70 75 85

13 53 + 23 + 17 = 93 91 87 90 93

14 26 + 48 + 11 = 85 83 82 85 87

15 15 + 47 + 33 = 95 92 93 94 95

16 52 + 29 + 11 = 92 88 94 92 96

17 15 + 32 + 13 = 60 61 59 55 60

18 51 + 35 + 12 = 98 86 88 99 98

19 22 + 45 + 11 = 78 76 77 78 80

20 54 + 23 + 18 = 95 93 95 97 91

21 62 + 16 + 12 = 90 82 86 89 90

22 35 + 12 + 11 = 58 58 56 55 48

23 15 + 42 + 23 = 80 83 81 80 82

24 6 + 42 + 11 = 59 57 59 55 56

25 41 + 32 + 14 = 87 77 79 87 85

26 33 + 14 + 22 = 69 79 67 77 75

27 43 + 15 + 22 = 80 78 82 80 79

28 36 + 23 + 18 = 77 77 79 78 87

29 14 + 15 + 16 = 45 43 44 45 46

30 22 + 31 + 44 = 97 96 85 95 97

31 32 + 15 + 12 = 59 56 59 60 61

32 45 + 21 + 22 = 88 86 78 88 90

33 51 + 34 + 13 = 98 98 97 96 95

34 26 + 31 + 18 = 75 73 63 76 75

35 46 + 23 + 15 = 84 81 78 64 84

36 22 + 31 + 14 + 11 + 8 = 86 84 85 86 87

37 15 + 33 + 21 = 69 65 69 57 69

38 25 + 34 + 11 = 70 67 68 70 71

39 32 + 26 + 17 = 75 77 75 73 85

40 15 + 52 + 25 = 92 92 91 89 93


41 62 + 25 + 9 = 96 94 92 99 96

42 51 + 24 + 17 = 92 91 92 82 84

43 22 + 43 + 18 = 83 81 79 88 83

44 33 + 35 + 14 = 82 82 85 79 78

45 17 + 11 + 26 = 54 51 52 53 54

46 41 + 36 + 12 = 89 88 86 93 89

47 44 + 23 + 19 = 86 82 84 86 90

48 11 + 27 + 31 = 69 59 60 65 69

49 31 + 25 + 19 = 75 77 75 73 71

50 26 + 13 + 22 = 61 59 60 61 62

51 45 + 31 + 11 = 87 77 79 89 87

52 9 + 53 + 21 = 83 82 83 85 87

53 42 + 14 + 12 = 68 64 66 70 68

54 36 + 22 + 6 = 64 64 63 61 66

55 37 + 11 + 17 = 65 55 56 65 67

56 35 + 14 + 20 = 69 64 71 68 69

57 28 + 11 + 16 = 55 59 51 56 55

58 25 + 14 + 12 = 51 51 53 50 49

Difficulty Level 2


Question Number  Math Problem  Correct Answer  Multiple Choice Answer 1  Multiple Choice Answer 2  Multiple Choice Answer 3  Multiple Choice Answer 4

59 519 + 288 ‐ 344 + 87 = 550 660 550 650 555

60 448 + 373 ‐ 215 + 219 = 825 825 823 826 827

61 984 ‐ 579 + 238 + 119 + 64 = 826 816 826 814 824

62 552 ‐ 87 ‐ 112 + 482 = 835 825 735 825 835

63 88 + 186 + 334 ‐ 238 + 449 = 819 817 818 819 820

64 558 ‐ 477 + 81 + 312 = 474 474 574 575 573

65 819 ‐ 365 + 228 ‐ 47 + 175 = 810 712 710 810 812

66 758 ‐ 442 + 94 + 118 ‐ 32 = 496 490 496 498 494

67 881 ‐ 735 + 285 ‐ 129 = 302 302 312 300 301

68 129 + 786 ‐ 533 ‐ 81 = 301 301 299 300 302

69 444 + 189 ‐ 324 + 111 = 420 422 410 424 420

70 853 ‐ 669 + 386 + 211 ‐ 85 = 696 704 702 696 698

71 744 ‐ 366 + 88 ‐ 219 = 247 257 247 245 237

72 774 ‐ 447 + 315 ‐ 71 = 571 571 561 581 573

73 663 + 218 ‐ 442 + 319 + 83 = 841 842 843 840 841

74 338 + 258 + 114 ‐ 439 = 271 481 271 281 371

75 119 + 597 + 181 ‐ 425 = 472 472 462 482 574

76 512 + 319 ‐ 486 + 119 = 464 467 464 469 463

77 212 + 448 ‐ 219 + 79 = 520 417 418 420 520

78 214 + 391 + 115 ‐ 85 = 563 563 565 561 564

79 75 + 83 + 56 + 244 = 458 358 368 458 468

80 144 + 267 ‐ 63 + 412 = 760 761 661 760 762

81 489 + 325 + 111 ‐ 429 = 496 494 496 490 497


82 339 + 378 ‐ 428 + 380 = 669 559 659 669 729

83 468 + 355 ‐ 265 + 174 = 732 732 733 734 735

84 319 + 294 ‐ 55 + 342 = 900 800 890 900 902

85 419 + 95 + 229 = 743 576 749 743 822

86 385 + 473 ‐ 281 ‐ 412 = 165 87 119 156 165

87 312 + 389 + 253 = 954 827 857 854 954

88 91 + 343 + 272 ‐ 81 = 625 672 675 625 635

89 454 + 318 ‐ 526 = 246 246 218 226 238

90 328 + 279 + 88 ‐ 216 = 479 379 359 459 479

91 468 + 233 ‐ 437 + 77 = 341 341 361 329 331

92 353 + 288 + 55 = 696 688 689 696 794

93 59 + 286 + 312 = 657 651 682 836 657

94 377 + 585 ‐ 225 = 737 738 722 735 737

95 185 + 244 ‐ 321 = 108 88 118 108 348

96 479 + 237 ‐ 381 + 64 = 399 401 411 413 399

Difficulty Level 3


Question Number  Math Problem  Correct Answer  Multiple Choice Answer 1  Multiple Choice Answer 2  Multiple Choice Answer 3  Multiple Choice Answer 4

97 23 x 18 + 392 ‐ 514 ‐ 138 = 154 366 154 150 288

98 551 + 14 x 12 ‐ 342 + 358 = 735 645 745 380 735

99 354 + 17 x 7 + 219 ‐ 488 = 204 206 304 204 208

100 818 ‐ 631 + 556 ‐ 12 x 21 = 491 491 391 381 501

101 342 + 18 x 27 ‐ 545 + 319 = 602 612 602 614 608

102 982 ‐ 16 x 29 ‐ 342 + 668 = 844 845 844 834 824

103 154 + 378 ‐ 447 + 17 x 24 = 493 493 495 487 337

104 917 ‐ 13 x 22 ‐ 407 ‐ 173 = 51 81 51 61 84

105 514 ‐ 323 + 682 ‐ 16 x 28 = 425 425 427 328 338

106 661 ‐ 7 x 38 + 485 ‐ 612 = 139 257 137 237 139

107 21 x 19 + 344 ‐ 487 + 577 = 833 743 823 833 843

108 33 x 28 ‐ 451 + 255 ‐ 691 = 37 57 139 55 37

109 711 ‐ 538 + 8 x 29 + 263 = 668 666 686 668 646

110 24 x 26 ‐ 333 + 587 ‐ 692 = 186 286 188 186 146

111 11 x 19 ‐ 124 + 578 ‐ 492 = 171 257 171 169 173

112 687 ‐ 33 x 12 + 288 + 348 = 927 392 574 846 927

113 245 + 687 ‐ 22 x 29 + 558 = 852 848 852 854 754

114 812 ‐ 569 + 179 + 14 x 19 = 688 664 866 688 662

115 812 ‐ 18 x 33 + 266 ‐ 118 = 366 347 346 389 366

116 394 + 287 ‐ 244 + 8 x 16 = 565 568 565 468 466

117 557 + 278 ‐ 59 ‐ 13 x 11 = 633 733 732 633 632

118 642 + 185 ‐ 377 + 23 x 8 = 634 582 645 676 634

119 258 + 187 + 66 ‐ 17 x 6 = 409 407 409 405 403

120 35 x 8 + 216 + 438 ‐ 174 = 760 760 670 690 769

121 638 + 285 ‐ 7 x 27 + 81 = 815 813 815 815 816

122 612 + 6 x 27 ‐ 379 ‐ 55 = 340 338 388 368 340


123 78 + 166 + 416 ‐ 16 x 11 = 484 478 496 448 484

124 24 x 13 + 369 + 281 ‐ 77 = 885 883 881 879 885

125 358 + 19 x 11 + 227 ‐ 377 = 417 357 782 417 319

126 432 + 389 ‐ 576 ‐ 14 x 7 = 147 147 127 135 137

127 279 + 533 ‐ 22 x 6 + 175 = 855 857 856 855 854

128 377 + 318 ‐ 532 + 15 x 6 = 253 153 255 357 253

129 947 ‐ 885 + 24 x 5 + 85 = 267 267 169 263 312

130 524 + 377 ‐ 13 x 9 + 117 = 901 899 799 798 901

131 336 + 572 ‐ 18 x 6 ‐ 294 = 506 516 506 517 520

132 815 ‐ 674 + 714 ‐ 23 x 14 = 533 535 533 531 529

133 412 + 369 ‐ 16 x 9 + 284 = 921 921 911 901 917

134 16 x 9 + 477 ‐ 85 + 312 = 848 843 756 924 848
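
The answer key in this appendix can be checked mechanically. The sketch below is an illustration rather than part of the experimental software: the helper names `evaluate` and `mismatches` are hypothetical, and it assumes that `x` denotes multiplication and binds more tightly than addition and subtraction, which is consistent with the listed answers (for example, question 100).

```python
def evaluate(problem):
    """Evaluate a problem such as '23 x 18 + 392 - 514', with 'x' applied
    before '+' and '-'. Tokens must be separated by spaces."""
    # The listing uses a typographic hyphen (U+2010) for minus; normalise it.
    tokens = problem.replace("\u2010", "-").split()
    terms = [int(tokens[0])]  # running list of additive terms
    ops = []                  # '+' or '-' between consecutive terms
    for i in range(1, len(tokens), 2):
        op, value = tokens[i], int(tokens[i + 1])
        if op == "x":
            terms[-1] *= value  # fold the product into the current term
        elif op in ("+", "-"):
            ops.append(op)
            terms.append(value)
        else:
            raise ValueError("unknown operator: " + op)
    total = terms[0]
    for op, term in zip(ops, terms[1:]):
        total = total + term if op == "+" else total - term
    return total


def mismatches(rows):
    """Return (question, listed, computed) for rows whose listed answer
    differs from the computed one."""
    return [(q, listed, evaluate(p)) for q, p, listed in rows
            if evaluate(p) != listed]


# Three rows copied from the listing above (question, problem, listed answer).
rows = [
    (1, "56 + 28 + 11", 95),
    (8, "35 + 23 + 12", 69),
    (100, "818 - 631 + 556 - 12 x 21", 491),
]
print(mismatches(rows))  # prints [(8, 69, 70)]
```

Any row reported in this way would need to be checked against the stimuli actually presented before the corresponding responses are scored.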
