cognitive load measurement using speech/linguistic features

From imagination to impact Using Information to Drive Decisions Cognitive Load Measurement using Speech/Linguistic Features Dr. Fang Chen NICTA Copyright 2010 1 Dr. Fang Chen [email protected]


Page 1: Cognitive Load Measurement using Speech/Linguistic Features

From imagination to impact

Using Information to Drive Decisions

Cognitive Load Measurement using Speech/Linguistic Features

Dr. Fang Chen

NICTA Copyright

2010

1

Dr. Fang Chen [email protected]

Page 2: Cognitive Load Measurement using Speech/Linguistic Features

Outline

• Background

• Research Applications

• Speech and Language Analyses

• Data Sets:

– Reading Experiment

– Touch-table Collaborative Experiment

– Bushfire Study

– Driving Experiment

2

Page 3: Cognitive Load Measurement using Speech/Linguistic Features

Background

• Cognitive load (CL): refers to the mental demand imposed on working memory by a particular task.

• Working Memory: limited capacity for holding information in mind in the context of cognitive activity.

• Cognitive Load Theory: development of the instructional methods for effective use of people's limited cognitive processing capacity.

3

Page 4: Cognitive Load Measurement using Speech/Linguistic Features

Research Aims

• Overall:
– Identification of potential indices of cognitive load for
• real-time,
• objective,
• non-intrusive
measurement of cognitive load.

• Specific to this research:
– Identification of potential linguistic and grammatical features of cognitive load.

4

Page 5: Cognitive Load Measurement using Speech/Linguistic Features

Need for CL Measurement

• Overloading or underloading of cognitive

processing:

– degradation of performance, and/or

– failures of learning and performing, and/or

– source of performance errors.

• CL measurement is crucial for:

– minimising the amount of cognitive effort required,

– maintaining the right level of CL,

– achieving adaptive system response,

– improving user performance.

5

Page 6: Cognitive Load Measurement using Speech/Linguistic Features

Cognitive Load Measures

• Subjective measures
– e.g. self-reporting – manual, post-task, time-consuming, intrusive.

• Physiological measures
– e.g. eyes, brain, skin biosensors – sensitive, signal noise, intrusive, a lot of complex equipment

• Performance measures
– e.g. error rate, task performance – dual tasks

• Behavioral measures
– e.g. speech, pressure mouse – can be automatic, non-intrusive

6

Page 7: Cognitive Load Measurement using Speech/Linguistic Features

Research Applications

• Designing intelligent adaptive user interfaces for intensive

working/interaction environments.

– Emergency services e.g. Bushfire Cooperative Research Centre

– Road traffic control services e.g. Roads and Traffic Authority (RTA)

• Other potential areas:

– Call centers

– Air traffic control rooms

– Pilot cockpits

– Online education / e-learning

– … and so on.

7

Page 8: Cognitive Load Measurement using Speech/Linguistic Features

Speech and Linguistic Measures

• Why Speech?
– Sensitivity in the speech modality shown by prior art.
– Non-intrusive, easy to collect e.g. phone calls, conversations
– Objective measure, not easily manipulated by the user
– Real-time analysis is possible (for some speech signal features)
– Widely available, in a number of application scenarios

• What measures?
– Pauses and response latency
• Pausing differently under different conditions.
– Language and word usage
• Using particular words and/or phrases at specific sentence and/or paragraph positions;
– Grammar features and structures
• Using particular types of linguistic/grammatical categories;
• Using a particular type of syntax or grammatical structure i.e. usage of parts of speech and their forms;

Page 9: Cognitive Load Measurement using Speech/Linguistic Features

Experiment Setup

• A user study with two controlled levels of cognitive load
– Elicit natural speech from users

• A reading and comprehension task
– General knowledge (avoid the expertise effect)
– Reading the extract
– Answer open-ended questions
• Give a short summary of the story in at least five whole sentences.
• What was the most interesting point in this story?
• Describe at least two other points highlighted in this story.

Example extract: The Sun

The Sun has "burned" for more than 4.5 billion years and will continue to do so for several billion more. It is a massive collection of gas, mostly hydrogen and helium. Because it is so massive, it has immense gravity, enough gravitational force to hold all of hydrogen and helium together (and to hold all of the planets in their orbits around the Sun!). The Sun does not "burn" like wood burns – it is a gigantic nuclear reactor….

Page 10: Cognitive Load Measurement using Speech/Linguistic Features

Story-reading Experiment

• Experimental setup
– Story reading followed by Q&A
– 3 different levels of text difficulty (Lexile Framework for Reading, www.lexile.com)
– 3 stories in each of the 2 sessions (fixed order)
• 1st session: “Sleep” (900L), “History of Zero” (1350L) & “Milky Way Galaxy” (1400L)
• 2nd session: “Smoke Detectors” (950L), “Hurricanes” (1250L) & “The Sun” (1300L)
• 5 minutes break between sessions
– Dual-task for “Milky Way Galaxy” & “The Sun”
• Counting of background spoken numbers while reading the stories and answering the questions

Page 11: Cognitive Load Measurement using Speech/Linguistic Features

Experiment Setup

• Cognitive load level design
– Lexile Framework for Reading (200L 1st grade, 1700L grad)
• Syntactic and semantic complexity, vocabulary
– Text with same difficulty for both conditions
– Aural dual task, counting numbers during reading and answering

Task Load Level   Lexile Rating   Dual Task
Low               1300L           No
High              1300L           Yes

• Participants
– 15 native English speakers as subjects (8 females and 7 males)

Page 12: Cognitive Load Measurement using Speech/Linguistic Features

Reading Experiment Data – Pause Analysis

12

Page 13: Cognitive Load Measurement using Speech/Linguistic Features

Pause Analysis – Results Summary

*p<0.05, n=24.

13

Page 14: Cognitive Load Measurement using Speech/Linguistic Features

Touch-table Collaboration Study - Lab Data

• Collaborative tasks using multi-touch tabletop screen.

• Interactive Firefighting tasks.

• 10 groups x 4 members = 40 subjects + (1 Pilot group)

– 30 Commanders + 10 Leaders

– 39 subjects data available (1 leader’s data missing)

• Speech Transcriptions using ELAN.

• Extracted and cleaned for LIWC and other analysis tools.

• Analysis completed:

– Subjective Ratings

– Grammar features - Pronouns

– Word Category Features

– Language Complexity Features

14

Page 15: Cognitive Load Measurement using Speech/Linguistic Features

15

Page 16: Cognitive Load Measurement using Speech/Linguistic Features

Touch-table Study Design

16

Page 17: Cognitive Load Measurement using Speech/Linguistic Features

Lab Data – Some Hypotheses

• Higher subjective ratings under high load task.

• More speech and longer sentences.

• More and longer pauses under high load task.

• More use of:

– Negative emotion words, inclusive words, swear words, cognitive and perceptive phrases, disagreement words etc.

• Less use of:

– Positive emotion words, agreement, certainty, achievement words

• More hesitations and incomplete sentences

• More use of plural pronouns and less use of singular ones.

• More complex sentences under high load task.

17

Page 18: Cognitive Load Measurement using Speech/Linguistic Features

Lab Data – Subjective Ratings

18

Page 19: Cognitive Load Measurement using Speech/Linguistic Features

Lab Data – Linguistic Analysis (Words)

19

Page 20: Cognitive Load Measurement using Speech/Linguistic Features

Lab Data – Linguistic Analysis (Pronouns)

• Singular pronouns decrease

• Plural pronouns increase
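
As a minimal sketch of how such pronoun rates could be computed from a cleaned transcript: the function name `pronoun_rates` and the two first-person pronoun lists below are illustrative assumptions, not the LIWC categories used in the study.

```python
import re

# Illustrative first-person pronoun lists (an assumption, not the LIWC dictionaries).
SINGULAR = {"i", "me", "my", "mine", "myself"}
PLURAL = {"we", "us", "our", "ours", "ourselves"}


def pronoun_rates(transcript: str) -> dict:
    """Return singular and plural pronoun use as a percentage of all words."""
    words = re.findall(r"[a-z']+", transcript.lower())
    total = len(words)
    if total == 0:
        return {"singular": 0.0, "plural": 0.0}
    singular = sum(w in SINGULAR for w in words)
    plural = sum(w in PLURAL for w in words)
    return {
        "singular": 100.0 * singular / total,
        "plural": 100.0 * plural / total,
    }
```

Comparing the two rates for the same speaker under low and high load would show whether singular use falls while plural use rises.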

20

Page 21: Cognitive Load Measurement using Speech/Linguistic Features

Lab Data – Linguistic Analysis (Pronouns)

• Interaction between Singular and Plural Personal Pronouns

21

Page 22: Cognitive Load Measurement using Speech/Linguistic Features

Lab Data – Language Complexity Analysis

• Language complexity measures

• Measured by two major factors:
– Semantic difficulty: observes the use of words, their frequencies, and their lengths (both in syllables and in characters).
– Syntactic complexity: observes primarily the sentence length, which is considered the best indicator of text or language complexity.

• Hypotheses
– Language Complexity increases
– Lexical Density decreases

22

Page 23: Cognitive Load Measurement using Speech/Linguistic Features

Lab Data – Language Complexity Analysis

• Lexical Density (Vocabulary Richness) – expected to decrease
• Hard Word Ratio – expected to decrease
• Gunning Fog Index – expected to increase
• Flesch-Kincaid Grade – expected to increase
• SMOG Grade – expected to increase
• Lexile Level – expected to increase

Lexical Density is the estimated measure of content per functional and lexical units or lexemes in total text. In simple words, it is a measure of the ratio of unique words to the total number of words.

Lexical Density = (different words / total words) x 100

A word is considered complex or hard if it has three or more syllables and does not contain a hyphen (-). For example, the word ‘density’ has three syllables. Complex Word Ratio is the measure of the ratio of complex words to the total number of words.

Gunning Fog Index calculates the syntactic complexity of language using sentence lengths and complex words and implies that short and simple sentences in plain English achieve a better score (lower value) than long sentences in complicated language.

Gunning Fog Index = 0.4 x (ASL + ((SYW / words) x 100))

Where:
ASL = Average sentence length (the number of words divided by the number of sentences)
SYW = Number of words with three or more syllables

Flesch-Kincaid Grade calculates the language difficulty using average sentence lengths and average syllables per word. It estimates the number of years of education required to understand the written or transcribed text.

Flesch-Kincaid Grade = (0.39 x ASL) + (11.8 x ASW) – 15.59

Where:
ASL = Average sentence length (the number of words divided by the number of sentences)
ASW = Average number of syllables per word (the number of syllables divided by the number of words)

The SMOG Grade also estimates the number of education years needed to fully comprehend the text. It uses sentences and complex words to calculate it. The emphasis on full comprehension distinguishes this measurement from other complexity measures.

SMOG Grade = square root of ((SYW / sentences) x 30) + 3

Where:
SYW = Number of words with three or more syllables

Lexile Level also measures the comprehension complexity of any text. A Lexile measure is the numeric representation of a text’s difficulty ranging from 200L for easy to above 1700L for complicated texts. It uses mean sentence lengths and mean log word frequency to calculate it.
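
The formulas on this slide can be sketched in a few lines of Python. This is a rough illustration only: the vowel-group syllable counter, the sentence splitter and the names `count_syllables` and `complexity_metrics` are assumptions made for the example, and the Lexile Level is left out because the slide does not give its formula.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Rough estimate: one syllable per group of consecutive vowels (heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def complexity_metrics(text: str) -> dict:
    """Compute the slide's complexity measures for a transcript."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z][A-Za-z'-]*", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    n_words = len(words)
    n_sentences = len(sentences)
    syllables = [count_syllables(w) for w in words]

    # Hard word: three or more syllables and no hyphen (SYW on the slide).
    syw = sum(1 for w, s in zip(words, syllables) if s >= 3 and "-" not in w)
    asl = n_words / n_sentences      # average sentence length
    asw = sum(syllables) / n_words   # average syllables per word

    return {
        "lexical_density": 100.0 * len({w.lower() for w in words}) / n_words,
        "hard_word_ratio": syw / n_words,
        "gunning_fog": 0.4 * (asl + (syw / n_words) * 100.0),
        "flesch_kincaid_grade": 0.39 * asl + 11.8 * asw - 15.59,
        "smog_grade": math.sqrt((syw / n_sentences) * 30.0) + 3.0,
    }
```

A dictionary-based syllable counter would be more accurate than the heuristic used here; the structure of the calculation would stay the same.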

23

Page 24: Cognitive Load Measurement using Speech/Linguistic Features

Lab Data – Language Complexity Analysis

24

Page 25: Cognitive Load Measurement using Speech/Linguistic Features

Lab Data – Language Complexity Analysis

25

Page 26: Cognitive Load Measurement using Speech/Linguistic Features

Bushfire Data - Introduction

• Speech and transcription data from Bushfire CRC.

• Training exercises – four states (TAS, VIC, NSW, and QLD).

• Three roles: Incident Controller (IC), Planning, Operations.

• 11 exercises, 33 subjects

• All exercises monitored by bushfire management experts.

• Operators co-located in a control room and trained for roles.

• Data collection, transcription, coding, cleaning, analyses.

• Four different load levels:
– (1) ‘low’: casual conversation, no time pressure;

– (2) ‘medium’: routine tasks;

– (3) ‘high’: challenging tasks, time constraints; and

– (4) ‘very high’: very challenging, lot of unexpected events and breakdowns.

• Combined into low and high.

Page 27: Cognitive Load Measurement using Speech/Linguistic Features

Bushfire Data – Same Hypotheses

• Higher subjective ratings under high load task.

• More speech and longer sentences.

• More and longer pauses under high load task.

• More use of:

– Negative emotion words, inclusive words, swear words, cognitive and perceptive phrases, disagreement words etc.

• Less use of:

– Positive emotion words, agreement, certainty, achievement words

• More hesitations and incomplete sentences

• More use of plural pronouns and less use of singular ones.

• More complex sentences under high load task.

27

Page 28: Cognitive Load Measurement using Speech/Linguistic Features

Bushfire Data – Linguistic Analysis (Words)

28

Page 29: Cognitive Load Measurement using Speech/Linguistic Features

Bushfire Data – Linguistic Analysis (Pronouns)

• Singular pronouns decrease

• Plural pronouns increase

29

Page 30: Cognitive Load Measurement using Speech/Linguistic Features

Bushfire Data – Linguistic Analysis (Pronouns)

• Interaction between Singular and Plural Personal Pronouns

30

Page 31: Cognitive Load Measurement using Speech/Linguistic Features

Bushfire Data – Language Complexity Analysis

31

Page 32: Cognitive Load Measurement using Speech/Linguistic Features

Other Linguistic Analysis Possibilities

• N-gram Analysis
• Bi-gram Ratio
• Others:
• Most common N-grams (Bigrams, Trigrams, 4-grams)
• Most common words (Unigrams)
• Most frequent or least frequent N-grams
• More…

• Parse Tree Analysis
– Order of n-grams
• For both – words and parts of speech.

                L1      L2      L3      L4      p
Bi-gram Ratio   93.5%   80.9%   79.4%   72.6%   0.0002

[Figure: Bi-gram Ratio (percent) by Load Level (1 to 4)]
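
Counting the most common N-grams is straightforward to sketch. The helper below is a hypothetical illustration (the name `most_common_ngrams` is not from the study); it works on any token sequence, so the same call applies to words or to part-of-speech tags. It does not compute the Bi-gram Ratio, whose definition the slide does not give.

```python
from collections import Counter


def most_common_ngrams(tokens, n, k=3):
    """Return the k most frequent n-grams (as tuples) with their counts."""
    # zip over n shifted copies of the token list yields consecutive n-grams.
    ngrams = zip(*(tokens[i:] for i in range(n)))
    return Counter(ngrams).most_common(k)
```

For example, `most_common_ngrams(transcript.split(), 2)` lists the most frequent bigrams in a whitespace-tokenised transcript.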

32

Page 33: Cognitive Load Measurement using Speech/Linguistic Features

An Abstract CLM Model

• Automatic, Real-time, Non-intrusive

33

Page 34: Cognitive Load Measurement using Speech/Linguistic Features

Looking at Data Sets

• Reading Experiment

• Touch-table Collaborative Experiment

• Bushfire Study

• Driving Study

34

Page 35: Cognitive Load Measurement using Speech/Linguistic Features

Driving Study Data - Introduction

• Simulated Driving Experiment

• Investigate how the distractions can affect the performance of the user

• Identification of features to measure users’ cognitive load.

• 18 participants (8 females and 10 males)

• Data collected:– Video (2 cameras, front and rear view)

• Eye gaze movement

– Audio

– Galvanic Skin Response (GSR) or skin resistance

35

Page 36: Cognitive Load Measurement using Speech/Linguistic Features

Driving Study Data – Experiment Setup

• Big screen for game

• Front camera

• Simulator frame

• Wireless headset

• Bio-sensor (GSR)

• Speakers at back

• Rear Camera

36

Page 37: Cognitive Load Measurement using Speech/Linguistic Features

Future Challenges

• Areas for future work
– Development of larger databases
– Task-dependent and task-independent features

• Need to take lab experiments ‘into the wild’
– Defining, researching and standardising tasks of interest
– Joint modeling of linguistic, speaker and cognitive load/emotion information

37

Page 38: Cognitive Load Measurement using Speech/Linguistic Features

Exploring Multimodalities

38

Page 39: Cognitive Load Measurement using Speech/Linguistic Features

Exploring Multimodality

• Hypothesis:
– Users are more likely to use complementary multimodal productions as cognitive load increases
– Users will tend to rely on one modality more as cognitive load increases

• Method:
– Wizard of OZ scenario: speech and gesture interface for a series of map based tasks; task increasing in difficulty by varying quantity of content and time-pressure
– Conditions for Speech Only interaction, Gesture Only interaction and Multimodal
– Videotape participants, record audio, record answers, post-hoc introspection questionnaire

Page 40: Cognitive Load Measurement using Speech/Linguistic Features

Multimodality and Cognitive Load

• Exploring Multimodal Interface Scenarios
– The recognisers in the interface will capture the user’s input, interpret the information and choose an appropriate response
– Opportunity to capture interaction data implicitly

[Diagram: Cognitive Load Analysis drawing on User Characteristics, Task Characteristics, Visual Data, Audio Data, Physiological Data, Environmental Data and Other Modalities]

Page 41: Cognitive Load Measurement using Speech/Linguistic Features

Experiment Design

• Task:
– Incident Management Response
E.g. A major accident on corner of X and Y.
– Operators are required to deploy necessary crews and implement policies and procedures

• Method:
– Elicit speech and free-hand gesture interface for a series of map based tasks;
– Wizard of OZ scenario
– Videotape participants, record audio, record answers, post-hoc introspection questionnaire

• Dependent Variables:
– Biosensor input: GSR and BVP
– Gesture: video footage
– Speech: transcribed manually
– Performance: latency, completion time & error-rates
– Multimodal productions: manual annotation

Page 42: Cognitive Load Measurement using Speech/Linguistic Features

Examining Multimodal Input Structures


Page 43: Cognitive Load Measurement using Speech/Linguistic Features

The Task

• There are 36 small tasks, divided into 3 groups of 12.

• Each group of 12 will consist of maps from 4 different cities:

• Each new task will be given to you at the top of the screen:
– e.g. There has been an accident on the corner of Victoria and Liverpool Street.

• The tasks will be carried out using different modes:
– speech + gesture together,
– speech-only and
– gesture-only

The experimenter will tell you which mode you should be using for each task.

• The task will first require some visual search for information.

• There are only three things the system can do:
1. Zooming in and out of maps
2. Selecting map elements
3. Tagging map elements

Page 44: Cognitive Load Measurement using Speech/Linguistic Features

The Task

[Screen layout: Toolbox, Task Description, Map, Information/Feedback Area]

Page 45: Cognitive Load Measurement using Speech/Linguistic Features

Zooming Map Levels

Lower-level map

Contains selectable

elements; can zoom out

to higher level map


Top-level map

No selectable elements:

divided into four quadrants by

a dotted black line

Page 46: Cognitive Load Measurement using Speech/Linguistic Features

Selectable Elements

• Selected elements will be shown with a blue border.

Selectable element types:

School
Petrol Station
Library
Fire Station
Shopping Centre
Parking Station
Intersection
Hospital
RTA Branch
Church

Page 47: Cognitive Load Measurement using Speech/Linguistic Features

Tagging Map Elements

Tagging is a two-step process:
1. Select map element
2. Tag as Accident, Incident or Event

Accident: e.g. car accident, fire, flooding
-> Green border

Event: e.g. concert, protest march, fun run
-> Red border

Incident: occurrence that might cause a disruption to the traffic, e.g. broken-down car, or a traffic jam in peak hour
-> Yellow border

Info: Information area beneath the map

Clear: Clears all tags for selected element

Page 48: Cognitive Load Measurement using Speech/Linguistic Features

Special Tag: Notifying

Two parts: The element and the recipient need to be specified.

• Select map element (e.g. Intersection, marked as accident)

• Select NOTIFY action
-> PINK tag appears

• Select the recipient map element (RTA Branch, Fire Station…)
-> AQUA tag appears

Page 49: Cognitive Load Measurement using Speech/Linguistic Features

Zooming

• 2 zoom levels
• Lower level maps have selectable elements
• Zoom in: 4 quadrants
• Zoom out

[Figure: Top-level zoomable Map (no selectable elements) and Lower-level Map with selectable elements]

Page 50: Cognitive Load Measurement using Speech/Linguistic Features

The Modalities

• Speech
– Short and sweet
– No specific words, no specific word order -> We only give some suggestions
– Speak clearly and loudly

Zooming:
Zoom into the top right quadrant
Top right quadrant
Zoom in to top right
Zoom out please

Selecting:
Select the Church on Liverpool Street
Church on Liverpool
Please highlight the Church

Tagging:
Make selected Church an accident (or incident or event) zone
Selected Church. Accident.
Accident.

Page 51: Cognitive Load Measurement using Speech/Linguistic Features

The Modalities (2)

• Hand Gestures

– Pointing

– Hand shapes

Zooming Point to quadrant and pause to select and zoom in.

Point to diagonal opposite ends of map, pause to zoom out.


Selecting Point to the element, pause until beep

Tagging Very clear hand shape (fist, flat palm, scissors, thumbs-up)

OR

Point to button in toolbox, pause to select

Page 52: Cognitive Load Measurement using Speech/Linguistic Features

The Modalities (3)

• Multimodal

– Speech + gesture

– Any order or combination

– Speech only or gesture only are OK

– Examples:
• “Make this into an accident” + pointing at element

• “Zoom into this quadrant” + pointing at quadrant

• “Zoom out again”

Page 53: Cognitive Load Measurement using Speech/Linguistic Features

Research Design

Balancing Available Modalities

• The traffic incident management (TIM) domain was used, and subjects were required to update a geographical map with traffic conditions information. Following our requirement, tasks were achievable using the following modalities:
– Gesture:
• Deictic pointing to map locations, items, and function buttons;
• Circling gestures for zoom functions.
– Hand Shapes: Predefined hand shapes for item tagging: fist, open palm, thumbs up etc
– Speech: street names, actions etc

• A large overlap was introduced across modal ways of performing actions. However, some tasks required the combination of modalities.

Page 54: Cognitive Load Measurement using Speech/Linguistic Features

Task Design

• Task Specification

– Task was given in written mode

– Users had freedom of inspection

– The task described a situation, but did not specify activities, e.g.

“An incident has occurred: a truck has lost some of its load at Walter Avenue and Lytton Road, near Mowbray Park”

• Task Activities

– Locate point of interest on the map

– Mark with one of 3 tags: accident, incident or event

– Notify relevant authorities, e.g. if casualties exist, notify a hospital.

– 11 different kinds of functionality available

Page 55: Cognitive Load Measurement using Speech/Linguistic Features

Task Difficulty Level Design

• There were four levels of cognitive load, and three tasks were completed for each level.

• The same visual was used for each level to avoid differences in visual complexity.

• The tasks varied in load through:
– The number of distinct entities in the task description;
– The number of distractors (items not needed for the task);
– The minimum number of actions required for the task.
– Further load was achieved in Level 4 by introducing a time limit.

Level   Entities   Actions   Distractors   Time
1       6          3         2             ∞
2       10         8         2             ∞
3       12         13        4             ∞
4       12         13        4             90 sec.

Page 56: Cognitive Load Measurement using Speech/Linguistic Features

Available Modalities

• The Modalities
– Aimed to capture natural patterns of speech and gesture combinations
– Speech: natural spoken language ‘recognised’ by an operator
• Avoids bias injected by errors in recognisers
– Gesture: automated hand tracking
• Untethered: no equipment used on the person
• Both tracking of the hand and hand shapes used
• Buttons added to reduce expressivity gap between gesture and speech
– Either or both could be used for each command

Input          Speech       Gesture
Select         “Select”     Point
Zoom           “Zoom”       Circling
Notify         “Notify”     Thumbs up
Tag Accident   “Accident”   Fist
Tag Incident   “Incident”   Open Palm
Tag Event      “Event”      Scissors

Page 57: Cognitive Load Measurement using Speech/Linguistic Features

Example of Interaction

System Functionality: Example of Interaction

• Selecting a location/item of interest: <Point at location>; or “St Mary’s Church”
• Zooming in or out of a map: <Point at quadrant>; or “Zoom in to the top right quadrant”
• Starting or ending a task: “End task”; or <Point at End task button>
• Notifying a recipient (item) of an accident, incident or an event: <Select accident> and “notify”; or fist shape and <Select recipient>
• Tagging a location of interest with an ‘accident’, ‘incident’ or ‘event’ marker: <Select location> and: “Incident”; or Scissors shape

Page 58: Cognitive Load Measurement using Speech/Linguistic Features

Wizard of Oz

[Diagram of the setup: Wizard, Main computer, Firewire camera, Camcorder, AGR]

Page 59: Cognitive Load Measurement using Speech/Linguistic Features

Data Captured

• The study generated various streams of data that were captured as follows:

– Speech was orthographically transcribed, including specific tags for disfluencies such as false starts and hesitations. Start and end times were annotated for each utterance;

– Hand motion was captured by the automatic gesture recogniser at the rate of 20 frames per second. Positions are relative to the camera view angle;

– Deictic pointing (pause while pointing, or circling) and hand shapes were annotated at two levels: the video was annotated to mark the start and end time of the overall motion leading to the gesture.

– System feedback to the user, such as task change (marked by a beep), item information, or error messages, was recorded with its time of occurrence;

– Bio-sensor data was recorded at the rate of 100 points per second. Skin conductance is measured in microsiemens (µS) while blood volume pulse only provides relative measures expressed in percentage.
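
Because each utterance carries a start and end time, pause features can be derived directly from the annotation. The sketch below is an illustration under assumptions: utterances are taken to be `(start, end)` pairs in seconds, and the function name `pause_durations` and the `min_pause` threshold are hypothetical, not part of the study's tooling.

```python
def pause_durations(utterances, min_pause=0.0):
    """Return the silent gaps (in seconds) between consecutive utterances.

    utterances: iterable of (start, end) times in seconds.
    Gaps no longer than min_pause are ignored.
    """
    ordered = sorted(utterances)
    pauses = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end
        if gap > min_pause:
            pauses.append(gap)
    return pauses
```

The number of pauses and their mean length per task could then be compared across load levels.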

Page 60: Cognitive Load Measurement using Speech/Linguistic Features

Sample of Annotation

Turn                   Construction   Modality   Content
Mark an Incident (A)   Select (a)     Gesture    [point to St Mary’s Church]
                                      Speech     “Select St.Mary’s Church”
                       Tag (a)        Shape      [scissors=Incident]
                                      Speech     “Incident”
Mark an Accident (C)   Select (c)     Speech     “Select Crown Street Library”
                       Tag (c)        Shape      [fist=Accident]
Mark an Event (B)      Select (b)     Speech     “Select”
                                      Gesture    [point to Collingwood School]
                       Tag (b)        Shape      [open_palm=Event]

Page 61: Cognitive Load Measurement using Speech/Linguistic Features

Results and Analysis

• Users: 15 available
• Total inputs: 1119
• Total turns: 394 (206 MM)
• Total constructions: 644
• Average difficulty rating for levels (subjective)
-> Level 1 (easiest): 2/10
-> Level 2: 4.2/10
-> Level 4 (hardest): 5/10

• Redundancy and Complementarity:
– Each user command in the system requires an action and an object
• Speech and/or
• Gesture-HandShape

• Redundancy
– Doubling up of either action or object information or both

          Action   Object
Speech    √        √
Gesture   √        √

• Complementarity
– Action and object come through different modalities

          Action   Object
Speech    √
Gesture            √
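
One way to make these definitions operational is to record, for each turn, which pieces of information (action, object) each modality carried, and then compare the two sets. The sketch below is an assumption about how such a classifier could look; the function name `classify_turn` and the "unimodal" label are hypothetical, and the study itself relied on manual annotation.

```python
def classify_turn(speech: set, gesture: set) -> str:
    """Classify a turn from the information carried by each modality.

    speech, gesture: sets drawn from {"action", "object"}.
    """
    if not speech or not gesture:
        return "unimodal"                # only one modality was used
    if not (speech & gesture):
        return "purely complementary"    # no information is doubled up
    if speech == gesture:
        return "purely redundant"        # everything is conveyed twice
    return "partially redundant"         # some, but not all, is doubled up
```

For instance, saying the action while pointing at the object gives `classify_turn({"action"}, {"object"})`, which returns `"purely complementary"`.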

Page 62: Cognitive Load Measurement using Speech/Linguistic Features

Rates of Redundancy

• Redundancy:
– Conveying the same information over more than one modality,
– Either would be sufficient on its own

Turn             Const    Modality     Content
Pure Redundant   Select   Gesture      [point to St Mary’s Church]
                          Speech       “Select St.Mary’s Church”
                 Tag      Hand_Shape   [scissors=Incident]
                          Speech       “Incident”

[Figure: Proportion of Purely Redundant turns by Level (Level 1, Level 2, Level 4), showing Min, Q1, Mean, Q3 and Max]

• We found a statistically significant decrease in the proportion of purely redundant turns, from 62.91% of all multimodal turns in Level 1 to 29.9% in Level 4.

Page 63: Cognitive Load Measurement using Speech/Linguistic Features

[Figure: Redundancy by level (Level 1, Level 2, Level 4): purely redundant, partially redundant and purely complementary turns]

We observed a steady decrease in redundancy as task difficulty increased. An ANOVA test between-users, across levels, shows there are significant differences between the means (F = 3.88, df = 2; p < 0.05).

Page 64: Cognitive Load Measurement using Speech/Linguistic Features

Rates of Complementarity

• Complementarity:
– Conveying different information over different modalities, e.g.

Turn              Action   Modality     Content
Pure Complement   Select   Speech       “Select St Mary’s Church”
                  Tag      Hand_Shape   [scissors=Incident]

• We also found trends of increased multimodal complementarity across levels:
– 12.86% in Level 1
– 45.53% in Level 2, and
– 36.02% in Level 4

Page 65: Cognitive Load Measurement using Speech/Linguistic Features

Cognitive and Working Memory Theories

• Why? Reduced level of redundancy + increased level of complementarity suggests a specific working memory strategy

• Modal Model of Working Memory [Baddeley, 92]

[Diagram: Central Executive, Phonological Loop, Visual-Spatial Sketchpad]

• Working Memory Strategies:
– Activity is shifted to areas marked exclusively for modal use
– At high load, users try to maximise the usage of modal working memory
– Users channel the required semantic chunks to different modalities, with the least amount of replication possible

Page 66: Cognitive Load Measurement using Speech/Linguistic Features

Discussion and Challenges

• Results:

– The results of this study give initial evidence that redundancy/complementarity is a behavioural symptom of the cognitive load management employed by users

• Sensitivity and Diagnosticity:

– ‘Ceiling’ values for rates of redundancy or complementarity

– Clearly not suitable for all users

• Automatic cognitive load estimation:

– A compound measure

– Various individual modal measurements for robustness

– Weighting of features on a per-user basis

• more reliable indices will influence a combined measure more strongly
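
The idea of a compound measure with per-user weights can be sketched as a reliability-weighted average. Everything in this example is an assumption for illustration: the function name `combined_load_estimate`, the feature names, and the premise that each feature has already been normalised to a common 0 to 1 load score.

```python
def combined_load_estimate(scores: dict, reliabilities: dict) -> float:
    """Weighted average of per-feature load scores for one user.

    scores: feature name -> normalised load score (assumed 0..1).
    reliabilities: feature name -> non-negative weight for this user;
    more reliable features pull the combined estimate more strongly.
    """
    total_weight = sum(reliabilities[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("at least one feature needs a positive weight")
    weighted_sum = sum(scores[name] * reliabilities[name] for name in scores)
    return weighted_sum / total_weight
```

With hypothetical scores `{"pause": 0.8, "pronoun": 0.4}` and weights `{"pause": 3.0, "pronoun": 1.0}`, the estimate lies closer to the pause score because that feature is weighted more heavily.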