cognitive load measurement using speech/linguistic features
From imagination to impact
Using Information to Drive Decisions
Cognitive Load Measurement using Speech/Linguistic Features
Dr. Fang Chen
NICTA Copyright
2010
1
Outline
• Background
• Research Applications
• Speech and Language Analyses
• Data Sets:
– Reading Experiment
– Touch-table Collaborative Experiment
– Bushfire Study
– Driving Experiment
2
Background
• Cognitive load (CL): refers to the mental demand imposed on
working memory by a particular task.
• Working Memory: limited capacity for holding information in
mind in the context of cognitive activity.
• Cognitive Load Theory: development of instructional
methods for effective use of people's limited cognitive
processing capacity.
3
Research Aims
• Overall:
– Identification of potential indices of cognitive load for
• real-time,
• objective,
• non-intrusive
measurement of cognitive load.
• Specific to this research:
– Identification of potential linguistic and grammatical features of
cognitive load.
4
Need for CL Measurement
• Overloading or underloading of cognitive
processing:
– degradation of performance, and/or
– failures of learning and performing, and/or
– source of performance errors.
• CL measurement is crucial for:
– minimising the amount of cognitive effort required,
– maintaining the right level of CL,
– achieving adaptive system response,
– improving user performance.
5
Cognitive Load Measures
• Subjective measures
– e.g. self-reporting – manual, post-task, time-consuming, intrusive.
• Physiological measures
– e.g. eyes, brain, skin biosensors – sensitive, signal noise, intrusive, lots of complex equipment
• Performance measures
– e.g. error rate, task performance – dual tasks
• Behavioral measures
– e.g. speech, pressure mouse – can be automatic, non-intrusive
6
Research Applications
• Designing intelligent adaptive user interfaces for intensive
working/interaction environments.
– Emergency services e.g. Bushfire Cooperative Research Centre
– Road traffic control services e.g. Roads and Traffic Authority (RTA)
• Other potential areas:
– Call centers
– Air traffic control rooms
– Pilot cockpits
– Online education / e-learning
– … and so on.
7
Speech and Linguistic Measures
• Why Speech?
– Sensitivity in the speech modality shown by prior art.
– Non-intrusive, easy to collect e.g. phone calls, conversations
– Objective measure, not easily manipulated by the user
– Real-time analysis is possible (for some speech signal features)
– Widely available, in a number of application scenarios
• What measures?
– Pauses and response latency
• Pausing differently under different conditions.
– Language and word usage
• Using particular words and/or phrases at specific sentence and/or paragraph positions;
– Grammar features and structures
• Using particular types of linguistic/grammatical categories;
• Using a particular type of syntax or grammatical structure i.e. usage of parts of speech and their forms;
8
Experiment Setup
• A user study with two controlled levels of cognitive load
– Elicit natural speech from users
• A reading and comprehension task
– General knowledge (avoid the expertise effect)
– Reading the extract
– Answer open-ended questions
• Give a short summary of the story in at least five whole sentences.
• What was the most interesting point in this story?
• Describe at least two other points highlighted in this story.
The Sun
The Sun has "burned" for more than 4.5 billion years and will continue to do so for several billion more. It is a massive collection of gas, mostly hydrogen and helium. Because it is so massive, it has immense gravity, enough gravitational force to hold all of hydrogen and helium together (and to hold all of the planets in their orbits around the Sun!). The Sun does not "burn" like wood burns – it is a gigantic nuclear reactor….
9
Story-reading Experiment
• Experimental setup
– Story reading followed by Q&A
– 3 different levels of text difficulty (Lexile Framework for Reading,
www.lexile.com)
– 3 stories in each of the 2 sessions (fixed order)
• 1st session: “Sleep” (900L), “History of Zero” (1350L) & “Milky Way Galaxy” (1400L)
• 2nd session: “Smoke Detectors” (950L), “Hurricanes” (1250L) & “The Sun” (1300L)
• 5-minute break between sessions
– Dual-task for “Milky Way Galaxy” & “The Sun”
• Counting of background spoken numbers while reading the stories and
answering the questions
10
Experiment Setup
• Cognitive load level design
– Lexile Framework for Reading (200L 1st grade, 1700L grad)
• Syntactic and semantic complexity, vocabulary
– Text with same difficulty for both conditions
– Aural dual task, counting numbers during reading and answering
Task Load Level    Lexile Rating    Dual Task
Low                1300L            No
High               1300L            Yes
• Participants
– 15 native English speakers as subjects (8 females and 7 males)
11
Reading Experiment Data – Pause Analysis
12
Pause Analysis – Results Summary
*p<0.05, n=24.
13
Touch-table Collaboration Study - Lab Data
• Collaborative tasks using multi-touch tabletop screen.
• Interactive Firefighting tasks.
• 10 groups x 4 members = 40 subjects + (1 Pilot group)
– 30 Commanders + 10 Leaders
– 39 subjects data available (1 leader’s data missing)
• Speech Transcriptions using ELAN.
• Extracted and cleaned for LIWC and other analysis tools.
• Analysis completed:
– Subjective Ratings
– Grammar features - Pronouns
– Word Category Features
– Language Complexity Features
14
15
Touch-table Study Design
16
Lab Data – Some Hypotheses
• Higher subjective ratings under high load task.
• More speech and longer sentences.
• More and longer pauses under high load task.
• More use of:
– Negative emotion words, inclusive words, swear words, cognitive and
perceptive phrases, disagreement words etc.
• Less use of:
– Positive emotion words, agreement, certainty, achievement words
• More hesitations and incomplete sentences
• More use of plural pronouns and less use of singular ones.
• More complex sentences under high load task.
17
Lab Data – Subjective Ratings
18
Lab Data – Linguistic Analysis (Words)
19
Lab Data – Linguistic Analysis (Pronouns)
• Singular pronouns decrease
• Plural pronouns increase
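The pronoun measure can be illustrated with a minimal sketch that counts singular and plural first-person pronouns per 100 words of a transcript. The word lists and the function name `pronoun_rates` are assumptions made for illustration only; the study used LIWC categories, whose dictionaries are not reproduced here.

```python
import re

# Illustrative first-person word lists (an assumption, not the LIWC dictionaries).
SINGULAR = {"i", "me", "my", "mine", "myself"}
PLURAL = {"we", "us", "our", "ours", "ourselves"}


def pronoun_rates(transcript: str) -> dict:
    """Return singular and plural pronoun counts per 100 words."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if not words:
        return {"singular": 0.0, "plural": 0.0}
    singular = sum(w in SINGULAR for w in words)
    plural = sum(w in PLURAL for w in words)
    return {
        "singular": 100.0 * singular / len(words),
        "plural": 100.0 * plural / len(words),
    }
```

Comparing the two rates between low-load and high-load transcripts of the same speaker would give the kind of singular/plural contrast summarised above.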
20
Lab Data – Linguistic Analysis (Pronouns)
• Interaction between Singular and Plural Personal Pronouns
21
Lab Data – Language Complexity Analysis
• Language complexity measures
• Measured by two major factors:
– Semantic difficulty: observes the use of words, their frequencies, and
their lengths (both in syllables and in letters/characters).
– Syntactic complexity: observes primarily the sentence length, which
is considered the best indicator of text or language complexity.
• Hypotheses
– Language Complexity increases
– Lexical Density decreases
22
Lab Data – Language Complexity Analysis
• Lexical Density (Vocabulary Richness) – expected to decrease
• Hard Word Ratio – expected to decrease
• Gunning Fog Index – expected to increase
• Flesch-Kincaid Grade – expected to increase
• SMOG Grade – expected to increase
• Lexile Level – expected to increase
Lexical Density is the estimated measure of content per
functional and lexical units or lexemes in total text. In simple
words, it is a measure of the ratio of unique words to the total
number of words.
Lexical Density = (different words / total words) x 100
A word is considered complex or hard if it has three
or more syllables and does not contain a hyphen (-).
For example, the word ‘density’ has three syllables.
Complex (Hard) Word Ratio is the measure of the ratio of
complex words to the total number of words.
Gunning Fog Index calculates the syntactic complexity of
language using sentence lengths and complex words and
implies that short and simple sentences in plain English
achieve a better score (lower value) than long sentences in
complicated language.
Gunning Fog Index = 0.4 x (ASL + ((SYW / words) x 100))
Where:
ASL = Average sentence length (the number of words divided
by the number of sentences)
SYW = Number of words with three or more syllables
Flesch-Kincaid Grade calculates the language difficulty using
average sentence lengths and average syllables per word. It
estimates the number of years of education required to
understand the written or transcribed text.
Flesch-Kincaid Grade = (0.39 x ASL) + (11.8 x ASW) – 15.59
Where:
ASL = Average sentence length (the number of words divided by
the number of sentences)
ASW = Average number of syllables per word (the number of
syllables divided by the number of words)
The SMOG Grade also estimates the number of education years
needed to fully comprehend the text. It uses sentences and
complex words to calculate it. The emphasis on full
comprehension distinguishes this measurement from other
complexity measures.
SMOG Grade = square root of ((SYW / sentences) x 30) + 3
Where:
SYW = Number of words with three or more syllables
Lexile Level also measures the comprehension complexity
of any text. A Lexile measure is the numeric representation
of a text’s difficulty ranging from 200L for easy to above
1700L for complicated texts. It uses mean sentence
lengths and mean log word frequency to calculate it.
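
The formulas above can be sketched in a few lines of Python. This is a rough illustration, not the tooling used in the study: the syllable counter is a crude vowel-group heuristic, the sentence and word splitting rules are assumptions, and all function names are hypothetical. The SMOG expression is read with the "+ 3" outside the square root. Lexile is omitted because it needs word-frequency data that the slides do not give.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per group of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_counts(text: str) -> dict:
    """Gather the raw counts that the readability formulas need."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", text)
    syllables = [count_syllables(w) for w in words]
    # Complex ("hard") word: three or more syllables and no hyphen.
    complex_words = sum(
        1 for w, n in zip(words, syllables) if n >= 3 and "-" not in w
    )
    return {
        "sentences": len(sentences),
        "words": len(words),
        "unique_words": len({w.lower() for w in words}),
        "syllables": sum(syllables),
        "complex_words": complex_words,
    }


def lexical_density(c: dict) -> float:
    return 100.0 * c["unique_words"] / c["words"]


def gunning_fog(c: dict) -> float:
    asl = c["words"] / c["sentences"]  # average sentence length
    return 0.4 * (asl + 100.0 * c["complex_words"] / c["words"])


def flesch_kincaid_grade(c: dict) -> float:
    asl = c["words"] / c["sentences"]
    asw = c["syllables"] / c["words"]  # average syllables per word
    return 0.39 * asl + 11.8 * asw - 15.59


def smog_grade(c: dict) -> float:
    return math.sqrt(30.0 * c["complex_words"] / c["sentences"]) + 3.0
```

Separating `text_counts` from the formula functions means the counting step could be swapped for a proper tokenizer or syllable dictionary without touching the formulas.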
23
Lab Data – Language Complexity Analysis
24
Lab Data – Language Complexity Analysis
25
Bushfire Data - Introduction
• Speech and transcription data from Bushfire CRC.
• Training exercises – four states (TAS, VIC, NSW, and QLD).
• Three roles: Incident Controller (IC), Planning, Operations.
• 11 exercises, 33 subjects
• All exercises monitored by bushfire management experts.
• Operators co-located in a control room and trained for roles.
• Data collection, transcription, coding, cleaning, analyses.
• Four different load levels:
– (1) ‘low’: casual conversation, no time pressure;
– (2) ‘medium’: routine tasks;
– (3) ‘high’: challenging tasks, time constraints; and
– (4) ‘very high’: very challenging, lots of unexpected events and breakdowns.
• Combined into low and high.
26
Bushfire Data – Same Hypotheses
• Higher subjective ratings under high load task.
• More speech and longer sentences.
• More and longer pauses under high load task.
• More use of:
– Negative emotion words, inclusive words, swear words, cognitive and
perceptive phrases, disagreement words etc.
• Less use of:
– Positive emotion words, agreement, certainty, achievement words
• More hesitations and incomplete sentences
• More use of plural pronouns and less use of singular ones.
• More complex sentences under high load task.
27
Bushfire Data – Linguistic Analysis (Words)
28
Bushfire Data – Linguistic Analysis (Pronouns)
• Singular pronouns decrease
• Plural pronouns increase
29
Bushfire Data – Linguistic Analysis (Pronouns)
• Interaction between Singular and Plural Personal Pronouns
30
Bushfire Data – Language Complexity Analysis
31
Other Linguistic Analysis Possibilities
• N-gram Analysis
– Bi-gram Ratio
– Others:
• Most common N-grams (Bigrams, Trigrams, 4-grams)
• Most common words (Unigrams)
• Most frequent or least frequent N-grams
• More…
• Parse Tree Analysis
– Order of n-grams
• For both – words and parts of speech.
[Chart: Bi-gram Ratio (percent, 50% to 100%) by Load Level (1 to 4)]
                 L1       L2       L3       L4       p
Bi-gram Ratio    93.5%    80.9%    79.4%    72.6%    0.0002
32
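The slides do not define the Bi-gram Ratio. The sketch below assumes one plausible reading, the percentage of distinct word bigrams among all bigrams in a transcript, and should be taken as an illustration of n-gram counting rather than the measure actually reported. The function name `bigram_ratio` is hypothetical.

```python
import re


def bigram_ratio(transcript: str) -> float:
    """Percentage of distinct word bigrams among all bigrams (assumed definition)."""
    words = re.findall(r"[a-z']+", transcript.lower())
    bigrams = list(zip(words, words[1:]))  # consecutive word pairs
    if not bigrams:
        return 0.0
    return 100.0 * len(set(bigrams)) / len(bigrams)
```

Under this reading, a lower value means more repeated word pairs in the speech sample.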
An Abstract CLM Model
• Automatic, Real-time, Non-intrusive
33
Looking at Data Sets
• Reading Experiment
• Touch-table Collaborative Experiment
• Bushfire Study
• Driving Study
34
Driving Study Data - Introduction
• Simulated Driving Experiment
• Investigate how the distractions can affect the performance of the user
• Identification of features to measure users’ cognitive load.
• 18 participants (8 females and 10 males)
• Data collected:
– Video (2 cameras, front and rear view)
• Eye gaze movement
– Audio
– Galvanic Skin Response (GSR) or skin resistance
35
Driving Study Data – Experiment Setup
• Big screen for game
• Front camera
• Simulator frame
• Wireless headset
• Bio-sensor (GSR)
• Speakers at back
• Rear Camera
36
Future Challenges
• Areas for future work
– Development of larger databases
– Task dependent and task independent features
• Need to take lab experiments ‘into the wild’
– Defining, researching and standardising tasks of interest
– Joint modeling of linguistic, speaker and cognitive load/emotion
information
37
Exploring Multimodalities
38
Exploring Multimodality
• Hypothesis:
– Users are more likely to use complementary multimodal productions
as cognitive load increases
– Users will tend to rely on one modality more as cognitive load increases
• Method:
– Wizard of Oz scenario: speech and gesture interface for a series of
map-based tasks; tasks increase in difficulty by varying quantity of content and time pressure
– Conditions for Speech Only interaction, Gesture Only interaction and Multimodal
– Videotape participants, record audio, record answers, post-hoc introspection questionnaire
39
Multimodality and Cognitive Load
• Exploring Multimodal Interface Scenarios
– The recognisers in the interface will capture the user’s input,
interpret the information and choose an appropriate response
– Opportunity to capture interaction data implicitly
[Diagram: Cognitive Load Analysis, with inputs User Characteristics, Task Characteristics, Visual Data, Audio Data, Physiological Data, Environmental Data, Other Modalities]
40
Experiment Design
• Task:
– Incident Management Response
E.g. A major accident on corner of X and Y.
– Operators are required to deploy necessary crews and implement policies
and procedures
• Method:
– Elicit speech and free-hand gesture interface for a series of map-based
tasks;
– Wizard of Oz scenario
– Videotape participants, record audio, record answers, post-hoc
introspection questionnaire
• Dependent Variables:
– Biosensor input: GSR and BVP
– Gesture: video footage
– Speech: transcribed manually
– Performance: latency, completion time & error-rates
– Multimodal productions: manual annotation
41
Examining Multimodal Input Structures
42
The Task
• There are 36 small tasks, divided into 3 groups of 12.
• Each group of 12 will consist of maps from 4 different cities:
• Each new task will be given to you at the top of the screen:
– e.g. There has been an accident on the corner of Victoria and Liverpool Street.
• The tasks will be carried out using different modes:
– speech + gesture together,
– speech-only and
– gesture-only
The experimenter will tell you which mode you should be using for each task.
• The task will first require some visual search for information.
• There are only three things the system can do:
1. Zooming in and out of maps
2. Selecting map elements
3. Tagging map elements
43
The Task
[Screenshot labels: Toolbox, Task Description, Information/Feedback Area, Map]
44
Zooming Map Levels
Top-level map
No selectable elements:
divided into four quadrants by
a dotted black line
Lower-level map
Contains selectable
elements; can zoom out
to higher level map
45
Selectable Elements
• Selected elements will be shown with a blue border.
School
Petrol Station
Library
Fire Station
Shopping Centre
Parking Station
Intersection
Hospital
RTA Branch
Church
46
Tagging Map Elements
Tagging is a two-step process:
1. Select map element
2. Tag as Accident, Incident or Event
Accident: e.g. car accident, fire, flooding
-> Green border
Event: e.g. concert, protest march, fun run
-> Red border
Incident: occurrence that might cause a disruption to the traffic, e.g.
broken-down car, or a traffic jam in peak hour
-> Yellow border
Info: Information area beneath the map
Clear: Clears all tags for selected element
47
Special Tag: Notifying
Two parts: The element and the recipient need to be specified.
• Select map element (e.g. Intersection, marked as accident)
• Select NOTIFY action
-> PINK tag appears
• Select the recipient map element (RTA Branch, Fire Station…)
-> AQUA tag appears
48
Zooming
• 2 zoom levels
• Lower level maps have selectable elements
• Zoom in: 4 quadrants
• Zoom out
Top-level zoomable map (no selectable elements)
Lower-level map with selectable elements
49
The Modalities
• Speech
– Short and sweet
– No specific words, no specific word order
-> We only give some suggestions
– Speak clearly and loudly
Zooming      Zoom into the top right quadrant
             Top right quadrant
             Zoom in to top right
             Zoom out please
Selecting    Select the Church on Liverpool Street
             Church on Liverpool
             Please highlight the Church
Tagging      Make selected Church an accident (or incident or event) zone
             Selected Church. Accident.
             Accident.
50
The Modalities (2)
• Hand Gestures
– Pointing
– Hand shapes
Zooming      Point to quadrant and pause to select and zoom in.
             Point to diagonally opposite ends of map, pause to zoom out.
Selecting    Point to the element, pause until beep
Tagging      Very clear hand shape (fist, flat palm, scissors, thumbs-up)
             OR
             Point to button in toolbox, pause to select
51
The Modalities (3)
• Multimodal
– Speech + gesture
– Any order or combination
– Speech only or gesture only are OK
– Examples:
• “Make this into an accident” + pointing at element
• “Zoom into this quadrant” + pointing at quadrant
• “Zoom out again”
52
Research Design
Balancing Available Modalities
• The traffic incident management (TIM) domain was used, and subjects
were required to update a geographical map with traffic condition information. Following our requirement, tasks were achievable using the following modalities:
– Gesture:
• Deictic pointing to map locations, items, and function buttons;
• Circling gestures for zoom functions.
– Hand Shapes: Predefined hand shapes for item tagging: fist, open palm, thumbs up etc.
– Speech: street names, actions etc.
• A large overlap was introduced across modal ways of performing actions. However, some tasks required the combination of modalities.
53
Task Design
• Task Specification
– Task was given in written mode
– Users had freedom of inspection
– The task described a situation, but did not specify activities, e.g.
“An incident has occurred: a truck has lost some of its load at Walter
Avenue and Lytton Road, near Mowbray Park”
• Task Activities
– Locate point of interest on the map
– Mark with one of 3 tags: accident, incident or event
– Notify relevant authorities, e.g. if casualties exist, notify a hospital.
– 11 different kinds of functionality available
54
Task Difficulty Level Design
• There were four levels of cognitive load, and three tasks were completed for each level.
• The same visual was used for each level to avoid differences in visual complexity.
• The tasks varied in load through:
– The number of distinct entities in the task description;
– The number of distractors (items not needed for the task);
– The minimum number of actions required for the task.
– Further load was achieved in Level 4 by introducing a time limit.
Level    Entities    Actions    Distractors    Time limit
1        6           3          2              ∞
2        10          8          2              ∞
3        12          13         4              ∞
4        12          13         4              90 sec.
55
Available Modalities
• The Modalities
– Aimed to capture natural patterns of
speech and gesture combinations
– Speech: natural spoken language
‘recognised’ by an operator
• Avoids bias injected by errors in
recognisers
– Gesture: automated hand tracking
• Untethered: no equipment used on
the person
• Both tracking of the hand and hand
shapes used
• Buttons added to reduce
expressivity gap between gesture
and speech
– Either or both could be used for
each command
Input           Speech        Gesture
Select          “Select”      Point
Zoom            “Zoom”        Circling
Notify          “Notify”      Thumbs up
Tag Accident    “Accident”    Fist
Tag Incident    “Incident”    Open Palm
Tag Event       “Event”       Scissors
56
Example of Interaction
System Functionality: Example of Interaction
Selecting a location/item of interest: <Point at location>; or “St Mary’s Church”
Zooming in or out of a map: <Point at quadrant>; or “Zoom in to the top right quadrant”
Starting or ending a task: “End task”; or <Point at End task button>
Notifying a recipient (item) of an accident, incident or an event: <Select accident> and “notify”; or fist shape and <Select recipient>
Tagging a location of interest with an ‘accident’, ‘incident’ or ‘event’ marker: <Select location> and: “Incident”; or Scissors shape
57
Wizard of Oz
[Diagram labels: Wizard, Main computer, Firewire camera, Camcorder (two), AGR]
58
Data Captured
• The study generated various streams of data that were captured as
follows:
– Speech was orthographically transcribed, including specific tags for
disfluencies such as false starts and hesitations. Start and end times were
annotated for each utterance;
– Hand motion was captured by the automatic gesture recogniser at a rate
of 20 frames per second. Positions are relative to the camera view angle;
– Deictic pointing (pause while pointing, or circling) and hand shapes were
annotated at two levels: the video was annotated to mark the start and end
time of the overall motion leading to the gesture.
– System feedback to the user such as task change (marked by a beep), item
information, or error message was recorded with its time of occurrence;
– Bio-sensor data was recorded at a rate of 100 points per second. Skin
conductance is measured in microsiemens (µS) while blood volume pulse
only provides relative measures expressed as a percentage.
59
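Because each utterance carries a start and end time, pauses between consecutive utterances of one speaker can be derived directly from the annotations. The following is a minimal sketch under stated assumptions: utterances are given as `(start_sec, end_sec)` tuples, and the 0.5-second threshold and the function name `pause_stats` are illustrative choices, not values taken from the study.

```python
def pause_stats(utterances, min_pause=0.5):
    """Summarise silent gaps between consecutive utterances of one speaker.

    utterances: iterable of (start_sec, end_sec) tuples.
    min_pause:  shortest gap, in seconds, counted as a pause (assumed value).
    """
    ordered = sorted(utterances)
    pauses = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end
        if gap >= min_pause:
            pauses.append(gap)
    total = sum(pauses)
    mean = total / len(pauses) if pauses else 0.0
    return {"count": len(pauses), "total": total, "mean": mean}
```

Pause count and mean pause length per task could then be compared across load levels.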
Sample of Annotation
Turn                    Construction    Modality    Content
Mark an Incident (A)    Select (a)      Gesture     [point to St Mary’s Church]
                                        Speech      “Select St.Mary’s Church”
                        Tag (a)         Shape       [scissors=Incident]
                                        Speech      “Incident”
Mark an Accident (C)    Select (c)      Speech      “Select Crown Street Library”
                        Tag (c)         Shape       [fist=Accident]
Mark an Event (B)       Select (b)      Speech      “Select”
                                        Gesture     [point to Collingwood School]
                        Tag (b)         Shape       [open_palm=Event]
60
Results and Analysis
• Users: 15 available
• Total inputs: 1119
• Total turns: 394 (206 multimodal, MM)
• Total constructions: 644
• Average difficulty rating for levels (subjective)
– Level 1 (easiest): 2/10
– Level 2: 4.2/10
– Level 4 (hardest): 5/10
• Redundancy and Complementarity:
– Each user command in the system requires an action and an object
• Speech and/or
• Gesture-HandShape
• Redundancy
– Doubling up of either action or object information or both
           Action    Object
Speech     √         √
Gesture    √         √
• Complementarity
– Action and object come through different modalities
– Speech carries one of the two (√) and Gesture carries the other (√)
61
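The redundancy and complementarity definitions can be expressed as a small classifier. This is a simplified sketch, not the annotation scheme used in the study: it assumes each turn has already been reduced to the set of slots (`"action"`, `"object"`) that each modality conveyed, and the function name `classify_turn` and its labels are hypothetical.

```python
SLOTS = {"action", "object"}


def classify_turn(speech: set, gesture: set) -> str:
    """Label a turn by how speech and gesture share the action/object slots."""
    speech, gesture = speech & SLOTS, gesture & SLOTS
    if not speech or not gesture:
        return "unimodal"  # only one modality carried information
    overlap = speech & gesture
    if overlap == SLOTS:
        return "purely redundant"  # both slots doubled up
    if overlap:
        return "partially redundant"  # one slot doubled up
    return "purely complementary"  # slots split across modalities
```

For example, saying "Incident" while pointing at a location gives `classify_turn({"action"}, {"object"})`, which this sketch labels purely complementary.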
Rates of Redundancy
• Redundancy:
– Conveying the same information over
more than one modality,
– Either would be sufficient on its own
Turn              Const     Modality      Content
Pure Redundant    Select    Gesture       [point to St Mary’s Church]
                            Speech        “Select St.Mary’s Church”
                  Tag       Hand_Shape    [scissors=Incident]
                            Speech        “Incident”
[Box plot: Proportion of Purely Redundant turns by Level (Level 1, Level 2, Level 4); series: Min, Q1, Mean, Q3, Max]
• We found a statistically significant decrease in the number of purely redundant turns from
– 62.91% in Level 1 to
– 29.9% in Level 4 of all multimodal turns.
62
[Chart: Redundancy by level (Level 1, Level 2, Level 4); series: Purely redundant, Partially redundant, Purely complementary]
We observed a steady decrease in redundancy as task difficulty increased. An ANOVA test between users, across levels, shows there are significant differences between the means (F = 3.88 (df = 2); p < 0.05).
63
Rates of Complementarity
• Complementarity:
– Conveying different information over different modalities, e.g.
Turn               Action    Modality      Content
Pure Complement    Select    Speech        “Select St Mary’s Church”
                   Tag       Hand_Shape    [scissors=Incident]
• We also found trends of increased multimodal complementarity across levels:
– 12.86% in Level 1
– 45.53% in Level 2, and
– 36.02% in Level 4
64
Cognitive and Working Memory Theories
• Why? A reduced level of redundancy plus an increased level of complementarity suggests a specific working memory strategy
• Modal Model of Working Memory [Baddeley, 92]
[Diagram: Central Executive, Phonological Loop, Visual-Spatial Sketchpad]
• Working Memory Strategies:
– Activity is shifted to areas marked exclusively for modal use
– At high load, users try to maximise the usage of modal working memory
– Users channel the required semantic chunks to different modalities, with the least amount of replication possible
65
Discussion and Challenges
• Results:
– The results of this study give initial evidence that
redundancy/complementarity is a behavioural symptom of the cognitive load
management employed by users
• Sensitivity and Diagnosticity:
– ‘Ceiling’ values for rates of redundancy or complementarity
– Clearly not suitable for all users
• Automatic cognitive load estimation:
– A compound measure
– Various individual modal measurements for robustness
– Weighting of features on a per-user basis
• More reliable indices will influence a combined measure more strongly
66
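One simple way to realise a compound measure with per-user weighting is a weighted average of per-feature load scores. The sketch below is an assumption about how such a combination might look, not the model proposed in the talk: each feature score is assumed to be normalised to the range 0 to 1, the weights stand for per-user reliability, and the function name `combined_load_index` is hypothetical.

```python
def combined_load_index(features: dict, weights: dict) -> float:
    """Weighted average of per-feature load scores (each assumed in [0, 1]).

    features: feature name -> normalised load score for this user.
    weights:  feature name -> reliability weight for this user; features
              without a weight contribute nothing.
    """
    total_weight = sum(weights.get(name, 0.0) for name in features)
    if total_weight <= 0:
        raise ValueError("at least one feature needs a positive weight")
    weighted = sum(
        score * weights.get(name, 0.0) for name, score in features.items()
    )
    return weighted / total_weight
```

With `features = {"pause": 0.8, "pronoun": 0.4}` and `weights = {"pause": 3.0, "pronoun": 1.0}`, the more reliable pause feature pulls the combined index towards its own score.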