Detecting and Responding to Student Emotion within an Online Tutor. Beverly Park WOOLF, College of Information and Computer Sciences, University of Massachusetts Amherst. Supported by the National Science Foundation; U.S. Dept. of Education; Bill & Melinda Gates Foundation.

Post on 07-Apr-2017


Page 1: Detecting and Responding to Student Emotion

Detecting and Responding to Student Emotion within an Online Tutor

Beverly Park WOOLF
College of Information and Computer Sciences, University of Massachusetts Amherst

Supported by:
• National Science Foundation
• U.S. Dept. of Education
• Bill & Melinda Gates Foundation

Page 2: Detecting and Responding to Student Emotion

General Motivation

• What to do in the moment when students are frustrated, bored, etc.?
– Increase challenge? Decrease challenge?
– Provide extra scaffolds?
– Provide “affective” scaffolds? What are those?
– Encourage students to stop and think about what is going on?

• How to measure changes in student affect, capturing micro-changes in student affective states?

Page 3: Detecting and Responding to Student Emotion

People and Computers

Many people:
• relate to computers in the same way they relate to humans (Nass, 2010);
• continue to engage in frustrating tasks significantly longer after an empathic digital response (Picard);
• have lowered stress levels after receiving an empathetic message from a digital character (Arroyo et al., 2009);
• recalled more information when interacting with an artist agent compared to a scientist agent;
• report reduced frustration and more general interest when working with gender-matched characters.

Page 4: Detecting and Responding to Student Emotion

In Summary

• Empathetic characters help decrease students’ anxiety and boredom.
• Simple 2D characters instead of 3D characters work well with students.
• Tutoring systems that are not based on natural language processing work well.
• Learning companions that show empathy help with students’ negative affective states.
• Growth mindset messages provide a boost in student math learning.
• Empathic messages yield higher math performance and learning.
• Simple success/failure comments are harmful to students, in comparison to other conditions.

Page 5: Detecting and Responding to Student Emotion

Negative Emotion and Learning

Negative Valence Emotions

• Confusion is associated with learning under certain conditions
D'Mello, S. K., Lehman, B., Pekrun, R. & Graesser, A. C. (2014). Confusion Can be Beneficial For Learning. Learning & Instruction.

• Boredom reduces task performance
Pekrun, R., Goetz, T., Daniels, L., Stupnisky, R. & Perry, R. (2010). Boredom in Achievement Settings: Exploring Control–Value Antecedents and Performance Outcomes of a Neglected Emotion. Journal of Educational Psychology. 102(3), 531-549.

• Boredom increases ineffective behaviors such as gaming
Baker, R. S. J. d., D'Mello, S. K., Rodrigo, M. M. T. & Graesser, A. C. (2010). Better to Be Frustrated than Bored: The Incidence, Persistence, and Impact of Learners' Cognitive-Affective States during Interactions with Three Different Computer-Based Learning Environments. International Journal of Human-Computer Studies. 68(4), 223-241.

Page 6: Detecting and Responding to Student Emotion


Page 7: Detecting and Responding to Student Emotion

Affective learning companions congratulate students on effort exerted and talk to them about their effort and learning.

Affective Learning Companions

Page 8: Detecting and Responding to Student Emotion

Agent Emotion

[Figure panels: Incorrect response; Student effort shown / correct response; Student effort shown / incorrect response]

Agents support frustrated students by acting helpful, bored, or confused.

Arroyo et al., AIED2009

Page 9: Detecting and Responding to Student Emotion

Agent Emotion

[Figure panels: Effort Attribution; Shrug; High interest]

Students believe agents are part of the learning experience, mentors. . . who are together with students against the computer, . . . who are more knowledgeable (most of the time) cognitively and emotionally.

Arroyo et al., AIED2009

Page 10: Detecting and Responding to Student Emotion

Methodology

Measure students’ cognitive and affective attributes (skills, motivation, engagement) in real time.

Offer appropriate and timely interventions.

Measure the impact of each intervention.

Page 11: Detecting and Responding to Student Emotion

Reduced Frustration

[Figure: frustration level, on a scale from Less Frustrated through Neutral to More Frustrated]

Page 12: Detecting and Responding to Student Emotion

Increased Interest

• Less boredom for math at posttest time in the LC (learning companion) condition: F(94,1)=3.4, p=.07

[Figure: interest level, on a scale from More Bored through Neutral to More Interested]

Page 13: Detecting and Responding to Student Emotion

Improved Confidence

Page 14: Detecting and Responding to Student Emotion

AGENDA

Detect Student Emotion
– Sensors
– Mathematical models
– Student self-reports of emotion

Remediate Student Emotion
– Teacher-based
– Peer-based
– Game-based

Page 15: Detecting and Responding to Student Emotion


Use Sensors

Page 16: Detecting and Responding to Student Emotion

The Students

• Rural-area high school in MA (35 students)
– Geometry and Algebra classes
• UMASS 114 (29 students)
– Math for Elementary School Teachers

Page 17: Detecting and Responding to Student Emotion

Detect Emotions with Each Sensor

Conclusion: a camera can detect most emotions.

Page 18: Detecting and Responding to Student Emotion

The MathSpring System

Page 19: Detecting and Responding to Student Emotion

AGENDA

• Detection of Student Emotion
– Sensors
– Mathematical models
– Student self-reports of emotion

• Remediation of Student Emotion
– Teacher-based
– Peer-based
– Game-based

Page 20: Detecting and Responding to Student Emotion

Can Models Reproduce Sensors?

[Diagram of measures. Before Tutor: Math Pretest, Math Self-concept, Math Value, Learning Orientation. During the tutor: Confidence, Interest, Frustration, Excitement. After Tutor: Math Posttest, Math Self-concept, Math Value, Learning Orientation, Perception (Learned?), Perception (Liked?).]

Page 21: Detecting and Responding to Student Emotion

Measuring cognition, meta-cognition and affect.

Pre- and post-tutor assessments of:
• Self-concept in Mathematics (3 items)
• Math Value/Liking (3 items)
• Mastery Learning (Learning Orientation) (2 items)
• Mathematics Test (15 items)
• Math emotions baseline (4 items)
– Frustration/Ease, Confidence/Anxiety,
– Interest/Boredom, Excitement/Lack of Excitement

Post-tutor assessments only:
• Perception of Learning
• Liking the tutoring software

Page 22: Detecting and Responding to Student Emotion

Models of Emotions
From Tutor-Context Variables Only

Linear models to predict emotions; variables entered in stepwise regression.

Emotions modeled: Confidence, Interest, Frustration, Excitement

Tutor context variables (for the last problem):
• # Hints seen
• Solved? 1st attempt
• # Incorrect attempts
• Gender
• Ped. Agent
• Seconds to 1st attempt
• Time in tutor
• Seconds to solve

Model fit (R), as listed on the slide: R=0.53, R=0.43, R=0.37, R=0.49
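The stepwise procedure named on this slide can be sketched roughly as follows. This is a minimal forward-selection sketch in Python with NumPy, assuming ordinary least squares and a stopping rule based on the gain in R²; the function names, the `min_gain` threshold, the variable names, and the synthetic data are all illustrative assumptions, not the study's actual procedure or data.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    total = y - y.mean()
    return 1.0 - (residual @ residual) / (total @ total)


def forward_stepwise(X, y, names, min_gain=0.01):
    """Greedy forward selection: repeatedly add the variable that raises
    R^2 the most; stop when the best gain falls below min_gain
    (an assumed stopping rule, not taken from the slides)."""
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = {j: r_squared(X[:, selected + [j]], y) for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best_r2 < min_gain:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best_r2 = scores[j_best]
    return [names[j] for j in selected], best_r2


# Synthetic stand-in data (hypothetical): only the first two variables matter.
rng = np.random.default_rng(0)
names = ["hints_seen", "incorrect_attempts", "seconds_to_solve", "time_in_tutor"]
X = rng.normal(size=(200, 4))
y = -0.6 * X[:, 0] - 0.4 * X[:, 1] + rng.normal(scale=0.5, size=200)

chosen, r2 = forward_stepwise(X, y, names)
print(chosen, "R =", round(float(np.sqrt(r2)), 2))
```

The square root of R² is the multiple correlation R, which is the kind of figure the slide reports for each emotion.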

Page 23: Detecting and Responding to Student Emotion

Models of Emotions with Sensors
From Tutor-Context Variables and Sensors

Linear models to predict emotions; variables entered in stepwise regression.

Emotions modeled: Confidence, Interest, Frustration, Excitement

Sensor variables (Mean, Min, Max, Stdev for the last problem):
• Seat sensor: SitForward Mean, SitForward Stdev
• Camera facial detection software: “Concentrating” Max. Probability

Tutor context variables (for the last problem):
• # Hints seen
• Solved? 1st attempt
• # Incorrect attempts
• Gender
• Ped. Agent
• Seconds to 1st attempt
• Time in tutor
• Seconds to solve

Model fit (R), as listed on the slide:
• Tutor Only: R=0.53, R=0.43, R=0.37, R=0.49
• All Sensors+Tutor: R=0.72, R=0.70, R=NA, R=0.82

Page 24: Detecting and Responding to Student Emotion

AGENDA

• Detection of Student Emotion
– Sensors
– Mathematical models
– Student self-reports of emotion

• Remediation of Student Emotion
– Teacher-based
– Peer-based
– Game-based

Page 25: Detecting and Responding to Student Emotion

How are you feeling? Please rate your level of interest in this

Page 26: Detecting and Responding to Student Emotion

Students Self-Report Emotions

Four bipolar emotional axes

Page 27: Detecting and Responding to Student Emotion

Resulting Data

Page 28: Detecting and Responding to Student Emotion

How can we understand student affect?

How can students’ actual words be used?

Page 29: Detecting and Responding to Student Emotion

Research to Detect Affect

Two major approaches to examine affect:
• Categorical
– Each affect is considered a discrete category
– Approach usually used in ITS
• Dimensional models of emotion
– Valence (+/-), activation (high/low)
– Locus of control (internal/external)
– Previous results mixed

Page 30: Detecting and Responding to Student Emotion

Self-report Methods
• Students were prompted for affect every 5 problems or 8 minutes (whichever came first)
– Asked about either frustration, confidence, excitement or interest each time
• Asked to rate their affect on a 5-point Likert scale and to answer “why”
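The prompting rule above (every 5 problems or 8 minutes, whichever comes first) can be sketched as a small scheduler. This is an illustrative Python sketch only: the class name, the choice to check the rule between problems, and the fixed rotation through the four emotions are assumptions, since the slide says only that one of the four was asked each time.

```python
from dataclasses import dataclass

EMOTIONS = ("frustration", "confidence", "excitement", "interest")


@dataclass
class SelfReportScheduler:
    """Prompts every max_problems problems or max_seconds seconds, whichever comes first."""
    max_problems: int = 5
    max_seconds: float = 8 * 60
    problems_since_prompt: int = 0
    last_prompt_time: float = 0.0
    prompts_issued: int = 0

    def problem_finished(self, now: float):
        """Call after each problem with the session clock in seconds.
        Returns the emotion to ask about, or None if no prompt is due."""
        self.problems_since_prompt += 1
        due = (self.problems_since_prompt >= self.max_problems
               or now - self.last_prompt_time >= self.max_seconds)
        if not due:
            return None
        # Assumption: rotate through the emotions in a fixed order.
        emotion = EMOTIONS[self.prompts_issued % len(EMOTIONS)]
        self.prompts_issued += 1
        self.problems_since_prompt = 0
        self.last_prompt_time = now
        return emotion


scheduler = SelfReportScheduler()
# Five quick problems, one per minute: the problem-count rule fires on the fifth.
prompts = [scheduler.problem_finished(now=60.0 * k) for k in range(1, 6)]
# One slow problem finishing 500 s later: the 8-minute rule fires.
slow_prompt = scheduler.problem_finished(now=800.0)
```

Checking only between problems keeps the prompt from interrupting a student mid-problem, which is a design choice of this sketch rather than something the slides state.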

Page 31: Detecting and Responding to Student Emotion

Phase 1: Open Coding

• 450 random responses from 2011 given to 5 coders
• Each coder independently created ~10 categories and tagged the responses using these, i.e., a student’s words were an example of emotion XX
– Each coder’s categories had to cover at least 70% of all responses
– A response could be tagged with multiple categories if appropriate

Page 32: Detecting and Responding to Student Emotion

Phase 2: Axial Coding
• 3 coders used the resulting 5 schemes to create 10 final categories:
– IDK: student doesn’t know why they feel that way
– Boring: student says they are bored/material is boring
– Easy: student says material is easy
– Hard: student says material is hard
– Internal: student attributes feelings to self
– External: student attributes feelings to something outside self
– Positive: valence of comment is positive
– Negative: valence of comment is negative
– Supportive: student says tutor is helpful or supports them
– Unsupportive: student says tutor is not helpful or does not support them

Page 33: Detecting and Responding to Student Emotion

Phase 3: Application and Validation of Tags

• Four coders each coded the 2015 data using the 10 agreed-upon categories
• Inter-rater reliability measured by Cohen’s Kappa
• Used the coded data from the coder who had the highest agreement with others overall

[Table: Highest Kappa between any 2 coders for each tag]
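For reference, Cohen's kappa compares the observed agreement between two coders with the agreement expected by chance from each coder's label frequencies. The following is a small Python sketch of that standard formula; the two coders' label lists are made-up illustration data, not values from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: whether each coder applied the "Boring" tag (1) or not (0)
# to the same ten responses.
coder_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
coder_2 = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

Because a response could carry several tags, one plausible way to apply this is per tag, treating each tag as a separate present/absent judgment, as in the example.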

Page 34: Detecting and Responding to Student Emotion

Results & Analyses

Frequency of Each Code Out of a Total Sample Group (2015 N = 449; 2011 N = 464)

The tutor seems to improve in promoting positive student affect.

Page 35: Detecting and Responding to Student Emotion

Results & Analyses

Percentage of reports containing each tag, broken down by affect

[Figure panels: 2011; 2015]

• More positive, less negative affect.
• More internal, less external affect.
• Less boring material.

Page 36: Detecting and Responding to Student Emotion

Results & Analyses

Frequent Combinations

Page 37: Detecting and Responding to Student Emotion

Discussion

• Many students tend to externalize affect
– Especially when negative; “the problems are too difficult”
• Populations differed on when they reported the material was “hard”
• The tutor seems to be improving in promoting positive student affect.

Page 38: Detecting and Responding to Student Emotion

Representations of Affect

Page 39: Detecting and Responding to Student Emotion

Use External Coders

•She looks “Angry”

•She looks “Angry”

•She looks “Angry”

BROMP (Baker-Rodrigo Observation Method Protocol)

Ocumpaugh, J., Baker, R., and Rodrigo, M.M.T. (2012). Baker-Rodrigo observation method protocol (BROMP) 1.0. Training manual version 1.0. Technical Report. New York, NY: EdLab. Manila, Philippines: Ateneo Laboratory for the Learning Sciences.

Page 40: Detecting and Responding to Student Emotion

Inter-Rater Reliability

• Reliability is a decent “goodness” metric

• Reliability ≠ Validity

• Good face validity

• ≈ “Angry”

• ≈ “Constipated”?

Page 41: Detecting and Responding to Student Emotion

Internal Representation ≈ Experience?

• Self Appraisal

• Self Report

• Relate an Experience to a Representation

•Can I understand how I feel?

Page 42: Detecting and Responding to Student Emotion

Self-Report Requires Matching Experience to Representation

Page 43: Detecting and Responding to Student Emotion

Let’s Address This

• Self-Report Method Reliability

• Establish Method to Measure Reliability

• Distinguish Relative Reliability of Different Methods

Page 44: Detecting and Responding to Student Emotion

Participants

• Eighty-one (81) seventh graders from two California middle schools

• Majority Latin American

• Close to California Median Income

Page 45: Detecting and Responding to Student Emotion

Students Were Given

Page 46: Detecting and Responding to Student Emotion

& the Following Stickers

Angry Anxious Bored Confident Confused

Enjoying Excited Frustrated Interested Relieved

Page 47: Detecting and Responding to Student Emotion

Motivations For Stickers

• Words based on relevance in education and similarity to emotes (faces)
• Emoticons (faces) based on Broekens & Brinkman 2013 Affect Button
– Has extremes of Valence (Pleasure), Activation, & Dominance
– We chose each extreme for faces (2^3)
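The count 2^3 comes from taking the low and high extreme of each of the three dimensions, which gives eight combinations. A short Python sketch enumerating them follows; the -1/+1 encoding and the variable names are assumptions made for illustration, not the Affect Button's own representation.

```python
from itertools import product

DIMENSIONS = ("valence", "activation", "dominance")

# Every combination of a low (-1) or high (+1) extreme on each dimension.
corners = [dict(zip(DIMENSIONS, signs)) for signs in product((-1, +1), repeat=3)]

for corner in corners:
    print(corner)
```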

Page 48: Detecting and Responding to Student Emotion

Averaged Self-Report Values

Page 49: Detecting and Responding to Student Emotion

Can progress reports from virtual teachers improve student interest, excitement?
Given that they encourage students to stop and reflect... And give them a choice…

Do progress reports have the capacity to improve student interest and excitement?

Page 50: Detecting and Responding to Student Emotion

What do researchers know about showing progress to students?

• Cognitive side: basic progress charts every 6 problems in an intelligent tutor led to higher learning gains

Arroyo et al. (2007). Repairing disengagement with non-invasive interventions. AIED 2007.

• Meta-cognitive side: progress reports reduce student gaming behaviors (e.g., hint abuse)

Arroyo et al. (2007)

Page 51: Detecting and Responding to Student Emotion

Variance of Student Placements

Page 52: Detecting and Responding to Student Emotion

Discussion

• Two things to address:

– Assuming These Results are “Representative”, What Does It Mean for Self-Report?

– “Mistakes were made…”

Page 53: Detecting and Responding to Student Emotion

What Does This Mean for Self-Report?

• One student’s “High Boredom” ≠ another’s

• Students don’t see affect as fitting Russell’s “Core Affect” wheel

• Should inform decisions of whether to use & how much to trust self-report
– Cheap & easy
– High variance, but other measures are more tightly controlled

Page 54: Detecting and Responding to Student Emotion

Possible Sources of Error

• Experience ≠ Recall (Bieg et al.)

• Culture & Representations

• Relative Ordering (us specific)

• Can I recall how I felt?

Page 55: Detecting and Responding to Student Emotion

AGENDA

Detect Student Emotion
– Sensors
– Mathematical models
– Student self-reports of emotion

Remediate Student Emotion
– Teacher-based
– Peer-based
– Game-based

Page 56: Detecting and Responding to Student Emotion

AGENDA

Detect Student Emotion
– Sensors
– Mathematical models
– Student self-reports of emotion

Remediate Student Emotion
– Teacher-based
– Peer-based
– Game-based

Page 57: Detecting and Responding to Student Emotion

•ACTIVATING EMOTION EXPERIMENTS

•DEACTIVATING EMOTION EXPERIMENTS

•Frustration, anxiety

•Boredom

•Unexcitement

•Characters

•Student Progress Page

Page 58: Detecting and Responding to Student Emotion

• NSF Cyberlearning DIP Collaborative Research: Impact of Adaptive Interventions on Student Affect, Performance and Learning (2.5 more years)

•ACTIVATING EMOTION EXPERIMENTS

•DEACTIVATING EMOTION EXPERIMENTS

•Frustration, anxiety

•Boredom

•Unexcitement

•Characters

•Student Progress Page

Page 59: Detecting and Responding to Student Emotion

The MathSpring System

Page 60: Detecting and Responding to Student Emotion

The Student Progress Page


Dovan Rai, PhD, WPI

Page 61: Detecting and Responding to Student Emotion

•Pekrun, R., Elliot, A. J. & Maier, M. A. (2009)

In Ed Psych, well accepted that motivation/emotions influence achievement, but…

Page 62: Detecting and Responding to Student Emotion

Three Experimental Conditions

• No access to the Student Progress Page

• “My Progress” button was present (student choice)

• Prompted invitation to see “My Progress” when the student is bored (disinterested) or unexcited

• Student forced to see “My Progress” when the student is bored (disinterested) or unexcited

Page 63: Detecting and Responding to Student Emotion

Initial Results about receiving SPP

• SPP (Student Progress Page) access indeed increased across conditions
– No button: M = 1.3
– Button present: M = 3.1
– SPP offered upon low affect: M = 6.0
– SPP forced upon low affect: M = 8.8

• Confirmed that there were no differences between conditions in terms of baseline interest and excitement as measured by the pre-affect survey (ns).

Page 64: Detecting and Responding to Student Emotion

Gathering Student Affect “During”

• Our current gold standard/truth: self-reports
• During experiment:
– Ask students about their “interest” level and their “excitement” level every 7 minutes, on average.

[Rating scale: Low; Neutral / Middle; High]

Page 65: Detecting and Responding to Student Emotion

Resulting Models

Interest: Pearson's R = 0.464, Kappa = 0.281

Interest Model =
  0.425 * INTERESTED PRE
- 0.253 * numMistakes
+ 0.140 * FRUSTRATION PRE
- 0.219 * ERRORS_INCOMP
+ 0.367 * SHINT Last3probs
+ 0.140 * EXCITED PRE
+ 0.535 * isSolved
+ 0.146 * LEARN_ORIENT PRE
+ 0.128 * AVOIDANCE PRE
+ 0.131 * OVERHELP PRE
- 0.088

Excitement: Pearson's R = 0.431, Kappa = 0.18

Excitement Model =
  0.376 * LIKE MATH PRE
+ 0.260 * INTERESTED PRE
- 0.284 * ERRORS_INCOMP PRE
+ 0.017 * AverageLogTimeToSolve
+ 0.140 * LEARN_ORIENT
- 4.065 * GIVEUP SOFLast3
+ 0.162 * LEARNOWN PRE
- 0.197 * LEARNGOAL PRE
+ 0.150 * FRUSTRATED PRE
+ 0.149 * EXCITED PRE
+ 0.303

Bold = features about student interaction with tutor

Page 66: Detecting and Responding to Student Emotion

Predictions of Interest/Excitement

[Table: example sequences of predicted labels per problem. Interest column values: DISINT, NEUTRAL, INTER. Excitement column values: UNEXC, NEUTRAL, EXC.]

Use models to classify all student-problem interactions as low/neutral/high interest and low/neutral/high excitement
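As a rough illustration of this classification step, the sketch below applies the Interest model's coefficients from the "Resulting Models" slide to one student-problem interaction and maps the score to a label. The coefficients and intercept are the ones shown on that slide; the underscore-joined feature names, the feature scaling, and the `low`/`high` cut points are assumptions, since the slides do not give them.

```python
# Coefficients copied from the Interest model on the "Resulting Models" slide.
# Spaces in the slide's feature names are replaced with underscores here.
INTEREST_COEFFICIENTS = {
    "INTERESTED_PRE": 0.425,
    "numMistakes": -0.253,
    "FRUSTRATION_PRE": 0.140,
    "ERRORS_INCOMP": -0.219,
    "SHINT_Last3probs": 0.367,
    "EXCITED_PRE": 0.140,
    "isSolved": 0.535,
    "LEARN_ORIENT_PRE": 0.146,
    "AVOIDANCE_PRE": 0.128,
    "OVERHELP_PRE": 0.131,
}
INTEREST_INTERCEPT = -0.088


def predict_interest(features):
    """Weighted sum of the features plus the intercept.
    Features missing from the dict are treated as 0 (an assumption)."""
    return INTEREST_INTERCEPT + sum(
        weight * features.get(name, 0.0)
        for name, weight in INTEREST_COEFFICIENTS.items()
    )


def classify_interest(score, low=2.5, high=3.5):
    """Map a score to a label. The cut points are placeholders, not from the slides."""
    if score < low:
        return "DISINT"
    if score > high:
        return "INTER"
    return "NEUTRAL"


# Hypothetical interaction: survey items on a 1-5 scale, problem solved with one mistake.
example = {"INTERESTED_PRE": 4, "EXCITED_PRE": 3, "LEARN_ORIENT_PRE": 4,
           "numMistakes": 1, "isSolved": 1}
label = classify_interest(predict_interest(example))
```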

Page 67: Detecting and Responding to Student Emotion

How to analyze changes in student affect, from moment to moment?

MARKOV CHAIN MODELS

Page 68: Detecting and Responding to Student Emotion

Student Interest

Ten thousand data points, N~230.

[Figure panels: SPP Absent; SPP Present; SPP Prompted; SPP Forced]

Page 69: Detecting and Responding to Student Emotion

Student Excitement

Ten thousand data points, N~230.

[Figure panels: SPP Absent; SPP Present; SPP Prompted; SPP Forced]

Page 70: Detecting and Responding to Student Emotion

How to compare Markov Chain Models, quantitatively?

• Probability of following a specific path
• What is the probability that a student will end up excited, after 3 transitions?
• Given that they started in a specific state?
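Both quantities above follow from a first-order Markov chain's transition matrix: a path's probability is the product of its transition probabilities, and the probability of reaching a state after k transitions is an entry of the matrix raised to the k-th power. The Python/NumPy sketch below estimates a transition matrix from label sequences and computes both. The three state names, the uniform fallback for unobserved rows, and the example sequences are illustrative assumptions, not the study's data.

```python
import numpy as np

STATES = ["low", "neutral", "high"]
INDEX = {s: i for i, s in enumerate(STATES)}


def estimate_transitions(sequences):
    """Row-normalized counts of consecutive state pairs across all sequences."""
    counts = np.zeros((len(STATES), len(STATES)))
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[INDEX[current], INDEX[nxt]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observations fall back to a uniform distribution (an assumption).
    uniform = np.full_like(counts, 1.0 / len(STATES))
    return np.divide(counts, row_sums, out=uniform, where=row_sums > 0)


def path_probability(P, path):
    """Probability of following one specific path, given its starting state."""
    return float(np.prod([P[INDEX[a], INDEX[b]] for a, b in zip(path, path[1:])]))


def prob_after(P, start, end, steps):
    """Probability of being in `end` after `steps` transitions, starting from `start`."""
    return float(np.linalg.matrix_power(P, steps)[INDEX[start], INDEX[end]])


# Hypothetical per-student sequences of predicted excitement labels.
sequences = [
    ["neutral", "neutral", "high", "high"],
    ["low", "neutral", "high"],
    ["neutral", "low", "low", "neutral"],
]
P = estimate_transitions(sequences)
p_path = path_probability(P, ["low", "neutral", "high"])
p_excited = prob_after(P, "neutral", "high", steps=3)
```

Estimating one matrix per experimental condition and comparing, say, `prob_after(P, "neutral", "high", 3)` across them is one way to make the comparison quantitative.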

Page 71: Detecting and Responding to Student Emotion

Conclusions

• We have refined a methodology to analyze how specific interventions produce changes in [affective] states
– Randomized controlled trials, model creation/application, Markov chain models, path analysis.

• Evidence that having the SPP present instead of absent can lead towards interest and excitement
– Not shooting for an ideal policy, just a policy that works to some extent, able to compete against others

Page 72: Detecting and Responding to Student Emotion

AGENDA

• Detection of Student Emotion
– Sensors
– Mathematical models
– Student self-reports of emotion

• Remediation of Student Emotion
– Teacher-based
– Peer-based
– Game-based

Page 73: Detecting and Responding to Student Emotion

AGENDA

• Detection of Student Emotion
– Sensors
– Mathematical models
– Student self-reports of emotion

• Remediation of Student Emotion
– Teacher-based
– Peer-based
– Game-based

Page 74: Detecting and Responding to Student Emotion

Remaining Questions

• What if students started unexcited/bored instead of neutral?

• Can we compute the probability of “improving” their affective state, regardless of where they started?

Page 75: Detecting and Responding to Student Emotion

Future Work

• Look in more detail into:
– Internal vs. external tag
– Relationship between difficulty and affect

• Possible new affective constructions (e.g., persistence)

• Create better predictive models for specific reasons

• Use NLP to label the responses rather than have human coders do it.

Page 76: Detecting and Responding to Student Emotion

Unresolved Questions

• No clear evidence that encouraging the SPP at moments of low affect is better than simply having the button available

• Choice in the Button-Present condition might give students a sense of agency, which in turn might make them “feel good”, engaged, etc.

• It might be that intervening based only on the last report of affect is not good enough. Intervene after episodes of boredom/lack of excitement?

• Maybe the SPP at moments of boredom/lack of excitement is not that effective at repairing those states.

• Maybe our models are not accurate enough, and that affects both the affective paths and the results.

Page 77: Detecting and Responding to Student Emotion

The Research Team!

• Ivon Arroyo
• Naomi Wixon
• Danielle Allessio
• Kasia Muldner
• Winslow Burleson
• Beverly Woolf

Page 78: Detecting and Responding to Student Emotion

Detecting and Responding to Student Emotion within an Online Tutor

Thank You.

Any Questions?

This research was funded by the National Science Foundation, #1324385, Cyberlearning DIP, Impact of Adaptive Interventions on Student Affect, Performance, and Learning; Burleson, Arroyo and Woolf (PIs). Any opinions, findings, conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.