![Page 1: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/1.jpg)
Using Machine Learning to Monitor Collaborative Interactions
Carolyn Penstein Rosé
Language Technologies Institute / Human-Computer Interaction Institute
![Page 2: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/2.jpg)
VMT-Basilica (Kumar & Rosé, 2010)
![Page 3: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/3.jpg)
Monitoring Collaboration with Machine Learning Technology
Download tools at:
http://www.cs.cmu.edu/~cprose/TagHelper.html
http://www.cs.cmu.edu/~cprose/SIDE.html
[Diagram: Labeled Texts → TagHelper → A Model that can Label More Texts; Unlabeled Texts → Model → Labeled Texts]
[Plot: Behavior over Time, with a <Triggered Intervention>]
![Page 4: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/4.jpg)
TagHelper Tools and SIDE
TagHelper Tools uses text mining technology to automate annotation of conversational data
SIDE facilitates rapid prototyping of reporting interfaces for group learning facilitators
Define Summaries
Annotate Data
Visualize Annotated Data
http://www.cs.cmu.edu/~cprose/TagHelper.html
http://www.cs.cmu.edu/~cprose/SIDE.html
![Page 5: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/5.jpg)
Important caveat!!
Machine learning isn’t magic
But it can be useful for identifying meaningful patterns in your data when used properly
Proper use requires insight into your data
![Page 6: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/6.jpg)
Naïve Approach: When all you have is a hammer…
Problem: there isn’t one universally best approach!
[Diagram labels: Data, Target Representation]
![Page 8: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/8.jpg)
Slightly less naïve approach: Aimless wandering…
Problem 1: It takes too long!
Problem 2: You might not realize all of the options that are available to you!
[Diagram labels: Data, Target Representation]
![Page 11: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/11.jpg)
Expert Approach: Hypothesis driven
You might end up with the same solution in the end, but you’ll get there faster.
Today we’ll start to learn how!
[Diagram labels: Data, Target Representation]
![Page 14: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/14.jpg)
What is machine learning?
Automatically or semi-automatically:
Inducing concepts (i.e., rules) from data
Finding patterns in data
Explaining data
Making predictions
[Diagram: Data → Learning Algorithm → Model; New Data + Model → Classification Engine → Prediction]
![Page 15: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/15.jpg)
How does machine learning work?
The simplest rule learner will learn to predict whatever is the most frequent result class. This is called the majority class.
What will the rule be in this case?
It will always predict yes.
A slightly more sophisticated rule learner will find the feature that gives the most information about the result class. What do you think that would be in this case?
Outlook:
Sunny -> No
Overcast -> Yes
Rainy -> Yes
Rule format:
<Feature Name>:
<value> -> <prediction>
<value> -> <prediction>
…
![Page 16: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/16.jpg)
What will be the prediction?
Outlook:
Sunny -> No
Overcast -> Yes
Rainy -> Yes
[Diagram: New Data → Model → Yes]
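The two rule learners described above (predict the majority class, or pick the single most informative feature) can be sketched in a few lines of Python. The data table from the slide is not part of this transcript, so this sketch assumes it is the standard 14-row "weather" dataset distributed with Weka. The function names `zero_r` and `one_r` are invented for illustration, and counting training errors is one common way to choose the feature, not necessarily the tool's exact criterion.

```python
from collections import Counter, defaultdict

# ASSUMPTION: the standard Weka "weather" data, used as a stand-in for the slide's table.
# Columns: outlook, temperature, humidity, windy, play (the result class).
FEATURES = ["outlook", "temperature", "humidity", "windy"]
DATA = """
sunny hot high false no
sunny hot high true no
overcast hot high false yes
rainy mild high false yes
rainy cool normal false yes
rainy cool normal true no
overcast cool normal true yes
sunny mild high false no
sunny cool normal false yes
rainy mild normal false yes
sunny mild normal true yes
overcast mild high true yes
overcast hot normal false yes
rainy mild high true no
"""
ROWS = [line.split() for line in DATA.strip().splitlines()]


def zero_r(rows):
    """Ignore every feature and predict the majority class."""
    return Counter(row[-1] for row in rows).most_common(1)[0][0]


def one_r(rows, features):
    """Pick the single feature whose value -> class rule makes the fewest training errors."""
    best = None
    for idx, name in enumerate(features):
        by_value = defaultdict(Counter)
        for row in rows:
            by_value[row[idx]][row[-1]] += 1
        # For each value of the feature, predict the class seen most often with it.
        rule = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
        errors = sum(sum(c.values()) - max(c.values()) for c in by_value.values())
        if best is None or errors < best[2]:
            best = (name, rule, errors)
    return best


majority = zero_r(ROWS)
feature, rule, errors = one_r(ROWS, FEATURES)
print("Majority-class rule always predicts:", majority)
print("Single-feature rule uses", feature, rule, "with", errors, "training errors")
```

On the assumed data, the single-feature learner selects Outlook with the same three-way rule shown on the slide. Humidity ties with Outlook on training errors here, and the sketch simply keeps the first feature it finds.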
![Page 17: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/17.jpg)
More Complex Algorithm…
Two simple algorithms last time
0R: Predict the majority class
1R: Use the most predictive single feature
Today: Intro to Decision Trees
Today we will stay at a high level
We’ll investigate more details of the algorithm next time
* Only makes 2 mistakes!
What will it do with this example?
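To make the step from a single-feature rule to a decision tree concrete, here is a minimal sketch of a tree learner that splits on the feature with the highest information gain (ID3 style). It is not J48 and not the tree pictured on the slide. It again assumes the standard Weka "weather" data as a stand-in for the slide's table, and all function names are hypothetical.

```python
from collections import Counter
from math import log2

# ASSUMPTION: the standard Weka "weather" data, used as a stand-in for the slide's table.
FEATURES = ["outlook", "temperature", "humidity", "windy"]
DATA = """
sunny hot high false no
sunny hot high true no
overcast hot high false yes
rainy mild high false yes
rainy cool normal false yes
rainy cool normal true no
overcast cool normal true yes
sunny mild high false no
sunny cool normal false yes
rainy mild normal false yes
sunny mild normal true yes
overcast mild high true yes
overcast hot normal false yes
rainy mild high true no
"""
ROWS = [tuple(line.split()) for line in DATA.strip().splitlines()]


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain(rows, idx):
    """Reduction in class entropy obtained by splitting the rows on feature idx."""
    remainder = 0.0
    for value in set(row[idx] for row in rows):
        subset = [row[-1] for row in rows if row[idx] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([row[-1] for row in rows]) - remainder


def build_tree(rows, feature_idxs):
    labels = [row[-1] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not feature_idxs:
        return majority  # leaf: a class label
    best = max(feature_idxs, key=lambda i: info_gain(rows, i))
    rest = [i for i in feature_idxs if i != best]
    branches = {}
    for value in sorted(set(row[best] for row in rows)):
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, rest)
    return (best, branches, majority)  # internal node


def classify(tree, row):
    while isinstance(tree, tuple):
        idx, branches, majority = tree
        if row[idx] not in branches:
            return majority  # unseen value: fall back to the node's majority class
        tree = branches[row[idx]]
    return tree


tree = build_tree(ROWS, list(range(len(FEATURES))))
mistakes = sum(classify(tree, row) != row[-1] for row in ROWS)
print("Root feature:", FEATURES[tree[0]], "| training mistakes:", mistakes)
```

Unlike the single-feature rule, the tree can test a second feature inside one branch of the first, which is how it captures contingencies between features. This unpruned sketch fits the assumed training rows exactly; that says nothing about how it would do on new data.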
![Page 23: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/23.jpg)
Why is it better?
Not because it is more complex
Sometimes more complexity makes performance worse
What is different in what the three rule representations assume about your data?
0R
1R
Trees
The best algorithm for your data will give you exactly the power you need
Let’s say you know the rule you are trying to learn is a circle and you have these points. What rule would you learn?
Now let’s say you don’t know the shape. Now what would you learn?
If you know the shape, you have fewer degrees of freedom, so there is less room to make a mistake.
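The circle example can be illustrated with a toy comparison between a learner that knows the shape and one that does not. Everything here is an assumption made for illustration: the points are hand-picked so the contrast is visible, the "known shape" learner is told the circle is centred on the origin and only estimates a radius, and the "unknown shape" learner is a 1-nearest-neighbour rule. On other data the flexible learner could do just as well.

```python
from math import dist, hypot

# ASSUMPTION: hand-picked toy points, not taken from the slides.
# True concept: a point is "in" when it lies inside the unit circle around the origin.
TRAIN = [
    ((0.0, 0.0), "in"),
    ((0.5, 0.0), "in"),
    ((0.9, 0.0), "in"),
    ((0.0, -0.6), "in"),
    ((1.5, 0.0), "out"),
    ((0.0, 1.6), "out"),
    ((-1.1, 0.3), "out"),
]


def fit_circle_radius(train):
    """Known shape: a circle centred on the origin, so only one number is learned."""
    farthest_in = max(hypot(*point) for point, label in train if label == "in")
    nearest_out = min(hypot(*point) for point, label in train if label == "out")
    return (farthest_in + nearest_out) / 2


def predict_circle(radius, point):
    return "in" if hypot(*point) < radius else "out"


def predict_nearest(train, point):
    """Unknown shape: copy the label of the closest training point."""
    return min(train, key=lambda item: dist(item[0], point))[1]


radius = fit_circle_radius(TRAIN)
query = (-0.9, 0.2)  # truly inside the unit circle (distance from origin is about 0.92)
print("Circle learner:", predict_circle(radius, query))
print("Nearest-neighbour learner:", predict_nearest(TRAIN, query))
```

With these points the one-parameter circle learner labels the query as inside, while the nearest-neighbour learner is pulled toward a single nearby "out" point and gets it wrong. The flexible learner has more room to be misled by where the training points happen to fall.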
![Page 33: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/33.jpg)
What do concepts look like?
![Page 34: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/34.jpg)
Clarification: Concepts as Lines
[Figure: labels R, B, S, T, C; six points marked X]
![Page 35: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/35.jpg)
Machine Learning Process Overview
Get to know your data: what distinguishes messages from different categories?
Represent messages in terms of features: use the feature table tab
Build a machine learning model: use the machine learning tab
Learn from mistakes, and try again: use the feature analyzer tab
[Figure labels: Features, Coding]
![Page 36: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/36.jpg)
Machine Learning
![Page 37: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/37.jpg)
Algorithms you will use
Decision Trees (J48): good with small feature sets, can find contingencies between features
Naïve Bayes: fast, makes decisions based on probabilities
Support Vector Machines (SMO): makes decisions based on weights, usually works well on text
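As one illustration of "makes decisions based on probabilities", here is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing. It is not Weka's implementation. The training messages, labels, and function names are invented for this example.

```python
from collections import Counter, defaultdict
from math import log

# ASSUMPTION: toy training messages invented for illustration.
TRAIN = [
    ("what is the answer", "question"),
    ("is it nine", "question"),
    ("the answer is nine", "statement"),
    ("i think it is nine", "statement"),
]


def train_naive_bayes(examples):
    class_counts = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)
    for text, label in examples:
        word_counts[label].update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab


def predict(model, text):
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = log(n_docs / total_docs)  # log prior of the class
        for word in text.split():
            if word in vocab:  # words never seen in training are skipped
                # Add-one (Laplace) smoothing keeps unseen word/class pairs above zero.
                score += log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train_naive_bayes(TRAIN)
print(predict(model, "what is it"))
print(predict(model, "i think the answer is nine"))
```

Each word contributes an independent probability to each class, and the class with the highest combined score wins. That independence assumption is what makes the method fast.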
![Page 38: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/38.jpg)
Setting Up Your Data
![Page 39: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/39.jpg)
How do you know when you have coded enough data?
What distinguishes Questions and Statements?
Not all questions end in a question mark.
Not all WH words occur in questions.
“I” versus “you” is not a reliable predictor.
You need to code enough to avoid learning rules that won’t work.
![Page 40: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/40.jpg)
Basic Idea
Represent text as a vector where each position corresponds to a term
This is called the “bag of words” approach
Vector positions: Cheese, Cows, Eggs, Hens, Lay, Make
“Cows make cheese” → 110001
“Hens lay eggs” → 001110
But “Cheese makes cows.” gets the same representation as “Cows make cheese”!
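The vectors on this slide can be reproduced with a short sketch. The vocabulary order comes from the slide. The `NORMALIZE` lookup is an assumption that stands in for real stemming, so that "makes" is counted as "make", and the function name is hypothetical.

```python
# Vocabulary order taken from the slide: each vector position is one term.
VOCAB = ["cheese", "cows", "eggs", "hens", "lay", "make"]

# ASSUMPTION: a one-entry lookup standing in for real stemming.
NORMALIZE = {"makes": "make"}


def bag_of_words(text):
    """Return a 0/1 string marking which vocabulary terms occur in the text."""
    tokens = [word.strip(".,!?").lower() for word in text.split()]
    present = {NORMALIZE.get(token, token) for token in tokens}
    return "".join("1" if term in present else "0" for term in VOCAB)


print(bag_of_words("Cows make cheese"))    # 110001
print(bag_of_words("Hens lay eggs"))       # 001110
print(bag_of_words("Cheese makes cows."))  # 110001
```

Because the vector only records which terms occur, word order is lost, which is why the first and third sentences collapse into the same representation.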
![Page 41: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/41.jpg)
What can’t you conclude from “bag of words” representations?
Causality: “X caused Y” versus “Y caused X”
Roles and Mood: “Which person ate the food that I prepared this morning and drives the big car in front of my cat” versus “The person, which prepared food that my cat and I ate this morning, drives in front of the big car.”
Who’s driving, who’s eating, and who’s preparing food?
![Page 42: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/42.jpg)
Part of Speech Tagging
1. CC Coordinating conjunction
2. CD Cardinal number
3. DT Determiner
4. EX Existential there
5. FW Foreign word
6. IN Preposition/subordinating conjunction
7. JJ Adjective
8. JJR Adjective, comparative
9. JJS Adjective, superlative
10. LS List item marker
11. MD Modal
12. NN Noun, singular or mass
13. NNS Noun, plural
14. NNP Proper noun, singular
15. NNPS Proper noun, plural
16. PDT Predeterminer
17. POS Possessive ending
18. PRP Personal pronoun
19. PP$ Possessive pronoun
20. RB Adverb
21. RBR Adverb, comparative
22. RBS Adverb, superlative
http://www.ldc.upenn.edu/Catalog/docs/treebank2/cl93.html
![Page 43: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/43.jpg)
Part of Speech Tagging
23. RP Particle
24. SYM Symbol
25. TO to
26. UH Interjection
27. VB Verb, base form
28. VBD Verb, past tense
29. VBG Verb, gerund/present participle
30. VBN Verb, past participle
31. VBP Verb, non-3rd ps. sing. present
32. VBZ Verb, 3rd ps. sing. present
33. WDT wh-determiner
34. WP wh-pronoun
35. WP$ Possessive wh-pronoun
36. WRB wh-adverb
http://www.ldc.upenn.edu/Catalog/docs/treebank2/cl93.html
![Page 44: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/44.jpg)
Feature Space Design
Think like a computer!
Machine learning algorithms look for features that are good predictors, not features that are necessarily meaningful
Look for approximations
If you want to find questions, you don’t need to do a complete syntactic analysis
Look for question marks
Look for wh-terms that occur immediately before an auxiliary verb
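The two question approximations above can be written as cheap surface features. This is only a sketch: the word lists are short, illustrative assumptions rather than complete inventories, and the function name is hypothetical.

```python
import re

# ASSUMPTION: short illustrative word lists, not complete inventories.
WH_TERMS = {"what", "which", "who", "whom", "whose", "when", "where", "why", "how"}
AUXILIARIES = {"is", "are", "was", "were", "do", "does", "did", "can", "could",
               "will", "would", "should", "have", "has", "had"}


def question_features(text):
    """Two cheap approximations of 'this message is a question'."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # True when any wh-term is immediately followed by an auxiliary verb.
    wh_before_aux = any(
        first in WH_TERMS and second in AUXILIARIES
        for first, second in zip(tokens, tokens[1:])
    )
    return {"has_question_mark": "?" in text, "wh_before_aux": wh_before_aux}


print(question_features("what is the answer"))
print(question_features("you think the answer is 9?"))
print(question_features("I know what you mean."))
```

Neither feature is a full syntactic analysis, and each misses some cases on its own. A relative clause such as "the answer which is 9" would still trigger the wh-term feature, which is the kind of mistake a learner has to resolve with other features.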
![Page 45: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/45.jpg)
Feature Space Design
Punctuation can be a “stand in” for mood
“you think the answer is 9?” versus “you think the answer is 9.”
Bigrams capture simple lexical patterns
“common denominator” versus “common multiple”
POS bigrams capture syntactic or stylistic information
“the answer which is …” vs “which is the answer”
Line length can be a proxy for explanation depth
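Three of the features above (final punctuation, word bigrams, and line length) can be extracted with plain string handling, as in the sketch below. POS bigrams are left out because they need a part-of-speech tagger. The feature names and the function name are assumptions made for this example, not TagHelper's own feature labels.

```python
def extract_features(message):
    """Build a small feature dictionary from one chat message."""
    tokens = message.strip().lower().rstrip("?.!").split()
    features = {
        # Final punctuation as a stand-in for mood.
        "ends_with_question_mark": message.rstrip().endswith("?"),
        # Line length as a rough proxy for explanation depth.
        "length_in_words": len(tokens),
    }
    # Word bigrams capture simple lexical patterns such as "common denominator".
    for first, second in zip(tokens, tokens[1:]):
        features["bigram_" + first + "_" + second] = True
    return features


print(extract_features("you think the answer is 9?"))
print(extract_features("find the common denominator"))
```

The two "answer is 9" messages from the slide differ only in the `ends_with_question_mark` feature, which is exactly the information a plain bag of words would throw away.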
![Page 46: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/46.jpg)
Feature Space Design
“Contains non-stop word” can be a predictor of whether a conversational contribution is contentful
“ok sure” versus “the common denominator”
Removing stop words removes some distracting features
Stemming allows some generalization
Multiple, multiply, multiplication
Removing rare features is a cheap form of feature selection
Features that only occur once or twice in the corpus won’t generalize, so they are a waste of time to include in the vector space
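The three ideas on this slide can be sketched as small helper functions. All of the specifics are assumptions made for illustration: the stop list is tiny and includes "ok" and "sure" so the slide's example works, the suffix list is a deliberately crude stand-in for a real stemmer, and the threshold of 3 reflects "once or twice" from the slide.

```python
from collections import Counter

# ASSUMPTION: tiny stop list and suffix list invented for this sketch.
STOP_WORDS = {"ok", "sure", "the", "a", "an", "is", "it", "yes", "no"}
SUFFIXES = ["ication", "y", "e"]  # checked in this order


def contains_non_stop_word(message):
    """True when at least one word is not on the stop list."""
    return any(word not in STOP_WORDS for word in message.lower().split())


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def drop_rare_features(documents, min_count=3):
    """Keep only words that occur in at least min_count documents."""
    counts = Counter(word for doc in documents for word in set(doc.lower().split()))
    return {word for word, n in counts.items() if n >= min_count}


print(contains_non_stop_word("ok sure"), contains_non_stop_word("the common denominator"))
print([crude_stem(w) for w in ["Multiple", "multiply", "multiplication"]])
print(drop_rare_features(["common denominator", "the common multiple", "common factor is 3", "ok"]))
```

A real stemmer would not necessarily map all three "multipl-" words to the same stem, so treat `crude_stem` as a demonstration of the idea rather than a recommendation.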
![Page 47: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/47.jpg)
Error Analysis
![Page 48: Using Machine Learning to Monitor Collaborative Interactions](https://reader035.vdocument.in/reader035/viewer/2022062802/56814523550346895db1e918/html5/thumbnails/48.jpg)
Any Questions?