Using Artificial Intelligence to Support Peer Review of Writing
![Page 1: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/1.jpg)
Using Artificial Intelligence to Support Peer Review of Writing
Diane Litman
Department of Computer Science, Intelligent Systems Program, &
Learning Research and Development Center
![Page 2: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/2.jpg)
Context
Speech and Language Processing for Education
Learning Language (reading, writing, speaking)
Using Language (to teach everything else)
Tutors
Scoring
Readability
Processing Language
Tutorial Dialogue Systems / Peers
CSCL
Discourse Coding
Lecture Retrieval
Questioning & Answering
Peer Review
![Page 3: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/3.jpg)
Outline
• SWoRD
• Improving Review Quality
• Identifying Helpful Reviews
• Summary and Current Directions
![Page 4: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/4.jpg)
SWoRD [Cho & Schunn, 2007]
• Authors submit papers
• Reviewers submit (anonymous) feedback
• Authors revise and resubmit papers
• Authors provide back-ratings to reviewers regarding feedback helpfulness
![Page 5: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/5.jpg)
Some Weaknesses
1. Feedback is often not stated in effective ways
2. Feedback and papers often do not focus on core aspects
![Page 6: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/6.jpg)
Our Approach: Detect and Scaffold
1. Detect and direct reviewer attention to key feedback features such as solutions
2. Detect and direct reviewer and author attention to thesis statements in papers and feedback
Improving Learning from Peer Review with NLP and ITS Techniques (with Ashley, Schunn), LRDC internal grant
![Page 7: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/7.jpg)
Feedback Features and Positive Writing Performance [Nelson & Schunn, 2008]
Solutions
Summarization
Localization
Understanding of the Problem
Implementation
![Page 8: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/8.jpg)
![Page 9: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/9.jpg)
I. Detecting Key Feedback Features
Natural Language Processing (NLP) to extract attributes from text, e.g.
– Regular expressions (e.g. “the section about”)
– Domain lexicons (e.g. “federal”, “American”)
– Syntax (e.g. demonstrative determiners)
– Overlapping lexical windows (quotation identification)
Machine Learning (ML) to predict whether feedback contains localization and solutions, and whether papers contain a thesis statement
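
The four attribute types listed above could be extracted roughly as follows. This is a minimal sketch, not the system described in the talk: the pattern, the two-word lexicon, and the function names are hypothetical, demonstrative determiners are approximated by a plain word list rather than a syntactic parse, and the "overlapping window" is taken to mean the longest run of consecutive tokens shared by the review and the paper.

```python
import re

# Hypothetical resources, for illustration only (not the talk's actual pattern or lexicon).
LOCATION_PATTERN = re.compile(
    r"\bthe (section|paragraph|part|page) (about|on|where)\b", re.IGNORECASE
)
DOMAIN_LEXICON = {"federal", "american"}
# Approximation: a word list instead of a parser, so "that" as a conjunction is also counted.
DEMONSTRATIVES = {"this", "that", "these", "those"}


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for a real tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def longest_shared_window(review_tokens, paper_tokens, max_len=5):
    """Length of the longest token run (capped at max_len) that occurs in both texts."""
    paper_ngrams = {
        tuple(paper_tokens[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(paper_tokens) - n + 1)
    }
    best = 0
    for n in range(1, max_len + 1):
        for i in range(len(review_tokens) - n + 1):
            if tuple(review_tokens[i:i + n]) in paper_ngrams:
                best = n
                break
    return best


def localization_features(review, paper):
    """Return one feature dictionary for a single piece of feedback."""
    review_tokens = tokenize(review)
    paper_tokens = tokenize(paper)
    return {
        "regex_hit": bool(LOCATION_PATTERN.search(review)),
        "domain_words": sum(t in DOMAIN_LEXICON for t in review_tokens),
        "demonstratives": sum(t in DEMONSTRATIVES for t in review_tokens),
        "shared_window": longest_shared_window(review_tokens, paper_tokens),
    }


if __name__ == "__main__":
    review = "In the section about federal courts, this claim needs a source."
    paper = "The federal courts were slow to act on civil rights."
    print(localization_features(review, paper))
```

Feature dictionaries of this kind would then be the input to whichever classifier is trained to predict localization or solutions.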
![Page 10: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/10.jpg)
Learned Localization Model [Xiong, Litman & Schunn, 2010]
![Page 11: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/11.jpg)
Quantitative Model Evaluation
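| Feedback Feature | Classroom Corpus | N | Baseline Accuracy | Model Accuracy | Model Kappa | Human Kappa |
|---|---|---|---|---|---|---|
| Localization | History | 875 | 53% | 78% | .55 | .69 |
| Localization | Psychology | 3111 | 75% | 85% | .58 | .63 |
| Solution | History | 1405 | 61% | 79% | .55 | .79 |
| Solution | CogSci | 5831 | 67% | 85% | .65 | .86 |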
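
For readers unfamiliar with the kappa columns, the sketch below shows how chance-corrected agreement between two label sequences can be computed. It assumes Cohen's kappa, which the slide does not state explicitly, and the function name and toy labels are illustrative only.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both labeled independently with their own label rates.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    model = [1, 1, 0, 0]  # toy labels, e.g. 1 = feedback is localized
    human = [1, 0, 0, 0]
    print(cohen_kappa(model, human))
```

Model kappa would compare model predictions against the gold labels, while human kappa would compare two human annotators on the same items.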
![Page 12: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/12.jpg)
II. Predicting Feedback Helpfulness
Can expert helpfulness ratings be predicted from text? [Xiong & Litman, 2011a]
Impact of predicting student versus expert helpfulness ratings [Xiong & Litman, 2011b]
![Page 13: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/13.jpg)
Results: Predicting Expert Ratings (average of writing and domain experts)
• Techniques used in ranking product review helpfulness can be effectively adapted to peer reviews (R = .6)
– Structural attributes (e.g. review length, number of questions)
– Lexical statistics
– Meta-data (e.g. paper ratings)
– However, the relative utility of such features varies
• Peer-review features improve performance (R = .7)
– Theory-motivated (e.g. localization)
– Abstraction (e.g. lexical categories) better for small corpora
![Page 14: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/14.jpg)
Changing the meaning of “helpfulness”
Helpfulness may be perceived differently by different types of people
Average of two experts (prior experiment)
Writing expert
Content expert
Student peers
![Page 15: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/15.jpg)
Content versus Writing Experts

Review raising an argumentation issue (writing-expert rating = 2, content-expert rating = 5):
Your over all arguements were organized in some order but was unclear due to the lack of thesis in the paper. Inside each arguement, there was no order to the ideas presented, they went back and forth between ideas. There was good support to the arguements but yet some of it didnt not fit your arguement.

Review raising a transition issue (writing-expert rating = 5, content-expert rating = 2):
First off, it seems that you have difficulty writing transitions between paragraphs. It seems that you end your paragraphs with the main idea of each paragraph. That being said, … (omit 173 words) As a final comment, try to continually move your paper, that is, have in your mind a logical flow with every paragraph having a purpose.
![Page 16: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/16.jpg)
Results: Other Helpfulness Ratings
• Generic features are more predictive for student ratings
– Lexical features: transition cues, negation, suggestion words
– Meta features: paper rating
• Theory-supported features are more useful for experts
– Both experts: solution
– Writing expert: praise
– Content expert: critiques, localization
![Page 17: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/17.jpg)
Summary
• Artificial Intelligence (NLP and ML) can be used to automatically detect desirable feedback features
– localization, solution
– feedback and reviewer levels
• Techniques used to predict product review helpfulness can be effectively adapted to peer review
– Knowledge of peer reviews increases performance
– Helpfulness type influences feature utility
![Page 18: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/18.jpg)
Current and Future Work
• Extrinsic evaluation in SWoRD
– Intelligent Scaffolding for Peer Reviews of Writing (with Ashley, Godley, Schunn), IES (recommended for funding)
• Extend to reviews of argument diagrams
– Teaching Writing and Argumentation with AI-Supported Diagramming and Peer Review (with Ashley, Schunn), NSF
• Teacher dashboard
– Keeping Instructors Well-informed in Computer-Supported Peer Review (with Ashley, Schunn, Wang), LRDC internal grant
![Page 19: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/19.jpg)
Thank you!
Questions?
![Page 20: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/20.jpg)
Peer versus Product Reviews
• Helpfulness is directly rated on a scale (rather than a function of binary votes)
• Peer reviews frequently refer to the related papers
• Helpfulness has a writing-specific semantics
• Classroom corpora are typically small
![Page 21: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/21.jpg)
Generic Linguistic Features
| Type | Label | Features (#) |
|---|---|---|
| Structural | STR | revLength, sentNum, question%, exclamationNum |
| Lexical | UGR, BGR | tf-idf statistics of review unigrams (# = 2992) and bigrams (# = 23209) |
| Syntactic | SYN | Noun%, Verb%, Adj/Adv%, 1stPVerb%, openClass% |
| Semantic (adapted) | TOP | counts of topic words (# = 288) |
| Semantic (adapted) | posW, negW | counts of positive (# = 1319) and negative (# = 1752) sentiment words |
| Meta-data (adapted) | META | paperRating, paperRatingDiff |
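
As an illustration of the structural (STR) row, the sketch below computes the four named features from a review string. The slide gives only the feature names, so the definitions here are assumptions: revLength as a word count, sentNum as a count of punctuation-delimited sentences, question% as the percentage of sentences ending in a question mark, and exclamationNum as the number of exclamation marks.

```python
import re


def structural_features(review):
    """Compute assumed versions of revLength, sentNum, question% and exclamationNum."""
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", review.strip()) if s]
    words = re.findall(r"\w+(?:'\w+)?", review)
    questions = sum(s.endswith("?") for s in sentences)
    return {
        "revLength": len(words),
        "sentNum": len(sentences),
        "question%": 100.0 * questions / len(sentences) if sentences else 0.0,
        "exclamationNum": review.count("!"),
    }


if __name__ == "__main__":
    print(structural_features("Good start! Where is your thesis? Add one."))
```

A production system would likely use a proper sentence segmenter; the regular expression here mishandles abbreviations such as "e.g.".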
![Page 22: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/22.jpg)
Specialized Features

| Type | Label | Features (#) |
|---|---|---|
| Cognitive Science | cogS | praise%, summary%, criticism%, plocalization%, solution% |
| Lexical Categories | LEX2 | Counts of 10 categories of words |
| Localization | LOC | Features developed for identifying problem localization |
![Page 23: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/23.jpg)
Lexical Categories
Extracted from:
1. Coding Manuals
2. Decision trees trained with Bag-of-Words

| Tag | Meaning | Word list |
|---|---|---|
| SUG | suggestion | should, must, might, could, need, needs, maybe, try, revision, want |
| LOC | location | page, paragraph, sentence |
| ERR | problem | error, mistakes, typo, problem, difficulties, conclusion |
| IDE | idea verb | consider, mention |
| LNK | transition | however, but |
| NEG | negative | fail, hard, difficult, bad, short, little, bit, poor, few, unclear, only, more |
| POS | positive | great, good, well, clearly, easily, effective, effectively, helpful, very |
| SUM | summarization | main, overall, also, how, job |
| NOT | negation | not, doesn't, don't |
| SOL | solution | revision, specify, correction |
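
The LEX2 feature set (counts of the 10 word categories) can be sketched directly from the word lists above. The dictionary below copies those lists; the tokenizer and the function name are assumptions made for illustration, and a word that appears in two lists (such as "revision") is counted in both categories.

```python
import re

# Word lists copied from the Lexical Categories table.
LEXICAL_CATEGORIES = {
    "SUG": {"should", "must", "might", "could", "need", "needs", "maybe", "try",
            "revision", "want"},
    "LOC": {"page", "paragraph", "sentence"},
    "ERR": {"error", "mistakes", "typo", "problem", "difficulties", "conclusion"},
    "IDE": {"consider", "mention"},
    "LNK": {"however", "but"},
    "NEG": {"fail", "hard", "difficult", "bad", "short", "little", "bit", "poor",
            "few", "unclear", "only", "more"},
    "POS": {"great", "good", "well", "clearly", "easily", "effective",
            "effectively", "helpful", "very"},
    "SUM": {"main", "overall", "also", "how", "job"},
    "NOT": {"not", "doesn't", "don't"},
    "SOL": {"revision", "specify", "correction"},
}


def lexical_category_counts(review):
    """Count how many review tokens fall into each lexical category."""
    # Assumed tokenizer: lowercase words, keeping straight apostrophes inside words.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", review.lower())
    return {
        tag: sum(token in words for token in tokens)
        for tag, words in LEXICAL_CATEGORIES.items()
    }


if __name__ == "__main__":
    print(lexical_category_counts(
        "You should specify the page, but the conclusion is not bad."
    ))
```

Replacing thousands of unigram features with ten category counts is one way to realize the abstraction that the talk reports as helpful for small corpora.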
![Page 24: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/24.jpg)
Discussion
• Effectiveness of generic features across domains
• Same best generic feature combination (STR+UGR+MET)
• But…
![Page 25: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/25.jpg)
Results: Specialized Features
• Introducing high level features does enhance the model’s performance. Best model: Spearman correlation of 0.671 and Pearson correlation of 0.665.

| Feature Type | Pearson r | Spearman rs |
|---|---|---|
| cogS | 0.425 ± 0.094 | 0.461 ± 0.072 |
| LEX2 | 0.512 ± 0.013 | 0.495 ± 0.102 |
| LOC | 0.446 ± 0.133 | 0.472 ± 0.113 |
| STR+MET+UGR (Baseline) | 0.615 ± 0.101 | 0.609 ± 0.098 |
| STR+MET+LEX2 | 0.621 ± 0.096 | 0.611 ± 0.088 |
| STR+MET+LEX2+TOP | 0.648 ± 0.097 | 0.655 ± 0.081 |
| STR+MET+LEX2+TOP+cogS | 0.660 ± 0.093 | 0.655 ± 0.081 |
| STR+MET+LEX2+TOP+cogS+LOC | 0.665 ± 0.089 | 0.671 ± 0.076 |
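
The two evaluation measures reported above differ in what they reward: Pearson correlation measures linear agreement between predicted and gold helpfulness scores, while Spearman correlation measures agreement of their rankings. A minimal standard-library sketch follows; the function names and toy data are illustrative and are not taken from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    predicted = [1, 2, 3, 4, 5]  # toy model scores
    gold = [1, 4, 9, 16, 25]     # toy ratings, monotone but non-linear in the scores
    print(pearson(predicted, gold), spearman(predicted, gold))
```

In the toy data the ranking is reproduced perfectly even though the relationship is not linear, so the Spearman value is higher than the Pearson value.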
![Page 26: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/26.jpg)
Students versus Experts

Praise review (student rating = 7, expert-average rating = 2):
The author also has great logic in this paper. How can we consider the United States a great democracy when everyone is not treated equal. All of the main points were indeed supported in this piece.

Critique review (student rating = 3, expert-average rating = 5):
I thought there were some good opportunities to provide further data to strengthen your argument. For example the statement “These methods of intimidation, and the lack of military force offered by the government to stop the KKK, led to the rescinding of African American democracy.” Maybe here include data about how … (omit 126 words)
![Page 27: Using Artificial Intelligence to Support Peer Review of Writing](https://reader035.vdocument.in/reader035/viewer/2022062802/568145fc550346895db30705/html5/thumbnails/27.jpg)
Sample Result: All Features
• Feature selection of all features
• Students are more influenced by meta features, demonstrative determiners, number of sentences, and negation words
• Experts are more influenced by review length and critiques
• Content expert values solutions, domain words, problem localization
• Writing expert values praise and summary