XIP Dashboard: Visual Analytics from Automated Rhetorical Parsing of Scientific Metadiscourse

Duygu Simsek, Simon Buckingham Shum, Anna De Liddo, Rebecca Ferguson (The Open University, UK)
Ágnes Sándor (Xerox Research Centre Europe, FR)

1st International Workshop on Discourse-Centric Learning Analytics, April 8, 2013, LAK13 Conference, Leuven, Belgium

Upload: simon-buckingham-shum

Post on 06-May-2015


DESCRIPTION

ABSTRACT

A key competency that we seek to build in learners is a critical mind, i.e. the ability to engage with the ideas in the literature, and to identify when significant claims are being made in articles. The ability to decode such moves in texts is essential, as is the ability to make such moves in one’s own writing. Computational techniques for extracting them are becoming available, using Natural Language Processing (NLP) tuned to recognize the rhetorical signals that authors use when making a significant scholarly move. After reviewing related NLP work, we introduce the Xerox Incremental Parser (XIP), note previous work to render its output, and then motivate the design of the XIP Dashboard, a set of visual analytics modules built on XIP output, using the LAK/EDM open dataset as a test corpus. We report preliminary user reactions to a paper prototype of such a novel dashboard, describe the visualizations implemented to date, and present user scenarios for learners, educators and researchers. We conclude with a summary of ongoing design refinements, potential platform integrations, and questions that need to be investigated through end-user evaluations.

TRANSCRIPT

Page 1: XIP Dashboard: Visual Analytics from Automated Rhetorical Parsing of Scientific Metadiscourse

Duygu Simsek, Simon Buckingham Shum, Anna De Liddo, Rebecca Ferguson (The Open University, UK)
Ágnes Sándor (Xerox Research Centre Europe, FR)

1st International Workshop on Discourse-Centric Learning Analytics, April 8, 2013, LAK13 Conference, Leuven, Belgium

Page 2:

Metadiscourse

Xerox Incremental Parser

Visual analytics v0.1: XIP Dashboard

User Scenarios & Evaluation

Page 3:

Metadiscourse signals important moves in educated/scholarly narrative

(When scholarly culture works well) this is what gets your papers accepted by reviewers, and quoted by others

Clear statements regarding the problem, the claim, the argument, the evidence, the implications…

This is what we teach students from school upwards

Page 4:

Rhetorical functions of metadiscourse identified by the Xerox Incremental Parser (XIP)

BACKGROUND KNOWLEDGE:
Recent studies indicate …
… the previously proposed …
… is universally accepted ...

NOVELTY:
... new insights provide direct evidence ...
... we suggest a new ... approach ...
... results define a novel role ...

OPEN QUESTION:
… little is known …
… role … has been elusive
Current data is insufficient …

GENERALIZING:
... emerging as a promising approach
Our understanding ... has grown exponentially ...
... growing recognition of the importance ...

CONTRASTING IDEAS:
… unorthodox view resolves … paradoxes …
In contrast with previous hypotheses ...
... inconsistent with past findings ...

SIGNIFICANCE:
studies ... have provided important advances
Knowledge ... is crucial for ... understanding
valuable information ... from studies

SURPRISE:
We have recently observed ... surprisingly
We have identified ... unusual
The recent discovery ... suggests intriguing roles

SUMMARIZING:
The goal of this study ...
Here, we show ...
Altogether, our results ... indicate
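
To make the category list above concrete, here is a purely illustrative sketch that stores fragments of the slide's example phrases in a lookup table and matches them against a sentence by plain substring search. This is not how XIP works: XIP is a parser, and its rules are not described in these slides. The names `CUE_PHRASES` and `tag_sentence`, and the choice of which fragments to use as cues, are assumptions made for this example only.

```python
# Illustrative only: naive substring matching, NOT XIP's parsing.
# The cue phrases are fragments of the slide's own example phrases;
# which fragments to keep is an assumption made for this sketch.
CUE_PHRASES = {
    "BACKGROUND KNOWLEDGE": ["recent studies indicate", "previously proposed", "universally accepted"],
    "NOVELTY": ["new insights", "we suggest a new", "novel role"],
    "OPEN QUESTION": ["little is known", "has been elusive", "is insufficient"],
    "GENERALIZING": ["promising approach", "grown exponentially", "growing recognition"],
    "CONTRASTING IDEAS": ["in contrast with", "inconsistent with", "unorthodox view"],
    "SIGNIFICANCE": ["important advances", "is crucial for", "valuable information"],
    "SURPRISE": ["surprisingly", "unusual", "intriguing"],
    "SUMMARIZING": ["the goal of this study", "here, we show", "our results"],
}


def tag_sentence(sentence):
    """Return every category with at least one cue phrase in the sentence."""
    lowered = sentence.lower()
    return [
        category
        for category, cues in CUE_PHRASES.items()
        if any(cue in lowered for cue in cues)
    ]


print(tag_sentence("In contrast with previous hypotheses, little is known about this role."))
```

A sentence can carry more than one rhetorical function, which is why the sketch returns a list rather than a single label. Substring matching of this kind would produce many false positives on real text, so it should be read only as a picture of the category scheme.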

Pages 5-8:

Xerox Incremental Parser (XIP)

Sándor, Á. and Vorndran, A. (2010). The detection of salient messages from social science research papers and its application in document search. Workshop on Natural Language Processing Tools Applied to Discourse Analysis in Psychology, Buenos Aires, Argentina, May 10-14, 2010.

Page 9:

Initial evaluation of XIP is promising, but methodologically complex

A striking example – but not all were like this (De Liddo et al., 2012)

Extract from annotation comparison:

             Human analyst             XIP
Document 1   19 sentences annotated    22 sentences annotated    11 sentences same as human annotation
Document 2   71 sentences annotated    59 sentences annotated    42 sentences same as human annotation
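
One hedged way to read the counts on this slide is as precision and recall of XIP against the human analyst. The sketch below assumes the human annotation is treated as the reference and that the first row of counts belongs to Document 1 and the second to Document 2; the slide itself notes that the evaluation is methodologically complex, so these figures are a reading aid rather than a reported result. The function name `agreement` is hypothetical.

```python
def agreement(human, xip, shared):
    """Precision, recall and F1 of XIP's annotations, taking the human's as reference."""
    precision = shared / xip   # share of XIP's sentences the human also marked
    recall = shared / human    # share of the human's sentences XIP also marked
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts as given on the slide: (human, XIP, same as human).
# The mapping of rows to documents is an assumption.
counts = {"Document 1": (19, 22, 11), "Document 2": (71, 59, 42)}

for name, (human, xip, shared) in counts.items():
    p, r, f = agreement(human, xip, shared)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Under these assumptions, half of XIP's 22 annotated sentences in Document 1 match the human's, while in Document 2 the match is 42 of 59.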

Pages 10-11:

Xerox Incremental Parser (XIP)

XIP’s raw output is fine for NLP machines/researchers, but not learner/educator friendly

Page 12:

Xerox Incremental Parser (XIP)

5000 (or even 30) plain text files… we need overviews of XIP analyses from a corpus

Page 13:

Making XIP analytics visible: 1. annotations on the full text using the OU’s Cohere social sensemaking app (Firefox add-on)

Page 14:

Making XIP analytics visible: 2. XIP annotations visualized in Cohere as a network around the document

Page 15:

Making XIP analytics visible (2)

2nd phase analysis of document-concept clouds… Connecting? Merging? Re-tagging? Summarising?

Page 16:

XIP Dashboard: towards an earlier phase dashboard for navigating XIP output

Draw attention to patterns of potential significance to students, educators and experienced researchers alike:

§  the occurrence of domain concepts in different metadiscourse contexts – e.g. effective tutoring dialogue in sentences classified as contrast

§  trends of the above over time, e.g. to show the development of an idea

§  trends within and differences between research communities as reflected in their publications

§  eventually, the above for one’s own writing
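
The first two aims in the list above (a concept's occurrence in different metadiscourse contexts, and its trend over time) amount to counting classified sentences by year and category. A minimal sketch follows, assuming each XIP-classified sentence is available as a `(year, category, text)` tuple; that record layout, the sample sentences, and the name `concept_trend` are all hypothetical and not taken from the dashboard's implementation.

```python
from collections import Counter

# Hypothetical records: (publication year, rhetorical category, sentence text).
# The sentences are invented for illustration, not drawn from the LAK/EDM dataset.
sentences = [
    (2011, "CONTRAST", "In contrast with earlier work, effective tutoring dialogue ..."),
    (2012, "CONTRAST", "Our findings on effective tutoring dialogue are inconsistent with ..."),
    (2012, "SUMMARY", "Here, we show that effective tutoring dialogue ..."),
    (2012, "NOVELTY", "We suggest a new approach to dropout prediction ..."),
]


def concept_trend(records, concept):
    """Count sentences mentioning `concept`, keyed by (year, category)."""
    needle = concept.lower()
    return Counter(
        (year, category)
        for year, category, text in records
        if needle in text.lower()
    )


trend = concept_trend(sentences, "effective tutoring dialogue")
for (year, category), n in sorted(trend.items()):
    print(year, category, n)
```

Keying the counter on `(year, category)` keeps one table that can be sliced either way: by category to compare metadiscourse contexts, or by year to follow an idea over time.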

Page 17:

Paper prototype to elicit initial reactions

Page 18:

Paper prototype to elicit initial reactions

‘Intro movie’ from researcher

Participants point + click with finger

Basic navigation seems fine

Enthusiasm for a tool that could help with literature analysis

Also for a tool to improve one’s own writing by showing trends, or inconsistencies

Page 19:

XIP Dashboard: Temporal trends per corpus

Similar patterns for LAK & EDM literatures

Summary & Contrast categories relatively higher, and rising

(Not controlled for different corpus sizes in these graphs)
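
The caveat about corpus size matters because raw counts can rise simply because more papers were published. A minimal sketch of one possible correction divides each year's category count by the number of sentences in that year's corpus. The numbers below are made up for illustration, and `normalize` is a hypothetical helper, not part of the dashboard.

```python
def normalize(counts_by_year, corpus_size_by_year):
    """Divide each year's category count by that year's corpus size."""
    return {
        year: count / corpus_size_by_year[year]
        for year, count in counts_by_year.items()
        if corpus_size_by_year.get(year)  # skip years with no (or zero) size recorded
    }


# Made-up numbers, for illustration only.
contrast_counts = {2010: 40, 2011: 90, 2012: 120}
sentences_in_corpus = {2010: 1000, 2011: 3000, 2012: 3000}

print(normalize(contrast_counts, sentences_in_corpus))
```

In this invented example the raw counts triple between 2010 and 2012 while the normalized rate stays at about the same level, which is the kind of effect the slide's caveat warns about.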

Page 20:

XIP Dashboard: Comparing corpora filtered by concept

Page 21:

XIP Dashboard: All papers by year and concept, with colour = concept density (v2 mockup)

Page 22:

XIP Dashboard: Rhetorical function of the sentences behind each bubble

Page 23:

XIP Dashboard: Heatmap of all concepts by rhetorical classification (v2 mockup)

Page 24:

XIP Dashboard: User scenarios… Student / Educator / Researcher

Familiarization with the background material in a literature…

Comparing different writing patterns between communities, or students…

Focusing on specific concepts of interest in combination with rhetorical context

Page 25:

XIP Dashboard: User Evaluations

Signal-noise ratio?

Deeper or shallower reading?

New insights, or just faster insights?

Better writing, or just gaming the system?

Page 26:

Summary

Early phases of work: a promising language technology now has visual analytics we can deploy with stakeholders

Beyond number / size / frequency of posts; ‘hottest thread’

An important feature of educated writing is knowing how to signal substantive rhetorical moves. NLP can detect this, and we can now generate rudimentary visual analytics.

To be continued…

http://www.glennsasscer.com/wordpress/wp-content/uploads/2011/10/iceberg.jpg