How communities curate knowledge & how ontologists can help - EURECOM, 2015-01-19
Knowledge
• Statements believed to be true
• Arrived at through interpreting evidence
"Body of truths or facts accumulated in the course of time" – dictionary.com
Knowledge Representation
• Formalisms for sharing knowledge
• Words, numerals, data structures, …
• In my case, ontologies & Web standards
My answers
• How do communities curate knowledge?
– A community has mechanisms for accumulating and persuading each other of "facts".
• How can information technology help?
– Use knowledge representation systems. Structure evidence the community uses to persuade each other.
Which knowledge should be included in Wikipedia?
Jodi Schneider, Krystian Samp, Alexandre Passant, and Stefan Decker. “Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups.” In CSCW 2013.
Jodi Schneider and Krystian Samp. “Alternative Interfaces for Deletion Discussions in Wikipedia: Some Proposals Using Decision Factors. [Demo]” In WikiSym 2012.
Jodi Schneider, Alexandre Passant, and Stefan Decker. “Deletion Discussions in Wikipedia: Decision Factors and Outcomes.” In WikiSym 2012.
Problem: Newcomers are confused about Wikipedia's standards
o "Emsworth Cricket Club is one of the oldest cricket clubs in the world, and this really is worth a mention. Especially on a website, where pointless people … gets (sic) a mention."
o "Why just because it is a small team and not major does it not deserve it’s (sic) own page on here?"
Use community criteria to summarize discussions
[Figure: pipeline. Original discussion + ontology → semantic enrichment → semantically enriched RDFa → querying → queryable user interface with bar chart]
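The pipeline steps named above (semantic enrichment, querying, a bar-chart interface) can be sketched roughly as follows. This is a minimal illustration, not the system described in the talk: plain Python tuples stand in for RDF triples, and the predicate name `usesFactor`, the comment identifiers, and the factor assignments are all hypothetical.

```python
from collections import Counter

# Hypothetical stand-in for semantically enriched comments: each tuple is a
# (subject, predicate, object) triple, loosely mimicking RDF.
TRIPLES = [
    ("comment1", "usesFactor", "Notability"),
    ("comment2", "usesFactor", "Sources"),
    ("comment3", "usesFactor", "Notability"),
    ("comment4", "usesFactor", "Maintenance"),
    ("comment5", "usesFactor", "Bias"),
    ("comment6", "usesFactor", "Notability"),
]


def query(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def factor_counts(triples):
    """Count how many comments invoke each decision factor."""
    return Counter(o for (_, _, o) in query(triples, predicate="usesFactor"))


def bar_chart(counts):
    """Render counts as text bars, most frequent factor first."""
    lines = []
    for factor, n in counts.most_common():
        lines.append(f"{factor:<12} {'#' * n} ({n})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(bar_chart(factor_counts(TRIPLES)))
```

Grouping comments by factor is what lets a reader see at a glance which criteria dominate a discussion; a real deployment would query the enriched markup rather than an in-memory list.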
Determine the ontology
o Content analysis of a corpus
o Compare two different annotation approaches
o Iterative annotation
• Multiple annotators
• Refine to get good inter-annotator agreement
• 4 rounds of annotation
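As an illustration of how agreement between two annotators might be checked in each round, here is a minimal sketch using Cohen's kappa. The slides do not say which agreement statistic was used, so the choice of kappa is an assumption, and the label lists below are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels for six comments (not taken from the corpus).
annotator_1 = ["Notability", "Sources", "Notability", "Bias", "Maintenance", "Notability"]
annotator_2 = ["Notability", "Sources", "Sources", "Bias", "Maintenance", "Notability"]

if __name__ == "__main__":
    print(round(cohens_kappa(annotator_1, annotator_2), 3))
```

A low value after one round would suggest refining the category definitions before annotating again.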
About the corpus
o 72 discussions started on 1 day.
Each discussion has
• 3–33 messages
• 2–15 participants
o In total, 741 messages contributed by 244 users.
Each message has
• 3–350+ words
o 98 printed A4 sheets
2 types of annotation
o 1. Walton’s Argumentation Schemes (Walton, Reed, and Macagno 2008)
• Informal argumentation (philosophical & computational argumentation)
• Identify & prevent errors in reasoning (fallacies)
• 60 patterns
o 2. Factors Analysis (Ashley 1991)
• Case-based reasoning
• E.g. factors for deciding cases in trade secret law, favoring either party (the plaintiff or the defendant).
For the ontology, we chose decision factors
o 1. Walton’s Argumentation Schemes (Walton, Reed, and Macagno 2008)
• Most appropriate for writing support
• 15 categories + 2 non-argumentative categories
• Detailed analysis of content
o 2. Decision Factors (drawing on Ashley 1991)
• Close to the community rules & policies
• 4 categories + 1 catchall
• Good domain coverage
Factor | Example (used to justify 'keep')
Notability | Anyone covered by another encyclopedic reference is considered notable enough for inclusion in Wikipedia.
Sources | Basic information about this album at a minimum is certainly verifiable, it's a major label release, and a highly notable band.
Maintenance | …this article is savable but at its current state, needs a lot of improvement.
Bias | It is by no means spam (it does not promote the products).
Other | I'm advocating a blanket "hangon" for all articles on newly-drafted players
Jodi Schneider, Alexandre Passant & Stefan Decker
Deletion Discussions in Wikipedia: Decision Factors and Outcomes
4 Decision Factors + "Other"
Decision factors articulate values/criteria
o 4 Factors in Deletion Discussions cover
• 91% of comments
• 70% of discussions
o We argue that the best way to avoid deletion is for readers to understand these criteria.
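The two coverage figures above are measured at different levels (comments versus whole discussions). A minimal sketch of how such figures could be computed is given below. The annotations are hypothetical, and the rule that a discussion counts as covered only when every comment uses one of the four factors is an assumption; the slides do not define it.

```python
FOUR_FACTORS = {"Notability", "Sources", "Maintenance", "Bias"}

# Hypothetical annotations: discussion id -> factor label of each comment.
DISCUSSIONS = {
    "d1": ["Notability", "Sources", "Notability"],
    "d2": ["Bias", "Other"],
    "d3": ["Maintenance", "Notability"],
}


def comment_coverage(discussions):
    """Fraction of all comments labelled with one of the four factors."""
    labels = [label for comments in discussions.values() for label in comments]
    return sum(label in FOUR_FACTORS for label in labels) / len(labels)


def discussion_coverage(discussions):
    """Fraction of discussions whose comments all use the four factors.

    Assumption: this is only one possible reading of "covers a discussion".
    """
    covered = sum(
        all(label in FOUR_FACTORS for label in comments)
        for comments in discussions.values()
    )
    return covered / len(discussions)


if __name__ == "__main__":
    print(f"comments: {comment_coverage(DISCUSSIONS):.0%}")
    print(f"discussions: {discussion_coverage(DISCUSSIONS):.0%}")
```

Under this reading, a single "Other" comment is enough to leave a discussion uncovered, which would explain why the discussion-level figure is lower than the comment-level one.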
PU* - Perceived usefulness
PE* - Perceived ease of use
DC - Decision completeness
PF - Perceived effort
IC* - Information completeness
Statistical Significance
PU* p < .001
PE* p .001
IC* p .039
Results: 84% prefer our system
“Information is structured and I can quickly get an overview of the key arguments.”
“The ability to navigate the comments made it a bit easier to filter my mind set and to come to a conclusion.”
“It offers the structure needed to consider each factor separately, thus making the decision easier. Also, the number of comments per factor offers a quick indication of the relevance and the deepness of the decision.”
16/19, based on a 20-participant user test; 1 participant did not take the final survey.
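The headline percentage follows from the counts on this slide: 20 participants, 1 of whom skipped the final survey, leaving 19 respondents, 16 of whom preferred the system. A quick check of that arithmetic (the function name is just illustrative):

```python
def preference_share(preferring, respondents):
    """Share of survey respondents who preferred the system."""
    return preferring / respondents


participants = 20
skipped_survey = 1
respondents = participants - skipped_survey  # 19 completed the final survey

if __name__ == "__main__":
    print(f"{preference_share(16, respondents):.0%}")  # prints 84%
```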
Summary
o How do communities curate knowledge?
• By discussing and applying community standards.
• In Wikipedia, 4 questions are used to evaluate borderline articles:
o Notability – Is the topic appropriate for our encyclopedia?
o Sources – Is the article well-sourced?
o Maintenance – Can we maintain this article?
o Bias – Is the article neutral? POV appropriately weighted?
o How can information technology help?
• Organize evidence based on the criteria communities use.
• In Wikipedia, we developed an alternate interface for deletion discussions.
Summary: Our process
o Get to know a community and its needs
(Ethnography)
o Develop a model for their process
(Annotation to inform ontology development)
o Build a computer support system
(Web standards: RDF/OWL, SPARQL)
o Test & refine the system
“Rule” Argumentation Scheme
“Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups”
CSCW 2013
“Evidence” Argumentation Scheme
“Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups”
CSCW 2013