How Communities Curate Knowledge & How Ontologists Can Help Jodi Schneider EURECOM 2015-01-19

How Communities Curate Knowledge &

How Ontologists Can Help

Jodi Schneider

EURECOM, 2015-01-19

Motivating Questions

• How do communities curate knowledge?

• How can information technology help?

Community

• Group of people

• Shared tasks

• Intercommunication

Knowledge

• Statements believed to be true

• Arrived at through interpreting evidence

"Body of truths or facts accumulated in the course of time" – dictionary.com

Knowledge Representation

• Formalisms for sharing knowledge

• Words, numerals, data structures, …

• In my case, ontologies & Web standards

My answers

• How do communities curate knowledge?

– A community has mechanisms for accumulating and persuading each other of "facts".

• How can information technology help?

– Use knowledge representation systems. Structure evidence the community uses to persuade each other.

Which knowledge should be included in Wikipedia?

Jodi Schneider, Krystian Samp, Alexandre Passant, and Stefan Decker. “Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups.” In CSCW 2013.

Jodi Schneider and Krystian Samp. “Alternative Interfaces for Deletion Discussions in Wikipedia: Some Proposals Using Decision Factors. [Demo]” In WikiSym 2012.

Jodi Schneider, Alexandre Passant, and Stefan Decker. “Deletion Discussions in Wikipedia: Decision Factors and Outcomes.” In WikiSym 2012.

Wikipedia deletes articles

Example Deletion Discussion

Problem: Long, no-consensus discussions

Problem: Newcomers are confused about Wikipedia's standards

o "Emsworth Cricket Club is one of the oldest cricket clubs in the world, and this really is worth a mention. Especially on a website, where pointless people … gets (sic) a mention."

o "Why just because it is a small team and not major does it not deserve it’s (sic) own page on here?"

Use community criteria to summarize discussions

[Figure: pipeline from Original Discussion, via Ontology and Semantic Enrichment, to Semantically Enriched RDFa, and via Querying to a Queryable User Interface with Barchart]

Determine the ontology

o Content analysis of a corpus

o Compare two different annotation approaches

o Iterative annotation

• Multiple annotators

• Refine to get good inter-annotator agreement

• 4 rounds of annotation
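The slides do not say which agreement statistic was used to judge each annotation round. As a hedged illustration only, the sketch below computes Cohen's kappa for two annotators, one common choice for this kind of check. The function name `cohens_kappa` and the toy label lists are assumptions made for the example, not data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labeled at random
    # according to their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for six messages (illustrative, not from the corpus).
annotator_1 = ["Notability", "Sources", "Notability", "Bias", "Maintenance", "Other"]
annotator_2 = ["Notability", "Sources", "Sources", "Bias", "Maintenance", "Other"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

With more than two annotators, one option is to average kappa over annotator pairs; whether the original study did so is not stated here.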

About the corpus

o 72 discussions started on a single day.

Each discussion has

• 3 to 33 messages

• 2 to 15 participants

o In total, 741 messages contributed by 244 users.

Each message has

• 3 to 350+ words

o 98 printed A4 sheets

2 types of annotation

o 1. Walton’s Argumentation Schemes (Walton, Reed, and Macagno 2008)

• Informal argumentation (philosophical & computational argumentation)

• Identify & prevent errors in reasoning (fallacies)

• 60 patterns

o 2. Factors Analysis (Ashley 1991)

• Case-based reasoning

• E.g. factors for deciding cases in trade secret law, favoring either party (the plaintiff or the defendant).

For the ontology, we chose decision factors

o 1. Walton’s Argumentation Schemes (Walton, Reed, and Macagno 2008)

• Most appropriate for writing support

• 15 categories + 2 non-argumentative categories

• Detailed analysis of content

o 2. Decision Factors (drawing on Ashley 1991)

• Close to the community rules & policies

• 4 categories + 1 catchall

• Good domain coverage

Factor: Example (used to justify 'keep')

• Notability: "Anyone covered by another encyclopedic reference is considered notable enough for inclusion in Wikipedia."

• Sources: "Basic information about this album at a minimum is certainly verifiable, it's a major label release, and a highly notable band."

• Maintenance: "…this article is savable but at its current state, needs a lot of improvement."

• Bias: "It is by no means spam (it does not promote the products)."

• Other: "I'm advocating a blanket "hangon" for all articles on newly-drafted players"

Jodi Schneider, Alexandre Passant & Stefan Decker

Deletion Discussions in Wikipedia: Decision Factors and Outcomes

4 Decision Factors + "Other"

Decision factors articulate values/criteria

o 4 Factors in Deletion Discussions cover

• 91% of comments

• 70% of discussions

o We argue that the best way to avoid deletion is for readers to understand these criteria.

Data model uses the decision factors

We add a discussion summary…

…by semantically enriching messages

Our discussion summary…

… gives more detail for each decision factor

On click, open the comments with that decision factor
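The summary view described above can be sketched in a few lines. This is a minimal illustration, not the authors' RDFa/SPARQL implementation: it assumes each comment carries a set of decision-factor tags, counts comments per factor for the bar chart, and filters comments for the on-click view. The class `Comment`, the helper names, and the author names are hypothetical; the comment texts are the examples quoted in the factor table.

```python
from dataclasses import dataclass, field

# The four decision factors plus the catch-all, as named in the slides.
FACTORS = ("Notability", "Sources", "Maintenance", "Bias", "Other")


@dataclass
class Comment:
    author: str  # hypothetical author id
    text: str
    factors: set = field(default_factory=set)  # assumed: tags from annotation


def count_by_factor(comments):
    """Number of comments tagged with each factor (input for a bar chart)."""
    counts = {factor: 0 for factor in FACTORS}
    for comment in comments:
        for factor in comment.factors:
            counts[factor] += 1
    return counts


def comments_for(comments, factor):
    """Comments shown when a reader clicks on one factor's bar."""
    return [c for c in comments if factor in c.factors]


def text_barchart(counts):
    """Plain-text stand-in for the bar chart in the interface."""
    return "\n".join(f"{factor:<12}{'#' * n} {n}" for factor, n in counts.items())


discussion = [
    Comment("user_a", "Anyone covered by another encyclopedic reference is "
            "considered notable enough for inclusion in Wikipedia.", {"Notability"}),
    Comment("user_b", "Basic information about this album at a minimum is "
            "certainly verifiable, it's a major label release, and a highly "
            "notable band.", {"Sources"}),
    Comment("user_c", "It is by no means spam (it does not promote the "
            "products).", {"Bias"}),
]

counts = count_by_factor(discussion)
print(text_barchart(counts))
bias_comments = comments_for(discussion, "Bias")
```

In the system described in the talk, the tags live in RDFa markup and the counts come from queries; the in-memory lists here only stand in for that machinery.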

Test our Experimental System…

Against this Control System.

PU* - Perceived usefulness

PE* - Perceived ease of use

DC - Decision completeness

PF - Perceived effort

IC* - Information completeness

Statistical Significance

PU* p < .001

PE* p .001

IC* p .039

Final survey

Results: 84% prefer our system

“Information is structured and I can quickly get an overview of the key arguments.”

“The ability to navigate the comments made it a bit easier to filter my mind set and to come to a conclusion.”

“It offers the structure needed to consider each factor separately, thus making the decision easier. Also, the number of comments per factor offers a quick indication of the relevance and the deepness of the decision.”

16 of 19 respondents, from a 20-participant user test; 1 participant did not take the final survey.
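The 84% figure follows from the counts given above. As a quick check, with variable names chosen only for this example:

```python
participants = 20                 # people in the user test
respondents = participants - 1    # one did not take the final survey
preferred = 16                    # respondents who preferred the experimental system

share = preferred / respondents
print(f"{preferred}/{respondents} = {share:.1%}")
```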

Summary

o How do communities curate knowledge?

• By discussing and applying community standards.

• In Wikipedia, 4 questions are used to evaluate borderline articles:

o Notability – Is the topic appropriate for our encyclopedia?

o Sources – Is the article well-sourced?

o Maintenance – Can we maintain this article?

o Bias – Is the article neutral? POV appropriately weighted?

o How can information technology help?

• Organize evidence based on the criteria communities use.

• In Wikipedia, we developed an alternate interface for deletion discussions.

Summary: Our process

o Get to know a community and its needs (Ethnography)

o Develop a model for their process (Annotation to inform ontology development)

o Build a computer support system (Web standards: RDF/OWL, SPARQL)

o Test & refine the system

“Rule” Argumentation Scheme

“Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups”

CSCW 2013

“Evidence” Argumentation Scheme

Evidence + Rule -> Conclusion
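One way to picture this pattern is as a small record with three slots. The sketch below is an illustration under stated assumptions, not the scheme definitions from the CSCW 2013 paper: the class name `Argument` and the evidence sentence are hypothetical, while the rule text is the Notability example quoted earlier in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Argument:
    evidence: str    # what the commenter points to
    rule: str        # the community standard being applied
    conclusion: str  # the recommended outcome, e.g. "keep" or "delete"

    def render(self):
        """Lay out the argument as Evidence + Rule -> Conclusion."""
        return (f"Evidence: {self.evidence}\n"
                f"Rule: {self.rule}\n"
                f"Conclusion: {self.conclusion}")


example = Argument(
    # Hypothetical evidence sentence, written for this example.
    evidence="The subject is covered by another encyclopedic reference.",
    rule="Anyone covered by another encyclopedic reference is considered "
         "notable enough for inclusion in Wikipedia.",
    conclusion="keep",
)
print(example.render())
```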
