"stories" in data and the roles of crowdsourcing – views of a web miner
![Page 1: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/1.jpg)
"Stories" in data and the roles of crowdsourcing – views of a Web miner
Bettina Berendt
Dept. of Computer Science
KU Leuven, Belgium
http://people.cs.kuleuven.be/~bettina.berendt/
Thanks to: Ilija Subašić, Markus Luczak-Rösch, and Laura Drăgan
![Page 2: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/2.jpg)
![Page 3: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/3.jpg)
A story
![Page 4: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/4.jpg)
Story structure
![Page 5: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/5.jpg)
One case of provenance
![Page 6: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/6.jpg)
Another case of provenance
![Page 7: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/7.jpg)
Formalizing provenance: a high-level view
![Page 8: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/8.jpg)
Challenge 1: Many voices
![Page 9: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/9.jpg)
Challenge 2
![Page 10: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/10.jpg)
Challenge 3: subjectivity
![Page 11: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/11.jpg)
The STORIES Tool
![Page 12: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/12.jpg)
Uncover (1)
![Page 13: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/13.jpg)
Uncover (2)
![Page 14: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/14.jpg)
Scan (over time)
![Page 15: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/15.jpg)
Uncover
![Page 16: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/16.jpg)
Zoom
![Page 17: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/17.jpg)
Search: formulating ad-hoc concepts
![Page 18: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/18.jpg)
Track (2)
![Page 19: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/19.jpg)
Textual summarization
![Page 20: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/20.jpg)
Challenge 4
![Page 21: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/21.jpg)
Crowd-sourcing the truth? Wikipedia (here: the Gaza Flotilla Raid)
![Page 22: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/22.jpg)
Challenge 5
![Page 23: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/23.jpg)
Challenge 4: More specifically
Challenge 5: vagueness - reprise
![Page 24: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/24.jpg)
The “live crowdsourcing activity”
• Goal: crowdsource data citation metadata
• Motivation 1 / possible extension
• Motivation 2 / case study
![Page 25: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/25.jpg)
![Page 26: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/26.jpg)
http://prov.usewod.org
![Page 27: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/27.jpg)
The data
Datasets
Publications
[People]
![Page 28: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/28.jpg)
The datasets
Preloaded:
– USEWOD datasets
– DBpedia
– SWDF
– Bio2RDF
– LinkedGeoData
– BioPortal
– OpenBioMed
![Page 29: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/29.jpg)
The datasets
Preloaded:
– Generic (!)
– Versions/releases
– References
![Page 30: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/30.jpg)
The datasets
Add new:
– Name*
– Version
– Release date
– URL
![Page 31: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/31.jpg)
The publications
Preloaded:
– USEWOD workshop papers
![Page 32: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/32.jpg)
The publications
Add new:
– Title*
– Authors
– Year
– URL
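The two "Add new" forms can be sketched as simple records. This is a minimal illustration, not the tool's actual code: the class names `DatasetRecord` and `PublicationRecord` are hypothetical, and reading the asterisk as "required field" is an assumption, since the slides do not explain it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetRecord:
    name: str                           # "Name*": assumed to be the only required field
    version: Optional[str] = None
    release_date: Optional[str] = None  # kept as free text; the slides give no date format
    url: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.name.strip():
            raise ValueError("a dataset needs a non-empty name")


@dataclass
class PublicationRecord:
    title: str                          # "Title*": assumed to be the only required field
    authors: List[str] = field(default_factory=list)
    year: Optional[int] = None
    url: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.title.strip():
            raise ValueError("a publication needs a non-empty title")


# Only the starred field has to be filled in.
minimal_dataset = DatasetRecord(name="DBpedia")
minimal_publication = PublicationRecord(title="A hypothetical paper")
```

Keeping every other field optional matches the later lesson of minimising data input, at the cost of sparser records.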
![Page 33: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/33.jpg)
The data
![Page 34: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/34.jpg)
The task
Capture
which dataset is used in which publication
and
how
![Page 35: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/35.jpg)
Data representation
Datasets
Publications
Connections between them
schema.org
prov:Entity
?
![Page 36: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/36.jpg)
Data representation
Datasets
Publications
Connections between them
schema.org
prov:Entity
prov:Derivation
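A minimal sketch of this representation, written as plain Python that emits subject-predicate-object triples. It is an illustration under assumptions, not the tool's data model: the `ex:` identifiers, the class names, the blank-node label `_:d1`, and the choice of `schema:Dataset` and `schema:ScholarlyArticle` as schema.org types are hypothetical. The direction (a publication derived from a dataset) is also an assumption. The PROV-O terms used (`prov:wasDerivedFrom`, `prov:qualifiedDerivation`, `prov:entity`) are the standard way to attach a `prov:Derivation` node to two entities.

```python
from dataclasses import dataclass
from typing import List, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Entity:
    """A dataset or a publication; both are typed as prov:Entity."""
    iri: str          # hypothetical identifier, e.g. "ex:dbpedia"
    schema_type: str  # assumed schema.org type, e.g. "schema:Dataset"
    name: str


@dataclass(frozen=True)
class Derivation:
    """A connection stating that `derived` was derived from `source`."""
    derived: Entity
    source: Entity


def to_triples(d: Derivation, node: str = "_:d1") -> List[Triple]:
    """Describe both entities and link them through a qualified derivation."""
    triples: List[Triple] = []
    for e in (d.derived, d.source):
        triples += [
            (e.iri, "rdf:type", "prov:Entity"),
            (e.iri, "rdf:type", e.schema_type),
            (e.iri, "schema:name", e.name),
        ]
    triples += [
        (d.derived.iri, "prov:wasDerivedFrom", d.source.iri),
        (d.derived.iri, "prov:qualifiedDerivation", node),
        (node, "rdf:type", "prov:Derivation"),
        (node, "prov:entity", d.source.iri),
    ]
    return triples


dbpedia = Entity("ex:dbpedia", "schema:Dataset", "DBpedia")
paper = Entity("ex:paper1", "schema:ScholarlyArticle", "A hypothetical paper")
triples = to_triples(Derivation(derived=paper, source=dbpedia))
for triple in triples:
    print(triple)
```

The qualified form costs three extra triples per connection, but it gives each connection its own node, which is where a more specific connection type could be attached.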
![Page 37: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/37.jpg)
The task
Capture
which dataset is used in which publication
and
how
![Page 38: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/38.jpg)
Connections
Publication – Publication
Publication – Dataset
Dataset – Publication
Dataset – Dataset
![Page 39: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/39.jpg)
Connections
Publication – Publication
citation
![Page 40: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/40.jpg)
Connections
Publication – Dataset
Dataset – Publication
mentions
describes
evaluates
analyses
compares
![Page 41: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/41.jpg)
Connections
Dataset – Dataset
extends
includes
overlaps
transformation of
generalisation of
![Page 42: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/42.jpg)
Data representation
Subclasses of prov:Derivation
(inverse of Publication-DS)
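The connection vocabulary from the preceding slides can be sketched as a lookup that checks whether a connection type fits the kinds of its two endpoints. This is a hedged illustration: the labels are taken from the slides, but the names `Kind`, `ALLOWED` and `is_valid_connection` are hypothetical, and allowing the same five labels in both the Publication – Dataset and the Dataset – Publication direction is an assumption based on the slide listing them together.

```python
from enum import Enum


class Kind(Enum):
    PUBLICATION = "publication"
    DATASET = "dataset"


# Connection labels as they appear on the slides.
PUB_PUB = {"citation"}
PUB_DATASET = {"mentions", "describes", "evaluates", "analyses", "compares"}
DATASET_DATASET = {
    "extends",
    "includes",
    "overlaps",
    "transformation of",
    "generalisation of",
}

# Which labels may connect which kinds of endpoints (source kind, target kind).
ALLOWED = {
    (Kind.PUBLICATION, Kind.PUBLICATION): PUB_PUB,
    (Kind.PUBLICATION, Kind.DATASET): PUB_DATASET,
    (Kind.DATASET, Kind.PUBLICATION): PUB_DATASET,  # assumed: same labels, read as inverses
    (Kind.DATASET, Kind.DATASET): DATASET_DATASET,
}


def is_valid_connection(source: Kind, target: Kind, label: str) -> bool:
    """Return True if `label` is a known connection type for this pair of kinds."""
    return label in ALLOWED.get((source, target), set())


print(is_valid_connection(Kind.PUBLICATION, Kind.DATASET, "analyses"))  # True
print(is_valid_connection(Kind.DATASET, Kind.DATASET, "citation"))      # False
```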
![Page 43: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/43.jpg)
The task
Capture
which dataset is used in which publication
and
how
![Page 44: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/44.jpg)
Data representation
![Page 45: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/45.jpg)
Data representation
![Page 46: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/46.jpg)
Bundles
![Page 47: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/47.jpg)
Live crowdsourcing activity 2014: outcomes
– Participants: 6
– Bundles: 81 (avg: 13.5, min: 2, max: 27)
– Publications: 19
– Datasets: 2 (3)
– Connections: 95 (Inclusion: 62, Analysis: 21, Mention: 6)
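A quick arithmetic check of the reported figures, using only the numbers on the slide (the variable names are hypothetical). The average of 13.5 bundles per participant follows from 81 bundles and 6 participants. The three connection types listed add up to 89, so 6 of the 95 connections have a type the slide does not name.

```python
participants = 6
bundles = 81
connections_total = 95
connections_by_type = {"Inclusion": 62, "Analysis": 21, "Mention": 6}

# Average number of bundles per participant.
avg_bundles = bundles / participants

# Connections covered by the three listed types, and the remainder.
listed = sum(connections_by_type.values())
not_listed = connections_total - listed

print(avg_bundles)  # 13.5
print(listed)       # 89
print(not_listed)   # 6
```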
![Page 48: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/48.jpg)
Lessons learned
Data is dirty
– even coming from experts
Focus on the task
– make everything else simpler
– minimise data input
![Page 49: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/49.jpg)
Questionnaire results
Inconclusive results on the suitability of the vocabulary,
but interesting answers to “What questions would this information answer for you?”:
● “What are popular datasets?”
● “Which datasets are facilitators for research on X?”
● “What publications are related through a dataset (but don't mention each other)?”
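Two of these questions can be sketched as queries over a small citation graph. The data below is invented for illustration: the paper names and the links are hypothetical, and only the dataset names DBpedia and SWDF come from the slides. The function names are likewise assumptions, not part of the tool.

```python
from collections import Counter
from itertools import combinations

# Hypothetical publication-uses-dataset links and publication-cites-publication links.
uses = [
    ("paperA", "DBpedia"),
    ("paperB", "DBpedia"),
    ("paperC", "DBpedia"),
    ("paperC", "SWDF"),
]
cites = {("paperB", "paperA")}


def popular_datasets(uses):
    """'What are popular datasets?': datasets ranked by number of publications using them."""
    return Counter(dataset for _, dataset in uses).most_common()


def related_only_through_dataset(uses, cites):
    """Pairs of publications that share a dataset but do not cite each other."""
    publications_by_dataset = {}
    for publication, dataset in uses:
        publications_by_dataset.setdefault(dataset, set()).add(publication)
    pairs = set()
    for dataset, publications in publications_by_dataset.items():
        for a, b in combinations(sorted(publications), 2):
            if (a, b) not in cites and (b, a) not in cites:
                pairs.add((a, b, dataset))
    return sorted(pairs)


print(popular_datasets(uses))
# [('DBpedia', 3), ('SWDF', 1)]
print(related_only_through_dataset(uses, cites))
# [('paperA', 'paperC', 'DBpedia'), ('paperB', 'paperC', 'DBpedia')]
```

The pair paperA and paperB is excluded because one cites the other; the remaining pairs are connected only through DBpedia.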
![Page 50: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/50.jpg)
Outlook (1): Dimensions of crowdsourcing
• What is outsourced
• Who is the crowd
• How is the task designed
• How are the results validated
• How can the process be optimised
[Quinn & Bederson, 2012]
![Page 51: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/51.jpg)
Enlarging the scope: “How come ...?” Storytelling
Dimensions / Specific questions
• [Who] Which crowd(s)? Experts & non-experts
• [What] Enhanced by IE?
• [design/validation] How to combine these sources of metadata?
• [Optimisation] Incentives?
▫ “Student science”?
▫ Citizen science?
▫ “Learner science”?
![Page 52: "Stories" in data and the roles of crowdsourcing – views of a Web miner](https://reader036.vdocument.in/reader036/viewer/2022062718/56812b5c550346895d8f7c6a/html5/thumbnails/52.jpg)
THANK YOU!
Some references:
• Subašić, I. & Berendt, B. (2009). Discovery of interactive graphs for understanding and searching time-indexed corpora. Knowledge and Information Systems. http://people.cs.kuleuven.be/~bettina.berendt/Papers/subasic_berendt_2009.pdf
• Berendt, Bettina; Last, Mark; Subasic, Ilija; Verbeke, Mathias (2013). New formats and interfaces for multi-document news summarization and its evaluation, In: Fiori, Alessandro (ed.), Innovative Document Summarization Techniques: Revolutionizing Knowledge Understanding. IGI Global. https://lirias.kuleuven.be/bitstream/123456789/423917/1/berendt_last_subasic_verbeke_2013_withbib.pdf
• Dragan, Laura; Luczak-Rösch, Markus; Simperl, Elena; Berendt, Bettina; Moreau, Luc (2014). Crowdsourcing data citation graphs using provenance. In: Provenance Analytics (ProvAnalytics2014), Cologne, DE, 09 Jun 2014. 4 pp. http://eprints.soton.ac.uk/365374/
• ~ Presentation at LCPD 2014: Second workshop on Interlinking and Contextualizing Publications and Datasets, to appear in D-Lib Magazine