verifiable, linked open knowledge that anyone can edit

Verifiable, Linked Open Knowledge That Anyone can Edit Dario Taraborelli @readermeter

Upload: dario-taraborelli

Post on 23-Jan-2018

TRANSCRIPT

Page 1: Verifiable, linked open knowledge that anyone can edit

Verifiable, Linked Open Knowledge That Anyone can Edit

Dario Taraborelli @readermeter

Page 2: Verifiable, linked open knowledge that anyone can edit

A short history of Wikipedia

A website that anyone can edit

The largest reference work on the internet

A multi-language online encyclopedia

Page 5: Verifiable, linked open knowledge that anyone can edit

Wikipedia: unintended outcomes

accelerate the dissemination of scholarship

support open scientific research

enable distributed fact-checking and curation of scientific knowledge

Page 8: Verifiable, linked open knowledge that anyone can edit

Outline

1. Wikipedia as the front matter to all research

2. A new kind of open knowledge

3. Wikidata: Collaboratively curated linked open data

4. WikiCite: Building the sum of all human citations

5. Applications

6. Concluding remarks

Page 9: Verifiable, linked open knowledge that anyone can edit

Wikipedia as the front matter to all research

Page 10: Verifiable, linked open knowledge that anyone can edit

“Wikipedia is not the bottom layer of authority, nor the top, but in fact the highest layer without formal vetting. In this unique role, it serves as an ideal bridge between the validated and unvalidated Web.”

Casper Grathwohl, Chronicle of Higher Education

http://chronicle.com/article/article-content/125899/

Page 11: Verifiable, linked open knowledge that anyone can edit

Top sources of DOI resolutions

http://crosstech.crossref.org/2014/02/many-metrics-such-data-wow.html

http://blog.crossref.org/2016/05/https-and-wikipedia.html

Page 12: Verifiable, linked open knowledge that anyone can edit

The world’s most accessed online medical resource?

Heilman and West (2015) doi.org/10.2196/jmir.4069

Page 13: Verifiable, linked open knowledge that anyone can edit

Most visited resource on Ebola in West Africa

Heilman (2016) http://tinyurl.com/jfuyduv

Most used internet site in Liberia, Sierra Leone and Guinea for Ebola during the 2014 outbreak

Greater than CNN, CDC and WHO

Page 14: Verifiable, linked open knowledge that anyone can edit

A new kind of open knowledge

Page 15: Verifiable, linked open knowledge that anyone can edit

The backbone of the linked open data ecosystem

Schmachtenberg et al. (2014) http://lod-cloud.net [CC BY SA]

Page 16: Verifiable, linked open knowledge that anyone can edit

Challenges

Biases / errors

Coverage

Diversity and inclusiveness

Verifiability

Page 17: Verifiable, linked open knowledge that anyone can edit

Machine-readable linked open data

Editable by anyone

Supporting human + algorithmic curation

Comprehensive

Transparently verifiable

Page 20: Verifiable, linked open knowledge that anyone can edit

Wikidata: Collaboratively curated linked open data

Page 21: Verifiable, linked open knowledge that anyone can edit

Wikidata

Free knowledge base that anyone can edit

Launched in 2012

Integrated with Wikipedia and other sister projects

Statistics (Aug 2016)

Nearly 20M items

Over 100M statements

Page 22: Verifiable, linked open knowledge that anyone can edit

Wikidata: Growth

http://reportcard.wmflabs.org/graphs/active_editors

English Wikipedia

Wikidata

Page 23: Verifiable, linked open knowledge that anyone can edit

Wikidata: Growth

http://reportcard.wmflabs.org/graphs/very_active_editors

English Wikipedia

Wikidata

Page 24: Verifiable, linked open knowledge that anyone can edit

Wikidata’s anatomy

https://www.wikidata.org/wiki/Wikidata:Introduction

Page 25: Verifiable, linked open knowledge that anyone can edit

Wikidata’s anatomy

Linked data, San Francisco, Jeblad https://commons.wikimedia.org/wiki/File:Linked_Data_-_San_Francisco.svg [CC BY SA]

Page 26: Verifiable, linked open knowledge that anyone can edit

SPARQL: http://tinyurl.com/zelqrwp

Paintings by Gustav Klimt

Wikidata query examples

Page 27: Verifiable, linked open knowledge that anyone can edit

SPARQL: https://t.co/cDR4Lt7V6P

Birth place of people employed by MIT

Page 28: Verifiable, linked open knowledge that anyone can edit

SPARQL: http://tinyurl.com/h5x5q4q

Children of Genghis Khan
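
The queries behind the short links on these slides are not reproduced in the transcript. As a rough sketch of what a query of this kind can look like, the Python below builds a SPARQL query for the children of an item and the URL that would send it to the public Wikidata Query Service. The helper names are hypothetical, and the identifiers (Q720 for Genghis Khan, P40 for "child") are assumptions that should be checked on wikidata.org before use; this is not necessarily the query the slide links to.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def children_query(parent_qid: str) -> str:
    """Return a SPARQL query listing the children (P40) of one item."""
    # Reject anything that does not look like a Wikidata item id (Q + digits).
    if not (parent_qid.startswith("Q") and parent_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {parent_qid!r}")
    return (
        "SELECT ?child ?childLabel WHERE {\n"
        f"  wd:{parent_qid} wdt:P40 ?child .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}"
    )


def request_url(query: str) -> str:
    """Build a GET URL asking the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


# Q720 is assumed to be the item for Genghis Khan.
query = children_query("Q720")
url = request_url(query)
print(query)
print(url)
```

The sketch only constructs the request; fetching the URL with any HTTP client would return the matching items as JSON.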

Page 29: Verifiable, linked open knowledge that anyone can edit

Expert curation of scientific open data

Benjamin Good (2016) Opportunities and challenges presented by Wikidata in the context of biocuration http://tinyurl.com/hk9qrmz

Page 30: Verifiable, linked open knowledge that anyone can edit

Expert curation of scientific open data

Gene Wiki: Wikidata SPARQL examples https://bitbucket.org/sulab/wikidatasparqlexamples/overview

Get a list of all diseases treated by Metformin

Get all the gene ontology evidence codes used in Wikidata

Get all known drug-drug interactions for Methadone via its CHEMBL id

Page 31: Verifiable, linked open knowledge that anyone can edit

WikiCite: Building the sum of all human citations

Randall Munroe, Wikipedian protester http://tinyurl.com/p3rodlb [CC BY]

Page 32: Verifiable, linked open knowledge that anyone can edit

the disappearance of provenance

http://bit.ly/SumOfAllCitations

Page 34: Verifiable, linked open knowledge that anyone can edit

the disappearance of provenance

http://wapo.st/1Y5Smm6

Page 36: Verifiable, linked open knowledge that anyone can edit

Linking is a small act of generosity that sends people away from your site to some other that you think shows the world in a way worth considering. [...]

[Sources] that are not generous with linking [...] are a stopping point in the ecology of information. That’s the operational definition of authority: The last place you visit when you’re looking for an answer. If you are satisfied with the answer, you stop your pursuit of it. Take the links out and you think you look like more of an authority.

D. Weinberger (2012) Linking is a public good http://www.hyperorg.com/blogger/2012/02/26/2b2k-linking-is-a-public-good/

Page 38: Verifiable, linked open knowledge that anyone can edit

a provenance-preserving answer engine

Page 39: Verifiable, linked open knowledge that anyone can edit

a provenance-preserving answer engine

The sum of all human knowledge

The sum of all data and sources backing human knowledge

+

Page 41: Verifiable, linked open knowledge that anyone can edit

Benjamin Good (2016) Opportunities and challenges presented by Wikidata in the context of biocuration http://tinyurl.com/hk9qrmz

Page 43: Verifiable, linked open knowledge that anyone can edit

The molecular origins of insulin go at least as far back as the simplest unicellular [[eukaryotes]].<ref name='LeRoith'>{{cite journal | vauthors = LeRoith D, Shiloach J, Heffron R, Rubinovitz C, Tanenbaum R, Roth J | title = Insulin-related material in microbes: similarities and differences from mammalian insulins | journal = Can. J. Biochem. Cell Biol. | volume = 63 | issue = 8 | pages = 839–49 | year = 1985 | pmid = 3933801 | doi = 10.1139/o85-106 }}</ref> Apart from animals, insulin-like proteins are also known to exist in Fungi and Protista kingdoms.

References in Wikipedia
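
The citation above lives inside free-form wikitext, which is part of why it is hard to reuse as data. As a minimal sketch of turning such a reference into structured fields, the Python below pulls the `{{cite journal}}` parameters out of the wikitext shown on this slide. The function name and regular expression are illustrative assumptions; the approach ignores nested templates and parameter values that themselves contain a pipe, so it is not a general wikitext parser.

```python
import re

# The reference markup from the slide, copied as-is.
WIKITEXT = (
    "<ref name='LeRoith'>{{cite journal | vauthors = LeRoith D, Shiloach J, "
    "Heffron R, Rubinovitz C, Tanenbaum R, Roth J | title = Insulin-related "
    "material in microbes: similarities and differences from mammalian insulins "
    "| journal = Can. J. Biochem. Cell Biol. | volume = 63 | issue = 8 "
    "| pages = 839–49 | year = 1985 | pmid = 3933801 "
    "| doi = 10.1139/o85-106 }}</ref>"
)

# Matches one {{cite journal | ... }} template up to the first closing braces.
CITE_RE = re.compile(r"\{\{\s*cite journal\s*\|(.*?)\}\}", re.IGNORECASE | re.DOTALL)


def parse_cite_journal(wikitext):
    """Return one dict of parameter name -> value per cite journal template."""
    citations = []
    for match in CITE_RE.finditer(wikitext):
        fields = {}
        for part in match.group(1).split("|"):
            name, sep, value = part.partition("=")
            if sep:  # skip fragments that are not name = value pairs
                fields[name.strip()] = value.strip()
        citations.append(fields)
    return citations


citations = parse_cite_journal(WIKITEXT)
print(citations[0]["doi"], citations[0]["pmid"])
```

Identifiers such as the DOI and PMID are the natural keys for matching a parsed reference against an existing bibliographic record.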

Page 45: Verifiable, linked open knowledge that anyone can edit

WikiCite: goals

Lay the foundations for building a repository of all Wikimedia citations and source metadata as structured data

Design data models and technology to improve the coverage, quality, standards-compliance and machine-readability of citations and source metadata in Wikimedia projects

https://meta.wikimedia.org/wiki/WikiCite_2016

Page 46: Verifiable, linked open knowledge that anyone can edit

Wikidata as the solution

Vision

Technology

Community

Scale

Licensing

Independence

Page 48: Verifiable, linked open knowledge that anyone can edit

https://tools.wmflabs.org/sqid/#/view?id=P2860

"cites" property (P2860) now used in 350,000+ statements
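
To make concrete what a "cites" statement contributes, here is a small sketch that models statements as (subject, property, value) triples and counts how often each work is cited through P2860. The item ids are placeholders invented for illustration, not real Wikidata items, and the in-memory list stands in for the knowledge base.

```python
from collections import Counter

CITES = "P2860"   # the "cites" property named on the slide
AUTHOR = "P50"    # assumed to be the "author" property; shown only for contrast

# (subject, property, value) triples; all ids below are made-up placeholders.
statements = [
    ("paper-a", CITES, "paper-b"),
    ("paper-a", CITES, "paper-c"),
    ("paper-d", CITES, "paper-b"),
    ("paper-d", AUTHOR, "author-x"),
]


def citation_counts(triples):
    """Count inbound 'cites' statements per cited work."""
    return Counter(value for _, prop, value in triples if prop == CITES)


counts = citation_counts(statements)
most_cited = counts.most_common(1)[0]
print(most_cited)
```

Because every citation is an ordinary statement between two items, a citation graph falls out of the data model without a separate index.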

Page 50: Verifiable, linked open knowledge that anyone can edit

The Zika corpus

Open citation graph layer

Bibliographic metadata layer

Expert annotation layer

Encyclopedic layer

Page 55: Verifiable, linked open knowledge that anyone can edit

Applications

Page 56: Verifiable, linked open knowledge that anyone can edit

Co-author graphs for individual researchers

SPARQL: http://tinyurl.com/zml3jox

Page 57: Verifiable, linked open knowledge that anyone can edit

Most cited authors in the research corpus on Zika

SPARQL: http://tinyurl.com/jb8da68

Page 58: Verifiable, linked open knowledge that anyone can edit

Semi-automated recommendation of missing statements or sources for unsourced statements

https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool

https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References

Page 59: Verifiable, linked open knowledge that anyone can edit

Tools for crowdsourcing entity matching / disambiguation

http://www.generalist.org.uk/blog/2014/wikidata-identifiers-and-the-odnb-where-next/

http://www.generalist.org.uk/blog/2014/wikidata-and-identifiers-part-2-the-matching-process/

Page 60: Verifiable, linked open knowledge that anyone can edit

all statements citing a New York Times article

the most popular scholarly journals used as citations for statements in any item that is a subclass of economics

all statements citing the works of Joseph Stiglitz

all statements citing journal articles by physicists from Oxford University

all statements citing a journal article that was retracted

all statements citing a source that cites a journal article that was retracted

New opportunities for linked open knowledge curation and discovery

https://meta.wikimedia.org/wiki/WikiCite_2016/Report/Group_5
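
The last two queries on this slide differ by one hop in the citation graph. A toy sketch of that difference, assuming statements, their sources, and citation links are available as plain dictionaries (all identifiers and function names below are invented for illustration):

```python
# Which sources back each statement (made-up identifiers).
statement_sources = {
    "statement-1": ["article-A"],
    "statement-2": ["article-B"],
    "statement-3": ["article-C"],
}

# Which works each source cites.
cites = {
    "article-A": ["article-D"],
    "article-B": [],
    "article-C": ["article-B"],
}

# Works known to have been retracted.
retracted = {"article-B"}


def citing_retracted(statement_sources, retracted):
    """Statements whose own source is a retracted work."""
    return sorted(
        stmt
        for stmt, sources in statement_sources.items()
        if any(src in retracted for src in sources)
    )


def citing_source_that_cites_retracted(statement_sources, cites, retracted):
    """Statements whose source cites a retracted work (one hop further)."""
    return sorted(
        stmt
        for stmt, sources in statement_sources.items()
        if any(cited in retracted for src in sources for cited in cites.get(src, []))
    )


direct = citing_retracted(statement_sources, retracted)
indirect = citing_source_that_cites_retracted(statement_sources, cites, retracted)
print(direct, indirect)
```

Here `statement-2` is flagged directly, while `statement-3` is flagged only because its source cites the retracted article.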

Page 61: Verifiable, linked open knowledge that anyone can edit

Concluding remarks

Page 62: Verifiable, linked open knowledge that anyone can edit

Liberate public domain bibliographic and citation data

Support new forms of open curation and distributed fact-checking

Accelerate open scientific research

Verifiable, Linked Open Knowledge That Anyone can Edit

Page 64: Verifiable, linked open knowledge that anyone can edit

Thank you

Acknowledgments

Daniel Mietchen, Jonathan Dugan, Lydia Pintscher, Cameron Neylon, James Hare, James Heilman, Magnus Manske, the Gene Wiki team (especially Andra Waagmeester and Benjamin Good), the University of Chicago Knowledge Lab, all WikiCite 2016 participants and Wikidata Source Metadata project contributors.

Additional image credits

Printing press, M. Wirth https://thenounproject.com/term/printing/11880/ [CC BY]

Cocitation network for openfMRI papers, F. Å. Nielsen https://twitter.com/fnielsen/status/752860630932156416

@readermeter • @Wikidata • @WikiCite • @WikiResearch