verifiable, linked open knowledge that anyone can edit

Post on 23-Jan-2018

Verifiable, Linked Open Knowledge That Anyone can Edit

Dario Taraborelli, @readermeter

A short history of Wikipedia

A website that anyone can edit

The largest reference work on the internet

A multi-language online encyclopedia

Wikipedia: unintended outcomes

accelerate the dissemination of scholarship

support open scientific research

enable distributed fact-checking and curation of scientific knowledge

Outline

1. Wikipedia as the front matter to all research

2. A new kind of open knowledge

3. Wikidata: Collaboratively curated linked open data

4. WikiCite: Building the sum of all human citations

5. Applications

6. Concluding remarks

Wikipedia as the front matter to all research

“Wikipedia is not the bottom layer of authority, nor the top, but in fact the highest layer without formal vetting. In this unique role, it serves as an ideal bridge between the validated and unvalidated Web.”

Casper Grathwohl, Chronicle of Higher Education

http://chronicle.com/article/article-content/125899/

Top sources of DOI resolutions

http://crosstech.crossref.org/2014/02/many-metrics-such-data-wow.html http://blog.crossref.org/2016/05/https-and-wikipedia.html

The world’s most accessed online medical resource?

Heilman and West (2015) doi.org/10.2196/jmir.4069

Most visited resource on Ebola in West Africa

Heilman (2016) http://tinyurl.com/jfuyduv

Most used internet site in Liberia, Sierra Leone and Guinea for Ebola during 2014 outbreak

Greater than CNN, CDC and WHO

A new kind of open knowledge

The backbone of the linked open data ecosystem

Schmachtenberg et al. (2014) http://lod-cloud.net [CC BY SA]

Challenges

Biases / errors

Coverage

Diversity and inclusiveness

Verifiability

Machine-readable linked open data

Editable by anyone

Supporting human + algorithmic curation

Comprehensive

Transparently verifiable

Wikidata: Collaboratively curated linked open data

Wikidata

Free knowledge base that anyone can edit

Launched in 2012

Integrated with Wikipedia and other sister projects

Statistics (Aug 2016)

Nearly 20M items

Over 100M statements

Wikidata: Growth

Active editors, English Wikipedia vs. Wikidata: http://reportcard.wmflabs.org/graphs/active_editors

Very active editors, English Wikipedia vs. Wikidata: http://reportcard.wmflabs.org/graphs/very_active_editors

Wikidata’s anatomy

https://www.wikidata.org/wiki/Wikidata:Introduction

Linked data, San Francisco, Jeblad https://commons.wikimedia.org/wiki/File:Linked_Data_-_San_Francisco.svg [CC BY SA]

SPARQL: http://tinyurl.com/zelqrwp

Paintings by Gustav Klimt

Wikidata query examples

SPARQL: https://t.co/cDR4Lt7V6P

Birth place of people employed by MIT

SPARQL: http://tinyurl.com/h5x5q4q

Children of Genghis Khan
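
The shortened links above hide the queries themselves, so here is a minimal sketch of how a query such as "Children of Genghis Khan" might be assembled for the public Wikidata Query Service. The helper names are hypothetical, and the identifiers Q720 (Genghis Khan) and P40 (child) are assumptions to verify on wikidata.org. The sketch only builds the query text and the request URL. It does not contact the endpoint.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_value_query(item_id: str, property_id: str, limit: int = 100) -> str:
    """Return a SPARQL query listing the values of one property on one item."""
    return (
        "SELECT ?value ?valueLabel WHERE {\n"
        f"  wd:{item_id} wdt:{property_id} ?value .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(query: str) -> str:
    """Encode the query as a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


# Assumed identifiers (verify on wikidata.org): Q720 = Genghis Khan, P40 = child.
children_query = build_value_query("Q720", "P40")
children_url = build_request_url(children_query)
```

Fetching `children_url` with any HTTP client should return the matching items as JSON. Swapping in a different item and property gives the other single-hop lookups.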

Expert curation of scientific open data

Benjamin Good (2016) Opportunities and challenges presented by Wikidata in the context of biocuration http://tinyurl.com/hk9qrmz

Expert curation of scientific open data

Gene Wiki: Wikidata SPARQL examples https://bitbucket.org/sulab/wikidatasparqlexamples/overview

Get a list of all diseases treated by Metformin

Get all the gene ontology evidence codes used in Wikidata

Get all known drug-drug interactions for Methadone via its ChEMBL ID

WikiCite: Building the sum of all human citations

Randall Munroe, Wikipedian protester http://tinyurl.com/p3rodlb [CC BY]

the disappearance of provenance

http://bit.ly/SumOfAllCitations

http://wapo.st/1Y5Smm6

Linking is a small act of generosity that sends people away from your site to some other that you think shows the world in a way worth considering. [...]

[Sources] that are not generous with linking [...] are a stopping point in the ecology of information. That’s the operational definition of authority: The last place you visit when you’re looking for an answer. If you are satisfied with the answer, you stop your pursuit of it. Take the links out and you think you look like more of an authority.

D. Weinberger (2012) Linking is a public good http://www.hyperorg.com/blogger/2012/02/26/2b2k-linking-is-a-public-good/

a provenance-preserving answer engine

The sum of all human knowledge

+

The sum of all data and sources backing human knowledge

Benjamin Good (2016) Opportunities and challenges presented by Wikidata in the context of biocuration http://tinyurl.com/hk9qrmz

The molecular origins of insulin go at least as far back as the simplest unicellular [[eukaryotes]].<ref name='LeRoith'>{{cite journal | vauthors = LeRoith D, Shiloach J, Heffron R, Rubinovitz C, Tanenbaum R, Roth J | title = Insulin-related material in microbes: similarities and differences from mammalian insulins | journal = Can. J. Biochem. Cell Biol. | volume = 63 | issue = 8 | pages = 839–49 | year = 1985 | pmid = 3933801 | doi = 10.1139/o85-106 }}</ref> Apart from animals, insulin-like proteins are also known to exist in Fungi and Protista kingdoms.

References in Wikipedia
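
To illustrate what turning such a reference into structured data could involve, the sketch below pulls the name/value pairs out of the `{{cite journal}}` template quoted above. The function name is hypothetical, and the parser makes a simplifying assumption: field values contain no nested templates or pipe characters. A production tool would need a real wikitext parser.

```python
import re


def parse_cite_template(wikitext: str) -> dict:
    """Extract name/value pairs from the first {{cite journal ...}} template.

    Simplifying assumption: values contain no nested templates or pipes.
    """
    match = re.search(
        r"\{\{\s*cite journal\s*\|(.*?)\}\}", wikitext, re.DOTALL | re.IGNORECASE
    )
    if match is None:
        return {}
    fields = {}
    for part in match.group(1).split("|"):
        name, sep, value = part.partition("=")
        if sep:  # skip fragments that are not "name = value" pairs
            fields[name.strip()] = value.strip()
    return fields


# The reference from the insulin example, copied from the slide.
snippet = (
    "<ref name='LeRoith'>{{cite journal | vauthors = LeRoith D, Shiloach J, "
    "Heffron R, Rubinovitz C, Tanenbaum R, Roth J | title = Insulin-related "
    "material in microbes: similarities and differences from mammalian insulins "
    "| journal = Can. J. Biochem. Cell Biol. | volume = 63 | issue = 8 "
    "| pages = 839–49 | year = 1985 | pmid = 3933801 | doi = 10.1139/o85-106 }}</ref>"
)
citation = parse_cite_template(snippet)
```

Here `citation["doi"]` and `citation["pmid"]` hold the identifiers that could be used to match the reference against an existing bibliographic item.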

WikiCite: goals

Lay the foundations for building a repository of all Wikimedia citations and source metadata as structured data

Design data models and technology to improve the coverage, quality, standards-compliance and machine-readability of citations and source metadata in Wikimedia projects

https://meta.wikimedia.org/wiki/WikiCite_2016

Wikidata as the solution

Vision

Technology

Community

Scale

Licensing

Independence

https://tools.wmflabs.org/sqid/#/view?id=P2860

The “cites” property (P2860) is now used in 350,000+ statements

The Zika corpus

Open citation graph layer

Bibliographic metadata layer

Expert annotation layer

Encyclopedic layer

Applications

Co-author graphs for individual researchers SPARQL: http://tinyurl.com/zml3jox

Most cited authors in the research corpus on Zika SPARQL: http://tinyurl.com/jb8da68
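
As a rough illustration of the second application, the sketch below ranks authors by citations received, using an in-memory stand-in for the citation graph. All paper and author names are hypothetical toy data, not taken from the Zika corpus, and the function is an assumption about how such a ranking could be computed, not the query behind the link.

```python
from collections import Counter

# Hypothetical toy data, not taken from the Zika corpus.
authors_of = {
    "paper_a": ["Author X", "Author Y"],
    "paper_b": ["Author Y"],
    "paper_c": ["Author Z"],
}
# Each pair is (citing work, cited work), mirroring one "cites" statement.
cites = [
    ("paper_b", "paper_a"),
    ("paper_c", "paper_a"),
    ("paper_c", "paper_b"),
]


def most_cited_authors(cites, authors_of):
    """Count, for every author, how many citations their works receive."""
    counts = Counter()
    for _citing, cited in cites:
        for author in authors_of.get(cited, []):
            counts[author] += 1
    return counts.most_common()


ranking = most_cited_authors(cites, authors_of)
```

Authors whose works are never cited do not appear in `ranking`.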

Semi-automated recommendation of missing statements or sources for unsourced statements

https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References

Tools for crowdsourcing entity matching / disambiguation

http://www.generalist.org.uk/blog/2014/wikidata-identifiers-and-the-odnb-where-next/ http://www.generalist.org.uk/blog/2014/wikidata-and-identifiers-part-2-the-matching-process/

all statements citing a New York Times article

the most popular scholarly journals used as citations for statements in any item that is a subclass of economics

all statements citing the works of Joseph Stiglitz

all statements citing journal articles by physicists from Oxford University

all statements citing a journal article that was retracted

all statements citing a source that cites a journal article that was retracted

New opportunities for linked open knowledge curation and discovery

https://meta.wikimedia.org/wiki/WikiCite_2016/Report/Group_5
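
The last two questions in the list above amount to a one-hop and a two-hop traversal of the citation graph. The sketch below shows that logic on an in-memory stand-in. Every identifier is invented for illustration, and the data layout is an assumption, not the Wikidata data model.

```python
# Hypothetical toy data: identifiers below are invented for illustration.
retracted = {"article_r"}
# source -> works it cites
cites = {
    "article_a": {"article_r"},
    "article_b": {"article_c"},
    "article_c": set(),
}
# statement -> sources given as its references
statement_sources = {
    "statement_1": {"article_r"},
    "statement_2": {"article_a"},
    "statement_3": {"article_b"},
}


def statements_citing_retracted(statement_sources, cites, retracted):
    """Split statements into those citing a retracted work directly and
    those citing a source that itself cites a retracted work."""
    direct, indirect = set(), set()
    for statement, sources in statement_sources.items():
        if sources & retracted:
            direct.add(statement)
        elif any(cites.get(source, set()) & retracted for source in sources):
            indirect.add(statement)
    return direct, indirect


direct, indirect = statements_citing_retracted(statement_sources, cites, retracted)
```

A statement that cites a retracted work directly is reported only in `direct`, so the two sets never overlap.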

Concluding remarks

Liberate public domain bibliographic and citation data

Support new forms of open curation and distributed fact-checking

Accelerate open scientific research

Verifiable, Linked Open Knowledge That Anyone can Edit

Thank you

Acknowledgments

Daniel Mietchen, Jonathan Dugan, Lydia Pintscher, Cameron Neylon, James Hare, James Heilman, Magnus Manske, the Gene Wiki team (especially Andra Waagmeester and Benjamin Good), the University of Chicago Knowledge Lab, all WikiCite 2016 participants and Wikidata Source Metadata project contributors.

Additional image credits

Printing press, M. Wirth https://thenounproject.com/term/printing/11880/ [CC BY]

Cocitation network for openfMRI papers, F. Å. Nielsen https://twitter.com/fnielsen/status/752860630932156416

dario@wikimedia.org • @readermeter • @Wikidata • @WikiCite • @WikiResearch
