Verifiable, Linked Open Knowledge That Anyone can Edit
Dario Taraborelli • @readermeter
A short history of Wikipedia
A website that anyone can edit
The largest reference work on the internet
A multi-language online encyclopedia
Wikipedia: unintended outcomes
accelerate the dissemination of scholarship
support open scientific research
enable distributed fact-checking and curation of scientific knowledge
Outline
1. Wikipedia as the front matter to all research
2. A new kind of open knowledge
3. Wikidata: Collaboratively curated linked open data
4. WikiCite: Building the sum of all human citations
5. Applications
6. Concluding remarks
Wikipedia as the front matter to all research
“Wikipedia is not the bottom layer of authority, nor the top, but in fact the highest layer without formal vetting. In this unique role, it serves as an ideal bridge between the validated and unvalidated Web.”
Casper Grathwohl, Chronicle of Higher Education
http://chronicle.com/article/article-content/125899/
Top sources of DOI resolutions
http://crosstech.crossref.org/2014/02/many-metrics-such-data-wow.html http://blog.crossref.org/2016/05/https-and-wikipedia.html
The world’s most accessed online medical resource?
Heilman and West (2015) doi.org/10.2196/jmir.4069
Most visited resource on Ebola in West Africa
Heilman (2016) http://tinyurl.com/jfuyduv
Most used internet site in Liberia, Sierra Leone and Guinea for Ebola during the 2014 outbreak
Greater than CNN, CDC and WHO
A new kind of open knowledge
The backbone of the linked open data ecosystem
Schmachtenberg et al. (2014) http://lod-cloud.net [CC BY SA]
Challenges
Biases / errors
Coverage
Diversity and inclusiveness
Verifiability
Machine-readable linked open data
Editable by anyone
Supporting human + algorithmic curation
Comprehensive
Transparently verifiable
Wikidata: Collaboratively curated linked open data
Wikidata
Free knowledge base that anyone can edit
Launched in 2012
Integrated with Wikipedia and other sister projects
Statistics (Aug 2016)
Nearly 20M items
Over 100M statements
Wikidata: Growth
http://reportcard.wmflabs.org/graphs/active_editors
English Wikipedia
Wikidata
Wikidata: Growth
http://reportcard.wmflabs.org/graphs/very_active_editors
English Wikipedia
Wikidata
Wikidata’s anatomy
https://www.wikidata.org/wiki/Wikidata:Introduction
Wikidata’s anatomy
Linked data, San Francisco, Jeblad https://commons.wikimedia.org/wiki/File:Linked_Data_-_San_Francisco.svg [CC BY SA]
Wikidata query examples
Paintings by Gustav Klimt SPARQL: http://tinyurl.com/zelqrwp
Birth place of people employed by MIT SPARQL: https://t.co/cDR4Lt7V6P
Children of Genghis Khan SPARQL: http://tinyurl.com/h5x5q4q
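A minimal sketch of how one of these queries might be sent to the public Wikidata Query Service from Python. The shortened links above are not expanded here, so the query text is an illustration, not the one behind the slide. The identifiers Q720 (Genghis Khan) and P40 (child) are assumptions to verify on wikidata.org, and the helper names and User-Agent string are hypothetical.

```python
from __future__ import annotations

import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed identifiers (check them on wikidata.org): Q720 = Genghis Khan, P40 = child.
CHILDREN_OF_GENGHIS_KHAN = """
SELECT ?child ?childLabel WHERE {
  wd:Q720 wdt:P40 ?child .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""


def build_request(query: str) -> urllib.request.Request:
    """Build a GET request that asks the endpoint for JSON results."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    headers = {
        "Accept": "application/sparql-results+json",
        # Hypothetical client name; use one that identifies your own tool.
        "User-Agent": "slide-example/0.1 (illustrative sketch)",
    }
    return urllib.request.Request(url, headers=headers)


def run_query(query: str) -> list[dict]:
    """Send the query (requires network access) and return the result rows."""
    with urllib.request.urlopen(build_request(query), timeout=30) as resp:
        return json.load(resp)["results"]["bindings"]


def labels(bindings: list[dict], var: str) -> list[str]:
    """Extract the string value of one variable from each result row."""
    return [row[var]["value"] for row in bindings if var in row]
```

With network access, `labels(run_query(CHILDREN_OF_GENGHIS_KHAN), "childLabel")` would list the names returned by the service. Swapping in a different item or property changes the question without changing the code.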
Expert curation of scientific open data
Benjamin Good (2016) Opportunities and challenges presented by Wikidata in the context of biocuration http://tinyurl.com/hk9qrmz
Expert curation of scientific open data
Gene Wiki: Wikidata SPARQL examples https://bitbucket.org/sulab/wikidatasparqlexamples/overview
Get a list of all diseases treated by Metformin
Get all the gene ontology evidence codes used in Wikidata
Get all known drug-drug interactions for Methadone via its CHEMBL id
WikiCite: Building the sum of all human citations
Randall Munroe, Wikipedian protester http://tinyurl.com/p3rodlb [CC BY]
the disappearance of provenance
http://bit.ly/SumOfAllCitations
http://wapo.st/1Y5Smm6
Linking is a small act of generosity that sends people away from your site to some other that you think shows the world in a way worth considering. [...]
[Sources] that are not generous with linking [...] are a stopping point in the ecology of information. That’s the operational definition of authority: The last place you visit when you’re looking for an answer. If you are satisfied with the answer, you stop your pursuit of it. Take the links out and you think you look like more of an authority.
D. Weinberger (2012) Linking is a public good http://www.hyperorg.com/blogger/2012/02/26/2b2k-linking-is-a-public-good/
a provenance-preserving answer engine
The sum of all human knowledge
+
The sum of all data and sources backing human knowledge
https://twitter.com/egonwillighagen/status/718474906858582016
Benjamin Good (2016) Opportunities and challenges presented by Wikidata in the context of biocuration http://tinyurl.com/hk9qrmz
https://tools.wmflabs.org/wikidata-todo/stats.php https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Source_MetaData#Sources_used_as_references_on_Wikidata
References in Wikidata [chart, 2013 to 2016; labelled value: 77%]
The molecular origins of insulin go at least as far back as the simplest unicellular [[eukaryotes]].<ref name='LeRoith'>{{cite journal | vauthors = LeRoith D, Shiloach J, Heffron R, Rubinovitz C, Tanenbaum R, Roth J | title = Insulin-related material in microbes: similarities and differences from mammalian insulins | journal = Can. J. Biochem. Cell Biol. | volume = 63 | issue = 8 | pages = 839–49 | year = 1985 | pmid = 3933801 | doi = 10.1139/o85-106 }}</ref> Apart from animals, insulin-like proteins are also known to exist in Fungi and Protista kingdoms.
References in Wikipedia
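The citation above lives inside free-form wikitext. As a rough illustration of what turning it into structured data could involve, the sketch below pulls the fields out of a flat `{{cite journal}}` template with a regular expression. The function name is hypothetical, nested templates are not handled, and this is not the tooling used by the Wikimedia projects.

```python
from __future__ import annotations

import re

# The citation from the slide, reproduced as a Python string.
WIKITEXT = (
    "<ref name='LeRoith'>{{cite journal | vauthors = LeRoith D, Shiloach J, "
    "Heffron R, Rubinovitz C, Tanenbaum R, Roth J | title = Insulin-related "
    "material in microbes: similarities and differences from mammalian insulins "
    "| journal = Can. J. Biochem. Cell Biol. | volume = 63 | issue = 8 "
    "| pages = 839–49 | year = 1985 | pmid = 3933801 "
    "| doi = 10.1139/o85-106 }}</ref>"
)

# Matches flat {{cite <type> | ... }} templates only (no nested templates).
CITE_TEMPLATE = re.compile(
    r"\{\{\s*cite\s+(\w+)\s*\|(.*?)\}\}", re.DOTALL | re.IGNORECASE
)


def parse_cite_templates(wikitext: str) -> list[dict[str, str]]:
    """Return one dict of field name -> value per citation template found."""
    citations = []
    for match in CITE_TEMPLATE.finditer(wikitext):
        fields = {"type": match.group(1).lower()}
        for part in match.group(2).split("|"):
            name, sep, value = part.partition("=")
            if sep:  # skip fragments that are not "name = value" pairs
                fields[name.strip()] = value.strip()
        citations.append(fields)
    return citations


if __name__ == "__main__":
    for citation in parse_cite_templates(WIKITEXT):
        print(citation["doi"], citation["pmid"], citation["title"])
```

Once the DOI and PMID are available as separate fields, the same source can be matched across articles, which free text alone does not allow.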
WikiCite: goals
Lay the foundations for building a repository of all Wikimedia citations and source metadata as structured data
Design data models and technology to improve the coverage, quality, standards-compliance and machine-readability of citations and source metadata in Wikimedia projects
https://meta.wikimedia.org/wiki/WikiCite_2016
Wikidata as the solution
Vision
Technology
Community
Scale
Licensing
Independence
https://meta.wikimedia.org/wiki/WikiCite_2016
https://tools.wmflabs.org/sqid/#/view?id=P2860
cites property now used in 350,000+ statements
https://twitter.com/harej/status/765336072997842944
The Zika corpus
Open citation graph layer
Bibliographic metadata layer
Expert annotation layer
Encyclopedic layer
Applications
Co-author graphs for individual researchers SPARQL: http://tinyurl.com/zml3jox
Most cited authors in the research corpus on Zika SPARQL: http://tinyurl.com/jb8da68
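The two applications above can be sketched on a toy corpus without any query service. The records, titles and author names below are invented for illustration and are not drawn from the Zika corpus. The first function counts co-authorship pairs, the second counts how often each author's works are cited.

```python
from __future__ import annotations

from collections import Counter
from itertools import combinations

# Invented toy records: each work lists its authors and the works it cites.
WORKS = [
    {"title": "A", "authors": ["Ana", "Ben"], "cites": []},
    {"title": "B", "authors": ["Ben", "Caro"], "cites": ["A"]},
    {"title": "C", "authors": ["Ana", "Ben"], "cites": ["A", "B"]},
]


def coauthor_edges(works: list[dict]) -> Counter:
    """Count, for every pair of authors, how many works they share."""
    edges: Counter = Counter()
    for work in works:
        for pair in combinations(sorted(set(work["authors"])), 2):
            edges[pair] += 1
    return edges


def citations_per_author(works: list[dict]) -> Counter:
    """Count citations received by each author across the corpus."""
    by_title = {work["title"]: work for work in works}
    counts: Counter = Counter()
    for work in works:
        for cited_title in work["cites"]:
            cited = by_title.get(cited_title)
            if cited is None:  # cited work is outside the corpus
                continue
            for author in cited["authors"]:
                counts[author] += 1
    return counts


if __name__ == "__main__":
    print(coauthor_edges(WORKS))
    print(citations_per_author(WORKS).most_common(1))
```

Both counts depend only on author lists and citation links, which is the kind of information the bibliographic metadata and citation graph layers are meant to hold.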
Semi-automated recommendation of missing statements or sources for unsourced statements
https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References
Tools for crowdsourcing entity matching / disambiguation
http://www.generalist.org.uk/blog/2014/wikidata-identifiers-and-the-odnb-where-next/ http://www.generalist.org.uk/blog/2014/wikidata-and-identifiers-part-2-the-matching-process/
all statements citing a New York Times article
the most popular scholarly journals used as citations for statements in any item that is a subclass of economics
all statements citing the works of Joseph Stiglitz
all statements citing journal articles by physicists from Oxford University
all statements citing a journal article that was retracted
all statements citing a source that cites a journal article that was retracted
New opportunities for linked open knowledge curation and discovery
https://meta.wikimedia.org/wiki/WikiCite_2016/Report/Group_5
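The last two questions in this list amount to walking a citation graph. The sketch below shows the idea on invented in-memory data, not on Wikidata itself. The class names, field names and identifiers are hypothetical. A statement is flagged when one of its references is retracted, or when a retracted article is reachable within a given number of citation hops.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Source:
    id: str
    retracted: bool = False
    cites: list[str] = field(default_factory=list)  # ids of cited sources


@dataclass
class Statement:
    id: str
    references: list[str]  # ids of the sources backing this statement


def reaches_retracted(source_id: str, sources: dict[str, Source], hops: int) -> bool:
    """True if a retracted source is reachable within `hops` citation links."""
    source = sources.get(source_id)
    if source is None:
        return False
    if source.retracted:
        return True
    if hops == 0:
        return False
    # The hop limit also guarantees termination on cyclic citation graphs.
    return any(reaches_retracted(c, sources, hops - 1) for c in source.cites)


def affected_statements(
    statements: list[Statement], sources: dict[str, Source], hops: int = 0
) -> list[str]:
    """Ids of statements whose references lead to a retracted source."""
    return [
        s.id
        for s in statements
        if any(reaches_retracted(r, sources, hops) for r in s.references)
    ]


# Invented example: A is retracted, B cites A, C cites nothing.
SOURCES = {
    "A": Source("A", retracted=True),
    "B": Source("B", cites=["A"]),
    "C": Source("C"),
}
STATEMENTS = [Statement("s1", ["A"]), Statement("s2", ["B"]), Statement("s3", ["C"])]
```

Here `affected_statements(STATEMENTS, SOURCES, hops=0)` covers statements citing a retracted article directly, and `hops=1` extends this to statements citing a source that cites one.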
Concluding remarks
Liberate public domain bibliographic and citation data
Support new forms of open curation and distributed fact-checking
Accelerate open scientific research
Verifiable, Linked Open Knowledge That Anyone can Edit
meta.wikimedia.org/wiki/WikiCite • @wikicite
Thank you
Acknowledgments
Daniel Mietchen, Jonathan Dugan, Lydia Pintscher, Cameron Neylon, James Hare, James Heilman, Magnus Manske, the Gene Wiki team (especially Andra Waagmeester and Benjamin Good), the University of Chicago Knowledge Lab, all WikiCite 2016 participants and Wikidata Source Metadata project contributors.
Additional image credits
Printing press, M. Wirth https://thenounproject.com/term/printing/11880/ [CC BY]
Cocitation network for openfMRI papers, F. Å. Nielsen https://twitter.com/fnielsen/status/752860630932156416
@readermeter • @Wikidata • @WikiCite • @WikiResearch