AG Semantic Computing
Philipp Cimiano & Cord Wiljes
10th International Bielefeld Conference
Philipp Cimiano, Semantic Computing, CITEC, Bielefeld Univ.
»From Linked Data to Linked Science«
Papers, papers, papers, ...
Limitations
- Cannot 'work' with the paper
- Cannot get access to the full data (only views)
- Cannot check / validate data
- Difficult to reproduce results
- No aggregation
How can these limitations be overcome?
RAW DATA NOW!
Linked Data Principles
- Use URIs as names for things -- provide unique IDs
- Use HTTP URIs so that people can look up those names -- access to content
- When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL) -- fostering reuse
- Include links to other URIs -- fostering discovery
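The four principles can be illustrated with a toy sketch in plain Python. This is not a real RDF store: the `example.org` URIs, the property names, and the helper functions `describe` and `links` are all hypothetical, and the dictionary lookup only stands in for dereferencing a URI over HTTP.

```python
# Toy illustration of the Linked Data principles using plain Python.
# All example.org URIs and property names are hypothetical placeholders.

EX = "http://example.org/"

# Each triple is (subject, predicate, object); HTTP URIs serve as names.
TRIPLES = [
    (EX + "dataset/1", EX + "vocab/title", "Sample measurement series"),
    (EX + "dataset/1", EX + "vocab/creator", EX + "person/1"),
    (EX + "person/1", EX + "vocab/name", "A. Researcher"),
]


def describe(uri, triples=TRIPLES):
    """Return what is known about `uri` (stand-in for an HTTP lookup).

    Assumes at most one value per property, which holds for this toy data.
    """
    return {p: o for s, p, o in triples if s == uri}


def links(uri, triples=TRIPLES):
    """Return the other URIs that `uri` points to, for further discovery."""
    return [o for s, p, o in triples if s == uri and o.startswith("http://")]
```

Calling `links(EX + "dataset/1")` returns the creator's URI, which can in turn be passed to `describe`; following such links is what the fourth principle refers to as discovery.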
Linked Data Cloud
Scientific Data remains in silos
In science, valuable data is often not publicly available.
There are many reasons for this:
- high effort involved in publishing data (curation, manuals)
- lack of incentives (but this is changing, ...)
- competitive advantage (careers, ...)
- lack of technical expertise
BUT: openness requires making raw scientific data available (after all, it has been paid for by the taxpayer)
Science Organizations calling for making data publicly accessible
- DFG: strong commitment to open access and long-term archiving
- Wissenschaftsrat:
  - Recommendations on the Future of the Library Network System in Germany (Empfehlungen zur Zukunft des bibliothekarischen Verbundsystems in Deutschland)
  - Overarching Recommendations on Information Infrastructures (Übergreifende Empfehlungen zu Informationsinfrastrukturen)
- European Commission: Open Access Pilot (FP7)
Applying Linked Data Principles to Scientific Data
- Idea: create an ecosystem and infrastructure where Scientific Data can be published using open formats so that the data can be easily found and re-used.
- Infrastructure already there (WWW, HTTP), standards available (RDF*, SPARQL), linked data search engines
- Missing: Incentives, Methodology.
What would this support?
- find relevant data
- check / validate data
- reproduce experiments
- compare to own data
- ...
Provenance data is of course key.
Our goal
- Develop a methodology that supports scientists in publishing their research / experimental data as linked open data.
- So far: two use cases that guide us in the design of the methodology and will serve as a benchmark.
- Fortunately: two scientists at Bielefeld University who are willing to release their data as linked open data.
Chemistry: Glass Transition of Atmospheric Aerosols
Use Case in Chemistry
- A researcher in chemistry wants to collect all known glass transition temperatures of aerosols.
- The researcher would use a semantic web search engine to look for appropriate data, using a SPARQL query to aggregate the data as needed.
- The collected data would contain provenance information so that the scientist can assess how trustworthy the data actually is.
Competency Questions
- Give me all glass transition temperatures of organic compounds.
- Give me all glass transition temperatures of amino acids measured by differential scanning calorimetry.
- Which substances form glasses at temperature and pressure conditions in the troposphere?
Technical challenges: vocabulary alignment, provenance data, data
indexing etc.
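To make the second competency question concrete, here is a minimal sketch in plain Python of the selection it implies. It is not the SPARQL query itself; the record layout, the field names, the function `glass_transition_temperatures`, and every value (including the temperatures and source URIs) are made-up placeholders, not real measurements.

```python
# Sketch: answering a competency question over hypothetical records.
# Field names and ALL values are placeholders, not real measurements.

RECORDS = [
    {"substance": "substance-A", "substance_class": "amino acid",
     "method": "differential scanning calorimetry",
     "tg_kelvin": 1.0, "source": "http://example.org/paper/1"},
    {"substance": "substance-B", "substance_class": "amino acid",
     "method": "other method",
     "tg_kelvin": 2.0, "source": "http://example.org/paper/2"},
    {"substance": "substance-C", "substance_class": "sugar",
     "method": "differential scanning calorimetry",
     "tg_kelvin": 3.0, "source": "http://example.org/paper/3"},
]


def glass_transition_temperatures(records, substance_class=None, method=None):
    """Select records by class and method, keeping the provenance link."""
    hits = []
    for r in records:
        if substance_class is not None and r["substance_class"] != substance_class:
            continue
        if method is not None and r["method"] != method:
            continue
        # The source URI travels with each value so its origin can be checked.
        hits.append((r["substance"], r["tg_kelvin"], r["source"]))
    return hits
```

The exact string match on `substance_class` only works if every publisher uses the same term, which is where the vocabulary alignment challenge comes in: answering the first question would additionally require knowing that amino acids are organic compounds.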
Biology: Natural Movement of Stick Insects
Use Case
- A researcher interested in insect locomotion downloads this dataset.
- The researcher is able to recompute the course of joint angles from the data given.
- As a result, the researcher is, for example, able to perform a simulation of the organism or compare the data to their own measurements on the same or a similar organism.
Competency Questions
- Give me all motion capture datasets about insects!
- Recompute the course of joint angles and compare them to my own motion capture data.
- ...
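As a rough illustration of what recomputing joint angles could involve, the sketch below computes the angle at a joint from three 3D marker positions per frame. It assumes the dataset provides such marker coordinates and that a joint angle is defined as the angle between the two adjacent segments; the slides do not specify either, and the function names are hypothetical.

```python
import math


def joint_angle(proximal, joint, distal):
    """Angle in degrees at `joint` between the segments to its two neighbours.

    Each argument is an (x, y, z) marker position (an assumed data layout).
    """
    u = [p - j for p, j in zip(proximal, joint)]
    v = [d - j for d, j in zip(distal, joint)]
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("markers coincide; angle undefined")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


def angle_course(frames):
    """Joint angle per frame; each frame is (proximal, joint, distal)."""
    return [joint_angle(*frame) for frame in frames]
```

Applied to a recorded sequence, `angle_course` yields one angle per frame, which could then be compared against the same quantity computed from one's own motion capture data.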
Towards a Methodology
Vocabularies
- Provenance: Open Provenance Model, Provenir, PML, ...
- Bibliographic Data: BIBO, PRISM, ISBD, ...
- Domain-specific vocabularies
Publication of Linked Data is an intellectual activity!
Summary
- Open Science: making research products available to the public.
- Linked Open Science: apply Linked Data technology to accomplish this (RAW RESEARCH DATA NOW!).
- Infrastructure and open standards exist (but we need dedicated search engines that index and provide access to content).
- Proposal of a first methodology
- Important: decentralized approach!
- Central role: Data curators
- Discussion: what does it mean for the way libraries publish data? How do we link paper and research results? Interactive papers?
Acknowledgements
Funding by Cognitive Interaction Technology Excellence Center (CITEC)
Joint work with Cord Wiljes