linked data introduction w exempel

Upload: kerstin-forsberg

Post on 03-Nov-2014

Page 1: Linked data introduction w exempel

Linked Data

“I’m encouraged by what actually can be done to improve the research and commercial utility of information.”

Here’s an intro to why …

Kerstin Forsberg, AstraZeneca R&D, Sweden

Introduction

Page 2: Linked data introduction w exempel

Linked Data Introduction 2

Web of (Linked) Data

Web 3.0

Web of Documents

An Intro To The Semantic Web: Why You Need To Know About It Sooner Than Later, by Samantha Wong. Image source: Frederic Martin

Page 3: Linked data introduction w exempel

Linking Open Data (LOD) cloud

The Linking Open Data cloud diagram

Two Forerunners: UK and US Government

http://data.gov.uk/

http://data.gov/

Page 4: Linked data introduction w exempel

Linking Open Data (LOD) cloud

Global identifier (URI) for an AZ study: http://data.linkedct.org/resource/trial/NCT00755378

48 RDF Triples

Linked Data: ClinicalTrials.gov

Page 5: Linked data introduction w exempel

4 Principles for Linked Data … and 5 stars for Linked Open Data

1. Use URIs (Uniform Resource Identifiers) as names for things.

2. Use HTTP URIs so that people can look up (dereference) those names.

3. When someone looks up a URI, provide useful information.

4. Include links to other URIs so that they can discover more things.
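Principles 2 and 3 can be sketched in a few lines of Python. This is only an illustration: it prepares an HTTP lookup for the study URI used earlier in these slides and asks for RDF rather than HTML. The function name `build_lookup_request` is hypothetical, and whether the server still answers, or honours the `text/turtle` media type, is an assumption, so the request is built but not sent.

```python
from urllib.request import Request


def build_lookup_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare (but do not send) an HTTP GET that asks for RDF instead of HTML."""
    if not uri.startswith(("http://", "https://")):
        # Principle 2: names should be HTTP URIs so that they can be looked up.
        raise ValueError("expected an HTTP URI")
    # Principle 3: the Accept header tells the server which description we want.
    # Assumption: the server supports content negotiation for this media type.
    return Request(uri, headers={"Accept": media_type})


request = build_lookup_request("http://data.linkedct.org/resource/trial/NCT00755378")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual lookup; following the links in the returned description is what principle 4 makes possible.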

Source: Linked Open Data star scheme by example

More resources introducing and describing the Linked Data idea

Page 6: Linked data introduction w exempel

Linked Enterprise Data

Source: What does Open Data mean for Enterprises?

More resources introducing and describing the Linked Data idea

Page 7: Linked data introduction w exempel

“Linked R&D Data”: Examples of Building Blocks

From the AZ RDI report “Persistent URIs and Linked Data”, ordered by Mike Westaway

• Global identifier (URI) scheme for AZ entities (e.g. people, projects, studies, drugs)

• Recommended biomedical ontologies and basic vocabularies to provide context (semantic and provenance) to data

• Dataset Catalogue

Page 8: Linked data introduction w exempel

I’m encouraged by …

• … what actually can be done by applying Linked Data principles, together with a stepwise implementation and pragmatic application of crucial building blocks, to …

• … improve the research and commercial utility of information
  • Organized for associations
  • Prepared for not yet defined use
  • Ready for automation where computers can function alongside us to
    • Mitigate the complexity in discovering, accessing, connecting and interpreting information
    • Improve the productivity in managing information

Health Care and Life Sciences (HCLS) Interest Group: Linking Open Drug Data

EU project The Large Knowledge Collider: Linked Life Data

A 2-page summary of our learnings from participating in these external projects: Linked Data in Pharma, 2011, Bo Andersson and Kerstin Forsberg

Page 9: Linked data introduction w exempel

Extras

• One example
  - Spending Data in UK

• Two things to remember
  - RDF Triples
  - Global Identifiers (URIs)

• Scenarios
  - Linked Clinical Study Metadata
  - Linked Patient Data in a Clinical Study

Linked Data Introduction (Extras)

Page 10: Linked data introduction w exempel

Kerstin Forsberg CDISC Interchange Europe 2011 eHR and the World Beyond

One example: Spending Data in UK

Linked Spending Data: How and Why Bother

From the Linking Open Data cloud

A payment from Lichfield District Council, one local authority in the UK, globally identified by a URI:

http://spending.lichfielddc.gov.uk/spend/8605670

Page 11: Linked data introduction w exempel

Two things to remember: RDF Triples

RDF: Resource Description Framework

subject | predicate | object

Example from a text book

The sky | has the color | blue

Example from the Spending data example in UK

Payment number 8605670 | Net Amount | 120.00
Payment number 8605670 | Type | Expenditure Line
Payment number 8605670 | Payer | Lichfield District Council
Lichfield District Council | Type | Local Authority

Triples for the standards that provide the semantics

The property Net Amount | comment | “The net amount of the payment. This is the effective cost to the payer after any reclaimable tax has been deducted.”
The class Expenditure Line | subclass of | Observation in a multi-dimensional data cube for statistics
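To make the subject, predicate, object pattern concrete, here is a minimal Python sketch that holds the four spending triples as plain tuples and follows a link from the payment to its payer. The names `Triple` and `follow` are illustrative only; the labels are the human-readable ones from the slide, not URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


PAYMENT = "Payment number 8605670"
COUNCIL = "Lichfield District Council"

# The four triples from the spending data example, with readable labels.
triples = [
    Triple(PAYMENT, "Net Amount", "120.00"),
    Triple(PAYMENT, "Type", "Expenditure Line"),
    Triple(PAYMENT, "Payer", COUNCIL),
    Triple(COUNCIL, "Type", "Local Authority"),
]


def follow(subject: str, predicate: str, graph: list) -> list:
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in graph if t.subject == subject and t.predicate == predicate]


# Because the payer is itself a subject, one answer leads to the next question.
payer = follow(PAYMENT, "Payer", triples)[0]
print(payer, "is a", follow(payer, "Type", triples)[0])
# prints: Lichfield District Council is a Local Authority
```

The object of one triple being the subject of another is what turns a list of statements into a graph that can be traversed.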

Page 12: Linked data introduction w exempel

Two things to remember: Global Identifiers

URI: Uniform Resource Identifier

Examples of Identifiers for “types of things” / “standards for things”

http://reference.data.gov.uk/def/payment#netAmount

http://reference.data.gov.uk/def/payment#ExpenditureLine

http://purl.org/linked-data/cube#Observation

http://statistics.data.gov.uk/def/administrative-geography/LocalAuthority

Examples of Identifiers for “things” in Spending data example

http://spending.lichfielddc.gov.uk/spend/8605670

http://statistics.data.gov.uk/id/local-authority/41UD
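Once names are replaced by URIs, the same statements can be written out in a standard line-based syntax. The sketch below is a simplified, hand-rolled writer for N-Triples style lines, not a full implementation of the format. It uses only the URIs listed on this slide plus the standard `rdf:type` and `rdfs:subClassOf` properties; treating every value that starts with `http` as a URI, and writing the amount as a plain literal without a datatype, are simplifying assumptions.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

PAYMENT = "http://spending.lichfielddc.gov.uk/spend/8605670"
LOCAL_AUTHORITY_41UD = "http://statistics.data.gov.uk/id/local-authority/41UD"
NET_AMOUNT = "http://reference.data.gov.uk/def/payment#netAmount"
EXPENDITURE_LINE = "http://reference.data.gov.uk/def/payment#ExpenditureLine"
OBSERVATION = "http://purl.org/linked-data/cube#Observation"
LOCAL_AUTHORITY = "http://statistics.data.gov.uk/def/administrative-geography/LocalAuthority"

triples = [
    (PAYMENT, NET_AMOUNT, "120.00"),
    (PAYMENT, RDF_TYPE, EXPENDITURE_LINE),
    (LOCAL_AUTHORITY_41UD, RDF_TYPE, LOCAL_AUTHORITY),
    (EXPENDITURE_LINE, RDFS_SUBCLASS_OF, OBSERVATION),
]


def term(value: str) -> str:
    """Write a URI in angle brackets and anything else as a quoted literal."""
    if value.startswith(("http://", "https://")):  # simplification, see lead-in
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(subject: str, predicate: str, obj: str) -> str:
    """Format one triple as a single N-Triples style line."""
    return f"<{subject}> <{predicate}> {term(obj)} ."


lines = [to_line(*t) for t in triples]
print("\n".join(lines))
```

In practice an RDF library would handle serialization and datatypes; the point here is only that every part of a statement except the literal value is a globally unique, dereferenceable name.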

Page 13: Linked data introduction w exempel

One example – “under the hood”

Live view using the Web Data Inspector

2 of 10 RDF Triples

Page 14: Linked data introduction w exempel

One example – with a top-down approach to standardization of the semantics

Live view using the Web Data Inspector

The Payment Ontology

Page 15: Linked data introduction w exempel

UK government: Top-down approach to standardization for Spending Data

Statistical Data perspective: Linked Data Cube Vocabulary

Payment Ontology

Guide to the Payments Ontology

The RDF Data Cube vocabulary

Presentation: Statistical Data in RDF

Page 16: Linked data introduction w exempel

Scenario: Linked Clinical Study Metadata

http://clinial.data.astrazeneca.com/id/study/D8180C00011

What would we like to see as the linked data description of it?

What would we like to see on an internal webpage presenting linked data describing a clinical study?

owl:sameAs http://data.linkedct.org/resource/trial/NCT00755378

http://reference.cdisc.org/ct/sdtm/TSPARAMCD#ROUTE

Internal categorization to support Design & Interpretation decisions

http://clinical.reference.astrazenenca.com/DI/SIZE#LARGE
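The `owl:sameAs` link in this scenario can be illustrated with a small Python sketch. It records one triple stating that the internal study URI and the LinkedCT URI name the same study, then collects every identifier reachable through such links in either direction. The function name `equivalents` is hypothetical, and the internal URI is copied verbatim from the slide as an illustrative identifier, not a live address.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Illustrative internal identifier, copied verbatim from the slide.
INTERNAL_STUDY = "http://clinial.data.astrazeneca.com/id/study/D8180C00011"
LINKEDCT_TRIAL = "http://data.linkedct.org/resource/trial/NCT00755378"

triples = {(INTERNAL_STUDY, OWL_SAME_AS, LINKEDCT_TRIAL)}


def equivalents(uri: str, graph: set) -> set:
    """Collect all URIs connected to `uri` by owl:sameAs, read in both directions."""
    seen = {uri}
    frontier = [uri]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in graph:
            if predicate != OWL_SAME_AS:
                continue
            # owl:sameAs is symmetric, so check both ends of each statement.
            for a, b in ((subject, obj), (obj, subject)):
                if a == current and b not in seen:
                    seen.add(b)
                    frontier.append(b)
    seen.discard(uri)
    return seen


print(equivalents(LINKEDCT_TRIAL, triples))
```

Starting from the public identifier, the lookup returns the internal one, which is how an internal page about a study could pull in what the public registry already says about it.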

Page 17: Linked data introduction w exempel

Scenario: Linked Patient Data in a Clinical Study

• If each AZ clinical study had a global identifier/URI (for example something like http://data.astrazenenca.com/id/clinicalstudy/D8180C00011/, similar to what already exists today for studies in ClinicalTrials.gov, e.g. http://data.linkedct.org/resource/trial/NCT00755378)

• If each identified Observation in a clinical study dataset delivered to AZ had a global identifier, e.g. http://data.astrazenenca.com/data/observation/D8180C00011/20000034

• If each individual Observation were linked, via an RDF triple, to its globally identified test procedure, similar to what exists today for CDISC’s SDTM submission values, e.g. hemoglobin measurement http://linkedlifedata.com/resource/umls/id/C051801

• If each individual Observation were linked, via an RDF triple, to contextual information, similar to what exists today for CDISC’s SDTM submission values, e.g. http://linkedlifedata.com/resource/umls/id/C0038846 with prefLabel ‘Supine Position’. (Hopefully, CDISC, together with NCI, will in the future publish their standards in a similar way to make it easier to link clinical data.)

• If each identified Observation were also delivered with its provenance information, for example two RDF triples expressing references to the identified measurement device and to the SOP document being used. For more details see the provenance vocabulary http://trdf.sourceforge.net/provenance/ns.htm
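The scenario above can be sketched as a handful of triples about one observation, plus a check that the expected links are present. Everything under `http://example.org/` is hypothetical: the slides do not name the predicates, the device or the SOP document, so these are placeholders chosen for illustration. The other URIs are copied verbatim from the scenario.

```python
EX = "http://example.org/vocab#"  # hypothetical vocabulary, not from the slides

OBSERVATION = "http://data.astrazenenca.com/data/observation/D8180C00011/20000034"
STUDY = "http://data.astrazenenca.com/id/clinicalstudy/D8180C00011/"
TEST_PROCEDURE = "http://linkedlifedata.com/resource/umls/id/C051801"
POSITION = "http://linkedlifedata.com/resource/umls/id/C0038846"

triples = [
    (OBSERVATION, EX + "partOfStudy", STUDY),
    (OBSERVATION, EX + "testProcedure", TEST_PROCEDURE),
    (OBSERVATION, EX + "context", POSITION),
    # Provenance: two triples pointing at placeholder device and SOP identifiers.
    (OBSERVATION, EX + "measuredWith", "http://example.org/device/1"),
    (OBSERVATION, EX + "followedSOP", "http://example.org/sop/1"),
]

REQUIRED = [
    EX + "partOfStudy",
    EX + "testProcedure",
    EX + "context",
    EX + "measuredWith",
    EX + "followedSOP",
]


def missing_links(observation: str, graph: list) -> list:
    """List the required predicates that the observation does not have yet."""
    present = {p for s, p, _ in graph if s == observation}
    return [p for p in REQUIRED if p not in present]


print(missing_links(OBSERVATION, triples))      # complete delivery
print(missing_links(OBSERVATION, triples[:3]))  # delivered without provenance
```

A check like this is one way a receiving system could verify that delivered data carries its context and provenance before it is accepted; a real implementation would use the published provenance vocabulary rather than the placeholder predicates shown here.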