Exercises - phusewiki.org


Upload: duonganh

Post on 09-May-2018


Exercises

Due to time constraints and the large number of attendees, we were unable to provide hands-on experience during the session. This section provides exercises and a link to materials so you may try creating and querying Linked Data on your own.

To obtain files for the exercises, go to: http://www.phusewiki.org/wiki/index.php?title=Semantic_Technology_Curriculum

Download the file: PhUSECSS-Semantics101-AttendeeFiles.zip


Introduction to Jena Fuseki

• Apache-Jena – contains the APIs, SPARQL engine, the TDB native RDF database and command line tools ARQ, RIOT …

• Apache-Jena-Fuseki – the Jena SPARQL server


Load a File into Fuseki

File: ex001.ttl

@prefix css: <http://www.example.org/CSS/> .
@prefix ct: <http://bio2rdf.org/clinicaltrials/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ct:NCT00799760 css:title "Evaluation of Efficacity…"@en ;
    css:phase "Phase 3"@en ;
    css:enrollment "541"^^xsd:int .

Instructions sent to attendees/available on wiki


Query #1: Getting Started

File: ex002.rq

PREFIX css: <http://www.example.org/CSS/>
SELECT *
WHERE {
  ?s ?p ?o .
} LIMIT 10

See Exercises


Query #2: Graph Pattern for Title

PREFIX css: <http://www.example.org/CSS/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials/>
SELECT ?nctid ?title
WHERE {
  ?nctid css:title ?title .
}

The query pattern lines up with the data triple term by term (S = subject, P = predicate, O = object):

        S                 P            O
Data    ct:NCT00799760    css:title    "Evaluation of Efficacity and Safety…"@en
Query   ?nctid            css:title    ?title


Query for Study Title

File: ex003.rq

PREFIX css: <http://www.example.org/CSS/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials/>
SELECT ?nctid ?title
WHERE {
  ?nctid css:title ?title .
}

See Exercises


Upload another file

File: ex004.TTL

@prefix css: <http://www.example.org/CSS/> .
@prefix ct: <http://bio2rdf.org/clinicaltrials/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ct:NCT00799760 css:title "Evaluation of Efficacity …"@en ;
    css:phase "Phase 3"@en ;
    css:enrollment "541"^^xsd:integer ;
    css:primOutcome css:outcome1 .

css:outcome1 rdf:type ct:primary-outcome ;
    ct:measure "RT-PCR for influenza A virus…"@en ;
    ct:time-frame "2 days" .

See Exercises


Query for Primary Outcome

Data:

ct:NCT00799760 css:title "Evaluation of Efficacity …"@en ;
    css:phase "Phase 3"@en ;
    css:enrollment "541"^^xsd:integer ;
    css:primOutcome css:outcome1 .

css:outcome1 rdf:type ct:primary-outcome ;
    ct:measure "RT-PCR for influenza A virus…"@en ;
    ct:time-frame "2 days" .

Graph Query:

ct:NCT00799760 css:primOutcome ?outURI .
?outURI ct:measure ?outcome .


SPARQL Query

PREFIX css: <http://www.example.org/CSS/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials/>
SELECT ?outcome
WHERE
{
  ct:NCT00799760 css:primOutcome ?outURI .
  ?outURI ct:measure ?outcome .
}

Retrieve data that matches the Graph Pattern:

NCTID --primOutcome--> ?outURI --measure--> ?outcome
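To make the idea of "data that matches the graph pattern" concrete, here is a minimal Python sketch of pattern matching over triples. It is a toy illustration, not how Jena's SPARQL engine works internally. The triples are taken from ex004.ttl but simplified to plain strings (language tags and datatypes are dropped), and the helper names `is_var`, `match_pattern` and `query` are made up for this example.

```python
# A toy matcher for SPARQL-style graph patterns (illustration only;
# a real engine such as Jena ARQ is far more capable).

# Triples from ex004.ttl as (subject, predicate, object) tuples,
# with prefixed names and literals simplified to plain strings.
DATA = [
    ("ct:NCT00799760", "css:phase", "Phase 3"),
    ("ct:NCT00799760", "css:primOutcome", "css:outcome1"),
    ("css:outcome1", "rdf:type", "ct:primary-outcome"),
    ("css:outcome1", "ct:measure", "RT-PCR for influenza A virus…"),
    ("css:outcome1", "ct:time-frame", "2 days"),
]


def is_var(term):
    """Terms starting with '?' are variables, as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`.

    Returns the extended binding, or None if they cannot match.
    """
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            if result.setdefault(term, value) != value:
                return None  # variable already bound to something else
        elif term != value:
            return None  # constant term does not match
    return result


def query(patterns, data):
    """Return every binding that satisfies all patterns (a simple join)."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in data
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


# The two triple patterns from the WHERE clause of the query.
PATTERNS = [
    ("ct:NCT00799760", "css:primOutcome", "?outURI"),
    ("?outURI", "ct:measure", "?outcome"),
]

results = query(PATTERNS, DATA)
for row in results:
    print(row["?outcome"])
```

The shared variable ?outURI is what links the two patterns: the first pattern binds it to css:outcome1, and the second pattern can then only match triples whose subject is that same node.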


Query for Study Outcome

File: ex005.rq

PREFIX css: <http://www.example.org/CSS/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials/>
SELECT ?outcome
WHERE {
  ct:NCT00799760 css:primOutcome ?outURI .
  ?outURI ct:measure ?outcome .
}

See Exercises


Trial Triples with SPARQL

At: http://lod.openlinksw.com/sparql

DESCRIBE <http://bio2rdf.org/clinicaltrials:NCT00799760>

Result (excerpt):

ns1:NCT00799760 rdf:type ns2:Resource ,
        ns2:Clinical-Study .
ns1:NCT00799760 ns3:title "Evaluation of Efficacity and Safety of Oseltamivir and Zanamivir"@en .
ns2:actual-enrollment 541 ;
…AND MUCH MORE….



Query with R

R Packages:
• rrdf
• rrdflibs

http://github.com/egonw/rrdf

Requires Java 7 or higher

Willighagen E. (2014) Accessing biological data in R with semantic web technologies. PeerJ PrePrints 2:e185v3. See https://dx.doi.org/10.7287/peerj.preprints.185v3


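File: queryLocalTTL.R

library(rrdf)

# Load the Turtle file into an in-memory RDF model
dataSource = load.rdf("<path to the TTL file>/ex004.ttl",
                      format="N3")

# SPARQL query: follow primOutcome to the outcome node, then read its measure
query = 'PREFIX css: <http://www.example.org/CSS/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials/>
SELECT ?primaryOutcome
WHERE
{
  ct:NCT00799760 css:primOutcome ?outURI .
  ?outURI ct:measure ?primaryOutcome .
}'

# Run the query against the local model and show the result as a data frame
queryResult = as.data.frame(sparql.rdf(dataSource, query))
queryResult

See Exercises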


Query an Endpoint with R

File: queryLocalFuseki.R

library(rrdf)

endpoint = "http://localhost:3030/test/query"
query = "SELECT * WHERE {?s ?p ?o . } LIMIT 10 "

queryResult = sparql.remote(endpoint, query)
queryResult

See Exercises
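For readers who want to see what querying an endpoint involves at the HTTP level, the following Python sketch sends the same query using only the standard library. It assumes the Fuseki endpoint from the slide (a local dataset named "test") and follows the SPARQL 1.1 Protocol's "query via GET" form; the function names `build_request` and `run_select` are invented for this example and are not part of rrdf or Jena.

```python
import json
import urllib.parse
import urllib.request

# Endpoint from the slide; assumes a local Fuseki dataset named "test".
ENDPOINT = "http://localhost:3030/test/query"
QUERY = "SELECT * WHERE {?s ?p ?o . } LIMIT 10"


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol 'query via GET' request.

    The query text travels in the URL-encoded `query` parameter, and the
    Accept header asks for the standard JSON results format.
    """
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_select(endpoint, query):
    """Send the query and return result rows as dicts of plain values.

    Needs a running SPARQL service at `endpoint`.
    """
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        payload = json.load(response)
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in payload["results"]["bindings"]
    ]


# Building the request does not touch the network, so it is safe to inspect.
request = build_request(ENDPOINT, QUERY)
print(request.full_url)
```

With the Fuseki server running and a TTL file loaded, calling `run_select(ENDPOINT, QUERY)` should return up to ten rows, each a dictionary keyed by the variable names s, p and o.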


Query with SAS

SAS Macros:
• %sparqlquery - SPARQL query
• %sparqlupdate - SPARQL update

https://github.com/MarcJAndersen/SAS-SPARQLwrapper

Implementation:
• SAS PROC HTTP to access the service
• Send query/update as text file
• Input result using SAS LIBNAME for XML

Other approaches:
• PROC groovy to execute Java Code from Apache Jena
• SAS Java objects to interface to Apache Jena

Requires running SPARQL service, for example Apache Jena


File: queryLocalFuseki.sas

Assumptions:
• Service active at endpoint
• TTL file uploaded to store


Query a Remote Source

At: http://lod.openlinksw.com/sparql


Create RDF using R

• R with rrdf, rrdflibs
  https://github.com/egonw/rrdf
• R data frame to RDF
  – Excel -> data frame -> RDF
  – SAS dataset -> data frame -> RDF


Create RDF using R

Packages: rrdf, rrdflibs

• add.triple()
  – Add a triple: object is a URI
• add.data.triple()
  – Add a triple: object is a literal
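The difference between the two cases shows up directly in the serialized RDF: a URI object is written in angle brackets, while a literal object is written in quotes with an optional language tag or datatype. The Python sketch below illustrates this by producing N-Triples lines. It mirrors only the distinction between the two rrdf functions, not their signatures; `uri_triple`, `literal_triple` and `escape_literal` are hypothetical helpers written for this example.

```python
CSS = "http://www.example.org/CSS/"
CT = "http://bio2rdf.org/clinicaltrials/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def escape_literal(text):
    """Escape the characters N-Triples requires inside a quoted literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def uri_triple(subject, predicate, obj):
    """Triple whose object is a URI (the add.triple() case)."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def literal_triple(subject, predicate, value, lang=None, datatype=None):
    """Triple whose object is a literal (the add.data.triple() case).

    A literal carries either a language tag or a datatype, never both.
    """
    literal = f'"{escape_literal(str(value))}"'
    if lang:
        literal += f"@{lang}"
    elif datatype:
        literal += f"^^<{datatype}>"
    return f"<{subject}> <{predicate}> {literal} ."


study = CT + "NCT00799760"
lines = [
    uri_triple(study, CSS + "primOutcome", CSS + "outcome1"),
    literal_triple(study, CSS + "phase", "Phase 3", lang="en"),
    literal_triple(study, CSS + "enrollment", 541, datatype=XSD + "integer"),
]
print("\n".join(lines))
```

Choosing the wrong form changes the meaning of the data: a URI written as a quoted literal is just a string and cannot be followed to another node in a graph pattern.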


Create RDF using R

Try or follow along

File: createTTLFromR.R

Output File: createTTLFromR.TTL


Create RDF using SAS

• SAS accessing SPARQL service using PROC HTTP
  – All functions provided by the service, see SPARQL 1.1 Protocol (https://www.w3.org/TR/sparql11-protocol/)
  – Implemented as SAS macros
    https://github.com/MarcJAndersen/SAS-SPARQLwrapper
• SAS generating text files with
  – RDF in Turtle
  – SPARQL INSERT statements
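The text-file approach works the same way in any language: loop over the rows of a dataset and write one or more statements per row. As a language-neutral illustration, here is a small Python sketch that turns rows into the text of a SPARQL 1.1 INSERT DATA update. The input rows, the column names and the helpers `quote` and `insert_data` are assumptions made for this example; they are not taken from createTTLFromSAS.SAS.

```python
PREFIXES = {
    "css": "http://www.example.org/CSS/",
    "ct": "http://bio2rdf.org/clinicaltrials/",
}

# Hypothetical input rows, one study per row, as might be read from a dataset.
ROWS = [
    {"nctid": "NCT00799760", "phase": "Phase 3", "enrollment": 541},
]


def quote(text):
    """Quote a string as a SPARQL literal, escaping backslashes and quotes.

    Newlines are not handled; this sketch assumes single-line values.
    """
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def insert_data(rows):
    """Return the text of a SPARQL 1.1 INSERT DATA update for `rows`."""
    lines = [f"PREFIX {prefix}: <{iri}>" for prefix, iri in PREFIXES.items()]
    lines.append("INSERT DATA {")
    for row in rows:
        # Subject is the study URI; phase is an English-tagged literal.
        lines.append(f"  ct:{row['nctid']} css:phase {quote(row['phase'])}@en ;")
        # A bare integer in SPARQL is read as an xsd:integer literal.
        lines.append(f"    css:enrollment {int(row['enrollment'])} .")
    lines.append("}")
    return "\n".join(lines)


update_text = insert_data(ROWS)
print(update_text)
```

The resulting text can be saved to a file and sent to the update service of a SPARQL server, or the same loop can emit plain Turtle by dropping the INSERT DATA wrapper and writing the prefixes in @prefix form.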


Create RDF using SAS

Try or follow along

File: createTTLFromSAS.SAS

Output File: createTTLFromSAS.TTL


Validate

• Apache Jena RIOT (RDF I/O Technology)

riot --validate CreateTTLFromEditor.TTL

Example errors

1. Forgot PAV prefix

08:45:44 ERROR riot :: [line: 9, col: 16] Undefined prefix: pav

2. Incorrect triples termination

08:45:44 ERROR riot :: [line: 9, col: 32] Unexpected IRI for predicate…

* Note: requires Apache Jena in the system path