saveface - save your facebook content as rdf data

Saveface – Save Facebook’s data as RDF graph Using Jena, Joseki & FB graph API Fuming Shih [email protected]

Upload: fuming-shih

Post on 11-May-2015


Category:

Technology



DESCRIPTION

The slides share experience on how to crawl Facebook content and save it as RDF, then use Joseki (Jena) to serve the RDF data through a SPARQL endpoint.

TRANSCRIPT

Page 1: Saveface - Save your Facebook content as RDF data

Saveface – Save Facebook’s data as RDF graph

Using Jena, Joseki & FB Graph API

Fuming Shih

[email protected]

Page 2: Saveface - Save your Facebook content as RDF data

About Me

• 4th year graduate student at CSAIL, working with Hal Abelson

• Member of DIG group (decentralized information group) at CSAIL

• Working on topics relating to privacy, mobile context, and accountability


Page 3: Saveface - Save your Facebook content as RDF data

Simond Secono's Walled Gardens Picture, taken from TimBL's presentation

Saveface

Saveflickr

Page 4: Saveface - Save your Facebook content as RDF data

Outline

• Demo Saveface SPARQL endpoint
• Overview
• Set up Joseki SPARQL endpoint
• From Protégé (data modeling) to Jena/Jastor (RDF library/SPARQL endpoint)
– Protégé
– Jastor
– Facebook Graph API
– Jena


Page 5: Saveface - Save your Facebook content as RDF data

Overview

• Protégé 4.1 (data modeling)
• Jastor library (RDF to POJO)
• Facebook Graph API
• RestFB*
• Jena/Jastor


Page 6: Saveface - Save your Facebook content as RDF data

Setup Joseki (Jena)

• Joseki is an HTTP engine that supports SPARQL (uses Jetty, supports ARQ for Jena)
– configuration as a Turtle file
• Get Jena 2.6.3, TDB 0.8.7, Joseki 3.4.2 at http://sourceforge.net/projects/jena/files/
– or go to http://dig.csail.mit.edu/2010/aintno/rdfData/aintno_joseki.tar.gz for everything in one archive
– Jena is now an Apache Incubator project (http://incubator.apache.org/jena/index.html)

source: http://ricroberts.com/articles/installing-jena-and-joseki-on-os-x-or-linux

Page 7: Saveface - Save your Facebook content as RDF data

Setup environment

• export JOSEKIROOT=/path/to/Joseki-3.4.2
• export TDBROOT=/path/to/TDB-0.8.7
• export JENAROOT=/path/to/Jena-2.6.3
• export CLASSPATH=.:$JENAROOT/lib/*.jar:$TDBROOT/lib/*.jar:$JOSEKIROOT/lib/*.jar
• export PATH="$TDBROOT/bin:$JOSEKIROOT/bin:$PATH"
• if you download the all-in-one package (* I have put all jars under Joseki's lib folder)
– export JOSEKIROOT="/path/to/Joseki-aintno"
– export PATH="$JOSEKIROOT/bin:$PATH"
– export CLASSPATH=".:$JOSEKIROOT/lib/*.jar"

Page 8: Saveface - Save your Facebook content as RDF data

Run Joseki

• cd /path/to/Joseki
• ./bin/rdfserver
– ./bin/rdfserver --help (joseki.rdfserver [--verbose] [--port N] dataSourceConfigFile)
• Now open a browser at http://localhost:2020/
– test the SPARQL query interface with the example data


Page 9: Saveface - Save your Facebook content as RDF data

Joseki - HTTP access to SPARQL endpoint


Page 10: Saveface - Save your Facebook content as RDF data

Saveface

• Goal: save my Facebook data as linked data
• Facebook *finally* provides a RESTful API to access its data (Facebook Graph API)
– http://developers.facebook.com/docs/reference/api/
– graph structure (e.g. Album class)
• http://developers.facebook.com/docs/reference/api/album/


Page 11: Saveface - Save your Facebook content as RDF data

From Data model to Java POJO

• Used Protégé to create an OWL class for each of the Facebook classes
– be aware that mapping from OO to an ontology needs care
– serialize as RDF files
• Mapping ontologies (OWL files) to Java classes
– used the Jastor library to generate Java interfaces, implementations, factories, and listeners based on the properties and class hierarchies in the Web ontologies
– easier for non-Semantic Web Java developers to make use of the ontology


Page 12: Saveface - Save your Facebook content as RDF data

Jastor

• Typesafe, Ontology Driven RDF Access from Java (http://jastor.sourceforge.net/)
– Uses Jena 2.4
• Provides an interface for accessing, setting, and adding event listeners to the RDF model

[Figure: architecture diagram. Labels: RDF DB; ontology files (FOAF, iCal, SIOC, Tag); mapping tool; Jastor (listener, operator); Jena2 Platform (RDF model + Reasoning Engine + Persistence System); Java VM]


Page 13: Saveface - Save your Facebook content as RDF data

Example

Create mapping

JastorContext ctx = new JastorContext();
ctx.addOntologyToGenerate(new FileInputStream("src/data/Tag.owl"), "http://www.mit.edu/dig/ns/tag", "edu.mit.dig.model.Tag");
JastorGenerator gen = new JastorGenerator(new File("gensrc").getCanonicalFile(), ctx);
gen.run();

Make use of the class

Tag tag = edu.mit.dig.model.Tag.tagFactory.createTag(NS_PREFIX + "id_1", model);
tag.addName("A tag");
tag.addX(45);
tag.addY(32);


Page 14: Saveface - Save your Facebook content as RDF data

RestFB + RDF

• Facebook Graph API client
• Forked RestFB 1.5.4 and added RDFUtil.java
– used Java reflection to convert each FB object in RestFB to a Jena RDF model (method toRDF())
• Default domain name for Saveface data
– http://servername:port_num/data/saveface/
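
The reflection idea on this slide can be illustrated with a minimal, standard-library-only sketch. It is not the RDFUtil.java from the fork and does not use Jena: the class ReflectionToTriples, the stand-in Album class, the property namespace http://example.org/saveface/ns# and the base URL passed in are all assumptions made for illustration. It walks the fields of an object and emits one N-Triples-style line per non-null field, treating every value as a plain string literal.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class ReflectionToTriples {
    // Hypothetical stand-in for a RestFB object; not the real RestFB Album class.
    static class Album {
        String id = "123";
        String name = "Holiday";
        Long count = 12L;
    }

    // Hypothetical property namespace, chosen only for this sketch.
    static final String PROP_NS = "http://example.org/saveface/ns#";

    /** Emits one triple line per non-null instance field of obj. */
    public static List<String> toTriples(Object obj, String dataNs, String id) {
        String subject = "<" + dataNs + id + ">";
        List<String> triples = new ArrayList<>();
        for (Field f : obj.getClass().getDeclaredFields()) {
            // Skip compiler-generated and static fields.
            if (f.isSynthetic() || Modifier.isStatic(f.getModifiers())) {
                continue;
            }
            try {
                f.setAccessible(true);
                Object value = f.get(obj);
                if (value == null) {
                    continue;
                }
                // Escape backslashes and quotes so the literal stays well-formed.
                String text = value.toString().replace("\\", "\\\\").replace("\"", "\\\"");
                triples.add(subject + " <" + PROP_NS + f.getName() + "> \"" + text + "\" .");
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + f.getName(), e);
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        // Assumed base URL; the slide's pattern is http://servername:port_num/data/saveface/
        for (String t : toTriples(new Album(), "http://localhost:2020/data/saveface/", "123")) {
            System.out.println(t);
        }
    }
}
```

A real converter would map field types to typed literals and follow references between objects; this sketch only shows the field-walking step.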


Page 15: Saveface - Save your Facebook content as RDF data

Demo

• git clone [email protected]:fumingshih/savefaceDemo.git
• Log in to your Facebook account
• Go to http://developers.facebook.com/docs/reference/api/
– click on one of the links to view your content in JSON format (graph)
– copy the access_token after https://graph.facebook.com/me/friends?access_token
• Run saveface.tutorial.Exercise1.java
– paste the access_token string (* only valid for one hour)
– change the directory for storing RDF (TDB files)
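
For readers who want to see how the copied access_token fits into a Graph API request, here is a small sketch. It is not taken from savefaceDemo: the class name GraphApiUrl and the method connectionUrl are hypothetical, and only the base URL https://graph.facebook.com/ and the access_token parameter come from the slide.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class GraphApiUrl {
    // Base URL as shown on the slide.
    static final String GRAPH_BASE = "https://graph.facebook.com/";

    /** Builds e.g. https://graph.facebook.com/me/friends?access_token=... */
    public static String connectionUrl(String objectId, String connection, String accessToken) {
        try {
            // URL-encode the token so characters such as '|' survive in the query string.
            return GRAPH_BASE + objectId + "/" + connection
                    + "?access_token=" + URLEncoder.encode(accessToken, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        // Placeholder token; paste the value copied from the browser instead.
        String token = args.length > 0 ? args[0] : "PASTE_TOKEN_HERE";
        System.out.println(connectionUrl("me", "friends", token));
    }
}
```

Because the token expires after about an hour, passing it as a command-line argument rather than hard-coding it keeps the sketch reusable.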


Page 16: Saveface - Save your Facebook content as RDF data

Access SaveFace Data through Joseki

• Open /path/to/your/Joseki/joseki-config.ttl
• Three concepts in the configuration file
– services
• Services are the points that requests are sent to
• Need to specify dataset and processor
• Note that the service reference and the routing of incoming requests by URI as defined by web.xml have to align
– datasets
• can be a path to the dataset
• or use a Jena assembler description to combine different named graphs together
– processors
• set limitations on SPARQL queries (locking, no FROM/FROM NAMED)

Reference: http://www.joseki.org/configuration.html

Page 17: Saveface - Save your Facebook content as RDF data

Configuration Example (Service)

# Service 3 - SPARQL processor only handling a given dataset (TDB)
<#service3>
    rdf:type            joseki:Service ;
    rdfs:label          "SPARQL on the named graph of saveface" ;
    joseki:serviceRef   "saveface" ;   # web.xml must route this name to Joseki
    # dataset part
    joseki:dataset      <#savefacedata> ;
    # Service part.
    # This processor will not allow either the protocol,
    # nor the query, to specify the dataset.
    joseki:processor    joseki:ProcessorSPARQL_MultiDS ;
    .


Page 18: Saveface - Save your Facebook content as RDF data

Configuration Example (Dataset)

# init tdb
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .

tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .

<#savefacedata> rdf:type tdb:DatasetTDB ;
    rdfs:label "saveface dataset" ;
    # change the line below to your path to the dataset
    tdb:location "/Users/fuming/tmp/saveface_demo" ;
    .


Page 19: Saveface - Save your Facebook content as RDF data

Facebook Data as Linked Data!

• Change graph name to <urn:saveface:dataGraph:FumingShih> in the SPARQL query
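
A query against the named graph can also be sent over plain HTTP. The sketch below only builds the request URL and does not contact a server. It assumes the endpoint is http://localhost:2020/saveface, that is, Joseki on the port used earlier with the "saveface" serviceRef routed by web.xml; the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SavefaceQueryUrl {
    // Assumed endpoint: local Joseki on port 2020 with the "saveface" serviceRef.
    static final String ENDPOINT = "http://localhost:2020/saveface";

    /** Builds a SELECT query restricted to one named graph. */
    public static String selectFromGraph(String graphUri, int limit) {
        return "SELECT ?s ?p ?o WHERE { GRAPH <" + graphUri + "> { ?s ?p ?o } } LIMIT " + limit;
    }

    /** Puts the query into the "query" parameter of a SPARQL protocol GET request. */
    public static String toRequestUrl(String endpoint, String sparql) {
        try {
            return endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String query = selectFromGraph("urn:saveface:dataGraph:FumingShih", 10);
        System.out.println(query);
        System.out.println(toRequestUrl(ENDPOINT, query));
    }
}
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the query results if the service is configured as on the previous slides.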


Page 20: Saveface - Save your Facebook content as RDF data

References

• http://incubator.apache.org/jena/index.html
• http://www.joseki.org/
• Graph API
– http://developers.facebook.com/docs/reference/api/
• Jastor
– http://jastor.sourceforge.net/
• RestFB (http://restfb.com/)
– FB API browser (http://zestyping.livejournal.com/257224.html)
• SavefaceDemo
– https://github.com/fumingshih/savefaceDemo
– More on Saveface demo
• http://dice.csail.mit.edu/aintno/ui/#aintno
• http://dig.csail.mit.edu/wiki/SocialWebs_Data_Crawler/RDF_Repository_Setup
