Got bored by the relational database? Switch to a RDF store! Fabrizio Giudici Tidalwave s.a.s.

Upload: benfante

Post on 18-Dec-2014

DESCRIPTION

For decades the SQL-based relational database has been a constant in most enterprise software architectures. It has survived epochal changes, such as the shift from procedural to object-oriented programming, and the arrival of many programming languages. Yet its operating model does not map readily onto the rest of the architecture: the “OO-RDBMS impedance” is well known, and a whole family of tools (O/R mappers) was developed specifically to ease the problem. Architects and developers generally either love or hate these tools. Despite these rough edges, the SQL database is valued because it supports the ACID model, behaves predictably, uses a language that practically everyone knows, and is relatively easy to administer. Will this still hold for the next decade? Recently, many articles and blogs have begun to chip away at the SQL database's reputation for being untouchable; one of the most popular arguments concerns grid and cloud computing, for which alternatives have been proposed (e.g. Google's BigTable). This presentation, however, still assumes a traditional layered architecture, where the SQL database problem to solve is the rigidity of the data schema. We consider a real web application that represents a knowledge base whose structure must evolve over time; with a SQL database this would mean adding columns to existing tables and creating new tables, an operation that most O/R mappers do not support and that would require recompiling the code. The solution used in this scenario is an “RDF store”. RDF (Resource Description Framework) is an approach to representing information that is completely different from the relational model of SQL. It consists of homogeneous “subject-predicate-object” triples.
Therefore, the equivalent of “adding a new column” or “a new table” in RDF is simply adding a triple: not an administrative operation, but a step in the normal operational flow. Note that RDF is known as the foundation of the Semantic Web, a concept that is nevertheless outside the scope of this presentation, whose focus is describing an RDF store as a “better database”. After the introduction, practical topics are covered: existing implementations (OpenRDF), how to approach object-to-triple conversion, how to implement transactions, and so on. Architectural diagrams and code samples from a real open source application are shown, followed by a set of open problems with this architectural choice.

TRANSCRIPT

Page 1: Got bored by the relational database? Switch to a RDF store!

Got bored by the relational database?
Switch to a RDF store!

Fabrizio Giudici
Tidalwave s.a.s.

Page 2: Got bored by the relational database? Switch to a RDF store!

Who I am

● Java consultant since 1996
● Senior architect
● Java instructor for Sun since 1998
● Member of the NetBeans Dream Team
● Technical Writer, Blogger at Java.Net, DZone
● http://weblogs.java.net/blog/fabriziogiudici/
● http://www.tidalwave.it/people

Page 3: Got bored by the relational database? Switch to a RDF store!

Where I am using RDF stores

http://bluemarine.tidalwave.it

Page 4: Got bored by the relational database? Switch to a RDF store!

Agenda

● Why the RDBMS?
● RDBMS issues
● The Semantic Model
● OpenSesame and Elmo
● A few code samples
● Conclusion

Page 5: Got bored by the relational database? Switch to a RDF store!

RDBMSs are everywhere

Page 6: Got bored by the relational database? Switch to a RDF store!

What do we expect from an RDBMS?

● Persistence
● Reliability
● Transactions
● Integrability
● Manageability

Page 7: Got bored by the relational database? Switch to a RDF store!

Lack of cohesion

● Do we really need an RDBMS for those things?
● No, we don't
  – Persistence and transactions are good
  – The specific relational schema is evil
● RDBMSs sell all of those things in a single package

Page 8: Got bored by the relational database? Switch to a RDF store!

ER-OO impedance

● Entity-Relationship is different from OO
  – Primary keys
  – No inheritance
  – No behaviour
  – Normalization rules
  – Relationships through foreign keys

Page 9: Got bored by the relational database? Switch to a RDF store!

ORMs

● Tools to minimize the ER-OO impedance
● Java has a standard API: JPA
  – Hibernate, TopLink, EclipseLink, OpenJPA
  – Tries to abstract the database à la Java
● Good, but the RDBMS still has to be designed
● And maintained

Page 10: Got bored by the relational database? Switch to a RDF store!

Can we get rid of the relational database?

Page 11: Got bored by the relational database? Switch to a RDF store!

The Semantic Model

● Semantic Technology != Semantic Web
● RDF: Resource Description Framework
● “Triples” are the atomic information item
● Subject / predicate / object
  – Java / is-a / programming-language
  – Fabrizio / is-member-of / NetBeans Dream Team
  – Verona / is-part-of / Veneto
  – Verona / has-plate / “VR”
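The triples listed above can be written down directly as data. The following is a minimal sketch, assuming Java 16 or later for records; the `Triple` record and `TripleDemo` class are illustrative names, not part of any RDF library, and plain strings stand in for real resource identifiers.

```java
import java.util.List;

class TripleDemo {
    // Illustrative only: three plain strings stand in for subject, predicate and object.
    record Triple(String subject, String predicate, String object) {
        @Override
        public String toString() {
            return subject + " / " + predicate + " / " + object;
        }
    }

    // The four statements from the slide, expressed as data.
    static List<Triple> slideTriples() {
        return List.of(
            new Triple("Java", "is-a", "programming-language"),
            new Triple("Fabrizio", "is-member-of", "NetBeans Dream Team"),
            new Triple("Verona", "is-part-of", "Veneto"),
            new Triple("Verona", "has-plate", "VR"));
    }

    public static void main(String[] args) {
        slideTriples().forEach(System.out::println);
    }
}
```

Every fact has the same three-part shape, whatever it describes; that uniformity is what later removes the need for a fixed schema.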

Page 12: Got bored by the relational database? Switch to a RDF store!

The Semantic Model

● The subject is a resource
● The predicate is a property
● The object is a value
● A value is a resource or a primitive type
● Resources and properties are identified by URL/URN
  – Just a naming scheme
  – Not necessarily web-related
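The roles above can be modeled with a few small types. This is a sketch under stated assumptions: `Value`, `Resource`, `Literal` and `Statement` are hypothetical names chosen here (real RDF libraries have their own equivalents), and the `urn:example:` and `example.org` identifiers are made up for illustration.

```java
import java.net.URI;

class ValueDemo {
    // An object value is either a resource (named by a URI) or a primitive literal.
    interface Value {}
    record Resource(URI uri) implements Value {}
    record Literal(Object value) implements Value {}

    // Subject and predicate are always resources; only the object may be a literal.
    record Statement(Resource subject, Resource predicate, Value object) {}

    static Resource res(String uri) {
        return new Resource(URI.create(uri));
    }

    public static void main(String[] args) {
        // Hypothetical identifiers: a URN names a resource just as well as a URL does.
        Statement partOf = new Statement(
            res("urn:example:city:Verona"),
            res("http://example.org/terms#is-part-of"),
            res("urn:example:region:Veneto"));
        Statement plate = new Statement(
            res("urn:example:city:Verona"),
            res("http://example.org/terms#has-plate"),
            new Literal("VR"));
        System.out.println(partOf);
        System.out.println(plate);
    }
}
```

Nothing in the sketch ever dereferences the identifiers, which matches the point that they are just a naming scheme.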

Page 13: Got bored by the relational database? Switch to a RDF store!

Formal representation

● RDF is not tied to XML
● XML is just one of the ways to represent RDF
  – XML-RDF, unfortunately often referred to simply as RDF
● Notation 3 (N3) is another popular representation
  – Much more human-readable
● Other formats exist
● An RDF representation is often referred to as a “serialization”
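To make the idea of a serialization concrete, here is a small sketch that writes statements in N-Triples, one of the "other formats": a line-based syntax closely related to N3 in which each statement becomes `<subject> <predicate> <object> .`. The `NTriplesDemo` class and the `example.org` URIs are assumptions made for illustration, and only the simplest literal escaping is handled.

```java
class NTriplesDemo {
    // Renders one statement as an N-Triples line.
    // Resources are wrapped in angle brackets, plain literals in double quotes.
    static String line(String subjectUri, String predicateUri, String object, boolean objectIsLiteral) {
        String renderedObject = objectIsLiteral
            ? "\"" + object.replace("\\", "\\\\").replace("\"", "\\\"") + "\""
            : "<" + object + ">";
        return "<" + subjectUri + "> <" + predicateUri + "> " + renderedObject + " .";
    }

    public static void main(String[] args) {
        // Hypothetical URIs for the Verona examples used earlier in the slides.
        System.out.println(line("http://example.org/Verona",
                                "http://example.org/is-part-of",
                                "http://example.org/Veneto", false));
        System.out.println(line("http://example.org/Verona",
                                "http://example.org/has-plate",
                                "VR", true));
    }
}
```

The same two statements could equally be written as XML; the triples themselves do not change, only their textual form.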

Page 14: Got bored by the relational database? Switch to a RDF store!

(XML-)RDF is near to you

● RSS/RDF
● Dublin Core
● XMP by Adobe

Page 15: Got bored by the relational database? Switch to a RDF store!

Compared to RDBMs

● There's no fixed schema
  – Everything is a triple
  – “AAA slogan”: Anyone can say Anything about Any topic
● Adding new data types is adding triples
  – No need to add / alter tables
  – Maintenance is just updating data
● Databases can be distributed (federations)
  – Can be merged by just copying triples together
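The points above can be sketched with a store modeled as nothing more than a set of triples. This is an illustration in plain Java (16 or later), not how Sesame is implemented; `SchemaFreeDemo`, `Triple` and `merge` are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class SchemaFreeDemo {
    record Triple(String subject, String predicate, String object) {}

    // Merging two stores is a set union: copy the triples together.
    static Set<Triple> merge(Set<Triple> a, Set<Triple> b) {
        Set<Triple> merged = new LinkedHashSet<>(a);
        merged.addAll(b); // identical triples collapse into one
        return merged;
    }

    public static void main(String[] args) {
        Set<Triple> storeA = new LinkedHashSet<>();
        storeA.add(new Triple("Verona", "is-part-of", "Veneto"));
        // The equivalent of "adding a column": a new predicate is just one more triple.
        storeA.add(new Triple("Verona", "has-plate", "VR"));

        Set<Triple> storeB = new LinkedHashSet<>();
        storeB.add(new Triple("Verona", "is-part-of", "Veneto"));
        storeB.add(new Triple("Java", "is-a", "programming-language"));

        merge(storeA, storeB).forEach(System.out::println);
    }
}
```

In this sketch the first use of `has-plate` needs no declaration anywhere, which is the contrast with `ALTER TABLE` that the slide draws.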

Page 16: Got bored by the relational database? Switch to a RDF store!

What about performance?

● Not as optimized as SQL
● Tuning know-how is not as widespread as for SQL
● Some missing parts
  – E.g. Sesame still lacks select count(*)

Page 17: Got bored by the relational database? Switch to a RDF store!

OpenSesame, Elmo

● Popular Java infrastructure for RDF
  – FLOSS
  – http://www.openrdf.org
● Elmo provides JPA-like operations
  – Annotations
  – Specific API or even a subset of JPA

Page 18: Got bored by the relational database? Switch to a RDF store!

A simple code example

● Note the use of standard ontologies

import org.openrdf.elmo.annotations.rdf;

@rdf(GeoVocabulary.URI_GEO_LOCATION)
public class GeoLocation
  {
    @rdf("http://www.w3.org/2003/01/geo/wgs84_pos#lat")
    private Double latitude;

    @rdf("http://www.w3.org/2003/01/geo/wgs84_pos#long")
    private Double longitude;

    @rdf("http://www.w3.org/2003/01/geo/wgs84_pos#alt")
    private Double altitude;

    @rdf("http://www.tidalwave.it/rdf/geo/2009/02/22#code")
    private String code;

    // Getters and setters (setLatitude(), setLongitude(), setCode(), ...) omitted
  }

Page 19: Got bored by the relational database? Switch to a RDF store!

A simple code example

● Declare persistent classes in META-INF/org.openrdf.elmo.concepts
● Choose a store
  – Memory
  – Memory backed by file
  – Database (transactional)

// Memory store backed by files under /tmp/RDFStore
Repository repository = new SailRepository(new MemoryStore(new File("/tmp/RDFStore")));
repository.initialize();
ElmoModule module = new ElmoModule();
SesameManagerFactory factory = new SesameManagerFactory(module, repository);
SesameManager em = factory.createElmoManager();

Page 20: Got bored by the relational database? Switch to a RDF store!

Use as JPA EntityManager

em.getTransaction().begin();
GeoLocation genova = new GeoLocation();
genova.setLatitude(45.0);
genova.setLongitude(9.0);
genova.setCode("GE");

em.persist(genova);
em.getTransaction().commit();
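Conceptually, persisting an annotated object amounts to turning each annotated field into a triple. The sketch below illustrates that object-to-triple conversion with plain reflection; it is not Elmo's implementation. The `@Rdf` annotation is a hypothetical stand-in for Elmo's `@rdf`, the `urn:example:genova` subject is made up, and a real mapper would also record the object's type.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class MappingDemo {
    // Hypothetical stand-in for Elmo's @rdf annotation, defined here for illustration.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Rdf { String value(); }

    static final String WGS84 = "http://www.w3.org/2003/01/geo/wgs84_pos#";

    static class GeoLocation {
        @Rdf(WGS84 + "lat") Double latitude;
        @Rdf(WGS84 + "long") Double longitude;
        @Rdf("http://www.tidalwave.it/rdf/geo/2009/02/22#code") String code;
    }

    record Triple(String subject, String predicate, Object object) {}

    // Emits one triple per annotated, non-null field; the caller supplies the subject.
    static List<Triple> toTriples(String subject, Object bean) {
        List<Triple> triples = new ArrayList<>();
        for (Field field : bean.getClass().getDeclaredFields()) {
            Rdf rdf = field.getAnnotation(Rdf.class);
            if (rdf == null) {
                continue; // unannotated fields are not persisted in this sketch
            }
            try {
                field.setAccessible(true);
                Object value = field.get(bean);
                if (value != null) {
                    triples.add(new Triple(subject, rdf.value(), value));
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        GeoLocation genova = new GeoLocation();
        genova.latitude = 45.0;
        genova.longitude = 9.0;
        genova.code = "GE";
        toTriples("urn:example:genova", genova).forEach(System.out::println);
    }
}
```

Under this view, adding a field to the class only produces additional triples on the next save; nothing about the store itself has to change.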

Page 21: Got bored by the relational database? Switch to a RDF store!

Queries

● There are specific query languages
● SPARQL is one of the most popular
● Similar to SQL, but with triples in place of tables

PREFIX wgs84: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?location
WHERE { ?location wgs84:lat ?lat }
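The `WHERE` clause above is a triple pattern: `?location` and `?lat` are variables, while `wgs84:lat` is fixed. A rough sketch of what matching a single pattern means, in plain Java over an in-memory list, follows. `PatternDemo` and `match` are hypothetical names, `null` plays the role of a variable, and a real SPARQL engine does much more (joins across patterns, filters, indexing).

```java
import java.util.List;
import java.util.stream.Collectors;

class PatternDemo {
    record Triple(String subject, String predicate, String object) {}

    // Returns the triples matching the pattern; a null argument matches anything,
    // like a SPARQL variable.
    static List<Triple> match(List<Triple> store, String s, String p, String o) {
        return store.stream()
            .filter(t -> (s == null || s.equals(t.subject()))
                      && (p == null || p.equals(t.predicate()))
                      && (o == null || o.equals(t.object())))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Triple> store = List.of(
            new Triple("genova", "wgs84:lat", "45.0"),
            new Triple("genova", "wgs84:long", "9.0"),
            new Triple("Verona", "is-part-of", "Veneto"));

        // Equivalent in spirit to: SELECT ?location WHERE { ?location wgs84:lat ?lat }
        match(store, null, "wgs84:lat", null)
            .forEach(t -> System.out.println(t.subject()));
    }
}
```

The pattern is evaluated against triples rather than against named tables, which is the sense in which SPARQL resembles SQL with "triples in place of tables".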

Page 22: Got bored by the relational database? Switch to a RDF store!

Running a query

em.getTransaction().begin();
String queryString =
    "PREFIX wgs84: <http://www.w3.org/2003/01/geo/wgs84_pos#>\n" +
    "SELECT ?location WHERE \n" +
    " {\n" +
    " ?location a ?type.\n" +
    " ?location wgs84:lat ?lat\n" +
    " }";
// Bind ?type to the GeoLocation concept and ?lat to 45.0
final ElmoQuery query = em.createQuery(queryString).
    setType("type", GeoLocation.class).
    setParameter("lat", 45.0);

final List<GeoLocation> result = query.getResultList();

for (GeoLocation l : result)
  {
    System.err.println(l);
  }
em.getTransaction().commit();

Page 23: Got bored by the relational database? Switch to a RDF store!

Scratching the surface

● Elmo is powerful
● Supports advanced constructs
  – Objects with “multiple personality”
  – Mixins

Page 24: Got bored by the relational database? Switch to a RDF store!

Open issues

● OpenSesame doesn't support all databases
● Lack of experience
  – Programming skills
  – Maintenance
  – Tuning
  – Managerial culture
● Not widespread
● Performance?

Page 25: Got bored by the relational database? Switch to a RDF store!

Conclusion

● RDBMSs are mainstream, but old
● They lead to rigid schemata and don't fit the OO model
● It's possible to use something different
● RDF stores can be a viable alternative
