Semantic Technology - Notes From a Beginner
Nils Jacob Berland - PhD
X2X Maritime AS
Theano AS
About NJ Berland
● PhD in informatics
○ Stochastic optimization and parallel computations
● Previous work experience
○ Telecom - Security and Intelligent buildings
○ Oil & Gas logistics - tracking
○ Norwegian University of Science and Technology
○ Molde University College - Oil & Gas logistics
NJ Berland at present
X2X Maritime AS - CTO
● Visualization of oil and gas logistics
Theano AS - CTO
● Stakeholder analysis
Warning!
● I am NOT an expert in Semantic Technology
● I try to build systems that simply work
● And we need good, existing building blocks to help us do our job
About the presentation
● Talk about how we got started with ST
● Show some examples
● Stimulate your curiosity
Two demos
● Oil & Gas logistics
● Decision makers
Both were created to explore how ST works and how the technology can be used in our products
O&G Logistics
● Containers
○ Are owned by one company
○ Are rented by another company for a period
○ Are handled by other companies for a period
○ Have certificates managed by maybe several companies
○ And are inspected by other companies
O&G Logistics
● Events
○ Can be generated by almost anyone
○ Events are basically owned by the renter
○ The renter can in principle decide who to share events with
○ Events are added continuously
● Rentals
○ A rental has a start and, eventually, an end
O&G Logistics
● What is difficult about this?
○ Manage temporal permissions
○ Collect data from various systems
○ View temporal relationships
○ Explore relationships
○ Queries like "who rented a unit at dd:mm:yyyy"?
O&G Logistics
● A demo implementation
○ A simulator in Python to generate events
○ A basic ontology
○ Jena for triples
○ Node.js to route data from Jena to web clients
○ Some Python "middleware" for SPARQL
● Demo of "LogisticsHub light"
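The slides do not show the Python "middleware", so the following is only a sketch of one thing such a layer might do: flatten a result document in the standard SPARQL 1.1 JSON results format into plain rows that a web client can consume. The function name flatten_bindings and the sample data are assumptions made for illustration.

```python
import json


def flatten_bindings(sparql_json):
    """Turn a SPARQL 1.1 JSON result document into a list of plain dicts."""
    doc = json.loads(sparql_json)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # Unbound variables are absent from a binding; map them to None.
        rows.append({
            var: binding[var]["value"] if var in binding else None
            for var in variables
        })
    return rows


# Hypothetical response for a rental query (not taken from the demo).
sample = json.dumps({
    "head": {"vars": ["rental", "renter", "enddate"]},
    "results": {"bindings": [{
        "rental": {"type": "uri", "value": "http://ontology.org/rental_1"},
        "renter": {"type": "uri", "value": "http://ontology.org/company_a"},
    }]},
})
rows = flatten_bindings(sample)
```

Each row is an ordinary dict, so it can be passed straight to json.dumps before being routed on to the browser.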
Code to insert events
...
# Describe the event; a random UUID makes every event id unique.
data = {
    'id': 'event_' + str(uuid.uuid4()),
    'timestamp': timestamp,
    'type': type,
    'ccu': ccu,
    'location': location
}
# Fill the SPARQL update template with the event fields.
sparql_wrapper.setQuery("""
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX : <http://ontology.org/>
    INSERT DATA { :%(id)s rdf:type :%(type)s;
                  :timeStamp "%(timestamp)s";
                  :hasCCU :%(ccu)s;
                  :atLocation :%(location)s;
    } """ % data)
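Because the snippet above pastes values straight into the update string, a slightly more defensive variant may be worth considering. The sketch below is not the demo's code: the helper name build_insert_event, the restriction on local names, and the choice to store the timestamp as a typed xsd:dateTime literal are all assumptions.

```python
import re
import uuid
from datetime import datetime, timezone

# Assumption: event types, CCUs and locations are simple local names.
LOCAL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

INSERT_TEMPLATE = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX : <http://ontology.org/>
INSERT DATA { :%(id)s rdf:type :%(type)s;
              :timeStamp "%(timestamp)s"^^xsd:dateTime;
              :hasCCU :%(ccu)s;
              :atLocation :%(location)s .
}
"""


def build_insert_event(event_type, ccu, location, when=None):
    """Return a SPARQL INSERT DATA string for one event (hypothetical helper)."""
    for name in (event_type, ccu, location):
        # Reject anything that could break out of the triple pattern.
        if not LOCAL_NAME.fullmatch(name):
            raise ValueError("unsafe local name: %r" % (name,))
    when = when or datetime.now(timezone.utc)
    data = {
        "id": "event_" + uuid.uuid4().hex,  # hex form avoids hyphens in the name
        "type": event_type,
        "timestamp": when.isoformat(),
        "ccu": ccu,
        "location": location,
    }
    return INSERT_TEMPLATE % data


update = build_insert_event(
    "Loaded", "ccu_42", "Mongstad",
    datetime(2014, 5, 1, 12, 0, tzinfo=timezone.utc),
)
```

The returned string can be handed to the same setQuery call as before. A typed timestamp also makes later date comparisons in SPARQL filters behave as date comparisons rather than string comparisons.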
Code to view active rentals
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX : <http://ontology.org/>
SELECT * WHERE
{
    ?rental rdf:type :Rental;
            :hasRenter ?renter;
            :hasStart ?startdate;
            :hasCCU ?ccu.
    FILTER NOT EXISTS {?rental :hasEnd ?enddate}.
}
ORDER BY ?startdate
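The same pattern can be extended to the earlier question of who rented a unit at a given time. The sketch below only builds the query string. It reuses the predicates from the query above, but it assumes that :hasStart and :hasEnd hold xsd:dateTime literals, which the slides do not confirm. The function name renter_at_query is hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumption: CCU identifiers are simple local names in the ontology.
LOCAL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

QUERY_TEMPLATE = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX : <http://ontology.org/>
SELECT ?renter ?startdate ?enddate WHERE {
  ?rental rdf:type :Rental;
          :hasRenter ?renter;
          :hasStart ?startdate;
          :hasCCU :%(ccu)s.
  OPTIONAL { ?rental :hasEnd ?enddate }
  FILTER (?startdate <= "%(when)s"^^xsd:dateTime)
  FILTER (!bound(?enddate) || ?enddate > "%(when)s"^^xsd:dateTime)
}
"""


def renter_at_query(ccu, when):
    """Build a query for the rental covering `when` for one CCU."""
    if not LOCAL_NAME.fullmatch(ccu):
        raise ValueError("unsafe local name: %r" % (ccu,))
    return QUERY_TEMPLATE % {"ccu": ccu, "when": when.isoformat()}


query = renter_at_query(
    "ccu_42", datetime(2014, 5, 1, 12, 0, tzinfo=timezone.utc)
)
```

A rental matches if it started at or before the given time and either has no end yet or ended after it, so still-active rentals are included.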
O&G Logistics
● Evaluation
○ Fun to make
○ A working demo required some weeks' work with a few good people
○ Some of the queries are in practice hard problems for SQL
○ For graph databases they are quite easy
○ Extremely simple to expand and add new systems
○ Jena has proved very robust
○ Easy to port to e.g. AllegroGraph or other "big triple stores"
○ Gave us a lot of insight into LogisticsHub
Decision Makers
● Show how the parliament of Norway and bureaucrats are connected by gender, place, topics, etc.
● Combine data from public sources (http://data.stortinget.no/eksport) and our own data
Decision Makers
● A demo implementation
○ A basic ontology - based on FOAF
○ Jena triple store
○ Python pulls from public sources and pushes to Jena
○ Python pulls our own data and pushes to Jena
○ Node.js to route data from Jena to web clients
○ Some Python "middleware" for SPARQL
○ D3.js (d3js.org) for visualization
○ Data is pulled from Jena when new configurations are chosen
● Demo
Decision Makers
● Evaluation
○ Fun to make
○ Less work than the Logistics demo - with the right people!
○ Easy to expand
■ With other data sources
■ Other data types
○ Jena works nicely
What is cool about ST?
● Get started fast - really!
● SPARQL - once you understand it!
● Due to the graph structure, some queries are simple and fast
● Easy to add new data and expand
● An instance can be a member of many classes (I can be a programmer and a dancer at the same time)!
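The last point can be shown without a triple store at all. The sketch below is a toy: triples are plain Python tuples, and the names :nj, :anna, :Programmer and :Dancer are invented for illustration.

```python
RDF_TYPE = "rdf:type"

# A tiny in-memory "store": a set of (subject, predicate, object) tuples.
triples = {
    (":nj", RDF_TYPE, ":Programmer"),
    (":nj", RDF_TYPE, ":Dancer"),
    (":anna", RDF_TYPE, ":Programmer"),
}


def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# One instance, several classes: no schema change was needed to add :Dancer.
classes_of_nj = [o for _, _, o in match(triples, s=":nj", p=RDF_TYPE)]
programmers = [s for s, _, _ in match(triples, p=RDF_TYPE, o=":Programmer")]
```

Adding a third class for :nj is just one more tuple, which is the property the bullet is pointing at.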
Always consider hybrids
● SQL or NoSQL for storage
○ e.g. MongoDB for event logs
● RDF for interesting queries
● JSON for APIs
● SPARQL "construct" to extract data for in-memory analysis
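As a rough illustration of the last bullet, the triples returned by a CONSTRUCT query can be loaded into ordinary Python structures and analysed there. The triples below are invented stand-ins for such a result, and the helper names are assumptions.

```python
from collections import defaultdict


def build_adjacency(triples):
    """Build an undirected adjacency map from (s, p, o) triples."""
    graph = defaultdict(set)
    for s, _p, o in triples:
        graph[s].add(o)
        graph[o].add(s)
    return graph


def most_connected(graph):
    """Return the node with the most neighbours (ties broken by name)."""
    return max(sorted(graph), key=lambda node: len(graph[node]))


# Hypothetical CONSTRUCT output: who sits on which committee.
constructed = [
    (":rep_a", ":memberOf", ":committee_x"),
    (":rep_b", ":memberOf", ":committee_x"),
    (":rep_c", ":memberOf", ":committee_x"),
    (":rep_b", ":memberOf", ":committee_y"),
]
graph = build_adjacency(constructed)
hub = most_connected(graph)
```

The triple store does the selection, and the in-memory structure is then free to be handed to whatever analysis or visualization code suits the job.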
So where do you start?
● Forget the buzzwords
● Read "Programming the Semantic Web" by Toby Segaran, Colin Evans and Jamie Taylor - preferably do the examples yourself
● Make some working systems that use triple stores
● You are never alone - there are many free tools available!
The big conclusions
● Google Trends
○ "Semantic Web" is not so interesting anymore
○ "Linked Data" is a stable term
○ SPARQL is rising
○ Neo4j is growing faster
● ST is, however
○ Fun
○ Surprisingly simple to use
○ Very efficient in some situations
○ Over-hyped
The big conclusions (cont)
● ST enables
○ Less monolithic systems
○ Faster prototyping
○ Inferencing to create "data that is not there"
○ Linking data!
● Test out alternatives and hybrid solutions
○ Pure graph databases
○ Schema-less NoSQL - e.g. MongoDB
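To make "data that is not there" concrete, here is a minimal sketch of one inference rule: if an instance has a type and that type is a subclass of another class, the instance also has the parent type. A real reasoner such as the ones shipped with Jena does far more. The class and instance names below are invented.

```python
def infer_types(triples):
    """Apply the rdfs:subClassOf typing rule until nothing new is derived."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        snapshot = list(inferred)
        for s, p, o in snapshot:
            if p != "rdf:type":
                continue
            for child, q, parent in snapshot:
                if q == "rdfs:subClassOf" and child == o:
                    new = (s, "rdf:type", parent)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred


# Hypothetical facts: only the most specific type is stated explicitly.
facts = {
    (":OffshoreContainer", "rdfs:subClassOf", ":Container"),
    (":Container", "rdfs:subClassOf", ":CargoUnit"),
    (":ccu_42", "rdf:type", ":OffshoreContainer"),
}
derived = infer_types(facts) - facts
```

Here derived holds two triples that were never inserted: :ccu_42 is a :Container and a :CargoUnit.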