linked statistical open government data
TRANSCRIPT
Orebro, 28 April 2015
Linked Statistical Open Government Data
Efthimios Tambouris, University of Macedonia and ITI-CERTH, Greece
Table of Contents
(Some aspects of) Open Data
Statistical Data and Linked Open Data Technologies
Linked Statistical Data Lifecycle
Tools for Linked Statistical Open Government Data
The OpenCube Project
Conclusions
Open Government Data (OGD)
More than 180 Open Government Data portals around the globe provide data that “can be freely used, reused and redistributed by anyone”
Huge potential for transparency and economic growth
First Step: understand the big picture
1. Understanding OGD: A Classification Scheme
Kalampokis, E., Tambouris, E., Tarabanis, K.: A Classification Scheme for Open Government Data: Towards Linking Decentralized Data. International Journal of Web Engineering and Technology 6(3), 266–285 (2011)
24 official OGD initiatives were classified based on that framework in 2010 (this classification may now have changed)
3. A Stage Model for OGD + OSMD
Kalampokis, E., Tambouris, E. and Tarabanis, K., Open Government Data: A Stage Model. In: M. Janssen et al. (Eds): EGOV2011. LNCS 6846, 235-246, 2011.
2. Understanding Open Social Media Data: An Analysis Framework
E. Kalampokis, E. Tambouris and K. Tarabanis (2013) Understanding the Predictive Power of Social Media, Internet Research, Vol.23, No.5, pp. 544-559
Based on the review of ~60 relevant scientific publications
Problem
The promises of OGD initiatives remain largely unfulfilled
Next Step: dig into the data themselves
Nature of Open Gov Data
Open (Gov) Data are very important for the EU
A large share of Open Data is statistical, e.g. 6875 of the 7682 datasets on the EU Open Data Portal are statistical in nature.
Statistical data is often organized as data cubes, where each cell contains a measure described by a number of dimensions.
Data Cube
[Figure: a data cube, with the labels "Dimensions", "Hierarchy" and "Measure"]
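To make the data cube idea concrete, here is a minimal sketch in Python. It assumes a hypothetical cube with three dimensions (region, year, sex) and one measure (an unemployment rate); the names and figures are invented for illustration and do not come from the talk.

```python
from typing import Dict, Tuple

# Hypothetical dimensions for illustration: (region, year, sex).
DIMENSIONS = ("region", "year", "sex")

# Each cell maps one value per dimension to a single measure value.
# The figures below are made up for the example.
Cube = Dict[Tuple[str, int, str], float]

unemployment: Cube = {
    ("North", 2013, "F"): 8.1,
    ("North", 2013, "M"): 7.4,
    ("North", 2014, "F"): 7.9,
    ("North", 2014, "M"): 7.0,
    ("South", 2013, "F"): 11.2,
    ("South", 2013, "M"): 10.5,
    ("South", 2014, "F"): 10.8,
    ("South", 2014, "M"): 9.9,
}


def dimension_values(cube: Cube, dimension: str) -> list:
    """Return the sorted distinct values that a dimension takes in the cube."""
    index = DIMENSIONS.index(dimension)
    return sorted({key[index] for key in cube})


def lookup(cube: Cube, **coords) -> float:
    """Fetch the measure of the cell identified by one value per dimension."""
    key = tuple(coords[name] for name in DIMENSIONS)
    return cube[key]


print(dimension_values(unemployment, "year"))
print(lookup(unemployment, region="South", year=2014, sex="F"))
```

A cell is addressed by exactly one value per dimension, which is what makes cubes from different sources hard to combine unless their dimensions and values line up.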
Focus
Users frequently want to blend and combine statistical data from multiple sources
However, these data usually reside in files and databases (data silos) that are hard to combine
Linked Data
Linked Data has the potential to enable combining and performing analytics on top of disparate and previously isolated statistical data
The RDF Data Cube Vocabulary has been proposed for modelling multi-dimensional data as RDF graphs.
However, tools for handling linked data cubes:
are few and scattered
have not been tested under real-life conditions
The potential of using LOD in statistical data analysis remains unexploited
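As an illustration of how the RDF Data Cube Vocabulary models a cube, the sketch below writes a few observations as Turtle using only the Python standard library. The `qb:` namespace, `qb:DataSet`, `qb:Observation` and `qb:dataSet` come from the vocabulary; the `ex:` namespace, the property names and the figures are hypothetical. It is a simplified sketch: a complete cube would also declare a data structure definition, and dimension values would typically be URIs rather than plain literals.

```python
QB = "http://purl.org/linked-data/cube#"   # RDF Data Cube Vocabulary namespace
EX = "http://example.org/"                 # hypothetical namespace for this sketch


def cube_to_turtle(dataset, rows, dimension_names, measure_name):
    """Serialise rows as qb:Observation resources in Turtle.

    Simplification: dimension values are written as plain string literals,
    and the data structure definition of the cube is omitted.
    """
    lines = [
        f"@prefix qb: <{QB}> .",
        f"@prefix ex: <{EX}> .",
        "",
        f"ex:{dataset} a qb:DataSet .",
    ]
    for number, row in enumerate(rows, start=1):
        lines.append("")
        lines.append(f"ex:obs{number} a qb:Observation ;")
        lines.append(f"    qb:dataSet ex:{dataset} ;")
        for name in dimension_names:
            lines.append(f'    ex:{name} "{row[name]}" ;')
        lines.append(f"    ex:{measure_name} {row[measure_name]} .")
    return "\n".join(lines)


# Made-up example rows: two dimensions (region, year) and one measure (rate).
rows = [
    {"region": "North", "year": 2014, "rate": 7.9},
    {"region": "South", "year": 2014, "rate": 10.8},
]
turtle = cube_to_turtle("unemployment", rows, ["region", "year"], "rate")
print(turtle)
```

Each cell of the cube becomes one observation resource that points back to its dataset, so observations from different publishers can be queried together once they share dimension and measure properties.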
The OpenCube project
OpenCube is a 2-year project funded by the EU within FP7
The project aims to develop and test processes and tools for managing statistical linked open data.
The results will:
Help data publishers create linked data cubes from legacy formats
Empower data users to browse, visualise, link, expand and analyse data cubes
Enable analyses that were not possible before (merging data cubes at Web scale)
Linked Statistical Data Lifecycle
We propose a lifecycle for statistical linked data
The lifecycle is divided into two phases: publish and reuse (or consume)
The lifecycle prescribes the steps that raw data cubes* should go through in order to create value.
OpenCube also develops tools to support the whole lifecycle of linked statistical data.
* We assume statistical data is organized as data cubes, where each cell contains a measure described by a number of dimensions.
OpenCube Toolkit
Publishing components: TARQL extension, D2RQ/R2RML-QB extension, JSON-stat, Grafter
Consuming components: OpenCube Browser, OpenCube MapView, R Analysis Chart
Linking components
Developed using the open source Information Workbench as the underlying linked data management platform
License scheme: OpenCube components are provided under open source licenses (check http://opencube-toolkit.eu)
However, commercial solutions are also offered by consortium members
Publishing Components
Consume: OpenCube Browser
It enables the exploration of an RDF data cube by presenting a two-dimensional slice of the cube as a table.
The slice is created by setting a fixed value for each dimension that is not presented in the table.
The browser lets the user:
Summarize observations across a dimension (dimension reduction)
Change the axes of the table
Change the language
Change the fixed values
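The slicing and dimension reduction described above can be sketched in plain Python. This is not the OpenCube Browser's implementation, only an illustration of the two operations on a hypothetical in-memory cube with dimensions (region, year, sex) and invented population counts.

```python
from collections import defaultdict

# Hypothetical cube: dimensions (region, year, sex), measure = population count.
DIMENSIONS = ("region", "year", "sex")
cube = {
    ("North", 2013, "F"): 120, ("North", 2013, "M"): 115,
    ("North", 2014, "F"): 122, ("North", 2014, "M"): 117,
    ("South", 2013, "F"): 200, ("South", 2013, "M"): 190,
    ("South", 2014, "F"): 204, ("South", 2014, "M"): 195,
}


def slice_table(cube, row_dim, col_dim, fixed):
    """Return a 2D table {row value: {column value: measure}}.

    `fixed` gives one value for every dimension not shown in the table.
    """
    r, c = DIMENSIONS.index(row_dim), DIMENSIONS.index(col_dim)
    table = defaultdict(dict)
    for key, measure in cube.items():
        if all(key[DIMENSIONS.index(d)] == v for d, v in fixed.items()):
            table[key[r]][key[c]] = measure
    return {row: dict(cols) for row, cols in table.items()}


def reduce_dimension(cube, dimension, aggregate=sum):
    """Summarize observations across one dimension, removing it from the keys."""
    i = DIMENSIONS.index(dimension)
    groups = defaultdict(list)
    for key, measure in cube.items():
        groups[key[:i] + key[i + 1:]].append(measure)
    return {key: aggregate(values) for key, values in groups.items()}


# Rows are regions, columns are years, and sex is fixed to "F".
print(slice_table(cube, "region", "year", {"sex": "F"}))
# Summing over sex leaves a cube with dimensions (region, year).
print(reduce_dimension(cube, "sex"))
```

Changing the axes of the table corresponds to passing different `row_dim` and `col_dim` arguments, and changing the fixed values corresponds to a different `fixed` mapping. Summation is only meaningful for additive measures such as counts; a rate would need a different aggregate.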
Consume: OpenCube MapView
Visualization of RDF data cubes on a map.
It supports markers, bubble maps and choropleth maps.
Consume: Integration with R
Visualisation of analysis results (charts & tables)
Reuse of analysis results: preserving R output as linked data
Consume: Other Visualizations
Analytics and Reporting; Visualization and Exploration
Stock chart
Linking Statistical Data
Enables performing analytics on top of combined data cubes
Steps:
1. Select a data cube
2. Discover cubes on the Web of Linked Data that have a compatible structure, i.e. cubes with dimensions, measures, etc. that can expand the initial cube
3. Create expanded views of the initial cube
4. Consume the new cube(s)
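The four linking steps can be sketched as follows. The compatibility rule used here (same set of dimensions, different measure) is a simplifying assumption made for this example, not the rule the OpenCube linking components apply; the `Cube` class, cube names and figures are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cube:
    name: str
    dimensions: tuple   # ordered dimension names
    measure: str        # name of the single measure
    cells: dict         # {tuple of dimension values: measure value}


def is_compatible(initial, candidate):
    """Assumed rule: same dimensions, but a measure the initial cube lacks."""
    return (set(candidate.dimensions) == set(initial.dimensions)
            and candidate.measure != initial.measure)


def discover(initial, catalogue):
    """Step 2: keep only the cubes that can expand the initial cube."""
    return [cube for cube in catalogue if is_compatible(initial, cube)]


def expand(initial, other):
    """Step 3: build a view holding both measures for every shared cell."""
    # Reorder the other cube's keys to follow the initial cube's dimension order.
    order = [other.dimensions.index(d) for d in initial.dimensions]
    realigned = {tuple(key[i] for i in order): value
                 for key, value in other.cells.items()}
    return {key: {initial.measure: value, other.measure: realigned[key]}
            for key, value in initial.cells.items() if key in realigned}


# Step 1: select an initial cube (made-up figures).
unemployment = Cube("unemployment", ("region", "year"), "unemployment_rate",
                    {("North", 2014): 7.5, ("South", 2014): 10.3})
gdp = Cube("gdp", ("year", "region"), "gdp_per_capita",
           {(2014, "North"): 31000, (2014, "South"): 24000})
crime = Cube("crime", ("region", "offence"), "incidents",
             {("North", "theft"): 120})

compatible = discover(unemployment, [gdp, crime])
expanded = expand(unemployment, compatible[0])
# Step 4: consume the expanded view.
print([cube.name for cube in compatible])
print(expanded)
```

In a linked data setting the comparison would be made on the URIs of dimensions, measures and code lists rather than on local names, which is what allows the discovery to work across publishers.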
Example: Start with an initial cube
Example: Discover & Select compatible cubes
Example: Browse an expanded view of the initial cube
Applications
Can be from a different domain, e.g. access to medical records
19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
[Figure: access control architecture. Labels: Data consumer, Data provider, Linked Data Cubes and Metadata, Access Policies, User Profiles, Authorization Mechanism, Authorization Interface, SPARQL Query, Create Profile, Request, Query Results, Authorized Results]
E. Kamateri, E. Kalampokis, E. Tambouris, and K. Tarabanis (2014) The Linked Medical Data Access Control Framework, Journal of Biomedical Informatics, Vol.5, pp. 213-225
The OpenCube Project
For more information
http://opencube-project.eu
http://opencube-toolkit.eu
OpenCube consortium
Conclusions
Open statistical data are rapidly increasing due to Open Data policies
Linked Data technologies can provide Web-scale linking and analysis of statistical data
The OpenCube project develops processes and tools for statistical data management
These can be divided into:
Tools for producing linked open statistical data
Tools for linking (expanding) open statistical data
Tools for consuming linked open statistical data
Questions?
For more information
http://opencube-project.eu
http://opencube-toolkit.eu
Project coordinators:
Konstantinos Tarabanis, [email protected]
Themis Tambouris, [email protected]