Towards a joint linked open data strategy for the European Statistical System
ESS LOD 2017
Malta, 19 January 2017
www.pwc.com
Dr. Nikolaos Loutas
PwC Data & Analytics
A study delivered by PwC EU Services for Eurostat
PwC
Agenda
Our approach
Expected benefits of LOD for NSIs and data consumers
LOD strategy building blocks
Strategy & Policy
People & Capabilities
Data & Metadata
Linked Data Governance
Technology & Infrastructure
Proposed LOD proofs of concept
January 2017
Towards a joint LOD strategy for the ESS
Towards a strategy for Linked Open Data for NSIs
Our approach
Interviews with 6 NSIs and 3 organisations from academia, industry and standardisation
Desk research
High-level architecture for Linked Open Data for official statistics
Draft strategy for Linked Open Data
Perspectives, key strategic questions, recommendations, way forward
What can LOD deliver to…
National Statistical Institutes
Having a unified view over data, thanks to easier integration;
More flexible means of data dissemination and wider outreach;
Increased standardisation, interoperability and collaboration opportunities;
Easier to innovate and evolve;
Cost reductions: collect and publish once, reuse many times.
Data reusers
Using the right data at the right time in the right format;
Better understanding of the data as the data and the model are closely interwoven;
Increased trust, thanks to traceability and provenance;
Easier integration with other data from various domains;
Enhanced data exploration by navigating the links;
Innovation.
Towards a strategy for Linked Open Data for NSIs
Building blocks
Strategy & Policy
People & Capabilities
Data & Metadata
Linked Data Governance
Technology & Infrastructure
Strategy and Policy
Recommendations
Endorsement by top management
Common analysis of challenges and opportunities
Identifying priorities reconciling local and central needs
Evaluating opportunities for connection with external providers
Developing common approaches to LOD among NSIs and Eurostat, e.g. licensing, governance
Ensuring financial and human resources for local actions
Working in small cycles with well-defined outcomes and KPIs
Evaluating opportunities for new business models for data dissemination or value-added services
Identifying clearly the objectives and audiences of communication
Defining objectives, funding and results of hackathons
Regularly monitoring LOD implementations and investments
Strategy and Policy
Key strategic questions
1 What could be the immediate gains from investing in LOD?
2 What should be the level of ambition of the ESS? Start small? Big investment?
3 What is more important for the first stage of LOD experiments:
(a) Interconnect several official statistics datasets within an NSI to improve data dissemination? How can the ESS support it?
(b) Interconnect several official statistics datasets from NSIs and/or Eurostat?
(c) Publish statistics in a linkable, machine-readable format that can easily be reused and integrated with other types of data (e.g. geospatial, weather)?
People and Capabilities
Recommendations
Establishing links with research groups and academia
Identifying necessary skills and competences for LOD-related activities
Training in cooperation with experts and external providers to fill any gaps
Developing proofs of concept collaboratively between ESS members
Setting up a community of reusers of linked official statistics in the ESS
People and Capabilities
Key strategic questions
1 For which aspects of LOD would you need external expertise? Who could provide that?
2 Build or buy? Would you opt for in-house development or for outsourcing linked data implementations?
3 What could be the role of the ESS?
Data and Metadata
Recommendations
Evaluating opportunities for joint data services, e.g. harvesting, aggregating, enriching and improving data/metadata quality
Adopting common standards, e.g. StatDCAT-AP and Data Cube, with specific application profiles if necessary
Integrating existing standards for statistical data (e.g. SDMX) with linked-data-based standards and application profiles
Converting existing data and metadata in a standard way, e.g. XML, CSV, JSON, RDF
Prioritising datasets by expected increase of efficiency within the NSI/network, potential for commercial reuse, and needs of NGOs
Data and Metadata
Key strategic questions
1 Which datasets could be converted and exposed as LOD in the first steps? Which ones have the highest potential for reuse? Is there any "low-hanging fruit", such as census data, economic or employment indicators?
2 To what extent is code list standardisation and management an enabling factor for inter-institutional LOD?
3 Which classifications could be converted and exposed as LOD in the first steps? Which would have the highest impact?
Linked Data Governance
Recommendations
Persistent URI policy supported by an SLA for data reusers
Lifecycle of Linked Data to be specified and managed
Integration of internal and external LOD via controlled vocabularies, authority files and code lists
Connection with LOD user communities and product developers, extending customer support to them
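As an illustration of what a persistent URI policy could look like in practice, the sketch below mints URIs from a fixed pattern and rejects anything outside it. The base domain `data.example.org`, the `{type}/{concept}/{reference}` path pattern, the allowed type tokens and the function names are all assumptions made for this example; the study does not prescribe them.

```python
import re

# Hypothetical base domain; a real policy would name a domain the NSI commits to keep stable.
BASE = "http://data.example.org"

# Assumed set of resource-type tokens for this sketch.
ALLOWED_TYPES = {"id", "doc", "def", "set"}


def slug(text):
    """Lower-case the text and replace runs of non-alphanumerics with a single hyphen."""
    cleaned = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    if not cleaned:
        raise ValueError(f"cannot build a URI segment from {text!r}")
    return cleaned


def mint_uri(resource_type, concept, reference):
    """Build a URI of the form BASE/{type}/{concept}/{reference}."""
    if resource_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown resource type: {resource_type!r}")
    return f"{BASE}/{resource_type}/{slug(concept)}/{slug(reference)}"


example_uri = mint_uri("id", "Classification", "NACE Rev. 2")
print(example_uri)
```

Keeping the minting logic in one place means the pattern can be reviewed and enforced centrally, which is one way a service-level commitment on persistence might be backed up technically.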
Linked Data Governance
Key strategic questions
1 Would it be useful for the ESS members to develop a common persistent URI policy?
2 Can NSIs provide Service Level Guarantees for LOD, e.g. an explicit commitment to availability and persistence?
3 Should NSIs/ESS become members of W3C or other standards bodies to drive standardisation efforts for LOD related to statistical data, or should this be done through existing structures in the statistics area?
Technology and Infrastructure
Recommendations
Adopting a virtual unified layer for exposing LOD on top of existing databases
Developing tools and software collaboratively as open source or in joint procurement by the ESS
Adopting a common approach to access mechanisms, e.g. APIs, SPARQL endpoints, direct URI resolution
Establishing connections with industry for procuring tools and specific consulting activities
Enhancing NSIs' portals to reap the benefits of linked data, e.g. improving search, navigation, data exchange
Investigating the use of linked data in mobile applications
Exploiting test-beds to encourage experimentation with linked data
Researching the implementation of linked closed data
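One of the access mechanisms named above, a SPARQL endpoint, is queried over plain HTTP. The sketch below builds (but does not send) a GET request in the style of the SPARQL 1.1 Protocol, with the query passed in the `query` parameter and JSON results requested through the `Accept` header. The endpoint URL is hypothetical and the query is only a placeholder.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Hypothetical endpoint; replace with a real SPARQL service before sending anything.
ENDPOINT = "http://stats.example.org/sparql"

# Placeholder query: list up to ten RDF Data Cube observations.
QUERY = """PREFIX qb: <http://purl.org/linked-data/cube#>
SELECT ?obs WHERE { ?obs a qb:Observation } LIMIT 10"""


def build_sparql_request(endpoint, query):
    """Return an unsent GET request carrying the query as a URL parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(request)`; it is left out here because the endpoint is fictional.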
Technology and Infrastructure
Key strategic questions
1 Do NSIs have tools in place that could be used to transform existing data into LOD? What tools might be missing?
2 Can LOD help on the input side, e.g. receiving linked data from other sources and integrating that into the local data collections at the NSI?
3 Could the provision of test-beds or sandboxes help the development of LOD?
• Software for data portals/catalogues
• RDF repository
• Web and application servers
• SPARQL Query builders
• Linked Data APIs
• Linked data visualisation tools
• Transformation tools for publishing statistical data and metadata in linked data formats (e.g. from SDMX to StatDCAT-AP or from CSV or databases to RDF Data Cube)
• Validators
• Data/metadata harvesters
• Tools for managing nomenclatures and code lists (e.g. VocBench 3.0)
• Ontology editors (e.g. Protégé)
Proposed LOD proofs of concept for collaborative development in the ESS
Linking official statistics within an NSI to improve data dissemination
Proposed solution
1. Publication of official statistics as linked data. This development can be done in three steps:
• The development of an open source tool to transform a traditional data source (CSV, database, etc.) into linked (open) data;
• The data transformation; and
• The publication of the linked data through a SPARQL endpoint.
2. Development of guidelines and best practices for statistical linked data (e.g. guidelines for URIs, standards, transformation and publishing practices, communication and culture change, technologies and infrastructure, etc.).
3. Communication of the results (e.g. dissemination in the participating NSIs, in the ESS and beyond through internal communications, workshops and trainings, publications on social media, and participation in conferences) to ensure the continuity of the linked open data initiative, to promote the reuse of the linked data and to convince other NSIs to join the journey.
4. Effort: between 80 and 100 man-days.
5. Stakeholders: a mix of 5-6 NSIs with and without LOD experience, and Eurostat.
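The transformation step in this proof of concept (CSV to linked data) can be sketched with the standard library alone. The example below turns a tiny CSV into N-Triples using terms from the RDF Data Cube vocabulary and the SDMX dimension and measure vocabularies. The base URI `stats.example.org`, the column names, the sample figures and the URI layout are assumptions made for illustration; a real tool would also publish the dataset structure definition, which is omitted here.

```python
import csv
import io

# Hypothetical base URI; a real NSI would use its own persistent domain.
BASE = "http://stats.example.org"
QB = "http://purl.org/linked-data/cube#"
SDMX_DIM = "http://purl.org/linked-data/sdmx/2009/dimension#"
SDMX_MEASURE = "http://purl.org/linked-data/sdmx/2009/measure#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Made-up sample data standing in for a "traditional data source".
SAMPLE_CSV = """area,year,value
MT,2015,5.4
MT,2016,4.7
"""


def csv_to_ntriples(text, dataset_id):
    """Emit one qb:Observation (five triples) per CSV row."""
    dataset = f"{BASE}/dataset/{dataset_id}"
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        area, year, value = row["area"], row["year"], row["value"]
        obs = f"{dataset}/obs/{area}-{year}"
        triples.append(f"<{obs}> <{RDF_TYPE}> <{QB}Observation> .")
        triples.append(f"<{obs}> <{QB}dataSet> <{dataset}> .")
        triples.append(f"<{obs}> <{SDMX_DIM}refArea> <{BASE}/code/area/{area}> .")
        triples.append(f'<{obs}> <{SDMX_DIM}refPeriod> "{year}"^^<{XSD}gYear> .')
        triples.append(f'<{obs}> <{SDMX_MEASURE}obsValue> "{value}"^^<{XSD}decimal> .')
    return triples


triples = csv_to_ntriples(SAMPLE_CSV, "unemployment")
print("\n".join(triples))
```

The resulting N-Triples file could then be loaded into a triple store and exposed through a SPARQL endpoint, which corresponds to the third step listed above.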
Publishing standardised nomenclatures as linked open metadata
Proposed solution
1. Technical solution to publish data:
• An open source tool to transform the datasets from flat file to RDF;
• An RDF triple store to publish the RDF files;
• A user interface to explore the RDF store.
2. Development of guidelines and best practices for statistical linked data (e.g. guidelines for URIs, standards, transformation and publishing practices, communication and culture change, technologies and infrastructure, etc.).
3. Communication of the results to ensure the continuity of the LOD initiative, to promote the reuse of the linked data and to convince other NSIs to join the journey.
4. Effort: 80-100 man-days.
5. Stakeholders: 2-3 NSIs and other potential reusers of these metadata (e.g. DGs, public organisations, academia), as well as global organisations such as UNSD.
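For nomenclatures, the flat-file-to-RDF step could look like the sketch below, which maps each row of a code list to a SKOS concept and uses `skos:broader` to keep the hierarchy. The scheme URI, the column names (`code`, `label`, `parent`) and the two sample rows are invented for this example and do not reproduce any official classification.

```python
import csv
import io

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical concept scheme URI for a demo nomenclature.
SCHEME = "http://stats.example.org/nomenclature/demo"

# Made-up flat file: one row per code, with an optional parent code.
SAMPLE = """code,label,parent
A,Agriculture,
A01,Crop production,A
"""


def nomenclature_to_ntriples(text):
    """Emit SKOS triples (N-Triples syntax) for each row of the flat file."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        code, parent = row["code"], row["parent"]
        # Escape backslashes and quotes so the label is a valid literal.
        label = row["label"].replace("\\", "\\\\").replace('"', '\\"')
        concept = f"{SCHEME}/{code}"
        triples.append(f"<{concept}> <{RDF_TYPE}> <{SKOS}Concept> .")
        triples.append(f"<{concept}> <{SKOS}inScheme> <{SCHEME}> .")
        triples.append(f'<{concept}> <{SKOS}notation> "{code}" .')
        triples.append(f'<{concept}> <{SKOS}prefLabel> "{label}"@en .')
        if parent:
            triples.append(f"<{concept}> <{SKOS}broader> <{SCHEME}/{parent}> .")
        else:
            triples.append(f"<{concept}> <{SKOS}topConceptOf> <{SCHEME}> .")
    return triples


skos_triples = nomenclature_to_ntriples(SAMPLE)
print("\n".join(skos_triples))
```

Because every code gets its own URI, datasets from different institutions can point at the same concept instead of repeating a local copy of the code list.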
Linking official statistics with other data to develop value added services and apps
Proposed solution
1. Publication of statistical linked datasets in an innovative way:
• The linkage of datasets coming from various NSIs and Eurostat; and
• The development of an application offering innovative access to the data (e.g. a visualisation).
2. Building a European linked data community of statisticians, able to create momentum around this subject, exchange best practices and develop guidelines.
3. Communication of the results to ensure the continuity of the LOD initiative, to promote the reuse of the linked data and to convince other NSIs to join the journey.
4. Effort: 80-100 man-days.
5. Stakeholders: a mix of 3-4 NSIs with and without experience, Eurostat and possibly other organisations working with LOD.
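The linkage of datasets from different providers relies on both sides using the same URIs for shared dimensions such as geographic areas. The sketch below shows the idea in its simplest form: two in-memory datasets keyed by area URI are joined on the keys they have in common. The URIs, indicator names and figures are made up for illustration, and a real application would run an equivalent join over RDF data.

```python
# Hypothetical shared code-list namespace for geographic areas.
AREA = "http://stats.example.org/code/area/"

# Made-up figures from two imaginary providers, keyed by the shared area URI.
provider_a_population = {AREA + "MT": 100, AREA + "LU": 200}
provider_b_employment = {AREA + "MT": 40, AREA + "CY": 70}


def link_on_shared_uris(left, right):
    """Join two datasets on the URIs present in both, returning paired values."""
    shared = left.keys() & right.keys()
    return {uri: (left[uri], right[uri]) for uri in sorted(shared)}


linked = link_on_shared_uris(provider_a_population, provider_b_employment)
print(linked)
```

Only the area present in both sources survives the join, which is why agreeing on common code lists and URIs matters before any cross-provider application is built.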
Group discussion
Get in touch with us to know more
Nikolaos [email protected]
Daniel Brulé [email protected]
This publication has been prepared by PwC EU Services for Eurostat under DI07171 specific contract 353. “PwC” refers to PwC Enterprise Advisory bvba which is a member firm of PricewaterhouseCoopers International Limited, each member firm of which is a separate legal entity.
Makx [email protected]