Towards a joint linked open data strategy for the European Statistical System

ESS LOD 2017, Malta, 19 January 2017

Dr. Nikolaos Loutas, PwC Data & Analytics

A study delivered by PwC EU Services for Eurostat

www.pwc.com



Page 2: Agenda

Our approach

Expected benefits of LOD for NSIs and data consumers

LOD strategy building blocks

Strategy & Policy

People & Capabilities

Data & Metadata

Linked Data Governance

Technology & Infrastructure

Proposed LOD proofs of concept

January 2017, Towards a joint LOD strategy for the ESS

Page 3: Towards a strategy for Linked Open Data for NSIs

Our approach

Interviews with 6 NSIs and 3 organisations from academia, industry and standardisation

Desk research

High-level architecture for Linked Open Data for official statistics

Draft strategy for Linked Open Data

Perspectives, key strategic questions, recommendations, way forward

Page 4: What can LOD deliver to…

National Statistical Institutes

Having a unified view over data, thanks to easier integration;

More flexible means of data dissemination and wider outreach;

Increased standardisation, interoperability and collaboration opportunities;

Easier to innovate and evolve;

Cost reductions, collect and publish once, reuse many times.

Data reusers

Using the right data at the right time in the right format;

Better understanding of the data as the data and the model are closely interwoven;

Increased trust, thanks to traceability and provenance;

Easier integration with other data from various domains;

Enhanced data exploration by navigating the links;

Innovation.

Page 5: Towards a strategy for Linked Open Data for NSIs

Building blocks

Strategy & Policy

People & Capabilities

Data & Metadata

Linked Data Governance

Technology & Infrastructure

Page 6: Strategy and Policy

Recommendations

Endorsement by top management

Common analysis of challenges and opportunities

Identifying priorities reconciling local and central needs

Evaluating opportunities for connection with external providers

Developing common approaches to LOD among NSIs and Eurostat, e.g. licensing, governance

Ensuring financial and human resources for local actions

Working in small cycles with well-defined outcomes and KPIs

Evaluating opportunities for new business models for data dissemination or value-added services

Identifying clearly the objectives and audiences of communication

Defining objectives, funding and results of hackathons

Monitoring regularly LOD implementations and investments

Page 7: Strategy and Policy

Key strategic questions

1 What could be the immediate gains from investing in LOD?

2 What should be the level of ambition of the ESS? Start small? Big investment?

3 What is more important for the first stage of LOD experiments:

(a) Interconnect several official statistics datasets within an NSI to improve data dissemination? How can ESS support it?

(b) Interconnect several official statistics datasets from NSIs and/or Eurostat?

(c) Publish statistics in a linkable, machine-readable format that can easily be reused and integrated with other types of data (e.g. geospatial, weather)?

Page 8: People and Capabilities

Recommendations

Establishing links with research groups and academia

Identifying necessary skills and competences for LOD-related activities

Training in cooperation with experts and external providers to fill any gaps

Developing proofs of concept collaboratively between ESS members

Setting up a community of reusers of linked official statistics in the ESS

Page 9: People and Capabilities

Key strategic questions

1 For which aspects of LOD would you need external expertise? Who could provide that?

2 Build or buy? Would you opt for in-house development or for outsourcing linked data implementations?

3 What could be the role of the ESS?

Page 10: Data and Metadata

Recommendations

Evaluating opportunities for joint data services, e.g. harvesting, aggregating, enriching and improving data/metadata quality

Adopting common standards, e.g. StatDCAT-AP and Data Cube, with specific application profiles if necessary

Integrating existing standards for statistical data (e.g. SDMX) and linked-data-based standards and application profiles

Converting existing data and metadata in a standard way, e.g. XML, CSV, JSON, RDF

Prioritising focus on datasets by expected increase of efficiency within NSI/network, potential of commercial reuse, needs of NGOs

Page 11: Data and Metadata

Key strategic questions

1 Which datasets could be converted and exposed as LOD in the first steps? Which ones have the highest potential for reuse? Is there any "low hanging fruit", such as census data, economic or employment indicators?

2 To what extent is code list standardisation and management an enabling factor for inter-institutional LOD?

3 Which classifications could be converted and exposed as LOD in the first steps? Which would have the highest impact?

Page 12: Linked Data Governance

Recommendations

Persistent URI policy supported by SLA for data re-users

Lifecycle of Linked Data to be specified and managed

Integration of internal and external LOD via controlled vocabularies, authority files and code lists

Connection with LOD user communities and product developers. Extending customer support to them.
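One way to picture a persistent URI policy is as a fixed pattern plus a check that rejects identifiers likely to break later. The sketch below is only an illustration under stated assumptions: the base `https://data.example.org`, the `{type}/{concept}/{reference}` pattern, the list of allowed types and the function name `mint_uri` are all hypothetical and are not prescribed by the study.

```python
import re

# Hypothetical base domain and resource types; a real policy would fix these centrally.
BASE = "https://data.example.org"
ALLOWED_TYPES = {"id", "doc", "def", "set"}

# Lowercase letters, digits and single hyphens only, so URIs stay stable and readable.
SEGMENT = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def mint_uri(resource_type: str, concept: str, reference: str) -> str:
    """Build a URI of the form BASE/{type}/{concept}/{reference}."""
    if resource_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown resource type: {resource_type!r}")
    for part in (concept, reference):
        if not SEGMENT.match(part):
            raise ValueError(f"segment not allowed by the policy: {part!r}")
    return f"{BASE}/{resource_type}/{concept}/{reference}"


if __name__ == "__main__":
    print(mint_uri("id", "dataset", "unemployment-rate-2016"))
```

Keeping the pattern in one shared function (or one shared service) is a design choice that would let several publishers mint URIs consistently; the service-level commitment itself remains an organisational matter that code cannot provide.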

Page 13: Linked Data Governance

Key strategic questions

1 Would it be useful for the ESS members to develop a common persistent URI policy?

2 Can NSIs provide Service Level Guarantees for LOD, e.g. an explicit commitment to availability and persistence?

3 Should NSIs/ESS become members of W3C or other standards bodies to drive standardisation efforts for LOD related to statistical data, or should this be done through existing structures in the statistics area?

Page 14: Technology and Infrastructure

Recommendations

Adopting a virtual unified layer for exposing LOD on top of existing databases

Developing tools and software collaboratively as Open Source or in joint procurement by ESS

Adopting a common approach to access mechanisms, e.g. APIs, SPARQL endpoints, direct URI resolution

Establishing connections with industry for procuring tools and specific consulting activities

Enhancing NSI portals to reap benefits of linked data, e.g. improving search, navigation, data exchange

Investigating the use of linked data in mobile applications

Exploiting test-beds to encourage experimentation with linked data

Researching the implementation of linked closed data
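A common approach to access mechanisms could build on the standard SPARQL 1.1 Protocol, in which a query travels in the `query` parameter of an HTTP GET request. The sketch below only constructs such a request URL and does not send anything; the endpoint address and the helper name `build_get_url` are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; no request is actually sent in this sketch.
ENDPOINT = "https://data.example.org/sparql"

# List up to ten resources typed as RDF Data Cube datasets.
QUERY = """\
PREFIX qb: <http://purl.org/linked-data/cube#>
SELECT ?dataset WHERE { ?dataset a qb:DataSet } LIMIT 10
"""

# Header a client would typically send to ask for JSON results.
HEADERS = {"Accept": "application/sparql-results+json"}


def build_get_url(endpoint: str, query: str) -> str:
    """Return the URL for a SPARQL query sent via HTTP GET ('query' parameter)."""
    return f"{endpoint}?{urlencode({'query': query})}"


if __name__ == "__main__":
    print(build_get_url(ENDPOINT, QUERY))
```

Because the request format is standardised, the same client code would work against any conforming endpoint, which is one argument for agreeing on SPARQL alongside simpler APIs and direct URI resolution.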

Page 15: Technology and Infrastructure

Key strategic questions

1 Do NSIs have tools in place that could be used to transform existing data into LOD? What tools might be missing?

2 Can LOD help on the input side, e.g. receiving linked data from other sources and integrating that into the local data collections at the NSI?

3 Could the provision of test-beds or sandboxes help the development of LOD?

• Software for data portals/catalogues

• RDF repository

• Web and application servers

• SPARQL query builders

• Linked Data APIs

• Linked data visualisation tools

• Transformation tools for publishing statistical data and metadata in linked data formats (e.g. from SDMX to StatDCAT-AP or from CSV or databases to RDF Data Cube)

• Validators

• Data/metadata harvesters

• Tools for managing nomenclatures and codelists (e.g. VocBench 3.0)

• Ontology editors (e.g. Protégé)

Page 16: Proposed LOD proofs of concept for collaborative development in the ESS

Page 17: Linking official statistics within an NSI to improve data dissemination

Proposed solution

1. Publication of official statistics as linked data. This development can be done in three steps:

• The development of an open source tool to transform a traditional data source (CSV, database, etc.) into linked (open) data;

• The data transformation; and

• The publication of the linked data through a SPARQL endpoint.

2. Development of guidelines and best practices for statistical linked data (e.g. guidelines for URIs, standards, transformation and publishing practices, communication and culture change, technologies and infrastructure etc.).

3. Communication of the results (e.g. dissemination in the participating NSIs, in the ESS and beyond through internal communications, workshops and trainings, publications on social media, and participation in conferences) to ensure the continuity of the linked open data initiative, to promote the reuse of the linked data and to convince other NSIs to join the journey.

4. Effort: around 80 to 100 man-days.

5. Stakeholders: Mix of 5-6 NSIs with and without LOD experience and Eurostat.
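The transformation step of this proof of concept, from a traditional source such as CSV into linked data, can be sketched with the W3C RDF Data Cube vocabulary. This is a minimal illustration, not the tool the study proposes: the base URI, the URI patterns, the column names and the sample figures are invented placeholders, and a complete cube would also need a data structure definition, which is left out here.

```python
import csv
import io

# Namespaces from the W3C RDF Data Cube vocabulary and its SDMX companions.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
QB = "http://purl.org/linked-data/cube#"
SDMX_DIM = "http://purl.org/linked-data/sdmx/2009/dimension#"
SDMX_MEASURE = "http://purl.org/linked-data/sdmx/2009/measure#"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical base URI and sample figures, invented for illustration only.
BASE = "https://data.example.org"
SAMPLE_CSV = "geo,year,value\nAA,2015,5.4\nBB,2015,6.1\n"


def csv_to_ntriples(text: str, dataset_id: str) -> list[str]:
    """Turn each CSV row into one qb:Observation, serialised as N-Triples lines.

    Assumes clean values (no quotes or spaces), so no escaping is done.
    """
    dataset = f"<{BASE}/set/{dataset_id}>"
    lines = [f"{dataset} <{RDF_TYPE}> <{QB}DataSet> ."]
    for row in csv.DictReader(io.StringIO(text)):
        obs = f"<{BASE}/id/observation/{dataset_id}/{row['geo']}-{row['year']}>"
        lines += [
            f"{obs} <{RDF_TYPE}> <{QB}Observation> .",
            f"{obs} <{QB}dataSet> {dataset} .",
            f"{obs} <{SDMX_DIM}refArea> <{BASE}/id/geo/{row['geo']}> .",
            f'{obs} <{SDMX_DIM}refPeriod> "{row["year"]}"^^<{XSD}gYear> .',
            f'{obs} <{SDMX_MEASURE}obsValue> "{row["value"]}"^^<{XSD}decimal> .',
        ]
    return lines


if __name__ == "__main__":
    print("\n".join(csv_to_ntriples(SAMPLE_CSV, "demo")))
```

The resulting N-Triples lines could then be loaded into any RDF store that exposes a SPARQL endpoint, which corresponds to the third step listed above.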

Page 18: Publishing standardised nomenclatures as linked open metadata

Proposed solution

1. Technical solution to publish data:

• An open source tool to transform the datasets from flat file to RDF.

• An RDF triple store to publish the RDF files.

• A user interface to explore the RDF store.

2. Development of guidelines and best practices for statistical linked data (e.g. guidelines for URIs, standards, transformation and publishing practices, communication and culture change, technologies and infrastructure etc.).

3. Communication of the results to ensure the continuity of the LOD initiative, to promote the reuse of the linked data and to convince other NSIs to join the journey.

4. Effort: 80-100 man-days.

5. Stakeholders: 2-3 NSIs and other potential reusers of these metadata (e.g. DGs, public organisations, academia). Global organisations like UNSD.
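For nomenclatures, the flat-file-to-RDF step could map each code to a SKOS concept. The sketch below assumes a semicolon-separated file with `code`, `parent` and `label` columns; that layout, the scheme URI and the two sample entries are hypothetical and do not come from any real classification.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical scheme URI and placeholder entries, not taken from a real classification.
SCHEME = "https://data.example.org/def/sample-classification"
FLAT_FILE = "code;parent;label\nA;;Section A\nA1;A;Division A1\n"


def escape(literal: str) -> str:
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return literal.replace("\\", "\\\\").replace('"', '\\"')


def codelist_to_skos(text: str) -> list[str]:
    """Map each row to a skos:Concept, linking children to parents via skos:broader."""
    lines = [f"<{SCHEME}> <{RDF_TYPE}> <{SKOS}ConceptScheme> ."]
    for row in csv.DictReader(io.StringIO(text), delimiter=";"):
        concept = f"<{SCHEME}/{row['code']}>"
        lines += [
            f"{concept} <{RDF_TYPE}> <{SKOS}Concept> .",
            f"{concept} <{SKOS}inScheme> <{SCHEME}> .",
            f'{concept} <{SKOS}notation> "{escape(row["code"])}" .',
            f'{concept} <{SKOS}prefLabel> "{escape(row["label"])}"@en .',
        ]
        # Top-level codes have an empty parent column and get no broader link.
        if row["parent"]:
            lines.append(f"{concept} <{SKOS}broader> <{SCHEME}/{row['parent']}> .")
    return lines


if __name__ == "__main__":
    print("\n".join(codelist_to_skos(FLAT_FILE)))
```

Publishing codes this way gives every classification item its own URI, which is what allows datasets from different institutions to point at the same concept.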

Page 19: Linking official statistics with other data to develop value added services and apps

Proposed solution

1. Publication of statistical linked datasets in an innovative way:

• The linkage of datasets coming from various NSIs and Eurostat; and

• The development of an application offering innovative access to the data (e.g. a visualisation).

2. Building a European linked data community of statisticians, able to create momentum around this subject, to exchange best practices and to develop guidelines.

3. Communication of the results to ensure the continuity of the LOD initiative, to promote the reuse of the linked data and to convince other NSIs to join the journey.

4. Effort: 80-100 man-days.

5. Stakeholders: a mix of 3-4 NSIs with and without experience, Eurostat and possibly other organisations working with LOD.

Page 20: Group discussion

Page 21: Get in touch with us to know more

Nikolaos Loutas

Daniel Brulé

This publication has been prepared by PwC EU Services for Eurostat under DI07171 specific contract 353. “PwC” refers to PwC Enterprise Advisory bvba which is a member firm of PricewaterhouseCoopers International Limited, each member firm of which is a separate legal entity.

Makx