
Big Data: economic opportunities for Italy

edited by

Scuola Normale Superiore, Consiglio Nazionale delle Ricerche, and Regione Emilia Romagna for Aspen Institute Italia

Interesse nazionale, October 2017

Piazza Navona, 114 00186 - Roma

Tel: +39 06 45.46.891 Fax: +39 06 67.96.377

Via Vincenzo Monti, 12 20123 - Milano

Tel: +39 02 99.96.131 Fax: +39 06 99.96.13.50

www.aspeninstitute.it

Report structure

• Hallmarks of big data

• Data science

• Towards a data science agenda

• Economic and growth outlook

• Big data: structural change and renewed competitiveness


HALLMARKS OF BIG DATA

VOLUME

VARIETY + VELOCITY

[Figure: data sources (weather, sensors, POS payments, corporate data warehouse, social media, video-surveillance, text documents, medical data, email, scientific research, financial markets) arranged by structure (unstructured to structured) and by velocity (real-time to static).]

VOLUME, VELOCITY, VARIETY, VALUE

HALLMARKS OF BIG DATA

• Big data analytics enables the identification of attributes, trends, and patterns on which to base choices and build entirely data-driven policies, even in the absence of benchmark models and theories within the context of use

• However, the potential value of big data can only be tapped through the development of ad-hoc analytics algorithms (data science)

• Data amassed for distinct purposes can contribute to formulating highly-innovative scenarios and methods even in different contexts


SOURCING BIG DATA

It is essential to:

• ensure access for Italian firms and researchers to this wealth of information

• encourage data sharing

• safeguard the privacy rights of individuals

• formulate policies endorsed at a supranational level


DATA SCIENCE


WHAT IS DATA SCIENCE?

Data science arises from the combination of data availability, sophisticated analytics techniques, and scalable infrastructure.


Data science includes data extraction, data preparation, data exploration, data transformation, storage and retrieval, computing infrastructure, various types of data mining, machine and statistical learning, optimization, presentation of explanations and predictions, and the exploitation of results taking into account ethical, social, legal, and business considerations.


THE DATA

Data may be structured or unstructured, big or small, and static or real-time.


THE ANALYTICS

• Data-mining algorithms for automated pattern discovery highlight the structure hidden in massive datasets

• Machine-learning and “deep learning” methods exploit large “training” datasets of examples to learn general rules and models to classify data and predict outcomes

• Network science unveils the magic of shifting from the statistics of populations to the statistics of interlinked entities, connected by the ties of their mutual interactions
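As a rough illustration of the second point above (learning general rules from labelled training examples in order to classify new data), the following Python sketch uses a nearest-centroid classifier. This is a deliberately simple stand-in, not a deep-learning method, and the two-feature data points and the labels "low" and "high" are invented for the example.

```python
from collections import defaultdict
from math import dist


def fit_centroids(examples):
    """Learn one centroid (mean point) per label from (features, label) pairs."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + x for s, x in zip(sums[label], features)]
        counts[label] += 1
    return {
        label: tuple(s / counts[label] for s in total)
        for label, total in sums.items()
    }


def predict(centroids, features):
    """Assign the label whose centroid is closest to the new point."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training set: (features, label).
training = [
    ((1.0, 1.2), "low"),
    ((0.8, 1.0), "low"),
    ((4.0, 4.2), "high"),
    ((4.4, 3.8), "high"),
]

centroids = fit_centroids(training)
print(predict(centroids, (1.1, 0.9)))
print(predict(centroids, (3.9, 4.0)))
```

The "rule" learned here is just one average point per class; real systems learn far richer models, but the workflow of fitting on examples and then predicting on unseen data is the same.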

FROM DATA TO KNOWLEDGE

[Figure: pipeline from data (demographic, geographic, movement, and transport data) to models (T-Clustering, T-Patterns) to forecasts, with validation.]

DATA SCIENCE FOR SOCIETY

Data science can improve society and boost social progress by:

• supporting policymaking

• yielding novel ways of producing high-quality and high-precision statistical information

• empowering citizens with self-awareness tools, and

• promoting ethical uses of big data

Areas of application include the “city of citizens” and people, societal debate, better governance, official statistics and demography, sustainable development, and developing countries.

DATA SCIENCE FOR SCIENCE

A new data-dominated science is emerging, a data-centric way of thinking, organizing, and carrying out research activities that can lead to the solution of problems hitherto considered extremely difficult or even impossible to tackle, as well as resulting in serendipitous discoveries. Computational social science, medicine, meteorology, environmental science, ecology, agriculture, geology, and seismology are scientific fields where the data deluge, analytical capacity, processing capability, and data sharing and curation infrastructure are providing a powerful boost to research.

DATA SCIENCE FOR INDUSTRY AND BUSINESS

Data science has the capacity to create an ecosystem of data-driven innovative business opportunities (facilitated by participatory platforms) that can help firms collaborate to bring to light new local, national, and global whitespace markets, and which can be leveraged for collaborative, participatory creation and enrichment of big data. Relevant sectors include energy, environment, agri-food, mobility, transport and logistics, manufacturing and production, the public sector, healthcare, financial services, telecommunications services, retail, and tourism.


MEASURING HAPPINESS VIA TWITTER

Computational social science is now using digital tools to analyze people’s rich and interactive lives to answer questions that were previously impossible to investigate. (Mann. PNAS January 19, 2016, vol. 113 no. 3)


SOCIETAL DEBATE

By analyzing millions of datasets of public debates on social media and in newspaper articles, it is possible to gauge what the most discussed topics are, how they emerge and evolve over time and space, and how opinions polarize.
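A minimal Python sketch of the idea of tracking which topics are most discussed and how they evolve over time is given below. It simply counts keyword mentions per period; the posts, the periods, and the topic keywords are all invented for illustration, and real analyses would rely on much larger corpora and more robust topic-detection methods.

```python
from collections import Counter, defaultdict

# Hypothetical watchlist of topic keywords.
TOPICS = {"pensions", "migration", "euro"}

# Hypothetical posts: (period, text).
posts = [
    ("2017-01", "Pensions reform back in parliament"),
    ("2017-01", "New figures on migration flows"),
    ("2017-02", "Migration dominates the talk shows"),
    ("2017-02", "Migration and the euro: what next?"),
    ("2017-02", "Unions protest over pensions"),
]


def topic_counts_by_period(posts, topics):
    """Count, for each period, how many posts mention each topic keyword."""
    counts = defaultdict(Counter)
    for period, text in posts:
        # Normalize words: lowercase and strip surrounding punctuation.
        words = {w.strip(".,:;?!").lower() for w in text.split()}
        for topic in topics & words:
            counts[period][topic] += 1
    return counts


counts = topic_counts_by_period(posts, TOPICS)
for period in sorted(counts):
    print(period, counts[period].most_common())
```

Comparing the counters across periods gives a crude view of how a topic emerges or fades; measuring polarization would additionally require information on the stance of each post.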


MOBILITY, DIVERSITY, AND WELLBEING

Big data can improve official statistics by providing cheaper and more timely information, capturing small-scale phenomena, and enabling the measurement of phenomena that previously did not exist (digital assets of the population) or were nearly impossible to capture (happiness or mood).


FUNCTIONAL AREAS IN TUSCANY

Data science for the “city of citizens”: Cities are the ideal living labs in which to test and deploy data science applications that indirectly translate into benefits for the individual in the form of improved public transport, a safer and healthier living environment, sustainable development, etc.

The polycentric city revealed by citizens’ everyday movements


ESTIMATING THE PROPAGATION OF FINANCIAL DISTRESS

Financial services: Huge amounts of data are processed to detect fraud and risk, to analyze customer behavior, segmentation, trading, and credit risk. Network science allows the systemic risk of existing economic and financial networks to be measured, thereby helping to prevent shocks and disasters.
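To make the notion of distress propagating through a financial network more concrete, here is a small Python sketch of a threshold default cascade. It is a simplified textbook-style model chosen for illustration, not the method behind the slide; the institutions, capital buffers, and exposures are hypothetical.

```python
def default_cascade(capital, exposures, initially_failed):
    """Propagate defaults through a network of interbank exposures.

    capital[bank] is the loss a bank can absorb before failing.
    exposures[creditor][debtor] is what the creditor loses if the debtor fails.
    Returns the set of all banks that end up failed.
    """
    failed = set(initially_failed)
    frontier = set(initially_failed)  # banks that failed in the latest round
    losses = {bank: 0.0 for bank in capital}
    while frontier:
        newly_failed = set()
        for creditor, claims in exposures.items():
            if creditor in failed:
                continue
            # Each failure is counted once: only the latest round's failures.
            losses[creditor] += sum(
                amount for debtor, amount in claims.items() if debtor in frontier
            )
            if losses[creditor] >= capital[creditor]:
                newly_failed.add(creditor)
        failed |= newly_failed
        frontier = newly_failed
    return failed


# Hypothetical network of four institutions.
capital = {"A": 10.0, "B": 5.0, "C": 6.0, "D": 20.0}
exposures = {
    "B": {"A": 6.0},
    "C": {"B": 5.0, "A": 2.0},
    "D": {"C": 4.0},
}

print(sorted(default_cascade(capital, exposures, {"A"})))
```

In this toy network the failure of A brings down B directly and C indirectly, while D has enough capital to absorb its loss; the size of the final failed set is one crude indicator of how systemically important the initial shock is.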


SPORTS ANALYTICS

The proliferation of new sensing technologies that provide data streams extracted from every game is changing the way scientists, fans, and practitioners conceive of sports performance. By combining this (big) data with the powerful tools of data science and AI, it is now possible to reveal the great complexity underlying sports performance and carry out many challenging tasks: from automatic tactical analysis to data-driven performance ranking, game-outcome prediction, and injury forecasting.


TOWARDS A DATA SCIENCE AGENDA

MAIN SCIENTIFIC AND TECHNOLOGICAL CHALLENGES

• Semantics data integration and enrichment technology

• New foundations for big data analytics

• Engineering the management and curation of data

• Advanced visualization and user experience

• Scalable architectures for analytics

• Responsible access to data

NEW FOUNDATIONS FOR BIG DATA ANALYTICS

At the convergence of data mining, machine learning, statistical modeling, optimization, and complex systems science, capable of transparently monitoring the quality of data and the results of analytical processes

– Reconciling statistical inference and computing

– Explanation of machine-learning decision models

– Correlation versus causality

– Individual versus collective data analytics

– Embedding of privacy mechanisms

– Analytics as a service

ANALYTICS AS A SERVICE

From descriptive analytics (“What happens?”) to diagnostics (“Why did it happen?”) to prediction (“What will happen?”) to prescription (“How to make it happen?”)
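The four stages can be sketched on a toy dataset as follows. This Python example is only an illustration under stated assumptions: the spend and sales figures, the unit margin, and the candidate spend levels are invented, and an ordinary least-squares line is used as a crude proxy for both the diagnostic and the predictive step.

```python
from statistics import mean

# Hypothetical history: (advertising spend, units sold).
history = [(1.0, 12.0), (2.0, 14.0), (3.0, 16.5), (4.0, 18.0), (5.0, 20.5)]
spend = [s for s, _ in history]
sold = [u for _, u in history]

# Descriptive: what happened? Summarize the observed sales.
average_sold = mean(sold)

# Diagnostic: why did it happen? Estimate how sales move with spend
# using an ordinary least-squares slope (association, not proof of causation).
mean_spend, mean_sold = mean(spend), mean(sold)
slope = sum((x - mean_spend) * (y - mean_sold) for x, y in history) / sum(
    (x - mean_spend) ** 2 for x in spend
)
intercept = mean_sold - slope * mean_spend


# Predictive: what will happen at a given spend level?
def predict_units(planned_spend):
    return intercept + slope * planned_spend


# Prescriptive: which action should be taken? Pick the candidate spend
# with the highest expected profit under a hypothetical unit margin.
UNIT_MARGIN = 1.5


def expected_profit(planned_spend):
    return UNIT_MARGIN * predict_units(planned_spend) - planned_spend


best_spend = max([2.0, 4.0, 6.0], key=expected_profit)

print(f"average units sold: {average_sold:.1f}")
print(f"extra units per unit of spend: {slope:.2f}")
print(f"predicted units at spend 6.0: {predict_units(6.0):.1f}")
print(f"recommended spend: {best_spend}")
```

Each stage builds on the previous one, which is why the later stages depend more heavily on the quality of the model and on human judgment about whether its assumptions hold.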

[Figure: man-machine collaboration between the data scientist and machine intelligence across descriptive, diagnostic, predictive, and prescriptive analytics.]

NEW BOUNDARIES OF DATA USABILITY

• The General Data Protection Regulation (GDPR) will apply from 25 May 2018 and introduces new obligations for data processors in and outside the EU

• It defines rights for individuals regarding control of their own data and includes elements such as the adoption of privacy-by-design and privacy risk assessment, the rights to erasure and explanation, and the principles of accountability and transparency


BIG DATA, BIG RISKS

Big data is algorithmic, therefore it cannot be biased… yet

• All traditional evils of social discrimination, and many new ones, exhibit themselves in the big-data ecosystem

• Because of its tremendous power, massive data analysis must be used responsibly

• Technology alone will not suffice: policy, user-involvement and education efforts are needed

FOUR SKILLSETS OF THE DATA SCIENTIST

• Harvest and manage data with technical skills in collecting and integrating databases built from heterogeneous sources

• Make sense of data with technical skills in data mining, statistics, and machine learning to gain insight from large volumes of data

• Tell the story: skill in narrating the stories that data tells after analysis and modeling (e.g. using both visual and multimedia storytelling)

• Master ethical and legal aspects at every step of the discovery process

THE DATA SCIENCE PIPELINE


ECONOMIC AND GROWTH OUTLOOK


CURRENT SECTORS WHERE BIG DATA IS EMPLOYED


BIG DATA AND HEALTHCARE

The growth of sequencing capabilities and the sharing of medical data enable, from a big data perspective, the optimization of treatments and the development of personalized protocols without further experimentation (either on animals or on humans). This area raises particularly evident issues of ethics and privacy.

Increase in DNA sequencing capabilities


THE EU DATA MARKETPLACE

• EU data market (i.e. the marketplace where data-related products or services are exchanged)

– in 2016, estimated at almost EUR 60 billion

– by 2020, will amount to more than EUR 106 billion according to the high-growth scenario forecast

• Total number of data firms in the EU (i.e. organizations whose main activity is the production and delivery of data-related products or services)

– neared the threshold of 255,000 units in 2016

– will reach 360,000 units by 2020 according to the high-growth scenario forecast


• The EU data market (data workers are those engaged in collecting, storing, managing, and analyzing data as their primary activity)

– employed 6.1 million data workers in 2016

– will employ 10.4 million by 2020 according to the high-growth scenario forecast

• The data economy (representing the aggregate impact of the data market on the EU economy as a whole)

– accounted for almost 2% of EU GDP in 2016

– will have an impact of 4% on the total EU economy by 2020 according to the high-growth scenario forecast
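To put the 2016 and 2020 figures in perspective, the implied compound annual growth rate under the high-growth scenario can be worked out as in the Python sketch below. The input values are the rounded figures quoted in the slides ("almost", "more than"), so the resulting rates are approximate; the four-year horizon and the dictionary labels are assumptions made for the example.

```python
def cagr(start, end, years):
    """Compound annual growth rate between a start and an end value."""
    return (end / start) ** (1 / years) - 1


# (2016 value, 2020 high-growth forecast), taken from the rounded slide figures.
indicators = {
    "data market (EUR bn)": (60.0, 106.0),
    "data firms (thousands)": (255.0, 360.0),
    "data workers (millions)": (6.1, 10.4),
    "share of EU GDP (%)": (2.0, 4.0),
}

YEARS = 4  # 2016 to 2020

for name, (value_2016, value_2020) in indicators.items():
    print(f"{name}: {cagr(value_2016, value_2020, YEARS):.1%} per year")
```

On these inputs the data market would need to grow by roughly 15% a year, and the number of data firms by roughly 9% a year, to reach the forecast levels.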


US DATA MARKET

Source: McKinsey Top 5 Game-changers 2013

Source: McKinsey 12 disruptive technologies 2017

Source: http://www3.weforum.org/docs/WEF_Future_of_Jobs.pdf


FUTURE OF JOBS

• 5.1 million jobs (net) set to be lost in Western countries to disruptive labor-market changes over the period 2015–2020, reflecting:

• a total loss of 7.1 million jobs concentrated in routine white-collar office functions, such as office and administrative roles

• a gain of 2 million jobs in computer-, mathematical-, architectural-, and engineering-related fields

Source: World Economic Forum’s “Future of Jobs” Report (2016)

BIG DATA: STRUCTURAL CHANGE AND RENEWED COMPETITIVENESS

INDUSTRY 4.0 AND BIG DATA

Industry 4.0 should be viewed not solely from a technological standpoint but also in terms of the ability to coordinate science, technology, skills, and social context, so as to facilitate the convergence of distinct but complementary technologies that respond both to major global issues and to the individual demands of millions of users and clients.

What gives order to the new “industry” is hyper-connectivity and, hence, big data

Big data not only as a commodity but, above all, as a new way of tackling and managing modern-day complexity

Global value chains move their various phases around according to the value-added achievable in different local contexts


PRODUCT DEFINITION

[Figure: cottage industry, Fordist production, flexible production, and Industry 4.0 positioned by volumes of production and product differentiation, each ranging from low to high.]

PROCESS ORGANIZATION

[Figure: rigid mass production, flexible mass production, customized individual production, and customized mass production positioned by scale and scope, each ranging from low to high.]

Enabling technologies for Industry 4.0 (I 4.0): additive manufacturing, digital manufacturing, virtual reality, second-generation robots, Internet of things, big data, artificial intelligence

Skills and infrastructure for the convergence of complementary technologies
