data analytics practice - amazon s3...data, information and analytics as services, delen &...

Post on 20-May-2020

Page 1: Data Analytics practice - Amazon S3...Data, information and analytics as services, Delen & Demirkan, 2012 Data Science for business, Provost & Fawcett, 2013 European Big Data Value

Data Analytics practice

Agenda

1· Data Analytics
   · What's new and what's not?
   · Quick wins
   · The Data Science practice
   · The learning problem

2· SafeClouds
   · The project
   · The partners
   · The work programme
   · Scenarios, outcomes

3· Conclusions & challenges

Data Analytics

Data Analytics

What's new and what's not

· 18th century: Bayesian statistics

· 1920's: Parametric models

· 1980's: Highly non-linear relationships in real complex datasets

· 1990's: New analytical techniques, large data sets, high non-linearity

· 2000's: Machine learning concepts; Storage, Computing, Communications

· Future in aviation: Focus on processes that provide actionable analytics

The Data Science practice in aviation

Data Analytics

Individualisation trumps universals

Intangibles that appear to be completely intractable can be measured and predicted

The Data Science practice

What's the learning problem?

Data Analytics

Building models with massive data

The discipline: Knowledge Discovery on massive data

Building the data models and solving the inference problem both pose challenges:

· Multi-dimensionality, heterogeneity and incompleteness of data, volume of data, velocity,...

· Model selection, including complexity/over-fitting trade-offs

· Model running, including selection of training data, validation and testing

· Model deployment, including stability and the precision-accuracy-recall trade-offs
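The model selection and model running steps listed above can be sketched in a few lines. This is only an illustration under stated assumptions, not the method used in the talk: the data are synthetic, the model is a simple k-nearest-neighbour regressor, and the names `split_indices`, `knn_predict`, `mse` and `select_k` are hypothetical. The number of neighbours k plays the role of model complexity (a small k tends to over-fit), it is chosen on the validation set, and the test set is used only once at the end.

```python
import random


def split_indices(n, train=0.6, validation=0.2, seed=0):
    """Shuffle range(n) and cut it into train / validation / test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train)
    n_val = round(n * validation)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of the k-NN model on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


def select_k(xs, ys, candidates=(1, 3, 5, 9, 15), seed=0):
    """Pick k on the validation set, then report the error on the untouched test set."""
    tr, va, te = split_indices(len(xs), seed=seed)

    def pick(ids, values):
        return [values[i] for i in ids]

    tx, ty = pick(tr, xs), pick(tr, ys)
    scores = {k: mse(tx, ty, pick(va, xs), pick(va, ys), k) for k in candidates}
    best_k = min(scores, key=scores.get)
    test_error = mse(tx, ty, pick(te, xs), pick(te, ys), best_k)
    return best_k, scores, test_error


# Synthetic data (an assumption for illustration): a noisy linear relationship.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [0.5 * x + rng.gauss(0, 1) for x in xs]

best_k, scores, test_error = select_k(xs, ys)
print("validation MSE per k:", scores)
print("selected k:", best_k, "test MSE:", test_error)
```

Keeping the test set out of the selection loop is the point of the three-way split: the validation error is optimistically biased for the chosen k, so only the test error is an honest estimate of deployed performance.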

Data Analytics

Building KDD models with massive data

Data Analytics

SafeClouds

data management, infrastructure, data protection, data mining tools, visualisation

Aviation safety knowledge discovery

Systematic identification of hazards

Applied research - laboratory validation (TRL5)

SafeClouds research project

SafeClouds research project

Some scenarios of interest:

Real time approach congestion monitoring

Proper separation with terrain

Level busts

Runway performance

Runway excursions

Unstable approaches

EASA

SafeClouds research project

SafeClouds outcomes

Questions

Scenarios description

SafeClouds platform

Datasets

Case Studies

Tools

Case Studies analytics

Agile analytics methodology

Outputs

Next steps

Consortium Agreement signature, including data protection & sharing - Sept '16

Grant Agreement signature - Sept '16

Project starts - early Oct '16

Consortium Coordinator - Paula López-Catalá, [email protected]

SafeClouds research project

Conclusions & challenges

Conclusions

Data ingest

Cleanse

Fuse

Build Models

Build infrastructure

Secure

Enable the data

Build/govern the platform

Engage the business

Discover

Monitor

Deploy

Challenges

Data sources

Complexity

Costs

Skill gap in ML-aviation

Reliance on IT

Trust / Privacy

Agile methodologies

ROI metrics

Change processes

Some thoughts on challenges

· Analytics Center of Excellence is not an IT organisation

· Data Science agile management is a must

· Reusable data & logic for governance and consistency

· Great tools for collaboration, visual tools.

Closing thoughts

Difficult to see "quick wins" or "low-hanging fruits"

Your model is not what your data scientists design, it's what your engineers implement - translation from business to technical is key

Data Science is a craft - there is no Excel+++

Thank you!

References

Annual Safety Review, EASA, 2016

Data, information and analytics as services, Delen & Demirkan, 2012

Data Science for business, Provost & Fawcett, 2013

European Big Data Value Strategic Research Agenda, 2015

Frontiers in Massive Data Analytics, National Academy of Sciences, 2013

Network analysis reveals patterns behind air safety events, 2014

The unreasonable effectiveness of mathematics in the natural sciences, Wigner, 1960

The unreasonable effectiveness of data, Norvig, 2009; youtube.com/watch?v=yvDCzhbjYWs

SafeClouds documentation - to be published from October 2016 in www.SafeClouds.eu

Synchronisation likelihood in aircraft trajectories, Zanin, 2013

David Pérez - [email protected]

www.SafeClouds.eu

this presentation - slides.innaxis.org/2016.09.08.SafeClouds

BackUp

Hazards

A hazard can be considered as a dormant potential for harm which is present in one form or another within the aviation system or its environment.

This potential for harm may be in the form of

- a natural hazard such as terrain, or

- a technical hazard such as wrong runway markings

Data Analytics

Building KDD models with massive data

The SafeClouds initiative

The SafeClouds research initiative is promoted by a complete spectrum of European Aviation and ICT stakeholders to develop big data, data protection and data mining tools for the improvement of aviation safety.

SafeClouds is a project to develop aviation safety knowledge discovery techniques from a large set of distributed datasets.

It aims at a novel, systematic identification of hazards, and at handling of data and processes tailored to the requirements of aviation that is efficient, effective and acceptable to all the relevant parties in the aviation value-chain.

Addressing the learning problem

Addressing the learning problem

Safety KDD research model

I - Feature extraction: mostly data management, domain knowledge

II - Feature combination: mostly math, domain knowledge

Addressing the learning problem

Safety KDD research model

I - Feature extraction

II - Feature combination

Hazards and leading indicators

KDD study on prediction of separation

Eurocontrol traffic data - 10 months of ECAC traffic, 2-minute resolution

Low frequency of aviation safety events

Medium-term data-driven prediction of loss-of-separation (LoS) events?

1 Classical features describing the status of airspace

2 Complex network features

3 Historical trajectory likelihood-based features
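To make the second feature family more concrete, here is a minimal sketch of what "complex network features" could look like for one traffic snapshot. Everything in it is an assumption for illustration: the slides do not say how the network is built, so the proximity rule, the 60 km threshold, the planar coordinates in kilometres and the names `proximity_graph` and `network_features` are all hypothetical. Aircraft are nodes, two aircraft are linked when they are closer than the threshold, and simple graph statistics become features.

```python
import math
from itertools import combinations


def proximity_graph(positions, threshold_km):
    """Link two aircraft when their horizontal distance is below threshold_km."""
    edges = set()
    for (a, pa), (b, pb) in combinations(sorted(positions.items()), 2):
        if math.dist(pa, pb) < threshold_km:
            edges.add((a, b))
    return edges


def network_features(positions, threshold_km):
    """Summarise one snapshot of traffic as a handful of graph statistics."""
    n = len(positions)
    edges = proximity_graph(positions, threshold_km)
    degree = {name: 0 for name in positions}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    possible_links = n * (n - 1) / 2
    return {
        "n_aircraft": n,
        "n_links": len(edges),
        "mean_degree": sum(degree.values()) / n if n else 0.0,
        "max_degree": max(degree.values(), default=0),
        "density": len(edges) / possible_links if possible_links else 0.0,
    }


# Hypothetical snapshot: aircraft id -> (x, y) position in km on a flat plane.
snapshot = {
    "AC1": (0.0, 0.0),
    "AC2": (30.0, 0.0),
    "AC3": (30.0, 40.0),
    "AC4": (200.0, 200.0),
}
features = network_features(snapshot, threshold_km=60.0)
print(features)
```

With data at 2-minute resolution, one such feature vector per time step and per airspace volume would give a time series that a model could combine with the classical and trajectory-based features.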

Data Analytics

Building KDD models with massive data

Concepts

Recall is how many of the true positives were recalled, i.e. how many of the correct hits were also found.

Precision is how many of the returned hits were true positives, i.e. how many of the found hits were correct.

Accuracy is how often the algorithm was correct, i.e. true positives plus true negatives as a share of all cases.

recall = TP / (TP + FN)

precision = TP / (TP + FP)

accuracy = (TP + TN) / ALL, where ALL = TP + TN + FP + FN
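The three definitions can be checked with a small sketch. The numbers below are invented for illustration (1000 situations, 10 of them real events) and the function names are hypothetical. The example shows why accuracy alone is a poor guide when events are rare: a detector that never raises an alert is 99% accurate but has zero recall.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = event, 0 = no event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def scores(y_true, y_pred):
    """Recall, precision and accuracy; None when a ratio is undefined."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "recall": tp / (tp + fn) if tp + fn else None,
        "precision": tp / (tp + fp) if tp + fp else None,
        "accuracy": (tp + tn) / total if total else None,
    }


# Invented data: 10 events followed by 990 non-events.
y_true = [1] * 10 + [0] * 990

# Baseline that never predicts an event.
never_alert = [0] * 1000

# Hypothetical detector: finds 8 of the 10 events and raises 40 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 40 + [0] * 950

baseline_scores = scores(y_true, never_alert)
detector_scores = scores(y_true, detector)
print("never alert:", baseline_scores)
print("detector:   ", detector_scores)
```

The detector has lower accuracy than the baseline, yet it is the only one of the two that finds any events, which is why recall and precision need to be reported alongside accuracy.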
