privacy preserving data analytics - poseid-on · •# data breach notifications •~59k from may...

15
Co-funded by the Horizon 2020 Framework Programme of the European Union Privacy preserving data analytics Coimbra, 11 July, 2019 Melek Önen Workshop on Privacy, Data Protection and Digital Identity, July 2019

Upload: others

Post on 06-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

Co-funded by the Horizon 2020 Framework Programme of the

European Union

Privacy preserving data analyticsCoimbra, 11 July, 2019

Melek Önen

Workshop on Privacy, Data Protection and Digital Identity, July 2019

Page 2: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

2

PAPAYA Project

Project title PAPAYA: PlAtform for PrivAcY preserving data Analytics

Call DS-08 Cybersecurity PPP: Privacy, Data Protection, Digital Identities

Grant Agreement GA no: 786767

Project Officer Mr. Nikolaos Panagiotarakis (H2020, REA)

Duration 36 months (1 May 2018 – 30 Apr 2021)

Consortium

Project Coordinator EURECOM

Workshop on Privacy, Data Protection and Digital Identity, July 2019

Page 3: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

Data Analytics as a Service

3

Prediction

Workshop on Privacy, Data Protection and Digital Identity, July 2019

Page 4: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

• # Data Breach Notifications• ~59K from May 2018 until Jan. 2019

• Average Cost in 2018• Global: 3.96M$, Per record: 148$

• Top 3 sectors • Health, Financial, Services

• Factors increasing cost• Extensive migration to cloud• Third party involvement• Compliance failures

• Factors decreasing cost• Extensive use of encryption• Use of security analytics

4https://www.bankinfosecurity.com/gdpr-data-breach-reports-to-eu-exceed-59000-a-12006

2019 Data Breaches in Europe

Page 5: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

PAPAYA

5

• Objectives

– Privacy by design

• PP analytics: processing over protected data

– Different settings

• Single vs multiple Dos

• Third party queriers

– Integrated platform

• Common framework

– User control

• Transparency, usabilitiy & auditability

Workshop on Privacy, Data Protection and Digital Identity, July 2019

Page 6: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

Project Roadmap

6Workshop on Privacy, Data Protection and Digital Identity, July 2019

(Currently at M15)

Page 7: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

PAPAYA Use Cases

• Healthcare umbrella• Arrhythmia detection• Stress detection

• Mobility and phone usage umbrella• Mobility analytics• Mobile usage analytics• Threat detection

7Workshop on Privacy, Data Protection and Digital Identity, July 2019

[D2.1]

Page 8: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

PAPAYA Analytics

• Neural Network classification• Arrhythmia detection, Threat detection

• Collaborative Neural Network training• Stress detection

• Trajectory clustering • Mobility analysis

• Counting (& and set operations)• Mobile phone usage

8Workshop on Privacy, Data Protection and Digital Identity, July 2019

[D3.1]

Page 9: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

MHPon premises

CardioPharmaTablet App

Arrhythmia detection with Neural Networks

DoctorWeb App

Trusted area(may be the cardiologist clinic)

Page 10: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

Privacy vs. Neural Networks• Advanced Privacy technologies

• FHE, MPC• Challenges

• Additional overhead: Computation, memory and bandwidth

• Complex operations (sigmoid, tanh, etc.)

• Real numbers (vs. integers with PETs)• Goal

• Reduce NN complexity• Use low degree polynomials• Use integers• Keep good level of accuracy

• Solution for PAPAYAÞGenerate a dedicated NN model from

scratch

10

x1

x2

x3

xi

x16

x1.wh1,i+b1

x2.wh2,i+b2

x3.wh3,i+b3

xi.whj,i+bi

x38.wh38,i+b38

y'12

y'22

y'32

y'i2

y'382

x1.wo1,i+b1

x2.wo2,i+b2

x3.wo3,i+b3

xi.woj,i+bi

x16.wo16,i+b16

   

Input Layer FC Layer SquareFunction 

Softmax Function 

ymax

FC Layer

Workshop on Privacy, Data Protection and Digital Identity, July 2019

Page 11: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

PP arrhythmia classification based on 2PC

• NN architecture optimization• PCA to reduce input size• Minimum number of hidden layers with good accuracy

• Approximate non linear operations• Square (!") instead of sigmoid (f t = '

'()*+)• Simple max instead of softmax f - = ).

∑.012 ). 3ℎ565 - = 1, … , :• Approximate real numbers

• Truncation: ×10=• Performance

• 62ms to classify one heartbeat, 400 ms to classify 10

11Workshop on Privacy, Data Protection and Digital Identity, July 2019

[IFIP Summer School on Privacy’19]

Page 12: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

Privacy vs. Clustering

• Objective• Encrypted trajectory clustering

• Challenges• Processing of Personal data (location,

ID, etc.)• Sophisticated distance computation• Comparisons

• PAPAYA solutionÞ Approximate distance + use 2PC

12Workshop on Privacy, Data Protection and Digital Identity, July 2019

Page 13: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

PAPAYA Architecture (v1)

13Workshop on Privacy, Data Protection and Digital Identity, July 2019

[D4.1 available soon]

Page 14: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

Co-funded by the Horizon 2020 Framework Programme of the

European Union

The work described in this presentation has been conducted within the project PAPAYA. This project has received funding from the European Union’s Horizon 2020 (H2020) research and innovation programme under the Grant Agreement no 786767. This document does not represent the opinion of the European Union, and the European Union is not responsible for any use that might be made of its content.

Thank you

PAPAYA GA Meeting, Madrid, 5-6 March 2019

Page 15: Privacy preserving data analytics - PoSeID-on · •# Data Breach Notifications •~59K from May 2018 until Jan. 2019 •Average Cost in 2018 •Global: 3.96M$, Per record: 148$ •Top

PAPAYA Dashboards

• Platform Dashboard (Web application)• Service catalog• Service Add/Delete/Update• Application Create/Delete/Deploy• Application monitoring for service owners

• Agent Dashboard• Agent Configuration• Data processing logs

• Data Subject Toolbox• Explaining PP Analytics• Data Disclosure Visualization Tool• Annotated Log view tool• Privacy Engine

15Workshop on Privacy, Data Protection and Digital Identity, July 2019