PROVIDING TRUST IN DATA-DRIVEN INNOVATION



2 | Copyright © 2020 Syntho B.V. All rights reserved.

Data Privacy – a Key Driver for Business Success
Data privacy is fundamental to the success of organizations, and ignoring it is detrimental

100% more compliance costs for companies that lack privacy protection
50% of data for AI will be unlocked by privacy-enhancing techniques
30% more profits for companies that earn and maintain digital trust with customers
70% of organizations have storage of personal data as biggest privacy risk
70% increase in industry collaborations expected with use of privacy tools
30% of companies cite privacy as no. 1 barrier for AI implementation
30% of customers trust their insurer to use their personal data
40% of privacy compliance tooling will rely on AI in 2023, up from 5% today
65% of population will have data privacy regulations in 2023, up from 10% today
25% of training data for AI will be synthetically generated

• Preserving Privacy While Using Personal Data for AI Training: Gartner 2020

• The State of Privacy and Personal Data Protection 2020-2022: Gartner 2020

• 100 Data and Analytics Predictions Through 2024: Gartner 2020

• Cool Vendors in AI Core Technologies: Gartner 2020

• Hype Cycle for Privacy 2020: Gartner 2020

• 5 Areas Where AI Will Turbocharge Privacy Readiness: Gartner 2019

• Top 10 Strategic Technology Trends for 2019: Gartner 2019


The added value of Syntho and Syntho’s data privacy solution* for companies

* Syntho solution - AI software-as-a-service (SaaS) tool for customers to generate synthetic data from personal data

Enable data-driven innovation
Syntho enables companies to preserve data utility for data-driven innovations while meeting privacy standards

Reduce data breaches
Syntho allows companies to reduce the chance of a privacy-sensitive data breach, which costs $8.2M per breach on average and $388M for large companies

Protect brand and reputation
By securely protecting personal data through the use of synthetic data, Syntho helps prevent harm to customers and reputational damage

Compliance with privacy regulations
Syntho ensures compliance with modern data privacy regulations (e.g. GDPR), which will apply to 65% of the world’s population in 2023, up from 10% today

• Hype Cycle for Privacy 2020: Gartner 2020

• Procurement on the Front Lines: New Trends in Data Privacy and Cybersecurity Risks: Gartner 2020


The problem
Data privacy hinders data-driven innovation, and classic anonymization techniques fail to provide a solution

Privacy hinders data-driven innovation
Data privacy rules and legislation (e.g. GDPR) rightfully protect individuals, but hinder organizations from ‘innovating’ with data, which may include any data processing activity to improve the business

Classic anonymization is no solution
Classic privacy-enhancing techniques exhibit a trade-off between data privacy and data utility, leading to distorted structure and statistical properties and a remaining privacy risk

Data sharing
Data commerce
Test & develop
AI & analytics


Classic ‘anonymization’ offers no solution

Destroys data

Always a privacy risk


Classic ‘anonymization’ offers no solution
Classic privacy-enhancing techniques do not lead to private data and result in a loss of information

Classic ‘anonymization’

Example technique         Original data      Manipulated data
Generalization            27 years old       Between 25 and 30 years old
Suppression / Wiping      [email protected]  [email protected]
Pseudonymization          Amsterdam          hVFD6td3jdHHj78ghdgrewui6
Row and column shuffling  Aligned            Shuffled
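
The four classic techniques listed above can be sketched in a few lines of Python. This is a minimal illustration, not the way any particular tool implements them: the function names, the 5-year bucket width, the `xxx` mask, the salt value and the sample rows are all assumptions made for the example.

```python
import hashlib
import random


def generalize_age(age, width=5):
    """Generalization: replace an exact age with a range (bucket width is an assumption)."""
    low = (age // width) * width
    return f"Between {low} and {low + width} years old"


def suppress(value, mask="xxx"):
    """Suppression / wiping: discard the value and return a fixed mask."""
    return mask


def pseudonymize(value, salt="hypothetical-salt"):
    """Pseudonymization: map a value to a stable token (salted SHA-256, truncated)."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:24]


def shuffle_column(rows, column, seed=0):
    """Column shuffling: permute one column's values across the rows."""
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]


# Hypothetical sample rows, used only to exercise the functions.
rows = [
    {"name": "Olivia", "age": 26, "city": "Amsterdam"},
    {"name": "John", "age": 75, "city": "Utrecht"},
    {"name": "George", "age": 41, "city": "Rotterdam"},
]

manipulated = [
    {
        "name": suppress(row["name"]),
        "age": generalize_age(row["age"]),
        "city": pseudonymize(row["city"]),
    }
    for row in rows
]
shuffled = shuffle_column(rows, "city")
```

Note that every function here maps one input record to exactly one output record, so the manipulated dataset always has the same number of rows as the original.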


The privacy-utility trade-off
Classic privacy-enhancing techniques suffer from a trade-off between data privacy and data utility

[Figure: privacy protection vs. data utility for classic ‘anonymization’]


Synthetic data by Syntho


Which of these images is fake?

[Figure: three images labeled A, B and C]


[Figure: three images labeled A, B and C]

Synthetic images: generated by Artificial Intelligence (AI)

* www.thispersondoesnotexist.com


AI-generated synthetic data
The Syntho Engine is capable of generating highly realistic and anonymous synthetic data based on real data

Original data
× High privacy risk
× Locked value for analysis
× (Legal) privacy restrictions

Syntho Engine
Your secure IT environment
State-of-the-art software solution using generative adversarial networks (GAN)

Synthetic data
✓ No privacy risk
✓ Statistical value and granularity preserved
✓ Unrestricted use and sharing


AI-generated synthetic data: a game changer
Synthetic data largely overcomes the traditional trade-off between data privacy and data utility

[Figure: privacy protection vs. data utility for classic ‘anonymization’ and synthetic data]


Synthetic data privacy and quality


Privacy-preserving synthetic data
The concept of privacy is fully embedded and is a consequence of the applied technology

Classic anonymization

Original data (N = 100k)

Name    Age  Gender  Item    Price  Date
Olivia  26   Female  Shoes   €125   4 March
John    75   Male    Laptop  €695   5 March
George  41   Male    Beer    €4     7 March
…       …    …       …       …      …
George  41   Male    Shirt   €25    9 March

Original data with applied classic anonymization (N = 100k)

Name  Age    Gender  Item    Price        Date
xxx   25-30  Female  Shoes   €100 - €200  March
xxx   70-75  Male    Laptop  €600 - €700  March
xxx   40-45  Male    Beer    <€5          March
…     …      …       …       …            …
xxx   40-45  Male    Shirt   €20 - €30    March

Dataset level: Data value destroyed
Attribute level: Privacy risk due to 1:1 relationship with original records
Number of original data records and manipulated data records is equal

Synthetic data

Original data (N = 100k): same table as above

Synthetic data (N = 800k?)

Name  Age  Gender  Item   Price  Date
xxx   23   Female  Sofa   €790   1 March
xxx   23   Female  Scarf  €40    3 March
xxx   52   Male    Razor  €5     7 March
…     …    …       …      …      …
…     …    …       …      …      …
…     …    …       …      …      …
xxx   35   Female  Wine   €7     9 March

Dataset level: Preserved data quality
Attribute level: Synthetic data records have no 1:1 mapping with the original data
An unlimited number of synthetic data records can be generated
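
One simple way to probe the "no 1:1 mapping" property is to count how many synthetic records are exact copies of an original record. The sketch below is an illustrative check only, not the Syntho Engine's method; the function name `exact_match_rate` is hypothetical, the records reuse the example rows from the tables above with the name column left out, and a rate of zero on its own does not prove that a dataset is anonymous.

```python
def exact_match_rate(original, synthetic):
    """Fraction of synthetic records that are identical to some original record."""
    if not synthetic:
        return 0.0
    original_set = set(original)
    hits = sum(1 for record in synthetic if record in original_set)
    return hits / len(synthetic)


# Records as (age, gender, item, price in EUR, date), taken from the example tables.
original = [
    (26, "Female", "Shoes", 125, "4 March"),
    (75, "Male", "Laptop", 695, "5 March"),
    (41, "Male", "Beer", 4, "7 March"),
    (41, "Male", "Shirt", 25, "9 March"),
]
synthetic = [
    (23, "Female", "Sofa", 790, "1 March"),
    (23, "Female", "Scarf", 40, "3 March"),
    (52, "Male", "Razor", 5, "7 March"),
    (35, "Female", "Wine", 7, "9 March"),
]

rate = exact_match_rate(original, synthetic)
```

In practice such a check would be run on all attributes and on the full datasets, alongside other privacy measures.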


Synthetic data quality
To demonstrate the quality of the synthetic data, we provide a detailed quality report and offer joint evaluation

Statistical quality report
• Univariate distributions
• Correlations
• Multivariate distributions
• Additional measures upon request

Joint evaluation
• By definition, data utility (or ‘usability’) can only be understood in relation to the target domain, where the data will be used, shared and / or stored.
• This is why we propose to evaluate the synthetic data with a domain expert in order to demonstrate that the synthetic data ‘makes sense’.
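
Two of the report ingredients named above, univariate distributions and correlations, can be compared with standard-library Python. This is a minimal sketch under stated assumptions, not the actual Syntho quality report: the function names are hypothetical, mean and standard deviation stand in for a full distribution comparison, and the four-row columns are toy values taken from the deck's example tables.

```python
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def univariate_summary(values):
    """Crude univariate summary: mean and population standard deviation."""
    return {"mean": statistics.mean(values), "stdev": statistics.pstdev(values)}


def compare_columns(original, synthetic, x, y):
    """Side-by-side summaries for columns x and y, plus the gap in their correlation."""
    report = {}
    for column in (x, y):
        report[column] = {
            "original": univariate_summary(original[column]),
            "synthetic": univariate_summary(synthetic[column]),
        }
    report["correlation_gap"] = abs(
        pearson(original[x], original[y]) - pearson(synthetic[x], synthetic[y])
    )
    return report


# Toy columns (age, price in EUR); far too small for a real evaluation.
original = {"age": [26, 75, 41, 41], "price": [125, 695, 4, 25]}
synthetic = {"age": [23, 23, 52, 35], "price": [790, 40, 5, 7]}

report = compare_columns(original, synthetic, "age", "price")
```

Small gaps in the summaries and in the correlation suggest that the synthetic data preserves these properties; whether the data is usable still depends on the target domain, as the bullets above note.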


Syntho Engine preserves deep ‘hidden’ relations
Deep ‘hidden’ relations, like multivariate distributions and correlations, are also captured by the Syntho Engine

[Figure: original data vs. synthetic data]


So, why still use personal data if you can use synthetic data?
From a privacy and utility perspective, you should always opt for synthetic data when your use case allows it

                           Value for analysis  Privacy risk
Synthetic data             High                None
Personal data              High                High
Classic ‘anonymized’ data  Low-Medium          Medium-High

* https://iapp.org/news/a/accelerating-ai-with-synthetic-data/


So, why still use personal data if you can use synthetic data?
Using synthetic data to minimize personal data and unlock formerly locked personal data yields several benefits

[Figure: Now vs. Future; labels: Locked personal data, Personal data in use, Synthetic data, Unlock personal data, Minimize personal data]

Key benefits
1. Less risk
2. More data
3. Faster data access

Boosted innovation


Use cases


Use cases
Synthetic data unlocks a wide variety of client use cases to boost innovation

Testing and development
GDPR-compliant test environments

Agile analytics
Eliminate time-consuming governance blocking data access and innovation

Data commerce
Responsibly monetize your data assets

Data sharing
Privacy-preserving public and third-party data sharing

Data retention
Overcome legal retention periods

Data augmentation
Intelligent data augmentation to reduce bias, extend and balance datasets


Example: freely use and share anonymous synthetic data


Synthetic customer data


Thank You
