PROVIDING TRUST IN DATA-DRIVEN INNOVATION
2 | Copyright © 2020 Syntho B.V. All rights reserved.
100% more compliance costs for companies that lack privacy protection
50% of data for AI will be unlocked by privacy enhancing techniques
30% more profits for companies that earn and maintain digital trust with customers
70% of organizations have storage of personal data as biggest privacy risk
70% increase in industry collaborations expected with use of privacy tools
30% of companies cite privacy as no. 1 barrier for AI implementation
30% of customers trust their insurer to use their personal data
40% of privacy compliance tooling will rely on AI in 2023, up from 5% today
65% of population will have data privacy regulations in 2023, up from 10% today
25% of training data for AI will be synthetically generated
Data Privacy – a Key Driver for Business Success
Data privacy is fundamental to the success of organizations, and ignoring it is detrimental
• Preserving Privacy While Using Personal Data for AI Training: Gartner 2020
• The State of Privacy and Personal Data Protection 2020-2022: Gartner 2020
• 100 Data and Analytics Predictions Through 2024: Gartner 2020
• Cool Vendors in AI Core Technologies: Gartner 2020
• Hype Cycle for Privacy 2020: Gartner 2020
• 5 Areas Where AI Will Turbocharge Privacy Readiness: Gartner 2019
• Top 10 Strategic Technology Trends for 2019: Gartner 2019
The added value of Syntho and Syntho’s data privacy solution* for companies
* Syntho solution - AI software-as-a-service (SaaS) tool for customers to generate synthetic data from personal data
Enable data-driven innovation
Syntho enables companies to preserve data utility for data-driven innovations while meeting privacy standards

Reduce data breaches
Syntho allows companies to reduce the chance of a privacy-sensitive data breach, which costs $8.2M per breach on average and $388M for large companies

Protect brand and reputation
By securely protecting personal data through the use of synthetic data, Syntho helps prevent customers from becoming victims of a breach and limits reputational damage

Compliance with privacy regulations
Syntho ensures compliance with modern data privacy regulations (e.g. GDPR), which apply to 65% of the world’s population in 2023, up from 10% today
• Hype Cycle for Privacy 2020: Gartner 2020
• Procurement on the Front Lines: New Trends in Data Privacy and Cybersecurity Risks: Gartner 2020
The problem
Data privacy hinders data-driven innovation, and classic anonymization techniques fail to provide a solution

Privacy hinders data-driven innovation
Data privacy rules and legislation (e.g. GDPR) rightfully protect individuals, but hinder organizations from ‘innovating’ with data, which may include any data processing activity to improve the business

Classic anonymization is no solution
Classic privacy-enhancing techniques exhibit a trade-off between data privacy and data utility: they distort the structure and statistical properties of the data, and a privacy risk still remains

Data sharing
Data commerce
Test & develop
AI & analytics
Classic ‘anonymization’ offers no solution
Destroys data
Always a privacy risk
Classic ‘anonymization’ offers no solution
Classic privacy-enhancing techniques do not lead to private data and result in a loss of information

Classic ‘anonymization’
Example technique | Original data | Manipulated data
Generalization | 27 years old | Between 25 and 30 years old
Suppression / Wiping | [email protected] | [email protected]
Pseudonymization | Amsterdam | hVFD6td3jdHHj78ghdgrewui6
Row and column shuffling | Aligned | Shuffled
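
To make the four classic techniques in the table concrete, here is a minimal Python sketch. It is an illustration only and not how any particular product implements them: the function names, the 5-year bucket width, the salt, the mask string and the cities other than Amsterdam are all assumptions made for this example.

```python
import hashlib
import random


def generalize_age(age, width=5):
    """Generalization: replace an exact age with a coarse range."""
    low = (age // width) * width
    return f"{low}-{low + width}"


def suppress(value, mask="xxx"):
    """Suppression / wiping: discard the value and return a fixed mask."""
    return mask


def pseudonymize(value, salt="demo-salt"):
    """Pseudonymization: replace a value with a stable token.

    The same input always gives the same token, so records remain linkable.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:24]


def shuffle_column(rows, column, seed=0):
    """Column shuffling: permute one column so values detach from their rows."""
    values = [row[column] for row in rows]
    random.Random(seed).shuffle(values)
    return [{**row, column: v} for row, v in zip(rows, values)]


# Hypothetical records; only "Amsterdam" and the age 27 come from the slide.
rows = [
    {"name": "Olivia", "age": 27, "city": "Amsterdam"},
    {"name": "John", "age": 75, "city": "Utrecht"},
    {"name": "George", "age": 41, "city": "Rotterdam"},
]

anonymized = [
    {
        "name": suppress(r["name"]),
        "age": generalize_age(r["age"]),
        "city": pseudonymize(r["city"]),
    }
    for r in rows
]
```

Note that `anonymized` has exactly one output record per input record. Each manipulated row still corresponds to one real person, which is the remaining privacy risk the slide refers to, while the coarser values are the loss of information.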
The privacy-utility trade-off
Classic privacy-enhancing techniques suffer from a trade-off between data privacy and data utility
[Figure: privacy protection versus data utility, comparing classic ‘anonymization’ with synthetic data by Syntho]
A B C
Which of these images is fake?
A B C
Synthetic images: generated by Artificial Intelligence (AI)
* www.thispersondoesnotexist.com
AI-generated synthetic data
The Syntho Engine is capable of generating highly realistic and anonymous synthetic data based on real data

Original data
× High privacy risk
× Locked value for analysis
× (Legal) privacy restrictions

Syntho Engine
State-of-the-art software solution using generative adversarial networks (GAN)
Your secure IT environment

Synthetic data
✓ No privacy risk
✓ Statistical value and granularity preserved
✓ Unrestricted use and sharing
AI-generated synthetic data: a game changer
Synthetic data largely overcomes the traditional trade-off between data privacy and data utility
[Figure: privacy protection versus data utility, comparing classic ‘anonymization’ with synthetic data]
Synthetic data privacy and quality
Privacy-preserving synthetic data
The concept of privacy is fully embedded and a consequence of the applied technology

Original data
Name | Age | Gender | Item | Price | Date
Olivia | 26 | Female | Shoes | €125 | 4 March
John | 75 | Male | Laptop | €695 | 5 March
George | 41 | Male | Beer | €4 | 7 March
… | … | … | … | … | …
George | 41 | Male | Shirt | €25 | 9 March
N=100k

Original data with applied classic anonymization
Name | Age | Gender | Item | Price | Date
xxx | 25-30 | Female | Shoes | €100 - €200 | March
xxx | 70-75 | Male | Laptop | €600 - €700 | March
xxx | 40-45 | Male | Beer | <€5 | March
… | … | … | … | … | …
xxx | 40-45 | Male | Shirt | €20 - €30 | March
N=100k

Synthetic data
Name | Age | Gender | Item | Price | Date
xxx | 23 | Female | Sofa | €790 | 1 March
xxx | 23 | Female | Scarf | €40 | 3 March
xxx | 52 | Male | Razor | €5 | 7 March
… | … | … | … | … | …
… | … | … | … | … | …
… | … | … | … | … | …
xxx | 35 | Female | Wine | €7 | 9 March
N = 800k?
Classic anonymization
Dataset level: Data value destroyed
Attribute level: Privacy risk due to 1:1 relationship with original records. Number of original data records and manipulated data records is equal

Synthetic data
Dataset level: Preserved data quality
Attribute level: Synthetic data records have no 1:1 mapping with the original data. An unlimited amount of synthetic data records can be generated
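
The difference between transforming records one by one and sampling new records from a fitted model can be sketched in a few lines of Python. This is a deliberately tiny stand-in and not the Syntho Engine or a GAN: it assumes a toy model (gender frequencies plus a normal distribution of age per gender), and the `fit` and `sample` functions and the example records are hypothetical.

```python
import random
import statistics


def fit(records):
    """Learn a tiny model: gender frequencies and, per gender, age mean/stdev."""
    ages_by_gender = {}
    for r in records:
        ages_by_gender.setdefault(r["gender"], []).append(r["age"])
    total = len(records)
    return {
        gender: {
            "weight": len(ages) / total,
            "mean": statistics.mean(ages),
            "stdev": statistics.pstdev(ages),
        }
        for gender, ages in ages_by_gender.items()
    }


def sample(model, n, seed=0):
    """Draw n new records from the model; n is independent of the input size."""
    rng = random.Random(seed)
    genders = list(model)
    weights = [model[g]["weight"] for g in genders]
    out = []
    for _ in range(n):
        g = rng.choices(genders, weights=weights)[0]
        age = rng.gauss(model[g]["mean"], model[g]["stdev"])
        out.append({"gender": g, "age": max(0, round(age))})
    return out


# Hypothetical input records, loosely modelled on the slide's example table.
original = [
    {"gender": "Female", "age": 26},
    {"gender": "Male", "age": 75},
    {"gender": "Male", "age": 41},
    {"gender": "Female", "age": 34},
    {"gender": "Male", "age": 52},
    {"gender": "Female", "age": 23},
]

model = fit(original)
synthetic = sample(model, n=8 * len(original))  # 8x as many records as the input
```

Because every synthetic record is drawn from the model rather than derived from one specific input row, there is no row-to-row correspondence, and `n` can be chosen freely. A model this simple can still leak information (for example when a group contains a single person), so it says nothing about the privacy guarantees of a real generator.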
Synthetic data quality
To demonstrate the quality of the synthetic data, we provide a detailed quality report and offer a joint evaluation

Statistical quality report
• Univariate distributions
• Correlations
• Multivariate distributions
• Additional measures upon request

Joint evaluation
• By definition, data utility (or ‘usability’) can only be understood in relation to the target domain, where the data will be used, shared and / or stored.
• This is why we propose to evaluate the synthetic data with a domain expert in order to demonstrate that the synthetic data ‘makes sense’.
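
As a rough idea of what the univariate and correlation checks in such a report might compute, here is a small Python sketch. The report layout and the `pearson` and `quality_report` helpers are assumptions for illustration, not the actual Syntho quality report; the age and price values are taken from the example tables on the earlier slide.

```python
import statistics
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def quality_report(original, synthetic, columns):
    """Compare per-column statistics and pairwise correlations."""
    report = {"univariate": {}, "correlation": {}}
    for col in columns:
        o = [r[col] for r in original]
        s = [r[col] for r in synthetic]
        report["univariate"][col] = {
            "mean_diff": abs(statistics.mean(o) - statistics.mean(s)),
            "stdev_diff": abs(statistics.pstdev(o) - statistics.pstdev(s)),
        }
    for a, b in combinations(columns, 2):
        corr_o = pearson([r[a] for r in original], [r[b] for r in original])
        corr_s = pearson([r[a] for r in synthetic], [r[b] for r in synthetic])
        report["correlation"][(a, b)] = abs(corr_o - corr_s)
    return report


# Age and price (in euros) from the visible rows of the slide's example tables.
original = [
    {"age": 26, "price": 125},
    {"age": 75, "price": 695},
    {"age": 41, "price": 4},
    {"age": 41, "price": 25},
]
synthetic = [
    {"age": 23, "price": 790},
    {"age": 23, "price": 40},
    {"age": 52, "price": 5},
    {"age": 35, "price": 7},
]

report = quality_report(original, synthetic, ["age", "price"])
```

Smaller differences suggest the synthetic data follows the original more closely, but with only four rows per table the numbers here are not meaningful. Summary statistics of this kind complement, rather than replace, the joint evaluation with a domain expert.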
Syntho Engine preserves deep ‘hidden’ relations
Deep ‘hidden’ relations, like multivariate distributions and correlations, are also captured by the Syntho Engine
[Figure: original data compared with synthetic data]
So, why still use personal data if you can use synthetic data?
From a privacy and utility perspective, you should always opt for synthetic data when your use case allows it

Data type | Value for analysis | Privacy risk
Synthetic data | High | None
Personal data | High | High
Classic ‘anonymized’ data | Low-Medium | Medium-High
* https://iapp.org/news/a/accelerating-ai-with-synthetic-data/
So, why still use personal data if you can use synthetic data?
Using synthetic data to minimize personal data and unlock formerly locked personal data yields several benefits
Locked personal data
Personal data in use
Synthetic data
Now Future
Unlock personal data
Minimize personal data
Key benefits
1. Less risk
2. More data
3. Faster data access
Boosted innovation
Use cases
Use cases
Synthetic data unlocks a wide variety of client use cases to boost innovation
• Testing and development: GDPR compliant test environments
• Agile analytics: Eliminate time-consuming governance blocking data access and innovation
• Data commerce: Responsibly monetize your data assets
• Data sharing: Privacy-preserving public and third party data sharing
• Data retention: Overcome legal retention periods
• Data augmentation: Intelligent data augmentation to reduce bias, extend and balance datasets
Example: freely use and share anonymous synthetic data
Synthetic customer data
Thank You