TRANSCRIPT
![Page 1: Machine Learning Security and Privacy in …...Adversarial Perturbations Against Deep Neural Networks for Malware Classification [ESORICS 2017] Kathrin Grosse, Nicolas Papernot, Praveen](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0a91d17e708231d42c475e/html5/thumbnails/1.jpg)
Security and Privacy in Machine LearningNicolas PapernotWork done at the Pennsylvania State University and Google Brain
July 2018 - MSR AI summer school
@NicolasPapernot
www.papernot.fr
Martín Abadi (Google Brain), Pieter Abbeel (Berkeley), Michael Backes (CISPA), Dan Boneh (Stanford), Z. Berkay Celik (Penn State), Brian Cheung (UC Berkeley), Yan Duan (OpenAI), Gamaleldin F. Elsayed (Google Brain), Úlfar Erlingsson (Google Brain), Matt Fredrikson (CMU), Ian Goodfellow (Google Brain), Kathrin Grosse (CISPA), Sandy Huang (UC Berkeley), Somesh Jha (U of Wisconsin)
2
Alexey Kurakin (Google Brain), Praveen Manoharan (CISPA), Patrick McDaniel (Penn State), Ilya Mironov (Google Brain), Ananth Raghunathan (Google Brain), Arunesh Sinha (U of Michigan), Shreya Shankar (Stanford), Jascha Sohl-Dickstein (Google Brain), Shuang Song (UCSD), Ananthram Swami (US ARL), Kunal Talwar (Google Brain), Florian Tramèr (Stanford), Michael Wellman (U of Michigan), Xi Wu (Google)
Machine learning brings social disruption at scale
3
Healthcare (Source: Peng and Gulshan, 2017)
Education (Source: Gradescope)
Transportation (Source: Google)
Energy (Source: DeepMind)
Machine learning is not magic (training time)
4
Training data
Machine learning is not magic (inference time)
5
Machine learning is deployed in adversarial settings
6
YouTube filtering
Content evades detection at inference
Tay chatbot
Training data poisoning
Machine learning does not always generalize well (1/2)
7
[Figure: a cat-vs-dog classifier labels both training data and test data as Cat / Dog.]
What if the adversary systematically poisoned the data?
8
(Understanding Black-box Predictions via Influence Functions, Koh and Liang)
9
What if the adversary systematically evaded at inference time?
Biggio et al., Szegedy et al., Goodfellow et al., Papernot et al., ...
(Goodfellow et al., 2014)
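The canonical panda example above (Goodfellow et al., 2014) is generated with the fast gradient sign method: add ε times the sign of the loss gradient with respect to the input. A minimal sketch on a binary logistic-regression model rather than a deep net, with illustrative weights and input:

```python
import numpy as np

def fgsm(x, w, b, y, eps):
    """Fast gradient sign method (Goodfellow et al., 2014) for a binary
    logistic-regression model p(y=1|x) = sigmoid(w.x + b): perturb x by
    eps in the direction that increases the cross-entropy loss."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # model's probability of class 1
    grad = (p - y) * w                      # d(cross-entropy)/dx in closed form
    return x + eps * np.sign(grad)

# Toy model and a correctly classified input (illustrative values only).
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.0, 0.5])                    # logit w.x + b = 1.5 > 0: class 1
x_adv = fgsm(x, w, b, y=1, eps=1.0)         # perturbation flips the logit's sign
```

With these values the adversarial input lands on the other side of the decision boundary even though it differs from `x` by at most ε per feature.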
Machine learning does not always generalize well (2/2)
10
[Figure: the same cat-vs-dog training and test data; a model that memorizes its training data reveals which images it was trained on.]
Membership inference attack (Shokri et al.)
What is the threat model?
11
Parties: data owners, the model host, and consumers; phases: training and inference.

| Adversarial goal | Attack / defense example |
| --- | --- |
| Data integrity | Data poisoning (Koh and Liang, 2017) |
| Model integrity | Backdoor (Gu et al., 2017) |
| Model integrity | Adversarial examples (Szegedy et al., 2013) |
| Data confidentiality | Membership inference (Shokri et al., 2017) |
| Data privacy | CryptoNets (Dowlin et al., 2016) |
| Model confidentiality | Model extraction (Tramèr et al., 2016) |
| Data confidentiality / data privacy | RAPPOR (Erlingsson, 2014); Federated learning (McMahan, 2017) |
Towards the Science of Security and Privacy in Machine Learning [IEEE EuroS&P 2018]Nicolas Papernot, Patrick McDaniel, Arunesh Sinha, and Michael Wellman
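The RAPPOR entry in the threat-model slide rests on randomized response, the classic local-privacy mechanism: each user reports their true bit only with some probability, so any single report is deniable, yet the population statistic is still recoverable. A minimal sketch (the probabilities and population are illustrative, not RAPPOR's actual encoding):

```python
import numpy as np

def randomized_response(bit, p_truth, rng):
    """Report the true bit with probability p_truth; otherwise report a
    uniformly random bit. Any individual report is plausibly deniable."""
    if rng.random() < p_truth:
        return bit
    return int(rng.integers(0, 2))

rng = np.random.default_rng(0)
true_bits = (rng.random(100_000) < 0.3).astype(int)   # 30% of users hold the bit
reports = np.array([randomized_response(b, 0.5, rng) for b in true_bits])

# Debias: E[report] = 0.5 * mean + 0.5 * 0.5, so mean = (E[report] - 0.25) / 0.5
estimate = (reports.mean() - 0.25) / 0.5
```

The aggregator recovers the true fraction (about 0.3 here) to within sampling noise, without ever learning any individual's bit with certainty.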
Attacking Machine Learning Integrity with Adversarial Examples
12
The threat model
13
White-box attacker: may see the model, i.e. knows the details of the machine learning model needed to mount an attack.
Black-box attacker: may not see the model; knows very little (e.g. only gets to ask a few questions).
Jacobian-based Saliency Map Approach (JSMA)
14
The Limitations of Deep Learning in Adversarial Settings [IEEE EuroS&P 2016] Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, and Ananthram Swami
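JSMA ranks input features by a saliency map built from the model's Jacobian: a feature is worth perturbing toward a target class when increasing it raises that class's output while lowering the others. A sketch of the saliency computation on a hypothetical, hand-written Jacobian:

```python
import numpy as np

def saliency_map(jacobian, target):
    """Saliency map from the JSMA paper (Papernot et al., 2016).
    jacobian[c, i] = d(output of class c) / d(input feature i).
    A feature scores highly for the target class when increasing it
    raises the target output (alpha > 0) while lowering the combined
    output of all other classes (beta < 0)."""
    alpha = jacobian[target]                 # effect on the target class
    beta = jacobian.sum(axis=0) - alpha      # summed effect on other classes
    mask = (alpha > 0) & (beta < 0)          # keep only helpful features
    return np.where(mask, alpha * np.abs(beta), 0.0)

# Hypothetical 3-class Jacobian over 4 input features (illustrative values).
J = np.array([[ 0.2, -0.1,  0.4,  0.0],
              [-0.3,  0.2, -0.1,  0.1],
              [ 0.1, -0.2, -0.2, -0.1]])
S = saliency_map(J, target=0)
best = int(np.argmax(S))   # the feature the attack would perturb first
```

The full attack iterates: perturb the most salient feature(s), recompute the Jacobian, and repeat until the target class is reached or a distortion budget is exhausted.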
15
Adversarial examples...
16
… beyond deep learning: Logistic Regression, Support Vector Machines, Nearest Neighbors, Decision Trees
… beyond computer vision: a malware classifier's prediction flips from P[X = Malware] = 0.90, P[X = Benign] = 0.10 to P[X* = Malware] = 0.10, P[X* = Benign] = 0.90; reinforcement learning policies can be attacked as well
Useful to think about definitions and threat model
Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples [arXiv preprint] Nicolas Papernot, Patrick McDaniel, and Ian Goodfellow
Adversarial Attacks on Neural Network Policies [arXiv preprint] Sandy Huang, Nicolas Papernot, Ian Goodfellow, Yan Duan, Pieter Abbeel
Adversarial Perturbations Against Deep Neural Networks for Malware Classification [ESORICS 2017] Kathrin Grosse, Nicolas Papernot, Praveen Manoharan, Michael Backes, Patrick McDaniel
The threat model
17
White-box attacker: may see the model, i.e. knows the details of the machine learning model needed to mount an attack.
Black-box attacker: may not see the model; knows very little (e.g. only gets to ask a few questions).
Attacking remotely hosted black-box models
18
Remote ML sys
“no truck sign”, “STOP sign”
“STOP sign”
(1) The adversary queries remote ML system for labels on inputs of its choice.
Practical Black-Box Attacks against Machine Learning [AsiaCCS 2017]Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z.Berkay Celik, and Ananthram Swami
19
Remote ML sys
Local substitute
“no truck sign”, “STOP sign”
“STOP sign”
(2) The adversary uses this labeled data to train a local substitute for the remote system.
Attacking remotely hosted black-box models
20
Remote ML sys
Local substitute
“no truck sign”, “STOP sign”
(3) The adversary selects new synthetic inputs for queries to the remote ML system based on the local substitute’s output surface sensitivity to input variations.
Attacking remotely hosted black-box models
21
Remote ML sys
Local substitute
“yield sign”
(4) The adversary then uses the local substitute to craft adversarial examples, which are misclassified by the remote ML system because of transferability.
Attacking remotely hosted black-box models
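Steps (1)–(4) above can be sketched end to end. The remote system below is a hypothetical stand-in oracle that only returns labels, and the gradient-sign augmentation is a simplified version of the paper's Jacobian-based dataset augmentation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def remote_oracle(X):
    """Stand-in for the remote ML system: only its labels are observable.
    (Hypothetical decision rule, for illustration only.)"""
    return (X[:, 0] + 2 * X[:, 1] > 0).astype(int)

rng = np.random.default_rng(0)

# (1) Query the remote system on a small initial set of inputs.
X = rng.normal(size=(50, 2))
y = remote_oracle(X)

# (2) Train a local substitute on the labeled queries.
substitute = LogisticRegression().fit(X, y)

# (3) Augment: issue new queries along the substitute's gradient direction
#     (a simplified stand-in for Jacobian-based dataset augmentation).
step = 0.5 * np.sign(substitute.coef_[0])
X = np.vstack([X, X + step])
y = remote_oracle(X)
substitute.fit(X, y)

# (4) Craft an adversarial example against the substitute; by
#     transferability it often fools the remote system as well.
x = np.array([[1.0, 1.0]])                       # labeled 1 by the oracle
x_adv = x - 1.5 * np.sign(substitute.coef_[0])   # FGSM-style step on the substitute
```

Note the adversary never sees the oracle's parameters: every gradient is taken on the locally trained substitute.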
Cross-technique transferability
22
Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples [arXiv preprint] Nicolas Papernot, Patrick McDaniel, and Ian Goodfellow
ML
Properly-blinded attacks on real-world remote systems
23
All remote classifiers are trained on the MNIST dataset (10 classes, 60,000 training samples)
| Remote platform | ML technique | Number of queries | Adversarial examples misclassified (after querying) |
| --- | --- | --- | --- |
| (not named) | Deep Learning | 6,400 | 84.24% |
| (not named) | Logistic Regression | 800 | 96.19% |
| (not named) | Unknown | 2,000 | 97.72% |
24
Defending against adversarial examples
25
Learning models robust to adversarial examples is hard
26
Is attacking machine learning easier than defending it? [Blog post at www.cleverhans.io] Ian Goodfellow and Nicolas Papernot
Error spaces containing adversarial examples are large
Learning or detecting adversarial examples creates an arms race
What makes a successful deep neural network?
27
[Figure: columns for layer name (inputs, 1st–3rd hidden, softmax), neural architecture, and representation spaces; the input panda is classified correctly.]
What makes a successful adversarial example?
28
[Figure: panda + adversarial perturbation = input classified as school bus; columns for layer name, neural architecture, representation spaces, and nearest neighbors.]
Nearest neighbors indicate support from training data...
29
[Figure: the same panda + perturbation = school bus example; nearest neighbors at each layer indicate which training data supports the panda prediction versus the school-bus misprediction.]
… Deep k-Nearest Neighbors (DkNN) classifier
30
Deep k-Nearest Neighbors: Towards Confident, Interpretable and Robust Deep Learning [arXiv preprint] Nicolas Papernot, Patrick McDaniel
1. Searches for nearest neighbors in the training data at each layer
2. Estimates the nonconformity of input x for each possible label y
3. Applies conformal prediction to compute:
   a. Confidence: “How likely is the prediction given the training data?”
   b. Credibility: “How relevant is the training data to the prediction?”
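A compact sketch of the neighbor search, the nonconformity estimate, and the credibility computation, using made-up per-layer representations and a hypothetical calibration set:

```python
import numpy as np

def dknn_credibility(layer_reps, train_reps, train_labels, calib_scores, k=3):
    """Sketch of the Deep k-Nearest Neighbors test (Papernot & McDaniel).
    For each layer, find the input's k nearest training points; the
    nonconformity of a candidate label y is the number of neighbors,
    across all layers, whose training label disagrees with y."""
    n_classes = int(train_labels.max()) + 1
    disagree = np.zeros(n_classes)
    for rep, train_rep in zip(layer_reps, train_reps):
        dists = ((train_rep - rep) ** 2).sum(axis=1)
        neighbor_labels = train_labels[np.argsort(dists)[:k]]
        for y in range(n_classes):
            disagree[y] += np.sum(neighbor_labels != y)
    pred = int(np.argmin(disagree))
    # Conformal p-value: fraction of held-out calibration nonconformity
    # scores at least as large as the prediction's own score.
    credibility = float(np.mean(calib_scores >= disagree[pred]))
    return pred, credibility

# Toy example: 6 training points, two layers of made-up representations.
train_labels = np.array([0, 0, 0, 1, 1, 1])
layer1 = np.array([[0., 0.], [.1, 0.], [0., .1], [5., 5.], [5.1, 5.], [5., 5.1]])
layer2 = 2.0 * layer1                            # pretend deeper-layer features
calib_scores = np.array([0., 1., 2., 5., 6.])    # hypothetical calibration set
pred, cred = dknn_credibility([np.array([.05, .05]), np.array([.1, .1])],
                              [layer1, layer2], train_labels, calib_scores)
```

Here the input sits squarely inside the class-0 cluster at every layer, so its nonconformity is zero and its credibility is maximal; an adversarial input would find disagreeing neighbors at intermediate layers and score far lower.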
Example applications of DkNN credibility
31
[Figure: DkNN credibility distributions for clean inputs versus out-of-distribution inputs, mislabeled inputs, and adversarial examples (Carlini & Wagner, Basic Iterative Method).]
Implications for the attacker and defender
32
Defender:
- Reject low-credibility predictions: an explicit tradeoff between clean accuracy and adversarial accuracy
- Active learning: more training data through human labeling of rejected predictions
Contributes to breaking the “black-box” myth
Some (surprising) connections to fairness & interpretability
33
Adversarial Examples that Fool both Human and Computer Vision [arXiv preprint]Gamaleldin F. Elsayed, Shreya Shankar, Brian Cheung, Nicolas Papernot, Alex Kurakin, Ian Goodfellow, Jascha Sohl-Dickstein
Machine Learning with Privacy
34
Types of adversaries and our threat model
35
In our work, the threat model assumes:
- The adversary can make a potentially unbounded number of queries
- The adversary has access to model internals

Model inspection (white-box adversary): Zhang et al. (2017), Understanding DL requires rethinking generalization
Model querying (black-box adversary): Shokri et al. (2016), Membership Inference Attacks against ML Models; Fredrikson et al. (2015), Model Inversion Attacks
A definition of privacy: differential privacy
36
[Diagram: two neighbouring datasets fed through the same randomized algorithm produce answer distributions (Answer 1 … Answer n) that an observer cannot reliably tell apart.]
Private Aggregation of Teacher Ensembles (PATE)
37
Partition 1
Partition 2
Partition n
Partition 3
...
Teacher 1
Teacher 2
Teacher n
Teacher 3
...
Training
Sensitive Data
Data flow
Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data [ICLR 2017 best paper]Nicolas Papernot, Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, and Kunal Talwar
Aggregation
38
Count votes Take maximum
Intuitive privacy analysis
39
If most teachers agree on the label, it does not depend on specific partitions, so the privacy cost is small.
If two classes have close vote counts, the disagreement may reveal private information.
Noisy aggregation
40
Count votes Add Laplacian noise Take maximum
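The noisy aggregation step can be sketched directly (the noise scale gamma and vote counts are illustrative):

```python
import numpy as np

def noisy_max(votes, gamma, rng):
    """PATE noisy aggregation: perturb each class's vote count with
    Laplacian noise of inverse scale gamma, then output the argmax."""
    noise = rng.laplace(loc=0.0, scale=1.0 / gamma, size=len(votes))
    return int(np.argmax(votes + noise))

rng = np.random.default_rng(0)
votes = np.array([8, 90, 2])   # 100 teachers, strong consensus on class 1
label = noisy_max(votes, gamma=0.5, rng=rng)
```

When the teachers agree by a wide margin, the noise almost never changes the winner, which is exactly why strong consensus translates into a small privacy cost.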
Teacher ensemble
41
Partition 1
Partition 2
Partition n
Partition 3
...
Teacher 1
Teacher 2
Teacher n
Teacher 3
...
Aggregated Teacher
Training
Sensitive Data
Data flow
Student training
42
Partition 1
Partition 2
Partition n
Partition 3
...
Teacher 1
Teacher 2
Teacher n
Teacher 3
...
Aggregated Teacher Student
Training
Available to the adversaryNot available to the adversary
Sensitive Data
Public Data
Inference Data flow
Queries
Why train an additional “student” model?
43
The aggregated teacher violates our threat model:
1. Each prediction increases total privacy loss. Privacy budgets create a tension between the accuracy and the number of predictions.
2. Inspection of internals may reveal private data. Privacy guarantees should hold in the face of white-box adversaries.
Student training
44
Partition 1
Partition 2
Partition n
Partition 3
...
Teacher 1
Teacher 2
Teacher n
Teacher 3
...
Aggregated Teacher Student
Training
Available to the adversaryNot available to the adversary
Sensitive Data
Public Data
Inference Data flow
Queries
Deployment
45
Inference
Available to the adversary
QueriesStudent
Differential privacy: a randomized algorithm M satisfies (ε, δ)-differential privacy if, for all pairs of neighbouring datasets (d, d′) and for all subsets S of outputs:

Pr[M(d) ∈ S] ≤ e^ε · Pr[M(d′) ∈ S] + δ
Application of the Moments Accountant technique (Abadi et al, 2016)
Strong quorum ⟹ Small privacy cost
Bound is data-dependent: computed using the empirical quorum
Differential privacy analysis
46
Trade-off between student accuracy and privacy
47
Synergy between utility and privacy
48
1. Check privately for consensus
2. Run the noisy argmax only when consensus is sufficient
Scalable Private Learning with PATE [ICLR 2018]Nicolas Papernot, Shuang Song, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, Ulfar Erlingsson
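A sketch of this selective aggregation, in the spirit of the paper's Confident-GNMax aggregator (the threshold and noise scales below are illustrative):

```python
import numpy as np

def confident_gnmax(votes, threshold, sigma1, sigma2, rng):
    """Selective PATE aggregation sketch: (1) a noisy screening step
    checks teacher consensus, and (2) the noisy argmax runs only when
    that check passes, so privacy budget is spent mostly on queries the
    ensemble already agrees on. Returns None when the query is refused."""
    if votes.max() + rng.normal(0.0, sigma1) < threshold:
        return None                        # abstain: consensus too weak
    return int(np.argmax(votes + rng.normal(0.0, sigma2, size=len(votes))))

rng = np.random.default_rng(0)
agree = np.array([2, 95, 3])    # strong consensus: query is answered
split = np.array([48, 49, 3])   # weak consensus: query is refused
out1 = confident_gnmax(agree, threshold=70, sigma1=5.0, sigma2=5.0, rng=rng)
out2 = confident_gnmax(split, threshold=70, sigma1=5.0, sigma2=5.0, rng=rng)
```

Refusing low-consensus queries is where utility and privacy reinforce each other: those are precisely the answers that would be both noisy and privacy-expensive.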
Trade-off between student accuracy and privacy
49
Selective PATE
Machine learning and Goodhart’s law
Economist Charles Goodhart posited in 1975 that …
“When a measure becomes a target, it ceases to be a good measure.”
As ML models make more and more decisions, we will have to satisfy them, and they will become targets.
50
Thank you for listening!
51
www.cleverhans.io@NicolasPapernot
www.papernot.fr