Provably robust deep learning
J. Zico Kolter
Carnegie Mellon University and Bosch Center for AI
Outline
Introduction
Attacking machine learning algorithms
Defending against adversarial attacks
Final thoughts
The AI breakthrough (some recent history)
Karras et al., 2018; Radford et al., 2019; Vinyals et al., 2019
…but the stakes are low
Adversarial attacks
Sharif et al., 2016; Evtimov et al., 2017
Athalye et al., 2017
Figure from Madry et al.
… and some recent work
[Lee and Kolter, 2019], https://arxiv.org/abs/1906.11897
Why should we care?
…you probably don’t have an adversary changing inputs to your classifier at a pixel level (or if you do, you have bigger problems)
1. Genuine security implications for deep networks (e.g., with physical attacks)
2. Says something fundamental about the representation of deep classifiers, smooth decision boundaries, sensitivity to distribution shift (within threat model), etc.
Attacking machine learning algorithms
Adversarial attacks as optimization
Adversarial (worst-case) loss:
$$\mathbf{E}_{x,y}\left[\max_{\delta \in \Delta} \mathrm{Loss}(f_\theta(x + \delta), y)\right]$$
Standard loss:
$$\mathbf{E}_{x,y}\left[\mathrm{Loss}(f_\theta(x), y)\right]$$
The adversarial optimization problem
How do we solve the “inner” optimization problem
$$\max_{\delta \in \Delta} \mathrm{Loss}(f_\theta(x + \delta), y)$$
Key insight: the same process that enabled us to learn the model parameters via gradient descent also allows us to create an adversarial example via gradient descent
$$\frac{\partial}{\partial \delta} \mathrm{Loss}(f_\theta(x + \delta), y)$$
Solving with projected gradient descent
Since we are trying to maximize the loss when creating an adversarial example, we repeatedly move in the direction of the positive gradient
Since we also need to ensure that 𝛿 ∈ Δ, we project back into this set after each step, a process known as projected gradient descent (PGD)
$$\delta := \mathrm{Proj}_\Delta\left[\delta + \alpha \frac{\partial}{\partial \delta} \mathrm{Loss}(f_\theta(x + \delta), y)\right]$$
Example: for Δ = {𝛿 : ‖𝛿‖∞ ≤ 𝜖} (called the ℓ∞ ball), the projection operator just clips each coordinate of 𝛿 to [−𝜖, 𝜖]
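The PGD update can be sketched in a few lines of plain Python. This is a minimal illustration, not the code behind the slides: it assumes a linear classifier with logistic loss and labels in {-1, +1}, so the gradient has a closed form, and the names `logistic_loss`, `grad_wrt_input`, `project_linf` and `pgd_linf` are hypothetical.

```python
import math

def logistic_loss(w, b, x, y):
    """Logistic loss of a linear classifier on one example, with y in {-1, +1}."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return math.log1p(math.exp(-margin))

def grad_wrt_input(w, b, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    scale = -y / (1.0 + math.exp(margin))
    return [scale * wi for wi in w]

def project_linf(delta, eps):
    """Projection onto the l_inf ball: clip each coordinate to [-eps, eps]."""
    return [max(-eps, min(eps, d)) for d in delta]

def pgd_linf(w, b, x, y, eps, alpha, num_steps):
    """Gradient ascent on the loss over delta, projecting after every step."""
    delta = [0.0] * len(x)
    for _ in range(num_steps):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        grad = grad_wrt_input(w, b, x_adv, y)
        stepped = [di + alpha * gi for di, gi in zip(delta, grad)]
        delta = project_linf(stepped, eps)
    return delta

# Toy model and input (assumed values, chosen only for illustration).
w, b = [2.0, -1.0], 0.5
x, y = [1.0, 0.5], 1
eps = 0.1
delta = pgd_linf(w, b, x, y, eps=eps, alpha=0.1, num_steps=20)
x_adv = [xi + di for xi, di in zip(x, delta)]
clean_loss = logistic_loss(w, b, x, y)
adv_loss = logistic_loss(w, b, x_adv, y)
print(delta, clean_loss, adv_loss)
```

For a deep network the gradient would come from automatic differentiation instead of the closed form used here; the step-then-clip structure stays the same.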
The Fast Gradient Sign Method
The Fast Gradient Sign Method (FGSM) takes a single PGD step with step size 𝛼 → ∞, which for the ℓ∞ ball corresponds exactly to just taking a step in the signs of the gradient terms
Creates weaker attacks than running full PGD, but substantially faster
[Figure: the set Δ, the starting point 𝛿 = 0, the gradient step 𝛼 ∂/∂𝛿 Loss(𝑓𝜃(𝑥 + 𝛿), 𝑦), and its projection P_Δ]
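As a rough sketch of the single FGSM step for the ℓ∞ ball: the perturbation is 𝜖 times the sign of the gradient evaluated at 𝛿 = 0. The linear logistic model and the function names below are assumptions made to keep the example self-contained.

```python
import math

def grad_wrt_input(w, b, x, y):
    """Gradient of the logistic loss of a linear classifier with respect to x."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    scale = -y / (1.0 + math.exp(margin))
    return [scale * wi for wi in w]

def sign(v):
    """Return -1, 0 or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fgsm_linf(w, b, x, y, eps):
    """One infinitely large gradient step from delta = 0, clipped to the l_inf ball."""
    grad = grad_wrt_input(w, b, x, y)  # gradient at delta = 0
    return [eps * sign(gi) for gi in grad]

# Same toy model as one might use for PGD (assumed values).
fgsm_delta = fgsm_linf([2.0, -1.0], 0.5, [1.0, 0.5], 1, eps=0.1)
print(fgsm_delta)
```

For a linear model the loss is monotone along each coordinate, so this single step already reaches the worst case in the ball; for deep networks it generally does not, which is why FGSM is the weaker attack.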
Illustration of adversarial examples
We will demonstrate adversarial attacks on the MNIST data set, using two different architectures
• 2-layer fully connected MLP: FC-100, FC-10
• 6-layer ConvNet: Conv-32x28x28, Conv-32x28x28, Conv-64x14x14, Conv-64x14x14, FC-200, FC-10
Illustrations of FGSM/PGD
[Figures: example adversarial images for the ConvNet under FGSM and under PGD]
Test error, 𝜖 = 0.1

| | Clean | FGSM | PGD |
|---|---|---|---|
| MLP | 2.9% | 92.6% | 96.4% |
| ConvNet | 1.1% | 41.7% | 74.3% |
Defending against adversarial attacks
Adversarial robustness
$$\min_\theta \mathbf{E}_{x,y}\left[\mathrm{Loss}(f_\theta(x), y)\right] \;\Longrightarrow\; \min_\theta \mathbf{E}_{x,y}\left[\max_{\delta \in \Delta} \mathrm{Loss}(f_\theta(x + \delta), y)\right]$$
1. Adversarial training: Take model SGD steps at (approximate) worst-case perturbations [Goodfellow et al., 2015; Kurakin et al., 2016; Madry et al., 2017]
2. Certified defenses: Provably upper bound inner maximization [Wong and Kolter, 2018; Raghunathan et al., 2018; Mirman et al., 2018; Cohen et al., 2019]
Adversarial training
How do we optimize the objective
$$\min_\theta \sum_{x,y \in S} \max_{\delta \in \Delta} \mathrm{Loss}(f_\theta(x + \delta), y)$$
We would like to solve it with gradient descent, but how do we compute the gradient of the objective with the max term inside?
Danskin’s Theorem
A fundamental result in optimization:
$$\frac{\partial}{\partial \theta} \max_{\delta \in \Delta} \mathrm{Loss}(f_\theta(x + \delta), y) = \frac{\partial}{\partial \theta} \mathrm{Loss}(f_\theta(x + \delta^\star), y)$$
where
$$\delta^\star = \operatorname*{argmax}_{\delta \in \Delta} \mathrm{Loss}(f_\theta(x + \delta), y)$$
Seems “obvious,” but it is a very subtle result; it means we can optimize through the max by just finding its maximizing value
Note, however, that it only applies when the max is computed exactly
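A quick numerical sanity check of Danskin’s theorem is possible when the inner max can be solved exactly. The sketch below assumes a one-dimensional linear model with logistic loss, for which the maximizer over |𝛿| ≤ 𝜖 is known in closed form; all names and numbers are illustrative. It compares a finite-difference derivative of the max with the derivative of the loss at the fixed maximizer 𝛿⋆.

```python
import math

def loss(theta, x, y, delta):
    """Logistic loss of the 1-D linear model theta * (x + delta), y in {-1, +1}."""
    return math.log1p(math.exp(-y * theta * (x + delta)))

def worst_delta(theta, y, eps):
    """Exact maximizer of the loss over |delta| <= eps (valid for theta != 0)."""
    return -eps * y * math.copysign(1.0, theta)

def robust_loss(theta, x, y, eps):
    """Value of the inner maximization, using the exact maximizer."""
    return loss(theta, x, y, worst_delta(theta, y, eps))

def danskin_grad(theta, x, y, eps):
    """Derivative of the loss in theta with delta held fixed at the maximizer."""
    d = worst_delta(theta, y, eps)
    margin = y * theta * (x + d)
    return -y * (x + d) / (1.0 + math.exp(margin))

theta, x, y, eps = 0.8, 1.5, 1, 0.3
h = 1e-6
finite_diff = (robust_loss(theta + h, x, y, eps)
               - robust_loss(theta - h, x, y, eps)) / (2 * h)
analytic = danskin_grad(theta, x, y, eps)
print(finite_diff, analytic)
```

The two numbers agree to numerical precision here because the maximizer is exact; with an approximate maximizer such as a few PGD steps, the theorem’s guarantee no longer strictly applies.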
Adversarial training
Repeat
1. Select minibatch 𝐵
2. For each (𝑥, 𝑦) ∈ 𝐵, compute adversarial example 𝛿⋆(𝑥)
3. Update parameters
$$\theta := \theta - \frac{\alpha}{|B|} \sum_{x,y \in B} \frac{\partial}{\partial \theta} \mathrm{Loss}(f_\theta(x + \delta^\star(x)), y)$$
Common to also mix robust/standard updates (not done in our case)
Test error, 𝜖 = 0.1

| | Clean | FGSM | PGD |
|---|---|---|---|
| ConvNet | 1.1% | 41.7% | 74.4% |
| Robust ConvNet | 0.9% | 2.6% | 2.8% |
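The three-step loop above can be sketched on a toy problem. This is an assumption-laden illustration, not the setup behind the MNIST numbers: it uses a linear logistic model, for which the ℓ∞ inner maximization has an exact closed form (so no PGD is needed), and a small hand-made data set. The names `adversarial_train`, `worst_case_delta` and `robust_margin` are hypothetical.

```python
import math
import random

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def worst_case_delta(w, y, eps):
    """Exact inner maximizer over the l_inf ball for a linear model."""
    return [0.0 if wi == 0 else -eps * y * math.copysign(1.0, wi) for wi in w]

def loss_grad(w, b, x, y):
    """Gradient of the logistic loss with respect to (w, b) at input x."""
    scale = -y / (1.0 + math.exp(y * (dot(w, x) + b)))
    return [scale * xi for xi in x], scale

def adversarial_train(data, eps, lr, epochs, batch_size, seed=0):
    rng = random.Random(seed)
    data = list(data)  # copy so the caller's list is not shuffled
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]          # 1. select minibatch
            grad_w, grad_b = [0.0] * len(w), 0.0
            for x, y in batch:
                delta = worst_case_delta(w, y, eps)         # 2. adversarial example
                x_adv = [xi + di for xi, di in zip(x, delta)]
                g_w, g_b = loss_grad(w, b, x_adv, y)
                grad_w = [a + c for a, c in zip(grad_w, g_w)]
                grad_b += g_b
            # 3. gradient step on the loss at the perturbed points
            w = [wi - lr * gi / len(batch) for wi, gi in zip(w, grad_w)]
            b -= lr * grad_b / len(batch)
    return w, b

def robust_margin(w, b, x, y, eps):
    """Smallest value of y * (w . (x + delta) + b) over the l_inf ball."""
    return y * (dot(w, x) + b) - eps * sum(abs(wi) for wi in w)

data = [([1.0, 1.0], 1), ([1.2, 0.8], 1), ([0.9, 1.1], 1),
        ([-1.0, -1.0], -1), ([-0.8, -1.2], -1), ([-1.1, -0.9], -1)]
w, b = adversarial_train(data, eps=0.3, lr=0.5, epochs=50, batch_size=3)
margins = [robust_margin(w, b, x, y, 0.3) for x, y in data]
print(w, b, margins)
```

A positive robust margin on every point means no perturbation within the ball flips that prediction. For deep networks the closed-form maximizer would be replaced by an approximate one such as PGD.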
Evaluating robust models
Our model looks good, but we should be careful declaring success
Need to evaluate against different attacks, PGD attacks run for longer, with random restarts, etc.
Note: it is not particularly informative to evaluate against a different type of attack, e.g., to evaluate an ℓ∞ robust model against ℓ1 or ℓ2 attacks
Provable defenses
$$\max_{\delta \in \Delta} \mathrm{Loss}(f_\theta(x + \delta), y) \;\le\; \max_{\delta \in \Delta} \mathrm{Loss}(f_\theta^{\mathrm{rel}}(x + \delta), y) \;\le\; \mathrm{Loss}(f_\theta^{\mathrm{dual}}(x, \Delta), y)$$
[Figure: three panels with axes 𝑧 and bounds ℓ, 𝑢, illustrating the relaxation]
Dual from [Wong and Kolter, 2018], also independently derived via hybrid zonotope [Mirman et al., 2018] and forward Lipschitz arguments [Weng et al., 2018]
Maximization problem is now a convex linear program [Wong and Kolter, 2018]
[Wong and Kolter, 2018], https://arxiv.org/abs/1711.00851
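To make the bounds ℓ and 𝑢 concrete, here is a small sketch of two ingredients of this kind of relaxation. It is not the dual procedure from the linked repository: it only computes elementwise bounds on the first-layer pre-activations under an ℓ∞ perturbation, and evaluates the upper face of the convex ("triangle") relaxation of a ReLU between those bounds. The weights, input and function names are assumptions.

```python
def first_layer_bounds(W, b, x, eps):
    """Elementwise bounds on z = W(x + delta) + b over ||delta||_inf <= eps."""
    lower, upper = [], []
    for row, bias in zip(W, b):
        center = sum(wij * xj for wij, xj in zip(row, x)) + bias
        radius = eps * sum(abs(wij) for wij in row)  # eps times the row's l1 norm
        lower.append(center - radius)
        upper.append(center + radius)
    return lower, upper

def relu(z):
    return max(0.0, z)

def relaxed_relu_upper(z, l, u):
    """Upper face of the convex relaxation of ReLU on [l, u], for l < 0 < u."""
    return u * (z - l) / (u - l)

# Hypothetical 2x2 layer and input.
W = [[1.0, -2.0], [0.5, 0.5]]
b = [0.0, -1.0]
x = [0.2, 0.1]
lower, upper = first_layer_bounds(W, b, x, eps=0.1)

# Unit 0 straddles zero, so its ReLU needs the relaxation.
l0, u0 = lower[0], upper[0]
grid = [l0 + k * (u0 - l0) / 10 for k in range(11)]
gaps = [relaxed_relu_upper(z, l0, u0) - relu(z) for z in grid]
print(lower, upper, gaps)
```

In this example unit 1 has a negative upper bound, so its ReLU is provably inactive across the whole perturbation set and needs no relaxation; only units whose interval straddles zero contribute looseness to the bound.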
Robust optimization: putting it all together
In the end, instead of minimizing the traditional loss…
$$\underset{\theta}{\mathrm{minimize}} \; \sum_{i=1}^{N} \ell(h_\theta(x_i), y_i)$$
…we just minimize our computed bound on loss, implemented in an auto-differentiation framework (PyTorch), and we get a guaranteed bound on worst-case loss (or error) for any norm-bounded adversarial attack
$$\underset{\theta}{\mathrm{minimize}} \; \sum_{i=1}^{N} \ell(J_{\epsilon,\theta}(x_i), y_i) \;\ge\; \underset{\theta}{\mathrm{minimize}} \; \sum_{i=1}^{N} \max_{\delta \in \Delta} \ell(h_\theta(x_i + \delta), y_i)$$
Full code available at https://github.com/locuslab/convex_adversarial
2D Toy Example
Simple 2D toy problem, 2-100-100-100-2 MLP network, trained with Adam (learning rate = 0.001, no hyperparameter tuning)
[Figure: two panels, “Standard training” and “Robust convex training”]
Standard and robust errors on MNIST, 𝜖 = 0.1

| | Error | Guaranteed robust error bound |
|---|---|---|
| Standard CNN | 1.10% | 100% |
| Robust linear classifier | 17% | 44% |
| Our method (CNN) | 1.10% | 3.70% |
MNIST Attacks
We can also look at how well real attacks perform at 𝜖 = 0.1

| | No attack | FGSM | PGD | Robust bound |
|---|---|---|---|---|
| Standard training | 1.1% | 50% | 82% | 100% |
| Our method | 1.1% | 2.1% | 2.8% | 3.7% |
What causes adversarial examples?
Adversarial examples are caused (informally) by small regions of adversarial class “jutting” into an otherwise “nice” decision region (see also, e.g., [Roth et al., 2019])
[Figure: a data point, the correct-class region, and the incorrect-class region]
Randomization as a defense?
We can “smooth” this decision region by adding Gaussian noise to the input and picking the majority class of the classifier over this noise
This was proposed (in many different ways) as a heuristic defense, but [Lecuyer et al., 2018] and later [Li et al., 2018] demonstrated that it gives certified bounds; we simplify and tighten this analysis in [Cohen et al., 2019]
[Figure: decision regions of the base classifier 𝑓(𝑥) and of the smoothed classifier 𝑔(𝑥)]
$$g(x) = \operatorname*{argmax}_{y} \; \mathbf{P}_{\epsilon \sim \mathcal{N}(0, \sigma^2 I)}\left[f(x + \epsilon) = y\right]$$
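The smoothed classifier 𝑔 can be approximated by sampling. The sketch below is a minimal Monte Carlo version: `base_classifier` is a made-up stand-in for 𝑓 (a fixed linear rule on 2-D inputs), and the sample count and noise level are arbitrary choices for illustration.

```python
import random
from collections import Counter

def base_classifier(x):
    """Hypothetical base classifier f: a fixed linear rule on 2-D inputs."""
    return 1 if x[0] + x[1] > 0 else 0

def smoothed_predict(f, x, sigma, num_samples, rng):
    """Monte Carlo estimate of g(x): majority vote of f over Gaussian noise."""
    votes = Counter()
    for _ in range(num_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[f(noisy)] += 1
    label, count = votes.most_common(1)[0]
    return label, count / num_samples

rng = random.Random(0)
label, vote_share = smoothed_predict(
    base_classifier, [1.0, 1.0], sigma=0.5, num_samples=1000, rng=rng)
print(label, vote_share)
```

The returned vote share is an empirical estimate of the probability that 𝑓 assigns the majority class under the noise distribution.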
Visual intuition of randomized smoothing
To classify a panda image, classify many copies of it perturbed by random noise and take the majority vote
Note that this requires that our “base” classifier 𝑓 be able to classify noisy images well (in practice, this means we also need to train on these noisy images)
The randomized smoothing guarantee
Theorem (binary case):
• Given some input 𝑥, let 𝑦 = 𝑔(𝑥) be the prediction of the smoothed classifier, and let 𝑝 > 1/2 be the associated probability of this class under the smoothing distribution
$$p = \mathbf{P}_{\epsilon \sim \mathcal{N}(0, \sigma^2 I)}\left[f(x + \epsilon) = y\right]$$
• Then 𝑔(𝑥 + 𝛿) = 𝑦 (i.e., the smoothed classifier is robust) for any 𝛿 such that
$$\|\delta\|_2 \le \sigma \Phi^{-1}(p)$$
where Φ⁻¹ is the Gaussian inverse CDF
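The radius in the theorem is a one-line computation once 𝑝 and 𝜎 are known. A minimal sketch using the Python standard library’s `statistics.NormalDist` for the inverse CDF; the function name `certified_radius` and the example values are assumptions.

```python
from statistics import NormalDist

def certified_radius(p, sigma):
    """l2 radius sigma * Phi^{-1}(p) from the binary-case theorem."""
    if not 0.5 < p < 1.0:
        raise ValueError("p must lie strictly between 1/2 and 1")
    return sigma * NormalDist().inv_cdf(p)

# Example: majority-class probability 0.9 under noise with sigma = 0.5.
radius = certified_radius(0.9, sigma=0.5)
print(radius)  # about 0.64
```

The radius grows with both 𝜎 and 𝑝, but larger 𝜎 makes it harder for the base classifier to keep 𝑝 high, so the two pull against each other in practice.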
Proof of certified robustness
Reasonable question: why can performance on random noise tell us anything about performance under adversarial noise?
Proof of theorem (informal):
• Suppose I have two points 𝑥 and 𝑥 + 𝛿, and you, an adversary, want to craft a decision boundary for the underlying classifier 𝑓 such that:
1. 𝑥 is classified one way by the smoothed classifier 𝑔
2. 𝑥 + 𝛿 is classified differently by the smoothed classifier 𝑔
Proof of certified robustness (cont)
[Figure sequence: the points 𝑥 and 𝑥 + 𝛿 with candidate decision boundaries for 𝑓(𝑥) and the corresponding smoothed classifier 𝑔(𝑥); the final panel marks the radius R]
For a linear classifier, we can compute the ℓ2 distance to the worst-case boundary exactly
$$R = \sigma \Phi^{-1}(p)$$
where 𝑝 is the probability of the majority class; this implies that any perturbation with ‖𝛿‖₂ ≤ 𝑅 cannot change the class label ∎
(Follows from the Neyman-Pearson lemma in hypothesis testing)
See also [Li and Kuelbs, 1998] (thanks to Ludwig Schmidt for pointing out the reference)
Caveats (a.k.a. the fine print)
The procedure here only guarantees robustness for the smoothed classifier 𝑔, not for the underlying classifier 𝑓
The probability 𝑝 of correct classification under smoothing cannot be computed exactly (the exact convolution of a Gaussian with a neural network is intractable)
• In practice, we need to resort to Monte Carlo estimates to compute a lower bound on 𝑝 and certify the prediction (need a lot of samples to compute the certified radius, though far fewer to just compute the prediction)
• Bounds hold with high probability over (internal) randomness of sampling
We are certifying a tiny radius compared to the noise distribution
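One simple way to turn Monte Carlo votes into a lower bound on 𝑝 is a one-sided Hoeffding bound, sketched below. This is a deliberately simpler, looser substitute for the Clopper-Pearson interval used in [Cohen et al., 2019]; the function names, the failure probability `alpha` and the vote counts are assumptions for illustration.

```python
import math
from statistics import NormalDist

def hoeffding_lower_bound(num_top, num_samples, alpha):
    """Lower bound on p that holds with probability at least 1 - alpha."""
    p_hat = num_top / num_samples
    return p_hat - math.sqrt(math.log(1.0 / alpha) / (2.0 * num_samples))

def certify(num_top, num_samples, sigma, alpha=0.001):
    """Certified l2 radius from vote counts, or None to abstain."""
    p_lower = hoeffding_lower_bound(num_top, num_samples, alpha)
    if p_lower <= 0.5:
        return None  # cannot certify: the lower bound does not exceed 1/2
    return sigma * NormalDist().inv_cdf(p_lower)

# 990 of 1000 noisy copies voted for the top class: certifiable.
radius = certify(990, 1000, sigma=0.25)
# 520 of 1000: the lower bound falls below 1/2, so we abstain.
abstained = certify(520, 1000, sigma=0.25)
print(radius, abstained)
```

Because the slack term shrinks only like one over the square root of the sample count, certifying a radius close to the ideal one takes many samples, which matches the caveat above.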
Comparison to previous SOTA on CIFAR10
For identical networks, mostly outperforms previous SOTA for ℓ2 robustness, but also scales to much larger networks (where it uniformly outperforms duality-based approaches)
Performance on ImageNet
Example: we can certify that the smoothed classifier has top-1 accuracy of 37% under any perturbation with ‖𝛿‖₂ ≤ 1 (in normalized pixels, i.e., RGB values in [0,1])
Future and ongoing work
Extension to other perturbation norms besides ℓ2?
• Seems extremely challenging (possibly impossible under certain assumptions), e.g., can’t do better than naive 𝑑^(1/2) scaling for the ℓ∞ norm
A strange property:
• Previous work on LP bounds was extremely specific to neural networks
• Smoothing work never uses the fact that the base classifier is a neural network
My best guess for a way forward: we need to use model information to extract properties of the base classifier beyond the single probability 𝑝, and use these to get better bounds
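One reading of the "naive 𝑑^(1/2) scaling" mentioned above, written out as a worked inequality. It assumes 𝑑 denotes the input dimension and 𝑅 an ℓ2 radius already certified by smoothing; the slides do not spell this step out.

```latex
\|\delta\|_2 = \Big(\sum_{i=1}^{d} \delta_i^2\Big)^{1/2}
\le \Big(d \cdot \max_i \delta_i^2\Big)^{1/2}
= \sqrt{d}\,\|\delta\|_\infty
\qquad\Longrightarrow\qquad
\|\delta\|_\infty \le \frac{R}{\sqrt{d}} \;\text{ implies }\; \|\delta\|_2 \le R .
```

So an ℓ2 certificate of radius 𝑅 immediately gives an ℓ∞ certificate of radius 𝑅/√𝑑, which becomes very small for high-dimensional inputs such as images.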
Final thoughts
Robust artificial intelligence
Deep learning is making amazing strides, but we have a long way to go before we can build deep learning systems that achieve even “small” degrees of robustness/adaptability compared to what humans take for granted
Resources:
• http://zicokolter.com – Web page with all papers
• http://github.com/locuslab – Code associated with all papers
• http://adversarial-ml-tutorial.org – Tutorial/code on adversarial robustness
• http://locuslab.github.io – Group blog