Automated Website Fingerprinting through Deep Learning

Vera Rimmer, Davy Preuveneers, Marc Juarez, Tom Van Goethem and Wouter Joosen

NDSS 2018 – Feb 19th (San Diego, USA)




Website Fingerprinting


Anonymous Communication through Tor

› All (secure) communication protocols expose metadata

› Timing, size of packets, identities, locations, addresses, communication patterns → reveal private information

› Anonymity tools relay traffic through protected communication channels

The Onion Router (Tor)


Website Fingerprinting

› Side-channel attack that reveals a user’s browsing activity

› Adversary is a local eavesdropper

• ISP
• Autonomous Systems
• Local network admins
• Wi-Fi hotspot owners


› Features observable in encrypted traffic:

• Number of packets
• Average packet size
• % of incoming packets
• Timing of packets
• ...
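The summary features listed above can be computed directly from a captured trace. The following is a minimal sketch, assuming a trace is a list of (timestamp, signed size) pairs in which a negative size marks an incoming packet. That representation and the name `basic_features` are illustrative assumptions, not the format used by the authors.

```python
def basic_features(trace):
    """Compute simple summary features from a packet trace.

    `trace` is assumed to be a list of (timestamp_seconds, signed_size)
    tuples; a negative size marks an incoming packet (our convention).
    """
    if not trace:
        raise ValueError("empty trace")
    sizes = [abs(size) for _, size in trace]
    incoming = sum(1 for _, size in trace if size < 0)
    timestamps = [t for t, _ in trace]
    return {
        "num_packets": len(trace),
        "avg_packet_size": sum(sizes) / len(sizes),
        "pct_incoming": 100.0 * incoming / len(trace),
        "total_time": max(timestamps) - min(timestamps),
    }


# Hypothetical four-packet trace: two outgoing, two incoming.
example = [(0.00, 600), (0.12, -1500), (0.15, -1500), (0.40, 300)]
features = basic_features(example)
```

Each trace becomes one fixed-size feature record, which is the form a traditional classifier expects as input.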


"Closed world" of websites

Website Fingerprinting Pipeline

Communication patterns → Feature Extraction → Machine Learning → Identification

State-of-the-Art Attacks

› kNN (Wang et al., 2014)
• 3,000 features picked through heuristics (total size, total time, number of packets, packet ordering, traffic bursts…)
• Classifier: k-Nearest Neighbors

› k-Fingerprinting (Hayes et al., 2016)
• 150 features selected from Wang’s through the analysis of feature importance
• Classifier: Random Forest and k-Nearest Neighbors

› CUMUL (Panchenko et al., 2016)
• 100 features, interpolation points of the cumulative sum of packet lengths
• Classifier: Support Vector Machine
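To make the CUMUL feature description concrete, here is a simplified sketch that samples the cumulative sum of signed packet lengths at equally spaced positions using linear interpolation. The sign convention, the function name `cumul_features` and the choice of the cumulative absolute length as the x-axis are assumptions of this sketch. It computes only the interpolation points mentioned on the slide and is not the authors' implementation.

```python
def cumul_features(signed_lengths, n_points=100):
    """Sample the cumulative sum of packet lengths at n_points positions.

    `signed_lengths` is assumed to hold non-zero packet lengths, positive
    for one direction and negative for the other (our convention).
    """
    if not signed_lengths or n_points < 2:
        raise ValueError("need a non-empty trace and at least 2 points")
    # Breakpoints of a piecewise-linear curve: x is the cumulative
    # absolute length, y is the cumulative signed length.
    xs, ys = [0.0], [0.0]
    for length in signed_lengths:
        xs.append(xs[-1] + abs(length))
        ys.append(ys[-1] + length)

    features = []
    seg = 0
    step = xs[-1] / (n_points - 1)
    for i in range(n_points):
        x = min(i * step, xs[-1])
        # Advance to the segment [xs[seg], xs[seg + 1]] that contains x.
        while seg < len(xs) - 2 and x > xs[seg + 1]:
            seg += 1
        x0, x1 = xs[seg], xs[seg + 1]
        y0, y1 = ys[seg], ys[seg + 1]
        features.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return features


# Hypothetical trace, sampled at only 5 points to keep the result readable.
small = cumul_features([600, -1500, -1500, 300], n_points=5)
full = cumul_features([600, -1500, -1500, 300])  # default: 100 features
```

Whatever the length of the trace, the output has a fixed size, which is what allows a classifier such as an SVM to consume it.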

Website Fingerprinting Arms-race

Communication patterns → Feature Extraction → Machine Learning → Identification

Main focus of prior work:
• Manual engineering
• Intellectual effort
• Difficult and expensive

AND the success of attacks is defined by the set of engineered features

Concealing these features creates a countermeasure

New attack exploits other, still visible features

Alternative?


Deep Learning


Deep Learning for WF


Why Deep Learning?

› Automatic feature learning from raw input

› Obviates hand-engineering of features

› Adaptive to changes in patterns

› Limited transparency and interpretability

› Learned features are implicit and abstract

› Efficient, easily distributed and parallelized


Deep Learning based WF

› Data Collection

› DL requires a lot of training data

› Deep Neural Network choice

› Choosing the best suited deep learning algorithm

› Hyperparameter Tuning and Model Selection

› Tuning of heavily parameterised models


Data Collection

› Built a distributed crawler

› Captures timing, direction and sizes of TCP packets

› 2,500 traces for each of the 900 most popular Alexa sites: largest-ever dataset

› Closed worlds: CWN datasets, where N is the number of sites


Deep Neural Networks

› Choice of a Deep Neural Network (DNN) suited for the input data

› 1D sequences of incoming and outgoing Tor cells encoded as 1 and -1

› Explored 3 major types of DNNs:

› Feedforward: Stacked Denoising Autoencoder (SDAE)
• learns from the continuous structure through dimensionality reduction

› Convolutional: Convolutional Neural Network (CNN)
• learns from the spatial structure through convolutions and subsampling

› Recurrent: Long Short-Term Memory (LSTM)
• learns from the temporal structure (time-series) through internal memory
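The input representation described above, a 1D sequence of Tor cell directions encoded as 1 and -1, can be sketched as follows. Which direction maps to +1, the zero padding, the truncation to a fixed length and the name `encode_cells` are all assumptions made for illustration. The slides do not specify them.

```python
def encode_cells(directions, length, outgoing=1, incoming=-1, pad=0):
    """Encode a trace of cell directions as a fixed-length list of +1/-1.

    `directions` is assumed to be a list of "out"/"in" strings, one per
    Tor cell. Which direction maps to +1 is a convention; the defaults
    here are an assumption. Shorter traces are padded with `pad` and
    longer ones are truncated, so every trace has the same input size.
    """
    mapping = {"out": outgoing, "in": incoming}
    encoded = [mapping[d] for d in directions[:length]]
    return encoded + [pad] * (length - len(encoded))


trace = ["out", "in", "in", "out", "in"]
padded = encode_cells(trace, length=8)     # [1, -1, -1, 1, -1, 0, 0, 0]
truncated = encode_cells(trace, length=3)  # [1, -1, -1]
```

A fixed input length matters because feedforward and convolutional networks expect inputs of a constant size.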


Evaluation and Results

Re-evaluation of Traditional Attacks

[Figure: accuracy (%) of the re-evaluated attacks; bar labels 95.43, 92.47 and 92.87]

Callout on one bar: best performing in the closed world, most practical attack

Closed World

› Overall, comparable with the state-of-the-art

› CW100: CUMUL still outperforms all attacks, followed by CNN

› Accuracy falls as the number of websites increases

› CW900: SDAE outperforms state-of-the-art

Number of Traces per Website

[Figure: accuracy (%) versus number of instances per website]

LSTM takes longer to catch up (due to learning constraints on long sequences)

Concept Drift

[Figure: concept drift on CW200, with the moment of training marked]

SDAE, LSTM and CNN generalize better than the state-of-the-art

Implications and Take-aways


› First thorough evaluation of DL for WF

› Powerful and robust attack (accuracy: 96% for CW100, 94% for CW900)

› Each DNN has its strengths and weaknesses

› Game-changer for the WF arms-race:

› Automated feature learning (vs. the burden of manual feature engineering)

› Harder to defend against (due to non-trivial interpretability of features)

› Data collection and model selection are crucial to the performance

› Evaluated by collecting the largest dataset for WF


Thank you!

WEBSITE FINGERPRINTING THROUGH DEEP LEARNING https://distrinet.cs.kuleuven.be/software/tor-wf-dl


References

1. T. Wang and I. Goldberg, “Improved Website Fingerprinting on Tor,” in ACM Workshop on Privacy in the Electronic Society (WPES). ACM, 2013, pp. 201–212.

2. T. Wang and I. Goldberg, “On Realistically Attacking Tor with Website Fingerprinting,” in Proceedings on Privacy Enhancing Technologies (PoPETs). De Gruyter Open, 2016, pp. 21–36.

3. T. Wang, X. Cai, R. Nithyanand, R. Johnson, and I. Goldberg, “Effective Attacks and Provable Defenses for Website Fingerprinting,” in USENIX Security Symposium. USENIX Association, 2014, pp. 143–157.

4. A. Panchenko, F. Lanze, A. Zinnen, M. Henze, J. Pennekamp, K. Wehrle, and T. Engel, “Website Fingerprinting at Internet Scale,” in Network & Distributed System Security Symposium (NDSS). IEEE Computer Society, 2016, pp. 1–15.

5. J. Hayes and G. Danezis, “k-fingerprinting: A Robust Scalable Website Fingerprinting Technique,” in USENIX Security Symposium. USENIX Association, 2016, pp. 1–17.

6. K. Abe and S. Goto, “Fingerprinting Attack on Tor Anonymity Using Deep Learning,” Proceedings of the Asia-Pacific Advanced Network, vol. 42, pp. 15–20, 2016.

7. M. Juarez, S. Afroz, G. Acar, C. Diaz, and R. Greenstadt, “A Critical Evaluation of Website Fingerprinting Attacks,” in ACM Conference on Computer and Communications Security (CCS). ACM, 2014, pp. 263–274.

SDAE

[Figure: autoencoder and SDAE classifier; labels: Tor cells, hidden representation, feature extraction, classification]

CNN

[Figure: CNN classifier; labels: Tor cells, convolution, subsampling, feature maps, feature extraction, classification]
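The two operations named in the CNN figure, convolution and subsampling, can be illustrated on a short sequence of encoded Tor cells. This is a toy sketch in plain Python, not the network used in the talk. The filter values are hypothetical (in a real CNN they are learned during training), and max pooling is assumed as the subsampling method.

```python
def conv1d(sequence, kernel):
    """Slide the kernel over the sequence without padding.

    Like most deep learning libraries, this computes a cross-correlation
    (the kernel is not flipped).
    """
    width = len(kernel)
    return [
        sum(sequence[i + j] * kernel[j] for j in range(width))
        for i in range(len(sequence) - width + 1)
    ]


def max_pool1d(sequence, size):
    """Subsample by keeping the maximum of each non-overlapping window."""
    return [
        max(sequence[i:i + size])
        for i in range(0, len(sequence) - size + 1, size)
    ]


cells = [1, -1, -1, -1, 1, 1, -1, -1]
kernel = [1, -1, -1]                 # hypothetical filter for the pattern +1, -1, -1
feature_map = conv1d(cells, kernel)  # [3, 1, -1, -3, 1, 3]
pooled = max_pool1d(feature_map, 2)  # [3, -1, 3]
```

The feature map peaks wherever the input matches the filter's pattern, and pooling keeps those peaks while shortening the sequence.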


LSTM

[Figure: LSTM unit and LSTM network]

Closed World vs Open World

[Figure: side-by-side illustration of the closed world and the open world]

State-of-the-Art Attacks

› kNN (Wang et al., 2014)
• Features: 3,000 (picked through heuristics): total size, total time, number of packets, packet ordering, traffic bursts…
• Classifier: k-Nearest Neighbors (k-NN)
• Accuracy: 92% (100 websites)

[Figure: k-NN example in the (x1, x2) plane for k=4 and k=7]
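As a reminder of how a k-Nearest Neighbors classifier decides, here is a minimal sketch with plain Euclidean distance and majority voting. It is a generic k-NN, not the attack of Wang et al.: any feature weighting used in the original attack is omitted, and the two-dimensional points and site labels are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs. Distance is plain
    Euclidean, an assumption made for brevity.
    """
    def distance(vector):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(vector, query)))

    neighbours = sorted(train, key=lambda item: distance(item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical 2D feature vectors (x1, x2) for two websites.
train = [
    ((1.0, 1.0), "site_a"), ((1.2, 0.8), "site_a"), ((0.9, 1.1), "site_a"),
    ((5.0, 5.0), "site_b"), ((5.2, 4.9), "site_b"), ((4.8, 5.1), "site_b"),
]
label = knn_predict(train, (1.1, 0.9), k=3)
```

Changing `k`, as in the figure's k=4 and k=7 cases, changes how many neighbours take part in the vote and can change the predicted label near class boundaries.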


› k-Fingerprinting (Hayes et al., 2016)
• Features: 150 (selected from Wang’s through analysis of feature importance)
• Classifier: Random Forest + k-NN
• Accuracy: 93% (100 websites)

› CUMUL (Panchenko et al., 2016)
• Features: 100 (derived as interpolation points of the cumulative sum of packet lengths)
• Classifier: Support Vector Machine (SVM)
• Accuracy: From 97% (100 websites)

Open World: ROC Curve

Monitored: 200 websites
Non-monitored: 400,000 websites

CNN and SDAE outperform state-of-the-art
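An ROC curve for the open-world setting can be traced by sweeping a decision threshold over the classifier's confidence that a trace belongs to a monitored site. The sketch below assumes such a confidence score is available for every test trace and that both monitored and non-monitored traces are present. The scores, labels and the name `roc_points` are hypothetical.

```python
def roc_points(scores, is_monitored):
    """Return (false positive rate, true positive rate) pairs.

    `scores[i]` is the classifier's confidence that trace i belongs to a
    monitored site; `is_monitored[i]` is the ground truth. One point is
    produced per distinct threshold, from strictest to most permissive.
    """
    positives = sum(is_monitored)
    negatives = len(is_monitored) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        flagged = [m for s, m in zip(scores, is_monitored) if s >= threshold]
        true_pos = sum(flagged)
        false_pos = len(flagged) - true_pos
        points.append((false_pos / negatives, true_pos / positives))
    return points


# Hypothetical confidences for three monitored and three non-monitored traces.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
truth = [True, True, False, True, False, False]
points = roc_points(scores, truth)
```

A curve that rises quickly toward a true positive rate of 1 while the false positive rate stays low indicates a stronger attack, which is how the attacks are compared on this slide.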