exploiting data-usage statistics for website ... · exploiting data-usage statistics for website...
TRANSCRIPT
S C I E N C E P A S S I O N T E C H N O L O G Y
www.iaik.tugraz.at
Exploiting Data-Usage Statistics forWebsite Fingerprinting Attackson AndroidRaphael Spreitzer, Simone Griesmayr,Thomas Korak, and Stefan MangardIAIK, Graz University of Technology, Austria
WiSec 2016, Darmstadt, Germany, 18th July 2016
www.iaik.tugraz.at
Contributions
Side-channel attack to infer browsing behavior
Unprivileged application
Data-usage statistics
High accuracy
Also works when traffic is routed through Tor
READ_HISTORY_BOOKMARKS does not provide protection
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 20162
www.iaik.tugraz.at
Website FingerprintingTraditional attack scenario
Attacker located somewhere on the victim’s networkTraffic analysis techniques to infer browsing behavior
InternetSSH tunnel
Attacker
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 20163
www.iaik.tugraz.at
Website FingerprintingAttack scenario against smartphones
Malicious application running in unprivileged modeObserve information “leaking” from browser application
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 20164
www.iaik.tugraz.at
Data-Usage StatisticsWhat is this?
Track the amount of incoming/outgoing network trafficUsers can stick to their data planAvailable to all apps w/o any permission
Availability/proc/uid_stat/[uid]/tcp_rcv|tcp_snd
Android API TrafficStats.getUidRxBytes, .getUidTxBytesHow to get uid?
ActivityManager.getRunningAppProcesses() (REAL_GET_TASKS?)PackageManager.getInstalledApplications()
High resolution (single TCP packet lengths)Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 20165
www.iaik.tugraz.at
Data-Usage Statistics
Experiment
Local server hosting awebsite (tcpdump)
Launch website onAndroid (data-usagestatistics)
0 5 10 15−2000
0
2000
4000
6000
8000
10000
TCP packet sequence
Tra
ffic
acco
rdin
g t
o T
CP
pa
cke
ts [
byte
]
1448
1448
1417
1448
1448
1448 71
366364
Outgoing traffic on server
Incoming traffic on server
0 10 20−2000
0
2000
4000
6000
8000
10000
TCP packet sequence
Tra
ffic
acco
rdin
g t
o d
ata
−u
sa
ge
sta
tistics [
byte
]
1448
1448
1417
1448
1448
1519
366364
tcp_rcv on smartphone
tcp_snd on smartphone
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 20166
www.iaik.tugraz.at
Usage Statistics for Real Websites
0 10 20 30 40 500
1
2
3
4
5
6x 10
5
Packet sequence
Da
ta−
usa
ge
sta
tistics [
byte
]
google.com
facebook.com
youtube.com
Websites aredistinguishable
Stable: signatures ofrepeated visits to thesame page are similar
Diverse: signatures ofdifferent pages vary
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 20167
www.iaik.tugraz.at
Adversary Model and Attack Scenario
Adversary model
Traditional: nw-based attacker
Unprivileged app distributed via app market
Attack
1. Training phase (offline)
2. Attack phase (online)
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 20168
www.iaik.tugraz.at
Website Fingerprinting
Training phase
Observe data-usage statistics while loading specific websites
⇒ build signature database
No “fancy” machine-learning approach
⇒ no expensive training phase necessary
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 20169
www.iaik.tugraz.at
Website FingerprintingAttack phase
1. Distribute malicious application
2. Observe data-usage statistics for browser application
3. Infer visited website by means of signature database
Similarity metric for traces
SIM(t1, t2) =|t1 ∩ t2||t1 ∪ t2|
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 201610
www.iaik.tugraz.at
ResultsIntra-day classification rate
100 most popular websites globally
20 40 60 80
10
20
30
40
50
60
70
80
90
0
0.2
0.4
0.6
0.8
1
89% of 500 page visits
confusion of google*.* pages
20 40 60 80
10
20
30
40
50
60
70
80
90
0
0.2
0.4
0.6
0.8
1
98% of 500 page visits
with google*.* pages mergedSpreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 201611
www.iaik.tugraz.at
ResultsInter-day classification for Tor
100 most popular websites in the US
0 1 2 3 4 5 6 70.8
0.85
0.9
0.95
1
Cla
ssific
ation r
ate
Days between gathering training data and test data (∆)
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 201612
www.iaik.tugraz.at
ResultsWebsites with the highest number of misclassifications
∆ Website # misclassifications
2 days ask.com 5 times2 days twitch.tv 5 times2 days cnn.com 3 times
5 days bbc.com 5 times5 days indeed.com 5 times5 days nytimes.com 5 times5 days twitch.tv 5 times5 days espn.go.com 4 times
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 201613
www.iaik.tugraz.at
ComparisonWork Exploited information Countermeasure # websites Classification rate
Ours Client-side data-usage statistics None 500 97%Jana and Shmatikov [JS12] Client-side memory footprint None 100 35%
Ours Client-side data-usage statistics Tor 100 95%Wang et al. [WCN+14] TCP packets Tor 100 95%Wang and Goldberg [WG13] TCP packets Tor 100 91%Cai et al. [CZJJ12] TCP packets captured via tshark Tor 100 84%Panchenko et al. [PNZE11] Client-side tcpdump Tor 775 55%Herrmann et al. [HWF09] Client-side tcpdump Tor 775 3%
AdvantagesEase of applicability (unprivileged app vs on the wire)Computational performance (no training vs 608 000 CPU seconds)Classification ratesNo traffic noise due to other apps
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 201614
www.iaik.tugraz.at
CountermeasuresAgainst NW-based fingerprinting attacks
Traffic morphing, HTTPOS, BuFLO, Glove
Tor?
Client-side countermeasures
Permission-based approaches? [ZDH+13]
Request permission to monitor data-usage statistics?Let developers specify how statistics should be published?
⇒ update data-usage statistics on a more coarse-grained level
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 201615
www.iaik.tugraz.at
ConclusionsFundamental weaknesses in Android
Seemingly innocuous information
. . . that turns out to be a serious information leak
Unprivileged app can infer browsing behavior, although
Orweb or “private/incognito” modes do not store browsing history
Traffic is routed through Tor
READ_HISTORY_BOOKMARKS should protect this sensitive information
⇒ Privacy issue
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 201616
S C I E N C E P A S S I O N T E C H N O L O G Y
www.iaik.tugraz.at
Exploiting Data-Usage Statistics forWebsite Fingerprinting Attackson AndroidRaphael Spreitzer, Simone Griesmayr,Thomas Korak, and Stefan MangardIAIK, Graz University of Technology, Austria
WiSec 2016, Darmstadt, Germany, 18th July 2016
www.iaik.tugraz.at
Bibliography I
[CZJJ12] Xiang Cai, Xin Cheng Zhang, Brijesh Joshi, and Rob Johnson.
Touching from a Distance: Website Fingerprinting Attacks and Defenses.
In Conference on Computer and Communications Security – CCS 2012, pages 605–616. ACM, 2012.
[HWF09] Dominik Herrmann, Rolf Wendolsky, and Hannes Federrath.
Website Fingerprinting: Attacking Popular Privacy Enhancing Technologies with the Multinomial Naı̈ve-Bayes Classifier.
In Cloud Computing Security Workshop – CCSW, pages 31–42. ACM, 2009.
[JS12] Suman Jana and Vitaly Shmatikov.
Memento: Learning Secrets from Process Footprints.
In IEEE Symposium on Security and Privacy – S&P 2012, pages 143–157. IEEE Computer Society, 2012.
[PNZE11] Andriy Panchenko, Lukas Niessen, Andreas Zinnen, and Thomas Engel.
Website Fingerprinting in Onion Routing Based Anonymization Networks.
In Workshop on Privacy in the Electronic Society – WPES 2011, pages 103–114. ACM, 2011.
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 201618
www.iaik.tugraz.at
Bibliography II
[WCN+14] Tao Wang, Xiang Cai, Rishab Nithyanand, Rob Johnson, and Ian Goldberg.
Effective Attacks and Provable Defenses for Website Fingerprinting.
In USENIX Security Symposium 2014, pages 143–157. USENIX Association, 2014.
[WG13] Tao Wang and Ian Goldberg.
Improved Website Fingerprinting on Tor.
In Workshop on Privacy in the Electronic Society – WPES 2013, pages 201–212. ACM, 2013.
[ZDH+13] Xiao-yong Zhou, Soteris Demetriou, Dongjing He, Muhammad Naveed, Xiaorui Pan, XiaoFeng Wang, Carl A. Gunter, and KlaraNahrstedt.
Identity, Location, Disease and More: Inferring Your Secrets from Android Public Resources.
In Conference on Computer and Communications Security – CCS 2013, pages 1017–1028. ACM, 2013.
Spreitzer, Griesmayr, Korak, MangardWiSec 2016, Darmstadt, Germany, 18th July 201619