![Page 1: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/1.jpg)
NetworkProfiler: Towards Automatic Fingerprinting of Android Apps
Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song
Presented by: Junaed Bin Halim
![Page 2: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/2.jpg)
Outline
•Goal
•Motivation
•System Overview
•Evaluation
•Limitations
•Related Work
•Conclusion/Question
![Page 3: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/3.jpg)
Goal
•What?
  –Develop a systematic tool
    • Automatically generate network profiles
    • In HTTP traffic
  –To identify apps
•How?
  –By detecting fingerprints / signatures
![Page 4: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/4.jpg)
Motivation
•Why do we need to identify applications?
  –To classify traffic generated by the applications for better network management.
  –Operators can have clear visibility into their network
    • Better security: Intrusion detection
    • Better throughput: Prioritizing real-time video over downloads, etc.
•What is traffic classification?
  –Categorize network traffic according to various parameters, e.g., port number or protocol
![Page 5: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/5.jpg)
Motivation (contd.)
•Why only Android apps?
  –Smartphone usage is increasing
    • 488m smartphones vs 415m PCs in 2011
  –Users install applications (apps) on their smartphones (avg 26 ~ 41)
    • Most applications generate network traffic
  –Researchers prefer Android over iOS (openness, availability of tools etc.)
•Why HTTP traffic?
  –>80% of smartphone traffic is HTTP.
![Page 6: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/6.jpg)
Observation
•An app can have many different network behaviors
Important to cover as many network behaviors as possible
Key Idea: Identify the invariant parts of the flows belonging to an app
![Page 7: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/7.jpg)
Network Profiler System Overview
![Page 8: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/8.jpg)
Fingerprint Extractor : Parser
•Each HTTP request is composed of 3 parts
  –m: method
  –p: page
    • pc: page component
    • fn: file name
  –q: query
    • k-v: key-value pair
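The decomposition above can be sketched in a few lines. This is only an illustration, not the authors' parser: the function name `parse_request`, the example URL, and the rule that a last path segment containing a dot is the file name are all assumptions.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_request(request_line):
    """Split an HTTP request line into method (m), page (pc, fn), and query (q)."""
    method, target, *_ = request_line.split()
    parts = urlsplit(target)
    # Page components are the non-empty path segments.
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: the last segment is a file name only if it contains a dot.
    filename = segments.pop() if segments and "." in segments[-1] else ""
    # Query is kept as an ordered list of key-value pairs.
    query = parse_qsl(parts.query, keep_blank_values=True)
    return {"m": method, "pc": segments, "fn": filename, "q": query}


# Hypothetical request used only for illustration.
example = parse_request(
    "GET /ads/v2/fetch.php?app_id=com.example.game&fmt=json HTTP/1.1"
)
```

Keeping the query as a list of pairs rather than a dict preserves repeated keys, which may matter when comparing requests later.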
![Page 9: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/9.jpg)
Fingerprint Extractor: Clusterer
•Uses agglomerative clustering to group HTTP requests by similarities.
•How to find similarity?–Use Jaccard index as a measure of similarity
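As a reminder of what the Jaccard index computes, here is a minimal sketch: the size of the intersection of two sets divided by the size of their union. Treating two empty sets as identical is an assumption made here to avoid division by zero; the slides do not say how that case is handled.

```python
def jaccard(a, b):
    """Jaccard index of two collections, compared as sets."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        # Assumption: two empty sets count as fully similar.
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)
```

For example, the page components `["ads", "v2"]` and `["ads", "v1"]` share one of three distinct elements, so their similarity is 1/3.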
![Page 10: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/10.jpg)
Fingerprint Extractor: Clusterer (2)
•Cluster
  –Distance between pages, $d_p$: 1 - similarity
  –Distance between queries, $d_q$: 1 - similarity
  –Distance between headers, $d_h = (d_p + d_q)/2$
  –Same cluster if $d_h < \theta$ [$\theta = 0.6$]
  –Merge cluster A and B if cluster C is similar to both.
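The clustering step can be sketched as follows, under several assumptions: the header distance is the mean of the page and query distances, two requests belong together when that distance is below the 0.6 threshold, and clusters are compared by their closest pair of members (single linkage). The linkage rule and the request layout (`page` and `query` as lists of tokens) are illustrative choices, not taken from the paper.

```python
def jaccard(a, b):
    """Jaccard index of two collections, compared as sets."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # assumption: two empty sets are identical
    return len(set_a & set_b) / len(set_a | set_b)


def header_distance(r1, r2):
    """Mean of the page distance and the query distance."""
    d_p = 1 - jaccard(r1["page"], r2["page"])
    d_q = 1 - jaccard(r1["query"], r2["query"])
    return (d_p + d_q) / 2


def cluster(requests, theta=0.6):
    """Agglomerative clustering: merge clusters until no pair is closer than theta."""
    clusters = [[r] for r in requests]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage (assumption): distance of the closest member pair.
                d = min(
                    header_distance(a, b)
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if d < theta:
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    return clusters
```

With single linkage, two clusters end up merged whenever a chain of sufficiently similar requests connects them, which is one way to read the rule about merging A and B when C is similar to both.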
![Page 11: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/11.jpg)
Fingerprint Extractor: Generation
•Build state machine for each cluster
•Merge state machines that contain the same hosts
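The slides do not show how the state machines are built, so the sketch below substitutes a much simpler stand-in: for each cluster it keeps only the tokens that appear in every request (the invariant part), then groups those per host. The field names `host` and `tokens`, and the assumption that all requests in a cluster share one host, are hypothetical.

```python
from collections import defaultdict


def invariant_tokens(cluster):
    """Tokens present in every request of a (non-empty) cluster."""
    token_sets = [set(req["tokens"]) for req in cluster]
    return set.intersection(*token_sets)


def merge_by_host(clusters):
    """Group the per-cluster invariants under the host they were sent to."""
    merged = defaultdict(list)
    for cluster in clusters:
        # Assumption: every request in a cluster targets the same host.
        host = cluster[0]["host"]
        merged[host].append(invariant_tokens(cluster))
    return dict(merged)
```

A real state machine would also record token order and the variable parts between invariants; this stand-in only captures which tokens never change.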
![Page 12: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/12.jpg)
Fingerprint Extractor: Generation(2)
•Query-values:
  – Some have the app name embedded
    • Extract keywords from manifest file
  –Any unique keyword is sufficient
•Third-party traffic:
  –Presence of app_id or key
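The keyword idea can be illustrated with a small sketch: given keywords already extracted from each app's manifest, keep only those that no other app shares, then look for them in the query values of a request. The package names, the keyword lists, and the case-insensitive substring match are assumptions made for illustration.

```python
def unique_keywords(app_keywords):
    """For each app, keep only the keywords that no other app uses."""
    counts = {}
    for keywords in app_keywords.values():
        for kw in set(keywords):
            counts[kw] = counts.get(kw, 0) + 1
    return {
        app: {kw for kw in keywords if counts[kw] == 1}
        for app, keywords in app_keywords.items()
    }


def identify(query_values, app_keywords):
    """Return the apps whose unique keywords occur in any query value."""
    uniq = unique_keywords(app_keywords)
    return [
        app
        for app, keywords in uniq.items()
        if any(kw.lower() in value.lower() for kw in keywords for value in query_values)
    ]


# Hypothetical manifest keywords for two made-up apps.
manifest_keywords = {
    "com.example.birds": ["birds", "example"],
    "com.example.news": ["news", "example"],
}
```

Here `example` appears in both manifests, so it is discarded; only `birds` and `news` can identify an app.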
![Page 13: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/13.jpg)
Droid Driver
•Executes Android apps and collects the network traces
•Consists of two components
  –Random Tester
    • For traffic between the app and its provider or a third party
  –Directed Tester
    • For traffic between the app and a CDN or other hosts
•Runs either component for an app
![Page 14: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/14.jpg)
Droid Driver: Random Tester
•Runs the app randomly
  –Application events are generated at random
  –For applications that generate
    • Traffic to and from the app server
    • Third-party traffic
      – Admob, Google DoubleClick
      – Omniture, Google Analytics
•Efficient
![Page 15: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/15.jpg)
Droid Driver: Directed Tester
•Not every app has a unique ID in its traffic
  –In some cases, the unique ID is the developer ID (Angry Birds, ESPN)
•Directed Tester
  –Consists of 3 components
    • Path Recorder
    • Heuristic Path Generator
    • Path Replayer
![Page 16: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/16.jpg)
Droid Driver: Directed Tester(2)
•Path Recorder
  –Records user events in an emulator
•Heuristic Path Generator
  –Generates unexplored paths
•Path Replayer
  – Forces the app to execute a given path
  –Captures the network trace
![Page 17: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/17.jpg)
Evaluation
•Downloaded 90k apps
•70k use the internet
•For 2 different types of traffic
  –Ad Traffic
    • Identified ad library from the manifest files of 32k apps
      – 25k use 1 ad library
      – 4k, 1k, 600, and 400 apps use 2, 3, 4, and 5 ad libraries, respectively
      – Fewer than 300 use more than 5
  –Non-Ad Traffic
    • Considered 6 apps only
      – YouTube, Flixster, ESPN ScoreCenter, CNET News, Pandora, and Zedge
![Page 18: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/18.jpg)
Evaluation: Ad Traffic
![Page 19: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/19.jpg)
Evaluation: Ad Traffic
![Page 20: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/20.jpg)
Evaluation: Non-Ad Traffic
•Manually generated seed action paths.
•Used Directed Testing to generate traffic.
•All ad traffic was excluded.
•Remaining traffic was annotated with the name of the app.
![Page 21: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/21.jpg)
Results
•All applications were successfully identified in the authors' experiment
  –That is, all those for which a network profile was generated
  –Not all were verified
![Page 22: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/22.jpg)
Limitations
•Only identifies apps that generate network traffic
  –Most applications do these days
•Only works for HTTP traffic
  –Does not work for HTTPS
  –Does not work for apps that use proprietary protocols (Skype etc.)
•Uses supervised learning
  –Applications must be known prior to classification.
  –Need new signatures if the app developer changes the HTTP request structure
![Page 23: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/23.jpg)
Related Work
•Several works tried to classify traffic
  –Packet inspection
    • Port based
      – Historically many applications utilize “well-known” ports
      – Classifier looks only at the port in TCP SYN packets
      – Not all applications have a registered port with IANA
    • Payload based
      – Payload is visible, known to the classifier
      – Does not work if payload is obfuscated/encrypted
  –Packet inspection is computationally expensive
![Page 24: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/24.jpg)
Related Work (contd.)
•Classification based on statistical traffic properties
  –Empirical models of connection characteristics, such as bytes, duration, arrival periodicity
  –Flow duration, packet inter-arrival time, packet size, and byte profile
  –Distributions of packet lengths and packet inter-arrival times
  –Etc.
![Page 25: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/25.jpg)
Related Work (contd.)
•Machine Learning
  –Based on statistical properties of the traffic
  –Supervised Learning (Classification)
  –Unsupervised Learning (Clustering)
•Different works use different ML algorithms
•See: “A Survey of Techniques for Internet Traffic Classification using Machine Learning” - Thuy T.T. Nguyen, Grenville Armitage
![Page 26: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/26.jpg)
Related Work : Examples
•Discoverer: 2007
  –Automatically reverse engineers the protocol message formats of an application from its network trace
    • Application session: group of messages
    • Message format specification: sequence of fields
    • Common field semantics: length, offset, pointer, cookie, endpoint-address etc.
  –Discoverer derives message format specification
    • Using clustering
![Page 27: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/27.jpg)
Related Work: Examples
•EarlyBird: 2004, Polygraph: 2005, Hamsa: 2006
  –Detects previously unknown worms and viruses
  –Generates signatures of worms by identifying common byte flows in the network traffic
![Page 28: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/28.jpg)
Related Work (contd.)
•Intrusion detection
  –2 approaches
    • Signature based
    • Anomaly based
•This paper uses signature-based application classification
•Anomaly-based detection
  –Monitors system activity to classify
![Page 29: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/29.jpg)
Anomaly Based Detection
•Triggers an alarm when some type of unusual behavior occurs on the network.
  –Anything that deviates from “normal” is unusual
•Heuristic based
•Example:
  –Protocol anomaly: HTTP traffic on a non-standard port
  –Application anomaly: A segment of binary code in a user password.
  –Statistical anomaly: Too much UDP compared to TCP traffic.
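The statistical-anomaly example can be made concrete with a toy check that raises an alarm when UDP packets make up more than a chosen fraction of the observed traffic. The function name, the input format (one protocol label per packet), and the 0.5 threshold are all assumptions; a real detector would learn what counts as normal from baseline traffic.

```python
def udp_ratio_alarm(protocols, max_udp_fraction=0.5):
    """Return True when the share of UDP packets exceeds the allowed fraction."""
    total = len(protocols)
    if total == 0:
        return False  # no traffic observed, nothing to flag
    udp = sum(1 for proto in protocols if proto == "UDP")
    return udp / total > max_udp_fraction
```

Choosing the threshold is exactly the difficulty of modeling “normal”: too low and legitimate traffic triggers alarms, too high and real anomalies pass unnoticed.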
![Page 30: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/30.jpg)
Signature Based vs Anomaly Based
•Signature Based:
  –Strength: Precise if signatures are correctly generated
  –Weakness: Requires prior knowledge about the signatures
•Anomaly Based:
  –Strength: Has the potential to detect new or unknown attacks
  –Weakness: Often results in false alarms due to the difficulty in modeling the “norm”
![Page 31: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/31.jpg)
Related Work: Application Profiling
•ProfileDroid: 2012
  –Profiles applications at 4 layers:
    • Static layer, User layer, Operating system layer, and Network layer
    • Network layer metrics: Traffic intensity, Origin of traffic, CDN + Cloud traffic, Google traffic, Third-party traffic, Incoming vs outgoing traffic, Number of distinct traffic sources, Ratio of HTTP to HTTPS traffic
    • Relies completely on users running apps to generate traffic
![Page 32: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/32.jpg)
Problems with existing works
•Not scalable
•Requires user involvement / not automatic
•Coupled with the underlying TCP/Application layer protocol
![Page 33: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/33.jpg)
Inter-component control flow graph
•Used to specify control flow in Android applications
•Model components:
  –Activity
  –Service
  –Broadcast receivers
•External Signals: User Events
•Internal Signals: Generated by method calls
See: http://danious.files.wordpress.com/2013/05/dominguezthesis2.pdf
![Page 34: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/34.jpg)
Inter-component control flow graph (contd.)
![Page 35: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/35.jpg)
Inter-component control flow graph (contd.)
![Page 36: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/36.jpg)
Why this paper in CSCE 715?
•Network operators can provide better security for their network
  –Block malicious traffic
  –Apply traffic engineering
•Is that all?
![Page 37: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/37.jpg)
•The smartphone apps you use reveal your personality
  –Cornell University Study, 2011
    • Appthusiasts
    • Appcentrics
    • Live Wires
    • Creators
    • Connectors
    • Apprentices
  –App market research firm Flurry Analytics also confirms this
http://www.news.cornell.edu/stories/2011/02/trevor-pinch-links-app-usage-personality-types
http://sachendra.wordpress.com/2011/05/11/the-smartphone-apps-you-use-reveal-your-personality/
http://wallstcheatsheet.com/stocks/can-your-apple-device-app-usage-reveal-your-personality.html/
![Page 38: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/38.jpg)
Conclusion
•NetworkProfiler can identify applications with high precision
  –Uses network trace generated by the apps
  –Needs to know the patterns of generated traffic beforehand
  –Works only for known applications
•DirectedTesting can automate traffic generation from all paths of an application
![Page 39: NetworkProfiler: Towards Automatic Fingerprinting of Android Apps Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song Presented by:](https://reader035.vdocument.in/reader035/viewer/2022062304/56649ee65503460f94bf70a7/html5/thumbnails/39.jpg)
Questions?