a qualitative analysis of android taint- analysis results · methodology a qualitative analysis of...

23
A Qualitative Analysis of Android Taint- Analysis Results ASE 2019, San Diego Linghui Luo @LinghuiLuo

Upload: others

Post on 18-Oct-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

A Qualitative Analysis of Android Taint-

Analysis Results

ASE 2019, San Diego

Linghui Luo

@LinghuiLuo

Page 2: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

Linghui Luo Eric Bodden Johannes Späth

2

Page 3: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Static taint analysis for detecting data leaks

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

Context-sensitive

Flow-sensitive Field-sensitive

Object-sensitive

3

Page 4: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

State-of-the-art taint analyses are not path-sensitive

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

Static taint analysis tools:

◼ Will tell: there is a taint flow between source and sink – a potential leak

◼ Won‘t tell:

◼ which value does secret hold at runtime

◼ this taint flow can only happen when the command is „upload“.

4

Page 5: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Using dynamic analysis for validating static findings

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

Dynamic analysis tools must run a test case:

◼ Covering the execution path

◼ In this example a file contains a String „upload“

5

Page 6: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Hard to generate such a test case!

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego6

Page 7: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Research Goal

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

◼ To study Android taint-analysis results on real world apps

◼ To understand what kind of path conditions are relevant

◼ How taint flows are conditioned on different factors

Factor 1

Factor 2

Factor 3

...

Taint flow

7

Page 8: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

We seek to identify …

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

I/O operations

Configurations

UI interactions

8

Page 9: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Research Questions

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

What types of taint flows does FlowDroid report?

How common is each type?

FlowDroid

9

Page 10: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Challenges

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

◼ analysis may not scale

◼ analysis of path constraints is a non-

distributive problem

◼ FlowDroid built on top of IFDS

◼ IFDS solves distributive problems

Can‘t simply make static taint analysis path-sensitive:

10

Page 11: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Our Solution

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego11

Page 12: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Methodology

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

◼ Dataset: 2000 Android apps (2016-2018) from AndroZoo

◼ FlowDroid v2.5.1 in default configuration

◼ We collected constraint-APIs for COVA

◼ 335 APIs for UI interactions

◼ 448 APIs for Configurations

◼ 120 APIs for I/O operations

◼ Semi-automated study in two steps

◼ Filter false positives

◼ Classifying taint flows wrt. path constraints 1022

978

FlowDroid reports 28176 taint flows in 2000 apps

Leaky apps Non-leaky apps

12

Page 13: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Step 1: Filter false positives

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

Stratified random sampling taint flows according to top source-sink-pairs

Group #Flows #Apps #Sampled

Apps

A 2,193 535 54

B 1,410 199 20

C 194 166 17

D 1,440 156 16

E 862 291 30

F 847 85 9

Intra-procedural

Inter-procedural

13

Page 14: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

False-positive patterns found in step 1

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

Group A & E (source = openConnection, sink = setRequestProperty )

Group B (source = obtainMessage, sink = sendMessage )

14

Page 15: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Takeaways

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

11/47 default sources used by FlowDroid are inappropriate.

They cause 28% of taint flows being false positives.

Researchers who used FlowDroid in the default configuration may

need to re-evaluate their conclusions.

In a short investigation, we found 9 papers in which the work was

built on top FlowDroid and inappropriate sources and sinks were

used.

15

Page 16: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

False-positive patterns in step 1 (continued)

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

Group D (source = getString, sink = startActivityForResult )

16

Page 17: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Takeaways

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

Taint flows with taints connecting sources and sinks on the same

objects are false positives.

For a better precision, such approximated rule should not be used

for all sources and sinks, but only in certain cases.

17

Page 18: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Step 2: Classify taint flows wrt. path constraints

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

IO-constrained

Configuration-constrained

18

Page 19: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Classification in step 2

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

UI-constrained

Configuration-constrained I/O-constrained

54.8%

Unconstrained

Infeasible

False positives9.3%

2.7% 0.6%

0.7% 0.1%

<0.01%

0.04%

31%

0.7%

19

Page 20: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Takeaways

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

Taint flows are seldom conditioned by combinations of the three

factors.

Thus, most taint flows could be dynamically confirmed by different

tools that specialize on the respective category.

20

Page 21: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Summary

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

https://github.com/secure-software-engineering/COVA

21

Page 22: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

COVA: A Static Analysis Tool for Computing Partial

Path Constraints

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

Constraint MAP Code

TRUE 1 String secret = source();

TRUE 2 button.setOnClickListener(new OnClickListener(){

TRUE 3 void onClick(View view){

CLICK 4 int sdk = Build.VERSION.SDK_INT;

CLICK 5 if (sdk < 20)

CLICK Λ SDK<20 6 sink(secret);

CLICK Λ SDK≥20 7 else

CLICK Λ SDK≥20 8 doSth1();

CLICK Λ SDK≥20 9 if (sdk < 20)

FALSE 10 doSth2();

https://github.com/secure-software-engineering/COVA

22

Page 23: A Qualitative Analysis of Android Taint- Analysis Results · Methodology A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego Dataset: 2000 Android

Feel free to contact me if you have any questions

A Qualitative Analysis of Android Taint-Analysis Results, @LinghuiLuo, ASE'19, San Diego

[email protected]

23