
Concurrency Bug Detection through Improved Pattern Matching

Using Semantic Information

Slides taken from Shin Hong’s MS Thesis Defense

23/4/21 (year/month/day)

Moonzoo Kim

CS Dept. KAIST


Approach

• Verification techniques

[Figure: model checking, testing, lock-based static analysis, and pattern-based analysis compared along precision and scalability axes]

• Related work
- Lock-based static analysis techniques: RacerX, RELAY
  (lock discipline, partial order among locks)
- Pattern-based bug detection: MetaL, FindBugs
  (low precision: too many false alarms)


Goal

[Figure: elements shown by level of abstraction: Linux file system code, reported bug survey, common bug patterns, concurrency bug detection technique, Linux file system verification; stage labels: Change Log & Code Review, Concurrency Bug Classification, Concurrency Bug Patterns, Semantic Augmented Concurrency Bug Patterns]

• Code-based automatic concurrency bug detection for large-scale system software

• Idea: automatically detect common concurrency bugs using both syntactic and semantic code pattern matching


Concurrency Bug Classification (1/3)

• We surveyed previously reported concurrency bugs in Linux file systems.
– Searched Linux Change Logs 2.6.1 ~ 2.6.28
– Keywords: concurrency, data race, deadlock, livelock, file system, ext, etc.
– Among almost 300 documents, we found 40 bug reports (patches) related to both file systems and concurrency bugs.

• We constructed a concurrency bug classification to analyze the bug reports.
– 5 different aspects
  • Symptom: data race (machine exception), data race (faulty state), deadlock, livelock
  • Fault: design decision violation, incorrect use of synchronization idioms, program logic error
  • Resolution: {Insert, Remove, Change, Reorder} × {Sync. operation, Data operation, Control operation}
  • Related synchronization mechanism: instruction, barrier, thread operations, condition variable, lock, complex lock, semaphore
  • Synchronization granularity: kernel-level, file system-level, file-level, inode-level
– 27 bugs are classified


Example of Concurrency Bug


Bug08 (reported in Linux Change Log 2.6.11.11, /fs/ext3/balloc.c)

- Symptom: Data race (faulty state)

- Fault: Incorrect use of synchronization idioms

  ext3_discard_reservation() misuses the test and test-and-set idiom by omitting the second test. This can cause a race condition in which rsv_window_remove() is invoked twice for the same object.

- Resolution: Insert control operations

- Related synchronization mechanism: Binary lock

- Lock granularity: File system

• Example

/* Linux kernel 2.6.11.10 */
ext3_discard_reservation(inode * inode) {
    if (!rsv_is_empty(&rsv->rsv_window)) {
        spin_lock(rsv_lock);
        rsv_window_remove(inode->i_sb, rsv);
        spin_unlock(rsv_lock);
    }
}

/* Linux kernel 2.6.11.11 */
ext3_discard_reservation(inode * inode) {
    if (!rsv_is_empty(&rsv->rsv_window)) {
        spin_lock(rsv_lock);
        if (!rsv_is_empty(&rsv->rsv_window))
            rsv_window_remove(inode->i_sb, rsv);
        spin_unlock(rsv_lock);
    }
}


Concurrency Bug Classification


• Symptom \ Fault (fault columns: design decision violation, incorrect use of synchronization idioms, program logic error; cell groups listed in slide order)

Data race resulting in machine exception: Bug1, Bug5, Bug9, Bug13, Bug18, Bug20, Bug23 | Bug16
Data race resulting in faulty states: Bug2, Bug7, Bug22, Bug25 | Bug8, Bug12, Bug26, Bug27 | Bug3, Bug15, Bug17, Bug19
Deadlock: Bug11, Bug24 | Bug10, Bug14, Bug21
Livelock: Bug4, Bug6

• Sync. granularity \ Sync. primitives (primitive columns: instruction, barrier, thread operation, binary lock, complex lock, semaphore; cell groups listed in slide order)

Kernel level: Bug4 | Bug12 | Bug1, Bug2, Bug3, Bug5, Bug10, Bug11 | Bug19 | Bug6
File system level: Bug17 | Bug8, Bug9, Bug22 | Bug14
File level: Bug15, Bug16 | Bug20 | Bug13 | Bug7
Inode level: Bug26 | Bug18, Bug23, Bug25, Bug26, Bug27 | Bug21, Bug24

• 13 of the 27 classified bugs cannot be detected by conventional lock-based concurrency bug detection techniques.


Concurrency Bug Patterns

• Based on the bug analysis from the classification, we define 10 concurrency bug patterns in order to detect not-yet-revealed bugs automatically by code pattern matching.

Patterns by synchronization mechanism (slide columns: race condition, deadlock, livelock):

Barrier
• No Memory Barrier After Object Initializations
• Busy-waiting on atomic variable without memory barrier

Instruction
• Use atomic instructions in non-atomic ways

Thread operation
• Unsynchronized Data Passing to Child Thread
• Waiting for an Already Finished Thread

Condition variable
• Waiting with Lock Held

Lock
• Misused Test and Test-and-Set
• Unlock before I/O Operations
• Releasing and Re-taking Outer Lock

Complex lock
• Unintended Big Kernel Lock Releasing


Misused Test and Test-and-Set Bug Pattern


/* Correct code */
int data; /* shared data */

void func() {
    if (test(data)) {        /* test */
        lock();
        if (test(data)) {    /* test again under the lock */
            data = newvalue;
        }
        unlock();
    }
}

/* Incorrect code */
int data; /* shared data */

void func() {
    if (test(data)) {        /* test */
        lock();
        data = newvalue;     /* test(data) might be false here */
        unlock();
    }
}
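The following is a minimal sketch of why the second test matters. It does not use real threads: it replays, in a single thread, the one interleaving in which both threads pass the unlocked test before either enters the critical section. The names `struct rsv`, `remove_unchecked`, `remove_checked`, and `replay` are hypothetical and only loosely modeled on the ext3 example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared reservation object. */
struct rsv {
    bool present;   /* what test(data) inspects */
    int removals;   /* how often the removal actually ran */
};

/* Critical section of the incorrect code: no second test. */
static void remove_unchecked(struct rsv *r)
{
    r->present = false;
    r->removals++;
}

/* Critical section of the correct code: re-test under the lock. */
static void remove_checked(struct rsv *r)
{
    if (r->present) {
        r->present = false;
        r->removals++;
    }
}

/*
 * Replays one interleaving: both threads pass the unlocked test first,
 * then the lock serializes their critical sections.
 */
static int replay(void (*critical_section)(struct rsv *))
{
    struct rsv r = { true, 0 };
    bool t1_passed = r.present;   /* thread 1: first test */
    bool t2_passed = r.present;   /* thread 2: first test */

    if (t1_passed)
        critical_section(&r);     /* thread 1 holds the lock */
    if (t2_passed)
        critical_section(&r);     /* thread 2 holds the lock afterwards */
    return r.removals;
}
```

Under this replay, `replay(remove_unchecked)` performs the removal twice, while `replay(remove_checked)` performs it once, which mirrors the double invocation described for Bug08.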


Concurrency Bug Patterns (3/3)

• Bug detection result by pattern matching
– Misused Test and Test-and-Set pattern:
  • We analyzed 9 file systems with the VFS layer in Linux 2.6.30.4.
  • 142 of the 145 suspected bugs turned out to be false positives after manual review and discussion with maintainers.

                     Ext2  Ext3  Ext4  NFS   ReiserFS  Proc  Sysfs  UDF   BtrFS  Total
# of suspected bugs  13    19    18    15    18        18    12     15    17     145
# of files           18    24    23    34    26        20    9      17    34     205
LOC                  28K   104K  25K   56K   27K       18K   37K    9K    82K    386K
Time (sec)           0.19  1.06  2.57  1.31  1.49      0.80  0.61   0.78  1.6    10.4

- Ex. A suspected bug detected from NFS

/* In Linux kernel 2.6.30.4 /fs/nfs/nfs4state.c */
void nfs_free_seqid(nfs_seqid *seqid) {
    if (!list_empty(&seqid->list)) {
        sequence = seqid->sequence->sequence;
        spin_lock(&sequence->lock);
        list_del(&seqid->list);
        spin_unlock(&sequence->lock);
        rpc_wake_up(&sequence->wait);
    }
}


Semantics Augmented Pattern Matching

• We improve the bug pattern matching using semantic information to refine the bug detection results.


[Figure: state space containing program behavior (reachable states) and safe states (no error), with regions for bug pattern #1, bug pattern #2, bug pattern #1 with semantics, and bug pattern #2 with semantics]

• The main sources of false positives, and the analysis that addresses each:
– No parallel thread to be scheduled: thread sensitive analysis
– Synchronized by other locks: lock analysis
– Shared variable initializations without holding locks: simple points-to analysis


COBET Framework

• We built COBET (COncurrency Bug pattErn maTching framework), which provides a programming template for generating effective bug pattern detectors on top of the EDG C/C++ parser.

[Figure: COBET architecture. Inputs: target C source code and configuration (thread starting functions, lock spec., memory alloc. spec.). The COBET synthesizer takes bug descriptions (bug 1, bug 2) and produces bug pattern detectors (1, 2), each with a syntactic pattern matcher and a semantic condition checker. The semantic analysis engine includes an AST generator, path analysis, alias analysis, and lock analysis.]

Phase I: Bug pattern description and pattern detector construction

Phase II: Code pattern matching with semantic code checking


COBET Synthesizer

• Pattern description language (PDL)

[Figure: grammar of COBET PDL]

Example of incorrect test-test-set bug pattern

pattern1 {
  fun $f1 {
    if $cond {
      lock $l;
      \{if $cond { }}
      unlock $l;
}}}

pattern 2 {
  fun $f2 {
    write $w;
}}


Thread Sensitive Analysis (1/2)

• Most concurrency errors arise from interleaved executions of two or more threads.
• We extend each bug pattern to specify multiple disconnected code patterns, all of which are necessary for the bug to occur.
• The extended bug pattern matcher takes thread starting functions (system call handlers), thread-spawning operations, and interrupt-handler registering operations as input to identify execution starting points.

[Figure: code pattern match #1 and code pattern match #2 combined into an error execution]


Thread Sensitive Analysis (2/2)


Ex. Extended Misused Test and Test-and-Set bug pattern

Code pattern #1
    if (expr) {
        lock(m);     /* no if (expr) here */
        ...
        unlock(m);
    }

Code pattern #2
    write(x);

• Condition: expr may read x

Ex. A suspected bug of Misused Test and Test-and-Set from the Proc file system

• Code pattern #1 Match-1
    proc_get_sb() {
        ...
        ei = PROC_I(sb->s_root->d_inode);
        if (!ei->pid) {
            rcu_read_lock();
            ei->pid = get_pid(find_pid_ns(1, ns));
            rcu_read_unlock();
        }
        ...

• Code pattern #2 Match-1
    proc_get_sb() {
        …
        if (!ei->pid)
            ei->pid = find_get_pid(1);

• Code pattern #2 Match-2
    proc_alloc_inode() {
        …
        ei = kmem_cache_alloc(...);
        if (!ei) return NULL;
        ei->pid = NULL;
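As a rough sketch of how matches of the two code patterns might be paired under the condition "expr may read x", the code below compares the variable tested by pattern #1 with the variable written by pattern #2. Name equality stands in for a real may-read check, which is an assumption made only for illustration; the record types, the function names, and the `unrelated_fn` entry are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical match records produced by a syntactic matcher. */
struct match1 { const char *func; const char *tested; };   /* variable read by expr */
struct match2 { const char *func; const char *written; };  /* variable x in write(x) */

/* "expr may read x", approximated here by name equality (assumption). */
static bool expr_may_read(const struct match1 *m1, const struct match2 *m2)
{
    return strcmp(m1->tested, m2->written) == 0;
}

/* Counts the (pattern #1, pattern #2) pairs that satisfy the condition. */
static size_t count_suspects(const struct match1 *m1, size_t n1,
                             const struct match2 *m2, size_t n2)
{
    size_t suspects = 0;
    for (size_t i = 0; i < n1; i++)
        for (size_t j = 0; j < n2; j++)
            if (expr_may_read(&m1[i], &m2[j]))
                suspects++;
    return suspects;
}

/* The Proc example: one pattern #1 match, two relevant pattern #2 matches. */
static size_t proc_example_suspects(void)
{
    const struct match1 m1[] = { { "proc_get_sb", "ei->pid" } };
    const struct match2 m2[] = {
        { "proc_get_sb", "ei->pid" },
        { "proc_alloc_inode", "ei->pid" },
        { "unrelated_fn", "other" },   /* hypothetical write to another variable */
    };
    return count_suspects(m1, 1, m2, 3);
}
```

In this sketch `proc_example_suspects()` yields two suspect pairs, corresponding to Match-1 and Match-2 of code pattern #2; the later lock and points-to analyses are what would prune them.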


Lock Analysis (1/3)


• We apply lock analysis to compute the set of locks held when a code pattern match is executed.
– Inter-procedural, flow-sensitive, path-insensitive analysis (similar to RacerX).
– The specification of lock acquiring (releasing) operations is given by users.

[Figure: lock analysis along the paths to code pattern match #1 and code pattern match #2, with operations Lock(A), Lock(B), Lock(C), Unlock(C) and resulting locksets LS = {A} and LS = {A,B}. Conflict: no concurrent execution is possible due to lock A.]
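The lockset comparison can be sketched with bitmasks. The encoding (one bit per lock) and the assignment of the lock operations to the two paths are assumptions read off the slide residue, not the COBET implementation: one path is taken to be Lock(A), Lock(C), Unlock(C), the other Lock(A), Lock(B).

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned lockset_t;                  /* one bit per lock (assumed encoding) */
enum { LOCK_A = 1, LOCK_B = 2, LOCK_C = 4 };

static lockset_t acquire(lockset_t ls, lockset_t lock) { return ls | lock; }
static lockset_t release(lockset_t ls, lockset_t lock) { return ls & ~lock; }

/* Two matches may run concurrently only if their locksets share no lock. */
static bool may_run_concurrently(lockset_t a, lockset_t b)
{
    return (a & b) == 0;
}

/* Assumed path to match #1: Lock(A), Lock(C), Unlock(C). */
static lockset_t lockset_at_match1(void)
{
    lockset_t ls = 0;
    ls = acquire(ls, LOCK_A);
    ls = acquire(ls, LOCK_C);
    ls = release(ls, LOCK_C);
    return ls;                               /* {A} */
}

/* Assumed path to match #2: Lock(A), Lock(B). */
static lockset_t lockset_at_match2(void)
{
    lockset_t ls = 0;
    ls = acquire(ls, LOCK_A);
    ls = acquire(ls, LOCK_B);
    return ls;                               /* {A,B} */
}
```

With these locksets, `may_run_concurrently(lockset_at_match1(), lockset_at_match2())` is false because both hold A, so the pair would be dropped as a false positive.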


Lock Analysis (2/3)


• Code pattern #1 Match-1
  Call path: sys_mount() → do_mount() → do_new_mount() → do_kern_mount() → vfs_kern_mount() → proc_get_sb()
  acquire lock_kernel()
  Lockset: {lock_kernel}

    proc_get_sb() {…
        ei = PROC_I(sb->s_root->d_inode);
        if (!ei->pid) {
            rcu_read_lock();
            ei->pid = get_pid(find_pid_ns(1,…
            rcu_read_unlock();
        }

• Code pattern #2 Match-1
  Call path: sys_mount() → do_mount() → do_new_mount() → do_kern_mount() → vfs_kern_mount() → proc_get_sb()
  acquire lock_kernel()
  Lockset: {lock_kernel}

    proc_get_sb() {…
        if (!ei->pid)
            ei->pid = find_get_pid(1);

• Code pattern #2 Match-2
  Functions on the call path: compat_sys_futimesat(), __user_walk_fd(), do_utimes(), real_lookup()
  acquire inode.i_mutex
  Lockset: {inode.i_mutex}

    proc_alloc_inode() {…
        ei = kmem_cache_alloc(…;
        if (!ei) return NULL;
        ei->pid = NULL;

Conflict: the two matches in proc_get_sb() both hold lock_kernel.


Lock Analysis (3/3)

• COBET uses an inter-procedural lock analysis.
– The lockset at a call site is transferred to the analysis of the called function.

• To avoid redundant inter-procedural lock analysis, the COBET semantic engine uses a cache that records analysis results for functions.
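One way such a cache could work is sketched below: each function's result is memoized per entry lockset, so a function called twice with the same lockset is analyzed once. This is a simplified illustration under stated assumptions (an acyclic call graph, at most three locks, straight-line function bodies), not the COBET semantic engine; every type and name here is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { NFUNCS = 2, NLOCKSETS = 8, MAX_OPS = 4 };   /* 3 locks give 8 locksets */
enum kind { ACQUIRE, RELEASE, CALL };

struct op { enum kind kind; int arg; };   /* arg: lock index or callee index */
struct func { struct op ops[MAX_OPS]; int nops; };

struct engine {
    const struct func *funcs;
    bool cached[NFUNCS][NLOCKSETS];       /* is result[f][entry] valid? */
    unsigned result[NFUNCS][NLOCKSETS];   /* lockset at return of f */
    int bodies_analyzed;                  /* counts uncached analyses */
};

/* Returns the lockset held when f returns, given the lockset at entry. */
static unsigned analyze(struct engine *e, int f, unsigned entry)
{
    if (e->cached[f][entry])
        return e->result[f][entry];
    e->bodies_analyzed++;
    unsigned ls = entry;
    for (int i = 0; i < e->funcs[f].nops; i++) {
        struct op op = e->funcs[f].ops[i];
        switch (op.kind) {
        case ACQUIRE: ls |= 1u << op.arg; break;
        case RELEASE: ls &= ~(1u << op.arg); break;
        case CALL:    ls = analyze(e, op.arg, ls); break;  /* pass call-site lockset */
        }
    }
    e->cached[f][entry] = true;
    e->result[f][entry] = ls;
    return ls;
}

/* Demo: f0 acquires lock 0 and calls f1 twice; f1 takes and drops lock 1. */
static int demo_bodies_analyzed(void)
{
    static const struct func funcs[NFUNCS] = {
        { { { ACQUIRE, 0 }, { CALL, 1 }, { CALL, 1 } }, 3 },
        { { { ACQUIRE, 1 }, { RELEASE, 1 } }, 2 },
    };
    struct engine e = { .funcs = funcs };
    unsigned at_return = analyze(&e, 0, 0u);
    assert(at_return == 1u);              /* only lock 0 is still held */
    return e.bodies_analyzed;
}
```

In the demo, the second call to `f1` hits the cache because the entry lockset is the same, so only two function bodies are analyzed instead of three.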



Points-to Analysis

• Non-shared variables are free from concurrency errors.
• A newly allocated heap variable is non-shared until it is linked to a global data structure.
• Such variables are initialized without any synchronization during the non-shared period.
• This kind of code results in false positives in bug pattern matching when alias-sensitive analysis is lacking.

Match 2-2 (Non-shared: {ei})
    proc_alloc_inode() {
        ...
        ei = kmem_cache_alloc( ...
        if (!ei) return NULL ;
        ei->pid = NULL ;
        ...
    No other thread can access ei->pid.

Match 1-1 (Non-shared: {})
    proc_get_sb() {
        ...
        ei = PROC_I(sb->s_root->d_inode);
        if (!ei->pid) {
            rcu_read_lock();
            ei->pid = get_pid(...;
            rcu_read_unlock();
        }
        ...
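The non-shared rule can be sketched as a small state machine over the events that affect one pointer such as `ei`. The event names and the conservative "assume shared" default are assumptions made for illustration; a real points-to analysis would track many variables and aliases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical events affecting a single pointer variable. */
enum event {
    EV_ALLOC,        /* pointer receives a freshly allocated object */
    EV_LOAD_SHARED,  /* pointer is loaded from a shared data structure */
    EV_PUBLISH,      /* object is linked to a global data structure */
    EV_WRITE         /* write through the pointer */
};

/* Counts writes performed while the object may be visible to other threads. */
static size_t writes_while_shared(const enum event *ev, size_t n)
{
    bool non_shared = false;   /* conservative default: assume shared */
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        switch (ev[i]) {
        case EV_ALLOC:       non_shared = true;  break;
        case EV_LOAD_SHARED:
        case EV_PUBLISH:     non_shared = false; break;
        case EV_WRITE:       if (!non_shared) count++; break;
        }
    }
    return count;
}

/* Match 2-2: ei is allocated, then ei->pid is written before publication. */
static size_t match_2_2_reportable(void)
{
    const enum event ev[] = { EV_ALLOC, EV_WRITE };
    return writes_while_shared(ev, 2);
}

/* Match 1-1: ei comes from sb->s_root->d_inode, then ei->pid is written. */
static size_t match_1_1_reportable(void)
{
    const enum event ev[] = { EV_LOAD_SHARED, EV_WRITE };
    return writes_while_shared(ev, 2);
}
```

Under this sketch the write in Match 2-2 is not reported, since `ei` is still non-shared, while the write in Match 1-1 remains a candidate.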


Experiment Result (1/3)

• We reviewed Linux Change Logs on file systems from 2.6 to 2.6.28 and defined four bug patterns based on actual bug history.

• Effectiveness
– We applied the four bug pattern detectors to seven Linux file systems in kernel 2.6.30.4.
– We detected four new bugs, which were confirmed by Linux maintainers.



Experiment Result (2/3)

• Efficiency
– We measured the false alarm reduction rate achieved by the semantic analyses, and their time cost.
– False alarms decreased as additional analysis techniques were employed, and the time cost was not burdensome (8.34 sec for analyzing 193 KLOC).



Experiment Result (3/3)

• Applicability
– We applied the four bug detectors to Linux device drivers and network modules.



Conclusion

• COBET framework
– Pattern-based concurrency bug detection framework
– Able to define and match various bug patterns
– Utilizes composite patterns with semantic conditions to improve accuracy

• Empirical result
– Defined four concurrency bug patterns with various synchronization mechanisms
– Implemented the four bug pattern detectors through COBET
– Detected ten new, real bugs in Linux
