by: gal nave and dan slov supervisor: dmitri perelman technion, electrical engineering department...

Deadlock Detection in Linux

By: Gal Nave and Dan Slov

Supervisor: Dmitri Perelman

Technion, Electrical Engineering Department

NSSL Laboratory

Overview

• Introduction

• Deadlock detection algorithm

• Algorithm implementation in Linux

• Summary

Introduction

Parallel vs Serial computation

• An evolution of serial computing

• Parallel computing solves larger problems

• Provides concurrency

• The "Faster, Bigger, Better, More" principle

Introduction (cont.)

Parallel computing pitfalls

• Synchronization

• Ordering

• DEADLOCKS

• We will concentrate on deadlocks from now on

Problem definition

A deadlock is a specific condition in which two or more processes (or threads) are each waiting for another to release a resource, thus forming a circular chain.

Project definition:

• Design and implementation of a deadlock detection mechanism in Linux

• Provide support for successful debugging: backtrace and dependency description

• Evaluate the performance overhead implied by the solution

The Algorithm

The deadlock detection algorithm by M. Herlihy and E. Koskinen, which they called:

DREADLOCKS

Deadlock Detection Algorithm

• The algorithm exploits the fact that during busy wait no useful work is done

• A thread that tries to lock a mutex checks whether it is already locked, using the atomic test-and-set operation (TSL)

• If the mutex is already locked, the thread waits for it to be unlocked (in other words, the thread spins on the lock)

• From time to time the thread tests the mutex again using TSL:

1. Test-and-set

2. If the lock has already been set by another thread, wait

3. Repeat until TSL returns zero (the lock is set)
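The spin loop above can be sketched with C11 atomics. This is a minimal illustration, not the NPTL mutex code: spin_lock_t and the spin_* functions are hypothetical names, and sched_yield() stands in for "wait".

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical spin lock type for illustration; not the NPTL mutex. */
typedef struct {
    atomic_flag locked;
} spin_lock_t;

#define SPIN_LOCK_INIT { ATOMIC_FLAG_INIT }

/* One test-and-set. Returns 0 if the caller now holds the lock,
 * 1 if the lock had already been set by someone else. */
static int spin_try_lock(spin_lock_t *l)
{
    assert(l != NULL);
    return atomic_flag_test_and_set_explicit(&l->locked,
                                             memory_order_acquire) ? 1 : 0;
}

/* Repeat test-and-set until it returns zero (the lock is set by us). */
static void spin_lock(spin_lock_t *l)
{
    while (spin_try_lock(l) != 0) {
        sched_yield(); /* the "wait" step: no useful work is done here */
    }
}

static void spin_unlock(spin_lock_t *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The body of the while loop is the idle time that the algorithm reuses for deadlock detection.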

Deadlock Detection Algorithm (cont.)

The Idea :

• Why don't we use the time between two consecutive test-and-sets to look for a deadlock?

THE BASIC ASSUMPTION OF THE ALGORITHM IS THAT THE TIME WHILE A THREAD IS SPINNING COULD BE USED FOR SOME USEFUL WORK – DEADLOCK DETECTION!!!

Deadlock Detection Algorithm (cont.)

• Now the lock algorithm looks like the following:

1. Try to lock the mutex using test-and-set

2. If the lock has already been set by another thread, try to detect a deadlock

3. If there is no deadlock, try to lock the mutex again using test-and-set; else, if there is a deadlock, alert the user

4. Repeat until TSL returns zero (the lock is set) or a deadlock is detected

Deadlock Detection – how it gets done

• Each thread has a list of the processes/threads it is waiting for in order to acquire some resource (mutex). Let's call this list the digest.

• A thread trying to acquire a mutex that is already locked checks the owner's digest for its own TID.

Digest of A:

{}

Digest of B:

{}


• If the TID is found in the mutex owner's digest, the thread is waiting for the mutex owner to release the lock while the owner is waiting for the thread itself to release another lock: a classic deadlock!

Digest of A:

{B}

Digest of B:

{A}


• If the TID is not found, set the thread's digest to the union of its own digest with that of the mutex owner.

• Keep spinning until the lock is acquired or a deadlock is detected

Digest of A:

{B}

Digest of B:

{}
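The check-then-union step can be sketched with digests stored as bitmasks. This is a simplified model under stated assumptions: thread IDs are small integers (0 to 63), the names digest, detect_step and clear_digest are hypothetical, and the calls are made from a single thread. A real implementation would need atomic access to the digests.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_THREADS 64

/* digest[t] has bit u set if thread t is (transitively) waiting for u.
 * Hypothetical model: the project stores digests as linked lists. */
static uint64_t digest[MAX_THREADS];

/* One detection step, run by `self` while it spins on a mutex held by
 * `owner`. Returns 1 if a deadlock is detected, 0 otherwise. */
static int detect_step(unsigned self, unsigned owner)
{
    assert(self < MAX_THREADS && owner < MAX_THREADS);
    uint64_t self_bit = UINT64_C(1) << self;
    uint64_t owner_bit = UINT64_C(1) << owner;

    /* Our TID in the owner's digest: the owner is waiting for us. */
    if (digest[owner] & self_bit) {
        return 1;
    }
    /* Otherwise we now wait for the owner and for everyone it waits for. */
    digest[self] |= owner_bit | digest[owner];
    return 0;
}

/* Called once the mutex is acquired: the thread waits for no one. */
static void clear_digest(unsigned self)
{
    assert(self < MAX_THREADS);
    digest[self] = 0;
}
```

With A = 0 and B = 1, detect_step(0, 1) leaves the digest of A as {B}, and a following detect_step(1, 0) reports the deadlock.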

Algorithm Implementation for Linux:

The Implementation is based upon:

• GLIBC GNU C library version 2.6

• NPTL Native POSIX Thread Library

• Any distribution of Linux supporting glibc 2.6

Implementation Details

• The three most important structures in the implementation are:

• thread struct

• mutex struct

• digest struct

• The details of each of these structs are on the next slides

Thread struct modification

• Each thread is described by a structure defined in descr.h

• The structure defining a thread includes all sorts of fields, such as thread ID, attributes, etc.

• We have added a field to the structure to hold the digest entry (the list of TIDs the thread is waiting for)

• Initially the digest is empty

• If a thread tries to acquire a mutex and sees that it is already locked, it scans the digest of the thread owning the mutex for its own ID.

Digest struct

• Dependency list is implemented as a linked list.

• The digest of a thread that is waiting for a mutex has a field pointing at the mutex owner's digest:

Digest of a thread that is waiting for a mutex

Digest of a mutex owning thread

Mutex owning thread’s digest may point to other digests

Digest structure detailed

• Digest structure is defined in pthread_digest.h :

typedef struct _digest_t {
    unsigned __tid;
    int __time_stamp;
    int __cnt;
    int __ref_cnt;
    int __is_alive;
    pthread_digest_p digest;
} pthread_digest_t;

• An explanation of the purpose of each field follows

Digest structure detailed (cont.)

• __tid specifies thread ID of the thread owning the digest

•__time_stamp specifies the time of last update of the digest

•__cnt counts the number of mutexes the thread holds

•__ref_cnt specifies the number of threads pointing at the digest

• digest points at the next digest (NULL if the thread is not waiting for a mutex)
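Scanning the chain of digests for a TID might look like the sketch below. It reuses the structure from the slide; the typedef of pthread_digest_p, the function name scan_digest_sketch and the max_hops bound are assumptions added for illustration. The bound stops the walk if the chain contains a cycle that does not include the scanning thread.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definition: the slide does not show how pthread_digest_p is
 * declared, so a pointer to the digest struct is used here. */
typedef struct _digest_t *pthread_digest_p;

typedef struct _digest_t {
    unsigned __tid;          /* thread owning this digest */
    int __time_stamp;        /* time of last update */
    int __cnt;               /* number of mutexes the thread holds */
    int __ref_cnt;           /* number of threads pointing at the digest */
    int __is_alive;
    pthread_digest_p digest; /* next digest, NULL if not waiting */
} pthread_digest_t;

/* Walk the chain that starts at the mutex owner's digest and look for
 * self_tid. Returns 1 if found (deadlock), 0 otherwise.
 * max_hops is a hypothetical safety bound, not part of the source. */
static int scan_digest_sketch(const pthread_digest_t *owner,
                              unsigned self_tid, int max_hops)
{
    assert(max_hops > 0);
    const pthread_digest_t *d = owner;
    for (int hops = 0; d != NULL && hops < max_hops; hops++) {
        if (d->__tid == self_tid) {
            return 1;
        }
        d = d->digest; /* follow the "waiting for" pointer */
    }
    return 0;
}
```

If thread A (waiting for B) owns the mutex that B wants, the walk starts at A's digest, follows its pointer to B's digest, and finds B's own TID.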

Implementation Details (cont.)

Since the digest is an abstract data type (ADT), some methods should be added to allow stronger decoupling.

The most important actions on a digest are:

• append another thread's digest

• upon acquiring a mutex, release the dependency list

• update the mutex owner

• compare the time stamps of two digests

• print the dependency list and stack in case of a deadlock

The usage of digest ADT methods in the algorithm

1. Init_thread_digest()

2. Try to lock the mutex using test-and-set

3. If the lock has already been set by another thread:

    append_owner_digest()
    compare_time_stamp()
    if time_stamp is outdated
        scan_digest()
    else
        do nothing

4. If there is no deadlock, try to lock the mutex again using test-and-set; else, if there is a deadlock, print_dependency_list()

5. Repeat steps 2-4 until the lock is set or a deadlock is detected

6. If the mutex is acquired:

    release_dependency_list()
    update_mutex_owner()

Mutex structure

• How do we get a pointer to the digest of the thread owning a mutex?

• The mutex struct needs to be modified!

• Each mutex is described by a structure defined in pthreadtypes.h

• An additional field is added to the mutex_t structure to hold the owner's digest entry

• When a thread acquires a mutex, it updates the digest owner field in the mutex structure to contain a pointer to the thread's digest

What if deadlock is detected?

The following is done:

• Print dependency list to stderr

• Print backtrace to stderr

• Return deadlock_found error code

What if deadlock is detected? (cont.)

stderr example (real example from the test):

Thread ID 1090525520 is waiting for thread ID 1082132816
Thread ID 1082132816 is waiting for thread ID 1090525520
Thread ID 1090525520 has detected a deadlock...
Backtrace
/home/dmitri/deadlock_detection/glibc261_build/nptl/libpthread.so.0 [0x2afa80b53fbc]
/home/dmitri/deadlock_detection/glibc261_build/nptl/libpthread.so.0 [0x2afa80b54088]
/home/dmitri/deadlock_detection/glibc261_build/nptl/libpthread.so.0(pthread_mutex_lock+0x1db) [0x2afa80b4c6ab]
./simp_test(lock_mutex+0x15) [0x400d41]
./simp_test(thr_b_func+0x5b) [0x400e4c]
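Output of this shape can be produced with the glibc functions backtrace() and backtrace_symbols_fd() from execinfo.h. Whether the project used these calls is an assumption. The function name report_deadlock is hypothetical, and EDEADLK stands in for the "deadlock_found" error code, whose actual value the slides do not give.

```c
#include <assert.h>
#include <errno.h>
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_FRAMES 32

/* Print one dependency line and a backtrace to stderr, then return an
 * error code. Hypothetical helper: EDEADLK is an assumed stand-in for
 * the project's own deadlock_found code. */
static int report_deadlock(unsigned long self_tid, unsigned long owner_tid)
{
    void *frames[MAX_FRAMES];

    fprintf(stderr, "Thread ID %lu is waiting for thread ID %lu\n",
            self_tid, owner_tid);
    fprintf(stderr, "Thread ID %lu has detected a deadlock...\n", self_tid);
    fprintf(stderr, "Backtrace\n");
    fflush(stderr); /* keep stdio output ahead of the raw fd writes */

    int n = backtrace(frames, MAX_FRAMES);
    assert(n >= 0);
    /* Writes one line per frame directly to the file descriptor. */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);

    return EDEADLK;
}
```

A full dependency list would print one "is waiting for" line per link in the digest chain before the backtrace.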

Problems and solutions

• Memory leakage

Description:

When a thread finishes its task and exits, all the memory it used gets freed. But other threads may still point at its digest!

Solution:

Add two fields to the digest structure: __is_alive, to delete the digest logically, and __ref_cnt, to count how many threads reference the digest. Free the memory if __is_alive is false and __ref_cnt is zero.
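The freeing rule can be sketched as follows. The type and function names are hypothetical, and the counters are plain integers for brevity; concurrent threads would need atomic updates or a lock around them.

```c
#include <assert.h>
#include <stdlib.h>

/* Reduced digest with only the two lifetime fields (hypothetical type). */
typedef struct digest_sketch {
    unsigned tid;
    int ref_cnt;  /* threads currently pointing at this digest */
    int is_alive; /* 0 once the owning thread has exited */
} digest_sketch_t;

static digest_sketch_t *digest_new(unsigned tid)
{
    digest_sketch_t *d = malloc(sizeof *d);
    if (d != NULL) {
        d->tid = tid;
        d->ref_cnt = 0;
        d->is_alive = 1;
    }
    return d;
}

/* Free only when logically deleted AND unreferenced. Returns 1 if freed. */
static int digest_try_free(digest_sketch_t *d)
{
    assert(d != NULL);
    if (!d->is_alive && d->ref_cnt == 0) {
        free(d);
        return 1;
    }
    return 0;
}

/* Another thread starts pointing at the digest. */
static void digest_ref(digest_sketch_t *d)
{
    assert(d != NULL);
    d->ref_cnt++;
}

/* Another thread stops pointing at the digest. Returns 1 if freed. */
static int digest_unref(digest_sketch_t *d)
{
    assert(d != NULL && d->ref_cnt > 0);
    d->ref_cnt--;
    return digest_try_free(d);
}

/* The owning thread exits: logical delete. Returns 1 if freed. */
static int digest_thread_exit(digest_sketch_t *d)
{
    assert(d != NULL);
    d->is_alive = 0;
    return digest_try_free(d);
}
```

Whichever event happens last, the thread exit or the final unreference, releases the memory.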


• Memory leakage, take 2

Description:

The thread does not necessarily free the digest memory upon exiting, but the operating system does! The operating system has a maintenance daemon process that frees all the leaked memory.

Solution:

Add a global hash table that references all digests. Delete entries from the hash table using the same principle as for freeing the digest memory. The standard hash table of glibc 2.6 is not suitable because it supports a very limited number of operations, so another library was added, the GNU hash table library: ghtlib.


• The mutex struct is used by many (all?) processes

Description:

The mutex struct is used by many, if not all, processes, and a change in its size requires changes in the kernel.

Solution:

Remove the changes from the mutex struct and add another hash table that contains all mutexes, using the mutex address as the key and a pointer to the owner's digest as the data.

Performance degradation? Not really!
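A table keyed by mutex address can be sketched with a small fixed-size, open-addressing hash table. This is an illustration only: the project used ghtlib, whose API is not shown in the slides, so the table below is hand-written, and all names and the table size are hypothetical. It is not thread safe and never removes keys; releasing a mutex stores NULL as the owner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OWNER_TABLE_SIZE 1024 /* power of two, assumed large enough */

typedef struct {
    const void *mutex;  /* key: address of the mutex, NULL if slot unused */
    void *owner_digest; /* value: digest of the owning thread, or NULL */
} owner_entry_t;

static owner_entry_t owner_table[OWNER_TABLE_SIZE];

static size_t owner_hash(const void *mutex)
{
    /* Drop the low bits, which are mostly zero due to alignment. */
    return (size_t)(((uintptr_t)mutex >> 4) & (OWNER_TABLE_SIZE - 1));
}

/* Insert or update the owner of a mutex. Returns 0, or -1 if full. */
static int owner_table_put(const void *mutex, void *owner_digest)
{
    assert(mutex != NULL);
    size_t start = owner_hash(mutex);
    for (size_t probe = 0; probe < OWNER_TABLE_SIZE; probe++) {
        owner_entry_t *e =
            &owner_table[(start + probe) & (OWNER_TABLE_SIZE - 1)];
        if (e->mutex == NULL || e->mutex == mutex) {
            e->mutex = mutex;
            e->owner_digest = owner_digest;
            return 0;
        }
    }
    return -1;
}

/* Look up the owner's digest; NULL if the mutex is unknown or unowned. */
static void *owner_table_get(const void *mutex)
{
    size_t start = owner_hash(mutex);
    for (size_t probe = 0; probe < OWNER_TABLE_SIZE; probe++) {
        const owner_entry_t *e =
            &owner_table[(start + probe) & (OWNER_TABLE_SIZE - 1)];
        if (e->mutex == mutex) {
            return e->owner_digest;
        }
        if (e->mutex == NULL) {
            return NULL; /* empty slot ends the probe sequence */
        }
    }
    return NULL;
}
```

The lookup costs one hash and usually one probe per contended lock attempt, which is consistent with the "not really" verdict on performance degradation.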


• The thread struct is used by many (all?) processes

Description:

The thread struct is used by many, if not all, processes, and a change in its size requires changes in the kernel.

Solution:

Remove the changes from the thread struct and use one of the fields provided by the glibc authors as a size buffer.

Verification

The following test suite was used:

Controlled deadlock

Description: Deliberately create a deadlock using a small number of threads, so it is easy to monitor the detection. Compare to standard glibc.

Test result: Passed. The dependency list was printed, while standard glibc froze in a deadlock.

Statistical deadlock

Description: Create a number of threads (from 10 to 150, depending on the test) that randomly lock mutexes, controlling the probability of a deadlock through the number of locked mutexes and the number of threads.

Test result: The number of detected deadlocks is proportional to the number of randomly locked mutexes.

Verification (cont.)

A typical graph (50 threads locking a number of random mutexes)

Performance Evaluation

The following benchmarks were used to evaluate performance:

1. Locking performance

• Create a fixed number of threads

• Let each thread lock and unlock the same number of mutexes, without doing any other task

• Repeat the step above for different numbers of mutexes, for both standard and modified glibc.

LOCKING PERFORMANCE

(50 threads locking number of mutexes)

Performance Evaluation (cont.)

2. CPU bound performance

• Create a fixed number of threads (50 for instance)

• Let each thread lock and unlock the same number of mutexes, while doing a heavy calculation in between

• Repeat the step above for different numbers of mutexes, for both standard and modified glibc. The graph below shows a typical picture, where 50 mutexes were used

Performance Evaluation (cont.)

CPU BOUND PERFORMANCE

Usage Proposal

• Compile the library using provided makefile

• Either install it or keep it as an alternative to mainstream glibc, selected through the LD_LIBRARY_PATH environment variable

• Could be configured to do the following:

a) Quit the task and return the error code in case of a deadlock

b) Print the deadlock information to stderr and remain in the deadlock, letting the user decide what to do

c) Hotspot detection (printing the __ref_cnt field of the digest structure) may be easily added

Summary

• The integration of deadlock detection mechanism into Linux is definitely possible

• Simple deadlocks may and should be detected

• Programs that are not performance critical and perform a relatively low number of locking operations could use the deadlock detection mechanism all the time without a major impact on performance

• NPTL is not completely decoupled from the kernel and therefore some changes in kernel are needed to make deadlock detection more effective

Final thoughts:

• Most personal computers have more than one core

• Parallel programming is becoming more and more essential

• The deadlock problem becomes THE PROBLEM

• Deadlock prevention is not in our hands

• Deadlock avoidance demands operating system overhead

• Deadlock detection may be effective and low cost in particular cases

• GNU C library allows code modification
