Ch 8, 9 Kernel Synchronization

Page 1: Ch 8, 9   Kernel Synchronization

1

Ch 8, 9 Kernel Synchronization

Lecture 45: Kernel Synchronization

Page 2: Ch 8, 9   Kernel Synchronization

2

How CPU performs X++

(1) load:    CPU register ← storage X
(2) compute: register++
(3) store:   CPU register → storage X

(figure: variable X lives in a memory "storage box"; the storage box cannot
 compute, so the computation is performed in a CPU register)
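The same three steps written out in C, purely as an illustration (the variable names are ours, not from the slide):

    int x = 7;        /* variable X in memory (the "storage box") */
    int reg;          /* CPU register                             */

    reg = x;          /* (1) load:  memory -> register            */
    reg = reg + 1;    /* (2) compute: increment inside the CPU    */
    x = reg;          /* (3) store: register -> memory            */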

Page 3: Ch 8, 9   Kernel Synchronization

3

(Three Cases of Race)

A. Two CPUs sharing a variable
B. Kernel base code vs. interrupt handler
C. Kernel preemption between processes

Page 4: Ch 8, 9   Kernel Synchronization

4

(Case A) Two CPUs sharing a variable

Page 5: Ch 8, 9   Kernel Synchronization

5

(Case) CPUs not sharing a variable

(figure: four CPUs, #0-#3, each with its own program counter, executing
 vi a.out, sh a.out, and kernel a.out out of shared memory; no variable is
 shared between them)

Page 6: Ch 8, 9   Kernel Synchronization

6

(Case A) SMP sharing a variable

(figure: the same four CPUs, each running user code and entering the kernel;
 two of them execute "load X, compute, store X" on the same variable X in
 shared memory at the same time -- a race)

Page 7: Ch 8, 9   Kernel Synchronization

7

Interleaved execution ---- incorrect: lost update!

    Thread 1 (access X)            Thread 2 (access X)
    load X        (X=7)
                                   load X        (X=7)
    increment X   (7 -> 8)
                                   increment X   (7 -> 8)
    store back X  (X=8)
                                   store back X  (X=8)

Atomic execution (mutual exclusion) ---- correct

    Thread 1 (access X)            Thread 2 (access X)
    load X        (X=7)
    increment X   (7 -> 8)
    store back X  (X=8)
                                   load X        (X=8)
                                   increment X   (8 -> 9)
                                   store back X  (X=9)
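The lost update is easy to reproduce in user space; a minimal sketch assuming POSIX threads (not part of the slides):

    #include <pthread.h>
    #include <stdio.h>

    static volatile long x;                  /* shared variable */

    static void *worker(void *arg)
    {
        for (long i = 0; i < 1000000; i++)
            x++;                             /* load, increment, store: not atomic */
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("x = %ld (expected 2000000)\n", x);   /* usually smaller: lost updates */
        return 0;
    }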

Page 8: Ch 8, 9   Kernel Synchronization

8

Introduction

• Critical region (critical section):
  – code path that reads and writes shared data
• Race condition:
  – two threads (tasks) simultaneously executing the same kind of critical
    region (interleaved execution)
• Synchronization:
  – make sure a race condition does not occur
  – guarantee atomic execution within the critical region (serialize accesses)

Page 9: Ch 8, 9   Kernel Synchronization

9

(Case B) Kernel base code vs. interrupt handler

Page 10: Ch 8, 9   Kernel Synchronization

10

(Case B) Kernel base code vs. interrupt handler

(figure: a single CPU; the kernel base code is in the middle of
 "load X, compute, store X" with X=10 -- it has loaded X into its register
 (R=10) and incremented it (R=11) -- when an interrupt arrives; the interrupt
 handler runs the same sequence on X (load R=10, compute R=11, store X=11);
 the kernel base code then resumes and stores its own register, so X ends up
 as 11 instead of 12: one of the two updates is lost)

Page 11: Ch 8, 9   Kernel Synchronization

11

(Case C) Process Context Switch during System Call Execution

Page 12: Ch 8, 9   Kernel Synchronization

12

(Case C) Process Context Switch during System Call Execution

(figure: one CPU, one program counter; process sh makes a system call and the
 kernel begins "load X, compute, store X" on shared variable X, but a context
 switch preempts it; process vi then makes a system call and the kernel enters
 the same "load X, compute, store X" sequence -- the two in-kernel executions
 race on X)

Page 13: Ch 8, 9   Kernel Synchronization

13

Linux functions for Preventing Race

1. Atomic functions
   • for simple operations (like +, ++, -, *, /), e.g. atomic_inc(X);
     the whole "(1) load X, (2) compute, (3) store X" happens atomically
2. Spinlock
   • explicit lock; busy-wait (keep checking) if the lock is taken, then do
     "(1) load X, (2) compute, (3) store X"
3. Semaphore
   • explicit lock; block & wakeup (sleep) if the lock is taken, then do
     "(1) load X, (2) compute, (3) store X"

Page 14: Ch 8, 9   Kernel Synchronization

14

(1) Atomic functions

• For simple operations
• For integers -- init, read, set, add, inc, …
    eg  atomic_set(&v, 4);   // v = 4
        atomic_inc(&v);      // v++
• For booleans (single bits) -- set, clear, change, test
    eg  set_bit(0, &word);            // set bit zero of word
        change_bit(0, &word);         // flip bit zero of word
        test_and_set_bit(0, &word);   // set bit zero and return its previous value
• If the CR is a simple (integer or boolean) operation, just use these atomic
  operations
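A kernel-style sketch pulling the integer calls above into one place (a sketch only, assuming the classic atomic_t interface of this kernel generation):

    #include <linux/kernel.h>       /* printk */
    #include <asm/atomic.h>         /* atomic_t (use <linux/atomic.h> on newer kernels) */

    static atomic_t v = ATOMIC_INIT(0);     /* define and initialize */

    static void atomic_example(void)
    {
        atomic_set(&v, 4);                  /* v = 4                             */
        atomic_inc(&v);                     /* v++, safe against concurrent CPUs */
        atomic_add(2, &v);                  /* v += 2                            */
        if (atomic_dec_and_test(&v))        /* --v; true only if the result is 0 */
            printk("v reached zero\n");
        printk("v = %d\n", atomic_read(&v));
    }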

Page 15: Ch 8, 9   Kernel Synchronization

15

Lock

• General structure of process Pi -- each process:

    do {
        entry section         -- Lock (wait if in use)
        critical region (CR)
        exit section          -- Unlock
        remainder section
    } while (1);

Page 16: Ch 8, 9   Kernel Synchronization

16

Lock -- several issues

• What do you do while waiting for the lock to be released?
  – busy-wait
  – block-&-wakeup
• Size of the area this lock protects (lock granularity)
  – single lock -- entire kernel
  – many locks -- one for each small critical region (variable)
  – concurrency vs. overhead
• Deadlock -- prevention, avoidance, detection; the classic two-thread case
  (see the sketch after this list):

      thread 1                   thread 2
      acquire lock A             acquire lock B
      try to acquire lock B      try to acquire lock A
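A spinlock version of that deadlock, as a sketch (the lock names a_lock and b_lock are ours); the usual cure is a single agreed acquisition order:

    spinlock_t a_lock = SPIN_LOCK_UNLOCKED;   /* hypothetical locks */
    spinlock_t b_lock = SPIN_LOCK_UNLOCKED;

    /* thread 1 */
    spin_lock(&a_lock);
    spin_lock(&b_lock);    /* spins, waiting for thread 2 ...                 */

    /* thread 2 */
    spin_lock(&b_lock);
    spin_lock(&a_lock);    /* ... which spins waiting for thread 1: deadlock  */

    /* Cure: make every code path acquire the two locks in the same order
       (e.g. always a_lock first, then b_lock). */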

Page 17: Ch 8, 9   Kernel Synchronization

17

2 kinds of lock

1. Spin lock
   – busy-wait
   – lock the memory bus if necessary (Ch 9)
   – holding time should be short
   – should not sleep while holding it
2. Semaphore
   – block-&-wakeup
   – context switch overhead (sleep, wakeup)
   – you can sleep while holding this lock

(figure: two CPUs sharing one bus to memory)

Page 18: Ch 8, 9   Kernel Synchronization

18

(2) Spin Lock

• busy loop
• lock holding time should be short
• spin locks can be used in interrupt handlers (in that case one should also
  disable local interrupts)
• cannot request a spin lock you already hold (i.e. spin locks are not
  recursive in Linux)
• a spin lock can also be created dynamically

    spinlock_t my_lock = SPIN_LOCK_UNLOCKED;   // declaration
    spin_lock(&my_lock);
        --- critical region ---
    spin_unlock(&my_lock);
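For the interrupt-handler case mentioned above, the kernel provides a variant that disables local interrupts while the lock is held; a short sketch:

    unsigned long flags;

    spin_lock_irqsave(&my_lock, flags);        /* disable local interrupts, then lock */
        --- critical region shared with an interrupt handler ---
    spin_unlock_irqrestore(&my_lock, flags);   /* unlock, restore previous interrupt state */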

Page 19: Ch 8, 9   Kernel Synchronization

19

(3) Semaphores

• Sleeping lock -- failed to obtain the semaphore? go to sleep
  – an interrupt handler cannot use it (an interrupt handler cannot sleep)
  – holding both a spinlock and a semaphore? don't try
• When do we use it?
  – when (lock holding time) > overhead of (sleep + wakeup)
  – only in process context (not in an interrupt handler)
• Semaphores do not disable kernel preemption

    down(&sem);   // P(): sleep if unavailable (does not respond to signals)
    up(&sem);     // V(): release the given semaphore
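A minimal usage sketch with the classic struct semaphore interface of this kernel generation (the semaphore name is ours); down_interruptible() is the signal-aware variant of down():

    static DECLARE_MUTEX(mr_sem);          /* struct semaphore initialized to 1 */

    if (down_interruptible(&mr_sem))       /* sleep; nonzero means a signal woke us */
        return -EINTR;
    /* --- critical region; sleeping is allowed here --- */
    up(&mr_sem);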

Page 20: Ch 8, 9   Kernel Synchronization

20

Which method to use: Spin lock vs. Semaphore

    Requirement                 Recommended lock
    low overhead locking        spin lock is preferred
    short lock hold time        spin lock is preferred
    interrupt context           spin lock is required
    need to sleep               semaphore is required

Page 21: Ch 8, 9   Kernel Synchronization

21

Where -- race conditions

A. Symmetrical multiprocessing (multiple CPUs)
   Two CPUs each enter the same type of CR (fetch x, add, store x).
   Cure: use a semaphore when entering/exiting the CR ("SMP-safe")
   -- the other CPU is not allowed to enter the same type of CR.

B. Interrupt (single CPU)
   The kernel is in the middle of a CR (fetch x, add, store x) when an
   interrupt arrives and the interrupt handler enters the same CR
   (fetch x, add, store x).
   Cure: disable interrupts while the kernel is inside the CR
   ("interrupt-safe") -- an interrupt cannot occur in this CR.

C. Kernel preemption (single CPU)   /* Linux now allows kernel preemption; do this safely */
   The kernel is in the middle of CR A (fetch x, add, store x) when it is
   preempted; another task X invokes a system call (enters the kernel) and
   the kernel enters the same CR A (fetch x, add, store x).
   Cure: do not allow preemption during a kernel CR ("preempt-safe")
   -- the kernel is guaranteed to complete this CR.

(figures: processes PA and PX racing in CR(x) on two CPUs, cured with Lock(x);
 kernel base code and an interrupt handler racing in CR(x), cured by disabling
 interrupts; two tasks racing in CR(x) across preemption, cured by disabling
 preemption)

Page 22: Ch 8, 9   Kernel Synchronization

22

Advanced Issues

Page 23: Ch 8, 9   Kernel Synchronization

23

Reader-Writer Spin Lock

• reader lock -- allows many readers to share the lock
• writer lock -- at most one writer (and no readers)
• use when the code is clearly separated into read-only and write sections
• favors readers over writers -- a writer waits until all readers have finished

    rwlock_t mr_rwlock = RW_LOCK_UNLOCKED;

    read_lock(&mr_rwlock);          write_lock(&mr_rwlock);
    /* read-only CR */              /* read/write CR */
    read_unlock(&mr_rwlock);        write_unlock(&mr_rwlock);

Page 24: Ch 8, 9   Kernel Synchronization

24

Seq Locks

• Write lock
  – "No other writers?" ---- always succeeds
  – the initial value of this lock is zero
  – when acquiring the lock, sequence_counter++ (becomes odd)
  – when releasing the lock, sequence_counter++ (becomes even)
  – "odd" means "a writer is currently working"
• Reader
  – goal -- obtain the most recent version
  – before READ -- copy sequence_counter to a local variable
  – after READ ---- get the current sequence_counter
      • if the value read is odd, retry the READ
      • if (current value != saved value), retry the READ
• Favors the writer over readers -- the writer does not wait for readers
• Scalable -- many readers and only a few writers

(figure: a reader retries its read whenever the sequence counter is odd or
 has changed while it was reading)
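The kernel's seqlock interface follows that protocol directly; a usage sketch (the shared variable is hypothetical, and SEQLOCK_UNLOCKED is the initializer of this kernel generation):

    #include <linux/seqlock.h>

    static seqlock_t mr_seq_lock = SEQLOCK_UNLOCKED;
    static unsigned long shared_value;          /* hypothetical shared data */

    /* writer -- never waits for readers */
    write_seqlock(&mr_seq_lock);                /* sequence counter becomes odd  */
    shared_value++;
    write_sequnlock(&mr_seq_lock);              /* sequence counter becomes even */

    /* reader -- retry if a writer was active during the read */
    unsigned seq;
    unsigned long v;
    do {
        seq = read_seqbegin(&mr_seq_lock);
        v = shared_value;
    } while (read_seqretry(&mr_seq_lock, seq));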

Page 25: Ch 8, 9   Kernel Synchronization

25

Comparison

1. Spin-lock
   – does not allow concurrent access to shared data
   – busy-wait
2. Semaphore
   – does not allow concurrent access to shared data
   – block & wakeup
3. Reader/Writer spin-lock
   – reader has higher priority than writer
   – allows multiple readers to access shared data concurrently
4. Seqlock
   – writer has higher priority than reader
   – allows multiple readers to access shared data concurrently

Page 26: Ch 8, 9   Kernel Synchronization

26

BKL : The Big Kernel Lock

• Used in the first SMP implementation
• A global spin lock
• Only one task could be in the kernel at a time
• Later, fine-grained locking was introduced, so multiple CPUs can execute
  the kernel concurrently
• The BKL is not scalable
• The BKL is obsolete now

    lock_kernel();
    unlock_kernel();

Page 27: Ch 8, 9   Kernel Synchronization

27

Completion Variables

• PA needs to signal PB that an event has occurred.

    Init: init_completion(struct completion *);        // initialization
    PB:   wait_for_completion(struct completion *);    // waits (sleeps) until the
                                                       // completion variable is signaled
    PA:   complete(struct completion *);               // signals (wakes up) tasks
                                                       // waiting at the completion variable
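A minimal usage sketch tying the three calls together (the variable name my_done is ours):

    #include <linux/completion.h>

    static DECLARE_COMPLETION(my_done);   /* or: struct completion my_done;
                                                  init_completion(&my_done); */

    /* PB -- the waiter */
    wait_for_completion(&my_done);        /* sleeps until PA calls complete()  */

    /* PA -- the signaler */
    complete(&my_done);                   /* wakes a task waiting on my_done   */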

Page 28: Ch 8, 9   Kernel Synchronization

28

Barriers

• Both the compiler and the processor can reorder reads and writes for
  performance reasons.
• Ordering of reads and writes is important for locking.
• Barriers:
  – "no read/write reordering is allowed across a barrier"

    rmb() : load (read) barrier
    wmb() : store (write) barrier
    mb()  : load & store barrier

Page 29: Ch 8, 9   Kernel Synchronization

29

Initially a = 1, b = 2

    thread 1        thread 2
    a = 3;
    b = 4;
                    c = b;
                    d = a;
                                // c=4, d=1 can happen (reordering)

    thread 1        thread 2
    a = 3;
    mb();
    b = 4;
                    c = b;
                    rmb();
                    d = a;
                                // c=4, d=1 can not happen

Page 30: Ch 8, 9   Kernel Synchronization

30

Ch 10 Timers and Time Management

Page 31: Ch 8, 9   Kernel Synchronization

31

Terminology

• HZ
  – tick rate (differs for each architecture)
  – on i386 (since the 2.5 series):  #define HZ 1000   (include/asm-i386/param.h)
  – on most other architectures, HZ is 100
• jiffies
  – number of ticks since system boot
  – global variable
• jiffies-HZ-time conversion
  – seconds to jiffies:  (seconds * HZ)
  – jiffies to seconds:  (jiffies / HZ)

(figure: timer ticks at 1000 Hz vs. 100 Hz, each tick incrementing jiffies)
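A short sketch of the conversions above, together with the wrap-safe comparison macro the kernel provides (the 2-second timeout is just an example value):

    #include <linux/jiffies.h>                   /* jiffies and time_after() */

    unsigned long timeout = jiffies + 2 * HZ;    /* "2 seconds from now", in ticks */
    unsigned long uptime  = jiffies / HZ;        /* ticks since boot -> seconds    */

    if (time_after(jiffies, timeout))            /* wrap-around-safe comparison    */
        ;                                        /* the 2 seconds have elapsed     */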

Page 32: Ch 8, 9   Kernel Synchronization

32

Terminology

• Issues on HZ -- what if we increase the tick rate from 100 Hz to 1000 Hz?
  • all timed events have a higher resolution
  • the accuracy of timed events improves
  • the overhead of the timer interrupt increases
• Issues on jiffies
  – internal representation of jiffies
  – jiffies wrap-around

Page 33: Ch 8, 9   Kernel Synchronization

33

Hardware Clocks and Timers

• System Timer
  – drives an interrupt at a periodic rate
  – Programmable Interrupt Timer (PIT) on x86
  – the kernel programs the PIT on boot to drive the timer interrupt at HZ
• Real-Time Clock (RTC)
  – nonvolatile device for storing the time
  – the RTC keeps running even when the system is off (small battery)
  – on boot, the kernel reads the RTC to initialize the wall time
    (in struct timespec xtime)

Page 34: Ch 8, 9   Kernel Synchronization

34

Timer Interrupt Handler

• The architecture-dependent routine
  – acknowledge or reset the system timer as required
  – periodically save the updated wall time to the RTC
  – call the architecture-independent timer routine, do_timer()
• The architecture-independent routine
  – jiffies++
  – update the wall time
  – update resource usages
  – run any dynamic timers that have expired
  – calculate the load average

Page 35: Ch 8, 9   Kernel Synchronization

35

Timer Interrupt Handler

    void do_timer(struct pt_regs *regs)
    {
        jiffies_64++;                            /* increment jiffies */
        update_process_times(user_mode(regs));   /* 1 if user mode, 0 if kernel mode */
        update_times();
    }

    void update_process_times(int user_tick)
    {
        struct task_struct *p = current;
        int cpu = smp_processor_id(), system = user_tick ^ 1;

        update_one_process(p, user_tick, system, cpu);   /* p->utime += user;
                                                            p->stime += system; */
        run_local_timers();                      /* marks a softirq */
        scheduler_tick(user_tick, system);       /* decrements the currently running
                                                    process's timeslice and sets
                                                    need_resched if needed */
    }

    static inline void update_times(void)
    {
        unsigned long ticks;

        ticks = jiffies - wall_jiffies;
        if (ticks) {
            wall_jiffies += ticks;
            update_wall_time(ticks);
        }
        calc_load(ticks);
    }

Page 36: Ch 8, 9   Kernel Synchronization

36

The Time of Day

• The wall time
  – xtime.tv_sec stores the number of seconds that have elapsed since
    January 1, 1970 (the epoch)
  – xtime.tv_nsec stores the number of nanoseconds that have elapsed in the
    last second

    struct timespec xtime;

    struct timespec {
        time_t tv_sec;    /* seconds     */
        long   tv_nsec;   /* nanoseconds */
    };
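Kernel code normally obtains the wall time through an accessor rather than reading xtime directly; a sketch using do_gettimeofday(), the standard interface of this kernel generation:

    #include <linux/time.h>

    struct timeval tv;

    do_gettimeofday(&tv);     /* seconds and microseconds since the epoch */
    printk("now: %ld.%06ld\n", (long)tv.tv_sec, (long)tv.tv_usec);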

Page 37: Ch 8, 9   Kernel Synchronization

37

Timers

• Timers --- e.g. "run fA() every 250 ms"
  – kernel code often needs to delay execution of some function until a later time
  – e.g. bottom-half mechanisms
• How to use timers
  – initial setup
    • specify the expiration time
    • specify the function to execute upon expiration
  – activate the timer

Page 38: Ch 8, 9   Kernel Synchronization

38

Timers

• Timer structure

    struct timer_list {
        struct list_head entry;            /* timers are part of a linked list */
        unsigned long expires;             /* expiration value, in jiffies */
        spinlock_t lock;                   /* lock protecting this timer */
        void (*function)(unsigned long);   /* the timer handler function */
        unsigned long data;                /* lone argument to the handler */
    };

Page 39: Ch 8, 9   Kernel Synchronization

39

Timers

• Using Timers

  – Define the timer:

        struct timer_list my_timer;

  – Initialize the timer's internal values:

        init_timer(&my_timer);

  – Fill out the remaining values as required:

        my_timer.expires = jiffies + delay;   /* timer expires in delay ticks */
        my_timer.data = 0;                    /* zero is passed to the timer handler */
        my_timer.function = my_function;      /* function to run when the timer expires */

  – Activate the timer:

        add_timer(&my_timer);
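The handler installed above must match the callback signature from struct timer_list; my_function here is a hypothetical sketch:

    void my_function(unsigned long data)
    {
        /* runs in interrupt (timer/softirq) context when the timer expires;
           it must not sleep */
        printk("timer fired, data = %lu\n", data);
    }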

Page 40: Ch 8, 9   Kernel Synchronization

40

Delaying Execution

• schedule_timeout()

    /* caller: sleep for s seconds */
    set_current_state(TASK_INTERRUPTIBLE);   /* put the task to sleep */
    schedule_timeout(s * HZ);

    signed long schedule_timeout(signed long timeout)
    {
        struct timer_list timer;             /* create a timer */
        unsigned long expire;

        expire = timeout + jiffies;

        init_timer(&timer);                  /* initialize the timer */
        timer.expires = expire;
        timer.data = (unsigned long) current;
        timer.function = process_timeout;
        add_timer(&timer);                   /* activate the timer */

        schedule();                          /* call schedule() */

        del_timer_sync(&timer);
        timeout = expire - jiffies;
        return timeout < 0 ? 0 : timeout;
    }

    static void process_timeout(unsigned long __data)
    {
        wake_up_process((task_t *)__data);
    }

Page 41: Ch 8, 9   Kernel Synchronization

41

Other Ways to Delay Execution

• Busy Looping (as soon as …)

    unsigned long delay = jiffies + 10;   /* ten ticks */
    while (time_before(jiffies, delay));

• Small Delays (micro- or milliseconds)

    void udelay(unsigned long usecs);
    void mdelay(unsigned long msecs);

    udelay(150);   /* delay for 150 us */