
james the GIANT killer: evaluating locking schemes in FreeBSD

2010-02-27

James Francis Toy IV
David Hemmendinger

Purpose

• Evaluate current locking scheme in FreeBSD

• See if the locking methods can be improved

• Evaluate both methods and form conclusions

WAIT! why do we need locking?

• Race conditions
– Example: two threads each print a word, “bad” and “dog”
– Threads race each other on context switches
– Possible incorrect result: “bda dog” (interleaving)
– Alternative correct result: “dog” “bad”
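To make the interleaving concrete, here is a minimal user-space sketch (my illustration, not from the slides) in which two POSIX threads append “bad” and “dog” to one shared buffer with no lock; depending on context switches the output can come out whole, reordered, or garbled:

#include <pthread.h>
#include <sched.h>
#include <stdio.h>

static char buf[64];       /* shared resource */
static int  pos = 0;       /* shared cursor, deliberately unprotected */

/* Append one word character by character; with no lock, a context
 * switch mid-word lets the other thread's characters interleave. */
static void *writer(void *arg)
{
    for (const char *p = arg; *p != '\0'; p++) {
        buf[pos++] = *p;   /* unsynchronized read-modify-write on pos */
        sched_yield();     /* invite a context switch */
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, writer, "bad ");
    pthread_create(&t2, NULL, writer, "dog ");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("%s\n", buf);   /* "bad dog ", "dog bad ", or e.g. "bda dog " */
    return 0;
}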

when it matters

• The most important thing is correct results
– Incorrect code at the kernel level (close to the metal) can cause userland applications to yield incorrect results.

• In some cases incorrect code can lead to death
– Therac-25: operators typed command sequences so fast that the input data was corrupted, resulting in radiation overexposure.

OK; so what is a GIANT_LOCK?

What the kernel currently uses for locking!

GIANT_LOCK

• GIANT_LOCKs only allow one thread in the kernel at a time

• This is the simple solution; however, it inhibits concurrency!

• Why concurrency is important with Symmetric Multi-Processing (SMP)
– Without SMP, concurrency is only logical (time-sliced, not truly parallel)

• Is there a better solution?
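In FreeBSD the Giant lock is a kernel-wide mutex, and a Giant-protected entry point follows roughly the pattern below. This is a simplified kernel-side sketch written for this transcript, not code taken from the kernel source:

/* Simplified sketch of a Giant-protected kernel entry point. */
#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>

static int
some_kernel_handler(void)
{
        int error = 0;

        mtx_lock(&Giant);      /* only one thread runs kernel code past here */
        /* ... touch any kernel data structure; Giant covers it all ... */
        mtx_unlock(&Giant);

        return (error);
}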

[Diagram: kernel subsystems: kernel log, sysctl, scheduler, virtual memory]

Fine Grained Locking (FGL)

• Locks go only around shared data structures: the “critical sections” in subsystems
• Seldom do threads bombard one specific subsystem (a kernel design issue)
• Developer communities currently favor FGL implementations
– If everything is FGL, then concurrency is promoted from the smallest subsystem to the largest

[Diagram: each subsystem (mem, sched, sysctl, kernel log) with its own lock]

GIANT_LOCK vs. FGL

• GIANT_LOCK is safe!
– Guarantees no races and no deadlocks
– Problem: inhibits concurrency!
• Fine grained locking promotes concurrency
– Problem: complexity?
– Possible problem: deadlocks? (subsystems mutually exclusive)
– Possible problem: locking overhead (how long FGL itself takes)
• Will this pay off?
– May depend on the subsystem
• Will the FGL code present a maintenance problem?
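A rough user-space analogy of the trade-off (the subsystem names and counters here are invented for illustration): under a Giant-style scheme every operation contends for the same mutex, while under FGL each subsystem guards only its own data, so a klog writer and a sysctl reader never block each other:

#include <pthread.h>

/* Giant-style: one lock for everything. */
static pthread_mutex_t giant = PTHREAD_MUTEX_INITIALIZER;

/* FGL-style: one lock per subsystem's shared data. */
static pthread_mutex_t klog_lock   = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t sysctl_lock = PTHREAD_MUTEX_INITIALIZER;

static int klog_events;     /* shared state of the "klog" subsystem   */
static int sysctl_reads;    /* shared state of the "sysctl" subsystem */

/* Under the Giant scheme, even unrelated subsystems contend. */
void klog_event_giant(void)
{
    pthread_mutex_lock(&giant);
    klog_events++;
    pthread_mutex_unlock(&giant);
}

void sysctl_read_giant(void)
{
    pthread_mutex_lock(&giant);
    sysctl_reads++;
    pthread_mutex_unlock(&giant);
}

/* Under FGL, the two operations can run in parallel on SMP hardware. */
void klog_event_fgl(void)
{
    pthread_mutex_lock(&klog_lock);
    klog_events++;
    pthread_mutex_unlock(&klog_lock);
}

void sysctl_read_fgl(void)
{
    pthread_mutex_lock(&sysctl_lock);
    sysctl_reads++;
    pthread_mutex_unlock(&sysctl_lock);
}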

DESIGN: locking in BSD land

• Tools of the trade (sleep locks, read locks, read/write locks)
• Two separate branches in a version control system
• Specific subsystems being targeted, and why
– klog: prints kernel events -- low traffic
– sysctl: manages kernel variables -- higher traffic

[Diagram: the two code branches, Fine Grained Locking and GIANT_LOCK]

DESIGN: method of evaluation (comparison of FGL and GIANT)

• Control
– Tests designed to hit specific subsystems (is FGL a win?); a sketch of such a stress test follows this list
• multi-threaded make, sysctl
• kernel event character device, klog
• Locking profiler
1. Set the sysctl variable
2. Run the thread(s)
3. Unset the sysctl variable
– Produces TOO MUCH data
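As a sketch of what a control test that hits the sysctl subsystem could look like (my illustration, not the author's actual harness; the MIB name, thread count, and iteration count are arbitrary), each thread hammers sysctlbyname(3) in a loop:

#include <sys/types.h>
#include <sys/sysctl.h>
#include <pthread.h>
#include <stdio.h>

#define NTHREADS   8         /* arbitrary; the talk uses far more threads */
#define ITERATIONS 100000

/* Each thread repeatedly queries a kernel variable, driving traffic
 * through the sysctl subsystem and whichever lock protects it. */
static void *hammer_sysctl(void *arg)
{
    char ostype[64];
    size_t len;

    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        len = sizeof(ostype);
        if (sysctlbyname("kern.ostype", ostype, &len, NULL, 0) != 0)
            break;           /* stop on error; good enough for a stress test */
    }
    return NULL;
}

int main(void)
{
    pthread_t threads[NTHREADS];

    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&threads[i], NULL, hammer_sysctl, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(threads[i], NULL);

    printf("sysctl stress run complete\n");
    return 0;
}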

Correctness and how I achieved it

• Correctness
– Race conditions arise from writing to a shared resource
– Readers share a lock
– Writers pick up an exclusive lock (a reader/writer lock sketch follows after this list)

• Process
– New branch of code
– Replace the GIANT entry with fine grained locks
– Rebuild kernels in both configurations
– cron runs the build and test scripts (automated builds and profiling)
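The readers-share / writer-exclusive rule from the Correctness bullet can be sketched with a POSIX read/write lock. The kernel itself would use its own rwlock/sx primitives; this user-space version and its variable are only illustrative stand-ins:

#include <pthread.h>

static pthread_rwlock_t var_lock = PTHREAD_RWLOCK_INITIALIZER;
static int kernel_variable;      /* stand-in for a sysctl-managed value */

/* Any number of readers may hold the lock at the same time. */
int read_variable(void)
{
    int v;

    pthread_rwlock_rdlock(&var_lock);
    v = kernel_variable;
    pthread_rwlock_unlock(&var_lock);
    return v;
}

/* A writer takes the lock exclusively, so readers never see a torn update. */
void write_variable(int v)
{
    pthread_rwlock_wrlock(&var_lock);
    kernel_variable = v;
    pthread_rwlock_unlock(&var_lock);
}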

RESULTS: sysctl tests

[Chart: sysctl test times in μs (roughly 20,000 to 140,000) over successive tests, GIANT vs. FGL; kernel builds with 128 threads]

Conclusions

• The sysctl tests produced data that strongly support the initial hypothesis
– ~0.5% increase in system throughput (one small step)
– Maximum time saved ~30 seconds on a 2-hour build (0.5% of ~7200 s is about 36 s)

• The klog test did not produce useful data because its locking mechanisms are around a device node, not the msg_buffer.

• Extending FGL to more subsystems means more throughput.

:: what really matters ::

• The goal was to determine whether:
– FGL is detrimental to the system (complication)
– FGL is significantly faster than GIANT_LOCKING in a small subsystem like sysctl

• Locking methods are very important to SMP
– They run as close to the metal as you can get
– If they fail to be correct or efficient, the rest of the programs running on the system fail too!

• The FGL implementation should scale well.