disk scheduling in linux its really quite complex!
TRANSCRIPT
My Goals
• Teach a little bit of Computer Science.• Show that the easy stuff is hard in real life.
• BTW, Operating systems are cool!
How A Disk Works
• A disk is a bunch of data blocks.
• A disk has a disk head.
• Data can only be accessed when the head is on that block.
• Moving the head takes FOREVER.
• Data transfer is fast (once the disk head gets there).
The Problem Statement
• Suppose we have multiple requests for disk blocks …. Which should we access first?
P.S. Yes, order does matter … a lot.
Formal Problem Statement
• Input: A set of requests.
• Input: The current disk head location.
• Input: Algorithm state (direction??)
• Output: The next disk block to get.
P.S. Not the whole ordering, just the next one.
The Goals
• Maximize throughput– Operations per second.
• Maximize fairness– Whatever that means.
• Avoid Starvation– And very very long waits.
• Real Time Concerns– These can be life threatening (and bank
accounts too!).
What’s Not Done
• Every Operating system assigns priorities to all sorts of things– Requests for RAM– Requests for the CPU– Requests for network access
• Few use disk request priority.
Why Talk About This Now?
• Because in the last year Linux has had three very different schedulers, and they’ve been tested against each other.
A Pessimal Algorithm
• Choose the disk request furthest from the current disk head.
• This is known to be as bad as any algorithm without idle periods.
An Optimal Algorithm
• Chose the disk request closest to the disk head.
• It’s Optimal!
Optimal Analyzed
• This is known to have the highest performance in operations per second.
• It’s unfair to requests toward the edges of the disk.
• It allows for starvation.
First Come First Serve
• Serve the requests in their arrival order.– It’s fair.– It avoid starvation.– It’s medium lousy
performance.– Some simple OSs use
this.
Elevator
• Move back and forth, solving requests as you go.
• Performance– good
• Fairness– Files near the middle of the disk get 2x the
attention.
Cyclic Elevator
• Move toward the bottom, solving requests as you go.
• When you’ve solved the lowest request, seek all the way to the highest request.– Performance penalty occurs here.
Cyclic Elevator Analyzed
• It’s fair.
• It’s starvation-proof.
• It’s very good performance.– Almost as good as elevator.
• It’s used in real life (and every textbook).
Deadline Scheduler
• Each request has a deadline.
• Service requests using cyclic elevator.
• When a deadline is threatened, skip directly to that request.
• For Real Time (which means xmms)
Jens Axboe
Deadline Analyzed
• Gives Priority to Real Time Processes.• Fair otherwise.• No starvation
– Unless a real time process goes wild.
Anticipatory Scheduling(The Idea)
• Developed by several people.• Coded by Nick Piggin.
• Assume that an I/O request will be closely followed by another nearby one.
Anticipatory Scheduling(The Algorithm)
• After servicing a request … WAIT.– Yes, this means do nothing even though there
is work to be done.
• If a nearby request occurs soon, service it.
• If not … cyclic elevator.
Anticiptory Scheduling Analyzed
• Fair
• No support for real time.
• No starvation.
• Makes assumptions about how processes work in real life.– That’s the idleness.– They better be right
Benchmarking the Anticipatory Scheduler
Source: http://www.cs.rice.edu/~ssiyer/r/antsched/antsched.pdf
Completely Fair Queuing(also by Jens)
• Real Time needs always come first
• Otherwise, no user should be able to hog the disk drive.
• Priorities are OK.
The CFQ Picture
RT
Q1
Q2
Q20
Dispatcher Disk Queue
Disk
Yes, Gabe’s art is better
10msValve
Analyzing CFQ
• Complex!!!!!
• Has Real Time support (Jens likes that).
• Fair, and fights disk hogs!– A new kind of fairness!!
• No starvation is possible.– Real time crazyness excepted.
• Allows for priorities– But no one knows how to assign them.
Benchmark #1
• time (find kernel-tree -type f | xargs cat > /dev/null)
Dead: 3 minutes 39 secondsCFQ: 5 minutes 7 secondsAS: 17 seconds
The Benchmark #2
for i in 1 2 3 4 5 6dotime (find kernel-tree-$i -type f | xargs cat > /dev/null ) &
done
Dead: 3m56.791sCFQ: 5m50.233s AS: 0m53.087s
The Benchmark #3
time (cp 1-gig-file foo ; sync)
AS: 1:22.36CFQ: 1:25.54Dead: 1:11.03
Benchmark #4
• time ssh testbox xterm -e true
Old: 62 secondsDead: 14 secondsCFQ: 11 secondsAS: 12 seconds
Benchmark #5
• While “cat 512M-file > /dev/null “
• Measure “write-and-fsync -f -m 100 outfile”
Old: 6.4 secondsDead: 7.7 secondsCFQ: 8.4 secondsAS: 11.9 seconds
The Winner is …
• Andrew Morton said, "the anticipatory scheduler is wiping the others off the map, and 2.4 is a disaster."
What You Learned
• In Real Operating Systems …
– Performance is never obvious.– No one uses the textbook algorithm.– Benchmarking is everything.
• Theory is useful, if it helps you benchmark better.
• Linux is cool, since we can see the development process.