CS 179F (Winter 2020), 2020-01-15
TRANSCRIPT
Project Idea 1: Adding Thread Support for xv6
Adapted some slides from www.cs.pdx.edu/~walpole/class/cs533/winter2007/slides/92.ppt
Threads
- Separate the execution and resource-container roles
  - The thread defines a sequential execution stream within a process (PC, SP, registers)
  - The process defines the address space, resources, and general process attributes (everything but threads)
- Threads become the unit of scheduling
  - Processes are now the containers in which threads execute
  - Processes become static; threads are the dynamic entities
Recap: Process Address Space
[Figure: a process address space running from 0x00000000 to 0xFFFFFFFF, containing Code (text segment), Static Data (data segment), Heap (dynamic memory allocation), and Stack; SP points into the stack, PC into the code.]
Threads in a Process
[Figure: one address space with shared Code, Static Data, and Heap, plus a separate stack for each of Threads 1-3; each thread (T1, T2, T3) has its own stack and its own PC.]
Threads: Concurrent Servers
Using fork() to create new processes to handle requests in parallel is overkill for such a simple task.
Recall our forking Web server:

while (1) {
    int sock = accept();
    if ((child_pid = fork()) == 0) {
        /* Handle client request */
        /* Close socket and exit */
    } else {
        /* Close socket */
    }
}
Threads: Concurrent Servers
Instead, we can create a new thread for each request:

web_server() {
    while (1) {
        int sock = accept();
        thread_fork(handle_request, sock);
    }
}

handle_request(int sock) {
    /* Process request */
    close(sock);
}
Implementing Threads
Kernel-Level Threads
All thread operations are implemented in the kernel.
The OS schedules all of the threads in the system.
Threads don't have to be kept separate from processes.
OS-managed threads are called kernel-level threads or lightweight processes:
Windows: threads
Solaris: lightweight processes (LWP)
POSIX Threads (pthreads): PTHREAD_SCOPE_SYSTEM
Kernel-Level Thread (KLT) Limitations
KLTs make concurrency cheaper than processes:
Much less state to allocate and initialize.
However, there are a couple of issues.
Issue 1: KLT overhead is still high.
Thread operations still require system calls.
Ideally, we want thread operations to be as fast as a procedure call.
Issue 2: KLTs are general; they are unaware of application needs.
Alternative: user-level threads (ULT)
Alternative: User-Level Threads
Implement threads using a user-level library.
ULTs are small and fast:
A thread is simply represented by a PC, registers, a stack, and a small thread control block (TCB).
Creating a new thread, switching between threads, and synchronizing threads are all done via procedure calls.
No kernel involvement.
User-level thread operations are about 100x faster than kernel threads.
pthreads: PTHREAD_SCOPE_PROCESS
CSE 153 – Lecture 6 – Threads 10
Thread Scheduling
The thread scheduler determines when a thread runs.
It uses queues to keep track of what threads are doing, just like the OS does for processes, but it is implemented at user level in a library.
Run queue: threads currently running (usually one).
Ready queue: threads ready to run.
Are there wait queues? How would you implement thread_sleep(time)?
Non-Preemptive Scheduling
Threads voluntarily give up the CPU with thread_yield.
What is the output of running these two threads?

/* Ping Thread */
while (1) {
    printf("ping\n");
    thread_yield();
}

/* Pong Thread */
while (1) {
    printf("pong\n");
    thread_yield();
}
thread_yield()
The semantics of thread_yield are that it gives up the CPU to another thread; in other words, it context switches to another thread.
So what does it mean for thread_yield to return?
Execution trace of ping/pong:

printf("ping\n");
thread_yield();
printf("pong\n");
thread_yield();
...
Implementing thread_yield()

thread_yield() {
    thread_t old_thread = current_thread;
    current_thread = get_next_thread();
    append_to_queue(ready_queue, old_thread);
    context_switch(old_thread, current_thread);
    return;
}

The magic step is invoking context_switch().
Why do we need to call append_to_queue()?
Each call runs in two roles:
As the old thread (before the switch)
As the new thread (when it is later switched back in)
Thread Context Switch
The context switch routine does all of the magic:
Saves the context of the currently running thread (old_thread):
Pushes all machine state onto its stack (not into its TCB).
Restores the context of the next thread:
Pops all machine state from the next thread's stack.
The next thread becomes the current thread.
Returns to the caller as the new thread.
This is all done in assembly language:
It works at the level of the procedure calling convention, so it cannot be implemented using procedure calls.
Preemptive Scheduling
Non-preemptive threads have to voluntarily give up the CPU:
A long-running thread will take over the machine.
Only voluntary calls to thread_yield(), thread_stop(), or thread_exit() cause a context switch.
Preemptive scheduling causes an involuntary context switch:
Need to regain control of the processor asynchronously.
Use a timer interrupt (how do you do this?).
The timer interrupt handler forces the current thread to "call" thread_yield.
Threads Summary
Processes are too heavyweight for multiprocessing:
Time and space overhead.
The solution is to separate threads from processes:
Kernel-level threads are much better, but still carry significant overhead.
User-level threads are better still, but not well integrated with the OS.
Scheduling of threads can be either preemptive or non-preemptive.
Now, how do we get our threads to cooperate correctly with each other?
Synchronization...
Cooperation between Threads
What is the purpose of threads?
Threads cooperate in multithreaded programs. Why?
To share resources and access shared data structures:
e.g., threads accessing a memory cache in a Web server.
To coordinate their execution:
One thread executes relative to another.
CSE 153 – Lecture 7 – Synchronization 17
Spinlock
Here is our lock implementation with test-and-set:
CSE 153 – Lecture 9 – Semaphores and Monitors

struct lock {
    int held;   /* initially 0 */
};

void acquire(struct lock *lock) {
    while (test_and_set(&lock->held))
        ;   /* spin until the old value was 0 (lock was free) */
}

void release(struct lock *lock) {
    lock->held = 0;
}
Semaphores
Semaphores are an abstract data type that provides mutual exclusion to critical sections.
Block waiters; interrupts enabled within the critical section.
Described by Dijkstra in the THE system in 1968.
Semaphores are integers that support two operations:
wait(semaphore): decrement; block until the semaphore is open.
Also P(), after the Dutch word for test (proberen), or down().
signal(semaphore): increment; allow another thread to enter.
Also V(), after the Dutch word for increment (verhogen), or up().
That's it! No other operations – not even just reading its value – exist.
Semaphore safety property: the semaphore value is always greater than or equal to 0.
Blocking in Semaphores
Associated with each semaphore is a queue of waiting threads/processes.
When wait() (or P()) is called by a thread:
If the semaphore is open, the thread continues.
If the semaphore is closed, the thread blocks on the queue.
Then signal() (or V()) opens the semaphore:
If a thread is waiting on the queue, that thread is unblocked.
If no threads are waiting on the queue, the signal is remembered for the next thread.
Semaphore Types
Semaphores come in two types.
Mutex semaphore (or binary semaphore):
Represents single access to a resource.
Guarantees mutual exclusion to a critical section.
Counting semaphore (or general semaphore):
Multiple threads can pass the semaphore, as determined by the count.
A mutex has count = 1; a counting semaphore has count = N.
Represents a resource with many units available, or a resource that allows some unsynchronized concurrent access (e.g., reading).
Protecting a critical region:

sem mutex = 1;
process CS[i = 1 to n] {
    while (true) {
        P(mutex);
        critical region;
        V(mutex);
        noncritical region;
    }
}
Implementing a 2-Process Barrier
Neither process can pass the barrier until both have arrived.
We must be able to reuse the barrier continuously; therefore it must reinitialize itself after letting processes pass.
This will require two semaphores: one signaling semaphore for the arrival of each process.
Each process x signals its arrival with V(arrive_x) and then waits on the other process's (y) semaphore with P(arrive_y):
sem arrive1 = 0, arrive2 = 0;

process Worker1 {
    ...
    V(arrive1);   /* signal arrival */
    P(arrive2);   /* await arrival of the other process */
    ...
}

process Worker2 {
    ...
    V(arrive2);
    P(arrive1);
    ...
}
A Simple Producer/Consumer Problem
Use a single shared buffer.
Only one process can read or write the buffer at a time.
Producers put things in the buffer; consumers take things out of the buffer.
Need two semaphores:
empty keeps track of whether the buffer is empty.
full keeps track of whether the buffer is full.
typeT buf;        /* a buffer of some type */
sem empty = 1;    /* initially the buffer is empty */
sem full = 0;     /* initially the buffer is not full */

process Producer[i = 1 to m] {
    while (true) {
        ...
        /* produce data, then deposit it in the buffer */
        P(empty);
        buf = data;
        V(full);
    }
}

process Consumer[j = 1 to n] {
    while (true) {
        P(full);
        result = buf;
        V(empty);
        ...
    }
}