parallel programming course explicit threading - intel programming course explicit threading paul...
TRANSCRIPT
Parallel Programming CourseExplicit Threading
Paul Guermonprezwww.Intel-Software-Academic-Program.com
[email protected] Software
2012-03-14
2
Programming with Explicit Threads
Identify tasks for threading
Identify computational model
•What work is to be done?
•What algorithm to use?
Identify how data will be accessed
•What is global? What is local?
•How is data to be assigned to threads?
3
Coding Considerations
Create function to encapsulate computation
•
•May be a function that already exists• If so, driver function may be needed to coordinate multiple threads
•
• Pthreads allows single parameter to thread function• C structure for multiple arguments
4
More Coding Considerations
Recast parameter to local variable if needed
•May be structure of several parameters
•
Add code to determine task of thread
•May need access to global variable
•
Add code to access data for task• Add code to protect global variables
• Shared data access may need to be restricted
•
Add code to synchronize thread executions
5
Task Allocation Methods
Static• Initial assignment sets work•Computations or data are divided equally• based on number of threads and thread ID
Dynamic•Useful if tasks require unequal or unknown execution time•Work given to threads as computation proceeds
6
What is Pthreads?
POSIX.1c standard
C language interface
Threads exist within same process
All threads are peers
•No explicit parent-child model
• Exception: “main thread” holds process information
7
Using Pthreads
Identify portions of code to thread
Encapsulate code into function
• If code is already a function, a driver function may need to be written to coordinate work of multiple threads
Add pthread_create() call to assign thread(s) to execute function
8
pthread_create
pthread_t *tidHandle of created thread
const pthread_attr_t *attr• attributes of thread to be created :•detach state, stack size, stack address
•void *(*function)(void *)• function to be mapped to thread
•void *arg• single argument to function
int pthread_create(tid, attr, function, arg);
9
pthread_create Explained
Spawn a thread running the function
Thread handle returned via pthread_t structure
• Specify NULL to use default attributes
Single argument sent to function
• If no arguments to function, specify NULL
Check error codes!• EAGAIN - insufficient resources to create thread• EINVAL - invalid attribute
10
Example: Thread Creation#include <stdio.h>#include <pthread.h>
void *hello () { printf("Hello Thread\n"); }
main() { pthread_t tid; pthread_create(&tid, NULL, hello, NULL);}
Main thread is process
When process ends, all threads end too.
This example may work, but we need some method of waiting for a thread to finish
11
Waiting for POSIX* Threads
Not a good idea!
// wasted cycles!// wasted cycles!
#include <stdio.h>#include <pthread.h>int threadDone = 0;
void *hello () { printf("Hello Thread\n"); threadDone = 1;}
main() { pthread_t tid; pthread_create(&tid, NULL, hello, NULL); while (!threadDone); }
12
Waiting for a Thread
pthread_t tid
• handle of joinable thread
void **val_ptr
• exit value returned by joined thread
int pthread_join(tid, val_ptr);
13
pthread_join Explained
Calling thread waits for thread with handle tid to terminate
•Only one thread can be joined
• Thread must be joinable
Exit value is returned from joined thread
• Type returned is (void *)
•Use NULL if no return value expected
Check Error Code :● ESRCH - thread (pthread_t) not found
● EINVAL - thread (pthread_t) not joinable
14
Thread States
Pthreads threads have two states
• joinable and detached
Threads are joinable by default
•Resources are kept until pthread_join
•Can be reset with attributes or API call
Detached threads cannot be joined• Resources can be reclaimed at termination• Cannot reset to be joinable
pthread_join detaches the thread automaticallyd
Threads can only be joined once.
15
Example: Multiple Threads
#include <stdio.h>#include <pthread.h>#define NUM_THREADS 4
void *hello () { printf("Hello Thread\n");
} main() {
pthread_t tid[NUM_THREADS];for (int i = 0; i < NUM_THREADS; i++)
pthread_create(&tid[i], NULL, hello, NULL);
for (int i = 0; i < NUM_THREADS; i++)pthread_join(tid[i], NULL);
}
There must be one call for each thread needed to be “joined” after termination
16
LAB #1: “HelloThreads”Modify the previous example code to print out
• Appropriate “Hello Thread” message
• Unique thread number • Use for-loop variable of pthread_create() loop
Example output:
Hello from Thread #0
Hello from Thread #1
Hello from Thread #2
Hello from Thread #3
17
What’s wrong?
What is printed for myNum?
void *threadFunc(void *pArg) { int* p = (int*)pArg; int myNum = *p; printf( "Thread number %d\n", myNum);}. . .// from main():for (int i = 0; i < numThreads; i++) { pthread_create(&tid[i], NULL, threadFunc, &i);}
18
Hello Threads Timeline
Time main Thread 0 Thread 1
T0 i = 0 --- ----
T1 create(&i) --- ---
T2 i++ (i == 1) launch ---
T3 create(&i) p = pArg ---
T4 i++ (i == 2) myNum = *p
myNum = 2
launch
T5 wait print(2) p = pArg
T6 wait exit myNum = *p
myNum = 2
19
Race Conditions
Concurrent access of same variable by multiple threads
•Read/Write conflict
•Write/Write conflict
Most common error in concurrent programs
May not be apparent at all times
20
How to Avoid Data Races
Scope variables to be local to threads
• Variables declared within threaded functions
• Allocate on thread’s stack
• TLS (Thread Local Storage)
Control shared access with critical regions
•Mutual exclusion and synchronization
• Lock, semaphore, condition variable, critical section, mutex…
21
Solution – “Local” Storage
void *threadFunc(void *pArg) { int myNum = *((int*)pArg); printf( "Thread number %d\n", myNum);}. . .
// from main():for (int i = 0; i < numThreads; i++) { tNum[i] = i; pthread_create(&tid[i], NULL, threadFunc, &tNum[i]);}
22
Pthreads* Mutex
Simple, flexible, and efficient
Enables correct programming structures for avoiding race conditions
New types
• pthread_mutex_t• the mutex variable
• pthread_mutexattr_t• mutex attributes
Before use, mutex must be initialized
Mutex can only be “held” by one thread at a time.
23
pthread_mutex_init
pthread_mutex_t *mutex
• mutex to be initialized
const pthread_mutexattr_t *attr
• attributes to be given to mutex
•
• Error Codes :• ENOMEM - insufficient memory for mutex• EAGAIN - insufficient resources (other than memory)• EPERM - no privilege to perform operation
int pthread_mutex_init( mutex, attr );
24
Alternate Initialization
Can also use the static initializer
PTHREAD_MUTEX_INITIALIZER
•Uses default attributes
Programmer must always pay attention to mutex scope
•Must be visible to threads
pthread_mutex_t mtx1 = PTHREAD_MUTEX_INITIALIZER;
25
pthread_mutex_lock
pthread_mutex_t *mutex
mutex to attempt to lock
int pthread_mutex_lock( mutex );
26
pthread_mutex_lock Explained
Attempts to lock mutex
• If mutex is locked by another thread, calling thread is blocked
Mutex is held by calling thread until unlocked
•Mutex lock/unlock must be paired or deadlock occurs
Check error code : ● EINVAL - mutex is invalid● EDEADLK - calling thread already owns mutex
27
pthread_mutex_unlock
pthread_mutex_t *mutex• mutex to be unlocked
•
Error Code :• EINVAL - mutex is invalid• EPERM - calling thread does not own mutex
int pthread_mutex_unlock( mutex );
28
Example: Use of mutex#define NUMTHREADS 4pthread_mutex_t gMutex; // why does this have to be global?int g_sum = 0;
void *threadFunc(void *arg) { int mySum = bigComputation(); pthread_mutex_lock( &gMutex ); g_sum += mySum; // threads access one at a time pthread_mutex_unlock( &gMutex );}
main() { pthread_t hThread[NUMTHREADS];
pthread_mutex_init( &gMutex, NULL ); for (int i = 0; i < NUMTHREADS; i++) pthread_create(&hThread[i],NULL,threadFunc,NULL);
for (int i = 0; i < NUMTHREADS; i++) pthread_join(hThread[i]);}
29
4 . 0
2 . 0
1 . 00 . 0 X
Numerical Integration Example
∫ 4 . 0
( 1 + x 2 ) d x = π0
1
static long num_steps=100000; double step, pi;
void main(){ int i; double x;
step = 1.0/(double) num_steps; for (i=0; i< num_steps; i++){ x = (i+0.5)*step; pi += 4.0/(1.0 + x*x); } pi *= step; printf("Pi = %f\n",pi);}
30
Lab 2: Computing Pi
Parallelize the numerical integration code using Win32* Threads
How can the loop iterations be divided among the threads?
What variables can be local?
What variables need to be visible to all threads?
31
Semaphores
Synchronization object that keeps a count•Represent the number of available resources
• Formalized by Edsgar Dijkstra
Two operation on semaphores•Wait [P(s)]: Thread waits until s > 0, then s = s-1
• Post [V(s)]: s = s + 1
32
POSIX Semaphores
Not part of the POSIX Threads specification
•Defined in POSIX.1b
Check for system support of semaphores before use
• Is _POSIX_SEMAPORES defined in <unistd.h>?
• If available, threads within a process can use semaphores
•Use header file <semaphore.h>
New type
• sem_t• the semaphore variable
33
sem_init
sem_t *sem
• counting semaphore to be initialized
int pshared
• if non-zero, semaphore can be shared across processes
unsigned int value
• initial value of semaphore
int sem_init( sem, pshared, value );
34
sem_init Explained
Initializes the semaphore object
If pshared is zero, semaphore can only be used by threads within the calling process
• If non-zero, semaphore can be used between processes
The value parameter sets the initial semaphore count value
Check error code :● EINVAL – sem is not valid semaphore● EPERM – process lacks approriate privelege● ENOSYS – semaphores not supported
36
sem_wait Explained
If semaphore count is greater than zero
•Decrement count by one (1)
• Proceed with code following
Else, if semaphore count is zeroThread blocks until value is greater than zero
Check error code :● EINVAL – sem is not valid semaphore● EDEADLK – deadlock condition was detected● ENOSYS – semaphores not supported
38
sem_post Explained
Post a wakeup to semaphore
If one or more threads are waiting, release one
•Otherwise, increment semaphore count by one (1)
Check error code :● EINVAL – sem is not valid semaphore● ENOSYS – semaphores not supported
39
Semaphore Uses
Control access to limited size data structures
•Queues, stacks, deques
•Use count to enumerate available elements
Throttle number of active threads within a region
Binary semaphore [0,1] can act as mutex
40
Semaphore Cautions
No ownership of semaphore
Any thread can release a semaphore, not just the last thread to wait
•Use good programming practice to avoid
No concept of abandoned semaphore
• If thread terminates before post, semaphore increment may be lost
•Deadlock
41
Example: Semaphore as Mutex
Main thread opens input file, waits for thread termination
Threads will
•Read line from input file
•Count all five letter words in line
42
Example: Mainsem_t hSem1, hSem2;FILE *fd; int fiveLetterCount = 0;
main(){ pthread_t hThread[NUMTHREADS];
sem_init (*hSem1, 0, 1); // Binary semaphore sem_init (*hSem2, 0, 1); // Binary semaphore
fd = fopen("InFile", "r"); // Open file for read
for (int i = 0; i < NUMTHREADS; i++) pthread_create (&hThread[i], NULL, CountFives, NULL); for (int i = 0; i < NUMTHREADS; i++) pthread_join (hThread[i], NULL);
fclose(fd);
printf("Number of five letter words is %d\n", fiveLetterCount);}
43
Example: Semaphores
void * CountFives(void *arg) { int bDone = 0 ; char inLine[132]; int lCount = 0;
while (!bDone) { sem_wait(*hSem1); // access to input bDone = (GetNextLine(fd, inLine) == EOF); sem_post(*hSem1); if (!bDone) if (lCount = GetFiveLetterWordCount(inLine)) { sem_wait(*hSem2); // update global fiveLetterCount += lCount; sem_post(*hsem2); } }}
45
Condition Variables
Semaphores are conditional on the semaphore count
Condition variable is associated with an arbitrary conditional • Same operations: wait and signal
Provides mutual exclusion• by having threads wait on the condition variable until
signaled.
46
Condition Variable and Mutex
Condition variables are always paired with a Mutex
• Protects evaluation of the conditional expression
• Prevents race condition between signaling thread and threads waiting on condition variable
47
Lost and Spurious Signals
Signal to condition variable is not saved
• If no thread waiting, signal is “lost”
• Thread can be deadlocked waiting for signal that will not be sent
Condition variable can (rarely) receive spurious signals
• Slowed execution from predictable signals
•Need to retest conditional expression
48
Condition Variable Algorithm
Avoids problems with lost and spurious signals
acquire mutex;while (conditional is true) wait on condition variable;perform critical region computation;update conditional;signal sleeping thread(s);release mutex;
Negation of condition needed to proceed
Mutex is automatically released when thread waits
May be optional
49
Condition Variablespthread_cond_init, pthread_cond_destroy
• initialize/destroy condition variable
pthread_cond_wait
• attempt to hold condition variable
pthread_cond_signal
• signal release of condition variable
pthread_cond_broadcast
• broadcast release of condition variable
50
Condition Variable Types
Data types used
• pthread_cond_t• the condition variable
• pthread_condattr_t• condition variable attributes
Before use, condition variable (and mutex) must be initialized
51
pthread_cond_init
pthread_cond_t *cond
• condition variable to be initialized
const pthread_condattr_t *attr
• attributes to be given to condition variable
•
•Return codes :• ENOMEM - insufficient memory for mutex
• EAGAIN - insufficient resources (other than memory)
• EBUSY - condition variable already intialized
• EINVAL - attr is invalid
•
•
int pthread_cond_init( cond, attr );
52
Alternate Initialization
Can also use the static initializer
PTHREAD_COND_INITIALIZER
•Uses default attributes
Programmer must always pay attention to condition (and mutex) scope
•Must be visible to threads
pthread_cond_t cond1 = PTHREAD_COND_INITIALIZER;
53
pthread_cond_wait
pthread_cond_t *cond• condition variable to wait on
pthread_mutex_t *mutex• mutex to be unlocked
int pthread_cond_wait( cond, mutex );
54
pthread_cond_wait Explained
Thread put to “sleep” waiting for signal on cond
Mutex is unlocked
• Allows other threads to acquire lock
•When signal arrives, mutex will be reacquired before pthread_cond_wait returns
Check error codes :• EINVAL - cond or mutex is invalid• EINVAL - different mutex for concurrent waits• EINVAL - calling thread does not own mutex
55
pthread_cond_signal
pthread_cond_t *cond• condition variable to be signaled
int pthread_cond_signal( cond );
56
pthread_cond_signal Explained
Signal condition variable, wake one waiting thread
If no threads waiting, no action taken
• Signal is not saved for future threads
Signaling thread need not have mutex
•May be more efficient
• Problem may occur if thread priorities used
Check error code :● EINVAL - cond is invalid
57
Example: Denominator
Two threads oversee a global variable
• Thread 1 calculates the value
• Thread 2 needs a non-zero value
A mutex controls access to the variable
Thread 1 signals thread 2 (waiting)
58
Example: Denominator
#include <pthread.h>
pthread_mutex_t denom_mtx = PTHREAD_MUTEX_INITIALIZER;pthread_cond_t denom_cond = PTHREAD_COND_INITIALIZER;float denominator = 0.0; /* global */
void *thread1() { pthread_mutex_lock( &denom_mtx );
denominator = f(); /* calculate denominator */ pthread_signal( &denom_cond ); /* signal waiting thread */
pthread_mutex_unlock( &denom_mtx );}
Thread 1 :
59
Example: Denominator
void *thread2() { float local_denom;
pthread_mutex_lock( &denom_mtx );
/* wait for non-zero denominator */ while( denominator == 0.0 ) pthread_cond_wait( &denom_cond, &denom_mtx ); local_denom = denominator;
pthread_mutex_unlock( &denom_mtx );
/* Use local copy of denominator for division */}
Thread 2 :
60
pthread_cond_broadcast
pthread_cond_t *cond• condition variable to signal
int pthread_cond_broadcast( cond );
61
pthread_cond_broadcast Explained
Wake all threads waiting on condition variable
If no threads waiting, no action taken
• Broadcast is not saved for future threads
Signaling thread need not have mutex
Check return code :● EINVAL - cond is invalid
62
Denominator: Many threadsAssume multiple threads created on Thread2()
What happens if…• all threads are waiting?• no threads are waiting?• only some threads are waiting?
• -> Everything works out just fine
void *thread1() { pthread_mutex_lock( &denom_mtx ); denominator = f(); pthread_cond_broadcast( &denom_cond ); /* wake all */ pthread_mutex_unlock( &denom_mtx );}
63
Activity 4 – Condition variables
Replace spin-wait and thread counting variable with condition variables to signal thread completion of computational piece
64
Some Advanced Functions
Thread-specific Data
Thread Attributes
Mutex Attributes
Condition Variable Attributes
Just a quick overview of functions
•Not too many details
65
Thread-specific Data
Another means for “local” storage
pthread_key_create
• Create thread-specific key for all threads
pthread_setspecific
• Associate thread-specific value with given key
pthread_getspecific
• Return current data value associated with key
pthread_key_delete
• Delete a data key
66
Thread Attribute Functionspthread_attr_init
• Initialize attribute object to default settings
pthread_attr_destroy
• Delete attribute object
pthread_{get|set}detachstate
• Return or set thread’s detach state
pthread_{get|set}stackaddr
• Return or set the stack address of thread
pthread_{get|set}stacksize
• Return or set the stack size of thread
67
Mutex Attribute Functionspthread_mutexattr_init
• Intialize mutex attribute object to defaults
pthread_mutexattr_destroy
• Delete mutex attribute object
pthread_mutexattr_{get|set}pshared
• Return or set whether mutex is shared between processes
68
Condition Attribute Functionspthread_condattr_init
• Initialize condition variable attribute object
pthread_condattr_destroy
• Destroy condition variable attribute object
pthread_condattr_{get|set}pshared
• Return or set whether condition variable is shared between processes
69
Summary
Create threads to execute work encapsulated within functions
Coordinate shared access between threads to avoid race conditions
• Local storage to avoid conflicts
• Synchronization objects to organize use
License Creative Commons - By 2.5
You are free:to Share — to copy, distribute and transmit the workto Remix — to adapt the workto make commercial use of the work
Under the following conditions:Attribution — You must attribute the work in the manner
specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
With the understanding that:Waiver — Any of the above conditions can be waived if you
get permission from the copyright holder.Public Domain — Where the work or any of its elements is in
the public domain under applicable law, that status is in no way affected by the license.
Other Rights — In no way are any of the following rights affected by the license:
o Your fair dealing or fair use rights, or other applicable copyright exceptions and limitations;
o The author's moral rights;o Rights other persons may have either in the work itself or in
how the work is used, such as publicity or privacy rights.Notice — For any reuse or distribution, you must make clear
to others the license terms of this work. The best way to do this is with a link to the web page http://creativecommons.org/licenses/by/2.5/