cs11 c++ dgccourses.cms.caltech.edu/cs11/material/dgc/lectures/... · uses of multithreaded...

CS11 C++ DGC Spring 2006-2007 Lecture 1

CS11 C++ DGC

Spring 2006-2007, Lecture 1

Welcome

7-8 lectures, equivalent number of labs
Required: general familiarity with C++
    Creating classes and class hierarchies
    Dynamic memory management
    Using C++ exceptions
    Using templates
CS cluster account
Programming environment
Electronic “submission”
    ~/cs11/cppdgc/lab1/ etc.

Overview

Focus on libraries and tools used in Caltech’s DARPA Grand Challenge project
Libraries:
    POSIX threads
    The Spread Toolkit (http://www.spread.org)
Tools:
    make – build management
    doxygen – automated documentation generator
    Subversion – version control system

Overview (2)

Labs aren’t DGC-specific
    Some general techniques for using these C-based libraries from C++
    The libraries and tools are used in many software projects
Knowledge and techniques will (hopefully) be very applicable to DGC efforts
    …and many other software projects!

Multithreaded Programming

“Thread” = a sequential thread of execution
    A single, sequential flow of execution
    A thread does only one thing at a time
    Every program has at least one thread
Multithreaded programming
    Using more than one thread to do multiple things “at the same time”
    Some tasks can proceed concurrently
    Perform independent tasks in separate threads to improve performance

Multithreading Terminology

Blocking operation
    Causes a thread to stop executing until that operation is completed
    Also called a synchronous operation
Non-blocking operation
    A thread can try to perform an operation
    Operation may fail and need to be retried later
        Thread can go do other things before trying again
    Thread’s execution is not impeded
    Also called an asynchronous operation

Uses of Multithreaded Programming

Leverage multiple processors
    Programs don’t “automatically” use multiple processors
    Must write programs to use multiple processors
Perform slow operations on a separate thread
    e.g. download a file from the Internet
    Allows user interaction with your program to be fast
Sometimes, threads provide a cleaner abstraction
    Some tasks are easier to reason about using a thread-based approach
    Example: using blocking IO operations in multiple threads, vs. using non-blocking IO operations in a single thread

Threads and Resources

Threads can have their own resources
    Data structures, open files, etc.
    No special requirements for interactions
Threads can also share resources
    Multiple threads accessing the same data, etc.
    Access to shared resources must be properly synchronized
Most common approach: exclusive access to shared resources
    Only one thread manipulates a shared resource at a time
    Called an atomic operation: indivisible from a threading perspective

Costs of Multithreaded Programming

Each thread incurs some overhead
    Per-thread space overhead: each thread’s execution details need to be stored somewhere
    Context-switching overhead: switching between threads takes time!
    Thread contention: when multiple threads spend too much time contending for the same shared resources
Goal:
    Use as many threads as necessary, and no more
    Minimize thread-synchronization time
    Sometimes this requires some tuning ☺

Potential Pitfalls

Lots of new bugs in multithreaded programming!
Spurious or incorrect values
    Changes to shared data that are not properly guarded
Deadlock: multiple threads waiting for each other
    Each one is blocked on another thread
    The whole system fails to progress
Livelock: similar to deadlock
    Threads are able to continue executing, but they impede each other’s progress
    Again, the whole system fails to progress
Some bugs only happen on multiprocessor systems!
    Rigorous testing on different platforms is the best approach

POSIX Threads

Standard UNIX API for multithreaded programming
    Threads and synchronization primitives require support from the operating system
    Abbreviated “pthreads”
POSIX
    Portable Operating System Interface for uniX
    Specifications for different APIs and their behavior
    Facilitates source-portability across multiple UNIX implementations
        Linux, Solaris, Mac OS X, BSD, etc.
        Windows too (through third-party pthreads implementations)

Using POSIX Threads

Must include POSIX thread header
    #include <pthread.h>
    Defines all the data-types and functions
Must link against pthread library
    Add -lpthread to compiler arguments
    Library is usually in a standard location
POSIX thread library is a C library
    Can use with C++
    Functions indicate errors with an int return value

Creating a Thread

Every thread has its own identifier
    Type of thread identifier is pthread_t
    Can create variables of type pthread_t to hold a thread’s ID
    Copying the identifier doesn’t make new threads
Threads are created using the pthread_create function
    int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                       void * (*start_routine)(void *), void *arg)
    Pass in a pointer to a function to run: start_routine
    Pass in an argument: arg
    If thread is created successfully, new ID stored in thread

Output Arguments

POSIX APIs use arguments to return values
    int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                       void * (*start_routine)(void *), void *arg)
    thread is an output argument
Procedure:
    Create a variable to receive the value
    Pass in a pointer to that variable
    Function uses the pointer to set variable’s value
Example:
    pthread_t tid;
    pthread_create(&tid, ...); // Sets value of thread ID
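
A minimal sketch of the output-argument procedure above, assuming default thread attributes. The names SayHello and RunOneThread are made up for illustration and are not part of the pthread API. Calling RunOneThread() from main() should print one greeting and return 0.

```cpp
#include <pthread.h>
#include <cstdio>

// Hypothetical thread function: prints a greeting and returns no result.
void *SayHello(void *arg) {
    const char *name = static_cast<const char *>(arg);
    std::printf("Hello from %s\n", name);
    return 0;
}

// Starts one thread and waits for it to finish.
// Returns 0 on success, or the first nonzero pthread return code.
int RunOneThread() {
    pthread_t tid;              // variable that receives the new thread's ID
    char name[] = "worker";     // stays alive until the join below completes

    // &tid is the output argument: pthread_create() writes the ID through it.
    int rc = pthread_create(&tid, 0, SayHello, name);
    if (rc != 0)
        return rc;

    return pthread_join(tid, 0);
}
```

The argument passed to the thread must outlive the thread's use of it, which is why the sketch joins before name goes out of scope.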

Return Codes

POSIX thread API uses return codes to indicate status
    Some are standard UNIX error codes
        #include <cerrno> // C++ wrapper for the standard errno.h header
    Some are specific to POSIX thread API
Should get into habit of checking error codes
    …at least for the major allocation operations!
Use a common variable name for this
    int rc; // return-code
    rc = pthread_create(...);
    if (rc != 0)
        throw FatalError("couldn't create thread");

Return Codes (2)

Return-value of 0 means “success”
Nonzero values indicate errors
    API documentation specifies what errors are returned, and when

Good opportunity to use C++ exception handling!

Use return codes for now…
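
One possible way to combine return codes with C++ exceptions is sketched here. CheckRc, DoNothing and StartAndJoin are hypothetical names, and std::runtime_error stands in for whatever exception class a project actually uses. This is an illustration, not the course's required approach.

```cpp
#include <pthread.h>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts a nonzero pthread return code into an
// exception. pthread functions return the error number directly, so it
// can be passed straight to strerror().
void CheckRc(int rc, const char *what) {
    if (rc != 0)
        throw std::runtime_error(std::string(what) + ": " + std::strerror(rc));
}

// Trivial thread function used only to have something to start.
void *DoNothing(void *) {
    return 0;
}

// Creates and joins one thread, throwing if either call fails.
void StartAndJoin() {
    pthread_t tid;
    CheckRc(pthread_create(&tid, 0, DoNothing, 0), "pthread_create");
    CheckRc(pthread_join(tid, 0), "pthread_join");
}
```

Keeping the check in one helper means every call site stays a single line, while the error message still names the call that failed.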

Function Pointers

C/C++ functions can be referred to by name
    sin(x), cos(x), sqrt(x), etc.
Can also refer to functions via function pointers
    Like a normal pointer, but function can be called through it
    Function’s signature is part of the pointer’s type
        Number and types of arguments, return type
    Above funcs take a double and return a double
A function pointer for them could be like this:
    double (*fp)(double);
    Variable name is fp
    Points to a function that takes a double and returns a double

Using Function Pointers

Normally refer to functions to invoke them
    double rot = length * sin(angle);
    Invokes sin, using angle as argument
Can also get a function’s address via its name
    double (*fp)(double);
    ...
    fp = sin; // No arguments to sin here!
    ...
    double res = fp(input);
    Use fp like a normal function
Can set fp to any function with the same signature
    sin, cos, tan, sqrt, log, exp, your own functions, etc.
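
A small sketch of passing different functions through the same pointer type. Apply, Square and SqrtViaPointer are hypothetical names chosen for this illustration.

```cpp
#include <cmath>

// Calls whichever function fp points to, passing x as the argument.
double Apply(double (*fp)(double), double x) {
    return fp(x);
}

// A user-written function with the same signature as sin, cos, sqrt.
double Square(double x) {
    return x * x;
}

// Points fp at the library's sqrt, then calls through the pointer.
double SqrtViaPointer(double x) {
    double (*fp)(double) = std::sqrt;   // picks the double overload
    return fp(x);
}
```

With these definitions, Apply(Square, 3.0) yields 9.0 and SqrtViaPointer(16.0) yields 4.0. The same mechanism lets pthread_create() call a thread function that the caller supplies.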

Thread Function

pthread_create() signature specified:
    void * (*start_routine)(void *)
start_routine is any function that:
    takes a void * argument
    returns a void * value
        NULL (or 0 in C++) is perfectly acceptable
Need to declare a function to pass to the thread
    void * MyThreadFunc(void *arg);
Then, can pass this to pthread_create()
    int rc = pthread_create(..., MyThreadFunc, ...);

Thread Arguments

Thread function takes a void * argument
    Can pass in anything – an object, a struct, etc.
    No type information; not very nice
    Cast void-pointer to appropriate type, then use it:
        void * MyThreadFunc(void *arg) {
            MySharedState *pShared = (MySharedState *) arg;
            ... // Do stuff.
        }
Can start multiple threads that share state
    Pass same object to all of them
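
The sketch below passes one shared object to several threads. The fields of MySharedState and the helper CountVisits are assumptions made for this example; the slides do not define them. It guards the shared counter with a mutex, which this lecture introduces under "Synchronized Access".

```cpp
#include <pthread.h>

// Hypothetical shared state: every thread receives a pointer to the
// same object.
struct MySharedState {
    pthread_mutex_t mutex;
    int visits;
};

void *MyThreadFunc(void *arg) {
    // Recover the real type from the untyped argument.
    MySharedState *pShared = static_cast<MySharedState *>(arg);

    pthread_mutex_lock(&pShared->mutex);
    ++pShared->visits;
    pthread_mutex_unlock(&pShared->mutex);
    return 0;
}

// Starts up to 8 threads sharing one state object and returns the number
// of visits recorded, or -1 if the mutex could not be created.
int CountVisits(int count) {
    const int kMax = 8;
    if (count > kMax)
        count = kMax;

    MySharedState shared;
    shared.visits = 0;
    if (pthread_mutex_init(&shared.mutex, 0) != 0)
        return -1;

    pthread_t tids[kMax];
    int started = 0;
    for (; started < count; ++started) {
        if (pthread_create(&tids[started], 0, MyThreadFunc, &shared) != 0)
            break;
    }
    // Join only the threads that were actually started.
    for (int i = 0; i < started; ++i)
        pthread_join(tids[i], 0);

    pthread_mutex_destroy(&shared.mutex);
    return shared.visits;
}
```

Because shared lives on the caller's stack, the joins must complete before CountVisits() returns.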

Thread Attributes

Can specify new thread’s attributes at creation time
    int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                       void * (*start_routine)(void *), void *arg)
    For specifying scheduling behavior, etc.
Passing NULL (or 0) for thread’s attributes means “use the defaults”
Can use pthread_attr_init(), … functions to set up different thread attributes
Not all options are supported on every platform…
    Read each platform’s pthread documentation
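
As an illustration of the attribute functions, this sketch sets one attribute explicitly before creating a thread. The choice of the detach-state attribute is an assumption made for the example, and Idle and RunWithAttributes are hypothetical names.

```cpp
#include <pthread.h>

// Trivial thread function used only to have something to start.
void *Idle(void *) {
    return 0;
}

// Creates a thread from an explicit attribute object.
// Returns 0 on success, or the first nonzero pthread return code.
int RunWithAttributes() {
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);     // fill in the default values
    if (rc != 0)
        return rc;

    // Ask for a joinable thread. This is normally the default, but
    // stating it makes the intent explicit.
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

    pthread_t tid;
    if (rc == 0)
        rc = pthread_create(&tid, &attr, Idle, 0);

    // The attribute object is no longer needed after creation.
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    return pthread_join(tid, 0);
}
```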

Synchronized Access

Threads can synchronize access to shared resources using a mutex

    Mutex = mutual exclusion lock
    At most one thread can hold a particular mutex at a time
Mutex data-type is pthread_mutex_t
    pthread_mutex_t mut;
To create a mutex:
    int pthread_mutex_init(pthread_mutex_t *mutex,
                           const pthread_mutexattr_t *mutexattr)
    Can specify NULL (or 0) for mutex attributes to get defaults
    Example:
        int rc = pthread_mutex_init(&mut, 0);

Locking and Unlocking a Mutex

To lock a mutex:
    int pthread_mutex_lock(pthread_mutex_t *mutex)
    int pthread_mutex_trylock(pthread_mutex_t *mutex)
    First version will block until lock is acquired
    Second version returns immediately with an error code, if mutex is not available
To unlock a mutex:
    int pthread_mutex_unlock(pthread_mutex_t *mutex)
    Thread must already have a lock on the mutex (obvious)
By default, POSIX mutexes are not recursive!
    A thread cannot lock the same mutex multiple times
    If a thread tries this, it will typically deadlock on itself
    Can configure POSIX mutexes to be recursive
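
Two short sketches of the behavior described above. The first shows pthread_mutex_trylock() returning EBUSY instead of blocking. The second configures a recursive mutex through a mutex attribute object. TryLockWhileHeld and LockRecursively are hypothetical names, and the code assumes a platform that supports PTHREAD_MUTEX_RECURSIVE.

```cpp
#include <pthread.h>
#include <cerrno>

// Locks a default mutex, then tries to lock it again without blocking.
// Returns the code from the second attempt, or -1 if setup failed.
int TryLockWhileHeld() {
    pthread_mutex_t mut;
    if (pthread_mutex_init(&mut, 0) != 0)
        return -1;

    pthread_mutex_lock(&mut);
    int rc = pthread_mutex_trylock(&mut);   // returns at once; no deadlock
    pthread_mutex_unlock(&mut);

    pthread_mutex_destroy(&mut);
    return rc;
}

// Locks a recursive mutex twice from the same thread.
// Returns 0 if both lock calls succeed.
int LockRecursively() {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);

    pthread_mutex_t mut;
    if (rc == 0)
        rc = pthread_mutex_init(&mut, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_mutex_lock(&mut);
    if (rc == 0) {
        int rc2 = pthread_mutex_lock(&mut);   // allowed on a recursive mutex
        if (rc2 == 0)
            pthread_mutex_unlock(&mut);       // one unlock per successful lock
        pthread_mutex_unlock(&mut);
        rc = rc2;
    }
    pthread_mutex_destroy(&mut);
    return rc;
}
```

A recursive mutex must be unlocked as many times as it was locked before another thread can acquire it.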

Destroying a Mutex

When done, must release any resources the mutex holds

int pthread_mutex_destroy(pthread_mutex_t *mutex)

Mutexes don’t use resources on every OS, but do it just in case!

If it’s reasonably simple, always write your code to be as portable as possible

On some platforms the function also reports an error (EBUSY) if the mutex is still locked
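
A sketch of the full mutex life cycle: initialize, use from two threads, then destroy once no thread can touch it any more. Counter, AddThousand and CountWithTwoThreads are hypothetical names used only for illustration.

```cpp
#include <pthread.h>

// Hypothetical shared counter guarded by its own mutex.
struct Counter {
    pthread_mutex_t mutex;
    long value;
};

// Adds 1 to the counter a thousand times, locking around each update.
void *AddThousand(void *arg) {
    Counter *c = static_cast<Counter *>(arg);
    for (int i = 0; i < 1000; ++i) {
        pthread_mutex_lock(&c->mutex);
        ++c->value;
        pthread_mutex_unlock(&c->mutex);
    }
    return 0;
}

// Returns the final count, or -1 if any setup or cleanup step failed.
long CountWithTwoThreads() {
    Counter c;
    c.value = 0;
    if (pthread_mutex_init(&c.mutex, 0) != 0)
        return -1;

    pthread_t a, b;
    bool ok = false;
    if (pthread_create(&a, 0, AddThousand, &c) == 0) {
        if (pthread_create(&b, 0, AddThousand, &c) == 0) {
            pthread_join(b, 0);
            ok = true;
        }
        pthread_join(a, 0);
    }

    // Destroy only after every thread that uses the mutex has finished.
    int rc = pthread_mutex_destroy(&c.mutex);
    return (ok && rc == 0) ? c.value : -1;
}
```

With the lock in place, both threads' updates are counted, so a successful run returns 2000.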

Barriers

Another synchronization primitive in POSIX threads
    Used to make sure that all threads have reached a certain point, before any of them continue on
    Specify number of threads the barrier is for, at construction time
    When threads reach the barrier, they are blocked until all threads have reached it
POSIX data type: pthread_barrier_t
    int pthread_barrier_init(pthread_barrier_t *barrier,
                             const pthread_barrierattr_t *attr, unsigned int count)
    int pthread_barrier_destroy(pthread_barrier_t *barrier)

Using a Barrier

Create a barrier for 3 threads:
    pthread_barrier_t barrier;
    int rc = pthread_barrier_init(&barrier, 0, 3);
Threads call pthread_barrier_wait()
    Inside each thread:
        pthread_barrier_wait(&barrier);
    Threads block on this call until all three have reached the barrier
Can use a barrier multiple times
Don’t forget to clean them up, too
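
The three-thread barrier can be sketched as follows, assuming a platform that provides pthread_barrier_t (Linux does; some systems do not). BarrierState, Worker and RunBarrierDemo are hypothetical names. Each thread records its arrival, waits at the barrier, and then checks that all three arrivals are visible.

```cpp
#include <pthread.h>
#include <cstdlib>

const int kThreads = 3;

// Hypothetical shared state for this sketch.
struct BarrierState {
    pthread_barrier_t barrier;
    pthread_mutex_t mutex;
    int arrived;   // threads that have reached the barrier
    int sawAll;    // threads that saw arrived == kThreads after the barrier
};

void *Worker(void *arg) {
    BarrierState *s = static_cast<BarrierState *>(arg);

    pthread_mutex_lock(&s->mutex);
    ++s->arrived;
    pthread_mutex_unlock(&s->mutex);

    // Blocks until kThreads threads have called pthread_barrier_wait().
    pthread_barrier_wait(&s->barrier);

    pthread_mutex_lock(&s->mutex);
    if (s->arrived == kThreads)
        ++s->sawAll;
    pthread_mutex_unlock(&s->mutex);
    return 0;
}

// Returns how many threads observed that everyone had arrived.
int RunBarrierDemo() {
    BarrierState s;
    s.arrived = 0;
    s.sawAll = 0;
    if (pthread_mutex_init(&s.mutex, 0) != 0)
        std::abort();
    if (pthread_barrier_init(&s.barrier, 0, kThreads) != 0)
        std::abort();

    pthread_t tids[kThreads];
    for (int i = 0; i < kThreads; ++i) {
        // A partly started group would wait at the barrier forever,
        // so this sketch simply gives up if creation fails.
        if (pthread_create(&tids[i], 0, Worker, &s) != 0)
            std::abort();
    }
    for (int i = 0; i < kThreads; ++i)
        pthread_join(tids[i], 0);

    pthread_barrier_destroy(&s.barrier);
    pthread_mutex_destroy(&s.mutex);
    return s.sawAll;
}
```

Since no thread passes the barrier until all three have incremented arrived, every thread sees the full count and the function returns 3.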

Ending A Thread’s Execution

A thread stops running when its thread function returns
    Can also call pthread_exit(void *) if necessary
Thread function returns a void *
    Void-pointer to thread’s results
    Again, no type information…
        void * MyThreadFunc(void *arg) {
            MySharedState *pShared = (MySharedState *) arg;
            ...
            MyResult *pResult = new MyResult(...);
            return pResult;
        }
    Return NULL or 0 if the thread function has no data to return.
    Don’t return a pointer to a local variable! ☺

Waiting for Threads To Finish

A thread can wait for another thread to finish
    int pthread_join(pthread_t th, void **thread_return)
    Only one thread can wait for (join) a given thread!
    Must specify ID of thread to wait for
    Can optionally get thread’s return-value too
        pthread_t thread;
        MyResult *result;
        ...
        rc = pthread_join(thread, (void **) &result);
        if (rc != 0)
            throw SomeError("bad day to be you...");
        cout << result; // prints the pointer value; use *result for the object
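
A compact sketch of returning a heap-allocated result from a thread and collecting it with pthread_join(). The contents of MyResult and the names SumToTen and JoinAndGetSum are assumptions for illustration. The joiner receives the value into a plain void * and casts afterwards, which avoids the (void **) cast.

```cpp
#include <pthread.h>

// Hypothetical result type: allocated by the thread, freed by the joiner.
struct MyResult {
    int sum;
};

// Adds the integers 1 through 10 and returns the total on the heap.
void *SumToTen(void *) {
    MyResult *pResult = new MyResult;
    pResult->sum = 0;
    for (int i = 1; i <= 10; ++i)
        pResult->sum += i;
    return pResult;   // a heap object outlives the thread function
}

// Returns the thread's sum, or -1 if creating or joining the thread fails.
int JoinAndGetSum() {
    pthread_t thread;
    if (pthread_create(&thread, 0, SumToTen, 0) != 0)
        return -1;

    void *raw = 0;
    if (pthread_join(thread, &raw) != 0)
        return -1;

    MyResult *result = static_cast<MyResult *>(raw);
    int sum = result->sum;
    delete result;    // the joiner cleans up the thread's return value
    return sum;
}
```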

Cleaning Up After A Thread

What if a thread terminates before you call pthread_join() on it?

Thread’s state, return-value don’t get cleaned up until someone calls pthread_join() on it

Always call pthread_join() on threads to clean up after them!

Clean up any dynamic resources in return-value as well…

Probably easiest to do this from main thread

Static Initializers

POSIX API provides static initializers for some objects

    pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
    These are macros for initializing static variables
    Not intended for local variables or dynamically allocated objects!
Should only use these in global variables
    Want to avoid global variables, in general…
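
For comparison with pthread_mutex_init(), here is a sketch of a file-scope mutex set up with the static initializer. The names g_mut, g_hits, Hit and HitFromTwoThreads are hypothetical. No init call is needed, and a statically initialized mutex is commonly left undestroyed for the life of the program.

```cpp
#include <pthread.h>

// Statically allocated mutex: initialized by the macro, no init call.
static pthread_mutex_t g_mut = PTHREAD_MUTEX_INITIALIZER;
static int g_hits = 0;

// Adds one to the global count under the mutex.
void *Hit(void *) {
    pthread_mutex_lock(&g_mut);
    ++g_hits;
    pthread_mutex_unlock(&g_mut);
    return 0;
}

// Runs Hit() in two threads and returns the total number of hits so far,
// or -1 if a thread could not be created.
int HitFromTwoThreads() {
    pthread_t a, b;
    if (pthread_create(&a, 0, Hit, 0) != 0)
        return -1;
    if (pthread_create(&b, 0, Hit, 0) != 0) {
        pthread_join(a, 0);
        return -1;
    }
    pthread_join(a, 0);
    pthread_join(b, 0);

    pthread_mutex_lock(&g_mut);
    int hits = g_hits;
    pthread_mutex_unlock(&g_mut);
    return hits;
}
```

The convenience comes at the cost of global state, which is the trade-off the slide warns about.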

POSIX Thread Documentation

POSIX thread API docs available via man utility
At Linux command-prompt:
    man pthread.h
        Shows all constants, types, functions, etc.
    man pthread_mutex_init
        Shows details for functions related to mutexes
When man is running:
    Space moves forward one page
    Enter moves forward one line
    “b” moves back one page
    Arrow keys scroll up and down as well
    “q” exits from man

This Week’s Assignment

POSIX Threads version of “Hello World!”

Create a thread function
Pass it some shared state
    Includes a mutex and a barrier
Thread function prints some messages
    Use the shared state for synchronization
Compile and run your program

Next Week

Using C++ classes to make POSIX thread programming easier

    Simpler to use
    “Exceptionally safe!”