TRANSCRIPT
title: Title
Python Threads
Aahz
[email protected]
http://starship.python.net/crew/aahz/
Powered by PythonPoint
http://www.reportlab.com/
title: Meta Tutorial
Meta Tutorial
• I'm hearing-impaired
    Please write questions if at all possible
• Take notes - or not
• Pop Quiz
• Slideshow on web
title: Contents
Contents
• Goal: Use Threads!
• Thread Overview
• Python's Thread Library
• Two Applications
    Web Spider
    GUI Background Thread
title: Generic Threads
Generic Threads
• Similar to processes
• Shared memory
• Light-weight
• Difficult to set up
    Especially cross-platform
title: Why Use Threads?
Why Use Threads?
• Efficiency/speed
• Responsiveness
• Algorithmic simplicity
title: Python Threads
Python Threads
• Class-based
    Use threading, not thread
• Cross-platform
• Thread Library
title: 1.5.2 vs. 2.0
Python 1.5.2 vs. 2.0
• Compile --with-thread
    Except on MS Windows and some Linux distributions
• Multi-CPU bug
    Creating/destroying large numbers of threads
title: GIL
GIL
• Global Interpreter Lock (GIL)
• Full Documentation: http://www.python.org/doc/current/api/threads.html
• Only one Python thread can run
• GIL is your friend (really!)
title: GIL in action
GIL in Action
• Which is faster?

One Thread
    total = 1
    for i in range(10000):
        total += 1
    total = 1
    for i in range(10000):
        total += 1

Two Threads
    total = 1                    total = 1
    for i in range(10000):       for i in range(10000):
        total += 1                   total += 1
title: Dealing with GIL
Dealing with GIL
• sys.setcheckinterval()
    (default 10)
• C extensions can release GIL
• Blocking I/O releases GIL
    So does time.sleep() with a nonzero argument
• Multiple Processes
title: Performance Tip
Performance Tip
• python -O
    Also set PYTHONOPTIMIZE
    15% performance boost
    Removes bytecodes (SET_LINENO)
    Fewer context switches!
title: GIL and C Extensions
GIL and C Extensions
• Look for macros:
    Py_BEGIN_ALLOW_THREADS
    Py_END_ALLOW_THREADS
• Some common extensions:
    mxODBC - yes
    NumPy - no
title: Share External Objects 3
Share External Objects
• Files, GUI, DB connections
    Don't
• Partial exception: print
• Still need to share?
    Use worker thread
title: Create Python Threads
Create Python Threads
• Subclass threading.Thread
• Override __init__() and run()
• Do not override start()
• In __init__(), call Thread.__init__()
title: Using Python Threads
Using Python Threads
• Instantiate thread object
    t = MyThread()
• Start the thread
    t.start()
• Methods/attribs from outside thread
    t.put('foo')
    if t.done:
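A minimal sketch of the create/use pattern from the last two slides. It is written for Python 3 rather than the 1.5.2/2.0 releases the slides cover (so the queue module is spelled `queue`, not `Queue`). The names `MyThread`, `put()` and `done` come from the slide; what they do here (upper-casing strings handed in through a queue, with `None` as a stop marker) is invented for illustration.

```python
import queue  # spelled Queue in the Python versions the slides cover
import threading


class MyThread(threading.Thread):
    def __init__(self):
        # Call the base-class constructor before anything else.
        threading.Thread.__init__(self)
        self.inbox = queue.Queue()
        self.results = []
        self.done = False

    def put(self, item):
        # Called from outside the thread; the queue makes the hand-off safe.
        self.inbox.put(item)

    def run(self):
        # Executes in the new thread once start() has been called.
        while True:
            item = self.inbox.get()
            if item is None:          # hypothetical "no more work" marker
                break
            self.results.append(item.upper())
        self.done = True


t = MyThread()
t.start()
t.put('foo')
t.put(None)
t.join()
if t.done:
    print(t.results)
```

Note that run() is overridden but start() is left alone, as the slide advises.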
title: Non-threaded Example
Non-threaded Example
    class Retriever:
        def __init__(self, URL):
            self.URL = URL
            self.page = self.getPage()

    retriever = Retriever('http://www.foo.com/')
    URLs = retriever.getLinks()
title: Threaded Example
Threaded Example
    import time
    from threading import Thread

    class Retriever(Thread):
        def __init__(self, URL):
            Thread.__init__(self)
            self.URL = URL
        def run(self):
            self.page = self.getPage()

    retriever = Retriever('http://www.foo.com/')
    retriever.start()
    while retriever.isAlive():
        time.sleep(1)
    URLs = retriever.getLinks()
title: Multiple Threads
Multiple Threads
    seeds = ['http://www.foo.com/',
             'http://www.bar.com/',
             'http://www.baz.com/']
    threadList = []
    URLs = []

    for seed in seeds:
        retriever = Retriever(seed)
        retriever.start()
        threadList.append(retriever)

    for retriever in threadList:
        # join() is more efficient than sleep()
        retriever.join()
        URLs += retriever.getLinks()
title: import Editorial
import Editorial
• How to import
    from threading import Thread, Semaphore
    or
    import threading
• Don't use
    from threading import *
title: Thread Methods
Thread Methods
• Module functions:
    activeCount() (not useful)
    enumerate() (not useful)
• Thread object methods:
    start()
    join() (somewhat useful)
    isAlive() (not useful)
    isDaemon()
    setDaemon()
title: Brute Thread Spider
Brute Thread Spider
• BruteThreadSpider.py
• Few changes from SingleThreadSpider.py
• Spawn one thread per retrieval
• Inefficient polling in main loop
title: Thread Order
Thread Order
• Non-determinate
    Thread 1        Thread 2
    print "a,",     print "1,",
    print "b,",     print "2,",
    print "c,",     print "3,",
• Sample output
    1, a, b, 2, c, 3,
    a, b, c, 1, 2, 3,
    1, 2, 3, a, b, c,
    a, b, 1, 2, 3, c,
title: Thread Communication
Thread Communication
• Critical Section Lock
    Protects shared memory
    Only one thread accesses chunk of code
    aka "mutex", or "atomic operation"
• Wait/Notify
    Synchronizes actions between threads
    Threads wait for each other to finish a task
    More efficient than polling
title: Thread Library
Thread Library
• Lock()
• RLock()
• Semaphore()
• Condition()
• Event()
• Queue.Queue()
title: Critical Section Lock
Critical Section Lock
    Thread 1                    Thread 2
    mutex.acquire()             ...
    if myList:                  ...
        work = myList.pop()     ...
    mutex.release()             ...
    ...                         mutex.acquire()
    ...                         if len(myList)<10:
    ...                             myList.append(work)
    ...                         mutex.release()
title: GIL and Shared Vars
GIL and Shared Vars
• Safe: one bytecode
    Single operations against Python basic types (e.g. appending to a list)
• Unsafe
    Multiple operations against Python variables (e.g. checking the length of a list before appending) or any operation that involves a callback to a class (e.g. the __getattr__ hook)
title: GIL example
GIL example
• Mutex only one thread
    Thread 1                Thread 2
    myList.append(work)     mutex.acquire()
    ...                     if myList:
    ...                         work = myList.pop()
    ...                     mutex.release()
title: dis this
dis this
• disassemble source to byte codes
• Thread-unsafe statement
    If a single Python statement uses the same shared variable across multiple byte codes, or if there are multiple mutually-dependent shared variables, that statement is not thread-safe
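A short sketch of using the dis module to check a statement, assuming Python 3.4 or later (for `dis.get_instructions`). The function name `increment` and the global `total` are made up for the example, and the exact opcode names printed differ between Python versions. What matters is that the load and the store of `total` are separate byte codes, so a thread switch can fall between them.

```python
import dis

total = 0


def increment():
    # One statement in the source, several byte codes once compiled.
    global total
    total += 1


# Collect the opcode names that make up increment().
opnames = [ins.opname for ins in dis.get_instructions(increment)]
for name in opnames:
    print(name)
```

Calling `dis.dis(increment)` prints a fuller listing with arguments and offsets.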
title: Misusing Lock()
Misusing Lock()
• Lock() steps on itself
    mutex = Lock()
    mutex.acquire()
    ...
    mutex.acquire()  # OOPS!
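One way to see the problem without hanging the program is a non-blocking acquire. This is a sketch in Python 3, where acquire() returns True or False; the variable names `first` and `second` are only for illustration.

```python
from threading import Lock

mutex = Lock()
first = mutex.acquire()          # succeeds: the lock was free
# A second blocking acquire() here would wait forever, because the only
# thread that could release the lock is the one that is waiting.
second = mutex.acquire(False)    # non-blocking attempt: fails, lock is held
mutex.release()
print(first, second)
```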
title: Synching threads
Synch Two Threads
    class Synchronize:
        def __init__(self):
            self.lock = Lock()
        def wait(self):
            self.lock.acquire()
            self.lock.acquire()
            self.lock.release()
        def notify(self):
            self.lock.release()

    Thread 1               Thread 2
    self.synch.wait()      ...
    ...                    self.synch.notify()
    ...                    self.synch.wait()
    self.synch.notify()    ...
title: RLock()
RLock()
• Mutex only
    Other threads cannot release RLock()
• Recursive
• Methods
    acquire(blocking)
    release()
title: Using RLock()
Using RLock()
    mutex = RLock()
    mutex.acquire()
    ...
    mutex.acquire()  # Safe
    ...
    mutex.release()
    mutex.release()

    Thread 1            Thread 2
    mutex.acquire()     ...
    self.update()       ...
    mutex.release()     ...
    ...                 mutex.acquire()
    ...                 self.update()
    ...                 mutex.release()
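A sketch of where the recursive behaviour matters: a method that holds the lock and calls another method that takes the same lock. The `Counter` class and `update_twice()` are hypothetical; only `update()` and `mutex` echo names on the slide. With a plain Lock() the nested acquire would deadlock.

```python
from threading import RLock


class Counter:
    def __init__(self):
        self.mutex = RLock()
        self.value = 0

    def update(self):
        self.mutex.acquire()
        try:
            self.value += 1
        finally:
            self.mutex.release()

    def update_twice(self):
        # Holds the lock while calling update(), which acquires it again.
        self.mutex.acquire()
        try:
            self.update()
            self.update()
        finally:
            self.mutex.release()


counter = Counter()
counter.update_twice()
print(counter.value)
```

Each acquire() is paired with exactly one release(), as on the slide.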
title: Semaphore()
Semaphore()
• Restricts number of running threads
    In Python, primarily useful for simulations (but consider using microthreads)
• Methods
    Semaphore(value)
    acquire(blocking)
    release()
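A small sketch of a semaphore capping how many threads are inside a section at once. Everything here is illustrative: the limit of two, the six workers, the `running`/`peak` counters, and the short sleep standing in for work. It uses the `target=` shortcut rather than subclassing Thread, purely to keep the example short.

```python
import threading
import time

slots = threading.Semaphore(2)   # at most two threads in the guarded section
tally = threading.Lock()         # protects the two counters below
running = 0
peak = 0


def worker():
    global running, peak
    slots.acquire()
    try:
        tally.acquire()
        running += 1
        peak = max(peak, running)
        tally.release()
        time.sleep(0.01)         # stand-in for real work
        tally.acquire()
        running -= 1
        tally.release()
    finally:
        slots.release()


threads = [threading.Thread(target=worker) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(peak)
```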
title: Condition()
Condition()
• Methods
    Condition(lock)
    acquire(blocking)
    release()
    wait(timeout)
    notify()
    notifyAll()
title: Using Condition()
Using Condition()
• Must use lock
    cond = Condition()
    cond.acquire()
    cond.wait()  # or notify()/notifyAll()
    cond.release()
• Avoid timeout
    Creates polling loop, so inefficient
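A sketch of the acquire/wait/notify sequence with one waiting thread and one notifying thread. The `items` and `received` lists and the function names are hypothetical. The waiter re-checks its condition in a loop, which also covers the case where the producer runs first.

```python
import threading

cond = threading.Condition()
items = []
received = []


def consumer():
    cond.acquire()
    while not items:          # re-check: the data may or may not be there yet
        cond.wait()           # releases the lock while waiting
    received.append(items.pop())
    cond.release()


def producer():
    cond.acquire()
    items.append('work')
    cond.notify()             # wake one waiting thread
    cond.release()


c = threading.Thread(target=consumer)
p = threading.Thread(target=producer)
c.start()
p.start()
c.join()
p.join()
print(received)
```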
title: Event()
Event()
• Thin wrapper for Condition()
    Don't have to mess with lock
    Only uses notifyAll(), so can be inefficient
• Methods
    set()
    clear()
    isSet()
    wait(timeout)
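A sketch of an Event used as a start signal, in Python 3 (where isSet() is also spelled is_set()). The `log` list and the `waiter` function are made up for the example.

```python
import threading

ready = threading.Event()
log = []


def waiter():
    ready.wait()              # blocks until another thread calls set()
    log.append('go')


t = threading.Thread(target=waiter)
t.start()
log.append('set')             # always recorded before the waiter can proceed
ready.set()
t.join()
print(log)
```

No explicit lock handling appears anywhere, which is the convenience the slide refers to.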
title: TMTOWTDI
TMTOWTDI
• Perl:
    There's More Than One Way To Do It
• Python:
    There should be one - and preferably only one - obvious way to do it
• Threads more like Perl
title: Factory Objects 1
Factory Objects 1
    Body              Wheels
    body.list         wheels.list
    body.rlock        wheels.rlock
    body.event        wheels.event
    assembly.event    assembly.event

    Assembly
    body.list
    body.rlock
    body.event
    wheels.list
    wheels.rlock
    wheels.event
    assembly.rlock
    assembly.event
title: Queue()
Queue()
• Does not use threading
    Can be used with thread
• Designed for subclassing
    Can implement stack, priority queue, etc.
• Simple!
    Handles both data protection and synchronization
title: Queue() Objects
Queue() Objects
• Methods
    Queue(maxsize)
    put(item,block)
    get(block)
    qsize()
    empty()
    full()
• Raises exception when non-blocking and the queue is full (put) or empty (get)
title: Using Queue()
Using Queue()
    Thread 1                      Thread 2
    output = self.doWork()        ...
    queue.put(output)             ...
    ...                           self.input = queue.get()
    ...                           output = self.doWork()
    ...                           queue.put(output)
    self.input = queue.get()      ...
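A runnable sketch of two threads trading work through queues, in Python 3 (module name `queue`). It uses two queues, `requests` and `replies`, so that a thread never reads back its own item; those names, the doubling "work", and the `None` stop marker are all assumptions for illustration.

```python
import queue       # named Queue in the Python versions the slides cover
import threading

requests = queue.Queue()
replies = queue.Queue()


def worker():
    while True:
        item = requests.get()      # blocks until something arrives
        if item is None:           # hypothetical shutdown marker
            break
        replies.put(item * 2)


t = threading.Thread(target=worker)
t.start()
for n in (1, 2, 3):
    requests.put(n)
results = [replies.get() for _ in range(3)]
requests.put(None)
t.join()
print(results)
```

The queue covers both data protection and synchronization: there are no explicit locks and no polling.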
title: Factory Objects 2
Factory Objects 2
    Body              Wheels
    body.queue        wheels.queue

    Assembly
    body.queue
    wheels.queue
    assembly.rlock
title: Factory Objects 3
Factory Objects 3
    Body              Wheels
    body.queue        wheels.queue

    Packager
    while 1:
        body = self.body.queue.get()
        wheels = self.wheels.queue.get()
        self.assembly.queue.put( (body,wheels) )

    Assembly
    assembly.queue
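The packager arrangement can be sketched as a small runnable program. This is an illustration under assumptions, not the tutorial's factory code: the queues are module-level variables rather than attributes, the part names are invented, and the packager runs a fixed number of times instead of looping forever so the example terminates.

```python
import queue
import threading

body_queue = queue.Queue()
wheels_queue = queue.Queue()
assembly_queue = queue.Queue()


def make_parts(out, prefix, count):
    # Stand-in for the Body and Wheels threads.
    for i in range(count):
        out.put('%s-%d' % (prefix, i))


def packager(count):
    # The slide loops forever; a fixed count keeps this sketch finite.
    for _ in range(count):
        body = body_queue.get()
        wheels = wheels_queue.get()
        assembly_queue.put((body, wheels))


threads = [
    threading.Thread(target=make_parts, args=(body_queue, 'body', 3)),
    threading.Thread(target=make_parts, args=(wheels_queue, 'wheels', 3)),
    threading.Thread(target=packager, args=(3,)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
kits = [assembly_queue.get() for _ in range(3)]
print(kits)
```

The assembly side now waits on a single queue instead of coordinating two lists, two locks and three events.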
title: Recap Part 2
Recap Part 2
• Data protection and synchronization
• Python Thread Library
• Queues are good
title: Spider w/Queue
Spider w/Queue
• ThreadPoolSpider.py
• Two queues
    Pass work to thread pool
    Get links back from thread pool
• Queue for both data and events
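ThreadPoolSpider.py itself is not reproduced in this transcript, so the following is only a sketch of the two-queue idea. A hard-coded dictionary, `FAKE_WEB`, stands in for real page retrieval, and the pool size, queue names and `None` stop marker are assumptions.

```python
import queue
import threading

# Hypothetical link graph standing in for real HTTP fetches.
FAKE_WEB = {
    'a': ['b', 'c'],
    'b': ['c', 'd'],
    'c': [],
    'd': ['a'],
}

work_queue = queue.Queue()     # main thread -> pool: URLs to fetch
link_queue = queue.Queue()     # pool -> main thread: (URL, links found)


def pool_worker():
    while True:
        url = work_queue.get()
        if url is None:        # hypothetical shutdown marker
            break
        link_queue.put((url, FAKE_WEB.get(url, [])))


pool = [threading.Thread(target=pool_worker) for _ in range(3)]
for t in pool:
    t.start()

seen = {'a'}
work_queue.put('a')
pending = 1                    # URLs handed out but not yet reported back
while pending:
    url, links = link_queue.get()   # blocks, so no polling loop is needed
    pending -= 1
    for link in links:
        if link not in seen:
            seen.add(link)
            work_queue.put(link)
            pending += 1

for t in pool:
    work_queue.put(None)
for t in pool:
    t.join()
print(sorted(seen))
```

Each item on `link_queue` is both data (the links) and an event (a retrieval finished), which is why the main loop can simply block on get().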
title: GUI building blocks
GUI building blocks
• Widgets
    Windows, buttons, checkboxes, text entry, listboxes
• Events
    Widget activation, keypress, mouse movement, mouse click, timers
title: Tkinter resources
Tkinter resources
• Web
    http://www.python.org/topics/tkinter/doc.html
• Books
    Python and Tkinter Programming, John E. Grayson
title: Fibonacci
Fibonacci
• Fibonacci.py
• UI freezes during calc
• Frequent screen updates slow calc
title: Threaded Fibonacci
Threaded Fibonacci
• FibThreaded.py
• Tkinter needs to poll
    Use after event
• Single-element queue
    Use in non-blocking mode to minimize updates
• Must use "Quit" button
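FibThreaded.py is not included in this transcript. The sketch below shows only the single-element queue idea, with no Tkinter code so it can run anywhere: a plain loop with a short sleep stands in for the GUI's after timer. The function names, the tuple tags and the count of 30 are assumptions.

```python
import queue      # named Queue in the Python versions the slides cover
import threading
import time

updates = queue.Queue(maxsize=1)   # at most one pending display update


def fib_worker(count):
    a, b = 0, 1
    for _ in range(count):
        a, b = b, a + b
        try:
            updates.put_nowait(('progress', a))
        except queue.Full:
            pass                   # display has not caught up; skip this one
    updates.put(('done', a))       # blocking put: the last value must arrive


def poll():
    # In a Tkinter program this would be rescheduled with the after() timer.
    try:
        return updates.get_nowait()
    except queue.Empty:
        return None


worker = threading.Thread(target=fib_worker, args=(30,))
worker.start()
final = None
while final is None:
    item = poll()
    if item is not None and item[0] == 'done':
        final = item[1]
    time.sleep(0.001)              # stand-in for the GUI timer interval
worker.join()
print(final)
```

Because intermediate values are dropped whenever the queue is full, the calculation never waits on the display.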
title: Pop Quiz 1
Pop Quiz 1
How are threads and processes similar and different?
What is the GIL?
In what ways does the GIL make thread programming easier and harder?
How do you create a thread in Python?
What should not be shared between threads?
What are "brute force" threads?