1
Python Training for HP OSO
Guido van Rossum, CNRI
7/23/1999, 9am - 1pm
2
Plug
• The Practice of Programming
• Brian W. Kernighan and Rob Pike
• Addison-Wesley, 1999
Mostly about C, but very useful!
http://cm.bell-labs.com/cm/cs/tpop/
3
CODE STRUCTURE
4
The importance of readability
• Most time is spent on maintenance
• Think about the human reader
• Can you still read your own code...
  – next month?
  – next year?
5
Writing readable code
• Be consistent (but not too consistent!)
• Use whitespace judiciously
• Write appropriate comments
• Write helpful doc strings
  – not novels
• Indicate unfinished business
6
Modifying existing code
• Conform to the existing style
  – even if it’s not your favorite style!
  – local consistency overrides global
• Update the comments!!
  – and the doc strings!!!
7
Organizing code clearly
• Top-down or bottom-up?
• Pick one style, stick to it
• Alternative: group by functionality
  – e.g.:
    • constructor, destructor
    • housekeeping
    • low level methods
    • high level methods
8
When to use classes (...and when not!)
• Use a class:
  – when multiple copies of state needed
    • e.g.: client connections; drawing objects
• Use a module:
  – when one copy of state always suffices
    • e.g.: logger; cache
• Use functions:
  – when no state needed; e.g. sin()
9
Class hierarchies
• Avoid deep class hierarchies
  – inefficient
    • multi-level lookup
  – hard to read
    • find method definitions
  – easy to make mistakes
    • name clashes between attributes
10
Modules and packages
• Modules collect classes, functions
• Packages collect modules
• For a group of related modules:
  – consider using a package
    • minimizes chance of namespace clashes
11
Naming conventions (my preferred style)
• Modules, packages: lowercase
  • except when 1 module ~ 1 class
• Classes: CapitalizedWords
  • also for exceptions
• Methods, attrs: lowercase_words
• Local variables: i, j, sum, x0, etc.
• Globals: long_descriptive_names
12
The main program
• In script or program:
    def main(): ...
    if __name__ == "__main__":
        main()
• In module:
    def _test(): ...
    if __name__ == "__main__":
        _test()
• Always define a function!
13
DOCUMENTATION
14
Writing comments
• Explain salient points (only)
    n = n+1   # include end point
• Note dependencies, refs, bugs
    # Assume reader() handles I/O errors
    # See Knuth, vol.3, page 410
    # XXX doesn’t handle x<0 yet
15
Writing doc strings
"""Brief one-line description.

Longer description, documenting
argument values, defaults,
return values, and exceptions.
"""
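A minimal sketch of a doc string in that shape, written for present-day Python 3 (an assumption; the slides predate it). The function clamp() and its parameters are hypothetical, chosen only to give the doc string something to describe.

```python
def clamp(value, low=0, high=100):
    """Return value limited to the range [low, high].

    value is any number; low and high default to 0 and 100.
    Returns low if value < low, high if value > high, and
    value itself otherwise.  Raises ValueError if low > high.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


print(clamp(150))   # 100
print(clamp(-5))    # 0
```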
16
When NOT to use comments
• Don’t comment what’s obvious
    n = n+1   # increment n
• Don’t put a comment on every line
• Don’t draw boxes, lines, etc.
    #------------------------
    def remove_bias(self):
    #------------------------
        self.bias = 0
17
One more thing...
UPDATE THE COMMENTS WHEN UPDATING THE CODE!
(dammit!)
18
THE LIBRARY
19
The library is your friend!
• Know what's there
• Study the library manual
  – especially the early chapters:
    • Python, string, misc, os services
• Notice platform dependencies
• Avoid obsolete modules
20
Stupid os.path tricks
• os.path.exists(p), isdir(p), islink(p)
• os.path.isabs(p)
• os.path.join(p, q, ...), split(p)
• os.path.basename(p), dirname(p)
• os.path.splitdrive(p), splitext(p)
• os.path.normcase(p), normpath(p)
• os.path.expanduser(p)
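A short sketch of a few of these calls working together, in present-day Python 3 (an assumption; the slides predate it). The path components "project", "src" and "main.py" are made up for illustration, and nothing here touches the file system.

```python
import os.path

p = os.path.join("project", "src", "main.py")   # uses the platform's separator

head, tail = os.path.split(p)        # (directory part, last component)
root, ext = os.path.splitext(tail)   # ("main", ".py")

print(os.path.basename(p))           # main.py
print(os.path.dirname(p) == head)    # True
print(os.path.isabs(p))              # False: the path is relative

# normpath() collapses "b/.." so only "a" and "c" remain
print(os.path.normpath(os.path.join("a", "b", os.pardir, "c")))
```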
21
PORTING YOUR BRAIN
(from Java to Python)
22
Class or module?
• Stateless operations, factory funcs
  • Java: static methods
  • Python: functions in module
• Singleton state
  • Java: static members, methods
  • Python: module globals, functions
23
Private, protected, public?
• Java:
  • private, protected, public
    – enforced by compiler (and JVM?)
• Python:
  • __private
    – enforced by compiler
    – loophole: _Class__private
  • _private, _protected, public
    – used by convention
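The __private rule and its loophole can be seen in a small sketch, written in present-day Python 3 (an assumption; the slides predate it). The class Account and its attributes are hypothetical.

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance   # stored as _Account__balance
        self._owner_note = ""      # "private" by convention only

    def balance(self):
        return self.__balance      # mangled the same way inside the class


acct = Account(100)
print(acct.balance())              # 100

try:
    acct.__balance                 # no mangling outside the class body
except AttributeError:
    print("no attribute named __balance")

print(acct._Account__balance)      # 100: the loophole
```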
24
Method/constr. overloading
• Java:
    class C {
        int f() { ... }
        int f(int i) { ... }
        int f(int i, int arg) { ... }
    }
• Python:
    class C:
        def f(self, i=0, arg=None): ...
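A runnable sketch of the Python side, in present-day Python 3 (an assumption; the slides predate it). The method body is hypothetical; it only shows how one definition with defaults serves all three call shapes.

```python
class C:
    def f(self, i=0, arg=None):
        # One method covers f(), f(i) and f(i, arg).
        if arg is None:
            return i
        return (i, arg)


c = C()
print(c.f())          # 0
print(c.f(5))         # 5
print(c.f(5, "x"))    # (5, 'x')
```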
25
Java interfaces
• In Python, interfaces often implied
    class File:
        def read(self, n): ...

    class CompressedFile:
        def read(self, n): ...
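A sketch of what an implied interface means in practice, in present-day Python 3 (an assumption; the slides predate it). UpperCaseFile is a hypothetical stand-in for CompressedFile, and first_chars() is a made-up caller: it accepts any object with a read(n) method, with no declared interface.

```python
import io


class File:
    def __init__(self, text):
        self._stream = io.StringIO(text)

    def read(self, n):
        return self._stream.read(n)


class UpperCaseFile:
    """Stand-in for CompressedFile: same read(n), different behaviour."""

    def __init__(self, text):
        self._stream = io.StringIO(text)

    def read(self, n):
        return self._stream.read(n).upper()


def first_chars(f, n=5):
    # Works with any object that has a read(n) method.
    return f.read(n)


print(first_chars(File("hello world")))            # hello
print(first_chars(UpperCaseFile("hello world")))   # HELLO
```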
26
Abstract classes
• Not used much in Python
• Possible:
    class GraphicalObject:
        def draw(self, display):
            raise NotImplementedError
        def move(self, dx, dy):
            raise NotImplementedError
        ....
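A sketch of how such a base class behaves once subclassed, in present-day Python 3 (an assumption; the slides predate it). The subclass Point is hypothetical and overrides only move(), so calling draw() still raises.

```python
class GraphicalObject:
    def draw(self, display):
        raise NotImplementedError

    def move(self, dx, dy):
        raise NotImplementedError


class Point(GraphicalObject):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def move(self, dx, dy):
        self.x += dx
        self.y += dy
    # draw() is deliberately left unimplemented here


pt = Point(1, 2)
pt.move(3, 4)
print((pt.x, pt.y))          # (4, 6)
try:
    pt.draw(None)
except NotImplementedError:
    print("Point does not implement draw()")
```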
27
ERROR HANDLING
28
When to catch exceptions
• When there's an alternative option
    try:
        f = open(".startup")
    except IOError:
        f = None   # No startup file; use defaults
• To exit with nice error message
    try:
        f = open("data")
    except IOError, msg:
        print "I/O Error:", msg; sys.exit(1)
29
When NOT to catch them
• When the cause is likely a bug
  • need the traceback to find the cause!
• When the caller can catch it
  • keep exception handling in outer layers
• When you don't know what to do
    try:
        receive_message()
    except:
        print "An error occurred!"
30
Exception handling style
• Bad:
    try:
        parse_args()
        f = open(file)
        read_input()
        make_report()
    except IOError:
        print file, "not found"
    # (what if read_input()
    # raises IOError?)
• Good:
    parse_args()
    try:
        f = open(file)
    except IOError, msg:
        print file, msg
        sys.exit(1)
    read_input()
    make_report()
31
Error reporting/logging
• Decide where errors should go:
  – sys.stdout - okay for small scripts
  – sys.stderr - for larger programs
  – raise exception - in library modules
    • let caller decide how to report!
  – log function - not recommended
    • better redirect sys.stderr to log object!
32
The danger of “except:”
• What's wrong with this code:
    try:
        return self.children[O]   # first child
    except:
        return None               # no children
• Solution:
    except IndexError:
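One way a bare except: can hide a bug, sketched in present-day Python 3 (an assumption; the slides predate it). The class Node and the misspelled attribute childern are hypothetical; both methods contain the same typo, but only the narrow handler lets it surface.

```python
class Node:
    def __init__(self, children):
        self.children = children

    def first_child_bare(self):
        try:
            return self.childern[0]    # typo: raises AttributeError
        except:                        # swallows the typo as well
            return None

    def first_child_narrow(self):
        try:
            return self.childern[0]    # same typo
        except IndexError:             # only "no children" is handled
            return None


node = Node(["a", "b"])
print(node.first_child_bare())         # None, although a child exists
try:
    node.first_child_narrow()
except AttributeError as err:
    print("bug surfaced:", err)
```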
33
PYTHON PITFALLS
34
Sharing mutable objects
• through variables
    a = [1,2]; b = a; a.append(3); print b
• as default arguments
    def add(a, list=[]): list.append(a); return list
• as class attributes
    class TreeNode:
        children = []
        ...
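A sketch of the default-argument and class-attribute pitfalls next to a common workaround, in present-day Python 3 (an assumption; the slides predate it). The names add_shared and add_fresh are hypothetical; the workaround shown (None as the default, a per-instance list in __init__) is one usual idiom, not something the slides prescribe.

```python
def add_shared(a, items=[]):       # the default list is created only once
    items.append(a)
    return items


def add_fresh(a, items=None):      # workaround: use None as the default
    if items is None:
        items = []
    items.append(a)
    return items


print(add_shared(1))   # [1]
print(add_shared(2))   # [1, 2]: the same list again
print(add_fresh(1))    # [1]
print(add_fresh(2))    # [2]


class TreeNode:
    def __init__(self):
        self.children = []         # one list per instance, not per class
```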
35
Lurking bugs
• bugs in exception handlers
    try:
        f = open(file)
    except IOError, err:
        print "I/O Error:", file, msg
• misspelled names in assignments
    self.done = 0
    while not done:
        if self.did_it():
            self.Done = 1
36
Global variables
# logging module

log = []

def addlog(x):
    log.append(x)

def resetlog():
    log = []   # doesn’t work!

# logging module
# corrected version

log = []

def addlog(x):
    log.append(x)

def resetlog():
    global log
    log = []
37
kjpylint
• Detects many lurking bugs
  – http://www.chordate.com/kwParsing/
38
PERFORMANCE
39
When to worry about speed
• Only worry about speed when...
  – your code works (!)
  – and its overall speed is too slow
  – and it must run many times
  – and you can't buy faster hardware
40
Using the profile module
>>> import profile
>>> import xmlini
>>> data = open("test.xml").read()
>>> profile.run("xmlini.fromxml(data)")
>>> profile.run("for i in range(100): xmlini.fromxml(data)")
         10702 function calls in 1.155 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.013    0.013    1.154    1.154 <string>:1(?)
        1    0.001    0.001    1.155    1.155 profile:0(for i in range(100): xmlini.fromxml(data))
        0    0.000             0.000          profile:0(profiler)
      500    0.018    0.000    0.018    0.000 xmlini.py:105(end_group)
      700    0.032    0.000    0.032    0.000 xmlini.py:109(start_item)
      700    0.050    0.000    0.050    0.000 xmlini.py:115(end_item)
      200    0.007    0.000    0.007    0.000 xmlini.py:125(start_val)
      200    0.014    0.000    0.014    0.000 xmlini.py:129(end_val)
     1600    0.190    0.000    0.270    0.000 xmlini.py:134(finish_starttag)
     1600    0.163    0.000    0.258    0.000 xmlini.py:143(finish_endtag)
      100    0.004    0.000    0.004    0.000 xmlini.py:152(handle_proc)
      100    0.007    0.000    0.007    0.000 xmlini.py:162(handle_charref)
      100    0.007    0.000    0.007    0.000 xmlini.py:167(handle_entityref)
     3600    0.161    0.000    0.161    0.000 xmlini.py:172(handle_data)
      100    0.003    0.000    0.003    0.000 xmlini.py:182(handle_comment)
      100    0.420    0.004    1.141    0.011 xmlini.py:60(fromxml)
      100    0.007    0.000    0.007    0.000 xmlini.py:70(__init__)
      100    0.004    0.000    0.004    0.000 xmlini.py:80(getdict)
      200    0.012    0.000    0.012    0.000 xmlini.py:86(start_top)
      200    0.014    0.000    0.014    0.000 xmlini.py:92(end_top)
      500    0.029    0.000    0.029    0.000 xmlini.py:99(start_group)
41
Measuring raw speed
# Here's one way
import time
def timing(func, arg, ncalls=100):
    r = range(ncalls)
    t0 = time.clock()
    for i in r:
        func(arg)
    t1 = time.clock()
    dt = t1-t0
    print "%s: %.3f ms/call (%.3f seconds / %d calls)" % (
        func.__name__, 1000*dt/ncalls, dt, ncalls)
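For readers on present-day Python 3 (an assumption; the slides predate it), a sketch of the same helper. time.clock() was removed in Python 3.8, so this version swaps in time.perf_counter(); returning the per-call figure is an addition made here so the result can be reused, not part of the slide.

```python
import time


def timing(func, arg, ncalls=100):
    """Call func(arg) ncalls times; return the mean time per call in ms."""
    t0 = time.perf_counter()
    for _ in range(ncalls):
        func(arg)
    dt = time.perf_counter() - t0
    ms_per_call = 1000 * dt / ncalls
    print("%s: %.3f ms/call (%.3f seconds / %d calls)" % (
        func.__name__, ms_per_call, dt, ncalls))
    return ms_per_call


# Example: time sorting a reversed list of 1000 integers.
timing(sorted, list(range(1000, 0, -1)))
```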
42
How to hand-optimize code
import string, types

def dictser(dict, ListType=types.ListType, isinstance=isinstance):
    L = []
    group = dict.get("main")
    if group:
        for key in group.keys():
            value = group[key]
            if isinstance(value, ListType):
                for item in value:
                    L.extend([" ", key, " = ", item, "\n"])
            else:
                L.extend([" ", key, " = ", value, "\n"])
    ...
    return string.join(L, "")
43
When NOT to optimize code
• Usually
• When it's not yet working
• If you care about maintainability!
• Premature optimization is the root of all evil (well, almost :)
44
THREAD PROGRAMMING
45
Which API?
• thread - traditional Python API
    import thread
    thread.start_new(doit, (5,))
    # (can't easily wait for its completion)
• threading - resembles Java API
    from threading import Thread   # and much more...
    t = Thread(target=doit, args=(5,))
    t.start()
    t.join()
46
Atomic operations
• Atomic:
    i = None
    a.extend([x, y, z])
    x = a.pop()
    v = dict[k]
• Not atomic:
    i = i+1
    if not dict.has_key(k): dict[k] = 0
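A sketch of protecting a non-atomic read-modify-write with a lock, in present-day Python 3 (an assumption; the slides predate it). The names counts, counts_lock and bump are hypothetical, and the with statement is the modern shorthand for acquire() followed by a guaranteed release().

```python
import threading

counts = {"hits": 0}
counts_lock = threading.Lock()


def bump(times):
    for _ in range(times):
        with counts_lock:                          # acquire(), release() on exit
            counts["hits"] = counts["hits"] + 1    # read-modify-write


threads = [threading.Thread(target=bump, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counts["hits"])    # 40000
```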
47
Python lock objects
• Not reentrant:
    lock.acquire(); lock.acquire()   # i.e. twice!
  – blocks until another thread calls lock.release()
• No "lock owner"
• Solution:
  – threading.RLock class
    • (more expensive)
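The difference can be observed without deadlocking by asking for the lock in non-blocking mode, sketched here in present-day Python 3 (an assumption; the slides predate it). The variable names are hypothetical.

```python
import threading

plain = threading.Lock()
plain.acquire()
# A second blocking acquire() would hang here, so ask without blocking:
second_plain = plain.acquire(False)
print(second_plain)              # False: the lock is already held
plain.release()

reentrant = threading.RLock()
reentrant.acquire()
second_reentrant = reentrant.acquire(False)   # same thread may re-acquire
print(bool(second_reentrant))    # True
reentrant.release()              # one release per acquire
reentrant.release()
```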
48
Critical sections
lock.acquire()
try:
    "this is the critical section"
    "it may raise an exception..."
finally:
    lock.release()
49
"Synchronized" methods
class MyObject:

    def __init__(self):
        self._lock = threading.RLock()
        # or threading.Lock(), if no reentrancy needed

    def some_method(self):
        self._lock.acquire()
        try:
            "go about your business"
        finally:
            self._lock.release()
50
Worker threads
• Setup:
    def consumer(): ...
    def producer(): ...
    for i in range(NCONSUMERS):
        thread.start_new(consumer, ())
    for i in range(NPRODUCERS):
        thread.start_new(producer, ())
    "now wait until all threads done"
51
Shared work queue
• Producers:
    while 1:
        job = make_job()
        Q.put(job)
• Consumers:
    while 1:
        job = Q.get()
        finish_job(job)
• Shared:
    import Queue
    Q = Queue.Queue(0)   # or maxQsize
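A self-contained sketch of the worker setup and shared queue in present-day Python 3 (an assumption; the slides predate it), where the module is spelled queue and threads are started through threading so they can be joined. The STOP sentinel, the results list and the squaring "job" are hypothetical; the slides leave shutdown and finish_job() unspecified.

```python
import queue
import threading

NCONSUMERS = 3
STOP = None                      # sentinel telling a consumer to exit
Q = queue.Queue(0)               # 0 means no size limit
results = []
results_lock = threading.Lock()


def producer(njobs):
    for job in range(njobs):
        Q.put(job)


def consumer():
    while True:
        job = Q.get()
        if job is STOP:
            break
        with results_lock:
            results.append(job * job)    # stands in for finish_job(job)


consumers = [threading.Thread(target=consumer) for _ in range(NCONSUMERS)]
for t in consumers:
    t.start()
producer(10)
for _ in consumers:
    Q.put(STOP)                  # one sentinel per consumer
for t in consumers:
    t.join()                     # wait until all threads are done
print(sorted(results))
```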
52
Using a list as a queue
• Shared:
    Q = []
• Producers:
    while 1:
        job = make_job()
        Q.append(job)
• Consumers:
    while 1:
        try:
            job = Q.pop()
        except IndexError:
            time.sleep(...)
            continue
        finish_job(job)
53
Using a condition variable
• Shared:
    Q = []
    cv = Condition()
• Producers:
    while 1:
        job = make_job()
        cv.acquire()
        Q.append(job)
        cv.notify()
        cv.release()
• Consumers:
    while 1:
        cv.acquire()
        while not Q:
            cv.wait()
        job = Q.pop()
        cv.release()
        finish_job(job)
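A runnable sketch of the same pattern with one producer and one consumer, in present-day Python 3 (an assumption; the slides predate it). Two things differ from the slide and are choices made here: the loops are bounded so the program ends, and the consumer uses Q.pop(0) to take the oldest job first where the slide uses Q.pop(). The done list stands in for finish_job().

```python
import threading

Q = []
cv = threading.Condition()
done = []


def producer(njobs):
    for job in range(njobs):
        with cv:                 # cv.acquire() ... cv.release()
            Q.append(job)
            cv.notify()


def consumer(njobs):
    for _ in range(njobs):
        with cv:
            while not Q:         # re-check the queue after every wakeup
                cv.wait()
            job = Q.pop(0)       # oldest job first
        done.append(job)         # stands in for finish_job(job)


c = threading.Thread(target=consumer, args=(5,))
p = threading.Thread(target=producer, args=(5,))
c.start()
p.start()
p.join()
c.join()
print(done)                      # [0, 1, 2, 3, 4]
```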
54
TIME FOR DISCUSSION