Python & Perl
Lecture 07
Department of Computer Science, Utah State University
Outline
● Encoding and decoding with Huffman Trees
● List Comprehension
● Introduction to OOP
Encoding & Decoding Messages with Huffman Trees
Sample Huffman Tree
{A, B, C, D, E, F, G, H}: 17
  [0] A: 8
  [1] {B, C, D, E, F, G, H}: 9
    [0] {B, C, D}: 5
      [0] B: 3
      [1] {C, D}: 2
        C: 1
        D: 1
    [1] {E, F, G, H}: 4
      {E, F}: 2
        E: 1
        F: 1
      {G, H}: 2
        G: 1
        H: 1
([0] marks a left branch, [1] marks a right branch)
Symbol Encoding
1. Given a symbol s and a Huffman tree ht, set current_node to the root node and encoding to an empty list (you can also check whether s is in the root node's symbol list and, if not, signal an error)
2. If current_node is a leaf, return encoding
3. Check if s is in current_node's left branch or right branch
4. If in the left, add 0 to encoding, set current_node to the root of the left branch, and go to step 2
5. If in the right, add 1 to encoding, set current_node to the root of the right branch, and go to step 2
6. If in neither branch, signal error
Example
● Encode B with the sample Huffman tree
● Set current_node to the root node
● B is in current_node's right branch, so add 1 to encoding and recurse into the right branch (current_node is set to the root of the right branch, {B, C, D, E, F, G, H}: 9)
● B is in current_node's left branch, so add 0 to encoding and recurse into the left branch (current_node is {B, C, D}: 5)
● B is in current_node's left branch, so add 0 to encoding and recurse into the left branch (current_node is B: 3)
● current_node is a leaf, so return 100 (the value of encoding)
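The symbol-encoding steps can be sketched in Python as follows. This is only an illustration under stated assumptions, not the lecture's implementation: the node layout (symbols, weight, left, right) and the helper names leaf, join, and encode_symbol are hypothetical, and the left/right order of C/D, E/F, and G/H in the sample tree is assumed, since the examples only confirm the paths to A and B. The code uses range-free syntax that runs on both Python 2 and Python 3.

```python
# Hypothetical node layout: (symbols, weight, left, right).
# A leaf has left = right = None.
def leaf(symbol, weight):
    return ((symbol,), weight, None, None)

def join(left, right):
    # An internal node holds the symbols of both children and the sum of their weights.
    return (left[0] + right[0], left[1] + right[1], left, right)

# The sample tree. Assumption: C, E, G and {E, F} are left children;
# the slides only confirm the branches leading to A (0) and B (100).
SAMPLE_TREE = join(
    leaf('A', 8),
    join(
        join(leaf('B', 3), join(leaf('C', 1), leaf('D', 1))),
        join(join(leaf('E', 1), leaf('F', 1)), join(leaf('G', 1), leaf('H', 1)))
    )
)

def encode_symbol(s, ht):
    """Return the list of bits that encodes symbol s in Huffman tree ht."""
    if s not in ht[0]:                      # optional check from step 1
        raise ValueError('symbol %r is not in the tree' % (s,))
    encoding = []
    current_node = ht
    while current_node[2] is not None:      # step 2: stop when a leaf is reached
        left, right = current_node[2], current_node[3]
        if s in left[0]:                    # step 4: go left, emit 0
            encoding.append(0)
            current_node = left
        elif s in right[0]:                 # step 5: go right, emit 1
            encoding.append(1)
            current_node = right
        else:                               # step 6: in neither branch
            raise ValueError('symbol %r is in neither branch' % (s,))
    return encoding

print(encode_symbol('B', SAMPLE_TREE))      # [1, 0, 0]
print(encode_symbol('A', SAMPLE_TREE))      # [0]
```

The loop replaces the "go to step 2" jumps; a recursive version would work equally well.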
Message Encoding
● Given a sequence of symbols message and a Huffman tree ht
● Concatenate the encoding of each symbol in message from left to right
● Return the concatenation of encodings
Example
● Encode ABBA with the sample Huffman tree
● Encoding for A is 0
● Encoding for B is 100
● Encoding for B is 100
● Encoding for A is 0
● Concatenation of encodings is 01001000
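Message encoding is just concatenation, which a short sketch makes concrete. To keep the example self-contained, it assumes the per-symbol codes have already been computed and stored in a dictionary; the names CODES and encode_message are hypothetical, and only the codes for A and B from the example are included.

```python
# Assumed per-symbol codes, taken from the example (A -> 0, B -> 100).
CODES = {'A': '0', 'B': '100'}

def encode_message(message, codes):
    """Concatenate the encoding of each symbol in message, left to right."""
    return ''.join(codes[symbol] for symbol in message)

print(encode_message('ABBA', CODES))   # 01001000
```

A symbol that is missing from the table raises KeyError, which plays the role of "signal error" here.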
Message Decoding
1. Given a sequence of bits message and a Huffman tree ht, set current_node to the root and decoding to an empty list
2. If current_node is a leaf, add its symbol to decoding and set current_node to ht's root
3. If current_node is ht's root and message has no more bits, return decoding
4. If no more bits in message & current_node is not a leaf, signal error
5. If message's current bit is 0, set current_node to its left child, read the bit, & go to step 2
6. If message's current bit is 1, set current_node to its right child, read the bit, & go to step 2
Example
● Decode 0100 with the sample Huffman tree
● Read 0, go left to A:8, add A to decoding, and reset current_node to the root
● Read 1, go right to {B, C, D, E, F, G, H}: 9
● Read 0, go left to {B, C, D}:5
● Read 0, go left to B:3
● Add B to decoding & reset current_node to the root
● No more bits & current_node is the root, so return AB
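The decoding steps can be sketched as below. This is a minimal illustration, not the lecture's code: it assumes a compact, hypothetical tree representation in which a leaf is its symbol (a string) and an internal node is a (left, right) pair, and it assumes the same left/right order for C/D, E/F, and G/H as a guess, because the examples only confirm the paths to A and B.

```python
# Hypothetical compact representation: a leaf is a symbol string,
# an internal node is a (left, right) pair. Weights are not needed to decode.
SAMPLE_TREE = ('A', (('B', ('C', 'D')), (('E', 'F'), ('G', 'H'))))

def decode(bits, ht):
    """Decode a string of '0'/'1' characters with Huffman tree ht."""
    decoding = []
    current_node = ht
    for bit in bits:
        if bit == '0':
            current_node = current_node[0]      # step 5: move to the left child
        elif bit == '1':
            current_node = current_node[1]      # step 6: move to the right child
        else:
            raise ValueError('not a bit: %r' % (bit,))
        if isinstance(current_node, str):       # step 2: leaf reached
            decoding.append(current_node)
            current_node = ht                   # reset to the root
    if current_node is not ht:                  # step 4: bits ran out mid-code
        raise ValueError('message ends in the middle of a code')
    return ''.join(decoding)                    # step 3

print(decode('0100', SAMPLE_TREE))       # AB
print(decode('01001000', SAMPLE_TREE))   # ABBA
```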
Generation of Huffman Trees
Algorithm
● Basic idea: Build the tree bottom up so that symbols with the smallest frequencies are farthest from the root
● Given a sequence of nodes (initially single symbols and their frequencies), find two nodes with the smallest frequencies and combine them into a new node whose symbol list contains the symbols of the two nodes and whose frequency is the sum of the frequencies of the two nodes
● Remove the two combined nodes from the sequence and add the newly constructed node back to the sequence (note that the length of the sequence is now reduced by 1)
● Keep combining pairs of nodes in the above fashion until there is only one node left in the sequence: this is the root of the Huffman tree
Example
● Initial sequence: [A:4, B:2, C:1, D:1]
● Find two nodes with the smallest frequencies and combine them into a new node whose symbol list contains the symbols of the two nodes and whose frequency is the sum of the frequencies of the two nodes
● The nodes are C:1 and D:1
● The new node is {C, D}:2
● After removing C:1 and D:1 and adding {C, D}:2, the sequence becomes [A:4, B:2, {C, D}:2]
Example
● The Huffman tree so far:
{C,D}:2
  C:1
  D:1
Example
● Current sequence: [A:4, B:2, {C,D}:2]
● Find two nodes with the smallest frequencies and combine them into a new node whose symbol list contains the symbols of the two nodes and whose frequency is the sum of the frequencies of the two nodes
● The nodes are B:2 and {C, D}:2
● The new node is {B, C, D}:4
● After removing B:2 and {C, D}:2 and adding {B, C, D}:4, the sequence becomes [A:4, {B, C, D}:4]
Example
● The Huffman tree so far:
{B,C,D}:4
  B:2
  {C,D}:2
    C:1
    D:1
Example
● Current sequence: [A:4, {B,C,D}:4]
● Find two nodes with the smallest frequencies and combine them into a new node whose symbol list contains the symbols of the two nodes and whose frequency is the sum of the frequencies of the two nodes
● The nodes are A:4 and {B, C, D}:4
● The new node is {A, B, C, D}:8
● After removing A:4 and {B, C, D}:4 and adding {A, B, C, D}:8, the sequence becomes [{A, B, C, D}:8]
● We are done, because the sequence has only one node
Example
● The final Huffman tree:
{A,B,C,D}:8
  A:4
  {B,C,D}:4
    B:2
    {C,D}:2
      C:1
      D:1
This is a programming assignment
Remarks on the Algorithm
● The algorithm does not specify a unique Huffman tree, because there may be more than two nodes in the sequence with the same frequency
● How these nodes are combined at each step (e.g., two rightmost nodes, two leftmost nodes, two middle nodes) is arbitrary, and is left for the programmer to decide
● The algorithm does guarantee the same total encoded length (the sum of frequency × code length over all symbols) regardless of which combination method is used, although the code lengths of individual symbols may differ
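The generation algorithm can be sketched as below. This is one possible sketch, not a reference solution: the node layout (symbols, weight, left, right) and the name make_huffman_tree are hypothetical, and ties are broken by a stable sort on weight, which is just one of the arbitrary choices the remarks mention.

```python
def make_huffman_tree(frequencies):
    """Build a Huffman tree from a list of (symbol, frequency) pairs.

    Hypothetical node layout: (symbols, weight, left, right);
    a leaf has left = right = None.
    """
    nodes = [((symbol,), weight, None, None) for symbol, weight in frequencies]
    if not nodes:
        raise ValueError('at least one symbol is required')
    while len(nodes) > 1:
        # Stable sort by weight: nodes with equal weights keep their current
        # order. This tie-breaking rule is an arbitrary choice.
        nodes.sort(key=lambda node: node[1])
        first, second = nodes[0], nodes[1]
        merged = (first[0] + second[0], first[1] + second[1], first, second)
        # Remove the two combined nodes and add the new one back.
        nodes = nodes[2:] + [merged]
    return nodes[0]

tree = make_huffman_tree([('A', 4), ('B', 2), ('C', 1), ('D', 1)])
print(tree[0])   # ('A', 'B', 'C', 'D')
print(tree[1])   # 8
```

With this tie-breaking rule the intermediate sequences match the worked example: C:1 and D:1 are combined first, then B:2 and {C, D}:2, then A:4 and {B, C, D}:4. Sorting on every pass is simple but not the fastest option; a priority queue would avoid the repeated sorts.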
List Comprehension
List Comprehension
● List comprehension is a syntactic construct in some programming languages for building lists from list specifications
● List comprehension derives its conceptual roots from the set-former (set-builder) notation in mathematics
[Y for X in LIST]
● List comprehension is available in other programming languages such as Common Lisp, Haskell, and OCaml
Set-Former Notation Example
$$\{\, 4x \mid x \in \mathbb{N},\ x^2 < 100 \,\}$$
● 4x is the output function
● x is the variable
● N is the input set
● x^2 < 100 is the predicate
For-Loop Implementation
### building the list of the set-former example with for-loop
>>> rslt = []
>>> for x in xrange(201):
        if x ** 2 < 100:
            rslt.append(4 * x)
>>> rslt
[0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
List Comprehension Equivalent
### building the same list with list comprehension
>>> s = [ 4 * x for x in xrange(201) if x ** 2 < 100]
>>> s
[0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
For-Loop
### building list of squares of even numbers in [0, 10]
### with for-loop
>>> rslt = []
>>> for x in xrange(11):
        if x % 2 == 0:
            rslt.append(x**2)
>>> rslt
[0, 4, 16, 36, 64, 100]
List Comprehension Equivalent
### building the same list with list comprehension
>>> [x ** 2 for x in xrange(11) if x % 2 == 0]
[0, 4, 16, 36, 64, 100]
For-Loop
## building list of squares of odd numbers in [0, 10]
>>> rslt = []
>>> for x in xrange(11):
        if x % 2 != 0:
            rslt.append(x**2)
>>> rslt
[1, 9, 25, 49, 81]
List Comprehension Equivalent
## building list of squares of odd numbers in [0, 10]
## with list comprehension
>>> [x ** 2 for x in xrange(11) if x % 2 != 0]
[1, 9, 25, 49, 81]
List Comprehension with For-Loops
For-Loop
>>> rslt = []
>>> for x in xrange(6):
        if x % 2 == 0:
            for y in xrange(6):
                if y % 2 != 0:
                    rslt.append((x, y))
>>> rslt
[(0, 1), (0, 3), (0, 5), (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)]
List Comprehension Equivalent
>>> [(x, y) for x in xrange(6) if x % 2 == 0 \
for y in xrange(6) if y % 2 != 0]
[(0, 1), (0, 3), (0, 5), (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)]
List Comprehension with Matrices
List Comprehension with Matrices
● List comprehension can be used to scan rows and columns in matrices
>>> matrix = [
[10, 20, 30],
[40, 50, 60],
[70, 80, 90]
]
### extract all rows
>>> [r for r in matrix]
[[10, 20, 30], [40, 50, 60], [70, 80, 90]]
List Comprehension with Matrices
>>> matrix = [
[10, 20, 30],
[40, 50, 60],
[70, 80, 90]
]
### extract column 0
>>> [r[0] for r in matrix]
[10, 40, 70]
List Comprehension with Matrices
>>> matrix = [
[10, 20, 30],
[40, 50, 60],
[70, 80, 90]
]
### extract column 1
>>> [r[1] for r in matrix]
[20, 50, 80]
List Comprehension with Matrices
>>> matrix = [
[10, 20, 30],
[40, 50, 60],
[70, 80, 90]
]
### extract column 2
>>> [r[2] for r in matrix]
[30, 60, 90]
List Comprehension with Matrices
### turn matrix columns into rows
>>> rslt = []
>>> for c in xrange(len(matrix)):
        rslt.append([matrix[r][c] for r in xrange(len(matrix))])
>>> rslt
[[10, 40, 70], [20, 50, 80], [30, 60, 90]]
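The column-to-row conversion can also be written as a single nested list comprehension. The sketch below is an illustrative alternative to the loop version; the name transposed is hypothetical, and range is used instead of xrange so that the code runs on both Python 2 and Python 3.

```python
matrix = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]

# The outer comprehension walks the column indices; the inner one
# collects that column's entry from every row.
transposed = [[row[c] for row in matrix] for c in range(len(matrix[0]))]

print(transposed)   # [[10, 40, 70], [20, 50, 80], [30, 60, 90]]
```

Using len(matrix[0]) for the column count lets the same expression handle non-square matrices, provided every row has the same length.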
List Comprehension with Matrices
● List comprehension can work with iterables (e.g., dictionaries)
>>> dict = {'a' : 'A', 'bb' : 'BB', 'ccc' : 'CCC'}
>>> [(item[0], item[1], len(item[0]+item[1])) \
for item in dict.items()]
[('a', 'A', 2), ('ccc', 'CCC', 6), ('bb', 'BB', 4)]
List Comprehension
● If the expression inside [ ] is a tuple, parentheses are a must
>>> cubes = [(x, x**3) for x in xrange(5)]
>>> cubes
[(0, 0), (1, 1), (2, 8), (3, 27), (4, 64)]
● Sequences can be unpacked in list comprehension
>>> sums = [x + y for x, y in cubes]
>>> sums
[0, 2, 10, 30, 68]
List Comprehension
● for-clauses in list comprehensions can iterate over any sequences:
>>> rslt = [ c * n for c in 'math' for n in (1, 2, 3)]
>>> rslt
['m', 'mm', 'mmm', 'a', 'aa', 'aaa', 't', 'tt','ttt', 'h', 'hh', 'hhh']
List Comprehension & Loop Variables
● In Python 2, the loop variables used in list comprehensions (and in regular for-loops) remain bound after execution; in Python 3, only the variables of regular for-loops do
>>> for i in [1, 2, 3]: print i
1
2
3
>>> i + 4
7
>>> [j for j in xrange(10) if j % 2 == 0]
[0, 2, 4, 6, 8]
>>> j * 2
18
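The session above shows Python 2 behavior. As a hedged aside, the small script below checks on whichever interpreter runs it whether the comprehension variable is still bound afterwards; the names doubled and leaked are hypothetical.

```python
import sys

for i in [1, 2, 3]:
    pass
# A regular for-loop variable stays bound in both Python 2 and Python 3.
print(i)            # 3

doubled = [j * 2 for j in range(10) if j % 2 == 0]
try:
    j               # look the name up without using its value
    leaked = True
except NameError:
    leaked = False

# True on Python 2, False on Python 3.
print(sys.version_info[0], leaked)
```

On Python 3 a list comprehension has its own scope, so code that relies on j after the comprehension needs to be rewritten when ported.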
When To Use List Comprehension
● For-loops are easier to understand and debug
● List comprehensions may be harder to understand
● List comprehensions are typically faster than equivalent for-loops in the interpreter
● List comprehensions are worth using to speed up simpler tasks
● For-loops are worth using when logic gets complex
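The speed claim is easy to check locally. The sketch below times both forms with the standard timeit module; the function names are hypothetical, and the measured times depend on the machine and interpreter, so no particular numbers are claimed here.

```python
import timeit

def squares_loop(n):
    # Squares of even numbers, built with an explicit for-loop.
    rslt = []
    for x in range(n):
        if x % 2 == 0:
            rslt.append(x ** 2)
    return rslt

def squares_comprehension(n):
    # The same list, built with a list comprehension.
    return [x ** 2 for x in range(n) if x % 2 == 0]

# Both versions must agree before their timings are worth comparing.
assert squares_loop(11) == squares_comprehension(11)

loop_time = timeit.timeit(lambda: squares_loop(1000), number=200)
comp_time = timeit.timeit(lambda: squares_comprehension(1000), number=200)
print('for-loop: %.4f s, comprehension: %.4f s' % (loop_time, comp_time))
```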
OOP in Python
Classes vs. Objects
● A class is a definition (blueprint, description) of states and behaviors of objects that belong to it
● An object is a member of its class that behaves according to its class blueprint
● Objects of a class are also called instances of that class
Older Python: Classes vs. Types
● In older versions of Python, there was a difference between classes and types
● The programmer could create classes but not types
● In newer versions of Python, the distinction between types and classes is disappearing
● The programmer can now make subclasses of built-in types and the types are behaving like classes
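As a small illustration of subclassing a built-in type, the sketch below derives a class from list. The class name CountingList and its appends attribute are hypothetical; the explicit super(CountingList, self) form is used so that the code runs on both Python 2 and Python 3.

```python
class CountingList(list):
    """A list that counts how many times append is called."""

    def __init__(self, *args):
        super(CountingList, self).__init__(*args)
        self.appends = 0

    def append(self, item):
        self.appends += 1
        super(CountingList, self).append(item)

cl = CountingList([1, 2])
cl.append(3)
print(cl == [1, 2, 3])        # True
print(cl.appends)             # 1
print(isinstance(cl, list))   # True
```

Every other list operation (indexing, slicing, len, iteration) is inherited unchanged from list.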
Older Python: Classes vs. Types
● In Python versions prior to Python 3.0, old style classes are default
● To get the new style classes, place __metaclass__ = type at the beginning of a script or a module
● There is no reason to use old style classes any more (unless there is a serious backward compatibility issue).
● Python 3.0 and higher do not support old style classes
Class Definition Syntax
__metaclass__ = type
class ClassName:
    <statement-1>
    …
    <statement-N>
Class Definition
Class Definition Evaluation
● When a class definition is evaluated, a new namespace is created and used as the local scope
● All assignments of local variables occur in that new namespace
● Function definitions bind function names in that new namespace
● When a class definition is exited, a class object is created
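The evaluation steps can be observed directly by inspecting the namespace of the resulting class object. The sketch below is illustrative only; the class name Demo and its attributes are hypothetical.

```python
class Demo(object):
    x = 12
    y = x * 2          # the body runs top to bottom, so x is already bound here

    def g(self):       # def binds the name g in the class namespace
        return 'Hello'

# Names bound in the class body end up in the class object's namespace.
print(sorted(name for name in Demo.__dict__ if not name.startswith('__')))
# ['g', 'x', 'y']
print(Demo.y)          # 24
```

The names x, y, and g are reachable as Demo.x, Demo.y, and Demo.g, but they are not bound in the surrounding module scope.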
class Statement
● class statement defines a named class
● class statements can be placed inside functions
● Multiple classes can be defined in one .py file
● Class definition must have at least one statement in its body (pass can be used as a placeholder)
Class Documentation
● To document a class, place a docstring immediately after the class statement
class <ClassName>:
    """
    Does nothing for the moment
    """
    pass
Creating Objects
● There is no new in Python
● Instances of a class are created by calling the class: the class name followed by ()
● This object creation process is called class instantiation:
class SimplePrinter:
    """
    This is class SimplePrinter.
    """
    pass
>>> x = SimplePrinter()
Operations Supported by Class Objects
● Class objects support two types of operations: attribute reference and instantiation
__metaclass__ = type
class A:
    ''' this is class A. '''
    x = 12
    def g(self):
        return 'Hello from A!'
Class Objects
>>> A.x ## attribute reference
>>> A.g ## attribute reference
>>> A.__doc__ ## attribute reference
>>> a = A() ## a is an instance of
## class A (instantiation)
Defining Class Methods
● In C++ terminology, all class members are public and all class methods are virtual
● All class methods are defined with def and must have the parameter self as their first parameter
● One can think of self as this in Java and C++
class SimplePrinter:
    def println(self):
        print
    def print_obj(self, obj):
        print obj,
Calling Methods on Instances
● To call a method on an instance, use the dot operator
● Do not put self as the first argument
>>> sp = SimplePrinter()
>>> sp.print_obj([1, 2]); sp.println()
Calling Methods on Instances
● What happens to self in sp.println()?
● The definition inside the SimplePrinter class is
def println(self):
    print
● How come self is not the first argument?
Calling Methods on Instances● The statement sp.println() is converted to
SimplePrinter.println(sp) so self is bound to sp
● In general, suppose there is a class C with a method f(self, x1, ..., xn)
● Suppose we do:
>>> x = C()
>>> x.f(v1, ..., vn)
● Then x.f(v1, ..., vn) is converted to C.f(x, v1, ..., vn)
Example
class C:
    def f(self, x1, x2, x3):
        return [x1, x2, x3]
>>> x = C()
>>> x.f(1, 2, 3)
[1, 2, 3]
>>> C.f(x, 1, 2, 3) ## equivalent to x.f(1, 2, 3)
[1, 2, 3]
Attributes and Attribute References
● The term attribute is used for any name that follows a dot
● For example, in the expression “a.x”, x is an attribute
class A:
    """ This is class A. """
    _x = 0
    _list = []
    def printX(self):
        print self._x,
● A.__doc__, A._x, A.printX, A._list are valid attribute references
Types of Attribute Names
● There are two types of attribute names: data attributes and method attributes
class A:
    """ This is class A. """
    _x = 0 ## data attribute _x
    _list = [] ## data attribute _list
    def printX(self): ## method attribute
        print self._x,
● A.__doc__, A._x, A._list are data attributes
● A.printX is a method attribute
Data Attributes
● Data attributes loosely correspond to data members in C++
● A data attribute does not have to be explicitly declared in the class definition
● A data attribute begins to exist when it is first assigned to
● Of course, integrating data attributes into the class definition makes the code easier to read and debug
Data Attributes
● This code illustrates that attributes do not have to be declared and begin their existence when they are first assigned to
class B:
    """ This is class B. """
    def __init__(self):
        self._number = 10
        self._list = [1, 2, 3]
>>> b = B()
>>> b._number
10
>>> b._list
[1, 2, 3]
Method Attributes
● Method attributes loosely correspond to member functions in C++
● A method is a function that belongs to a class
● If a is an object of class A, then a.printX is a method object
● Like function objects, method objects can be used outside of their classes, e.g. assigned to variables and called at some later point
Method Attributes
● Method attributes loosely correspond to member functions in C++
class A:
    _x = 0
    def printX(self):
        print self._x,
>>> a = A()
>>> m = a.printX
>>> m()
0
>>> a._x = 20
>>> m()
20
Method Attributes
● Data attributes override method attributes with the same name
class C:
    def f(self):
        print "I am a C object."
>>> c = C()
>>> c.f()
I am a C object.
>>> c.f = 10
>>> c.f
10
>>> c.f() ### error: TypeError, 'int' object is not callable
Method Attributes
● Consistent naming conventions help avoid clashes between data attributes and method attributes
● Choosing a naming convention and using it consistently makes reading and debugging code much easier
● Some naming conventions:
  ○ First letter in data attributes is lower case; first letter in method attributes is upper case
  ○ First letter in data attributes is underscore; first letter in method attributes is not underscore
  ○ Nouns are used for data attributes; verbs are used for methods
Reading & References
● www.python.org
● Ch 02, H. Abelson and G. Sussman. Structure and Interpretation of Computer Programs, MIT Press
● S. Roman, Coding and Information Theory, Springer-Verlag
● Ch 03, M. L. Hetland. Beginning Python From Novice to Professional, 2nd Ed., APRESS
● Ch 04, M. L. Hetland. Beginning Python From Novice to Professional, 2nd Ed., APRESS