Download - Advances Data Structures
-
8/14/2019 Advances Data Structures
1/36
Advanced Data StructuresLecture 1: Introduction
Mircea Marin
October 1 2013
Mircea Marin Advanced Data Structures
http://find/ -
8/14/2019 Advances Data Structures
2/36
Organizatorial items
Lecturer and TA: Mircea Marin
email: [email protected]
Course objectives:1 Become familiar with some of the advanced data structures
and algorithms which underpin much of todays computerprogramming
2
Recognize the data structure and algorithms that are bestsuited to model and solve your problem3 Be able to perform a qualitative analysis of your algorithms
time complexity analysis
Course webpage:
http://web.info.uvt.ro/mmarin/lectures/ADSHandouts: will be posted on the webpage of the lecture
Grading: 50% final exam (written), 50% lab + homework
Attendance: required
Mircea Marin Advanced Data Structures
http://web.info.uvt.ro/~mmarin/lectures/ADShttp://web.info.uvt.ro/~mmarin/lectures/ADShttp://find/ -
8/14/2019 Advances Data Structures
3/36
Organizatorial items
Lab work: implementation in C++ or Java of applications,
using data structures and algorithms presented in this
lecture
Requirements:
Be up-to-date with the presented material
Prepare the programming assignments for the stateddeadlines
Recommended textbooks
Cormen, Leiserson, Rivest. Introduction to Algorithms. MITPress.
Adam Drozdek. Data structures and algorithms inC++.Third edition, Thomson Course Technology, 2005.Aho, Hopcroft, Ullmann. Data structures and algorithms,Addison-Wesley, 1985.
Mircea Marin Advanced Data Structures
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
4/36
From problems to programs
Mircea Marin Advanced Data Structures
http://find/http://goback/ -
8/14/2019 Advances Data Structures
5/36
From problems to programs
Initial problems have no simple, precise specifications
Mircea Marin Advanced Data Structures
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
6/36
From problems to programs
Initial problems have no simple, precise specifications
Identify a good data structure to model the problem
Mircea Marin Advanced Data Structures
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
7/36
From problems to programs
Initial problems have no simple, precise specifications
Identify a good data structure to model the problem
Familiarization with the most common data structures usedin computer science is a good prerequisite!
Mircea Marin Advanced Data Structures
F bl
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
8/36
From problems to programs
Initial problems have no simple, precise specifications
Identify a good data structure to model the problem
Familiarization with the most common data structures usedin computer science is a good prerequisite!
Find a solution in the form of analgorithm, using the
selected modelAlgorithm =
Mircea Marin Advanced Data Structures
F bl t
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
9/36
From problems to programs
Initial problems have no simple, precise specifications
Identify a good data structure to model the problem
Familiarization with the most common data structures usedin computer science is a good prerequisite!
Find a solution in the form of analgorithm, using the
selected modelAlgorithm = finite sequence of instructions, each of whichhas a clear meaning and can be performed with a finiteamount of effort in a finite length of time.
Mircea Marin Advanced Data Structures
F bl t
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
10/36
From problems to programs
Initial problems have no simple, precise specifications
Identify a good data structure to model the problem
Familiarization with the most common data structures usedin computer science is a good prerequisite!
Find a solution in the form of analgorithm, using the
selected modelAlgorithm = finite sequence of instructions, each of whichhas a clear meaning and can be performed with a finiteamount of effort in a finite length of time.Do not confusealgorithmwithprogram!
Mircea Marin Advanced Data Structures
F bl t
http://find/ -
8/14/2019 Advances Data Structures
11/36
From problems to programs
Initial problems have no simple, precise specifications
Identify a good data structure to model the problem
Familiarization with the most common data structures usedin computer science is a good prerequisite!
Find a solution in the form of analgorithm, using the
selected modelAlgorithm = finite sequence of instructions, each of whichhas a clear meaning and can be performed with a finiteamount of effort in a finite length of time.Do not confusealgorithmwithprogram!
Algorithm is language-independent: it can be written in
pseudocode
Mircea Marin Advanced Data Structures
From problems to programs
http://find/ -
8/14/2019 Advances Data Structures
12/36
From problems to programs
Initial problems have no simple, precise specifications
Identify a good data structure to model the problem
Familiarization with the most common data structures usedin computer science is a good prerequisite!
Find a solution in the form of analgorithm, using the
selected modelAlgorithm = finite sequence of instructions, each of whichhas a clear meaning and can be performed with a finiteamount of effort in a finite length of time.Do not confusealgorithmwithprogram!
Algorithm is language-independent: it can be written in
pseudocode
To write a program, you need to choose a programming
language
Mircea Marin Advanced Data Structures
From problems to programs
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
13/36
From problems to programsA modeling scenario
Problem specification:
intersection of 5 roads: A, B, C, D, E; Roads C and E areoneway; There are 13 possible turns: AB, EC, etc.Design a light intersection system to avoid car collisions.
Chosen model: a graph with nodes = possible turns:
{AB, AC, AD, BA, BC, BD, DA, DB, DC, EA, EB, EC, ED} edges are between turns that can be performed
simultaneously.
Mircea Marin Advanced Data Structures
From problems to programs
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
14/36
From problems to programsA modeling scenario (contd.)
Graph coloring: assignment of a color to each node so
that no two vertices connected by an edge have the samecolor.
REMARK: Our problem is the same as coloring the graph
of incompatible turns usingas few colors as possible.
Advantages of the chosen model:
Graph are a well-known abstract data structure.
Graph coloring is a well-studied problem.
Bad news: the problem isNP complete: to find the best
solution, we must essentially try all possibilities :-(finding optimal solution may be very inefficient.Alternative:be happy with a reasonably good solution,using a heuristic algorithm. E.g., use the following "greedy"algorithm.
Mircea Marin Advanced Data Structures
From problems to programs
http://find/ -
8/14/2019 Advances Data Structures
15/36
From problems to programsA modeling scenario (contd.)
Example (Optimal coloring)
color turns
blue AB, AC, AD, BA, DC, EDred BC, BD, EA
green DA, DByellow EB, EC
Mircea Marin Advanced Data Structures
From problems to programs
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
16/36
From problems to programsA modeling scenario (contd.)
Example (Optimal coloring)
color turns
blue AB, AC, AD, BA, DC, EDred BC, BD, EA
green DA, DByellow EB, EC
The "greedy" algorithm - informal description
1 Select some uncolored node and color it with a new color.2 Scan the list of uncolored nodes. For each node,
determine whether it has an edge to any vertex already
colored with the new color. If there is no such edge, color
the present vertex with the new color.Mircea Marin Advanced Data Structures
Remarks about the greedy algorithm
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
17/36
Remarks about the greedy algorithm
Does not always use the minimum possible number of colors.
We can use the theory of algorithms to evaluate the
goodness of the solution produced.
k-clique= set of knodes, every pair of which is connected
by an edge.
REMARK 1: We needkcolors to color a k-clique.
REMARK 2: The illustrated graph of incompatible turns
contains a 4-cliqueAC, DA, BD, EB 4 colors are
needed.
Mircea Marin Advanced Data Structures
Pseudo-Language and Stepwise Refinements
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
18/36
Pseudo Language and Stepwise RefinementsIllustrated algorithm: greedy
Mircea Marin Advanced Data Structures
Pseudo-Language and Stepwise Refinements
http://find/http://goback/ -
8/14/2019 Advances Data Structures
19/36
Pseudo Language and Stepwise RefinementsIllustrated algorithm: greedy
This algorithm can be refined to more specific
implementation.
Mircea Marin Advanced Data Structures
Pseudo-Language and Stepwise Refinements
http://find/ -
8/14/2019 Advances Data Structures
20/36
Pseudo Language and Stepwise RefinementsIllustrated algorithm: greedy
First refinement: clarify the notion of adjacency
Mircea Marin Advanced Data Structures
Pseudo-Language and Stepwise Refinements
http://find/ -
8/14/2019 Advances Data Structures
21/36
Pseudo Language and Stepwise RefinementsIllustrated algorithm: greedy
Second refinement: clarify the notion of for each loop
procedure greedy(var G:GRAPH; var newclr:LIST);bool found;int v, w;
beginwhile v ! = null do begin
found := falsew := first vertex in newclrif there is an edge between v and w in G then
found:=true;w:=next vertex in newclr
end
if found=false do beginmark v colored;add v to newclr
endv :=next uncolored vertex in G
end { greedy }Mircea Marin Advanced Data Structures
Representations of graphs
http://find/ -
8/14/2019 Advances Data Structures
22/36
Representations of graphsAdjacency matrix, etc.
Adjacency matrix
AB AC AD BA BC BD DA DB DC EA EB EC ED
AB 1 1 1 1AC 1 1 1 1 1AD 1 1 1BA
BC 1 1 1BD 1 1 1 1 1DA 1 1 1 1 1DB 1 1 1DCEA 1 1 1EB 1 1 1 1 1
EC 1 1 1 1ED
Mircea Marin Advanced Data Structures
Abstract Data Types (ADT)
http://find/ -
8/14/2019 Advances Data Structures
23/36
yp ( )
ADT= mathematical model with a collection of operations
defined on that model.
ADT = generalization of primitive data types (integer, float,
char, boolean, etc.)
Typical examples of ADT:Lists, stacks, heaps, trees, sets, hash-tables, etc.
In OOP (e.g., C++), they can be implemented as classes.
Benefits: encapsulation, generalization (by classextension), overloading, . . .
Mircea Marin Advanced Data Structures
ADTs for the greedy algorithm
http://find/ -
8/14/2019 Advances Data Structures
24/36
g y g
For colors:List
Desirable operations:init(colorList) - initialize the list of colors to be empty.first(colorList) - return the first element of the list, and null ifthe list is empty.next(colorList) - get the next element of the list, and null if
there is no next memberinsert(c, colorList) - insert a new color c into the colorList
For colored graphs:GraphDesirable operations:
get the first uncolored node,
test whether there is an edge between two nodes,mark a node colored,get the next uncolored node,. . .
Mircea Marin Advanced Data Structures
Running time of a program
http://find/http://goback/ -
8/14/2019 Advances Data Structures
25/36
g p g
Factors on which it depends:
Size of the input.
Quality of compiler-generated code.
Nature and speed of computer instructions.Time complexity of the underlying algorithm.
The running time of a program should be defined as a function
T(n)of the sizenof input.
Mircea Marin Advanced Data Structures
The Big-Oh, Big-Omega, and Big-Theta notation
http://find/ -
8/14/2019 Advances Data Structures
26/36
g , g g , g
Assumptions:
T(n) :running time function, depending on input sizen N.
"Big-Oh" notation: T(n)is O(f(n))if there arec R,c >0,andn0 N such thatT(n) c f(n)for allnn0.
"Big-Omega" notation: T(n)is(f(n))if there arec R,c >0 such that T(n) c f(n)for infinitely many values ofn.
"Big -Theta" notation: T(n)is(f(n))if there are
c1, c2 R,c1, c2 >0, andn0 N such thatc1 f(n)T(n) c2 f(n)for allnn0.
Mircea Marin Advanced Data Structures
Desirable running times
http://find/ -
8/14/2019 Advances Data Structures
27/36
g
Polynomial running time: T(n)isO(nk)for somek >0.
Algorithms with polynomial running time are calledtractable.
Question:AlgorithmA1 has running time 5 n3 andA2 has
running time 100 n2.Which alg. has better running time?
Mircea Marin Advanced Data Structures
Desirable running times
http://find/ -
8/14/2019 Advances Data Structures
28/36
g
Polynomial running time: T(n)isO(nk)for somek >0.
Algorithms with polynomial running time are calledtractable.
Question:AlgorithmA1 has running time 5 n3 andA2 has
running time 100 n2.Which alg. has better running time?
Mircea Marin Advanced Data Structures
Desirable running times
http://find/ -
8/14/2019 Advances Data Structures
29/36
Polynomial running time: T(n)isO(nk)for somek >0.
Algorithms with polynomial running time are called
tractable.
Question:AlgorithmA1 has running time 5 n3 andA2 has
running time 100 n2.Which alg. has better running time?
Answer: A1 ifn
-
8/14/2019 Advances Data Structures
30/36
Rule for sumsfor programP=P1;. . .; Pkmade ofsequence of programsP1, . . . ,Pk.
IfPi isO(fi(n))fori=1, . . . , k thenP isO(f(n))wheref(n) =max{fi(n)| 1 ik}.
If there existsn0 N andg(n)such thatfi(n) g(n)for alln n0 thenPis inO(g(n)).
Rule for products: IfT1(n)isO(f(n))andT2(n)isO(g(n))thenT1(n) T2(n)isO(f(n)g(n)).
Mircea Marin Advanced Data Structures
Case study: Bubble-sort
http://find/ -
8/14/2019 Advances Data Structures
31/36
Input size: n
Question:What isT(n)?
Mircea Marin Advanced Data Structures
General rules for the analysis of programs
http://find/ -
8/14/2019 Advances Data Structures
32/36
ASSUMPTION: The only parameter for the running time functionTis the input size (n)
The running time of each assignment, read, and write statementcan usually be taken to beO(1).
The running time of a sequence of stmts. is given by sum rule.That is, the running time of the sequence is, to within a constantfactor, the largest running time of any stmt. in the sequence.
An ifstatement has running time = time for evaluating condition(usuallyO(1)) + time for evaluating the corresponding branch.
The running time to execute a loop is the sum, over all timesaround the loop, of the time to execute the body and the time to
evaluate the condition for termination (usually the latter is O(1)).Often this time is, neglecting constant factors, the product of thenumber of times around the loop and the largest possible timefor one execution of the body, but we must consider each loopseparately to make sure.
Mircea Marin Advanced Data Structures
Exercise 1
http://find/ -
8/14/2019 Advances Data Structures
33/36
What is the running time T(n)for the computation offact(n)bythe following recursive program:
Mircea Marin Advanced Data Structures
Exercise 3
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
34/36
Give, using big oh notation, the worst case running times ofthe following procedures as a function of n.
Mircea Marin Advanced Data Structures
From algorithms to programs
http://goforward/http://find/http://goback/ -
8/14/2019 Advances Data Structures
35/36
Object-oriented programming languages provide goodsupport for abstract data types
Programming in C++ is encouragedProgramming in Java is also ok
Modern OOP languages, including C++ and Java, have
built-in implementations of several advanced datastructures (ADS) and algorithms presented in this course
The missing ADS and algorithms are not hard to implementby yourself
C++ provides the Standard Template Library STL to work
with efficient implementations of data structures.
Mircea Marin Advanced Data Structures
References
http://find/ -
8/14/2019 Advances Data Structures
36/36
Alfred V. Ahoet al. Data Structures and Algorithms.
Chapter 1: Design and Analysis of Algorithms.
1983. Addison-Wesley.
Mircea Marin Advanced Data Structures
http://find/