CSC 430: Basics of Algorithm Runtime Analysis
1
CSC 430: Basics of Algorithm Runtime Analysis
2
What is Runtime Analysis?
• Fundamentally, it is the process of counting the number of “steps” it takes for an algorithm to terminate
• “Step” = primitive computer operation
  – Arithmetic
  – Read/Write memory
  – Make a comparison
  – …
• The number of steps needed depends on:
  – The fundamental algorithmic structure
  – The data structures you use
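As a concrete (hedged) illustration of step counting, here is a sketch of my own, not from the slides: a tiny summation algorithm instrumented to tally its primitive operations. The exact charging scheme (one step per write, three per loop iteration) is an arbitrary choice; what matters is that the count grows linearly with the input size.

```python
def sum_with_count(L):
    steps = 0
    total = 0          # one memory write
    steps += 1
    for x in L:        # per element: one read, one add, one write
        total += x
        steps += 3
    return total, steps

total, steps = sum_with_count([1, 2, 3, 4])
# total = 10, steps = 1 + 3*len(L) = 13
```

Whatever constant we charge per operation, the step count is a constant factor times len(L), which is the idea behind asymptotic bounds later in the deck.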
5
Other Types of Algorithm Analysis
• Algorithm space analysis
• Approximate-solution analysis
• Parallel algorithm analysis
• A PROBLEM’S run-time analysis
• Complexity class analysis
6
Runtime Motivates Design
• Easiest algorithm to implement for any problem: brute force.
  – Check every possible solution.
  – Typically takes 2^N time or worse for inputs of size N.
  – Unacceptable in practice.
• Usually, through insight, we can do much better
• What do we mean by “better”?
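To make the brute-force pattern concrete, here is a hedged sketch of my own (the slides don’t name a specific problem): a brute-force subset-sum search that examines all 2^N subsets of an N-element input.

```python
from itertools import combinations

def subset_sum_brute(nums, target):
    # Try every one of the 2^N subsets -- exponential time.
    for k in range(len(nums) + 1):
        for combo in combinations(nums, k):
            if sum(combo) == target:
                return combo
    return None

print(subset_sum_brute([3, 9, 8, 4], 12))   # (3, 9)
```

Doubling the input length doubles the exponent, so even modest inputs (N around 40 or 50) become infeasible, which is why we look for the “insight” the next bullet mentions.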
7
Worst-Case Analysis
• Best case running time:
  – Rarely used; often superfluous
• Average case running time:
  – Obtain bound on running time of algorithm on random input as a function of input size N.
  – Hard (or impossible) to accurately model real instances by random distributions.
  – Acceptable in certain, easily-justified scenarios
• Worst case running time:
  – Obtain bound on largest possible running time
  – Generally captures efficiency in practice.
  – Pessimistic view, but hard to find an effective alternative.
9
Defining Efficiency: Worst-Case Polynomial-Time
• Def. An algorithm is efficient if its worst-case running time is polynomial or better.
• Not always true, but works in practice:
  – 6.02 × 10^23 · N^20 is technically poly-time
  – Breaking through the exponential barrier of brute force typically exposes some crucial structure of the problem.
  – Most poly-time algorithms that people develop have low constants and low exponents (e.g., N, N^2, N^3).
• Exceptions:
  – Sometimes the problem size demands greater efficiency than poly-time
  – Some poly-time algorithms do have high constants and/or exponents, and are useless in practice.
  – Some exponential-time (or worse) algorithms are widely used because the worst-case instances seem to be rare (e.g., the simplex method, Unix grep).
11
2.2 Asymptotic Order of Growth
12
What is Asymptotic Analysis?
• Instead of giving a precise runtime, we’ll show that algorithms are bounded within a constant factor of common functions
• Three types of bounds
  – Upper Bounds (never greater)
  – Lower Bounds (never less)
  – Tight Bounds (Upper and Lower are the same)
13
Growth Notation
• Upper bounds (Big-Oh):
  – T(n) ∈ O(f(n)) if there exist constants c > 0 and n0 ≥ 0 such that for all n ≥ n0 we have T(n) ≤ c · f(n).
• Lower bounds (Omega):
  – T(n) ∈ Ω(f(n)) if there exist constants c > 0 and n0 ≥ 0 such that for all n ≥ n0 we have T(n) ≥ c · f(n).
• Tight bounds (Theta):
  – T(n) ∈ Θ(f(n)) if T(n) is both O(f(n)) and Ω(f(n)).
• Frequently abused notation: T(n) = O(f(n)).
  – Asymmetric:
    • f(n) = 5n^3; g(n) = 3n^2
    • f(n) = O(n^3) = g(n)
    • but f(n) ≠ g(n).
• Better notation: T(n) ∈ O(f(n)).
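The Big-Oh definition can be checked numerically once concrete witnesses are chosen. A minimal sketch of my own, assuming T(n) = 5n^3 + 2n, f(n) = n^3, and witnesses c = 6, n0 = 2 (a finite spot check, not a proof):

```python
def T(n):
    return 5 * n**3 + 2 * n

def f(n):
    return n**3

# Witnesses for T in O(f): T(n) <= c*f(n) must hold for all n >= n0.
c, n0 = 6, 2
print(all(T(n) <= c * f(n) for n in range(n0, 5000)))   # True
```

Note that the inequality fails below n0 (T(1) = 7 > 6 = c·f(1)), which is exactly why the definition only demands it for n ≥ n0.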
14
Use-Cases
• Big-Oh:
  – “The algorithm’s runtime is O(N lg N), a potential improvement over prior approaches, which were O(N^2).”
• Omega:
  – “Their algorithm’s runtime is Ω(N^2), which is worse than our algorithm’s proven O(N^1.5) runtime.”
• Theta:
  – “Our original algorithm was Ω(N lg N) and O(N^2). With modifications, we were able to improve the runtime bounds to Θ(N^1.5).”
[Chart: growth of N^2, N lg N, and N^1.5 for N from 0 to 300]
15
Properties
• Transitivity:
  – If f ∈ O(g) and g ∈ O(h) then f ∈ O(h).
  – If f ∈ Ω(g) and g ∈ Ω(h) then f ∈ Ω(h).
  – If f ∈ Θ(g) and g ∈ Θ(h) then f ∈ Θ(h).
• Additivity:
  – If f ∈ O(h) and g ∈ O(h) then f + g ∈ O(h).
  – If f ∈ Ω(h) and g ∈ Ω(h) then f + g ∈ Ω(h).
  – If f ∈ Θ(h) and g ∈ O(h) then f + g ∈ Θ(h).
  – These generalize to any number of terms
• “Symmetry”:
  – If f ∈ Θ(h) then h ∈ Θ(f)
  – If f ∈ O(h) then h ∈ Ω(f)
  – If f ∈ Ω(h) then h ∈ O(f)
16
Asymptotic Bounds for Some Common Functions
• Polynomials. a0 + a1·n + … + ad·n^d is Θ(n^d) if ad > 0.
• Polynomial time. Running time is O(n^d) for some constant d independent of the input size n.
• Logarithms. O(log_a n) = O(log_b n) for any constants a, b > 1, so we can avoid specifying the base.
• Logarithms. For every x > 0, log n = O(n^x): log grows slower than every polynomial.
• Exponentials. For every r > 1 and every d > 0, n^d = O(r^n): every exponential grows faster than every polynomial.
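These facts can be illustrated numerically; the following is a hedged spot check of my own (sample evaluations, not proofs). The base-change claim shows up as a ratio that is the same constant for every n, and the exponential-vs-polynomial claim shows up as a crossover: the polynomial leads for small n but loses eventually.

```python
import math

# Base change: log_2 n / log_10 n equals the constant log_2 10
# for every n, so the base can be omitted inside O(.).
ratios = [math.log2(n) / math.log10(n) for n in (10, 1000, 10**6)]
print(ratios)   # each ~3.3219

# Exponentials overtake polynomials: compare 1.1^n with n^10.
print(1.1**500  > 500**10)    # False: the polynomial is still ahead
print(1.1**1000 > 1000**10)   # True: the exponential has taken over
```

The crossover can be very late (here, somewhere between n = 500 and n = 1000 for a base as small as 1.1), which is why asymptotic claims always carry an implicit “for large enough n.”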
17
Proving Bounds
• Use Transitivity:
  – If an algorithm is at least as fast as another, then it shares that algorithm’s upper bound
• Use Additivity:
  – f(n) = n^3 + n^2 + n + 3
  – f(n) ∈ Θ(n^3)
• Be direct:
  – Start with the definition and provide the parts it requires (the constants c and n0)
• Take the logarithm of both functions
• Use limits:
  – If lim(x→∞) f(x)/g(x) = c > 0, then f(x) ∈ Θ(g(x))
  – If lim(x→∞) f(x)/g(x) = 0, then f(x) ∈ O(g(x))
  – If lim(x→∞) f(x)/g(x) = ∞, then f(x) ∈ Ω(g(x))
• Keep in mind that sometimes functions can’t be bounded this way. (But I won’t assign you any like that!)
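The limit rule can be approximated numerically by evaluating f(x)/g(x) at a single large x; a hedged sketch of my own (a finite ratio only suggests the limit, it does not prove it):

```python
def ratio(f, g, x=10**6):
    # Approximate lim f(x)/g(x) by sampling one large x.
    return f(x) / g(x)

# f(x) = 3x^2 + x vs g(x) = x^2: ratio -> 3 > 0, so f in Theta(g).
print(ratio(lambda x: 3*x*x + x, lambda x: x*x))   # ~3.000001
# f(x) = x vs g(x) = x^2: ratio -> 0, so f in O(g) (not Theta).
print(ratio(lambda x: x, lambda x: x*x))           # ~1e-06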
19
2.4 A Survey of Common Running Times
20
Linear Time: O(n)
• Linear time. Running time is at most a constant factor times the size of the input.
• Computing the maximum. Compute the maximum of n numbers a1, …, an.

#renamed from max so Python's builtin max is not shadowed
def find_max(L):
    best = L[0]
    for i in L:
        if i > best:
            best = i
    return best
21
Linear Time: O(n)
• Claim. Merging two sorted lists of size n takes O(n) time.
• Pf. After each comparison, the length of the output list increases by 1.

#the merge algorithm combines two
#sorted lists into one larger
#sorted list
def merge(A, B):
    i = j = 0
    M = []
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            M.append(A[i])
            i += 1
        else:
            M.append(B[j])
            j += 1
    #append whatever remains of the list that was not exhausted
    return M + A[i:] + B[j:]
22
O(n log n) Time
• O(n log n) time. Arises in divide-and-conquer algorithms.
• Sorting. Mergesort and heapsort are sorting algorithms that perform O(n log n) comparisons.
• Largest empty interval. Given n time-stamps x1, …, xn at which copies of a file arrive at a server, what is the largest interval of time during which no copies of the file arrive?
• O(n log n) solution. Sort the time-stamps. Scan the sorted list in order, identifying the maximum gap between successive time-stamps.
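The solution just described can be sketched directly (this code is my own illustration, not from the slides; it assumes at least two time-stamps):

```python
def largest_empty_interval(stamps):
    s = sorted(stamps)                         # O(n log n)
    # Scan adjacent pairs for the biggest gap: O(n).
    return max(b - a for a, b in zip(s, s[1:]))

print(largest_empty_interval([4, 1, 9, 5]))    # 4 (the gap from 5 to 9)
```

The sort dominates, so the total is O(n log n); the linear scan afterwards follows the additivity rule from earlier.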
23
Quadratic Time: O(n^2)
• Quadratic time. Enumerate all pairs of elements.
• Closest pair of points. Given a list of n points in the plane (x1, y1), …, (xn, yn), find the pair that is closest.
• Remark. Ω(n^2) seems inevitable, but this is just an illusion (see Chapter 5).

def closest_pair(L):
    x1 = L[0]
    x2 = L[1]
    #no need to compute the sqrt for the distance
    #(best renamed from min to avoid shadowing the builtin)
    best = (x1[0] - x2[0])**2 + (x1[1] - x2[1])**2
    for i in range(len(L)):
        for j in range(i+1, len(L)):
            x1, x2 = L[i], L[j]
            d = (x1[0] - x2[0])**2 + (x1[1] - x2[1])**2
            if d < best:
                best = d
    return best
24
Cubic Time: O(n^3)
• Cubic time. Enumerate all triples of elements.
• Set disjointness. Given n sets S1, …, Sn, each of which is a subset of {1, 2, …, n}, is there some pair of them which are disjoint?

def find_disjoint_sets(S):
    for i in range(len(S)):
        for j in range(i+1, len(S)):
            for p in S[i]:
                if p in S[j]:
                    break
            else: #no p in S[i] belongs to S[j]
                return S[i], S[j] #return the pair if found
    return None #return nothing if every pair of sets overlaps
25
Polynomial Time: O(n^k) Time
• Independent set of size k. Given a graph, are there k nodes such that no two are joined by an edge?
• O(n^k) solution. Enumerate all subsets of k nodes.
  – Check whether S is an independent set = O(k^2).
  – Number of k-element subsets = C(n, k)
  – O(k^2 · n^k / k!) = O(n^k).
def independent_k(V, E, k):
    #subset_k, independent omitted for space
    for sub in subset_k(V, k):
        if independent(sub, E):
            return sub
    return None
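The slide omits subset_k and independent for space; the following is one plausible (hypothetical) way to fill them in, assuming V is a sequence of nodes and E is a set of undirected edge tuples:

```python
from itertools import combinations

def subset_k(V, k):
    # All k-element subsets of V: C(n, k) of them.
    return combinations(V, k)

def independent(sub, E):
    # Independent = no two chosen nodes are joined by an edge;
    # checks all C(k, 2) pairs, hence the O(k^2) cost per subset.
    return all((u, v) not in E and (v, u) not in E
               for u, v in combinations(sub, 2))
```

With these helpers, independent_k([1, 2, 3], {(1, 2), (2, 3)}, 2) would find the independent pair (1, 3).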
C(n, k) = n(n-1)(n-2)⋯(n-k+1) / (k(k-1)(k-2)⋯(2)(1)) ≤ n^k / k!
(k is a constant; poly-time for k = 17, but not practical)
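The bound C(n, k) ≤ n^k / k! used above can be spot-checked numerically (a sanity check over small n and k, not a proof):

```python
import math

# Each factor n-i in the numerator of C(n, k) is at most n,
# so C(n, k) <= n^k / k!; verify over a small range.
ok = all(math.comb(n, k) <= n**k / math.factorial(k)
         for n in range(1, 30) for k in range(n + 1))
print(ok)   # True
```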
26
Exponential Time
• Independent set. Given a graph, what is the maximum size of an independent set?
• O(2n) solution. Enumerate all subsets.
def max_independent(V, E):
    #try increasing sizes until no independent set of size k exists
    for k in range(1, len(V) + 1):
        if not independent_k(V, E, k):
            return k - 1
    return len(V) #the whole vertex set is independent