CSC – 332 Data Structures Generics Analysis of Algorithms Dr. Curry Guinn

Upload: edwin-townsend

Post on 17-Jan-2016



For Next Class, Thursday

• Homework 1 due tonight
  – Quiz 2 – Today, 01/21, before class!
    • Up to 3 submissions
  – Quiz 3 Thursday by class time
• Homework 2 due Monday, 01/27, 11:59pm
• For Thursday
  – Chapter 2, Sections 2.1-2.4.2


Quiz Answers

• Let’s go to Blackboard Learn and see


Generics

• http://people.uncw.edu/guinnc/courses/spring14/332/notes/day2_unix/Generics.ppt

Is This Algorithm Fast?

• Question: given a problem, how fast does this code solve it?
• "My program finds all the primes between 2 and 1,000,000,000 in 1.37 seconds."
  – How good is this solution?
• We could try to measure the time it takes, but that measurement is subject to many sources of error:
  – multitasking operating system
  – speed of the computer
  – language the solution is written in

Math background: exponents

• Exponents
  – $X^Y$, or "X to the Yth power": the product of Y copies of X
• Some useful identities
  – $X^A X^B = X^{A+B}$
  – $X^A / X^B = X^{A-B}$
  – $(X^A)^B = X^{AB}$
  – $X^N + X^N = 2X^N$
  – $2^N + 2^N = 2^{N+1}$
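The last identity is the one that most often surprises people, so here is a short derivation of it from the identities listed before it. Nothing new is assumed; it only chains two of the rules above.

```latex
\begin{align*}
2^N + 2^N &= 2 \cdot 2^N   && \text{(two copies of the same term: } X^N + X^N = 2X^N\text{)} \\
          &= 2^1 \cdot 2^N \\
          &= 2^{N+1}       && \text{(same base, add exponents: } X^A X^B = X^{A+B}\text{)}
\end{align*}
```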

Math background: logarithms

• Logarithms
  – definition: $X^A = B$ if and only if $\log_X B = A$
  – intuition: $\log_X B$ means "the power X must be raised to, to get B"
  – In this course, a logarithm with no base implies base 2: $\log B$ means $\log_2 B$
• Examples
  – $\log_2 16 = 4$ (because $2^4 = 16$)
  – $\log_{10} 1000 = 3$ (because $10^3 = 1000$)

Logarithm identities

Identities for logs with addition, multiplication, powers:

• $\log(AB) = \log A + \log B$
• $\log(A/B) = \log A - \log B$
• $\log(A^B) = B \log A$
• $\log_A B = \log_C B / \log_C A$
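A minimal Java sketch of two ways to compute logarithms, offered as an illustration rather than as course code. The class and method names (`LogDemo`, `logBase`, `floorLog2`) are made up for this example. `logBase` uses the change-of-base identity with `Math.log` (natural log); `floorLog2` counts how many times a number can be halved, which matches the intuition for base-2 logs.

```java
class LogDemo {
    // Change of base: log_a(b) = ln(b) / ln(a). Result is a floating-point approximation.
    static double logBase(double a, double b) {
        return Math.log(b) / Math.log(a);
    }

    // floor(log2(n)) for n >= 1: the number of integer halvings needed to reach 1.
    static int floorLog2(long n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        int halvings = 0;
        while (n > 1) {
            n /= 2;
            halvings++;
        }
        return halvings;
    }

    public static void main(String[] args) {
        System.out.println(logBase(2, 16));     // approximately 4
        System.out.println(logBase(10, 1000));  // approximately 3
        System.out.println(floorLog2(16));
        System.out.println(floorLog2(1000));
    }
}
```

Because `logBase` works in floating point, compare its results with a small tolerance instead of `==`.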

Some helpful mathematics

• N + N + N + … + N (total of N times)
  – $N \cdot N = N^2$, which is $O(N^2)$
• 1 + 2 + 3 + 4 + … + N
  – $N(N+1)/2 = N^2/2 + N/2$, which is $O(N^2)$
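The second sum can be checked numerically. This is a small sketch with hypothetical names (`SumDemo`, `sumByLoop`, `sumByFormula`) that adds 1 through n with a loop and compares the result with the closed form N(N+1)/2.

```java
class SumDemo {
    // Adds 1 + 2 + ... + n one term at a time (n additions).
    static long sumByLoop(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Closed form N(N+1)/2; the cast keeps the product in 64-bit arithmetic.
    static long sumByFormula(int n) {
        return (long) n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println(sumByLoop(100));     // 5050
        System.out.println(sumByFormula(100));  // 5050
    }
}
```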

Analysis of Algorithms

• What do we mean by an “efficient” algorithm?
  – We mean an algorithm that uses few resources.
  – By far the most important resource is time.
  – Thus, when we say an algorithm is efficient (assuming we do not qualify this further), we mean that it can be executed quickly.

• Is there some way to measure efficiency that does not depend on the state of current technology?
  – Yes!
• The Idea
  – Determine the number of “steps” an algorithm requires when given some input.
    • We need to define “step” in some reasonable way.
  – Write this as a formula, based on the input.
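One way to make the idea concrete is to instrument a small routine and count its steps. The sketch below is an illustration only: it assumes a "step" is one element comparison, and the names `StepCount` and `comparisonsToFindMax` are invented for the example. For a non-empty array of length n, the count it returns is given by the formula n - 1, whatever the values are.

```java
class StepCount {
    // Finds the largest value and returns how many element comparisons were made.
    // Assumes data is non-empty.
    static int comparisonsToFindMax(int[] data) {
        int comparisons = 0;
        int max = data[0];
        for (int i = 1; i < data.length; i++) {
            comparisons++;               // one "step": compare data[i] with max
            if (data[i] > max) {
                max = data[i];
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] sample = {3, 1, 4, 1, 5};
        System.out.println(comparisonsToFindMax(sample));  // 4, i.e. n - 1 for n = 5
    }
}
```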

Generally, when we determine the efficiency of an algorithm, we are interested in:

– Time Used by the Algorithm
  • Expressed in terms of number of steps.
  • People also talk about “space efficiency”, etc.
– How the Size of the Input Affects Running Time
  • Think about giving an algorithm a list of items to operate on. The size of the problem is the length of the list.
– Worst-Case Behavior
  • What is the slowest the algorithm ever runs for a given input size?
  • Occasionally we also analyze average-case behavior.
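Linear search is a convenient example of why the worst case is singled out. In this sketch (hypothetical names `LinearSearchCost` and `elementsExamined`), the same array of length 4 costs 1 step when the target is first and 4 steps when the target is last or absent; the worst case is the number a size-N guarantee has to be based on.

```java
class LinearSearchCost {
    // Scans left to right and returns how many elements were examined
    // before the search stopped (either on a match or at the end).
    static int elementsExamined(int[] data, int target) {
        int examined = 0;
        for (int value : data) {
            examined++;
            if (value == target) {
                break;
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        int[] data = {7, 3, 9, 5};
        System.out.println(elementsExamined(data, 7));   // best case: 1
        System.out.println(elementsExamined(data, 5));   // target is last: 4
        System.out.println(elementsExamined(data, 42));  // target absent: 4
    }
}
```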

RAM model

• Typically use a simple model for basic operation costs
• RAM (Random Access Machine) model
  – RAM model has all the basic operations: +, -, *, /, =, comparisons
  – fixed-size integers (e.g., 32-bit)
  – infinite memory
  – All basic operations take exactly one time unit (one CPU instruction) to execute

Critique of the model

• Strengths:
  – simple
  – easier to prove things about the model than the real machine
  – can estimate algorithm behavior on any hardware/software
• Weaknesses:
  – not all operations take the same amount of time in a real machine
  – does not account for page faults, disk accesses, limited memory, floating point math, etc.

Relative rates of growth

• Most algorithms' runtime can be expressed as a function of the input size N
• Rate of growth: measure of how quickly the graph of a function rises
• Goal: distinguish between fast- and slow-growing functions
  – We only care about very large input sizes (for small sizes, almost any algorithm is fast enough)

Growth rate example

Consider these graphs of functions. Perhaps each one represents an algorithm:

  $n^3 + 2n^2$
  $100n^2 + 1000$

• Which grows faster?


Growth rate example

• How about now?
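Since the graphs themselves are not reproduced here, the two functions from the growth rate example can be compared numerically. This sketch (class and method names are invented for illustration) searches for the first n at which the cubic function overtakes the quadratic one. For small n the quadratic is larger because of its big constants; from n = 99 onward the cubic is larger and stays larger.

```java
class GrowthCrossover {
    static long cubic(long n) {
        return n * n * n + 2 * n * n;      // n^3 + 2n^2
    }

    static long quadratic(long n) {
        return 100 * n * n + 1000;         // 100n^2 + 1000
    }

    // Smallest n >= 1 at which the cubic function exceeds the quadratic one.
    static long firstCrossover() {
        long n = 1;
        while (cubic(n) <= quadratic(n)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(firstCrossover());  // 99
    }
}
```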


• “The fundamental law of computer science: As machines become more powerful, the efficiency of algorithms grows more important, not less.” — Nick Trefethen

• An algorithm (or function or technique …) that works well when used with large problems & large systems is said to be scalable.
  – Or “it scales well”.

Big O

• Definition: $T(N) = O(f(N))$ if there exist positive constants $c$, $n_0$ such that:
    $T(N) \le c \cdot f(N)$ for all $N \ge n_0$
• Idea: We are concerned with how the function grows when N is large. We are not picky about constant factors; we make only coarse distinctions among functions.
• Lingo: "T(N) grows no faster than f(N)."

Big O

• $\exists\, c, n_0 > 0$ such that $f(N) \le c\, g(N)$ when $N \ge n_0$
• f(N) grows no faster than g(N) for “large” N
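The definition asks for witnesses c and n0. The sketch below tries candidate witnesses for a made-up running-time function T(N) = 3N + 10 against f(N) = N; both functions and all names here are assumptions chosen for illustration. A finite check like this can show that a pair (c, n0) fails, but it cannot prove a bound for all N; that still takes algebra (here, 3N + 10 ≤ 4N exactly when N ≥ 10).

```java
class BigOWitness {
    // Hypothetical running-time function, for illustration only.
    static long t(long n) {
        return 3 * n + 10;
    }

    static long f(long n) {
        return n;
    }

    // True if T(N) <= c * f(N) for every N in [n0, limit].
    static boolean boundHolds(long c, long n0, long limit) {
        for (long n = n0; n <= limit; n++) {
            if (t(n) > c * f(n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(boundHolds(4, 10, 100_000));  // true: c = 4, n0 = 10 works
        System.out.println(boundHolds(4, 1, 100));       // false: fails at N = 1
        System.out.println(boundHolds(3, 1, 100));       // false: c = 3 is too small
    }
}
```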

Preferred big-O usage

• Pick the tightest bound. If $f(N) = 5N$, then:
    $f(N) = O(N^5)$
    $f(N) = O(N^3)$
    $f(N) = O(N \log N)$
    $f(N) = O(N)$ (preferred)
• Ignore constant factors and low-order terms:
    $T(N) = O(N)$, not $T(N) = O(5N)$
    $T(N) = O(N^3)$, not $T(N) = O(N^3 + N^2 + N \log N)$
• Remove non-base-2 logarithms:
    $f(N) = O(N \log_6 N)$
    $f(N) = O(N \log N)$ (preferred)


Big-O of selected functions

Big Omega, Theta

• Defn: $T(N) = \Omega(g(N))$ if there are positive constants $c$ and $n_0$ such that $T(N) \ge c\, g(N)$ for all $N \ge n_0$
  – Lingo: "T(N) grows no slower than g(N)."
• Defn: $T(N) = \Theta(g(N))$ if and only if $T(N) = O(g(N))$ and $T(N) = \Omega(g(N))$.
  – Big-O, Omega, and Theta establish a relative ordering among all functions of N

Intuition about the notations

notation          intuition
O (Big-O)         ≤
Ω (Big-Omega)     ≥
Θ (Theta)         =
o (little-O)      <

Big-Omega

• $\exists\, c, n_0 > 0$ such that $f(N) \ge c\, g(N)$ when $N \ge n_0$
• f(N) grows no slower than g(N) for “large” N

Big Theta: $f(N) = \Theta(g(N))$

• the growth rate of f(N) is the same as the growth rate of g(N)

• An $O(1)$ algorithm is constant time.
  – The running time of such an algorithm is essentially independent of the input.
  – Such algorithms are rare, since they cannot even read all of their input.
• An $O(\log_b n)$ [for some b] algorithm is logarithmic time.
  – We do not care what b is.
• An $O(n)$ algorithm is linear time.
  – Such algorithms are not rare.
  – This is as fast as an algorithm can be and still read all of its input.
• An $O(n \log_b n)$ [for some b] algorithm is log-linear time.
  – This is about as slow as an algorithm can be and still be truly useful (scalable).
• An $O(n^2)$ algorithm is quadratic time.
  – These are usually too slow.
• An $O(b^n)$ [for some b] algorithm is exponential time.
  – These algorithms are much too slow to be useful.
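Binary search is a familiar example of logarithmic time, although the slides do not name it. The sketch below (invented names `BinarySearchProbes` and `probes`) counts how many array positions a standard binary search inspects. On a sorted array of 1024 elements, an unsuccessful search for a value larger than every element inspects 11 positions, compared with 1024 for a linear scan.

```java
class BinarySearchProbes {
    // Counts the positions inspected by a standard binary search.
    // Assumes sorted is in ascending order.
    static int probes(int[] sorted, int target) {
        int lo = 0;
        int hi = sorted.length - 1;
        int count = 0;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;   // avoids overflow of lo + hi
            count++;
            if (sorted[mid] == target) {
                return count;
            }
            if (sorted[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] data = new int[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println(probes(data, 511));   // found on the first probe: 1
        System.out.println(probes(data, 2000));  // absent, larger than all: 11
    }
}
```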

Hammerin’ the terminology

• $T(N) = O(f(N))$
  – f(N) is an upper bound on T(N)
  – T(N) grows no faster than f(N)
• $T(N) = \Omega(g(N))$
  – g(N) is a lower bound on T(N)
  – T(N) grows at least as fast as g(N)
• $T(N) = o(h(N))$ (little-O)
  – T(N) grows strictly slower than h(N)

Notations

Asymptotically less than or equal to       O (Big-O)
Asymptotically greater than or equal to    Ω (Big-Omega)
Asymptotically equal to                    Θ (Big-Theta)
Asymptotically strictly less               o (Little-O)

Facts about big-O

• If T(N) is a polynomial of degree k, then: $T(N) = \Theta(N^k)$
  – example: $17n^3 + 2n^2 + 4n + 1 = \Theta(n^3)$
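A quick numerical look at the example polynomial shows why only the leading term matters. In this sketch (names `PolynomialRatio`, `t`, and `ratio` are invented), the ratio T(n) / n^3 approaches the leading coefficient 17 as n grows, so T(n) is squeezed between constant multiples of n^3.

```java
class PolynomialRatio {
    // The example polynomial: 17n^3 + 2n^2 + 4n + 1.
    static double t(double n) {
        return 17 * n * n * n + 2 * n * n + 4 * n + 1;
    }

    // T(n) divided by its leading power n^3.
    static double ratio(double n) {
        return t(n) / (n * n * n);
    }

    public static void main(String[] args) {
        for (double n = 10; n <= 1_000_000; n *= 10) {
            System.out.println(n + " -> " + ratio(n));
        }
    }
}
```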

Hierarchy of Big-O

• Functions, ranked in increasing order of growth:
  – $1$
  – $\log \log n$
  – $\log n$
  – $n$
  – $n \log n$
  – $n^2$
  – $n^2 \log n$
  – $n^3$
  – ...
  – $2^n$
  – $n!$
  – $n^n$

Various growth rates

$T(n) = \Theta(1)$: constant time
$T(n) = \Theta(\log \log n)$: just about as fast as constant time
$T(n) = \Theta(\log n)$: logarithmic time
$T(n) = \Theta(n)$: linear time
$T(n) = \Theta(n \log n)$: log-linear time; famous for sorting
$T(n) = \Theta(n^2)$: quadratic time
$T(n) = \Theta(n^k)$: polynomial time
$T(n) = \Theta(2^n)$, $T(n) = \Theta(n^n)$, $T(n) = \Theta(n!)$: exponential time; practical for small values of n (e.g., n = 10 or n = 20)

Techniques for Determining Which Grows Faster

• Evaluate: $\lim_{N \to \infty} \frac{f(N)}{g(N)}$

limit is        Big-Oh relation
0               f(N) = o(g(N))
c ≠ 0           f(N) = Θ(g(N))
∞               g(N) = o(f(N))
no limit        no relation
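Two short worked limits show how the limit test is read. The functions are made up for illustration and are not from the slides.

```latex
% Case "c != 0": same growth rate
\[
\lim_{N \to \infty} \frac{3N^2 + N}{N^2}
  = \lim_{N \to \infty} \left( 3 + \frac{1}{N} \right)
  = 3
  \quad\Longrightarrow\quad 3N^2 + N = \Theta(N^2)
\]

% Case "0": the numerator grows strictly slower
\[
\lim_{N \to \infty} \frac{N}{N^2}
  = \lim_{N \to \infty} \frac{1}{N}
  = 0
  \quad\Longrightarrow\quad N = o(N^2)
\]
```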

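Techniques, cont'd

• L'Hôpital's rule:
  If $\lim_{N \to \infty} f(N) = \infty$ and $\lim_{N \to \infty} g(N) = \infty$, then
    $\lim_{N \to \infty} \frac{f(N)}{g(N)} = \lim_{N \to \infty} \frac{f'(N)}{g'(N)}$

  example: f(N) = N, g(N) = log N
  Use L'Hôpital's rule:
    f'(N) = 1, g'(N) = 1/N (taking the natural log; any other base changes g'(N) only by a constant factor)
    $\lim_{N \to \infty} \frac{f'(N)}{g'(N)} = \lim_{N \to \infty} N = \infty$
    so g(N) = o(f(N))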

Program loop runtimes

  for (int i = 0; i < n; i += c)    // O(n)
      statement(s);

• Adding to the loop counter means that the loop runtime grows linearly when compared to its maximum value n.
  – Loop executes its body about n / c times (exactly $\lceil n / c \rceil$ for a constant c ≥ 1).
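The count for the additive loop can be confirmed by instrumenting it. This is a sketch with invented names (`AdditiveLoop`, `iterations`), assuming c ≥ 1 and n small enough that i += c does not overflow. When c does not divide n evenly, the count rounds up, so it equals the ceiling of n / c.

```java
class AdditiveLoop {
    // Runs the loop "for (i = 0; i < n; i += c)" and returns how many times the body executes.
    static int iterations(int n, int c) {
        int count = 0;
        for (int i = 0; i < n; i += c) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(iterations(100, 5));  // 20
        System.out.println(iterations(10, 3));   // 4  (i = 0, 3, 6, 9)
    }
}
```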

  for (int i = 1; i < n; i *= c)    // O(log n)
      statement(s);

• Multiplying the loop counter means that the maximum value n must grow exponentially to linearly increase the loop runtime; therefore, it is logarithmic.
  – The counter must start at 1, not 0 (0 * c is still 0, so the loop would never end), and c must be at least 2.
  – Loop executes its body about $\log_c n$ times (exactly $\lceil \log_c n \rceil$ for n ≥ 1).
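The multiplicative loop can be instrumented the same way. In this sketch (invented names `MultiplicativeLoop` and `iterations`), the counter starts at 1, because a counter that starts at 0 stays at 0 under multiplication and the loop would not terminate; c is assumed to be at least 2. The counter is a `long` to leave headroom before overflow.

```java
class MultiplicativeLoop {
    // Runs the loop "for (i = 1; i < n; i *= c)" and returns how many times the body executes.
    static int iterations(long n, long c) {
        int count = 0;
        for (long i = 1; i < n; i *= c) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(iterations(1024, 2));   // 10 (i = 1, 2, 4, ..., 512)
        System.out.println(iterations(1000, 10));  // 3  (i = 1, 10, 100)
    }
}
```

Doubling n from 1024 to 2048 adds only one more iteration, which is the logarithmic behavior described above.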

  for (int i = 0; i < n * n; i += c)    // O(n^2)
      statement(s);

• The loop maximum is $n^2$, so the runtime is quadratic.
  – Loop executes its body about $n^2 / c$ times (exactly $\lceil n^2 / c \rceil$).

More loop runtimes

• Nesting loops multiplies their runtimes.

  for (int i = 0; i < n; i += c) {        // O(n^2)
      for (int j = 0; j < n; j += c) {
          statement;
      }
  }
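Counting the innermost statement confirms the multiplication rule. Note that the inner loop has to advance its own counter j; advancing i there instead would leave j stuck at 0. The sketch below uses invented names (`NestedLoops`, `iterations`) and assumes c ≥ 1. Each loop runs ceil(n / c) times, so the body runs that many times squared.

```java
class NestedLoops {
    // Returns how many times the innermost statement executes.
    static long iterations(int n, int c) {
        long count = 0;
        for (int i = 0; i < n; i += c) {
            for (int j = 0; j < n; j += c) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(iterations(10, 1));  // 100 = 10 * 10
        System.out.println(iterations(10, 3));  // 16  = 4 * 4
    }
}
```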

• Loops in sequence add together their runtimes, which means the loop set with the larger runtime dominates.

  for (int i = 0; i < n; i += c) {        // O(n)
      statement;
  }

  // O(n log n)
  for (int i = 0; i < n; i += c) {
      for (int j = 1; j < n; j *= c) {
          statement;
      }
  }
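To see the larger term dominate, the two loop sets can be counted separately. This sketch uses invented names (`SequentialLoops`, `counts`) and assumes c ≥ 2, with the inner multiplicative counter j starting at 1 so that it actually grows. For n = 1024 and c = 2, the first loop contributes 512 steps and the second 5120, so the total is already mostly the O(n log n) part.

```java
class SequentialLoops {
    // Returns {steps in the single loop, steps in the nested loop pair}.
    static long[] counts(int n, int c) {
        long first = 0;
        long second = 0;

        for (int i = 0; i < n; i += c) {          // O(n)
            first++;
        }

        for (int i = 0; i < n; i += c) {          // O(n log n)
            for (long j = 1; j < n; j *= c) {
                second++;
            }
        }
        return new long[] { first, second };
    }

    public static void main(String[] args) {
        long[] result = counts(1024, 2);
        System.out.println(result[0]);  // 512
        System.out.println(result[1]);  // 5120 = 512 * 10
    }
}
```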

Types of runtime analysis

• Express the running time as f(N), where N is the size of the input
• worst case: your enemy gets to pick the input
• average case: need to assume a probability distribution on the inputs

Some rules

When considering the growth rate of a function using Big-O:

• Ignore the lower-order terms and the coefficient of the highest-order term
• No need to specify the base of the logarithm
  – Changing the base from one constant to another changes the value of the logarithm by only a constant factor
• If $T_1(N) = O(f(N))$ and $T_2(N) = O(g(N))$, then
  – $T_1(N) + T_2(N) = O(\max(f(N), g(N)))$
  – $T_1(N) \cdot T_2(N) = O(f(N) \cdot g(N))$
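A short worked instance of the sum and product rules, using two made-up bounds chosen only for illustration:

```latex
\[
T_1(N) = O(N^2), \qquad T_2(N) = O(N \log N)
\]
\begin{align*}
T_1(N) + T_2(N) &= O(\max(N^2,\; N \log N)) = O(N^2)
  && \text{(sequence: the larger term dominates)} \\
T_1(N) \cdot T_2(N) &= O(N^2 \cdot N \log N) = O(N^3 \log N)
  && \text{(nesting: the bounds multiply)}
\end{align*}
```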
