
O() Analysis of Methods and Data Structures

Reasonable vs. Unreasonable Algorithms

Using O() Analysis in Design


O() Analysis of Methods and Data Structures

The Scenario

• We’ve talked about data structures and methods to act on these structures…
  – Linked lists, arrays, trees
  – Inserting, deleting, searching, traversal, sorting, etc.

• Now that we know about O() notation, let’s discuss how each of these methods performs on these data structures!

Recipe for Determining O()

• Break algorithm down into “known” pieces

– We’ll learn the Big-Os in this section

• Identify relationships between pieces

– Sequential is additive

– Nested (loop/recursion) is multiplicative

• Drop constants

• Keep only the dominant factor for each variable
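As a quick illustration of the recipe, here is a hedged Python sketch (the routine and its names are hypothetical, not from the slides): a sequential pass followed by a nested double loop.

def example(a):
    n = len(a)
    total = 0
    for x in a:                  # one pass over the data: O(N)
        total += x
    pairs = 0
    for i in range(n):           # outer loop: N iterations
        for j in range(n):       # inner loop nested inside: N iterations each
            pairs += 1           # nested work multiplies: O(N * N) = O(N^2)
    # Sequential pieces add: O(N + N^2); keep only the dominant factor: O(N^2).
    return total, pairs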

Array Size and Complexity

How can an array change in size?

MAX is 30
ArrayType definesa array[1..MAX] of Num

We need to know what N is in advance to declare an array, but for analysis of complexity, we can still use N as a variable for input size.

Traversals

• Traversals involve visiting every element in a collection.

• Because we must visit every node, a traversal must be O(N) for any data structure.

– If we visit less than N elements, then it is not a traversal.
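A minimal Python sketch of a traversal (the Node class is my assumption, since the slides describe the structures abstractly): the loop body runs exactly once per node, which is why any traversal is O(N).

class Node:
    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node

def traverse(head):
    count = 0
    cur = head
    while cur is not None:       # one iteration per element: O(N)
        count += 1               # "visit" the node (here we just count it)
        cur = cur.next
    return count

# Example: the three-node list 5 -> 19 -> 35
print(traverse(Node(5, Node(19, Node(35)))))    # 3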

Comparing Data Structures and Methods

Data Structure   | Traverse
Unsorted L List  | N
Sorted L List    | N
Unsorted Array   | N
Sorted Array     | N
Binary Tree      | N
BST              | N
F&B BST          | N

Searching for an Element

Searching involves determining if an element is a member of the collection.

• Simple/Linear Search:
  – If there is no ordering in the data structure
  – If the ordering is not applicable

• Binary Search:
  – If the data is ordered or sorted
  – Requires non-linear access to the elements

Simple Search

• Worst case: the element to be found is the Nth element examined.

• Simple search must be used for:
  – Sorted or unsorted linked lists
  – Unsorted arrays
  – Binary trees
  – Binary search trees that are not full and balanced

Comparing Data Structures and Methods

Data Structure   | Traverse | Search
Unsorted L List  | N        | N
Sorted L List    | N        | N
Unsorted Array   | N        | N
Sorted Array     | N        | ?
Binary Tree      | N        | N
BST              | N        | N
F&B BST          | N        | ?

Binary Search Trees

If a binary search tree is not full and balanced, then in the worst case it takes on the structure and characteristics of a sorted linked list.

[Figure: a degenerate BST holding 7, 11, 14, 42, 58, each node having a single child; effectively a sorted linked list.]

Example: Linked List

• Let’s determine if the value 83 is in the collection:

Head -> 5 -> 19 -> 35 -> 42 -> NIL

83 is not found; every element must be examined.

Simple/Linear Search Algorithm

cur <- head
loop
    exitif((cur = NIL) OR (cur^.data = target))
    cur <- cur^.next
endloop

if(cur <> NIL) then
    print( "Yes, target is there" )
else
    print( "No, target isn't there" )
endif
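The same algorithm rendered in Python as a sketch (the Node class is a stand-in for whatever record type the list actually holds):

class Node:
    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node

def linear_search(head, target):
    cur = head                                       # cur <- head
    while cur is not None and cur.data != target:    # exitif(cur = NIL OR cur^.data = target)
        cur = cur.next                               # cur <- cur^.next
    if cur is not None:                              # if(cur <> NIL)
        print("Yes, target is there")
    else:
        print("No, target isn't there")

linear_search(Node(5, Node(19, Node(35, Node(42)))), 83)    # No, target isn't there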

Pre-Order Search Traversal Algorithm

• As soon as we get to a node, check to see if we have a match

• Otherwise, look for the element in the left sub-tree

• Otherwise, look for the element in the right sub-tree

[Figure: a binary tree containing 94, 36, 67, 22, 14, 9, 3; tracing a pre-order search for the value 9 by checking the current node, then the left sub-tree, then the right sub-tree.]

Big-O of Simple Search

• The algorithm has to examine every element in the collection
  – To return a false
  – If the element to be found is the Nth element

• Thus, simple search is O(N).

Binary Search

• We may perform binary search on:
  – Sorted arrays
  – Full and balanced binary search trees

• Tosses out ½ of the elements at each comparison.

Full and Balanced Binary Search Trees

Contains approximately the same number of elements in all left and right sub-trees (recursively) and is fully populated.

            34
          /    \
        25      45
       /  \    /  \
      21  29  41  52

Binary Search Example

Sorted array:  7  12  42  59  71  86  104  212

Looking for 89: each comparison discards half of the remaining elements.

89 not found after 3 comparisons, and 3 = Log2(8).

Binary Search Big-O

• An element can be found by comparing and cutting the work in half.
  – We cut the work in half each time
  – How many times can we cut the work in half? Log2(N) times

• Thus binary search is O(Log N).
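A hedged Python sketch of binary search on a sorted array; the counter simply records how many elements are examined before the search stops.

def binary_search(a, target):
    low, high = 0, len(a) - 1
    examined = 0
    while low <= high:
        mid = (low + high) // 2
        examined += 1            # one more element examined
        if a[mid] == target:
            return mid, examined
        elif a[mid] < target:
            low = mid + 1        # discard the left half
        else:
            high = mid - 1       # discard the right half
    return -1, examined          # not found

print(binary_search([7, 12, 42, 59, 71, 86, 104, 212], 89))
# (-1, 3): not found after examining 3 elements, and 3 = Log2(8)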

What?

N    | Searches | log2(N)   | CS notation
2    | 1        | log2(2)   | lg(2)
4    | 2        | log2(4)   | lg(4)
8    | 3        | log2(8)   | lg(8)
16   | 4        | log2(16)  | lg(16)
32   | 5        | log2(32)  | lg(32)
64   | 6        | log2(64)  | lg(64)
128  | 7        | log2(128) | lg(128)
256  | 8        | log2(256) | lg(256)
512  | 9        | log2(512) | lg(512)

Recall

log10 N = k * log2 N, where k = log10(2) = 0.30103...

Logarithms in different bases differ only by a constant factor, so:

O(lg N) = O(log N)

Comparing Data Structures and Methods

Data Structure   | Traverse | Search
Unsorted L List  | N        | N
Sorted L List    | N        | N
Unsorted Array   | N        | N
Sorted Array     | N        | Log N
Binary Tree      | N        | N
BST              | N        | N
F&B BST          | N        | Log N

Insertion

• Inserting an element requires two steps:
  – Find the right location
  – Perform the instructions to insert

• If the data structure in question is unsorted, then insertion is O(1):
  – Simply insert at the front (or at the end, in the case of an array)
  – There is no work to find the right spot and only constant work to actually insert

Comparing Data Structures and Methods

Data Structure   | Traverse | Search | Insert
Unsorted L List  | N        | N      | 1
Sorted L List    | N        | N      |
Unsorted Array   | N        | N      | 1
Sorted Array     | N        | Log N  |
Binary Tree      | N        | N      | 1
BST              | N        | N      |
F&B BST          | N        | Log N  |

Insert into a Sorted Linked List

Finding the right spot is O(N)
  – Recurse/iterate until found

Performing the insertion is O(1)
  – 4-5 instructions

Total work is O(N + 1) = O(N)
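A Python sketch of the two steps, assuming a minimal singly linked Node class (my naming, not the slides'): the walk is the O(N) part, the splice is the O(1) part.

class Node:
    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node

def sorted_insert(head, value):
    new_node = Node(value)
    if head is None or value <= head.data:                    # new smallest element
        new_node.next = head
        return new_node
    cur = head
    while cur.next is not None and cur.next.data < value:     # O(N) walk to the right spot
        cur = cur.next
    new_node.next = cur.next                                  # O(1) splice
    cur.next = new_node
    return head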

Inserting into a Sorted Array

Finding the right spot is O(Log N)
  – Binary search for the element to insert

Performing the insertion
  – Shuffle the existing elements to make room for the new item

Shuffling Elements

5 12 35 77 101

Note – we must have at least one empty cell

Insert 29

Big-O of Shuffle

Worst case: inserting the smallest number (in front of 5 12 35 77 101) would require moving all N elements.

Thus shuffle is O(N).

Big-O of Inserting into Sorted Array

Finding the right spot is O(Log N)

Performing the insertion (shuffle) is O(N)

Sequential steps, so add:

Total work is O(Log N + N) = O(N)
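A Python sketch of the same two steps on a Python list, using the standard-library bisect module for the O(Log N) search and an explicit loop for the O(N) shuffle.

from bisect import bisect_left

def sorted_array_insert(a, value):
    spot = bisect_left(a, value)             # binary search for the spot: O(Log N)
    a.append(None)                           # one empty cell at the end
    for i in range(len(a) - 1, spot, -1):    # shuffle right to make room: O(N) worst case
        a[i] = a[i - 1]
    a[spot] = value

nums = [5, 12, 35, 77, 101]
sorted_array_insert(nums, 29)
print(nums)    # [5, 12, 29, 35, 77, 101]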

Comparing Data Structures and Methods

Data Structure   | Traverse | Search | Insert
Unsorted L List  | N        | N      | 1
Sorted L List    | N        | N      | N
Unsorted Array   | N        | N      | 1
Sorted Array     | N        | Log N  | N
Binary Tree      | N        | N      | 1
BST              | N        | N      |
F&B BST          | N        | Log N  |

Inserting into a Full and Balanced BST

Always insert when current = NIL.

Find the right spot
  – Binary search on the element to insert

Perform the insertion
  – 4-5 instructions to create & add the node

Full and Balanced BST Insert

[Figure: inserting the value 4 into a full and balanced BST rooted at 12; the binary-search walk ends at an empty (NIL) child, where the new node 4 is attached.]

Big-O of Full & Balanced BST Insert

Finding the right spot is O(Log N)

Performing the insertion is O(1)

Sequential, so add:

Total work is O(Log N + 1) = O(Log N)
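A Python sketch of BST insertion (a generic recursive version, not necessarily the course's exact procedure): walk left or right as in binary search until hitting an empty child, then attach the new node in O(1). On a full and balanced tree the walk is O(Log N); on a degenerate tree it is O(N).

class TreeNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def bst_insert(root, value):
    if root is None:                               # "insert when current = NIL"
        return TreeNode(value)                     # O(1) to create and attach the node
    if value < root.data:
        root.left = bst_insert(root.left, value)
    else:
        root.right = bst_insert(root.right, value)
    return root

root = None
for v in [12, 3, 41, 7, 98, 4]:    # illustrative values
    root = bst_insert(root, v)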

Comparing Data Structures and Methods

Data Structure   | Traverse | Search | Insert
Unsorted L List  | N        | N      | 1
Sorted L List    | N        | N      | N
Unsorted Array   | N        | N      | 1
Sorted Array     | N        | Log N  | N
Binary Tree      | N        | N      | 1
BST              | N        | N      |
F&B BST          | N        | Log N  | Log N

What about a BST that is not full & balanced?

Comparing Data Structures and Methods

Data Structure   | Traverse | Search | Insert
Unsorted L List  | N        | N      | 1
Sorted L List    | N        | N      | N
Unsorted Array   | N        | N      | 1
Sorted Array     | N        | Log N  | N
Binary Tree      | N        | N      | 1
BST              | N        | N      | N
F&B BST          | N        | Log N  | Log N

Two Sorting Algorithms

• Bubblesort
  – Brute-force method of sorting
  – Loop inside of a loop

• Mergesort
  – Divide and conquer approach
  – Recursively call, splitting the list in half
  – Merge the sorted halves together

Bubblesort Review

Bubblesort works by comparing and swapping values in a list.

[Figure: a six-element array (indices 1-6) before and after one pass; the largest value, 101, has bubbled into the last position and is correctly placed.]

procedure Bubblesort(A isoftype in/out Arr_Type)
    to_do, index isoftype Num
    to_do <- N - 1
    loop
        exitif(to_do = 0)
        index <- 1
        loop
            exitif(index > to_do)
            if(A[index] > A[index + 1]) then
                Swap(A[index], A[index + 1])
            endif
            index <- index + 1
        endloop
        to_do <- to_do - 1
    endloop
endprocedure // Bubblesort
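For comparison, the same algorithm as a runnable Python sketch (0-based indexing instead of the pseudocode's 1-based indexing; the sample list is illustrative):

def bubblesort(a):
    to_do = len(a) - 1
    while to_do > 0:                           # N-1 passes of the outer loop
        for index in range(to_do):             # compare adjacent pairs up to to_do
            if a[index] > a[index + 1]:
                a[index], a[index + 1] = a[index + 1], a[index]    # Swap
        to_do -= 1

nums = [77, 12, 35, 42, 5, 101]
bubblesort(nums)
print(nums)    # [5, 12, 35, 42, 77, 101]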

Analysis of Bubblesort

• How many comparisons in the inner loop?
  – to_do goes from N-1 down to 1, thus
  – (N-1) + (N-2) + (N-3) + ... + 2 + 1
  – Average: N/2 comparisons for each "pass" of the outer loop

• How many "passes" of the outer loop?
  – N - 1

Bubblesort Complexity

Look at the relationship between the two loops:
  – The inner loop is nested inside the outer loop
  – The inner loop is executed for each iteration of the outer loop

Therefore the complexity is:

O((N-1) * (N/2)) = O(N^2/2 - N/2) = O(N^2)


Graphically

[Figure: the comparisons performed in each pass, stacked pass by pass, fill roughly a triangle, about half of an (N-1) by (N-1) square.]

O(N^2) Runtime Example

Assume you are sorting 250,000,000 items:

N = 250,000,000
N^2 = 6.25 x 10^16

If you can do one operation per nanosecond (10^-9 sec), which is fast, it will take 6.25 x 10^7 seconds.

6.25 x 10^7 / (60 x 60 x 24 x 365) ≈ 1.98 years

Mergesort

Split (Log N levels):
  23 98 45 14 6 67 33 42
  23 98 45 14 | 6 67 33 42
  23 98 | 45 14 | 6 67 | 33 42
  23 | 98 | 45 | 14 | 6 | 67 | 33 | 42

Merge (Log N levels):
  23 98 | 14 45 | 6 67 | 33 42
  14 23 45 98 | 6 33 42 67
  6 14 23 33 42 45 67 98

Analysis of Mergesort

Phase I
  – Divide the list of N numbers into two lists of N/2 numbers
  – Divide those lists in half until each list is of size 1
  Log N steps for this stage.

Phase II
  – Build sorted lists from the decomposed lists
  – Merge pairs of lists, doubling the size of the sorted lists each time
  Log N steps for this stage.

Analysis of the Merging

  23 | 98 | 45 | 14 | 6 | 67 | 33 | 42
    merge     merge     merge     merge
  23 98  |  14 45  |  6 67  |  33 42
        merge               merge
  14 23 45 98   |   6 33 42 67
              merge
  6 14 23 33 42 45 67 98

Mergesort Complexity

Each of the N numerical values is compared or copied during each pass:
  – The total work for each pass is O(N)
  – There are a total of Log N passes

Therefore the complexity is:

O(Log N + N * Log N) = O(N * Log N)

(The Log N term is the break-apart phase; the N * Log N term is the merging.)
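A compact Python sketch of mergesort (one common formulation; the course pseudocode may differ): the recursion gives the Log N levels, and merge does the O(N) work at each level.

def mergesort(a):
    if len(a) <= 1:                          # a list of size 1 is already sorted
        return a
    mid = len(a) // 2
    left = mergesort(a[:mid])                # Phase I: keep dividing in half
    right = mergesort(a[mid:])
    return merge(left, right)                # Phase II: merge the sorted halves

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):  # each element compared/copied once: O(N) per level
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result

print(mergesort([23, 98, 45, 14, 6, 67, 33, 42]))    # [6, 14, 23, 33, 42, 45, 67, 98]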

O(N Log N) Runtime Example

Assume the same 250,000,000 items:

N * Log(N) = 250,000,000 x 8.4 (using log base 10)

           ≈ 2,099,485,002 operations

With the same processor as before (one operation per nanosecond):

about 2 seconds
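A small Python check of both runtime estimates, under the same one-operation-per-nanosecond assumption the slides use (the slides take the Log to base 10 here; the base only changes the constant).

import math

N = 250_000_000
ops_per_second = 1_000_000_000                        # one operation per nanosecond

quadratic_seconds = N ** 2 / ops_per_second
nlogn_seconds = N * math.log10(N) / ops_per_second    # log base 10, as in the slide

print(quadratic_seconds / (60 * 60 * 24 * 365))       # about 1.98 years
print(nlogn_seconds)                                  # about 2.1 seconds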

Summary

• You now have the O() for basic methods on varied data structures.

• You can combine these in more complex situations.

• Break the algorithm down into "known" pieces.

• Identify relationships between pieces:
  – Sequential is additive
  – Nested (loop/recursion) is multiplicative

• Drop constants.

• Keep only the dominant factor for each variable.

Questions?

Reasonable vs. Unreasonable Algorithms

Reasonable vs. Unreasonable

Reasonable algorithms have polynomial factors:
  – O(Log N)
  – O(N)
  – O(N^K), where K is a constant

Unreasonable algorithms have exponential factors:
  – O(2^N)
  – O(N!)
  – O(N^N)

Algorithmic Performance Thus Far

• Some examples thus far:
  – O(1)        Insert at the front of a linked list
  – O(N)        Simple/Linear Search
  – O(N Log N)  Mergesort
  – O(N^2)      Bubblesort

• But it could get worse:
  – O(N^5), O(N^2000), etc.

An O(N^5) Example

For N = 256:

N^5 = 256^5 ≈ 1,100,000,000,000

If we had a computer that could execute a million instructions per second…

• 1,100,000 seconds ≈ 12.7 days to complete

But it could get worse…

The Power of Exponents

A rich king and a wise peasant…

The Wise Peasant’s Pay

Day (N) | Pieces of Grain (2^N)
1       | 2
2       | 4
3       | 8
4       | 16
...     | ...
63      | 9,223,000,000,000,000,000
64      | 18,450,000,000,000,000,000

How Bad is 2^N?

• Imagine being able to grow a billion (1,000,000,000) pieces of grain a second…

• It would take:
  – 585 years to grow enough grain just for the 64th day
  – Over a thousand years to fulfill the peasant’s request!

So the King cut off the peasant’s head.


The Towers of Hanoi

[Figure: three pegs, A, B, and C, with a stack of rings on peg A.]

Goal: Move the stack of rings to another peg
  – Rule 1: May move only 1 ring at a time
  – Rule 2: May never place a larger ring on top of a smaller ring

Towers of Hanoi: Solution

[Figure: the solution for 3 rings, shown as the original state followed by moves 1 through 7.]

Towers of Hanoi - Complexity

For 3 rings we have 7 operations.

In general, the cost is 2^N - 1 = O(2^N)

Each time we increment N, we double the amount of work.

This grows incredibly fast!
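A Python sketch of the classic recursive solution (peg names are arbitrary): moving N rings means moving N-1 rings out of the way, moving the largest ring, and moving the N-1 rings back on top, which gives 2^N - 1 moves.

def hanoi(n, source, target, spare, moves):
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)    # clear the n-1 smaller rings out of the way
    moves.append((source, target))                # move the largest ring
    hanoi(n - 1, spare, target, source, moves)    # re-stack the n-1 rings on top of it

moves = []
hanoi(3, "A", "C", "B", moves)
print(len(moves))    # 7 moves for 3 rings, i.e. 2^3 - 1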

Towers of Hanoi (2^N) Runtime

For N = 64:

2^N = 2^64 ≈ 18,450,000,000,000,000,000

If we had a computer that could execute a million instructions per second…

• It would take about 584,000 years to complete

But it could get worse…

The Bounded Tile Problem

Match up the patterns in the tiles. Can it be done, yes or no?

The Bounded Tile Problem

Matching tiles

Tiling a 5x5 Area

[Figure: tiles are placed one location at a time; the pool of available tiles shrinks from 25 to 24, 23, 22, ... down to the last tile.]

Analysis of the Bounded Tiling Problem

Tile a 5 by 5 area (N = 25 tiles):
  – 1st location: 25 choices
  – 2nd location: 24 choices
  – And so on…

Total number of arrangements:
  – 25 * 24 * 23 * 22 * 21 * ... * 3 * 2 * 1
  – 25! (factorial) ≈ 15,500,000,000,000,000,000,000,000

The Bounded Tiling Problem is O(N!).
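A quick Python sanity check of the 25! figure quoted above:

import math

arrangements = math.factorial(25)
print(arrangements)                    # 15511210043330985984000000, about 1.55 x 10^25
print(arrangements / 1_000_000)        # seconds of work at a million placements per second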

Tiling (N!) Runtime

For N = 25

25! = 15,500,000,000,000,000,000,000,000

If we could “place” a million tiles per second…

• It would take 470 billion years to complete

Why not a faster computer?

A Faster Computer

• If we had a computer that could execute a trillion instructions per second (a million times faster than our MIPS computer)…

• 5x5 tiling problem would take 470,000 years

• 64-ring Tower of Hanoi problem would take 213 days

Why not an even faster computer!

The Fastest Computer Possible?

• What if:
  – Instructions took ZERO time to execute
  – CPU registers could be loaded at the speed of light

• These algorithms are still unreasonable!

• The speed of light is only so fast!

Where Does this Leave Us?

• Clearly algorithms have varying runtimes.

• We’d like a way to categorize them:
  – Reasonable, so it may be useful
  – Unreasonable, so why bother running it

Performance Categories of Algorithms

Polynomial:
  Sub-linear      O(Log N)
  Linear          O(N)
  Nearly linear   O(N Log N)
  Quadratic       O(N^2)

Exponential:
  O(2^N)
  O(N!)
  O(N^N)

Reasonable vs. Unreasonable

Reasonable algorithms have polynomial factors:
  – O(Log N)
  – O(N)
  – O(N^K), where K is a constant

Unreasonable algorithms have exponential factors:
  – O(2^N)
  – O(N!)
  – O(N^N)

Reasonable vs. Unreasonable

Reasonable algorithms
  • May be usable, depending upon the input size

Unreasonable algorithms
  • Are impractical, but useful to theorists
  • Demonstrate the need for approximate solutions

Remember, we’re dealing with large N (input size).

Two Categories of Algorithms

[Figure: runtime versus size of input N (2 to 1024), with runtime on a logarithmic scale from 10 up to 10^35 operations. The curves for N and N^5 stay in the "Reasonable" region, while 2^N and N^N quickly cross into the "Unreasonable (Don't Care!)" region.]

Summary

• Reasonable algorithms feature polynomial factors in their O() and may be usable depending upon input size.

• Unreasonable algorithms feature exponential factors in their O() and have no practical utility.

Questions?

Using O() Analysis in Design

Air Traffic Control

[Figure: air traffic control radar display; aircraft tracks are added, deleted, and coasted, and a Conflict Alert is raised when two aircraft get too close together.]

Problem Statement

• What data structure should be used to store the aircraft records for this system?

• Normal operations conducted are:
  – Data Entry: adding new aircraft entering the area
  – Radar Update: input from the antenna
  – Coast: global traversal to verify that all aircraft have been updated [coast for 5 cycles, then drop]
  – Query: controller requesting data about a specific aircraft by location
  – Conflict Analysis: make sure no two aircraft are too close together

Air Traffic Control System

Program               | Algorithm         | Freq
1. Data Entry / Exit  | Insert            | 15
2. Radar Data Update  | N * Search        | 12
3. Coast / Drop       | Traverse          | 60
4. Query              | Search            | 1
5. Conflict Analysis  | Traverse * Search | 12

Operation             | LLU | LLS | AU  | AS      | BT  | F/B BST
#1 Data Entry / Exit  | 1   | N   | 1   | N       | 1   | Log N
#2 Radar Data Update  | N^2 | N^2 | N^2 | N Log N | N^2 | N Log N
#3 Coast / Drop       | N   | N   | N   | N       | N   | N
#4 Query              | N   | N   | N   | Log N   | N   | Log N
#5 Conflict Analysis  | N^2 | N^2 | N^2 | N Log N | N^2 | N Log N

(LLU = unsorted linked list, LLS = sorted linked list, AU = unsorted array, AS = sorted array, BT = binary tree, F/B BST = full & balanced BST)

Questions?
