Object (Data and Algorithm) Analysis
Cmput 115 - Lecture 5
Department of Computing Science
University of Alberta
©Duane Szafron 1999
Some code in this lecture is based on code from the book Java Structures by Duane A. Bailey or the companion structure package.
Revised 12/31/99
About This Lecture
In this lecture we will learn how to compare the relative space and time efficiencies of different software solutions.
Outline
Abstraction
Implementation choice criteria
Time and space complexity
Asymptotic analysis
Worst, best and average case complexity
Abstraction in Software Systems
A software system is an abstract model of a real-world phenomenon.
It is abstract since some of the properties of the real world are ignored in the system.
For example, a banking system does not usually model the height and weight of its customers, even though it represents its customers in the software.
The Customer class is only an abstraction of a real customer, with details missing.
Design Choices
When a software system is designed, the designer has many choices about what abstractions to make.
For example, in a banking system, the designer can choose to design a Customer object that knows its client number and password.
Alternately, each Customer object could know its BankCard object which knows its client number and password.
Implementation Choices
After a software system is designed, the programmer has many choices to make when implementing classes.
For example, in a banking system, should an Account object remember an Array of Transactions or a Vector of Transactions?
For example, should a sort method use a selection sort or a merge sort?
In this course, we will focus on choices at the class and method level.
Choice Criteria
What criteria should be used to make a choice at the implementation level?
Historically, these two criteria have been considered the most important:
– The amount of space that an object uses and the amount of temporary storage used by its methods.
– The amount of time that a method takes to run.
Other Choice Criteria
However, we should also consider:
– The simplicity of an object representation.
– The simplicity of an algorithm that is implemented as a method.
Why? These criteria are important because they directly affect:
– The amount of time it takes to implement a class.
– The number of defects (bugs) in classes.
– The amount of time it takes to update a class when requirements change.
Comparing Methods
Which is better, a selection sort or a merge sort?
– We can implement each sort as a method.
– We can count the amount of storage space each method requires for sample data.
– We can time the two methods on sample data and see which one is faster.
– We could time how long it takes a programmer to write the code for each method.
– We can count the defects in the two implementations.
Comparing Average Methods
Unfortunately, our answers would depend on the sample data and the individual programmer chosen.
To circumvent this problem, we could compute averages over many sample data sets and over many programmers.
This would be a long and expensive process.
We need a way to predict the answers by analysis rather than experimentation.
Time and Space Complexity
The minimum time and space requirements of a method can be determined from the algorithm.
Since these values depend on the data size, we often describe them as a function of the data size (referred to as n).
These functions are called respectively, the time complexity, time(n), and space complexity, space(n), of the algorithm.
Space Complexity
It is easy to compute the space complexity for many algorithms:
– A selection sort of one-word elements has space complexity: space(n) = n + 1 words.
• It takes n words to store the elements.
• It takes one extra word for swapping elements.
– A merge sort of one-word elements has space complexity: space(n) = 2n words.
• It takes n words to store the elements.
• It takes n words for swapping elements during merges.
Time Complexity - Operations
The time complexity of an algorithm is more difficult to compute.
– The time depends on the number and kind of machine instructions that are generated.
– The time depends on the time it takes to run machine instructions.
One approach is to pick some basic operation and count these operations.
– For example, if we count addition operations in an algorithm that adds the elements of a list, the time complexity for a list with n elements is n.
Data Size is Important
If we are sorting 100 values, it probably doesn't matter which sort we use:
– The time and space complexities are irrelevant.
– However, the simplicity may be a factor.
If we are sorting 20,000 values, it may matter:
– roughly 16 sec (selection sort) versus 1 sec (merge sort) on the hardware of the day.
– 20,001 words (selection sort) versus 40,000 words (merge sort).
– The simplicity difference is still the same.
If we are sorting 1,000,000 values, it does matter:
n² = 1,000,000,000,000, while n·log₂n ≈ 1,000,000 × 20 = 20,000,000.
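The gap above is easy to check directly. A minimal sketch (the constant 20 in the slide is the rounded value of log₂(1,000,000) ≈ 19.93):

```java
public class GrowthGap {
    public static void main(String[] args) {
        long n = 1_000_000L;
        // Quadratic growth (selection-sort-style work).
        long quadratic = n * n;
        // Linearithmic growth (merge-sort-style work); log base 2 via change of base.
        double linearithmic = n * (Math.log(n) / Math.log(2));
        System.out.println(quadratic);            // 1000000000000
        System.out.println((long) linearithmic);  // roughly 20 million
    }
}
```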
Asymptotic Analysis
We are often only interested in the time and space complexities for large data sizes, n.
That is, we are interested in the asymptotic behavior of time(n) and space(n) in the limit as n goes to infinity.
A function f(n) is O(g(n)) ("big-O" of g(n)) if and only if:
limit as n → ∞ of f(n)/g(n) = c, where c is a constant.
If you are uncomfortable with limits then the textbook has an alternate definition.
Constant Time - O(1) Algorithms
The time to assign a value to an array element:
  myArray[0] = 1;
– It takes a constant time, say c1.
– time(n) is O(1) since limit as n → ∞ of c1/1 = c1.
The time to add two integers stored in memory:
  first + second
– Accessing each int takes a constant time, say c1.
– Adding the two ints takes a constant time, say c2.
– time(n) is O(1) since limit as n → ∞ of (c1 + c2)/1 = c1 + c2.
Linear Time - O(n) Algorithms
The time to add the values in an array is linear:
  sum = 0;
  for (i = 0; i < list.length; i++)
      sum = sum + list[i];
– Initializing the sum and index to zero takes c1.
– Each addition and store takes c2.
– Each loop index increment and comparison to the stop index takes c3.
– time(n) is O(n) since limit as n → ∞ of (c1 + n·c2 + n·c3)/n = c2 + c3.
Polynomial Time - O(n^c) Algorithms
The time to add up the elements in a square matrix of size n is quadratic (n^c with c = 2):
  sum = 0;
  for (row = 0; row < aMatrix.height(); row++)
      for (column = 0; column < aMatrix.height(); column++)
          sum = sum + aMatrix.elementAt(row, column);
– Initializing the running sum to zero takes c1.
– Each addition takes c2.
– Each outer loop index modification takes c3.
– Each inner loop index modification takes c3.
– time(n) is O(n²) since limit as n → ∞ of (c1 + n²·c2 + (n + n²)·c3)/n² = c2 + c3.
Exponential Time - O(c^n) Algorithms
The time to print each element of a full binary tree of depth n is exponential (c^n with c = 2):
  System.out.println(aNode.value());
  leftNode = aNode.left();
  if (leftNode != null) leftNode.print();
  rightNode = aNode.right();
  if (rightNode != null) rightNode.print();
– Each print takes c1.
– Each left node and right node access takes c2.
– Each stop check takes c3.
– time(n) is O(2^n) since limit as n → ∞ of (c1·(2^n − 1) + c2·(2^n − 2) + c3·(2^n − 2))/2^n = c1 + c2 + c3.
The number of nodes in the tree is 2^n − 1.
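The print fragment above can be made self-contained. The Node class below is an assumption reconstructed from the fragment's accessors (value(), left(), right()), not the course's structure package:

```java
// Minimal binary tree node, following the names in the slide's fragment.
public class Node {
    private final int value;
    private Node left, right;

    public Node(int value) { this.value = value; }

    public void setChildren(Node left, Node right) {
        this.left = left;
        this.right = right;
    }

    public int value() { return value; }
    public Node left() { return left; }
    public Node right() { return right; }

    // Visits every node exactly once; for a full tree of depth n
    // that is 2^n - 1 nodes, hence the O(2^n) bound in n (the depth).
    public void print() {
        System.out.println(this.value());
        if (left != null) left.print();
        if (right != null) right.print();
    }

    public static void main(String[] args) {
        Node root = new Node(1);
        root.setChildren(new Node(2), new Node(3));
        root.print(); // pre-order: 1, then 2, then 3
    }
}
```

Note that the exponential cost is relative to the depth n; measured against the number of nodes, the traversal is linear.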
Variable Time Algorithms
The time complexity of many algorithms depends not only on the size of the data (n), but also on the specific data values:
– The time to find a key value in an indexed container depends on the location of the key in the container.
– The time to sort a container depends on the initial order of the elements.
Worst, Best and Average Complexity
We define three common cases.
– The best case complexity is the smallest value for any problem of size n.
– The worst case complexity is the largest value for any problem of size n.
– The average case complexity is the weighted average of all problems of size n, where the weight of a problem is the probability that the problem occurs.
Worst Case Complexity Example
Assume we are doing a sequential search on a container of size n.
The worst case complexity is when the element is not in the container.
For example, searching for the key 35 in:
Index:  0  1  2  3  4  5  6  7  8  9
Value: 25 50 10 95 75 30 70 55 60 80
The key 35 is compared against every element before the search fails.
The worst case complexity is O(n).
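A minimal sketch of the sequential search being analyzed (this version is an illustration, not code from the course's structure package):

```java
public class SequentialSearch {
    // Returns the index of key in list, or -1 if it is absent.
    // Worst case (key absent): n comparisons, O(n).
    // Best case (key is first): 1 comparison, O(1).
    static int indexOf(int[] list, int key) {
        for (int i = 0; i < list.length; i++) {
            if (list[i] == key) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] values = {25, 50, 10, 95, 75, 30, 70, 55, 60, 80};
        System.out.println(indexOf(values, 35)); // -1: worst case, all 10 elements checked
        System.out.println(indexOf(values, 25)); // 0: best case, one comparison
    }
}
```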
Best Case Complexity Example
Assume we are doing a sequential search on a container of size n.
The best case complexity is when the element is the first one.
For example, searching for the key 25 in:
Index:  0  1  2  3  4  5  6  7  8  9
Value: 25 50 10 95 75 30 70 55 60 80
The best case complexity is O(1).
Average Case Complexity Example
To compute the average case complexity, we must make assumptions about the probability distribution of the data sets.
– Assume a sequential search on a container of size n.
– Assume a 50% chance that the key is in the container.
– Assume equal probabilities that the key is at any position.
The probability that the key is at any particular location in the container is (1/2) * (1/n) = 1/(2n).
The average case complexity is:
1*(1/(2n)) + 2*(1/(2n)) + … + n*(1/(2n)) + n*(1/2)
= (1/(2n))*(1 + 2 + … + n) + n/2
= (1/(2n))*(n*(n+1)/2) + n/2
= (n+1)/4 + n/2
= (3n + 1)/4, which is O(n).
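The derivation above can be verified numerically; the sketch below sums the same weighted comparison counts term by term and can be checked against the closed form (3n + 1)/4:

```java
public class AverageCase {
    // Expected comparisons for sequential search under the slide's assumptions:
    // the key sits at position i (1-based) with probability 1/(2n), costing i
    // comparisons; the key is absent with probability 1/2, costing n comparisons.
    static double expectedComparisons(int n) {
        double expected = 0.0;
        for (int i = 1; i <= n; i++) {
            expected += i * (1.0 / (2.0 * n));
        }
        expected += n * 0.5; // unsuccessful-search term
        return expected;
    }

    public static void main(String[] args) {
        int n = 1000;
        double closedForm = (3.0 * n + 1) / 4.0;
        System.out.println(expectedComparisons(n)); // 750.25
        System.out.println(closedForm);             // 750.25
    }
}
```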
Trading Time and Space
It is often possible to construct two different algorithms, where one is faster but takes more space.
For example:
– A selection sort has worst case time complexity O(n²) while using n + 1 words of space.
– A merge sort has worst case time complexity O(n*log(n)) while using 2n words of space.
Analysis lets us choose the best algorithm based on our time and space requirements.
Principles
Big-O analysis is generally only applicable for large n!
Trace of Selection Sort
Unsorted      42 23 74 11 65 58 94 36
After pass 1  11 23 74 42 65 58 94 36
After pass 2  11 23 74 42 65 58 94 36
After pass 3  11 23 36 42 65 58 94 74
After pass 4  11 23 36 42 65 58 94 74
After pass 5  11 23 36 42 58 65 94 74
After pass 6  11 23 36 42 58 65 94 74
After pass 7  11 23 36 42 58 65 74 94  (sorted)
space(n) = n + 1
time(n) = n + (n-1) + (n-2) + … + 1 = n(n+1)/2 = O(n²)
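A selection sort matching the trace above, sketched as a standalone method (not the course's own code): each pass swaps the smallest remaining element into place, using a single extra word for the swap.

```java
import java.util.Arrays;

public class SelectionSort {
    // Pass i places the smallest remaining element at index i.
    // Comparisons total (n-1) + (n-2) + ... + 1 = O(n^2);
    // extra space is the one temp word used by the swap.
    static void sort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[min]) min = j;
            }
            int temp = a[i];   // the single extra word of space
            a[i] = a[min];
            a[min] = temp;
        }
    }

    public static void main(String[] args) {
        int[] data = {42, 23, 74, 11, 65, 58, 94, 36};
        sort(data);
        System.out.println(Arrays.toString(data));
        // [11, 23, 36, 42, 58, 65, 74, 94]
    }
}
```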
Trace of Merge Sort
DECOMPOSE:
42 23 74 11 65 58 94 36
42 23 74 11 | 65 58 94 36
42 23 | 74 11 | 65 58 | 94 36
42 | 23 | 74 | 11 | 65 | 58 | 94 | 36
MERGE:
23 42 | 11 74 | 58 65 | 36 94
11 23 42 74 | 36 58 65 94
11 23 36 42 58 65 74 94  (sorted)
space(n) = 2n
time(n) = (1 + 2 + 4 + … + 2^(log₂n − 1)) + (n·1 + (n/2)·2 + (n/4)·4 + … + 1·n) = (n − 1) + n·log₂n = O(n·log n)
Recall: x = 2^n ⇒ log₂x = n; e.g. 8 = 2³ ⇒ log₂8 = 3.
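A merge sort matching the trace, sketched with the same decompose/merge structure (an illustration; the course's version may differ in detail). Each level of merging does n work into a second n-word buffer, giving the 2n words of space and O(n·log n) time claimed above.

```java
import java.util.Arrays;

public class MergeSort {
    // Allocates one n-word buffer, so space(n) = 2n words overall.
    static void sort(int[] a) {
        sort(a, new int[a.length], 0, a.length - 1);
    }

    private static void sort(int[] a, int[] buf, int lo, int hi) {
        if (lo >= hi) return;                // runs of length 1 are sorted
        int mid = (lo + hi) / 2;
        sort(a, buf, lo, mid);               // decompose left half
        sort(a, buf, mid + 1, hi);           // decompose right half
        merge(a, buf, lo, mid, hi);          // merge the two sorted runs
    }

    // Merges a[lo..mid] and a[mid+1..hi] through buf back into a.
    private static void merge(int[] a, int[] buf, int lo, int mid, int hi) {
        int i = lo, j = mid + 1, k = lo;
        while (i <= mid && j <= hi) buf[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i <= mid) buf[k++] = a[i++];
        while (j <= hi) buf[k++] = a[j++];
        System.arraycopy(buf, lo, a, lo, hi - lo + 1);
    }

    public static void main(String[] args) {
        int[] data = {42, 23, 74, 11, 65, 58, 94, 36};
        sort(data);
        System.out.println(Arrays.toString(data));
        // [11, 23, 36, 42, 58, 65, 74, 94]
    }
}
```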