cs 1031 introduction (outline) the software development process performance analysis: the big oh....

36
CS 103 1 Introduction (Outline) • The Software Development Process • Performance Analysis: the Big Oh. • Abstract Data Types • Introduction to Data Structures

Upload: jonathan-mathews

Post on 25-Dec-2015

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 1

Introduction(Outline)

• The Software Development Process

• Performance Analysis: the Big Oh.

• Abstract Data Types

• Introduction to Data Structures

Page 2: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 2

The Software Development Process

Page 3: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 3

Software Development

• Requirement analysis, leading to a specification of the problem

• Design of a solution

• Implementation of the solution (coding)

• Analysis of the solution

• Testing, debugging and integration

• Maintenance and evolution of the system.

Page 4: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 4

Specification of a problem

• A precise statement/description of the problem.

• It involves describing the input, the expected output, and the relationship between the input and output.

• This is often done through preconditions and postconditions.

Page 5: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 5

Design

• Formulation of a method, that is, of a sequence of steps, to solve the problem.

• The design “language” can be pseudo-code, flowcharts, natural language, any combinations of those, etc.

• A design so expressed is called an algorithm(s). • A good design approach is a top-down design

where the problem is decomposed into smaller, simpler pieces, where each piece is designed into a module.

Page 6: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 6

Implementation

• Development of actual C++ code that will carry out the design and solve the problem.

• The design and implementation of data structures, abstract data types, and classes, are often a major part of design implementation.

Page 7: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 7

Implementation(Good Principles)

• Code Re-use– Re-use of other people’s software– Write your software in a way that makes it (re)usable

by others

• Hiding of implementation details: emphasis on the interface.

• Hiding is also called data encapsulation• Data structures are a prime instance of data

encapsulation and code re-use

Page 8: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 8

Analysis of the Solution

• Estimation of how much time and memory an algorithm takes.

• The purpose is twofold: – to get a ballpark figure of the speed and memory

requirements to see if they meet the target

– to compare competing designs and thus choose the best before any further investment in the application (implementation, testing, etc.)

Page 9: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 9

Testing and Debugging

• Testing a program for syntactical correctness (no compiler errors)

• Testing a program for semantic correctness, that is, checking if the program gives the correct output.

• This is done by – having sample input data and corresponding, known output data– running the programs against the sample input– comparing the program output to the known output– in case there is no match, modify the code to achieve a perfect

match. • One important tip for thorough testing: Fully exercise

the code, that is, make sure each line of your code is executed.

Page 10: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 10

Integration

• Gluing all the pieces (modules) together to create a cohesive whole system.

Page 11: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 11

Maintenance and Evolution of a System

• Ongoing, on-the-job modifications and updates of the programs.

Page 12: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 12

Preconditions and Postconditions

• A semi-formal, precise way of specifying what a function/program does, and under what conditions it is expected to perform correctly

• Purpose: Good documentation, and better communications, over time and space, to other programmers and user of your code

Page 13: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 13

Precondition

• It is a statement of what must be true before function is called.

• This often means describing the input: – the input type– the conditions that the input values must satisfy.

• A function may take data from the environment– Then, the preconditions describe the state of that environment– that is, the conditions that must be satisfied, in order to guarantee

the correctness of the function.

• The programmer is responsible for ensuring that the precondition is valid when the function is called.

Page 14: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 14

Postcondition

• It is a statement of what will be true when the function finishes its work.

• This is often a description of the function output, and the relationship between output and input.

• A function may modify data from the environment (such as global variables, or files)– the postconditions describe the new values of those

data after the function call is completed, in relation to what the values were before the function was called.

Page 15: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 15

Example of Pre/Post-Conditions

void get_sqrt( double x)

// Precondition: x >= 0.

// Postcondition: The output is a number

// y = the square root of x.

Page 16: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 16

C++ Way of Asserting Preconditions

• Use the library call assert (condition)

• You have to include #include <cassert>• It makes sure that condition is satisfied (= true),

in which case the execution of the program continues.

• If condition is false, the program execution terminates, and an error message is printed out, describing the cause of the termination.

Page 17: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 17

Performance Analysis and Big-O

Page 18: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 18

Performance Analysis

• Determining an estimate of the time and memory requirement of the algorithm.

• Time estimation is called time complexity analysis

• Memory size estimation is called space complexity analysis.

• Because memory is cheap and abundant, we rarely do space complexity analysis

• Since time is “expensive” , analysis now defaults to time complexity analysis

Page 19: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 19

Big-O Notation

• Let n be a non-negative integer representing the size of the input to an algorithm

• Let f(n) and g(n) be two positive functions, representing the number of basic calculations (operations, instructions) that an algorithm takes (or the number of memory words an algorithm needs).

Page 20: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 20

Big-O Notation (contd.)

• f(n)=O(g(n)) iff there exist a positive constant C and non-negative integer n0 such that

f(n) Cg(n) for all nn0.

• g(n) is said to be an upper bound of f(n).

Page 21: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 21

Big-O Notation(Examples)

• f(n) = 5n+2 = O(n) // g(n) = n– f(n) 6n, for n 3 (C=6, n0=3)

• f(n)=n/2 –3 = O(n)– f(n) 0.5 n for n 0 (C=0.5, n0=0)

• n2-n = O(n2) // g(n) = n2

– n2-n n2 for n 0 (C=1, n0=0)

• n(n+1)/2 = O(n2)– n(n+1)/2 n2 for n 0 (C=1, n0=0)

Page 22: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 22

Big-O Notation (In Practice)

• When computing the complexity, – f(n) is the actual time formula– g(n) is the simplified version of f

• Since f(n) stands often for time, we use T(n) instead of f(n)

• In practice, the simplification of T(n) occurs while it is being computed by the designer

Page 23: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 23

Simplification Methods

• If T(n) is the sum of a constant number of terms, drop all the terms except for the most dominant (biggest) term;

• Drop any multiplicative factor of that term

• What remains is the simplified g(n).

• amnm + am-1nm-1+...+ a1n+ a0=O(nm).

• n2-n+log n = O(n2)

Page 24: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 24

Big-O Notation (Common Complexities)

• T(n)=O(1) // constant time• T(n)=O(log n) // logarithmic• T(n)=O(n) // linear • T(n)=O(n2) //quadratic• T(n)=O(n3) //cubic• T(n)=O(nc), c 1 // polynomial• T(n)=O(logc n), c 1 // polylogarithmic• T(n)=O(nlog n)

Page 25: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 25

Big-O Notation(Characteristics)

• The big-O notation is a simplification mechanism of time/memory estimates.

• It loses precision, trading precision for simplicity

• Retains enough information to give a ballpark idea of speed/cost of an algorithm, and to be able to compare competing algorithms.

Page 26: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 26

Common Formulas

• 1+2+3+…+n= n(n+1)/2 = O(n2).

• 12+22+32+…+n2= n(n+1)(2n+1)/6 = O(n3)

• 1+x+x2+x3+…+xn=(x n+1 – 1)/(x-1) = O(xn).

Page 27: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 27

Example of Time Complexity Analysis and Big-O

• Pseudo-code of finding a maximum of x[n]:

double M=x[0];for i=1 to n-1 do

if (x[i] > M)M=x[i];

endifendforreturn M;

Page 28: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 28

Complexity of the algorithm

• T(n) = a+(n-1)(b+a) = O(n)• Where “a” is the time of one assignment,

and “b” is the time of one comparison• Both “a” and “b” are constants that depend

on the hardware• Observe that the big O spares us from

– Relatively unimportant arithmetic details– Hardware dependency

Page 29: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 29

Abstract Data Types

Page 30: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 30

Abstract Data Types

• An abstract data type is a mathematical set of data, along with operations defined on that kind of data

• Examples: – int: it is the set of integers (up to a certain

magnitude), with operations +, -, /, *, %– double: it’s the set of decimal numbers (up to a

certain magnitude), with operations +, -, /, *

Page 31: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 31

Abstract Data Types (Contd.)

• The previous examples belong to what is called built-in data types

• That is, they are provided by the programming language

• But new abstract data types can be defined by users, using arrays, enum, structs, classes (if object oriented programming), etc.

Page 32: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 32

Introduction to Data Structures

Page 33: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 33

Data Structures

• A data structure is a user-defined abstract data type

• Examples: – Complex numbers: with operations +, -, /, *,

magnitude, angle, etc.

– Stack: with operations push, pop, peek, isempty

– Queue: enqueue, dequeue, isempty …

– Binary Search Tree: insert, delete, search.

– Heap: insert, min, delete-min.

Page 34: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 34

Data Structure Design

• Specification– A set of data – Specifications for a number of operations to be

performed on the data

• Design– A lay-out organization of the data– Algorithms for the operations

• Goals of Design: fast operations

Page 35: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 35

Implementation of a Data Structure

• Representation of the data using built-in data types of the programming language (such as int, double, char, strings, arrays, structs, classes, pointers, etc.)

• Language implementation (code) of the algorithms for the operations

Page 36: CS 1031 Introduction (Outline) The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures

CS 103 36

Object-Oriented Programming (OOP) And Data Structures

• When implementing a data structure in non-OOP languages such as C, the data representation and the operations are separate

• In OOP languages such as C++, both the data representation and the operations are aggregated together into what is called objects

• The data type of such objects are called classes.• Classes are blue prints, objects are instances.