
8/13/2019 Brief introduction to algorithms


Chapter 1. Introduction

1.1 Need for studying algorithms

The study of algorithms is the cornerstone of computer science and can be regarded as its core. Computer programs would not exist without algorithms. With computers becoming an essential part of our professional and personal lives, studying algorithms becomes a necessity, more so for computer science engineers.

Another reason for studying algorithms is that knowing a standard set of important algorithms furthers our analytical skills and helps us develop new algorithms for required applications.

1.2 Algorithm

An algorithm is a finite set of instructions that, if followed, accomplishes a particular task. In addition, all algorithms must satisfy the following criteria:

1. Input. Zero or more quantities are externally supplied.
2. Output. At least one quantity is produced.
3. Definiteness. Each instruction is clear and unambiguous.
4. Finiteness. If we trace out the instructions of an algorithm, then for all cases the algorithm terminates after a finite number of steps.
5. Effectiveness. Every instruction must be basic enough that it can be carried out, in principle, by a person using only pencil and paper. It is not enough that each operation be definite as in criterion 3; it must also be feasible.



Fig 1.a.

An algorithm is composed of a finite set of steps, each of which may require one or more operations. The possibility of a computer carrying out these operations necessitates that certain constraints be placed on the type of operations an algorithm can include. The fourth criterion for algorithms we assume in this book is that they terminate after a finite number of operations.

Criterion 5 requires that each operation be effective; each step must be such that it can, at least in principle, be done by a person using pencil and paper in a finite amount of time. Performing arithmetic on integers is an example of an effective operation, but arithmetic with real numbers is not, since some values may be expressible only by an infinitely long decimal expansion. Adding two such numbers would violate the effectiveness property.

Algorithms that are definite and effective are also called computational procedures.

- The same algorithm can be represented in several ways.
- Several algorithms can solve the same problem; different ideas lead to different speeds.

Example:




Problem: GCD of two numbers m, n

Input specification: two inputs, nonnegative, not both zero

Euclid's algorithm

- gcd(m, n) = gcd(n, m mod n)

Apply this equality repeatedly until the second number becomes 0 (that is, until m mod n = 0), since gcd(m, 0) = m.

Another way of representing the same algorithm

Euclid's algorithm

Step 1: If n = 0, return the value of m and stop; otherwise proceed to step 2.

Step 2: Divide m by n and assign the value of the remainder to r.

Step 3: Assign the value of n to m and the value of r to n. Go to step 1.
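The three steps above translate almost line for line into code. The following is a minimal Python sketch; the function name gcd_euclid and the input check are illustrative choices, not part of the original notes.

```python
def gcd_euclid(m: int, n: int) -> int:
    """Greatest common divisor of two nonnegative integers, not both zero."""
    if m < 0 or n < 0 or (m == 0 and n == 0):
        raise ValueError("inputs must be nonnegative and not both zero")
    while n != 0:      # Step 1: when n is 0, m holds the answer
        r = m % n      # Step 2: remainder of dividing m by n
        m, n = n, r    # Step 3: n becomes m, r becomes n, then repeat
    return m


print(gcd_euclid(60, 24))
```

For example, gcd_euclid(60, 24) passes through the pairs (60, 24), (24, 12), (12, 0) and returns 12.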

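Another algorithm to solve the same problem

Consecutive integer checking algorithm

Step 1: Assign the value of min(m, n) to t.

Step 2: Divide m by t. If the remainder is 0, go to step 3; otherwise go to step 4.

Step 3: Divide n by t. If the remainder is 0, return the value of t as the answer and stop; otherwise proceed to step 4.

Step 4: Decrease the value of t by 1. Go to step 2.

Unlike Euclid's algorithm, this method does not work when one of the inputs is zero, because step 2 would then divide by t = 0.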
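The four steps above, which try consecutive integers downward from min(m, n), can be sketched in Python as follows. The name gcd_consecutive is hypothetical, and the sketch assumes both inputs are positive, since a zero input would make the first division divide by zero.

```python
def gcd_consecutive(m: int, n: int) -> int:
    """GCD by trying t = min(m, n), then t - 1, and so on down to 1."""
    if m <= 0 or n <= 0:
        raise ValueError("this method needs both inputs to be positive")
    t = min(m, n)                          # Step 1
    while True:
        if m % t == 0 and n % t == 0:      # Steps 2 and 3: t divides both
            return t
        t -= 1                             # Step 4: try the next smaller value


print(gcd_consecutive(60, 24))
```

The loop always stops, because t = 1 divides every integer. It may need up to min(m, n) iterations, which is typically far more than Euclid's algorithm needs for the same inputs.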

1.3 Fundamentals of algorithmic problem solving

- Understanding the problem
- Ascertaining the capabilities of the computational device
- Exact/approximate solution
- Deciding on the appropriate data structure
- Algorithm design techniques
- Methods of specifying an algorithm
- Proving an algorithm's correctness
- Analysing an algorithm



Understanding the problem: The given problem should be understood completely. Check whether it is similar to some standard problem and whether a known algorithm exists; otherwise a new algorithm has to be devised. Creating an algorithm is an art which may never be fully automated. An important step in the design is to specify an instance of the problem.

Ascertain the capabilities of the computational device: Once a problem is understood, we need to know the capabilities of the computing device. This can be done by knowing the type of architecture, the speed, and the memory available.

Exact/approximate solution: Once an algorithm is devised, it is necessary to show that it computes an answer for all possible legal inputs. The solution is stated in one of two forms: an exact solution or an approximate solution. Examples of problems where an exact solution cannot, in general, be obtained are:

i) Finding the square root of a number.

ii) Solving nonlinear equations.
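To illustrate what an approximate solution looks like, here is a minimal sketch that approximates a square root with Newton's iteration. The notes do not prescribe this method; the function name approx_sqrt and the relative tolerance are assumptions made for the example.

```python
def approx_sqrt(x: float, tolerance: float = 1e-12) -> float:
    """Approximate the square root of x by Newton's iteration.

    The result is not exact in general: the loop stops as soon as
    guess * guess is within a relative tolerance of x.
    """
    if x < 0:
        raise ValueError("x must be nonnegative")
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0   # start at or above the true root
    while abs(guess * guess - x) > tolerance * x:
        guess = (guess + x / guess) / 2   # average guess with x / guess
    return guess


print(approx_sqrt(2.0))
```

The answer for 2.0 agrees with the true square root only to a limited number of digits, which is the best any finite computation can do for an irrational value.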

Decide on the appropriate data structure: Some algorithms do not demand any ingenuity in representing their inputs. Others are in fact predicated on ingenious data structures. A data type is a well-defined collection of data with a well-defined set of operations on it. A data structure is an actual implementation of a particular abstract data type. The elementary data structures are:

Arrays: These let you access lots of data fast (good). You can have arrays of any other data type (good). However, you cannot make arrays bigger if your program decides it needs more space (bad).

Records: These let you organize non-homogeneous data into logical packages to keep everything together (good). These packages do not include operations, just data fields (bad, which is why we need objects). Records do not help you process distinct items in loops (bad, which is why arrays of records are used).

Sets: These let you represent subsets of a set with such operations as intersection, union, and equivalence (good). Built-in sets are limited to a certain small size (bad, but we can build our own set data type out of arrays to solve this problem if necessary).



Algorithm design techniques: Creating an algorithm is an art which may never be fully automated. By mastering these design strategies, it will become easier for you to devise new and useful algorithms. Dynamic programming is one such technique. Some of the techniques are especially useful in fields other than computer science, such as operations research and electrical engineering. Some important design techniques are linear, nonlinear, and integer programming.

Methods of specifying an algorithm: The main options for specifying an algorithm are natural language, pseudocode, and flowcharts.

Pseudocode is a mixture of natural language and programming-language-like constructs. A flowchart is a method of expressing an algorithm by a collection of connected geometric shapes.

Proving an algorithm's correctness: Once an algorithm is devised, it is necessary to show that it computes the correct answer for all possible legal inputs. We refer to this process as algorithm validation. The purpose of validation is to assure us that the algorithm will work correctly independent of issues concerning the programming language it will eventually be written in. A proof of correctness requires that the solution be stated in two forms. One form is usually a program which is annotated by a set of assertions about the input and output variables of the program. These assertions are often expressed in the predicate calculus. The second form is called a specification, and this may also be expressed in the predicate calculus. A proof consists of showing that these two forms are equivalent in that for every given legal input, they describe the same output. A complete proof of program correctness requires that each statement of the programming language be precisely defined and all basic operations be proved correct. All these details may cause a proof to be very much longer than the program.

Analyzing algorithms: As an algorithm is executed, it uses the computer's central processing unit to perform operations and its memory (both immediate and auxiliary) to hold the program and data. Analysis of algorithms, or performance analysis, refers to the task of determining how much computing time and storage an algorithm requires. This is a challenging area which sometimes requires great mathematical skill. An important result of this study is that it allows you to make quantitative judgments about the value of one algorithm over another. Another result is that it allows you to predict whether the software will meet any efficiency constraints that exist.

Performance analysis

There are many criteria upon which we can judge an algorithm, for instance:

1. Does it do what we want it to do?
2. Does it work correctly according to the original specifications of the task?
3. Is there documentation that describes how to use it and how it works?
4. Are procedures created in such a way that they perform logical subfunctions?
5. Is the code readable?

The space complexity of an algorithm is the amount of memory it needs to run to completion. The time complexity of an algorithm is the amount of computer time it needs to run to completion.

Performance evaluation can be loosely divided into two major phases: (1) a priori estimates and (2) a posteriori testing. We refer to these as performance analysis and performance measurement, respectively.

Space complexity

The space needed by an algorithm is the sum of the following components:

(1) A fixed part that is independent of the characteristics (e.g., number, size) of the inputs and outputs. This part typically includes the instruction space (i.e., space for the code), space for simple variables and fixed-size component variables (also called aggregates), space for constants, and so on.

(2) A variable part that consists of the space needed by component variables whose size depends on the particular problem instance being solved, the space needed by referenced variables (to the extent that it depends on instance characteristics), and the recursion stack space (insofar as this space depends on the instance characteristics).

The space requirement S(P) of an algorithm P may therefore be written as S(P) = c + S_P(instance characteristics), where c is a constant. When analyzing the space complexity of an algorithm, we concentrate solely on estimating S_P(instance characteristics). For any given problem, we first need to determine which instance characteristics to use to measure the space requirement. Generally speaking, our choices are related to the number and magnitude of the inputs to and outputs from the algorithm. At times, more complex measures of the interrelationship among the data items are used.

Time complexity

The time T(P) taken by a program P is the sum of the compile time and the run time. The compile time does not depend on the instance characteristics. Also, we may assume that a compiled program will be run several times without recompilation, so we concern ourselves only with the run time of a program. This run time is denoted by t_P.

Because many of the factors t_P depends on are not known at the time a program is conceived, it is reasonable to attempt only to estimate t_P. If we knew the characteristics of the compiler to be used, we could proceed to determine the number of additions, subtractions, multiplications, divisions, compares, loads, stores, and so on, that would be made by the code for P. So, we could obtain an expression for t_P(n) of the form

t_P(n) = c_a ADD(n) + c_s SUB(n) + c_m MUL(n) + c_d DIV(n) + ...

where n denotes the instance characteristics; c_a, c_s, c_m, c_d, and so on, respectively denote the time needed for an addition, subtraction, multiplication, division, and so on; and ADD, SUB, MUL, DIV, and so on are functions whose values are the numbers of additions, subtractions, multiplications, divisions, and so on, that are performed when the code for P is used on an instance with characteristics n.

Obtaining such an exact formula is in itself an impossible task, since the time needed for an addition, subtraction, multiplication, and so on, depends on the numbers being added, subtracted, multiplied, and so on. The value of t_P(n) for any given n can be obtained only experimentally. The program is typed, compiled, and run on a particular machine. The execution time is physically clocked, and t_P(n) is obtained. Even with this experimental approach, one could face difficulties. In a multiuser system, the execution time depends on such factors as the system load, the number of other programs running on the computer at the time program P is run, the characteristics of these programs, and so on.



Given the minimal utility of determining the exact number of additions, subtractions, and so on, that are needed to solve a problem instance with characteristics given by n, we might as well lump all the operations together and obtain a count for the total number of operations. We can go one step further and count only the number of program steps.

A program step is loosely defined as a syntactically or semantically meaningful segment of a program that has an execution time that is independent of the instance characteristics. For example, the entire statement

return a + b + b*c + (a + b - c)/(a + b) + 4.0;

of Algorithm 1.5 could be regarded as a step, since its execution time is independent of the instance characteristics (this statement is not strictly true, since the time for a multiply and divide generally depends on the numbers involved in the operation).

The number of steps any program statement is assigned depends on the kind of statement. For example, comments count as zero steps; an assignment statement which does not involve any calls to other algorithms is counted as one step; in an iterative statement such as a for, while, or repeat-until statement, we consider the step count only for the control part of the statement. The control parts of for and while statements have the following forms:

for i := <expr> to <expr1> do

while (<expr>) do

Each execution of the control part of a while statement is given a step count equal to the number of step counts assignable to <expr>. The step count for each execution of the control part of a for statement is one, unless the counts attributable to <expr> and <expr1> are functions of the instance characteristics. In this latter case, the first execution of the control part of the for has a step count equal to the sum of the counts for <expr> and <expr1>. Remaining executions of the for statement have a step count of one; and so on.

We can determine the number of steps needed by a program to solve a particular problem instance in one of two ways. In the first method, we introduce a new variable, count, into the program. This is a global variable with initial value 0. Statements to increment count by the appropriate amount are introduced into the program. This is done so that each time a statement in the original program is executed, count is incremented by the step count of that statement.
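The first method can be sketched as follows. The function array_sum is a hypothetical example chosen for illustration, and the step assigned to each statement follows the rules stated above (one step per assignment, one per execution of the loop's control part, one for the return).

```python
count = 0  # global step counter with initial value 0


def array_sum(a):
    """Sum the items of a while recording program steps in the global count."""
    global count
    s = 0.0
    count += 1        # the assignment s = 0.0
    for x in a:
        count += 1    # one execution of the for control part
        s += x
        count += 1    # the assignment inside the loop
    count += 1        # last execution of the for control part (loop exit)
    count += 1        # the return statement
    return s


array_sum([1.0, 2.0, 3.0])
print(count)  # 2n + 3 steps; here n = 3, so 9
```

For an input of n items the counter ends at 2n + 3, so the step count grows linearly with the instance characteristic n.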



1.4 Important Problem Types

- Sorting
- Searching
- String processing
- Graph problems
- Combinatorial problems
- Geometric problems
- Numerical problems

Sorting: A sorting algorithm is an algorithm that puts the elements of a list in a certain order. The most-used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the use of other algorithms (such as search and merge algorithms) that require sorted lists to work correctly; it is also often useful for canonicalizing data and for producing human-readable output. More formally, the output must satisfy two conditions:

1. The output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order);
2. The output is a permutation, or reordering, of the input.

Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite its simple, familiar statement. For example, bubble sort was analyzed as early as 1956. [1] Although many consider it a solved problem, useful new sorting algorithms are still being invented (for example, library sort was first published in 2004). The sorting problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation, divide and conquer algorithms, data structures, randomized algorithms, best, worst and average case analysis, time-space tradeoffs, and lower bounds.
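The two formal conditions on a sorting algorithm's output can be checked directly. The sketch below is only a checker, not a sorting algorithm; the name is_valid_sort is hypothetical, and using collections.Counter assumes the items are hashable.

```python
from collections import Counter


def is_valid_sort(original, output):
    """Check the two conditions a sorting algorithm's output must satisfy."""
    # Condition 1: each element is no smaller than the previous one.
    nondecreasing = all(
        output[i] <= output[i + 1] for i in range(len(output) - 1)
    )
    # Condition 2: the output holds the same items the same number of times.
    permutation = Counter(original) == Counter(output)
    return nondecreasing and permutation


data = [3, 1, 2, 3]
print(is_valid_sort(data, sorted(data)))
```

An output such as [1, 2, 2, 3] for the input above is in nondecreasing order but fails the second condition, because it is not a reordering of the input.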

Searching: In computer science, a search algorithm, broadly speaking, is an algorithm for finding an item with specified properties among a collection of items. The items may be stored individually as records in a database; or may be elements of a search space defined by a mathematical formula or procedure, such as the roots of an equation with integer variables; or a combination of the two, such as the Hamiltonian circuits of a graph. Searching algorithms are closely related to the concept of dictionaries. Dictionaries are data structures that support search, insert, and delete operations. One of the most effective representations is a hash table. Typically, a simple function is applied to the key to determine its place in the dictionary. Another efficient search algorithm, for sorted tables, is binary search.
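Binary search is only named above, so here is a minimal sketch of the usual iterative form. The function name and the convention of returning -1 when the key is absent are assumptions of this example.

```python
def binary_search(sorted_items, key):
    """Return an index of key in sorted_items, or -1 if key is absent.

    Assumes sorted_items is in nondecreasing order.
    """
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == key:
            return mid
        if sorted_items[mid] < key:
            low = mid + 1    # key can only be in the upper half
        else:
            high = mid - 1   # key can only be in the lower half
    return -1


table = [1, 3, 5, 7, 9, 11]
print(binary_search(table, 7))
```

Each comparison halves the remaining range, so a table of n items needs about log2(n) comparisons in the worst case, compared with up to n for a linear scan.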



String processing: String searching algorithms are important in all sorts of applications that we meet every day. In text editors, we might want to search through a very large document (say, a million characters) for the occurrence of a given string (maybe dozens of characters). In text retrieval tools, we might potentially want to search through thousands of such documents (though normally these files would be indexed, making this unnecessary). Other applications might require string matching algorithms as part of a more complex algorithm (e.g., the Unix program "diff" that works out the differences between two similar text files). Sometimes we might want to search in binary strings (i.e., sequences of 0s and 1s). For example, the "pbm" graphics format is based on sequences of 1s and 0s. We could express a task like "find a wide white stripe in the image" as a string searching problem.
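As a rough illustration of the stripe example, the sketch below uses the simplest (brute-force) string search, not any particular algorithm from the notes. The function name find_substring is hypothetical, and the example assumes one image row is given as a string in which "0" stands for a white pixel.

```python
def find_substring(text: str, pattern: str) -> int:
    """Return the first index where pattern occurs in text, or -1 if none."""
    n, m = len(text), len(pattern)
    for start in range(n - m + 1):
        # Compare the pattern against the window beginning at start.
        if text[start:start + m] == pattern:
            return start
    return -1


row = "1101000001011"   # one image row; "0" is assumed to mean white
stripe = "0" * 5        # a white stripe at least five pixels wide
print(find_substring(row, stripe))  # prints 4
```

In the worst case this approach makes about n * m character comparisons for a text of length n and a pattern of length m, which is why faster string matching algorithms matter for large documents.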

Graph problems: Graph algorithms are one of the oldest classes of algorithms and have been studied for almost 300 years (in 1736 Leonhard Euler formulated one of the first graph problems, the Königsberg Bridge Problem).

There are two large classes of graphs:

- directed graphs (digraphs)
- undirected graphs

Some algorithms differ by the class. Moreover, the sets of problems for digraphs and undirected graphs are different. There are special cases of digraphs and graphs that have their own sets of problems. One example for digraphs is program graphs. Program graphs are important in compiler construction and were studied in detail after the invention of computers.

Graphs are made up of vertices and edges. The simplest property of a vertex is its degree, the number of edges incident upon it. The sum of the vertex degrees in any undirected graph is twice the number of edges, since every edge contributes one to the degree of both adjacent vertices. Trees are connected undirected graphs which contain no cycles. Vertex degrees are important in the analysis of trees. A leaf of a tree is a vertex of degree 1. Every n-vertex tree contains n - 1 edges, so all non-trivial trees contain at least two leaf vertices.



Among classic algorithms/problems on digraphs we can note the following:

- Reachability. Can you get to B from A?
- Shortest path (min-cost path). Find the path from B to A with the minimum cost (determined as some simple function of the edges traversed in the path) (Dijkstra's and Floyd's algorithms).
- Visit all nodes. Traversal. Depth- and breadth-first traversals.
- Transitive closure. Determine all pairs of nodes that can reach each other (Floyd's algorithm).
- Dominators. A node d dominates a node n if every path from the start node to n must go through d. Notationally, this is written as d dom n. By definition, every node dominates itself. There are a number of related concepts:
  o immediate dominator
  o pre-dominator
  o post-dominator
  o dominator tree
- Minimum spanning tree. A spanning tree is a set of edges such that every node is reachable from every other node, and the removal of any edge from the tree eliminates the reachability property. A minimum spanning tree is the smallest such tree (Prim's and Kruskal's algorithms).
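The reachability question from the list above ("Can you get to B from A?") can be answered with a breadth-first traversal. The sketch below is one possible way to do it; the dictionary-of-lists graph format and the name reachable are assumptions made for the example.

```python
from collections import deque


def reachable(graph, a, b):
    """Return True if node b can be reached from node a in a digraph.

    graph maps each node to a list of the nodes its outgoing edges lead to.
    """
    seen = {a}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return True
        for nxt in graph.get(node, []):
            if nxt not in seen:      # visit every node at most once
                seen.add(nxt)
                queue.append(nxt)
    return False


digraph = {"A": ["B"], "B": ["C"], "C": [], "D": ["A"]}
print(reachable(digraph, "A", "C"), reachable(digraph, "C", "A"))
```

Because the edges are directed, C is reachable from A in this example while A is not reachable from C.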

Combinatorial problems: From a more abstract perspective, the traveling salesman problem and the graph coloring problem are examples of combinatorial problems. These are problems that ask to find a combinatorial object, such as a permutation, a combination, or a subset, that satisfies certain constraints and has some desired property. Generally speaking, combinatorial problems are the most difficult problems in computing, from both the theoretical and practical standpoints. Their difficulty stems from the following facts. First, the number of combinatorial objects typically grows extremely fast with a problem's size, reaching unimaginable magnitudes even for moderate-sized instances. Second, there are no known algorithms for solving most such problems exactly in an acceptable amount of time. Moreover, most computer scientists believe that such algorithms do not exist. This conjecture has been neither proved nor disproved, and it remains the most important unresolved issue in theoretical computer science.

Some combinatorial problems can be solved by efficient algorithms, but they should be considered fortunate exceptions to the rule. The shortest-path problem mentioned earlier is among such exceptions.
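To make the growth in the number of combinatorial objects concrete, here is a brute-force sketch for a tiny traveling salesman instance. It simply tries every permutation of the cities; the function name shortest_tour and the distance matrix are made up for the example.

```python
from itertools import permutations


def shortest_tour(dist):
    """Try every tour that starts and ends at city 0; return (cost, tour)."""
    n = len(dist)
    best_cost, best_tour = None, None
    for rest in permutations(range(1, n)):   # (n - 1)! candidate orders
        tour = (0,) + rest + (0,)
        cost = sum(dist[tour[i]][tour[i + 1]] for i in range(n))
        if best_cost is None or cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour


# Hypothetical symmetric distances between four cities.
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(shortest_tour(dist))  # the cheapest tour costs 18
```

With 4 cities there are only 3! = 6 tours to try, but with 21 cities there would be 20! = 2,432,902,008,176,640,000, which is why exhaustive search stops being practical even for moderate-sized instances.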

Geometric Problems

Geometric algorithms deal with geometric objects such as points, lines, and polygons. Ancient Greeks were very much interested in developing procedures for solving a variety of geometric problems, including problems of constructing simple geometric shapes (triangles, circles, and so on) with an unmarked ruler and a compass. Then, for about 2000 years, intense interest in geometric algorithms disappeared, to be resurrected in the age of computers: no more rulers and compasses, just bits, bytes, and good old human ingenuity. Of course, today people are interested in geometric algorithms with quite different applications in mind, such as computer graphics, robotics, and tomography.

We will discuss algorithms for only two classic problems of computational geometry: the closest-pair problem and the convex-hull problem. The closest-pair problem is self-explanatory: given n points in the plane, find the closest pair among them. The convex-hull problem is to find the smallest convex polygon that would include all the points of a given set.
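The closest-pair problem as stated above can be solved by simply comparing every pair of points. The following brute-force sketch is meant only to make the problem statement concrete; it is not claimed to be the algorithm the notes go on to discuss, and the name closest_pair is hypothetical. It relies on math.dist, available in Python 3.8 and later.

```python
from itertools import combinations
from math import dist


def closest_pair(points):
    """Return the pair of points with the smallest Euclidean distance."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    # Examine all n * (n - 1) / 2 pairs and keep the closest one.
    return min(combinations(points, 2), key=lambda pair: dist(pair[0], pair[1]))


points = [(0, 0), (5, 5), (1, 1), (9, 0)]
print(closest_pair(points))
```

For the four sample points the result is the pair (0, 0) and (1, 1). The number of pairs grows quadratically with n, which motivates more refined algorithms for large point sets.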

Numerical Problems

Numerical problems, another large area of applications, are problems that involve mathematical objects of a continuous nature: solving equations and systems of equations, computing definite integrals, evaluating functions, and so on. The majority of such mathematical problems can be solved only approximately. Another principal difficulty stems from the fact that such problems typically require manipulating real numbers, which can be represented in a computer only approximately. Moreover, a large number of arithmetic operations performed on approximately represented numbers can lead to an accumulation of round-off error to a point where it can drastically distort the output produced by a seemingly sound algorithm.
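A small experiment shows round-off error accumulating in ordinary floating-point arithmetic. The helper naive_sum is a name introduced for this example; math.fsum is the standard-library function that tracks the lost low-order bits while summing.

```python
import math


def naive_sum(values):
    """Add the values left to right, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total


tenths = [0.1] * 10
print(naive_sum(tenths) == 1.0)   # False: 0.1 has no exact binary form
print(math.fsum(tenths) == 1.0)   # True: fsum compensates for the round-off
```

Ten additions already produce a visible discrepancy; algorithms that perform millions of operations on approximately represented numbers need to be designed with this effect in mind.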

Many sophisticated algorithms have been developed over the years in this area, and they continue to play a critical role in many scientific and engineering applications. But in the last 25 years or so, the computing industry has shifted its focus to business applications. These new applications primarily require algorithms for information storage, retrieval, transportation through networks, and presentation to users. As a result of this revolutionary change, numerical analysis has lost its formerly dominating position in both industry and computer science programs. Still, it is important for any computer-literate person to have at least a rudimentary idea about numerical algorithms.

1.5 Fundamentals of Data Structures

Since most algorithms operate on data, particular ways of arranging the data play a critical role in the design and analysis of algorithms. A data structure can be defined as a particular way of arranging data. The expression "data structure", however, is usually used to refer to more complex ways of storing and manipulating data, such as arrays, stacks, queues, etc. We begin by discussing the simplest, but one of the most useful, data structures, namely the array.

ARRAY

Recall that an array is a named collection of homogeneous items. An item's place within the collection is called its index. The index is an integer between 0 and max length - 1. If there is no ordering on the items in the container, we call the container unsorted. If there is an ordering, we call the container sorted. The size of the array is given by max length. Every item in the array can be accessed in the same constant amount of time.



Fig 1.b.

Linked list:

A linked list consists of a head and nodes. A node consists of two fields: data and a pointer. The pointer points to the next node. The time to access any data item is variable and depends on the position of the item in the list.
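A minimal sketch of such a list follows. The class name Node and the helper item_at are illustrative; in Python the "pointer" field is simply a reference to the next node, or None at the end of the list.

```python
class Node:
    """One node of a singly linked list: a data field and a next pointer."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node


def item_at(head, position):
    """Return the data stored at the given position, counting from 0."""
    if position < 0:
        raise IndexError("position must be nonnegative")
    node = head
    steps = 0
    while node is not None and steps < position:
        node = node.next     # follow one pointer per step
        steps += 1
    if node is None:
        raise IndexError("position is past the end of the list")
    return node.data


head = Node("a", Node("b", Node("c")))
print(item_at(head, 2))
```

Reaching position k requires following k pointers from the head, which is why access time depends on where the item sits, unlike the constant-time access of an array.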



Fig 1.c.

Stacks: Stacks are known as LIFO (Last In, First Out) lists. The last element inserted will be the first to be retrieved, using Push and Pop.

Push: Add an element to the top of the stack.

Pop: Remove the element at the top of the stack.

Fig 1.d

Queues: Accessing the elements of a queue follows a FIFO (First In, First Out) order. The first element inserted will be the first to be retrieved, using Enqueue and Dequeue.

Enqueue: Add an element after the rear of the queue.

Dequeue: Remove the element at the front of the queue.
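The stack and queue operations described above can be sketched with Python's built-in containers. The class names and the choice of a list for the stack and collections.deque for the queue are assumptions of this example, not the linked representation shown in the figures.

```python
from collections import deque


class Stack:
    """LIFO container: push and pop both work at the top."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)       # add at the top

    def pop(self):
        return self._items.pop()       # remove from the top


class Queue:
    """FIFO container: enqueue at the rear, dequeue at the front."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)       # add after the rear

    def dequeue(self):
        return self._items.popleft()   # remove from the front


stack, queue = Stack(), Queue()
for value in (1, 2, 3):
    stack.push(value)
    queue.enqueue(value)
print(stack.pop(), queue.dequeue())
```

After inserting 1, 2, 3 into both, the stack returns 3 (the last one in) while the queue returns 1 (the first one in).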



Fig 1.e.

Fig 1.f. Stack and queue visualized as linked structures



Graphs

A graph is a data structure that consists of a set of nodes and a set of edges that relate the nodes to each other.

Undirected graph: A graph in which the edges have no direction.

Directed graph (digraph): A graph in which each edge is directed from one vertex to another (or the same) vertex.

An undirected graph G is a pair (V, E), where V is a finite set of points called vertices and E is a finite set of edges.

An edge e ∈ E is an unordered pair (u, v), where u, v ∈ V. In a directed graph, the edge e is an ordered pair (u, v). An edge (u, v) is incident from vertex u and is incident to vertex v.

A path from a vertex v to a vertex u is a sequence <v0, v1, ..., vk> of vertices where v0 = v, vk = u, and (vi, vi+1) ∈ E for i = 0, 1, ..., k-1. The length of a path is defined as the number of edges in the path.

Fig 1.g. A directed and an undirected graph.

Graph properties: acyclicity

Cycle: A simple path of positive length that starts and ends at the same vertex.

Acyclic graph: A graph without cycles.

DAG: Directed Acyclic Graph.

Paths and connectivity

Paths: A path from vertex u to v of a graph G is defined as a sequence of adjacent (connected by an edge) vertices that starts with u and ends with v.

Simple paths: All edges of a path are distinct.

Path length: the number of edges, or the number of vertices minus 1.



Connected graphs

A graph is said to be connected if for every pair of its vertices u and v there is a path from u to v.

Connected component

- A maximal connected subgraph of a given graph

Graph representation: A graph is represented by

Adjacency matrix: an n x n boolean matrix if |V| is n. The element in the ith row and jth column is 1 if there is an edge from the ith vertex to the jth vertex; otherwise it is 0. The adjacency matrix of an undirected graph is symmetric.

Adjacency linked lists: a collection of linked lists, one for each vertex, that contain all the vertices adjacent to the list's vertex.

Fig 1.h. Graph representation.
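Both representations can be built from a list of edges. The sketch below handles an undirected graph whose vertices are numbered 0 to n - 1; the function names are hypothetical, and plain Python lists stand in for the linked lists of the second representation.

```python
def adjacency_matrix(n, edges):
    """Build the n x n 0/1 matrix of an undirected graph."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1    # undirected: mirror each edge, so the matrix is symmetric
    return matrix


def adjacency_lists(n, edges):
    """Build one list of neighbours per vertex of an undirected graph."""
    lists = [[] for _ in range(n)]
    for u, v in edges:
        lists[u].append(v)
        lists[v].append(u)
    return lists


edges = [(0, 1), (0, 2), (2, 3)]
print(adjacency_matrix(4, edges))
print(adjacency_lists(4, edges))
```

The matrix always takes n x n entries regardless of how many edges exist, while the lists take space proportional to the number of vertices plus edges, which is the usual reason to prefer lists for sparse graphs.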
