Brief introduction to algorithms

Upload: janey24

Post on 03-Jun-2018


  • 8/13/2019 Brief introduction to algorithms

    1/17

    Chapter 1. Introduction

    1.1 Need for studying algorithms

    The study of algorithms is the cornerstone of computer science; it can be regarded as the core of the discipline. Computer programs would not exist without algorithms. With computers becoming an essential part of our professional and personal lives, studying algorithms becomes a necessity, more so for computer science engineers.

    Another reason for studying algorithms is that knowing a standard set of important algorithms furthers our analytical skills and helps us develop new algorithms for required applications.

    1.2 ALGORITHM

    An algorithm is a finite set of instructions that, if followed, accomplishes a particular task. In addition, all algorithms must satisfy the following criteria:

    1. Input. Zero or more quantities are externally supplied.

    2. Output. At least one quantity is produced.

    3. Definiteness. Each instruction is clear and unambiguous.

    4. Finiteness. If we trace out the instructions of an algorithm, then for all cases the algorithm terminates after a finite number of steps.

    5. Effectiveness. Every instruction must be basic enough that it can be carried out, in principle, by a person using only pencil and paper. It is not enough that each operation be definite as in criterion 3; it also must be feasible.


    Fig 1.a.

    An algorithm is composed of a finite set of steps, each of which may require one or more operations. The possibility of a computer carrying out these operations necessitates that certain constraints be placed on the type of operations an algorithm can include. Criterion 4 requires that an algorithm terminate after a finite number of operations.

    Criterion 5 requires that each operation be effective: each step must be such that it can, at least in principle, be done by a person using pencil and paper in a finite amount of time. Performing arithmetic on integers is an example of an effective operation, but arithmetic with real numbers is not, since some values may be expressible only by an infinitely long decimal expansion. Adding two such numbers would violate the effectiveness property.

    Algorithms that are definite and effective are also called computational procedures.

    - The same algorithm can be represented in several ways.
    - Several algorithms may solve the same problem.
    - Different ideas give different speeds.

    Example:


    Problem: GCD of two numbers m, n

    Input specification: two integers, nonnegative, not both zero.

    Euclid's algorithm

    - gcd(m, n) = gcd(n, m mod n)

    Apply this equality repeatedly until m mod n = 0, since gcd(m, 0) = m.

    Another representation of the same algorithm

    Euclid's algorithm

    Step 1: If n = 0, return the value of m and stop; otherwise proceed to Step 2.

    Step 2: Divide m by n and assign the value of the remainder to r.

    Step 3: Assign the value of n to m and the value of r to n. Go to Step 1.

    Another algorithm to solve the same problem

    Consecutive integer checking algorithm

    Step 1: Assign the value of min(m, n) to t.

    Step 2: Divide m by t. If the remainder is 0, go to Step 3; otherwise go to Step 4.

    Step 3: Divide n by t. If the remainder is 0, return the value of t as the answer and stop; otherwise proceed to Step 4.

    Step 4: Decrease the value of t by 1. Go to Step 2.

    Unlike Euclid's algorithm, this procedure does not work when one of the inputs is zero, because Step 2 would then divide by t = 0.
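    The two step-by-step procedures above can be sketched in Python as follows. This is a minimal illustration, not part of the original notes; the function names gcd_euclid and gcd_by_checking are chosen here for convenience, and the second function assumes both inputs are positive.

```python
def gcd_euclid(m: int, n: int) -> int:
    """Euclid's algorithm: apply gcd(m, n) = gcd(n, m mod n) until n is 0."""
    while n != 0:        # Step 1: stop when n = 0
        r = m % n        # Step 2: remainder of m divided by n
        m, n = n, r      # Step 3: n becomes m, r becomes n
    return m


def gcd_by_checking(m: int, n: int) -> int:
    """Try t = min(m, n), then t - 1, ... until t divides both inputs.

    Assumption: both inputs are positive (t = 0 would divide by zero).
    """
    if m <= 0 or n <= 0:
        raise ValueError("both inputs must be positive")
    t = min(m, n)                        # Step 1
    while m % t != 0 or n % t != 0:      # Steps 2 and 3
        t -= 1                           # Step 4
    return t
```

    Both functions return the same value on positive inputs, but the first needs far fewer iterations when the inputs are large.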

    1.3 Fundamentals of algorithmic problem solving

    - Understanding the problem
    - Ascertaining the capabilities of the computational device
    - Exact / approximate solution
    - Deciding on the appropriate data structure
    - Algorithm design techniques
    - Methods of specifying an algorithm
    - Proving an algorithm's correctness
    - Analysing an algorithm


    Understanding the problem: The given problem should be understood completely. Check whether it is similar to some standard problem and whether a known algorithm exists; otherwise a new algorithm has to be devised. Creating an algorithm is an art which may never be fully automated. An important step in the design is to specify an instance of the problem.

    Ascertain the capabilities of the computational device: Once a problem is understood, we need to know the capabilities of the computing device. This can be done by knowing the type of architecture, the speed, and the memory available.

    Exact / approximate solution: An algorithm can produce its solution in one of two forms: an exact solution or an approximate solution. Examples of problems for which an exact solution generally cannot be obtained are
    (i) finding the square root of a number;
    (ii) solving nonlinear equations.
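    As an illustration of an approximate solution, the sketch below estimates a square root with Newton's iteration. The technique, the function name approx_sqrt, and the tolerance parameter are choices made for this example; the notes do not prescribe a particular method.

```python
def approx_sqrt(x: float, tolerance: float = 1e-12) -> float:
    """Approximate the square root of x >= 0 with Newton's iteration."""
    if x < 0:
        raise ValueError("x must be nonnegative")
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0          # any positive starting value works
    while True:
        better = 0.5 * (guess + x / guess)
        # Stop once successive guesses agree to the requested relative tolerance.
        if abs(better - guess) <= tolerance * better:
            return better
        guess = better
```

    The result is only as accurate as the tolerance and the floating-point format allow, which is the sense in which the answer is approximate.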

    Decide on the appropriate data structure: Some algorithms do not demand any ingenuity in representing their inputs. Others are in fact predicated on ingenious data structures. A data type is a well-defined collection of data with a well-defined set of operations on it. A data structure is an actual implementation of a particular abstract data type. The elementary data structures are:

    Arrays: These let you access lots of data fast (good). You can have arrays of any other data type (good). However, you cannot make a fixed-size array bigger if your program decides it needs more space (bad).

    Records: These let you organize non-homogeneous data into logical packages to keep everything together (good). These packages do not include operations, just data fields (bad, which is why we need objects). Records do not help you process distinct items in loops (bad, which is why arrays of records are used).

    Sets: These let you represent subsets of a set with such operations as intersection, union, and equivalence (good). In some languages, built-in sets are limited to a certain small size (bad, but we can build our own set data type out of arrays to solve this problem if necessary).
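    A small Python sketch of an array of records and of set operations is given below. The Student record and the sample data are hypothetical; Python lists and sets are used as stand-ins and do not have the size limits mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Student:            # a record: named fields of different types
    name: str
    marks: int


# An "array of records" lets us process the items in a loop.
roster = [Student("Asha", 82), Student("Ravi", 67), Student("Meena", 91)]
total_marks = sum(s.marks for s in roster)

# Sets support intersection, union, and comparison.
passed = {s.name for s in roster if s.marks >= 70}
honours = {s.name for s in roster if s.marks >= 90}
both = passed & honours       # intersection
either = passed | honours     # union
```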


    Algorithm design techniques: Creating an algorithm is an art which may never be fully automated. By mastering design strategies, it becomes easier to devise new and useful algorithms. Dynamic programming is one such technique. Some of the techniques are especially useful in fields other than computer science, such as operations research and electrical engineering. Some important design techniques are linear, nonlinear, and integer programming.

    Methods of specifying an algorithm: There are three main options for specifying an algorithm: natural language, pseudocode, and flowcharts.

    Pseudocode is a mixture of natural language and programming-language-like constructs. A flowchart is a method of expressing an algorithm by a collection of connected geometric shapes.

    Proving an algorithm's correctness: Once an algorithm is devised, it is necessary to show that it computes the correct answer for all possible legal inputs. We refer to this process as algorithm validation. The purpose of validation is to assure us that the algorithm will work correctly independent of issues concerning the programming language it will eventually be written in. A proof of correctness requires that the solution be stated in two forms. One form is usually a program annotated by a set of assertions about the input and output variables of the program. These assertions are often expressed in the predicate calculus. The second form is called a specification, and this may also be expressed in the predicate calculus. A proof consists of showing that these two forms are equivalent in that, for every given legal input, they describe the same output. A complete proof of program correctness requires that each statement of the programming language be precisely defined and all basic operations be proved correct. All these details may cause a proof to be very much longer than the program.

    Analyzing algorithms: As an algorithm is executed, it uses the computer's central processing unit to perform operations and its memory (both immediate and auxiliary) to hold the program and data. Analysis of algorithms, or performance analysis, refers to the task of determining how much computing time and storage an algorithm requires. This is a challenging area which sometimes requires great mathematical skill. An important result of this study is that it allows you to make quantitative judgments about the value of one algorithm over another. Another result is that it allows you to predict whether the software will meet any efficiency constraints that exist.

    Performance analysis

    There are many criteria upon which we can judge an algorithm, for instance:

    1. Does it do what we want it to do?

    2. Does it work correctly according to the original specifications of the task?

    3. Is there documentation that describes how to use it and how it works?

    4. Are procedures created in such a way that they perform logical subfunctions?

    5. Is the code readable?

    The space complexity of an algorithm is the amount of memory it needs to run to completion. The time complexity of an algorithm is the amount of computer time it needs to run to completion.

    Performance evaluation can be loosely divided into two major phases: (1) a priori estimates and (2) a posteriori testing. We refer to these as performance analysis and performance measurement, respectively.

    Space complexity

    The space needed by an algorithm is the sum of the following components:

    (1) A fixed part that is independent of the characteristics (e.g., number, size) of the inputs and outputs. This part typically includes the instruction space (i.e., space for the code), space for simple variables and fixed-size component variables (also called aggregates), space for constants, and so on.

    (2) A variable part that consists of the space needed by component variables whose size depends on the particular problem instance being solved, the space needed by referenced variables (to the extent that it depends on instance characteristics), and the recursion stack space (insofar as this space depends on the instance characteristics).

    The space requirement S(P) of an algorithm P may therefore be written as S(P) = c + S_P(instance characteristics), where c is a constant. When analyzing the space complexity of an algorithm, we concentrate solely on estimating S_P(instance characteristics). For any given problem, we first need to determine which instance characteristics to use to measure the space requirement. Generally speaking, our choices are related to the number and magnitude of the inputs to and outputs from the algorithm. At times, more complex measures of the interrelationship among the data items are used.
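    To make the instance-dependent part of the space requirement concrete, the sketch below sums the first n items of a list recursively and reports how many stack frames were active at the deepest point. The function name rsum and the frame-counting return value are illustrative assumptions, not part of the notes.

```python
def rsum(values, n):
    """Return (sum of values[:n], number of stack frames used)."""
    if n == 0:
        return 0, 1                       # the base case still occupies one frame
    partial, frames = rsum(values, n - 1)
    return partial + values[n - 1], frames + 1
```

    For n items the recursion uses n + 1 frames, so the recursion stack space grows linearly with n, whereas a simple loop would need only a constant number of extra variables.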

    Time complexity

    The time T(P) taken by a program P is the sum of the compile time and the run time. The compile time does not depend on the instance characteristics. Also, we may assume that a compiled program will be run several times without recompilation, so we concern ourselves with just the run time of a program. This run time is denoted by t_P.

    Because many of the factors t_P depends on are not known at the time a program is conceived, it is reasonable to attempt only to estimate t_P. If we knew the characteristics of the compiler to be used, we could proceed to determine the number of additions, subtractions, multiplications, divisions, compares, loads, stores, and so on, that would be made by the code for P. So we could obtain an expression for t_P(n) of the form

    t_P(n) = c_a ADD(n) + c_s SUB(n) + c_m MUL(n) + c_d DIV(n) + ...

    where n denotes the instance characteristics; c_a, c_s, c_m, c_d, and so on, respectively, denote the time needed for an addition, subtraction, multiplication, division, and so on; and ADD, SUB, MUL, DIV, and so on, are functions whose values are the numbers of additions, subtractions, multiplications, divisions, and so on, that are performed when the code for P is used on an instance with characteristics n.

    Obtaining such an exact formula is in itself an impossible task, since the time needed for an addition, subtraction, multiplication, and so on, often depends on the numbers being added, subtracted, multiplied, and so on. The value of t_P(n) for any given n can be obtained only experimentally. The program is typed, compiled, and run on a particular machine. The execution time is physically clocked, and t_P(n) is obtained. Even with this experimental approach, one could face difficulties. In a multiuser system, the execution time depends on such factors as system load, the number of other programs running on the computer at the time program P is run, the characteristics of these programs, and so on.
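    The experimental approach can be sketched as below: the run time is clocked with Python's time.perf_counter, and the smallest of several repetitions is kept to reduce the influence of system load. The helper name measure_seconds and the repeats parameter are assumptions made for this example.

```python
import time


def measure_seconds(func, *args, repeats: int = 5) -> float:
    """Return the smallest wall-clock time over `repeats` calls of func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


# Example: time the built-in sort on a reversed list of 1000 integers.
elapsed = measure_seconds(sorted, list(range(1000, 0, -1)))
```

    The measured value depends on the machine and on whatever else it is running, so it should be read as an estimate rather than a property of the algorithm alone.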


    Given the minimal utility of determining the exact number of additions, subtractions, and so on, that are needed to solve a problem instance with characteristics given by n, we might as well lump all the operations together and obtain a count for the total number of operations. We can go one step further and count only the number of program steps.

    A program step is loosely defined as a syntactically or semantically meaningful segment of a program that has an execution time that is independent of the instance characteristics. For example, the entire statement

    return a + b + b*c + (a + b - c)/(a + b) + 4.0;

    of Algorithm 1.5 could be regarded as a step, since its execution time is independent of the instance characteristics (this statement is not strictly true, since the time for a multiply and divide generally depends on the numbers involved in the operation).

    The number of steps any program statement is assigned depends on the kind of statement. For example, comments count as zero steps; an assignment statement which does not involve any calls to other algorithms is counted as one step; in an iterative statement such as a for, while, or repeat-until statement, we consider the step counts only for the control part of the statement. The control parts of for and while statements have the following forms:

    for i = ⟨expr⟩ to ⟨expr1⟩ do

    while (⟨expr⟩) do

    Each execution of the control part of a while statement is given a step count equal to the number of step counts assignable to ⟨expr⟩. The step count for each execution of the control part of a for statement is one, unless the counts attributable to ⟨expr⟩ and ⟨expr1⟩ are functions of the instance characteristics. In this latter case, the first execution of the control part of the for has a step count equal to the sum of the counts for ⟨expr⟩ and ⟨expr1⟩. Remaining executions of the for statement have a step count of one; and so on.

    We can determine the number of steps needed by a program to solve a particular problem instance in one of two ways. In the first method, we introduce a new variable, count, into the program. This is a global variable with initial value 0. Statements to increment count by the appropriate amount are introduced into the program. This is done so that each time a statement in the original program is executed, count is incremented by the step count of that statement.
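    The first method can be sketched on a simple summation routine. The routine array_sum is a hypothetical example chosen here; the placement of the increments follows the counting rules described above (one step per assignment, one per execution of the loop control part, one for the return).

```python
count = 0  # global step counter with initial value 0


def array_sum(a, n):
    """Sum a[0..n-1] while recording the number of program steps in count."""
    global count
    s = 0.0
    count += 1            # the assignment s = 0.0
    for i in range(n):
        count += 1        # one execution of the for control part
        s += a[i]
        count += 1        # the assignment inside the loop
    count += 1            # the last test of the for control part
    count += 1            # the return statement
    return s


total = array_sum([1, 2, 3, 4, 5], 5)
```

    With these increments the routine takes 2n + 3 steps on an input of size n, so after the call above count holds 13.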


    1.4 Important Problem Types

    - Sorting
    - Searching
    - String processing
    - Graph problems
    - Combinatorial problems
    - Geometric problems
    - Numerical problems

    Sorting: A sorting algorithm is an algorithm that puts the elements of a list in a certain order. The most-used orders are numerical order and lexicographical order. Efficient sorting is important to optimizing the use of other algorithms (such as search and merge algorithms) that require sorted lists to work correctly; it is also often useful for canonicalizing data and for producing human-readable output. More formally, the output must satisfy two conditions:

    1. The output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order);

    2. The output is a permutation, or reordering, of the input.

    Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite its simple, familiar statement. For example, bubble sort was analyzed as early as 1956. [1] Although many consider it a solved problem, useful new sorting algorithms are still being invented (for example, library sort was first published in 2004). The sorting problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation, divide-and-conquer algorithms, data structures, randomized algorithms, best-, worst-, and average-case analysis, time-space tradeoffs, and lower bounds.
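    The two output conditions can be checked mechanically. The sketch below pairs a plain bubble sort with a checker; the names bubble_sort and is_valid_sort are illustrative, and the early-exit flag is a common refinement rather than something the notes specify.

```python
from collections import Counter


def bubble_sort(items):
    """Return a sorted copy of items using bubble sort."""
    result = list(items)
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        if not swapped:       # no swaps means the list is already sorted
            break
    return result


def is_valid_sort(original, output):
    """Check condition 1 (nondecreasing) and condition 2 (permutation)."""
    nondecreasing = all(output[i] <= output[i + 1] for i in range(len(output) - 1))
    permutation = Counter(original) == Counter(output)
    return nondecreasing and permutation
```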

    Searching: In computer science, a search algorithm, broadly speaking, is an algorithm for finding an item with specified properties among a collection of items. The items may be stored individually as records in a database; or may be elements of a search space defined by a mathematical formula or procedure, such as the roots of an equation with integer variables; or a combination of the two, such as the Hamiltonian circuits of a graph. Searching algorithms are closely related to the concept of dictionaries. Dictionaries are data structures that support search, insert, and delete operations. One of the most effective representations is a hash table: typically, a simple function is applied to the key to determine its place in the dictionary. Another efficient search algorithm, for sorted tables, is binary search.
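    A minimal sketch of binary search on a sorted Python list follows. The function name and the convention of returning -1 for a missing item are assumptions made for this example.

```python
def binary_search(sorted_items, target):
    """Return an index of target in sorted_items, or -1 if it is absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1         # target can only be in the right half
        else:
            high = mid - 1        # target can only be in the left half
    return -1
```

    Each comparison halves the remaining range, which is why the table must be sorted beforehand.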


    String processing: String searching algorithms are important in all sorts of applications that we meet every day. In text editors, we might want to search through a very large document (say, a million characters) for the occurrence of a given string (maybe dozens of characters). In text retrieval tools, we might potentially want to search through thousands of such documents (though normally these files would be indexed, making this unnecessary). Other applications might require string matching algorithms as part of a more complex algorithm (e.g., the Unix program "diff", which works out the differences between two similar text files). Sometimes we might want to search in binary strings (i.e., sequences of 0s and 1s). For example, the "pbm" graphics format is based on sequences of 1s and 0s. We could express a task like "find a wide white stripe in the image" as a string searching problem.
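    The stripe example can be sketched with a brute-force string search. The function name find_substring is hypothetical, and the example assumes that one image row is given as a string of digits in which 0 stands for a white pixel.

```python
def find_substring(text, pattern):
    """Return the first index where pattern occurs in text, or -1."""
    n, m = len(text), len(pattern)
    for start in range(n - m + 1):
        if text[start:start + m] == pattern:
            return start
    return -1


# One image row (assumption: "0" is white); look for a stripe 4 pixels wide.
row = "1110000011"
stripe_at = find_substring(row, "0" * 4)
```

    Here the stripe starts at index 3 of the row.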

    Graph problems: Graph algorithms are one of the oldest classes of algorithms, and they have been studied for almost 300 years (in 1736 Leonhard Euler formulated one of the first graph problems, the Königsberg Bridge Problem).

    There are two large classes of graphs:

    - directed graphs (digraphs)
    - undirected graphs

    Some algorithms differ by class. Moreover, the sets of problems for digraphs and undirected graphs are different. There are special cases of digraphs and graphs that have their own sets of problems. One example for digraphs is program graphs. Program graphs are important in compiler construction and were studied in detail after the invention of computers.

    Graphs are made up of vertices and edges. The simplest property of a vertex is its degree, the number of edges incident upon it. The sum of the vertex degrees in any undirected graph is twice the number of edges, since every edge contributes one to the degree of both adjacent vertices. Trees are connected undirected graphs which contain no cycles. Vertex degrees are important in the analysis of trees. A leaf of a tree is a vertex of degree 1. Every n-vertex tree contains n - 1 edges, so all non-trivial trees contain at least two leaf vertices.
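    The degree facts can be checked on a small example. The sketch below computes vertex degrees from an edge list; the function name degrees and the sample five-vertex tree are made up for illustration.

```python
def degrees(num_vertices, edges):
    """Return the degree of each vertex of an undirected graph."""
    deg = [0] * num_vertices
    for u, v in edges:
        deg[u] += 1       # every edge adds one to the degree
        deg[v] += 1       # of both of its endpoints
    return deg


tree_edges = [(0, 1), (0, 2), (2, 3), (2, 4)]    # a tree on 5 vertices
deg = degrees(5, tree_edges)
leaves = [v for v, d in enumerate(deg) if d == 1]
```

    The degrees sum to 8, twice the 4 edges, and the tree has three leaves (vertices 1, 3, and 4).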


    Among classic algorithms/problems on digraphs we can note the following:

    - Reachability. Can you get to B from A?
    - Shortest path (min-cost path). Find the path from A to B with the minimum cost, determined as some simple function of the edges traversed in the path (Dijkstra's and Floyd's algorithms).
    - Visit all nodes. Traversal: depth-first and breadth-first traversals.
    - Transitive closure. Determine all pairs of nodes that can reach each other (Floyd's algorithm).
    - Dominators. A node d dominates a node n if every path from the start node to n must go through d. Notationally, this is written as d dom n. By definition, every node dominates itself. There are a number of related concepts:
        o immediate dominator
        o pre-dominator
        o post-dominator
        o dominator tree
    - Minimum spanning tree. A spanning tree is a set of edges such that every node is reachable from every other node, and the removal of any edge from the tree eliminates the reachability property. A minimum spanning tree is a spanning tree of smallest total edge cost (Prim's and Kruskal's algorithms).
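    The reachability question from the list above can be answered with a breadth-first traversal. In this sketch the digraph is a dictionary mapping each node to the list of nodes its edges point to; that representation, the function name reachable, and the sample graph are assumptions made for the example.

```python
from collections import deque


def reachable(adjacency, start, goal):
    """Return True if goal can be reached from start by following edges."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:       # visit every node at most once
                seen.add(nxt)
                queue.append(nxt)
    return False


digraph = {"A": ["C"], "C": ["B"], "B": [], "D": ["A"]}
```

    In this sample digraph B is reachable from A, but A is not reachable from B, which shows that edge direction matters.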

    Combinatorial problems: From a more abstract perspective, the traveling salesman problem and the graph coloring problem are examples of combinatorial problems. These are problems that ask us to find a combinatorial object, such as a permutation, a combination, or a subset, that satisfies certain constraints and has some desired property. Generally speaking, combinatorial problems are the most difficult problems in computing, from both the theoretical and practical standpoints. Their difficulty stems from the following facts. First, the number of combinatorial objects typically grows extremely fast with a problem's size, reaching unimaginable magnitudes even for moderate-sized instances. Second, there are no known algorithms for solving most such problems exactly in an acceptable amount of time. Moreover, most computer scientists believe such algorithms do not exist. This conjecture has been neither proved nor disproved, and it remains the most important unresolved issue in theoretical computer science.

    Some combinatorial problems can be solved by efficient algorithms, but they should be considered fortunate exceptions to the rule. The shortest-path problem mentioned earlier is among such exceptions.

    Geometric Problems

    Geometric algorithms deal with geometric objects such as points, lines, and polygons. The ancient Greeks were very much interested in developing procedures for solving a variety of geometric problems, including problems of constructing simple geometric shapes (triangles, circles, and so on) with an unmarked ruler and a compass. Then, for about 2000 years, intense interest in geometric algorithms disappeared, to be resurrected in the age of computers: no more rulers and compasses, just bits, bytes, and good old human ingenuity. Of course, today people are interested in geometric algorithms with quite different applications in mind, such as computer graphics, robotics, and tomography.

    We will discuss algorithms for only two classic problems of computational geometry: the closest-pair problem and the convex-hull problem. The closest-pair problem is self-explanatory: given n points in the plane, find the closest pair among them. The convex-hull problem is to find the smallest convex polygon that would include all points of a given set.
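    A brute-force sketch of the closest-pair problem is shown below: it simply compares every pair of points. This is an illustrative baseline with a hypothetical function name, not the algorithm the notes go on to discuss; it relies on math.dist, available in Python 3.8 and later.

```python
from itertools import combinations
from math import dist


def closest_pair(points):
    """Return the pair of points with the smallest Euclidean distance."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    # Examine all n(n-1)/2 pairs and keep the one with the smallest distance.
    return min(combinations(points, 2), key=lambda pair: dist(*pair))
```

    Because every pair is examined, the running time grows quadratically with the number of points.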

    Numerical Problems

    Numerical problems, another large area of applications, are problems that involve mathematical objects of a continuous nature: solving equations and systems of equations, computing definite integrals, evaluating functions, and so on. The majority of such mathematical problems can be solved only approximately. Another principal difficulty stems from the fact that such problems typically require manipulating real numbers, which can be represented in a computer only approximately. Moreover, a large number of arithmetic operations performed on approximately represented numbers can lead to an accumulation of round-off error to a point where it can drastically distort the output produced by a seemingly sound algorithm.
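    Round-off accumulation is easy to observe with binary floating point. The sketch below adds 0.1 ten times; the variable names are illustrative, and math.fsum is used only as a reference that tracks the lost low-order bits.

```python
import math

values = [0.1] * 10

naive_total = 0.0
for v in values:
    naive_total += v              # each addition rounds the running total

careful_total = math.fsum(values)  # compensated summation from the standard library
```

    The naive total differs slightly from 1.0 because 0.1 has no exact binary representation and each addition rounds again, while math.fsum returns exactly 1.0 for this input.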

    Many sophisticated algorithms have been developed over the years in this area, and they continue to play a critical role in many scientific and engineering applications. But in the last 25 years or so, the computing industry has shifted its focus to business applications. These new applications require primarily algorithms for information storage, retrieval, transportation through networks, and presentation to users. As a result of this revolutionary change, numerical analysis has lost its formerly dominating position in both industry and computer science programs. Still, it is important for any computer-literate person to have at least a rudimentary idea about numerical algorithms.

    1.5 FUNDAMENTALS OF DATA STRUCTURES

    Since most algorithms operate on data, particular ways of arranging the data play a critical role in the design and analysis of algorithms. A data structure can be defined as a particular way of arranging data. The expression "data structure", however, is usually used to refer to more complex ways of storing and manipulating data, such as arrays, stacks, queues, etc. We begin by discussing the simplest, but one of the most useful, data structures, namely the array.

    ARRAY

    Recall that an array is a named collection of homogeneous items. An item's place within the collection is called its index. The index is an integer between 0 and max length - 1. If there is no ordering on the items in the container, we call the container unsorted; if there is an ordering, we call the container sorted. The size of the array is given by max length. Every item in the array can be accessed in the same constant amount of time.


    Fig 1.b.

    Linked list:

    A linked list consists of a head and nodes. A node consists of two fields: data and a pointer. The pointer points to the next node. The time to access any item is variable and depends on the position of the item in the list.
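    A minimal sketch of a singly linked list follows. The class name Node, the helper get, and the sample list are illustrative; position is assumed to be a nonnegative integer.

```python
class Node:
    """One node of a singly linked list: a data field and a pointer."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node


def get(head, position):
    """Return the data stored `position` links away from head."""
    node = head
    for _ in range(position):     # follow one pointer per step
        if node is None:
            break
        node = node.next
    if node is None:
        raise IndexError("position out of range")
    return node.data


head = Node("a", Node("b", Node("c")))
```

    Reaching position k requires following k pointers, which is why access time depends on where the item sits in the list.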


    Fig 1.c.

    Stacks: Stacks are known as LIFO (Last In, First Out) lists. The last element inserted will be the first to be retrieved, using Push and Pop.

    Push: Add an element to the top of the stack.

    Pop: Remove the element at the top of the stack.

    Fig 1.d

    Queues: Accessing the elements of a queue follows a FIFO (First In, First Out) order. The first element inserted will be the first to be retrieved, using Enqueue and Dequeue.

    Enqueue: Add an element after the rear of the queue.

    Dequeue: Remove the element at the front of the queue.
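    The LIFO and FIFO orders can be contrasted in a few lines of Python. Here a list plays the role of the stack and collections.deque the role of the queue; these are convenient stand-ins chosen for the example, not the linked structures shown in the figures.

```python
from collections import deque

stack = []                        # the top of the stack is the end of the list
for item in (1, 2, 3):
    stack.append(item)            # Push
popped = [stack.pop() for _ in range(3)]        # Pop returns the newest item first

queue = deque()
for item in (1, 2, 3):
    queue.append(item)            # Enqueue at the rear
dequeued = [queue.popleft() for _ in range(3)]  # Dequeue from the front
```

    The same three insertions come back as 3, 2, 1 from the stack and as 1, 2, 3 from the queue.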


    Fig 1.e.

    Fig 1.f. Stack and queue visualized as linked structures


    Graphs

    A graph is a data structure that consists of a set of nodes and a set of edges that relate the nodes to each other.

    Undirected graph: A graph in which the edges have no direction.

    Directed graph (digraph): A graph in which each edge is directed from one vertex to another (or the same) vertex.

    An undirected graph G is a pair (V, E), where V is a finite set of points called vertices and E is a finite set of edges.

    An edge e ∈ E is an unordered pair (u, v), where u, v ∈ V. In a directed graph, the edge e is an ordered pair (u, v). An edge (u, v) is incident from vertex u and is incident to vertex v.

    A path from a vertex v to a vertex u is a sequence ⟨v0, v1, ..., vk⟩ of vertices where v0 = v, vk = u, and (vi, vi+1) ∈ E for i = 0, 1, ..., k-1. The length of a path is defined as the number of edges in the path.

    Fig 1.g. A directed and an undirected graph.

    Graph Properties: Acyclicity

    Cycle: A simple path of positive length that starts and ends at the same vertex.

    Acyclic graph: A graph without cycles.

    DAG: Directed Acyclic Graph.

    Paths and Connectivity

    Paths: A path from vertex u to vertex v of a graph G is defined as a sequence of adjacent (connected by an edge) vertices that starts with u and ends with v.

    Simple paths: All edges of a path are distinct.

    Path length: The number of edges, or the number of vertices minus 1.


    Connected graphs

    A graph is said to be connected if for every pair of its vertices u and v there is a path from u to v.

    Connected component

    - A maximal connected subgraph of a given graph.

    Graph Representation: A graph is represented by

    Adjacency matrix: an n x n boolean matrix, where n = |V|. The element in the ith row and jth column is 1 if there is an edge from the ith vertex to the jth vertex; otherwise it is 0. The adjacency matrix of an undirected graph is symmetric.

    Adjacency linked lists: a collection of linked lists, one for each vertex, each containing all the vertices adjacent to that list's vertex.

    Fig 1.h. Graph representation.
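    Both representations can be built from an edge list, as in the sketch below for an undirected graph. The function names and the sample graph are hypothetical, vertices are assumed to be numbered from 0, and Python lists stand in for the linked lists.

```python
def adjacency_matrix(n, edges):
    """Build the n x n 0/1 matrix of an undirected graph."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1          # undirected: mirror the entry
    return matrix


def adjacency_lists(n, edges):
    """Build one list of neighbours per vertex."""
    lists = [[] for _ in range(n)]
    for u, v in edges:
        lists[u].append(v)
        lists[v].append(u)
    return lists


edges = [(0, 1), (0, 2), (2, 3)]
matrix = adjacency_matrix(4, edges)
lists = adjacency_lists(4, edges)
```

    The matrix always takes n x n entries, while the lists take space proportional to the number of vertices plus edges, which matters for sparse graphs.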