CS 4407

Algorithms

Lecture 1: Introduction

Prof. Gregory Provan

Department of Computer Science

University College Cork

Course Information

Wed 10-11am

http://www.cs.ucc.ie/~gprovan/cs4407/

Gregory Provan

CS 4407, Algorithms, University College Cork, Gregory M. Provan

WGB 1-71

[email protected] , 420-5928

Office Hours: Wednesday, 11-1 (or by appointment)

Textbook & References

• Introduction to Algorithms, 2nd Ed., by Cormen, Leiserson, Rivest & Stein, MIT Press, 2001 (noted as CLRS)

ALTERNATIVE REFERENCES
• Introduction to the Design and Analysis of Algorithms, by Anany Levitin
• The Design and Analysis of Computer Algorithms, by Aho, Hopcroft and Ullman
• Algorithms, by Sedgewick
• Fundamentals of Algorithmics, by Brassard & Bratley
• Writing Efficient Programs, by Bentley
• The Science of Programming, by Gries
• An Introduction to Bioinformatics Algorithms, by Jones and Pevzner

Lecture Outline

Introduction to Algorithms
• Objectives
• Logistics
• How do we approach the study of algorithms?
• Overview of course content
• Definition of algorithm
• Example

Today’s Learning Objectives

• Topics covered in the course
  – Graph algorithms
  – Mathematical principles of complexity
  – String algorithms
• Definition of algorithm

The future belongs to the computer scientist who has

  – Content: An up-to-date grasp of fundamental problems and solutions
  – Method: Principles and techniques to solve the vast array of unfamiliar problems that arise in a rapidly changing field

(Rudich, www.discretemath.com)

Course Content

• A survey of algorithmic design techniques.
• Abstract thinking.
• How to develop new algorithms for any problem that may arise.

Solving a Computational Problem

• Problem definition & specification
  – specify input, output and constraints
• Algorithm design & analysis
  – devise a correct & efficient algorithm
• Implementation planning
• Coding, testing and verification

Primary Focus

Develop thinking ability
  – formal thinking (proof techniques & analysis)
  – problem solving skills (algorithm design and application)

Why Analyze Algorithms?

An algorithm can be analyzed in terms of time efficiency or space utilization. We will consider only the former right now. The running time of an algorithm is influenced by several factors:

• Speed of the machine running the program
• Language in which the program was written. For example, programs written in assembly language generally run faster than those written in C or C++, which in turn tend to run faster than those written in Java.
• Efficiency of the compiler that created the program
• The size of the input: processing 1000 records will take more time than processing 10 records.
• Organization of the input: if the item we are searching for is at the top of the list, it will take less time to find it than if it is at the bottom.
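The last two factors can be illustrated with a small sketch. It is an added example, not part of the slides: the function name `linear_search` and the choice to count comparisons instead of measuring wall-clock time are assumptions made so the result does not depend on the machine, language, or compiler.

```python
def linear_search(items, target):
    """Scan items left to right; return (index, comparisons made).

    The index is -1 when the target is absent.
    """
    comparisons = 0
    for index, item in enumerate(items):
        comparisons += 1
        if item == target:
            return index, comparisons
    return -1, comparisons


records = list(range(1000))

# Organization of the input: same list, target at the top vs. the bottom.
top = linear_search(records, 0)
bottom = linear_search(records, 999)

# Size of the input: searching for the last item among 10 records.
small = linear_search(list(range(10)), 9)

print(top, bottom, small)
```

Counting comparisons keeps the first three factors out of the picture, so only input size and input organization change the result.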

Generalizing Running Time

Comparing the growth of the running time as the input grows to the growth of known functions.

Input size n   (1)   log n   n       n log n   n²     n³      2ⁿ
5              1     3       5       15        25     125     32
10             1     4       10      33        100    10³     10³
100            1     7       100     664       10⁴    10⁶     10³⁰
1000           1     10      1000    10⁴       10⁶    10⁹     10³⁰⁰
10000          1     13      10000   10⁵       10⁸    10¹²    10³⁰⁰⁰
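Values like those in the table can be regenerated with a short script. This is an added sketch: the helper name `growth_row`, the use of base-2 logarithms, and the rounding rules are assumptions, so individual entries may differ slightly from the rounded figures on the slide.

```python
import math


def growth_row(n):
    """Return the value of each reference function at input size n."""
    return {
        "1": 1,
        "log n": math.ceil(math.log2(n)),      # assumed: base 2, rounded up
        "n": n,
        "n log n": round(n * math.log2(n)),    # assumed: rounded to nearest
        "n^2": n ** 2,
        "n^3": n ** 3,
        "2^n": 2 ** n,                         # exact: Python ints are unbounded
    }


for n in (5, 10, 100):
    row = growth_row(n)
    digits = len(str(row["2^n"]))
    print(n, row["log n"], row["n log n"], row["n^2"], row["n^3"],
          f"2^n has {digits} digits")
```

The exponential column is reported by digit count because the exact values quickly become too long to read.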

General Approach

• Examine interesting problems
• Devise algorithms for solving them
• Prove their correctness
• Analyze their runtime performance
• Study data structures & core algorithms
• Learn problem-solving techniques
• Applications to real-world problems
  – Bioinformatics, computer networking, scheduling, etc.

Goals

• Be very familiar with a collection of core algorithms
• Be fluent in algorithm design paradigms: divide & conquer, greedy algorithms, randomization, dynamic programming, approximation methods
• Be able to analyze the correctness and runtime performance of a given algorithm
• Be familiar with the inherent complexity (lower bounds & intractability) of some problems
• Be intimately familiar with basic data structures
• Be able to apply techniques in practical problems

Applications

• Recent “hot” applications
  – Bioinformatics and Computational Biology, web algorithms (Google)
• Traditional applications
  – Computer networking, scheduling, diagnostics, constraint-based inference

Course Overview

Introduction to algorithm design, analysis and their applications

• Period 1
  – Algorithm Fundamentals (6 lectures)
  – Graph Algorithms (12 lectures)
• Period 2
  – Complexity Measures (10 lectures)
  – String Algorithms (8 lectures)

Algorithm Fundamentals

• Asymptotic Notation
• Recurrence Relations
• Theory of Growth Functions
• Proof Techniques
• Inherent Complexity

Algorithmics Basics (6)

• Introduction to algorithms, complexity and proof of correctness (CLRS Chapters 1 & 2)
• Asymptotic Notation (CLRS Chapter 3.1)

GOAL: Know how to write a problem specification, what a computational model is, and how to measure the efficiency of an algorithm. Know the difference between upper and lower bounds for an algorithm and what they convey. Be able to prove the correctness of an algorithm and establish computational complexity.

Graph Algorithms (12)

• Graph theory
• Basic algorithms
  – Breadth-first, depth-first search
• Minimum spanning tree
• Greedy Algorithms
• Shortest path
• Tree-decomposition

Graph Algorithms (12)

• Basic Graph Algorithms (Chapter 22)

GOAL: Know how graphs arise, their definition and implications. Be able to use the adjacency matrix representation of a graph and the edge list representation appropriately. Understand and be able to use “cut-and-paste” proof techniques as seen in the basic algorithms (DFS, BFS, topological sort, connected components).
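The two representations named in this goal can be sketched as follows. This is an added illustration under stated assumptions: the graph is undirected, vertices are numbered from 0, and the function names `edge_list_to_matrix` and `bfs_order` are hypothetical.

```python
from collections import deque


def edge_list_to_matrix(num_vertices, edges):
    """Build an adjacency matrix from an edge list (undirected graph assumed)."""
    matrix = [[0] * num_vertices for _ in range(num_vertices)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1
    return matrix


def bfs_order(matrix, source):
    """Return the vertices reachable from source in breadth-first order."""
    visited = [False] * len(matrix)
    visited[source] = True
    order = []
    queue = deque([source])
    while queue:
        u = queue.popleft()
        order.append(u)
        # Scanning a full matrix row costs O(V) per vertex, O(V^2) overall.
        for v, connected in enumerate(matrix[u]):
            if connected and not visited[v]:
                visited[v] = True
                queue.append(v)
    return order


edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
matrix = edge_list_to_matrix(5, edges)
order = bfs_order(matrix, 0)
print(order)
```

The edge list is compact for sparse graphs, while the matrix answers "is there an edge between u and v?" in constant time; which one is appropriate depends on the operations the algorithm performs most often.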

More Graph Algorithms (Greedy, Network Flows) (4)

• Minimum Spanning Trees (Chapter 23)
• Shortest Paths (Chapter 24)
• Max-Flows (Chapter 26)
• Tree-Decomposition

GOAL: Know when to use greedy algorithms and their essential characteristics. Be able to prove the correctness of a greedy algorithm in solving an optimization problem. Understand where minimum spanning trees and shortest path computations arise in practice. Know the basic approaches for computing max-flow in a graph. Understand tree-decomposition algorithms and their applications.
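As one illustration of a greedy algorithm for minimum spanning trees, here is a minimal sketch of Kruskal's algorithm. The slides do not say which MST algorithm the course covers, so this choice is an assumption, as are the function name `kruskal`, the `(weight, u, v)` edge format, and the simple union-find used for cycle detection.

```python
def kruskal(num_vertices, weighted_edges):
    """Return the edges of a minimum spanning forest.

    weighted_edges is a list of (weight, u, v) tuples for an undirected graph.
    """
    parent = list(range(num_vertices))

    def find(v):
        # Follow parent links to the set representative (with path halving).
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    # Greedy choice: always consider the cheapest remaining edge first.
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # the edge joins two components: keep it
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 1, 3), (3, 2, 3)]
mst = kruskal(4, edges)
total = sum(weight for weight, _, _ in mst)
print(mst, total)
```

The characteristic greedy feature is that each edge is accepted or rejected once, based only on local information, and the decision is never revisited.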

Complexity Measures

• Algorithm classes P and NP
• Theory of NP-completeness
• NP-complete reductions
• Methods for circumventing intractability

Complexity and NP-Completeness (12)

• Introduction to inherent complexity of particular computational tasks (CLRS Chapter 34.1)
• Complexity Classes P and NP (CLRS Chapter 34.2)
• NP-Complete Reductions (CLRS Chapter 34.3-5)

GOAL: Know how to specify the inherent complexity of a task, and the differences between the classes P and NP. Know the notion of NP-Complete and its significance. Be able to reduce decision tasks to establish inclusion in the class NP-Complete of the computational complexity hierarchy.

String Matching (8)

• String matching task
• String matching algorithms
  – Rabin-Karp, KMP, FSA
• Probabilistic string matching
• Applications
  – Sequence alignment

String Matching (8)

• Introduction to the applications needing string matching, e.g. Bioinformatics
• Introduction to the standard string matching algorithms (CLRS Chapter 32.2, 32.4)
• Alternative algorithms (CLRS Chapter 32.3)

GOAL: Understand the motivations for needing good string matching algorithms. Know the standard string matching algorithms (Rabin-Karp, KMP) and their strengths and weaknesses. Understand the alternative algorithms (FSA, HMM) and their strengths and weaknesses.
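To give a flavour of one of the standard algorithms named above, here is a minimal sketch of Rabin-Karp matching with a rolling hash. It is an added example: the function name `rabin_karp` and the default `base` and `modulus` values are assumptions chosen for illustration, not parameters taken from the course.

```python
def rabin_karp(text, pattern, base=256, modulus=1_000_003):
    """Return every index at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    high = pow(base, m - 1, modulus)   # weight of the window's leading character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % modulus
        t_hash = (t_hash * base + ord(text[i])) % modulus

    matches = []
    for start in range(n - m + 1):
        # Equal hashes may be a collision, so confirm with a direct comparison.
        if p_hash == t_hash and text[start:start + m] == pattern:
            matches.append(start)
        if start < n - m:
            # Roll the window: drop text[start], append text[start + m].
            t_hash = ((t_hash - ord(text[start]) * high) * base
                      + ord(text[start + m])) % modulus
    return matches


print(rabin_karp("GATTACAGATTACA", "TTA"))
```

The strength of the approach is that each window hash is updated in constant time; the weakness is that many hash collisions force repeated direct comparisons.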

Prerequisites

• Basic discrete mathematics
  – Counting arguments
  – Set notation
  – Induction
• Standard data structures
• Graph theory (a review will be provided)

Special Topics (1-2)

• Case Studies of Real-World Problems (lecture notes & handouts)
  – Bioinformatics Algorithms
  – Tree-Decomposition Algorithms
  – PageRank Algorithm

GOAL: See how core algorithms can be put to use in real-world applications.

Course Work & Grades

• In-class tests: 40%
  – Mid-term, Period 1: 10%
  – End-of-Term, Period 1: 20%
  – Mid-term, Period 2: 10%
• Final Exam: 60%
• Homework will be assigned
  – Solutions posted
  – Solutions of key problems will be covered in class

Examinations

• Period 1 Test 1: 27 October
• Period 1 Test 2: 15 December
• Period 2 Test: 16 February
• Final: TBA

All closed book

Communication

• All lecture notes and most handouts will be posted at the course website: http://www.cs.ucc.ie/~gprovan/cs4407/
• Major messages will be sent by email

How to Succeed in this Course

• Start early on all assignments. DON'T procrastinate.
• Participate in class.
• Think in class.
• Review after each class.
• Be formal and precise on all problem sets and in-class exams

Algorithm Definition

• A tool for solving a well-specified computational problem

Input → Algorithm → Output

• Example: sorting
  – input: A sequence of numbers
  – output: An ordered permutation of the input
  – issues: correctness, efficiency, storage, etc.
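The sorting specification above can be made concrete with a small sketch. This is an added example: insertion sort is just one algorithm that meets the specification, and the names `insertion_sort` and `meets_sorting_spec` are hypothetical.

```python
from collections import Counter


def insertion_sort(numbers):
    """Return a new list holding the input numbers in non-decreasing order."""
    result = list(numbers)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Shift larger elements one slot right to open a gap for key.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


def meets_sorting_spec(original, output):
    """Check the stated output condition: an ordered permutation of the input."""
    is_ordered = all(output[k] <= output[k + 1] for k in range(len(output) - 1))
    is_permutation = Counter(original) == Counter(output)
    return is_ordered and is_permutation


data = [5, 2, 4, 6, 1, 3, 2]
print(insertion_sort(data), meets_sorting_spec(data, insertion_sort(data)))
```

Separating the checker from the algorithm mirrors the slide: the specification says what a correct output is, while the algorithm is one of many ways to produce it.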

Strengthening the Informal Definition

• An algorithm is a finite sequence of unambiguous instructions for solving a well-specified computational problem.
• Important Features:
  – Finiteness.
  – Definiteness.
  – Input.
  – Output.
  – Effectiveness.

Algorithm – In Formal Terms…

• In terms of mathematical models of computational platforms (general-purpose computers).
• One definition – Turing Machine that always halts.
• Other definitions are possible.
  – Example: Based on Lambda Calculus.
• Mathematical basis is necessary to answer questions such as:
  – Is a problem solvable? (Does an algorithm exist?)
  – Complexity classes of problems. (Is an efficient algorithm possible?)

Analyzing Algorithms

• Assumptions
  – Generic one-processor, random access machine
  – running time (others: memory, communication, etc.)
• Worst Case Running Time: the longest time for any input of size n
  – upper bound on the running time for any input
  – in some cases like searching, this is close
• Average Case Behavior: the expected performance averaged over all possible inputs
  – it is generally better than worst case behavior
  – sometimes it’s roughly as bad as worst case
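As a worked illustration of the two measures (an added example, not from the slides), consider linear search for an item known to be in a list of n elements. Assume each position is equally likely to hold the item and count one comparison per element inspected:

```latex
\begin{align*}
T_{\text{worst}}(n) &= n \\
T_{\text{avg}}(n)   &= \frac{1}{n}\sum_{i=1}^{n} i = \frac{n+1}{2}
\end{align*}
```

Under these assumptions the average is about half the worst case, yet both grow linearly in n, which is one sense in which the average case can be roughly as bad as the worst case.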

Motto of the day….

Flon's Law:

There is not now, and never will be, a language in which it is the least bit difficult to write bad programs.

Example: Complex Numbers

• Remember how to multiply 2 complex numbers?
• (a+bi)(c+di) = [ac – bd] + [ad + bc]i
• Input: a, b, c, d   Output: ac – bd, ad + bc
• If a real multiplication costs €1 and an addition costs a penny, what is the cheapest way to obtain the output from the input?
• Can you do better than €4.02?

Gauss’ €3.05 Method:

Input: a, b, c, d   Output: ac – bd, ad + bc

• m1 = ac
• m2 = bd
• A1 = m1 – m2 = ac – bd
• m3 = (a+b)(c+d) = ac + ad + bc + bd
• A2 = m3 – m1 – m2 = ad + bc
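The steps above can be written out directly. In this added sketch the function name `gauss_multiply` and the cost constants are assumptions for illustration; the prices (100 cents per multiplication, 1 cent per addition or subtraction) follow the slide.

```python
def gauss_multiply(a, b, c, d):
    """Return (real, imag) of (a + bi)(c + di) using three multiplications."""
    m1 = a * c                    # multiplication 1
    m2 = b * d                    # multiplication 2
    m3 = (a + b) * (c + d)        # multiplication 3, plus two additions
    real = m1 - m2                # one subtraction: ac - bd
    imag = m3 - m1 - m2           # two subtractions: ad + bc
    return real, imag


# Cost in cents: 3 multiplications and 5 additions/subtractions.
GAUSS_COST_CENTS = 3 * 100 + 5 * 1
# The direct formula needs 4 multiplications and 2 additions/subtractions.
DIRECT_COST_CENTS = 4 * 100 + 2 * 1

print(gauss_multiply(1, 2, 3, 4), GAUSS_COST_CENTS, DIRECT_COST_CENTS)
```

One multiplication is traded for three extra additions, which pays off whenever multiplication is much more expensive than addition.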

Question:

• The Gauss “hack” saves one multiplication out of four. It requires 25% less work.
• Could there be a context where performing 3 multiplications for every 4 provides a more dramatic savings?

Multiplication Algorithms

“Kindergarten”    n·2ⁿ
“Grade School”    n²
Karatsuba         n^1.58…
Fastest Known     n log n log log n

Summary

• We approach the study of algorithms in terms of underlying principles
  – Provides greatest long-term benefits
• Course focus
  – Mathematical principles of complexity
  – Algorithm classes: strings, graphs
• Algorithm definition
  – An algorithm is a finite sequence of unambiguous instructions for solving a well-specified computational problem

Appendix

• Details of Karatsuba multiplication
  – Why fast multiplication may matter

Gaussified MULT (Karatsuba 1962)

MULT(X,Y):
    If |X| = |Y| = 1 then RETURN XY
    Break X into a;b and Y into c;d
    e = MULT(a,c) and f = MULT(b,d)
    RETURN e·2ⁿ + (MULT(a+b, c+d) – e – f)·2^(n/2) + f

• T(n) = 3 T(n/2) + n
• Actually: T(n) = 2 T(n/2) + T(n/2 + 1) + kn

Dramatic improvement for large n

Not just a 25% savings!

Θ(n²) vs Θ(n^1.58…)
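The MULT pseudocode can be turned into a runnable sketch. This is an added illustration under stated assumptions: the inputs are non-negative Python integers, the split point is half of the longer operand's bit length, and the function name `karatsuba` is hypothetical. It is not meant to compete with Python's built-in multiplication.

```python
def karatsuba(x, y):
    """Multiply non-negative integers x and y using three recursive products."""
    if x < 2 or y < 2:                 # base case: an operand of at most one bit
        return x * y

    n = max(x.bit_length(), y.bit_length())
    half = n // 2
    mask = (1 << half) - 1

    a, b = x >> half, x & mask         # x = a * 2^half + b
    c, d = y >> half, y & mask         # y = c * 2^half + d

    e = karatsuba(a, c)                # ac
    f = karatsuba(b, d)                # bd
    g = karatsuba(a + b, c + d)        # ac + ad + bc + bd

    # Recombine: ac * 2^(2*half) + (ad + bc) * 2^half + bd
    return (e << (2 * half)) + ((g - e - f) << half) + f


print(karatsuba(1234, 5678))
```

Because each call makes three half-size recursive calls plus linear-time shifts and additions, the recurrence T(n) = 3T(n/2) + O(n) solves to Θ(n^(log₂ 3)), roughly Θ(n^1.585), which is where the exponent on the slide comes from.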