Centre for Applied Internet Research (CAIR)

www.cair-uk.org


How hard can it be?

(A gentle overview of complexity theory and the limitations of computers)

Professor Vic Grout

Director of the Centre for Applied Internet Research (CAIR)

Glyndŵr University, North Wales

[email protected]

Inaugural Professorial Lecture, Glyndŵr University, 25th June 2009


Two Apologies!

This talk is aimed primarily at a non-specialist audience …

… but there is some maths/computing in it

… feel free to ignore it!

For the mathematics/computer science purists …

… some liberties have been taken with notation, terminology, simplification, etc.

… please ignore it!


What is Complexity?

Basically, some things we might want to do with a computer are a bit more awkward than they might seem!

They can take ‘a long time’

We generally term anything we might want to do with a computer as a ‘problem’ (to be solved).

In fact, the complexity of a problem is just one way in which it might be awkward. Some things can’t be done at all!

We’ll come back to this, but first …

… some ‘problems’ …

… then, an argument …

… then an example …


What is a Problem?

Valid problems:

Calculate 2 x 4 + 9 – 3

If 5 – a = 2 what’s a?

Find the largest from 5,7,1,4,8,5,2,4,8,5,2,5,6,2,4,3,6,7,7,6,5,4

Sort 25,44,66,72,12,45,56,90,45,69,11,10,12,42,88 into order

Arrange 1,2,3,4,5,6,7,8,9 into a ‘magic square’

What’s the quickest way to get to Paris?

Invalid problems:

What’s the meaning of life, the universe and everything?

What will this week’s lottery numbers be?


It doesn’t really matter, does it?

‘Moore’s Law’ suggests (broadly speaking) that computing ‘power’ roughly doubles every two years

Two objections:

• It may not be true any more

• It doesn’t help anyway!

Source: www.indybay.org


The ‘Travelling Salesman Problem’ (TSP)

Imagine a sales rep moving from town to town, trying to sell their wares …

Presumably, some ways are better than others?


The ‘Travelling Salesman Problem’ (TSP)

The pedlar has to decide where to start …

... then which town to visit in each turn until they get back to the start


The ‘Travelling Salesman Problem’ (TSP)

Clearly, there are lots of choices …

… but which is the best?


Surprisingly, no-one has found a much better way than …

… trying out each possible route in turn!
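To make ‘trying out each possible route in turn’ concrete, here is a minimal C++ sketch. The function name `shortest_tour` and the distance table `dist` are illustrative assumptions, not something from the talk. As a small shortcut, the sketch fixes the starting town, since a round trip is the same wherever it starts; the number of orderings still grows factorially.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

// Length of the shortest round trip through every town, found by trying
// each possible ordering in turn. dist[i][j] is the (hypothetical) distance
// from town i to town j. The tour always starts and ends at town 0.
double shortest_tour(const std::vector<std::vector<double>>& dist) {
    const std::size_t n = dist.size();
    assert(n >= 2);

    // Towns 1 .. n-1 in increasing order: the first ordering to try.
    std::vector<std::size_t> order(n - 1);
    std::iota(order.begin(), order.end(), std::size_t{1});

    double best = std::numeric_limits<double>::infinity();
    do {
        double length = dist[0][order.front()];           // leave town 0
        for (std::size_t i = 0; i + 1 < order.size(); ++i)
            length += dist[order[i]][order[i + 1]];        // town to town
        length += dist[order.back()][0];                   // return to town 0
        best = std::min(best, length);
    } while (std::next_permutation(order.begin(), order.end()));
    return best;
}
```

For four towns on the corners of a square this checks only 6 orderings, but every extra town multiplies the work.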



So how complicated (complex) is that? (How many routes are there?)

Well, consider these 16 towns …



There are 16 choices of the starting town

That leaves 15 choices for the second town

So there are 16 x 15 = 240 choices for the first two towns

14 choices for the third

So there are 16 x 15 x 14 = 3,360 choices for the first three towns

13 choices for the fourth

:

2 choices for the last but one

1 choice for the very last

So there are 16 x 15 x 14 x 13 x … x 2 x 1 = 20,922,789,888,000 choices for visiting all 16 towns ( = 16! ‘sixteen factorial’ )
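The counting argument above is just a running product, which can be sketched in a few lines of C++. The function name `factorial` is an assumption made for illustration; 64-bit integers are used because the values outgrow ordinary integers almost immediately.

```cpp
#include <cassert>
#include <cstdint>

// n! = n x (n-1) x ... x 2 x 1, the number of ways of ordering n towns.
std::uint64_t factorial(unsigned n) {
    assert(n <= 20);  // 21! no longer fits in a 64-bit integer
    std::uint64_t result = 1;
    for (unsigned k = 2; k <= n; ++k)
        result *= k;
    return result;
}
```

Here `factorial(16)` gives the 20,922,789,888,000 routes counted above. That the sketch has to stop at 20 towns is itself a hint of how quickly the numbers grow.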



Factorials

(Imagine each value as a distance in metres from here)

1!  = 1              1 metre from here
2!  = 2              2m
3!  = 6              6m
4!  = 24             Fellows Bar
5!  = 120            Wrexham FC
6!  = 720            Wrexham
7!  = 5,040          Rhos
8!  = 40,320         Shrewsbury
9!  = 362,880        Southampton
10! = 3,628,800      Istanbul
11! = 39,916,800     Anywhere on Earth (and back)
12! = 479,001,600    The Moon (and beyond)


OK, but does this really matter?



Suppose we have a computer that can (just) solve the TSP for 20 towns

That is, one that can search through the 20 x 19 x 18 x … x 2 x 1 = 2,432,902,008,176,640,000 possible choices (in ‘reasonable’ time)

How much extra effort is involved in solving the TSP for 21 towns?

That would be 21 x 20 x 19 x 18 x … x 2 x 1, which is 21 times as much as for 20 towns

According to Moore’s Law, it will take about another 9 years to develop a computer this powerful (21 times is a little more than four doublings)

To solve the TSP for 30 towns would take a computer 30 x 29 x 28 x … x 22 x 21 = 109,027,350,432,000 times as powerful

That is over 90 years by Moore’s Law! … but Moore’s Law will fail long before that!

How far could we ever get?
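The waiting times quoted above follow from counting doublings: a computer `factor` times as powerful needs log2(factor) doublings, at two years each. A small C++ sketch of that arithmetic follows. The function name `years_to_gain` is an assumption, and the two-year doubling period is the broad reading of Moore’s Law used in this talk.

```cpp
#include <cassert>
#include <cmath>

// Years of Moore's-Law growth needed before computers are `factor` times
// as powerful, assuming power doubles every `doubling_years` years.
double years_to_gain(double factor, double doubling_years = 2.0) {
    assert(factor >= 1.0 && doubling_years > 0.0);
    return doubling_years * std::log2(factor);  // number of doublings x years each
}
```

Under these assumptions `years_to_gain(21.0)` comes to roughly 8.8 years and `years_to_gain(109027350432000.0)` to roughly 93 years, in line with the rough figures quoted above.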



The ‘Perfect Computer’!

Suppose we turn the Universe into a computer and run it as efficiently as theoretically possible

Suppose every sub-atomic particle in the Universe is a component of the computer.

Suppose we manage to perform a calculation every time a sub-atomic particle changes quantum state, for as many such particles as there are, for as many states as they have, as often as such changes are possible, and we run the whole thing for the lifetime of the Universe

What’s the biggest TSP we can solve?

Approximately … 80 (towns)

The problem is that the complexity of the TSP increases exponentially

We’ll come back to that later as well!


Finding the Limitations of Computers

Trying to work out (theoretically) what computers can and can’t do, and how difficult some things might be, is in itself tricky because computers are themselves … well, complex!

So we tend to start with something simpler

An idealised model of a computer

There are lots of these around

But probably the best known, and certainly the first, is the …

Turing Machine (Alan Turing, 1936)

A ‘computer’ is modelled as being something that can exist in a number of ‘states’, can respond to simple input (i.e. change state) and produce simple output

For some initial input, it does this over and over again until it produces some final output (hopefully).


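The Turing Machine

[Figure: a read/write head, currently in ‘state’ s, positioned over a data ‘tape’ holding the input/output data (0 1 & $ % 1 0 $). An ‘action grid’ (program) has one row per state (q, r, s, t, …, y, z) and one column per input symbol (0 1 % & $ *). An example grid entry reads: write ‘0’, change to state ‘t’, move left.]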
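The machine described above is small enough to sketch directly in C++. Everything named here (`Action`, `ActionGrid`, `run`, `flip_bits`, the blank symbol `_` and the step cap) is an assumption made for illustration, not Turing's notation or anything shown in the lecture. The sketch looks up the current state and tape symbol in the action grid, then writes, changes state and moves, over and over, until no entry applies.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// One entry of the 'action grid': what to write, which way to move,
// and which state to change to.
struct Action {
    char write;
    int move;         // -1 = move left, +1 = move right
    char next_state;
};

// The action grid (program): (state, symbol read) -> action.
using ActionGrid = std::map<std::pair<char, char>, Action>;

// Runs the machine with the head starting on the first tape cell and
// returns the final tape. The machine halts when the grid has no entry for
// the current (state, symbol). `max_steps` is a safety cap added for this
// sketch only; a real Turing machine has no such limit.
std::string run(const ActionGrid& grid, std::string tape, char state,
                int max_steps = 10000) {
    assert(max_steps > 0);
    const char blank = '_';
    std::size_t head = 0;
    for (int step = 0; step < max_steps; ++step) {
        if (head >= tape.size()) tape.push_back(blank);   // grow tape to the right
        auto entry = grid.find({state, tape[head]});
        if (entry == grid.end()) break;                    // no rule applies: halt
        const Action& action = entry->second;
        tape[head] = action.write;
        state = action.next_state;
        if (action.move < 0) {
            if (head == 0) tape.insert(tape.begin(), blank);  // grow tape to the left
            else --head;
        } else {
            ++head;
        }
    }
    return tape;
}

// Example program: a one-state machine that swaps every 0 and 1 while
// moving right, and halts when it reads a blank.
ActionGrid flip_bits() {
    return {
        {{'s', '0'}, {'1', +1, 's'}},
        {{'s', '1'}, {'0', +1, 's'}},
    };
}
```

Running `run(flip_bits(), "0110", 's')` leaves `1001_` on the tape; the trailing `_` is the blank cell the head read just before halting.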


Problem ‘Solvability’

A problem is ‘solvable’ if its input can be written to the data tape of an appropriate Turing machine and, through the application of a suitable action grid (program), its output eventually read from the data tape

If a problem is unsolvable, either it will not be possible to construct such a machine or any such attempt will fail to terminate (run indefinitely)

It is possible to show that some very complex problems can be solved by a Turing machine

However, Turing himself proved that there are some problems that are simply not solvable

For example, the ‘Halting Problem’

Given a program and its input (as a Turing machine, say), can we produce another program that will determine if the first will finish?

NO!


‘Decision’ Functions

Most ‘natural’ or ‘general’ problems have a variety of answers … number, name, colour, shape, size, direction, etc.

A ‘decision function’ instead always evaluates to ‘yes’ or ‘no’

However …

Every general problem has an equivalent decision function!

For example, in complexity terms, ‘who is the tallest person in the room?’ is the same as ‘is there anyone in the room taller than #.## metres?’

For some values of #.##, it will be equally difficult to determine either

A problem is solvable if its equivalent decision function is ‘decidable’

In 1931, Kurt Gödel proved that not all functions are decidable

In 1936, Alan Turing proved that not all problems are solvable

Same thing!
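One way to see why the general problem and its decision function go together is to recover the general answer using yes/no questions only. The C++ sketch below does this for the tallest-person example. The function names, the use of whole centimetres and the 300 cm upper limit are all assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// The decision function: always answers 'yes' or 'no'.
bool anyone_taller_than(const std::vector<int>& heights_cm, int limit_cm) {
    for (int h : heights_cm)
        if (h > limit_cm) return true;
    return false;
}

// The general problem, answered using the decision function alone:
// repeatedly halve the range of possible heights until one value is left.
// Assumes every height lies between 0 and max_possible_cm.
int tallest_cm(const std::vector<int>& heights_cm, int max_possible_cm = 300) {
    assert(!heights_cm.empty());
    int lo = 0, hi = max_possible_cm;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (anyone_taller_than(heights_cm, mid)) lo = mid + 1;  // answer is above mid
        else hi = mid;                                           // answer is mid or below
    }
    return lo;
}
```

With heights of 170, 182 and 165 cm, nine yes/no questions are enough to pin the answer down to 182.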


Extended Turing Machines

[Figure: the Turing machine diagram extended in three ways: multiple machines/tapes, multiple action grids, and extended functionality (e.g. write ‘0 1 0 …’, change to state ‘t’, move left 3 places)]

There’s nothing an extended Turing machine can do that a basic Turing machine can’t

(It might just take a longer program (tape) or more time)


Equivalent Models

So an extended Turing machine is (essentially) no different to a standard Turing machine

It can’t really do any more … or less

In fact there are many models of the computational process …

… and they’re all the same!

We can start with something very ‘simple’ and show what we can and can’t do with it …

… then gradually get more ‘complex’, showing that we haven’t gained or lost anything at each step …

… until we’ve effectively got a ‘real’ computer

So we can say what a computer can and can’t do

Unfortunately, the results aren’t always … ‘comfortable’!


Levels of Solvability

A decision function may be … YES, NO or UNDECIDABLE

[Figure: regions labelled YES, NO and UNDECIDABLE, with three ‘CAN’T TELL!’ labels]

Enough of that! (For now …)


Measuring Complexity

Some things are simple!

For example, ‘find x if x + 3 = 5’ is a trivial calculation: x = 5 - 3

This has ‘no complexity at all’ or ‘constant’ complexity

However, some things take longer!

What really matters is how complexity grows

‘Who’s the tallest person in the room?’ involves a ‘search’

Searching 10 people involves 10 comparisons

Searching 20 people involves 20 comparisons

In general, searching ‘n’ people involves ‘n’ comparisons

We say the (time) complexity is ‘of the order of n’ … ‘O(n)’ for short
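The ‘one check per person’ idea can be sketched in C++ by counting the checks as the search goes along. The names `SearchResult` and `find_tallest` are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SearchResult {
    double tallest;      // largest height seen
    std::size_t checks;  // how many people were looked at
};

// Looks at each person exactly once, so n people cost n checks: O(n).
SearchResult find_tallest(const std::vector<double>& heights) {
    assert(!heights.empty());
    SearchResult result{heights[0], 0};
    for (double h : heights) {
        ++result.checks;
        if (h > result.tallest) result.tallest = h;
    }
    return result;
}
```

Doubling the number of people doubles the number of checks, which is all that ‘O(n)’ is saying.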


Measuring Complexity

Searching a map (n x n grid) for something …

… involves n x n checks, so has O(n x n) or O(n^2) complexity

[Figure: an n x n grid with rows and columns numbered 1, 2, 3, …, n]
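The same counting works for the map: one loop inside another gives n x n checks. In this C++ sketch the names `MapSearch` and `search_map` are assumptions, and the search deliberately does not stop early, so the count is always the worst case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct MapSearch {
    bool found;          // was the target anywhere on the map?
    std::size_t checks;  // how many squares were looked at
};

// Checks every square of the grid: n rows of n squares is n x n checks.
MapSearch search_map(const std::vector<std::vector<int>>& grid, int target) {
    assert(!grid.empty());
    MapSearch result{false, 0};
    for (const auto& row : grid) {
        for (int square : row) {
            ++result.checks;
            if (square == target) result.found = true;
        }
    }
    return result;
}
```

A 5 x 5 map costs 25 checks and a 10 x 10 map costs 100: doubling n quadruples the work.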


Measuring Complexity

Searching a volume (n x n x n) …

… has O(n x n x n) = O(n^3) complexity

We call this O(n), O(n^2), O(n^3), etc. complexity ‘polynomial complexity’ … or ‘EASY’ … or ‘P’

[Figure: a cube with each side of length n]


Measuring Complexity

But not all problems have polynomial complexity

(Well, probably not!)

A basic TSP search, for example, has O(n x (n-1) x (n-2) x … x 2 x 1) = O(n!) complexity

The ‘best’ known TSP solution has roughly O(2^n) complexity (2^n = 2 x 2 x 2 x … x 2, n times)

This O(2^n), O(n!), O(n^n), etc. complexity is known as ‘exponential complexity’ or ‘HARD’

We like EASY (polynomial)

We don’t like HARD (exponential)

Why?
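One way to see why is to work out each of these quantities for the same n. The C++ sketch below does that; the names `Growth` and `growth` are assumptions, and floating-point numbers are used only so that the larger values do not overflow.

```cpp
#include <cassert>

struct Growth {
    double n;          // O(n)
    double squared;    // O(n^2)
    double cubed;      // O(n^3)
    double two_to_n;   // O(2^n)
    double factorial;  // O(n!)
    double n_to_n;     // O(n^n)
};

// Evaluates each growth rate for one value of n.
Growth growth(unsigned n) {
    assert(n >= 1 && n <= 100);  // keeps every value within range of a double
    const double x = n;
    Growth g{x, x * x, x * x * x, 1.0, 1.0, 1.0};
    for (unsigned k = 1; k <= n; ++k) {
        g.two_to_n *= 2.0;   // 2 x 2 x ... x 2, n times
        g.factorial *= k;    // 1 x 2 x ... x n
        g.n_to_n *= x;       // n x n x ... x n, n times
    }
    return g;
}
```

At n = 10 the polynomial values are 10, 100 and 1,000, while the exponential ones are 1,024, 3,628,800 and 10,000,000,000, and the gap only widens from there.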


Polynomial vs. Exponential Time

[Figure: two plots of time against n, one with a linear time axis (0 to 400) and one with a logarithmic time axis (up to 10 000), each showing curves for n, n^2, n^3, 2^n, n! and n^n]


Complexity Classes

It should look like this …

… but does it?

[Figure: regions labelled ‘All problems’, ‘Solvable problems’, ‘Realistic problems’, ‘NP’ problems (Hard) and ‘P’ problems (Easy), with the Halting problem and the TSP marked as example points]


Algorithms

An algorithm is a method of solution for a problem … the steps to be followed … an abstract version of a program

The complexity of a problem is really the complexity of the (best) algorithm that solves it

But there’s a big issue here …

How do we know we’ve got the best algorithm for the job?

A problem could be simpler than we realise …

… there might be an easy algorithm for it out there …

… we just haven’t found it yet!

Let’s take an example …


The Stairs Problem

Climbing stairs … taking one or two steps at a time

How many different ways are there of taking 6 steps?

[Figure: a staircase with steps numbered 1 to 6]


The Stairs Problem

Climbing stairs … taking one or two steps at a time

How many different ways are there of taking 6 steps?

How many different ways are there of taking n steps?

Call this f(n)

Then f(1) = 1

f(2) = 2

f(3) = 3

f(4) = 5

??

[Figure: a staircase with steps numbered 1, 2, …, n-1, n]


The Stairs Problem

How about the general case, f(n)?

Well, consider the first step …

A single step leaves n-1

A double step leaves n-2

There are f(n-1) ways of taking n-1 steps …

… and f(n-2) ways of taking n-2 steps

It has to be one or the other so … the number of ways of taking n steps is made up of the number of ways of taking n-1 steps and the number of ways of taking n-2 steps

In other ‘words’ …

f(n) = f(n-1) + f(n-2)

… a ‘recursive definition’

[Figure: a staircase of n steps; a single first step leaves n-1 steps, giving f(n-1) ways, and a double first step leaves n-2 steps, giving f(n-2) ways]


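Recursion

We can write a program to do this …

[Figure: the recursion tree for f(6). f(6) calls f(5) and f(4), each of which calls further down, ending in eight leaves: five f(2) calls (value 2) and three f(1) calls (value 1)]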


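Recursion

Let’s try running this … [PROGRAM]

// C++ fn by recursion

#include <iostream>

double f(int n);   // number of ways of climbing n stairs

int main() {
    // Runs up to n = 100; each value takes noticeably longer than the last
    for (int n = 3; n <= 100; ++n)
        std::cout << "n = " << n << ": f(n) = " << f(n) << "\n";
}

double f(int n) {
    if (n == 1 || n == 2)
        return n;                     // known starting values: f(1) = 1, f(2) = 2
    else
        return f(n - 1) + f(n - 2);   // the recursive definition
}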


Recursion

Recursion is elegant, not efficient!

[Figure: the same recursion tree for f(6), in which f(4) appears twice and f(3) appears three times]
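To put a number on ‘not efficient’, the recursive definition can be made to count how many times it is called. This is a sketch only: the names `ways` and `calls_needed` are assumptions, and the counter is an addition that was not part of the lecture's program.

```cpp
#include <cassert>
#include <cstdint>

// The recursive stairs definition, also counting every call that is made.
std::uint64_t ways(unsigned n, std::uint64_t& calls) {
    assert(n >= 1);
    ++calls;
    if (n <= 2) return n;                            // f(1) = 1, f(2) = 2
    return ways(n - 1, calls) + ways(n - 2, calls);  // f(n) = f(n-1) + f(n-2)
}

// Total number of calls needed to work out f(n) recursively.
std::uint64_t calls_needed(unsigned n) {
    std::uint64_t calls = 0;
    ways(n, calls);
    return calls;
}
```

Working out f(6) = 13 this way takes 15 calls, and f(30) takes 1,664,079. The number of calls grows in step with the answer itself, which is why the recursive program slows to a crawl.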



Iteration

Start with the values we know, not the ones we don’t!

f(1) = 1

f(2) = 2

f(3) = f(2) + f(1) = 2 + 1 = 3

f(4) = f(3) + f(2) = 3 + 2 = 5

f(5) = f(4) + f(3) = 5 + 3 = 8

This is the ‘Fibonacci Sequence’

n      1   2   3   4   5   6   7   …
f(n)   1   2   3   5   8   13  21  …


Iteration

Much simpler!

We can write a program to do this …

[Figure: f(1), f(2), f(3), f(4), f(5), f(6) in a single row, starting from the known values (1) and (2)]


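Iteration

Let’s try running this … [PROGRAM]

// C++ fn by iteration

#include <iostream>

int main() {
    // f(n-2), f(n-1) and f(n), starting from the known values f(1) = 1 and f(2) = 2
    double fn_2 = 1, fn_1 = 2, fn;

    for (int n = 3; n <= 100; ++n) {
        fn = fn_1 + fn_2;   // f(n) = f(n-1) + f(n-2)
        fn_2 = fn_1;        // shift the two most recent values along
        fn_1 = fn;
        std::cout << "n = " << n << ": f(n) = " << fn << "\n";
    }
}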


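Calculation

In fact the ‘equation’ f(n) = f(n-1) + f(n-2) can be ‘solved’ to give …

$$f(n) = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n+1} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n+1}$$

(an interesting expression for a whole number?)

This is a single step surely?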


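Calculation

// C++ fn by calculation

#include <cmath>
#include <iostream>

double f(int n);   // number of ways of climbing n stairs

int main() {
    for (int n = 3; n <= 100; ++n)
        std::cout << "n = " << n << ": f(n) = " << f(n) << "\n";
}

double f(int n) {
    // The 'solved' form of f(n) = f(n-1) + f(n-2) with f(1) = 1 and f(2) = 2
    const double root5 = std::sqrt(5.0);
    return (std::pow((1 + root5) / 2, n + 1) - std::pow((1 - root5) / 2, n + 1)) / root5;
}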
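As a sanity check on that ‘interesting expression for a whole number’, the formula can be compared with plain iteration. In this C++ sketch the function names are assumptions, and the formula is only trusted up to n = 40, a deliberately cautious limit chosen for this sketch so that floating-point rounding cannot change the answer.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// f(n) built up from the known values f(1) = 1 and f(2) = 2.
std::uint64_t ways_by_iteration(unsigned n) {
    assert(n >= 1);
    std::uint64_t fn_2 = 1, fn_1 = 2;
    if (n == 1) return fn_2;
    for (unsigned k = 3; k <= n; ++k) {
        std::uint64_t fn = fn_1 + fn_2;
        fn_2 = fn_1;
        fn_1 = fn;
    }
    return fn_1;
}

// f(n) from the closed-form expression, rounded to the nearest whole number.
std::uint64_t ways_by_formula(unsigned n) {
    assert(n >= 1 && n <= 40);  // cautious limit for this sketch
    const double root5 = std::sqrt(5.0);
    const double value = (std::pow((1 + root5) / 2, n + 1.0) -
                          std::pow((1 - root5) / 2, n + 1.0)) / root5;
    return static_cast<std::uint64_t>(std::llround(value));
}
```

Both routes agree, for example on f(6) = 13 and f(40) = 165,580,141, even though the formula is full of square roots.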


Calculation

This is a single step surely?

Not for a computer! (‘powers are complex’)

The key point is that we can’t be sure whether we’ve got the best algorithm, so how can we say whether a problem is easy or hard?


Examples

The shortest path problem (SPP) … is EASY

The shortest tour problem (TSP) … appears to be HARD

The minimum connecting network problem (MST) … is EASY

But the equivalent problem for wireless networks (MCDS) … appears to be HARD
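To make ‘EASY’ concrete for the shortest path problem, here is a sketch of one well-known polynomial method, Dijkstra's algorithm. The lecture does not name an algorithm, so this choice, the function name `shortest_paths` and the road table are all assumptions for illustration. For n towns it does roughly n x n work, i.e. O(n^2).

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Shortest distance from `start` to every town, using Dijkstra's algorithm.
// road[i][j] is the length of the direct road from town i to town j;
// a negative entry means 'no direct road'.
std::vector<double> shortest_paths(const std::vector<std::vector<double>>& road,
                                   std::size_t start) {
    const std::size_t n = road.size();
    assert(start < n);
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> dist(n, inf);
    std::vector<bool> done(n, false);
    dist[start] = 0.0;

    for (std::size_t round = 0; round < n; ++round) {
        // Pick the closest town whose distance is not yet final.
        std::size_t u = n;
        for (std::size_t v = 0; v < n; ++v)
            if (!done[v] && (u == n || dist[v] < dist[u])) u = v;
        if (u == n || dist[u] == inf) break;  // remaining towns are unreachable
        done[u] = true;

        // See whether going via town u shortens the trip to its neighbours.
        for (std::size_t v = 0; v < n; ++v)
            if (road[u][v] >= 0.0 && dist[u] + road[u][v] < dist[v])
                dist[v] = dist[u] + road[u][v];
    }
    return dist;
}
```

Two nested loops over the towns is all it takes, in contrast to the factorial number of tours the TSP search has to consider.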


What DO We Know …

Some problems are easy: polynomial (P)

Some problems don’t appear to be, but we can’t be sure

Some of these ‘harder’ problems are ‘equivalent’ to each other: a polynomial (easy) algorithm for any one of them could be easily adapted to all of the others

OR

a proof that no polynomial (easy) algorithm exists for any one of them would prove that none exists for any of the others

We call this class of equivalent ‘harder’ problems … NP-complete


Complexity Classes

How it might look

[Figure: regions labelled ‘All problems’, ‘NP’ problems, ‘P’ problems and ‘NP-complete’ problems]

But only if P ≠ NP (otherwise there are no hard problems)


Complexity Classes

How it might look if P = NP

[Figure: regions labelled ‘All problems’, ‘P’ problems, ‘NP’ problems and ‘NP-complete’ problems]


The Big Question …

It comes down to this …

P = NP?

This has exercised the minds of some of our greatest thinkers over the last few decades …


The Million Dollar Question …

In 2000, the Clay Mathematics Institute offered prizes of $1,000,000 for solving each of seven Millennium Prize Problems

‘P vs. NP’ is one of them

So far, none have been solved

(Well, maybe one?)

But, before you dig out the pencil and paper, consider this …


Will We Ever Know?

‘P = NP?’ is a decision function!

[Figure: the ‘Levels of Solvability’ diagram again: YES, NO and UNDECIDABLE, with three ‘CAN’T TELL!’ labels]


Will We Ever Know?

‘P = NP?’ is a decision function!

Where does it live?

[Figure: the ‘Levels of Solvability’ diagram with UNDECIDABLE and three ‘CAN’T TELL!’ labels, alongside regions labelled P, NP and NP-Comp]


Will We Ever Know?

This ‘P easy – NP hard’ idea may not be as clear as we might think!

How large does n have to get before x^n becomes larger than n^y?

(e.g. 1.0001^n vs. n^9999)

For x small enough (but still greater than 1) and/or y large enough, n has to be as large as you like!

So what’s the distinction?

If there is one, it’s purely theoretical rather than practical!

[Figure: curves of x^n and n^x plotted against n]
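The question of how large n has to get can be answered numerically. Comparing x^n with n^y directly would overflow at once, so this C++ sketch compares their logarithms instead (n ln x against y ln n). The function name `crossover` and the search strategy are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Smallest whole n, at or beyond the point where the exponential starts
// catching up (n = y / ln x), for which x^n > n^y. From that n onwards the
// exponential stays ahead for good.
long long crossover(double x, double y) {
    assert(x > 1.0 && y > 0.0);
    const double ln_x = std::log(x);
    auto exponential_ahead = [&](long long n) {
        return n * ln_x > y * std::log(static_cast<double>(n));
    };

    long long lo = static_cast<long long>(std::ceil(y / ln_x));
    if (lo < 2) lo = 2;
    if (exponential_ahead(lo)) return lo;

    // Double the upper bound until the exponential is ahead ...
    long long hi = 2 * lo;
    while (!exponential_ahead(hi)) {
        lo = hi;
        hi *= 2;
    }
    // ... then halve the gap until the exact crossover is found.
    while (hi - lo > 1) {
        long long mid = lo + (hi - lo) / 2;
        if (exponential_ahead(mid)) hi = mid;
        else lo = mid;
    }
    return hi;
}
```

For 2^n against n^3 the crossover is at n = 10. For 1.0001^n against n^9999 this sketch puts it a little over 2 billion, so the ‘easy’ polynomial is the slower of the two for any problem size likely to be met in practice.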


Conclusions

Probably no serious mathematician/computer scientist thinks that P = NP

In other words, yes, there are some genuinely hard problems

But this P ≠ NP may be very difficult to prove

However, we can make a problem harder than it needs to be through choosing a poor algorithm

Ultimately, computers really do seem to have simple, but real, limits

Good luck to anyone who can prove this for sure!


Thank you … Any questions?
