
Upload: judith-davidson

Post on 04-Jan-2016


Page 1: Biologically Inspired Computing: Optimisation This is mandatory additional material for `Biologically Inspired Computing’ Contents: Optimisation; hard

Biologically Inspired Computing: Optimisation

This is mandatory additional material for ‘Biologically Inspired Computing’.

Contents: optimisation; hard problems and easy problems; computational complexity; trial and error search.

This Material

• A formal statement about optimisation, and an explanation of its close relationship with classification.

• About optimisation problems: formal notions of hard problems and easy ones.

• Formal notions about optimisation algorithms: exact algorithms and approximate algorithms.

• The status of EAs (and other bio-inspired optimisation algorithms) in this context.

• The classical computing ‘alternatives’ to EAs.

What to take from this material:

• A clear understanding of what an optimisation problem is, and an appreciation of how classification problems are also optimisation problems.

• What it means, technically, for an optimisation problem to be hard or easy.

• What it means, technically, for an optimisation algorithm to be exact or approximate.

• What kinds of algorithms, technically speaking, EAs are.

• An appreciation of non-EA techniques for optimisation.

• A basic understanding of when an EA might be applicable to a problem, and when it might not.

Search and Optimisation

Imagine we have 4 items as follows:

(item 1: 20kg; item 2: 75kg; item 3: 60kg; item 4: 35kg)

Suppose we want to find the subset of items with the highest total weight.

0000 0100 1000 1100
0001 0101 1001 1101
0010 0110 1010 1110
0011 0111 1011 1111

Here is a standard treatment of this as an optimisation problem. The set S of all possible solutions is indicated above: each 4-bit string marks which of the four items are in the subset. The fitness of a solution s is the function total_weight(s). We can solve the problem (i.e. find the fittest s) by working out the fitness of each one in turn. But … any thoughts? I mean, how hard is this?
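The "work out the fitness of each one in turn" approach can be sketched in a few lines of Python. This is only an illustration: the item weights come from the example, while the names `WEIGHTS` and `exhaustive_search` and the tuple-of-bits encoding are assumptions made for the sketch.

```python
from itertools import product

# Item weights in kg, taken from the example (items 1 to 4).
WEIGHTS = [20, 75, 60, 35]

def total_weight(s):
    """Fitness of a solution s, a tuple of 0/1 flags (1 = item included)."""
    return sum(w for w, bit in zip(WEIGHTS, s) if bit)

def exhaustive_search(n_items):
    """Evaluate every one of the 2**n_items subsets and keep the fittest."""
    return max(product((0, 1), repeat=n_items), key=total_weight)

best = exhaustive_search(len(WEIGHTS))
print(best, total_weight(best))  # (1, 1, 1, 1) 190
```

With 4 items there are only 16 candidates, but the count doubles with every extra item, which is why this approach stops being practical quickly.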


• Of course, this problem is much easier to solve than that. We already know that 1111 has to be the best solution.

• We can prove, mathematically, that any problem of the form “find heaviest subset of a set of k things”, is solved by the “all of them” subset, as long as each thing has positive weight.

• In general, some problems are easy in this sense, in that there is an easily provable way to construct an optimal solution. Sometimes it is less obvious than in this case though.

Search and Optimisation

In general, optimisation means that you are trying to find the best solution you can (usually in a short time) to a given problem.

We always have a set S of all possible solutions.

[Figure: the set S, containing the candidate solutions s1, s2, s3, …]

In realistic problems, S is too large to search one by one. So we need to find some other way to search through S.

One way is random search. E.g. in a 500-iteration random search, we might randomly choose something in S and evaluate its fitness, repeating that 500 times.
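A minimal sketch of such a random search, assuming a toy problem that is not in the slides: S is the set of 20-bit strings (about a million candidates) and the fitness is simply the number of 1s. The names `random_search` and `sample_bits` are made up for this illustration.

```python
import random

def random_search(sample, fitness, iterations=500, rng=None):
    """Draw `iterations` random candidates and return the fittest one seen."""
    rng = rng or random.Random()
    best, best_fitness = None, float("-inf")
    for _ in range(iterations):
        s = sample(rng)            # randomly choose something in S
        f = fitness(s)             # evaluate its fitness
        if f > best_fitness:
            best, best_fitness = s, f
    return best, best_fitness

def sample_bits(rng, n=20):
    """Toy search space (an assumption): all bit strings of length n."""
    return tuple(rng.randint(0, 1) for _ in range(n))

best, score = random_search(sample_bits, sum, iterations=500, rng=random.Random(0))
print(best, score)
```

Because only 500 of the roughly one million candidates are ever looked at, nothing guarantees that the returned solution is the best one in S.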

The Fitness Function

Every candidate solution s in the set of all solutions S can be given a score, or a “fitness”, by a so-called fitness function. We usually write f(s) to indicate the fitness of solution s. Obviously, we want to find the s in S which has the best score.

Examples

• timetabling: f could be the number of clashes

• wing design: f could be aerodynamic drag

• delivery schedule: f will be total distance travelled

In these three examples the best score is the lowest one.
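To make the timetabling example concrete, here is one way a clash-counting fitness function might look. The representation is an assumption for illustration only: a timetable maps each event to a slot number, and `conflicts` lists pairs of events that must not share a slot.

```python
def count_clashes(timetable, conflicts):
    """Fitness f(s): number of conflicting event pairs placed in the same slot."""
    return sum(1 for a, b in conflicts if timetable[a] == timetable[b])

# Hypothetical data: three events that share students pairwise.
conflicts = [("maths", "physics"), ("maths", "art"), ("physics", "art")]

timetable_a = {"maths": 1, "physics": 1, "art": 2}
timetable_b = {"maths": 1, "physics": 2, "art": 3}

print(count_clashes(timetable_a, conflicts))  # 1
print(count_clashes(timetable_b, conflicts))  # 0
```

Here lower is better, so `timetable_b` is the fitter of the two candidates.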

An Aside about Classification

In a classification problem, we have a set of things to classify, and a number of possible classes.

To classify s we use an algorithm called a classifier. So, classifier(s) gives us a class label for s.

We can assign a fitness value to a classifier – this can be simply the percentage of examples it gets right.

In finding a good classifier, we are solving the following optimisation problem: Search a space of classifiers, and find the one that gives the best accuracy.

E.g. the classifier might be a neural network, and we may use an EA to evolve the NN with the best connection weights.
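The idea of searching a space of classifiers can be shown with something far simpler than a neural network. In this sketch, which is an assumption for illustration and not the method of the course, each classifier is a single threshold on a one-dimensional feature, the fitness is the percentage of labelled examples it gets right, and the small space of candidate thresholds is searched exhaustively.

```python
# Hypothetical labelled examples: (feature value, class label).
EXAMPLES = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]

def make_classifier(threshold):
    """A classifier: returns class 1 if s is above the threshold, else class 0."""
    return lambda s: 1 if s > threshold else 0

def accuracy(classifier, examples):
    """Fitness of a classifier: percentage of examples it labels correctly."""
    correct = sum(1 for s, label in examples if classifier(s) == label)
    return 100.0 * correct / len(examples)

# The space of classifiers being searched (here, just six thresholds).
thresholds = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5]
best_threshold = max(thresholds, key=lambda t: accuracy(make_classifier(t), EXAMPLES))

print(best_threshold, accuracy(make_classifier(best_threshold), EXAMPLES))  # 3.5 100.0
```

With a neural network the space of classifiers (all settings of the connection weights) is far too large to enumerate like this, which is where a search method such as an EA comes in.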

Searching through S

When S is small (e.g. 10, 100, or only 1,000,000 or so items), we can simply do so-called exhaustive search.

Exhaustive search: generate every possible solution, work out its fitness, and hence discover which is best (or which solutions share the best fitness).

This is also called enumeration.

However …

In all interesting/important cases, S is much, much, much too large for exhaustive search (ever).

There are two kinds of ‘too-big’ problem:

– easy (or ‘tractable’, or ‘in P’)

– hard (or ‘intractable’, or ‘not known to be in P’)

There are rigorous mathematical definitions of the two types. That’s what the ‘P’ is about: it stands for ‘polynomial time’, and names the class of problems that can be solved by a deterministic Turing machine in polynomial time. That’s outside the scope of this module.

But what’s important for you to know is that almost all important problems are technically hard.

About Optimisation Problems

To solve a problem means to find an optimal solution, i.e. to deliver an element s of S whose fitness is guaranteed to be the best in S.

An Exact algorithm is one which can do this (i.e. solve a problem, guaranteeing to find the best).

Is 500-iteration random search an Exact algorithm?

Problem Complexity

‘Problem complexity’, in the context of computer science, is all about characterising how hard it is to solve a given problem. Statements are made in terms of functions of n, which is meant to be some indication of the size of the problem. E.g.:

Correctly sort a set of n numbers

• Can be done in around n log n steps

Find the closest pair out of n vectors

• Can be done in O(n^2) steps

Find best design for an n-structural-element bridge

• Can be done in O(10^n) steps …
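As an illustration of counting steps as a function of n, the closest-pair example can be written as a brute-force comparison of every pair, which takes n(n - 1)/2 distance computations and is therefore O(n^2). The function name, the step counter, and the sample points are assumptions made for this sketch.

```python
import math
from itertools import combinations

def closest_pair(vectors):
    """Brute force: compare every pair once, counting the comparisons made."""
    best_pair, best_dist, steps = None, math.inf, 0
    for u, v in combinations(vectors, 2):
        steps += 1
        d = math.dist(u, v)        # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_pair, best_dist = (u, v), d
    return best_pair, best_dist, steps

points = [(0, 0), (5, 5), (1, 0), (9, 2)]
pair, dist, steps = closest_pair(points)
print(pair, dist, steps)  # ((0, 0), (1, 0)) 1.0 6
```

For n = 4 that is 6 steps; doubling n to 8 gives 28, roughly four times as many, which is the quadratic growth the O(n^2) statement describes.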


Polynomial and Exponential Complexity

Given some problem Q, with `size’ n, imagine that A is the fastest algorithm known for solving that problem exactly. The complexity of problem Q is the time it takes A to solve it, as a function of n.

There are two key kinds of complexity:

Polynomial: the dominant term in the expression is polynomial in n. E.g. n^34, n log n, sin(n^2.2), etc …

Exponential: the dominant term is exponential in n. E.g. 1.1^n, n^(n+2), 2^n, …

Polynomial and Exponential Complexity

n              2     3     4     5     6     10    20    50    100
n^1.1 (poly)   2.14  3.35  4.59  5.87  7.18  12.6  27.0  73.9  159
1.1^n (exp)    1.21  1.33  1.46  1.61  1.77  2.59  6.73  117   13,780

Problems with exponential complexity take too long to solve at large n. Note how an exponential always eventually catches up with, and then rapidly overtakes, a polynomial.
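The comparison of n^1.1 and 1.1^n can be recomputed directly, and the point at which the exponential overtakes the polynomial can be located. This is a small sketch; the function names `poly` and `expo` are chosen here for illustration.

```python
def poly(n):
    """The polynomial from the comparison: n to the power 1.1."""
    return n ** 1.1

def expo(n):
    """The exponential from the comparison: 1.1 to the power n."""
    return 1.1 ** n

# Recompute both functions at the same values of n.
for n in (2, 3, 4, 5, 6, 10, 20, 50, 100):
    print(f"{n:>4} {poly(n):>10.2f} {expo(n):>10.2f}")

# First n >= 2 at which the exponential exceeds the polynomial.
crossover = next(n for n in range(2, 1000) if expo(n) > poly(n))
print(crossover)  # 44
```

Up to n = 43 the polynomial is the larger of the two; from n = 44 onwards the exponential is ahead, and the gap widens quickly.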


Hard and Easy Problems

Polynomial Complexity: these are called tractable, and easy problems. Fast algorithms are known which provide the best solution. Pairwise alignment is one such problem. Sorting is another.

Exponential Complexity: these are called intractable, and hard problems. The fastest known algorithm which exactly solves such a problem is usually not significantly faster than exhaustive search.

[Figure: a polynomial curve and an exponential curve plotted against increasing n]

An exponential curve always overtakes a polynomial one.

E.g. time needed on the fastest computers to search all protein structures with 500 amino acids: trillions of times longer than the current age of the universe.


Example: Minimum Spanning Tree Problems

This is a case of an easy problem, but not as obvious as our first easy example.

Find the cheapest tree which connects all (i.e. spans) the nodes of a given graph.

Applications: Comms network backbone design; Electricity distribution networks, water distribution networks, etc …

[Figure: a graph showing the costs of building each pair-to-pair link. The edge costs shown are 7, 12, 4, 9, 6, 8, 5, 14, 6 and 3.]

What is the minimal-cost spanning tree?

(Spanning tree = visits all nodes, has no cycles; cost is the sum of the costs of the edges used in the tree)

Here’s one tree:

[Figure: a spanning tree using the edges with costs 7, 12, 14 and 3]

With cost = 36

Here’s a cheaper one:

[Figure: a spanning tree using the edges with costs 7, 4, 6 and 3]

With cost 20 …

The problem ‘find the minimal-cost spanning tree’ (aka the ‘MST’) is easy in the technical sense.

[Figure: the same graph of link costs as before]

Several fast algorithms are known which solve this in polynomial time. Here is the classic one, Prim’s algorithm:

Start with an empty tree (no edges)
Repeat: choose the cheapest edge which feasibly extends the tree
Until: n - 1 edges have been chosen.

[Figures: Prim’s steps 1 to 4, each adding one edge to the tree on the same graph]

Result of Prim’s step 4: the tree using the edges with costs 4, 5, 6 and 3 (total cost 18).

Guaranteed to have minimal possible cost for this graph; i.e. this is the (or a) MST in this case.
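Prim’s algorithm can be sketched in Python as follows. One loud assumption: the layout of the example graph cannot be recovered from this transcript, so the assignment of the ten edge costs to node pairs below (and the node names A to E) is hypothetical. It was chosen so that the cheapest tree uses edges of cost 3, 4, 5 and 6, matching the result stated for the example.

```python
import heapq

# Hypothetical layout: five nodes, ten links, costs taken from the example figure.
EDGES = [
    ("A", "B", 3), ("B", "C", 4), ("C", "D", 5), ("D", "E", 6),
    ("A", "C", 6), ("A", "D", 7), ("B", "D", 8), ("C", "E", 9),
    ("A", "E", 12), ("B", "E", 14),
]

def prim(edges, start):
    """Grow a tree from `start`, always adding the cheapest edge that extends it."""
    adjacent = {}
    for u, v, cost in edges:
        adjacent.setdefault(u, []).append((cost, u, v))
        adjacent.setdefault(v, []).append((cost, v, u))

    in_tree = {start}
    frontier = list(adjacent[start])   # edges leaving the current tree
    heapq.heapify(frontier)
    tree = []

    while frontier and len(tree) < len(adjacent) - 1:
        cost, u, v = heapq.heappop(frontier)   # cheapest candidate edge
        if v in in_tree:
            continue                           # would create a cycle: not feasible
        in_tree.add(v)
        tree.append((u, v, cost))
        for edge in adjacent[v]:
            if edge[2] not in in_tree:
                heapq.heappush(frontier, edge)
    return tree

tree = prim(EDGES, "A")
print(tree, sum(cost for _, _, cost in tree))  # total cost 18
```

The priority queue is a design choice for the sketch; a plain scan for the cheapest feasible edge at each step would follow the description just as well and is still polynomial in the size of the graph.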

But change the problem slightly:

• We may want the degree-constrained MST (i.e. the cheapest spanning tree in which no node has a degree above 4).

• Or we may want the optimal communication spanning tree, which is the cheapest tree among those that satisfy certain bandwidth requirements between certain pairs of nodes.

There are many constrained/different forms of the MST. These are essentially problems where we seek the cheapest tree structure, but where many, or even most, trees are not actually feasible solutions.

Here’s the thing: these constrained versions are almost always technically hard, and real-world MST-style problems are invariably of this kind.
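The effect of a degree constraint can be seen on a toy graph. Everything in this sketch is an assumption for illustration: the four-node graph, the degree limits of 3 and 2 (rather than 4), and the brute-force method, which simply enumerates every set of n - 1 edges and is only workable for tiny graphs.

```python
from itertools import combinations

# Hypothetical graph: node 0 is cheaply linked to every other node.
EDGES = [(0, 1, 1), (0, 2, 1), (0, 3, 1), (1, 2, 2), (2, 3, 2), (1, 3, 5)]
N_NODES = 4

def is_spanning_tree(edge_subset, n_nodes):
    """n - 1 edges with no cycle form a spanning tree (checked with union-find)."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v, _ in edge_subset:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False               # this edge would close a cycle
        parent[root_u] = root_v
    return len(edge_subset) == n_nodes - 1

def max_degree(edge_subset, n_nodes):
    degree = [0] * n_nodes
    for u, v, _ in edge_subset:
        degree[u] += 1
        degree[v] += 1
    return max(degree)

def tree_cost(tree):
    return sum(cost for _, _, cost in tree)

def cheapest_tree(edges, n_nodes, degree_limit):
    """Cheapest spanning tree among those that respect the degree limit."""
    feasible = (t for t in combinations(edges, n_nodes - 1)
                if is_spanning_tree(t, n_nodes)
                and max_degree(t, n_nodes) <= degree_limit)
    return min(feasible, key=tree_cost)

print(tree_cost(cheapest_tree(EDGES, N_NODES, degree_limit=3)))  # 3
print(tree_cost(cheapest_tree(EDGES, N_NODES, degree_limit=2)))  # 4
```

With the looser limit the cheapest tree is the star around node 0; the tighter limit rules that tree out, and the best feasible tree costs more.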

Approximate Algorithms

For hard optimisation problems (again, which turn out to be nearly all the important ones), we need approximate algorithms.

These:

• deliver solutions in reasonable time

• try to find pretty good (`near optimal’) solutions, and often get optimal ones.

• do not (cannot) guarantee that they have delivered the optimal solution.
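These three properties can be seen in even a very simple approximate method. The sketch below is a stochastic hill climber, used here as a stand-in and not as an EA or as the method of this material. It reuses the four item weights from the earlier example and adds a hypothetical capacity of 100kg, so that the task becomes "find the heaviest subset that does not exceed the capacity".

```python
import random

WEIGHTS = [20, 75, 60, 35]   # item weights in kg, from the earlier example
CAPACITY = 100               # hypothetical limit, added for this sketch

def fitness(s):
    """Total weight of the chosen items; subsets over capacity score worst."""
    total = sum(w for w, bit in zip(WEIGHTS, s) if bit)
    return total if total <= CAPACITY else -1

def hill_climb(iterations, rng):
    """Flip one random bit at a time, keeping the change if it is not worse."""
    current = [0] * len(WEIGHTS)
    for _ in range(iterations):
        candidate = current[:]
        i = rng.randrange(len(candidate))
        candidate[i] = 1 - candidate[i]
        if fitness(candidate) >= fitness(current):
            current = candidate
    return current

for seed in range(5):
    s = hill_climb(200, random.Random(seed))
    print(seed, s, fitness(s))
```

The best possible fitness here is 95 (items 1 and 2, or items 3 and 4). A run that happens to pick items 1 and 3 first gets stuck at 80, because no single flip improves it: a fast answer, often optimal, but with no guarantee.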

Typical Performance of Approximate Methods

Evolutionary Algorithms turn out to be the most successful and generally useful approximate algorithms around. They often take a long time though; it’s worth getting used to the following curve, which tends to apply across the board.

[Figure: solution quality plotted against time. A simple method gets good solutions fast; a sophisticated method is slow, but finds better solutions eventually.]


So …

• Most real world problems that we need to solve are hard problems

• We can’t expect truly optimal solutions for these, so we use approximate algorithms and do as well as we can.

• EAs (and other BIC methods we will see) are very successful approximate algorithms