27

Randomized Algorithms

Post on 15-Nov-2021

TRANSCRIPT

Page 1: Problem whose solutions can be 'ranked

Randomized Algorithms

Page 2:

Essence

Probabilistic algorithms or sampling

A degree of randomness is part of the logic

(using a random number generator)

Results can be different for different runs

Algorithms may

Produce incorrect results, or fail to produce a result (Monte Carlo)

Always produce a correct result, but with random run time (Las Vegas)

Page 3:

Example

An array of n elements, half are ‘a’ and half are ‘b’
Find an ‘a’ in the array

findA_LV
    repeat
        randomly select one out of n elements
    until ‘a’ is found
end

findA_MC
    count = 0
    repeat
        randomly select one out of n elements
        count++
    until ‘a’ is found or count > k
end
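In Python, the two routines might look like this (a minimal sketch; the function names are ours):

```python
import random

def find_a_lv(arr):
    """Las Vegas: always finds an 'a', but the number of probes is random."""
    while True:
        i = random.randrange(len(arr))
        if arr[i] == 'a':
            return i

def find_a_mc(arr, k):
    """Monte Carlo: at most k probes; may fail and return None."""
    for _ in range(k):
        i = random.randrange(len(arr))
        if arr[i] == 'a':
            return i
    return None
```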

Page 4:

Example (cont.)

findA_LV
    Always succeeds
    Random run time (average is O(1))

findA_MC
    Succeeds with P = 1 - (1/2)^k
    Max run time is O(k)

Page 5:

More Examples

Quick sort

The selection of the pivot is random

Always produces the correct result, but the run time is random

Average run time is O(n log n)
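A randomized quicksort along these lines can be sketched as follows (not the in-place textbook version):

```python
import random

def quicksort(a):
    """Randomized quicksort: the pivot is chosen uniformly at random."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)
```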

Page 6:

General Curve Fitting

Given $n$ input points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ and a model $y = f(x, a_1, a_2, \ldots, a_n)$, we get $n$ equations:

$y_1 = f(x_1, a_1, \ldots, a_n),\; y_2 = f(x_2, a_1, \ldots, a_n),\; \ldots,\; y_n = f(x_n, a_1, \ldots, a_n)$

Example: $y = f(x, a, b, c) = ax^2 + bx + c$ with 3 input points $(1,1), (2,2), (3,1)$ gives 3 equations:

$\begin{pmatrix} 1 & 1 & 1 \\ 4 & 2 & 1 \\ 9 & 3 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$

Solve for $a$, $b$, $c$.
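The 3x3 example system can be solved with a small Gaussian-elimination routine (names are ours; any linear solver would do):

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Build the augmented matrix.
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        # Pivot: bring the largest entry in this column to the diagonal.
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back-substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# The 3-point parabola: a + b + c = 1, 4a + 2b + c = 2, 9a + 3b + c = 1
a, b, c = solve([[1, 1, 1], [4, 2, 1], [9, 3, 1]], [1, 2, 1])
```

For these three points the fit works out to $f(x) = -x^2 + 4x - 2$.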

Page 7:

General Least Square Regression

Fit a polynomial $y = f(x, a_0, a_1, \ldots, a_{n-1}) = a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ to $m$ points $(x_i, y_i)$:

$\min E, \text{ where } E = \sum_{i=1}^{m} (y_i - \hat{y}_i)^2$

$\min_{a_0, a_1, \ldots, a_{n-1}} \sum_{i=1}^{m} \bigl(y_i - (a_{n-1}x_i^{n-1} + \cdots + a_1 x_i + a_0)\bigr)^2$

Set the partial derivatives to zero:

$\frac{\partial E}{\partial a_j} = 0, \quad j = 0, 1, \ldots, n-1$

$\sum_{i=1}^{m} \bigl(y_i - (a_{n-1}x_i^{n-1} + \cdots + a_1 x_i + a_0)\bigr)\, x_i^j = 0$

Page 8:

General Least Square Regression

Rearranging $\sum_{i=1}^{m} x_i^j \bigl(y_i - (a_{n-1}x_i^{n-1} + \cdots + a_1 x_i + a_0)\bigr) = 0$ for each $j$ gives the normal equations:

$a_{n-1}\sum_i x_i^{\,n-1+j} + \cdots + a_1\sum_i x_i^{\,1+j} + a_0\sum_i x_i^{\,j} = \sum_i x_i^{\,j}\, y_i$

In matrix form:

$\begin{pmatrix} m & \sum_i x_i & \cdots & \sum_i x_i^{n-1} \\ \sum_i x_i & \sum_i x_i^2 & \cdots & \sum_i x_i^{n} \\ \vdots & \vdots & \ddots & \vdots \\ \sum_i x_i^{n-1} & \sum_i x_i^{n} & \cdots & \sum_i x_i^{2(n-1)} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{pmatrix} = \begin{pmatrix} \sum_i y_i \\ \sum_i x_i y_i \\ \vdots \\ \sum_i x_i^{n-1} y_i \end{pmatrix}$
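The normal equations translate directly into code; a minimal sketch (`polyfit_ls` and `gauss_solve` are our names):

```python
def polyfit_ls(xs, ys, n):
    """Least-squares fit of y = a0 + a1*x + ... + a_{n-1}*x^(n-1)
    by forming and solving the normal equations."""
    # Normal-equation matrix: entry (j, k) is sum_i x_i^(j+k).
    A = [[sum(x ** (j + k) for x in xs) for k in range(n)] for j in range(n)]
    # Right-hand side: entry j is sum_i x_i^j * y_i.
    b = [sum((x ** j) * y for x, y in zip(xs, ys)) for j in range(n)]
    return gauss_solve(A, b)

def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x
```

Note that the normal equations become ill-conditioned for high polynomial degrees; for a slide-sized n they are fine.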

Page 9:

Homework #4

For the straight-line model $\hat{y} = ax + b$:

$\min E, \text{ where } E = \sum_{i=1}^{m}(y_i - \hat{y}_i)^2$

$\min_{a,b} \sum_{i=1}^{m} \bigl(y_i - (ax_i + b)\bigr)^2$

$\frac{\partial E}{\partial a} = -2\sum_{i=1}^{m} \bigl(y_i - (ax_i + b)\bigr)\,x_i = 0 \;\Rightarrow\; a\sum_i x_i^2 + b\sum_i x_i = \sum_i x_i y_i$

$\frac{\partial E}{\partial b} = -2\sum_{i=1}^{m} \bigl(y_i - (ax_i + b)\bigr)\,(1) = 0 \;\Rightarrow\; a\sum_i x_i + b\,m = \sum_i y_i$

In matrix form:

$\begin{pmatrix} \sum_i x_i^2 & \sum_i x_i \\ \sum_i x_i & m \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum_i x_i y_i \\ \sum_i y_i \end{pmatrix}$
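The 2x2 system has a closed-form solution by Cramer's rule; a minimal sketch (function name ours):

```python
def fit_line(xs, ys):
    """Least-squares line y = a*x + b via the 2x2 normal equations."""
    m = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Cramer's rule on [[sxx, sx], [sx, m]] [a, b]^T = [sxy, sy]^T
    det = sxx * m - sx * sx
    a = (sxy * m - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b
```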

Page 10:

Caveats of LS

Democracy: everybody gets an equal say

Performs badly with “outliers”

[Figure: X-Y scatter plot illustrating the effect of an outlier on a fit]

Page 11:

Noisy Data vs. Outliers

[Figure: two panels contrasting noisy data with outliers]

Page 12:

Outliers

[Figure: two panels: outliers in y; outliers in x (leverage points)]

Page 13:

Computer Vision and Image Analysis

Randomized Algorithm

Choose p points at random from the set of n data points

Compute the fit of the model to the p points

Compute the median of the fitting error over the remaining n - p points

The fitting procedure is repeated until a fit is found with a sufficiently small median of squared residuals, or up to some predetermined number of fitting steps (Monte Carlo sampling)
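For line fitting, the procedure can be sketched as a least-median-of-squares loop (function name and the 50-trial default are ours):

```python
import random

def lmeds_line(points, trials=50):
    """Monte Carlo least-median-of-squares line fit (a sketch): repeatedly
    fit a line to 2 random points and keep the fit whose median squared
    residual over all points is smallest."""
    best = None
    best_med = float('inf')
    for _ in range(trials):
        (x1, y1), (x2, y2) = random.sample(points, 2)
        if x1 == x2:
            continue  # vertical sample; skip for simplicity
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        residuals = sorted((y - (a * x + b)) ** 2 for x, y in points)
        med = residuals[len(residuals) // 2]
        if med < best_med:
            best_med = med
            best = (a, b)
    return best
```

Because the score is a median, up to half the points can be wildly wrong without corrupting the winning fit.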

Page 14:

How Many Trials?

Well, theoretically it takes C(n, p) trials to examine all possible p-tuples

Very expensive

$\epsilon$ : fraction of bad data

$(1-\epsilon)$ : fraction of good data

$(1-\epsilon)^p$ : all p samples are good

$1-(1-\epsilon)^p$ : at least one sample is bad

$(1-(1-\epsilon)^p)^m$ : got bad data in all m tries

$1-(1-(1-\epsilon)^p)^m$ : got at least one good set in m tries

Page 15:

How Many Trials (cont.)

Make sure the probability is high (e.g. > 95%): given p and $\epsilon$, calculate m

p \ $\epsilon$    5%   10%   20%   25%   30%   40%   50%
1                  1    2     2     3     3     4     5
2                  2    2     3     4     5     7    11
3                  2    3     5     6     8    13    23
4                  2    3     6     8    11    22    47
5                  3    4     8    12    17    38    95
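The table entries follow from requiring $1-(1-(1-\epsilon)^p)^m \ge 0.95$ and solving for m; a quick sketch (function name ours):

```python
import math

def trials_needed(p, eps, confidence=0.95):
    """Smallest m with 1 - (1 - (1-eps)^p)^m >= confidence."""
    good = (1 - eps) ** p          # probability one p-tuple is all good
    return math.ceil(math.log(1 - confidence) / math.log(1 - good))
```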

Page 16:

Best Practice

Randomized selection can completely remove outliers

“Plutocratic”: results are based on a small set of features

LS is the most fair, everyone gets an equal say

“Democratic”: but it can be seriously influenced by bad data

Use a randomized algorithm to remove outliers

Use LS for the final “polishing” of the results (using all “good” data)

Allows up to 50% outliers theoretically

Page 17:

Navigation

Autonomous Land Vehicle (ALV), Google’s Self-Driving Car, etc.

One important requirement: track the position of the vehicle

Kalman Filter, a loop of:

(Re)initialization

Prediction

Observation

Correction
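A minimal 1-D version of that loop (illustrative motion and measurement models, not a full vehicle tracker):

```python
def kalman_1d(z_measurements, u_controls, q=0.1, r=1.0):
    """Minimal 1-D Kalman filter tracking a scalar position (a sketch).

    x: position estimate, p: its variance
    q: process noise variance, r: measurement noise variance
    """
    x, p = 0.0, 1.0                      # (re)initialization
    for u, z in zip(u_controls, z_measurements):
        # Prediction: apply the control (commanded displacement).
        x = x + u
        p = p + q
        # Observation + correction: blend in the measurement z.
        k = p / (p + r)                  # Kalman gain
        x = x + k * (z - x)
        p = (1 - k) * p
    return x, p
```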

Page 18:
Page 19:

Navigation

Hypothesis and verification

Classic approaches like the Kalman Filter maintain a single hypothesis

Newer approaches like the particle filter maintain multiple hypotheses (Monte Carlo sampling of the state space)

Page 20:

Single Hypothesis

If the “distraction” (noise) is white and Gaussian

The state-space probability profile remains Gaussian (a single dominant mode)

Evolve and track the mean, not a whole distribution

Page 21:

Multi-Hypotheses

The distribution can have multiple modes

Sample the probability distribution with

“importance” rating

Evolve the whole distribution, instead of

just the mean

Page 22:

Key – Bayes Rule

In the daytime, when some animal runs in front of you on the bike path, you know exactly what it is ($p(o|s_i)$ is sufficient)

At night, when some animal runs in front of you on the bike path, you can hardly distinguish the shape ($p(o|s_i)$ is low for all cases, but you know it is probably a squirrel, not a lion, because of $P(s_i)$)

$P(s_i \mid o) = \dfrac{p(o \mid s_i)\,P(s_i)}{p(o)}, \quad p(o) = \sum_i p(o \mid s_i)\,P(s_i)$

$o$ : observation

$s_i$ : states
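With hypothetical numbers for the night-time story (the priors and likelihoods below are made up for illustration), the prior dominates the posterior:

```python
# p(o|s) is nearly uninformative at night, but P(s) carries the decision.
priors = {'squirrel': 0.90, 'dog': 0.099, 'lion': 0.001}       # P(s)
likelihood = {'squirrel': 0.30, 'dog': 0.30, 'lion': 0.35}     # p(o|s), all low

p_o = sum(likelihood[s] * priors[s] for s in priors)           # p(o)
posterior = {s: likelihood[s] * priors[s] / p_o for s in priors}
```

Even though the lion explains the shadowy shape slightly better, its tiny prior makes its posterior negligible.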

Page 23:

Initialization: before observation and measurement

Observation: after seeing a door

P(s): probability of state

P(o|s): probability of observation given current state

Page 24:

Prediction: internal mechanism saying that the robot moves right

Correction: the prediction is weighted by confirmation with the observation

Page 25:
Page 26:

[Figure: one iteration of the particle filter: particles + weights are advanced by the controls into new particles, reweighted by the measurements, and normalized by the total weight]
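One iteration of that loop might be sketched as follows (1-D state, illustrative noise models; names are ours):

```python
import math
import random

def particle_filter_step(particles, u, z, measure_noise=1.0):
    """One particle-filter iteration for a 1-D robot: predict with
    control u, weight by measurement z, then resample."""
    # Prediction: move every particle by the control, plus process noise.
    moved = [x + u + random.gauss(0, 0.1) for x in particles]
    # Observation/correction: weight each particle by the likelihood of z.
    weights = [math.exp(-((z - x) ** 2) / (2 * measure_noise ** 2))
               for x in moved]
    total = sum(weights)                  # total weight (normalizer)
    if total == 0:                        # degenerate case: keep as is
        return moved
    probs = [w / total for w in weights]
    # Resampling: draw a new particle set with probability ~ weight.
    return random.choices(moved, weights=probs, k=len(moved))
```

The resampling step is what concentrates particles in the high-likelihood modes while still representing the whole distribution.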

Page 27:

Why Be Stochastic?

More Choices – remove bad data

More Alternatives – sample the problem

states based on likelihood