
Do Humans make Good Observers – and can they Reliably Fuse Information?

Dr. Mark Bedworth, MV Concepts Ltd.

mark.bedworth@mv-concepts.com

What we will cover:

• The decision making process
• The information fusion context
• The reliability of the process
• Where the pitfalls lie
• How not to get caught out
• Suggestions for next steps

What we will not cover:

• Systems design and architectures
• Counter-piracy specifics
• Inferencing frameworks
• Tracking
• Multi-class problems
• Extensive mathematics
• In fact… most of the detail!

Our objectives:

• Understanding of the context of data fusion for decision making
• Quantitative grasp of a few key theories
• Appreciation of how to put the theory into practice
• Knowledge of where the gaps in theory remain

Warning

This presentation contains audience participation experiments.

Decision Making

• To make an informed decision:
– Obtain data on the relevant factors
– Reason within the domain context
– Understand the possible outcomes
– Have a method of implementation

Boyd Cycle

• This is captured more formally as a fusion architecture:
– Observe: acquire data
– Orient: form perspective
– Decide: determine course of action
– Act: put into practice
• Also called the OODA loop

OODA loop

[Figure: the OODA loop: Observe → Orient → Decide → Act]

Adversarial OODA Loops

[Figure: two opposing OODA loops, own and adversary, each running Observe → Orient → Decide → Act on its own information and interacting through the shared physical world]

Winning the OODA Game

• To achieve dominance:
– Make better decisions
– In a more timely manner
– And implement more effectively

Dominance History

• Action dominance (-A)
– Longer range, more destructive, more accurate weapons
• Observation dominance (O-)
– Longer range, more robust, more accurate sensors
• Information dominance (-O-D-)
– More timely and relevant information with better support to the decision maker

Information Dominance
Part One: Orientation

“Having acquired relevant data; to undertake reasoning about the data within the domain context to form a perspective of the current situation; so that an informed decision can subsequently be made”

A number of approaches

• Fusion of hard decisions
– Majority rule
– Weighted voting
– Maximum a posteriori fusion
– Behaviour knowledge space
• Fusion of soft decisions
– Probability fusion

Reasoning Frameworks

• Boolean
– Truth and falsehood
• Fuzzy (Zadeh)
– Vagueness
• Evidential (Dempster-Shafer)
– Belief and ignorance
• Probabilistic (Bayesian)
– Uncertainty

Probability theory

• 0 ≤ P(H) ≤ 1
• If P(H) = 1 then H is certain to occur
• P(H) + P(~H) = 1: either H or not-H is certain to occur (negation rule)
• P(G,H) = P(G|H) P(H) = P(H|G) P(G): the joint probability is the conditional probability multiplied by the prior (conjunction rule)

Bayes’ Theorem

$$\underbrace{P(H \mid X)}_{\text{posterior probability}} = \frac{\overbrace{P(X \mid H)}^{\text{likelihood}}\;\overbrace{P(H)}^{\text{prior probability}}}{\underbrace{P(X)}_{\text{marginal likelihood}}}$$

Perspective Calculation

• Usually the marginal likelihood is awkward to compute
– But it is not needed, since it is independent of the hypothesis
– Compute the products of the likelihoods and priors; then normalise over hypotheses
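A minimal sketch of this normalisation step (the function name and numbers here are invented for illustration):

```python
def perspective(likelihoods, priors):
    """Posterior over hypotheses: multiply likelihood by prior, then
    normalise, so the marginal likelihood P(X) is never needed."""
    unnormalised = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

# Invented numbers: P(X|H) for two hypotheses, and their priors P(H)
print(perspective([0.9, 0.2], [0.05, 0.95]))  # [0.191..., 0.808...]
```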

Human Fusion Experiment (1)

• A threat is present 5% of the time it is looked for

• Observers A and B both independently look for the threat

• Both report an absence of the threat with posterior probabilities 70% and 80%

• What is the fused probability that the threat is absent?

Human Fusion Experiment (2)

• Threat absent ≡ the hypothesis (H)
• P(~H) = 0.05
• P(H) = 0.95

• P(H|XA) = 0.70

• P(H|XB) = 0.80

• P(H|XA,XB) = ?

Human Fusion Experiment (3)

[Figure: a probability scale for H = “no threat”, running up to H = 1.00, marking the prior P(H) = 0.95 and the two reports: Report A P(H|XA) = 0.70, Report B P(H|XB) = 0.80]

Conditional Independence

• Assume the data to be conditionally independent given the class:

$$P(A, B \mid H) = P(A \mid H)\, P(B \mid H)$$

• Note that this does not necessarily imply:

$$P(A, B) = P(A)\, P(B)$$

Conditionally Independent

[Figure: four scatter plots of sensor 2 measurement against sensor 1 measurement; two panels labelled “conditionally independent”, two labelled “not conditionally independent”]

Fusion: Product Rule (1)

• We require: $P(H \mid A, B)$

• From Bayes’ theorem:

$$P(H \mid A, B) = \frac{P(A, B \mid H)\, P(H)}{P(A, B)}$$

Fusion: Product Rule (2)

• We assume conditional independence so may write:

$$P(H \mid A, B) = \frac{P(A \mid H)\, P(B \mid H)\, P(H)}{P(A, B)}$$

Fusion: Product Rule (3)

• Applying Bayes’ theorem again:

$$P(H \mid A, B) = \frac{P(H \mid A)\, P(A)}{P(H)} \cdot \frac{P(H \mid B)\, P(B)}{P(H)} \cdot \frac{P(H)}{P(A, B)}$$

• And collecting terms:

$$P(H \mid A, B) = \frac{P(H \mid A)\, P(H \mid B)}{P(H)} \cdot \frac{P(A)\, P(B)}{P(A, B)}$$

Fusion: Product Rule (4)

• We may drop the marginal likelihoods again and normalise:

$$\underbrace{P(H \mid A, B)}_{\text{fused posterior probability}} \propto \frac{\overbrace{P(H \mid A)}^{\text{posterior probability}}\;\overbrace{P(H \mid B)}^{\text{posterior probability}}}{\underbrace{P(H)}_{\text{prior probability}}}$$

Multisource Fusion Rule

• The generalisation of this fusion rule to multiple sources:

$$P(H \mid X_N) \propto \frac{\prod_{i=1}^{N} P(H \mid x_i)}{P(H)^{N-1}}$$

• This is commutative
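A minimal sketch of this rule for a binary hypothesis (the function name is mine; normalisation over H and ~H replaces the dropped marginal):

```python
def fuse(posteriors, prior):
    """Fuse N per-source posteriors P(H|x_i) for a binary hypothesis H,
    assuming conditional independence:
    P(H|X_N) ~ prod_i P(H|x_i) / P(H)^(N-1), then normalise."""
    n = len(posteriors)
    h, not_h = 1.0, 1.0
    for p in posteriors:
        h *= p
        not_h *= 1.0 - p
    h /= prior ** (n - 1)
    not_h /= (1.0 - prior) ** (n - 1)
    return h / (h + not_h)
```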

Commutativity of Fusion (1)

• Splitting the N sources into two subsets of R and S = N − R sources, fusing each subset and then fusing the two partial results gives the same answer:

$$P(H \mid X_N) \propto \frac{\prod_{i=1}^{N} P(H \mid x_i)}{P(H)^{N-1}} = \frac{1}{P(H)}\left[\frac{\prod_{i=1}^{R} P(H \mid x_i)}{P(H)^{R-1}}\right]\left[\frac{\prod_{i=1}^{S} P(H \mid x_{R+i})}{P(H)^{S-1}}\right]$$

Commutativity of Fusion (2)

• The probability fusion rule commutes:
– It doesn’t matter what the architecture is
– It doesn’t matter if it is single stage or multi-stage

Experiment: Results

$$P(H \mid A, B) \propto \frac{P(H \mid A)\, P(H \mid B)}{P(H)} = \frac{0.70 \times 0.80}{0.95} \approx 0.59$$

$$P(\mathord{\sim}H \mid A, B) \propto \frac{P(\mathord{\sim}H \mid A)\, P(\mathord{\sim}H \mid B)}{P(\mathord{\sim}H)} = \frac{0.30 \times 0.20}{0.05} = 1.20$$

• Normalising gives: P(H|A,B) = 0.33, P(~H|A,B) = 0.67
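Feeding the experiment’s numbers through the fusion sketch given earlier reproduces this result:

```python
print(round(fuse([0.70, 0.80], 0.95), 2))  # 0.33: both observers said 70-80% "no threat",
                                           # yet fusion says the threat is probably present
```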

Human Fusion Experiment (3)

[Figure: the earlier probability scale, now with the fused result added: Fusion A,B P(H|XA,XB) = 0.33, below both individual reports and far below the prior P(H) = 0.95]

Why was that so hard?

• Most humans find it difficult to intuitively fuse uncertain information
– Not because they are innumerate
– But because they cannot comfortably balance the evidence (likelihood) with their predisposition (prior)

Prior Sensitivity (1)

• If the issue is with the priors, do they matter?
• Can we ignore the priors?
• Do we get the same final decision if we change the priors?

Prior Sensitivity (2)

• If P(H|A) = P(H|B)
• What value of P(H) makes P(H|A,B) = 0.5?

$$P(H) = \frac{P(H \mid A)^2}{P(H \mid A)^2 + \bigl(1 - P(H \mid A)\bigr)^2}$$
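For example, with two concurring reports of P(H|A) = P(H|B) = 0.7, the fused probability sits exactly at 0.5 when

$$P(H) = \frac{0.7^2}{0.7^2 + 0.3^2} = \frac{0.49}{0.58} \approx 0.845$$

so any prior above about 0.845 drags the fused probability below one half: each 70% report is then less confident than the prior, so it counts as evidence against H.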

Prior Sensitivity (3)

[Figure: the crossover prior P(H) plotted against P(H|A) = P(H|B), both axes running from 0 to 1]

Prior Sensitivity (4)

• For 0.2 < P(H|A) < 0.8 the prior has a significant effect

• Carefully define the domain over which the prior is evaluated

• Put effort into using a reasonable value

Sensitivity to Posterior Probability

• What about the posterior probabilities delivered to the fusion centre?

• Can we tolerate errors here?
• Which types of errors hurt most?

Probability Experiment (1)

• 10 estimation questions
• Write down a lower and upper bound
• So that you are 90% sure it covers the actual value
• All questions relate to the highest point in various countries (in metres)

Probability Experiment (2)

• Winner defined as:
– Person with the most answers correct
– Tie-break decided by the smallest sum of ranges (for all 10 questions)
• Pick a range big enough
• But not too big!

The questions:
1. Australia
2. Chile
3. Cuba
4. Egypt
5. Ethiopia
6. Finland
7. Hong Kong
8. India
9. Lithuania
10. Poland

The answers:
1. Australia (2228 m)
2. Chile (6893 m)
3. Cuba (1974 m)
4. Egypt (2629 m)
5. Ethiopia (4550 m)
6. Finland (1324 m)
7. Hong Kong (958 m)
8. India (8586 m)
9. Lithuania (294 m)
10. Poland (2499 m)

Overconfidence (1)

• Large trials show that most people get fewer than 40% correct
• It should be 90% correct!
• People are often overconfident (even when primed that they are being tested!)

Overconfidence (2)

[Figure: calibration plot of declared probability against actual probability, with overconfident, underconfident and wrong regions marked]

Confidence Amplification (1)

[Figure: fused class probability against input class probability for 2, 3, 4 and 5 sensors]

Confidence Amplification (2)

[Figure: fused class probability against input class probability, as above]

Veto Effect

• If any local decision-maker outputs a probability of close to zero for a class then the fused probability is close to zero
– even if all the other decision-makers output a high probability
– about 40% of the response surface for two sensors is either <0.1 or >0.9
– this rises to 50% for three sensors and nearly 60% for four
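A quick Monte Carlo sketch of the two-sensor figure (assuming equal priors of 0.5 and uniformly distributed input probabilities; the result is approximate):

```python
import random

def fused_two(p1, p2):
    # Product-rule fusion of two posteriors, assuming equal priors P(H) = 0.5
    h = p1 * p2
    not_h = (1.0 - p1) * (1.0 - p2)
    return h / (h + not_h)

trials = 100_000
extreme = sum(
    1 for _ in range(trials)
    if not 0.1 <= fused_two(random.random(), random.random()) <= 0.9
)
print(extreme / trials)  # comes out near 0.4, in line with the slide's figure
```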

Moderation of probabilities

• If we suspect that the posterior probabilities are overconfident then we should moderate them
– By building it into automatic techniques
– By allowing for it if this is not possible

Gaussian Moderation

• For Gaussian classifiers the Bayesian correction is analytically tractable

• By integrating over the mean and variance rather than taking the maximum likelihood value

Student t-distribution (1)

• For Gaussian data this is:

$$P(x_i \mid D) = \int_0^{\infty}\!\!\int_{-\infty}^{\infty} P(x_i \mid \mu, \sigma^2)\, P(\mu, \sigma^2 \mid D)\; d\mu\, d\sigma^2$$

• Which is a “Student” t-distribution with N − 1 degrees of freedom, centred on the sample mean $\hat{\mu}$ with squared scale $\hat{\sigma}^2(1 + 1/N)$:

$$P(x_i \mid \hat{\mu}, \hat{\sigma}^2, N) = \mathrm{St}\!\left(x_i \;\middle|\; \hat{\mu},\; \hat{\sigma}^2\left(1 + \tfrac{1}{N}\right),\; N - 1\right)$$
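A sketch of this correction using SciPy (this is the standard noninformative-prior predictive; the function name and sample data are mine):

```python
import numpy as np
from scipy import stats

def moderated_likelihood(x, data):
    """Bayesian predictive density P(x|D) for Gaussian data with unknown
    mean and variance: a Student t with N-1 degrees of freedom, which has
    fatter tails than the maximum-likelihood Gaussian fit."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    scale = np.sqrt(data.var(ddof=1) * (1.0 + 1.0 / n))
    return stats.t.pdf(x, df=n - 1, loc=data.mean(), scale=scale)

samples = [1.2, 0.8, 1.5, 0.9, 1.1]
print(moderated_likelihood(3.0, samples))  # far larger than the ML Gaussian tail
```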

Student t-distribution (2)

[Figure: two panels plotting likelihood of data against measurement value]

Student t-distribution (3)

[Figure: two panels plotting probability of class 1 against measurement value]

Approximate Moderation (1)

• We can get a similar effect at the fusion centre using the posteriors:
– Convert back to “likelihoods” by dividing by the prior
– Add a constant to everything
– Convert back to “posteriors” by multiplying by the prior
– Renormalise

Approximate Moderation (2)

• How much to add depends on the source of the posterior probabilities:
– Correction factor for each source
– Learned from data
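A minimal sketch of the four steps (the correction factor epsilon here is a made-up value; in practice it would be learned per source):

```python
def moderate(posteriors, priors, epsilon):
    """Soften overconfident posteriors at the fusion centre:
    divide by prior, add a constant, multiply by prior, renormalise."""
    pseudo_likelihoods = [p / q for p, q in zip(posteriors, priors)]
    softened = [l + epsilon for l in pseudo_likelihoods]
    unnormalised = [l * q for l, q in zip(softened, priors)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

# An overconfident report over two equally likely hypotheses
print(moderate([0.99, 0.01], [0.5, 0.5], epsilon=0.2))  # pulled towards the prior
```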

Other Issues

• Conditional independence not holding
• Information incest
• Missing data
• Communication errors
• Asynchronous information

Information Dominance
Part Two: Decision

“Having reasoned about the data to form a perspective of the current situation; to make an informed decision which optimises the desirability of the outcome”

Deciding what to do

“Decision theory is trivial, apart from the details”

• Select an action that maximises the expected utility of the outcome

Utility functions?

• A utility function describes how desirable each possible outcome is
– People are sometimes irrational
– Desirability cannot be captured by a single-valued function
– Allais paradox

Utility Experiment (1)

1. Guaranteed €1 million
2. 89% chance of €1 million; 10% chance of €5 million; 1% chance of nothing

Utility Experiment (2)

1. 89% chance of nothing; 11% chance of €1 million
2. 90% chance of nothing; 10% chance of €5 million

Utility Experiment (3)

• If you prefer 1 to 2 on the first slide, you should prefer 1 to 2 on the second slide as well
• If not, you are acting irrationally…
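Writing U for the utility function makes the inconsistency explicit. Preferring option 1 on the first slide asserts

$$U(1\text{M}) > 0.89\,U(1\text{M}) + 0.10\,U(5\text{M}) + 0.01\,U(0)$$

i.e. $0.11\,U(1\text{M}) > 0.10\,U(5\text{M}) + 0.01\,U(0)$. Subtracting $0.89\,U(0)$ from both sides of the second slide’s comparison

$$0.11\,U(1\text{M}) + 0.89\,U(0) > 0.10\,U(5\text{M}) + 0.90\,U(0)$$

yields exactly the same inequality, so a consistent expected-utility maximiser must answer both slides the same way.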

Decision Theory

• Assume we are able to construct a utility function (or at least get our superior to define one!)
• Enumerate the possible actions
– Use our fused probabilities to weight the utility of the possible outcomes
– Choose the action for which the expected utility of the outcome is greatest
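A minimal sketch of this selection rule (all action names, probabilities and utilities below are invented for illustration):

```python
def best_action(actions, fused_probs, utility):
    """Pick the action with the greatest expected utility, weighting each
    outcome's utility by its fused posterior probability."""
    def expected_utility(action):
        return sum(p * utility(action, h) for h, p in fused_probs.items())
    return max(actions, key=expected_utility)

fused_probs = {"threat": 0.67, "no_threat": 0.33}
utilities = {
    ("engage", "threat"): 50, ("engage", "no_threat"): -20,
    ("hold", "threat"): -100, ("hold", "no_threat"): 5,
}
print(best_action(["engage", "hold"], fused_probs, lambda a, h: utilities[a, h]))
# -> "engage": 0.67*50 + 0.33*(-20) beats 0.67*(-100) + 0.33*5
```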

Timing the decision

• What about timing?
• When should the decision be made?
– If we wait then maybe the (fused) probabilities will be more accurate
– Or the action will be more effective

Explore versus Exploit

• By waiting you can explore the situation
• By stopping you can exploit the situation
• Stopping rules:
– Sequential analysis
– SPRT (the sequential probability ratio test)
– Bayesian optimal stopping
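As a flavour of one such rule, a minimal SPRT sketch (thresholds follow Wald’s classical approximations; the function and parameter names are mine):

```python
import math

def sprt(observations, log_likelihood_ratio, alpha=0.05, beta=0.05):
    """Sequential probability ratio test: accumulate the log-likelihood
    ratio and stop as soon as it crosses either decision threshold."""
    upper = math.log((1 - beta) / alpha)   # accept H1 above this
    lower = math.log(beta / (1 - alpha))   # accept H0 below this
    total = 0.0
    for n, x in enumerate(observations, start=1):
        total += log_likelihood_ratio(x)
        if total >= upper:
            return "H1", n
        if total <= lower:
            return "H0", n
    return "undecided", len(observations)
```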

Experiment with timing

• I will show you 20 numbers
• They are drawn from the same (uniform) distribution
• Select the highest value
• But no going back
• A bit like ¡Allá tú!

Experiment with timing (1–20), revealed one number per slide:

131, 16, 125, 189, 105, 172, 39, 94, 57, 133, 52, 69, 7, 242, 148, 163, 23, 139, 146, 211

The answer…

• How many people chose 242?
• Balance between collecting data on how big the numbers might be (exploration) and actually picking a big number (exploitation)

The 1/e Law (1)

• Consider a rule of the form:
– Observe M and remember the best value (V)
– Observe the remaining N−M and pick the first that exceeds V

The 1/e Law (2)

• It can be shown that the optimum value for M is N/e

• And that for this rule the probability of selecting the maximum is at least 1/e

• Even for huge values of N
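A quick simulation of the N/e rule (a sketch with invented values; the success rate should come out near 1/e ≈ 0.37):

```python
import math
import random

def secretary_trial(n):
    values = [random.random() for _ in range(n)]
    m = round(n / math.e)          # observe the first N/e values...
    best_seen = max(values[:m])    # ...remembering the best value (V)
    for v in values[m:]:
        if v > best_seen:          # pick the first value exceeding V
            return v == max(values)
    return False                   # never exceeded V: the rule fails

n, trials = 20, 100_000
wins = sum(secretary_trial(n) for _ in range(trials))
print(wins / trials)  # close to 1/e ≈ 0.37
```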

Time Pressure (1)

• Individuals tend to make the decision too early

• Committees tend to leave the decision too late

Time Pressure (2)

• Lecturers tend to overrun their time slot!

Time Pressure (3)

• Apologies for skipping over so much of the detail
• Some of the other areas that warrant mention:
– Game theory
– Sensor management
– Graphical models
– Cognitive inertia
– Inattentional blindness

Please feel free to contact me

mark.bedworth@mv-concepts.com

www.mv-concepts.com

Or just come and introduce yourself…

Thank you!

Questions…
