Prediction and Change Detection
Mark Steyvers Scott Brown Mike Yi
University of California, Irvine
This work is supported by a grant from the US Air Force Office of Scientific Research (AFOSR grant number FA9550-04-1-0317)
Perception of Random Sequences
• People perceive too much structure:
– Coin tosses: Gambler’s fallacy
– Sports scoring sequence: Hot hand belief
• Sequences are (mostly) stationary but people perceive non-stationarity
Bias to detect too much change?
Our Approach
• Non-stationary random sequences – changes in parameters over time.
• How well can people make inferences about underlying changes?
• How well can people make predictions about future outcomes?
• Compare data to:
– Bayesian (ideal observer) models
– Descriptive models
Two Tasks
• Inference task: what caused the latest observation?
• Prediction task: what is the next most likely outcome?
[Figure: observed data arise from an unobserved internal state, which also generates future data; example trial-by-trial sequence over pipes A–D: A A A A A B B B D D D D D D A A]
Sequence Generation
• Start with one of four normal distributions
• Draw samples from this distribution
• With probability alpha, switch to a new generating distribution (uniformly chosen)
• Alpha determines number of change points
[Figure: example generated sequence with change points marked]
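The generative process is easy to simulate. Below is a minimal sketch in Python; the function name generate_sequence is ours, and the pipe positions 1–4 and noise sd 0.5 are illustrative stand-ins for the experiment's unstated parameter values.

```python
import numpy as np

def generate_sequence(n_trials, alpha, positions=(1, 2, 3, 4), sd=0.5, seed=0):
    """Simulate a non-stationary sequence with change probability alpha."""
    rng = np.random.default_rng(seed)
    z = rng.integers(len(positions))           # start with one of four distributions
    zs, ys = [], []
    for _ in range(n_trials):
        if rng.random() < alpha:               # with probability alpha, switch to a
            z = rng.integers(len(positions))   # uniformly chosen distribution
        zs.append(z)                           # (possibly the same one again)
        ys.append(rng.normal(positions[z], sd))  # draw a sample (normal noise)
    return np.array(zs), np.array(ys)

states, observations = generate_sequence(n_trials=50, alpha=0.1)
```

Smaller alpha yields fewer change points per block.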
Tomato Cans Experiment
• Cans roll out of pipes A, B, C, or D
• Machine perturbs the position of the cans (normal noise)
• Curtain obscures the sequence of pipes
(real experiment has response buttons and is subject-paced)
[Figure: apparatus with four pipes labeled A–D behind a curtain]
Tasks
• Inference: what pipe produced the last can? A, B, C, or D?
• Prediction: in what region will the next can arrive? 1, 2, 3, or 4?
[Figure: pipes A–D above response regions 1–4]
Experiment 1
• 63 subjects
• 12 blocks
– 6 blocks of 50 trials for inference task
– 6 blocks of 50 trials for prediction task
– Identical trials for inference and prediction
• Alpha = 0.1
Accuracy vs. Number of Perceived Changes
[Figure: two scatter plots (INFERENCE and PREDICTION); x-axis: % changes (subject), y-axis: accuracy (against true); each dot is a subject, with the ideal observer marked]
[Figure (animated build): an example sequence with the ideal observer's and individual subjects' trial-by-trial responses, shown for the inference and prediction tasks]
Exp. 1b
• Alpha = .08, .16, .32
• 136 subjects
• Inference judgments only
Subjects track changes in alpha
[Figure: % accuracy (against true) vs. % changes for the low-, medium-, and high-alpha conditions, with the ideal observer marked in each panel]
Familiarization Trials
• Input pipe changes on each trial with probability alpha
Observed Distributions Match Theory
                         Input Bin
               A              B              C              D
Output Bin   Emp.   Theo.   Emp.   Theo.   Emp.   Theo.   Emp.   Theo.
A            82.1%  82.7%   19.5%  17.2%    0.0%   0.1%    0.0%   0.0%
B            17.9%  17.2%   65.9%  65.6%   13.0%  17.1%    0.0%   0.1%
C             0.0%   0.1%   14.5%  17.1%   68.9%  65.6%   21.1%  17.2%
D             0.0%   0.0%    0.0%   0.1%   18.1%  17.2%   78.9%  82.7%

Note: the mode of each output distribution centers on the input bin
Decision Phase
• Main phase of the experiment uses a closed device
• Inference task: which input pipe was used? A, B, C, or D?
• Prediction task: where will the next ball arrive? A, B, C, or D?
Accuracy vs. Number of Perceived Changes
[Figure: accuracy (against true) vs. % changes for the inference and prediction tasks; 44 subjects]
Main Finding
• Ideal observer: # changes in prediction = # changes in inference
• Subjects: # changes in prediction >> # changes in inference
• Explanation?
Variability Matching
• Example output sequence:
– A B A A B C
• Strategy: match the observed variability in the prediction sequence
• Suboptimal! Part of the observed variability is noise and carries no information for prediction
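A toy calculation of the cost of matching; the probabilities below are illustrative, loosely based on column A of the familiarization table above.

```python
import numpy as np

# Illustrative output-bin probabilities while one pipe is active
p = np.array([0.827, 0.172, 0.001, 0.000])

acc_mode  = p.max()          # always predict the modal bin: ~0.83 expected accuracy
acc_match = (p ** 2).sum()   # match the observed variability: ~0.71 expected accuracy
```

Matching reproduces the look of the sequence, but it sacrifices expected accuracy whenever the outcome distribution is not uniform.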
Conclusion
• Subjects are able to track changes in dynamic decision environments
• Individual differences
– Over-reaction: perceiving too much change
– Under-reaction: perceiving too little change
• More over-reaction in prediction task
Analogy to Hot Hand Belief
• Inference task: does a player have a hot hand?
• Prediction task: will a player make the next shot?
Process Model
• Memory buffer for K samples
• Calculate the probability of the new sample under a normal distribution fit to the buffer
• If this probability falls below τ:
– Assume a change
– Flush the buffer
– Put new sample in buffer
• Inference responses based on buffer mean
• Prediction responses are the same, except the model tries to anticipate changes by making a purely random response on some fraction X of trials
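A sketch of this process model: process_model is our name; the buffer size K, threshold τ, and random-response fraction x are free parameters, and the pipe positions and noise sd are illustrative carry-overs from the generation sketch above.

```python
import numpy as np
from scipy.stats import norm

def process_model(ys, K=8, tau=0.05, x=0.2, sd=0.5, pipes=(1, 2, 3, 4), seed=0):
    """Buffer-based change detection; returns per-trial inference and prediction responses."""
    rng = np.random.default_rng(seed)
    nearest = lambda m: min(pipes, key=lambda p: abs(p - m))  # map buffer mean to a pipe
    buffer = [ys[0]]
    inferences, predictions = [], []
    for y in ys[1:]:
        # probability (density) of the new sample under the buffer's normal;
        # a fixed noise sd is assumed here rather than estimated from the buffer
        if norm.pdf(y, loc=np.mean(buffer), scale=sd) < tau:
            buffer = [y]                       # assume a change: flush, keep new sample
        else:
            buffer = (buffer + [y])[-K:]       # otherwise retain at most K samples
        inferences.append(nearest(np.mean(buffer)))   # inference: buffer mean
        if rng.random() < x:                   # anticipate changes: purely random
            predictions.append(rng.choice(pipes))     # response on a fraction x of trials
        else:
            predictions.append(nearest(np.mean(buffer)))
    return inferences, predictions
```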
[Figure: process-model fits; accuracy (against truth) vs. changes (%) for the inference and prediction tasks, with model and subject values overlaid]
Sweeping Alpha and Sigma in Bayesian Model
[Figure: accuracy (against true) vs. % changes (subject) for the inference and prediction tasks as alpha and sigma are swept in the Bayesian model; each dot is a subject, with the swept ideal-observer models overlaid]
Optimal Prediction Strategy
• Best prediction = last inference
• Example subject:
  inference:          A A B B B D …
  prediction:         A B A B D C …
  shifted inference:  A A B B B D …
• Rescoring each subject's inference judgments, shifted forward one trial, as predictions improves prediction performance for 70% of subjects
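The reanalysis amounts to a one-trial shift and rescore. A toy sketch with made-up response and outcome arrays (the real analysis uses each subject's full trial sequence):

```python
import numpy as np

# Hypothetical responses for one subject (toy data, not the experiment's)
inference  = np.array(list("AABBBD"))   # inference responses on trials 1..6
prediction = np.array(list("ABABDC"))   # predictions made on trials 1..6 about trial t+1
outcome    = np.array(list("ABBBDD"))   # pipe that actually generated trial t+1

actual_acc  = np.mean(prediction == outcome)   # subject's own prediction accuracy
shifted_acc = np.mean(inference == outcome)    # inference at t reused as prediction for t+1
```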
[Figure: inference maps the observed data back to the unobserved internal state; prediction maps the internal state forward to future data. Locus of the gambler's fallacy?]
Generating Model
[Figure: graphical model in which the change probability α generates change points x_1, …, x_t, x_{t+1}, …; the change points gate the distribution parameters z_1, …, z_t, z_{t+1}, …; and the parameters generate the observed data y_1, …, y_t, y_{t+1}, …]

$$x_t \mid \alpha \sim \mathrm{Bernoulli}(\alpha)$$

$$z_t = z_{t-1} \ \text{if } x_t = 0, \qquad z_t \sim \mathrm{Uniform}(\{1, \ldots, P\}) \ \text{if } x_t = 1$$

$$y_t \mid z_t \sim \mathrm{Normal}(z_t, \sigma^2)$$
Bayesian Inference
Given observed sequence y, what are the latent states z and change points x?
This complex posterior distribution cannot be calculated analytically. Use posterior simulation instead: MCMC with Gibbs sampling.
$$P(x, z \mid y) = \frac{P(y \mid x, z)\, P(x, z)}{\sum_{x', z'} P(y \mid x', z')\, P(x', z')}$$
Gibbs Sampling
• Simulate the high-dimensional distribution by sampling lower-dimensional subsets of variables, each conditioned on the values of all the others. Sampling proceeds sequentially until the sampled values approximate the target distribution.
• Use the subset {z_t, x_t, x_{t+1}}
• Why include x_{t+1}? To preserve consistency. For example, suppose that before sampling, z_{t+1} ≠ z_t, and therefore x_{t+1} = 1. If the new sample sets z_t = z_{t+1}, then x_{t+1} needs to be updated.
Gibbs Sampling
• Assume α is a constant (for now)
• The set of variables {z_t, x_t, x_{t+1}} is conditionally dependent only on {y_t, z_{t-1}, z_{t+1}}
• Sample values of {z_t, x_t, x_{t+1}} from:

$$P(z_t, x_t, x_{t+1} \mid y_t, z_{t-1}, z_{t+1}) \propto P(y_t \mid z_t)\, P(z_t \mid x_t, z_{t-1})\, P(z_{t+1} \mid x_{t+1}, z_t)\, P(x_t)\, P(x_{t+1})$$
Gibbs Sampling
$$P(y_t \mid z_t) \propto e^{-(y_t - z_t)^2 / (2\sigma^2)}$$

for the tomato cans experiment; for the plinko experiment, $P(y_t \mid z_t)$ is looked up from a table.

$$P(z_t \mid x_t, z_{t-1}) = \begin{cases} 1 & \text{if } x_t = 0 \text{ and } z_t = z_{t-1} \\ 0 & \text{if } x_t = 0 \text{ and } z_t \neq z_{t-1} \\ 1/P & \text{if } x_t = 1 \end{cases}$$

(P = number of input pipes)

$$P(x_t = 1) = P(x_{t+1} = 1) = \alpha$$
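A sketch of one Gibbs sweep over the blocks {z_t, x_t, x_{t+1}}, using the conditionals above. gibbs_sweep is our name; alpha, sigma, and the P pipe positions 1..P are taken as known, and the sequence endpoints are skipped for brevity.

```python
import numpy as np

def gibbs_sweep(y, z, x, alpha=0.1, sigma=0.5, P=4, rng=np.random.default_rng(0)):
    """One sweep of block-Gibbs updates over {z_t, x_t, x_{t+1}} (endpoints skipped)."""
    px = lambda v: alpha if v == 1 else 1 - alpha                 # Bernoulli prior on x
    pz = lambda zt, xt, zprev: 1.0 / P if xt == 1 else float(zt == zprev)
    py = lambda yt, zt: np.exp(-(yt - zt) ** 2 / (2 * sigma ** 2))  # tomato cans likelihood
    for t in range(1, len(y) - 1):
        cands, probs = [], []
        for zt in range(1, P + 1):            # pipes are labeled 1..P
            for xt in (0, 1):
                for xn in (0, 1):             # xn stands for x_{t+1}
                    w = (py(y[t], zt) * pz(zt, xt, z[t - 1])
                         * pz(z[t + 1], xn, zt) * px(xt) * px(xn))
                    cands.append((zt, xt, xn))
                    probs.append(w)
        probs = np.asarray(probs) / np.sum(probs)
        z[t], x[t], x[t + 1] = cands[rng.choice(len(cands), p=probs)]
    return z, x
```

Repeated sweeps produce samples that approximate $P(x, z \mid y)$; the per-trial mode of the sampled $z_t$ gives the inferred pipe.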
Example comparing the HMM Viterbi algorithm to the Gibbs sampling algorithm
[Figure: three panels over 100 trials: the observed output sequence; the true vs. HMM-inferred input sequence (accuracy 87%); and the true vs. Gibbs-inferred input sequence (accuracy 89%)]
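The HMM half of this comparison can be reproduced with the standard Viterbi recursion over the same generative assumptions. A sketch: viterbi is our name, and the change process is folded into a stay/switch transition matrix.

```python
import numpy as np

def viterbi(y, alpha=0.1, sigma=0.5, P=4):
    """Most probable pipe sequence under the generating model (pipes at positions 1..P)."""
    states = np.arange(1, P + 1)
    # With prob. alpha the pipe is resampled uniformly, so the chain stays
    # with prob. (1 - alpha) + alpha/P and moves to any other pipe with prob. alpha/P.
    logA = np.log(np.full((P, P), alpha / P) + (1 - alpha) * np.eye(P))
    loglik = lambda yt: -(yt - states) ** 2 / (2 * sigma ** 2)  # log emission, up to a constant
    T = len(y)
    delta = np.zeros((T, P))                  # best log score ending in each state
    psi = np.zeros((T, P), dtype=int)         # best-predecessor pointers
    delta[0] = np.log(1.0 / P) + loglik(y[0])     # uniform initial state
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA     # scores[i, j]: state i -> state j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + loglik(y[t])
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):            # trace the best path backwards
        path[t] = psi[t + 1, path[t + 1]]
    return states[path]
```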