Source: pillowlab.princeton.edu/teaching/mathtools16/slides/lec18_BayesianEstim.pdf
Bayesian Estimation & Information Theory
Jonathan Pillow
Mathematical Tools for Neuroscience (NEU 314)Spring, 2016
lecture 18
Bayesian Estimation

Three basic ingredients:

1. Likelihood
2. Prior
3. Loss function $L(\hat\theta, \theta)$: the "cost" of making estimate $\hat\theta$ if the true value is $\theta$

The likelihood and prior jointly determine the posterior; the loss function then fully specifies how to generate an estimate from the data.

The Bayesian estimator is defined as the minimizer of the expected loss ("Bayes' risk"):

$$\hat\theta(m) = \arg\min_{\hat\theta} \int L(\hat\theta, \theta)\, p(\theta|m)\, d\theta$$
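The argmin above can be made concrete with a small numerical sketch: discretize a toy posterior on a grid and evaluate the expected loss for every candidate estimate by brute force. The grid and posterior below are illustrative choices, not from the slides.

```python
import numpy as np

# Sketch: evaluate the Bayesian estimator by brute force on a discretized
# toy posterior (illustrative values, not from the slides).
theta = np.linspace(-5, 5, 1001)
dt = theta[1] - theta[0]
post = np.exp(-0.5 * (theta - 1.0)**2)   # unnormalized Gaussian posterior, mean 1
post /= post.sum() * dt                  # normalize over the grid

def bayes_estimate(loss):
    """argmin over theta_hat of the expected loss (Bayes' risk) under the posterior."""
    risk = [(loss(th, theta) * post).sum() * dt for th in theta]
    return theta[np.argmin(risk)]

# Squared-error loss picks out the posterior mean:
theta_hat = bayes_estimate(lambda th, t: (th - t)**2)
print(theta_hat)   # ≈ 1.0
```

Swapping in a different `loss` lambda yields the other estimators discussed below.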
Typical Loss functions and Bayesian estimators

1. Squared error loss: $L(\hat\theta, \theta) = (\hat\theta - \theta)^2$

We need to find the $\hat\theta$ minimizing the expected loss:

$$E[L] = \int (\hat\theta - \theta)^2\, p(\theta|m)\, d\theta$$

Differentiate with respect to $\hat\theta$ and set to zero:

$$2 \int (\hat\theta - \theta)\, p(\theta|m)\, d\theta = 0 \;\Rightarrow\; \hat\theta(m) = \int \theta\, p(\theta|m)\, d\theta$$

This is the "posterior mean", also known as the Bayes' Least Squares (BLS) estimator.
2. "Zero-one" loss: $L(\hat\theta, \theta) = 1 - \delta(\hat\theta - \theta)$ (equal to 1 unless $\hat\theta = \theta$)

Expected loss:

$$E[L] = \int \big(1 - \delta(\hat\theta - \theta)\big)\, p(\theta|m)\, d\theta = 1 - p(\hat\theta|m)$$

which is minimized by the posterior maximum (or "mode"):

$$\hat\theta(m) = \arg\max_\theta\, p(\theta|m)$$

• known as the maximum a posteriori (MAP) estimate.
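One way to see the mode emerge numerically: replace the delta in the zero-one loss with a narrow window and minimize the resulting risk. The bimodal toy posterior below is an illustrative choice, not from the slides.

```python
import numpy as np

# Sketch: approximate the delta in zero-one loss by a narrow window; the
# risk-minimizing estimate approaches the posterior mode (MAP).
# Toy bimodal posterior, illustrative parameters.
theta = np.linspace(-5, 5, 2001)
dt = theta[1] - theta[0]
post = np.exp(-0.5 * (theta + 1)**2) + 0.5 * np.exp(-2.0 * (theta - 2)**2)
post /= post.sum() * dt

eps = 0.05   # window half-width; true zero-one loss is the limit eps -> 0
risk = [1.0 - post[np.abs(theta - th) < eps].sum() * dt for th in theta]
theta_hat = theta[np.argmin(risk)]

print(theta_hat, theta[np.argmax(post)])   # both sit at the taller mode, near -1
```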
MAP vs. Posterior Mean estimate:

[Figure: a gamma pdf plotted on 0–10, with the posterior maximum (mode) and the posterior mean marked at different locations.]

Note: the posterior maximum and mean are not always the same!
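The gamma example can be checked on a grid: for a gamma pdf the analytic mode is $(k-1)\,s$ and the mean is $k\,s$ (shape $k$, scale $s$), so the two estimators disagree. The shape/scale values below are illustrative.

```python
import numpy as np

# Sketch: MAP (mode) vs BLS (mean) for a gamma-shaped posterior on a grid.
# shape/scale are illustrative; analytic mode = (shape-1)*scale, mean = shape*scale.
shape, scale = 2.0, 1.5
theta = np.linspace(1e-6, 20, 20001)
dt = theta[1] - theta[0]
post = theta**(shape - 1) * np.exp(-theta / scale)   # unnormalized gamma pdf
post /= post.sum() * dt

theta_map = theta[np.argmax(post)]      # posterior mode, ≈ 1.5
theta_bls = (theta * post).sum() * dt   # posterior mean, ≈ 3.0
print(theta_map, theta_bls)
```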
3. "L1" loss: $L(\hat\theta, \theta) = |\hat\theta - \theta|$

Expected loss:

$$E[L] = \int |\hat\theta - \theta|\, p(\theta|m)\, d\theta$$

HW problem: What is the Bayesian estimator for this loss function?
Simple Example: Gaussian noise & prior

1. Likelihood: additive Gaussian noise, $p(m|\theta) = \mathcal{N}(\theta, \sigma^2)$
2. Prior: zero-mean Gaussian, $p(\theta) = \mathcal{N}(0, \sigma_p^2)$
3. Loss function: doesn't matter (all estimators agree here, since a Gaussian posterior's mean, mode, and median coincide)

Posterior distribution (also Gaussian):

$$p(\theta|m) \propto p(m|\theta)\, p(\theta) = \mathcal{N}\!\left(\frac{\sigma_p^2}{\sigma_p^2 + \sigma^2}\, m,\; \frac{\sigma^2 \sigma_p^2}{\sigma^2 + \sigma_p^2}\right)$$

The MAP estimate (= posterior mean) is $\frac{\sigma_p^2}{\sigma_p^2 + \sigma^2}\, m$, with variance $\frac{\sigma^2 \sigma_p^2}{\sigma^2 + \sigma_p^2}$.
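The shrinkage formula for the posterior mean can be verified numerically by multiplying a Gaussian likelihood and prior on a grid; the noise level, prior width, and measurement below are illustrative values.

```python
import numpy as np

# Sketch: numerically check the Gaussian posterior-mean shrinkage formula.
# sigma (noise std), sigma_p (prior std), and m are illustrative values.
sigma, sigma_p, m = 1.0, 2.0, 3.0
theta = np.linspace(-10, 10, 40001)
dt = theta[1] - theta[0]

lik = np.exp(-0.5 * (m - theta)**2 / sigma**2)   # p(m|theta), unnormalized
prior = np.exp(-0.5 * theta**2 / sigma_p**2)     # p(theta), unnormalized
post = lik * prior
post /= post.sum() * dt                          # normalize the posterior

mean_numeric = (theta * post).sum() * dt
mean_analytic = sigma_p**2 / (sigma_p**2 + sigma**2) * m   # shrinkage toward 0
print(mean_numeric, mean_analytic)   # both ≈ 2.4
```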
[Figures: the likelihood $p(m|\theta)$ plotted over $\theta$ and $m$ (axes from −8 to 8), shown across several slides for different noise levels, followed by the prior $p(\theta)$ on the same axes.]
Computing the posterior

likelihood × prior ∝ posterior

[Figure: the likelihood (a function of $\theta$, centered at the measurement $m$) multiplied by the prior (centered at 0) yields the posterior.]
Making a Bayesian Estimate:

likelihood × prior ∝ posterior

[Figure: the posterior peak $m^*$ is shifted from the measurement $m$ toward the prior; this shift is the bias.]
High Measurement Noise: large bias

likelihood × prior ∝ posterior

[Figure: a broad likelihood yields a posterior pulled strongly toward the prior, i.e., a larger bias.]
Low Measurement Noise: small bias

likelihood × prior ∝ posterior

[Figure: a narrow likelihood yields a posterior close to the likelihood, i.e., a smaller bias.]
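The noise-dependence of the bias follows directly from the Gaussian shrinkage factor; a short sketch with illustrative values (not from the slides) makes the trend explicit.

```python
import numpy as np

# Sketch: bias of the posterior-mean estimate grows with measurement noise.
# Gaussian likelihood N(theta, sigma^2), zero-mean Gaussian prior N(0, sigma_p^2).
# All numbers are illustrative.
sigma_p, m = 2.0, 3.0
noise_levels = [0.5, 1.0, 2.0, 4.0]
biases = []
for sigma in noise_levels:
    shrink = sigma_p**2 / (sigma_p**2 + sigma**2)  # shrinkage factor toward 0
    theta_hat = shrink * m                         # posterior mean estimate
    biases.append(m - theta_hat)                   # shift toward the prior mean
    print(f"sigma={sigma}: estimate={theta_hat:.2f}, bias={m - theta_hat:.2f}")
```

The printed bias increases monotonically with the noise level, matching the "large noise ⇒ large bias" picture above.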
Bayesian Estimation:

• Likelihood and prior combine to form the posterior
• The Bayesian estimate is always biased towards the prior (relative to the ML estimate)
Application #1: Biases in Motion Perception

Which grating moves faster? [Figure: two moving gratings presented around a fixation cross.]
Explanation from Weiss, Simoncelli & Adelson (2002):

• Lower contrast means noisier measurements, so the likelihood is broader ⇒ the posterior has a larger shift toward 0 (the prior favors no motion).
• In the limit of a zero-contrast grating, the likelihood becomes infinitely broad ⇒ the percept goes to zero motion.

[Figure: prior, likelihood, and posterior for high- and low-contrast gratings; the broader likelihood yields a posterior shifted further toward 0.]

• Claim: this explains why people actually speed up when driving in fog!
Summary
• 3 ingredients for Bayesian estimation (prior, likelihood, loss)
• Bayes’ least squares (BLS) estimator (posterior mean)
• maximum a posteriori (MAP) estimator (posterior mode)
• accounts for stimulus-quality dependent bias in motion perception (Weiss, Simoncelli & Adelson 2002)