Frequentist versus Bayesian. Glen Cowan, Statistics in HEP, IoP Half Day Meeting, 16 November 2005,...
Frequentist versus Bayesian
Glen Cowan Statistics in HEP, IoP Half Day Meeting, 16 November 2005, Manchester
The Bayesian approach
In Bayesian statistics we can associate a probability with a hypothesis, e.g., a parameter value θ.
Interpret probability of θ as ‘degree of belief’ (subjective).
Need to start with ‘prior pdf’ π(θ); this reflects degree of belief about θ before doing the experiment.
Our experiment has data x → likelihood function L(x|θ).
Bayes’ theorem tells how our beliefs should be updated in light of the data x:
$$p(\theta|x) = \frac{L(x|\theta)\,\pi(\theta)}{\int L(x|\theta')\,\pi(\theta')\,d\theta'}$$
Posterior pdf p(θ|x) contains all our knowledge about θ.
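As a minimal numerical sketch of this update (not part of the talk), the posterior can be evaluated on a grid of θ values. The Gaussian likelihood, the Gaussian prior and all numbers (`x_obs`, `sigma`, `prior_mean`, `prior_width`) are illustrative assumptions.

```python
import numpy as np

def grid_posterior(theta, prior, likelihood):
    """Return the normalized posterior p(theta|x) on a uniform grid."""
    unnorm = likelihood * prior
    dtheta = theta[1] - theta[0]
    return unnorm / (unnorm.sum() * dtheta)

# Hypothetical setup: one measurement x_obs with Gaussian resolution sigma
theta = np.linspace(-5.0, 15.0, 4001)
x_obs, sigma = 5.0, 1.0
prior_mean, prior_width = 3.0, 2.0

likelihood = np.exp(-0.5 * ((x_obs - theta) / sigma) ** 2)         # L(x|theta)
prior = np.exp(-0.5 * ((theta - prior_mean) / prior_width) ** 2)   # pi(theta)

posterior = grid_posterior(theta, prior, likelihood)
dtheta = theta[1] - theta[0]
post_norm = posterior.sum() * dtheta
post_mean = (theta * posterior).sum() * dtheta
print(f"posterior mean = {post_mean:.3f}")
```

For two Gaussians the posterior mean is the precision-weighted average of `prior_mean` and `x_obs`, which gives a quick cross-check of the grid result.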
Case #4: Bayesian method
We need to associate prior probabilities with θ0 and θ1, e.g.,
← based on previous measurement
reflects ‘prior ignorance’, in any case much broader than
Putting this into Bayes’ theorem gives:
posterior ∝ likelihood × prior
Bayesian method (continued)
We then integrate (marginalize) p(θ0, θ1 | x) to find p(θ0 | x):
$$p(\theta_0|x) = \int p(\theta_0, \theta_1|x)\,d\theta_1$$
In this example we can do the integral (rare). We find
Ability to marginalize over nuisance parameters is an important feature of Bayesian statistics.
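When the integral cannot be done in closed form, marginalization can be sketched numerically. The example below is an illustration only: the joint posterior is assumed to be a correlated two-dimensional Gaussian in (θ0, θ1) with unit variances and a hypothetical correlation `rho`; it is not the posterior from the talk.

```python
import numpy as np

# Hypothetical joint posterior p(theta0, theta1 | x): correlated 2D Gaussian
theta0 = np.linspace(-6.0, 6.0, 601)
theta1 = np.linspace(-6.0, 6.0, 601)
d0 = theta0[1] - theta0[0]
d1 = theta1[1] - theta1[0]
T0, T1 = np.meshgrid(theta0, theta1, indexing="ij")

rho = 0.8  # assumed correlation between theta0 and the nuisance parameter theta1
joint = np.exp(-(T0**2 - 2 * rho * T0 * T1 + T1**2) / (2 * (1 - rho**2)))
joint /= joint.sum() * d0 * d1          # normalize on the grid

# Marginalize: integrate the joint posterior over theta1 (axis 1)
marginal0 = joint.sum(axis=1) * d1

norm0 = marginal0.sum() * d0
var0 = (theta0**2 * marginal0).sum() * d0
print(f"marginal norm = {norm0:.4f}, marginal variance = {var0:.4f}")
```

For this assumed Gaussian the marginal in θ0 should again be a unit-variance Gaussian, which the printed variance lets one verify.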
Bayesian Statistics at work: The Troublesome Extraction of the angle α
Stéphane T’JAMPENS
LAPP (CNRS/IN2P3 & Université de Savoie)
J. Charles, A. Hocker, H. Lacker, F.R. Le Diberder, S. T’Jampens, hep-ph/0607246
Digression: Statistics
Statistics tries answering a wide variety of questions → two main (different!) frameworks:
Frequentist: probability about the data (randomness of measurements), given the model
P(data|model)
Hypothesis testing: given a model, assess the consistency of the data with a particular parameter value → 1-CL curve (by varying the parameter value)
[only repeatable events (Sampling Theory)]
Bayesian: probability about the model (degree of belief), given the data
P(model|data) ∝ Likelihood(data,model) × Prior(model)
D.R. Cox, Principles of Statistical Inference, CUP (2006)
W.T. Eadie et al., Statistical Methods in Experimental Physics, NHP (1971)
www.phystat.org
Bayesian Statistics in 1 slide
Bayesian: probability about the model (degree of belief), given the data
P(model|data) ∝ Likelihood(data;model) × Prior(model)
“it treats information derived from data (“likelihood”) as on exactly equal footing with probabilities derived from vague and unspecified sources (“prior”). The assumption that all aspects of uncertainties are directly comparable is often unacceptable.”
“nothing guarantees that my uncertainty assessment is any good for you - I'm just expressing an opinion (degree of belief). To convince you that it's a good uncertainty assessment, I need to show that the statistical model I created makes good predictions in situations where we know what the truth is, and the process of calibrating predictions against reality is inherently frequentist.” (e.g., MC simulations)
Bayes’ rule
The Bayesian approach is based on the use of inverse probability (“posterior”):
Cox – Principles of Statistical Inference (2006)
Uniform prior: model of ignorance?
A central problem: specifying a prior distribution for a parameter about which nothing is known → flat prior
Problems:
Not re-parametrization invariant (metric dependent): uniform in θ is not uniform in z=cos θ
Favors large values too much [the prior probability for the range 0.1 to 1 is 10 times less than for 1 to 10]
Flat priors in several dimensions may produce clearly unacceptable answers.
In simple problems, appropriate* flat priors yield essentially same answer as non-Bayesian sampling theory. However, in other situations, particularly those involving more than two parameters, ignorance priors lead to different and entirely unacceptable answers.
* (uniform prior for scalar location parameter, Jeffreys’ prior for scalar scale parameter)
Cox – Principles of Statistical Inference (2006)
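The two problems listed for the flat prior can be checked with a short simulation. This is a sketch under stated assumptions: the angle is taken uniform on [0, π], the flat prior for the second check is taken on [0, 10], and the bin edges are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1) Uniform in theta is not uniform in z = cos(theta)
theta = rng.uniform(0.0, np.pi, size=200_000)
z = np.cos(theta)
central = np.mean(np.abs(z) < 0.1)   # z-interval of width 0.2 around z = 0
edge = np.mean(z > 0.8)              # z-interval of width 0.2 next to z = 1
print(f"prior mass near z=0: {central:.3f}, near z=1: {edge:.3f}")

# 2) A flat prior on [0, 10] gives each decade very different weight
mass_low = 1.0 - 0.1     # proportional to the prior mass in [0.1, 1]
mass_high = 10.0 - 1.0   # proportional to the prior mass in [1, 10]
ratio = mass_high / mass_low
print(f"mass(1..10) / mass(0.1..1) = {ratio:.1f}")
```

Both z-intervals have the same width, so a prior that was flat in z would give them equal mass; the simulation shows that the prior flat in the angle does not.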
Uniform Prior in Multidimensional Parameter Space
Hypersphere:
6D space
One knows nothing about the individual Cartesian coordinates x,y,z…
What do we know about the radius r =√(x^2+y^2+…) ?
One has achieved the remarkable feat of learning something about the radius of the hypersphere, whereas one knew nothing about the Cartesian coordinates and without making any experiment.
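The hypersphere argument can be illustrated by sampling. The sketch below assumes independent uniform priors on [-1, 1] for each of six Cartesian coordinates (the range and sample size are arbitrary choices) and looks at the implied prior on the radius.

```python
import numpy as np

rng = np.random.default_rng(2)
n_dim = 6  # 6D space, as on the slide

# Flat, independent priors on each Cartesian coordinate (assumed range [-1, 1])
points = rng.uniform(-1.0, 1.0, size=(200_000, n_dim))
r = np.linalg.norm(points, axis=1)

frac_small_x = np.mean(np.abs(points[:, 0]) < 0.5)  # one coordinate: flat
frac_small_r = np.mean(r < 0.5)                     # radius: far from flat
mean_r = r.mean()
print(f"P(|x| < 0.5) = {frac_small_x:.3f}")
print(f"P(r < 0.5)   = {frac_small_r:.4f}")
print(f"mean radius  = {mean_r:.3f}")
```

Each coordinate is equally likely to be small or large, yet small radii are strongly disfavoured: the "ignorance" prior on the coordinates is informative about r.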
Isospin Analysis : B→hh J. Charles et al. – hep-ph/0607246
Gronau/London (1990)
MA: Modulus & Argument
RI: Real & Imaginary
Improper posterior
Isospin Analysis: removing information from B0→π0π0
No model-independent constraint on α can be inferred in this case
Information is extracted on α, which is introduced by the priors (where else?)
Conclusion
Statistics is not a science, it is mathematics (Nature will not decide for us) [You will not learn it in Physics books → go to the professional literature!]
Many attempts to define an “ignorance” prior to “let the data speak by themselves”, but none convincing. Priors are informative.
Quite generally a prior that gives results that are reasonable from various viewpoints for a single parameter will have unappealing features if applied independently to many parameters.
In a multiparameter space, credible Bayesian intervals generally under-cover.
If the problem has some invariance properties, then the prior should have the corresponding structure.
→ specification of priors is fraught with pitfalls (especially in high dimensions).
Examine the consequences of your assumptions (metric, priors, etc.)
Check for robustness: vary your assumptions
Exploring the frequentist properties of the result should be strongly encouraged.
PHYSTAT Conferences:
http://www.phystat.org
α [ππ] : B-factories status LP07
Isospin analysis : reminder
• Neglecting EW penguin, the amplitudes of the SU(2)-related B→ππ modes are:
$$\sqrt{2}\,A^{+0} = \sqrt{2}\,A(B_u \to \pi^+\pi^0) = e^{-i\alpha}\,(T^{+-} + T^{00}), \qquad \sqrt{2}\,\bar{A}^{+0} = e^{+i\alpha}\,(T^{+-} + T^{00})$$
$$A^{+-} = A(B_d \to \pi^+\pi^-) = e^{-i\alpha}\,T^{+-} + P^{+-}, \qquad \bar{A}^{+-} = e^{+i\alpha}\,T^{+-} + P^{+-}$$
$$\sqrt{2}\,A^{00} = \sqrt{2}\,A(B_d \to \pi^0\pi^0) = e^{-i\alpha}\,T^{00} - P^{+-}, \qquad \sqrt{2}\,\bar{A}^{00} = e^{+i\alpha}\,T^{00} - P^{+-}$$
• SU(2) triangular relation : A+0 = A+-/√2 + A00
• Same for B→ρρ decay dominated by longitudinal polarized ρ (CP-even fs)
• B+0 → |A+0| = |Ā+0|
• B+-, C+- → |A+-|, |Ā+-|
• S+- → sin(2αeff) → 2-fold αeff in [0,π]
• B00, C00 → |A00|, |Ā00|
Closing SU(2) triangle → 8-fold α
• S00 → relative phase between A00 & Ā00
[Figure: SU(2) triangles in the complex plane (axes Re, Im) with sides A+0, Ā+0, A00, Ā00, A+-/√2, Ā+-/√2 and angles ΔΦ=2α, ΔΦ=2αeff]
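The triangular relation follows directly from the amplitude decomposition, which can be checked numerically. In this sketch the values of α and of the complex tree and penguin amplitudes (`T_pm`, `T_00`, `P_pm`) are hypothetical numbers chosen only to exercise the algebra.

```python
import cmath
import math

alpha = math.radians(95.0)            # hypothetical value of the angle alpha
T_pm = 1.0 * cmath.exp(0.0j)          # hypothetical tree amplitude T+-
T_00 = 0.4 * cmath.exp(0.6j)          # hypothetical tree amplitude T00
P_pm = 0.3 * cmath.exp(-0.9j)         # hypothetical penguin amplitude P+-

def amplitudes(sign):
    """Return (A+0, A+-, A00); sign=-1 for B, sign=+1 for the CP conjugate."""
    phase = cmath.exp(sign * 1j * alpha)
    a_p0 = phase * (T_pm + T_00) / math.sqrt(2)
    a_pm = phase * T_pm + P_pm
    a_00 = (phase * T_00 - P_pm) / math.sqrt(2)
    return a_p0, a_pm, a_00

a_p0, a_pm, a_00 = amplitudes(-1)
ab_p0, ab_pm, ab_00 = amplitudes(+1)

# SU(2) triangle closes for both B and Bbar: A+0 = A+-/sqrt(2) + A00
closure = abs(a_p0 - (a_pm / math.sqrt(2) + a_00))
closure_bar = abs(ab_p0 - (ab_pm / math.sqrt(2) + ab_00))

# Without EW penguins |A+0| = |Abar+0| and their relative phase is 2*alpha
ratio = ab_p0 / a_p0
print(closure, closure_bar, abs(ratio))
```

The penguin term cancels in the sum A+-/√2 + A00, which is why the charged mode carries the pure phase e^{∓iα}.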
Isospin analysis : reminder
• sin(2αeff) from B → (π/ρ)+ (π/ρ)- → 2 solutions for αeff in [0,π]
• Δα = α-αeff from SU(2) B/Bbar triangles → 1, 2 or 4 solutions for Δα (dep. on triangles closure)
→ 2, 4 or 8 solutions for α = αeff + Δα
[Figure: B and Bbar triangles, axes A00/A+0 and A+-/√2/A+0; regions labelled 4-fold Δα, 2-fold Δα, 1-fold Δα (‘plateau’), 1-fold Δα (peak); panels PiPi (C00 but no S00), RhoRho (no C00/S00), RhoRho (C00 AND S00)]
Developments in Bayesian Priors
Roger Barlow
Manchester IoP meeting
November 16th 2005
Plan
• Probability
– Frequentist
– Bayesian
• Bayes Theorem
– Priors
• Prior pitfalls (1): Le Diberder
• Prior pitfalls (2): Heinrich
• Jeffreys’ Prior
– Fisher Information
• Reference Priors: Demortier
Probability
Probability as limit of frequency
$P(A) = \lim_{N_{total}\to\infty} N_A / N_{total}$
Usual definition taught to students
Makes sense
Works well most of the time, but not all
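The limit-of-frequency definition can be visualized with a simulated repeatable experiment. This is only a sketch: the true probability `p_true` and the number of trials are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
p_true = 0.3                      # assumed true probability of outcome A
n_total = 1_000_000

# Each trial is True when outcome A occurs
outcomes = rng.random(n_total) < p_true

# Running frequency N_A / N_total after each trial
running = np.cumsum(outcomes) / np.arange(1, n_total + 1)

for n in (10, 1_000, 100_000, n_total):
    print(f"N = {n:>9}: N_A/N = {running[n - 1]:.4f}")
```

The running frequency settles towards `p_true` as N grows, which is exactly what cannot be arranged for one-off propositions such as tomorrow's weather.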
Frequentist probability
“It will probably rain tomorrow.”
“Mt=174.3±5.1 GeV means: the top quark mass lies between 169.2 and 179.4, with 68% probability.”
“The statement ‘It will rain tomorrow.’ is probably true.”
“Mt=174.3±5.1 GeV means: the top quark mass lies between 169.2 and 179.4, at 68% confidence.”
Bayesian Probability
P(A) expresses my belief that A is true
Limits 0 (impossible) and 1 (certain)
Calibrated off clear-cut instances (coins, dice, urns)
Frequentist versus Bayesian?
Two sorts of probability – totally different. (Bayesian probability also known as Inverse Probability.)
Rivals? Religious differences? Particle Physicists tend to be frequentists.
Cosmologists tend to be Bayesians
No. Two different tools for practitioners
Important to:
• Be aware of the limits and pitfalls of both
• Always be aware which you’re using
Bayes Theorem (1763)
P(A|B) P(B) = P(A and B) = P(B|A) P(A)
P(A|B) = P(B|A) P(A) / P(B)
Frequentist use, e.g. Čerenkov counter
P(π | signal) = P(signal | π) P(π) / P(signal)
Bayesian use
P(theory | data) = P(data | theory) P(theory) / P(data)
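A small worked example of the frequentist use of Bayes’ theorem, in the spirit of the Čerenkov counter case. All numbers are hypothetical: the particle fractions and the counter's firing probabilities are assumptions for illustration, not values from the talk.

```python
def posterior(prior, likelihood):
    """Bayes' theorem for discrete hypotheses: P(H|signal) for each H."""
    evidence = sum(prior[h] * likelihood[h] for h in prior)  # P(signal)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}

# Hypothetical beam composition: these priors are measurable frequencies
prior = {"pion": 0.8, "kaon": 0.2}

# Hypothetical probability that the counter gives a signal for each species
likelihood = {"pion": 0.95, "kaon": 0.05}

post = posterior(prior, likelihood)
for species, prob in post.items():
    print(f"P({species} | signal) = {prob:.4f}")
```

Here every probability is a frequency that could be measured in principle, so the calculation is uncontroversial; the Bayesian use differs only in that P(theory) is a degree of belief.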
Bayesian Prior
P(theory) is the Prior
Expresses prior belief theory is true
Can be function of parameter:
P(Mtop), P(MH), P(α,β,γ)
Bayes’ Theorem describes way prior belief is modified by experimental data
But what do you take as initial prior?
Uniform Prior
General usage: choose P(a) uniform in a (principle of insufficient reason)
Often ‘improper’: ∫P(a)da = ∞. Though posterior P(a|x) comes out sensible
BUT! If P(a) uniform, P(a²), P(ln a), P(√a)... are not
Insufficient reason not valid (unless a is ‘most fundamental’ – whatever that means)
Statisticians handle this: check results for ‘robustness’ under different priors
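A robustness check of this kind can be sketched for a simple counting experiment. The model is an assumption made for illustration: a Poisson rate a with `n_obs` observed events, and priors that are uniform in a, a², √a or ln a, i.e. P(a) ∝ a^(-k) with k = 0, -1, 1/2, 1.

```python
import numpy as np

def posterior_mean(n_obs, prior_power, a_max=60.0, n_grid=60_001):
    """Posterior mean of a Poisson rate a for a prior P(a) ∝ a**(-prior_power)."""
    a = np.linspace(1e-9, a_max, n_grid)
    # log of likelihood * prior, dropping a-independent constants
    log_post = (n_obs - prior_power) * np.log(a) - a
    w = np.exp(log_post - log_post.max())
    return float((a * w).sum() / w.sum())

n_obs = 5  # hypothetical observed count
priors = {
    "uniform in a": 0.0,
    "uniform in a^2": -1.0,
    "uniform in sqrt(a)": 0.5,
    "uniform in ln a": 1.0,
}
means = {name: posterior_mean(n_obs, k) for name, k in priors.items()}
for name, mean in means.items():
    print(f"{name:>20}: posterior mean = {mean:.3f}")
```

All four priors are improper, yet each posterior is normalizable; the spread of the posterior means shows how much the answer depends on which variable the prior is taken to be flat in.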