some aspects of statistics at lhc

69
October 2013 Some aspects of statistics at LHC N.V.Krasnikov INR, Moscow

Upload: zenda

Post on 05-Jan-2016

29 views

Category:

Documents


0 download

DESCRIPTION

Some aspects of statistics at LHC. N.V.Krasnikov INR, Moscow. Outline. 1. Introduction 2. Parameters estimation 3. Confidence intervals 4. Systematics 5. Upper limits at LHC 6. Conclusion. 1. Introduction The statistical model of an analysis - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Some aspects of statistics at LHC

October 2013

Some aspects of statistics at LHC

N.V.Krasnikov

INR, Moscow

Page 2: Some aspects of statistics at LHC

October 2013

1. Introduction

2. Parameters estimation

3. Confidence intervals

4. Systematics

5. Upper limits at LHC

6. Conclusion

Outline

Page 3: Some aspects of statistics at LHC

October 2013

1. Introduction

The statistical model of an analysisprovides the complete description of thatanalysisThe main problem – from the known probability density and x = xobs to extract some information

on θ parameterTwo approaches 1.Frequentist method2. Bayesian methodAlso very important – the notion of the likelihood

Page 4: Some aspects of statistics at LHC

October 2013

Likelihood - the probability density evaluated at the observed value x=xobs

Page 5: Some aspects of statistics at LHC

Frequents statistics – general philosophy

In frequentist statistics, probabilities are associated only with data, i.e. outcomes of

repeatable observations. The preferred models are those for which our observations

have non small probabilities

October 2013

Page 6: Some aspects of statistics at LHC

October 2013

Page 7: Some aspects of statistics at LHC

Bayesian statistics – general philosophy

In Bayesian statistics , interpretation of probability is extended to degree of

belief(subjective probability). Bayesian methods can provide more natural treatment

of non repeatable phenomena :

systematic uncertainties

October 2013

Page 8: Some aspects of statistics at LHC

October 2013

Parameters estimation

Maximum likelihood principle

Page 9: Some aspects of statistics at LHC

October 2013

Normal distribution

Page 10: Some aspects of statistics at LHC

October 2013

Bayesian method

• In Bayes approach

For flat prior π(θ) = const

Bayes and likelihood coincide

Page 11: Some aspects of statistics at LHC

October 2013

Confidence intervals

Suppose we measure x = xobs

• What are possible values of θ parameter?

• Frequentist answer:

Neyman belt construction

Alternative:

Bayes credible interval

Page 12: Some aspects of statistics at LHC

October 2013

Neyman belt construction

(1-α) – confidence level. The choice of x1 and x2

is not unique

Page 13: Some aspects of statistics at LHC

Neyman belt construction

October 2013

Page 14: Some aspects of statistics at LHC

October 2013

Neyman belt construction

Page 15: Some aspects of statistics at LHC

October 2013

Neyman belt construction

• For normal distribution Neyman belt equations for lower limit lead to

Page 16: Some aspects of statistics at LHC

October 2013

Neyman belt equations

Page 17: Some aspects of statistics at LHC

October 2013

Maximal likelihood Approximate estimate

For normal distribution

Page 18: Some aspects of statistics at LHC

October 2013

Bayes approach

Bayes theorem

P(A|B)*P(B) = P(B|A)*P(B)

P(A|B) – conditional probability

Page 19: Some aspects of statistics at LHC

October 2013

Bayes approach• Due to Bayes formula

the statistics problem is reduced to the

probability problem

Page 20: Some aspects of statistics at LHC

October 2013

Bayes approach• The main problem – prior function π(θ) is

not known • For what prior frequentist and Bayes approaches

coincide?

Page 21: Some aspects of statistics at LHC

October 2013

The relation between Bayes and frequentist approaches

• Two examples

1.Example A

Page 22: Some aspects of statistics at LHC

October 2013

The relation between Bayes and frequentist approaches

2.Example B

Page 23: Some aspects of statistics at LHC

October 2013

Parameter determination with additional constraint

• Consider the case of normal distribution

with additional constraint

Maximum likelihood method gives

Page 24: Some aspects of statistics at LHC

October 2013

Parameter determination with additional constraint

• How to construct the confidence interval for the μ parameter?

Cousins, Feldman method• Maximum of

Page 25: Some aspects of statistics at LHC

October 2013

Neyman belt construction

• The ordering principle on • As a consequence we find

Page 26: Some aspects of statistics at LHC

October 2013

Likelihood method

• For x0 < 0

• or

Page 27: Some aspects of statistics at LHC

October 2013

Likelihood principle

• For x0 > 0

• or

Page 28: Some aspects of statistics at LHC

October 2013

Bayes approach

• We choose π(μ) = θ(μ)

So prior function is zero for negative μ

automatically• The equation for the credible interval

determination

Page 29: Some aspects of statistics at LHC

October 2013

Confidence intervals for Poisson distribution

• The generalization of Neyman belt construction is

Klopper-Pearson interval

Page 30: Some aspects of statistics at LHC

October 2013

Poisson distribution

• The Kloper-Pearson interval is conservative and it does not have the coverage property. Coverage is the probability that interval covers true value with the probability Besides for

So we have negative probability - contradiction

Page 31: Some aspects of statistics at LHC

October 2013

Stevens interval

• To overcome these problems Stevens (1952) suggested to introduce new random variable U. Modified equations are

Page 32: Some aspects of statistics at LHC

October 2013

Stevens equations• One can derive Stevens equations using the

regularization of discrete Poisson distribution(S.Bityukov,N.V.K). Namely let us introduce Poisson generalization

• The integral

• is not well defined

Page 33: Some aspects of statistics at LHC

October 2013

Stevens interval

Let us introduce the regularization

Page 34: Some aspects of statistics at LHC

October 2013

Stevens interval

• We can use Neyman belt construction for regularized distribution and we find

Page 35: Some aspects of statistics at LHC

Stevens intervalIn the limit of the regularization removement

we find

October 2013

Page 36: Some aspects of statistics at LHC

October 2013

Likelihood method

• The use of likelihood method gives

• The solution is

Page 37: Some aspects of statistics at LHC

October 2013

Likelihood method

Page 38: Some aspects of statistics at LHC

October 2013

Bayes approach

• The basic equations are

• Due to identities

Page 39: Some aspects of statistics at LHC

October 2013

Bayes approach

Upper Klopper-Pearson limit coincides

with Bayesian limit for flat prior and lower limit corresponds to prior

The Stevens equations for non

dependent on λ

are equivalent to Bayes approach with prior

function

Page 40: Some aspects of statistics at LHC

October 2013

Uncertainties in extraction of an upper limit

Page 41: Some aspects of statistics at LHC

October 2013

Modified frequentist definition• We require(S.Bityukov,N.V.K.,2012) that

• Our definition is equivalent to Bayes • approach with prior function

Page 42: Some aspects of statistics at LHC

October 2013

Signal extraction for nonzero background

• Consider the case

• Cousins-Feldman method

Page 43: Some aspects of statistics at LHC

October 2013

Nonzero background

• Likelihood ordering

Plus Neyman construction

Page 44: Some aspects of statistics at LHC

October 2013

CLS method(T.Junk,A.Read)

• Upper bound

• CLS method

• In Bayes approach it corresponds to the replacement

Page 45: Some aspects of statistics at LHC

October 2013

Bayes method

• For flat prior

• We can interprete this formula in terms of conditional probability

Page 46: Some aspects of statistics at LHC

October 2013

Bayes method

• Namely the probability that parameter λ lies in the interval [λ, λ+dλ] provided λ≥λb is determined by the formula

that coincides with the previous Bayes formula

Page 47: Some aspects of statistics at LHC

October 2013

Systematics

• 1. Systematics that can be eliminated by the measurement of some variables in other kinematic region

• 2. Uncertainties related with nonexact accuracy in determination of particle momenta, misidentification...

• 3. Uncertainties related with nonexact knowledge of theoretical cross sections

Page 48: Some aspects of statistics at LHC

October 2013

Systematics• 3 methods to deal with systematics(at least)

1. Suppose we measure some events in

two kinematic regions with distribution

functions + ,

The random variable Z = X-Y obeys normal distribution As a consequence we find

Page 49: Some aspects of statistics at LHC

October 2013

Systematics

• For Poisson distributions

and

due to identity

Page 50: Some aspects of statistics at LHC

October 2013

Systematics

• The problem is reduced to the determination of the ρ parameter from experimental data

Page 51: Some aspects of statistics at LHC

October 2013

Systematics

2.Bayesian treatment or Cousins-Highland

method is based on integration over nonessential variables

For normal distributions and flat prior we find

Page 52: Some aspects of statistics at LHC

October 2013

Systematics

• In other words the main effect is the replacement

and the significance is

So for normal distribution this method coincides with the first method

Page 53: Some aspects of statistics at LHC

October 2013

Systematics

• Profile likelihood method

Suppose likelihood function

depends on nonessential variables θ

and essential variables λ

Profile likelihood

Page 54: Some aspects of statistics at LHC

October 2013

Profile likelihood

New variable(statistics)

• Per construction • For new statistics defines probability density

Page 55: Some aspects of statistics at LHC

October 2013

Profile likelihood

• For normal distributions profile likelihood method coincides with the Cousins-Highland method

• Very often p-value is used• By definition

p-value determines the agreement of data with amodel• Small p-value(p < 5.9*10-7) - the model is

excluded by experimental data

Page 56: Some aspects of statistics at LHC

October 2013

P-value

• For Poisson distribution p-value definition is

Page 57: Some aspects of statistics at LHC

October 2013

Limits on new physics at LHC

For the Higgs boson search CMS and ATLAS introduce the extended model

with additional μ parameter and the replacement cross section the same. The case μ =1 corresponds to SM. The case μ=0 corresponds to the absence of the SM Higgs boson.

The likelihood of the general model can be written in the form

Page 58: Some aspects of statistics at LHC

October 2013

Likelihood of the model

Here is the probability density of nonessential parameters . Usually

Is taken as normal or lognormal distribution

Page 59: Some aspects of statistics at LHC

October 2013

Bayes approach

• In Bayes approach the use of the formula

• allows to determine the probability density for μ parameter. Upper limit μup is detemined as

Usually α= 0.05

Page 60: Some aspects of statistics at LHC

October 2013

Frequentist approach

• CMS and ATLAS use statistics

Often modifications are used with additional conditions as

Page 61: Some aspects of statistics at LHC

October 2013

Frequentist approach

Very often the hypothesis μ=0 is tested

against μ>0. For such case it is convenient to use

For single Poisson

Page 62: Some aspects of statistics at LHC

October 2013

Single Poisson

• By construction q0≥0 and

In the limit nobs»1 the

probability density is

Page 63: Some aspects of statistics at LHC

October 2013

Upper limits

• To derive upper limits the statistics

is used. For single Poisson

Page 64: Some aspects of statistics at LHC

Higgs boson search at CMS

As an illustration consider the Higgs boson search at CMS detector

October 2013

Page 65: Some aspects of statistics at LHC

October 2013

P-value for Higgs boson search

Page 66: Some aspects of statistics at LHC

October 2013

Page 67: Some aspects of statistics at LHC

Summary of Higgs boson measurements

October 2013

Page 68: Some aspects of statistics at LHC

October 2013

Page 69: Some aspects of statistics at LHC

Conclusions

Experiments CMS and ATLAS use both frequentist and Bayesian

methods to extract the parameters of Higgs boson and limits on new

physics. As a rule they give numerically similar results

October 2013