K. Desch – Statistical methods of data analysis SS10
2. Probability 2.3 Joint p.d.f.'s of several random variables
Example: an experiment yields several simultaneous measurements (e.g. temperature and pressure)
Joint p.d.f. (here only for 2 variables):

f(x,y) dx dy = probability that x ∈ [x, x+dx] and y ∈ [y, y+dy]

Normalization:

∫∫ f(x,y) dx dy = 1

Individual probability distribution ("marginal p.d.f.") for x and y:

f_x(x) = ∫ f(x,y) dy
f_y(y) = ∫ f(x,y) dx

yields the probability density for x (or y) independent of y (or x)

x and y are statistically independent if

f(x,y) = f_x(x) f_y(y)

equivalently: f(x | y) = f_x(x) for any y (and vice versa)
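The normalization and marginalization relations can be checked numerically. A minimal sketch with NumPy, assuming an independent two-dimensional Gaussian as an illustrative joint p.d.f. (not from the slides):

```python
import numpy as np

# Discretize an independent joint p.d.f. f(x,y) = f_x(x) * f_y(y)
# (standard Gaussians, purely illustrative) and check that
# integrating out y recovers the marginal f_x numerically.
x = np.linspace(-5, 5, 201)
y = np.linspace(-5, 5, 201)
dx = x[1] - x[0]
dy = y[1] - y[0]

fx = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
fy = np.exp(-y**2 / 2) / np.sqrt(2 * np.pi)
f = np.outer(fx, fy)                      # f(x,y) on the grid

norm = f.sum() * dx * dy                  # approximates the double integral
fx_marginal = f.sum(axis=1) * dy          # approximates integral of f(x,y) dy

print(round(norm, 3))                     # 1.0
print(np.max(np.abs(fx_marginal - fx)) < 1e-6)  # True
```

For an independent joint p.d.f. the marginal computed from the 2-d grid coincides with the 1-d density we started from, as the last check confirms.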
Conditional p.d.f.'s:

h(y|x) dy is the probability for the event to lie in the interval [y, y+dy] when the event is known to lie in the interval [x, x+dx]:

h(y | x) = f(x,y) / f_x(x)
g(x | y) = f(x,y) / f_y(y)
Example: measurement of the length of a bar and its temperature

x = deviation from 800 mm, y = temperature in °C

a) 2-dimensional histogram ("scatter plot")
b) Marginal distribution of y ("y-projection")
c) Marginal distribution of x ("x-projection")
d) Two conditional distributions of x (see the bands in (a))

The width in d) is smaller than in c) → x and y are "correlated"
Expectation value (analogous to the 1-dim. case):

E[a(x⃗)] = ∫ a(x⃗) f(x⃗) dx_1 … dx_n

Variance (analogous to the 1-dim. case):

V[a(x⃗)] = ∫ (a(x⃗) − μ_a)² f(x⃗) dx_1 … dx_n

Covariance

Important when there is more than one variable: a measure of the correlation of the variables. For 2 variables x, y with joint p.d.f. f(x,y):

cov[x,y] = V_xy := E[(x − μ_x)(y − μ_y)] = E[xy] − μ_x μ_y = ∫∫ xy f(x,y) dx dy − μ_x μ_y

If x, y are statistically independent (f(x,y) = f_x(x) f_y(y)) then cov[x,y] = 0 (but not vice versa!!)
Positive correlation: a positive (negative) deviation of x from its mean μ_x increases the probability that y has a positive (negative) deviation from its mean μ_y.
For the sum of random variables x+y: V[x+y] = V[x] + V[y] + 2 cov[x,y] (proof: linearity of E[·])
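This identity is easy to verify by Monte Carlo. A small sketch; the Gaussian samples and the 0.5 coupling between x and y are illustrative choices, not from the slides:

```python
import numpy as np

# Monte Carlo check of V[x+y] = V[x] + V[y] + 2 cov[x,y]
# using correlated samples (illustrative construction).
rng = np.random.default_rng(1)
x = rng.normal(size=1_000_000)
y = 0.5 * x + rng.normal(size=1_000_000)   # y correlated with x

lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y)[0, 1]
print(abs(lhs - rhs) < 1e-2)               # True
```

Because cov[x,y] > 0 here, V[x+y] comes out larger than V[x] + V[y] alone, which is exactly the effect of the cross term.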
For n random variables x_i, i = 1,…,n:

cov[x_i, x_j] = V_{x_i x_j}

is the covariance matrix (a symmetric matrix).

Diagonal elements: cov[x_i, x_i] = V[x_i] = σ_i²

For uncorrelated variables the covariance matrix is diagonal.

Normalized quantity:

ρ_ij := cov[x_i, x_j] / (σ_i σ_j)

is the correlation coefficient.
Examples of correlation coefficients (the axis units play no role!)
one more example:
[Barlow]
another example:
2. Probability 2.4 Transformation of variables
Measured quantity: x (distributed according to pdf f(x))
Derived quantity: y = a(x). What is the p.d.f. of y, g(y)?
Define g(y) by requiring the same probability for the corresponding intervals (figure: the map a(x) takes [x, x+dx] =: dS to [y, y+dy]):

g(y) dy = f(x) dx

⇒ g(y) = f(x(y)) |dx/dy|
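The transformation rule can be checked numerically. A sketch, assuming the illustrative one-to-one map y = exp(x) with standard-Gaussian x (so |dx/dy| = 1/y); the grid points and sample size are arbitrary choices:

```python
import numpy as np

# Check g(y) = f(x(y)) |dx/dy| for the one-to-one map y = exp(x),
# x standard Gaussian: x(y) = log(y), dx/dy = 1/y.
rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
y = np.exp(x)

f = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)
grid = np.array([0.5, 1.0, 2.0])
g = f(np.log(grid)) / grid                  # f(x(y)) * |dx/dy|

counts, edges = np.histogram(y, bins=200, range=(0.01, 5.0))
dens = counts / (len(y) * (edges[1] - edges[0]))   # empirical density
centers = 0.5 * (edges[:-1] + edges[1:])
est = np.interp(grid, centers, dens)
print(np.allclose(est, g, rtol=0.1))        # True
```

The histogram of the transformed samples agrees with f(x(y))|dx/dy| within the statistical precision of the sample.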
More tedious when x → y is not a 1-to-1 relation, e.g. y(x) = x²: then x = ±√y, i.e. two branches, x > 0 and x < 0.

For g(y), sum up the probabilities for x > 0 and x < 0:

g(y) = [f(+√y) + f(−√y)] · 1/(2√y)     (|dx/dy| = 1/(2√y) on each branch)
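The two-branch recipe can also be verified by Monte Carlo. With standard-Gaussian x (an illustrative choice), y = x² follows the chi-squared distribution for 1 degree of freedom:

```python
import numpy as np

# Two-branch case y = x**2: g(y) = [f(+sqrt(y)) + f(-sqrt(y))] / (2 sqrt(y)).
# With x standard Gaussian this is the chi-squared density for 1 d.o.f.
rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
y = x**2

f = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)
grid = np.array([0.5, 1.0, 2.0])
g = (f(np.sqrt(grid)) + f(-np.sqrt(grid))) / (2 * np.sqrt(grid))

counts, edges = np.histogram(y, bins=400, range=(0.0, 8.0))
dens = counts / (len(y) * (edges[1] - edges[0]))
centers = 0.5 * (edges[:-1] + edges[1:])
est = np.interp(grid, centers, dens)
print(np.allclose(est, g, rtol=0.1))        # True
```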
Functions of more variables: y⃗ = a⃗(x⃗)

Transformation through the Jacobian matrix:

g(y⃗) = f(x⃗(y⃗)) |det J|

J = ( ∂x_1/∂y_1 … ∂x_1/∂y_n )
    (        …             )
    ( ∂x_n/∂y_1 … ∂x_n/∂y_n )
Example: Gaussian momentum distribution. Momentum in x and y:

f(x) ∝ e^(−x²/2),  f(y) ∝ e^(−y²/2)  ⇒  f(x,y) ∝ e^(−(x²+y²)/2)

Polar coordinates:

x = r cos φ
y = r sin φ
r² := x² + y²

J = ( ∂x/∂r  ∂x/∂φ ) = ( cos φ  −r sin φ )
    ( ∂y/∂r  ∂y/∂φ )   ( sin φ   r cos φ )

det J = r → g(r,φ) = f( x(r,φ), y(r,φ) ) · |det J| ∝ r e^(−r²/2)

In 3 dimensions → Maxwell distribution
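The det J = r factor can be seen directly in a simulation. Sampling x and y as independent standard Gaussians (illustrative normalization), r = √(x²+y²) should follow the Rayleigh density r·e^(−r²/2):

```python
import numpy as np

# Check det J = r for the polar transform: with (x, y) independent
# standard Gaussians, r = sqrt(x^2 + y^2) follows g(r) = r exp(-r^2/2).
rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
y = rng.normal(size=1_000_000)
r = np.hypot(x, y)

grid = np.array([0.5, 1.0, 2.0])
g = grid * np.exp(-grid**2 / 2)

counts, edges = np.histogram(r, bins=200, range=(0.0, 6.0))
dens = counts / (len(r) * (edges[1] - edges[0]))
centers = 0.5 * (edges[:-1] + edges[1:])
est = np.interp(grid, centers, dens)
print(np.allclose(est, g, rtol=0.1))        # True
```

Without the Jacobian factor r the density would wrongly peak at r = 0; the factor is what pushes the maximum out to r = 1.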
2. Probability 2.5 Error propagation
Often, one is not interested in the complete transformation of the p.d.f. but only in the transformation of its variance (= squared error): measured error of x → derived error of y.
When σ_x is small relative to the curvature of y(x):

→ linear approach

y(x⃗) ≈ y(μ⃗) + Σ_{i=1}^n (∂y/∂x_i)|_{x⃗=μ⃗} (x_i − μ_i)

E[y(x⃗)] ≈ y(μ⃗)

What about the variance?
Variance:

V[y(x⃗)] = E[ (y(x⃗) − y(μ⃗))² ]

≈ E[ ( Σ_{i=1}^n (∂y/∂x_i)|_{x⃗=μ⃗} (x_i − μ_i) )² ]

= E[ Σ_{i=1}^n Σ_{j=1}^n (∂y/∂x_i)|_{x⃗=μ⃗} (∂y/∂x_j)|_{x⃗=μ⃗} (x_i − μ_i)(x_j − μ_j) ]

→ σ_y² = V[y] ≈ Σ_{i=1}^n Σ_{j=1}^n (∂y/∂x_i)(∂y/∂x_j) V_ij
For more variables y_k:

cov[y_k, y_l] ≈ Σ_{i=1}^n Σ_{j=1}^n (∂y_k/∂x_i)(∂y_l/∂x_j) cov[x_i, x_j]

→ general formula for error propagation (in linear approximation)

Special cases:

a) uncorrelated x_i:

V[y(x⃗)] ≈ Σ_{i=1}^n (∂y/∂x_i)²|_{x⃗=μ⃗} σ_{x_i}²

and

cov[y_k, y_l] ≈ Σ_{i=1}^n (∂y_k/∂x_i)(∂y_l/∂x_i)|_{x⃗=μ⃗} σ_{x_i}²

Even if the x_i are uncorrelated, the y_i are in general correlated.
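In matrix form the general formula is V_y = A V_x Aᵀ, with A_ki = ∂y_k/∂x_i evaluated at the mean. A sketch with an illustrative map y₁ = x₁ + x₂, y₂ = x₁·x₂ (our choice, not from the slides), which also shows the last remark in action:

```python
import numpy as np

# Linear error propagation in matrix form: V_y = A V_x A^T.
# Illustrative map: y1 = x1 + x2, y2 = x1 * x2, uncorrelated inputs.
mu = np.array([2.0, 3.0])
Vx = np.diag([0.04, 0.09])                 # uncorrelated x1, x2

A = np.array([[1.0,   1.0],                # dy1/dx1, dy1/dx2
              [mu[1], mu[0]]])             # dy2/dx1, dy2/dx2
Vy = A @ Vx @ A.T

print(round(Vy[0, 0], 6))   # 0.13  (= 0.04 + 0.09, errors in quadrature)
print(round(Vy[0, 1], 6))   # 0.3   (nonzero: y1 and y2 come out correlated)
```

Even though V_x is diagonal here, the off-diagonal element of V_y is nonzero: both outputs depend on the same inputs.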
b) Sum y = x_1 + x_2 →

σ_y² = σ_{x_1}² + σ_{x_2}²

errors add in quadrature

c) Product y = x_1 · x_2 →

(σ_y / μ_y)² = (σ_{x_1} / μ_{x_1})² + (σ_{x_2} / μ_{x_2})²

relative errors add in quadrature

(Both assume x_1 and x_2 are uncorrelated!)
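A quick Monte Carlo check of both special cases; the means, widths, and sample size are illustrative choices:

```python
import numpy as np

# Monte Carlo check of the two special cases for uncorrelated x1, x2:
# sum     -> absolute errors add in quadrature,
# product -> relative errors add in quadrature (linear approximation).
rng = np.random.default_rng(2)
x1 = rng.normal(10.0, 0.2, size=1_000_000)
x2 = rng.normal(5.0, 0.1, size=1_000_000)

s = x1 + x2
p = x1 * x2

rel = np.sqrt((0.2 / 10.0)**2 + (0.1 / 5.0)**2)   # predicted relative error of p
print(abs(np.std(s) - np.hypot(0.2, 0.1)) < 1e-3)   # True
print(abs(np.std(p) / np.mean(p) - rel) < 1e-3)     # True
```

The product rule is only the linear approximation; it works well here because the relative errors (2% each) are small.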
2. Probability 2.5 Convolution
Convolution:

Typical case: a probability distribution involves two random variables x, y through a sum w = x + y; w is also a random variable.

Example: x: Breit-Wigner resonance, y: experimental resolution (Gaussian)

What is the p.d.f. for w when f_x(x) and f_y(y) are known?

f_w(w) = ∫∫ f_x(x) f_y(y) δ(w − x − y) dx dy
       = ∫ f_x(x) f_y(w − x) dx
       = ∫ f_y(y) f_x(w − y) dy
       =: (f_x ⊗ f_y)(w)
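The Breit-Wigner ⊗ Gaussian case can be evaluated numerically on a grid; a sketch where the widths γ = 1, σ = 0.5 and the grid are illustrative choices:

```python
import numpy as np

# Numerical convolution f_w = f_x (x) f_y: a Breit-Wigner (Cauchy) line
# shape smeared by a Gaussian resolution, as in the example above.
gamma, sigma = 1.0, 0.5
w = np.linspace(-20, 20, 2001)
dw = w[1] - w[0]

fx = (gamma / np.pi) / (w**2 + gamma**2)            # Breit-Wigner
fy = np.exp(-w**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

fw = np.convolve(fx, fy, mode="same") * dw          # int f_x(x) f_y(w-x) dx

print(abs(fw.sum() * dw - 1.0) < 0.05)              # True (normalized up to tails)
print(fw[1000] < fx[1000])                          # True: smearing lowers the peak
```

The smeared curve keeps the Breit-Wigner tails but has a lower, broader peak; the residual normalization deficit comes only from the Cauchy tails outside the grid.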
3. Distributions
Important probability distributions
- Binomial distribution
- Poisson distribution
- Gaussian distribution
- Cauchy (Breit-Wigner) distribution
- Chi-squared distribution
- Landau distribution
- Uniform distribution
Central limit theorem
3. Distributions 3.1 Binomial distribution
The binomial distribution appears when each trial has exactly two possible outcomes (success-failure, head-tail, even-odd, …)
event "success": A, with probability p = P(A)
event "failure": Ā, with probability q = 1 − p = P(Ā)
Example: (ideal) coins
Probability for “head” (A) = p = 0.5, q=0.5
Probability for n = 4 trials to get "head" (A) k times?

k=0: P = (1−p)⁴ = 1/16
k=1: P = p(1−p)³ times the number of combinations (HTTT, THTT, TTHT, TTTH) = 4·1/16 = 1/4
k=2: P = p²(1−p)² times (HHTT, HTTH, TTHH, HTHT, THTH, THHT) = 6·1/16 = 3/8
k=3: P = p³(1−p) times (HHHT, HHTH, HTHH, THHH) = 4·1/16 = 1/4
k=4: P = p⁴ = 1/16

P(0)+P(1)+P(2)+P(3)+P(4) = 1/16 + 1/4 + 3/8 + 1/4 + 1/16 = 1  ok
Number of permutations for k successes in n trials — the binomial coefficient:

(n choose k) = n! / (k!(n−k)!)

Binomial distribution:

f(k; n, p) = (n choose k) p^k (1−p)^(n−k)

- Discrete probability distribution
- Random variable: k
- Depends on 2 parameters: n (number of trials) and p (success probability)
- The order in which the k successes appear plays no role
- The n trials must be independent
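As a check of the formula, a minimal sketch in plain Python reproducing the ideal-coin table above (n = 4, p = 0.5):

```python
from math import comb

# Binomial p.m.f. f(k; n, p) = C(n, k) p^k (1-p)^(n-k).
def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

probs = [binom_pmf(k, 4, 0.5) for k in range(5)]
print(probs)          # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(sum(probs))     # 1.0
```

These are exactly the values 1/16, 1/4, 3/8, 1/4, 1/16 obtained by counting coin sequences.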
3. Distributions 3.1 Binomial distribution (properties)
Normalization:

Σ_{k=0}^n f(k;n,p) = Σ_{k=0}^n (n choose k) p^k (1−p)^(n−k) = (p + (1−p))^n = 1

Expectation value (mean value):

E[k] = Σ_k k f(k;n,p) = np

Proof:

Σ_{k=0}^n k [n!/(k!(n−k)!)] p^k (1−p)^(n−k)
= np Σ_{k=1}^n [(n−1)!/((k−1)!(n−k)!)] p^(k−1) (1−p)^(n−k)
= np Σ_{k'=0}^{n'} [n'!/(k'!(n'−k')!)] p^(k') (1−p)^(n'−k') = np     (with n' = n−1, k' = k−1)
Variance:

V[k] = σ_k² = Σ_k (k − μ)² f(k;n,p) = np(1−p)

Proof:

E[k(k−1)] = Σ_{k=0}^n k(k−1) [n!/(k!(n−k)!)] p^k (1−p)^(n−k)
= n(n−1) p² Σ_{k=2}^n [(n−2)!/((k−2)!(n−k)!)] p^(k−2) (1−p)^(n−k)
= n(n−1) p² Σ_{k'=0}^{n'} [n'!/(k'!(n'−k')!)] p^(k') (1−p)^(n'−k') = n(n−1) p²     (with n' = n−2, k' = k−2)

However:

V[k] = E[k²] − E[k]² = E[k(k−1)] + E[k] − E[k]²
= n(n−1)p² + np − n²p² = np(1−p)
HERA-B experiment muon spectrometer

12 chambers; the efficiency of one chamber is ε = 95%. Trigger condition: at least 11 out of 12 chambers hit.

ε_TOTAL = P(11; 12, 0.95) + P(12; 12, 0.95) = 88.2%

When the chambers reach only ε = 90%: ε_TOTAL = 65.9%

When one chamber fails (all 11 remaining must fire): ε_TOTAL = P(11; 11, 0.95) = 56.9%

Random coincidences (noise): ε_BG = 10% → 20% (twice the noise) gives ε_TOTAL_BG = 1·10⁻⁹ → 2·10⁻⁷, i.e. 200× more background.
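The trigger efficiencies above follow directly from the binomial distribution; a sketch (the `trigger_eff` helper is ours, not from the slides):

```python
from math import comb

# Reproduce the HERA-B trigger numbers: at least 11 of n chambers hit.
def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def trigger_eff(n, p, k_min=11):
    return sum(binom_pmf(k, n, p) for k in range(k_min, n + 1))

print(round(100 * trigger_eff(12, 0.95), 1))   # 88.2
print(round(100 * trigger_eff(12, 0.90), 1))   # 65.9
print(round(100 * trigger_eff(11, 0.95), 1))   # 56.9 (one chamber dead)
```

Note how sensitive the trigger is: a 5-point drop in single-chamber efficiency costs over 22 points of total efficiency, because 11 or 12 chambers must fire simultaneously.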
Example: number of error bars within the 1σ interval (p = 0.68)