Estimation – Detection
Herwig Wendt
CNRS
IRIT - ENSEEIHT
Theme 1: Information Analysis and Synthesis
Estimation - Detection, 2012 – p. 1/29
Outline
Part 1: Estimation
Part 2: Detection - Statistical tests
Introduction, example
Neyman Pearson Theorem
Bayesian test (simple hypotheses)
Generalized likelihood ratio test
Bayesian test (composite hypotheses)
Goodness of fit tests
Introduction
Principle: A statistical test is a procedure that enables to decide between different hypotheses H0, H1, ... from n observations x1, ..., xn. We restrict ourselves here to two hypotheses H0 and H1. Performing a test consists in determining a test statistic T(X1, ..., Xn) and a set ∆ such that
H0 is rejected if T(X1, ..., Xn) ∈ ∆
H0 is accepted if T(X1, ..., Xn) ∉ ∆.
Terminology
H0 is the null hypothesis
H1 is the alternative hypothesis
{(x1, ..., xn) | T(x1, ..., xn) ∈ ∆}: critical region
Definitions
Parametric and non-parametric tests
Simple and composite hypotheses
Type I error = probability of false alarm
α = PFA = P [Reject H0|H0 true]
Type II error = probability of non-detection
β = PND = P [Accept H0|H1 true]
Power of the test = probability of detection:
π = PD = P [Reject H0|H1 true] = 1 − β
Example
Xi ∼ N (m,σ2), σ2 known
Hypotheses
H0 : m = m0, H1 : m = m1 > m0
Test strategy
Reject H0 if X̄ = (1/n) ∑_{i=1}^{n} Xi > tα
Problems: Determine the critical value tα, the risk β and the power π of the test.
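A minimal numerical sketch of this slide's problem, assuming illustrative values for m0, m1, σ, n and α (they are not given in the slides), with F the standard normal cumulative distribution function:

```python
# Gaussian mean test with known variance: compute the critical value
# t_alpha, the risk beta, and the power pi. Parameter values are
# illustrative assumptions, not taken from the slides.
from math import sqrt
from scipy.stats import norm

m0, m1, sigma, n, alpha = 0.0, 1.0, 1.0, 7, 0.05

# Critical value: reject H0 when the sample mean exceeds t_alpha
t_alpha = m0 + sigma / sqrt(n) * norm.ppf(1 - alpha)

# Risk beta (non-detection) and power pi under H1
beta = norm.cdf((t_alpha - m1) * sqrt(n) / sigma)
power = 1 - beta

print(t_alpha, beta, power)
```

Under H0 the sample mean is N(m0, σ²/n), which is what fixes tα from the prescribed false-alarm level α.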
ROC
Receiver operating characteristic
PD = h (PFA)
Example: Xi ∼ N (m,σ2), σ2 known
H0 : m = m0, H1 : m = m1 > m0
Probability of false alarm
tα = m0 + (σ/√n) F⁻¹(1 − α)
Detection probability
PD = π = 1 − F(√n (tα − m1)/σ)
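Combining the two formulas gives the ROC curve PD = h(PFA) in closed form. A sketch, with F the standard normal CDF and parameter values matching the plotted example (m0 = 0, m1 = 1):

```python
# ROC of the Gaussian mean test: P_D as an explicit function of P_FA,
# h(alpha) = 1 - F( F^{-1}(1 - alpha) - (m1 - m0) * sqrt(n) / sigma ).
from math import sqrt
from scipy.stats import norm

def roc(alpha, m0=0.0, m1=1.0, sigma=1.0, n=7):
    """Detection probability P_D for a given false-alarm level alpha."""
    return 1 - norm.cdf(norm.ppf(1 - alpha) - (m1 - m0) * sqrt(n) / sigma)

# The curve lies above the chance line P_D = P_FA and improves with n
print(roc(0.05), roc(0.05, n=15))
```

Substituting tα into the detection probability eliminates the threshold, which is why the ROC depends only on α and the signal-to-noise quantity (m1 − m0)√n/σ.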
ROC
ROC curves for previous example
[Figure: ROC curves, PD versus α = PFA, for m0 = 0, m1 = 1 and the four cases (σ = 3, n = 7), (σ = 1, n = 7), (σ = 3, n = 15), (σ = 1, n = 15)]
Outline
Part 1: Estimation
Part 2: Detection - Statistical tests
Introduction, example
Neyman Pearson Theorem
Bayesian test (simple hypotheses)
Generalized likelihood ratio test
Bayesian test (composite hypotheses)
Goodness of fit tests
Neyman-Pearson Theorem
Parametric test for simple hypotheses
H0 : θ = θ0 and
H1 : θ = θ1
Continuous random variables
Theorem: for α fixed, the test that minimizes β (maximizes π) is defined by
Reject H0 if L(x1, ..., xn|H1) / L(x1, ..., xn|H0) > Sα
Remark: L(x1, ..., xn|Hi) = f(x1, ..., xn|θi)
Example: Xi ∼ N (m,σ2), σ2 known
H0 : m = m0, H1 : m = m1 > m0
Neyman-Pearson theorem
Discrete random variables
Theorem: among all tests with type I error ≤ α fixed, the most powerful test is defined by
Reject H0 if L(x1, ..., xn|H1) / L(x1, ..., xn|H0) > tα
Remark:
L(x1, ..., xn|Hi) = P [X1 = x1, ..., Xn = xn|θi]
Example: Xi ∼ P(λ), H0 : λ = λ0, H1 : λ = λ1 > λ0
Asymptotic law: when n is sufficiently large, application of the central limit theorem
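For the Poisson example the likelihood ratio is increasing in S = ∑ xi (since λ1 > λ0), so the Neyman-Pearson test reduces to thresholding S, which is Poisson(nλ0) under H0. A sketch with illustrative values of λ0, λ1, n, α (assumptions, not from the slides):

```python
# Neyman-Pearson test for Poisson observations. Because lambda1 > lambda0,
# the likelihood ratio (lambda1/lambda0)^S * exp(-n (lambda1 - lambda0))
# is increasing in S = sum(x_i): reject H0 when S exceeds a critical value.
from scipy.stats import poisson

lam0, lam1, n, alpha = 1.0, 2.0, 10, 0.05

# Under H0, S ~ Poisson(n * lambda0); due to discreteness the achieved
# level is at most alpha (randomization would attain alpha exactly).
t_alpha = poisson.ppf(1 - alpha, n * lam0)   # critical value for S
level = 1 - poisson.cdf(t_alpha, n * lam0)   # achieved P_FA <= alpha
power = 1 - poisson.cdf(t_alpha, n * lam1)   # P_D under H1

print(t_alpha, level, power)
```

For large n the Poisson(nλ0) law of S can be replaced by its normal approximation, which is the central limit theorem route mentioned above.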
Summary
Performing a Neyman-Pearson test:
1. Determination of the test statistic and the critical region of the test
2. Determination of the relation between the critical value tα and the significance level α of the test
3. Determination of the risk β and the power π of the test (or the ROC curve)
4. Numerical application: one accepts or rejects H0 for a given risk α
Outline
Part 1: Estimation
Part 2: Detection - Statistical tests
Introduction, example
Neyman Pearson Theorem
Bayesian test (simple hypotheses)
Generalized likelihood ratio test
Bayesian test (composite hypotheses)
Goodness of fit tests
Bayesian test
Simple hypotheses H0 : θ = θ0 and H1 : θ = θ1 with prior probabilities P (H0) and P (H1)
costs cij to decide on Hi when Hj is true
probabilities pij to decide on Hi when Hj is true
Minimize cost E[C] = c00p00 + c01p01 + c10p10 + c11p11
Definition:
Reject H0 if L(x1, ..., xn|H1) / L(x1, ..., xn|H0) > (P(H0)/P(H1)) · (c10 − c00)/(c01 − c11)
Example: Xi ∼ N (m,σ2), σ2 known
H0 : m = m0, H1 : m = m1 > m0
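A sketch of the Bayes threshold for the Gaussian example, assuming illustrative priors and the standard 0-1 cost structure (c00 = c11 = 0, c01 = c10 = 1), none of which are fixed by the slides:

```python
# Bayesian test for simple hypotheses: reject H0 when the likelihood
# ratio exceeds S = P(H0)/P(H1) * (c10 - c00)/(c01 - c11).
from math import log

P0, P1 = 0.5, 0.5                      # assumed priors
c00, c01, c10, c11 = 0.0, 1.0, 1.0, 0.0  # assumed 0-1 costs

S = (P0 / P1) * (c10 - c00) / (c01 - c11)

# For Gaussian data with known sigma, the log-likelihood ratio is
#   log LR = n * (m1 - m0) / sigma**2 * (xbar - (m0 + m1)/2),
# so "LR > S" is equivalent to a threshold on the sample mean:
m0, m1, sigma, n = 0.0, 1.0, 1.0, 7
xbar_threshold = (m0 + m1) / 2 + sigma**2 * log(S) / (n * (m1 - m0))

print(S, xbar_threshold)
```

With equal priors and symmetric costs the threshold on the likelihood ratio is 1, and the rule reduces to comparing the sample mean with the midpoint (m0 + m1)/2.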
Outline
Part 1: Estimation
Part 2: Detection - Statistical tests
Introduction, example
Neyman Pearson Theorem
Bayesian test (simple hypotheses)
Generalized likelihood ratio test
Bayesian test (composite hypotheses)
Goodness of fit tests
Generalized likelihood ratio test
Parametric test for composite hypotheses
H0 : θ ∈ Ω0 and H1 : θ ∈ Ω1
Definition (GLR Test)
Reject H0 if L(x1, ..., xn | θ̂1^ML) / L(x1, ..., xn | θ̂0^ML) > tα,
where θ̂0^ML and θ̂1^ML are the maximum likelihood estimators of θ under the hypotheses H0 and H1.
Remark: L(x1, ..., xn | θ̂i^ML) = sup_{θ∈Ωi} L(x1, ..., xn|θ)
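A GLR sketch for an assumed concrete setting (not in the slides): H0 : m = m0 versus H1 : m unrestricted, Gaussian data with known σ. The ML estimator under H1 is the sample mean, under H0 it is m0, and 2 log GLR is asymptotically χ² with 1 degree of freedom under H0:

```python
# GLR test for H0: m = m0 vs H1: m unrestricted, Gaussian data, known
# sigma (illustrative setting). 2 log GLR = n (xbar - m0)^2 / sigma^2.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
m0, sigma, n, alpha = 0.0, 1.0, 50, 0.05
x = rng.normal(m0, sigma, n)          # data generated under H0

xbar = x.mean()                        # ML estimator under H1
two_log_glr = n * (xbar - m0) ** 2 / sigma**2
t_alpha = chi2.ppf(1 - alpha, df=1)    # asymptotic critical value

reject = two_log_glr > t_alpha
print(two_log_glr, t_alpha, reject)
```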
Outline
Part 1: Estimation
Part 2: Detection - Statistical tests
Introduction, example
Neyman Pearson Theorem
Bayesian test (simple hypotheses)
Generalized likelihood ratio test
Bayesian test (composite hypotheses)
Goodness of fit tests
Bayesian test
Parametric test for composite hypotheses
H0 : θ ∈ Ω0 and H1 : θ ∈ Ω1
Principle: Define prior laws p0(θ) and p1(θ) that correspond to the constraints θ ∈ Ω0 and θ ∈ Ω1
Definition (Bayesian detector)
Reject H0 if
∫ f(x1, ..., xn|θ) p1(θ) dθ / ∫ f(x1, ..., xn|θ) p0(θ) dθ > tα
Example: Xi ∼ N (m,σ2), σ2 known
H0 : m = 0, H1 : m ∼ N (0, ν2)
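For this example the integrated likelihood under H1 averages the Gaussian likelihood over the N(0, ν²) prior on m. A numerical sketch, with illustrative values of σ, ν, n and simulated data (assumptions, not from the slides):

```python
# Bayesian detector for H0: m = 0 vs H1: m ~ N(0, nu^2), Gaussian data
# with known sigma. The numerator integrates the likelihood over the
# prior; the denominator is the likelihood at m = 0.
import math
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

sigma, nu, n = 1.0, 2.0, 10
rng = np.random.default_rng(1)
x = rng.normal(1.0, sigma, n)   # illustrative data with a nonzero mean

def likelihood(m):
    """Joint density of the sample for a given mean m."""
    return np.prod(norm.pdf(x, loc=m, scale=sigma))

# Integrated likelihood under H1 (finite limits are wide enough here)
num, _ = quad(lambda m: likelihood(m) * norm.pdf(m, 0, nu), -20, 20)
den = likelihood(0.0)            # likelihood under H0

ratio = num / den                # reject H0 when ratio > t_alpha
print(ratio)
```

In this conjugate Gaussian case the integral also has a closed form; the numerical version is shown because it works for any prior p1(θ).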
Outline
Part 1: Estimation
Part 2: Detection - Statistical tests
Introduction, example
Neyman Pearson Theorem
Bayesian test (simple hypotheses)
Generalized likelihood ratio test
Bayesian test (composite hypotheses)
Goodness of fit tests
χ2 test
The χ2 test is a non-parametric goodness-of-fit test which enables to test the following two hypotheses,
H0 : L = L0, H1 : L ≠ L0,
where L0 is a given law. The test consists in determining whether (x1, ..., xn) follows the law L0 or not. For simplicity, we only consider the case xi ∈ R.
Definition
Reject H0 if φn = ∑_{k=1}^{K} (nk − n pk)² / (n pk) > tα
Remark: L0 can be discrete or continuous
χ2 test
Test statistic
nk: number of observations xi in class Ck, k = 1, ..., K
pk: probability that an observation xi belongs to class Ck when Xi ∼ L0, i.e. pk = P [Xi ∈ Ck|Xi ∼ L0]
n: total number of observations
Law of the test statistic
φn → χ²(K − 1) in law as n → ∞
Remarks
Interpretation of φn
φn = ∑_{k=1}^{K} (n/pk) (nk/n − pk)²
Distance between theoretical and empirical probabilities
Asymptotic law of φn: see course or textbooks
Finite number of observations
Heuristic: The asymptotic law of φn is a good approximation for finite n if n pk ≥ 5 for all k = 1, ..., K
⟹ choose equally likely classes
Remarks
Correction
When (a subset of) the parameters of L0 are unknown,
φn → χ²(K − 1 − np) in law as n → ∞,
where np is the number of unknown parameters, estimated by the maximum likelihood method
Power of the test
Cannot be computed
Example
4.13 1.41 −1.16 −0.75 1.96 2.46 0.197 0.24 0.42 2.00
2.08 1.48 1.73 0.82 0.33 −0.76 0.42 4.60 −2.83 0.197
2.59 0.54 4.06 −0.69 4.99 0.67 2.45 5.61 2.13 1.76
5.03 0.85 1.29 0.17 −0.38 2.76 −1.03 1.87 4.48 0.73
Is it reasonable to assume that the observations stem from a population of law N (1, 4)?
Solution
Classes
C1 : ]−∞,−0.34], C2 : ]−0.34, 1], C3 : ]1, 2.34], C4 : ]2.34,∞[
Number of observations
Z1 = 7, Z2 = 12, Z3 = 10, Z4 = 11
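A quick check of this example: the four classes are equally likely under N(1, 4) (σ = 2), so pk = 0.25 and npk = 10 for the n = 40 observations:

```python
# Recompute the chi2 statistic for the slide's example from the class
# counts Z_1, ..., Z_4 and equally likely classes under N(1, 4).
counts = [7, 12, 10, 11]          # Z_k from the slide
n = sum(counts)                   # 40 observations
p = [0.25, 0.25, 0.25, 0.25]      # classes chosen equally likely

phi_n = sum((nk - n * pk) ** 2 / (n * pk) for nk, pk in zip(counts, p))
print(phi_n)   # 1.4, well below the chi2(3) critical values
```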
Example
Test statistic
φn = 1.4
Critical values
        χ²(2)   χ²(3)
t0.05   5.991   7.815
t0.01   9.210   11.345
hence hypothesis H0 is accepted with risks α = 0.01 and α = 0.05.
Kolmogorov test
The Kolmogorov test is a non-parametric goodness-of-fit test which enables to test the following two hypotheses,
H0 : L = L0, H1 : L ≠ L0,
where L0 is a given law. The test consists in determining whether (x1, ..., xn) follows the law L0 or not. For simplicity, we only consider the case xi ∈ R.
Definition
Reject H0 if Ψn = sup_{x∈R} |F(x) − F0(x)| > tα
Remark: L0 must be a continuous law
Remarks
Test statistic
F0(x) is the theoretical cumulative distribution function of L0 and F(x) is the empirical distribution function of (x1, ..., xn)
Asymptotic law of Ψn: see textbooks
P [√n Ψn < y] → K(y) = 1 − 2 ∑_{l=1}^{∞} (−1)^{l−1} exp(−2 l² y²) as n → ∞
Determination of the critical value: tα = (1/√n) K⁻¹(1 − α)
The critical value depends on α and n.
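The limit distribution K(y) converges quickly, so the asymptotic critical value can be obtained by truncating the series and inverting by bisection. A sketch (the bisection helper is ad hoc, not from the slides):

```python
# Kolmogorov limit distribution K(y), truncated after a fixed number of
# terms, and the asymptotic critical value t_alpha = K^{-1}(1-alpha)/sqrt(n).
from math import exp, sqrt

def K(y, terms=100):
    """K(y) = 1 - 2 * sum_{l>=1} (-1)^(l-1) * exp(-2 l^2 y^2)."""
    return 1 - 2 * sum((-1) ** (l - 1) * exp(-2 * l * l * y * y)
                       for l in range(1, terms + 1))

def K_inv(q, lo=0.01, hi=5.0):
    """Invert K by bisection; K is increasing on (0, inf)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if K(mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, alpha = 20, 0.05
t_alpha = K_inv(1 - alpha) / sqrt(n)
print(t_alpha)   # approx 0.304; the exact finite-n table gives 0.294
```

The small gap between 0.304 and the tabulated 0.294 for n = 20 illustrates that the critical value depends on both α and n, and that the asymptotic formula is only an approximation for moderate n.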
Remarks
Computing Ψn
Ψn = max_{i=1,...,n} max(E+_i, E−_i)
E+_i = |F(x*_i⁺) − F0(x*_i)|, E−_i = |F(x*_i⁻) − F0(x*_i)|
x*_1, ..., x*_n is the order statistic of x1, ..., xn.
F(x*_i⁺) = i/n and F(x*_i⁻) = (i − 1)/n.
Power of the test
Cannot be computed
Example
Is it reasonable to believe that the following observations stem from a population of uniform law U(0, 1)?
i                 1       2      3     4     5     6     7     8      9     10
xi                0.0078  0.063  0.10  0.25  0.32  0.39  0.40  0.48   0.49  0.53
E−_i              0.0078  0.013  0.00  0.10  0.07  0.14  0.05  0.008  0.04  0.03
E+_i              0.0422  0.037  0.05  0.05  0.12  0.09  0.10  0.13   0.09  0.08
max(E+_i, E−_i)   0.0422  0.037  0.05  0.10  0.12  0.14  0.10  0.13   0.09  0.08

i                 11     12     13     14     15     16     17     18     19     20
xi                0.67   0.68   0.69   0.73   0.79   0.80   0.87   0.88   0.90   0.996
E−_i              0.17   0.13   0.04   0.03   0.04   0.05   0.07   0.03   0.05   0.046
E+_i              0.12   0.08   0.09   0.08   0.09   0.00   0.02   0.02   0.00   4e−3
max(E+_i, E−_i)   0.17   0.13   0.09   0.08   0.09   0.05   0.07   0.03   0.05   0.046
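A quick check of the tabulated computation: under H0 the theoretical CDF is F0(x) = x on [0, 1], and the empirical CDF jumps by 1/n at each ordered observation:

```python
# Recompute Psi_n for the uniform example from the 20 observations.
x = sorted([0.0078, 0.063, 0.10, 0.25, 0.32, 0.39, 0.40, 0.48, 0.49,
            0.53, 0.67, 0.68, 0.69, 0.73, 0.79, 0.80, 0.87, 0.88,
            0.90, 0.996])
n = len(x)

# For each ordered observation compare F0(x) = x with i/n and (i-1)/n
psi_n = max(max(abs(i / n - xi), abs((i - 1) / n - xi))
            for i, xi in enumerate(x, start=1))
print(psi_n)   # 0.17 (up to float rounding), attained at i = 11
```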
Example
Test statistic
Ψn = 0.17
Critical values for n = 20
t0.05 = 0.294
t0.01 = 0.352
hence hypothesis H0 is accepted with risks α = 0.01 and α = 0.05.