estimation – detectionherwig.wendt/data/slides_dect_12.pdf · 2012-12-03 · estimation –...

Estimation – Detection

Herwig Wendt

CNRS

IRIT - ENSEEIHT

Theme 1: Information Analysis and Synthesis


Estimation - Detection, 2012 – p. 1/29

Outline

Part 1: Estimation

Part 2: Detection - Statistical tests

Introduction, example

Neyman Pearson Theorem

Bayesian test (simple hypotheses)

Generalized likelihood ratio test

Bayesian test (composite hypotheses)

Goodness of fit tests


Introduction

Principle: A statistical test is a procedure for deciding between different hypotheses H0, H1, ... from n observations x1, ..., xn. We restrict ourselves here to two hypotheses H0 and H1. Performing a test consists in determining a test statistic T(X1, ..., Xn) and a set ∆ such that

H0 rejected if T (X1, ..., Xn) ∈ ∆

H0 accepted if T(X1, ..., Xn) ∉ ∆.

Terminology

H0 is the null hypothesis

H1 is the alternative hypothesis

{(x1, ..., xn) | T(x1, ..., xn) ∈ ∆}: critical region


Definitions

Parametric and non-parametric tests

Simple and composite hypotheses

Type I error = probability of false alarm

α = PFA = P [Reject H0|H0 true]

Type II error = probability of non-detection

β = PND = P [Reject H1|H1 true]

Power of the test = probability of detection:

π = PD = P[Reject H0 | H1 true] = 1 − β


Example

Xi ∼ N(m, σ²), σ² known

Hypotheses

H0 : m = m0, H1 : m = m1 > m0

Test strategy

Reject H0 if $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i > t_\alpha$

Problems

Determine the critical value tα, the risk β and the power π of the test.


ROC

Receiver operating characteristic

PD = h (PFA)

Example: Xi ∼ N(m, σ²), σ² known

H0 : m = m0, H1 : m = m1 > m0

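Probability of false alarm (F denotes the cumulative distribution function of N(0, 1))

$t_\alpha = m_0 + \frac{\sigma}{\sqrt{n}} F^{-1}(1 - \alpha)$

Detection probability

$P_D = \pi = 1 - F\left(\frac{t_\alpha - m_1}{\sigma/\sqrt{n}}\right)$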
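The critical value and the detection probability of this Gaussian example can be evaluated numerically. The sketch below assumes that F is the standard normal cumulative distribution function and uses PD = P[X̄ > tα | m = m1] = 1 − F((tα − m1)/(σ/√n)); the function names and the numerical values in the usage part are illustrative assumptions, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: F is the standard normal CDF


def critical_value(m0, sigma, n, alpha):
    """Threshold t_alpha such that P[sample mean > t_alpha | m = m0] = alpha."""
    return m0 + sigma / sqrt(n) * STD_NORMAL.inv_cdf(1.0 - alpha)


def detection_probability(m0, m1, sigma, n, alpha):
    """P_D = P[sample mean > t_alpha | m = m1]."""
    t_alpha = critical_value(m0, sigma, n, alpha)
    return 1.0 - STD_NORMAL.cdf((t_alpha - m1) / (sigma / sqrt(n)))


if __name__ == "__main__":
    # Illustrative values only: m0 = 0, m1 = 1, sigma = 1, n = 7, alpha = 0.05.
    t = critical_value(0.0, 1.0, 7, 0.05)
    pd = detection_probability(0.0, 1.0, 1.0, 7, 0.05)
    print(f"t_alpha = {t:.3f}, P_D = {pd:.3f}, beta = {1.0 - pd:.3f}")
```

When m1 = m0 the two hypotheses coincide and the function returns PD = α, which is a quick sanity check of the formulas.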


ROC

ROC curves for previous example

[Figure: ROC curves, PD (vertical axis) versus α = PFA (horizontal axis), for m0 = 0, m1 = 1 and four settings: σ = 3, n = 7; σ = 1, n = 7; σ = 3, n = 15; σ = 1, n = 15]
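ROC curves of this kind can be recomputed by eliminating tα between the expressions for PFA and PD, which gives PD = 1 − F(F⁻¹(1 − α) − √n (m1 − m0)/σ). The following is a minimal sketch under the assumption that F is the standard normal cumulative distribution function; the function names and the grid of α values are hypothetical choices.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: F is the standard normal CDF


def roc_point(alpha, m0, m1, sigma, n):
    """Return P_D for a given P_FA = alpha in the Gaussian mean test."""
    # Normalised distance between the two hypotheses.
    d = (m1 - m0) * sqrt(n) / sigma
    return 1.0 - STD_NORMAL.cdf(STD_NORMAL.inv_cdf(1.0 - alpha) - d)


def roc_curve(m0, m1, sigma, n, num_points=99):
    """Sample the ROC curve on a regular grid of alpha values inside (0, 1)."""
    alphas = [(k + 1) / (num_points + 1) for k in range(num_points)]
    return [(a, roc_point(a, m0, m1, sigma, n)) for a in alphas]


if __name__ == "__main__":
    # The four settings listed in the figure legend (m0 = 0, m1 = 1).
    for sigma, n in [(3.0, 7), (1.0, 7), (3.0, 15), (1.0, 15)]:
        pd = roc_point(0.1, 0.0, 1.0, sigma, n)
        print(f"sigma = {sigma}, n = {n}: P_D at alpha = 0.1 is {pd:.3f}")
```

A smaller σ or a larger n increases √n (m1 − m0)/σ and pushes the curve towards the upper left corner.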



Neyman-Pearson Theorem

Parametric test for simple hypotheses

H0 : θ = θ0 and H1 : θ = θ1

Continuous random variables

Theorem: for α fixed, the test that minimizes β (maximizes π) is defined by

Reject H0 if $\frac{L(x_1, \dots, x_n | H_1)}{L(x_1, \dots, x_n | H_0)} > S_\alpha$

Remark: L(x1, ..., xn|Hi) = f(x1, ..., xn|θi)


Example (Gaussian mean, σ² known): H0 : m = m0, H1 : m = m1 > m0
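For these hypotheses, the likelihood ratio test reduces to a threshold on the sample mean. The derivation below is a sketch that assumes i.i.d. observations Xi ∼ N(m, σ²) with σ² known, as in the earlier example; the slide itself only states the hypotheses.

```latex
\begin{align*}
L(x_1, \dots, x_n | H_i)
  &= (2\pi\sigma^2)^{-n/2}
     \exp\left(-\frac{1}{2\sigma^2} \sum_{k=1}^{n} (x_k - m_i)^2\right),
     \quad i = 0, 1, \\
\ln \frac{L(x_1, \dots, x_n | H_1)}{L(x_1, \dots, x_n | H_0)}
  &= \frac{m_1 - m_0}{\sigma^2} \sum_{k=1}^{n} x_k
     - \frac{n (m_1^2 - m_0^2)}{2\sigma^2}, \\
\frac{L(x_1, \dots, x_n | H_1)}{L(x_1, \dots, x_n | H_0)} > S_\alpha
  &\iff \bar{x} = \frac{1}{n} \sum_{k=1}^{n} x_k
     > \frac{m_0 + m_1}{2} + \frac{\sigma^2 \ln S_\alpha}{n (m_1 - m_0)}
     =: t_\alpha .
\end{align*}
```

The last equivalence uses m1 > m0. The test therefore has the form "reject H0 if X̄ > tα", and tα is fixed by the chosen significance α rather than by Sα directly.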


Neyman-Pearson theorem

Discrete random variables

Theorem: among all tests with type I error ≤ α fixed, the most powerful test is defined by

Reject H0 if $\frac{L(x_1, \dots, x_n | H_1)}{L(x_1, \dots, x_n | H_0)} > t_\alpha$

Remark:

L(x1, ..., xn|Hi) = P [X1 = x1, ..., Xn = xn|θi]

Example: Xi ∼ P(λ), H0 : λ = λ0, H1 : λ = λ1 > λ0

Asymptotic law: when n is sufficiently large, the central limit theorem can be applied to the test statistic
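For the Poisson example, the likelihood ratio equals exp(−n(λ1 − λ0)) (λ1/λ0)^S with S = X1 + ... + Xn, which is increasing in S when λ1 > λ0, so the test rejects H0 for large S. The sketch below uses the central limit approximation S ≈ N(nλ, nλ) to obtain a threshold and an approximate power; the function names are hypothetical and the approximation is only reasonable for large nλ.

```python
from math import sqrt
from statistics import NormalDist


def poisson_threshold_clt(lam0, n, alpha):
    """Approximate threshold on S = sum(X_i), using S ~ N(n*lam0, n*lam0) under H0."""
    mean = n * lam0
    std = sqrt(n * lam0)
    return mean + std * NormalDist().inv_cdf(1.0 - alpha)


def poisson_power_clt(lam0, lam1, n, alpha):
    """Approximate power, using S ~ N(n*lam1, n*lam1) under H1."""
    t = poisson_threshold_clt(lam0, n, alpha)
    return 1.0 - NormalDist(mu=n * lam1, sigma=sqrt(n * lam1)).cdf(t)


if __name__ == "__main__":
    # Illustrative values only: lam0 = 2, lam1 = 3, n = 50, alpha = 0.05.
    print(poisson_threshold_clt(2.0, 50, 0.05))
    print(poisson_power_clt(2.0, 3.0, 50, 0.05))
```

Because S is integer valued, the Gaussian threshold does not attain the significance α exactly; an exact test would work with the Poisson law P(nλ0) of S under H0.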


Summary

Performing a Neyman-Pearson test:

1. Determination of the test statistic and the critical region of the test

2. Determination of the relation between the critical value tα and the significance α of the test

3. Determination of the risk β and the power π of the test (or the ROC curve)

4. Numerical application: one accepts or rejects H0 for a given risk α



Bayesian test

Simple hypotheses H0 : θ = θ0 and H1 : θ = θ1 with prior probabilities P(H0) and P(H1)

costs cij to decide on Hi when Hj is true

probabilities pij to decide on Hi when Hj is true

Minimize cost E[C] = c00p00 + c01p01 + c10p10 + c11p11

Definition:

Reject H0 if $\frac{L(x_1, \dots, x_n | H_1)}{L(x_1, \dots, x_n | H_0)} > \frac{P(H_0)}{P(H_1)} \cdot \frac{c_{10} - c_{00}}{c_{01} - c_{11}}$


Example (Gaussian mean, σ² known): H0 : m = m0, H1 : m = m1 > m0
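For these hypotheses, the Bayesian rule can be written as a threshold on the sample mean. The sketch below assumes i.i.d. observations Xi ∼ N(m, σ²) with σ² known (an assumption, since the slide only states the hypotheses) and takes the logarithm of the likelihood ratio, which gives X̄ > (m0 + m1)/2 + σ² ln(η) / (n (m1 − m0)), where η is the Bayesian threshold. Function names and numerical values are illustrative.

```python
from math import log


def bayes_threshold(p_h0, p_h1, c00, c01, c10, c11):
    """Likelihood ratio threshold P(H0)(c10 - c00) / (P(H1)(c01 - c11)).

    c_ij is the cost of deciding H_i when H_j is true.
    """
    return (p_h0 * (c10 - c00)) / (p_h1 * (c01 - c11))


def bayes_decide_gaussian(xs, m0, m1, sigma, eta):
    """Return True if H0 is rejected; assumes m1 > m0 and a known sigma."""
    n = len(xs)
    sample_mean = sum(xs) / n
    t = 0.5 * (m0 + m1) + sigma ** 2 * log(eta) / (n * (m1 - m0))
    return sample_mean > t


if __name__ == "__main__":
    # Illustrative case: equal priors and 0-1 costs give eta = 1,
    # so the threshold on the sample mean is the midpoint (m0 + m1) / 2.
    eta = bayes_threshold(0.5, 0.5, c00=0.0, c01=1.0, c10=1.0, c11=0.0)
    print(eta, bayes_decide_gaussian([0.6, 0.7], 0.0, 1.0, 1.0, eta))
```

A larger prior probability P(H0) or a larger false alarm cost c10 increases η and makes the test more reluctant to reject H0.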



Generalized likelihood ratio test

Parametric test for composite hypotheses

H0 : θ ∈ Ω0 and H1 : θ ∈ Ω1

Definition (GLR Test)

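Reject H0 if $\frac{L(x_1, \dots, x_n | \hat{\theta}_1^{ML})}{L(x_1, \dots, x_n | \hat{\theta}_0^{ML})} > t_\alpha$

where $\hat{\theta}_0^{ML}$ and $\hat{\theta}_1^{ML}$ are the maximum likelihood estimators of θ under the hypotheses H0 and H1.

Remark: $L(x_1, \dots, x_n | \hat{\theta}_i^{ML}) = \sup_{\theta \in \Omega_i} L(x_1, \dots, x_n | \theta)$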
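As an illustration of the GLR test, consider a case that is not in the slides and is chosen here as an assumption: i.i.d. Xi ∼ N(m, σ²) with σ² known, H0 : m = m0 against H1 : m ≠ m0. The supremum of the likelihood under H1 is reached at the sample mean, so twice the logarithm of the generalized likelihood ratio equals n (x̄ − m0)² / σ², the square of a N(0, 1) variable under H0. The function name below is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def glrt_gaussian_mean(xs, m0, sigma, alpha):
    """GLRT of H0: m = m0 against H1: m != m0 for i.i.d. N(m, sigma^2) data.

    Returns (2 * log GLR, True if H0 is rejected at significance alpha).
    """
    n = len(xs)
    sample_mean = sum(xs) / n  # ML estimate of m under H1
    two_log_glr = n * (sample_mean - m0) ** 2 / sigma ** 2
    # Under H0, sqrt(two_log_glr) is distributed as |N(0, 1)|,
    # which fixes the threshold for a given alpha.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return two_log_glr, two_log_glr > z ** 2


if __name__ == "__main__":
    # Illustrative data only.
    print(glrt_gaussian_mean([2.1, 1.9, 2.2, 1.8], m0=0.0, sigma=1.0, alpha=0.05))
```

The same recipe (maximize the likelihood separately over Ω0 and Ω1, then threshold the ratio) applies to other models, but the law of the statistic under H0 must then be derived or approximated case by case.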



Bayesian test

Parametric test for composite hypotheses

H0 : θ ∈ Ω0 and H1 : θ ∈ Ω1

Principle: Define prior laws p0(θ) and p1(θ) that correspond to the constraints θ ∈ Ω0 and θ ∈ Ω1

Definition (Bayesian detector)

Reject H0 if $\frac{\int f(x_1, \dots, x_n | \theta) \, p_1(\theta) \, d\theta}{\int f(x_1, \dots, x_n | \theta) \, p_0(\theta) \, d\theta} > t_\alpha$


Example: H0 : m = 0, H1 : m ∼ N(0, ν²)
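For these hypotheses the two integrals have a closed form if one assumes i.i.d. observations Xi | m ∼ N(m, σ²) with σ² known (an assumption, the slide does not state the observation model). The sample mean is then N(0, σ²/n) under H0 and N(0, σ²/n + ν²) under H1, and the ratio of integrated likelihoods equals the ratio of these two densities evaluated at x̄. The sketch below computes its logarithm in two ways as a cross-check; the function names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def log_bayes_ratio(xs, sigma, nu):
    """Closed-form log of the integrated likelihood ratio (H1 over H0)."""
    n = len(xs)
    xbar = sum(xs) / n
    s0 = sigma ** 2 / n  # variance of the sample mean under H0
    s1 = s0 + nu ** 2    # variance of the sample mean under H1
    return 0.5 * log(s0 / s1) + 0.5 * xbar ** 2 * (1.0 / s0 - 1.0 / s1)


def log_bayes_ratio_check(xs, sigma, nu):
    """Same quantity, computed as a ratio of two Gaussian densities of the mean."""
    n = len(xs)
    xbar = sum(xs) / n
    s0 = sigma ** 2 / n
    h0 = NormalDist(0.0, sqrt(s0))
    h1 = NormalDist(0.0, sqrt(s0 + nu ** 2))
    return log(h1.pdf(xbar) / h0.pdf(xbar))


if __name__ == "__main__":
    # Illustrative data only.
    xs = [0.5, -0.3, 0.8, 1.2]
    print(log_bayes_ratio(xs, sigma=1.0, nu=2.0))
    print(log_bayes_ratio_check(xs, sigma=1.0, nu=2.0))
```

The ratio is an increasing function of x̄², so under these assumptions the Bayesian detector amounts to rejecting H0 when |x̄| exceeds a threshold.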



χ2 test

The χ² test is a non-parametric goodness-of-fit test for deciding between the following two hypotheses,

H0 : L = L0, H1 : L ≠ L0,

where L0 is a given law. The test consists in determining whether (x1, ..., xn) follows the law L0 or not. For simplicity, we only consider the case xi ∈ R.

Definition

Reject H0 if $\phi_n = \sum_{k=1}^{K} \frac{(n_k - n p_k)^2}{n p_k} > t_\alpha$

Remark: L0 can be discrete or continuous


χ2 test

Test statistic

nk: number of observations xi in class Ck, k = 1, ..., K

pk: probability that an observation xi belongs to class Ck when Xi ∼ L0,

pk = P[Xi ∈ Ck | Xi ∼ L0]

n: total number of observations

Law of the test statistic

$\phi_n \xrightarrow[n \to \infty]{\mathcal{L}} \chi^2_{K-1}$


Remarks

Interpretation of φn

$\phi_n = \sum_{k=1}^{K} \frac{n}{p_k} \left(\frac{n_k}{n} - p_k\right)^2$

Distance between theoretical and empirical probabilities

Asymptotic law of φn: see course or textbooks

Finite number of observations

Heuristic: The asymptotic law of φn is a good approximation for finite n if n pk ≥ 5 ∀k = 1, ..., K

⟹ use equally likely classes


Remarks

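Correction

When (a subset of) the parameters of L0 are unknown

$\phi_n \xrightarrow[n \to \infty]{\mathcal{L}} \chi^2_{K-1-n_p}$

where $n_p$ is the number of unknown parameters, estimated by the maximum likelihood method

Power of the test

Cannot be computed (H1 does not specify the law of the observations)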


Example

4.13 1.41 −1.16 −0.75 1.96 2.46 0.197 0.24 0.42 2.00

2.08 1.48 1.73 0.82 0.33 −0.76 0.42 4.60 −2.83 0.197

2.59 0.54 4.06 −0.69 4.99 0.67 2.45 5.61 2.13 1.76

5.03 0.85 1.29 0.17 −0.38 2.76 −1.03 1.87 4.48 0.73

Is it reasonable to assume that the observations stem from a population of law N(1, 4)?

Solution

Classes

C1 : ]−∞,−0.34], C2 : ]−0.34, 1], C3 : ]1, 2.34], C4 : ]2.34,∞[

Number of observations

n1 = 7, n2 = 12, n3 = 10, n4 = 11 (with n = 40 and four equally likely classes, n pk = 10 for each class)


Example

Test statistic

φn = 1.4

Critical values

          χ²_2      χ²_3
t0.05     5.991     7.815
t0.01     9.210     11.345

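With K = 4 classes and no estimated parameter, the relevant law is χ² with K − 1 = 3 degrees of freedom. Since φn = 1.4 is below both critical values, hypothesis H0 is accepted with risks α = 0.01 and α = 0.05.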
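The numbers of this example can be reproduced with a few lines of code. The sketch below builds four equally likely classes from the quartiles of N(1, 4), counts the observations per class and evaluates φn; the function names are hypothetical. The exact quartiles (about −0.349 and 2.349) differ slightly from the rounded class limits −0.34 and 2.34 of the slide, but lead to the same counts for this data set.

```python
from statistics import NormalDist

# The 40 observations of the example.
DATA = [
    4.13, 1.41, -1.16, -0.75, 1.96, 2.46, 0.197, 0.24, 0.42, 2.00,
    2.08, 1.48, 1.73, 0.82, 0.33, -0.76, 0.42, 4.60, -2.83, 0.197,
    2.59, 0.54, 4.06, -0.69, 4.99, 0.67, 2.45, 5.61, 2.13, 1.76,
    5.03, 0.85, 1.29, 0.17, -0.38, 2.76, -1.03, 1.87, 4.48, 0.73,
]


def equiprobable_edges(dist, k):
    """Inner class limits that split the law `dist` into k equally likely classes."""
    return [dist.inv_cdf(j / k) for j in range(1, k)]


def class_counts(xs, edges):
    """Count observations per class ]edge[j-1], edge[j]] (edges sorted ascending)."""
    counts = [0] * (len(edges) + 1)
    for x in xs:
        idx = sum(1 for e in edges if x > e)
        counts[idx] += 1
    return counts


def chi2_statistic(counts, probs):
    """phi_n = sum over classes of (n_k - n p_k)^2 / (n p_k)."""
    n = sum(counts)
    return sum((nk - n * pk) ** 2 / (n * pk) for nk, pk in zip(counts, probs))


if __name__ == "__main__":
    law0 = NormalDist(mu=1.0, sigma=2.0)  # N(1, 4): variance 4, standard deviation 2
    counts = class_counts(DATA, equiprobable_edges(law0, 4))
    phi = chi2_statistic(counts, [0.25] * 4)
    # 7.815 is the chi-square(3) critical value for alpha = 0.05 from the table.
    print(counts, phi, "reject H0" if phi > 7.815 else "accept H0")
```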


Kolmogorov test

The Kolmogorov test is a non-parametric goodness-of-fit test for deciding between the following two hypotheses,

H0 : L = L0, H1 : L ≠ L0,

where L0 is a given law. The test consists in determining whether (x1, ..., xn) follows the law L0 or not. For simplicity, we only consider the case xi ∈ R.

Definition

Reject H0 if $\Psi_n = \sup_{x \in \mathbb{R}} |F(x) - F_0(x)| > t_\alpha$

Remark: L0 must be a continuous law


Remarks

Test statistic

F0(x) is the theoretical cumulative distribution function of L0 and F(x) is the empirical distribution function of (x1, ..., xn)

Asymptotic law of Ψn: see textbooks

$P[\sqrt{n} \, \Psi_n < y] \xrightarrow[n \to \infty]{} 1 - 2 \sum_{l=1}^{+\infty} (-1)^{l-1} \exp(-2 l^2 y^2) = K(y)$

Determination of the critical value: $t_\alpha = \frac{1}{\sqrt{n}} K^{-1}(1 - \alpha)$

The critical value depends on α and n.
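The asymptotic critical value can be obtained numerically by truncating the series K(y) and inverting it. The sketch below uses a fixed number of terms and a bisection search; the function names, the truncation length and the search interval are hypothetical choices, and the truncated series is only reliable for y that is not too close to 0.

```python
from math import exp, sqrt


def kolmogorov_cdf(y, terms=100):
    """K(y) = 1 - 2 * sum_{l>=1} (-1)^(l-1) * exp(-2 l^2 y^2), truncated series."""
    if y <= 0.0:
        return 0.0
    total = sum((-1) ** (l - 1) * exp(-2.0 * l * l * y * y)
                for l in range(1, terms + 1))
    return 1.0 - 2.0 * total


def asymptotic_critical_value(alpha, n, lo=0.2, hi=5.0, tol=1e-10):
    """Solve K(y) = 1 - alpha by bisection on [lo, hi] and return y / sqrt(n)."""
    target = 1.0 - alpha
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kolmogorov_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / sqrt(n)


if __name__ == "__main__":
    # Illustrative call: asymptotic critical value for n = 20, alpha = 0.05.
    print(asymptotic_critical_value(0.05, 20))
```

For small n this asymptotic value is only an approximation; tabulated critical values computed from the exact law of Ψn are usually preferred in that case.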


Remarks

Computing Ψn

$\Psi_n = \max_{i \in \{1, \dots, n\}} \max\{E_i^+, E_i^-\}$

$E_i^+ = \left| F(x_i^{*+}) - F_0(x_i^*) \right|$, $E_i^- = \left| F(x_i^{*-}) - F_0(x_i^*) \right|$

$x_1^*, \dots, x_n^*$ are the order statistics of x1, ..., xn.

$F(x_i^{*+}) = i/n$ and $F(x_i^{*-}) = (i-1)/n$.

Power of the test

Cannot be computed


Example

Is it reasonable to believe that the following observations stem from a population of uniform law U(0, 1)?

i                1       2      3     4     5     6     7     8     9     10
xi               0.0078  0.063  0.10  0.25  0.32  0.39  0.40  0.48  0.49  0.53
E−i              0.0078  0.013  0.00  0.10  0.12  0.14  0.10  0.13  0.09  0.08
E+i              0.0422  0.037  0.05  0.05  0.07  0.09  0.05  0.08  0.04  0.03
max(E+i, E−i)    0.0422  0.037  0.05  0.10  0.12  0.14  0.10  0.13  0.09  0.08

i                11      12     13    14    15    16    17    18    19    20
xi               0.67    0.68   0.69  0.73  0.79  0.80  0.87  0.88  0.90  0.996
E−i              0.17    0.13   0.09  0.08  0.09  0.05  0.07  0.03  0.00  0.046
E+i              0.12    0.08   0.04  0.03  0.04  0.00  0.02  0.02  0.05  0.004
max(E+i, E−i)    0.17    0.13   0.09  0.08  0.09  0.05  0.07  0.03  0.05  0.046


Example

Test statistic

Ψn = 0.17

Critical values for n = 20

t0.05 = 0.294

t0.01 = 0.352

Since Ψn = 0.17 is below both critical values, hypothesis H0 is accepted with risks α = 0.01 and α = 0.05.
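The value of the test statistic in this example can be checked with the order-statistic formulas, using E+ = |i/n − F0(x*i)| and E− = |(i − 1)/n − F0(x*i)|. The following is a minimal sketch; the function names are hypothetical and the uniform cumulative distribution function is written out by hand.

```python
# The 20 observations of the example.
DATA = [
    0.0078, 0.063, 0.10, 0.25, 0.32, 0.39, 0.40, 0.48, 0.49, 0.53,
    0.67, 0.68, 0.69, 0.73, 0.79, 0.80, 0.87, 0.88, 0.90, 0.996,
]


def uniform_cdf(x):
    """Cumulative distribution function of U(0, 1)."""
    return min(max(x, 0.0), 1.0)


def kolmogorov_statistic(xs, cdf0):
    """Psi_n = max over i of max(E_i^+, E_i^-) computed on the order statistics."""
    ordered = sorted(xs)
    n = len(ordered)
    psi = 0.0
    for i, x in enumerate(ordered, start=1):
        f0 = cdf0(x)
        e_plus = abs(i / n - f0)          # empirical CDF just after x_i*
        e_minus = abs((i - 1) / n - f0)   # empirical CDF just before x_i*
        psi = max(psi, e_plus, e_minus)
    return psi


if __name__ == "__main__":
    psi = kolmogorov_statistic(DATA, uniform_cdf)
    # 0.294 is the tabulated critical value for n = 20 and alpha = 0.05.
    print(psi, "reject H0" if psi > 0.294 else "accept H0")
```

The maximum is reached at the eleventh order statistic, 0.67, where the empirical distribution function jumps from 0.50 to 0.55.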

