
Test on regression parameters

Prof. Rizzi Laura

October 28, 2013




Overview on:

- Test on single parameters
- Test on more parameters
- Test: matrix algebra approach




Test on single parameters

(See 5.6 Stock and Watson)

We assessed that

$$\frac{b_j - \beta_j}{\sqrt{var(b_j)}}$$

has approximately a standard normal distribution (by the Central Limit Theorem). Hypotheses on a single parameter $\beta_j$ of the regression model may therefore be verified using the test statistic z or t (the latter is used when $\sigma^2$ is unknown).

The confidence interval for the parameter $\beta_j$ may be derived following the usual approach, and is defined as $\{b_j \pm 1.96\, s.e.(b_j)\}$. This holds for each of the parameters $\beta_1, \beta_2, \ldots, \beta_k$.

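Example test-STR

The estimated models were (standard errors in parentheses below the coefficients):

$$\widehat{Test}_i = \underset{(10.4)}{689.9} - \underset{(0.52)}{2.28}\, STR_i$$

$$\widehat{Test}_i = \underset{(8.7)}{696} - \underset{(0.43)}{1.1}\, STR_i - \underset{(0.031)}{0.65}\, PctEL_i$$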
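As a small illustration of the single-parameter test, the sketch below computes the t statistic and the 95% confidence interval for the STR coefficient of the first model. It assumes the slope is -2.28 with standard error 0.52 and uses the large-sample critical value 1.96; the variable names are only illustrative.

```python
# Coefficient and standard error of STR from the first estimated model
# (the negative sign of the slope is an assumption of this sketch).
b_str = -2.28
se_str = 0.52

# t statistic for H0: beta_STR = 0.
t_stat = (b_str - 0.0) / se_str

# 95% confidence interval based on the large-sample critical value 1.96.
ci_low = b_str - 1.96 * se_str
ci_high = b_str + 1.96 * se_str

# H0 is rejected at the 5% level when |t| > 1.96,
# or equivalently when 0 lies outside the interval.
reject = abs(t_stat) > 1.96

print(f"t = {t_stat:.2f}")
print(f"95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
print(f"reject H0 at 5%: {reject}")
```

The two decision rules (comparing |t| with 1.96, or checking whether 0 falls in the interval) always agree, since they are the same inequality rearranged.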




Test on more parameters

If we include in the model the variable EXPN, which measures expenditure per student, we obtain:

$$Test_i = \beta_1 + \beta_2 STR_i + \beta_3 EXPN_i + \beta_4 PctEL_i$$

The hypothesis that neither the student-teacher ratio nor the district expenditure per student affects the performance of students may be expressed as follows:

$$H_0: \beta_2 = 0,\ \beta_3 = 0$$

$$H_1: \beta_2 \neq 0 \text{ and/or } \beta_3 \neq 0$$

This hypothesis involves more than one parameter. Moreover, it is not a single hypothesis on several parameters but a multiple hypothesis made up of several single hypotheses: it imposes more than one restriction on more than one parameter.

We could think of verifying this multiple hypothesis by looking at the t statistic of each single hypothesis involved, rejecting the null if at least one of the t statistics on the single parameters takes a sample value in the rejection tails.





This approach is not correct, because the resulting multiple test would not have the correct significance level.

Proof: Consider the null hypothesis $H_0: \beta_2 = 0$ and $\beta_3 = 0$ and derive the significance level implied by the use of the single t statistics. Suppose that $b_2$ and $b_3$ have independent sampling distributions; the corresponding t statistics are:

$$t_2 = \frac{b_2 - 0}{s.e.(b_2)} \qquad t_3 = \frac{b_3 - 0}{s.e.(b_3)}$$

The decision rule could be:

reject $H_0: \beta_2 = \beta_3 = 0$ if $|t_2| > t_{\alpha/2}$ and/or $|t_3| > t_{\alpha/2}$

What is the probability of rejecting the null when the null is true? In other words, what is the probability of a type I error? (It should be equal to $\alpha$ in any case.)





This probability is the following:

$$\Pr_{H_0}[|t_2| > 1.96, |t_3| > 1.96] + \Pr_{H_0}[|t_2| > 1.96, |t_3| \le 1.96] + \Pr_{H_0}[|t_2| \le 1.96, |t_3| > 1.96]$$

With independent $t_2$ and $t_3$ each joint probability factorizes, and we obtain:

$$= \Pr_{H_0}[|t_2| > 1.96]\,\Pr_{H_0}[|t_3| > 1.96] + \Pr_{H_0}[|t_2| > 1.96]\,\Pr_{H_0}[|t_3| \le 1.96] + \Pr_{H_0}[|t_2| \le 1.96]\,\Pr_{H_0}[|t_3| > 1.96]$$

$$= 0.05 \times 0.05 + 0.05 \times 0.95 + 0.95 \times 0.05 = 0.0975 = 9.75\%$$

which is not equal to $\alpha = 0.05$. In general the size depends on the correlation between $t_2$ and $t_3$, and hence on the correlation between $b_2$ and $b_3$.
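The arithmetic above can be checked with a few lines of code. The sketch below assumes independent test statistics and the standard normal approximation; the function names are illustrative and not part of the original material.

```python
import math

def two_sided_tail(c):
    """Pr(|Z| > c) for a standard normal Z."""
    return math.erfc(c / math.sqrt(2.0))

def size_one_at_a_time(q, c=1.96):
    """Probability that at least one of q independent t tests rejects a true H0."""
    alpha_single = two_sided_tail(c)
    # H0 is not rejected only if all q tests fail to reject.
    return 1.0 - (1.0 - alpha_single) ** q

for q in (1, 2, 3, 5):
    print(f"q = {q}: size = {size_one_at_a_time(q):.4f}")
```

For q = 2 this reproduces the value of about 0.0975 derived above, and it shows that the distortion grows with the number of single tests involved.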





The possible solutions to this problem are:

- to use, for each single test, a critical value different from the usual one based on $\alpha/2$ (Bonferroni approach);
- to use a different test statistic, which considers more than one hypothesis on different parameters at the same time (the F test statistic).

The F test derived in Stock and Watson (see 5.7)

The F test statistic verifies multiple hypotheses on two or more parameters. If the multiple null hypothesis is, for instance, $\beta_1 = \beta_{1,0}$ and $\beta_2 = \beta_{2,0}$ in a model with two regressors ($k = 3$), the F statistic is:

$$F = \frac{1}{2}\left(\frac{t_1^2 + t_2^2 - 2\hat\rho_{t_1,t_2}\, t_1 t_2}{1 - \hat\rho_{t_1,t_2}^2}\right)$$

where $\hat\rho_{t_1,t_2}$ is the estimated correlation between $t_1$ and $t_2$. The F statistic takes high sample values when $t_1$ and/or $t_2$ are large in absolute value, and it adjusts for the correlation between $t_1$ and $t_2$. The expression of the F statistic becomes more complicated when more than two parameters are involved in the null hypothesis (in these situations it is better to use the matrix algebra approach).
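To see how the correlation adjustment works, the formula for two restrictions can be coded directly. This is a minimal sketch; the t statistics and the correlation used below are hypothetical numbers chosen only for illustration.

```python
def f_from_two_t(t1, t2, rho):
    """F statistic for a joint null on two coefficients.

    t1, t2 : the two individual t statistics
    rho    : estimated correlation between t1 and t2 (|rho| < 1)
    """
    if abs(rho) >= 1.0:
        raise ValueError("rho must be strictly between -1 and 1")
    return 0.5 * (t1**2 + t2**2 - 2.0 * rho * t1 * t2) / (1.0 - rho**2)

# Hypothetical t statistics, chosen only for illustration.
t1, t2 = 2.0, 1.5

f_uncorrelated = f_from_two_t(t1, t2, rho=0.0)   # reduces to (t1^2 + t2^2) / 2
f_correlated = f_from_two_t(t1, t2, rho=0.6)

print(f_uncorrelated, f_correlated)
```

With the same pair of t statistics, a positive correlation lowers the F value in this example: two statistics that tend to move together carry less joint evidence against the null than two independent ones.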




F distribution in large samples

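If $t_1$ and $t_2$ are independent, $\hat\rho_{t_1,t_2} \xrightarrow{p} 0$, and in large samples the expression of the F statistic becomes:

$$\frac{1}{2}\left(\frac{t_1^2 + t_2^2 - 2\hat\rho_{t_1,t_2}\, t_1 t_2}{1 - \hat\rho_{t_1,t_2}^2}\right) \cong \frac{1}{2}\left(t_1^2 + t_2^2\right)$$

In large samples the F statistic is therefore the average of the two squared t statistics.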

Stata software example

In the Stata statistical software, the command test, issued after the regression estimation, applies the F test statistic to verify any null hypothesis on the parameters. We verify the multiple hypothesis on the significance of the regression coefficients of the covariates STR and EXPN.





    reg testscr str expn_stu pctel, r;

    Regression with robust standard errors            Number of obs =     420
                                                      F(  3,   416) =  147.20
                                                      Prob > F      =  0.0000
                                                      R-squared     =  0.4366
                                                      Root MSE      =  14.353

    ------------------------------------------------------------------------------
                 |               Robust
         testscr |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             str |  -.2863992   .4820728    -0.59   0.553    -1.234001     .661203
        expn_stu |   .0038679   .0015807     2.45   0.015     .0007607    .0069751
           pctel |  -.6560227   .0317844   -20.64   0.000    -.7185008   -.5935446
           _cons |   649.5779   15.45834    42.02   0.000     619.1917    679.9641
    ------------------------------------------------------------------------------

    NOTE

    test str expn_stu;                 The test command follows the regression

     ( 1)  str = 0.0                   There are q=2 restrictions being tested
     ( 2)  expn_stu = 0.0

           F(  2,   416) =    5.43     The 5% critical value for q=2 is 3.00
                Prob > F =    0.0047   Stata computes the p-value for you

Figure: Model estimation and test on $\beta_2$ and $\beta_3$
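The p-value reported by the test command can be reproduced by hand. The sketch below relies on a standard closed form for the F distribution with exactly 2 numerator degrees of freedom (this formula is not in the slides, and the function names are illustrative); with other values of q a statistical library would be needed.

```python
import math

def f_sf_two_numerator_df(x, d2):
    """Pr(F > x) for an F(2, d2) variable.

    With exactly 2 numerator degrees of freedom the survival function
    has the closed form (1 + 2x/d2) ** (-d2/2).
    """
    return (1.0 + 2.0 * x / d2) ** (-d2 / 2.0)

def f_crit_two_numerator_df(alpha, d2):
    """Critical value c with Pr(F(2, d2) > c) = alpha (inverse of the above)."""
    return (d2 / 2.0) * (alpha ** (-2.0 / d2) - 1.0)

f_stat, d2 = 5.43, 416            # values reported by the test command
p_value = f_sf_two_numerator_df(f_stat, d2)
crit_5pct = f_crit_two_numerator_df(0.05, d2)

print(f"p-value            = {p_value:.4f}")
print(f"5% critical value  = {crit_5pct:.2f}")
```

The critical value for 416 denominator degrees of freedom comes out slightly above 3.00; the 3.00 quoted in the note is the large-sample value, $-\ln(0.05) \approx 3.00$.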





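We may verify different hypotheses on the parameters of a regression model:

- $H_0: \beta_j = 0$ against $H_1: \beta_j \neq 0$; this is a significance test on the parameter $\beta_j$;
- $H_0: \beta_j = \beta_{j,0}$ against $H_1: \beta_j \neq \beta_{j,0}$; in this case we verify whether the parameter $\beta_j$ is equal to $\beta_{j,0}$;
- $H_0: \beta_j + \beta_i = 1$ against $H_1: \beta_j + \beta_i \neq 1$;
- $H_0: \beta_j = \beta_i$ against $H_1: \beta_j \neq \beta_i$.

These are single hypotheses, which involve one or more model parameters. There may also be multiple hypotheses, which involve several single hypotheses on the parameters at the same time:

- $H_0: \boldsymbol\beta_2 = \begin{bmatrix} \beta_2 \\ \beta_3 \\ \vdots \\ \beta_k \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$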





In the last case we verify the significance of all regression coefficients.

In other cases we consider the significance of a subset of parameters, partitioning the vector $\boldsymbol\beta$ as follows:

$$\boldsymbol\beta = \begin{bmatrix} \boldsymbol\beta_1 \\ \boldsymbol\beta_2 \end{bmatrix}$$

In this situation $\boldsymbol\beta_1$ and $\boldsymbol\beta_2$ are vectors of order $(k_1 \times 1)$ and $((k - k_1) \times 1)$, respectively, and the null hypothesis may involve the significance of the parameters included in one of the sub-vectors.





All the hypotheses may be expressed in matrix algebra terms:

$$R\beta = r$$

Where:

- $R$ is a $(q \times k)$ matrix of constants;
- $r$ is a $(q \times 1)$ vector of constants;
- $\beta$ is the $(k \times 1)$ vector of the model parameters;
- $q$ is the number of rows of the matrix $R$ and also gives the number of single hypotheses on the parameters (restrictions) involved (it must be that $q < k$).

Consider again the hypotheses above in matrix algebra terms:

[1]- $H_0: \beta_j = 0$

$$R = \begin{bmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \end{bmatrix}$$

where the 1 is in the j-th position, $r = 0$ and $q = 1$, so that $R$ is a $(1 \times k)$ matrix.





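[2]- $H_0: \beta_j = \beta_{j,0}$

$$R = \begin{bmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \end{bmatrix}$$

where $r = \beta_{j,0}$ and $q = 1$, so that $R$ is a $(1 \times k)$ matrix.

[3]- $H_0: \beta_2 + \beta_3 = 1$

$$R = \begin{bmatrix} 0 & 1 & 1 & 0 & \cdots & 0 \end{bmatrix}$$

where $r = 1$ and $q = 1$, so that $R$ is a $(1 \times k)$ matrix.

[4]- $H_0: \beta_2 = \beta_3$ or $H_0: \beta_2 - \beta_3 = 0$

$$R = \begin{bmatrix} 0 & 1 & -1 & 0 & \cdots & 0 \end{bmatrix}$$

where $r = 0$ and $q = 1$.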





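[5]- $H_0: \boldsymbol\beta_2 = 0$

$$\boldsymbol\beta_2 = \begin{bmatrix} \beta_2 \\ \beta_3 \\ \vdots \\ \beta_k \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

$$R = \begin{bmatrix} \mathbf{0} & I_{k-1} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}$$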





The second block of the matrix $R$ is an identity matrix of order $((k-1) \times (k-1))$, so that $R$ has order $((k-1) \times k)$; moreover, $r = 0$ is a $((k-1) \times 1)$ vector of zeros and $q = k - 1$.

All restrictions, and hence all hypotheses, may be expressed as $R\beta = r$ or as $R\beta - r = 0$. The general test may be specialized to deal with any specification.
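The restriction matrices for cases [1], [3], [4] and [5] can be written down explicitly. The sketch below uses NumPy with a hypothetical number of parameters (k = 5) and a hypothetical parameter vector, only to show how $R$, $r$ and $q$ fit together.

```python
import numpy as np

k = 5                       # hypothetical number of parameters (beta_1..beta_5)

def unit_row(j, k):
    """Row vector with 1 in position j (1-based) and 0 elsewhere."""
    row = np.zeros((1, k))
    row[0, j - 1] = 1.0
    return row

# [1] H0: beta_3 = 0            (single coefficient, here j = 3)
R1, r1 = unit_row(3, k), np.array([0.0])

# [3] H0: beta_2 + beta_3 = 1
R3, r3 = unit_row(2, k) + unit_row(3, k), np.array([1.0])

# [4] H0: beta_2 - beta_3 = 0
R4, r4 = unit_row(2, k) - unit_row(3, k), np.array([0.0])

# [5] H0: beta_2 = ... = beta_k = 0   ->  R = [0  I_{k-1}]
R5 = np.hstack([np.zeros((k - 1, 1)), np.eye(k - 1)])
r5 = np.zeros(k - 1)

# A hypothetical parameter vector that satisfies [3] and [4] but not [1], [5].
beta = np.array([2.0, 0.5, 0.5, 0.0, 0.0])

for name, R, r in [("[1]", R1, r1), ("[3]", R3, r3), ("[4]", R4, r4), ("[5]", R5, r5)]:
    q = R.shape[0]                                 # number of restrictions
    holds = np.allclose(R @ beta, r)
    print(name, "q =", q, "R beta = r holds:", holds)
```

The number of rows of each $R$ gives the number of restrictions q, which is 1 for the single hypotheses and k - 1 for the joint significance test.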

Given the OLS estimator $b$, we can estimate the vector of differences $R\beta - r$ (equal to 0 under the null) with its OLS counterpart $Rb - r$.
We consider the sampling distribution of $Rb - r$.
The general test for all restrictions is based on the sampling distribution of $Rb$ under the null hypothesis.





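$Rb$ has sampling distribution given by:

$$E(Rb) = R\beta$$

$$var(Rb) = E\left[(Rb - R\beta)(Rb - R\beta)^T\right] = E\left[R(b - \beta)(b - \beta)^T R^T\right] = R\,E\left[(b - \beta)(b - \beta)^T\right] R^T = R\,var(b)\,R^T = \sigma^2 R(X^TX)^{-1}R^T$$

$b$ is a function of $u$, so the sampling distribution of $Rb$ depends on the distribution of $u$, which is assumed to be $u \sim N(0, \sigma^2 I)$. Thus:

$$b \sim N(\beta, \sigma^2 (X^TX)^{-1})$$

$$Rb \sim N(R\beta, \sigma^2 R(X^TX)^{-1}R^T)$$

$$Rb - R\beta \sim N(0, \sigma^2 R(X^TX)^{-1}R^T)$$

If $R\beta = r$, that is if $H_0$ is true, we obtain:

$$(Rb - r) \sim N(0, \sigma^2 R(X^TX)^{-1}R^T)$$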





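Remember that:

$$x \sim N(0, \sigma^2) \;\Rightarrow\; \frac{x}{\sigma} \sim N(0, 1) \;\Rightarrow\; \frac{x^2}{\sigma^2} \sim \chi^2(1)$$

Considering the previous random variable, squared and standardized, we have (written informally as a ratio):

$$\frac{(Rb - r)^T (Rb - r)}{\sigma^2 R(X^TX)^{-1}R^T} \sim \chi^2(q)$$

that is, as a quadratic form:

$$(Rb - r)^T \left[\sigma^2 R(X^TX)^{-1}R^T\right]^{-1} (Rb - r) \sim \chi^2(q)$$

But $\sigma^2$ is often unknown and has to be estimated:

$$\frac{e^T e}{\sigma^2} \sim \chi^2(n - k)$$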





The ratio between two independent r.v. $x_1$ and $x_2$ with $\chi^2$ distributions with d.f. $n_1$ and $n_2$, respectively, each divided by its d.f., gives a r.v. with F distribution. Under the null hypothesis this r.v. F is distributed as follows:

$$F = \frac{(Rb - r)^T \left[R(X^TX)^{-1}R^T\right]^{-1} (Rb - r)/q}{e^T e/(n - k)} \sim F(q, n - k)$$

(the unknown $\sigma^2$ cancels between numerator and denominator). The null $H_0: R\beta = r$ is rejected if the sample value of the F statistic is greater than the critical value (given a significance level equal to 5% or 1%) of the $F(q, n - k)$ distribution.
We furthermore know that:

$$s^2 = \frac{e^T e}{n - k}$$

Then we obtain:

$$F = (Rb - r)^T \left[s^2 R(X^TX)^{-1}R^T\right]^{-1} (Rb - r)/q \sim F(q, n - k)$$

where $s^2 (X^TX)^{-1}$ is the estimated variance-covariance matrix of $b$.
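The general statistic translates almost literally into matrix code. The following is a minimal sketch with NumPy on simulated data (the data, the seed and the function name `wald_f` are assumptions made for illustration); it uses the homoskedasticity-only formula given above, not a robust variance estimator.

```python
import numpy as np

def wald_f(X, y, R, r):
    """F statistic for H0: R beta = r in the model y = X beta + u (OLS)."""
    n, k = X.shape
    q = R.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                      # OLS estimates
    e = y - X @ b                              # residuals
    s2 = (e @ e) / (n - k)                     # estimate of sigma^2
    d = R @ b - r                              # R b - r
    middle = np.linalg.inv(s2 * R @ XtX_inv @ R.T)
    return float(d @ middle @ d) / q, b, s2, XtX_inv

# Simulated data (hypothetical): constant plus three regressors,
# with true beta_2 = beta_3 = 0 so that H0 below is true.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
beta_true = np.array([1.0, 0.0, 0.0, 2.0])
y = X @ beta_true + rng.normal(size=n)

# H0: beta_2 = 0 and beta_3 = 0  (q = 2 restrictions)
R = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
r = np.zeros(2)

F, b, s2, XtX_inv = wald_f(X, y, R, r)
print("F =", round(F, 3), "with (q, n - k) =", (R.shape[0], n - X.shape[1]))
```

The resulting value would then be compared with the critical value of the $F(q, n - k)$ distribution, here $F(2, 196)$.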





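Examples:

[1]- $H_0: \beta_j = 0$

$$Rb = \begin{bmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \end{bmatrix} \begin{bmatrix} b_1 \\ \vdots \\ b_j \\ \vdots \\ b_k \end{bmatrix} = b_j$$

With $r = 0$ and $q = 1$. Then $Rb - r = b_j$ and, denoting by $c_{il}$ the elements of $(X^TX)^{-1}$:

$$R(X^TX)^{-1}R^T = \begin{bmatrix} 0 & \cdots & 1 & \cdots & 0 \end{bmatrix} \begin{bmatrix} c_{11} & \cdots & c_{1j} & \cdots & c_{1k} \\ \vdots & \ddots & \vdots & & \vdots \\ c_{j1} & \cdots & c_{jj} & \cdots & c_{jk} \\ \vdots & & \vdots & \ddots & \vdots \\ c_{k1} & \cdots & c_{kj} & \cdots & c_{kk} \end{bmatrix} \begin{bmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{bmatrix} = c_{jj}$$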





Where $R$ is $(1 \times k)$, $R^T$ is $(k \times 1)$ and $(X^TX)^{-1}$ is $(k \times k)$.
The F test is the following:

$$F = \frac{b_j^2}{s^2 c_{jj}} = \frac{b_j^2}{var(b_j)} \sim F(1, n - k)$$

Or:

$$t = \sqrt{F} = \frac{b_j}{s.e.(b_j)} \sim t(n - k)$$

This is the t test for the hypothesis on the significance of the parameter $\beta_j$.

[2]- $H_0: \beta_j = 1$

This test is similar to the previous one:

$$t = \frac{b_j - 1}{s.e.(b_j)} \sim t(n - k) \qquad F = \frac{(b_j - 1)^2}{var(b_j)}$$

Where $b_j - 1 = Rb - r$.
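The equivalence between the squared t statistic and the F statistic with q = 1 can be verified numerically. The sketch below uses simulated data (hypothetical values, chosen only to check the algebra) and the same notation as above, with `C` standing for $(X^TX)^{-1}$.

```python
import numpy as np

# Simulated data (hypothetical), used only to check the algebra.
rng = np.random.default_rng(1)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

C = np.linalg.inv(X.T @ X)            # (X'X)^{-1}, with elements c_ij
b = C @ X.T @ y
e = y - X @ b
s2 = (e @ e) / (n - k)

j = 2                                  # test H0: beta_2 = 0 (1-based index)
R = np.zeros((1, k))
R[0, j - 1] = 1.0

# t statistic: b_j / s.e.(b_j), with var(b_j) = s^2 * c_jj
t_stat = b[j - 1] / np.sqrt(s2 * C[j - 1, j - 1])

# F statistic from the general formula with q = 1 and r = 0
d = R @ b
F = float(d @ np.linalg.inv(s2 * R @ C @ R.T) @ d)

print("t^2 =", t_stat**2, " F =", F)
```

The two printed values coincide up to rounding error, whatever the sign of the t statistic.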





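[3]- $H_0: \beta_2 - \beta_3 = 1$

$$Rb = \begin{bmatrix} 0 & 1 & -1 & 0 & \cdots & 0 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_k \end{bmatrix} = b_2 - b_3$$

Furthermore $r = 1$.

$$R(X^TX)^{-1}R^T = \begin{bmatrix} 0 & 1 & -1 & \cdots & 0 \end{bmatrix} \begin{bmatrix} c_{11} & c_{12} & c_{13} & \cdots & c_{1k} \\ c_{21} & c_{22} & c_{23} & \cdots & c_{2k} \\ c_{31} & c_{32} & c_{33} & \cdots & c_{3k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ c_{k1} & c_{k2} & c_{k3} & \cdots & c_{kk} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ -1 \\ \vdots \\ 0 \end{bmatrix} = c_{22} - 2c_{23} + c_{33}$$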





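Then:

$$F = \frac{(b_2 - b_3 - 1)^2}{s^2 (c_{22} - 2c_{23} + c_{33})} = \frac{(b_2 - b_3 - 1)^2}{var(b_2) - 2cov(b_2, b_3) + var(b_3)} \sim F(1, n - k)$$

$$t = \frac{b_2 - b_3 - 1}{\sqrt{var(b_2 - b_3)}} \sim t(n - k)$$

The confidence interval for $\beta_2 - \beta_3$ is $\left[(b_2 - b_3) \pm t_{0.025} \sqrt{var(b_2 - b_3)}\right]$.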
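A short numerical example may help to fix the role of the covariance term. All the numbers below are hypothetical, and 1.96 is used as a large-sample stand-in for $t_{0.025}(n - k)$.

```python
import math

# Hypothetical estimates and (co)variances, for illustration only.
b2, b3 = 1.8, 0.5
var_b2, var_b3, cov_b2_b3 = 0.09, 0.04, 0.02

# Variance of the difference: var(b2) - 2 cov(b2, b3) + var(b3)
var_diff = var_b2 - 2.0 * cov_b2_b3 + var_b3
se_diff = math.sqrt(var_diff)

# t and F statistics for H0: beta_2 - beta_3 = 1
t_stat = (b2 - b3 - 1.0) / se_diff
f_stat = t_stat**2

# Approximate 95% interval for beta_2 - beta_3; 1.96 is the large-sample
# stand-in for t_{0.025}(n - k).
crit = 1.96
ci = (b2 - b3 - crit * se_diff, b2 - b3 + crit * se_diff)

print(f"t = {t_stat:.3f}, F = {f_stat:.3f}, CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
```

Ignoring the covariance would give a variance of 0.13 instead of 0.09 in this example, and therefore a smaller t statistic and a wider interval.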

[4]- $H_0: \beta_3 = -\beta_4$ or $H_0: \beta_3 + \beta_4 = 0$

From above we derive:

$$t = \frac{b_3 + b_4}{\sqrt{var(b_3 + b_4)}} \sim t(n - k)$$

Then $Rb = b_3 + b_4$ and $r = 0$.

[5]- $H_0: \beta_2 = \beta_3 = \cdots = \beta_k = 0$





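In this case $q = k - 1$: there is not a single hypothesis but a multiple one, which relates to the $k - 1$ regression coefficients (in this case the t test is not equivalent to the F test). In this situation we consider a partition of the $X$ matrix:

$$X = \begin{bmatrix} i & X_2 \end{bmatrix}$$

where $X_2$ is the $(n \times (k - 1))$ matrix of the observations on the covariates, while $i$ is the $(n \times 1)$ vector with all values equal to 1, relative to the constant.

We also consider the partition of the vector $b$ as $b^T = \begin{bmatrix} b_1 & \mathbf{b}_2^T \end{bmatrix}$, so that:

$$Rb = I_{k-1}\mathbf{b}_2 = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \begin{bmatrix} b_2 \\ b_3 \\ \vdots \\ b_k \end{bmatrix} = \mathbf{b}_2$$

This is a $((k - 1) \times 1)$ vector of estimated parameters.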





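The partitioned matrix is $X = \begin{bmatrix} i & X_2 \end{bmatrix}$, then:

$$X^TX = \begin{bmatrix} i^T \\ X_2^T \end{bmatrix} \begin{bmatrix} i & X_2 \end{bmatrix} = \begin{bmatrix} n & i^T X_2 \\ X_2^T i & X_2^T X_2 \end{bmatrix}$$

The inverse of the partitioned matrix is:

$$(X^TX)^{-1} = \begin{bmatrix} \cdot & \cdot \\ \cdot & B_{22} \end{bmatrix}$$

where only the lower-right block is needed here:

$$B_{22} = \left[X_2^T X_2 - X_2^T i\, n^{-1} i^T X_2\right]^{-1} = \left[X_2^T \left(I - \tfrac{1}{n} i i^T\right) X_2\right]^{-1}$$

and then

$$B_{22} = \left[X_2^T A X_2\right]^{-1} = (X_*^T X_*)^{-1}$$

where $A = I - \frac{1}{n} i i^T$ is the symmetric idempotent matrix (such that $AA^T = A$) which transforms the observations into deviations from their means, and $X_* = AX_2$. From this we obtain: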





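$$R(X^TX)^{-1}R^T = (X_*^T X_*)^{-1}$$

which is a $((k - 1) \times (k - 1))$ matrix equal to $B_{22}$, with $X_* = AX_2$ the matrix of the covariates in deviations from their means. Then the F test is:

$$F = \frac{(Rb - r)^T \left[R(X^TX)^{-1}R^T\right]^{-1} (Rb - r)/q}{e^T e/(n - k)} \sim F(q, n - k)$$

$$F = \frac{\mathbf{b}_2^T (X_*^T X_*) \mathbf{b}_2/(k - 1)}{e^T e/(n - k)} \sim F(k - 1, n - k)$$

This is the test on the significance of all regression coefficients:

$$F = \frac{ESS/(k - 1)}{RSS/(n - k)} \sim F(k - 1, n - k)$$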





Given that $R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$, so that $ESS = R^2\, TSS$ and $RSS = (1 - R^2)\, TSS$, we may express the F test as follows:

$$F = \frac{R^2/(k - 1)}{(1 - R^2)/(n - k)} \sim F(k - 1, n - k)$$
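As a quick check of this formula, the sketch below plugs in the values read from the Stata output shown earlier (R-squared = 0.4366, n = 420, k = 4). The function name is illustrative.

```python
def overall_f_from_r2(r2, n, k):
    """Overall-significance F computed from R^2 (k includes the constant)."""
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))

# Values read from the regression output shown earlier:
# R-squared = 0.4366, n = 420, k = 4 (constant plus three regressors).
f_r2 = overall_f_from_r2(0.4366, n=420, k=4)
print(f"F from R^2 = {f_r2:.2f}  with (k - 1, n - k) = (3, 416)")
```

The value obtained in this way is lower than the F(3, 416) = 147.20 printed by Stata. This is expected: that regression was estimated with the robust option, so Stata reports a heteroskedasticity-robust F, while the formula based on $R^2$ is the homoskedasticity-only version derived here.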

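We now consider the null hypothesis $H_0: \boldsymbol\beta_2 = 0$, where $\boldsymbol\beta_2$ is a sub-vector of the parameters.

This hypothesis relates to a subset of the regression coefficients, and we consider a partition of the $X$ matrix and of the coefficient vector:

$$Y = \begin{bmatrix} X_1 & X_2 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} + e = X_1 b_1 + X_2 b_2 + e$$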





Where $X_1$ is an $(n \times k_1)$ matrix, $X_2$ is an $(n \times k_2)$, that is $(n \times (k - k_1))$, matrix, $b_1$ is a $(k_1 \times 1)$ vector and $b_2$ is a $(k_2 \times 1)$ vector.
In this case we have:

$$X^TX = \begin{bmatrix} X_1^T X_1 & X_1^T X_2 \\ X_2^T X_1 & X_2^T X_2 \end{bmatrix}$$

We may prove that $R(X^TX)^{-1}R^T = (X_2^T M_1 X_2)^{-1}$, where $M_1 = I - X_1 (X_1^T X_1)^{-1} X_1^T$ is a projection matrix such that $M_1 X_1 = 0$, $M_1 e = e$ and $M_1 Y = e_1$, the residuals of the regression of $Y$ on $X_1$ only.
Then:

$$(Rb - r)^T \left[R(X^TX)^{-1}R^T\right]^{-1} (Rb - r)/q = b_2^T (X_2^T M_1 X_2) b_2/k_2$$

where $k_2$ is the number of restrictions ($q = k_2$).





Moreover, given that:

$$M_1 Y = M_1 X_1 b_1 + M_1 X_2 b_2 + M_1 e = M_1 X_2 b_2 + e$$

considering the quadratic form (the cross-product terms vanish because $X_2^T e = 0$) we obtain:

$$(M_1 Y)^T (M_1 Y) = Y^T M_1 Y = b_2^T X_2^T M_1 X_2 b_2 + e^T e$$

Where:

- $Y^T M_1 Y$ is the RSS of the regression of $Y$ on $X_1$;
- $e^T e$ is the RSS of the regression of $Y$ on $X_1$ and $X_2$;
- $b_2^T X_2^T M_1 X_2 b_2$ is the reduction in the RSS when the variables in $X_2$ are included in the regression of $Y$ on $X_1$.

We may denote by $e^T e$ the RSS of the regression of $Y$ on $X = \begin{bmatrix} X_1 & X_2 \end{bmatrix}$ and by $e_1^T e_1$ the RSS of the regression of $Y$ on $X_1$ only.





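Then, with $e_1 = M_1 Y$ the residuals of the regression of $Y$ on $X_1$ only:

$$e_1^T e_1 = Y^T M_1 Y = b_2^T X_2^T M_1 X_2 b_2 + e^T e$$

$$b_2^T X_2^T M_1 X_2 b_2 = e_1^T e_1 - e^T e$$

The test statistic is:

$$F = \frac{(e_1^T e_1 - e^T e)/k_2}{e^T e/(n - k)} \sim F(k_2, n - k)$$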
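The equality between the quadratic form in $b_2$ and the difference of the two residual sums of squares can be checked numerically. The sketch below uses NumPy on simulated data (the data, the seed and the helper `rss` are assumptions made for illustration) and computes the F statistic in both ways.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from the OLS regression of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return float(e @ e)

# Simulated data (hypothetical): X1 holds the constant and one regressor,
# X2 holds the k2 = 2 regressors whose joint significance is tested.
rng = np.random.default_rng(2)
n = 150
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = rng.normal(size=(n, 2))
X = np.hstack([X1, X2])
y = X @ np.array([1.0, 0.8, 0.3, 0.0]) + rng.normal(size=n)

k, k2 = X.shape[1], X2.shape[1]
rss_restricted = rss(X1, y)        # e_1' e_1 : Y on X1 only
rss_unrestricted = rss(X, y)       # e' e     : Y on X1 and X2

F_rss = ((rss_restricted - rss_unrestricted) / k2) / (rss_unrestricted / (n - k))

# Cross-check with b2' (X2' M1 X2) b2 / k2 in the numerator.
b = np.linalg.solve(X.T @ X, X.T @ y)
b2 = b[-k2:]
M1 = np.eye(n) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)
F_quad = (b2 @ (X2.T @ M1 @ X2) @ b2 / k2) / (rss_unrestricted / (n - k))

print("F (RSS form) =", F_rss, " F (quadratic form) =", F_quad)
```

In practice the RSS form is the more convenient of the two, since it only requires running the restricted and the unrestricted regressions.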

