Statistical Properties of OLS

Upload: evgeniy-bokhon

Post on 07-Apr-2018


TRANSCRIPT

  • 8/4/2019 Statistical Properties of OLS

    1/59

    1. What is the disturbance term?

    2. Why does the disturbance term exist? (5 reasons)

    3. The fitted line.

    4. What is a residual for each observation?

    5. What is RSS?

    6. The normal equations for the regression coefficients.

    7. Formulas for estimates b1 and b2.

    Part 3. Properties of the regression coefficients and hypothesis testing. Questions.

    1. The F-Test of Goodness of Fit;

    2. The Random Components of the Regression Coefficients;

    3. The Gauss–Markov Theorem;

    4. Unbiasedness of the Regression Coefficients;

    5. Precision of the Regression Coefficients;

    6. Testing Hypotheses Relating to the Regression Coefficients;

    7. Confidence Intervals;

    8. One-Tailed t Tests

    1. The F-Test of Goodness of Fit

    Even if there is no relationship between Y and X, in any given sample of observations there may appear to be one, if only a faint one.

    Only by coincidence will the sample covariance be exactly equal to 0.

    Accordingly, only by coincidence will the correlation coefficient and $R^2$ be exactly equal to 0.

    Suppose that the regression model is

    $$Y_i = \beta_1 + \beta_2 X_i + u_i.$$

    We take as our null hypothesis that there is no relationship between Y and X, that is, $H_0: \beta_2 = 0$.

    We calculate the value that would be exceeded by $R^2$ as a matter of chance 5 percent of the time. We then take this figure as the critical level of $R^2$ for a 5 percent significance test.

    If it is exceeded, we reject the null hypothesis in favor of $H_1: \beta_2 \neq 0$.

    Suppose that, as in this case, you can decompose the variance of the dependent variable into "explained" and "unexplained" components using formula (4) from the previous lecture:

    $$\operatorname{Var}(Y) = \operatorname{Var}(\hat{Y}) + \operatorname{Var}(e).$$

    Using the definition of sample variance, and multiplying through by n, we can rewrite the decomposition as

    $$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} e_i^2.$$

    (Remember that $\bar{e}$ is 0 and that the sample mean of $\hat{Y}$ is equal to the sample mean of Y.)

    The left side is TSS, the total sum of squares of the values of the dependent variable about its sample mean.

    The first term on the right side is ESS, the explained sum of squares, and the second term is RSS, the unexplained, residual sum of squares:

    $$TSS = ESS + RSS.$$
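The decomposition can be checked numerically. The following is a minimal Python sketch on a small made-up sample (the data values and variable names are illustrative assumptions, not taken from the lecture). It fits the line by OLS and confirms that TSS equals ESS plus RSS up to rounding error.

```python
# Hypothetical sample, invented for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# OLS estimates: b2 = Cov(X, Y) / Var(X), b1 = y_bar - b2 * x_bar
cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / n
var_x = sum((x - x_bar) ** 2 for x in X) / n
b2 = cov_xy / var_x
b1 = y_bar - b2 * x_bar

fitted = [b1 + b2 * x for x in X]               # fitted values
resid = [y - f for y, f in zip(Y, fitted)]      # residuals e_i

tss = sum((y - y_bar) ** 2 for y in Y)          # total sum of squares
ess = sum((f - y_bar) ** 2 for f in fitted)     # explained sum of squares
rss = sum(e ** 2 for e in resid)                # residual sum of squares

print(f"TSS = {tss:.4f}, ESS + RSS = {ess + rss:.4f}")
```

The identity relies on the residuals summing to zero, which holds whenever the regression includes an intercept.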

    The F statistic for the goodness of fit of a regression is written as the explained sum of squares, per explanatory variable, divided by the residual sum of squares, per degree of freedom remaining:

    $$F = \frac{ESS/(k-1)}{RSS/(n-k)},$$

    where k is the number of parameters in the regression equation (intercept and $k - 1$ slope coefficients).

    By dividing both the numerator and the denominator of the ratio by TSS, this F statistic may equivalently be expressed in terms of $R^2$:

    $$F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}.$$

    In the present context, k is 2, so

    $$F = \frac{R^2}{(1-R^2)/(n-2)}.$$

    Having calculated F from your value of $R^2$, you look up $F_{crit}$, the critical level of F, in the appropriate table.

    If F is greater than $F_{crit}$, you conclude that the "explanation" of Y is better than is likely to have arisen by chance.
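As a rough illustration of the procedure, the sketch below computes F from $R^2$ and compares it with a critical value. The function name `f_from_r2`, the values $R^2 = 0.5$ and $n = 20$, and the hard-coded table value are assumptions made for this example only.

```python
def f_from_r2(r2, n, k=2):
    """F statistic for goodness of fit: (R^2/(k-1)) / ((1-R^2)/(n-k))."""
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))

# Hypothetical regression result, chosen only for illustration.
r2, n = 0.5, 20
f_stat = f_from_r2(r2, n)

# Tabulated 5 percent critical value of F with (1, 18) degrees of freedom.
F_CRIT_5PCT_1_18 = 4.41

reject_h0 = f_stat > F_CRIT_5PCT_1_18
print(f"F = {f_stat:.2f}, reject H0: {reject_h0}")
```

In practice the critical value would be read from an F table (or a statistics library) for the actual degrees of freedom, $k - 1$ and $n - k$.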

    2. The Random Components of the Regression Coefficients

    A least squares regression coefficient is a special form of random variable whose properties depend on those of the disturbance term in the equation.

    Suppose that Y depends on X according to the relationship

    $$Y_i = \beta_1 + \beta_2 X_i + u_i$$

    and we are fitting the regression equation

    $$\hat{Y}_i = b_1 + b_2 X_i$$

    given a sample of n observations.

    We shall also continue to assume that X is a nonstochastic exogenous variable.

    Its value in each observation may be considered to be predetermined by factors unconnected with the present relationship.

    Note that $Y_i$ has two components.

    It has a nonrandom component ($\beta_1 + \beta_2 X_i$), which owes nothing to the laws of chance ($\beta_1$ and $\beta_2$ may be unknown, but they are fixed constants), and it has the random component $u_i$.

    We can calculate $b_2$ according to the usual formula

    $$b_2 = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)}.$$

    $b_2$ also has a random component.

    Cov(X, Y) depends on the values of Y, and the values of Y depend on the values of u.

    If the values of the disturbance term had been different in the n observations, we would have obtained different values of Y, hence different values of Cov(X, Y), and hence different values of $b_2$.

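    We can in theory decompose $b_2$ into its nonrandom and random components:

    $$\operatorname{Cov}(X, Y) = \operatorname{Cov}(X, \beta_1 + \beta_2 X + u) = \operatorname{Cov}(X, \beta_1) + \operatorname{Cov}(X, \beta_2 X) + \operatorname{Cov}(X, u).$$

    Since $\beta_1$ and $\beta_2$ are constants, by the covariance rules Cov(X, $\beta_1$) must be equal to 0.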

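    Cov(X, $\beta_2 X$) is equal to $\beta_2$ Cov(X, X).

    Cov(X, X) is the same as Var(X).

    Hence we can write

    $$\operatorname{Cov}(X, Y) = \beta_2 \operatorname{Var}(X) + \operatorname{Cov}(X, u)$$

    and so

    $$b_2 = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)} = \beta_2 + \frac{\operatorname{Cov}(X, u)}{\operatorname{Var}(X)}. \qquad (*)$$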

    Thus we have shown that the regression coefficient $b_2$ obtained from any sample consists of

    (1) a fixed component, equal to the true value, $\beta_2$, and

    (2) a random component dependent on Cov(X, u), which is responsible for its variations around this central tendency.

    One may easily show that $b_1$ has a fixed component equal to the true value, $\beta_1$, plus a random component that depends on the random factor u.

    Note that you are not able to make these decompositions in practice because you do not know the true values of $\beta_1$ and $\beta_2$ or the actual values of u in the sample.

    We are interested in them because they enable us to say something about the theoretical properties of $b_1$ and $b_2$, given certain assumptions.
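Although the decomposition cannot be made with real data, it can be verified in a simulation, where the true parameters and the disturbances are known by construction. The sketch below is one such check. The parameter values, the sample design and the helper names `mean` and `cov` are assumptions chosen for illustration, and the covariance uses the divisor n.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Assumed "true" parameters, known only because this is a simulation.
beta1, beta2, sigma_u = 2.0, 0.5, 1.0

X = [float(i) for i in range(1, 21)]            # nonstochastic regressor
u = [random.gauss(0.0, sigma_u) for _ in X]     # disturbance term
Y = [beta1 + beta2 * x + ui for x, ui in zip(X, u)]

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance with divisor n."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / len(a)

b2 = cov(X, Y) / cov(X, X)              # usual OLS formula
random_part = cov(X, u) / cov(X, X)     # Cov(X, u) / Var(X)

# The estimate equals the true value plus the random component.
print(b2, beta2 + random_part)
```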

    3. The Gauss–Markov Theorem

    We shall continue to work with the simple regression model where Y depends on X according to the relationship

    $$Y_i = \beta_1 + \beta_2 X_i + u_i$$

    and we are fitting the regression equation

    $$\hat{Y}_i = b_1 + b_2 X_i$$

    given a sample of n observations.

    The properties of the regression coefficients depend critically on the properties of the disturbance term.

    Indeed, the latter has to satisfy four conditions, known as the Gauss–Markov conditions, if ordinary least squares regression analysis is to give the best possible results.

    Gauss–Markov Condition 1: $E(u_i) = 0$ for All Observations

    The first condition is that the expected value of the disturbance term in any observation should be 0.

    Sometimes it will be positive, sometimes negative, but it should not have a systematic tendency in either direction.

    Gauss–Markov Condition 2: Population Variance of $u_i$ Constant for All Observations

    The second condition is that the population variance of the disturbance term should be constant for all observations.

    Sometimes the disturbance term will be greater, sometimes smaller, but there should not be any a priori reason for it to be more irregular in some observations than in others.

    The constant is usually denoted $\sigma_u^2$.

    Since $E(u_i)$ is 0, the population variance of $u_i$ is equal to $E(u_i^2)$, so the condition can also be written

    $$E(u_i^2) = \sigma_u^2 \quad \text{for all } i.$$

    $\sigma_u$, of course, is unknown. One of the tasks of regression analysis is to estimate the standard deviation of the disturbance term.

    Gauss–Markov Condition 3: $u_i$ Distributed Independently of $u_j$ ($i \neq j$)

    This condition states that there should be no systematic association between the values of the disturbance term in any two observations.

    The condition implies that the population covariance between $u_i$ and $u_j$ is 0, because

    $$\sigma_{u_i u_j} = E[(u_i - \mu_u)(u_j - \mu_u)] = E(u_i u_j) = E(u_i)E(u_j) = 0.$$

    Gauss–Markov Condition 4: u Distributed Independently of the Explanatory Variables

    The population covariance between the explanatory variable and the disturbance term is 0.

    Since $E(u_i)$ is 0, and the term involving X is nonstochastic,

    $$\sigma_{X_i u_i} = E[\{X_i - E(X_i)\}\{u_i - \mu_u\}] = (X_i - X_i)\,E(u_i) = 0.$$

    The Normality Assumption

    In addition to the Gauss–Markov conditions, one usually assumes that the disturbance term is normally distributed.

    4. Unbiasedness of the Regression Coefficients

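    From the decomposition (*) we can show that $b_2$ must be an unbiased estimator of $\beta_2$ if the fourth Gauss–Markov condition is satisfied:

    $$E(b_2) = E\left[\beta_2 + \frac{\operatorname{Cov}(X, u)}{\operatorname{Var}(X)}\right] = \beta_2 + E\left[\frac{\operatorname{Cov}(X, u)}{\operatorname{Var}(X)}\right].$$

    If we adopt the strong version of the fourth Gauss–Markov condition and assume that X is nonrandom, we may also take Var(X) as a given constant, and so

    $$E(b_2) = \beta_2 + \frac{1}{\operatorname{Var}(X)}\,E[\operatorname{Cov}(X, u)].$$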

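    We will demonstrate that E[Cov(X, u)] is 0:

    $$\begin{aligned}
    E[\operatorname{Cov}(X, u)] &= E\left[\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})(u_i - \bar{u})\right] \\
    &= \frac{1}{n}\sum_{i=1}^{n} E\left[(X_i - \bar{X})(u_i - \bar{u})\right] \\
    &= \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})\,E(u_i - \bar{u}) = 0.
    \end{aligned}$$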

    In the second line of the derivation, the second expected value rule has been used to bring (1/n) out of the expression as a common factor, and the first rule has been used to break up the expectation of the sum into the sum of the expectations.

    In the third line, the term involving X has been brought out because X is nonstochastic. By virtue of the first Gauss–Markov condition, $E(u_i)$ is 0, and hence $E(\bar{u})$ is also 0.

    Therefore E[Cov(X, u)] is 0 and

    $$E(b_2) = \beta_2.$$

    In other words, $b_2$ is an unbiased estimator of $\beta_2$. One may easily show that $b_1$ is an unbiased estimator of $\beta_1$.
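Unbiasedness is a statement about the average of $b_2$ over repeated samples, so it can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration under assumed values (true parameters, fixed X values, normally distributed disturbances, 2000 replications). It holds X fixed, redraws the disturbance term each time, and averages the resulting slope estimates.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

beta1, beta2, sigma_u = 2.0, 0.5, 1.0      # assumed true values
X = [float(i) for i in range(1, 21)]       # held fixed across samples
n = len(X)
x_bar = sum(X) / n
sxx = sum((x - x_bar) ** 2 for x in X)     # equals n * Var(X)

def draw_b2():
    """Generate one sample of Y and return the OLS slope b2."""
    Y = [beta1 + beta2 * x + random.gauss(0.0, sigma_u) for x in X]
    y_bar = sum(Y) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    return sxy / sxx

replications = 2000
estimates = [draw_b2() for _ in range(replications)]
mean_b2 = sum(estimates) / replications

print(f"true beta2 = {beta2}, average b2 = {mean_b2:.4f}")
```

Individual estimates scatter around the true value; it is their average that settles close to $\beta_2$ as the number of replications grows.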

    5. Precision of the Regression Coefficients

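    Now we shall consider the population variances of $b_1$ and $b_2$ about their population means. These are given by the following expressions (Thomas, 1983):

    $$\sigma_{b_1}^2 = \frac{\sigma_u^2}{n}\left[1 + \frac{\bar{X}^2}{\operatorname{Var}(X)}\right], \qquad \sigma_{b_2}^2 = \frac{\sigma_u^2}{n \operatorname{Var}(X)}.$$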

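    The standard errors of the regression coefficients will be calculated as

    $$\text{s.e.}(b_1) = \sqrt{\frac{s_u^2}{n}\left[1 + \frac{\bar{X}^2}{\operatorname{Var}(X)}\right]}, \qquad \text{s.e.}(b_2) = \sqrt{\frac{s_u^2}{n \operatorname{Var}(X)}}, \qquad (*)$$

    where $s_u^2 = \frac{n}{n-2}\operatorname{Var}(e)$ is the unbiased estimator of the variance of the disturbance term obtained from the residuals.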

    The higher the variance of the disturbance term, the higher the sample variance of the residuals is likely to be, and hence the higher will be the standard errors of the coefficients in the regression equation, reflecting the risk that the coefficients are inaccurate.

    However, it is only a risk. It is possible that in any particular sample the effects of the disturbance term in the different observations will cancel each other out and the regression coefficients will be accurate after all.
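A minimal sketch of the standard error calculation follows. It assumes the usual simple-regression formulas, with the disturbance variance estimated as RSS/(n − 2) and Var(X) computed with divisor n. The data are invented for illustration.

```python
import math

# Hypothetical sample, invented for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
var_x = sum((x - x_bar) ** 2 for x in X) / n                    # divisor n
cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / n

b2 = cov_xy / var_x
b1 = y_bar - b2 * x_bar

# Residual sum of squares and the estimate of the disturbance variance.
rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
s_u2 = rss / (n - 2)

se_b1 = math.sqrt(s_u2 / n * (1.0 + x_bar ** 2 / var_x))
se_b2 = math.sqrt(s_u2 / (n * var_x))

print(f"b1 = {b1:.4f} (s.e. {se_b1:.4f}), b2 = {b2:.4f} (s.e. {se_b2:.4f})")
```

Both standard errors shrink as n or Var(X) grows and rise with the residual variance, which matches the discussion above.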

    6. Testing Hypotheses Relating to the Regression Coefficients

    Suppose you have a theoretical relationship

    $$Y_i = \beta_1 + \beta_2 X_i + u_i$$

    and your null and alternative hypotheses are

    $$H_0: \beta_2 = \beta_2^0, \qquad H_1: \beta_2 \neq \beta_2^0.$$

    We have assumed that the standard deviation of $b_2$ is known, which is most unlikely in practice.

    It has to be estimated by the standard error of $b_2$, given by (*) in the previous section.

    This causes two modifications to the test procedure.

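    First, z is now defined using s.e.($b_2$) instead of s.d.($b_2$), and it is referred to as the t statistic:

    $$t = \frac{b_2 - \beta_2^0}{\text{s.e.}(b_2)},$$

    where $\beta_2^0$ is the value of $\beta_2$ under the null hypothesis.

    Second, the critical levels of t depend upon what is known as a t distribution, which in the simple regression model has n − 2 degrees of freedom.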

    The critical value of t is denoted $t_{crit}$.

    The condition that a regression estimate should not lead to the rejection of a null hypothesis $H_0: \beta_2 = \beta_2^0$ is

    $$-t_{crit} \le \frac{b_2 - \beta_2^0}{\text{s.e.}(b_2)} \le t_{crit}.$$

    Hence we have the decision rule: reject $H_0$ if $|t| > t_{crit}$; do not reject if $|t| \le t_{crit}$,

    where $|t|$ is the absolute value (numerical value, neglecting the sign) of t.
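The decision rule can be written out in a few lines. The sketch below is illustrative only: the estimate, its standard error and the sample size are hypothetical, the function name `t_statistic` is an assumption, and the critical value is the tabulated two-tailed 5 percent value for 18 degrees of freedom.

```python
def t_statistic(b2, beta2_null, se_b2):
    """t = (estimate - hypothesized value) / standard error of the estimate."""
    return (b2 - beta2_null) / se_b2

# Hypothetical regression output with n = 20 observations (18 degrees of freedom).
b2, se_b2 = 0.5, 0.2
t = t_statistic(b2, 0.0, se_b2)    # testing H0: beta2 = 0

# Tabulated two-tailed 5 percent critical value of t with 18 degrees of freedom.
T_CRIT_5PCT_18 = 2.101

reject_h0 = abs(t) > T_CRIT_5PCT_18
print(f"t = {t:.2f}, reject H0 at the 5 percent level: {reject_h0}")
```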

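    The p value approach is more informative than the 5 percent/1 percent approach, in that it gives the probability, if the null hypothesis is true, of obtaining a t statistic at least as extreme as the one observed. This is the smallest significance level (risk of a Type I error) at which the null hypothesis would be rejected.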

    7. Confidence Intervals

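    In the last lecture we showed that the null hypothesis is not rejected if

    $$-t_{crit} \le \frac{b_2 - \beta_2}{\text{s.e.}(b_2)} \le t_{crit}.$$

    From this we can see that the regression coefficient $b_2$ and a hypothetical value $\beta_2$ are incompatible if either

    $$\frac{b_2 - \beta_2}{\text{s.e.}(b_2)} > t_{crit} \quad \text{or} \quad \frac{b_2 - \beta_2}{\text{s.e.}(b_2)} < -t_{crit},$$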

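    that is, if either

    $$b_2 - \beta_2 > \text{s.e.}(b_2)\, t_{crit} \quad \text{or} \quad b_2 - \beta_2 < -\text{s.e.}(b_2)\, t_{crit},$$

    that is, if either

    $$b_2 - \text{s.e.}(b_2)\, t_{crit} > \beta_2 \quad \text{or} \quad b_2 + \text{s.e.}(b_2)\, t_{crit} < \beta_2.$$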

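    It therefore follows that a hypothetical $\beta_2$ is compatible with the regression result if both

    $$b_2 - \text{s.e.}(b_2)\, t_{crit} \le \beta_2 \quad \text{and} \quad b_2 + \text{s.e.}(b_2)\, t_{crit} \ge \beta_2,$$

    that is, if $\beta_2$ satisfies the double inequality

    $$b_2 - \text{s.e.}(b_2)\, t_{crit} \le \beta_2 \le b_2 + \text{s.e.}(b_2)\, t_{crit}.$$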

    Any hypothetical value of $\beta_2$ that satisfies this double inequality will therefore automatically be compatible with the estimate $b_2$, that is, will not be rejected by it.

    The set of all such values, given by the interval between the lower and upper limits of the inequality, is known as the confidence interval for $\beta_2$.

    Note that the center of the confidence interval is $b_2$ itself.

    The limits are equidistant on either side. Note also that, since the value of $t_{crit}$ depends upon the choice of significance level, the limits will also depend on this choice.

    If the 5 percent significance level is adopted, the corresponding confidence interval is known as the 95 percent confidence interval. If the 1 percent level is chosen, one obtains the 99 percent confidence interval, and so on.
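To make the dependence on the significance level concrete, the sketch below computes 95 and 99 percent intervals for a hypothetical estimate. The numbers and the function name `confidence_interval` are assumptions for illustration. The critical values are the tabulated two-tailed values of t for 18 degrees of freedom.

```python
def confidence_interval(b2, se_b2, t_crit):
    """Return (lower, upper) limits: b2 -/+ t_crit * s.e.(b2)."""
    half_width = t_crit * se_b2
    return b2 - half_width, b2 + half_width

# Hypothetical regression output with 18 degrees of freedom.
b2, se_b2 = 0.5, 0.2

# Tabulated two-tailed critical values of t with 18 degrees of freedom.
T_CRIT_5PCT_18 = 2.101
T_CRIT_1PCT_18 = 2.878

lo95, hi95 = confidence_interval(b2, se_b2, T_CRIT_5PCT_18)
lo99, hi99 = confidence_interval(b2, se_b2, T_CRIT_1PCT_18)

print(f"95 percent interval: ({lo95:.4f}, {hi95:.4f})")
print(f"99 percent interval: ({lo99:.4f}, {hi99:.4f})")
```

With these numbers the value 0 lies outside the 95 percent interval but inside the 99 percent interval, so $H_0: \beta_2 = 0$ would be rejected at the 5 percent level but not at the 1 percent level.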

    8. One-Tailed t Tests

    The use of a one-tailed test has to be justified beforehand on the grounds of theory, common sense, or previous experience.

    When stating the justification, you should be careful not to exclude the possibility that the null hypothesis is true.

    One-tailed tests are very important in practice in econometrics.

    The usual way of establishing that an explanatory variable really does influence a dependent variable is to set up the null hypothesis $H_0: \beta_2 = 0$ and try to refute it.

    We can refute $H_0$ and accept $H_1$ if $|t| > t_{crit}$.

    Very frequently, our theory is strong enough to tell us that, if X does influence Y, its effect will be in a given direction.

    If we have good reason to believe that the effect is not negative, we are in a position to use the alternative hypothesis $H_1: \beta_2 > 0$ instead of the more general $H_1: \beta_2 \neq 0$.

    This is an advantage because the critical value of t for rejecting $H_0$ is lower for the one-tailed test, so it is easier to refute the null hypothesis and establish the relationship.
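The difference between the two critical values can be seen in a short example. The sketch below uses the tabulated 5 percent critical values of t for 18 degrees of freedom and a hypothetical t statistic of 1.9, chosen for illustration so that it falls between the two.

```python
# Tabulated 5 percent critical values of t with 18 degrees of freedom.
T_CRIT_TWO_TAILED = 2.101
T_CRIT_ONE_TAILED = 1.734

t = 1.9   # hypothetical t statistic for H0: beta2 = 0

reject_two_tailed = abs(t) > T_CRIT_TWO_TAILED   # against H1: beta2 != 0
reject_one_tailed = t > T_CRIT_ONE_TAILED        # against H1: beta2 > 0

print(f"two-tailed test rejects H0: {reject_two_tailed}")
print(f"one-tailed test rejects H0: {reject_one_tailed}")
```

Here the same estimate is insignificant under the two-tailed test but significant under the one-tailed test, which is why the choice of a one-tailed test has to be justified before looking at the result.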