Chapters 1 & 2 - Econometrics (Smith/Watson)



    Introduction to Econometrics

    Chapters 1 and 2

    The statistical analysis of economic (and related) data

    and

    Review of Probability


    Introduction to Econometrics is the title of the text. What is econometrics?

    What is it? Science (& art!)

    Broadly, using theory and statistical methods to analyze data

    What are some uses? Test theories

    Forecast values (e.g., firms' sales, unemployment, stock prices, path of a hurricane, & much, much more)

    Fit mathematical economic models to data

    Use data to make numerical policy recommendations in govt. and business


    Brief Overview of the Course

    Economics suggests important relationships, often with policy implications, but virtually never suggests quantitative magnitudes of causal effects.

    What is the quantitative effect of reducing class size on student achievement?

    How does a bachelor's degree change earnings?

    What is the price elasticity of cigarettes?

    What is the effect on output growth of a 1 percentage point increase in interest rates by the Fed?

    What is the effect on housing prices of environmental improvements?

    How much does knowing econometrics improve your love life?


    Economic Questions We'll Examine

    1. Does reducing class size improve elementary school education?

    2. Is there racial discrimination in the market for home loans?

    3. How much do cigarette taxes reduce smoking?

    4. What will be the rate of inflation next year? (In today's economy, a bigger question might be: What will be the unemployment rate next year?)

    5. How much does knowing econometrics improve your love life?


    This course is about using data to measure causal effects.

    Ideally, we would like an experiment

    What would be an experiment to estimate the effect of class size on standardized test scores?

    But almost always we only have observational (nonexperimental) data.

    returns to education
    cigarette prices
    monetary policy

    Most of the course deals with difficulties arising from using observational data to estimate causal effects

    confounding effects (omitted factors)
    simultaneous causality
    correlation does not imply causation


    In this course you will:

    Learn methods for estimating causal effects using observational data

    Learn some tools that can be used for other purposes; for example, forecasting using time series data;

    Focus on applications; theory is used only as needed to understand the whys of the methods;

    Learn to evaluate the regression analysis of others; this means you will be able to read/understand empirical economics papers in other econ courses;

    Get some hands-on experience with regression analysis in your problem sets.


    Three types of data

    Cross-sectional: different entities, single time period

    Time series: single entity, multiple time periods

    Panel: multiple entities, two or more time periods

    Speaking of using observational data. . .


    Empirical problem: Class size and educational output

    Policy question: What is the effect on test scores (or some other outcome measure) of reducing class size by one student per class? By 8 students/class?

    We must use data to find out (is there any way to answer this without data?)

    Review of Probability and Statistics (Chapter 2)


    The California Test Score Data Set (note 1-1)

    All K through 8 California school districts (n = 420)

    1999

    Variables: 5th grade test scores

    district-wide mean of reading and math scores for fifth graders.

    Student-teacher ratio (STR)

    no. of students in the district divided by no. of full-time teachers


    Initial look at the data: (note 1-2) (You should already know how to interpret this table)

    What does this table tell us about the relationship between test scores and the STR?


    Do districts with smaller classes have higher test scores?

    Scatterplot of test score v. student-teacher ratio

    What does this figure show?


    We need to get some numerical evidence on whether districts with low STRs have higher test scores but how?

    1. Compare average test scores in districts with low STRs to those with high STRs (estimation)

    2. Test the null hypothesis that the mean test scores in the two types of districts are the same, against the alternative hypothesis that they differ (hypothesis testing)

    3. Estimate an interval for the difference in the mean test scores, high v. low STR districts (confidence interval)


    Initial data analysis: Compare districts with small (STR < 20) and large (STR ≥ 20) class sizes: (note 1-3)

    1. Estimation of $\Delta$ = difference between group means

    2. Test the hypothesis that $\Delta$ = 0

    3. Construct a confidence interval for $\Delta$

    Class Size    Average score ($\bar{Y}$)    Standard deviation ($s_Y$)    n
    Small         657.4                        19.4                          238
    Large         650.0                        17.9                          182


    1. Estimation (note 1-4)

    $\bar{Y}_{small} - \bar{Y}_{large} = \frac{1}{n_{small}}\sum_{i=1}^{n_{small}} Y_i - \frac{1}{n_{large}}\sum_{i=1}^{n_{large}} Y_i$

    $= 657.4 - 650.0 = 7.4$

    Is this a large difference in a real-world sense?

    Standard deviation across districts = 19.1

    Is this a big enough difference to be important for school reform discussions, for parents, or for a school committee?

    What does this tell us about the population?


    2. Hypothesis testing (note 1-5)

    Difference-in-means test: compute the t-statistic (remember this?),

    $t = \dfrac{\bar{Y}_s - \bar{Y}_l}{SE(\bar{Y}_s - \bar{Y}_l)} = \dfrac{\bar{Y}_s - \bar{Y}_l}{\sqrt{\dfrac{s_s^2}{n_s} + \dfrac{s_l^2}{n_l}}}$

    where $SE(\bar{Y}_s - \bar{Y}_l)$ is the standard error of $\bar{Y}_s - \bar{Y}_l$, the subscripts s and l refer to small and large STR districts, and

    $s_s^2 = \dfrac{1}{n_s - 1}\sum_{i=1}^{n_s} (Y_i - \bar{Y}_s)^2$ (etc.)


    2. Hypothesis testing (note 1-6)

    Before testing. . . what are the H0 and HA for this test?


    Compute the difference-of-means t-statistic: (note 1-7)

    Size     $\bar{Y}$    $s_Y$    n
    small    657.4        19.4     238
    large    650.0        17.9     182

    $t = \dfrac{\bar{Y}_s - \bar{Y}_l}{\sqrt{\dfrac{s_s^2}{n_s} + \dfrac{s_l^2}{n_l}}} = \dfrac{657.4 - 650.0}{\sqrt{\dfrac{19.4^2}{238} + \dfrac{17.9^2}{182}}} = \dfrac{7.4}{1.83} = 4.05$

    (note p-value = .000061)

    So. . . reject the null hypothesis that the two means are the same or not? Explain your decision.
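
    A minimal Python sketch (not part of the original slides) that reproduces the calculation above from the summary statistics in the table; the variable names are my own.

```python
# Difference-in-means t-statistic from the summary statistics on the slide.
import numpy as np
from scipy import stats

ybar_s, s_s, n_s = 657.4, 19.4, 238   # small-STR districts
ybar_l, s_l, n_l = 650.0, 17.9, 182   # large-STR districts

diff = ybar_s - ybar_l                        # estimated difference in means
se = np.sqrt(s_s**2 / n_s + s_l**2 / n_l)     # standard error of the difference
t = diff / se                                 # t-statistic
p = 2 * stats.norm.sf(abs(t))                 # two-sided p-value, large-sample normal approximation

print(f"diff = {diff:.1f}, SE = {se:.2f}, t = {t:.2f}, p = {p:.6f}")
# diff = 7.4, SE is about 1.83, t is about 4.05; the p-value is far below conventional levels
```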


    3. Confidence interval (note 1-8)

    A 95% confidence interval for the difference between the means is

    $(\bar{Y}_s - \bar{Y}_l) \pm 1.96 \, SE(\bar{Y}_s - \bar{Y}_l)$

    $= 7.4 \pm 1.96 \times 1.83 = (3.8, 11.0)$

    So. . . reject the null hypothesis that the two means are the same or not? Explain your decision.
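
    A short follow-on sketch (again not from the slides) for the 95% confidence interval, reusing the rounded difference and standard error shown above.

```python
# 95% confidence interval for the difference between group means.
diff, se = 7.4, 1.83                      # values from the slide
ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se
print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f})")   # (3.8, 11.0)
```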


    What comes next

    The mechanics of estimation, hypothesis testing,and confidence intervals should be familiar

    These concepts extend directly to regression and its variants

    Before turning to regression, however, we will review some of the underlying theory of estimation, hypothesis testing, and confidence intervals: Why do these procedures work, and why use these rather than others?

    We will review the intellectual foundations of statistics and econometrics


    Review of Statistical Theory (note 1-9)

    Why review probability?

    Randomness is everywhere; we use the theory of probability to describe that randomness

    Structure of notes:

    1. The probability framework for statistical inference - now

    2. Estimation

    3. Testing

    4. Confidence Intervals


    Review of Statistical Theory

    The probability framework for statistical inference

    Single random variable: Population, random variable, and distribution

    Moments of a distribution (mean, variance, standard deviation, covariance, correlation)

    Two random variables:

    Conditional distributions and conditional means

    Four useful distributions: normal, chi-squared, Student's t, F

    Random sampling & sampling distribution: distribution of a sample of data drawn randomly from a population: $Y_1, \dots, Y_n$


    (a) Single random variable (note 1-10)

    Population

    The group or collection of all possible entities of interest (school districts)

    We will think of populations as infinitely large ($\infty$ is an approximation to "very big")

    Sample

    What's a sample?


    (a) Single random variable (note 1-11)

    Fundamental concepts:

    Outcomes

    Probability

    Event

    Random variable Y: numerical summary of a random outcome (district average test score, district STR)

    Types of random variables: discrete, continuous


    (a) Single random variable (note 1-12)

    Probability distributions - discrete:

    Definition

    Probabilities of events

    c.d.f.

    Bernoulli


    (a) Single random variable (note 1-13)

    Probability distributions - continuous:

    p.d.f.

    c.d.f.


    Population distribution of Y

    The probabilities of different values of Y that occur in the population, for ex. Pr[Y = 650] (when Y is discrete)

    or: The probabilities of sets of these values, for ex. Pr[640 ≤ Y ≤ 660] (when Y is continuous).


    (b) Moments of a population distribution: mean, variance, standard deviation (note 1-14)

    mean = expected value (expectation) of Y

    = E(Y)

    = $\mu_Y$

    = long-run average value of Y over many repeated occurrences of Y


    Moments (cont.) (note 1-15)

    variance = $E[(Y - \mu_Y)^2] = \sigma_Y^2$

    = measure of the squared spread of the distribution around its mean

    standard deviation = $\sqrt{\text{variance}} = \sigma_Y$
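
    As an illustration (not from the slides), the sketch below computes the mean, variance, and standard deviation of a discrete distribution directly from its probability table, using the Bernoulli(p = .78) example that appears later in the deck.

```python
# Population mean, variance, and standard deviation of a discrete random variable.
import numpy as np

values = np.array([0.0, 1.0])    # possible outcomes of a Bernoulli Y
probs  = np.array([0.22, 0.78])  # their probabilities

mean = np.sum(values * probs)                   # E(Y)
var  = np.sum((values - mean) ** 2 * probs)     # E[(Y - mu_Y)^2]
sd   = np.sqrt(var)

print(mean, var, sd)   # 0.78, 0.1716, about 0.414
```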


    Moments (cont.) (note 1-16)

    skewness = $\dfrac{E[(Y - \mu_Y)^3]}{\sigma_Y^3}$

    = measure of asymmetry (lack of symmetry) of a distribution

    skewness = 0: distribution is symmetric

    skewness > (<) 0: distribution has a long right (left) tail


    Moments (cont.) (note 1-17)

    kurtosis = $\dfrac{E[(Y - \mu_Y)^4]}{\sigma_Y^4}$

    = measure of mass in tails

    = measure of probability of large values

    kurtosis = 3: normal distribution

    kurtosis > 3: heavy tails (leptokurtotic)
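
    A small sketch (not from the slides) showing how sample skewness and kurtosis can be computed with SciPy; note that SciPy reports excess kurtosis unless fisher=False is passed, whereas the slide's benchmark of 3 for the normal distribution uses the non-excess definition.

```python
# Sample skewness and kurtosis for draws from a normal distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.normal(loc=0.0, scale=1.0, size=100_000)

print(stats.skew(y))                     # close to 0 (symmetric)
print(stats.kurtosis(y, fisher=False))   # close to 3 (the normal benchmark on the slide)
```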


    Two random variables

    Random variables X and Y

    Together they have a joint distribution

    Each one has a marginal distribution

    Each one has a conditional distribution

    Joint distribution of two discrete X and Y

    Probability that X and Y simultaneously take on certain values, say x and y.

    Pr(X = x, Y = y) or Pr(x, y) or P(X = x, Y = y) or P(x, y)

    NOTE: lower case symbols x and y denote values and. . .

    upper case symbols X and Y denote random variables

    Probabilities of all possible (x, y) combinations sum to what?


    Joint distribution (cont.)

    After recording data for many commutes:

    prob. of long, rainy commute = P(X=0, Y=0) = .15

    prob. of long, clear commute = P(X=1, Y=0) = ??

    prob. of short, rainy commute = P(X=0, Y=1) = ??

    prob. of short, clear commute = P(X=1, Y=1) = ??

    These four outcomes are mutually exclusive and exhaust all possibilities

    So, they must sum to ??


    Marginal distribution

    Marginal distribution is P(X=x) or P(Y=y)

    Sum of joint probabilities:

    prob. of long commute = P(X=0, Y=0) + P(X=1, Y=0) = .15 + .07 = .22

    prob. of short commute = ??

    prob. of rainy commute = ??

    prob. of clear commute = ??

    $P(Y = y) = \sum_{i=1}^{L} P(X = x_i, Y = y)$


    Conditional Distribution

    Conditional distribution of X and Y

    Probability that Y is some value conditional on (depending on, or after) X taking on a specified value

    Examples: distribution of. . .

    test scores, given that STR < 20

    wages of all female workers (Y = wages, X = gender)

    mortality rate of those given an experimental treatment (Y = live/die; X = treated/not treated)

    P(Y=y | X=x) or P(y | x)

    $P(Y = y \mid X = x) = \dfrac{P(X = x, Y = y)}{P(X = x)}$  or  $P(y \mid x) = \dfrac{P(x, y)}{P(x)}$


    Conditional Distribution (cont.)

    Example: prob. of long commute (Y=0) if you know it's raining (X=0)

    If it's raining, only two possibilities. What are they?

    So, prob. of short commute (Y=1) if you know it's raining (X=0)

    = P(Y=1 | X=0) = ?? (hint: recall the answer above)

    $P(Y = 0 \mid X = 0) = \dfrac{P(X = 0, Y = 0)}{P(X = 0)} = \dfrac{.15}{.30} = .50$


    Conditional Distribution (cont.)

    Question from previous slide (cont.): prob. of short commute (Y=1) if you know it's raining (X=0)

    Now, check your answer by calculation

    $P(Y = 1 \mid X = 0) = \dfrac{P(X = 0, Y = 1)}{P(X = 0)} = ??$


    Conditional Distribution (cont.) (note 1-18)

    Questions: What is prob. of long commute (Y=0) if you know it's not raining (X=1)?

    What is prob. of short commute (Y=1) if you know it's not raining (X=1)?

    What do these two probabilities sum to?
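
    The sketch below (not from the slides) pulls the commute example together: it builds the joint table from the probabilities stated earlier (P(X=0,Y=0) = .15, P(X=1,Y=0) = .07, and the P(X=0) = .30 implied by the conditional-probability calculation), then recovers marginal and conditional distributions; the remaining cells are computed, not asserted.

```python
# Joint, marginal, and conditional distributions for the commute example.
# X is rain (0 = raining, 1 = clear); Y is commute (0 = long, 1 = short).
import numpy as np

joint = np.zeros((2, 2))          # joint table P(X = x, Y = y); rows index x, columns index y
joint[0, 0] = 0.15                # stated: P(X=0, Y=0)
joint[1, 0] = 0.07                # stated: P(X=1, Y=0)
joint[0, 1] = 0.30 - joint[0, 0]  # implied by P(X=0) = .30
joint[1, 1] = 1.0 - joint.sum()   # the four cells must sum to 1

p_x = joint.sum(axis=1)           # marginal distribution of X
p_y = joint.sum(axis=0)           # marginal distribution of Y

# Conditional distribution of Y given X = 0: P(Y = y | X = 0) = P(X=0, Y=y) / P(X=0)
p_y_given_rain = joint[0, :] / p_x[0]

print("P(Y=0) (long commute) =", p_y[0])   # .15 + .07 = .22, as on the slide
print("P(Y=y | X=0) =", p_y_given_rain)    # the first entry is .50, as on the slide
```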


    Conditional Distribution Examples. (note 1-19)

    Figure 2.4: Average Hourly Earnings of U.S. Full-Time Workers in 2008. Why do I say that these are conditional distributions?


    Independence

    Two rvs X and Y are independent if

    Knowing the value of one tells you nothing about the value of the other

    Conditional distribution of Y given X = marginal distribution of Y (and similarly for X given Y)

    P(Y=y | X=x) = P(Y=y) or. . .

    P(X=x | Y=y) = P(X=x)


    Independence (cont.)

    Recall: rvs X and Y are independent if

    P(Y=y | X=x) = P(Y=y)

    Example

    M = number of PC crashes & A = age of PC (0 = old & 1 = new)

    P(M = 0) = 0.80 and P(M = 1) = 0.07

    Are M and A independent? Explain your answer.

    Case 1: P(M=0 | A = 0) = 0.70

    Case 2: P(M=1 | A = 1) = 0.07
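
    A tiny sketch (not from the slides) of the independence check being asked for: compare each conditional probability with the corresponding marginal.

```python
# Independence check: a conditional probability should equal the marginal probability.
p_m0, p_m1 = 0.80, 0.07          # marginals P(M=0), P(M=1) from the slide
p_m0_given_a0 = 0.70             # Case 1: P(M=0 | A=0)
p_m1_given_a1 = 0.07             # Case 2: P(M=1 | A=1)

print(p_m0_given_a0 == p_m0)     # False: 0.70 != 0.80, so M and A are not independent
print(p_m1_given_a1 == p_m1)     # True: this one cell alone is consistent with independence
```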


    Two random variables: joint distributions and covariance (note 1-20)

    Random variables X and Z have a joint distribution

    The covariance between X and Z is

    $\text{cov}(X, Z) = E[(X - \mu_X)(Z - \mu_Z)] = \sigma_{XZ}$

    The covariance is a measure of the linear association between X and Z; its units are units of X × units of Z

    cov(X, Z) > 0 means a positive relation between X and Z

    If X and Z are independently distributed, then cov(X, Z) = 0 (but not vice versa!!)


    The covariance between Test Score and STR is negative:

    So is the correlation


    Covariance vs. Correlation

    Recall: The covariance. . . units are units of X × units of Z.

    If X & Z are in feet, then the covariance is in feet²

    If X & Z (the same variables) are in meters, then the covariance is in meters²

    Same association but different values of covariance

    What if X is in feet and Z in lbs.? What units for covariance?

    Problems!


    The correlation coefficient is defined in terms of the covariance:

    $\text{corr}(X, Z) = \dfrac{\text{cov}(X, Z)}{\sqrt{\text{var}(X)\,\text{var}(Z)}} = \dfrac{\sigma_{XZ}}{\sigma_X \sigma_Z} = r_{XZ}$

    $-1 \le \text{corr}(X, Z) \le 1$

    corr(X, Z) = 1 means perfect positive linear association

    corr(X, Z) = -1 means perfect negative linear association

    corr(X, Z) = 0 means no linear association

    The correlation coefficient is unitless, so it avoids the problems of the covariance.

    corr(X, Z) when X & Z are measured in feet is the same as corr(X, Z) when X & Z are in meters or pounds or. . .
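
    To see the units problem and its fix, here is a brief sketch with simulated data (these are not the California data; the draws and the 0.3048 feet-to-meters factor are just for illustration):

```python
# Covariance changes with units; correlation does not.
import numpy as np

rng = np.random.default_rng(0)
x_feet = rng.normal(size=500)
z = -0.5 * x_feet + rng.normal(size=500)     # negatively related to x

x_meters = 0.3048 * x_feet                   # same variable, different units

print(np.cov(x_feet, z)[0, 1], np.cov(x_meters, z)[0, 1])            # covariances differ
print(np.corrcoef(x_feet, z)[0, 1], np.corrcoef(x_meters, z)[0, 1])  # correlations essentially identical
```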


    The correlation coefficient measures linear association


    Four Distributions: normal, chi-squared, Student t, F (note 1-21)

    Normal Distribution

    bell-shaped probability density

    $X \sim N(\mu, \sigma^2)$

    Standard normal $Z \sim N(0, 1)$

    Standardizing a normal r.v. (z score): $Z = \dfrac{X - \mu}{\sigma}$

    Used for finding probabilities about $X \sim N(\mu, \sigma^2)$
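
    A minimal sketch (not from the slides) of using standardization to find a probability about a normal random variable; the mean, standard deviation, and interval below are hypothetical values chosen to echo the earlier Pr[640 ≤ Y ≤ 660] example.

```python
# P(a <= X <= b) for X ~ N(mu, sigma^2), computed by standardizing to Z ~ N(0, 1).
from scipy.stats import norm

mu, sigma = 650.0, 19.0           # hypothetical mean and sd of test scores
a, b = 640.0, 660.0

z_a, z_b = (a - mu) / sigma, (b - mu) / sigma   # standardize the endpoints
prob = norm.cdf(z_b) - norm.cdf(z_a)
print(prob)   # same result as norm.cdf(b, mu, sigma) - norm.cdf(a, mu, sigma)
```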


    Normal Distribution (cont.): A Bad Day on Wall Street (note 1-22)

    The box "A Bad Day on Wall Street" has an example of the normal distribution in the U.S. stock market


    A Bad Day on Wall Street (cont.) (note 1-22)


    The Chi-squared Distribution (note 1-23)

    Usually written as $\chi^2_m$

    Shape of distribution

    Shape depends on degrees of freedom m

    When used


    The Student t Distribution (note 1-24)

    Always lower case t

    Shape of distribution

    Symmetric like normal distribution

    Shape depends on degrees of freedom m

    m < 20: fatter tails than normal distribution

    m > 30: shape close to normal distribution

    m → ∞: exactly like normal distribution

    When used


    The F Distribution (note 1-25)

    Shape of distribution

    Shape depends on two degrees of freedom

    Numerator d.f. n

    Denominator d.f. m

    When used


    (d) Distribution of a sample of data drawn randomly from a population: $Y_1, \dots, Y_n$ (note 1-26)

    We will assume simple random sampling

    Choose an individual (district, entity) at random from the population

    Randomness and data

    Prior to sample selection, the value of Y is random because the individual selected is random

    Once the individual is selected and the value of Y is observed, then Y is just a number, not random

    The data set is $(Y_1, Y_2, \dots, Y_n)$, where $Y_i$ = value of Y for the i-th individual (district, entity) sampled


    Distribution of $Y_1, \dots, Y_n$ under simple random sampling (note 1-27)

    Because individuals #1 and #2 are selected at random, the value of $Y_1$ has no information content for $Y_2$. Thus:

    $Y_1$ and $Y_2$ are independently distributed

    $Y_1$ and $Y_2$ come from the same population (distribution). That is, $Y_1$, $Y_2$ are identically distributed

    So, under simple random sampling, $Y_1$ and $Y_2$ are independently and identically distributed (i.i.d.).

    More generally, under simple random sampling, $\{Y_i\}$, $i = 1, \dots, n$, are i.i.d.


    Simple Random Sampling (note 1-28)

    Recall: Under simple random sampling, $\{Y_i\}$, $i = 1, \dots, n$, are i.i.d.

    This framework allows rigorous statistical inferences about moments of population distributions using a sample of data from that population

    Structure of notes:

    1. The probability framework for statistical inference

    2. Estimation - now

    3. Testing

    4. Confidence Intervals


    Estimation

    $\bar{Y}$ is the natural estimator of the population mean. But:

    a) What are the properties of $\bar{Y}$?

    b) Why should we use $\bar{Y}$ rather than some other estimator?

    $Y_1$ (the first observation)

    maybe unequal weights, not a simple average

    median($Y_1, \dots, Y_n$)

    The starting point is the sampling distribution of $\bar{Y}$


    (a) The sampling distribution of $\bar{Y}$ (note 1-29)

    $\bar{Y}$ is a random variable, and its properties are determined by the sampling distribution of $\bar{Y}$

    The individuals in the sample are drawn at random.

    Thus the values of $(Y_1, \dots, Y_n)$ are random

    Thus functions of $(Y_1, \dots, Y_n)$, such as $\bar{Y}$, are random: had a different sample been drawn, they would have taken on a different value

    The distribution of $\bar{Y}$ over ALL possible different samples of size n is called the. . .

    sampling distribution of $\bar{Y}$.


    (a) The sampling distribution of $\bar{Y}$

    Recall: The distribution of $\bar{Y}$ over ALL possible different samples of size n is called the. . .

    sampling distribution of $\bar{Y}$.

    The mean and variance of all of the $\bar{Y}$ values are the mean and variance of its sampling distribution, $E(\bar{Y})$ and $\text{var}(\bar{Y})$.

    (remember: $\bar{Y}$ is a sample statistic.)

    VIP: The concept of the sampling distribution underpins all of inference in econometrics.


    The sampling distribution of $\bar{Y}$ (cont.) (note 1-30)

    Example: Suppose Y takes on 0 or 1 (a Bernoulli random variable) with the probability distribution

    Pr[Y = 0] = .22, Pr(Y = 1) = .78

    Then

    $E(Y) = p \times 1 + (1 - p) \times 0 = p = .78$

    $\sigma_Y^2 = E[Y - E(Y)]^2 = p(1 - p)$ [remember this?]

    $= .78 \times (1 - .78) = 0.1716$

    The sampling distribution of $\bar{Y}$ depends on n.

    Consider n = 2. The sampling distribution of $\bar{Y}$ is

    $\Pr(\bar{Y} = 0) = .22^2 = .0484$

    $\Pr(\bar{Y} = 1/2) = 2 \times .22 \times .78 = .3432$

    $\Pr(\bar{Y} = 1) = .78^2 = .6084$
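
    The sketch below (not from the slides) reproduces these three probabilities exactly via the binomial distribution and checks them by simulation.

```python
# Sampling distribution of Ybar for n = 2 i.i.d. Bernoulli(p = .78) draws.
import numpy as np
from scipy.stats import binom

p, n = 0.78, 2

# Exact: Ybar = (number of 1s)/n, and the number of 1s is Binomial(n, p).
for k in range(n + 1):
    print(f"P(Ybar = {k/n}) = {binom.pmf(k, n, p):.4f}")   # .0484, .3432, .6084

# Monte Carlo check.
rng = np.random.default_rng(0)
ybar = rng.binomial(1, p, size=(200_000, n)).mean(axis=1)
print(np.mean(ybar == 0.0), np.mean(ybar == 0.5), np.mean(ybar == 1.0))
```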


    The sampling distribution of $\bar{Y}$ when Y is Bernoulli (p = .78): (note 1-31)


    Things we want to know about the sampling distribution:

    What is the mean of $\bar{Y}$? If $E(\bar{Y})$ = true $\mu_Y$ = .78, then $\bar{Y}$ is an unbiased estimator of $\mu_Y$

    What is the variance of $\bar{Y}$?

    How does $\text{var}(\bar{Y})$ depend on n (famous 1/n formula)

    Does $\bar{Y}$ become close to $\mu_Y$ when n is large?

    Law of large numbers: $\bar{Y}$ is a consistent estimator of $\mu_Y$

    Distribution of $\bar{Y}$ appears bell shaped for n large. Is this generally true? Wait until the next section (2.6 in 3rd ed.) to answer this question about the SHAPE of the sampling distribution of $\bar{Y}$.


    The mean and variance of the sampling distribution of $\bar{Y}$

    General case, that is, for $Y_i$ i.i.d. from ANY distribution, not just Bernoulli:

    mean: $E(\bar{Y}) = E\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(Y_i) = \frac{1}{n}\sum_{i=1}^{n} \mu_Y = \mu_Y$

    Variance: $\text{var}(\bar{Y}) = E[\bar{Y} - E(\bar{Y})]^2 = E[\bar{Y} - \mu_Y]^2 = E\left[\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) - \mu_Y\right]^2 = E\left[\frac{1}{n}\sum_{i=1}^{n}(Y_i - \mu_Y)\right]^2$


    so $\text{var}(\bar{Y}) = E\left[\frac{1}{n}\sum_{i=1}^{n}(Y_i - \mu_Y)\right]^2$

    $= E\left[\left(\frac{1}{n}\sum_{i=1}^{n}(Y_i - \mu_Y)\right)\left(\frac{1}{n}\sum_{j=1}^{n}(Y_j - \mu_Y)\right)\right]$

    $= \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} E\left[(Y_i - \mu_Y)(Y_j - \mu_Y)\right]$

    $= \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} \text{cov}(Y_i, Y_j)$

    $= \frac{1}{n^2}\sum_{i=1}^{n} \sigma_Y^2$  [note: $\text{cov}(Y_i, Y_j) = 0$ for $i \ne j$, and $\text{cov}(Y_i, Y_i) = \text{var}(Y_i) = \sigma_Y^2$]

    $= \frac{\sigma_Y^2}{n}$


    Mean and variance of sampling distribution of $\bar{Y}$ (cont.) (note 1-32)

    $E(\bar{Y}) = \mu_Y$

    $\text{var}(\bar{Y}) = \dfrac{\sigma_Y^2}{n}$

    Implications:

    1. $\bar{Y}$ is an unbiased estimator of $\mu_Y$ (that is, $E(\bar{Y}) = \mu_Y$)

    2. $\text{var}(\bar{Y})$ is inversely proportional to n

    1. the spread (standard deviation) of the sampling distribution is proportional to $1/\sqrt{n}$

    2. Thus the sampling uncertainty associated with $\bar{Y}$ is proportional to $1/\sqrt{n}$ (larger samples, less uncertainty, but square-root law)
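
    A Monte Carlo sketch (not from the slides) that checks both implications for the Bernoulli(p = .78) example: the simulated $\bar{Y}$ values average to $\mu_Y$ and their variance is close to $\sigma_Y^2/n$.

```python
# Monte Carlo check that E(Ybar) = mu_Y and var(Ybar) = sigma_Y^2 / n.
import numpy as np

p, n, reps = 0.78, 25, 200_000
rng = np.random.default_rng(0)

ybar = rng.binomial(1, p, size=(reps, n)).mean(axis=1)   # one Ybar per simulated sample

print(ybar.mean(), p)                  # close to mu_Y = .78 (unbiasedness)
print(ybar.var(), p * (1 - p) / n)     # close to sigma_Y^2 / n = .1716 / 25
```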


    The sampling distribution of $\bar{Y}$ when n is large (note 1-33)

    For small sample sizes, the distribution of $\bar{Y}$ will usually be complicated (unless. . . what is true about the distribution of the $Y_i$ values in the population?)

    But if n is large, the sampling distribution is simple!

    1. As n increases, the distribution of $\bar{Y}$ becomes more tightly centered around $\mu_Y$ (the Law of Large Numbers)

    2. Moreover, the distribution of $\bar{Y} - \mu_Y$ becomes normal (the Central Limit Theorem)


    The Law of Large Numbers: (note 1-34)

    An estimator is consistent if the probability that it falls within an interval of the true population value tends to one as the sample size increases.

    If $(Y_1, \dots, Y_n)$ are i.i.d. and $\sigma_Y^2 < \infty$, then $\bar{Y}$ is a consistent estimator of $\mu_Y$, that is,

    $\Pr[|\bar{Y} - \mu_Y| < \varepsilon] \to 1$ as $n \to \infty$

    which can be written, $\bar{Y} \xrightarrow{p} \mu_Y$

    ("$\xrightarrow{p}$" means $\bar{Y}$ converges in probability to $\mu_Y$).


    The Central Limit Theorem (CLT): (note 1-35)

    If $(Y_1, \dots, Y_n)$ are i.i.d. and $0 < \sigma_Y^2 < \infty$, then when n is large the distribution of $\bar{Y}$ is well approximated by a normal distribution.

    $\bar{Y}$ is approximately distributed $N(\mu_Y, \sigma_Y^2/n)$ (normal distribution with mean $\mu_Y$ and variance $\sigma_Y^2/n$) AND. . .

    $\sqrt{n}(\bar{Y} - \mu_Y)/\sigma_Y$ is approximately distributed N(0, 1) (standard normal)

    That is, standardized $\bar{Y}$ = $\dfrac{\bar{Y} - E(\bar{Y})}{\sqrt{\text{var}(\bar{Y})}} = \dfrac{\bar{Y} - \mu_Y}{\sigma_Y/\sqrt{n}}$ is approximately distributed as N(0, 1)

    VIP: The larger is n, the better is the approximation.
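
    A simulation sketch (not from the slides) of the CLT at work for the Bernoulli(p = .78) example, using the same sample sizes as Fig. 2.8/2.9: as n grows, tail probabilities of the standardized sample mean approach the standard normal values.

```python
# CLT illustration: the standardized sample mean gets closer to N(0, 1) as n grows.
import numpy as np
from scipy.stats import norm

p = 0.78
mu, sigma = p, np.sqrt(p * (1 - p))
rng = np.random.default_rng(0)

for n in (2, 5, 25, 100):                      # the sample sizes shown in Fig. 2.8/2.9
    ybar = rng.binomial(1, p, size=(100_000, n)).mean(axis=1)
    z = (ybar - mu) / (sigma / np.sqrt(n))     # standardized sample mean
    # compare a tail probability with the standard normal value Phi(-1), about 0.159
    print(n, np.mean(z <= -1.0), norm.cdf(-1.0))
```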


    Fig. 2.8: Sampling distribution of $\bar{Y}$ when Y is Bernoulli, p = 0.78 (n = 2, 5, 25, 100)


    Fig. 2.8: Sampling distribution of $\bar{Y}$ (cont.) (note 1-36)

    In the figure on the previous slide (fig. 2.8), when n = 100, it might not be easy to see that the distribution of $\bar{Y}$ is normal.

    It's easier to see this if we examine the distribution of standardized $\bar{Y}$ = $\dfrac{\bar{Y} - \mu_Y}{\sigma_Y/\sqrt{n}}$

    See next slide


    Same example: sampling distribution of $\dfrac{\bar{Y} - E(\bar{Y})}{\sqrt{\text{var}(\bar{Y})}}$ (n = 2, 5, 25, 100) (Fig. 2.9 in book)


    Summary: The Sampling Distribution of $\bar{Y}$

    For $Y_1, \dots, Y_n$ i.i.d. with $0 < \sigma_Y^2 < \infty$,

    The exact (finite sample) sampling distribution of $\bar{Y}$ has mean $\mu_Y$ ($\bar{Y}$ is an unbiased estimator of $\mu_Y$) and variance $\sigma_Y^2/n$

    Other than its mean and variance, the exact distribution of $\bar{Y}$ is complicated and depends on the distribution of Y (the population distribution)

    When n is large, the sampling distribution simplifies:

    $\bar{Y} \xrightarrow{p} \mu_Y$  (Law of large numbers)

    $\dfrac{\bar{Y} - E(\bar{Y})}{\sqrt{\text{var}(\bar{Y})}}$ is approximately N(0, 1)  (CLT)