Practice Problems for Midterm 1

Multiple Choice Questions

Chapter 2

1) The probability of an outcome

a. is the number of times that the outcome occurs in the long run.
b. equals M × N, where M is the number of occurrences and N is the population size.
c. is the proportion of times that the outcome occurs in the long run.
d. equals the sample mean divided by the sample standard deviation.

Answer: c

2) The probability of an event A or B ($\Pr(A \text{ or } B)$) to occur equals

a. $\Pr(A) \times \Pr(B)$.
b. $\Pr(A) + \Pr(B)$ if A and B are mutually exclusive.
c. $\frac{\Pr(A)}{\Pr(B)}$.
d. $\Pr(A) + \Pr(B)$ even if A and B are not mutually exclusive.

Answer: b

3) The cumulative probability distribution shows the probability

a. that a random variable is less than or equal to a particular value.
b. of two or more events occurring at once.
c. of all possible events occurring.
d. that a random variable takes on a particular value given that another event has happened.

Answer: a

4) The expected value of a discrete random variable

a. is the outcome that is most likely to occur.
b. can be found by determining the 50% value in the c.d.f.
c. equals the population median.
d. is computed as a weighted average of the possible outcomes of that random variable, where the weights are the probabilities of those outcomes.

Answer: d

5) Let Y be a random variable. Then $\operatorname{var}(Y)$ equals

a. $\sqrt{E[(Y - \mu_Y)^2]}$.
b. $E[\,|Y - \mu_Y|\,]$.
c. $E[(Y - \mu_Y)^2]$.
d. $E[(Y - \mu_Y)]$.

Answer: c

6) The conditional distribution of Y given X = x, $\Pr(Y = y \mid X = x)$, is

a. $\frac{\Pr(Y = y)}{\Pr(X = x)}$.
b. $\sum_{i=1}^{l} \Pr(X = x_i, Y = y)$.
c. $\frac{\Pr(X = x, Y = y)}{\Pr(Y = y)}$.
d. $\frac{\Pr(X = x, Y = y)}{\Pr(X = x)}$.

Answer: d

7) The conditional expectation of Y given X, $E(Y \mid X = x)$, is calculated as follows:

a. $\sum_{i=1}^{k} y_i \Pr(X = x_i \mid Y = y)$.
b. $E[E(Y \mid X)]$.
c. $\sum_{i=1}^{k} y_i \Pr(Y = y_i \mid X = x)$.
d. $\sum_{i=1}^{l} E(Y \mid X = x_i) \Pr(X = x_i)$.

Answer: c

8) Two random variables X and Y are independently distributed if all of the following conditions hold, with the exception of

a. $\Pr(Y = y \mid X = x) = \Pr(Y = y)$.
b. knowing the value of one of the variables provides no information about the other.
c. if the conditional distribution of Y given X equals the marginal distribution of Y.
d. $E(Y) = E[E(Y \mid X)]$.

Answer: d

9) The correlation between X and Y

a. cannot be negative since variances are always positive.
b. is the covariance squared.
c. can be calculated by dividing the covariance between X and Y by the product of the two standard deviations.
d. is given by $\operatorname{corr}(X, Y) = \frac{\operatorname{cov}(X, Y)}{\operatorname{var}(X)\operatorname{var}(Y)}$.

Answer: c

10) Two variables are uncorrelated in all of the cases below, with the exception of

a. being independent.
b. having a zero covariance.
c. $|\sigma_{XY}| \le \sqrt{\sigma_X^2 \sigma_Y^2}$.
d. $E(Y \mid X) = 0$.

Answer: c

11) $\operatorname{var}(aX + bY) =$

a. $a^2\sigma_X^2 + b^2\sigma_Y^2$.
b. $a^2\sigma_X^2 + 2ab\,\sigma_{XY} + b^2\sigma_Y^2$.
c. $\sigma_{XY} + \mu_X \mu_Y$.
d. $a\sigma_X^2 + b\sigma_Y^2$.

Answer: b
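The identity in answer b can be checked numerically. The sketch below uses a small made-up data set (the values of `X`, `Y`, `a`, and `b` are hypothetical, chosen only for illustration) and population moments, for which the identity holds exactly.

```python
from statistics import fmean, pvariance


def pcov(xs, ys):
    """Population covariance of two equally long lists."""
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


# Hypothetical data and weights, not taken from the problem set.
X = [1.0, 2.0, 4.0, 7.0]
Y = [3.0, 1.0, 5.0, 6.0]
a, b = 2.0, -3.0

# Left-hand side: variance of the linear combination aX + bY.
lhs = pvariance([a * x + b * y for x, y in zip(X, Y)])
# Right-hand side: a^2 var(X) + 2ab cov(X, Y) + b^2 var(Y).
rhs = a**2 * pvariance(X) + 2 * a * b * pcov(X, Y) + b**2 * pvariance(Y)

print(lhs, rhs)
```

Dropping the covariance term reproduces the incorrect option a whenever X and Y are correlated.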

12) To standardize a variable you

a. subtract its mean and divide by its standard deviation.
b. integrate the area below two points under the normal distribution.
c. add and subtract 1.96 times the standard deviation to the variable.
d. divide it by its standard deviation, as long as its mean is 1.



Answer: a

13) Assume that Y is normally distributed $N(\mu, \sigma^2)$. To find $\Pr(c_1 \le Y \le c_2)$, where $c_1 < c_2$ and $d_i = \frac{c_i - \mu}{\sigma}$, you need to calculate $\Pr(d_1 \le Z \le d_2) =$

a. $\Phi(d_2) - \Phi(d_1)$.
b. $\Phi(1.96) - \Phi(-1.96)$.
c. $\Phi(d_2) - (1 - \Phi(d_1))$.
d. $1 - (\Phi(d_2) - \Phi(d_1))$.

Answer: a

14) The Student t distribution is

a. the distribution of the sum of m squared independent standard normal random variables.
b. the distribution of a random variable with a chi-squared distribution with m degrees of freedom, divided by m.
c. always well approximated by the standard normal distribution.
d. the distribution of the ratio of a standard normal random variable, divided by the square root of an independently distributed chi-squared random variable with m degrees of freedom divided by m.

Answer: d

17) When there are ∞ degrees of freedom, the $t_\infty$ distribution

a. can no longer be calculated.
b. equals the standard normal distribution.
c. has a bell shape similar to that of the normal distribution, but with "fatter" tails.
d. equals the $\chi^2_\infty$ distribution.

Answer: b

18) The sample average is a random variable and

a. is a single number and as a result cannot have a distribution.
b. has a probability distribution called its sampling distribution.
c. has a probability distribution called the standard normal distribution.
d. has a probability distribution that is the same as for the $Y_1, \dots, Y_n$ i.i.d. variables.

Answer: b

19) To infer the political tendencies of the students at your college/university, you sample 150 of them. Only one of the following is a simple random sample: You

a. make sure that the proportion of minorities is the same in your sample as in the entire student body.
b. call every fiftieth person in the student directory at 9 a.m. If the person does not answer the phone, you pick the next name listed, and so on.
c. go to the main dining hall on campus and interview students randomly there.
d. have your statistical package generate 150 random numbers in the range from 1 to the total number of students in your academic institution, and then choose the corresponding names in the student telephone directory.

Answer: d

20) The variance of $\bar{Y}$, $\sigma^2_{\bar{Y}}$, is given by the following formula:

a. $\sigma_Y^2$.
b. $\frac{\sigma_Y}{\sqrt{n}}$.
c. $\frac{\sigma_Y^2}{n}$.
d. $\frac{\sigma_Y^2}{\sqrt{n}}$.

Answer: c
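The formula in answer c follows in a few steps when $Y_1, \dots, Y_n$ are i.i.d. with variance $\sigma_Y^2$, which is the setting assumed in this sketch:

```latex
\operatorname{var}(\bar{Y})
  = \operatorname{var}\!\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{var}(Y_i)
  = \frac{n\,\sigma_Y^2}{n^2}
  = \frac{\sigma_Y^2}{n}
```

The second equality uses independence, so every covariance term between $Y_i$ and $Y_j$ ($i \ne j$) is zero.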

21) The mean of the sample average $\bar{Y}$, $E(\bar{Y})$, is

a. $\frac{1}{n}\mu_Y$.
b. $\mu_Y$.
c. $\frac{\mu_Y}{\sqrt{n}}$.
d. $\frac{\sigma_Y}{\mu_Y}$ for n > 30.

Answer: b

22) In econometrics, we typically do not rely on exact or finite sample distributions because

a. we have approximately an infinite number of observations (think of re-sampling).
b. variables typically are normally distributed.
c. the covariances of $Y_i, Y_j$ are typically not zero.
d. asymptotic distributions can be counted on to provide good approximations to the exact sampling distribution.

Answer: d

23) The central limit theorem states that

a. the distribution for $\frac{\bar{Y} - \mu_Y}{\sigma_{\bar{Y}}}$ becomes arbitrarily well approximated by the standard normal distribution.
b. $\bar{Y} \xrightarrow{p} \mu_Y$.
c. the probability that $\bar{Y}$ is in the range $\mu_Y \pm c$ becomes arbitrarily close to one as n increases for any constant c > 0.
d. the t distribution converges to the F distribution for approximately n > 30.

Answer: a

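24) The covariance inequality states that

a. $0 \le \sigma_{XY}^2 \le 1$.
b. $\sigma_{XY}^2 \le \sigma_X^2 \sigma_Y^2$.
c. $\sigma_{XY}^2 - \sigma_X^2 \le \sigma_Y^2$.
d. $\sigma_{XY}^2 \le \frac{\sigma_X^2}{\sigma_Y^2}$.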

Answer: b

Chapter 3

1) An estimator is

a. an estimate.
b. a formula that gives an efficient guess of the true population value.
c. a random variable.
d. a nonrandom number.



Answer: c

2) An estimate is

a. efficient if it has the smallest variance possible.
b. a nonrandom number.
c. unbiased if its expected value equals the population value.
d. another word for estimator.

Answer: b

3) An estimator $\hat{\mu}_Y$ of the population value $\mu_Y$ is consistent if

a. $\hat{\mu}_Y \xrightarrow{p} \mu_Y$.
b. its mean square error is the smallest possible.
c. Y is normally distributed.
d. $\bar{Y} \xrightarrow{p} 0$.

Answer: a

4) An estimator $\hat{\mu}_Y$ of the population value $\mu_Y$ is more efficient when compared to another estimator $\tilde{\mu}_Y$, if

a. $E(\hat{\mu}_Y) > E(\tilde{\mu}_Y)$.
b. it has a smaller variance.
c. its c.d.f. is flatter than that of the other estimator.
d. both estimators are unbiased, and $\operatorname{var}(\hat{\mu}_Y) < \operatorname{var}(\tilde{\mu}_Y)$.

Answer: d

5) The standard error of $\bar{Y}$, $SE(\bar{Y}) = \hat{\sigma}_{\bar{Y}}$, is given by the following formula:

a. $\frac{1}{n}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$.
b. $\frac{s_Y^2}{n}$.
c. $s_Y$.
d. $\frac{s_Y}{\sqrt{n}}$.

Answer: d

7) When you are testing a hypothesis against a two-sided alternative, then the alternative is written as

a. $E(Y) > \mu_{Y,0}$.
b. $E(Y) = \mu_{Y,0}$.
c. $\bar{Y} \ne \mu_{Y,0}$.
d. $E(Y) \ne \mu_{Y,0}$.

Answer: d

8) A scatterplot

a. shows how Y and X are related when their relationship is scattered all over the place.
b. relates the covariance of X and Y to the correlation coefficient.
c. is a plot of n observations on $X_i$ and $Y_i$, where each observation is represented by the point $(X_i, Y_i)$.
d. shows n observations of Y over time.

Answer: c

9) The following types of statistical inference are used throughout econometrics, with the exception of

a. confidence intervals.
b. hypothesis testing.
c. calibration.
d. estimation.

Answer: c

10) Among all unbiased estimators that are weighted averages of $Y_1, \dots, Y_n$, $\bar{Y}$ is

a. the only consistent estimator of $\mu_Y$.
b. the most efficient estimator of $\mu_Y$.
c. a number which, by definition, cannot have a variance.
d. the most unbiased estimator of $\mu_Y$.

Answer: b

11) To derive the least squares estimator $\hat{\mu}_Y$, you find the estimator m which minimizes

a. $\sum_{i=1}^{n}(Y_i - m)^2$.
b. $\left|\sum_{i=1}^{n}(Y_i - m)\right|$.
c. $\sum_{i=1}^{n} m Y_i^2$.
d. $\sum_{i=1}^{n}(Y_i - m)$.

Answer: a
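Why minimizing the sum of squared deviations leads to the sample average can be seen from the first-order condition. A brief sketch, using only the notation of the question:

```latex
\frac{d}{dm}\sum_{i=1}^{n}(Y_i - m)^2
  = -2\sum_{i=1}^{n}(Y_i - m) = 0
\quad\Longrightarrow\quad
m = \frac{1}{n}\sum_{i=1}^{n} Y_i = \bar{Y}
```

The second derivative is $2n > 0$, so this stationary point is a minimum.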

12) If the null hypothesis states $H_0: E(Y) = \mu_{Y,0}$, then a two-sided alternative hypothesis is

a. $H_1: E(Y) \ne \mu_{Y,0}$.
b. $H_1: E(Y) \approx \mu_{Y,0}$.
c. $H_1: \mu_Y < \mu_{Y,0}$.
d. $H_1: E(Y) > \mu_{Y,0}$.

Answer: a

14) A large p-value implies

a. rejection of the null hypothesis.
b. a large t-statistic.
c. a large $\bar{Y}^{act}$.
d. that the observed value $\bar{Y}^{act}$ is consistent with the null hypothesis.

Answer: d

15) The formula for the sample variance is

a. $s_Y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})$.
b. $s_Y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$.
c. $s_Y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \mu_Y)^2$.
d. $s_Y^2 = \frac{1}{n-1}\sum_{i=1}^{n-1}(Y_i - \bar{Y})^2$.

Answer: b

16) Degrees of freedom

a. in the context of the sample variance formula means that estimating the mean uses up some of the information in the data.
b. is something that certain undergraduate majors at your university/college other than economics seem to have an ∞ amount of.
c. are (n − 2) when replacing the population mean by the sample mean.
d. ensure that $s_Y^2 = \sigma_Y^2$.

Answer: a

17) The t-statistic is defined as follows:

a. $t = \frac{\bar{Y} - \mu_{Y,0}}{\sigma_Y / \sqrt{n}}$.
b. $t = \frac{\bar{Y} - \mu_{Y,0}}{SE(\bar{Y})}$.
c. $t = \frac{(\bar{Y} - \mu_{Y,0})^2}{SE(\bar{Y})}$.
d. 1.96.

Answer: b

18) The power of the test

a. is the probability that the test actually incorrectly rejects the null hypothesis when the null is true.
b. depends on whether you use $\bar{Y}$ or $\bar{Y}^2$ for the t-statistic.
c. is one minus the size of the test.
d. is the probability that the test correctly rejects the null when the alternative is true.

Answer: d

19) The sample covariance can be calculated in any of the following ways, with the exception of:

a. $\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})$.
b. $\frac{1}{n-1}\sum_{i=1}^{n} X_i Y_i - \frac{n}{n-1}\bar{X}\bar{Y}$.
c. $\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu_X)(Y_i - \mu_Y)$.
d. $r_{XY}\, s_X s_Y$, where $r_{XY}$ is the correlation coefficient.

Answer: c

20) When the sample size n is large, the 90% confidence interval for $\mu_Y$ is

a. $\bar{Y} \pm 1.96\, SE(\bar{Y})$.
b. $\bar{Y} \pm 1.64\, SE(\bar{Y})$.
c. $\bar{Y} \pm 1.64\, \sigma_Y$.
d. $\bar{Y} \pm 1.96$.

Answer: b

21) The standard error for the difference in means of two random variables M and W, when the two population variances are different, is

a. $\sqrt{\frac{s_M^2 + s_W^2}{n_M + n_W}}$.
b. $\frac{s_M}{n_M} + \frac{s_W}{n_W}$.
c. $\frac{1}{2}\left(\frac{s_M^2}{n_M} + \frac{s_W^2}{n_W}\right)$.
d. $\sqrt{\frac{s_M^2}{n_M} + \frac{s_W^2}{n_W}}$.

Answer: d

22) The following statement about the sample correlation coefficient is true.

a. $-1 \le r_{XY} \le 1$.
b. $r_{XY}^2 \xrightarrow{p} \operatorname{corr}(X_i, Y_i)$.
c. $|r_{XY}| < 1$.
d. $r_{XY} = \frac{s_{XY}^2}{s_X^2 s_Y^2}$.

Answer: a

23) The correlation coefficient

a. lies between zero and one.
b. is a measure of linear association.
c. is close to one if X causes Y.
d. takes on a high value if you have a strong nonlinear relationship.

Answer: b

Chapter 4

1) When the estimated slope coefficient in the simple regression model, $\hat{\beta}_1$, is zero, then

a. $R^2 = \bar{Y}$.
b. $0 < R^2 < 1$.
c. $R^2 = 0$.
d. $R^2 > (SSR/TSS)$.

Answer: c

2) Heteroskedasticity means that

a. homogeneity cannot be assumed automatically for the model.
b. the variance of the error term is not constant.
c. the observed units have different preferences.
d. agents are not all rational.

Answer: b

3) With heteroskedastic errors, the weighted least squares estimator is BLUE. You should use OLS with heteroskedasticity-robust standard errors because

a. this method is simpler.
b. the exact form of the conditional variance is rarely known.
c. the Gauss-Markov theorem holds.
d. your spreadsheet program does not have a command for weighted least squares.

Answer: b

4) Which of the following statements is correct?

a. TSS = ESS + SSR
b. ESS = SSR + TSS
c. ESS > TSS
d. $R^2 = 1 - (ESS/TSS)$

Answer: a

5) Binary variables

a. are generally used to control for outliers in your sample.
b. can take on more than two values.
c. exclude certain individuals from your sample.
d. can take on only two values.

Answer: d

6) When estimating a demand function for a good where quantity demanded is a linear function of the price, you should

a. not include an intercept because the price of the good is never zero.
b. use a one-sided alternative hypothesis to check the influence of price on quantity.
c. use a two-sided alternative hypothesis to check the influence of price on quantity.
d. reject the idea that price determines demand unless the coefficient is at least 1.96.

Answer: b

7) The reason why estimators have a sampling distribution is that

a. economics is not a precise science.
b. individuals respond differently to incentives.
c. in real life you typically get to sample many times.
d. the values of the explanatory variable and the error term differ across samples.



Answer: d

8) The OLS estimator is derived by

a. connecting the $Y_i$ corresponding to the lowest $X_i$ observation with the $Y_i$ corresponding to the highest $X_i$ observation.
b. making sure that the standard error of the regression equals the standard error of the slope estimator.
c. minimizing the sum of absolute residuals.
d. minimizing the sum of squared residuals.

Answer: d

9) Interpreting the intercept in a sample regression function is

a. not reasonable because you never observe values of the explanatory variables around the origin.
b. reasonable because under certain conditions the estimator is BLUE.
c. reasonable if your sample contains values of $X_i$ around the origin.
d. not reasonable because economists are interested in the effect of a change in X on the change in Y.

Answer: c

10) The sample average of the OLS residuals is

a. some positive number since OLS uses squares.
b. zero.
c. unobservable since the population regression function is unknown.
d. dependent on whether the explanatory variable is mostly positive or negative.

Answer: b

11) The t-statistic is calculated by dividing

a. the OLS estimator by its standard error.
b. the slope by the standard deviation of the explanatory variable.
c. the estimator minus its hypothesized value by the standard error of the estimator.
d. the slope by 1.96.

Answer: c

12) The slope estimator, $\beta_1$, has a smaller standard error, other things equal, if

a. there is more variation in the explanatory variable, X.
b. there is a large variance of the error term, u.
c. the sample size is smaller.
d. the intercept, $\beta_0$, is small.

Answer: a

13) The regression $R^2$ is a measure of

a. whether or not X causes Y.
b. the goodness of fit of your regression line.
c. whether or not ESS > TSS.
d. the square of the determinant of R.

Answer: b

14) (Requires Appendix) The sample regression line estimated by OLS

a. will always have a slope smaller than the intercept.
b. is exactly the same as the population regression line.
c. cannot have a slope of zero.
d. will always run through the point $(\bar{X}, \bar{Y})$.

Answer: d

15) The confidence interval for the sample regression function slope

a. can be used to conduct a test about a hypothesized population regression function slope.
b. can be used to compare the value of the slope relative to that of the intercept.
c. adds and subtracts 1.96 from the slope.
d. allows you to make statements about the economic importance of your estimate.

Answer: a

16) If the absolute value of your calculated t-statistic exceeds the critical value from the standard normal distribution, you can

a. reject the null hypothesis.
b. safely assume that your regression results are significant.
c. reject the assumption that the error terms are homoskedastic.
d. conclude that most of the actual values are very close to the regression line.

Answer: a

17) Under the least squares assumptions (zero conditional mean for the error term, $X_i$ and $Y_i$ being i.i.d., and $X_i$ and $u_i$ having finite fourth moments), the OLS estimator for the slope and intercept

a. has an exact normal distribution for n > 15.
b. is BLUE.
c. has a normal distribution even in small samples.
d. is unbiased.

Answer: d

18) To obtain the slope estimator using the least squares principle, you divide the

a. sample variance of X by the sample variance of Y.
b. sample covariance of X and Y by the sample variance of Y.
c. sample covariance of X and Y by the sample variance of X.
d. sample variance of X by the sample covariance of X and Y.

Answer: c
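The rule in answer c (sample covariance divided by the sample variance of X) can be tried on a small data set. The numbers below are hypothetical and the helper name `ols` is an illustrative choice; the same run also shows two properties asked about elsewhere in this chapter: the residuals average to zero and the fitted line passes through the point of means.

```python
from statistics import fmean

# Hypothetical observations, not taken from the problem set.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]


def ols(xs, ys):
    """Return (intercept, slope) of the simple OLS regression of ys on xs."""
    n = len(xs)
    mx, my = fmean(xs), fmean(ys)
    s_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    s_xx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    slope = s_xy / s_xx              # sample covariance / sample variance of X
    intercept = my - slope * mx      # forces the line through (mean X, mean Y)
    return intercept, slope


b0, b1 = ols(X, Y)
residuals = [y - (b0 + b1 * x) for x, y in zip(X, Y)]
print(b0, b1, fmean(residuals))
```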

19) To decide whether or not the slope coefficient is large or small,

a. you should analyze the economic importance of a given increase in X.
b. the slope coefficient must be larger than one.
c. the slope coefficient must be statistically significant.
d. you should change the scale of the X variable if the coefficient appears to be too small.

Answer: a

20) $E(u_i \mid X_i) = 0$ says that

a. dividing the error by the explanatory variable results in a zero (on average).
b. the sample regression function residuals are unrelated to the explanatory variable.
c. the sample mean of the Xs is much larger than the sample mean of the errors.
d. the conditional distribution of the error given the explanatory variable has a zero mean.

Answer: d

21) In the linear regression model, $Y_i = \beta_0 + \beta_1 X_i + u_i$, $\beta_0 + \beta_1 X_i$ is referred to as

a. the population regression function.
b. the sample regression function.
c. exogenous variation.
d. the right-hand variable or regressor.

Answer: a

22) Multiplying the dependent variable by 100 and the explanatory variable by 100,000 leaves the

a. OLS estimate of the slope the same.
b. OLS estimate of the intercept the same.
c. regression $R^2$ the same.
d. heteroskedasticity-robust standard errors of the OLS estimators the same.

Answer: c
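A quick numerical check of answer c: rescaling both variables changes the slope but not the regression $R^2$. The data are hypothetical, the scale factors 100 and 100,000 are taken as the ones intended by the question, and `slope_and_r2` is an illustrative helper name.

```python
from statistics import fmean


def slope_and_r2(xs, ys):
    """OLS slope and R^2 for a regression of ys on a single regressor xs."""
    mx, my = fmean(xs), fmean(ys)
    s_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    s_xx = sum((x - mx) ** 2 for x in xs)
    s_yy = sum((y - my) ** 2 for y in ys)
    slope = s_xy / s_xx
    r2 = s_xy**2 / (s_xx * s_yy)  # equals ESS/TSS with one regressor
    return slope, r2


# Hypothetical observations, not taken from the problem set.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

slope, r2 = slope_and_r2(X, Y)
slope_scaled, r2_scaled = slope_and_r2([100_000 * x for x in X],
                                       [100 * y for y in Y])
print(slope, slope_scaled, r2, r2_scaled)
```

The slope is multiplied by 100/100,000, while the ratio ESS/TSS is unit-free and stays put.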

Analytical Questions

Chapter 2

1) Think of the situation of rolling two dice and let M denote the sum of the number of dots on the two dice. (So M is a number between 2 and 12.)

(a) In a table, list all of the possible outcomes for the random variable M together with its probability distribution and cumulative probability distribution. Sketch both distributions.

Answer:

Outcome (sum of dots)   Probability distribution   Cumulative probability distribution
 2                      0.028                      0.028
 3                      0.056                      0.083
 4                      0.083                      0.167
 5                      0.111                      0.278
 6                      0.139                      0.417
 7                      0.167                      0.583
 8                      0.139                      0.722
 9                      0.111                      0.833
10                      0.083                      0.917
11                      0.056                      0.972
12                      0.028                      1.000

[Figure: Probability and Cumulative Probability Distribution of Number of Dots. Horizontal axis: Number of Dots (2 to 12). Vertical axes: Probability and Cumulative Probability.]

(b) Calculate the expected value and the standard deviation for M.

Answer: 7.0; 2.42.
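The expected value and standard deviation can be reproduced by enumerating the 36 equally likely rolls. A minimal sketch using exact fractions (the variable names are illustrative):

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# Probability distribution of M, the sum of two fair dice.
pmf = {}
for d1, d2 in product(range(1, 7), repeat=2):
    pmf[d1 + d2] = pmf.get(d1 + d2, Fraction(0)) + Fraction(1, 36)

mean = sum(m * p for m, p in pmf.items())
var = sum((m - mean) ** 2 * p for m, p in pmf.items())
sd = sqrt(var)

print(mean, var, round(sd, 2))
```

The variance is exactly 35/6, so the standard deviation is about 2.42.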

(c) Looking at the sketch of the probability distribution, you notice that it resembles a normal distribution. Should you be able to use the standard normal distribution to calculate probabilities of events? Why or why not?

Answer: You cannot use the normal distribution (without continuity correction) to calculate probabilities of events, since the probability of any event equals zero.

(d) What is the probability of the following outcomes?

(i) Pr(M = 7)
(ii) Pr(M = 2 or M = 10)
(iii) Pr(M = 4 or M ≠ 4)
(iv) Pr(M = 6 and M = 9)
(v) Pr(M < 8)
(vi) Pr(M = 6 or M > 10)

Answer: (i) 0.167 or 6/36 = 1/6; (ii) 0.111 or 4/36 = 1/9; (iii) 1; (iv) 0; (v) 0.583; (vi) 0.222 or 8/36 = 2/9.
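The six probabilities can be verified by counting rolls. The sketch below assumes the events are (i) M = 7, (ii) M = 2 or M = 10, (iii) M = 4 or M ≠ 4, (iv) M = 6 and M = 9, (v) M < 8, (vi) M = 6 or M > 10; `pr` is an illustrative helper name.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes


def pr(event):
    """Probability that the sum of the two dice satisfies `event`."""
    hits = sum(1 for d1, d2 in rolls if event(d1 + d2))
    return Fraction(hits, len(rolls))


p_i = pr(lambda m: m == 7)
p_ii = pr(lambda m: m == 2 or m == 10)
p_iii = pr(lambda m: m == 4 or m != 4)    # always true
p_iv = pr(lambda m: m == 6 and m == 9)    # impossible
p_v = pr(lambda m: m < 8)
p_vi = pr(lambda m: m == 6 or m > 10)

print(p_i, p_ii, p_iii, p_iv, p_v, p_vi)
```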

2) Probabilities and relative frequencies are related in that the probability of an outcome is the proportion of the time that the outcome occurs in the long run. Hence concepts of joint, marginal, and conditional probability distributions stem from related concepts of frequency distributions.

You are interested in investigating the relationship between the age of heads of households and weekly earnings of households. The accompanying data gives the number of occurrences grouped by age and income. You collect data from 1,744 individuals and think of these individuals as a population that you want to describe, rather than a sample from which you want to infer behavior of a larger population. After sorting the data, you generate the accompanying table:

Joint Absolute Frequencies of Age and Income, 1,744 Households

                                Age of head of household
Household income                X1            X2            X3            X4            X5
                                16-under 20   20-under 25   25-under 45   45-under 65   65 and >
Y1  $0-under $200               80            76            130           86            24
Y2  $200-under $400             13            90            346           140           8
Y3  $400-under $600             0             19            251           101           6
Y4  $600-under $800             1             11            110           55            1
Y5  $800 and >                  1             1             108           84            2

The median of the income group of $800 and above is $1,050.



(a) Calculate the joint relative frequencies and the marginal relative frequencies. Interpret one of each of these. Sketch the cumulative income distribution.

Answer: The joint relative frequencies and marginal relative frequencies are given in the accompanying table. 5.2 percent of the individuals are between the age of 20 and 24, and make between $200 and under $400. 21.6 percent of the individuals earn between $400 and under $600.

Joint Relative and Marginal Frequencies of Age and Income, 1,744 Households

                                Age of head of household
Household income                X1            X2            X3            X4            X5            Total
                                16-under 20   20-under 25   25-under 45   45-under 65   65 and >
Y1  $0-under $200               0.046         0.044         0.075         0.049         0.014         0.227
Y2  $200-under $400             0.007         0.052         0.198         0.080         0.005         0.342
Y3  $400-under $600             0.000         0.011         0.144         0.058         0.003         0.216
Y4  $600-under $800             0.001         0.006         0.063         0.032         0.001         0.102
Y5  $800 and >                  0.001         0.001         0.062         0.048         0.001         0.112



[Figure: Cumulative Income Distribution. Horizontal axis: Income Class ($0-under $200, $200-under $400, $400-under $600, $600-under $800, $800 and >). Vertical axis: Percent.]

(b) Calculate the conditional relative income frequencies for the two age categories 16-under 20, and 45-under 65. Calculate the mean household income for both age categories.

Answer: The mean household income for the 16-under 20 age category is roughly $144. It is approximately $489 for the 45-under 65 age category.

Conditional Relative Frequencies of Income and Age 16-under 20, and 45-under 65, 1,744 Households

                                Age of head of household
Household income                X1            X4
                                16-under 20   45-under 65
Y1  $0-under $200               0.842         0.185
Y2  $200-under $400             0.137         0.300
Y3  $400-under $600             0.000         0.217
Y4  $600-under $800             0.011         0.118
Y5  $800 and >                  0.011         0.180
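The two conditional means can be reproduced from the absolute frequencies. This sketch rests on two assumptions that are not spelled out in the problem: the column counts are 80, 13, 0, 1, 1 for 16-under 20 and 86, 140, 101, 55, 84 for 45-under 65, and each closed income bracket is represented by its midpoint ($100, $300, $500, $700), with the stated median of $1,050 used for the open top bracket.

```python
# Assumed representative income for each bracket (midpoints, plus the
# stated median of the open-ended top bracket).
income_points = [100, 300, 500, 700, 1050]

# Assumed absolute frequencies by income bracket for the two age groups.
counts = {
    "16-under 20": [80, 13, 0, 1, 1],
    "45-under 65": [86, 140, 101, 55, 84],
}


def conditional(column):
    """Conditional relative frequencies of income within one age group."""
    total = sum(column)
    return [c / total for c in column]


def mean_income(column):
    """Mean income implied by the conditional relative frequencies."""
    return sum(p * y for p, y in zip(conditional(column), income_points))


for age, column in counts.items():
    shares = [round(p, 3) for p in conditional(column)]
    print(age, shares, round(mean_income(column)))
```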




(c) If household income and age of head of household were independently distributed, what would you expect these two conditional relative income distributions to look like? Are they similar here?

Answer: They would have to be identical, which they clearly are not.

(d) Your textbook has given you a primary definition of independence that does not involve conditional relative frequency distributions. What is that definition? Do you think that age and income are independent here, using this definition?

Answer: $\Pr(Y = y, X = x) = \Pr(Y = y)\,\Pr(X = x)$. We can check this by multiplying two marginal probabilities to see if this results in the joint probability. For example, $\Pr(Y = Y_3) = 0.216$ and $\Pr(X = X_3) = 0.542$, resulting in a product of 0.117, which does not equal the joint probability of 0.144. Given that we are looking at the data as a population, not a sample, we do not have to test how "close" 0.117 is to 0.144.

3) Math and verbal SAT scores are each distributed normally with N(500, 10000).

(a) What fraction of students scores above 750? Above 600? Between 420 and 530? Below 480? Above 530?

Answer: Pr(Y > 750) = 0.0062; Pr(Y > 600) = 0.1587; Pr(420 < Y < 530) = 0.4061; Pr(Y < 480) = 0.4207; Pr(Y > 530) = 0.3821.
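These probabilities come from standardizing and reading off the normal c.d.f. A minimal sketch with the standard library's `statistics.NormalDist`, assuming mean 500, standard deviation 100, and the cutoffs 750, 600, 420, 530, and 480:

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)  # variance 10000

p_above_750 = 1 - sat.cdf(750)
p_above_600 = 1 - sat.cdf(600)
p_between_420_530 = sat.cdf(530) - sat.cdf(420)
p_below_480 = sat.cdf(480)
p_above_530 = 1 - sat.cdf(530)

for p in (p_above_750, p_above_600, p_between_420_530, p_below_480, p_above_530):
    print(round(p, 4))
```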

(b) If the math and verbal scores were independently distributed, which is not the case, then what would be the distribution of the overall SAT score? Find its mean and variance.

Answer: The distribution would be N(1000, 20000), using equations (2.29) and (2.31) in the textbook. Note that the standard deviation is now roughly 141 rather than 200.

(c) Next, assume that the correlation coefficient between the math and verbal scores is 0.75. Find the mean and variance of the resulting distribution.

Answer: Given the correlation coefficient, the distribution is now N(1000, 35000), which has a standard deviation of approximately 187.
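Both variances follow from var(X + Y) = var(X) + var(Y) + 2 corr(X, Y) sd(X) sd(Y). A short sketch, assuming each score has standard deviation 100 and that the correlation is 0 in part (b) and 0.75 in part (c):

```python
from math import sqrt

sigma = 100.0  # assumed standard deviation of each score


def var_of_sum(rho):
    """Variance of math + verbal for a given correlation coefficient."""
    return sigma**2 + sigma**2 + 2 * rho * sigma * sigma


var_independent = var_of_sum(0.0)
var_correlated = var_of_sum(0.75)

print(var_independent, round(sqrt(var_independent)))
print(var_correlated, round(sqrt(var_correlated)))
```

The mean is 1000 in both cases; only the variance depends on the correlation.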

(d) Finally, assume that you had chosen 25 students at random who had taken the SAT exam. Derive the distribution for their average math SAT score. What is the probability that this average is above 530? Why is this so much smaller than your answer in (a)?

Answer: The distribution for the average math SAT score is N(500, 400). $\Pr(\bar{Y} \ge 530) = 0.0668$. This probability is smaller because the sample mean has a smaller standard deviation (20 rather than 100).
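The sample average of n i.i.d. normal scores is normal with the same mean and standard deviation σ/√n. A sketch assuming n = 25, mean 500, and σ = 100:

```python
from math import sqrt
from statistics import NormalDist

n = 25
single = NormalDist(mu=500, sigma=100)            # one student's score
average = NormalDist(mu=500, sigma=100 / sqrt(n))  # mean of n scores

p_single = 1 - single.cdf(530)
p_average = 1 - average.cdf(530)

print(average.variance, round(p_single, 4), round(p_average, 4))
```

The cutoff of 530 is 0.3 standard deviations above the mean for one student but 1.5 standard deviations above it for the average of 25.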

You have read about the so-called catch-up theory by economic historians, whereby nations that are further behind in per capita income grow faster subsequently. If this is true systematically, then eventually laggards will reach the leader. To put the theory to the test, you collect data on relative (to the United States) per capita income for two years, 1960 and 1990, for 24 OECD countries. You think of these countries as a population you want to describe, rather than a sample from which you want to infer behavior of a larger population. The relevant data for this question is as follows:

Y      X1      X2      X1 × Y      Y²      X1²      X2²

X .% .44 1.% .18 .(% .(+% 1.+

.1' 1. 1. .1' . 1. 1.Q. Q. Q. Q. Q. Q. Q.

.'1 . .'( .8 .18 .' .(

.%% .1% .% .' .1+ .14 .(+

.( 1%. 14.8 .+' .1844 8.(+ 1%.+1'

where $X_1$ and $X_2$ are per capita income relative to the United States in 1960 and 1990 respectively, and Y is the average annual growth rate in X over the 1960-1990 period. Numbers in the last row represent sums of the columns above.

(a) Calculate the variance and standard deviation of $X_1$ and $X_2$. For a catch-up effect to be present, what relationship must the two standard deviations show? Is this the case here?

Answer: The variances of $X_1$ and $X_2$ are 0.0520 and 0.0298 respectively, with standard deviations of 0.2279 and 0.1726. For the catch-up effect to be present, the standard deviation would have to shrink over time. This is the case here.

(b) Calculate the correlation between Y and $X_1$. What sign must the correlation coefficient have for there to be evidence of a catch-up effect? Explain.

Answer: The correlation coefficient is −0.88. It has to be negative for there to be evidence of a catch-up effect. If countries that were relatively ahead in terms of per capita income in the initial period grow by relatively less over time, then eventually the laggards will catch up.



4) Following Alfred Nobel's will, there are five Nobel Prizes awarded each year. These are for outstanding achievements in Chemistry, Physics, Physiology or Medicine, Literature, and Peace. In 1968, the Bank of Sweden added a prize in Economic Sciences in memory of Alfred Nobel. You think of the data as describing a population, rather than a sample from which you want to infer behavior of a larger population. The accompanying table lists the joint probability distribution between recipients in economics and the other five prizes, and the citizenship of the recipients, based on the 1969-2001 period.

Joint Distribution of Nobel Prize Winners in Economics and Non-Economics Disciplines, and Citizenship, 1969-2001

                                                                          U.S. Citizen   Non-U.S. Citizen   Total
                                                                          (Y = 0)        (Y = 1)
Economics Nobel Prize (X = 0)                                             0.118          0.049              0.167
Physics, Chemistry, Medicine, Literature, and Peace Nobel Prize (X = 1)   0.345          0.488              0.833
Total                                                                     0.463          0.537              1.000

(a) Compute $E(Y)$ and interpret the resulting number.

Answer: $E(Y) = 0.537$. 53.7 percent of Nobel Prize winners were non-U.S. citizens.

(b) Calculate and interpret $E(Y \mid X = 1)$ and $E(Y \mid X = 0)$.

Answer: $E(Y \mid X = 1) = 0.586$. 58.6 percent of Nobel Prize winners in non-economics disciplines were non-U.S. citizens. $E(Y \mid X = 0) = 0.293$. 29.3 percent of the Economics Nobel Prize winners were non-U.S. citizens.

(c) A randomly selected Nobel Prize winner reports that he is a non-U.S. citizen. What is the probability that this genius has won the Economics Nobel Prize? A Nobel Prize in the other five disciplines?

Answer: There is a 9.1 percent chance that he has won the Economics Nobel Prize, and a 90.9 percent chance that he has won a Nobel Prize in one of the other five disciplines.
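Parts (a) to (c) all come from dividing a joint probability by the appropriate marginal. A sketch, assuming the joint probabilities 0.118, 0.049, 0.345, and 0.488 for the four cells, with X = 0 for the Economics prize and Y = 1 for non-U.S. citizens:

```python
# Assumed joint distribution, keyed by (x, y).
joint = {(0, 0): 0.118, (0, 1): 0.049, (1, 0): 0.345, (1, 1): 0.488}

# Marginal distributions obtained by summing over the other variable.
pr_x = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in (0, 1)}
pr_y = {y: sum(p for (_, yy), p in joint.items() if yy == y) for y in (0, 1)}

e_y = pr_y[1]                                # E(Y) for a 0/1 variable
e_y_given_x1 = joint[(1, 1)] / pr_x[1]       # E(Y | X = 1)
e_y_given_x0 = joint[(0, 1)] / pr_x[0]       # E(Y | X = 0)
p_econ_given_non_us = joint[(0, 1)] / pr_y[1]  # Pr(X = 0 | Y = 1)

print(round(e_y, 3), round(e_y_given_x1, 3),
      round(e_y_given_x0, 3), round(p_econ_given_non_us, 3))
```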

(d) Show what the joint distribution would look like if the two categories were independent.

Answer:

Joint Distribution of Nobel Prize Winners in Economics and Non-Economics Disciplines, and Citizenship, 1969-2001, under assumption of independence

                                                                          U.S. Citizen   Non-U.S. Citizen   Total
                                                                          (Y = 0)        (Y = 1)
Economics Nobel Prize (X = 0)                                             0.077          0.090              0.167
Physics, Chemistry, Medicine, Literature, and Peace Nobel Prize (X = 1)   0.386          0.447              0.833
Total                                                                     0.463          0.537              1.000
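Under independence each joint probability is the product of the two marginals. A sketch, assuming the marginals 0.167 and 0.833 for X and 0.463 and 0.537 for Y:

```python
# Assumed marginal distributions.
pr_x = {0: 0.167, 1: 0.833}   # X = 0: Economics prize
pr_y = {0: 0.463, 1: 0.537}   # Y = 0: U.S. citizen

# Joint distribution implied by independence: Pr(X = x) * Pr(Y = y).
independent = {
    (x, y): round(pr_x[x] * pr_y[y], 3) for x in pr_x for y in pr_y
}

for cell, p in sorted(independent.items()):
    print(cell, p)
```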

7) A few years ago the news magazine The Economist listed some of the stranger explanations used in the past to predict presidential election outcomes. These included whether or not the hemlines of women's skirts went up or down, stock market performances, baseball World Series wins by an American League team, etc. Thinking about this problem more seriously, you decide to analyze whether or not the presidential candidate for a certain party did better if his party controlled the house. Accordingly you collect data for the last 34 presidential elections. You think of this data as comprising a population which you want to describe, rather than a sample from which you want to infer behavior of a larger population. You generate the accompanying table:

Joint Distribution of Presidential Party Affiliation and Party Control of House of Representatives, 1860-1996

                                 Democratic Control   Republican Control   Total
                                 of House (Y = 0)     of House (Y = 1)
Democratic President (X = 0)     0.412                0.030                0.441
Republican President (X = 1)     0.176                0.382                0.559
Total                            0.588                0.412                1.000

(a) Interpret one of the joint probabilities and one of the marginal probabilities.

Answer: 38.2 percent of the presidents were Republicans and were in the White House while Republicans controlled the House of Representatives. 44.1 percent of all presidents were Democrats.



(b) Compute $E(X)$. How does this differ from $E(X \mid Y = 0)$? Explain.

Answer: $E(X) = 0.559$. $E(X \mid Y = 0) = 0.176/0.588 = 0.299$. $E(X)$ gives you the unconditional expected value, while $E(X \mid Y = 0)$ is the conditional expected value.

(c) If you picked one of the Republican presidents at random, what is the probability that during his term the Democrats had control of the House?

Answer: $E(X) = 0.559$. 55.9 percent of the presidents were Republicans. $E(X \mid Y = 0) = 0.299$. 29.9 percent of those presidents who were in office while Democrats had control of the House of Representatives were Republicans. The second conditions on those periods during which Democrats had control of the House of Representatives, and ignores the other periods. For the probability asked about here, $\Pr(Y = 0 \mid X = 1) = 0.176/0.559 \approx 0.315$.

(d) What would the joint distribution look like under independence? Check your results by calculating the two conditional distributions and compare these to the marginal distribution.

Answer:

Joint Distribution of Presidential Party Affiliation and Party Control of House of Representatives, 1860-1996, under the Assumption of Independence

                                 Democratic Control   Republican Control   Total
                                 of House (Y = 0)     of House (Y = 1)
Democratic President (X = 0)     0.259                0.182                0.441
Republican President (X = 1)     0.329                0.230                0.559
Total                            0.588                0.412                1.000

$\Pr(X = 0 \mid Y = 0) = \frac{0.259}{0.588} = 0.440$ (there is a small rounding error).

$\Pr(Y = 1 \mid X = 1) = \frac{0.230}{0.559} = 0.411$ (there is a small rounding error).

8) The expectations augmented Phillips curve postulates

$\Delta p = \pi - f(u - \bar{u})$,

where $\Delta p$ is the actual inflation rate, $\pi$ is the expected inflation rate, and $u$ is the unemployment rate, with the bar indicating equilibrium (the NAIRU, the Non-Accelerating Inflation Rate of Unemployment). Under the assumption of static expectations ($\pi = \Delta p_{-1}$), i.e., that you expect this period's inflation rate to hold for the next period ("the sun shines today, it will shine tomorrow"), then the prediction is that inflation will accelerate if the unemployment rate is below its equilibrium level. The accompanying table below displays information on accelerating annual inflation and unemployment rate differences from the equilibrium rate (cyclical unemployment), where the latter is approximated by a five-year moving average. You think of this data as a population which you want to describe, rather than a sample from which you want to infer behavior of a larger population. The data is collected from United States quarterly data for the period 1964:1 to 1995:4.

Joint Distribution of Accelerating Inflation and Cyclical Unemployment, 1964:1-1995:4

                               (u − ū) > 0    (u − ū) ≤ 0    Total
                               (Y = 0)        (Y = 1)
Δp − Δp(−1) > 0  (X = 0)       0.156          0.383          0.539
Δp − Δp(−1) ≤ 0  (X = 1)       0.297          0.164          0.461
Total                          0.453          0.547          1.000

(a) Compute E(Y) and E(X), and interpret both numbers.

Answer: E(Y) = 0.547. In 54.7 percent of the quarters, unemployment was at or below its equilibrium rate (no positive cyclical unemployment). E(X) = 0.461. In 46.1 percent of the quarters, the inflation rate did not accelerate.

(b) Calculate E(Y | X = 1) and E(Y | X = 0). If there was independence between cyclical unemployment and acceleration in the inflation rate, what would you expect the relationship between the two expected values to be? Given that the two means are different, is this sufficient to assume that the two variables are independent?

Answer: E(Y | X = 1) = 0.356; E(Y | X = 0) = 0.711. You would expect the two conditional expectations to be the same. In general, independence in means does not imply statistical independence, although the reverse is true.

(c) What is the probability that inflation increases if there is positive cyclical unemployment? Negative cyclical unemployment?

Answer: There is a 34.4 percent probability that inflation increases if there is positive cyclical unemployment. There is a 70 percent probability that inflation increases if there is negative cyclical unemployment.

(d) You randomly select one of the 59 quarters when there was positive cyclical unemployment ((u - ū) > 0). What is the probability there was decelerating inflation during that quarter?

Answer: There is a 65.6 percent probability that inflation decelerates when there is positive cyclical unemployment.
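The conditional expectations and probabilities in this problem all come from dividing a joint probability by a marginal one. A minimal sketch in Python, assuming the joint table is stored in a dictionary keyed by (x, y); the helper names are hypothetical:

```python
# Joint probabilities Pr(X = x, Y = y) from the table.
# X = 1: inflation did not accelerate; Y = 1: no positive cyclical unemployment.
joint = {
    (0, 0): 0.156, (0, 1): 0.383,
    (1, 0): 0.297, (1, 1): 0.164,
}


def pr_x(x):
    """Marginal probability Pr(X = x)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def pr_y(y):
    """Marginal probability Pr(Y = y)."""
    return sum(p for (_, yi), p in joint.items() if yi == y)


def mean_y_given_x(x):
    """E(Y | X = x); for a 0/1 variable this equals Pr(Y = 1 | X = x)."""
    return joint[(x, 1)] / pr_x(x)


def pr_x_given_y(x, y):
    """Pr(X = x | Y = y)."""
    return joint[(x, y)] / pr_y(y)


print(round(mean_y_given_x(1), 3), round(mean_y_given_x(0), 3))
print(round(pr_x_given_y(0, 0), 3), round(pr_x_given_y(1, 0), 3))
```

Storing the table as a dictionary keeps the definitions of the marginal and conditional distributions visible in the code, at the cost of some speed that does not matter for a 2 x 2 table.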

9) The accompanying table shows the joint distribution between the change of the unemployment rate in an election year and the share of the candidate of the incumbent party since 1928. You think of this data as a population which you want to describe, rather than a sample from which you want to infer behavior of a larger population.

Joint Distribution of Unemployment Rate Change and Incumbent Party's Vote Share in Total Vote Cast for the Two Major-Party Candidates, 1928-2000

                      (Incumbent - 50%) > 0    (Incumbent - 50%) ≤ 0    Total
                      (Y = 0)                  (Y = 1)

Δu > 0 (X = 0)        0.053                    0.211                    0.264

Δu ≤ 0 (X = 1)        0.579                    0.157                    0.736

Total                 0.632                    0.368                    1.000

(a) Compute and interpret E(Y) and E(X).

Answer: E(Y) = 0.368; E(X) = 0.736. The probability that an incumbent receives less than 50% of the votes cast for the two major-party candidates is 0.368. The probability of observing falling unemployment rates during the election year is 73.6 percent.

(b) Calculate E(Y | X = 1) and E(Y | X = 0). Did you expect these to be very different?

Answer: E(Y | X = 1) = 0.213; E(Y | X = 0) = 0.799. A student who believes that incumbents will attempt to manipulate the economy to win elections will answer affirmatively here.

(c) What is the probability that the unemployment rate decreases in an election year?

Answer: Pr(X = 1) = 0.736.

(d) Conditional on the unemployment rate decreasing, what is the probability that an incumbent will lose the election?

Answer: Pr(Y = 1 | X = 1) = 0.213.

(e) What would the joint distribution look like under independence?

Answer: Joint Distribution of Unemployment Rate Change and Incumbent Party's Vote Share in Total Vote Cast for the Two Major-Party Candidates, 1928-2000, under Assumption of Statistical Independence

                      (Incumbent - 50%) > 0    (Incumbent - 50%) ≤ 0    Total
                      (Y = 0)                  (Y = 1)

Δu > 0 (X = 0)        0.167                    0.097                    0.264

Δu ≤ 0 (X = 1)        0.465                    0.271                    0.736

Total                 0.632                    0.368                    1.000

10) The accompanying table lists the joint distribution of unemployment in the United States in 2001 by demographic characteristics (race and age).

Joint Distribution of Unemployment by Demographic Characteristics, United States, 2001

                              White      Black and Other    Total
                              (Y = 0)    (Y = 1)

Age 16-19 (X = 0)             0.13       0.05               0.18

Age 20 and above (X = 1)      0.60       0.22               0.82

Total                         0.73       0.27               1.00

(a) What is the percentage of unemployed white teenagers?

Answer: Pr(Y = 0, X = 0) = 0.13.

(b) Calculate the conditional distribution for the categories "white" and "black and other."

Answer: Conditional Distribution of Unemployment by Demographic Characteristics, United States, 2001

                              White      Black and Other
                              (Y = 0)    (Y = 1)

Age 16-19 (X = 0)             0.18       0.19

Age 20 and above (X = 1)      0.82       0.81

Total                         1.00       1.00

(c) Given your answer in the previous question, how do you reconcile this fact with the probability to be 60% of finding an unemployed adult white person, and only 22% for the category "black and other"?

Answer: The original table showed the joint probability distribution, while the table in (b) presented the conditional probability distribution.




Mathematical and Graphical Problems

1) Think of an example involving five possible quantitative outcomes of a discrete random variable and attach a probability to each one of these outcomes. Display the outcomes, probability distribution, and cumulative probability distribution in a table. Sketch both the probability distribution and the cumulative probability distribution.

Answer: Answers will vary by student. The generated table should be similar to Table 2.1 in the text, and figures should resemble Figures 2.1 and 2.2 in the text.

2) The height of male students at your college/university is normally distributed with a mean of 70 inches and a standard deviation of 3.5 inches. If you had a list of telephone numbers for male students for the purpose of conducting a survey, what would be the probability of randomly calling one of these students whose height is

(a) taller than 6'0"?
(b) between 5'3" and 6'5"?
(c) shorter than 5'7", the mean height of female students?
(d) shorter than 5'0"?
(e) taller than Shaq O'Neal, the center of the L.A. Lakers, who is 7'1" tall? Compare this to the probability of a woman being pregnant for 10 months (300 days), where days of pregnancy is normally distributed with a mean of 266 days and a standard deviation of 16 days.

Answer: (a) Pr(Z > 0.5714) = 0.2839; (b) Pr(-2 < Z < 2) = 0.9545 or approximately 0.95; (c) Pr(Z < -0.8571) = 0.1957; (d) Pr(Z < -2.8571) = 0.0021; (e) Pr(Z > 4.2857) = 0.000009 (the text does not show values above 2.99 standard deviations, Pr(Z > 2.99) = 0.0014) and Pr(Z > 2.125) = 0.0168.
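These probabilities can be reproduced without a printed table. The sketch below uses only the Python standard library and writes the standard normal c.d.f. in terms of the error function; the function and variable names are illustrative choices for this example.

```python
import math


def normal_cdf(z):
    """Standard normal c.d.f., expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_above(x, mean, sd):
    """Pr(Y > x) for Y ~ N(mean, sd^2), after standardizing."""
    return 1.0 - normal_cdf((x - mean) / sd)


def prob_between(low, high, mean, sd):
    """Pr(low < Y < high) for Y ~ N(mean, sd^2)."""
    return normal_cdf((high - mean) / sd) - normal_cdf((low - mean) / sd)


MEAN_HEIGHT, SD_HEIGHT = 70.0, 3.5  # inches

taller_than_72 = prob_above(72, MEAN_HEIGHT, SD_HEIGHT)         # part (a)
between_63_and_77 = prob_between(63, 77, MEAN_HEIGHT, SD_HEIGHT)  # part (b)
shorter_than_67 = normal_cdf((67 - MEAN_HEIGHT) / SD_HEIGHT)    # part (c)
pregnant_300_days = prob_above(300, 266, 16)                    # part (e)

print(round(taller_than_72, 4), round(between_63_and_77, 4))
print(round(shorter_than_67, 4), round(pregnant_300_days, 4))
```

Unlike the textbook table, this approach is not limited to z-values below 2.99, so the tail probability for a 7'1" student can be computed directly with `prob_above(85, 70, 3.5)`.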

3) Calculate the following probabilities using the standard normal distribution. Sketch the probability distribution in each case, shading in the area of the calculated probability.

(a) Pr(Z < 0.0)
(b) Pr(Z ≤ 1.0)
(c) Pr(Z > 1.96)
(d) Pr(Z < -2.0)
(e) Pr(Z > 1.645)
(f) Pr(Z > -1.645)
(g) Pr(-1.96 < Z < 1.96)
(h) Pr(Z < -2.576 or Z > 2.576)
(i) Pr(Z > a) = 0.10; find a.
(j) Pr(Z < -a or Z > a) = 0.05; find a.

Answer: (a) 0.5000; (b) 0.8413; (c) 0.0250; (d) 0.0228; (e) 0.0500; (f) 0.9500; (g) 0.9500; (h) 0.0100; (i) 1.2816; (j) 1.96.

4) Using the fact that the standardized variable Z is a linear transformation of the normally distributed random variable Y, derive the expected value and variance of Z.

Answer: Z = (Y - μ_Y)/σ_Y = -μ_Y/σ_Y + (1/σ_Y) Y = a + bY, with a = -μ_Y/σ_Y and b = 1/σ_Y. Given (2.29) and (2.30) in the text, E(Z) = -μ_Y/σ_Y + (1/σ_Y) μ_Y = 0, and σ_Z^2 = (1/σ_Y^2) σ_Y^2 = 1.

5) Show in a scatterplot what the relationship between two variables X and Y would look like if there was

(a) a strong negative correlation.

(b) a strong positive correlation.

Answer: [Figures: scatterplots for (a) and (b).]

(c) no correlation.

(d) What would the correlation coefficient be if all observations for the two variables were on a curve described by Y = X^2?

Answer: The correlation coefficient would be zero in this case, since the relationship is non-linear.

6) Find the following probabilities:

(a) Y is distributed χ^2_4. Find Pr(Y > 9.49).

Answer: 0.05.

(b) Y is distributed t_∞. Find Pr(Y > -0.5).

Answer: 0.6915.

(c) Y is distributed F_{4,∞}. Find Pr(Y < 3.32).

Answer: 0.99.

(d) Y is distributed N(500, 10000). Find Pr(Y > 696 or Y < 304).

Answer: 0.05.




7) In considering the purchase of a certain stock, you attach the following probabilities to possible changes in the stock price over the next year.

Stock Price Change During
Next Twelve Months (%)        Probability

+15                           0.20

+5                            0.30

0                             0.40

-5                            0.05

-15                           0.05

What is the expected value, the variance, and the standard deviation? Which is the most likely outcome? Sketch the cumulative distribution function.

Answer: E(Y) = 3.5; σ_Y^2 = Σ (y_i - 3.5)^2 p_i = 52.75; σ_Y = 7.26; most likely: 0.
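The moments can be checked directly from their definitions. The following Python sketch takes the five outcomes with probabilities 0.20, 0.30, 0.40, 0.05 and 0.05; the variable names are illustrative. Note that each squared deviation is weighted by the probability itself, not by the squared probability.

```python
import math
from itertools import accumulate

outcomes = [15, 5, 0, -5, -15]          # stock price change in percent
probs = [0.20, 0.30, 0.40, 0.05, 0.05]  # must sum to one

# Expected value: probability-weighted average of the outcomes.
mean = sum(y * p for y, p in zip(outcomes, probs))

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum((y - mean) ** 2 * p for y, p in zip(outcomes, probs))
std_dev = math.sqrt(variance)

# Most likely outcome: the one with the largest probability.
mode = max(zip(probs, outcomes))[1]

# Cumulative distribution over the outcomes sorted from smallest to largest.
ordered = sorted(zip(outcomes, probs))
cdf = list(accumulate(p for _, p in ordered))

print(mean, variance, round(std_dev, 2), mode)
```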

[Figure: Stock Price Change During Next Twelve Months. Horizontal axis: Percentage Change (-15, -5, 0, +5, +15); vertical axis: 0 to 1.]



8) You consider visiting Montreal during the break between terms in January. You go to the relevant Web site of the official tourist office to figure out the type of clothes you should take on the trip. The site lists that the average high during January is -7° C, with a standard deviation of 4° C. Unfortunately you are more familiar with Fahrenheit than with Celsius, but find that the two are related by the following linear function:

C = (5/9)(F - 32).

Find the mean and standard deviation for the January temperature in Montreal in Fahrenheit.

Answer: Using equations (2.29) and (2.30) from the textbook, the result is 19.4 and 7.2.
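The calculation behind this answer solves the conversion for Fahrenheit, F = 32 + 1.8 C, and then applies the rules for a linear transformation: the mean becomes a + b times the old mean and the standard deviation becomes |b| times the old standard deviation. A short Python sketch, with a hypothetical helper name:

```python
def linear_transform_moments(mean_x, sd_x, a, b):
    """Mean and standard deviation of Y = a + b*X.

    E(Y) = a + b*E(X) and sd(Y) = |b|*sd(X).
    """
    return a + b * mean_x, abs(b) * sd_x


# Celsius to Fahrenheit: F = 32 + (9/5) * C.
mean_f, sd_f = linear_transform_moments(mean_x=-7.0, sd_x=4.0, a=32.0, b=9.0 / 5.0)
print(round(mean_f, 1), round(sd_f, 1))
```

The intercept shifts the mean but leaves the standard deviation untouched, which is why only the factor 1.8 appears in the second result.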

9) Two random variables are independently distributed if their joint distribution is the product of their marginal distributions. It is intuitively easier to understand that two random variables are independently distributed if all conditional distributions of Y given X are equal. Derive one of the two conditions from the other.

Answer: If all conditional distributions of Y given X are equal, then

Pr(Y = y | X = 1) = Pr(Y = y | X = 2) = ... = Pr(Y = y | X = l).

But if all conditional distributions are equal, then they must also equal the marginal distribution, i.e.,

Pr(Y = y | X = x) = Pr(Y = y).

Given the definition of the conditional distribution of Y given X = x, you then get

Pr(Y = y | X = x) = Pr(Y = y, X = x)/Pr(X = x) = Pr(Y = y),

which gives you the condition

Pr(Y = y, X = x) = Pr(Y = y) Pr(X = x).

10) There are frequently situations where you have information on the conditional distribution of Y given X, but are interested in the conditional distribution of X given Y. Recalling

Pr(Y = y | X = x) = Pr(X = x, Y = y)/Pr(X = x),

derive a relationship between Pr(X = x | Y = y) and Pr(Y = y | X = x). This is called Bayes' theorem.

Answer: Given Pr(Y = y | X = x) = Pr(X = x, Y = y)/Pr(X = x),

Pr(Y = y | X = x) × Pr(X = x) = Pr(X = x, Y = y);

similarly, Pr(X = x | Y = y) = Pr(X = x, Y = y)/Pr(Y = y) and

Pr(X = x | Y = y) × Pr(Y = y) = Pr(X = x, Y = y).

Equating the two and solving for Pr(X = x | Y = y) then results in

Pr(X = x | Y = y) = Pr(Y = y | X = x) × Pr(X = x)/Pr(Y = y).

11) You are at a college of roughly 1,000 students and obtain data from the entire freshman class (250 students) on height and weight during orientation. You consider this to be a population that you want to describe, rather than a sample from which you want to infer general relationships in a larger population. Weight (Y) is measured in pounds and height (X) is measured in inches. You calculate the following sums:

Σ_{i=1}^{n} y_i^2 = 94,228.8, Σ_{i=1}^{n} x_i^2 = 1,248.9, Σ_{i=1}^{n} x_i y_i = 7,625.9

(small letters refer to deviations from means as in z_i = Z_i - Z̄).

(a) Given your general knowledge about human height and weight of a given age, what can you say about the shape of the two distributions?

Answer: Both distributions are bound to be normal.

(b) What is the correlation coefficient between height and weight here?

Answer: 0.703.
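Because the sums are already in deviations from means, the correlation coefficient is the cross-product sum divided by the square root of the product of the two sums of squares (the common factor 1/n cancels). A minimal Python sketch, taking the three sums as 94,228.8, 1,248.9 and 7,625.9; the variable names are assumptions for illustration:

```python
import math

sum_yy = 94228.8  # sum of squared deviations of weight (Y)
sum_xx = 1248.9   # sum of squared deviations of height (X)
sum_xy = 7625.9   # sum of cross products of the deviations

# corr(X, Y) = cov(X, Y) / (sd(X) * sd(Y)); the 1/n factors cancel.
correlation = sum_xy / math.sqrt(sum_xx * sum_yy)
print(round(correlation, 3))
```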

12) Use the definition for the conditional distribution of Y given X = x and the marginal distribution of X to derive the formula for Pr(X = x, Y = y). This is called the multiplication rule. Use it to derive the probability for drawing two aces randomly from a deck of cards (no joker), where you do not replace the card after the first draw. Next, generalizing the multiplication rule and assuming independence, find the probability of having four girls in a family with four children.

Answer: (4/52) × (3/51) = 0.0045; 0.0625 or (1/2)^4 = 1/16.
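Both results follow from the multiplication rule Pr(X = x, Y = y) = Pr(Y = y | X = x) × Pr(X = x). The sketch below uses Python's `fractions` module so that the products stay exact; the variable names are illustrative, and the four-girls case assumes a probability of 1/2 per birth and independence across births, as the problem does.

```python
from fractions import Fraction

# Two aces without replacement: Pr(first ace) * Pr(second ace | first ace).
pr_first_ace = Fraction(4, 52)
pr_second_ace_given_first = Fraction(3, 51)
pr_two_aces = pr_first_ace * pr_second_ace_given_first

# Four girls: with independence, the conditional probabilities equal the
# marginal ones, so the joint probability is a simple product.
pr_girl = Fraction(1, 2)
pr_four_girls = pr_girl ** 4

print(pr_two_aces, round(float(pr_two_aces), 4))
print(pr_four_girls, float(pr_four_girls))
```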

13) The systolic blood pressure of females in their 20s is normally distributed with a mean of 120 with a standard deviation of 9. What is the probability of finding a female with a blood pressure of less than 100? More than 135? Between 1[?]5 and 1[?]3? You visit the women's soccer team on campus, and find that the average blood pressure of the 25 members is 114. Is it likely that this group of women came from the same population?

Answer: Pr(Y < 100) = 0.0131; Pr(Y > 135) = 0.0478; Pr(1[?]5 < Y < 1[?]3) = 0.[?]784; Pr(Ȳ < 114) = Pr(Z < -3.33) = 0.0004. (The smallest z-value listed in the table in the textbook is -2.99, which generates a probability value of 0.0014.) It is unlikely that this group of women came from the same population.

14) Show that the correlation coefficient between Y and X is unaffected if you use a linear transformation in both variables. That is, show that corr(X, Y) = corr(X*, Y*), where X* = a + bX and Y* = c + dY, and where a, b, c, and d are arbitrary non-zero constants.

Answer: corr(X*, Y*) = cov(X*, Y*)/sqrt(var(X*) var(Y*)) = bd cov(X, Y)/sqrt(b^2 var(X) d^2 var(Y)) = corr(X, Y). The last step uses sqrt(b^2 d^2) = bd, which holds when b and d have the same sign; if they have opposite signs, the magnitude is unchanged but the sign of the correlation flips.

15) The textbook formula for the variance of the discrete random variable Y is given as

σ_Y^2 = Σ_{i=1}^{k} (y_i - μ_Y)^2 p_i.

Another commonly used formulation is

σ_Y^2 = Σ_{i=1}^{k} y_i^2 p_i - μ_Y^2.

Prove that the two formulas are the same.

Answer:

σ_Y^2 = Σ_{i=1}^{k} (y_i - μ_Y)^2 p_i = Σ_{i=1}^{k} (y_i^2 + μ_Y^2 - 2 μ_Y y_i) p_i = Σ_{i=1}^{k} (y_i^2 p_i + μ_Y^2 p_i - 2 μ_Y y_i p_i).

Moving the summation sign through results in

σ_Y^2 = Σ_{i=1}^{k} y_i^2 p_i + μ_Y^2 Σ_{i=1}^{k} p_i - 2 μ_Y Σ_{i=1}^{k} y_i p_i.

But Σ_{i=1}^{k} p_i = 1 and Σ_{i=1}^{k} y_i p_i = μ_Y, giving you the second expression after simplification.




16) The Economic Report of the President gives the following age distribution of the United States population for the year 2000:

United States Population By Age Group, 2000

Outcome (age category)    Under 5    5-15    16-19    20-24    25-44    45-64    65 and over

Percentage                0.06       0.16    0.06     0.07     0.30     0.22     0.13

Imagine that every person was assigned a unique number between 1 and 275,372,000 (the total population in 2000). If you generated a random number, what would be the probability that you had drawn someone older than 65 or under 16? Treating the percentages as probabilities, write down the cumulative probability distribution. What is the probability of drawing someone who is 24 years or younger?

Answer: Pr(Y < 16 or Y > 65) = 0.35;

Outcome (age category)                 Under 5    5-15    16-19    20-24    25-44    45-64    65 and over

Cumulative probability distribution    0.06       0.22    0.28     0.35     0.65     0.87     1.00

Pr(Y ≤ 24) = 0.35.

17) The accompanying table gives the outcomes and probability distribution of the number of times a student checks her e-mail daily:

Probability of Checking E-Mail

Outcome (number of e-mail checks)    0       1       2       3       4       5       6

Probability distribution             0.05    0.15    0.30    0.25    0.15    0.08    0.02

Sketch the probability distribution. Next, calculate the c.d.f. for the above table. What is the probability of her checking her e-mail between 1 and 3 times a day? Of checking it more than 3 times a day?

Answer:

Outcome (number of e-mail checks)       0       1       2       3       4       5       6

Cumulative probability distribution     0.05    0.20    0.50    0.75    0.90    0.98    1.00

Pr(1 ≤ Y ≤ 3) = 0.70; Pr(Y > 3) = 0.25.

[Figure: Cumulative Distribution Function. Horizontal axis: Number of E-mail Checks (0 to 6); vertical axis: 0 to 1.]
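A cumulative distribution is just the running sum of the probability distribution, and interval probabilities are differences of that running sum. A small Python sketch for the e-mail example, with illustrative names:

```python
from itertools import accumulate

outcomes = [0, 1, 2, 3, 4, 5, 6]                     # daily e-mail checks
pmf = [0.05, 0.15, 0.30, 0.25, 0.15, 0.08, 0.02]     # Pr(Y = y)

# c.d.f.: Pr(Y <= y) is the running sum of the probabilities.
cdf = list(accumulate(pmf))


def prob_at_most(y):
    """Pr(Y <= y) for an integer y; zero below the smallest outcome."""
    return cdf[y] if y >= 0 else 0.0


pr_between_1_and_3 = prob_at_most(3) - prob_at_most(0)  # Pr(1 <= Y <= 3)
pr_more_than_3 = 1.0 - prob_at_most(3)                  # Pr(Y > 3)

print([round(c, 2) for c in cdf])
print(round(pr_between_1_and_3, 2), round(pr_more_than_3, 2))
```

The same few lines apply to any discrete distribution whose outcomes are the integers starting at zero; other outcome sets would need a lookup from outcome to position.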

18) The accompanying table lists the outcomes and the probability distribution for a student renting videos during the week while on campus.

Video Rentals per Week during Semester

Outcome (number of weekly video rentals)    0       1       2       3       4       5       6

Probability distribution                    0.05    0.55    0.25    0.05    0.07    0.02    0.01

Sketch the probability distribution. Next, calculate the cumulative probability distribution for the above table. What is the probability of the student renting between 2 and 4 a week? Of less than 3 a week?

Answer: The cumulative probability distribution is given below. The probability of renting between two and four videos a week is 0.37. The probability of renting less than three a week is 0.85.

Outcome (number of weekly video rentals)    0       1       2       3       4       5       6

Cumulative probability distribution         0.05    0.60    0.85    0.90    0.97    0.99    1.00

19) The textbook mentioned that the mean of Y, E(Y), is called the first moment of Y, and that the expected value of the square of Y, E(Y^2), is called the second moment of Y, and so on. These are also referred to as moments about the origin. A related concept is moments about the mean, which are defined as E[(Y - μ_Y)^r]. What do you call the second moment about the mean? What do you think the third moment, referred to as "skewness," measures? Do you believe that it would be positive or negative for an earnings distribution? What measure of the third moment around the mean do you get for a normal distribution?

Answer: The second moment about the mean is the variance. Skewness measures the departure from symmetry. For the typical earnings distribution, it will be positive. For the normal distribution, it will be zero.

[Figure: Number of Weekly Video Rentals. Horizontal axis: Number of Rentals; vertical axis: 0 to 0.6.]



20) Explain why the two probabilities are identical for the standard normal distribution: Pr(-1.96 ≤ X ≤ 1.96) and Pr(-1.96 < X < 1.96).

Answer: For a continuous distribution, the probability of a point is zero.

Chapter 3

Think of at least nine examples, three of each, that display a positive, negative, or no correlation between two economic variables. In each of the positive and negative examples, indicate whether or not you expect the correlation to be strong or weak.

Answer: Answers will vary by student. Students frequently bring up the following correlations. Positive correlations: earnings and education (hopefully strong), consumption and personal disposable income (strong), per capita income and investment-output ratio or saving rate (strong); negative correlation: Okun's Law (strong), income velocity and interest rates (strong), the Phillips curve (strong); no correlation: productivity growth and initial level of per capita income for all countries of the world (beta-convergence regressions), consumption and the (real) interest rate, employment and real wages.

3) Adult males are taller, on average, than adult females. Visiting two recent American Youth Soccer Organization (AYSO) under-12-year-old (U12) soccer matches on a Saturday, you do not observe an obvious difference in the height of boys and girls of that age. You suggest to your little sister that she collect data on height and gender of children in 4th to 6th grade as part of her science project. The accompanying table shows her findings.

Height of Young Boys and Girls, Grades 4-6, in inches

Boys                              Girls

Ȳ_Boys    s_Boys    n_Boys        Ȳ_Girls    s_Girls    n_Girls

57.8      3.9       55            58.4       4.2        57

(e) Let your null hypothesis be that there is no difference in the height of females and males at this age level. Specify the alternative hypothesis.

Answer: H_0: μ_Boys - μ_Girls = 0 vs. H_1: μ_Boys - μ_Girls ≠ 0

(f) Find the difference in height and the standard error of the difference.

Answer: Ȳ_Boys - Ȳ_Girls = -0.6, SE(Ȳ_Boys - Ȳ_Girls) = sqrt(3.9^2/55 + 4.2^2/57) = 0.77.

(g) Generate a 95% confidence interval for the difference in height.

Answer: -0.6 ± 1.96 × 0.77 = (-2.11, 0.91).

(h) Calculate the t-statistic for comparing the two means. Is the difference statistically significant at the 1% level? Which critical value did you use? Why would this number be smaller if you had assumed a one-sided alternative hypothesis? What is the intuition behind this?

Answer: t = -0.78, so |t| < 2.58, which is the critical value at the 1% level. Hence you cannot reject the null hypothesis. The critical value for the one-sided hypothesis would have been 2.33. Assuming a one-sided hypothesis implies that you have some information about the problem at hand, and, as a result, can be more easily convinced than if you had no prior expectation.
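The steps in (f) through (h) can be collected in one small function. This is a sketch rather than a library routine: the function name and return layout are assumptions, and the standard error uses the unequal-variance formula sqrt(s1^2/n1 + s2^2/n2) from the answer above.

```python
import math


def compare_means(mean1, sd1, n1, mean2, sd2, n2, z_crit=1.96):
    """Difference in means, its standard error, t-statistic and confidence interval."""
    diff = mean1 - mean2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t_stat = diff / se
    interval = (diff - z_crit * se, diff + z_crit * se)
    return diff, se, t_stat, interval


# Boys: mean 57.8, s = 3.9, n = 55. Girls: mean 58.4, s = 4.2, n = 57.
diff, se, t_stat, interval = compare_means(57.8, 3.9, 55, 58.4, 4.2, 57)

print(round(diff, 1), round(se, 2), round(t_stat, 2))
print(tuple(round(bound, 2) for bound in interval))
```

Since the code does not round the standard error before building the interval, its bounds can differ from the worked answer in the second decimal place.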

4) Math SAT scores (Y) are normally distributed with a mean of 500 and a standard deviation of 100. An evening school advertises that it can improve students' scores by roughly a third of a standard deviation, or 30 points, if they attend a course which runs over several weeks. (A similar claim is made for attending a verbal SAT course.) The statistician for a consumer protection agency suspects that the courses are not effective. She views the situation as follows: H_0: μ_Y = 500 vs. H_1: μ_Y = 530.

(e) Sketch the two distributions under the null hypothesis and the alternative hypothesis.

Answer: [Figure: sketch of the two distributions.]

(f) The consumer protection agency wants to evaluate this claim by sending 50 students to attend classes. One of the students becomes sick during the course and drops out. What is the distribution of the average score of the remaining 49 students under the null, and under the alternative hypothesis?

Answer: Ȳ of the 49 participants is normally distributed, with a mean of 500 and a standard deviation of 14.286 under the null hypothesis. Under the alternative hypothesis, it is normally distributed with a mean of 530 and a standard deviation of 14.286.

(g) Assume that after graduating from the course, the 49 participants take the SAT test and score an average of 520. Is this convincing evidence that the school has fallen short of its claim? What is the p-value for such a score under the null hypothesis?

Answer: It is possible that the consumer protection agency had chosen a group of 49 students whose average score would have been 490 without attending the course. The crucial question is how likely it is that 49 students, chosen randomly from a population with a mean of 500 and a standard deviation of 100, will score an average of 520. The p-value for this score is 0.081, meaning that if the agency rejected the null hypothesis based on this evidence, it would make a mistake, on average, roughly 1 out of 12 times. Hence the average score of 520 would allow rejection of the null hypothesis that the school has had no effect on the SAT score of students at the 10% level.

(h) What would be the critical value under the null hypothesis if the size of your test were 5%?

Answer: The critical value would be 523.

(i) Given this critical value, what is the power of the test? What options does the statistician have for increasing the power in this situation?

Answer: Pr(Ȳ < 523 | H_1 is true) = 0.312. Hence the power of the test is 0.688. She could increase the power by increasing the size of the test. Alternatively, she could try to convince the agency to hire more test subjects, i.e., she could increase the sample size.
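The p-value and the power can be traced numerically. The sketch below uses only the Python standard library; the variable names are illustrative, and the critical value is taken as the rounded 523 from the answer (the unrounded value 500 + 1.645 × 14.286 is about 523.5, which would give a slightly lower power).

```python
import math


def normal_cdf(z):
    """Standard normal c.d.f., expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


MEAN_NULL, MEAN_ALT = 500.0, 530.0
SD_POPULATION, N = 100.0, 49

# Standard deviation of the sample average of 49 scores.
se = SD_POPULATION / math.sqrt(N)

# One-sided p-value for an observed average of 520 under the null.
p_value = 1.0 - normal_cdf((520.0 - MEAN_NULL) / se)

# Power: probability of exceeding the critical value when H1 is true.
critical_value = 523.0  # rounded value used in the answer above
type_two_error = normal_cdf((critical_value - MEAN_ALT) / se)
power = 1.0 - type_two_error

print(round(se, 3), round(p_value, 3), round(type_two_error, 3), round(power, 3))
```

Rerunning the last three lines with a lower critical value (a larger test size) or a larger N shows the two ways of raising the power.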

5) Your packaging company fills various types of flour into bags. Recently there have been complaints from one chain of stores: a customer returned one opened 5 pound bag which weighed significantly less than the label indicated. You view the weight of the bag as a random variable which is normally distributed with a mean of 5 pounds, and, after studying the machine specifications, a standard deviation of 0.05 pounds.

(a) You take a sample of 20 bags and weigh them. Sketch below what the average pattern of individual weights might look like. Let the horizontal axis indicate the sampled bag number (1, 2, ..., 20). On the vertical axis, mark the expected value of the weight under the null hypothesis, and two (≈ 1.96) standard deviations above and below the expected value. Draw a line through the graph for E(Y) + 2σ_Y, E(Y), and E(Y) - 2σ_Y. How many of the bags in a sample of 20 will you expect to weigh either less than 4.9 pounds or more than 5.1 pounds?

Answer: On average, there should be one bag in every sample of 20 which weighs less than 4.9 pounds or more than 5.1 pounds.

(b) You sample 25 bags of flour and calculate the average weight. What is the distribution of the average weight of these 25 bags? Repeating the same exercise 20 times, sketch what the distribution of the average weights would look like in a graph similar to the one you drew in (a), where you have adjusted the standard error of Ȳ accordingly.

Answer: The average weight of 25 bags will be normally distributed, with a mean of 5 pounds and a standard deviation of 0.01 pounds.

(c) For each of the twenty observations in (b), a 95% confidence interval is constructed. Draw these confidence intervals, using the same graph as in (b). How many of these 20 confidence intervals would you expect to contain 5 pounds under the null hypothesis?

Answer: You would expect 19 of the confidence intervals to contain 5 pounds.

5) Assume that two presidential candidates, call them Bush and Gore, receive 50% of the votes in the population. You can model this situation as a Bernoulli trial, where Y is a random variable with success probability Pr(Y = 1) = p, and where Y = 1 if a person votes for Bush and Y = 0 otherwise. Furthermore, let p̂ be the fraction of successes (1s) in a sample, which is distributed N(p, p(1 - p)/n) in reasonably large samples, say for n ≥ 40.

(a) Given your knowledge about the population, find the probability that in a random sample of 40, Bush would receive a share of 40% or less.

Answer: Pr(p̂ < 0.40) = Pr(Z < (0.40 - 0.50)/sqrt(0.5 × 0.5/40)) = Pr(Z < -1.26) ≈ 0.104. In roughly every 10th sample of this size, Bush would receive a vote of less than 40%, although in truth, his share is 50%.

(b) How would this situation change with a random sample of 100?

Answer: Pr(p̂ < 0.40) = Pr(Z < (0.40 - 0.50)/sqrt(0.5 × 0.5/100)) = Pr(Z < -2.00) ≈ 0.023. With this sample size, you would expect this to happen only every 50th sample.

(c) Given your answers in (a) and (b), would you be comfortable to predict what the voting intentions for the entire population are if you did not know p but had polled 10,000 individuals at random and calculated p̂? Explain.

Answer: The answers in (a) and (b) suggest that for even moderate increases in the sample size, the estimator does not vary too much from the population mean. Polling 10,000 individuals, the probability of finding a p̂ of 0.48, for example, would be 0.00003. Unless the election was extremely close, which the 2000 election was, polls are quite accurate even for sample sizes of 2,500.

(d) This result seems to hold whether you poll 10,000 people at random in the Netherlands or the United States, where the former has a population of less than 20 million people, while the United States is 15 times as populous. Why does the population size not come into play?

Answer: The distribution of sample means shrinks very quickly depending on the sample size, not the population size. Although at first this does not seem intuitive, the standard error of an estimator is a value which indicates by how much the estimator varies around the population value. For large sample sizes, the sample mean typically is very close to the population mean.
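The effect of the sample size can be seen by evaluating the normal approximation for several values of n. In this Python sketch the function names are hypothetical; note that the population size never enters the calculation, only p and n do.

```python
import math


def normal_cdf(z):
    """Standard normal c.d.f., expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_share_below(threshold, p, n):
    """Pr(p_hat < threshold) using p_hat ~ N(p, p(1 - p)/n)."""
    se = math.sqrt(p * (1.0 - p) / n)
    return normal_cdf((threshold - p) / se)


p_40 = prob_share_below(0.40, p=0.5, n=40)          # part (a)
p_100 = prob_share_below(0.40, p=0.5, n=100)        # part (b)
p_10000 = prob_share_below(0.48, p=0.5, n=10_000)   # part (c), threshold 0.48

print(round(p_40, 3), round(p_100, 3), f"{p_10000:.5f}")
```

The first value can differ slightly in the third decimal from the worked answer, which rounds the z-value to -1.26 before looking it up in the table.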

6) You have collected weekly earnings and age data from a sub-sample of 1,744 individuals using the Current Population Survey in a given year.

(a) Given the overall mean of $434.49 and a standard deviation of $294.67, construct a 99% confidence interval for average earnings in the entire population. State the meaning of this interval in words, rather than just in numbers. If you constructed a 90% confidence interval instead, would it be smaller or larger? What is the intuition?

Answer: The confidence interval for mean weekly earnings is 434.49 ± 2.57 × 294.67/sqrt(1744) = 434.49 ± 18.13 = (416.36, 452.62). Based on the sample at hand, the best guess for the population mean is $434.49. However, because of random sampling error, this guess is likely to be wrong. Instead, the best guess is for the average earnings to lie between $416.36 and $452.62. Committing to such an interval repeatedly implies that the resulting statement is incorrect 1 out of 100 times. For a 90% confidence interval, the only change in the calculation of the confidence interval is to replace 2.57 by 1.64. Hence the confidence interval is smaller. A smaller interval implies, given the same average earnings and the standard deviation, that the statement will be false more often. The larger the confidence interval, the more likely it is to contain the population value.
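The interval arithmetic can be verified with a short function. This is a sketch with an assumed function name; it takes the critical values 2.57 and 1.64 from the answer above, together with a mean of 434.49, a standard deviation of 294.67 and n = 1,744.

```python
import math


def confidence_interval(mean, sd, n, z_crit):
    """Large-sample confidence interval: mean +/- z_crit * sd / sqrt(n)."""
    half_width = z_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


ci_99 = confidence_interval(434.49, 294.67, 1744, z_crit=2.57)
ci_90 = confidence_interval(434.49, 294.67, 1744, z_crit=1.64)

print(tuple(round(bound, 2) for bound in ci_99))
print(tuple(round(bound, 2) for bound in ci_90))
```

Comparing the two printed intervals shows the trade-off described in the answer: the 90% interval is narrower, at the price of missing the population mean more often.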

(b) When dividing your sample into people 45 years and older, and younger than 45, the information shown in the table is found.

Age Category    Average Earnings (Ȳ)    Standard Deviation (s_Y)    N

Age ≥ 45        $488.87                 $328.64                     507

Age < 45        $412.20                 $276.63                     1,237

Test whether or not the difference in average earnings is statistically significant. Given your knowledge of age-earning profiles, does this result make sense?

Answer: Assuming unequal population variances, t = (488.87 - 412.20)/sqrt(328.64^2/507 + 276.63^2/1237) = 4.62, which is statistically significant at conventional levels whether you use a two-sided or one-sided alternative. Hence the null hypothesis of equal average earnings in the two groups is rejected. Age-earning profiles typically take on an inverted U-shape. Maximum earnings occur in the 40s, depending on some other factors such as years of education, which are not considered here. Hence it is not clear if the alternative hypothesis should be one-sided or two-sided. In such a situation, it is best to assume a two-sided alternative hypothesis.

7) A manufacturer claims that a certain brand of VCR player has an average life expectancy of 5 years and 6 months with a standard deviation of 1 year and 6 months. Assume that the life expectancy is normally distributed.

(a) Selecting one VCR player from this brand at random, calculate the probability of its life expectancy exceeding 7 years.

Answer: Pr(Y > 7) = Pr(Z > 1) = 0.1587.

(b) The Critical Consumer magazine decides to test fifty VCRs of this brand. The average life in this sample is 6 years and the sample standard deviation is 2 years. Calculate a 99% confidence interval for the average life.

Answer: 6 ± 2.57 × 2/sqrt(50) = 6 ± 0.73 = (5.27, 6.73).

(c) How many more VCRs would the magazine have to test in order to halve the width of the confidence interval?

Answer: (1/2) × 2.57 × 2/sqrt(50) = 2.57 × 2/(2 × sqrt(50)) = 2.57 × 2/sqrt(4 × 50), or n = 200, i.e., 150 more VCRs than the 50 already tested.

8) U.S. News and World Report ranks colleges and universities annually. You randomly sample 100 of the national universities and liberal arts colleges from the year 2000 issue. The average cost, which includes tuition, fees, and room and board, is $23,571.49 with a standard deviation of $7,015.52.

(a) Based on this sample, construct a 95% confidence interval of the average cost of attending a university/college in the United States.

Answer: 23,571.49 ± 1.96 × 7,015.52/sqrt(100) = 23,571.49 ± 1,375.04 = (22,196.45, 24,946.53).

!b) Lost varies by quite a bit. ne of the reasons may be that some universities9colleges havea better reputation than others. '!' N*" and $orld %*port tries to measure this factor byas&ing university presidents and chief academic officers about the reputation of institutions.The ran&ing is from 1 !6marginal7) to ( !6distinguished7). ;ou decide to split the sampleaccording to whether the academic institution has a reputation of greater than %.( or not. Oorcomparison, in , Laltech had a reputation ran&ing of '.4, 2mith Lollege had '.(, andAuburn Eniversity had %.1. This gives you the statistics shown in the accompanying table.

Jeputation Lategory Average Lost

Y 2tandard eviation

of Lost ! Y " )

P

Jan&ing = %.( N+,%11.%1 N(,'+.1 +

Jan&ing ≤ %.( N1,4. N,1%%.%8 41

Test the hypothesis that the average cost for all universities/colleges is the same independent of the reputation. What alternative hypothesis did you use?

Answer: \( t = \frac{29{,}311.31 - 21{,}227.06}{\sqrt{\frac{5{,}649.21^2}{29} + \frac{6{,}133.38^2}{71}}} = 6.33 \), which is statistically significant whether you use a one-sided or a two-sided hypothesis test. Your prior expectation is that academic institutions with a higher reputation will charge more for attending, and hence a one-sided alternative would have been appropriate here.

(c) What other factors should you consider before making a decision based on the data in (b)?

Answer: There may be other variables which potentially have an effect on the cost of attending the academic institution. Some of these factors might be whether or not the college/university is private or public, its size, whether or not it has a religious affiliation, etc. It is only after controlling for these factors that the "pure" relationship between reputation and cost can be identified.

1) The development office and the registrar have provided you with anonymous matches of starting salaries and GPAs for 108 graduating economics majors. Your sample contains a variety of jobs, from church pastor to stockbroker.

(a) The average starting salary for the 108 students was $38,644.86 with a standard deviation of $7,541.40. Construct a 95% confidence interval for the starting salary of all economics majors at your university/college.

Answer: \( 38{,}644.86 \pm 1.96 \times \frac{7{,}541.40}{\sqrt{108}} = 38{,}644.86 \pm 1{,}422.32 = (37{,}222.54,\ 40{,}067.18) \).

(b) A similar sample for psychology majors indicates a significantly lower starting salary. Given that these students had the same number of years of education, does this indicate discrimination in the job market against psychology majors?

Answer: It suggests that the market values certain qualifications more highly than others. Comparing means and identifying that one is significantly lower than others does not indicate discrimination.

(c) You wonder if it pays (no pun intended) to get good grades by calculating the average salary for economics majors who graduated with a cumulative GPA of B+ or better, and those who had a B or worse. The data is as shown in the accompanying table:

Cumulative GPA | Average Earnings (\(\bar{Y}\)) | Standard Deviation (\(s_Y\)) | n
B+ or better   | $39,915.25                     | $8,330.21                    | 59
B or worse     | $37,083.33                     | $6,174.86                    | 49




Conduct a t-test for the hypothesis that the two starting salaries are the same in the population. Given that this data was collected in 1999, do you think that your results will hold for other years, such as 2002?

Answer: \( t = \frac{39{,}915.25 - 37{,}083.33}{\sqrt{\frac{8{,}330.21^2}{59} + \frac{6{,}174.86^2}{49}}} = 2.03 \).

The critical value for a one-sided test is 1.64, for a two-sided test 1.96, both at the 5% level. Hence you can reject the null hypothesis that the two starting salaries are equal. Presumably you would have chosen as an alternative that better students receive better starting salaries, so that this becomes your new working hypothesis. 1999 was a boom year. If better students receive better starting offers during a boom year, when the labor market for graduates is tight, then it is very likely that they receive a better offer during a recession year, assuming that they receive an offer at all.
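Both difference-in-means tests in this chapter (cost by reputation category, salary by GPA group) follow the same recipe. The sketch below is one way to reproduce them in Python, assuming the group means, standard deviations, and group sizes from the two tables; the function name is illustrative, and the critical values come from the standard normal distribution as a large-sample approximation.

```python
import math
from statistics import NormalDist


def difference_in_means_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t-statistic for H0: equal population means, allowing unequal variances."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Cost of attending, split by reputation ranking (> 3.5 versus <= 3.5).
t_cost = difference_in_means_t(29311.31, 5649.21, 29, 21227.06, 6133.38, 71)

# Starting salary, split by cumulative GPA (B+ or better versus B or worse).
t_gpa = difference_in_means_t(39915.25, 8330.21, 59, 37083.33, 6174.86, 49)

# 5% critical values for a one-sided and a two-sided test.
one_sided_critical = NormalDist().inv_cdf(0.95)
two_sided_critical = NormalDist().inv_cdf(0.975)

print(round(t_cost, 2), round(t_gpa, 2))
print(round(one_sided_critical, 2), round(two_sided_critical, 2))
```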

13) During the last few days before a presidential election, there is a frenzy of voting intention surveys. On a given day, quite often there are conflicting results from three major polls.

(a) Think of each of these polls as reporting the fraction of successes (1s) of a Bernoulli random variable Y, where the probability of success is \( \Pr(Y = 1) = p \). Let \(\hat{p}\) be the fraction of successes in the sample and assume that this estimator is normally distributed with a mean of p and a variance of \( \frac{p(1-p)}{n} \). Why are the results for all polls different, even though they are taken on the same day?

Answer: Since all polls are only samples, there is random sampling error. As a result, \(\hat{p}\) will differ from sample to sample, and most likely also from p.

(b) Given the estimator of the variance of \(\hat{p}\), \( \frac{\hat{p}(1-\hat{p})}{n} \), construct a 95% confidence interval for p. For which value of \(\hat{p}\) is the standard deviation the largest? What value does it take at this maximizing \(\hat{p}\)?

Answer: \( \hat{p} \pm 1.96 \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \). A bit of thought or calculus will show that the standard deviation will be largest for \( \hat{p} = 0.5 \), in which case it becomes \( \frac{0.5}{\sqrt{n}} \).

(c) When the results from the polls are reported, you are told, typically in the small print, that the "margin of error" is plus or minus two percentage points. Using the approximation of 1.96 ≈ 2, and assuming, "conservatively," the maximum standard deviation derived in (b), what sample size is required to add and subtract ("margin of error") two percentage points from the point estimate?

Answer: n = 2,500.

(d) What sample size would you need to halve the margin of error?

Answer: n = 10,000.




Mathematical and Graphical Problems

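1) Your textbook defined the covariance between X and Y as follows:

\[ \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y}) \]

Prove that this is identical to the following alternative specification:

\[ \frac{1}{n-1}\sum_{i=1}^{n} X_i Y_i - \frac{n}{n-1}\bar{X}\bar{Y} \]

Answer:

\[ \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y}) = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i Y_i - X_i\bar{Y} - Y_i\bar{X} + \bar{Y}\bar{X}\right) \]

\[ = \frac{1}{n-1}\left(\sum_{i=1}^{n} X_i Y_i - \bar{Y}\sum_{i=1}^{n} X_i - \bar{X}\sum_{i=1}^{n} Y_i + n\bar{Y}\bar{X}\right) = \frac{1}{n-1}\left(\sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y} - n\bar{Y}\bar{X} + n\bar{Y}\bar{X}\right) \]

\[ = \frac{1}{n-1}\sum_{i=1}^{n} X_i Y_i - \frac{n}{n-1}\bar{X}\bar{Y}. \]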

2) For each of the accompanying scatterplots for several pairs of variables, indicate whether you expect a positive or negative correlation coefficient between the two variables, and the likely magnitude of it (you can use a small range).

(a)

[Scatterplot (a): vertical axis from 2 to 10, horizontal axis from 2 to 14]

Answer: Positive correlation. The actual correlation coefficient is 0.46.



(b)

[Scatterplot (b): vertical axis from -0.04 to 0.08, horizontal axis from 0.0 to 1.0]

Answer: No relationship. The actual correlation coefficient is 0.00007.

(c)

[Scatterplot (c): vertical axis from 0.35 to 0.65, horizontal axis from 3.5 to 6.5]

Answer: Negative relationship. The actual correlation coefficient is -0.70.



(d)

[Scatterplot (d): vertical axis from 0 to 2000, horizontal axis from 0 to 100]

Answer: Nonlinear (inverted U) relationship. The actual correlation coefficient is 0.23.

3) Your textbook defines the correlation coefficient as follows:

\[ r = \frac{\frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})(X_i - \bar{X})}{\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})^2}\,\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2}} \]

Another textbook gives an alternative formula:

\[ r = \frac{n\sum_{i=1}^{n} Y_i X_i - \left(\sum_{i=1}^{n} Y_i\right)\left(\sum_{i=1}^{n} X_i\right)}{\sqrt{n\sum_{i=1}^{n} Y_i^2 - \left(\sum_{i=1}^{n} Y_i\right)^2}\,\sqrt{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}} \]

Prove that the two are the same.

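Answer:

\[ r = \frac{\frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})(X_i - \bar{X})}{\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})^2}\,\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2}} = \frac{\frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i X_i - \bar{Y}X_i - \bar{X}Y_i + \bar{Y}\bar{X}\right)}{\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i^2 - 2\bar{Y}Y_i + \bar{Y}^2\right)}\,\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i^2 - 2\bar{X}X_i + \bar{X}^2\right)}} \]

\[ = \frac{\sum_{i=1}^{n} Y_i X_i - n\bar{Y}\bar{X}}{\sqrt{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2}\,\sqrt{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}} = \frac{n\sum_{i=1}^{n} Y_i X_i - n\bar{Y}\,n\bar{X}}{\sqrt{n\sum_{i=1}^{n} Y_i^2 - n^2\bar{Y}^2}\,\sqrt{n\sum_{i=1}^{n} X_i^2 - n^2\bar{X}^2}} \]

\[ = \frac{n\sum_{i=1}^{n} Y_i X_i - \left(\sum_{i=1}^{n} Y_i\right)\left(\sum_{i=1}^{n} X_i\right)}{\sqrt{n\sum_{i=1}^{n} Y_i^2 - \left(\sum_{i=1}^{n} Y_i\right)^2}\,\sqrt{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}}. \]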

4) IQs of individuals are normally distributed with a mean of 100 and a standard deviation of 16. If you sampled students at your college and assumed, as the null hypothesis, that they had the same IQ as the population, then in a random sample of size

(a) n = 25, find \( \Pr(\bar{Y} < 105) \).

(b) n = 100, find \( \Pr(\bar{Y} > 97) \).

(c) n = 144, find \( \Pr(101 < \bar{Y} < 103) \).

Answer: (a) 0.94; (b) 0.97; (c) 0.21.
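These three probabilities follow from the sampling distribution of the sample mean, which is normal with mean 100 and standard deviation 16/sqrt(n). A minimal Python sketch, assuming the sample sizes 25, 100, and 144 and using the standard library's `NormalDist` (the helper name is illustrative):

```python
import math
from statistics import NormalDist


def sample_mean_distribution(mu, sigma, n):
    """Sampling distribution of the mean of n i.i.d. N(mu, sigma^2) draws."""
    return NormalDist(mu, sigma / math.sqrt(n))


# (a) n = 25: probability that the sample mean is below 105.
prob_a = sample_mean_distribution(100, 16, 25).cdf(105)

# (b) n = 100: probability that the sample mean exceeds 97.
prob_b = 1 - sample_mean_distribution(100, 16, 100).cdf(97)

# (c) n = 144: probability that the sample mean lies between 101 and 103.
dist_c = sample_mean_distribution(100, 16, 144)
prob_c = dist_c.cdf(103) - dist_c.cdf(101)

print(round(prob_a, 2), round(prob_b, 2), round(prob_c, 2))
```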

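5) Consider the following alternative estimator for the population mean:

\[ \tilde{Y} = \frac{1}{n}\left(\frac{1}{4}Y_1 + \frac{7}{4}Y_2 + \frac{1}{4}Y_3 + \frac{7}{4}Y_4 + \cdots + \frac{1}{4}Y_{n-1} + \frac{7}{4}Y_n\right) \]

Prove that \(\tilde{Y}\) is unbiased and consistent, but not efficient when compared to \(\bar{Y}\).

Answer:

\[ E(\tilde{Y}) = \frac{1}{n}\left(\frac{1}{4}E(Y_1) + \frac{7}{4}E(Y_2) + \frac{1}{4}E(Y_3) + \frac{7}{4}E(Y_4) + \cdots + \frac{1}{4}E(Y_{n-1}) + \frac{7}{4}E(Y_n)\right) \]

\[ = \frac{1}{n}\left(\frac{1}{4}\mu_Y + \frac{7}{4}\mu_Y + \cdots + \frac{1}{4}\mu_Y + \frac{7}{4}\mu_Y\right) = \frac{n}{n}\mu_Y = \mu_Y. \]

Hence \(\tilde{Y}\) is unbiased.

\[ var(\tilde{Y}) = E(\tilde{Y} - \mu_Y)^2 = E\left[\frac{1}{n}\left(\frac{1}{4}Y_1 + \frac{7}{4}Y_2 + \cdots + \frac{1}{4}Y_{n-1} + \frac{7}{4}Y_n\right) - \mu_Y\right]^2 \]

\[ = \frac{1}{n^2}E\left[\frac{1}{4}(Y_1 - \mu_Y) + \frac{7}{4}(Y_2 - \mu_Y) + \cdots + \frac{1}{4}(Y_{n-1} - \mu_Y) + \frac{7}{4}(Y_n - \mu_Y)\right]^2 \]

\[ = \frac{1}{n^2}\left[\frac{1}{16}E(Y_1 - \mu_Y)^2 + \frac{49}{16}E(Y_2 - \mu_Y)^2 + \cdots + \frac{1}{16}E(Y_{n-1} - \mu_Y)^2 + \frac{49}{16}E(Y_n - \mu_Y)^2\right] \]

\[ = \frac{1}{n^2}\left[\frac{1}{16}\sigma_Y^2 + \frac{49}{16}\sigma_Y^2 + \cdots + \frac{1}{16}\sigma_Y^2 + \frac{49}{16}\sigma_Y^2\right] = \frac{1}{n^2}\cdot\frac{n}{2}\left(\frac{1}{16} + \frac{49}{16}\right)\sigma_Y^2 = 1.5625\,\frac{\sigma_Y^2}{n}. \]

Since \( var(\tilde{Y}) \to 0 \) as \( n \to \infty \), \(\tilde{Y}\) is consistent. \(\tilde{Y}\) has a larger variance than \(\bar{Y}\) and is therefore not as efficient.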
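The variance factor 1.5625 can be verified exactly with rational arithmetic. The sketch below assumes i.i.d. observations, an even sample size, and weights alternating between 1/4 and 7/4; the function name is illustrative.

```python
from fractions import Fraction


def alternating_estimator_variance(n):
    """Variance of (1/n) * sum(w_i * Y_i) in units of sigma_Y^2.

    Assumes i.i.d. Y_i, so the variance is sum(w_i^2) / n^2, with weights
    alternating 1/4, 7/4 and an even n.
    """
    weights = [Fraction(1, 4) if i % 2 == 0 else Fraction(7, 4) for i in range(n)]
    # The weights average to one, which is what makes the estimator unbiased.
    assert sum(weights) == n
    return sum(w * w for w in weights) / n ** 2


for n in (10, 100, 1000):
    variance_factor = alternating_estimator_variance(n)
    # The ratio to the sample mean's variance (1/n) is the same for every even n.
    print(n, float(variance_factor), variance_factor * n)
```

The variance shrinks toward zero as n grows (consistency), yet it is always 25/16 times the variance of the ordinary sample mean (inefficiency).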

6) Imagine that you had sampled 1,000,000 females and 1,000,000 males to test whether or not females have a higher IQ than males. IQs are normally distributed with a mean of 100 and a standard deviation of 16. You are excited to find that females have an average IQ of 101 in your sample, while males have an IQ of 99. Does this difference seem important? Do you really need to carry out a t-test for differences in means to determine whether or not this difference is statistically significant? What does this result tell you about testing hypotheses when sample sizes are very large?

Answer: The difference seems very small, both in terms of absolute values and, more importantly, in terms of standard deviations. With a sample size as large as n = 1,000,000, the standard error becomes extremely small. This implies that the distribution of means, or differences in means, has almost turned into a spike. In essence, you are (very close to) observing the population. It is therefore unnecessary to test whether or not the difference is statistically significant. After all, if in the population, the male IQ were 99.99 and the female IQ were 100.01, they would be different. In general, when sample sizes become very large, it is very easy to reject null hypotheses about population means, which involve sample means as an estimator, even if hypothesized differences are very small. This is the result of the distribution of sample means collapsing fairly rapidly as sample sizes increase.

7) Let Y be a Bernoulli random variable with success probability \( \Pr(Y = 1) = p \), and let \( Y_1, \ldots, Y_n \) be i.i.d. draws from this distribution. Let \(\hat{p}\) be the fraction of successes (1s) in this sample. In large samples, the distribution of \(\hat{p}\) will be approximately normal, i.e., \(\hat{p}\) is approximately distributed \( N\left(p, \frac{p(1-p)}{n}\right) \). Now let X be the number of successes and n the sample size. In a sample of 10 voters (n = 10), if there are six who vote for candidate A, then X = 6. Relate X, the number of successes, to \(\hat{p}\), the success proportion, or fraction of successes. Next, using your knowledge of linear transformations, derive the distribution of X.

Answer: \( X = n \times \hat{p} \). Hence if \(\hat{p}\) is distributed \( N\left(p, \frac{p(1-p)}{n}\right) \), then, given that X is a linear transformation of \(\hat{p}\), X is distributed \( N(np, np(1-p)) \).



8) When you perform hypothesis tests, you are faced with four possible outcomes described in the accompanying table.

Decision based on sample | Truth (Population): \(H_0\) is true | Truth (Population): \(H_1\) is true
Reject \(H_0\)           | I                                   | ✓
Do not reject \(H_0\)    | ✓                                   | II

"✓" indicates a correct decision, and I and II indicate that an error has been made. In probability terms, state the mistakes that have been made in situation I and II, and relate these to the Size of the test and the Power of the test (or transformations of these).

Answer: I: Pr(reject \(H_0\) | \(H_0\) is correct) = Size of the test.

II: Pr(reject \(H_1\) | \(H_1\) is correct) = (1 - Power of the test).

9) Assume that under the null hypothesis, \(\bar{Y}\) has an expected value of 500 and a standard deviation of 20. Under the alternative hypothesis, the expected value is 550. Sketch the probability density function for the null and the alternative hypothesis in the same figure. Pick a critical value such that the p-value is approximately 5%. Mark the areas, which show the size and the power of the test. What happens to the power of the test if the alternative hypothesis moves closer to the null hypothesis, i.e., \( \mu_Y \) = 540, 530, 520, etc.?

Answer: For a given size of the test, the power of the test is lower.



1' B Endergraduate Fconometrics /ro+*""or "car 4ord52pring %

10) The net weight of a bag of flour is guaranteed to be 5 pounds with a standard deviation of 0.05 pounds. You are concerned that the actual weight is less. To test for this, you sample 25 bags. Carefully state the null and alternative hypothesis in this situation. Determine a critical value such that the size of the test does not exceed 5%. Finding the average weight of the 25 bags to be 4.7 pounds, can you reject the null hypothesis? What is the power of the test here? Why is it so low?

Answer: Let Y be the net weight of the bag of flour. Then \( H_0: E(Y) = 5 \) and \( H_1: E(Y) < 5 \). Under the null hypothesis, \(\bar{Y}\) is distributed normally, with a mean of 5 pounds and a standard deviation of 0.01 pounds. The critical value is approximately 4.98 pounds. Since 4.7 pounds falls in the rejection region, the null hypothesis is rejected. The power of the test is low here, since there is no simple alternative. In the extreme case, where the alternative hypothesis would place the net weight marginally below five pounds, the power of the test would approximately equal its size, or 5% in this case.
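The critical value and the behavior of the power near the null can be sketched in Python. This assumes a guaranteed weight of 5 pounds, a standard deviation of 0.05 pounds, 25 bags, and a one-sided 5% test; the `power` helper and the alternative values passed to it are illustrative.

```python
import math
from statistics import NormalDist

mu_null = 5.0
sigma = 0.05
n_bags = 25
standard_error = sigma / math.sqrt(n_bags)  # standard deviation of the sample mean

# One-sided test against H1: E(Y) < 5, size 5%: reject when the sample mean is below this.
critical_value = mu_null + NormalDist().inv_cdf(0.05) * standard_error
reject_null = 4.7 < critical_value


def power(mu_alternative):
    """Probability of rejecting H0 when the true mean is mu_alternative."""
    return NormalDist(mu_alternative, standard_error).cdf(critical_value)


print(round(critical_value, 2), reject_null)
# Power shrinks toward the size of the test as the alternative approaches 5 pounds.
print(round(power(4.95), 3), round(power(4.99), 3), round(power(5.0), 3))
```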

11) 2ome policy advisors have argued that education should be subsidized indeveloping countries to reduce fertility rates. To investigate whether or noteducation and fertility are correlated, you collect data on population growth rates!Y ) and education ! X ) for 8 countries. Given the sums below, compute thesample correlation:

1

n

i

i

Y =

∑ 0 1.(+'M1

n

i

i

X =

∑ 0 ''+.M1

n

i i

i

Y X =

∑ 0 .'+4M

1

n

i

i

Y =

∑ 0 .%+8M

1

n

i

i

X =

∑ 0

%,.4

Answer: r 0 B.41.

12) (Advanced) Unbiasedness and small variance are desirable properties of estimators. However, you can imagine situations where a trade-off exists between the two: one estimator may have a small bias but a much smaller variance than another, unbiased estimator. The concept of "mean square error" estimator combines the two concepts. Let \(\hat{\mu}\) be an estimator of \(\mu\). Then the mean square error (MSE) is defined as follows: \( MSE(\hat{\mu}) = E(\hat{\mu} - \mu)^2 \). Prove that \( MSE(\hat{\mu}) = bias^2 + var(\hat{\mu}) \). (Hint: subtract and add \( E(\hat{\mu}) \) in \( E(\hat{\mu} - \mu)^2 \).)

Answer:

\[ MSE(\hat{\mu}) = E\left(\hat{\mu} - E(\hat{\mu}) + E(\hat{\mu}) - \mu\right)^2 = E\left[(\hat{\mu} - E(\hat{\mu})) + (E(\hat{\mu}) - \mu)\right]^2 \]

\[ = E\left[(\hat{\mu} - E(\hat{\mu}))^2 + (E(\hat{\mu}) - \mu)^2 + 2(\hat{\mu} - E(\hat{\mu}))(E(\hat{\mu}) - \mu)\right]. \]

Next, moving through the expectation operator results in

\[ E\left[\hat{\mu} - E(\hat{\mu})\right]^2 + E\left[E(\hat{\mu}) - \mu\right]^2 + 2E\left[(\hat{\mu} - E(\hat{\mu}))(E(\hat{\mu}) - \mu)\right]. \]

The first term is the variance, and the second term is the squared bias, since \( E\left[E(\hat{\mu}) - \mu\right]^2 = \left[E(\hat{\mu}) - \mu\right]^2 \). This proves \( MSE(\hat{\mu}) = bias^2 + var(\hat{\mu}) \) if the last term equals zero. But

\[ E\left[(\hat{\mu} - E(\hat{\mu}))(E(\hat{\mu}) - \mu)\right] = E\left[\hat{\mu}E(\hat{\mu}) - \hat{\mu}\mu - (E(\hat{\mu}))^2 + \mu E(\hat{\mu})\right] \]

\[ = (E(\hat{\mu}))^2 - \mu E(\hat{\mu}) - (E(\hat{\mu}))^2 + \mu E(\hat{\mu}) = 0. \]
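The same decomposition holds exactly for any finite collection of estimates when averages replace expectations. The following Python sketch illustrates this with made-up estimates and a hypothetical true value; it is a numerical illustration, not a substitute for the proof.

```python
from statistics import fmean


def mse_and_decomposition(estimates, mu):
    """Return (MSE, bias^2 + variance), with averages standing in for expectations."""
    mean_estimate = fmean(estimates)
    mse = fmean([(e - mu) ** 2 for e in estimates])
    variance = fmean([(e - mean_estimate) ** 2 for e in estimates])
    bias = mean_estimate - mu
    return mse, bias ** 2 + variance


# Hypothetical estimates of a parameter whose true value is taken to be 5.0.
estimates = [4.1, 5.3, 4.8, 5.9, 4.4]
mse, decomposed = mse_and_decomposition(estimates, mu=5.0)
print(mse, decomposed)
```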

13) Your textbook states that when you test for differences in means and you assume that the two population variances are equal, then an estimator of the population variance is the following "pooled" estimator:

\[ s_{pooled}^2 = \frac{1}{n_m + n_w - 2}\left[\sum_{i=1}^{n_m}(Y_i - \bar{Y}_m)^2 + \sum_{i=1}^{n_w}(Y_i - \bar{Y}_w)^2\right] \]

Explain why this pooled estimator can be looked at as the weighted average of the two variances.

Answer:

\[ s_{pooled}^2 = \frac{1}{n_m + n_w - 2}\left[\sum_{i=1}^{n_m}(Y_i - \bar{Y}_m)^2 + \sum_{i=1}^{n_w}(Y_i - \bar{Y}_w)^2\right] = \frac{1}{n_m + n_w - 2}\left[(n_m - 1)s_m^2 + (n_w - 1)s_w^2\right] \]

\[ = \frac{n_m - 1}{n_m + n_w - 2}\,s_m^2 + \frac{n_w - 1}{n_m + n_w - 2}\,s_w^2. \]
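A short Python sketch makes the weighted-average reading concrete. The two groups below are made-up numbers, the function names are illustrative, and `statistics.variance` is the usual sample variance with the n - 1 divisor.

```python
from statistics import fmean, variance


def pooled_variance(group_m, group_w):
    """Pooled variance: combined squared deviations over n_m + n_w - 2."""
    mean_m, mean_w = fmean(group_m), fmean(group_w)
    squared_deviations = (
        sum((y - mean_m) ** 2 for y in group_m)
        + sum((y - mean_w) ** 2 for y in group_w)
    )
    return squared_deviations / (len(group_m) + len(group_w) - 2)


def weighted_average_of_variances(group_m, group_w):
    """Same quantity written as a weighted average of the two sample variances."""
    n_m, n_w = len(group_m), len(group_w)
    degrees_of_freedom = n_m + n_w - 2
    weight_m = (n_m - 1) / degrees_of_freedom
    weight_w = (n_w - 1) / degrees_of_freedom  # the two weights sum to one
    return weight_m * variance(group_m) + weight_w * variance(group_w)


# Hypothetical observations for the two groups.
group_m = [70.0, 72.5, 68.0, 74.0, 71.5]
group_w = [64.0, 66.5, 63.0, 65.5]
print(pooled_variance(group_m, group_w), weighted_average_of_variances(group_m, group_w))
```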

14) Your textbook suggests using the first observation from a sample of n as an estimator of the population mean. It is shown that this estimator is unbiased but has a variance of \( \sigma_Y^2 \), which makes it less efficient than the sample mean. Explain why this estimator is not consistent. You develop another estimator, which is the simple average of the first and last observation in your sample. Show that this estimator is also unbiased and show that it is more efficient than the estimator which only uses the first observation. Is this estimator consistent?

Answer: The estimator is not consistent because its variance does not vanish as n goes to infinity, i.e., \( var(Y_1) \to 0 \) as \( n \to \infty \) does not hold.

\( \tilde{Y} = \frac{1}{2}(Y_1 + Y_n) \). \( E(\tilde{Y}) = \frac{1}{2}\left(E(Y_1) + E(Y_n)\right) = \frac{1}{2}(\mu_Y + \mu_Y) = \mu_Y \). Hence \(\tilde{Y}\) is unbiased.

\[ var(\tilde{Y}) = E(\tilde{Y} - \mu_Y)^2 = E\left[\frac{1}{2}(Y_1 + Y_n) - \mu_Y\right]^2 = E\left[\frac{1}{2}(Y_1 - \mu_Y) + \frac{1}{2}(Y_n - \mu_Y)\right]^2 \]

\[ = \frac{1}{4}\left[E(Y_1 - \mu_Y)^2 + E(Y_n - \mu_Y)^2\right] = \frac{1}{4}\left[\sigma_Y^2 + \sigma_Y^2\right] = \frac{\sigma_Y^2}{2}. \]

Since \( var(\tilde{Y}) \to 0 \) as \( n \to \infty \) does not hold, \(\tilde{Y}\) is not consistent. \( var(\tilde{Y}) < var(Y_1) \), and \(\tilde{Y}\) is therefore more efficient than the estimator which only uses the first observation.

15) Let p be the success probability of a Bernoulli random variable Y, i.e., \( p = \Pr(Y = 1) \). It can be shown that \(\hat{p}\), the fraction of successes in a sample, is asymptotically distributed \( N\left(p, \frac{p(1-p)}{n}\right) \). Using the estimator of the variance of \(\hat{p}\), \( \frac{\hat{p}(1-\hat{p})}{n} \), construct a 95% confidence interval for p. Show that the margin for sampling error simplifies to \( 1/\sqrt{n} \) if you used 2 instead of 1.96 assuming, conservatively, that the standard error is at its maximum. Construct a table indicating the sample size needed to generate a margin of sampling error of 1%, 2%, 5% and 10%. What do you notice about the increase in sample size needed to halve the margin of error? (The margin of sampling error is \( 1.96 \times SE(\hat{p}) \).)

Answer: The 95% confidence interval for p is \( \hat{p} \pm 1.96 \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \). \( \frac{\hat{p}(1-\hat{p})}{n} \) is at a maximum for \( \hat{p} = 0.5 \), in which case the confidence interval reduces to \( \hat{p} \pm 1.96 \times \frac{0.5}{\sqrt{n}} \approx \hat{p} \pm \frac{1}{\sqrt{n}} \), and the margin of sampling error is \( \frac{1}{\sqrt{n}} \).

\( 1/\sqrt{n} \) | n
0.01             | 10,000
0.02             | 2,500
0.05             | 400
0.10             | 100

To halve the margin of error, the sample size has to increase fourfold.
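The table can be regenerated from the conservative rule that the margin equals 1/sqrt(n). A minimal Python sketch, assuming the approximation of 2 for 1.96 and the worst case of a sample proportion of 0.5; the function name is illustrative.

```python
def conservative_sample_size(margin):
    """Sample size n for which 1 / sqrt(n) equals the requested margin of error."""
    return round(1 / margin ** 2)


margins = (0.01, 0.02, 0.05, 0.10)
sample_sizes = {margin: conservative_sample_size(margin) for margin in margins}

for margin, n in sample_sizes.items():
    print(f"{margin:.2f}  {n:>6,}")

# Halving the margin from 2% to 1% multiplies the required sample size by four.
print(sample_sizes[0.01] / sample_sizes[0.02])
```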

16) Let Y be a Bernoulli random variable with success probability \( \Pr(Y = 1) = p \), and let \( Y_1, \ldots, Y_n \) be i.i.d. draws from this distribution. Let \(\hat{p}\) be the fraction of successes (1s) in this sample. Given the following statement

\[ \Pr(-1.96 < Z < 1.96) = 0.95 \]

and assuming that \(\hat{p}\) is approximately distributed \( N\left(p, \frac{p(1-p)}{n}\right) \), derive the 95% confidence interval for p by solving the above inequalities.

Answer: \( \Pr\left(-1.96 < \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} < 1.96\right) = 0.95 \). Multiplying through by the standard deviation results in

\[ \Pr\left(-1.96 \times \sqrt{\frac{p(1-p)}{n}} < \hat{p} - p < 1.96 \times \sqrt{\frac{p(1-p)}{n}}\right) = 0.95. \]

Subtraction of \(\hat{p}\) then yields, after multiplying both sides by (-1),

\[ \Pr\left(\hat{p} - 1.96 \times \sqrt{\frac{p(1-p)}{n}} < p < \hat{p} + 1.96 \times \sqrt{\frac{p(1-p)}{n}}\right) = 0.95. \]

The 95% confidence interval for p then is \( \hat{p} \pm 1.96 \times \sqrt{\frac{p(1-p)}{n}} \), where in practice the unknown p under the square root is replaced by its estimator \(\hat{p}\).

17) Your textbook mentions that dividing the sample variance by n - 1 instead of n is called a degrees of freedom correction. The meaning of the term stems from the fact that one degree of freedom is used up when the mean is estimated. Hence degrees of freedom can be viewed as the number of independent observations remaining after estimating the sample mean.

Consider an example where initially you have 20 independent observations on the height of students. After calculating the average height, your instructor claims that you can figure out the height of the 20th student if she provides you with the height of the other 19 students and the sample mean. Hence you have lost one degree of freedom, or there are only 19 independent bits of information. Explain how you can find the height of the 20th student.

Answer: Since \( \bar{Y} = \frac{1}{20}\sum_{i=1}^{20} Y_i \), \( 20 \times \bar{Y} = \sum_{i=1}^{20} Y_i = \sum_{i=1}^{19} Y_i + Y_{20} \). Hence knowledge of the sample mean and the height of the other 19 students is sufficient for finding the height of the 20th student.
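The recovery step amounts to a single subtraction. A small Python sketch, using twenty made-up heights in inches (the data and the function name are hypothetical):

```python
from statistics import fmean


def recover_missing_observation(sample_mean, n, known_values):
    """Solve n * mean = sum(known values) + missing value for the missing value."""
    return n * sample_mean - sum(known_values)


# Hypothetical heights (inches) of 20 students; the last one is treated as unknown.
heights = [64, 66, 67, 68, 68, 69, 70, 70, 71, 71,
           72, 72, 73, 73, 74, 74, 75, 76, 77, 70]
sample_mean = fmean(heights)

recovered = recover_missing_observation(sample_mean, len(heights), heights[:-1])
print(recovered)
```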

18) The accompanying table lists the height (STUDHGHT) in inches and weight (WEIGHT) in pounds of five college students. Calculate the correlation coefficient.

STUDHGHT | WEIGHT
74       | 165
73       | 165
72       | 145
68       | 155
66       | 140

Answer: r = 0.72.
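The answer can be reproduced with either textbook form of the correlation coefficient: the one built from deviations about the means and the one built from raw sums. The Python sketch below applies both to the five students in the table; the function names are illustrative.

```python
import math


def correlation_from_deviations(xs, ys):
    """Correlation from deviations about the sample means (the n - 1 factors cancel)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / math.sqrt(
        sum((x - x_bar) ** 2 for x in xs) * sum((y - y_bar) ** 2 for y in ys)
    )


def correlation_from_sums(xs, ys):
    """Correlation from raw sums: n*sum(xy) - sum(x)*sum(y) over the matching roots."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    numerator = n * sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y
    denominator = math.sqrt(n * sum(x * x for x in xs) - sum_x ** 2) * math.sqrt(
        n * sum(y * y for y in ys) - sum_y ** 2
    )
    return numerator / denominator


height = [74, 73, 72, 68, 66]
weight = [165, 165, 145, 155, 140]
r_deviations = correlation_from_deviations(height, weight)
r_sums = correlation_from_sums(height, weight)
print(round(r_deviations, 2), round(r_sums, 2))
```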





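19) (Requires calculus.) The variance of \(\hat{p}\), the sample fraction of successes of a Bernoulli random variable with success probability p, is \( \frac{p(1-p)}{n} \). Use calculus to show that this variance is maximized for p = 0.5.

Answer: \( \frac{\partial}{\partial p}\left[\frac{p(1-p)}{n}\right] = \frac{1}{n} - \frac{2p}{n} = 0 \). Hence \( 1 - 2p = 0 \), or \( p = \frac{1}{2} \). The second derivative, \( -\frac{2}{n} \), is negative, so this stationary point is a maximum.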

20) Consider two estimators: one which is biased and has a smaller variance, the other which is unbiased and has a larger variance. Sketch the sampling distributions and the location of the population parameter for this situation. Discuss conditions under which you may prefer to use the first estimator over the second one.

Answer: The bias indicates "how far away," on average, the estimator is from the population value. Although this average is zero for an unbiased estimator, there may be quite some variation around the population mean. In a single draw, there is therefore a high probability of being some distance away from the population mean. On the other hand, if the variance is very small and the estimator is biased by a small amount, then the probability of being closer to the population value may be higher. (The biased estimator may have a smaller mean square error than the unbiased estimator.)




Chapter 4

1) You have obtained measurements of height in inches of 29 female and 81 male students (Studenth) at your university. A regression of the height on a constant and a binary variable (BFemme), which takes a value of one for females and is zero otherwise, yields the following result:

\( \widehat{Studenth} = 71.0 - 4.84 \times BFemme \), \( R^2 = 0.40 \), SER = 2.6
(0.3) (0.57)

(a) Interpret the results.

Answer: The average height of male students is 71 inches, and that of females is approximately 66 inches.

(b) Test the hypothesis that females, on average, are shorter than males, at the 1% level.

Answer: The t-statistic for the difference in means is -8.49. For a one-sided test, the critical value is -2.33. Hence the difference is statistically significant.
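The test in (b) reduces to dividing the slope on the binary variable by its standard error and comparing the result with a one-sided normal critical value. A minimal Python sketch, assuming a slope of -4.84 with standard error 0.57 and an intercept of 71.0; the variable names are illustrative.

```python
from statistics import NormalDist

intercept = 71.0             # average height of males (BFemme = 0)
slope = -4.84                # female minus male difference in average height
slope_standard_error = 0.57

t_statistic = slope / slope_standard_error
critical_value = NormalDist().inv_cdf(0.01)  # one-sided test at the 1% level
average_female_height = intercept + slope

print(round(t_statistic, 2), round(critical_value, 2), t_statistic < critical_value)
print(round(average_female_height))
```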

(c) Is it likely that the error term is homoskedastic here?

Answer: It is safer to assume that the variances for males and females are different. In the underlying sample the standard deviation for females was smaller.

2) You have obtained a sub-sample of 1744 individuals from the Current Population Survey (CPS) and are interested in the relationship between weekly earnings and age. The regression, using heteroskedasticity-robust standard errors, yielded the following result:

\( \widehat{Earn} = 239.16 + 5.20 \times Age \), \( R^2 = 0.05 \), SER = 287.21,
(20.24) (0.57)

where Earn and Age are measured in dollars and years respectively.

(a) Interpret the results.

Answer: A person who is one year older increases her weekly earnings by $5.20. There is no meaning attached to the intercept. The regression explains 5 percent of the variation in earnings.

(b) Is the relationship between Age and Earn statistically significant? Is the effect of age on earnings large?

Answer: The t-statistic on the slope is 9.12, which is above the critical value from the standard normal distribution for any reasonable level of significance. Assuming that people worked 52 weeks a year, the effect of being one year older translates into an additional $270.40 a year. This does not seem particularly large in 2002 dollars, but may have been earlier.

(c) Why should age matter in the determination of earnings? Do the results suggest that there is a guarantee for earnings to rise for everyone as they become older? Do you think that the relationship between age and earnings is linear?

Answer: In general, age-earnings profiles take on an inverted U-shape. Hence it is not linear and the linear approximation may not be good at all. Age may be a proxy for "experience," which in itself can approximate "on the job training." Hence the positive effect between age and earnings. The results do not suggest that there is a guarantee for earnings to rise for everyone as they become older since the regression \( R^2 \) does not equal 1. Instead the result holds "on average."

(d) The variance of the error term and the variance of the dependent variable are related. Given the distribution of earnings, do you think it is plausible that the distribution of errors is normal?

Answer: Since the earnings distribution is highly skewed, it is not reasonable to assume that the error distribution is normal.

(e) (Requires Appendix Material) The average age in this sample is 37.5 years. What is annual income in the sample?

Answer: Since \( \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X} \Rightarrow \bar{Y} = \hat{\beta}_0 + \hat{\beta}_1\bar{X} \). Substituting the estimates for the slope and the intercept then results in average weekly earnings of $434.16 or annual average earnings of $22,576.32.
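Because the OLS line passes through the sample means, the average of the dependent variable can be backed out from the coefficients and the average age. A minimal Python sketch of that arithmetic, assuming an intercept of 239.16, a slope of 5.20, an average age of 37.5, and 52 working weeks per year:

```python
intercept = 239.16      # estimated intercept, dollars per week
slope = 5.20            # estimated dollars per week per year of age
mean_age = 37.5
weeks_per_year = 52

# The regression line goes through (mean age, mean earnings).
mean_weekly_earnings = intercept + slope * mean_age
mean_annual_earnings = weeks_per_year * mean_weekly_earnings

# Annual earnings difference associated with being one year older.
annual_effect_of_one_year = weeks_per_year * slope

print(round(mean_weekly_earnings, 2), round(mean_annual_earnings, 2))
print(round(annual_effect_of_one_year, 2))
```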

3) The baseball team nearest to your home town is, once again, not doing well. Given that your knowledge of what it takes to win in baseball is vastly superior to that of management, you want to find out what it takes to win in Major League Baseball (MLB). You therefore collect the winning percentage of all 30 baseball teams in MLB for 1999 and regress the winning percentage on what you consider the primary determinant for wins, which is quality pitching (team earned run average). You find the following information on team performance:

ummar/ of the 6istribution of inning -ercentage and $eam arned !un

A"erage for #9. in 1:::

A"erage tandard

de"iation

-ercentile

1'; 2(; 4'; (';

median8

%'; 7(; :';

TeamFJA

'.41 .(% %.8' '.%( '.4 '.48 '.+1 (. (.(

5inning"ercentage

.( .8 .' .'% .' .'8 .'+ .(+ .

(a) What is your expected sign for the regression slope? Will it make sense to interpret the intercept? If not, should you omit it from your regression and force the regression line through the origin?

Answer: You expect a negative relationship, since a higher team ERA implies a lower quality of the input. No team comes close to a zero team ERA, and therefore it does not make sense to interpret the intercept. Forcing the regression through the origin is a false implication from this insight. Instead the intercept fixes the level of the regression.

(b) The authors of your textbook have informed you that unless you have more than 100 observations, it may not be plausible to assume that the distribution of your OLS estimators is normal. What are the implications here for testing the significance of your theory?

Answer: Since there are only 30 observations, the distribution of the t-statistic is unknown. You should therefore not conduct statistical inference.

(c) OLS estimation of the relationship between the winning percentage and the team ERA yields the following:

\( \widehat{Winpct} = 0.94 - 0.10 \times teamera \), \( R^2 = 0.49 \), SER = 0.06,
(0.08) (0.02)

where Winpct is measured as wins divided by games played, so for example a team that won half of its games would have Winpct = 0.50. Interpret your regression results.

Answer: For every one point increase in team ERA, the winning percentage decreases by 10 percentage points, or 0.10. Roughly half of the variation in winning percentage is explained by the quality of team pitching.

(d) It is typically sufficient to win 90 games to be in the playoffs and/or to win a division. Winning over 100 games a season is exceptional: the Atlanta Braves had the most wins in 1999 with 103. Teams play a total of 162 games a year. Given this information, do you consider the slope coefficient to be large or small?

Answer: The coefficient is large, since increasing the winning percentage by 0.10 is the equivalent of winning 16 more games per year. Since it is typically sufficient to win 56 percent of the games to qualify for the playoffs, this difference of 0.10 in winning percentage can easily turn a losing team into a winning team.

(e) What would be the effect on the slope, the intercept, and the regression \( R^2 \) if you measured Winpct in percentage points, i.e., as (Wins/Games) × 100?

Answer: Clearly the regression \( R^2 \) will not be affected by a change in scale, since a descriptive measure of the quality of the regression would depend on whim otherwise. The slope of the regression will compensate in such a way that the interpretation of the result is unaffected, i.e., it will become -10 in the above example. The intercept will also change to reflect the fact that if X were 0, then the dependent variable would now be measured in percentage, i.e., it will become 94.0 in the above example.
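The effect of rescaling the dependent variable can be demonstrated with a small hand-rolled OLS fit. The Python sketch below uses made-up team ERA and winning-fraction values, not the 1999 MLB sample, and the `ols` helper is illustrative.

```python
def ols(xs, ys):
    """Return (intercept, slope, r_squared) for a regression of ys on xs and a constant."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    tss = sum((y - y_bar) ** 2 for y in ys)
    return intercept, slope, 1 - ssr / tss


# Hypothetical data for five teams.
team_era = [3.8, 4.3, 4.7, 4.9, 5.2]
win_fraction = [0.60, 0.52, 0.50, 0.45, 0.43]
win_percent = [100 * w for w in win_fraction]  # same outcome in percentage points

b0, b1, r2 = ols(team_era, win_fraction)
b0_pct, b1_pct, r2_pct = ols(team_era, win_percent)

# Intercept and slope scale by 100; the R-squared is unchanged.
print(b0_pct / b0, b1_pct / b1, r2_pct - r2)
```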

(f) Are you impressed with the size of the regression \( R^2 \)? Given that there is 51% of unexplained variation in the winning percentage, what might some of these factors be?

Answer: It is impressive that a single variable can explain roughly half of the variation in winning percentage. Answers to the second question will vary by student, but will typically include the quality of hitting, fielding, and management. Salaries could be included, but should be reflected in the inputs.

4) You have learned in one of your economics courses that one of the determinants of per capita income (the "Wealth of Nations") is the population growth rate. Furthermore you also found out that the Penn World Tables contain income and population data for 104 countries of the world. To test this theory, you regress the GDP per worker (relative to the United States) in 1990 (RelPersInc) on the difference between the average population growth rate of that country (n) to the U.S. average population growth rate (nus) for the years 1980 to 1990. This results in the following regression output:

·Je l/*r",nc 0 .(18 B 18.8%1[!n 7 n(") , %0.(, !E% 0 .1+4

!.() !%.144)

(a) Interpret the results carefully. Is this relationship statistically significant? Is it economically important?

Answer: A relative increase in the population rate of one percentage point, from 0.01 to 0.02, say, lowers relative per-capita income by almost 20 percentage points (0.188). This is a quantitatively important and large effect. Nations which have the same population growth rate as the United States have, on average, roughly half as much per capita income. The t-statistic is 5.93, making the relationship statistically significant.

(b) What would happen to the slope, intercept, and regression \( R^2 \) if you ran another regression where the above explanatory variable was replaced by n only, i.e., the average population growth rate of the country? (The population growth rate of the United States from 1980 to 1990 was 0.009.) Should this have any effect on the t-statistic of the slope?

Answer: The interpretation of the partial derivative is unaffected, in that the slope still indicates the effect of a one percentage point increase in the population growth rate. The regression \( R^2 \) and t-statistic will remain the same since only a constant was removed from the explanatory variable. The intercept will change as a result of the change in X.
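As a worked check of how the intercept moves, assume the reported intercept of 0.518, the slope of -18.831, and the U.S. population growth rate of 0.009. Substituting n - nus = n - 0.009 into the fitted line gives:

```latex
\begin{aligned}
\widehat{RelPersInc} &= 0.518 - 18.831\,(n - 0.009) \\
                     &= (0.518 + 18.831 \times 0.009) - 18.831\,n \\
                     &\approx 0.687 - 18.831\,n
\end{aligned}
```

The slope is untouched and only the intercept absorbs the constant, which is why the fit and the t-statistic on the slope do not change.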

(c) Is there any reason to believe that the variance of the error terms is homoskedastic?

Answer: There are vast differences in the size of these countries, both in terms of the population and GDP. Furthermore, the countries are at different stages of economic and institutional development. Other factors vary as well. It would therefore be odd to assume that the errors would be homoskedastic.

(d) 31 of the 104 countries have a dependent variable of less than 0.10. Does it therefore make sense to interpret the intercept?

Answer: To interpret the intercept, you must observe values of X close to zero, not Y.

5) The neoclassical growth model predicts that for identical savings rates and population growth rates, countries should converge to the same per capita income level. This is referred to as the convergence hypothesis. One way to test for the presence of convergence is to compare the growth rates over time to the initial starting level.

(a) If you regressed the average growth rate over a time period (1960-1990) on the initial level of per capita income, what would the sign of the slope have to be to indicate this type of convergence? Explain. Would this result confirm or reject the prediction of the neoclassical growth model?

Answer: You would require a negative sign. Countries that are far ahead of others at the beginning of the period would have to grow relatively slower for the others to catch up. This represents unconditional convergence, whereas the neoclassical growth model predicts conditional convergence, i.e., there will only be convergence if countries have identical savings, population growth rates, and production technology.

(b) The results of the regression for 104 countries were as follows:

· + 0 .1+ B .[ %*l/rod :0 , % 0 .4, !E% 0 .1,

!.') !.4%)

where g6090 is the average annual growth rate of GDP per worker for the 1960-1990 sample period, and RelProd60 is GDP per worker relative to the United States in 1960. Interpret the results. Is there any evidence of unconditional convergence between the countries of the world? Is this result surprising? What other concept could you think about to test for convergence between countries?

Answer: An increase in 1 percentage points in %*l/rod :0 results in a decrease of. in the growth rate from 1+ to 1++, i'*', countries that werefurther ahead in 1+ do grow by less. There are some countries in thesample that have a value of %*l/rod :0 close to zero !Lhina, Eganda,Togo, Guinea) and you would e$pect these countries to grow roughly

by percent per year over the sample period. The regression $R^2$ indicates that the regression has virtually no explanatory power. This is confirmed by the very low t-statistic, indicating that the slope is not statistically significant. The result is not surprising given that there are not many theories that predict unconditional convergence between the countries of the world.

(c) Using the OLS estimator with homoskedasticity-only standard errors, the results changed as follows:

· + 0 .1+ B .[ %*l/rod :0 , % 0 .4, !E% 0 .1

!.) !.8)

Why didn't the estimated coefficients change? Given that the standard error of the slope is now smaller, can you reject the null hypothesis of no beta convergence? Are the results in (c) more reliable than the results in (b)? Explain.

Answer: Using homoskedasticity-only standard errors has no effect on the OLS estimator. The t-statistic remains small and is certainly below the critical value. The results are less reliable since there is no reason to believe that the error variance is homoskedastic.

(d) You decide to restrict yourself to the 24 OECD countries in the sample. This changes your regression output as follows:

· + 0 .'8 B .'' %*l/rod :0 , % 0 .8 , !E% 0 .'

!.') !.%)

How does this result affect your conclusions from above? When you test for convergence, should you worry about the relatively small sample size?

Answer: Judging by the size of the slope coefficient, there is strong evidence of unconditional convergence for the OECD countries. The regression $R^2$ is quite high, given that there is only a single explanatory variable in the regression. However, since we do not know the sampling distribution of the estimator in this case, we cannot conduct inference.

6) In 2001, the Arizona Diamondbacks defeated the New York Yankees in the Baseball World Series in 7 games. Some players, such as Bautista and Finley for the Diamondbacks, had a substantially higher batting average during the World Series than during the regular season. Others, such as Brosius and Jeter for the Yankees, did substantially poorer. You set out to investigate whether or not the regular season batting average is a good indicator for the World Series batting average. The results for 11 players who had the most at bats for the two teams are:





· AZ$"a; 0 B.%'4 # .+ AZ!*a"a; , %0.11, !E% 0 .1'(,

!.') !.1()

· NY$"a; 0 .1%' # .1% NY!*a"a; , %0.1, !E% 0 .+,

!.%41) !1.%'4)

where Wsavg and Seasavg indicate the batting average during the World Series and the regular season respectively.

(a) Focusing on the coefficients first, what is your interpretation?

Answer: The two regressions are quite different. Oor the iamondbac&s, playerswho had a 1 point higher batting average during the regular seasonhad roughly a % point higher batting average during the 5orld 2eries.Cence top performers did relatively better. The opposite holds for the;an&ees.

(b) What can you say about the explanatory power of your equation? What do you conclude from this?

Answer: Both regressions have little explanatory power as seen from the regression $R^2$. Hence performance during the season is a poor forecast of World Series performance.

(c) Calculate the t-statistics for the various regression coefficients. Are any of these significant at the 5% level? When using statistical inference in this case, should you be concerned about the number of observations?

Answer: The respective t 3statistics are B.(4(, 1., .%1, and .11. Pone ofthese are statistically significant at the (* level. Cowever, given thatthere are only 11 observations, you should not conduct inference, sincethe sampling distribution is un&nown.

Mathematical Questions

1) Prove that the regression $R^2$ is identical to the square of the correlation coefficient between two variables Y and X. Regression functions are written in a form that suggests causation running from X to Y. Given your proof, does a high regression $R^2$ present supportive evidence of a causal relationship? Can you think of some regression examples where the direction of causality is not clear? Is without a doubt?

Answer: The regression $R^2 = \frac{ESS}{TSS}$, where $ESS$ is given by $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$. But $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Hence $(\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2 (X_i - \bar{X})^2$, and therefore $ESS = \hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$. Using small letters to indicate deviations from mean, i.e., $z_i = Z_i - \bar{Z}$, we get that the regression

\[ R^2 = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}. \]

The square of the correlation coefficient is

\[ r^2 = \frac{\left(\sum_{i=1}^{n} y_i x_i\right)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2} = \frac{\left(\sum_{i=1}^{n} y_i x_i\right)^2 \sum_{i=1}^{n} x_i^2}{\left(\sum_{i=1}^{n} x_i^2\right)^2 \sum_{i=1}^{n} y_i^2} = \hat{\beta}_1^2 \frac{\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}. \]

Hence the two are the same. Correlation does not imply causation. Income is a regressor in the consumption function, yet consumption enters on the right-hand side of the GDP identity. Regressing the weight of individuals on the height is a situation where causality is without doubt, since the author of this test bank should be seven feet tall otherwise. The authors of the textbook use weather data to forecast orange juice prices later in the text.

2) In order to formulate whether or not the alternative hypothesis is one-sided or two-sided, you need some guidance from economic theory. Choose at least three examples from economics or other fields where you have a clear idea what the null hypothesis and the alternative hypothesis for the slope coefficient should be. Write a brief justification for your answer.

Answer: Answers will vary by student. The problem is to find examples where there is only a single explanatory variable. A student may argue that the price coefficient in a demand function is downward sloping, but unless you control for other variables, this may not be so. The demand for L.A. Laker tickets and their price comes to mind. CAPM is a nice example. Perhaps the marginal propensity to consume in a consumption function is another. Testing for speculative efficiency in exchange rate markets may also work.

3) For the following estimated slope coefficients and their standard errors, find the t-statistics for the null hypothesis $H_0: \beta_1 = 0$. Indicate whether or not you are able to reject the null hypothesis at the 10%, 5%, and 1% level of a one-sided and two-sided hypothesis.

!a)1 1

> >'., ! ) .'!E β β = =





!b)1 1

> >.(, ! ) .%4!E β β = =

!c)1 1

> >.%, ! ) .!E β β = =

!d)1 1

> >%, ! ) %!E β β = =

Answer: a) t 0 1.4(M re@ect null 1* level of two3sided test, and (* ofone3sided test. b) t 0 1.%(M cannot re@ect null at 1* of two3sided test, re@ect null at1* of one3sided test.c) t 0 1.(M cannot re@ect null at 1* of two3sided test, re@ect null at1* of one3sided test.d) t 0 1.M cannot re@ect null at 1* of both two3sided and one3sidedtest.

4) Explain carefully the relationship between a confidence interval, a one-sided hypothesis test, and a two-sided hypothesis test. What is the unit of measurement of the t-statistic?

Answer: In the case of a two-sided hypothesis test, the relationship between the t-statistic and the confidence interval is straightforward. The t-statistic calculates the distance between the estimate and the hypothesized value in standard deviations. If the distance is larger than 1.96 (size of the test: 5%), then the distance is large enough to reject the null hypothesis. The confidence interval adds and subtracts 1.96 standard deviations in this case, and asks whether or not the hypothesized value is contained within the confidence interval. Hence the two concepts resemble the two sides of a coin. They are simply different ways to look at the same problem. In the case of the one-sided test, the relationship is more complex. Since you are looking at a one-sided alternative, it does not really make sense to construct a confidence interval. However, the confidence interval results in the same conclusion as the t-test if the critical value from the standard normal distribution is appropriately adjusted, e.g. to 10% rather than 5%. The unit of measurement of the t-statistic is standard deviations.
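The "two sides of a coin" point can be checked numerically. The following is a minimal sketch, assuming the large-sample critical value 1.96 for a 5% two-sided test; the estimates, hypothesized values, and standard errors are hypothetical and chosen only for illustration.

```python
def t_statistic(estimate, hypothesized, std_error):
    """Distance between estimate and hypothesized value, in standard errors."""
    return (estimate - hypothesized) / std_error


def confidence_interval(estimate, std_error, critical=1.96):
    """Interval obtained by adding and subtracting critical * SE."""
    half_width = critical * std_error
    return estimate - half_width, estimate + half_width


def reject_two_sided(estimate, hypothesized, std_error, critical=1.96):
    """Two-sided test: reject when |t| exceeds the critical value."""
    return abs(t_statistic(estimate, hypothesized, std_error)) > critical


def outside_interval(estimate, hypothesized, std_error, critical=1.96):
    """True when the hypothesized value is NOT inside the interval."""
    lower, upper = confidence_interval(estimate, std_error, critical)
    return not (lower <= hypothesized <= upper)


# Hypothetical (estimate, hypothesized value, standard error) triples.
examples = [(1.0, 0.0, 0.4), (1.0, 0.0, 0.6), (-2.5, -1.0, 0.5)]

# The test decision and the interval check always agree.
decisions_agree = all(
    reject_two_sided(b, h, se) == outside_interval(b, h, se)
    for b, h, se in examples
)
```

Both functions use the same critical value, which is why the two decisions coincide; changing `critical` in one but not the other would break the equivalence.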

5) You have analyzed the relationship between the weight and height of individuals. Although you are quite confident about the accuracy of your measurements, you feel that some of the observations are extreme, say, two standard deviations above and below the mean. You therefore decide to disregard these individuals. What consequence will this have on the standard deviation of the OLS estimator of the slope?

Answer: Other things being equal, the standard error of the slope coefficient will decrease the larger the variation in X. Hence you prefer more variation rather than less. This is easier to see in the case of homoskedasticity-only standard errors, but carries over to the heteroskedasticity-robust standard errors. Intuitively it is easier for OLS to detect a response to a unit change in X if the data varies more.

6) In order to calculate the regression $R^2$ you need the TSS and either the SSR or the ESS. The TSS is fairly straightforward to calculate, being just the variation of Y. However, if you had to calculate the SSR or ESS by hand (or in a spreadsheet), you would need all fitted values from the regression function and their deviations from the sample mean, or the residuals. Can you think of a quicker way to calculate the ESS simply using terms you have already used to calculate the slope coefficient?

Answer: The ESS is given by $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$. But $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Hence $(\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2 (X_i - \bar{X})^2$, and therefore

\[ ESS = \hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2. \]

The right-hand side contains the estimated slope squared and the denominator of the slope, i.e., all values that have already been calculated.
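As a quick check of the shortcut, the sketch below computes the ESS both from the fitted values and from the slope squared times the variation in X. The data are hypothetical and serve only to illustrate the identity.

```python
# Hypothetical sample used only to illustrate the identity.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Quantities that are already needed for the slope.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# ESS the long way: deviations of fitted values from the sample mean of Y.
ess_from_fitted = sum((intercept + slope * xi - y_bar) ** 2 for xi in x)

# ESS the quick way: slope squared times the denominator of the slope.
ess_shortcut = slope ** 2 * sxx

# TSS is just the variation of Y, so R^2 follows without any fitted values.
tss = sum((yi - y_bar) ** 2 for yi in y)
r_squared = ess_shortcut / tss
```

The shortcut avoids building the column of fitted values, which is the point of the question when working by hand or in a spreadsheet.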

7) (Requires Appendix Material) In deriving the OLS estimator, you minimize the sum of squared residuals with respect to the two parameters $\hat{\beta}_0$ and $\hat{\beta}_1$. The resulting two equations imply two restrictions that OLS places on the data, namely that $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} \hat{u}_i X_i = 0$. Show that you get the same formula for the regression slope and the intercept if you impose these two conditions on the sample regression function.

Answer: The sample regression function is $Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{u}_i$. Summing both sides results in

\[ \sum_{i=1}^{n} Y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} \hat{u}_i. \]

Imposing the first restriction, namely that the sum of the residuals is zero, dividing both sides of the equation by n, and solving for $\hat{\beta}_0$ gives the OLS formula for the intercept.

For the second restriction, multiply both sides of the sample regression function by $X_i$ and then sum both sides to get

\[ \sum_{i=1}^{n} Y_i X_i = \hat{\beta}_0 \sum_{i=1}^{n} X_i + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2 + \sum_{i=1}^{n} \hat{u}_i X_i. \]

After imposing the restriction $\sum_{i=1}^{n} \hat{u}_i X_i = 0$ and substituting the formula for the intercept, you get

\[ \sum_{i=1}^{n} Y_i X_i = (\bar{Y} - \hat{\beta}_1 \bar{X})\, n\bar{X} + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2 \quad \text{or} \quad \sum_{i=1}^{n} Y_i X_i - n\bar{Y}\bar{X} = \hat{\beta}_1 \sum_{i=1}^{n} X_i^2 - \hat{\beta}_1 n\bar{X}^2, \]

which, after isolating $\hat{\beta}_1$ and dividing by the variation in X, results in the OLS estimator for the slope.

8) (Requires Appendix Material) Show that the two alternative formulae for the slope given in your textbook are identical.

\[ \frac{\frac{1}{n}\sum_{i=1}^{n} X_i Y_i - \bar{X}\bar{Y}}{\frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \]

In addition, the help function for a commonly used spreadsheet program gives the following definition for the regression slope it estimates:

\[ \frac{n\sum_{i=1}^{n} X_i Y_i - \left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2} \]

Prove that this formula is also the same as those given above.

Answer: Let's start with the first equality. The numerator of the right-hand side expression can be written as follows:

\[ \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \sum_{i=1}^{n} (X_i Y_i - \bar{X} Y_i - \bar{Y} X_i + \bar{X}\bar{Y}) = \sum_{i=1}^{n} X_i Y_i - \bar{X}\sum_{i=1}^{n} Y_i - \bar{Y}\sum_{i=1}^{n} X_i + n\bar{Y}\bar{X} \]

\[ = \sum_{i=1}^{n} Y_i X_i - n\bar{X}\bar{Y} - n\bar{X}\bar{Y} + n\bar{X}\bar{Y} = \sum_{i=1}^{n} Y_i X_i - n\bar{X}\bar{Y}. \]

(Note that $\sum_{i=1}^{n} X_i = n\bar{X}$.)

Multiplying out the terms in the denominator and moving the summation sign into the expression in parentheses similarly yields $\sum_{i=1}^{n} X_i^2 - n\bar{X}^2$.

Dividing both of these expressions by n then results in the left-hand side fraction.

Finally,

\[ \frac{n\sum_{i=1}^{n} X_i Y_i - \left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2} = \frac{n\sum_{i=1}^{n} X_i Y_i - n\bar{X}\, n\bar{Y}}{n\sum_{i=1}^{n} X_i^2 - (n\bar{X})^2} = \frac{\sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}}{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}. \]

Dividing both numerator and denominator by n then gives you the desired result.
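The algebra can be sanity-checked numerically. The sketch below evaluates the three slope formulas on the same hypothetical data; the function names are illustrative and not taken from any textbook or spreadsheet program.

```python
def slope_deviations(x, y):
    """Sum of cross-deviations divided by sum of squared deviations of X."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    return numerator / denominator


def slope_means(x, y):
    """(1/n) sum of X*Y minus product of means, over (1/n) sum of X^2 minus mean^2."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    numerator = sum(xi * yi for xi, yi in zip(x, y)) / n - x_bar * y_bar
    denominator = sum(xi ** 2 for xi in x) / n - x_bar ** 2
    return numerator / denominator


def slope_raw_sums(x, y):
    """Formula built from raw sums only, as in the spreadsheet definition."""
    n = len(x)
    numerator = n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)
    denominator = n * sum(xi ** 2 for xi in x) - sum(x) ** 2
    return numerator / denominator


# Hypothetical data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

slopes = (slope_deviations(x, y), slope_means(x, y), slope_raw_sums(x, y))
```

All three entries of `slopes` agree up to floating-point rounding, which is what the proof predicts.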

9) (Requires Calculus) Consider the following model:

\[ Y_i = \beta_0 + u_i. \]

Derive the OLS estimator for $\beta_0$.

Answer: To derive the OLS estimator, minimize the sum of squared prediction mistakes $\sum_{i=1}^{n} (Y_i - b_0)^2$. Taking the derivative with respect to $b_0$ results in

\[ \frac{\partial}{\partial b_0} \sum_{i=1}^{n} (Y_i - b_0)^2 = \sum_{i=1}^{n} \frac{\partial}{\partial b_0} (Y_i - b_0)^2 = 2\sum_{i=1}^{n} (Y_i - b_0)(-1) = (-2)\sum_{i=1}^{n} (Y_i - b_0) = (-2)\left(\sum_{i=1}^{n} Y_i - n b_0\right). \]

Setting the derivative to zero then results in the OLS estimator:

\[ (-2)\left(\sum_{i=1}^{n} Y_i - n\hat{\beta}_0\right) = 0 \;\Rightarrow\; \hat{\beta}_0 = \bar{Y}. \]

10) (Requires Calculus) Consider the following model:

\[ Y_i = \beta_1 X_i + u_i. \]

Derive the OLS estimator for $\beta_1$.

Answer: To derive the OLS estimator, minimize the sum of squared prediction mistakes $\sum_{i=1}^{n} (Y_i - b_1 X_i)^2$. Taking the derivative with respect to $b_1$ results in

\[ \frac{\partial}{\partial b_1} \sum_{i=1}^{n} (Y_i - b_1 X_i)^2 = \sum_{i=1}^{n} \frac{\partial}{\partial b_1} (Y_i - b_1 X_i)^2 = 2\sum_{i=1}^{n} (Y_i - b_1 X_i)(-X_i) = (-2)\sum_{i=1}^{n} (Y_i - b_1 X_i) X_i = (-2)\left(\sum_{i=1}^{n} Y_i X_i - b_1 \sum_{i=1}^{n} X_i^2\right). \]

Setting the derivative to zero then results in the OLS estimator:

\[ (-2)\left(\sum_{i=1}^{n} Y_i X_i - \hat{\beta}_1 \sum_{i=1}^{n} X_i^2\right) = 0 \;\Rightarrow\; \hat{\beta}_1 = \frac{\sum_{i=1}^{n} Y_i X_i}{\sum_{i=1}^{n} X_i^2}. \]

11) Show first that the regression $R^2$ is the square of the sample correlation coefficient. Next, show that the slope of a simple regression of Y on X is only identical to the inverse of the regression slope of X on Y if the regression $R^2$ equals one.

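Answer: The regression $R^2 = \frac{ESS}{TSS}$, where $ESS$ is given by $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$. But $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Hence $(\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2 (X_i - \bar{X})^2$, and therefore $ESS = \hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$. Using small letters to indicate deviations from mean, i.e., $z_i = Z_i - \bar{Z}$, we get that the regression

\[ R^2 = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}. \]

The square of the correlation coefficient is

\[ r^2 = \frac{\left(\sum_{i=1}^{n} y_i x_i\right)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2} = \frac{\left(\sum_{i=1}^{n} y_i x_i\right)^2 \sum_{i=1}^{n} x_i^2}{\left(\sum_{i=1}^{n} x_i^2\right)^2 \sum_{i=1}^{n} y_i^2} = \hat{\beta}_1^2 \frac{\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}. \]

Hence the two are the same.

Now

\[ 1 = r^2 = \hat{\beta}_1^2 \frac{\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2} \;\Rightarrow\; \hat{\beta}_1^2 = \frac{\sum_{i=1}^{n} y_i^2}{\sum_{i=1}^{n} x_i^2}. \]

But $\hat{\beta}_1^2 = \hat{\beta}_1 \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$ and therefore

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} y_i^2}{\sum_{i=1}^{n} x_i y_i}, \]

which is the inverse of the regression slope of X on Y.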
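A small numerical illustration of both claims follows. It is a sketch on hypothetical data: one sample with noise and one that lies exactly on a line, so that the regression R-squared equals one.

```python
def regression_summary(x, y):
    """Return slope of Y on X, slope of X on Y, r^2, and the regression R^2."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope_y_on_x = sxy / sxx
    slope_x_on_y = sxy / syy
    r_squared_corr = sxy ** 2 / (sxx * syy)           # squared correlation
    r_squared_reg = slope_y_on_x ** 2 * sxx / syy     # ESS / TSS
    return slope_y_on_x, slope_x_on_y, r_squared_corr, r_squared_reg


# Hypothetical data: a noisy sample and an exact linear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_noisy = [2.0, 4.1, 5.9, 8.2, 9.8]
y_exact = [2.0 * xi + 1.0 for xi in x]

noisy = regression_summary(x, y_noisy)
exact = regression_summary(x, y_exact)

# With noise the slope of Y on X differs from 1 / (slope of X on Y);
# with a perfect fit the two coincide.
gap_noisy = abs(noisy[0] - 1.0 / noisy[1])
gap_exact = abs(exact[0] - 1.0 / exact[1])
```

In general the product of the two slopes equals the squared correlation, so the slopes are exact inverses only when that product is one.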

12) Consider the sample regression function

\[ Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{u}_i. \]

First, take averages on both sides of the equation. Second, subtract the resulting equation from the above equation to write the sample regression function in deviations from means. (For simplicity, you may want to use small letters to indicate deviations from the mean, i.e., $z_i = Z_i - \bar{Z}$.) Finally, illustrate in a two-dimensional diagram with SSR on the vertical axis and the regression slope on the horizontal axis how you could find the least squares estimator for the slope by varying its values through trial and error.

Answer: Taking averages results in $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$, and subtracting this equation from the above one, we get $y_i = \hat{\beta}_1 x_i + \hat{u}_i$.

\[ SSR = \sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (y_i - \hat{\beta}_1 x_i)^2 \]

is a quadratic which takes on different values for different choices of $\hat{\beta}_1$ (the y and x are given in this case, i.e., different from the usual calculus problems, they cannot vary here). You could choose a starting value of the slope and calculate SSR. Next you could choose a different value for the slope and calculate the new SSR. There are two choices for the new slope value for you to make: first, in which direction you want to move, and second, how large a distance you want to choose the new slope value from the old one. (In essence, this is what sophisticated search algorithms do.) You continue with this procedure until you find the smallest SSR. The slope coefficient which has generated this SSR is the OLS estimator.
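The trial-and-error procedure described above can be sketched as a simple step-halving search. This is one possible way to organize the search, not the method of any particular software package; the data, starting value, and step size are hypothetical.

```python
def ssr(slope, x_dev, y_dev):
    """Sum of squared residuals for data written in deviations from means."""
    return sum((yi - slope * xi) ** 2 for xi, yi in zip(x_dev, y_dev))


def search_slope(x_dev, y_dev, start=0.0, step=1.0, tolerance=1e-8):
    """Move the slope up or down while SSR falls; halve the step otherwise."""
    slope = start
    best = ssr(slope, x_dev, y_dev)
    while step > tolerance:
        moved = False
        for candidate in (slope + step, slope - step):  # choose a direction
            value = ssr(candidate, x_dev, y_dev)
            if value < best:
                slope, best, moved = candidate, value, True
                break
        if not moved:
            step /= 2.0  # neither direction helps: try a smaller distance
    return slope


# Hypothetical data, converted to deviations from the sample means.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
x_dev = [xi - x_bar for xi in x]
y_dev = [yi - y_bar for yi in y]

slope_by_search = search_slope(x_dev, y_dev)
slope_by_formula = sum(a * b for a, b in zip(x_dev, y_dev)) / sum(a * a for a in x_dev)
```

Because SSR is a quadratic in the slope with a single minimum, the search settles on the same value as the closed-form least squares formula, up to the chosen tolerance.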

13) Carefully discuss the advantages of using heteroskedasticity-robust standard errors over standard errors calculated under the assumption of homoskedasticity. Give at least five examples where it is very plausible to assume that the errors display heteroskedasticity.

Answer: There are virtually no examples where economic theory suggests that the errors are homoskedastic. Hence the maintained hypothesis should be that they are heteroskedastic. Using homoskedasticity-only standard errors when in truth heteroskedasticity-robust standard errors should be used results in false inference. What makes this worse is that homoskedasticity-only standard errors are typically smaller than heteroskedasticity-robust standard errors, resulting in t-statistics that are too large, and hence rejection of the null hypothesis too often. There is an alternative GLS estimator, weighted least squares, which is BLUE, but requires knowledge of how the error variance depends on X, e.g., through a known function of X. Answers will vary by student regarding the examples, but earnings functions, cross country beta-convergence regressions, consumption functions, sports regressions involving teams from markets with varying population size, weight-height relationships for children, etc., are all good candidates.

14) The effect of decreasing the student-teacher ratio by one is estimated to result in an improvement of the districtwide score by 2.28 with a standard error of 0.52. Construct a 90% and 99% confidence interval for the size of the slope coefficient and the corresponding predicted effect of changing the student-teacher ratio by one. What is the intuition on why the 99% confidence interval is wider than the 90% confidence interval?

Answer: The 90% confidence interval for the slope is calculated as follows:

(2.28 − 1.645×0.52, 2.28 + 1.645×0.52) = (1.42, 3.14).

The corresponding predicted effect of a unit change in the student-teacher ratio is the same, since the change in X is 1. The 99% confidence interval for the slope coefficient and the unit change in the student-teacher ratio is:

(2.28 − 2.58×0.52, 2.28 + 2.58×0.52) = (0.94, 3.62).

The 99% confidence interval corresponds to a smaller size of the test. This means that you want to be "more certain" that the population parameter is contained in the interval, and that requires a larger interval.
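The interval arithmetic can be reproduced in a few lines. This sketch assumes the slope estimate is 2.28 with standard error 0.52, and uses the large-sample normal critical values 1.645 (90%) and 2.58 (99%); the function name is illustrative.

```python
def confidence_interval(estimate, std_error, critical_value):
    """Estimate plus and minus critical_value standard errors."""
    half_width = critical_value * std_error
    return estimate - half_width, estimate + half_width


# Assumed inputs: slope estimate and standard error from the problem.
slope, std_error = 2.28, 0.52

ci_90 = confidence_interval(slope, std_error, 1.645)  # about (1.42, 3.14)
ci_99 = confidence_interval(slope, std_error, 2.58)   # about (0.94, 3.62)

# The 99% interval is wider because it uses the larger critical value.
width_90 = ci_90[1] - ci_90[0]
width_99 = ci_99[1] - ci_99[0]
```

Since the change in the student-teacher ratio is one unit, the same two intervals also describe the predicted effect on the districtwide score.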

15) Given the amount of money and effort that you have spent on your education, you wonder if it was (is) all worth it. You therefore collect data from the Current Population Survey (CPS) and estimate a linear relationship between earnings and the years of education of individuals. What would be the effect on your regression slope and intercept if you measured earnings in thousands of dollars rather than in dollars? Would the regression $R^2$ be affected? Should statistical inference be dependent on the scale of variables? Discuss.

Answer: It should be clear that interpretation of estimated relationships and statistical inference should not depend on the units of measurement. Otherwise whim could dictate conclusions. Hence the regression $R^2$ and statistical inference cannot be affected. It is easy but tedious to show this mathematically. Next, the intercept indicates the value of Y when X is zero. Since earnings is the dependent variable here, measuring it in thousands of dollars divides both the intercept and the slope by 1,000, so the decimal point moves 3 digits to the left in both coefficients. (Had the explanatory variable X been rescaled instead, the intercept would be unaffected, since the change in X is cancelled by the change in $\hat{\beta}_1$, and only the slope coefficient would change to compensate for the change in the units of measurement of X.)
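The effect of rescaling the dependent variable can be seen directly in a small example. The sketch below uses hypothetical education and earnings figures (not CPS data) and fits the same regression with earnings in dollars and in thousands of dollars.

```python
def ols_fit(x, y):
    """Return (intercept, slope, r_squared) for a simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical data: years of education and annual earnings in dollars.
education = [10, 12, 12, 14, 16, 16, 18]
earnings_dollars = [21000.0, 26000.0, 24000.0, 33000.0, 41000.0, 38000.0, 52000.0]
earnings_thousands = [e / 1000.0 for e in earnings_dollars]

b0_dollars, b1_dollars, r2_dollars = ols_fit(education, earnings_dollars)
b0_thousands, b1_thousands, r2_thousands = ols_fit(education, earnings_thousands)
```

Dividing the dependent variable by 1,000 divides both the intercept and the slope by 1,000, while the regression R-squared is identical in the two fits.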

16) (Requires Appendix Material) Consider the sample regression function

\[ Y_i^* = \hat{\gamma}_0 + \hat{\gamma}_1 X_i^* + \hat{u}_i, \]

where "*" indicates that the variable has been standardized. What are the units of measurement for the dependent and explanatory variable? Why would you want to transform both variables in this way? Show that the OLS estimator for the intercept equals zero. Next prove that the OLS estimator for the slope in this case is identical to the formula for the least squares estimator where the variables have not been standardized, times the ratio of the sample standard deviation of X and Y, i.e., $\hat{\gamma}_1 = \hat{\beta}_1 \frac{s_X}{s_Y}$.

Answer: The units of measurement are in standard deviations. Standardizing the variables allows conversion into common units and allows comparison of the size of coefficients. The mean of standardized variables is zero, and hence the OLS intercept must also be zero. The slope coefficient is given by the formula

\[ \hat{\gamma}_1 = \frac{\sum_{i=1}^{n} x_i^* y_i^*}{\sum_{i=1}^{n} x_i^{*2}}, \]

where small letters indicate deviations from mean, i.e., $z = Z - \bar{Z}$.

Note that means of standardized variables are zero, and hence we get

\[ \hat{\gamma}_1 = \frac{\sum_{i=1}^{n} X_i^* Y_i^*}{\sum_{i=1}^{n} X_i^{*2}}. \]

Writing this expression in terms of originally observed variables results in

\[ \hat{\gamma}_1 = \frac{\sum_{i=1}^{n} \frac{x_i y_i}{s_X s_Y}}{\sum_{i=1}^{n} \frac{x_i^2}{s_X^2}}, \]

which is the same as the sought after expression after simplification.
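The result can also be verified numerically. The following sketch standardizes hypothetical data with the sample standard deviation from Python's `statistics` module and compares the slope of the standardized regression with the original slope times the ratio of standard deviations.

```python
import statistics


def ols_fit(x, y):
    """Return (intercept, slope) for a simple regression of y on x."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def standardize(values):
    """Subtract the sample mean and divide by the sample standard deviation."""
    mean, sd = statistics.mean(values), statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

beta0, beta1 = ols_fit(x, y)
gamma0, gamma1 = ols_fit(standardize(x), standardize(y))

# Original slope scaled by the ratio of sample standard deviations.
scaled_slope = beta1 * statistics.stdev(x) / statistics.stdev(y)
```

Here `gamma0` is zero up to rounding and `gamma1` matches `scaled_slope`, in line with the derivation.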

17) The OLS slope estimator is not defined if there is no variation in the data for the explanatory variable. You are interested in estimating a regression relating earnings to years of schooling. Imagine that you had collected data on earnings for different individuals, but that all these individuals had completed a college education (16 years of education). Sketch what the data would look like and explain intuitively why the OLS coefficient does not exist in this situation.

Answer: There is no variation in X in this case, and it is therefore unreasonable to ask by how much Y would change if X changed by one unit. Regression analysis cannot figure out the answer to this question, because a change in X never happens in the sample.

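[Figure: scatterplot with Earnings on the vertical axis and Years of Education on the horizontal axis; all observations lie on a vertical line at 16 years of education.]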




18) Indicate in a scatterplot what the data for your dependent variable and your explanatory variable would look like in a regression with an $R^2$ equal to zero. How would this change if the regression $R^2$ was equal to one?

Answer:

19) Imagine that you had discovered a relationship that would generate a scatterplot very similar to the relationship $Y_i = X_i^2$, and that you would try to fit a linear regression through your data points. What do you expect the slope coefficient to be? What do you think the value of your regression $R^2$ is in this situation? What are the implications from your answers in terms of fitting a linear regression through a non-linear relationship?

Answer: You would expect the fitted regression line to be flat (slope = 0) and the regression $R^2$ to be zero in this situation. The implication is that although there may be a relationship between two variables, you may not detect it if you use the wrong functional form.

20) (Requires Appendix Material) A necessary and sufficient condition to derive the OLS estimator is that the following two conditions hold: $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} \hat{u}_i X_i = 0$. Show that these conditions imply that $\sum_{i=1}^{n} \hat{u}_i \hat{Y}_i = 0$.
