e2206finalx1

SCHOOL OF ECONOMICS

Econ 2206 Introductory Econometrics

Final Examination

Session 1, 2007

1. TIME ALLOWED - 2 Hours.

2. TOTAL NUMBER OF QUESTIONS - 6.

3. ANSWER ALL QUESTIONS.

4. ALL QUESTIONS ARE OF EQUAL VALUE (The marks awarded to each part of a question are indicated; the total for this exam is 60 marks).

5. CANDIDATES MAY BRING THEIR OWN CALCULATORS TO THE EXAM

6. STATISTICAL TABLES ARE PROVIDED AT THE END OF THE EXAM PAPER

7. ALL ANSWERS MUST BE WRITTEN IN PEN. PENCILS MAY BE USED ONLY FOR DRAWING, SKETCHING OR GRAPHICAL WORK.

ANSWER ALL SIX QUESTIONS

REMINDER: When performing statistical tests, always state the null and alternative hypotheses, the test statistic and its distribution under the null hypothesis, the level of significance and the conclusion of the test.

Question 1. (10 Marks).

(i) Suppose that the correct population regression model is:

y = β0 + β1 x1 + β2 x2 + u (1.1)

However, we have data only on y and x1, and as a consequence we estimate the following model by OLS:

y = β̂0 + β̂1 x1 + v (1.2)

In what circumstances will the OLS estimator for model (1.2):

(a) provide an unbiased estimate of the true population parameter β1? (2 marks)

(b) provide an estimate of β1 that has positive (or upward) bias ? (2 marks)

(ii) Outline the advantages of using larger samples of data in regression analysis. (2 marks)

(iii) A model used for analysing the effect of house characteristics on the sale price was:

log(price) = β0 + β1 area + β2 bdrms + β3 area × bdrms + u

where price is the house price, area is the floor area of the house (measured in square metres), and bdrms is the number of bedrooms. What is the partial effect on the predicted log(price) of increasing area by 1 square metre? (2 marks)

(iv) What is the meaning of the term “contemporaneous exogeneity” as used in the context of time series data? What is the difference between contemporaneous exogeneity and “strict exogeneity” as used in multiple regression models for time series data? (2 marks)
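For part (iii) above, the partial effect comes from differentiating the model with respect to area while holding bdrms fixed. The following is a sketch of that step, using only the symbols of the model; the Δ notation for a small change is an assumption of this illustration.

```latex
\frac{\partial \log(price)}{\partial\, area} = \beta_1 + \beta_3 \, bdrms
\quad\Longrightarrow\quad
\Delta \log(price) \approx (\beta_1 + \beta_3 \, bdrms)\, \Delta area
```

The effect of one extra square metre therefore depends on the number of bedrooms, and multiplying it by 100 gives the approximate percentage change in price.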


Question 2. (10 Marks in total)

The following regression model explains monthly wages as a function of years of education (educ), years of labour market experience (exper) and current job tenure (tenure):

log(wage) = β0 + β1 educ + β2 exper + β3 tenure + u (2.1)

With a random sample of data the following output was obtained using SHAZAM:

Welcome to SHAZAM - Version 10.0
|_sample 1 722
|_read wage educ exper tenure
4 VARIABLES AND 722 OBSERVATIONS STARTING AT OBS 1
|_genr lnwage=log(wage)

|_* Model estimates
|_ols lnwage educ exper tenure

REQUIRED MEMORY IS PAR= 81 CURRENT PAR= 2000
OLS ESTIMATION
722 OBSERVATIONS DEPENDENT VARIABLE= LNWAGE

...NOTE..SAMPLE RANGE SET TO: 1, 722

R-SQUARE = 0.1551 R-SQUARE ADJUSTED = 0.1524
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.19493
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.44151
SUM OF SQUARED ERRORS-SSE= 139.96
MEAN OF DEPENDENT VARIABLE = 6.7790
LOG OF THE LIKELIHOOD FUNCTION = -438.839

VARIABLE   ESTIMATED    STANDARD    T-RATIO          PARTIAL  STANDARDIZED  ELASTICITY
  NAME    COEFFICIENT    ERROR      718 DF  P-VALUE   CORR.   COEFFICIENT    AT MEANS
EDUC      0.74864E-01  0.6512E-02   11.50    0.000    0.353     0.3905       0.1487
EXPER     0.15328E-01  0.3370E-02   4.549    0.000    0.147     0.1592       0.0261
TENURE    0.13375E-01  0.2587E-02   5.170    0.000    0.167     0.1612       0.0143
CONSTANT  5.4967       0.1105       49.73    0.000    0.852     0.0000       0.8108

(i) What is the interpretation of the coefficient on education, β1 ? (2 marks).

(ii) Calculate the exact percentage effect of another year of education on the predicted wage level. (2 marks).

(iii) Test the null hypothesis that all the slope parameters in the model are jointly equal to zero using a 1 percent significance level. What do you conclude? (3 marks)

Note: The F-test statistic based on R2 is given by the formula:

F = \frac{(R^2_{ur} - R^2_r)/q}{(1 - R^2_{ur})/(n - k - 1)}

where q is the number of restrictions, and ur and r stand for unrestricted and restricted models, respectively.

(iv) We are interested in constructing a confidence interval for the (conditional) predicted log(wage) when educ = 13, exper = 11 and tenure = 7. To obtain the standard error for the prediction we need to estimate a transformed model that is equivalent to (2.1). Derive the transformed model which will give a direct estimate of the prediction and the standard error of the prediction. (3 marks)
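As a rough numerical check on parts (ii) and (iii) above, the sketch below plugs the SHAZAM figures into the exact-percentage formula 100·[exp(β̂1) − 1] and into the R-squared form of the F statistic. The function names are illustrative only. Treating the restricted model as intercept-only (so that its R-squared is 0) is an assumption that matches the joint null on all three slopes.

```python
import math


def exact_pct_effect(coef):
    """Exact percentage change in the level of y for a one-unit change in x
    when the dependent variable is log(y)."""
    return 100.0 * (math.exp(coef) - 1.0)


def f_stat_from_r2(r2_ur, r2_r, q, n, k):
    """R-squared form of the F statistic: q restrictions, n observations,
    k slope parameters in the unrestricted model."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / (n - k - 1))


# Figures read from the SHAZAM output for model (2.1).
beta_educ = 0.074864
r2_unrestricted = 0.1551
n_obs, k_slopes = 722, 3

pct = exact_pct_effect(beta_educ)
# Assumption: the restricted model contains only the intercept, so R2_r = 0.
f_stat = f_stat_from_r2(r2_unrestricted, 0.0, q=3, n=n_obs, k=k_slopes)

print(f"exact effect of one more year of education: {pct:.2f}%")
print(f"F statistic: {f_stat:.2f} with (3, {n_obs - k_slopes - 1}) df")
```

The F statistic would then be compared with the 1% critical value in Table 2 for 3 numerator degrees of freedom. The 718 denominator degrees of freedom fall between the 120 and ∞ rows.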


Question 3. (10 Marks in total)

We are interested in analysing the effect of different house characteristics on the market price of houses in Sydney, and consider the following regression model:

log(price) = β0 + β1 log(lotsize) + β2 log(sqrmtr) + β3 log(bdrms) + u (3.1)

where price is the sale price (measured in $1000), lotsize is the land area (square metres), sqrmtr is the floor area of the house (also measured in square metres), and bdrms is the number of bedrooms. Based on a sample of data from 2005 house sales in Sydney, the following regression estimates were obtained:

\widehat{log(price)} = 0.5481   + 0.7013 log(sqrmtr) + 0.1745 log(lotsize) + 0.0363 log(bdrms)
                       (0.3945)   (0.0823)             (0.0353)              (0.0932)

n = 108, R2 = 0.551, R̄2 = 0.538

(i) Construct a 90% confidence interval for β3 (the coefficient on log(bdrms)). Is zero within the confidence interval? (3 marks)

(ii) Given the estimation results, would you conclude that this is a good econometric model? Explain. (3 marks)

(iii) We are concerned that the model in (3.1) may be misspecified. An alternative model specification where all the variables are in level form (rather than in log form) is:

price = β0 + β1 lotsize + β2 sqrmtr + β3 bdrms + u (3.2)

Outline a procedure for testing whether model (3.1) or model (3.2) is a better specification. What are the limitations (if any) of the test? Explain. (4 marks)
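The interval in part (i) above has the usual form estimate ± critical value × standard error. The sketch below shows that arithmetic with the reported figures. The helper name is illustrative. The critical value 1.662 is an assumption: it is taken from the 90 degrees-of-freedom row of Table 1 (2-tailed 0.10 column), because the model's 108 − 3 − 1 = 104 degrees of freedom fall between the 90 and 120 rows.

```python
def confidence_interval(estimate, std_error, critical_value):
    """Two-sided interval: estimate +/- critical_value * std_error."""
    half_width = critical_value * std_error
    return estimate - half_width, estimate + half_width


# Reported estimate and standard error for the coefficient on log(bdrms).
beta3_hat = 0.0363
se_beta3 = 0.0932

df = 108 - 3 - 1  # n - k - 1
# Assumption: use the df = 90 row of Table 1 as the nearest row below df = 104.
t_crit_90 = 1.662

lower, upper = confidence_interval(beta3_hat, se_beta3, t_crit_90)
zero_inside = lower <= 0.0 <= upper

print(f"df = {df}, 90% CI = ({lower:.4f}, {upper:.4f})")
print("zero inside interval:", zero_inside)
```

Using the 120 row (1.658) instead would narrow the interval only slightly.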


Question 4. (10 Marks in total).

In a recent study an economist examined the factors explaining whether a firm was taken over by another firm during a given year. The dependent variable in the analysis was Takeover, a binary variable equal to 1 if the firm was taken over (and 0 otherwise). The explanatory variables were profit, which is the firm’s average profit rate over the previous five years; mktval, which is the market value of the firm (in $100m); and debtearn, which is the debt-to-earnings ratio. The table below presents coefficient estimates (and standard errors) based on a sample of 177 firms in 2004.

Table 4.1. Estimation Results for Takeover Models

Dependent Variable: Takeover

Variables
profit               0.251
                    (0.068)
mktval              −0.930
                    (0.287)
debtearn            −0.364
                    (0.249)
constant            −19.21
                    (4.839)
Observations (n)     177
R2                   0.233

Note: The usual OLS standard errors are in () below the coefficient estimates.

(i) What is the interpretation of the coefficient on profit? (2 marks)

(ii) What is the predicted probability of Takeover for a firm with the following characteristics: profit = 0.05, mktval = 1.5 and debtearn = 6? Briefly explain whether the result is sensible. (2 marks)

(iii) We know the Linear Probability Model must contain “heteroskedasticity”. What is heteroskedasticity and what are the consequences of heteroskedasticity for:

(a) estimation, and
(b) inference with the standard OLS procedures?

(2 marks)

(iv) Given that we know the model contains heteroskedasticity, what advice would you give an economist wishing to analyse the determinants of Takeover with regression methods? (4 marks)
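In a linear probability model the predicted probability asked for in part (ii) above is simply the fitted value of the regression. The sketch below evaluates it, taking the coefficients of Table 4.1 at face value. The dictionary layout and function name are illustrative assumptions.

```python
# Coefficient estimates as printed in Table 4.1.
coefs = {"constant": -19.21, "profit": 0.251, "mktval": -0.930, "debtearn": -0.364}

# Firm characteristics given in part (ii).
firm = {"profit": 0.05, "mktval": 1.5, "debtearn": 6.0}


def lpm_fitted_value(coefficients, characteristics):
    """Fitted value of a linear probability model: intercept plus the
    sum of coefficient * regressor value."""
    return coefficients["constant"] + sum(
        coefficients[name] * value for name, value in characteristics.items()
    )


p_hat = lpm_fitted_value(coefs, firm)
inside_unit_interval = 0.0 <= p_hat <= 1.0

print(f"fitted value: {p_hat:.4f}")
print("lies in [0, 1]:", inside_unit_interval)
```

Nothing in the linear form forces the fitted value into the unit interval, which is the feature the "is the result sensible" part of the question turns on.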


Question 5. (10 Marks in total).

The following regression model was proposed to analyse the effect of the minimum wage on employment:

log(emprte_t) = β0 + β1 log(minwg_t) + β2 log(minwg_{t−1}) + β3 log(GNP_t) + u_t (5.1)

where emprte_t is the employment rate, minwg_t is the minimum wage and GNP_t is GNP (a proxy for labour demand) in year t.

(i) What is the interpretation of the coefficient β1? (2 marks)

(ii) Is this a “static” or “dynamic” model? What is the purpose of including the lagged term minwg_{t−1}? Briefly explain. (2 marks)

Using annual data from 1950-1987, the following regression model estimates were obtained:

\widehat{log(emprte_t)} = −7.05 − 0.072 log(minwg_t) − 0.061 log(minwg_{t−1}) − 0.012 log(GNP_t) (5.2)
                          (0.77)  (0.031)              (0.015)                  (0.089)

n = 38, R2 = 0.661, R̄2 = 0.641

(iii) Test the null hypothesis that the lagged term minwg_{t−1} is insignificant using a 10 percent significance level and the one-sided alternative that the coefficient is negative (H0 : β2 = 0, H1 : β2 < 0). (2 marks)

(iv) There is not enough information in the results presented in (5.2) to construct a confidence interval for the Long Run Propensity (LRP). Rewrite the model in (5.1) into a form which gives you a direct estimate of the LRP (and the standard error of the LRP). What parameter in this transformed model corresponds to the LRP? (2 marks)

(v) I am concerned that the model in (5.2) may suffer from the “spurious regression” problem. What is the spurious regression problem and what simple adjustment to the model would help reduce the possibility of this problem? (2 marks)
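The arithmetic behind parts (iii) and (iv) above can be sketched as follows. The t statistic is the estimate divided by its standard error, and the point estimate of the LRP is the sum of the coefficients on the current and lagged minimum wage. The critical value 1.310 is an assumption: it comes from the 30 degrees-of-freedom row of Table 1 (1-tailed 0.10 column), because 38 − 3 − 1 = 34 falls between the 30 and 40 rows. Variable names are illustrative.

```python
def t_statistic(estimate, std_error, null_value=0.0):
    """t ratio for H0: parameter = null_value."""
    return (estimate - null_value) / std_error


# Estimates and standard error reported in (5.2).
beta1_hat = -0.072   # coefficient on log(minwg_t)
beta2_hat = -0.061   # coefficient on log(minwg_{t-1})
se_beta2 = 0.015

df = 38 - 3 - 1  # n - k - 1
# Assumption: df = 30 row of Table 1, 1-tailed 0.10 column.
t_crit = 1.310

t_beta2 = t_statistic(beta2_hat, se_beta2)
# One-sided test against H1: beta2 < 0, so reject when t is below -t_crit.
reject_h0 = t_beta2 < -t_crit

# Point estimate of the long run propensity; its standard error cannot be
# recovered from (5.2) alone, which is why part (iv) asks for a transformation.
lrp_hat = beta1_hat + beta2_hat

print(f"t = {t_beta2:.3f}, df = {df}, reject H0: {reject_h0}")
print(f"LRP point estimate: {lrp_hat:.3f}")
```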


Question 6. (10 Marks in total).

We are interested in analysing the effect of locating a water desalination plant on local property prices. Desalination plants are large industrial sites which can generate a lot of noise pollution and reduce amenities in the local area. The South Australian government built a desalination plant in the Adelaide area of South Beach. Discussion about building the plant began after 1994, and the plant was built and began operating in 1998. We have data on the prices of houses sold in South Beach in 1994 (the “before” period) and another sample of houses sold in 2002 (the “after” period). The hypothesis we wish to test is that the price of houses located near the site of the desalination plant would fall below the price of more distant houses.

The data for each year include the dummy variable nearplant, which is equal to one if the house is located within 3 kilometres of the desalination plant. The variable hprice denotes the real house price (scaled by $10,000). The following simple regression model was estimated using only the year 2002 sample of data:

\widehat{hprice} = 21.311 − 6.198 nearplant (6.1)
                   (0.618)  (0.992)

n = 353, R2 = 0.212

Using the 1994 sample, the following regression results were obtained:

\widehat{hprice} = 16.527 − 3.679 nearplant (6.2)
                   (0.538)  (0.615)

n = 182, R2 = 0.172

(i) What is the interpretation of the coefficient on the intercept term in model (6.2) (that is, what does the value 16.527 represent)? What is the interpretation of the coefficient on nearplant in model (6.2)? (2 marks)

(ii) Can you infer from the estimates in (6.1), based on the year 2002 data, that the location of the plant caused the price of houses located nearby to fall by an average of $61,980? Explain. (2 marks)

(iii) An alternative approach is to pool the data for both years and estimate the following model:

\widehat{hprice} = 16.527 + 4.7840 year2 − 3.679 nearplant − 2.519 year2 × nearplant (6.3)
                   (0.793)  (0.9471)       (0.876)           (1.128)

n = 535, R2 = 0.202

where year2 is a dummy variable equal to one if the observation is for the year 2002 (and equal to zero if the observation is for the year 1994).

What is the estimated effect of the plant on neighbouring house prices based on the “difference-in-difference” estimator? Is the effect significantly different from 0 at the 5% significance level? (Use the one-sided alternative hypothesis that the coefficient is negative.) (3 marks)

(iv) What, if any, would be the advantages of collecting and using panel data to evaluate the effect of the location of the desalination plant on local property prices? Explain. (3 marks)
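The difference-in-difference estimate in part (iii) above is the coefficient on the year2 × nearplant interaction. It can also be recovered as the change in the nearplant coefficient between the two single-year regressions. The sketch below checks that identity and forms the t ratio. The critical value 1.645 is an assumption: it is the ∞ row of Table 1 (1-tailed 0.05 column), used because the pooled model has 535 − 4 = 531 degrees of freedom. Variable names are illustrative.

```python
# nearplant coefficients from the single-year regressions (6.1) and (6.2).
near_2002 = -6.198
near_1994 = -3.679

# Interaction coefficient and standard error from the pooled model (6.3).
did_pooled = -2.519
se_did = 1.128

# The change in the near/far price gap between 1994 and 2002.
did_from_separate = near_2002 - near_1994

t_did = did_pooled / se_did
df = 535 - 4  # n minus the four estimated parameters
# Assumption: large-sample critical value, infinity row of Table 1.
t_crit = 1.645
# One-sided test against a negative coefficient.
reject_h0 = t_did < -t_crit

# hprice is scaled by $10,000, so convert the estimate to dollars.
effect_in_dollars = did_pooled * 10_000

print(f"DiD from separate regressions: {did_from_separate:.3f}")
print(f"t = {t_did:.3f}, df = {df}, reject H0: {reject_h0}")
print(f"estimated effect: {effect_in_dollars:,.0f} dollars")
```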


Table 1. Critical Values of the t Distribution

Significance Level
1-Tailed:   0.10     0.05     0.025    0.01     0.005
2-Tailed:   0.20     0.10     0.05     0.02     0.01

Degrees of Freedom
1           3.078    6.314    12.706   31.821   63.656
2           1.886    2.920    4.303    6.965    9.925
3           1.638    2.353    3.182    4.541    5.841
4           1.533    2.132    2.776    3.747    4.604
5           1.476    2.015    2.571    3.365    4.032
6           1.440    1.943    2.447    3.143    3.707
7           1.415    1.895    2.365    2.998    3.499
8           1.397    1.860    2.306    2.896    3.355
9           1.383    1.833    2.262    2.821    3.250
10          1.372    1.812    2.228    2.764    3.169
11          1.363    1.796    2.201    2.718    3.106
12          1.356    1.782    2.179    2.681    3.055
13          1.350    1.771    2.160    2.650    3.012
14          1.345    1.761    2.145    2.624    2.977
15          1.341    1.753    2.131    2.602    2.947
16          1.337    1.746    2.120    2.583    2.921
17          1.333    1.740    2.110    2.567    2.898
18          1.330    1.734    2.101    2.552    2.878
19          1.328    1.729    2.093    2.539    2.861
20          1.325    1.725    2.086    2.528    2.845
21          1.323    1.721    2.080    2.518    2.831
22          1.321    1.717    2.074    2.508    2.819
23          1.319    1.714    2.069    2.500    2.807
24          1.318    1.711    2.064    2.492    2.797
25          1.316    1.708    2.060    2.485    2.787
26          1.315    1.706    2.056    2.479    2.779
27          1.314    1.703    2.052    2.473    2.771
28          1.313    1.701    2.048    2.467    2.763
29          1.311    1.699    2.045    2.462    2.756
30          1.310    1.697    2.042    2.457    2.750
40          1.303    1.684    2.021    2.423    2.704
60          1.296    1.671    2.000    2.390    2.660
90          1.291    1.662    1.987    2.368    2.632
120         1.289    1.658    1.980    2.358    2.617
∞           1.282    1.645    1.960    2.326    2.576

Example: The 1% critical value for a one-tailed test with 25 df is 2.485. The 5% critical value for a two-tailed test with large (>120) df is 1.960.


Table 2. 1% Critical Values of the F Distribution

Numerator Degrees of Freedom (columns), Denominator Degrees of Freedom (rows)

        1      2      3      4      5      6      7      8      9      10
10      10.04  7.56   6.55   5.99   5.64   5.39   5.20   5.06   4.94   4.85
11      9.65   7.21   6.22   5.67   5.32   5.07   4.89   4.74   4.63   4.54
12      9.33   6.93   5.95   5.41   5.06   4.82   4.64   4.50   4.39   4.30
13      9.07   6.70   5.74   5.21   4.86   4.62   4.44   4.30   4.19   4.10
14      8.86   6.51   5.56   5.04   4.69   4.46   4.28   4.14   4.03   3.94
15      8.68   6.36   5.42   4.89   4.56   4.32   4.14   4.00   3.89   3.80
16      8.53   6.23   5.29   4.77   4.44   4.20   4.03   3.89   3.78   3.69
17      8.40   6.11   5.19   4.67   4.34   4.10   3.93   3.79   3.68   3.59
18      8.29   6.01   5.09   4.58   4.25   4.01   3.84   3.71   3.60   3.51
19      8.18   5.93   5.01   4.50   4.17   3.94   3.77   3.63   3.52   3.43
20      8.10   5.85   4.94   4.43   4.10   3.87   3.70   3.56   3.46   3.37
21      8.02   5.78   4.87   4.37   4.04   3.81   3.64   3.51   3.40   3.31
22      7.95   5.72   4.82   4.31   3.99   3.76   3.59   3.45   3.35   3.26
23      7.88   5.66   4.76   4.26   3.94   3.71   3.54   3.41   3.30   3.21
24      7.82   5.61   4.72   4.22   3.90   3.67   3.50   3.36   3.26   3.17
25      7.77   5.57   4.68   4.18   3.85   3.63   3.46   3.32   3.22   3.13
26      7.72   5.53   4.64   4.14   3.82   3.59   3.42   3.29   3.18   3.09
27      7.68   5.49   4.60   4.11   3.78   3.56   3.39   3.26   3.15   3.06
28      7.64   5.45   4.57   4.07   3.75   3.53   3.36   3.23   3.12   3.03
29      7.60   5.42   4.54   4.04   3.73   3.50   3.33   3.20   3.09   3.00
30      7.56   5.39   4.51   4.02   3.70   3.47   3.30   3.17   3.07   2.98
40      7.31   5.18   4.31   3.83   3.51   3.29   3.12   2.99   2.89   2.80
60      7.08   4.98   4.13   3.65   3.34   3.12   2.95   2.82   2.72   2.63
90      6.93   4.85   4.01   3.53   3.23   3.01   2.84   2.72   2.61   2.52
120     6.85   4.79   3.95   3.48   3.17   2.96   2.79   2.66   2.56   2.47
∞       6.63   4.61   3.78   3.32   3.02   2.80   2.64   2.51   2.41   2.32

Example: The 1% critical value for numerator df = 3 and denominator df = 60 is 4.13.
