economic modelling 200053 lecture #1 introduction & simple regression analysis em/mv/sem2/2010 1

ECONOMIC MODELLING 200053 LECTURE #1 INTRODUCTION & SIMPLE REGRESSION ANALYSIS EM/MV/Sem2/2010 1

Post on 19-Dec-2015


ECONOMIC MODELLING 200053

LECTURE #1: INTRODUCTION & SIMPLE REGRESSION ANALYSIS

EM/MV/Sem2/2010


Lecture Outline

ADMINISTRATION

INTRODUCTION TO ECONOMIC MODELLING

1. What is Economic Modelling?

2. Why study Economic Modelling?

FINANCIAL & ECONOMIC DATA

3. Types of economic & financial data

4. Obtaining data

5. Working with data: graphical methods

6. Working with data: descriptive statistics

7. Expected values and variances

SIMPLE REGRESSION ANALYSIS

EM/MV/Sem1/2011


Econometrics

Literally means “measurement in economics”

More practically it means “the application of statistical techniques to problems in economics”

In this course we focus on problems in financial economics

To many researchers, econometrics is about applying statistical techniques to analyse economic and financial data. I define the subject more broadly and argue that econometricians carry out the following tasks:

• Mathematical and statistical modelling of the economy and financial markets
• Developing and applying statistical and mathematical techniques to analyse economic and financial data


Econometric Model Building

1. Understand finance theory

2. Derive estimable model

3. Collect data

4. Estimate model

5. Evaluate estimation results

Unsatisfactory: reformulate model, collect better data, or re-estimate model using better techniques

Satisfactory: interpret model, then assess implications for theory


Likewise, most textbooks describe econometrics as having the following three major uses:

1. the description of economic reality
2. the testing of hypotheses about economic or financial theory
3. the forecasting of future economic activity.

Why Study Economic Modelling or Econometrics?

4. Econometrics is a set of research tools also employed in the business disciplines of accounting, finance, marketing and management. It is also used by social scientists, specifically researchers in history, political science and sociology. Econometrics plays an important role in such diverse fields as forestry and agricultural economics.

5. Studying econometrics fills a gap between being “a student of economics” and being “a practicing economist.”

6. By taking this introduction to econometrics you will gain an overview of what econometrics is about, and develop some “intuition” about how things work.


What is Econometrics?

In economics we express our ideas about relationships between economic variables using the mathematical concept of a function.

For example, to express a relationship between income i and consumption c, we may write

c = f(i)

The demand for an individual commodity, say the Honda Accord, might be expressed as

qd = f(p, ps, pc, i)

The quantity of Honda Accords demanded, qd, is a function of the price of Honda Accords p, the price of cars that are substitutes ps, the price of items that are complements pc, like gasoline, and the level of income i.

The supply of an agricultural commodity such as beef might be written as

qs = f(p, pc, pf)

qs is the quantity supplied, p is the price of beef, pc is the price of competitive products in production (for example, the price of hogs), and pf is the price of factors or inputs (for example, the price of corn) used in the production process.

Econometrics is about how we can use economic, business or social science theory and data, along with tools from statistics, to answer “how much” type questions.


Some Examples

A question facing Glenn Stevens is “How much should we increase the discount rate to slow inflation, and yet maintain a stable and growing economy?” The answer will depend on the responsiveness of firms and individuals to increases in the interest rates and to the effects of reduced investment on Gross Domestic Product. The key elasticities and multipliers are called parameters. The values of economic parameters are unknown and must be estimated using a sample of economic data when formulating economic policies.

Econometrics is about how to best estimate economic parameters given the data we have. “Good” econometrics is important, since errors in the estimates used by policy makers such as the RBA may lead to interest rate corrections that are too large or too small, which has consequences for all of us.

Other examples include:

A Redfern city council ponders the question of how much violent crime will be reduced if an additional million dollars is spent putting uniformed police on the street.


The owner of a local Pizza Hut franchise must decide how much advertising space to purchase in the local newspaper, and thus must estimate the relationship between advertising and sales.

The CEO of Procter & Gamble must estimate how much demand there will be in ten years for the detergent Tide, as she decides how much to invest in new plant and equipment.

A real estate developer must predict by how much the population and income of Penrith residents will increase over the next few years, and whether it will be profitable to begin construction of a bigger Penrith Plaza/Westfield.

You must decide how much of your savings will go into a stock fund and how much into the money market. This requires you to make predictions of the level of economic activity, the rate of inflation and interest rates over your planning horizon.


TYPES OF ECONOMIC & FINANCIAL DATA


Types of Economic & Financial Data

Time series data for the variable Y over sample period T collected at time t, (Yt for t = 1,…,T)

Data collected over time at identically spaced intervals

Low frequency: Annual, Quarterly, Monthly Data

For example, the 90 day Bank Bill rate per month

Quarterly CPI, inflation rate, GDP growth rate

High Frequency: Daily and Intra Day Data

For example, daily closing stock price for several years


Time Series Data

Time-series data are data arranged chronologically, usually at regular intervals

◦ Examples of Problems that Could be Tackled Using a Time Series Regression

How the value of a country’s stock index has varied with that country’s macroeconomic fundamentals.

How a company’s stock returns have varied when it announced the value of its dividend payment.

The effect on a country’s currency of an increase in its interest rate.


Types of Financial Data

Cross sectional data for the variable Y over sample N from individuals i, (Yi for i = 1,…,N)

Data collected at a point in time over a sample of individuals from the same population

For example, data on the dividend per share and dividend yields for Australian banks in 2007


Cross-section Data

Cross-sectional data are data on one or more variables collected at a single point in time

◦ Examples of Problems that Could be Tackled Using a Cross-Sectional Regression

The relationship between company size and the return to investing in its shares

The relationship between a country’s GDP level and the probability that the government will default on its sovereign debt.


Types of Financial Data

Panel data for the variable Y over sample period T collected at time t, and over sample observation N from individuals i, (Yit for i = 1,…,N and t = 1,…,T)

Data collected over time at identically spaced intervals and over the same sample of individuals from a population

For example, annual data on the dividend per share and dividend yields for the same Australian Banks for 2003 to 2007


Panel Data

Panel data have the dimensions of both time series and cross-sections

e.g. the daily prices of a number of blue chip stocks over two years.

◦ It is common to denote each observation by the letter t and the total number of observations by T for time series data,

◦ and to denote each observation by the letter i and the total number of observations by N for cross-sectional data.


Economics versus Financial Econometrics

◦ Little difference between econometrics and financial econometrics beyond emphasis

◦ Data samples
Economics-based econometrics often suffers from paucity of data
Financial economics often suffers from infoglut and signal-to-noise problems even in short data samples

◦ Time scales
Economic data releases are often regular calendar events
Financial data are likely to be real-time or tick-by-tick


Economics VS Financial Data

Economic & Financial data have some defining characteristics that shape the econometric approaches that can be applied

◦ outliers
◦ trends
◦ mean-reversion
◦ volatility clustering


Outliers


Trends


Mean Reversion (with outliers)


Volatility Clustering


Further Types of Data

These data may be collected at various levels of aggregation:

micro: data collected on individual economic decision making units such as individuals, households or firms.

macro: data resulting from a pooling or aggregating over individuals, households or firms at the local, state or national levels.

The data collected may also represent a flow or a stock:

flow: outcome measures over a period of time, such as the consumption of gasoline during the last quarter of 2004.

stock: outcome measured at a particular point in time, such as the quantity of crude oil held by CALTEX in its Sydney storage tanks on April 1, 2004.


Data may also be either Qualitative or Quantitative

Quantitative data is inherently numerical

e.g. share price is $25

Qualitative data is not inherently numerical but for some analysis it must be expressed numerically

e.g. in a survey of companies, ask if investment is financed through debt; the answer is Yes/No

A binary variable equal to 0 or 1 (Yes = 1, No = 0) is used for turning qualitative data into quantitative data


Ways of Obtaining data

Often original data is transformed to the ratio of two variables

For example:

E = company earnings

S = number of shares

Y = E / S, which is earnings per share

Gross rates

Returns

Excess returns


Ways of Obtaining data

Growth rate is the percentage change

\[ \%\ \text{change} = \frac{Y_t - Y_{t-1}}{Y_{t-1}} \times 100 \]

One period simple net return (growth)

Pt is share price at time t

Pt-1 is share price in the previous period

Rt is the periodic growth in price or simple net return over one period excluding dividend

\[ R_t = \frac{P_t - P_{t-1}}{P_{t-1}} = \frac{P_t}{P_{t-1}} - 1 \]


Obtaining data

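One period simple net return

\[ 1 + R_t = \frac{P_t}{P_{t-1}} \]

Log return

\[ r_t = \ln(1 + R_t) = \ln\frac{P_t}{P_{t-1}} = \ln P_t - \ln P_{t-1} \]

One period simple net return including dividend

\[ R_t = \frac{P_t + D_t}{P_{t-1}} - 1 = \frac{P_t - P_{t-1} + D_t}{P_{t-1}} \]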
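The return definitions on this slide can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the price and dividend figures are hypothetical, not taken from the lecture.

```python
import math

def simple_returns(prices, dividends=None):
    """One-period simple net returns R_t = (P_t + D_t) / P_{t-1} - 1.

    With no dividends this reduces to R_t = P_t / P_{t-1} - 1.
    """
    if dividends is None:
        dividends = [0.0] * len(prices)
    return [
        (prices[t] + dividends[t]) / prices[t - 1] - 1.0
        for t in range(1, len(prices))
    ]

def log_returns(prices):
    """Log returns r_t = ln(P_t) - ln(P_{t-1}) = ln(1 + R_t)."""
    return [
        math.log(prices[t]) - math.log(prices[t - 1])
        for t in range(1, len(prices))
    ]

# Hypothetical share prices for three consecutive periods.
prices = [100.0, 105.0, 102.9]
R = simple_returns(prices)                      # excluding dividends
r = log_returns(prices)                         # log returns
R_div = simple_returns(prices, [0.0, 1.05, 0.0])  # including a dividend
```

For small price changes the log return is close to the simple net return, which is one reason both are used in practice.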


Obtaining data

Excess return (ER) is important in much financial decision making

ER = return on asset – return on risk free asset

ERt = Rt - R0t

R0t, the return on the risk free asset, usually refers to the return on government bonds


Index Numbers

Financial analysts frequently work with index numbers for stock markets or for specific sectors in equities markets

If interest is in the performance of the stock market as a whole, a price index takes the weighted average of share prices of a set of companies

the S&P/ASX 100 takes the biggest 100 companies listed on the Australian Stock Exchange

The “weights” in weighted average reflect the relative importance of the individual components in the index

In equity price indices big companies have more weight than small companies where the weight is usually measured by the market capitalization of the company
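A minimal sketch of a capitalization-weighted average of share prices, following the description above. The function name, prices and share counts are hypothetical, and real index providers apply further adjustments (divisors, free-float rules) that are not modelled here.

```python
def cap_weighted_average(prices, shares_outstanding):
    """Weighted average of share prices, weights proportional to market cap."""
    # Market capitalization of each company = price x number of shares.
    caps = [p * s for p, s in zip(prices, shares_outstanding)]
    total_cap = sum(caps)
    # Each weight is the company's share of total market capitalization.
    weights = [c / total_cap for c in caps]
    level = sum(w * p for w, p in zip(weights, prices))
    return weights, level

# Three hypothetical companies.
prices = [10.0, 20.0, 5.0]
shares = [100.0, 50.0, 40.0]
weights, level = cap_weighted_average(prices, shares)
```

The smallest company (market cap 200 out of 2200) receives the smallest weight, so its price moves the average the least.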


WORKING WITH DATA


Working with data: graphical methods

Time series graph

Ex. monthly £/$ exchange rates from January 1947 to October 1996

[Figure 2.1: Time Series Graph of U.K. pound/U.S. dollar Exchange rate. Vertical axis: Pence per dollar (0 to 450); horizontal axis: Date (Jan-47 to Jan-95).]


Working with data: graphical methods

Histograms

Ex. Real GDP per capita in 1992 for 90 countries measured in $US.

[Figure 2.2: Histogram. Vertical axis: Frequency (0 to 35); horizontal axis: Bin.]


Working with data: graphical methods

Scatter diagrams

Ex. Executive compensation and profits of 70 companies

[Figure 2.3: XY-plot of executive compensation against profits. Horizontal axis: Executive compensation ($millions), 0.0 to 6.0; vertical axis: Company Profits ($millions), 0 to 3000; one point is labelled Company 43.]


Descriptive statistics

Descriptive statistics are numbers which can be used to summarize information in a data set

Measures of location (center of distribution)

Sample Mean/Average:

\[ \bar{Y} = \frac{\sum_{i=1}^{N} Y_i}{N} \]

Median: value which divides the sample in half; the value of the 50th percentile

Mode: the most common value


Descriptive statistics

Measures of dispersion (spread, variability)

Standard deviation:

\[ s = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}{N - 1}} \]

Variance: squared standard deviation, s²

Range: R = Ymax - Ymin

Percentile: the p-th percentile is the data value below which p per cent of the sample lies

e.g. the 50th percentile value must be the median
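The measures of location and dispersion above can be computed with Python's standard `statistics` module. The sample below is hypothetical; `statistics.stdev` and `statistics.variance` use the N - 1 divisor, matching the sample formula for s.

```python
import statistics

# Hypothetical sample of N = 5 observations.
y = [2.0, 4.0, 4.0, 5.0, 10.0]

mean = statistics.mean(y)          # sample mean
median = statistics.median(y)      # 50th percentile
mode = statistics.mode(y)          # most common value
std_dev = statistics.stdev(y)      # sample standard deviation (N - 1 divisor)
variance = statistics.variance(y)  # squared standard deviation
data_range = max(y) - min(y)       # R = Ymax - Ymin
```

Here the mean (5.0) exceeds the median (4.0) because the single large observation pulls the average upwards.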


Formulae for expectation and moment conditions.

1st Moment - Mean of a r.v.

\[ E[X] = \sum_{x} x f(x) = \sum_{i=1}^{n} x_i f(x_i) \]

2nd Moment - Variance

\[ VAR[X] = E\left[ (X - E[X])^2 \right] \]

\[ VAR[X] = E[X^2] - (E[X])^2 \]


Formulae for expectation and moment conditions.

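3rd Moment - Skewness of a random variable

\[ SKEW[X] = \frac{E[X^3]}{\left( E[X^2] \right)^{3/2}} \]

4th Moment - Kurtosis of a random Variable

\[ KURT[X] = \frac{E[X^4]}{\left( E[X^2] \right)^{2}} \]

Both ratios are standardized moments, so X here is measured as a deviation from its mean E[X].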
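A minimal sketch of sample skewness and kurtosis, assuming the moments are taken about the sample mean and divided by N (one common convention; statistical packages may apply small-sample corrections). The function names and data are hypothetical.

```python
def central_moment(x, k):
    """k-th sample moment about the mean, using a divisor of N."""
    m = sum(x) / len(x)
    return sum((xi - m) ** k for xi in x) / len(x)

def skewness(x):
    """Third central moment divided by the variance to the power 3/2."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5

def kurtosis(x):
    """Fourth central moment divided by the squared variance."""
    return central_moment(x, 4) / central_moment(x, 2) ** 2

symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]   # symmetric around zero
right_tailed = [1.0, 1.0, 1.0, 10.0]      # one large value in the right tail

skew_sym = skewness(symmetric)
kurt_sym = kurtosis(symmetric)
skew_right = skewness(right_tailed)
```

The symmetric sample has zero skewness, while the sample with a long right tail has positive skewness.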


Normal Distribution

Effect of different means

Effect of different standard deviations


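Skewness in the normal Distribution (Green)

The median of the green function is approximated by the mean of a normal distribution.

When mean < median: negative skew

When mean > median: positive skew

Skew of a normal distribution (yellow) = 0

[Figure: skewed densities (green) compared with a normal density (yellow), with the positions of the median and the mean marked.]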


Kurtosis in the Normal distribution

Kurtosis is the degree of peakedness of a distribution.

A normal distribution is a mesokurtic distribution.

A pure leptokurtic distribution has a higher peak than the normal distribution and has heavier tails: Kurt[X] > 3

A pure platykurtic distribution has a lower peak than a normal distribution and lighter tails: Kurt[X] < 3

Mesokurtic: kurtosis = 3, excess kurtosis = Kurt[X] - 3 = 0


Basic data analysis

Summary statistics

Average level of variable: mean, median and mode

Variability around this central tendency: standard deviation, variance and range

Distribution of data: skewness and kurtosis

Number of observations and number of missing values


Summary of Introduction

Economic and Financial data come in many forms

Common types are time series, cross section and panel data

Graphical techniques are useful ways of summarizing the information in a data set

Time series graphs, histograms and scatter XY-plots

Many numerical summaries can be used

The mean, a measure of the location of a distribution

The standard deviation (and variance), measures of how spread out or dispersed the data are


The Simple Regression Model

THE THEORY

y = b0 + b1x + u


Deterministic VS Stochastic Model

Deterministic: y = b0 + b1x, e.g. y = 100 + 2x

Stochastic: y = b0 + b1x + u, e.g. y = 200 + 2x + (50% chance +10; 50% chance -15)


Some Terminology

In the simple linear regression model, where y = b0 + b1x +u

we typically refer to y as the

◦ Dependent Variable, or
◦ Left-Hand Side Variable, or
◦ Explained Variable, or
◦ Regressand


Some Terminology, cont.

In the simple linear regression of y on x, we typically refer to x as the

◦ Independent Variable, or
◦ Right-Hand Side Variable, or
◦ Explanatory Variable, or
◦ Regressor, or
◦ Covariate, or
◦ Control Variable


Why stochastic?

y = b0 + b1x +u

u measures:

◦ Unpredictable element of randomness of human responses
◦ Effect of large number of omitted variables
◦ Measurement error in y


Ordinary Least Squares

Basic idea of regression is to estimate the population parameters from a sample

Let {(xi,yi): i=1, …,n} denote a random sample of size n from the population

For each observation in this sample, it will be the case that

yi = b0 + b1xi + ui


[Figure: Population regression line, sample data points and the associated error terms. The line E(y|x) = b0 + b1x is drawn against x; the sample points (x1, y1), ..., (x4, y4) lie off the line by the errors u1, ..., u4.]


Deriving OLS Estimates

To derive the OLS estimates we need to realize that our main assumption of E(u|x) = E(u) = 0 also implies that

Cov(x,u) = E(xu) = 0

Why? Remember from basic probability that Cov(X,Y) = E(XY) – E(X)E(Y)
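One way to see this step, assuming the law of iterated expectations (which the slide does not state explicitly), is:

```latex
\[ E(xu) = E\left[ E(xu \mid x) \right] = E\left[ x\,E(u \mid x) \right] = E[x \cdot 0] = 0 \]
\[ Cov(x,u) = E(xu) - E(x)E(u) = 0 - E(x) \cdot 0 = 0 \]
```

So the zero conditional mean assumption delivers both E(u) = 0 and E(xu) = 0, the two facts used in the derivation.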


Deriving OLS continued

We can write our 2 restrictions just in terms of x, y, b0 and b1 , since u = y – b0 – b1x

E(y - b0 - b1x) = 0

E[x(y - b0 - b1x)] = 0

These are called moment restrictions


Deriving OLS using M.O.M.

The method of moments approach to estimation implies imposing the population moment restrictions on the sample moments

What does this mean? Recall that for E(X), the mean of a population distribution, a sample estimator of E(X) is simply the arithmetic mean of the sample


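More Derivation of OLS

We want to choose values of the parameters that will ensure that the sample versions of our moment restrictions are true

The sample versions are as follows:

\[ n^{-1} \sum_{i=1}^{n} \left( y_i - \hat{b}_0 - \hat{b}_1 x_i \right) = 0 \]

\[ n^{-1} \sum_{i=1}^{n} x_i \left( y_i - \hat{b}_0 - \hat{b}_1 x_i \right) = 0 \]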


More Derivation of OLS

Given the definition of a sample mean, and properties of summation, we can rewrite the first condition as follows

\[ \bar{y} = \hat{b}_0 + \hat{b}_1 \bar{x}, \quad \text{or} \quad \hat{b}_0 = \bar{y} - \hat{b}_1 \bar{x} \]


More Derivation of OLS

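\[ \sum_{i=1}^{n} x_i \left( y_i - \left( \bar{y} - \hat{b}_1 \bar{x} \right) - \hat{b}_1 x_i \right) = 0 \]

\[ \sum_{i=1}^{n} x_i \left( y_i - \bar{y} \right) = \hat{b}_1 \sum_{i=1}^{n} x_i \left( x_i - \bar{x} \right) \]

\[ \sum_{i=1}^{n} \left( x_i - \bar{x} \right) \left( y_i - \bar{y} \right) = \hat{b}_1 \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2 \]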


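So the OLS estimated slope is

\[ \hat{b}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \quad \text{provided that } \sum_{i=1}^{n} (x_i - \bar{x})^2 > 0 \]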
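The slope and intercept formulas can be sketched directly in Python: the slope is the sum of cross-products of deviations divided by the sum of squared deviations of x, and the intercept follows from the sample means. The function name `ols_simple` and the data are hypothetical.

```python
def ols_simple(x, y):
    """OLS intercept and slope for the simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of (x_i - x_bar)(y_i - y_bar).
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: sum of (x_i - x_bar)^2, which must be positive.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must vary in the sample")
    b1_hat = s_xy / s_xx
    b0_hat = y_bar - b1_hat * x_bar
    return b0_hat, b1_hat

# Hypothetical sample of five observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.2, 4.1, 6.2, 7.9, 10.1]
b0_hat, b1_hat = ols_simple(x, y)

# Points lying exactly on y = 1 + 2x are recovered without error.
exact_b0, exact_b1 = ols_simple([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
```

The `ValueError` branch corresponds to the proviso on the slide: the slope is undefined if x does not vary in the sample.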


Summary of OLS slope estimate

The slope estimate is the sample covariance between x and y divided by the sample variance of x

If x and y are positively correlated, the slope will be positive

If x and y are negatively correlated, the slope will be negative

Only need x to vary in our sample


More OLS

Intuitively, OLS is fitting a line through the sample points such that the sum of squared residuals is as small as possible, hence the term least squares

The residual, û, is an estimate of the error term, u, and is the difference between the fitted line (sample regression function) and the sample point


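[Figure: Sample regression line, sample data points and the associated estimated error terms. The fitted line \( \hat{y} = \hat{b}_0 + \hat{b}_1 x \) is drawn against x; the sample points (x1, y1), ..., (x4, y4) lie off the line by the residuals û1, ..., û4.]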


Alternate approach to derivation

Given the intuitive idea of fitting a line, we can set up a formal minimization problem

That is, we want to choose our parameters such that we minimize the following:

\[ \sum_{i=1}^{n} \left( \hat{u}_i \right)^2 = \sum_{i=1}^{n} \left( y_i - \hat{b}_0 - \hat{b}_1 x_i \right)^2 \]


Alternate approach, continued

If one uses calculus to solve the minimization problem for the two parameters you obtain the following first order conditions, which are the same as we obtained before, multiplied by n

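\[ \sum_{i=1}^{n} \left( y_i - \hat{b}_0 - \hat{b}_1 x_i \right) = 0 \]

\[ \sum_{i=1}^{n} x_i \left( y_i - \hat{b}_0 - \hat{b}_1 x_i \right) = 0 \]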
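As a sketch of the calculus step, write the sum of squared residuals as Q (the name Q is introduced here only for illustration) and differentiate with respect to each parameter:

```latex
\[ Q(\hat{b}_0, \hat{b}_1) = \sum_{i=1}^{n} \left( y_i - \hat{b}_0 - \hat{b}_1 x_i \right)^2 \]
\[ \frac{\partial Q}{\partial \hat{b}_0} = -2 \sum_{i=1}^{n} \left( y_i - \hat{b}_0 - \hat{b}_1 x_i \right) = 0 \]
\[ \frac{\partial Q}{\partial \hat{b}_1} = -2 \sum_{i=1}^{n} x_i \left( y_i - \hat{b}_0 - \hat{b}_1 x_i \right) = 0 \]
```

Dividing each condition by -2 gives the two first order conditions, which match the sample moment conditions multiplied by n.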


Algebraic Properties of OLS

The sum of the OLS residuals is zero

Thus, the sample average of the OLS residuals is zero as well

The sample covariance between the regressors and the OLS residuals is zero

The OLS regression line always goes through the mean of the sample


Algebraic Properties (precise)

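\[ \sum_{i=1}^{n} \hat{u}_i = 0 \quad \text{and thus,} \quad \frac{\sum_{i=1}^{n} \hat{u}_i}{n} = 0 \]

\[ \sum_{i=1}^{n} x_i \hat{u}_i = 0 \]

\[ \bar{y} = \hat{b}_0 + \hat{b}_1 \bar{x} \]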
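These three algebraic properties can be checked numerically. The sketch below uses a small hypothetical sample and recomputes the OLS estimates from the slope and intercept formulas; the properties hold up to floating-point rounding.

```python
# Hypothetical sample of five observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.2, 4.1, 6.2, 7.9, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
b1_hat = (
    sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    / sum((xi - x_bar) ** 2 for xi in x)
)
b0_hat = y_bar - b1_hat * x_bar

# Residuals: observed y minus fitted value.
residuals = [yi - b0_hat - b1_hat * xi for xi, yi in zip(x, y)]

sum_residuals = sum(residuals)                                 # property 1
sum_x_residuals = sum(xi * ui for xi, ui in zip(x, residuals)) # property 2
fitted_at_x_bar = b0_hat + b1_hat * x_bar                      # property 3
```

`fitted_at_x_bar` equals `y_bar`, which is the statement that the regression line passes through the point of sample means.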


More terminology

We can think of each observation as being made up of an explained part, and an unexplained part,

\[ y_i = \hat{y}_i + \hat{u}_i \]

We then define the following:

\[ \sum (y_i - \bar{y})^2 \ \text{is the total sum of squares (SST)} \]

\[ \sum (\hat{y}_i - \bar{y})^2 \ \text{is the explained sum of squares (SSE)} \]

\[ \sum \hat{u}_i^2 \ \text{is the residual sum of squares (SSR)} \]

Then SST = SSE + SSR


Proof that SST = SSE + SSR

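\[ \sum (y_i - \bar{y})^2 = \sum \left[ (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y}) \right]^2 \]

\[ = \sum \left[ \hat{u}_i + (\hat{y}_i - \bar{y}) \right]^2 \]

\[ = \sum \hat{u}_i^2 + 2 \sum \hat{u}_i (\hat{y}_i - \bar{y}) + \sum (\hat{y}_i - \bar{y})^2 \]

\[ = SSR + 2 \sum \hat{u}_i (\hat{y}_i - \bar{y}) + SSE \]

and we know that

\[ \sum \hat{u}_i (\hat{y}_i - \bar{y}) = 0 \]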


ASSUMPTIONS OF OLS


A Simple Assumption

The average value of u, the error term, in the population is 0. That is,

E(u) = 0

This is not a restrictive assumption, since we can always use b0 to normalize E(u) to 0


Zero Conditional Mean

We need to make a crucial assumption about how u and x are related

We want it to be the case that knowing something about x does not give us any information about u, so that they are completely unrelated. That is, that

E(u|x) = E(u) = 0, which implies E(y|x) = b0 + b1x


[Figure: E(y|x) as a linear function of x, where for any x the distribution of y is centered about E(y|x). The densities f(y) at x1 and x2 are drawn around the line E(y|x) = b0 + b1x.]


Unbiasedness of OLS

Assume the population model is linear in parameters as y = b0 + b1x + u

Assume we can use a random sample of size n, {(xi, yi): i=1, 2, …, n}, from the population model. Thus we can write the sample model yi = b0 + b1xi + ui

Assume E(u|x) = 0 and thus E(ui|xi) = 0

Assume there is variation in the xi


Unbiasedness of OLS (cont)

In order to think about unbiasedness, we need to rewrite our estimator in terms of the population parameter

Start with a simple rewrite of the formula as

\[ \hat{b}_1 = \frac{\sum (x_i - \bar{x})\, y_i}{s_x^2}, \quad \text{where } s_x^2 \equiv \sum (x_i - \bar{x})^2 \]


Unbiasedness of OLS (cont)

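\[ \sum (x_i - \bar{x})\, y_i = \sum (x_i - \bar{x}) (b_0 + b_1 x_i + u_i) \]

\[ = \sum (x_i - \bar{x})\, b_0 + \sum (x_i - \bar{x})\, b_1 x_i + \sum (x_i - \bar{x})\, u_i \]

\[ = b_0 \sum (x_i - \bar{x}) + b_1 \sum (x_i - \bar{x})\, x_i + \sum (x_i - \bar{x})\, u_i \]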


Unbiasedness of OLS (cont)

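\[ \sum (x_i - \bar{x}) = 0, \qquad \sum (x_i - \bar{x})\, x_i = \sum (x_i - \bar{x})^2 \]

so, the numerator can be rewritten as

\[ b_1 s_x^2 + \sum (x_i - \bar{x})\, u_i, \]

and thus

\[ \hat{b}_1 = b_1 + \frac{\sum (x_i - \bar{x})\, u_i}{s_x^2} \]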


Unbiasedness of OLS (cont)

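let \( d_i = (x_i - \bar{x}) \), so that

\[ \hat{b}_1 = b_1 + \left( \frac{1}{s_x^2} \right) \sum d_i u_i, \]

then

\[ E(\hat{b}_1) = b_1 + \left( \frac{1}{s_x^2} \right) \sum d_i E(u_i) = b_1 \]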
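Unbiasedness is a statement about repeated samples, so a small Monte Carlo experiment can illustrate it. The sketch below is an illustration under assumed values: the true parameters, the fixed x values, the normal errors and the number of replications are all hypothetical choices, not part of the lecture.

```python
import random

def ols_slope(x, y):
    """OLS slope: sum of cross-deviations over sum of squared x deviations."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx

random.seed(0)  # fixed seed so the experiment is reproducible

TRUE_B0, TRUE_B1, SIGMA = 1.0, 2.0, 1.0   # assumed population values
x = [float(i) for i in range(1, 21)]      # x held fixed across samples

estimates = []
for _ in range(2000):
    # Draw a fresh sample of errors with E(u|x) = 0, then build y.
    y = [TRUE_B0 + TRUE_B1 * xi + random.gauss(0.0, SIGMA) for xi in x]
    estimates.append(ols_slope(x, y))

mean_b1_hat = sum(estimates) / len(estimates)
```

Individual estimates differ from 2.0, but their average across samples is very close to it, which is what unbiasedness describes.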


Unbiasedness Summary

The OLS estimates of b1 and b0 are unbiased

Proof of unbiasedness depends on our 4 assumptions – if any assumption fails, then OLS is not necessarily unbiased

Remember unbiasedness is a description of the estimator – in a given sample we may be “near” or “far” from the true parameter


Variance of the OLS Estimators

Now we know that the sampling distribution of our estimate is centered around the true parameter

Want to think about how spread out this distribution is

Much easier to think about this variance under an additional assumption, so

Assume Var(u|x) = σ² (Homoskedasticity)


Variance of OLS (cont)

Var(u|x) = E(u²|x) - [E(u|x)]²

E(u|x) = 0, so σ² = E(u²|x) = E(u²) = Var(u)

Thus σ² is also the unconditional variance, called the error variance

σ, the square root of the error variance, is called the standard deviation of the error

Can say: E(y|x) = b0 + b1x and Var(y|x) = σ²


[Figure: Homoskedastic Case. The conditional densities f(y|x) at x1 and x2 have the same spread around the line E(y|x) = b0 + b1x.]


[Figure: Heteroskedastic Case. The spread of the conditional densities f(y|x) differs across x1, x2 and x3 around the line E(y|x) = b0 + b1x.]


Variance of OLS (cont)

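With \( d_i = (x_i - \bar{x}) \), \( s_x^2 = \sum (x_i - \bar{x})^2 \) and error variance \( \sigma^2 = Var(u|x) \):

\[ Var(\hat{b}_1) = Var\left( b_1 + \left( \frac{1}{s_x^2} \right) \sum d_i u_i \right) = \left( \frac{1}{s_x^2} \right)^2 Var\left( \sum d_i u_i \right) \]

\[ = \left( \frac{1}{s_x^2} \right)^2 \sum d_i^2\, Var(u_i) = \left( \frac{1}{s_x^2} \right)^2 \sum d_i^2 \sigma^2 \]

\[ = \sigma^2 \left( \frac{1}{s_x^2} \right)^2 \sum d_i^2 = \sigma^2 \left( \frac{1}{s_x^2} \right)^2 s_x^2 = \frac{\sigma^2}{s_x^2} \]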
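The result that the slope variance equals the error variance divided by the sum of squared deviations of x can be compared with a simulation. This is a sketch under assumed values: the parameters, the fixed x values, homoskedastic normal errors and the number of replications are hypothetical choices.

```python
import random

random.seed(1)  # fixed seed so the experiment is reproducible

TRUE_B0, TRUE_B1, SIGMA = 1.0, 2.0, 1.0   # assumed population values
x = [float(i) for i in range(1, 21)]      # x held fixed across samples
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)  # sum of squared x deviations

estimates = []
for _ in range(4000):
    # Homoskedastic errors: the same standard deviation for every x.
    y = [TRUE_B0 + TRUE_B1 * xi + random.gauss(0.0, SIGMA) for xi in x]
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    estimates.append(s_xy / s_xx)

mean_est = sum(estimates) / len(estimates)
# Sample variance of the slope estimates across replications.
empirical_var = sum((b - mean_est) ** 2 for b in estimates) / (len(estimates) - 1)
# Variance implied by the formula: error variance over s_xx.
theoretical_var = SIGMA ** 2 / s_xx
```

Increasing `SIGMA` raises both variances, while spreading the x values further apart lowers them, in line with the summary that follows.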


Variance of OLS Summary

The larger the error variance, $\sigma^2$, the larger the variance of the slope estimate

The larger the variability in the $x_i$, the smaller the variance of the slope estimate

As a result, a larger sample size should decrease the variance of the slope estimate

Problem: the error variance is unknown
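The three summary points can be checked directly from the formula Var(slope estimate) = sigma^2 / sum((x_i - xbar)^2). The sketch below assumes the error variance is known, which it is not in practice, and the function name `slope_variance` and the three small `x` designs are made up for illustration.

```python
def slope_variance(sigma2, x):
    """Variance of the OLS slope under homoskedasticity: sigma^2 / sum((x_i - xbar)^2)."""
    x_bar = sum(x) / len(x)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return sigma2 / s_xx


# Hypothetical regressor designs
narrow = [4.0, 5.0, 6.0, 4.0, 5.0, 6.0]    # little variability in x
wide = [0.0, 5.0, 10.0, 0.0, 5.0, 10.0]    # same sample size, more variability in x
doubled = wide + wide                       # same spread, twice as many observations

print("narrow x, sigma^2 = 1:", slope_variance(1.0, narrow))
print("narrow x, sigma^2 = 4:", slope_variance(4.0, narrow))   # larger error variance
print("wide x,   sigma^2 = 1:", slope_variance(1.0, wide))     # more variability in x
print("doubled,  sigma^2 = 1:", slope_variance(1.0, doubled))  # larger sample
```

Raising the error variance raises the slope variance, spreading out the x values lowers it, and adding observations lowers it further because the sum of squared deviations in the denominator keeps growing.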


Estimating the Error Variance

We don’t know what the error variance, $\sigma^2$, is, because we don’t observe the errors, $u_i$

What we observe are the residuals, $\hat{u}_i$

We can use the residuals to form an estimate of the error variance


Error Variance Estimate (cont)

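$$\begin{aligned}
\hat{u}_i &= y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \\
&= (\beta_0 + \beta_1 x_i + u_i) - \hat{\beta}_0 - \hat{\beta}_1 x_i \\
&= u_i - (\hat{\beta}_0 - \beta_0) - (\hat{\beta}_1 - \beta_1) x_i
\end{aligned}$$

Then, an unbiased estimator of $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{1}{n-2} \sum \hat{u}_i^2 = \frac{SSR}{n-2}$$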


Error Variance Estimate (cont)

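$\hat{\sigma} = \sqrt{\hat{\sigma}^2}$ = standard error of the regression

Recall that $\mathrm{sd}(\hat{\beta}_1) = \sigma / s_x$

If we substitute $\hat{\sigma}$ for $\sigma$, then we have the standard error of $\hat{\beta}_1$:

$$\mathrm{se}(\hat{\beta}_1) = \hat{\sigma} \Big/ \Bigl(\sum (x_i - \bar{x})^2\Bigr)^{1/2}$$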
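The steps from residuals to the standard error of the slope can be sketched numerically. The five data points and the function name `ols_with_se` below are invented for illustration; the formulas are the ones on the slides (SSR divided by n - 2, then the estimated error standard deviation divided by the square root of the sum of squared deviations of x).

```python
import math


def ols_with_se(x, y):
    """Fit y = b0 + b1*x by OLS; return (b0, b1, sigma2_hat, se_b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    # Residuals: observed y minus fitted y
    residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    ssr = sum(r ** 2 for r in residuals)
    # Unbiased estimator of the error variance uses n - 2 degrees of freedom
    sigma2_hat = ssr / (n - 2)
    # Standard error of the slope: sigma_hat / sqrt(sum((x_i - xbar)^2))
    se_b1 = math.sqrt(sigma2_hat) / math.sqrt(s_xx)
    return b0, b1, sigma2_hat, se_b1


# Made-up sample of five observations
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1, sigma2_hat, se_b1 = ols_with_se(x, y)
print("intercept:", round(b0, 4))
print("slope:", round(b1, 4))
print("estimated error variance:", round(sigma2_hat, 4))
print("standard error of the slope:", round(se_b1, 4))
```

For these points the slope estimate is 0.6 and SSR is 2.4, so the estimated error variance is 2.4/3 = 0.8. Dividing by n - 2 rather than n reflects the two parameters estimated before the residuals could be formed.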


Goodness-of-Fit

How do we think about how well our sample regression line fits our sample data?

Can compute the fraction of the total sum of squares (SST) that is explained by the model; call this the R-squared of the regression

$R^2 = SSE/SST = 1 - SSR/SST$, where SSE is the explained sum of squares and SSR is the residual sum of squares
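A small numerical check that the two expressions for R-squared agree, using SST for the total, SSE for the explained and SSR for the residual sum of squares. The data and the function name `r_squared` are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def r_squared(x, y):
    """Return (sst, sse, ssr, r2) for the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation in y
    sse = sum((fi - y_bar) ** 2 for fi in fitted)            # variation explained by the line
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # variation left in the residuals
    return sst, sse, ssr, sse / sst


# Made-up sample of five observations
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
sst, sse, ssr, r2 = r_squared(x, y)
print("SST, SSE, SSR:", round(sst, 4), round(sse, 4), round(ssr, 4))
print("R-squared as SSE/SST:", round(r2, 4))
print("R-squared as 1 - SSR/SST:", round(1 - ssr / sst, 4))
```

Here SST = 6, SSE = 3.6 and SSR = 2.4, so both routes give an R-squared of 0.6: the fitted line accounts for 60% of the sample variation in y.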

Economics 20 - Prof. Anderson

Using Stata for OLS regressions

Now that we’ve derived the formula for calculating the OLS estimates of our parameters, you’ll be happy to know you don’t have to compute them by hand

Regressions in Stata are very simple: to run the regression of y on x, just type

reg y x


The Simple Regression Model

THE APPLICATION

$y = \beta_0 + \beta_1 x + u$


Interpreting OLS estimates

$Y = \hat{\alpha} + \hat{\beta} X + \hat{u}$

Interpretation of $\hat{\alpha}$, the estimated intercept

Estimated value of Y = $\hat{\alpha}$ if X = 0

This is often not of interest

Ex. X = lot size, Y = house price

$\hat{\alpha}$ = estimated value of a house with lot size = 0


Interpreting OLS estimates

$Y = \hat{\alpha} + \hat{\beta} X + \hat{u}$

Interpretation of $\hat{\beta}$ as the marginal effect

Differentiate the regression model: $\frac{dY}{dX} = \hat{\beta}$

Derivatives measure how much Y tends to change when X is changed by a marginal amount

If X changes by 1 unit then Y tends to change by $\hat{\beta}$ units

where “units” refers to the units in which the variables are measured (e.g. $, £, %, etc.)


Interpreting OLS estimates

Ex. Executive compensation and profit

$Y = \hat{\alpha} + \hat{\beta} X + \hat{u}$

Y = executive compensation (millions of $) = dependent variable

X = profits (millions of $) = explanatory variable

From the example, $\hat{\beta}$ = 0.000842

Since $\hat{\beta}$ is positive, X and Y are positively correlated

Interpretation of $\hat{\beta}$

If profits increase by $1 million, then executive compensation will tend to increase by 0.000842 million dollars (i.e. $842)
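The unit conversion in this example is easy to get wrong, so here is the arithmetic as a tiny sketch. The slope value 0.000842 comes from the slide; the function name `predicted_change` is a hypothetical helper.

```python
def predicted_change(beta_hat, delta_x):
    """Predicted change in Y (in Y's units) when X changes by delta_x (in X's units)."""
    return beta_hat * delta_x


beta_hat = 0.000842   # slope from the slide; Y and X are both in millions of dollars
change_millions = predicted_change(beta_hat, delta_x=1.0)   # profits rise by $1 million
change_dollars = change_millions * 1_000_000                 # convert millions to dollars
print(f"{change_millions} million dollars = ${change_dollars:,.0f}")
```

Because both variables are measured in millions of dollars, the slope is a pure ratio: each extra dollar of profit is associated with about 0.0842 cents of extra compensation.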


Fitted Values

Estimated Regression Line: $Y = \hat{\alpha} + \hat{\beta} X + \hat{u}$

The residual: $\hat{u} = Y - \hat{\alpha} - \hat{\beta} X$

Fitted value of dependent variable is $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$

The difference between actual and fitted values of Y: $\hat{u}_i = Y_i - \hat{Y}_i$ (another way to express residual)

Good fitting models have small residuals (i.e. small SSR)

If residual is big for one observation (relative to other observations) then it is an outlier

Looking at fitted values and residuals can be very informative
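Fitted values and residuals are simple to compute once the line is estimated. In the sketch below the data are invented, with the fifth observation deliberately placed well off the line, and the helper names `fit_line` and `fitted_and_residuals` are hypothetical. Picking out the largest absolute residual is just one informal way of spotting a candidate outlier, not a formal test.

```python
def fit_line(x, y):
    """Return (alpha_hat, beta_hat) from the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    beta_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
                / sum((xi - x_bar) ** 2 for xi in x))
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def fitted_and_residuals(x, y):
    """Return fitted values and residuals (actual minus fitted) for each observation."""
    alpha_hat, beta_hat = fit_line(x, y)
    fitted = [alpha_hat + beta_hat * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, residuals


# Made-up data: the fifth observation (index 4) is far from the pattern of the others
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 20.0, 12.1]
fitted, residuals = fitted_and_residuals(x, y)

# Index of the observation with the largest residual in absolute value
largest = max(range(len(x)), key=lambda i: abs(residuals[i]))
for i, (fi, ri) in enumerate(zip(fitted, residuals)):
    flag = "  <- largest residual" if i == largest else ""
    print(f"obs {i + 1}: fitted = {fi:7.3f}, residual = {ri:7.3f}{flag}")
```

The one unusual point also pulls the fitted line towards itself, which is why the other residuals are mostly negative. That is the kind of pattern that inspecting residuals can reveal.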


$R^2$: A measure of fit

How well does the regression line fit the data points?

$R^2$ measures the proportion of the total variation explained by the regression equation

TSS = the variation in the dependent variable about its mean value

TSS = RSS + SSR

TSS = Total sum of squares, $TSS = \sum (Y_i - \bar{Y})^2$

RSS = Regression sum of squares, $RSS = \sum (\hat{Y}_i - \bar{Y})^2$

SSR = Sum of squared residuals, $SSR = \sum \hat{u}_i^2$

$$R^2 = 1 - \frac{SSR}{TSS} = \frac{RSS}{TSS}$$
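The identity TSS = RSS + SSR is stated without proof. A short sketch of why it holds, writing $\hat{u}_i = Y_i - \hat{Y}_i$ for the residual and using the standard OLS first-order conditions $\sum \hat{u}_i = 0$ and $\sum X_i \hat{u}_i = 0$ (a textbook result assumed here, not shown on the slide):

$$\begin{aligned}
TSS &= \sum (Y_i - \bar{Y})^2 = \sum \bigl[(\hat{Y}_i - \bar{Y}) + \hat{u}_i\bigr]^2 \\
&= \sum (\hat{Y}_i - \bar{Y})^2 + \sum \hat{u}_i^2 + 2 \sum (\hat{Y}_i - \bar{Y}) \hat{u}_i \\
&= RSS + SSR + 2 \sum (\hat{Y}_i - \bar{Y}) \hat{u}_i
\end{aligned}$$

The cross term vanishes because $\hat{Y}_i - \bar{Y}$ is a linear function of $X_i$ and the residuals sum to zero both on their own and when weighted by $X_i$. This requires the regression to include an intercept.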


AGAIN, WELCOME TO ECONOMIC MODELLING!

You are enrolled in this unit, so you need to start reading.

See you next week.
