Post on 19-Dec-2015
ECONOMIC MODELLING 200053
LECTURE #1: INTRODUCTION & SIMPLE REGRESSION ANALYSIS
EM/MV/Sem2/2010
Lecture Outline
ADMINISTRATION
INTRODUCTION TO ECONOMIC MODELLING
1. What is Economic Modelling?
2. Why study Economic Modelling?
FINANCIAL & ECONOMIC DATA
3. Types of economic & financial data
4. Obtaining data
5. Working with data: graphical methods
6. Working with data: descriptive statistics
7. Expected values and variances
SIMPLE REGRESSION ANALYSIS
EM/MV/Sem1/2011
Econometrics
Literally means “measurement in economics”
More practically it means “the application of statistical techniques to problems in economics”
In this course we focus on problems in financial economics
To many researchers, econometrics is about applying statistical techniques to analyse economic and financial data. I define the subject more broadly and argue that econometricians typically carry out the following tasks:
• Mathematical and statistical modelling of the economy and financial markets
• Developing and applying statistical and mathematical techniques to analyse economic and financial data
Econometric Model Building
1. Understand finance theory
2. Derive estimable model
3. Collect data
4. Estimate model
5. Evaluate estimation results
   Unsatisfactory: reformulate model, collect better data, or re-estimate model using better techniques
   Satisfactory: interpret model, then assess implications for theory
Why Study Economic Modelling or Econometrics?
Most textbooks describe econometrics as having three major uses:
1. the description of economic reality
2. the testing of hypotheses about economic or financial theory
3. the forecasting of future economic activity.
4. Econometrics is a set of research tools also employed in the business disciplines of accounting, finance, marketing and management. It is also used by social scientists, specifically researchers in history, political science and sociology. Econometrics plays an important role in such diverse fields as forestry and agricultural economics.
5. Studying econometrics fills a gap between being “a student of economics” and being “a practicing economist.”
6. By taking this introduction to econometrics you will gain an overview of what econometrics is about, and develop some “intuition” about how things work.
What is Econometrics?
In economics we express our ideas about relationships between economic variables using the mathematical concept of a function.
For example, to express a relationship between income i and consumption c, we may write
$$c = f(i)$$
The demand for an individual commodity, say the Honda Accord, might be expressed as
$$q^d = f(p, p^s, p^c, i)$$
The quantity of Honda Accords demanded, qd, is a function of the price of Honda Accords p, the price of cars that are substitutes ps, the price of items that are complements pc, like gasoline, and the level of income i.
The supply of an agricultural commodity such as beef might be written as
$$q^s = f(p, p^c, p^f)$$
qs is the quantity supplied, p is the price of beef, pc is the price of competitive products in production (for example, the price of hogs), and pf is the price of factors or inputs (for example, the price of corn) used in the production process.
Econometrics is about how we can use economic, business or social science theory and data, along with tools from statistics, to answer “how much” type questions.
Some Examples
A question facing Glenn Stevens is “How much should we increase the discount rate to slow inflation, and yet maintain a stable and growing economy?” The answer will depend on the responsiveness of firms and individuals to increases in interest rates and on the effects of reduced investment on Gross Domestic Product. The key elasticities and multipliers are called parameters. The values of economic parameters are unknown and must be estimated using a sample of economic data when formulating economic policies.
Econometrics is about how to best estimate economic parameters given the data we have. “Good” econometrics is important, since errors in the estimates used by policy makers such as the RBA may lead to interest rate corrections that are too large or too small, which has consequences for all of us.
Other examples include:
A Redfern city council ponders the question of how much violent crime will be reduced if an additional million dollars is spent putting uniformed police on the street.
The owner of a local Pizza Hut franchise must decide how much advertising space to purchase in the local newspaper, and thus must estimate the relationship between advertising and sales.
The CEO of Procter & Gamble must estimate how much demand there will be in ten years for the detergent Tide, as she decides how much to invest in new plant and equipment.
A real estate developer must predict by how much the population and income of Penrith residents will increase over the next few years, and whether it will be profitable to begin construction of a bigger Penrith Plaza/Westfield.
You must decide how much of your savings will go into a stock fund and how much into the money market. This requires you to make predictions of the level of economic activity, the rate of inflation and interest rates over your planning horizon.
TYPES OF ECONOMIC & FINANCIAL DATA
Types of Economic & Financial Data
Time series data for the variable Y over sample period T collected at time t, (Yt for t = 1,…,T)
Data collected over time at identically spaced intervals
Low frequency: annual, quarterly, monthly data
For example, the 90-day Bank Bill rate per month, or quarterly CPI, inflation rate and GDP growth rate
High frequency: daily and intra-day data
For example, daily closing stock price for several years
Time Series Data
Time-series data are data arranged chronologically, usually at regular intervals
◦ Examples of Problems that Could be Tackled Using a Time Series Regression
How the value of a country’s stock index has varied with that country’s macroeconomic fundamentals.
How a company’s stock returns have varied when it announced the value of its dividend payment.
The effect on a country’s currency of an increase in its interest rate
Types of Financial Data
Cross sectional data for the variable Y over sample N from individuals i, (Yi for i = 1,…,N)
Data collected at a point in time over a sample of individuals from the same population
For example, data on the dividend per share and dividend yields for Australian banks in 2007
Cross-section Data
Cross-sectional data are data on one or more variables collected at a single point in time
◦ Examples of Problems that Could be Tackled Using a Cross-Sectional Regression
The relationship between company size and the return to investing in its shares
The relationship between a country’s GDP level and the probability that the government will default on its sovereign debt.
Types of Financial Data
Panel data for the variable Y over sample period T collected at time t, and over sample observation N from individuals i, (Yit for i = 1,…,N and t = 1,…,T)
Data collected over time at identically spaced intervals and over the same sample of individuals from a population
For example, annual data on the dividend per share and dividend yields for the same Australian Banks for 2003 to 2007
Panel Data
Panel data has the dimensions of both time series and cross-sections
e.g. the daily prices of a number of blue chip stocks over two years.
◦ It is common to denote each observation by the letter t and the total number of observations by T for time series data,
◦ and to denote each observation by the letter i and the total number of observations by N for cross-sectional data.
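The three data types can be illustrated with a small sketch. The bank names and dividend figures below are invented purely for illustration; the point is only that a panel Y_it contains a time series for each individual i and a cross-section for each date t.

```python
# Hypothetical dividend-per-share panel Y[i][t]: individuals i are banks, t are years.
panel = {
    "Bank A": {2006: 1.10, 2007: 1.25},
    "Bank B": {2006: 0.90, 2007: 0.95},
}

# Time series: one individual followed over time (Y_t for t = 1, ..., T)
time_series = panel["Bank A"]

# Cross-section: all individuals at a single point in time (Y_i for i = 1, ..., N)
cross_section = {bank: obs[2007] for bank, obs in panel.items()}

N = len(panel)        # number of individuals
T = len(time_series)  # number of time periods
print(N, T, cross_section)
```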
Economics versus Financial Econometrics
◦ Little difference between econometrics and financial econometrics beyond emphasis
◦ Data samples
   Economics-based econometrics often suffers from a paucity of data
   Financial economics often suffers from infoglut and signal-to-noise problems even in short data samples
◦ Time scales
   Economic data releases are often regular calendar events
   Financial data are likely to be real-time or tick-by-tick
Economics VS Financial Data
Economic & financial data have some defining characteristics that shape the econometric approaches that can be applied:
◦ outliers
◦ trends
◦ mean-reversion
◦ volatility clustering
[Figures: four example series illustrating Outliers, Trends, Mean Reversion (with outliers), and Volatility Clustering]
Further Types of Data
These data may be collected at various levels of aggregation:
micro: data collected on individual economic decision making units such as individuals, households or firms.
macro: data resulting from a pooling or aggregating over individuals, households or firms at the local, state or national levels.
The data collected may also represent a flow or a stock:
flow: outcome measured over a period of time, such as the consumption of gasoline during the last quarter of 2004.
stock: outcome measured at a particular point in time, such as the quantity of crude oil held by CALTEX in its Sydney storage tanks on April 1, 2004.
Data may also be qualitative or quantitative
Quantitative data is inherently numerical
e.g. share price is $25
Qualitative data is not inherently numerical, but for some analysis it must be expressed numerically
e.g. in a survey of companies, ask if investment is financed through debt; the answer is Yes/No
A binary variable, equal to 0 or 1 (Yes = 1, No = 0), is used for turning qualitative data into quantitative data
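A minimal sketch of the Yes/No coding described above. The survey answers are hypothetical; the only technique shown is mapping Yes to 1 and No to 0.

```python
# Hypothetical answers to "Was the investment financed through debt?"
answers = ["Yes", "No", "Yes", "Yes", "No"]

# Binary variable: Yes -> 1, No -> 0
debt_financed = [1 if a == "Yes" else 0 for a in answers]

# The mean of a 0/1 variable is the share of "Yes" answers in the sample
share_yes = sum(debt_financed) / len(debt_financed)
print(debt_financed, share_yes)
```

Once coded this way, the variable can be summarised or used in a regression like any other numerical series.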
Ways of Obtaining data
Often original data is transformed to the ratio of two variables
For example:
E = company earnings
S = number of shares
Y = E / S, which is earnings per share
Growth rates
Returns
Excess returns
Ways of Obtaining data
Growth rate is the percentage change
$$\%\ \text{change} = \frac{Y_t - Y_{t-1}}{Y_{t-1}} \times 100$$
One period simple net return (growth)
Pt is share price at time t
Pt-1 is share price in the previous period
Rt is the periodic growth in price or simple net return over one period excluding dividend
$$R_t = \frac{P_t}{P_{t-1}} - 1 = \frac{P_t - P_{t-1}}{P_{t-1}}$$
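The percentage-change formula can be sketched as follows. The function name `pct_change` and the price series are illustrative assumptions, not part of the lecture.

```python
def pct_change(series):
    """Percentage change between consecutive observations:
    100 * (Y_t - Y_{t-1}) / Y_{t-1}."""
    return [100.0 * (cur - prev) / prev for prev, cur in zip(series, series[1:])]

prices = [100.0, 105.0, 102.9]   # hypothetical share prices P_0, P_1, P_2
growth = pct_change(prices)      # one value per period after the first
print(growth)
```

Dividing each entry by 100 gives the simple net return Rt over the same period.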
Obtaining data
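One period simple net return, written in gross form (1 + Rt is the simple gross return)
$$1 + R_t = \frac{P_t}{P_{t-1}}$$
Log return
$$r_t = \ln(1 + R_t) = \ln\frac{P_t}{P_{t-1}} = \ln P_t - \ln P_{t-1}$$
One period simple net return including dividend Dt
$$R_t = \frac{P_t + D_t}{P_{t-1}} - 1 = \frac{P_t + D_t - P_{t-1}}{P_{t-1}}$$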
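As a rough sketch of these return definitions, the helper names `simple_return` and `log_return` and the prices below are assumptions made for illustration only.

```python
import math

def simple_return(p_prev, p_now, dividend=0.0):
    """One-period simple net return, R_t = (P_t + D_t) / P_{t-1} - 1."""
    return (p_now + dividend) / p_prev - 1.0

def log_return(p_prev, p_now):
    """Log return, r_t = ln(P_t) - ln(P_{t-1}) = ln(1 + R_t)."""
    return math.log(p_now) - math.log(p_prev)

R = simple_return(50.0, 52.0)             # price moves from 50 to 52, no dividend
R_div = simple_return(50.0, 52.0, 1.0)    # same move plus a dividend of 1
r = log_return(50.0, 52.0)
print(R, R_div, r)
```

For small price changes the log return is close to, and slightly below, the simple net return.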
Obtaining data
Excess return (ER) is important in much financial decision making
ER = return on asset – return on risk-free asset
ERt = Rt – R0t
R0t, the return on the risk-free asset, usually refers to the return on government bonds
Index Numbers
Frequently financial analysts work with index numbers for stock markets or specific sectors in equities markets, for example if interest is in the performance of the stock market as a whole
A price index takes the weighted average of share prices of a set of companies
e.g. the S&P/ASX 100 takes the biggest 100 companies listed on the Australian Stock Exchange
The “weights” in weighted average reflect the relative importance of the individual components in the index
In equity price indices big companies have more weight than small companies where the weight is usually measured by the market capitalization of the company
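The idea of weighting by market capitalization can be sketched as below. The prices and capitalizations are made up, and real index providers use more involved rules (divisors, free-float adjustments) that are not modelled here.

```python
def cap_weighted_average(prices, market_caps):
    """Weighted average price, with weights proportional to market capitalization."""
    total_cap = sum(market_caps)
    weights = [cap / total_cap for cap in market_caps]
    return sum(w * p for w, p in zip(weights, prices))

prices = [10.0, 20.0, 40.0]       # hypothetical share prices
caps = [600.0, 300.0, 100.0]      # hypothetical market capitalizations

index_level = cap_weighted_average(prices, caps)
equal_weighted = sum(prices) / len(prices)
print(index_level, equal_weighted)
```

Because the largest company here has the lowest share price, the cap-weighted average sits well below the equal-weighted one.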
WORKING WITH DATA
Working with data: graphical methods
Time series graph
Ex. monthly £/$ exchange rates from January 1947 to October 1996
Figure 2.1: Time Series Graph of U.K. pound/U.S. dollar Exchange rate (vertical axis: pence per dollar, 0 to 450; horizontal axis: date, from January 1947)
Working with data: graphical methods
Histograms
Ex. Real GDP per capita in 1992 for 90 countries measured in $US.
Figure 2.2: Histogram (vertical axis: frequency, 0 to 35; horizontal axis: bin)
Working with data: graphical methods
Scatter diagrams
Ex. Executive compensation and profits of 70 companies
Figure 2.3: XY-plot of executive compensation against profits (horizontal axis: executive compensation ($millions), 0 to 6; vertical axis: company profits ($millions), 0 to 3000; the point for Company 43 is labelled)
Descriptive statistics
Descriptive statistics are numbers which can be used to summarize information in a data set
Measures of location (center of distribution)
Sample Mean/Average:
$$\bar{Y} = \frac{\sum_{i=1}^{N} Y_i}{N}$$
Median: value which divides the sample in half; the value of the 50th percentile
Mode: the most common value
Descriptive statistics
Measures of dispersion (spread, variability)
Standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)^2}{N-1}}$$
Variance: squared standard deviation, s²
Range: R = Ymax – Ymin
Percentile: the data value below which a given percentage of the sample lies
e.g. the 50th percentile value must be the median
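These measures of location and dispersion can be computed with Python's standard `statistics` module, as in the sketch below. The sample values are hypothetical; `statistics.stdev` and `statistics.variance` divide by N − 1, matching the formula for s.

```python
import statistics

y = [2.0, 4.0, 4.0, 5.0, 10.0]      # hypothetical sample, N = 5

mean = statistics.mean(y)            # sum of Y_i divided by N
median = statistics.median(y)        # 50th percentile
mode = statistics.mode(y)            # most common value
s = statistics.stdev(y)              # sample standard deviation (N - 1 divisor)
variance = statistics.variance(y)    # s squared
value_range = max(y) - min(y)        # Ymax - Ymin
print(mean, median, mode, s, variance, value_range)
```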
Formulae for expectation and moment conditions.
1st Moment – Mean of a r.v.
$$E[X] = \sum_{x} x f(x) = \sum_{i=1}^{n} x_i f(x_i)$$
2nd Moment – Variance
$$VAR[X] = E\left[(X - E[X])^2\right]$$
$$VAR[X] = E[X^2] - (E[X])^2$$
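Formulae for expectation and moment conditions.
3rd Moment – Skewness of a random variable
$$SKEW[X] = \frac{E[X^3]}{\left(E[X^2]\right)^{3/2}}$$
4th Moment – Kurtosis of a random variable
$$KURT[X] = \frac{E[X^4]}{\left(E[X^2]\right)^{2}}$$
In both formulae X is measured as a deviation from its mean, so that E[X²] is the variance.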
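A sample analogue of the skewness and kurtosis formulae, sketched under the assumption that population expectations are replaced by sample averages of deviations from the sample mean. The function names and the two data sets are illustrative.

```python
def central_moment(data, k):
    """k-th sample central moment: the average of (x - mean)^k."""
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)

def skewness(data):
    """Third central moment divided by the variance to the power 3/2."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    """Fourth central moment divided by the squared variance."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_tailed = [0.0, 0.0, 0.0, 1.0, 9.0]   # mean 2 exceeds median 0

print(skewness(symmetric), kurtosis(symmetric), skewness(right_tailed))
```

The symmetric sample has zero skewness, while the sample with a long right tail (mean above median) has positive skewness.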
Normal Distribution
Effect of different means
Effect of different standard deviations
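Skewness
[Figure: a skewed density (green) drawn against a normal density (yellow), with the median and mean of each marked]
The median of the green (skewed) density is approximately the mean of the normal density.
When mean > median the distribution is typically positively (right) skewed
When mean < median the distribution is typically negatively (left) skewed
Skew of a normal distribution (yellow) = 0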
Kurtosis in the Normal distribution
Kurtosis is the degree of peakedness of a distribution.
A normal distribution is a mesokurtic distribution: kurtosis = 3, so excess kurtosis = Kurt[X] − 3 = 0.
A pure leptokurtic distribution has a higher peak than the normal distribution and has heavier tails: Kurt[X] > 3.
A pure platykurtic distribution has a lower peak than a normal distribution and lighter tails: Kurt[X] < 3.
Basic data analysis
Summary statistics
Average level of variable
Mean, median and mode
Variability around this central tendency
Standard deviation, variance and range
Distribution of data
Skewness and kurtosis
Number of observations and number of missing values
Summary of Introduction
Economic and financial data come in many forms
   Common types are time series, cross section and panel data
Graphical techniques are useful ways of summarizing the information in a data set
   Time series graphs, histograms and scatter (XY) plots
Many numerical summaries can be used
   The mean, a measure of the location of a distribution
   The standard deviation (and variance), measures of how spread out or dispersed a distribution is
The Simple Regression Model
THE THEORY
y = b0 + b1x + u
Deterministic VS Stochastic Model
Deterministic: y = b0 + b1x
   e.g. y = 100 + 2x
Stochastic: y = b0 + b1x + u
   e.g. y = 200 + 2x + u, where u = +10 with 50% chance and u = −15 with 50% chance
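The contrast between the two example models can be sketched by simulation. The function names, the value x = 5 and the seed are arbitrary choices for illustration.

```python
import random

def deterministic(x):
    """y = 100 + 2x: the same x always gives the same y."""
    return 100 + 2 * x

def stochastic(x, rng):
    """y = 200 + 2x + u, where u is +10 or -15 with equal probability."""
    u = 10 if rng.random() < 0.5 else -15
    return 200 + 2 * x + u

rng = random.Random(0)                             # seeded so the run is repeatable
draws = [stochastic(5, rng) for _ in range(6)]     # same x, repeated draws of u
print(deterministic(5), draws)
```

With x held fixed, the deterministic model returns a single value, while the stochastic model returns either 220 or 195 depending on the draw of u.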
Some Terminology
In the simple linear regression model, where y = b0 + b1x + u, we typically refer to y as the
◦ Dependent Variable, or
◦ Left-Hand Side Variable, or
◦ Explained Variable, or
◦ Regressand
Some Terminology, cont.
In the simple linear regression of y on x, we typically refer to x as the
◦ Independent Variable, or
◦ Right-Hand Side Variable, or
◦ Explanatory Variable, or
◦ Regressor, or
◦ Covariate, or
◦ Control Variable
Why stochastic?
y = b0 + b1x +u
u measures:
◦ the unpredictable element of randomness in human responses
◦ the effect of a large number of omitted variables
◦ measurement error in y
Ordinary Least Squares
Basic idea of regression is to estimate the population parameters from a sample
Let {(xi,yi): i=1, …,n} denote a random sample of size n from the population
For each observation in this sample, it will be the case that
yi = b0 + b1xi + ui
[Figure: population regression line E(y|x) = b0 + b1x, with sample data points (x1, y1), ..., (x4, y4) and the associated error terms u1, ..., u4]
Deriving OLS Estimates
To derive the OLS estimates we need to realize that our main assumption of E(u|x) = E(u) = 0 also implies that
Cov(x,u) = E(xu) = 0
Why? Remember from basic probability that Cov(X,Y) = E(XY) – E(X)E(Y)
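The step can be written out in full. This is a short sketch that uses the law of iterated expectations, which the slide does not name explicitly:

$$
\begin{aligned}
E(xu) &= E\left[E(xu \mid x)\right] = E\left[x\,E(u \mid x)\right] = E[x \cdot 0] = 0,\\
Cov(x,u) &= E(xu) - E(x)E(u) = 0 - E(x)\cdot 0 = 0.
\end{aligned}
$$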
Deriving OLS continued
We can write our 2 restrictions just in terms of x, y, b0 and b1 , since u = y – b0 – b1x
E(y – b0 – b1x) = 0
E[x(y – b0 – b1x)] = 0
These are called moment restrictions
Deriving OLS using M.O.M.
The method of moments approach to estimation implies imposing the population moment restrictions on the sample moments
What does this mean? Recall that for E(X), the mean of a population distribution, a sample estimator of E(X) is simply the arithmetic mean of the sample
More Derivation of OLS
We want to choose values of the parameters that will ensure that the sample versions of our moment restrictions are true
The sample versions are as follows:
$$n^{-1}\sum_{i=1}^{n}\left(y_i - \hat{b}_0 - \hat{b}_1 x_i\right) = 0$$
$$n^{-1}\sum_{i=1}^{n} x_i\left(y_i - \hat{b}_0 - \hat{b}_1 x_i\right) = 0$$
More Derivation of OLS
Given the definition of a sample mean, and properties of summation, we can rewrite the first condition as follows
$$\bar{y} = \hat{b}_0 + \hat{b}_1\bar{x},$$
or
$$\hat{b}_0 = \bar{y} - \hat{b}_1\bar{x}$$
More Derivation of OLS
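Substituting this expression for the intercept into the second condition gives
$$\sum_{i=1}^{n} x_i\left(y_i - \left(\bar{y} - \hat{b}_1\bar{x}\right) - \hat{b}_1 x_i\right) = 0$$
$$\sum_{i=1}^{n} x_i\left(y_i - \bar{y}\right) = \hat{b}_1\sum_{i=1}^{n} x_i\left(x_i - \bar{x}\right)$$
$$\sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right) = \hat{b}_1\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$
The last line uses the summation properties that the sum of x_i(y_i − ȳ) equals the sum of (x_i − x̄)(y_i − ȳ), and the sum of x_i(x_i − x̄) equals the sum of (x_i − x̄)².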
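So the OLS estimated slope is
$$\hat{b}_1 = \frac{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}, \quad \text{provided that } \sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 > 0$$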
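The slope and intercept formulae can be computed directly, as in the following sketch. The function name `ols_fit` and the five data points are hypothetical.

```python
def ols_fit(x, y):
    """Return (b0_hat, b1_hat) for the simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        # The slope is only defined if x varies in the sample
        raise ValueError("x must vary in the sample")
    b1_hat = sxy / sxx
    b0_hat = y_bar - b1_hat * x_bar
    return b0_hat, b1_hat

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical data
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0_hat, b1_hat = ols_fit(x, y)
print(b0_hat, b1_hat)
```

For these data the sum of cross-deviations is 6 and the sum of squared deviations of x is 10, so the slope is 0.6 and the intercept is 4 − 0.6 × 3 = 2.2.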
Summary of OLS slope estimate
The slope estimate is the sample covariance between x and y divided by the sample variance of x
If x and y are positively correlated, the slope will be positive
If x and y are negatively correlated, the slope will be negative
Only need x to vary in our sample
More OLS
Intuitively, OLS is fitting a line through the sample points such that the sum of squared residuals is as small as possible, hence the term least squares
The residual, û, is an estimate of the error term, u, and is the difference between the fitted line (sample regression function) and the sample point
[Figure: sample regression line $\hat{y} = \hat{b}_0 + \hat{b}_1 x$, with sample data points (x1, y1), ..., (x4, y4) and the associated estimated error terms (residuals) û1, ..., û4]
Alternate approach to derivation
Given the intuitive idea of fitting a line, we can set up a formal minimization problem
That is, we want to choose our parameters such that we minimize the following:
$$\sum_{i=1}^{n}\hat{u}_i^2 = \sum_{i=1}^{n}\left(y_i - \hat{b}_0 - \hat{b}_1 x_i\right)^2$$
Alternate approach, continued
If one uses calculus to solve the minimization problem for the two parameters you obtain the following first order conditions, which are the same as we obtained before, multiplied by n
$$\sum_{i=1}^{n}\left(y_i - \hat{b}_0 - \hat{b}_1 x_i\right) = 0$$
$$\sum_{i=1}^{n} x_i\left(y_i - \hat{b}_0 - \hat{b}_1 x_i\right) = 0$$
Algebraic Properties of OLS
The sum of the OLS residuals is zero
Thus, the sample average of the OLS residuals is zero as well
The sample covariance between the regressors and the OLS residuals is zero
The OLS regression line always goes through the mean of the sample
Algebraic Properties (precise)
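$$\sum_{i=1}^{n}\hat{u}_i = 0 \quad \text{and thus,} \quad \frac{\sum_{i=1}^{n}\hat{u}_i}{n} = 0$$
$$\sum_{i=1}^{n} x_i\hat{u}_i = 0$$
$$\bar{y} = \hat{b}_0 + \hat{b}_1\bar{x}$$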
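These three properties can be checked numerically. The sketch below uses a small hypothetical data set and recomputes the OLS estimates from the slope and intercept formulae.

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical data
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# OLS estimates
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1_hat = s_xy / s_xx
b0_hat = y_bar - b1_hat * x_bar

# Residuals u_hat_i = y_i - b0_hat - b1_hat * x_i
residuals = [yi - b0_hat - b1_hat * xi for xi, yi in zip(x, y)]

sum_resid = sum(residuals)                                   # property 1: zero
sum_x_resid = sum(xi * ui for xi, ui in zip(x, residuals))   # property 2: zero
line_at_mean = b0_hat + b1_hat * x_bar                       # property 3: equals y_bar
print(sum_resid, sum_x_resid, line_at_mean, y_bar)
```

Up to floating-point rounding, both sums are zero and the fitted line passes through the point of sample means.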
More terminology
We can think of each observation as being made up of an explained part, and an unexplained part,
$$y_i = \hat{y}_i + \hat{u}_i$$
We then define the following:
$\sum\left(y_i - \bar{y}\right)^2$ is the total sum of squares (SST)
$\sum\left(\hat{y}_i - \bar{y}\right)^2$ is the explained sum of squares (SSE)
$\sum\hat{u}_i^2$ is the residual sum of squares (SSR)
Then SST = SSE + SSR
Proof that SST = SSE + SSR
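$$
\begin{aligned}
\sum\left(y_i - \bar{y}\right)^2 &= \sum\left[\left(y_i - \hat{y}_i\right) + \left(\hat{y}_i - \bar{y}\right)\right]^2 \\
&= \sum\left[\hat{u}_i + \left(\hat{y}_i - \bar{y}\right)\right]^2 \\
&= \sum\hat{u}_i^2 + 2\sum\hat{u}_i\left(\hat{y}_i - \bar{y}\right) + \sum\left(\hat{y}_i - \bar{y}\right)^2 \\
&= \text{SSR} + 2\sum\hat{u}_i\left(\hat{y}_i - \bar{y}\right) + \text{SSE}
\end{aligned}
$$
and we know that $\sum\hat{u}_i\left(\hat{y}_i - \bar{y}\right) = 0$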
ASSUMPTIONS OF OLS
A Simple Assumption
The average value of u, the error term, in the population is 0. That is,
E(u) = 0
This is not a restrictive assumption, since we can always use b0 to normalize E(u) to 0
Zero Conditional Mean
We need to make a crucial assumption about how u and x are related
We want it to be the case that knowing something about x does not give us any information about u, so that they are completely unrelated. That is, that
E(u|x) = E(u) = 0, which implies E(y|x) = b0 + b1x
[Figure: E(y|x) = b0 + b1x as a linear function of x, where for any x (shown at x1 and x2) the distribution of y, f(y), is centered about E(y|x)]
Unbiasedness of OLS
Assume the population model is linear in parameters as y = b0 + b1x + u
Assume we can use a random sample of size n, {(xi, yi): i=1, 2, …, n}, from the population model. Thus we can write the sample model yi = b0 + b1xi + ui
Assume E(u|x) = 0 and thus E(ui|xi) = 0
Assume there is variation in the xi
Unbiasedness of OLS (cont)
In order to think about unbiasedness, we need to rewrite our estimator in terms of the population parameter
Start with a simple rewrite of the formula as
$$\hat{b}_1 = \frac{\sum\left(x_i - \bar{x}\right)y_i}{s_x^2}, \quad \text{where } s_x^2 \equiv \sum\left(x_i - \bar{x}\right)^2$$
Unbiasedness of OLS (cont)
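$$
\begin{aligned}
\sum\left(x_i - \bar{x}\right)y_i &= \sum\left(x_i - \bar{x}\right)\left(b_0 + b_1 x_i + u_i\right) \\
&= \sum\left(x_i - \bar{x}\right)b_0 + \sum\left(x_i - \bar{x}\right)b_1 x_i + \sum\left(x_i - \bar{x}\right)u_i \\
&= b_0\sum\left(x_i - \bar{x}\right) + b_1\sum\left(x_i - \bar{x}\right)x_i + \sum\left(x_i - \bar{x}\right)u_i
\end{aligned}
$$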
Unbiasedness of OLS (cont)
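$$\sum\left(x_i - \bar{x}\right) = 0, \qquad \sum\left(x_i - \bar{x}\right)x_i = \sum\left(x_i - \bar{x}\right)^2$$
so, the numerator can be rewritten as
$$b_1 s_x^2 + \sum\left(x_i - \bar{x}\right)u_i,$$
and thus
$$\hat{b}_1 = b_1 + \frac{\sum\left(x_i - \bar{x}\right)u_i}{s_x^2}$$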
Unbiasedness of OLS (cont)
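let $d_i = \left(x_i - \bar{x}\right)$, so that
$$\hat{b}_1 = b_1 + \left(\frac{1}{s_x^2}\right)\sum d_i u_i,$$
then
$$E\left(\hat{b}_1\right) = b_1 + \left(\frac{1}{s_x^2}\right)\sum d_i E\left(u_i\right) = b_1$$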
Unbiasedness Summary
The OLS estimates of b1 and b0 are unbiased
Proof of unbiasedness depends on our 4 assumptions – if any assumption fails, then OLS is not necessarily unbiased
Remember unbiasedness is a description of the estimator – in a given sample we may be “near” or “far” from the true parameter
Variance of the OLS Estimators
Now we know that the sampling distribution of our estimate is centered around the true parameter
Want to think about how spread out this distribution is
Much easier to think about this variance under an additional assumption, so
Assume Var(u|x) = σ² (Homoskedasticity)
Variance of OLS (cont)
Var(u|x) = E(u²|x) − [E(u|x)]²
E(u|x) = 0, so σ² = E(u²|x) = E(u²) = Var(u)
Thus σ² is also the unconditional variance, called the error variance
σ, the square root of the error variance, is called the standard deviation of the error
Can say: E(y|x) = b0 + b1x and Var(y|x) = σ²
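Homoskedastic Case
[Figure: conditional densities f(y|x) at x1 and x2, centered on E(y|x) = b0 + b1x, with the same spread at each x]
Heteroskedastic Case
[Figure: conditional densities f(y|x) at x1, x2 and x3, centered on E(y|x) = b0 + b1x, with a spread that differs across x]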
Variance of OLS (cont)
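$$
\begin{aligned}
Var\left(\hat{b}_1\right) &= Var\left(b_1 + \left(\frac{1}{s_x^2}\right)\sum d_i u_i\right) = \left(\frac{1}{s_x^2}\right)^2 Var\left(\sum d_i u_i\right) \\
&= \left(\frac{1}{s_x^2}\right)^2\sum d_i^2\,Var\left(u_i\right) = \left(\frac{1}{s_x^2}\right)^2\sum d_i^2\,\sigma^2 \\
&= \sigma^2\left(\frac{1}{s_x^2}\right)^2\sum d_i^2 = \sigma^2\left(\frac{1}{s_x^2}\right)^2 s_x^2 = \frac{\sigma^2}{s_x^2}
\end{aligned}
$$
where $d_i = x_i - \bar{x}$, $s_x^2 = \sum d_i^2$ and $\sigma^2 = Var(u|x)$ is the error variance.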
Variance of OLS Summary
The larger the error variance, σ², the larger the variance of the slope estimate
The larger the variability in the xi, the smaller the variance of the slope estimate
As a result, a larger sample size should decrease the variance of the slope estimate
Problem that the error variance is unknown
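A small Monte Carlo sketch can make the unbiasedness and variance results concrete. Everything here is an assumption chosen for illustration: the true parameters, normally distributed errors, x values held fixed across replications, the seed and the number of replications.

```python
import random

def ols_slope(x, y):
    """OLS slope: sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    den = sum((xi - x_bar) ** 2 for xi in x)
    return num / den

rng = random.Random(42)              # fixed seed so the run is repeatable
b0, b1, sigma = 1.0, 2.0, 3.0        # hypothetical true parameters
x = [float(i) for i in range(1, 21)]
x_bar = sum(x) / len(x)
s_x2 = sum((xi - x_bar) ** 2 for xi in x)

# Draw many samples from y = b0 + b1*x + u and re-estimate the slope each time
slopes = []
for _ in range(2000):
    y = [b0 + b1 * xi + rng.gauss(0.0, sigma) for xi in x]
    slopes.append(ols_slope(x, y))

mean_slope = sum(slopes) / len(slopes)
var_slope = sum((s - mean_slope) ** 2 for s in slopes) / (len(slopes) - 1)
theoretical_var = sigma ** 2 / s_x2  # error variance divided by s_x^2
print(mean_slope, var_slope, theoretical_var)
```

Across replications the average slope estimate should sit close to the true slope, and the spread of the estimates should be close to the error variance divided by the sum of squared deviations of x.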
Estimating the Error Variance
We don’t know what the error variance, σ², is, because we don’t observe the errors, ui
What we observe are the residuals, ûi
We can use the residuals to form an estimate of the error variance
Error Variance Estimate (cont)
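$$
\begin{aligned}
\hat{u}_i &= y_i - \hat{b}_0 - \hat{b}_1 x_i \\
&= \left(b_0 + b_1 x_i + u_i\right) - \hat{b}_0 - \hat{b}_1 x_i \\
&= u_i - \left(\hat{b}_0 - b_0\right) - \left(\hat{b}_1 - b_1\right)x_i
\end{aligned}
$$
Then, an unbiased estimator of the error variance $\sigma^2$ is
$$\hat{\sigma}^2 = \frac{1}{n-2}\sum\hat{u}_i^2 = \frac{\text{SSR}}{n-2}$$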
Error Variance Estimate (cont)
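$$\hat{\sigma} = \sqrt{\hat{\sigma}^2} = \text{standard error of the regression}$$
recall that $sd\left(\hat{b}_1\right) = \sigma / s_x$, where $s_x = \sqrt{s_x^2}$
if we substitute $\hat{\sigma}$ for $\sigma$ then we have the standard error of $\hat{b}_1$,
$$se\left(\hat{b}_1\right) = \hat{\sigma} \Big/ \left(\sum\left(x_i - \bar{x}\right)^2\right)^{1/2}$$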
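As a sketch of how the error variance estimate and the standard error of the slope are computed, the following uses a small hypothetical data set; the variable names are illustrative.

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical data
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# OLS estimates
s_x2 = sum((xi - x_bar) ** 2 for xi in x)
b1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_x2
b0_hat = y_bar - b1_hat * x_bar

# Residual sum of squares and the unbiased error variance estimate SSR / (n - 2)
ssr = sum((yi - b0_hat - b1_hat * xi) ** 2 for xi, yi in zip(x, y))
sigma2_hat = ssr / (n - 2)
sigma_hat = math.sqrt(sigma2_hat)       # standard error of the regression

# Standard error of the slope estimate
se_b1 = sigma_hat / math.sqrt(s_x2)
print(sigma2_hat, sigma_hat, se_b1)
```

Here SSR = 2.4 with n = 5, so the error variance estimate is 2.4 / 3 = 0.8.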
Goodness-of-Fit
How do we think about how well our sample regression line fits our sample data?
Can compute the fraction of the total sum of squares (SST) that is explained by the model, call this the R-squared of regression
R2 = SSE/SST = 1 – SSR/SST
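The decomposition and the R-squared can be verified on a small example. The data below are hypothetical, and the names follow the SST/SSE/SSR convention of these slides.

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical data
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# OLS estimates and fitted values
b1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
b0_hat = y_bar - b1_hat * x_bar
fitted = [b0_hat + b1_hat * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                   # total sum of squares
sse = sum((fi - y_bar) ** 2 for fi in fitted)              # explained sum of squares
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))     # residual sum of squares

r_squared = sse / sst
print(sst, sse, ssr, r_squared)
```

For these data SST = 6, SSE = 3.6 and SSR = 2.4, so the regression explains 60% of the sample variation in y.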
Economics 20 - Prof. Anderson 84
Using Stata for OLS regressions
Now that we’ve derived the formula for calculating the OLS estimates of our parameters, you’ll be happy to know you don’t have to compute them by hand
Regressions in Stata are very simple: to run the regression of y on x, just type
reg y x
The Simple Regression Model
THE APPLICATION
y = b0 + b1x + u
Interpreting OLS estimates
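Interpretation of $\hat{\alpha}$ in the estimated model
$$Y = \hat{\alpha} + \hat{\beta}X + \hat{u}$$
Here $\hat{\alpha}$ is the intercept estimate and $\hat{\beta}$ the slope estimate, the counterparts of the estimates of b0 and b1 in the theory slides
Estimated value of Y = $\hat{\alpha}$ if X = 0
This is often not of interest
Ex. X = lot size, Y = house price
$\hat{\alpha}$ = estimated value of a house with lot size = 0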
Interpreting OLS estimates
Interpretation of the slope estimate $\hat{\beta}$ as the marginal effect
Differentiate the regression model $Y = \hat{\alpha} + \hat{\beta}X + \hat{u}$ with respect to X:
$$\frac{dY}{dX} = \hat{\beta}$$
Derivatives measure how much Y tends to change when X is changed by a marginal amount
If X changes by 1 unit then Y tends to change by $\hat{\beta}$ units
where “units” refers to the units in which the variables are measured (e.g. dollars, pounds, per cent, etc.)
Interpreting OLS estimates
Ex. Executive compensation and profit
Y = executive compensation (millions of $) = dependent variable
X = profits (millions of $) = explanatory variable
Estimated model:
$$Y = \hat{\alpha} + \hat{\beta}X + \hat{u}$$
from example, $\hat{\beta}$ = 0.000842
since $\hat{\beta}$ is positive, X and Y are positively correlated
Interpretation of $\hat{\beta}$
If profits increase by 1 million dollars, then executive compensation will tend to increase by 0.000842 million dollars (i.e. 842 dollars)
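The unit conversion behind this interpretation can be written out as a short sketch. The slope value is the one quoted in the example; the variable names are illustrative.

```python
beta_hat = 0.000842            # slope from the example: $ millions of pay per $ million of profit
profit_increase = 1.0          # profits rise by 1 unit of X, i.e. $1 million

# Predicted change in Y, in the units of Y (millions of dollars)
comp_change_millions = beta_hat * profit_increase

# Convert from millions of dollars to dollars
comp_change_dollars = comp_change_millions * 1_000_000
print(comp_change_millions, comp_change_dollars)
```

Because both variables are measured in millions of dollars, the slope must be multiplied by one million to express the change in compensation in dollars.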
Fitted Values
Estimated Regression Line:
$$Y = \hat{\alpha} + \hat{\beta}X + \hat{u}$$
The residual
$$\hat{u} = Y - \hat{\alpha} - \hat{\beta}X$$
Fitted value of dependent variable is
$$\hat{Y}_i = \hat{\alpha} + \hat{\beta}X_i$$
The difference between actual and fitted values of Y (another way to express the residual)
$$\hat{u}_i = Y_i - \hat{Y}_i$$
Good fitting models have small residuals (i.e. small SSR)
If the residual is big for one observation (relative to other observations) then it is an outlier
Looking at fitted values and residuals can be very informative
R2: A measure of fit
How well does the regression line fit the data points?
R2 measures the proportion of the total variation explained by the regression equation
TSS = the variation in the dependent variable about its mean value
TSS = RSS + SSR
TSS = Total sum of squares,
$$TSS = \sum\left(Y_i - \bar{Y}\right)^2$$
RSS = Regression sum of squares,
$$RSS = \sum\left(\hat{Y}_i - \bar{Y}\right)^2$$
SSR = sum of squared residuals,
$$SSR = \sum\hat{u}_i^2$$
$$R^2 = 1 - \frac{SSR}{TSS} = \frac{RSS}{TSS}$$
TSS and RSS here are the quantities called SST and SSE in the theory slides
AGAIN, WELCOME TO ECONOMIC MODELLING!
You are enrolled in this unit..you need to start reading…
See you next week.