quantitative business analysis for decision making simple linear regression

Post on 21-Dec-2015


Quantitative Business Analysis for Decision Making

Simple Linear Regression

403.7 2

Lecture Outlines

– Scatter Plots
– Correlation Analysis
– Simple Linear Regression Model
– Estimation and Significance Testing
– Coefficient of Determination
– Confidence and Prediction Intervals
– Analysis of Residuals


Regression Analysis?

Regression analysis is used for modeling the mean of a “response” variable Y as a function of “predictor” variables X1, X2, ..., Xk.

When k = 1, it is called simple regression analysis.


Random Sample

Y: Response Variable, X: Predictor Variable

For each unit in a random sample of size n, the pair (X, Y) is observed, resulting in a random sample:

(x1, y1), (x2, y2), ..., (xn, yn)


Scatter Plot

A scatter plot is a graphical display of the sample (x1, y1), (x2, y2), ..., (xn, yn) as n points in two dimensions.

It suggests whether there is a relationship between X and Y.


A Scatter Plot Showing Linear Trend

[Figure: A Scatter Plot Showing Linear Trend of Peoples Ratings and Nielsen Ratings. Horizontal axis: Nielsen; vertical axis: PeopleM.]


A Scatter Plot Showing No Linear Trend

[Figure: A Scatter Plot Showing No Linear Trend of Today's With Yesterday's DJIA. Horizontal axis: Today; vertical axis: Yesterda.]


Modeling Linear Trend

A perfect linear relationship between Y and X exists if $Y = \alpha + \beta X$. The coefficient $\beta$ of X is the slope, quantifying the amount of change in y corresponding to a one-unit change in x. There are no perfect linear relationships in the practical world.


Simple Linear Regression Model

Model: $Y = \mu_{Y|X} + \varepsilon$

$\mu_{Y|X} = \alpha + \beta X$ is a linear function (nonrandom). $\varepsilon$ is random error. It is assumed to be normally distributed with mean 0 and standard deviation $\sigma$. So $\alpha$, $\beta$, and $\sigma$ are the parameters of the model.
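To make the model concrete, the sketch below simulates responses from a linear mean plus normal error. The function name `simulate`, the seed, and the parameter values are assumptions chosen for illustration; none of them come from the lecture.

```python
import random

def simulate(xs, alpha, beta, sigma, seed=0):
    """Draw one y per x from Y = alpha + beta*x + eps, eps ~ Normal(0, sigma)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical predictor values and parameters, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = simulate(xs, alpha=2.0, beta=0.5, sigma=1.0)
for x, y in zip(xs, ys):
    print(x, round(y, 2))
```

Setting `sigma=0.0` removes the random error and returns points that lie exactly on the line, which is the "perfect linear relationship" case.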


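Estimation

Simple linear regression analysis estimates the mean of Y (linear trend) $\mu_{Y|X}$ by

$$\hat{y} = a + bx$$

where

$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} \quad \text{and} \quad a = \bar{y} - b\bar{x}$$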
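The least-squares estimates can be computed directly from the sums of deviations about the sample means. This is a minimal sketch; the function name `fit_line` and the five data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
print(f"y_hat = {a:.2f} + {b:.2f} x")
```

For this data the deviations give a slope of 6/10 = 0.6 and an intercept of 4 - 0.6(3) = 2.2.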


Standard Deviation

The standard deviation (s) of the sample of n points in the scatter plot around the estimated regression line $\hat{y} = a + bx$ is:

$$s = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}}$$
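A short sketch of this calculation follows. The helper name `residual_sd` and the data are hypothetical; the values of a and b are the least-squares estimates for that same data.

```python
import math

def residual_sd(xs, ys, a, b):
    """Standard deviation of the points around the line y_hat = a + b*x."""
    n = len(xs)
    # Sum of squared vertical distances from each point to the line.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))  # n - 2 because a and b were estimated

# Hypothetical data and its least-squares estimates, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6
s = residual_sd(xs, ys, a, b)
print(round(s, 4))
```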


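Testing the Slope of Linear Trend

For testing $H_0: \beta = \beta_0$ vs. $H_a: \beta \neq \beta_0$, compute the t-statistic and its p-value:

$$t\text{-statistic} = \frac{b - \beta_0}{s_b}$$

where $s_b$ is the standard error of the slope estimate b.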
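The t-statistic can be sketched as below. The slide does not spell out the standard error of the slope, so the code assumes the usual formula $s_b = s / \sqrt{\sum (x - \bar{x})^2}$; the function name and the data are hypothetical.

```python
import math

def slope_t_statistic(xs, ys, beta0=0.0):
    """Return (t, df) for testing H0: beta = beta0 in simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    s_b = s / math.sqrt(s_xx)   # assumed standard-error formula for the slope
    return (b - beta0) / s_b, n - 2

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
t, df = slope_t_statistic(xs, ys)
print(round(t, 4), df)
```

The p-value comes from the t distribution with n - 2 degrees of freedom. The Python standard library does not provide that distribution, so in practice one would look it up in a t-table or, if SciPy is available, compute a two-sided value with `2 * scipy.stats.t.sf(abs(t), df)`.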


Coefficient of Determination: $R^2$

A quantification of the significance of the estimated model $\hat{y} = a + bx$ is denoted by $R^2$.

– $R^2$ > 85% = significant model
– $R^2$ < 85% = model is perceived as inadequate
– Low $R^2$ will suggest a need for additional predictors for modeling the mean of Y
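The slide gives no formula for $R^2$. The sketch below assumes the standard definition, one minus the ratio of the residual sum of squares to the total sum of squares about $\bar{y}$. The function name and the data are hypothetical.

```python
def r_squared(xs, ys):
    """R^2 = 1 - SSE/SST for the least-squares line fitted to (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
    sst = sum((y - y_bar) ** 2 for y in ys)                    # total variation
    return 1 - sse / sst

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r2 = r_squared(xs, ys)
print(f"R^2 = {r2:.0%}")
```

For this data $R^2$ is 60%, which falls below the 85% threshold stated above.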


Correlation Coefficient: r

In magnitude, the correlation coefficient r is the square root of $R^2$; its sign is the sign of the slope b. It is a number between -1 and 1.

– The closer r is to -1 or 1, the stronger the linear trend

– Its sign is positive for an increasing trend (slope b is positive)

– Its sign is negative for a decreasing trend (slope b is negative)
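The sign and magnitude of r can be checked with a small sketch. It assumes the usual Pearson formula, which the slide does not state; the function name and both data sets are hypothetical.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r of the paired samples."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
r_up = correlation(xs, [2, 4, 5, 4, 5])     # increasing trend
r_down = correlation(xs, [5, 4, 5, 4, 2])   # same y values in reverse order
print(round(r_up, 4), round(r_down, 4))
```

Both values have the same magnitude and opposite signs, and squaring either one gives the $R^2$ of the corresponding fitted line.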


Confidence and Prediction Intervals

To estimate $\mu_{Y|X}$ by a confidence interval, or to predict the response Y, corresponding to the predictor value x = x0:

– 1. Compute: $\hat{y} = a + b x_0$

– 2. Compute: $\hat{y} \pm t \cdot \mathrm{s.e.}(\hat{y})$, where t is the critical value of the t distribution with n - 2 degrees of freedom


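What is $\mathrm{s.e.}(\hat{y})$, i.e., the Standard Error of $\hat{y}$?

For estimating $\mu_{Y|X}$:

$$\mathrm{s.e.}(\hat{y}) = s\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x - \bar{x})^2}}$$

For predicting Y:

$$\mathrm{s.e.}(\hat{y}) = s\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x - \bar{x})^2}}$$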
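Both intervals can be sketched with the two standard errors above. The function name, the data, and x0 are hypothetical. The critical value is passed in as an argument because the Python standard library has no t distribution; 3.182 is the two-sided 95% table value for 3 degrees of freedom.

```python
import math

def intervals(xs, ys, x0, t_crit):
    """Return (y_hat, confidence interval for the mean, prediction interval) at x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    a = y_bar - b * x_bar
    s = math.sqrt(sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2))
    y_hat = a + b * x0
    core = 1 / n + (x0 - x_bar) ** 2 / s_xx
    se_mean = s * math.sqrt(core)       # for estimating the mean of Y at x0
    se_pred = s * math.sqrt(1 + core)   # for predicting a single Y at x0
    ci = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
    pi = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)
    return y_hat, ci, pi

# Hypothetical data, for illustration only (n = 5, so n - 2 = 3 degrees of freedom).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
T_CRIT = 3.182  # t-table value: two-sided 95%, 3 degrees of freedom
y_hat, ci, pi = intervals(xs, ys, x0=4, t_crit=T_CRIT)
print(round(y_hat, 2), [round(v, 2) for v in ci], [round(v, 2) for v in pi])
```

The prediction interval is always wider than the confidence interval because of the extra 1 under the square root, which accounts for the random error of a single new observation.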


Analysis of Residuals

Residuals are defined:

$$e_i = y_i - \hat{y}_i, \quad i = 1, 2, \ldots, n$$

Residual analysis is used to check the normality and homogeneity of variance assumptions of random errors $\varepsilon$.

A histogram or box plot of residuals will help to ascertain if errors are normally distributed.
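Computing the residuals is the first step before any histogram or box plot. In this minimal sketch the helper name and the data are hypothetical, and a and b are the least-squares estimates for that data.

```python
def residuals(xs, ys, a, b):
    """Residuals e_i = y_i - y_hat_i for the line y_hat = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical data and its least-squares estimates, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6
e = residuals(xs, ys, a, b)
print([round(v, 2) for v in e])
print("sum of residuals:", round(sum(e), 10))
```

With a least-squares line that includes an intercept, the residuals sum to zero up to rounding, so their histogram is centered at zero. The shape of that histogram is what bears on the normality assumption.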


Analysis of Residuals (cont'd)

A plot of the residuals $e_i$ against the observed predictor values $x_i$ will help ascertain the homogeneity assumption.

– Random appearance = homogeneity of variance assumption is valid.

– Non-random appearance = homogeneity assumption is not valid and variance is dependent on predictor values.
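When no plot is at hand, a crude numerical stand-in is to compare the spread of the residuals for small and large predictor values. This is only an illustrative check, not a method from the lecture and not a formal test. The function name and the residual values are hypothetical.

```python
import statistics

def residual_spread_by_half(xs, residuals):
    """Split residuals at the middle of the sorted x values; return the spread in each half."""
    pairs = sorted(zip(xs, residuals))        # order residuals by predictor value
    mid = len(pairs) // 2
    low = [e for _, e in pairs[:mid]]         # residuals at smaller x
    high = [e for _, e in pairs[mid:]]        # residuals at larger x
    return statistics.pstdev(low), statistics.pstdev(high)

# Hypothetical residuals whose spread grows with x, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
res = [0.1, -0.1, 0.2, -0.2, 1.0, -1.2, 1.5, -1.4]
low_sd, high_sd = residual_spread_by_half(xs, res)
print(round(low_sd, 3), round(high_sd, 3))
```

A much larger spread in one half, as in this made-up example, corresponds to the non-random (fanning) appearance in a residual plot and suggests the variance depends on the predictor.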
