PUAF 610 TA


Upload: menora

Post on 18-Mar-2016


TRANSCRIPT

Page 1: PUAF 610 TA

04/24/23 1

PUAF 610 TA

Session 10

Page 2: PUAF 610 TA


TODAY

• Ideas about Final Review

• Regression Review

Page 3: PUAF 610 TA

Final Review

• Any ideas about the final review next week?

• Go over lectures

• Go over problem sets related to the exam

• Go over extra exercises

• Try to get information from instructors

• Email me your preferences


Page 4: PUAF 610 TA


Regression

• In regression analysis we analyze the relationship between two or more variables.

• The relationship between two or more variables could be linear or nonlinear.

– Simple Linear Regression: y, x

– Multiple Regression: y, x1, x2, x3, …, xk

• If a relationship exists, how can we use it to forecast future values?

Page 5: PUAF 610 TA


Regression

• Regression is the attempt to explain the variation in a dependent variable using the variation in independent variables.

• Regression is thus often read as an explanation of causation, although a fitted relationship alone does not establish that x causes y.

[Figure: dependent variable plotted against independent variable (x)]

Page 6: PUAF 610 TA


Simple Linear Regression

[Figure: dependent variable (y) plotted against independent variable (x), with a fitted straight line]

• The output of a regression is a function that predicts the dependent variable based upon values of the independent variables.

• Simple regression fits a straight line to the data.

$y' = b_0 + b_1 x \pm \varepsilon$

$b_0$ = y-intercept

$b_1$ = slope = $\Delta y / \Delta x$

$\varepsilon$ = error
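As a rough illustration of fitting a straight line, the sketch below computes the intercept and slope with the standard closed-form least-squares formulas. The function name `fit_line` and the data are hypothetical and made up for this example; they do not come from the slides.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: chosen so the line passes through the point of means.
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data that lies exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
b0, b1 = fit_line(xs, ys)
print("intercept:", b0, "slope:", b1)
```

With data that sits exactly on a line, the fitted intercept and slope reproduce that line; with noisy data they give the line that best balances the prediction errors.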

Page 7: PUAF 610 TA


Simple Linear Regression

[Figure: dependent variable plotted against independent variable (x); labels: Zero, Prediction: $\hat{y}$, Observation: $y$]

The function will make a prediction for each observed data point.

The observation is denoted by $y$ and the prediction is denoted by $\hat{y}$.

For each observation, the variation can be described as:

$y = \hat{y} + \varepsilon$

Actual = Explained + Error
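The split of each observation into an explained part and an error can be sketched in a few lines. The coefficients and data here are hypothetical, chosen only so the arithmetic is easy to follow.

```python
# Hypothetical fitted line y_hat = 1 + 2x (not taken from the slides).
b0, b1 = 1.0, 2.0

xs = [1, 2, 3]
ys = [3.5, 4.0, 7.5]  # made-up observations

# Explained part: the prediction for each x.
predictions = [b0 + b1 * x for x in xs]
# Error: whatever the prediction misses.
errors = [y - y_hat for y, y_hat in zip(ys, predictions)]

for y, y_hat, e in zip(ys, predictions, errors):
    # Actual = Explained + Error holds for every observation by construction.
    print(f"actual={y}  explained={y_hat}  error={e}")
```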

Page 8: PUAF 610 TA


Simple Linear Regression

• Simple Linear Regression Model

$y = \beta_0 + \beta_1 x + \varepsilon$

• Simple Linear Regression Equation

$E(y) = \beta_0 + \beta_1 x$

• Estimated Simple Linear Regression Equation

$\hat{y} = b_0 + b_1 x$

Page 9: PUAF 610 TA

Simple Linear Regression

• The simplest relationship between two variables is a linear one:

• $y = \beta_0 + \beta_1 x$

• x = independent or explanatory variable (“cause”)

• y = dependent or response variable (“effect”)

• $\beta_0$ = intercept (value of y when x = 0)

• $\beta_1$ = slope (change in y when x increases one unit)

Page 10: PUAF 610 TA

Interpret the slope

• $y = 0.3 + 2.6x$
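One way to read the slope is to compare predictions one unit of x apart. The helper name `predict` is hypothetical; the coefficients 0.3 and 2.6 are the ones on the slide.

```python
def predict(x):
    """Prediction from the slide's equation y = 0.3 + 2.6x."""
    return 0.3 + 2.6 * x


# The intercept is the prediction at x = 0.
at_zero = predict(0)
# Raising x by one unit changes the prediction by the slope.
change_per_unit = predict(2) - predict(1)

print("prediction at x = 0:", round(at_zero, 2))
print("change in y per unit of x:", round(change_per_unit, 2))
```

So a slope of 2.6 means that each one-unit increase in x is associated with a 2.6-unit increase in the predicted y, whatever the starting value of x.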

Page 11: PUAF 610 TA


Regression

[Figure: dependent variable plotted against independent variable (x)]

• A least squares regression, or OLS, selects the line with the lowest total sum of squared prediction errors.

• This value is called the Sum of Squares of Error, or SSE.
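A small check of the least-squares idea: compute the SSE for the least-squares line and for a few other candidate lines. The data and the candidate lines are hypothetical; for this particular data set the closed-form least-squares fit works out to intercept 0.5 and slope 1.4.

```python
def sse(b0, b1, xs, ys):
    """Sum of squared prediction errors for the line y_hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, made up for illustration.
xs = [1, 2, 3, 4]
ys = [2, 3, 5, 6]

# Least-squares line for this data: b0 = 0.5, b1 = 1.4.
ols_sse = sse(0.5, 1.4, xs, ys)

# A few arbitrary alternative lines for comparison.
other_sses = [sse(0.5, 1.5, xs, ys), sse(0.0, 1.5, xs, ys), sse(1.0, 1.2, xs, ys)]

print("SSE of least-squares line:", round(ols_sse, 4))
print("SSE of other lines:", [round(s, 4) for s in other_sses])
```

Every alternative line gives a larger SSE than the least-squares line, which is exactly the criterion OLS uses to pick its line.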


Page 12: PUAF 610 TA


Calculating SSR

[Figure: dependent variable plotted against independent variable (x), with the mean $\bar{y}$ marked]

The Sum of Squares Regression (SSR) is the sum of the squared differences between the prediction for each observation and the mean of y, $\bar{y}$.

Page 13: PUAF 610 TA


Regression Formulas

The Total Sum of Squares (SST) is equal to SSR + SSE. Mathematically,

$SSR = \sum (\hat{y} - \bar{y})^2$ (measure of explained variation)

$SSE = \sum (y - \hat{y})^2$ (measure of unexplained variation)

$SST = SSR + SSE = \sum (y - \bar{y})^2$ (measure of total variation in y)
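The decomposition SST = SSR + SSE can be checked numerically. The sketch below uses hypothetical data and its least-squares line (intercept 0.5, slope 1.4); the identity is only guaranteed when the line is the least-squares fit with an intercept.

```python
# Hypothetical data and its least-squares line y_hat = 0.5 + 1.4x.
xs = [1, 2, 3, 4]
ys = [2, 3, 5, 6]
b0, b1 = 0.5, 1.4

y_bar = sum(ys) / len(ys)
predictions = [b0 + b1 * x for x in xs]

# Explained variation: predictions around the mean of y.
ssr = sum((y_hat - y_bar) ** 2 for y_hat in predictions)
# Unexplained variation: observations around their predictions.
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, predictions))
# Total variation: observations around the mean of y.
sst = sum((y - y_bar) ** 2 for y in ys)

print("SSR:", round(ssr, 4), "SSE:", round(sse, 4), "SST:", round(sst, 4))
```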

Page 14: PUAF 610 TA


The Coefficient of Determination

The proportion of total variation (SST) that is explained by the regression (SSR) is known as the Coefficient of Determination, and is often referred to as $R^2$.

$R^2 = \dfrac{SSR}{SST} = \dfrac{SSR}{SSR + SSE}$

The value of $R^2$ can range between 0 and 1, and the higher its value, the more of the variation in y the regression explains.
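A minimal sketch of the calculation, using hypothetical sums of squares (SSR = 9.8, SSE = 0.2) rather than values from the slides; the function name `r_squared` is also an assumption made for this example.

```python
def r_squared(ssr, sse):
    """Coefficient of determination: explained variation over total variation."""
    sst = ssr + sse
    return ssr / sst


# Hypothetical sums of squares, made up for illustration.
r2 = r_squared(ssr=9.8, sse=0.2)
print("R-squared:", round(r2, 4))
```

Here 98% of the variation in y is explained by the regression, because the unexplained part (SSE) is a small share of the total.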

Page 15: PUAF 610 TA


Page 16: PUAF 610 TA


Testing for Significance

• To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.

• The t test is commonly used.

Page 17: PUAF 610 TA


Testing for Significance: t Test

• Hypotheses: $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$

• Test Statistic: $t = \dfrac{b_1}{s_{b_1}}$

• Rejection Rule: Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$, where $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom.


Multiple Linear Regression

• More than one independent variable can be used to explain variance in the dependent variable.

• A multiple regression takes the form:

$y = A + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$

where k is the number of independent variables, each with its own slope parameter.

Page 19: PUAF 610 TA


Multiple Regression

$\hat{y}_t = \underset{(0.1)}{0.6} + \underset{(0.4)}{0.4}\,x_t + \underset{(0.3)}{0.9}\,z_t$

$R^2 = 0.3$

(45 observations, standard errors in brackets)

Page 20: PUAF 610 TA


Regression

• A unit rise in x produces 0.4 of a unit rise in y, with z held constant.

• Interpretation of the t-statistics remains the same, i.e. (0.4 - 0)/0.4 = 1, which is below the critical value of 2.02, so we fail to reject the null and x is not significant.

• The R-squared statistic indicates 30% of the variance of y is explained.
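The same t-statistic calculation can be repeated for every term in the example equation. The sketch below takes the coefficients (0.6, 0.4, 0.9) and bracketed standard errors (0.1, 0.4, 0.3) from the example, together with the critical value of 2.02 quoted above; the dictionary layout and variable names are hypothetical.

```python
# (coefficient, standard error) pairs from the example equation.
estimates = {
    "constant": (0.6, 0.1),
    "x": (0.4, 0.4),
    "z": (0.9, 0.3),
}
T_CRITICAL = 2.02  # critical value quoted on the slide

significant = {}
for name, (coefficient, std_error) in estimates.items():
    # Test statistic for H0: coefficient = 0.
    t = (coefficient - 0) / std_error
    significant[name] = abs(t) > T_CRITICAL
    print(f"{name}: t = {t:.2f}, significant = {significant[name]}")
```

Under these numbers only x fails the test, while the constant and z have t-statistics well above the critical value.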

Page 21: PUAF 610 TA


Adjusted R-squared Statistic

• This statistic is used in a multiple regression analysis, because it does not automatically rise when an extra explanatory variable is added.

• Its value depends on the number of explanatory variables.

• It is usually written as $\bar{R}^2$ (R-bar squared).

Page 22: PUAF 610 TA


Adjusted R-squared

• It has the following formula (n = number of observations, k = number of parameters, counting the intercept):

$\bar{R}^2 = R^2 - \dfrac{k-1}{n-k}\,(1 - R^2)$
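As a worked example, the adjusted R-squared can be computed for the earlier regression with R-squared of 0.3 and 45 observations. This sketch assumes the common form of the formula in which k counts all estimated parameters including the intercept (so k = 3 here); that reading of k and the function name are assumptions.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared, with k counting all parameters including the intercept."""
    return r2 - (k - 1) / (n - k) * (1 - r2)


# R-squared 0.3, 45 observations, 3 parameters (constant, x, z).
adj_r2 = adjusted_r_squared(r2=0.3, n=45, k=3)
print("adjusted R-squared:", round(adj_r2, 4))
```

The adjustment subtracts a penalty that grows with the number of parameters, so the adjusted value (about 0.27) sits a little below the unadjusted 0.3.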

Page 23: PUAF 610 TA


F-test of explanatory power

• This is the F-test for the goodness of fit of a regression and in effect tests for the joint significance of the explanatory variables.

• It is based on the R-squared statistic.

• It is routinely produced by most computer software packages.

• It follows the F-distribution.

Page 24: PUAF 610 TA


F-test formula

• The formula for the F-test of the goodness of fit is:

$F = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)} \sim F_{k-1,\;n-k}$

Page 25: PUAF 610 TA


F-statistic

• When testing for the significance of the goodness of fit, our null hypothesis is that the coefficients on the explanatory variables are jointly equal to 0.

• If our F-statistic is below the critical value we fail to reject the null and therefore we say the goodness of fit is not significant.
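The decision rule can be sketched with the earlier example (R-squared of 0.3, 45 observations). This assumes the R-squared form of the F statistic with k counting all parameters including the intercept (k = 3), giving 2 and 42 degrees of freedom. The critical value of about 3.22 is an approximate 5% table value for those degrees of freedom and should be checked against a table; the names are hypothetical.

```python
def f_statistic(r2, n, k):
    """F statistic for joint significance, with k counting all parameters."""
    explained_per_df = r2 / (k - 1)
    unexplained_per_df = (1 - r2) / (n - k)
    return explained_per_df / unexplained_per_df


f = f_statistic(r2=0.3, n=45, k=3)
F_CRITICAL = 3.22  # approximate 5% value for F(2, 42); an assumption, check a table
goodness_of_fit_significant = f > F_CRITICAL
print("F =", round(f, 2), "significant:", goodness_of_fit_significant)
```

In this example the F statistic is about 9, above the critical value, so the null of jointly zero coefficients would be rejected.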

Page 26: PUAF 610 TA
