puaf 610 ta
04/24/23 1
PUAF 610 TA
Session 10
TODAY
• Ideas about Final Review
• Regression Review
Final Review
• Any ideas about the final review next week?
• Go over lectures
• Go over problem sets that relate to the exam
• Go over extra exercises
• Try to get information from instructors
• Email me your preferences
Regression
• In regression analysis we analyze the relationship between two or more variables.
• The relationship between two or more variables could be linear or nonlinear.
– Simple Linear Regression: y, x
– Multiple Regression: y, x1, x2, x3, …, xk
• If a relationship exists, how can we use it to forecast future values?
Regression
• Regression is the attempt to explain the variation in a dependent variable using the variation in independent variables.
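• Regression is thus often read as an explanation of causation, although a fitted relationship on its own shows association and does not prove that x causes y.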
[Figure: plot of the dependent variable against the independent variable (x)]
Simple Linear Regression
[Figure: plot of the dependent variable (y) against the independent variable (x), with a fitted straight line]
• The output of a regression is a function that predicts the dependent variable based upon values of the independent variables.
• Simple regression fits a straight line to the data.
$y' = b_0 + b_1 x \pm \epsilon$
$b_0$ = y intercept
$b_1$ = slope = $\Delta y / \Delta x$
$\epsilon$ = error
Simple Linear Regression
[Figure: plot of the dependent variable against the independent variable (x); labels mark zero, the observation $y$ and the prediction $\hat{y}$]
The function will make a prediction for each observed data point.
The observation is denoted by $y$ and the prediction is denoted by $\hat{y}$.
For each observation, the variation can be described as:
$y = \hat{y} + \varepsilon$
Actual = Explained + Error
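The identity Actual = Explained + Error can be checked numerically. The sketch below is only an illustration: the line (intercept 1.0, slope 2.0) and the three data points are hypothetical values chosen for the example, not figures from the slides.

```python
# Hypothetical fitted line and data, chosen only to illustrate the identity.
b0, b1 = 1.0, 2.0
xs = [1.0, 2.0, 3.0]
ys = [3.5, 4.5, 7.0]          # observed values (Actual)

# Prediction for each observation (Explained).
y_hat = [b0 + b1 * x for x in xs]

# Error for each observation: what the line leaves unexplained.
errors = [y - yh for y, yh in zip(ys, y_hat)]

for y, yh, e in zip(ys, y_hat, errors):
    # Each observation splits into its prediction plus its error.
    print(f"actual={y:.1f}  explained={yh:.1f}  error={e:+.1f}")
```

A point above the line has a positive error, a point below it a negative one, and a point on the line has an error of zero.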
Simple Linear Regression
• Simple Linear Regression Model: $y = \beta_0 + \beta_1 x + \varepsilon$
• Simple Linear Regression Equation: $E(y) = \beta_0 + \beta_1 x$
• Estimated Simple Linear Regression Equation: $\hat{y} = b_0 + b_1 x$
Simple Linear Regression
• The simplest relationship between two variables is a linear one:
• $y = \beta_0 + \beta_1 x$
• x = independent or explanatory variable (“cause”)
• y = dependent or response variable (“effect”)
• $\beta_0$ = intercept (value of y when x = 0)
• $\beta_1$ = slope (change in y when x increases by one unit)
Interpret the slope
• $y = 0.3 + 2.6x$
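One way to read the slope is to compare predictions at two x values one unit apart. The sketch below does that for the line y = 0.3 + 2.6x; the function name `predict` and the x values 4 and 5 are illustrative assumptions.

```python
def predict(x, b0=0.3, b1=2.6):
    """Return the predicted y for the line y = b0 + b1 * x."""
    return b0 + b1 * x

# Predictions at two x values one unit apart (arbitrary illustrative values).
y_at_4 = predict(4)
y_at_5 = predict(5)

# The difference equals the slope: y rises by about 2.6 per unit rise in x.
change = y_at_5 - y_at_4
print(round(change, 6))

# At x = 0 the prediction is the intercept.
print(predict(0))
```

So the slope says that y is predicted to rise by 2.6 units for each one-unit increase in x, and the intercept 0.3 is the predicted y when x = 0.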
Regression
[Figure: plot of the dependent variable against the independent variable (x), with a fitted line]
• A least squares regression, or OLS, selects the line with the lowest total sum of squared prediction errors.
• This value is called the Sum of Squares of Error, or SSE.
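A minimal sketch of least squares for one independent variable follows. It uses the standard closed-form estimates (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x), which the slides do not spell out, and a small made-up data set; the names `ols_fit` and `sse` are hypothetical.

```python
def ols_fit(xs, ys):
    """Return (b0, b1) that minimise the sum of squared prediction errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx                 # slope
    b0 = y_bar - b1 * x_bar        # intercept
    return b0, b1

def sse(xs, ys, b0, b1):
    """Sum of squared errors of the line y = b0 + b1 * x on the data."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = ols_fit(xs, ys)
best = sse(xs, ys, b0, b1)

# Shifting the fitted line up by 0.5 gives a larger SSE.
shifted = sse(xs, ys, b0 + 0.5, b1)
print(b0, b1, best, shifted)
```

On this data the fitted line is roughly y = 2.2 + 0.6x with an SSE of about 2.4, and the shifted line does worse, as the least squares criterion requires.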
Calculating SSR
[Figure: plot of the dependent variable against the independent variable (x), with the mean $\bar{y}$ marked]
The Sum of Squares Regression (SSR) is the sum of the squared differences between the prediction for each observation and the mean of the observed y values, $\bar{y}$.
Regression Formulas
The Total Sum of Squares (SST) is equal to SSR + SSE. Mathematically,
$SSR = \sum (\hat{y} - \bar{y})^2$ (measure of explained variation)
$SSE = \sum (y - \hat{y})^2$ (measure of unexplained variation)
$SST = SSR + SSE = \sum (y - \bar{y})^2$ (measure of total variation in y)
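The decomposition SST = SSR + SSE can be verified on a small example. This sketch fits a least squares line with the standard closed-form estimates and then computes the three sums; the data set is made up for illustration.

```python
# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares estimates (standard closed form).
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Prediction for each observation.
y_hat = [b0 + b1 * x for x in xs]

ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained variation
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))    # unexplained variation
sst = sum((y - y_bar) ** 2 for y in ys)                 # total variation

print(ssr, sse, sst)
```

Here SSR is about 3.6, SSE about 2.4 and SST is 6, so the explained and unexplained parts add up to the total, up to floating-point rounding.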
The Coefficient of Determination
The proportion of total variation (SST) that is explained by the regression (SSR) is known as the Coefficient of Determination, and is often referred to as $R^2$.
$R^2 = \frac{SSR}{SST} = \frac{SSR}{SSR + SSE}$
The value of $R^2$ can range between 0 and 1; the higher its value, the larger the share of the variation in y that the regression explains.
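As a small illustration of the ratio, the sketch below computes the coefficient of determination from SSR and SSE. The function name `r_squared` and the input values (SSR = 3.6, SSE = 2.4) are hypothetical and chosen only for the example.

```python
def r_squared(ssr, sse):
    """Coefficient of determination: explained variation over total variation."""
    sst = ssr + sse            # total variation
    if sst == 0:
        raise ValueError("total variation is zero, R-squared is undefined")
    return ssr / sst

# Illustrative values: 3.6 explained, 2.4 unexplained.
r2 = r_squared(3.6, 2.4)
print(round(r2, 4))
```

With these inputs the regression explains about 60% of the variation in y.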
Testing for Significance
• To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.
• The t test is commonly used.
Testing for Significance: t Test
• Hypotheses: $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
• Test Statistic: $t = \frac{b_1}{s_{b_1}}$
• Rejection Rule: Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$, where $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom.
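The test can be sketched in a few lines. The slides give the statistic as b1 divided by its standard error; the formula used here for that standard error, s / sqrt(Sxx) with s = sqrt(SSE / (n - 2)), is the usual textbook one and is an assumption relative to the slides. The data are made up, and the critical value 3.182 is the two-tailed 5% t-table value for 3 degrees of freedom.

```python
import math

def slope_t_test(xs, ys):
    """Return (t, df) for H0: beta_1 = 0 in a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))       # standard error of the estimate
    s_b1 = s / math.sqrt(sxx)          # standard error of the slope
    return b1 / s_b1, n - 2

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

t, df = slope_t_test(xs, ys)
t_crit = 3.182                          # t table, alpha = 0.05 two-tailed, df = 3
reject = t < -t_crit or t > t_crit
print(round(t, 3), df, reject)
```

For this small sample t is about 2.12, which lies inside the interval from -3.182 to 3.182, so the null hypothesis of a zero slope is not rejected.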
Multiple Linear Regression
• More than one independent variable can be used to explain variance in the dependent variable.
• A multiple regression takes the form:
$y = A + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$
where k is the number of variables, or parameters.
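To make the form concrete, the sketch below evaluates the systematic part of a multiple regression (everything except the error term) for one observation. The function name `predict` and all numbers are hypothetical.

```python
def predict(intercept, betas, xs):
    """Return A + beta_1 * X_1 + ... + beta_k * X_k for one observation."""
    if len(betas) != len(xs):
        raise ValueError("need one coefficient per independent variable")
    return intercept + sum(b * x for b, x in zip(betas, xs))

# Hypothetical model with k = 2 independent variables.
y_pred = predict(1.0, [2.0, -0.5], [3.0, 4.0])
print(y_pred)   # 1.0 + 2.0 * 3.0 - 0.5 * 4.0
```

Each coefficient scales its own variable, so changing one X while holding the others fixed moves the prediction by that variable's coefficient per unit.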
Multiple Regression
$\hat{y}_t = 0.6 + 0.4 x_t + 0.9 z_t$
(0.3) (0.4) (0.1)
$R^2 = 0.3$
(45 observations, standard errors in brackets)
Regression
• A unit rise in x produces 0.4 of a unit rise in y, with z held constant.
• Interpretation of the t-statistics remains the same, i.e. (0.4 - 0)/0.4 = 1 (critical value is 2.02), so we fail to reject the null and x is not significant.
• The R-squared statistic indicates 30% of the variance of y is explained.
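The arithmetic for the coefficient on x can be written out as a short check. The coefficient 0.4, its standard error 0.4 and the critical value 2.02 come from the example above; the function name `t_statistic` is a hypothetical helper.

```python
def t_statistic(coefficient, standard_error, hypothesised=0.0):
    """t = (estimate - hypothesised value) / standard error."""
    return (coefficient - hypothesised) / standard_error

t_x = t_statistic(0.4, 0.4)     # coefficient on x and its standard error
critical = 2.02                  # critical value quoted in the example

# Two-sided test: significant only if |t| exceeds the critical value.
significant = abs(t_x) > critical
print(t_x, significant)
```

Since the t-statistic of 1 is below 2.02 in absolute value, the null hypothesis that the coefficient on x is zero is not rejected.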
Adjusted R-squared Statistic
• This statistic is used in a multiple regression analysis, because it does not automatically rise when an extra explanatory variable is added.
• Its value depends on the number of explanatory variables.
• It is usually written as $\bar{R}^2$ (R-bar squared).
Adjusted R-squared
• It has the following formula (n-number of observations, k-number of parameters):
$\bar{R}^2 = R^2 - \frac{k - 1}{n - k}(1 - R^2)$
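A short sketch of the adjustment follows. It applies the formula to the multiple regression example in these slides (R-squared of 0.3, 45 observations) and assumes k = 3 parameters, that is, an intercept plus the coefficients on x and z; the function name `adjusted_r_squared` is hypothetical.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for n observations and k parameters."""
    if n <= k:
        raise ValueError("need more observations than parameters")
    return r2 - (k - 1) / (n - k) * (1 - r2)

# R-squared and n from the example; k = 3 is an assumption (intercept, x, z).
adj = adjusted_r_squared(0.3, 45, 3)
print(round(adj, 4))

# The same quantity in its other common form, 1 - (1 - R^2)(n - 1)/(n - k).
alt = 1 - (1 - 0.3) * (45 - 1) / (45 - 3)
print(round(alt, 4))
```

Both forms give about 0.267, slightly below the unadjusted 0.3, which reflects the penalty for the number of parameters.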
F-test of explanatory power
• This is the F-test for the goodness of fit of a regression and in effect tests for the joint significance of the explanatory variables.
• It is based on the R-squared statistic.
• It is routinely produced by most computer software packages.
• It follows the F-distribution.
F-test formula
• The formula for the F-test of the goodness of fit is:
$F = \frac{R^2 / (k - 1)}{(1 - R^2) / (n - k)} \sim F_{k-1,\, n-k}$
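The statistic can be computed directly from R-squared, n and k. The sketch below uses the multiple regression example in these slides (R-squared of 0.3, 45 observations) and assumes k = 3 parameters, counting the intercept; the function name `f_statistic` is hypothetical.

```python
def f_statistic(r2, n, k):
    """Return (F, df1, df2) for the goodness-of-fit F-test."""
    if k < 2 or n <= k:
        raise ValueError("need k >= 2 parameters and n > k observations")
    df1 = k - 1                 # numerator degrees of freedom
    df2 = n - k                 # denominator degrees of freedom
    f = (r2 / df1) / ((1 - r2) / df2)
    return f, df1, df2

# R-squared and n from the example; k = 3 is an assumption (intercept, x, z).
f, df1, df2 = f_statistic(0.3, 45, 3)
print(round(f, 4), df1, df2)
```

The result is an F-statistic of about 9 with 2 and 42 degrees of freedom, which would then be compared with the critical value from an F table at the chosen significance level.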
F-statistic
• When testing for the significance of the goodness of fit, our null hypothesis is that the coefficients on the explanatory variables are jointly equal to 0.
• If our F-statistic is below the critical value we fail to reject the null and therefore we say the goodness of fit is not significant.