Chapter 7 – Binary or Zero/one or Dummy Variables
Dummy Variables – Example
Example – WAGE1 Data Set
We want to fit the model:
wage = β0 + δ0 female + β1 educ + u
The term female is a dummy variable (1 if the individual is female, 0 if male); its coefficient δ0 captures the effect of female vs. male.
The regression equation is
wage = 0.623 - 2.27 female + 0.506 educ

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 0.6228 | 0.6725 | 0.93 | 0.355 |
| female | -2.2734 | 0.2790 | -8.15 | 0.000 |
| educ | 0.50645 | 0.05039 | 10.05 | 0.000 |

S = 3.18552 R-Sq = 25.9% R-Sq(adj) = 25.6%
Interpretation of fitted model:
wage = 0.623 - 2.27 female + 0.506 educ
The coefficient on female measures the average difference in hourly wage between a woman and a man with the same level of education (for this model): women earn an estimated $2.27 per hour less.
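As a quick check on this interpretation, the sketch below plugs values into the fitted equation. The coefficients are copied from the Minitab output; the function name `predicted_wage` and the choice of educ = 12 are illustrative assumptions, not part of the slides.

```python
# Coefficients copied from the fitted model: wage = 0.623 - 2.27 female + 0.506 educ
INTERCEPT = 0.6228
B_FEMALE = -2.2734
B_EDUC = 0.50645


def predicted_wage(female: int, educ: float) -> float:
    """Predicted hourly wage; female is 1 for a woman and 0 for a man."""
    return INTERCEPT + B_FEMALE * female + B_EDUC * educ


# educ = 12 is an arbitrary illustrative value (assumption).
man = predicted_wage(female=0, educ=12)
woman = predicted_wage(female=1, educ=12)
gap = woman - man  # equals the coefficient on female at any level of educ

print(round(man, 2), round(woman, 2), round(gap, 2))
```

Whatever value of educ is used, the gap between the two predictions stays at the coefficient on female, which is what "holding education fixed" means here.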
Graphical Interpretation of Sex Effect
The wage differential of -2.27 dollars per hour between females and males is attributable to the sex of the individual and, potentially, to factors we have not controlled for.
Now, fit the model:
wage = β0 + δ0 female + u
This allows us to test the hypotheses:
H0: δ0 = 0 (mean wage is the same for the sexes)
H1: δ0 ≠ 0 (mean wage is different for the sexes)
The regression equation is
wage = 7.10 - 2.51 female

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 7.0995 | 0.2100 | 33.81 | 0.000 |
| female | -2.5118 | 0.3034 | -8.28 | 0.000 |

The alternative hypothesis is supported: there is a statistically significant difference in mean wage between the sexes (t = -8.28, p-value ≈ 0).
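The test statistic in the output is the coefficient divided by its standard error. The sketch below reproduces it from the reported values. The p-value line uses the standard normal as a large-sample stand-in for the t distribution, which is an assumption made here because the slides do not state the degrees of freedom.

```python
import math

# Values copied from the Minitab output for the female coefficient.
coef = -2.5118
se = 0.3034

# t statistic for H0: coefficient on female equals zero.
t_stat = coef / se

# Two-sided p-value under a standard normal approximation (assumption):
# P(|Z| > |t|) = erfc(|t| / sqrt(2)).
p_value_normal = math.erfc(abs(t_stat) / math.sqrt(2))

print(round(t_stat, 2))
print(p_value_normal < 0.001)
```

The approximate p-value is far below any conventional significance level, consistent with the 0.000 that Minitab prints.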
wage = 7.10 - 2.51 female
Estimated mean wage of males is $7.10 per hour
Estimated mean wage of females is $7.10 – 2.51 = $4.59 per hour
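The reason the intercept and slope can be read as group means is that a least-squares fit on a single dummy variable reproduces the two sample means exactly. The sketch below illustrates this with a small made-up data set (hypothetical values, not the WAGE1 sample); the helper `simple_ols` is written here for illustration only.

```python
from statistics import mean

# Hypothetical toy data (assumption), not the WAGE1 sample.
wage = [8.0, 6.0, 7.0, 4.0, 5.0, 4.5]
female = [0, 0, 0, 1, 1, 1]


def simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = simple_ols(female, wage)
male_mean = mean(w for w, f in zip(wage, female) if f == 0)
female_mean = mean(w for w, f in zip(wage, female) if f == 1)

# Intercept = mean of the base group (males);
# intercept + slope = mean of the group coded 1 (females).
print(intercept, male_mean)
print(intercept + slope, female_mean)
```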
What is the effect (if any) on wage of a person’s marital status?
(Did you check your regression assumptions?)
Examine the effect on wage of a few of the other dummy variables in this data set.
What is the interpretation of a dummy variable if the response is log(y)?
Now, fit the model:
log(wage) = β0 + δ0 female + u
lwage = 1.81 - 0.397 female
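
| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 1.81357 | 0.02981 | 60.83 | 0.000 |
| female | -0.39722 | 0.04307 | -9.22 | 0.000 |

Because the response is log(wage), the coefficient is read as an approximate proportional change: hourly wage is about 40% lower if the individual is female. This is the usual rough reading (100 × coefficient); the exact figure, 100 × (exp(-0.397) - 1), is about 33% lower.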
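With a log response, 100 times the dummy coefficient is only an approximation to the percentage difference, and the approximation gets loose for coefficients this large. A minimal sketch of both calculations, using the coefficient from the output above (the variable names are illustrative):

```python
import math

# Coefficient on female from the log(wage) regression.
coef_female = -0.39722

# Rough reading: 100 * coefficient.
approx_pct = 100 * coef_female

# Exact percentage difference implied by the log model.
exact_pct = 100 * (math.exp(coef_female) - 1)

print(round(approx_pct, 1))
print(round(exact_pct, 1))
```

The exact calculation gives a gap of roughly 33%, noticeably smaller than the 40% rough reading.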
How do we handle multiple dummy variables at once?
Consider the two variables married and female.
We have four categories: single male, single female, married male, and married female.
Need to create three new dummy variables

| Category | marriedmale | marriedfemale | singlefemale |
|---|---|---|---|
| Single male | 0 | 0 | 0 |
| Married male | 1 | 0 | 0 |
| Single female | 0 | 0 | 1 |
| Married female | 0 | 1 | 0 |

To create the dummy variables marriedmale, marriedfemale, and singlefemale in Minitab, use the Calculator with an AND condition nested inside an IF statement.
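The same IF/AND logic can be sketched outside Minitab. The Python function below is an illustrative equivalent, not the Minitab syntax; the function name `category_dummies` is an assumption, and it expects female and married coded as 0/1.

```python
def category_dummies(female: int, married: int) -> dict:
    """Build the three dummies; single males are the base group (all zeros)."""
    return {
        "marriedmale": int(married == 1 and female == 0),
        "marriedfemale": int(married == 1 and female == 1),
        "singlefemale": int(married == 0 and female == 1),
    }


# One row per category, in the order: single male, married male,
# single female, married female.
for female, married in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(female, married, category_dummies(female, married))
```

Only three dummies are built for four categories; adding a fourth alongside the intercept would make the predictors perfectly collinear.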
wage = 5.17 + 2.82 marriedmale - 0.602 marriedfemale - 0.556 singlefemale

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 5.1680 | 0.3614 | 14.30 | 0.000 |
| marriedmale | 2.8150 | 0.4363 | 6.45 | 0.000 |
| marriedfemale | -0.6021 | 0.4645 | -1.30 | 0.195 |
| singlefemale | -0.5564 | 0.4736 | -1.18 | 0.241 |

S = 3.35181 R-Sq = 18.1% R-Sq(adj) = 17.6%
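Since this model contains only the three category dummies, each coefficient is the estimated difference in mean wage from the base group (single males), and the constant is the base group's estimated mean. A short sketch of the implied group means, using the coefficients reported above (the dictionary layout is an illustrative choice):

```python
# Coefficients copied from the Minitab output; single male is the base group.
constant = 5.1680
coefs = {
    "marriedmale": 2.8150,
    "marriedfemale": -0.6021,
    "singlefemale": -0.5564,
}

# Estimated mean hourly wage per category.
group_means = {"singlemale": constant}
for name, coef in coefs.items():
    group_means[name] = constant + coef

for name, value in group_means.items():
    print(name, round(value, 2))
```

The p-values in the table test each group against single males, so they do not directly compare, say, married females with single females.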
Did you check the assumptions of homoskedasticity and normality?
lwage = 1.52 + 0.427 marriedmale - 0.0797 marriedfemale - 0.132 singlefemale

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 1.52081 | 0.05099 | 29.83 | 0.000 |
| marriedmale | 0.42668 | 0.06155 | 6.93 | 0.000 |
| marriedfemale | -0.07974 | 0.06552 | -1.22 | 0.224 |
| singlefemale | -0.13164 | 0.06680 | -1.97 | 0.049 |

S = 0.472836 R-Sq = 21.3% R-Sq(adj) = 20.9%
Did you check the assumptions of homoskedasticity and normality?
lwage = 0.321 + 0.213 marriedmale - 0.198 marriedfemale - 0.110 singlefemale + 0.0789 educ + 0.0268 exper - 0.000535 expersq + 0.0291 tenure - 0.000533 tenursq

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 0.3214 | 0.1000 | 3.21 | 0.001 |
| marriedmale | 0.21268 | 0.05536 | 3.84 | 0.000 |
| marriedfemale | -0.19827 | 0.05784 | -3.43 | 0.001 |
| singlefemale | -0.11035 | 0.05574 | -1.98 | 0.048 |
| educ | 0.078910 | 0.006694 | 11.79 | 0.000 |
| exper | 0.026801 | 0.005243 | 5.11 | 0.000 |
| expersq | -0.0005352 | 0.0001104 | -4.85 | 0.000 |
| tenure | 0.029088 | 0.006762 | 4.30 | 0.000 |
| tenursq | -0.0005331 | 0.0002312 | -2.31 | 0.022 |

S = 0.393290 R-Sq = 46.1% R-Sq(adj) = 45.3%
Did you check the assumptions of homoskedasticity and normality?
Example – BEAUTY Data Set
Do looks affect hourly wage?
Variable: looks
Has five levels: 1, 2, 3, 4, 5
Make dummy variables where:
1, 2 – below average
3 – average
4, 5 – above average
You only need two dummy variables: belowaverage and aboveaverage
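Two dummies suffice because the remaining level (average, looks = 3) serves as the base group, with both dummies equal to zero. A minimal sketch of the recoding, assuming looks is stored as an integer from 1 to 5 (the function name `looks_dummies` is illustrative):

```python
def looks_dummies(looks: int) -> tuple:
    """Return (belowaverage, aboveaverage) for a looks rating of 1 to 5."""
    if looks not in (1, 2, 3, 4, 5):
        raise ValueError("looks must be an integer from 1 to 5")
    belowaverage = int(looks <= 2)   # ratings 1 and 2
    aboveaverage = int(looks >= 4)   # ratings 4 and 5
    return belowaverage, aboveaverage


for rating in range(1, 6):
    print(rating, looks_dummies(rating))
```

In a regression of wage on these two dummies, each coefficient is then a comparison with people rated average.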
Your conclusions? Did you check assumptions?
Now, run a separate analysis for females and for males.
Your conclusions? Did you check assumptions?
Dummy Variables and the Interaction Term
Consider the Wage1 data set.
Response: log(wage)
Predictor variables: female, married, female*married, educ, exper, exper^2, tenure, and tenure^2.
NOTE: this model needs to be run with general linear regression in Minitab.

| Term | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 0.321378 | 0.100009 | 3.2135 | 0.001 |
| female | -0.110350 | 0.055742 | -1.9797 | 0.048 |
| married | 0.212676 | 0.055357 | 3.8419 | 0.000 |
| female*married | -0.300593 | 0.071767 | -4.1885 | 0.000 |
| educ | 0.078910 | 0.006694 | 11.7873 | 0.000 |
| exper | 0.026801 | 0.005243 | 5.1118 | 0.000 |
| expersq | -0.000535 | 0.000110 | -4.8471 | 0.000 |
| tenure | 0.029088 | 0.006762 | 4.3016 | 0.000 |
| tenursq | -0.000533 | 0.000231 | -2.3056 | 0.022 |

Summary of Model
R-Sq = 46.09% R-Sq(adj) = 45.25%
lwage = 0.321378 - 0.11035 female + 0.212676 married + 0.0789103 educ + 0.0268006 exper - 0.000535245 expersq + 0.0290875 tenure -0.000533142 tenursq - 0.300593 female*married
Interpretation of dummy variables coefficients.
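One way to read the dummy coefficients is to add up the terms that are switched on for each group, which gives that group's differential in log(wage) relative to single males with the same educ, exper, and tenure. The sketch below does this with the coefficients from the output above; the function name `differential` is an illustrative assumption.

```python
# Coefficients copied from the interaction model output.
B_FEMALE = -0.110350
B_MARRIED = 0.212676
B_INTERACTION = -0.300593  # coefficient on female*married


def differential(female: int, married: int) -> float:
    """Log-wage differential relative to single males, other predictors fixed."""
    return (B_FEMALE * female
            + B_MARRIED * married
            + B_INTERACTION * female * married)


for label, female, married in [
    ("single male", 0, 0),
    ("married male", 0, 1),
    ("single female", 1, 0),
    ("married female", 1, 1),
]:
    print(label, round(differential(female, married), 5))
```

The four differentials match the coefficients on marriedmale, marriedfemale, and singlefemale in the earlier model with the same controls, so the two specifications are the same fit written in different ways. The interaction form has the advantage that its coefficient directly tests whether the marriage effect differs by sex.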
Example – Problem C7.2
Complete C7.2 (i), (iii), and (iv) in class