MLB Linear and Robust Modeling Sample
Dan Tetrick
Friday, May 01, 2015
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
Section 1: Set Up Function, Environment, and Data
A. Load in All Packages Functions
source('~/R/R Code/PackagesAndFunctions.R')
PackagesAndFunctions(All = TRUE)
B. Create BatModelData from df with Data Scrubber
BatModelData <- read.csv("~/Baseball Regression/Final Batting Data Full.csv", stringsAsFactors=FALSE)
BatModelData <- Data.Scrubber(BatModelData)
C. Remove Pitchers from BatModelData
BatModelData<-BatModelData[which(BatModelData$POSPlayedMost!="P"),]
D. Remove All Batters with Fewer Than 50 Plate Appearances in Any Year
BatModelData<-BatModelData[which(BatModelData$PA>=50),]
E. Remove All NA’s in Team Statistics
BatModelData<-BatModelData[which(!is.na(BatModelData$W)),]
F. Remove All NA’s and 0’s from Salary Variable
BatModelData<-BatModelData[which(!is.na(BatModelData$Salary)),]
BatModelData<-BatModelData[which(BatModelData$Salary!=0),]
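Steps C through F each apply one row filter. As a minimal sketch, assuming a toy data frame that only borrows the column names used here (the values are made up for illustration), the same four conditions can be combined into a single `subset()` call:

```r
# Toy stand-in for BatModelData; all values are illustrative, not from the real file.
toy <- data.frame(
  POSPlayedMost = c("P", "1B", "SS", "C", "LF"),
  PA            = c(60, 45, 500, 320, 410),
  W             = c(90, 81, NA, 75, 88),
  Salary        = c(500000, 750000, 1200000, 0, 2500000),
  stringsAsFactors = FALSE
)

# Keep non-pitchers with at least 50 plate appearances, known team wins,
# and a known, non-zero salary. subset() drops rows where the condition is NA.
filtered <- subset(
  toy,
  POSPlayedMost != "P" & PA >= 50 & !is.na(W) & !is.na(Salary) & Salary != 0
)
print(filtered)
```

Only the last toy row passes all four conditions. Whether one combined filter or separate steps reads better is a matter of taste; separate steps make it easier to count how many rows each rule removes.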
G. Create a Summary Table of the Overall Model Data
summary(BatModelData)
Section 2: Summary Analysis, Plots, Histograms, Individual Regressions
A. Create a df Containing All of the Bat Model Data to be Used, Which Can Be Factored
BatModelFactors<-BatModelData[,c("teamIDFirst","yearIDBat","bats","throws","USA",
                                 "POSPlayedMost","UtilityPOS","College",
                                 "TotalAllStarSelected","TotalGoldGlove",
                                 "TotalMostValuablePlayer","TotalRookieoftheYear",
                                 "TotalSilverSlugger")]
BatModelFactors<-as.data.frame(BatModelFactors)
B. Create a df Containing All of the Continuous Bat Model Data to be Used
BatContinuous<-BatModelData[,c("Salary","weight","height","YearsExperience","HRBat","RBI",
                               "SBAttempts","BBBat","SOBat","OBPBat","SLGBat","RunsCreated",
                               "FieldingPCT","W","L","BPF","PPF","BodyMassIndex","RBat",
                               "HBat","IBB","HBPBat","SH","SFBat","GIDP","TotalMultiTeams",
                               "AVGBat","TBBat","OPSBat","TotalRunsProduced","EBat","Age")]
C. Rebind the Bat Model Factor and Continuous Variable dfs
BatModel<-as.data.frame(cbind(BatModelFactors,BatContinuous),stringsAsFactors=T)
D. Make the Awards and All-Star Variables Binary
BatModel$TotalAllStarSelected<-ifelse(BatModel$TotalAllStarSelected!=0,1,0)
BatModel$TotalGoldGlove<-ifelse(BatModel$TotalGoldGlove!=0,1,0)
BatModel$TotalMostValuablePlayer<-ifelse(BatModel$TotalMostValuablePlayer!=0,1,0)
BatModel$TotalSilverSlugger<-ifelse(BatModel$TotalSilverSlugger!=0,1,0)
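The four `ifelse()` calls repeat one pattern: any non-zero count becomes 1. A minimal sketch of the same recoding done in one pass, assuming a toy data frame with made-up counts (only the column names come from this document):

```r
# Toy award counts; the values are illustrative only.
awards <- data.frame(
  TotalAllStarSelected    = c(0, 3, 1),
  TotalGoldGlove          = c(2, 0, 0),
  TotalMostValuablePlayer = c(0, 0, 1),
  TotalSilverSlugger      = c(0, 5, 0)
)

# Recode every listed column to 1 if the count is non-zero, else 0.
award_cols <- c("TotalAllStarSelected", "TotalGoldGlove",
                "TotalMostValuablePlayer", "TotalSilverSlugger")
awards[award_cols] <- lapply(awards[award_cols], function(x) as.integer(x != 0))
print(awards)
```

As in the original step, `TotalRookieoftheYear` is left out of the list; adding it to `award_cols` would be a change to the analysis, not just to the code style.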
E. Create a Summary Table of the Bat Model Data to be Used in the Modeling Processes
(SummaryBatModelData<-summary(BatModel))
F. Create a Simple Plot of All Continuous Variables in the Bat Model Data
Plotter(df=BatContinuous,PlotterPath, PlotTitle = "MLB Bat Model Data Set Plot")
G. Create a Simple Histogram of All Continuous Variables
Histograms(df = BatContinuous,HistogramPath,Title = "BatContinuous Histograms")
H. Create a Scatterplot of Salary Regressed on Model Variables
SimpleScatter(df=BatContinuous,"Salary")
Section 3: Create a Linear Model Based on Entirety of Data
A. Create Bat Model Specs
BatModSpecs <<- paste0(
  "log(Salary) ~ teamIDFirst + yearIDBat + bats + throws + POSPlayedMost +",
  "UtilityPOS + College + height + YearsExperience + W + ",
  "BPF + TotalMultiTeams + RunsCreated + IBB +",
  "TotalAllStarSelected + TotalGoldGlove + ",
  "TotalRookieoftheYear"
)
B. Create Linear Model for Initial Analysis
LinearModel<-lm(as.formula(BatModSpecs),BatModel)
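The model specification is stored as a string and converted with `as.formula()` so that the same specification can be reused by later fits. As an alternative sketch (not the approach used in this report), base R's `reformulate()` builds the formula from a vector of term names, which avoids hand-managing the `+` separators. The built-in `mtcars` data set stands in for the batting data here:

```r
# Build a formula from a character vector of predictor names.
# mtcars ships with R and is used only as a stand-in data set.
predictors <- c("wt", "hp", "factor(cyl)")
spec <- reformulate(predictors, response = "log(mpg)")
print(spec)

# The resulting formula can be passed to lm() like any other.
fit <- lm(spec, data = mtcars)
print(coef(fit))
```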
C. Summarize Results
SummaryLinearModel<-summary(LinearModel)
D. Perform ANOVA
(anova(LinearModel))
## Analysis of Variance Table
##
## Response: log(Salary)
##                         Df Sum Sq Mean Sq    F value                 Pr(>F)
## teamIDFirst             34 1128.2    33.2    56.0286 < 0.00000000000000022 ***
## yearIDBat                1 4020.8  4020.8  6789.2926 < 0.00000000000000022 ***
## bats                     2    3.4     1.7     2.8949               0.05534 .
## throws                   1   16.1    16.1    27.1872 0.0000001875186932552 ***
## POSPlayedMost            8  362.1    45.3    76.4253 < 0.00000000000000022 ***
## UtilityPOS               1 1620.5  1620.5  2736.2784 < 0.00000000000000022 ***
## College                  1   39.8    39.8    67.2073 0.0000000000000002673 ***
## height                   1   11.7    11.7    19.7012 0.0000091290701096658 ***
## YearsExperience          1 7415.8  7415.8 12521.8488 < 0.00000000000000022 ***
## W                        1   40.4    40.4    68.1410 < 0.00000000000000022 ***
## BPF                      1    4.2     4.2     7.1300               0.00759 **
## TotalMultiTeams          1  188.8   188.8   318.7442 < 0.00000000000000022 ***
## RunsCreated              1 2770.1  2770.1  4677.4347 < 0.00000000000000022 ***
## IBB                      1   52.8    52.8    89.1738 < 0.00000000000000022 ***
## TotalAllStarSelected     1   26.6    26.6    44.8479 0.0000000000221728398 ***
## TotalGoldGlove           1   14.4    14.4    24.3445 0.0000008155481121597 ***
## TotalRookieoftheYear     1   14.4    14.4    24.2804 0.0000008431199481025 ***
## Residuals            12959 7674.7     0.6
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
E. Identify Potential Outliers
(outlierTest(LinearModel))
##        rstudent unadjusted p-value Bonferonni p
## 26179 -5.032587      0.00000049039    0.0063839
## 7624  -4.744933      0.00000210800    0.0274420
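The Bonferroni column can be checked by hand. A short sketch, assuming the adjustment is the unadjusted p-value multiplied by the number of observations (capped at 1), with the observation count taken as the 12959 residual degrees of freedom plus the 59 estimated coefficients reported in the model summary:

```r
# Unadjusted p-values copied from the outlierTest output.
p_unadjusted <- c("26179" = 0.00000049039, "7624" = 0.00000210800)

# Number of observations: residual df plus number of estimated coefficients.
n_obs <- 12959 + 59

# Bonferroni adjustment: multiply by the number of tests, cap at 1.
p_bonferroni <- pmin(1, p_unadjusted * n_obs)
print(signif(p_bonferroni, 5))
```

The results agree with the printed Bonferroni p-values up to the rounding of the unadjusted inputs.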
F. Identify Multicollinearity Issues
(vif(LinearModel))
##                          GVIF Df GVIF^(1/(2*Df))
## teamIDFirst          3.861708 34        1.020068
## yearIDBat            1.205054  1        1.097750
## bats                 1.642685  2        1.132110
## throws               1.624370  1        1.274508
## POSPlayedMost        2.065586  8        1.046382
## UtilityPOS           1.234240  1        1.110963
## College              1.065733  1        1.032343
## height               1.278291  1        1.130615
## YearsExperience      1.347075  1        1.160636
## W                    1.201951  1        1.096335
## BPF                  2.422080  1        1.556303
## TotalMultiTeams      1.253269  1        1.119495
## RunsCreated          2.134285  1        1.460919
## IBB                  1.842983  1        1.357565
## TotalAllStarSelected 1.540961  1        1.241355
## TotalGoldGlove       1.264041  1        1.124296
## TotalRookieoftheYear 1.104717  1        1.051055
sqrt(vif(LinearModel)) > 2
##                       GVIF    Df GVIF^(1/(2*Df))
## teamIDFirst          FALSE  TRUE           FALSE
## yearIDBat            FALSE FALSE           FALSE
## bats                 FALSE FALSE           FALSE
## throws               FALSE FALSE           FALSE
## POSPlayedMost        FALSE  TRUE           FALSE
## UtilityPOS           FALSE FALSE           FALSE
## College              FALSE FALSE           FALSE
## height               FALSE FALSE           FALSE
## YearsExperience      FALSE FALSE           FALSE
## W                    FALSE FALSE           FALSE
## BPF                  FALSE FALSE           FALSE
## TotalMultiTeams      FALSE FALSE           FALSE
## RunsCreated          FALSE FALSE           FALSE
## IBB                  FALSE FALSE           FALSE
## TotalAllStarSelected FALSE FALSE           FALSE
## TotalGoldGlove       FALSE FALSE           FALSE
## TotalRookieoftheYear FALSE FALSE           FALSE
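Because `vif()` returns a three-column matrix when the model contains multi-degree-of-freedom terms, `sqrt(...) > 2` is applied to every column, including `Df`. The two `TRUE` entries therefore only say that the square roots of 34 and 8 exceed 2; they do not indicate collinearity. A minimal sketch of one way to restrict the rule of thumb to the `GVIF^(1/(2*Df))` column, using a few rows copied by hand from the output (the matrix below is a stand-in, not a call to `vif()`):

```r
# A few rows copied from the vif() output, stored as a plain matrix.
gvif <- matrix(
  c(3.861708, 34, 1.020068,
    2.065586,  8, 1.046382,
    2.422080,  1, 1.556303,
    2.134285,  1, 1.460919),
  ncol = 3, byrow = TRUE,
  dimnames = list(
    c("teamIDFirst", "POSPlayedMost", "BPF", "RunsCreated"),
    c("GVIF", "Df", "GVIF^(1/(2*Df))")
  )
)

# GVIF^(1/(2*Df)) equals sqrt(VIF) for a one-df term, so the
# "sqrt(VIF) > 2" rule of thumb is applied to this column only.
flagged <- gvif[, "GVIF^(1/(2*Df))"] > 2
print(flagged)
```

None of the copied terms is flagged, which matches the `FALSE` entries in the first and third columns of the printed table.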
G. Plot Normality of Residuals
sresid <- studres(LinearModel)
hist(sresid, freq=FALSE,
     main="Distribution of Studentized Residuals")
xModel<-seq(min(sresid),max(sresid),length=40)
yModel<-dnorm(xModel)
lines(xModel, yModel)
H. Evaluate Homoscedasticity with the Non-Constant Error Variance Test
(ncvTest(LinearModel))
## Non-constant Variance Score Test
## Variance formula: ~ fitted.values
## Chisquare = 1245.702    Df = 1     p = 7.132171e-273
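For readers who want to see what a score test of this kind computes, here is a hedged sketch of the Breusch-Pagan / Cook-Weisberg statistic on simulated data. It is an illustration under the assumption that the variance depends on the fitted values, not a reproduction of the `ncvTest()` internals, and the simulated data have nothing to do with the batting data:

```r
set.seed(1)
x <- runif(200, 1, 10)
y <- 2 + 0.5 * x + rnorm(200, sd = 0.3 * x)   # error spread grows with x
fit <- lm(y ~ x)

# Score statistic: scale squared residuals by their mean, regress them on the
# fitted values, and take half of the regression sum of squares.
u      <- residuals(fit)^2 / mean(residuals(fit)^2)
aux    <- lm(u ~ fitted(fit))
reg_ss <- sum((fitted(aux) - mean(u))^2)
chisq  <- reg_ss / 2
p_val  <- pchisq(chisq, df = 1, lower.tail = FALSE)
print(c(Chisquare = chisq, p = p_val))
```

A small p-value rejects constant error variance, which is the same reading applied to the test reported in this section.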
I. Test for Autocorrelated Errors
(durbinWatsonTest(LinearModel))
##  lag Autocorrelation D-W Statistic p-value
##    1        0.455044      1.089884       0
##  Alternative hypothesis: rho != 0
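The Durbin-Watson statistic is the sum of squared successive differences of the residuals divided by their sum of squares, and it is approximately 2(1 - r), where r is the lag-1 autocorrelation. With the values reported here, 2 × (1 − 0.455044) ≈ 1.0899, which agrees with the printed statistic of 1.089884. A minimal sketch on simulated data (the AR(1) coefficient of 0.5 is an arbitrary choice for illustration):

```r
set.seed(42)
n <- 300
x <- seq_len(n)
# AR(1) errors with positive autocorrelation.
e <- as.numeric(arima.sim(list(ar = 0.5), n = n))
y <- 1 + 0.05 * x + e
fit <- lm(y ~ x)

res <- residuals(fit)
dw  <- sum(diff(res)^2) / sum(res^2)          # Durbin-Watson statistic
r1  <- sum(res[-1] * res[-n]) / sum(res^2)    # lag-1 autocorrelation of residuals

print(c(DW = dw, approx = 2 * (1 - r1)))
```

Values well below 2 point to positive autocorrelation. For this data set that is plausible because the same players appear in consecutive seasons.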
J. More Diagnostic Plots
plot(LinearModel)
[Figure: Residuals vs Fitted, lm(as.formula(BatModSpecs)). x-axis: Fitted values; y-axis: Residuals. Labeled observations: 26179, 7624, 22217.]
[Figure: Normal Q-Q, lm(as.formula(BatModSpecs)). x-axis: Theoretical Quantiles; y-axis: Standardized residuals. Labeled observations: 26179, 7624, 22217.]
[Figure: Scale-Location, lm(as.formula(BatModSpecs)). x-axis: Fitted values; y-axis: square root of |Standardized residuals|. Labeled observations: 26179, 7624, 22217.]
[Figure: Residuals vs Leverage with Cook's distance contours, lm(as.formula(BatModSpecs)). x-axis: Leverage; y-axis: Standardized residuals. Labeled observations include 7624.]
K. Global Test of Model Assumptions
gvmodel <- gvlma(LinearModel)
(summary(gvmodel))
##
## Call:
## lm(formula = as.formula(BatModSpecs), data = BatModel)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.8588 -0.4787 -0.0281  0.4639  3.0105
##
## Coefficients:
##                          Estimate Std. Error t value               Pr(>|t|)
## (Intercept)          -127.9132212  1.7564528 -72.825   < 0.0000000000000002 ***
## teamIDFirstARI         -0.2146555  0.0847453  -2.533               0.011322 *
## teamIDFirstATL         -0.1328926  0.0785968  -1.691               0.090896 .
## teamIDFirstBAL         -0.0791251  0.0778904  -1.016               0.309719
## teamIDFirstBOS          0.0349063  0.0788499   0.443               0.657995
## teamIDFirstCAL         -0.2362334  0.0888873  -2.658               0.007878 **
## teamIDFirstCHA         -0.0901495  0.0786144  -1.147               0.251514
## teamIDFirstCHN         -0.1134147  0.0787268  -1.441               0.149719
## teamIDFirstCIN         -0.2416669  0.0782027  -3.090               0.002004 **
## teamIDFirstCLE         -0.1927796  0.0784724  -2.457               0.014037 *
## teamIDFirstCOL         -0.2649864  0.0894058  -2.964               0.003044 **
## teamIDFirstDET         -0.1431190  0.0782229  -1.830               0.067328 .
## teamIDFirstFLO         -0.3646289  0.0837314  -4.355   0.000013424453105854 ***
## teamIDFirstHOU         -0.2070188  0.0781166  -2.650               0.008056 **
## teamIDFirstKCA         -0.2249127  0.0778122  -2.890               0.003853 **
## teamIDFirstLAA         -0.0059988  0.0939017  -0.064               0.949063
## teamIDFirstLAN         -0.0207224  0.0788527  -0.263               0.792709
## teamIDFirstMIA         -0.4899488  0.1290776  -3.796               0.000148 ***
## teamIDFirstMIL         -0.1927897  0.0851475  -2.264               0.023579 *
## teamIDFirstMIN         -0.1796819  0.0777329  -2.312               0.020819 *
## teamIDFirstML4         -0.1748943  0.0877154  -1.994               0.046186 *
## teamIDFirstMON         -0.3166123  0.0821394  -3.855               0.000116 ***
## teamIDFirstNYA          0.1187761  0.0786234   1.511               0.130890
## teamIDFirstNYN         -0.0666162  0.0784213  -0.849               0.395638
## teamIDFirstOAK         -0.1936208  0.0784794  -2.467               0.013632 *
## teamIDFirstPHI         -0.1428800  0.0786145  -1.817               0.069167 .
## teamIDFirstPIT         -0.2157641  0.0788839  -2.735               0.006243 **
## teamIDFirstSDN         -0.2440672  0.0790886  -3.086               0.002033 **
## teamIDFirstSEA         -0.0810341  0.0782881  -1.035               0.300653
## teamIDFirstSFN         -0.1076044  0.0786374  -1.368               0.171223
## teamIDFirstSLN         -0.0733263  0.0782593  -0.937               0.348793
## teamIDFirstTBA         -0.3196683  0.0842131  -3.796               0.000148 ***
## teamIDFirstTEX         -0.2496555  0.0781528  -3.194               0.001404 **
## teamIDFirstTOR         -0.0959604  0.0784763  -1.223               0.221430
## teamIDFirstWAS         -0.2036729  0.0940338  -2.166               0.030333 *
## yearIDBat               0.0696856  0.0008577  81.244   < 0.0000000000000002 ***
## batsL                  -0.0835461  0.0234209  -3.567               0.000362 ***
## batsR                  -0.0180927  0.0197007  -0.918               0.358438
## throwsR                 0.0331896  0.0241859   1.372               0.170002
## POSPlayedMost2B         0.0252328  0.0310816   0.812               0.416908
## POSPlayedMost3B        -0.0187086  0.0293550  -0.637               0.523925
## POSPlayedMostC         -0.1226876  0.0283942  -4.321   0.000015657802835226 ***
## POSPlayedMostCF         0.1180692  0.0296855   3.977   0.000070071023176836 ***
## POSPlayedMostDH        -0.2293316  0.0501588  -4.572   0.000004872993258926 ***
## POSPlayedMostLF         0.0725315  0.0275052   2.637               0.008374 **
## POSPlayedMostRF         0.1072989  0.0283600   3.783               0.000155 ***
## POSPlayedMostSS         0.0652769  0.0311159   2.098               0.035937 *
## UtilityPOSUTIL         -0.3094957  0.0154371 -20.049   < 0.0000000000000002 ***
## College                 0.0449063  0.0139296   3.224               0.001268 **
## height                  0.0079875  0.0036149   2.210               0.027151 *
## YearsExperience         0.1623170  0.0017876  90.803   < 0.0000000000000002 ***
## W                      -0.0009419  0.0006209  -1.517               0.129293
## BPF                     0.0016842  0.0021848   0.771               0.440802
## TotalMultiTeams        -0.1102374  0.0130506  -8.447   < 0.0000000000000002 ***
## RunsCreated             0.0132807  0.0002950  45.015   < 0.0000000000000002 ***
## IBB                     0.0179900  0.0022128   8.130   0.000000000000000468 ***
## TotalAllStarSelected    0.1655547  0.0285624   5.796   0.000000006939486785 ***
## TotalGoldGlove          0.2067810  0.0366641   5.640   0.000000017372801593 ***
## TotalRookieoftheYear   -0.3132058  0.0635626  -4.928   0.000000843119948103 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7696 on 12959 degrees of freedom
## Multiple R-squared: 0.6979, Adjusted R-squared: 0.6966
## F-statistic: 516.2 on 58 and 12959 DF, p-value: < 0.00000000000000022
##
##
## ASSESSMENT OF THE LINEAR MODEL ASSUMPTIONS
## USING THE GLOBAL TEST ON 4 DEGREES-OF-FREEDOM:
## Level of Significance = 0.05
##
## Call:
## gvlma(x = LinearModel)
##
##                        Value        p-value                   Decision
## Global Stat        187.91347 0.000000000000 Assumptions NOT satisfied!
## Skewness             1.03870 0.308123031028    Assumptions acceptable.
## Kurtosis           151.06812 0.000000000000 Assumptions NOT satisfied!
## Link Function       35.72387 0.000000002274 Assumptions NOT satisfied!
## Heteroscedasticity   0.08277 0.773571576173    Assumptions acceptable.
##
##                           Value           p-value                   Decision
## Global Stat        187.91346778 0.000000000000000 Assumptions NOT satisfied!
## Skewness             1.03870421 0.308123031028374    Assumptions acceptable.
## Kurtosis           151.06811930 0.000000000000000 Assumptions NOT satisfied!
## Link Function       35.72386937 0.000000002273612 Assumptions NOT satisfied!
## Heteroscedasticity   0.08277491 0.773571576172987    Assumptions acceptable.
Section 4: Create a Robust Model Using Entirety of Data
A. Create the Robust Model and View Summary
RobustModel<-rlm(formula = as.formula(BatModSpecs),data = BatModel)
RobustSummaryModel<-summary(RobustModel)
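To illustrate why a robust fit is worth comparing against ordinary least squares, here is a small sketch on simulated data with two planted outliers. It assumes the `MASS` package is installed (it provides `rlm()`, whose default is Huber M-estimation); the data and the size of the outliers are invented for illustration:

```r
library(MASS)  # provides rlm(); MASS is distributed with R

set.seed(7)
x <- 1:50
y <- 3 + 2 * x + rnorm(50)
y[c(10, 40)] <- y[c(10, 40)] + 60   # two artificial outliers

ols    <- lm(y ~ x)
robust <- rlm(y ~ x)                # Huber M-estimation by default

# Observations with large residuals receive final IWLS weights below 1.
print(round(robust$w[c(10, 40)], 3))
print(rbind(OLS = coef(ols), Robust = coef(robust)))
```

The down-weighting of the two planted points is the mechanism that keeps the robust coefficients close to the values used to simulate the data.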
B. Perform ANOVA
(anova(RobustModel))
## Analysis of Variance Table
##
## Response: log(Salary)
##                      Df Sum Sq Mean Sq F value Pr(>F)
## teamIDFirst          34 1099.2    32.3
## yearIDBat             1 3762.5  3762.5
## bats                  2    4.2     2.1
## throws                1   13.1    13.1
## POSPlayedMost         8  353.2    44.2
## UtilityPOS            1 1569.5  1569.5
## College               1   44.0    44.0
## height                1    9.4     9.4
## YearsExperience       1 7714.3  7714.3
## W                     1   49.3    49.3
## BPF                   1    2.7     2.7
## TotalMultiTeams       1  178.9   178.9
## RunsCreated           1 2654.6  2654.6
## IBB                   1   60.6    60.6
## TotalAllStarSelected  1   39.6    39.6
## TotalGoldGlove        1   15.3    15.3
## TotalRookieoftheYear  1   13.9    13.9
## Residuals               7700.6
E. Identify Potential Outliers
(outlierTest(RobustModel))
##        rstudent unadjusted p-value Bonferonni p
## 26179 -5.203332      0.00000019874    0.0025871
## 7624  -5.186704      0.00000021728    0.0028285
F. Identify Multicollinearity Issues
(vif(RobustModel))
##                          GVIF Df GVIF^(1/(2*Df))
## teamIDFirst          3.861708 34        1.020068
## yearIDBat            1.205054  1        1.097750
## bats                 1.642685  2        1.132110
## throws               1.624370  1        1.274508
## POSPlayedMost        2.065586  8        1.046382
## UtilityPOS           1.234240  1        1.110963
## College              1.065733  1        1.032343
## height               1.278291  1        1.130615
## YearsExperience      1.347075  1        1.160636
## W                    1.201951  1        1.096335
## BPF                  2.422080  1        1.556303
## TotalMultiTeams      1.253269  1        1.119495
## RunsCreated          2.134285  1        1.460919
## IBB                  1.842983  1        1.357565
## TotalAllStarSelected 1.540961  1        1.241355
## TotalGoldGlove       1.264041  1        1.124296
## TotalRookieoftheYear 1.104717  1        1.051055
(sqrt(vif(RobustModel)) > 2)
##                       GVIF    Df GVIF^(1/(2*Df))
## teamIDFirst          FALSE  TRUE           FALSE
## yearIDBat            FALSE FALSE           FALSE
## bats                 FALSE FALSE           FALSE
## throws               FALSE FALSE           FALSE
## POSPlayedMost        FALSE  TRUE           FALSE
## UtilityPOS           FALSE FALSE           FALSE
## College              FALSE FALSE           FALSE
## height               FALSE FALSE           FALSE
## YearsExperience      FALSE FALSE           FALSE
## W                    FALSE FALSE           FALSE
## BPF                  FALSE FALSE           FALSE
## TotalMultiTeams      FALSE FALSE           FALSE
## RunsCreated          FALSE FALSE           FALSE
## IBB                  FALSE FALSE           FALSE
## TotalAllStarSelected FALSE FALSE           FALSE
## TotalGoldGlove       FALSE FALSE           FALSE
## TotalRookieoftheYear FALSE FALSE           FALSE
G. Plot Normality of Residuals
sresid <- studres(RobustModel)
hist(sresid, freq=FALSE,
     main="Distribution of Studentized Residuals")
xModel<-seq(min(sresid),max(sresid),length=40)
yModel<-dnorm(xModel)
lines(xModel, yModel)
H. Evaluate Homoscedasticity with the Non-Constant Error Variance Test
(ncvTest(RobustModel))
## Non-constant Variance Score Test
## Variance formula: ~ fitted.values
## Chisquare = 2197.184    Df = 1     p = 0
I. Test for Autocorrelated Errors
(durbinWatsonTest(RobustModel))
##  lag Autocorrelation D-W Statistic p-value
##    1       0.4439713       1.11203       0
##  Alternative hypothesis: rho != 0
J. More Diagnostic Plots
(plot.lmRob(RobustModel))
[Figure: Residuals vs. Fitted Values, RobustModel. x-axis: Fitted Values; y-axis: Residuals. Labeled observations: 3156, 1185, 3722.]
[Figure: Normal QQ Plot of Modified Residuals, RobustModel. x-axis: Standard Normal Quantiles; y-axis: Empirical Quantiles of Modified Residuals. Labeled observations: 3156, 1185, 3722.]
[Figure: Scale-Location, RobustModel. x-axis: Fitted Values; y-axis: Modified Residuals. Labeled observations: 3156, 1185, 3722.]
[Figure: Modified Residuals vs. Leverage, RobustModel. x-axis: Leverage; y-axis: Modified Residuals. Labeled observations: 3156, 1185, 3722.]
## Call:
## rlm(formula = as.formula(BatModSpecs), data = BatModel)
## Converged in 6 iterations
##
## Coefficients:
##          (Intercept)       teamIDFirstARI       teamIDFirstATL
##      -128.2436598590        -0.2091091221        -0.1209566046
##       teamIDFirstBAL       teamIDFirstBOS       teamIDFirstCAL
##        -0.0898196200         0.0321250479        -0.2385107902
##       teamIDFirstCHA       teamIDFirstCHN       teamIDFirstCIN
##        -0.0662408849        -0.1168865090        -0.2352958910
##       teamIDFirstCLE       teamIDFirstCOL       teamIDFirstDET
##        -0.1788737920        -0.2498578845        -0.1737538577
##       teamIDFirstFLO       teamIDFirstHOU       teamIDFirstKCA
##        -0.3629384323        -0.2140456199        -0.2152045859
##       teamIDFirstLAA       teamIDFirstLAN       teamIDFirstMIA
##        -0.0115605236        -0.0292576292        -0.4866716336
##       teamIDFirstMIL       teamIDFirstMIN       teamIDFirstML4
##        -0.1955582629        -0.1764502116        -0.1937411671
##       teamIDFirstMON       teamIDFirstNYA       teamIDFirstNYN
##        -0.3097606940         0.0942372746        -0.0858051599
##       teamIDFirstOAK       teamIDFirstPHI       teamIDFirstPIT
##        -0.2159533180        -0.1560319741        -0.2166738066
##       teamIDFirstSDN       teamIDFirstSEA       teamIDFirstSFN
##        -0.2539300564        -0.0918286993        -0.1044125329
##       teamIDFirstSLN       teamIDFirstTBA       teamIDFirstTEX
##        -0.0931436058        -0.3169427376        -0.2440492661
##       teamIDFirstTOR       teamIDFirstWAS            yearIDBat
##        -0.0934758688        -0.1994119196         0.0699192303
##                batsL                batsR              throwsR
##        -0.0814708172        -0.0171295569         0.0362715591
##      POSPlayedMost2B      POSPlayedMost3B       POSPlayedMostC
##         0.0281920605        -0.0243023161        -0.1282680333
##      POSPlayedMostCF      POSPlayedMostDH      POSPlayedMostLF
##         0.1090328956        -0.2580475652         0.0763428729
##      POSPlayedMostRF      POSPlayedMostSS       UtilityPOSUTIL
##         0.1074938032         0.0654719517        -0.2994100738
##              College               height      YearsExperience
##         0.0406298717         0.0058110589         0.1697009085
##                    W                  BPF      TotalMultiTeams
##        -0.0002097312         0.0007611303        -0.1100364805
##          RunsCreated                  IBB TotalAllStarSelected
##         0.0132393078         0.0207097167         0.2118631512
##       TotalGoldGlove TotalRookieoftheYear
##         0.2210210758        -0.3239296289
##
## Degrees of freedom: 13018 total; 12959 residual
## Scale estimate: 0.686
K. Global Test of Model Assumptions
gvmodel <- gvlma(RobustModel)
(summary(gvmodel))
From here, the next round of modeling would occur to correct the issues identified in the first round.