Linear Statistical Models as a First Statistics Course for Math Majors
George W. Cobb, Mount Holyoke College, [email protected]. CAUSE Webinar, October 12, 2010. Overview: A. Goals for a first stat course for math majors; B. Example of a modeling challenge
![Page 1: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/1.jpg)
Linear Statistical Models as a First Statistics Course for Math Majors
George W. Cobb
Mount Holyoke College
[email protected]
CAUSE Webinar
October 12, 2010
![Page 2: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/2.jpg)
Overview
A. Goals for a first stat course for math majors
B. Example of a modeling challenge
C. Examples of methodological challenges
D. Some important tensions
E. Two geometries
F. Gauss-Markov Theorem
G. Conclusion
![Page 3: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/3.jpg)
A. Goals for a First Statistics Course for Math Majors:
1. Minimize prerequisites
2. Teach what we want students to learn
- Data analysis and modeling
- Methodological challenges
- Current practice
3. Appeal to the mathematical mind
- Mathematical substance
- Abstraction as process
- Math for its own sake
![Page 4: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/4.jpg)
B. A Modeling Challenge: Pattern only – How many groups?
![Page 5: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/5.jpg)
B. A Modeling Challenge: Pattern plus context: two lines?
![Page 6: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/6.jpg)
B. A Modeling Challenge: Lurking variable 1: The five solid dots
![Page 7: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/7.jpg)
B. A Modeling Challenge: Lurking variable 1: The five solid dots
![Page 8: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/8.jpg)
B. A Modeling Challenge: Lurking variable 2 – confounding
![Page 9: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/9.jpg)
B. A Modeling Challenge: Lurking variable 2 – confounding
![Page 10: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/10.jpg)
B. A Modeling Challenge: Lurking variable 2 – confounding
![Page 11: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/11.jpg)
B. The Modeling Challenge: summary
• Do the data provide evidence of discrimination?
• Alternative explanations based on classical economics
• Additional variables:
- percent unemployed in the subject
- percent non-academic jobs in the subject
- median non-academic salary in the subject
• Which model(s) are most useful?
![Page 12: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/12.jpg)
C. Methodological Challenges
• How to “solve” an inconsistent linear system?
  Stigler, 1990: The History of Statistics
• How to measure goodness of fit?
  (Invariance issues)
• How to identify influential points?
  Exploratory plots
• How to measure multicollinearity?
Note that none of these require any assumptions about probability distributions.
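The four challenges above can be illustrated numerically without any probability assumptions. The following is a minimal sketch, not taken from the slides: the data values, the second covariate `x2`, and the choice of R-squared, hat-matrix leverages, and a two-covariate variance inflation factor as the measures are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical data (not from the slides): 5 cases, one covariate plus intercept.
x = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 10.5])
X = np.column_stack([np.ones_like(x), x])

# 1. "Solve" the inconsistent system X b = y in the least-squares sense.
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b_hat

# 2. Goodness of fit: R^2 = 1 - SSE / SST.
r_squared = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# 3. Influence: leverages are the diagonal of the hat matrix
#    H = X (X^T X)^{-1} X^T.
H = X @ np.linalg.solve(X.T @ X, X.T)
leverage = np.diag(H)

# 4. Multicollinearity: with two covariates, the variance inflation
#    factor is 1 / (1 - r^2), where r is their correlation.
x2 = np.array([2.1, 3.9, 6.2, 8.1, 19.5])  # hypothetical, nearly 2 * x
r = np.corrcoef(x, x2)[0, 1]
vif = 1.0 / (1.0 - r ** 2)

print("coefficients:", b_hat)
print("R^2:", r_squared)
print("leverages:", leverage)
print("VIF:", vif)
```

In this made-up data set the case with x = 10 sits far from the others, so its leverage is the largest of the five.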
![Page 13: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/13.jpg)
D. Some Important Tensions
1. Data analysis v. methodological challenges
2. Abstraction: Top down v. bottom up
3. Math as tool v. math as aesthetic object
4. Structure by dimension v. structure by assumptions

| Distribution assumptions | Number of covariates: One | Two | Many |
| --- | --- | --- | --- |
| A. None | | | |
| B. Moments | | | |
| C. Normality | | | |

5. Two geometries
![Page 14: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/14.jpg)
D4. Structure by assumptions
A. No distribution assumptions about errors
1. Inconsistent linear systems; OLS Theorem
2. Measuring fit and correlation
3. Measuring influence and the Hat matrix
4. Measuring multicollinearity
B. Moment assumptions: E{e} = 0, Var{e} = σ²I
1. Moment Theorem: EV and Var for OLS estimators
2. Variance Estimation Theorem: E{MSE} = σ²
3. Gauss-Markov Theorem: OLS = BLUE
C. Normality assumption: e ~ N
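The moment results listed under B can be written out in standard notation. This is a sketch rather than the slides' own statement: the design matrix $X$ ($n \times p$, full column rank), the coefficient vector $\beta$, and the reading of MSE as the residual sum of squares divided by $n - p$ are assumptions introduced here.

```latex
\begin{align*}
y &= X\beta + e, \qquad E\{e\} = 0, \quad \mathrm{Var}\{e\} = \sigma^2 I \\
\hat\beta &= (X^T X)^{-1} X^T y \\
E\{\hat\beta\} &= \beta, \qquad \mathrm{Var}\{\hat\beta\} = \sigma^2 (X^T X)^{-1} \\
\mathrm{MSE} &= \frac{\lVert y - X\hat\beta \rVert^2}{n - p}, \qquad E\{\mathrm{MSE}\} = \sigma^2
\end{align*}
```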
![Page 15: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/15.jpg)
D4. Structure by assumptions
C. Normality assumption: e ~ N
1. Herschel-Maxwell Theorem
2. Distribution of OLS estimators
3. t-distribution and confidence intervals for βj
4. Chi-square distribution and confidence interval for σ²
5. F-distribution and nested F-test
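Items 2 through 5 correspond to the following textbook distribution results, sketched here on the assumption that "e ~ N" abbreviates $e \sim N(0, \sigma^2 I)$. The symbols $X$, $n$, $p$, and $q$ (the number of coefficients in the reduced model of the nested test) are not in the slides and are introduced only for this summary.

```latex
\begin{align*}
\hat\beta &\sim N\!\left(\beta,\ \sigma^2 (X^T X)^{-1}\right) \\
\frac{\hat\beta_j - \beta_j}{\sqrt{\mathrm{MSE}\,\left[(X^T X)^{-1}\right]_{jj}}} &\sim t_{n-p} \\
\frac{(n - p)\,\mathrm{MSE}}{\sigma^2} &\sim \chi^2_{n-p} \\
\frac{(\mathrm{SSE}_{\text{reduced}} - \mathrm{SSE}_{\text{full}})/(p - q)}{\mathrm{SSE}_{\text{full}}/(n - p)} &\sim F_{p-q,\ n-p}
\quad \text{(when the reduced model holds)}
\end{align*}
```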
![Page 16: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/16.jpg)
E. Two Geometries: the Crystal Problem
(Tom Moore, Primus, 1992)
y1 = b + e1
y2 = 2b + e2
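For reference, the least-squares estimate of b in this two-equation system can be worked out directly. This short derivation is a sketch added here; the function name $Q$ is an assumption and does not appear in the slides.

```latex
\begin{align*}
Q(b) &= (y_1 - b)^2 + (y_2 - 2b)^2 \\
Q'(b) &= -2(y_1 - b) - 4(y_2 - 2b) = 0 \\
5b &= y_1 + 2y_2 \\
\hat b &= \tfrac{1}{5}\,y_1 + \tfrac{2}{5}\,y_2
\end{align*}
```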
![Page 17: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/17.jpg)
E. Two Geometries: Crystal Problem

| Individual Space | Variable Space |
| --- | --- |
| Point = Case | Vector = Variable |
| Axis = Variable | Axis = Case |
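
The two pictures lead to the same estimate for the crystal problem. The sketch below uses hypothetical observations `y = (1.2, 1.8)`, which are not from the slides. In individual space it fits a line through the origin to the points (1, y1) and (2, y2); in variable space it projects the vector y onto the vector x = (1, 2).

```python
import numpy as np

x = np.array([1.0, 2.0])   # coefficients of b in the crystal problem
y = np.array([1.2, 1.8])   # hypothetical observations (not from the slides)

# Individual space: each case is a point (x_i, y_i); fit the line through
# the origin that minimizes the sum of squared vertical distances.
coef, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)
b_individual = coef[0]

# Variable space: each variable is a vector in R^2; project y onto x.
y_hat = (x @ y) / (x @ x) * x
b_variable = y_hat[0] / x[0]
resid = y - y_hat          # residual vector, orthogonal to x

print(b_individual, b_variable, resid @ x)
```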
![Page 18: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/18.jpg)
F. Gauss-Markov Theorem:
• Crystal Example: y1 = b + e1, y2 = 2b + e2
• LINEAR: estimator = a1y1 + a2y2
• UNBIASED: a1b + a2(2b) = b, i.e. a1 + 2a2 = 1.
– Ex: y1 = 1y1 + 0y2, a = (1, 0)^T
– Ex: (1/2)y2 = 0y1 + (1/2)y2, a = (0, 1/2)^T
– Ex: (1/5)y1 + (2/5)y2, a = (1/5, 2/5)^T
• BEST: SD²(a^T y) = σ²|a|², so best = shortest a
• THEOREM: OLS = BLUE
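The three example estimators can be compared numerically. The sketch below checks the unbiasedness constraint a1 + 2a2 = 1 for each coefficient vector and computes the squared length |a|², which is the variance of the estimator in units of σ². The dictionary labels are chosen here for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0])  # y1 = 1*b + e1, y2 = 2*b + e2

# Coefficient vectors a for the three linear estimators on the slide.
estimators = {
    "y1": np.array([1.0, 0.0]),
    "y2/2": np.array([0.0, 0.5]),
    "OLS": x / (x @ x),   # equals (1/5, 2/5)
}

# Unbiased exactly when a . x = a1 + 2*a2 = 1.
unbiased = {name: bool(np.isclose(a @ x, 1.0)) for name, a in estimators.items()}

# Var(a^T y) = sigma^2 * |a|^2, so compare squared lengths.
variance_factor = {name: float(a @ a) for name, a in estimators.items()}

for name in estimators:
    print(name, unbiased[name], variance_factor[name])
```

All three satisfy the constraint, and the OLS vector has the smallest squared length (1/5, against 1 and 1/4).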
![Page 19: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/19.jpg)
F. Gauss-Markov Theorem: Coefficient Space for the Crystal Problem
![Page 20: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/20.jpg)
F. Gauss-Markov Theorem: Estimator y1 = (1,0)(y1,y2)^T
![Page 21: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/21.jpg)
F. Gauss-Markov Theorem: Estimator y2/2 = (0,1/2)(y1,y2)^T
![Page 22: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/22.jpg)
F. Gauss-Markov Theorem: OLS estimator = (1/5,2/5)(y1,y2)^T
![Page 23: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/23.jpg)
F. Gauss-Markov Theorem: The Set of Linear Unbiased Estimators
![Page 24: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/24.jpg)
F. Gauss-Markov Theorem: LUEs form a translate of error space
![Page 25: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/25.jpg)
F. Gauss-Markov Theorem: OLS estimator lies in model space
![Page 26: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/26.jpg)
F. Gauss-Markov Theorem: Four Steps plus Pythagoras
1. OLS estimator is an LUE
2. LUEs of βj are a flat set in n-space
3. LUEs of 0 = error space
4. OLS estimator lies in model space
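
One way to assemble the four steps into the Pythagoras argument is sketched below. The notation is an assumption rather than the slides' own: $X$ is the $n \times p$ design matrix with full column rank, $u_j$ is the $j$-th unit vector in $\mathbb{R}^p$, and $a^T y$ is a linear estimator of $\beta_j$.

```latex
\begin{align*}
&\text{Unbiasedness: } E\{a^T y\} = a^T X \beta = \beta_j \text{ for all } \beta
  \iff X^T a = u_j \quad \text{(a flat set in } \mathbb{R}^n\text{)} \\
&\text{OLS: } a_{\mathrm{OLS}} = X (X^T X)^{-1} u_j, \qquad
  X^T a_{\mathrm{OLS}} = u_j, \qquad a_{\mathrm{OLS}} \in \text{model space (column space of } X\text{)} \\
&\text{Any other LUE: } a = a_{\mathrm{OLS}} + d, \qquad X^T d = 0
  \ \Rightarrow\ d \in \text{error space}, \quad d \perp a_{\mathrm{OLS}} \\
&\text{Pythagoras: } \lvert a \rvert^2 = \lvert a_{\mathrm{OLS}} \rvert^2 + \lvert d \rvert^2
  \ \ge\ \lvert a_{\mathrm{OLS}} \rvert^2 \\
&\text{Hence } \mathrm{Var}\{a^T y\} = \sigma^2 \lvert a \rvert^2
  \ \ge\ \sigma^2 \lvert a_{\mathrm{OLS}} \rvert^2 = \mathrm{Var}\{a_{\mathrm{OLS}}^T y\}
\end{align*}
```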
![Page 27: Linear Statistical Models as a First Statistics Course for Math Majors](https://reader034.vdocument.in/reader034/viewer/2022051821/568165e9550346895dd90a18/html5/thumbnails/27.jpg)
G. Conclusion: A Least Squares Course can be
1. Accessible
- Requires only Calc. I and matrix algebra
2. A good vehicle for teaching data modeling
3. A sequence of methodological challenges
4. Mathematically attractive
- Mathematical substance
- Abstraction as process
- Math as tool and for its own sake
5. A direct route to current practice
- Generalized linear models
- Correlated data, time-to-event data
- Hierarchical Bayes