Simple linear models
• A straight line is the simplest case, but the key is that the parameters appear linearly in the model
• Needs estimates of the model parameters (slope and intercept), usually obtained by least squares
• Makes a number of assumptions, usually checked graphically using residuals
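As a rough illustration of the least-squares step, the sketch below estimates the intercept and slope of a straight line in plain Python. The function name `fit_line` and the data are made up for illustration; they are not from the course material or from Minitab.

```python
# Least-squares fit of a straight line y = intercept + slope * x.
# The data below are made up purely for illustration.

def fit_line(x, y):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sum of squares of x and corrected sum of cross-products
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
# Residuals are what the graphical assumption checks are based on
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

With an intercept in the model, the residuals from a least-squares fit always sum to zero, which is a quick sanity check on any hand calculation.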
Examples for linear regression
• How is LOI related to moisture?
• How should we estimate merchantable volume of wood from the height of a living tree?
• How is pest infestation late in the season affected by the concentration of insecticide applied early in the season?
Scatterplot of tree volume vs height
Minitab commands
Regression Output
Interpreting the output
• Goodness of fit (R-squared) and ANOVA table p-value
• Confidence intervals and tests for the parameters
• Assessing assumptions (outliers and influential observations)
• Residual plots
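To make the goodness-of-fit quantities concrete, here is a minimal sketch of how R-squared, adjusted R-squared and the ANOVA F statistic come from the sums of squares in a straight-line regression. The function name `anova_simple` and the data are illustrative assumptions. The p-value is not computed here, since it needs the F distribution (from statistical software or tables).

```python
# Sums-of-squares decomposition behind R-squared and the ANOVA table.
# Data are made up for illustration only.

def anova_simple(x, y):
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]

    ss_total = sum((yi - y_bar) ** 2 for yi in y)               # n - 1 df
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # n - 2 df
    ss_regression = ss_total - ss_error                          # 1 df

    r_sq = ss_regression / ss_total
    # Adjusted R-squared compares mean squares rather than sums of squares
    r_sq_adj = 1 - (ss_error / (n - 2)) / (ss_total / (n - 1))
    # F = regression mean square / error mean square
    f_stat = ss_regression / (ss_error / (n - 2))
    return r_sq, r_sq_adj, f_stat

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r_sq, r_sq_adj, f_stat = anova_simple(x, y)
print(f"R-Sq = {100 * r_sq:.1f}%, R-Sq(adj) = {100 * r_sq_adj:.1f}%, F = {f_stat:.1f}")
```

In a straight-line regression the F statistic equals the square of the t statistic for the slope, so the ANOVA p-value and the slope p-value agree.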
t = distance between estimate and hypothesised value, in units of standard error
t = Coef / SE Coef (for a hypothesised value of 0), compared with tcrit
CI = Coef ± tcrit × SE Coef
Confidence intervals and t-tests
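A minimal sketch of the t-test and confidence interval for the slope, assuming a hypothesised slope of 0. The function name `slope_inference` and the data are made up for illustration, and the critical value is typed in from a t-table rather than computed.

```python
import math

# t statistic and confidence interval for the slope of a straight-line fit.
# Data are made up for illustration only.

def slope_inference(x, y, t_crit):
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_error = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(ss_error / (n - 2))   # residual standard deviation, n - 2 df
    se_slope = s / math.sqrt(sxx)       # standard error of the slope
    t_stat = slope / se_slope           # hypothesised value is 0
    ci = (slope - t_crit * se_slope, slope + t_crit * se_slope)
    return slope, se_slope, t_stat, ci

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Two-sided 95% critical value of t with n - 2 = 3 df, taken from tables
T_CRIT_95_DF3 = 3.182

slope, se_slope, t_stat, ci = slope_inference(x, y, T_CRIT_95_DF3)
significant = abs(t_stat) > T_CRIT_95_DF3
print(f"t = {t_stat:.2f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), significant: {significant}")
```

The test and the interval tell the same story: |t| exceeds tcrit exactly when the confidence interval excludes the hypothesised value.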
Regression output
Outliers
Residual plots
[Figure: Residual Plots for VOLUME, four panels: Normal Probability Plot of the Residuals (Percent vs Standardized Residual); Residuals Versus the Fitted Values (Standardized Residual vs Fitted Value); Histogram of the Residuals (Frequency vs Standardized Residual); Residuals Versus the Order of the Data (Standardized Residual vs Observation Order)]
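The standardized residuals shown in plots like these can be sketched as follows. This assumes the usual definition (residual divided by its estimated standard deviation, which depends on the leverage of each point) and the common rule of thumb of flagging values beyond ±2. The function name and the data, which contain one deliberately unusual observation, are made up for illustration.

```python
import math

# Standardized residuals for a straight-line fit, with a simple outlier flag.
# Data are made up; observation 5 is deliberately out of line.

def standardized_residuals(x, y):
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e ** 2 for e in resid) / (n - 2))
    # Leverage: large for x values far from the mean of x
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    std_resid = [e / (s * math.sqrt(1 - h)) for e, h in zip(resid, leverage)]
    return std_resid, leverage

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 4, 6, 8, 15, 12, 14, 16]

std_resid, leverage = standardized_residuals(x, y)
# Report observation numbers (1-based) with |standardized residual| > 2
flagged = [i + 1 for i, r in enumerate(std_resid) if abs(r) > 2]
print("Possible outliers (observation numbers):", flagged)
```

Plotting `std_resid` against the fitted values, or against observation order, gives the kind of panels described in the figure.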
Confidence and prediction intervals
[Figure: fitted line plot, volume as a function of height (VOLUME vs HEIGHT), showing the regression line with 95% CI and 95% PI bands]
VOLUME = -87.12 + 1.543 HEIGHT
S = 13.3970, R-Sq = 35.8%, R-Sq(adj) = 33.6%
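The difference between the two intervals can be sketched in code. The function `predict_volume` uses the fitted equation from the slide for a point prediction only. The intervals need the raw data, which are not given here, so `intervals_at` is run on made-up data with a critical value taken from a t-table. Both function names are illustrative assumptions.

```python
import math

def predict_volume(height):
    """Point prediction from the fitted equation VOLUME = -87.12 + 1.543 HEIGHT."""
    return -87.12 + 1.543 * height

def intervals_at(x, y, x0, t_crit):
    """Fitted value, CI for the mean response and PI for a new observation at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_error = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(ss_error / (n - 2))
    fit = intercept + slope * x0
    # CI: uncertainty in the estimated mean response only
    se_mean = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    # PI: adds the scatter of an individual observation about the line
    se_pred = s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    ci = (fit - t_crit * se_mean, fit + t_crit * se_mean)
    pi = (fit - t_crit * se_pred, fit + t_crit * se_pred)
    return fit, ci, pi

print(f"Predicted volume at height 80: {predict_volume(80):.2f}")

# Made-up data; 3.182 is the two-sided 95% t value for n - 2 = 3 df (from tables)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit, ci, pi = intervals_at(x, y, x0=3, t_crit=3.182)
print(f"fit = {fit:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), 95% PI = ({pi[0]:.2f}, {pi[1]:.2f})")
```

The prediction interval is always wider than the confidence interval at the same x, and both widen as x0 moves away from the mean of x.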
Four possible outcomes
• Low R-sq or high R-sq
• Low p-value (significant) or high p-value (non-significant)
Why linear?
• Not because relationships are linear
• Transformations can often help linearise
• Good simple starting point: results are well understood
• Approximation to a smoothly varying curve
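As one example of a linearising transformation, an exponential relationship y = a·exp(b·x) becomes a straight line after taking logs: log(y) = log(a) + b·x. The sketch below fits that line by least squares and back-transforms the intercept. The data are generated without noise from made-up values a = 2 and b = 0.5, so the fit should recover them. This is an illustration only, not an example from the course.

```python
import math

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares straight line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up, noise-free data from y = 2 * exp(0.5 * x)
x = [0, 1, 2, 3, 4]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

# Regress log(y) on x: the model is now linear in its parameters
log_y = [math.log(yi) for yi in y]
intercept, slope = fit_line(x, log_y)

a = math.exp(intercept)  # back-transform the intercept to the original scale
b = slope
print(f"a = {a:.3f}, b = {b:.3f}")
```

With real data, the residual plots should be checked on the transformed scale, since that is where the model assumptions apply.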