Transformations
Transformation (re-expression) of a Variable
• A very useful transformation is the natural log transformation
• Transformation of a variable can change its distribution from a skewed distribution to a normal distribution (bell-shaped, symmetric about its centre)
$x_{\text{new}} = \text{transformed } x = \ln(x)$
• For any positive value of x, ln(x) can be:
– Looked up in tables
– Calculated by most calculators
– Calculated by most statistical packages
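As a small illustration of computing ln(x) in software, the sketch below uses Python's standard `math.log`, which returns the natural logarithm. The input values are arbitrary choices for illustration, not values from the slides.

```python
import math

# Arbitrary positive inputs chosen for illustration
values = [0.5, 1.0, 10.0, 100.0]
x_new = [math.log(x) for x in values]  # natural log of each value

for x, lx in zip(values, x_new):
    print(f"x = {x:6.1f}   ln(x) = {lx:.4f}")

# ln(x) is only defined for x > 0; math.log raises ValueError otherwise
try:
    math.log(0.0)
    zero_is_defined = True
except ValueError:
    zero_is_defined = False
print("ln(0) defined?", zero_is_defined)
```

Because ln(x) is undefined at zero and for negative values, data containing such values need separate handling before this transformation is applied.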
Graph of ln(x)
[Figure: $x_{\text{new}} = \ln(x)$ plotted against x, with x from 0 to 180 and ln(x) from 0 to 6]
The effect of the transformation $x_{\text{new}} = \ln(x)$
[Figure: ln(x) plotted against x, with x from 0 to 180 and ln(x) from 0 to 6]
The effect of the ln transformation
• It spreads out values that are close to zero
• It compacts values that are large
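The spreading and compacting effect can be checked numerically. The sketch below compares two pairs of values that are the same distance apart on the original scale; the helper name `ln_gap` and the chosen pairs are illustrative assumptions.

```python
import math

def ln_gap(lo, hi):
    """Distance between lo and hi after the ln transformation."""
    return math.log(hi) - math.log(lo)

# Both pairs are exactly 1 unit apart on the original scale
small_gap = ln_gap(1, 2)      # pair close to zero
large_gap = ln_gap(100, 101)  # pair far from zero

print(f"gap between ln(1) and ln(2):     {small_gap:.4f}")
print(f"gap between ln(100) and ln(101): {large_gap:.4f}")
```

The pair near zero ends up much further apart than the pair near 100, which is why a long right tail is pulled in by the transformation.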
[Figure: comparison of x with $x_{\text{new}} = \ln(x)$]
Transforming data to a normal distribution allows one to use powerful statistical procedures (discussed later on) that assume the data are normally distributed.
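A minimal sketch of this idea, assuming a synthetic right-skewed sample (log-normal draws from Python's `random` module, not data from the slides): the skewness of the sample is measured before and after taking ln. The `skewness` helper and the seed are illustrative choices.

```python
import math
import random
import statistics

def skewness(data):
    """Sample skewness: mean of the cubed standardized deviations."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return sum(((v - m) / s) ** 3 for v in data) / len(data)

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable

# Synthetic right-skewed sample; its ln is normally distributed by construction
x = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]
x_new = [math.log(v) for v in x]

skew_before = skewness(x)
skew_after = skewness(x_new)
print(f"skewness of x:     {skew_before:.2f}")
print(f"skewness of ln(x): {skew_after:.2f}")
```

A skewness near zero after the transformation is consistent with a symmetric distribution. Real data will rarely behave this cleanly, so the transformed values should still be checked, for example with a histogram.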
Transformations to Linearity
• Many non-linear curves can be put into a linear form by appropriate transformations of either
– the dependent variable Y or
– the independent variable X
– or both.
• This leads to the wide utility of the linear model.
• Another use of trans
Intrinsically Linear (Linearizable) Curves
1. Hyperbolas
$y = x/(ax - b)$
Linear form: $1/y = a - b\,(1/x)$ or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = 1/y$, $X = 1/x$, $\beta_0 = a$, $\beta_1 = -b$
[Figure: two panels of y = x/(ax - b), one with positive curvature (b > 0) and one with negative curvature (b < 0); the asymptotes at b/a and 1/a are marked]
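2. Exponential
$y = \alpha e^{\beta x} = \alpha B^{x}$
Linear form: $\ln y = \ln \alpha + \beta x = \ln \alpha + (\ln B)\,x$ or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = \ln y$, $X = x$, $\beta_0 = \ln \alpha$, $\beta_1 = \beta = \ln B$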
[Figure: two panels of the exponential curve, titled "Exponential (B > 1)" and "Exponential (B < 1)", with the values α and αB marked on the y-axis]
3. Power Functions
$y = a x^{b}$
Linear form: $\ln y = \ln a + b \ln x$ or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = \ln y$, $X = \ln x$, $\beta_0 = \ln a$, $\beta_1 = b$
[Figure: power functions with b > 0 (curves for b > 1, b = 1 and 0 < b < 1) and power functions with b < 0 (curves for b < -1, b = -1 and -1 < b < 0)]
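The three linearizing transformations can be sketched in code. The example below generates noise-free points from each curve using made-up parameter values, applies the transformation, fits a straight line by ordinary least squares, and converts the intercept and slope back to the curve's parameters. The helper `fit_line`, the x values and all parameter values are assumptions for illustration; with real, noisy data the recovered values would only be estimates.

```python
import math

def fit_line(X, Y):
    """Ordinary least squares for Y = beta0 + beta1 * X; returns (beta0, beta1)."""
    n = len(X)
    mean_x = sum(X) / n
    mean_y = sum(Y) / n
    sxx = sum((x - mean_x) ** 2 for x in X)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
    beta1 = sxy / sxx
    return mean_y - beta1 * mean_x, beta1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up x values, all positive

# 1. Hyperbola y = x/(a x - b): regress 1/y on 1/x, then a = beta0, b = -beta1
ys = [x / (2.0 * x - 1.0) for x in xs]  # generated with a = 2, b = 1
beta0, beta1 = fit_line([1 / x for x in xs], [1 / y for y in ys])
hyper_a, hyper_b = beta0, -beta1

# 2. Exponential y = alpha * exp(beta x): regress ln y on x,
#    then alpha = exp(beta0), beta = beta1
ys = [3.0 * math.exp(0.5 * x) for x in xs]  # generated with alpha = 3, beta = 0.5
beta0, beta1 = fit_line(xs, [math.log(y) for y in ys])
exp_alpha, exp_beta = math.exp(beta0), beta1

# 3. Power function y = a x^b: regress ln y on ln x,
#    then a = exp(beta0), b = beta1
ys = [2.0 * x ** 1.5 for x in xs]  # generated with a = 2, b = 1.5
beta0, beta1 = fit_line([math.log(x) for x in xs], [math.log(y) for y in ys])
pow_a, pow_b = math.exp(beta0), beta1

print("hyperbola:   a =", round(hyper_a, 6), " b =", round(hyper_b, 6))
print("exponential: alpha =", round(exp_alpha, 6), " beta =", round(exp_beta, 6))
print("power:       a =", round(pow_a, 6), " b =", round(pow_b, 6))
```

The same straight-line fit is reused in all three cases; only the transformation applied to x and y, and the back-conversion of the coefficients, differ.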
Summary
Transformations can be useful for:
1. Changing data from a skewed distribution to a Normal (bell-shaped) distribution
2. Straightening out non-linear data
A common transformation is the natural log transformation ln(x).
Example – Motor Vehicle Data
The data is in an Excel file: MtrVeh.xls
Dependent = mpg
Independent = Engine size, horsepower and weight
The data in an SPSS file
We will try to fit a model predicting mpg with Engine (engine size).
First a scatter plot:
The dialog box selecting the variables:
The scatter-plot
[Figure: scatter plot of MPG (axis from 10 to 50) against ENGINE (axis from 0 to 500)]
Similar to:
2. Exponential
$y = \alpha e^{\beta x} = \alpha B^{x}$
Linear form: $\ln y = \ln \alpha + \beta x = \ln \alpha + (\ln B)\,x$ or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = \ln y$, $X = x$, $\beta_0 = \ln \alpha$, $\beta_1 = \beta = \ln B$
• To perform a ln transformation in SPSS
• Go to the menu Transform->Compute
• In this dialogue box you define the transformation
• Press OK and the transformation will be performed
• The new variable has been added to the SPSS spreadsheet
• The scatterplot shows a better fit to a straight line using the new variable lnmpg.
[Figure: scatter plot of lnmpg (axis from 2.00 to 4.00) against ENGINE (axis from 0 to 500)]
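For readers working outside SPSS, the same step can be sketched in Python. The numbers below are hypothetical values generated from an exponential curve purely for illustration; they are NOT the MtrVeh data, and the variable names `engine`, `mpg` and `lnmpg` simply mirror the slides. The sketch computes the new variable and compares how close each version is to a straight line using the Pearson correlation.

```python
import math

def pearson_r(X, Y):
    """Pearson correlation coefficient between X and Y."""
    n = len(X)
    mean_x = sum(X) / n
    mean_y = sum(Y) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
    sxx = sum((x - mean_x) ** 2 for x in X)
    syy = sum((y - mean_y) ** 2 for y in Y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values, NOT the MtrVeh data: mpg decays exponentially with engine size
engine = [100.0, 150.0, 200.0, 250.0, 300.0, 350.0, 400.0]
mpg = [40.0 * math.exp(-0.003 * e) for e in engine]

# Counterpart of the Transform->Compute step: create the new variable lnmpg
lnmpg = [math.log(v) for v in mpg]

r_raw = pearson_r(engine, mpg)
r_ln = pearson_r(engine, lnmpg)
print(f"correlation of ENGINE with mpg:   {r_raw:.4f}")
print(f"correlation of ENGINE with lnmpg: {r_ln:.4f}")
```

Because these made-up values follow an exact exponential curve, lnmpg is exactly linear in engine size. With real data the improvement would be smaller and should be judged from the scatter plot as well as from the correlation.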
Transformations summary
• Transformations can be used to convert non-normal data to normally distributed (bell-shaped) data, allowing the use of more powerful techniques that assume normality.
• Transformations can be used to convert non-linear data to linear (straight-line) data.
Next topic
Probability