Seminar 15 | Tuesday, October 18, 2007 | Aliaksei Smalianchuk
Means and Variances
What happens to means and variances when data is manipulated?
Let’s check by manipulating data from the survey.
Data
Height in inches (HT)
Shoe size (Shoe)
Age (Age)
Additional columns:
Height with a 1 inch heel (HeightPlus1)
Height in centimeters (2.5TimesHeight)
Sum of height and shoe size (HeightPlusShoe)
Sum of height and age (HeightPlusAge)
Statistics
Variable          N     Mean      StDev
HT                444   66.928    3.938
Shoe              445   9.1056    1.9484
Age               444   20.371    2.912
HeightPlus1       444   67.928    3.938
2.5TimesHeight    444   167.32    9.84
HeightPlusShoe    444   76.035    5.693
HeightPlusAge     444   87.299    4.913
Observation 1
The mean of heel heights is one inch larger than the mean of heights
Why?
If every value is changed by the same constant, the mean changes in the same way: adding 1 inch to every height adds 1 inch to the mean.
Observation 2
The standard deviation of heel heights equals the standard deviation of heights
Why?
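Standard deviation measures spread about the mean. Adding a constant shifts every value and the mean by the same amount, so the deviations from the mean, and the shape of the distribution, didn’t change.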
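Observations 1 and 2 can be checked on a handful of values. The sketch below uses made-up heights (an assumption for illustration, not the survey data): adding 1 to every value raises the mean by 1 and leaves the standard deviation where it was.

```python
from statistics import mean, stdev

# Hypothetical heights in inches (illustrative values, not the survey data)
heights = [62.0, 65.0, 67.0, 70.0, 71.0]

# Add a 1 inch heel to every height
height_plus_1 = [h + 1 for h in heights]

# How far the mean moved, and the spread before and after the shift
mean_shift = mean(height_plus_1) - mean(heights)
sd_before = stdev(heights)
sd_after = stdev(height_plus_1)

print("shift in mean:", mean_shift)
print("StDev before and after:", sd_before, sd_after)
```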
Observation 3
The standard deviation of heights in centimeters is 2.5 times the standard deviation of heights in inches
Why?
Multiplying all data values by a constant stretches the spread of the histogram by the same factor, and so scales the properties that depend on the spread (like standard deviation) by that factor.
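In the table this shows up as 2.5 × 66.928 = 167.32 for the mean and 2.5 × 3.938 = 9.845 for the standard deviation, which agrees with the reported 9.84 up to rounding. The same effect can be sketched on made-up heights (hypothetical values, not the survey data):

```python
from statistics import mean, stdev

# Hypothetical heights in inches (illustrative values, not the survey data)
heights = [62.0, 65.0, 67.0, 70.0, 71.0]

# Scale every height by 2.5, as in the 2.5TimesHeight column
scaled = [2.5 * h for h in heights]

# Both the mean and the standard deviation grow by the scaling factor
mean_ratio = mean(scaled) / mean(heights)
sd_ratio = stdev(scaled) / stdev(heights)

print("mean ratio:", mean_ratio)
print("StDev ratio:", sd_ratio)
```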
Observation 4
Mean of HeightPlusShoe = Mean of Height + Mean of Shoe
Observation 5
Mean of HeightPlusAge = Mean of Height + Mean of Age
Why?
Since ((X1 + Y1) + (X2 + Y2) + … + (Xn + Yn))/n = (X1 + X2 + … + Xn)/n + (Y1 + Y2 + … + Yn)/n
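The additivity of means in Observations 4 and 5 can be checked numerically. In the table, 66.928 + 20.371 = 87.299 exactly, while 66.928 + 9.1056 = 76.0336 is slightly off the reported 76.035, presumably because Shoe has one more observation (N = 445) than the sum column. A minimal sketch with made-up paired values (hypothetical, not the survey data):

```python
from statistics import mean

# Hypothetical paired measurements, one (height, age) pair per person
heights = [62.0, 65.0, 67.0, 70.0, 71.0]
ages = [19, 22, 20, 21, 18]

# Build the sum column person by person, like HeightPlusAge
height_plus_age = [h + a for h, a in zip(heights, ages)]

mean_of_sum = mean(height_plus_age)
sum_of_means = mean(heights) + mean(ages)

print("mean of sum: ", mean_of_sum)
print("sum of means:", sum_of_means)
```

This holds whether or not the two columns are related; only the pairing (same people in both columns) matters.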
Variances
Variance = σ²
Variances apply to a probability distribution
Variance is a way to capture the degree of spread of a distribution
Variances
Variable Variance
HT 15.50784
Shoe 3.796263
Age 8.479744
HeightPlusShoe 32.41025
HeightPlusAge 24.13757
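These variances are the squares of the standard deviations in the Statistics table. A short sketch that reproduces them from the reported StDev values (the dictionary name is an assumption for illustration):

```python
# Standard deviations as reported in the Statistics table
stdevs = {
    "HT": 3.938,
    "Shoe": 1.9484,
    "Age": 2.912,
    "HeightPlusShoe": 5.693,
    "HeightPlusAge": 4.913,
}

# Variance is the square of the standard deviation
variances = {name: sd ** 2 for name, sd in stdevs.items()}

for name, var in variances.items():
    print(f"{name:15s} {var:.6f}")
```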
Dependence
Are shoe sizes and heights dependent? Are age and height dependent? Let’s check using scatter plots
[Scatter plot: Height vs. Shoe Size]
[Scatter plot: Height vs. Age]
Back to variances
Variance of HeightPlusShoe (32.410) is much greater than Var(Height) + Var(Shoe) (15.508 + 3.796 = 19.304)
Variance of HeightPlusAge (24.138) is very close to Var(Height) + Var(Age) (15.508 + 8.480 = 23.988)
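The comparison can be made concrete with the tabulated variances. The sketch below computes how much each sum's variance exceeds the sum of the individual variances. As an addition not stated on the slides: by the standard identity Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), that excess is twice the covariance, so it is a direct measure of how strongly the two columns move together. The figures are approximate for Shoe, which has N = 445 rather than 444.

```python
# Variances as listed in the Variances table
variances = {
    "HT": 15.50784,
    "Shoe": 3.796263,
    "Age": 8.479744,
    "HeightPlusShoe": 32.41025,
    "HeightPlusAge": 24.13757,
}

# Excess of the variance of the sum over the sum of the variances
shoe_gap = variances["HeightPlusShoe"] - (variances["HT"] + variances["Shoe"])
age_gap = variances["HeightPlusAge"] - (variances["HT"] + variances["Age"])

print(f"HeightPlusShoe excess: {shoe_gap:.4f}")
print(f"HeightPlusAge excess:  {age_gap:.4f}")
```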
Why?
Can you see a difference between the relationships (Height vs. Shoe Size) and (Height vs. Age)?
Dependence
Adding two positively dependent variables produces extremes: small values are added to correspondingly small values, and large values to correspondingly large values.
This makes the variance much larger.
Dependence
When the two variables are independent, values are not matched by relative size (large values can be added to small values)
The sum then gains no extra spread: its variance stays close to the sum of the two variances
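The effect of pairing can be seen on a toy example. In the sketch below (all values hypothetical), the same five numbers are paired with x in two ways: once matched small-with-small, and once re-paired so that the covariance with x is exactly zero. Zero covariance is used here as a simple stand-in for independence; it is an assumption of the sketch, not a full model of independent data.

```python
from statistics import pvariance

x = [1, 2, 3, 4, 5]

# Matched pairing: small with small, large with large (strong dependence)
y_matched = [1, 2, 3, 4, 5]

# Same values re-paired so that their covariance with x is zero
y_mixed = [4, 3, 2, 1, 5]

sum_of_variances = pvariance(x) + pvariance(y_matched)

var_matched = pvariance([a + b for a, b in zip(x, y_matched)])
var_mixed = pvariance([a + b for a, b in zip(x, y_mixed)])

print("Var(x) + Var(y):         ", sum_of_variances)
print("Var(x + y), matched pairs:", var_matched)
print("Var(x + y), mixed pairs:  ", var_mixed)
```

The matched pairing doubles the variance relative to the sum of variances, while the mixed pairing reproduces it exactly.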
Variance of sample mean
Mean = (X1 + X2 + … + Xn)/n
For independent X1, X2, …, Xn (dividing by n scales the variance by 1/n²):
Variance[(X1 + X2 + … + Xn)/n] = (Variance[X1] + Variance[X2] + … + Variance[Xn])/n²
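For independent draws that all have the same variance σ², the variance of the sample mean works out to σ²/n. A minimal sketch that checks this exactly, by listing every equally likely sample of size n drawn with replacement from a small made-up population (the population and n are assumptions for illustration):

```python
from itertools import product
from statistics import mean, pvariance

# Hypothetical small population; draws with replacement are independent
population = [1, 2, 3, 4, 5]
n = 2

sigma2 = pvariance(population)

# Every equally likely ordered sample of size n, drawn with replacement
sample_means = [mean(sample) for sample in product(population, repeat=n)]

var_of_mean = pvariance(sample_means)

print("population variance:     ", sigma2)
print("variance of sample mean: ", var_of_mean)
print("sigma^2 / n:             ", sigma2 / n)
```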
Dependence?
Would this work for dependent values of X1, X2 … Xn ?
Would the variance produced by this formula be larger or smaller than actual?
Sampling without replacement
Would the variance formula hold true?
Why?
Dependence
For positively dependent values, adding the individual variances gives a smaller result than the actual variance of the sum, because adding dependent values produces extremes and widens the spread
Sampling without replacement from smaller populations (n < 10) produces dependence
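The dependence created by sampling without replacement can be seen by exhaustive enumeration on a tiny made-up population (population and n are assumptions for illustration). The sketch compares the actual variance of the sample mean with σ²/n, the value that holds for independent draws. In this toy case the dependence is negative (drawing a large value leaves only smaller ones behind), so the independent-case value overstates the variance, the opposite direction from the height and shoe size example.

```python
from itertools import permutations
from statistics import mean, pvariance

# Hypothetical small population
population = [1, 2, 3, 4, 5]
n = 2

sigma2 = pvariance(population)

# Every equally likely ordered sample of size n, drawn without replacement
sample_means = [mean(sample) for sample in permutations(population, n)]

var_without_replacement = pvariance(sample_means)
independent_formula = sigma2 / n

print("actual variance of sample mean:", var_without_replacement)
print("value for independent draws:   ", independent_formula)
```

With a population this small, the two numbers differ noticeably; as the population grows relative to the sample, the gap shrinks.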
The End
Extra Credit (Dr. Pfenning)
Use Minitab Calculator to create column “Birthyear”
Plot Earned vs. Birthyear, note relationship
Create column “EarnedPlusBirthyear”
Find sds of Earned, Birthyear, EarnedPlusBirthyear, square to variances
Compare variances
Explain results