1
Why do we need statistics?
A. To confuse students
B. To torture students
C. To put the fear of the almighty in them
D. To ruin their GPA, so that they don't get into grad school, have to bus tables and move back in with parents
E. All of the above
F. All of the above (and other tragic outcomes)
2
A positive optimistic view…
It is a tool that could help you succeed and move out of your parents' house
There is nothing to fear but fear itself
You need a passing grade of C
Can help to get into grad school
It's important to understand so you don't get scammed…
3
The Caveat: Remember…
There are lies
There are d#$m (darn) lies
and then There are statistics
Magic
4
Statistics
The science of collecting, displaying and analyzing data
Based on quantitative measurements of samples
Allow us to objectively evaluate data
Descriptive
Inferential
5
Defining variability
Amount of change or fluctuation
Some variability is expected
Is the observed variability due to the usual variability among subjects from the population?
Or is the observed variability greater than the usual variability?
6
Frequency Distribution
[Figure: frequency distributions of Sample 1, Sample 2 and Sample 3 drawn from the population. x-axis: dependent variable (0 to highest score); y-axis: frequency (# of subjects).]
7
Frequency Distribution
[Figure: frequency distributions of the untreated groups of an experiment (Control, Experimental). x-axis: dependent variable; y-axis: frequency (# of subjects).]
8
Frequency Distribution
[Figure: frequency distributions of the treated groups (Control, Experimental). x-axis: dependent variable; y-axis: frequency (# of subjects).]
9
Beginning steps of an Experiment
Sample from population
Hypothesis
Define variables
Assign subjects to conditions
Measure performance
Calculate means
Calculate variability
10
Heading Error: Calculating Variance
Deviation from the mean for each subject: $(X_i - \bar{X})$
Square the deviation from the mean for each subject: $(X_i - \bar{X})^2$
Add the squared deviations together: $\sum (X_i - \bar{X})^2$

Sex     SUBJECT$  HEADGdeg  Absolute  Deviation  Squared deviation
Female  rat1         -4.4       4.4      -4.68              21.86
Female  rat3         11        11         1.93               3.71
Female  rat5          2.3       2.3      -6.78              45.90
Female  rat7          8.5       8.5      -0.58               0.33
Female  rat9          6.9       6.9      -2.18               4.73
Female  rat11       -10.8      10.8       1.73               2.98
Female  rat13       -10.9      10.9       1.83               3.33
Female  rat15        17.8      17.8       8.73              76.13
Male    rat2         29.6      29.6       6.86              47.09
Male    rat4        -18.5      18.5      -4.24              17.96
Male    rat6         14.5      14.5      -8.24              67.86
Male    rat8         58.2      58.2      35.46            1257.59
Male    rat10       -18.7      18.7      -4.04              16.30
Male    rat12       -17.3      17.3      -5.44              29.57
Male    rat14       -14.8      14.8      -7.94              63.00
Male    rat16        10.3      10.3     -12.44             154.69

Sum of squared deviations: Female = 158.96, Male = 1654.06
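The three steps on this slide (deviation, squared deviation, sum) can be checked with a few lines of Python. This is a minimal sketch: the scores are the absolute heading errors from the table, and the function name `sum_of_squares` is made up for illustration.

```python
# Absolute heading error (deg) for each rat, copied from the table.
female = [4.4, 11.0, 2.3, 8.5, 6.9, 10.8, 10.9, 17.8]
male = [29.6, 18.5, 14.5, 58.2, 18.7, 17.3, 14.8, 10.3]


def sum_of_squares(scores):
    """Return the sum of squared deviations from the group mean."""
    mean = sum(scores) / len(scores)
    # Deviation of each score from the mean, squared, then added together.
    return sum((x - mean) ** 2 for x in scores)


ss_female = sum_of_squares(female)
ss_male = sum_of_squares(male)
print(f"Female sum of squares: {ss_female:.2f}")
print(f"Male sum of squares:   {ss_male:.2f}")
```

The printed values should agree with the slide totals to within rounding, since the slide rounds each squared deviation before adding.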
11
Heading Error: Calculating Variance
Compute the Variance
$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
$s^2_{Female} = 22.71$
$s^2_{Male} = 236.29$
12
Heading Error: Calculating standard deviation
Standard deviation
$s$ or $SD = \sqrt{s^2}$
$s_{Female} = 4.77$
$s_{Male} = 15.37$
13
Heading Error: Calculating standard error of the Mean
Standard error of the Mean
$SEM = \frac{SD}{\sqrt{n}}$
$SEM_{Female} = 1.68$
$SEM_{Male} = 5.43$
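The chain from variance to standard deviation to SEM can be reproduced with Python's standard library. This is a sketch only: `describe` is a hypothetical helper name, and `statistics.variance` is used because it divides by n - 1, as the slides do.

```python
import math
import statistics

# Absolute heading error (deg) for each rat, copied from the data table.
female = [4.4, 11.0, 2.3, 8.5, 6.9, 10.8, 10.9, 17.8]
male = [29.6, 18.5, 14.5, 58.2, 18.7, 17.3, 14.8, 10.3]


def describe(scores):
    """Return (variance, standard deviation, SEM) for one group."""
    variance = statistics.variance(scores)  # sample variance, n - 1 denominator
    sd = math.sqrt(variance)                # SD is the square root of the variance
    sem = sd / math.sqrt(len(scores))       # SEM = SD / sqrt(n)
    return variance, sd, sem


for name, scores in (("Female", female), ("Male", male)):
    variance, sd, sem = describe(scores)
    print(f"{name}: s^2 = {variance:.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```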
14
Heading Error: Group Means with SEM
[Figure: group means with SEM for the Male and Female groups. x-axis: Group; y-axis: Absolute Heading Error (deg), scale 0 to 30.]
15
Heading Error: Group Means with 95% Confidence Interval
[Figure: group means with 95% confidence intervals for the Male and Female groups. x-axis: Group; y-axis: Heading Error, scale 0 to 40.]
Confidence intervals (CI) represent a range of values above and below our sample mean that is likely to contain the population mean; i.e., the true mean of the population is likely (we're 95% confident) to fall somewhere within the CI range.
$CI = \bar{X} \pm t_{crit} \left( \frac{SD}{\sqrt{n}} \right)$
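As a rough illustration of the CI formula, the sketch below computes a 95% interval for each group. The critical value 2.365 (two-tailed, alpha = .05, df = n - 1 = 7) is an assumption taken from a standard t table rather than from the slides, and `confidence_interval` is a hypothetical helper name.

```python
import math
import statistics

# Absolute heading error (deg) for each rat, copied from the data table.
female = [4.4, 11.0, 2.3, 8.5, 6.9, 10.8, 10.9, 17.8]
male = [29.6, 18.5, 14.5, 58.2, 18.7, 17.3, 14.8, 10.3]

# Assumed critical t for a single group of 8 (df = 7), two-tailed, alpha = .05.
T_CRIT_DF7 = 2.365


def confidence_interval(scores, t_crit):
    """Return (lower, upper) bounds of mean +/- t_crit * SD / sqrt(n)."""
    mean = statistics.mean(scores)
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    margin = t_crit * sem
    return mean - margin, mean + margin


for name, scores in (("Female", female), ("Male", male)):
    lower, upper = confidence_interval(scores, T_CRIT_DF7)
    print(f"{name}: 95% CI = {lower:.2f} to {upper:.2f}")
```

Note that the interval for a single group uses df = n - 1, which differs from the df = 14 used later for the two-group t test.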
16
Heading Error: Group Means with SEM
[Figure: group means with SEM for the Male and Female groups. x-axis: Group; y-axis: Absolute Heading Error (deg), scale 0 to 30.]
variance ($s^2$) - average squared deviation of scores from their mean
standard deviation (SD) - average deviation of scores about the mean
standard error of the mean (SEM) - dispersion of the distribution of sample means
$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
$SD = \sqrt{s^2}$
$SEM = \frac{SD}{\sqrt{n}}$
17
Choosing a significance level
Significance level - A criterion for deciding whether to reject the null hypothesis or not.
• What is the convention? p < .05 (α level)
• A stricter criterion may be required if the risk of making a wrong decision (a Type I error) is greater than usual: p < .01 or p < .001.
• But there is a trade-off in using a stricter criterion.
18
Choosing a significance level
Type II error - Failure to reject the null hypothesis when it is really false (β).
• Concluding that the difference is due to chance variation when it is really due to the independent variable
• Power of the statistical test (1 - β)
19
Summary Chart
Reality Check vs. Decision based on Statistical Results

                Fail to reject H0        Reject H0
H0 is true      Correct (p = 1 - α)      Type I error (p = α)
H0 is false     Type II error (p = β)    Correct (p = 1 - β)
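The four cells of the chart can be made concrete with a small simulation. This is an illustrative sketch, not part of the original lecture: it draws two groups of 8 from normal populations (an assumption), runs the independent-groups t test against the two-tailed critical value 2.145 for df = 14, and counts rejections. The effect of 1.5 SD, the seed, the trial count and the function names are all arbitrary choices.

```python
import math
import random
import statistics

T_CRIT = 2.145  # two-tailed critical t for df = 14, alpha = .05


def t_independent(a, b):
    """Pooled-variance t statistic for two independent groups."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * statistics.variance(a)
              + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled * (1 / n1 + 1 / n2))


def rejection_rate(true_difference, trials=2000, n=8, seed=1):
    """Proportion of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(true_difference, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_independent(a, b)) > T_CRIT:
            rejections += 1
    return rejections / trials


# H0 true: every rejection is a Type I error, so the rate should sit near alpha.
type_i_rate = rejection_rate(0.0)
# H0 false (assumed effect of 1.5 SD): the rejection rate estimates power, 1 - beta.
power = rejection_rate(1.5)
print(f"Type I error rate: {type_i_rate:.3f}")
print(f"Power: {power:.3f}  (Type II error rate: {1 - power:.3f})")
```

Lowering `T_CRIT` raises both rates and raising it lowers both, which is the trade-off in choosing a stricter criterion.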
20
Time estimation experiment
Time will go faster for people having fun than for those not having fun.
Two group design:
Fun - views cartoons with the captions for 10 min.
No Fun - views cartoons without captions for 10 min.
H0 = the time estimates of the two groups will be the same.
H1 = the fun group will have shorter estimates than the control group.
Table 13-2 Possible errors in the time estimation experiment (p. 381, 6th ed.)
1. We conclude that there was no difference in the time estimates made by the "fun" and "no fun" groups even though the treatments did produce an effect.
2. We conclude that there was a difference in the time estimates made by the "fun" and "no fun" groups even though the treatments produced little or no effect at all.
What type of errors were made in the two descriptions?
Description 1: Type II
Description 2: Type I
Type III - failure to accurately identify a Type I or Type II error. Note - The error has been corrected in the 7th ed., p. 390.
Type I = Reporting an effect that doesn't really exist
Type II = Missing an effect that does really exist
21
Questions to ask when selecting a test statistic
Table 14-1 The parameters of data analysis
1. How many independent variables are there?
2. How many treatment conditions are there?
3. Is the experiment run between or within subjects?
4. Are the subjects matched?
5. What is the level of measurement of the dependent variable?
22
Answers based on the water maze study
Table 13-1 The parameters of data analysis
1. How many independent variables are there? one
2. How many treatment conditions are there? two
3. Is the experiment run between or within subjects? between
4. Are the subjects matched? no
5. What is the level of measurement of the dependent variable? ratio
23
Levels of Measurement
Ratio - a measure of magnitude having equal intervals between values and having an absolute zero point.
Interval - same as ratio except that there is no true zero point.
Ordinal - a measure of magnitude in the form of ranks (not sure of equal intervals and no absolute zero).
Nominal - items are classified into categories that have no quantitative relationship to one another.
24
Choosing a test statistic
TABLE 14-2 Selecting a possible statistical test by number of independent variables and level of measurement

Design columns:
(1) One independent variable, two treatments: two independent groups
(2) One independent variable, two treatments: two matched groups (or within subjects)
(3) One independent variable, more than two treatments: multiple independent groups
(4) One independent variable, more than two treatments: multiple matched groups (or within subjects)
(5) Two independent variables, factorial design: independent groups
(6) Two independent variables, factorial design: matched groups (or within subjects)
(7) Two independent variables, factorial design: independent groups and matched groups (or between subjects and within subjects)

Level of measurement of dependent variable:
Interval or ratio: (1) t test for independent groups; (2) t test for matched groups; (3) One-way ANOVA; (4) One-way ANOVA (repeated measures); (5) Two-way ANOVA; (6) Two-way ANOVA (repeated measures); (7) Two-way ANOVA (mixed)
Ordinal: (1) Mann-Whitney U test; (2) Wilcoxon test; (3) Kruskal-Wallis test; (4) Friedman test
Nominal: (1) Chi square test; (3) Chi square test; (5) Chi square test
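One way to use Table 14-2 is as a lookup keyed by level of measurement and design. The sketch below is illustrative only: the key strings and the `choose_test` helper are made up, and placing the three chi square entries under the independent-groups designs is an assumption, since the flattened table does not show their columns.

```python
# Table 14-2 as a nested dictionary: level of measurement -> design -> test.
TESTS = {
    "interval/ratio": {
        "two independent groups": "t test for independent groups",
        "two matched groups": "t test for matched groups",
        "multiple independent groups": "One-way ANOVA",
        "multiple matched groups": "One-way ANOVA (repeated measures)",
        "factorial, independent groups": "Two-way ANOVA",
        "factorial, matched groups": "Two-way ANOVA (repeated measures)",
        "factorial, mixed": "Two-way ANOVA (mixed)",
    },
    "ordinal": {
        "two independent groups": "Mann-Whitney U test",
        "two matched groups": "Wilcoxon test",
        "multiple independent groups": "Kruskal-Wallis test",
        "multiple matched groups": "Friedman test",
    },
    # Assumption: chi square entries sit under the independent-groups designs.
    "nominal": {
        "two independent groups": "Chi square test",
        "multiple independent groups": "Chi square test",
        "factorial, independent groups": "Chi square test",
    },
}


def choose_test(level, design):
    """Return the test named in the table, or None if the cell is empty."""
    return TESTS.get(level, {}).get(design)


# Water maze study: ratio-level DV, one IV, two independent groups.
print(choose_test("interval/ratio", "two independent groups"))
```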
25
Heading Error: Statistical Analysis
t test for Independent Groups
1) Lay out Formula
$t_{obs} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left( \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} \right) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$
2) Plug in Values
$t_{obs} = \frac{22.74 - 9.08}{\sqrt{\left( \frac{(8 - 1) 236.29 + (8 - 1) 22.71}{8 + 8 - 2} \right) \left( \frac{1}{8} + \frac{1}{8} \right)}}$
26
Heading Error: Statistical Analysis
t test for Independent Groups
Formula
$t_{obs} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left( \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} \right) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$
8) Divide the numerator by the denominator.
$t_{obs} = \frac{13.66}{5.69} = 2.40$
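The arithmetic behind the observed t can be sketched from the summary statistics alone. The function name `t_from_summary` is hypothetical; the means, variances and group sizes are the ones reported on the earlier slides.

```python
import math


def t_from_summary(mean1, var1, n1, mean2, var2, n2):
    """Independent-groups t from group means, variances and sizes.

    Returns (t_obs, degrees of freedom).
    """
    df = n1 + n2 - 2
    # Pooled variance: the two group variances weighted by their df.
    pooled_variance = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the two means (the denominator).
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Group 1 = Male, group 2 = Female.
t_obs, df = t_from_summary(22.74, 236.29, 8, 9.08, 22.71, 8)
print(f"t_obs = {t_obs:.2f}, df = {df}")
```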
27
Determining significance
1. Was the hypothesis directional or nondirectional?
2. What was the significance level?
3. How many degrees of freedom do we have?
Degrees of freedom (df) - the number of members in a set of data that can vary or change value without changing the value of a known statistic for those data.
28
Answers to the questions
1. Was the hypothesis directional or nondirectional? Nondirectional, so two-tailed.
2. What was the significance level? p < .05
3. How many degrees of freedom do we have? 14
Look on page 531 of Myers & Hansen to find the critical value of t…
29
Answers to the questions
Or you could just go on-line… e.g.,
http://www.psychstat.missouristate.edu/introbook/tdist.htm
30
Heading Error: Statistical Analysis
t test for Independent Groups
$t_{obs} = \frac{13.66}{5.69} = 2.40$
$t_{crit} = 2.145$
p < .05, two-tailed
31
Compare to our computer output from SPSS
$t_{obs} = \frac{13.66}{5.69} = 2.40$
$t_{crit} = 2.145$
p < .05, two-tailed
32
Decision: Reject the null hypothesis
Are we done?
Conclusion
- How much importance should we attach to this finding?
- Was the effect just barely significant (p<.05)?
- What if the sig level was, p<.0001? Would this be a larger effect?
33
Assess the quality of the Experiment
1) Were control procedures adequate?
2) Were variables defined appropriately?
3) Is a Type I error likely?
Answers to the questions
The t test is a robust statistic…
This means that its assumptions can be moderately violated without substantially changing the rate of Type I or Type II error.
34
Effect size
Convert t to a correlation coefficient
$r = \sqrt{\frac{t^2}{t^2 + df}}$
$r = \sqrt{\frac{(2.40)^2}{(2.40)^2 + 14}}$
$r = .54$
$r^2 = .29$
According to Cohen (1988), r ≥ .50 is considered a large effect (.30 is a moderate effect and below .30 is a small effect).
The $r^2$ of .29 indicates that the IV accounts for 29% of the variability observed in the DV.
Online site for effect size calculator:
http://web.uccs.edu/lbecker/Psy590/escalc3.htm
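The conversion from t to r is easy to check by hand or in code. This is a minimal sketch with a hypothetical function name, using the observed t of 2.40 and df of 14 from the t test slides; the proportion of variability accounted for is simply r squared.

```python
import math


def effect_size_r(t, df):
    """Convert a t statistic to a correlation coefficient r."""
    return math.sqrt(t ** 2 / (t ** 2 + df))


r = effect_size_r(2.40, 14)
r_squared = r ** 2  # proportion of DV variability accounted for by the IV
print(f"r = {r:.2f}, r^2 = {r_squared:.2f}")
```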