1 why do we need statistics? a.to confuse students b.to torture students c.to put the fear of the...

1

Why do we need statistics?

A. To confuse students
B. To torture students
C. To put the fear of the almighty in them
D. To ruin their GPA, so that they don't get into grad school, have to bus tables and move back in with parents
E. All of the above
F. All of the above (and other tragic outcomes)

2

A positive, optimistic view…

It is a tool that could help you succeed and move out of your parents' house
There is nothing to fear but fear itself
You need a passing grade of C
Can help to get into grad school
It's important to understand so you don't get scammed…

3

The Caveat: Remember…

There are lies
There are d#$m (darn) lies
and then there are statistics

Magic

4

Statistics

The science of collecting, displaying and analyzing data

Based on quantitative measurements of samples

Allows us to objectively evaluate data

Descriptive

Inferential

5

Defining variability

Amount of change or fluctuation
Some variability is expected
Is the observed variability due to the usual variability among subjects from the population?
Or is the observed variability greater than the usual variability?

6

Frequency Distribution

[Figure: frequency distributions of the population and of Sample 1, Sample 2, and Sample 3. x-axis: dependent variable (0 to highest score); y-axis: frequency (# of subjects).]

7


Untreated groups of an experiment

[Figure: frequency distributions of the Control and Experimental groups. x-axis: dependent variable; y-axis: frequency (# of subjects).]

8


Treated Groups

[Figure: frequency distributions of the Control and Experimental groups. x-axis: dependent variable; y-axis: frequency (# of subjects).]

Beginning steps of an Experiment

Sample from population
Hypothesis
Define variables
Assign subjects to conditions
Measure performance
Calculate means
Calculate variability

10

Heading Error: Calculating Variance

Deviation from the mean for each subject: $(X_i - \bar{X})$

Square the deviation from the mean for each subject: $(X_i - \bar{X})^2$

Add the squared deviations together: $\sum (X_i - \bar{X})^2$

Sex     SUBJECT$  HEADGdeg  |HEADGdeg|  X_i - mean  (X_i - mean)^2
Female  rat1         -4.4       4.4       -4.68         21.86
Female  rat3         11        11          1.93          3.71
Female  rat5          2.3       2.3       -6.78         45.90
Female  rat11       -10.8      10.8        1.73          2.98
Female  rat15        17.8      17.8        8.73         76.13
Male    rat2         29.6      29.6        6.86         47.09
Male    rat4        -18.5      18.5       -4.24         17.96
Male    rat6         14.5      14.5       -8.24         67.86
Male    rat8         58.2      58.2       35.46       1257.59
Male    rat10       -18.7      18.7       -4.04         16.30
Male    rat12       -17.3      17.3       -5.44         29.57
Male    rat14       -14.8      14.8       -7.94         63.00
Male    rat16        10.3      10.3      -12.44        154.69

$\sum (X_i - \bar{X})^2$: Female = 158.96, Male = 1654.06
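The three steps above (deviation, squared deviation, sum) can be sketched in a few lines of Python. This is only an illustration: the list holds the absolute heading errors of the eight male rats from the table, and the name `sum_of_squares` is a made-up helper, not something from the slides.

```python
# Absolute heading errors (deg) for the eight male rats, copied from the table.
male_abs_error = [29.6, 18.5, 14.5, 58.2, 18.7, 17.3, 14.8, 10.3]

def sum_of_squares(scores):
    """Sum of squared deviations from the mean (hypothetical helper)."""
    mean = sum(scores) / len(scores)
    # Step 1: deviation from the mean; step 2: square it; step 3: add them up.
    return sum((x - mean) ** 2 for x in scores)

ss_male = sum_of_squares(male_abs_error)
print(f"{ss_male:.2f}")  # 1654.06
```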

11

Heading Error: Calculating Variance

Compute the Variance

$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$

$s^2_{Female} = 22.71$

$s^2_{Male} = 236.29$
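As a rough check of the male variance, the sketch below divides the sum of squared deviations by n - 1 and compares the result with `statistics.variance` from the Python standard library, which also uses the n - 1 denominator. The variable names are illustrative only.

```python
import statistics

# Absolute heading errors (deg) for the eight male rats.
male_abs_error = [29.6, 18.5, 14.5, 58.2, 18.7, 17.3, 14.8, 10.3]

n = len(male_abs_error)
mean = sum(male_abs_error) / n
# Sample variance: sum of squared deviations divided by n - 1.
variance_male = sum((x - mean) ** 2 for x in male_abs_error) / (n - 1)

print(f"{variance_male:.2f}")                        # 236.29
print(f"{statistics.variance(male_abs_error):.2f}")  # same value from the stdlib
```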

12

Heading Error: Calculating standard deviation

Standard deviation

$s$ or $SD = \sqrt{s^2}$

$s_{Female} = 4.77$

$s_{Male} = 15.37$

13

Heading Error: Calculating standard error of the Mean

Standard error of the Mean

$SEM = \frac{SD}{\sqrt{n}}$

$SEM_{Female} = 1.68$

$SEM_{Male} = 5.43$
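The standard deviation and SEM for the male group can be reproduced with a short sketch like the one below. It assumes the eight male absolute heading errors from the variance table and uses `statistics.stdev` (sample SD, n - 1 denominator).

```python
import math
import statistics

# Absolute heading errors (deg) for the eight male rats.
male_abs_error = [29.6, 18.5, 14.5, 58.2, 18.7, 17.3, 14.8, 10.3]

sd_male = statistics.stdev(male_abs_error)            # square root of the sample variance
sem_male = sd_male / math.sqrt(len(male_abs_error))   # SEM = SD / sqrt(n)

print(f"SD  = {sd_male:.2f}")   # SD  = 15.37
print(f"SEM = {sem_male:.2f}")  # SEM = 5.43
```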

14

Heading Error: Group Means with SEM

[Figure: bar chart of group means with SEM error bars. x-axis: Group (Male, Female); y-axis: Absolute Heading Error (deg), 0 to 30.]

15

Heading Error: Group Means with 95% Confidence Interval

[Figure: bar chart of group means with 95% confidence interval error bars. x-axis: Group (Male, Female); y-axis: Heading Error, 0 to 40.]

Confidence intervals (CI) represent a range of values above and below our sample mean that is likely to contain the population mean; i.e., the true mean of the population is likely (we're 95% confident) to fall somewhere within the CI range.

$CI = \bar{X} \pm t_{crit} \left( \frac{SD}{\sqrt{n}} \right)$
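A minimal sketch of the CI formula for the male group follows. The critical value 2.365 is an assumption taken from a standard t table (two-tailed, alpha = .05, df = n - 1 = 7); it does not appear on the slides.

```python
import math
import statistics

# Absolute heading errors (deg) for the eight male rats.
male_abs_error = [29.6, 18.5, 14.5, 58.2, 18.7, 17.3, 14.8, 10.3]

n = len(male_abs_error)
mean = statistics.mean(male_abs_error)
sem = statistics.stdev(male_abs_error) / math.sqrt(n)

# Assumed table value: two-tailed t_crit for alpha = .05 with df = 7.
t_crit = 2.365

margin = t_crit * sem
ci_lower, ci_upper = mean - margin, mean + margin
print(f"95% CI: {ci_lower:.2f} to {ci_upper:.2f}")
```

The interval is wide because n is small and one male score (58.2) is far from the rest.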

16

Heading Error: Group Means with SEM

[Figure: bar chart of group means with SEM error bars. x-axis: Group (Male, Female); y-axis: Absolute Heading Error (deg), 0 to 30.]

variance ($s^2$) - average squared deviation of scores from their mean
standard deviation (SD) - average deviation of scores about the mean
standard error of the mean (SEM) - dispersion of the distribution of sample means

$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$

$SD = \sqrt{s^2}$

$SEM = \frac{SD}{\sqrt{n}}$

17

Choosing a significance level

Significance level - A criterion for deciding whether to reject the null hypothesis or not.

• What is the convention? p < .05 (α level)

• A stricter criterion may be required if the risk of making a wrong decision (a Type I error) is greater than usual: p < .01 or p < .001.

• But there is a trade-off in using a stricter criterion.

18

Choosing a significance level

Type II error – Failure to reject the null hypothesis when it is really false (β).

• Concluding that the difference is due to chance variation when it is really due to the independent variable

• Power of the statistical test (1 - β)

19

Summary Chart

Reality Check    Decision based on Statistical Results
                 Fail to reject H0         Reject H0
H0 is true       Correct, p = 1 - α        Type I error, p = α
H0 is false      Type II error, p = β      Correct, p = 1 - β
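The four cells of the chart can be written as a tiny lookup function. This is only a sketch; `classify_decision` is a made-up name used for illustration.

```python
def classify_decision(h0_is_true: bool, rejected_h0: bool) -> str:
    """Name the outcome for one cell of the summary chart (hypothetical helper)."""
    if h0_is_true:
        # Rejecting a true null hypothesis is a Type I error (probability alpha).
        return "Type I error" if rejected_h0 else "Correct"
    # Failing to reject a false null hypothesis is a Type II error (probability beta).
    return "Correct" if rejected_h0 else "Type II error"

print(classify_decision(h0_is_true=True, rejected_h0=True))    # Type I error
print(classify_decision(h0_is_true=False, rejected_h0=False))  # Type II error
```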

20

Time estimation experiment

Time will go faster for people having fun than for those not having fun.

Two-group design:
Fun – views cartoons with the captions for 10 min.
No Fun – views cartoons without captions for 10 min.

H0 = the time estimates of the two groups will be the same.
H1 = the fun group will have shorter estimates than the control group.

Table 13-2 Possible errors in the time estimation experiment (p. 381, 6th ed.)

We conclude that there was no difference in the time estimates made by the "fun" and "no fun" groups even though the treatments did produce an effect.

We conclude that there was a difference in the time estimates made by the "fun" and "no fun" groups even though the treatments produced little or no effect at all.

What type of errors were made in the two descriptions?

Type I    Type II

Type III – failure to accurately identify a Type I or Type II error
Note - The error has been corrected in the 7th ed., p. 390.

Type I = Reporting an effect that doesn't really exist

Type II = Missing an effect that does really exist

21

Questions to ask when selecting a test statistic

Table 14-1 The parameters of data analysis
______________________________________________________
1. How many independent variables are there?
2. How many treatment conditions are there?
3. Is the experiment run between or within subjects?
4. Are the subjects matched?
5. What is the level of measurement of the dependent variable?
______________________________________________________

22

Answers based on the water maze study

Table 13-1 The parameters of data analysis
______________________________________________________
1. How many independent variables are there? one
2. How many treatment conditions are there? one
3. Is the experiment run between or within subjects? between
4. Are the subjects matched? no
5. What is the level of measurement of the dependent variable? ratio
______________________________________________________

23

Levels of Measurement

Ratio – a measure of magnitude having equal intervals between values and having an absolute zero point.

Interval – same as ratio except that there is no true zero point.

Ordinal – a measure of magnitude in the form of ranks (not sure of equal intervals and no absolute zero).

Nominal – items are classified into categories that have no quantitative relationship to one another.

24

Choosing a test statistic

TABLE 14-2 Selecting a possible statistical test by number of independent variables and level of measurement

One Independent Variable, Two Treatments
  Two independent groups
    Interval or ratio: t test for independent groups
    Ordinal: Mann-Whitney U test
    Nominal: Chi square test
  Two matched groups (or within subjects)
    Interval or ratio: t test for matched groups
    Ordinal: Wilcoxon test

One Independent Variable, More Than Two Treatments
  Multiple independent groups
    Interval or ratio: One-way ANOVA
    Ordinal: Kruskal-Wallis test
    Nominal: Chi square test
  Multiple matched groups (or within subjects)
    Interval or ratio: One-way ANOVA (repeated measures)
    Ordinal: Friedman test

Two Independent Variables, Factorial Designs
  Independent groups
    Interval or ratio: Two-way ANOVA
    Nominal: Chi square test
  Matched groups (or within subjects)
    Interval or ratio: Two-way ANOVA (repeated measures)
  Independent groups and matched groups (or between subjects and within subjects)
    Interval or ratio: Two-way ANOVA (mixed)
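Part of the table can be expressed as a simple lookup. The sketch below covers only designs with one independent variable and interval/ratio or ordinal data; the dictionary layout and the name `choose_test` are illustrative assumptions, not part of the textbook table.

```python
# Keys: (level of measurement, more than two treatments?, matched or within subjects?)
TESTS = {
    ("interval/ratio", False, False): "t test for independent groups",
    ("interval/ratio", False, True): "t test for matched groups",
    ("interval/ratio", True, False): "One-way ANOVA",
    ("interval/ratio", True, True): "One-way ANOVA (repeated measures)",
    ("ordinal", False, False): "Mann-Whitney U test",
    ("ordinal", False, True): "Wilcoxon test",
    ("ordinal", True, False): "Kruskal-Wallis test",
    ("ordinal", True, True): "Friedman test",
}

def choose_test(level, n_treatments, matched):
    """Suggest a test for a one-IV design (hypothetical helper)."""
    if level in ("interval", "ratio"):
        level = "interval/ratio"
    return TESTS[(level, n_treatments > 2, matched)]

# Two independent groups, ratio-level dependent variable:
print(choose_test("ratio", 2, False))  # t test for independent groups
```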

25

Heading Error: Statistical Analysis
t test for Independent Groups

1) Lay out Formula

$t_{obs} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left( \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \right) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$

2) Plug in Values

$t_{obs} = \frac{22.74 - 9.08}{\sqrt{\left( \frac{(8 - 1)236.29 + (8 - 1)22.71}{8 + 8 - 2} \right) \left( \frac{1}{8} + \frac{1}{8} \right)}}$

26


Formula

$t_{obs} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left( \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \right) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$

8) Divide the numerator by the denominator.

$t_{obs} = \frac{13.66}{5.69} = 2.40$
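The hand calculation can be checked with a short function that follows the formula term by term. This is a sketch using the group means, variances, and sample sizes from the slides; `t_independent` is a made-up name.

```python
import math

def t_independent(mean1, mean2, var1, var2, n1, n2):
    """t for two independent groups using the pooled variance (hypothetical helper)."""
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # denominator of the formula
    return (mean1 - mean2) / std_error                     # numerator / denominator

# Male vs. female absolute heading error: means, variances, and n from the slides.
t_obs = t_independent(22.74, 9.08, 236.29, 22.71, 8, 8)
print(f"{t_obs:.2f}")  # 2.40
```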

27

Determining significance

1. Was the hypothesis directional or nondirectional?

2. What was the significance level?

3. How many degrees of freedom do we have?

Degrees of freedom (df) – the number of members in a set of data that can vary or change value without changing the value of a known statistic for those data.

28

Answers to the questions

1. Was the hypothesis directional or nondirectional? Nondirectional, so two-tailed.

2. What was the significance level? p < .05

3. How many degrees of freedom do we have? 14

Look on page 531 of Myers & Hansen to find the critical value of t…

29


Or you could just go on-line… e.g.,

http://www.psychstat.missouristate.edu/introbook/tdist.htm

30


Formula

$t_{obs} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left( \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \right) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$

$t_{obs} = \frac{13.66}{5.69} = 2.40$

$t_{crit} = 2.145$

p < .05, two-tailed
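The decision rule amounts to comparing the absolute value of the observed t with the critical t. A minimal sketch, using the values from the slides (t_obs = 2.40, t_crit = 2.145 for df = 14, two-tailed, p < .05):

```python
n1, n2 = 8, 8
df = n1 + n2 - 2   # degrees of freedom for two independent groups

t_obs = 2.40       # observed t from the hand calculation
t_crit = 2.145     # critical t from the table: df = 14, two-tailed, alpha = .05

# Two-tailed test: reject H0 if the observed t is more extreme than t_crit in either direction.
reject_null = abs(t_obs) > t_crit
print(df, reject_null)  # 14 True
```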

31

Compare to our computer output from SPSS

$t_{obs} = \frac{13.66}{5.69} = 2.40$

$t_{crit} = 2.145$

p < .05, two-tailed

32

Decision: Reject the null hypothesis

Are we done?

Conclusion

- How much importance should we attach to this finding?
- Was the effect just barely significant (p < .05)?
- What if the sig level was p < .0001? Would this be a larger effect?

33

Assess the quality of the Experiment

1) Were control procedures adequate?

2) Were variables defined appropriately?

3) Is a Type I error likely?

The t test is a robust statistic…
This means that its assumptions can be violated without substantially changing the rate of Type I or Type II errors.

34

Effect size
Convert t to a correlation coefficient

$r = \sqrt{\frac{t^2}{t^2 + df}}$

$r = \sqrt{\frac{(2.40)^2}{(2.40)^2 + 14}}$

$r = .54$

$r^2 = .29$

According to Cohen (1988), r ≥ .50 is considered a large effect (.30 is a moderate effect and below .30 is a small effect).

The $r^2$ of .29 indicates that the IV accounts for 29% of the variability observed in the DV.

Online site for effect size calculator:
http://web.uccs.edu/lbecker/Psy590/escalc3.htm
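The conversion from t to r can be checked with a few lines of Python. This is a sketch of the formula on this slide using t = 2.40 and df = 14; squaring r gives the proportion of variability accounted for.

```python
import math

t_obs = 2.40
df = 14

# r = sqrt(t^2 / (t^2 + df))
r = math.sqrt(t_obs ** 2 / (t_obs ** 2 + df))
r_squared = r ** 2

print(f"r   = {r:.2f}")          # r   = 0.54
print(f"r^2 = {r_squared:.2f}")  # r^2 = 0.29
```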

35

