exploring group differences

EXPLORING GROUP DIFFERENCES

Before Break:

1) Descriptive Statistics: Measures of central tendency Measures of variability Z-scores

2) Understanding statistical significance Hypothesis testing Alpha and p-values

3) Testing for relationships/associations between variables: Correlation (Pearson’s r) Simple regression Multiple regression

After Break:

1) Testing for Group differences T-tests ANOVA

2) Understanding statistical significance Effect Size Power

3) Nonparametric statistics and ‘other’ common tests Chi-square Logistic regression

If you have a firm grip on pre-break material – the second half of the

course becomes much easier (in my opinion)

Quick review I have a dataset that contains information on fitness

and academic performance in middle-school children I want to know if fitness is related to academic success I’m going to use PACER laps to quantify fitness and ISAT

science scores to quantify academic success

I can answer this question in various ways – let’s start with measures of association (correlations)

What would be my null and alternative hypothesis using a correlation?

Results

What is the relationship between aerobic fitness and science ISAT results?

p = 0.009, what does this mean? Low chance of random sampling error We would only see a correlation this strong, or stronger, 9

times out of 1000 due to random sampling error (due to chance)

Association

Association (and prediction) statistics like correlation and regression are useful, but can be limited

The other “half” of statistical testing is centered around determining group differences

For example, we could ask our fitness/academics question a different way and use a different set of statistics Also useful in experiments (treatment vs control),

comparing genders (males vs females), etc…

Example

Imagine I use PACER laps to split kids into two different groups High Fitness (high number of laps) Low Fitness (low number of laps)

NOTE: I took a continuous variable and made it into a categorical variable (nominal/ordinal)

Now I can ask the question a different way What are my null and alternative hypotheses? Remember, I believe that fitness is related to

academic success

Example cont

HO: There is no difference in science scores between the high fitness and low fitness group Notice, no difference would mean fitness has no effect

HA: There is a difference in science scores between the high fitness and low fitness group A difference would indicate that fitness has some

effect

This is simple enough – we know how to calculate and compare means in SPSS…

High vs Low Fitness: Mean

Conclusion…? Should I reject the null hypothesis?

Wait – could this difference be due to random sampling error?

Need for new statistical test

Is this difference due to random sampling error?

Due to the effect of random sampling, the two groups will NEVER have the exact same science scores I need a way to determine if this difference is

REAL or due to RSE I need to use a statistical test that can

determine group differences and provide me with a p-value

T-test

A t-test is a family of statistical tests designed to determine if differences exist between two groups (and ONLY two groups) Based on t-scores (which are very similar to z-

scores), should tip you off they are based on mean and SD

They test for ‘equality of means’ If the two group means are equal – then there is no

difference 3 major types of t-tests

One sample t-test, independent samples t-test, paired-samples t-test

T-tests One-sample t-test =

Compares mean of a single sample to known population mean i.e., group of 100 people took IQ test, are they different from the

population average? Do they have above average IQ? Independent samples t-test =

Compares the means scores of two different groups of subjects i.e., are science scores different between high fitness and low fitness

Paired-samples t-test = Compares the mean scores for the same group of subjects on

two different occasions i.e., is the group different before and after a treatment?

Also called a dependent t-test or a repeated measures t-test

In all cases TWO group means are being compared

Independent Samples T-Test

Let’s start here, since we need to use this test for our fitness/science question

Independent Samples T-tests: Used with a two-level, categorical, independent

variable (High/Low Fitness) – ONLY two groups… and with one continuous dependent variable

(science ISAT scores) Statistical assumptions – 1) data are normally

distributed, 2) samples represent the population, 3) the variance of the two groups are similar (homoscedasticity of variance)

NOTE: Same as correlation/regression – except we no longer have to worry about a ‘linear’ relationship since one of our variables is categorical (high/low fitness)

SPSS: Data format

In SPSS, the science scores are my continuous, dependent variable

I created the ‘high’ and ‘low’ fitness groups based on how many PACER laps each child completed When I created them, I coded ‘high fitness’

as 0 and ‘low fitness’ as 1 You need to recognize how your data are coded

for a t-test

SPSS – T-test

Move dependent variable to “Test Variable”

Move your independent variable to “Grouping Variable” Notice, it now has 2

question marks SPSS needs to know

which groups to compare

“Define Groups”

SPSS – T-test

Recall, ‘high fitness’ was 0, ‘low fitness’ was 1 Manually enter these values into the box When done, hit “continue’, then ‘ok’

T-test results

The first box will contain what you’ve already seen – the mean of the two groups:

Notice, n, mean, standard deviation (ignore SE) for each group

The next box is too big for one screen, so I’ve split it into two pieces…

SPSS results - Output

Recall, both groups need to have equal variance (homogeneity of variance, or homoscedasticity)

SPSS tests for this using “Levene’s Test” Null hypothesis = There is equal variance This means you do NOT want a p-value <

0.05

SPSS results - Output

If this Levene’s Test p-value is > 0.05 Equal variances exist, use the top line of

the table If this Leven’s Test p-value is < or =

0.05 Equal variance does not exist, use the

bottom line Becomes harder to find a statistically

significant result

df – Degrees of Freedom The table also shows df, or ‘degrees of freedom’ df is used to calculate the t-score and p-value

for the t-test df = n – 1

For each group you have, subtract 1 from the sample

We have two groups: High Fitness, n = 176, so 176 – 1 = 175 Low Fitness, n = 98, so 98 – 1 = 97 Total n = 176 + 98 = 274, we have 2 groups so… Total df = 274 – 2 = 272

Degrees of Freedom

df is important to understand if you are calculating the p-values by hand – we are NOT

All you need to know now is that: Larger sample size = ↑ df More groups = ↓ df

You want large df because it reduces your chance of random sampling error (a large sample) and increases the chance you’ll find statistically significant results

This becomes more important beyond t-tests, since we can have several groups (not just 2)

df in our example

Notice, the df in our example is 272 (274 subjects minus our two groups (high and low fitness)

If you do not have equal variances, SPSS ‘downgrades’ your df, making it more difficult to find statistically significant results

Before we move on… Questions about equality of variance test? df? Remember, we’re trying to determine if the

difference between the two groups is real – or due to RSE

What we know so far:

And, the two groups do have equal variance

More results

Here is the important stuff (remember, using top line): Our two groups (high/low fitness) had a mean

difference of 12.2 on the science ISAT 239.1 – 226.9 = 12.2

This difference is statistically significant, p = 0.001

Decisions

HO: There is no difference in science scores between the high fitness and low fitness group

HA: There is a difference in science scores between the high fitness and low fitness group

Decision? Results: The high fitness group scored higher than

the low fitness group on their science ISAT test by 12.2 points. This difference was statistically significant, t (272) 3.262, p = 0.001.

Usually report the t value of the test and the degrees of freedom in the paper (from table)

Questions about t-test results?

One more thing…

Notice in the t-test table that we also were provided with a 95% confidence interval:

95% confidence intervals are a statistic available from most tests, and are related to p-values. Lower Bound = 4.8, upper bound = 19.5

95% confidence intervals

Confidence intervals are similar to p-values Remember, p-values indicate probability of random

sampling error We want low p-values, which indicate a low

probability of random sampling error We most often use a p-value cutoff of 0.05,

meaning we like to be 0.95 (or 95% confident) that this was NOT due to random sampling error

Confidence intervals give you a similar type of information, but in a more practical sense Many people prefer confidence intervals over p-

values

95% confidence intervals

Remember, in statistics we are using samples to try and figure out information about the population When we calculate a mean for a sample, we are really

trying to understand what the REAL population mean is But, due to random sampling error, we always know

that our sample mean is different from the real population mean

Example – mean IQ score for all 7 billion humans is 100 Sample 1 of 100 humans = 102.1, Sample 2 = 105.3,

Sample 3 = 98.2, etc… Random sampling error

IQ

10085 11570 130

X = 100SD = 15

55 145

Distribution of IQ scores from the entire population

Sample 1 Mean = 102.1Sample 2 Mean = 105.3Sample 3 Mean = 98.2

1

23

Pretend we keep on drawing more and more samples until we got 100 different samples and 100 different lines on this chartIf we did that, would there be a pattern to where the lines were

drawn?

Would ALL the lines be so close to the population mean of 100?

10085 11570 130

X = 100SD = 15

55 145


Not all samples will be close to 100, just due to random sampling error

However, a 95% confidence interval would tell you where 95% of the 100 lines fell

95% Confidence Interval

10085 11570 13055 145


Could also make a 99% confidence interval if we wanted to

But notice, the ‘more confident’ we want to be, the wider the gap gets


Usually, people stick with a 95% confidence interval (since we usually use a p-value of 0.05)

10085 11570 130

X = 100SD = 15

55 145


In this example, a 95% confidence interval indicates that we are 95% certain that the REAL population mean falls between these two values


We can use a 95% confidence interval for virtually any population parameter we want to – such as a correlation coefficient, a regression slope, or a mean difference between two groups (like with our t-test)

Back to our t-test

Our 95% confidence interval:

Notice is says, “Interval of the Difference” We are 95% certain that the real difference is AT

LEAST as big as 4.8 and might be up to 19.5 points between our High and Low Fitness Groups

Our confidence level can never be 100%, so there is always a chance the real population difference is outside of this range (just like p can never be 0)

Confidence Intervals and p-values These two values are connected because:

Both related to RSE Both calculated using n (and df)

A low p-value (low chance of random sampling error) will result in a smaller (more narrow) confidence interval – we can be more confident

A larger p-value will result in a wider confidence interval – we are less confident

Questions on confidence intervals?

One more example t-test

Instead of using Aerobic fitness, let’s use flexibility

I split my sample into High Flexibility and Low Flexibility groups (based on sit and reach test)

Now, I’ll run a t-test to see if the High Flexibility kids score higher on their science tests than the Low Flexibility kids

What are my hypotheses?

HO: There is no difference in science scores between the high flexibility and low flexibility group

HA: There is a difference in science scores between the high flexibility and low flexibility group

T-test results: Flexibility

We can see that the high flexibility group has a higher Science ISAT score by about 5, but is this difference statistically significant???

T-test results: Flexibility

Levene’s Test p = 0.521 What does this mean?

df = 285 What is our sample size?

T-test results

Notice the mean difference (difference between High/Low groups) and the 95% confidence interval I have intentionally removed the p-value for this t-test

Is there a statistically significant difference between the two groups? Is the p-value < or = 0.05?

T-test results

Remember, to reject the null hypothesis we have to be reasonably certain that the two groups are different If this was the case, the difference between the

two groups could NOT be 0 If the mean difference is 0, the two groups are

identical

When the 95% confidence interval INCLUDES 0, we can’t be 95% certain that there is a group difference – and therefore, p is > 0.05

T-test results

Our 95% confidence interval includes 0 (one number is negative and the other is positive)

Therefore, we can’t be 95% certain the real group difference is NOT 0

p = 0.311, we can’t be sure this is not due to RSE

95% CI and p

If your 95% CI includes 0, your p-value will NOT be less than or equal to 0.05 Because both statistics are evaluating the

chance of RSE

If your 95% CI does not include 0 (both numbers are positive or both are negative), then we can be confident that the two groups are not the same

This means that p < or = 0.05Questions…?

Upcoming…

In-class activity

Homework: Cronk re-read 6.1, complete 6.3 (skip 6.2 for

now) Holcomb Exercises 35 and 37, 38, 39

More t-tests next week Single sample t-test Paired samples t-test (repeated measures t-test)

Example In-Class, 10 minutes Go to Blackboard and open the SPSS dataset

‘Fitness and Academics Reduced’ (week 7) Run two different independent samples t-

tests Determine if kids who are aerobically fit (using

PACER) score higher in reading or math than kids who have low fitness

Write down your results in this format (x2): T = XX, df = XX, p = XX Mean difference = XX, 95%CI (XX to XX)

Results of two t-tests

Results of two t-tests

Reading (equal variances NOT assumed): t = 4.856, df = 411.2, p < 0.0005 Mean difference = 9.8, 95%CI (5.9 to 13.8)

Math (equal variances assumed): t = 4.021, df = 837, p < 0.0005 Mean difference = 8.8, 95%CI (4.5 to 13.0)

exploring group differences

Documents

aerobic fitness

low fitness groupa difference

low fitness groupnotice

spsshigh vs low fitness

isat science scores

half of statistical

effect of random sampling

science scoresi