inputting data for a single sample t

Single Sample t-Tests


Welcome to a presentation explaining the concepts behind the use of a single sample t-test in determining the probability that a sample and a population are similar to or different from one another statistically.

We will follow an example in which researchers attempt to determine whether the sample they have collected is statistically significantly different from a population.

Their hope is that the sample and population are statistically similar to one another, so they can claim that results of experiments done to the sample are generalizable to the population.

Let’s imagine that this is the population distribution for IQ scores in the country:

It has a population mean of 100:

μ = 100

This Greek symbol, μ (mu), represents the mean of a population.

We decide to select a random sample to do experiments on. So, we randomly select 20 persons.

Let’s say that sample of 20 has an IQ score mean of 70:

μ = 100, x̄ = 70

Note, this x with a bar over it (x̄) is the symbol for a sample mean. Again, the Greek symbol μ is the symbol for a population mean.

Along with a mean of 70, this sample has a distribution that looks like this:

[Figure: sample distribution with x̄ = 70; population distribution with μ = 100]

So, here’s the question:

Is this randomly selected sample of 20 IQ scores representative of the population?

The Single Sample t-test is a tool used to determine the probability that it is or is not.

So, how do we determine if the sample is a good representative of the population?

Let’s look at the population distribution of IQ scores first:

One thing we notice right off is that it has a normal distribution.

Normal distributions have some important properties or attributes that make it possible to consider rare or common occurrences.

Also, normal distributions have some constant percentages that are true across all normal distributions.

Attribute #1: 50% of all of the scores are above the mean and the other 50% of the scores are below the mean


[Figure: normal curve with the mean marked; 50% of all scores on each side of the mean]

Before going on let’s take a brief time out

The next section requires an understanding of the concept of standard deviation.

If you are unfamiliar with this concept do a search for standard deviation in this software. After viewing it return to slide 36 of this presentation.

Time in – let’s get back to the instruction

Attribute #2: About 68% of all of the scores are between +1 standard deviation and -1 standard deviation from the mean.

[Figure: normal curve with the mean, -1 sd and +1 sd marked; 68% of all scores between them]

So what this means is: if you were randomly selecting a score from this population, you would have a 68% chance, or .68 probability, of pulling that score from this part of the distribution.

Let’s put some numbers to this idea.

The mean of IQ scores across the population is 100 (μ = 100).

With a population standard deviation (σ) of 15, +1 standard deviation would be at an IQ score of 115 and -1 standard deviation would be at 85.

[Figure: normal curve with μ = 100, -1σ = 85 and +1σ = 115 marked; 68% of all scores between them]

So, there is a 68% chance, or .68 probability, that a randomly selected score falls between IQ scores of 85 and 115.

Likewise, 2 standard deviation units above and below the mean contain about 95% of all scores.

2 standard deviation units above the mean would be an IQ score of 130, or 100 + 2 × 15.

2 standard deviation units below the mean would be an IQ score of 70, or 100 - 2 × 15.

Now, it just so happens that about 95% of all scores are between +2 and -2 standard deviations in a normal distribution.

[Figure: normal curve with μ = 100, -1σ = 85, +1σ = 115, -2σ = 70 and +2σ = 130 marked; 95% of all scores between -2σ and +2σ]

This means that there is a .95 chance that a score we select at random would come from between these two points.

Attribute #3: About 99% of all scores are between +3 and -3 standard deviations (the figure is closer to 99.7%).

[Figure: normal curve with μ = 100, ±1σ = 85 and 115, ±2σ = 70 and 130, -3σ = 55 and +3σ = 145 marked; 99% of all scores]

These standard deviation cutoffs are only approximate. Here are the more exact values: 95% of all scores lie within ±1.96 standard deviations of the mean, and 99% lie within ±2.58 standard deviations. With μ = 100 and σ = 15, ±1.96σ corresponds to IQ scores of about 70.6 and 129.4 (labeled 70 and 130 on the slides), and ±2.58σ corresponds to about 61.3 and 138.7.

[Figure: normal curve with μ = 100 and the points ±1 sd (85, 115), ±1.96 sd (about 70, 130) and ±2.58 sd (about 61, 139) marked; 99% of all scores between -2.58 sd and +2.58 sd]
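The percentages quoted for each cutoff can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper names `coverage` and `cutoffs` are made up for this illustration, and the population values (mean 100, standard deviation 15) are the ones from the IQ example.

```python
from statistics import NormalDist

# Population of IQ scores from the example: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)


def coverage(k):
    """Share of scores lying within k standard deviations of the mean."""
    return iq.cdf(iq.mean + k * iq.stdev) - iq.cdf(iq.mean - k * iq.stdev)


def cutoffs(k):
    """IQ scores lying k standard deviations below and above the mean."""
    return (iq.mean - k * iq.stdev, iq.mean + k * iq.stdev)


# Compare the rounded cutoffs (1, 2, 3) with the more exact ones (1.96, 2.58).
for k in (1, 1.96, 2, 2.58, 3):
    low, high = cutoffs(k)
    print(f"+/-{k} sd: {low:.1f} to {high:.1f} holds {coverage(k):.1%} of scores")
```

Running it shows why 1.96 and 2.58 are used in place of 2 and 3 when exactly 95% and 99% of the scores are wanted.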

Based on the percentages of a normal distribution, we can work out the percentage of scores below each standard deviation point.

Attribute #4: Percentages can be calculated below or above each standard deviation point in the distribution.

[Figure: normal curve with μ = 100 and the points ±1 sd, ±1.96 sd and ±2.58 sd marked; 68% of all scores within ±1 sd, 95% within ±1.96 sd, 99% within ±2.58 sd]

With this information we can determine the probability that scores will fall into a number of portions of the distribution.

For example, if you randomly selected a person, there is:

• a 0.5% chance that their IQ score would be below about 61 (-2.58 SD)
• a 2.5% chance that their IQ score would be below about 70 (-1.96 SD)
• a 16% chance that their IQ score would be below 85 (-1 SD)
• a 50% chance that their IQ score would be below 100 (0 SD)
• a 50% chance that their IQ score would be above 100 (0 SD)
• a 16% chance that their IQ score would be above 115 (+1 SD)
• a 2.5% chance that their IQ score would be above about 130 (+1.96 SD)
• a 0.5% chance that their IQ score would be above about 139 (+2.58 SD)
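As a rough illustration of these tail percentages, here is a short sketch that computes the share of scores below, above, or between IQ values. It assumes the example population (mean 100, standard deviation 15); the function names `pct_below`, `pct_above` and `pct_between` are hypothetical helpers, not part of the original presentation.

```python
from statistics import NormalDist

# Population of IQ scores from the example: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)


def pct_below(score):
    """Probability that a randomly selected score is below `score`."""
    return iq.cdf(score)


def pct_above(score):
    """Probability that a randomly selected score is above `score`."""
    return 1 - iq.cdf(score)


def pct_between(low, high):
    """Probability that a randomly selected score lies between two points."""
    return iq.cdf(high) - iq.cdf(low)


for score in (85, 100, 115):
    print(f"below {score}: {pct_below(score):.1%}, above {score}: {pct_above(score):.1%}")
print(f"between 85 and 115: {pct_between(85, 115):.1%}")
```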

What you have just seen illustrated is the concept of probability density: the probability that a score or observation would be selected above, below or between two points on a distribution.

Back to our example again.

Here is the sample we randomly selected: x̄ = 70.

The sample mean is 30 units away from the population mean (100 - 70 = 30).

Is that far enough away to be called statistically significantly different from the population?

How far is too far away?

Fortunately, statisticians have come up with a couple of distances that are considered too far away to be a part of the population.

These distances are measured in z-scores, which is what the points marked on the distribution (±1, ±1.96, ±2.58) are.

Let’s say statisticians determined that if the sample mean you collected is below a z-score of -1.96 or above a z-score of +1.96, it is just too far away from the mean to be a part of the population.

We know from previous slides that only 2.5% of the scores are below a z-score of -1.96.

Since anything at this point or below is considered to be too rare to be a part of this population, and our sample mean of 70 lies in that region, we would conclude that the population and the sample are statistically significantly different from one another.

And that’s our answer!

What if the sample mean had been 105?

Our decision rule is to call the sample mean statistically significantly different from the population mean if the sample mean lies in the top or bottom 2.5% of all scores, and the sample mean (105) does not lie in these outer regions.

[Figure: normal curve with x̄ = 105 just above μ = 100; 2.5% marked in each tail beyond ±1.96 sd]

Therefore, we would say that this is not a rare event, and we would not conclude that the sample is significantly different from the population.

By the way, how do we figure out the z-score for an IQ score such as 70 or 105?

We use the following formula to compute z-scores across the normal distribution:

$$z = \frac{\bar{x} - \mu}{SD}$$

Here’s our sample mean: 70. Here’s our population mean: 100. Here’s our standard deviation: 15.

$$z = \frac{70 - 100}{15} = \frac{-30}{15} = -2.0$$

A z-score of -2.0 is located just below the -1.96 point, inside the bottom 2.5% of the distribution.
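The z-score formula and the ±1.96 decision rule can be sketched in a few lines. This is only an illustration under the example's assumptions (population mean 100, standard deviation 15); `z_score` and `is_rare` are names chosen here, not functions from any library.

```python
def z_score(value, pop_mean, pop_sd):
    """Distance of `value` from the population mean, in standard deviations."""
    return (value - pop_mean) / pop_sd


def is_rare(z, critical=1.96):
    """True when z falls in the top or bottom 2.5% of a normal distribution."""
    return abs(z) > critical


# The two sample means discussed in the example.
for sample_mean in (70, 105):
    z = z_score(sample_mean, pop_mean=100, pop_sd=15)
    verdict = "rare" if is_rare(z) else "not rare"
    print(f"sample mean {sample_mean}: z = {z:.2f} ({verdict})")
```

The sample mean of 70 lands beyond -1.96 and is flagged as rare, while 105 stays well inside the common region.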

In some instances we may not know the population standard deviation σ (in this case 15).

Without the standard deviation of the population we cannot determine the z-scores or the probability that a sample mean is too far away to be a part of the population.

Therefore the values on the distribution cannot be computed: neither the IQ scores at ±1, ±1.96 and ±2.58 standard deviations, nor the percentages (50%, 16%, 2.5%, etc.) that fall below them.

When we only know the population mean we use the Single Sample t-test.

Actually whenever we are dealing with a population and a sample, we generally use a single-sample t-test.

In the last example we relied on the population mean and standard deviation to determine if the sample mean was too far away from the population mean to be considered a part of the population.

The single sample t-test relies on a concept called the estimated standard error to compute something like a z-score, which expresses the distance between the population and the sample means.

We call it estimated because, as you will see, the actual standard error is not really feasible to compute.

Standard error draws on two concepts:

1. sampling distributions
2. t-distributions

Let’s begin with sampling distributions.

What you are about to see is purely theoretical, but it provides the justification for the formula we will use to run a single sample t-test.

$$t = \frac{\bar{x} - \mu}{SE_{mean}}$$

Here $\bar{x}$ is the mean of a sample, $\mu$ is the mean of a population, and $SE_{mean}$ is the estimated standard error.

In our example, $\bar{x}$ is 70 and $\mu$ is 100:

$$t = \frac{70 - 100}{SE_{mean}} = \frac{-30}{SE_{mean}}$$

The numerator here is easy to compute. Dividing it by the estimated standard error tells us the distance between 70 and 100 in t-values.

If the estimated standard error is large, like 30, then the t value would be:

-30/30 = -1

A -1.0 t-value is like a z-score of -1.0: it lies one unit below the mean.

[Figure: normal curve with μ = 100 and the points ±1 sd, ±1.96 sd and ±2.58 sd marked]

But since we don’t know the population standard deviation, we have to use the standard deviation of the sample (not the population as we did before) to determine the distance in standard error units (or t values).

The focus of our theoretical justification is to explain our rationale for using information from the sample to compute the standard error, also called the standard error of the mean:

$$t = \frac{\bar{x} - \mu}{SE_{mean}}$$

Here comes the theory behind standard error:

Imagine we took a sample of 20 IQ scores from the population. This sample of 20 would have its own IQ mean, standard deviation and distribution (say, x̄ = 70 and SD = 10). And then let’s say we took another sample of 20 with its own mean and distribution, and another, and another, and so on:

x̄ = 70, x̄ = 100, x̄ = 120, x̄ = 140, …

Let’s say, theoretically, that we do this one hundred times.

We now have 100 samples of 20-person IQ scores. We take the mean of each of those samples:

x̄ = 110, 102, 120, 90, 114, 100, 120, 90, 95, 100, 120, 90, 115, 100, etc.

And we create a new distribution called the sampling distribution of the means.

[Figure: sampling distribution of the means; horizontal axis of sample means from 70 to 125 in steps of 5]

And then we do something interesting. We take the standard deviation of this sampling distribution. If these sample means are close to one another then the standard deviation will be small. If these sample means are far apart from one another then the standard deviation will be large.

This standard deviation of the sampling distribution of the means has another name: the standard error. It is the denominator of the t formula, where we use an estimate of it:

$$t = \frac{\bar{x} - \mu}{SE_{mean}}$$
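The thought experiment of drawing many samples and taking the standard deviation of their means can be imitated with a small simulation. This is a sketch under stated assumptions, not something from the original slides: it pretends we know the population is normal with mean 100 and standard deviation 15, uses a fixed random seed, and draws 2000 samples rather than 100 so the result is more stable. The function name `simulate_standard_error` is hypothetical.

```python
import random
import statistics


def simulate_standard_error(n_samples, sample_size, pop_mean=100, pop_sd=15, seed=0):
    """Standard deviation of the means of many random samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sample_means = []
    for _ in range(n_samples):
        # Draw one sample of `sample_size` scores from the assumed population.
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(sample_size)]
        sample_means.append(statistics.mean(sample))
    # The spread of the sample means is the (simulated) standard error.
    return statistics.stdev(sample_means)


se = simulate_standard_error(n_samples=2000, sample_size=20)
print(f"simulated standard error for samples of 20: {se:.2f}")
print(f"population sd / sqrt(20) for comparison:    {15 / 20 ** 0.5:.2f}")
```

The simulated value should land close to the population standard deviation divided by the square root of the sample size, which is the relationship the estimation formula later relies on.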

Standard error is a unit of measurement that makes it possible to determine if a raw score difference is really significant or not.


Think of it this way. If you get a 92 on a 100 point test and the general population gets on average a 90, is there really a significant difference between you and the population at large? If you retook the test over and over again would you likely outperform or underperform their average of 90?



Standard error helps us understand the likelihood that those results would replicate the same way over and over again . . . or not.

So let’s say we calculate the standard error to be 0.2. Using the formula below we will determine how many standard error units you and the population are apart from each other.

$$t = \frac{\bar{x} - \mu}{SE_{mean}}$$

With your score (92), the population average (90) and the standard error (0.2):

$$t = \frac{92 - 90}{0.2} = \frac{2}{0.2} = 10.0$$

The numerator, 2, is the raw score difference. The result, 10.0, is the difference in standard error units, otherwise known as a t statistic or t value.

So, while your test score is two raw score points above the average for the population, you are 10.0 standard error units higher than the population score.

While 2.0 raw score points do not seem like a lot, 10.0 standard error units constitute a big difference!

This means that this result is very likely to replicate and is unlikely to have happened by chance.

But what if the standard error were much bigger, say, 4.0?

$$t = \frac{\bar{x} - \mu}{SE_{mean}} = \frac{92 - 90}{4.0} = \frac{2}{4.0} = 0.5$$

Once again, your raw score difference is still 2.0 but you are only 0.5 standard error units apart. That distance is most likely too small to be statistically significantly different if replicated over a hundred times.

We will show you how to determine when the number of standard error units is significantly different or not.

One more example:

Let’s say the population average on the test is 80 and you still received a 92. But the standard error is 36.0. Let’s do the math again.

$$t = \frac{\bar{x} - \mu}{SE_{mean}} = \frac{92 - 80}{36.0} = \frac{12}{36.0} \approx 0.3$$

In this case, while your raw score is 12 points higher (a large amount), you are only 0.3 standard error units higher. The standard error is so large (36.0) that if you were to take the test 1000 times with no growth in between, it is most likely that your scores would vary greatly (92 on one day, 77 on another day, 87 on another day, and so on).
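The three test-score examples all use the same arithmetic, so they can be reproduced with one small function. This is a minimal sketch; `t_value` is a name chosen for illustration, and the inputs are the scores, population averages and standard errors quoted in the examples.

```python
def t_value(sample_mean, pop_mean, standard_error):
    """Number of standard error units between a sample mean and a population mean."""
    return (sample_mean - pop_mean) / standard_error


# (your score, population average, standard error) from the three examples.
examples = [(92, 90, 0.2), (92, 90, 4.0), (92, 80, 36.0)]

for score, average, se in examples:
    raw_difference = score - average
    t = t_value(score, average, se)
    print(f"raw difference {raw_difference:>2}, standard error {se:>4}: t = {t:.1f}")
```

The same raw difference can give a large or a small t value; what matters is how big the standard error is relative to that difference.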

In summary, the single sample t-test t value is the number of standard error units that separate the sample mean from the population mean:

$$t = \frac{\bar{x} - \mu}{SE_{mean}}$$

Let’s see this play out with our original example.

Let’s say our sample of 20 has an average IQ score of 70. We already know that the population mean is 100. Let’s say the standard error of the sampling distribution of the means is 5.

The t value would be:

$$t = \frac{70 - 100}{5} = \frac{-30}{5} = -6$$

So all of this raises the question: How do I know if a t value of -6 is rare or common?

• If it is rare we can reject the null hypothesis and say that there is a significant difference between the sample and the population.

• If it is common we fail to reject the null hypothesis and say there is not a significant difference between veggie eating IQ scores and the general population IQ scores (which in this unique case is what we want).

Here is what we do: We compare this value (-6) with the critical t value.

What is the critical t value?

The critical t value is a point in the distribution that arbitrarily separates the common from the rare occurrences.

[Figure: distribution with a common-occurrence region in the center and a rare-occurrence region in each tail, separated by red lines at the critical values]

This represents the rare/common possibilities used to determine if the sample mean is similar to the population mean.

If this were a normal distribution, the red lines would have a z critical value of + or - 1.96 (which is essentially a t critical value but for a normal distribution).

A t or z value of + or - 1.96 means that 95% of the scores are in the center of the distribution and 5% are in the tails: 2.5% to the left and 2.5% to the right of it.

So if the t value computed from this equation:

$$t = \frac{\bar{x} - \mu}{SE_{mean}}$$

… is between -1.96 and +1.96 we would say that that result is not rare and we would fail to reject the null hypothesis.

However, if the t value is smaller than -1.96 or larger than +1.96 we would say that that result is rare and we would reject the null hypothesis.

[Figure: distribution with the common-occurrence region between -1.96 and +1.96 and a rare-occurrence region in each tail]

As a review, let’s say the distribution is normal. When the distribution is normal and we want to locate the z critical for a two-tailed test at a level of significance of .05 (which means we are willing to wrongly reject a true null hypothesis about 5 times out of 100), the z critical would be ±1.96.

Because we generally do not have the resources to take 100 IQ samples of those who eat veggies, we have to estimate the t value from one sample.

So let’s imagine we selected a sample of 20 veggie eaters with an average IQ score of 70 (x̄ = 70).

Because we did not take hundreds of samples of 20 veggie eaters each, average each sample’s IQ scores, and form a sampling distribution from which we could compute the standard error and then the t value, we have to figure out another way to compute an estimate of the standard error. There is another way.

Since it is not practical to collect hundreds of samples of 20 from the population, compute each sample’s mean score, and calculate the standard error from those means, we must estimate it using the following equation:

$$SE_{mean} = \frac{S}{\sqrt{n}}$$

Here S is the standard deviation of the sample and √n is the square root of the sample size.

We won’t go into the derivation of this formula, but just know that it acts as a good substitute for taking hundreds of samples and computing the standard deviation of their means to get the actual standard error.

So remember this point: the equation above is an estimate of the standard error, not the actual standard error, because the actual standard error is not feasible to compute.

However, just know that when researchers have taken hundreds of samples, computed their means, and then taken the standard deviation of all of those means, the result comes out pretty close to this estimate.
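To see why the estimate is a reasonable stand-in, here is a small simulation sketch. It assumes, purely for illustration, a normal population with mean 100 and standard deviation 15 (the presentation does not state a population standard deviation), and all function names are our own.

```python
import math
import random
import statistics


def estimated_standard_error(sample):
    """Estimate from ONE sample: sample standard deviation / sqrt(sample size)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def simulated_standard_error(pop_mean, pop_sd, sample_size, num_samples, rng):
    """The 'hundreds of samples' route: standard deviation of many sample means."""
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed population for illustration only: mean 100, standard deviation 15.
one_sample = [rng.gauss(100, 15) for _ in range(20)]
estimate = estimated_standard_error(one_sample)
simulated = simulated_standard_error(100, 15, 20, 2000, rng)
theoretical = 15 / math.sqrt(20)

print(f"estimate from one sample:  {estimate:.2f}")
print(f"from 2000 simulated means: {simulated:.2f}")
print(f"population sd / sqrt(n):   {theoretical:.2f}")
```

The single-sample estimate will wobble from sample to sample, while the simulated value should sit close to the population standard deviation divided by the square root of the sample size.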

Now comes another critical point dealing with t-distributions. Because we are working with a sample that is generally small, we do not use the normal distribution to determine the critical value (±1.96); we use a t distribution instead.

Here is what the z distribution looks like for samples generally larger than 30:

Sample size 30+: 95% of the scores fall between -1.96 and +1.96

As the sample size decreases, the critical values increase, making it harder to get significance.

Notice how the t distribution gets shorter and wider when the sample size is smaller. Notice also how the t critical values increase.

Sample size 20: 95% of the scores fall between -2.09 and +2.09
Sample size 10: 95% of the scores fall between -2.26 and +2.26
Sample size 5: 95% of the scores fall between -2.78 and +2.78

To determine the critical t we need two values:
• the degrees of freedom (sample size minus one [20 - 1 = 19])
• the significance level (.05 overall, which is .025 in each tail for a two-tailed test)

Using these two values we can locate the critical t.
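Critical t values are normally read from a table or a statistics package. As a rough sketch of what such a lookup does, the code below integrates the t density numerically and searches for the cutoff, using only Python's standard library. The function names and the numerical method (Simpson's rule plus bisection) are our own choices for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: half the area lies below 0, then Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(df, alpha=0.05):
    """Two-tailed critical t: the positive value with alpha / 2 of the area above it."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection: halve the search interval each pass
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for sample_size in (20, 10, 5):
    df = sample_size - 1  # degrees of freedom = sample size minus one
    print(sample_size, round(t_critical(df), 2))
```

For sample sizes of 20, 10 and 5 this should reproduce the cutoffs quoted in the presentation (2.09, 2.26 and 2.78), and it shows the critical value growing as the sample shrinks.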

So, our t critical value that separates the common from the rare occurrences in this case is ±2.09.

Sample size 20: 95% of the scores fall between -2.09 and +2.09

What was our calculated t value again?


$$t = \frac{\bar{x} - \mu}{SE_{mean}} = \frac{70 - \mu}{SE_{mean}} = \frac{70 - 100}{SE_{mean}}$$

To calculate the estimated standard error of the mean distribution we use the following equation:

$$SE_{mean} = \frac{S}{\sqrt{n}}$$

The standard deviation for this sample is 26.82 and the sample size, of course, is 20.

Let’s plug in those values:

$$SE_{mean} = \frac{26.82}{\sqrt{20}} = \frac{26.82}{4.47} = 6.0$$

$$t = \frac{70 - 100}{6.0}$$

Now do the math:

$$t = \frac{70 - 100}{6.0} = \frac{-30}{6.0} = -5$$
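The same arithmetic can be written as a short sketch in Python. The function name `single_sample_t` is our own; the inputs are the values from this example (sample mean 70, population mean 100, sample standard deviation 26.82, sample size 20).

```python
import math


def single_sample_t(sample_mean, population_mean, sample_sd, sample_size):
    """Return (estimated standard error, t value) for a single sample t-test."""
    standard_error = sample_sd / math.sqrt(sample_size)
    t_value = (sample_mean - population_mean) / standard_error
    return standard_error, t_value


se, t = single_sample_t(70, 100, 26.82, 20)
print(round(se, 1), round(t, 1))  # 6.0 -5.0
```

Without the intermediate rounding used on the slides, the standard error comes out just under 6.0 and the t value just below -5, so the rounded results agree.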

With a t value of -5 we are ready to compare it to the critical t: 2.093, or, since our t value falls below the mean, -2.093.

Since a t value of -5.0 is below the cutoff point (critical t) of -2.09, we would reject the null hypothesis.

95% of the scores fall between -2.09 and +2.09

Here is how we would state our results:

The randomly selected sample of twenty IQ scores with a sample mean of 70 is statistically significantly different from the population of IQ scores.

Now, what if the standard error had been much larger, say, 15.0 instead of 6.0?


70 – 1006.0

t =


70 – 1006.0

t =70 – 100

15.0t =


70 – 1006.0

t =30

15.0t =


70 – 1006.0

t = 2.0t =

A t value of -2.0 is not below the cutoff point (critical t) of -2.09 and would be considered a common rather than a rare outcome. We therefore would fail to reject the null hypothesis.

95% of the scores fall between -2.09 and +2.09, and -2.0 lies within that range
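The two outcomes in this example follow one decision rule, sketched below in Python. The function name `decide` and the constant name are our own, and the critical value 2.093 (19 degrees of freedom, two-tailed, .05) is taken from the presentation rather than computed here.

```python
def decide(t_value, t_critical):
    """Two-tailed decision: reject when t falls beyond either cutoff."""
    if abs(t_value) > t_critical:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


T_CRITICAL_DF19 = 2.093  # from the presentation: sample size 20, two-tailed, .05

for standard_error in (6.0, 15.0):
    t_value = (70 - 100) / standard_error
    print(f"SE = {standard_error}: t = {t_value:.1f} -> {decide(t_value, T_CRITICAL_DF19)}")
```

Comparing the absolute value of t to the positive cutoff is equivalent to checking whether t is below -2.093 or above +2.093.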

So in summary, the single sample t-test helps us determine the probability that the difference between a sample and a population did or did not occur by chance.


It utilizes the concepts of standard error, common and rare occurrences, and t-distributions to justify its use.

End of Presentation
