statistics 11 hypothesis testing discover the relationships that exist between events/things...


Upload: teresa-craig

Post on 15-Jan-2016


Page 1: Statistics 11 Hypothesis Testing Discover the relationships that exist between events/things Accomplished by: Asking questions Getting answers In accord

Statistics 1 1

Hypothesis Testing

Discover the relationships that exist between events/things

Accomplished by:

Asking questions

Getting answers

In accord with certain rules ... the scientific method.

Question is a hypothesis

Answer is obtained by testing the hypothesis

Which gives the general model………



With some IMPORTANT restrictions about – How the hypothesis is formed.

How the hypothesis is tested……..

Forming hypotheses is an "everyday-everybody" activity

"I will do better on examinations if I relax the night before"

Is a "hypothesis" ... a statement of a relationship

OK, BUT NOT a scientific hypothesis


A scientific hypothesis must meet certain criteria

A scientific hypothesis must be:

Specific

Empirically testable

Strictly related to some experimental procedure


Moreover, a scientific hypothesis actually consists of two separate mutually exclusive hypotheses

A null hypothesis

An alternative hypothesis


Null Hypothesis

A statement reflecting the possibility that there are no differences between the objects and/or events that are being observed

In formal terms:

Ho: µ1 = µ2

Where: µ1 and µ2 are the mean or average of several observations


Alternative Hypothesis

A statement reflecting the possibility that there are differences between the objects and/or events that are being observed

In formal terms:

H1: µ1 ≠ µ2 or H1: µ1 < µ2 or H1: µ1 > µ2


Testing between the null and alternative hypotheses

Accomplished through collection of data

Data must be scientifically acceptable, i.e. – Observable – Public – Replicable

The test concentrates on the null hypothesis which you either – Reject – Fail to reject


If there are no differences between your observations you – Fail to reject the null hypothesis and – Disregard the alternative hypothesis

If there are differences between your observations you – Reject the null hypothesis and – Accept the alternative hypothesis


Some things to note about hypothesis testing

Failing to reject the null hypothesis – Does not mean that the null hypothesis is TRUE – The null hypothesis can never be proven – You can only fail to reject it

Rejecting the null hypothesis – Means you accept the alternative hypothesis – It does not establish the validity of a relationship

Validity is a function of experimental design


When testing a hypothesis

Two possible outcomes re: null hypothesis

Two possible states of real world

Thus four possible decisions – Two are incorrect ... i.e. errors


                      The Real World

Your Decision         Ho: true                      Ho: false

Reject Ho             Type I error                  Correct (power)
                      alpha (significance level)    1 - beta

Do not reject Ho      Correct                       Type II error
                      1 - alpha                     beta


Some things to file for future reference

Type I error – You can directly "set" this – It is the chance (the probability of making the error) you are willing to accept when you test your hypothesis.

Type II error – You cannot directly "set" this – You can attempt to control it through good experimental design.
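The idea that the Type I error rate is something you "set" can be checked with a small simulation. This is only a sketch: it assumes a two-tailed z-test, a normal population with μ = 50 and σ = 8, and a hypothetical sample size of n = 10. Because every sample is drawn with Ho true, each rejection is a Type I error, and the fraction of rejections should land close to the chosen alpha.

```python
import random
from statistics import NormalDist


def type_i_error_rate(alpha=0.05, n=10, mu=50.0, sigma=8.0, trials=20000, seed=1):
    """Fraction of samples, drawn with Ho true, that a two-tailed z-test rejects."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    standard_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu) / standard_error
        if abs(z) > z_crit:                        # falls in the critical region
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(round(rate, 3))  # should be close to the alpha that was set (0.05)
```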


Alpha Level or the level of significance is a probability value that is used to define the very unlikely sample outcomes if the null hypothesis is true.

Critical region is composed of extreme sample values that are very unlikely to be obtained if the null hypothesis is true. The boundaries for the critical region are determined by the alpha level. If sample data fall in the critical region, the null hypothesis is rejected.
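As a rough illustration of how the alpha level fixes the boundaries of the critical region, the sketch below assumes the test statistic is a z-score and uses Python's standard-library `NormalDist`. The function names are made up for this example.

```python
from statistics import NormalDist


def critical_values(alpha=0.05):
    """Boundaries of the critical region on the z scale for a given alpha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return {
        "two_tailed": std_normal.inv_cdf(1 - alpha / 2),  # alpha split over both tails
        "one_tailed": std_normal.inv_cdf(1 - alpha),      # all of alpha in the upper tail
    }


def in_critical_region(z, alpha=0.05, two_tailed=True):
    """True if the observed z falls in the critical region (reject Ho)."""
    cv = critical_values(alpha)
    if two_tailed:
        return abs(z) > cv["two_tailed"]
    return z > cv["one_tailed"]


cv = critical_values(0.05)
print(round(cv["two_tailed"], 2), round(cv["one_tailed"], 3))  # 1.96 1.645
```

A smaller alpha pushes the boundaries further out, so the critical region shrinks and more extreme data are needed to reject Ho.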


Estimating Population Parameters from Samples

Sample mean

Unlikely to be exactly equal to population mean

BUT

Not more likely to be greater

Not more likely to be less

So sample mean is an unbiased estimate of population mean


Sample standard deviation

Unlikely to be exactly equal to population standard deviation

BUT

More likely to be less

Is usually an underestimate of the population parameter

So sample standard deviation is a biased estimate of the population standard deviation


To understand why this is so you must understand the nature and concept of a sampling distribution

What it all means

When you take a sample:

The sample mean is an unbiased estimate of the population mean – there is no reason to suspect it is higher or lower than the population mean

The sample standard deviation is a biased estimate of the population standard deviation – it is more likely to be smaller than the population standard deviation (and the same holds for the variance)


And so you must correct any estimate of the population variance by increasing it (i.e. use "n-1" rather than "n" when calculating the estimate)
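A quick simulation can make the "n-1" correction concrete. The sketch below assumes a normal population with σ = 8 (so the true variance is 64) and a hypothetical sample size of 5. It averages the two variance estimates over many samples: dividing by n runs low on average, while dividing by n-1 comes out close to the true value.

```python
import random


def average_variance_estimates(n=5, sigma=8.0, trials=20000, seed=1):
    """Average the 'divide by n' and 'divide by n-1' variance estimates."""
    rng = random.Random(seed)
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        m = sum(xs) / n
        ss = sum((x - m) ** 2 for x in xs)   # sum of squared deviations
        total_div_n += ss / n                # uncorrected estimate
        total_div_n_minus_1 += ss / (n - 1)  # corrected estimate
    return total_div_n / trials, total_div_n_minus_1 / trials


biased, corrected = average_variance_estimates()
print(round(biased, 1), round(corrected, 1))  # true variance is 8 ** 2 = 64
```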


Parametric Tests

Tests that do make assumptions and test hypotheses about population parameters.

• z & t

• ANOVA

• F test

Involves an assessment of whether your observed data are related to your independent variable

Or are simply what might be expected by chance from random sampling – i.e. no relation between

• Independent variable

• Dependent measure


Requires knowing or estimating population parameters

Mean: (μ)

Standard deviation: (σ)

Assumption of normality

For example, consider:

• Pat (individual score) = 64

• Population mean (μ) = 50

• Population standard deviation (σ) = 8


And if population is normal, then you know

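~68% (68.27%) of data points fall within ± 1 standard deviation of the mean

~95% (95.45%) of data points fall within ± 2 standard deviations of the mean

~99.7% (99.73%) of data points fall within ± 3 standard deviations of the mean

And remember: these are percentages, not absolute values

(Figure: areas under the normal curve)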
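These areas can be reproduced from the standard normal distribution. The sketch below uses Python's standard-library `NormalDist`; the helper name is made up for this example.

```python
from statistics import NormalDist


def proportion_within(k):
    """Proportion of a normal population within k standard deviations of the mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * proportion_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```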


Remember the z-distribution ?

It provides areas (proportions of scores) under a normal distribution according to z-score

And if we convert Pat's raw score to a z-score

Z = (X – μ) ∕ σ = (64 – 50) ∕ 8 = 1.75


And then look up Pat's z-score in the Z-table

Meaning that ~96% (1.00 - .0401 = .9599) of scores in distribution are below Pat. (page 699, G&W).

OR PUT OTHERWISE – If we were to randomly select a score from Pat's distribution, the probability that the score would be greater than Pat's would be only about 4 in 100
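Pat's comparison can be worked through in a few lines. This sketch takes the slide's values (score 64, μ = 50, σ = 8) and uses the standard-library `NormalDist` in place of the printed Z-table. The function name is illustrative only.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Distance of a raw score from the population mean, in standard deviations."""
    return (x - mu) / sigma


pat_z = z_score(64, 50, 8)              # (64 - 50) / 8 = 1.75
p_below = NormalDist().cdf(pat_z)       # proportion of scores below Pat
p_above = 1 - p_below                   # proportion of scores above Pat

print(pat_z, round(p_below, 4), round(p_above, 4))  # 1.75 0.9599 0.0401
```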


And we have done what we set out to do

Accomplished a statistical test

… i.e. comparing – Observed data and – What would be expected by chance.

And thus,

Pat's score is significant at p < .05


P-value

The P-value is the probability, when Ho is true, of a test statistic value at least as contradictory to Ho as the value actually observed. The smaller the P-value, the more strongly the data contradict Ho. The P-value is denoted by P.

The P-value summarizes the evidence in the data about the null hypothesis. A moderate to large P-value means that the data are consistent with Ho.


E.g. a P-value of .26 or .83 indicates that the observed data would not be unusual if Ho were true. However, a P-value such as .001 means that such data would be very unlikely if Ho were true.

The P-value is the primary reported result of a significance test.

If the P-value is sufficiently small, one rejects Ho and accepts H1.


Standard Error of the Mean

When comparing an individual to a population, we needed to know two things about the population

Mean: μ

Standard deviation: σ

• The procedure is only slightly different when comparing a sample to a population


Since you are not concerned with a single individual but with a sample of individuals

The "population" of interest is not – A population of individuals but rather – A population of samples, i.e. a SAMPLING DISTRIBUTION


And the measure of "variability" is not – The standard deviation of a population but rather – The standard deviation of a sampling distribution, i.e. the STANDARD ERROR OF THE MEAN


The calculation details

The standard error of the mean

σm = σ ∕ √n

where: n = sample size

The z comparison

Z = (M – μ) ∕ σm

Which is not really different than what we did when comparing an individual to a population


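Suppose: Herd of 10 cows (n=10)

Mean milk production is 1.8 gal/cow

Question: How unique is this herd?

μ = 1.5, σ = .55, n = 10, M = 1.8

σm = .55 ∕ √10 ≈ .174

Z = (1.8 – 1.5) ∕ .174 ≈ 1.72

Area beyond Z = 1.72 in the z-table ≈ .043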


Thus the herd is pretty unique, since the likelihood that any random sample of 10 cows from the population would produce more milk is less than 5 times in 100

Or in "statistics" – The probability of selecting a herd of greater milk producers is p < 0.05


Problem: compare a sample to a population

Method:

1. Use population parameters to calculate the standard error of the mean of a sampling distribution.

σm = σ ∕ √n


2. Use the standard error of the mean to compare sample mean with population mean by calculating a z-score

Z = (M – μ) ∕ σm


3. Use z-table to determine the probability that a random sample would yield a mean greater than the mean of the sample
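The three steps can be sketched as one small function. This is a minimal sketch, assuming a normal population with known μ and σ, and using the standard-library `NormalDist` in place of the printed z-table. The function name is made up for this example, and the usage line plugs in the herd values from the slides (μ = 1.5, σ = .55, n = 10, M = 1.8).

```python
from math import sqrt
from statistics import NormalDist


def z_test_upper_tail(sample_mean, mu, sigma, n):
    """Compare a sample mean with a population mean (sigma known)."""
    standard_error = sigma / sqrt(n)             # step 1: sigma_m = sigma / sqrt(n)
    z = (sample_mean - mu) / standard_error      # step 2: Z = (M - mu) / sigma_m
    p_greater = 1 - NormalDist().cdf(z)          # step 3: area above Z
    return standard_error, z, p_greater


se, z, p = z_test_upper_tail(sample_mean=1.8, mu=1.5, sigma=0.55, n=10)
print(round(se, 3), round(z, 2), round(p, 3))
```

For the herd this gives a standard error near .174, a Z near 1.72, and an upper-tail probability a little above .04, which is below .05.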


A word on the logic and requirements of the statistic

• The "uniqueness" of your sample is the probability that another random sample of the same size would have a mean at least as extreme as your sample's.

• Or put otherwise: is your sample mean what would be expected by chance, from a random selection?

• The more unique your sample, the more likely it reflects a relationship between:

    • Your independent variable

    • Your dependent measure


Two requirements

The population is normally distributed

You know the population

• Mean

• Standard deviation


The t statistic – An alternative to z

You MUST know the population mean

But you can estimate the population standard deviation from sample data

A sample standard deviation is given by (as you know)

s = √( Σ(X – M)² ∕ (n – 1) )


And so an estimated standard error of the mean is:

sm = s ∕ √n

And to use the estimated standard error of the mean to compare your sample to the population, you must make one adjustment – The adjustment is necessary to account for the fact that you are estimating


Comparisons When Estimating Population Parameters

• The adjustment part

• Estimating the standard deviation requires a different sampling distribution

• Sampling distribution is the t-distribution


The t-distribution

• Bell-shaped and symmetric, like the normal distribution

• Flatter and more spread out than the z-distribution, especially with few degrees of freedom

• Tails more elevated


And comparison becomes

t = (M – μ) ∕ sm

Thus:

Because you use an estimate of the population standard deviation to estimate the standard error of the mean

You must use the t-distribution to get the probability of randomly selecting a sample with a mean at least as extreme as the mean of your sample

But it is not conceptually different -- just an adjustment


For Example…

Suppose: Milk production of your herd of 10 cows is: 1.8, 1.7, 2.4, 2.3, 1.1, 1.7, 1.5, 2.4, 1.9, 1.2 gal (Mean = 1.8 gal/cow)

Question: How unique is your herd?

μ = 1.5 gal/day (given)

σ = unknown

σm = unknown, because we do not know the population standard deviation
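The t-value for this herd can be computed directly from the ten observations. The sketch below uses the standard-library `statistics.stdev`, which divides by n-1, to estimate the standard deviation, and then applies sm = s ∕ √n and t = (M – μ) ∕ sm. The function name is made up for this example.

```python
from math import sqrt
from statistics import mean, stdev

milk = [1.8, 1.7, 2.4, 2.3, 1.1, 1.7, 1.5, 2.4, 1.9, 1.2]  # gallons, from the slide
mu = 1.5                                                    # population mean (given)


def one_sample_t(data, mu):
    """Return sample mean, s, estimated standard error, t, and degrees of freedom."""
    n = len(data)
    m = mean(data)
    s = stdev(data)           # sample standard deviation (n - 1 in the denominator)
    sm = s / sqrt(n)          # estimated standard error of the mean
    t = (m - mu) / sm
    return m, s, sm, t, n - 1


m, s, sm, t, df = one_sample_t(milk, mu)
print(round(m, 2), round(s, 3), round(sm, 3), round(t, 2), df)
```

With these data s is about .464, the estimated standard error is about .147, and t comes out at about 2.04 with 9 degrees of freedom.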


QUESTION now is what does that "t-value" mean

• i.e. What is the probability of a random sample of 10 cows being like your cows

• To find out, consult a t-table


The t-Distribution

• Z-table gives exact probabilities

• t-table gives ranges of probabilities


Enter the table with the degrees of freedom of your sample

Degrees of freedom

• Number of values in a calculation that are free to vary

That is: – The degrees of freedom for a mean of 10 values is 9 ... because – If the mean of 10 numbers is, for example, 5 – Nine of the numbers are "free" to be any value, but when these are established, the 10th number is determined if the mean is to be 5


Determine the probability of your t-observed from the tabled t-values that it falls between.

For example:

• With 10 data points there are 9 degrees of freedom (df=9)

• If the t-observed statistic is 2.04


The t-table gives a probability of that occurring by chance between 0.10 and 0.05 two-tailed (between 10 and 5 times in 100)

And for your cows

The observed t-value of 2.04, df=9 gives a tabled probability of p > 0.05
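Reading a t-table amounts to finding which pair of tabled critical values the observed t falls between. The sketch below hard-codes the df = 9 row of a standard two-tailed t-table (1.833, 2.262, 2.821 and 3.250 for p = .10, .05, .02 and .01). The function name and the table layout are assumptions made for this illustration.

```python
# Two-tailed critical t-values for df = 9, as printed in standard t-tables:
# (probability level, critical value), ordered from largest p to smallest.
T_TABLE_DF9 = [(0.10, 1.833), (0.05, 2.262), (0.02, 2.821), (0.01, 3.250)]


def p_range(t_observed, table=T_TABLE_DF9):
    """Return (lower, upper) bounds on the two-tailed probability of t_observed."""
    t_abs = abs(t_observed)
    upper = 1.0
    for p_level, critical in table:
        if t_abs < critical:      # did not reach this critical value
            return (p_level, upper)
        upper = p_level           # exceeded it, so p is below this level
    return (0.0, upper)           # beyond the last tabled value


print(p_range(2.04))  # (0.05, 0.1)
```

For t = 2.04 with df = 9 the bracket is .05 < p < .10, which agrees with the p > 0.05 reported for the cows.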


Which is traditionally not sufficient to reject the Null hypothesis – Any event that has a probability of occurring 5 times or more in 100 is considered by most psychologists an indication of a chance event

And thus your cows are "just average old cows"

Well maybe NOT……….


Directionality of Statistical Tests

Statistical tests have a property called "directionality"

• Nondirectional, called "two-tailed" tests

• Directional, called "one-tailed" tests


Directionality is determined BEFORE you run your experiment

Based upon prior knowledge or data

You predict the outcome of your observations, the effect of your independent variable – Your independent variable "will improve performance" – Your independent variable "will interfere with learning" – etc


Your ability to predict an outcome means that you are better able to determine whether an event is a chance occurrence

More likely to reject Null hypothesis

In statistical terms the region of the sampling distribution indicating that an event is something different than what would be expected by chance is larger


And cows again

If you had valid reasons to predict that your cows produced more milk

You would use a directional test, i.e. one-tailed test

And you would reject, at p < .05, the Null hypothesis that your herd is no different from what you would expect of another random sample of cows
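The difference directionality makes for the cows can be sketched with the two df = 9 critical values at alpha = .05 from a standard t-table: 1.833 one-tailed and 2.262 two-tailed. The names below are made up for this illustration, and the directional branch assumes the prediction was "more milk", i.e. the upper tail.

```python
# Critical t-values for df = 9 at alpha = .05, from a standard t-table.
T_CRIT_DF9 = {"one_tailed": 1.833, "two_tailed": 2.262}


def reject_null(t_observed, directional):
    """Decide on Ho at alpha = .05 for df = 9."""
    if directional:
        # One-tailed: all of alpha sits in the predicted (upper) tail.
        return t_observed > T_CRIT_DF9["one_tailed"]
    # Two-tailed: alpha is split between both tails.
    return abs(t_observed) > T_CRIT_DF9["two_tailed"]


print(reject_null(2.04, directional=False))  # False
print(reject_null(2.04, directional=True))   # True
```

The same t of 2.04 fails the two-tailed test but passes the one-tailed test, which is why the direction has to be chosen before the data are collected.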


• Z-test: a statistical test used to decide whether a sample mean does or does not come from a specified population, when the standard deviation of the population is known.

• When the standard deviation of the population is unknown, a t-test is performed instead.

• Hypothesis testing: the goal is to decide whether to reject the null hypothesis.


• Alpha level: traditionally set at .05; it determines the acceptance and rejection regions.

• Critical value: the value of the test statistic that marks the boundary of the rejection region(s).

• Non-directional (two-tailed): the rejection regions lie both above and below the hypothesized population mean.

• Directional (one-tailed): the rejection region lies on one side only, in a direction chosen prior to experimentation.


Compared a Sample to a population:

When population parameters are known…..

Sample to population: assume a normal population and known standard deviation

Z = (M – μ) ∕ σm


When population parameters are unknown….

Sample to population: assume a normal population and unknown standard deviation

t = (M – μ) ∕ sm