
Making decisions about distributions:

Introduction to the Null Hypothesis

47:269: Research Methods I

Dr. Leonard

April 14, 2010

Hypotheses

• Hypothesis: specific prediction about the outcome of a study
• Remember directional and non-directional hypotheses?

• Both kinds of hypotheses are called the alternative hypothesis (H1): a prediction that there will be some change or difference among scores
  • There is a relation between X and Y in the sample
  • Treatment and control groups are different
  • The sample scores are distinct from the population scores

• The other option is called the null hypothesis (H0): negation of the alternative hypothesis
  • There is NO relation between X and Y in the sample
  • Treatment and control groups are NOT different
  • The sample scores are NOT distinct from the population scores

Deciding which one is right

The alternative hypothesis (H1) and the null hypothesis (H0) can be envisioned as two separate probability distributions that represent two distinct sets of scores.

The alternative hypothesis (H1) represents the scores in which there is a significant change or difference in the sample of scores compared to the population.

The null hypothesis (H0) represents the scores in which there is NO significant change or difference in the sample of scores compared to the population.

Significance tests determine how probable the observed data would be if the null hypothesis (H0) were true (nothing happening with data).

Deciding which one is right

Before conducting a significance test, we can look for certain clues to decide which is correct…

Is there overlap between the H0 and H1 distributions?

If the two distributions overlap a lot, it is likely that there IS NOT a meaningful difference and we should maintain the null hypothesis (H0) (nothing happening)

Which means reject the alternative hypothesis (H1)

If the two distributions do not overlap much, it is likely that there IS a meaningful difference and we should reject the null hypothesis (H0) (something happening)

Which means maintain the alternative hypothesis (H1)

Which has more overlap?

How do we decide whether the difference is meaningful?

In order to judge the two distributions as significantly different, we should look at how close the two means are to each other.

If the means are close, we are likely to maintain H0 (again, the distributions overlap).

If the means are far apart, we are likely to reject H0 (again, the distributions don’t overlap).

We use a certain probability, p, or critical value, as a cut-off point to decide whether the overlap is significant or not.

p corresponds to an area underneath the distribution’s curve.

The consensus in psychology is: if p ≤ .05, reject H0 (it is reasonable to assume that if a score is beyond 95% of scores in a distribution, it is not part of that distribution).

To be very cautious, researchers sometimes use p ≤ .01.

An example…

A high school uses tracking to divide students into “Advanced Placement” and “College Prep” classes. They believe College Prep students represent a general student population while AP students represent a “gifted” population and will have higher PSAT scores.

Assume a directional alternative hypothesis (H1): AP students will score higher than general students on the PSAT.

What would the null hypothesis (H0) be? No real difference between AP and general students’ PSAT scores.

General school PSAT data: Mean = 500 and Std. Dev. = 200

Advanced Placement PSAT data: Mean = 650 and Std. Dev. = 100

Assume p ≤ .05
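The "how much do the curves overlap?" question can be put into numbers with a minimal sketch. It assumes both sets of PSAT scores are normally distributed, which the slides do not state, and the variable names are illustrative only.

```python
from statistics import NormalDist

# Means and standard deviations come from the slide.
# Treating both sets of scores as normal curves is an assumption.
general = NormalDist(mu=500, sigma=200)   # general school PSAT data
ap = NormalDist(mu=650, sigma=100)        # Advanced Placement PSAT data

# Shared area under the two curves: 0.0 = no overlap, 1.0 = identical curves.
overlap = general.overlap(ap)
print(f"Overlap between the two distributions: {overlap:.2f}")
```

The closer this value is to 1, the harder it is to tell the two groups apart; the closer to 0, the more distinct they are.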

Where to locate p, the critical value

Directional alternative hypothesis (H1): likelihood that a score will fall in one particular extreme of the distribution, or that change will be in the predicted direction (either positive or negative)

One-tailed: all of p at one extreme (in the predicted direction)

Because we predicted AP students’ PSAT scores would be higher, we drew p at the positive end of the distribution

Non-directional alternative hypothesis (H1): equal likelihood that a score could fall in either extreme of the distribution, or that change could be positive or negative

Two-tailed: divide p by 2 for both extremes of the distribution

p of .05 would mean .025 at each extreme

H1 is non-directional: p ≤ .05, two-tailed, 2.5% in each extreme

H1 is directional: p ≤ .05, one-tailed, 5% in one extreme
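As a rough illustration of where the cut-off lands, the sketch below finds the one-tailed and two-tailed cut-off scores for p = .05. It assumes H0 can be modeled as a normal curve with the general school figures (mean 500, standard deviation 200); that modeling choice and the variable names are assumptions, not part of the slides.

```python
from statistics import NormalDist

# Assumption: H0 is modeled as a normal curve with the general school figures.
null_dist = NormalDist(mu=500, sigma=200)
p = 0.05

# One-tailed (directional H1, higher scores predicted): all of p in the upper tail.
one_tailed_cutoff = null_dist.inv_cdf(1 - p)

# Two-tailed (non-directional H1): p / 2 in each tail.
lower_cutoff = null_dist.inv_cdf(p / 2)
upper_cutoff = null_dist.inv_cdf(1 - p / 2)

print(f"One-tailed cut-off: {one_tailed_cutoff:.0f}")
print(f"Two-tailed cut-offs: {lower_cutoff:.0f} and {upper_cutoff:.0f}")
```

The two-tailed upper cut-off sits farther out than the one-tailed one because only half of p is placed in each extreme.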


Errors in our decisions about H1 and H0

Type I error: H0 is rejected when it is actually true and should be maintained; OVERCONFIDENCE - WORSE

Area of Type I error: α

“Confidence” is the probability of retaining the null when the alternative is false (correct decision)

Area of confidence: 1 - α

Type II error: H0 is maintained when it is actually false and should be rejected; BEING TOO CAUTIOUS

Area of Type II error: β

“Power” is the probability of rejecting the null when the alternative is true (correct decision)

Area of power: 1 - β

[Figure not available. Labels: Power; Confidence; Type I error: making too much of results (worse error!); Type II error: overlooking significant results]
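The four areas can be computed directly for a pair of curves. The sketch below uses two hypothetical standardized normal distributions (means 0 and 2, standard deviation 1) and a one-tailed cut-off; these numbers are made up for illustration and do not come from the slides.

```python
from statistics import NormalDist

# Hypothetical standardized curves (not from the slides).
h0 = NormalDist(mu=0.0, sigma=1.0)   # distribution of scores if H0 is true
h1 = NormalDist(mu=2.0, sigma=1.0)   # distribution of scores if H1 is true
alpha = 0.05                         # area of Type I error

# One-tailed cut-off: reject H0 for scores above this point.
cutoff = h0.inv_cdf(1 - alpha)

confidence = h0.cdf(cutoff)          # 1 - alpha: H0 is true and is maintained
beta = h1.cdf(cutoff)                # Type II error: H1 is true but H0 is maintained
power = 1 - beta                     # H1 is true and H0 is rejected

print(f"alpha={alpha:.2f}  confidence={confidence:.2f}  beta={beta:.2f}  power={power:.2f}")
```

Moving the H1 curve farther from the H0 curve shrinks beta and raises power, which matches the earlier point about overlap.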

Understanding these errors

One can reduce the risk of committing a Type I or Type II error by using a larger sample size.

To minimize the chances of committing the worse error, Type I, researchers use smaller or more strict p values (i.e., select .01 instead of .05).

But one can never totally eliminate the risk of making either type of error.

It is always possible that concluding there is a significant difference or change (rejecting H0) is due to random error in the sample data.

This is a good reminder of how scientific knowledge remains fallible!
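The trade-off between sample size, the p cut-off, and the risk of a Type II error can be sketched numerically. The example assumes a one-tailed z-test on a sample mean with a known standard deviation; the function name, the effect size of 0.5 standard deviations, and the sample sizes are all hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def power_one_tailed(effect_size: float, n: int, p_cutoff: float) -> float:
    """Power of a one-tailed z-test on a sample mean (hypothetical setup).

    effect_size is the true mean difference in standard-deviation units.
    """
    z_critical = STANDARD_NORMAL.inv_cdf(1 - p_cutoff)
    return 1 - STANDARD_NORMAL.cdf(z_critical - effect_size * sqrt(n))

for n in (10, 40):
    for p_cutoff in (0.05, 0.01):
        power = power_one_tailed(effect_size=0.5, n=n, p_cutoff=p_cutoff)
        print(f"n={n:>2}  p cut-off={p_cutoff}  power={power:.2f}  Type II risk={1 - power:.2f}")
```

Under these assumptions, a larger sample raises power (lowers the Type II risk) at a given cut-off, while moving from .05 to .01 guards against Type I errors at the cost of lower power.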

Approaches to data analysis

Descriptive statistics

• Describe or summarize data; characterize sample
• Organize responses to show trends in data
• Options: frequency distributions, measures of central tendency, measures of variability

Inferential statistics

• Draw inferences about population from sample
• Capture impact of random error on responses
• Options: parametric, non-parametric

Within inferential statistics…

Parametric tests deal with parameters (statistics that describe the population) and try to infer whether characteristics of the sample match the population

Parametric tests are powerful but have many assumptions (e.g., normal distributions)

Non-parametric tests have fewer assumptions and do not depend on population parameters, though they are less powerful

Non-parametric: The Chi-Square Test

The Chi-square test is a statistical test used to examine differences in nominal level variables (categories).

Is there an interesting or meaningful pattern in the responses or is the distribution of responses simply due to chance?

H0: no interesting pattern, responses just different by chance

H1: meaningful difference or relationship in responses

The Chi-square test allows us to check for meaningful differences among data for which we can’t compute the mean or standard deviation (e.g., Republican or Democrat).

Can be univariate (one variable) or bivariate (two variables)

Univariate example: types of music preferred

Bivariate example: types of music preferred by region of country

Non-parametric: The Chi-Square Test

The outcome of the Chi-square test is a comparison of expected frequencies of responses (theoretically predicted) vs. observed frequencies of responses (actually obtained) for any given variable

Sometimes expected frequency is known (e.g., in gambling, we can predict how many times a 6 should be rolled in 100 die rolls)

More often in psychology, we do NOT know the expected frequency for a given variable so we define the expected frequency as the number of observations predicted for any category if H0 is true

We use the observed vs. expected frequencies to calculate a χ² value and compare it to a critical χ² value in a hypothetical distribution of χ² scores
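For reference, the usual (Pearson) form of the statistic is written out below. The slides do not show the formula, so the notation is an assumption: f_O and f_E are the observed and expected frequencies, i indexes the categories, and k is the number of categories.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(f_{O,i} - f_{E,i})^2}{f_{E,i}}
```

Each category contributes its squared gap between observed and expected frequency, scaled by the expected frequency, so larger departures from H0 produce a larger χ² value.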

Chi-Square Test Example: Music preferences of college students (N = 80)

Univariate, observed frequencies:

Country    R&B    Alternative    Pop
12         18     34             16

The H0 would state that there is no difference in types of music preferred by college students, so the expected frequency, fE, would be 20 for each category.

The χ² value for these scores (14) surpasses the critical χ² value at p = .05 (7.8), so we would reject H0 and conclude that there is a significant difference in the music preference of students surveyed.
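The univariate calculation can be reproduced in a few lines. This is a minimal sketch using the observed frequencies and the critical value quoted on the slide; the helper name `chi_square` is illustrative, not from the lecture.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Observed frequencies from the slide (N = 80).
observed = {"Country": 12, "R&B": 18, "Alternative": 34, "Pop": 16}

# Under H0 every category is equally preferred: 80 / 4 = 20 each.
total = sum(observed.values())
expected = [total / len(observed)] * len(observed)

chi2_value = chi_square(observed.values(), expected)
critical_value = 7.8  # critical value at p = .05 quoted on the slide

print(f"chi-square = {chi2_value:.1f}")
print("reject H0" if chi2_value > critical_value else "maintain H0")
```

The category contributions are 3.2, 0.2, 9.8, and 0.8, which sum to the 14 reported on the slide; most of the total comes from the Alternative category.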

Chi-Square Test Example: Music preferences of college students (N = 80)

The H0 would state that there is no difference in type of music preferred by college students by region of the country, so the expected frequency, fE, would be 6.7 for each category.

Bivariate, observed frequencies:

          Country    R&B    Alternative    Pop    Total
N.East    1          10     19             4      34
West      1          5      11             10     27
M.West    10         3      4              2      19
Total     12         18     34             16     80

The χ² value for these scores (32.9) surpasses the critical χ² value at p = .05 (12.6), so we would reject H0 and conclude that the observed differences are not due to chance. Rather, music preference and region of the country are meaningfully related.
