Making decisions about distributions:
Introduction to the Null Hypothesis
47:269: Research Methods I
Dr. Leonard
April 14, 2010
Hypotheses
• Hypothesis: specific prediction about the outcome of a study
• Remember directional and non-directional hypotheses?
• Both kinds of hypotheses are called the alternative hypothesis (H1): a prediction that there will be some change or difference among scores
  • There is a relation between X and Y in the sample
  • Treatment and control groups are different
  • The sample scores are distinct from the population scores
• The other option is called the null hypothesis (H0): negation of the alternative hypothesis
  • There is NO relation between X and Y in the sample
  • Treatment and control groups are NOT different
  • The sample scores are NOT distinct from the population scores
Deciding which one is right
The alternative hypothesis (H1) and the null hypothesis (H0) can be envisioned as two separate probability distributions that represent two distinct sets of scores
The alternative hypothesis (H1) represents the scores in which there is a significant change or difference in the sample of scores compared to the population
The null hypothesis (H0) represents the scores in which there is NO significant change or difference in the sample of scores compared to the population
Significance tests determine how probable the observed data would be if the null hypothesis (H0) were true (nothing happening with the data)
Deciding which one is right
Before conducting a significance test, we can look for certain clues to decide which is correct…
Is there overlap between the H0 and H1 distributions?
If the two distributions overlap a lot, it is likely that there IS NOT a meaningful difference and we should maintain the null hypothesis (H0) (nothing happening), which means rejecting the alternative hypothesis (H1)
If the two distributions do not overlap much, it is likely that there IS a meaningful difference and we should reject the null hypothesis (H0) (something happening), which means maintaining the alternative hypothesis (H1)
How do we decide whether the difference is meaningful?
In order to judge the two distributions as significantly different, we should look at how close the two means are to each other
If the means are close, we are likely to maintain H0 (again, the distributions overlap)
If the means are far apart, we are likely to reject H0 (again, the distributions don’t overlap)
We use a certain probability, p, or critical value, as a cut-off point to decide whether the overlap is significant or not
p corresponds to an area underneath the distribution's curve
The consensus in psychology is: if p ≤ .05, reject H0 (it is reasonable to assume that if a score is beyond 95% of scores in a distribution, it is not part of that distribution)
To be very cautious, researchers sometimes use p ≤ .01
An example…
A high school uses tracking to divide students into “Advanced Placement” and “College Prep” classes. They believe College Prep students represent a general student population while AP students represent a “gifted” population and will have higher PSAT scores.
Assume a directional alternative hypothesis (H1): AP students will score higher than general students on the PSAT
What would the null hypothesis (H0) be? No real difference between AP and general students’ PSAT scores
General school PSAT data: Mean = 500 and Std. Dev. = 200
Advanced Placement PSAT data: Mean = 650 and Std. Dev. = 100
Assume p ≤ .05
Where to locate p, the critical value
Directional alternative hypothesis (H1): likelihood that a score will fall in one particular extreme of the distribution, or that change will be in the predicted direction (positive or negative, not both)
One-tailed: all of p at one extreme (in the predicted direction)
Because we predicted AP students’ PSAT scores would be higher, we drew p at the positive end of the distribution
Non-directional alternative hypothesis (H1): equal likelihood that a score could fall in either extreme of the distribution, or that change could be positive or negative
Two-tailed: divide p by 2 for both extremes of the distribution
p of .05 would mean .025 at each extreme
Where to locate p, the critical value
H1 is non-directional: p ≤ .05, two-tailed, 2.5% in each extreme
H1 is directional: p ≤ .05, one-tailed, 5% in one extreme
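To make the one-tailed and two-tailed placement of p concrete, the sketch below computes the cut-off scores for the general school PSAT distribution (Mean = 500, Std. Dev. = 200). It assumes the scores are normally distributed, which the slides do not state, and the variable names are illustrative only.

```python
from statistics import NormalDist

# Assumption: the null (general student) distribution is treated as normal,
# using the mean and standard deviation from the PSAT example.
null_dist = NormalDist(mu=500, sigma=200)
alpha = 0.05

# One-tailed (directional H1, higher scores predicted):
# all of alpha sits in the upper tail.
one_tailed_cutoff = null_dist.inv_cdf(1 - alpha)

# Two-tailed (non-directional H1): alpha / 2 sits in each tail.
lower_cutoff = null_dist.inv_cdf(alpha / 2)
upper_cutoff = null_dist.inv_cdf(1 - alpha / 2)

print(f"One-tailed cut-off: {one_tailed_cutoff:.1f}")
print(f"Two-tailed cut-offs: {lower_cutoff:.1f} and {upper_cutoff:.1f}")
```

Under these assumptions the one-tailed cut-off is roughly 829, while the two-tailed cut-offs are roughly 108 and 892. Splitting p across both tails pushes each cut-off further from the mean.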
Errors in our decisions about H1 and H0
Type I error: H0 is rejected when it is actually true and should be maintained; OVERCONFIDENCE - WORSE
Area of Type I error: α
“Confidence” is the probability of retaining the null when the alternative is false (correct decision)
Area of confidence: 1 - α
Type II error: H0 is maintained when it is actually false and should be rejected; BEING TOO CAUTIOUS
Area of Type II error: β
“Power” is the probability of rejecting the null when the alternative is true (correct decision)
Area of power: 1 - β
[Figure: decision outcomes labeled Power; Confidence; Making too much of results (worse error!); Overlooking significant results]
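One way to see that the area of Type I error really is α is to simulate many studies in which H0 is true by construction and count how often it is wrongly rejected. The sketch below is a rough illustration, not a procedure from the lecture. It assumes normally distributed scores, a one-tailed z-test on the sample mean, and a hypothetical sample size of 25. The null mean and standard deviation are borrowed from the PSAT example.

```python
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the sketch is repeatable

MU0, SIGMA = 500, 200   # null distribution, borrowed from the PSAT example
N = 25                  # hypothetical sample size (not from the slides)
ALPHA = 0.05            # one-tailed cut-off
TRIALS = 10_000

z_crit = NormalDist().inv_cdf(1 - ALPHA)
standard_error = SIGMA / N ** 0.5

false_rejections = 0
for _ in range(TRIALS):
    # Every sample is drawn from the null distribution, so H0 is true.
    sample = [random.gauss(MU0, SIGMA) for _ in range(N)]
    z = (sum(sample) / N - MU0) / standard_error
    if z > z_crit:
        false_rejections += 1  # rejecting a true H0 is a Type I error

type_i_rate = false_rejections / TRIALS
print(f"Estimated Type I error rate: {type_i_rate:.3f}")
print(f"Estimated confidence (1 - alpha): {1 - type_i_rate:.3f}")
```

The estimated Type I rate should land close to .05, which is the cut-off chosen in advance.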
Understanding these errors
One can reduce the risk of committing a Type I or Type II error by using a larger sample size
To minimize the chances of committing the worse error, Type I, researchers use smaller or more strict p values (i.e., select .01 instead of .05)
But one can never totally eliminate the risk of making either type of error
It is always possible that concluding there is a significant difference or change (rejecting H0) is due to random error in the sample data
This is a good reminder of how scientific knowledge remains fallible!
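The trade-off between sample size, the strictness of p, and Type II error can be sketched numerically. The function below is a hypothetical helper, not part of the lecture. It assumes a one-tailed z-test on a sample mean, normally distributed scores, and a true mean of 650 with the population standard deviation of 200. The sample sizes are made up for illustration.

```python
from statistics import NormalDist

def one_tailed_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power (1 - beta) of an upper-tailed z-test on a sample mean."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # How many standard errors the true mean lies above the null mean.
    shift = (mu1 - mu0) / (sigma / n ** 0.5)
    return 1 - std_normal.cdf(z_crit - shift)

for n in (4, 25, 100):
    for alpha in (0.05, 0.01):
        power = one_tailed_power(500, 650, 200, n, alpha)
        print(f"n={n:3d}  alpha={alpha:.2f}  power={power:.3f}  beta={1 - power:.3f}")
```

Under these assumptions, power rises (and β falls) as n grows. Moving from .05 to .01 lowers power at a given n, so guarding more strictly against Type I error makes Type II error more likely.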
Approaches to data analysis
Descriptive statistics:
Describe or summarize data; characterize sample
Organize responses to show trends in data
Options: frequency distributions, measures of central tendency, measures of variability
Inferential statistics:
Draw inferences about population from sample
Capture impact of random error on responses
Options: parametric, non-parametric
Within inferential statistics…
Parametric tests deal with parameters (values that describe the population) and try to infer whether characteristics of the sample match the population
Parametric tests are powerful but have many assumptions (e.g., normal distributions)
Non-parametric tests have fewer assumptions and do not depend on population parameters, though they are less powerful
Non-parametric: The Chi-Square Test
The Chi-square test is a statistical test used to examine differences in nominal level variables (categories)
Is there an interesting or meaningful pattern in the responses, or is the distribution of responses simply due to chance?
H0: no interesting pattern, responses just different by chance
H1: meaningful difference or relationship in responses
The Chi-square test allows us to check for meaningful differences among data for which we can’t compute the mean or standard deviation (e.g., Republican or Democrat)
Can be univariate (one variable) or bivariate (two variables)
Univariate example: types of music preferred
Bivariate example: types of music preferred by region of country
Non-parametric: The Chi-Square Test
The outcome of the Chi-square test is a comparison of expected frequencies of responses (theoretically predicted) vs. observed frequencies of responses (actually obtained) for any given variable
Sometimes expected frequency is known (e.g., in gambling, we can predict how many times a 6 should be rolled in 100 die rolls)
More often in psychology, we do NOT know the expected frequency for a given variable so we define the expected frequency as the number of observations predicted for any category if H0 is true
We use the observed vs. expected frequencies to calculate a χ² value and compare it to a critical χ² value in a hypothetical distribution of χ² scores
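The slides do not spell out the calculation. The standard Pearson chi-square statistic, which is presumably what is meant here, sums the squared gap between observed and expected frequencies over all k categories, each scaled by the expected frequency. The symbol f_O for observed frequency is an assumption chosen to mirror the slides' f_E.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(f_{O,i} - f_{E,i})^2}{f_{E,i}}
```

The larger the gaps between observed and expected frequencies, the larger χ² becomes, and the more likely it is to exceed the critical value.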
Chi-Square Test Example: Music preferences of college students (N = 80)
Univariate, observed frequencies:
Country   R&B   Alternative   Pop
  12       18       34         16
The H0 would state that there is no difference in types of music preferred by college students, so the expected frequency, fE, would be 20 for each category
The χ² value for these scores (14) surpasses the critical χ² value at p = .05 (7.8), so we would reject H0 and conclude that there is a significant difference in the music preference of students surveyed.
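The univariate result can be checked by hand or with a few lines of code. The sketch below assumes the standard Pearson formula (squared difference between observed and expected frequency, divided by expected, summed over categories) and takes the critical value of 7.8 directly from the slide rather than computing it. Variable names are illustrative.

```python
# Observed frequencies from the univariate example (N = 80).
observed = {"Country": 12, "R&B": 18, "Alternative": 34, "Pop": 16}

n_total = sum(observed.values())
expected = n_total / len(observed)  # equal split under H0: 20 per category

# Pearson chi-square: sum of (observed - expected)^2 / expected.
chi_square = sum((f_o - expected) ** 2 / expected for f_o in observed.values())

# Degrees of freedom for one variable: number of categories minus 1.
degrees_of_freedom = len(observed) - 1

CRITICAL_VALUE = 7.8  # critical chi-square at p = .05, as quoted in the slide
reject_h0 = chi_square > CRITICAL_VALUE

print(f"chi-square = {chi_square:.1f}, df = {degrees_of_freedom}, reject H0: {reject_h0}")
```

The four terms are 3.2, 0.2, 9.8, and 0.8, which add up to the 14 reported in the example.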
Chi-Square Test Example: Music preferences of college students (N = 80)
Bivariate, observed frequencies:
          Country   R&B   Alternative   Pop   Total
N.East       1       10       19         4      34
West         1        5       11        10      27
M.West      10        3        4         2      19
Total       12       18       34        16      80
The H0 would state that there is no difference in type of music preferred by college students by region of the country, so the expected frequency, fE, would be 6.7 for each category
The χ² value for these scores (32.9) surpasses the critical χ² value at p = .05 (12.6), so we would reject H0 and conclude that the observed differences are not due to chance. Rather, music preference and region of the country are meaningfully related.
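As a bookkeeping check on the bivariate table, the sketch below recomputes the row and column totals, the equal-split expected frequency described in the example (N divided by the number of cells), and the degrees of freedom. The degrees-of-freedom rule for a two-way table, (rows - 1) × (columns - 1), is standard but not stated in the slides. It gives 6 here, which is consistent with the critical value of 12.6 at p = .05. The sketch does not recompute the χ² value itself.

```python
# Observed counts from the bivariate table (rows: regions, columns: music types).
columns = ["Country", "R&B", "Alternative", "Pop"]
observed = {
    "N.East": [1, 10, 19, 4],
    "West":   [1, 5, 11, 10],
    "M.West": [10, 3, 4, 2],
}

row_totals = {region: sum(counts) for region, counts in observed.items()}
column_totals = [sum(col) for col in zip(*observed.values())]
n_total = sum(row_totals.values())

# Equal split across all cells under H0, as the example defines fE.
n_cells = len(observed) * len(columns)
expected_per_cell = n_total / n_cells

# Degrees of freedom for a two-way table: (rows - 1) * (columns - 1).
degrees_of_freedom = (len(observed) - 1) * (len(columns) - 1)

print(f"Row totals: {row_totals}")
print(f"Column totals: {dict(zip(columns, column_totals))}")
print(f"N = {n_total}, fE per cell = {expected_per_cell:.1f}, df = {degrees_of_freedom}")
```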