phd assign
TRANSCRIPT
8/7/2019 phd assign.
http://slidepdf.com/reader/full/phd-assign 1/35
1
1. Compare and Contrast
a. The measures of Central Tendency
b. The measures of Variability
Central tendency is a statistical measure that identifies a single score as representative of an
entire distribution of scores. The goal of central tendency is to find the single score that is most
typical or most representative of the entire distribution. Unfortunately, there is no single,
standard procedure for determining central tendency. The problem is that there is no single
measure that will always produce a central, representative value in every situation. There are
three main measures of central tendency: the arithmetic mean, the median, and the mode.
The mean of a set of scores (abbreviated M) is the most common and useful measure of
central tendency. The mean is the sum of the scores divided by the total number of scores. The
mean is commonly known as the arithmetic average. The mean can only be used for variables
at the interval or ratio levels of measurement. The mean of [2 6 2 10] is (2 + 6 + 2 + 10)/4 = 20/4
= 5. One can think of the mean as the balance point of a distribution (the center of gravity). It
balances the distances of observations to the mean. Another measure of central tendency is
the median, which is defined as the middle value when the numbers are arranged in increasing
or decreasing order. The median is the score that divides the distribution of scores exactly in
half. The median is also the 50th percentile. The median can be used for variables at the
ordinal, interval or ratio levels of measurement. If, for example, daily expenses are $50, $100,
$150, $350, $350, the middle value is $150, and therefore $150 is the median. For an odd
number of items, the median is the middle value. If there is an even number of items in a set, the median is
the average of the two middle values. For example, if we had four values, $50, $100, $150,
$350, the median would be the average of the two middle values, $100 and $150; thus, $125 is
the median in that case. The median may sometimes be a better indicator of central tendency
than the mean, especially when there are extreme values. Another indicator of central
tendency is the mode, or the value that occurs most often in a set of numbers. In other words,
the mode is the score or category of scores in a frequency distribution that has the greatest
frequency. In the set of expenses mentioned above, the mode would be $350 because it
appears twice and the other values appear only once. The mode can be used for variables at
any level of measurement (nominal, ordinal, interval or ratio). Sometimes a distribution has
more than one mode. Such a distribution is called multimodal. A distribution with two modes is
called bimodal. Note that the modes do not have to have the same frequencies. The tallest
peak is called the major mode; other peaks are called minor modes. Some distributions do not
have modes. A rectangular distribution has no mode. Some distributions have many peaks and
valleys.
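The three measures can be checked quickly in code. Below is a minimal sketch (not part of the original assignment) that applies Python's standard statistics module to the expenses example discussed above:

```python
# Mean, median, and mode for the daily-expenses example,
# using only the Python standard library.
import statistics

expenses = [50, 100, 150, 350, 350]

mean = statistics.mean(expenses)      # (50+100+150+350+350)/5 = 200
median = statistics.median(expenses)  # middle value of the sorted list = 150
mode = statistics.mode(expenses)      # most frequent value = 350

print(mean, median, mode)

# With an even number of items, the median is the mean of the two middle values:
print(statistics.median([50, 100, 150, 350]))  # (100+150)/2 = 125.0
```

Note how the two $350 entries pull the mean up to $200, while the median stays at the more typical $150.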
Variability provides a quantitative measure of the degree to which scores in a
distribution are spread out. The greater the difference between scores, the more spread out
the distribution is. The more tightly the scores group together, the less variability there is in the
distribution. Variability is the essence of statistics. The most frequently used measures of
variability are the range, the interquartile range, the variance and the
standard deviation. The range is simply the difference between the highest score and the
lowest score in a distribution. This statistic can be calculated for measurements that
are on an interval scale or above. In a dataset of 10 numbers {99, 45, 23, 67, 45, 91, 82, 78, 62, 51},
the highest number is 99 and the lowest number is 23, so 99 − 23 = 76; the range is 76. The
interquartile range (IQR) is a range that contains the middle 50% of the scores in a distribution.
It is computed as follows: IQR = 75th percentile − 25th percentile. A related measure of variability
is called the semi-interquartile range. The semi-interquartile range is defined simply as the
interquartile range divided by 2. Variance can be defined as a measure of how close the scores
in the distribution are to the middle of the distribution. Using the mean as the measure of the
middle of the distribution, the variance is defined as the average squared difference of the
scores from the mean. When the scores are spread out or heterogeneous, the measure of
variability should be large. When the scores are homogeneous the variability should be smaller.
Another measure of variability is the standard deviation. The standard deviation is simply the
square root of the variance. The standard deviation is an especially useful measure of variability
when the distribution is normal or approximately normal (see Probability) because the
proportion of the distribution within a given number of standard deviations from the mean can
be calculated. Roughly speaking, the standard deviation is the typical distance of a score from the mean. So the
mean is the representative value, and the standard deviation is the representative distance of
any one point in the distribution from the mean.
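These measures of variability can also be sketched in a few lines of Python (an illustration, not part of the original text), using the 10-number dataset from the range example:

```python
# Range, interquartile range, variance, and standard deviation
# for the dataset used in the range example, standard library only.
import statistics

data = [99, 45, 23, 67, 45, 91, 82, 78, 62, 51]

value_range = max(data) - min(data)   # 99 - 23 = 76

# statistics.quantiles with n=4 returns the three quartile cut points
# (exclusive method by default); IQR = Q3 - Q1.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
semi_iqr = iqr / 2                    # the semi-interquartile range

variance = statistics.pvariance(data) # population variance
std_dev = statistics.pstdev(data)     # square root of the variance

print(value_range, iqr, variance, round(std_dev, 2))
```

Note that different quartile conventions (inclusive vs. exclusive) give slightly different IQR values; the choice matters mostly for small datasets.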
While the measures of central tendency convey information about the commonalities of
measured properties, the measures of variability quantify the degree to which they differ. If not
all values of the data are the same, they differ and variability exists. For this reason, the
measures of central tendency should be complemented by measures of variability.
2. When are the 3 measures of Central Tendency used?
Mean
The mean is the most commonly used measure of central tendency. When we talk about an
"average", we usually are referring to the mean. The mean is simply the sum of the values
divided by the total number of items in the set. The result is referred to as the arithmetic mean.
Sometimes it is useful to give more weighting to certain data points, in which case the result is
called the weighted arithmetic mean.
The notation used to express the mean depends on whether we are talking about the
population mean or the sample mean:
μ = population mean
x̄ = sample mean
The population mean then is defined as:
μ = Σxi / N
where
N = number of data points in the population
xi = value of each data point i.
The mean is valid only for interval data or ratio data. Since it uses the values of all of the data
points in the population or sample, the mean is influenced by outliers that may be at the
extremes of the data set.
Median
The median is determined by sorting the data set from lowest to highest values and taking the
data point in the middle of the sequence. There is an equal number of points above and below
the median. For example, in the data set {1,2,3,4,5} the median is 3; there are two data points
greater than this value and two data points less than this value. In this case, the median is equal
to the mean. But consider the data set {1,2,3,4,10}. In this dataset, the median still is three, but
the mean is equal to 4. If there is an even number of data points in the set, then there is no
single point at the middle and the median is calculated by taking the mean of the two middle
points.
The median can be determined for ordinal data as well as interval and ratio data. Unlike the
mean, the median is not influenced by outliers at the extremes of the data set. For this reason,
the median often is used when there are a few extreme values that could greatly influence the
mean and distort what might be considered typical. This often is the case with home prices and
with income data for a group of people, which often is very skewed. For such data, the medianoften is reported instead of the mean. For example, in a group of people, if the salary of one
person is 10 times the mean, the mean salary of the group will be higher because of the
unusually large salary. In this case, the median may better represent the typical salary level of
the group.
Mode
The mode is the most frequently occurring value in the data set. For example, in the data set
{1,2,3,4,4}, the mode is equal to 4. A data set can have more than a single mode, in which case
it is multimodal. In the data set {1,1,2,3,3} there are two modes: 1 and 3.
The mode can be very useful for dealing with categorical data. For example, if a sandwich shop
sells 10 different types of sandwiches, the mode would represent the most popular sandwich.
The mode also can be used with ordinal, interval, and ratio data. However, in interval and ratio
scales, the data may be spread thinly with no data points having the same value. In such cases,
the mode may not exist or may not be very meaningful.
When to use Mean, Median, and Mode
The following table summarizes the appropriate methods of determining the middle or typical
value of a data set based on the measurement scale of the data.
Measurement Scale       Best Measure of the "Middle"
Nominal (Categorical)   Mode
Ordinal                 Median
Interval                Symmetrical data: Mean; Skewed data: Median
Ratio                   Symmetrical data: Mean; Skewed data: Median
3. What is Skewness? Explain each type and give examples.
Skewness
The first thing you usually notice about a distribution's shape is whether it has one mode (peak)
or more than one. If it's unimodal (has just one peak), like most data sets, the next thing you
notice is whether it's symmetric or skewed to one side. If the bulk of the data is at the left and
the right tail is longer, we say that the distribution is skewed right or positively skewed; if the
peak is toward the right and the left tail is longer, we say that the distribution is skewed left or
negatively skewed.
Look at the two graphs below. They both have μ = 0.6923 and σ = 0.1685, but their shapes are
different.
skewness = −0.5370        skewness = +0.5370
The first one is moderately skewed left: the left tail is longer and most of the distribution is at
the right. By contrast, the second distribution is moderately skewed right: its right tail is longer
and most of the distribution is at the left.
You can get a general impression of skewness by drawing a histogram, but there are also some
common numerical measures of skewness. Some authors favor one, some favor another.
You may remember that the mean and standard deviation have the same units as the original
data, and the variance has the square of those units. However, the skewness has no units: it's a
pure number, like a z-score.
Computing
The moment coefficient of skewness of a data set is

(1)  skewness: g1 = m3 / m2^(3/2)

where m3 = Σ(x − x̄)³ / n and m2 = Σ(x − x̄)² / n.

x̄ is the mean and n is the sample size, as usual. m3 is called the third moment of the data set.
m2 is the variance, the square of the standard deviation.
You'll remember that you have to choose one of two different measures of standard deviation,
depending on whether you have data for the whole population or just a sample. The same is
true of skewness. If you have the whole population, then g1 above is the measure of skewness.
But if you have just a sample, you need the sample skewness:

(2)  sample skewness: G1 = [√(n(n−1)) / (n−2)] × g1
Source: D. N. Joanes and C. A. Gill, "Comparing Measures of Sample Skewness and Kurtosis",
The Statistician 47(1):183–189.
Excel doesn't concern itself with whether you have a sample or a population: its measure of
skewness is always G1.
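To make equations (1) and (2) concrete, here is a small Python sketch (my own illustration; the function names are not from the original text):

```python
# Moment coefficient of skewness: population (g1) and sample (G1) versions,
# following equations (1) and (2). Standard library only.
import math

def skewness_g1(xs):
    """Population skewness: g1 = m3 / m2**(3/2)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second moment (variance)
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third moment
    return m3 / m2 ** 1.5

def skewness_G1(xs):
    """Sample skewness: G1 = sqrt(n*(n-1)) / (n-2) * g1."""
    n = len(xs)
    return math.sqrt(n * (n - 1)) / (n - 2) * skewness_g1(xs)

# A small right-skewed example: one large value stretches the right tail.
data = [1, 2, 2, 3, 3, 4, 10]
print(skewness_g1(data) > 0, skewness_G1(data) > skewness_g1(data))  # True True
```

For perfectly symmetric data the third moment cancels and both measures are zero.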
Example 1: College Men's Heights
Here are grouped data for heights of 100 randomly selected
male students, adapted from Spiegel & Stephens, Theory and
Problems of Statistics 3/e (McGraw-Hill, 1999), page 68.
A histogram shows that the data are skewed left, not symmetric. But how highly skewed are
they, compared to other data sets? To answer this question, you have to compute the skewness.
Begin with the sample size and sample mean. (The sample size was given, but it never hurts to
check.)
n = 5+18+42+27+8 = 100
x̄ = (61×5 + 64×18 + 67×42 + 70×27 + 73×8) ÷ 100
x̄ = (305 + 1152 + 2814 + 1890 + 584) ÷ 100
Height (inches)   Class Mark, x   Frequency, f
59.5–62.5         61              5
62.5–65.5         64              18
65.5–68.5         67              42
68.5–71.5         70              27
71.5–74.5         73              8
x̄ = 6745 ÷ 100 = 67.45
Now, with the mean in hand, you can compute the skewness. (Of course in real life you'd
probably use Excel or a statistics package, but it's good to know where the numbers come
from.)
Class Mark, x   Frequency, f   xf      (x−x̄)    (x−x̄)²f   (x−x̄)³f
61              5              305     −6.45     208.01     −1341.68
64              18             1152    −3.45     214.25     −739.15
67              42             2814    −0.45     8.51       −3.83
70              27             1890    2.55      175.57     447.70
73              8              584     5.55      246.42     1367.63
Sum                            6745    n/a       852.75     −269.33
x̄, m2, m3                     67.45   n/a       8.5275     −2.6933
Finally, the skewness is
g1 = m3 / m2^(3/2)
= −2.6933 / 8.5275^(3/2)
= −0.1082
But wait, there's more! That would be the skewness if you had data for the whole
population. But obviously there are more than 100 male students in the world, or even in
almost any school, so what you have here is a sample, not the population. You must compute
the sample skewness:

G1 = [√(100×99) / 98] × [−2.6933 / 8.5275^(3/2)] = −0.1098
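As a cross-check of the hand computation, the same numbers can be reproduced in Python from the class marks and frequencies (an illustrative sketch, not part of the original):

```python
# Reproducing the grouped-heights skewness computation from the
# class marks and frequencies, so the hand-worked numbers can be checked.
import math

marks = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]

n = sum(freqs)                                       # 100
mean = sum(x * f for x, f in zip(marks, freqs)) / n  # 67.45

# Frequency-weighted second and third moments about the mean.
m2 = sum(f * (x - mean) ** 2 for x, f in zip(marks, freqs)) / n
m3 = sum(f * (x - mean) ** 3 for x, f in zip(marks, freqs)) / n

g1 = m3 / m2 ** 1.5                                  # about -0.1082
G1 = math.sqrt(n * (n - 1)) / (n - 2) * g1           # about -0.1098
print(round(m2, 4), round(m3, 4), round(g1, 4), round(G1, 4))
```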
Interpreting
If skewness is positive, the data are positively skewed or skewed right, meaning that the right
tail of the distribution is longer than the left. If skewness is negative, the data are negatively
skewed or skewed left, meaning that the left tail is longer.
If skewness = 0, the data are perfectly symmetrical. But a skewness of exactly zero is quite
unlikely for real-world data, so how can you interpret the skewness number? Bulmer, M. G.,
Principles of Statistics (Dover, 1979), a classic, suggests this rule of thumb:
• If skewness is less than −1 or greater than +1, the distribution is highly skewed.
• If skewness is between −1 and −½ or between +½ and +1, the distribution is moderately
skewed.
• If skewness is between −½ and +½, the distribution is approximately symmetric.
With a skewness of −0.1098, the sample data for student heights are approximately symmetric.
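Bulmer's rule of thumb can be captured in a tiny helper (my own illustrative code; the thresholds are the ones in the bullets above):

```python
# Classify a skewness value using Bulmer's rule of thumb (an informal guide,
# not a formal test).
def describe_skewness(sk):
    if abs(sk) > 1:
        return "highly skewed"
    if abs(sk) > 0.5:
        return "moderately skewed"
    return "approximately symmetric"

print(describe_skewness(-0.1098))  # approximately symmetric
print(describe_skewness(0.537))    # moderately skewed
print(describe_skewness(-1.3))     # highly skewed
```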
Caution: This is an interpretation of the data you actually have. When you have data for the
whole population, that's fine. But when you have a sample, the sample skewness doesn't
necessarily apply to the whole population. In that case the question is, from the sample
skewness, can you conclude anything about the population skewness? To answer that question,
see the next section.
Inferring
Your data set is just one sample drawn from a population. Maybe, from ordinary sample
variability, your sample is skewed even though the population is symmetric. But if the sample is
skewed too much for random chance to be the explanation, then you can conclude that there is
skewness in the population.
But what do I mean by "too much for random chance to be the explanation"? To answer that,
you need to divide the sample skewness G1 by the standard error of skewness (SES) to get the
test statistic, which measures how many standard errors separate the sample skewness from
zero:
(3)  test statistic: Zg1 = G1 / SES,  where SES = √[ 6n(n−1) / ((n−2)(n+1)(n+3)) ]
This formula is adapted from page 85 of Cramer, Duncan, Basic Statistics for Social Research
(Routledge, 1997). (Some authors suggest √(6/n), but for small samples that's a poor
approximation. And anyway, we've all got calculators, so you may as well do it right.)
The critical value of Zg1 is approximately 2. (This is a two-tailed test of skewness ≠ 0 at roughly
the 0.05 significance level.)
• If Zg1 < −2, the population is very likely skewed negatively (though you don't know by
how much).
• If Zg1 is between −2 and +2, you can't reach any conclusion about the skewness of the
population: it might be symmetric, or it might be skewed in either direction.
• If Zg1 > +2, the population is very likely skewed positively (though you don't know by how
much).
Don't mix up the meanings of this test statistic and the amount of skewness. The amount of
skewness tells you how highly skewed your sample is: the bigger the number, the bigger the
skew. The test statistic tells you whether the whole population is probably skewed, but not by
how much: the bigger the number, the higher the probability.
Estimating
GraphPad suggests a confidence interval for skewness:
(4) 95% confidence interval of population skewness = G1 ± 2 SES
For the college men's heights, recall that the sample skewness was G1 = −0.1098. The sample
size was n = 100 and therefore the standard error of skewness is
SES = √[ (6×100×99) / (98×101×103) ] = 0.2414
The test statistic is
Zg1 = G1/SES = −0.1098 / 0.2414 = −0.45
This is quite small, so it's impossible to say whether the population is symmetric or skewed.
Since the sample skewness is small, a confidence interval is probably reasonable:
G1 ± 2 SES = −.1098 ± 2×.2414 = −.1098 ± .4828 = −0.5926 to +0.3730.
You can give a 95% confidence interval of skewness as about −0.59 to +0.37, more or less.
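The standard error, test statistic, and confidence interval can be verified in Python (an illustrative sketch; the function name ses is my own):

```python
# Standard error of skewness (equation 3), test statistic, and the
# 95% confidence interval (equation 4) for the heights example.
import math

def ses(n):
    """Standard error of skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

n, G1 = 100, -0.1098
se = ses(n)                      # about 0.2414
z = G1 / se                      # about -0.45
ci = (G1 - 2 * se, G1 + 2 * se)  # about (-0.59, +0.37)
print(round(se, 4), round(z, 2), round(ci[0], 4), round(ci[1], 4))
```

Because the interval straddles zero, the sample cannot establish skewness in either direction for the population.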
4. Define Kurtosis. Differentiate each type from the others and give examples.
Kurtosis
If a distribution is symmetric, the next question is about the central peak: is it high and sharp, or
short and broad? You can get some idea of this from the histogram, but a numerical measure is
more precise.
The height and sharpness of the peak relative to the rest of the data are measured by a
number called kurtosis. Higher values indicate a higher, sharper peak; lower values indicate a
lower, less distinct peak. This occurs because, as Wikipedia's article on kurtosis explains, higher
kurtosis means more of the variability is due to a few extreme differences from the mean,
rather than a lot of modest differences from the mean.
Balanda and MacGillivray say the same thing in another way: increasing kurtosis is associated
with "the movement of probability mass from the shoulders of a distribution into its center
and tails." (Kevin P. Balanda and H. L. MacGillivray, "Kurtosis: A Critical Review", The American
Statistician 42:2 [May 1988], pp. 111–119, drawn to my attention by Karl Ove Hufthammer.)
You may remember that the mean and standard deviation have the same units as the original
data, and the variance has the square of those units. However, the kurtosis has no units: it's a
pure number, like a z-score.
The reference standard is a normal distribution, which has a kurtosis of 3. In token of this, often
the excess kurtosis is presented: excess kurtosis is simply kurtosis − 3. For example, the
kurtosis reported by Excel is actually the excess kurtosis.
• A normal distribution has kurtosis exactly 3 (excess kurtosis exactly 0). Any distribution
with kurtosis ≈3 (excess ≈0) is called mesokurtic.
• A distribution with kurtosis <3 (excess kurtosis <0) is called platykurtic. Compared to a
normal distribution, its central peak is lower and broader, and its tails are shorter and
thinner.
• A distribution with kurtosis >3 (excess kurtosis >0) is called leptokurtic. Compared to a
normal distribution, its central peak is higher and sharper, and its tails are longer and
fatter.
Visualizing
Kurtosis is unfortunately harder to picture than skewness, but these illustrations, suggested by
Wikipedia, should help. All three of these distributions have mean of 0, standard deviation of 1,
and skewness of 0, and all are plotted on the same horizontal and vertical scale. Look at the
progression from left to right, as kurtosis increases.
kurtosis = 1.8, excess = −1.2
Uniform(min = −√3, max = √3)
kurtosis = 3, excess = 0
Normal(μ = 0, σ = 1)
kurtosis = 4.2, excess = +1.2
Logistic(α = 0, β = 0.55153)
Moving from the illustrated uniform distribution to a normal distribution, you see that the
shoulders have transferred some of their mass to the center and the tails. In other words, the
intermediate values have become less likely and the central and extreme values have become
more likely. The kurtosis increases while the standard deviation stays the same, because more
of the variation is due to extreme values.
Moving from the normal distribution to the illustrated logistic distribution, the trend continues.
There is even less in the shoulders and even more in the tails, and the central peak is higher and
narrower.
How far can this go? What are the smallest and largest possible values of kurtosis? The
smallest possible kurtosis is 1 (excess kurtosis −2), and the largest is ∞, as shown here:
kurtosis = 1, excess = −2        kurtosis = ∞, excess = ∞
A discrete distribution with two equally likely outcomes, such as winning or losing on the flip of
a coin, has the lowest possible kurtosis. It has no central peak and no real tails, and you could
say that it's all shoulder: it's as platykurtic as a distribution can be. At the other extreme,
Student's t distribution with four degrees of freedom has infinite kurtosis. A distribution can't
be any more leptokurtic than this.
Computing
The moment coefficient of kurtosis of a data set is computed almost the same way as the
coefficient of skewness: just change the exponent 3 to 4 in the formulas:
(5)  kurtosis: a4 = m4 / m2²  and  excess kurtosis: g2 = a4 − 3

where m4 = Σ(x − x̄)⁴ / n and m2 = Σ(x − x̄)² / n.

Again, the excess kurtosis is generally used because the excess kurtosis of a normal distribution
is 0. x̄ is the mean and n is the sample size, as usual. m4 is called the fourth moment of the data
set. m2 is the variance, the square of the standard deviation.
Just as with variance, standard deviation, and skewness, the above is the final computation if you
have data for the whole population. But if you have data for only a sample, you have to
compute the sample excess kurtosis using this formula, which comes from Joanes and Gill:

(6)  sample excess kurtosis: G2 = [(n−1) / ((n−2)(n−3))] × [(n+1)g2 + 6]
Excel doesn't concern itself with whether you have a sample or a population: its measure of
kurtosis is always G2.
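Equations (5) and (6) translate directly into Python. This is an illustrative sketch (the function names are my own), not part of the original assignment:

```python
# Moment coefficient of kurtosis (equation 5) and sample excess
# kurtosis (equation 6), standard library only.
def kurtosis_a4(xs):
    """Population kurtosis: a4 = m4 / m2**2."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # variance
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth moment
    return m4 / m2 ** 2

def excess_kurtosis_G2(xs):
    """Sample excess kurtosis: G2 = (n-1)/((n-2)(n-3)) * ((n+1)*g2 + 6)."""
    n = len(xs)
    g2 = kurtosis_a4(xs) - 3
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

# Two equally likely outcomes (a coin flip) give the minimum kurtosis, a4 = 1.
coin = [0, 1] * 50
print(kurtosis_a4(coin))  # 1.0
```

The coin-flip check matches the text: a two-point distribution is as platykurtic as a distribution can be.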
Example: Let's continue with the example of the college men's heights, and compute the
kurtosis of the data set. n = 100, x̄ = 67.45 inches, and the variance m2 = 8.5275 in² were
computed earlier.
Class Mark, x   Frequency, f   x−x̄     (x−x̄)⁴f
61              5              −6.45    8653.84
64              18             −3.45    2550.05
67              42             −0.45    1.72
70              27             2.55     1141.63
73              8              5.55     7590.35
Sum                                     19937.60
m4                                      199.3760
Finally, the kurtosis is
a4 = m4 / m2² = 199.3760 / 8.5275² = 2.7418
and the excess kurtosis is
g2 = 2.7418 − 3 = −0.2582
But this is a sample, not the population, so you have to compute the sample excess kurtosis:
G2 = [99/(98×97)] × [101×(−0.2582)+6] = −0.2091
This sample is slightly platykurtic: its peak is just a bit shallower than the peak of a normal
distribution.
Inferring
Your data set is just one sample drawn from a population. How far must the excess kurtosis be
from 0, before you can say that the population also has nonzero excess kurtosis?
The answer comes in a similar way to the analogous question about skewness. You divide the
sample excess kurtosis by the standard error of kurtosis (SEK) to get the test statistic, which
tells you how many standard errors the sample excess kurtosis is from zero:
(7)  test statistic: Zg2 = G2 / SEK,  where SEK = 2 × SES × √[ (n²−1) / ((n−3)(n+5)) ]
The formula is adapted from page 89 of Duncan Cramer's Basic Statistics for Social Research
(Routledge, 1997). (Some authors suggest √(24/n), but for small samples that's a poor
approximation. And anyway, we've all got calculators, so you may as well do it right.)
The critical value of Zg2 is approximately 2. (This is a two-tailed test of excess kurtosis ≠ 0 at
approximately the 0.05 significance level.)
• If Zg2 < −2, the population very likely has negative excess kurtosis (kurtosis <3,
platykurtic), though you don't know how much.
• If Zg2 is between −2 and +2, you can't reach any conclusion about the kurtosis: excess
kurtosis might be positive, negative, or zero.
• If Zg2 > +2, the population very likely has positive excess kurtosis (kurtosis >3,
leptokurtic), though you don't know how much.
For the sample college men's heights (n=100), you found excess kurtosis of G2 = −0.2091. The
sample is platykurtic, but is this enough to let you say that the whole population is platykurtic
(has lower kurtosis than the bell curve)?
First compute the standard error of kurtosis:
SEK = 2 × SES × √[ (n²−1) / ((n−3)(n+5)) ]
n = 100, and the SES was previously computed as 0.2414.
SEK = 2 × 0.2414 × √[ (100²−1) / (97×105) ] = 0.4784
The test statistic is
Zg2 = G2/SEK = −0.2091 / 0.4784 = −0.44
You can't say whether the kurtosis of the population is the same as or different from the
kurtosis of a normal distribution.
Assessing Normality
There are many ways to assess normality, and unfortunately none of them are without
problems. Graphical methods are a good start, such as plotting a histogram and making a
quantile plot.
One test is the D'Agostino-Pearson omnibus test, so called because it uses the test statistics for
both skewness and kurtosis to come up with a single p-value. The test statistic is
(8)  DP = Zg1² + Zg2²  follows χ² with df = 2
You can look up the p-value in a table, or use χ²cdf on a TI-83 or TI-84.
Caution: The D'Agostino-Pearson test has a tendency to err on the side of rejecting normality,
particularly with small sample sizes. David Moriarty, in his StatCat utility, recommends that you
don't use D'Agostino-Pearson for sample sizes below 20.
For college students' heights you had test statistics Zg1 = −0.45 for skewness and Zg2 = −0.44 for
kurtosis. The omnibus test statistic is
DP = Zg1² + Zg2² = (−0.45)² + (−0.44)² = 0.3961
and the p-value for χ²(2 df) > 0.3961, from a table or a statistics calculator, is 0.8203. You
cannot reject the assumption of normality. (Remember, you never accept the null hypothesis,
so you can't say from this test that the distribution is normal.) The histogram suggests
normality, and this test gives you no reason to reject that impression.
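For df = 2, the chi-squared survival function has the closed form exp(−DP/2), so the omnibus p-value can be checked without a table. A small Python sketch (my own illustration, not from the original):

```python
# D'Agostino-Pearson omnibus statistic (equation 8). For a chi-squared
# distribution with 2 degrees of freedom, P(X > x) = exp(-x/2) exactly,
# so no lookup table is needed for the p-value.
import math

def dagostino_pearson_p(z_skew, z_kurt):
    """Return (DP, p-value) for the omnibus normality test."""
    dp = z_skew ** 2 + z_kurt ** 2
    return dp, math.exp(-dp / 2)  # chi-squared survival function, df = 2

dp, p = dagostino_pearson_p(-0.45, -0.44)
print(round(dp, 4), round(p, 4))  # 0.3961 0.8203
```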
Example 2: Size of Rat Litters
For a second illustration of inferences about skewness and kurtosis of a population, I'll use an
example from Bulmer's Principles of Statistics:
Frequency distribution of litter size in rats, n=815
Litter size 1 2 3 4 5 6 7 8 9 10 11 12
Frequency 7 33 58 116 125 126 121 107 56 37 25 4
I'll spare you the detailed calculations, but you should be able to verify them by following
equation (1) and equation (2):
n = 815, x̄ = 6.1252, m2 = 5.1721, m3 = 2.0316
skewness g1 = 0.1727 and sample skewness G1 = 0.1730
The sample is roughly symmetric but slightly skewed right, which looks about right from the
histogram. The standard error of skewness is
SES = √[ (6×815×814) / (813×816×818) ] = 0.0856
Dividing the skewness by the SES, you get the test statistic
Zg1 = 0.1730 / 0.0856 = 2.02
Since this is greater than 2, you can say that there is some positive skewness in the population.
Again, "some positive skewness" just means a figure greater than zero; it doesn't tell us
anything more about the magnitude of the skewness.
If you go on to compute a 95% confidence interval of skewness from equation (4), you get
0.1730±2×0.0856 = 0.00 to 0.34.
What about the kurtosis? You should be able to follow equation (5) and compute a fourth
moment of m4 = 67.3948. You already have m2 = 5.1721, and therefore
kurtosis a4 = m4 / m2² = 67.3948 / 5.1721² = 2.5194
excess kurtosis g2 = 2.5194 − 3 = −0.4806
sample excess kurtosis G2 = [814/(813×812)] × [816×(−0.4806)+6] = −0.4762
So the sample is moderately less peaked than a normal distribution. Again, this matches the
histogram, where you can see the higher shoulders.
What if anything can you say about the population? For this you need equation (7). Begin by
computing the standard error of kurtosis, using n = 815 and the previously computed SES of 0.0856:
SEK = 2 × SES × √[ (n²−1) / ((n−3)(n+5)) ]
SEK = 2 × 0.0856 × √[ (815²−1) / (812×820) ] = 0.1711
and divide:
Zg2 = G2/SEK = −0.4762 / 0.1711 = −2.78
Since Zg2 is comfortably below −2, you can say that the distribution of all litter sizes is
platykurtic, less sharply peaked than the normal distribution. But be careful: you know that it is
platykurtic, but you don't know by how much.
You already know the population is not normal, but let's apply the D'Agostino-Pearson test
anyway:
DP = 2.02² + (−2.78)² = 11.8088
p-value = P( χ²(2) > 11.8088 ) = 0.0027
The test agrees with the separate tests of skewness and kurtosis: sizes of rat litters, for the
entire population of rats, are not normally distributed.
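All of the rat-litter numbers can be reproduced from the frequency table with a few lines of Python (an illustrative check, not part of the original):

```python
# Reproducing the rat-litter moments from the frequency table above.
import math

sizes = list(range(1, 13))
freqs = [7, 33, 58, 116, 125, 126, 121, 107, 56, 37, 25, 4]

n = sum(freqs)                                       # 815
mean = sum(x * f for x, f in zip(sizes, freqs)) / n  # about 6.1252

# Frequency-weighted moments about the mean.
m2 = sum(f * (x - mean) ** 2 for x, f in zip(sizes, freqs)) / n
m3 = sum(f * (x - mean) ** 3 for x, f in zip(sizes, freqs)) / n
m4 = sum(f * (x - mean) ** 4 for x, f in zip(sizes, freqs)) / n

g1 = m3 / m2 ** 1.5
G1 = math.sqrt(n * (n - 1)) / (n - 2) * g1                  # about 0.1730
g2 = m4 / m2 ** 2 - 3
G2 = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)     # about -0.4762
print(round(mean, 4), round(G1, 4), round(G2, 4))
```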
How do I determine whether my data are normal?
There are several interrelated approaches to determining normality, and all of them should be
conducted.
1. Look at a histogram with the normal curve superimposed. A histogram provides a
useful graphical representation of the data. To provide a rough example of
normality and non-normality, see the following histograms. The black line
superimposed on the histograms represents the bell-shaped "normal" curve.
Notice how the data for variable1 are normal, and the data for variable2 are
non-normal. In this case, the non-normality is driven by the presence of an
outlier. For more information about outliers, see What are outliers?, How do I
detect outliers?, and How do I deal with outliers?. Problem -- All samples deviate
somewhat from normal, so the question is how much deviation from the black
line indicates non-normality? Unfortunately, graphical representations like
histograms provide no hard-and-fast rules. After you have viewed many (many!)
histograms, over time you will get a sense for the normality of data.
2. Look at the values of Skewness and Kurtosis. Skewness involves the symmetry
of the distribution. Skewness that is normal involves a perfectly symmetric
distribution. A positively skewed distribution has scores clustered to the left,
with the tail extending to the right. A negatively skewed distribution has scores
clustered to the right, with the tail extending to the left. Kurtosis involves the
peakedness of the distribution. Kurtosis that is normal involves a distribution
that is bell-shaped and not too peaked or flat. Positive kurtosis is indicated by a
peak. Negative kurtosis is indicated by a flat distribution. Both Skewness and
Kurtosis are 0 in a normal distribution, so the farther away from 0, the more
non-normal the distribution. The question is how much skew or kurtosis
renders the data non-normal? This is an arbitrary determination, and sometimes
difficult to interpret using the values of Skewness and Kurtosis. The histogram
above for variable1 represents perfect symmetry (skewness) and perfect
peakedness (kurtosis); and the descriptive statistics below for variable1 parallel
this information by reporting "0" for both skewness and kurtosis. The histogram
above for variable2 represents positive skewness (tail extending to the right) and
positive kurtosis (high peak); and the descriptive statistics below for variable2
parallel this information. Problem -- The question is how much skew or
kurtosis renders the data non-normal? This is an arbitrary determination, and
sometimes difficult to interpret using the values of Skewness and Kurtosis.
Luckily, there are more objective tests of normality, described next.
3. Look at established tests for normality that take into account both Skewness
and Kurtosis simultaneously. The Kolmogorov-Smirnov test (K-S) and Shapiro-
Wilk (S-W) test are designed to test normality by comparing your data to a
normal distribution with the same mean and standard deviation as your sample.
If the test is NOT significant, then the data are normal, so any value above .05
indicates normality. If the test is significant (less than .05), then the data are non-
normal. See the data below, which indicate variable1 is normal, and
variable2 is non-normal. Also, keep in mind one limitation of the normality tests
is that the larger the sample size, the more likely to get significant results. Thus,
you may get significant results with only slight deviations from normality when
sample sizes are large.
4. Look at normality plots of the data. A Normal Q-Q plot provides a graphical way to determine the level of normality. The black line indicates the values your sample should adhere to if the distribution were normal. The dots are your actual data. If the dots fall exactly on the black line, then your data are normal. If they deviate from the black line, your data are non-normal. Notice how the data for variable1 fall along the line, whereas the data for variable2 deviate from the line.
5. What is correlation? Enumerate measurements of correlation and explain their uses. What
is the indication of the magnitude of a relationship? Give examples.
Correlation coefficients measure the strength of association between two variables. The most
common correlation coefficient, called the Pearson product-moment correlation coefficient,
measures the strength of the linear association between variables.
In this tutorial, when we speak simply of a correlation coefficient, we are referring to the
Pearson product-moment correlation. Generally, the correlation coefficient of a sample is
denoted by r, and the correlation coefficient of a population is denoted by ρ or R.
How to Interpret a Correlation Coefficient
The sign and the absolute value of a correlation coefficient describe the direction and the
magnitude of the relationship between two variables.
* The value of a correlation coefficient ranges between -1 and 1.
* The greater the absolute value of a correlation coefficient, the stronger the linear relationship.
* The strongest linear relationship is indicated by a correlation coefficient of -1 or 1.
* The weakest linear relationship is indicated by a correlation coefficient equal to 0.
* A positive correlation means that if one variable gets bigger, the other variable tends to get bigger.
* A negative correlation means that if one variable gets bigger, the other variable tends to get smaller.
Keep in mind that the Pearson product-moment correlation coefficient only measures linear
relationships. Therefore, a correlation of 0 does not mean zero relationship between two
variables; rather, it means zero linear relationship. (It is possible for two variables to have zero
linear relationship and a strong curvilinear relationship at the same time.)
Scatterplots and Correlation Coefficients
The scatterplots below show how different patterns of data produce different degrees of
correlation.
* Maximum positive correlation (r = 1.0)
* Strong positive correlation (r = 0.80)
* Zero correlation (r = 0)
* Minimum negative correlation (r = -1.0)
* Moderate negative correlation (r = -0.43)
* Strong correlation & outlier (r = 0.71)
Several points are evident from the scatterplots.
* When the slope of the line in the plot is negative, the correlation is negative; and vice versa.
* The strongest correlations (r = 1.0 and r = -1.0) occur when data points fall exactly on a straight line.
* The correlation becomes weaker as the data points become more scattered.
* If the data points fall in a random pattern, the correlation is equal to zero.
* Correlation is affected by outliers. Compare the first scatterplot with the last scatterplot. The single outlier in the last plot greatly reduces the correlation (from 1.00 to 0.71).
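These behaviors can be confirmed numerically: the correlation can be computed directly from deviation scores. The sketch below is a minimal illustration (`pearson_r` is a hypothetical helper, not a library function):

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson product-moment correlation from deviation scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # Σxy
    sxx = sum((x - mx) ** 2 for x in xs)                    # Σx²
    syy = sum((y - my) ** 2 for y in ys)                    # Σy²
    return sxy / math.sqrt(sxx * syy)

# Points on a straight line with positive slope give r = 1.0;
# reversing the slope gives r = -1.0.
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(pearson_r([1, 2, 3], [6, 4, 2]))
```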
How to Calculate a Correlation Coefficient
If you look in different statistics textbooks, you are likely to find different-looking (but
equivalent) formulas for computing a correlation coefficient. In this section, we present several
formulas that you may encounter.
The most common formula for computing a product-moment correlation coefficient (r) is given below.

Product-moment correlation coefficient. The correlation r between two variables is:

r = Σ(xy) / sqrt [ ( Σx² ) * ( Σy² ) ]

where Σ is the summation symbol, x = xi - x̄, xi is the x value for observation i, x̄ is the mean x value, y = yi - ȳ, yi is the y value for observation i, and ȳ is the mean y value.

The formula below uses population means and population standard deviations to compute a population correlation coefficient (ρ) from population data.

Population correlation coefficient. The correlation ρ between two variables is:

ρ = [ 1 / N ] * Σ{ [ (Xi - μX) / σx ] * [ (Yi - μY) / σy ] }

where N is the number of observations in the population, Σ is the summation symbol, Xi is the X value for observation i, μX is the population mean for variable X, Yi is the Y value for observation i, μY is the population mean for variable Y, σx is the population standard deviation of X, and σy is the population standard deviation of Y.

The formula below uses sample means and sample standard deviations to compute a correlation coefficient (r) from sample data.

Sample correlation coefficient. The correlation r between two variables is:

r = [ 1 / (n - 1) ] * Σ{ [ (xi - x̄) / sx ] * [ (yi - ȳ) / sy ] }

where n is the number of observations in the sample, Σ is the summation symbol, xi is the x value for observation i, x̄ is the sample mean of x, yi is the y value for observation i, ȳ is the sample mean of y, sx is the sample standard deviation of x, and sy is the sample standard deviation of y.

The interpretation of the sample correlation coefficient depends on how the sample data are collected. With a simple random sample, the sample correlation coefficient is an unbiased estimate of the population correlation coefficient.

Each of the latter two formulas can be derived from the first formula. Use the second formula when you have data from the entire population. Use the third formula when you only have sample data. When in doubt, use the first formula. It is always correct.

Fortunately, you will rarely have to compute a correlation coefficient by hand. Many software packages (e.g., Excel) and most graphing calculators have a correlation function that will do the job for you.

Note: Sometimes, it is not clear whether a software package or a graphing calculator is computing a population correlation coefficient or a sample correlation coefficient. For example, a casual user might not realize that Microsoft uses a population correlation coefficient (ρ) for the Pearson function in its Excel software.

6. What are the different measures in testing a hypothesis? Explain each.

Hypothesis Tests

A statistical hypothesis is an assumption about a population parameter. This assumption may or may not be true.
The best way to determine whether a statistical hypothesis is true would be to examine the
entire population. Since that is often impractical, researchers typically examine a random
sample from the population. If sample data are not consistent with the statistical hypothesis,
the hypothesis is rejected.
There are two types of statistical hypotheses.
* Null hypothesis. The null hypothesis, denoted by H0, is usually the hypothesis that sample observations result purely from chance.
* Alternative hypothesis. The alternative hypothesis, denoted by H1 or Ha, is the hypothesis that sample observations are influenced by some non-random cause.
For example, suppose we wanted to determine whether a coin was fair and balanced. A null
hypothesis might be that half the flips would result in Heads and half, in Tails. The alternative
hypothesis might be that the number of Heads and Tails would be very different. Symbolically,
these hypotheses would be expressed as
H0: P = 0.5
Ha: P ≠ 0.5
Suppose we flipped the coin 50 times, resulting in 40 Heads and 10 Tails. Given this result, we
would be inclined to reject the null hypothesis. We would conclude, based on the evidence,
that the coin was probably not fair and balanced.
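The coin example can be made precise with an exact binomial computation. The sketch below (a hypothetical helper, using only the standard library) sums the probabilities of all outcomes no more likely than the observed one, a common way to form a two-sided exact p-value:

```python
from math import comb

def binom_pvalue_two_sided(n, k, p=0.5):
    """Exact two-sided binomial test of H0: P = p.
    Sums Pr(i) over every outcome i at least as unlikely as k."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    return sum(pr for pr in pmf if pr <= pmf[k])

# 40 heads in 50 flips: the p-value is far below 0.05,
# so we reject H0 that the coin is fair.
print(binom_pvalue_two_sided(50, 40))
```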
Can We Accept the Null Hypothesis?
Some researchers say that a hypothesis test can have one of two outcomes: you accept the null
hypothesis or you reject the null hypothesis. Many statisticians, however, take issue with the
notion of "accepting the null hypothesis." Instead, they say: you reject the null hypothesis or
you fail to reject the null hypothesis.
Why the distinction between "acceptance" and "failure to reject"? Acceptance implies that the null hypothesis is true. Failure to reject implies that the data are not sufficiently persuasive for us to prefer the alternative hypothesis over the null hypothesis.
Hypothesis Tests
Statisticians follow a formal process to determine whether to reject a null hypothesis, based on
sample data. This process, called hypothesis testing, consists of four steps.
* State the hypotheses. This involves stating the null and alternative hypotheses. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false.
* Formulate an analysis plan. The analysis plan describes how to use sample data to evaluate the null hypothesis. The evaluation often focuses around a single test statistic.
* Analyze sample data. Find the value of the test statistic (mean score, proportion, t-score, z-score, etc.) described in the analysis plan.
* Interpret results. Apply the decision rule described in the analysis plan. If the value of the test statistic is unlikely, based on the null hypothesis, reject the null hypothesis.
Decision Errors
Two types of errors can result from a hypothesis test.
* Type I error. A Type I error occurs when the researcher rejects a null hypothesis when it is true. The probability of committing a Type I error is called the significance level. This probability is also called alpha, and is often denoted by α.
* Type II error. A Type II error occurs when the researcher fails to reject a null hypothesis that is false. The probability of committing a Type II error is called beta, and is often denoted by β.
21
  o Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
  o Test method. Typically, the test method involves a test statistic and a sampling distribution. Computed from sample data, the test statistic might be a mean score, proportion, difference between means, difference between proportions, z-score, t-score, chi-square, etc. Given a test statistic and its sampling distribution, a researcher can assess probabilities associated with the test statistic. If the test statistic probability is less than the significance level, the null hypothesis is rejected.
* Analyze sample data. Using sample data, perform computations called for in the analysis plan.
o Test statistic. When the null hypothesis involves a mean or proportion, use
either of the following equations to compute the test statistic.
Test statistic = (Statistic - Parameter) / (Standard deviation of statistic)
Test statistic = (Statistic - Parameter) / (Standard error of statistic)
where Parameter is the value appearing in the null hypothesis, and Statistic is the point estimate of Parameter. As part of the analysis, you may need to compute the standard deviation or standard error of the statistic. Previously, we presented common formulas for the standard deviation and standard error.
When the parameter in the null hypothesis involves categorical data, you may use a chi-square
statistic as the test statistic. Instructions for computing a chi-square test statistic are presented
in the lesson on the chi-square goodness of fit test.
  o P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic, assuming the null hypothesis is true.
* Interpret the results. If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
Hypothesis Test of the Mean
This lesson explains how to conduct a hypothesis test of a mean, when the following conditions are met:
* The sampling method is simple random sampling.
* The sample is drawn from a normal or near-normal population.
Generally, the sampling distribution will be approximately normally distributed if any of the following conditions apply.
* The population distribution is normal.
* The sampling distribution is symmetric, unimodal, without outliers, and the sample size is 15 or less.
* The sampling distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40.
* The sample size is greater than 40, without outliers.
This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3)
analyze sample data, and (4) interpret results.
State the Hypotheses
Every hypothesis test requires the analyst to state a null hypothesis and an alternative
hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if
one is true, the other must be false; and vice versa.
The table below shows three sets of hypotheses. Each makes a statement about how the population mean μ is related to a specified value M. (In the table, the symbol ≠ means "not equal to".)

Set | Null hypothesis | Alternative hypothesis | Number of tails
1 | μ = M | μ ≠ M | 2
2 | μ > M | μ < M | 1
3 | μ < M | μ > M | 1
The first set of hypotheses (Set 1) is an example of a two-tailed test, since an extreme value on
either side of the sampling distribution would cause a researcher to reject the null hypothesis.
The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests, since an extreme value on
only one side of the sampling distribution would cause a researcher to reject the null
hypothesis.
Formulate an Analysis Plan
The analysis plan describes how to use sample data to accept or reject the null hypothesis. It
should specify the following elements.
* Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
* Test method. Use the one-sample t-test to determine whether the hypothesized mean differs significantly from the observed sample mean.
Analyze Sample Data
Using sample data, conduct a one-sample t-test. This involves finding the standard error,
degrees of freedom, test statistic, and the P-value associated with the test statistic.
* Standard error. Compute the standard error (SE) of the sampling distribution.
SE = s * sqrt{ ( 1/n ) * ( 1 - n/N ) * [ N / ( N - 1 ) ] }
where s is the standard deviation of the sample, N is the population size, and n is the sample size. When the population size is much larger (at least 10 times larger) than the sample size, the standard error can be approximated by:
SE = s / sqrt( n )
* Degrees of freedom. The degrees of freedom (DF) is equal to the sample size (n) minus one. Thus, DF = n - 1.
* Test statistic. The test statistic is a t-score (t) defined by the following equation.
t = (x̄ - μ) / SE
where x̄ is the sample mean, μ is the hypothesized population mean in the null hypothesis, and SE is the standard error.
* P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a t-score, use the t Distribution Calculator to assess the probability associated with the t-score, given the degrees of freedom computed above. (See sample problems at the end of this lesson for examples of how this is done.)
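The standard error, degrees of freedom, and t-score fit in a few lines. The sketch below (a hypothetical helper) assumes the population is much larger than the sample, so it uses the simplified SE = s / sqrt(n):

```python
import math

def one_sample_t(xs, mu0):
    """t statistic and degrees of freedom for a one-sample t-test of
    H0: mu = mu0 (population assumed much larger than the sample)."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))  # sample std dev
    se = s / math.sqrt(n)   # standard error of the mean
    t = (mean - mu0) / se   # test statistic
    df = n - 1              # degrees of freedom
    return t, df

print(one_sample_t([5, 7, 6, 9, 8], 5))
```

The resulting t-score and df would then be looked up in a t distribution to obtain the P-value.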
Interpret Results
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null
hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting
the null hypothesis when the P-value is less than the significance level.
Hypothesis Test for the Difference Between Two Means
This lesson explains how to conduct a hypothesis test for the difference between two means.
The test procedure, called the two-sample t-test, is appropriate when the following conditions are met:
* The sampling method for each sample is simple random sampling.
* The samples are independent.
* Each population is at least 10 times larger than its respective sample.
* Each sample is drawn from a normal or near-normal population. Generally, the sampling distribution will be approximately normal if any of the following conditions apply.
  o The population distribution is normal.
  o The sample data are symmetric, unimodal, without outliers, and the sample size is 15 or less.
  o The sample data are slightly skewed, unimodal, without outliers, and the sample size is 16 to 40.
  o The sample size is greater than 40, without outliers.
This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3)
analyze sample data, and (4) interpret results.
State the Hypotheses
Every hypothesis test requires the analyst to state a null hypothesis and an alternative
hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if
one is true, the other must be false; and vice versa.
The table below shows three sets of null and alternative hypotheses. Each makes a statement about the difference d between the mean of one population, μ1, and the mean of another population, μ2. (In the table, the symbol ≠ means "not equal to".)

Set | Null hypothesis | Alternative hypothesis | Number of tails
1 | μ1 - μ2 = d | μ1 - μ2 ≠ d | 2
2 | μ1 - μ2 > d | μ1 - μ2 < d | 1
3 | μ1 - μ2 < d | μ1 - μ2 > d | 1

The first set of hypotheses (Set 1) is an example of a two-tailed test, since an extreme value on either side of the sampling distribution would cause a researcher to reject the null hypothesis. The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests, since an extreme value on only one side of the sampling distribution would cause a researcher to reject the null hypothesis.
When the null hypothesis states that there is no difference between the two population means (i.e., d = 0), the null and alternative hypothesis are often stated in the following form.
H0: μ1 = μ2
Ha: μ1 ≠ μ2
Formulate an Analysis Plan
The analysis plan describes how to use sample data to accept or reject the null hypothesis. It
should specify the following elements.
* Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
* Test method. Use the two-sample t-test to determine whether the difference between means found in the sample is significantly different from the hypothesized difference between means.
Analyze Sample Data
Using sample data, find the standard error, degrees of freedom, test statistic, and the P-value
associated with the test statistic.
* Standard error. Compute the standard error (SE) of the sampling distribution.
SE = sqrt[ (s1²/n1) + (s2²/n2) ]
where s1 is the standard deviation of sample 1, s2 is the standard deviation of sample 2, n1 is the size of sample 1, and n2 is the size of sample 2.
* Degrees of freedom. The degrees of freedom (DF) is:
DF = (s1²/n1 + s2²/n2)² / { [ (s1²/n1)² / (n1 - 1) ] + [ (s2²/n2)² / (n2 - 1) ] }
If DF does not compute to an integer, round it off to the nearest whole number. Some texts suggest that the degrees of freedom can be approximated by the smaller of n1 - 1 and n2 - 1; but the above formula gives better results.
* Test statistic. The test statistic is a t-score (t) defined by the following equation.
t = [ (x̄1 - x̄2) - d ] / SE
where x̄1 is the mean of sample 1, x̄2 is the mean of sample 2, d is the hypothesized difference between population means, and SE is the standard error.
* P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a t-score, use the t Distribution Calculator to assess the probability associated with the t-score, having the degrees of freedom computed above. (See sample problems at the end of this lesson for examples of how this is done.)
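The standard error and degrees-of-freedom formulas above combine into a short function. The sketch below (a hypothetical helper) assumes the hypothesized difference is d = 0:

```python
import math

def welch_t(xs, ys):
    """Two-sample t statistic with the unequal-variance (Welch) SE and DF,
    assuming hypothesized difference d = 0."""
    n1, n2 = len(xs), len(ys)
    m1, m2 = sum(xs) / n1, sum(ys) / n2
    v1 = sum((x - m1) ** 2 for x in xs) / (n1 - 1)  # sample variance 1
    v2 = sum((y - m2) ** 2 for y in ys) / (n2 - 1)  # sample variance 2
    se = math.sqrt(v1 / n1 + v2 / n2)               # standard error
    t = (m1 - m2) / se                              # test statistic
    df = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )
    return t, round(df)   # round DF to the nearest whole number

print(welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
```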
Interpret Results
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null
hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
Hypothesis Test for Difference Between Matched Pairs
This lesson explains how to conduct a hypothesis test for the difference between paired means. The test procedure, called the matched-pairs t-test, is appropriate when the following conditions are met:
* The sampling method for each sample is simple random sampling.
* The test is conducted on paired data. (As a result, the data sets are not independent.)
* Each sample is drawn from a normal or near-normal population. Generally, the sampling distribution will be approximately normal if any of the following conditions apply.
  o The population distribution is normal.
  o The sample data are symmetric, unimodal, without outliers, and the sample size is 15 or less.
  o The sample data are slightly skewed, unimodal, without outliers, and the sample size is 16 to 40.
  o The sample size is greater than 40, without outliers.
This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3)
analyze sample data, and (4) interpret results.
State the Hypotheses
Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa.
The hypotheses concern a new variable d, which is based on the difference between paired values from two data sets.
d = x1 - x2
where x1 is the value of variable x in the first data set, and x2 is the value of the variable from the second data set that is paired with x1.
The table below shows three sets of null and alternative hypotheses. Each makes a statement about how the true difference in population values μd is related to some hypothesized value D. (In the table, the symbol ≠ means "not equal to".)

Set | Null hypothesis | Alternative hypothesis | Number of tails
1 | μd = D | μd ≠ D | 2
2 | μd > D | μd < D | 1
3 | μd < D | μd > D | 1
The first set of hypotheses (Set 1) is an example of a two-tailed test, since an extreme value on either side of the sampling distribution would cause a researcher to reject the null hypothesis. The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests, since an extreme value on only one side of the sampling distribution would cause a researcher to reject the null hypothesis.
Formulate an Analysis Plan
The analysis plan describes how to use sample data to accept or reject the null hypothesis. It should specify the following elements.
* Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
* Test method. Use the matched-pairs t-test to determine whether the difference between sample means for paired data is significantly different from the hypothesized difference between population means.
Analyze Sample Data
Using sample data, find the standard deviation, standard error, degrees of freedom, test
statistic, and the P-value associated with the test statistic.
* Standard deviation. Compute the standard deviation (sd) of the differences computed from n matched pairs.
sd = sqrt [ Σ(di - d̄)² / (n - 1) ]
where di is the difference for pair i, d̄ is the sample mean of the differences, and n is the number of paired values.
* Standard error. Compute the standard error (SE) of the sampling distribution of d.
SE = sd * sqrt{ ( 1/n ) * ( 1 - n/N ) * [ N / ( N - 1 ) ] }
where sd is the standard deviation of the sample differences, N is the population size, and n is the sample size. When the population size is much larger (at least 10 times larger) than the sample size, the standard error can be approximated by:
SE = sd / sqrt( n )
* Degrees of freedom. The degrees of freedom (DF) is: DF = n - 1.
* Test statistic. The test statistic is a t-score (t) defined by the following equation.
t = [ (x̄1 - x̄2) - D ] / SE = (d̄ - D) / SE
where x̄1 is the mean of sample 1, x̄2 is the mean of sample 2, d̄ is the mean difference between paired values in the sample, D is the hypothesized difference between population means, and SE is the standard error.
* P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a t-score, use the t Distribution Calculator to assess the probability associated with the t-score, having the degrees of freedom computed above. (See the sample problem at the end of this lesson for guidance on how this is done.)
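Because the matched-pairs test is just a one-sample t-test on the differences, the computation is short. A sketch (hypothetical helper, using the simplified SE = sd / sqrt(n)):

```python
import math

def paired_t(x1s, x2s, D=0.0):
    """Matched-pairs t statistic: a one-sample t-test on the pair differences."""
    ds = [a - b for a, b in zip(x1s, x2s)]   # d = x1 - x2 for each pair
    n = len(ds)
    dbar = sum(ds) / n
    sd = math.sqrt(sum((d - dbar) ** 2 for d in ds) / (n - 1))
    se = sd / math.sqrt(n)                   # standard error of d-bar
    return (dbar - D) / se, n - 1            # t-score and degrees of freedom

print(paired_t([2, 4, 6, 8], [1, 2, 3, 4]))
```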
Interpret Results
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null
hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting
the null hypothesis when the P-value is less than the significance level.
Hypothesis Test for a Proportion
This lesson explains how to conduct a hypothesis test of a proportion, when the following conditions are met:
* The sampling method is simple random sampling.
* Each sample point can result in just two possible outcomes. We call one of these outcomes a success and the other, a failure.
* The sample includes at least 10 successes and 10 failures. (Some texts say that 5 successes and 5 failures are enough.)
* The population size is at least 10 times as big as the sample size.
This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3)
analyze sample data, and (4) interpret results.
State the Hypotheses
Every hypothesis test requires the analyst to state a null hypothesis and an alternative
hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if
one is true, the other must be false; and vice versa.
Formulate an Analysis Plan
The analysis plan describes how to use sample data to accept or reject the null hypothesis. It
should specify the following elements.
* Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
* Test method. Use the one-sample z-test to determine whether the hypothesized population proportion differs significantly from the observed sample proportion.
Analyze Sample Data
Using sample data, find the test statistic and its associated P-value.
* Standard deviation. Compute the standard deviation (σ) of the sampling distribution.
σ = sqrt[ P * ( 1 - P ) / n ]
where P is the hypothesized value of the population proportion in the null hypothesis, and n is the sample size.
* Test statistic. The test statistic is a z-score (z) defined by the following equation.
z = (p - P) / σ
where P is the hypothesized value of the population proportion in the null hypothesis, p is the sample proportion, and σ is the standard deviation of the sampling distribution.
* P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a z-score, use the Normal Distribution Calculator to assess the probability associated with the z-score. (See sample problems at the end of this lesson for examples of how this is done.)
Interpret Results
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null
hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting
the null hypothesis when the P-value is less than the significance level.
Hypothesis Tests of Proportions (Small Sample)
In the previous lesson, we showed how to conduct a hypothesis test for a proportion when the
sample included at least 10 successes and 10 failures. This requirement serves two purposes:
* It guarantees that the sample size will be at least 20 when the proportion is 0.5.
* It ensures that the minimum acceptable sample size increases as the proportion becomes more extreme.
When the sample does not include at least 10 successes and 10 failures, the sample size will be
too small to justify the hypothesis testing approach presented in the previous lesson. This
lesson describes how to test a hypothesis about a proportion when the sample size is small, as
long as the sample includes at least one success and one failure. The key steps are:
* Formulate the hypotheses to be tested. This means stating the null hypothesis and the alternative hypothesis.
* Determine the sampling distribution of the proportion. If the sample proportion is the outcome of a binomial experiment, the sampling distribution will be binomial. If it is the outcome of a hypergeometric experiment, the sampling distribution will be hypergeometric.
* Specify the significance level. (Researchers often set the significance level equal to 0.05 or 0.01, although other values may be used.)
* Based on the hypotheses, the sampling distribution, and the significance level, define the region of acceptance.
* Test the null hypothesis. If the sample proportion falls within the region of acceptance, accept the null hypothesis; otherwise, reject the null hypothesis.
Hypothesis Test for Difference Between Proportions
This lesson explains how to conduct a hypothesis test to determine whether the difference
between two proportions is significant. The test procedure, called the two-proportion z-test, is
appropriate when the following conditions are met:
* The sampling method for each population is simple random sampling.
* The samples are independent.
* Each sample includes at least 10 successes and 10 failures. (Some texts say that 5 successes
and 5 failures are enough.)
* Each population is at least 10 times as big as its sample.
This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3)
analyze sample data, and (4) interpret results.
State the Hypotheses
Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The table below shows three sets of hypotheses. Each makes a statement about the difference d between two population proportions, P1 and P2. (In the table, the symbol ≠ means "not equal to".)

Set | Null hypothesis | Alternative hypothesis | Number of tails
1 | P1 - P2 = 0 | P1 - P2 ≠ 0 | 2
2 | P1 - P2 > 0 | P1 - P2 < 0 | 1
3 | P1 - P2 < 0 | P1 - P2 > 0 | 1
The first set of hypotheses (Set 1) is an example of a two-tailed test, since an extreme value on
either side of the sampling distribution would cause a researcher to reject the null hypothesis.
The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests, since an extreme value on
only one side of the sampling distribution would cause a researcher to reject the null
hypothesis.
When the null hypothesis states that there is no difference between the two population
proportions (i.e., d = 0), the null and alternative hypothesis for a two-tailed test are often stated
in the following form.
H0: P1 = P2
Ha: P1 ≠ P2
Formulate an Analysis Plan
The analysis plan describes how to use sample data to accept or reject the null hypothesis. It
should specify the following elements.
* Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10;
but any value between 0 and 1 can be used.
* Test method. Use the two-proportion z-test (described in the next section) to determine
whether the hypothesized difference between population proportions differs significantly from
the observed sample difference.
Analyze Sample Data
Using sample data, complete the following computations to find the test statistic and its
associated P-Value.
* Pooled sample proportion. Since the null hypothesis states that P1=P2, we use a pooled
sample proportion (p) to compute the standard error of the sampling distribution.
p = (p1 * n1 + p2 * n2) / (n1 + n2)
where p1 is the sample proportion from population 1, p2 is the sample proportion from
population 2, n1 is the size of sample 1, and n2 is the size of sample 2.
* Standard error. Compute the standard error (SE) of the sampling distribution of the
difference between two proportions.
SE = sqrt{ p * ( 1 - p ) * [ (1/n1) + (1/n2) ] }
where p is the pooled sample proportion, n1 is the size of sample 1, and n2 is the size of
sample 2.
* Test statistic. The test statistic is a z-score (z) defined by the following equation.
z = (p1 - p2) / SE
where p1 is the proportion from sample 1, p2 is the proportion from sample 2, and SE is the
standard error of the sampling distribution.
* P-value. The P-value is the probability of observing a sample statistic as extreme as the test
statistic. Since the test statistic is a z-score, use the Normal Distribution Calculator to assess the
probability associated with the z-score.
The analysis described above is a two-proportion z-test.
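The computations above can be sketched in Python; this is a minimal illustration (the function name and the sample counts are invented, not from the lesson), using only the standard library:

```python
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed two-proportion z-test from success counts x and sample sizes n."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                      # pooled sample proportion
    se = (p * (1 - p) * (1 / n1 + 1 / n2)) ** 0.5  # standard error
    z = (p1 - p2) / se                             # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed P-value
    return z, p_value

# Made-up example: 60/100 successes in sample 1 vs. 45/100 in sample 2
z, p_value = two_proportion_z_test(60, 100, 45, 100)
```

Note that the pooled proportion (x1 + x2) / (n1 + n2) is algebraically the same as the lesson's (p1 * n1 + p2 * n2) / (n1 + n2).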
Interpret Results
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null
hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting
the null hypothesis when the P-value is less than the significance level.
Region of Acceptance
In this lesson, we describe how to find the region of acceptance for a hypothesis test.
One-Tailed and Two-Tailed Hypothesis Tests
The steps taken to define the region of acceptance will vary, depending on whether the null
hypothesis and the alternative hypothesis call for one- or two-tailed hypothesis tests. So we
begin with a brief review.
The table below shows three sets of hypotheses. Each makes a statement about how the
population mean μ is related to a specified value M. (In the table, the symbol ≠ means "not
equal to".)
Set   Null hypothesis   Alternative hypothesis   Number of tails
1     μ = M             μ ≠ M                    2
2     μ ≥ M             μ < M                    1
3     μ ≤ M             μ > M                    1
The first set of hypotheses (Set 1) is an example of a two-tailed test, since an extreme value on
either side of the sampling distribution would cause a researcher to reject the null hypothesis.
The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests, since an extreme value on
only one side of the sampling distribution would cause a researcher to reject the null
hypothesis.
How to Find the Region of Acceptance
We define the region of acceptance in such a way that the chance of making a Type I error is
equal to the significance level. Here is how that is done.
* Define a test statistic. Here, the test statistic is the sample measure used to estimate the
population parameter that appears in the null hypothesis. For example, suppose the null
hypothesis is
H0: μ = M
The test statistic, used to estimate M, would be m. If M were a population mean, m would be
the sample mean; if M were a population proportion, m would be the sample proportion; if M
were a difference between population means, m would be the difference between sample
means; and so on.
* Given the significance level α, find the upper limit (UL) of the region of acceptance.
There are three possibilities, depending on the form of the null hypothesis.
o If the null hypothesis is μ ≤ M: The upper limit of the region of acceptance will be
equal to the value for which the cumulative probability of the sampling
distribution is equal to one minus the significance level. That is, P( m < UL ) = 1 - α.
o If the null hypothesis is μ = M: The upper limit of the region of acceptance will be
equal to the value for which the cumulative probability of the sampling
distribution is equal to one minus the significance level divided by 2. That is,
P( m < UL ) = 1 - α/2.
o If the null hypothesis is μ ≥ M: The upper limit of the region of acceptance is
equal to plus infinity, unless the test statistic is a proportion or a percentage.
The upper limit is 1 for a proportion, and 100 for a percentage.
* In a similar way, we find the lower limit (LL) of the region of acceptance. Again, there are
three possibilities, depending on the form of the null hypothesis.
o If the null hypothesis is μ ≤ M: The lower limit of the region of acceptance is
equal to minus infinity, unless the test statistic is a proportion or a percentage.
The lower limit for a proportion or a percentage is zero.
o If the null hypothesis is μ = M: The lower limit of the region of acceptance will be
equal to the value for which the cumulative probability of the sampling
distribution is equal to the significance level divided by 2. That is, P( m < LL ) = α/2.
o If the null hypothesis is μ ≥ M: The lower limit of the region of acceptance will be
equal to the value for which the cumulative probability of the sampling
distribution is equal to the significance level. That is, P( m < LL ) = α.
The region of acceptance is defined by the range between LL and UL.
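The three cases above can be sketched as a small Python helper. This assumes the sampling distribution of the estimate m is normal; the function name, parameter names, and the example numbers are illustrative assumptions, not part of the lesson:

```python
from statistics import NormalDist

def region_of_acceptance(M, se, alpha, null="eq"):
    """(LL, UL) of the region of acceptance, assuming the sampling
    distribution of the estimate m is normal, centered on M, with
    standard error se."""
    nd = NormalDist(mu=M, sigma=se)
    if null == "eq":   # H0: mu = M  -> two-tailed
        return nd.inv_cdf(alpha / 2), nd.inv_cdf(1 - alpha / 2)
    if null == "le":   # H0: mu <= M -> reject only for large m
        return float("-inf"), nd.inv_cdf(1 - alpha)
    if null == "ge":   # H0: mu >= M -> reject only for small m
        return nd.inv_cdf(alpha), float("inf")
    raise ValueError("null must be 'eq', 'le', or 'ge'")

# Two-tailed test of H0: mu = 100 with standard error 5 and alpha = 0.05
ll, ul = region_of_acceptance(100, 5, 0.05)
```

With these inputs the region of acceptance is roughly 90.2 to 109.8, i.e. M plus or minus 1.96 standard errors.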
Power of a Hypothesis Test
The probability of not committing a Type II error is called the power of a hypothesis test.
Effect Size
To compute the power of the test, one offers an alternative view about the "true" value of the
population parameter, assuming that the null hypothesis is false. The effect size is the
difference between the true value and the value specified in the null hypothesis.
Effect size = True value - Hypothesized value
For example, suppose the null hypothesis states that a population mean is equal to 100. A
researcher might ask: What is the probability of rejecting the null hypothesis if the true
population mean is equal to 90? In this example, the effect size would be 90 - 100, which equals
-10.
Factors That Affect Power
The power of a hypothesis test is affected by three factors.
* Sample size (n). Other things being equal, the greater the sample size, the greater the
power of the test.
* Significance level (α). The higher the significance level, the higher the power of the test.
If you increase the significance level, you reduce the region of acceptance. As a result,
you are more likely to reject the null hypothesis. This means you are less likely to accept
the null hypothesis when it is false; i.e., less likely to make a Type II error. Hence, the
power of the test is increased.
* The "true" value of the parameter being tested. The greater the difference between the
"true" value of a parameter and the value specified in the null hypothesis, the greater
the power of the test. That is, the greater the effect size, the greater the power of the
test.
How to Compute Power
When a researcher designs a study to test a hypothesis, he/she should compute the power of
the test (i.e., the likelihood of avoiding a Type II error).
How to Compute the Power of a Hypothesis Test
To compute the power of a hypothesis test, use the following three-step procedure.
* Define the region of acceptance. Previously, we showed how to compute the region of
acceptance for a hypothesis test.
* Specify the critical parameter value. The critical parameter value is an alternative to the
value specified in the null hypothesis. The difference between the critical parameter
value and the value from the null hypothesis is called the effect size. That is, the effect
size is equal to the critical parameter value minus the value from the null hypothesis.
* Compute power. Assume that the true population parameter is equal to the critical
parameter value, rather than the value specified in the null hypothesis. Based on that
assumption, compute the probability that the sample estimate of the population
parameter will fall outside the region of acceptance. That probability is the power of the
test.
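The three-step procedure above can be sketched for a two-tailed test of a mean, again assuming a normal sampling distribution (the function name and the example values, which mirror the earlier effect-size example of 90 - 100 = -10, are illustrative assumptions):

```python
from statistics import NormalDist

def power_two_tailed(M0, M_true, se, alpha):
    """Power of a two-tailed test of H0: mu = M0 when the true mean is
    M_true, assuming a normal sampling distribution with standard error se."""
    null_dist = NormalDist(M0, se)
    # Step 1: region of acceptance under H0
    ll = null_dist.inv_cdf(alpha / 2)
    ul = null_dist.inv_cdf(1 - alpha / 2)
    # Steps 2-3: probability, under the critical (true) value, that the
    # sample estimate falls OUTSIDE the region of acceptance
    true_dist = NormalDist(M_true, se)
    return true_dist.cdf(ll) + (1 - true_dist.cdf(ul))

# H0: mu = 100, true mean 90, standard error 5, alpha = 0.05
p = power_two_tailed(100, 90, 5, 0.05)
```

Here the power comes out to roughly 0.52, so with this standard error a true mean of 90 would be detected only about half the time.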
Chi-Square Goodness-of-Fit Test
This lesson explains how to conduct a chi-square goodness of fit test. The test is applied when
you have one categorical variable from a single population. It is used to determine whether
sample data are consistent with a hypothesized distribution.
For example, suppose a company printed baseball cards. It claimed that 30% of its cards were
rookies; 60%, veterans; and 10%, All-Stars. We could gather a random sample of baseball cards
and use a chi-square goodness of fit test to see whether our sample distribution differed
significantly from the distribution claimed by the company. The sample problem at the end of
the lesson considers this example.
The test procedure described in this lesson is appropriate when the following conditions are
met:
* The sampling method is simple random sampling.
* The population is at least 10 times as large as the sample.
* The variable under study is categorical.
* The expected value for each level of the variable is at least 5.
This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3)
analyze sample data, and (4) interpret results.
State the Hypotheses
Every hypothesis test requires the analyst to state a null hypothesis and an alternative
hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if
one is true, the other must be false; and vice versa.
For a chi-square goodness of fit test, the hypotheses take the following form.
H0: The data are consistent with a specified distribution.
Ha: The data are not consistent with a specified distribution.
Typically, the null hypothesis specifies the proportion of observations at each level of the
categorical variable. The alternative hypothesis is that at least one of the specified proportions
is not true.
Formulate an Analysis Plan
The analysis plan describes how to use sample data to accept or reject the null hypothesis. The
plan should specify the following elements.
* Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or
0.10; but any value between 0 and 1 can be used.
* Test method. Use the chi-square goodness of fit test to determine whether observed
sample frequencies differ significantly from expected frequencies specified in the null
hypothesis. The chi-square goodness of fit test is described in the next section, and
demonstrated in the sample problem at the end of this lesson.
Analyze Sample Data
Using sample data, find the degrees of freedom, expected frequency counts, test statistic, and
the P-value associated with the test statistic.
* Degrees of freedom. The degrees of freedom (DF) is equal to the number of levels (k) of
the categorical variable minus 1: DF = k - 1 .
* Expected frequency counts. The expected frequency counts at each level of the
categorical variable are equal to the sample size times the hypothesized proportion
from the null hypothesis:
Ei = n * pi
where Ei is the expected frequency count for the i th level of the categorical variable, n is the
total sample size, and pi is the hypothesized proportion of observations in level i .
* Test statistic. The test statistic is a chi-square random variable (Χ²) defined by the
following equation.
Χ² = Σ [ (Oi - Ei)² / Ei ]
where Oi is the observed frequency count for the i th level of the categorical variable, and Ei is
the expected frequency count for the i th level of the categorical variable.
* P-value. The P-value is the probability of observing a sample statistic as extreme as the
test statistic. Since the test statistic is a chi-square, use the Chi-Square Distribution
Calculator to assess the probability associated with the test statistic. Use the degrees of
freedom computed above.
Interpret Results
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null
hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting
the null hypothesis when the P-value is less than the significance level.
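A minimal sketch of the computation, using the baseball-card proportions from the example above (the observed counts and the helper name are invented for illustration):

```python
def chi_square_gof(observed, proportions):
    """Chi-square goodness-of-fit statistic and degrees of freedom."""
    n = sum(observed)
    expected = [n * p for p in proportions]          # Ei = n * pi
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1                   # DF = k - 1

# Company's claim: 30% rookies, 60% veterans, 10% All-Stars.
# The observed counts below are made up for illustration.
chi2, df = chi_square_gof([50, 45, 5], [0.30, 0.60, 0.10])
# With df = 2, the critical value at alpha = 0.05 is 5.991; chi2 is about
# 19.58, which exceeds it, so the claimed distribution would be rejected.
```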
Chi-Square Test for Homogeneity
This lesson explains how to conduct a chi-square test of homogeneity. The test is applied to a
single categorical variable from two different populations. It is used to determine whether
frequency counts are distributed identically across different populations.
For example, in a survey of TV viewing preferences, we might ask respondents to identify their
favorite program. We might ask the same question of two different populations, such as males
and females. We could use a chi-square test for homogeneity to determine whether male
viewing preferences differed significantly from female viewing preferences. The sample problem
at the end of the lesson considers this example.
The test procedure described in this lesson is appropriate when the following conditions are
met:
* For each population, the sampling method is simple random sampling.
* Each population is at least 10 times as large as its respective sample.
* The variable under study is categorical.
* If sample data are displayed in a contingency table (Populations x Category levels), the
expected frequency count for each cell of the table is at least 5.
This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3)
analyze sample data, and (4) interpret results.
State the Hypotheses
Every hypothesis test requires the analyst to state a null hypothesis and an alternative
hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if
one is true, the other must be false; and vice versa.
Suppose that data were sampled from r populations, and assume that the categorical variable
had c levels. At any specified level of the categorical variable, the null hypothesis states that
each population has the same proportion of observations. Thus,
H0: Plevel 1 of population 1 = Plevel 1 of population 2 = . . . = Plevel 1 of population r
H0: Plevel 2 of population 1 = Plevel 2 of population 2 = . . . = Plevel 2 of population r
. . .
H0: Plevel c of population 1 = Plevel c of population 2 = . . . = Plevel c of population r
The alternative hypothesis (Ha) is that at least one of the null hypothesis statements is false.
Formulate an Analysis Plan
The analysis plan describes how to use sample data to accept or reject the null hypothesis. The
plan should specify the following elements.
* Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or
0.10; but any value between 0 and 1 can be used.
* Test method. Use the chi-square test for homogeneity to determine whether observed
sample frequencies differ significantly from expected frequencies specified in the null
hypothesis. The chi-square test for homogeneity is described in the next section.
Analyze Sample Data
Using sample data from the contingency tables, find the degrees of freedom, expected
frequency counts, test statistic, and the P-value associated with the test statistic. The analysis
described in this section is illustrated in the sample problem at the end of this lesson.
* Degrees of freedom. The degrees of freedom (DF) is equal to:
DF = (r - 1) * (c - 1)
where r is the number of populations, and c is the number of levels for the categorical variable.
* Expected frequency counts. The expected frequency counts are computed separately
for each population at each level of the categorical variable, according to the following
formula.
Er,c = (nr * nc) / n
where Er,c is the expected frequency count for population r at level c of the categorical variable,
nr is the total number of observations from population r, nc is the total number of observations
at treatment level c, and n is the total sample size.
* Test statistic. The test statistic is a chi-square random variable (Χ²) defined by the
following equation.
Χ² = Σ [ (Or,c - Er,c)² / Er,c ]
where Or,c is the observed frequency count in population r for level c of the categorical variable,
and Er,c is the expected frequency count in population r for level c of the categorical variable.
* P-value. The P-value is the probability of observing a sample statistic as extreme as the
test statistic. Since the test statistic is a chi-square, use the Chi-Square Distribution
Calculator to assess the probability associated with the test statistic. Use the degrees of
freedom computed above.
Interpret Results
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null
hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting
the null hypothesis when the P-value is less than the significance level.
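The expected-count and test-statistic formulas above can be sketched as follows; the viewing-preference counts (rows for males and females, columns for three programs) and the function name are invented for illustration:

```python
def chi_square_homogeneity(table):
    """Chi-square statistic and DF for a table whose rows are populations
    and whose columns are the levels of one categorical variable."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]        # nr, per population
    col_tot = [sum(col) for col in zip(*table)]  # nc, per category level
    chi2 = 0.0
    for r, row in enumerate(table):
        for c, obs in enumerate(row):
            exp = row_tot[r] * col_tot[c] / n    # Er,c = (nr * nc) / n
            chi2 += (obs - exp) ** 2 / exp
    return chi2, (len(table) - 1) * (len(col_tot) - 1)

# Hypothetical counts: rows = males, females; columns = three TV programs
chi2, df = chi_square_homogeneity([[50, 30, 20], [30, 40, 30]])
```

Here chi2 is about 8.43 with df = 2, above the 0.05 critical value of 5.991, so these made-up data would suggest the two populations differ.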
Chi-Square Test for Independence
This lesson explains how to conduct a chi-square test for independence. The test is applied
when you have two categorical variables from a single population. It is used to determine
whether there is a significant association between the two variables.
For example, in an election survey, voters might be classified by gender (male or female) and
voting preference (Democrat, Republican, or Independent). We could use a chi-square test for
independence to determine whether gender is related to voting preference. The sample
problem at the end of the lesson considers this example.
The test procedure described in this lesson is appropriate when the following conditions are
met:
* The sampling method is simple random sampling.
* Each population is at least 10 times as large as its respective sample.
* The variables under study are each categorical.
* If sample data are displayed in a contingency table, the expected frequency count for
each cell of the table is at least 5.
This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3)
analyze sample data, and (4) interpret results.
State the Hypotheses
Suppose that Variable A has r levels, and Variable B has c levels. The null hypothesis states that
knowing the level of Variable A does not help you predict the level of Variable B. That is, the
variables are independent.
H0: Variable A and Variable B are independent.
Ha: Variable A and Variable B are not independent.
The alternative hypothesis is that knowing the level of Variable A can help you predict the level
of Variable B.
Note: Support for the alternative hypothesis suggests that the variables are related; but the
relationship is not necessarily causal, in the sense that one variable "causes" the other.
Formulate an Analysis Plan
The analysis plan describes how to use sample data to accept or reject the null hypothesis. The
plan should specify the following elements.
* Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or
0.10; but any value between 0 and 1 can be used.
* Test method. Use the chi-square test for independence to determine whether there is a
significant relationship between two categorical variables.
Analyze Sample Data
Using sample data, find the degrees of freedom, expected frequencies, test statistic, and the P-
value associated with the test statistic. The approach described in this section is illustrated in
the sample problem at the end of this lesson.
* Degrees of freedom. The degrees of freedom (DF) is equal to:
DF = (r - 1) * (c - 1)
where r is the number of levels for one categorical variable, and c is the number of levels for
the other categorical variable.
* Expected frequencies. The expected frequency counts are computed separately for each
level of one categorical variable at each level of the other categorical variable. Compute
r * c expected frequencies, according to the following formula.
Er,c = (nr * nc) / n
where Er,c is the expected frequency count for level r of Variable A and level c of Variable B, nr is
the total number of sample observations at level r of Variable A, nc is the total number of
sample observations at level c of Variable B, and n is the total sample size.
* Test statistic. The test statistic is a chi-square random variable (Χ²) defined by the
following equation.
Χ² = Σ [ (Or,c - Er,c)² / Er,c ]
where Or,c is the observed frequency count at level r of Variable A and level c of Variable B, and
Er,c is the expected frequency count at level r of Variable A and level c of Variable B.
* P-value. The P-value is the probability of observing a sample statistic as extreme as the
test statistic. Since the test statistic is a chi-square, use the Chi-Square Distribution
Calculator to assess the probability associated with the test statistic. Use the degrees of
freedom computed above.
Interpret Results
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null
hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting
the null hypothesis when the P-value is less than the significance level.
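The arithmetic is the same as in the homogeneity test, applied to the gender-by-voting-preference table from the example above. The counts below are entirely hypothetical:

```python
# Hypothetical counts: rows = gender (male, female);
# columns = voting preference (Democrat, Republican, Independent).
table = [[120, 90, 40], [110, 95, 45]]

n = sum(sum(row) for row in table)
col_tot = [sum(col) for col in zip(*table)]  # nc, per level of Variable B

# Chi-square statistic: sum of (Or,c - Er,c)^2 / Er,c over every cell,
# with Er,c = (nr * nc) / n
chi2 = sum(
    (obs - sum(row) * col_tot[c] / n) ** 2 / (sum(row) * col_tot[c] / n)
    for row in table
    for c, obs in enumerate(row)
)
df = (len(table) - 1) * (len(col_tot) - 1)
# chi2 is about 0.86 with df = 2, well below the 0.05 critical value of
# 5.991, so these made-up data show no significant association between
# gender and voting preference.
```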