chapter 11 psrm
![Page 1: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/1.jpg)
Chapter 11
Central Tendency
Dispersion
Statistical Inference
Hypothesis Testing
![Page 2: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/2.jpg)
Description
• We can describe data in a number of ways:
  – We could describe every observation, or every value in a data set (but this would be overwhelming and mostly unhelpful)
  – Alternatively, we could summarize the data:
    • Graphical summaries
      – Bar graphs, pie graphs, dot plots, etc.
    • Statistical summaries
      – Frequency distributions
      – Descriptive statistics
![Page 3: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/3.jpg)
Description
• Frequency distributions
  – A table that shows the number of observations having each value of a variable
  – May include other statistics like the relative frequency proportion, percentage, missing values, or odds ratios
• Descriptive statistics
  – Describing a large amount of data with just one number
![Page 4: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/4.jpg)
Description
• Two classes of descriptive statistics
  – Central tendency
  – Dispersion
![Page 5: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/5.jpg)
Central Tendency
• Measures of central tendency
  – Describe the typical case in a data set or distribution
  – Three statistics
    • Mode
    • Median
    • Mean
![Page 6: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/6.jpg)
Central Tendency
• Mode
  – Indicates the most common observation
  – Simply count the number of times you observe each value
  – Mode is resistant to outliers
    • By definition, the mode cannot be an outlier
    • Describes only a single value in the data
![Page 7: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/7.jpg)
Central Tendency
• Median
  – Describes the middle value in an ordered set of values
  – Important to rank order the observations first
  – Position of the median in the ordered list = (N+1)/2
  – With an even number of observations, average the two middle values
  – Resistant to outliers: by definition, the median is not an outlier
  – Includes only one value
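The position rule can be sketched in Python. This is only an illustration: the function name `median_by_position` is hypothetical, the first call uses the values of data set #2 introduced later in this chapter, and the second call uses made-up values to show the even-N case.

```python
def median_by_position(values):
    """Find the median by ranking the values and locating position (N + 1) / 2."""
    if not values:
        raise ValueError("median requires at least one observation")
    ordered = sorted(values)          # rank order the observations first
    n = len(ordered)
    position = (n + 1) / 2            # 1-based position of the median
    if n % 2 == 1:
        # Odd N: the position is a whole number, so take that value directly.
        return ordered[int(position) - 1]
    # Even N: the position falls halfway between two ranks; average them.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(median_by_position([10, 1, 5, 4, 5]))   # data set #2, deliberately unordered
print(median_by_position([1, 4, 5, 10]))      # hypothetical even-sized example
```

Note that (N+1)/2 gives a rank, not the median itself; the median is the value found at that rank.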
![Page 8: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/8.jpg)
Central Tendency
• Mean
  – Describes the average value
  – Mean = (∑Y)/N
  – Mean is not resistant to outliers
    • Outliers will pull the mean up or down, sometimes significantly
  – Computed using all values
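A small sketch of how an outlier moves the mean but not the median. The second list is a hypothetical variation (the largest value replaced by 100) and is not part of the original slides.

```python
from statistics import mean, median

typical = [1, 4, 5, 5, 10]         # values of data set #2 from this chapter
with_outlier = [1, 4, 5, 5, 100]   # hypothetical: largest value replaced by an outlier

# How far each statistic moves when the outlier is introduced.
mean_shift = mean(with_outlier) - mean(typical)
median_shift = median(with_outlier) - median(typical)

print("mean:", mean(typical), "->", mean(with_outlier))
print("median:", median(typical), "->", median(with_outlier))
```

The mean changes because it is computed from every value, while the median depends only on the middle of the ordered list.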
![Page 9: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/9.jpg)
Central Tendency
• Compute the mode, median, and mean for each of these data sets:

| i | Data set #1 (Y) | Data set #2 (Y) |
|---|---|---|
| 1 | 5 | 1 |
| 2 | 5 | 4 |
| 3 | 5 | 5 |
| 4 | 5 | 5 |
| 5 | 5 | 10 |
![Page 10: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/10.jpg)
Central Tendency
• Data set #1
  – Mode = 5
  – Median = 5
  – Mean = 5
• Data set #2
  – Mode = 5
  – Median = 5
  – Mean = 5
• Clearly, the two data sets are not identical
• But the measures of central tendency are the same for both, so they hide the difference
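The three statistics can be checked with Python's standard `statistics` module. This is a minimal sketch; the helper name `central_tendency` is hypothetical.

```python
from statistics import mean, median, mode

data_set_1 = [5, 5, 5, 5, 5]
data_set_2 = [1, 4, 5, 5, 10]


def central_tendency(values):
    """Return (mode, median, mean) for a list of numbers."""
    return mode(values), median(values), mean(values)


summary_1 = central_tendency(data_set_1)
summary_2 = central_tendency(data_set_2)
print(summary_1)
print(summary_2)
```

Both summaries come out the same even though the underlying values differ.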
![Page 11: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/11.jpg)
Dispersion
• What we need is some way to differentiate between data set #1 and data set #2.
• The typical values in each data set were the same.
• We need a measure that describes the other values in the data sets.
• Measures of dispersion indicate how the other values vary around the typical value.
![Page 12: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/12.jpg)
Dispersion
• Measures of dispersion
  – Range
  – Variance
  – Standard deviation
![Page 13: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/13.jpg)
Dispersion
• Range
  – One of the simplest measures of dispersion is the range.
  – Range = Y maximum – Y minimum
  – Describes the extremes of the data around the typical case.
![Page 14: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/14.jpg)
Dispersion
• Variance
  – The variance takes into account all of the values in the data set.
  – There are two formulas to calculate the variance:
    • One formula for the sample
    • One formula for the population
  – The only difference is that we subtract 1 from the sample size in the sample version of the equation.
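The two formulas are not spelled out in the text here. In standard notation they can be written as follows; the symbols $s^2$, $\sigma^2$, $\bar{Y}$, and $\mu$ are assumptions about notation, chosen to match the Y and N used for the mean earlier in the chapter.

```latex
% Sample variance: divide the sum of squared deviations by N - 1
s^2 = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}{N - 1}

% Population variance: divide the sum of squared deviations by N
\sigma^2 = \frac{\sum_{i=1}^{N} (Y_i - \mu)^2}{N}
```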
![Page 15: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/15.jpg)
![Page 16: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/16.jpg)
Dispersion
• Standard deviation
  – The standard deviation also takes into account all of the values in the data set.
  – There are also two formulas to calculate the standard deviation:
    • One formula for the sample
    • One formula for the population
  – Like variance, the only difference is that we subtract 1 from the sample size in the sample version of the equation.
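In standard notation, each standard deviation is the square root of the corresponding variance. As before, the symbols $s$, $\sigma$, $\bar{Y}$, and $\mu$ are assumed notation rather than something stated in the text.

```latex
% Sample standard deviation
s = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}{N - 1}}

% Population standard deviation
\sigma = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - \mu)^2}{N}}
```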
![Page 17: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/17.jpg)
![Page 18: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/18.jpg)
Dispersion
• Compute the range, sample variance, and sample standard deviation for each of these data sets:

| i | Data set #1 (Y) | Data set #2 (Y) |
|---|---|---|
| 1 | 5 | 1 |
| 2 | 5 | 4 |
| 3 | 5 | 5 |
| 4 | 5 | 5 |
| 5 | 5 | 10 |
![Page 19: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/19.jpg)
Dispersion
• Data set #1
  – Range = 0
  – Variance = 0
  – Standard deviation = 0
• Data set #2
  – Range = 9
  – Variance = 10.5
  – Standard deviation = 3.24
• Measures of dispersion indicate that the data sets are not the same.
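These values can be reproduced with the standard `statistics` module, whose `variance` and `stdev` functions use the sample (N – 1) formulas. A minimal sketch; the helper name `dispersion` is hypothetical.

```python
from statistics import stdev, variance

data_set_1 = [5, 5, 5, 5, 5]
data_set_2 = [1, 4, 5, 5, 10]


def dispersion(values):
    """Return (range, sample variance, sample standard deviation)."""
    value_range = max(values) - min(values)
    return value_range, variance(values), stdev(values)


dispersion_1 = dispersion(data_set_1)
dispersion_2 = dispersion(data_set_2)
print(dispersion_1)
print([round(x, 2) for x in dispersion_2])
```

For data set #2 the squared deviations from the mean of 5 are 16, 1, 0, 0, and 25, which sum to 42; dividing by N – 1 = 4 gives 10.5.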
![Page 20: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/20.jpg)
Dispersion
• Now try calculating the population versions of the variance and standard deviation for data set #2.
• Data set #2
  – Variance = ?
  – Standard deviation = ?
![Page 21: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/21.jpg)
Dispersion
• Data set #2
  – Variance = 8.4
  – Standard deviation = 2.90
• As you can see, the population variance and standard deviation are slightly smaller than the sample versions.
• The sample formulas divide by N – 1 instead of N, which makes the result slightly larger. This reflects our greater confidence in population data than in sample data.
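The population versions can be checked the same way. In Python's `statistics` module, `pvariance` and `pstdev` divide by N, while `variance` and `stdev` divide by N – 1. A minimal sketch:

```python
from statistics import pstdev, pvariance, stdev, variance

data_set_2 = [1, 4, 5, 5, 10]

# Sample versions divide the sum of squared deviations (42) by N - 1 = 4.
sample_var, sample_sd = variance(data_set_2), stdev(data_set_2)

# Population versions divide the same sum by N = 5.
population_var, population_sd = pvariance(data_set_2), pstdev(data_set_2)

print(round(sample_var, 2), round(sample_sd, 3))
print(round(population_var, 2), round(population_sd, 3))
```

The population standard deviation is the square root of 8.4, which is about 2.898.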
![Page 22: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/22.jpg)
Dispersion
• Variance and standard deviation
  – Variance is used in many different statistical applications.
  – The standard deviation is used more often than the variance to summarize data because the standard deviation is in the same units as the mean.
  – If data sets #1 and #2 describe miles per gallon, we could say that in data set #2 we have a mean of 5 miles per gallon and a sample standard deviation of 3.24 miles per gallon.
![Page 23: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/23.jpg)
Statistical Inference
– The normal distribution is our first choice in most cases because it has such wonderful properties:
  • Distribution is symmetrical around the mean
  • A known percentage of cases falls within each number of standard deviations of the mean
  • Can identify probability of values under the curve
  • A linear combination of independent, normally distributed variables is itself distributed normally
  • Central limit theorem
  • Normal distribution is symmetric and mesokurtic
– Great flexibility in using the normal distribution
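The link between standard deviations and percentages of cases can be illustrated with `statistics.NormalDist` (Python 3.8 or later). This is a sketch; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of cases within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

Roughly 68%, 95%, and 99.7% of cases fall within one, two, and three standard deviations of the mean.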
![Page 24: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/24.jpg)
Statistical Inference
![Page 25: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/25.jpg)
Statistical Inference
• We can calculate a z score for every observation in the data set.
• The z score allows us to compare each observation to the rest of the data set, relative to the mean.
• z score, or z of X: $z = (X - \mu) / \sigma$
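A sketch of computing a z score for every observation. It assumes the data are treated as a full population (so the population standard deviation is used) and borrows the values of data set #2 from earlier in the chapter; the function name `z_scores` is hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the z score of every observation: (X - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)   # assumption: treat the data as a full population
    if sigma == 0:
        raise ValueError("z scores are undefined when every value is identical")
    return [(x - mu) / sigma for x in values]


scores = z_scores([1, 4, 5, 5, 10])
print([round(z, 2) for z in scores])
```

Observations equal to the mean get a z score of 0, and the sign shows whether a value lies above or below the mean.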
![Page 26: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/26.jpg)
Statistical Inference
• Example:
  • $\mu = 64$; $\sigma = 2.4$; $X_i = 70$ or more
• $z = (X - \mu) / \sigma$
![Page 27: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/27.jpg)
Statistical Inference
• Example:
  • $\mu = 64$; $\sigma = 2.4$; $X_i = 70$ or more
• $z = (X - \mu) / \sigma$
• $z = (70 - 64) / 2.4$
• $z = 6 / 2.4$
• $z = 2.5$ for 70 contacts
• $p = .0062$, or 0.62%
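The same calculation can be sketched in Python, reading p as the upper-tail probability of observing 70 or more. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, x = 64, 2.4, 70

# Standardize the observation, then take the area to the right of z.
z = (x - mu) / sigma
p_at_least_x = 1 - NormalDist().cdf(z)

print(round(z, 2), round(p_at_least_x, 4))
```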
![Page 28: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/28.jpg)
Hypothesis Testing
• How do you test hypotheses with statistics?
• Comparing the means of two groups
  – Consider an experiment
    • Research hypothesis: $\bar{X}_1 \neq \bar{X}_2$
    • Null hypothesis: $\bar{X}_1 = \bar{X}_2$
![Page 29: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/29.jpg)
Hypothesis Testing
• Type 1 error
  – State of the world: Research hypothesis is false
  – Incorrect rejection of null
• Type 2 error
  – State of the world: Research hypothesis is true
  – Incorrect acceptance of null
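A Type 1 error can be illustrated with a small simulation. Every setting here is a hypothetical choice for illustration: samples are drawn from a world where the null hypothesis is true, each is tested with a one-sample z test, and the null is rejected when |z| exceeds 1.96 (the two-tailed 95% critical value).

```python
import random
from math import sqrt
from statistics import mean

random.seed(0)                                   # fixed seed so the run is repeatable
MU, SIGMA, N, Z_CRITICAL = 0.0, 1.0, 25, 1.96    # hypothetical settings
TRIALS = 2000

false_rejections = 0
for _ in range(TRIALS):
    # Draw a sample from a world where the null hypothesis is true.
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    z = (mean(sample) - MU) / (SIGMA / sqrt(N))
    if abs(z) > Z_CRITICAL:                      # rejecting a true null is a Type 1 error
        false_rejections += 1

type_1_rate = false_rejections / TRIALS
print(type_1_rate)
```

Under these assumptions the share of false rejections should land near 5%, the error rate implied by a 95% confidence level.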
![Page 30: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/30.jpg)
Hypothesis Testing
• Hypothesis: College students are less likely to read political news stories than are other voting-age citizens.
• $\bar{X} = 5$; $\mu = 10$; $\sigma = 2$; $n = 25$
• $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
![Page 31: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/31.jpg)
Hypothesis Testing
• Hypothesis: College students are less likely to read political news stories than are other voting-age citizens.
• $\bar{X} = 5$; $\mu = 10$; $\sigma = 2$; $n = 25$
• $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
• $z = \dfrac{5 - 10}{2 / \sqrt{25}}$
• $z = \dfrac{-5}{0.4}$
• $z = -12.5$
![Page 32: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/32.jpg)
Hypothesis Testing
• Hypothesis: College students are less likely to read political news stories than are other voting-age citizens.
• 95% confidence
• z critical = 1.96
• $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
• $z = \dfrac{5 - 10}{2 / \sqrt{25}}$
• $z = \dfrac{-5}{0.4}$
• $z = -12.5$
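The z test above can be sketched in Python. The function name `one_sample_z` is hypothetical, and the decision rule (reject the null when |z| exceeds the critical value) is an assumption about how the slide's comparison is meant to be read.

```python
from math import sqrt


def one_sample_z(sample_mean, population_mean, population_sd, n):
    """z = (sample mean - population mean) / (population sd / sqrt(n))."""
    return (sample_mean - population_mean) / (population_sd / sqrt(n))


z = one_sample_z(sample_mean=5, population_mean=10, population_sd=2, n=25)
z_critical = 1.96                     # 95% confidence, value from the slide
reject_null = abs(z) > z_critical     # assumed decision rule

print(round(z, 2), reject_null)
```

Because 12.5 is far larger than 1.96 in absolute value, this rule rejects the null hypothesis.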
![Page 33: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/33.jpg)
Hypothesis Testing
• Hypothesis: College students rate liberal candidates higher than do the rest of the voting population.
• $\bar{X} = 52$; $\mu = 50$; $s = 5$; $n = 25$
• $t = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$
![Page 34: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/34.jpg)
Hypothesis Testing
• Hypothesis: College students rate liberal candidates higher than do the rest of the voting population.
• $\bar{X} = 52$; $\mu = 50$; $s = 5$; $n = 25$
• $t = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$
• $t = \dfrac{52 - 50}{5 / \sqrt{25}}$
• $t = \dfrac{2}{1}$
• $t = 2$
![Page 35: Chapter 11 Psrm](https://reader034.vdocument.in/reader034/viewer/2022051609/547f2415b4af9f70218b4797/html5/thumbnails/35.jpg)
Hypothesis Testing
• Hypothesis: College students rate liberal candidates higher than do the rest of the voting population.
• Two-tailed test; .05 level; n – 1 = 24 df
• t critical = 2.064
• $t = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$
• $t = \dfrac{52 - 50}{5 / \sqrt{25}}$
• $t = \dfrac{2}{1}$
• $t = 2$
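The t test can be sketched the same way. The function name `one_sample_t` is hypothetical, the standard deviation of 5 is assumed to be the sample estimate, the critical value is taken from the slide rather than computed, and the decision rule (reject the null when |t| exceeds the critical value) is an assumption.

```python
from math import sqrt


def one_sample_t(sample_mean, population_mean, sample_sd, n):
    """t = (sample mean - population mean) / (sample sd / sqrt(n))."""
    return (sample_mean - population_mean) / (sample_sd / sqrt(n))


n = 25
t = one_sample_t(sample_mean=52, population_mean=50, sample_sd=5, n=n)
degrees_of_freedom = n - 1
t_critical = 2.064                    # two-tailed, .05 level, 24 df (from the slide)
reject_null = abs(t) > t_critical     # assumed decision rule

print(t, degrees_of_freedom, reject_null)
```

Here t = 2 falls just short of 2.064, so under this rule the null hypothesis is not rejected at the .05 level.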