Descriptive Statistics
Measures of Center
Essentials: Measures of Center (The great mean vs. median conundrum.)
Be able to identify the characteristics of the median, mean and mode, and to which types of data each applies.
Be able to calculate the median, mean and mode, as appropriate, for a set of data.
Affected by vs. resistant to extreme values. What are the implications for the mean and median?
Some Notation
$\Sigma$ denotes the addition of a set of values
$X$ (capital) is the variable usually used to represent the individual data values
$x_i$ (small letter) represents a single value of a variable, from the first value, $x_1$, to the last value, $x_n$
$n$ represents the number of data values in a sample
$N$ represents the number of data values in a population
Measures of Center
Measures of Central Tendency
Indicate where the center or most typical value of a data set lies
Are often thought of as averages
Include the Mean, Median, Mode, and Midrange
The Mean (Arithmetic)
The “average” of a set of data.
Is the sum of the observations divided by the number of observations.
Is used only with quantitative data.
The Formula:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Population Mean vs. Sample Mean
A Sample Mean is represented by the lower case letter x with a bar above it (called x-bar)
A Population Mean is represented by the lower case Greek letter $\mu$ (mu)
$$\mu = \frac{\sum x}{N} \qquad \bar{x} = \frac{\sum x}{n}$$
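As a minimal sketch of the definition above (the sum of the observations divided by the number of observations), the mean can be computed as follows. The function name `mean` and the three sample values are hypothetical, chosen only for illustration. The arithmetic is the same for a sample mean and a population mean; only the notation differs.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    if len(values) == 0:
        raise ValueError("the mean needs at least one data value")
    return sum(values) / len(values)


# Hypothetical sample of n = 3 data values (not taken from the slides).
sample = [2, 4, 9]
x_bar = mean(sample)
print(x_bar)  # 5.0
```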
Median
The middle observation in a set of data.
Divides the data such that 50% of the observations lie below the median and 50% lie above it.
Is used only with quantitative data.
To obtain the median, the data must be placed in increasing order.
MEDIAN: The Formula
If there is an ODD number of scores, the middle score is the value of the Median. e.g.: 1, 3, 6 => the Median's position is (n+1)/2 = (3+1)/2 = 2. So, the Median is the value in the second position of the list of values. Here the second value is the number 3.
If there is an EVEN number of scores, the Median lies between the two middle scores. e.g.: 1, 2, 8, 15 => the Median's position is (n+1)/2 = (4+1)/2 = 2.5. So, the Median is the value that lies halfway between the second and third data values. Here that value is (2+8)/2 = 5.
First: Arrange the scores in increasing order. Second: Apply the formula (n+1)/2. (Where n is the number of data values.)
Remember, the formula computes a position, not a data value.
Calculating a Median:
Determine the median for the following backpack weights:
Backpack weights (lb): 10, 14, 12, 18, 32, 15, 22, 19, 23, 61.
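One way to work this exercise is sketched below, following the two steps on the previous slide: sort the data, then use the position (n+1)/2. The function name `median` is an illustrative choice, not part of the slides.

```python
def median(values):
    """Median via the (n + 1) / 2 position rule (positions are 1-based)."""
    ordered = sorted(values)            # First: arrange in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median needs at least one data value")
    position = (n + 1) / 2              # Second: a position, not a data value
    lower = ordered[int(position) - 1]  # value at the whole-number position
    if position == int(position):       # odd n: the position is a whole number
        return lower
    upper = ordered[int(position)]      # even n: average the two middle values
    return (lower + upper) / 2


backpack_weights = [10, 14, 12, 18, 32, 15, 22, 19, 23, 61]
print(median(backpack_weights))  # 18.5
```

With n = 10 the position is 5.5, so the median lies halfway between the fifth and sixth sorted values, 18 and 19.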
MODE: The Formula
The most frequently occurring score in a data set. Obtain the frequency of each value.
A Frequency Table based upon Single-Value Grouping or a Dot Plot would display this information.
Used with both qualitative and quantitative data. It is the only measure of center for qualitative data.
There may be more than one Mode. If there are two modes, the data set is bimodal. If there are more than two modes, the data set is multimodal. If every value occurs the same number of times, then there is no mode.
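The rules above can be sketched with a frequency count. The function name `modes` and the small example lists are hypothetical. One assumption is made beyond the slides: a data set containing only one distinct value is treated as having that value as its mode.

```python
from collections import Counter


def modes(values):
    """Return every most-frequent value; an empty list means 'no mode'."""
    counts = Counter(values)            # frequency of each value
    if not counts:
        return []
    highest = max(counts.values())
    # Every value occurs equally often (and there are several values): no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes(["red", "blue", "red"]))  # ['red']  (qualitative data)
print(modes([1, 1, 2, 2, 3]))         # [1, 2]   (bimodal)
print(modes([1, 2, 3]))               # []       (no mode)
```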
Example: Comparing the Mean, Median, and Mode
Find the mean, median, and mode of the sample ages of a class shown. Which measure of central tendency best describes a typical entry of this data set? Are there any outliers?
Source: Larson/Farber 4th ed.
Ages in a class
20 20 20 20 20 20 21
21 21 21 22 22 22 23
23 23 23 24 24 65
Solution: Comparing the Mean, Median, and Mode
Mean: $\bar{x} = \frac{\sum x}{n} = \frac{20 + 20 + \dots + 24 + 65}{20} \approx 23.8$ years
Median: $\frac{21 + 22}{2} = 21.5$ years
Mode: 20 years (the entry occurring with the greatest frequency)
Solution: Comparing the Mean, Median, and Mode
Mean ≈ 23.8 yrs. Median = 21.5 yrs. Mode = 20 yrs.
• The mean takes every entry into account, but is influenced by the outlier of 65.
• The median here was determined by taking the middle two entries into account, and it is not affected by the outlier.
• In this case the mode exists, but it doesn't appear to represent a typical entry.
Solution: Comparing the Mean, Median, and Mode
Sometimes a graphical comparison can help you decide which measure of central tendency best represents a data set.
In this case, it appears that the median best describes the data set.
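A quick numerical check of this example is sketched below using Python's standard `statistics` module. The helper name `summarize` is hypothetical. Recomputing the measures without the outlier of 65 is an added illustration, not part of the original example.

```python
import statistics

# Ages in a class, as listed in the example (20 entries).
ages = [20] * 6 + [21] * 4 + [22] * 3 + [23] * 4 + [24] * 2 + [65]


def summarize(data):
    """Mean, median and mode of a data set, collected in a dict."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
    }


with_outlier = summarize(ages)
without_outlier = summarize([age for age in ages if age != 65])

print(with_outlier)     # mean 23.75, median 21.5, mode 20
print(without_outlier)  # the mean drops to about 21.6, the median to 21
```

Dropping the single entry of 65 moves the mean by more than two years but moves the median by only half a year, which is what "resistant to extreme values" means in practice.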
Mean vs. Median vs. Mode
Which is the best Measure of Center?
MEAN: Is sensitive to the influence of extreme scores (outliers), which will “pull” the mean away from the center. Involves ALL data values in the calculation.
MEDIAN: Is resistant to the influence of extreme values. Only uses One or Two points in its calculation.
MODE: May not be anywhere near the center of the data. Not really aimed at finding the middle of the data. Is the ONLY “Measure of Center” for Qualitative Data.
Midrange
The Midrange is a measure of center of a distribution. It indicates the value midway between the highest and lowest values in a data set. To find the midrange:
$$\text{Midrange} = \frac{\text{Highest Value} + \text{Lowest Value}}{2}$$
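As a small sketch, the midrange is the sum of the highest and lowest values divided by 2. The function name `midrange` is an illustrative choice. The data are the class ages from the earlier example.

```python
def midrange(values):
    """Value midway between the highest and lowest data values."""
    if len(values) == 0:
        raise ValueError("the midrange needs at least one data value")
    return (max(values) + min(values)) / 2


# Ages in a class, reused from the earlier example.
ages = [20] * 6 + [21] * 4 + [22] * 3 + [23] * 4 + [24] * 2 + [65]
print(midrange(ages))  # 42.5
```

Because the midrange uses only the two most extreme values, the single age of 65 places it far above every other age in the class.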
Additional Topics
Weighted Means
Weighted Mean – a mean computed with different scores assigned different weights. To find the weighted mean:
$$\bar{x} = \frac{\sum (x \cdot w)}{\sum w}$$
Example: Finding a Weighted Mean
You are taking a class in which your grade is determined from five sources: 50% from your test mean, 15% from your midterm, 20% from your final exam, 10% from your computer lab work, and 5% from your homework. Your scores are 86 (test mean), 96 (midterm), 82 (final exam), 98 (computer lab), and 100 (homework). What is the weighted mean of your scores? If the minimum average for an A is 90, did you get an A?
Source: Larson/Farber 4th ed.
Solution: Finding a Weighted Mean
Source         Score, x   Weight, w   x∙w
Test Mean      86         0.50        86(0.50) = 43.0
Midterm        96         0.15        96(0.15) = 14.4
Final Exam     82         0.20        82(0.20) = 16.4
Computer Lab   98         0.10        98(0.10) = 9.8
Homework       100        0.05        100(0.05) = 5.0
                          Σw = 1      Σ(x∙w) = 88.6
$$\bar{x} = \frac{\sum (x \cdot w)}{\sum w} = \frac{88.6}{1} = 88.6$$
Your weighted mean for the course is 88.6. You did not get an A.
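The same calculation can be sketched in code: multiply each score by its weight, add the products, and divide by the sum of the weights. The function name `weighted_mean` and the variable names are assumptions made for illustration.

```python
def weighted_mean(scores, weights):
    """Sum of (score * weight) divided by the sum of the weights."""
    if len(scores) != len(weights):
        raise ValueError("each score needs exactly one weight")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(x * w for x, w in zip(scores, weights)) / total_weight


# Test mean, midterm, final exam, computer lab, homework.
scores = [86, 96, 82, 98, 100]
weights = [0.50, 0.15, 0.20, 0.10, 0.05]

course_mean = weighted_mean(scores, weights)
earned_a = course_mean >= 90  # minimum average for an A is 90

print(round(course_mean, 1), earned_a)
```

Dividing by the sum of the weights means the weights do not have to add up to 1, which is useful when they are counts such as credit hours.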
Weighted Means Example
Calculating a GPA.
Given the following four grades, calculate the semester GPA.
Statistics A (of course; 3 CrHrs; numeric value for an A = 4)
History B (3 CrHr; B = 3)
Physics C (3 CrHr; C = 2)
Physical Education C (1 CrHr)
The grade numeric equivalents are the x values. The credit hour values are the weights.
Calculate the student’s GPA.
$$\bar{x} = \frac{\sum (x \cdot w)}{\sum w}$$
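One way to work this exercise is sketched below, with the grade values as x and the credit hours as weights. The tuple layout and variable names are illustrative choices. The C in Physical Education is taken as 2, the same value the slide gives for the C in Physics.

```python
# (course, grade value x, credit hours w)
grades = [
    ("Statistics", 4, 3),          # A = 4
    ("History", 3, 3),             # B = 3
    ("Physics", 2, 3),             # C = 2
    ("Physical Education", 2, 1),  # C = 2, as for Physics
]

quality_points = sum(x * w for _, x, w in grades)  # sum of (x * w)
credit_hours = sum(w for _, _, w in grades)        # sum of w
gpa = quality_points / credit_hours

print(quality_points, credit_hours, gpa)  # 29 10 2.9
```

The one-credit course pulls the GPA down less than a three-credit C would, because its weight is smaller.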
Finding a Mean From a Frequency Table (Grouped Data)
When we view data in a frequency table, it is impossible to know the exact values falling in a particular class, so the mean can only be approximated. To approximate it, obtain the product of each frequency and class midpoint (here “x”), add the products, and then divide by the sum of the frequencies.
$$\bar{x} = \frac{\sum (f \cdot x)}{\sum f}$$
Finding the Mean of a Frequency Distribution
In Words / In Symbols
1. Find the midpoint of each class: $x = \frac{(\text{lower limit}) + (\text{upper limit})}{2}$
2. Find the sum of the products of the midpoints and the frequencies: $\sum (x \cdot f)$
3. Find the sum of the frequencies: $n = \sum f$
4. Find the mean of the frequency distribution: $\bar{x} = \frac{\sum (x \cdot f)}{n}$
Example: Find the Mean of a Frequency Distribution
Use the frequency distribution to approximate the mean number of minutes that a sample of Internet subscribers spent online during their most recent session.
Source: Larson/Farber 4th ed.
Class     Midpoint   Frequency, f
7 – 18    12.5       6
19 – 30   24.5       10
31 – 42   36.5       13
43 – 54   48.5       8
55 – 66   60.5       5
67 – 78   72.5       6
79 – 90   84.5       2
Example: Find the Mean of a Frequency Distribution
Class     Midpoint, x   Frequency, f   (x∙f)
7 – 18    12.5          6              12.5∙6 = 75.0
19 – 30   24.5          10             24.5∙10 = 245.0
31 – 42   36.5          13             36.5∙13 = 474.5
43 – 54   48.5          8              48.5∙8 = 388.0
55 – 66   60.5          5              60.5∙5 = 302.5
67 – 78   72.5          6              72.5∙6 = 435.0
79 – 90   84.5          2              84.5∙2 = 169.0
                        n = 50         Σ(x∙f) = 2089.0
$$\bar{x} = \frac{\sum (x \cdot f)}{n} = \frac{2089}{50} \approx 41.8 \text{ minutes}$$
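The four steps for a grouped mean (class midpoints, sum of midpoint times frequency, sum of frequencies, then the quotient) can be sketched as follows. The function name `grouped_mean` and the tuple layout are assumptions made for illustration. The class limits and frequencies are the ones from the Internet-subscriber example.

```python
def grouped_mean(classes):
    """Approximate mean of grouped data.

    `classes` holds (lower limit, upper limit, frequency) tuples.
    """
    sum_xf = 0.0
    n = 0
    for lower, upper, f in classes:
        x = (lower + upper) / 2  # step 1: class midpoint
        sum_xf += x * f          # step 2: running sum of x * f
        n += f                   # step 3: running sum of frequencies
    if n == 0:
        raise ValueError("the frequencies must not sum to zero")
    return sum_xf / n            # step 4: mean of the distribution


minutes_online = [
    (7, 18, 6), (19, 30, 10), (31, 42, 13), (43, 54, 8),
    (55, 66, 5), (67, 78, 6), (79, 90, 2),
]
print(grouped_mean(minutes_online))  # 41.78
```

The result is only an approximation of the true sample mean, since every value in a class is treated as if it sat at the class midpoint.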