DAY 3
14 Jan 2014
Today is
A. January 14, 2014
B. January 13, 2013
Recap
• Organizing Data
• Qualitative & Quantitative Data
• Frequency distribution & relative frequency distribution
• Single-value grouping, Limit grouping, Cut point grouping
• Histogram, Dotplots, Stem-and-leaf diagrams
Objective of the day:
• Distribution shapes
• Descriptive Measures => Central Measures => Mean, Median, Mode.
• Measures of Variation => Standard Deviation
Section 2.4
Distribution Shapes
Definition 2.10
Distribution of a Data Set
The distribution of a data set is a table, graph, or formula that provides the values of the observations and how often they occur.
Relative-frequency histogram and approximating smooth curve for the distribution of heights
Common distribution shapes
Relative-frequency histogram for household size
Identify the shape of the distribution.
Example:
Definition 2.12
Population and Sample Distributions; Distribution of a Variable
The distribution of population data is called the population distribution, or the distribution of the variable.
The distribution of sample data is called a sample distribution.
Population distribution and six sample distributions for household size
Key Facts: Population and Sample Distributions
For a simple random sample, the sample distribution approximates the population distribution (i.e., the distribution of the variable under consideration). The larger the sample size, the better the approximation tends to be.
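The key fact above can be explored with a small simulation. This is only a sketch: the population proportions below are hypothetical (they are not the household-size data from the slides), and drawing with `random.choices` treats the population as large enough that sampling with replacement is a reasonable stand-in for a simple random sample.

```python
import random
from collections import Counter

# Hypothetical population distribution of household size (an assumption for
# illustration only; the proportions sum to 1).
POPULATION = {1: 0.27, 2: 0.34, 3: 0.16, 4: 0.14, 5: 0.06, 6: 0.02, 7: 0.01}


def sample_distribution(n, rng):
    """Relative-frequency distribution of a random sample of size n."""
    sizes = list(POPULATION)
    weights = list(POPULATION.values())
    sample = rng.choices(sizes, weights=weights, k=n)
    counts = Counter(sample)
    return {size: counts.get(size, 0) / n for size in sizes}


def max_gap(sample_dist):
    """Largest absolute difference from the population proportions."""
    return max(abs(sample_dist[s] - POPULATION[s]) for s in POPULATION)


rng = random.Random(2014)  # fixed seed so the run is repeatable
for n in (10, 100, 1000, 10000):
    gap = max_gap(sample_distribution(n, rng))
    print(f"n = {n:5d}  largest gap from population = {gap:.3f}")
```

The printed gap tends to shrink as n grows, although any single small sample can happen to land close to or far from the population distribution.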
Chapter 3
Descriptive Measures
Numbers that describe a data set.
Section 3.1
Measures of Center
Measure of Center
Descriptive measures that indicate where the center or most typical value of a data set lies are called measures of central tendency or measures of center.
Three most important measures of center:
1. Mean
2. Median
3. Mode
Definition 3.1
Mean of a Data Set
The mean of a data set is the sum of the observations divided by the number of observations.
mean = sum of the observations / the number of observations.
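As a quick check of the definition, here is a minimal Python sketch that applies it to the 13 observations of Data Set I as they are listed in the median example of these slides. The function name `mean` is just an illustrative choice.

```python
# Data Set I, as listed in the median example of the slides (13 observations).
data_set_1 = [300, 300, 300, 300, 300, 300, 400, 400, 450, 450, 800, 940, 1050]


def mean(observations):
    """Sum of the observations divided by the number of observations."""
    return sum(observations) / len(observations)


print(round(mean(data_set_1), 2))  # 483.85  (6290 / 13)
```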
Data Set I and Data Set II
Example:
Means in Data Set I and Data Set II
Example:
Definition 3.2
Median of a Data Set
Arrange the data in increasing order.
• If the number of observations is odd, then the median is the observation exactly in the middle of the ordered list.
• If the number of observations is even, then the median is the mean of the two middle observations in the ordered list.
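In both cases, if we let n denote the number of observations, then the median is at position (n + 1) / 2 in the ordered list. When n is even, this position is not a whole number; it falls halfway between the two middle observations.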
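The two cases of the definition can be sketched in Python as follows. The function name `median` and the error raised for an empty list are illustrative assumptions; the data are the Data Set I values listed in the slides.

```python
def median(observations):
    """Median following the odd/even rule of Definition 3.2."""
    if not observations:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(observations)      # arrange the data in increasing order
    n = len(ordered)
    middle = n // 2                     # zero-based index of the upper middle
    if n % 2 == 1:
        return ordered[middle]          # odd n: the exact middle observation
    # even n: mean of the two middle observations
    return (ordered[middle - 1] + ordered[middle]) / 2


data_set_1 = [300, 300, 300, 300, 300, 300, 400, 400, 450, 450, 800, 940, 1050]
print(median(data_set_1))    # 400  (7th of 13 ordered observations)
print(median([1, 2, 3, 4]))  # 2.5  (mean of the two middle observations)
```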
Definition 3.3
Mode of a Data Set
Find the frequency of each value in the data set.
• If no value occurs more than once, then the data set has no mode.
• Otherwise, any value that occurs with the greatest frequency is a mode of the data set.
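A minimal Python sketch of this procedure is shown below. The function name `modes` and the choice to return an empty list when there is no mode are assumptions made for illustration.

```python
from collections import Counter


def modes(observations):
    """Return every mode of the data set, or [] if no value repeats."""
    counts = Counter(observations)      # frequency of each value
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:                    # no value occurs more than once
        return []
    return sorted(value for value, count in counts.items() if count == highest)


data_set_1 = [300, 300, 300, 300, 300, 300, 400, 400, 450, 450, 800, 940, 1050]
print(modes(data_set_1))      # [300]
print(modes([1, 2, 3]))       # []      no mode
print(modes([1, 1, 2, 2, 3])) # [1, 2]  two modes
```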
Median in Data Set I
Example:
Data Set I
300 300 300 300 300 300 400 400 450 450 800 940 1050
Median is at position (n + 1) / 2 = (13 + 1) / 2 = 7
Median = 400
Mode in Data Set I
Example:
Data Set I
300 300 300 300 300 300 400 400 450 450 800 940 1050
Mode = 300
Mean, Median, and Mode in Data Set I and Data Set II
Example:
Definition 3.4
COMPARISON OF MEAN, MEDIAN, MODE:
1. The mean is pulled in the direction of the skewness, i.e., toward the extreme observations. The mean is sensitive to extreme observations (very large or very small in comparison to the rest of the data), so it is not a resistant measure of center.
2. The median is not pulled toward the most extreme observations. The median is not sensitive to extremes, i.e., it is a resistant measure of center.
3. Therefore, when the data are skewed, the median is the preferred measure of center.
4. The mode may not be near the center and thus may not be useful as a measure of center.
Relative positions of the mean and median for (a) right-skewed, (b) symmetric, and (c) left-skewed distributions
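The difference between a resistant and a non-resistant measure can be seen in a short Python sketch. It uses Data Set I from the slides and a hypothetical variant in which the largest observation is replaced by a much more extreme value (10500 is an invented number, chosen only for illustration).

```python
import statistics

data_set_1 = [300, 300, 300, 300, 300, 300, 400, 400, 450, 450, 800, 940, 1050]

# Hypothetical variant: the largest observation becomes far more extreme.
with_outlier = data_set_1[:-1] + [10500]

for label, data in (("original", data_set_1), ("with outlier", with_outlier)):
    print(label, round(statistics.mean(data), 2), statistics.median(data))
# original 483.85 400
# with outlier 1210.77 400
```

The mean more than doubles while the median does not move, which is why the median is preferred for skewed data.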
Section 3.2
Measures of Variation
Five starting players on two basketball teams
Example:
Shortest and tallest starting players on the teams
Example:
Definition 3.5
Range of a Data Set
The range of a data set is given by the formula
Range = Max – Min,
where Max and Min denote the maximum and minimum observations, respectively.
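A minimal Python sketch of the range formula follows. Because the player heights from the basketball example are not reproduced in these notes, the sketch reuses Data Set I from Section 3.1; the function name `data_range` is an illustrative choice.

```python
def data_range(observations):
    """Range = Max - Min."""
    return max(observations) - min(observations)


data_set_1 = [300, 300, 300, 300, 300, 300, 400, 400, 450, 450, 800, 940, 1050]
print(data_range(data_set_1))  # 750  (1050 - 300)
```

The range depends only on the two most extreme observations, so it says nothing about how the remaining observations are spread out.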
Summation notation: ∑
$\sum_{N=1}^{10} N = 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10$
$\sum_{N=1}^{10} N = ?$
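The sum of the integers from 1 to 10 can be checked with a few lines of Python. The closed form n(n + 1)/2 used for comparison is a standard identity, not something stated on the slide.

```python
# range(1, 11) yields the integers 1 through 10 (the upper bound is exclusive).
terms = range(1, 11)
total = sum(terms)
print(total)  # 55

# Cross-check against the closed form n(n + 1) / 2 with n = 10.
closed_form = 10 * (10 + 1) // 2
print(total == closed_form)  # True
```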
Definition 3.6
Five starting players on basketball Team I
Example:
Five starting players on basketball Team II
Example:
Formula
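The formula itself did not survive in these notes. Assuming the slide presents the usual sample standard deviation, s = sqrt( Σ(x − x̄)² / (n − 1) ), the computation can be sketched in Python as below. The five heights are hypothetical values in inches, chosen only to keep the arithmetic easy to follow; they are not taken from the slides.

```python
import math
import statistics

# Hypothetical heights (inches) of five starting players; illustrative only.
heights = [72, 73, 76, 76, 78]


def sample_std(observations):
    """Sample standard deviation: sqrt(sum of squared deviations / (n - 1))."""
    n = len(observations)
    if n < 2:
        raise ValueError("at least two observations are required")
    x_bar = sum(observations) / n                       # mean = 75
    squared_devs = [(x - x_bar) ** 2 for x in observations]
    return math.sqrt(sum(squared_devs) / (n - 1))       # sqrt(24 / 4)


print(round(sample_std(heights), 2))         # 2.45
print(round(statistics.stdev(heights), 2))   # 2.45  (standard-library check)
```

Dividing by n − 1 rather than n is the convention for sample data; `statistics.pstdev` divides by n and would apply if the five players were treated as the whole population.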
Summary:
• Distribution shapes
• Descriptive Measures => Central Measures => Mean, Median, Mode.
• Measures of Variation => Standard Deviation
Next ...
• Lab: Finish Section 2.3 and Quiz 1 (1.1-2.3)
• Sections: 3.3 & 3.4
Thank You