chapter 3 data description. outline 3-1introduction 3-2measures of central tendency 3-3measures of...
TRANSCRIPT
![Page 1: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/1.jpg)
CHAPTER 3Data Description
![Page 2: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/2.jpg)
OUTLINE
• 3-1 Introduction• 3-2 Measures of Central Tendency• 3-3 Measures of Variation• 3-4 Measures of Position• 3-5 Exploratory Data Analysis
![Page 3: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/3.jpg)
OBJECTIVES
• Summarize data using the measures of central tendency, such as the mean, median, mode, and midrange.
• Describe data using the measures of variation, such as the range, variance, and standard deviation.
![Page 4: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/4.jpg)
OBJECTIVES
• Identify the position of a data value in a data set using various measures of position, such as percentiles, deciles and quartiles.
• Use the techniques of exploratory data analysis, including stem and leaf plots, box plots, and five-number summaries to discover various aspects of data.
![Page 5: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/5.jpg)
3-1 Introduction
• A statistic is a characteristic or measure obtained by using the data values from a sample.
• A parameter is a characteristic or measure obtained by using the data values from a specific population.
![Page 6: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/6.jpg)
3-2 Measures of Central Tendency
Mean
Median
Mode
Mid-range
![Page 7: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/7.jpg)
3-2 The mean (arithmetric average)
• The meanmean is defined to be the sum of the data values divided by the total number of values.
![Page 8: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/8.jpg)
3-2 The Sample Mean
• The symbol X represents the sample mean. X is read as “X - bar”. The Greek symbol Σ is read as “sigma” and means “to sum”.
𝑋ത= 𝑋1 + 𝑋2 + ⋯+ 𝑋𝑛𝑛
= σ𝑋𝑛
![Page 9: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/9.jpg)
The following data represent the annual chocolate sales (in millions of RM) for a sample of seven states in Malaysia. Find the mean.
RM2.0, 4.9, 6.5, 2.1, 5.1, 3.2, 16.6
![Page 10: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/10.jpg)
million 5.77 RM =
7
16.6+ 3.2+5.1+2.1+6.5 +4.9+ 2.0 =
nX
=X
![Page 11: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/11.jpg)
3-2 The Population Mean
• The Greek symbol µ represents the population mean. The symbol µ is read as “mu”. N is the size of the finite population.
µ = 𝑋1 + 𝑋2 + ⋯+ 𝑋𝑛𝑁
= σ𝑋𝑁
![Page 12: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/12.jpg)
A small company consists of the owner, the manager, the salesperson, and two technicians. Their salaries are listed as RM 50,000, 20,000, 12,000, 9,000 and 9,000 respectively. Assume this is the population, find the mean.
![Page 13: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/13.jpg)
20,000 RM = 5
9,000+9,000 +12,000 +20,000+ 50,000 =
=N
X
![Page 14: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/14.jpg)
In a random sample of 7 ponds, the number of fishes were recorded as the following, find the mean.
23 56 45 36 28 3337
![Page 15: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/15.jpg)
3-2 The Sample Mean for an Ungrouped Frequency Distribution
• The mean for an ungrouped frequency distribution is given by
n)Xf(
=X
f = frequency of the corresponding value X n = f
![Page 16: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/16.jpg)
The scores of 25 students on a 4-point quiz are given in the table. Find the mean score.
Score, X Frequency, f
0 2
1 4
2 12
3 4
4 3
![Page 17: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/17.jpg)
Score, X Frequency, f f X
0 2 0
1 4 4
2 12 24
3 4 12
4 3 12
08.225
52 ==
n
XfX
![Page 18: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/18.jpg)
The number of balls in 17 bags were counted. Find the mean
Number of balls Frequency
5 5
6 4
7 2
8 6
![Page 19: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/19.jpg)
3-2 The Sample Mean for a Grouped Frequency Distribution
• The mean for grouped frequency distribution is given by
Xm = class midpoint = (UCL + LCL) / 2 n
XfX
m ) (=
![Page 20: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/20.jpg)
The lengths of 40 bean pods were showed in the table. Find the mean.
Length (cm) Frequency, f
4-8 2
9-13 4
14-18 7
19-23 14
24-28 8
29-33 5
![Page 21: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/21.jpg)
Length (cm)
Frequency, f Midpoint, Xm f Xm
4-8 2 6 12
9-13 4 11 44
14-18 7 16 112
19-23 14 21 294
24-28 8 26 208
29-33 5 31 155
![Page 22: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/22.jpg)
6.20 40
825=
=
825 =
1552082941124412
n
XfX
Xf
m
m
![Page 23: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/23.jpg)
Time (in minutes) that needed by a group of students to complete a game are shown as below. Find the mean.
Time (mins) Frequency
1-3 2
4-6 4
7-9 8
10-12 5
13-15 1
![Page 24: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/24.jpg)
ሺ𝑓∙𝑋ሻ≠ 𝑓 ∙ 𝑋
ሺ𝑓∙𝑋𝑚ሻ≠ 𝑓 ∙ 𝑋𝑚
![Page 25: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/25.jpg)
Mean = (f•X) / n = 23 / 12 = 1.92
Mean = (f • X) / n = (12 x 10) / 12 = 10
Score, X Frequency, f f•X
0 1 0
1 3 3
2 5 10
3 2 6
4 1 4
X = 10 n = f = 12 (f•X) = 23
![Page 26: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/26.jpg)
3-2 The Median
• When a data set is ordered, it is known as a data array.
• The median is defined to be the midpoint of the data array.
• The symbol used to denote the median is MD.
![Page 27: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/27.jpg)
The ages of seven preschool children are
1, 3, 4, 2, 3, 5, and 1. Find the median.
1. Arrange the data in order.
2. Select the middle point.
![Page 28: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/28.jpg)
![Page 29: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/29.jpg)
• In the previous example, there was an odd number of values in the data set.
• In this case it is easy to select the middle number in the data array.
• When there is an even number of values in the data set, the median is obtained by taking the average of the two middle numbers.
![Page 30: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/30.jpg)
Six customers purchased these numbers of magazines: 1, 7, 3, 2, 3, 4. Find the median.
1. Arrange the data in order.
2. Select the middle point.
![Page 31: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/31.jpg)
![Page 32: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/32.jpg)
3-2 The Median - Ungrouped Frequency Distribution
• For an ungrouped frequency distribution,
find the median by examining the cumulative frequencies to locate the middle value.
• If n is the sample size, compute n/2. Locate the data point where n/2 values fall below and n/2 values fall above.
![Page 33: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/33.jpg)
LRJ Appliance recorded the number of VCRs sold per week over a one-year period.
No. Sets Sold Frequency
1 4
2 9
3 6
4 2
5 3
Total 24
![Page 34: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/34.jpg)
• To locate the middle point, divide n by 2; 24/2 = 12.
• Locate the point where 12 values would fall below and 12 values will fall above.
• Consider the cumulative distribution.
• The 12th and 13th values fall in class 2. Hence MD = 2.
![Page 35: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/35.jpg)
This class contains the 5th through the 13th values.
No. Sets Sold Frequency Cumulative Frequency
1 4 4
2 9 13
3 6 19
4 2 21
5 3 24
Median
![Page 36: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/36.jpg)
3-2 The Median - Grouped
Frequency Distribution
• For grouped frequency distribution, find the median by using the formula as shown below:
Median, MD = Lm + (W)
n = sum of frequencies
cf = cumulative frequency of the class immediately preceding the median class
f = frequency of the median class
w = class width of the median class
Lm = Lower class boundary of the median class
![Page 37: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/37.jpg)
Find the median by using the following data.
Class Frequency, f
16 - 20 3
21 - 25 5
26 - 30 4
31 - 35 3
36 - 40 2
![Page 38: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/38.jpg)
Class Frequency, f Cumulative Frequency
16 - 20 3 3
21 - 25 5 8
26 - 30 4 12
31 - 35 3 15
36 - 40 2 17
![Page 39: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/39.jpg)
• To locate the halfway point, divide n by 2;
17/2 = 8.5 ≈ 9.
• Find the class that contains the 9th value. This will be the median class.
• Consider the cumulative distribution.
The median class will then be 26-30.
![Page 40: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/40.jpg)
=17 = = = –25.5 =5
( )
( ) = (17/ 2) – 8
4 = 26.125.
ncffwL
MDn cf
fw L
m
m
8430.525.5
25 25.5
( ) .
![Page 41: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/41.jpg)
Find the median by using the following data.
Class Frequency, f
5-14 5
15-24 7
25-34 19
35-44 17
45-54 7
![Page 42: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/42.jpg)
3-2 The Mode
• The mode is defined to be the value that occurs most often in a data set.
• A data set can have more than one mode.
• A data set is said to have no mode if all values occur with equal frequency.
![Page 43: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/43.jpg)
Find the mode for the number of children per family for 10 selected families.
Data set: 2, 3, 5, 2, 2, 1, 6, 4, 7, 3.
Ordered set: 1, 2, 2, 2, 3, 3, 4, 5, 6, 7.
Mode: 2.
![Page 44: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/44.jpg)
• Six strains of bacteria were tested to see how long they could remain alive outside their normal environment. The time, in minutes, is given below. Find the mode.
• Data set: 2, 3, 5, 7, 8, 10.
• There is no mode since each data value occurs equally with a frequency of one.
![Page 45: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/45.jpg)
• Eleven different automobiles were tested at a speed of 15 mph for stopping distances. The distance, in feet, is given below. Find the mode.
• Data set: 15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26.
• There are two modes (bimodal). The values are 18 and 24.
![Page 46: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/46.jpg)
3-2 The Mode – Ungrouped Frequency Distribution
Values Frequency, f
15 3
20 5
25 8
30 3
35 2
Mode Highest frequency
![Page 47: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/47.jpg)
• The mode for grouped data is the modal class.
• The modal class is the class with the largest frequency.
![Page 48: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/48.jpg)
Values Frequency, f
15.5 - 20.5 3
20.5 - 25.5 5
25.5 - 30.5 7
30.5 - 35.5 3
35.5 - 40.5 2
Modal Class
Highest frequency
3-2 The Mode – Grouped Frequency Distribution
![Page 49: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/49.jpg)
3-2 The Midrange
The midrange is found by adding the lowest and highest values in the data set and dividing by 2.
The midrange is a rough estimate of the middle value of the data.
The symbol that is used to represent the midrange is MR.
![Page 50: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/50.jpg)
![Page 51: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/51.jpg)
![Page 52: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/52.jpg)
3-2 Distribution Shapes
Frequency distributions can assume many shapes.
Three are three most important shapes:
i) Positively skewed
ii) Symmetrical
iii) Negatively skewed
![Page 53: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/53.jpg)
Positively SkewedPositively Skewed
X
Y
Mode < Median < Mean
Positively Skewed
![Page 54: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/54.jpg)
SymmetricalSymmetrical
n
Y
X
Symmetrical
Mean = Media = Mode
![Page 55: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/55.jpg)
Negatively SkewedNegatively Skewed
Y
X
Negatively Skewed
Mean < Median < Mode
![Page 56: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/56.jpg)
3-3 Measures of Variation
Measure the variation and dispersion of
the data.
Range
Variance
Standard Deviation
Coefficient of Variation
![Page 57: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/57.jpg)
3-3 Range
The range is defined to be the highest value minus the lowest value. The symbol R is used for the range.
R = highest value – lowest value.
Extremely large or extremely small data values can drastically affect the range.
![Page 58: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/58.jpg)
3-3 Population Variance
𝟐 = σ(𝑿− µ)𝟐𝑵
General Formula
• The symbol for population
variance is .
• X = Individual value
• µ = Population mean
• N = Population size
![Page 59: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/59.jpg)
3-3 Population Standard Deviation
𝟐 = ඥ𝝈𝟐 = ඨσ(𝑿− µ)𝟐𝑵
General Formula
• The population standard
deviation is square root of
the population variance.
![Page 60: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/60.jpg)
![Page 61: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/61.jpg)
ean = (10 + 60 + 50 + 30 + 40 + 20)/6 = 210/6
= 35.
![Page 62: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/62.jpg)
X X-µ (X-µ)2
10 -25 625
60 +25 625
50 +15 225
30 -5 25
40 +5 25
20 -15 225
∑X: 210 ∑(X-µ)2: 1750
The variance 2 = 1750/6 = 291.67.
The standard deviation = 291.67 = 17.08
![Page 63: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/63.jpg)
A class of 6 children sat a test; the resulting marks scored out of 10, were as follows:
4 5 6 8 4 9
Calculate the mean, variance and standard deviation of this population.
![Page 64: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/64.jpg)
3-3 Sample Variance
General Formula
• The symbol for sample
variance is .
• = Sample mean
• n = Sample size
𝑺𝟐 = σ(𝑿− 𝑿ഥ)𝟐𝒏− 𝟏
![Page 65: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/65.jpg)
3-3 Sample Standard Deviation
𝑺𝟐 = ඥ𝑺𝟐 = ඨσ(𝑿− 𝑿ഥ)𝟐𝒏− 𝟏
General Formula
• The sample standard
deviation is square root of
the sample variance.
![Page 66: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/66.jpg)
![Page 67: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/67.jpg)
Mean, X = (35+45+30+35+40+25)/6 = 210/6 = 35
![Page 68: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/68.jpg)
X X-X (X-X)2
35 0 0
45 +10 100
30 -5 25
35 0 0
40 +5 25
25 -10 100
∑X: 210 ∑(X-X)2 : 250
The variance 2 = 250/(6-1) = 50The standard deviation = 50 = 7.07
![Page 69: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/69.jpg)
Ungrouped frequency distribution
Grouped frequency distribution
𝟐 = σ𝒇(𝑿− µ)𝟐𝑵
𝟐 = σ𝒇(𝑿𝒎 − µ)𝟐𝑵
How to find the population variance and standard deviation for data with frequency provided?
How to find the population variance and standard deviation for data with frequency provided?
![Page 70: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/70.jpg)
• For ungrouped data, use the actual observed X value in the different classes.
• For grouped data, use the same formula with the X value replaced by class midpoints, Xm.
![Page 71: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/71.jpg)
Ungrouped frequency distribution
Grouped frequency distribution
How to find the sample variance and standard deviation for data with frequency provided?
How to find the sample variance and standard deviation for data with frequency provided?
𝑺𝟐 = σ𝒇(𝑿− 𝑿ഥ)𝟐𝒏− 𝟏
𝑺𝟐 = σ𝒇(𝑿𝒎− 𝑿ഥ)𝟐𝒏− 𝟏
![Page 72: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/72.jpg)
Example
X f Xm fXm (Xm - µ) (Xm - µ)2 f(Xm - µ)2
1-10 4 5.5 22 -10.59 112.1481 448.5924
11-20 8 15.5 124 -0.59 0.3481 2.7848
21-30 5 25.5 127.5 9.41 88.5481 442.7405
N = 17
∑fXm = 273.5
∑f(Xm - µ)2 =
894.1177
![Page 73: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/73.jpg)
Mean, µ = ∑fXm / N = 273.5/17 = 16.09
Variance, 2 = 894.1177 / 17 = 52.60
Standard deviation, = 2
= 52.60 = 7.25
![Page 74: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/74.jpg)
Question
a) The scores in a statistics test for 60 candidates are shown in the table. Find the mean, variance, and standard deviation for this population.
Score Frequency
0-19 8
20-39 13
40-59 24
60-79 11
80-99 4
![Page 75: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/75.jpg)
b) The following table showed the number of cars passed by UM hospital in a random sample of days. Compute the sample mean, variance and standard deviation.
Number of cars Number of days
101-120 2
121-130 4
131-140 15
141-150 10
151-160 7
![Page 76: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/76.jpg)
3-3 Coefficient of Variation
• The coefficient of variation is defined to be the standard deviation divided by the mean. The result is expressed as a percentage.
• It is used to compare the standard deviation of different units.
CVarsX
or CVar 100% 100%. =
![Page 77: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/77.jpg)
![Page 78: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/78.jpg)
3-3 Chebyshev’s Theorem
• 75% of the values will lie within 2 standard deviations of the mean.
• Approximately 89% will lie within 3 standard deviations.
![Page 79: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/79.jpg)
![Page 80: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/80.jpg)
3-3 Empirical (Normal) Rule
• For any bell shaped distribution:
Approximately 68% of the data values will fall within one standard deviation of the mean.
Approximately 95% will fall within two standard deviations of the mean.
Approximately 99.7% will fall within three standard deviations of the mean.
![Page 81: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/81.jpg)
-- 95%
![Page 82: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/82.jpg)
Question
The scores on a national achievement exam have a mean of 480 and standard deviation of 90. If the scores are normally distributed, find the scores for approximately 68% of the data values, 95% of the data values, and 99.7% of the data values.
![Page 83: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/83.jpg)
3-4 Measures of Position
Measure the position of particular data in a data set.
Z - score
Percentile
Decile
Quartile
![Page 84: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/84.jpg)
3-4 Z-Score
The standard score or z score for a value is obtained by subtracting the mean from the value and dividing the result by the standard deviation.
The symbol z is used for the z score.
Z = (value - mean)/ standard deviation
![Page 85: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/85.jpg)
• The z score represents the number of standard deviations a data value falls above or below the mean.
For samples
zX X
sFor populations
zX
:
.
:
.
![Page 86: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/86.jpg)
• A student scored 65 on a statistics exam that had a mean of 50 and a standard deviation of 10. Compute the z-score.
• z = (65 – 50)/10 = 1.5.
• That is, the score of 65 is 1.5 standard deviations above the mean.
• Above - since the z-score is positive.
• How about if the z-score shows a negative value?
![Page 87: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/87.jpg)
3-4 Percentile
• Percentiles divide the distribution into 100 equal groups.
P1 P2 P3 P4 ……………… P98 P99 P100
![Page 88: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/88.jpg)
How to find the Value Corresponding to a Given Percentile?
• Step 1: Arrange the data in order.
• Step 2: Compute c = (np)/100.
c = position value of the required percentilep = percentilen = sample size
![Page 89: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/89.jpg)
• Step 3: If c is not a whole number, round up to the next whole number and find the corresponding value. If c is a whole number, use the value halfway between c
and c+1.
![Page 90: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/90.jpg)
Find the value of the 25th percentile for the following data set: 18, 12, 3, 5, 15, 8, 10, 2, 6, 20.
![Page 91: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/91.jpg)
• Is the data arranged in order?
• Arrange data in order form: 2, 3, 5, 6, 8, 10, 12, 15, 18, 20.
• n = 10, p = 25, so c = (1025)/100 = 2.5. Since 2.5 is not a whole number, round up to c = 3.
• Thus, the value of the 25th percentile is the value X = 5.
![Page 92: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/92.jpg)
3-4 Decile
• Deciles divide the data set into 10 groups.
D1 D2 D3 D4 D5 D6 D7 D8 D9 D10
![Page 93: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/93.jpg)
What is the relationship between percentile and decile?
D1 corresponding to P10
D2 corresponding to P20
D9 corresponding to P90
D10 corresponding to P100
![Page 94: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/94.jpg)
3-4 Quartile
• Quartiles divide the data set into 4 groups.
Q1 Q2 Q3 Q4
![Page 95: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/95.jpg)
What is the relationship between percentile and quartile?
Q1 corresponding to P25
Q2 corresponding to P50
Q3 corresponding to P75
Q4 corresponding to P100
Median
![Page 96: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/96.jpg)
3-4 Outliers and the Interquartile Range (IQR)
• An outlier is an extremely high or an extremely low data value when compared with the rest of the data values.
• The Interquartile Range, IQR = Q3 – Q1.
![Page 97: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/97.jpg)
Procedures to identify outliers
• To determine whether a data value can be considered as an outlier:
• Step 1: Arrange the data in order.
• Step 2: Compute Q1 and Q3.
• Step 3: Find the IQR = Q3 – Q1.
• Step 4: Compute (1.5)(IQR).
![Page 98: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/98.jpg)
• Step 5: Compute Q1 – (1.5)(IQR) and Q3 + (1.5)(IQR).
• Step 6: Compare the data value (say X) with Q1 – (1.5)(IQR) and Q3 + (1.5)(IQR).
• If X < Q1 – (1.5)(IQR) or X > Q3 + (1.5)(IQR), then X is considered an outlier.
![Page 99: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/99.jpg)
![Page 100: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/100.jpg)
• Find Q1.
• Q1 is corresponding to P25
c = (n•p)/100 = (8•25)/100 = 2nd • Since 2 is a whole number, take value
between 2nd and 3rd.• Q1 = (6+12)/2 = 9
![Page 101: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/101.jpg)
• Find Q3.
• Q3 is corresponding to P75
c = (n•p)/100 = (8•75)/100 = 6th • Since 6 is a whole number, take value
between 6th and 7th.• Q3 = (18+22)/2 = 20
![Page 102: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/102.jpg)
• Find IQR.
• IQR = 20 - 9 = 11.
• (1.5)(IQR) = (1.5)(11) = 16.5.
• 9 – 16.5 = – 7.5 and 20 + 16.5 = 36.5.
• The value of 50 is outside the range – 7.5 to 36.5, hence 50 is an outlier.
![Page 103: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/103.jpg)
3-5 Exploratory Data Analysis - Stem and Leaf Plot
• A stem and leaf plot is a data plot that uses part of a data value as the stem and part of the data value as the leaf to form groups or classes.
Leaf
Stem
![Page 104: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/104.jpg)
Example of Stem and Leaf Plot
4 2 5 2
5 2 5 2
Stem Leaves
![Page 105: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/105.jpg)
Stem Leading digit
Leaf Trailing digit
![Page 106: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/106.jpg)
At an outpatient testing center, a sample of 20 days showed the following number of cardiograms done each day: 25, 31, 20, 32, 13, 14, 43, 02, 57, 23, 36, 32, 33, 32, 44, 32, 52, 44, 51, 45. Construct a stem and leaf plot for the data.
![Page 107: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/107.jpg)
Leading Digit (Stem) Trailing Digit (Leaf)
012345
23 40 3 51 2 2 2 2 3 63 4 4 51 2 7
![Page 108: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/108.jpg)
3-5 Exploratory Data Analysis - Box Plot
• When the data set contains a small number of values, a box plot is used to graphically represent the data set.
• These plots involve five values:
1. Minimum value
2. Q1
3. Median
4. Q3
5. Maximum value
![Page 109: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/109.jpg)
Example of Box Plot
Minimum value Maximum value
Q2Q1 Q3
![Page 110: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/110.jpg)
Information Obtained from a Box Plot
• If the median is near the center of the box, the distribution is approximately symmetric.
• If the median falls to the left of the center of the box, the distribution is positively skewed.
• If the median falls to the right of the center of the box, the distribution is negatively skewed.
![Page 111: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/111.jpg)
• If the lines are about the same length, the distribution is approximately symmetric.
• If the right line is larger than the left line, the distribution is positively skewed.
• If the left line is larger than the right line, the distribution is negatively skewed.
![Page 112: CHAPTER 3 Data Description. OUTLINE 3-1Introduction 3-2Measures of Central Tendency 3-3Measures of Variation 3-4Measures of Position 3-5Exploratory Data](https://reader035.vdocument.in/reader035/viewer/2022062517/56649f205503460f94c383e2/html5/thumbnails/112.jpg)
21 girls estimated the length of a line, in mm. The results were
51 45 31 43 97 16 18 23 34 35 35
85 62 20 22 51 57 49 22 18 27
Draw the box plot and use it to identify any outliers. Comment on the shape of the distribution of length.