
Online Student Guide

OpusWorks 2019, All Rights Reserved

Measures of Central Tendency


Table of Contents

LEARNING OBJECTIVES

INTRODUCTION
    MEASURES OF CENTRAL TENDENCY
    CENTRALIZE

MEAN
    WRITING OR REFERRING TO MEAN
    MATH NOTATION
    SUM OF THE MEASUREMENTS
    CALCULATING THE MEAN
    CALCULATING THE MEAN: GROUPED DATA
    CLASS MIDPOINT

MEDIAN
    FIND THE MEDIAN
    RULE FOR FINDING THE MEDIAN
    LINEAR INTERPOLATION
    ESTIMATING THE VALUE OF THE MEDIAN

MODE
    IDENTIFYING THE MODE
    MODAL CLASS
    WHICH MEASURE TO USE?
    SHAPE OF THE DATA
    RELATIONSHIP: MEAN & MEDIAN

© 2019 by OpusWorks. All rights reserved. August, 2019

Terms of Use
This guide can only be used by those with a paid license to the corresponding course in the e-Learning curriculum produced and distributed by OpusWorks. No part of this Student Guide may be altered, reproduced, stored, or transmitted in any form by any means without the prior written permission of OpusWorks.

Trademarks
All terms mentioned in this guide that are known to be trademarks or service marks have been appropriately capitalized.

Comments
Please address any questions or comments to your distributor or to OpusWorks at [email protected].


Learning Objectives

Upon completion of this course, the student will be able to:
• Discuss the three measures of central tendency: mean, median, and mode
• Describe how to estimate the relationship of the median and the mean based upon the shape of a histogram
• Explain how changes to the original data affect the mean, median, and mode
• Calculate estimates for the median and mean, and identify the modal class

Introduction

Measures of Central Tendency

Measures of central tendency show us how the data tends to cluster around particular values, or “centralize.” They provide descriptive values that summarize what the data have in common.

Centralize

Measures of central tendency show us how the data tends to cluster around a particular value, or “centralize.” They provide descriptive values that we often think of as a typical value of the data. Now we will learn how to calculate values for each of the three measures of central tendency (mean, median, and mode) for grouped and ungrouped data.

Mean

In everyday language, the mean is the average value of a set of data and is one of the most commonly used descriptive statistics. To calculate the mean, simply add all of the data points together and then divide by the number of data points.


Writing or Referring to Mean

When we refer to the mean of a population, the mean is denoted by the Greek letter “mu” (μ). When it’s the mean of a sample, it’s denoted by an "x" with a horizontal line over it (x̄), which is pronounced “x bar.” When writing the formula for the mean, we use the summation sign (Σ) and indicate a population size with an uppercase N and a sample size with a lowercase n.
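Written out in symbols, the two formulas described above can be sketched as follows. The index i and the elements x_i are the usual textbook conventions and are assumed here rather than quoted from the guide.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
\quad\text{(population mean)}
\qquad\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\quad\text{(sample mean)}
```

Both formulas do the same thing: add up every measurement and divide by how many there are. Only the symbols differ, to show whether the data is a whole population or a sample.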

Math Notation

We often use letters and subscripts to represent specific elements in a data set. By using letters for variables, subscripts for element identification, and other symbols for mathematical operations, our calculations and instructions are simplified. The first element is represented by the symbol x sub 1, the second element is shown as x sub 2, and so on. Please note: the subscript represents the position in the data set and does NOT imply any order according to value. We can now use the variable "x," the subscript "i," and the summation symbol to identify some simple calculations.


Sum of the Measurements

First example: the sum as "i" goes from 1 to 2 means we add the first and second elements of the data set. Next, the sum as "i" goes from 1 to 3 means we add up the first three elements in the data set. In the third example, the sum as "i" goes from 2 to 4 means we start with the second element and add the third and fourth elements. The answer is 14. In the last example, we start with the first element and end at the nth element. The nth element represents the last element in the data set. This is another way of saying "add up all the measurements in the data set." The answer here is 40, which is the sum of the measurements in this sample of size 7.
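The summation examples can be sketched in Python. The guide's actual data values are not reproduced in this text, so the sample below is hypothetical: seven values invented so that the second through fourth elements sum to 14 and the whole sample sums to 40. The helper name `partial_sum` is also an assumption made for illustration.

```python
# Hypothetical sample of size 7; these values are NOT from the guide.
# They were chosen only so the totals match the ones quoted in the text.
x = [3, 5, 2, 7, 8, 9, 6]


def partial_sum(values, start, stop):
    """Sum x_start through x_stop using 1-based positions, as in the notation."""
    return sum(values[start - 1:stop])


sum_1_to_2 = partial_sum(x, 1, 2)      # x_1 + x_2
sum_1_to_3 = partial_sum(x, 1, 3)      # x_1 + x_2 + x_3
sum_2_to_4 = partial_sum(x, 2, 4)      # x_2 + x_3 + x_4
sum_all = partial_sum(x, 1, len(x))    # x_1 + ... + x_n

print(sum_1_to_2, sum_1_to_3, sum_2_to_4, sum_all)
```

Note that Python lists are indexed from 0, so the helper shifts the starting position by one to match the 1-based subscripts used in the notation.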

Calculating the Mean

To find the mean, we divide the sum of the measurements, 40, by the number of measurements, 7. The mean is approximately 5.714, rounded to the third decimal place. To further your understanding of the information just presented, let's look at the following examples.
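As a quick check of the arithmetic, here is a minimal Python sketch. The seven values are hypothetical; they were picked only because they sum to 40, the total quoted in the text.

```python
# Hypothetical sample of size 7 whose sum is 40 (illustrative values only).
x = [3, 5, 2, 7, 8, 9, 6]

total = sum(x)       # sum of the measurements
n = len(x)           # number of measurements
x_bar = total / n    # sample mean: sum divided by count

print(round(x_bar, 3))
```

Any other seven values that add up to 40 would give the same mean, because the mean depends only on the sum and the count.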


Calculating the Mean: Grouped Data

In the first distribution, the values of the individual measurements are known, whether they are shown in a table or as a dot plot. In this situation, we can calculate the mean exactly because we know the exact sum of the measurements. In the second frequency distribution, along with its associated histogram, the individual measurements are not known. Therefore, the mean of this data set can only be approximated. Let’s see how.

Class Midpoint

Because the individual measurements for each class are unknown, we use the class midpoint as a representative value for every measurement in that class. Here, "x sub i" represents the class midpoint value and "f sub i" represents the class frequency. The sum of the measurements is approximated by the sum of the products of each class frequency times the class midpoint. This is represented by "f sub i times x sub i."


The mean is the approximated sum of the measurements divided by the total number of measurements. In this example, it is 31.80.
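The grouped-data calculation can be sketched in Python as follows. The guide's frequency table is not reproduced in this text, so the table below is hypothetical and will not give 31.80. The function name `grouped_mean` is likewise an assumption.

```python
# Hypothetical frequency table: (lower class limit, upper class limit, frequency).
# These classes are invented for illustration; they are not the guide's data.
classes = [(10, 20, 4), (20, 30, 7), (30, 40, 9), (40, 50, 5)]


def grouped_mean(table):
    """Approximate the mean as sum(f_i * x_i) / sum(f_i), with x_i the class midpoint."""
    total_frequency = sum(f for _, _, f in table)
    weighted_sum = sum(f * (lower + upper) / 2 for lower, upper, f in table)
    return weighted_sum / total_frequency


print(grouped_mean(classes))
```

The result is only an approximation of the true mean, because each measurement has been replaced by the midpoint of its class.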

Median

Now we will look at another measure of central tendency, the median. The median is the point that splits the data in half, like the median strip of a highway. Half the highway is on one side of the median, and the other half is on the other side. Let’s look at some examples to learn how to calculate the median.

Find the Median

To find the median in a set of data, the first step is to arrange the data in ascending or descending order. With the data sorted, we locate the value in the middle of the data that places the same number of measurements on each side of the value.


In the first example, there are seven measurements. The median will be the fourth measurement. The median is 3.

In the next example, there is an even number of measurements: 6. With the data sorted in ascending order, the median will be a number between the third and fourth values.

For convenience, when there is an even number of measurements, the median is the average of the two middle values. The median in this example is 39.5.

Here’s the rule for finding the median: “In a ranked data set, find the value in the position (n+1)/2.” When n is even, this position falls halfway between two measurements, and the median is the average of those two values.
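The position rule can be sketched in Python. The two data sets are hypothetical, chosen only to reproduce the medians quoted in the text (3 and 39.5). The function name `median_by_position` is an assumption.

```python
def median_by_position(values):
    """Median via the (n + 1) / 2 position rule applied to the ranked data."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ranked = sorted(values)
    n = len(ranked)
    position = (n + 1) / 2               # 1-based position in the ranked data
    if position.is_integer():            # odd n: a single middle value
        return ranked[int(position) - 1]
    lower = ranked[int(position) - 1]    # measurement just below the half position
    upper = ranked[int(position)]        # measurement just above it
    return (lower + upper) / 2           # even n: average the two middle values


# Hypothetical data sets (not from the guide).
seven_values = [5, 1, 3, 2, 8, 3, 6]
six_values = [35, 42, 39, 51, 40, 28]

print(median_by_position(seven_values), median_by_position(six_values))
```

For routine work, Python's standard `statistics.median` gives the same answers. The version above spells out the position rule step by step.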

Rule for Finding the Median

In this example, there are eight measurements. When they’re arranged in order, the median is the average of the fourth and fifth measurements. Since the fourth and fifth measurements are both 12, the median is 12. You may have noticed in this example that there are six measurements less than or equal to the median, which is 75% of the data. And there are five measurements greater than or equal to the median, which is 62.5% of the data. This is not a problem. By definition, a median is a value where at least 50% of the data will be less than or equal to the median and at least 50% of the values will be greater than or equal to the median. Therefore, the rule for finding the median works in all examples of ungrouped data.
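The percentages above can be checked with a short Python sketch. The eight values are hypothetical; they were chosen so that the fourth and fifth ranked measurements are both 12 and the counts match those in the text.

```python
# Hypothetical ranked data of size 8 (not the guide's data).
data = [8, 9, 10, 12, 12, 12, 15, 18]

n = len(data)
median = (data[3] + data[4]) / 2    # average of the 4th and 5th measurements

at_or_below = sum(1 for v in data if v <= median)    # measurements <= median
at_or_above = sum(1 for v in data if v >= median)    # measurements >= median

print(median, at_or_below / n, at_or_above / n)
```

The two proportions add up to more than 100% because the measurements equal to the median are counted on both sides.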


Linear Interpolation

With grouped data, the median is estimated from the frequency table using a method called linear interpolation. With forty measurements, the median will occur between the twentieth and twenty-first measurements. Since there are 17 measurements, or 42.5% of the data, less than or equal to 17.5, and 11 measurements between 17.5 and 21.5, the median must be in the fourth class.

The fourth class is called the median class for this data set because it contains the value of the median.

To find the value of the median, we will use the cumulative relative frequency curve to help estimate the value.

Estimating the Value of the Median

To estimate the value of the median, find the point on the cumulative relative frequency curve where the cumulative relative frequency is close to .50 (in the online course, click and drag the slider to do this). The measurement corresponding to that y-value on the curve is displayed in the box labeled "Waiting Time in Minutes." Note: This method assumes the 11 values are uniformly distributed in the interval 17.5 - 21.5. A simple linear calculation is used to estimate the median.
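The linear calculation can be sketched in Python using the figures given in the text: 40 measurements, 17 of them at or below 17.5, and 11 in the median class 17.5 - 21.5. The formula used, lower boundary + ((n/2 - cumulative count below) / class frequency) × class width, is the usual form of linear interpolation for a grouped median. It is assumed here to match the guide's method, since it finds the point where the cumulative relative frequency reaches .50. The function and parameter names are illustrative.

```python
def interpolated_median(lower_boundary, class_width, n, cumulative_below, class_frequency):
    """Estimate the median of grouped data by linear interpolation.

    Assumes the measurements in the median class are spread uniformly
    across the class interval.
    """
    # How many measurements into the median class the 50% point lies.
    needed = n / 2 - cumulative_below
    return lower_boundary + (needed / class_frequency) * class_width


# Figures taken from the text above.
estimate = interpolated_median(
    lower_boundary=17.5,
    class_width=21.5 - 17.5,
    n=40,
    cumulative_below=17,
    class_frequency=11,
)

print(round(estimate, 2))
```

With these figures, the 50% point lies 3 of the 11 measurements into the median class, so the estimate sits 3/11 of the way from 17.5 to 21.5.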


Mode

The mode is the value in the data set that occurs most often. It is the value with the highest frequency. Of the three measures of central tendency, the mode is the easiest to identify but the least useful measure. Let’s see how to identify the mode of a data set.

Identifying the Mode

To identify the mode, find the measurement that occurs most often, that is, the one with the highest frequency.

If each value occurs only once, there is no mode. If two or more values occur with the highest frequency, then there is more than one mode in the data set.

Here are three examples that are almost self-explanatory. In example one, the mode is 1. It is the measurement with the highest frequency. In example two, there is no mode. Each measurement is unique. In the last example, there are two modes. We call this data set bimodal.
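The three cases can be sketched in Python with `collections.Counter`. The data sets are hypothetical stand-ins for the guide's examples, which are not reproduced in this text. The function name `modes` is an assumption, as is the choice to return an empty list when there is no mode.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency, or [] if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:        # every value occurs only once: no mode
        return []
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical data sets (not from the guide).
one_mode = [1, 1, 1, 2, 3, 4]
no_mode = [2, 4, 6, 8]
bimodal = [3, 3, 5, 7, 7, 9]

print(modes(one_mode), modes(no_mode), modes(bimodal))
```

Returning an empty list for "no mode" follows the rule stated above. Note that Python's standard `statistics.multimode` behaves differently here: when every value occurs once, it returns all of the values.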

Modal Class

Here we have a frequency distribution for a given data set. Without knowing the individual measurements, we cannot find the mode. With grouped data displayed in this format, we can only identify a modal class. The modal class is the class interval with the highest frequency.
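Identifying the modal class from a frequency table can be sketched in Python. The table is hypothetical, since the guide's frequency distribution is not reproduced in this text.

```python
# Hypothetical frequency distribution: class interval -> frequency.
frequency_table = {
    "10 - 20": 4,
    "20 - 30": 7,
    "30 - 40": 9,
    "40 - 50": 5,
}

highest = max(frequency_table.values())

# Keep every class tied for the highest frequency (there may be more than one).
modal_classes = [interval for interval, f in frequency_table.items() if f == highest]

print(modal_classes)
```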


Which Measure to Use?

Which measure of central tendency is the best one to use? This question doesn’t have a “one size fits all” answer. To know which measure is best to use for a given situation, you have to understand the relationship between the measures and the advantages and disadvantages of each measure.

Shape of the Data

The shape of the data can affect the relationship of the mean and median. This is best illustrated by looking at a histogram of the data. When the histogram implies a symmetric distribution, the mean and the median are equal, or very nearly so. But if the data is skewed right or left, the mean is affected by the extreme values more than the median is. In a skewed left distribution, the smaller extreme values in the left tail will make the mean smaller than the median. Similarly, in a skewed right distribution, the larger extreme values in the right tail will make the mean larger than the median.
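These three cases can be illustrated with a small Python sketch. All three data sets are hypothetical, invented so that each shows one shape clearly.

```python
from statistics import mean, median

# Hypothetical data sets, one per shape (not from the guide).
symmetric = [1, 2, 3, 4, 5, 6, 7]
skewed_right = [1, 2, 2, 3, 3, 4, 20]        # long tail of large values
skewed_left = [1, 17, 18, 18, 19, 19, 20]    # long tail of small values

for name, data in [("symmetric", symmetric),
                   ("skewed right", skewed_right),
                   ("skewed left", skewed_left)]:
    print(name, "mean:", mean(data), "median:", median(data))
```

In the skewed sets, a single extreme value pulls the mean toward its tail while the median stays with the bulk of the data.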


Relationship: Mean & Median

In this example, we can see the effect of an extreme value in the data set on the mean and median. When the last number in the data set is changed from 20 to 90, the mean changes substantially, from 8 to 18. However, the median remains the same. The median is insensitive to a change in the value of the largest measurement. The change in the mean is rather large in this example because the size of the data set is small. In larger data sets, the change would not be as significant. To further your understanding of the information just presented, let's look at some examples.
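The effect can be reproduced with a short Python sketch. The guide's data set is not reproduced in this text, so the seven values below are hypothetical. They were chosen so that the mean is 8 when the largest value is 20 and 18 when it is 90, matching the figures above.

```python
from statistics import mean, median

# Hypothetical data set (not from the guide); the largest value is last.
original = [2, 4, 5, 6, 8, 11, 20]
changed = original[:-1] + [90]    # replace the last value, 20, with 90

print("mean:  ", mean(original), "->", mean(changed))
print("median:", median(original), "->", median(changed))
```

A change of 70 spread over 7 measurements moves the mean by 10. The same change in a data set of 700 measurements would move it by only 0.1, which is why the effect shrinks as the data set grows.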