measures of central tendency

1 Measures of Central Tendency AGBS Bangalore | 2013

Upload: nilanjan-bhaumik

Post on 18-Jul-2015


Category:

Data & Analytics



TRANSCRIPT

Page 1: Measures of central tendency

Measures of Central Tendency

AGBS Bangalore | 2013

Page 2: Measures of central tendency

Central Tendency

• In general terms, central tendency is a statistical measure that determines a single value that accurately describes the center of the distribution and represents the entire distribution of scores.

• The goal of central tendency is to identify the single value that is the best representative for the entire set of data.

Page 3: Measures of central tendency

Central Tendency (cont.)

• By identifying the "average score," central tendency allows researchers to summarize or condense a large set of data into a single value.

• Thus, central tendency serves as a descriptive statistic because it allows researchers to describe or present a set of data in a very simplified, concise form.

• In addition, it is possible to compare two (or more) sets of data by simply comparing the average score (central tendency) for one set versus the average score for another set.

Page 4: Measures of central tendency
Page 5: Measures of central tendency

The Mean, the Median, & the Mode

• It is essential that central tendency be determined by an objective and well-defined procedure so that others will understand exactly how the "average" value was obtained and can duplicate the process.

• No single procedure always produces a good, representative value. Therefore, researchers have developed three commonly used techniques for measuring central tendency: the mean, the median, and the mode.

Page 6: Measures of central tendency

The Mean

• The mean is the most commonly used measure of central tendency.

• Computation of the mean requires scores that are numerical values measured on an interval or ratio scale.

• The mean is obtained by computing the sum, or total, for the entire set of scores, then dividing this sum by the number of scores.

Page 7: Measures of central tendency

The Mean (cont.)

Conceptually, the mean can also be defined as:

1. The mean is the amount that each individual receives when the total (ΣX) is divided equally among all N individuals.

2. The mean is the balance point of the distribution because the sum of the distances below the mean is exactly equal to the sum of the distances above the mean.
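
Both definitions can be checked numerically. The following is a minimal Python sketch; the scores are hypothetical values chosen for illustration and do not come from the slides.

```python
def mean(scores):
    """Sum of the scores (ΣX) divided by the number of scores (N)."""
    return sum(scores) / len(scores)

scores = [3, 7, 4, 6, 5]  # hypothetical scores, not from the slides
m = mean(scores)          # 25 / 5 = 5.0

# Balance point: total distance below the mean equals total distance above it.
below = sum(m - x for x in scores if x < m)  # (5 - 3) + (5 - 4) = 3.0
above = sum(x - m for x in scores if x > m)  # (7 - 5) + (6 - 5) = 3.0

print(m, below, above)  # 5.0 3.0 3.0
```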

Page 8: Measures of central tendency
Page 9: Measures of central tendency
Page 10: Measures of central tendency

Calculate the Mean number of credit hours

Page 11: Measures of central tendency

Calculate the Mean Salary

Calculate the Median Salary

Page 12: Measures of central tendency
Page 13: Measures of central tendency

Changing the Mean

• Because the calculation of the mean involves every score in the distribution, changing the value of any score will change the value of the mean.

• Modifying a distribution by discarding scores or by adding new scores will usually change the value of the mean.

• To determine how the mean will be affected for any specific situation you must consider: 1) how the number of scores is affected, and 2) how the sum of the scores is affected.
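
As an illustration of tracking both the number of scores and their sum, here is a short Python sketch. The helper names `mean_after_adding` and `mean_after_removing` and the numbers are hypothetical, chosen only to show the bookkeeping.

```python
def mean_after_adding(old_mean, n, new_score):
    """New mean once one score is added: the sum grows by new_score, N grows by 1."""
    old_sum = old_mean * n
    return (old_sum + new_score) / (n + 1)

def mean_after_removing(old_mean, n, removed_score):
    """New mean once one score is discarded: the sum shrinks, N shrinks by 1."""
    old_sum = old_mean * n
    return (old_sum - removed_score) / (n - 1)

# Hypothetical example: N = 4 scores with mean 7, so the sum is 28.
print(mean_after_adding(7, 4, 12))   # (28 + 12) / 5 = 8.0
print(mean_after_removing(7, 4, 1))  # (28 - 1) / 3 = 9.0
```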

Page 14: Measures of central tendency

Changing the Mean (cont.)

• If a constant value is added to every score in a distribution, then the same constant value is added to the mean.

• Also, if every score is multiplied by a constant value, then the mean is also multiplied by the same constant value.
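
These two properties can be verified with a few lines of Python. This is a sketch using the standard library's `statistics.mean`; the scores and constants are hypothetical.

```python
from statistics import mean

scores = [2, 4, 6, 8]               # hypothetical scores, mean = 5
shifted = [x + 3 for x in scores]   # add the constant 3 to every score
scaled = [x * 10 for x in scores]   # multiply every score by the constant 10

print(mean(scores), mean(shifted), mean(scaled))  # 5 8 50
```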

Page 15: Measures of central tendency

When the Mean Won’t Work

• Although the mean is the most commonly used measure of central tendency, there are situations where the mean does not provide a good, representative value, and there are situations where you cannot compute a mean at all.

• When a distribution contains a few extreme scores (or is very skewed), the mean will be pulled toward the extremes (displaced toward the tail). In this case, the mean will not provide a "central" value.

Page 16: Measures of central tendency

When the Mean Won’t Work (cont.)

• With data from a nominal scale it is impossible to compute a mean, and when data are measured on an ordinal scale (ranks), it is usually inappropriate to compute a mean.

• Thus, the mean does not always work as a measure of central tendency and it is necessary to have alternative procedures available.

Page 17: Measures of central tendency

The Median

• If the scores in a distribution are listed in order from smallest to largest, the median is defined as the midpoint of the list.

• The median divides the scores so that 50% of the scores in the distribution have values that are equal to or less than the median.

• Computation of the median requires scores that can be placed in rank order (smallest to largest).

Page 18: Measures of central tendency

The Median (cont.)

Usually, the median can be found by a simple counting procedure:

1. With an odd number of scores, list the values in order, and the median is the middle score in the list.

2. With an even number of scores, list the values in order, and the median is halfway between the middle two scores.
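
The counting procedure above can be sketched in Python as follows. The function name `median` and the sample scores are illustrative assumptions, not part of the slides.

```python
def median(scores):
    """Median by the counting procedure: sort, then take the middle."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of scores: the middle score in the ordered list.
        return ordered[mid]
    # Even number of scores: halfway between the middle two scores.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([3, 5, 8, 10, 11]))    # 8
print(median([1, 1, 4, 5, 7, 9]))   # 4.5
```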

Page 19: Measures of central tendency
Page 20: Measures of central tendency

The Median (cont.)

• One advantage of the median is that it is relatively unaffected by extreme scores.

• Thus, the median tends to stay in the "center" of the distribution even when there are a few extreme scores or when the distribution is very skewed. In these situations, the median serves as a good alternative to the mean.
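
A small Python sketch makes the contrast concrete. It uses `mean` and `median` from the standard `statistics` module; the scores, including the extreme value 100, are hypothetical.

```python
from statistics import mean, median

scores = [10, 11, 12, 13, 14]   # hypothetical scores
with_extreme = scores + [100]   # the same scores plus one extreme score

# Without the extreme score, mean and median agree.
print(mean(scores), median(scores))   # 12 12

# With it, the mean is pulled toward the extreme; the median barely moves.
print(round(mean(with_extreme), 2), median(with_extreme))   # 26.67 12.5
```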

Page 21: Measures of central tendency

Median for Grouped Frequency Distribution


Page 22: Measures of central tendency

Median for Grouped Frequency Distribution


Page 23: Measures of central tendency

Median for Grouped Frequency Distribution

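
One commonly taught way to locate the median of a grouped frequency distribution is linear interpolation within the median class: Median = L + ((N/2 - cf) / f) × h, where L is the lower limit of the median class, cf the cumulative frequency before it, f its frequency, and h the class width. The Python sketch below assumes that formula and equal-width classes; the function name `grouped_median` and the data are hypothetical and are not taken from the slides.

```python
def grouped_median(lower_limits, width, freqs):
    """Interpolated median for equal-width classes (assumed formula)."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("distribution has no observations")
    half = n / 2
    cumulative = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= half:
            # This is the median class: interpolate inside it.
            return lower + (half - cumulative) * width / f
        cumulative += f

# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 10, 20, 10, 5]   # N = 50, so N/2 = 25 falls in class 20-30

print(grouped_median(lower_limits, 10, freqs))  # 20 + (25 - 15) * 10 / 20 = 25.0
```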

Page 24: Measures of central tendency

The Mode

• The mode is the most common observation in a group of scores.

– Distributions can be unimodal, bimodal, or multimodal.

• If the data are categorical (measured on the nominal scale), then only the mode can be calculated.

• The most frequently occurring score (mode) below is Vanilla.

[Bar chart: frequency (f) of each flavor, vertical axis from 0 to 30]

Flavor          f
Vanilla         28
Chocolate       22
Strawberry      15
Neapolitan      8
Butter Pecan    12
Rocky Road      9
Fudge Ripple    6
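
For categorical data like this, finding the mode amounts to picking the category with the largest frequency. A minimal Python sketch using the frequencies from the table (the variable names are illustrative):

```python
# Frequencies (f) from the flavor table.
flavor_counts = {
    "Vanilla": 28,
    "Chocolate": 22,
    "Strawberry": 15,
    "Neapolitan": 8,
    "Butter Pecan": 12,
    "Rocky Road": 9,
    "Fudge Ripple": 6,
}

# The mode is the flavor with the highest frequency.
mode = max(flavor_counts, key=flavor_counts.get)
print(mode, flavor_counts[mode])  # Vanilla 28
```

Note that `max` returns only the first category it meets with the highest count, so a tie (a bimodal distribution) would need an extra check.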

Page 25: Measures of central tendency


The Characteristics of the Mode

• Value that occurs most often

• Not affected by extreme values

• Used for either numerical or categorical (nominal) data

• There may be no mode

• There may be several modes

[Figure: data plotted on an axis from 0 to 14; Mode = 9]

[Figure: data plotted on an axis from 0 to 6; No Mode]

Page 26: Measures of central tendency

The Mode

• The mode can also be calculated with ordinal and higher data, but it often is not appropriate.

– If other measures can be calculated, the mode would never be the first choice!

• 7, 7, 7, 20, 23, 23, 24, 25, 26 has a mode of 7, but 7 is the lowest score and clearly does not represent the center of this distribution.
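
Computing all three measures for these nine scores shows how far the mode sits from the other two. This sketch uses `mean`, `median`, and `multimode` from Python's standard `statistics` module (`multimode` requires Python 3.8 or later).

```python
from statistics import mean, median, multimode

scores = [7, 7, 7, 20, 23, 23, 24, 25, 26]   # scores from the slide

print(multimode(scores))  # [7]  (the most frequent score)
print(median(scores))     # 23   (the 5th of the 9 ordered scores)
print(mean(scores))       # 18   (162 / 9)
```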

Page 27: Measures of central tendency

Mode for Grouped Frequency Distribution

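
A commonly taught interpolation formula for the mode of a grouped frequency distribution is Mode = L + ((f1 - f0) / (2·f1 - f0 - f2)) × h, where L is the lower limit of the modal class, f1 its frequency, f0 and f2 the frequencies of the classes before and after it, and h the class width. The Python sketch below assumes that formula and equal-width classes; the function name `grouped_mode` and the data are hypothetical and are not taken from the slides.

```python
def grouped_mode(lower_limits, width, freqs):
    """Interpolated mode for equal-width classes (assumed formula)."""
    # Modal class: the class with the highest frequency (first one on ties).
    i = max(range(len(freqs)), key=lambda k: freqs[k])
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0               # class before the modal class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # class after the modal class
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("neighbouring classes are as frequent as the modal class")
    return lower_limits[i] + (f1 - f0) * width / denominator

# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 8, 20, 12, 5]   # modal class is 20-30

print(grouped_mode(lower_limits, 10, freqs))  # 20 + (20 - 8) * 10 / 20 = 26.0
```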

Page 28: Measures of central tendency

Calculate the Mode


Page 29: Measures of central tendency

What is the mean rate of return here?

An investment of $100,000 declined to $50,000 at the end of year one and rebounded to $100,000 at the end of year two:

The overall two-year return is zero, since it started and ended at the same level.

\[ X_1 = \$100{,}000, \quad X_2 = \$50{,}000, \quad X_3 = \$100{,}000 \]

Year one: 50% decrease. Year two: 100% increase.

Page 30: Measures of central tendency

The Geometric Mean & The Geometric Rate of Return

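Geometric mean

Used to measure the rate of change of a variable over time

\[ \bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \]

Geometric mean rate of return

Measures the status of an investment over time

\[ \bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1 \]

where \( R_i \) is the rate of return in time period \( i \).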

Page 31: Measures of central tendency

The Geometric Mean Rate of Return: Example

Use the 1-year returns to compute the arithmetic mean and the geometric mean:

Arithmetic mean rate of return (misleading result):

\[ \bar{X} = \frac{(-0.5) + (1)}{2} = 0.25 = 25\% \]

Geometric mean rate of return (more representative result):

\[ \bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1 = [(1 + (-0.5)) \times (1 + (1))]^{1/2} - 1 = [(0.50) \times (2)]^{1/2} - 1 = 1^{1/2} - 1 = 0\% \]
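
The same comparison can be reproduced in Python. This is a minimal sketch: the function names are illustrative, and `math.prod` requires Python 3.8 or later.

```python
import math

def arithmetic_mean_return(returns):
    """Plain average of the per-period rates of return."""
    return sum(returns) / len(returns)

def geometric_mean_return(returns):
    """[(1 + R_1)(1 + R_2)...(1 + R_n)]^(1/n) - 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

# Year one: 50% decrease; year two: 100% increase.
returns = [-0.5, 1.0]

print(arithmetic_mean_return(returns))  # 0.25 (25%, misleading)
print(geometric_mean_return(returns))   # 0.0  (0%, matches the overall return)
```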

Page 32: Measures of central tendency

Measures of Central Tendency: Summary

Central Tendency

• Arithmetic Mean: \( \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \)

• Median: middle value in the ordered array

• Mode: most frequently observed value

• Geometric Mean: \( \bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \), the rate of change of a variable over time

Page 33: Measures of central tendency

Which measure to use?

Mean is generally considered the best measure of central tendency and the most frequently used one. However, there are some situations where the other measures of central tendency are preferred.

Median is preferred to mean when

• There are a few extreme scores in the distribution.

• Some scores have undetermined values.

• There is an open-ended distribution.

• Data are measured on an ordinal scale.

Mode is the preferred measure when data are measured on a nominal scale.

A geometric mean is often used when comparing different items (finding a single "figure of merit" for these items) when each item has multiple properties that have different numeric ranges.

Page 34: Measures of central tendency

Quiz Time

1) A teacher gives a 10-point quiz to a class of 9 students. All the scores are whole numbers and nobody got a 0 or a perfect score of 10. If the median is 7, what is the lowest possible mean? What is the highest possible mean?

2) A poll reports that out of 100 families surveyed, the mean number of children per family was 2.038, the median was 1.9, and the mode was 1.82. Which of these values must be wrong (independently)?

3) What is the mode here?

Page 35: Measures of central tendency

Solve this!

Wages      0-10  10-20  20-30  30-40  40-50  50-60  60-70  Total
Frequency  4     16     ?      ?      ?      6      4      230

Median = 33.5; Mode = 34. Calculate the missing frequencies.

Page 36: Measures of central tendency

Now solve this!

Variable   10-20  20-30  30-40  40-50  50-60  60-70  70-80  Total
Frequency  12     30     ?      65     ?      25     18     229

Median = 46. Calculate the missing frequencies.