measures of central tendency

Post on 18-Jul-2015


Category:

Data & Analytics


Measures of Central Tendency

AGBS Bangalore | 2013

Central Tendency

• In general terms, central tendency is a statistical measure that determines a single value that accurately describes the center of the distribution and represents the entire distribution of scores.

• The goal of central tendency is to identify the single value that is the best representative for the entire set of data.


Central Tendency (cont.)

• By identifying the "average score," central tendency allows researchers to summarize or condense a large set of data into a single value.

• Thus, central tendency serves as a descriptive statistic because it allows researchers to describe or present a set of data in a very simplified, concise form.

• In addition, it is possible to compare two (or more) sets of data by simply comparing the average score (central tendency) for one set versus the average score for another set.


The Mean, the Median, & the Mode

• It is essential that central tendency be determined by an objective and well-defined procedure so that others will understand exactly how the "average" value was obtained and can duplicate the process.

• No single procedure always produces a good, representative value. Therefore, researchers have developed three commonly used techniques for measuring central tendency: the mean, the median, and the mode.


The Mean

• The mean is the most commonly used measure of central tendency.

• Computation of the mean requires scores that are numerical values measured on an interval or ratio scale.

• The mean is obtained by computing the sum, or total, for the entire set of scores, then dividing this sum by the number of scores.
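The sum-then-divide procedure can be sketched in a few lines of Python. The scores below are made up for illustration and do not come from the slides.

```python
# Hypothetical scores, invented for this illustration.
scores = [3, 5, 6, 8, 8]

def mean(values):
    """Sum all scores, then divide by the number of scores."""
    return sum(values) / len(values)

m = mean(scores)
print(m)  # 30 / 5 = 6.0
```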


The Mean (cont.)

Conceptually, the mean can also be defined as:

1. The mean is the amount that each individual receives when the total (ΣX) is divided equally among all N individuals.

2. The mean is the balance point of the distribution because the sum of the distances below the mean is exactly equal to the sum of the distances above the mean.
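The balance-point property can be checked numerically. This is a small sketch with invented scores; the variable names are only illustrative.

```python
# Hypothetical scores, invented for this illustration.
scores = [1, 2, 6, 6, 10]

m = sum(scores) / len(scores)  # 25 / 5 = 5.0

# Total distance of the scores that fall below the mean.
below = sum(m - x for x in scores if x < m)  # (5-1) + (5-2) = 7.0
# Total distance of the scores that fall above the mean.
above = sum(x - m for x in scores if x > m)  # 1 + 1 + 5 = 7.0

print(m, below, above)
```

The two totals match, which is why the mean behaves like the pivot of a balanced seesaw.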

Calculate the Mean number of credit hours

Calculate the Mean Salary

Calculate the Median Salary


Changing the Mean

• Because the calculation of the mean involves every score in the distribution, changing the value of any score will change the value of the mean.

• Modifying a distribution by discarding scores or by adding new scores will usually change the value of the mean.

• To determine how the mean will be affected for any specific situation you must consider: 1) how the number of scores is affected, and 2) how the sum of the scores is affected.
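The two questions (what happens to the number of scores, what happens to the sum) translate directly into code. The helper names and the numbers below are hypothetical, chosen only to illustrate the reasoning.

```python
def mean_after_adding(old_mean, n, new_score):
    """New mean when one score is added to a set of n scores."""
    total = old_mean * n          # recover the original sum
    return (total + new_score) / (n + 1)

def mean_after_removing(old_mean, n, removed_score):
    """New mean when one score is discarded from a set of n scores."""
    total = old_mean * n
    return (total - removed_score) / (n - 1)

# Hypothetical case: 4 scores with mean 7, so the sum is 28.
m_add = mean_after_adding(7, 4, 12)     # (28 + 12) / 5 = 8.0
m_remove = mean_after_removing(7, 4, 1)  # (28 - 1) / 3 = 9.0
print(m_add, m_remove)
```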


Changing the Mean (cont.)

• If a constant value is added to every score in a distribution, then the same constant value is added to the mean.

• Also, if every score is multiplied by a constant value, then the mean is also multiplied by the same constant value.
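Both rules are easy to confirm on a small example. The scores and constants here are invented for illustration.

```python
from statistics import mean

# Hypothetical scores, invented for this illustration.
scores = [2, 4, 6, 8]                  # mean is 5

shifted = [x + 3 for x in scores]      # add 3 to every score
scaled = [x * 2 for x in scores]       # multiply every score by 2

print(mean(scores), mean(shifted), mean(scaled))  # 5, 8, 10
```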


When the Mean Won’t Work

• Although the mean is the most commonly used measure of central tendency, there are situations where the mean does not provide a good, representative value, and there are situations where you cannot compute a mean at all.

• When a distribution contains a few extreme scores (or is very skewed), the mean will be pulled toward the extremes (displaced toward the tail). In this case, the mean will not provide a "central" value.


When the Mean Won’t Work (cont.)

• With data from a nominal scale it is impossible to compute a mean, and when data are measured on an ordinal scale (ranks), it is usually inappropriate to compute a mean.

• Thus, the mean does not always work as a measure of central tendency and it is necessary to have alternative procedures available.


The Median

• If the scores in a distribution are listed in order from smallest to largest, the median is defined as the midpoint of the list.

• The median divides the scores so that 50% of the scores in the distribution have values that are equal to or less than the median.

• Computation of the median requires scores that can be placed in rank order (smallest to largest).


The Median (cont.)

Usually, the median can be found by a simple counting procedure:

1. With an odd number of scores, list the values in order, and the median is the middle score in the list.

2. With an even number of scores, list the values in order, and the median is halfway between the middle two scores.
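The counting procedure above can be sketched as a short function. The function name and the sample scores are illustrative assumptions, not part of the slides.

```python
def median(scores):
    """Median by the counting procedure: sort, then take the middle."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of scores: the single middle score.
        return ordered[mid]
    # Even number of scores: halfway between the middle two.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median([10, 3, 8, 5, 11])     # ordered: 3 5 8 10 11 -> 8
even_case = median([9, 1, 4, 7, 4, 5])   # ordered: 1 4 4 5 7 9 -> 4.5
print(odd_case, even_case)
```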


The Median (cont.)

• One advantage of the median is that it is relatively unaffected by extreme scores.

• Thus, the median tends to stay in the "center" of the distribution even when there are a few extreme scores or when the distribution is very skewed. In these situations, the median serves as a good alternative to the mean.
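A quick comparison shows how differently the two measures react to one extreme score. The data are invented for illustration.

```python
from statistics import mean, median

# Hypothetical scores, invented for this illustration.
scores = [10, 11, 12, 13, 14]
with_extreme = scores + [100]   # add a single extreme score

print(mean(scores), median(scores))              # 12 and 12
print(mean(with_extreme), median(with_extreme))  # about 26.67 and 12.5
```

The mean more than doubles, while the median moves by only half a point.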

Median for Grouped Frequency Distribution
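The worked content of this slide is not in the transcript. As a hedged sketch, the interpolation formula commonly taught for grouped data is Median = L + ((N/2 − cf) / f) × h, where L is the lower boundary of the median class, N the total frequency, cf the cumulative frequency before the median class, f the frequency of the median class, and h the class width. The function name and the data below are assumptions made for illustration.

```python
def grouped_median(lower_bounds, width, freqs):
    """Interpolated median for equal-width classes.

    Applies L + ((N/2 - cf) / f) * h to the first class whose
    cumulative frequency reaches N/2.
    """
    n = sum(freqs)
    if n == 0:
        raise ValueError("distribution has no observations")
    half = n / 2
    cum = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if f > 0 and cum + f >= half:
            return lower + (half - cum) / f * width
        cum += f

# Hypothetical classes 0-10, 10-20, 20-30, 30-40 (N = 50).
med = grouped_median([0, 10, 20, 30], 10, [5, 15, 20, 10])
print(med)  # 20 + ((25 - 20) / 20) * 10 = 22.5
```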


The Mode

• The most common observation in a group of scores.

– Distributions can be unimodal, bimodal, or multimodal.

• If the data are categorical (measured on the nominal scale) then only the mode can be calculated.

• The most frequently occurring score (mode) below is Vanilla.

[Figure: bar chart of frequency (f) by flavor; vertical axis from 0 to 30]

Flavor          f
Vanilla         28
Chocolate       22
Strawberry      15
Neapolitan       8
Butter Pecan    12
Rocky Road       9
Fudge Ripple     6
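Finding the mode of the flavor table amounts to picking the category with the largest frequency. A minimal sketch using Python's standard `collections.Counter`, with the frequencies copied from the table:

```python
from collections import Counter

# Frequencies taken from the flavor table above.
flavor_counts = Counter({
    "Vanilla": 28,
    "Chocolate": 22,
    "Strawberry": 15,
    "Neapolitan": 8,
    "Butter Pecan": 12,
    "Rocky Road": 9,
    "Fudge Ripple": 6,
})

# most_common(1) returns a list holding the single (category, count)
# pair with the highest count.
mode_flavor, mode_freq = flavor_counts.most_common(1)[0]
print(mode_flavor, mode_freq)  # Vanilla 28
```

No arithmetic is performed on the categories themselves, which is why the mode works for nominal data.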


The Characteristics of the Mode

• Value that occurs most often

• Not affected by extreme values

• Used for either numerical or categorical (nominal) data

• There may be no mode

• There may be several modes

[Figure: scores plotted on a 0 to 14 axis; Mode = 9]

[Figure: scores plotted on a 0 to 6 axis; No Mode]

The Mode

• The mode can also be calculated with ordinal and higher data, but it often is not appropriate.

– If other measures can be calculated, the mode would never be the first choice!

• 7, 7, 7, 20, 23, 23, 24, 25, 26 has a mode of 7, but 7 is the lowest score and clearly does not represent the center of the distribution.
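Computing all three measures for the scores in this example makes the problem concrete. The sketch below uses Python's standard `statistics` module (`multimode` requires Python 3.8 or later).

```python
from statistics import mean, median, multimode

# Scores from the example above.
scores = [7, 7, 7, 20, 23, 23, 24, 25, 26]

modes = multimode(scores)   # list of all most frequent values
print(modes)                # [7]
print(median(scores))       # 23 (the 5th of 9 ordered scores)
print(mean(scores))         # 162 / 9 = 18
```

The mode sits at the bottom of the range, far from both the median and the mean.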

Mode for Grouped Frequency Distribution
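The worked content of this slide is not in the transcript. As a hedged sketch, the formula commonly taught for grouped data is Mode = L + ((f1 − f0) / (2·f1 − f0 − f2)) × h, where L is the lower boundary of the modal class, f1 its frequency, f0 and f2 the frequencies of the classes before and after it, and h the class width. The function name and the data below are assumptions made for illustration.

```python
def grouped_mode(lower_bounds, width, freqs):
    """Interpolated mode for equal-width classes.

    Applies L + ((f1 - f0) / (2*f1 - f0 - f2)) * h to the class
    with the highest frequency (the first one, if there are ties).
    """
    i = max(range(len(freqs)), key=lambda k: freqs[k])
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0               # class before
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # class after
    denom = 2 * f1 - f0 - f2
    if denom == 0:
        raise ValueError("modal class is not distinct from its neighbours")
    return lower_bounds[i] + (f1 - f0) / denom * width

# Hypothetical classes 0-10, 10-20, 20-30, 30-40.
mode_value = grouped_mode([0, 10, 20, 30], 10, [5, 12, 20, 8])
print(mode_value)  # 20 + (8 / 20) * 10, i.e. about 24
```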


Calculate the Mode


What is the mean rate of return here?

An investment of $100,000 declined to $50,000 at the end of year one and rebounded to $100,000 at the end of year two:

The overall two-year return is zero, since it started and ended at the same level.

\[ X_1 = \$100{,}000, \quad X_2 = \$50{,}000, \quad X_3 = \$100{,}000 \]

From X_1 to X_2: 50% decrease. From X_2 to X_3: 100% increase.

The Geometric Mean & The Geometric Rate of Return

Geometric mean

Used to measure the rate of change of a variable over time

\[ \bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \]

Geometric mean rate of return

Measures the status of an investment over time

\[ \bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1 \]

Where R_i is the rate of return in time period i

The Geometric Mean Rate of Return: Example (continued)

Use the 1-year returns to compute the arithmetic mean and the geometric mean:

Arithmetic mean rate of return:

\[ \bar{X} = \frac{(-.5) + (1)}{2} = .25 = 25\% \]

Misleading result

Geometric mean rate of return:

\[ \bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1 = [(1 + (-.5)) \times (1 + (1))]^{1/2} - 1 = [(.50) \times (2)]^{1/2} - 1 = 1^{1/2} - 1 = 0\% \]

More representative result
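The same comparison can be reproduced in Python. This is a minimal sketch using the two yearly returns from the example; the variable names are illustrative, and `math.prod` requires Python 3.8 or later.

```python
import math

# Yearly returns from the example: a 50% decrease, then a 100% increase.
returns = [-0.5, 1.0]
n = len(returns)

# Arithmetic mean of the yearly returns.
arithmetic = sum(returns) / n                 # 0.25, i.e. 25%

# Geometric mean rate of return: n-th root of the compounded
# growth factor, minus 1.
growth = math.prod(1 + r for r in returns)    # 0.5 * 2.0 = 1.0
geometric = growth ** (1 / n) - 1             # 0.0, i.e. 0%

print(arithmetic, geometric)
print(100_000 * growth)  # ending value of the $100,000 investment
```

The geometric figure agrees with the fact that the investment ends exactly where it started, while the arithmetic figure suggests a gain that never occurred.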

Measures of Central Tendency: Summary

Central Tendency

• Arithmetic Mean: \( \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \)

• Median: Middle value in the ordered array

• Mode: Most frequently observed value

• Geometric Mean: \( \bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \). Rate of change of a variable over time

Which measure to use?

Mean is generally considered the best measure of central tendency and the most frequently used one. However, there are some situations where the other measures of central tendency are preferred.

Median is preferred to mean when:

• There are a few extreme scores in the distribution.

• Some scores have undetermined values.

• There is an open-ended distribution.

• Data are measured on an ordinal scale.

Mode is the preferred measure when data are measured on a nominal scale.

A geometric mean is often used when comparing different items – finding a single "figure of merit" for these items – when each item has multiple properties that have different numeric ranges.


Quiz Time

1) A teacher gives a 10-point quiz to a class of 9 students. All the scores are whole numbers and nobody got a 0 or a perfect score of 10. If the median is 7, what is the lowest possible mean? What is the highest possible mean?

2) A poll reports that out of 100 families surveyed, the mean number of children per family was 2.038, the median was 1.9, and the mode was 1.82. Which of these values must be wrong (independently)?

3) What is the mode here?


Solve this!

Wages      0-10  10-20  20-30  30-40  40-50  50-60  60-70  Total
Frequency  4     16     ?      ?      ?      6      4      230

Median = 33.5; Mode = 34. Calculate the missing frequencies.

Now solve this!

Variable   10-20  20-30  30-40  40-50  50-60  60-70  70-80  Total
Frequency  12     30     ?      65     ?      25     18     229

Median = 46. Calculate the missing frequencies.
