central tendency - kevin dooley's ultimate web page!dooleykevin.com/psyc60.2.pdf · central...

Central Tendency

• A single summary score that best describes the central location of an entire distribution of scores.
• Measures of Central Tendency:
    • Mean: the sum of all scores divided by the number of scores.
    • Median: the value that divides the distribution in half when observations are ordered.
    • Mode: the most frequent score.

Central Tendency Example: Mode

• 52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280, 282, 283, 303, 313, 317, 317, 325, 373, 384, 384, 400, 402, 417, 422, 472, 480, 643, 693, 732, 749, 750, 791, 891
• Mode: most frequent observation
• Mode(s) for hotel rates:
    • 264, 317, 384
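The mode example can be checked with a short Python sketch. Python and the variable names are assumptions for illustration; the slides themselves use no code. `statistics.multimode` (Python 3.8+) returns every value tied for the highest count, which suits data with several modes.

```python
from statistics import multimode

# Hotel rates copied from the slide.
hotel_rates = [
    52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280,
    282, 283, 303, 313, 317, 317, 325, 373, 384, 384, 400, 402,
    417, 422, 472, 480, 643, 693, 732, 749, 750, 791, 891,
]

# multimode returns all values tied for the highest frequency,
# in the order each value first appears in the data.
modes = multimode(hotel_rates)
print(modes)  # [264, 317, 384]
```

Each of the three rates occurs twice and every other rate occurs once, so all three are modes.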

Pros and Cons of the Mode

• Pros
    • Good for nominal data.
    • Good when there are two “typical” scores.
    • Easiest to compute and understand.
    • The score comes from the data set.
• Cons
    • Ignores most of the information in a distribution.
    • Small samples may not have a mode.

Central Tendency Example: Median

• 52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280, 282, 283, 303, 313, 317, 317, 325, 373, 384, 384, 400, 402, 417, 422, 472, 480, 643, 693, 732, 749, 750, 791, 891
• The median is the middle value when observations are ordered.
• To find the middle, count in (N+1)/2 scores when observations are ordered lowest to highest.
• Median hotel rate:
    • (35+1)/2 = 18
    • 317

Finding the median with an even number of scores.

• 2, 2, 3, 5, 6, 7, 7, 7, 8, 9
• With an even number of scores, the median is the average of the middle two observations when observations are ordered.
• Find the average of the N/2 and the (N/2)+1 score.
    • N/2 = 5th score, (N/2)+1 = 6th score
• Add middle two observations and divide by two.
    • (6+7)/2 = 6.5
    • Median is 6.5
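Both counting rules for the median (odd N and even N) can be sketched in Python. The function name `median` and the variable names are assumptions for illustration; the positions in the comments are 1-based, as on the slides.

```python
def median(scores):
    """Median found by counting in to the middle of the ordered scores."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    if n % 2 == 1:
        # Odd N: take the (N+1)/2-th ordered score.
        return ordered[(n + 1) // 2 - 1]
    # Even N: average the N/2-th and the (N/2)+1-th ordered scores.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


hotel_rates = [
    52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280,
    282, 283, 303, 313, 317, 317, 325, 373, 384, 384, 400, 402,
    417, 422, 472, 480, 643, 693, 732, 749, 750, 791, 891,
]
even_scores = [2, 2, 3, 5, 6, 7, 7, 7, 8, 9]

print(median(hotel_rates))  # 317  (18th of 35 ordered scores)
print(median(even_scores))  # 6.5  (average of the 5th and 6th scores)
```

Sorting first matters: the hotel rates as listed are not fully in order (150 appears after 205).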

Pros and Cons of Median

• Pros
    • Not influenced by extreme scores or skewed distributions.
    • Good with ordinal data.
    • Easier to compute than the mean.
• Cons
    • May not exist in the data.
    • Doesn’t take actual values into account.

Mean: the average of all the scores

$$\text{mean} = \frac{\sum x}{N}$$

e.g. 3 5 1 2 3 5 4 6 7 5

$$\text{mean} = \frac{3+5+1+2+3+5+4+6+7+5}{10} = \frac{41}{10} = 4.1$$

* the most commonly used measure of central tendency
* the balance point of a distribution (illustrated next slide)
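The worked example for the mean can be reproduced with a few lines of Python. This is a minimal sketch with assumed variable names. It also checks the "balance point" property numerically: deviations from the mean cancel out.

```python
scores = [3, 5, 1, 2, 3, 5, 4, 6, 7, 5]

total = sum(scores)   # sum of all scores: 41
n = len(scores)       # number of scores: 10
mean = total / n
print(mean)  # 4.1

# Balance point: the deviations above and below the mean sum to zero
# (up to floating-point rounding).
deviations = [x - mean for x in scores]
balanced = abs(sum(deviations)) < 1e-9
print(balanced)  # True
```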

Mean

• Is the balance point of a distribution.

Mean

• Population: $\mu = \dfrac{\sum X}{N}$
    • μ: “mu”
    • Σ: “sigma”, the sum of X, add up all scores
    • N: the total number of scores in a population
• Sample: $\bar{X} = \dfrac{\sum X}{n}$
    • X̄: “X bar”
    • Σ: “sigma”, the sum of X, add up all scores
    • n: the total number of scores in a sample

Central Tendency Example: Mean

• 52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280, 282, 283, 303, 313, 317, 317, 325, 373, 384, 384, 400, 402, 417, 422, 472, 480, 643, 693, 732, 749, 750, 791, 891
• Mean hotel rate:

$$\bar{X} = \frac{\sum X}{n} = \frac{13005}{35} = 371.60$$

Mean hotel rate: $371.60

Pros and Cons of the Mean

• Pros
    • Mathematical center of a distribution.
    • Just as far from the scores above it as from the scores below it (deviations from the mean sum to zero).
    • Good for interval and ratio data.
    • Does not ignore any information.
    • Inferential statistics is based on mathematical properties of the mean.
• Cons
    • Influenced by extreme scores and skewed distributions.
    • May not exist in the data.

The effect of skew on average.

• In a normal distribution, the mean, median, and mode are the same.
• In a skewed distribution, the mean is pulled toward the tail.
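A small Python sketch can illustrate the pull of the tail. Both data sets below are hypothetical and not from the slides: the first is a symmetric, single-peaked set standing in for a normal-shaped distribution, and the second replaces its largest score with an extreme value to create a long right tail.

```python
from statistics import median, mode


def summarize(scores):
    """Return (mean, median, mode) for a list of scores."""
    return (sum(scores) / len(scores), median(scores), mode(scores))


# Hypothetical symmetric data: all three measures coincide.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Same data with the top score replaced by an extreme value.
skewed = [1, 2, 2, 3, 3, 3, 4, 4, 50]

print(summarize(symmetric))  # (3.0, 3, 3)
print(summarize(skewed))     # (8.0, 3, 3)
```

Only the mean moves when the extreme score is added; the median and mode stay at 3.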

Which average?

• Each measure contains a different kind of information.
• For example, all three measures are useful for summarizing the distribution of American household incomes.
    • In 1998, the income common to the greatest number of households (the mode) was $25,000.
    • Half the households earned less than $38,885 (the median).
    • The mean income was $50,600.
• Reporting only one measure of central tendency might be misleading and perhaps reflect a bias.

Describing Data with Tables & Graphs

Descriptive Statistics

• The goal of descriptive statistics is to summarize a collection of data in a clear and understandable way.
    • What is the pattern of scores over the range of possible values?
    • Where, on the scale of possible scores, is a point that best represents the set of scores?
    • Do the scores cluster about their central point or do they spread out around it?

What is the pattern of scores?

• Graphs often make it easier to see certain characteristics and trends in a set of data.
• Graphs for quantitative data.
    • Stem and Leaf Display
    • Histogram
    • Frequency Polygon
• Graphs for qualitative data.
    • Bar Chart
    • Pie Chart

[Figure axis labels: number/count, class/category]

Histogram

• Consists of a number of bars placed side by side.
    • The width of each bar indicates the interval size.
    • The height of each bar indicates the frequency of the interval.
    • There are no gaps between adjacent bars.
        • Continuous nature of quantitative data.

Tables: Frequency Distributions

• (def) organizes raw data or observations that have been collected by showing how often an observation occurs in each class
• For quantitative data
    • Ungrouped: list all possible scores that occur, and then indicate how often each score occurs
        • (Maximizes information about individual scores)
    • Grouped: combine all possible scores into classes and indicate how often each score occurs within a class
        • (Easier to see patterns in data, but lose information about individual scores)
        • Do not use if you have more than 20 classes!

Organize these depression scores! (ranging from 1 to 10)

1 4 4 2 7 3 7 9 1 2 5 2 1 1 4 5
1 3 3 2 10 5 5 4 1 2 3 2 1 1 3 3

Ungrouped
1.  Each observation should be included in only one class
2.  List all classes, even those with zero frequencies

score   f
1       8
2       6
3       6
4       4
5       4
6       0
7       2
8       0
9       1
10      1
Total   32
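The ungrouped table can be rebuilt from the 32 depression scores on this slide with a short Python sketch. The variable names are assumptions for illustration. Looping over the full range 1 to 10 follows rule 2: classes with zero frequency are still listed.

```python
from collections import Counter

# The 32 depression scores from the slide.
scores = [
    1, 4, 4, 2, 7, 3, 7, 9, 1, 2, 5, 2, 1, 1, 4, 5,
    1, 3, 3, 2, 10, 5, 5, 4, 1, 2, 3, 2, 1, 1, 3, 3,
]

counts = Counter(scores)

# List every possible score from 1 to 10, even those never observed.
# A Counter returns 0 for a missing key.
ungrouped = {score: counts[score] for score in range(1, 11)}

print("score  f")
for score, f in ungrouped.items():
    print(f"{score:>5}  {f}")
print(f"Total  {sum(ungrouped.values())}")
```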

Organize these depression scores! (ranging from 1 to 10)

1 4 4 2 7 3 7 9 1 2 5 2 1 1 4 5
1 3 3 2 10 5 5 4 1 2 3 2 1 1 3 3

Grouped
1.  Find the lowest and highest scores
2.  Classes should have (roughly) equal intervals
3.  Each observation should be included in only one class
4.  List all classes, even those with zero frequencies

score                       f
1-3 (no depression)         20
4-6 (moderate depression)   8
7-10 (high depression)      4
Total                       32
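A minimal Python sketch of the grouping step, using the three classes from the table. The variable names are assumptions for illustration. Each class has inclusive lower and upper bounds that do not overlap, so every score lands in exactly one class (rule 3).

```python
# The 32 depression scores from the slide.
scores = [
    1, 4, 4, 2, 7, 3, 7, 9, 1, 2, 5, 2, 1, 1, 4, 5,
    1, 3, 3, 2, 10, 5, 5, 4, 1, 2, 3, 2, 1, 1, 3, 3,
]

# (lowest score, highest score, label) for each class in the table.
classes = [
    (1, 3, "no depression"),
    (4, 6, "moderate depression"),
    (7, 10, "high depression"),
]

grouped = {}
for low, high, label in classes:
    # Count the scores that fall within this class's inclusive bounds.
    grouped[f"{low}-{high} ({label})"] = sum(low <= s <= high for s in scores)

for name, f in grouped.items():
    print(f"{name:<28}{f}")
print(f"{'Total':<28}{sum(grouped.values())}")
```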

Tables: Relative Frequency Distributions

• (def) a frequency distribution that also includes the proportion of each group
• Divide the frequency of each class by the total frequency

score                       f     Relative f
1-3 (no depression)         20    .625
4-6 (moderate depression)   8     .25
7-10 (high depression)      4     .125
Total                       32    1.00

20/32 = .625

Tables: Cumulative Frequency Distributions

• (def) a frequency distribution that also includes the total number of observations in each class and all lower-ranked classes
• Cumulative f: Add the frequency of each class to the sum of all classes ranked before it
• Cumulative %: Divide the cumulative frequency by the total sample (i.e., percentile rank)

score                       f     Relative f   Cumulative f   Cumulative %
1-3 (no depression)         20    .625         20             62.5%
4-6 (moderate depression)   8     .25          28             87.5%
7-10 (high depression)      4     .125         32             100.0%
Total                       32    1.00

20 + 8 = 28
28/32 = .875
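The relative and cumulative columns follow mechanically from the class frequencies. The sketch below starts from the grouped frequencies in the table; the shortened class labels and variable names are assumptions for illustration.

```python
from itertools import accumulate

# Class frequencies from the grouped table, lowest class first.
class_f = {"1-3": 20, "4-6": 8, "7-10": 4}
total = sum(class_f.values())  # 32

# Relative f: each class frequency divided by the total frequency.
relative_f = [f / total for f in class_f.values()]

# Cumulative f: running sum of this class and all lower-ranked classes.
cumulative_f = list(accumulate(class_f.values()))

# Cumulative %: cumulative frequency divided by the total, as a percent.
cumulative_pct = [100 * cf / total for cf in cumulative_f]

for label, f, rf, cf, cp in zip(
    class_f, class_f.values(), relative_f, cumulative_f, cumulative_pct
):
    print(f"{label:<6}{f:>4}{rf:>8.3f}{cf:>5}{cp:>8.1f}%")
```

The classes must be listed from lowest to highest, because the running sum depends on their order.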

Graphs for quantitative data: Histogram

• Consists of bars placed side by side (to represent continuity of data). The width of each bar indicates a single interval; the height indicates the frequency of the interval.

[Histogram: x-axis "Level of Depression" (1 to 10), y-axis "Frequency" (0 to 10)]

Outliers

• (def) a very extreme score
    • Is it accurate?
    • Should you segregate it from your summary?
    • Will it enhance your understanding?

Graphs: Shape

Review! Assume all scores go from low to high

• Draw and describe the shape:
    • IQ scores for the general population
    • Attractiveness scores for a bunch of supermodels
    • 1st graders’ math scores on a college-level exam
    • Annual income level for a community consisting of equal numbers of relatively poor and relatively wealthy people

Graphs for quantitative data: Frequency Polygons

• A line graph that emphasizes the continuity of continuous variables
• Uses a single point rather than a bar

[Two frequency polygon figures: x-axis 1 to 10, y-axis "Frequency" (0 to 10)]

Graphs for quantitative data: Stem & Leaf

• A display for sorting data on the basis of leading and trailing digits

Raw data

14 24 36 73 52 35 15 24 35 33 34 66 11 14 54 56 24 35 75 43 17 67 26 31 40 29 10 43 27 49 16 9 35 77 28 34 19 64 10 56

Stem   Leaf
0      9
1      001445679
2      4446789
3      134455556
4      0339
5      2466
6      467
7      357
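A stem-and-leaf display can be built from the raw data with a few lines of Python. This is a minimal sketch with assumed variable names. It treats the tens digit as the stem and the ones digit as the leaf, which matches two-digit data like this but would need adjusting for other ranges.

```python
from collections import defaultdict

# Raw data from the slide (40 scores).
raw = [
    14, 24, 36, 73, 52, 35, 15, 24, 35, 33, 34, 66, 11, 14, 54, 56,
    24, 35, 75, 43, 17, 67, 26, 31, 40, 29, 10, 43, 27, 49, 16, 9,
    35, 77, 28, 34, 19, 64, 10, 56,
]

stems = defaultdict(list)
for value in sorted(raw):
    # Leading digit (tens) is the stem; trailing digit (ones) is the leaf.
    stem, leaf = divmod(value, 10)
    stems[stem].append(str(leaf))

# Keep every stem between the lowest and highest, even if it has no leaves.
display = {
    stem: "".join(stems[stem])
    for stem in range(min(stems), max(stems) + 1)
}

print("Stem   Leaf")
for stem, leaves in display.items():
    print(f"{stem:<7}{leaves}")
```

Sorting the data before splitting it keeps the leaves within each stem in ascending order.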

Histogram

[Histogram: "Las Vegas Hotel Rates"; x-axis "Rates" in classes 0-99, 100-199, 200-299, 300-399, 400-499, 500-599, 600-699, 700-799, 800-899; y-axis "Frequency" (0 to 9); series "hotel rates"]
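The class frequencies behind a histogram like this can be tallied in Python. The sketch assumes the figure plots the 35 hotel rates listed in the central tendency examples, which the slide does not state explicitly. The class width of 100 is taken from the axis labels, and the variable names are illustrative.

```python
# Hotel rates from the central tendency examples (assumed to be the
# data behind the histogram).
hotel_rates = [
    52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280,
    282, 283, 303, 313, 317, 317, 325, 373, 384, 384, 400, 402,
    417, 422, 472, 480, 643, 693, 732, 749, 750, 791, 891,
]

width = 100

# Start every class at zero so that empty classes still appear.
bins = {f"{low}-{low + width - 1}": 0 for low in range(0, 900, width)}

for rate in hotel_rates:
    # Integer division finds the lower bound of the class for this rate.
    low = (rate // width) * width
    bins[f"{low}-{low + width - 1}"] += 1

# Text histogram: one '#' per observation in the class.
for label, f in bins.items():
    print(f"{label:>8} | {'#' * f}")
```

Under that assumption the tallest bar is the 300-399 class with 8 hotels, and the 500-599 class is empty.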

Shapes of Histograms

Frequency Polygons

• Uses a single point rather than a bar

[Frequency polygon: x-axis "Range" in classes 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-79, 80-84, 85-89, 90-94, 95-99; y-axis "Frequency" (0 to 35)]

Bar Graph

Pie Graph

Misleading Graphs
