quants


Upload: aliquis

Post on 27-Jan-2015


Page 1: Quants

What is a Variable?

• any entity that can take on different values

• not always 'quantitative' or numerical, but we can assign numerical values

• attribute = a specific value of a variable

Examples:

• gender: 1 = female; 2 = male

• attitudes: 1 = strongly disagree; 2 = disagree; 3 = neutral; 4 = agree; 5 = strongly agree

Page 2: Quants

Coding in a data matrix

Page 3: Quants

Coding in a data matrix

Gender: Male = 1; Female = 2

Political Orientation: Traditionalist = 1; Moderate = 2; Progressive = 3

Social Class: Working = 1; Upper working = 2; Lower middle = 3; Middle = 4; Upper middle = 5
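As a minimal sketch of how such codes end up in a data matrix, the Python snippet below turns labelled answers into rows of numbers. The codes are the ones listed on this slide; the two respondents and the names `CODEBOOK`, `respondents` and `code_row` are invented for illustration only.

```python
# Coding schemes copied from the slide: each attribute (label) gets a numeric code.
CODEBOOK = {
    "gender": {"Male": 1, "Female": 2},
    "political_orientation": {"Traditionalist": 1, "Moderate": 2, "Progressive": 3},
    "social_class": {
        "Working": 1, "Upper working": 2, "Lower middle": 3,
        "Middle": 4, "Upper middle": 5,
    },
}

# Hypothetical respondents, invented purely for illustration.
respondents = [
    {"gender": "Female", "political_orientation": "Moderate", "social_class": "Middle"},
    {"gender": "Male", "political_orientation": "Progressive", "social_class": "Working"},
]

def code_row(row):
    """Translate one respondent's labels into numeric codes, in codebook order."""
    return [CODEBOOK[variable][row[variable]] for variable in CODEBOOK]

# One row per case, one column per variable.
data_matrix = [code_row(r) for r in respondents]
print(data_matrix)
```

Each row is one case and each column one variable, which is the layout a statistics package expects.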

Page 4: Quants

Levels of Measurement

• different kinds of variables

(1) Nominal

(2) Ordinal

(3) Interval and Ratio

Page 5: Quants

Nominal Variable

• used to classify things

• represents equivalence (=)

• adding, subtracting, multiplying or dividing nominal numbers is meaningless

• tells you how many categories there are in the scheme

Page 6: Quants

Ordinal Variable

• ordering or ranking of the variable

• the relationship between numbered items

• ‘higher’, ‘lower’, ‘easier’, ‘faster’, ‘more often’

• equivalence (=) and relative size: > (greater than) and < (less than)

Page 7: Quants

Interval (and Ratio) Variable

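• All arithmetical operations are allowed (strictly, multiplication and division are meaningful only on a ratio scale, which has a true zero)

• intervals between each step are of equal size

• Examples:

- length, weight, elapsed time, speed, temperature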

Page 8: Quants

Women’s Shoe Sizes

British         2     2.5   3     3.5   4     4.5   5     5.5   6     6.5   7     7.5   8
European        34    35    35.5  36    37    37.5  38    38.5  39    39.5  40    41    42
American        4.5   5     5.5   6     6.5   7     7.5   8     8.5   9     9.5   10    10.5
Japanese (cm)   21.5  22    22.5  23    23    23.5  24    24    24.5  25    25.5  26    26.5

Page 9: Quants

Levels of measurement

Level            Are names   Have an inherent order (more to less, higher to lower)   Are numbers with equal intervals between them
Nominal level    yes         no                                                       no
Ordinal level    yes         yes                                                      no
Interval level   yes         yes                                                      yes

Page 10: Quants

Frequency distributions

• count number of occurrences that fall into each category of each variable

• allow you to compare information between groups of individuals

• also allow you to see what are the highest and lowest values and the value at which most scores cluster

• variables of any level of measurement can be displayed in a frequency table
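A frequency table can be sketched in a few lines of Python. The scores below are hypothetical answers on the five-point attitude scale (1 = strongly disagree to 5 = strongly agree), and `collections.Counter` is only one of several ways to do the counting.

```python
from collections import Counter

# Hypothetical attitude scores, invented for illustration.
scores = [4, 2, 4, 5, 3, 4, 1, 2, 4, 3]

# Count how many times each value occurs.
frequencies = Counter(scores)

print("value  frequency")
for value in sorted(frequencies):
    print(f"{value:>5}  {frequencies[value]:>9}")

# The table also shows the extremes and where most scores cluster.
lowest, highest = min(scores), max(scores)
most_common_value, most_common_count = frequencies.most_common(1)[0]
print("lowest:", lowest, "highest:", highest, "most frequent:", most_common_value)
```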

Page 11: Quants

Frequency table

Page 12: Quants

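Percentages

• number of cases belonging to a particular category divided by the total number of cases and multiplied by 100

• the total of percentages in any particular group equals 100 per cent

$$\% = \frac{f}{N} \times 100$$

where f is the number of cases in the category and N is the total number of cases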
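The percentage rule can be checked with a short Python sketch. The counts (26 female and 6 male sociology students) are the ones shown on the pie chart slide of this deck; the function name `percentages` is an illustrative choice.

```python
def percentages(freqs):
    """Convert category frequencies into percentages of the total number of cases."""
    n = sum(freqs.values())
    return {category: 100 * f / n for category, f in freqs.items()}

# Counts taken from the pie chart slide (Gender of Sociology Students).
counts = {"Female": 26, "Male": 6}
pct = percentages(counts)

for category, value in pct.items():
    print(f"{category}: {value:.0f}%")

# The percentages within one group add up to 100.
print(sum(pct.values()))
```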

Page 13: Quants

Graphical presentation

• Pie charts

• Barcharts

• Line graphs

• Histograms

Page 14: Quants

Pie chart

• illustrates the frequency (or percentage) of each individual category of a variable relative to the total.

• Pie charts are not appropriate for displaying quantitative data.

[Pie chart: Gender of Sociology Students. Female: 26 (81%); Male: 6 (19%)]

Page 15: Quants


Barcharts

• the height of each bar is proportional to the frequency of the category it represents, so categories are easy to compare

• used for Nominal or Ordinal level variables (or discrete interval/ratio level variables with relatively few categories)

[Bar chart: Marital Status. Categories: Married, Living as married, Single, Divorced, Separated, Widowed, Missing. Vertical axis: 0 to 160. Bar values shown: 140, 60, 85, 75, 30, 35, 75.]

Page 16: Quants

Multiple barchart

[Multiple bar chart: Marital status, with one bar per year (1995 and 2000) for each category. Categories: Married, Living as married, Single, Divorced, Separated, Widowed. Vertical axis: Frequencies, 0 to 160.]

Page 17: Quants

Compound or Component barchart

[Compound bar chart: Sociology Students by Year (2001, 2002), each bar split into Male and Female. Vertical axis: Frequency, 0 to 100. Segment values shown: 41, 42, 26, 39.]

Page 18: Quants

Line graphs

• interval/ratio level variables that are discrete

• need to arrange the values in order

[Line graph: Value of PRODUCT by YEAR, 1991 to 2001. Vertical axis: 110 to 170.]

Page 19: Quants

Histograms

• represents continuous quantitative data

• The height of the bars corresponds to the frequency or percentage of cases in the class.

• The width of the bars represents the size of the intervals of the variable.

• The horizontal axis is marked out using the midpoints of the class intervals.

Page 20: Quants

Example: Histogram

[Histogram: MARKS, class midpoints from 25.0 to 85.0 in steps of 5. Vertical axis: 0 to 30. Std. Dev = 11.83; Mean = 55.4; N = 100]

Page 21: Quants

Graphs have the capacity to distort

[Two line graphs of Value of PRODUCT by YEAR. First graph: 1991 to 2001, vertical axis 110 to 170. Second graph: 1990 to 2001, vertical axis 0 to 200.]

Page 22: Quants

Measures of Central Tendency

• describe sets of numbers briefly, yet accurately

• describe groups of numbers by means of other, but fewer numbers

• Three main measures:

- mean

- median

- mode

Page 23: Quants

The Mean

• most common type of average that is computed.

Page 24: Quants

When to use the Mean

• When values in a particular group cluster closely around a central value, the mean is a good way of indicating the ‘typical’ score, i.e. it is truly representative of the numbers.

• If the values are very widely spread, are very unevenly distributed, or are clustered around extreme values, then the mean can be misleading, and other measures of central tendency should be used instead.
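A small Python sketch makes the point about when the mean is representative. Both score lists are hypothetical: the first clusters around 50, the second is identical except for one extreme value.

```python
def mean(scores):
    """Arithmetic mean: the sum of the scores divided by how many there are."""
    return sum(scores) / len(scores)

# Hypothetical scores clustered closely around a central value.
clustered = [48, 50, 51, 49, 52]

# The same scores with one extreme value in place of the last one.
skewed = [48, 50, 51, 49, 502]

print(mean(clustered))  # close to every score in the list
print(mean(skewed))     # far from all five scores, so not 'typical' of any
```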

Page 25: Quants

The Median

• Also an average, but of a different kind.

• It is defined as the midpoint in a set of scores: the point at which one-half, or 50%, of the scores fall above and one-half, or 50%, fall below.

• Computing the Median:

(1) List the scores in order, either from highest to lowest or lowest to highest.

(2) Find the middle score. That’s the median. (With an even number of scores, take the mean of the two middle scores.)
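The two steps translate directly into Python. This is a sketch rather than a reference implementation: for an even number of scores it averages the two middle scores, which is the usual convention, and the example lists are hypothetical.

```python
def median(scores):
    """Midpoint of a set of scores."""
    ordered = sorted(scores)   # step 1: list the scores in order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:             # step 2: odd count, so take the single middle score
        return ordered[middle]
    # Even count: average the two scores on either side of the midpoint.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([7, 3, 9, 1, 5]))  # five scores
print(median([7, 3, 9, 1]))     # four scores
```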

Page 26: Quants

The Median: Pros and Cons

• time-consuming

• if one of the numbers near the middle of the distribution moves even slightly, then the median would alter, unlike the mean, which is relatively unaffected by a change in one of the central numbers

• if one of the extreme values changes, then the median remains unaltered

- 2, 80, 100, 120, 130, 140, 160, 200, 3150

• single scores which are quite clearly ‘deviant’ when compared with the others are known as outliers: here, 2 and 3150
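The effect of an extreme value can be tried out on the nine scores from this slide. In the sketch below the change of the outlier 3150 to 315 is an invented illustration; `statistics.mean` and `statistics.median` come from the Python standard library.

```python
import statistics

# Scores from the slide, including the outliers 2 and 3150.
scores = [2, 80, 100, 120, 130, 140, 160, 200, 3150]

# Hypothetical change to one extreme value (3150 becomes 315).
changed = [2, 80, 100, 120, 130, 140, 160, 200, 315]

print(statistics.median(scores), statistics.median(changed))  # median is unaltered
print(round(statistics.mean(scores), 2), round(statistics.mean(changed), 2))  # mean shifts a lot
```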

Page 27: Quants

The Mode

• the value in any set of scores that occurs most often

• example 1: 5, 6, 7, 8, 8, 8, 9, 10, 10, 12; the mode = 8

• example 2: 5, 6, 7, 8, 8, 8, 9, 10, 10, 10, 12; two modes, 8 and 10 (bimodal)

• very unstable figure:

- 1, 1, 6, 7, 8, 10: mode = 1

- 1, 6, 7, 8, 10, 10: mode = 10
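The examples on this slide can be reproduced with `statistics.multimode` from the Python standard library (available from Python 3.8), which returns every value tied for most frequent. The variable names are illustrative.

```python
import statistics

example_1 = [5, 6, 7, 8, 8, 8, 9, 10, 10, 12]
example_2 = [5, 6, 7, 8, 8, 8, 9, 10, 10, 10, 12]

print(statistics.multimode(example_1))  # a single mode
print(statistics.multimode(example_2))  # two modes: bimodal

# Instability: moving one score from one end to the other flips the mode.
print(statistics.multimode([1, 1, 6, 7, 8, 10]))
print(statistics.multimode([1, 6, 7, 8, 10, 10]))
```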

Page 28: Quants

When to Use What?

• depends on the type of data that you are describing

- for nominal data: only the mode

- for ordinal data: mode and median

- for interval data: all of them

• but, for extreme scores - use the median

Page 29: Quants

Measure of dispersion (spread)

• better impression of a distribution’s shape

• measures indicate how widely scattered the numbers are

• how different scores are from one particular score: the mean

• variability: a measure of how much each score in a group of scores differs from the mean

Page 30: Quants

The range

• tells us over how many numbers altogether a distribution is spread

$$r = h - l$$

where

• r is the range

• h is the highest score in the data set

• l is the lowest score in the data set

Page 31: Quants

[Scatter plot of the scores 12, 13, 12, 11, 14, 13, 12, 10, 10, 11 and 55. Vertical axis: 0 to 60.]

r = biggest value - smallest value = 55 - 10 = 45
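The same calculation as a Python sketch. The scores are read from this slide, on the assumption that the run "1114" in the transcript stands for the two scores 11 and 14; the function name `value_range` is an illustrative choice.

```python
def value_range(scores):
    """Range r = h - l: highest score minus lowest score."""
    return max(scores) - min(scores)

# Scores as read from the slide (assumes "1114" means 11 and 14).
scores = [12, 13, 12, 11, 14, 13, 12, 10, 10, 11, 55]

print(value_range(scores))       # 55 - 10

# Without the single extreme score, the range shrinks sharply.
print(value_range(scores[:-1]))  # 14 - 10
```

A single extreme score accounts for almost all of the range here, so the range says little about how the other ten scores are spread.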

Page 32: Quants

The mean deviation

• number which indicates how much, on average, the scores in a distribution differ from a central point, the mean.

$$\text{Mean deviation} = \frac{\sum |X - \bar{X}|}{N}$$

Page 33: Quants

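[Scatter plot of the scores 160, 580, 2, 738, 6, 734, 51 and 689 around the line Mean = 370, with their deviations from the mean: -210, 210, -368, 368, -364, 364, -319, 319. Vertical axis: 0 to 800.]

$$\sum (X - \bar{X}) = (-210) + 210 + (-368) + 368 + (-364) + 364 + (-319) + 319 = 0$$

$$\sum |X - \bar{X}| = 210 + 210 + 368 + 368 + 364 + 364 + 319 + 319 = 2522$$

mean deviation = 2522/8 = 315.25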
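The arithmetic on this slide can be verified with a short Python sketch using the same eight scores. The function name `mean_deviation` is an illustrative choice.

```python
def mean_deviation(scores):
    """Average absolute distance of the scores from their mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sum(abs(x - mean) for x in scores) / n

# The eight scores plotted on the slide.
scores = [160, 580, 2, 738, 6, 734, 51, 689]
mean = sum(scores) / len(scores)

# Signed deviations cancel out, which is why absolute values are used.
print(sum(x - mean for x in scores))
print(sum(abs(x - mean) for x in scores))
print(mean_deviation(scores))
```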

Page 34: Quants

The standard deviation (SD)

• represents the average amount of variability

• It is the average distance from the mean

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$

• $s$ is the standard deviation

• $\sum$ means: find the sum of what follows

• $X$ is each individual score

• $\bar{X}$ is the mean of all the scores

• $N$ is the sample size
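A sketch of the calculation in Python, dividing by N as the symbols on this slide suggest. The scores are borrowed from the mean deviation slide. Note that many textbooks and software packages divide by N - 1 for a sample; in the standard library, `statistics.pstdev` divides by N and `statistics.stdev` divides by N - 1.

```python
import math
import statistics

def standard_deviation(scores):
    """Square root of the mean squared deviation from the mean (divides by N)."""
    n = len(scores)
    mean = sum(scores) / n
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / n)

# Scores borrowed from the mean deviation slide.
scores = [160, 580, 2, 738, 6, 734, 51, 689]

s = standard_deviation(scores)
print(round(s, 2))
print(round(statistics.pstdev(scores), 2))  # library function that also divides by N
print(round(statistics.stdev(scores), 2))   # divides by N - 1, so it is slightly larger
```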

Page 35: Quants

Standard deviations

Page 36: Quants

Shape of Normal Distribution

[Normal curve: Mean, Median and Mode at the centre; Symmetrical; Asymptotic tail]

Page 37: Quants

The area under the curve

• A normal distribution always has the same relative proportions of scores falling between particular values of the numbers involved.

• Areas under the curve = proportion of scores lying in the various parts of the complete distribution
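The fixed proportions can be computed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The sketch below finds the share of scores lying within 1, 2 and 3 standard deviations of the mean; the helper name `proportion_within` is an illustrative choice.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Area under the curve between k standard deviations below and above the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {proportion_within(k):.4f}")
```

The same proportions hold for any normal distribution, whatever its mean and standard deviation, because the area depends only on distance from the mean measured in standard deviations.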

Page 38: Quants

SS2008N - Surveys

Median

[Figure: distribution divided at the Median, with 50% of scores on each side]

Page 39: Quants


Quartiles

[Figure: distribution divided by Quartile 1, the Median and Quartile 3 into four parts, each containing 25% of the scores]
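Quartiles can be sketched with `statistics.quantiles` from the Python standard library (Python 3.8 or later). The eleven scores are hypothetical, and different packages use slightly different interpolation rules, so quartile values may differ a little between tools.

```python
import statistics

# Hypothetical scores, already in order for readability.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 asks for the three cut points that split the data into four equal parts.
q1, q2, q3 = statistics.quantiles(scores, n=4)

print("Quartile 1:", q1)
print("Median (Quartile 2):", q2)
print("Quartile 3:", q3)
```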

Page 40: Quants

Standard Deviation