Statistics Review Levels of Measurement

Post on 21-Dec-2015

Page 1: Statistics Review Levels of Measurement. Nominal scale Nominal measurement consists of assigning items to groups or categories. No quantitative information

Statistics Review

Levels of Measurement

Levels of Measurement

Nominal scale
• Nominal measurement consists of assigning items to groups or categories.
• No quantitative information is conveyed and no ordering of the items is implied.
• Nominal scales are therefore qualitative rather than quantitative.
• Examples: religious preference, race, and gender are all nominal scales.
• Statistics: counts, mode, frequency distributions.
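A minimal sketch of a frequency distribution for nominal data. The category labels below are made up for illustration; the point is only that counting per category (and picking the most frequent one) makes sense even though the labels carry no order or magnitude.

```python
from collections import Counter

# Hypothetical nominal data: labels with no order and no magnitude.
responses = ["A", "B", "A", "C", "B", "A"]

# A frequency distribution counts how many items fall in each category.
frequencies = Counter(responses)

# The most frequent category (the mode) is meaningful for nominal data.
mode_category, mode_count = frequencies.most_common(1)[0]

print(dict(frequencies))
print(mode_category, mode_count)
```

Averaging these labels would be meaningless, which is why counts are the natural summary at this level of measurement.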

Ordinal Scale
• Measurements with ordinal scales are ordered: higher numbers represent higher values.
• However, the intervals between the numbers are not necessarily equal.
• There is no "true" zero point for ordinal scales since the zero point is chosen arbitrarily.
• For example, on a five-point Likert scale, the difference between 2 and 3 may not represent the same difference as the difference between 4 and 5.
• Also, the lowest point was arbitrarily chosen to be 1. It could just as well have been 0 or -5.

Interval & Ratio Scales
• On interval measurement scales, one unit on the scale represents the same magnitude of the trait or characteristic being measured across the whole range of the scale.
• For example, on an interval/ratio scale of anxiety, a difference between 10 and 11 would represent the same difference in anxiety as between 50 and 51.
• Ratio scales additionally have a true zero point, so ratios of values are meaningful.

Histograms

Statistics Review

What can histograms tell you

A convenient way to summarize data (especially for larger datasets)

Shows the distribution of the variable in the population

Gives an approximate idea of the center and spread of the variable
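A small sketch of how a histogram is built: values are grouped into equal-width bins and the count in each bin is drawn as a bar. The data and the bin width of 10 are assumptions chosen for illustration, not values from the slides.

```python
from collections import Counter

# Hypothetical observations (e.g. number of graphics on a page).
values = [2, 3, 7, 8, 12, 14, 15, 21, 22, 40]
bin_width = 10  # assumed bin width

# Map each value to the lower edge of its bin: 0-9 -> 0, 10-19 -> 10, ...
bins = Counter((v // bin_width) * bin_width for v in values)

# Text histogram: one row per bin, empty bins included.
for lower in range(0, max(values) + bin_width, bin_width):
    upper = lower + bin_width - 1
    print(f"{lower:>3}-{upper:<3} {'#' * bins[lower]}")
```

Even in this tiny example the bars show where most values sit and how far the largest one lies from the rest.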

[Figure: histogram "Distribution of No of Graphics on web pages (N=1873)"; x-axis: Graphic Count (0 to 95), y-axis: frequency (0 to 400); Mean = 17.93, Median = 16.00, Std. Dev = 17.92, N = 1873]

Statistics Review

Mean and Median

Mean and Median

Mean shifts around; median does not shift much and is more stable.

Computing the median:
• For odd-numbered N, find the middle number.
• For even-numbered N, interpolate between the middle two, e.g. if they are 7 and 9, then 8 is the median.

The mean is the arithmetic average; the median is the 50% point.

The mean is the point where the graph balances.
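The median rules and the stability claim can be sketched with Python's standard `statistics` module. The lists are invented for illustration; the even-N list reuses the 7 and 9 from the example above.

```python
import statistics

# Odd N: the median is the middle number of the sorted list.
odd_list = [5, 7, 9]
median_odd = statistics.median(odd_list)

# Even N: the median is halfway between the middle two (7 and 9 here).
even_list = [5, 7, 9, 11]
median_even = statistics.median(even_list)

# One extreme value moves the mean a lot but leaves the median alone.
data = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 500]

mean_before = statistics.mean(data)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(data)
median_after = statistics.median(with_outlier)

print(median_odd, median_even)
print(mean_before, mean_after, median_before, median_after)
```

Replacing a single value changes the mean from 3 to 102 while the median stays at 3.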

Instability of mean

[Figure: histogram "Distribution of word count (N=1897)"; x-axis: WORDCNT2 (0 to 4000), y-axis: frequency (0 to 800); Mean = 368.0, Median = 223, Minimum = 0, Maximum = 4132]

[Figure: histogram "Distribution of word count (N=1873), top 1% removed"; x-axis: WORDCNT2 (0 to 2400), y-axis: frequency (0 to 500); Mean = 333.4, Median = 220, Minimum = 0, Maximum = 4132]

Statistics Review

Standard Deviation

Standard Deviation: a measure of spread

The SD says how far away numbers on a list are from their average.

Most entries on the list will be somewhere around one SD away from the average. Very few will be more than two or three SD’s away.
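A minimal sketch of the SD as "typical distance from the average", computed by hand for a made-up list. It uses the population form (dividing by N); that choice is an assumption, since the slides do not say which denominator they use.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical list of numbers
n = len(scores)
mean = sum(scores) / n

# SD: square root of the average squared deviation from the mean.
sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)

# Share of entries within one and within two SDs of the average.
within_one = sum(abs(x - mean) <= sd for x in scores) / n
within_two = sum(abs(x - mean) <= 2 * sd for x in scores) / n

print(mean, sd, within_one, within_two)
```

Here the mean is 5, the SD is 2, three quarters of the entries lie within one SD, and all of them lie within two.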

Properties of the standard deviation

• The standard deviation is in the same units as the mean

• The sample standard deviation tends to underestimate the population SD, and the bias is larger for small samples (as a measure of spread it is therefore biased)

• In normally distributed data about 68% of the sample lies within 1 SD of the mean

Statistics Review

Normal Probability Curve

Properties of the Normal Probability Curve

• The graph is symmetric about the mean (the part to the right is a mirror image of the part to the left)

• The total area under the curve equals 100%

• Curve is always above the horizontal axis

• Appears to stop after a certain point (the curve gets really low)

• The graph is symmetric about the mean
• The total area under the curve equals 100%
• Within 1 SD of the mean (±1 SD) = about 68%
• Within 2 SD of the mean (±2 SD) = about 95%
• Within 3 SD of the mean (±3 SD) = about 99.7%
• You can disregard the rest of the curve
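The 68 / 95 / 99.7 figures can be checked against the standard normal curve with `statistics.NormalDist` from the Python standard library. The helper name `area_within` is made up for this sketch.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k):
    """Area under the standard normal curve within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.1%}")
```

The area left beyond 3 SDs is under half a percent, which is why the rest of the curve can usually be disregarded.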

What if the histogram is not normal?

It is a remarkable fact that many histograms in real life tend to follow the normal curve.

For such histograms, the mean and SD are good summary statistics.

The average pins down the center, while the SD gives the spread.

For histograms which do not follow the normal curve, the mean and SD are not good summary statistics.

Use the interquartile range: 75th percentile - 25th percentile

Can be used when the SD is too influenced by outliers.

Note: a percentile is a score below which a certain % of the sample lies.
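A sketch of the interquartile range next to the SD, using `statistics.quantiles` with its default method. The data are invented; the second list replaces one value with an extreme outlier to show which measure reacts.

```python
import statistics

clean = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]            # hypothetical scores
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000]   # one extreme value

def iqr(data):
    """75th percentile minus 25th percentile."""
    q1, _median, q3 = statistics.quantiles(data, n=4)
    return q3 - q1

iqr_clean = iqr(clean)
iqr_outlier = iqr(with_outlier)
sd_clean = statistics.pstdev(clean)
sd_outlier = statistics.pstdev(with_outlier)

print(iqr_clean, iqr_outlier)
print(round(sd_clean, 2), round(sd_outlier, 2))
```

The IQR is the same for both lists, while the SD is many times larger once the outlier is present.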

Statistics Review

Population and Sample

An investigator usually wants to generalize about a class of individuals/things (the population)

For example: in forecasting the results of elections, the population is all eligible voters.

• Usually there are some numerical facts about the population (parameters) which you want to estimate

• You can do that by measuring the same aspect in the sample (statistic)

• Depending on the accuracy of your measurement, and how representative your sample is, you can make inferences about the population

Statistics Review

Scatter Plots and Correlations

Example Scatterplots

[Figure: two example scatterplots of y against x, one labelled "High correlation" and one labelled "Low correlation"]

What is a Correlation Coefficient

• A measure of degree of relationship between two variables.

• Sign refers to direction.

• Based on covariance

• Measure of degree to which large scores go with large scores, and small scores with small scores

• Pearson’s correlation coefficient is most often used
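A minimal sketch of Pearson's r computed from the covariance, as the bullets above describe: covariance divided by the product of the two standard deviations. The function name `pearson_r` and the data are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance of x and y divided by (SD of x * SD of y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

xs = [1, 2, 3, 4, 5]   # hypothetical scores on variable x
ys = [2, 4, 5, 4, 5]   # hypothetical scores on variable y

r = pearson_r(xs, ys)
r_reversed = pearson_r(xs, [-y for y in ys])  # same strength, opposite sign

print(round(r, 3), round(r_reversed, 3))
```

Large x scores mostly go with large y scores here, so r is positive; flipping the sign of y flips the sign of r without changing its size.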

Factors Affecting r
• Range restrictions
• Outliers
• Nonlinearity, e.g. anxiety and performance
• Heterogeneous subsamples
– Everyday examples

The effect of outliers on correlations

Dataset: 20 cases selected from darts and pros

[Figure: scatterplot of DARTS (x-axis) against Pros (y-axis), both axes running from -40 to 80; r = .80]

Effect of linear transformations of data

• A linear transformation of either variable (with a positive scale factor) has no effect on Pearson's correlation coefficient.

• Example: r between height and weight is the same regardless of whether height is measured in inches, feet, centimeters or even miles.

• This is a very desirable property since the choice between measurement scales that are linear transformations of each other is often arbitrary.
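The height and weight example can be sketched directly: converting inches to centimeters rescales every height by 2.54, and r does not move. The heights, weights, and the helper `pearson_r` are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of deviations around the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

heights_in = [60, 62, 65, 68, 70, 72]        # hypothetical heights, inches
weights_lb = [115, 120, 140, 155, 160, 180]  # hypothetical weights, pounds

heights_cm = [h * 2.54 for h in heights_in]  # linear rescaling of height

r_inches = pearson_r(heights_in, weights_lb)
r_cm = pearson_r(heights_cm, weights_lb)

print(round(r_inches, 4), round(r_cm, 4))
```

The scale factor multiplies the covariance and the SD of height by the same amount, so it cancels out of the ratio.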

Non linear relationships

Example: Anxiety and Performance

[Figure: scatterplot for anxiety and performance showing a nonlinear relationship; x-axis from -2.5 to 2, y-axis from 0 to 16; r = .07]
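A sketch of why a strong nonlinear relationship can still give an r near zero. The data below follow a perfect inverted U (y is fully determined by x), which is an invented stand-in for the anxiety and performance pattern, not the data behind the figure.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of deviations around the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical inverted U: performance peaks at moderate anxiety (x = 0).
anxiety = [-3, -2, -1, 0, 1, 2, 3]
performance = [9 - x * x for x in anxiety]

r = pearson_r(anxiety, performance)
print(r)
```

The rising half and the falling half cancel, so r comes out at zero even though performance depends entirely on anxiety. Pearson's r only measures the linear part of a relationship.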

The interpretation of a correlation coefficient

• Ranges from –1 to 1

• No correlation in the data means you will get an r of 0 or near it

• Suffers from sampling error (like everything else!). So you need to estimate the true population correlation from the sample correlation.

Statistics Review

Hypothesis Testing

Null and Alternative Hypothesis

• Sampling error implies that sometimes the results we obtain will be due to chance (since not every sample will accurately resemble the population)

• The null hypothesis expresses the idea that an observed difference is due to chance.

• For example: There is no difference between the norms regarding the use of email and voice mail

The alternative hypothesis

• The alternative hypothesis (the experimental hypothesis) is often the one that you formulate.

• For example: There is a correlation between people’s perception of a website’s reliability and the probability of their buying something on the site

• Why bother to have a null hypothesis?
– Can you reject the null hypothesis?

One Tailed and Two Tailed tests

One tailed tests: Based on a uni-directional hypothesis
Hypothesis: Training will reduce the number of problems users have with PowerPoint

Two tailed tests: Based on a bi-directional hypothesis
Hypothesis: Training will change the number of problems users have with PowerPoint

Implications of one and two tailed tests

[Figure: sampling distribution histogram, "Population for usability of Powerpoint"; x-axis: Mean Usability Index (3.75 to 7.25), y-axis: Frequency (0 to 1400); Mean = 5.65, Std. Dev = .45, N = 10000. The rejection region is identified for a unidirectional hypothesis at the .05 level and for a bidirectional hypothesis at the .05 level.]
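A sketch of where the rejection regions start at the .05 level, assuming a standard normal sampling distribution (an assumption for illustration; the slide's own distribution is in usability-index units). The observed value `z_observed` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1
alpha = 0.05

# One-tailed: all of alpha sits in one tail of the sampling distribution.
one_tailed_cutoff = std_normal.inv_cdf(1 - alpha)
# Two-tailed: alpha is split evenly between the two tails.
two_tailed_cutoff = std_normal.inv_cdf(1 - alpha / 2)

z_observed = 1.8  # hypothetical standardized result, in the predicted direction
significant_one_tailed = z_observed > one_tailed_cutoff
significant_two_tailed = abs(z_observed) > two_tailed_cutoff

print(round(one_tailed_cutoff, 3), round(two_tailed_cutoff, 3))
print(significant_one_tailed, significant_two_tailed)
```

The same result can be significant under the unidirectional hypothesis and not under the bidirectional one, because the two-tailed cutoff lies further from the mean.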

Significance levels

• What happens if we relax our significance level from .01 to .05?

The probability of finding differences that don’t exist goes up (the criterion becomes more lenient).

PowerPoint example:
If we set the significance level at .05, then when there is truly no difference we will still find one by chance 5% of the time, and 95% of the time we will correctly find none.

If we set the significance level at .01, then when there is truly no difference we will find one by chance 1% of the time, and 99% of the time we will correctly find none.

• Effect of relaxing the significance level from .01 to .05
– The probability of finding differences that don’t exist goes up
– This is called Type I error (alpha)

• Effect of tightening the significance level from .01 to .001
– The probability of not finding differences that exist goes up
– This is called Type II error (beta)
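A sketch of the trade-off between the two error types for a one-tailed z test. It assumes a real difference of 2.5 standard errors (`true_effect`, a made-up value) and computes the chance of missing it at each significance level.

```python
from statistics import NormalDist

std_normal = NormalDist()
true_effect = 2.5  # hypothetical real difference, in standard-error units

def type_ii_error(alpha, effect):
    """Chance of missing a real effect in a one-tailed z test at level alpha."""
    cutoff = std_normal.inv_cdf(1 - alpha)
    # We miss the effect when the observed z falls below the cutoff.
    return std_normal.cdf(cutoff - effect)

for alpha in (0.05, 0.01, 0.001):
    beta = type_ii_error(alpha, true_effect)
    print(f"alpha = {alpha}: beta = {beta:.3f}")
```

As alpha gets stricter, beta rises: fewer false alarms are bought at the price of more missed differences.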

Significance levels for usability

• For usability, if you set out to find problems, setting lenient criteria might work better (you will identify more problems)

Degrees of Freedom
• The number of independent pieces of information remaining after estimating one or more parameters

• Example: List = 1, 2, 3, 4; Average = 2.5

• For the average to remain the same, three of the numbers can be anything you want; the fourth is fixed

• New List = 1, 5, 2.5, __; Average = 2.5
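The arithmetic behind the example can be sketched as follows: once the average and three of the four numbers are chosen, the last number has no freedom left. Variable names are made up for illustration.

```python
target_mean = 2.5
n = 4

# Three values can be chosen freely...
free_values = [1, 5, 2.5]

# ...but the fourth is forced, because the total must equal n * mean.
fixed_value = target_mean * n - sum(free_values)

new_list = free_values + [fixed_value]
new_mean = sum(new_list) / n

print(fixed_value, new_mean)
```

With four numbers and one estimated parameter (the mean), there are 4 - 1 = 3 degrees of freedom.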

Statistics Review

Comparing Means: t tests

Major Points
• T tests: are differences significant?
• One sample t tests: comparing one mean to the population
• Within subjects test: comparing the mean in condition 1 to the mean in condition 2
• Between subjects test: comparing the mean in condition 1 to the mean in condition 2

One sample t test

• Mean of population known, but standard deviation (SD) not known

• Compute t statistic

• Compare t to tabled values (for relevant degree of freedom) which show critical values of t
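A minimal sketch of the one sample t statistic: the difference between the sample mean and the known population mean, divided by the estimated standard error. The sample and the population mean of 6 are invented for illustration.

```python
import math
import statistics

sample = [6, 7, 8, 9, 10]   # hypothetical sample scores
population_mean = 6         # assumed known population mean

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)       # SD estimated from the sample (N - 1)
standard_error = sample_sd / math.sqrt(n)

t = (sample_mean - population_mean) / standard_error
df = n - 1

# Tabled critical value of t for df = 4, two-tailed, .05 level.
critical_t = 2.776
print(round(t, 3), df, abs(t) > critical_t)
```

The three ingredients of t are visible in the formula: a larger mean difference, a smaller sample variance, or a larger sample all push t up.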

Factors Affecting t

• Difference between sample and population means

• Magnitude of sample variance

• Sample size

Factors Affecting Decision

• Significance level

• One-tailed versus two-tailed test

Within subjects/ Repeated Measures / Related Samples t test

• Correlation between before and after scores
– Causes a change in the statistic we can use

Advantages of within subject designs

• Eliminate subject-to-subject variability
• Control for extraneous variables
• Need fewer subjects

Disadvantages of Within Subjects

• Order effects
• Carry-over effects
• Subjects no longer naïve
• Change may just be a function of time
• Sometimes not logically possible

Between subjects t test

• Distribution of differences between means

• Heterogeneity of Variance

• Nonnormality

Assumptions of Between Subjects t tests

• Two major assumptions
– Both groups are sampled from populations with the same variance (“homogeneity of variance”)
– Both groups are sampled from normal populations (assumption of normality)
– Frequently violated with little harm.

Statistics Review

Analysis of Variance

Analysis of Variance

ANOVA is a technique for using differences between sample means to draw inferences about the presence or absence of differences between population means.

• Similar to t tests in the two sample case
• Can handle cases where there are more than two samples

Assumptions

– Observations normally distributed within each population

– Population variances are equal (homogeneity of variance, or homoscedasticity)

– Observations are independent

Logic of the Analysis of Variance

• Null hypothesis (H0): population means from different conditions are equal
– Mean1 = Mean2 = Mean3

• Alternative hypothesis (H1): not all population means are equal.

Let's visualize the total amount of variance in an experiment

• Between Group Differences (Mean Square Group)
• Error Variance (Individual Differences + Random Variance) (Mean Square Error)
• Total Variance = Mean Square Total

The F ratio is MS Group / MS Error. The larger the group differences, the bigger the F.
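A sketch of the F ratio for a one-way design, computed by hand as MS Group divided by MS Error. The three groups of scores are invented for illustration.

```python
# Hypothetical scores for three conditions (equal group sizes).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)
group_means = [sum(g) / len(g) for g in groups]

# Between-group variability: how far group means sit from the grand mean.
ss_group = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
df_group = len(groups) - 1
ms_group = ss_group / df_group

# Error variability: how far scores sit from their own group mean.
ss_error = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
df_error = len(all_scores) - len(groups)
ms_error = ss_error / df_error

f_ratio = ms_group / ms_error
print(round(ms_group, 3), round(ms_error, 3), round(f_ratio, 3))
```

Moving the group means further apart raises MS Group and therefore F, while noisier scores within groups raise MS Error and lower F.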

When there are more than two groups

• A significant F only shows that not all groups are equal
– We want to know which groups are different.

• Multiple comparison procedures are designed to control the familywise error rate.
– Familywise error rate: the probability of making at least one Type I error across the whole set of comparisons
– Contrast with the per comparison error rate

Multiple Comparisons

• The more tests we run, the more likely we are to make a Type I error.
– Good reason to hold down the number of tests
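A sketch of how the chance of at least one Type I error grows with the number of tests. It assumes the tests are independent, which is a simplification; the function name is made up for illustration.

```python
def familywise_error_rate(alpha, num_tests):
    """Chance of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** num_tests

alpha = 0.05
for k in (1, 3, 10):
    print(f"{k:>2} tests: {familywise_error_rate(alpha, k):.3f}")

# Bonferroni-style adjustment: test each comparison at alpha / k instead.
k = 10
adjusted_alpha = alpha / k
adjusted_rate = familywise_error_rate(adjusted_alpha, k)
print(round(adjusted_rate, 3))
```

With ten tests at the .05 level the chance of at least one false alarm is about 40%; dividing alpha by the number of tests brings it back to just under .05.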

How to make inferences

• What are significant effects in your results?

• If one t test is significant, check the distribution, where does the difference lie: Is it in the mean, is it the SD, does one variable have much greater range than another.

• Next conduct another independent analysis which can verify finding. For example: Check the