
Day 11: Measures of Association and ANOVA

Daniel J. Mallinson

School of Public Affairs

Penn State Harrisburg

[email protected]

PADM-HADM 503

Mallinson Day 11 November 2, 2017 1 / 45

Road map

Measures of Association

Types of Measures

Examples

Analysis of Variance (ANOVA)


General Notes

Remember the following three questions:

1 Is the relationship between the variables significant?

Conduct a significance test

2 How strong is the relationship?

Use a measure of association

3 What is the nature of the relationship between the variables?

Interpret the outputs of your analyses: charts, tables, mathematical formulas


General Notes

Significance Tests for Measures of Association:

SPSS displays results of statistical tests for measures of association

These show whether the calculated measure reflects an association that appears by chance or is real

Ignore them! They are pointless to discuss and do not replace t-tests, ANOVA, or chi-square


Choosing a Measure of Association

Need to select an appropriate measure based on the level of measurement of the IV(s) and DV

Need to consider the measure’s sensitivity (more on this later)

Researcher should be familiar with the chosen statistic


Choosing a Measure of Association

Asymmetric or symmetric?

Asymmetric Measures

Preferred when you know which variable is the IV and which is the DV

Symmetric Measures

Choose when you do not know which is the IV and which is the DV, or when an asymmetric measure is not available.

Choose asymmetric measures when they are available!


Choosing a Measure of Association

Several choices for calculating measures of association:

Proportional reduction of error (PRE) measures

Chi-square based measures

Correlational measures

Specific measures for ANOVA

See Table 13.8 in the textbook.


How to Interpret Levels of Association

Perfect positive relationship between variables: +1.0

Perfect negative relationship between variables: -1.0

No relationship between variables = 0

In general:

The closer to 0, the weaker the relationship

The closer to ±1, the stronger the relationship


How to Interpret Levels of Association

There is no universal scale to determine if a relationship is strong or weak

Guidelines exist for some measures

See Table 13.8 in textbook


Types of Measures of Association

Proportional reduction of error (PRE) measures

Chi-square based measures

Correlational measures

Specific measures for ANOVA


Types of Measures of Association

Level of Measurement   Measure of Assoc        Type          Symmetric
Nominal                Lambda                  PRE           Both
                       Phi Coefficient         χ2            Symmetric
                       Coef. of Contingency    χ2            Symmetric
                       Cramer’s V              χ2            Symmetric
Ordinal                Gamma                   PRE           Symmetric
                       Tau-b (square)          PRE           Symmetric
                       Tau-c (rectangle)       PRE           Symmetric
                       Somers’ d               PRE           Asymmetric
                       Spearman’s Rho (ρ)      Correlation   N/A
Interval               Pearson’s r             Correlation   N/A
                       Eta and Eta²            ANOVA         N/A

Bold measures are most likely candidates for use

Rule of thumb: report several if available, note the differences


Types of Measures of Association

1. PRE

Indicate how much knowing the IV decreases errors in estimating the values of the DV

Conservative measure: Yield lower values than chi-square based measures, thus less likely to overestimate the strength of association

Lambda sometimes underestimates the strength of a relationship and can yield 0 even when a significance test shows a significant relationship.

Cramer’s V preferred to Lambda

Report both in tables, talk about Cramer’s V in interpretation
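The PRE logic behind Lambda can be sketched in a few lines of Python. This is only an illustration: the function name `goodman_kruskal_lambda` and the counts are hypothetical (they are not taken from gssnet.sav), and the table layout assumes rows are DV categories and columns are IV categories, as in the SPSS crosstabs used later.

```python
def goodman_kruskal_lambda(table):
    """Lambda for predicting the row variable (DV) from the column variable (IV).

    table[i][j] is the count for DV category i and IV category j.
    """
    n = sum(sum(row) for row in table)
    # Errors when always guessing the modal DV category, ignoring the IV
    errors_without_iv = n - max(sum(row) for row in table)
    # Errors when guessing the modal DV category within each IV column
    errors_with_iv = sum(sum(col) - max(col) for col in zip(*table))
    return (errors_without_iv - errors_with_iv) / errors_without_iv


# Hypothetical counts: rows = uses e-mail (yes, no), columns = sex (men, women)
same_mode = [[60, 55],
             [40, 45]]
different_mode = [[70, 30],
                  [30, 70]]

print(goodman_kruskal_lambda(same_mode))       # modal DV category is "yes" in both columns
print(goodman_kruskal_lambda(different_mode))  # modal DV category changes with the IV
```

In the first table the column percentages differ, but the modal DV category is the same for both groups, so knowing the IV removes no prediction errors and Lambda is 0. That is the situation in which the slides recommend discussing Cramer’s V instead.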


Types of Measures of Association

2. Measures Based on Chi-Square

Difficult to interpret, not intuitive

Cramer’s V is most relevant of the three
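As a rough illustration of how the three chi-square based measures relate to one another, the sketch below computes them from a crosstab using the standard textbook formulas. The function names and the counts are hypothetical, and SPSS output remains the reference for the course.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_measures(table):
    """Phi, coefficient of contingency, and Cramer's V for one table."""
    n = sum(sum(row) for row in table)
    chi2 = chi_square(table)
    smaller_dim = min(len(table), len(table[0]))
    return {
        "phi": math.sqrt(chi2 / n),
        "contingency": math.sqrt(chi2 / (chi2 + n)),
        "cramers_v": math.sqrt(chi2 / (n * (smaller_dim - 1))),
    }


# Hypothetical 2 x 2 counts, for illustration only
print(chi_square_measures([[70, 30], [30, 70]]))
```

For a 2 x 2 table Phi and Cramer’s V coincide. Cramer’s V rescales by the smaller table dimension, so it stays between 0 and 1 for larger tables, which is one reason it is usually the one to report.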


Types of Measures of Association

3. Correlation-Based Measures

Spearman’s ρ is a relatively old measure

Kendall’s Tau-b is usually preferred over Spearman’s ρ when the IV and DV are ordinal
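To make the comparison concrete, here is a minimal pure-Python sketch of both statistics for ordinal codes with ties. The function names and the sample codes are hypothetical. Tau-b is computed from concordant and discordant pairs, and Spearman’s ρ as Pearson’s r on average ranks.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b, which corrects for ties on either variable."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: contributes nothing
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denominator


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ordinal codes (1 = low ... 3 = high), for illustration only
iv = [1, 1, 2, 2, 3, 3, 3]
dv = [1, 2, 2, 2, 2, 3, 3]
print(kendall_tau_b(iv, dv), spearman_rho(iv, dv))
```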


Types of Measures of Association

4. ANOVA

Eta and Eta²

Used to measure strength of relationship in one-way ANOVA

Eta² interpreted as the proportion of variance in the DV explained by the IV

Similar to R² in multiple regression
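The "proportion of variance explained" reading follows directly from the sums of squares: Eta² is the between-group sum of squares divided by the total sum of squares. A minimal sketch, with a hypothetical function name and made-up data:

```python
import math


def eta_squared(groups):
    """Between-group sum of squares divided by the total sum of squares."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((value - grand_mean) ** 2 for value in all_values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return ss_between / ss_total


# Hypothetical DV values for three groups of the IV
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
eta2 = eta_squared(groups)
print(eta2, math.sqrt(eta2))  # Eta squared and Eta
```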


Examples of Measures of Association

Example 1: IV and DV Nominal

Data file: gssnet.sav

Research Question: Are men or women more likely to use e-mail? To answer, we use data from the General Social Survey dataset

Variables: Respondent’s sex (sex) and Use email (useemail). There are two categories of the DV (yes and no)


Examples of Measures of Association

Steps:

1. State hypotheses

Research hypothesis: There is a difference between men and women in their use of e-mail

Null hypothesis: There is no difference.

2. Select an alpha level

α = 0.05


Examples of Measures of Association

Steps:

3. Compute test of statistical significance

Chi-square

4. Make a decision

If p < .05, there is a significant relationship

5. Interpret strength of the relationship

If there is a significant relationship, interpret Lambda and Cramer’s V
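Steps 3 to 5 can be traced by hand for a 2 x 2 table. The sketch below is an illustration under stated assumptions: the counts are hypothetical (not the gssnet.sav results), the function names are made up, and the p-value shortcut via `math.erfc` is valid only for 1 degree of freedom, which is what a 2 x 2 table has.

```python
import math

ALPHA = 0.05


def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return sum(
        (observed - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, observed in enumerate(row)
    )


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts: rows = uses e-mail (yes, no), columns = sex (men, women)
table = [[600, 550],
         [400, 450]]
n = sum(sum(row) for row in table)

chi2 = chi_square(table)              # Step 3: test of significance
p = p_value_df1(chi2)
significant = p < ALPHA               # Step 4: decision
cramers_v = math.sqrt(chi2 / n)       # Step 5: strength (2 x 2 table)
print(chi2, p, significant, cramers_v)
```

With a large sample, a small difference in percentages can be statistically significant while Cramer’s V stays close to 0, which is why the significance test and the measure of association answer different questions.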


Examples of Measures of Association

In SPSS:

Descriptive Statistics

Cross Tabs

Select a column variable (IV) and a row variable (DV)

Click “Statistics” and select Chi-square, also select “Phi and Cramer’s V” and “Lambda” under Nominal

Click “Cells” and select observed counts, expected counts, and column percentages


Examples of Measures of Association

Results:

There is a difference between men and women


Examples of Measures of Association

Results:

The difference is significant


Examples of Measures of Association

Results:

Lambda is erroneous, so interpret Cramer’s V; the difference is significant

Examples of Measures of Association

Example 1 Interpretation

There is a significant relationship between sex and email usage

The relationship is very weak

Men are more likely to use e-mail messages

We can reject our null hypothesis

We can be 95% confident in this conclusion


Examples of Measures of Association

Example 4: Independent and Dependent Variables Scale

Data file: country.sav

RQ: Does the availability of doctors in a country make any difference in the female life expectancy in that country?

Variables in the dataset: doctors per 10,000 people (docs) and female life expectancy (lifeexpf)


Examples of Measures of Association

Steps:

This time we will not follow the steps from earlier examples

No hypothesis, for example

The purpose is to show how Pearson’s r is calculated and to show a visual association between the variables (scatterplot)

We need to conduct a regression analysis to establish the “causal” relations between the variables
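For reference, Pearson’s r is the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. A minimal sketch follows. The function name is hypothetical and the values are invented for illustration, not taken from country.sav.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation between two scale variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical values: doctors per 10,000 people and female life expectancy
docs = [1, 5, 10, 20, 30]
lifeexpf = [50, 60, 68, 75, 78]
print(pearson_r(docs, lifeexpf))
```

The measure is symmetric: swapping the two arguments gives the same value, which is why the table of measures lists its symmetry as N/A.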


Examples of Measures of Association

In SPSS:

Correlation

Bivariate

Select the two variables: doctors per 10,000 people and female life expectancy

Also select “Pearson” under “Correlation Coefficients”


Examples of Measures of Association

In SPSS: For a visual illustration (scatterplot)

Graphs

Legacy Dialogs

Scatter/Dot

Simple Scatter

Define

Enter doctors per 10,000 people as the “X axis” and female life expectancy as the “Y axis”


Examples of Measures of Association

Results:

Positive association between the variables. How strong?


Examples of Measures of Association

Results:

Pearson’s r is fairly strong. We will leave further interpretation to our discussion of regression.


Analysis of Variance (ANOVA)

ANOVA is similar to a t-test

The IV is nominal, the DV is scale

ANOVA is used when the IV has more than two groups

Makes overall comparisons among the groups of the IV


Analysis of Variance (ANOVA)

Also makes comparisons between the pairs of groups

Can be used with two groups, but produces identical results to a t-test

Can calculate the strength of the statistical relationship between IV and DV (measure of association)


Analysis of Variance (ANOVA)

Need to keep the assumptions of ANOVA in mind:

1 DV must be scale-measured

2 Variances among groups of the IV should be equal

3 Each group normally distributed within itself

4 Groups should be independent of each other (no pre-post designs)


Analysis of Variance (ANOVA)

Steps in conducting an ANOVA test:

1 Plot an error bar chart to visually inspect the differences among groups

2 Describe group characteristics (mean values for each group)

3 Interpret the ANOVA table for overall differences among the groups

4 If the F-test is significant, then run Levene’s test (homogeneity of variance test) to determine the kind of post-hoc test you should use

5 Run the appropriate post-hoc test (for pairwise group comparisons)
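The overall F-test in step 3 compares the variance between group means with the variance within groups. The sketch below shows only the arithmetic behind the ANOVA table. The function name and data are hypothetical, and the p-value is left to SPSS (or an F table) because the standard library has no F distribution.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of DV values."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, group_means)
        for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical DV values for three groups of the IV
f_stat, df_between, df_within = one_way_anova([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f_stat, df_between, df_within)
```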


Analysis of Variance (ANOVA)

An SPSS example:

Dataset: gssft.sav

IV: Highest degree (degree)

DV: Number of hours worked last week (hrs1)


Analysis of Variance (ANOVA)

Step 1. Generating an error bar graph in SPSS:

Graphs

Legacy Dialogs

Error Bar

Simple

Select “Summaries for groups of cases”

Define

Select variables (IV to “category axis” and DV to “variable”)

Accept “confidence interval for mean” under “Bar Represents”


Analysis of Variance (ANOVA)

Bars show 95% confidence intervals

Bars do not represent variances, but because standard errors are used to calculate them, they give a rough indication of the variability within each group

Arithmetic means shown in the middle
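How an error bar is built from the standard error can be sketched as follows. This is an approximation under stated assumptions: it uses the normal critical value (about 1.96) rather than the t distribution, so it will be slightly narrower than a t-based interval for small groups. The function name and the hours values are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(values, confidence=0.95):
    """Return (lower, mean, upper) using a normal approximation."""
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * standard_error
    return mean - half_width, mean, mean + half_width


# Hypothetical hours worked last week for one degree group
hours = [38, 40, 42, 45, 50, 36, 40, 44]
print(mean_confidence_interval(hours))
```

Because the half-width is the standard error times a constant, groups with more spread or fewer cases get longer bars.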


Analysis of Variance (ANOVA)

Step 2. Describe the group characteristics and Step 3. Run the ANOVA test

Analyze

Compare means

One-way ANOVA

Assign your DV to “Dependent List” and your IV to “Factor”

Under “Options,” select “Descriptive”


Analysis of Variance (ANOVA)

Step 4. If the test is significant, run the homogeneity of variance test (Levene Test)

Analyze

Compare means

One-way ANOVA

Under “Options,” select “Homogeneity of variance test”

If Levene’s test is not significant, equal variances can be assumed
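One common form of Levene’s test is simply a one-way ANOVA run on each case’s absolute deviation from its own group mean. The sketch below assumes that mean-centered form; other variants center on the median. Function names and data are hypothetical, and the significance level still comes from SPSS.

```python
def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def levene_statistic(groups):
    """Levene's W: ANOVA F on absolute deviations from each group mean."""
    absolute_deviations = []
    for group in groups:
        group_mean = sum(group) / len(group)
        absolute_deviations.append([abs(value - group_mean) for value in group])
    return f_statistic(absolute_deviations)


# Hypothetical groups: the first pair has equal spread, the second does not
print(levene_statistic([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
print(levene_statistic([[4, 5, 6], [0, 5, 10]]))
```

A statistic near 0 means the groups spread out about equally around their own means, which is the situation in which the equal-variance post-hoc tests apply.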


Analysis of Variance (ANOVA)

Step 5: Run the appropriate post-hoc test

Equal Variances

The Bonferroni procedure is usually recommended for multiple comparisons when the variances of samples are roughly equal

Unequal Variances

Use Dunnett T3 or Tamhane


Analysis of Variance (ANOVA)

Step 5: Run the appropriate post-hoc test

Can you use a series of t-tests, instead of using the pairwise comparisons in ANOVA?

Statisticians tell us this will create a “multiple comparison problem” (i.e., an increased risk of rejecting the null when it is true: Type I error)

O’Sullivan et al. say the opposite. Ignore them!
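The size of the multiple comparison problem is easy to work out. The sketch below assumes the pairwise tests are independent, which is a simplification, and uses a hypothetical IV with five groups. It shows the chance of at least one Type I error across all pairwise t-tests and the per-test alpha that the Bonferroni procedure would use instead.

```python
from math import comb


def familywise_error_rate(alpha, n_comparisons):
    """Chance of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** n_comparisons


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha that keeps the familywise rate at or below alpha."""
    return alpha / n_comparisons


n_groups = 5                  # hypothetical number of IV categories
n_pairs = comb(n_groups, 2)   # number of pairwise comparisons
print(n_pairs)
print(familywise_error_rate(0.05, n_pairs))
print(bonferroni_alpha(0.05, n_pairs))
```

With five groups there are ten pairs, so running ten separate t-tests at α = 0.05 gives roughly a 40% chance of at least one false rejection under the independence assumption.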


Analysis of Variance (ANOVA)

Step 5: Run the appropriate post-hoc test in SPSS

Analyze

Compare means

One-way ANOVA

Under Post-Hoc tests, select either an equal variance (Bonferroni) or an unequal variance (Tamhane or Dunnett T3) test


Analysis of Variance (ANOVA)

Look at values under “Sig.”

Those less than .05 indicate pairs of groups that are different from each other

In this example, only the “Graduate” and “High school” categories are significantly different from each other


Lab/Homework

Problem 1

Using the gssft.sav dataset, choose another ordinal variable that you believe is associated with general happiness. Lay out the four steps of significance testing (hypotheses, alpha, test, decision). Make sure you choose and defend your measure of association and correctly interpret your results.

Problem 2

Now, using the same dataset, choose a scale variable that you believe is associated with general happiness. Again, lay out all four of the steps of significance testing. Make sure you choose the correct post-hoc test based on the equal variances test. Interpret your results.


Appendix Slides

Additional Measures of Association examples


Examples of Measures of Association

Example 2: Independent and Dependent Variables Ordinal (Rectangular Table)

Data file: gssnet.sav

RQ: Does more education make you happier?

Variables in the dataset: respondent’s highest degree (degree) and general happiness (happy)


Examples of Measures of Association

Steps:

1. State hypotheses

Research hypothesis: There is a positive relationship between education level and general happiness. This is a directional hypothesis.

Null hypothesis: There is no relationship.

2. Select an alpha level

α = 0.05


Examples of Measures of Association

Steps:

3. Compute test of statistical significance

Chi-square

4. Make a decision

If p < .05, there is a significant relationship

5. Interpret strength of the relationship

If there is a significant relationship, interpret Somers’ d, Kendall’s tau-c, and gamma


Examples of Measures of Association

In SPSS:

Descriptive Statistics

Cross Tabs

Select a column variable (IV) and a row variable (DV)

Click “Statistics” and select Chi-square, also select Somers’ d, Gamma, and Kendall’s tau-c under “Ordinal”

Click “Cells” and select observed counts and column percentages (no expected counts this time)


Examples of Measures of Association

Results:

Education seems to make a difference in happiness


Examples of Measures of Association

Results:

The relationship is significant


Examples of Measures of Association

Results:

Somers’ d and Kendall’s tau-c agree; gamma exaggerates; prefer Somers’ d as it is directional
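Why gamma tends to come out larger than Somers’ d and tau-c can be seen from the pair counts: gamma ignores tied pairs entirely, while Somers’ d adds pairs tied on the DV to its denominator. The sketch below uses the usual pair-counting formulas. The function name and the counts are hypothetical, and it assumes rows are DV categories and columns are IV categories, both coded in the same low-to-high order.

```python
def ordinal_measures(table):
    """Gamma, Somers' d (row variable dependent), and Kendall's tau-c."""
    n_rows, n_cols = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    concordant = discordant = tied_dv_only = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for i2 in range(i, n_rows):
                for j2 in range(n_cols):
                    if i2 == i and j2 <= j:
                        continue  # visit each pair of distinct cells once
                    pairs = table[i][j] * table[i2][j2]
                    if i2 == i:
                        tied_dv_only += pairs  # same DV row, different IV column
                    elif j2 > j:
                        concordant += pairs
                    elif j2 < j:
                        discordant += pairs
                    # j2 == j: tied on the IV only, not used by these measures
    difference = concordant - discordant
    smaller_dim = min(n_rows, n_cols)
    return {
        "gamma": difference / (concordant + discordant),
        "somers_d": difference / (concordant + discordant + tied_dv_only),
        "tau_c": 2 * smaller_dim * difference / (n ** 2 * (smaller_dim - 1)),
    }


# Hypothetical counts: rows = happiness (low to high), columns = degree (low to high)
table = [[30, 25, 15, 10, 5],
         [50, 60, 40, 30, 20],
         [20, 35, 35, 30, 25]]
print(ordinal_measures(table))
```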


Examples of Measures of Association

Example 2 Interpretation

There is a significant relationship between educational degree and general happiness

The relationship is weak

The relationship is positive (as education increases, so does general happiness)

We can reject our null hypothesis

We can be 95% confident in this conclusion


Examples of Measures of Association

Example 3: Independent and Dependent Variables Ordinal (Square Table)

Data file: gssnet.sav

RQ: Does happiness in marriage make you happier in general?

Variables in the dataset: happiness of marriage (hapmar) and general happiness (happy)


Examples of Measures of Association

Steps:

1. State hypotheses

Research hypothesis: There is a positive relationship between happiness in marriage and general happiness. This is a directional hypothesis.

Null hypothesis: There is no relationship.

2. Select an alpha level

α = 0.05


Examples of Measures of Association

Steps:

3. Compute test of statistical significance

Chi-square

4. Make a decision

If p < .05, there is a significant relationship

5. Interpret strength of the relationship

If there is a significant relationship, interpret Somers’ d, Kendall’s tau-b, gamma, and Spearman’s Rho


Examples of Measures of Association

In SPSS:

Descriptive Statistics

Cross Tabs

Select a column variable (IV) and a row variable (DV)

Click “Statistics” and select Chi-square, also select Somers’ d, Gamma, and Kendall’s tau-b under “Ordinal” (tau-b because this is a square table)

Click “Cells” and select observed counts and column percentages (no expected counts this time)


Examples of Measures of Association

In SPSS:

To compute Spearman’s Rho, you will need to go through another path:

Correlation

Bivariate

Select two variables: Happiness of marriage and general happiness

Also select “Spearman” under “Correlation Coefficients”


Examples of Measures of Association

Results:

There seems to be a relationship


Examples of Measures of Association

Results:

The relationship is significant


Examples of Measures of Association

Results:

Somers’ d, tau-b, and Spearman are all similar. They show a moderately strong relationship.


Examples of Measures of Association

Example 3 Interpretation

There is a significant relationship between happiness in marriage and general happiness

The relationship is moderately strong

The relationship is positive (as happiness in marriage increases, so does general happiness)

We can reject our null hypothesis

We can be 95% confident in this conclusion
