Page 1: URBP 204A  QUANTITATIVE METHODS I Statistical Analysis Lecture IV

URBP 204A QUANTITATIVE METHODS I

Statistical Analysis Lecture IV

Gregory Newmark
San Jose State University

(This lecture is based on Chapters 5, 12, 13, & 15 of Neil Salkind’s Statistics for People Who (Think They) Hate Statistics, 2nd Edition, which is also the source of many of the offered examples. All cartoons are from CAUSEweb.org by J.B. Landers.)

More Statistical Tests
• Factorial Analysis of Variance (ANOVA)
  – Tests between means of more than two groups for two or more factors (independent variables)
• Correlation Coefficient
  – Tests the association between two variables
• One Sample Chi-Square (χ²)
  – Tests if an observed distribution of frequencies for one factor is what one would expect by chance
• Two Factor Chi-Square (χ²)
  – Tests if an observed distribution of frequencies for two factors is what one would expect by chance

Factorial ANOVA
• Compares observations of a single variable among two or more groups which incorporate two or more factors.
• Examples:
  – Reading Skills
    • School (Elementary, Middle, High)
    • Academic Philosophy (Montessori, Waldorf)
  – Environmental Knowledge
    • Commute Mode (Car, Bus, Walking)
    • Age (Under 40, 40+)
  – Wealth
    • Favorite Team (A’s, Giants, Dodgers, Angels)
    • Home Location (Oakland, SF, LA)
  – Weight Loss
    • Gender (Male, Female)
    • Exercise (Biking, Running)

Factorial ANOVA
• Two Types of Effects
  – Main Effects: differences within one factor
  – Interaction Effects: differences across factors
• Example:
  – Weight Loss
    • Gender (Male, Female)
    • Exercise (Biking, Running)
  – Main Effects:
    • Does weight loss vary by exercise?
    • Does weight loss vary by gender?
  – Interaction Effects:
    • Does weight loss due to exercise vary by gender?

Factorial ANOVA
• Example:
  – “How is weight loss affected by exercise program and gender?”
• Steps:
  – State hypotheses
    • Null:
      H0: µ_Male = µ_Female
      H0: µ_Biking = µ_Running
      H0: µ_Male-Biking = µ_Female-Biking = µ_Male-Running = µ_Female-Running
    • Research: What would these three be?

Factorial ANOVA
• Steps (Continued):
  – Set significance level
    • Level of risk of Type I Error = 5%
    • Level of Significance (p) = 0.05
  – Select statistical test
    • Factorial ANOVA
  – Computation of obtained test statistic value
    • Insert obtained data into appropriate formula
    • (SPSS can expedite this step for us)

Factorial ANOVA
• Weight Loss Data

Male-Biking   Male-Running   Female-Biking   Female-Running
76            88             65              65
78            76             90              67
76            76             65              67
76            76             90              87
76            56             65              78
74            76             90              56
74            76             90              54
76            98             79              56
76            88             70              54
55            78             90              56

Factorial ANOVA
• SPSS Outputs

Between-Subjects Factors
              Value Label   N
Exercise  1   Running       20
          2   Biking        20
Gender    1   Male          20
          2   Female        20

Tests of Between-Subjects Effects
Dependent Variable: WeightLoss
Source              Type III Sum of Squares   df   Mean Square   F          Sig. (= p)
Corrected Model     1522.875 (a)              3    507.625       4.678      .007
Intercept           218892.025                1    218892.025    2017.386   .000
Exercise            265.225                   1    265.225       2.444      .127
Gender              207.025                   1    207.025       1.908      .176
Exercise * Gender   1050.625                  1    1050.625      9.683      .004
Error               3906.100                  36   108.503
Total               224321.000                40
Corrected Total     5428.975                  39
a. R Squared = .281 (Adjusted R Squared = .221)
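The F ratios in the SPSS table can be reproduced by hand from the weight loss data. The following is a minimal sketch for a balanced two-factor design, using only the Python standard library; the function name `two_way_anova` and the dictionary layout are illustrative choices, not part of the lecture or of SPSS.

```python
# Weight loss data from the lecture, keyed by (gender, exercise).
data = {
    ("Male", "Biking"):    [76, 78, 76, 76, 76, 74, 74, 76, 76, 55],
    ("Male", "Running"):   [88, 76, 76, 76, 56, 76, 76, 98, 88, 78],
    ("Female", "Biking"):  [65, 90, 65, 90, 65, 90, 90, 79, 70, 90],
    ("Female", "Running"): [65, 67, 67, 87, 78, 56, 54, 56, 54, 56],
}


def two_way_anova(cells):
    """F ratios for a balanced two-factor design (equal n in every cell)."""
    values = [x for obs in cells.values() for x in obs]
    n_total = len(values)
    correction = sum(values) ** 2 / n_total  # SPSS calls this the Intercept SS

    def ss_for_factor(key_index):
        # Pool observations by one factor's level, ignoring the other factor.
        groups = {}
        for key, obs in cells.items():
            groups.setdefault(key[key_index], []).extend(obs)
        ss = sum(sum(g) ** 2 / len(g) for g in groups.values()) - correction
        return ss, len(groups) - 1

    ss_gender, df_gender = ss_for_factor(0)
    ss_exercise, df_exercise = ss_for_factor(1)
    ss_cells = sum(sum(o) ** 2 / len(o) for o in cells.values()) - correction
    ss_inter = ss_cells - ss_gender - ss_exercise
    df_inter = df_gender * df_exercise
    ss_error = sum(x * x for x in values) - correction - ss_cells
    df_error = n_total - len(cells)
    ms_error = ss_error / df_error
    return {
        "Exercise": (ss_exercise / df_exercise) / ms_error,
        "Gender": (ss_gender / df_gender) / ms_error,
        "Exercise * Gender": (ss_inter / df_inter) / ms_error,
        "df_error": df_error,
    }


result = two_way_anova(data)
for name, value in result.items():
    print(name, round(value, 3))
```

The three F ratios agree with the Exercise, Gender, and Exercise * Gender rows of the SPSS table, and `df_error` matches the Error row (36).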

Factorial ANOVA
• SPSS Outputs
  – Graph them!

[Figure: plot of mean weight loss (axis ticks from 65 to 80) by Gender (Female, Male) and Exercise (Biking, Running)]

Factorial ANOVA
• Steps (Continued)
  – Computation of obtained test statistic value
    • Exercise F = 2.444, p = 0.127
    • Gender F = 1.908, p = 0.176
    • Interaction F = 9.683, p = 0.004
  – Look up the critical F score
    • df_numerator = # of levels of the factor – 1 (for the interaction, the product of the two factors' values)
    • df_denominator = # of Observations – # of Groups
    • What is the critical F score?
  – Comparison of obtained and critical values
    • If obtained > critical, reject the null hypothesis
    • If obtained < critical, stick with the null hypothesis

Factorial ANOVA
• Steps (Continued)
  – Therefore we reject the null hypothesis for the interaction effect only. This means that while choice of exercise alone and gender alone make no statistically significant difference to weight loss, in combination they do differentially affect weight loss. Men should run and women should bike, according to these data.
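The interpretation of the interaction can be checked against the four cell means. This is a small sketch using the lecture's weight loss data; the variable names are illustrative.

```python
from statistics import mean

# Weight loss data from the lecture, keyed by (gender, exercise).
cells = {
    ("Male", "Biking"):    [76, 78, 76, 76, 76, 74, 74, 76, 76, 55],
    ("Male", "Running"):   [88, 76, 76, 76, 56, 76, 76, 98, 88, 78],
    ("Female", "Biking"):  [65, 90, 65, 90, 65, 90, 90, 79, 70, 90],
    ("Female", "Running"): [65, 67, 67, 87, 78, 56, 54, 56, 54, 56],
}

# Mean weight loss in each gender-by-exercise cell.
cell_means = {key: mean(obs) for key, obs in cells.items()}

for (gender, exercise), m in cell_means.items():
    print(f"{gender:6s} {exercise:8s} {m:.1f}")
```

The means cross over: running is higher than biking for men, while biking is higher than running for women, which is the pattern a significant interaction describes.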

Correlation Coefficient
• Tests whether changes in two variables are related
• Examples
  – “Are property values positively related to distance from waste dumps?”
  – “Is age correlated with height for minors?”
  – “Are apartment rents negatively related to commute time?”
  – “Does someone’s height relate to income?”
  – “How related are hand size and height?”

Correlation Coefficient
• Are Tastiness and Ease correlated for fruit?
• Is there directionality?

Correlation Coefficient
• Numeric index that reflects the linear relationship between two variables (bivariate correlation)
  – “How does the value of one variable change when another variable changes?”
  – Each case has two data points:
    • E.g. This study records each person’s height and weight to see if they are correlated.
  – Ranges from -1.0 to +1.0
  – Two types of possible correlations
    • Change in the same direction: positive or direct correlation
    • Change in opposite directions: negative or indirect correlation
  – Absolute value reflects strength of correlation
• Pearson Product-Moment Correlation
  – Both variables need to be ratio or interval
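As a rough sketch of how the Pearson coefficient is computed, the function below divides the summed cross-products of deviations by the square root of the product of the summed squared deviations. The function name `pearson_r` and the five demo pairs are made up for illustration; they are not data from the lecture.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation for paired interval/ratio data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 cases")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Cross-products and squared deviations from each variable's mean.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical demo data (NOT from the lecture): each case has two data points.
x_demo = [1, 2, 3, 4, 5]
y_demo = [2, 4, 5, 4, 5]

r = pearson_r(x_demo, y_demo)
print(round(r, 3))       # sign gives direction, absolute value gives strength
print(round(r ** 2, 3))  # squaring r gives the share of variance in common
```

Reversing the order of one list flips the sign of `r` but leaves its absolute value unchanged, which matches the positive versus negative distinction above.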

Correlation Coefficient
• Scatterplot

Correlation Coefficient
• Coefficient of Determination
  – Squaring the correlation coefficient (r²)
  – The percentage of variance in one variable that is accounted for by the variance in another variable
• Example: GPA and Time Spent Studying
  – r (GPA and Study Time) = 0.70; r² (GPA and Study Time) = 0.49
    • 49% of the variance in GPA can be explained by the variance in studying time
    • GPA and studying time share 49% of the variance between themselves

Correlation Coefficient
• Example
  – “How related are hand size and height?”
• Steps
  – State hypotheses
    • Null: H0: ρ (Hand Size and Height) = 0
    • Research: H1: r (Hand Size and Height) ≠ 0
      – Non-directional
  – Set significance level
    • Level of risk of Type I Error = 5%
    • Level of Significance (p) = 0.05

Correlation Coefficient
• Steps (Continued)
  – Select statistical test
    • Correlation Coefficient (it is the test statistic!)
  – Computation of obtained test statistic value
    • Insert obtained data into appropriate formula

Correlation Coefficient
• Plot the data: n = 30

Correlation Coefficient
• Steps (Continued)
  – Computation of obtained test statistic value
    • r (Hand Size and Height) = 0.736

Correlations
                                Height    Hand
Height   Pearson Correlation    1         .736**
         Sig. (2-tailed)                  .000
         N                      30        30
Hand     Pearson Correlation    .736**    1
         Sig. (2-tailed)        .000
         N                      30        30
**. Correlation is significant at the 0.01 level (2-tailed).

Correlation Coefficient
• Steps (Continued)
  – Computation of critical test statistic value
    • Value needed to reject null hypothesis
    • Look up p = 0.05 in critical value table
    • Consider degrees of freedom [df = n – 2]
    • Consider number of tails (is there directionality?)
    • r_critical = ?

Correlation Coefficient

• What happens to the critical score when the number of cases (n) decreases? Why?

Correlation Coefficient
• Steps (Continued)
  – Comparison of obtained and critical values
    • If obtained > critical, reject the null hypothesis
    • If obtained < critical, stick with the null hypothesis
    • r_obtained = 0.736 > r_critical = 0.349
  – Therefore, we reject the null hypothesis and accept the research hypothesis that height and handbreadth are correlated.
    • Is there a directionality to that correlation?
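The comparison can also be sketched numerically through the t distribution, since r converts to t with df = n – 2. In the sketch below, the critical t of 2.048 is an assumption taken from a standard two-tailed table for df = 28 at the 0.05 level; it does not appear in the lecture. Converting it back gives a critical r of about 0.361, whereas 0.349 is the two-tailed 0.05 table entry for df = 30. The conclusion is the same with either value.

```python
import math

r_obtained = 0.736  # from the SPSS correlation output in the lecture
n = 30
df = n - 2

# Convert r to a t statistic with n - 2 degrees of freedom.
t_obtained = r_obtained * math.sqrt(df / (1 - r_obtained ** 2))

# ASSUMPTION: two-tailed 0.05 critical t for df = 28, read from a standard t table.
t_critical = 2.048

# Convert the critical t back into a critical r for the same df.
r_critical = t_critical / math.sqrt(t_critical ** 2 + df)

reject_null = abs(r_obtained) > r_critical
print(round(t_obtained, 2), round(r_critical, 3), reject_null)
```

Because `r_critical` shrinks as `df` grows, the same conversion shows why a smaller sample needs a larger correlation to reach significance.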

Correlation Coefficient
• Significance vs. Meaning
  – Rules of Thumb
    • r = 0.8 to 1.0  Very strong relationship
    • r = 0.6 to 0.8  Strong relationship
    • r = 0.4 to 0.6  Moderate relationship
    • r = 0.2 to 0.4  Weak relationship
    • r = 0.0 to 0.2  Weak or no relationship

Correlation Coefficient
• Does correlation express causation?
• Classic Example:
  – Ice Cream Eaten
  – Crimes Committed

Correlation Coefficient

• Correlation expresses association only

Chi-Square (χ²)
• Non-Parametric Test
  – Does not rely on a given distribution
    • Useful for small sample sizes
  – Enables consideration of data that comes as ordinal or nominal frequencies
    • Number of children in different grades
    • Percentage of people by state receiving social security

One Sample Chi-Square (χ²)
• Tests whether an observed distribution of frequencies for one factor is likely to have occurred by chance
• Examples:
  – “Is this community evenly distributed among ethnic groups?”
  – “Are the 31 ice cream flavors at Baskin Robbins equally purchased?”
  – “Are commuting mode shares evenly spread out?”
  – “Did people report equal preferences for a school voucher policy?”

One Sample Chi-Square (χ²)
• Examples:
  – “Did people report equal preferences for a school voucher policy?”
  – Data (90 People split into 3 Categories)
    • For 23
    • Maybe 17
    • Against 50
  – Always try to have at least 5 responses per category

One Sample Chi-Square (χ²)
• Steps:
  – State hypotheses
    • Null: H0: Proportion_For = Proportion_Maybe = Proportion_Against
    • Research: H1: Proportion_For ≠ Proportion_Maybe ≠ Proportion_Against
  – Set significance level
    • Level of risk of Type I Error = 5%
    • Level of Significance (p) = 0.05
  – Select statistical test
    • Chi-Square (χ²)

One Sample Chi-Square (χ²)
• Steps (Continued):
  – Computation of obtained test statistic value
    • Insert obtained data into appropriate formula
    • (SPSS can expedite this step for us)

One Sample Chi-Square (χ²)
• Steps (Continued):
  – Computation of obtained test statistic value

Category   O    E    (O-E)   (O-E)²   (O-E)²/E
For        23   30   -7      49       1.63
Maybe      17   30   -13     169      5.63
Against    50   30   20      400      13.33
Total      90   90   --      --       20.59
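The table's arithmetic can be sketched in a few lines. The counts follow the data slide (For 23, Maybe 17, Against 50) and the expected count is the even split of 90 people across 3 categories. Unrounded, the statistic is 20.60; the 20.59 in the table comes from summing terms already rounded to two decimals. The critical value uses a closed form that holds only for df = 2, where the upper-tail probability of χ² is exp(-x/2). Variable names are illustrative.

```python
import math

observed = {"For": 23, "Maybe": 17, "Against": 50}

n = sum(observed.values())
expected = n / len(observed)  # equal preferences under the null: 90 / 3 = 30

# Sum of (O - E)^2 / E over the categories.
chi2 = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1

# Closed form valid ONLY for df = 2: P(X > x) = exp(-x / 2).
alpha = 0.05
critical = -2 * math.log(alpha)
p_value = math.exp(-chi2 / 2)

print(round(chi2, 2), df, round(critical, 2), chi2 > critical)
```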

One Sample Chi-Square (χ²)
• Steps (Continued):
  – Computation of obtained test statistic value
    • χ² obtained = 20.59
  – Computation of critical test statistic value
    • Value needed to reject null hypothesis
    • Look up p = 0.05 in χ² table
    • Consider degrees of freedom [df = # of categories - 1]
    • χ² critical = 5.99

One Sample Chi-Square (χ²)
• Steps (Continued):
  – Computation of obtained test statistic value

Votes
           Observed N   Expected N   Residual
For        23           30.0         -7.0
Maybe      17           30.0         -13.0
Against    50           30.0         20.0
Total      90

Test Statistics
                  Votes
Chi-Square (a)    20.600
df                2
Asymp. Sig.       .000
a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 30.0.

One Sample Chi-Square (χ²)
• Steps (Continued):
  – Comparison of obtained and critical values
    • If obtained > critical, reject the null hypothesis
    • If obtained < critical, stick with the null hypothesis
    • χ² obtained = 20.59 > χ² critical = 5.99
  – Therefore, we can reject the null hypothesis and conclude that the distribution of preferences regarding the school voucher is not even.

Two Factor Chi-Square (χ²)
• What if we want to see if gender affects the distribution of votes?
• How is this different from Factorial ANOVA?

Votes * Gender Crosstabulation
Count
           Male   Female   Total
For        17     6        23
Maybe      7      10       17
Against    20     30       50
Total      44     46       90

Two Factor Chi-Square (χ²)
• Steps:
  – State hypotheses
    • Null: H0: P_For*Male = P_Maybe*Male = P_Against*Male = P_For*Female = P_Maybe*Female = P_Against*Female
    • Research: H1: P_For*Male ≠ P_Maybe*Male ≠ P_Against*Male ≠ P_For*Female ≠ P_Maybe*Female ≠ P_Against*Female
  – Set significance level
    • Level of risk of Type I Error = 5%
    • Level of Significance (p) = 0.05
  – Select statistical test
    • Chi-Square (χ²)

Two Factor Chi-Square (χ²)
• Steps (Continued):
  – Computation of obtained test statistic value
    • Insert obtained data into appropriate formula
    • Same as for One Factor Chi-Square

Two Factor Chi-Square (χ²)
• How do we find the expected frequencies?
  – (Row Total * Column Total) / Grand Total
  – Expected Value [For*Male] = (23*44)/90 = 11.2

Votes * Gender Crosstabulation
                             Male   Female   Total
For       Count              17     6        23
          Expected Count     11.2   11.8     23.0
Maybe     Count              7      10       17
          Expected Count     8.3    8.7      17.0
Against   Count              20     30       50
          Expected Count     24.4   25.6     50.0
Total     Count              44     46       90
          Expected Count     44.0   46.0     90.0
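The expected counts in the crosstabulation follow directly from the row and column totals. A short sketch, with the observed counts taken from the Votes * Gender table and illustrative variable names:

```python
# Observed counts from the Votes * Gender crosstabulation.
observed = {
    "For":     {"Male": 17, "Female": 6},
    "Maybe":   {"Male": 7,  "Female": 10},
    "Against": {"Male": 20, "Female": 30},
}

row_totals = {vote: sum(cols.values()) for vote, cols in observed.items()}

col_totals = {}
for cols in observed.values():
    for gender, count in cols.items():
        col_totals[gender] = col_totals.get(gender, 0) + count

grand_total = sum(row_totals.values())

# Expected count for each cell = (row total * column total) / grand total.
expected = {
    vote: {g: row_totals[vote] * col_totals[g] / grand_total for g in col_totals}
    for vote in observed
}

for vote, cols in expected.items():
    print(vote, {g: round(v, 1) for g, v in cols.items()})
```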

Two Factor Chi-Square (χ²)
• Steps (Continued):
  – Computation of obtained test statistic value
    • χ² obtained = 7.750

Chi-Square Tests
                               Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square             7.750 (a)   2    .021
Likelihood Ratio               7.984       2    .018
Linear-by-Linear Association   6.344       1    .012
N of Valid Cases               90
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 8.31.

Two Factor Chi-Square (χ²)
• Steps (Continued):
  – Computation of critical test statistic value
    • Value needed to reject null hypothesis
    • Look up p = 0.05 in χ² table
    • Consider degrees of freedom
    • df = (# of rows – 1) * (# of columns – 1)
    • χ² critical = ?

Two Factor Chi-Square (χ²)
• Steps (Continued):
  – Comparison of obtained and critical values
    • If obtained > critical, reject the null hypothesis
    • If obtained < critical, stick with the null hypothesis
    • χ² obtained = 7.750 > χ² critical = 5.99
  – Therefore, we can reject the null hypothesis and conclude that gender affects the distribution of preferences regarding the school vouchers.
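The whole two-factor test can be sketched end to end from the observed counts. The function name `chi_square_independence` is illustrative. The p-value and critical value use the closed form exp(-x/2), which is exact only when df = 2, as it is for this 3 x 2 table.

```python
import math

# Rows: For, Maybe, Against. Columns: Male, Female.
observed = [[17, 6], [7, 10], [20, 30]]


def chi_square_independence(table):
    """Pearson chi-square statistic and df for a rows-by-columns count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count for the cell
            chi2 += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


chi2, df = chi_square_independence(observed)

# Closed forms valid ONLY for df = 2: P(X > x) = exp(-x / 2).
p_value = math.exp(-chi2 / 2) if df == 2 else None
critical = -2 * math.log(0.05)

print(round(chi2, 3), df, round(critical, 2), chi2 > critical)
```

For tables with other degrees of freedom, the critical value would have to come from a χ² table or a statistics library instead of the closed form.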

Tutorial Time