pasw-spss statistics

Page 1: PASW-SPSS STATISTICS

PASW-SPSS STATISTICS

David P. Yens, Ph.D.

New York College of Osteopathic Medicine, NYIT

PRESENTATION 5: REVIEW OF ANOVA, CORRELATION AND REGRESSION

2010

Page 2: PASW-SPSS STATISTICS

ANALYSIS OF VARIANCE

Simple

◦ Used to determine whether there are differences in means among more than two groups, or:

Factorial

◦ on more than one dimension (independent variable).

◦ Examples:

1. Compare blood pressures resulting from the use of three treatments.

2. Compare blood pressures resulting from the use of three treatments and between males and females.

Page 3: PASW-SPSS STATISTICS

TREATMENT             GROUP A      GROUP B      GROUP C
Mean Blood Pressure   Mean for A   Mean for B   Mean for C

Page 4: PASW-SPSS STATISTICS

              MALES        FEMALES
TREATMENT A                               Mean A
TREATMENT B                               Mean B
TREATMENT C                               Mean C
              Mean Males   Mean Females

Page 5: PASW-SPSS STATISTICS

ANOVA

Determining differences after ANOVA

◦ Planned contrasts

◦ Post-hoc analyses

Page 6: PASW-SPSS STATISTICS

hsbdataB

Effect of father's education on

◦ Grades

◦ Visualization test

◦ Math achievement

Page 7: PASW-SPSS STATISTICS

ANOVA

ANALYZE → COMPARE MEANS → ONE-WAY ANOVA

OR

◦ ANALYZE → GENERAL LINEAR MODEL → UNIVARIATE

Several options are available

Page 8: PASW-SPSS STATISTICS

Analysis of Variance

Length of stay in different hospitals

           HOSPITAL
PATIENT    A1   A2   A3   A4
1           2    3    5   10
2           3    6    8   11
3           4    7    9   13
4           3    5   10    8
5           4    4    4    9
6           2    5    6    9

Page 9: PASW-SPSS STATISTICS

Anova: Single Factor

SUMMARY

Groups   Count   Sum   Average   Variance
A1       6       18    3         0.8
A2       6       30    5         2
A3       6       42    7         5.6
A4       6       60    10        3.2

ANOVA

Source of Variation   SS      df   MS     F          P-value   F crit
Between Groups        160.5   3    53.5   18.44828   5.6E-06   3.0983912
Within Groups         58      20   2.9
Total                 218.5   23
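
The single-factor table above can be reproduced by hand from the length-of-stay data. The sketch below uses only the Python standard library; the data come from the slide, while the function and variable names are illustrative assumptions, not part of the original presentation or of SPSS.

```python
# One-way ANOVA computed by hand for the length-of-stay table.
# Data are copied from the slide; names are illustrative only.

stays = {
    "A1": [2, 3, 4, 3, 4, 2],
    "A2": [3, 6, 7, 5, 4, 5],
    "A3": [5, 8, 9, 10, 4, 6],
    "A4": [10, 11, 13, 8, 9, 9],
}


def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_ratio)."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)

    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        # Variability of the group mean around the grand mean.
        ss_between += len(values) * (mean - grand_mean) ** 2
        # Variability of the observations around their own group mean.
        ss_within += sum((v - mean) ** 2 for v in values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio


ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(stays)
print(f"SS between = {ss_b:.1f}, SS within = {ss_w:.1f}, "
      f"F({df_b}, {df_w}) = {f_ratio:.3f}")
```

The F ratio is the between-groups mean square (160.5 / 3 = 53.5) divided by the within-groups mean square (58 / 20 = 2.9), which matches the 18.448 shown in the table.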

Page 10: PASW-SPSS STATISTICS

OTHER ANALYSIS OF VARIANCE METHODS

◦ Repeated measures

◦ Analysis of Covariance

Test statistic - F

Page 11: PASW-SPSS STATISTICS

STATISTICAL ANALYSES

ANALYSIS OF VARIANCE (Repeated measures)

◦ Used to assess before and after measures on the same individuals exposed to two or more treatments.

◦ Example: Assess the increase in blood pressure for two groups exposed to different treatments.

Page 12: PASW-SPSS STATISTICS

REPEATED MEASURES ANOVA

                      TREATMENT
                      TREATMENT A   TREATMENT B   TREATMENT C
SUBJECT 1             BP-A          BP-B          BP-C
SUBJECT 2
SUBJECT 3
SUBJECT 4
----
MEAN BLOOD PRESSURE   MEAN FOR A    MEAN FOR B    MEAN FOR C

Page 13: PASW-SPSS STATISTICS

REPEATED MEASURES ANOVA

Temperature over 4 days

           TEMPERATURE
PATIENT    A1      A2      A3      A4
1          101.2   100.3   99.8    98.7
2          102.3   100.7   100.1   99.1
3          103.2   101.1   100.2   99.1
4          102.2   100.6   99.2    98.5
5          104     102.1   100.1   99.2
6          103.2   101.5   100.3   99.3
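
As a rough illustration of what a repeated measures ANOVA does with the temperature table, the sketch below splits the total variability into a day effect, a subject effect, and residual error, then forms the F ratio for days. It is a hand calculation in plain Python, not the SPSS procedure; the function name and dictionary keys are assumptions, and it applies no sphericity check or correction.

```python
# Temperatures for 6 patients measured on 4 days (columns A1..A4),
# copied from the slide. Names below are illustrative only.
temps = [
    [101.2, 100.3, 99.8, 98.7],
    [102.3, 100.7, 100.1, 99.1],
    [103.2, 101.1, 100.2, 99.1],
    [102.2, 100.6, 99.2, 98.5],
    [104.0, 102.1, 100.1, 99.2],
    [103.2, 101.5, 100.3, 99.3],
]


def repeated_measures_anova(rows):
    """Partition variability for a one-way repeated measures design."""
    n = len(rows)      # number of subjects
    k = len(rows[0])   # number of repeated measurements per subject
    grand = sum(sum(row) for row in rows) / (n * k)
    day_means = [sum(row[j] for row in rows) / n for j in range(k)]
    subject_means = [sum(row) / k for row in rows]

    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_days = n * sum((m - grand) ** 2 for m in day_means)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    # Residual left after removing both the day and the subject effect.
    ss_error = sum(
        (rows[i][j] - day_means[j] - subject_means[i] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )

    df_days = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_days / df_days) / (ss_error / df_error)
    return {
        "ss_total": ss_total,
        "ss_days": ss_days,
        "ss_subjects": ss_subjects,
        "ss_error": ss_error,
        "df_days": df_days,
        "df_error": df_error,
        "f_ratio": f_ratio,
    }


result = repeated_measures_anova(temps)
for name, value in result.items():
    print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```

Because each patient serves as his or her own control, the between-subject variability is removed from the error term, which is what distinguishes this from the one-way ANOVA on independent groups.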

Page 14: PASW-SPSS STATISTICS

REPEATED MEASURES ANOVA

Page 15: PASW-SPSS STATISTICS

REPEATED MEASURES ANOVA

Page 16: PASW-SPSS STATISTICS

REPEATED MEASURES ANOVA

Page 17: PASW-SPSS STATISTICS

CORRELATION AND REGRESSION

Morgan, Chapt. 8

CORRELATION – Expresses relationship only

REGRESSION – Prediction of one variable from another. Implies direction of influence, does NOT prove causality

MULTIPLE REGRESSION – Prediction of a target variable from 2 or more predictors (independent variables)

Page 18: PASW-SPSS STATISTICS

CORRELATION

Correlation coefficient is a number between -1 and +1 whose sign is the same as the slope of the line and whose magnitude is related to the degree of linear association between two variables

r², the coefficient of determination, expresses the proportion of variance in the dependent variable explained by the independent variable

◦ On a ratio scale; an r² = .50 is twice as large as .25

Interpretation of values

Page 19: PASW-SPSS STATISTICS

ASSUMPTIONS FOR PEARSON CORRELATION & SIMPLE REGRESSION

Linear relationship

Scores normally distributed

Outliers can have a major impact

Page 20: PASW-SPSS STATISTICS

VARIABLES FOR CORRELATION

Grades   Math Achievement
4         9.00
5        10.33
6         7.67
3         5.00
3        -1.67
5         1.00
6        12.00
4         8.00
ETC.
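
To make the correlation coefficient concrete, the sketch below computes Pearson's r and r² by hand for the eight pairs listed on this slide. These are only the first rows of the data set (the slide ends with "ETC."), so the result is not expected to match the full-sample value reported later; the function name is an illustrative assumption.

```python
from math import sqrt

# First eight (grades, math achievement) pairs shown on the slide.
grades = [4, 5, 6, 3, 3, 5, 6, 4]
math_achievement = [9.00, 10.33, 7.67, 5.00, -1.67, 1.00, 12.00, 8.00]


def pearson_r(x, y):
    """Pearson correlation: covariation divided by the product of spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(grades, math_achievement)
print(f"r = {r:.3f}, r squared = {r * r:.3f}")
```

The sign of r follows the direction of the relationship, and r² gives the proportion of variance in one variable accounted for by the other.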

Page 21: PASW-SPSS STATISTICS

EXAMPLE FROM TEXT

Check assumptions

Page 22: PASW-SPSS STATISTICS

OBTAINING A SCATTERPLOT

GRAPHS → LEGACY DIALOGS → SCATTER/DOT

Page 23: PASW-SPSS STATISTICS

SCATTERPLOT

Page 24: PASW-SPSS STATISTICS

ADDING REGRESSION LINE

Now double-click the output chart

Page 25: PASW-SPSS STATISTICS
Page 26: PASW-SPSS STATISTICS

USING CHART BUILDER

GRAPHS → CHART BUILDER → OK

SELECT “Gallery”

SELECT “Scatter/Dot”

With mouse, move “Simple Scatter” to Chart Preview

Find/move “math achievement test” to vertical axis box

Find/move “grades in h.s.” to horizontal axis box

Click OK

Page 27: PASW-SPSS STATISTICS

OUTPUT FROM CHART BUILDER

Page 28: PASW-SPSS STATISTICS

TO OBTAIN A FIT LINE

Double-click on chart

SELECT “Elements”

SELECT “Interpolation line”

Page 29: PASW-SPSS STATISTICS

FIT LINE

Page 30: PASW-SPSS STATISTICS

TO GET A CORRELATION BETWEEN THE 2 VARIABLES

ANALYZE → CORRELATE → BIVARIATE

Page 31: PASW-SPSS STATISTICS

CORRELATION OUTPUT

Descriptive Statistics

                        Mean      Std. Deviation   N
grades in h.s.          5.68      1.570            75
math achievement test   12.5645   6.67031          75

Correlations

                                              grades in h.s.   math achievement test
grades in h.s.          Pearson Correlation   1                .504**
                        Sig. (2-tailed)                        .000
                        N                     75               75
math achievement test   Pearson Correlation   .504**           1
                        Sig. (2-tailed)       .000
                        N                     75               75

**. Correlation is significant at the 0.01 level (2-tailed).

Page 32: PASW-SPSS STATISTICS

CORRELATION EXAMPLE

Dayya (2005) looked at predictors of obesity. In one example, he plotted percent of calories in carbs against BMI to see if there was a relationship with the following result:

Dayya, D. Analysis of the CDC-NHANES Database to Identify Predictors Of Obesity in a Multiple Linear and Logistic Regression Model. New York Medical Journal, online, Dec. 2005.

Page 33: PASW-SPSS STATISTICS

CORRELATION EXAMPLE

Page 34: PASW-SPSS STATISTICS

REGRESSION

The simplest regression is y = a + bx, where y is the dependent variable (plotted on the vertical axis), x is the independent variable (plotted on the horizontal axis), a is the y intercept, and b is the slope.

Refers to a mathematical equation that allows one variable (the target variable) to be predicted from another (the independent variable).

Implies a direction of influence; it does not prove causality.

From Greenhalgh, T. How to read a paper: statistics for the non-statistician. II. BMJ, 315 (7105)

Page 35: PASW-SPSS STATISTICS

Simple Regression

The regression line is the straight line passing through the data that minimizes the sum of squared differences between the original data and the fitted points

◦ Least-squares analysis

◦ This was the basis for ANOVA procedures

Intercept term is equivalent to the grand mean
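
The least-squares idea can be sketched in a few lines: fit y = a + bx with the closed-form formulas, then confirm that nudging the intercept or the slope only increases the sum of squared differences. The data are the eight (grades, math achievement) pairs listed on the earlier "Variables for correlation" slide; the function names are illustrative assumptions, and the fitted values apply to these eight rows only, not to the full data set.

```python
# Eight (grades, math achievement) pairs from the earlier slide.
x = [4, 5, 6, 3, 3, 5, 6, 4]
y = [9.00, 10.33, 7.67, 5.00, -1.67, 1.00, 12.00, 8.00]


def fit_line(x, y):
    """Closed-form least-squares estimates of intercept a and slope b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x   # the line passes through (mean_x, mean_y)
    return a, b


def sum_squared_error(a, b, x, y):
    """Sum of squared differences between observed and fitted values."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


a, b = fit_line(x, y)
best = sum_squared_error(a, b, x, y)
print(f"intercept = {a:.4f}, slope = {b:.4f}, SSE = {best:.4f}")

# Any other line gives a larger sum of squared differences.
for da, db in [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    other = sum_squared_error(a + da, b + db, x, y)
    print(f"shift a by {da:+.1f}, b by {db:+.1f}: SSE = {other:.4f}")
```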

Page 36: PASW-SPSS STATISTICS

QUESTION

Can we predict math achievement from grades in high school?

Using the same variables as before:

ANALYZE → REGRESSION → LINEAR

Page 37: PASW-SPSS STATISTICS

INITIAL OUTPUT TABLES

Descriptive Statistics

                        Mean      Std. Deviation   N
math achievement test   12.5645   6.67031          75
grades in h.s.          5.68      1.570            75

Correlations

                                              math achievement test   grades in h.s.
Pearson Correlation   math achievement test   1.000                   .504
                      grades in h.s.          .504                    1.000
Sig. (1-tailed)       math achievement test   .                       .000
                      grades in h.s.          .000                    .
N                     math achievement test   75                      75
                      grades in h.s.          75                      75

Page 38: PASW-SPSS STATISTICS

REGRESSION TABLES

Variables Entered/Removed(b)

Model   Variables Entered   Variables Removed   Method
1       grades in h.s.(a)   .                   Enter

a. All requested variables entered.
b. Dependent Variable: math achievement test

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .504(a)   .254       .244                5.80018

a. Predictors: (Constant), grades in h.s.

ANOVA(b)

Model            Sum of Squares   df   Mean Square   F        Sig.
1   Regression   836.606          1    836.606       24.868   .000(a)
    Residual     2455.875         73   33.642
    Total        3292.481         74

a. Predictors: (Constant), grades in h.s.
b. Dependent Variable: math achievement test

Coefficients(a)

                        Unstandardized Coefficients   Standardized Coefficients                   Collinearity Statistics
Model                   B       Std. Error            Beta                        t       Sig.    Tolerance   VIF
1   (Constant)          .397    2.530                                             .157    .876
    grades in h.s.      2.142   .430                  .504                        4.987   .000    1.000       1.000

a. Dependent Variable: math achievement test

Collinearity Diagnostics(a)

                                                    Variance Proportions
Model   Dimension   Eigenvalue   Condition Index   (Constant)   grades in h.s.
1       1           1.964        1.000             .02          .02
        2           .036         7.421             .98          .98

a. Dependent Variable: math achievement test
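
The figures in these regression tables are tied together by simple arithmetic, and checking that by hand is a useful way to read the output. The sketch below starts from the descriptive statistics, the correlation, and the sums of squares reported on the slides and recomputes the slope, intercept, R Square, Adjusted R Square, F, the standard error of the estimate, and t for the slope. Variable names are illustrative assumptions; small differences from the SPSS output come from the rounded inputs (r is only given to three decimals).

```python
from math import sqrt

# Values copied from the output tables on the slides.
n = 75
r = 0.504
mean_x, sd_x = 5.68, 1.570        # grades in h.s.
mean_y, sd_y = 12.5645, 6.67031   # math achievement test
ss_regression = 836.606
ss_residual = 2455.875

# Slope and intercept of the simple regression line y = a + b x.
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x

# Proportion of variance explained, and its adjusted version for 1 predictor.
ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)

# F ratio: regression mean square (df = 1) over residual mean square (df = n - 2).
ms_residual = ss_residual / (n - 2)
f_ratio = ss_regression / ms_residual
std_error_estimate = sqrt(ms_residual)

# With a single predictor, t for the slope is the square root of F.
t_slope = sqrt(f_ratio)

print(f"slope b = {slope:.3f}, intercept a = {intercept:.3f}")
print(f"R Square = {r_squared:.3f}, Adjusted R Square = {adjusted_r_squared:.3f}")
print(f"F = {f_ratio:.3f}, Std. Error of the Estimate = {std_error_estimate:.5f}")
print(f"t for slope = {t_slope:.3f}")
```

With the table's coefficients, the prediction equation is: predicted math achievement = .397 + 2.142 × (grades in h.s.).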

Page 39: PASW-SPSS STATISTICS

REGRESSION EXAMPLE

We could look at the Dayya data again to predict BMI from percent calories in carbs. Do you think we could obtain an accurate prediction?

Other uses of regression might be to predict the number of fillings required during a 5-year period from the number of times teeth were brushed a week.

Page 40: PASW-SPSS STATISTICS

DAYYA DATA

Page 41: PASW-SPSS STATISTICS

Multiple Regression

A more complex mathematical equation that allows the target variable to be predicted from two or more independent variables (often known as co-variables).

EXAMPLE: predicting blood pressure from age, height, weight, and drug dosage.

Page 42: PASW-SPSS STATISTICS

SEE YOU IN ---

Page 43: PASW-SPSS STATISTICS

MULTIPLE REGRESSION

FINAL POINTS

◦ Sample size – number of subjects at least 5 (preferably 10) times the number of variables

◦ The multiple R should be at least .7

◦ The change in R² should be at least a few percent

◦ A gradual fall off should be seen in the prediction of each successive variable

◦ Fewer predictor variables are better than many; too many make interpretation difficult

◦ Analyze the influence of outliers
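
The first two rules of thumb above are easy to turn into a quick check. The sketch below is a hypothetical helper, not an SPSS feature, and it assumes "number of variables" means the number of predictors. The example call uses the simple regression from the earlier slides (75 subjects, 1 predictor, R = .504).

```python
def check_regression_guidelines(n_subjects, n_predictors, multiple_r):
    """Apply the sample-size and multiple-R rules of thumb from the slide.

    Assumption: "number of variables" is read as the number of predictors.
    """
    return {
        "meets_minimum_n": n_subjects >= 5 * n_predictors,
        "meets_preferred_n": n_subjects >= 10 * n_predictors,
        "multiple_r_at_least_0_7": multiple_r >= 0.7,
    }


# Simple regression from the earlier slides: grades predicting math achievement.
report = check_regression_guidelines(n_subjects=75, n_predictors=1, multiple_r=0.504)
for rule, passed in report.items():
    print(f"{rule}: {'yes' if passed else 'no'}")
```

For that example the sample size is ample, but R = .504 falls short of the .7 guideline, which fits the modest R Square of .254 in the regression output.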