Logistic Regression and Discriminant Function Analysis

Page 1: Logistic Regression and Discriminant Function Analysis

Logistic Regression and Discriminant Function Analysis

Page 2: Logistic Regression and Discriminant Function Analysis

Logistic Regression vs. Discriminant Function Analysis

• Similarities
  – Both predict group membership for each observation (classification)
  – Dichotomous DV
  – Both require an estimation sample and a validation sample to assess predictive accuracy
  – If the split between groups is no more extreme than 80/20, the two yield similar results in practice

Page 3: Logistic Regression and Discriminant Function Analysis

Logistic Reg vs. Discrim: Differences

• Discriminant Analysis
  – Assumes MV normality
  – Assumes equality of VCV matrices
  – A large number of predictors violates MV normality and can't be accommodated
  – Predictors must be continuous, interval level
  – More powerful when assumptions are met
  – Many assumptions, rarely met in practice
  – Categorical IVs create problems

• Logistic Regression
  – No assumption of MV normality
  – No assumption of equality of VCV matrices
  – Can accommodate large numbers of predictors more easily
  – Categorical predictors OK (e.g., dummy codes)
  – Less powerful when assumptions are met
  – Few assumptions, typically met in practice
  – Categorical IVs can be dummy coded

Page 4: Logistic Regression and Discriminant Function Analysis

Logistic Regression

• Outline:
  – Categorical Outcomes: Why not OLS Regression?
  – General Logistic Regression Model
  – Maximum Likelihood Estimation
  – Model Fit
  – Simple Logistic Regression

Page 5: Logistic Regression and Discriminant Function Analysis

Categorical Outcomes: Why not OLS Regression?

• Dichotomous outcomes:
  – Passed / Failed
  – CHD / No CHD
  – Selected / Not Selected
  – Quit / Did Not Quit
  – Graduated / Did Not Graduate

Page 6: Logistic Regression and Discriminant Function Analysis

Categorical Outcomes: Why not OLS Regression?

• Example: Relationship b/w performance and turnover

[Scatterplot: Turnover (0/1; y axis, -.5 to 1.5) against Performance (x axis, 1.5 to 5.0), with an OLS line of best fit]

• Line of best fit?!
• Errors (Y-Y') across values of performance (X)?

Page 7: Logistic Regression and Discriminant Function Analysis

Problems with Dichotomous Outcomes/DVs

• The regression surface is intrinsically non-linear
• Errors assume one of two possible values, violating the assumption of normally distributed errors
• Violates the assumption of homoscedasticity
• Predicted values of Y greater than 1 and smaller than 0 can be obtained
• The true magnitude of the effects of IVs may be greatly underestimated
• Solution: Model the data using Logistic Regression, NOT OLS Regression

Page 8: Logistic Regression and Discriminant Function Analysis

Logistic Regression vs. Regression

• Logistic regression predicts the probability that an event will occur
  – Range of possible responses between 0 and 1
  – Must use an s-shaped curve to fit the data
• OLS regression assumes linear relationships and can't fit an s-shaped curve
  – Violates the normality assumption
  – Creates heteroscedasticity

Page 9: Logistic Regression and Discriminant Function Analysis

Example: Relationship b/w Age and CHD (1 = Has CHD)

Page 10: Logistic Regression and Discriminant Function Analysis

General Logistic Regression Model

• Y' (the outcome variable) is the probability of having one outcome or the other, based on a nonlinear function of the best linear combination of predictors:

Y' = \frac{e^{a + b_1 X_1}}{1 + e^{a + b_1 X_1}}

Where:
• Y' = probability of an event
• The linear portion of the equation (a + b_1 X_1) is used to predict the probability of the event (0, 1); it is not an end in itself

Page 11: Logistic Regression and Discriminant Function Analysis

The logistic (logit) transformation

• The DV is dichotomous; the purpose is to estimate the probability of occurrence (0, 1)
  – Thus, the DV is transformed into a likelihood
• The logit/logistic transformation accomplishes this (the linear regression equation predicts the log of the odds):

\text{odds} = \frac{P}{1 - P} = \frac{P(Y=1)}{1 - P(Y=1)} = \frac{P(Y=1)}{P(Y=0)}

\log(\text{odds}) = \text{logit}(P) = \ln\left(\frac{P}{1-P}\right) = \ln\left(\frac{Y'}{1-Y'}\right) = A + \sum_j B_j X_{ij}

Page 12: Logistic Regression and Discriminant Function Analysis

Probability Calculation

P(Y) = \frac{e^{a + bX}}{1 + e^{a + bX}}

Where:
• The relation b/w logit(P) and X is intrinsically linear
• b = expected change in logit(P) given a one unit change in X
• a = intercept
• e = the exponential constant (the base of the natural logarithm)

Page 13: Logistic Regression and Discriminant Function Analysis

Ordinary Least Squares (OLS) Estimation

• The purpose is to obtain the estimates that minimize the sum of squared errors, Σ(Y - Y')²
• The estimates chosen best describe the relationships among the observed variables (IVs and DV)
• The estimates chosen maximize the probability of obtaining the observed data (i.e., these are the population values most likely to produce the data at hand)

Page 14: Logistic Regression and Discriminant Function Analysis

Maximum Likelihood (ML) Estimation

• OLS can't be used in logistic regression because of the non-linear nature of the relationships
• In ML, the purpose is to obtain the parameter estimates most likely to have produced the data
  – ML estimators are those with the greatest joint likelihood of reproducing the data
• In logistic regression, each model yields an ML joint probability (likelihood) value
• Because this value tends to be very small (e.g., .00000015), it is transformed by taking -2 times its natural log (-2LL)
• The -2 log transformation also yields a statistic with a known distribution (the chi-square distribution)
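As a rough illustration of why the -2 log transformation helps, the sketch below (Python, with made-up outcomes y and fitted probabilities p) computes a joint likelihood and its -2LL:

import math

def neg2_log_likelihood(y, p):
    # -2LL = -2 * sum[ y*ln(p) + (1-y)*ln(1-p) ] over all cases
    ll = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
             for yi, pi in zip(y, p))
    return -2 * ll

y = [1, 0, 1, 1, 0]              # observed outcomes (made up)
p = [0.8, 0.3, 0.6, 0.9, 0.2]    # model-predicted probabilities (made up)

n2ll = neg2_log_likelihood(y, p)
print(math.exp(-n2ll / 2))       # raw joint likelihood; shrinks toward 0 as n grows
print(n2ll)                      # -2LL: a workable, chi-square-distributed scale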

Page 15: Logistic Regression and Discriminant Function Analysis

Model Fit

• In logistic regression, R and R² don't make sense
• Evaluate model fit using the -2 log likelihood (-2LL) value obtained for each model (through ML estimation)
  – The -2LL value reflects the fit of the model; it is used to compare the fit of nested models
  – -2LL measures lack of fit: the extent to which the model fits the data poorly
  – When the model fits the data perfectly, -2LL = 0
• Ideally, the -2LL value for the null model (i.e., the model with no predictors, or "intercept-only" model) will be larger than that of the model with predictors

Page 16: Logistic Regression and Discriminant Function Analysis

Comparing Model Fit

The fit of the null model can be tested against the fit of the model with predictors using a chi-square test:

\chi^2 = (-2LL_{\text{Null}}) - (-2LL_{\text{Model}})

Where:
• χ² = chi-square for the improvement in model fit, with df = k_Model - k_Null (the difference in the number of estimated parameters)
• -2LL_Null = -2 log likelihood value for the null (intercept-only) model
• -2LL_Model = -2 log likelihood value for the hypothesized model
• The same test can be used to compare a nested model with k predictors to a model with k+1 predictors, etc.
• Same logic as OLS regression, but the models are compared using a different fit index (-2LL)
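A quick sketch of this difference test, using values from the SPSS example later in the deck (-2LL for the fitted model is 59.066 and the omnibus chi-square is 10.169, which implies a -2LL of 69.235 for the null model); scipy is assumed to be available:

from scipy.stats import chi2

neg2ll_null = 69.235    # intercept-only model (inferred: 59.066 + 10.169)
neg2ll_model = 59.066   # model with APT and EXPER
df = 3 - 1              # parameters in model (a, b_APT, b_EXPER) minus null (a)

chi_square = neg2ll_null - neg2ll_model
print(chi_square, df, chi2.sf(chi_square, df))   # ~10.169, 2, ~.006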

Page 17: Logistic Regression and Discriminant Function Analysis

Pseudo R²

• Assessment of overall model fit
• Calculation:

R^2 = \frac{(-2LL_{\text{Null}}) - (-2LL_{\text{Model}})}{-2LL_{\text{Null}}}

• Two primary pseudo R² stats:
  – Nagelkerke: less conservative
    • preferred by some because its max = 1
  – Cox & Snell: more conservative
• Interpret like R² in OLS regression
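A one-line sketch of the slide's proportional-reduction formula, again using the example values; note that SPSS's Cox & Snell and Nagelkerke statistics are computed with different formulas, so this value will not match either of them exactly:

def pseudo_r2(neg2ll_null, neg2ll_model):
    # proportional reduction in -2LL from the null model to the fitted model
    return (neg2ll_null - neg2ll_model) / neg2ll_null

print(round(pseudo_r2(69.235, 59.066), 3))   # ~.147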

Page 18: Logistic Regression and Discriminant Function Analysis

Unique Prediction

• In OLS regression, the significance tests for the beta weights indicate whether each IV is a unique predictor
• In logistic regression, the Wald test is used for the same purpose

Page 19: Logistic Regression and Discriminant Function Analysis

Similarities to Regression

• You can use all of the following procedures you learned about in OLS regression in logistic regression:
  – Dummy coding for categorical IVs
  – Hierarchical entry of variables (compare changes in % classification; significance of the Wald test)
  – Stepwise entry (but don't use it; it's atheoretical)
  – Moderation tests

Page 20: Logistic Regression and Discriminant Function Analysis

Simple Logistic Regression Example

• Data collected from 50 employees

• Y = success in training program (1 = pass; 0 = fail)

• X1 = Job aptitude score (5 = very high; 1 = very low)

• X2 = Work-related experience (months)

Page 21: Logistic Regression and Discriminant Function Analysis

Syntax in SPSS

LOGISTIC REGRESSION PASS
  /METHOD = ENTER APT EXPER
  /SAVE = PRED PGROUP
  /CLASSPLOT
  /PRINT = GOODFIT
  /CRITERIA = PIN(.05) POUT(.10) ITERATE(20) CUT(.5) .

PASS is the DV; APT and EXPER are the IVs.

Page 22: Logistic Regression and Discriminant Function Analysis

Results

• Block 0: The null model results
  – Can't do any worse than this
• Block 1: Method = Enter
  – Tests of the model of interest
  – Interpret the data from here

Omnibus Tests of Model Coefficients (Step 1)

          Chi-square   df   Sig.
  Step        10.169    2   .006
  Block       10.169    2   .006
  Model       10.169    2   .006

Tests whether the model is significantly better than the null model. A significant chi-square means yes!

Step, Block & Model yield the same results because all IVs were entered in the same block.

Page 23: Logistic Regression and Discriminant Function Analysis

Results Continued

Model Summary (Step 1)

  -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
          59.066(a)                   .184                  .245

a. Estimation terminated at iteration number 4 because parameter estimates changed by less than .001.

-2 Log Likelihood: an index of fit; a smaller number means better fit (perfect fit = 0).
Pseudo R²: interpret like R² in regression. Nagelkerke is preferred by some because its max = 1; Cox & Snell is uniformly the more conservative estimate.

Page 24: Logistic Regression and Discriminant Function Analysis

Classification: Null Model vs. Model Tested

Classification Table (Step 0, null model; constant included; cut value = .500)

                           Predicted
                        fail   pass   Percentage Correct
  Observed  fail           0     24                   .0
            pass           0     26                100.0
  Overall Percentage                                52.0

Classification Table (Step 1; cut value = .500)

                           Predicted
                        fail   pass   Percentage Correct
  Observed  fail          16      8                 66.7
            pass           6     20                 76.9
  Overall Percentage                                72.0

Null model: 52% correct classification
Model tested: 72% correct classification

Page 25: Logistic Regression and Discriminant Function Analysis

Variables in the Equation (Step 1a)

                 B    S.E.    Wald   df   Sig.   Exp(B)
  APT         .549    .235   5.473    1   .019    1.731
  EXPER       .111    .052   4.577    1   .032    1.118
  Constant  -3.050   1.146   7.086    1   .008     .047

a. Variable(s) entered on step 1: APT, EXPER.

B: the effect of a one unit change in the IV on the log odds (hard to interpret).
*Odds Ratio (OR), Exp(B) in SPSS: more interpretable; a one unit increase in aptitude multiplies the odds of passing by about 1.73.
Wald: like a t test, but uses the chi-square distribution.
Sig.: used to determine whether the Wald test is significant.
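To see how these coefficients are used, here is a minimal Python sketch that reproduces Exp(B) for APT and computes a predicted probability for a hypothetical employee (the aptitude and experience values are made up):

import math

a, b_apt, b_exper = -3.050, 0.549, 0.111   # from the table above

print(round(math.exp(b_apt), 3))           # ~1.731: odds multiplier per aptitude point

def p_pass(apt, exper):
    # P(pass) = 1 / (1 + e^-(a + b1*APT + b2*EXPER))
    z = a + b_apt * apt + b_exper * exper
    return 1 / (1 + math.exp(-z))

print(round(p_pass(4, 12), 3))   # hypothetical case: aptitude 4, 12 months experience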

Page 26: Logistic Regression and Discriminant Function Analysis

Histogram of Predicted Probabilities

Page 27: Logistic Regression and Discriminant Function Analysis

To Flag Misclassified Cases

SPSS syntax:

COMPUTE PRED_ERR=0.
IF (PASS NE PGR_1) PRED_ERR=1.
EXECUTE.

You can use this flag in additional analyses to explore causes of misclassification.
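For readers working outside SPSS, a rough pandas equivalent of the same flag (the column names mirror the SPSS variables; the data frame here is hypothetical):

import pandas as pd

df = pd.DataFrame({"PASS": [1, 0, 1, 0],      # observed outcome
                   "PGR_1": [1, 1, 1, 0]})    # predicted group saved by the model

# flag cases where the predicted group disagrees with the observed outcome
df["PRED_ERR"] = (df["PASS"] != df["PGR_1"]).astype(int)
print(df)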

Page 28: Logistic Regression and Discriminant Function Analysis

Results Continued

Hosmer and Lemeshow Test (Step 1)

  Chi-square   df   Sig.
       6.608    8   .579

An index of model fit. The chi-square compares the observed events in the data with the events predicted by the model. A nonsignificant result means that the observed and expected values are similar; this is good!

Page 29: Logistic Regression and Discriminant Function Analysis

Hierarchical Logistic Regression

• Question: Which of the following variables predict whether a woman is hired to be a Hooters girl?
  – Age
  – IQ
  – Weight

Page 30: Logistic Regression and Discriminant Function Analysis

Simultaneous v. Hierarchical

Block 1. IQ: Omnibus Tests of Model Coefficients (Step 1)

          Chi-square   df   Sig.
  Step          .289    1   .591
  Block         .289    1   .591
  Model         .289    1   .591

  Cox & Snell .002; Nagelkerke .003

Block 2. Age: Omnibus Tests of Model Coefficients (Step 1)

          Chi-square   df   Sig.
  Step        42.044    1   .000
  Block       42.044    1   .000
  Model       42.333    2   .000

  Cox & Snell .264; Nagelkerke .353

Block 3. Weight: Omnibus Tests of Model Coefficients (Step 1)

          Chi-square   df   Sig.
  Step         6.129    1   .013
  Block        6.129    1   .013
  Model       48.462    3   .000

  Cox & Snell .296; Nagelkerke .395

Block 1. IQ, Age, Weight (simultaneous): Omnibus Tests of Model Coefficients (Step 1)

          Chi-square   df   Sig.
  Step        48.462    3   .000
  Block       48.462    3   .000
  Model       48.462    3   .000

  Cox & Snell .296; Nagelkerke .395

Model Summary (Step 1)

  -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
         142.383(a)                   .296                  .395

a. Estimation terminated at iteration number 6 because parameter estimates changed by less than .001.

Page 31: Logistic Regression and Discriminant Function Analysis

Simultaneous v. Hierarchical: Classification

Block 1. IQ (cut value = .500)

                              Predicted
                       not hired   hired   Percentage Correct
  Observed  not hired          8      57                 12.3
            hired              6      67                 91.8
  Overall Percentage                                     54.3

Block 2. Age (cut value = .500)

                              Predicted
                       not hired   hired   Percentage Correct
  Observed  not hired         55      10                 84.6
            hired             28      45                 61.6
  Overall Percentage                                     72.5

Block 3. Weight and Block 1. IQ, Age, Weight (simultaneous; cut value = .500): identical tables

                              Predicted
                       not hired   hired   Percentage Correct
  Observed  not hired         53      12                 81.5
            hired             26      47                 64.4
  Overall Percentage                                     72.5

Page 32: Logistic Regression and Discriminant Function Analysis

Simultaneous v. Hierarchical: Variables in the Equation

Block 1. IQ (Step 1a)

                 B    S.E.     Wald   df   Sig.   Exp(B)
  IQ          .006    .012     .289    1   .591    1.006
  Constant   -.185    .585     .100    1   .752     .831

a. Variable(s) entered on step 1: IQ.

Block 2. Age (Step 1a)

                 B    S.E.     Wald   df   Sig.    Exp(B)
  IQ         -.003    .014     .032    1   .858      .997
  age        -.591    .120   24.220    1   .000      .554
  Constant   6.484   1.533   17.899    1   .000   654.298

a. Variable(s) entered on step 1: age.

Block 3. Weight and Block 1. IQ, Age, Weight (simultaneous): identical coefficients

                 B    S.E.     Wald   df   Sig.     Exp(B)
  IQ         -.009    .015     .372    1   .542       .991
  age        -.591    .125   22.224    1   .000       .554
  weight     -.277    .117    5.630    1   .018       .758
  Constant   8.264   1.821   20.602    1   .000   3881.775

a. Variable(s) entered on step 1: IQ, age, weight.

Page 33: Logistic Regression and Discriminant Function Analysis

Multinomial Logistic Regression

• A form of logistic regression that allows prediction of group membership for more than 2 groups
  – Based on a multinomial distribution
• Sometimes called polytomous logistic regression
• Conducts an omnibus test first for each predictor across 3+ groups (like ANOVA)
  – Then conducts pairwise comparisons (like post hoc tests in ANOVA)
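For orientation, a minimal scikit-learn sketch of a multinomial model on toy data (everything here, including the data, is made up; with the default lbfgs solver, LogisticRegression fits a single multinomial model when there are 3+ classes):

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(90, 2))       # two toy predictors
y = rng.integers(0, 3, size=90)    # three outcome groups: 0, 1, 2

model = LogisticRegression(max_iter=1000).fit(X, y)
print(model.predict_proba(X[:3]))  # each row: probabilities across the 3 groups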

Page 34: Logistic Regression and Discriminant Function Analysis

Objectives of Discriminant Analysis

• Determining whether significant differences exist between average scores on a set of variables for 2+ a priori defined groups
• Determining which IVs account for most of the differences in average score profiles for 2+ groups
• Establishing procedures for classifying objects into groups based on scores on a set of IVs
• Establishing the number and composition of the dimensions of discrimination between groups formed from the set of IVs

Page 35: Logistic Regression and Discriminant Function Analysis

Discriminant Analysis

• Discriminant analysis develops a linear combination of predictors that can best separate groups.
• In a sense, the reverse of MANOVA:
  – In MANOVA, the groups are usually constructed by the researcher and have a clear structure (e.g., a 2 x 2 factorial design). Groups = IVs.
  – In discriminant analysis, the groups usually have no particular structure and their formation is not under experimental control. Groups = DVs.

Page 36: Logistic Regression and Discriminant Function Analysis

How Discrim Works

• Linear combinations (discriminant functions) are formed that maximize the ratio of between-groups variance to within-groups variance for a linear combination of predictors.
• Total # of discriminant functions = # groups - 1 OR # of predictors (whichever is smaller)
• If more than one discriminant function is formed, subsequent discriminant functions are independent of prior combinations and account for as much remaining group variation as possible.
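A brief scikit-learn sketch of the same idea on toy two-group data (all values made up); with 2 groups and 3 predictors, min(2 - 1, 3) = 1 discriminant function is formed:

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 3)),    # group 0 cases
               rng.normal(1, 1, (50, 3))])   # group 1 cases, shifted means
y = np.array([0] * 50 + [1] * 50)

lda = LinearDiscriminantAnalysis().fit(X, y)
scores = lda.transform(X)   # cases projected onto the discriminant function
print(scores.shape)         # (100, 1): one function for two groups
print(lda.score(X, y))      # proportion correctly classified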

Page 37: Logistic Regression and Discriminant Function Analysis

Assumptions in Discrim

• Multivariate normality of IVs
  – Violation is more problematic if there is overlap between groups
• Homogeneity of VCV matrices
• Linear relationships
• IVs continuous (interval scale)
  – Can accommodate nominal IVs, but this violates MV normality
• Single categorical DV

Results are influenced by:
• Outliers (classification may be wrong)
• Multicollinearity (interpretation of coefficients becomes difficult)

Page 38: Logistic Regression and Discriminant Function Analysis

Sample Size Considerations

• Observations : # Predictors
  – Suggested: 20 observations per predictor
  – Minimum required: 5 observations per predictor
• Observations : Groups (in DV)
  – Minimum: smallest group size exceeds the # of IVs
  – Practical guide: each group should have 20+ observations
  – Wide variation in group size impacts results (i.e., classification may be incorrect)

Page 39: Logistic Regression and Discriminant Function Analysis

Example

In this hypothetical example, data from 500 graduate students seeking jobs were examined. Available for each student were three predictors: GRE (V+Q), Years to Finish the Degree, and Number of Publications. The outcome measure was categorical: "Got a job" versus "Did not get a job." Half of the sample was used to determine the best linear combination for discriminating the job categories. The second half of the sample was used for cross-validation.

Page 40: Logistic Regression and Discriminant Function Analysis

DISCRIMINANT
  /GROUPS=job(1 2)
  /VARIABLES=gre pubs years
  /SELECT=sample(1)
  /ANALYSIS ALL
  /SAVE=CLASS SCORES PROBS
  /PRIORS SIZE
  /STATISTICS=MEAN STDDEV UNIVF BOXM COEFF RAW CORR COV GCOV TCOV TABLE CROSSVALID
  /PLOT=COMBINED SEPARATE MAP
  /PLOT=CASES
  /CLASSIFY=NONMISSING POOLED .

Pages 41-46: [slide content not reproduced in the transcript]
Page 47: Logistic Regression and Discriminant Function Analysis

Interpreting Output

• Box's M
• Eigenvalues
• Wilks' Lambda
• Discriminant Weights
• Discriminant Loadings

Page 48: Logistic Regression and Discriminant Function Analysis

Group Statistics

                                     Mean   Std. Deviation   Valid N (unweighted / weighted)
  JOB = Oops!
    GRE (V+Q)                     1296.20           96.913             179 / 179.000
    Number of Publications           3.50            2.029             179 / 179.000
    Years to Finish Degree           6.47            2.094             179 / 179.000
  JOB = Got One!
    GRE (V+Q)                     1305.87          101.824              71 / 71.000
    Number of Publications           6.55            1.593              71 / 71.000
    Years to Finish Degree           4.85            1.179              71 / 71.000
  Total
    GRE (V+Q)                     1298.94           98.224             250 / 250.000
    Number of Publications           4.36            2.357             250 / 250.000
    Years to Finish Degree           6.01            2.016             250 / 250.000

Tests of Equality of Group Means

                            Wilks' Lambda         F   df1   df2   Sig.
  GRE (V+Q)                          .998      .492     1   248   .483
  Number of Publications             .658   129.009     1   248   .000
  Years to Finish Degree             .867    37.885     1   248   .000

Page 49: Logistic Regression and Discriminant Function Analysis

Test Results

Box's M Test

  Box's M        49.679
  Approx. F       8.137
  df1                 6
  df2          114277.8
  Sig.             .000

Tests the null hypothesis of equal population covariance matrices.

This violates the assumption of homogeneity of VCV matrices. But this test is sensitive in general, and sensitive to violations of multivariate normality too. Tests of significance in discriminant analysis are robust to moderate violations of the homogeneity assumption.

Page 50: Logistic Regression and Discriminant Function Analysis

Eigenvalues

  Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
         1       .693(a)          100.0          100.0                    .640

a. The first 1 canonical discriminant functions were used in the analysis.

Wilks' Lambda

  Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
                    1            .590      129.854    3   .000
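As a check on how these two tables relate: for a single discriminant function, the canonical correlation is sqrt(λ / (1 + λ)) and Wilks' lambda is 1 / (1 + λ). A two-line sketch using the eigenvalue above:

import math

eigenvalue = 0.693   # from the Eigenvalues table

print(round(math.sqrt(eigenvalue / (1 + eigenvalue)), 3))   # ~.640, the canonical correlation
print(round(1 / (1 + eigenvalue), 3))                       # ~.591, Wilks' lambda (table: .590)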

Page 51: Logistic Regression and Discriminant Function Analysis

Standardized Canonical Discriminant Function Coefficients (Discriminant Weights)

                            Function 1
  GRE (V+Q)                      -.308
  Number of Publications          .944
  Years to Finish Degree         -.423

Structure Matrix (Discriminant Loadings)

                            Function 1
  Number of Publications          .866
  Years to Finish Degree         -.469
  GRE (V+Q)                       .054

Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions. Variables are ordered by absolute size of correlation within function.

Data from both of these outputs indicate that one of the predictors best discriminates who did/did not get a job. Which one is it?

Page 52: Logistic Regression and Discriminant Function Analysis

Canonical Discriminant Function Coefficients

                            Function 1
  GRE (V+Q)                      -.003
  Number of Publications          .493
  Years to Finish Degree         -.225
  (Constant)                     3.268

Unstandardized coefficients

Functions at Group Centroids

  JOB         Function 1
  Oops!            -.522
  Got One!         1.317

Unstandardized canonical discriminant functions evaluated at group means

This is the raw canonical discriminant function. The means for the groups on the raw canonical discriminant function can be used to establish cut-off points for classification.

Page 53: Logistic Regression and Discriminant Function Analysis

Prior Probabilities for Groups

  JOB         Prior   Cases Used in Analysis (Unweighted / Weighted)
  Oops!        .716             179 / 179.000
  Got One!     .284              71 / 71.000
  Total       1.000             250 / 250.000

Classification can be based on distance from the group centroids and can take into account information about the prior probability of group membership.

Page 54: Logistic Regression and Discriminant Function Analysis

Classification Results(b,c,d)

Cases Selected, Original

                       Predicted Group Membership
                       Oops!   Got One!   Total
  Count   Oops!          170          9     179
          Got One!        23         48      71
  %       Oops!          95.0        5.0   100.0
          Got One!       32.4       67.6   100.0

Cases Selected, Cross-validated(a)

                       Predicted Group Membership
                       Oops!   Got One!   Total
  Count   Oops!          169         10     179
          Got One!        24         47      71
  %       Oops!          94.4        5.6   100.0
          Got One!       33.8       66.2   100.0

Cases Not Selected, Original

                       Predicted Group Membership
                       Oops!   Got One!   Total
  Count   Oops!          175         10     185
          Got One!        17         48      65
  %       Oops!          94.6        5.4   100.0
          Got One!       26.2       73.8   100.0

a. Cross validation is done only for those cases in the analysis. In cross validation, each case is classified by the functions derived from all cases other than that case.
b. 87.2% of selected original grouped cases correctly classified.
c. 89.2% of unselected original grouped cases correctly classified.
d. 86.4% of selected cross-validated grouped cases correctly classified.

Page 55: Logistic Regression and Discriminant Function Analysis

[Histogram of canonical discriminant function 1 scores for JOB = Oops! (Mean = -.55, Std. Dev = 1.10, N = 364): the distribution appears bimodal.]

Two modes?

Page 56: Logistic Regression and Discriminant Function Analysis

[Histogram of canonical discriminant function 1 scores for JOB = Got One! (Mean = 1.30, Std. Dev = .62, N = 136)]

Page 57: Logistic Regression and Discriminant Function Analysis

Violation of the homogeneity assumption can affect the classification. To check, the analysis can be conducted using separate group covariance matrices.

Page 58: Logistic Regression and Discriminant Function Analysis

Classification Results(a,b)

Cases Selected, Original

                       Predicted Group Membership
                       Oops!   Got One!   Total
  Count   Oops!          165         14     179
          Got One!        21         50      71
  %       Oops!          92.2        7.8   100.0
          Got One!       29.6       70.4   100.0

Cases Not Selected, Original

                       Predicted Group Membership
                       Oops!   Got One!   Total
  Count   Oops!          168         17     185
          Got One!        11         54      65
  %       Oops!          90.8        9.2   100.0
          Got One!       16.9       83.1   100.0

a. 86.0% of selected original grouped cases correctly classified.
b. 88.8% of unselected original grouped cases correctly classified.

No noticeable change in the accuracy of classification.

Page 59: Logistic Regression and Discriminant Function Analysis

Discriminant Analysis: Three Groups

The group that did not get a job was actually composed of two subgroups: those that got interviews but did not land a job, and those that were never interviewed. This accounts for the bimodality in the discriminant function scores. A discriminant analysis of the three groups allows for the derivation of one more discriminant function, perhaps indicating the characteristics that separate those who get interviews from those who don't, or those whose interviews are successful from those whose interviews do not produce a job offer.

Page 60: Logistic Regression and Discriminant Function Analysis

Remember this?

[Histogram of canonical discriminant function 1 scores for JOB = Oops! (Mean = -.55, Std. Dev = 1.10, N = 364): the distribution appears bimodal.]

Two modes?

Pages 61-62: [slide content not reproduced in the transcript]
Page 63: Logistic Regression and Discriminant Function Analysis

DISCRIMINANT
  /GROUPS=group(1 3)
  /VARIABLES=gre pubs years
  /SELECT=sample(1)
  /ANALYSIS ALL
  /SAVE=CLASS SCORES PROBS
  /PRIORS SIZE
  /STATISTICS=MEAN STDDEV UNIVF BOXM COEFF RAW CORR COV GCOV TCOV TABLE CROSSVALID
  /PLOT=COMBINED SEPARATE MAP
  /PLOT=CASES
  /CLASSIFY=NONMISSING POOLED .

Page 64: Logistic Regression and Discriminant Function Analysis

Group Statistics

                                     Mean   Std. Deviation   Valid N (unweighted / weighted)
  GROUP = Unemployed
    GRE (V+Q)                     1307.54           85.491              54 / 54.000
    Number of Publications           1.59            1.434              54 / 54.000
    Years to Finish Degree           8.57            1.797              54 / 54.000
  GROUP = Got a Job
    GRE (V+Q)                     1305.87          101.824              71 / 71.000
    Number of Publications           6.55            1.593              71 / 71.000
    Years to Finish Degree           4.85            1.179              71 / 71.000
  GROUP = Interview Only
    GRE (V+Q)                     1291.30          101.382             125 / 125.000
    Number of Publications           4.32            1.664             125 / 125.000
    Years to Finish Degree           5.56            1.467             125 / 125.000
  Total
    GRE (V+Q)                     1298.94           98.224             250 / 250.000
    Number of Publications           4.36            2.357             250 / 250.000
    Years to Finish Degree           6.01            2.016             250 / 250.000

Tests of Equality of Group Means

                            Wilks' Lambda         F   df1   df2   Sig.
  GRE (V+Q)                          .994      .761     2   247   .468
  Number of Publications             .455   147.864     2   247   .000
  Years to Finish Degree             .529   109.977     2   247   .000

Page 65: Logistic Regression and Discriminant Function Analysis

Test Results

Box's M Test

  Box's M        21.796
  Approx. F       1.780
  df1                12
  df2          137372.4
  Sig.             .045

Tests the null hypothesis of equal population covariance matrices.

Separating the three groups produces better homogeneity of VCV matrices. Still significant, but just barely. Not enough to worry about.

Page 66: Logistic Regression and Discriminant Function Analysis

Eigenvalues

  Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
         1      5.353(a)           99.1           99.1                    .918
         2       .047(a)             .9          100.0                    .211

a. The first 2 canonical discriminant functions were used in the analysis.

Wilks' Lambda

  Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
          1 through 2            .150      466.074    6   .000
                    2            .955       11.246    2   .004

Two significant linear combinations can be derived, but they are not of equal importance.

Page 67: Logistic Regression and Discriminant Function Analysis

Standardized Canonical Discriminant Function Coefficients (Weights)

                            Function 1   Function 2
  GRE (V+Q)                       .734         .194
  Number of Publications        -1.246         .521
  Years to Finish Degree         1.032         .602

Structure Matrix (Loadings)

                            Function 1   Function 2
  Number of Publications         -.466        .867*
  Years to Finish Degree          .401        .796*
  GRE (V+Q)                       .008        .354*

Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions. Variables are ordered by absolute size of correlation within function.
*. Largest absolute correlation between each variable and any discriminant function.

What do the linear combinations mean now?

Page 68: Logistic Regression and Discriminant Function Analysis

Functions at Group Centroids

  GROUP            Function 1   Function 2
  Unemployed            4.026         .162
  Got a Job            -2.469         .251
  Interview Only        -.337        -.213

Unstandardized canonical discriminant functions evaluated at group means

Canonical Discriminant Function Coefficients

                            Function 1   Function 2
  GRE (V+Q)                       .007         .002
  Number of Publications         -.781         .326
  Years to Finish Degree          .701         .409
  (Constant)                   -10.496       -6.445

Unstandardized coefficients

Page 69: Logistic Regression and Discriminant Function Analysis

Functions at Group Centroids

  GROUP            Function 1   Function 2
  Unemployed            4.026         .162
  Got a Job            -2.469         .251
  Interview Only        -.337        -.213

Unstandardized canonical discriminant functions evaluated at group means

[Plot of the three group centroids in the DF1 x DF2 plane (axes from -4 to +4): "got a job" sits at the low end of DF1, "unemployed" at the high end, and "interview only" near the middle.]

Page 70: Logistic Regression and Discriminant Function Analysis

Loadings

                    DF1     DF2
  No. Pubs        -.466    .867
  Yrs to finish    .401    .796
  GRE              .008    .354

Weights

                    DF1     DF2
  No. Pubs       -1.246    .521
  Yrs to finish   1.032    .602
  GRE              .734    .194

[The same centroid plot in the DF1 x DF2 plane, repeated here for interpreting the two functions.]

Page 71: Logistic Regression and Discriminant Function Analysis

This figure shows that discriminant function #1, which is made up of number of publications and years to finish, reliably differentiates between those who got jobs, those who had interviews only, and those who had no job or interview. Specifically, a high value on DF1 was associated with not getting a job, suggesting that having few publications (loading = -.466) and taking a long time to finish (loading = .401) were associated with not getting a job.

[Centroid plot in the DF1 x DF2 plane, as above.]

Page 72: Logistic Regression and Discriminant Function Analysis

Prior Probabilities for Groups

  GROUP            Prior   Cases Used in Analysis (Unweighted / Weighted)
  Unemployed        .216              54 / 54.000
  Got a Job         .284              71 / 71.000
  Interview Only    .500             125 / 125.000
  Total            1.000             250 / 250.000

Page 73: Logistic Regression and Discriminant Function Analysis

Classification Function Coefficients (Fisher's linear discriminant functions)

                            Unemployed   Got a Job   Interview Only
  GRE (V+Q)                       .238        .190             .205
  Number of Publications       -10.539      -5.440           -7.256
  Years to Finish Degree        11.018       6.503            7.808
  (Constant)                  -196.112    -123.212         -139.036

Page 74: Logistic Regression and Discriminant Function Analysis

Territorial Map

[Territorial map of canonical discriminant functions 1 and 2 (both axes from -6.0 to +6.0). The plane is divided into three classification regions: group 2 on the left, group 3 in the middle, and group 1 on the right, with asterisks marking the group centroids.]

Symbols used in the territorial map:

  Symbol   Group   Label
       1       1   Unemployed
       2       2   Got a Job
       3       3   Interview Only
       *           Indicates a group centroid

Page 75: Logistic Regression and Discriminant Function Analysis

Canonical Discriminant Functions

[Scatterplot of cases on Function 1 (x axis, -6 to 8) and Function 2 (y axis, -3 to 4), with group centroids marked for Unemployed, Got a Job, and Interview Only.]

Page 76: Logistic Regression and Discriminant Function Analysis

Classification

A classification function is derived for each group. The original data are used to estimate a classification score for each person, for each group. The person is then assigned to the group that produces the largest classification score.

Classification Function Coefficients (Fisher's linear discriminant functions)

                            Unemployed   Got a Job   Interview Only
  GRE (V+Q)                       .238        .190             .205
  Number of Publications       -10.539      -5.440           -7.256
  Years to Finish Degree        11.018       6.503            7.808
  (Constant)                  -196.112    -123.212         -139.036
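A small sketch of that assignment rule using the Fisher coefficients above (the student's predictor values are made up):

import numpy as np

# rows: GRE (V+Q), Number of Publications, Years to Finish Degree
# columns: Unemployed, Got a Job, Interview Only
coefs = np.array([[0.238,     0.190,     0.205],
                  [-10.539,  -5.440,    -7.256],
                  [11.018,    6.503,     7.808]])
constants = np.array([-196.112, -123.212, -139.036])
groups = ["Unemployed", "Got a Job", "Interview Only"]

x = np.array([1300, 5, 5])       # hypothetical student: GRE 1300, 5 pubs, 5 years
scores = x @ coefs + constants   # one classification score per group
print(dict(zip(groups, scores.round(1))))
print("Assigned to:", groups[int(np.argmax(scores))])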

Page 77: Logistic Regression and Discriminant Function Analysis

Classification Results(b,c,d)

Cases Selected, Original

                           Predicted Group Membership
                         Unemployed   Got a Job   Interview Only   Total
  Count   Unemployed             51           0                3      54
          Got a Job               0          51               20      71
          Interview Only          0          13              112     125
  %       Unemployed           94.4          .0              5.6   100.0
          Got a Job              .0        71.8             28.2   100.0
          Interview Only         .0        10.4             89.6   100.0

Cases Selected, Cross-validated(a)

                           Predicted Group Membership
                         Unemployed   Got a Job   Interview Only   Total
  Count   Unemployed             51           0                3      54
          Got a Job               0          51               20      71
          Interview Only          0          13              112     125
  %       Unemployed           94.4          .0              5.6   100.0
          Got a Job              .0        71.8             28.2   100.0
          Interview Only         .0        10.4             89.6   100.0

Cases Not Selected, Original

                           Predicted Group Membership
                         Unemployed   Got a Job   Interview Only   Total
  Count   Unemployed             62           0                4      66
          Got a Job               0          47               18      65
          Interview Only          4          11              104     119
  %       Unemployed           93.9          .0              6.1   100.0
          Got a Job              .0        72.3             27.7   100.0
          Interview Only        3.4         9.2             87.4   100.0

a. Cross validation is done only for those cases in the analysis. In cross validation, each case is classified by the functions derived from all cases other than that case.
b. 85.6% of selected original grouped cases correctly classified.
c. 85.2% of unselected original grouped cases correctly classified.
d. 85.6% of selected cross-validated grouped cases correctly classified.

Page 78: Logistic Regression and Discriminant Function Analysis

Is the classification better than would be expected by chance? Observed values:

                           Expected
                         Unemployed   Got a Job   Interview Only   All
  Actual  Unemployed             51           0                3    54
          Got a Job               0          51               20    71
          Interview Only          0          13              112   125
  All                            51          64              135   250

Page 79: Logistic Regression and Discriminant Function Analysis

Expected classification by chance: E = (Row total x Column total) / Total N

                            Expected
                          Unemployed      Got a Job      Interview Only   All
  Actual  Unemployed     (51x54)/250    (64x54)/250       (135x54)/250     54
          Got a Job      (51x71)/250    (64x71)/250       (135x71)/250     71
          Interview Only (51x125)/250   (64x125)/250     (135x125)/250    125
  All                             51             64                135    250

Page 80: Logistic Regression and Discriminant Function Analysis

Correct classification that would occur by chance:

                           Expected
                         Unemployed   Got a Job   Interview Only   All
  Actual  Unemployed         11.016      13.824            29.16    54
          Got a Job          14.484      18.176            38.34    71
          Interview Only       25.5          32             67.5   125
  All                            51          64              135   250

Page 81: Logistic Regression and Discriminant Function Analysis

The difference between chance-expected and actual classification can be tested with a chi-square as well:

\chi^2 = \sum \frac{(f_{\text{observed}} - f_{\text{expected}})^2}{f_{\text{expected}}}

Where degrees of freedom = (# groups - 1)², so df = 4 here.

\chi^2 = 145.13 + 13.82 + 23.47 + 14.48 + 59.25 + 8.77 + 25.5 + 11.28 + 29.34 = 331.04
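The whole computation in a short NumPy sketch, from the observed classification table to the chi-square:

import numpy as np

observed = np.array([[51,  0,   3],    # Unemployed
                     [ 0, 51,  20],    # Got a Job
                     [ 0, 13, 112]])   # Interview Only

row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()   # E = (row x column) / N

chi_square = ((observed - expected) ** 2 / expected).sum()
df = (observed.shape[0] - 1) ** 2
print(expected.round(3))
print(round(chi_square, 2), "df =", df)   # ~331 (slide: 331.04), df = 4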