Phenotypic Factor Analysis

Marleen de Moor & Meike Bartels

Department of Biological Psychology, VU University Amsterdam

[email protected] / [email protected]

March 3, 2010 M. de Moor, Twin Workshop Boulder 2

Outline

• Introduction to factor analysis
  – What is factor analysis
  – Relationship with regression and SEM
  – Types of factor analysis
• Phenotypic factor analysis
  – 1 factor model
  – 2 factor model
• More advanced models
  – Factor models for categorical data
  – Multigroup factor models and measurement invariance
• From phenotypic to genetic factor analysis…

Outline

• Introduction to factor analysis
  – What is factor analysis
  – Relationship with regression and SEM
  – Types of factor analysis
• Phenotypic factor analysis
• More advanced models
• From phenotypic to genetic factor analysis…

Factor analysis

• Collection of methods
• Measurement model
• Describe/explain pattern of observed correlations

[Diagram labels: latent constructs / unobserved variables / latent factors; observed variables / indicators; measurement error]

Classic example: IQ

Relationship with regression analysis

Multiple regression:
[Path diagram: predictors x1 to x6 and one outcome y1]

Multivariate multiple regression:
[Path diagram: predictors x1 to x6 and three outcomes y1, y2 and y3]

Relationship with SEM

[Diagram: two measurement models connected by a structural model]

Measurement model (latent variables measured by observed variables)

Structural model (regression model among latent variables)

Types of factor analysis

• Principal component analysis (PCA)
• Exploratory factor analysis (EFA)
• Confirmatory factor analysis (CFA)

PCA

• Data reduction technique
• Linear transformation of the data
• Summarize the observed pattern of correlations among variables with a smaller number of principal components

PCA

• First component explains as much variance as possible
• Different rotations possible: orthogonal or oblique
• Principal components contain both common and residual variance!

Adolescent data on:

Quality of life        Anxious depression
Happiness              Somatic complaints
Life satisfaction      Social problems
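As a rough illustration of PCA as a data reduction technique, the sketch below runs base R's `prcomp()` on simulated data. The six variable names mirror the adolescent measures listed above, but the data, loadings and sample size are made up for this example and are not the workshop data.

```r
# Simulate six correlated indicators from two correlated latent factors
# (all numbers are invented for illustration).
set.seed(1)
n  <- 500
f1 <- rnorm(n)
f2 <- -0.5 * f1 + sqrt(1 - 0.5^2) * rnorm(n)
y  <- cbind(qol  = 0.8 * f1 + rnorm(n, sd = 0.6),
            hap  = 0.7 * f1 + rnorm(n, sd = 0.7),
            sat  = 0.7 * f1 + rnorm(n, sd = 0.7),
            ad   = 0.8 * f2 + rnorm(n, sd = 0.6),
            soma = 0.6 * f2 + rnorm(n, sd = 0.8),
            soc  = 0.6 * f2 + rnorm(n, sd = 0.8))

# PCA on standardized variables (scale. = TRUE), i.e. on the correlation matrix
pca <- prcomp(y, scale. = TRUE)

# Proportion of the total variance captured by each principal component;
# the components are ordered so that the first one captures the most.
prop_var <- pca$sdev^2 / sum(pca$sdev^2)
print(round(prop_var, 3))
```

Because each component is a linear combination of the observed variables, it mixes common and residual variance, which is the point made in the last bullet above.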

EFA

• Atheoretical
• Discover the underlying constructs
• Determine number of latent factors
• Again, different rotations possible

[Path diagrams: a model with one factor (F1) and a model with two factors (F1, F2) for the observed variables y1 to y6]

CFA

• Theoretical (model=hypothesis)
• Test hypothesis about underlying constructs

[Path diagram: two factors (F1, F2) for the observed variables y1 to y6]

Outline

• Introduction to factor analysis
• Phenotypic factor analysis
  – 1 factor model
  – 2 factor model
• More advanced models
• From phenotypic to genetic factor analysis…

The 1 factor model

[Path diagram: latent factor F1, with variance Var(F1), loads on the observed variables Y1 to Y6 through the factor loadings f11, f21, f31, f41, f51 and f61. Each Yj also receives a residual Ej through a path fixed to 1, with residual variance Var(Ej).]

Y is influenced by F and E

[Path diagram: the 1 factor model, in which each observed variable Y1 to Y6 is influenced by the factor F1 (loadings f11 to f61) and by its own residual E1 to E6 (paths fixed to 1).]

More formally…

• $Y_{i1} = f_{11} F_{i1} + E_{i1}$
• $Y_{i2} = f_{21} F_{i1} + E_{i2}$
• $Y_{i3} = f_{31} F_{i1} + E_{i3}$
• …
• $Y_{i6} = f_{61} F_{i1} + E_{i6}$

Random variables: Y, F and E (vary across individuals i=1…N)

Fixed parameters: the factor loadings f (constant across individuals)
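To make the equations concrete, here is a minimal simulation sketch in base R. The loadings, residual variances and sample size are hypothetical values chosen for illustration; the factor variance is set to 1.

```r
set.seed(2010)
n     <- 20000                                # number of individuals (assumed)
f     <- c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4)      # hypothetical loadings f11 ... f61
var_e <- c(0.3, 0.4, 0.5, 0.6, 0.7, 0.8)      # hypothetical residual variances

F1 <- rnorm(n)                                # factor scores, Var(F1) = 1
E  <- sapply(var_e, function(v) rnorm(n, sd = sqrt(v)))  # n x 6 matrix of residuals
Y  <- outer(F1, f) + E                        # Y[i, j] = f[j] * F1[i] + E[i, j]

# Covariance matrix implied by the model, compared with the sample covariance
implied      <- outer(f, f) + diag(var_e)
max_abs_diff <- max(abs(cov(Y) - implied))
print(round(max_abs_diff, 3))
```

With a large simulated sample, the sample covariance matrix of Y should be close to the matrix implied by the loadings and residual variances.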

The 2 factor model

[Path diagram: F1, with variance Var(F1), loads on Y1, Y2 and Y3 (f11, f21, f31); F2, with variance Var(F2), loads on Y4, Y5 and Y6 (f42, f52, f62); the factors covary, Cov(F1,F2). Each Yj receives a residual Ej through a path fixed to 1, with residual variance Var(Ej).]

The 2 factor model

[Path diagram: the 2 factor model (F1 on Y1 to Y3, F2 on Y4 to Y6, Cov(F1,F2), residuals E1 to E6) with an additional cross-loading f32 from F2 to Y3.]

The 2 factor model

[Path diagram: the 2 factor model (F1 on Y1 to Y3, F2 on Y4 to Y6, Cov(F1,F2), residuals E1 to E6) with an additional residual covariance Cov(E1,E4).]

The 2 factor model

[Path diagram: the 2 factor model without cross-loadings or residual covariances: F1 loads on Y1 to Y3 (f11, f21, f31), F2 loads on Y4 to Y6 (f42, f52, f62), with Var(F1), Var(F2), Cov(F1,F2) and residual variances Var(E1) to Var(E6).]

From equations to matrices

• $Y_{i1} = f_{11} F_{i1} + E_{i1}$
• $Y_{i2} = f_{21} F_{i1} + E_{i2}$
• $Y_{i3} = f_{31} F_{i1} + E_{i3}$
• $Y_{i4} = f_{42} F_{i2} + E_{i4}$
• $Y_{i5} = f_{52} F_{i2} + E_{i5}$
• $Y_{i6} = f_{62} F_{i2} + E_{i6}$

$$
\begin{pmatrix} Y_{i1} \\ Y_{i2} \\ Y_{i3} \\ Y_{i4} \\ Y_{i5} \\ Y_{i6} \end{pmatrix}
=
\begin{pmatrix}
f_{11} & 0 \\
f_{21} & 0 \\
f_{31} & 0 \\
0 & f_{42} \\
0 & f_{52} \\
0 & f_{62}
\end{pmatrix}
\begin{pmatrix} F_{i1} \\ F_{i2} \end{pmatrix}
+
\begin{pmatrix} E_{i1} \\ E_{i2} \\ E_{i3} \\ E_{i4} \\ E_{i5} \\ E_{i6} \end{pmatrix}
$$

i=1…N number of individuals
j=1…J number of observed variables
k=1…K number of factors

Assumption: Data follow a multivariate normal distribution
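The step from separate equations to one matrix equation can be checked numerically. The sketch below uses hypothetical loadings, factor scores and residuals for a single individual and confirms that the matrix product gives the same six values as the six scalar equations.

```r
# Hypothetical loadings for the 2 factor model (made-up values).
# matrix() fills column by column: first the F1 column, then the F2 column.
L <- matrix(c(0.8, 0.7, 0.6, 0,   0,   0,      # column 1: f11, f21, f31, 0, 0, 0
              0,   0,   0,   0.9, 0.5, 0.4),   # column 2: 0, 0, 0, f42, f52, f62
            nrow = 6, ncol = 2)

Fi <- c(1.2, -0.5)                             # factor scores F_i1, F_i2 of one individual
Ei <- c(0.1, -0.2, 0.3, 0.0, -0.1, 0.2)        # residuals E_i1 ... E_i6

# Matrix form: Y_i = L %*% F_i + E_i
Yi_matrix <- as.vector(L %*% Fi + Ei)

# Scalar form: one equation per observed variable
Yi_scalar <- c(0.8 * Fi[1] + Ei[1],
               0.7 * Fi[1] + Ei[2],
               0.6 * Fi[1] + Ei[3],
               0.9 * Fi[2] + Ei[4],
               0.5 * Fi[2] + Ei[5],
               0.4 * Fi[2] + Ei[6])

print(all.equal(Yi_matrix, Yi_scalar))
```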

Expected (co)variances

Can be obtained in 2 ways:
• Using path diagram (Wright’s rules)
• Using equations (algebraic derivation)

Expected (co)variances – path diagram

EXERCISE:

Write down the expectations for:

Var(Y1)=??

Cov(Y1,Y2)=??

Cov(Y1,Y4)=??

[Path diagram: the 2 factor model. F1 loads on Y1 to Y3 (f11, f21, f31), F2 loads on Y4 to Y6 (f42, f52, f62), with Var(F1), Var(F2), Cov(F1,F2), residual paths fixed to 1 and residual variances Var(E1) to Var(E6).]

Expected (co)variances – path diagram

ANSWER:

$\mathrm{Var}(Y_1) = f_{11}^2 \, \mathrm{var}(F_1) + \mathrm{var}(E_1)$

Cov(Y1,Y2)=??

Cov(Y1,Y4)=??

[Path diagram: the 2 factor model, as in the exercise.]

Expected (co)variances – path diagram

ANSWER:

$\mathrm{Var}(Y_1) = f_{11}^2 \, \mathrm{var}(F_1) + \mathrm{var}(E_1)$

$\mathrm{Cov}(Y_1,Y_2) = f_{11} f_{21} \, \mathrm{var}(F_1)$

Cov(Y1,Y4)=??

[Path diagram: the 2 factor model, as in the exercise.]

Expected (co)variances – path diagram

ANSWER:

$\mathrm{Var}(Y_1) = f_{11}^2 \, \mathrm{var}(F_1) + \mathrm{var}(E_1)$

$\mathrm{Cov}(Y_1,Y_2) = f_{11} f_{21} \, \mathrm{var}(F_1)$

$\mathrm{Cov}(Y_1,Y_4) = f_{11} f_{42} \, \mathrm{cov}(F_1,F_2)$

[Path diagram: the 2 factor model, as in the exercise.]
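The three path-tracing answers can be checked against the matrix expression L %*% P %*% t(L) + T used in the practical scripts. The sketch below is base R only; all parameter values are hypothetical, and the residual matrix is called `Tm` so that it does not mask R's built-in `T`.

```r
# Hypothetical parameter values for the 2 factor model
f11 <- 0.8; f21 <- 0.7; f31 <- 0.6
f42 <- 0.9; f52 <- 0.5; f62 <- 0.4

L  <- matrix(c(f11, f21, f31, 0,   0,   0,
               0,   0,   0,   f42, f52, f62), nrow = 6, ncol = 2)
P  <- matrix(c(1.5, 0.6,
               0.6, 2.0), nrow = 2)            # var(F1), cov(F1,F2), var(F2)
Tm <- diag(c(0.3, 0.4, 0.5, 0.6, 0.7, 0.8))    # var(E1) ... var(E6)

expCov <- L %*% P %*% t(L) + Tm                # model-implied covariance matrix

# Expectations obtained by tracing the path diagram
var_y1   <- f11^2 * P[1, 1] + Tm[1, 1]
cov_y1y2 <- f11 * f21 * P[1, 1]
cov_y1y4 <- f11 * f42 * P[1, 2]

print(c(expCov[1, 1] - var_y1,
        expCov[1, 2] - cov_y1y2,
        expCov[1, 4] - cov_y1y4))
```

The printed differences should all be zero up to floating-point error.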

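Expected (co)variances - equations

$$
\begin{aligned}
\mathrm{Var}(Y_1) &= \mathrm{E}\left[ (f_{11} F_{i1} + E_{i1}) (f_{11} F_{i1} + E_{i1}) \right] \\
&= \mathrm{E}\left[ (f_{11} F_{i1})^2 + 2 f_{11} F_{i1} E_{i1} + (E_{i1})^2 \right] \\
&= f_{11}^2 \, \mathrm{var}(F_1) + \mathrm{var}(E_1)
\end{aligned}
$$

$$
\begin{aligned}
\mathrm{Cov}(Y_1,Y_2) &= \mathrm{E}\left[ (f_{11} F_{i1} + E_{i1}) (f_{21} F_{i1} + E_{i2}) \right] \\
&= \mathrm{E}\left[ f_{11} F_{i1} f_{21} F_{i1} + f_{11} F_{i1} E_{i2} + E_{i1} f_{21} F_{i1} + E_{i1} E_{i2} \right] \\
&= f_{11} f_{21} \, \mathrm{var}(F_1)
\end{aligned}
$$

$$
\mathrm{Cov}(Y_1,Y_4) = f_{11} f_{42} \, \mathrm{cov}(F_1,F_2)
$$

The cross terms drop out because the factors and residuals have mean zero, the residuals are uncorrelated with the factors, and the residuals are uncorrelated with each other.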
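The third result is stated without intermediate steps. A worked version, following the same pattern as the first two derivations and assuming zero-mean factors and residuals that are uncorrelated with the factors and with each other, could look like this:

```latex
\begin{aligned}
\mathrm{Cov}(Y_1,Y_4)
  &= \mathrm{E}\left[ (f_{11} F_{i1} + E_{i1}) (f_{42} F_{i2} + E_{i4}) \right] \\
  &= f_{11} f_{42}\, \mathrm{E}[F_{i1} F_{i2}]
     + f_{11}\, \mathrm{E}[F_{i1} E_{i4}]
     + f_{42}\, \mathrm{E}[E_{i1} F_{i2}]
     + \mathrm{E}[E_{i1} E_{i4}] \\
  &= f_{11} f_{42}\, \mathrm{cov}(F_1,F_2) + 0 + 0 + 0 \\
  &= f_{11} f_{42}\, \mathrm{cov}(F_1,F_2)
\end{aligned}
```

The only surviving term is the one that runs through both factors, which matches the path-tracing route Y1 ← F1 ↔ F2 → Y4.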

Expected (co)variances - equations

$$
\begin{pmatrix}
\mathrm{var}(Y_1) & \mathrm{cov}(Y_1,Y_2) & \cdots & \mathrm{cov}(Y_1,Y_6) \\
\mathrm{cov}(Y_2,Y_1) & \mathrm{var}(Y_2) & \cdots & \mathrm{cov}(Y_2,Y_6) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{cov}(Y_6,Y_1) & \mathrm{cov}(Y_6,Y_2) & \cdots & \mathrm{var}(Y_6)
\end{pmatrix}
=
\begin{pmatrix}
f_{11} & 0 \\
f_{21} & 0 \\
f_{31} & 0 \\
0 & f_{42} \\
0 & f_{52} \\
0 & f_{62}
\end{pmatrix}
\begin{pmatrix}
\mathrm{var}(F_1) & \mathrm{cov}(F_1,F_2) \\
\mathrm{cov}(F_2,F_1) & \mathrm{var}(F_2)
\end{pmatrix}
\begin{pmatrix}
f_{11} & f_{21} & f_{31} & 0 & 0 & 0 \\
0 & 0 & 0 & f_{42} & f_{52} & f_{62}
\end{pmatrix}
+
\mathrm{diag}\left( \mathrm{var}(E_1), \mathrm{var}(E_2), \mathrm{var}(E_3), \mathrm{var}(E_4), \mathrm{var}(E_5), \mathrm{var}(E_6) \right)
$$

Dimensions: (JxJ symmetric) = (JxK full) * (KxK symm) * (KxJ full) + (JxJ diag)

$\Sigma = \Lambda \Psi \Lambda^t + \Theta$ (LISREL notation)

expCov = L %*% P %*% t(L) + T (OpenMx notation)

j=1…J number of observed variables
k=1…K number of factors

Identification

• Latent factors have no scale: means, variances?

For each latent factor:
• Mean: fix to zero
• Variance: two most commonly used options
  – Fix to one, estimate all factor loadings
  – Estimate variance, fix first factor loading to one

Identification of (co)variances

Factor variances fixed to one, all factor loadings estimated:

$$
\begin{pmatrix}
\mathrm{var}(Y_1) & \mathrm{cov}(Y_1,Y_2) & \cdots & \mathrm{cov}(Y_1,Y_6) \\
\mathrm{cov}(Y_2,Y_1) & \mathrm{var}(Y_2) & \cdots & \mathrm{cov}(Y_2,Y_6) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{cov}(Y_6,Y_1) & \mathrm{cov}(Y_6,Y_2) & \cdots & \mathrm{var}(Y_6)
\end{pmatrix}
=
\begin{pmatrix}
f_{11} & 0 \\
f_{21} & 0 \\
f_{31} & 0 \\
0 & f_{42} \\
0 & f_{52} \\
0 & f_{62}
\end{pmatrix}
\begin{pmatrix}
1 & \mathrm{cov}(F_1,F_2) \\
\mathrm{cov}(F_2,F_1) & 1
\end{pmatrix}
\begin{pmatrix}
f_{11} & f_{21} & f_{31} & 0 & 0 & 0 \\
0 & 0 & 0 & f_{42} & f_{52} & f_{62}
\end{pmatrix}
+
\mathrm{diag}\left( \mathrm{var}(E_1), \mathrm{var}(E_2), \mathrm{var}(E_3), \mathrm{var}(E_4), \mathrm{var}(E_5), \mathrm{var}(E_6) \right)
$$

Identification of (co)variances

Factor variances estimated, first factor loading of each factor fixed to one:

$$
\begin{pmatrix}
\mathrm{var}(Y_1) & \mathrm{cov}(Y_1,Y_2) & \cdots & \mathrm{cov}(Y_1,Y_6) \\
\mathrm{cov}(Y_2,Y_1) & \mathrm{var}(Y_2) & \cdots & \mathrm{cov}(Y_2,Y_6) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{cov}(Y_6,Y_1) & \mathrm{cov}(Y_6,Y_2) & \cdots & \mathrm{var}(Y_6)
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 \\
f_{21} & 0 \\
f_{31} & 0 \\
0 & 1 \\
0 & f_{52} \\
0 & f_{62}
\end{pmatrix}
\begin{pmatrix}
\mathrm{var}(F_1) & \mathrm{cov}(F_1,F_2) \\
\mathrm{cov}(F_2,F_1) & \mathrm{var}(F_2)
\end{pmatrix}
\begin{pmatrix}
1 & f_{21} & f_{31} & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & f_{52} & f_{62}
\end{pmatrix}
+
\mathrm{diag}\left( \mathrm{var}(E_1), \mathrm{var}(E_2), \mathrm{var}(E_3), \mathrm{var}(E_4), \mathrm{var}(E_5), \mathrm{var}(E_6) \right)
$$
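The two ways of giving the factors a scale are reparameterizations of the same model: they imply the same covariance matrix for the observed variables. The base R sketch below illustrates this with hypothetical parameter values; `Th` stands in for the residual variance matrix.

```r
# Scaling 1: factor variances fixed to 1, all loadings free (made-up values)
L1 <- matrix(c(0.8, 0.6, 0.4, 0,   0,   0,
               0,   0,   0,   0.5, 1.0, 1.5), nrow = 6, ncol = 2)
P1 <- matrix(c(1,   0.3,
               0.3, 1), nrow = 2)
Th <- diag(c(0.3, 0.4, 0.5, 0.6, 0.7, 0.8))

# Scaling 2: first loading of each factor fixed to 1, factor variances free.
# Dividing each column of L1 by its first nonzero loading and moving that
# scale into P gives the equivalent parameter values.
s  <- c(L1[1, 1], L1[4, 2])          # first loading of F1 and of F2
L2 <- L1 %*% diag(1 / s)             # now L2[1, 1] and L2[4, 2] equal 1
P2 <- diag(s) %*% P1 %*% diag(s)     # factor variances become s^2

cov1 <- L1 %*% P1 %*% t(L1) + Th
cov2 <- L2 %*% P2 %*% t(L2) + Th

print(all.equal(cov1, cov2))
print(round(P2, 3))
```

Both scalings fit the data equally well; only the units in which loadings and factor (co)variances are expressed differ.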

Identification

• Count number of observed statistics
• Count number of free parameters

• If #obs. stat. < #free par. → Model unidentified
• If #obs. stat. = #free par. → Model just-identified
• If #obs. stat. > #free par. → Model identified (over-identified)

Identification of 1 factor model

Observed statistics:
#obs. var. = J = 6
#obs. cov. = J(J-1)/2 = 6*5/2 = 15
#obs. var/cov. = J(J+1)/2 = 6*7/2 = 21

Free parameters:
# residual variances = 6
# factor loadings = 6

Degrees of freedom:
df = 21-12 = 9

[Path diagram: the 1 factor model with the variance of F1 fixed to 1, free loadings f11 to f61, residual paths fixed to 1 and free residual variances Var(E1) to Var(E6).]

Identification of 2 factor model

Observed statistics:
#obs. var. = J = 6
#obs. cov. = J(J-1)/2 = 6*5/2 = 15
#obs. var/cov. = J(J+1)/2 = 6*7/2 = 21

Free parameters:
# residual variances = 6
# factor loadings = 6
# covariances among factors = 1

Degrees of freedom:
df = 21-13 = 8

[Path diagram: the 2 factor model with the variances of F1 and F2 fixed to 1, free loadings f11, f21, f31, f42, f52 and f62, free covariance Cov(F1,F2), residual paths fixed to 1 and free residual variances Var(E1) to Var(E6).]
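The counting rule above is easy to script. The helper below, `count_df()`, is a hypothetical function written for this illustration (it is not part of OpenMx); it counts observed variances and covariances and subtracts the free parameters.

```r
# Count observed statistics, free parameters and degrees of freedom for a
# factor model fitted to a covariance matrix (means are not modelled here).
count_df <- function(n_var, n_loadings, n_factor_cov, n_resid_var = n_var) {
  n_obs_stat <- n_var * (n_var + 1) / 2              # variances + covariances
  n_free     <- n_loadings + n_factor_cov + n_resid_var
  c(obs_stat = n_obs_stat, free_par = n_free, df = n_obs_stat - n_free)
}

# Factor variances fixed to 1 in both models
one_factor <- count_df(n_var = 6, n_loadings = 6, n_factor_cov = 0)
two_factor <- count_df(n_var = 6, n_loadings = 6, n_factor_cov = 1)

print(rbind(one_factor, two_factor))
```

A positive df is necessary for an over-identified model, but the count alone does not guarantee that every parameter can be estimated.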

Practical – Description of data

DATASET

- Netherlands Twin Register (www.tweelingenregister.org)

- Dutch Health and Behavior Questionnaire (DHBQ)

- Adolescent Twins and non-twin Siblings

- Aged 14 and 16 (siblings between 12 and 25)

- Online & Paper and Pencil

CONTENT DHBQ

- Psychopathology
- Leisure time activities
- Exercise
- Family size (no. of sibs)
- Self-esteem
- Family situation (divorce)
- Optimism
- Zygosity
- Life events
- Height, weight
- Loneliness
- Eating disorders
- Number of peers and peer relations
- General health and illnesses (asthma, migraine, etc.)
- Pubertal development
- Hours of sleep
- Personality (age 16)
- Family functioning (Family Functioning, Family Conflict)
- Life style (smoking, alcohol use, marijuana use)
- Educational achievement (incl. truancy)
- Wellbeing (Happiness, Satisfaction with Life, Quality of Life)

Sample overview

Group       N of individuals   N of families   1 twin   2 twins   1 twin + sib   2 twins + sib
MZM         1061               474             28       290       15             141
DZM         917                425             56       232       14             123
MZF         1540               697             54       432       11             200
DZF         1116               512             49       309       13             141
DOS         2061               999             169      566       32             232
Sibs only   78                 78              --       --        --             --
Total       6773               3185            356      1829      85             837

Today’s Focus

- Youth Self Report (YSR)

- Subjective Wellbeing

* subjective happiness

* satisfaction with life

* quality of life

- General Family Functioning

Practical 1: Single group factor models

Data: CFA_family_wellbeing.dat
• 1000 adolescent twins (one twin per family)
• Observed variables:
  – Quality of life
  – Happiness
  – Satisfaction with life
  – Anxious depression scale (YSR)
  – Somatic complaints scale (YSR)
  – Social problems scale (YSR)

Files are on F:\marleen\Boulder2010\CFA

Practical 1: Single group factor models

1 factor model vs. 2 factor model

[Path diagram, 1 factor model: WB, with variance Var(WB), loads on QOL, HAP, SAT, AD, SOMA and SOC (f11, f21, f31, f41, f51, f61); residuals E1 to E6 enter through paths fixed to 1, with variances Var(E1) to Var(E6).]

[Path diagram, 2 factor model: PosWB, with variance Var(PosWB), loads on QOL, HAP and SAT (f11, f21, f31); NegWB, with variance Var(NegWB), loads on AD, SOMA and SOC (f42, f52, f62); the factors covary, Cov(PosWB, NegWB); residuals E1 to E6 enter through paths fixed to 1, with variances Var(E1) to Var(E6).]

Practical 1a: Single group 1 factor model

OpenMx script OneFactorModelMatrix_WELLBEING.R

require(OpenMx)

# Prepare Data
# -----------------------------------------------------------------------
# Read in data; -999 is treated as missing (NA)
allData <- read.table("CFA_family_wellbeing.dat", header=TRUE, na.strings="-999")

# Select variables to use in CFA
cfaData <- allData[, c('qol','hap','sat','ad','soma','soc')]

# Compute descriptives of variables
colMeans(cfaData, na.rm=TRUE)
cov(cfaData, use="pairwise.complete.obs")
cor(cfaData, use="pairwise.complete.obs")

# Specify number of variables and factors
nvar <- 6
nfac <- 1

Practical 1a: Single group 1 factor model

OpenMx script OneFactorModelMatrix_WELLBEING.R

# Run single group 1 factor model - cov data input
# -----------------------------------------------------------------------
# Save variable names
observedVars <- names(cfaData)

# Specify factor model
oneFactorModelcov <- mxModel("One Factor",
    # L: factor loadings (nvar x nfac)
    mxMatrix(type="Full", nrow=nvar, ncol=nfac, values=0.2, free=TRUE, name="L"),
    # P: (co)variances of the latent factors (nfac x nfac)
    mxMatrix(type="Symm", nrow=nfac, ncol=nfac, values=1, free=TRUE, name="P"),
    # T: residual variances (nvar x nvar, diagonal)
    mxMatrix(type="Diag", nrow=nvar, ncol=nvar, values=1, free=TRUE, name="T"),
    # Expected covariance matrix
    mxAlgebra(expression=L %*% P %*% t(L) + T, name="expCov"),
    mxData(cov(cfaData, use="pairwise.complete.obs"), type="cov", numObs=1000),
    # mxMLObjective() belongs to the OpenMx version used at this workshop;
    # later OpenMx releases use mxExpectationNormal() with mxFitFunctionML() instead
    mxMLObjective(covariance="expCov", dimnames=observedVars))

# Run factor model
oneFactorFitcov <- mxRun(oneFactorModelcov)

# Demand summary output
summary(oneFactorFitcov)

Practical 1a: Single group 1 factor model

1. Open script OneFactorModelMatrix_WELLBEING.R
2. Identify the factor model by constraining Var(WB)=1
3. Run the 1 factor model
4. Write down the following information:

Model      #obs. stat.   #free par.   Chi2    df   AIC     BIC     RMSEA
1 factor

• Copy files from F:\marleen\Boulder2010\CFA to own directory
• Check whether your own directory is your working directory!

Practical 1a: Single group 1 factor model

Model      #obs. stat.   #free par.   Chi2    df   AIC     BIC     RMSEA
1 factor   21            12           508.6   9    490.6   223.2   0.24

oneFactorModelcov <- mxModel("One Factor",
    mxMatrix(type="Full", nrow=nvar, ncol=nfac, values=0.2, free=TRUE, name="L"),
    mxMatrix(type="Symm", nrow=nfac, ncol=nfac, values=1, free=FALSE, name="P"),
    mxMatrix(type="Diag", nrow=nvar, ncol=nvar, values=1, free=TRUE, name="T"),
    mxAlgebra(expression=L %*% P %*% t(L) + T, name="expCov"),
    mxData(cov(cfaData, use="pairwise.complete.obs"), type="cov", numObs=1000),
    mxMLObjective(covariance="expCov", dimnames=observedVars))

Practical 1b: Single group 2 factor model

OpenMx script TwoFactorModelMatrix_WELLBEING.R

require(OpenMx)

# Prepare Data
# -----------------------------------------------------------------------
# Read in data; -999 is treated as missing (NA)
allData <- read.table("CFA_family_wellbeing.dat", header=TRUE, na.strings="-999")

# Select variables to use in CFA
cfaData <- allData[, c('qol','hap','sat','ad','soma','soc')]

# Compute descriptives of variables
colMeans(cfaData, na.rm=TRUE)
cov(cfaData, use="pairwise.complete.obs")
cor(cfaData, use="pairwise.complete.obs")

# Specify number of variables and factors
nvar <- 6
nfac <- 2

Practical 1b: Single group 2 factor model

OpenMx script TwoFactorModelMatrix_WELLBEING.R

# Run single group 2 factor model - cov data input
# -----------------------------------------------------------------------
# Save variable names
observedVars <- names(cfaData)

# Specify factor model
twoFactorModelCov <- mxModel("Two Factor",
    # L: factor loadings (nvar x nfac)
    mxMatrix(type="Full", nrow=nvar, ncol=nfac,
        values=c(rep(0.3,3),rep(0,6),rep(0.3,3)),
        free=c(rep(TRUE,3),rep(FALSE,6),rep(TRUE,3)),
        name="L"),
    # P: (co)variances of the latent factors (nfac x nfac)
    mxMatrix(type="Symm", nrow=nfac, ncol=nfac,
        values=c(0.9,0.5,0.9), free=c(TRUE,TRUE,TRUE), name="P"),
    # T: residual variances (nvar x nvar, diagonal)
    mxMatrix(type="Diag", nrow=nvar, ncol=nvar, values=1, free=TRUE, name="T"),
    # Expected covariance matrix
    mxAlgebra(expression=L %*% P %*% t(L) + T, name="expCov"),
    mxData(cov(cfaData, use="pairwise.complete.obs"), type="cov", numObs=1000),
    mxMLObjective(covariance="expCov", dimnames=observedVars))

# Run factor model
twoFactorFitCov <- mxRun(twoFactorModelCov)

# Demand summary output
summary(twoFactorFitCov)

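Factor loading matrix “L”

mxMatrix(type="Full", nrow=nvar, ncol=nfac,
    values=c(rep(0.3,3),rep(0,6),rep(0.3,3)),
    free=c(rep(TRUE,3),rep(FALSE,6),rep(TRUE,3)),
    name="L"),

The values and free vectors fill the 6 x 2 matrix column by column: the first six elements form the F1 column, the last six the F2 column.

Factor loading matrix L:

$$
L = \begin{pmatrix}
f_{11} & 0 \\
f_{21} & 0 \\
f_{31} & 0 \\
0 & f_{42} \\
0 & f_{52} \\
0 & f_{62}
\end{pmatrix}
$$

Starting values for elements in this matrix:

$$
\begin{pmatrix}
0.3 & 0 \\
0.3 & 0 \\
0.3 & 0 \\
0 & 0.3 \\
0 & 0.3 \\
0 & 0.3
\end{pmatrix}
$$

Free parameters in this matrix:

$$
\begin{pmatrix}
\mathrm{TRUE} & \mathrm{FALSE} \\
\mathrm{TRUE} & \mathrm{FALSE} \\
\mathrm{TRUE} & \mathrm{FALSE} \\
\mathrm{FALSE} & \mathrm{TRUE} \\
\mathrm{FALSE} & \mathrm{TRUE} \\
\mathrm{FALSE} & \mathrm{TRUE}
\end{pmatrix}
$$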
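The layout of the starting values and free-parameter flags can be previewed with base R's `matrix()`, which fills column by column by default. This sketch assumes that `mxMatrix()` fills a "Full" matrix in the same column-wise order when no `byrow` option is set.

```r
nvar <- 6
nfac <- 2

# Same vectors as in the mxMatrix() call for "L"
start_values <- matrix(c(rep(0.3, 3), rep(0, 6), rep(0.3, 3)),
                       nrow = nvar, ncol = nfac)
free_pattern <- matrix(c(rep(TRUE, 3), rep(FALSE, 6), rep(TRUE, 3)),
                       nrow = nvar, ncol = nfac)

print(start_values)   # column 1: loadings on factor 1, column 2: loadings on factor 2
print(free_pattern)   # TRUE marks a loading that is estimated
```

Previewing the pattern this way is a quick check that each indicator loads on the intended factor before the model is run.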

Covariance matrix latent factors “P”

mxMatrix("Symm", nfac, nfac, values=c(0.9,0.5,0.9), free=c(TRUE,TRUE,TRUE), name="P"),

Covariance matrix P:

$$
P = \begin{pmatrix}
\mathrm{var}(F_1) & \mathrm{cov}(F_1,F_2) \\
\mathrm{cov}(F_2,F_1) & \mathrm{var}(F_2)
\end{pmatrix}
$$

Starting values for elements in this matrix:

$$
\begin{pmatrix}
0.9 & 0.5 \\
0.5 & 0.9
\end{pmatrix}
$$

Free parameters in this matrix:

$$
\begin{pmatrix}
\mathrm{TRUE} & \mathrm{TRUE} \\
\mathrm{TRUE} & \mathrm{TRUE}
\end{pmatrix}
$$

Practical 1b: Single group 2 factor model

1. Open script TwoFactorModelMatrix_WELLBEING.R
2. Identify the factor model by constraining Var(PosWB)=1 and Var(NegWB)=1
3. Run the 2 factor model
4. Write down the following information:

Model      #obs. stat.   #free par.   Chi2    df   AIC     BIC     RMSEA
1 factor   21            12           508.6   9    490.6   223.2   0.24
2 factor

5. How do the models fit? Which model fits best?

Files are on F:\marleen\Boulder2010\CFA

Practical 1b: Single group 2 factor model

Model      #obs. stat.   #free par.   Chi2    df   AIC     BIC     RMSEA
1 factor   21            12           508.6   9    490.6   223.2   0.24
2 factor   21            13           69.9    8    53.9    7.3     0.09

twoFactorModelCov <- mxModel("Two Factor",
    mxMatrix(type="Full", nrow=nvar, ncol=nfac,
        values=c(rep(0.3,3),rep(0,6),rep(0.3,3)),
        free=c(rep(TRUE,3),rep(FALSE,6),rep(TRUE,3)),
        name="L"),
    mxMatrix(type="Symm", nrow=nfac, ncol=nfac,
        values=c(1,0.5,1), free=c(FALSE,TRUE,FALSE), name="P"),
    mxMatrix(type="Diag", nrow=nvar, ncol=nvar, values=1, free=TRUE, name="T"),
    mxAlgebra(expression=L %*% P %*% t(L) + T, name="expCov"),
    mxData(cov(cfaData, use="pairwise.complete.obs"), type="cov", numObs=1000),
    mxMLObjective(covariance="expCov", dimnames=observedVars))
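The fit indices in the table can be recomputed from Chi2, df and the sample size. The formulas in this sketch are assumptions: they are common chi-square-based definitions that happen to reproduce the reported values after rounding, not formulas quoted from the slides or from OpenMx documentation. `fit_indices()` is a hypothetical helper.

```r
# Chi-square based fit indices (assumed definitions, see the text above):
#   AIC   = Chi2 - 2 * df
#   BIC   = 0.5 * (Chi2 - df * log(N))
#   RMSEA = sqrt(max((Chi2 - df) / (df * (N - 1)), 0))
fit_indices <- function(chi2, df, n) {
  c(AIC   = chi2 - 2 * df,
    BIC   = 0.5 * (chi2 - df * log(n)),
    RMSEA = sqrt(max((chi2 - df) / (df * (n - 1)), 0)))
}

one_factor <- fit_indices(chi2 = 508.6, df = 9, n = 1000)
two_factor <- fit_indices(chi2 = 69.9,  df = 8, n = 1000)

print(round(rbind(one_factor, two_factor), 2))
```

On all three indices lower is better, so under these definitions the 2 factor model fits clearly better than the 1 factor model, although an RMSEA around 0.09 is still usually regarded as mediocre fit.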

Outline

• Introduction to factor analysis
• Phenotypic factor analysis
• More advanced models
  – Factor models for categorical data
  – Multigroup factor models and measurement invariance
• From phenotypic to genetic factor analysis…

Factor models for categorical data

• What if my observed data are categorical?
• For example, multiple items of one scale

Threshold models = Latent response variable models

The 2 factor model – continuous data

[Path diagram: F1 and F2 (variances fixed to 1, covariance Cov(F1,F2)) load on the continuous observed variables Y1 to Y3 (f11, f21, f31) and Y4 to Y6 (f42, f52, f62); residuals E1 to E6 enter through paths fixed to 1, with variances Var(E1) to Var(E6).]

The 2 factor model – categorical data

[Path diagram: F1 and F2 (variances fixed to 1, covariance Cov(F1,F2)) load on the continuous latent response variables Y*1 to Y*3 (f11, f21, f31) and Y*4 to Y*6 (f42, f52, f62), which have residuals E1 to E6 with variances Var(E1) to Var(E6). Each latent response variable Y*j is linked, through paths fixed to 1 and a set of thresholds, to the observed categorical variable Yj.]

Latent response variables (continuous)

Observed variables (categorical)

Thresholds
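The threshold idea can be sketched in a few lines of base R: a continuous latent response variable Y* is generated from a factor, and the observed categorical score records how many thresholds Y* exceeds. The loading, thresholds and sample size below are hypothetical.

```r
set.seed(57)
n       <- 1000
F1      <- rnorm(n)                      # latent factor, variance 1
loading <- 0.7                           # hypothetical factor loading

# Latent response variable Y*: continuous, scaled to have variance 1
y_star <- loading * F1 + rnorm(n, sd = sqrt(1 - loading^2))

# Two thresholds cut Y* into three ordered categories: 0, 1 and 2
thresholds <- c(-0.5, 0.8)
y_obs <- findInterval(y_star, thresholds)

print(table(y_obs))
```

Only `y_obs` would be available in real data; the factor model is specified for `y_star`, and the thresholds link the two.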

Multigroup factor model

• Fit factor model in multiple groups

Group 1: Boys
[Path diagram: PosWBb loads on QOLb, HAPb and SATb (f11b, f21b, f31b); NegWBb loads on ADb, SOMAb and SOCb (f42b, f52b, f62b); Var(PosWBb), Var(NegWBb), Cov(PosWBb, NegWBb); residuals E1b to E6b enter through paths fixed to 1, with variances Var(E1b) to Var(E6b).]

Group 2: Girls
[Path diagram: PosWBg loads on QOLg, HAPg and SATg (f11g, f21g, f31g); NegWBg loads on ADg, SOMAg and SOCg (f42g, f52g, f62g); Var(PosWBg), Var(NegWBg), Cov(PosWBg, NegWBg); residuals E1g to E6g enter through paths fixed to 1, with variances Var(E1g) to Var(E6g).]

Group comparisons

• Group comparisons of latent constructs:
  – Means
    For example: IQ differences across ethnic groups
  – Covariance structure
    For example: Covariance differences in negative and positive wellbeing in adolescent boys and girls
• Only meaningful if shown that same constructs are measured in all groups!

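Modeling means and covariances

Means model:

$$
\mathrm{E}[Y_g] = \tau_g + \Lambda_g \kappa_g
$$

Covariance model:

$$
\Sigma_g = \Lambda_g \Psi_g \Lambda_g^t + \Theta_g
$$

Here g indexes the group, $\tau_g$ holds the intercepts, $\Lambda_g$ the factor loadings, $\kappa_g$ the factor means, $\Psi_g$ the factor (co)variances and $\Theta_g$ the residual variances.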

Measurement invariance (MI)

= Absence of measurement bias
= Same measurement model holds in each group

Group differences in observed variables are only caused by group differences in latent factors, and not by other differences in the model, such as differences in factor loadings

Types of MI models

Four models (models 2-4 are nested under 1), ordered from most complex to most parsimonious:

1. Configural Invariance model (most complex)
2. Metric Invariance model
3. Strong Invariance model
4. Strict Invariance model (most parsimonious)

Types of MI models

Four models (models 2-4 are nested under 1)

1. Configural Invariance model
   • Fit same factor model in each group
2. Metric Invariance model
   • Constrain factor loadings equal across groups
3. Strong Invariance model
   • Constrain factor loadings and intercepts equal across groups
4. Strict Invariance model
   • Constrain factor loadings, intercepts and residual variances equal across groups
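The four nested models can be summarized as a growing set of equality constraints across two groups, here boys (b) and girls (g). The symbols are assumptions introduced for this summary: $\Lambda$ for factor loadings, $\tau$ for intercepts and $\Theta$ for residual variances.

```latex
\begin{array}{ll}
\text{Same factor model in each group:}
  & \Lambda_b \text{ and } \Lambda_g \text{ have the same pattern of free and fixed loadings} \\
\text{+ equal factor loadings:}
  & \Lambda_b = \Lambda_g \\
\text{+ equal intercepts:}
  & \Lambda_b = \Lambda_g, \quad \tau_b = \tau_g \\
\text{+ equal residual variances:}
  & \Lambda_b = \Lambda_g, \quad \tau_b = \tau_g, \quad \Theta_b = \Theta_g
\end{array}
```

Each row adds one set of constraints to the row above it, so every model is nested in the previous one and the steps can be compared one at a time.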

Outline

• Introduction to factor analysis
• Phenotypic factor analysis
• More advanced models
• From phenotypic to genetic factor analysis…

Phenotypic versus genetic models

Phenotypic factor model

[Path diagram: F1 (variance fixed to 1) loads on Y1, Y2 and Y3 (f11, f21, f31); residuals E1 to E3 enter through paths fixed to 1, with variances Var(E1) to Var(E3).]

Multivariate genetic models – Cholesky decomposition

[Path diagram: Cholesky decomposition for twin 1, phenotypes 1 to 3, with latent factors A1 to A3, C1 to C3 and E1 to E3 (variances fixed to 1) and path coefficients a11, a21, a22, a31, a32, a33, c11, c21, c22, c31, c32, c33, e11, e21, e22, e31, e32 and e33.]

11.00-12.00 Danielle & Meike

13.00-14.00 Meike & Danielle

Phenotypic versus genetic models

Multivariate genetic models – Common pathway model

Multivariate genetic models – Independent pathway model

14.30-16.45 Hermine & Nick
