CFA in AMOS
ROBERT NICHOLS & NIVEDITA BHAKTHA
QUANTITATIVE RESEARCH, EVALUATION, AND MEASUREMENT
EHE RESEARCH METHODOLOGY CENTER WORKSHOP
2/18/2017
Review of Measurement
Levels of Measurement
• Nominal
• Categorical measure of groups. No information about ranking or relative position.
• Ordinal
• Categorical measure that represents rank or relative position
• Interval
• Continuous measure that represents rank or relative position on a scale with equally spaced intervals
• Ratio
• Continuous measure that has same properties of interval data but with an absolute or “true” 0
Reliability
• Statistical reliability refers to the consistency of measurement
• Does the measure produce similar results under similar conditions?
• Reliability can vary from one sample to the next
• Multiple measures of reliability
• Reliability is necessary but not sufficient for validity
Reliability
• Inter-rater reliability: measures agreement between multiple raters measuring
the same thing
• Test-retest reliability: measures consistency of scores across test administrations
• Internal consistency: measures consistency of scores across items within a test or
questionnaire
• Cronbach’s alpha is the most well-known measure of internal consistency, but it also
performs poorly in many situations
• McDonald’s omega is generally considered a stronger measure of reliability
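Cronbach's alpha, mentioned above, is simple enough to sketch directly. The following is an illustrative pure-Python implementation of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and data layout are our own, not part of the workshop materials:

```python
def cronbach_alpha(items):
    """Cronbach's alpha for internal consistency.

    items: list of equal-length lists, one list per item
    (each inner list holds the respondents' scores on that item).
    """
    k = len(items)            # number of items
    n = len(items[0])         # number of respondents

    def variance(xs):         # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = sum(variance(col) for col in items)
    # Each respondent's total score across all items
    totals = [sum(items[i][j] for i in range(k)) for j in range(n)]
    total_var = variance(totals)
    return (k / (k - 1)) * (1 - item_vars / total_var)
```

As a sanity check, perfectly redundant items (every item column identical) yield an alpha of exactly 1, the theoretical maximum for perfectly consistent items.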
Validity
• The extent to which your measure captures what it claims to measure
• Validity is not a fixed property; it heavily depends upon the context in which the
measure is being used.
• There are many types of validity
• Most types of validity deal with accuracy of the measure in some form
Validity
• Construct validity: the degree to which your measure is supported by theory and empirical evidence
• Convergent and divergent validity
• Content validity: the degree to which your measure matches the content domain it is supposed to measure
• Face validity
• Criterion validity: the degree to which your measure correlates with criteria known to correspond with your construct
• Concurrent validity
• Predictive validity
Reliability & Validity
[Figure illustrating reliability vs. validity; source: https://farab.wikispaces.com/TESTING]
Exploratory Factor Analysis (EFA)
Factor Analysis - Idea
• Factor analysis examines the inter-relationships among a large number of variables and attempts to explain them in terms of their common underlying dimensions, called factors
• The technique can help identify groups of items in a scale that represent a common factor.
Factor Analysis - Use
• Data summarization or reduction
• Finding dimensions
• Part of construct validation
Factor Analysis - Assumptions
• Variables must be correlated
• 10 unrelated variables = 10 factors
• Sample must be homogeneous
• Scale is at least interval
• Sample size
• Large sample sizes are preferred (at least 100)
• Rule of thumb: 10 observations/item
• Number of factors are not known
Factor Analysis - Representation
• Latent variable/Factor – represented by ovals or ellipses
• Variables that cannot be measured directly but are related to several variables that can be measured
• Manifest/Observed variables – represented by squares or rectangles
• Variables that can be measured
• Error – represented by circles
• SPSS does not produce such plots
Factor Analysis - Terms
• Factor loadings: A factor loading represents how much a factor explains a variable; it is the correlation between the factor and the variable. Loadings can range from
-1 to 1. Loadings close to -1 or 1 indicate that the factor strongly affects the variable, while loadings close to zero indicate that the factor has a weak effect on the variable.
• Communality: The variance in a variable accounted for by all the factors is called its communality. It is the sum of the variable's squared factor loadings across all
factors. The communality measures the proportion of variance in the variable explained by all the factors jointly and may be interpreted as the reliability of the
indicator.
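The arithmetic behind communality (and its complement, uniqueness, defined on the next slide) is a one-liner. A minimal sketch, using a made-up set of loadings for one item on two factors:

```python
def communality(loadings):
    """Communality of one variable: sum of its squared loadings across all factors."""
    return sum(l ** 2 for l in loadings)

def uniqueness(loadings):
    """Uniqueness = 1 - communality: variance not explained by the factors."""
    return 1 - communality(loadings)

# Hypothetical loadings of a single item on two factors
item_loadings = [0.7, 0.3]
communality(item_loadings)   # 0.7**2 + 0.3**2 = 0.58
uniqueness(item_loadings)    # 1 - 0.58 = 0.42
```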
Factor Analysis - Terms
• Uniqueness of a variable: Uniqueness is the proportion of a variable's variance that is not associated with the factors. It is equal to 1 - communality.
• Eigenvalues: Eigenvalues measure the amount of variation in the total sample accounted for by
each factor.
• Factor scores: A factor score is a numerical value that indicates an observation's relative standing on a latent factor. To compute the factor score for a given case on a given factor, take the case's standardized score on each variable, multiply it by the variable's loading on that factor, and sum these products.
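The weighted-sum factor score just described translates directly into code. A sketch with hypothetical standardized scores and loadings (the numbers are invented for illustration):

```python
def factor_score(std_scores, loadings):
    """Weighted-sum factor score for one case on one factor.

    std_scores: the case's standardized (z) scores on each variable.
    loadings:   each variable's loading on the factor.
    """
    return sum(z * l for z, l in zip(std_scores, loadings))

# Hypothetical case with z-scores on three items loading on one factor
factor_score([1.2, -0.5, 0.8], [0.8, 0.6, 0.7])
# 1.2*0.8 + (-0.5)*0.6 + 0.8*0.7 = 0.96 - 0.30 + 0.56 = 1.22
```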
Factor Analysis - Extraction
• Commonly used methods of extraction are:
• Principal axis factoring: The principal axis factoring method is implemented by
replacing the main diagonal of the correlation matrix (which consists of all ones) by
the initial estimates of the communalities. The principal component method is then
applied to the revised correlation matrix
• Maximum likelihood: This method assumes the data (the correlations) come from a
population with a multivariate normal distribution, and hence that the residuals of the
correlation coefficients are normally distributed around 0. The loadings are
estimated iteratively by maximum likelihood under these assumptions.
Factor Analysis - Factors
• How many factors to retain?
• Should be both theory and data driven
• Data driven methods:
• Scree plot: A scree plot displays the eigenvalues associated with a component or factor in descending order versus the number of the component or factor. The ideal pattern in a scree plot is a steep curve, followed by a bend and then a flat or horizontal line. Retain those components or factors in the steep curve before the first point that starts the flat line trend. You might have difficulty interpreting a scree plot. Use your knowledge of the data and the results from the other approaches of selecting components or factors to help decide the number of important components or factors.
• Mulaik’s ruler: Lay an imaginary ruler along the flat portion of the scree plot and retain the factors above it
Factor Analysis - Factors
• Kaiser criterion: Retain only factors with eigenvalues greater than 1. In essence this is like saying that, unless a factor extracts at least as much as the equivalent of one original variable, it should not be used.
• Cumulative variance: The cumulative variance explained by the retained factors should be greater than 80%
• Horn’s Parallel analysis: Parallel analysis is a method for determining the number of factors to retain from factor analysis. Essentially, the program works by creating a random dataset with the same numbers of observations and variables as the original data.
• Velicer’s MAP test: This process involves a complete principal components analysis followed by the examination of a series of matrices of partial correlations
(https://people.ok.ubc.ca/brioconn/nfactors/nfactors.html)
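The two simplest data-driven criteria above, Kaiser's eigenvalue-greater-than-1 rule and the cumulative-variance check, can be sketched in a few lines. The eigenvalues below are invented for illustration (in a real analysis they come from the SPSS factor output):

```python
def kaiser_retain(eigenvalues):
    """Kaiser criterion: keep only factors with eigenvalue > 1."""
    return [ev for ev in eigenvalues if ev > 1]

def cumulative_variance(eigenvalues, n_retained):
    """Share of total variance explained by the first n_retained factors."""
    return sum(eigenvalues[:n_retained]) / sum(eigenvalues)

# Hypothetical eigenvalues from a 6-variable analysis, in descending order
evs = [2.9, 1.5, 0.7, 0.4, 0.3, 0.2]
kaiser_retain(evs)           # [2.9, 1.5] -> retain 2 factors
cumulative_variance(evs, 2)  # (2.9 + 1.5) / 6.0, about 73% of total variance
```

Note that Kaiser's rule and the cumulative-variance benchmark can disagree, which is one reason the slides recommend combining theory with several data-driven methods.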
Factor Analysis - Rotation
In order to make the location of the axes fit the actual data points better, the
program can rotate the axes. Ideally, the rotation will make the factors more easily
interpretable.
• Orthogonal Rotation: Uncorrelated factors
• Suggested rotations: Varimax or equimax
• Oblique Rotation: Correlated factors
• Suggested rotation: Promax
EFA Example
• Data : KIMS Data
• Please open KIMS pdf
• The dataset has 39 items; responses are on a 5-point Likert scale (1-5)
EFA Example
• SPSS
• Analyze -> Data Reduction -> Factor
• Select Q1 – Q39 to Variables.
• Click Descriptives. Select Univariate Descriptives, Initial Solution, Coefficients,
Determinant. Click Continue.
• Click Extraction. Select Principal Axis Factoring under Method. Select Unrotated
Factor Solution and Scree Plot under Display. Set Eigenvalues Greater Than to 1
under Extract (default). Click Continue
EFA Example – SPSS
To do orthogonal rotation…
• Click Rotation. Select Varimax under Method. Check Rotated Solution under
Display. Click Continue.
• Click OK.
To do oblique rotation…
• Click Rotation. Select Promax under Method. Check Rotated Solution under
Display. Click Continue.
• Click OK.
EFA Example - Result
Difference between EFA and CFA
• With EFA, the number of factors is usually decided from the results. With CFA, the researcher must specify the number of factors a priori.
• CFA requires that a particular factor structure be specified, in which the researcher indicates which items load on which factor. EFA allows all items to load on all factors.
• CFA provides a fit of the hypothesized factor structure to the observed data.
• CFA allows the researchers to specify correlated measurement errors, constrain loadings or factor correlations to be equal to one another, perform statistical comparisons of alternative models, test second-order factor models, and statistically compare the factor structure of two or more groups.
EFA vs. CFA
• EFA
• Explore patterns in the data
• Rotation and extraction: no unique solutions
• Goal is to identify plausible factor structure
• CFA
• Test specific hypotheses/models
• Model identification: there is a unique solution that can be tested
• The researcher specifies a theoretical factor structure and tests how well that model
fits the data.
Confirmatory Factor Analysis (CFA)
Confirmatory Factor Analysis
• Common factor model – the values of the observed variables are caused by a common
unobservable latent variable
• Through factor analysis, we can test a theoretical model of how we think a set of items
relates to each other by specifying a common cause for the observed responses
• We impose restrictions on the model (such as fixing certain factor loadings to 0, allowing
items to have correlated error variances, etc.)
• We can also test competing models to determine which factor structure best fits the data
Confirmatory Factor Analysis
• Assumptions for standard CFA models
• Linearity – relationships among variables are linear
• Multivariate Normality – observed variables are measured at the interval level and
are normally distributed
• Sufficiently large sample size
• Correct model specification
• Violating these assumptions produces biased or incorrect standard errors and
parameter estimates
Confirmatory Factor Analysis
• General guidelines
• Sample size > 200 or > 10*(number of items)
• Sample sizes below 150 are rarely sufficient
• At least 5 items per factor
• More indicators per factor typically produce more stable results
Model Identification
• A model is identified when a unique set of solutions can be found for the set of
parameter estimates specified in your model
• To be identified, you must have more pieces of information than parameters
being estimated
• The amount of information you have is a function of the number of items
(observed variables) in your covariance matrix
• For p items, this function is p(p + 1)/2
Model Identification
• The parameters being estimated are the
• Factor loadings
• Factor variances/covariances
• Error variances/covariances
• Parameters that are constrained to be equal are only counted once, and
parameters that are fixed to a constant do not count
• Degrees of freedom (df) = known information – number of free parameters
estimated
Model Identification
• Example
• A model using 10 items gives you 10(11)/2 = 55 pieces of information
• A one-factor CFA model with no correlated error variances will have
• 9 factor loadings (because one loading is fixed to 1)
• 1 factor variance
• 10 error variances
• Df = 55 – 20 = 35
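The identification arithmetic on this slide is easy to automate. A minimal sketch (function names are ours) that reproduces the 10-item, one-factor example:

```python
def known_information(p):
    """Unique variances/covariances available from p observed variables: p(p+1)/2."""
    return p * (p + 1) // 2

def cfa_df(p, n_loadings, n_factor_params, n_error_params):
    """Degrees of freedom = known information - free parameters estimated."""
    free_params = n_loadings + n_factor_params + n_error_params
    return known_information(p) - free_params

# One-factor model with 10 items (one loading fixed to 1):
# 9 free loadings + 1 factor variance + 10 error variances = 20 free parameters
cfa_df(10, 9, 1, 10)  # 55 - 20 = 35 -> over-identified (df > 0)
```

A negative result would signal an under-identified model, and 0 a just-identified one, matching the three cases on the next slide.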
Model Identification
• Under-identified: your model requires more parameter estimates than permitted by
the available information.
• This model can’t be estimated. Try a simpler model.
• Just-identified: your model requires exactly as many parameter estimates as
permitted by available information.
• This model is valid, but you will be unable to estimate model fit.
• Over-identified: you have more information than is required by your model.
• This is what we want. It allows room for modifications as well as calculation of model fit
indices.
AMOS
• Add-on program for conducting CFA/SEM using SPSS
• Can be run separately from SPSS or accessed within SPSS through the Analyze >
IBM SPSS AMOS menu
Opening a dataset
• Option 1: Accessing AMOS within SPSS
• Open the dataset in SPSS, then select Analyze > IBM SPSS AMOS
• This will open AMOS and automatically load your current dataset into the program
• Option 2: Accessing AMOS directly
• Open AMOS by running the program “AMOS Graphics” (typically located in
the same folder where SPSS is installed).
• In AMOS, select File > Data Files or click on this button:
• Click on File Name and locate your data file
• Note: AMOS is not as flexible as SPSS in terms of opening different data formats. Your file should
be converted to an SPSS data file (.sav) before loading into AMOS.
The AMOS Interface
• Path Diagram Canvas
• Allows you to draw the graphical
representation of your model
The AMOS Interface
• The rectangle button is used to add
observed variables
• The oval button is used to add
unobserved variables
• The diagram button is used to add
latent variables and observed
indicators
The AMOS Interface
• The single arrow button is used to specify directional (causal) relationships between variables
• The double arrow button is used to specify relationships in which variables covary/correlate with each other
• The unique variable button is used primarily to add residual error terms to the diagram.
The AMOS Interface
• The hand buttons are used to select
or deselect objects in the graph.
• You can either select one object at a
time or select all objects in the graph
• The closed hand will deselect all
objects
The AMOS Interface
• The copy button will duplicate an
object and its properties
• The move button is used to drag and
reposition objects on the graph
• The delete button is used to delete
objects
The AMOS Interface
• These buttons are used to reshape,
rotate, and/or reflect objects in the
graph.
The AMOS Interface
• These buttons are used to reposition
the parameter estimates in your
graph and to align the pathways in
your diagram.
The AMOS Interface
• These buttons are used to select your
data file/files, to select and specify
analysis options, and to calculate
estimates for the model specified in
your path diagram.
The AMOS Interface
• These buttons are used to copy your
path diagram to the clipboard, to
view the text output and results
tables from your estimated model,
and to save your path diagram.
The AMOS Interface
• These buttons are used to view/edit
object properties, to drag object
properties to another object, and to
preserve symmetry in your diagram
(for example, to make sure your
items are equally spaced apart in the
diagram).
Example 1: One-factor CFA Model
Example 2: Three-factor CFA Model
Model Fit
• Absolute fit indices
• Assess overall model fit by comparing your observed covariance matrix to the covariance matrix predicted by your model
• Chi-square test, standardized root mean square residual (SRMR)
• Parsimonious fit indices
• Assess model fit while penalizing for model complexity
• Root mean square error of approximation (RMSEA)
• Relative fit indices
• Assess model fit in comparison to a nested baseline model
• Comparative fit index (CFI), Normed fit index (NFI), Tucker-Lewis index (TLI) AIC, BIC
Model Fit
• Chi-square test
• We would like to see a non-significant result; however, this test is sensitive to large sample sizes (> 500)
• SRMR and RMSEA
• Closer to 0 is better. Good fitting model will have values < .05, although some consider < .08 to be an
acceptable threshold
• CFI, NFI, and TLI
• Closer to 1 is better. Good fitting models will be > .95, > .90 is acceptable
• AIC and BIC
• No absolute criteria. These are useful when comparing nested or competing models. Smaller values
relative to the other models examined indicate better model fit.
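The cutoffs on this slide can be collected into a small checklist helper. This is an illustrative sketch of the guidelines only (fit evaluation should never be reduced to mechanical thresholds), and the function name and example values are invented:

```python
def interpret_fit(chi_p, srmr, rmsea, cfi, tli):
    """Check common model-fit benchmarks (rough guidelines, not strict rules)."""
    return {
        "chi-square non-significant": chi_p > .05,
        "SRMR < .08": srmr < .08,
        "RMSEA < .08": rmsea < .08,
        "CFI > .90": cfi > .90,
        "TLI > .90": tli > .90,
    }

# Hypothetical output from a fitted model
interpret_fit(chi_p=.03, srmr=.045, rmsea=.06, cfi=.96, tli=.95)
# The chi-square check fails (recall it is sensitive to large samples),
# while all the other indices pass their benchmarks.
```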
Modification Indices
• These are used to adjust model specifications
• The program recommends freeing certain paths or parameter estimates that result in improved model fit
• Caution: don’t blindly make every change recommended by AMOS. You need to be able to justify your modifications and to make sure that they make sense for your theoretical model.
• Don’t create a more complicated model just to chase model fit benchmarks!
• In many situations, mod indices are picking up on idiosyncrasies in your sample. Making the suggested modifications without justification damages the generalizability/validity of your model.
Recommended Resource
• Brown, T. A. (2014). Confirmatory factor analysis for applied research. Guilford
Publications.