CHAPTER 16:

Confirmatory Factor Analysis through the Amos Program 

I.         The Notion of Confirmatory Factor Analysis

confirmatory factor analysis: “factor analysis conducted to test hypotheses (or confirm theories) about the factors one expects to find. It is a subtype of structural equation modeling. . .” (Vogt, 2005, p. 56).

structural equation: “an equation representing the strength and nature of the hypothesized relations among (the ‘structure’ of) sets of variables in a theory” (Vogt, 2005, p. 313).

structural equation modeling: “models made up of more than one structural equation; models that describe causal relations among latent variables and include coefficients for endogenous variables” (Vogt, 2005, p. 313).

A.  Purposes

·  testing whether a set of measures continues to exhibit the same factor structure as hypothesized.

·  its role in constructing a causal model

·  comparing alternative factor solutions from the data.

·  comparing alternative factor solutions from different individuals

 

·  factor matching

factor matching: a use of factor analysis exploring ways to transform two sets of factors into structures that most strongly resemble each other.

·  comparing alternative solutions with different variables from the same individuals

B.  Types of Confirmatory Factor Analysis

--traditional confirmatory factor analysis uses a combination of hypotheses about factors and general factor analysis methods. Rather than employing principal components solutions, traditional confirmatory factor analysis relies on principal axis factoring.

--structural equation modeling is the most popular method for completing confirmatory factor analyses, and it is the method described for the rest of this chapter. In this approach, confirmatory factor analysis is the “measurement model” for a system in which the factors are identified as latent variables in a larger predictive model.

 

C.  Data Characteristics and Assumptions of Confirmatory Factor Analysis

1.   to make predictions of factors, researchers identify parameters for which some values are known and for which others are to be estimated.

 

·  fixed parameters

·  constrained parameters

·  free parameters

fixed parameters: parameters that have been assigned given values

constrained parameters: parameters that are unknown, but equal to one or more other parameters

free parameters: parameters that are unknown and not constrained to be equal to any other parameter
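The three kinds of parameters can be sketched in a few lines of Python. This is only an illustration: the parameter names, the dictionary layout, and the `count_estimated` helper are hypothetical and are not part of Amos. The sketch assumes that each group of equality-constrained parameters contributes one value to be estimated.

```python
# Hypothetical specification of a small measurement model (names are invented).
# "fixed" parameters carry a given value, "constrained" parameters share a group
# label with the parameters they must equal, and "free" parameters stand alone.
params = {
    "loading_x1": {"kind": "fixed", "value": 1.0},
    "loading_x2": {"kind": "free"},
    "loading_x3": {"kind": "free"},
    "error_var_x1": {"kind": "constrained", "group": "equal_errors"},
    "error_var_x2": {"kind": "constrained", "group": "equal_errors"},
    "error_var_x3": {"kind": "free"},
}


def count_estimated(parameters):
    """Count the distinct values that must be estimated.

    Each free parameter counts once, each group of equality-constrained
    parameters counts once, and fixed parameters count zero.
    """
    free = sum(1 for p in parameters.values() if p["kind"] == "free")
    groups = {p["group"] for p in parameters.values() if p["kind"] == "constrained"}
    return free + len(groups)


print(count_estimated(params))
```

Here the two constrained error variances are estimated as a single shared value, so the count is smaller than the number of unknown parameters listed.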

2.   standard assumptions:

·  A major assumption tested by the modeling process is that there really are unobserved common dimensions that may be enlisted to account for the correlations among observed variables.

·  that the data are measured on the interval or ratio level

·  that both common and unique factors have means of zero;

·  that variances of common factors are equal to 1;

·  that common factors are uncorrelated with each other

3.   assumptions derived from structural equation modeling:

 

that residuals are normally distributed; show the same pattern of relationship through the entire range of the variables (homoscedasticity); are independent of each other; and are independent of the exogenous variables.

4.      assumptions consistent with multiple regression correlation: that there is a linear relationship between observed and predicted values of the dependent variable (this assumption also means that the residuals have a mean of zero).

5.   assumptions for using maximum likelihood estimation or generalized least squares methods:

residuals: differences between the observed and predicted values of the dependent variable

 

a.   independence of observations

independence of observations: the assumption that participants or events are chosen independently from the population

b.   multivariate normal distribution

--if researchers know that this assumption is not tenable, standard chi square statistics related to model fit should probably be replaced by interpreting the Satorra-Bentler scaled chi square test, which adjusts the chi square value for non-normal distributions.

multivariate normal distribution: in multivariate analyses, the assumption that extends the logic of the standard normal curve to multiple variables

c.   that the covariance matrix is positive definite

positive definite: a characteristic of matrices in which all eigenvalues are above zero. Furthermore, a positive definite matrix is a symmetric matrix $A$ where the quadratic form $x^{H} A x$ is greater than zero for all nonzero vectors $x$.
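One way to check this property in practice is to attempt a Cholesky factorization, which succeeds only for positive definite matrices. The following is a minimal sketch in plain Python, assuming a real symmetric matrix stored as a list of lists; the function name and tolerance are illustrative choices, not anything Amos exposes.

```python
import math


def is_positive_definite(matrix, tol=1e-12):
    """Return True if a real symmetric matrix (list of lists) is positive definite.

    The check attempts a Cholesky factorization; a non-positive pivot means
    the matrix is not positive definite.
    """
    n = len(matrix)
    # A positive definite covariance matrix must first be symmetric.
    for i in range(n):
        for j in range(i):
            if abs(matrix[i][j] - matrix[j][i]) > 1e-9:
                return False
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= tol:
                    return False
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


# A hypothetical "correlation" matrix whose entries are mutually inconsistent.
inconsistent = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.9], [0.1, 0.9, 1.0]]
print(is_positive_definite([[1.0, 0.5], [0.5, 1.0]]))
print(is_positive_definite(inconsistent))
```

The second matrix looks like a correlation matrix entry by entry, yet it fails the check because no set of real variables could produce those three correlations together.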

--Researchers also can, and often do, explore models that do not meet the assumptions of uncorrelated error terms and uncorrelated factors. Indeed, good cases can be made for models that do not make these initial assumptions. Thus, modifications sometimes are made in the model to permit examining factor structures that violate these initial assumptions.

 

II.  Using the AMOS Program for Confirmatory Factor Analysis

--“Amos” is an abbreviation of “Analysis of MOment Structures.”

--this section of the chapter covers the nomenclature of Amos, explains the operation of the program, and shows how to interpret the results of an actual analysis

A.  The Initial Language of Amos

 

--measures (observed variables) converge on underlying constructs (unobserved variables) in a measurement model.

observed variable (also called a manifest variable): a variable that is measured on a given item or scale.

unobserved variable (also called a latent variable or a factor): a variable that is not observed directly. This factor is inferred from the observed variables that serve as its indicators.

model: a predicted pattern of relationships among the types of variables. In a broad sense, a model is “A simple description of a probabilistic process that may have given rise to observable data” (Upton & Cook, 2002, p. 234). In this context the description exhibits relationships among observed (manifest) variables and underlying factors (unobserved or latent variables), as well as error terms.

parameters: in confirmatory factor analysis and math modeling, coefficients expressing relationships among elements of the model. These parameters may be fixed, constrained, or free.

B.  Beginning Amos and Entering Data

C.  Operation of the Program

1.   Constructing the Diagram

 

 

 

2.   Examining Model Parameters

--for confirmatory factor analysis, there are three major choices of method for minimizing discrepancies in the model’s estimates: some variation of ordinary least squares (which is the basis of so many correlation tools), generalized least squares, and maximum likelihood estimation (which is the default for estimating model parameters and has the advantages of consistency, efficiency, and normality as sample size increases).

--the analysis is controlled by the output requested, including:

·  Minimization history: summarizing steps in the minimization of the discrepancy function using convergence criteria;

·  Sample moments: reporting the covariance matrix for the sample;

·  Implied moments: reporting the covariance matrix for the observed variables that is implied by the model;

·  Residual moments: reporting differences between the sample and implied covariance matrices;

·  Modification indices: reporting modification indices that conservatively estimate what effect on discrepancy would occur if the constraints on each parameter were removed. By default, the minimum threshold for reporting details of these analyses is a change in the maximum likelihood ratio of at least 4 (approximately the minimum chi-square value required to produce statistical significance at alpha risk of .05);

·  Factor score weights: reporting the regression weights for predicting unobserved variables from observed variables;

·  Covariances of estimates: reporting a matrix of the covariances for all parameter estimates;

·  Correlations of estimates: reporting a matrix of the correlation coefficients for all parameter estimates;

·  Critical ratios for differences: reporting a test of the null hypothesis that the values for pairs of population parameters are equal;

·  Tests of normality and outliers: reporting statistics to examine multivariate normality (including univariate assessments of skew and kurtosis as well as Mardia’s coefficient of multivariate kurtosis). Following a statistically significant coefficient of multivariate kurtosis, a check for outliers is completed (based on use of Mahalanobis’ distance).
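The sample, implied, and residual moments in the list above can be illustrated with a small sketch. It assumes a one-factor model with uncorrelated error terms and a factor variance fixed at 1, so that the implied covariance between two measures is the product of their loadings. The loadings, error variances, and sample matrix are hypothetical values chosen for illustration, not Amos output.

```python
def implied_covariance(loadings, error_variances, factor_variance=1.0):
    """Model-implied covariance matrix for a one-factor model.

    Off-diagonal entries are loading_i * loading_j * factor_variance;
    diagonal entries add the error variance of each measure.
    """
    p = len(loadings)
    implied = [
        [loadings[i] * loadings[j] * factor_variance for j in range(p)]
        for i in range(p)
    ]
    for i in range(p):
        implied[i][i] += error_variances[i]
    return implied


def residual_moments(sample, implied):
    """Element-by-element differences between sample and implied matrices."""
    return [
        [s - m for s, m in zip(sample_row, implied_row)]
        for sample_row, implied_row in zip(sample, implied)
    ]


# Hypothetical estimates and a hypothetical sample covariance matrix.
loadings = [0.8, 0.7, 0.6]
error_variances = [0.36, 0.51, 0.64]
sample = [[1.00, 0.60, 0.45], [0.60, 1.00, 0.40], [0.45, 0.40, 1.00]]

implied = implied_covariance(loadings, error_variances)
for row in residual_moments(sample, implied):
    print([round(value, 3) for value in row])
```

Residuals near zero indicate that the model reproduces the sample covariances closely; large residuals point to the pairs of measures the model accounts for poorly.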

 

maximum likelihood estimation: estimation of parameters by a process in which the aim is “to choose as estimator of a [population] parameter θ that a function based of the sample of observations which will, when substituted for θ, make the probability of the sample a maximum. In other words, for this value of θ the observed sample is also the most likely sample” (Keeping, 1995, p. 123).

--selecting standardized estimates produces the diagrammatic version of the model showing standardized regression coefficients (beta weights) for all paths.

--forms of variables

 

·  exogenous variables

exogenous variables: variables that are not predicted by other influences.

·  endogenous variables

endogenous variables: variables that are predicted from others

--a report of the condition number for the matrix

 

 

condition number: in confirmatory factor analyses, a number reporting the condition of the matrix. This number is created by dividing the largest eigenvalue by the smallest eigenvalue. A determinant close to zero (which goes with a very large condition number) suggests that there is a linear dependency among variables and that standard errors may be inflated.
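The ratio of largest to smallest eigenvalue can be worked by hand for two variables, where the eigenvalues of a symmetric matrix have a closed form. The sketch below is limited to that 2 x 2 case, and the function names are illustrative. Larger matrices would need a numerical eigenvalue routine.

```python
import math


def eigenvalues_2x2(a, b, c):
    """Eigenvalues (largest, smallest) of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    return mean + radius, mean - radius


def condition_number_2x2(a, b, c):
    """Largest eigenvalue divided by smallest; infinite if the matrix is singular."""
    largest, smallest = eigenvalues_2x2(a, b, c)
    if smallest <= 0:
        return math.inf
    return largest / smallest


# Two standardized variables correlated .50 versus two correlated .99.
print(condition_number_2x2(1.0, 0.50, 1.0))
print(condition_number_2x2(1.0, 0.99, 1.0))
```

As the correlation between the two variables approaches 1, the smallest eigenvalue and the determinant both approach zero and the condition number grows without bound, which is the linear dependency the definition warns about.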

3.   Examining Measurement Model Fit Indices

--Though the likelihood ratio chi square test is heavily used in assessing confirmatory factor analysis results, it tends to reject otherwise tenable models when large sample sizes (over 200) are involved.

a.   Chi Square-Based Measures of Discrepancy

 

1.)  CMIN: The minimum discrepancy

CMIN: use of the maximum likelihood estimation chi-square test to assess the fit of a model in confirmatory factor analysis and modelling

2.)  CMIN/DF

--an attempt to adjust for model complexity.

--For a tenable model, the value should be close to 1. Conversely, “a ratio greater than 2.00 represents an inadequate fit” (Byrne, 1989, p. 55).

CMIN/DF: a test to assess the fit of a model in confirmatory factor analysis and modeling in which the minimum discrepancy is divided by its degrees of freedom

b.   Baseline Model Comparisons:

--These measures attempt to contrast some baseline model (not always a null hypothesis model) with another measurement model.

1.)  NFI

--Values above .8 or .9 are recommended for claims of model fit and 1.0 indicates a perfect fit of the model to the data. Though it tends to compensate for chi square’s upward bias with large sample sizes, it may be biased against models based on small sample sizes.

NFI: the normed fit index (also called the Bentler-Bonett normed fit index); a test to assess the fit of a model in confirmatory factor analysis and modeling in which the model’s minimum discrepancy is divided by the minimum discrepancy of the baseline model, and then this value is subtracted from 1.

2.)  RFI

--permits scores to vary beyond the range of zero and 1

--Coefficients close to 1 are considered desirable

RFI: the relative fit index (rho1): a test to assess the fit of a model in confirmatory factor analysis and modeling in which the NFI is adjusted by dividing discrepancy values by the degrees of freedom for hypothesized and baseline models. Then, this value is subtracted from 1.

3.)  IFI

--permits a range above 1.0, though acceptable fit is judged to be close to 1.0 and above .90

IFI: the incremental fit index (Delta2): a test to assess the fit of a model in confirmatory factor analysis and modeling computed as $\mathrm{IFI} = (\hat{C}_b - \hat{C}) / (\hat{C}_b - d)$, where $\hat{C}_b$ is the minimum discrepancy of the baseline model, $\hat{C}$ is the minimum discrepancy for the hypothesized model (CMIN), and $d$ is the degrees of freedom for the hypothesized model.

4.)  TLI

--permits values below zero and above 1, though acceptable fit is judged to be close to 1.0

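TLI: the Tucker-Lewis Index or rho2: a test to assess the fit of a model in confirmatory factor analysis and modeling in which the NFI is adjusted for model complexity and computed as $\mathrm{TLI} = (\hat{C}_b / d_b - \hat{C} / d) / (\hat{C}_b / d_b - 1)$, where $\hat{C}_b$ and $d_b$ are the minimum discrepancy and degrees of freedom of the baseline model, and $\hat{C}$ and $d$ are the minimum discrepancy (CMIN) and degrees of freedom of the hypothesized model.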

5.)  CFI

--The Bentler (1990) comparative fit index is similar to the NFI, except that it corrects for the NFI's failure to control for sample size

--For sound models, the CFI should be above .90.

CFI:  the comparative fit index: a test to assess the fit of a model in confirmatory factor analysis and modeling that indicates the percent to which the data covariance can be reproduced by the hypothesized model by contrasting the covariance matrix of the hypothesized model against an independence model where latent variables are assumed to be uncorrelated.
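The baseline comparison indices defined above reduce to simple arithmetic on four numbers: the minimum discrepancy and degrees of freedom of the hypothesized model and of the baseline model. The sketch below follows the verbal definitions of the NFI and RFI and the standard published formulas for the IFI and TLI. The CFI expression is an assumption: it uses Bentler's noncentrality-based form, which this chapter describes only in words. All input values are hypothetical.

```python
def baseline_fit_indices(cmin, df, cmin_baseline, df_baseline):
    """Compute NFI, RFI, IFI, TLI, and CFI from model and baseline chi-squares."""
    nfi = 1.0 - cmin / cmin_baseline
    rfi = 1.0 - (cmin / df) / (cmin_baseline / df_baseline)
    ifi = (cmin_baseline - cmin) / (cmin_baseline - df)
    tli = (cmin_baseline / df_baseline - cmin / df) / (cmin_baseline / df_baseline - 1.0)
    # Assumed CFI form: 1 minus the ratio of noncentrality estimates,
    # truncated so that the result stays between 0 and 1.
    numerator = max(cmin - df, 0.0)
    denominator = max(cmin_baseline - df_baseline, cmin - df, 0.0)
    cfi = 1.0 if denominator == 0 else 1.0 - numerator / denominator
    return {"NFI": nfi, "RFI": rfi, "IFI": ifi, "TLI": tli, "CFI": cfi}


# Hypothetical results: hypothesized model CMIN = 38 on 19 df,
# baseline model CMIN = 400 on 28 df.
for name, value in baseline_fit_indices(38.0, 19, 400.0, 28).items():
    print(name, round(value, 3))
```

With these made-up inputs the NFI is just above .90 while the RFI falls below it, which shows how indices that adjust for degrees of freedom can rank the same model less favorably.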

c.   Parsimony Adjusted Fit Measures:

--This family of measures attempts to compensate for the complexity of models. These measures reduce the overall size of the measures of fit by a constant known as the “parsimony ratio.”

--Though there are no hard and fast rules for interpreting these coefficients, the closer they are to 1.0, the stronger (and more parsimonious) the model fit is claimed to be.

 

PRATIO: the parsimony ratio: in assessing the fit of a model in confirmatory factor analysis and modeling, a ratio multiplied by the NFI or CFI (then called the PNFI [parsimony normed fit index] and PCFI [parsimony comparative fit index] respectively) to take into account the complexity of the models, computed as $\mathrm{PRATIO} = d / d_0$, in which the degrees of freedom for the hypothesized model $d$ are divided by the degrees of freedom for the independence model $d_0$.
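As a short worked sketch of the parsimony adjustment, the functions below compute the parsimony ratio and apply it to a fit index. The function names and the input values (an NFI of .905, 19 degrees of freedom for the hypothesized model, 28 for the independence model) are hypothetical.

```python
def parsimony_ratio(df_model, df_independence):
    """Degrees of freedom of the hypothesized model over those of the independence model."""
    return df_model / df_independence


def parsimony_adjusted(index_value, df_model, df_independence):
    """Multiply a fit index (for example the NFI or CFI) by the parsimony ratio."""
    return parsimony_ratio(df_model, df_independence) * index_value


# Hypothetical values: NFI = .905, model df = 19, independence model df = 28.
print(round(parsimony_ratio(19, 28), 3))
print(round(parsimony_adjusted(0.905, 19, 28), 3))
```

Because the ratio is below 1 whenever the hypothesized model estimates any parameters beyond the independence model, the adjusted index is always smaller than the unadjusted one, and more so for more complex models.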

d.   RMSEA Measures

--To accept a model, a general rule of thumb is that the RMSEA should be below .05 or .06.

RMSEA: Root Mean Square Error of Approximation: in assessing the fit of a model in confirmatory factor analysis and modeling, an approach taking the square root of the estimated population discrepancy $\hat{F}_0$ divided by the number of degrees of freedom for testing the model: $\mathrm{RMSEA} = \sqrt{\hat{F}_0 / d}$, where $\hat{F}_0 = \max\left((\hat{C} - d)/n,\ 0\right)$, $\hat{C}$ is the minimum discrepancy value CMIN, $d$ is the degrees of freedom for the hypothesized model, and $n$ is the sample size minus the number of groups (N − 1 for a single sample). This approach “can be interpreted as a root mean square standardized measure of badness of fit of a particular model . . .” (Steiger, 1998, p. 413).

--PCLOSE test of statistical significance for RMSEA: a statistically significant difference leads the researcher to conclude that the theoretic model is significantly different from the actual relationships among variables

PCLOSE: in assessing the fit of a model in confirmatory factor analysis and modeling, an approach transforming the RMSEA into a test of statistical significance.
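The RMSEA point estimate can be sketched with the standard library alone. The sketch assumes the usual formulation in which the estimated population discrepancy is (CMIN − df) divided by the sample size minus the number of groups, truncated at zero. The function name and inputs are hypothetical. PCLOSE is not computed here because it requires the noncentral chi-square distribution.

```python
import math


def rmsea(cmin, df, sample_size, groups=1):
    """Root mean square error of approximation (point estimate only)."""
    n = sample_size - groups  # assumed: sample size minus number of groups
    f0 = max((cmin - df) / n, 0.0)  # estimated population discrepancy
    return math.sqrt(f0 / df)


# Hypothetical model: CMIN = 38 on 19 degrees of freedom, 300 participants.
print(round(rmsea(38.0, 19, 300), 4))
```

When CMIN is smaller than its degrees of freedom the truncation sets the estimate to zero, so the RMSEA cannot be negative.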

D.  Measurement Adequacy and Considering Modification

--Following a successful confirmatory factor analysis, the researcher:

·  may form a common index;

·  may apply a standard reliability tool such as Cronbach’s coefficient alpha (measures of model fit may show acceptable fit simply when paths are quite small--hence, even though a measurement model may show acceptable fit by the various tests used, it may be revealed to have low reliability when Cronbach's coefficient alpha is applied to it);

·  continue with cross validation work
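The reliability check mentioned in the list above can be sketched as follows. The function applies the usual formula for Cronbach's coefficient alpha, which compares the sum of the item variances with the variance of the total score. The item scores are hypothetical, and the layout (one list of scores per item, respondents in the same order) is an assumption of this sketch.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's coefficient alpha.

    `items` holds one list of scores per item; position i in every list
    belongs to the same respondent.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1.0 - item_variance_sum / variance(totals))


# Hypothetical scores of five respondents on three items of one factor.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 5],
    [4, 2, 5, 3, 4],
]
print(round(cronbach_alpha(items), 3))
```

A set of items with weak intercorrelations yields a low alpha even if the measurement model fits, which is the situation the warning in the list describes.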

--Following an unsuccessful confirmatory factor analysis, the researcher:

·  should reconsider the conceptual foundation of the confirmatory factor analysis model

·  carefully consider modifications;

--A warning: since researchers may follow modifications with statistical significance tests of model fit, they actually increase experimentwise alpha risk and their chances of finding an acceptable model may be reduced.

 

1.      Examining Models that Do Not Meet All Initial Assumptions

--models with correlated error terms might be explored (the presence of additional variables creating nonrandom influences may be revealed by correlated error terms)

-- Sometimes researchers wish to permit correlations among common factors, especially if there is a theoretic basis behind the choice.

--The presence of two highly correlated factors may indicate that there really is only one underlying factor. To examine the matter empirically, the researcher may compare the fit of the originally hypothesized factor structure model with the fit of a measurement model in which the correlations among the factors are constrained to be equal to 1.00. “If the constrained model is not  significantly worse than the unconstrained one, the researcher concludes that a one-factor model would fit the data as well as a multi-factor one and, on the principle of parsimony, the one-factor model is to be preferred” (Garson, 2004, ¶ 16).
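The comparison of constrained and unconstrained models can be sketched as a chi-square difference test. The sketch assumes a two-factor model, so that constraining the single factor correlation to 1.00 changes the degrees of freedom by exactly one. In that case the chi-square p value can be obtained from the complementary error function in the standard library. The CMIN values and function names are hypothetical.

```python
import math


def chi_square_p_value_1df(chi_square):
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


def compare_nested(cmin_constrained, cmin_unconstrained, alpha=0.05):
    """Chi-square difference test for two nested models differing by 1 df.

    Returns the difference, its p value, and whether the constrained
    model fits significantly worse at the chosen alpha.
    """
    difference = max(cmin_constrained - cmin_unconstrained, 0.0)
    p_value = chi_square_p_value_1df(difference)
    return difference, p_value, p_value < alpha


# Hypothetical CMIN values: 45.0 with the correlation fixed at 1.00,
# 38.0 with the correlation freely estimated.
difference, p_value, worse = compare_nested(45.0, 38.0)
print(difference, round(p_value, 4), worse)
```

If the constrained model is significantly worse, as in this made-up case, the two factors are retained as distinct; otherwise the one-factor model is preferred on grounds of parsimony.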

 

2.      Exploring Alternative Paths

--In Amos, the modification indices section of the output searches only for paths that may be added to the model to improve its fit.

--Guidelines for modifying paths include the following:

·  Construct a new model that includes a parameter path with the largest modification index. Then, the fit of the new model to the data is checked by use of tests of fit, such as those using the chi-square distribution. If the new model improves fit, it may be retained.

·  Construct a new model that includes all paths with parameters that had modification index scores over 100. Then, following similar advice as previously described, the adequacy of the new model would be examined by specific tests of fit.

Though such modifications may seem obvious repairs, they come at a price. Unless there is some conceptual or theoretic reason to add paths or correlated error terms, new models may not survive when efforts are made to replicate them.