June 8, 2023

# What is confirmatory factor analysis?

Developing accurate measurement instruments to assess complex concepts is essential in research and statistical analysis. Confirmatory factor analysis (CFA) is a statistical method used by researchers to assess the measurement model of a construct. This article explores the basics of CFA, its purpose, process, and key concepts involved in this method.

## Understanding factor analysis

Factor analysis is a statistical method used to identify the latent constructs that underlie observed variables. It allows researchers to examine the relationships among measured variables and reduce them to a smaller number of underlying factors that can account for the correlations among the observed variables. This method is widely used in fields such as psychology, sociology, and marketing research.

Factor analysis is based on the assumption that there are underlying factors that influence the observed variables. For example, in a study of personality traits, there may be underlying factors such as extraversion, agreeableness, and conscientiousness that influence the observed variables such as sociability, empathy, and organization.
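A small simulation can make this assumption concrete. The sketch below is illustrative only: the loadings (0.8, 0.7, 0.6), the sample size, the indicator names, and the function names are all assumptions chosen for the example, not values from a real study. It generates three observed scores from a single latent factor plus independent noise. The observed variables end up correlated even though the only thing they share is the factor.

```python
import math
import random


def simulate_one_factor(loadings, n, seed=0):
    """Draw n cases of x_i = loading_i * factor + noise_i (all standardized)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        factor = rng.gauss(0.0, 1.0)  # unobserved latent score
        row = []
        for lam in loadings:
            noise_sd = math.sqrt(1.0 - lam ** 2)  # keeps each variable's variance near 1
            row.append(lam * factor + rng.gauss(0.0, noise_sd))
        rows.append(row)
    return rows


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical indicators of "extraversion": sociability, talkativeness, assertiveness.
data = simulate_one_factor([0.8, 0.7, 0.6], n=5000)
columns = list(zip(*data))
r12 = pearson(columns[0], columns[1])  # should land near 0.8 * 0.7 = 0.56
r13 = pearson(columns[0], columns[2])  # should land near 0.8 * 0.6 = 0.48
print(round(r12, 2), round(r13, 2))
```

Factor analysis works in the opposite direction: it starts from correlations like these and asks what factor structure could have produced them.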

Two types of factor analysis are exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Both types of factor analysis are used to identify the underlying structure of variables, but they differ in their approach and purpose.

### The basics of factor analysis

EFA is a data-driven approach used to determine the underlying structure of a set of variables. It reduces a large number of variables to a smaller number of factors and identifies the loading of each variable on those factors. Unlike in CFA, these factors are not defined in advance; they are derived from patterns in the data. EFA is useful in exploratory research where the underlying structure of the variables is not well understood.

For example, in a study of customer satisfaction, EFA can be used to identify the underlying factors that influence customer satisfaction, such as product quality, customer service, and price. By identifying these factors, the company can focus on improving the areas that are most important to customers and increase overall customer satisfaction.

### Exploratory factor analysis vs. confirmatory factor analysis

While EFA can be used to identify the underlying structure of variables, it is not designed for testing a specific hypothesized structure, which is where CFA comes into play. In CFA, researchers test a pre-determined theoretical model rather than allowing the data to determine the underlying structure. CFA is useful in confirmatory research where theory or earlier studies suggest a structure and the researcher wants to test it as a specific hypothesis.
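One way to see the difference is to count degrees of freedom. The sketch below assumes six observed variables and two factors, with factor variances fixed to 1 in the CFA; the function names are made up for this illustration. In the CFA, each variable is allowed to load on only one factor, and those restrictions are what make the model testable against the data.

```python
def unique_moments(p):
    """Number of distinct variances and covariances among p observed variables."""
    return p * (p + 1) // 2


def efa_df(p, m):
    """Degrees of freedom of an unrestricted m-factor EFA on p variables."""
    return ((p - m) ** 2 - (p + m)) // 2


def cfa_df(p, free_loadings, free_error_variances, free_factor_covariances):
    """Degrees of freedom of a CFA whose factor variances are fixed to 1."""
    free = free_loadings + free_error_variances + free_factor_covariances
    return unique_moments(p) - free


p, m = 6, 2
print(unique_moments(p))   # 21 variances and covariances to reproduce
print(efa_df(p, m))        # 4: every variable may load on every factor
print(cfa_df(p, 6, 6, 1))  # 8: cross-loadings are fixed to zero
```

The extra degrees of freedom in the CFA correspond to the cross-loadings the researcher has committed to being zero, so a poor fit counts as evidence against the hypothesized structure.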

For example, in a study of intelligence, CFA can be used to test a specific theoretical model of intelligence, such as the Cattell-Horn-Carroll (CHC) theory, which posits that intelligence is composed of multiple factors such as fluid intelligence, crystallized intelligence, and visual-spatial processing. By testing this model, researchers can determine whether the data are consistent with the theory and draw conclusions about the nature of intelligence.

## The purpose of confirmatory factor analysis

The primary purpose of CFA is to validate measurement models by testing the relationships among latent variables and observed variables. Testing a pre-determined model allows researchers to make stronger claims about the reliability and validity of a measurement instrument. However, there are many other reasons why CFA is a valuable tool in research.

### Hypothesis testing in research

Hypothesis testing is an essential aspect of research, and CFA is a powerful tool for hypothesis testing. CFA enables researchers to test the goodness of fit of a theoretical model and to determine whether the model accurately represents the underlying structure of a construct. This is particularly useful when researchers are trying to determine the relationship between multiple variables.

For example, imagine a researcher is interested in examining the relationship between a person's level of education and their income. The researcher could use CFA to test a model that includes both education and income as latent variables, and several observed variables that measure each construct. By doing so, the researcher could determine whether there is a significant relationship between education and income, and whether the model accurately represents the underlying structure of these constructs.

### Validating measurement models

CFA also provides a way to validate measurement models by examining the relationships among latent variables and observed variables. Researchers can assess the internal consistency, the construct validity, and the convergent and discriminant validity of a measurement model using CFA. This is particularly useful when researchers are developing new measurement instruments.

For example, imagine a researcher is developing a new questionnaire to measure anxiety. The researcher could use CFA to test the validity of the questionnaire by examining the relationships among the observed variables (e.g., questions on the questionnaire) and the underlying construct of anxiety. By doing so, the researcher could determine whether the questionnaire accurately measures anxiety, and whether any changes need to be made to improve the instrument.
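Two summary statistics that are often computed from CFA output for this purpose are composite reliability (CR) and average variance extracted (AVE). The sketch below uses hypothetical standardized loadings for four anxiety items and a hypothetical correlation of 0.5 with a second factor; it also assumes the error terms are uncorrelated. The final comparison of AVE with the squared factor correlation is one common discriminant-validity check (the Fornell-Larcker criterion), not the only one.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized loadings, assuming uncorrelated errors."""
    true_part = sum(loadings) ** 2
    error_part = sum(1.0 - lam ** 2 for lam in loadings)
    return true_part / (true_part + error_part)


def average_variance_extracted(loadings):
    """Mean share of item variance explained by the factor."""
    return sum(lam ** 2 for lam in loadings) / len(loadings)


anxiety_loadings = [0.8, 0.7, 0.6, 0.7]  # hypothetical standardized loadings
factor_correlation = 0.5                 # hypothetical correlation with another factor

cr = composite_reliability(anxiety_loadings)
ave = average_variance_extracted(anxiety_loadings)
discriminant_ok = ave > factor_correlation ** 2

print(round(cr, 3))     # 0.795
print(round(ave, 3))    # 0.495
print(discriminant_ok)  # True
```

In this made-up case the items hang together reasonably well, while an AVE just under 0.5 means slightly less than half of the average item's variance is attributable to the factor.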

### Assessing construct validity

CFA allows researchers to assess construct validity, which is the degree to which a measurement instrument measures what it is intended to measure. Construct validity can be estimated by examining the extent to which the model fits the data and the strength of the relationships among the observed variables and the underlying constructs. This is particularly useful when researchers are comparing different measurement instruments.

For example, imagine a researcher is interested in comparing two different questionnaires that measure depression. The researcher could use CFA to test the validity of both questionnaires by examining the relationships among the observed variables and the underlying construct of depression. By doing so, the researcher could determine which questionnaire is more valid and reliable, and which one should be used in future research.

## The process of confirmatory factor analysis

Confirmatory factor analysis (CFA) is a statistical technique used to test the validity of a theoretical construct by examining the relationships among a set of observed variables. Here is a more detailed breakdown of the four main steps involved in the CFA process:

### Defining the measurement model

The first step in CFA is defining the measurement model. This involves specifying the relationships among the observed variables and the underlying constructs using a hypothesized model. The model is typically derived from existing theory or previous research, and is used to test the validity of the construct being measured.
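In matrix notation, a measurement model of this kind is commonly written as shown below. The symbols are conventional choices introduced here for illustration: x is the vector of observed variables, xi the vector of latent factors, Lambda the matrix of factor loadings, delta the vector of error terms, Phi the covariance matrix of the factors, Theta the covariance matrix of the errors, and Sigma the covariance matrix of the observed variables implied by the model.

```latex
x = \Lambda \xi + \delta,
\qquad
\Sigma(\theta) = \Lambda \Phi \Lambda^{\top} + \Theta
```

Defining the measurement model then amounts to deciding which entries of these matrices are free to be estimated and which are fixed, usually to zero.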

For example, if a researcher wants to test the construct of "job satisfaction," they might hypothesize that job satisfaction is composed of several underlying factors, such as "work-life balance," "pay and benefits," and "job security." The researcher would then specify the relationships among these factors and the observed variables that measure them, such as "hours worked per week," "salary," and "likelihood of layoffs."
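A specification like this can be sketched as a plain data structure. In the snippet below, the indicator names `schedule_control` and `benefits_rating`, all loading values, and the factor correlation are hypothetical and only two of the three factors are shown. Because each indicator loads on exactly one factor, the correlation the model implies between any two different indicators follows directly from the loadings and the factor correlation.

```python
# Hypothetical standardized loadings: each indicator loads on exactly one factor.
loadings = {
    "hours_per_week": {"work_life_balance": 0.7},
    "schedule_control": {"work_life_balance": 0.8},
    "salary": {"pay_and_benefits": 0.9},
    "benefits_rating": {"pay_and_benefits": 0.6},
}
# Hypothetical correlation between the two factors.
factor_corr = {("work_life_balance", "pay_and_benefits"): 0.3}


def phi(f, g):
    """Correlation between factors f and g (1.0 for a factor with itself)."""
    if f == g:
        return 1.0
    return factor_corr.get((f, g), factor_corr.get((g, f), 0.0))


def implied_correlation(a, b):
    """Model-implied correlation between two different indicators a and b."""
    return sum(
        la * phi(f, g) * lb
        for f, la in loadings[a].items()
        for g, lb in loadings[b].items()
    )


same_factor = implied_correlation("hours_per_week", "schedule_control")
cross_factor = implied_correlation("hours_per_week", "salary")
print(round(same_factor, 3))   # 0.56  (0.7 * 0.8)
print(round(cross_factor, 3))  # 0.189 (0.7 * 0.3 * 0.9)
```

CFA compares implied values such as these with the correlations actually observed in the data.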

### Data collection and preparation

The next step is data collection and preparation. Researchers must collect data on the variables that are required for the hypothesized measurement model. This typically involves administering surveys or questionnaires to a sample of participants, and collecting data on their responses.

Once the data is collected, it must be prepared in a suitable format for statistical analysis. This may involve cleaning the data to remove any errors or inconsistencies, and transforming the data to ensure that it meets the assumptions of the statistical model being used.
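As a rough illustration of this kind of preparation, the sketch below drops incomplete or out-of-range responses and reverse-codes a negatively worded item. The item names, the 1 to 5 response scale, and the `prepare` function are assumptions made for the example. Dropping incomplete rows (listwise deletion) is only one of several ways to handle missing data and is used here purely for simplicity.

```python
def prepare(responses, reverse_items, scale_min=1, scale_max=5):
    """Drop incomplete or out-of-range rows and reverse-code the listed items."""
    cleaned = []
    for row in responses:
        values = list(row.values())
        if any(v is None for v in values):
            continue  # listwise deletion: skip rows with a missing answer
        if any(not (scale_min <= v <= scale_max) for v in values):
            continue  # skip rows containing an impossible code
        cleaned.append({
            item: (scale_min + scale_max - v) if item in reverse_items else v
            for item, v in row.items()
        })
    return cleaned


raw = [
    {"q1": 4, "q2": 2, "q3": 5},
    {"q1": 3, "q2": None, "q3": 4},  # missing answer
    {"q1": 9, "q2": 1, "q3": 2},     # data-entry error
    {"q1": 1, "q2": 5, "q3": 2},
]
clean = prepare(raw, reverse_items={"q2"})
print(clean)
# [{'q1': 4, 'q2': 4, 'q3': 5}, {'q1': 1, 'q2': 1, 'q3': 2}]
```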

### Model estimation and evaluation

The third step is model estimation and evaluation. Researchers estimate the model using statistical software, and evaluate the goodness of fit of the model, as well as the significance of the factor loadings and other model parameters.
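Under the widely used maximum likelihood (ML) approach, which assumes multivariate normal data, estimation means choosing the parameter values that minimize the discrepancy function below. The notation is introduced here for illustration: S is the sample covariance matrix, Sigma(theta) is the covariance matrix implied by the model at parameter values theta, p is the number of observed variables, and N is the sample size.

```latex
F_{\mathrm{ML}}(\theta)
  = \ln\lvert \Sigma(\theta) \rvert
  - \ln\lvert S \rvert
  + \operatorname{tr}\!\bigl( S\, \Sigma(\theta)^{-1} \bigr)
  - p,
\qquad
\chi^2 = (N - 1)\, F_{\mathrm{ML}}(\hat{\theta})
```

The discrepancy is zero when the model reproduces S exactly and grows as the implied and observed matrices diverge. Some programs multiply by N rather than N - 1 when forming the test statistic.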

The goodness of fit of the model is typically evaluated using several fit indices, such as the chi-square test, the root mean square error of approximation (RMSEA), and the comparative fit index (CFI). These indices provide information about how well the model fits the data, and whether any modifications to the model are necessary.

If the model does not fit the data well, researchers may need to modify the model and re-estimate it. This process is typically iterative, with modifications made until the model fits the data appropriately.

### Model modification and re-specification

The final step is model modification and re-specification. If the initial model does not fit the data well, researchers must modify and re-specify the model until it fits the data appropriately.

Modifications may include adding or removing observed variables, redefining the relationships among the observed variables and underlying constructs, or allowing for correlations among error terms. The goal is to arrive at a final model that fits the data well and provides a valid test of the construct being measured.
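When a modification frees a parameter without changing anything else, the original and modified models are nested, and one common way to judge the change is a chi-square difference test. The sketch below uses hypothetical fit results and a small table of standard 5% critical values; the function name is an assumption for illustration. It presumes ordinary ML chi-square values, since scaled or robust statistics need an adjusted version of the test.

```python
# Upper 5% critical values of the chi-square distribution (standard table values).
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def chi_square_difference(chi2_restricted, df_restricted, chi2_relaxed, df_relaxed):
    """Return (delta_chi2, delta_df, significant_at_05) for two nested models."""
    delta_chi2 = chi2_restricted - chi2_relaxed
    delta_df = df_restricted - df_relaxed
    return delta_chi2, delta_df, delta_chi2 > CHI2_CRIT_05[delta_df]


# Hypothetical results: the relaxed model frees one correlation between error terms.
delta, ddf, improved = chi_square_difference(61.3, 8, 18.9, 7)
print(round(delta, 1), ddf, improved)  # 42.4 1 True
```

A significant improvement does not by itself justify a modification. Changes driven only by the data can capitalize on chance, so they are usually expected to make theoretical sense as well.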

In short, CFA is a powerful tool for testing the validity of theoretical constructs. By following these four steps, researchers can develop and refine measurement models that accurately reflect the underlying constructs they are trying to measure.

## Key concepts in confirmatory factor analysis

Confirmatory factor analysis (CFA) is a statistical technique used to test a hypothesized relationship between latent variables and observed variables. In this technique, researchers aim to confirm a theoretical model that specifies the relationship between a set of latent variables and their corresponding observed variables. The following are some key concepts in CFA:

### Latent variables and observed variables

Latent variables are unobserved constructs that are measured by several observed variables. For example, intelligence is a latent variable that can be measured by several observed variables, such as verbal ability, spatial ability, and memory. Observed variables are those variables that are measured directly, for example, responses on a survey questionnaire. CFA is concerned with testing the relationship between latent variables and observed variables.

For instance, let's say a researcher is interested in examining the relationship between depression and anxiety. Depression and anxiety are latent variables that can be measured by several observed variables, such as mood, energy level, and worry. CFA can be used to test whether these observed variables are related to depression and anxiety as hypothesized.

### Factor loadings and error terms

Factor loadings represent the strength and direction of the relationship between a latent variable and an observed variable. Factor loadings can be positive, negative, or zero. A positive loading means that higher levels of the latent variable are associated with higher scores on the observed variable. A negative loading means that higher levels of the latent variable are associated with lower scores on the observed variable. A loading of zero means that the observed variable is unrelated to the latent variable.

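Error terms represent the part of an observed variable that is not accounted for by the latent variables, combining variance unique to that variable with random measurement error. For any individual case the error can be positive or negative, but the error variance that CFA estimates should not be negative in a properly specified model. Error terms are assumed to be uncorrelated with the latent variables. CFA allows researchers to estimate both factor loadings and error variances.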
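For a standardized observed variable that loads on a single factor, loadings and error variances are two sides of the same split: the squared loading is the share of variance explained by the factor, and the remainder is error variance. The sketch below illustrates this with hypothetical loadings for three indicators; the `decompose` function is made up for the example.

```python
def decompose(standardized_loading):
    """Split a standardized indicator's variance into explained and error parts."""
    explained = standardized_loading ** 2
    return explained, 1.0 - explained


# Hypothetical standardized loadings: positive, negative, and zero.
examples = {"worry": 0.8, "energy_level": -0.5, "mood": 0.0}
for name, lam in examples.items():
    explained, error = decompose(lam)
    print(name, round(explained, 2), round(error, 2))
# worry 0.64 0.36
# energy_level 0.25 0.75
# mood 0.0 1.0
```

The sign of a loading affects the direction of the relationship but not the amount of variance explained.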

### Model fit indices

Model fit indices are statistical measures that assess the degree to which the hypothesized model fits the data. CFA output typically includes several model fit indices, such as the chi-square test statistic, the root mean square error of approximation (RMSEA), the comparative fit index (CFI), and the standardized root mean square residual (SRMR).

The chi-square test statistic tests the null hypothesis that the hypothesized model reproduces the population covariance matrix exactly. A non-significant result is therefore consistent with a good fit between the model and the data. However, the statistic grows with sample size, so in large samples it can be significant even when the discrepancies between the model and the data are trivial. For this reason it is usually reported alongside other fit indices.
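The sample-size effect follows from how the statistic is formed under maximum likelihood estimation: the minimized discrepancy between model and data is multiplied by N - 1. The sketch below holds a small, hypothetical discrepancy constant and varies only the sample size; the 24 degrees of freedom and the function name are likewise assumptions for illustration.

```python
CHI2_CRIT_05_DF24 = 36.415  # upper 5% critical value for 24 degrees of freedom (table value)


def chi_square_statistic(f_min, n):
    """Model chi-square from the minimized ML discrepancy f_min and sample size n."""
    return (n - 1) * f_min


f_min = 0.05  # hypothetical small misfit, identical in every scenario
results = {}
for n in (200, 500, 2000):
    stat = chi_square_statistic(f_min, n)
    results[n] = (round(stat, 2), stat > CHI2_CRIT_05_DF24)
    print(n, *results[n])
# 200 9.95 False
# 500 24.95 False
# 2000 99.95 True
```

The same degree of misfit is rejected at N = 2000 but not at N = 200, which is why the test alone is a poor guide to fit in large samples.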

RMSEA is a measure of how well the model fits the data per degree of freedom, so it takes the complexity of the model into account. A lower RMSEA indicates a better fit between the model and the data. CFI is a measure of how well the model fits the data relative to a baseline (null) model, which typically assumes that the observed variables are uncorrelated with one another. A higher CFI indicates a better fit between the model and the data. SRMR is a measure of the average difference between the observed correlations and the correlations predicted by the model. A lower SRMR indicates a better fit between the model and the data. Commonly cited rules of thumb treat an RMSEA of about .06 or lower, a CFI of about .95 or higher, and an SRMR of about .08 or lower as signs of acceptable fit, although these cutoffs are guidelines rather than strict rules.
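These three indices can be computed from quantities that most CFA software reports. The sketch below uses common textbook formulas and hypothetical inputs (chi-square values, degrees of freedom, sample size, and correlation matrices). Software packages differ in details, for example using N rather than N - 1 in RMSEA or a slightly different SRMR definition, so results may not match a given program exactly.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 version)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative fit index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_baseline - df_baseline, 0.0)
    denom = max(d_model, d_base)
    return 1.0 if denom == 0 else 1.0 - d_model / denom


def srmr(observed_corr, implied_corr):
    """Root mean square residual correlation over the lower triangle, diagonal included."""
    p = len(observed_corr)
    squared = [
        (observed_corr[i][j] - implied_corr[i][j]) ** 2
        for i in range(p)
        for j in range(i + 1)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical results for a six-indicator model estimated on 300 cases.
fit_rmsea = rmsea(chi2=18.9, df=8, n=300)
fit_cfi = cfi(chi2_model=18.9, df_model=8, chi2_baseline=520.0, df_baseline=15)

# Hypothetical observed and model-implied correlations for three indicators.
observed = [[1.0, 0.50, 0.40], [0.50, 1.0, 0.30], [0.40, 0.30, 1.0]]
implied = [[1.0, 0.56, 0.48], [0.56, 1.0, 0.42], [0.48, 0.42, 1.0]]
fit_srmr = srmr(observed, implied)

print(round(fit_rmsea, 3), round(fit_cfi, 3), round(fit_srmr, 3))
```

Because each index summarizes misfit in a different way, they are normally read together rather than individually.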

## Conclusion

In conclusion, confirmatory factor analysis is a statistical method used to validate measurement models by testing relationships among latent variables and observed variables. CFA is a powerful tool for hypothesis testing, validating measurement instruments, and assessing construct validity. By understanding the basics of CFA, its purpose, process, and key concepts, researchers can design better measurement instruments and draw stronger conclusions from their research.