Multiple Correspondence Analysis
Introduction
Multiple correspondence analysis (MCA) extends correspondence analysis (CA) from two categorical variables to several. It is often described as the categorical-data analogue of PCA and is widely used in social science and epidemiology to explore patterns among multiple categorical variables.
Prerequisites
Correspondence analysis.
Theory
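MCA operates on the indicator matrix, the complete disjunctive (one-hot) coding of all categorical variables, with one row per individual and one column per category. An equivalent formulation, Burt’s approach, decomposes the Burt table, which stacks the cross-tabulations of every pair of variables (including each variable with itself); its eigenvalues are the squares of the indicator-matrix eigenvalues. Each category and each individual becomes a point, and the low-dimensional projection approximates the chi-squared distances between points as closely as possible.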
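As a rough illustration of the construction described above, the following base-R sketch builds the indicator matrix and the Burt table for a small made-up data frame, then runs CA on the indicator matrix through an SVD of the standardized residuals. The data and the helper name indicator are hypothetical and are not part of FactoMineR; this is a teaching sketch, not how MCA() is implemented internally.

```r
# Toy data: 6 individuals, Q = 3 categorical variables, J = 7 categories in total
df <- data.frame(
  sport   = factor(c("yes", "yes", "no", "no", "yes", "no")),
  reading = factor(c("no", "yes", "yes", "yes", "no", "no")),
  music   = factor(c("rock", "jazz", "rock", "none", "jazz", "none"))
)

# Hypothetical helper: one 0/1 column per level of a factor
indicator <- function(f, name) {
  m <- outer(as.character(f), levels(f), "==") * 1
  colnames(m) <- paste(name, levels(f), sep = "_")
  m
}

# Indicator (complete disjunctive) matrix: each row has exactly Q ones
Z <- do.call(cbind, unname(Map(indicator, df, names(df))))

# Burt table: all pairwise cross-tabulations, category counts on the diagonal
Burt <- crossprod(Z)

# CA of Z: SVD of the standardized residuals of the correspondence matrix
P  <- Z / sum(Z)
r  <- rowSums(P)   # row masses
cm <- colSums(P)   # column masses
S  <- (P - outer(r, cm)) / sqrt(outer(r, cm))
eig <- svd(S)$d^2  # principal inertias (eigenvalues) of the indicator matrix

Q <- ncol(df)
J <- ncol(Z)
total_inertia <- sum(eig)  # equals J / Q - 1 when every category is observed
```

The total inertia depends only on the number of variables and categories, not on the associations in the data, which is one reason raw inertia percentages in MCA are hard to read.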
Assumptions
Categorical variables; no very sparse cells or rare categories, which can dominate the first axes; sufficient sample size.
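One way to check the rare-category condition before running MCA is to tabulate category proportions and flag the small ones. The sketch below uses base R only; the function name rare_categories and the 5% threshold are assumptions chosen for illustration, not a rule from the MCA literature.

```r
# Hypothetical helper: list categories whose share falls below min_prop
rare_categories <- function(data, min_prop = 0.05) {
  out <- lapply(names(data), function(v) {
    p <- prop.table(table(data[[v]], useNA = "no"))
    rare <- p[p < min_prop]
    if (length(rare) == 0) return(NULL)
    data.frame(variable = v,
               category = names(rare),
               proportion = as.numeric(rare))
  })
  do.call(rbind, out)  # NULL when no category is flagged
}

# Made-up example: category "y" of variable a covers only 3% of rows
toy <- data.frame(
  a = factor(c(rep("x", 97), rep("y", 3))),
  b = factor(rep(c("u", "v"), 50))
)
flagged <- rare_categories(toy)
print(flagged)
```

Flagged categories might be merged with a neighbouring category or handled as supplementary, depending on the analysis.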
R Implementation
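library(FactoMineR)  # MCA() and the hobbies data set
library(factoextra)  # fviz_* plotting helpers
data(hobbies)
# Run MCA on the first eight hobby variables without drawing the default plots
mca <- MCA(hobbies[, 1:8], graph = FALSE)
fviz_mca_biplot(mca)  # individuals and categories on the same map
fviz_mca_var(mca)     # categories only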
summary(mca)
Output & Results
Coordinates for individuals and variable categories; inertia per axis.
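The quantities listed above can be pulled out of the fitted object directly. The sketch below refits the same model so that it runs on its own and reads the eig, ind and var components that FactoMineR stores on an MCA result; the variable names on the left-hand side are just illustrative.

```r
library(FactoMineR)
data(hobbies)
mca <- MCA(hobbies[, 1:8], graph = FALSE)

# Inertia per axis: eigenvalue, percentage and cumulative percentage
eig <- mca$eig
head(eig)

# Coordinates of individuals and of variable categories on the retained axes
ind_coord <- mca$ind$coord
var_coord <- mca$var$coord

# Categories that contribute most to the first axis
var_contrib <- mca$var$contrib
head(var_coord[order(-var_contrib[, 1]), 1:2])
```

By default MCA() keeps five dimensions (argument ncp), so the coordinate matrices have five columns unless ncp is changed.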
Interpretation
“MCA of 8 hobby variables revealed two dimensions, corresponding to ‘active vs. sedentary’ and ‘social vs. solitary’. Age groups and gender mapped onto these gradients.”
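A statement about age and gender like the one above would usually rest on supplementary variables projected onto axes built from the hobbies alone. The sketch below shows one way this might be set up; it assumes that columns 19 and 20 of hobbies hold sex and age group, which should be checked with names(hobbies) before relying on the positions.

```r
library(FactoMineR)
data(hobbies)

# Assumption: columns 19 and 20 are the sex and age-group variables
active_and_sup <- hobbies[, c(1:8, 19, 20)]

# Columns 9 and 10 of the subset are supplementary: they are positioned on
# the map but do not influence the construction of the axes
mca_sup <- MCA(active_and_sup, quali.sup = 9:10, graph = FALSE)

# Coordinates of the supplementary categories on the first two axes
sup_coord <- mca_sup$quali.sup$coord[, 1:2]
print(sup_coord)
```

Supplementary categories lying far from the origin along an axis suggest that the variable is associated with the gradient that axis represents.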
Practical Tips
- Inertia per axis tends to be low in MCA; it is common to need many dimensions to account for much of the variance.
- Raw percentages therefore understate the quality of the map; Benzécri’s correction or Greenacre’s adjusted inertia gives more meaningful variance-explained percentages.
- Supplementary variables (quali.sup) can be added without affecting the main axes.
- Use fviz_mca_var to focus on category positions.
- For mixed data (categorical + continuous), consider factor analysis of mixed data (FAMD).
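The Benzécri correction mentioned in the tips can be sketched in a few lines of base R. It keeps only the indicator-matrix eigenvalues above 1/Q, where Q is the number of active variables, and rescales them as (Q / (Q - 1))^2 * (eigenvalue - 1/Q)^2. The function name benzecri_correction and the example eigenvalues are made up for illustration; Greenacre’s adjustment, which uses a different denominator for the percentages, is not shown.

```r
# Hypothetical helper: Benzecri-corrected inertia for indicator-matrix eigenvalues
benzecri_correction <- function(eig, Q) {
  keep <- eig > 1 / Q                       # only axes above the 1/Q threshold
  adj  <- (Q / (Q - 1))^2 * (eig[keep] - 1 / Q)^2
  data.frame(
    dim      = which(keep),
    raw      = eig[keep],
    adjusted = adj,
    percent  = 100 * adj / sum(adj)         # share of the corrected inertia
  )
}

# Made-up eigenvalues for Q = 4 variables; the last one falls below 1/Q = 0.25
res <- benzecri_correction(c(0.5, 0.4, 0.3, 0.2), Q = 4)
print(res)
```

With a FactoMineR fit on the default indicator method, a call such as benzecri_correction(mca$eig[, 1], Q = 8) would apply the same rescaling to the eight-variable model, assuming the first column of mca$eig holds the raw eigenvalues.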