Factorial / Mixed Repeated-Measures ANOVA

mixed-anova

rmanova

within-between

sphericity

Combining within-subjects factors with between-subjects factors in a single ANOVA

Published

April 17, 2026

Research question

Use a mixed repeated-measures ANOVA when a design combines a within-subjects factor (e.g., measurements over time) with a between-subjects factor (e.g., treatment arm, sex, or disease stage). Biomedical example: in a two-arm trial with measurements at baseline, week 4, and week 12, does the change in fasting plasma glucose (FPG) over time differ between the active and placebo arms, i.e., is there an arm x time interaction?

Assumptions

Assumption	How to verify in R
Independent between-subjects groups; within-subjects measurements correlated within participant	By design (each participant belongs to exactly one arm)
Outcome approximately normal per cell	`shapiro_test()` by arm x time
Sphericity of the within-subjects factor	Mauchly’s test on the within factor (reported by `anova_test()`)
Homogeneity of variance across between-subjects levels at each time	`levene_test()` (rstatix) per time
No extreme outliers	boxplot by arm x time
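The variance and outlier checks in the table can be sketched as follows. This is a minimal illustration on made-up data: the data frame `toy`, its values, and the planted outlier are assumptions introduced here, while `levene_test()` and `identify_outliers()` are rstatix functions.

```r
library(dplyr)
library(rstatix)

set.seed(1)
times <- c("baseline", "wk4", "wk12")

# Hypothetical toy data: 10 participants per arm, 3 visits each
toy <- expand.grid(time = factor(times, levels = times), id = factor(1:20))
toy$arm <- factor(ifelse(as.integer(toy$id) <= 10, "Placebo", "Active"),
                  levels = c("Placebo", "Active"))
toy$fpg <- rnorm(nrow(toy), mean = 8, sd = 1)
toy$fpg[1] <- 20  # planted implausible value so the outlier check returns a row

# Levene's test of equal variances between arms, run separately at each time
lev <- toy |> group_by(time) |> levene_test(fpg ~ arm)

# Outliers per arm x time cell; is.extreme marks points beyond 3 x IQR
out <- toy |> group_by(arm, time) |> identify_outliers(fpg)

lev
out
```

A row with `is.extreme == TRUE` is the kind of point worth checking against the source record before deciding how to handle it.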

Hypotheses

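For a Group x Time design with cell means \(\mu_{gt}\) (group \(g\), time \(t\); a dot denotes averaging over that index), the three null hypotheses are:

\[H_0^{\text{Time}}: \mu_{\cdot 1} = \dots = \mu_{\cdot T}, \quad H_0^{\text{Group}}: \mu_{1 \cdot} = \dots = \mu_{G \cdot}, \quad H_0^{\text{Group x Time}}: \mu_{gt} - \mu_{g \cdot} - \mu_{\cdot t} + \mu_{\cdot \cdot} = 0 \ \text{for all } g, t \ (\text{no interaction})\]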

The interaction is usually the hypothesis of primary interest in clinical trials with repeated measures.
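With two arms, the interaction null is equivalent to equal change from baseline in both arms. As a worked illustration, the numbers below are the mean shifts built into the simulated data in the R code section (design values of a simulation, not estimates); the symbols \(\mu_{A,t}\), \(\mu_{P,t}\) (mean FPG in the active and placebo arm at time \(t\)) and \(\delta_t\) are notation introduced here.

\[H_0^{\text{Group x Time}}: \ \delta_t = (\mu_{A,t} - \mu_{A,\text{baseline}}) - (\mu_{P,t} - \mu_{P,\text{baseline}}) = 0 \quad \text{for every follow-up time } t\]

\[\delta_{\text{wk4}} = (-1.1) - (-0.3) = -0.8, \qquad \delta_{\text{wk12}} = (-1.9) - (-0.5) = -1.4 \ \text{mmol/L}\]

Both differences are non-zero, so the interaction null is false by construction in the simulated example.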

R code

library(tidyverse); library(rstatix); library(afex); library(effectsize); library(ggstatsplot)
set.seed(42)

# 30 per arm; FPG at 3 time points
fpg <- expand_grid(
  id   = 1:60,
  time = factor(c("baseline", "wk4", "wk12"), levels = c("baseline", "wk4", "wk12"))
) |>
  mutate(arm  = factor(rep(rep(c("Placebo", "Active"), each = 3), 30),
                        levels = c("Placebo", "Active")),
         base = rep(rnorm(60, 8.2, 1.0), each = 3),
         fpg  = base + case_when(
           arm == "Placebo" & time == "baseline" ~ 0,
           arm == "Placebo" & time == "wk4"      ~ rnorm(n(), -0.3, 0.6),
           arm == "Placebo" & time == "wk12"     ~ rnorm(n(), -0.5, 0.7),
           arm == "Active"  & time == "baseline" ~ 0,
           arm == "Active"  & time == "wk4"      ~ rnorm(n(), -1.1, 0.6),
           arm == "Active"  & time == "wk12"     ~ rnorm(n(), -1.9, 0.8)
         ))

# Assumptions
fpg |> group_by(arm, time) |> shapiro_test(fpg)

# Mixed ANOVA
mix <- anova_test(data = fpg, dv = fpg, wid = id, between = arm, within = time)
get_anova_table(mix, correction = "auto")

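# If interaction is significant, test the simple effect of time within each arm
# (p-values Holm-adjusted across the two arms)
fpg |> group_by(arm) |> anova_test(dv = fpg, wid = id, within = time) |>
  get_anova_table() |> adjust_pvalue(method = "holm")

# And the simple effect of arm at each time
# (p-values Holm-adjusted across the three time points)
fpg |> group_by(time) |> anova_test(dv = fpg, wid = id, between = arm) |>
  get_anova_table() |> adjust_pvalue(method = "holm")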

ggbetweenstats(data = fpg |> filter(time == "wk12"), x = arm, y = fpg,
               xlab = "Arm at week 12", ylab = "FPG (mmol/L)")

Interpreting the output

The key test is the Group x Time interaction. A significant interaction, \(F(2, 116) \approx 15\), \(p < .001\), \(\eta_G^2 \approx 0.16\), indicates that the change in FPG over time differed between arms. Simple-effects analyses show: within the active arm, FPG declined substantially across time; within the placebo arm, the decline was minimal. At week 12, the active arm’s FPG was significantly lower than placebo’s.
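To see where the reported F, degrees of freedom, and \(\eta_G^2\) come from, it can help to look at the pieces of the object that `anova_test()` returns. The sketch below uses made-up data (the data frame `toy` and its values are assumptions for illustration) and relies on the component names rstatix uses for designs with a within factor of more than two levels.

```r
library(rstatix)

set.seed(2)
times <- c("baseline", "wk4", "wk12")

# Hypothetical toy data: 10 participants per arm, 3 visits each
toy <- expand.grid(time = factor(times, levels = times), id = factor(1:20))
toy$arm <- factor(ifelse(as.integer(toy$id) <= 10, "Placebo", "Active"),
                  levels = c("Placebo", "Active"))
toy$fpg <- rnorm(20, 8.2, 1)[as.integer(toy$id)] + rnorm(nrow(toy), 0, 0.6)

res <- anova_test(data = toy, dv = fpg, wid = id, between = arm, within = time)

res$ANOVA                            # uncorrected table: Effect, DFn, DFd, F, p, ges
res$`Mauchly's Test for Sphericity`  # sphericity test for effects involving time
res$`Sphericity Corrections`         # Greenhouse-Geisser and Huynh-Feldt epsilons

# correction = "auto" applies the Greenhouse-Geisser correction only to
# within-subjects effects whose Mauchly test is significant
get_anova_table(res, correction = "auto")
```

If a correction is applied, the degrees of freedom in the final table become fractional, which is worth carrying through to the write-up.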

Effect size

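Generalised eta-squared (\(\eta_G^2\)) is preferred for mixed designs because its values are comparable across between-subjects and within-subjects effects. Conventional benchmarks, adapted from Cohen's guidelines for eta-squared: small 0.01, medium 0.06, large 0.14.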
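For a balanced design, \(\eta_G^2\) for the interaction can be computed by hand from the sums of squares. The sketch below assumes both factors are treated as manipulated, in which case the denominator is the effect's sum of squares plus every sum of squares that involves participants. The data frame `toy` and all variable names are hypothetical.

```r
set.seed(7)
times <- c("baseline", "wk4", "wk12")

# Hypothetical balanced toy data: 10 participants per arm, 3 visits each
toy <- expand.grid(time = factor(times, levels = times), id = factor(1:20))
toy$arm <- factor(ifelse(as.integer(toy$id) <= 10, "Placebo", "Active"),
                  levels = c("Placebo", "Active"))
shift <- ifelse(toy$arm == "Active",
                c(0, -1.1, -1.9)[as.integer(toy$time)],
                c(0, -0.3, -0.5)[as.integer(toy$time)])
toy$fpg <- rnorm(20, 8.2, 1)[as.integer(toy$id)] + shift +
  rnorm(nrow(toy), 0, 0.6)

# Means repeated on every row, so summing squared deviations over rows
# applies the correct weights in a balanced design
grand  <- mean(toy$fpg)
m_arm  <- ave(toy$fpg, toy$arm)
m_time <- ave(toy$fpg, toy$time)
m_cell <- ave(toy$fpg, toy$arm, toy$time)
m_id   <- ave(toy$fpg, toy$id)

ss_arm   <- sum((m_arm - grand)^2)                      # between: arm
ss_subj  <- sum((m_id - m_arm)^2)                       # participants within arm
ss_time  <- sum((m_time - grand)^2)                     # within: time
ss_int   <- sum((m_cell - m_arm - m_time + grand)^2)    # arm x time
ss_resid <- sum((toy$fpg - m_cell - m_id + m_arm)^2)    # time x participants within arm

# Generalised eta-squared for the interaction (all factors manipulated)
ges_int <- ss_int / (ss_int + ss_subj + ss_resid)
ges_int
```

Because the participant-level variance sits in the denominator, \(\eta_G^2\) is typically smaller than partial eta-squared for the same within-subjects effect.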

Reporting (APA 7)

A 2 (arm) x 3 (time) mixed ANOVA showed a significant arm x time interaction on fasting plasma glucose, F(2, 116) = 14.9, p < .001, \(\eta_G^2\) = .16. Simple-effects analyses confirmed a steeper decline in the active arm than in placebo; at week 12, the active arm was 1.6 mmol/L lower than placebo, F(1, 58) = 48.1, p < .001.

Common pitfalls

Ignoring the interaction when it is the primary clinical question; report it first.
Reporting pooled main effects of group or time without interpreting the pattern when the interaction is significant.
Dropping every participant who has a missing time point (listwise deletion); for incomplete or unbalanced data, fit a linear mixed model with lme4::lmer() instead.
Running multiple paired t-tests without controlling the family-wise error rate (e.g., with a Bonferroni or Holm adjustment).
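The missing-data pitfall can be illustrated with a random-intercept model. This is a minimal sketch on made-up data, assuming visits are missing at random; the data frame `toy`, the choice of which visits are lost, and the random-intercept-only structure are all assumptions for illustration.

```r
library(lme4)

set.seed(3)
times <- c("baseline", "wk4", "wk12")

# Hypothetical toy data: 10 participants per arm, 3 visits each
toy <- expand.grid(time = factor(times, levels = times), id = factor(1:20))
toy$arm <- factor(ifelse(as.integer(toy$id) <= 10, "Placebo", "Active"),
                  levels = c("Placebo", "Active"))
shift <- ifelse(toy$arm == "Active",
                c(0, -1.1, -1.9)[as.integer(toy$time)],
                c(0, -0.3, -0.5)[as.integer(toy$time)])
toy$fpg <- rnorm(20, 8.2, 1)[as.integer(toy$id)] + shift +
  rnorm(nrow(toy), 0, 0.6)

# Remove the week-12 visit for six participants to mimic dropout
lost <- toy$time == "wk12" & as.integer(toy$id) %in% c(1:3, 11:13)
toy_miss <- toy[!lost, ]

# Random intercept per participant; fitted by ML so the models can be compared
fit_full <- lmer(fpg ~ arm * time + (1 | id), data = toy_miss, REML = FALSE)
fit_add  <- lmer(fpg ~ arm + time + (1 | id), data = toy_miss, REML = FALSE)

# Likelihood-ratio test of the arm x time interaction
anova(fit_add, fit_full)

nobs(fit_full)   # visits used
ngrps(fit_full)  # participants retained
```

All 20 participants contribute their observed visits, whereas a complete-case repeated-measures ANOVA on the same data would keep only the 14 with all three visits.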

Parametric vs. non-parametric alternative

For non-normal data, use an aligned rank transform ANOVA (ARTool::art()) or a robust trimmed-means variant (WRS2::bwtrim()). For unbalanced designs, prefer a linear mixed model.
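An aligned rank transform analysis of the same arm x time layout might look like the sketch below. It runs on made-up, right-skewed data (the data frame `toy` and its values are assumptions for illustration) and follows the formula style of the ARTool documentation, with the participant entered as a random intercept.

```r
library(ARTool)

set.seed(4)
times <- c("baseline", "wk4", "wk12")

# Hypothetical toy data: 10 participants per arm, 3 visits each, skewed outcome
toy <- expand.grid(time = factor(times, levels = times), id = factor(1:20))
toy$arm <- factor(ifelse(as.integer(toy$id) <= 10, "Placebo", "Active"),
                  levels = c("Placebo", "Active"))
toy$fpg <- 6 + rlnorm(20, 0, 0.4)[as.integer(toy$id)] +
  rlnorm(nrow(toy), 0, 0.5)

# art() aligns and ranks the response separately for each effect;
# all fixed-effect terms must be factors
m <- art(fpg ~ arm * time + (1 | id), data = toy)

# One F test per effect: arm, time, and arm:time
art_table <- anova(m)
art_table
```

Each row of the table is tested on its own aligned-and-ranked copy of the response, so the three tests should be read separately.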