STEP 7 / 17

ANOVA、ANCOVA、混合效應模型

從「比較多組平均」走向「在批次、subject、重複內共享資訊」——多層次資料的標準語言。

From "comparing several group means" to "sharing information across batches, subjects, replicates" — the lingua franca of multi-level data.

ANOVA 是線性模型的特例

「為什麼分開教 ANOVA?」其實 ANOVA = 用類別變數做 X 的線性迴歸。F 檢定把總變異拆解成「組間 + 組內」,問「組間變異是否顯著大於組內」。一旦理解這個身分,ANCOVA(加入連續共變數)、repeated measuresmixed models都是這個框架的自然延伸。

真正能讓 bioinformatician 與其他人拉開差距的是 混合效應模型 (mixed models):當資料有批次、subject、複製等多層結構時,固定效應模型要嘛假裝它們不存在(錯)、要嘛吃掉所有自由度(浪費),而混合模型用隨機效應把這些變異來源建模成分布,達成部分匯聚 (partial pooling)

"Why teach ANOVA separately?" — in fact ANOVA = linear regression with categorical X. The F-test partitions variance into "between + within" and asks if between exceeds within. Once you see this identity, ANCOVA (add continuous covariates), repeated measures, and mixed models all become natural extensions.

What separates bioinformaticians from the pack is mixed-effects modeling: when data have batch, subject, replicate structure, a purely fixed-effects model either ignores them (wrong) or burns df on them (wasteful). Mixed models treat these variance sources as random effects drawn from a distribution — yielding partial pooling.

💡
判別準則:把因子當固定還是隨機?若該因子的特定 levels 本身有興趣(treatment = drug A vs B)→ 固定;若這些 levels 是更大母體的可交換樣本(batch 1, 2, ..., 8 → 未來還會有更多)→ 隨機。 Rule of thumb: fixed vs random? If you care about the specific levels themselves (treatment = drug A vs B) → fixed. If they are exchangeable samples from a larger population (batch 1, 2, ..., 8 → there will be more) → random.

一、從 ANOVA 到混合模型

📦

ANOVA 與 SS 類型

  • F = MSbetween / MSwithin
  • Type I(序列):依輸入順序,結果隨順序變
  • Type II(邊際):適合無交互作用
  • Type III(part. SS):失衡資料的標準選擇 (car::Anova(., type=3))
  • F = MSbetween / MSwithin
  • Type I (sequential): order-dependent
  • Type II (marginal): for no-interaction models
  • Type III (partial SS): standard for unbalanced designs (car::Anova(., type=3))
📏

ANCOVA:調整後比較

  • lm(y ~ group + covariate):在共變數下比較組均值
  • 關鍵:先檢查 group × covariate 交互是否顯著(平行斜率假設)
  • 批次、性別、年齡、library size 都是 ANCOVA 的常見共變數
  • lm(y ~ group + covariate): compare group means at common covariate value
  • Always check group × covariate interaction first (parallel-slopes assumption)
  • Batch, sex, age, library size — typical ANCOVA covariates
🎲

隨機 vs 固定

  • 固定:估計每個 level;隨機:估計 levels 的變異 σ²
  • 隨機允許「未來新 level」的推廣
  • 常見隨機效應:batch、subject、site、family
  • 公式:(1|batch) 隨機截距;(slope|subject) 隨機斜率
  • Fixed: estimate each level; random: estimate variance σ² of levels
  • Random enables generalization to new levels
  • Typical random effects: batch, subject, site, family
  • Formula: (1|batch) random intercept; (slope|subject) random slope
🌊

部分匯聚

  • No pooling = 每組獨立估計(雜訊大)
  • Complete pooling = 全體共用均值(偏誤大)
  • Partial pooling = 各組均值朝整體均值收縮,收縮量依各組樣本數與群間變異決定
  • 這就是 BLUP(Best Linear Unbiased Predictor)
  • No pooling = each group estimated independently (high variance)
  • Complete pooling = one grand mean (high bias)
  • Partial pooling = group means shrunk toward the grand mean, by amount depending on n and between-group variance
  • This is the BLUP (Best Linear Unbiased Predictor)
⚠️
REML vs ML、p-value 爭議:① 混合模型用 REML 估計變異成分(更小偏誤),但比較固定效應結構不同的模型要用 ML。② lme4::lmer 故意不給 p-value——分母自由度在不平衡資料下沒有閉式解。解法:用 lmerTest (Satterthwaite) 或 pbkrtest (Kenward-Roger)。③ singular fit 警告:某個變異成分被估在 0 邊界——簡化隨機結構(移除隨機斜率或互動)。
REML vs ML, and the p-value controversy: ① Mixed models use REML for variance components (less bias), but to compare models with different fixed structure switch to ML. ② lme4::lmer deliberately withholds p-values — there is no closed-form denominator df in unbalanced designs. Fix: lmerTest (Satterthwaite) or pbkrtest (Kenward-Roger). ③ Singular fit warning: a variance component is estimated at zero — simplify the random structure (drop random slopes or correlations).

三種匯聚方式視覺化

同一筆 6 組資料(每組 n=5)用三種方式估計組均值:灰色=資料;=各組獨立均值(no pooling);=整體均值(complete pooling);=混合模型 BLUP(partial pooling)。拖動滑桿改變群間 vs 群內變異比 τ²/σ²——當 τ²/σ² 大(組真的不同),BLUP 接近 no-pooling;當 τ²/σ² 小(組差異是雜訊),BLUP 強烈收縮到整體均值。

Same 6 groups (n=5 each) estimated three ways: gray=data; red=independent group means (no pooling); green=grand mean (complete pooling); blue=mixed-model BLUP (partial pooling). The slider controls between/within variance ratio τ²/σ²: when τ²/σ² is large (groups truly differ), BLUP nears no-pooling; when small (group differences are noise), BLUP shrinks hard toward the grand mean.

X: 組別;Y: 估計均值

二、生物資訊中的應用

情境 模型 工具
RNA-seq 技術複製(同一樣本多 lane)(1|sample)limma::duplicateCorrelationRNA-seq tech replicates (sample × lane)(1|sample)limma::duplicateCorrelation
多批次 RNA-seq~ treatment + (1|batch)variancePartition::dreamMulti-batch RNA-seq~ treatment + (1|batch)variancePartition::dream
scRNA-seq 多 donor(細胞嵌套 donor)(1|donor)pseudobulk + DESeq2; muscat; NEBULAscRNA-seq multi-donor (cells nested in donor)(1|donor)pseudobulk + DESeq2; muscat; NEBULA
縱貫資料(同 subject 多時間點)(time|subject)lme4; nlmeLongitudinal (subjects × time)(time|subject)lme4; nlme
GWAS 親緣矩陣linear mixed model + kinship KGEMMA; SAIGEGWAS with kinshipLMM + kinship KGEMMA; SAIGE

實作:ANOVA、ANCOVA、lmer

# --- R --- ANOVA / ANCOVA / mixed model 全套
library(car); library(lme4); library(lmerTest); library(emmeans); library(performance)

# 1) 單因子 ANOVA(與 lm 等價)
fit_a <- aov(y ~ group, data = df)
summary(fit_a)
# 不平衡 → Type III SS
car::Anova(lm(y ~ group * sex, data = df), type = "III")

# 2) ANCOVA:加共變數,先檢查交互
anova(lm(y ~ group * cov, data=df), lm(y ~ group + cov, data=df))
fit_c <- lm(y ~ group + cov, data = df)        # 平行斜率成立 → 用此

# 3) Mixed model:固定 treatment、隨機 batch + subject
m <- lme4::lmer(y ~ treatment + (1|batch) + (1|subject),
                 data = df, REML = TRUE)
summary(m)                                      # lmerTest 提供 Satterthwaite df 與 p
anova(m_null, m)                                # LR 檢定 (改 REML=FALSE)

# 4) 估計邊際均值 (EMM) + Tukey 比較
emmeans::emmeans(m, pairwise ~ treatment, adjust = "tukey")

# 5) 模型診斷
performance::check_model(m)                  # 6 張全面診斷圖
# --- Python ---
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm
from statsmodels.regression.mixed_linear_model import MixedLM
import pingouin as pg

# 1) Type III ANOVA
fit = smf.ols("y ~ C(group) * C(sex)", data=df).fit()
anova_lm(fit, typ=3)

# 2) ANCOVA
fit_c = smf.ols("y ~ C(group) + cov", data=df).fit()

# 3) Mixed model — random intercept on batch
m = MixedLM.from_formula("y ~ treatment", groups="batch",
                          re_formula="~time", data=df).fit(reml=True)
print(m.summary())

# 4) 重複測量 ANOVA
pg.mixed_anova(data=df, dv="y", within="time",
               between="group", subject="id")

# 5) 跨平台:pymer4 把 lme4 帶到 Python
# from pymer4.models import Lmer
# Lmer("y ~ treatment + (1|batch)", data=df).fit()
🚫
常見錯誤:把批次當固定效應——浪費 k−1 自由度,且結論無法推廣到「新 batch」。② 忽略嵌套結構(同 subject 的多次測量當獨立樣本)——pseudoreplication,p 值嚴重偏小,是 omics 假陽性的隱形殺手。③ 模型寫 (time|subject) 但只有 2 個時間點——隨機斜率無法識別。 Common mistakes:Treating batch as fixed — wastes k−1 df, and conclusions don't generalize to new batches. ② Ignoring nesting (multiple measurements per subject as independent) — pseudoreplication; p-values vastly anti-conservative; a silent killer of omics studies. ③ Writing (time|subject) with only 2 time points — random slope not identifiable.

📝 自我檢測

1. 你的研究有 8 個 RNA-seq batch,希望結論推廣到「未來再做的 batch」。batch 應怎麼建模?

1. Your study has 8 RNA-seq batches and you want conclusions to generalize to future batches. How should batch enter the model?

A. 完全忽略A. Ignore it
B. 固定效應B. Fixed effect
C. 隨機效應 (1|batch)C. Random effect (1|batch)
D. 共變數(連續數)D. A continuous covariate

2. 部分匯聚(partial pooling)解決了什麼問題?

2. What problem does partial pooling solve?

A. 異質變異A. Heteroscedasticity
B. 多重共線性B. Multicollinearity
C. 在「每組獨立估計」(雜訊大)與「全體共用一個均值」(偏誤大)之間取得最佳偏誤-變異折衷C. The bias-variance trade-off between fully separate (high variance) and fully pooled (high bias) estimation
D. 缺失資料D. Missing data

3. 為什麼 lme4::lmer 預設不給 p-value?

3. Why does lme4::lmer by default refuse to print p-values?

A. 它沒有計算 t 統計量A. It doesn't compute t-statistics
B. 在不平衡設計下沒有閉式分母自由度——需 Satterthwaite (lmerTest) 或 Kenward-Roger 近似B. There is no closed-form denominator df in unbalanced designs — needs Satterthwaite (lmerTest) or Kenward-Roger
C. 因為 Bates 不相信 p-valueC. Because Bates dislikes p-values
D. 混合模型沒有似然D. Mixed models have no likelihood