一、選對檢定法
🔍 該用哪個檢定?
t.test()wilcox.test() (Mann-Whitney)aov() 或 kruskal.test()chisq.test() / fisher.test()cor.test() / lm()glm(..., family = binomial)glm(...,family=poisson) / DESeq2二、t-test 與 Wilcoxon
set.seed(42)
ctrl <- rnorm(30, mean = 5, sd = 1)
trt <- rnorm(30, mean = 6.5, sd = 1)
# Welch's two-sample t-test (預設不假設等變異數,最穩健)
t.test(ctrl, trt)
# 公式介面:response ~ group / Formula interface
df <- data.frame(value = c(ctrl, trt),
group = rep(c("ctrl","trt"), each = 30))
t.test(value ~ group, data = df)
# 配對 t-test(同一受試者前後測)/ Paired
# t.test(before, after, paired = TRUE)
# 單尾 / One-sided
t.test(value ~ group, data = df, alternative = "less")
# 假設等變異數(傳統 Student's t)/ Equal variance
t.test(value ~ group, data = df, var.equal = TRUE)
# 取出特定欄位 / Pull out specific stats
res <- t.test(value ~ group, data = df)
res$p.value
res$estimate
res$conf.int
# 非常態 → Wilcoxon rank-sum (Mann-Whitney)
wilcox.test(value ~ group, data = df)
shapiro.test() 看 p < 0.05 表示「拒絕常態」,但樣本越大越容易拒絕。實務上 n > 30 時 t-test 已對非常態頗穩健(中央極限定理);極小樣本或極端偏態才需要 Wilcoxon。
Normality test: shapiro.test() with p < 0.05 rejects normality, but large n almost always rejects. In practice n > 30 makes t-test robust (CLT); use Wilcoxon only for tiny n or extreme skew.
三、ANOVA 與 Kruskal-Wallis
# 單因子 ANOVA / One-way ANOVA
fit <- aov(Sepal.Length ~ Species, data = iris)
summary(fit)
# 假設檢驗:殘差常態?變異數齊一?
# Diagnostic: residual normality? equal variance?
plot(fit, which = 1:2)
shapiro.test(residuals(fit)) # 殘差常態
bartlett.test(Sepal.Length ~ Species, data = iris) # 變異數齊一
# Post-hoc:哪些組間有差?/ Which pairs differ?
TukeyHSD(fit)
pairwise.t.test(iris$Sepal.Length, iris$Species,
p.adjust.method = "bonferroni")
# 變異數不齊用 Welch's ANOVA
oneway.test(Sepal.Length ~ Species, data = iris, var.equal = FALSE)
# 非常態 → Kruskal-Wallis
kruskal.test(Sepal.Length ~ Species, data = iris)
# Post-hoc: pairwise.wilcox.test or dunn.test::dunn.test
# 二因子 ANOVA:包含交互作用
# fit2 <- aov(yield ~ fertilizer * variety, data = farm)
# summary(fit2)
四、類別變數:chi-square 與 Fisher
# 列聯表 / Contingency table
# Disease Healthy
# Variant 30 10
# WT 50 90
tab <- matrix(c(30, 50, 10, 90), nrow = 2,
dimnames = list(genotype = c("Variant","WT"),
status = c("Disease","Healthy")))
tab
# Chi-square test (大樣本:每格期望次數 >= 5)
chisq.test(tab)
# Fisher's exact test (小樣本,期望次數 < 5)
fisher.test(tab)
fisher.test(tab)$estimate # odds ratio
fisher.test(tab)$conf.int # 95% CI for OR
# 從 data.frame 直接做表 / Build from data.frame
df <- data.frame(genotype = sample(c("V","WT"), 200, TRUE),
status = sample(c("D","H"), 200, TRUE))
table(df$genotype, df$status) |> chisq.test()
# 趨勢檢定(順序類別)/ Trend test
# prop.trend.test() for ordered categorical
五、相關係數與線性迴歸
## ----- 相關 / Correlation -----
cor(iris$Sepal.Length, iris$Sepal.Width) # Pearson
cor(iris$Sepal.Length, iris$Sepal.Width, method = "spearman") # 排名相關
cor.test(iris$Sepal.Length, iris$Petal.Length) # + p-value & CI
# 相關矩陣 / Correlation matrix
M <- iris[, 1:4]
cor(M)
cor(M, method = "spearman")
# 視覺化:corrplot::corrplot() 或 pheatmap()
## ----- 線性迴歸 / Linear regression -----
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)
summary(fit) # 看 Estimate / Std.Error / Pr(>|t|) / R-squared
# 公式語法 / Formula syntax
# y ~ x 單變數 / single predictor
# y ~ x1 + x2 多變數 (加性) / additive multivariate
# y ~ x1 * x2 含交互作用 / with interaction (== x1 + x2 + x1:x2)
# y ~ x1 + x2 + x1:x2 等同 x1 * x2
# y ~ . - x 全部當預測變數但排除 x
# 多變數 / Multivariate
fit2 <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)
summary(fit2)
# 取出資訊 / Extract pieces
coef(fit2) # 係數
confint(fit2) # 95% CI
fitted(fit2) # 預測值
residuals(fit2) # 殘差
broom::tidy(fit2) # 整潔的 data.frame 結果
# 預測新資料 / Predict
new_dat <- data.frame(Petal.Length = c(2, 4, 6),
Species = factor("setosa", levels = levels(iris$Species)))
predict(fit2, new_dat, interval = "confidence")
# 模型診斷 / Diagnostics — 4 張圖
par(mfrow = c(2, 2)); plot(fit2); par(mfrow = c(1, 1))
六、廣義線性模型 (GLM)
| 情境 | 分布 | 連結 | 範例 |
|---|---|---|---|
| 連續,常態 | gaussian | identity | 正常 lm() |
| 0/1 二元 | binomial | logit | 疾病 vs 健康預測 |
| 計次 | poisson / quasipoisson | log | 平均突變次數 |
| 過度離散計次 | negative binomial (MASS::glm.nb) | log | RNA-seq counts |
# 邏輯迴歸:以連續/類別變數預測二元結果 # Logistic regression df <- data.frame( age = c(35, 42, 51, 28, 60, 45, 55, 33), smoking = c(1, 0, 1, 0, 1, 1, 0, 1), disease = c(0, 0, 1, 0, 1, 1, 0, 0) ) fit <- glm(disease ~ age + smoking, data = df, family = binomial) summary(fit) # 解讀:把 coefficient 取 exp 變 odds ratio exp(coef(fit)) # OR per unit increase exp(confint(fit)) # OR 95% CI # 預測機率 / Predicted probabilities predict(fit, type = "response") # Poisson regression 範例 / Poisson example # fit_p <- glm(n_mutations ~ exposure + age, data = df, family = poisson) # Negative binomial(過度離散)/ Negative binomial (overdispersion) # library(MASS); fit_nb <- glm.nb(n_mutations ~ exposure + age, data = df)
七、多重檢定校正──生資的核心
同時對 20,000 個基因做 t-test,即使全部都是 H₀ 為真,也會有 20000 × 0.05 = 1000 個「假性顯著」。多重檢定校正就是處理這個問題。
- Bonferroni:把 α 除以檢定數。嚴格保證 FWER,但極度保守,生資少用。
- Benjamini-Hochberg (BH/FDR):控制「偽陽性比例 (False Discovery Rate)」。生資標準做法。
- Storey's q-value:類似 BH 但更敏感(
qvalue::qvalue())。
Run a t-test on 20,000 genes simultaneously: even if every H₀ is true, you'll get 20000 × 0.05 = 1000 "significant" hits by chance. Multiple testing correction fixes this.
- Bonferroni: divide α by # of tests. Guarantees FWER, but very conservative — rare in bio.
- Benjamini-Hochberg (BH/FDR): controls false discovery rate. The standard in bio.
- Storey's q-value: BH-like but more powerful (
qvalue::qvalue()).
set.seed(7)
# 模擬 1000 個基因的 p-value,其中 50 個真實 DEG
n <- 1000
n_true <- 50
pvals <- c(runif(n - n_true), runif(n_true, 0, 0.01)) # 假 H1 的 p-value 偏小
hist(pvals, breaks = 50)
# Bonferroni
pBonf <- p.adjust(pvals, method = "bonferroni")
sum(pBonf < 0.05)
# BH (FDR) — 生資標準
pBH <- p.adjust(pvals, method = "BH")
sum(pBH < 0.05)
# Holm(FWER 但比 Bonferroni 強)
pHolm <- p.adjust(pvals, method = "holm")
sum(pHolm < 0.05)
# 比較:可看出 BH 在 FDR 控制下保留更多 hit
cat("Method | Significant\n",
"Bonferroni |", sum(pBonf < 0.05), "\n",
"Holm |", sum(pHolm < 0.05), "\n",
"BH (FDR) |", sum(pBH < 0.05), "\n", sep = "")
八、Bonferroni vs FDR 視覺比較
調整真實 DEG 比例與 p-value 嚴格度,看看不同校正法剩下幾個顯著結果。
Adjust the true-positive fraction and threshold; see how each method behaves.
模擬:真實 H1 的 p ~ Beta(0.5,5);H0 為 Uniform(0,1)。
Simulated: true H1 p ~ Beta(0.5,5); H0 ~ Uniform(0,1).
📝 自我檢測
1. 你對 25,000 個基因做 t-test,原始 p < 0.05 有 1,800 個。下列敘述何者正確?
1. You ran t-tests on 25,000 genes; 1,800 have raw p < 0.05. Which is correct?
2. 想做配對 t-test(同人前後測),正確寫法?
2. Paired t-test (same subject before/after) — correct call?
3. 邏輯迴歸 glm(y ~ x, family=binomial) 的係數要如何解讀?
3. How to interpret coefficients in glm(y ~ x, family=binomial)?