I. Two kinds of "integration"
Expression-space integration
When samples, batches, or conditions are pooled for downstream analysis (spatial domain detection, deconvolution), batch effects must be corrected first. Harmony, scVI, and Seurat anchors are all viable, but watch for over-correction, which removes real biological differences along with the batch effect.
Physical alignment / 3D
Align serial sections along the z axis to reconstruct the tissue in 3D. The three mainstream methods, PASTE, STAligner, and STAIR, each have different strengths.
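To make the stacking step concrete, here is a minimal sketch of rigid (rotation + translation) alignment of one section onto the previous one, followed by assigning a z coordinate per section. It assumes the spot-to-spot correspondence between sections is already known and listed in matching order, which is the part that tools such as PASTE estimate; the function names and the 10-unit `z_spacing` are hypothetical and chosen only for illustration.

```python
import math

def rigid_align_2d(moving, fixed):
    """Least-squares rotation + translation mapping `moving` onto `fixed`.

    Both arguments are lists of (x, y) tuples in matching order
    (assumption: correspondences are already known).
    """
    n = len(moving)
    mcx = sum(p[0] for p in moving) / n
    mcy = sum(p[1] for p in moving) / n
    fcx = sum(p[0] for p in fixed) / n
    fcy = sum(p[1] for p in fixed) / n
    dot = cross = 0.0
    for (mx, my), (fx, fy) in zip(moving, fixed):
        ax, ay = mx - mcx, my - mcy      # centered moving point
        bx, by = fx - fcx, fy - fcy      # centered fixed point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)       # optimal rotation angle in 2D
    c, s = math.cos(theta), math.sin(theta)
    aligned = [
        (c * (mx - mcx) - s * (my - mcy) + fcx,
         s * (mx - mcx) + c * (my - mcy) + fcy)
        for mx, my in moving
    ]
    return aligned, theta

def stack_sections(sections, z_spacing):
    """Align each section to the previously aligned one and attach z = k * z_spacing."""
    stacked = []
    reference = sections[0]
    for k, sec in enumerate(sections):
        if k > 0:
            sec, _ = rigid_align_2d(sec, reference)
        stacked.extend((x, y, k * z_spacing) for x, y in sec)
        reference = sec
    return stacked

def rotate_shift(points, degrees, dx, dy):
    """Helper that simulates a section placed differently on the slide."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]

section_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
section_b = rotate_shift(section_a, 30, 5.0, -3.0)   # same spots, rotated and shifted

aligned, theta = rigid_align_2d(section_b, section_a)
stacked = stack_sections([section_a, section_b], z_spacing=10.0)
print(round(math.degrees(theta), 3), len(stacked))
```

Real sections deform, tear, and differ in spot count, so a rigid fit with known correspondences is only the simplest case; it is the geometric core that alignment methods build on.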
II. Comparison of mainstream tools
| Tool | Task | Principle | Notes |
|---|---|---|---|
| Harmony | batch | Iteratively corrects PCA embeddings | Originated in scRNA-seq; on ST data, strong batch correction but weak preservation of biological signal |
| Seurat anchors | batch | CCA + MNN-style anchors | Integrates best with the Seurat workflow |
| scVI / scANVI | batch | VAE; handles multiple covariates at once | GPU recommended; scales well to large datasets |
| PASTE / PASTE2 | 3D | Optimal transport between adjacent slices | Requires geometrically similar slices; poorly suited to different conditions or different individuals |
| STAligner | both | GNN + triplet loss; robust across platforms and conditions | Nat Comput Sci 2023; recommended for multi-condition integration |
| STAIR | both + 3D | End-to-end: batch correction → 2D alignment → 3D reconstruction | Genome Biology 2025; newest all-in-one option |
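As a rough illustration of what "optimal transport between adjacent slices" means, the sketch below computes an entropy-regularized transport plan (Sinkhorn iterations) between the spots of two tiny slices, using only expression dissimilarity as the cost. This is a simplified stand-in, not PASTE's actual objective, which also accounts for spatial distances within each slice; the toy expression vectors, the `reg` value, and the function name are assumptions made for the example.

```python
import math

def sinkhorn(cost, reg=0.1, n_iter=500):
    """Entropy-regularized optimal transport with uniform spot weights.

    cost[i][j] is the dissimilarity between spot i of slice A and spot j of slice B.
    Returns a transport plan whose entries sum to 1.
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n                     # uniform mass on slice A spots
    b = [1.0 / m] * m                     # uniform mass on slice B spots
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # alternately rescale rows and columns to match the marginals
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy two-gene expression profiles; slice B holds the same spots in a different order, slightly perturbed
expr_a = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
expr_b = [(0.1, 0.9), (0.5, 0.6), (0.9, 0.0)]

# Squared Euclidean distance between expression profiles as the transport cost
cost = [[sum((x - y) ** 2 for x, y in zip(ea, eb)) for eb in expr_b] for ea in expr_a]

plan = sinkhorn(cost)
# For each spot in slice A, the slice B spot that receives most of its mass
matches = [max(range(len(row)), key=row.__getitem__) for row in plan]
print(matches)
```

The plan is a soft mapping: each row spreads a spot's mass over candidate partners in the other slice. When the slices come from different conditions or individuals, neither expression nor geometry need correspond, which is why the table flags that case as a poor fit for this family of methods.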
Interactive: batch correction of two sections
Left: the two sections separate clearly in PCA space (batch dominates). Drag the "correction strength" slider to watch the two batches mix, and check whether the biological clusters remain distinguishable.
Circles: cells; outline = batch; fill = cell type
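The effect of the "correction strength" control can be imitated with a small sketch. It uses naive per-batch mean centering, which is not what Harmony, scVI, or Seurat anchors do internally, and every number in it is invented for illustration: two cell types sit 4 units apart on x, two batches sit 3 units apart on y, and the batches have unequal cell-type composition (8:2 versus 2:8).

```python
import math

def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def make_cells():
    """Toy 2D embedding: x encodes cell type, y encodes batch (hypothetical values)."""
    layout = {("b1", "A"): 8, ("b1", "B"): 2, ("b2", "A"): 2, ("b2", "B"): 8}
    type_x = {"A": -2.0, "B": 2.0}
    batch_y = {"b1": 1.5, "b2": -1.5}
    return [
        {"xy": (type_x[ctype], batch_y[batch]), "batch": batch, "cell_type": ctype}
        for (batch, ctype), n in layout.items()
        for _ in range(n)
    ]

def correct(cells, strength):
    """Shift each batch centroid toward the global centroid by `strength` (0 = none, 1 = full)."""
    gx, gy = centroid([c["xy"] for c in cells])
    batch_centroids = {
        b: centroid([c["xy"] for c in cells if c["batch"] == b])
        for b in {c["batch"] for c in cells}
    }
    corrected = []
    for c in cells:
        bx, by = batch_centroids[c["batch"]]
        x, y = c["xy"]
        corrected.append({**c, "xy": (x - strength * (bx - gx), y - strength * (by - gy))})
    return corrected

def gap(cells, key):
    """Distance between the centroids of the two groups defined by `key`."""
    first, second = sorted({c[key] for c in cells})
    c0 = centroid([c["xy"] for c in cells if c[key] == first])
    c1 = centroid([c["xy"] for c in cells if c[key] == second])
    return math.hypot(c0[0] - c1[0], c0[1] - c1[1])

cells = make_cells()
for strength in (0.0, 0.5, 1.0):
    out = correct(cells, strength)
    print(strength, round(gap(out, "batch"), 3), round(gap(out, "cell_type"), 3))

corrected_full = correct(cells, 1.0)
```

At full strength the batch centroids coincide, but the cell-type gap falls to about 2.56, below the 4.0 built into the toy data. Because the batches differ in composition, part of the batch mean is biology, and removing it is exactly the over-correction to watch for.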
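Implementation

    # Seurat v5: multi-sample integration (R)
    library(Seurat)
    library(harmony)

    combined <- merge(vis1, y = list(vis2, vis3), add.cell.ids = c("s1", "s2", "s3"))
    # RunHarmony needs a "sample" metadata column; here it is derived from the cell-ID prefix
    combined$sample <- sub("_.*$", "", colnames(combined))
    combined <- SCTransform(combined, assay = "Spatial") |> RunPCA()
    # harmony >= 1.0 names this argument reduction.use (older releases used reduction)
    combined <- RunHarmony(combined, group.by.vars = "sample", reduction.use = "pca")
    combined <- RunUMAP(combined, reduction = "harmony", dims = 1:30)
    combined <- FindNeighbors(combined, reduction = "harmony", dims = 1:30) |> FindClusters()
    SpatialDimPlot(combined, label = TRUE)

    # scVI: multi-batch latent space (Python); expects raw counts in adata.X
    import scvi
    scvi.model.SCVI.setup_anndata(adata, batch_key="sample")
    mod = scvi.model.SCVI(adata)
    mod.train()
    adata.obsm["X_scvi"] = mod.get_latent_representation()

    # STAligner: cross-slice integration + alignment
    # Schematic call kept as in the source. The exact arguments depend on the STAligner
    # version (its tutorials build a spatial graph per slice and train on one concatenated
    # AnnData), so check the package documentation before running this line.
    import STAligner
    adata_concat = STAligner.train_STAligner(adata_list=[a1, a2, a3], n_epochs=600)

    # PASTE: align adjacent slices, then stack them for 3D
    import paste as pst
    pi12 = pst.pairwise_align(a1, a2)                       # spot-to-spot transport plan, slice 1 -> slice 2
    aligned = pst.stack_slices_pairwise([a1, a2], [pi12])   # slices with coordinates in a common frame
    center, pis = pst.center_align(a1, [a1, a2, a3])        # a1 initializes the center slice; returns it plus the mappings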
📝 Self-check
1. In which situation is PASTE not appropriate?
2. According to the 2025 ST batch-correction benchmark, what characterizes Harmony?
3. Which tool handles batch correction, alignment, and 3D reconstruction end to end?