I. Why do we need deconvolution?
On spot-based platforms, each spot's expression vector is approximately a linear mixture of the expression of the cells it covers. Clustering spot expression directly therefore yields "impure" clusters: at a tumor margin, for example, a single spot can mix tumor cells, macrophages, and endothelial cells.
Deconvolution aims, given cell-type signatures from an scRNA-seq reference, to estimate the proportion (or abundance) of each cell type in every spot. The output is usually a spot × cell-type matrix.
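The linear-mixture idea can be sketched with a toy example. This is a minimal illustration only: it uses plain non-negative least squares (solved by projected gradient descent) as a stand-in, which is not the model used by any of the methods listed on this page, and the signature values, cell-type names, and function name are made up for the demonstration.

```python
import numpy as np

def nnls_projected_gradient(S, y, n_iter=500):
    """Minimise 0.5 * ||S w - y||^2 subject to w >= 0 by projected gradient descent."""
    StS, Sty = S.T @ S, S.T @ y
    # Step size 1/L, where L is the largest eigenvalue of S^T S
    step = 1.0 / np.linalg.eigvalsh(StS).max()
    w = np.zeros(S.shape[1])
    for _ in range(n_iter):
        # Gradient step, then clip negative entries to zero
        w = np.maximum(w - step * (StS @ w - Sty), 0.0)
    return w

# Hypothetical signature matrix: genes (rows) x cell types (columns)
cell_types = ["tumor", "macrophage", "endothelial"]
signatures = np.array([
    [10.0, 0.5, 0.2],   # tumor markers
    [ 8.0, 0.3, 0.1],
    [ 0.4, 9.0, 0.3],   # macrophage markers
    [ 0.2, 7.0, 0.5],
    [ 0.3, 0.4, 6.0],   # endothelial markers
    [ 0.1, 0.6, 8.0],
])

true_counts = np.array([5.0, 3.0, 2.0])   # cells of each type inside the spot
spot = signatures @ true_counts           # noiseless linear mixture

abundance = nnls_projected_gradient(signatures, spot)   # estimated cells per type
proportion = abundance / abundance.sum()                # normalised to sum to 1

for name, p in zip(cell_types, proportion):
    print(f"{name}: {p:.3f}")
```

Because the mixture here is noiseless and the signatures are well separated, the estimate should land very close to the true 0.5 / 0.3 / 0.2 split. Real data add count noise, platform effects, and imperfect references, which is what the dedicated methods model explicitly.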
II. Five mainstream methods
| Method | Principle | Platform fit | GPU |
|---|---|---|---|
| cell2location | Bayesian probabilistic model | Visium / Slide-seq / Stereo-seq; often ranked #1 across benchmarks | ✓ |
| RCTD | Probabilistic likelihood (Poisson-lognormal) | Visium / Slide-seq; fast, CPU-parallel | ✗ |
| CARD | Conditional autoregressive model with spatial smoothing | Ranked #1 overall on Visium in a 2023 Nat Comm benchmark | ✗ |
| SPOTlight | Seeded NMF | Strong on Visium, easy to interpret | ✗ |
| Tangram | Deep learning; maps sc cells onto spots | Suited to image-based data; performs mapping and imputation jointly | ✓ |
[Interactive figure: a raw spot decomposed into cell-type proportions. Left: mixed spot expression. Right: estimated cell-type proportions after deconvolution (pie chart). A slider adds marker noise to show how stability degrades when the reference is of low quality.]
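The noise experiment described for the interactive demo can be approximated offline. The sketch below perturbs a toy reference with multiplicative log-normal noise and measures how far the recovered proportions drift from the truth. Everything in it is an assumption made for illustration: the signature values, the noise model, the error metric, and the use of non-negative least squares in place of a real deconvolution method.

```python
import numpy as np

def nnls_projected_gradient(S, y, n_iter=500):
    """Minimise 0.5 * ||S w - y||^2 subject to w >= 0 by projected gradient descent."""
    StS, Sty = S.T @ S, S.T @ y
    step = 1.0 / np.linalg.eigvalsh(StS).max()
    w = np.zeros(S.shape[1])
    for _ in range(n_iter):
        w = np.maximum(w - step * (StS @ w - Sty), 0.0)
    return w

rng = np.random.default_rng(0)

# Hypothetical "true" signatures: genes x cell types
signatures = np.array([
    [10.0, 0.5, 0.2],
    [ 8.0, 0.3, 0.1],
    [ 0.4, 9.0, 0.3],
    [ 0.2, 7.0, 0.5],
    [ 0.3, 0.4, 6.0],
    [ 0.1, 0.6, 8.0],
])
true_prop = np.array([0.5, 0.3, 0.2])
spot = signatures @ (10 * true_prop)   # spot built from the true signatures

def proportion_error(noise_sd, n_rep=50):
    """Mean absolute proportion error when the reference carries log-normal noise."""
    errs = []
    for _ in range(n_rep):
        # Multiplicative noise on every reference entry (noise_sd = 0 leaves it unchanged)
        noisy_ref = signatures * np.exp(noise_sd * rng.standard_normal(signatures.shape))
        w = nnls_projected_gradient(noisy_ref, spot)
        errs.append(np.abs(w / w.sum() - true_prop).mean())
    return float(np.mean(errs))

noise_levels = [0.0, 0.25, 0.5, 1.0]
errors = {sd: proportion_error(sd) for sd in noise_levels}
for sd, err in errors.items():
    print(f"noise sd {sd:.2f}: mean abs. proportion error {err:.4f}")
```

With zero noise the error is essentially zero; how quickly it grows with noise depends on the toy signatures and should not be read as a benchmark of any real method.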
III. Common pitfalls
- The reference must come from a similar tissue. Deconvolving a brain section with a cardiac scRNA-seq reference is a disaster.
- Cell types absent from the reference are invisible. Deconvolution can only output types that exist in the reference.
- Distinguish "proportion" from "abundance". cell2location outputs abundance (the estimated number of cells per spot); RCTD and SPOTlight mostly output proportions. Keep the two apart when correlating with markers downstream.
- Rare cell types are easily zeroed out. To check whether a rare type is truly absent from a region, cross-check against spatial domains or use Tangram to map sc cells directly.
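The "absent from the reference" pitfall can be made concrete with a toy example. In the sketch below, a spot contains three cell types but the reference is missing one of them; the solver then silently reassigns that signal to the remaining types. This uses non-negative least squares as a stand-in for a real method, and all values and names are hypothetical.

```python
import numpy as np

def nnls_projected_gradient(S, y, n_iter=500):
    """Minimise 0.5 * ||S w - y||^2 subject to w >= 0 by projected gradient descent."""
    StS, Sty = S.T @ S, S.T @ y
    step = 1.0 / np.linalg.eigvalsh(StS).max()
    w = np.zeros(S.shape[1])
    for _ in range(n_iter):
        w = np.maximum(w - step * (StS @ w - Sty), 0.0)
    return w

def relative_residual(S, w, y):
    """Share of the spot signal that the fitted mixture fails to explain."""
    return np.linalg.norm(S @ w - y) / np.linalg.norm(y)

# Hypothetical signatures: genes x (tumor, macrophage, endothelial)
signatures = np.array([
    [10.0, 0.5, 0.2],
    [ 8.0, 0.3, 0.1],
    [ 0.4, 9.0, 0.3],
    [ 0.2, 7.0, 0.5],
    [ 0.3, 0.4, 6.0],
    [ 0.1, 0.6, 8.0],
])
true_counts = np.array([5.0, 3.0, 2.0])
spot = signatures @ true_counts

# Complete reference: all three types can be recovered
full = nnls_projected_gradient(signatures, spot)
res_full = relative_residual(signatures, full, spot)

# Incomplete reference: macrophages (column 1) are missing
keep = [0, 2]
reduced_ref = signatures[:, keep]
reduced = nnls_projected_gradient(reduced_ref, spot)
res_reduced = relative_residual(reduced_ref, reduced, spot)

print("full reference   :", np.round(full, 3), "residual", round(res_full, 4))
print("macrophage absent:", np.round(reduced, 3), "residual", round(res_reduced, 4))
```

The reduced fit still returns a tidy-looking answer, with tumor and endothelial abundances inflated and no hint of macrophages in the output itself. In this toy setting the unexplained residual is the only signal that something is missing, which is one reason to inspect fit quality rather than trust the proportions alone.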
Implementation
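R: RCTD (spacexr) and CARD

    # RCTD (spacexr)
    library(spacexr)
    # sc_cell_types: named factor of cell-type labels (names = cell barcodes)
    ref  <- Reference(sc_counts, sc_cell_types)
    puck <- SpatialRNA(coords, vis_counts)
    rctd <- create.RCTD(puck, ref, max_cores = 8)
    rctd <- run.RCTD(rctd, doublet_mode = "full")
    weights <- rctd@results$weights           # spot × cell-type (unnormalized)
    weights <- normalize_weights(weights)     # rescale each spot to sum to 1

    # CARD
    library(CARD)
    card <- createCARDObject(
      sc_count = sc_counts, sc_meta = sc_meta,
      spatial_count = vis_counts, spatial_location = coords,
      ct.varname = "cellType", ct.select = unique(sc_meta$cellType),
      sample.varname = "sampleID",
      minCountGene = 100, minCountSpot = 5)
    card <- CARD_deconvolution(CARD_object = card)
    props <- card@Proportion_CARD             # spot × cell-type proportions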
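Python: cell2location

    import numpy as np
    import squidpy as sq
    import cell2location as c2l

    # 1. Learn cell-type signatures from the scRNA-seq reference
    c2l.models.RegressionModel.setup_anndata(adata_sc, labels_key="cell_type", batch_key="sample")
    mod = c2l.models.RegressionModel(adata_sc)
    mod.train(max_epochs=300)
    adata_sc = mod.export_posterior(adata_sc)

    # 2. Use the signatures to deconvolve the spots
    cell_types = list(adata_sc.uns["mod"]["factor_names"])
    inf_aver = adata_sc.varm["means_per_cluster_mu_fg"][
        [f"means_per_cluster_mu_fg_{ct}" for ct in cell_types]
    ].copy()
    inf_aver.columns = cell_types

    # Restrict both objects to the genes they share
    shared = np.intersect1d(adata_st.var_names, inf_aver.index)
    adata_st = adata_st[:, shared].copy()
    inf_aver = inf_aver.loc[shared, :].copy()

    c2l.models.Cell2location.setup_anndata(adata_st, batch_key="sample")
    mod = c2l.models.Cell2location(adata_st, cell_state_df=inf_aver,
                                   N_cells_per_location=8, detection_alpha=20)
    mod.train(max_epochs=3000)
    adata_st = mod.export_posterior(adata_st)

    # The 5% quantile of abundance is stored in .obsm; copy it to .obs for plotting
    adata_st.obs[cell_types] = adata_st.obsm["q05_cell_abundance_w_sf"]
    sq.pl.spatial_scatter(adata_st, color=cell_types)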
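Once a spot × cell-type matrix is available, a few lines of post-processing make abundance and proportion outputs comparable. The sketch below assumes a small hypothetical abundance matrix (it is not the output of any run above) and shows row-normalisation to proportions, the dominant cell type per spot, and Shannon entropy as a rough "mixedness" score. The function names are illustrative.

```python
import numpy as np

cell_types = np.array(["tumor", "macrophage", "endothelial"])

# Hypothetical abundance matrix: spots (rows) x cell types (columns),
# read as estimated cells per spot
abundance = np.array([
    [8.0, 1.0, 1.0],   # mostly tumor
    [2.0, 2.0, 4.0],   # mixed spot
    [0.0, 5.0, 0.0],   # pure macrophage
])

def to_proportions(mat):
    """Row-normalise abundances so each spot sums to 1 (all-zero rows stay zero)."""
    totals = mat.sum(axis=1, keepdims=True)
    return np.divide(mat, totals, out=np.zeros_like(mat, dtype=float), where=totals > 0)

def shannon_entropy(p):
    """Per-spot entropy of the proportions; 0 means a single cell type."""
    safe = np.where(p > 0, p, 1.0)   # log(1) = 0, so zero entries contribute nothing
    return -(p * np.log(safe)).sum(axis=1)

proportions = to_proportions(abundance)
dominant = cell_types[proportions.argmax(axis=1)]
entropy = shannon_entropy(proportions)

for i, (d, h) in enumerate(zip(dominant, entropy)):
    print(f"spot {i}: dominant = {d}, entropy = {h:.3f}")
```

Normalising discards the total cell count per spot, so keep the original abundance matrix when that number matters, for example when relating estimates to tissue density.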
📝 Self-check
1. What happens if a cell type present in a spot is missing from the reference?
2. You want a CPU-only, fast, and robust starting point. Which method would you pick?
3. What is the key difference between the main outputs of cell2location and RCTD?