STEP 9 / 15

Integrating with scRNA-seq: mapping single-cell annotations back into space

Deconvolution gives per-spot proportions; integration tells you where in the tissue each sc cell most likely sits.

I. Three integration tasks

🏷️

Label transfer

Project scRNA-seq cell-type labels onto spots, either as a single top-1 label per spot or as weighted scores across cell types. Seurat's FindTransferAnchors is the classic approach.
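To make the top-1 versus weighted distinction concrete, here is a minimal sketch that assumes label transfer has already produced a spots × cell-types score matrix. The values and the names `scores` and `cell_types` are made up for illustration; this is not Seurat's API.

```python
import numpy as np

# Hypothetical prediction scores: rows are spots, columns are cell types.
cell_types = ["T_cell", "B_cell", "Fibroblast"]
scores = np.array([
    [0.70, 0.20, 0.10],
    [0.10, 0.30, 0.60],
    [0.40, 0.45, 0.15],
])

# Top-1: each spot receives only its best-scoring label.
top1 = [cell_types[i] for i in scores.argmax(axis=1)]

# Weighted: keep the whole per-spot distribution, normalized to sum to 1.
weights = scores / scores.sum(axis=1, keepdims=True)

print(top1)
print(weights.round(2))
```

The third spot shows why the weighted view can matter: its top-1 label hides a nearly equal second candidate.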

🗺️

Cell mapping

Tangram: uses deep learning to estimate a probability matrix that maps each sc cell to a voxel/spot. The same mapping can also be used for gene imputation.
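The idea of a cell-to-spot probability matrix, and why it yields imputation for free, can be sketched with plain NumPy. This is a toy under stated assumptions: the matrix `M` below is a random stand-in rather than a trained mapping, and all names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_spots, n_genes = 5, 3, 4

# Toy sc expression (cells x genes); treat the last gene as one that was
# not measured in the spatial assay.
X_sc = rng.poisson(2.0, size=(n_cells, n_genes)).astype(float)

# Stand-in for a learned mapping: a softmax over spots, so each cell's row
# is a probability distribution across spots. A real tool optimizes these logits.
logits = rng.normal(size=(n_cells, n_spots))
M = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Pushing sc expression through the mapping gives a spots x genes matrix,
# including a predicted spatial profile for the unmeasured gene.
X_st_pred = M.T @ X_sc
imputed_gene = X_st_pred[:, -1]

print(M.round(2))
print(imputed_gene.round(2))
```

Because each row of `M` sums to 1, the projection redistributes every cell's expression across spots without creating or losing counts.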

🎯

High-res placement

CytoSPACE: assigns sc cells one-to-one to ST spots, yielding a spatial map at single-cell resolution; it tolerates noise well.
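A one-to-one assignment can be illustrated on a toy problem: expand each spot into as many slots as it is estimated to hold cells, then pick the cell-to-slot matching with the lowest total cost. The sketch below brute-forces three cells with the standard library only. It is not CytoSPACE's solver, and the capacities and costs are invented for illustration.

```python
from itertools import permutations

# Assumed number of cells per spot; each spot is expanded into that many slots.
spot_capacity = {"A": 2, "B": 1}
slots = [spot for spot, k in spot_capacity.items() for _ in range(k)]

# Hypothetical dissimilarity between each sc cell and each spot
# (lower means the expression profiles agree better).
cost = {
    "cell0": {"A": 0.1, "B": 0.9},
    "cell1": {"A": 0.8, "B": 0.2},
    "cell2": {"A": 0.3, "B": 0.7},
}
cells = list(cost)

def total_cost(slot_order):
    """Sum of costs when cells[i] is placed in slot_order[i]."""
    return sum(cost[c][s] for c, s in zip(cells, slot_order))

# Exhaustive search is only feasible for toy sizes; real data needs a
# dedicated assignment solver.
best = min(permutations(slots), key=total_cost)
placement = dict(zip(cells, best))

print(placement, round(total_cost(best), 2))
```

Every cell ends up in exactly one slot and every slot holds exactly one cell, which is what distinguishes this from a probabilistic mapping.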

II. Mainstream tools

| Tool | Task | Features |
| --- | --- | --- |
| Seurat anchors | label transfer | Seamless with the scRNA-seq workflow; largest community |
| Tangram | cell mapping + imputation | PyTorch-based; also imputes missing genes; works on MERFISH/STARmap/Visium |
| CytoSPACE | 1-to-1 placement | 2023 Nat Biotech; outperforms prior methods; noise-robust |
| SpaGE | gene imputation | Imputes off-panel genes for image-based data |
| STEM | transfer learning | 2024 Comm Bio; deep transfer learning for mapping + label transfer |

III. Standard integration workflow

Clean up the sc reference

Use a reference from the right tissue, annotated at the right granularity. Clusters that are too fine may not match what the ST resolution can distinguish.
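One simple way to bring an over-split reference down to a granularity the spatial data can support is to collapse fine clusters into coarser labels before integration. The mapping below is a hypothetical example, not a recommended taxonomy.

```python
from collections import Counter

# Hypothetical fine-grained labels, one per sc cell.
fine_labels = ["CD4_naive", "CD4_memory", "CD8_effector",
               "B_naive", "B_memory", "CD4_naive"]

# Hand-curated (assumed) mapping from fine clusters to coarser cell types.
coarse_of = {
    "CD4_naive": "T_cell",
    "CD4_memory": "T_cell",
    "CD8_effector": "T_cell",
    "B_naive": "B_cell",
    "B_memory": "B_cell",
}

coarse_labels = [coarse_of[label] for label in fine_labels]
counts = Counter(coarse_labels)

print(counts)
```

Checking the resulting counts also helps spot cell types that are too rare in the reference to map reliably.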

Select shared genes

Tangram recommends roughly 100 markers per cell type (about 1k genes in total) as the training set.
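The selection logic amounts to taking the top-N markers of each cell type, pooling them, and keeping only genes that are also present in the spatial data. A minimal sketch with made-up ranked lists follows; the helper `training_genes` is hypothetical, and N is set to 3 only to keep the toy readable.

```python
# Hypothetical ranked marker lists per cell type (best marker first).
ranked_markers = {
    "T_cell": ["CD3E", "CD3D", "IL7R", "PTPRC"],
    "B_cell": ["MS4A1", "CD79A", "PTPRC", "CD74"],
}

# Genes actually measured in the spatial dataset (assumed).
st_genes = {"CD3E", "CD3D", "MS4A1", "CD79A", "PTPRC"}

def training_genes(ranked, n_top, shared):
    """Union of the top-n markers per cell type, restricted to shared genes."""
    picked = set()
    for genes in ranked.values():
        picked.update(genes[:n_top])
    return sorted(picked & shared)

genes = training_genes(ranked_markers, n_top=3, shared=st_genes)
print(genes)
```

Here `IL7R` is a top marker but is dropped because the spatial assay did not measure it.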

Train / align

For Tangram, "constrained" mode, which takes an estimate of the number of cells in each spot, helps on Visium. CytoSPACE instead solves the assignment directly with an ILP optimizer.
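Per-spot cell-count estimates usually enter a constrained mapping as two quantities: the total number of cells to place and the fraction of cells expected in each spot. A minimal sketch, assuming the counts come from something like nuclei segmentation (the values here are invented):

```python
import numpy as np

# Hypothetical per-spot cell-count estimates, e.g. from nuclei segmentation.
cell_count = np.array([3, 0, 5, 2], dtype=float)

# Total number of cells expected across all spots.
target_count = cell_count.sum()

# Fraction of those cells expected in each spot; sums to 1.
density_prior = cell_count / target_count

print(target_count, density_prior)
```

A spot with an estimate of zero gets zero prior mass, so empty regions of the slide are not filled with cells.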

Validate

Compare imputed markers with their measured (ground-truth) counterparts, and check whether the cell types assigned to each spatial domain agree with the known anatomy.
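One way to quantify the first check is a per-gene similarity between the measured and imputed spatial profiles, ideally on genes held out from training. The sketch below uses cosine similarity on invented vectors; the gene names and values are placeholders.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two spatial profiles (one value per spot)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical measured vs. imputed expression across four spots.
measured = {"GeneA": [5, 0, 1, 4], "GeneB": [0, 3, 3, 0]}
imputed = {"GeneA": [4, 1, 1, 5], "GeneB": [3, 0, 0, 3]}

scores = {g: cosine_similarity(measured[g], imputed[g]) for g in measured}
for gene, score in scores.items():
    print(gene, round(score, 3))
```

In this toy, GeneA is recovered well while GeneB is placed in exactly the wrong spots, the kind of failure that a visual check against anatomy should also reveal.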

Hands-on

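# Seurat label transfer (R); `sc` is the annotated reference, `vis` the Visium object
library(Seurat)
anchors <- FindTransferAnchors(reference = sc, query = vis, normalization.method = "SCT")
preds <- TransferData(anchorset = anchors, refdata = sc$celltype,
                       prediction.assay = TRUE, weight.reduction = vis[["pca"]],
                       dims = 1:30)  # assumes PCA on `vis` has at least 30 components
vis[["predictions"]] <- preds
DefaultAssay(vis) <- "predictions"  # plot prediction scores rather than expression
# Seurat replaces "_" with "-" in feature names, so a "T_cell" label is stored as "T-cell"
SpatialFeaturePlot(vis, features = c("T-cell", "B-cell"))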
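# Tangram cell mapping (Visium, Python)
import numpy as np
import scanpy as sc
import tangram as tg
# Top 100 markers per cell type (assumes labels are in adata_sc.obs["celltype"])
sc.tl.rank_genes_groups(adata_sc, groupby="celltype", use_raw=False)
names = adata_sc.uns["rank_genes_groups"]["names"]
markers = sorted({g for ct in names.dtype.names for g in names[ct][:100]})
tg.pp_adatas(adata_sc, adata_st, genes=markers)
# Constrained mode needs target_count; assumes per-spot estimates in adata_st.obs["cell_count"]
cell_count = np.array(adata_st.obs["cell_count"], dtype=float)
ad_map = tg.map_cells_to_space(adata_sc, adata_st, mode="constrained", target_count=cell_count.sum(),
                                density_prior=cell_count / cell_count.sum(), num_epochs=500)
ad_proj = tg.project_genes(ad_map, adata_sc)  # impute genes absent from the spatial data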

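# CytoSPACE (one-to-one assignment of sc cells to spots)
# CytoSPACE is typically run as a command-line tool on tab-delimited input files.
# The file names below are placeholders; confirm the flags with `cytospace --help`:
#   cytospace --scRNA-path sc_counts.txt --cell-type-path sc_celltypes.txt \
#             --st-path st_counts.txt --coordinates-path st_coords.txt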

📝 Self-check

1. Which tool best fits "I want a single-cell-resolution spatial map"?

A. RCTD
B. cell2location
C. CytoSPACE
D. Moran's I

2. Beyond cell mapping, what else can Tangram do?

A. Gene imputation (filling in genes that were not measured)
B. Automatic segmentation
C. Ligand-receptor inference
D. Batch-effect removal

3. What is the most efficient way to pick training genes for Tangram?

A. Pass in all genes
B. Top markers per cell type (~1k total)
C. 100 random genes
D. Only housekeeping genes