Step 6: Spatial Domains — Spatial Transcriptomics Tutorial

Concepts

I. What is a "spatial domain"?

A spatial domain is a concept specific to spatial transcriptomics (ST): a set of spots that have similar expression profiles and form a contiguous patch of tissue. In the cortex, domains correspond to the cortical layers L1–L6; in the kidney, to glomeruli versus tubules; in a tumor, to the core, margin, and stroma.

How does this differ from scRNA-seq clusters? scRNA-seq clustering sees only expression similarity, so spatial contiguity cannot even be defined. In ST, the same cell type can play different roles in different spatial regions, so substituting cell types for spatial domains loses key information.
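The "spatially contiguous" half of the definition can be made concrete with a toy score. The sketch below is illustrative only: the function name `knn_label_agreement`, the 6x6 grid, and the two labelings are assumptions made for this example, not part of any ST package. It measures how often a spot's nearest neighbors carry the same label as the spot itself.

```python
import math

def knn_label_agreement(coords, labels, k=4):
    """Mean fraction of each spot's k nearest neighbors that share its label."""
    total = 0.0
    for i, (xi, yi) in enumerate(coords):
        # Sort all other spots by Euclidean distance to spot i.
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )
        neighbors = [j for _, j in dists[:k]]
        same = sum(labels[j] == labels[i] for j in neighbors)
        total += same / len(neighbors)
    return total / len(coords)

# Toy 6x6 grid of spots (hypothetical data).
coords = [(x, y) for y in range(6) for x in range(6)]
layered = [y // 2 for (x, y) in coords]         # three horizontal bands
scattered = [(x + y) % 3 for (x, y) in coords]  # the same three labels, interleaved

layered_score = knn_label_agreement(coords, layered)
scattered_score = knn_label_agreement(coords, scattered)
print(f"layered: {layered_score:.2f}, scattered: {scattered_score:.2f}")
```

Both labelings use three labels, but only the banded one behaves like a spatial domain: its agreement score is far higher than that of the interleaved labeling.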

Mainstream methods

II. Five mainstream algorithms

Tool	Principle	Language	Strengths / weaknesses
BayesSpace	Bayesian model with a Markov random field prior that enforces spatial smoothness	R	Robust on Visium and the original ST platform; the Markov assumption can over-smooth
SpaGCN	GCN that uses H&E image color together with expression as features	Python	Integrates H&E; strong on Stereo-seq
STAGATE	Adaptive graph-attention autoencoder	Python	Second to GraphST on Visium; also friendlier on STARmap
GraphST	Self-supervised contrastive GNN	Python	Top method on Visium in the 2025 benchmark (mean ARI 0.55)
BANKSY	Concatenates self, neighbor-mean, and neighbor-AGF (azimuthal Gabor filter) features	R / Python	Very fast, no GPU needed, runs in Seurat v5 through SeuratWrappers
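The BANKSY row describes concatenating each spot's own expression with a summary of its neighborhood. A minimal sketch of that idea follows, assuming a plain k-nearest-neighbor mean and omitting the AGF term entirely. The function name `neighbor_augmented_features`, the sqrt(1 - lam) and sqrt(lam) scaling, and the toy data are assumptions for illustration, not the package's implementation.

```python
import math

def neighbor_augmented_features(expr, coords, lam=0.2, k=4):
    """Concatenate own expression with the mean expression of the k nearest spots.

    The own block is scaled by sqrt(1 - lam) and the neighbor block by
    sqrt(lam), so a larger lam gives the neighborhood more influence.
    """
    w_self, w_nbr = math.sqrt(1.0 - lam), math.sqrt(lam)
    augmented = []
    for i, (xi, yi) in enumerate(coords):
        # Indices of all other spots, ordered by distance to spot i.
        order = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: math.hypot(xi - coords[j][0], yi - coords[j][1]),
        )
        nbrs = order[:k]
        n_genes = len(expr[i])
        nbr_mean = [sum(expr[j][g] for j in nbrs) / len(nbrs) for g in range(n_genes)]
        augmented.append(
            [w_self * v for v in expr[i]] + [w_nbr * v for v in nbr_mean]
        )
    return augmented

# Four spots on a line, two genes (hypothetical data).
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
expr = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

features = neighbor_augmented_features(expr, coords, lam=0.2, k=2)
no_neighbors = neighbor_augmented_features(expr, coords, lam=0.0, k=2)
```

With `lam=0.0` the neighbor block is all zeros and the result reduces to ordinary expression-only features; clustering the augmented matrix with a nonzero `lam` is what pulls neighboring spots toward the same cluster.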

⚠️ Key takeaway: the 2025 NAR benchmark shows that no method is best on every platform. GraphST and BayesSpace lead on Visium; SpaGCN and Louvain lead on Stereo-seq; on Slide-seqV2 and STARmap every method's performance drops markedly, so results there need extra caution.
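The benchmark figures quoted here are adjusted Rand index (ARI) values, which compare predicted domains against a reference annotation. For readers who want to see what the number measures, here is a small pure-Python sketch of the standard ARI formula; the function name and the toy label vectors are assumptions for this example.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(truth, pred):
    """Adjusted Rand index between two labelings of the same spots."""
    n = len(truth)
    if n < 2:
        return 1.0  # a single spot cannot be split or merged
    # Pair counts within each contingency cell, each true class, each predicted cluster.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (sum_cells - expected) / (max_index - expected)

truth = [0, 0, 0, 1, 1, 1]
relabeled = [1, 1, 1, 0, 0, 0]   # same partition, different label names
over_split = [0, 0, 1, 1, 2, 2]  # one extra cluster straddling the boundary

perfect = adjusted_rand_index(truth, relabeled)
partial = adjusted_rand_index(truth, over_split)
print(round(perfect, 3), round(partial, 3))
```

ARI ignores label names, so the relabeled partition scores 1.0, while the over-split prediction scores much lower even though most spots sit next to correctly grouped neighbors.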

Interactive simulation

Interactive: smoothing strength vs. number of clusters

The demo simulates a five-layer cortex. The left panel shows raw, noisy expression; the middle panel shows weak smoothing (Leiden-like); the right panel shows strong smoothing (BayesSpace-like). Watch how boundary smoothness trades off against the number of clusters.

Controls: smoothing strength (default 0.3), number of clusters (default 5).
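The effect of spatial smoothing can be reproduced offline with a toy stand-in. The sketch below assumes a simple majority vote over each spot's four grid neighbors, which is a crude substitute for the Markov random field prior used by BayesSpace and not that method itself. The grid size, noise rate, and function names are all hypothetical.

```python
import random
from collections import Counter

def smooth_labels(grid, rounds=1):
    """Replace each label by the majority among itself and its 4 grid neighbors."""
    h, w = len(grid), len(grid[0])
    for _ in range(rounds):
        new = [row[:] for row in grid]
        for r in range(h):
            for c in range(w):
                votes = Counter([grid[r][c]])
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        votes[grid[rr][cc]] += 1
                # Keep the current label unless another label strictly outvotes it.
                if votes[grid[r][c]] < max(votes.values()):
                    new[r][c] = max(votes, key=votes.get)
        grid = new
    return grid

def error_rate(grid, truth):
    cells = [(a, b) for row_a, row_b in zip(grid, truth) for a, b in zip(row_a, row_b)]
    return sum(a != b for a, b in cells) / len(cells)

random.seed(0)
n_layers, rows_per_layer, width = 5, 4, 20
truth = [[r // rows_per_layer] * width for r in range(n_layers * rows_per_layer)]

# Corrupt about 15% of spots by assigning them a different layer at random.
noisy = [
    [random.choice([l for l in range(n_layers) if l != v]) if random.random() < 0.15 else v
     for v in row]
    for row in truth
]

err_before = error_rate(noisy, truth)
err_after = error_rate(smooth_labels(noisy, rounds=1), truth)
print(f"error before: {err_before:.3f}, after one round: {err_after:.3f}")
```

Raising `rounds` plays the role of the smoothing-strength slider: isolated mislabeled spots disappear first, and with enough rounds thin layers start to erode at their boundaries, which is the over-smoothing risk noted for BayesSpace.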

Decision

III. Decision tree for choosing a tool

🌳 Start with the platform you are using

Visium:

Default to GraphST or BayesSpace; pick BANKSY when speed matters most.

Visium HD:

BANKSY (good Seurat v5 integration) + Leiden; BayesSpace struggles at HD scale.

Stereo-seq:

SpaGCN (image fusion) or GraphST/Louvain.

Image-based (Xenium/MERFISH/CosMx):

Segment first → cluster cells → run niche/CN analysis (see Step 11); classic spatial-domain detection is not always the right route.

Slide-seq:

The hardest case; start with BANKSY using a larger λ for smoothing, and report the range of ARI values you obtain.
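The decision tree above is small enough to encode as a lookup table, which can be handy for logging the rationale in a pipeline. This is only a sketch: the dictionary keys, the `recommend` function, and the wording of each entry are assumptions that paraphrase the list above.

```python
# Platform -> suggested starting points, paraphrasing the decision tree above.
RECOMMENDATIONS = {
    "visium": ["GraphST", "BayesSpace", "BANKSY (when speed matters)"],
    "visium hd": ["BANKSY + Leiden"],
    "stereo-seq": ["SpaGCN", "GraphST/Louvain"],
    "slide-seq": ["BANKSY with a larger lambda; report the ARI range"],
}
IMAGE_BASED = {"xenium", "merfish", "cosmx"}

def recommend(platform):
    """Return the suggested starting points for a platform name."""
    key = platform.strip().lower()
    if key in IMAGE_BASED:
        return ["segment cells, cluster them, then run niche/CN analysis"]
    if key not in RECOMMENDATIONS:
        raise ValueError(f"no recommendation recorded for platform {platform!r}")
    return RECOMMENDATIONS[key]

print(recommend("Visium HD"))
print(recommend("Xenium"))
```

Unknown platforms raise an error instead of falling back to a default, since the text gives no general-purpose recommendation.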

Code

Implementation

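# BayesSpace (R)
library(BayesSpace)
# Log-normalize, select HVGs, and run PCA on the SingleCellExperiment
spe <- spatialPreprocess(spe, platform = "Visium", n.PCs = 15, n.HVGs = 2000)
spe <- qTune(spe, qs = 3:10)  # fit q = 3..10 to help choose the number of clusters
qPlot(spe)                    # inspect the curve before fixing q
spe <- spatialCluster(spe, q = 7, nrep = 10000)  # q = 7 is an example value
clusterPlot(spe)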

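# BANKSY (R, Seurat v5 through SeuratWrappers; also needs the Banksy package)
library(Seurat); library(SeuratWrappers)
# lambda weights neighborhood features against own expression; dimx/dimy name the coordinate columns
vis <- RunBanksy(vis, lambda = 0.2, k_geom = 15, dimx = "x", dimy = "y")
# The BANKSY assay has no variable features, so pass its feature names explicitly
vis <- RunPCA(vis, assay = "BANKSY", features = rownames(vis[["BANKSY"]])) |>
  FindNeighbors() |> FindClusters(resolution = 0.5)
SpatialDimPlot(vis)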

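# GraphST (Python)
import torch
import squidpy as sq
from GraphST import GraphST
from GraphST.utils import clustering

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = GraphST.GraphST(adata, device=device)  # the class lives inside the GraphST module
adata = model.train()                          # learned embedding is stored in adata.obsm["emb"]
# mclust requires R and rpy2; the labels are written to adata.obs["domain"]
clustering(adata, n_clusters=7, method="mclust")
sq.pl.spatial_scatter(adata, color=["domain"])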

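# SpaGCN (Python)
import SpaGCN as spg
# Build the spot graph from array coordinates plus H&E pixel color (histology=True)
adj = spg.calculate_adj_matrix(x=adata.obs["x"].tolist(), y=adata.obs["y"].tolist(),
                               x_pixel=adata.obs["x_pix"].tolist(), y_pixel=adata.obs["y_pix"].tolist(),
                               image=img, beta=49, alpha=1, histology=True)
clf = spg.SpaGCN()
clf.set_l(0.5)  # length scale of the spatial kernel; example value, usually tuned with spg.search_l
# n_clusters only takes effect with init="kmeans"; the default Louvain init is driven by res instead
clf.train(adata, adj, init="kmeans", n_clusters=7)
y_pred, prob = clf.predict()
adata.obs["pred"] = y_pred
adata.obs["pred"] = adata.obs["pred"].astype("category")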

📝 Self-check

1. Why can't spatial domains simply be replaced by scRNA clusters?

A. scRNA has no genes

B. The cluster count can't be controlled

C. The same cell type can play different roles in different spatial regions

D. ST has no clusters

2. Which method performed best overall on Visium in the 2025 NAR benchmark?

A. GraphST

B. K-means

C. PCA

D. tSNE

3. What most distinguishes SpaGCN from the other methods?

A. It is not a GNN

B. It uses H&E image color as node features

C. It only works on Stereo-seq

D. It only runs in R