STEP 3 / 15

Quality control: spatial QC differs from scRNA-seq QC

A spot is not a cell. Tissue coverage and spatially aware outlier detection are the heart of ST QC.

I. Why can't scRNA-seq QC be applied directly to ST?

In scRNA-seq, "one barcode = one cell" is the foundational assumption. In ST it holds only for image-based platforms, and there only as well as cell segmentation allows. For Visium-style 55 µm spots, a single spot can cover 0, 1, 5, or even 10 cells. So:

  • High nFeature is not necessarily a doublet: the spot may simply contain more cells.
  • High MT% is not necessarily dead cells: the region may be muscle or cardiac tissue.
  • Group spots by tissue region first, then apply thresholds within each group.

Another ST-specific issue is tissue coverage. Some spots on the capture array inevitably fall outside the tissue, and these must be excluded first. Space Ranger / Loupe provides the in_tissue flag for this.

II. Core QC metrics for ST

📍 in_tissue

Whether a spot lies inside the tissue. Space Ranger detects this automatically; always check the H&E image to confirm that tissue has not been misclassified as background.
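To make the in_tissue flag concrete, here is a minimal sketch that reads it from a positions table using only the Python standard library. It assumes the layout of the `tissue_positions.csv` file that recent Space Ranger releases write with a header row (older releases wrote a header-less `tissue_positions_list.csv`, which would need explicit column names). The barcodes and coordinates are invented for illustration.

```python
import csv
import io

# Made-up excerpt in the layout of Space Ranger's tissue_positions.csv
# (assumption: header row present, as in recent Space Ranger releases).
TISSUE_POSITIONS = """barcode,in_tissue,array_row,array_col,pxl_row_in_fullres,pxl_col_in_fullres
AAAC-1,1,10,20,1500,2300
AAAG-1,0,0,0,100,120
AACT-1,1,11,21,1580,2350
"""


def in_tissue_barcodes(csv_text):
    """Return the barcodes whose in_tissue flag is 1."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["barcode"] for row in reader if row["in_tissue"] == "1"]


kept = in_tissue_barcodes(TISSUE_POSITIONS)
print(kept)
```

In practice the loaders used later in this step already restrict to in-tissue spots by default, so a check like this is mainly useful when working from the raw matrix.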

📊 nCount / nFeature

UMI and gene counts, as in scRNA-seq. But do not apply a single global threshold; examine the distributions per tissue region.

🔋 percent.mt

Mitochondrial fraction. Naturally elevated in cardiac and liver tissue, so view the spatial map before filtering. Probe-based Visium FFPE typically shows lower MT%.

🧬 percent.hb

Hemoglobin gene fraction. Elevated in highly vascularized tissue, where it usually means "this region is blood-rich," not "this spot is bad."
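Both percent.mt and percent.hb are the same computation: the share of a spot's UMIs that come from genes whose symbol matches a pattern. The sketch below spells that out in plain Python for a single spot with invented counts; `percent_feature_set` is a hypothetical helper named after the idea, not a library function. It also shows that the `^HB[AB]` pattern used in this step picks up HBA1 and HBB but not HBEGF.

```python
import re


def percent_feature_set(counts, pattern):
    """Percent of a spot's UMIs that come from genes whose name matches `pattern`."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    matched = sum(n for gene, n in counts.items() if re.search(pattern, gene))
    return 100.0 * matched / total


# Invented UMI counts for one spot (human gene symbols).
spot = {"MT-CO1": 30, "MT-ND1": 10, "HBA1": 5, "HBB": 15, "HBEGF": 20, "ACTB": 120}

pct_mt = percent_feature_set(spot, r"^MT-")      # mitochondrial genes
pct_hb = percent_feature_set(spot, r"^HB[AB]")   # alpha/beta hemoglobin genes
print(pct_mt, pct_hb)
```

Gene-symbol patterns are species-specific; the ones here assume human symbols.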

🌐 Local outliers

SpotSweeper compares each spot with its k nearest spatial neighbors (k = 36 on the Visium hex grid) and flags spots that are out of step with their surroundings. This is more precise than a global threshold.
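One way to read k = 36 (an interpretation offered here, not something the text above states) is as the three rings of spots surrounding a centre spot on an ideal hexagonal grid: 6 + 12 + 18. The sketch below counts them using axial hex coordinates; the coordinate scheme is a generic one chosen for the illustration, not Visium's own array indexing.

```python
def hex_distance(q, r):
    """Steps between the origin and axial hex coordinate (q, r)."""
    return (abs(q) + abs(r) + abs(q + r)) // 2


def neighbors_within(order):
    """Count hex-grid positions within `order` rings of a centre spot (centre excluded)."""
    count = 0
    for q in range(-order, order + 1):
        for r in range(-order, order + 1):
            if 1 <= hex_distance(q, r) <= order:
                count += 1
    return count


# Cumulative neighbor counts for one, two, and three rings.
ring_totals = [neighbors_within(k) for k in (1, 2, 3)]
print(ring_totals)
```

Spots at the tissue edge have fewer spots in their nearby rings, so for them a fixed k reaches farther out than three rings.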

🩻 Image quality

Uneven H&E staining, tissue folds, and bubbles all cause local artifacts. Always inspect the image visually, and outline suspicious regions in Loupe if needed.

Interactive: global vs spatially-aware QC

Left panel: spot color intensity encodes nCount. Dragging the "global threshold" slider applies a single cutoff; switching to "spatial neighborhood" compares each spot with its local neighbors instead. Notice which method mistakenly removes tissue regions that are normal but lowly expressed.

Legend: gray = removed; purple = kept; background color blocks mark tissue regions.
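The contrast in the interactive panel can also be reproduced offline. The following is a simplified sketch, not SpotSweeper's implementation: it uses an invented 10 × 10 square grid (not a hex grid), each spot's adjacent spots as the neighborhood (rather than k = 36), and a robust z-score on log counts with a floor on the spread that is an assumption added for this sketch. The right half of the grid is a normal low-expression region; one spot in the left half simulates a fold or bubble artifact.

```python
import math
from statistics import median

WIDTH, HEIGHT = 10, 10
DAMAGED = (2, 5)  # simulated fold/bubble artifact inside the high-expression region


def simulated_count(x, y):
    """Invented nCount: left half ~5000 UMIs, right half ~800 UMIs, +/-10% jitter."""
    if (x, y) == DAMAGED:
        return 900.0
    base = 5000.0 if x < WIDTH // 2 else 800.0
    jitter = 0.9 + 0.05 * ((3 * x + 7 * y) % 5)
    return base * jitter


counts = {(x, y): simulated_count(x, y) for x in range(WIDTH) for y in range(HEIGHT)}


def global_outliers(counts, cutoff=1000.0):
    """Spots removed by a single nCount cutoff."""
    return {spot for spot, n in counts.items() if n < cutoff}


def local_outliers(counts, z_cutoff=-3.0, min_scale=0.1):
    """Spots whose log count is far below that of their adjacent spots.

    Robust z-score: (value - neighbor median) / (1.4826 * neighbor MAD),
    with the scale floored at min_scale (a guard added for this sketch).
    """
    flagged = set()
    for (x, y), n in counts.items():
        neighbor_logs = [
            math.log(counts[(x + dx, y + dy)])
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0) and (x + dx, y + dy) in counts
        ]
        center = median(neighbor_logs)
        mad = median([abs(v - center) for v in neighbor_logs])
        scale = max(1.4826 * mad, min_scale)
        if (math.log(n) - center) / scale < z_cutoff:
            flagged.add((x, y))
    return flagged


removed_global = global_outliers(counts)
removed_local = local_outliers(counts)
print(len(removed_global), len(removed_local))
```

With these made-up numbers, the global cutoff discards the entire low-expression half along with the artifact, while the neighborhood comparison singles out only the artifact.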

III. ST QC decision flow

🌳 ST QC workflow

1. Drop spots with in_tissue = FALSE.
2. Plot spatial feature maps (nCount, nFeature, MT%) and check whether the distributions follow tissue anatomy.
3. Multiple samples or sections? If yes, use per-sample adaptive MAD thresholds rather than one cutoff for everything.
4. Apply SpotSweeper local-outlier detection, which picks up fold and bubble damage visible on the H&E.
5. If a downstream cluster is entirely high-MT% with no marker genes, go back and tighten the thresholds.
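Step 3 can be sketched as follows: for each sample, the lower cutoff is the median minus a few scaled median absolute deviations (MADs), so a shallow section is judged against its own distribution. This is a minimal standard-library illustration; the function name, the choice of 3 MADs, and the log10(nCount) values are assumptions made for the example.

```python
from statistics import median


def mad_lower_bounds(values_by_sample, n_mads=3.0):
    """Per-sample lower cutoff: median - n_mads * scaled MAD."""
    bounds = {}
    for sample, values in values_by_sample.items():
        center = median(values)
        # 1.4826 rescales the MAD to match the standard deviation under normality.
        mad = 1.4826 * median([abs(v - center) for v in values])
        bounds[sample] = center - n_mads * mad
    return bounds


# Invented log10(nCount) values for two sections sequenced to different depth.
log_counts = {
    "section_A": [3.6, 3.7, 3.8, 3.9, 4.0, 2.0],
    "section_B": [2.9, 3.0, 3.1, 3.2, 3.3, 1.5],
}

bounds = mad_lower_bounds(log_counts)
flagged = {
    sample: [v for v in values if v < bounds[sample]]
    for sample, values in log_counts.items()
}
print(bounds, flagged)
```

Here each section loses only its one clearly low spot, whereas reusing section_A's cutoff on section_B would also remove some of its ordinary spots. The same function works with tissue regions as the grouping key instead of samples.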

QC example

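# R (Seurat + SpotSweeper)
library(Seurat)
library(SpotSweeper)
library(SpatialExperiment)

# Load10X_Spatial() reads the filtered matrix by default, so only in-tissue spots are loaded.
vis <- Load10X_Spatial("visium/")

# Basic metrics (human gene symbols; mouse mitochondrial genes start with "mt-")
vis[["percent.mt"]] <- PercentageFeatureSet(vis, pattern = "^MT-")
vis[["percent.hb"]] <- PercentageFeatureSet(vis, pattern = "^HB[AB]")

SpatialFeaturePlot(vis, features = c("nCount_Spatial", "percent.mt"))

# Lenient global filter first
vis <- subset(vis, subset = nCount_Spatial > 500 & percent.mt < 25)

# SpotSweeper local-outlier QC needs a SpatialExperiment with spot coordinates.
# Assumption: GetTissueCoordinates() returns one row per spot, named by barcode,
# with the two coordinate columns first (column names differ between Seurat versions).
sce <- as.SingleCellExperiment(vis)
xy <- as.matrix(GetTissueCoordinates(vis)[colnames(sce), 1:2])
spe <- SpatialExperiment(assays = assays(sce), colData = colData(sce), spatialCoords = xy)

# localOutliers() reads `metric` from colData and writes a "<metric>_outliers" column.
spe <- localOutliers(spe, metric = "nCount_Spatial", direction = "lower", n_neighbors = 36)
spe <- spe[, !spe$nCount_Spatial_outliers]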

# Python (scanpy + squidpy)
import scanpy as sc
import squidpy as sq

# read_visium() loads the filtered matrix by default, so only in-tissue spots are present.
adata = sc.read_visium("visium/")
adata.var_names_make_unique()

# Tag mt and hb genes (human gene symbols)
adata.var["mt"] = adata.var_names.str.startswith("MT-")
adata.var["hb"] = adata.var_names.str.contains("^HB[AB]")
sc.pp.calculate_qc_metrics(adata, qc_vars=["mt", "hb"], inplace=True)

# Spatial QC view
sq.pl.spatial_scatter(adata, color=["total_counts", "pct_counts_mt"])

# Lenient global filter
sc.pp.filter_cells(adata, min_counts=500)
adata = adata[adata.obs["pct_counts_mt"] < 25].copy()

📝 Self-check

1. A Visium heart section has globally elevated MT%. What is the best response?

A. Apply a strict 5% cutoff
B. Skip the MT% check entirely
C. Raise the threshold to fit the tissue's biology, and check that the spatial distribution matches the anatomy
D. Remove all MT genes

2. What is the key advantage of SpotSweeper?

A. It estimates batch effects automatically
B. It compares each spot with its spatial neighborhood to flag local outliers
C. It performs doublet detection
D. It is a segmentation tool like Cellpose

3. Why doesn't "high nFeature" in Visium necessarily mean a doublet?

A. Because a single spot can naturally cover 1–10 cells
B. Because Visium never produces doublets
C. Because nFeature is the wrong metric
D. Because Visium has the in_tissue flag