STEP 6 / 14

Results and Figure Design

Results report facts and leave interpretation out. Each figure tells a self-contained story and comes with the full effect size, confidence interval (CI), and adjusted p value.

The Golden Rules of Results

Rule 1: Report, don't interpret. Interpretation belongs in the Discussion.
Rule 2: Figures must stand alone. A reader who skips the main text and looks only at the figures and captions should still grasp the main findings.
Rule 3: Pair every p value with an effect size and a 95% CI. The p value only addresses whether there is a difference; the effect size says how large it is.

💡
Structure hint: Results paragraphs typically run in this order: (1) cohort/sample description (Table 1) → (2) primary outcome (Figures 1-2) → (3) secondary outcomes (Figures 3-4) → (4) sensitivity / subgroup analyses. Each figure gets one paragraph of text.

I. Table 1: Cohort Description

Table 1 is standard in clinical, observational, and epidemiological studies and lists the participants' baseline characteristics. Bioinformatics papers usually retitle it "Sample characteristics."

| Variable | Treated (n=120) | Control (n=120) | p / SMD |
|---|---|---|---|
| Age (years), mean (SD) | 58.4 (11.2) | 57.9 (10.8) | 0.74 |
| Female, n (%) | 62 (51.7) | 65 (54.2) | 0.79 |
| BMI (kg/m²), median (IQR) | 26.4 (23.8–29.1) | 26.9 (24.0–29.6) | 0.42 |
| Diabetes, n (%) | 28 (23.3) | 31 (25.8) | 0.76 |
| Baseline HbA1c, mean (SD) | 6.8 (0.9) | 6.7 (1.0) | 0.51 |
⚠️
Three common Table 1 errors: ① Reporting mean (SD) for a continuous variable without checking normality (use median (IQR) when the distribution is skewed). ② Putting outcome variables in Table 1, which should contain baseline characteristics only. ③ Omitting the SMD (standardized mean difference) for propensity-matched cohorts.
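As a minimal sketch of how an SMD could be computed from the summary statistics already shown in Table 1, the snippet below uses the usual pooled-SD formulas for a continuous and a binary variable. The function names are illustrative, not part of any library, and the |SMD| < 0.1 balance threshold mentioned in the comment is a common convention rather than a fixed rule.

```python
import math


def smd_continuous(mean_t, sd_t, mean_c, sd_c):
    """SMD for a continuous variable: mean difference over the pooled SD."""
    pooled_sd = math.sqrt((sd_t ** 2 + sd_c ** 2) / 2)
    return (mean_t - mean_c) / pooled_sd


def smd_binary(events_t, n_t, events_c, n_c):
    """SMD for a binary variable: difference in proportions over the pooled SD."""
    p_t, p_c = events_t / n_t, events_c / n_c
    pooled_sd = math.sqrt((p_t * (1 - p_t) + p_c * (1 - p_c)) / 2)
    return (p_t - p_c) / pooled_sd


# Values taken from the Age and Female rows of Table 1 above.
age_smd = smd_continuous(58.4, 11.2, 57.9, 10.8)
female_smd = smd_binary(62, 120, 65, 120)

# |SMD| < 0.1 is often read as adequate balance (a convention, not a rule).
print(f"Age SMD:    {age_smd:.3f}")     # 0.045
print(f"Female SMD: {female_smd:.3f}")  # -0.050
```

Unlike a p value, the SMD does not shrink or grow with sample size, which is why it is preferred for judging balance after matching.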

II. Reporting p, Effect Size, and CI in Full

As of 2026, reporting a p value alone is considered inadequate. The ASA's 2016 Statement on p-values says explicitly that a p value by itself does not measure the size of an effect or the importance of a result.
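To make the point concrete, here is a small sketch with two hypothetical studies: a negligible effect with a huge sample and a large effect with a tiny sample. It assumes equal group sizes and uses a z-test (normal approximation) rather than a t-test, so the p values are approximate; the helper name and the numbers are made up for illustration.

```python
import math
from statistics import NormalDist


def two_sided_p_from_d(d, n_per_group):
    """Approximate two-sided p for a standardized mean difference d.

    Assumes two equal-sized groups and a normal approximation:
    z = d * sqrt(n / 2).
    """
    z = d * math.sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical study A: negligible effect, very large sample.
p_tiny_effect = two_sided_p_from_d(d=0.05, n_per_group=20000)
# Hypothetical study B: large effect, very small sample.
p_large_effect = two_sided_p_from_d(d=0.80, n_per_group=8)

print(f"d = 0.05, n = 20000 per group: p = {p_tiny_effect:.1e}")
print(f"d = 0.80, n = 8 per group:     p = {p_large_effect:.3f}")
```

Study A comes out far below 0.001 although the difference is practically irrelevant, while study B stays above 0.05 although the difference is large. The p value alone would rank the two studies the wrong way round.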

| Comparison type | Effect size | Full reporting example |
|---|---|---|
| Two-group means | Cohen's d, mean difference | Mean difference 3.2 mmHg (95% CI 1.4–5.0; p<0.001; Cohen's d=0.42) |
| Categorical vs outcome | OR / RR / HR | HR 0.68 (95% CI 0.52–0.89; p=0.005) |
| Correlation | Pearson r / Spearman ρ | r = 0.45 (95% CI 0.28–0.59; p<0.001, n=152) |
| Gene differential expression (DE) | log2 fold-change | log2FC = 2.3, FDR = 1.2 × 10⁻⁵ (BH-adjusted) |
| Classifier performance | AUC | AUC 0.87 (95% CI 0.83–0.91; bootstrap n=1000) |
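A reported ratio, its CI, and its p value should agree with one another. The sketch below back-calculates an approximate Wald p value from the HR row of the table above, assuming the CI is symmetric on the log scale. The function name is illustrative, and a p value from a different test (for example a log-rank test) need not match exactly.

```python
import math
from statistics import NormalDist


def p_from_ratio_ci(estimate, lower, upper, level=0.95):
    """Approximate Wald p value for a ratio (HR/OR/RR) from its CI.

    Assumes the CI was built as exp(log(estimate) +/- z_crit * SE).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    p = 2 * NormalDist().cdf(-abs(z))
    return se, z, p


# HR 0.68 (95% CI 0.52-0.89) from the table above.
se, z, p = p_from_ratio_ci(0.68, 0.52, 0.89)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
# SE(log HR) = 0.137, z = -2.81, p = 0.005
```

The result matches the reported p=0.005, so this example line is internally consistent. A large mismatch in a draft usually signals a typo or a mix-up between models.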

III. 7 Principles of Figure Design

One figure, one message

Each main figure conveys one core message. The panels of a multi-panel figure (A/B/C/D) are facets of the same story, not a way to fill space.

Colorblind-friendly

Avoid jet / rainbow colormaps. Use viridis or magma for continuous variables and a ColorBrewer qualitative palette such as Set2 for categorical variables (Set1 pairs red with green, so check it with a colorblindness simulator before relying on it). About 8% of men and 0.5% of women have red-green color blindness.

Font size ≥7 pt

Most journals require a minimum font size of 7 pt (about 2.5 mm) at final print size. Export at 300 dpi, and work out the physical dimensions at export time so that text does not shrink below the minimum when the figure is scaled.
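The arithmetic behind "verify physical dimensions" can be sketched in a few lines. The helper names below are illustrative, and the 3.5 in width is only an example of a single-column figure; check the target journal's own specification.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72


def pt_to_mm(points):
    """Convert a font size in points to millimetres."""
    return points / POINTS_PER_INCH * MM_PER_INCH


def pixel_size(width_in, height_in, dpi):
    """Pixel dimensions of a raster export at a given physical size and dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def printed_font_pt(font_pt, drawn_width_in, printed_width_in):
    """Effective font size after the figure is scaled to its printed width."""
    return font_pt * printed_width_in / drawn_width_in


print(f"7 pt = {pt_to_mm(7):.2f} mm")              # 7 pt = 2.47 mm
print(pixel_size(3.5, 3, 300))                     # (1050, 900)
# An 8 pt label drawn on a 7 in wide canvas and printed at 3.5 in wide:
print(f"{printed_font_pt(8, 7.0, 3.5):.1f} pt")    # 4.0 pt, below the minimum
```

Drawing the figure at its final print width from the start avoids the scaling problem in the last line.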

Remove chartjunk

Be wary of 3D effects, gradient backgrounds, and dual Y-axes. Edward Tufte's data-ink ratio: the larger the share of ink that goes to the data, the better.

Show the raw data

A plain bar chart hides the distribution. Use boxplot + jitter, a violin plot, or a dot plot so readers can see n and the outliers.

Self-contained captions

A caption should state: n, the statistical test used, what the error bars represent (SD, SEM, or 95% CI), and the multiple-testing correction method.
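Why the caption has to say what the error bars are: for the same data, SD, SEM, and 95% CI give bars of very different lengths. The sketch below uses a hypothetical sample of ten values, and the t critical value is hard-coded for df = 9 because the standard library has no t distribution.

```python
import math
import statistics

# Hypothetical expression values for one group (n = 10).
values = [4.1, 3.8, 5.2, 4.6, 3.9, 4.4, 5.0, 4.7, 4.2, 4.5]

n = len(values)
mean = statistics.mean(values)
sd = statistics.stdev(values)      # sample SD: spread of the data
sem = sd / math.sqrt(n)            # SEM: precision of the mean

# Two-sided 95% critical value of Student's t for df = n - 1 = 9.
# Hard-coded assumption: look it up again if n changes.
T_CRIT_DF9 = 2.262
ci_half_width = T_CRIT_DF9 * sem   # half-width of the 95% CI of the mean

print(f"mean = {mean:.2f}")
print(f"SD bar:     +/- {sd:.2f}")
print(f"SEM bar:    +/- {sem:.2f}")
print(f"95% CI bar: +/- {ci_half_width:.2f}")
```

With n = 10 the SEM bar is the shortest and the SD bar the longest, so an unlabeled error bar can make the same result look either precise or noisy.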

IV. Results Paragraphs: Before and After

Interpretation mixed in

"PFS in the treated group was significantly better than in the control group (p=0.001), which shows that our new therapy is highly effective and may change clinical practice."
("Highly effective" and "may change clinical practice" belong in the Discussion, not in Results. The HR and 95% CI are also missing.)

Report only

"Median PFS was 11.4 months (95% CI 9.8–13.2) in the treated group and 7.6 months (95% CI 6.4–9.1) in the control group. The Cox model HR was 0.62 (95% CI 0.48–0.81; log-rank p=0.001); after adjustment for age, sex, and PD-L1 the HR was 0.65 (95% CI 0.50–0.84) (Figure 2)."

p value only

"Gene X was significantly upregulated in tumors (p=0.03)."

Full version

"Median Gene X expression was 4.2 TPM (IQR 3.1–5.6) in tumors (n=24) and 1.9 TPM (IQR 1.2–2.7) in normal tissue (n=18); Mann-Whitney U=89, p=0.03, BH-adjusted q=0.08, effect size r=0.34 (Figure 3A)."

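The full version reports a BH-adjusted q next to the raw p. As a rough sketch of where such a q value comes from, here is a plain-Python Benjamini-Hochberg adjustment. The p values in the example are hypothetical, and in practice an established routine (for example `p.adjust(method = "BH")` in R) is the safer choice.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(p_values)
    # Indices of the p values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.005]
q = bh_adjust(raw_p)
print([round(v, 3) for v in q])  # [0.02, 0.04, 0.04, 0.02]
```

The adjustment depends on how many tests were run together, which is why a raw p of 0.03 can end up above a 0.05 FDR threshold once many genes are tested, and why the caption or text should name the correction method.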
V. Decision Tree: Choosing the Right Plot

🌳 Which plot for which data?

Q1: Distribution of a continuous variable? Histogram / density plot / boxplot + jitter / violin.
Q2: Relationship between two continuous variables? Scatter plot + regression line + 95% CI band; use hexbin when n is large.
Q3: Categorical vs continuous? Boxplot + dot overlay (avoid plain bar charts).
Q4: Time series / survival? Kaplan-Meier curve with a risk table + log-rank p.
Q5: Many genes × many samples? Heatmap (z-score scaling, column annotations, viridis).
Q6: Significance across many comparisons? Volcano plot (x: log2FC, y: -log10 q) or Manhattan plot (GWAS).
Q7: Classifier performance? ROC curve + AUC + 95% CI; do not report accuracy alone.
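For Q7, one way the AUC and its 95% CI could be obtained is a percentile bootstrap. The sketch below is a simplified, standard-library-only illustration on simulated scores; the function names, the seed, and the data are all assumptions, and a real analysis would normally use an established package.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap CI for the AUC, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((1 - level) / 2 * n_boot)]
    upper = stats[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Simulated example: 40 positives score higher on average than 40 negatives.
sim = random.Random(42)
labels = [1] * 40 + [0] * 40
scores = ([sim.gauss(1.0, 1.0) for _ in range(40)]
          + [sim.gauss(0.0, 1.0) for _ in range(40)])

point = auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC {point:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}; 1000 bootstrap resamples)")
```

Stating the number of resamples and the CI method in the text or caption lets readers judge how stable the reported interval is.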

VI. Publication-Quality Figure Templates

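# R (ggplot2): publication-quality boxplot + jitter
# Assumes `df` is a data frame with columns `group` (factor) and `expr` (numeric).
library(ggplot2)  # scale_fill_viridis_d() ships with ggplot2 >= 3.0
p <- ggplot(df, aes(group, expr, fill = group)) +
  geom_boxplot(outlier.shape = NA, alpha = 0.7) +   # hide box outliers; the jitter layer shows every point
  geom_jitter(width = 0.2, size = 1.2, alpha = 0.6) +
  scale_fill_viridis_d() +                           # colorblind-friendly discrete palette
  labs(x = NULL, y = "Expression (TPM)") +
  theme_classic(base_size = 8) +                     # 8 pt base font
  theme(legend.position = "none",
        axis.text = element_text(color = "black"))
ggsave("fig2a.pdf", p, width = 3.5, height = 3, units = "in", dpi = 300)

# Python (matplotlib + seaborn): the same figure
# Assumes `df` is a pandas DataFrame with columns "group" and "expr".
import matplotlib.pyplot as plt
import seaborn as sns
plt.rcParams.update({"font.size": 8, "pdf.fonttype": 42})  # fonttype 42 keeps text editable in the PDF
fig, ax = plt.subplots(figsize=(3.5, 3))                   # figure size in inches
# hue="group" with legend=False avoids the palette-without-hue warning (seaborn >= 0.13)
sns.boxplot(data=df, x="group", y="expr", hue="group", palette="viridis",
            legend=False, showfliers=False, ax=ax)
sns.stripplot(data=df, x="group", y="expr", color="black",
              alpha=0.5, size=3, ax=ax)
ax.set_xlabel("")
ax.set_ylabel("Expression (TPM)")
ax.spines[["top", "right"]].set_visible(False)
fig.tight_layout()
fig.savefig("fig2a.pdf", dpi=300)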

📝 Self-check

1. Which sentence most clearly violates Rule 1 of Results?

A. "Median PFS was 11.4 months (95% CI 9.8–13.2)"
B. "This shows our therapy may change clinical practice"
C. "Cox HR 0.62 (log-rank p=0.001)"
D. "BH-adjusted q=0.08"

2. What is the minimum needed to fully report a between-group mean difference?

A. p value alone
B. p value + n
C. Mean difference + p value
D. Mean difference + 95% CI + p value (+ effect size)

3. Which plot is best for a continuous gene-expression value across multiple sample groups?

A. 3D bar chart
B. Pie chart
C. Boxplot + jitter / violin
D. Bar chart with SEM error bars