Step 6: Results and Figure Design (Academic Writing Tutorial)

Overview

The golden rules of Results

Rule 1: Report, don't interpret. Interpretation belongs in the Discussion.
Rule 2: Figures must stand alone. A reader who skips the text and looks only at the figures and captions should still grasp the main findings.
Rule 3: Pair every p value with an effect size and a 95% CI. The p value addresses whether there is a difference; the effect size says how large it is.

💡 Structure hint: Results paragraphs typically run in this order: (1) cohort/sample description (Table 1) → (2) primary outcome (Figures 1-2) → (3) secondary outcomes (Figures 3-4) → (4) sensitivity / subgroup analyses. Each figure gets one paragraph of text.

Table 1 conventions

I. Table 1: Cohort description

Table 1 is the standard table of baseline participant characteristics in clinical, observational, and epidemiological research. Bioinformatics papers typically retitle it "Sample characteristics."

Variable	Treated (n=120)	Control (n=120)	p value
Age (years), mean (SD)	58.4 (11.2)	57.9 (10.8)	0.74
Female, n (%)	62 (51.7)	65 (54.2)	0.79
BMI (kg/m²), median (IQR)	26.4 (23.8–29.1)	26.9 (24.0–29.6)	0.42
Diabetes, n (%)	28 (23.3)	31 (25.8)	0.76
Baseline HbA1c (%), mean (SD)	6.8 (0.9)	6.7 (1.0)	0.51

⚠️ Three common Table 1 errors: ① Forcing mean (SD) onto continuous variables without checking normality (non-normal variables should be reported as median (IQR)). ② Putting outcome variables in Table 1 (it holds baseline characteristics only). ③ Omitting the SMD (standardized mean difference) for propensity-matched cohorts.
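
The SMD in error ③ can be computed directly from the summary statistics that Table 1 already reports. The following is a minimal sketch, assuming the common pooled-SD definition for continuous variables and the pooled-proportion definition for binary ones. The function names `smd_continuous` and `smd_binary` are illustrative, not part of any library, and the inputs are the age and sex rows of the example table.

```python
import math

def smd_continuous(mean_t, sd_t, mean_c, sd_c):
    """SMD for a continuous variable: mean difference over the pooled SD."""
    pooled_sd = math.sqrt((sd_t ** 2 + sd_c ** 2) / 2)
    return (mean_t - mean_c) / pooled_sd

def smd_binary(p_t, p_c):
    """SMD for a binary variable, given the proportion in each group."""
    pooled_sd = math.sqrt((p_t * (1 - p_t) + p_c * (1 - p_c)) / 2)
    return (p_t - p_c) / pooled_sd

# Age and sex rows from the example Table 1 (treated vs control, n=120 each).
age_smd = smd_continuous(58.4, 11.2, 57.9, 10.8)
female_smd = smd_binary(62 / 120, 65 / 120)

print(f"Age SMD: {age_smd:.3f}")
print(f"Female SMD: {female_smd:.3f}")
```

Both values have an absolute size well below 0.1, a threshold often used as a rule of thumb for acceptable balance. Unlike a p value, the SMD does not shrink or grow with sample size, which is why it is preferred for matched cohorts.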

Statistical reporting

II. Complete reporting of p, effect size, and CI

Reporting a p value alone is now considered inadequate. The ASA 2016 Statement on p-values states that a p-value does not measure the size of an effect or the importance of a result.

Comparison type	Effect size	Full reporting example
Two-group means	Cohen's d, mean difference	Mean difference 3.2 mmHg (95% CI 1.4–5.0; p<0.001; Cohen's d=0.42)
Categorical vs outcome	OR / RR / HR	HR 0.68 (95% CI 0.52–0.89; p=0.005)
Correlation	Pearson r / Spearman ρ	r = 0.45 (95% CI 0.28–0.59; p<0.001, n=152)
Gene differential expression	log2 fold-change	log2FC = 2.3, FDR = 1.2 × 10⁻⁵ (BH-adjusted)
Classifier performance	AUC	AUC 0.87 (95% CI 0.83–0.91; 1,000 bootstrap resamples)
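
The first row of the table can be produced from raw data in a few lines. This is a sketch only: it uses a large-sample normal approximation so that it needs nothing beyond the Python standard library, whereas a real analysis of small samples would use a t-based interval. The function names and the blood-pressure values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def report_mean_difference(a, b, level=0.95):
    """Mean difference, CI, two-sided p, and Cohen's d for two samples.

    Assumption of this sketch: normal approximation for the CI and p value.
    """
    n_a, n_b = len(a), len(b)
    diff = mean(a) - mean(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    se = math.sqrt(var_a / n_a + var_b / n_b)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    p = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    # Cohen's d divides by the pooled standard deviation.
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return diff, ci, p, diff / pooled_sd

def format_report(diff, ci, p, d, unit=""):
    """Format the numbers in the reporting style used in the table."""
    p_text = "p<0.001" if p < 0.001 else f"p={p:.3f}"
    return (f"Mean difference {diff:.1f}{unit} "
            f"(95% CI {ci[0]:.1f}–{ci[1]:.1f}; {p_text}; Cohen's d={d:.2f})")

# Hypothetical systolic blood pressure readings (mmHg).
treated = [128, 131, 135, 129, 133, 137, 130, 134]
control = [124, 127, 130, 125, 129, 126, 128, 123]

diff, ci, p, d = report_mean_difference(treated, control)
print(format_report(diff, ci, p, d, unit=" mmHg"))
```

Keeping the formatting in its own function makes it easy to apply one reporting style to every comparison in a manuscript.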

Figure design

III. Figure design principles

① One figure, one message

Each main figure conveys one core message. Multiple panels (A/B/C/D) are facets of the same story, not a way to fill space.

② Colorblind-friendly

Avoid jet / rainbow. Continuous variables: viridis / magma. Categorical variables: ColorBrewer Set2 or Dark2 (Set1 pairs red with green, so it is not colorblind-safe). About 8% of men and 0.5% of women have red-green color blindness.

③ Font size ≥7 pt

Most journals require a minimum font size of 7 pt (~2 mm) at final print size. Export at 300 dpi and work out the physical dimensions before exporting.
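
The size check above is simple arithmetic, sketched below. The helper names and the 3.5 in and 7 in widths are assumptions chosen for illustration; check the target journal's column widths.

```python
def pixels_at_print_size(width_in, height_in, dpi=300):
    """Pixel dimensions a raster export needs at the given print size."""
    return round(width_in * dpi), round(height_in * dpi)

def effective_font_pt(font_pt, drawn_width_in, printed_width_in):
    """Font size after the figure is scaled from its drawn width to its printed width."""
    return font_pt * printed_width_in / drawn_width_in

# A 3.5 x 3 in single-column figure exported at 300 dpi.
print(pixels_at_print_size(3.5, 3))        # (1050, 900)

# A figure drawn 7 in wide with 10 pt labels, then shrunk to a 3.5 in column.
print(effective_font_pt(10, 7.0, 3.5))     # 5.0, below the 7 pt minimum
```

Drawing the figure at its final printed width avoids the second problem entirely, because the font size set in the plotting code is then the font size on the page.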

④ Remove chartjunk

Avoid 3D effects and gradient backgrounds, and use dual Y-axes only with caution. Edward Tufte's data-ink ratio: devote as much of the ink as possible to the data.

⑤ Show the raw data

A plain bar chart hides the distribution. Use boxplot + jitter, violin, or dot plots so readers can see n and the outliers.

⑥ Self-contained captions

A caption should state: n, the statistical test used, what the error bars represent (SD / SEM / 95% CI), and the multiple-comparison correction method.

Worked comparisons

IV. Results paragraphs: before and after

❌ Interpretation mixed in

"PFS in the treated group was significantly better than in the control group (p=0.001), showing that our new therapy is highly effective and may change clinical practice."
("Highly effective" and "may change clinical practice" belong in the Discussion, not the Results. The HR and 95% CI are also missing.)

✅ Report only

"Median PFS was 11.4 months (95% CI 9.8–13.2) in the treated group versus 7.6 months (95% CI 6.4–9.1) in the control group. The Cox model HR was 0.62 (95% CI 0.48–0.81; log-rank p=0.001) and remained 0.65 (95% CI 0.50–0.84) after adjustment for age, sex, and PD-L1 (Figure 2)."

❌ p value only

"Gene X was significantly upregulated in tumors (p=0.03)."

✅ Complete version

"Median Gene X expression was 4.2 TPM (IQR 3.1–5.6) in tumors (n=24) versus 1.9 TPM (IQR 1.2–2.7) in normal tissue (n=18); Mann-Whitney U=130, p=0.03, BH-adjusted q=0.08, effect size r=0.34 (Figure 3A)."

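An effect size r for a Mann-Whitney test is commonly derived as |z| / √N, where z comes from the normal approximation to U. The sketch below shows that calculation under simplifying assumptions: no tie correction and no continuity correction. The function name is hypothetical, and the inputs (U=130 with group sizes 24 and 18) are illustrative values.

```python
import math
from statistics import NormalDist

def mann_whitney_effect_size(u, n1, n2):
    """Normal-approximation z, two-sided p, and r = |z| / sqrt(N) for a U statistic.

    Simplification of this sketch: no tie or continuity correction.
    """
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    r = abs(z) / math.sqrt(n1 + n2)
    return z, p, r

z, p, r = mann_whitney_effect_size(u=130, n1=24, n2=18)
print(f"z={z:.2f}, p={p:.3f}, r={r:.2f}")
```

Running the reported U, p, and r through a check like this before submission catches the case where the three numbers were copied from different analyses and no longer agree.
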
Decision guide

V. Decision tree for choosing the right plot type

🌳 Which plot for which data?

Q1: Distribution of a continuous variable? → Histogram / density / boxplot + jitter / violin.

Q2: Relationship between two continuous variables? → Scatter + regression line + 95% CI band; use hexbin when n is large.

Q3: Categorical vs continuous? → Boxplot + dot overlay (avoid plain bar charts).

Q4: Time-to-event / survival? → Kaplan-Meier curve with risk table + log-rank p.

Q5: Many genes × many samples? → Heatmap (z-score scaled, annotated columns, viridis).

Q6: Significance across many comparisons? → Volcano (x: log2FC, y: -log10 q) or Manhattan (GWAS).

Q7: Classifier performance? → ROC + AUC + 95% CI; never report accuracy alone.
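
For the volcano plot in Q6, the y-axis uses adjusted q-values rather than raw p-values. The following sketch shows one way to compute Benjamini-Hochberg q-values and the resulting plot coordinates with the standard library only. The function names and the example p-values are hypothetical, and the code assumes every p-value is greater than zero so that the logarithm is defined.

```python
import math

def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

def volcano_points(log2_fcs, p_values):
    """(x, y) pairs for a volcano plot: x = log2FC, y = -log10(q)."""
    q_values = benjamini_hochberg(p_values)
    return [(fc, -math.log10(q)) for fc, q in zip(log2_fcs, q_values)]

# Hypothetical results for five genes.
log2_fcs = [2.3, -1.8, 0.9, 1.1, 0.1]
p_values = [0.001, 0.008, 0.03, 0.04, 0.6]

for (x, y), q in zip(volcano_points(log2_fcs, p_values),
                     benjamini_hochberg(p_values)):
    print(f"log2FC={x:+.1f}  q={q:.3f}  -log10(q)={y:.2f}")
```

In this example the third and fourth genes end up with the same q-value, which is expected: the monotonicity step caps each q at the smallest adjusted value among the larger p-values.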

Code

VI. Publication-quality figure templates

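# R (ggplot2): publication-quality boxplot + jitter
# Assumes a data frame `df` with columns `group` (factor) and `expr` (numeric).
library(ggplot2)  # scale_fill_viridis_d() ships with ggplot2 >= 3.0.0
p <- ggplot(df, aes(group, expr, fill = group)) +
  geom_boxplot(outlier.shape = NA, alpha = 0.7) +   # outliers are shown by the jitter layer
  # height = 0 keeps the plotted y values equal to the measured values
  geom_jitter(width = 0.2, height = 0, size = 1.2, alpha = 0.6) +
  scale_fill_viridis_d() +
  labs(x = "", y = "Expression (TPM)") +
  theme_classic(base_size = 8) +
  theme(legend.position = "none",
        axis.text = element_text(color = "black"))
# Single-column size in inches; dpi only affects raster formats, not PDF.
ggsave("fig2a.pdf", p, width = 3.5, height = 3, units = "in", dpi = 300)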

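# Python (matplotlib + seaborn): the same figure
# Assumes a pandas DataFrame `df` with columns "group" and "expr",
# and seaborn >= 0.13 for the `legend` argument of boxplot.
import matplotlib.pyplot as plt
import seaborn as sns

# 8 pt text; fonttype 42 embeds TrueType fonts so text stays editable in the PDF.
plt.rcParams.update({"font.size": 8, "pdf.fonttype": 42})
fig, ax = plt.subplots(figsize=(3.5, 3))  # inches, single-column width
# hue="group" avoids the deprecated palette-without-hue usage.
sns.boxplot(data=df, x="group", y="expr", hue="group", palette="viridis",
            legend=False, showfliers=False, ax=ax)
sns.stripplot(data=df, x="group", y="expr", color="black",
              alpha=0.5, size=3, ax=ax)
ax.set_xlabel("")
ax.set_ylabel("Expression (TPM)")
ax.spines[["top", "right"]].set_visible(False)
fig.tight_layout()
fig.savefig("fig2a.pdf", dpi=300)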

📝 Self-check

1. Which sentence most clearly violates Rule 1 of Results?

A. "Median PFS was 11.4 months (95% CI 9.8–13.2)"
B. "This shows our therapy may change clinical practice"
C. "Cox model HR 0.62 (log-rank p=0.001)"
D. "BH-adjusted q=0.08"

2. What is the minimum needed to fully report a difference between two group means?

A. p value alone
B. p value + n
C. Mean difference + p value
D. Mean difference + 95% CI + p value (+ effect size)

3. Which plot is best for a continuous gene-expression value across multiple sample groups?

A. 3D bar chart
B. Pie chart
C. Boxplot + jitter / violin
D. Bar chart with SEM error bars