1. Why R, and why pair it with RStudio?
R is one of the most widely used open-source languages for statistics and bioinformatics. Bioconductor (R's bioinformatics package ecosystem) hosts over 2,300 packages purpose-built for biological data, a breadth that is hard to find in any other language. From gene expression and single-cell analysis to ChIP-seq and epigenetics, nearly every mainstream analysis tool has an R implementation.
R vs. RStudio: R is the engine; RStudio is the dashboard. R alone can run every computation, but without RStudio's editor, file browser, plot pane, and project management you will be far less productive. Beginners should install both: R first, then RStudio.
2. Installing R (the core engine)
Download the installer for your operating system from CRAN:
- Windows: https://cran.r-project.org/bin/windows/base/ → click "Download R-x.x.x for Windows" → run the .exe installer.
- macOS: https://cran.r-project.org/bin/macosx/ → pick the build for your chip (Intel x86_64 or Apple Silicon arm64).
- Linux (Ubuntu/Debian): sudo apt install r-base r-base-dev, or add the CRAN repository to get the latest version.
Windows install path
The default location is C:\Program Files\R\R-4.x.x\. There is no need to add R to PATH: RStudio finds the installation automatically, and putting R on PATH by hand can complicate managing multiple versions later.
macOS permission prompt
On first launch you may see "developer cannot be verified". Go to System Settings → Privacy & Security and click "Open Anyway". On Apple Silicon, be sure to install the arm64 build; it is significantly faster than the Intel (x86_64) build.
3. Installing RStudio (the IDE)
Download RStudio Desktop (free) from Posit, the company behind RStudio:
https://posit.co/download/rstudio-desktop/
The page detects your operating system and recommends the matching installer. On Windows, RStudio appears in the Start menu after installation. If it asks which R to use on first launch, the auto-detected default is usually fine.
The four panes of the RStudio interface
📝 Source (top left)
The code editor, where you write .R scripts or .Rmd reports. Ctrl/Cmd + Enter sends the current line to the Console.
💻 Console (bottom left)
The interactive R prompt. Whatever you type runs immediately. > means R is ready for input; + means the previous expression is incomplete (press Esc to cancel).
🌐 Environment / History (top right)
Shows every variable and dataset currently in memory. The Environment tab is invaluable for debugging: you can inspect an object's type, dimensions, and contents at any time.
📁 Files / Plots / Packages / Help (bottom right)
File browser, plot preview, package list, and help docs. Type ?function_name in the Console to open its documentation in the Help tab.
4. The working directory: where every path starts
The working directory is the folder R treats as "current". When you write read.csv("data.csv"), R looks for data.csv in the working directory; ggsave("plot.png") saves there too. Most "file not found" errors come down to the working directory not being what you expected.
Basic commands
# Check the current working directory
getwd()
#> [1] "C:/Users/Charlene/Documents"

# Set a new working directory
setwd("E:/Charlene/Bioinformatics_Tutorials/R")

# List files in the current directory
list.files()
list.files(pattern = "\\.csv$")  # CSV files only

# Create subdirectories (showWarnings = FALSE suppresses the warning if one already exists)
dir.create("results", showWarnings = FALSE)
dir.create("results/figures", recursive = TRUE, showWarnings = FALSE)
R treats the backslash (\) as an escape character, so a path copied straight from Windows File Explorer, such as E:\Charlene\R, will not work when pasted into R. Rewrite it in one of these forms:
- Forward slashes: "E:/Charlene/R" (recommended, works on every platform)
- Double backslashes: "E:\\Charlene\\R" (uglier, but matches the Windows convention)
Absolute vs. relative paths
🗺️ Absolute path
The full path, starting from the disk root:
E:/Charlene/Bioinformatics_Tutorials/R/data/counts.csv
Pros: unambiguous; does not depend on the working directory.
Cons: breaks on another machine or under another user account, so it cannot be shared with collaborators.
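If an absolute Windows path does have to be pasted in as-is, one option is a raw string, which avoids the backslash-escaping problem. The sketch below assumes R 4.0 or later (where raw strings were introduced); win_path, portable, and data_file are illustrative names, and the path is the tutorial's example path.

```r
# Raw string (R >= 4.0): backslashes inside r"(...)" are literal, no escaping needed
win_path <- r"(E:\Charlene\R)"

# Convert to the forward-slash form; with fixed = TRUE, "\\" is one literal backslash
portable <- gsub("\\", "/", win_path, fixed = TRUE)
print(portable)

# file.path() joins components with "/" on every platform
data_file <- file.path(portable, "data", "counts.csv")
print(data_file)
```

This keeps the pasted path readable while still ending up with the forward-slash form recommended above.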
📍 Relative path
A path that starts from the working directory:
data/counts.csv
Pros: portable; zip up the whole project folder and it still runs.
Cons: requires the working directory to be set correctly first.
Best practice: use here::here("data", "counts.csv") (see the I/O chapter).
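To see how a relative path depends on the working directory, here is a small self-contained sketch in base R. It builds a throwaway project under tempdir() (the folder name demo_project and the toy gene counts are made up for illustration) and reads a relative path from inside that folder.

```r
# Build a throwaway project folder with one data file (all names are illustrative)
proj <- file.path(tempdir(), "demo_project")
dir.create(file.path(proj, "data"), recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(gene = c("geneA", "geneB"), count = c(10, 20)),
          file.path(proj, "data", "counts.csv"), row.names = FALSE)

# setwd() invisibly returns the previous working directory, so it can be restored later
old_wd <- setwd(proj)

rel_path <- "data/counts.csv"
found_inside <- file.exists(rel_path)                # resolved against the project folder
abs_path <- normalizePath(rel_path, winslash = "/")  # the absolute path R actually used
counts <- read.csv(rel_path)

setwd(old_wd)                                        # restore the original working directory
print(found_inside)
print(abs_path)
```

Once the working directory is restored, the same "data/counts.csv" would normally no longer resolve. That is the failure mode RStudio Projects and here::here() are meant to prevent.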
5. Recommended project folder structure
Give every analysis project its own folder and organize it as follows, a layout widely used in both academia and industry:
my_rnaseq_project/
├── my_rnaseq_project.Rproj    # RStudio project file (double-click to open)
├── README.md                  # project description
├── data/                      # input data (raw files are read-only)
│   ├── raw/                   # original files, never modified
│   └── processed/             # cleaned intermediate files
├── R/                         # R scripts
│   ├── 01_load_data.R
│   ├── 02_qc.R
│   ├── 03_dge.R
│   └── 04_enrichment.R
├── results/                   # analysis outputs
│   ├── tables/                # tables: .csv, .xlsx
│   └── figures/               # figures: .png, .pdf, .svg
├── reports/                   # R Markdown / Quarto reports
│   └── final_report.Rmd
└── renv.lock                  # package lockfile (see the Reproducibility chapter)
- Raw data is read-only: scripts read from data/raw/ and never write to it, so the original files cannot be contaminated.
- Every output must be reproducible: everything in results/ should be generated by the scripts in R/, so deleting it and re-running restores it.
- Never hard-code absolute paths: use an RStudio Project plus the here package, and the project runs on any machine.
6. RStudio Projects: automatic working-directory management
Once you create an RStudio Project, double-clicking its .Rproj file opens RStudio with the working directory set to that folder. No manual setwd() is needed, and scripts can use relative paths directly.
Create a new project
Menu: File → New Project... → New Directory → New Project.
Name and location
Set Directory name to my_rnaseq_project; under Create project as subdirectory of, click Browse and choose E:/Charlene/Bioinformatics_Tutorials/R/.
(Optional) Enable renv
Tick Use renv with this project to lock package versions (see the Reproducibility chapter). Beginners can leave it unticked for now.
Create subfolders
In the Files tab (bottom right), click New Folder to create data/, R/, results/, and so on, or run this in the Console:
# Create the full project structure in one go
folders <- c("data/raw", "data/processed", "R",
"results/tables", "results/figures", "reports")
for (f in folders) dir.create(f, recursive = TRUE, showWarnings = FALSE)
list.files(recursive = TRUE, include.dirs = TRUE)
Common mistakes to avoid:
- Scattering scripts and data across the Desktop or Downloads folder.
- Hard-coding a path with setwd("C:/Users/me/Desktop/proj"); it will not exist on a collaborator's machine.
- Dumping every file into a single folder; three months later nobody can tell whether final_v2_REAL_FINAL.R is actually the final version.
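Instead of a hard-coded setwd(), a script can check that it is being run from the project root and stop with a clear message if not. The two helpers below are hypothetical sketches, not part of RStudio or any package; they assume the project root is the folder that contains a .Rproj file.

```r
# Hypothetical helper: TRUE if `dir` contains an RStudio project file (*.Rproj)
is_project_root <- function(dir = getwd()) {
  length(list.files(dir, pattern = "\\.Rproj$")) > 0
}

# Hypothetical guard for the top of a script: fail early with a readable message
require_project_root <- function(dir = getwd()) {
  if (!is_project_root(dir)) {
    stop("No .Rproj file in ", dir,
         ". Open the project by double-clicking its .Rproj file.", call. = FALSE)
  }
  invisible(TRUE)
}

# Demonstration in a fresh temporary folder
demo_dir <- tempfile("guard_demo")
dir.create(demo_dir)
before <- is_project_root(demo_dir)                    # no project file yet
invisible(file.create(file.path(demo_dir, "guard_demo.Rproj")))
after <- is_project_root(demo_dir)                     # project file now present
print(c(before = before, after = after))
```

Calling require_project_root() as the first line of a script turns a confusing "file not found" error later on into an immediate, explicit one.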
7. Installing and loading packages
R's functionality is extended through packages. For bioinformatics you need to know three sources:
- CRAN: the official R package repository; install with install.packages().
- Bioconductor: a repository built for bioinformatics; install with BiocManager::install().
- GitHub: development versions and packages not on CRAN; install with remotes::install_github().
# Install one package
install.packages("tidyverse")

# Install several packages at once
install.packages(c("ggplot2", "dplyr", "data.table"))

# Load a package (must be repeated in every new R session)
library(tidyverse)

# Call a function without loading its package, using ::
dplyr::filter(mtcars, mpg > 25)

# List installed packages
installed.packages()[, "Package"]

# Update all installed packages without prompting
update.packages(ask = FALSE)
# First time only: install BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

# Install Bioconductor packages
BiocManager::install(c("DESeq2", "limma", "clusterProfiler"))

# Check that the Bioconductor version matches your R version
BiocManager::version()
BiocManager::valid()

# After loading, browse the vignettes (tutorials written by the package authors)
library(DESeq2)
browseVignettes("DESeq2")
# First time only: install remotes
install.packages("remotes")

# Install a package from GitHub (user/repo format)
remotes::install_github("satijalab/seurat")

# Pin a specific commit, branch, or tag
remotes::install_github("satijalab/seurat@v5.0.0")

# For developers: install from a local folder (used in package development)
remotes::install_local("E:/dev/myPackage")
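Whichever source a package comes from, it is often convenient to install only what is missing. Below is a minimal sketch using base R only. The function name missing_packages is made up for illustration, and the install line is left commented out so nothing is downloaded by accident.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

wanted <- c("stats", "ggplot2", "dplyr")
to_install <- missing_packages(wanted)
print(to_install)

# Install only the missing CRAN packages (uncomment to run)
# if (length(to_install) > 0) install.packages(to_install)
```

The same check works for Bioconductor packages; only the install call would change to BiocManager::install().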
📦 Where packages are stored (library path)
Knowing where packages live helps resolve many odd problems, such as package conflicts and permission errors.
# Show all library search paths
.libPaths()

# Typical per-user path on Windows:
#   C:/Users/<you>/AppData/Local/R/win-library/4.4
# System-wide path (requires administrator rights):
#   C:/Program Files/R/R-4.4.x/library

# Prepend a project-local library (the approach renv takes)
.libPaths(c("./renv/library/R-4.4/x86_64-w64-mingw32", .libPaths()))
Common installation problems on Windows:
- Missing Rtools: some packages must be compiled from source. Download the Rtools release that matches your R version from https://cran.r-project.org/bin/windows/Rtools/ and install it.
- Non-ASCII characters or spaces in the path: strongly consider moving your user folder or working directory to an ASCII-only path.
- Permission denied: installing into "Program Files" requires administrator rights. Leave the system library alone and install into the user-level library instead.
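The path and permission problems above can be checked from the Console before installing anything. This is a rough diagnostic sketch using base R only. has_risky_chars is a hypothetical helper name, and "risky" here just means a space or a non-ASCII character.

```r
# Hypothetical helper: TRUE if a path contains a space or any non-ASCII character
has_risky_chars <- function(path) {
  codes <- utf8ToInt(enc2utf8(path))
  any(codes > 127) || grepl(" ", path, fixed = TRUE)
}

# Check the current working directory and the first library path
print(has_risky_chars(getwd()))
print(has_risky_chars(.libPaths()[1]))

# file.access(mode = 2) returns 0 when the directory is writable by the current user
lib_writable <- unname(file.access(.libPaths()[1], mode = 2) == 0)
print(lib_writable)
```

If lib_writable is FALSE, the first library path is probably the system library, and installing into a user-level library is the safer route.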
8. Your first runnable script
Paste the following code into the RStudio Source editor and press Ctrl/Cmd + Shift + S to run the whole script:
# ===== Your first R script =====
# 1. Calculate
2 + 3 * 4
sqrt(144)
# 2. Variables
gene_count <- 20000
sample_size <- 100
cat("Total measurements:", gene_count * sample_size, "\n")
# 3. Vectors
expression <- c(5.2, 8.1, 3.4, 9.7, 2.1)
mean(expression)
sd(expression)
# 4. Built-in dataset + plot
plot(iris$Sepal.Length, iris$Sepal.Width,
     col = iris$Species, pch = 19,
     xlab = "Sepal Length", ylab = "Sepal Width",
     main = "Iris")
legend("topright", legend = levels(iris$Species),
       col = 1:3, pch = 19)
Handy keyboard shortcuts:
- Ctrl/Cmd + Enter: run the current line (the most used)
- Ctrl/Cmd + Shift + S: source the entire script
- Ctrl/Cmd + Shift + M: insert a pipe (|> or %>%)
- Alt + - (Option + - on macOS): insert the assignment operator <-
- Ctrl/Cmd + Shift + R: insert a foldable section header
- Ctrl + L: clear the Console
9. Ways to save your work
Once an analysis has run, you need to save the results. Saving in R falls into four common categories:
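| What to save | Functions | Extensions | Notes |
|---|---|---|---|
| Tabular data (human-readable) | write.csv() / readr::write_csv() | .csv .tsv | Opens in Excel or any other tool; larger files, no type information. |
| A single R object (fastest) | saveRDS() / readRDS() | .rds | Preserves all R attributes and is readable across machines; stores only one object per file. |
| Multiple objects / the whole workspace | save() / load() | .RData .rda | Stores several objects; load() silently overwrites variables with the same name, so use it with care. |
| Figures | ggsave() / png() + dev.off() | .png .pdf .svg | ggsave() is the ggplot2 convenience function; dpi and dimensions can be specified. |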
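Because load() restores objects under their saved names, it can silently replace variables you are working with. One way to stay safe, sketched below with toy objects and a temporary file, is to load into a fresh environment through the envir argument and pull out only what is needed.

```r
# Save two toy objects into one .RData file
threshold <- 0.05
genes <- c("geneA", "geneB")
rdata_file <- tempfile(fileext = ".RData")
save(threshold, genes, file = rdata_file)

# Later in the session, `threshold` has a different value
threshold <- 0.01

# Load into a separate environment so the current `threshold` is not overwritten
saved <- new.env()
loaded_names <- load(rdata_file, envir = saved)  # load() returns the restored names
print(loaded_names)
print(threshold)        # still the current value
print(saved$threshold)  # the saved value
```

For a single object, saveRDS() and readRDS() avoid the issue altogether, because the caller chooses the variable name on reading.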
# Save as CSV
write.csv(my_results, "results/tables/dge_results.csv", row.names = FALSE)

# tidyverse style (faster, and never writes row names)
readr::write_csv(my_results, "results/tables/dge_results.csv")

# Date-stamped filename (avoids overwriting earlier results)
fname <- paste0("results/tables/dge_", Sys.Date(), ".csv")
readr::write_csv(my_results, fname)
# e.g. results/tables/dge_2026-05-07.csv
# Save
saveRDS(deseq_object, "results/dds.rds")

# Load (readRDS() returns the object, so assign it to a variable)
dds <- readRDS("results/dds.rds")

# Advantages of RDS: preserves S4 objects, factor level order, attributes, and more.
# For long-running analyses (hours), always saveRDS() the intermediate results!
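The difference between the CSV and RDS routes shows up in a quick round trip. In this sketch (toy data frame, temporary files), the factor's custom level order survives saveRDS()/readRDS() but not write.csv()/read.csv().

```r
# Toy data with a factor whose level order is deliberately non-alphabetical
samples <- data.frame(
  sample = c("s1", "s2", "s3"),
  group  = factor(c("ctrl", "treat", "ctrl"), levels = c("treat", "ctrl"))
)

csv_file <- tempfile(fileext = ".csv")
rds_file <- tempfile(fileext = ".rds")
write.csv(samples, csv_file, row.names = FALSE)
saveRDS(samples, rds_file)

from_csv <- read.csv(csv_file)
from_rds <- readRDS(rds_file)

print(class(from_csv$group))   # type information is not stored in a CSV
print(levels(from_rds$group))  # level order is kept in the RDS file
```

Level order matters in practice because many modelling functions treat the first level as the reference group.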
# ggplot object
p <- ggplot2::ggplot(iris, ggplot2::aes(Sepal.Length, Sepal.Width, color = Species)) +
  ggplot2::geom_point()
ggplot2::ggsave("results/figures/iris_scatter.png", p, width = 6, height = 4, dpi = 300)

# Publication-quality PDF (vector graphics, scales without loss)
ggplot2::ggsave("results/figures/iris_scatter.pdf", p, width = 6, height = 4)

# Base R plot (not ggplot)
png("results/figures/base_plot.png", width = 800, height = 600, res = 120)
plot(iris$Sepal.Length, iris$Sepal.Width)
dev.off()  # required: the file is only complete once the device is closed
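To check that a base-graphics file is actually on disk once the device is closed, the sketch below writes a small PDF to a temporary file and inspects it afterwards. It uses pdf() rather than png() only because pdf() does not depend on optional graphics capabilities; the same open, draw, dev.off() pattern applies to png().

```r
# Open a PDF device, draw, then close it; the file is finalized by dev.off()
pdf_file <- tempfile(fileext = ".pdf")
pdf(pdf_file, width = 6, height = 4)
plot(iris$Sepal.Length, iris$Sepal.Width,
     xlab = "Sepal Length", ylab = "Sepal Width")
invisible(dev.off())

# After dev.off() the file exists and is non-empty
print(file.exists(pdf_file))
print(file.size(pdf_file) > 0)
```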
10. Decision guide: which function should I use?
🤔 Task → recommended function
- Want to check the current working directory? getwd()
- Want to change the working directory? setwd("E:/path"), or open an RStudio Project
- Want to see which files are in a folder? list.files("data/")
- Want to create a subfolder? dir.create("results/figures", recursive = TRUE)
- Want to save a table? write.csv() or writexl::write_xlsx()
- Want to save a single R object? saveRDS() / readRDS()
- Want to save a figure? ggsave()
📝 Self-check
1. On Windows, which of the following path strings fails in R?
2. You open RStudio by double-clicking a .Rproj file. Which statement is correct?
3. What is the correct command to install DESeq2 from Bioconductor?
4. When you close RStudio, it asks "Save workspace image to .RData?". What is the best choice?