一、要不要用框架?
2025 年 Anthropic 工程部落格寫過一篇有名的文章「Building effective agents」:對絕大多數任務,純 Python while loop + LLM API + 一份工具註冊表就夠了。框架的價值在於:
- 提供已測試的設計模式(避免重造輪子)
- 觀測性 / 持久化 / checkpoint 已內建
- 多 agent 的 hand-off / 訊息協定已標準化
- 能畫出圖、視覺化 debug
但框架也有代價:
- 抽象層讓除錯變難(「為什麼這個 prompt 變成這樣?」)
- 升級風險(LangChain 0.x → 0.y 的破壞性改動)
- 過度工程化簡單任務
本章對比六大主流選項,幫你做出有依據的選擇。
Anthropic's famous 2025 engineering post "Building effective agents" argues that for most tasks, a plain Python while-loop + LLM API + a tool registry is enough. The value of a framework is:
- Battle-tested patterns (no reinventing wheels)
- Built-in observability / persistence / checkpointing
- Standardized multi-agent hand-off & messaging
- Visualizable graphs for debugging
The cost:
- Abstractions make debugging harder ("why did this prompt mutate?")
- Upgrade risk (LangChain 0.x → 0.y breaking changes)
- Overkill for simple tasks
This chapter compares the six mainstream options to inform a grounded choice.
二、2026 主流 Agent 框架對照表
| 框架 | 主打 | 心智模型 | 多 agent | 觀測性 | 學習曲線 | 2026 狀態 | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| LangGraph | 高可控、複雜流程 | Maximum control, complex flows | 有向狀態圖 | Directed state graph | ★★★★★ | ★★★★★ (LangSmith) | ★★★★★ (LangSmith) | 中等偏陡 | Steep-ish | 🚀 生產首選 | 🚀 production king |
| CrewAI | 快速組「角色團隊」 | Quick role-based teams | role + task DSL | role + task DSL | ★★★★ | ★★★ | ★ 最低 | ★ lowest | 🌱 快速 prototype | 🌱 fast prototyping | |
| AutoGen / AG2 | 會話式多 agent、群聊 | Conversational multi-agent, group chat | GroupChat + Speaker | GroupChat + Speaker | ★★★★ | ★★★ | 中 | Moderate | ⚠️ Microsoft AutoGen 維護中、AG2 fork 接手 | ⚠️ MS AutoGen in maintenance; AG2 fork active | |
| LlamaIndex Agents | RAG 為主軸的 agent | RAG-first agents | QueryEngine + Agent | QueryEngine + Agent | ★★★ | ★★★ | 中 | Moderate | 📚 文件密集場景 | 📚 doc-heavy use cases | |
| OpenAI Agents SDK | OpenAI 模型 + Swarm 模式 | OpenAI models + Swarm pattern | Agent + Tool + Hand-off | Agent + Tool + Hand-off | ★★★★ | ★★★★ (Traces) | ★★ 低 | ★★ low | 🆕 2025 Q1 GA | 🆕 2025 Q1 GA | |
| Anthropic Agent SDK | 原生 MCP + Claude + sub-agent | Native MCP + Claude + sub-agent | Sub-agent + MCP servers | Sub-agent + MCP servers | ★★★★ | ★★★★ | ★★ 低 | ★★ low | 🆕 2025 Q4 推出 | 🆕 launched 2025 Q4 |
三、三大主流框架的代碼風味比較
同樣的「研究 → 寫稿」雙 agent 任務,三種寫法:
The same "researcher → writer" two-agent task, three ways:
from langgraph.graph import StateGraph, END from typing import TypedDict class S(TypedDict): query:str; notes:str; draft:str def research(s): return {"notes": llm.invoke(f"Research: {s['query']}").content} def write(s): return {"draft": llm.invoke(f"Write using: {s['notes']}").content} g = StateGraph(S) g.add_node("research", research); g.add_node("write", write) g.set_entry_point("research"); g.add_edge("research","write"); g.add_edge("write", END) app = g.compile(checkpointer=memory) # 內建持久化 result = app.invoke({"query":"CRISPR base editing"})
from crewai import Agent, Task, Crew, Process researcher = Agent(role="Researcher", goal="Gather 3 facts", backstory="PhD biologist.") writer = Agent(role="Writer", goal="200-word brief", backstory="Science journalist.") t1 = Task(description="Research CRISPR base editing", agent=researcher, expected_output="3 bullets") t2 = Task(description="Write 200-word brief", agent=writer, expected_output="draft", context=[t1]) crew = Crew(agents=[researcher,writer], tasks=[t1,t2], process=Process.sequential) result = crew.kickoff()
from openai_agents import Agent, Runner researcher = Agent(name="Researcher", instructions="Gather 3 key facts on the topic.") writer = Agent(name="Writer", instructions="Write a 200-word brief from the researcher's notes.", handoffs=[]) researcher.handoffs = [writer] # researcher 完成後可 hand off result = Runner.run_sync(researcher, "CRISPR base editing") print(result.final_output)
四、該選哪個框架?三步決策樹
🌳 選擇你的框架
五、選框架時的五個務實考量
① 團隊熟悉度
選團隊已熟悉的框架,學習成本比框架本身優劣更重要。LangChain 生態最大,找人最容易。
Pick what your team already knows — learning cost matters more than fine differences. LangChain has the largest ecosystem and talent pool.
② 模型廠商鎖定
OpenAI Agents SDK 與 Anthropic SDK 都只支援自家模型。LangGraph / CrewAI 跨廠商。看你是否在意。
OpenAI Agents SDK and Anthropic SDK lock you to one vendor; LangGraph / CrewAI are cross-vendor. Weigh the trade-off.
③ 可觀測性
LangGraph + LangSmith 是 2026 觀測性最完整組合(trace、time-travel debug、A/B)。其他框架靠第三方 (Langfuse、Arize)。
LangGraph + LangSmith is the most complete 2026 observability stack (traces, time-travel debug, A/B). Others rely on third-party (Langfuse, Arize).
④ 人類介入
LangGraph 原生支援「暫停 → 等人 → 繼續」(interrupt_before)。AutoGen/AG2 用 human-proxy agent;CrewAI 需要自己包。
LangGraph natively supports pause → wait for human → resume (interrupt_before). AutoGen/AG2 use a human-proxy agent; CrewAI needs custom wrappers.
⑤ 授權與營運
確認框架的 license 與你公司政策相容、確認原作者 / 公司會持續維護。AutoGen 2026 進入 maintenance 模式(MS 投入 Agent Framework),是個警訊。
Verify license compatibility and whether maintainers will keep shipping. AutoGen entered maintenance mode in 2026 (MS shifted to Agent Framework) — a cautionary signal.
六、Charlene 推薦的 2026 起手式
🚀 從零開始的個人 / 小團隊路線
- 第 1 週:純 Python + Anthropic SDK 或 OpenAI SDK 跑通
while loop版的 Step 1 agent,理解每一行代碼。 - 第 2 週:把工具改用 MCP 標準化(Step 11),用 Claude Desktop 或 Cursor 直接連線測試。
- 第 3 週:加上 Chroma 做 episodic memory + Cohere rerank 做 RAG。
- 第 4 週:把流程搬到 LangGraph,加上 checkpoint、observability、A/B canary。
- 之後:需要多 agent 時引入 OpenAI Agents SDK(單廠)或 LangGraph supervisor 模式(跨廠)。
- Week 1: Plain Python + Anthropic / OpenAI SDK with a
while loopagent (Chapter 1 starter). Understand every line. - Week 2: Standardize tools via MCP (Chapter 11). Test with Claude Desktop or Cursor.
- Week 3: Add Chroma for episodic memory + Cohere rerank for RAG.
- Week 4: Port to LangGraph; enable checkpointing, observability, A/B canary.
- Later: Introduce OpenAI Agents SDK (single-vendor) or LangGraph supervisor pattern (cross-vendor) when multi-agent is justified.
🎓 章節小測
Q1. Anthropic「Building effective agents」一文的核心建議是?
Q1. The core advice of Anthropic's "Building effective agents" post?
Q2. 哪個框架原生支援「暫停 → 等人 → 繼續」?
Q2. Which framework natively supports pause → wait-for-human → resume?
interrupt_before + checkpointer 提供原生 HITL。✅ LangGraph's interrupt_before + checkpointer give native HITL.Q3. 「先用框架再說」的最大風險?
Q3. The biggest risk of "framework-first"?