1. An Agent Is More Than a Chatbot
"AI Agent" is one of the most popular and most overloaded terms of 2026. The same label can describe a chatbot that looks up the weather or an automated engineer that reads code, runs unit tests, and opens PRs on its own. To avoid the confusion, start from the most basic definition.
The classic AI textbook by Russell & Norvig defines an agent as anything that perceives its environment through sensors and acts upon it through actuators. That definition is broad enough to cover robot vacuums, self-driving cars, and even a frog.
In 2026, "AI Agent" almost always means something narrower: a system that uses an LLM as its core reasoning engine, can call tools, retains memory, and autonomously plans multi-step tasks. That is more capable than a plain LLM chat, yet more pragmatic than artificial general intelligence (AGI).
2. PEAS: A Universal Framework for Analyzing Any Agent
Russell & Norvig's PEAS framework is the checklist to fill in before designing or evaluating any agent. It forces you to answer four questions:
P · Performance measure
What counts as success? Task completion rate? Human satisfaction? Cost? Latency? Without a defined P, you cannot evaluate the agent.
E · Environment
What environment does the agent operate in? A web browser? A terminal? A database? The real world? Is it discrete or continuous? Fully or partially observable?
A · Actuators
What actions can the agent take? Send HTTP requests? Run shell commands? Click the mouse? Query a database? These are the "tools" covered in Chapter 4.
S · Sensors
What can the agent observe? Text, images, API responses, the file system, sensor data? This determines the shape of the context.
Example: Describing a bug-fixing code agent with PEAS
| PEAS | Content |
|---|---|
| P | All unit tests pass + no new lint warnings + diff touches ≤ 5 files |
| E | Local Git repo, Linux shell, CI system, issue tracker |
| A | Read files, write files, run bash, run pytest, open a PR (`read_file`, `write_file`, `run_bash`, `run_pytest`, `open_pr`) |
| S | File contents, test output, CI logs, human reviewer comments |
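One way to make a PEAS description actionable is to write it down as data and turn the P row into a check that can be run after every agent attempt. The sketch below is only an illustration: the `PEAS` and `RunResult` classes, their field names, and `meets_performance` are hypothetical names chosen for this example, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PEAS:
    """A PEAS description captured as plain data (hypothetical structure)."""
    performance: tuple[str, ...]
    environment: tuple[str, ...]
    actuators: tuple[str, ...]
    sensors: tuple[str, ...]


# The bug-fixing agent from the table above.
BUG_FIX_AGENT = PEAS(
    performance=("all unit tests pass", "no new lint warnings", "diff touches <= 5 files"),
    environment=("local Git repo", "Linux shell", "CI system", "issue tracker"),
    actuators=("read_file", "write_file", "run_bash", "run_pytest", "open_pr"),
    sensors=("file contents", "test output", "CI logs", "human reviewer comments"),
)


@dataclass(frozen=True)
class RunResult:
    """What we assume gets measured after one agent attempt (hypothetical)."""
    tests_passed: bool
    new_lint_warnings: int
    files_changed: int


def meets_performance(result: RunResult, max_files: int = 5) -> bool:
    """Turn the P row into an executable pass/fail check."""
    return (
        result.tests_passed
        and result.new_lint_warnings == 0
        and result.files_changed <= max_files
    )


if __name__ == "__main__":
    print(meets_performance(RunResult(tests_passed=True, new_lint_warnings=0, files_changed=3)))
```

Writing P as a function makes the point of the framework concrete: if you cannot express the success criterion as a check, you cannot evaluate the agent.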
3. Four Classic Agent Types
From simplest to most capable, classical AI sorts agents into four tiers. Knowing this ladder helps you diagnose which tier your LLM agent is stuck on.
① Simple Reflex
A pure rule system of the form "if you see X, do Y". No memory, no planning. Example: a robot vacuum that turns when it bumps into a wall.
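A simple reflex agent can be sketched as an ordered list of condition-action rules applied to the current percept only. The percept keys (`bumped`, `dirt`) and action names below are made up for the robot vacuum example.

```python
from typing import Callable

# A rule pairs a condition on the current percept with an action name.
Rule = tuple[Callable[[dict], bool], str]

# Hypothetical rules for a robot vacuum; the first matching rule wins.
RULES: list[Rule] = [
    (lambda p: p.get("bumped", False), "turn"),
    (lambda p: p.get("dirt", False), "vacuum"),
]


def simple_reflex_agent(percept: dict, rules: list[Rule] = RULES, default: str = "forward") -> str:
    """Pick an action from the current percept alone: no memory, no planning."""
    for condition, action in rules:
        if condition(percept):
            return action
    return default


if __name__ == "__main__":
    print(simple_reflex_agent({"bumped": True}))
    print(simple_reflex_agent({}))
```

Note that the function keeps nothing between calls, which is exactly the limitation the next tier removes.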
② Model-Based Reflex
Keeps internal state, so it can use past observations to infer parts of the world it cannot currently see (a partially observable environment). Example: a self-driving car that remembers "there was a car on the right a moment ago" even when its sensors can no longer see it.
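The difference from a simple reflex agent is a piece of state that survives between percepts. Here is a toy sketch of the car example, assuming the right-lane sensor reports `True` (car seen), `False` (lane seen empty), or `None` (no line of sight). The class name, the action names, and the `memory_ticks` cutoff are illustrative assumptions.

```python
from typing import Optional


class LaneChangeAgent:
    """Toy model-based reflex agent: remembers a recent sighting in the right lane."""

    def __init__(self, memory_ticks: int = 3) -> None:
        self.memory_ticks = memory_ticks          # how long an old sighting is trusted
        self.ticks_since_seen: Optional[int] = None  # None = no car believed on the right

    def act(self, right_lane: Optional[bool]) -> str:
        # 1) Update the internal model from the new percept.
        if right_lane is True:
            self.ticks_since_seen = 0
        elif right_lane is False:
            self.ticks_since_seen = None
        elif self.ticks_since_seen is not None:
            # Sensor is blind: age the remembered sighting, forget it when too old.
            self.ticks_since_seen += 1
            if self.ticks_since_seen > self.memory_ticks:
                self.ticks_since_seen = None
        # 2) Act on the model, not on the raw percept.
        return "hold" if self.ticks_since_seen is not None else "merge_right"


if __name__ == "__main__":
    agent = LaneChangeAgent(memory_ticks=2)
    for percept in (True, None, None, None):
        print(percept, "->", agent.act(percept))
```

A simple reflex agent fed the same `None` percepts would have no basis for holding back; the remembered sighting is what changes the decision.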
③ Goal-Based
Has an explicit goal and uses search or planning to find a sequence of actions that achieves it. Examples: A* pathfinding, board-game AI.
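To make "search for an action sequence" concrete, here is a small A* planner on a grid, using unit step costs and a Manhattan-distance heuristic. The grid encoding (`#` for walls, `.` for free cells) and the function name `plan` are assumptions for this sketch.

```python
import heapq
from typing import Optional

Cell = tuple[int, int]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def plan(grid: list[str], start: Cell, goal: Cell) -> Optional[list[str]]:
    """A* search: return a shortest list of moves from start to goal, or None."""

    def heuristic(cell: Cell) -> int:
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    # Frontier entries: (estimated total cost, cost so far, cell, moves so far)
    frontier = [(heuristic(start), 0, start, [])]
    best_cost = {start: 0}
    while frontier:
        _, cost, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        for name, (dr, dc) in MOVES.items():
            r, c = cell[0] + dr, cell[1] + dc
            inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
            if not inside or grid[r][c] == "#":
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get((r, c), float("inf")):
                best_cost[(r, c)] = new_cost
                heapq.heappush(
                    frontier, (new_cost + heuristic((r, c)), new_cost, (r, c), path + [name])
                )
    return None  # goal unreachable


if __name__ == "__main__":
    print(plan(["..#", "..#", "..."], (0, 0), (2, 2)))
```

The agent's "decision" here is the whole returned plan, produced before any action is taken; that is what separates this tier from the two reflex tiers.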
④ Utility-Based and Learning
Goes beyond a binary "achieved / not achieved": a utility function scores how good each outcome is, and the agent can learn from experience to improve its policy. Modern LLM agents with RL fine-tuning belong to this tier.
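A minimal sketch of both halves of this tier: choosing the action with the highest expected utility, and nudging a utility estimate toward observed outcomes. The probabilities, utilities, and action names are invented numbers for illustration, and the update rule is a plain exponential moving average, not RL fine-tuning.

```python
# Each action maps to a list of (probability, utility) outcomes.
Outcomes = list[tuple[float, float]]


def expected_utility(outcomes: Outcomes) -> float:
    """Probability-weighted average utility of one action."""
    return sum(p * u for p, u in outcomes)


def choose_action(options: dict[str, Outcomes]) -> str:
    """Pick the action with the highest expected utility."""
    return max(options, key=lambda action: expected_utility(options[action]))


def update_estimate(old: float, observed: float, learning_rate: float = 0.1) -> float:
    """Learning step: move a utility estimate toward an observed outcome."""
    return old + learning_rate * (observed - old)


# Hypothetical choice facing a bug-fixing agent.
OPTIONS: dict[str, Outcomes] = {
    "minimal_patch": [(0.9, 10.0), (0.1, -5.0)],   # likely works, small payoff
    "big_refactor": [(0.6, 15.0), (0.4, -20.0)],   # bigger payoff, bigger risk
}

if __name__ == "__main__":
    print(choose_action(OPTIONS))
```

Both actions could "achieve the goal", so a goal-based agent has no way to rank them; the utility function is what lets the agent prefer the lower-risk patch.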
4. Anatomy of a Modern LLM Agent: Four Layers
Production-grade agents in 2026 typically follow a "four-layer architecture" (blogs from vendors such as Redis, Oracle, and IBM use similar breakdowns):
🧠 Reasoning layer
🎼 Orchestration layer
Chains single LLM calls into multi-step workflows. The implementation can be a while loop, a directed graph (LangGraph), or role-based (CrewAI). This layer decides when to call tools and when to stop; see Step 6.
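The loop variant of orchestration can be sketched without any LLM at all, which makes the two decisions (call a tool, or stop) easy to see and test. In the sketch below, `decide` stands in for the model call and is a scripted stub; the decision format, the `orchestrate` function, and the `max_steps` budget are assumptions for illustration, not the API of LangGraph, CrewAI, or any other framework.

```python
from typing import Callable

# A decision is either {"type": "tool", "name": ..., "args": {...}}
# or {"type": "final", "text": ...}. This shape is made up for the sketch.
Decision = dict


def orchestrate(
    decide: Callable[[list[dict]], Decision],
    tools: dict[str, Callable[..., str]],
    task: str,
    max_steps: int = 5,
) -> str:
    """Loop-style orchestration: ask, run the requested tool, feed the result back."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):  # step budget guards against endless loops
        decision = decide(history)
        if decision["type"] == "final":
            return decision["text"]
        tool = tools.get(decision["name"])
        observation = tool(**decision["args"]) if tool else f"unknown tool: {decision['name']}"
        history.append({"role": "tool", "name": decision["name"], "content": observation})
    return "stopped: step budget exhausted"


def scripted_decide(history: list[dict]) -> Decision:
    """Stub standing in for the LLM: call `add` once, then answer."""
    observations = [m for m in history if m["role"] == "tool"]
    if not observations:
        return {"type": "tool", "name": "add", "args": {"a": 17, "b": 23}}
    return {"type": "final", "text": f"The sum is {observations[-1]['content']}"}


TOOLS = {"add": lambda a, b: str(a + b)}

if __name__ == "__main__":
    print(orchestrate(scripted_decide, TOOLS, "Add 17 and 23."))  # The sum is 40
```

The explicit step budget is a design choice worth keeping when a real model replaces the stub: the loop then terminates even if the model never decides to stop.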
📚 Memory and data layer
5. Chat LLM vs Agent: A Capability Comparison Table
6. Five Common Misconceptions
7. A Minimal Agent: The Core Loop in About 30 Lines
The example below uses no framework: it calls the LLM API directly with two tools to show the agent's "think → act → observe → think again" loop. Step 4 and Step 6 cover the details.
```python
# pip install anthropic requests
import math

import requests
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# 1) Define the tools the agent can call
TOOLS = [
    {
        "name": "calculator",
        "description": "Evaluate a math expression. Input: a Python-safe expression string.",
        "input_schema": {"type": "object", "properties": {"expr": {"type": "string"}}, "required": ["expr"]},
    },
    {
        "name": "web_get",
        "description": "HTTP GET a URL and return text.",
        "input_schema": {"type": "object", "properties": {"url": {"type": "string"}}, "required": ["url"]},
    },
]


def run_tool(name, args):
    """Dispatch one tool call and return its result as a string."""
    if name == "calculator":
        # Demo only: eval() is not a sandbox, even with builtins removed.
        return str(eval(args["expr"], {"__builtins__": {}}, vars(math)))
    if name == "web_get":
        return requests.get(args["url"], timeout=10).text[:2000]
    return "unknown tool"


# 2) The core agent loop
messages = [{"role": "user", "content": "What is 2026's most-cited AI paper, and is its DOI a prime?"}]
while True:
    resp = client.messages.create(
        model="claude-sonnet-4-6", max_tokens=1024, tools=TOOLS, messages=messages
    )
    if resp.stop_reason != "tool_use":
        break  # no tool requested: the agent decided to stop (or hit max_tokens)
    messages.append({"role": "assistant", "content": resp.content})
    tool_results = []
    for block in resp.content:
        if block.type == "tool_use":
            result = run_tool(block.name, block.input)
            tool_results.append({"type": "tool_result", "tool_use_id": block.id, "content": result})
    messages.append({"role": "user", "content": tool_results})

# Print the text blocks of the final response
print("".join(block.text for block in resp.content if block.type == "text"))
```
```javascript
// npm install openai
// Run as an ES module (e.g. a .mjs file) so that top-level await works.
import OpenAI from "openai";

const client = new OpenAI();

const tools = [
  {
    type: "function",
    function: {
      name: "calculator",
      description: "Evaluate a math expression.",
      parameters: {
        type: "object",
        properties: { expr: { type: "string" } },
        required: ["expr"],
      },
    },
  },
];

async function runTool(name, args) {
  if (name === "calculator") return String(eval(args.expr)); // demo only
  return "unknown";
}

const messages = [
  { role: "user", content: "Compute 17 * 23 then check primality of the result." },
];

while (true) {
  const resp = await client.chat.completions.create({
    model: "gpt-4.1",
    messages,
    tools,
    tool_choice: "auto",
  });
  const msg = resp.choices[0].message;
  messages.push(msg);
  if (!msg.tool_calls) break; // no tool requested: this is the final answer
  for (const tc of msg.tool_calls) {
    const out = await runTool(tc.function.name, JSON.parse(tc.function.arguments));
    messages.push({ role: "tool", tool_call_id: tc.id, content: out });
  }
}

console.log(messages.at(-1).content);
```
Note: The eval() calls are for illustration only. Never pass user input directly to eval() in production; see the security chapter, Step 12.
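As one possible direction before reaching Step 12, the calculator tool can parse the expression with Python's `ast` module and evaluate only an allowlist of node types, instead of calling eval(). Emptying `__builtins__` does not make eval() a sandbox, because attribute access on ordinary objects can still reach dangerous internals. The function name `safe_calc` and the choice of allowed operators and functions below are assumptions for this sketch; exponentiation is left out on purpose because huge powers can exhaust CPU and memory.

```python
import ast
import math
import operator

# Allowlisted operators and functions; everything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_FUNCS = {"sqrt": math.sqrt, "floor": math.floor, "ceil": math.ceil}


def safe_calc(expr: str):
    """Evaluate a small arithmetic expression without eval()."""

    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](ev(node.operand))
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCS
            and not node.keywords
        ):
            return _FUNCS[node.func.id](*[ev(arg) for arg in node.args])
        # Names, attributes, subscripts, lambdas, etc. all end up here.
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

    # ast.parse raises SyntaxError on malformed input; callers should catch it.
    return ev(ast.parse(expr, mode="eval"))


if __name__ == "__main__":
    print(safe_calc("17 * 23"))  # 391
```

In the Python example above, `run_tool` could call `safe_calc(args["expr"])` in place of the eval() line; input length and number size would still need limits in a real deployment.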
🎓 Chapter Quiz (3 questions)
Q1. Which of the following is not a required component of an AI Agent as defined in this tutorial?
Q2. What does the "P" in PEAS stand for?
Q3. Why is the claim "giving an LLM five tools makes it an agent" inaccurate?