A Panorama of LLM Agent Types: The Evolution from Basic Tools to Autonomous Decision-Making

Published 2025-7-22 08:21

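This article systematically classifies LLM agents (also simply called "agents") and analyzes how the different types handle tasks: reasoning, tool calling, multi-agent collaboration, and so on.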

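Introduction

At the core of every successful AI agent lies one fundamental skill: prompting (or "prompt engineering"). It is the practice of guiding an LLM to perform a task by carefully designing the input text.

Prompt engineering evolved from the inputs used for the first text-to-text NLP models (2018). Back then, developers typically focused more on modeling and feature engineering. After the large GPT models were created (2022), we began to rely mostly on pre-trained tools, so the focus shifted to the format of the input. That is how the discipline of "prompt engineering" was born. Today (2025), as NLP blurs the line between code and prompt, it has grown into a blend of art and science.

Different prompting techniques create different types of agents. Each method strengthens a specific skill: logic, planning, memory, accuracy, or tool integration. Let's go through all of them using one very simple example.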

## Model setup
import ollama
llm = "qwen2.5"

## Question
q = "What is 30 multiplied by 10?"

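Main Techniques

1. "Regular" prompting: just ask a question and get a direct answer

Also known as "zero-shot" prompting, this is the case where the model is given a task without any prior examples. This basic technique is designed for one-step agents that execute a task without intermediate reasoning, as was common with early models.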

response = ollama.chat(model=llm, messages=[
 {'role':'user', 'content':q}
])
print(response['message']['content'])

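2. ReAct (Reason+Act): combining reasoning and action

The model not only thinks about the problem but also takes actions based on its reasoning. It is therefore more interactive: the model alternates between reasoning steps and actions and refines its approach iteratively. In essence, it is a loop of thought, action, and observation. It is used for more complex tasks, such as searching the web and making decisions based on the results, and is typically designed for multi-step agents that perform a series of reasoning steps and actions to reach a final result. Such agents can break a complex task into smaller, more manageable parts that build on one another step by step.

Personally, I really like ReAct agents. I find them closer to humans, because like us they "wander around and discover new things".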

prompt = '''
To solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Action:', and 'Observation:' sequences.
At each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task, then the tools that you want to use.
Then in the 'Action:' sequence, you should use one of your tools.
During each intermediate step, you can use 'Observation:' field to save whatever important information you will use as input for the next step.
'''

response = ollama.chat(model=llm, messages=[
 {'role':'user', 'content':q+" "+prompt}
])
print(response['message']['content'])
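
The prompt above only asks the model to write in the Thought/Action/Observation format. In a full agent, the surrounding program runs the requested tool and feeds the result back as an observation. A minimal sketch of that loop follows. Everything in it is an assumption made for illustration: the `Action: tool[input]` syntax, the `multiply` tool, the `react_loop` helper, and `scripted_model`, a stand-in that replays canned replies so the loop can run without a model server.

```python
import re

# Hypothetical tool registry: the tool name and behavior are illustrative only.
def multiply(arg: str) -> str:
    a, b = (float(x) for x in arg.split(","))
    return str(a * b)

TOOLS = {"multiply": multiply}

ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")
FINAL_RE = re.compile(r"Final Answer:\s*(.*)")

def react_loop(model, question, max_steps=5):
    """Alternate model turns and tool calls until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = model(transcript)
        transcript += reply + "\n"
        final = FINAL_RE.search(reply)
        if final:
            return final.group(1).strip(), transcript
        action = ACTION_RE.search(reply)
        if action is None:
            transcript += "Observation: no valid action found\n"
            continue
        name, arg = action.groups()
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"unknown tool '{name}'"
        # The observation becomes part of the context for the next model turn.
        transcript += f"Observation: {observation}\n"
    return None, transcript

def scripted_model(transcript):
    """Stand-in for an LLM call: picks a canned reply based on the transcript."""
    if "Observation:" not in transcript:
        return "Thought: I need to multiply 30 by 10.\nAction: multiply[30, 10]"
    return "Thought: The tool returned the product.\nFinal Answer: 300"

answer, trace = react_loop(scripted_model, "What is 30 multiplied by 10?")
print(trace)
print("Answer:", answer)
```

To try this with a real model, `scripted_model` could be swapped for a function that sends the transcript through `ollama.chat` and returns the reply text.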

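3. Chain-of-Thought (CoT)

This is a reasoning pattern that involves generating the process used to reach a conclusion. The model is forced to "think out loud" by explicitly laying out the logical steps that lead to the final answer. In essence, it is a plan without feedback. CoT is most commonly used for advanced tasks, such as math problems that may need step-by-step reasoning, and is typically designed for multi-step agents.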

prompt = '''Let’s think step by step.'''

response = ollama.chat(model=llm, messages=[
 {'role':'user', 'content':q+" "+prompt}
])
print(response['message']['content'])
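
Because a CoT reply mixes the reasoning with the result, a program that needs the result downstream usually has to pull it out of the text. One possible way to do this is sketched below, assuming the prompt is extended to ask for a closing line of the form `Answer: <value>`. That convention is introduced here and is not part of the prompt above; `extract_final_answer` and `sample_reply` are likewise made up for illustration.

```python
import re

def extract_final_answer(reply):
    """Return the text after the last 'Answer:' marker, or None if absent."""
    matches = re.findall(r"^\s*Answer:\s*(.+?)\s*$", reply, flags=re.MULTILINE)
    return matches[-1] if matches else None

# A made-up model reply, used only to exercise the parser.
sample_reply = """Step 1: 30 multiplied by 10 means adding 30 ten times.
Step 2: 30 * 10 = 300.
Answer: 300"""

print(extract_final_answer(sample_reply))
```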

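CoT Extensions

Several other prompting approaches derive from Chain-of-Thought.

4. Reflexion prompting

This adds an iterative self-check or self-correction phase on top of the initial CoT reasoning, in which the model reviews and critiques its own output (spotting errors, identifying gaps, suggesting improvements).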

cot_answer = response['message']['content']

response = ollama.chat(model=llm, messages=[
 {'role':'user', 'content': f'''Here was your original answer:\n\n{cot_answer}\n\n
 Now reflect on whether it was correct or if it was the best approach. 
 If not, correct your reasoning and answer.'''}
])
print(response['message']['content'])
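
The snippet above runs a single reflection pass. Since the technique is described as iterative, one possible extension is to repeat the critique until the model reports no further problems or a round limit is reached. The sketch below assumes a hypothetical convention in which the critic replies `OK` when satisfied. `reflect`, `draft_fn`, and `critic_fn` are illustrative names, and the two stub functions stand in for real `ollama.chat` calls.

```python
def reflect(draft_fn, critic_fn, task, max_rounds=3):
    """Draft an answer, then revise it until the critic accepts or rounds run out."""
    answer = draft_fn(task)
    history = [answer]
    for _ in range(max_rounds):
        feedback = critic_fn(task, answer)
        if feedback.strip().upper() == "OK":
            break
        # Fold the critique into the next drafting prompt.
        answer = draft_fn(f"{task}\nPrevious answer: {answer}\nCritique: {feedback}")
        history.append(answer)
    return answer, history

# Stubs standing in for model calls (assumptions for illustration).
def stub_draft(prompt):
    return "300" if "Critique:" in prompt else "310"

def stub_critic(task, answer):
    return "OK" if answer == "300" else "30 * 10 is 300, not 310."

final, history = reflect(stub_draft, stub_critic, "What is 30 multiplied by 10?")
print(history)
```

Keeping `history` makes it easy to inspect how the answer changed between rounds, and `max_rounds` guards against a critic that never accepts.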

5. Tree-of-Thoughts (ToT)

This approach generalizes CoT into a tree, exploring several reasoning chains at the same time.

num_branches = 3

prompt = f'''
You will think of multiple reasoning paths (thought branches). For each path, write your reasoning and final answer.
After exploring {num_branches} different thoughts, pick the best final answer and explain why.
'''

response = ollama.chat(model=llm, messages=[
 {'role':'user', 'content': f"Task: {q} \n{prompt}"}
])
print(response['message']['content'])
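
The prompt above asks a single model call to imagine its own branches. A fuller ToT setup usually makes the search explicit: propose several next thoughts, score them, and keep only the most promising ones at each level. The sketch below shows that idea as a small beam search. It is a toy under stated assumptions: `tree_of_thoughts`, `propose`, and `score` are hypothetical names, and the numeric stubs replace what would normally be model calls that generate and evaluate thoughts.

```python
def tree_of_thoughts(root, propose, score, beam_width=2, depth=2):
    """Breadth-first search over thoughts, keeping the best `beam_width` per level."""
    frontier = [root]
    best = root
    for _ in range(depth):
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
        if score(frontier[0]) > score(best):
            best = frontier[0]
    return best

# Toy stand-ins for model calls: a "thought" is (path_description, value).
TARGET = 300

def propose(state):
    path, value = state
    return [
        (path + " -> add 10", value + 10),
        (path + " -> times 10", value * 10),
        (path + " -> times 2", value * 2),
    ]

def score(state):
    # Higher is better: thoughts closer to the target score higher.
    return -abs(TARGET - state[1])

best_path, best_value = tree_of_thoughts(("start at 30", 30), propose, score)
print(best_path, "=", best_value)
```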

6. Graph-of-Thoughts (GoT)

This approach generalizes CoT into a graph, so that interconnected branches are also taken into account.

class GoT:
    def __init__(self, question):
        self.question = question
        self.nodes = {}  # node_id: text
        self.edges = []  # (from_node, to_node, relation)
        self.counter = 1

    def add_node(self, text):
        node_id = f"Thought{self.counter}"
        self.nodes[node_id] = text
        self.counter += 1
        return node_id

    def add_edge(self, from_node, to_node, relation):
        self.edges.append((from_node, to_node, relation))

    def show(self):
        print("\n--- Current Thoughts ---")
        for node_id, text in self.nodes.items():
            print(f"{node_id}: {text}\n")
        print("--- Connections ---")
        for f, t, r in self.edges:
            print(f"{f} --[{r}]--> {t}")
        print("\n")

    def expand_thought(self, node_id):
        prompt = f"""
        You are reasoning about the task: {self.question}
        Here is a previous thought node ({node_id}):\"\"\"{self.nodes[node_id]}\"\"\"
        Please provide a refinement, an alternative viewpoint, or a related thought that connects to this node.
        Label your new thought clearly, and explain its relation to the previous one.
        """
        response = ollama.chat(model=llm, messages=[{'role':'user', 'content':prompt}])
        return response['message']['content']

## Start building the graph
g = GoT(q)

## Get the initial thought
response = ollama.chat(model=llm, messages=[
    {'role':'user', 'content':q}
])
n1 = g.add_node(response['message']['content'])

## Expand the initial thought with some refinements
refinements = 1
for _ in range(refinements):
    expansion = g.expand_thought(n1)
    n_new = g.add_node(expansion)
    g.add_edge(n1, n_new, "expansion")
    g.show()

## Final answer
prompt = f'''
Here are the reasoning thoughts so far:
{chr(10).join([f"{k}: {v}" for k,v in g.nodes.items()])}
Based on these, select the best reasoning and final answer for the task: {q}
Explain your choice.
'''

## Send the prompt that contains the collected thoughts, not the bare question
response = ollama.chat(model=llm, messages=[
    {'role':'user', 'content':prompt}
])
print(response['message']['content'])
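
What separates a graph from a tree is that a thought may have several parents, for instance when two reasoning paths are merged into one conclusion. The example above only ever adds children to a single node, so the following standalone sketch shows the merge step. `ThoughtGraph`, `merge`, and `ancestors` are names invented here for illustration, and the `combine` callback stands in for a model call that would synthesize the parent thoughts.

```python
class ThoughtGraph:
    """Minimal directed graph of thoughts; a node may have several parents."""

    def __init__(self):
        self.nodes = {}    # node_id -> text
        self.parents = {}  # node_id -> list of parent ids

    def add(self, text, parents=()):
        node_id = f"Thought{len(self.nodes) + 1}"
        self.nodes[node_id] = text
        self.parents[node_id] = list(parents)
        return node_id

    def merge(self, node_ids, combine):
        """Create one node from combine(texts) and link it to every input node."""
        merged_text = combine([self.nodes[n] for n in node_ids])
        return self.add(merged_text, parents=node_ids)

    def ancestors(self, node_id):
        """All nodes reachable by following parent links, each listed once."""
        seen, stack = [], list(self.parents[node_id])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.append(current)
                stack.extend(self.parents[current])
        return sorted(seen)

g = ThoughtGraph()
root = g.add("30 multiplied by 10")
a = g.add("30 * 10 = 300 by direct multiplication", parents=[root])
b = g.add("3 * 10 = 30, then append a zero: 300", parents=[root])
# `combine` would normally be a model call that synthesizes the inputs (assumption).
m = g.merge([a, b], combine=lambda texts: "Both paths agree: 300")
print(m, g.parents[m], g.ancestors(m))
```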

7. Program-of-Thoughts (PoT)

This approach specializes in programming: the reasoning happens through executable code snippets.

import builtins
import re

def extract_python_code(text):
    match = re.search(r"```python(.*?)```", text, re.DOTALL)
    if match:
        return match.group(1).strip()
    return None

def sandbox_exec(code):
    ## Minimal sandbox: expose only a few builtins to the generated code.
    ## This narrows what the code can reach; it is not a real security boundary.
    allowed_builtins = {'abs', 'min', 'max', 'pow', 'round'}
    ## The names must sit under the '__builtins__' key, otherwise exec() adds the full set
    safe_globals = {'__builtins__': {k: getattr(builtins, k) for k in allowed_builtins}}
    safe_locals = {}
    exec(code, safe_globals, safe_locals)
    return safe_locals.get('result', None)

prompt = '''
Write a short Python program that calculates the answer and assigns it to a variable named 'result'. 
Return only the code enclosed in triple backticks with 'python' (```python ... ```).
'''

response = ollama.chat(model=llm, messages=[
    {'role':'user', 'content': f"Task: {q} \n{prompt}"}
])
print(response['message']['content'])

## Run the generated code only if a code block was actually returned
code = extract_python_code(text=response['message']['content'])
if code is None:
    print("No Python code block found in the response.")
else:
    print(sandbox_exec(code=code))
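
The extract-then-execute pipeline can also be tried without a model server by feeding it a canned reply. The sketch below does that and additionally checks that an `import` statement fails once the builtins are restricted. It is an illustration under assumptions: `run_restricted` and `canned_reply` are names made up here, and the allowed names are placed under the `__builtins__` key of the globals dictionary. Restricting builtins reduces what generated code can reach, but it should not be treated as a real security boundary.

```python
import builtins
import re

FENCE = "`" * 3  # built programmatically so this example nests cleanly in a fenced block
ALLOWED = ("abs", "min", "max", "pow", "round")

def extract_python_code(text):
    """Return the body of the first python-tagged fenced block, or None."""
    match = re.search(FENCE + r"python(.*?)" + FENCE, text, re.DOTALL)
    return match.group(1).strip() if match else None

def run_restricted(code):
    """Execute code with only a few builtins visible and return its `result` variable."""
    env = {"__builtins__": {name: getattr(builtins, name) for name in ALLOWED}}
    exec(code, env)
    return env.get("result")

# A made-up model reply, standing in for response['message']['content'].
canned_reply = f"Here is the program:\n{FENCE}python\nresult = 30 * 10\n{FENCE}"

value = run_restricted(extract_python_code(canned_reply))
print(value)

# With __import__ missing from the builtins, an import statement cannot run.
try:
    run_restricted("import os\nresult = 1")
    import_blocked = False
except ImportError:
    import_blocked = True
print("import blocked:", import_blocked)
```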

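Conclusion

This article has given an overview of the main prompting techniques for AI agents. There is no single "best" technique, however, because the choice depends heavily on the task and on how complex the required reasoning is.

For example, simple tasks such as summarization and translation are easily handled with zero-shot (regular) prompting, while CoT works well for math and logic tasks. Agents with tools, on the other hand, are usually built with the ReAct pattern. And when an agent needs to learn from mistakes or iterations to improve its results, as in gaming, Reflexion is the most suitable pattern.

In terms of versatility on complex tasks, PoT is the real winner, because it is based entirely on code generation and execution. In fact, PoT agents are getting closer and closer to replacing humans in several office tasks.

I believe that in the near future prompting will no longer be just about "what you say to the model", but about building an interactive loop between human intent, machine reasoning, and external action.

The full source code for the examples in this article is available on GitHub.

About the Translator

Zhu Xianzhong is a 51CTO community editor, a 51CTO expert blogger and lecturer, a computer science teacher at a university in Weifang, and a veteran freelance programmer.

Original title: Recap of all types of LLM Agents, by Mauro Di Pietro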
