
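Graphs Empower AI Agents: Exploring the Prospects and Challenges of Next-Generation Agents
Abstract
AI agents have undergone a paradigm shift, from the early dominance of reinforcement learning (RL) to the rise of agents powered by large language models (LLMs), and are now moving toward a synergistic fusion of RL and LLM capabilities. This progression has given AI agents increasingly powerful abilities. Despite these advances, completing complex real-world tasks requires agents to plan and execute effectively, maintain reliable memory, and coordinate smoothly with other agents. Achieving these capabilities means contending with ever-present intricate information, operations, and interactions. Given this challenge, data structuring can play a promising role by turning intricate, disorganized data into well-structured forms that agents can understand and process more effectively. Graphs, with their natural advantage in organizing, managing, and exploiting complex data relationships, offer a powerful structuring paradigm for supporting the capabilities that advanced AI agents need. To this end, this survey presents the first systematic review of how graphs can empower AI agents. Specifically, we explore the integration of graph techniques with core agent functions, highlight notable applications, and identify prospective avenues for future research. By comprehensively surveying this rapidly developing intersection, we hope to inspire next-generation AI agents that use graphs to tackle increasingly sophisticated challenges. Related resources are collected and continuously updated for the community at the GitHub link.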
Graphs Meet AI Agents: Taxonomy, Progress, and Future Opportunities
https://github.com/YuanchenBei/Awesome-Graphs-Meet-Agents
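Quick Overview
Research Background
- Research question: The paper asks how graph techniques can strengthen the core functions of AI agents: planning, execution, memory, and multi-agent coordination. Specifically, it examines how graphs help agents handle the information, operations, and interactions involved in complex tasks more effectively.
- Research challenges: Handling unstructured data in complex tasks, achieving effective information passing and coordination in multi-agent systems, and maintaining and updating memory in dynamic environments.
- Related work: Applications of reinforcement learning (RL) and large language models (LLMs) to AI agents, and the established success of graph techniques in data organization and knowledge extraction.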
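Research Method
The paper reviews how graph techniques enhance agent functions, organized into four areas:
1. Graphs for agent planning: Graphs can organize the form of task reasoning, arrange the task decomposition process, and structure an efficient search over task decisions.
- Task reasoning: Knowledge graph-auxiliary reasoning and structure-organized reasoning improve an LLM agent's understanding of the task.
- Task decomposition: A task dependency graph (TDG) represents the dependencies among subtasks and is used to optimize the execution path.
- Task decision searching: A state space graph (SSG) represents the transitions between states, and search algorithms such as Monte Carlo tree search optimize the decision process over it.
2. Graphs for agent execution: Graphs help organize tool usage and environment interaction.
- Tool usage: A tool graph makes the connections among tools explicit, so that an agent can use and manage a large number of tools efficiently.
- Environment interaction: Scene graphs encode the objects in a visual scene and their spatial or semantic relationships; these relationships are modeled with either heuristic or learning-based methods.
3. Graphs for agent memory: Graph-structured memory can effectively reveal latent associations among pieces of information.
- Memory organization: Knowledge and experience are stored as interconnected representations, with knowledge graphs and other structured forms organizing long-term memory.
- Memory retrieval: Graph-based retrieval-augmented generation (graph-based RAG) retrieves useful information accurately and efficiently.
- Memory maintenance: Memory representations and graph topology are dynamically updated and refined in response to new experiences and interactions.
4. Graphs for multi-agent coordination: Graphs can model the communication paths and task assignments among agents.
- Task-specific relationships: Task dependency graphs and task allocation graphs optimize task execution and information exchange.
- Environment-specific relationships: The relationship weights between agents are learned dynamically from the characteristics of the specific environment.
- Coordination topology optimization: The communication topology of a multi-agent system is optimized through edge importance measurement, graph autoencoder optimization, and reinforcement learning.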
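The idea of optimizing a communication topology by edge importance can be illustrated with a minimal sketch. It assumes each directed edge between agents already carries an importance score; in the surveyed methods such scores would be learned (for example by a graph autoencoder or reinforcement learning), whereas here the agent names, the scores, and the `prune_topology` helper are all hypothetical values chosen for illustration.

```python
from collections import defaultdict


def prune_topology(edges, keep_per_agent):
    """Keep only the most important incoming edges for each agent.

    edges: dict mapping (sender, receiver) -> importance score.
    The scores are hypothetical; a real system would learn them.
    """
    incoming = defaultdict(list)
    for (sender, receiver), score in edges.items():
        incoming[receiver].append((score, sender))

    pruned = {}
    for receiver, candidates in incoming.items():
        # Highest score first; ties are broken by sender name for determinism.
        candidates.sort(key=lambda item: (-item[0], item[1]))
        for score, sender in candidates[:keep_per_agent]:
            pruned[(sender, receiver)] = score
    return pruned


# Hypothetical communication graph among four agents.
edges = {
    ("planner", "coder"): 0.9,
    ("critic", "coder"): 0.4,
    ("tester", "coder"): 0.1,
    ("coder", "tester"): 0.8,
    ("planner", "tester"): 0.2,
}

sparse = prune_topology(edges, keep_per_agent=2)
for (sender, receiver), score in sorted(sparse.items()):
    print(f"{sender} -> {receiver}: {score}")
```

Keeping a fixed number of incoming edges per agent is only one possible pruning rule; a global threshold or a connectivity constraint would be equally reasonable under different assumptions.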
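Results and Analysis
- Scientific computing: Combining graph learning with agent systems shows notable potential in scientific computing, particularly in automated scientific discovery and bioinformatics analysis.
- Embodied AI: Graph-based representations are a powerful tool in embodied AI, improving scene understanding and supporting better-informed decisions.
- Game AI: Graph structures are receiving growing attention for representing and modeling game AI, particularly in multi-agent games and text-based games.
- Agentic information retrieval: Graph learning can support structured retrieval planning and adaptive reasoning, improving retrieval efficiency in complex tasks.
- Industrial and automation systems: Graph organization and learning can deliver efficient performance in industrial systems and improve their scalability, dynamic evolution, and robustness.
- Human society: LLM agents show great potential for analyzing and simulating human social behavior, particularly in social networks and information diffusion.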
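Overall Conclusion
The paper presents a comprehensive, systematic review of the intersection between graph techniques and AI agents, examining how graphs enhance agent planning, execution, memory, and multi-agent coordination. It also examines how the agent paradigm can in turn enhance graph learning. Based on this review, the paper summarizes meaningful applications, open problems, and future research directions, offering new potential approaches for next-generation agents that face increasingly complex and disorganized task information. Related resources are organized and continuously updated at the GitHub link.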
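Paper Assessment
Strengths and Innovations
- Systematic survey: The paper provides the first comprehensive, systematic survey of the intersection between graph techniques and AI agents, covering agent paradigms from reinforcement learning (RL)-based to large language model (LLM)-based.
- Novel taxonomy: The paper introduces a new taxonomic perspective on how graphs enhance the core functions of agents: planning, execution, memory, and multi-agent coordination.
- Two-way innovation: Beyond how graphs enhance agent functions, the paper also discusses how agents can in turn advance graph learning, emphasizing two-way innovation and a holistic view.
- Applications and future opportunities: Building on the survey, the paper further discusses meaningful applications, key challenges, and future opportunities for graph-enhanced AI agents.
- Resource updates: Related resources are collected and continuously updated at the GitHub link, providing a valuable reference for the research and industrial communities.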
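Limitations and Reflections
- Benchmark evaluation: Existing benchmarks differ in task definitions and evaluation data, which makes unified evaluation difficult. In addition, graph-centric agent benchmarks are lacking for emerging complex scenarios such as multi-agent reasoning and memory and collaboration in large-scale dynamic environments.
- Graph foundation models: Although many graph learning methods exist, graph operators that can be used broadly across agent functions are still lacking. Developing effective graph foundation models (GFMs) is a promising direction, especially designing GFMs from the perspective of effectiveness, explainability, and scalability (EES).
- Privacy and security: Because graph organization and learning establish connections between entities, they can introduce security concerns, including data privacy and defense against attacks. Future research should focus on designing safer strategies for agent coordination and environment interaction.
- Multimodal agents: Although LLM agents have made notable progress in the language space, research on multimodal agents (agents that can understand and integrate information streams such as text, vision, and speech) is still developing. Graph learning can play an important role in abstracting and connecting multimodal data.
- Model Context Protocol (MCP): MCP provides a standardized way to integrate agent applications seamlessly with external data sources, tools, and services. Future research could explore using graph learning to enhance MCP in two promising respects: efficient data integration and personalized recommendation.
- Open Agent Network (OAN): An open agent network (OAN) would be a public, decentralized network in which agents are registered and orchestrated. Future research could explore how to apply graph learning within an OAN to improve the network's efficiency and risk control.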
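Key Questions and Answers
Question 1: How are graph techniques applied to the planning function of AI agents?
- Task reasoning: Knowledge graph-auxiliary reasoning and structure-organized reasoning improve an LLM agent's understanding of the task. For example, QA-GNN improves question answering by combining a language model with an auxiliary knowledge graph, while ToG and KG-CoT use knowledge-graph-based retrieval-augmented generation (RAG) to improve the task reasoning of LLM agents.
- Task decomposition: A task dependency graph (TDG) represents the dependencies among subtasks and is used to optimize the execution path. A TDG is a directed acyclic graph (DAG) over tasks, which helps the agent identify and execute a valid ordering of subtasks.
- Task decision searching: A state space graph (SSG) represents the transitions between states, and search algorithms such as Monte Carlo tree search optimize the decision process over it. By representing states as nodes and state transitions as edges, an SSG helps the agent make effective decisions in complex environments.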
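How a task dependency graph yields a valid execution order can be shown with a minimal sketch using Python's standard `graphlib` module (Python 3.9+). The subtask names below are hypothetical, and grouping ready subtasks into batches is an illustrative choice rather than the method of any particular surveyed paper.

```python
from graphlib import TopologicalSorter

# Hypothetical task dependency graph (TDG): each subtask maps to the
# set of subtasks that must finish before it can start.
tdg = {
    "collect_requirements": set(),
    "design_schema": {"collect_requirements"},
    "write_api": {"design_schema"},
    "write_tests": {"design_schema"},
    "deploy": {"write_api", "write_tests"},
}


def execution_batches(graph):
    """Group subtasks into batches whose members can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # subtasks with all dependencies done
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock successors
    return batches


batches = execution_batches(tdg)
for step, batch in enumerate(batches, start=1):
    print(step, batch)
```

Each batch contains subtasks whose dependencies are all satisfied, so an agent (or several agents) could execute the members of one batch concurrently before moving to the next.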
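Question 2: How are graph techniques applied to the execution function of AI agents?
- Tool usage: A tool graph makes the connections among tools explicit, so that an agent can use and manage a large number of tools efficiently. For example, GPTSwarm abstracts agents as a directed acyclic graph (DAG) in which each node represents a function and each edge represents information flow, which helps agents invoke and manage tools efficiently.
- Environment interaction: Scene graphs encode the objects in a visual scene and their spatial or semantic relationships; these relationships are modeled with either heuristic or learning-based methods. By representing objects as nodes and their relationships as edges, a scene graph helps the agent understand and interact with complex environments.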
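The value of making tool connections explicit can be illustrated with a minimal sketch: once tools are nodes and "output of A can feed B" is an edge, finding a chain of tool calls becomes a graph search. The tool names and the `shortest_tool_chain` helper are hypothetical, and breadth-first search is used here only as a simple stand-in for the search strategies in the surveyed systems.

```python
from collections import deque

# Hypothetical tool graph: an edge A -> B means the output of tool A
# can be passed as input to tool B.
tool_graph = {
    "web_search": ["page_fetcher"],
    "page_fetcher": ["html_to_text"],
    "html_to_text": ["summarizer", "translator"],
    "translator": ["summarizer"],
    "summarizer": [],
}


def shortest_tool_chain(graph, start, goal):
    """Breadth-first search for the shortest chain of tools from start to goal."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for next_tool in graph.get(path[-1], []):
            if next_tool not in visited:
                visited.add(next_tool)
                queue.append(path + [next_tool])
    return None  # no chain of compatible tools exists


chain = shortest_tool_chain(tool_graph, "web_search", "summarizer")
print(" -> ".join(chain))
```

Because the search only follows existing edges, the agent never proposes a tool sequence whose intermediate outputs and inputs are incompatible.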
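Question 3: How are graph techniques applied to the memory function of AI agents?
- Memory organization: Knowledge and experience are stored as interconnected representations, with knowledge graphs and other structured forms organizing long-term memory. For example, AriGraph represents the agent's memory as a structured knowledge graph containing semantic facts and events, which helps the agent recall complex structures and relationships.
- Memory retrieval: Graph-based retrieval-augmented generation (graph-based RAG) retrieves useful information accurately and efficiently. For example, G-Retriever and GFM-RAG design customized retrievers that combine semantic similarity with graph metrics, improving the accuracy and efficiency of retrieval.
- Memory maintenance: Memory representations and graph topology are dynamically updated and refined in response to new experiences and interactions. For example, A-MEM builds an interconnected knowledge network through dynamic indexing and linking, allowing the agent to keep improving its memory representation in a changing environment.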
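The three memory operations above can be sketched together in a few lines. This is a minimal illustration, not the implementation of AriGraph, G-Retriever, or A-MEM: the `GraphMemory` class and the example facts are hypothetical, and a plain k-hop neighborhood lookup stands in for retrievers that combine semantic similarity with graph metrics.

```python
from collections import defaultdict


class GraphMemory:
    """Toy graph-structured memory storing (head, relation, tail) facts."""

    def __init__(self):
        self.triples = set()
        self.neighbors = defaultdict(set)  # undirected adjacency for traversal

    def add_fact(self, head, relation, tail):
        """Organization and maintenance: link a new fact into the graph."""
        self.triples.add((head, relation, tail))
        self.neighbors[head].add(tail)
        self.neighbors[tail].add(head)

    def retrieve(self, entity, hops=1):
        """Retrieval: return facts among entities within `hops` of `entity`."""
        reached = {entity}
        frontier = {entity}
        for _ in range(hops):
            frontier = {
                neighbor
                for node in frontier
                for neighbor in self.neighbors.get(node, ())
            } - reached
            reached |= frontier
        return sorted(t for t in self.triples if t[0] in reached and t[2] in reached)


memory = GraphMemory()
memory.add_fact("agent", "visited", "kitchen")
memory.add_fact("kitchen", "contains", "fridge")
memory.add_fact("fridge", "contains", "milk")
memory.add_fact("garden", "contains", "shed")

one_hop = memory.retrieve("kitchen", hops=1)
two_hop = memory.retrieve("kitchen", hops=2)
print(one_hop)
print(two_hop)
```

Facts about the garden are never returned for a query about the kitchen, because no path connects them; a newly added fact becomes retrievable as soon as it is linked to an existing entity.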
🚀 Taxonomy
Graph for Agent Planning
Task Reasoning
Knowledge Graph-Auxiliary Reasoning
- (NAACL 2021) QA-GNN: Reasoning with language models and knowledge graphs for question answering [Paper] [Code]
- (ICLR 2024) Think-on-graph: Deep and responsible reasoning of large language model on knowledge graph [Paper] [Code]
- (ICLR 2024) Reasoning on graphs: Faithful and interpretable large language model reasoning [Paper] [Code]
- (IJCAI 2024) Kg-cot: Chain-of-thought prompting of large language models over knowledge graphs for knowledge-aware question answering [Paper]
- (ACL 2024) Mindmap: Knowledge graph prompting sparks graph of thoughts in large language models [Paper] [Code]
- (WWW 2025) Paths-over-graph: Knowledge graph empowered large language model reasoning [Paper]
Structure-Organized Reasoning
- (NeurIPS 2023) Tree of thoughts: Deliberate problem solving with large language models [Paper] [Code]
- (NAACL 2024) GoT: Effective Graph-of-Thought Reasoning in Language Models [Paper] [Code]
- (AAAI 2024) Graph of thoughts: Solving elaborate problems with large language models [Paper] [Code]
- (AAAI 2025) Ratt: A thought structure for coherent and correct llm reasoning [Paper] [Code]
- (Arxiv 2025) Reasoning with Graphs: Structuring Implicit Knowledge to Enhance LLMs Reasoning [Paper]
Task Decomposition
- (COLM 2024) Agentkit: Structured LLM reasoning with dynamic graphs [Paper] [Code]
- (TMLR 2024) Feudal Graph Reinforcement Learning [Paper]
- (NeurIPS 2024) Can Graph Learning Improve Planning in LLM-based Agents? [Paper] [Code]
- (ACL 2024) Villageragent: A graph-based multi-agent framework for coordinating complex task dependencies in minecraft [Paper] [Code]
- (Arxiv 2024) DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning [Paper] [Demo]
- (ICRA 2025) Enhancing Multi-Agent Systems via Reinforcement Learning with LLM-based Planner and Graph-based Policy [Paper]
- (Arxiv 2025) DynTaskMAS: A Dynamic Task Graph-driven Framework for Asynchronous and Parallel LLM-based Multi-Agent Systems [Paper]
- (Arxiv 2025) Plan-over-Graph: Towards Parallelable LLM Agent Schedule [Paper] [Code]
Task Decision Searching
- (NeurIPS 2014) Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning [Paper]
- (AAAI 2018, Best Paper) Memory-augmented monte carlo tree search [Paper]
- (ACML 2020) Monte-Carlo Graph Search: the Value of Merging Similar States [Paper]
- (ICAPS 2021) Improving alphazero using monte-carlo graph search [Paper]
- (AI 2024) Evolving interpretable decision trees for reinforcement learning [Paper]
- (AAMAS 2024) Continuous monte carlo graph search [Paper] [Code]
- (ICLR 2024) Promptagent: Strategic planning with language models enables expert-level prompt optimization [Paper] [Code]
- (ICML 2024) Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models [Paper] [Code]
Graph for Agent Execution
Tool Usage
- (ACL 2024) Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs [Paper] [Code]
- (ECCV 2024) ControlLLM: Augment Language Models with Tools by Searching on Graphs [Paper] [Code]
- (ICML 2024) GPTSwarm: Language Agents as Optimizable Graphs [Paper] [Code]
- (Arxiv 2024) ToolNet: Connecting Large Language Models with Massive Tools via Tool Graph [Paper]
- (Arxiv 2025) Graph RAG-Tool Fusion [Paper] [Code]
Environment Interaction
Heuristic-Based Relationship
- (ICRA 2023) Multiagent Reinforcement Learning for Autonomous Routing and Pickup Problem with Adaptation to Variable Demand [Paper]
- (COR 2024) A Graph Reinforcement Learning Framework for Neural Adaptive Large Neighbourhood Search [Paper]
- (AAMAS 2024) Towards Generalizability of Multi-Agent Reinforcement Learning in Graphs with Recurrent Message Passing [Paper] [Code]
- (RAL 2024) Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation [Paper] [Code]
- (Arxiv 2024) PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning [Paper]
- (CVPR 2025) GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration [Paper] [Code]
- (Arxiv 2025) Multi-agent Auto-Bidding with Latent Graph Diffusion Models [Paper]
- (Arxiv 2025) A Schema-Guided Reason-while-Retrieve framework for Reasoning on Scene Graphs with Large-Language-Models (LLMs) [Paper]
Learning-Based Relationship
- (NeurIPS Workshop 2018) Deep Multi-Agent Reinforcement Learning with Relevance Graphs [Paper] [Code]
- (CoRL 2023) Learning Control Admissibility Models with Graph Neural Networks for Multi-Agent Navigation [Paper]
- (AAMAS 2023) TransfQMix: Transformers for Leveraging the Graph Structure of Multi-Agent Reinforcement Learning Problems [Paper] [Code]
- (TAI 2024) Reinforcement Learned Multi–Agent Cooperative Navigation in Hybrid Environment with Relational Graph Learning [Paper]
- (NCA 2024) Graph network-based human movement prediction for socially-aware robot navigation in shared workspaces [Paper]
Graph for Agent Memory
Memory Organization
- (JMS 2021) Towards Self-X Cognitive Manufacturing Network: An Industrial Knowledge Graph-Based Multi-Agent Reinforcement Learning Approach [Paper]
- (Arxiv 2024) AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents [Paper]
- (Arxiv 2024) On the Structural Memory of LLM Agents [Paper] [Code]
- (Arxiv 2024) From Local to Global: A GraphRAG Approach to Query-Focused Summarization [Paper] [Code]
- (Arxiv 2024) KG-Retriever: Efficient Knowledge Indexing for Retrieval-Augmented Large Language Models [Paper] [Code]
- (SIGIR 2025) Enhancing the Patent Matching Capability of Large Language Models via the Memory Graph [Paper] [Code]
- (AAAI 2025) LLM-Powered Decentralized Generative Agents with Adaptive Hierarchical Knowledge Graph for Cooperative Planning [Paper] [Code]
- (WWW 2025) Graphusion: A RAG Framework for Scientific Knowledge Graph Construction with a Global Perspective [Paper] [Code]
- (Arxiv 2025) Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research [Paper] [Code]
Memory Retrieval
- (NeurIPS 2024) G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering [Paper] [Code]
- (Arxiv 2024) LightRAG: Simple and Fast Retrieval-Augmented Generation [Paper] [Code]
- (ICLR 2025) Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation [Paper] [Code]
- (NAACL 2025) GRAG: Graph Retrieval-Augmented Generation [Paper] [Code]
- (Arxiv 2025) GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation [Paper] [Code]
- (Arxiv 2025) PathRAG: Pruning Graph-based Retrieval Augmented Generation with Relational Paths [Paper] [Code]
Memory Maintenance
- (NeurIPS 2024) HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models [Paper] [Code]
- (Arxiv 2024) LightRAG: Simple and Fast Retrieval-Augmented Generation [Paper] [Code]
- (Arxiv 2024) KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph [Paper]
- (Arxiv 2024) AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents [Paper] [Code]
- (AAAI 2025) LLM-Powered Decentralized Generative Agents with Adaptive Hierarchical Knowledge Graph for Cooperative Planning [Paper] [Code]
- (Arxiv 2025) Zep: A Temporal Knowledge Graph Architecture for Agent Memory [Paper] [Code]
- (Arxiv 2025) A-Mem: Agentic Memory for LLM Agents [Paper] [Code]
- (Arxiv 2025) InstructRAG: Leveraging Retrieval-Augmented Generation on Instruction Graphs for LLM-Based Task Planning [Paper]
Graphs for Multi-Agent Coordination
Coordination Message Passing
Task-Specific Relationship
- (NeurIPS 2022) Learning NP-Hard Multi-Agent Assignment Planning using GNN: Inference on a Random Graph and Provable Auction-Fitted Q-learning [Paper]
- (ICRA 2025) Enhancing Multi-Agent Systems via Reinforcement Learning with LLM-based Planner and Graph-based Policy [Paper]
- (ICLR 2025) Scaling Large Language Model-based Multi-Agent Collaboration [Paper] [Code]
- (Arxiv 2025) DynTaskMAS: A Dynamic Task Graph-driven Framework for Asynchronous and Parallel LLM-based Multi-Agent Systems [Paper]
- (Arxiv 2025) MAGNNET: Multi-Agent Graph Neural Network-based Efficient Task Allocation for Autonomous Vehicles with Deep Reinforcement Learning [Paper]
- (Arxiv 2025) GNNs as Predictors of Agentic Workflow Performances [Paper] [Code]
Environment-Specific Relationship
- (ICASSP 2021) Graphcomm: A Graph Neural Network Based Method for Multi-Agent Reinforcement Learning [Paper]
- (ITSC 2022) Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Interactive Traffic Scenarios [Paper]
- (TITS 2022) Multi-Agent Trajectory Prediction with Heterogeneous Edge-Enhanced Graph Attention Network [Paper]
- (TCCN 2023) MAGNNETO: A Graph Neural Network-based Multi-Agent system for Traffic Engineering [Paper] [Code]
- (TPAMI 2023) Robust Multi-Agent Communication With Graph Information Bottleneck Optimization [Paper]
- (IROS 2024) Transformer-based Multi-Agent Reinforcement Learning for Generalization of Heterogeneous Multi-Robot Cooperation [Paper]
- (ICLR 2025) Exponential Topology-enabled Scalable Communication in Multi-agent Reinforcement Learning [Paper] [Code]
Coordination Topology Optimization
- (AAAI 2020) Multi-Agent Game Abstraction via Graph Attention Neural Network [Paper]
- (AAMAS 2021) Deep Implicit Coordination Graphs for Multi-agent Reinforcement Learning [Paper] [Code]
- (AAMAS 2021) Multi-Agent Graph-Attention Communication and Teaming [Paper]
- (NeurIPS 2021) Learning Distilled Collaboration Graph for Multi-Agent Perception [Paper] [Code]
- (TNNLS 2022) Online Multi-Agent Forecasting with Interpretable Collaborative Graph Neural Networks [Paper]
- (AAMAS 2023) Learning Graph-Enhanced Commander-Executor for Multi-Agent Navigation [Paper] [Code]
- (ICLR 2024) Learning Multi-Agent Communication from Graph Modeling Perspective [Paper] [Code]
- (Arxiv 2024) G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks [Paper] [Code]
- (COLM 2024) A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration [Paper] [Code]
- (ICML 2024) GPTSwarm: Language Agents as Optimizable Graphs [Paper] [Code]
- (JCISE 2025) Adaptive Network Intervention for Complex Systems: A Hierarchical Graph Reinforcement Learning Approach [Paper]
- (ICRA 2025) Reliable and Efficient Multi-Agent Coordination via Graph Neural Network Variational Autoencoders [Paper] [Code]
- (ICLR 2025) Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems [Paper] [Code]
- (Arxiv 2025) Deep Meta Coordination Graphs for Multi-agent Reinforcement Learning [Paper] [Code]
- (Arxiv 2025) Adaptive Graph Pruning for Multi-Agent Communication [Paper]
Agents for Graph Learning
Graph Annotation and Synthesis
- (NeurIPS 2020) Graph Policy Network for Transferable Active Learning on Graphs [Paper] [Code]
- (AAAI 2022) Batch Active Learning with Graph Neural Networks via Multi-Agent Deep Reinforcement Learning [Paper]
- (Arxiv 2024) Exploring the Potential of Large Language Models in Graph Generation [Paper]
- (Arxiv 2024) LLM-Based Multi-Agent Systems are Scalable Graph Generative Models [Paper] [Code]
- (ICLR Workshop 2025) IGDA: Interactive Graph Discovery through Large Language Model Agents [Paper]
- (Arxiv 2025) Plan-over-Graph: Towards Parallelable LLM Agent Schedule [Paper] [Code]
- (Arxiv 2025) GraphMaster: Automated Graph Synthesis via LLM Agents in Data-Limited Environments [Paper]
Graph Understanding
- (KDD 2020) Policy-GNN: Aggregation Optimization for Graph Neural Networks [Paper]
- (WWW 2021) SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism [Paper] [Code]
- (NeurIPS 2023) MAG-GNN: Reinforcement Learning Boosted Graph Neural Network [Paper] [Code]
- (ICLR 2023) Agent-based Graph Neural Networks [Paper] [Code]
- (Arxiv 2023) Graph Agent: Explicit Reasoning Agent for Graphs [Paper]
- (Arxiv 2023) A Versatile Graph Learning Approach through LLM-based Agent [Paper]
- (KDD 2024) GraphWiz: An Instruction-Following Language Model for Graph Computational Problems [Paper] [Code]
- (SIGIR 2024) GraphGPT: Graph Instruction Tuning for Large Language Models [Paper] [Code]
- (ICLR 2024) One For All: Towards Training One Graph Model For All Classification Tasks [Paper] [Code]
- (ICML 2024) LLaGA: Large Language and Graph Assistant [Paper] [Code]
- (KDD 2024) ZeroG: Investigating Cross-dataset Zero-shot Transferability in Graphs [Paper] [Code]
- (Arxiv 2024) GraphAgent: Agentic Graph Language Assistant [Paper] [Code]
- (Arxiv 2024) Scalable and Accurate Graph Reasoning with LLM-based Multi-Agents [Paper]
- (Arxiv 2024) GraphTeam: Facilitating Large Language Model-based Graph Reasoning via Multi-Agent Collaboration [Paper]
- (Arxiv 2024) GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability [Paper] [Code]
- (AAAI 2025) Graph Agent Network: Empowering Nodes with Inference Capabilities for Adversarial Resilience [Paper]
💻 Benchmarks and Open-Source Toolkits
General
- (NeurIPS 2021, RL Agent, Multi-Agent Coordination) Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks [Paper] [Code]
- (JMLR 2024, RL Agent, Multi-Agent Coordination) BenchMARL: Benchmarking Multi-Agent Reinforcement Learning [Paper] [Code]
- (JMLR 2025, RL Agent, Agent Memory) Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents [Paper] [Code]
- (EMNLP 2023, LLM Agent, Tool Usage) API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs [Paper] [Code]
- (NeurIPS 2023, LLM Agent, Task Reasoning, Task Decomposition) PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change [Paper] [Code]
- (ICLR 2024, LLM Agent, General) AgentBench: Evaluating LLMs as Agents [Paper] [Code]
- (ICLR 2024, LLM Agent, Tool Usage) ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs [Paper] [Code]
- (NeurIPS 2024, LLM Agent, Tool Usage) GTA: A Benchmark for General Tool Agents [Paper] [Code]
- (ICML 2024, LLM Agent, Task Reasoning, Task Decomposition, Tool Usage) TravelPlanner: A Benchmark for Real-World Planning with Language Agents [Paper] [Code]
- (ICLR 2025, LLM Agent, Tool Usage, Agent-Environment Interaction) τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains [Paper] [Code]
- (NAACL 2025, LLM Agent, Tool Usage, Agent-Environment Interaction) ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities [Paper] [Code]
- (Arxiv 2025, LLM Agent, Task Reasoning, Task Decomposition, Multi-Agent Coordination) REALM-Bench: A Real-World Planning Benchmark for LLMs and Multi-Agent Systems [Paper] [Code]
Graph-Related
- (LLM Agent, Tool Usage, Multi-Agent Coordination) LangGraph [Docs] [Code]
- (NeurIPS 2024, LLM Agent, Graph Modeling) GLBench: A Comprehensive Benchmark for Graph with Large Language Models [Paper] [Code]
- (NeurIPS 2024, LLM Agent, Graph Modeling) Can Large Language Models Analyze Graphs like Professionals? A Benchmark, Datasets and Models [Paper] [Code]
- (ICLR 2025, LLM Agent, Graph Modeling) GraphArena: Evaluating and Exploring Large Language Models on Graph Computation [Paper] [Code]
- (ICML 2024, LLM Agent, Task Reasoning, Tool Usage, Multi-Agent Coordination) GPTSwarm: Language Agents as Optimizable Graphs [Paper] [Code]
- (ICLR 2025, LLM Agent, Multi-Agent Coordination) Scaling Large Language Model-based Multi-Agent Collaboration [Paper] [Code]
- (Arxiv 2025, LLM Agent, Tool Usage) Graph RAG-Tool Fusion [Paper] [Code]
- (Arxiv 2025, LLM Agent, Task Reasoning, Task Decomposition) GNNs as Predictors of Agentic Workflow Performances [Paper] [Code]
This article is reposted from 知识图谱科技 (Knowledge Graph Technology); author: Wolfgang
