Last Updated Time: 2024/1/19
This repository lists papers related to LLM-based agents. It includes:
- methods of role playing, memory mechanisms, and game playing
- methods of feedback or reflection
- methods of tool usage or human-agent interaction
- multi-agent systems
- benchmarks and surveys of the field
- environments or platforms
- agent fine-tuning
For more comprehensive reading, we also recommend other paper lists:
- zjunlp/LLMAgentPapers: Must-read Papers on Large Language Model Agents.
- teacherpeterpan/self-correction-llm-papers: This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.
- Paitesanshi/LLM-Agent-Survey: A Survey on LLM-based Autonomous Agents.
- woooodyy/llm-agent-paper-list: Must-read papers for LLM-based agents.
- Survey
  - [2024/01/01] If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents | [paper] | [code]
  - [2023/12/31] A Survey of Personality, Persona, and Profile in Conversational Agents and Chatbots | [paper] | [code]
  - [2023/12/19] Large Language Models Empowered Agent-based Modeling and Simulation: A Survey and Perspectives | [paper] | [code]
  - [2023/09/14] The Rise and Potential of Large Language Model Based Agents: A Survey | [paper] | [code]
  - [2023/08/22] A Survey on Large Language Model based Autonomous Agents | [paper] | [code]
  - [2023/06/27] Next Steps for Human-Centered Generative AI: A Technical Perspective | [paper] | [code]
  - [2023/04/06] Can Large Language Models Play Text Games Well? Current State-of-the-Art and Open Questions | [paper] | [code]
- Agent Fine-tuning
  - [2024/01/10] Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training | [paper] | [code]
  - [2024/01/10] Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk | [paper] | [code]
  - [2024/01/10] AUTOACT: Automatic Agent Learning from Scratch via Self-Planning | [paper] | [code]
  - [2024/01/05] From LLM to Conversational Agent: A Memory Enhanced Architecture with Fine-Tuning of Large Language Models | [paper] | [code]
  - [2023/12/20] Machine Mindset: An MBTI Exploration of Large Language Models | [paper] | [code]
  - [2023/10/19] AgentTuning: Enabling Generalized Agent Abilities for LLMs | [paper] | [code]
  - [2023/10/09] FireAct: Toward Language Agent Fine-tuning | [paper] | [code]
  - [2023/10/01] Adapting LLM Agents Through Communication | [paper] | [code]
- Role Playing
  - [2024/01/09] Agent Alignment in Evolving Social Norms | [paper] | [code]
  - [2023/12/28] Experiential Co-Learning of Software-Developing Agents | [paper] | [code]
  - [2023/12/27] Automating Knowledge Acquisition for Content-Centric Cognitive Agents Using LLMs | [paper] | [code]
  - [2023/12/21] ChatGPT as a commenter to the news: can LLMs generate human-like opinions? | [paper] | [code]
  - [2023/12/19] Can ChatGPT be Your Personal Medical Assistant? | [paper] | [code]
  - [2023/12/06] LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem | [paper] | [code]
  - [2023/11/28] War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars | [paper] | [code]
  - [2023/11/23] Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach | [paper] | [code]
  - [2023/11/10] Smart Agent-Based Modeling: On the Use of Large Language Models in Computer Simulations | [paper] | [code]
  - [2023/10/01] RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models | [paper] | [code]
  - [2023/09/08] Unleashing the Power of Graph Learning through LLM-based Autonomous Agents | [paper] | [code]
  - [2023/09/05] Cognitive Architectures for Language Agents | [paper] | [code]
  - [2023/08/22] Towards an On-device Agent for Text Rewriting | [paper] | [code]
  - [2023/08/14] ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate | [paper] | [code]
  - [2023/08/07] TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents | [paper] | [code]
  - [2023/07/24] To Infinity and Beyond: SHOW-1 and Showrunner Agents in Multi-Agent Simulations | [paper] | [code]
  - [2023/06/28] Inferring the Goals of Communicating Agents from Actions and Instructions | [paper] | [code]
  - [2023/05/27] SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks | [paper] | [code]
  - [2023/05/26] Training Socially Aligned Language Models in Simulated Human Society | [paper] | [code]
  - [2023/05/25] Role-Play with Large Language Models | [paper] | [code]
  - [2023/05/24] Reasoning with Language Model is Planning with World Model | [paper] | [code]
  - [2023/05/17] Tree of Thoughts: Deliberate Problem Solving with Large Language Models | [paper] | [code]
  - [2023/05/09] TidyBot: Personalized Robot Assistance with Large Language Models | [paper] | [code]
  - [2023/05/02] The Role of Summarization in Generative Agents: A Preliminary Perspective | [paper] | [code]
  - [2023/04/26] Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models | [paper] | [code]
  - [2023/04/24] ChatLLM Network: More brains, More intelligence | [paper] | [code]
  - [2023/04/15] Self-collaboration Code Generation via ChatGPT | [paper] | [code]
  - [2023/04/07] Generative Agents: Interactive Simulacra of Human Behavior | [paper] | [code]
  - [2023/03/31] CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society | [paper] | [code]
  - [2022/12/08] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models | [paper] | [code]
- Multi-Agent System
  - [2024/01/11] Combating Adversarial Attacks with Multi-Agent Debate | [paper] | [code]
  - [2024/01/08] MARG: Multi-Agent Review Generation for Scientific Papers | [paper] | [code]
  - [2024/01/08] SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems | [paper] | [code]
  - [2024/01/08] Why Solving Multi-agent Path Finding with Large Language Model has not Succeeded Yet | [paper] | [code]
  - [2023/12/20] AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation | [paper] | [code]
  - [2023/12/01] Deciphering Digital Detectives: Understanding LLM Behaviors and Capabilities in Multi-Agent Mystery Games | [paper] | [code]
  - [2023/10/31] Multi-Agent Consensus Seeking via Large Language Models | [paper] | [code]
  - [2023/10/25] MultiPrompter: Cooperative Prompt Optimization with Multi-Agent Reinforcement Learning | [paper] | [code]
  - [2023/10/10] MetaAgents: Simulating Interactions of Human Behaviors for LLM-based Task-oriented Coordination via Collaborative Generative Agents | [paper] | [code]
  - [2023/10/03] Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View | [paper] | [code]
  - [2023/09/22] Learning to Coordinate with Anyone | [paper] | [code]
  - [2023/08/21] AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents | [paper] | [code]
  - [2023/08/03] InterAct: Exploring the Potentials of ChatGPT as a Cooperative Agent | [paper] | [code]
  - [2023/08/01] MetaGPT: Meta Programming for Multi-Agent Collaborative Framework | [paper] | [code]
  - [2023/07/16] Communicative Agents for Software Development | [paper] | [code]
  - [2023/07/11] Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration | [paper] | [code]
  - [2023/07/05] Building Cooperative Embodied Agents Modularly with Large Language Models | [paper] | [code]
  - [2023/06/05] Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents | [paper] | [code]
- Feedback&Reflection
  - [2023/11/14] The ART of LLM Refinement: Ask, Refine, and Trust | [paper] | [code]
  - [2023/10/31] Learning From Mistakes Makes LLM Better Reasoner | [paper] | [code]
  - [2023/08/01] SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning | [paper] | [code]
  - [2023/07/27] PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback | [paper] | [code]
  - [2023/05/30] Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate | [paper] | [code]
  - [2023/05/26] AdaPlanner: Adaptive Planning from Feedback with Language Models | [paper] | [code]
  - [2023/05/22] Making Language Models Better Tool Learners with Execution Feedback | [paper] | [code]
  - [2023/04/21] Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents Through Help Feedback | [paper] | [code]
  - [2023/04/11] Teaching Large Language Models to Self-Debug | [paper] | [code]
  - [2023/03/30] Self-Refine: Iterative Refinement with Self-Feedback | [paper] | [code]
- Memory Mechanism
  - [2023/12/22] Empowering Working Memory for Large Language Model Agents | [paper] | [code]
  - [2023/12/22] Evolving Large Language Model Assistant with Long-Term Conditional Memory | [paper] | [code]
  - [2023/10/16] CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization | [paper] | [code]
  - [2023/06/06] ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory | [paper] | [code]
  - [2023/05/31] Monotonic Location Attention for Length Generalization | [paper] | [code]
  - [2023/05/26] Randomized Positional Encodings Boost Length Generalization of Transformers | [paper] | [code]
  - [2023/05/25] Landmark Attention: Random-Access Infinite Context Length for Transformers | [paper] | [code]
  - [2023/05/24] Revisiting Parallel Context Windows: A Frustratingly Simple Alternative and Chain-of-Thought Deterioration | [paper] | [code]
  - [2023/05/24] Adapting Language Models to Compress Contexts | [paper] | [code]
  - [2023/05/23] RET-LLM: Towards a General Read-Write Memory for Large Language Models | [paper] | [code]
  - [2023/05/22] RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text | [paper] | [code]
  - [2023/05/19] ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings | [paper] | [code]
  - [2023/05/17] MemoryBank: Enhancing Large Language Models with Long-Term Memory | [paper] | [code]
  - [2023/05/15] Small Models are Valuable Plug-ins for Large Language Models | [paper] | [code]
  - [2023/05/02] Unlimiformer: Long-Range Transformers with Unlimited Length Input | [paper] | [code]
  - [2023/05/01] Learning to Reason and Memorize with Self-Notes | [paper] | [code]
  - [2023/04/27] ChatLog: Recording and Analyzing ChatGPT Across Time | [paper] | [code]
  - [2023/04/26] Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System | [paper] | [code]
  - [2023/04/21] Emergent and Predictable Memorization in Large Language Models | [paper] | [code]
  - [2023/03/17] CoLT5: Faster Long-Range Transformers with Conditional Computation | [paper] | [code]
- Game Playing
  - [2023/12/29] Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game | [paper] | [code]
  - [2023/10/31] Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models | [paper] | [code]
  - [2023/09/29] Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4 | [paper] | [code]
  - [2023/09/10] An Appraisal-Based Chain-Of-Emotion Architecture for Affective Language Model Game Agents | [paper] | [code]
  - [2023/09/09] Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf | [paper] | [code]
  - [2023/08/23] Are ChatGPT and GPT-4 Good Poker Players? -- A Pre-Flop Analysis | [paper] | [code]
  - [2023/05/31] Recursive Metropolis-Hastings Naming Game: Symbol Emergence in a Multi-agent System based on Probabilistic Generative Models | [paper] | [code]
  - [2023/05/26] Playing repeated games with Large Language Models | [paper] | [code]
  - [2023/05/25] Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory | [paper] | [code]
  - [2023/05/25] Voyager: An Open-Ended Embodied Agent with Large Language Models | [paper] | [code]
  - [2023/05/17] Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback | [paper] | [code]
  - [2023/05/08] Knowledge-enhanced Agents for Interactive Text Games | [paper] | [code]
  - [2023/03/29] Plan4MC: Skill Reinforcement Learning and Planning for Open-World Minecraft Tasks | [paper] | [code]
- Game Platform
- Benchmark&Evaluation&Framework
  - [2024/01/05] AFSPP: Agent Framework for Shaping Preference and Personality with Large Language Models | [paper] | [code]
  - [2024/01/02] CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation | [paper] | [code]
  - [2023/12/28] How Far Are We from Believable AI Agents? A Framework for Evaluating the Believability of Human Behavior Simulation | [paper] | [code]
  - [2023/12/26] RoleEval: A Bilingual Role Evaluation Benchmark for Large Language Models | [paper] | [code]
  - [2023/11/17] Testing Language Model Agents Safely in the Wild | [paper] | [code]
  - [2023/11/16] ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks | [paper] | [code]
  - [2023/11/15] ToolTalk: Evaluating Tool-Usage in a Conversational Setting | [paper] | [code]
  - [2023/11/02] ProAgent: From Robotic Process Automation to Agentic Process Automation | [paper] | [code]
  - [2023/10/24] FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions | [paper] | [code]
  - [2023/10/09] Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena | [paper] | [code]
  - [2023/10/02] SmartPlay: A Benchmark for LLMs as Intelligent Agents | [paper] | [code]
  - [2023/09/29] Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency | [paper] | [code]
  - [2023/09/14] Agents: An Open-source Framework for Autonomous Language Agents | [paper] | [code]
  - [2023/08/22] ProAgent: Building Proactive Cooperative AI with Large Language Models | [paper] | [code]
  - [2023/08/11] BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents | [paper] | [code]
  - [2023/08/07] AgentBench: Evaluating LLMs as Agents | [paper] | [code]
  - [2023/07/31] HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution | [paper] | [code]
  - [2023/06/09] Mind2Web: Towards a Generalist Agent for the Web | [paper] | [code]
- Tool Usage&Human-Agent Interaction
  - [2024/01/03] GPT-4V(ision) is a Generalist Web Agent, if Grounded | [paper] | [code]
  - [2023/12/21] Team Flow at DRC2023: Building Common Ground and Text-based Turn-taking in a Travel Agent Spoken Dialogue System | [paper] | [code]
  - [2023/12/21] AppAgent: Multimodal Agents as Smartphone Users | [paper] | [code]
  - [2023/12/14] CogAgent: A Visual Language Model for GUI Agents | [paper] | [code]
  - [2023/11/19] TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems | [paper] | [code]
  - [2023/10/18] MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models | [paper] | [code]
  - [2023/10/13] AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems | [paper] | [code]
  - [2023/10/12] A Zero-Shot Language Agent for Computer Control with Structured Reflection | [paper] | [code]
  - [2023/09/02] ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models | [paper] | [code]
  - [2023/08/07] TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents | [paper] | [code]
  - [2023/06/05] When Large Language Model based Agent Meets User Behavior Analysis: A Novel User Simulation Paradigm | [paper] | [code]