One of the most common requests we've heard is better functionality and documentation for creating custom agents. This has always been a bit tricky, because in our mind it's still unclear what an "agent" actually is, and therefore what the "right" abstractions for one may be. Recently, we've felt some of those abstractions starting to come together, so we did a big push across both our Python and TypeScript modules to better enforce and document them. Please see below for links to those technical docs, followed by a description of the abstractions we've introduced and future directions.
TL;DR: we've introduced a `BaseSingleActionAgent` as the highest-level abstraction for an agent that can be used in our current `AgentExecutor`. We've added a more practical `LLMSingleActionAgent` that implements this interface in a simple and extensible way (`PromptTemplate` + `LLM` + `OutputParser`).
The most basic abstraction we've introduced is the `BaseSingleActionAgent`. As you can tell by the name, we don't consider this a base abstraction for all agents. Rather, we consider it the base abstraction for a family of agents that predict a single action at a time.
A `SingleActionAgent` is used in our current `AgentExecutor`, which can largely be thought of as a loop that:
- Passes user input and any previous steps to the Agent
- If the Agent returns an `AgentFinish`, then return that directly to the user
- If the Agent returns an `AgentAction`, then use that to call a tool and get an `Observation`
- Repeat, passing the `Observation` back to the Agent until an `AgentFinish` is returned
`AgentAction` is a response that consists of `action` and `action_input`. `action` refers to which tool to use, and `action_input` refers to the input to that tool.
`AgentFinish` is a response that contains the final message to be sent back to the user. This should be used to end an agent run.
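To make the loop concrete, here is a minimal, framework-free sketch of that control flow. The `AgentAction`/`AgentFinish` classes and the `run_agent` function below are simplified stand-ins of our own invention, not LangChain's actual implementation:

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class AgentAction:
    action: str        # which tool to use
    action_input: str  # the input to that tool

@dataclass
class AgentFinish:
    output: str        # final message to send back to the user

def run_agent(
    agent: Callable[[str, list], Union[AgentAction, AgentFinish]],
    tools: dict[str, Callable[[str], str]],
    user_input: str,
) -> str:
    steps: list[tuple[AgentAction, str]] = []  # (action, observation) pairs
    while True:
        # pass user input and any previous steps to the agent
        result = agent(user_input, steps)
        if isinstance(result, AgentFinish):
            return result.output               # done: return to the user
        # otherwise call the chosen tool and record the observation
        observation = tools[result.action](result.action_input)
        steps.append((result, observation))

# toy agent: call the "echo" tool once, then finish with its observation
def toy_agent(user_input, steps):
    if not steps:
        return AgentAction("echo", user_input)
    return AgentFinish(steps[-1][1])

result = run_agent(toy_agent, {"echo": lambda s: s}, "hello")
```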
If you are interested in this level of customizability, check out this walkthrough. For most use cases, however, we would recommend using the abstraction below.
Another class we've introduced is the `LLMSingleActionAgent`. This is a concrete implementation of `BaseSingleActionAgent`, but it is highly modular and therefore highly customizable.
An `LLMSingleActionAgent` consists of four parts:

- `PromptTemplate`: the prompt template that can be used to instruct the language model on what to do
- `LLM`: the language model that powers the agent
- `stop` sequence: instructs the `LLM` to stop generating as soon as this string is found
- `OutputParser`: determines how to parse the output of the `LLM`
The logic for combining these is:
- Use the `PromptTemplate` to turn the input variables (including user input and any previous `AgentAction`/`Observation` pairs) into a prompt
- Pass the prompt to the `LLM`, with the specific `stop` sequence
- Parse the output of the `LLM` into an `AgentAction` or `AgentFinish`
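As a rough sketch of those three steps, with a hard-coded fake LLM standing in for a real model (the template, `fake_llm`, and parser below are illustrative assumptions, not LangChain's real classes):

```python
# Framework-free sketch of the PromptTemplate -> LLM -> OutputParser flow.

PROMPT_TEMPLATE = (
    "Answer the question using the tools available.\n"
    "Previous steps:\n{steps}\n"
    "Question: {input}\n"
)

def fake_llm(prompt: str, stop: str) -> str:
    # stand-in for a real model call; truncates at the stop sequence
    completion = "Final Answer: 42\nObservation: (never seen, due to stop)"
    return completion.split(stop)[0]

def parse_output(text: str) -> tuple:
    # minimal OutputParser: decide between finishing and calling a tool
    if "Final Answer:" in text:
        return ("finish", text.split("Final Answer:")[1].strip())
    action = text.split("Action:")[1].split("\n")[0].strip()
    action_input = text.split("Action Input:")[1].strip()
    return ("action", action, action_input)

def run_single_step(user_input: str, steps: list) -> tuple:
    # 1. PromptTemplate: turn input variables + previous steps into a prompt
    prompt = PROMPT_TEMPLATE.format(
        steps="\n".join(f"{a} -> {o}" for a, o in steps),
        input=user_input,
    )
    # 2. LLM: generate text, halting at the stop sequence
    text = fake_llm(prompt, stop="\nObservation:")
    # 3. OutputParser: turn raw text into a structured decision
    return parse_output(text)
```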
These abstractions can be used to customize your agent in a lot of ways. For example:
- Want to give your agent some personality? Use the `PromptTemplate`!
- Want to format the previous `AgentAction`/`Observation` pairs in a specific way? Use the `PromptTemplate`!
- Want to use a custom or local model? Write a custom LLM wrapper and pass that in as the `LLM`!
- **Is the output parsing too brittle, or do you want to handle errors in a different way? Use a custom `OutputParser`!**
(The last one is in bold, because that's the one we've maybe heard the most.)
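For instance, a more forgiving parser might fall back to treating unparseable output as a final answer rather than raising an exception. The regex format, tuple shapes, and `lenient_parse` name below are assumptions for illustration, not LangChain's built-in parser:

```python
import re

# Matches "Action: <tool>" followed by "Action Input: <input>".
ACTION_RE = re.compile(r"Action\s*:\s*(.*?)\nAction\s*Input\s*:\s*(.*)", re.DOTALL)

def lenient_parse(llm_output: str) -> tuple:
    # final answers take priority
    if "Final Answer:" in llm_output:
        return ("finish", llm_output.split("Final Answer:")[-1].strip())
    # well-formed tool calls
    match = ACTION_RE.search(llm_output)
    if match:
        return ("action", match.group(1).strip(), match.group(2).strip())
    # fallback: don't crash the whole run on malformed output;
    # surface the raw text to the user instead
    return ("finish", llm_output.strip())
```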
We imagine this being the most practically useful abstraction. Please see the documentation links at the beginning of the blog for concrete Python/TypeScript guides on getting started here.
We hope these abstractions have clarified some of our thinking around agents, as well as opened up places where we hope the community can contribute. In particular:
We are very excited about other examples of customizing these abstractions, such as:

- Using embeddings to do tool selection before calling the `LLM`
- Using a `ConstitutionalChain` instead of an `LLMChain` to improve reliability
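As a toy illustration of embedding-based tool selection, one could score each tool's description against the user query and only expose the best matches to the agent. The bag-of-words "embedding" below is a deliberately simple stand-in; a real setup would use proper embedding models:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # toy embedding: bag-of-words term counts
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def select_tools(query: str, tool_descriptions: dict[str, str], k: int = 2) -> list[str]:
    # rank tools by similarity of their description to the query
    q = embed(query)
    ranked = sorted(
        tool_descriptions,
        key=lambda name: cosine(q, embed(tool_descriptions[name])),
        reverse=True,
    )
    return ranked[:k]

tools = {
    "calculator": "useful for math and arithmetic questions",
    "search": "useful for looking up current events on the web",
}
best = select_tools("what is the arithmetic answer to this math problem", tools, k=1)
```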
We are also excited about other types of agents (which will require new abstractions):

- Multi-action agents
- Plan-execute agents
If any of those sound interesting, we are always willing to work with folks to implement their ideas! The best way is probably to do some initial work, open an RFC pull request, and we're happy to go from there :)