TL;DR: LangGraph is module built on top of LangChain to better enable creation of cyclical graphs, often needed for agent runtimes.
One of the things we highlighted in our LangChain v0.1 announcement was the introduction of a new library: LangGraph. LangGraph is built on top of LangChain and completely interoperable with the LangChain ecosystem. It adds new value primarily through the introduction of an easy way to create cyclical graphs. This is often useful when creating agent runtimes.
In this blog post, we will first walk through the motivations for LangGraph. We will then cover the basic functionality it provides. We will then spotlight two agent runtimes we've implemented already. We will then highlight a few of the common modifications to these runtimes we've heard requests for, and examples of implementing those. We will finish with a preview of what we will be releasing next.
One of the big value props of LangChain is the ability to easily create custom chains. We've invested heavily in the functionality for this with LangChain Expression Language. However, so far we've lacked a method for easily introducing cycles into these chains. Effectively, these chains are directed acyclic graphs (DAGs) - as are most data orchestration frameworks.
One of the common patterns we see when people are creating more complex LLM applications is the introduction of cycles into the runtime. These cycles often use the LLM to reason about what to do next in the cycle. A big unlock of LLMs is the ability to use them for these reasoning tasks. This can essentially be thought of as running an LLM in a for-loop. These types of systems are often called agents.
An example of why this agentic behavior can be so powerful can be found when considering a typical retrieval augmented generation (RAG) application. In a typical RAG application, a call to a retriever is made that returns some documents. These documents are then passed to an LLM to generate a final answer. While this is often effective, it breaks down in cases when the first retrieval step fails to return any useful results. In this case, it's often ideal if the LLM can reason that the results returned from the retriever are poor, and maybe issue a second (more refined) query to the retriever, and use those results instead. Essentially, running an LLM in a loop helps create applications that are more flexible and thus can accomplish more vague use-cases that may not be predefined.
These types of applications are often called agents. The simplest - but at the same time most ambitious - form of these is a loop that essentially has two steps:
- Call the LLM to determine either (a) what actions to take, or (b) what response to give the user
- Take given actions, and pass back to step 1
These steps are repeated until a final response is generated. This is essentially the loop that powers our core AgentExecutor, and is the same logic that caused projects like AutoGPT to rise in prominence. This is simple because it is a relatively simple loop. It is the most ambitious because it offloads pretty much ALL of the decision making and reasoning ability to the LLM.
One thing we've seen in practice as we've worked the community and companies to put agents into production is that often times more control is needed. You may want to always force an agent to call particular tool first. You may want have more control over how tools are called. You may want to have different prompts for the agent, depending on that state it is in.
When talking about these more controlled flows, we internally refer to them as "state machines". See the below diagram from our blog on cognitive architectures.
These state machines have the power of being able to loop - allowing for handling of more ambiguous inputs than simple chains. However, there is still an element of human guidance in terms of how that loop is constructed.
LangGraph is a way to create these state machines by specifying them as graphs.
At it's core, LangGraph exposes a pretty narrow interface on top of LangChain.
StateGraph is a class that represents the graph. You initialize this class by passing in a
state definition. This state definition represents a central state object that is updated over time. This state is updated by nodes in the graph, which return operations to attributes of this state (in the form of a key-value store).
The attributes of this state can be updated in two ways. First, an attribute could be overridden completely. This is useful if you want to nodes to return the new value of an attribute. Second, an attribute could be updated by adding to its value. This is useful if an attribute is a list of actions taken (or something similar) and you want nodes to return new actions taken (and have those automatically added to the attribute).
You specify whether an attribute should be overridden or added to when creating the initial state definition. See an example in pseudocode below.
from langgraph.graph import StateGraph
from typing import TypedDict, List, Annotated
all_actions: Annotated[List[str], operator.add]
graph = StateGraph(State)
After creating a
StateGraph, you then add nodes with
graph.add_node(name, value) syntax. The
name parameter should be a string that we will use to refer to the node when adding edges. The
value parameter should be either a function or LCEL runnable that will be called. This function/LCEL should accept a dictionary in the same form as the
State object as input, and output a dictionary with keys of the
State object to update.
See an example in pseudocode below.
There is also a special
END node that is used to represent the end of the graph. It is important that your cycles be able to end eventually!
from langgraph.graph import END
After adding nodes, you can then add edges to create the graph. There are a few types of edges.
The Starting Edge
This is the edge that connects the start of the graph to a particular node. This will make it so that that node is the first one called when input is passed to the graph. Pseudocode for that is:
These are edges where one node should ALWAYS be called after another. An example of this may be in the basic agent runtime, where we always want the model to be called after we call a tool.
These are where a function (often powered by an LLM) is used to determine which node to go to first. To create this edge, you need to pass in three things:
- The upstream node: the output of this node will be looked at to determine what to do next
- A function: this will be called to determine which node to call next. It should return a string
- A mapping: this mapping will be used to map the output of the function in (2) to another node. The keys should be possible values that the function in (2) could return. The values should be names of nodes to go to if that value is returned.
An example of this could be that after a model is called we either exit the graph and return to the user, or we call a tool - depending on what a user decides! See an example in pseudocode below:
After we define our graph, we can compile it into a runnable! This simply takes the graph definition we've created so far an returns a runnable. This runnable exposes all the same method as LangChain runnables (
.astream_log, etc) allowing it to be called in the same manner as a chain.
app = graph.compile()
We've recreated the canonical LangChain AgentExecutor with LangGraph. This will allow you to use existing LangChain agents, but allow you to more easily modify the internals of the AgentExecutor. The state of this graph by default contains concepts that should be familiar to you if you've used LangChain agents:
agent_outcome to represent the most recent agent outcome)
from typing import TypedDict, Annotated, List, Union
from langchain_core.agents import AgentAction, AgentFinish
from langchain_core.messages import BaseMessage
agent_outcome: Union[AgentAction, AgentFinish, None]
intermediate_steps: Annotated[list[tuple[AgentAction, str]], operator.add]
See this notebook for how to get started
Chat Agent Executor
One common trend we've seen is that more and more models are "chat" models which operate on a list of messages. This models are often the ones equipped with things like function calling, which make agent-like experiences much more feasible. When working with these types of models, it is often intuitive to represent the state of an agent as a list of messages.
As such, we've created an agent runtime that works with this state. The input is a list of messages, and nodes just simply add to this list of messages over time.
from typing import TypedDict, Annotated, Sequence
from langchain_core.messages import BaseMessage
messages: Annotated[Sequence[BaseMessage], operator.add]
See this notebook for how to get started
One of the big benefits of LangGraph is that it exposes the logic of AgentExecutor in a far more natural and modifiable way. We've provided a few examples of modifications that we've heard requests for:
Force Calling a Tool
Managing Agent Steps
Returning Output in a Specific Format
How to make the agent return output in a specific format using function calling. Only for Chat Agent Executor.
Dynamically Returning the Output of a Tool Directly
Sometimes you may want to return the output of a tool directly. We provide an easy way to do with the
return_direct parameter in LangChain. However, this makes it so that the output of a tool is ALWAYS returned directly. Sometimes, you may want to let the LLM choose whether to return the response directly or not. Only for Chat Agent Executor.
We're incredibly excited about the possibility of LangGraph enabling more custom and powerful agent runtimes. Some of the things we are looking to implement in the near future:
- More advanced agent runtimes from academia (LLM Compiler, plan-and-solve, etc)
- Stateful tools (allowing tools to modify some state)
- More controlled human-in-the-loop workflows
- Multi-agent workflows
If any of these resonate with you, please feel free to add an example notebook in the LangGraph repo, or reach out to us at email@example.com for more involved collaboration!