How to return structured output with a ReAct-style agent¶
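You might want your agent to return its output in a structured format. For example, if the agent's output is consumed by other downstream software, you may want the output to have the same structured format on every invocation to ensure consistency.
This notebook walks through two different ways to force a tool-calling agent to structure its output. We will use a basic ReAct agent (a model node and a tool-calling node) together with a third node at the end that formats the response for the user. Both options use the same graph structure, shown in the diagram below, but differ in their underlying mechanism.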
Option 1
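The first way to force your tool-calling agent to produce structured output is to bind the desired output as an additional tool that the `agent` node can use. In contrast to the basic ReAct agent, the `agent` node here does not choose between `tools` and `END`; instead it chooses which specific tool to call. The expected flow is that the LLM in the `agent` node first selects the action tool and, after receiving the action tool's output, calls the response tool. The graph then routes to the `respond` node, which simply structures the arguments from the `agent` node's tool call.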
Pros and cons
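The benefit of this format is that you need only one LLM, which can save money and latency. The downside is that there is no guarantee the single LLM will call the correct tool when you want it to. We can help the LLM by setting `tool_choice` to `any` when using `bind_tools`, which forces the LLM to select at least one tool on every turn, but this is far from a foolproof strategy. Another downside is that the agent might call multiple tools at once, so we need to check for this explicitly in our routing function (or, if we are using OpenAI, we can set `parallel_tool_calls=False` to ensure only one tool is called at a time).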
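The explicit check for multiple tool calls could be sketched as follows. This is a minimal, framework-free illustration: tool calls are modeled as plain dicts with a `name` key (mirroring the `tool_calls` entries used later in this notebook), and the `route` helper is a hypothetical name introduced only for this example.

```python
RESPONSE_TOOL = "WeatherResponse"  # name of the response tool used later in this guide


def route(tool_calls: list[dict]) -> str:
    """Pick the next node from the tool calls on the last model message.

    Hypothetical helper: each entry in `tool_calls` is a dict with a "name" key.
    """
    names = [call["name"] for call in tool_calls]
    # Respond only when the response tool is the single tool requested.
    if names == [RESPONSE_TOOL]:
        return "respond"
    # Any other case (an action tool, or several tools at once) goes
    # back through the tool node before the agent is asked again.
    return "continue"


print(route([{"name": "WeatherResponse", "args": {}}]))
print(route([{"name": "get_weather", "args": {"city": "sf"}}]))
print(route([{"name": "get_weather"}, {"name": "WeatherResponse"}]))
```

Comparing the full list of names against `[RESPONSE_TOOL]` covers both conditions at once: exactly one call, and that call is the response tool.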
Option 2
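The second way to force your tool-calling agent to produce structured output is to use a second LLM (in this case `model_with_structured_output`) to respond to the user.
In this case, you define a basic ReAct agent as usual, but instead of the `agent` node choosing between the `tools` node and ending the conversation, it chooses between the `tools` node and the `respond` node. The `respond` node contains a second LLM that uses structured output, and once called it returns directly to the user. You can think of this method as basic ReAct with one extra step before responding to the user.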
Pros and cons
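The benefit of this method is that it guarantees structured output (as long as `.with_structured_output` works as expected with the LLM). The downside is that it requires an additional LLM call before responding to the user, which increases cost and latency. In addition, because the `agent` node LLM is given no information about the desired output schema, there is a risk that it fails to call the tools needed to answer in the correct output schema.
Note that both options follow exactly the same graph structure (see the diagram above): each is an exact replica of the basic ReAct architecture with a `respond` node added at the end.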
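The shared control flow can be sketched without any framework. In the sketch below, `fake_agent`, `fake_tool`, and `run` are hypothetical stand-ins (no LLM is called, and messages are plain strings); it only illustrates the agent → tools → agent → respond loop that both options share.

```python
def fake_agent(messages: list[str]) -> str:
    """Stand-in for the agent node: request a tool first, then finish."""
    has_tool_output = any(m.startswith("tool:") for m in messages)
    return "finish" if has_tool_output else "call_tool"


def fake_tool(messages: list[str]) -> str:
    """Stand-in for the tools node: return a canned observation."""
    return "tool: 75 degrees and sunny"


def run(question: str) -> dict:
    """Walk the graph by hand and record which nodes were visited."""
    messages = [f"human: {question}"]
    visited = []
    while True:
        visited.append("agent")
        decision = fake_agent(messages)
        messages.append(f"ai: {decision}")
        if decision == "finish":
            break  # the conditional edge routes to `respond`
        visited.append("tools")
        messages.append(fake_tool(messages))  # edge tools -> agent
    visited.append("respond")
    # The respond node turns the latest observation into a fixed structure.
    observation = next(m for m in reversed(messages) if m.startswith("tool:"))
    summary = observation[len("tool: "):]
    return {"visited": visited, "final_response": {"summary": summary}}


result = run("what's the weather in SF?")
print(result["visited"])
print(result["final_response"])
```

The only difference between the two options is what happens inside the agent's decision and inside the final `respond` step; the loop itself stays the same.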
Setup¶
First, let's install the required packages and set our API keys.
%%capture --no-stderr
%pip install -U langgraph langchain_anthropic

import getpass
import os


def _set_env(var: str):
    # Prompt for the variable only if it is not already set
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")


_set_env("ANTHROPIC_API_KEY")

from pydantic import BaseModel, Field
from typing import Literal
from langchain_core.tools import tool
from langchain_anthropic import ChatAnthropic
from langgraph.graph import MessagesState


class WeatherResponse(BaseModel):
    """Respond to the user with this"""

    temperature: float = Field(description="The temperature in fahrenheit")
    wind_directon: str = Field(
        description="The direction of the wind in abbreviated form"
    )
    wind_speed: float = Field(description="The speed of the wind in km/h")


# Inherit 'messages' key from MessagesState, which is a list of chat messages
class AgentState(MessagesState):
    # Final structured response from the agent
    final_response: WeatherResponse


@tool
def get_weather(city: Literal["nyc", "sf"]):
    """Use this to get weather information."""
    if city == "nyc":
        return "It is cloudy in NYC, with 5 mph winds in the North-East direction and a temperature of 70 degrees"
    elif city == "sf":
        return "It is 75 degrees and sunny in SF, with 3 mph winds in the South-East direction"
    else:
        raise AssertionError("Unknown city")


tools = [get_weather]

model = ChatAnthropic(model="claude-3-opus-20240229")

model_with_tools = model.bind_tools(tools)
model_with_structured_output = model.with_structured_output(WeatherResponse)

from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode

tools = [get_weather, WeatherResponse]

# Force the model to use tools by passing tool_choice="any"
model_with_response_tool = model.bind_tools(tools, tool_choice="any")


# Define the function that calls the model
def call_model(state: AgentState):
    response = model_with_response_tool.invoke(state["messages"])
    # We return a list, because this will get added to the existing list
    return {"messages": [response]}


# Define the function that responds to the user
def respond(state: AgentState):
    # Construct the final answer from the arguments of the last tool call
    response = WeatherResponse(**state["messages"][-1].tool_calls[0]["args"])
    # We return the final answer
    return {"final_response": response}


# Define the function that determines whether to continue or not
def should_continue(state: AgentState):
    messages = state["messages"]
    last_message = messages[-1]
    # If there is only one tool call and it is the response tool call we respond to the user
    if (
        len(last_message.tool_calls) == 1
        and last_message.tool_calls[0]["name"] == "WeatherResponse"
    ):
        return "respond"
    # Otherwise we will use the tool node again
    else:
        return "continue"


# Define a new graph
workflow = StateGraph(AgentState)

# Define the two nodes we will cycle between
workflow.add_node("agent", call_model)
workflow.add_node("respond", respond)
workflow.add_node("tools", ToolNode(tools))

# Set the entrypoint as `agent`
# This means that this node is the first one called
workflow.set_entry_point("agent")

# We now add a conditional edge
workflow.add_conditional_edges(
    "agent",
    should_continue,
    {
        "continue": "tools",
        "respond": "respond",
    },
)

workflow.add_edge("tools", "agent")
workflow.add_edge("respond", END)
graph = workflow.compile()
Usage¶
Now we can run our graph to check that it works as intended:
answer = graph.invoke(input={"messages": [("human", "what's the weather in SF?")]})[
    "final_response"
]

answer
WeatherResponse(temperature=75.0, wind_directon='SE', wind_speed=3.0)
The agent returned a `WeatherResponse` object, as expected.

from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
from langchain_core.messages import HumanMessage


# Define the function that calls the model
def call_model(state: AgentState):
    response = model_with_tools.invoke(state["messages"])
    # We return a list, because this will get added to the existing list
    return {"messages": [response]}


# Define the function that responds to the user
def respond(state: AgentState):
    # We call the model with structured output in order to return the same format to the user every time
    # state['messages'][-2] is the last ToolMessage in the convo, which we convert to a HumanMessage for the model to use
    # We could also pass the entire chat history, but this saves tokens since all we care to structure is the output of the tool
    response = model_with_structured_output.invoke(
        [HumanMessage(content=state["messages"][-2].content)]
    )
    # We return the final answer
    return {"final_response": response}


# Define the function that determines whether to continue or not
def should_continue(state: AgentState):
    messages = state["messages"]
    last_message = messages[-1]
    # If there is no function call, then we respond to the user
    if not last_message.tool_calls:
        return "respond"
    # Otherwise if there is, we continue
    else:
        return "continue"


# Define a new graph
workflow = StateGraph(AgentState)

# Define the two nodes we will cycle between
workflow.add_node("agent", call_model)
workflow.add_node("respond", respond)
workflow.add_node("tools", ToolNode(tools))

# Set the entrypoint as `agent`
# This means that this node is the first one called
workflow.set_entry_point("agent")

# We now add a conditional edge
workflow.add_conditional_edges(
    "agent",
    should_continue,
    {
        "continue": "tools",
        "respond": "respond",
    },
)

workflow.add_edge("tools", "agent")
workflow.add_edge("respond", END)
graph = workflow.compile()
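
The `respond` node above relies on `state["messages"][-2]` being the most recent tool output. A slightly more defensive variant could scan backwards for the latest tool message instead. The sketch below uses a hypothetical `Msg` dataclass as a stand-in for chat messages, assuming each message exposes a `type` string (such as `"tool"`) and a `content` string; `last_tool_content` is an illustrative helper, not part of LangGraph.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Msg:
    """Hypothetical stand-in for a chat message."""

    type: str
    content: str


def last_tool_content(messages: list[Msg]) -> Optional[str]:
    """Return the content of the most recent tool message, or None if there is none."""
    for message in reversed(messages):
        if message.type == "tool":
            return message.content
    return None


history = [
    Msg("human", "what's the weather in SF?"),
    Msg("ai", ""),  # the model's tool-call turn
    Msg("tool", "It is 75 degrees and sunny in SF"),
    Msg("ai", "It is sunny in SF."),
]
print(last_tool_content(history))
```

Returning `None` when no tool was called lets the caller decide how to respond, for example by falling back to the full chat history.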
Usage¶
We can now invoke our graph to verify that the output is structured as desired:
answer = graph.invoke(input={"messages": [("human", "what's the weather in SF?")]})[
    "final_response"
]

answer
WeatherResponse(temperature=75.0, wind_directon='SE', wind_speed=4.83)
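As we can see, the agent returned a `WeatherResponse` object, as we expected. It is now easy to use this agent in a more complex software stack without worrying that its output will not match the format expected by the next step in the stack.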
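The two runs above report different `wind_speed` values for the same tool output (3.0 in the first run, 4.83 in the second). One plausible reading is that the tool text states the speed in mph while the `wind_speed` field is described as km/h, so the second run converted the value and the first passed it through unchanged. The arithmetic can be checked with a small sketch; `mph_to_kmh` is an illustrative helper introduced here, not part of the notebook.

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile


def mph_to_kmh(mph: float) -> float:
    """Convert a speed from miles per hour to kilometres per hour."""
    return mph * KM_PER_MILE


print(round(mph_to_kmh(3), 2))  # 4.83
```

If downstream code depends on a specific unit, it may be safer to perform the conversion in code, or to state the source unit in the field description, rather than rely on the model to convert.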