How to stream LLM tokens from your graph¶
In this example, we will stream tokens from the language model powering an agent. We will use a ReAct agent as an example.
Note
If you are using a version of @langchain/core < 0.2.3, when calling chat models or LLMs you need to call await model.stream() within your nodes to get token-by-token streaming events, and aggregate the final output if needed to update the graph state (a sketch of that older pattern follows this note). In later versions of @langchain/core, this occurs automatically, and you can call await model.invoke() instead.
For more information on how to upgrade @langchain/core, check out the instructions here.
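For reference, here is a minimal sketch of that legacy pattern. It assumes the boundModel and StateAnnotation defined later in this guide, and relies on the concat method of message chunks for aggregation:
import type { AIMessageChunk } from "@langchain/core/messages";
const callModelLegacy = async (state: typeof StateAnnotation.State) => {
  // For @langchain/core < 0.2.3: stream the model call inside the node...
  const chunkStream = await boundModel.stream(state.messages);
  let finalMessage: AIMessageChunk | undefined;
  for await (const chunk of chunkStream) {
    // ...and aggregate the chunks into a single message for the state update.
    finalMessage = finalMessage === undefined ? chunk : finalMessage.concat(chunk);
  }
  return { messages: [finalMessage!] };
};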
This how-to guide closely follows the others in this directory, showing how to incorporate the functionality into a prototypical agent in LangGraph.
Streaming Support
Streaming is supported by many, but not all, chat models. Check whether your LLM integration supports token streaming here (doc). Note that some integrations may support general token streaming but lack support for streaming tool calls.
Note
In this how-to, we will create our agent from scratch to be transparent (but verbose). You can accomplish similar functionality using the createReactAgent({ llm, tools }) (API doc) constructor, sketched below. This may be more appropriate if you are used to LangChain's AgentExecutor class.
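For reference, a minimal sketch of that prebuilt alternative (it assumes the searchTool defined later in this guide):
import { ChatOpenAI } from "@langchain/openai";
import { createReactAgent } from "@langchain/langgraph/prebuilt";
// Prebuilt equivalent of the agent constructed by hand in this guide.
const prebuiltAgent = createReactAgent({
  llm: new ChatOpenAI({ model: "gpt-4o-mini", temperature: 0 }),
  tools: [searchTool],
});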
Setup¶
This guide will use OpenAI's GPT-4o-mini model. We will optionally set our API key for LangSmith tracing, which will give us best-in-class observability.
// process.env.OPENAI_API_KEY = "sk_...";
// Optional, add tracing in LangSmith
// process.env.LANGCHAIN_API_KEY = "ls__...";
// process.env.LANGCHAIN_CALLBACKS_BACKGROUND = "true";
// process.env.LANGCHAIN_TRACING = "true";
// process.env.LANGCHAIN_PROJECT = "Stream Tokens: LangGraphJS";
Define the state¶
The state is the interface for all of the nodes in our graph.
import { Annotation } from "@langchain/langgraph";
import type { BaseMessageLike } from "@langchain/core/messages";
const StateAnnotation = Annotation.Root({
messages: Annotation<BaseMessageLike[]>({
reducer: (x, y) => x.concat(y),
}),
});
Set up the tools¶
First define the tools you want to use. For this simple example, we'll create a placeholder search engine, but see the documentation here on how to create your own custom tools.
import { tool } from "@langchain/core/tools";
import { z } from "zod";
const searchTool = tool((_) => {
// This is a placeholder for the actual implementation
return "Cold, with a low of 3℃";
}, {
name: "search",
description:
"Use to surf the web, fetch current information, check the weather, and retrieve other information.",
schema: z.object({
query: z.string().describe("The query to use in your search."),
}),
});
await searchTool.invoke({ query: "What's the weather like?" });
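Cold, with a low of 3℃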
const tools = [searchTool];
We can now wrap these tools in a prebuilt ToolNode. This object will actually run the tools (functions) whenever they are invoked by the LLM.
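Here is that wrapper, which provides the toolNode used when we build the graph below:
import { ToolNode } from "@langchain/langgraph/prebuilt";
const toolNode = new ToolNode(tools);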
Set up the model¶
Now load the chat model.
- It should work with messages. We will represent all agent state in the form of messages, so it needs to be able to work well with them.
- It should work with tool calling, meaning it can return function arguments in its response.
Note
These model requirements are not general requirements for using LangGraph; they are just the requirements for this one example.
import { ChatOpenAI } from "@langchain/openai";
const model = new ChatOpenAI({
model: "gpt-4o-mini",
temperature: 0,
});
After we've done this, we should make sure the model knows that it has these tools available to call. We can do this by calling bindTools, which gives us the boundModel used in the graph below.
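const boundModel = model.bindTools(tools);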
Define the graph¶
We can now put it all together.
import { StateGraph, END } from "@langchain/langgraph";
import { AIMessage } from "@langchain/core/messages";
const routeMessage = (state: typeof StateAnnotation.State) => {
const { messages } = state;
const lastMessage = messages[messages.length - 1] as AIMessage;
// If no tools are called, we can finish (respond to the user)
if (!lastMessage?.tool_calls?.length) {
return END;
}
// Otherwise if there is, we continue and call the tools
return "tools";
};
const callModel = async (
state: typeof StateAnnotation.State,
) => {
// For versions of @langchain/core < 0.2.3, you must call `.stream()`
// and aggregate the message from chunks instead of calling `.invoke()`.
const { messages } = state;
const responseMessage = await boundModel.invoke(messages);
return { messages: [responseMessage] };
};
const workflow = new StateGraph(StateAnnotation)
.addNode("agent", callModel)
.addNode("tools", toolNode)
.addEdge("__start__", "agent")
.addConditionalEdges("agent", routeMessage)
.addEdge("tools", "agent");
const agent = workflow.compile();
import * as tslab from "tslab";
// Render the compiled graph as a Mermaid PNG (for notebook environments).
const runnableGraph = agent.getGraph();
const image = await runnableGraph.drawMermaidPng();
const arrayBuffer = await image.arrayBuffer();
await tslab.display.png(new Uint8Array(arrayBuffer));
Streaming LLM Tokens¶
You can access the LLM tokens as they are produced by each node with two approaches:
- The stream method along with streamMode: "messages"
- The streamEvents method
The stream method¶
Compatibility
This section requires @langchain/langgraph>=0.2.20. For help upgrading, see this guide.
For this method, you must be using an LLM that supports streaming (e.g. new ChatOpenAI({ model: "gpt-4o-mini" })), or call .stream on the internal LLM call.
import { isAIMessageChunk } from "@langchain/core/messages";
const stream = await agent.stream(
{ messages: [{ role: "user", content: "What's the current weather in Nepal?" }] },
{ streamMode: "messages" },
);
for await (const [message, _metadata] of stream) {
if (isAIMessageChunk(message) && message.tool_call_chunks?.length) {
console.log(`${message.getType()} MESSAGE TOOL CALL CHUNK: ${message.tool_call_chunks[0].args}`);
} else {
console.log(`${message.getType()} MESSAGE CONTENT: ${message.content}`);
}
}
ai MESSAGE TOOL CALL CHUNK:
ai MESSAGE TOOL CALL CHUNK: {"
ai MESSAGE TOOL CALL CHUNK: query
ai MESSAGE TOOL CALL CHUNK: ":"
ai MESSAGE TOOL CALL CHUNK: current
ai MESSAGE TOOL CALL CHUNK: weather
ai MESSAGE TOOL CALL CHUNK: in
ai MESSAGE TOOL CALL CHUNK: Nepal
ai MESSAGE TOOL CALL CHUNK: "}
ai MESSAGE CONTENT:
tool MESSAGE CONTENT: Cold, with a low of 3℃
ai MESSAGE CONTENT:
ai MESSAGE CONTENT: The
ai MESSAGE CONTENT: current
ai MESSAGE CONTENT: weather
ai MESSAGE CONTENT: in
ai MESSAGE CONTENT: Nepal
ai MESSAGE CONTENT: is
ai MESSAGE CONTENT: cold
ai MESSAGE CONTENT: ,
ai MESSAGE CONTENT: with
ai MESSAGE CONTENT: a
ai MESSAGE CONTENT: low
ai MESSAGE CONTENT: temperature
ai MESSAGE CONTENT: of
ai MESSAGE CONTENT:
ai MESSAGE CONTENT: 3
ai MESSAGE CONTENT: ℃
ai MESSAGE CONTENT: .
ai MESSAGE CONTENT:
Disabling streaming¶
If you wish to disable streaming for a given node or model call, you can add a "nostream" tag. Here's an example where we add an initial node with an LLM call that will not be streamed in the final output:
import { RunnableLambda } from "@langchain/core/runnables";
const unstreamed = async (_: typeof StateAnnotation.State) => {
const model = new ChatOpenAI({
model: "gpt-4o-mini",
temperature: 0,
});
const res = await model.invoke("How are you?");
console.log("LOGGED UNSTREAMED MESSAGE", res.content);
// Don't update the state, this is just to show a call that won't be streamed
return {};
}
const agentWithNoStream = new StateGraph(StateAnnotation)
.addNode("unstreamed",
// Add a "nostream" tag to the entire node
RunnableLambda.from(unstreamed).withConfig({
tags: ["nostream"]
})
)
.addNode("agent", callModel)
.addNode("tools", toolNode)
// Run the unstreamed node before the agent
.addEdge("__start__", "unstreamed")
.addEdge("unstreamed", "agent")
.addConditionalEdges("agent", routeMessage)
.addEdge("tools", "agent")
.compile();
const stream = await agentWithNoStream.stream(
{ messages: [{ role: "user", content: "What's the current weather in Nepal?" }] },
{ streamMode: "messages" },
);
for await (const [message, _metadata] of stream) {
if (isAIMessageChunk(message) && message.tool_call_chunks?.length) {
console.log(`${message.getType()} MESSAGE TOOL CALL CHUNK: ${message.tool_call_chunks[0].args}`);
} else {
console.log(`${message.getType()} MESSAGE CONTENT: ${message.content}`);
}
}
LOGGED UNSTREAMED MESSAGE I'm just a computer program, so I don't have feelings, but I'm here and ready to help you! How can I assist you today?
ai MESSAGE TOOL CALL CHUNK:
ai MESSAGE TOOL CALL CHUNK: {"
ai MESSAGE TOOL CALL CHUNK: query
ai MESSAGE TOOL CALL CHUNK: ":"
ai MESSAGE TOOL CALL CHUNK: current
ai MESSAGE TOOL CALL CHUNK: weather
ai MESSAGE TOOL CALL CHUNK: in
ai MESSAGE TOOL CALL CHUNK: Nepal
ai MESSAGE TOOL CALL CHUNK: "}
ai MESSAGE CONTENT:
tool MESSAGE CONTENT: Cold, with a low of 3℃
ai MESSAGE CONTENT:
ai MESSAGE CONTENT: The
ai MESSAGE CONTENT: current
ai MESSAGE CONTENT: weather
ai MESSAGE CONTENT: in
ai MESSAGE CONTENT: Nepal
ai MESSAGE CONTENT: is
ai MESSAGE CONTENT: cold
ai MESSAGE CONTENT: ,
ai MESSAGE CONTENT: with
ai MESSAGE CONTENT: a
ai MESSAGE CONTENT: low
ai MESSAGE CONTENT: temperature
ai MESSAGE CONTENT: of
ai MESSAGE CONTENT:
ai MESSAGE CONTENT: 3
ai MESSAGE CONTENT: ℃
ai MESSAGE CONTENT: .
ai MESSAGE CONTENT:
The streamEvents method¶
You can also use the streamEvents method like this:
const eventStream = agent.streamEvents(
{ messages: [{ role: "user", content: "What's the weather like today?" }] },
{ version: "v2" },
);
for await (const { event, data } of eventStream) {
if (event === "on_chat_model_stream" && isAIMessageChunk(data.chunk)) {
if (
data.chunk.tool_call_chunks !== undefined &&
data.chunk.tool_call_chunks.length > 0
) {
console.log(data.chunk.tool_call_chunks);
}
}
}
[
{
name: 'search',
args: '',
id: 'call_Qpd6frHt0yUYWynRbZEXF3le',
index: 0,
type: 'tool_call_chunk'
}
]
[
{
name: undefined,
args: '{"',
id: undefined,
index: 0,
type: 'tool_call_chunk'
}
]
[
{
name: undefined,
args: 'query',
id: undefined,
index: 0,
type: 'tool_call_chunk'
}
]
[
{
name: undefined,
args: '":"',
id: undefined,
index: 0,
type: 'tool_call_chunk'
}
]
[
{
name: undefined,
args: 'current',
id: undefined,
index: 0,
type: 'tool_call_chunk'
}
]
[
{
name: undefined,
args: ' weather',
id: undefined,
index: 0,
type: 'tool_call_chunk'
}
]
[
{
name: undefined,
args: ' today',
id: undefined,
index: 0,
type: 'tool_call_chunk'
}
]
[
{
name: undefined,
args: '"}',
id: undefined,
index: 0,
type: 'tool_call_chunk'
}
]