Chat Bot Evaluation as Multi-agent Simulation
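When building a chat bot, such as a customer support assistant, it can be hard to properly evaluate the bot's performance. Interacting with it manually and intensively after every code change is time-consuming.
One way to make the evaluation process easier and more reproducible is to simulate user interactions.
Below is an example that uses LangGraph.js to create a "virtual user" that simulates a conversation.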
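The overall simulation alternates between two agents, the simulated user and the chat bot, until a stopping condition is met.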
First, we set up our environment.
// process.env.OPENAI_API_KEY = "sk_...";
// Optional tracing in LangSmith
// process.env.LANGCHAIN_API_KEY = "sk_...";
// process.env.LANGCHAIN_TRACING_V2 = "true";
// process.env.LANGCHAIN_PROJECT = "Agent Simulation Evaluation: LangGraphJS";
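1. Define Chat Bot
Next, we define our chat bot. This implementation uses the OpenAI API to generate responses and takes on the persona of an airline customer support agent.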
import { ChatOpenAI } from "@langchain/openai";
import type { AIMessageChunk, BaseMessageLike } from "@langchain/core/messages";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

async function myChatBot(messages: BaseMessageLike[]): Promise<AIMessageChunk> {
  const systemMessage = {
    role: 'system',
    content: 'You are a customer support agent for an airline.',
  };
  const allMessages = [systemMessage, ...messages];

  const response = await llm.invoke(allMessages);
  return response;
}

// Test the chat bot
const response = await myChatBot([{ role: 'user', content: 'hi!' }]);

console.log(response);
AIMessage {
  "id": "chatcmpl-AE3nMDCiDkmBMSVI6Y6xJBQjjWQwY",
  "content": "Hello! How can I assist you today?",
  "additional_kwargs": {},
  "response_metadata": {
    "tokenUsage": {
      "completionTokens": 9,
      "promptTokens": 23,
      "totalTokens": 32
    },
    "finish_reason": "stop",
    "system_fingerprint": "fp_f85bea6784"
  },
  "tool_calls": [],
  "invalid_tool_calls": [],
  "usage_metadata": {
    "input_tokens": 23,
    "output_tokens": 9,
    "total_tokens": 32
  }
}
2. Define Simulated User
Now we define the simulated user that will interact with our bot.
import { type Runnable } from "@langchain/core/runnables";
import { AIMessage } from "@langchain/core/messages";
import { ChatPromptTemplate } from "@langchain/core/prompts";

async function createSimulatedUser(): Promise<Runnable<{ messages: BaseMessageLike[] }, AIMessage>> {
  const systemPromptTemplate = `You are a customer of an airline company. You are interacting with a user who is a customer support person
{instructions}
If you have nothing more to add to the conversation, you must respond only with a single word: "FINISHED"`;

  const prompt = ChatPromptTemplate.fromMessages([
    ['system', systemPromptTemplate],
    ["placeholder", '{messages}'],
  ]);

  const instructions = `Your name is Harrison. You are trying to get a refund for the trip you took to Alaska.
You want them to give you ALL the money back. Be extremely persistent. This trip happened 5 years ago.`;

  const partialPrompt = await prompt.partial({ instructions });

  const simulatedUser = partialPrompt.pipe(llm);
  return simulatedUser;
}

// Test the simulated user
const messages = [{ role: "user", content: 'Hi! How can I help you?' }];
const simulatedUser = await createSimulatedUser();
const simulatedUserResponse = await simulatedUser.invoke({ messages });
console.log(simulatedUserResponse);
AIMessage {
  "id": "chatcmpl-AE3nNuHpuxAZfG6aQsKoKktitdyfD",
  "content": "Hello! I’m Harrison, and I need to discuss a refund for my trip to Alaska that I took five years ago. I expect all of my money back. Can you assist me with that?",
  "additional_kwargs": {},
  "response_metadata": {
    "tokenUsage": {
      "completionTokens": 40,
      "promptTokens": 108,
      "totalTokens": 148
    },
    "finish_reason": "stop",
    "system_fingerprint": "fp_f85bea6784"
  },
  "tool_calls": [],
  "invalid_tool_calls": [],
  "usage_metadata": {
    "input_tokens": 108,
    "output_tokens": 40,
    "total_tokens": 148
  }
}
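3. Define the Agent Simulation
The code below creates a LangGraph workflow to run the simulation. The main components are:
- Two nodes: one for the simulated user, the other for the chat bot.
- The graph itself, with a conditional stopping criterion.
Read the comments in the code below for more information.
Nodes
First, we define the nodes in the graph. Each node takes in a list of messages and returns a list of messages to add to the state. They are thin wrappers around the chat bot and the simulated user we defined above.
**NOTE:** One tricky thing here is which messages are which. Because both the chat bot and our simulated user are LLMs, both of them respond with AI messages. Our state will be a list of alternating Human and AI messages, so one of the nodes needs logic that flips the AI and human roles. In this example, we assume that HumanMessages are messages from the simulated user. This means the simulated user node needs some logic to swap AI and human messages.
First, let's define the chat bot node.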
import { MessagesAnnotation } from "@langchain/langgraph";

async function chatBotNode(state: typeof MessagesAnnotation.State) {
  const messages = state.messages;

  const chatBotResponse = await myChatBot(messages);

  return { messages: [chatBotResponse] };
}
Next, let's define the node for our simulated user. This involves a little logic to swap the roles of the messages.
import { BaseMessage, HumanMessage } from "@langchain/core/messages";

// MessagesAnnotation coerces all message likes to base message classes
function swapRoles(messages: BaseMessage[]) {
  return messages.map((m) =>
    m instanceof AIMessage
      ? new HumanMessage({ content: m.content })
      : new AIMessage({ content: m.content }),
  );
}

async function simulatedUserNode(state: typeof MessagesAnnotation.State) {
  const messages = state.messages;
  const newMessages = swapRoles(messages);
  // This returns a runnable directly, so we need to use `.invoke` below:
  const simulateUser = await createSimulatedUser();
  const response = await simulateUser.invoke({ messages: newMessages });

  return { messages: [{ role: "user", content: response.content }] };
}
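To make the role flip concrete, here is a minimal sketch that uses plain objects instead of the LangChain message classes. The `Turn` type and the `swapTurnRoles` name are assumptions introduced only for illustration; the real node above works on `BaseMessage` instances.

```typescript
// Plain-object stand-ins for HumanMessage / AIMessage (illustrative assumption).
type Role = "human" | "ai";

interface Turn {
  role: Role;
  content: string;
}

// Flip every role so that the simulated user sees the chat bot's turns as
// "human" input and its own earlier turns as "ai" output.
function swapTurnRoles(turns: Turn[]): Turn[] {
  return turns.map((t): Turn => ({
    role: t.role === "ai" ? "human" : "ai",
    content: t.content,
  }));
}

// State as stored in the graph: the chat bot speaks as "ai",
// the simulated user as "human".
const sharedState: Turn[] = [
  { role: "ai", content: "How can I help you?" },
  { role: "human", content: "I want a refund." },
];

const seenBySimulatedUser = swapTurnRoles(sharedState);
console.log(seenBySimulatedUser);
```

Applying the swap twice returns the original roles, which is a quick sanity check that no turn is dropped or relabeled incorrectly.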
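Edges
We now need to define the logic for the edges. The main decision happens after the simulated user responds, and it leads to one of two outcomes:
- Either we continue and call the customer support bot,
- Or we finish and the conversation is over.
So what is the logic for ending the conversation? We define it as either the simulated user responding with FINISHED (see the system prompt) or the conversation growing longer than 6 messages (an arbitrary number chosen only to keep this example short).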
function shouldContinue(state: typeof MessagesAnnotation.State) {
  const messages = state.messages;
  if (messages.length > 6) {
    return '__end__';
  } else if (messages[messages.length - 1].content === 'FINISHED') {
    return '__end__';
  } else {
    return 'continue';
  }
}
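The exact string comparison above only ends the run when the simulated user replies with precisely FINISHED. LLM output sometimes carries stray whitespace or punctuation, so a more tolerant check can be useful. The following is a sketch under assumptions: `StateMessage`, `isFinished`, and `shouldContinueLenient` are hypothetical names, and message content is assumed to be a plain string.

```typescript
// Minimal stand-in for a message in the graph state (illustrative assumption).
interface StateMessage {
  content: string;
}

// Same arbitrary cap as in the example above.
const MAX_MESSAGES = 6;

// Treat "FINISHED", "finished." or " FINISHED " as the stop signal, but not
// longer sentences that merely contain the word.
function isFinished(content: string): boolean {
  return content.trim().replace(/[.!"']/g, "").toUpperCase() === "FINISHED";
}

function shouldContinueLenient(messages: StateMessage[]): "__end__" | "continue" {
  if (messages.length > MAX_MESSAGES) {
    return "__end__";
  }
  const last = messages[messages.length - 1];
  if (last !== undefined && isFinished(last.content)) {
    return "__end__";
  }
  return "continue";
}

console.log(shouldContinueLenient([{ content: "I want a refund." }]));
console.log(shouldContinueLenient([{ content: "Finished." }]));
```

Because the length check only runs after the simulated user has spoken, a run in which the user never says FINISHED stops at 8 messages rather than 6: after 6 messages the condition `6 > 6` is still false, so one more chat bot and user turn take place.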
Graph
We can now define the graph that sets up the simulation!
import { StateGraph, END, START } from "@langchain/langgraph";

function createSimulation() {
  const workflow = new StateGraph(MessagesAnnotation)
    .addNode('user', simulatedUserNode)
    .addNode('chatbot', chatBotNode)
    .addEdge('chatbot', 'user')
    .addConditionalEdges('user', shouldContinue, {
      [END]: END,
      continue: 'chatbot',
    })
    .addEdge(START, 'chatbot');

  const simulation = workflow.compile();
  return simulation;
}

import * as tslab from "tslab";

const drawableGraph = createSimulation().getGraph();
const image = await drawableGraph.drawMermaidPng();
const arrayBuffer = await image.arrayBuffer();

await tslab.display.png(new Uint8Array(arrayBuffer));
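The control flow of this graph (chat bot first, then the user, then a conditional hop back to the chat bot) can be checked without any API calls. The following is a minimal sketch with deterministic stand-ins for the two LLM-backed nodes. `stubChatBot`, `stubUser`, and `runStubSimulation` are hypothetical names used only here, and the loop is a hand-written imitation of the edges, not LangGraph.js itself.

```typescript
type Speaker = "chatbot" | "user";

interface SimTurn {
  speaker: Speaker;
  content: string;
}

// Deterministic stand-in for the chat bot node: greets first, then refuses.
function stubChatBot(history: SimTurn[]): string {
  return history.length === 0
    ? "How can I help you?"
    : "I'm sorry, I cannot do that.";
}

// Deterministic stand-in for the simulated user: insists twice, then gives up.
function stubUser(history: SimTurn[]): string {
  const userTurns = history.filter((t) => t.speaker === "user").length;
  return userTurns >= 2 ? "FINISHED" : "I want a full refund.";
}

// Mirrors the edges above: START -> chatbot -> user -> (chatbot | END).
function runStubSimulation(maxMessages = 6): SimTurn[] {
  const history: SimTurn[] = [];
  while (true) {
    history.push({ speaker: "chatbot", content: stubChatBot(history) });
    const userReply = stubUser(history);
    history.push({ speaker: "user", content: userReply });
    // The stop condition is evaluated only after the user node, as in the graph.
    if (history.length > maxMessages || userReply === "FINISHED") {
      return history;
    }
  }
}

const stubTranscript = runStubSimulation();
for (const turn of stubTranscript) {
  console.log(`${turn.speaker}: ${turn.content}`);
}
```

With the default cap the stub user ends the run itself on its third turn. Calling `runStubSimulation(2)` instead exercises the length cap, which stops the loop after four messages.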
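4. Run Simulation
Now we can evaluate our chat bot! We can invoke it with empty messages, which simulates letting the chat bot start the initial conversation.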
async function runSimulation() {
  const simulation = createSimulation();
  for await (const chunk of await simulation.stream({})) {
    const nodeName = Object.keys(chunk)[0];
    const messages = chunk[nodeName].messages;
    console.log(`${nodeName}: ${messages[0].content}`);
    console.log('\n---\n');
  }
}

await runSimulation();
chatbot: How can I assist you today with your airline-related questions or concerns?

---

user: Hi, I'm Harrison, and I'm looking to get a refund for a trip I took to Alaska five years ago. I believe I am entitled to a full refund, and I would like to resolve this matter as soon as possible. Can you help me with that?

---

chatbot: Hi Harrison! I’d be happy to assist you with your request. However, I must inform you that our airline’s refund policy typically covers requests made within a certain timeframe from the date of travel, generally within 12 months for most fares. Since your trip to Alaska was five years ago, it is likely that it falls outside of our standard refund window. That said, if there were any extraordinary circumstances surrounding your trip or if you have documentation that supports your claim, please provide more details so I can better assist you. If you haven't already, I recommend contacting our customer service team directly through the website or our dedicated customer service number for specific cases.

---

user: I understand the typical policy, but I believe my situation warrants a full refund regardless of the time elapsed. It's crucial to me that I receive all my money back for the trip. I can provide any necessary details or documentation that supports my claim. Can you please make an exception in this case or escalate this issue? I am determined to get a full refund for my trip!

---

chatbot: I understand how important this matter is to you, Harrison, and I appreciate your determination. Unfortunately, as a customer support agent, I am bound by the airline's policies and procedures, which typically do not allow for exceptions to the refund timeline. However, I recommend that you gather all relevant details and documentation related to your trip, including any evidence that might support your request for an exception.

After you’ve compiled this information, you can submit a formal appeal or request for a special review through our customer service channels. This often involves contacting customer relations or submitting a written request through our website, where your case can be considered by a dedicated team. If you’d like, I can guide you on how to submit this information or help you find the right contact point to escalate your request. Just let me know!

---

user: I appreciate the guidance, but I must insist that a full refund is due to me. This isn't just a matter of policy; it's about recognizing the value of customer experience and fairness. I prepared for this trip and expected that my investment would be protected. I urge you to reconsider and push for this refund on my behalf. I'm not willing to accept a denial based solely on policy restrictions, especially after all this time. Can you take further action to ensure I receive all my money back? Please help me with this!

---

chatbot: I completely understand your feelings and the importance of this situation to you, Harrison. Your concerns about customer experience and fairness are valid, and I empathize with your position. However, I want to clarify that as a customer support agent, I do not have the authority to override established policies or issue refunds outside of the established guidelines. The best course of action would be to formally submit your request along with all your supporting documentation to demonstrate why you believe you deserve a refund despite the time elapsed. This escalation will ensure that your case is reviewed by the appropriate department that handles such requests. I recommend reaching out through our customer service channels, including our website’s contact form or calling our customer relations department. Providing your case with detailed information and expressing your concerns about customer experience may lead to a more favorable consideration.

If you would like assistance in drafting your request or finding the correct contact information, please let me know, and I’ll do my best to help you!

---

user: I appreciate your attempts to guide me, but I'm not prepared to take a backseat on this matter. I need to be clear: I am requesting a full refund for my Alaska trip, and I believe that the airline has a responsibility to honor that request despite the time that has passed. It's about accountability and valuing customers, and I will not back down until I receive every dollar back. I urge you to escalate this matter. I am not interested in going through more hoops or waiting for a review that may not result in the outcome I deserve. Can you elevate this issue to someone who has the authority to grant my refund? I need this resolved now!

---
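Once a run has been captured as a list of turns, simple facts about it can be computed and compared across code changes. The sketch below assumes the streamed output has been collected into an array of `{ node, content }` records. `TranscriptTurn`, `RunSummary`, and `summarizeRun` are hypothetical names for illustration, not part of LangGraph.js, and the sample data is abbreviated and made up.

```typescript
interface TranscriptTurn {
  node: "chatbot" | "user";
  content: string;
}

interface RunSummary {
  totalMessages: number;
  chatbotTurns: number;
  // True when the simulated user ended the run by sending FINISHED,
  // false when the run ended for another reason (for example the length cap).
  endedByUser: boolean;
}

function summarizeRun(transcript: TranscriptTurn[]): RunSummary {
  const last = transcript[transcript.length - 1];
  return {
    totalMessages: transcript.length,
    chatbotTurns: transcript.filter((t) => t.node === "chatbot").length,
    endedByUser:
      last !== undefined && last.node === "user" && last.content === "FINISHED",
  };
}

// Abbreviated, made-up transcript in the same shape as the printed output.
const sampleRun: TranscriptTurn[] = [
  { node: "chatbot", content: "How can I assist you today?" },
  { node: "user", content: "I want a full refund." },
  { node: "chatbot", content: "I cannot override the refund policy." },
  { node: "user", content: "I need this resolved now!" },
];

const summary = summarizeRun(sampleRun);
console.log(summary);
```

In the run shown above the last message is not FINISHED, so `endedByUser` would be false: that conversation stopped because of the length cap.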