Relational Data

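In this task, the agent has access to a set of tools that can be used to query across three relational tables.

The tables contain information about users, locations, and foods. The agent must answer questions about the data using the provided tools.

The underlying data looks like this (first 2 records of each table shown):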

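User data

| id | name | email | location | favorite color | favorite foods |
|----|-------|-----------------|----------|----------------|----------------|
| 1 | Alice | alice@gmail.com | 1 | Red | [1, 2, 3] |
| 21 | Bob | bob@hotmail.com | 2 | Orange | [4, 5, 6] |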

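Location data

| id | city | current time | current weather |
|----|-------------|---------------------|---------------------------|
| 1 | New York | 2023-11-14 10:30 AM | Cloudy, Temperature: 68°F |
| 2 | Los Angeles | 2023-11-14 7:45 AM | Sunny, Temperature: 75°F |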

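Food data

| id | name | calories | allergic ingredients |
|----|-----------|----------|----------------------|
| 1 | Pizza | 285 | ["Gluten", "Dairy"] |
| 2 | Chocolate | 50 | ["Milk", "Soy"] |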

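The tools support looking up information by ID (for example, get_user_email takes a user ID and returns the email) and searching (for example, find_foods_by_name takes a food name and returns a list of results).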
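To make the two kinds of tool concrete, here is a minimal sketch over hypothetical in-memory rows copied from the tables above. It is not the benchmark's implementation: the function bodies are assumptions, and in particular the search here is a plain case-insensitive substring match, whereas the real search tools may rank or filter results differently.

```python
from typing import Dict, List

# Hypothetical in-memory rows mirroring the first records shown above.
USERS = [
    {"id": 1, "name": "Alice", "email": "alice@gmail.com", "location": 1},
    {"id": 21, "name": "Bob", "email": "bob@hotmail.com", "location": 2},
]
FOODS = [
    {"id": 1, "name": "Pizza", "calories": 285},
    {"id": 2, "name": "Chocolate", "calories": 50},
]


def get_user_email(user_id: int) -> str:
    """Lookup-style tool: exact match on the ID column."""
    for row in USERS:
        if row["id"] == user_id:
            return row["email"]
    raise ValueError(f"Unknown user ID: {user_id}")


def find_foods_by_name(food: str) -> List[Dict]:
    """Search-style tool: case-insensitive substring match (an assumption)."""
    needle = food.lower()
    return [
        {"id": row["id"], "name": row["name"]}
        for row in FOODS
        if needle in row["name"].lower()
    ]


print(get_user_email(21))
print(find_foods_by_name("pizza"))
```

The point of the split is that a question phrased with names ("What is Bob's email?") usually needs a search call to resolve an ID first, followed by one or more lookups on that ID.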


from langchain_benchmarks import registry

For this code to work, configure the LangSmith environment variables with your credentials.

task = registry["Tool Usage - Relational Data"]
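Regarding the LangSmith configuration mentioned above, a small pre-flight check can catch a missing credential before any benchmark code runs. The sketch below assumes the API key is supplied through a variable named LANGCHAIN_API_KEY; check the LangSmith documentation for the exact set of variables your version expects. The helper name is hypothetical.

```python
import os
from typing import List, Mapping

# Assumed variable name(s); adjust to match your LangSmith setup.
REQUIRED_VARS = ("LANGCHAIN_API_KEY",)


def missing_langsmith_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_langsmith_vars()
if missing:
    print("Set these before running the benchmark:", ", ".join(missing))
else:
    print("LangSmith credentials found.")
```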

Environment

Let's inspect the environment:

env = task.create_environment()
env.tools[:5]
[StructuredTool(name='get_user_name', description="get_user_name(user_id: int) -> str - Get the name of the user with the given user ID.\n\n        Args:\n            user_id: The user's ID.\n\n        Returns:\n            The user's name.", args_schema=<class 'pydantic.v1.main.get_user_nameSchema'>, handle_tool_error=True, func=<function get_available_functions.<locals>.get_user_name at 0x78f30602fec0>),
 StructuredTool(name='list_user_ids', description='list_user_ids() -> List[str] - List all the user IDs.', args_schema=<class 'pydantic.v1.main.list_user_idsSchema'>, handle_tool_error=True, func=<function get_available_functions.<locals>.list_user_ids at 0x78f30602fe20>),
 StructuredTool(name='find_users_by_name', description='find_users_by_name(name: str) -> List[langchain_benchmarks.tool_usage.tasks.relational_data.SearchHit] - Find users with the given name.\n\n        Args:\n            name: The name to search for.\n\n        Returns:\n            The list of matching users.', args_schema=<class 'pydantic.v1.main.find_users_by_nameSchema'>, handle_tool_error=True, func=<function get_available_functions.<locals>.find_users_by_name at 0x78f306058040>),
 StructuredTool(name='find_locations_by_name', description='find_locations_by_name(city: str) -> List[langchain_benchmarks.tool_usage.tasks.relational_data.SearchHit] - Find locations with the given city name.', args_schema=<class 'pydantic.v1.main.find_locations_by_nameSchema'>, handle_tool_error=True, func=<function get_available_functions.<locals>.find_locations_by_name at 0x78f3060580e0>),
 StructuredTool(name='find_foods_by_name', description='find_foods_by_name(food: str) -> List[langchain_benchmarks.tool_usage.tasks.relational_data.SearchHit] - Find foods with the given name.', args_schema=<class 'pydantic.v1.main.find_foods_by_nameSchema'>, handle_tool_error=True, func=<function get_available_functions.<locals>.find_foods_by_name at 0x78f306058180>)]
env.tools[0].invoke({"user_id": 21})
'Bob'
env.tools[3].invoke({"city": "LA"})
[{'id': 2, 'city': 'Los Angeles'},
 {'id': 1, 'city': 'New York'},
 {'id': 3, 'city': 'Chicago'},
 {'id': 4, 'city': 'Houston'},
 {'id': 5, 'city': 'Miami'}]

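Explore the task

For evaluation, we need an agent factory that creates a new agent executor instance for every evaluation run.

We'll use StandardAgentFactory; see the intro for more information about what it does and how to create a custom agent factory.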
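The reason for a factory rather than a single shared agent is isolation: each run gets a fresh object, so state from one run cannot leak into the next. The following sketch illustrates the pattern with a hypothetical ToyAgent class; it does not reproduce StandardAgentFactory, whose internals are not shown here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent executor that accumulates state."""

    history: List[str] = field(default_factory=list)

    def invoke(self, question: str) -> str:
        self.history.append(question)
        return f"answered: {question}"


def make_toy_factory() -> Callable[[], ToyAgent]:
    """Return a zero-argument callable that builds a new agent on every call."""

    def factory() -> ToyAgent:
        return ToyAgent()

    return factory


toy_factory = make_toy_factory()
first, second = toy_factory(), toy_factory()
first.invoke("what is the weather in LA")

# The second agent is a distinct object and has seen no questions.
print(first is second, len(first.history), len(second.history))
```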

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai.chat_models import ChatOpenAI

from langchain_benchmarks.tool_usage.agents import StandardAgentFactory

model = ChatOpenAI(temperature=0)
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "{instructions}"),  # Populated from task.instructions automatically
        ("human", "{question}"),  # Populated from the test data
        (
            "placeholder",
            "{agent_scratchpad}",
        ),  # Scratchpad where the agent does its work (e.g., calling multiple tools)
    ]
)

agent_factory = StandardAgentFactory(task, model, prompt)
from langchain import globals

globals.set_verbose(True)

agent = agent_factory()
agent.invoke({"question": "what is the weather in LA"})
> Entering new AgentExecutor chain...

Invoking: `find_locations_by_name` with `{'city': 'LA'}`


[{'id': 2, 'city': 'Los Angeles'}, {'id': 1, 'city': 'New York'}, {'id': 3, 'city': 'Chicago'}, {'id': 4, 'city': 'Houston'}, {'id': 5, 'city': 'Miami'}]
Invoking: `get_current_weather_for_location` with `{'location_id': 2}`


Sunny, Temperature: 75°F
The weather in Los Angeles is sunny with a temperature of 75°F.

> Finished chain.
{'question': 'what is the weather in LA',
 'output': 'The weather in Los Angeles is sunny with a temperature of 75°F.',
 'intermediate_steps': [(ToolAgentAction(tool='find_locations_by_name', tool_input={'city': 'LA'}, log="\nInvoking: `find_locations_by_name` with `{'city': 'LA'}`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_hJrCZgP4eDgaj6s4RtCKXTOo', 'function': {'arguments': '{"city":"LA"}', 'name': 'find_locations_by_name'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls'}, id='run-23ccffb0-3b17-46a4-b42e-5eaa3220b211', tool_calls=[{'name': 'find_locations_by_name', 'args': {'city': 'LA'}, 'id': 'call_hJrCZgP4eDgaj6s4RtCKXTOo'}], tool_call_chunks=[{'name': 'find_locations_by_name', 'args': '{"city":"LA"}', 'id': 'call_hJrCZgP4eDgaj6s4RtCKXTOo', 'index': 0}])], tool_call_id='call_hJrCZgP4eDgaj6s4RtCKXTOo'),
   [{'id': 2, 'city': 'Los Angeles'},
    {'id': 1, 'city': 'New York'},
    {'id': 3, 'city': 'Chicago'},
    {'id': 4, 'city': 'Houston'},
    {'id': 5, 'city': 'Miami'}]),
  (ToolAgentAction(tool='get_current_weather_for_location', tool_input={'location_id': 2}, log="\nInvoking: `get_current_weather_for_location` with `{'location_id': 2}`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_lopYjo00MF9mZtnHtiisTqyp', 'function': {'arguments': '{"location_id":2}', 'name': 'get_current_weather_for_location'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls'}, id='run-9bba5827-d98b-464d-8028-25eb4a05d227', tool_calls=[{'name': 'get_current_weather_for_location', 'args': {'location_id': 2}, 'id': 'call_lopYjo00MF9mZtnHtiisTqyp'}], tool_call_chunks=[{'name': 'get_current_weather_for_location', 'args': '{"location_id":2}', 'id': 'call_lopYjo00MF9mZtnHtiisTqyp', 'index': 0}])], tool_call_id='call_lopYjo00MF9mZtnHtiisTqyp'),
   'Sunny, Temperature: 75°F')]}
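The intermediate_steps entry in the result above is a list of (action, observation) pairs, which makes it straightforward to recover the sequence of tools the agent called. The sketch below uses SimpleNamespace objects as stand-ins for the action objects, modelling only the tool and tool_input attributes visible in the output; the helper name tool_trajectory is hypothetical.

```python
from types import SimpleNamespace
from typing import Any, List, Sequence, Tuple

# Stand-ins for the (action, observation) pairs printed above.
steps = [
    (
        SimpleNamespace(tool="find_locations_by_name", tool_input={"city": "LA"}),
        [{"id": 2, "city": "Los Angeles"}],
    ),
    (
        SimpleNamespace(
            tool="get_current_weather_for_location", tool_input={"location_id": 2}
        ),
        "Sunny, Temperature: 75°F",
    ),
]


def tool_trajectory(intermediate_steps: Sequence[Tuple[Any, Any]]) -> List[str]:
    """Return the tool names in the order the agent invoked them."""
    return [action.tool for action, _observation in intermediate_steps]


print(tool_trajectory(steps))
```

With a real result dictionary, the same function could be applied to result["intermediate_steps"], assuming the action objects expose a tool attribute as they do in the output shown.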

Benchmarking

See the introduction and benchmark all for how to run benchmarks. This notebook is only meant to explain and explore the task.