
Long-term Memory in LLM Applications

Long-term memory lets agents remember important information across conversations. LangMem provides ways to extract meaningful details from chats, store them, and use them to improve future interactions. At its core, every memory operation in LangMem follows the same pattern:

  1. Accept conversation(s) and the current memory state
  2. Prompt an LLM to determine how to expand or consolidate the memory state
  3. Return the updated memory state
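
For example, using the low-level manager introduced later on this page, the pattern looks roughly like the sketch below. This is a minimal illustration, and it assumes the manager accepts prior memories via the "existing" key; the model name is just a placeholder.

from langmem import create_memory_manager

# Sketch of the core pattern: conversation + current memory state in,
# updated memory state out.
manager = create_memory_manager("anthropic:claude-3-5-sonnet-latest")

conversation = [
    {"role": "user", "content": "I switched teams; I'm on the platform team now."},
]
existing_memories = []  # whatever was extracted from earlier conversations

updated_memories = manager.invoke(
    {"messages": conversation, "existing": existing_memories}
)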

The best memory systems are usually application-specific. When designing yours, the following questions are a useful guide:

  1. **What** types of content should your agent learn: facts/knowledge? Summaries of past events? Rules and style?
  2. **When** should memories be formed (and **who** should form them)?
  3. **Where** should memories be stored (in the prompt? a semantic store?)? This largely determines how they will be recalled.

Memory types

Memory in LLM applications can mirror some of the structure of human memory, with each type playing a distinct role in building adaptive, context-aware systems:

| Memory type | Purpose | Agent example | Human example | Typical storage pattern |
| --- | --- | --- | --- | --- |
| Semantic | Facts and knowledge | User preferences; knowledge triplets | Knowing that Python is a programming language | Profile or collection |
| Episodic | Past experiences | Few-shot examples; summaries of past conversations | Remembering your first day at work | Collection |
| Procedural | System behavior | Core personality and response patterns | Knowing how to ride a bicycle | Prompt rules or collection |

Semantic memory: facts and knowledge

Semantic memory stores the key facts and other information that ground an agent's responses. Two common representations of semantic memory are collections (for recording an unbounded amount of knowledge that can be searched at runtime) and profiles (for recording task-specific information that follows a strict schema and is easy for the user or agent to look up).

Collections

Collections are what most people picture when they imagine an agent's long-term memory. In this pattern, memories are stored as individual documents or records. For each new conversation, the memory system can decide to insert new memories into the store.

Using a collection-style memory adds some complexity to updating the memory state. The system must reconcile new information with prior beliefs, either deleting/invalidating or updating/consolidating existing memories. If the system over-extracts, memories may be less precise when the agent searches the store; if it under-extracts, recall may suffer. LangMem uses a memory enrichment process that strives to balance memory creation and consolidation, while letting you (the developer) customize the instructions to further tune the strength of each operation.

Finally, memory relevance is about more than semantic similarity. Recall should combine similarity with the memory's "importance" and its "strength," the latter being a function of how recently and how frequently it has been used.
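
As a rough illustration (this is a hypothetical scoring function, not LangMem's internal ranking), such a recall score might combine the three factors like this:

import time

def recall_score(
    similarity: float,       # semantic similarity to the query, 0..1
    importance: float,       # how important the memory was judged to be, 0..1
    last_used_at: float,     # unix timestamp of the memory's last use
    half_life_hours: float = 72.0,
) -> float:
    # "Strength" decays exponentially with time since last use,
    # so recently (or frequently refreshed) memories rank higher.
    age_hours = (time.time() - last_used_at) / 3600
    strength = 0.5 ** (age_hours / half_life_hours)
    return similarity * importance * strength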

Figure: Collection update process

Extracting semantic memories as collections
Setup

API: create_memory_manager

from langmem import create_memory_manager

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    instructions="Extract all noteworthy facts, events, and relationships. Indicate their importance.",
    enable_inserts=True,
)

# Process a conversation to extract semantic memories
conversation = [
    {"role": "user", "content": "I work at Acme Corp in the ML team"},
    {"role": "assistant", "content": "I'll remember that. What kind of ML work do you do?"},
    {"role": "user", "content": "Mostly NLP and large language models"}
]
memories = manager.invoke({"messages": conversation})
# Example memories (illustrative):
# [
#     ExtractedMemory(
#         id="27e96a9d-8e53-4031-865e-5ec50c1f7ad5",
#         content=Memory(
#             content="[IMPORTANT] User works at Acme Corp on the ML team."
#         ),
#     ),
#     ExtractedMemory(
#         id="e2f6b646-cdf1-4be1-bb40-0fd91d25d00f",
#         content=Memory(
#             content="[BACKGROUND] User's work focuses primarily on NLP and"
#             " large language models."
#         ),
#     ),
# ]

Profiles

Profiles, on the other hand, are well suited to specific tasks with a well-defined scope. A profile is a single document representing the current state, such as the user's core goals in using the app, their preferred name and response style, and so on. When new information arrives, it updates the existing document rather than creating a new one. This approach is ideal when you only care about the latest state and want to avoid remembering extraneous information.

Figure: Profile update process

Managing user preferences with profiles
Setup

API: create_memory_manager

from langmem import create_memory_manager
from pydantic import BaseModel


class UserProfile(BaseModel):
    """Save the user's preferences."""
    name: str
    preferred_name: str
    response_style_preference: str
    special_skills: list[str]
    other_preferences: list[str]


manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[UserProfile],
    instructions="Extract user preferences and settings",
    enable_inserts=False,
)

# Extract user preferences from a conversation
conversation = [
    {"role": "user", "content": "Hi! I'm Alex but please call me Lex. I'm a wizard at Python and love making AI systems that don't sound like boring corporate robots 🤖"},
    {"role": "assistant", "content": "Nice to meet you, Lex! Love the anti-corporate-robot stance. How would you like me to communicate with you?"},
    {"role": "user", "content": "Keep it casual and witty - and maybe throw in some relevant emojis when it feels right ✨ Also, besides AI, I do competitive speedcubing!"},
]
profile = manager.invoke({"messages": conversation})[0]
print(profile)
# Example profile:
# ExtractedMemory(
#     id="6f555d97-387e-4af6-a23f-a66b4e809b0e",
#     content=UserProfile(
#         name="Alex",
#         preferred_name="Lex",
#         response_style_preference="casual and witty with appropriate emojis",
#         special_skills=[
#             "Python programming",
#             "AI development",
#             "competitive speedcubing",
#         ],
#         other_preferences=[
#             "prefers informal communication",
#             "dislikes corporate-style interactions",
#         ],
#     ),
# )

Choose between profiles and collections based on how the data will be used: profiles excel when you need quick access to the current state and have requirements about what kinds of information may be stored. They are also easy to present to users for manual editing. Collections are useful when you want to track knowledge across many interactions without losing information, and when you want to recall specific memories in a targeted way rather than on every turn.

Episodic memory: past experiences

Episodic memory preserves successful interactions as learning examples that guide future behavior. Unlike semantic memory, which stores facts, episodic memory captures the full context of an interaction: the situation, the thought process that led to success, and why the approach worked. These memories help the agent learn from experience, adapting its responses based on what has proven effective before.

Defining and extracting episodes
Setup

API: create_memory_manager

from pydantic import BaseModel, Field
from langmem import create_memory_manager

class Episode(BaseModel):
    """An episode captures how to handle a specific situation, including the reasoning process
    and what made it successful."""

    observation: str = Field(
        ..., 
        description="The situation and relevant context"
    )
    thoughts: str = Field(
        ...,
        description="Key considerations and reasoning process"
    )
    action: str = Field(
        ...,
        description="What was done in response"
    )
    result: str = Field(
        ...,
        description="What happened and why it worked"
    )

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[Episode],
    instructions="Extract examples of successful interactions. Include the context, thought process, and why the approach worked.",
    enable_inserts=True,
)

# Example conversation
conversation = [
    {"role": "user", "content": "What's a binary tree? I work with family trees if that helps"},
    {"role": "assistant", "content": "A binary tree is like a family tree, but each parent has at most 2 children. Here's a simple example:\n   Bob\n  /  \\\nAmy  Carl\n\nJust like in family trees, we call Bob the 'parent' and Amy and Carl the 'children'."},
    {"role": "user", "content": "Oh that makes sense! So in a binary search tree, would it be like organizing a family by age?"},
]
# Extract episode(s)
episodes = manager.invoke({"messages": conversation})
# Example episode (illustrative):
# [
#     ExtractedMemory(
#         id="f9194af3-a63f-4d8a-98e9-16c66e649844",
#         content=Episode(
#             observation="User asked what a binary tree is and mentioned "
#                         "they already work with family trees.",
#             thoughts="Anchor the new concept in the user's existing "
#                      "domain (family trees) and keep the example tiny, "
#                      "reusing familiar terms like 'parent' and 'children'.",
#             action="Explained a binary tree as a family tree in which "
#                    "each parent has at most two children, with a "
#                    "three-node ASCII diagram (Bob, Amy, Carl).",
#             result="User said it made sense and began extending the "
#                    "analogy to binary search trees, showing the metaphor "
#                    "gave them a workable mental model.",
#         ),
#     )
# ]

Procedural memory: system instructions

Procedural memory encodes how an agent should behave and respond. It starts with a system prompt that defines core behavior, then evolves through feedback and experience. As the agent interacts with users, it refines these instructions, learning which approaches work best in different situations.

Figure: Instructions update process

Optimizing prompts based on feedback
Setup

API: create_prompt_optimizer

from langmem import create_prompt_optimizer

optimizer = create_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="metaprompt",
    config={"max_reflection_steps": 3}
)
prompt = "You are a helpful assistant."
trajectory = [
    {"role": "user", "content": "Explain inheritance in Python"},
    {"role": "assistant", "content": "Here's a detailed theoretical explanation..."},
    {"role": "user", "content": "Show me a practical example instead"},
]
optimized = optimizer.invoke({
    "trajectories": [(trajectory, {"user_score": 0})], 
    "prompt": prompt
})
print(optimized)
# You are a helpful assistant with expertise in explaining technical concepts clearly and practically. When explaining programming concepts:

# 1. Start with a brief, practical explanation supported by a concrete code example
# 2. If the user requests more theoretical details, provide them after the practical example
# 3. Always include working code examples for programming-related questions
# 4. Pay close attention to user preferences - if they ask for a specific approach (like practical examples or theory), adapt your response accordingly
# 5. Use simple, clear language and break down complex concepts into digestible parts

# When users ask follow-up questions or request a different approach, immediately adjust your explanation style to match their preferences. If they ask for practical examples, provide them. If they ask for theory, explain the concepts in depth.

Writing memories

Memories can be formed in two ways, each suited to different needs. Active formation happens during the conversation, enabling immediate updates when critical context emerges. Background formation happens between interactions, allowing deeper pattern analysis without affecting response time. This dual approach lets you balance response speed against depth of learning.

| Formation type | Latency impact | Update speed | Processing load | Use case |
| --- | --- | --- | --- | --- |
| Active | Higher | Immediate | During the response | Critical context updates |
| Background | None | Delayed | Between/after calls | Pattern analysis, summarization |

Figure: Hot path vs background memory processing

Active formation

You may want your agent to save memories "in the hot path." This active memory formation happens during the conversation, enabling immediate updates when key context emerges. It is easy to implement and lets the agent itself decide how to store and update its memories. However, it adds perceptible latency to user interactions and puts one more hurdle between the agent and satisfying the user's needs.

See the "hot path" quickstart for an example of how to use this technique.
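
As a rough sketch of this technique (assuming LangMem's memory tools and LangGraph's prebuilt ReAct agent; parameter names may vary by version), the agent is handed tools it can call mid-conversation to create, update, and search its own memories:

from langgraph.prebuilt import create_react_agent
from langgraph.store.memory import InMemoryStore
from langmem import create_manage_memory_tool, create_search_memory_tool

store = InMemoryStore()

# The agent itself decides, during the conversation, when to write or recall memories.
agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    tools=[
        create_manage_memory_tool(namespace=("memories",)),  # create/update/delete
        create_search_memory_tool(namespace=("memories",)),  # recall
    ],
    store=store,
)

agent.invoke(
    {"messages": [{"role": "user", "content": "Please remember that I prefer short answers."}]}
)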

Subconscious formation

"Subconscious" memory formation refers to prompting an LLM to reflect on a conversation after it has taken place (or after a period of inactivity), discovering patterns and extracting insights without slowing down the live interaction or complicating the agent's tool-selection decisions. This approach is well suited to extracting information with higher recall.

See the "background" quickstart for an example of how to use this technique.
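
A minimal sketch of the idea, using the stateless core API on a worker thread (LangMem's stateful integrations provide dedicated helpers for deferred, store-backed processing; this only shows the shape of the technique):

import threading

from langmem import create_memory_manager

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    enable_inserts=True,
)

def reflect(messages: list) -> None:
    # Runs after the reply has already been sent, so extraction and
    # consolidation never add latency to the user-facing response.
    memories = manager.invoke({"messages": messages})
    print(memories)  # persist to your store of choice instead

conversation = [
    {"role": "user", "content": "I moved to Berlin last month for a new job."},
    {"role": "assistant", "content": "Congrats! How are you settling in?"},
]
threading.Thread(target=reflect, args=(conversation,)).start()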

Integration patterns

LangMem's memory utilities are organized into two layers of integration patterns:

1. Core API

At its core, LangMem provides functions that transform memory state without side effects. These primitives are the building blocks for memory operations.

These core functions don't depend on any particular database or storage system; you can use them in any application.

2. Stateful integration

One layer up are components that depend on LangGraph's long-term memory store. These components use the core API above to transform memories that live in the store, upserting or deleting them as needed as new conversational information arrives.

If you are using the LangGraph Platform or LangGraph OSS, you can use these utilities as an easy way to add memory capabilities to your agents.
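
A store-backed manager might look roughly like the following sketch, which assumes create_memory_store_manager together with LangGraph's in-memory store; the exact keyword arguments and embedding configuration may differ across versions:

from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore
from langmem import create_memory_store_manager

store = InMemoryStore(
    index={"dims": 1536, "embed": "openai:text-embedding-3-small"}
)

# Extracted memories are upserted into (or removed from) the store directly.
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "{user_id}"),
)

@entrypoint(store=store)
def remember(messages: list):
    manager.invoke({"messages": messages})

remember.invoke(
    [{"role": "user", "content": "I prefer dark mode in every app."}],
    config={"configurable": {"user_id": "user-123"}},
)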

Storage system

Storage is optional

Keep in mind that LangMem's core functionality is built around not requiring any particular storage layer. The storage features described here are part of LangMem's higher-level integration with LangGraph and are useful when you want built-in persistence.

When you use LangMem's stateful operators or the platform services, the storage system is built on LangGraph's storage primitives and provides a flexible, powerful way to organize and access memories. The storage system is designed around two concepts:

Memory namespaces

Memories are organized into namespaces, enabling natural segmentation of data:

  • Multi-level namespaces: group memories by organization, user, application, or any other hierarchy
  • Contextual keys: uniquely identify memories within their namespace
  • Structured content: store rich, structured data with metadata for better organization
Organizing memories hierarchically
# Organize memories by organization -> configurable user -> context
namespace = ("acme_corp", "{user_id}", "code_assistant")

Namespaces can contain template variables (such as "{user_id}") that are populated at runtime from configurable fields in the RunnableConfig. See how to dynamically configure namespaces for an example, or consult the NamespaceTemplate reference docs for more details.

Flexible retrieval

If you use one of the managed APIs, LangMem integrates directly with LangGraph's BaseStore interface for memory storage and retrieval. The storage system supports multiple ways to retrieve memories.
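
As an illustration of the underlying BaseStore interface (a sketch using LangGraph's InMemoryStore with a semantic index; the embedding configuration is an assumption you can swap for your own):

from langgraph.store.memory import InMemoryStore

store = InMemoryStore(
    index={"dims": 1536, "embed": "openai:text-embedding-3-small"}
)

namespace = ("acme_corp", "user-123", "code_assistant")
store.put(namespace, "pref-1", {"text": "Prefers type-annotated Python examples"})
store.put(namespace, "pref-2", {"text": "Dislikes verbose explanations"})

# Semantic search within a namespace; exact key lookups (store.get)
# and metadata filters are also supported.
results = store.search(namespace, query="how should code samples be written?", limit=2)
for item in results:
    print(item.value["text"])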

For more details on storage capabilities, see the LangGraph storage documentation.
