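Language Agent Tree Search

Language Agent Tree Search (LATS), proposed by Zhou et al., is a general LLM agent search algorithm that combines reflection/evaluation with search (specifically Monte Carlo tree search) to achieve better overall task performance than similar techniques such as ReAct, Reflexion, or Tree of Thoughts.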

LATS diagram

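It has four main steps:

  1. Select: pick the best next action based on the aggregate rewards from step (2). Either respond (if a solution has been found or the maximum search depth has been reached) or continue searching.
  2. Expand and simulate: select the 5 "best" potential actions and execute them in parallel.
  3. Reflect + evaluate: observe the outcomes of these actions and score the decisions based on reflection (and possibly external feedback).
  4. Backpropagate: update the scores of the root trajectories based on the outcomes.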
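
The four steps above can be sketched as a toy loop with no LLM involved. Everything in this block is a hypothetical stand-in for the LLM-driven components built later in this tutorial: `ToyNode`, the `propose` and `score` helpers, and the "reach a target number" task are assumptions chosen only to make the control flow visible.

```python
import math

TARGET = 7  # hypothetical task: reach this number by adding 1, 2, or 3 per step


class ToyNode:
    """Stand-in for a search-tree node; the 'state' is just an integer."""

    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.value = 0.0  # running average of rewards seen in this subtree
        self.visits = 0

    def uct(self, weight=1.0):
        # Mean reward plus an exploration bonus for rarely visited nodes.
        bonus = math.sqrt(math.log(self.parent.visits) / self.visits)
        return self.value + weight * bonus


def propose(state):
    """Expansion: stands in for sampling candidate actions from an LLM."""
    return [state + step for step in (1, 2, 3)]


def score(state):
    """Reflection: stands in for an LLM grader. Returns (reward, solved)."""
    reward = max(0.0, 1.0 - abs(TARGET - state) / TARGET)
    return reward, state == TARGET


def backpropagate(node, reward):
    """Update the running average of this node and all of its ancestors."""
    while node is not None:
        node.visits += 1
        node.value += (reward - node.value) / node.visits
        node = node.parent


def search(max_rounds=10):
    root = ToyNode(0)
    backpropagate(root, score(root.state)[0])
    for _ in range(max_rounds):
        # 1. Select: descend by highest UCT until a leaf is reached.
        node = root
        while node.children:
            node = max(node.children, key=lambda child: child.uct())
        # 2. Expand: attach every proposed candidate to the selected leaf.
        for new_state in propose(node.state):
            child = ToyNode(new_state, parent=node)
            node.children.append(child)
            # 3. Reflect + evaluate the candidate.
            reward, solved = score(new_state)
            # 4. Backpropagate the reward up to the root.
            backpropagate(child, reward)
            if solved:
                return child
    return None


solution = search()
path = []
node = solution
while node is not None:
    path.append(node.state)
    node = node.parent
path.reverse()  # root first
print(path)
```

In the real agent, `propose` becomes an LLM call that samples tool invocations or answers, and `score` becomes the reflection chain.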

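Setup

Install langgraph (for the framework), langchain_openai (for the LLM), and langchain + tavily-python (for the search engine).

We will use Tavily search as a tool. You can get an API key from Tavily, or substitute a different tool of your choice.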

%%capture --no-stderr
%pip install -U --quiet langchain langgraph langchain_openai
%pip install -U --quiet tavily-python
import getpass
import os


def _set_if_undefined(var: str) -> None:
    if os.environ.get(var):
        return
    os.environ[var] = getpass.getpass(var)


_set_if_undefined("OPENAI_API_KEY")
_set_if_undefined("TAVILY_API_KEY")

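Set up LangSmith for LangGraph development

Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor LLM apps built with LangGraph; see the LangSmith documentation for more on how to get started.

Graph State

LATS is based on a (greedy) Monte Carlo tree search. At each search step, it selects the node with the highest "upper confidence bound", a metric that balances exploitation (highest average reward) against exploration (lowest visit count). Starting from that node, it generates N (5 in this case) new candidate actions to take and adds them to the tree. It stops searching once it has generated a valid solution or reached the maximum number of rollouts (search tree depth).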
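
As a worked example of the upper confidence bound described above, the sketch below evaluates mean reward + w * sqrt(ln(parent visits) / visits) for three children. The `uct` helper, the child names, and all numbers are illustrative assumptions, not values from the tutorial.

```python
import math


def uct(avg_reward: float, visits: int, parent_visits: int, weight: float = 1.0) -> float:
    """Mean reward (exploitation) plus a visit-count bonus (exploration)."""
    return avg_reward + weight * math.sqrt(math.log(parent_visits) / visits)


# Hypothetical children of a parent visited 10 times: (name, mean reward, visits)
children = [("A", 0.8, 6), ("B", 0.6, 3), ("C", 0.5, 1)]

scores = {name: uct(avg, n, parent_visits=10) for name, avg, n in children}
best = max(scores, key=scores.get)

# With the exploration term switched off, selection is purely greedy.
greedy = {name: uct(avg, n, parent_visits=10, weight=0.0) for name, avg, n in children}
best_greedy = max(greedy, key=greedy.get)

print(best, best_greedy)
```

Child C has the lowest mean reward but only one visit, so its exploration bonus is the largest and it is selected; with `weight=0.0` the highest-mean child A is selected instead.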

Tree Diagram

Our LangGraph state will consist of two items:

  1. The root of the search tree
  2. The user input

import math
from collections import deque
from typing import Optional

from langchain_core.messages import AIMessage, BaseMessage, HumanMessage, ToolMessage

from pydantic import BaseModel, Field


class Reflection(BaseModel):
    reflections: str = Field(
        description="The critique and reflections on the sufficiency, superfluency,"
        " and general quality of the response"
    )
    score: int = Field(
        description="Score from 0-10 on the quality of the candidate response.",
        # Pydantic's range constraints are named `ge` / `le`
        ge=0,
        le=10,
    )
    found_solution: bool = Field(
        description="Whether the response has fully solved the question or task."
    )

    def as_message(self):
        return HumanMessage(
            content=f"Reasoning: {self.reflections}\nScore: {self.score}"
        )

    @property
    def normalized_score(self) -> float:
        return self.score / 10.0


class Node:
    def __init__(
        self,
        messages: list[BaseMessage],
        reflection: Reflection,
        parent: Optional["Node"] = None,
    ):
        self.messages = messages
        self.parent = parent
        self.children = []
        self.value = 0
        self.visits = 0
        self.reflection = reflection
        self.depth = parent.depth + 1 if parent is not None else 1
        self._is_solved = reflection.found_solution if reflection else False
        if self._is_solved:
            self._mark_tree_as_solved()
        self.backpropagate(reflection.normalized_score)

    def __repr__(self) -> str:
        return (
            f"<Node value={self.value}, visits={self.visits},"
            f" solution={self.messages} reflection={self.reflection}/>"
        )

    @property
    def is_solved(self):
        """If any solutions exist, we can end the search."""
        return self._is_solved

    @property
    def is_terminal(self):
        return not self.children

    @property
    def best_child_score(self):
        """Return the child with the highest value."""
        if not self.children:
            return None
        return max(self.children, key=lambda child: int(child.is_solved) * child.value)

    @property
    def height(self) -> int:
        """Check for how far we've rolled out the tree."""
        if self.children:
            return 1 + max([child.height for child in self.children])
        return 1

    def upper_confidence_bound(self, exploration_weight=1.0):
        """Return the UCT score. This helps balance exploration vs. exploitation of a branch."""
        if self.parent is None:
            raise ValueError("Cannot obtain UCT from root node")
        if self.visits == 0:
            return self.value
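        # Encourages exploitation of high-value trajectories.
        # `value` is already a running average (see `backpropagate`),
        # so it must not be divided by `visits` a second time.
        average_reward = self.value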
        # Encourages exploration of less-visited trajectories
        exploration_term = math.sqrt(math.log(self.parent.visits) / self.visits)
        return average_reward + exploration_weight * exploration_term

    def backpropagate(self, reward: float):
        """Update the score of this node and its parents."""
        node = self
        while node:
            node.visits += 1
            node.value = (node.value * (node.visits - 1) + reward) / node.visits
            node = node.parent

    def get_messages(self, include_reflections: bool = True):
        if include_reflections:
            return self.messages + [self.reflection.as_message()]
        return self.messages

    def get_trajectory(self, include_reflections: bool = True) -> list[BaseMessage]:
        """Get messages representing this search branch."""
        messages = []
        node = self
        while node:
            messages.extend(
                node.get_messages(include_reflections=include_reflections)[::-1]
            )
            node = node.parent
        # Reverse the final back-tracked trajectory to return in the correct order
        return messages[::-1]  # root solution, reflection, child 1, ...

    def _get_all_children(self):
        all_nodes = []
        nodes = deque()
        nodes.append(self)
        while nodes:
            node = nodes.popleft()
            all_nodes.extend(node.children)
            for n in node.children:
                nodes.append(n)
        return all_nodes

    def get_best_solution(self):
        """Return the best solution from within the current sub-tree."""
        all_nodes = [self] + self._get_all_children()
        best_node = max(
            all_nodes,
            # We filter out all non-terminal, non-solution trajectories
            key=lambda node: int(node.is_terminal and node.is_solved) * node.value,
        )
        return best_node

    def _mark_tree_as_solved(self):
        parent = self.parent
        while parent:
            parent._is_solved = True
            parent = parent.parent

API Reference: AIMessage | BaseMessage | HumanMessage | ToolMessage
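
The double reversal in `get_trajectory` can be hard to follow, so here is a minimal sketch of the same logic with plain strings in place of messages. `MiniNode` is a hypothetical stand-in for `Node`, introduced only for illustration.

```python
class MiniNode:
    """Hypothetical stand-in for Node; messages are plain strings here."""

    def __init__(self, messages, parent=None):
        self.messages = messages
        self.parent = parent

    def get_trajectory(self):
        collected = []
        node = self
        while node:
            # Walk leaf -> root, appending each node's messages back to front...
            collected.extend(node.messages[::-1])
            node = node.parent
        # ...so a single final reversal yields root -> leaf in original order.
        return collected[::-1]


root = MiniNode(["root answer", "root reflection"])
child = MiniNode(["tool call", "tool result", "child reflection"], parent=root)
trajectory = child.get_trajectory()
print(trajectory)
```

The result starts with the root's messages and ends with the leaf's reflection, which is the order the expansion prompt expects.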

The graph state itself

The main component is the tree, represented by its root node.

from typing_extensions import TypedDict


class TreeState(TypedDict):
    # The full tree
    root: Node
    # The original input
    input: str

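Define Language Agent

Our agent will have three primary LLM-powered processes:

  1. Reflect: score the action based on the tool response.
  2. Initial response: create the root node and start the search.
  3. Expand: generate 5 candidate "next steps" from the best spot in the current tree.

For more "grounded" tool applications (such as code synthesis), you could integrate code execution into the reflection/reward step. This type of external feedback is very useful, though it adds complexity to an already complicated example notebook.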
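
One way to fold external feedback into the reward is sketched below, for the code-synthesis case. This is not part of the tutorial: `run_tests`, `combined_reward`, the `solve` function name, and the 0.7 weighting are all assumptions made for illustration.

```python
def run_tests(candidate_source: str, tests: list, func_name: str = "solve") -> float:
    """Execute candidate code and return the fraction of test cases it passes.

    `tests` is a list of (args_tuple, expected_value) pairs.
    """
    namespace: dict = {}
    try:
        # WARNING: exec is only acceptable for trusted code; a real system
        # would run candidates in a sandbox.
        exec(candidate_source, namespace)
        func = namespace[func_name]
    except Exception:
        return 0.0
    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test case counts as a failure
    return passed / len(tests) if tests else 0.0


def combined_reward(reflection_score: float, pass_rate: float, test_weight: float = 0.7) -> float:
    """Blend the LLM's self-reflection score with the measured pass rate."""
    return (1 - test_weight) * reflection_score + test_weight * pass_rate


# Hypothetical candidate and test cases
candidate = "def solve(a, b):\n    return a + b\n"
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

pass_rate = run_tests(candidate, tests)
reward = combined_reward(reflection_score=0.6, pass_rate=pass_rate)
print(pass_rate, reward)
```

The blended value would take the place of the normalized reflection score passed to backpropagation, so that measured correctness outweighs the model's own opinion of its answer.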

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")

API Reference: ChatOpenAI

Tools

For our example, we will give the language agent a search engine.

from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_community.utilities.tavily_search import TavilySearchAPIWrapper
from langgraph.prebuilt import ToolNode

search = TavilySearchAPIWrapper()
tavily_tool = TavilySearchResults(api_wrapper=search, max_results=5)
tools = [tavily_tool]
tool_node = ToolNode(tools=tools)

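API Reference: TavilySearchResults | TavilySearchAPIWrapper | ToolNode

Reflection

The reflection chain will score agent outputs based on the decision and the tool responses. We will call this within the other two nodes.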

from langchain_core.output_parsers.openai_tools import (
    JsonOutputToolsParser,
    PydanticToolsParser,
)
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import chain as as_runnable

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Reflect and grade the assistant response to the user question below.",
        ),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="candidate"),
    ]
)

reflection_llm_chain = (
    prompt
    | llm.bind_tools(tools=[Reflection], tool_choice="Reflection").with_config(
        run_name="Reflection"
    )
    | PydanticToolsParser(tools=[Reflection])
)


@as_runnable
def reflection_chain(inputs) -> Reflection:
    tool_choices = reflection_llm_chain.invoke(inputs)
    reflection = tool_choices[0]
    if not isinstance(inputs["candidate"][-1], AIMessage):
        reflection.found_solution = False
    return reflection

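API Reference: JsonOutputToolsParser | PydanticToolsParser | ChatPromptTemplate | MessagesPlaceholder | chain

Initial Response

We start with a single root node, generated by this first step. It responds to the user input with either a tool invocation or a direct response.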

from langchain_core.prompt_values import ChatPromptValue
from langchain_core.runnables import RunnableConfig

prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are an AI assistant.",
        ),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="messages", optional=True),
    ]
)


initial_answer_chain = prompt_template | llm.bind_tools(tools=tools).with_config(
    run_name="GenerateInitialCandidate"
)


parser = JsonOutputToolsParser(return_id=True)

API Reference: ChatPromptValue | RunnableConfig

initial_response = initial_answer_chain.invoke(
    {"input": "Write a research report on lithium pollution."}
)
initial_response
AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_xRFx5hZJNyfurW9kWrPAWx15', 'function': {'arguments': '{"query":"lithium pollution research 2023"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 25, 'prompt_tokens': 93, 'total_tokens': 118, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_a5d11b2ef2', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-448238e0-f2a7-4be0-b21d-03beb7d22121-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'lithium pollution research 2023'}, 'id': 'call_xRFx5hZJNyfurW9kWrPAWx15', 'type': 'tool_call'}], usage_metadata={'input_tokens': 93, 'output_tokens': 25, 'total_tokens': 118})

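Starting Node

We will package the candidate generation and reflection in a single node of our graph. This is represented by the following function: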

# Define the node we will add to the graph
def generate_initial_response(state: TreeState) -> dict:
    """Generate the initial candidate response."""
    res = initial_answer_chain.invoke({"input": state["input"]})
    parsed = parser.invoke(res)
    tool_responses = [
        tool_node.invoke(
            {
                "messages": [
                    AIMessage(
                        content="",
                        tool_calls=[
                            {"name": r["type"], "args": r["args"], "id": r["id"]}
                        ],
                    )
                ]
            }
        )
        for r in parsed
    ]
    output_messages = [res] + [tr["messages"][0] for tr in tool_responses]
    reflection = reflection_chain.invoke(
        {"input": state["input"], "candidate": output_messages}
    )
    root = Node(output_messages, reflection=reflection)
    return {
        **state,
        "root": root,
    }

Candidate Generation

The following code prompts the same LLM to generate N additional candidates to check.

# This generates N candidate values
# for a single input to sample actions from the environment


def generate_candidates(messages: ChatPromptValue, config: RunnableConfig):
    n = config["configurable"].get("N", 5)
    bound_kwargs = llm.bind_tools(tools=tools).kwargs
    chat_result = llm.generate(
        [messages.to_messages()],
        n=n,
        callbacks=config["callbacks"],
        run_name="GenerateCandidates",
        **bound_kwargs,
    )
    return [gen.message for gen in chat_result.generations[0]]


expansion_chain = prompt_template | generate_candidates
res = expansion_chain.invoke({"input": "Write a research report on lithium pollution."})
res
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_rf2Ns2CW2LppxuUFI4irvRhM', 'function': {'arguments': '{"query":"lithium pollution research 2023"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}], 'refusal': None}, response_metadata={'finish_reason': 'tool_calls', 'logprobs': None}, id='run-dc7c2f76-1eaf-4c65-8803-7ccededfcf0e-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'lithium pollution research 2023'}, 'id': 'call_rf2Ns2CW2LppxuUFI4irvRhM', 'type': 'tool_call'}], usage_metadata={'input_tokens': 93, 'output_tokens': 123, 'total_tokens': 216}),
 AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_rf2Ns2CW2LppxuUFI4irvRhM', 'function': {'arguments': '{"query":"lithium pollution research report 2023"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls', 'logprobs': None}, id='run-dc7c2f76-1eaf-4c65-8803-7ccededfcf0e-1', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'lithium pollution research report 2023'}, 'id': 'call_rf2Ns2CW2LppxuUFI4irvRhM', 'type': 'tool_call'}], usage_metadata={'input_tokens': 93, 'output_tokens': 123, 'total_tokens': 216}),
 AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_rf2Ns2CW2LppxuUFI4irvRhM', 'function': {'arguments': '{"query":"lithium pollution research report"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls', 'logprobs': None}, id='run-dc7c2f76-1eaf-4c65-8803-7ccededfcf0e-2', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'lithium pollution research report'}, 'id': 'call_rf2Ns2CW2LppxuUFI4irvRhM', 'type': 'tool_call'}], usage_metadata={'input_tokens': 93, 'output_tokens': 123, 'total_tokens': 216}),
 AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_rf2Ns2CW2LppxuUFI4irvRhM', 'function': {'arguments': '{"query":"lithium pollution research report"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls', 'logprobs': None}, id='run-dc7c2f76-1eaf-4c65-8803-7ccededfcf0e-3', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'lithium pollution research report'}, 'id': 'call_rf2Ns2CW2LppxuUFI4irvRhM', 'type': 'tool_call'}], usage_metadata={'input_tokens': 93, 'output_tokens': 123, 'total_tokens': 216}),
 AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_rf2Ns2CW2LppxuUFI4irvRhM', 'function': {'arguments': '{"query":"lithium pollution research report 2023"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls', 'logprobs': None}, id='run-dc7c2f76-1eaf-4c65-8803-7ccededfcf0e-4', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'lithium pollution research report 2023'}, 'id': 'call_rf2Ns2CW2LppxuUFI4irvRhM', 'type': 'tool_call'}], usage_metadata={'input_tokens': 93, 'output_tokens': 123, 'total_tokens': 216})]

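Candidate generation node

We will package the candidate generation and reflection steps in the following "expand" node. We do all the operations as a batch process to speed up execution.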

from collections import defaultdict


def select(root: Node) -> Node:
    """Starting from the root node a child node is selected at each tree level until a leaf node is reached."""

    if not root.children:
        return root

    node = root
    while node.children:
        max_child = max(node.children, key=lambda child: child.upper_confidence_bound())
        node = max_child

    return node


def expand(state: TreeState, config: RunnableConfig) -> dict:
    """Starting from the "best" node in the tree, generate N candidates for the next step."""
    root = state["root"]
    best_candidate: Node = select(root)
    messages = best_candidate.get_trajectory()
    # Generate N candidates from the single child candidate
    new_candidates = expansion_chain.invoke(
        {"input": state["input"], "messages": messages}, config
    )
    parsed = parser.batch(new_candidates)
    flattened = [
        (i, tool_call)
        for i, tool_calls in enumerate(parsed)
        for tool_call in tool_calls
    ]
    tool_responses = [
        (
            i,
            tool_node.invoke(
                {
                    "messages": [
                        AIMessage(
                            content="",
                            tool_calls=[
                                {
                                    "name": tool_call["type"],
                                    "args": tool_call["args"],
                                    "id": tool_call["id"],
                                }
                            ],
                        )
                    ]
                }
            ),
        )
        for i, tool_call in flattened
    ]
    collected_responses = defaultdict(list)
    for i, resp in tool_responses:
        collected_responses[i].append(resp["messages"][0])
    output_messages = []
    for i, candidate in enumerate(new_candidates):
        output_messages.append([candidate] + collected_responses[i])

    # Reflect on each candidate
    # For tasks with external validation, you'd add that here.
    reflections = reflection_chain.batch(
        [{"input": state["input"], "candidate": msges} for msges in output_messages],
        config,
    )
    # Grow tree
    child_nodes = [
        Node(cand, parent=best_candidate, reflection=reflection)
        for cand, reflection in zip(output_messages, reflections)
    ]
    best_candidate.children.extend(child_nodes)
    # We have already extended the tree directly, so we just return the state
    return state
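
The bookkeeping in `expand` that pairs each candidate with its own tool results can be exercised in isolation. The sketch below mirrors the flatten and regroup pattern with made-up data: `fake_tool`, the candidate labels, and the parsed tool calls are hypothetical placeholders for the LLM output and the `ToolNode`.

```python
from collections import defaultdict

# Hypothetical parsed tool calls for three candidates.
# Candidate 1 answered directly and made no tool call.
parsed = [
    [{"type": "search", "args": {"query": "a"}, "id": "call_1"}],
    [],
    [
        {"type": "search", "args": {"query": "b"}, "id": "call_2"},
        {"type": "search", "args": {"query": "c"}, "id": "call_3"},
    ],
]
candidates = ["cand_0", "cand_1", "cand_2"]


def fake_tool(call: dict) -> str:
    """Stand-in for tool_node.invoke(...)."""
    return f"result for {call['args']['query']}"


# Flatten to (candidate index, tool call) pairs so each call remembers its owner.
flattened = [(i, call) for i, calls in enumerate(parsed) for call in calls]
tool_responses = [(i, fake_tool(call)) for i, call in flattened]

# Regroup the responses under the candidate that requested them.
collected = defaultdict(list)
for i, response in tool_responses:
    collected[i].append(response)

# Candidates without tool calls fall back to an empty list via defaultdict.
output_messages = [[cand] + collected[i] for i, cand in enumerate(candidates)]
print(output_messages)
```

Each inner list is what the reflection chain receives as `candidate`: the model's message followed by any tool results it triggered.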

Create Graph

With those two nodes defined, we are ready to define the graph. After each agent step, we have the option of finishing.

from typing import Literal

from langgraph.graph import END, StateGraph, START


def should_loop(state: TreeState):
    """Determine whether to continue the tree search."""
    root = state["root"]
    if root.is_solved:
        return END
    if root.height > 5:
        return END
    return "expand"


builder = StateGraph(TreeState)
builder.add_node("start", generate_initial_response)
builder.add_node("expand", expand)
builder.add_edge(START, "start")


builder.add_conditional_edges(
    "start",
    # Either expand/rollout or finish
    should_loop,
    ["expand", END],
)
builder.add_conditional_edges(
    "expand",
    # Either continue to rollout or finish
    should_loop,
    ["expand", END],
)

graph = builder.compile()

API Reference: END | StateGraph | START
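
The stopping rule in `should_loop` can be tried on a toy tree. In this sketch, `MiniNode` and `should_stop` are hypothetical stand-ins that keep only the fields the rule reads. It also shows that an `expand` step does not always increase the tree height, because a step only deepens the branch it selected.

```python
class MiniNode:
    """Hypothetical stand-in for Node with only what the stopping rule needs."""

    def __init__(self, parent=None, solved=False):
        self.parent = parent
        self.children = []
        self.is_solved = solved
        if parent is not None:
            parent.children.append(self)
        if solved:
            # Mirrors _mark_tree_as_solved: flag every ancestor as solved.
            node = parent
            while node is not None:
                node.is_solved = True
                node = node.parent

    @property
    def height(self) -> int:
        if self.children:
            return 1 + max(child.height for child in self.children)
        return 1


def should_stop(root: MiniNode, max_height: int = 5) -> bool:
    """Stop when a solution exists anywhere in the tree or the tree is too deep."""
    return root.is_solved or root.height > max_height


root = MiniNode()
a = MiniNode(parent=root)
b = MiniNode(parent=root)
MiniNode(parent=a)           # deepest branch now has 3 levels
height_before = root.height
b1 = MiniNode(parent=b)      # expanding a shallower leaf leaves the height at 3
height_after = root.height
stop_unsolved = should_stop(root)

MiniNode(parent=b1, solved=True)
stop_solved = should_stop(root)
print(height_before, height_after, stop_unsolved, stop_solved)
```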

from IPython.display import Image

Image(graph.get_graph().draw_mermaid_png())

Invoke

question = "Generate a table with the average size and weight, as well as the oldest recorded instance for each of the top 5 most common birds."
last_step = None
for step in graph.stream({"input": question}):
    last_step = step
    step_name, step_state = next(iter(step.items()))
    print(step_name)
    print("rolled out: ", step_state["root"].height)
    print("---")
start
rolled out:  1
---
expand
rolled out:  2
---

solution_node = last_step["expand"]["root"].get_best_solution()
best_trajectory = solution_node.get_trajectory(include_reflections=False)
print(best_trajectory[-1].content)
Let's synthesize the information into a coherent table summarizing the average size, weight, and the oldest recorded instance for each of the top 5 most common birds.

### Top 5 Most Common Birds
Based on the search results, the top 5 most common birds are:
1. Domestic Chicken
2. House Sparrow
3. European Starling
4. Ring-billed Gull
5. Barn Swallow

### Table: Average Size, Weight, and Oldest Recorded Instance

| Bird               | Average Size (cm) | Average Weight (g) | Oldest Recorded Instance |
|--------------------|-------------------|--------------------|-------------------------|
| Domestic Chicken   | 40-50             | 1,200-2,500        | ~16 years (Pet record)  |
| House Sparrow      | 14-18             | 24-40              | 13 years                |
| European Starling  | 20-23             | 58-100             | 15 years                |
| Ring-billed Gull   | 48-53             | 300-700            | 23 years                |
| Barn Swallow       | 15-20             | 17-20              | 16 years                |

### Additional Details
- **Domestic Chicken**: The average size and weight can vary significantly based on breed and diet. The oldest recorded pet chicken lived up to 16 years.
- **House Sparrow**: Commonly found in urban areas, with an average lifespan significantly shorter in the wild.
- **European Starling**: Known for their adaptability, starlings have a notable lifespan when not exposed to predators or harsh conditions.
- **Ring-billed Gull**: These gulls are common in North America and have a relatively long lifespan compared to other birds.
- **Barn Swallow**: Known for their migratory habits, these birds have relatively high longevity given their size.

This table now provides a structured and comprehensive summary of the requested information.

question = "Write out magnus carlson series of moves in his game against Alireza Firouzja and propose an alternate strategy"
last_step = None
for step in graph.stream({"input": question}):
    last_step = step
    step_name, step_state = next(iter(step.items()))
    print(step_name)
    print("rolled out: ", step_state["root"].height)
    print("---")
start
rolled out:  1
---
expand
rolled out:  2
---
expand
rolled out:  3
---
expand
rolled out:  3
---
expand
rolled out:  3
---

solution_node = last_step["expand"]["root"].get_best_solution()
best_trajectory = solution_node.get_trajectory(include_reflections=False)
print(best_trajectory[-1].content)
It appears that the specific game moves between Magnus Carlsen and Alireza Firouzja are not readily available in the search results. However, I can provide a general idea of what a typical game between high-level players like Carlsen and Firouzja might look like and propose an alternate strategy based on common chess principles.

### Example Game Moves (Hypothetical)
Here's a hypothetical sequence of moves in a game between Magnus Carlsen and Alireza Firouzja:

1. e4 e5
2. Nf3 Nc6
3. Bb5 a6
4. Ba4 Nf6
5. O-O Be7
6. Re1 b5
7. Bb3 d6
8. c3 O-O
9. h3 Nb8
10. d4 Nbd7
11. Nbd2 Bb7
12. Bc2 Re8
13. Nf1 Bf8
14. Ng3 g6
15. a4 c5
16. d5 c4
17. Be3 Qc7
18. Qd2 Nc5
19. Nh2 Bg7
20. Ng4 Nxg4
21. hxg4 Qd7
22. f3 f6
23. Kf2 Qf7
24. Rh1 Rad8
25. Rh3 Bc8
26. Rah1 h6
27. Bxh6 Bxh6
28. Rxh6 Qg7
29. g5 f5
30. exf5 Bxf5
31. Bxf5 gxf5
32. Nh5 Qf7
33. Nf6+ Kf8
34. Rh8+ Ke7
35. Rxe8+ Rxe8
36. Nxe8 Qxe8
37. Rh7+ Kd8
38. g6 Qg8
39. Qg5+ Kc8
40. Qe7 Qd8
41. Qxd8+ Kxd8
42. g7 Kc7
43. g8=Q+ Kb6
44. Qb8+ Ka5
45. Qd8+ Kxa4
46. g4 fxg4
47. fxg4 Kb3
48. g5 Kxb2
49. Qb6 Kxc3
50. Qxc5 dxc5
51. d6 b4
52. d7 b3
53. d8=Q b2
54. Qd1 b1=Q
55. Rxb1 Kxc4
56. Qc1+ Kd5
57. Qxc3 c4
58. Ke3 Kc6
59. Kd4 Kc7
60. Qxc4+ Kd6
61. Qc5+ Ke6
62. Rb6+ Kf7
63. Qc7+ Ke8
64. Rb8#

### Alternate Strategy

If we consider that Magnus Carlsen played the white pieces and used a typical Ruy Lopez opening, an alternate strategy could involve a different opening or a variation within the Ruy Lopez itself. For instance:

1. **Alternative Opening: The Italian Game**
   - 1. e4 e5
   - 2. Nf3 Nc6
   - 3. Bc4 Bc5
   - 4. c3 Nf6
   - 5. d4 exd4
   - 6. cxd4 Bb4+
   - 7. Nc3 Nxe4
   - 8. O-O Bxc3
   - 9. d5 Ne7
   - 10. Qd3 f5
   - 11. bxc3 d6
   - 12. Nd4 O-O
   - 13. f3 Nc5
   - 14. Qc2 f4
   - 15. Re1 Ng6
   - 16. Ba3 Qg5
   - 17. Bxc5 dxc5
   - 18. Ne6 Bxe6
   - 19. dxe6 Ne7
   - 20. Rad1 Rad8
   - 21. Rd7 Rxd7
   - 22. exd7+ Kh8
   - 23. Qe4 Nc6
   - 24. Bd3 g6
   - 25. Qe8 Kg7
   - 26. Bb5 Nd8
   - 27. Re7+ Kh6
   - 28. Qxf8+ Kh5
   - 29. Rxh7#

2. **Variation in the Ruy Lopez:**
   - Instead of the main lines, White could opt for the "Cozy Variation" or the "Deferred Steinitz Defense."
   - For example, after the initial moves:
     - 1. e4 e5
     - 2. Nf3 Nc6
     - 3. Bb5 a6
     - 4. Ba4 d6 (Deferred Steinitz Defense)
     - 5. c3 Bg4
     - 6. h3 Bh5
     - 7. d4 exd4
     - 8. cxd4 Be7
     - 9. Nc3 Nf6
     - 10. O-O O-O

By varying the opening or the approach within a given opening, Carlsen could potentially avoid deep preparation by Firouzja and steer the game into less familiar territory for his opponent.

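Conclusion

Congrats on implementing LATS! This is a technique that can be reasonably fast and effective at solving complex reasoning tasks. A few caveats you may have observed above:

  1. Although effective, the tree rollout can take additional compute time. If you want to include this in a production app, you would either want to ensure that intermediate steps are streamed (so users see the thinking process and have access to intermediate results), or use it to produce fine-tuning data that improves single-shot accuracy and avoids long rollouts.
  2. The candidate selection process is only as good as the reward you generate. Here we use self-reflection exclusively, but if you have an external source of feedback (such as code test execution), it should be incorporated in the locations mentioned above.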
