{ "cells": [ { "cell_type": "markdown", "id": "51466c8d-8ce4-4b3d-be4e-18fdbeda5f53", "metadata": {}, "source": [ "# How to stream LLM tokens from your graph\n", "\n", "In this example we will stream tokens from the language model powering an agent. We will use a ReAct agent as an example.\n", "\n", "This how-to guide closely follows the others in this directory, so we will call out differences with the **STREAMING** tag below (if you just want to search for those).\n", "\n", "
"    <p class=\"admonition-title\">Note</p>\n",
"    <p>\n",
"      In this how-to, we will create our agent from scratch to be transparent (but verbose). You can accomplish similar functionality using the <code>create_react_agent(model, tools=tools)</code> (API doc) constructor. This may be more appropriate if you are used to LangChain's <code>AgentExecutor</code> class.\n",
"    </p>\n",
"</div>\n",
"\n",
"<div class=\"admonition note\">\n",
"    <p class=\"admonition-title\">Note on Python < 3.11</p>\n",
"    <p>\n",
"      When using Python 3.8, 3.9, or 3.10, please ensure you manually pass the <code>RunnableConfig</code> through to the llm when invoking it, like so: <code>llm.ainvoke(..., config)</code>.\n",
"      The stream method collects all events from your nested code using a streaming tracer passed as a callback. In 3.11 and above, this is automatically handled via contextvars; prior to 3.11, asyncio's tasks lacked proper contextvar support, meaning that the callbacks will only propagate if you manually pass the config through. We do this in the <code>call_model</code> method below.\n",
"    </p>\n",
"</div>\n",
"\n",
"<div class=\"admonition tip\">\n",
"    <p class=\"admonition-title\">Set up LangSmith for LangGraph development</p>\n",
"    <p>\n",
"      Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started here.\n",
"    </p>\n",
"</div>\n",
"\n",
\n", "Manual Callback Propagation
\n", "\n",
" Note that in call_model(state: State, config: RunnableConfig):
below, we a) accept the RunnableConfig in the node and b) pass this in as the second arg for llm.ainvoke(..., config)
. This is optional for python 3.11 and later.
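"\n",
"A minimal sketch of that pattern (assuming an <code>llm</code> chat model and the <code>State</code> schema defined later in this guide):\n",
"\n",
"```python\n",
"from langchain_core.runnables import RunnableConfig\n",
"\n",
"async def call_model(state: State, config: RunnableConfig):\n",
"    # a) accept the RunnableConfig in the node, and\n",
"    # b) pass it as the second arg so callbacks propagate on Python < 3.11.\n",
"    response = await llm.ainvoke(state[\"messages\"], config)\n",
"    return {\"messages\": [response]}\n",
"```\n",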