LangChain Streaming Response Tutorial: From Basic to Agent Tool Streaming
LangChain Streaming Response Tutorial: From Basic to Agent Tool Streaming
Imagine you’re asking a super smart computer a question. Instead of waiting a long time for the complete answer, wouldn’t it be cool if it started typing out its thoughts and findings right away? That’s what “streaming” is all about! It makes talking to AI feel much faster and more engaging.
In this guide, we’ll learn all about how LangChain helps us achieve this amazing streaming experience. We’ll start with simple AI messages and then dive deep into the fascinating world of langchain agent tool streaming. You’ll discover how to see your AI agent’s “brain” at work, step-by-step.
Why Streaming Makes AI So Much Better
When you use an app powered by AI, you usually type something and then wait. If the AI is doing something complex, like looking up facts or doing calculations, this waiting time can feel very long. Streaming changes this by sending you pieces of the answer as soon as they are ready. This makes the app feel responsive and alive.
Think of it like watching someone draw a picture. You’d rather see the drawing unfold line by line than just wait for the finished masterpiece, right? Streaming gives you that exciting real-time feeling with AI. You get to see the AI’s thoughts, actions, and results appear piece by piece.
Getting Started: The Very Basics of Streaming
Before we jump into agents, let’s understand how simple AI conversations stream their responses. In LangChain, this means getting words back from a large language model (LLM) as it generates them. It’s like watching words appear on your screen one by one, instead of waiting for the entire sentence.
This basic kind of streaming is super helpful for chat applications. You don’t want to wait for a whole paragraph to load; you want to see the AI’s reply immediately. Let’s see how you can make a simple AI chat stream its words.
What You Need for Basic Streaming
To follow along, you’ll need LangChain installed and access to an LLM. We’ll use OpenAI’s models as an example, but many other models also support streaming. Make sure you have your API key set up in your environment variables.
Here’s a quick peek at the setup:
1
2
3
4
5
6
# Install LangChain if you haven't already
# pip install langchain-openai langchain
# Set your OpenAI API key
import os
os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY"
You’ll replace "YOUR_API_KEY" with your actual key. This simple setup lets your computer talk to the smart AI model.
Streaming a Simple Chat Message
Let’s ask a large language model a question and watch its answer stream back to us. We’ll use the ChatOpenAI model from LangChain. It has a special stream() method that makes this magic happen.
This method gives us chunks of the response as they are generated. You can then print each chunk as it arrives. It’s really straightforward to set up.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
# Initialize the chat model with streaming enabled
# temperature=0 makes the AI less creative and more direct
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", streaming=True, temperature=0)
# Create a list of messages for the conversation
messages = [
HumanMessage(content="Tell me a very short story about a brave mouse.")
]
print("AI is thinking...")
# Stream the response and print each chunk
for chunk in llm.stream(messages):
print(chunk.content, end="", flush=True)
print("\n\nStory finished!")
When you run this code, you’ll see the story appear character by character, or word by word, instead of all at once. The flush=True part is important for making sure each piece prints immediately. You can see how fast the AI responds.
Understanding the Chunks
Each chunk you get back in the for loop is a small piece of the AI’s reply. It’s usually a few characters or a word. LangChain breaks down the full message into these tiny bits. This allows your application to show content to the user as soon as it’s available.
You’re not waiting for the complete thought; you’re getting it as it forms. This is the fundamental idea behind Agent streaming fundamentals. This basic concept will be important when we move on to more complex agents.
Stepping Up to Agent Streaming Fundamentals
Now, let’s talk about agents. An agent is like a super-smart helper that can not only talk but also use tools to get things done. Imagine an agent that can search the internet, do math, or even send emails. When an agent does these things, it usually takes multiple steps.
Instead of just getting the final answer, wouldn’t it be great to see the agent’s “thinking process”? This is where langchain agent tool streaming comes in. You can see when it decides to use a tool, what tool it uses, what input it gives, and what results it gets back. This is what we call intermediate steps streaming.
Why Agent Streaming is a Game-Changer
When an agent needs to perform several actions, like searching for information and then summarizing it, it can take a bit of time. If you only show the final answer, the user might wonder what’s happening. Agent streaming solves this. By showing the user the reasoning trace streaming, they understand the progress.
It gives the user confidence that the AI is working hard and moving towards a solution. This transparency is key for complex AI applications. It’s also incredibly useful for you, the developer, to debugging agent streams. You can see exactly what your agent is doing at each moment.
The AgentExecutor and Its Streaming Powers
In LangChain, agents are typically run using something called an AgentExecutor. This is the engine that drives your agent, helping it decide what to do next. The good news is that AgentExecutor also has a stream() method, just like our simple ChatOpenAI model. However, it streams much more information!
When you use AgentExecutor’s stream(), you’re not just getting words. You’re getting messages about the agent’s thoughts, its tool call streaming, and the results of those tool calls. This is the core of AgentExecutor streaming config and how you control what information you see.
Diving Deep into Agent Tool Streaming
Let’s explore what kind of information you can get when an agent streams. It’s not just the final reply; it’s a whole story of how the agent gets there. You can actually peek inside the agent’s mind and see its decision-making process. This makes the AI feel much more intelligent and capable.
invoke() vs. stream() with Agents
You might be familiar with the invoke() method in LangChain. For an agent, invoke() runs the entire process and only gives you the final answer. It’s like pressing a button and waiting for the completed task. This is good when you only care about the end result.
However, stream() is different. It’s like watching a movie being filmed, scene by scene. It provides a continuous flow of updates as the agent executes its plan. This includes its thoughts, tool usage, and the outcomes.
What Gets Streamed? The Pieces of the Puzzle
When you use agent.stream(), you’ll receive various types of “events” or “chunks” of information. These events tell you different parts of the agent’s journey. It’s like getting updates from a detective as they solve a case.
Here are the main types of information you’ll see in langchain agent tool streaming:
- Thinking (Reasoning Trace Streaming):
- This shows you the LLM’s internal monologue. It’s what the AI is “thinking” to decide its next step.
- You’ll often see this as text output from the LLM, explaining its reasoning. This is part of the
intermediate steps streaming.
- Tool Calls (Tool Call Streaming):
- When the agent decides it needs to use a tool (like a calculator or a search engine), you’ll get an event saying it’s calling a tool.
- This event usually includes the name of the tool and the input the agent is giving to that tool.
- Tool Outputs (Streaming Tool Outputs):
- After a tool has been used, you’ll receive an event with the result from that tool.
- For example, if the agent used a calculator, this would be the answer to the math problem. If it used a search tool, it would be the search results.
- Final Answer:
- Once the agent has completed its task and has the final answer, you’ll receive this as the last piece of information.
- This is typically the summarized or direct response to your original question.
Let’s set up a simple agent to see this in action.
Practical Example: A Simple Agent with a Tool
To illustrate langchain agent tool streaming, we’ll create a very basic agent that can perform a simple math calculation. We’ll use a tool for addition. This agent will decide if it needs to use the calculator, use it, and then give us the answer.
First, we need to define our tool. A tool is just a function that your agent can call. We’ll make a simple “add” tool.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
from langchain.tools import tool
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.prompts import PromptTemplate
# 1. Define a simple tool
@tool
def add(a: int, b: int) -> int:
"""Adds two integers and returns the result."""
return a + b
# Put our tool into a list
tools = [add]
# 2. Define the LLM (our smart brain)
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0, streaming=True)
# 3. Define the Agent's Prompt (how it thinks)
# This is a standard prompt for a ReAct agent, which makes it think step-by-step
prompt_template = """
You are a helpful assistant. You have access to the following tools:
{tools}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: {input}
Thought:{agent_scratchpad}
"""
prompt = PromptTemplate.from_template(prompt_template)
# 4. Create the agent
agent = create_react_agent(llm, tools, prompt)
# 5. Create the AgentExecutor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=False) # verbose=False here to let our stream handle output
print("Agent is starting to work. Watching the stream...\n")
# 6. Stream the agent's response
for s in agent_executor.stream({"input": "What is 10 + 5?"}):
# Each 's' is a dictionary containing information about an event
# We'll print different parts depending on the event type
if "__end__" in s:
# This signals the end of the full process
print(f"\n--- Full process finished: {s['__end__']['output']['output']} ---\n")
elif "actions" in s:
# This means the agent decided to take an action (use a tool)
for action in s["actions"]:
print(f"**Agent decided to use tool:** {action.tool}")
print(f"**Input to tool:** {action.tool_input}")
print(f"**Thought before action:** {action.log}")
elif "steps" in s:
# These are the intermediate steps, like tool observations
for step in s["steps"]:
print(f"**Tool Observation (result):** {step.observation}")
# You might also find intermediate LLM thoughts here within step.log
if step.log:
print(f"**Agent's thoughts after tool:** {step.log}")
elif "output" in s:
# This is the final output from the agent
print(f"**Final Agent Output:** {s['output']['output']}")
elif "input" in s:
# Initial input to the agent
print(f"**Initial Input to Agent:** {s['input']}")
else:
# General LLM tokens (thoughts)
# This covers the LLM's internal thinking process before it decides on an action
for key, value in s.items():
if hasattr(value, 'content') and value.content.strip():
print(f"**Agent's thought:** {value.content}", end="", flush=True)
print("\n\nStreaming complete!")
When you run this code, you’ll see a detailed log of what the agent is doing. You’ll first see its Thought (reasoning trace streaming), then the Action it takes (tool call streaming), the Observation (streaming tool outputs) which is the tool’s result, and finally the Final Answer. This full sequence is the essence of langchain agent tool streaming.
You can clearly see the distinction between the agent’s internal reasoning and its interactions with the add tool. This is very powerful for understanding and building complex agents. It truly shows the intermediate steps streaming in action.
Customizing AgentExecutor Streaming Config
LangChain gives you control over what kind of information you want to stream from your AgentExecutor. You might not always want to show every single thought or observation to your user. Sometimes, a more concise stream is better for the user interface (agent streaming UI patterns).
The AgentExecutor has parameters that help you tailor the streaming experience. This allows you to fine-tune the AgentExecutor streaming config to match your application’s needs.
Controlling Intermediate Steps
One important configuration is whether to stream the intermediate steps streaming. These include the agent’s internal thoughts and the raw observations from tools. By default, stream() usually gives you these.
However, you might want to hide some of the raw internal workings. You can often control the verbosity through the agent creation or execution parameters. For instance, the verbose flag on AgentExecutor changes how much detail is logged to the console, but the stream() method itself provides granular control over the types of events you process.
In more recent LangChain versions, the stream() method returns a structured log of events. It’s up to you to filter and display what’s relevant. The example above shows how to parse these events. You can choose to ignore specific event types if you don’t want to display them to the user.
For instance, if you only wanted to show the final output and tool calls, you’d modify the loop:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
# ... (agent_executor setup as before) ...
print("Agent is working, showing only key steps...\n")
for s in agent_executor.stream({"input": "What is 10 + 5?"}):
if "actions" in s:
for action in s["actions"]:
print(f"**Agent is using tool:** {action.tool} with input '{action.tool_input}'")
elif "steps" in s:
for step in s["steps"]:
# Only show the tool's result, not necessarily the agent's reflection
print(f"**Tool finished, result:** {step.observation}")
elif "output" in s:
print(f"**Final Answer:** {s['output']['output']}")
This simplified output demonstrates how you can customize the streaming experience by picking which events to display. This is a crucial part of designing effective agent streaming UI patterns.
The stream_run_log Option (Advanced)
Sometimes, you need an even more detailed, developer-focused log of everything that happens. LangChain’s stream_run_log can provide this. This feature gives you a complete trace of every single event and sub-event that occurs within the AgentExecutor. It’s often used for advanced debugging agent streams.
This is less for user display and more for understanding the inner workings. It can be quite verbose, providing a full RunLog object for each step.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
# ... (agent_executor setup as before) ...
# To enable stream_run_log, you'd typically pass a config to the stream method,
# or for older versions, it might be a flag during agent creation.
# Modern LangChain `stream` method gives event objects you filter.
print("Agent is working, showing detailed run log (conceptual)...\n")
# In current LangChain, `stream` gives you event types.
# You would get events like 'on_llm_stream', 'on_tool_start', etc.
# The previous example of parsing 's' already covers this.
# If you wanted to see the *raw* event objects from LangChain Core,
# you'd typically look at the structure `s` provides.
# Example of looking at the type of events (conceptual for detailed log)
# for event in agent_executor.stream({"input": "What is 10 + 5?"}, stream_run_log=True):
# print(event) # This would print the full event object, often a dict with 'event', 'name', 'data'
# The `stream` method in recent LangChain directly gives you these event types.
# The code above already demonstrates how to inspect these event dictionaries.
The stream_run_log concept gives you deeper insight into the events like on_llm_start, on_tool_start, on_tool_end, on_llm_new_token, on_agent_action, and on_agent_finish. These are the raw building blocks that describe the agent’s activity. For most users, you’ll want to translate these into more human-readable messages, as shown in our previous example.
Streaming with Custom Tools
So far, we’ve used a simple add tool that we created. But what if your agent needs to do something much more specific to your application? This is where streaming with custom tools becomes incredibly valuable. You can build any tool you need and still get the benefits of streaming its usage.
Creating custom tools is a core part of building powerful LangChain agents. When you stream an agent that uses your custom tool, you’ll still see the tool call streaming and streaming tool outputs for your unique functions. This means you maintain full transparency, no matter how specialized your agent becomes.
Building a Custom “Fact Lookup” Tool
Let’s create a custom tool that can “look up” a fact from a predefined dictionary. This mimics how an agent might query a database or an internal knowledge base. Our agent will use this tool to answer a question.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
from langchain.tools import tool
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.prompts import PromptTemplate
# Define our knowledge base (simple dictionary for demonstration)
KNOWLEDGE_BASE = {
"Eiffel Tower height": "330 meters (including antenna)",
"Capital of France": "Paris",
"Population of Mars": "0 (it's uninhabited by humans)",
"Speed of light": "299,792,458 meters per second",
}
# 1. Create a custom tool to look up facts
@tool
def lookup_fact(query: str) -> str:
"""Looks up a fact in a predefined knowledge base.
Input should be the exact fact to look for, e.g., 'Eiffel Tower height'.
"""
if query in KNOWLEDGE_BASE:
return KNOWLEDGE_BASE[query]
else:
return "Fact not found in knowledge base."
# Put our custom tool into a list
tools = [lookup_fact]
# 2. Define the LLM (our smart brain)
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0, streaming=True)
# 3. Define the Agent's Prompt
prompt_template = """
You are a helpful assistant. You have access to the following tools:
{tools}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: {input}
Thought:{agent_scratchpad}
"""
prompt = PromptTemplate.from_template(prompt_template)
# 4. Create the agent
agent = create_react_agent(llm, tools, prompt)
# 5. Create the AgentExecutor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=False)
print("Agent is starting to work with a custom tool. Watching the stream...\n")
# 6. Stream the agent's response
for s in agent_executor.stream({"input": "What is the height of the Eiffel Tower?"}):
if "__end__" in s:
print(f"\n--- Full process finished: {s['__end__']['output']['output']} ---\n")
elif "actions" in s:
for action in s["actions"]:
print(f"**Agent decided to use tool:** {action.tool}")
print(f"**Input to tool:** {action.tool_input}")
print(f"**Thought before action:** {action.log}")
elif "steps" in s:
for step in s["steps"]:
print(f"**Tool Observation (result):** {step.observation}")
if step.log:
print(f"**Agent's thoughts after tool:** {step.log}")
elif "output" in s:
print(f"**Final Agent Output:** {s['output']['output']}")
elif "input" in s:
print(f"**Initial Input to Agent:** {s['input']}")
else:
for key, value in s.items():
if hasattr(value, 'content') and value.content.strip():
print(f"**Agent's thought:** {value.content}", end="", flush=True)
print("\n\nStreaming complete!")
In this example, your agent now uses the lookup_fact tool. You’ll observe the same detailed tool call streaming and streaming tool outputs as before. This confirms that langchain agent tool streaming works seamlessly with any custom tool you create. This flexibility is what makes LangChain so powerful.
Handling Multi-Step Agent Streaming
Many real-world tasks require an agent to perform more than one step. It might need to search for information, then summarize it, then maybe perform a calculation. This is where multi-step agent streaming truly shines. You can observe the agent chaining together multiple thoughts and tool calls to reach a complex goal.
Seeing an agent go through a sequence of actions, one after another, is incredibly insightful. It helps you understand the agent’s reasoning process for complicated problems. This also helps in debugging agent streams, as you can pinpoint exactly where an agent might be going off track.
Example: Agent Using Two Tools Sequentially
Let’s create an agent that needs to first look up a fact and then perform a calculation based on that fact. This involves two different tools and multiple intermediate steps.
We’ll use our lookup_fact tool and the add tool together.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
from langchain.tools import tool
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.prompts import PromptTemplate
# Our tools from previous examples
KNOWLEDGE_BASE = {
"Eiffel Tower height": "330 meters", # Simplified for calculation
"London Eye height": "135 meters",
}
@tool
def lookup_fact(query: str) -> str:
"""Looks up a fact in a predefined knowledge base.
Input should be the exact fact to look for, e.g., 'Eiffel Tower height'.
"""
if query in KNOWLEDGE_BASE:
return KNOWLEDGE_BASE[query]
else:
return "Fact not found in knowledge base."
@tool
def add(a: int, b: int) -> int:
"""Adds two integers and returns the result."""
return a + b
# Put both tools into a list
tools = [lookup_fact, add]
# 2. Define the LLM
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0, streaming=True)
# 3. Define the Agent's Prompt
prompt_template = """
You are a helpful assistant. You have access to the following tools:
{tools}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: {input}
Thought:{agent_scratchpad}
"""
prompt = PromptTemplate.from_template(prompt_template)
# 4. Create the agent
agent = create_react_agent(llm, tools, prompt)
# 5. Create the AgentExecutor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=False)
print("Agent is tackling a multi-step task. Watching the stream...\n")
# 6. Stream the agent's response
question = "What is the total height if you combine the Eiffel Tower and the London Eye? Give me just the number."
for s in agent_executor.stream({"input": question}):
if "__end__" in s:
print(f"\n--- Full process finished: {s['__end__']['output']['output']} ---\n")
elif "actions" in s:
for action in s["actions"]:
print(f"**Agent decided to use tool:** {action.tool}")
print(f"**Input to tool:** '{action.tool_input}'")
if action.log: # log contains the LLM's thought leading to this action
print(f"**Thought before action:** {action.log.strip()}")
elif "steps" in s:
for step in s["steps"]:
print(f"**Tool Observation (result):** {step.observation}")
if step.log: # log contains the LLM's thought after the observation
print(f"**Agent's thoughts after tool:** {step.log.strip()}")
elif "output" in s:
print(f"**Final Agent Output:** {s['output']['output']}")
elif "input" in s:
print(f"**Initial Input to Agent:** {s['input']}")
else:
for key, value in s.items():
if hasattr(value, 'content') and value.content.strip():
print(f"**Agent's thought:** {value.content}", end="", flush=True)
print("\n\nMulti-step streaming complete!")
When you run this example, you’ll see a fascinating sequence of events:
- Thought: The agent realizes it needs the heights of two landmarks.
- Action (lookup_fact): It calls
lookup_factfor “Eiffel Tower height”. - Observation: It gets the height (e.g., “330 meters”).
- Thought: It processes the first observation and realizes it needs the second height.
- Action (lookup_fact): It calls
lookup_factfor “London Eye height”. - Observation: It gets the second height (e.g., “135 meters”).
- Thought: Now it has both numbers and decides to add them.
- Action (add): It calls the
addtool with the two numbers. - Observation: It gets the sum (e.g., “465”).
- Thought: It has the final answer.
- Final Answer: It provides the total.
This detailed, step-by-step output demonstrates the power of multi-step agent streaming and reasoning trace streaming. You can follow the entire problem-solving journey of your agent. This kind of transparency is invaluable for building robust and intelligent AI applications.
Debugging Agent Streams
Even the smartest agents can sometimes get confused or make mistakes. When this happens, debugging agent streams becomes your best friend. By watching the stream, you can see exactly where the agent’s logic went wrong, which tool it tried to use, or what input it gave to a tool.
Without streaming, you’d just get a final incorrect answer, and it would be much harder to figure out why. With streaming, you have a detailed log of every decision and action.
What to Look For When Debugging
- Unexpected Thoughts: Is the agent thinking about something irrelevant?
- Incorrect Tool Selection: Did it choose the wrong tool for the job?
- Bad Tool Input: Is the input it’s giving to a tool formatted incorrectly or missing information?
- Unexpected Observations: Is the tool returning something you didn’t expect? This might indicate an issue with the tool itself, or how the agent is interpreting its results.
- Infinite Loops: Is the agent repeating the same thoughts and actions over and over? This usually means it’s stuck.
Using LangSmith for Advanced Debugging
For even more powerful debugging, especially in complex scenarios, LangChain offers LangSmith. LangSmith is a platform that traces every single step of your LangChain application, including all LLM calls, tool calls, and agent decisions. It presents this information in a beautiful, interactive timeline.
While setting up LangSmith is beyond this tutorial, know that it’s an incredibly valuable tool for debugging agent streams. It provides visual insights into the reasoning trace streaming and intermediate steps streaming that are hard to get from raw console output alone. You can learn more about it on the LangSmith website.
Agent Streaming UI Patterns
Once you have your agent streaming its thoughts and actions, the next step is to present this information to the user in a clear and engaging way. This is where agent streaming UI patterns come into play. A well-designed user interface can turn a raw stream of data into a magical experience.
Think about how chat applications show you when someone is typing. You don’t see raw data; you see “User is typing…” or animated dots. We want to do something similar for our agents.
Displaying Thoughts and Actions
Instead of just printing raw events, a good UI would translate them into user-friendly messages:
- For LLM thoughts (reasoning trace streaming):
- “AI is thinking…”
- “Let me consider this problem.”
- “I need to find some information first.”
- You could even show the actual thought in a subtle way, perhaps in a smaller font or a tooltip.
- For Tool Calls (tool call streaming):
- “AI is searching the web for…”
- “AI is doing a calculation…”
- “AI is checking the database for…”
- Followed by the specific input if it’s not too technical.
- For Tool Outputs (streaming tool outputs):
- “Found: [result of tool]”
- “Calculation result: [number]”
- “The database returned: [data]”
Visual Cues and Animations
- Typing Indicators: Like in chat apps, animated ellipses (…) can show the AI is actively generating text.
- Progress Bars/Spinners: For longer tool calls or complex
multi-step agent streaming, a spinning icon or a small progress bar can indicate activity. - Step-by-Step Display: Each thought, tool call, and observation could appear as a new message bubble or a new line of text, making it easy to follow the
intermediate steps streaming. - Highlighting Current Step: You could highlight the current action or thought the agent is working on.
Consider this conceptual example for a UI:
1
2
3
4
5
6
7
8
9
10
11
12
13
**User:** What is the total height of the Eiffel Tower and the London Eye?
**AI (thinking):** I need to find the heights of both landmarks first.
**AI (using tool: lookup_fact):** Looking up "Eiffel Tower height"...
**AI (result):** Found: 330 meters.
**AI (thinking):** Now I need the London Eye's height.
**AI (using tool: lookup_fact):** Looking up "London Eye height"...
**AI (result):** Found: 135 meters.
**AI (thinking):** I have both heights. Now I'll add them.
**AI (using tool: add):** Adding 330 + 135...
**AI (result):** Calculation complete: 465.
**AI (thinking):** I have the final answer.
**AI:** The total height is 465 meters.
This user-friendly representation of the streamed events provides a much better experience than raw on_tool_start logs. Designing effective agent streaming UI patterns transforms a functional feature into a delightful one.
Conclusion
You’ve embarked on an exciting journey, starting from the basic idea of streaming text from an AI to mastering the intricacies of langchain agent tool streaming. You’ve learned how to peek into your AI agent’s “mind,” seeing its reasoning trace streaming, its tool call streaming, and the streaming tool outputs for every action it takes.
We covered the Agent streaming fundamentals, explored intermediate steps streaming, and even customized the AgentExecutor streaming config. You also saw how streaming with custom tools makes your agents incredibly flexible and how multi-step agent streaming helps tackle complex problems. Finally, we touched upon debugging agent streams and discussed essential agent streaming UI patterns for creating engaging user experiences.
The ability to stream responses from LangChain agents is a powerful feature. It makes your AI applications more transparent, responsive, and user-friendly. Now, you have the knowledge to build dynamic and interactive AI experiences where users can witness the magic of intelligent agents at work, step by step. Keep experimenting, keep building, and unlock the full potential of streaming with LangChain!
Leave a comment