Chapter 4: The Agent Loop
💻 Code: start from the
lesson-04branch of Hendrixer/agents-v2. Thenotes/folder on that branch has the code you’ll write in this chapter.
The Heart of an Agent
This is the most important chapter in the book. Everything before this was setup. Everything after builds on this.
The agent loop is what transforms a language model from a question-answering machine into an autonomous agent. Here’s the pattern:
while true:
1. Send messages to LLM (with tools)
2. Stream the response
3. If LLM wants to call tools:
a. Execute each tool
b. Add results to message history
c. Continue the loop
4. If LLM is done (no tool calls):
a. Break out of the loop
b. Return the final response
The LLM decides when to stop. It might call one tool, process the result, call another, and then respond with text. Or it might call three tools in one turn, process all results, and respond. The loop keeps going until the LLM says “I’m done — here’s my answer.”
Streaming vs. Generating
In Chapter 2, we used generateText() which waits for the complete response before returning. That’s fine for evals, but terrible for UX. Users want to see tokens appear in real-time.
streamText() returns an async iterable that yields chunks as they arrive:
const result = streamText({
model: openai("gpt-5-mini"),
messages,
tools,
});
for await (const chunk of result.fullStream) {
if (chunk.type === "text-delta") {
// A piece of text arrived
process.stdout.write(chunk.text);
}
if (chunk.type === "tool-call") {
// The LLM wants to call a tool
console.log(`Tool: ${chunk.toolName}`, chunk.input);
}
}
The fullStream gives us everything: text deltas, tool calls, finish reasons, and more. We process each chunk type differently.
Building the Agent Loop
Create src/agent/run.ts:
import { streamText, type ModelMessage } from "ai";
import { openai } from "@ai-sdk/openai";
import { getTracer } from "@lmnr-ai/lmnr";
import { tools } from "./tools/index.ts";
import { executeTool } from "./executeTool.ts";
import { SYSTEM_PROMPT } from "./system/prompt.ts";
import { Laminar } from "@lmnr-ai/lmnr";
import type { AgentCallbacks, ToolCallInfo } from "../types.ts";
// Initialize Laminar for observability (optional - traces LLM calls)
Laminar.initialize({
projectApiKey: process.env.LMNR_API_KEY,
});
const MODEL_NAME = "gpt-5-mini";
export async function runAgent(
userMessage: string,
conversationHistory: ModelMessage[],
callbacks: AgentCallbacks,
): Promise<ModelMessage[]> {
const messages: ModelMessage[] = [
{ role: "system", content: SYSTEM_PROMPT },
...conversationHistory,
{ role: "user", content: userMessage },
];
let fullResponse = "";
while (true) {
const result = streamText({
model: openai(MODEL_NAME),
messages,
tools,
experimental_telemetry: {
isEnabled: true,
tracer: getTracer(),
},
});
const toolCalls: ToolCallInfo[] = [];
let currentText = "";
for await (const chunk of result.fullStream) {
if (chunk.type === "text-delta") {
currentText += chunk.text;
callbacks.onToken(chunk.text);
}
if (chunk.type === "tool-call") {
const input = "input" in chunk ? chunk.input : {};
toolCalls.push({
toolCallId: chunk.toolCallId,
toolName: chunk.toolName,
args: input as Record<string, unknown>,
});
callbacks.onToolCallStart(chunk.toolName, input);
}
}
fullResponse += currentText;
const finishReason = await result.finishReason;
// If the LLM didn't request any tool calls, we're done
if (finishReason !== "tool-calls" || toolCalls.length === 0) {
const responseMessages = await result.response;
messages.push(...responseMessages.messages);
break;
}
// Add the assistant's response (with tool call requests) to history
const responseMessages = await result.response;
messages.push(...responseMessages.messages);
// Execute each tool and add results to message history
for (const tc of toolCalls) {
const toolResult = await executeTool(tc.toolName, tc.args);
callbacks.onToolCallEnd(tc.toolName, toolResult);
messages.push({
role: "tool",
content: [
{
type: "tool-result",
toolCallId: tc.toolCallId,
toolName: tc.toolName,
output: { type: "text", value: toolResult },
},
],
});
}
}
callbacks.onComplete(fullResponse);
return messages;
}
Let’s walk through this step by step.
Function Signature
export async function runAgent(
userMessage: string,
conversationHistory: ModelMessage[],
callbacks: AgentCallbacks,
): Promise<ModelMessage[]>
The function takes:
userMessage— The latest message from the userconversationHistory— All previous messages (for multi-turn conversations)callbacks— Functions to notify the UI about streaming tokens, tool calls, etc.
It returns the updated message history, which the caller stores for the next turn.
Message Construction
const messages: ModelMessage[] = [
{ role: "system", content: SYSTEM_PROMPT },
...conversationHistory,
{ role: "user", content: userMessage },
];
We build the full message array: system prompt, then conversation history, then the new user message. This array grows as tools are called — tool results get appended.
The Loop
while (true) {
const result = streamText({ model, messages, tools });
// ... process stream ...
if (finishReason !== "tool-calls" || toolCalls.length === 0) {
break; // LLM is done
}
// Execute tools, add results to messages, loop again
}
Each iteration:
- Sends the current messages to the LLM
- Streams the response, collecting text and tool calls
- Checks the
finishReason:"tool-calls"→ The LLM wants tools executed. Do it and loop.- Anything else (
"stop","length", etc.) → The LLM is done. Break.
Tool Execution
for (const tc of toolCalls) {
const toolResult = await executeTool(tc.toolName, tc.args);
callbacks.onToolCallEnd(tc.toolName, toolResult);
messages.push({
role: "tool",
content: [{
type: "tool-result",
toolCallId: tc.toolCallId,
toolName: tc.toolName,
output: { type: "text", value: toolResult },
}],
});
}
For each tool call:
- Execute the tool using our dispatcher from Chapter 2
- Notify the UI that the tool completed
- Add the result as a
toolmessage, linked to the originaltoolCallId
The toolCallId is critical — it tells the LLM which tool call this result belongs to. Without it, the LLM can’t match results to requests.
Callbacks
The callbacks pattern decouples the agent logic from the UI:
callbacks.onToken(chunk.text); // Stream text to UI
callbacks.onToolCallStart(name, args); // Show tool execution starting
callbacks.onToolCallEnd(name, result); // Show tool result
callbacks.onComplete(fullResponse); // Signal completion
The agent doesn’t know or care whether the UI is a terminal, a web page, or a test harness. It just calls the callbacks. This is the same pattern used by the AI SDK itself.
Testing the Loop
Let’s test with a simple script. Update src/index.ts:
import { runAgent } from "./agent/run.ts";
import type { ModelMessage } from "ai";
const history: ModelMessage[] = [];
const result = await runAgent(
"What files are in the current directory? Then read the package.json file.",
history,
{
onToken: (token) => process.stdout.write(token),
onToolCallStart: (name, args) => {
console.log(`\n[Tool] ${name}`, JSON.stringify(args));
},
onToolCallEnd: (name, result) => {
console.log(`[Result] ${name}: ${result.slice(0, 100)}...`);
},
onComplete: () => console.log("\n[Done]"),
onToolApproval: async () => true, // Auto-approve for now
},
);
console.log(`\nTotal messages: ${result.length}`);
Run it:
npm run start
You should see the agent:
- Call
listFilesto see the directory contents - Call
readFileto readpackage.json - Respond with a summary of what it found
That’s the loop in action. The LLM made two tool calls across potentially multiple loop iterations, got the results, and synthesized a coherent response.
The Message History
After the loop, the messages array looks something like:
[system] "You are a helpful AI assistant..."
[user] "What files are in the current directory? Then read..."
[assistant] (tool call: listFiles)
[tool] "[dir] node_modules\n[dir] src\n[file] package.json..."
[assistant] (tool call: readFile, text: "Let me read...")
[tool] "{ \"name\": \"agi\", ... }"
[assistant] "Your project has the following files... The package.json shows..."
This is the full conversation history. The LLM sees all of it on each iteration, which is how it maintains context. This is also why context management (Chapter 7) becomes important — this history grows with every interaction.
Error Handling
The real implementation should handle stream errors. Here’s the enhanced version with error handling:
try {
for await (const chunk of result.fullStream) {
if (chunk.type === "text-delta") {
currentText += chunk.text;
callbacks.onToken(chunk.text);
}
if (chunk.type === "tool-call") {
const input = "input" in chunk ? chunk.input : {};
toolCalls.push({
toolCallId: chunk.toolCallId,
toolName: chunk.toolName,
args: input as Record<string, unknown>,
});
callbacks.onToolCallStart(chunk.toolName, input);
}
}
} catch (error) {
const streamError = error as Error;
if (!currentText && !streamError.message.includes("No output generated")) {
throw streamError;
}
}
If the stream errors but we already have some text, we can still use it. If the error is about “no output generated” and we have nothing, we provide a fallback message. This makes the agent resilient to transient API issues.
Summary
In this chapter you:
- Built the core agent loop with streaming
- Understood the stream → detect tool calls → execute → loop pattern
- Used callbacks to decouple agent logic from UI
- Handled the message history that grows with each tool call
- Added error handling for stream failures
This is the engine of the agent. Everything else — more tools, context management, human approval — plugs into this loop. In the next chapter, we’ll build multi-turn evaluations to test the full loop.