Use TestAgent to run prompts and inspect tool calls
The TestAgent connects an LLM to your MCP tools, letting you test how models interact with your server. It handles the agentic loop (prompt → tool call → result → response) and gives you rich inspection capabilities.
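The loop can be pictured with a small self-contained sketch (the `Step` shape and function names below are illustrative stand-ins, not the SDK's actual internals):

```typescript
// Simplified sketch of the agentic loop (prompt → tool call → result → response).
// These types and names are hypothetical; the real loop lives inside the SDK.
type Step =
  | { kind: "toolCall"; name: string; args: unknown }
  | { kind: "text"; text: string };

async function agenticLoop(
  callLLM: (history: unknown[]) => Promise<Step>,
  callTool: (name: string, args: unknown) => Promise<unknown>
): Promise<string> {
  const history: unknown[] = [];
  while (true) {
    const step = await callLLM(history);
    if (step.kind === "text") return step.text; // final model response
    // The model asked for a tool: execute it over MCP, feed the result back.
    const result = await callTool(step.name, step.args);
    history.push({ call: step, result });
  }
}
```

TestAgent runs this loop for you, records each tool call along the way, and exposes what happened through the result object described below.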
Unit tests verify your tools work correctly in isolation. But in production, an LLM decides which tool to call and what arguments to pass. Testing with real LLMs catches issues like:
- Ambiguous tool descriptions that confuse the model
Send natural language prompts and inspect what happens:
```typescript
const result = await agent.prompt("What is 15 plus 27?");

// What did the model say?
console.log(result.getText());
// "15 plus 27 equals 42."

// What tools did it call?
console.log(result.toolsCalled());
// ["add"]

// What arguments did it pass?
console.log(result.getToolArguments("add"));
// { a: 15, b: 27 }
```
Every prompt returns a PromptResult with rich inspection methods:
```typescript
const result = await agent.prompt("...");

// Tool inspection
result.toolsCalled();           // string[] - names of tools called
result.hasToolCall("add");      // boolean - was this tool called?
result.getToolCalls();          // detailed tool call info
result.getToolArguments("add"); // arguments for a specific tool

// Performance
result.e2eLatencyMs(); // total time
result.llmLatencyMs(); // time in LLM API
result.mcpLatencyMs(); // time executing tools
result.totalTokens();  // tokens used

// Error handling
result.hasError(); // did something go wrong?
result.getError(); // error message if so
```
TestAgent never throws exceptions. Errors are captured in the result, making it safe to run many tests without try/catch blocks.
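Because errors surface through the result rather than as exceptions, a batch of prompts needs no try/catch at all. A minimal sketch (the interfaces below are stand-ins for the shapes used in this guide, not SDK exports):

```typescript
// Stand-in shapes for the pieces of TestAgent/PromptResult used here.
interface PromptResultLike {
  hasError(): boolean;
  getError(): string | undefined;
}

interface AgentLike {
  prompt(text: string): Promise<PromptResultLike>;
}

// Run many prompts and collect failures -- no try/catch needed,
// since prompt() never throws.
async function runAll(agent: AgentLike, prompts: string[]) {
  const failures: { prompt: string; error: string }[] = [];
  for (const p of prompts) {
    const result = await agent.prompt(p);
    if (result.hasError()) {
      failures.push({ prompt: p, error: result.getError() ?? "unknown error" });
    }
  }
  return failures;
}
```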
Pass previous results as context to maintain conversation history:
```typescript
// First turn
const r1 = await agent.prompt("Create a task called 'Buy groceries'");

// Second turn (model sees the first exchange)
const r2 = await agent.prompt("Mark it as high priority", {
  context: r1,
});

// Third turn (model sees both previous exchanges)
const r3 = await agent.prompt("Now show me all my tasks", {
  context: [r1, r2],
});
```
This is essential for testing workflows that span multiple interactions.
Use stopWhen when you want to stop the multi-step loop after a particular step completes:
```typescript
import { hasToolCall } from "@mcpjam/sdk";

// Stop after the step where the tool is called
const result = await agent.prompt("Search for open tasks", {
  stopWhen: hasToolCall("search_tasks"),
});

expect(result.hasToolCall("search_tasks")).toBe(true);
```
stopWhen does not skip tool execution. It controls whether the prompt loop continues after the current step completes, and TestAgent also applies stepCountIs(maxSteps) as a safety guard.
Use timeout when you want to bound how long TestAgent.prompt() can run:
```typescript
const result = await agent.prompt("Run a long workflow", {
  timeout: { totalMs: 10_000, stepMs: 2_500 },
});

if (result.hasError()) {
  console.error(result.getError());
}
```
timeout accepts either a plain number or an object with totalMs, stepMs, and chunkMs fields. In practice, a bare number and totalMs cap the full prompt, stepMs caps each individual step, and chunkMs mainly matters in streaming flows. The runtime creates an internal abort signal, so tools can stop early if their implementation respects the provided abortSignal.
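The accepted shapes can be written out as a type, reconstructed from the description above (the field names come from the text; the type name itself is a hypothetical label for illustration):

```typescript
// Reconstruction of the timeout option's shapes; "PromptTimeout" is a
// hypothetical name, not necessarily an SDK export.
type PromptTimeout =
  | number // shorthand: caps the full prompt, like { totalMs }
  | {
      totalMs?: number; // cap on the whole prompt() call
      stepMs?: number;  // cap on each individual step
      chunkMs?: number; // mainly relevant in streaming flows
    };

// Both of these cap the full prompt at 5 seconds:
const asNumber: PromptTimeout = 5_000;
const asObject: PromptTimeout = { totalMs: 5_000 };
```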
The SDK also exports matcher helpers for asserting on tool calls and their arguments:

```typescript
import { matchToolCalls, matchToolCallWithArgs } from "@mcpjam/sdk";

const result = await agent.prompt("Add 10 and 5");

// Check the right tool was called
expect(matchToolCalls(["add"], result.toolsCalled())).toBe(true);

// Check the arguments were correct
expect(
  matchToolCallWithArgs("add", { a: 10, b: 5 }, result.getToolCalls())
).toBe(true);
```