LLM Tool calls | Will Schenk

https://willschenk.com/howto/2024/llm_tool_calls/ · scraped

## getting it to talk back

Continuing the exploration of Vercel's `ai` package, let's look at how to add tool calling. I'm mainly going to be testing this against ollama, but it's easy enough to point it at a different model provider without changing any of the code.
(This is a big reason why I'm using Vercel's npm package rather than hitting the APIs directly.)

Make sure that you get an ollama model that supports tools, such as `llama3.1`. Then set up the project:

```bash
pnpm i ai @ai-sdk/openai ollama-ai-provider zod
pnpm i -D @types/node tsx typescript
npx tsc --init
```

First we build something that abstracts over the model selection, so we can write things like `ollama/llama3.1` or `openai/gpt-4o`.

```typescript
// models.ts
import { createOpenAI, openai } from "@ai-sdk/openai";
import { LanguageModel } from "ai";
import { ollama } from "ollama-ai-provider";

// Turns a "provider/model" string into a model instance.
export function modelFromString(modelString: string): LanguageModel {
  const [provider, model] = modelString.split("/");
  if (provider === "ollama") {
    return ollama(model);
  } else if (provider === "openai") {
    return openai(model);
  } else if (provider === "groq") {
    // Groq exposes an OpenAI-compatible endpoint
    const groq = createOpenAI({
      baseURL: "https://api.groq.com/openai/v1",
      // apiKey: process.env.GROQ_API_KEY,
    });
    return groq(model);
  }
  throw new Error(`Unknown model provider: ${provider}`);
}
```

I'm going to wrap everything into a `Context` object, which contains the model name, the system prompt, and the running list of messages that have been exchanged with the LLM. We'll also put the tools on here, so that there is one clean place to add things. The `makeGenerator` function bundles all of that into the options object that gets passed to either `generateResponse` (waits for the complete result) or `streamResponse` (streams chunks as they arrive). `maxToolRoundtrips` is honored by `generateText`, but we will need to implement this manually for `streamText`.
```typescript
// agent.ts
import { CoreMessage, CoreTool } from "ai";
import { modelFromString } from "./models";

export interface Context {
  model: string;
  messages: CoreMessage[];
  systemPrompt: string;
  tools?: Record<string, CoreTool<any, any>>;
}

// Creates a context and seeds it with the first user message
export function startContext(
  model: string,
  systemPrompt: string,
  prompt: string
): Context {
  const ctx: Context = {
    model,
    systemPrompt,
    messages: [],
  };

  addMessage(ctx, prompt);

  return ctx;
}

export function addMessage(ctx: Context, message: string) {
  ctx.messages.push({
    role: "user",
    content: message,
  });
}

export function addTool(ctx: Context, name: string, tool: CoreTool<any, any>) {
  if (!ctx.tools) {
    ctx.tools = {};
  }
  ctx.tools[name] = tool;
}

// Bundles the context into the options for generateText / streamText
export function makeGenerator(ctx: Context) {
  return {
    model: modelFromString(ctx.model),
    tools: ctx.tools,
    system: ctx.systemPrompt,
    messages: ctx.messages,
    maxToolRoundtrips: 5, // allow up to 5 tool roundtrips
  };
}
```

Generating a response without streaming is fairly straightforward.

```typescript
// generateResponse.ts
import { generateText } from "ai";
import { Context, makeGenerator } from "./agent";

export async function generateResponse(ctx: Context) {
  const generator = makeGenerator(ctx);

  const result = await generateText(generator);

  // Keep the assistant and tool messages so the next turn has the full history
  for (const message of result.responseMessages) {
    ctx.messages.push(message);
  }

  // console.log("result", JSON.stringify(result, null, 2));
  // console.log(ctx.messages[ctx.messages.length - 1].content);

  return result;
}
```

Here we get more complicated. When you call `streamText`, it returns chunks, which we print out from the `onChunk` handler. Besides text we also get a few other chunk types: `tool-call`, which is when the LLM requests to call our tool, and `tool-result`, which is what the tool returned. This is fine, but at the end it calls `onFinish` with the results.
This isn't really what we want; we want the model to take the results and do something with them. So we check whether the finish reason is `tool-calls`, and if so we push the tool calls and their results onto the message list, call `streamResponse` again, and hope for an actual text output.

```typescript
// streamResponse.ts
import { streamText } from "ai";
import { Context, makeGenerator } from "./agent";

export async function streamResponse(ctx: Context) {
  // This prints out the results to the screen
  let length = 0;
  function onChunk({ chunk }: { chunk: any }) {
    if (chunk.type === "text-delta") {
      // Wrap at roughly 80 columns
      if (length + chunk.textDelta.length > 80) {
        process.stdout.write("\n");
        length = 0;
      }
      // Count the delta on the new line too, so the next wrap lands correctly
      length += chunk.textDelta.length;
      process.stdout.write(chunk.textDelta);
    } else if (chunk.type === "tool-call") {
      console.log("Calling tool", chunk.toolName, chunk.args);
    } else if (chunk.type === "tool-result") {
      console.log("Tool result", chunk.toolName, chunk.result);
    } else {
      console.log("unknown", chunk.type, JSON.stringify(chunk, null, 2));
    }
  }

  // Sets up the model to run again if there are tool calls
  async function onFinish({
    //@ts-ignore
    text,
    //@ts-ignore
    toolCalls,
    //@ts-ignore
    toolResults,
    //@ts-ignore
    finishReason,
    //@ts-ignore
    usage,
  }) {
    // console.log("-----");
    // console.log("finishReason", finishReason);
    // console.log("toolCalls", toolCalls);
    // console.log("toolResults", toolResults);
    // console.log("usage", usage);
    // console.log("-----");
    // console.log("text", text);

    process.stdout.write("\n");
    if (finishReason === "tool-calls") {
      // Feed the tool calls and results back and ask the model again.
      // Note that nothing here caps how many times this can recurse.
      ctx.messages.push({
        role: "assistant",
        content: toolCalls,
      });
      ctx.messages.push({
        role: "tool",
        content: toolResults,
      });

      return await streamResponse(ctx);
    } else {
      ctx.messages.push({
        role: "assistant",
        content: text,
      });
      return "done";
    }
  }

  const generator = makeGenerator(ctx);
  // @ts-ignore
  generator.onChunk = onChunk;
  // @ts-ignore
  generator.onFinish = onFinish;
  const result = await streamText(generator);

  // consume stream:
  for await (const textPart of result.textStream) {
    // Process each text part here
  }
}
```

Let's try it out! We'll define a simple `weatherTool` like so, which just returns a random temperature.

```typescript
// weatherTool.ts
import { tool } from "ai";
import { z } from "zod";
import { addTool, Context } from "./agent";

export function addWeatherTool(ctx: Context) {
  addTool(
    ctx,
    "weather",
    tool({
      description: "Get the weather in a location",
      parameters: z.object({
        location: z.string().describe("The location to get the weather for"),
      }),
      execute: async ({ location }) => {
        console.log("tool call for", location);
        return {
          location,
          // random value between 62 and 82
          temperature: 72 + Math.floor(Math.random() * 21) - 10,
        };
      },
    })
  );
}
```

And then put it together:

```typescript
// weatherToolTest.ts
import { startContext } from "./agent";
import { streamResponse } from "./streamResponse";
import { addWeatherTool } from "./weatherTool";

const ctx = startContext(
  "ollama/llama3.1",
  "You are a helpful, respectful and honest assistant.",
  "Whats the weather in Tokyo?"
);

addWeatherTool(ctx);

streamResponse(ctx).catch(console.error);
```

```plain text
Calling tool weather { location: 'Tokyo' }
tool call for Tokyo
Tool result weather { location: 'Tokyo', temperature: 77 }
The current temperature in Tokyo is 77 degrees Fahrenheit.
```

How about a tool that the LLM can call when it wishes it had a tool? We could then use that request to add the tool and rerun the chat, so maybe it'll be smarter in the future! When the meta tool is invoked, it writes a TypeScript file containing a stub for the tool the model asked for. You can then fill in the implementation.
```typescript
// metaTool.ts
import { tool } from "ai";
import { addTool, Context } from "./agent";
import { z } from "zod";
import * as fs from "fs";

export function addMetaTool(ctx: Context) {
  addTool(
    ctx,
    "meta_tool",
    tool({
      description: `Whenever the query from the user exceeds your capabilities,
        you can call this tool to request the development of another tool.
        Our team will review these logs and maybe develop the tool you
        requested so your capabilities will improve`,
      parameters: z.object({
        tool_name: z
          .string()
          .describe("name of the tool that you'd like to use"),
        tool_description: z.string().describe(
          `A description of the tool you would have liked to have to correctly
            answer to the user and how their query was exceeding your
            current capabilities`
        ),
      }),
      execute: async ({ tool_name, tool_description }) => {
        makeToolRequest(tool_name, tool_description);
        console.log("tool call for", tool_name);
        console.log("tool call for", tool_description);
        return {
          tool_name,
          tool_description,
        };
      },
    })
  );
}

// Writes a stub file for the requested tool, unless one already exists
function makeToolRequest(tool_name: string, tool_description: string) {
  console.log("templating a tool called", tool_name);
  console.log("with the following description", tool_description);

  const file = `${tool_name}.ts`;
  if (fs.existsSync(file)) {
    console.log("file already exists", file);
  } else {
    console.log("writing file", file);
    fs.writeFileSync(
      file,
      `// ${tool_name}.ts
import { tool } from "ai";
import { addTool, Context } from "./agent";
import { z } from "zod";

export function add${tool_name}(ctx: Context) {
  addTool(
    ctx,
    "${tool_name}",
    tool({
      description: "${tool_description}",
      parameters: z.object({
        argument: z
          .string()
          .describe("something something")
      }),
      execute: async ({ argument }) => {
        console.log("executing", "${tool_name}");
        return {
          argument,
        };
      },
    })
  );
}`
    );
  }
}
```

Here's a chat interface for trying it out interactively.

```typescript
// chat.ts
import { addMessage, startContext } from "./agent";
import * as readline from "node:readline/promises";
import { streamResponse } from "./streamResponse";
import { addMetaTool } from "./metaTool";

async function chat() {
  const terminal = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
  });

  // The first input starts the context
  const userInput = await terminal.question("You: ");

  const ctx = startContext(
    "ollama/llama3.1",
    "You are a helpful, respectful and honest assistant.",
    userInput
  );
  addMetaTool(ctx);

  await streamResponse(ctx);

  // Every later input is appended to the same context
  while (true) {
    const userInput = await terminal.question("You: ");

    addMessage(ctx, userInput);
    await streamResponse(ctx);
    console.log("\n");
  }
}

chat().catch(console.error);
```
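One thing to watch for in `makeToolRequest`: the tool name and description come straight from the model and are pasted into both a file name and TypeScript source. A name like `stock-price` is not a valid identifier, and a description containing a double quote or a newline produces a file that won't compile. Below is a minimal sketch of one way to harden the templating. `toIdentifier` and `renderToolStub` are hypothetical names used for illustration (they are not part of the code above or of the `ai` package), and the sanitising rule is just one reasonable choice.

```typescript
// Hypothetical helpers, not part of the original post or the `ai` package.

// Turn an arbitrary model-supplied name into a safe identifier / file stem.
function toIdentifier(name: string): string {
  // Replace anything that is not a letter, digit, or underscore.
  const cleaned = name.replace(/[^A-Za-z0-9_]/g, "_");
  // Identifiers cannot start with a digit; this also covers the empty string.
  return /^[A-Za-z_]/.test(cleaned) ? cleaned : `_${cleaned}`;
}

// Render the stub source as a string. JSON.stringify produces a valid
// double-quoted string literal, so quotes and newlines in the description
// are escaped instead of breaking the generated file.
function renderToolStub(toolName: string, toolDescription: string): string {
  const id = toIdentifier(toolName);
  const description = JSON.stringify(toolDescription);
  return `// ${id}.ts
import { tool } from "ai";
import { addTool, Context } from "./agent";
import { z } from "zod";

export function add_${id}(ctx: Context) {
  addTool(
    ctx,
    ${JSON.stringify(id)},
    tool({
      description: ${description},
      parameters: z.object({
        argument: z.string().describe("something something"),
      }),
      execute: async ({ argument }) => {
        console.log("executing", ${JSON.stringify(id)});
        return { argument };
      },
    })
  );
}
`;
}
```

`makeToolRequest` could then write `renderToolStub(tool_name, tool_description)` to `${toIdentifier(tool_name)}.ts`. Because slashes and dots are replaced as well, a name like `../../foo` can no longer point outside the project directory.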

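As noted earlier, `maxToolRoundtrips` is honored by `generateText` but has to be implemented by hand for `streamText`, and the recursive `streamResponse` does not enforce a limit: a model that keeps asking for tools would keep looping. Here is a minimal sketch of a cap, written as a plain loop so that it runs without a model. `runWithRoundtripLimit`, `StepResult` and `step` are hypothetical names; `step` stands in for one streaming call that resolves with its finish reason once the stream has been consumed, which is an assumption about how you would wire it up rather than something the `ai` package provides in this shape.

```typescript
// Hypothetical sketch, not part of the original post or the `ai` package.

// The only thing the loop needs to know about one model call.
type StepResult = { finishReason: "tool-calls" | "stop" };

// Calls `step` once, then again after every "tool-calls" finish,
// but never performs more than `maxToolRoundtrips` follow-up calls.
async function runWithRoundtripLimit(
  step: () => Promise<StepResult>,
  maxToolRoundtrips = 5
): Promise<{ roundtrips: number; stoppedEarly: boolean }> {
  let roundtrips = 0;
  while (true) {
    const { finishReason } = await step();
    if (finishReason !== "tool-calls") {
      // The model produced a normal answer, so we are done.
      return { roundtrips, stoppedEarly: false };
    }
    if (roundtrips >= maxToolRoundtrips) {
      // Budget used up: stop instead of calling the model again.
      return { roundtrips, stoppedEarly: true };
    }
    roundtrips++;
  }
}
```

With the default of 5, a model that always asks for another tool is called six times in total: the initial request plus five roundtrips. Using a loop rather than recursing from inside `onFinish` also keeps the counter in one place.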