🔄 Core Architecture

query.ts

The Execution Loop - Where Queries Come Alive

Size: 68KB
Location: src/
Compaction threshold: 80%
Language: TypeScript
01

The Query Loop

query.ts contains the main conversation execution loop. It constructs API requests, normalizes messages, manages the context token budget (auto-compacting at 80% of the context window), streams API responses, processes tool use blocks, executes tools, and tracks query analytics.

02

Execution Flow

1

Construct Messages

Builds the message array from the system prompt and conversation history, applying context window management as it goes.

2

Check Token Budget

If the context exceeds 80% of the model's window, auto-compaction kicks in and summarizes earlier messages to free space.

3

Call Claude API

Sends the request to Claude via the Anthropic SDK with streaming enabled, including tool definitions.

4

Stream Response

Processes the streaming response token by token, rendering text to the terminal in real time.

5

Execute Tools

If the model requests tool use, checks permissions and executes the tool. Results are appended to the conversation.

6

Loop or Return

If tools were executed, loop back to step 1. If the model returned text only (end_turn), return the response.
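The six steps above can be sketched as a single loop. This is a minimal illustration rather than the actual contents of query.ts: `runQueryLoop`, `CallModel`, and `RunTool` are hypothetical names, the block types are simplified versions of the Messages API shapes, and synchronous stubs stand in for the streaming API call and the tools.

```typescript
// Simplified content blocks, loosely modelled on the Anthropic Messages API.
type TextBlock = { type: "text"; text: string };
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: Record<string, unknown> };
type ToolResultBlock = { type: "tool_result"; tool_use_id: string; content: string };
type Block = TextBlock | ToolUseBlock | ToolResultBlock;
type Message = { role: "user" | "assistant"; content: Block[] };

// Hypothetical stand-ins for the API call and the tool runner (assumptions).
type ModelTurn = { content: Block[]; stopReason: "end_turn" | "tool_use" };
type CallModel = (messages: Message[]) => ModelTurn;
type RunTool = (name: string, input: Record<string, unknown>) => string;

function runQueryLoop(prompt: string, callModel: CallModel, runTool: RunTool, maxTurns = 10): string {
  const messages: Message[] = [{ role: "user", content: [{ type: "text", text: prompt }] }];

  for (let turn = 0; turn < maxTurns; turn++) {
    // Steps 1-4 collapsed into one stubbed call: build request, send, read reply.
    const reply = callModel(messages);
    messages.push({ role: "assistant", content: reply.content });

    const toolUses = reply.content.filter((b): b is ToolUseBlock => b.type === "tool_use");
    if (reply.stopReason === "end_turn" || toolUses.length === 0) {
      // Step 6: text only, so return the concatenated text.
      return reply.content
        .filter((b): b is TextBlock => b.type === "text")
        .map((b) => b.text)
        .join("");
    }

    // Step 5: run each requested tool and append the results to the conversation.
    const results = toolUses.map((t): ToolResultBlock => ({
      type: "tool_result",
      tool_use_id: t.id,
      content: runTool(t.name, t.input),
    }));
    messages.push({ role: "user", content: results });
  }
  throw new Error(`query loop exceeded ${maxTurns} turns`);
}
```

Passing the model and tool runner in as functions keeps the loop testable without network access; a real implementation would await a streaming request at the `callModel` step.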

03

Key Concepts

📦

Auto-Compaction

At 80% context usage, older messages are summarized to free token budget without losing important context.
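As a rough sketch of the idea, assuming per-message token counts are already known: `shouldCompact` applies the 80% threshold, and `compact` replaces everything except the most recent messages with one summary message. The function names, the `keepRecent` parameter, and the `Summarize` callback are illustrative assumptions, not the names used in query.ts.

```typescript
type Message = { role: "user" | "assistant"; text: string; tokens: number };

const COMPACTION_THRESHOLD = 0.8; // the 80% figure from the text

// True when the conversation uses more than 80% of the context window.
function shouldCompact(messages: Message[], contextWindow: number): boolean {
  const used = messages.reduce((sum, m) => sum + m.tokens, 0);
  return used > contextWindow * COMPACTION_THRESHOLD;
}

// Hypothetical summarizer: in practice this would itself be a model call.
type Summarize = (older: Message[]) => Message;

// Replace all but the last `keepRecent` messages with a single summary.
function compact(messages: Message[], keepRecent: number, summarize: Summarize): Message[] {
  if (messages.length <= keepRecent) return messages;
  const cut = messages.length - keepRecent;
  return [summarize(messages.slice(0, cut)), ...messages.slice(cut)];
}
```

Keeping the most recent messages verbatim is one plausible way to avoid losing the context the model is actively working with; the source does not say which messages are preserved.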

🎯

Tool Use Blocks

Claude returns structured tool_use blocks that specify which tool to call and with what parameters.
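For illustration, a tool_use block carries an `id`, the tool `name`, and an `input` object. The sketch below uses simplified local types; the `extractToolCalls` helper, the `read_file` tool name, and the id value are made up for the example.

```typescript
type TextBlock = { type: "text"; text: string };
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: Record<string, unknown> };
type ContentBlock = TextBlock | ToolUseBlock;

// Pick out the tool requests from an assistant reply, ignoring plain text.
function extractToolCalls(content: ContentBlock[]): ToolUseBlock[] {
  return content.filter((b): b is ToolUseBlock => b.type === "tool_use");
}

// An assistant reply that mixes text with one tool request (illustrative values).
const exampleReply: ContentBlock[] = [
  { type: "text", text: "I'll read that file." },
  { type: "tool_use", id: "toolu_01", name: "read_file", input: { path: "package.json" } },
];

for (const call of extractToolCalls(exampleReply)) {
  console.log(`${call.name} ${JSON.stringify(call.input)}`);
}
```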

Error Recovery

Failed tool executions are reported back to Claude, which can adapt its approach or try alternatives.
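One way to report a failure back is to catch the error and return it as a tool_result marked with `is_error`, so the loop keeps running instead of crashing. This is a sketch under that assumption; `runToolSafely` is a hypothetical helper, and the tool is passed in as a synchronous callback for brevity.

```typescript
type ToolResult = { type: "tool_result"; tool_use_id: string; content: string; is_error?: boolean };

// Run a tool and turn any thrown error into a result the model can read.
function runToolSafely(toolUseId: string, run: () => string): ToolResult {
  try {
    return { type: "tool_result", tool_use_id: toolUseId, content: run() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return {
      type: "tool_result",
      tool_use_id: toolUseId,
      content: `Tool failed: ${message}`,
      is_error: true,
    };
  }
}
```

Because the failure is ordinary conversation content, the model sees the error text on the next turn and can retry with different parameters or pick another tool.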

💰

Budget Control

maxBudgetUsd and maxTurns prevent runaway sessions from consuming too many resources.
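A minimal sketch of how such limits might be checked before each turn. Only the option names `maxBudgetUsd` and `maxTurns` come from the text; the `Usage` shape, the `checkLimits` helper, and the choice to stop once a limit is reached (rather than exceeded) are assumptions.

```typescript
type Limits = { maxBudgetUsd?: number; maxTurns?: number };
type Usage = { costUsd: number; turns: number };

// Returns a reason to stop, or null if the session may continue.
function checkLimits(usage: Usage, limits: Limits): string | null {
  if (limits.maxTurns !== undefined && usage.turns >= limits.maxTurns) {
    return `maxTurns reached (${usage.turns}/${limits.maxTurns})`;
  }
  if (limits.maxBudgetUsd !== undefined && usage.costUsd >= limits.maxBudgetUsd) {
    return `maxBudgetUsd reached ($${usage.costUsd.toFixed(2)}/$${limits.maxBudgetUsd.toFixed(2)})`;
  }
  return null;
}
```

Leaving either limit undefined disables that check, so a caller can cap turns, cost, both, or neither.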

💡 Try This Experiment

Go to the Playground and open the Query Flow Visualizer. Type "read the file package.json" and watch how tool calls are detected and executed in the loop.
