
Context Management

The SDK provides several mechanisms to keep agent context clean and within limits, even during complex multi-step investigations.

Smart Compaction

When context approaches the limit, the SDK performs structured state snapshots that preserve key findings while discarding intermediate reasoning and raw data.

What's preserved:

  • Key findings and conclusions
  • Active hypotheses
  • Entities and relationships discovered
  • User preferences from the session

What's discarded:

  • Intermediate reasoning chains
  • Raw tool outputs already summarized
  • Superseded hypotheses
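
A compaction step along these lines can be sketched as follows. This is a minimal illustration, not the SDK's actual API: the `StateSnapshot` fields mirror the preserved/discarded lists above, and all names here are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class StateSnapshot:
    """Hypothetical structured snapshot taken when context nears the limit."""
    findings: list[str] = field(default_factory=list)     # key findings and conclusions
    hypotheses: list[str] = field(default_factory=list)   # active hypotheses only
    preferences: list[str] = field(default_factory=list)  # user preferences from the session

def compact(messages: list[dict], snapshot: StateSnapshot) -> list[dict]:
    """Replace the transcript with a single snapshot message, discarding
    intermediate reasoning chains, raw tool outputs, and superseded hypotheses."""
    summary = "\n".join(
        ["Findings:"] + snapshot.findings
        + ["Active hypotheses:"] + snapshot.hypotheses
        + ["Preferences:"] + snapshot.preferences
    )
    # Everything not captured in the snapshot is dropped.
    return [{"role": "system", "content": summary}]
```

The key property is that compaction is lossy by design: only the snapshot survives, so anything worth keeping must be promoted into it before the limit is hit.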

Tool Output Masking

The masker scans the transcript backward from the newest message, protecting recent context while applying FIFO eviction to older tool outputs: older tool responses are truncated or removed when newer, more relevant data arrives.

The mask operates on a priority system:

  1. Recent tool outputs are protected
  2. Older outputs with summaries already incorporated are candidates for removal
  3. Large raw JSON responses are first to be pruned
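
The priority rules above can be sketched as a backward scan that spares the most recent tool outputs and truncates older, bulky ones. This is an illustrative simplification under stated assumptions (a chat-style message list, a fixed protection count, size-based truncation standing in for the full priority logic); the function and parameter names are hypothetical.

```python
def mask_tool_outputs(messages: list[dict], protect_recent: int = 3, max_len: int = 200) -> list[dict]:
    """Scan backward; keep the newest `protect_recent` tool outputs intact,
    and truncate older tool outputs whose raw payload exceeds `max_len`."""
    out = list(messages)
    seen = 0
    for i in range(len(out) - 1, -1, -1):  # newest to oldest
        msg = out[i]
        if msg.get("role") != "tool":
            continue
        seen += 1
        if seen <= protect_recent:
            continue  # recent tool outputs are protected
        body = msg["content"]
        if len(body) > max_len:  # large raw payloads are first to go
            out[i] = {**msg, "content": body[:max_len] + " [truncated]"}
    return out
```

Scanning backward makes the recency guarantee trivial to enforce: the first `protect_recent` tool messages encountered are, by construction, the newest ones.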

Eager Knowledge Loading

All knowledge documents in knowledge/ are loaded into the system prompt at session start. No on-demand fetch, no "which doc should I load" decision turn — the agent sees everything from turn one. This keeps reasoning simple: a user asks about payments-api and the agent already knows the baseline, the patterns, and the team directory without spending a turn on lookup.

System prompt: [full knowledge docs inline — 3-8K tokens typical]
  knowledge/environment.md: Production environment...
  knowledge/baselines.md: Service baselines...
  knowledge/patterns.md: Known patterns...
  knowledge/team.md: Team directory...

Sub-agents share the parent's context compiler, so they also start with full knowledge loaded.

This works because realistic knowledge bases are small (~20 docs, a few thousand tokens total). If your knowledge base starts to crowd the context window, split by session type (separate repos or personas) rather than adding retrieval — lookup turns compound across every investigation and the cost outweighs the context savings.

Loop Detection

The SDK detects when an agent enters unproductive loops:

  • Pattern matching — repeated identical tool calls or reasoning patterns
  • LLM-based detection — the model evaluates whether it's making progress

When a loop is detected, the agent is nudged to try a different approach or escalate to the user.
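
The pattern-matching half of loop detection can be sketched as counting repeated identical tool calls in a sliding window. This is an assumed shape, not the SDK's implementation; the LLM-based progress check would run alongside it, and `window`/`threshold` are hypothetical parameters.

```python
from collections import Counter

def detect_loop(tool_calls: list[tuple[str, str]], window: int = 6, threshold: int = 3) -> bool:
    """Flag a loop when the same (tool, args) pair repeats `threshold`
    times within the last `window` calls."""
    recent = tool_calls[-window:]
    counts = Counter(recent)  # how often each exact call recurs
    return any(count >= threshold for count in counts.values())
```

On a positive signal, the harness would inject a nudge message ("try a different approach or escalate") rather than terminating the agent outright.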