
Context Management

The SDK provides several mechanisms to keep agent context clean and within limits, even during complex multi-step investigations.

Smart Compaction

When context approaches the limit, the SDK performs structured state snapshots that preserve key findings while discarding intermediate reasoning and raw data.

What's preserved:

  • Key findings and conclusions
  • Active hypotheses
  • Entities and relationships discovered
  • User preferences from the session

What's discarded:

  • Intermediate reasoning chains
  • Raw tool outputs already summarized
  • Superseded hypotheses
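
A compaction step along these lines can be sketched as follows. This is a minimal illustration, not the SDK's actual API: the `StateSnapshot` fields mirror the preserved/discarded lists above, and all names here are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class StateSnapshot:
    """Hypothetical structured snapshot taken when context nears the limit."""
    findings: list[str] = field(default_factory=list)     # key findings and conclusions
    hypotheses: list[str] = field(default_factory=list)   # active hypotheses only
    preferences: list[str] = field(default_factory=list)  # user preferences from the session

def compact(messages: list[dict], snapshot: StateSnapshot) -> list[dict]:
    """Replace the transcript with a single snapshot message, discarding
    intermediate reasoning chains, raw tool outputs, and superseded hypotheses."""
    summary = "\n".join(
        ["Findings:"] + snapshot.findings
        + ["Active hypotheses:"] + snapshot.hypotheses
        + ["Preferences:"] + snapshot.preferences
    )
    # Everything not captured in the snapshot is dropped.
    return [{"role": "system", "content": summary}]
```

The key property is that compaction is lossy by design: only the snapshot survives, so anything worth keeping must be promoted into it before the limit is hit.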

Tool Output Masking

The masker scans the transcript backward from the newest message, protecting recent context while applying FIFO eviction to older tool outputs: older tool responses are truncated or removed when newer, more relevant data arrives.

The mask operates on a priority system:

  1. Recent tool outputs are protected
  2. Older outputs with summaries already incorporated are candidates for removal
  3. Large raw JSON responses are first to be pruned
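
The priority rules above can be sketched as a backward scan that spares the most recent tool outputs and truncates older, bulky ones. This is an illustrative simplification under stated assumptions (a chat-style message list, a fixed protection count, size-based truncation standing in for the full priority logic); the function and parameter names are hypothetical.

```python
def mask_tool_outputs(messages: list[dict], protect_recent: int = 3, max_len: int = 200) -> list[dict]:
    """Scan backward; keep the newest `protect_recent` tool outputs intact,
    and truncate older tool outputs whose raw payload exceeds `max_len`."""
    out = list(messages)
    seen = 0
    for i in range(len(out) - 1, -1, -1):  # newest to oldest
        msg = out[i]
        if msg.get("role") != "tool":
            continue
        seen += 1
        if seen <= protect_recent:
            continue  # recent tool outputs are protected
        body = msg["content"]
        if len(body) > max_len:  # large raw payloads are first to go
            out[i] = {**msg, "content": body[:max_len] + " [truncated]"}
    return out
```

Scanning backward makes the recency guarantee trivial to enforce: the first `protect_recent` tool messages encountered are, by construction, the newest ones.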

Eager Knowledge Loading

All knowledge documents in knowledge/ are loaded into the system prompt at session start. No on-demand fetch, no "which doc should I load" decision turn — the agent sees everything from turn one. This keeps reasoning simple: a user asks about payments-api and the agent already knows the baseline, the patterns, and the team directory without spending a turn on lookup.

System prompt: [full knowledge docs inline — 3-8K tokens typical]
  knowledge/environment.md: Production environment...
  knowledge/baselines.md: Service baselines...
  knowledge/patterns.md: Known patterns...
  knowledge/team.md: Team directory...

Sub-agents share the parent's context compiler, so they also start with full knowledge loaded.

This works because realistic knowledge bases are small (~20 docs, a few thousand tokens total). If your knowledge base starts to crowd the context window, split by session type (separate repos or personas) rather than adding retrieval — lookup turns compound across every investigation and the cost outweighs the context savings.

Loop Detection

The SDK detects when an agent enters unproductive loops:

  • Pattern matching — repeated identical tool calls or reasoning patterns
  • LLM-based detection — the model evaluates whether it's making progress

When a loop is detected, the agent is nudged to try a different approach or escalate to the user.
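
The pattern-matching half of loop detection can be sketched as counting repeated identical tool calls in a sliding window. This is an assumed shape, not the SDK's implementation; the LLM-based progress check would run alongside it, and `window`/`threshold` are hypothetical parameters.

```python
from collections import Counter

def detect_loop(tool_calls: list[tuple[str, str]], window: int = 6, threshold: int = 3) -> bool:
    """Flag a loop when the same (tool, args) pair repeats `threshold`
    times within the last `window` calls."""
    recent = tool_calls[-window:]
    counts = Counter(recent)  # how often each exact call recurs
    return any(count >= threshold for count in counts.values())
```

On a positive signal, the harness would inject a nudge message ("try a different approach or escalate") rather than terminating the agent outright.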