Context Management
The SDK provides several mechanisms to keep agent context clean and within limits, even during complex multi-step investigations.
Smart Compaction
When context approaches the limit, the SDK takes a structured state snapshot that preserves key findings while discarding intermediate reasoning and raw data.
What's preserved:
- Key findings and conclusions
- Active hypotheses
- Entities and relationships discovered
- User preferences from the session
What's discarded:
- Intermediate reasoning chains
- Raw tool outputs already summarized
- Superseded hypotheses
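The preserve/discard split above can be sketched as a single compaction step. This is an illustrative sketch, not the SDK's actual API: the field names (`findings`, `hypotheses`, `entities`, `preferences`) and the 80%-of-limit trigger are assumptions.

```python
import json

def estimate_tokens(obj) -> int:
    # Crude proxy: roughly 4 characters per token.
    return len(json.dumps(obj)) // 4

def compact(context: dict, token_limit: int) -> dict:
    """Snapshot the context once it approaches the limit."""
    if estimate_tokens(context) < int(token_limit * 0.8):
        return context  # still comfortably under the limit

    return {
        # Preserved: durable conclusions and session state.
        "findings": context["findings"],
        "hypotheses": [h for h in context["hypotheses"] if h.get("active")],
        "entities": context["entities"],
        "preferences": context["preferences"],
        # Discarded: reasoning chains, raw tool outputs, and superseded
        # hypotheses are simply not carried over into the snapshot.
    }
```

Note that discarding is implicit: anything not copied into the snapshot disappears, so the compactor only needs to enumerate what survives.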
Tool Output Masking
A backward-scanning mechanism prunes bulky tool outputs oldest-first (FIFO) while protecting recent context: older tool responses are truncated or removed as newer, more relevant data arrives.
The mask operates on a priority system:
- Recent tool outputs are protected
- Older outputs with summaries already incorporated are candidates for removal
- Large raw JSON responses are first to be pruned
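The priority rules above can be sketched as a backward scan over the message history. This is a hypothetical sketch, assuming a simple list-of-dicts message format; the `protect_recent` window, the `summarized` flag, and the size cutoff are illustrative, not the SDK's real parameters.

```python
def mask_tool_outputs(messages: list[dict], protect_recent: int = 3,
                      max_len: int = 500) -> list[dict]:
    """Scan backward from the newest message; protect the most recent
    tool outputs, prune older ones that are bulky or already summarized."""
    seen = 0
    out = []
    for msg in reversed(messages):
        if msg.get("role") == "tool":
            seen += 1
            outside_window = seen > protect_recent
            if outside_window and (msg.get("summarized")
                                   or len(msg["content"]) > max_len):
                # Replace the body instead of deleting the message, so the
                # conversation shape stays intact for the model.
                msg = {**msg, "content": "[output pruned: see summary]"}
        out.append(msg)
    out.reverse()
    return out
```

Keeping the pruned message as a stub (rather than dropping it) preserves turn structure, which matters for models that expect tool results to follow tool calls.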
Eager Knowledge Loading
All knowledge documents in knowledge/ are loaded into the system prompt at session start. No on-demand fetch, no "which doc should I load" decision turn — the agent sees everything from turn one. This keeps reasoning simple: a user asks about payments-api and the agent already knows the baseline, the patterns, and the team directory without spending a turn on lookup.
System prompt: [full knowledge docs inline — 3-8K tokens typical]
knowledge/environment.md: Production environment...
knowledge/baselines.md: Service baselines...
knowledge/patterns.md: Known patterns...
knowledge/team.md: Team directory...

Sub-agents share the parent's context compiler, so they also start with full knowledge loaded.
This works because realistic knowledge bases are small (~20 docs, a few thousand tokens total). If your knowledge base starts to crowd the context window, split by session type (separate repos or personas) rather than adding retrieval — lookup turns compound across every investigation and the cost outweighs the context savings.
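Eager loading amounts to concatenating every doc into the system prompt at session start. A minimal sketch, assuming knowledge docs are markdown files in a `knowledge/` directory; the function name and prompt layout are illustrative, not the SDK's actual interface.

```python
from pathlib import Path

def build_system_prompt(base: str, knowledge_dir: str = "knowledge") -> str:
    """Inline every knowledge doc into the system prompt at session start.

    Sorted iteration keeps the prompt stable across sessions, which
    helps with prompt caching.
    """
    sections = []
    for doc in sorted(Path(knowledge_dir).glob("*.md")):
        sections.append(f"## {doc.name}\n{doc.read_text()}")
    return base + "\n\n" + "\n\n".join(sections)
```

Because the whole directory is read unconditionally, there is no retrieval step to tune and no "which doc should I load" turn; the trade-off is that every doc costs tokens on every session, which is why this only works while the knowledge base stays small.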
Loop Detection
The SDK detects when an agent enters unproductive loops:
- Pattern matching — repeated identical tool calls or reasoning patterns
- LLM-based detection — the model evaluates whether it's making progress
When a loop is detected, the agent is nudged to try a different approach or escalate to the user.
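The pattern-matching half of loop detection can be sketched as a sliding-window check over recent tool calls. This is a hypothetical sketch: the `(tool, args)` tuple representation, window size, and repeat threshold are assumptions, and the LLM-based check is out of scope here.

```python
from collections import Counter

def detect_loop(tool_calls: list[tuple[str, str]], window: int = 6,
                threshold: int = 3) -> bool:
    """Flag a loop when the same (tool, args) pair repeats `threshold`
    or more times within the most recent `window` calls."""
    recent = tool_calls[-window:]
    counts = Counter(recent)
    return any(n >= threshold for n in counts.values())
```

On a positive result, the harness would inject a nudge ("try a different approach or ask the user") rather than hard-stopping the agent, matching the escalation behavior described above.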