Engineering Standards

The standards the runtime holds itself to. If you're writing custom tool handlers, embedding the runtime via createAgent(), or extending the codebase, follow these patterns. They're not aesthetic — every one exists because we paid for the opposite in a previous version of the codebase.

Errors are values

Functions that can fail return Result<T, E> — not null, not thrown exceptions (except at module boundaries).

type Result<T, E = Error> = { ok: true; value: T } | { ok: false; error: E }

This forces the caller to handle both cases. You can't accidentally treat "not found" the same as "database is broken."
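As a rough illustration of how the type pushes callers to handle both branches, here is a minimal sketch. `LookupError`, `findUser`, and `explain` are hypothetical names invented for this example and are not part of the runtime.

```typescript
type Result<T, E = Error> = { ok: true; value: T } | { ok: false; error: E }

// Hypothetical error type: "not found" and "backend is broken" are distinct variants.
type LookupError =
  | { kind: 'not_found'; id: string }
  | { kind: 'backend_unavailable'; cause: Error }

// Hypothetical in-memory lookup. The `healthy` flag simulates a backend outage.
function findUser(
  users: Map<string, string>,
  id: string,
  healthy: boolean = true,
): Result<string, LookupError> {
  if (!healthy) {
    return {
      ok: false,
      error: { kind: 'backend_unavailable', cause: new Error('connection refused') },
    }
  }
  const name = users.get(id)
  if (name === undefined) {
    return { ok: false, error: { kind: 'not_found', id } }
  }
  return { ok: true, value: name }
}

// The caller cannot read `value` without first checking `ok`.
function explain(result: Result<string, LookupError>): string {
  if (result.ok) return `found ${result.value}`
  const err = result.error
  switch (err.kind) {
    case 'not_found':
      return `no user ${err.id}`
    case 'backend_unavailable':
      return `lookup failed: ${err.cause.message}`
  }
}
```

Because absence and outage are separate variants of the error type, the caller decides what each one means instead of guessing from a `null`.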

Never:
  • catch (e) { } — empty catch
  • catch (e) { return null } — caller can't distinguish failure from absence
  • catch (e) { log.error(e) } without re-throwing — error logged but swallowed
  • catch (e) { return [] } — empty data hiding a broken system

Four valid reasons to catch:
  1. Enrich and re-throw — add context: throw new StoreWriteError(store, id, err)
  2. Module boundary → structured error response — API routes, tool executors convert errors into the agent's observation
  3. Specific expected failure with specific handling — retries, fallbacks
  4. Cleanup: do it in finally rather than catch, so the original error still propagates

Error boundaries live at module edges — API routes, tool executors, the session manager. Not inside store backends, not inside utility functions, not inside state handlers.
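A minimal sketch of reasons 1 and 2 working together, under stated assumptions. The `StoreWriteError` constructor shape follows the call shown above, but the class body, `rawWrite`, `writeRecord`, `executeWriteTool`, and the `ToolObservation` type are all hypothetical and only illustrate where the catch blocks sit.

```typescript
// Hypothetical error class. Only the (store, id, err) argument order comes from the text above.
class StoreWriteError extends Error {
  readonly store: string
  readonly id: string
  readonly original: unknown

  constructor(store: string, id: string, original: unknown) {
    super(
      `write to store "${store}" failed for id "${id}": ` +
        (original instanceof Error ? original.message : String(original)),
    )
    this.name = 'StoreWriteError'
    this.store = store
    this.id = id
    this.original = original
  }
}

// Hypothetical low-level write, standing in for a real backend call.
function rawWrite(store: string, id: string, data: unknown): void {
  if (data === undefined) throw new Error(`nothing to write for ${store}/${id}`)
}

// Reason 1: enrich and re-throw. Context is added and nothing is swallowed.
function writeRecord(store: string, id: string, data: unknown): void {
  try {
    rawWrite(store, id, data)
  } catch (err) {
    throw new StoreWriteError(store, id, err)
  }
}

// Hypothetical observation shape returned to the agent.
type ToolObservation = { status: 'ok' } | { status: 'error'; message: string }

// Reason 2: the module boundary converts the error into a structured response.
function executeWriteTool(store: string, id: string, data: unknown): ToolObservation {
  try {
    writeRecord(store, id, data)
    return { status: 'ok' }
  } catch (err) {
    return { status: 'error', message: err instanceof Error ? err.message : String(err) }
  }
}
```

The inner function never decides what the failure means for the agent. Only the executor, at the edge, turns it into an observation.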

Async discipline

No floating promises. Every async call is awaited or explicitly voided with a .catch(). A floating promise that rejects silently is as bad as a swallowed error.

// BAD
executeStoreDirectly(storeBackend, storeName, data)
 
// GOOD
await executeStoreDirectly(storeBackend, storeName, data)
 
// GOOD — intentional fire-and-forget with error handling
void deliverResult(result).catch(err =>
  logger.error('delivery_failed', { error: err.message })
)

Timeouts on all external operations. Every provider call, MCP call, tool execution, and store operation gets an AbortSignal.timeout(). If the external system hangs, we don't hang with it.

await provider.request(url, { signal: AbortSignal.timeout(5000) })

Exhaustive switches on discriminated unions. Use the never trick so adding a new variant causes a compile error, not silent fallthrough.

switch (state.type) {
  case 'thinking':  return handleThinking(state, ctx)
  case 'streaming': return handleStreaming(state, ctx)
  // ... all cases ...
  default: {
    const _exhaustive: never = state
    throw new Error(`Unhandled state: ${(_exhaustive as AgentState).type}`)
  }
}
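For a version you can compile on its own, here is a small sketch of the same idea. `DemoState`, `assertNever`, and `summarize` are hypothetical. The two variant names are borrowed from the snippet above, but their fields are invented.

```typescript
// Hypothetical two-variant union. The real AgentState has more variants and other fields.
type DemoState =
  | { type: 'thinking'; prompt: string }
  | { type: 'streaming'; chunks: string[] }

// Accepts only `never`, so it compiles only when every variant was handled earlier.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`)
}

function summarize(state: DemoState): string {
  switch (state.type) {
    case 'thinking':
      return `thinking about "${state.prompt}"`
    case 'streaming':
      return `streamed ${state.chunks.length} chunk(s)`
    default:
      // Adding a third variant to DemoState makes this call fail to compile.
      return assertNever(state)
  }
}
```

Pulling the check into a helper keeps each switch short. The runtime throw is a backstop for values that bypass the type system, such as unvalidated JSON.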

Types as documentation

  • No any. Use unknown and narrow with type guards.
  • No as casts except at system boundaries (parsing external JSON/API responses, after validation).
  • Discriminated unions for state types (AgentState, SSEEvent, ToolResult). The type tells you what fields exist in each variant.
  • Branded types for IDs (SessionId, TenantId, ToolCallId) — prevents passing a session ID where a tenant ID is expected.
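One common way to get branded IDs is an intersection with a phantom property. The sketch below assumes that pattern; the runtime's actual definitions may differ. The `Brand` helper, the `sess_` and `ten_` prefixes, and the function names are hypothetical.

```typescript
// Phantom-property branding: the brand exists only at the type level.
type Brand<T, B extends string> = T & { readonly __brand: B }

type SessionId = Brand<string, 'SessionId'>
type TenantId = Brand<string, 'TenantId'>

// Hypothetical validators. Input is `unknown` and is narrowed before the single
// `as` cast, which happens at the boundary and only after validation.
function toSessionId(raw: unknown): SessionId {
  if (typeof raw !== 'string' || !raw.startsWith('sess_')) {
    throw new TypeError('invalid session id')
  }
  return raw as SessionId
}

function toTenantId(raw: unknown): TenantId {
  if (typeof raw !== 'string' || !raw.startsWith('ten_')) {
    throw new TypeError('invalid tenant id')
  }
  return raw as TenantId
}

function sessionLabel(id: SessionId): string {
  return `session:${id}`
}

// sessionLabel(toTenantId('ten_1'))
// would not compile: TenantId is not assignable to SessionId.
```

At runtime both IDs are plain strings, so the brand costs nothing. The mix-up is caught by the compiler.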

Logging

Logs are the runtime narrative of what happened. Use the Logger interface, never console.log, console.error, or process.stderr.write.

// BAD
console.log(`Processing tool call ${toolName} for session ${sessionId}`)
 
// GOOD
logger.info('tool_call_start', {
  tool: toolName,
  session: sessionId,
  tenant: tenantId,
})

Event names are snake_case and data goes in a structured object. Every tool call, state transition, and error emits a structured log.

Always log on tool calls: tool name, status, duration, session ID, tenant ID. On errors: what operation, what inputs, what state.

Never log raw API credentials, tokens, or full PII. Redact or mask those fields before they reach the logger.
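A minimal sketch of what redaction before logging could look like. The `Logger` shape is inferred from the `logger.info(event, data)` call above. `SENSITIVE_KEYS`, `redact`, `logToolCall`, and the `tool_call_end` event name are assumptions made for this example.

```typescript
// Assumed logger shape, inferred from the logger.info(event, data) usage above.
interface Logger {
  info(event: string, data: Record<string, unknown>): void
}

// Hypothetical list of keys to mask. Extend it for your own payloads.
const SENSITIVE_KEYS = ['authorization', 'api_key', 'apikey', 'token', 'password', 'secret']

// Shallow redaction: only top-level keys are checked, nested objects pass through as-is.
function redact(data: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {}
  for (const key of Object.keys(data)) {
    const sensitive = SENSITIVE_KEYS.indexOf(key.toLowerCase()) !== -1
    out[key] = sensitive ? '[REDACTED]' : data[key]
  }
  return out
}

// Emits the fields listed above for tool calls, with the input redacted first.
function logToolCall(
  logger: Logger,
  call: { tool: string; status: 'ok' | 'error'; durationMs: number; session: string; tenant: string },
  input: Record<string, unknown>,
): void {
  logger.info('tool_call_end', { ...call, input: redact(input) })
}
```

Key-based masking is only a starting point. It does not catch secrets embedded in free-text values or nested objects.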

Module boundaries

  • No importing from another module's internal files (../agent/internal/helper.ts from the session manager — no)
  • No accessing private fields via (obj as any).field or obj['_privateField']
  • No circular dependencies between modules
  • Each module wraps errors at its boundary with module-specific error types

Tool schemas

  • Code-defined tools (store, connection, admin): use Zod schemas. You get TypeScript type inference on the execute function.
  • External-schema tools (MCP tools, custom tools from tool.json): use jsonSchema() from the AI SDK. Pass the schema through unchanged. Converting to Zod and back is a lossy round-trip that can lose nullable, oneOf, $ref, or format constraints.

Testing

  • Integration tests > unit tests for tool execution — test the real path, not mocks.
  • Contract tests for SSE events — if an event shape changes, the test fails before the UI breaks.
  • Don't test implementation details — test public behavior. Private functions can be refactored freely.
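To make the contract-test idea concrete, here is a rough sketch. The event variants below are invented for illustration and are not the runtime's actual `SSEEvent` shapes. `SSEEventSketch`, `isRecord`, `isSSEEventSketch`, and `isValidWireEvent` are hypothetical names.

```typescript
// Hypothetical event shapes, standing in for the real SSEEvent union.
type SSEEventSketch =
  | { type: 'text_delta'; text: string }
  | { type: 'tool_result'; toolCallId: string; ok: boolean }

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null
}

// Type guard that encodes the contract the UI relies on.
function isSSEEventSketch(raw: unknown): raw is SSEEventSketch {
  if (!isRecord(raw)) return false
  switch (raw.type) {
    case 'text_delta':
      return typeof raw.text === 'string'
    case 'tool_result':
      return typeof raw.toolCallId === 'string' && typeof raw.ok === 'boolean'
    default:
      return false
  }
}

// Checks a serialized event exactly as a consumer would receive it.
// Throws on malformed JSON, which also counts as a contract failure in a test.
function isValidWireEvent(wire: string): boolean {
  const parsed: unknown = JSON.parse(wire)
  return isSSEEventSketch(parsed)
}
```

A contract test would feed the output of the real emitter through a check like `isValidWireEvent`. Renaming `text` to `content` on the server then fails the test before any UI code runs.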

What this means for custom tool handlers

When you write a handler.ts for a custom tool, the same rules apply to your handler's code:

  • Use ctx.log (structured event + data), not console.log
  • Don't catch errors to swallow them — let them propagate so the executor can turn them into a proper tool-error observation
  • Don't make fire-and-forget HTTP calls; await every request so failures reach the executor
  • Use ctx.request() for connection HTTP — it handles auth, timeouts, and permission checks

What this means for createAgent() embedding

When you embed the runtime via createAgent({ storeBackend, sessionStore, ... }):

  • Your injected StoreBackend should return Result<T, StoreError> from operations that can fail
  • Provider API keys belong in environment variables, not hardcoded
  • Set a timeout on any HTTP client you pass into the runtime
  • Your own error handler middleware catches whatever bubbles up — the runtime throws typed errors at its public boundary
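As a sketch of the first point, an injected backend that reports failures as values could look like this. The real `StoreBackend` interface and `StoreError` type are not shown on this page, so `StoreBackendSketch`, the `StoreError` variants, and `InMemoryStore` are hypothetical. The methods are synchronous here to keep the example short; a real backend would most likely return promises.

```typescript
type Result<T, E = Error> = { ok: true; value: T } | { ok: false; error: E }

// Hypothetical error variants. The runtime's StoreError may be shaped differently.
type StoreError =
  | { kind: 'not_found'; store: string; id: string }
  | { kind: 'invalid_record'; store: string; reason: string }

// Hypothetical, simplified interface for illustration only.
interface StoreBackendSketch {
  put(store: string, id: string, record: unknown): Result<void, StoreError>
  get(store: string, id: string): Result<unknown, StoreError>
}

class InMemoryStore implements StoreBackendSketch {
  private readonly records = new Map<string, unknown>()

  put(store: string, id: string, record: unknown): Result<void, StoreError> {
    if (record === undefined) {
      return { ok: false, error: { kind: 'invalid_record', store, reason: 'record is undefined' } }
    }
    this.records.set(`${store}/${id}`, record)
    return { ok: true, value: undefined }
  }

  get(store: string, id: string): Result<unknown, StoreError> {
    const key = `${store}/${id}`
    if (!this.records.has(key)) {
      // Absence is an explicit error variant, not null and not an empty value.
      return { ok: false, error: { kind: 'not_found', store, id } }
    }
    return { ok: true, value: this.records.get(key) }
  }
}
```

With this shape the backend itself needs no try/catch. Unexpected exceptions from a real driver can still be wrapped once, at the backend's own boundary.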