
Architecture Overview

Amodal is a layered system. The Runtime is the agent engine (state machine, providers, tools, stores, session manager). The CLI provides the developer interface. The React SDK and chat widget embed the runtime's streaming output in web apps. Each layer has a clear boundary and a clear job.

The runtime is transport-agnostic. The same engine runs behind an HTTP server (amodal dev), embedded in your own Node.js application via createAgent(), inside an automation runner, or in a test harness — without code changes.

System Diagram

Chat Interfaces
  ├── CLI (amodal chat)
  ├── React SDK (@amodalai/react)
  └── Chat Widget (@amodalai/react/widget)
          │
          │  HTTP + SSE
          ▼
Runtime (@amodalai/runtime)
  │ HTTP server, SSE streaming, session management,
  │ state machine agent loop, tools, stores, automations
          │
          ▼
Providers (via Vercel AI SDK)
  ├── Anthropic
  ├── OpenAI
  └── Google Gemini

Packages

| Package | Role |
| --- | --- |
| @amodalai/runtime | Agent engine. State machine, provider layer, tool system, stores, session manager, HTTP server, automation runner. createAgent() is the public entry point. |
| @amodalai/core | Build utilities. loadRepo, buildSnapshot, NPM package management, knowledge base formatting, MCP manager. No agent runtime code. |
| @amodalai/types | Zero-dep shared types. AgentBundle, SSEEvent, ToolDefinition, StoreBackend, CustomToolContext, branded ID types. Safe to import from any context. |
| @amodalai/amodal (CLI) | Terminal interface. Commands for project management, chat, eval, and package install. Built with Ink. |
| @amodalai/react | React components + SSE chat client for embedding Amodal in web apps. |
| @amodalai/react/widget | Standalone embeddable chat widget (no React required on the host page). |

Data Flow: What Happens When a User Sends a Message

When a user types a message and hits enter, here is what happens — from keystroke to streamed response:

1. Client to Runtime

The chat client (CLI, web app, or widget) sends a POST to /chat/stream with the message text and an optional sessionId. If this is a new conversation, no session ID is sent. The connection stays open — the response streams back over SSE.
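The wire format is standard SSE: frames separated by blank lines, each with `event:` and `data:` fields. As a rough illustration (not the actual client from @amodalai/react — the frame shapes here are assumed), a minimal parser for a chunk of that stream might look like:

```typescript
// Minimal SSE frame parser sketch: splits a raw chunk of an SSE stream
// into { event, data } records. Simplified — real SSE allows multi-line
// data fields joined with newlines, comments, ids, and retry hints.
type SSEFrame = { event: string; data: any };

function parseSSE(chunk: string): SSEFrame[] {
  const frames: SSEFrame[] = [];
  // Frames are separated by a blank line.
  for (const block of chunk.split("\n\n")) {
    let event = "message";
    let data = "";
    for (const line of block.split("\n")) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data += line.slice(5).trim();
    }
    if (data) frames.push({ event, data: JSON.parse(data) });
  }
  return frames;
}

const frames = parseSSE(
  'event: text_delta\ndata: {"text":"Hel"}\n\n' +
    'event: done\ndata: {"usage":{"outputTokens":42}}\n\n'
);
```

In practice the browser's built-in EventSource (or a fetch-based reader, since EventSource cannot POST) handles this parsing for you.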

2. Session Manager

The StandaloneSessionManager either creates a new session or loads an existing one. For a new session, it builds the session components (provider, tool registry, permission checker, logger, compiled system prompt). For an existing session, it loads the conversation history from the session store (PGLite locally, Postgres in production). It also checks that the session has not expired.

3. Context Compilation

Before the message reaches the LLM, the context compiler assembles the system prompt. This is where the layered config becomes a single, coherent prompt:

  • Agent instructions: the agent's role definition and userContext from amodal.json
  • Skills: full body of every skill that passes requirement checks
  • Knowledge index: a compact listing of available KB documents (titles, tags, categories) so the agent can load them on demand
  • Connection surfaces: endpoints + field guidance + scope labels for every configured connection
  • Store schemas: auto-generated from each store's entity definition
  • Tool definitions: store tools, connection tools, custom tools, MCP tools, admin tools — all Vercel AI tool() definitions with Zod or JSON Schema

The compiled system prompt is cached and reused for every turn in the session (and prompt-cached with Anthropic providers to reduce cost).
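Conceptually, compilation is an ordered concatenation of non-empty layers. A minimal sketch (section names are illustrative; the real compiler in @amodalai/runtime also runs requirement checks and caching):

```typescript
// Sketch: assemble a system prompt from layered config sections,
// in the order listed above. Empty layers are skipped entirely.
type PromptSection = { title: string; body: string };

function compileSystemPrompt(sections: PromptSection[]): string {
  return sections
    .filter((s) => s.body.trim().length > 0) // skip layers with nothing to say
    .map((s) => `## ${s.title}\n${s.body}`)
    .join("\n\n");
}

const prompt = compileSystemPrompt([
  { title: "Agent instructions", body: "You are a support agent." },
  { title: "Skills", body: "" }, // no skill passed its requirement checks
  { title: "Knowledge index", body: "- Refund policy (billing)" },
]);
```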

4. The State Machine

The agent loop is an explicit state machine — thinking → streaming → executing (if tools were called) → back to thinking, until done. Each state is a discriminated-union variant with its own handler. Compaction, loop detection, sub-agent dispatch, and user confirmation are all explicit states, not if-branches in a while loop.

See State Machine for the full architecture: all six states, the transition rules, exhaustiveness checking, and how SSE events are returned as side effects of transitions.
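The discriminated-union shape can be sketched as follows — simplified to four of the states, with the transition logic and exhaustiveness check as assumptions about the pattern rather than the runtime's actual code:

```typescript
// Sketch of a discriminated-union agent state machine.
type AgentState =
  | { kind: "thinking" }
  | { kind: "streaming" }
  | { kind: "executing"; toolCalls: string[] }
  | { kind: "done" };

// One transition step. `toolCalls` is what the model emitted while streaming.
function step(state: AgentState, toolCalls: string[] = []): AgentState {
  switch (state.kind) {
    case "thinking":
      return { kind: "streaming" };
    case "streaming":
      return toolCalls.length > 0
        ? { kind: "executing", toolCalls }
        : { kind: "done" };
    case "executing":
      return { kind: "thinking" }; // tool observations feed the next turn
    case "done":
      return state;
    default: {
      const _exhaustive: never = state; // compile error if a state is missed
      return _exhaustive;
    }
  }
}
```

The `never` assignment in the default branch is the standard TypeScript exhaustiveness idiom: adding a new state variant without handling it becomes a compile-time error rather than a silent runtime gap.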

5. Tool Execution

When the model emits tool calls, the state machine enters executing. Each tool call runs through the ToolRegistry via the Vercel AI tool() interface:

  • request: Calls a connection endpoint. The PermissionChecker resolves the connection's access.json — checking ACL rules, stripping hidden fields, applying rate limits, and requiring confirmation for destructive writes.
  • query_store / write_<store> / <store>_batch: CRUD against the configured store backend (PGLite or Postgres via Drizzle).
  • dispatch_task: Spawns a sub-agent with its own isolated state machine, sharing the parent's tools and stores. Returns a compressed summary to the parent's context.
  • present: Emits a widget SSE event with structured data for the client to render inline.
  • stop_execution: Ends the current turn cleanly — useful for automations that shouldn't keep talking after completing their task.
  • web_search / fetch_url (when webTools is configured): Grounded search and URL extraction via a dedicated Gemini Flash instance, used regardless of the main model's provider. See Web Tools.
  • Custom tools: Compiled from each tool's handler.ts via esbuild, then executed with a scoped ctx containing request, store, exec, log, and env.
  • MCP tools: Proxied to the MCP server they were discovered from.
  • Admin file tools (in admin sessions only): read_repo_file, write_repo_file, edit_repo_file, delete_repo_file, list_repo_files, glob_repo_files, grep_repo_files, read_many_repo_files, internal_api — all allowlist-scoped to the agent's config directories. See Admin Agent.

Tool errors are observations the model reasons about — not exceptions that crash the loop. This is the "continue site" pattern: a tool failure becomes a tool_call_result with status: 'error', and the model decides what to do next.
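The pattern reduces to a try/catch at the tool boundary. A minimal sketch (synchronous for brevity — real tool handlers are async — and the shapes are assumptions matching the `status: 'error'` result described above):

```typescript
// "Continue site" sketch: a tool failure becomes a result the model
// observes, not an exception that escapes the agent loop.
type ToolResult =
  | { status: "ok"; output: unknown }
  | { status: "error"; message: string };

function runTool(fn: () => unknown): ToolResult {
  try {
    return { status: "ok", output: fn() };
  } catch (err) {
    // The loop keeps going; the model decides how to react to the error.
    return {
      status: "error",
      message: err instanceof Error ? err.message : String(err),
    };
  }
}

const result = runTool(() => {
  throw new Error("HTTP 429: rate limited");
});
```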

6. Response Streaming

Throughout this entire process, the runtime streams events to the client over SSE. The client receives text_delta events as the LLM generates text, tool_call_start and tool_call_result events as tools execute, subagent_event events when sub-agents are working, and a final done event with token usage when the response is complete.

The client renders these events in real time. The user sees the agent thinking, calling tools, and composing its answer — not a loading spinner followed by a wall of text.
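A client typically folds this event stream into render state with a reducer. The event names below come from this page; the payload shapes and state fields are illustrative assumptions:

```typescript
// Sketch: fold streamed chat events into client-side render state.
type ChatEvent =
  | { type: "text_delta"; text: string }
  | { type: "tool_call_start"; tool: string }
  | { type: "tool_call_result"; tool: string; status: "ok" | "error" }
  | { type: "done"; usage: { outputTokens: number } };

type RenderState = { text: string; activeTools: string[]; finished: boolean };

function reduce(state: RenderState, ev: ChatEvent): RenderState {
  switch (ev.type) {
    case "text_delta": // append to the in-progress answer
      return { ...state, text: state.text + ev.text };
    case "tool_call_start": // show a "calling tool…" indicator
      return { ...state, activeTools: [...state.activeTools, ev.tool] };
    case "tool_call_result": // clear the indicator
      return { ...state, activeTools: state.activeTools.filter((t) => t !== ev.tool) };
    case "done":
      return { ...state, finished: true };
  }
}

const events: ChatEvent[] = [
  { type: "text_delta", text: "Hi" },
  { type: "tool_call_start", tool: "request" },
  { type: "tool_call_result", tool: "request", status: "ok" },
  { type: "done", usage: { outputTokens: 3 } },
];
const finalState = events.reduce(reduce, {
  text: "",
  activeTools: [],
  finished: false,
});
```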

Deployment

Local Development

amodal dev    # starts runtime reading from local git repo
amodal chat   # connects to local runtime

The runtime reads your config directory directly from the filesystem. Changes to files are hot-reloaded — edit a skill, and the next message uses the updated version. This is the fastest feedback loop for development.

Production

For production, run the runtime as a standalone server or in a Docker container:

amodal deploy serve          # run from local config
amodal ops docker build      # build a Docker image

The git repo is the source of truth. Everything about your agent — connections, skills, knowledge, automations, and config — lives in version-controlled files.

Embedded (ISV)

For ISVs embedding Amodal in their own SaaS product, createAgent() gives you the engine as a library:

import { createAgent } from '@amodalai/runtime'
 
const agent = await createAgent({
  repoPath: './my-agent',
  provider: 'anthropic',
  apiKey: process.env.ANTHROPIC_API_KEY,
  storeBackend: myPostgresPool,    // bring your own
})

See the SDK overview for the embedded pattern.

Security Boundaries

What Runs Where, What Has Access to What

The runtime process holds the resolved config (including decrypted secrets in memory), the LLM provider API clients, and the connected system API clients. Secrets are resolved from environment variables at startup and held only in memory — they are never written to disk, logged, or included in LLM prompts.

How Secrets Flow

Secrets never appear in deployment snapshots, API responses, LLM prompts, or logs.

  1. You set secrets as environment variables in your hosting environment (Kubernetes secrets, Fly.io secrets, etc.)
  2. amodal.json references them with env:VARIABLE_NAME
  3. The runtime resolves env: references at startup, holding the values in memory
  4. The connection tool uses them for API authentication — injecting headers, signing requests — but never includes them in the context sent to the LLM
  5. The field scrubber scans all tool outputs against the connection's access.json hidden-field rules and strips them before the result reaches the model
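The scrubbing step amounts to a recursive walk that drops hidden keys at any depth. A minimal sketch — the rule format in access.json is richer than a flat field-name set, so treat this as an assumption about the shape of the operation:

```typescript
// Sketch: strip hidden fields from a tool result, recursively,
// before it reaches the model's context.
function scrub(value: unknown, hidden: Set<string>): unknown {
  if (Array.isArray(value)) return value.map((v) => scrub(v, hidden));
  if (value && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) {
      if (!hidden.has(k)) out[k] = scrub(v, hidden); // drop hidden keys at any depth
    }
    return out;
  }
  return value; // primitives pass through
}

const scrubbed = scrub(
  { id: 1, email: "a@b.com", ssn: "123-45-6789", notes: [{ ssn: "x" }] },
  new Set(["ssn"])
) as any;
```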

Role-Based Tool Filtering

Not every caller should see every tool. The runtime filters the tool list based on the caller's role before the LLM sees it:

  • User role: Sees all non-admin tools.
  • Admin role: Sees everything including file tools, connection configuration, and automation control.
  • Automation role: Sees read-only tools by default. Write tools are only available if writeEnabled is true for the specific automation.

This filtering happens at the session layer, before the prompt is compiled. The LLM never knows about tools it cannot use — they are simply absent from its tool list.
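The filtering rules above can be sketched as a simple predicate per role. The `admin`/`write` flags on tools are illustrative stand-ins for however the runtime actually marks its tools; `writeEnabled` comes from the automation config described above:

```typescript
// Sketch: filter the tool list by caller role before prompt compilation.
type Role = "user" | "admin" | "automation";
type Tool = { name: string; admin?: boolean; write?: boolean };

function toolsForRole(tools: Tool[], role: Role, writeEnabled = false): Tool[] {
  if (role === "admin") return tools; // sees everything, including file tools
  if (role === "user") return tools.filter((t) => !t.admin);
  // automation: read-only unless writes are enabled for this automation
  return tools.filter((t) => !t.admin && (!t.write || writeEnabled));
}

const tools: Tool[] = [
  { name: "query_store" },
  { name: "write_orders", write: true },
  { name: "write_repo_file", admin: true, write: true },
];
const automationTools = toolsForRole(tools, "automation").map((t) => t.name);
```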

Key Protocols

| Protocol | Where | Purpose |
| --- | --- | --- |
| SSE (Server-Sent Events) | Runtime → Client | Streaming chat responses, tool calls, widget events |
| REST | Runtime API | Session management, automation control, store/file endpoints |
| MCP (Model Context Protocol) | Runtime → MCP servers | Tool discovery and execution over stdio or SSE transport |

SSE was chosen over WebSockets for client streaming because it is simpler (unidirectional), works through more proxies and CDNs, and automatically reconnects on connection drops.

Deep Dives