# The Agentic OS Protocol
> The standard for multi-agent systems. Define agents, coordinate workflows, and build systems where agents work together.
---
# Apps
> This feature is experimental. The App schema and distribution manifest format have no real implementation yet and are subject to change.
## Overview
An app is a distribution of the Agentic OS — a manifest that declares which vendors implement which protocol interfaces, forming a complete agentic system configuration. Think of it like `package.json` for Node.js, `docker-compose.yml` for containers, or `vercel.json` for deployments: a single file that describes how all the pieces fit together.
The format is YAML frontmatter for the structured, machine-readable manifest, combined with a free-form markdown body for human-readable documentation. The filename is up to the author — `SYNER.md`, `APP.md`, `ACADEMY.md`, or anything else. The format is what matters.
## Distribution File
An app file combines a YAML frontmatter manifest with a markdown documentation body:
```yaml
---
name: '@synerops/syner-os'
version: '1.0.0'
description: 'Syner OS — AI-powered development platform'
protocol: '0.1.0'
providers:
  system:
    sandbox: { provider: '@vercel/sandbox' }
    fs: { provider: '@vercel/blob' }
  context:
    embeddings: { provider: '@upstash/vector' }
---
# Syner OS
Syner OS is an AI-powered development platform...
```
The frontmatter declares the distribution metadata and provider bindings. The markdown body is free-form — documentation, usage instructions, architecture notes, or anything else relevant to the distribution.
## TypeScript API
Import types from `@osprotocol/schema/apps`:
```ts
import type { App, AppMetadata, ProviderMap, ProviderEntry } from '@osprotocol/schema/apps'
```
### ProviderEntry
A single vendor binding for a protocol interface:
```ts
interface ProviderEntry {
  provider: string
  version?: string
  enabled?: boolean
  metadata?: Record<string, unknown>
}
```
| Field | Type | Description |
| ---------- | ------------------------- | ----------------------------------------------------------- |
| `provider` | `string` | Package name or identifier of the vendor implementation |
| `version` | `string` | Optional semver range for the provider |
| `enabled`  | `boolean`                 | Whether this binding is active. Defaults to `true` if omitted |
| `metadata` | `Record<string, unknown>` | Arbitrary provider-specific configuration |
### ProviderMap
Maps protocol domains to their provider bindings, one entry per interface:
```ts
interface ProviderMap {
  system?: Record<string, ProviderEntry>
  context?: Record<string, ProviderEntry>
  actions?: Record<string, ProviderEntry>
  checks?: Record<string, ProviderEntry>
}
```
Each key within a domain corresponds to a specific protocol interface (e.g., `sandbox`, `fs`, `embeddings`), not the domain as a whole. Granularity is per interface.
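To make the granularity concrete, here is the same shape as a TypeScript literal. The interface names and packages are taken from the examples on this page; the literal itself is illustrative, not part of the schema:

```ts
import type { ProviderMap } from '@osprotocol/schema/apps'

// Keys under each domain are interface names (sandbox, fs, embeddings),
// each bound to a ProviderEntry describing the vendor implementation.
const providers: ProviderMap = {
  system: {
    sandbox: { provider: '@vercel/sandbox', version: '^1.0.0' },
    fs: { provider: '@vercel/blob' },
  },
  context: {
    embeddings: { provider: '@upstash/vector' },
  },
}
```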
### AppMetadata
The structured frontmatter of a distribution file:
```ts
interface AppMetadata {
  name: string
  version: string
  description?: string
  protocol?: string
  providers?: ProviderMap
  metadata?: Record<string, unknown>
}
```
| Field | Type | Description |
| ------------- | ------------------------- | -------------------------------------------------------- |
| `name` | `string` | Distribution name, typically a scoped package identifier |
| `version` | `string` | Semver version of this distribution |
| `description` | `string` | Human-readable description |
| `protocol` | `string` | OS Protocol spec version this distribution targets |
| `providers` | `ProviderMap` | Vendor bindings per domain and interface |
| `metadata`    | `Record<string, unknown>` | Arbitrary additional metadata |
### App
The parsed representation of a full distribution file:
```ts
interface App {
  metadata: AppMetadata
  content: string
  path: string
}
```
| Field | Type | Description |
| ---------- | ------------- | ---------------------------------------- |
| `metadata` | `AppMetadata` | Parsed frontmatter |
| `content` | `string` | Raw markdown body after the frontmatter |
| `path` | `string` | Filesystem path to the distribution file |
## Provider Bindings
The `providers` field in the frontmatter maps each protocol domain's interfaces to concrete vendor implementations. Each domain (`system`, `context`, `actions`, `checks`) contains a record where each key is a specific interface name and each value is a `ProviderEntry`.
```yaml
providers:
  system:
    env:
      provider: '@vercel/env'
    sandbox:
      provider: '@vercel/sandbox'
      version: '^1.0.0'
  context:
    embeddings: { provider: '@upstash/vector' }
  checks:
    screenshot: { provider: 'playwright' }
    judge: { provider: '@braintrust/judge' }
```
This declaration says: for the `system` domain, use `@vercel/env` for environment access and `@vercel/sandbox` for sandboxed execution; for `context`, use `@upstash/vector` for embeddings; for `checks`, use Playwright for screenshots and Braintrust for LLM-as-judge evaluation.
Provider bindings are resolved at runtime by the OS Protocol host. Interfaces with no binding fall back to any default registered by the host environment.
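How a host resolves bindings is implementation-defined. A minimal sketch of the lookup-with-fallback behavior described above, where `hostDefaults` and the `@osprotocol/fs-local` package are hypothetical stand-ins for whatever the host registers:

```ts
import type { ProviderMap, ProviderEntry } from '@osprotocol/schema/apps'

// Hypothetical host-side defaults, keyed as `${domain}.${interface}`.
const hostDefaults: Record<string, ProviderEntry> = {
  'system.fs': { provider: '@osprotocol/fs-local' },
}

// Prefer the manifest binding, skip entries disabled explicitly,
// then fall back to whatever default the host has registered.
function resolveProvider(
  providers: ProviderMap,
  domain: keyof ProviderMap,
  interfaceName: string,
): ProviderEntry | undefined {
  const declared = providers[domain]?.[interfaceName]
  if (declared && declared.enabled !== false) return declared
  return hostDefaults[`${domain}.${interfaceName}`]
}
```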
## Usage Examples
### Load an app manifest
```ts
import { readFileSync } from 'fs'
import matter from 'gray-matter'
import type { App, AppMetadata } from '@osprotocol/schema/apps'
function loadApp(filePath: string): App {
  const raw = readFileSync(filePath, 'utf-8')
  const { data, content } = matter(raw)
  return {
    metadata: data as AppMetadata,
    content,
    path: filePath,
  }
}
const app = loadApp('./SYNER.md')
console.log(app.metadata.name) // '@synerops/syner-os'
console.log(app.metadata.version) // '1.0.0'
```
### Validate provider bindings
```ts
import type { ProviderMap } from '@osprotocol/schema/apps'
function getProviderForInterface(
  providers: ProviderMap,
  domain: keyof ProviderMap,
  interfaceName: string
): string | undefined {
  return providers[domain]?.[interfaceName]?.provider
}

const sandboxProvider = getProviderForInterface(
  app.metadata.providers ?? {},
  'system',
  'sandbox'
)
// '@vercel/sandbox'
```
### Compare two distributions
```ts
import type { App } from '@osprotocol/schema/apps'
function diffProviders(a: App, b: App): string[] {
  const differences: string[] = []
  const domains = ['system', 'context', 'actions', 'checks'] as const
  for (const domain of domains) {
    const aBindings = a.metadata.providers?.[domain] ?? {}
    const bBindings = b.metadata.providers?.[domain] ?? {}
    const allInterfaces = new Set([...Object.keys(aBindings), ...Object.keys(bBindings)])
    for (const iface of allInterfaces) {
      const aProvider = aBindings[iface]?.provider
      const bProvider = bBindings[iface]?.provider
      if (aProvider !== bProvider) {
        differences.push(`${domain}.${iface}: ${aProvider ?? 'none'} → ${bProvider ?? 'none'}`)
      }
    }
  }
  return differences
}
```
## Cross-References
The app manifest format is analogous to other ecosystem configuration files:
| Format | Ecosystem | Defines |
| ---------------------------- | ----------------- | ------------------------------------------------------ |
| `package.json` | Node.js / npm | Package dependencies and scripts |
| `docker-compose.yml` | Docker | Service definitions and bindings |
| `Chart.yaml` | Helm / Kubernetes | Chart metadata and dependencies |
| `claude_desktop_config.json` | Claude Desktop | MCP server registrations |
| `mcp.json` (Cursor) | Cursor | MCP server registrations for Cursor |
| `.vscode/mcp.json` | VS Code | MCP server registrations for VS Code |
| `vercel.json` | Vercel | Deployment configuration and routing |
| App manifest (`*.md`) | OS Protocol | Agentic OS provider bindings and distribution metadata |
The app manifest occupies the same role in the Agentic OS ecosystem that these files occupy in their respective ecosystems: a single source of truth for how a system is configured and what implements each capability.
## Integration
Provider bindings in an app manifest map to the following protocol domains:
* [System](/docs/system) — environment, sandboxing, storage, file system interfaces
* [Context](/docs/context) — memory, embeddings, and knowledge retrieval interfaces
* [Actions](/docs/actions) — external integrations and side-effect interfaces
* [Checks](/docs/checks) — validation, evaluation, and quality assurance interfaces
---
# Architecture
## Overview
The Agentic OS Protocol (OSP) is built on a modular architecture that separates concerns and enables flexible composition of agent systems.
## Philosophy and Foundations
OSP doesn't create everything from scratch. Instead, it adapts proven philosophies and patterns that have demonstrated effectiveness in production environments, combined with our own contributions and experience in infrastructure design.
### Influences and Adaptations
OSP draws inspiration from several key sources:
* **[Anthropic: Building Effective Agents](https://www.anthropic.com/engineering/building-effective-agents)**: Workflow patterns (Routing, Prompt Chaining, Orchestrator-Workers, Parallelization, Evaluator-Optimizer) form the foundation of our workflow taxonomy.
* **[Agent Communication Protocol](https://agentcommunicationprotocol.dev/core-concepts/agent-run-lifecycle)**: The concept of **Runs**—essential for multi-agent systems—provides the lifecycle management framework.
* **[Claude Agent SDK](https://www.anthropic.com/engineering/building-agents-with-the-claude-agent-sdk)**: The **Agent Loop** execution pattern (Gather Context → Take Action → Verify Work → Iterate) defines our core cognitive cycle.
These patterns have been adapted and extended to work together in a unified protocol specification that emphasizes interoperability, scalability, and system-level orchestration.
### The Operating System Concept
Where OSP contributes uniquely is in the **Operating System** abstraction—a layer that provides system intelligence through standardized APIs. This OS layer manages the lifecycle, coordination, and resource management that multi-agent systems require.
> **Understanding the Metaphor**: Just as traditional operating systems abstract hardware resources (CPU, memory, disk), an Agentic OS abstracts cognitive resources (inference, context, knowledge, tools). The [Agentic OS concept](/docs/concepts/agentic-os) defines this paradigm—OSP is the protocol specification that implementations follow.
Just as traditional operating systems provide process management, memory management, and I/O interfaces, OSP's Operating System provides:
* **Agent Registry**: Discovery and capability management
* **System APIs**: Environment, Filesystem, Settings, Sandbox
* **Context Facades**: System Context, Embeddings, Key-Value
* **Quality Assurance**: Rules, Audit, Judge, Screenshot
## Architecture Layers
The protocol architecture is organized into distinct domains:
### 1. System (Infrastructure)
The OS layer provides infrastructure services that all agents depend on. Unlike traditional protocols that focus solely on agent-to-agent communication, OSP includes system-level intelligence:
* **Registry**: Manages agent registration, discovery, and capability matching
* **Environment**: Handles configuration and environment variable management
* **Filesystem**: Provides standardized file system operation interfaces
* **Sandbox**: Isolated execution environments for untrusted code
* **Settings**: Manages system-level configurations
* **Preferences**: User-level preferences and personalization
* **Installer**: Package and dependency management
* **MCP Client**: Integrates with Model Context Protocol for external tool access
### 2. Context & Actions (Read/Write Facades)
OSP separates read-only information gathering from state-changing operations. This mirrors the Agent Loop phases: **gather** context, then **take** action.
**Context** (read-only facades for the gather phase):
* **System Context**: Aggregated read-only view of system state (environment, settings, filesystem metadata)
* **Embeddings**: Vector database integration for semantic search
* **Key-Value**: Lightweight key-value storage for agent state
**Actions** (write facades for the act phase):
* **System Actions**: Aggregated write operations (filesystem writes, setting changes)
* **Tools**: Tool registration and invocation interfaces
* **MCP Servers**: External tool access through Model Context Protocol servers
### 3. Checks (Quality Assurance)
Built-in mechanisms for ensuring reliability, compliance, and quality:
* **Rules**: Behavioral constraints and validation frameworks
* **Judge**: LLM-based evaluation of agent outputs and decisions
* **Audit**: Comprehensive monitoring and logging of agent behavior
* **Screenshot**: Visual validation for UI-related tasks
### 4. Workflows (Execution Patterns)
Composable patterns for agent coordination, based on [Anthropic's building blocks for agentic systems](https://www.anthropic.com/engineering/building-effective-agents):
* **Routing**: Classify input and delegate to the appropriate handler
* **Parallelization**: Split work across parallel branches and merge results
* **Orchestrator-Workers**: Plan, delegate to workers, synthesize outputs
* **Evaluator-Optimizer**: Generate, evaluate, refine in a loop
### 5. Runs (Lifecycle Control)
Every agent execution is a **Run** with a well-defined lifecycle and control mechanisms:
* **Timeout**: Time-based execution limits
* **Retry**: Automatic retry with configurable strategies
* **Cancel**: Graceful cancellation with cleanup
* **Approval**: Human-in-the-loop gates for sensitive operations
## Core Execution Model
### The Agent Loop
At the heart of every agent is the **Agent Loop**, the cognitive cycle that drives execution: Gather Context → Take Action → Verify Work → Iterate.
This loop executes within workflows and is managed by the Operating System during the Execution phase of the Agent Lifecycle.
Learn more: [Agent Loop](/docs/concepts/agent-loop)
### The Agent Lifecycle
At the system level, agents follow a **Lifecycle** that spans their entire existence:
* **Registration**: Agents declare capabilities to the Registry
* **Discovery**: The OS matches agents to tasks based on capabilities
* **Execution**: The Agent Loop runs within workflows
* **Evaluation**: Performance, quality, and compliance are assessed
Learn more: [Agent Lifecycle](/docs/concepts/lifecycle)
## Integration Patterns
OSP is designed to integrate with existing protocols and systems:
### MCP Integration
OSP includes native support for the Model Context Protocol, allowing agents to access external tools and resources through standardized MCP servers. The OS provides MCP client functionality that any agent can leverage.
### Workflow Orchestration
The protocol defines workflow patterns that can be composed and combined. Workflows handle coordination; Runs handle lifecycle control (timeouts, retries, cancellation, approval).
Learn more: [Workflows](/docs/workflows) · [Runs](/docs/runs)
### Multi-Agent Coordination
OSP enables agents from different implementations to work together through standardized interfaces. Agents can coordinate tasks, share context (with proper isolation), and participate in distributed workflows.
## Design Philosophy
The architecture prioritizes:
* **Modularity**: Components can be used independently or together
* **Extensibility**: New components and capabilities can be added without breaking existing implementations
* **Interoperability**: Different implementations can work together through standardized contracts
* **Reliability**: Built-in quality assurance through Rules, Audit, and Judge
* **Observability**: Comprehensive monitoring, auditing, and evaluation capabilities
* **Scalability**: Designed to grow from single-agent systems to complex multi-agent environments
## Next Steps
* Learn about **[System](/docs/system)** for infrastructure services
* Understand **[Context](/docs/context)** and **[Actions](/docs/actions)** for the read/write split
* Review **[Checks](/docs/checks)** for quality assurance mechanisms
* Explore **[Workflows](/docs/workflows)** for coordination patterns
* See **[Runs](/docs/runs)** for lifecycle control
---
# Introduction
## What is the Agentic OS Protocol?
The Agentic OS Protocol (OSP) is a specification—a shared contract that defines the interfaces, behaviors, and data formats for orchestrating AI agents at scale.
Think of it as the blueprint you implement: it tells you which domains exist (System, Context, Actions, Checks, Workflows, Runs), how they interact, and what "conformant" behavior looks like. It's not a runtime or framework—you build to it.
### Protocol Domains
| Domain | Purpose |
| ---------------------------- | ---------------------------------------------------------------------------------------- |
| [System](/docs/system) | Infrastructure — registry, environment, filesystem, settings |
| [Context](/docs/context) | Read-only facades — system context, embeddings, key-value |
| [Actions](/docs/actions) | Write facades — system actions, tools, MCP servers |
| [Checks](/docs/checks) | Quality assurance — rules, judge, audit, screenshot |
| [Workflows](/docs/workflows) | Execution patterns — routing, parallelization, orchestrator-workers, evaluator-optimizer |
| [Runs](/docs/runs) | Lifecycle control — timeout, retry, cancel, approval |
## Getting Started
OSP defines the contract for building agent systems that can work together seamlessly, with standardized patterns for coordination, quality assurance, and context management. Below are the key entry points to understand and implement the protocol.
} href="/docs/architecture" title="See the big picture">
Explore the architecture, how everything fits before you dive into details.
} href="/docs/concepts/agent-loop" title="Agent Loop">
Gather Context → Take Actions → Verify Results
} href="/docs/system" title="System Intelligence">
Registry, Environment, Filesystem, Settings—the infrastructure layer.
} href="/docs/concepts/agentic-os" title="What is Agentic OS?">
The architectural paradigm where LLM functions as the Kernel of the system.
---
# Motivation
## The Challenge
As AI agents become more sophisticated and capable, we face new challenges in orchestrating, managing, and executing them at scale. Traditional approaches to agent management often fall short once systems grow beyond a single agent running in isolation.
Picture this: you start with one agent handling a simple task. It works perfectly. Then you need two agents to work together. Still manageable. But as you add more agents—each with different capabilities, running across different environments, coordinating complex workflows—things quickly become messy. What seemed simple at small scale reveals complexity you didn't anticipate.
## When Systems Grow
Orchestration, at its core, is about coordinating multiple agents to work together toward a common goal. When you have a handful of agents, coordination feels straightforward. But scale changes everything. Your agents start running across different environments—some in the cloud, others on edge devices, each with different capabilities and constraints. Coordinating them becomes a challenge in itself. Then workflows get complex: one agent's output becomes another's input, creating chains of dependencies that span multiple agents and environments. A failure in one step can cascade through the entire process. You find yourself managing not just individual agents, but intricate relationships between them—who depends on whom, what happens when something fails, how to retry, how to recover.
As your system grows, new questions emerge that traditional approaches struggle to answer. How do you ensure quality when you can't monitor everything manually? How do you maintain context when agents operate independently across sessions? How do you scale from ten agents to a thousand without everything breaking? In distributed systems where agents collaborate, quality is emergent—it's about how agents interact, not just how each performs alone. Context management becomes critical: agents need to share information but also maintain isolation. Without standardized approaches, everyone solves these problems differently. Agent platforms built by different teams can't interoperate. The ecosystem fragments, and innovation slows because everyone is reinventing the same solutions.
## Why a Protocol
This is where protocols shine. A protocol defines a shared contract—the interfaces, behaviors, and data formats that enable interoperability. Just as HTTP allows any web browser to communicate with any web server, a protocol for agent orchestration would allow different implementations to work together while remaining free to innovate in their specific domains.
A protocol, unlike a framework or library, doesn't prescribe implementation details. It defines *what* must be supported and *how* components should interact, but leaves the implementation up to you. This flexibility is crucial: teams working in different languages, under different constraints, and on different use cases can all implement the same protocol and achieve interoperability.
## What OSP Does
OSP provides standardized patterns for agent coordination, quality assurance, and resource management—proven patterns that are reusable but not prescriptive. That means providing infrastructure for common problems: agent discovery, context management, quality monitoring. It also means designing for scale from the start, so systems can grow from a single agent to complex multi-agent environments without fundamental redesigns.
With a standardized protocol, the entire ecosystem benefits. Developers can build agent systems knowing they'll interoperate with others. Teams can share agents, workflows, and patterns. Agents from different platforms can collaborate on complex tasks. Workflows can span multiple systems. The ecosystem becomes composable—you can combine agents and tools from different sources, knowing they'll work together because they follow the same protocol. This creates the foundation for systems where agents collaborate at unprecedented scale, where workflows span organizations and platforms, where the whole ecosystem is greater than the sum of its parts.
## Building in the Open
The Agentic OS Protocol is in active development, maintained by [SynerOps](https://synerops.com), and we're building it in the open.
Why? Because the protocol needs to solve real problems, work in real environments, and evolve based on how people actually use it.
We welcome contributions, feedback, and collaboration. Whether you're implementing the protocol, using it in production, researching agent systems, or just curious about what's possible—your perspective matters. Together, we're not just defining a protocol; we're shaping how agents will work together for years to come.
---
# MCP Servers
> This interface is experimental. No real implementation exists yet. The API surface may change as the MCP ecosystem and OS Protocol integration patterns mature.
## Overview
`McpServers` is the agent-facing interface for MCP-specific capabilities that go beyond tool execution. Resources (data and content exposed by a server) and prompts (reusable templates) are accessed through this interface. Tool execution from MCP servers goes through the unified Tools interface, not here. Connection management and server lifecycle are handled at the infrastructure level by `system/mcp-client`. Provider analogues include Anthropic MCP and the AAIF MCP Standard.
## Architecture
## TypeScript API
```ts
import type { McpServers, McpResource, McpPrompt } from "@osprotocol/schema/actions/mcp-servers"
```
### McpResource
Represents a data or content resource exposed by an MCP server.
```ts
interface McpResource {
  uri: string
  name: string
  mimeType?: string
  description?: string
  metadata?: Record<string, unknown>
}
```
| Field | Type | Description |
| ------------- | ------------------------- | -------------------------------------------------- |
| `uri` | `string` | Unique resource identifier within the server |
| `name` | `string` | Human-readable resource name |
| `mimeType` | `string` | Optional MIME type of the resource content |
| `description` | `string` | Optional description of what the resource contains |
| `metadata`    | `Record<string, unknown>` | Optional server-defined metadata |
### McpPrompt
Represents a reusable prompt template provided by an MCP server.
```ts
interface McpPrompt {
  name: string
  description?: string
  arguments?: object
  metadata?: Record<string, unknown>
}
```
| Field | Type | Description |
| ------------- | ------------------------- | -------------------------------------------- |
| `name` | `string` | Unique prompt name within the server |
| `description` | `string` | Optional description of the prompt's purpose |
| `arguments` | `object` | Optional argument schema for the prompt |
| `metadata`    | `Record<string, unknown>` | Optional server-defined metadata |
### McpServers
The primary interface agents use to interact with MCP server resources and prompts.
```ts
interface McpServers {
  listResources(server: string): Promise<McpResource[]>
  readResource(server: string, uri: string): Promise<string | null>
  listPrompts(server: string): Promise<McpPrompt[]>
  getPrompt(server: string, name: string, args?: Record<string, unknown>): Promise<string | null>
}
```
| Method | Description |
| -------------------------------- | ---------------------------------------------------------------- |
| `listResources(server)` | List all resources available on the given MCP server |
| `readResource(server, uri)` | Read the content of a specific resource by URI |
| `listPrompts(server)` | List all prompt templates available on the given MCP server |
| `getPrompt(server, name, args?)` | Retrieve a rendered prompt by name, optionally passing arguments |
## Usage Examples
### List and read resources from a server
```ts
const resources = await mcp.listResources("knowledge-base")
for (const resource of resources) {
  console.log(`${resource.name} (${resource.mimeType ?? "unknown type"})`)
}

const content = await mcp.readResource("knowledge-base", "docs://api-reference")
if (content) {
  // process the resource content
}
```
### Get a prompt with arguments
```ts
const rendered = await mcp.getPrompt("code-assistant", "explain-function", {
  language: "typescript",
  context: "async generator",
})
if (rendered) {
  // use the rendered prompt text in an LLM call
}
```
### Discover what an MCP server offers
```ts
const [resources, prompts] = await Promise.all([
  mcp.listResources("my-server"),
  mcp.listPrompts("my-server"),
])
console.log(`Resources: ${resources.map((r) => r.name).join(", ")}`)
console.log(`Prompts: ${prompts.map((p) => p.name).join(", ")}`)
```
## Integration
* [Tools](/docs/actions/tools) — unified interface for tool execution, including tools from MCP servers
* [MCP Client](/docs/system/mcp-client) — infrastructure-level connection management for MCP servers
* [SystemActions](/docs/actions/system) — broader system actions context
---
# System Actions
> This interface is experimental — no production implementation exists yet. The API surface may change.
## Overview
System Actions composes all system write interfaces into a single entry point for the actions phase of the agent loop. It is a pure facade — each system API owns its own Actions interface, and `SystemActions` re-exports them under a unified namespace.
The read counterpart is [System Context](/docs/context/system), which provides read-only access to the same system interfaces.
## TypeScript API
```ts
import type { SystemActions } from '@osprotocol/schema/actions/system'
```
### SystemActions
Composes all system write interfaces.
```ts
interface SystemActions {
  /** Environment variables */
  env: EnvActions
  /** System-wide settings */
  settings: SettingsActions
  /** Scoped preferences */
  preferences: PreferencesActions
  /** Resource registries */
  registry: RegistryActions
  /** Host filesystem */
  fs: FsActions
  /** Sandbox environments */
  sandbox: SandboxActions
  /** Installed packages */
  installer: InstallerActions
  /** MCP server connections */
  mcp: McpActions
}
```
Individual Actions interfaces are also re-exported:
```ts
import type {
  EnvActions,
  SettingsActions,
  PreferencesActions,
  RegistryActions,
  FsActions,
  SandboxActions,
  InstallerActions,
  McpActions,
} from '@osprotocol/schema/actions/system'
```
## Usage Examples
### Mutate system state through the facade
```ts
// Set an environment variable
await system.env.set({ key: 'DATABASE_URL', value: 'postgres://...' })
// Create a sandbox for code execution
const sandbox = await system.sandbox.create({ runtime: 'node24', timeout: 60000 })
// Install a dependency
await system.installer.install({ name: '@osprotocol/schema', version: '^0.2.0' })
```
### Use individual Actions interfaces directly
```ts
// When you only need filesystem write access
async function saveArtifact(fs: FsActions, path: string, content: string) {
await fs.write(path, content)
}
```
## Design Rationale
The agent loop enforces read/write separation by phase:
* **Context phase** → read-only (`SystemContext` in `context/system.ts`)
* **Actions phase** → write operations (`SystemActions` in `actions/system.ts`)
This zero-trust pattern ensures agents gather all context before mutating state. The facade is pure composition — it adds no logic, just groups the individual Actions interfaces for convenience.
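Because the facade is pure composition, an implementation can be a plain object literal that groups existing Actions implementations. A sketch, assuming the host environment already provides a concrete implementation for each interface:

```ts
import type { SystemActions } from '@osprotocol/schema/actions/system'

// Hypothetical concrete implementations supplied by the host environment.
declare const env: SystemActions['env']
declare const settings: SystemActions['settings']
declare const preferences: SystemActions['preferences']
declare const registry: SystemActions['registry']
declare const fs: SystemActions['fs']
declare const sandbox: SystemActions['sandbox']
declare const installer: SystemActions['installer']
declare const mcp: SystemActions['mcp']

// The facade adds no logic of its own; it only groups the individual interfaces.
const system: SystemActions = { env, settings, preferences, registry, fs, sandbox, installer, mcp }
```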
## Integration
System Actions integrates with:
* **[System Context](/docs/context/system)**: Read counterpart — same interfaces, read-only view
* **[Tools](/docs/actions/tools)**: System mutations can be exposed as agent tools
* **[Checks](/docs/checks/audit)**: Audit trails record system mutations for verification
---
# Tools
> This interface is experimental — no production implementation exists yet. The API surface may change.
## Overview
`Tools` provides a unified surface for agent tool discovery and execution. Agents call tools without needing to know whether they originate from MCP servers, built-in capabilities, or custom providers — the implementation aggregates all sources transparently. Provider analogues include Anthropic MCP Tools, OpenAI Function Calling, Vercel AI SDK Tools, and LangChain Tools.
## TypeScript API
```ts
import type { Tool, ToolResult, Tools } from "@osprotocol/schema/actions/tools"
```
### Tool
Represents a single callable tool with a name, description, parameter schema, and execute function.
```ts
interface Tool<TParams = unknown, TResult = unknown> {
  name: string
  description: string
  parameters?: object
  execute(params: TParams): Promise<TResult>
  metadata?: Record<string, unknown>
}
```
| Field | Type | Description |
| ------------- | --------------------------------------- | --------------------------------------------------- |
| `name` | `string` | Unique tool identifier |
| `description` | `string` | Human-readable description of what the tool does |
| `parameters` | `object` | Optional JSON Schema describing accepted parameters |
| `execute`     | `(params: TParams) => Promise<TResult>` | Function that performs the tool action |
| `metadata`    | `Record<string, unknown>`               | Optional extra data attached to the tool |
### ToolResult
Returned by `Tools.execute`. Wraps the result with success/error state.
```ts
interface ToolResult<T = unknown> {
  toolName: string
  result: T
  success: boolean
  error?: string
  metadata?: Record<string, unknown>
}
```
| Field | Type | Description |
| ---------- | ------------------------- | ----------------------------------------- |
| `toolName` | `string` | Name of the tool that was executed |
| `result` | `T` | The value returned by the tool |
| `success` | `boolean` | Whether execution completed without error |
| `error` | `string` | Error message if `success` is `false` |
| `metadata` | `Record<string, unknown>` | Optional extra data from the execution |
### Tools
The primary interface agents use to discover and invoke tools.
```ts
interface Tools {
  get(name: string): Promise<Tool | null>
  list(): Promise<Tool[]>
  execute(name: string, params?: unknown): Promise<ToolResult<unknown>>
}
```
| Method | Returns | Description |
| ------------------------ | ------------------------ | ------------------------------------------------------ |
| `get(name)`              | `Promise<Tool \| null>`           | Retrieve a tool by name, or `null` if not found |
| `list()`                 | `Promise<Tool[]>`                 | Return all available tools from all registered sources |
| `execute(name, params?)` | `Promise<ToolResult<unknown>>`    | Invoke a tool by name with optional parameters |
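The overview notes that tools may originate from MCP servers, built-in capabilities, or custom providers. A sketch of an aggregating implementation, where `ToolSource` is a hypothetical abstraction for anything that can enumerate `Tool` objects:

```ts
import type { Tool, ToolResult, Tools } from "@osprotocol/schema/actions/tools"

// Hypothetical source abstraction: an MCP client adapter, built-in system
// tools, or a custom provider, each able to enumerate its own Tool objects.
interface ToolSource {
  listTools(): Promise<Tool[]>
}

function createTools(sources: ToolSource[]): Tools {
  const list = async (): Promise<Tool[]> => {
    const perSource = await Promise.all(sources.map((source) => source.listTools()))
    return perSource.flat()
  }
  const get = async (name: string): Promise<Tool | null> =>
    (await list()).find((tool) => tool.name === name) ?? null

  return {
    list,
    get,
    async execute(name, params): Promise<ToolResult<unknown>> {
      const tool = await get(name)
      if (!tool) {
        return { toolName: name, result: undefined, success: false, error: `Unknown tool: ${name}` }
      }
      try {
        return { toolName: name, result: await tool.execute(params), success: true }
      } catch (error) {
        return { toolName: name, result: undefined, success: false, error: String(error) }
      }
    },
  }
}
```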
## Usage Examples
### Execute a tool
```ts
const result = await tools.execute("read_file", { path: "/data/config.json" })
if (result.success) {
console.log(result.result)
} else {
console.error(result.error)
}
```
### List available tools and find one
```ts
const allTools = await tools.list()
const searchTool = allTools.find((t) => t.name === "web_search")
if (searchTool) {
  console.log(searchTool.description)
  console.log(searchTool.parameters)
}
```
### Handle tool errors
```ts
const result = await tools.execute("send_email", {
to: "user@example.com",
subject: "Hello",
body: "Message body",
})
if (!result.success) {
// Log the error and fall back gracefully
console.error(`Tool "${result.toolName}" failed: ${result.error}`)
}
```
## Integration
* [MCP Servers](/docs/actions/mcp-servers) — tool sources exposed over the Model Context Protocol
* [MCP Client](/docs/system/mcp-client) — connects to MCP servers and surfaces their tools
* [SystemActions](/docs/actions/system) — built-in system-level actions available as tools
---
# Audit
## Overview
Agents generate formal audit reports as markdown files with YAML frontmatter. The frontmatter conforms to the `AuditEntry` schema, enabling machine-parseable compliance records.
The schema is aligned with **ISO 27001** audit reporting and **ISACA/ITAF** expression of opinion standards.
**Implementation patterns:** gray-matter, Contentlayer, Fumadocs.
**Consumers:** Drata, Scytale (compliance automation), LangSmith, Langfuse (agent observability).
## Audit Flow
## Schema
```ts
import type {
  AuditEntry,
  AuditOpinion,
  AuditFindings,
  AuditQuery,
  Audit,
} from '@osprotocol/schema/checks/audit'
```
### AuditOpinion
ISACA expression of opinion.
```ts
type AuditOpinion =
  | 'unqualified' // No significant issues, full compliance
  | 'qualified'   // Minor issues that don't affect overall compliance
  | 'adverse'     // Significant issues, non-compliant
  | 'disclaimer'  // Unable to form opinion (insufficient evidence)
```
### AuditFindings
Finding severity counts aligned with ISO 27001 non-conformity classification.
```ts
interface AuditFindings {
  critical: number // Immediate action required
  major: number    // Should be addressed soon
  minor: number    // Low risk, normal course
}
```
### AuditEntry
Schema for the YAML frontmatter in audit report files.
```ts
interface AuditEntry {
  id: string
  createdAt: number
  agentId?: string
  executionId?: string
  // ISO 27001 / ISACA fields
  objectives: string      // What the audit aims to determine
  scope: string[]         // Files, systems, or processes audited
  opinion: AuditOpinion   // Expression of opinion
  findings: AuditFindings // Counts by severity
  // Detailed results (optional)
  ruleResults?: RuleResult[]
  judgeResult?: JudgeResult
  metadata?: Record<string, unknown>
}
```
### AuditQuery
Filter criteria for querying audit entries.
```ts
interface AuditQuery {
  opinion?: AuditOpinion | AuditOpinion[]
  minCritical?: number
  minMajor?: number
  agentId?: string
  executionId?: string
  since?: number // Unix ms
  until?: number // Unix ms
}
```
### Audit
Operations for parsing, writing, and querying audit entries.
```ts
interface Audit {
  parse(content: string): AuditEntry
  write(entry: Omit<AuditEntry, 'id' | 'createdAt'>, body: string): string
  query(query: AuditQuery): Promise<AuditEntry[]>
}
```
* `parse` — Extract `AuditEntry` from file content with YAML frontmatter
* `write` — Generate file content from entry and markdown body
* `query` — Find entries matching filter criteria
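The schema does not mandate a parsing library, but the gray-matter pattern listed above maps directly onto `parse`. A sketch of `parse` plus a naive filesystem-backed `query` that handles a few of the filter fields, assuming audit reports live in a local `audits/` directory:

```ts
import { readFileSync, readdirSync } from 'fs'
import { join } from 'path'
import matter from 'gray-matter'
import type { AuditEntry, AuditQuery } from '@osprotocol/schema/checks/audit'

// Parse one audit report file: the YAML frontmatter is the AuditEntry.
function parseAudit(content: string): AuditEntry {
  const { data } = matter(content)
  return data as AuditEntry
}

// Naive query over a directory of audit reports (only some filters handled).
function queryAudits(dir: string, query: AuditQuery): AuditEntry[] {
  return readdirSync(dir)
    .filter((file) => file.endsWith('.md'))
    .map((file) => parseAudit(readFileSync(join(dir, file), 'utf-8')))
    .filter((entry) => {
      if (query.minCritical !== undefined && entry.findings.critical < query.minCritical) return false
      if (query.opinion !== undefined) {
        const opinions = Array.isArray(query.opinion) ? query.opinion : [query.opinion]
        if (!opinions.includes(entry.opinion)) return false
      }
      if (query.since !== undefined && entry.createdAt < query.since) return false
      return true
    })
}
```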
## Agentic Usage
### Prompt
```
Audit the production configuration files
```
### Output
File: `audits/2026-02-21-config-review.md`
```yaml
---
type: audit
date: 2026-02-21
agent: reviewer
objectives: "Verify configuration files follow security best practices"
scope:
  - config/production.yaml
  - config/staging.yaml
status: complete
opinion: qualified
findings:
  critical: 0
  major: 1
  minor: 2
---
## Criteria
- Security best practices
- Secret management
- Environment isolation
## Findings
### Major
**M1: Hardcoded API endpoint**
Production config contains hardcoded URL instead of environment variable...
### Minor
**m1: Missing timeout configuration**
...
## Recommendations
**M1**: Use environment variable for API endpoint
...
## Conclusion
Configuration is mostly secure but contains one hardcoded value that
should be externalized. Qualified opinion issued.
```
### Querying Audits
The frontmatter is machine-parseable:
```bash
# Find audits with critical findings
grep -l "critical: [1-9]" audits/*.md
# Find adverse opinions
grep -l "opinion: adverse" audits/*.md
```
Or programmatically via the `Audit` interface:
```ts
const critical = await audit.query({ minCritical: 1 })
const adverse = await audit.query({ opinion: 'adverse' })
```
## Standards Mapping
| AuditEntry Field | ISO 27001 | ISACA/ITAF |
| ---------------- | -------------------- | ----------------------- |
| `objectives` | Scope and Objectives | Objectives of the Audit |
| `scope` | Scope and Objectives | Scope of Engagement |
| `opinion` | Audit Conclusion | Expression of Opinion |
| `findings` | Non-Conformities | Findings with Severity |
| `ruleResults` | Evidence | Supporting Data |
| `judgeResult` | Evaluation | Quality Assessment |
## Integration
* [Rules](/docs/checks/rules) — `RuleResult[]` can be included in audit entries as evidence.
* [Judge](/docs/checks/judge) — `JudgeResult` can be included for quality assessment.
* [Screenshot](/docs/checks/screenshot) — Visual comparison results can support findings.
---
# Judge
> This interface is **experimental**. No real implementation exists yet. The API shape may change before stabilization.
## Overview
`Judge` is the checks-phase interface for LLM-as-judge evaluation. It uses a model to score agent output against quality criteria, returning a numeric score (0–1), a pass/fail result, and natural-language reasoning. Results can optionally include a per-criterion breakdown via `ruleResults`, feed into audit records, and trigger approval flows when a score falls below threshold.
Provider analogues: OpenAI Evals, Braintrust Scorers, LangSmith Evaluators, Arize Phoenix.
## Evaluation Flow
## TypeScript API
```ts
import type { JudgeConfig, JudgeResult, Judge } from '@osprotocol/schema/checks/judge'
```
### JudgeConfig
```ts
interface JudgeConfig {
  model?: string
  criteria: string
  threshold?: number
  metadata?: Record<string, unknown>
}
```
| Field | Type | Description |
| ----------- | ------------------------- | ---------------------------------------------------------------------------------------- |
| `model` | `string` | Model identifier to use as judge. Falls back to provider default when omitted. |
| `criteria` | `string` | Natural-language description of what constitutes a passing result. Required. |
| `threshold` | `number` | Minimum score (0–1) for `passed: true`. Defaults to provider-defined value when omitted. |
| `metadata`  | `Record<string, unknown>` | Arbitrary metadata attached to this evaluation run. |
### JudgeResult
```ts
interface JudgeResult {
  score: number
  passed: boolean
  reasoning: string
  ruleResults?: RuleResult[]
  metadata?: Record<string, unknown>
}
```
| Field | Type | Description |
| ------------- | ------------------------- | --------------------------------------------------- |
| `score` | `number` | Numeric quality score in the range 0–1. |
| `passed` | `boolean` | `true` when `score >= threshold`. |
| `reasoning` | `string` | Model-generated explanation for the score. |
| `ruleResults` | `RuleResult[]` | Optional per-criterion breakdown from linked rules. |
| `metadata`    | `Record<string, unknown>` | Arbitrary metadata returned by the judge. |
### Judge
```ts
interface Judge {
  evaluate(content: unknown, config: JudgeConfig): Promise<JudgeResult>
}
```
`evaluate` accepts any `content` value (string, object, or structured output) and a `JudgeConfig`, and returns a `Promise<JudgeResult>`.
## Usage Examples
### Basic evaluation
```ts
const result = await judge.evaluate(agentOutput, {
  criteria: 'The response must be factually accurate, concise, and free of harmful content.',
  threshold: 0.8,
})

console.log(result.passed)    // true | false
console.log(result.score)     // e.g. 0.92
console.log(result.reasoning) // "The response was accurate and well-scoped..."
```
### Breakdown by criteria using ruleResults
```ts
const result = await judge.evaluate(agentOutput, {
  model: 'claude-opus-4-6',
  criteria: 'Evaluate accuracy, tone, and completeness separately.',
  threshold: 0.75,
})

if (result.ruleResults) {
  for (const rule of result.ruleResults) {
    console.log(rule.ruleName, rule.passed, rule.message)
  }
}
```
### Conditional approval trigger
```ts
const result = await judge.evaluate(agentOutput, {
  criteria: 'Output must not contain PII and must follow the brand voice guide.',
  threshold: 0.9,
})

if (!result.passed) {
  // Route to human approval before publishing
  await approvalGate.request({
    reason: result.reasoning,
    score: result.score,
  })
}
```
## Rules vs Judge
| | Rules | Judge |
| ----------------- | ------------------------------------------------- | -------------------------------------------- |
| Evaluation method | Deterministic / programmatic | LLM-based qualitative evaluation |
| Output | Pass/fail per rule | Score (0–1) + reasoning |
| Best for | Schema validation, format checks, required fields | Tone, accuracy, helpfulness, nuanced quality |
| Latency | Low | Higher (model call required) |
| Cost | None | Model inference cost |
| Auditability | Exact rule match | Natural-language reasoning |
Use `Rules` for hard constraints and `Judge` for qualitative grading where human-like judgment is required.
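In practice the two are often layered: run the cheap deterministic rules first, and only spend a model call on the judge once the hard constraints pass. A sketch combining the two interfaces (the `rules` and `judge` instances are assumed to be provided by the host):

```ts
import type { Rules } from '@osprotocol/schema/checks/rules'
import type { Judge } from '@osprotocol/schema/checks/judge'

// Assumed to be provided by the host environment.
declare const rules: Rules
declare const judge: Judge

async function checkOutput(agentOutput: unknown) {
  // Hard constraints first: any error-severity failure blocks acceptance.
  const ruleResults = await rules.evaluate(agentOutput)
  if (ruleResults.some((r) => !r.passed && r.severity === 'error')) {
    return { accepted: false, ruleResults }
  }

  // Qualitative grading second: only pay for a model call once rules pass.
  const judgeResult = await judge.evaluate(agentOutput, {
    criteria: 'The response is accurate, on-topic, and follows the style guide.',
    threshold: 0.8,
  })
  return { accepted: judgeResult.passed, ruleResults, judgeResult }
}
```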
## Integration
* [Rules](/docs/checks/rules) — deterministic checks whose results can be surfaced as `ruleResults` inside a `JudgeResult`
* [Audit](/docs/checks/audit) — `JudgeResult` records are written to the audit log for traceability
* [Approval](/docs/runs/approval) — when `passed` is `false`, evaluation results can trigger a human approval gate before execution continues
---
# Rules
> Rules is an experimental interface. No implementation exists yet. The API described here reflects the current design and is subject to change as the protocol evolves.
## Overview
Rules define declarative verification criteria that agent output must satisfy before being accepted. They are composable: rules can be evaluated individually or as a complete set against any content. Provider analogues include ESLint Rules, GitHub Checks, Vercel Deployment Checks, and OpenAI Guardrails.
## Severity Levels
| Severity | Meaning |
| --------- | ------------------------------------------------------------- |
| `error` | The rule failure blocks acceptance. Output must not proceed. |
| `warning` | The rule failure is notable but does not block acceptance. |
| `info` | The rule result is informational only. No action is required. |
## TypeScript API
```ts
import type { RuleSeverity, RuleResult, Rule, Rules } from '@osprotocol/schema/checks/rules'
```
### RuleSeverity
```ts
type RuleSeverity = 'error' | 'warning' | 'info'
```
Indicates how a rule failure should be treated. An `error` blocks acceptance, a `warning` is surfaced without blocking, and `info` is purely observational.
### RuleResult
```ts
interface RuleResult {
  ruleName: string
  passed: boolean
  severity: RuleSeverity
  message: string
  metadata?: Record<string, unknown>
}
```
The result returned after evaluating a single rule against content. `passed` indicates whether the rule was satisfied. `message` provides a human-readable explanation. `metadata` carries any structured diagnostic data the rule chooses to emit.
### Rule
```ts
interface Rule {
  name: string
  description: string
  severity: RuleSeverity
  evaluate(content: unknown): Promise<RuleResult>
  metadata?: Record<string, unknown>
}
```
A single verifiable criterion. `evaluate` receives the content to check and returns a `RuleResult`. The `severity` on the `Rule` defines the default severity that should appear in results when the rule fails.
### Rules
```ts
interface Rules {
  get(name: string): Promise<Rule | null>
  list(): Promise<Rule[]>
  evaluate(content: unknown): Promise<RuleResult[]>
}
```
A collection of rules. `list` enumerates all registered rules. `get` retrieves a specific rule by name. `evaluate` runs all rules against the provided content and returns a `RuleResult` for each one.
## Usage Examples
### Evaluate all rules against content
```ts
const results = await rules.evaluate(agentOutput)

for (const result of results) {
  if (!result.passed && result.severity === 'error') {
    throw new Error(`Rule failed: ${result.ruleName} — ${result.message}`)
  }
}
```
### Get a specific rule by name
```ts
const rule = await rules.get('no-pii-in-output')
if (rule) {
  const result = await rule.evaluate(agentOutput)
  console.log(result.passed, result.message)
}
```
### Define a custom rule
```ts
const noPiiRule: Rule = {
  name: 'no-pii-in-output',
  description: 'Ensures agent output does not contain personally identifiable information',
  severity: 'error',
  async evaluate(content: unknown): Promise<RuleResult> {
    const text = typeof content === 'string' ? content : JSON.stringify(content)
    const hasPii = /\b\d{3}-\d{2}-\d{4}\b/.test(text) // SSN pattern example
    return {
      ruleName: 'no-pii-in-output',
      passed: !hasPii,
      severity: 'error',
      message: hasPii ? 'Output contains potential PII' : 'No PII detected',
    }
  },
}
```
## Integration
Rule results produced by `Rules.evaluate` feed into other parts of the checks and runs pipeline:
* [Judge](/docs/checks/judge) — uses rule results alongside other signals to produce a quality verdict
* [Audit](/docs/checks/audit) — records rule results for traceability and post-hoc review
* [Approval](/docs/runs/approval) — an `error`-severity failure can gate a run and trigger a human approval step
---
# Screenshot
> This interface is experimental. No implementation exists yet. The API shape may change before stabilization.
## Overview
Screenshot provides visual capture and baseline comparison for visual regression detection within the checks phase. Providers include Playwright, Puppeteer, Browserbase, and ScreenshotOne. Comparison results feed into the audit trail alongside rule and judge results, giving a complete picture of agent output quality.
## Capture and compare flow
## TypeScript API
```ts
import type {
  ImageFormat,
  ScreenshotOptions,
  ScreenshotEntry,
  ComparisonResult,
  Screenshot,
} from '@osprotocol/schema/checks/screenshot'
```
### ImageFormat
```ts
type ImageFormat = 'png' | 'jpeg' | 'webp'
```
PNG is lossless and best suited for pixel diffing. JPEG and WebP produce smaller payloads when exact pixel fidelity is not required.
### ScreenshotOptions
```ts
interface ScreenshotOptions {
  url?: string
  fullPage?: boolean
  clip?: {
    x: number
    y: number
    width: number
    height: number
  }
  selector?: string
  format?: ImageFormat
  quality?: number
  scale?: number
  omitBackground?: boolean
  metadata?: Record<string, unknown>
}
```
| Field | Description |
| ---------------- | -------------------------------------------------------- |
| `url` | Page URL to navigate to before capturing |
| `fullPage` | Capture the full scrollable page instead of the viewport |
| `clip` | Restrict capture to a bounding box in pixels |
| `selector` | CSS selector — captures only the matching element |
| `format` | Output image format (`png`, `jpeg`, `webp`) |
| `quality` | Compression quality for JPEG and WebP (0–100) |
| `scale` | Device pixel ratio multiplier |
| `omitBackground` | Make the background transparent (PNG only) |
| `metadata` | Arbitrary key-value pairs attached to the entry |
### ScreenshotEntry
```ts
interface ScreenshotEntry {
  id: string
  data: string
  format: ImageFormat
  width: number
  height: number
  createdAt: number
  metadata?: Record<string, unknown>
}
```
`data` is a base64-encoded image string. `createdAt` is a Unix timestamp in milliseconds.
### ComparisonResult
```ts
interface ComparisonResult {
  passed: boolean
  message: string
  diffPixels: number
  diffRatio: number
  diffImage?: string
  metadata?: Record<string, unknown>
}
```
`passed` and `message` follow the same convention as `RuleResult`, so comparison results compose naturally into the checks audit trail. `diffImage` is an optional base64-encoded visualization of the pixel diff.
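Because the fields line up, a comparison can be wrapped as a `RuleResult` before it joins the audit trail. A sketch; the rule name and the `error` severity are arbitrary choices, not part of the schema:

```ts
import type { ComparisonResult } from '@osprotocol/schema/checks/screenshot'
import type { RuleResult } from '@osprotocol/schema/checks/rules'

// Wrap a visual comparison as a RuleResult so it composes with other checks.
function toRuleResult(comparison: ComparisonResult, ruleName = 'visual-regression'): RuleResult {
  return {
    ruleName,
    passed: comparison.passed,
    severity: 'error',
    message: comparison.message,
    metadata: { diffPixels: comparison.diffPixels, diffRatio: comparison.diffRatio },
  }
}
```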
### Screenshot
```ts
interface Screenshot {
  capture(options?: ScreenshotOptions): Promise<ScreenshotEntry>
  compare(
    actual: ScreenshotEntry,
    baseline: ScreenshotEntry,
    threshold?: number,
  ): Promise<ComparisonResult>
}
```
`capture` maps to provider-native methods: `page.screenshot` in Playwright and Puppeteer, `Page.captureScreenshot` via CDP in Browserbase, and `GET /take?url=...` in ScreenshotOne. `compare` uses pixel diffing — only Playwright has this built in via `toHaveScreenshot` (Pixelmatch). For all other providers the adapter handles comparison externally.
`threshold` is a ratio between 0 and 1 representing the maximum acceptable pixel difference before `passed` becomes `false`.
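For providers without a built-in comparison, the adapter can decode the images and diff them directly. A sketch assuming `pngjs` and `pixelmatch` as dependencies and PNG-format entries (neither library is mandated by the protocol):

```ts
import { PNG } from 'pngjs'
import pixelmatch from 'pixelmatch'
import type { ScreenshotEntry, ComparisonResult } from '@osprotocol/schema/checks/screenshot'

// Pixel-diff two base64-encoded PNG entries with matching dimensions.
// Note: pixelmatch's own `threshold` option is a per-pixel color sensitivity,
// distinct from the ratio threshold defined by the protocol interface.
function comparePng(
  actual: ScreenshotEntry,
  baseline: ScreenshotEntry,
  threshold = 0.01,
): ComparisonResult {
  const a = PNG.sync.read(Buffer.from(actual.data, 'base64'))
  const b = PNG.sync.read(Buffer.from(baseline.data, 'base64'))
  const diff = new PNG({ width: a.width, height: a.height })

  const diffPixels = pixelmatch(a.data, b.data, diff.data, a.width, a.height, { threshold: 0.1 })
  const diffRatio = diffPixels / (a.width * a.height)
  const passed = diffRatio <= threshold

  return {
    passed,
    message: passed
      ? 'Screenshots match within threshold'
      : `Diff ratio ${diffRatio.toFixed(4)} exceeds threshold ${threshold}`,
    diffPixels,
    diffRatio,
    diffImage: PNG.sync.write(diff).toString('base64'),
  }
}
```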
## Usage examples
### Capture a full-page screenshot
```ts
const entry = await screenshot.capture({
  url: 'https://example.com',
  fullPage: true,
  format: 'png',
})
```
### Visual regression test against a baseline
```ts
const actual = await screenshot.capture({ url: 'https://example.com' })
const result = await screenshot.compare(actual, baseline, 0.01)

if (!result.passed) {
  console.log(result.message)
  console.log(`Diff: ${result.diffPixels} pixels (${result.diffRatio * 100}%)`)
}
```
### Capture a specific element
```ts
const entry = await screenshot.capture({
  url: 'https://example.com/dashboard',
  selector: '#revenue-chart',
  format: 'png',
  omitBackground: true,
})
```
## Integration
* [Audit](/docs/checks/audit) — screenshot entries and comparison results are attached to the audit trail
* [Sandbox](/docs/system/sandbox) — browser-based captures run inside isolated sandbox environments
* [Rules](/docs/checks/rules) — rule results and screenshot comparison results compose into a unified checks report
---
# Agent Loop
## Overview
The Agent Loop is the fundamental execution pattern that all agent implementations MUST support. It defines the iterative cycle through which agents gather context, take actions, verify their work, and iterate until completion.
## The Four Steps
### 1. Gather Context
Agents collect information needed to complete their task through:
* **Agentic search**: File systems, grep, tail, structured queries
* **Semantic search**: Vector embeddings for concept-based queries
* **Subagents**: Isolated context windows for parallel information gathering
* **Context compaction**: Summarization for long-running agents
Learn more: [Context Management](/docs/context)
### 2. Take Action
Agents execute operations using:
* **Tools**: Primary building blocks with clear interfaces
* **Bash/Scripts**: Command execution and automation
* **Code Generation**: Dynamic code creation and execution
* **MCP Integration**: Standardized protocol for external services
Learn more: [Actions](/docs/actions/tools)
### 3. Verify Work
Agents validate outputs through:
* **Rules-based validation**: Defined criteria and constraints
* **Visual feedback**: Screenshots and renders for UI tasks
* **LLM-as-judge**: Model-based evaluation
Learn more: [Checks](/docs/checks/rules)
### 4. Iterate
The loop repeats until:
* Task completion criteria are met
* Iteration limits are reached
* Termination conditions are triggered
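As a rough sketch, the four steps translate into a loop of this shape; `gatherContext`, `takeAction`, `verifyWork`, and the iteration limit are hypothetical placeholders, not protocol APIs:

```ts
// Placeholder capability declarations; real agents wire these to the
// Context, Actions, and Checks interfaces defined by the protocol.
declare function gatherContext(task: string): Promise<unknown>
declare function takeAction(task: string, context: unknown): Promise<unknown>
declare function verifyWork(output: unknown): Promise<{ passed: boolean }>

async function agentLoop(task: string, maxIterations = 10): Promise<unknown> {
  for (let i = 0; i < maxIterations; i++) {
    const context = await gatherContext(task)      // 1. Gather Context
    const output = await takeAction(task, context) // 2. Take Action
    const verdict = await verifyWork(output)       // 3. Verify Work
    if (verdict.passed) return output              // Completion criteria met
  }
  throw new Error('Iteration limit reached')       // 4. Iterate: termination condition
}
```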
## Cognitive Micro-Pattern
The loop maps to an internal cognitive cycle:
1. **Think/Reason** → Plan next action (Gather Context)
2. **Act** → Execute tools (Take Action)
3. **Observe** → Process results (Verify Work)
4. **Reflect** → Evaluate progress (Verify Work)
5. **Decide** → Continue or stop (Iterate)
Reference: [Anthropic: Building Agents with Claude Agent SDK](https://www.anthropic.com/engineering/building-agents-with-the-claude-agent-sdk)
## Next Steps
* Understand how the loop fits into the **[Agent Lifecycle](/docs/concepts/lifecycle)**
* Explore **[Workflow Patterns](/docs/concepts/workflows-taxonomy)** that orchestrate the loop
* Read the full specification in [AGENTS.md Section 2](https://github.com/synerops/osprotocol/blob/main/AGENTS.md#2-core-execution-pattern-agent-loop)
---
# Agentic OS
## Overview
An **Agentic OS** is a design paradigm where the **Large Language Model (LLM)** functions conceptually as the **Kernel** of the system. Unlike traditional operating systems designed for human interaction, an Agentic OS is a **backend infrastructure layer** that manages the lifecycle and resources of autonomous software agents.
This concept is foundational to understanding the Agentic OS Protocol (OSP). OSP defines the standardized interfaces and behaviors that implementations of an Agentic OS must follow—the contract that enables different agent systems to interoperate.
## Resource Abstraction
Just as traditional operating systems abstract hardware resources (CPU, memory, disk, devices), an Agentic OS abstracts the cognitive resources of AI systems. The Agentic OS manages cognitive resources just as a traditional kernel manages physical hardware:
* **CPU Cycles → Inference / Tokens**: Managing the compute required for reasoning and generation
* **RAM (Memory) → Context Window**: Managing the finite amount of information active in the model's immediate attention
* **Disk / Filesystem → Vector Store / RAG**: Managing long-term retrieval and persistent knowledge
* **Device Drivers → Tools / MCP**: Standardizing interfaces for external interaction (APIs, browsers, code execution)
* **Process Scheduler → Agent Orchestrator**: Determining which agent runs when, and for how long
Learn more: [System Intelligence](/docs/system/registry)
## The "User" of the OS
In this paradigm, **the "User" of the Operating System is the Agent itself**, not the human.
* The **Agent** requests resources from the OS ("I need to read this file", "I need to store this memory")
* The **OS** enforces permissions, manages limits, and provides the requested capabilities
* The **Human** acts as the external administrator or the user of the *application* built on top of the OS, but does not interact with the Agentic OS layer directly
This distinction is critical: an Agentic OS is the invisible infrastructure that enables complex, multi-agent systems to function reliably at scale.
## Scope and Purpose
The Agentic OS solves **Orchestration Complexity**, not User Experience. Its primary goals are:
1. **Context Hygiene:** Preventing context pollution and managing finite window sizes
2. **Process Isolation:** Ensuring agents operate within defined boundaries without interfering with each other
3. **Inter-Process Communication:** Enabling standardized communication between disparate agents
These goals align directly with the challenges outlined in our [Motivation](/docs/motivation): as systems grow, managing context, isolation, and communication becomes increasingly complex. The Agentic OS provides the infrastructure layer that addresses these challenges systematically.
Learn more: [Motivation](/docs/motivation) | [Architecture](/docs/architecture)
## Agentic OS vs OSP
It's important to understand the distinction:
* **Agentic OS** is the conceptual paradigm—the architectural metaphor
* **OSP (Agentic OS Protocol)** is the specification—the standardized contract that implementations must follow
Just as "operating system" describes a category of software (Linux, Windows, macOS), "Agentic OS" describes a category of systems that manage agent resources. OSP defines the protocol specification that different implementations can follow to achieve interoperability.
Think of it this way: Linux and Windows are both operating systems, but they follow different architectures. Multiple implementations can follow OSP and each be an "Agentic OS" with different internal designs—but they'll all interoperate because they follow the same protocol contract.
## How OSP Implements the Agentic OS
OSP defines the standardized interfaces and behaviors that make an Agentic OS possible:
* **[System](/docs/system)**: Registry, Environment, Filesystem, Sandbox, Settings, Preferences, Installer, MCP Client — the infrastructure layer
* **[Context](/docs/context)**: System Context, Embeddings, Key-Value — read-only facades for the gather phase
* **[Actions](/docs/actions)**: System Actions, Tools, MCP Servers — write facades for the act phase
* **[Checks](/docs/checks/rules)**: Rules, Judge, Audit, Screenshot — verification and quality assurance
These components work together to provide the resource abstraction, process isolation, and inter-process communication that define an Agentic OS.
Learn more: [Architecture](/docs/architecture)
## Next Steps
* Understand the **[Agent Loop](/docs/concepts/agent-loop)**—the core execution pattern within agents
* Explore the **[Agent Lifecycle](/docs/concepts/lifecycle)**—how the OS manages agent resources
* Review **[Workflow Patterns](/docs/concepts/workflows-taxonomy)**—operational execution patterns
---
# Agent Lifecycle
## Overview
The Agent Lifecycle is a System/Control Workflow that defines how agents are managed within the system. Unlike the [Agent Loop](/docs/concepts/agent-loop) (which describes internal execution), the Lifecycle governs system-level responsibilities: registration, discovery, execution management, and evaluation.
## The Four Phases
### 1. Registration
Agents declare their capabilities and constraints to the system:
* Capability declaration
* Resource requirements specification
* Constraint definition
* Metadata registration
Learn more: [System Registry](/docs/system/registry)
### 2. Discovery
The system exposes agents for selection and routing:
* Capability-based discovery
* Dynamic service discovery
* Load balancing mechanisms
* Failover protocols
Learn more: [System Registry](/docs/system/registry)
### 3. Execution Management
The OS assigns tasks and monitors progress:
* Task assignment interfaces
* Real-time monitoring
* Error handling
* State management
* Policy enforcement
Learn more: [Runs](/docs/runs/run), [Actions](/docs/actions)
### 4. Evaluation
Outputs, logs, and performance are reviewed:
* Performance monitoring
* Quality assessment
* Compliance verification
* Adaptation mechanisms
Learn more: [Audit](/docs/checks/audit), [Judge](/docs/checks/judge)
## Lifecycle vs Loop vs Workflows
Understanding the distinction is crucial:
| Concept | Layer | Purpose | Scope |
| -------------------------------------------------- | ----------- | ------------------ | ---------------------- |
| **Lifecycle** | System | Agent management | OS/Platform governance |
| **[Loop](/docs/concepts/agent-loop)** | Cognitive | Internal execution | Single agent reasoning |
| **[Workflows](/docs/concepts/workflows-taxonomy)** | Operational | Task orchestration | Multi-step processes |
* **Lifecycle** exists *outside* any specific workflow—it's the system contract
* **Loop** executes *inside* workflows—it's the cognitive engine
* **Workflows** orchestrate *during* Execution/Evaluation phases—they're the macro patterns
Reference: [Anthropic: Building Effective Agents](https://www.anthropic.com/engineering/building-effective-agents)
## Next Steps
* Explore [Workflow Taxonomy](/docs/concepts/workflows-taxonomy) to see operational patterns
* Read the [AGENTS.md](https://github.com/synerops/osprotocol/blob/main/AGENTS.md) knowledge base
---
# Workflows Taxonomy
## Overview
Workflows are operational execution patterns that define how tasks are executed during the Execution/Evaluation phases of the [Agent Lifecycle](/docs/concepts/lifecycle). They are macro-level orchestration patterns, distinct from the [Agent Loop](/docs/concepts/agent-loop) (micro-execution) and the Lifecycle (system layer).
## The Six Categories
### 1. System/Control Workflows
Govern agent management at the platform level. The primary workflow is the [Agent Lifecycle](/docs/concepts/lifecycle): Registration → Discovery → Execution → Evaluation.
### 2. Task Workflows
Operational patterns for executing work:
* **[Routing](/docs/workflows/routing)**: Classify inputs and direct to specialized tasks
* **Prompt Chaining**: Sequential steps with validation gates
* **[Orchestrator-Workers](/docs/workflows/orchestrator-worker)**: Central orchestrator delegates to workers
* **[Parallelization](/docs/workflows/parallelization)**: Simultaneous execution with aggregation
* **[Evaluator-Optimizer](/docs/workflows/evaluator-optimizer)**: Generate-evaluate-refine loops
Reference: [Anthropic: Building Effective Agents](https://www.anthropic.com/engineering/building-effective-agents)
### 3. Quality Workflows
Ensure outputs meet standards:
* **[Rules Validation](/docs/checks/rules)**: Defined criteria and constraints
* **[Visual Checks](/docs/checks/screenshot)**: Screenshots and renders
* **[LLM-as-Judge](/docs/checks/judge)**: Model-based evaluation
### 4. Recovery Workflows
Handle failures and errors:
* **[Retries](/docs/runs/retry)**: Automatic retry mechanisms
* **[Timeouts](/docs/runs/timeout)**: Long-running operation handling
* **[Cancellation](/docs/runs/cancel)**: Graceful termination of running operations
### 5. Human-in-the-Loop Workflows
Integrate human oversight:
* **[Approval Workflows](/docs/runs/approval)**: Human approval before proceeding
* **Manual Delegation**: Human task assignment
### 6. Multi-Agent Workflows
Coordinate multiple agents:
* **Agent Coordination**: Multiple agents working together
* **Distributed Execution**: Tasks distributed across agents
## Key Distinctions
* **Workflows** are macro-level orchestration patterns used *during* Execution/Evaluation
* **[Agent Loop](/docs/concepts/agent-loop)** is the micro-level cognitive cycle *inside* workflows
* **[Lifecycle](/docs/concepts/lifecycle)** is the system-level governance *around* workflows
## Next Steps
* Explore specific [Task Workflows](/docs/workflows/routing)
* Understand [Quality Assurance](/docs/checks/audit) mechanisms
* Learn about [Recovery Patterns](/docs/runs/retry)
* Read the full specification in [AGENTS.md Section 3](https://github.com/synerops/osprotocol/blob/main/AGENTS.md#3-workflow-patterns)
---
# Embeddings
This interface is experimental — no production implementation exists yet.
The API surface may change.
## Overview
Embeddings is the agent-facing interface for semantic search over indexed knowledge. Agents use it to find relevant content by meaning rather than exact keywords. The vector database infrastructure underneath is a system concern — providers like Pinecone, Upstash Vector, Weaviate, or OpenAI Embeddings handle storage and retrieval without the agent needing to know which one is in use.
## TypeScript API
```ts
import type { Embeddings, EmbeddingEntry, EmbeddingsContext, EmbeddingsActions } from '@osprotocol/schema/context/embeddings'
```
### EmbeddingEntry
A single result returned from a search or get operation.
```ts
interface EmbeddingEntry<T extends Record<string, unknown>> {
  id: string
  content: string
  /** Similarity score, 0–1. Present only in search results. */
  score?: number
  metadata?: T
}
```
### EmbeddingsContext
Read-only interface for the context phase of the agent loop. Use this to find relevant entries by meaning or retrieve a known entry by ID.
```ts
interface EmbeddingsContext {
  search<T extends Record<string, unknown>>(
    query: string,
    topK: number,
    filter?: Partial<T>
  ): Promise<EmbeddingEntry<T>[]>
  get<T extends Record<string, unknown>>(id: string): Promise<EmbeddingEntry<T> | null>
}
```
### EmbeddingsActions
Write interface for the actions phase of the agent loop. Use this to index new content or remove stale entries.
```ts
interface EmbeddingsActions {
  upsert<T extends Record<string, unknown>>(
    id: string,
    content: string,
    metadata?: T
  ): Promise<EmbeddingEntry<T>>
  remove(id: string): Promise<boolean>
}
```
### Embeddings
Full interface combining read and write operations.
```ts
interface Embeddings {
  upsert<T extends Record<string, unknown>>(
    id: string,
    content: string,
    metadata?: T
  ): Promise<EmbeddingEntry<T>>
  search<T extends Record<string, unknown>>(
    query: string,
    topK: number,
    filter?: Partial<T>
  ): Promise<EmbeddingEntry<T>[]>
  get<T extends Record<string, unknown>>(id: string): Promise<EmbeddingEntry<T> | null>
  remove(id: string): Promise<boolean>
}
```
## Usage Examples
### Semantic search with metadata filter
```ts
type DocMeta = { source: string; language: string }

const results = await embeddings.search<DocMeta>(
  'how to handle authentication errors',
  5,
  { language: 'en' }
)

for (const entry of results) {
  console.log(entry.score, entry.content)
  // 0.91 "When a 401 is returned, refresh the token and retry..."
}
```
### Upsert content into the index
```ts
await embeddings.upsert(
  'doc:auth-errors',
  'When a 401 is returned, refresh the token and retry the request.',
  { source: 'runbook', language: 'en' }
)
```
### RAG pattern — retrieve, then generate
```ts
const chunks = await embeddings.search(userQuestion, 3)
const context = chunks.map((c) => c.content).join('\n\n')
const answer = await llm.complete(`Answer using this context:\n\n${context}\n\nQuestion: ${userQuestion}`)
```
## Embeddings vs Key-Value
| Concern | Embeddings | Key-Value (`context/kv`) |
| ---------- | -------------------------- | ------------------------------- |
| Lookup by | Meaning / similarity | Exact key |
| Returns | Ranked results with scores | Single entry or null |
| Best for | Knowledge retrieval, RAG | Session state, config, counters |
| Query type | Natural language query | Known key string |
## Integration
Embeddings integrates with:
* **[Key-Value Store](/docs/context/kv)**: Complementary persistence — embeddings for semantic search, kv for exact lookups
* **[System Context](/docs/context/system)**: EmbeddingsContext is part of the read-only system context facade
* **[Filesystem](/docs/system/fs)**: Source documents can be read from fs and indexed into embeddings
---
# Context
## Overview
The Context domain provides application-specific context and data management for agents. It enables agents to access, store, and retrieve information needed for intelligent decision-making and task execution.
Context is one of the three pillars of the agent loop: **Gather Context** → Take Actions → Verify Results.
## Context APIs
| API | Description |
| -------------------------------------- | ------------------------------------------------------ |
| [System Context](/docs/context/system) | Read-only composition of all system Context interfaces |
| [Embeddings](/docs/context/embeddings) | Vector embeddings for semantic search |
| [Key-Value Store](/docs/context/kv) | Key-value persistence for the agent loop |
## Role in Agent Loop
Context supplies the read side of the loop: during the gather phase, agents consult these interfaces to understand system state and retrieve relevant data before taking actions or verifying results.
## Usage
Context is accessed through the `context` protocol domain. `SystemContext` composes all system read interfaces into a single entry point. `Embeddings` and `KV` are agent-facing read/write interfaces for semantic search and key-value persistence respectively.
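As an illustration, a single gather step might touch all three APIs. The `GatherContext` bundle below is an assumption made for the example; the protocol defines the individual interfaces, not how a runtime hands them to an agent.
```ts
import type { SystemContext } from '@osprotocol/schema/context/system'
import type { Embeddings } from '@osprotocol/schema/context/embeddings'
import type { Kv } from '@osprotocol/schema/context/kv'

// Hypothetical bundle: how a runtime exposes these interfaces is not specified.
interface GatherContext {
  system: SystemContext
  embeddings: Embeddings
  kv: Kv
}

async function gather(ctx: GatherContext, question: string) {
  const dbUrl = await ctx.system.env.get('DATABASE_URL')      // system read
  const session = await ctx.kv.get('session:current')         // exact-key lookup
  const knowledge = await ctx.embeddings.search(question, 3)  // semantic retrieval
  return { dbUrl, session, knowledge }
}
```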
## Integration
Context integrates with:
* **System**: Accesses system-level information
* **Actions**: Provides context for action execution
* **Checks**: Context informs quality verification
* **Workflows**: Workflows access context during execution
---
# Key-Value Store
This interface is experimental — no production implementation exists yet.
The API surface may change.
## Overview
The key-value store provides flat, direct-access persistence for structured data. Agents use it to store and retrieve data by known keys — session state, user preferences, configuration, counters. Unlike `fs` (hierarchical, file-based) and `embeddings` (semantic search by meaning), `kv` is for exact key lookups.
## TypeScript API
```ts
import type { Kv, KvEntry, KvContext, KvActions } from '@osprotocol/schema/context/kv'
```
### KvEntry
A single key-value entry.
```ts
interface KvEntry<T = unknown> {
  /** Entry key */
  key: string
  /** Entry value */
  value: T
  /** Extensible metadata for provider-specific data */
  metadata?: Record<string, unknown>
}
```
### Kv
Full key-value store interface with read and write operations.
```ts
interface Kv {
  get<T = unknown>(key: string): Promise<KvEntry<T> | null>
  set<T = unknown>(key: string, value: T): Promise<KvEntry<T>>
  remove(key: string): Promise<boolean>
  list(prefix?: string): Promise<string[]>
}
```
### KvContext
Read-only view for the context phase of the agent loop.
```ts
interface KvContext {
  get<T = unknown>(key: string): Promise<KvEntry<T> | null>
  list(prefix?: string): Promise<string[]>
}
```
### KvActions
Write operations for the actions phase of the agent loop.
```ts
interface KvActions {
  set<T = unknown>(key: string, value: T): Promise<KvEntry<T>>
  remove(key: string): Promise<boolean>
}
```
## Usage Examples
### Store and retrieve session state
```ts
await kv.set('session:abc123', {
  userId: 'user-42',
  startedAt: Date.now(),
  step: 'code-review',
})
const session = await kv.get<{ userId: string; step: string }>('session:abc123')
// session.value.step → 'code-review'
```
### Enumerate keys by prefix
```ts
const keys = await kv.list('session:')
// ['session:abc123', 'session:def456', ...]
```
### Remove expired data
```ts
const removed = await kv.remove('session:abc123')
// true if the entry existed
```
## Agent Persistence Model
The protocol provides three distinct persistence patterns:
| Pattern | Interface | Access | Use Case |
| ---------------- | -------------------- | ---------------------------- | ---------------------------------------- |
| **Hierarchical** | `system/fs` | Paths and directories | Files, configs, artifacts |
| **Key-value** | `context/kv` | Direct key lookup | Session state, counters, structured data |
| **Semantic** | `context/embeddings` | Similarity search by meaning | Knowledge retrieval, RAG |
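A single task often combines all three. The sketch below reads a document from the filesystem, indexes it for semantic search, and records the indexing state under an exact key. The helper name, the key naming, and the assumption that `fs.read` resolves to the file's text content are illustrative only.
```ts
import type { SystemContext } from '@osprotocol/schema/context/system'
import type { Embeddings } from '@osprotocol/schema/context/embeddings'
import type { Kv } from '@osprotocol/schema/context/kv'

async function indexDocument(
  system: SystemContext,
  embeddings: Embeddings,
  kv: Kv,
  path: string
) {
  // Hierarchical: fetch the source file by path (text content assumed here)
  const text = await system.fs.read(path)
  // Semantic: make the content retrievable by meaning
  await embeddings.upsert(`doc:${path}`, String(text), { source: 'fs' })
  // Key-value: record indexing state under an exact key
  await kv.set(`indexed:${path}`, { at: Date.now() })
}
```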
## Integration
Key-Value Store integrates with:
* **[Embeddings](/docs/context/embeddings)**: Complementary persistence — kv for exact lookups, embeddings for semantic search
* **[Filesystem](/docs/system/fs)**: Complementary persistence — kv for flat data, fs for hierarchical files
* **[System Context](/docs/context/system)**: KvContext is part of the read-only system context facade
---
# System Context
This interface is experimental — no production implementation exists yet.
The API surface may change.
## Overview
`SystemContext` is the read-only facade that composes all system Context interfaces into a single entry point. It is used during the context (gather) phase of the agent loop, giving agents a unified view of system state without the ability to mutate it. Write operations are handled by the counterpart: [SystemActions](/docs/actions/system).
## Architecture
`SystemContext` is pure composition — it adds no logic of its own, only grouping each system API's read-only Context interface under one namespace. `SystemActions` mirrors this structure for write operations.
## TypeScript API
```ts
import type { SystemContext } from '@osprotocol/schema/context/system'
```
### SystemContext
Composes all system read-only interfaces.
```ts
interface SystemContext {
  env: EnvContext
  settings: SettingsContext
  preferences: PreferencesContext
  registry: RegistryContext
  fs: FsContext
  sandbox: SandboxContext
  installer: InstallerContext
  mcp: McpContext
}
```
## Composed Interfaces
| Property | Type | Provides | Docs |
| ------------- | -------------------- | -------------------------------------------------------------- | --------------------------------------- |
| `env` | `EnvContext` | Read environment variables (get, list) | [Env](/docs/system/env) |
| `settings` | `SettingsContext` | Read system-wide settings (get, list) | [Settings](/docs/system/settings) |
| `preferences` | `PreferencesContext` | Read per-agent or per-user preferences by scope (get, list) | [Preferences](/docs/system/preferences) |
| `registry` | `RegistryContext` | Discover and look up registered resources (get, list) | [Registry](/docs/system/registry) |
| `fs` | `FsContext` | Read host filesystem entries (read, list, exists) | [Fs](/docs/system/fs) |
| `sandbox` | `SandboxContext` | Inspect existing sandbox environments (get, list) | [Sandbox](/docs/system/sandbox) |
| `installer` | `InstallerContext` | Inspect installed packages and their status (get, list) | [Installer](/docs/system/installer) |
| `mcp` | `McpContext` | Inspect MCP server connections and available tools (get, list) | [MCP Client](/docs/system/mcp-client) |
## Usage Examples
### Check environment and preferences together
An agent reads an environment variable and resolves a user preference in the same context phase before deciding how to act.
```ts
async function resolveOutputConfig(system: SystemContext) {
  const dbUrl = await system.env.get('DATABASE_URL')
  const formatPref = await system.preferences.get('output.format', 'user')
  return {
    databaseUrl: dbUrl?.value ?? null,
    outputFormat: formatPref?.value ?? 'json',
  }
}
```
### Inspect installed packages and MCP connections
An agent audits what capabilities are currently available before deciding whether to proceed with a task.
```ts
async function auditCapabilities(system: SystemContext) {
  const [packages, mcpServers] = await Promise.all([
    system.installer.list(),
    system.mcp.list(),
  ])
  const hasSchemaPackage = packages.some(
    (p) => p.name === '@osprotocol/schema' && p.status === 'installed'
  )
  const connectedServers = mcpServers.filter((s) => s.status === 'connected')
  return { hasSchemaPackage, connectedServers }
}
```
## Integration
* **[SystemActions](/docs/actions/system)**: The write counterpart — same system interfaces, mutation operations
* **[EnvContext](/docs/system/env)**: Environment variable read interface
* **[SettingsContext](/docs/system/settings)**: System-wide settings read interface
* **[PreferencesContext](/docs/system/preferences)**: Scoped preferences read interface
* **[RegistryContext](/docs/system/registry)**: Resource registry read interface
* **[FsContext](/docs/system/fs)**: Host filesystem read interface
* **[SandboxContext](/docs/system/sandbox)**: Sandbox inspection interface
* **[InstallerContext](/docs/system/installer)**: Installed packages read interface
* **[McpContext](/docs/system/mcp-client)**: MCP server connections read interface
---
# Approval
## Overview
The approval system enables human oversight of workflow execution. It provides approval requests, multi-approver workflows, and configurable timeout behavior for critical checkpoints.
## TypeScript API
```ts
import type {
  Approval,
  ApprovalConfig,
  ApprovalRequest
} from '@osprotocol/schema/runs/approval'
```
### Approval
Result of an approval request.
```ts
interface Approval {
  /** Whether the action was approved */
  approved: boolean
  /** Optional reason for the decision */
  reason?: string
  /** Identifier of who approved (user ID, email, etc.) */
  approvedBy?: string
  /** When the approval decision was made */
  timestamp: Date
}
```
### ApprovalConfig
Configuration for approval requests.
```ts
interface ApprovalConfig {
  /** Default timeout for approval requests (milliseconds) */
  timeoutMs?: number
  /** Whether to auto-approve after timeout */
  autoApproveOnTimeout?: boolean
  /** List of users who can approve */
  approvers?: string[]
  /** Minimum approvals required (for multi-approval scenarios) */
  requiredApprovals?: number
}
```
### ApprovalRequest
A pending request for human approval.
```ts
interface ApprovalRequest {
  /** Unique identifier for this request */
  id: string
  /** Message describing what needs approval */
  message: string
  /** Execution ID this request belongs to */
  executionId: string
  /** When the request was created */
  createdAt: Date
  /** When the request expires */
  expiresAt?: Date
  /** Current approval responses */
  responses: Approval[]
}
```
## Usage Example
```ts
// Request approval during execution
const approval = await execution.waitForApproval(
  'Deploy to production environment?'
)

if (approval.approved) {
  console.log(`Approved by ${approval.approvedBy}: ${approval.reason}`)
  // Continue with deployment
} else {
  console.log(`Denied: ${approval.reason}`)
  // Handle rejection
}
```
## Multi-Approval Workflows
For critical operations requiring multiple approvers:
```ts
const config: ApprovalConfig = {
  timeoutMs: 3600000, // 1 hour
  approvers: ['alice@company.com', 'bob@company.com'],
  requiredApprovals: 2,
  autoApproveOnTimeout: false
}
```
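The schema defines the shapes but not the decision policy. One plausible rule, sketched here with a helper that is not part of the protocol, treats any explicit denial as blocking and otherwise requires the configured quorum.
```ts
import type { ApprovalRequest, ApprovalConfig } from '@osprotocol/schema/runs/approval'

// Illustrative policy only; runtimes may tally responses differently.
function isApproved(request: ApprovalRequest, config: ApprovalConfig): boolean {
  const required = config.requiredApprovals ?? 1
  const granted = request.responses.filter((r) => r.approved).length
  const denied = request.responses.some((r) => !r.approved)
  // A single denial blocks; otherwise wait until the quorum is reached.
  return !denied && granted >= required
}
```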
## Integration
Approval integrates with:
* **Execution**: Pauses execution until approval
* **Timeout**: Approval requests can expire
* **Cancel**: Rejected approvals can trigger cancellation
---
# Cancellation
## Overview
The cancel system provides mechanisms to cancel running workflows gracefully. It supports cancellation hooks, cleanup operations, and configurable grace periods for in-progress work.
## TypeScript API
```ts
import type { Cancel } from '@osprotocol/schema/runs/cancel'
```
### Cancel
Cancel configuration for workflow runs.
```ts
interface Cancel {
  /**
   * Called before cancellation proceeds
   * Return false to prevent cancellation
   */
  beforeCancel?: () => boolean | Promise<boolean>
  /**
   * Called after cancellation completes
   */
  afterCancel?: () => void
  /**
   * Optional reason for cancellation
   */
  reason?: string
  /**
   * Whether to wait for cleanup before resolving
   */
  graceful?: boolean
  /**
   * Timeout for graceful cancellation in milliseconds
   */
  gracefulTimeoutMs?: number
}
```
## Usage Examples
### Simple Cancellation
```ts
// Cancel an execution
await execution.cancel('User requested stop')
```
### Cancellation with Cleanup
```ts
const cancel: Cancel = {
  graceful: true,
  gracefulTimeoutMs: 5000,
  beforeCancel: async () => {
    // Check if safe to cancel
    const canCancel = await checkSafeToCancel()
    return canCancel
  },
  afterCancel: () => {
    // Cleanup resources
    cleanupTempFiles()
    closeConnections()
  }
}
```
### Preventing Cancellation
```ts
const cancel: Cancel = {
  beforeCancel: () => {
    if (criticalOperationInProgress) {
      console.log('Cannot cancel during critical operation')
      return false // Prevent cancellation
    }
    return true
  }
}
```
### Graceful Shutdown
```ts
const cancel: Cancel = {
  graceful: true,
  gracefulTimeoutMs: 10000, // 10 second grace period
  reason: 'System shutdown',
  afterCancel: () => {
    notifyDependentSystems()
  }
}
// Waits up to 10 seconds for graceful cleanup
// Forces cancellation if cleanup exceeds timeout
```
## Cancellation Flow
When cancellation is requested, `beforeCancel` runs first and can veto it by returning `false`. If cancellation proceeds and `graceful` is set, in-progress work gets up to `gracefulTimeoutMs` to clean up before being forced to stop, and `afterCancel` runs once cancellation completes.
## Integration
Cancel integrates with:
* **RunOptions**: Configure cancel behavior for runs
* **Timeout**: Timeouts can trigger cancellation
* **Execution**: Cancel is called via execution.cancel()
* **Approval**: Rejected approvals may trigger cancellation
---
# Retry
## Overview
The retry system provides configurable retry behavior for failed operations. It supports multiple backoff strategies, conditional retries, and callbacks for monitoring retry attempts.
## Backoff Strategies
| Strategy | Description |
| ------------- | ----------------------------------------------- |
| `none` | No delay increase between retries |
| `linear` | Delay increases linearly (delayMs \* attempt) |
| `exponential` | Delay doubles each attempt (delayMs \* 2^(attempt - 1)) |
## TypeScript API
```ts
import type { Retry, Backoff } from '@osprotocol/schema/runs/retry'
```
### Backoff
Available backoff strategies for retry delays.
```ts
type Backoff = 'none' | 'linear' | 'exponential'
```
### Retry
Retry configuration for workflow runs.
```ts
interface Retry {
  /** Maximum number of retry attempts */
  attempts: number
  /** Initial delay between retries in milliseconds */
  delayMs: number
  /** Backoff strategy (default: 'none') */
  backoff?: Backoff
  /** Maximum delay when using backoff (milliseconds) */
  maxDelayMs?: number
  /** Callback on each retry attempt */
  onRetry?: (error: Error, attempt: number) => void
  /** Optional predicate to determine if error is retryable */
  shouldRetry?: (error: Error) => boolean
}
```
## Usage Examples
### Simple Retry
```ts
const retry: Retry = {
  attempts: 3,
  delayMs: 1000
}
// Retries up to 3 times with 1 second between each attempt
```
### Exponential Backoff
```ts
const retry: Retry = {
  attempts: 5,
  delayMs: 100,
  backoff: 'exponential',
  maxDelayMs: 10000,
  onRetry: (error, attempt) => {
    console.log(`Retry ${attempt}: ${error.message}`)
  }
}
// Delays: 100ms, 200ms, 400ms, 800ms, 1600ms (all below the 10000ms maxDelayMs cap)
```
### Conditional Retry
```ts
const retry: Retry = {
  attempts: 3,
  delayMs: 500,
  shouldRetry: (error) => {
    // Only retry network errors
    return error.name === 'NetworkError'
  }
}
```
## Delay Calculation
| Strategy | Attempt 1 | Attempt 2 | Attempt 3 | Attempt 4 |
| ------------- | --------- | ------------ | ------------ | ------------ |
| `none` | delayMs | delayMs | delayMs | delayMs |
| `linear` | delayMs | 2 \* delayMs | 3 \* delayMs | 4 \* delayMs |
| `exponential` | delayMs | 2 \* delayMs | 4 \* delayMs | 8 \* delayMs |
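The table translates directly into a small delay function. This helper is not part of the schema; it is a sketch of the rule above with `maxDelayMs` applied as a cap.
```ts
import type { Retry } from '@osprotocol/schema/runs/retry'

// Sketch of the delay rule; attempt numbering starts at 1.
function computeDelay(retry: Retry, attempt: number): number {
  const { delayMs, backoff = 'none', maxDelayMs = Infinity } = retry
  let delay = delayMs
  if (backoff === 'linear') delay = delayMs * attempt
  if (backoff === 'exponential') delay = delayMs * 2 ** (attempt - 1)
  return Math.min(delay, maxDelayMs)
}

computeDelay({ attempts: 5, delayMs: 100, backoff: 'exponential', maxDelayMs: 10000 }, 4) // 800
```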
## Integration
Retry integrates with:
* **RunOptions**: Configure retry behavior for runs
* **Timeout**: Retries respect timeout constraints
* **Cancel**: Pending retries can be cancelled
---
# Run Lifecycle
## Overview
Executions represent an active workflow run with full lifecycle control. The run system provides status tracking, progress monitoring, and execution control through pause, resume, and cancel operations.
Creating a run IS starting it — `workflow.run()` returns an active `Execution` handle directly. This aligns with the [Agent Communication Protocol (ACP)](https://agentcommunicationprotocol.dev/core-concepts/agent-run-lifecycle).
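A minimal sketch of that model, assuming a `workflow` object with a `run()` method as described above; the parameter shape is an assumption for illustration.
```ts
import type { Execution } from '@osprotocol/schema/runs'

// Sketch only: the `workflow` parameter's shape is assumed here; the documented
// contract is that run() returns a live Execution handle with no separate start step.
async function startDeploy(workflow: { run: (input: unknown) => Promise<Execution> }) {
  const execution = await workflow.run({ task: 'deploy' })
  return execution
}
```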
## Lifecycle
A run begins as `pending`, moves to `in-progress` while actively executing, and may enter `awaiting` when it needs human input or approval. It ends in one of three terminal states: `completed`, `failed`, or `cancelled`.
## TypeScript API
```ts
import type {
  RunOptions,
  RunStatus,
  Execution,
  ExecutionProgress
} from '@osprotocol/schema/runs'
```
### RunStatus
The possible states of a workflow execution.
```ts
type RunStatus =
| 'pending' // Execution is queued/initializing
| 'in-progress' // Execution is actively running
| 'awaiting' // Execution is waiting for human input/approval
| 'completed' // Execution finished successfully
| 'failed' // Execution encountered an error
| 'cancelled' // Execution was cancelled
```
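A small helper over this union, not part of the schema, makes the terminal states explicit.
```ts
import type { RunStatus } from '@osprotocol/schema/runs'

type TerminalStatus = 'completed' | 'failed' | 'cancelled'

// Convenience type guard, illustrative only.
function isTerminal(status: RunStatus): status is TerminalStatus {
  return status === 'completed' || status === 'failed' || status === 'cancelled'
}
```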
### RunOptions
Options for configuring a workflow run.
```ts
interface RunOptions