Agents
The agent manifest format, lifecycle states, runtime selection, and BYOLLM model alias system.
An Agent in AEGIS is a stateless compute process defined entirely by a declarative YAML manifest (kind: AgentManifest). The orchestrator reads the manifest to determine the runtime environment, what tools the agent is allowed to use, what security constraints to enforce, what resources to allocate, and how to validate the output.
Agents do not maintain state between executions. Context is injected by the orchestrator at the start of each execution and tool results are returned to the agent via the orchestrator proxy.
Manifest Structure
All agent manifests follow a Kubernetes-style format:
apiVersion: 100monkeys.ai/v1
kind: AgentManifest
metadata:
  name: code-reviewer
  version: "1.0.0"
  labels:
    team: platform
    environment: production
spec:
  runtime:
    language: python
    version: "3.11"
    isolation: docker
  task:
    instruction: "Reviews pull requests and outputs structured feedback."
  security:
    network:
      mode: allow
      allowlist:
        - api.github.com
        - api.openai.com
    filesystem:
      read:
        - /workspace
      write:
        - /workspace
    resources:
      cpu: 1000
      memory: "1Gi"
      timeout: "300s"
  execution:
    mode: iterative
    max_iterations: 10
    validation:
      system:
        must_succeed: true
      output:
        format: json
  env:
    LOG_LEVEL: info
  volumes:
    - name: workspace
      storage_class: ephemeral
      mount_point: /workspace
      access_mode: read-write
      ttl_hours: 1
Manifest Fields
metadata
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | ✓ | Unique identifier for the agent on this node. Used in CLI commands and gRPC calls. |
| labels | map[string]string | | Arbitrary key-value tags for filtering and organization. |
spec
| Field | Type | Required | Description |
|---|---|---|---|
| description | string | | Human-readable description injected into the system prompt. |
| runtime | object | ✓ | Language, version, and isolation mode. |
| task | object | | Instruction, agentskills, and prompt template. |
| security | object | | Network policy, filesystem policy, resource limits (deny-by-default). |
| execution | object | | Iteration mode, max iterations, and validation criteria. |
| tools | object[] or string[] | | MCP tools the agent may invoke. |
| env | map[string]string | | Environment variables injected into the agent container. |
| volumes | object[] | | Volume mounts. See Configuring Storage. |
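As an illustration of the Required column in the tables above, the following is a minimal sketch of a pre-deploy check over a manifest that has already been parsed into a Python dict. The function names are hypothetical and are not part of any AEGIS API; the orchestrator's own validation may be stricter.

```python
from typing import Any

# Required fields per the tables above: metadata.name and spec.runtime.
REQUIRED_PATHS = [("metadata", "name"), ("spec", "runtime")]


def missing_required_fields(manifest: dict[str, Any]) -> list[str]:
    """Return dotted paths of required fields that are absent or empty."""
    missing = []
    for path in REQUIRED_PATHS:
        node: Any = manifest
        for key in path:
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        if node in (None, "", {}):
            missing.append(".".join(path))
    return missing


def check_manifest(manifest: dict[str, Any]) -> list[str]:
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = []
    if manifest.get("kind") != "AgentManifest":
        problems.append("kind must be AgentManifest")
    problems.extend(
        f"missing required field: {path}"
        for path in missing_required_fields(manifest)
    )
    return problems
```

Calling check_manifest on the code-reviewer example, once loaded into a dict, would return an empty list, since it sets both metadata.name and spec.runtime.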
spec.security
| Field | Type | Description |
|---|---|---|
| network.mode | allow \| deny \| none | Policy mode. allow = allowlist; none = no network. |
| network.allowlist | string[] | Allowed domain names and CIDR blocks. |
| network.denylist | string[] | Explicitly blocked domains. |
| filesystem.read | string[] | Readable paths inside the container. Glob patterns supported. |
| filesystem.write | string[] | Writable paths inside the container. Glob patterns supported. |
| resources.cpu | integer | CPU in millicores (1000 = 1 core). Default: 1000. |
| resources.memory | string | Memory limit. Human-readable: "512Mi", "1Gi". Default: "512Mi". |
| resources.timeout | string | Total execution timeout. Human-readable: "300s", "5m". Max "1h". |
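To make the units in the resources rows concrete, here is a small sketch that converts the human-readable values into plain numbers. It assumes that Mi and Gi are binary multiples (2^20 and 2^30 bytes, as in Kubernetes) and that only the suffixes shown in the table are accepted; the helper names are hypothetical and this is not the AEGIS parser.

```python
import re

# Assumption: binary units, following the Kubernetes convention.
MEMORY_UNITS = {"Mi": 1024**2, "Gi": 1024**3}
TIME_UNITS = {"s": 1, "m": 60, "h": 3600}
MAX_TIMEOUT_SECONDS = 3600  # the table gives "1h" as the maximum


def parse_memory(value: str) -> int:
    """Convert a string such as "512Mi" or "1Gi" to bytes."""
    match = re.fullmatch(r"(\d+)(Mi|Gi)", value)
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    return int(match.group(1)) * MEMORY_UNITS[match.group(2)]


def parse_timeout(value: str) -> int:
    """Convert a string such as "300s" or "5m" to seconds, enforcing the 1h cap."""
    match = re.fullmatch(r"(\d+)([smh])", value)
    if match is None:
        raise ValueError(f"unrecognized timeout value: {value!r}")
    seconds = int(match.group(1)) * TIME_UNITS[match.group(2)]
    if seconds > MAX_TIMEOUT_SECONDS:
        raise ValueError(f"timeout {value!r} exceeds the 1h maximum")
    return seconds


def cpu_cores(millicores: int) -> float:
    """Convert millicores to cores (1000 millicores = 1 core)."""
    return millicores / 1000
```

Under these assumptions, "300s" and "5m" describe the same timeout, and the default memory limit of "512Mi" is half of "1Gi".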
Agent Lifecycle
An agent transitions through the following states after deploy:
deployed → paused → deployed
    │
    └──────────────────────→ archived

| State | Description |
|---|---|
| deployed | The manifest is registered. New executions can be started against this agent. |
| paused | The manifest is retained but no new executions are accepted. Running executions complete normally. |
| archived | The manifest is soft-deleted. Cannot be unarchived. Historical execution records are retained. |
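The lifecycle above can be read as a small state machine. The sketch below encodes only the transitions the diagram shows, reading the arrow to archived as starting from deployed; whether a paused agent can be archived directly is not stated, so that transition is left out as an assumption. The names are hypothetical and do not come from the AEGIS codebase.

```python
from enum import Enum


class AgentState(Enum):
    DEPLOYED = "deployed"
    PAUSED = "paused"
    ARCHIVED = "archived"


# Transitions taken from the diagram. paused -> archived is not shown
# there, so it is omitted here (assumption).
ALLOWED_TRANSITIONS = {
    AgentState.DEPLOYED: {AgentState.PAUSED, AgentState.ARCHIVED},
    AgentState.PAUSED: {AgentState.DEPLOYED},
    AgentState.ARCHIVED: set(),  # archived is terminal: cannot be unarchived
}


def transition(current: AgentState, target: AgentState) -> AgentState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def accepts_new_executions(state: AgentState) -> bool:
    """Only deployed agents accept new executions."""
    return state is AgentState.DEPLOYED
```

Modeling archived with an empty transition set makes the "cannot be unarchived" rule a property of the table rather than a special case in the code.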
aegis agent deploy ./agent.yaml
aegis agent pause <id>
aegis agent resume <id>
aegis agent delete <id>  # archives, does not hard-delete
Runtime Selection
The runtime used to execute an agent is determined by the node configuration, not the agent manifest. All agents on a given node use the same runtime.
| Runtime | Use Case | Requirement |
|---|---|---|
| docker | Development and staging | Docker daemon accessible |
| firecracker | Production | Bare-metal or KVM-passthrough host, Linux kernel 5.10+ |
The AgentRuntime trait abstracts both runtimes. Switching a node from Docker to Firecracker requires only a config change; agent manifests stay unchanged.
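The idea of one abstraction with the backend chosen by node configuration can be sketched as follows. This is a Python stand-in for illustration only: the real AgentRuntime is described as a trait, and the method name, class names, and the "runtime" config key used here are all assumptions.

```python
from abc import ABC, abstractmethod


class AgentRuntime(ABC):
    """Hypothetical Python stand-in for the AgentRuntime trait."""

    @abstractmethod
    def start(self, agent_name: str) -> str:
        """Start an execution and describe what was launched."""


class DockerRuntime(AgentRuntime):
    def start(self, agent_name: str) -> str:
        return f"container for {agent_name}"


class FirecrackerRuntime(AgentRuntime):
    def start(self, agent_name: str) -> str:
        return f"microVM for {agent_name}"


RUNTIMES = {"docker": DockerRuntime, "firecracker": FirecrackerRuntime}


def runtime_from_node_config(node_config: dict) -> AgentRuntime:
    """Pick the runtime from node config; the manifest is never consulted."""
    kind = node_config["runtime"]  # hypothetical config key
    if kind not in RUNTIMES:
        raise ValueError(f"unknown runtime: {kind!r}")
    return RUNTIMES[kind]()
```

Because callers depend only on the abstract interface, changing the node's configured runtime changes which class is instantiated without touching any agent definition.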
See Docker Runtime and Firecracker Runtime for deployment details.
BYOLLM: Model Alias System
Agent manifests reference LLM models by alias, not by provider name. This decouples agent definitions from infrastructure choices.
In the agent manifest, specify the model alias in the spec or within the bootstrap.py client initialization:
from aegis import AegisClient
client = AegisClient(model="default")  # resolves to whatever "default" maps to in node config
In aegis-config.yaml, the node operator maps aliases to providers:
llm:
  providers:
    - name: openai-gpt4o
      type: openai
      api_key: "env:OPENAI_API_KEY"
      model: gpt-4o
    - name: claude-3-5-sonnet
      type: anthropic
      api_key: "env:ANTHROPIC_API_KEY"
      model: claude-3-5-sonnet-20241022
  aliases:
    default: openai-gpt4o
    fast: claude-3-5-sonnet
    reasoning: openai-gpt4o
To swap the model backing the default alias for all agents on the node, change the alias mapping in the config and restart the daemon. No manifest changes are required.
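The alias lookup described above amounts to two steps: alias to provider name, then provider name to provider entry. The sketch below shows that lookup over a plain dict mirroring the llm section of the example config (API keys omitted). The resolve_alias function is hypothetical and is not the AEGIS implementation.

```python
from typing import Any

# Mirrors the llm section of the aegis-config.yaml example, as a plain dict.
LLM_CONFIG: dict[str, Any] = {
    "providers": [
        {"name": "openai-gpt4o", "type": "openai", "model": "gpt-4o"},
        {
            "name": "claude-3-5-sonnet",
            "type": "anthropic",
            "model": "claude-3-5-sonnet-20241022",
        },
    ],
    "aliases": {
        "default": "openai-gpt4o",
        "fast": "claude-3-5-sonnet",
        "reasoning": "openai-gpt4o",
    },
}


def resolve_alias(alias: str, llm_config: dict[str, Any]) -> dict[str, Any]:
    """Map an alias to its provider entry, or raise KeyError if unresolved."""
    provider_name = llm_config["aliases"].get(alias)
    if provider_name is None:
        raise KeyError(f"unknown model alias: {alias!r}")
    for provider in llm_config["providers"]:
        if provider["name"] == provider_name:
            return provider
    raise KeyError(f"alias {alias!r} points to undefined provider {provider_name!r}")
```

With this shape, repointing "default" at "claude-3-5-sonnet" changes what every agent using model="default" receives, which is the decoupling the alias system is meant to provide.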
See Configuring LLM Providers for the full provider configuration reference.