Agents

The agent manifest format, lifecycle states, runtime selection, and BYOLLM model alias system.

An Agent in AEGIS is a stateless compute process defined entirely by a declarative YAML manifest (kind: AgentManifest). The orchestrator reads the manifest to determine the runtime environment, what tools the agent is allowed to use, what security constraints to enforce, what resources to allocate, and how to validate the output.

Agents do not maintain state between executions. Context is injected by the orchestrator at the start of each execution and tool results are returned to the agent via the orchestrator proxy.


Manifest Structure

All agent manifests follow a Kubernetes-style format:

apiVersion: 100monkeys.ai/v1
kind: AgentManifest
metadata:
  name: code-reviewer
  version: "1.0.0"
  labels:
    team: platform
    environment: production
spec:
  runtime:
    language: python
    version: "3.11"
    isolation: docker

  task:
    instruction: "Reviews pull requests and outputs structured feedback."

  security:
    network:
      mode: allow
      allowlist:
        - api.github.com
        - api.openai.com
    filesystem:
      read:
        - /workspace
      write:
        - /workspace
    resources:
      cpu: 1000
      memory: "1Gi"
      timeout: "300s"

  execution:
    mode: iterative
    max_iterations: 10
    validation:
      system:
        must_succeed: true
      output:
        format: json

  env:
    LOG_LEVEL: info

  volumes:
    - name: workspace
      storage_class: ephemeral
      mount_point: /workspace
      access_mode: read-write
      ttl_hours: 1
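
As a rough illustration of how the security.network block in the manifest above might be evaluated, the following Python sketch checks a hostname or IP address against an allowlist. The function names (host_allowed, _matches), exact-match comparison for domain names, the precedence of denylist over allowlist, and the fallback to none when mode is omitted are all assumptions made for this example, not documented AEGIS behavior. The deny mode is left unhandled because this page does not describe its semantics.

```python
import ipaddress


def _matches(host: str, entries: list) -> bool:
    """Return True if host equals a domain entry or falls inside a CIDR entry."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        addr = None  # host is a domain name, not an IP literal

    for entry in entries:
        if addr is not None:
            try:
                if addr in ipaddress.ip_network(entry, strict=False):
                    return True
            except ValueError:
                continue  # entry is a domain name, so it cannot contain an IP
        elif host.lower() == entry.lower():
            # Assumption: domains are compared exactly, with no wildcard or
            # subdomain matching.
            return True
    return False


def host_allowed(host: str, network: dict) -> bool:
    """Decide whether host may be contacted under a network policy dict."""
    mode = network.get("mode", "none")  # assumption: no mode means no network
    if mode == "none":
        return False
    if mode != "allow":
        # "deny" semantics are not described in the source, so do not guess.
        raise ValueError(f"mode {mode!r} is not handled by this sketch")
    if _matches(host, network.get("denylist", [])):
        return False  # assumption: denylist wins over allowlist
    return _matches(host, network.get("allowlist", []))


policy = {
    "mode": "allow",
    "allowlist": ["api.github.com", "api.openai.com", "10.0.0.0/8"],
}
print(host_allowed("api.github.com", policy))
print(host_allowed("example.com", policy))
```

The 10.0.0.0/8 entry is added only to exercise the CIDR path; it is not part of the example manifest.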

Manifest Fields

metadata

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | | Unique identifier for the agent on this node. Used in CLI commands and gRPC calls. |
| labels | map[string]string | | Arbitrary key-value tags for filtering and organization. |

spec

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| description | string | | Human-readable description injected into the system prompt. |
| runtime | object | | Language, version, and isolation mode. |
| task | object | | Instruction, agentskills, and prompt template. |
| security | object | | Network policy, filesystem policy, resource limits (deny-by-default). |
| execution | object | | Iteration mode, max iterations, and validation criteria. |
| tools | object[] or string[] | | MCP tools the agent may invoke. |
| env | map[string]string | | Environment variables injected into the agent container. |
| volumes | object[] | | Volume mounts. See Configuring Storage. |

spec.security

| Field | Type | Description |
| --- | --- | --- |
| network.mode | one of: allow, deny, none | Policy mode. allow = allowlist; none = no network. |
| network.allowlist | string[] | Allowed domain names and CIDR blocks. |
| network.denylist | string[] | Explicitly blocked domains. |
| filesystem.read | string[] | Readable paths inside the container. Glob patterns supported. |
| filesystem.write | string[] | Writable paths inside the container. Glob patterns supported. |
| resources.cpu | integer | CPU in millicores (1000 = 1 core). Default: 1000. |
| resources.memory | string | Memory limit. Human-readable: "512Mi", "1Gi". Default: "512Mi". |
| resources.timeout | string | Total execution timeout. Human-readable: "300s", "5m". Max "1h". |
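
To make the resource formats concrete, here is a minimal Python sketch that converts the human-readable values into numbers and applies the documented defaults and the one-hour timeout cap. The helper names (parse_memory, parse_timeout, cpu_cores) are hypothetical. The accepted suffixes are also an assumption: binary units for Ki, Mi, and Gi, and s, m, h for time, since this page only shows "Mi", "Gi", "s", "m", and "h" in its examples.

```python
import re

# Assumption: memory suffixes are binary (Kubernetes-style) units.
MEMORY_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
TIME_UNITS = {"s": 1, "m": 60, "h": 3600}
MAX_TIMEOUT_SECONDS = 3600  # the table states a maximum of "1h"


def parse_memory(value: str = "512Mi") -> int:
    """Convert a memory string such as "1Gi" to bytes (default "512Mi")."""
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi)", value)
    if not match:
        raise ValueError(f"unrecognized memory value: {value!r}")
    return int(match.group(1)) * MEMORY_UNITS[match.group(2)]


def parse_timeout(value: str) -> int:
    """Convert a timeout string such as "300s" or "5m" to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", value)
    if not match:
        raise ValueError(f"unrecognized timeout value: {value!r}")
    seconds = int(match.group(1)) * TIME_UNITS[match.group(2)]
    if seconds > MAX_TIMEOUT_SECONDS:
        raise ValueError(f"timeout {value!r} exceeds the 1h maximum")
    return seconds


def cpu_cores(millicores: int = 1000) -> float:
    """Convert millicores to cores (1000 millicores = 1 core)."""
    return millicores / 1000


print(parse_memory("1Gi"), parse_timeout("5m"), cpu_cores(1000))
```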

Agent Lifecycle

An agent transitions through the following states after deploy:

deployed → paused → deployed
    │
    └──────────────────────→ archived

| State | Description |
| --- | --- |
| deployed | The manifest is registered. New executions can be started against this agent. |
| paused | The manifest is retained but no new executions are accepted. Running executions complete normally. |
| archived | The manifest is soft-deleted. Cannot be unarchived. Historical execution records are retained. |

The lifecycle is driven from the CLI:

aegis agent deploy ./agent.yaml
aegis agent pause <id>
aegis agent resume <id>
aegis agent delete <id>   # archives, does not hard-delete
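
The lifecycle can be modeled as a small state machine. The Python sketch below is illustrative only: AgentState, TRANSITIONS, and apply_command are hypothetical names, and the transition table encodes just the arrows shown in the diagram. Whether a paused agent can be archived directly is not stated on this page, so that transition is deliberately left out.

```python
from enum import Enum


class AgentState(Enum):
    DEPLOYED = "deployed"
    PAUSED = "paused"
    ARCHIVED = "archived"


# (current state, CLI verb) -> next state, following the diagram only.
TRANSITIONS = {
    (AgentState.DEPLOYED, "pause"): AgentState.PAUSED,
    (AgentState.PAUSED, "resume"): AgentState.DEPLOYED,
    (AgentState.DEPLOYED, "delete"): AgentState.ARCHIVED,
}


def apply_command(state: AgentState, command: str) -> AgentState:
    """Return the next state, or raise if the transition is not in the table."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"cannot {command!r} an agent in state {state.value!r}")


def accepts_executions(state: AgentState) -> bool:
    """Only deployed agents accept new executions."""
    return state is AgentState.DEPLOYED


state = AgentState.DEPLOYED
state = apply_command(state, "pause")
print(state.value, accepts_executions(state))
```

Because archived has no outgoing entries in the table, any command applied to an archived agent raises, which matches the "cannot be unarchived" rule.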

Runtime Selection

The runtime used to execute an agent is determined by the node configuration, not the agent manifest. All agents on a given node use the same runtime.

| Runtime | Use Case | Requirement |
| --- | --- | --- |
| docker | Development and staging | Docker daemon accessible |
| firecracker | Production | Bare-metal or KVM-passthrough host, Linux kernel 5.10+ |

The AgentRuntime trait abstracts both runtimes. Switching a node from Docker to Firecracker requires only a config change — no agent manifest changes needed.

See Docker Runtime and Firecracker Runtime for deployment details.


BYOLLM: Model Alias System

Agent manifests reference LLM models by alias, not by provider name. This decouples agent definitions from infrastructure choices.

Specify the model alias in the manifest spec or when initializing the client in bootstrap.py:

from aegis import AegisClient
client = AegisClient(model="default")  # resolves to whatever "default" maps to in node config

In aegis-config.yaml, the node operator maps aliases to providers:

llm:
  providers:
    - name: openai-gpt4o
      type: openai
      api_key: "env:OPENAI_API_KEY"
      model: gpt-4o
    - name: claude-3-5-sonnet
      type: anthropic
      api_key: "env:ANTHROPIC_API_KEY"
      model: claude-3-5-sonnet-20241022
  aliases:
    default: openai-gpt4o
    fast: claude-3-5-sonnet
    reasoning: openai-gpt4o

To swap the model backing the default alias for all agents on the node, change the alias mapping in config and restart the daemon — no manifest changes required.
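
The indirection described above amounts to two lookups: alias to provider name, then provider name to provider settings. The following Python sketch shows one way that resolution might work, using a dict that mirrors the llm section of the example config. The resolve_alias function is hypothetical, and reading the "env:" prefix as "take the value from this environment variable" is an assumption inferred from the example; it is not a description of the orchestrator's actual code.

```python
import os

# Mirrors the llm section of the aegis-config.yaml example.
CONFIG = {
    "llm": {
        "providers": [
            {"name": "openai-gpt4o", "type": "openai",
             "api_key": "env:OPENAI_API_KEY", "model": "gpt-4o"},
            {"name": "claude-3-5-sonnet", "type": "anthropic",
             "api_key": "env:ANTHROPIC_API_KEY",
             "model": "claude-3-5-sonnet-20241022"},
        ],
        "aliases": {
            "default": "openai-gpt4o",
            "fast": "claude-3-5-sonnet",
            "reasoning": "openai-gpt4o",
        },
    }
}


def resolve_alias(alias: str, config: dict, env=os.environ) -> dict:
    """Resolve a model alias to a copy of its provider settings."""
    llm = config["llm"]
    if alias not in llm["aliases"]:
        raise KeyError(f"unknown model alias: {alias!r}")
    providers = {p["name"]: p for p in llm["providers"]}
    provider = dict(providers[llm["aliases"][alias]])  # copy, do not mutate config

    # Assumption: an "env:" prefix means the key is read from the environment.
    key = provider.get("api_key", "")
    if key.startswith("env:"):
        provider["api_key"] = env.get(key[len("env:"):])
    return provider


print(resolve_alias("default", CONFIG)["model"])
```

Under this model, repointing default at a different provider changes only the aliases mapping; code that requests model="default" is untouched.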

See Configuring LLM Providers for the full provider configuration reference.
