Deploying & Running Agents

Full agent lifecycle management via the aegis CLI: deploy, pause, resume, delete, execute, and monitor.

This guide covers the complete agent lifecycle using the aegis CLI: deploying manifests, managing agent state, triggering executions, streaming iteration output, and inspecting execution history.

All operations assume the AEGIS daemon is running and reachable. Pass --daemon-addr <host:port> to target a remote daemon; when the flag is omitted, the CLI connects to localhost:9090.


Authoring Approaches

There are two ways to supply an agent's runtime:

Approach               When to Use
Generic AEGIS runtime  Most agents. Define all behavior in the manifest via spec.task.instruction. No Dockerfile or custom image needed.
Custom container       Agents with complex dependencies, non-standard libraries, or a fully custom bootstrap.py. See Writing Your First Agent.

Both approaches use identical lifecycle and execution commands. The difference is entirely in the manifest format.


Generic AEGIS Runtime (No Custom Container)

For most use cases you do not need to build or maintain a Docker image. Define the agent's behavior in the manifest and let the orchestrator supply the runtime.

How it works:

  1. The orchestrator starts a container from the AEGIS-managed Python base image.
  2. It injects bootstrap.py into the container automatically.
  3. It renders the final LLM prompt by substituting Handlebars variables in spec.task.prompt_template — replacing {{instruction}} with the manifest's spec.task.instruction text and {{input}} with the JSON passed at execution time.
  4. The rendered prompt is passed to bootstrap.py as argv[1].
  5. bootstrap.py calls the LLM proxy at /v1/llm/generate and streams back the response.
  6. If validation fails, {{previous_error}} is populated with the failure output and the next iteration begins automatically.

No Python file to write. No Docker build. No image registry.
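
The retry behavior described in the steps above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the orchestrator's actual code: run_iterations, generate, and validate are hypothetical names, and plain string replacement stands in for the real Handlebars rendering.

```python
from typing import Callable, Optional, Tuple


def run_iterations(
    template: str,
    instruction: str,
    input_json: str,
    generate: Callable[[str], str],            # hypothetical stand-in for the LLM call
    validate: Callable[[str], Optional[str]],  # returns None on pass, error text on failure
    max_iterations: int = 5,
) -> Tuple[bool, int, str]:
    """Retry until validate() passes or max_iterations is reached."""
    previous_error = ""  # empty on iteration 1
    output = ""
    for iteration in range(1, max_iterations + 1):
        # Simplified rendering: the real template engine is Handlebars.
        prompt = (
            template.replace("{{instruction}}", instruction)
            .replace("{{input}}", input_json)
            .replace("{{previous_error}}", previous_error)
        )
        output = generate(prompt)
        error = validate(output)
        if error is None:
            return True, iteration, output
        previous_error = error  # fed into the next iteration's prompt
    return False, max_iterations, output
```

The function returns whether validation passed, how many iterations ran, and the last output, which mirrors the kind of information the execution record reports.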

Example Manifest

apiVersion: 100monkeys.ai/v1
kind: Agent

metadata:
  name: pr-reviewer
  version: "1.0.0"
  description: "Reviews pull request diffs and returns structured feedback."

spec:
  runtime:
    language: python
    version: "3.11"
    isolation: docker

  task:
    instruction: |
      Review the provided code diff and return structured feedback covering:
      - Security vulnerabilities or concerns
      - Performance issues or opportunities
      - Code quality and maintainability
      - Idiomatic patterns and best practices
      Provide specific line references where relevant. Be concise and actionable.
    prompt_template: |
      {{instruction}}

      User: {{input}}
      Reviewer:

  execution:
    mode: iterative
    max_iterations: 5

  security:
    network:
      mode: allow
      allowlist:
        - api.github.com
    filesystem:
      read:
        - /workspace
      write:
        - /workspace/output
    resources:
      cpu: 1000
      memory: "1Gi"
      timeout: "300s"

  volumes:
    - name: workspace
      storage_class: ephemeral
      mount_point: /workspace
      access_mode: read-write
      size_limit_mb: 2048
      ttl_hours: 1

Deploy it:

aegis agent deploy ./pr-reviewer.yaml

Run an execution, passing the diff as the {{input}} context:

aegis execute \
  --agent pr-reviewer \
  --input '{"diff": "<git diff output>", "repo": "my-org/my-repo", "pr": 42}' \
  --watch

The full contents of the --input JSON string become the {{input}} value the LLM sees. Structure it however is useful for your agent's instruction.
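
When the input contains quotes or newlines (a real diff usually does), hand-writing the JSON inside shell quotes gets error-prone. One option is to build the command from a script. The sketch below uses only the flags shown above (--agent, --input, --watch); build_execute_command is a hypothetical helper name, not part of the CLI.

```python
import json
import shlex
from typing import Any, List


def build_execute_command(agent: str, payload: Any, watch: bool = False) -> List[str]:
    """Return an argv list for `aegis execute` with a JSON-encoded --input."""
    argv = ["aegis", "execute", "--agent", agent, "--input", json.dumps(payload)]
    if watch:
        argv.append("--watch")
    return argv


if __name__ == "__main__":
    cmd = build_execute_command(
        "pr-reviewer",
        {"diff": "<git diff output>", "repo": "my-org/my-repo", "pr": 42},
        watch=True,
    )
    # shlex.join quotes each argument so the line can be pasted into a POSIX shell.
    print(shlex.join(cmd))
```

Passing the argv list directly to subprocess.run avoids shell quoting altogether; shlex.join is only needed when the command has to be printed or logged.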

Prompt Template Variables

Variable              Populated With
{{instruction}}       spec.task.instruction text from the manifest.
{{input}}             JSON string passed via --input at execution time.
{{iteration_number}}  Current iteration number (1-based).
{{previous_error}}    Validator failure output or error from the previous iteration. Empty on iteration 1; injected automatically on retries.
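
To preview what a template produces before deploying, the variable substitution can be approximated locally. The sketch below is a rough stand-in, assuming simple {{name}} placeholders only; it is not the Handlebars engine the orchestrator uses, and render is a hypothetical helper.

```python
import re
from typing import Mapping

# Matches {{name}} with optional whitespace inside the braces.
_VAR = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, values: Mapping[str, object]) -> str:
    """Replace each {{name}} with values[name]; leave unknown names untouched."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)

    # Single pass: text inserted for one variable is not scanned again.
    return _VAR.sub(substitute, template)


if __name__ == "__main__":
    template = "{{instruction}}\n\nUser: {{input}}\nReviewer:"
    print(
        render(
            template,
            {
                "instruction": "Review the provided code diff.",
                "input": '{"pr": 42}',
                "iteration_number": 1,
                "previous_error": "",
            },
        )
    )
```

Because substitution happens in one pass, an input payload that happens to contain text like {{instruction}} is inserted literally rather than expanded.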

Use {{previous_error}} in the prompt template to give the LLM explicit feedback about what went wrong in earlier attempts:

task:
  prompt_template: |
    {{instruction}}

    User: {{input}}
    {{#if previous_error}}
    Your previous attempt failed with the following error. Fix it:
    {{previous_error}}
    {{/if}}
    Reviewer:

Setting a Specific Model

By default the agent uses the default alias defined in aegis-config.yaml. Override per-agent with spec.runtime.model:

spec:
  runtime:
    language: python
    version: "3.11"
    model: reasoning          # maps to an alias in aegis-config.yaml llm.aliases
    isolation: docker

Agent Lifecycle Commands

Deploy an Agent

aegis agent deploy ./my-agent/agent.yaml

On success, the agent is registered with status: deployed and assigned a UUID. The manifest is validated before acceptance — invalid manifests are rejected with a specific error.

Deployed agent "python-coder" (id: a1b2c3d4-0000-0000-0000-000000000001)

To deploy and immediately print the full agent record as JSON:

aegis agent deploy ./my-agent/agent.yaml --output json

List Agents

# Table format (default)
aegis agent list

# JSON format — useful for scripting
aegis agent list --output json

Example table output:

ID                                    NAME           STATUS     RUNTIME  LABELS
a1b2c3d4-0000-...                     python-coder   deployed   docker   type=developer
b2c3d4e5-0000-...                     code-reviewer  paused     docker
c3d4e5f6-0000-...                     security-scan  deployed   docker   team=security

Filter by status:

aegis agent list --status deployed
aegis agent list --status paused

Inspect an Agent

# By ID
aegis agent get a1b2c3d4-0000-0000-0000-000000000001

# By name
aegis agent get python-coder

Outputs the full agent record including the stored manifest.

Update an Agent

aegis agent deploy ./my-agent/agent.yaml

Re-deploying an agent with the same metadata.name updates the manifest in place. Running executions are not affected: they continue using the manifest version that was active when they started.

Pause an Agent

Pausing prevents new executions but does not affect currently running ones.

aegis agent pause python-coder
# or by ID:
aegis agent pause a1b2c3d4-0000-0000-0000-000000000001

Resume a Paused Agent

aegis agent resume python-coder

Delete an Agent

Deleting an agent archives it (a soft delete). Historical execution records are retained, but the agent itself cannot be restored after deletion.

aegis agent delete python-coder

To delete without the confirmation prompt:

aegis agent delete python-coder --yes

Running Executions

Start an Execution

aegis execute \
  --agent python-coder \
  --input '{"task": "Write a function that reverses a linked list."}'

Returns the execution ID immediately:

Execution started: a1b2c3d4-1111-0000-0000-000000000001

Stream Execution Output

Use --watch to stream iteration events to the terminal:

aegis execute \
  --agent python-coder \
  --input '{"task": "Write a function that reverses a linked list."}' \
  --watch

Example output:

[2026-02-23T10:00:01Z] Execution a1b2c3d4-1111-... Started
[2026-02-23T10:00:01Z] Iteration 1 Started
[2026-02-23T10:00:03Z] Tool: fs.write /workspace/solution.py (234 bytes)
[2026-02-23T10:00:04Z] Tool: cmd.run python /workspace/solution.py
[2026-02-23T10:00:05Z] Tool: fs.write /workspace/result.json (89 bytes)
[2026-02-23T10:00:06Z] Iteration 1 Completed
[2026-02-23T10:00:06Z] Validation: exit_code=PASS json_schema=PASS (score=1.0)
[2026-02-23T10:00:06Z] Execution a1b2c3d4-1111-... Completed (1 iteration, 5.2s)
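
If the streamed output needs to be processed by a script, each line can be split into a timestamp and a message. The sketch below assumes every event line has the "[timestamp] message" shape shown in the sample above; the actual stream format is not specified here, and WatchEvent and parse_watch_line are hypothetical names.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

# "[<timestamp>] <message>", as in the sample output.
_EVENT = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<message>.*)$")


class WatchEvent(NamedTuple):
    timestamp: datetime
    message: str


def parse_watch_line(line: str) -> Optional[WatchEvent]:
    """Parse one streamed line; return None if it does not look like an event."""
    match = _EVENT.match(line.strip())
    if match is None:
        return None
    try:
        # The sample timestamps end in "Z" (UTC); parse that exact shape.
        ts = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return None
    return WatchEvent(ts.replace(tzinfo=timezone.utc), match["message"])


if __name__ == "__main__":
    event = parse_watch_line("[2026-02-23T10:00:06Z] Iteration 1 Completed")
    print(event)
```

For anything beyond quick inspection, the JSON record from aegis execution get is likely a more stable source than parsing terminal output.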

Control Max Iterations

Override the iteration limit (10 by default) for a single execution:

aegis execute \
  --agent python-coder \
  --input '{"task": "..."}' \
  --max-iterations 3

Inspecting Executions

List Executions

# All recent executions
aegis execution list

# Filter by agent
aegis execution list --agent python-coder

# Filter by status
aegis execution list --status completed
aegis execution list --status failed
aegis execution list --status running

# Limit results
aegis execution list --limit 20

Get Execution Details

aegis execution get a1b2c3d4-1111-0000-0000-000000000001

Returns a full execution record, including:

  • Status
  • Start and end timestamps
  • Total iterations
  • Each iteration's status, duration, and validation scores

Get Iteration Logs

# All iterations
aegis execution logs a1b2c3d4-1111-0000-0000-000000000001

# Specific iteration
aegis execution logs a1b2c3d4-1111-0000-0000-000000000001 --iteration 2

Iteration logs include the LLM's final output text and any tool call summaries for that iteration.

Cancel a Running Execution

aegis execution cancel a1b2c3d4-1111-0000-0000-000000000001

Cancellation stops the current iteration's container, releases any held locks, and marks the execution as cancelled.


Scripting with JSON Output

All CLI commands support --output json for machine-readable output. This is useful for CI/CD pipelines:

# Deploy and capture the agent ID
AGENT_ID=$(aegis agent deploy ./agent.yaml --output json | jq -r '.id')

# Start an execution and capture the execution ID
EXEC_ID=$(aegis execute --agent python-coder \
  --input '{"task": "..."}' \
  --output json | jq -r '.id')

# Poll until complete
while true; do
  STATUS=$(aegis execution get "$EXEC_ID" --output json | jq -r '.status')
  echo "Status: $STATUS"
  if [[ "$STATUS" == "completed" || "$STATUS" == "failed" || "$STATUS" == "cancelled" ]]; then
    break
  fi
  sleep 5
done

echo "Final status: $STATUS"
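
The same polling loop can be written in Python when a pipeline already uses it. This sketch shells out to the aegis execution get command shown above and reads the same .status field; the timeout, the injectable fetch and sleep parameters, and the function names are additions for illustration rather than anything the CLI provides.

```python
import json
import subprocess
import time
from typing import Callable

# Statuses the shell loop above treats as final.
TERMINAL = {"completed", "failed", "cancelled"}


def fetch_status(exec_id: str) -> str:
    """Run `aegis execution get <id> --output json` and return its status field."""
    result = subprocess.run(
        ["aegis", "execution", "get", exec_id, "--output", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)["status"]


def wait_for_execution(
    exec_id: str,
    fetch: Callable[[str], str] = fetch_status,
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the execution reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch(exec_id)
        if status in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"execution {exec_id} still {status!r} after {timeout}s")
        sleep(interval)
```

Unlike the shell loop, this version gives up after a deadline, which keeps a stuck execution from blocking a CI job indefinitely.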

Configuration and Flags

Flag                        Description
--config <path>             Path to aegis-config.yaml. Defaults to ./aegis-config.yaml.
--daemon-addr <host:port>   Target a specific daemon. Defaults to localhost:9090.
--output json|table|yaml    Output format. Defaults to table.
--yes                       Skip confirmation prompts.
