Writing Your First Agent
Step-by-step guide to writing an agent.yaml manifest and bootstrap.py from scratch.
This guide walks through creating a working agent from scratch: an agent.yaml manifest and the bootstrap.py script that runs inside the container. By the end, you will have an agent that accepts a task prompt, writes code to a workspace volume, runs it, and validates the output.
Prerequisites
- AEGIS daemon running locally (see Getting Started)
- Python 3.11+ in your target container image
- The aegis-python SDK installed in the container image
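To confirm the last two prerequisites inside a candidate image, a small preflight script along these lines may help. It is only a sketch: the import name `aegis` is taken from the bootstrap.py shown later in this guide, and the helpers `python_ok` and `module_available` are illustrative names, not part of the AEGIS SDK.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 11)  # minimum version listed in the prerequisites


def python_ok(version=None, minimum=MIN_PYTHON) -> bool:
    """Return True when the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


def module_available(name: str) -> bool:
    """Return True when a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("python >= 3.11:", python_ok())
    # Assumption: the SDK is importable as `aegis`, as in bootstrap.py.
    print("aegis importable:", module_available("aegis"))
```

Running the script inside the container (for example with `docker run` and `python` as the command) reports both checks without starting the agent.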
Step 1: Create the Project Structure
my-agent/
├── agent.yaml
├── bootstrap.py
├── Dockerfile
└── output_schema.json
Step 2: Write the Dockerfile
Your agent runs inside the container image specified in the manifest. Build a minimal image with Python and the AEGIS SDK:
FROM python:3.11-slim
RUN pip install --no-cache-dir aegis-sdk
WORKDIR /agent
COPY bootstrap.py .
COPY output_schema.json .
CMD ["python", "/agent/bootstrap.py"]
Build and push the image:
docker build -t myregistry/my-agent:latest .
docker push myregistry/my-agent:latest
Step 3: Write bootstrap.py
bootstrap.py is the entrypoint for your agent. It uses the AEGIS Python SDK, which handles the /v1/llm/generate communication loop automatically.
import os
import json

from aegis import AegisClient, TaskInput


def main():
    client = AegisClient()

    # Receive the task input injected by the orchestrator
    task: TaskInput = client.get_task()
    user_request = task.input.get("task", "")

    # Build the initial conversation
    messages = [
        {
            "role": "system",
            "content": (
                "You are a Python developer. Write code to solve the given task. "
                "Save your solution to /workspace/solution.py. "
                "Run the solution with cmd.run to verify it works. "
                "When done, write a JSON summary to /workspace/result.json with "
                "fields: 'solution_path' and 'output'."
            )
        },
        {
            "role": "user",
            "content": user_request
        }
    ]

    # Run the inner loop; the SDK handles tool call interception transparently
    response = client.generate(messages=messages)

    # The response is the final text from the LLM after all tool calls are resolved
    print(response.content)


if __name__ == "__main__":
    main()
The SDK's client.generate() blocks until the LLM produces a non-tool-call response. All fs.*, cmd.run, and other capability calls are intercepted and executed by the orchestrator transparently.
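To make the blocking behaviour concrete, the sketch below imitates such a loop with scripted stand-ins. It is not the SDK's implementation: `Reply`, `run_inner_loop`, `fake_model`, and `fake_tool` are hypothetical names, and the tool-call shape is an assumption made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Reply:
    """One model turn: either final text or a request to call a tool."""
    content: str = ""
    tool_name: Optional[str] = None
    tool_args: dict = field(default_factory=dict)


def run_inner_loop(
    messages: list,
    call_model: Callable[[list], Reply],
    run_tool: Callable[[str, dict], str],
    max_turns: int = 20,
) -> Reply:
    """Call the model repeatedly until it replies without requesting a tool."""
    history = list(messages)  # copy so the caller's list is not mutated
    for _ in range(max_turns):
        reply = call_model(history)
        if reply.tool_name is None:
            return reply  # plain text: the loop is finished
        # Tool request: execute it and feed the result back to the model.
        result = run_tool(reply.tool_name, reply.tool_args)
        history.append({"role": "assistant", "content": f"call {reply.tool_name}"})
        history.append({"role": "tool", "content": result})
    raise RuntimeError("model kept requesting tools; giving up")


def fake_model(history: list) -> Reply:
    # Scripted stand-in: request one tool call, then answer with its result.
    if not any(m["role"] == "tool" for m in history):
        return Reply(tool_name="cmd.run",
                     tool_args={"command": "python /workspace/solution.py"})
    return Reply(content="done: " + history[-1]["content"])


def fake_tool(name: str, args: dict) -> str:
    # Stand-in for the orchestrator executing the capability call.
    return f"{name} ok"


final = run_inner_loop([{"role": "user", "content": "task"}], fake_model, fake_tool)
print(final.content)
```

The caller only ever sees the final `Reply`, which is the same shape of experience bootstrap.py has with `client.generate()`: intermediate tool calls never surface in the agent code.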
Step 4: Write the Output Schema
Define what a valid output looks like. The json_schema validator checks /workspace/result.json against this schema.
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["solution_path", "output"],
  "properties": {
    "solution_path": {
      "type": "string",
      "pattern": "^/workspace/"
    },
    "output": {
      "type": "string",
      "minLength": 1
    }
  },
  "additionalProperties": false
}
Step 5: Write the Agent Manifest
apiVersion: 100monkeys.ai/v1
kind: AgentManifest
metadata:
  name: python-coder
  version: "1.0.0"
  description: "Writes Python solutions to programming tasks."
  labels:
    role: worker
    team: platform
spec:
  runtime:
    language: python
    version: "3.11"
    isolation: docker
  task:
    instruction: |
      You are a Python developer. Write code to solve the given task.
      Save your solution to /workspace/solution.py.
      Run the solution to verify it works.
      Write a JSON summary to /workspace/result.json with fields: solution_path and output.
  security:
    network:
      mode: allow
      allowlist:
        - pypi.org
    filesystem:
      read:
        - /workspace
        - /agent
      write:
        - /workspace
    resources:
      cpu: 1000
      memory: "1Gi"
      timeout: "300s"
  volumes:
    - name: workspace
      storage_class: ephemeral
      mount_point: /workspace
      access_mode: read-write
      ttl_hours: 1
      size_limit_mb: 5000
  execution:
    mode: iterative
    max_iterations: 10
    validation:
      system:
        must_succeed: true
      output:
        format: json
        schema:
          type: object
          required: ["solution_path", "output"]
          properties:
            solution_path:
              type: string
            output:
              type: string
  tools:
    - name: filesystem
      server: "mcp:filesystem"
      config:
        allowed_paths: ["/workspace", "/agent"]
        access_mode: read-write
  env:
    PYTHONUNBUFFERED: "1"
Step 6: Deploy and Test
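Before deploying, it can be worth checking a sample result against the constraints of the Step 4 schema on your own machine. The sketch below re-implements those few constraints with the standard library only; it is not the orchestrator's json_schema validator, and `check_result` is a hypothetical helper name.

```python
import re

ALLOWED_FIELDS = {"solution_path", "output"}  # both are also required


def check_result(result: dict) -> list:
    """Return a list of problems; an empty list means the result looks valid."""
    problems = []
    for key in sorted(ALLOWED_FIELDS - set(result)):
        problems.append(f"missing required field: {key}")
    for key in sorted(set(result) - ALLOWED_FIELDS):
        # Mirrors "additionalProperties": false in the schema.
        problems.append(f"unexpected field: {key}")
    if "solution_path" in result:
        path = result["solution_path"]
        # Mirrors "type": "string" and "pattern": "^/workspace/".
        if not (isinstance(path, str) and re.search(r"^/workspace/", path)):
            problems.append("solution_path must be a string starting with /workspace/")
    if "output" in result:
        output = result["output"]
        # Mirrors "type": "string" and "minLength": 1.
        if not (isinstance(output, str) and len(output) >= 1):
            problems.append("output must be a non-empty string")
    return problems


good = {"solution_path": "/workspace/solution.py", "output": "True"}
bad = {"solution_path": "/tmp/solution.py", "output": ""}
print(check_result(good))
print(check_result(bad))
```

A full JSON Schema library would cover more of the draft-07 vocabulary; this hand-written check only tracks the keywords the Step 4 schema actually uses.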
Deploy the agent:
aegis agent deploy ./my-agent/agent.yaml
Confirm it's registered:
aegis agent get python-coder
Run an execution:
aegis execute \
  --agent python-coder \
  --input '{"task": "Write a function that checks if a number is prime."}' \
  --watch
If the first iteration fails validation, the orchestrator injects the error into the next iteration's context and retries automatically, up to the max_iterations limit (10 in this manifest).
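The retry behaviour described above can be pictured as a loop that feeds each validation error into the next attempt. This is a minimal sketch under stated assumptions, not the orchestrator's code: `run_with_retries`, `flaky_attempt`, and `require_output` are hypothetical names, and the feedback is modelled as a plain string.

```python
from typing import Callable, Optional, Tuple


def run_with_retries(
    attempt: Callable[[Optional[str]], dict],
    validate: Callable[[dict], Optional[str]],
    max_iterations: int = 10,
) -> Tuple[dict, int]:
    """Retry until validate() returns None or the iteration budget runs out."""
    feedback = None  # the first iteration has no previous error
    for iteration in range(1, max_iterations + 1):
        output = attempt(feedback)
        feedback = validate(output)  # None means the output passed
        if feedback is None:
            return output, iteration
    raise RuntimeError(
        f"no valid output after {max_iterations} iterations: {feedback}"
    )


def flaky_attempt(feedback: Optional[str]) -> dict:
    # Hypothetical agent: omits 'output' until it is told about the error.
    result = {"solution_path": "/workspace/solution.py"}
    if feedback is not None:
        result["output"] = "True"
    return result


def require_output(result: dict) -> Optional[str]:
    # Stand-in validator: return an error message, or None when valid.
    return None if "output" in result else "missing field: output"


output, used = run_with_retries(flaky_attempt, require_output)
print(used, output)
```

In this toy run the second iteration succeeds because the first error was passed back in; an attempt that never improves exhausts the budget and raises.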
Common Issues
| Symptom | Cause | Fix |
|---|---|---|
| Container fails to start | Image not found or not pullable | Verify spec.runtime.isolation and registry credentials in node config |
| Tool call rejected | Tool not declared in spec.tools | Add the tool to spec.tools in the manifest |
| Validation always fails | Schema path wrong | Double-check the spec.execution.validation.output.schema definition |
| Timeout on first iteration | Task too complex | Increase spec.security.resources.timeout or spec.execution.validation.system.timeout_seconds |
| Network call rejected | Domain not in allowlist | Add the domain to spec.security.network.allowlist |
Next Steps
- Deploying Agents — full CLI lifecycle management.
- Configuring Agent Validation — all validator types and threshold tuning.
- Agent Manifest Reference — complete field specification.
- Manifest Specification v1.0 — full specification with extended examples.