Troubleshooting

This guide covers common problems and their solutions.

Connection Issues

Server Not Responding

Problem: cyberian status fails with a connection error.

Symptoms:

Error: Connection refused

Solutions:

Check whether a server is running:

cyberian list-servers

If no servers are listed, start one:

cyberian server start claude --skip-permissions

Verify port:

# Make sure you're checking the right port
cyberian status --port 3284

Check firewall:

Ensure localhost connections on the port are allowed.

Wait for startup:

The server might still be starting, so wait a few seconds and retry:

sleep 3
cyberian status
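
A fixed sleep can be too short or too long. As a rough sketch, the wait can be replaced by polling the port until something accepts TCP connections. The function name wait_for_port and its defaults are illustrative and not part of cyberian; port 3284 is the one used in the examples above.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the server is not ready yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("localhost", 3284) returns True as soon as the port is open. This only confirms that a process is listening; run cyberian status afterwards to confirm the server itself is healthy. The same check works against a remote hostname.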

Wrong Port

Problem: The server is running, but on a different port.

Solution:

# Find all running servers
cyberian list-servers

# Look for agentapi processes and their ports
# Connect to the correct port
cyberian status --port <correct_port>

Cannot Connect to Remote Server

Problem: Can't connect to a server on another machine.

Solutions:

Check CORS settings:

The server must be started with a CORS configuration that allows your client:

cyberian server start claude \
  --allowed-origins "http://your-client-host:port" \
  --allowed-hosts "your-server-host"

Check network:

# Test basic connectivity
ping server-hostname

# Test that the port is open
nc -zv server-hostname 3284

Firewall:

Ensure the port is open on the server's firewall.

Timeout Issues

Message Timeouts

Problem: Messages time out before the agent completes.

Symptoms:

Error: Timeout waiting for agent response

Solutions:

Increase timeout:

# Default is 60 seconds, increase as needed
cyberian message "Complex task" --sync --timeout 300

Use fire-and-forget:

# Don't wait for response
cyberian message "Long running task"

# Check later
cyberian messages --last 1

Check that the agent is working:

# Monitor status
watch -n 2 cyberian status

Status should change from idle → busy → idle.
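
The idle → busy → idle transition can also be checked in a script rather than by eye. The sketch below is a minimal example that takes a caller-supplied get_status function; how that function obtains the status (for instance by parsing the output of cyberian status) is left open, because the output format is not described in this guide. The function name and defaults are hypothetical.

```python
import time
from typing import Callable, List


def wait_for_busy_then_idle(get_status: Callable[[], str],
                            timeout: float = 300.0,
                            interval: float = 2.0) -> List[str]:
    """Poll get_status until 'busy' has been seen and the status is 'idle' again.

    Returns the de-duplicated sequence of observed statuses.
    Raises TimeoutError if the transition does not complete in time.
    """
    seen: List[str] = []
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        # Record the status only when it differs from the previous one.
        if not seen or seen[-1] != status:
            seen.append(status)
        if status == "idle" and "busy" in seen:
            return seen
        time.sleep(interval)
    raise TimeoutError(f"agent did not return to idle; observed: {seen}")
```

If the call times out with only busy observed, the agent is probably still working and a longer --timeout may be enough. If busy is never observed, the message may not have reached the agent.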

Workflow Timeouts

Problem: Workflow tasks time out.

Solution:

# Increase per-task timeout (default: 300 seconds)
cyberian run workflow.yaml --timeout 600

Or break tasks into smaller steps:

# Instead of one large task
subtasks:
  big_task:
    instructions: "Do everything. COMPLETION_STATUS: COMPLETE"

# Break into smaller tasks
subtasks:
  step1:
    instructions: "Do part 1. COMPLETION_STATUS: COMPLETE"
  step2:
    instructions: "Do part 2. COMPLETION_STATUS: COMPLETE"
  step3:
    instructions: "Do part 3. COMPLETION_STATUS: COMPLETE"

Workflow Issues

Task Never Completes

Problem: Workflow hangs, task never finishes.

Symptoms:

cyberian waits indefinitely and the agent stays in busy status.

Cause: The agent did not output COMPLETION_STATUS: COMPLETE.

Solutions:

Check instructions:

Ensure COMPLETION_STATUS: COMPLETE is in the instructions:

subtasks:
  task:
    # The instructions must include the completion marker
    instructions: |
      Do something.
      COMPLETION_STATUS: COMPLETE

Make it explicit:

subtasks:
  task:
    instructions: |
      Do your task.

      When done, you MUST output exactly:
      COMPLETION_STATUS: COMPLETE

Check agent output:

# View conversation
cyberian messages --last 10

Look for what the agent actually output.
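
When the conversation is long, a small script can scan the saved output for the marker. This is a sketch under the assumption that the marker always has the form COMPLETION_STATUS: followed by an uppercase word; cyberian's own matching rules may differ, and the function name is hypothetical.

```python
import re
from typing import Optional

# Assumption: the marker is "COMPLETION_STATUS:" followed by an uppercase word.
MARKER = re.compile(r"COMPLETION_STATUS:\s*([A-Z_]+)")


def last_completion_status(agent_output: str) -> Optional[str]:
    """Return the last status the agent reported, or None if no marker is present."""
    matches = MARKER.findall(agent_output)
    return matches[-1] if matches else None
```

A result of None means the agent never emitted the marker, which points back to the instructions. Note that the match is case-sensitive, so a paraphrase such as "status: complete" does not count.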

Template Variables Not Rendering

Problem: The workflow shows a literal {{variable}} instead of its value.

Symptoms:

Instructions contain: "Research {{query}}"

Cause: Parameter not provided or misspelled.

Solutions:

Check parameter name:

params:
  query:  # Must match exactly
    range: string
    required: true

# Must use exact parameter name
cyberian run workflow.yaml --param query="value"

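Check whether the parameter is required:

An optional parameter (required: false) may be omitted, so guard its use with a conditional: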

params:
  optional_param:
    range: string
    required: false

subtasks:
  task:
    instructions: |
      {% if optional_param %}
      Use: {{optional_param}}
      {% else %}
      No parameter provided
      {% endif %}
      COMPLETION_STATUS: COMPLETE

Test template rendering:

Add debug output:

subtasks:
  debug:
    instructions: |
      Debug info:
      - query: {{query}}
      - defined: {% if query is defined %}YES{% else %}NO{% endif %}
      COMPLETION_STATUS: COMPLETE
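
A misspelled parameter can also be caught before running the workflow. The sketch below is a rough, standard-library-only check that lists simple {{ name }} placeholders with no matching --param value. It does not use Jinja2's parser, so it ignores variables that appear only inside {% ... %} tags, filters, and loop variables; the function name is hypothetical.

```python
import re
from typing import Iterable, List

# Matches the variable name at the start of a {{ ... }} expression.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)")


def missing_params(template_text: str, provided: Iterable[str]) -> List[str]:
    """Return placeholder names used in the template but not in `provided`."""
    used = set(PLACEHOLDER.findall(template_text))
    return sorted(used - set(provided))
```

For example, missing_params("Research {{query}}", ["qeury"]) returns ["query"], which makes the typo in the supplied name visible.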

Success Criteria Always Fails

Problem: The success criteria check fails on every retry.

Symptoms:

Retry 1/3: Success criteria not met
Retry 2/3: Success criteria not met
Retry 3/3: Success criteria not met
Error: Task failed after max retries

Solutions:

Debug the criteria:

success_criteria:
  python: |
    import os
    import sys

    # Debug output
    print("Checking file: results.txt", file=sys.stderr)
    print("CWD:", os.getcwd(), file=sys.stderr)
    print("Files:", os.listdir('.'), file=sys.stderr)

    # Actual check
    result = os.path.exists("results.txt")
    print("Result:", result, file=sys.stderr)
  max_retries: 1

Check working directory:

The success criteria check runs in the workflow's working directory:

cyberian run workflow.yaml --dir /tmp/workspace

Ensure files are created there.

Simplify the check:

# Complex check that might fail
success_criteria:
  python: |
    import json

    with open("output.json") as f:
        data = json.load(f)
    result = data['status'] == 'success'

# Simpler check
success_criteria:
  python: |
    import os
    result = os.path.exists("output.json")
  max_retries: 2
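
If the content check is still needed, a middle ground is a check that returns False instead of raising when the file is missing or malformed. This is a sketch that assumes the same output.json layout as the example above (a JSON object with a status field); the function name is hypothetical.

```python
import json
import os


def output_is_success(path: str = "output.json") -> bool:
    """Return True only if `path` holds a JSON object whose status is 'success'."""
    if not os.path.exists(path):
        return False
    try:
        with open(path) as f:
            data = json.load(f)
    except (OSError, json.JSONDecodeError):
        # Unreadable or malformed file: treat as "not yet successful".
        return False
    return isinstance(data, dict) and data.get("status") == "success"
```

Inside a success_criteria block, the body of this function could be inlined and its value assigned to result, following the convention shown in the examples above.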

Server Issues

Port Already in Use

Problem: Can't start the server because the port is occupied.

Symptoms:

Error: Address already in use

Solutions:

Find what's using the port:

cyberian list-servers

Use different port:

cyberian server start claude --port 3285

Stop existing server:

cyberian stop --port 3284
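
To pick an alternative port without guessing, a script can try to bind candidate ports and report the first one that is free. This is a minimal sketch; the function names, the starting port 3284, and the search range are assumptions for illustration. Another process can still claim the port between the check and the server start.

```python
import socket
from typing import Optional


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "Address already in use".
            return False
        return True


def first_free_port(start: int = 3284, attempts: int = 20,
                    host: str = "127.0.0.1") -> Optional[int]:
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port, host):
            return port
    return None
```

The returned value can then be passed to cyberian server start claude --port.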

Multiple Servers Running

Problem: Too many servers are running and the system is slow.

Solution:

# List all servers
cyberian list-servers

# Stop them
cyberian list-servers | grep agentapi | awk '{print $1}' | while read -r pid; do
  cyberian stop "$pid"
done

Server Crashes

Problem: Server stops unexpectedly.

Solutions:

Check logs:

Look in the server's working directory for log files.

Check system resources:

# Memory usage
free -h

# CPU usage
top

The agent might be running out of memory.

Restart with fresh state:

cyberian stop --port 3284
sleep 2
cyberian server start claude --port 3284 --dir /tmp/fresh-workspace

Permission Issues

Permission Denied Errors

Problem: Agent can't access files or directories.

Solutions:

Check directory permissions:

# Make the directory writable by its owner (readable and traversable by others)
chmod 755 /path/to/workdir

Use accessible directory:

# Use /tmp for testing
cyberian server start claude --dir /tmp/test-workspace

Check file ownership:

ls -la /path/to/workdir

Ensure your user owns the files.
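
Permission bits and ownership do not always tell the whole story (for example on read-only mounts). A direct test is to create and remove a temporary file in the directory as the same user that runs the server. The sketch below uses a hypothetical helper name.

```python
import tempfile


def can_write(directory: str) -> bool:
    """Return True if a temporary file can be created (and removed) in `directory`."""
    try:
        # The file is deleted automatically when the context exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers missing directories, permission errors, and read-only filesystems.
        return False
```

Calling can_write("/path/to/workdir") before starting the server gives a quick yes/no answer for the working directory.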

Farm Issues

Farm Won't Start

Problem: cyberian farm start fails.

Solutions:

Check YAML syntax:

# Validate YAML
python -c "import yaml; yaml.safe_load(open('farm.yaml'))"

Check directories exist:

servers:
  - name: worker1
    directory: /tmp/worker1  # Must be writable

# Create directories first
mkdir -p /tmp/worker1 /tmp/worker2

Check that the ports are available:

base_port: 4000  # Ensure 4000, 4001, etc. are free

cyberian list-servers  # Check for conflicts

Template Directory Not Copying

Problem: Files from template_directory don't appear.

Solutions:

Check that the path is relative to the farm config:

# If farm.yaml is in /home/user/farms/
template_directory: my-template  # Looks in /home/user/farms/my-template

Use absolute path:

template_directory: /absolute/path/to/template

Check directory exists:

ls -la farm-template/
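
The resolution rule described above (relative paths are looked up next to the farm config, absolute paths are used as-is) can be reproduced in a few lines to see which directory is expected. This sketch mirrors the rule as stated in this guide; cyberian's actual implementation may differ, and the function name is hypothetical.

```python
from pathlib import Path


def resolve_template_dir(farm_config: str, template_directory: str) -> Path:
    """Resolve template_directory relative to the directory containing farm_config."""
    config_dir = Path(farm_config).absolute().parent
    candidate = Path(template_directory)
    # Absolute paths are returned unchanged; relative ones are joined to the config dir.
    return candidate if candidate.is_absolute() else config_dir / candidate
```

For instance, resolve_template_dir("/home/user/farms/farm.yaml", "my-template") yields /home/user/farms/my-template, which is the directory to inspect with ls.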

Template Issues

Jinja2 Syntax Errors

Problem: Workflow fails with template error.

Symptoms:

Error: Jinja2 syntax error at line 5

Solutions:

Check for unclosed tags:

# Bad: the closing {% endif %} is missing
instructions: |
  {% if condition %}
  Do something

# Good
instructions: |
  {% if condition %}
  Do something
  {% endif %}

Check for typos:

# Bad: 'fi' should be 'if'
{% fi condition %}

# Good
{% if condition %}

Test templates:

Use Python to test:

python3 << 'EOF'
from jinja2 import Template
t = Template("{% if x %}Y{% endif %}")
print(t.render(x=True))
EOF
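
When Jinja2 is not installed where you are editing, a rough standard-library check can still locate an unclosed tag. The sketch below tracks only if/for blocks and their endif/endfor closers, ignores everything else (including misspelled tags such as fi), and is not a substitute for Jinja2's own parser. The function name is hypothetical.

```python
import re
from typing import List, Tuple

# Captures the tag name at the start of a {% ... %} statement.
TAG = re.compile(r"\{%-?\s*(\w+)")
OPENERS = {"if": "endif", "for": "endfor"}
CLOSERS = set(OPENERS.values())


def find_unclosed_tags(template_text: str) -> List[Tuple[int, str]]:
    """Return (line_number, tag) for each if/for block that is never closed."""
    stack: List[Tuple[int, str]] = []
    for lineno, line in enumerate(template_text.splitlines(), start=1):
        for tag in TAG.findall(line):
            if tag in OPENERS:
                stack.append((lineno, tag))
            elif tag in CLOSERS and stack and OPENERS[stack[-1][1]] == tag:
                # The closer matches the innermost open block.
                stack.pop()
    return stack
```

An empty list means every tracked block is closed; otherwise the line numbers show where each unclosed block was opened.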

Codex-Specific Issues

First Message Ignored (Welcome Banner)

Problem: Codex shows startup banner instead of processing the first task.

Symptoms:

Welcome to OpenAI Codex!
Available commands:
/init - Initialize project
/approvals - Configure approval settings

This appears instead of the actual task output.

Cause: Fresh/untrusted directories trigger Codex's welcome flow.

Solutions:

Use --skip-permissions:

cyberian server start codex --skip-permissions

Configure config.toml:

# ~/.codex/config.toml
approval_policy = "never"
sandbox_mode = "danger-full-access"

Mark directory as trusted:

# ~/.codex/config.toml
[projects."/your/workspace"]
trust_level = "trusted"

cyberian auto-retry:

cyberian automatically detects the welcome banner and resends the first message.

Server Startup Timeout

Problem: "Server did not become ready within 30s" with Codex.

Cause: Codex takes longer to initialize than Claude.

Note: cyberian automatically extends the timeout to 120s for Codex.

If still failing:

Check Codex installation:

which codex
codex --version

Check agentapi logs:

cat /path/to/workspace/agentapi_stderr.log

Start manually to debug:

agentapi server codex --port 3284 -- --dangerously-bypass-approvals-and-sandbox

Codex Environment Mismatch

Problem: Codex works in terminal but not when spawned by cyberian.

Symptoms:

Server starts but Codex process fails
"codex: command not found" in logs
Different behavior than manual execution

Cause: PATH or environment differs when spawned as subprocess.

Solutions:

Check PATH:

# Where is codex?
which codex

# Is it in a standard location?
echo $PATH

Use absolute path (if needed):

Ensure the codex binary is in a standard PATH location, or configure Codex to inherit the full shell environment:

# ~/.codex/config.toml
[shell_environment_policy]
inherit = "all"

Check agentapi can find codex:

# Test agentapi directly
agentapi server codex --port 9999
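
To see how a lookup behaves under a specific PATH, such as one copied from the server's logs or environment, the search can be repeated with shutil.which. This is a small sketch with a hypothetical helper name; how cyberian builds the subprocess environment is not covered here.

```python
import os
import shutil
from typing import Optional


def explain_lookup(name: str, path: Optional[str] = None) -> str:
    """Describe where `name` resolves on `path` (default: this process's PATH)."""
    search_path = os.environ.get("PATH", "") if path is None else path
    found = shutil.which(name, path=search_path)
    if found:
        return f"{name} -> {found}"
    # Count the non-empty PATH entries that were searched.
    dirs = [d for d in search_path.split(os.pathsep) if d]
    return f"{name} not found in {len(dirs)} PATH entries"
```

Comparing explain_lookup("codex") from your interactive shell with the result for the subprocess PATH shows whether the two environments resolve the same binary.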

Approval Prompts Block Workflow

Problem: Workflow hangs, waiting for interactive approval.

Symptoms:

Status shows waiting indefinitely
No output from agent
Works fine when running codex manually

Cause: Codex approval policy requires user input.

Solutions:

Use --skip-permissions flag:

cyberian server start codex --skip-permissions

Set approval_policy in config.toml:

# ~/.codex/config.toml
approval_policy = "never"

Use automation profile:

# ~/.codex/config.toml
profile = "automation"

[profiles.automation]
approval_policy = "never"
sandbox_mode = "danger-full-access"

Sandbox Blocks File Operations

Problem: Codex can't read/write files despite correct instructions.

Symptoms:

"Permission denied" errors
File operations silently fail
Works in terminal but not via cyberian

Cause: Sandbox mode is restricting filesystem access.

Solutions:

Use full access mode:

# ~/.codex/config.toml
sandbox_mode = "danger-full-access"

Or allow specific paths:

# ~/.codex/config.toml
sandbox_mode = "workspace-write"

[sandbox_workspace_write]
writable_roots = ["/tmp/cyberian", "~/projects"]

Verify directory permissions:

ls -la /path/to/workspace
# Ensure user has write access

Different Behavior in Fresh Directories

Problem: Workflow works in existing project but fails in new directory.

Cause: Codex trusts established projects but not fresh directories.

Solutions:

Pre-create and trust the directory:

mkdir -p /path/to/workspace

# ~/.codex/config.toml
[projects."/path/to/workspace"]
trust_level = "trusted"

Use existing trusted directory:

cyberian run workflow.yaml --dir ~/existing-project

Initialize project first:

# Create basic project structure
mkdir -p workspace && cd workspace
git init
echo "# Project" > README.md

Codex Model Issues

Problem: Wrong model being used or model errors.

Solutions:

Specify model in config.toml:

# ~/.codex/config.toml
model = "gpt-4-turbo"

Check API key:

Ensure OPENAI_API_KEY is set:

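# Report whether the key is set without printing the secret
[ -n "$OPENAI_API_KEY" ] && echo "OPENAI_API_KEY is set" || echo "OPENAI_API_KEY is not set"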

Check model availability:

Some models require specific access levels.

Debug Techniques

Enable Verbose Output

Add debug instructions:

subtasks:
  debug:
    instructions: |
      List current directory:
      ls -la

      Show environment:
      env

      COMPLETION_STATUS: COMPLETE

Check Conversation History

# See all messages
cyberian messages

# Last 5 messages
cyberian messages --last 5

# As YAML for readability
cyberian messages --format yaml --last 10

Monitor Server Status

# Watch status change
watch -n 1 cyberian status

# Or in a loop
while true; do
  cyberian status
  sleep 2
done

Getting Help

If you're still stuck:

Check the logs in the agent's working directory
Simplify: create a minimal reproduction
File an issue at https://github.com/monarch-initiative/cyberian/issues

Include:

Your command
The error message
Relevant workflow YAML
Output of cyberian list-servers and cyberian status

Related Guides

Send Messages - Message patterns
Manage Servers - Server lifecycle
Write Workflows - Workflow authoring
