
Troubleshooting

This guide covers common problems and their solutions.

Connection Issues

Server Not Responding

Problem: cyberian status fails with a connection error.

Symptoms:

Error: Connection refused

Solutions:

  1. Check if server is running:
cyberian list-servers

If no servers are listed, start one:

cyberian server start claude --skip-permissions
  1. Verify port:
# Make sure you're checking the right port
cyberian status --port 3284
  1. Check firewall:

Ensure localhost connections on the port are allowed.

  1. Wait for startup:

Server might still be starting:

sleep 3
cyberian status
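
A fixed sleep can be too short on a slow machine and needlessly long on a fast one. As an alternative, here is a minimal sketch that polls the port until it accepts TCP connections. The helper name wait_for_port is hypothetical and not part of cyberian; the default port 3284 is taken from the examples above. It only shows that something is listening on the port, not that the server is fully ready.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 3284,
                  timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and try again.
            time.sleep(interval)
    return False
```

Calling wait_for_port() before cyberian status replaces the guesswork of a hard-coded delay with an explicit upper bound.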

Wrong Port

Problem: Server is running, but on a different port.

Solution:

# Find all running servers
cyberian list-servers

# Look for agentapi processes and their ports
# Connect to the correct port
cyberian status --port <correct_port>

Cannot Connect to Remote Server

Problem: Can't connect to server on another machine.

Solutions:

  1. Check CORS settings:

Server must be started with appropriate CORS configuration:

cyberian server start claude \
  --allowed-origins "http://your-client-host:port" \
  --allowed-hosts "your-server-host"
  1. Check network:
# Test basic connectivity
ping server-hostname

# Test port is open
nc -zv server-hostname 3284
  1. Firewall:

Ensure the port is open on the server's firewall.

Timeout Issues

Message Timeouts

Problem: Messages time out before the agent completes.

Symptoms:

Error: Timeout waiting for agent response

Solutions:

  1. Increase timeout:
# Default is 60 seconds, increase as needed
cyberian message "Complex task" --sync --timeout 300
  1. Use fire-and-forget:
# Don't wait for response
cyberian message "Long running task"

# Check later
cyberian messages --last 1
  1. Check agent is working:
# Monitor status
watch -n 2 cyberian status

Status should change from idle → busy → idle.

Workflow Timeouts

Problem: Workflow tasks time out.

Solution:

# Increase per-task timeout (default: 300 seconds)
cyberian run workflow.yaml --timeout 600

Or break tasks into smaller steps:

# Instead of one large task
subtasks:
  big_task:
    instructions: "Do everything. COMPLETION_STATUS: COMPLETE"

# Break into smaller tasks
subtasks:
  step1:
    instructions: "Do part 1. COMPLETION_STATUS: COMPLETE"
  step2:
    instructions: "Do part 2. COMPLETION_STATUS: COMPLETE"
  step3:
    instructions: "Do part 3. COMPLETION_STATUS: COMPLETE"

Workflow Issues

Task Never Completes

Problem: Workflow hangs, task never finishes.

Symptoms:

cyberian waits indefinitely, agent shows busy status.

Cause: Agent didn't output COMPLETION_STATUS: COMPLETE.

Solutions:

  1. Check instructions:

Ensure COMPLETION_STATUS: COMPLETE is in the instructions:

subtasks:
  task:
    # The instructions must include the completion marker
    instructions: |
      Do something.
      COMPLETION_STATUS: COMPLETE
  1. Make it explicit:
subtasks:
  task:
    instructions: |
      Do your task.

      When done, you MUST output exactly:
      COMPLETION_STATUS: COMPLETE
  1. Check agent output:
# View conversation
cyberian messages --last 10

Look for what the agent actually output.
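
When reading through saved agent output, near-misses such as a lowercased or reworded marker are easy to overlook. The following is a rough sketch for scanning a block of text. It assumes the marker must appear verbatim, which the text above implies but does not spell out. The names inspect_output and NEAR_MISS are hypothetical and not part of cyberian.

```python
import re

MARKER = "COMPLETION_STATUS: COMPLETE"
# Loose pattern: catches variants such as "Completion status - complete".
NEAR_MISS = re.compile(r"completion[\s_-]*status", re.IGNORECASE)


def inspect_output(text: str) -> tuple[bool, list[str]]:
    """Return (marker_found, near_miss_lines) for a block of agent output."""
    found = False
    near_misses = []
    for line in text.splitlines():
        if MARKER in line:
            found = True
        elif NEAR_MISS.search(line):
            # Looks like an attempt at the marker, but not the exact string.
            near_misses.append(line.strip())
    return found, near_misses
```

A result of (False, [...]) with a non-empty list suggests the agent tried to signal completion but did not reproduce the marker exactly, which points back to making the instructions more explicit.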

Template Variables Not Rendering

Problem: Rendered instructions contain a literal {{variable}} instead of its value.

Symptoms:

Instructions contain: "Research {{query}}"

Cause: Parameter not provided or misspelled.

Solutions:

  1. Check parameter name:
params:
  query:  # Must match exactly
    range: string
    required: true
# Must use exact parameter name
cyberian run workflow.yaml --param query="value"
  1. Guard optional parameters with a conditional:
params:
  optional_param:
    range: string
    required: false

subtasks:
  task:
    instructions: |
      {% if optional_param %}
      Use: {{optional_param}}
      {% else %}
      No parameter provided
      {% endif %}
      COMPLETION_STATUS: COMPLETE
  1. Test template rendering:

Add debug output:

subtasks:
  debug:
    instructions: |
      Debug info:
      - query: {{query}}
      - defined: {% if query is defined %}YES{% else %}NO{% endif %}
      COMPLETION_STATUS: COMPLETE
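
Misspelled parameter names can also be caught before running the workflow. Below is a minimal sketch, using only the Python standard library, that compares the simple {{name}} placeholders in an instruction string against the declared parameter names. The helper undeclared_placeholders is hypothetical, and the regular expression deliberately ignores filters, attribute access, and {% ... %} blocks, so treat it as a quick sanity check and not a substitute for real template rendering.

```python
import re

# Matches simple placeholders such as {{query}} or {{ query }}.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def undeclared_placeholders(instructions: str, declared: set[str]) -> set[str]:
    """Return placeholder names used in the text but missing from `declared`."""
    used = set(PLACEHOLDER.findall(instructions))
    return used - declared
```

For example, passing "Research {{qeury}}" with declared={"query"} returns {"qeury"}, which makes the typo visible immediately.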

Success Criteria Always Fails

Problem: Success criteria keeps retrying and failing.

Symptoms:

Retry 1/3: Success criteria not met
Retry 2/3: Success criteria not met
Retry 3/3: Success criteria not met
Error: Task failed after max retries

Solutions:

  1. Debug the criteria:
success_criteria:
  python: |
    import os
    import sys

    # Debug output
    print("Checking file: results.txt", file=sys.stderr)
    print("CWD:", os.getcwd(), file=sys.stderr)
    print("Files:", os.listdir('.'), file=sys.stderr)

    # Actual check
    result = os.path.exists("results.txt")
    print("Result:", result, file=sys.stderr)
  max_retries: 1
  1. Check working directory:

Success criteria runs in the workflow's working directory:

cyberian run workflow.yaml --dir /tmp/workspace

Ensure files are created there.

  1. Simplify the check:
# Complex check that might fail (missing file, invalid JSON, missing key)
success_criteria:
  python: |
    import json
    with open("output.json") as f:
      data = json.load(f)
    result = data['status'] == 'success'

# Simpler check
success_criteria:
  python: |
    import os
    result = os.path.exists("output.json")
  max_retries: 2
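
A middle ground between the two checks is to keep the content check but make every failure mode return False without raising an exception. The sketch below assumes, as in the examples above, that the criteria code reports its outcome by setting a variable named result. The function name output_succeeded is hypothetical.

```python
import json
import os


def output_succeeded(path: str = "output.json") -> bool:
    """Return True only if the file exists, parses as JSON, and reports success."""
    if not os.path.exists(path):
        return False
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError):
        # Unreadable file or invalid JSON (JSONDecodeError is a ValueError).
        return False
    # A non-dict payload (e.g. a list) cannot carry a "status" key.
    return isinstance(data, dict) and data.get("status") == "success"


# Following the examples above, the outcome is reported via `result`.
result = output_succeeded()
```

Because each failure path returns False explicitly, a retry is caused by the output being wrong, not by the check itself crashing.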

Server Issues

Port Already in Use

Problem: Can't start server, port is occupied.

Symptoms:

Error: Address already in use

Solutions:

  1. Find what's using the port:
cyberian list-servers
  1. Use different port:
cyberian server start claude --port 3285
  1. Stop existing server:
cyberian stop --port 3284
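
To pick an alternative port without trial and error, a short script can probe for a free one. This is a sketch under the assumption that the server listens on a local TCP port; is_port_free and first_free_port are hypothetical helpers, and 3284 is the port used in the examples above. A port reported as free can still be taken by another process before the server starts, so treat the result as a hint.

```python
import socket
from typing import Optional


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "Address already in use".
            return False
    return True


def first_free_port(start: int = 3284, attempts: int = 20) -> Optional[int]:
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if is_port_free(port):
            return port
    return None
```

The value returned by first_free_port() can then be passed to the --port option shown above.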

Multiple Servers Running

Problem: Too many servers, system is slow.

Solution:

# List all servers
cyberian list-servers

# Stop them
cyberian list-servers | grep agentapi | awk '{print $1}' | while read pid; do
  cyberian stop "$pid"
done

Server Crashes

Problem: Server stops unexpectedly.

Solutions:

  1. Check logs:

Look in the server's working directory for log files.

  1. Check system resources:
# Memory usage
free -h

# CPU usage
top

Agent might be running out of memory.

  1. Restart with fresh state:
cyberian stop --port 3284
sleep 2
cyberian server start claude --port 3284 --dir /tmp/fresh-workspace

Permission Issues

Permission Denied Errors

Problem: Agent can't access files or directories.

Solutions:

  1. Check directory permissions:
# Ensure the owner can read, write, and enter the directory
chmod 755 /path/to/workdir
  1. Use accessible directory:
# Use /tmp for testing
cyberian server start claude --dir /tmp/test-workspace
  1. Check file ownership:
ls -la /path/to/workdir

Ensure your user owns the files.
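
The same checks can be gathered into one report. The sketch below uses os.access and file ownership to summarise whether the current user can use a directory as a working directory. It assumes a POSIX system (os.getuid is not available on Windows), and describe_access is a hypothetical helper name.

```python
import os
from pathlib import Path


def describe_access(path: str) -> dict[str, bool]:
    """Report whether the current user can use `path` as a working directory."""
    p = Path(path)
    exists = p.is_dir()
    return {
        "exists": exists,
        "readable": exists and os.access(p, os.R_OK),
        "writable": exists and os.access(p, os.W_OK),
        # Entering a directory requires the execute (search) bit.
        "enterable": exists and os.access(p, os.X_OK),
        # POSIX only: compare the directory owner with the current user.
        "owned": exists and p.stat().st_uid == os.getuid(),
    }
```

Any False value in the result points at the step to revisit: create the directory, fix its mode with chmod, or change its owner.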

Farm Issues

Farm Won't Start

Problem: cyberian farm start fails.

Solutions:

  1. Check YAML syntax:
# Validate YAML
python -c "import yaml; yaml.safe_load(open('farm.yaml'))"
  1. Check directories exist:
servers:
  - name: worker1
    directory: /tmp/worker1  # Must be writable
# Create directories first
mkdir -p /tmp/worker1 /tmp/worker2
  1. Check ports available:
base_port: 4000  # Ensure 4000, 4001, etc. are free
cyberian list-servers  # Check for conflicts

Template Directory Not Copying

Problem: Files from template_directory don't appear.

Solutions:

  1. Check path is relative to farm config:
# If farm.yaml is in /home/user/farms/
template_directory: my-template  # Looks in /home/user/farms/my-template
  1. Use absolute path:
template_directory: /absolute/path/to/template
  1. Check directory exists:
ls -la farm-template/
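
The lookup rule described above can be sketched in a few lines, which helps when working out which directory a relative template_directory actually points to. This illustrates the rule as stated in this guide and is not cyberian's own code; resolve_template_dir is a hypothetical helper.

```python
from pathlib import Path


def resolve_template_dir(config_path: str, template_directory: str) -> Path:
    """Resolve template_directory relative to the farm config file's directory."""
    template = Path(template_directory)
    if template.is_absolute():
        # Absolute paths are used as given.
        return template
    # Relative paths are joined to the directory containing the farm config.
    return Path(config_path).resolve().parent / template
```

Printing resolve_template_dir("farm.yaml", "my-template") from the directory where the farm is started shows the full path to check with ls.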

Template Issues

Jinja2 Syntax Errors

Problem: Workflow fails with template error.

Symptoms:

Error: Jinja2 syntax error at line 5

Solutions:

  1. Check for unclosed tags:
# Bad: the closing endif tag is missing
instructions: |
  {% if condition %}
  Do something

# Good
instructions: |
  {% if condition %}
  Do something
  {% endif %}
  1. Check for typos:
# Bad
{% fi condition %}  # Should be 'if'

# Good
{% if condition %}
  1. Test templates:

Use Python to test:

python3 << 'EOF'
from jinja2 import Template
t = Template("{% if x %}Y{% endif %}")
print(t.render(x=True))
EOF

Debug Techniques

Enable Verbose Output

Add debug instructions:

subtasks:
  debug:
    instructions: |
      List current directory:
      ls -la

      Show environment:
      env

      COMPLETION_STATUS: COMPLETE

Check Conversation History

# See all messages
cyberian messages

# Last 5 messages
cyberian messages --last 5

# As YAML for readability
cyberian messages --format yaml --last 10

Monitor Server Status

# Watch status change
watch -n 1 cyberian status

# Or in a loop
while true; do
  cyberian status
  sleep 2
done

Getting Help

If you're still stuck:

  1. Check the logs in the agent's working directory
  2. Simplify: create a minimal reproduction
  3. File an issue at https://github.com/monarch-initiative/cyberian/issues

Include:

  • Your command
  • The error message
  • Relevant workflow YAML
  • Output of cyberian list-servers and cyberian status
