Hermes Agent Deployment Guide
Modular AI Agent framework with hot-swappable Skills, Feishu / Telegram / Discord integration, and ultra-lightweight configuration
🚀 Quick Install
pip install, one command to start, driven entirely by config.yaml
⚙️ config.yaml Reference
Every field explained, with best practices and examples for minimal and advanced configs
❌ Troubleshooting
Platform integration failures, Skill not firing, Gateway timeout: diagnosis and fix
💡 Tips & Tricks
Skill authoring, Memory, multi-agent, hot reload, debugging techniques
I. Installation
Requirements: Ubuntu 20.04+ or macOS 13+, Python 3.10+, and an internet connection. No GPU is required; an NVIDIA GPU with CUDA accelerates model inference where available.
Install Hermes CLI
pip install hermes-agent
hermes --version
Tip: Use a virtual environment to avoid system Python pollution: python -m venv hermes-env && source hermes-env/bin/activate && pip install hermes-agent
Initialize Project
hermes init ~/hermes-agent
cd ~/hermes-agent
Generates:
~/hermes-agent/
├── config.yaml    # Main config
├── skills/        # Skill directory (create manually)
├── memory/        # Memory storage
└── logs/          # Log files
Write config.yaml
See Section II for full field reference. Minimal required: agent.model, agent.api_key, gateway.port.
Note: YAML is indentation-sensitive. Use spaces (not tabs), 2 spaces per level.
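Tab characters in indentation are easy to miss in an editor. As a minimal sketch (the helper name find_tab_indents is hypothetical and not part of Hermes), the following check reports which lines of a config file are indented with tabs:

```python
def find_tab_indents(yaml_text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad_lines = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        # Everything before the first non-whitespace character is the indent.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad_lines.append(number)
    return bad_lines


sample = "agent:\n  name: my-hermes\n\tmodel: gpt-4o\n"
print(find_tab_indents(sample))  # [3]
```

To check a real file, pass in the result of open("config.yaml").read().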
Start
# Foreground (see live logs)
hermes start

# Daemon (background)
hermes start --daemon

# Check status
hermes status
After start: Gateway listens on http://localhost:8000. API endpoint: /v1/chat. Web UI: /ui (if enabled in config).
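Once the gateway is up, the /v1/chat endpoint can be exercised from a script. The sketch below only builds the request. It assumes the endpoint accepts a POST with a JSON body, and the body shape {"message": ...} is a guess for illustration, so check the actual request schema before relying on it.

```python
import json
import urllib.request


def build_chat_request(message, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the /v1/chat endpoint.

    The JSON body shape {"message": ...} is an assumption for illustration.
    """
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("hello")
print(request.get_method(), request.full_url)
# To send it against a running gateway:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(response.read().decode("utf-8"))
```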
II. config.yaml Reference
| Field | Type | Description |
|---|---|---|
| agent.name | string | Agent name for logs and debugging |
| agent.model | string | Default model ID: gpt-4o, claude-4-sonnet, deepseek-v3, etc. |
| agent.api_base | string | API base URL. Leave empty for OpenAI official. For FlowerWolf: https://api.flowerwolf.net/v1 |
| agent.api_key | string | Required. Get from flowerwolf.net/token_en.html |
| agent.max_tokens | int | Max tokens per reply, default 4096. Set to 8192 for complex tasks. |
| agent.temperature | float | Sampling temperature, 0.0–2.0, default 0.7 |
| agent.system_prompt | string | System prompt defining the Agent's role and behavior. Supports {{user_name}}, {{agent_name}}, {{current_time}}. |
| gateway.host | string | Bind address, default 0.0.0.0 (listens on all network interfaces) |
| gateway.port | int | HTTP port, default 8000 |
| gateway.allow_all_users | bool | true = allow all users; false = whitelist only |
| skills.dir | string | Skill directory, default ./skills |
| skills.autoload | bool | Auto-load all Skills on startup |
| memory.provider | string | Storage backend: sqlite (default), postgres, memory |
| memory.session_limit | int | Max messages per session before auto-summarization |
| platforms.feishu.* | bool/string | Feishu integration settings. Requires app_id, app_secret, bot_name. |
| platforms.telegram.* | bool/string | Telegram integration. Requires bot_token from @BotFather. |
| platforms.discord.* | bool/string | Discord integration. Requires bot_token and guild_id. |
| log.level | string | debug / info / warn / error |
| log.file | string | Log file path. Leave blank for stdout only. |
| cors.enabled | bool | Enable Cross-Origin Resource Sharing for frontend access |
| cors.origins | list | Allowed origin domains, e.g. ["https://flowerwolf.net"] |
Minimal Config
agent:
  name: my-hermes
  model: gpt-4o
  api_base: https://api.flowerwolf.net/v1
  api_key: your-token-here
gateway:
  host: "0.0.0.0"
  port: 8000
  allow_all_users: true
skills:
  dir: ./skills
  autoload: true
memory:
  provider: sqlite
  session_limit: 30
log:
  level: info
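Before starting Hermes, it can help to confirm that the three required fields from Section I (agent.model, agent.api_key, gateway.port) are present. The sketch below works on a plain nested dict standing in for the parsed config. Loading the YAML file with a parser is left out, and the helper name missing_required is hypothetical.

```python
REQUIRED_FIELDS = ("agent.model", "agent.api_key", "gateway.port")


def missing_required(config, required=REQUIRED_FIELDS):
    """Return dotted paths that are absent or empty in a nested config dict."""
    missing = []
    for path in required:
        node = config
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        if node in (None, ""):
            missing.append(path)
    return missing


# Stand-in for a parsed config.yaml with an empty api_key.
config = {
    "agent": {"name": "my-hermes", "model": "gpt-4o", "api_key": ""},
    "gateway": {"host": "0.0.0.0", "port": 8000},
}
print(missing_required(config))  # ['agent.api_key']
```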
III. Skill Authoring
Skills are Hermes's extension units. Each is a file in skills/, loaded automatically on startup.
name: summarize_text
description: Compresses long text into a summary. Use when the user says "summarize this", "too long", or "give me the key points".
version: "1.0"
trigger:
  keywords: ["summarize", "summary", "too long", "key points", "compress"]
action: python
script: |
  def summarize(text, ratio=0.3):
      # Naive extractive summary: keep the leading share of sentences.
      sentences = [s.strip() for s in text.replace("\n", " ").split(".") if s.strip()]
      kept = sentences[:max(1, int(len(sentences) * ratio))]
      return ". ".join(kept) + "."

  result = summarize(text="{{context}}", ratio=0.3)
  print(result)
Key fields:
• name: Unique identifier, no duplicates.
• description: Tells the Agent when to invoke this Skill. Be specific.
• trigger.keywords: Keywords that increase the likelihood of this Skill being selected.
• action: python (run Python script) or http (make HTTP request).
• {{context}}: Replaced with relevant conversation context before execution.
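Script logic is easier to debug outside Hermes first. Below is a standalone variant of the summarize function from the Skill above, run on a hard-coded sample instead of {{context}}. It drops empty fragments and keeps the first share of sentences, which is a deliberately naive approach to summarization.

```python
def summarize(text, ratio=0.3):
    """Keep the leading share of sentences given by `ratio` (at least one)."""
    sentences = [s.strip() for s in text.replace("\n", " ").split(".") if s.strip()]
    kept = sentences[: max(1, int(len(sentences) * ratio))]
    return ". ".join(kept) + "."


sample = (
    "Hermes loads Skills at startup. Each Skill is a separate file.\n"
    "Skills can be reloaded. Logs show loading details."
)
print(summarize(sample, ratio=0.5))
# Hermes loads Skills at startup. Each Skill is a separate file.
```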
HTTP Type Skill
name: weather_query
description: Looks up current weather for a city, e.g. "how's the weather in Tokyo"
version: "1.0"
trigger:
  keywords: ["weather", "temperature", "rain", "forecast"]
action: http
script:
  method: GET
  url: "https://api.weather.com/v3/wx/conditions/current?city={{city}}&key=YOUR_KEY"
  headers:
    Accept: application/json
Debug Skills: hermes skills list shows all loaded Skills. hermes skills test summarize_text runs a single Skill in isolation.
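Values such as city names need percent-encoding before they go into a query string. How Hermes fills {{city}} internally is not documented here, so the sketch below only illustrates the general technique with a hypothetical render_url helper. It is useful for checking what a rendered URL should look like.

```python
import re
from urllib.parse import quote

# Matches {{name}} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_url(template, values):
    """Fill {{name}} placeholders with percent-encoded values."""
    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for placeholder {name!r}")
        return quote(str(values[name]), safe="")
    return PLACEHOLDER.sub(replace, template)


url = render_url(
    "https://api.weather.com/v3/wx/conditions/current?city={{city}}&key=YOUR_KEY",
    {"city": "New York"},
)
print(url)
# https://api.weather.com/v3/wx/conditions/current?city=New%20York&key=YOUR_KEY
```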
IV. Troubleshooting
Feishu: DM Works But Group Messages Don't Trigger
The most common issue. Check in order:
1. In Feishu event subscription, confirm "Use long connection to receive events" is selected (NOT Webhook URL mode).
2. Required permissions: im:message, im:message:send_as_bot.
3. For group messages: set FEISHU_GROUP_POLICY=open or platforms.feishu.group_policy: "open" in config.yaml. Feishu blocks group messages to bots by default.
4. Check logs: hermes logs | grep feishu. If an OpenClaw bot under the same Feishu app receives group messages correctly, the platform configuration is fine and the issue is on the Hermes side.
Gateway Won't Start / Port Already in Use
Find what is using the port: ss -tlnp | grep 8000 (on macOS: lsof -i :8000). If it is a previous Hermes instance, run hermes stop. Otherwise, kill <PID> or change the port in config.yaml: gateway.port: 8001.
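The same check can be done from Python, for example in a startup script that picks a free port. This is a minimal sketch using only the standard library. The function name port_in_use is hypothetical, and it only detects listeners reachable on the given host.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0


print(port_in_use(8000))
```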
Skill Loaded But Never Invoked
Two things to verify: 1) hermes skills list shows the Skill. 2) The Skill's description is specific enough — the Agent uses description text to decide when to call a Skill. Too vague = never triggered.
You can also trigger manually: say "use the summarize_text Skill on this passage".
API Errors: 403 / 429 / 500
403: Invalid API key or zero balance. Check key spelling and Token balance.
429: Rate limit exceeded. FlowerWolf has RPM limits; high-volume usage needs queuing or plan upgrade.
500: Upstream API error. Check Hermes logs for details. Usually transient — retry after a few seconds.
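For clients calling the API, 429 and transient 500 responses are commonly handled with exponential backoff, while a 403 should fail immediately. The following sketch shows that pattern. ApiError and call_with_retry are hypothetical names, and the delays are illustrative rather than tuned to any provider's limits.

```python
import time


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send(); retry on 429/500 with exponential backoff, else re-raise."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except ApiError as error:
            retryable = error.status in (429, 500)
            if not retryable or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...


# Demo: fail twice with 429, then succeed. Delays are recorded, not slept.
outcomes = [ApiError(429), ApiError(429), "ok"]
delays = []


def fake_send():
    outcome = outcomes.pop(0)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


print(call_with_retry(fake_send, sleep=delays.append), delays)  # ok [1.0, 2.0]
```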
Memory Growing Without Bound
SQLite mode appends indefinitely. Set memory.session_limit to trigger auto-summarization. To disable memory entirely: memory.provider: none.
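To make the effect of memory.session_limit concrete, here is a sketch of the general idea: once a session exceeds the limit, older messages are folded into a single summary entry. This illustrates the concept only. Hermes's actual summarization strategy may differ, and enforce_session_limit and count_summary are hypothetical names.

```python
def enforce_session_limit(messages, session_limit, summarize):
    """Fold the oldest messages into one summary once the limit is exceeded."""
    if len(messages) <= session_limit:
        return messages
    keep = max(1, session_limit // 2)  # recent messages kept verbatim
    older, recent = messages[:-keep], messages[-keep:]
    return [summarize(older)] + recent


def count_summary(older):
    # Placeholder summarizer; a real one would call the model.
    return f"[summary of {len(older)} earlier messages]"


history = [f"message {i}" for i in range(1, 8)]  # 7 messages
print(enforce_session_limit(history, session_limit=6, summarize=count_summary))
# ['[summary of 4 earlier messages]', 'message 5', 'message 6', 'message 7']
```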
CORS Error / Frontend Can't Access API
Browser enforces CORS when calling Hermes API directly. Enable in config: cors.enabled: true, cors.origins: ["*"] for dev (or specific domain for prod).
hermes start Produces No Output
No stdout in foreground usually means a config error. Try hermes start -v for verbose output. Or run python -m hermes directly to see Python-level errors.
V. Tips & Tricks
Memory-Driven User Profiles
Beyond conversation history, Hermes Memory can inject custom context per user. Set memory.inject_user_profile to store per-user preferences and history — the Agent sees them on every reply, like a persistent memory of each user.
Cron Jobs for Automation
Schedule tasks directly in config.yaml:
cron:
  "0 9 * * 1-5": "daily_briefing"   # Weekdays at 9am
  "0 */2 * * *": "check_alerts"     # Every 2 hours
  "0 0 * * 0": "weekly_report"      # Sunday midnight
Task names reference Skills. Results are pushed to configured channels (Feishu/TG/Discord).
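To sanity-check a schedule before deploying it, the expressions can be evaluated against a known date. The sketch below assumes the standard 5-field cron layout (minute, hour, day of month, month, day of week with 0 = Sunday). It is a simplified matcher, not Hermes's scheduler: all five fields must match, and 7 as an alias for Sunday is not handled.

```python
from datetime import datetime


def field_matches(field, value, minimum):
    """Check one cron field: supports *, */n, a-b, a-b/n, and comma lists."""
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = minimum, value
        elif "-" in base:
            low, high = base.split("-")
            start, end = int(low), int(high)
        else:
            start = end = int(base)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expression, moment):
    """Return True if a 5-field cron expression matches the given datetime."""
    minute, hour, day, month, weekday = expression.split()
    cron_weekday = (moment.weekday() + 1) % 7  # cron: 0 = Sunday
    return (
        field_matches(minute, moment.minute, 0)
        and field_matches(hour, moment.hour, 0)
        and field_matches(day, moment.day, 1)
        and field_matches(month, moment.month, 1)
        and field_matches(weekday, cron_weekday, 0)
    )


monday_9am = datetime(2024, 1, 1, 9, 0)  # 1 January 2024 was a Monday
print(cron_matches("0 9 * * 1-5", monday_9am))  # True
print(cron_matches("0 0 * * 0", monday_9am))    # False
```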
Multi-Agent: Division of Labor
Configure multiple agents[] in config.yaml, each with its own model, skills, and responsibility. A Router Agent dispatches based on user intent: agents[0].name: "coder" for code review, agents[1].name: "support" for customer service.
Custom Webhook: Connect Any Platform
Hermes supports custom webhooks via platforms.custom. Any platform with webhook support (Slack, DingTalk, WeCom, etc.) can be connected — Hermes POSTs messages to your URL in a fixed format.
Skill Hot Reload
Add or update Skills without restarting Hermes: hermes skills reload rescans skills/ and reloads everything. New and modified Skills take effect immediately. Pair with log.level: debug to see detailed loading logs.
Streaming Output (SSE)
Enable agent.stream: true in config. API requests use Server-Sent Events — frontends can display AI output word-by-word as it arrives, like ChatGPT.
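On the client side, an SSE response is a stream of text lines in which data: lines carry the payload and a blank line ends each event. The sketch below parses that wire format from any iterable of lines. What Hermes puts inside each data payload (plain text or JSON chunks) is not specified here, so the sample stream is invented for illustration.

```python
def iter_sse_data(lines):
    """Yield the data payload of each Server-Sent Event from an iterable of lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # blank line ends the current event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):  # comment or keep-alive line
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the data.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An unterminated trailing event is discarded, as the SSE format requires.


# Invented sample stream: two events with a keep-alive comment in between.
stream = ["data: Hel\n", "\n", ": ping\n", "data: lo\n", "\n"]
print("".join(iter_sse_data(stream)))  # Hello
```

With a live connection, the same generator can consume decoded lines read from the HTTP response body.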
VI. FAQ
Hermes vs OpenClaw — what's the difference?
Both are Agent frameworks with different architectural philosophies. OpenClaw is more "all-in-one" — config in one file, all platform integrations built-in. Hermes is more "modular" — Skills are separate files, platform config is more flexible. OpenClaw is better for quick setup; Hermes for deep customization.
How many Hermes instances per server?
Each instance listens on its own port, so server resources are the limiting factor. Without GPU inference, expect roughly 3–5 instances per 16 GB of RAM. With GPU inference, a 24 GB RTX 3090 or 4090 handles about 2 GPU-accelerated instances.
Does Hermes support streaming?
Yes. Set agent.stream: true. Uses Server-Sent Events (SSE) — frontends can render output token-by-token.
How to upgrade Hermes?
Stop the running instance with hermes stop, then run pip install --upgrade hermes-agent. Config fields are generally backward-compatible; breaking changes are noted in the release notes.
Can Hermes integrate with our existing Feishu / DingTalk?
Absolutely. Hermes provides a standard HTTP API (/v1/chat) and WebSocket interface. Any messaging platform can trigger Hermes via bot events, and Hermes posts replies back through the bot API. See each platform's Bot development docs for the incoming event side.