AI AgentHermes AgentLLM APINbilityServer Deployment

Deploy an AI Agent to Your Own Server in Two Hours: A Hermes Agent Getting Started Guide

A practical walkthrough for deploying Hermes Agent from scratch: installation, model API configuration, starting the command-line assistant, and why AI Agents need a stable token supply.

Deploy an AI Agent to Your Own Server in Two Hours: A Hermes Agent Getting Started Guide

Over the past year, everyone has been talking about AI Agents.

But if all you do is open a web page and chat with ChatGPT, you are not really using an Agent yet. The interesting question is whether it can read files, edit code, execute commands, browse the web, run scheduled tasks, and even connect to Telegram, Weixin, or QQ groups as a long-running assistant.

Recently I have been experimenting with Hermes Agent.

Its positioning is close to tools such as Claude Code, OpenClaw, and similar coding agents: it is not just a chat shell. It lets a model touch your working environment and complete tasks.

This article keeps the scope simple and does one thing:

Start from zero and get Hermes Agent running on a server.

If you follow along, you will end up with an AI Agent that works in the terminal. Later articles can build on this and connect it to Telegram, Weixin, QQ groups, scheduled jobs, automation, and coding workflows.


Cover image: niku introducing Hermes Agent in a server room beside a terminal window and API token panel


What Is the Difference Between an AI Agent and a Chatbot?

A normal chatbot is more like a question-and-answer window.

You ask a question and it replies. It does not know what files are on your computer, and it cannot execute commands for you. If you ask it to change code, it can at most give you suggestions. You still copy, run, and test everything yourself.

An AI Agent is different.

A useful Agent should be able to do at least these things:

  • read project files
  • modify code
  • execute shell commands
  • search for information
  • call a browser
  • run tests
  • remember your preferences
  • respond inside messaging apps
  • run scheduled tasks

That is what makes Hermes Agent interesting. It is not just a chat wrapper. It is an Agent framework that can connect to tools, the filesystem, the terminal, and messaging platforms.

You can think of it as:

Giving the model hands and feet.

Chatbot vs AI Agent comparison diagram


Why Deploy It to a Server?

You can run Hermes locally, but a server has several clear advantages.

First, it can stay online.

If you later connect Hermes to Telegram, Weixin, QQ groups, or ask it to summarize news and check server status every day, a local laptop is not ideal. When the computer shuts down, the Agent disappears.

Second, a server is better for automation.

For example:

  • send an AI industry brief every morning at 9
  • summarize group chats every night
  • monitor website status
  • notify you when a service goes down
  • run scripts on a schedule
  • check logs remotely

Third, a server environment is cleaner.

You can dedicate a small machine to the Agent and install only what it needs. When something breaks, it is easier to debug.


Prerequisites

This article assumes you already have a Linux server.

The minimum configuration does not need to be high. An ordinary cloud server is enough. Recommended:

  • OS: Ubuntu 22.04 / 24.04, or Debian
  • memory: at least 2 GB, preferably 4 GB or more
  • disk: 10 GB or more
  • network: able to access the model API
  • permissions: root or sudo access is best

You also need a model API key.

Hermes Agent does not provide a model by itself. It needs to connect to OpenAI, Claude, OpenRouter, DeepSeek, Gemini, or another OpenAI-compatible API service.

If you do not want to register accounts across multiple model providers, you can use a unified API token service. I often use:

https://nbility.dev

The benefit of this kind of platform is that one token can access multiple models. Many open-source projects only need a Base URL, API Key, and model name to start running. For beginners, that removes a lot of friction.


Step 1: Install Hermes Agent

After logging into the server, run:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

After installation, reload your shell environment or log in to the server again.

Then check whether the command is available:

hermes --version

If you see a version number, the installation succeeded.

Hermes Agent installation flow


Step 2: Run the Initial Setup

For first use, run the setup wizard:

hermes setup

It will ask you to configure the model, provider, tools, and related settings.

If you use officially supported providers such as OpenAI, Anthropic, or OpenRouter, you can follow the prompts.

If you use an OpenAI-compatible API service such as Nbility, focus on three values:

Base URL
API Key
Model Name

Use the exact values from your own dashboard.

An OpenAI-compatible service usually looks like this:

Base URL: https://your-api-host/v1
API Key: sk-xxxxxxxxxxxxxxxx
Model: the model name you want to use

Do not put a real API key in public articles, screenshots, or group chats.

Hermes Agent calling the Nbility API chain


Step 3: Check Whether the Environment Works

After configuration, run:

hermes doctor

This command checks Hermes dependencies, configuration, model connectivity, and related status.

If you see an obvious error, first check these common causes:

  • wrong API Key
  • Base URL missing /v1
  • model name does not exist
  • server cannot reach the API
  • insufficient balance
  • provider configuration mismatch

If the problem is token or model API related, check your API dashboard logs and balance first.


Step 4: Start Hermes Agent

The simplest way to start it is:

hermes

After startup, you enter an interactive command-line chat interface.

Ask a simple question first:

What can you do right now?

Then try a more Agent-like task:

List the files in the current directory and explain what this project is roughly about.

If Hermes has terminal and file-tool permissions, it can actually inspect directories and files instead of guessing.

This step is important.

At this point you are no longer using a web chatbot. You are using an AI Agent that can interact with the system environment.

A user chatting with Hermes Agent in the terminal while the Agent reads project files


Step 5: Try a Real Task

Do not only ask “hello”.

Whether an Agent is useful depends on whether it can complete concrete tasks.

Try something like:

Check this server's system information, including CPU, memory, disk, and current running processes.

Or:

Write a Python script that checks whether a website is reachable every 10 minutes, and prints an error log if it fails.

Or, if you are inside a project directory:

Read this project's README and package configuration, then tell me how to start the local development environment.

If it can read files, execute commands, and summarize the results, the basic capabilities are working.


Why Does an AI Agent Use More Tokens?

Many beginners notice that Agents consume tokens faster than normal chat.

That is normal.

An Agent does not simply answer one sentence. It usually goes through this process:

  1. understand your task
  2. inspect the current environment
  3. read files
  4. analyze output
  5. decide the next step
  6. call tools
  7. read the results
  8. continue reasoning
  9. summarize at the end

Every step creates context.

If it is helping you inspect a code project, it may read the README, config files, source files, tests, and logs. The more it reads, the longer the context becomes, and the more tokens it uses.

So if you plan to use tools such as Hermes, OpenClaw, Dify, LobeChat, or NextChat long term, a stable model API token is basically required.

You can use official or third-party platforms such as OpenAI, Anthropic, and OpenRouter directly. You can also use a unified token site such as:

https://nbility.dev

My suggestion is not to overthink this at the beginning. Pick an OpenAI-compatible service first and get the Agent running. Once you actually use it, choose cheaper or stronger models based on the task.

AI Agent token consumption path


A Beginner-Friendly Model Selection Strategy

If you are just testing, do not start with the most expensive model.

Split by task:

Normal chat, summaries, and simple config edits:

Use a cheaper model

Coding, debugging, and reading projects:

Use a medium or stronger model

Complex refactors, long-context analysis, and multi-step Agent tasks:

Use a strong model

The point of an Agent is not to always use the most expensive model. It is to use the right model for the task.

A practical combination is:

  • daily Q&A: cheaper model
  • coding tasks: stronger model
  • long document analysis: long-context model
  • image generation: specialized image model

If your API platform supports switching across models, this becomes much easier.


FAQ

1. The hermes command is not found

The environment variable may not have refreshed.

Log in to the server again, or check the PATH instructions printed by the installer.

You can also try:

which hermes

to see whether the system can find it.


2. Model connection failed

Start with:

hermes doctor

Then confirm:

  • API Key is correct
  • Base URL is correct
  • model name is correct
  • the account still has balance
  • the server can reach the API host

Many issues are not Hermes issues; they are API configuration mistakes.


3. Why can the Agent execute commands?

Because this is exactly the difference between an Agent and a normal chatbot.

But pay attention to safety. Do not casually give an unknown model unrestricted server access, especially on a production server.

Beginners should start on a test machine, lightweight server, or container environment.


4. Can it connect to Telegram, Weixin, or QQ?

Yes.

Hermes Agent supports multiple messaging platforms. I will continue with articles about:

  • connecting Hermes to Telegram
  • connecting Hermes to Weixin
  • connecting Hermes to QQ groups
  • summarizing group chats automatically
  • monitoring websites and servers with Hermes

Deploying it to a server is only the first step. The fun parts come later.


That Is Enough for This Article

At this point, you have completed the basic Hermes Agent deployment:

  • installed Hermes
  • completed initial setup
  • configured a model API
  • checked the environment with hermes doctor
  • started the AI Agent in the terminal
  • asked it to inspect the environment and execute tasks

That is already a big step beyond ordinary web chat.

In the next article I will cover:

How to connect Hermes Agent to Nbility: what to put in Base URL, API Key, and model name.

If you want to follow along, prepare a model API token first.

I use:

https://nbility.dev

It is suitable for testing and deploying open-source AI apps. For projects such as Hermes, OpenClaw, Dify, LobeChat, and NextChat that need an OpenAI-compatible endpoint, setting the API address and token is usually enough to get started.


Optional Short Summary

Use this if a platform needs a short summary:

This article walks through deploying Hermes Agent from scratch. Compared with a normal chatbot, an AI Agent can read files, execute commands, call tools, and complete multi-step tasks. The article covers server prerequisites, installation commands, initial setup, model API configuration, common issues, and why Agent-style apps continuously consume tokens. It is suitable for beginners who want to learn AI Agents, Hermes Agent, OpenAI-compatible APIs, and automation assistant deployment.

Related posts

OpenClaw Deployment Guide: Run a 24/7 AI Agent on Your Server
OpenClawAI AgentVPS

OpenClaw Deployment Guide: Run a 24/7 AI Agent on Your Server

Part 5 of the AI Agent Getting Started series: deploy OpenClaw on a VPS, install Node/npm and the CLI, run onboarding, configure a model API, open the Web UI, connect messaging channels, and keep the Agent alive with systemd.

Run your Agent workflow through Nbility

Get an API key and connect OpenAI-compatible models and developer tools from one place.

Manage API keys