Deploy an AI Agent to Your Own Server in Two Hours: A Hermes Agent Getting Started Guide

Over the past year, everyone has been talking about AI Agents.

But if all you do is open a web page and chat with ChatGPT, you are not really using an Agent yet. The interesting question is whether it can read files, edit code, execute commands, browse the web, run scheduled tasks, and even connect to Telegram, Weixin, or QQ groups as a long-running assistant.

Recently I have been experimenting with Hermes Agent.

Its positioning is close to tools such as Claude Code, OpenClaw, and similar coding agents: it is not just a chat shell. It lets a model touch your working environment and complete tasks.

This article keeps the scope simple and does one thing:

Start from zero and get Hermes Agent running on a server.

If you follow along, you will end up with an AI Agent that works in the terminal. Later articles can build on this and connect it to Telegram, Weixin, QQ groups, scheduled jobs, automation, and coding workflows.

Cover image: niku introducing Hermes Agent in a server room beside a terminal window and API token panel

What Is the Difference Between an AI Agent and a Chatbot?

A normal chatbot is more like a question-and-answer window.

You ask a question and it replies. It does not know what files are on your computer, and it cannot execute commands for you. If you ask it to change code, it can at most give you suggestions. You still copy, run, and test everything yourself.

An AI Agent is different.

A useful Agent should be able to do at least these things:

read project files
modify code
execute shell commands
search for information
call a browser
run tests
remember your preferences
respond inside messaging apps
run scheduled tasks

That is what makes Hermes Agent interesting. It is not just a chat wrapper. It is an Agent framework that can connect to tools, the filesystem, the terminal, and messaging platforms.

You can think of it as:

Giving the model hands and feet.

Chatbot vs AI Agent comparison diagram

Why Deploy It to a Server?

You can run Hermes locally, but a server has several clear advantages.

First, it can stay online.

If you later connect Hermes to Telegram, Weixin, QQ groups, or ask it to summarize news and check server status every day, a local laptop is not ideal. When the computer shuts down, the Agent disappears.

Second, a server is better for automation.

For example:

send an AI industry brief every morning at 9
summarize group chats every night
monitor website status
notify you when a service goes down
run scripts on a schedule
check logs remotely

Third, a server environment is cleaner.

You can dedicate a small machine to the Agent and install only what it needs. When something breaks, it is easier to debug.

Prerequisites

This article assumes you already have a Linux server.

The minimum configuration does not need to be high. An ordinary cloud server is enough. Recommended:

OS: Ubuntu 22.04 / 24.04, or Debian
memory: at least 2 GB, preferably 4 GB or more
disk: 10 GB or more
network: able to access the model API
permissions: root or sudo access is best

You also need a model API key.

Hermes Agent does not provide a model by itself. It needs to connect to OpenAI, Claude, OpenRouter, DeepSeek, Gemini, or another OpenAI-compatible API service.

If you do not want to register accounts across multiple model providers, you can use a unified API token service. I often use:

https://nbility.dev

The benefit of this kind of platform is that one token can access multiple models. Many open-source projects only need a Base URL, API Key, and model name to start running. For beginners, that removes a lot of friction.

Step 1: Install Hermes Agent

After logging into the server, run:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

After installation, reload your shell environment or log in to the server again.

Then check whether the command is available:

hermes --version

If you see a version number, the installation succeeded.

Hermes Agent installation flow

Step 2: Run the Initial Setup

For first use, run the setup wizard:

hermes setup

It will ask you to configure the model, provider, tools, and related settings.

If you use officially supported providers such as OpenAI, Anthropic, or OpenRouter, you can follow the prompts.

If you use an OpenAI-compatible API service such as Nbility, focus on three values:

Base URL
API Key
Model Name

Use the exact values from your own dashboard.

An OpenAI-compatible service usually looks like this:

Base URL: https://your-api-host/v1
API Key: sk-xxxxxxxxxxxxxxxx
Model: the model name you want to use

Do not put a real API key in public articles, screenshots, or group chats.

Hermes Agent calling the Nbility API chain

Step 3: Check Whether the Environment Works

After configuration, run:

hermes doctor

This command checks Hermes dependencies, configuration, model connectivity, and related status.

If you see an obvious error, first check these common causes:

wrong API Key
Base URL missing /v1
model name does not exist
server cannot reach the API
insufficient balance
provider configuration mismatch

If the problem is token or model API related, check your API dashboard logs and balance first.

Step 4: Start Hermes Agent

The simplest way to start it is:

hermes

After startup, you enter an interactive command-line chat interface.

Ask a simple question first:

What can you do right now?

Then try a more Agent-like task:

List the files in the current directory and explain what this project is roughly about.

If Hermes has terminal and file-tool permissions, it can actually inspect directories and files instead of guessing.

This step is important.

At this point you are no longer using a web chatbot. You are using an AI Agent that can interact with the system environment.

A user chatting with Hermes Agent in the terminal while the Agent reads project files

Step 5: Try a Real Task

Do not only ask “hello”.

Whether an Agent is useful depends on whether it can complete concrete tasks.

Try something like:

Check this server's system information, including CPU, memory, disk, and current running processes.

Or:

Write a Python script that checks whether a website is reachable every 10 minutes, and prints an error log if it fails.

Or, if you are inside a project directory:

Read this project's README and package configuration, then tell me how to start the local development environment.

If it can read files, execute commands, and summarize the results, the basic capabilities are working.

Why Does an AI Agent Use More Tokens?

Many beginners notice that Agents consume tokens faster than normal chat.

That is normal.

An Agent does not simply answer one sentence. It usually goes through this process:

understand your task
inspect the current environment
read files
analyze output
decide the next step
call tools
read the results
continue reasoning
summarize at the end

Every step creates context.

If it is helping you inspect a code project, it may read the README, config files, source files, tests, and logs. The more it reads, the longer the context becomes, and the more tokens it uses.

So if you plan to use tools such as Hermes, OpenClaw, Dify, LobeChat, or NextChat long term, a stable model API token is basically required.

You can use official or third-party platforms such as OpenAI, Anthropic, and OpenRouter directly. You can also use a unified token site such as:

https://nbility.dev

My suggestion is not to overthink this at the beginning. Pick an OpenAI-compatible service first and get the Agent running. Once you actually use it, choose cheaper or stronger models based on the task.

AI Agent token consumption path

A Beginner-Friendly Model Selection Strategy

If you are just testing, do not start with the most expensive model.

Split by task:

Normal chat, summaries, and simple config edits:

Use a cheaper model

Coding, debugging, and reading projects:

Use a medium or stronger model

Complex refactors, long-context analysis, and multi-step Agent tasks:

Use a strong model

The point of an Agent is not to always use the most expensive model. It is to use the right model for the task.

A practical combination is:

daily Q&A: cheaper model
coding tasks: stronger model
long document analysis: long-context model
image generation: specialized image model

If your API platform supports switching across models, this becomes much easier.

FAQ

1. The `hermes` command is not found

The environment variable may not have refreshed.

You can also try:

which hermes

to see whether the system can find it.

2. Model connection failed

Start with:

hermes doctor

Then confirm:

API Key is correct
Base URL is correct
model name is correct
the account still has balance
the server can reach the API host

Many issues are not Hermes issues; they are API configuration mistakes.

3. Why can the Agent execute commands?

Because this is exactly the difference between an Agent and a normal chatbot.

But pay attention to safety. Do not casually give an unknown model unrestricted server access, especially on a production server.

Beginners should start on a test machine, lightweight server, or container environment.

4. Can it connect to Telegram, Weixin, or QQ?

Yes.

Hermes Agent supports multiple messaging platforms. I will continue with articles about:

connecting Hermes to Telegram
connecting Hermes to Weixin
connecting Hermes to QQ groups
summarizing group chats automatically
monitoring websites and servers with Hermes

Deploying it to a server is only the first step. The fun parts come later.

That Is Enough for This Article

At this point, you have completed the basic Hermes Agent deployment:

installed Hermes
completed initial setup
configured a model API
checked the environment with hermes doctor
started the AI Agent in the terminal
asked it to inspect the environment and execute tasks

That is already a big step beyond ordinary web chat.

In the next article I will cover:

How to connect Hermes Agent to Nbility: what to put in Base URL, API Key, and model name.

If you want to follow along, prepare a model API token first.

I use:

https://nbility.dev

It is suitable for testing and deploying open-source AI apps. For projects such as Hermes, OpenClaw, Dify, LobeChat, and NextChat that need an OpenAI-compatible endpoint, setting the API address and token is usually enough to get started.

Optional Short Summary

Use this if a platform needs a short summary:

This article walks through deploying Hermes Agent from scratch. Compared with a normal chatbot, an AI Agent can read files, execute commands, call tools, and complete multi-step tasks. The article covers server prerequisites, installation commands, initial setup, model API configuration, common issues, and why Agent-style apps continuously consume tokens. It is suitable for beginners who want to learn AI Agents, Hermes Agent, OpenAI-compatible APIs, and automation assistant deployment.

Deploy an AI Agent to Your Own Server in Two Hours: A Hermes Agent Getting Started Guide

What Is the Difference Between an AI Agent and a Chatbot?

Why Deploy It to a Server?

Prerequisites

Step 1: Install Hermes Agent

Step 2: Run the Initial Setup

Step 3: Check Whether the Environment Works

Step 4: Start Hermes Agent

Step 5: Try a Real Task

Why Does an AI Agent Use More Tokens?

A Beginner-Friendly Model Selection Strategy

FAQ

1. The `hermes` command is not found

2. Model connection failed

3. Why can the Agent execute commands?

4. Can it connect to Telegram, Weixin, or QQ?

That Is Enough for This Article

Optional Short Summary

Related posts

Connect Hermes Agent to Telegram: Turn Your Server-Side AI Agent into a Mobile Remote Assistant

Connect Hermes Agent to Nbility: Use a Smoother OpenAI-Compatible Model API Entry

Advanced QQ Group AI Bot: Summaries, Image Generation, and Status Page Screenshots

Run your Agent workflow through Nbility