Agent OS — self-hosted AI assistant in Go. One message, any channel, no code to add agents

Marco · May 11, 2026, 8:44pm

Hi everyone,

I’ve been building this for a while and finally reached a point where it feels stable enough to share.

Agent OS is a self-hosted, multi-channel AI executive assistant written in Go. You message it on WhatsApp, Discord, or the web, in plain English, and it figures out what to do: draft an email, build a software project, research a topic, set reminders, manage calendar events, etc. No interface to click through. Just talk.

The part I’m most proud of is the ability to add new agents with zero Go code.

Drop a folder with:

agent.yaml
SYSTEM.md
SOUL.md (optional)

into agents/ and the server automatically loads it on restart.

That’s it.

A doctor agent, finance agent, legal agent, or companion agent are all just folders. No recompilation, no plugin SDK, no framework boilerplate.

A few other things worth mentioning:

Compound intent routing: requests like “email Alice and schedule a follow-up next week” fan out to the right agents in parallel and merge back into a single response
Builder Agent: describe what you want to build in plain English and it walks through requirements → spec → tasks → codegen → automated review. You can also say “build autonomously” and it runs the full workflow without interruption
Personality engine: observes communication patterns over time and adapts every agent’s tone and response style automatically
Costguard: a custom LLM gateway layer I built for provider abstraction, streaming, retries/fallbacks, cost tracking, multimodal support, and tool-call normalization
Model-agnostic architecture: supports both hosted and local models. Agents can use different providers/models depending on capability or cost
Single statically compiled Go binary: runs on a VPS, Raspberry Pi, or locally with:
make run

Repos:
https://github.com/marcoantonios1/agent-os

https://github.com/marcoantonios1/costguard

Happy to talk architecture, routing/orchestration, the agentic loop design, concurrency patterns, multimodal handling, or the gateway layer.