What is Olympus?

Olympus is a multi-agent DevOps system: one human, a coordinator, and a team of LLM specialists that operate real infrastructure. Vibe-coding tools let anyone build almost anything they can imagine, but keeping it running is still a DevOps problem. Olympus is the smallest viable answer: built by one person, used by one person, to operate infrastructure that used to take a whole DevOps team.

It was built for CS 153: Frontier Systems at Stanford. There's a live instance at demo.0lympu5.com.

The two ways you talk to it

CLI — olympus "..." routes your task to the single best specialist via an LLM router (or deterministic keyword routing offline) and runs it to completion, prompting on stdin before anything destructive.
Dashboard chat — a group-chat ticket: you, a main coordinator, and any specialists it pulls in are participants in one thread. The coordinator does no domain work itself — it dispatches subtasks to specialists and asks them direct questions, narrating its reasoning as a live, interleaved "thinking" trace. Destructive actions surface inline approval cards.
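The offline fallback described for the CLI, deterministic keyword routing, can be sketched roughly as follows. This is a minimal illustration, not Olympus's actual router: the agent names come from the team table, but the `KEYWORDS` table, the `route` function, and the choice of `sysadmin` as the default are all assumptions made for the example.

```python
import re

# Hypothetical keyword table. Agent names match the team table;
# the keywords themselves are illustrative assumptions.
KEYWORDS = {
    "sysadmin": {"pod", "kubectl", "logs", "dns", "namespace"},
    "programmer": {"dockerfile", "compose", "helm", "script"},
    "terraform": {"terraform", "stack", "plan", "apply"},
    "ansible": {"playbook", "ansible", "inventory"},
    "hpc": {"slurm", "gpu", "job", "queue"},
}


def route(task: str, default: str = "sysadmin") -> str:
    """Return the agent whose keyword set overlaps the task text the most."""
    words = set(re.findall(r"[a-z0-9_]+", task.lower()))
    scores = {agent: len(words & kws) for agent, kws in KEYWORDS.items()}
    # max() keeps the first agent on ties, so the result is deterministic.
    best = max(scores, key=scores.get)
    # Fall back to an assumed default when no keyword matches at all.
    return best if scores[best] > 0 else default
```

Because the scoring is a plain set intersection, the same task text always lands on the same specialist, which is the property that matters when no LLM router is reachable.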

The team

Agent	Does	Destructive verbs
sysadmin	Kubernetes runtime ops (kubectl), logs, events, `ssh_run`, plus NetDB DNS/IPAM over MCP	`delete_pod`, `ssh_run`
programmer	Authors files — Dockerfiles, compose, Helm values, scripts	`write_file`, `edit_file`, `delete_file`
terraform	Runs existing Terraform stacks	`tf_apply`, `tf_destroy`
ansible	Runs playbooks + host introspection over SSH	`run_playbook`, `run_module`
hpc	Slurm scheduling + GPU health (via MCP)	gated Slurm ops

Two non-routable agents round it out: main (the group-chat coordinator) and terminal_companion (a read-only observer for the in-browser SSH terminal).

The four safety invariants

Everything in Olympus is a different arrangement of these four:

Tool-gated execution. An agent declares a fixed tool set and a fixed set of destructive_verbs. The runtime wraps every tool so the agent cannot call anything outside its declaration — no matter what the LLM emits. The sysadmin agent cannot run terraform apply even if asked.
Human-in-the-loop on destructive ops. Any destructive verb re-enters the runtime through an approval hook before it executes. A self-protection policy sits in front of approval: calls targeting Olympus's own cluster/hosts are hard-denied, so no one can escalate by managing the system that gates them.
Append-only audit. Every tool call is logged twice — pre-execution (with the approval decision) and post-execution (with the result).
Bus-based observability. The orchestrator publishes task/agent/tool/approval/result events to a bus; the dashboard projects them into a per-ticket transcript streamed to the browser.
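The first three invariants can be pictured as one wrapper around every tool call. The sketch below is an illustration under stated assumptions, not the Olympus runtime: `GatedAgent`, `ToolDenied`, the `approve` hook signature, the `protected_targets` set, and the convention that a call names its target in a `target` argument are all hypothetical. It also assumes the self-protection check applies only on the destructive path, directly in front of approval.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


class ToolDenied(Exception):
    """Raised when a call is undeclared, self-targeting, or rejected."""


@dataclass
class GatedAgent:
    name: str
    tools: dict[str, Callable[..., Any]]             # the fixed tool set
    destructive_verbs: frozenset[str]                # verbs that need approval
    approve: Callable[[str, dict], bool]             # hypothetical approval hook
    protected_targets: frozenset[str] = frozenset()  # hypothetical self-protection list
    audit: list[dict] = field(default_factory=list)  # this class only appends to it

    def _log(self, phase: str, verb: str, args: dict, **extra: Any) -> None:
        self.audit.append(
            {"phase": phase, "agent": self.name, "verb": verb, "args": dict(args), **extra}
        )

    def call(self, verb: str, **args: Any) -> Any:
        # Tool gating: anything outside the declaration is refused outright.
        if verb not in self.tools:
            self._log("pre", verb, args, decision="denied:undeclared")
            raise ToolDenied(f"{self.name} has no tool {verb!r}")

        decision = "not_required"
        if verb in self.destructive_verbs:
            # Self-protection sits in front of approval and cannot be overridden.
            if args.get("target") in self.protected_targets:
                self._log("pre", verb, args, decision="denied:self_protection")
                raise ToolDenied(f"{verb!r} targets a protected host")
            decision = "approved" if self.approve(verb, args) else "rejected"

        # Audit entry 1: before execution, with the approval decision.
        self._log("pre", verb, args, decision=decision)
        if decision == "rejected":
            raise ToolDenied(f"{verb!r} was rejected by the human")

        result = self.tools[verb](**args)
        # Audit entry 2: after execution, with the result.
        self._log("post", verb, args, result=result)
        return result
```

In this arrangement the LLM's output only ever selects a `verb` and arguments; whether the call runs is decided by the wrapper, which is why an undeclared verb such as `tf_apply` on the sysadmin agent fails before any tool code is reached.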

What else it does

Memory + feedback — writes a compact transcript of every settled task and retrieves the most-similar prior runs at the next task start; 👍/👎/correction tunes future retrieval.
Per-verb rollback — captures the inverse of a destructive op before it fires; undoing re-prompts approval.
Cost telemetry — per-invocation cost tracked on the agent; a group-chat turn aggregates the cost of every specialist it dispatched, with a per-agent breakdown and per-user daily caps.
MCP — third-party tools graft onto a named agent over stdio or HTTP, gated and audited like native tools.
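Per-verb rollback, as described in the list above, amounts to capturing an inverse before the destructive operation fires and asking for approval again when that inverse is replayed. The following sketch shows the idea on an in-memory file map. `RollbackLog`, `delete_file`, and the `approve` callback are hypothetical names chosen for the example, and how Olympus actually stores or replays inverses is not specified in this page.

```python
from typing import Callable


class RollbackLog:
    """Keeps a stack of inverse operations; undoing one needs fresh approval."""

    def __init__(self, approve: Callable[[str], bool]) -> None:
        self._approve = approve  # hypothetical approval hook
        self._undo: list[Callable[[], None]] = []

    def run(self, do: Callable[[], object], capture_inverse: Callable[[], Callable[[], None]]) -> None:
        inverse = capture_inverse()  # captured BEFORE the destructive op fires
        do()
        self._undo.append(inverse)

    def undo_last(self) -> bool:
        if not self._undo:
            return False
        # Undo is itself a change to real state, so it re-prompts approval.
        if not self._approve("undo last destructive op"):
            return False
        self._undo.pop()()
        return True


def delete_file(files: dict[str, str], path: str, log: RollbackLog) -> None:
    """Delete `path` from an in-memory file map, recording how to restore it."""

    def capture() -> Callable[[], None]:
        old = files[path]  # snapshot the content while it still exists

        def restore() -> None:
            files[path] = old

        return restore

    log.run(do=lambda: files.pop(path), capture_inverse=capture)
```

If approval for the undo is refused, the inverse stays on the stack, so the operation remains reversible later.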

Next: Architecture & concepts for how it fits together, or jump to the Quick start.
