CHARIOT

Published October 6, 2025

A signal-based model router built inside the Augustus framework

Hi, I’m Marcus, founder of Augustus.
I build systems that make AI infrastructure faster, leaner, and more predictable.
My work usually sits somewhere between logic, geometry, and rhythm,
trying to make machines act with precision rather than probability.

Over the past few weeks, I’ve been working on Chariot,
a small but ambitious project inside the Augustus framework.
It’s a pre-LLM router that decides which model to use (GPT-4.1-mini, GPT-5-mini, or GPT-5)
based purely on signal, not semantics.

🧠 Why I built it

Most routers today use an LLM to decide which LLM to use.
That’s circular logic: you spend tokens just to decide where to spend more tokens.
I wanted to break that pattern.

Chariot doesn’t read language; it reads form.
It analyzes a prompt as rhythm, density, and structure.
It turns words into numbers, and numbers into a waveform.
From that waveform comes one value, the signal score, which decides which model fits best.
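
To make that concrete, here’s a deliberately simplified sketch of the idea, not the production code: the feature definitions and weights below are illustrative stand-ins, and they won’t reproduce the exact scores shown later in this post.

```python
import math

def signal_score(prompt: str) -> float:
    """Toy version: score a prompt by its form, never its meaning.

    Rhythm, density, and structure here are stand-ins for the real
    features; the weights are illustrative only.
    """
    words = prompt.split()
    if not words:
        return 0.0

    # Rhythm: how much word lengths vary across the prompt.
    lengths = [len(w) for w in words]
    mean_len = sum(lengths) / len(lengths)
    rhythm = math.sqrt(sum((l - mean_len) ** 2 for l in lengths) / len(lengths))

    # Density: average characters per word.
    density = mean_len

    # Structure: prompt length on a log scale.
    structure = math.log1p(len(words))

    # Collapse the feature "waveform" into one number.
    return round(0.3 * rhythm + 0.2 * density + 0.4 * structure, 2)
```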

To me, it feels like a bridge between human and machine.
A system that doesn’t interpret; it recognizes.

⚙️ The first results

The first version is built in Python, running on Flask and Gunicorn.
The entire API is under 10 KB, lightweight and minimal.
No dependencies beyond Flask and Gunicorn, no extra logic, just signal analysis and a decision.

Example API call

POST /analyze

```json
{
  "prompt": "Why do we fear death?"
}
```

Response

```json
{
  "model": "gpt-5-mini",
  "signal_score": 1.72
}
```
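
A stripped-down version of the service could be wired together like this, reusing the signal_score sketch from earlier; the cutoffs are illustrative stand-ins read off the results table below, not the production thresholds:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.post("/analyze")
def analyze():
    prompt = request.get_json(force=True).get("prompt", "")
    score = signal_score(prompt)  # form-based score, sketched earlier

    # Illustrative cutoffs only; the real decision boundaries
    # are not part of this sketch.
    if score < 1.5:
        model = "gpt-4.1-mini"
    elif score < 2.0:
        model = "gpt-5-mini"
    else:
        model = "gpt-5"

    return jsonify({"model": model, "signal_score": score})
```

Served with Gunicorn, that is essentially the whole router.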

The API processes requests with an average response time below 200 ms, even under load.
Decisions feel nearly instantaneous.
Across 1,000 diverse prompts, Chariot reduced GPT-5 usage by 55%
without any measurable loss in result quality.

| Prompt | Selected Model | Signal Score |
|---|---|---|
| good morning | GPT-4.1-mini | 0.87 |
| what is 2 + 2 | GPT-4.1-mini | 1.18 |
| why do we fear death | GPT-5-mini | 1.72 |
| can a computer learn love | GPT-5 | 2.37 |

📊 Benchmarks

| Test | Without Chariot | With Chariot | Savings |
|---|---|---|---|
| 1,000 mixed prompts | 100% GPT-5 usage | 44% GPT-5 usage | −55% cost |

Routing latency: <200 ms
Integration overhead: none
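
Integration is just one HTTP call before the model call you were already making. A minimal client, assuming a local deployment (the URL is a placeholder; point it at your own instance):

```python
import requests

# Ask Chariot which model fits, then call that model as usual.
route = requests.post(
    "http://localhost:8000/analyze",
    json={"prompt": "Why do we fear death?"},
    timeout=5,
).json()

print(route["model"], route["signal_score"])  # e.g. gpt-5-mini 1.72
```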

💬 What it’s teaching me

It’s not perfect, and it’s not finished.
But it’s alive: a deterministic system that listens to signal instead of meaning.
And maybe that’s enough for now.

If you’ve built something similar or have ideas on where this could evolve,
I’d really like to hear your thoughts.
