
From Seven Agents to One: How an Anthropic Outage Saved My Product

architecture · engineering · ai

This article replaces "Energy Descriptions", "Prompt for the General Overview", and "From Analyst Agents to Energy Caching", which describe the previous version of the system.

In early March 2026, Anthropic had a massive outage. My service, which depended entirely on Claude, just stopped working. Users saw errors, I saw panic. So I went and experimented with whatever was actually working. Two weeks later I had a different product — faster, simpler, cheaper. Sometimes you need someone to pull the plug so you finally stop and think: maybe you don't need that many plugs in the first place?


What came before (and why I still hate throwing it away)

The old architecture was built on an elegant idea: the energy of each house in a natal chart is static and doesn't depend on the user's question. So you can describe it once, cache it, and reuse it forever.

Sounds elegant. In practice, here's what it meant:

When adding a chart, the system generated 12 energy texts — one per house, 500-1500 words each, a separate LLM call for each one. The user added a chart — and waited. Five minutes. Sometimes longer. They hadn't asked a single question yet, hadn't seen a single answer — and they were already sitting there staring at a progress bar. Many didn't stick around. And I get it — I wouldn't have either.
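The warm-up logic can be sketched in a few lines. This is a hypothetical reconstruction, not the actual implementation: `generate_energy_text` stands in for a real LLM call, and the dict-backed store stands in for whatever persistence the system actually used.

```python
# Hypothetical sketch of the old per-house energy cache.
# generate_energy_text is a stand-in for a real (slow, paid) LLM call.

def generate_energy_text(chart_id: str, house: int) -> str:
    return f"energy text for house {house} of chart {chart_id}"

class EnergyCache:
    def __init__(self):
        self._store = {}
        self.llm_calls = 0

    def get(self, chart_id: str, house: int) -> str:
        key = (chart_id, house)
        if key not in self._store:
            self.llm_calls += 1  # each cache miss is one real LLM call
            self._store[key] = generate_energy_text(chart_id, house)
        return self._store[key]

cache = EnergyCache()

# "Warming up" a new chart: 12 sequential calls, one per house,
# before the user has asked anything at all.
for house in range(1, 13):
    cache.get("chart-1", house)

# Any later question reuses the cached texts for free.
cache.get("chart-1", 7)
assert cache.llm_calls == 12
```

The economics only work if users ask enough questions to amortize those 12 up-front calls, which is exactly the assumption that failed.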

For every question — a classifier determined the request type, a planner built a plan of 4-10 points, the cache was checked for each point, missing texts were generated, then everything was assembled into a final response. Seven agents, each with its own prompt. Response time — up to two minutes.
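The per-question flow above can be sketched as a chain of stubs. Every function name here is hypothetical; in the real system each step was a separate agent with its own prompt, and each one was a place the pipeline could stall.

```python
# Hedged sketch of the old per-question pipeline. Each function below
# stands in for a separate agent (i.e. a separate LLM call and prompt).

def classify(question: str) -> str:
    return "complex"  # request type chosen by the classifier agent

def build_plan(question: str, kind: str) -> list[str]:
    # the real planner produced 4-10 points
    return ["house-7", "house-10", "synthesis"]

def fetch_or_generate(point: str, cache: dict) -> str:
    if point not in cache:
        cache[point] = f"generated text for {point}"  # LLM call on a miss
    return cache[point]

def assemble(parts: list[str]) -> str:
    # in the real system, a final composing LLM call
    return "\n\n".join(parts)

def answer(question: str, cache: dict) -> str:
    kind = classify(question)
    plan = build_plan(question, kind)
    parts = [fetch_or_generate(point, cache) for point in plan]
    return assemble(parts)

cache = {}
result = answer("What about my career?", cache)
assert "house-10" in result
```

Even in this toy form, the sequential dependency is visible: a wrong classification poisons the plan, a bad plan poisons the generation, and any single stuck call blocks the whole chain.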

Looks great on an architecture diagram. Painful in practice.

And here's the thing — and this matters — I still like the idea itself. The energy texts were good. The prompts for generating them were the result of hundreds of iterations and several hundred euros in testing. The caching worked. It was just a solution to the wrong problem for the product. Maybe I'll come back to this architecture someday — when I figure out what for. For now, the prompts sit in git, waiting for their moment.


What actually broke

Five minutes to get in — then silence. Five minutes of waiting when adding your first chart isn't "slow." It's "closed the tab." People came, added a chart, saw they had to wait — and left. I was losing users before they even had a chance to ask a single question. Great way to build a business.

Two minutes per answer. Even after a chart was "warmed up" — every complex question could take up to two minutes. In a world where people are used to ChatGPT and its five-second responses, two minutes is an eternity. Especially when you're not yet sure the service is even worth your time.

Seven points of failure. Seven agents — seven places where something can go wrong. The classifier sometimes got it wrong. The planner built bizarre plans. One stuck call would stall everything. And then Anthropic went down — and all seven points failed simultaneously. A symphony of reliability.

Caching for caching's sake. "The tenth question will be cheap!" — I promised. The problem is that hardly anyone made it to the tenth question. Most asked 2-3. I'd built a Ferrari for a trip to the bakery around the corner.


How Anthropic pushed me to change

In early March, Claude went down. Not for five minutes — seriously and for a long time. The service wasn't working, users were writing in, and I sat there waiting for someone else's infrastructure to fix itself. That lovely feeling of complete helplessness.

Instead of just waiting, I went to test alternatives. Gemini, GPT-5.2 — anything, as long as it was working right now. And I discovered something unexpected.

GPT-5.2 can hold context. The entire natal chart JSON — planet positions, house cusps, aspects with orbs, dignities, intercepted signs, receptions — loads entirely into a single prompt. And the model doesn't get confused. Doesn't mix up the ruler of a house with a planet in the house. Doesn't lose sections. Doesn't forget to check intercepted signs.

Before, the energy texts were a crutch: earlier models couldn't properly work with raw chart data, they needed pre-chewed analysis. GPT-5.2 chews on its own. Not bad when a model does for you what you spent six months engineering.

The prompt changed — bigger and denser. Instead of seven small prompts — one, but thorough. It contains the entire methodology: derivative house logic, the three-level interpretation model, the consciousness map by houses, the planetary integration spectrum, the axis principle, question routing, demographic adaptation. One document that the model receives in full with every question.
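The new shape is simple enough to sketch. The structure below is an assumption about how such a prompt is assembled, not the author's actual template: one static methodology document, the full chart JSON, and the user's question, concatenated into a single call.

```python
import json

# Hypothetical sketch of the single-prompt design: the full methodology
# and the entire chart JSON go into one prompt, replacing seven agents.

METHODOLOGY = (
    "…derivative house logic, three-level interpretation model, "
    "axis principle, question routing, demographic adaptation…"
)

def build_prompt(chart: dict, question: str) -> str:
    return (
        f"{METHODOLOGY}\n\n"
        "Natal chart data (positions, cusps, aspects, dignities):\n"
        f"{json.dumps(chart)}\n\n"
        f"Question: {question}"
    )

chart = {
    "planets": {"Sun": {"sign": "Leo", "house": 10}},
    "aspects": [],
}
prompt = build_prompt(chart, "What does my career look like?")
assert "Leo" in prompt and "Question:" in prompt
```

The trade-off is obvious: every question pays for the full prompt in tokens, but there is exactly one call, one prompt to maintain, and no intermediate state to go stale.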


What this achieved

Instant readiness. Add a chart — and you can start asking right away. Five minutes of waiting turned into zero. This is probably the most important change from a product perspective. Not the most technically impressive — but the most important.

30-60 seconds instead of two minutes. One LLM call instead of ten. Still not instant — the prompt is big, the chart is detailed, the answers are long. But the difference between "waiting two minutes staring at the screen" and "waiting half a minute while pouring tea" is fundamental.

One euro to start. Instead of free "two questions with a five-minute loading screen," every new user gets a euro in their account. Thanks to the reduced cost per call, that covers noticeably more questions — enough to actually try the system out, instead of getting two answers and hitting a wall.

Reliability. One model, one call, one point of failure instead of seven. And if GPT-5.2 also goes down — I'll switch to Claude or Gemini within a day. There's one prompt, and adapting it for a different provider is incomparably easier than migrating seven agents.
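That portability claim can be made concrete with a thin fallback wrapper. The provider functions below are stand-ins, not real SDK calls; the point is that with a single prompt, switching providers is a one-adapter change rather than a seven-agent migration.

```python
# Hypothetical sketch of provider fallback for a single-prompt system.
# call_gpt / call_claude are stand-ins for real provider SDK calls.

def call_gpt(prompt: str) -> str:
    raise RuntimeError("provider down")  # simulate an outage

def call_claude(prompt: str) -> str:
    return "answer from fallback provider"

PROVIDERS = [call_gpt, call_claude]

def ask(prompt: str) -> str:
    last_error = None
    for provider in PROVIDERS:
        try:
            return provider(prompt)
        except Exception as err:
            last_error = err  # remember the failure, try the next provider
    raise RuntimeError("all providers failed") from last_error

assert ask("one big prompt") == "answer from fallback provider"
```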


What was lost (and waiting in reserve)

Caching was a beautiful engineering idea. "The chart warms up with every question" — I still think that's a good metaphor and good architecture. If people asked 20 questions in a row, it would make enormous sense. But reality showed: a beautiful system isn't worth anything if it ruins the first impression.

The Sonnet/Opus split is gone too. Before, the user chose between fast-and-cheap and deep-and-expensive. Now a single model covers both scenarios well enough.

The energy description prompts — hundreds of iterations, several hundred euros in testing — sit in the repository. Not deleted. I'm an engineer, and it physically hurts me to delete working code. Someday, maybe, I'll find a use for them — for offline reports, for a premium tier with deep analysis, or for something I haven't thought of yet. Or maybe they'll just remain a monument to overengineering. That's an honorable fate too.


The moral

I love building systems. The multi-agent pipeline was a good system — with elegant caching, parallel generation, separation of concerns. I spent months building it and was proud of the result.

And then Anthropic broke, and in two weeks I built something that works better.

Sometimes the right architectural decision is to demolish the architecture. And sometimes you need an outside kick to do it. Thanks, Anthropic.

If you're curious about how the interactive chart works — the centerpiece of the new interface — there's a separate article about that. And if you want to understand what astrology the system uses and where the data comes from — that hasn't changed.
