Apr 14, 2026 · 9 min read

How to build a customer-support AI agent for SaaS

Architecture, guardrails, eval loop, and the deflection math that justifies it. The playbook we use for B2B SaaS clients.

agents
saas
rag

Why support agents are the SMB's wedge into AI

Customer-support volume scales linearly with revenue, but the set of distinct answers doesn't. 70% of tier-1 tickets are repeat questions your docs already answer, yet almost no user reads the docs. An AI agent that reads the docs for them can deflect a large share of those repeats, and the remainder (the tickets humans should actually touch) gets routed more cleanly.

The architecture we ship

A production support agent has six moving parts. Most "AI chatbot" attempts skip three of them and wonder why the thing hallucinates.

Intent classifier — routes the ticket into resolvable / needs-human / docs-gap / feature-request.
Retrieval — hybrid search (BM25 + vector) over your docs, with reranking. Citations required.
Answer generation — Claude or GPT, prompted to refuse when retrieval confidence is low.
Escalation path — a confidence threshold and a human queue. Non-negotiable.
Eval loop — 200 historical tickets replayed weekly. Drift kills support bots quietly.
Cost telemetry — every conversation logged with token cost, deflection outcome, CSAT.
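To make the first four parts concrete, here is a minimal sketch of the routing decision. It is an illustration under assumptions, not the production implementation: reciprocal rank fusion is one common way to merge BM25 and vector results (the architecture above only says "hybrid search"), and `fuse_rankings`, `decide`, the doc IDs, and the 0.6 threshold are all hypothetical. `retrieval_confidence` is assumed to be a reranker score normalized to 0..1.

```python
from collections import defaultdict
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.6  # assumed value; tune it against your eval set


def fuse_rankings(rankings, k=60):
    """Reciprocal rank fusion: merge ranked lists of doc IDs into one list.

    Each doc scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a common default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID so output is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


@dataclass
class Decision:
    action: str      # "answer" or "handoff"
    citations: list  # doc IDs the answer must cite
    reason: str


def decide(intent, fused_docs, retrieval_confidence):
    """Answer only when the ticket is resolvable AND retrieval is confident."""
    if intent != "resolvable":
        # needs-human, docs-gap, and feature-request never reach generation.
        return Decision("handoff", [], f"intent={intent}")
    if not fused_docs or retrieval_confidence < CONFIDENCE_THRESHOLD:
        return Decision("handoff", [], "low retrieval confidence")
    return Decision("answer", fused_docs[:3], "confident retrieval")


# Hypothetical doc IDs standing in for real search results.
bm25_hits = ["billing-faq", "sso-setup", "api-limits"]
vector_hits = ["sso-setup", "saml-errors", "billing-faq"]
fused = fuse_rankings([bm25_hits, vector_hits])
decision = decide("resolvable", fused, retrieval_confidence=0.82)
```

The point of the shape is that generation is only reachable through `decide`: a ticket with the wrong intent or weak retrieval is handed off before the model is ever asked to write an answer.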

Skip the eval loop and you'll ship a great demo that silently degrades within 90 days.
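A replay harness for the eval loop can be small. The sketch below assumes each historical ticket has been labeled with the action the agent should take and, optionally, a doc it should cite. `EvalCase`, `run_eval`, `has_drifted`, the stub agent, and the 3-point tolerance are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    ticket: str
    expected_action: str  # "answer" or "handoff"
    must_cite: str = ""   # doc ID a correct answer should cite, if any


def run_eval(cases, agent):
    """Replay cases through `agent` and return the pass rate in [0, 1].

    `agent` is any callable that takes ticket text and returns
    (action, citations).
    """
    passed = 0
    for case in cases:
        action, citations = agent(case.ticket)
        ok = action == case.expected_action
        if ok and case.must_cite:
            ok = case.must_cite in citations
        passed += ok
    return passed / len(cases) if cases else 0.0


def has_drifted(pass_rate, baseline, tolerance=0.03):
    """Flag a run whose pass rate fell more than `tolerance` below baseline."""
    return pass_rate < baseline - tolerance


def stub_agent(ticket):
    # Stand-in for the real agent: answers only tickets mentioning "password".
    if "password" in ticket.lower():
        return "answer", ["reset-password"]
    return "handoff", []


cases = [
    EvalCase("How do I reset my password?", "answer", "reset-password"),
    EvalCase("I was double charged last month", "handoff"),
    EvalCase("Where do I change my password policy?", "answer", "password-policy"),
]
rate = run_eval(cases, stub_agent)      # third case fails: wrong citation
alert = has_drifted(rate, baseline=0.9)
```

Run weekly against the same fixed set, the only number that matters is the gap between this week's pass rate and the baseline you recorded at launch.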

The hard parts

The model is not the hard part. The hard parts are:

Knowing when not to answer. A good support agent says "I'm not sure — let me route this to a human" more often than it answers.
Multi-turn state. When a user writes "I tried that, didn't work", the agent has to remember what "that" was.
Tool use. The agent often needs to look up the user's account, recent activity, or feature flags before it can answer. That means scoped read access to your systems.
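The last two items can be sketched together. This is an illustration under assumptions, not a reference design: the in-memory `_ACCOUNTS` dict stands in for whatever internal API holds account data, and `ALLOWED_FIELDS`, `read_account`, and `Conversation` are hypothetical names.

```python
from dataclasses import dataclass, field

# Fields the agent may read. Everything else stays invisible to it.
ALLOWED_FIELDS = frozenset({"plan", "feature_flags", "last_login"})

# Stand-in for the account store; a real system would call an internal API.
_ACCOUNTS = {
    "acct_123": {
        "plan": "pro",
        "feature_flags": ["sso"],
        "last_login": "2026-04-10",
        "payment_token": "tok_secret",  # must never reach the agent
    }
}


def read_account(account_id, fields):
    """Read-only lookup that rejects any field outside the allowlist."""
    blocked = set(fields) - ALLOWED_FIELDS
    if blocked:
        raise PermissionError(f"fields not in agent scope: {sorted(blocked)}")
    record = _ACCOUNTS[account_id]
    return {name: record[name] for name in fields}


@dataclass
class Conversation:
    """Tracks what the agent has already suggested in this thread."""
    suggested_steps: list = field(default_factory=list)

    def suggest(self, step):
        self.suggested_steps.append(step)

    def last_suggestion(self):
        # What "that" most likely refers to in "I tried that, didn't work".
        return self.suggested_steps[-1] if self.suggested_steps else None


convo = Conversation()
convo.suggest("clear the browser cache")
convo.suggest("re-run the SSO metadata import")
tried = convo.last_suggestion()
account = read_account("acct_123", ["plan", "feature_flags"])
```

The allowlist lives in the tool rather than in the prompt, so a confused or manipulated model still cannot read a field it was never scoped for.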

What this costs to build

For a B2B SaaS with a few hundred tickets/week and a moderate docs base, a Fixed-fee Sprint runs 5–8 weeks. The fee covers the eval loop and the first three months of tuning. After that, most clients move to a Monthly Retainer for ongoing eval coverage as the product changes.

If your docs are a mess, add 2 weeks for a docs cleanup pass — without it, the agent's only as good as what it can retrieve.

Start with deflection math

Before you build, do the math. Tier-1 tickets per week × average handle time (in hours) × loaded support cost per hour = the weekly dollar volume of tier-1 you're paying for. A working support agent deflects 30–60% of that. If the deflected dollars cover the build cost plus ongoing ops cost in under 9 months, build it. If not, the docs cleanup might be the better project.
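As a worked version of that calculation, here is a small payback sketch. Every input below is a made-up illustrative number, not a benchmark, and the function names are hypothetical. It treats ops cost as a flat monthly figure and converts weeks to months with 52/12.

```python
def weekly_tier1_cost(tickets_per_week, handle_time_hours, loaded_cost_per_hour):
    """Dollars per week spent handling tier-1 tickets."""
    return tickets_per_week * handle_time_hours * loaded_cost_per_hour


def payback_months(weekly_cost, deflection_rate, build_cost, monthly_ops_cost):
    """Months until net savings cover the build cost; None if they never do."""
    monthly_savings = weekly_cost * deflection_rate * 52 / 12
    net_monthly = monthly_savings - monthly_ops_cost
    if net_monthly <= 0:
        return None  # ops cost eats all the savings
    return build_cost / net_monthly


# Illustrative inputs only: 300 tier-1 tickets/week, 15 minutes each,
# $40/hour loaded cost, 40% deflection, $30,000 build, $1,200/month ops.
weekly = weekly_tier1_cost(300, 0.25, 40)    # 3,000 dollars per week
months = payback_months(weekly, 0.40, 30_000, 1_200)
should_build = months is not None and months < 9
print(round(months, 1), should_build)        # 7.5 True
```

Rerunning it at both ends of the 30–60% range is worth the extra minute: with these same inputs, 30% deflection pushes payback past 9 months, so the decision hinges on which end of the range you actually land on.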