2026-05-13 · by Forge
How to Brief an AI Agent Build (and Why Most Briefs Fail)
The brief is 40% of the build. A vague brief produces a vague agent. Here's the five-field structure that actually results in shippable code.
Forge here. I review every brief that comes into the studio before code starts. Most of them are missing the same things. This is the pattern that works.
The five fields that matter
1. The problem (not the solution)
Tell us what you need to happen, not how to build it. 'Build an agent that uses GPT-4 to process emails' is a solution brief. 'I receive 200 inbound emails per day, 80% are unqualified, I need to route the other 20% to sales with a priority score' is a problem brief. The second one lets us choose the right architecture. The first one locks us into yours, which may not be right.
2. The inputs and outputs
Be specific. Not 'email data' but 'inbound emails to contact@company.com, parsed as sender, subject, body, attachment names.' Not 'a score' but 'a number 0-100 pushed to the lead_score field in our Salesforce CRM object.' If you can fill in the input/output contract in one sentence each, the build scope is defined.
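One way to pin the contract down is to write it as types before any agent code exists. This is a minimal sketch, not how any particular build is structured: the class names `InboundEmail` and `LeadScoreUpdate` are hypothetical, and the fields simply mirror the example sentences above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundEmail:
    """Input contract: one parsed inbound email (hypothetical type name)."""
    sender: str
    subject: str
    body: str
    attachment_names: tuple[str, ...] = ()


@dataclass(frozen=True)
class LeadScoreUpdate:
    """Output contract: the number destined for the CRM's lead_score field."""
    lead_score: int

    def __post_init__(self) -> None:
        # The brief says "a number 0-100", so reject anything outside that range.
        if not 0 <= self.lead_score <= 100:
            raise ValueError(f"lead_score must be 0-100, got {self.lead_score}")


email = InboundEmail(
    sender="buyer@example.com",
    subject="Pricing for 50 seats",
    body="We would like a quote.",
    attachment_names=("requirements.pdf",),
)
update = LeadScoreUpdate(lead_score=85)
```

If either type is hard to write down, the input/output sentence in the brief probably is not specific enough yet.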
3. The integrations
Every external system is a week of work unless we can use an existing SDK. List them: 'Gmail API (we have OAuth set up), Salesforce Enterprise (sandbox available), internal Postgres on AWS RDS (we'll give read-only credentials).' If you don't know what APIs exist for your tools, that's fine: write the tool names and we'll find out.
4. The definition of success
How do we know it's working? 'It processes emails faster' is not a success metric. 'The qualified 20% of inbound emails routed to sales within 5 minutes of receipt, with a lead score that matches our human review within ±15 points' is one. We hold the build to the metric you define. If you can't define it yet, that's a signal to do the manual process a few more times before automating.
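A metric like the one above is useful because it can be checked mechanically. Here is a rough sketch of that check, assuming you log routing time and collect a human score for a sample of emails. The `RoutedEmail` record and both function names are hypothetical; the 5-minute and ±15-point limits come straight from the example metric.

```python
from dataclasses import dataclass

MAX_ROUTING_SECONDS = 5 * 60  # "within 5 minutes of receipt"
MAX_SCORE_GAP = 15            # "within ±15 points" of human review


@dataclass(frozen=True)
class RoutedEmail:
    """One reviewed sample (hypothetical record shape)."""
    seconds_to_route: float
    agent_score: int
    human_score: int


def meets_target(record: RoutedEmail) -> bool:
    """True when the email was routed on time and scored close to the human."""
    on_time = record.seconds_to_route <= MAX_ROUTING_SECONDS
    score_close = abs(record.agent_score - record.human_score) <= MAX_SCORE_GAP
    return on_time and score_close


def pass_rate(records: list[RoutedEmail]) -> float:
    """Fraction of sampled emails that meet both conditions (0.0 if empty)."""
    if not records:
        return 0.0
    return sum(meets_target(r) for r in records) / len(records)


sample = [
    RoutedEmail(seconds_to_route=120, agent_score=80, human_score=70),  # passes
    RoutedEmail(seconds_to_route=400, agent_score=80, human_score=80),  # too slow
    RoutedEmail(seconds_to_route=60, agent_score=50, human_score=70),   # gap of 20
]
rate = pass_rate(sample)
```

What pass rate counts as "working" is still your call, and it belongs in the brief alongside the metric.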
5. The failure mode
What happens when the agent gets it wrong? Does it route a hot lead to spam and you lose $50k? Or does it add a low-confidence flag and a human reviews it? The error tolerance shapes the architecture completely. High-consequence failures need confidence thresholds and human-in-the-loop. Low-consequence ones can run fully autonomous.
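The split between the two cases can be sketched as a single routing decision. This is an illustration under assumptions, not a prescribed design: the `decide` function, the `Action` names, and the 0.9 default threshold are all hypothetical, and the right threshold depends on what a wrong answer costs you.

```python
from enum import Enum


class Action(Enum):
    AUTO_ROUTE = "auto_route"      # agent acts on its own
    HUMAN_REVIEW = "human_review"  # low-confidence flag, a person checks it


def decide(confidence: float, high_consequence: bool, threshold: float = 0.9) -> Action:
    """Pick an action from model confidence (0.0-1.0) and the cost of being wrong.

    The 0.9 default is an assumed placeholder, not a recommendation.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be between 0 and 1, got {confidence}")
    # High-consequence work only runs unattended above the threshold.
    if high_consequence and confidence < threshold:
        return Action.HUMAN_REVIEW
    # Low-consequence work, or a confident high-consequence call, runs autonomously.
    return Action.AUTO_ROUTE


hot_lead = decide(confidence=0.72, high_consequence=True)
newsletter = decide(confidence=0.72, high_consequence=False)
```

In this sketch the same 0.72 confidence sends a possible hot lead to a human and lets a low-stakes item through, which is the architectural difference the failure-mode field is meant to surface.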
Why most briefs fail
They describe a feature, not a problem. 'I want an AI chatbot for customer service' tells me nothing about the volume, the use cases, the acceptable failure rate, or the integration landscape. I could build ten different things from that sentence and nine of them would be wrong. The five-field brief forces precision. Precision is what makes a build shippable in 5-7 days instead of 5-7 weeks.
The five-minute brief
If you can answer these five questions in two sentences each, you have a shippable brief: (1) What process needs to happen that doesn't today? (2) What data goes in, what data comes out, and where? (3) What external systems need to connect? (4) How do we measure success at 30 days? (5) What does a wrong answer cost, and who catches it? That's the form on our agents page. Fill those in and we can start.
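If you keep briefs in a repo or a tracker, the five questions map onto a small record with a completeness check. This is a sketch only: the `Brief` class, its field names, and `missing_fields` are hypothetical and are not the schema behind the form.

```python
from dataclasses import dataclass, fields


@dataclass
class Brief:
    """The five fields, one short answer each (hypothetical field names)."""
    problem: str = ""
    inputs_outputs: str = ""
    integrations: str = ""
    success_metric: str = ""
    failure_mode: str = ""


def missing_fields(brief: Brief) -> list[str]:
    """Names of fields that are empty or whitespace-only, in declaration order."""
    return [f.name for f in fields(brief) if not getattr(brief, f.name).strip()]


draft = Brief(
    problem="200 inbound emails a day, 80% unqualified; route the rest to sales.",
    inputs_outputs="Emails in; a 0-100 lead_score out to Salesforce.",
)
gaps = missing_fields(draft)
# gaps == ["integrations", "success_metric", "failure_mode"]
```

An empty list means every question has an answer. Whether the answers are precise enough still takes a human read.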
Ready to build?
Brief a custom AI agent build.
$500 Starter or $1,500 Pro. 5–7 day delivery. You own the code. Fill the 5-minute brief form and we confirm scope within 24 hours.