How much does it cost to run an AI research agency with 5 agents?

About $0.09 per report in API fees. A full month of production at 150 reports costs under $14. The trick is mixing expensive models (Sonnet 4 for judgment roles) with cheap ones (Qwen 2.5 72B for data work) and running formatting locally on Ollama for zero variable cost.

What do you actually sell with an AI research agency?

Competitive intelligence reports, market overviews, and technology evaluations. Buyers are product managers, VCs, consultants, and SaaS founders who need structured research but don't have analysts on staff. Pricing ranges from $200 per report on freelance platforms to $2,000 for direct B2B clients.

How long until an AI automation agency makes $5,000 per month?

Realistically 3-4 months. Month 0 is sample production. Month 1 targets one paying client and $500-$1,500 in revenue. Month 2 adds clients, raises prices, and reaches $2,000-$4,000. Month 3 systematizes intake and hits $4,000-$6,000. The bottleneck is client acquisition, not agent costs.

Build an AI Research Agency on Hermes + Paperclip

Summary:

Five agents (Intake, Researcher, Writer, Editor, Ops) produce competitive intelligence reports for $0.09 each.

All agents run on the hermes_local adapter in Paperclip. Premium models handle judgment, cheap models handle data, local models handle formatting.

You sell the output to product managers, VCs, and SaaS founders at $200-$2,000 per report.

Month 3 target: $4,000-$6,000/month with 150+ deliverables at under $30 in agent costs.

An AI automation agency that ships client deliverables for nine cents each sounds wrong. It’s not. The whole trick is splitting work across five agents with different cost profiles: expensive models where judgment matters, cheap models where it doesn’t, and a local model for the boring stuff.

This tutorial walks through the exact Paperclip configuration for a research agency that produces competitive intelligence reports. Five agents. Three model tiers. Under $30/month in total API costs at production volume.

[Figure: 5-agent research pipeline flow showing Intake, Researcher, Writer, Editor, and Ops, with model assignments and cost per agent]

What does the 5-agent pipeline look like?

Five agents, each with a specific role and a model matched to the cost of that role:

Agent	Role	Model	Cost per report
Intake	Parse client briefs into structured research questions	Claude Sonnet 4	$0.014
Researcher	Gather data, cite sources, produce research notes	Qwen 2.5 72B	$0.003
Writer	Turn research notes into polished report prose	Claude Sonnet 4	$0.068
Editor	Fact-check, verify citations, flag hallucinations	Qwen 2.5 72B	$0.001
Ops	Format final deliverable, save to output directory	Local Ollama	$0.000

Every agent uses the hermes_local adapter in Paperclip. The model choices follow the work: Intake and Writer need judgment (deciding what to research, writing polished prose), so they get Sonnet 4. Researcher and Editor do structured data work, so they get Qwen at roughly 1/25 the cost. Ops only does formatting, so it runs locally for free.

Total per report: $0.086, call it $0.09. At a $350 client price, that’s a margin of about 99.97% on production costs.
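To see where those nine cents go, the per-agent costs from the table can be grouped by model tier. This is a small sketch that uses only the figures above; the tier labels ("sonnet", "qwen", "local") and the function name are just names chosen here.

```python
# Per-report costs copied from the table above, tagged with a model tier.
COSTS = {
    "Intake": ("sonnet", 0.014),
    "Researcher": ("qwen", 0.003),
    "Writer": ("sonnet", 0.068),
    "Editor": ("qwen", 0.001),
    "Ops": ("local", 0.000),
}

def cost_share_by_tier(costs=COSTS):
    """Return each tier's share of the total per-report cost, in percent."""
    total = sum(cost for _, cost in costs.values())
    tiers = {}
    for tier, cost in costs.values():
        tiers[tier] = tiers.get(tier, 0.0) + cost
    return {tier: round(100 * cost / total, 1) for tier, cost in tiers.items()}

print(cost_share_by_tier())
```

The two Sonnet roles account for roughly 95% of the spend, so if you ever need to cut costs further, the Writer is the only agent worth tuning.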

How do you configure each agent?

Install three Hermes skills first. These encode the procedures each agent follows:

hermes skills install research-comparison-table
hermes skills install citation-verification
hermes skills install systematic-debugging

Then create the agents in your Paperclip dashboard. Here are the configs that matter.

Intake agent — fast parsing step, short timeout:

{
  "name": "Intake",
  "role": "Parse client briefs into structured research questions",
  "adapterType": "hermes_local",
  "adapterConfig": {
    "model": "anthropic/claude-sonnet-4",
    "maxIterations": 20,
    "timeoutSec": 180,
    "persistSession": false,
    "enabledToolsets": ["file", "memory"]
  }
}

No web access. No terminal. It reads the brief, writes a structured plan to a file, creates research issues for the Researcher. Twenty iterations max. Three minutes. Done.

Researcher agent — the slow one, needs room to work:

{
  "name": "Researcher",
  "role": "Gather data and produce structured research notes with citations",
  "adapterType": "hermes_local",
  "adapterConfig": {
    "model": "openrouter/qwen/qwen-2.5-72b-instruct",
    "maxIterations": 60,
    "timeoutSec": 900,
    "persistSession": false,
    "enabledToolsets": ["file", "web"],
    "extraArgs": ["--skills", "research-comparison-table,citation-verification"]
  }
}

Higher iteration limit (60) and a 15-minute timeout because research takes time. Web and file toolsets so it can fetch sources and save notes. The extraArgs preload the two skills this agent needs every run so it doesn’t guess which to load.

Writer agent — Sonnet for prose quality, no web access:

{
  "name": "Writer",
  "role": "Produce polished research reports from structured notes",
  "adapterType": "hermes_local",
  "adapterConfig": {
    "model": "anthropic/claude-sonnet-4",
    "maxIterations": 30,
    "timeoutSec": 600,
    "persistSession": false,
    "enabledToolsets": ["file", "memory"]
  }
}

File and memory access so it can read the Researcher’s notes and recall client preferences from ~/.hermes/memories/clients/<client-name>.md. No web access. The Writer’s inputs come from the Researcher, not from live fetches.
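The toolset restrictions described above are easy to break by accident when you edit configs later. Here is a minimal sketch of a sanity check, assuming you keep copies of the three configs as Python dicts. The FORBIDDEN table and the function name are hypothetical, and "terminal" is assumed to be the toolset name behind "No terminal".

```python
# The three adapterConfig fragments shown above, reduced to the fields we check.
AGENTS = {
    "Intake": {"maxIterations": 20, "timeoutSec": 180, "enabledToolsets": ["file", "memory"]},
    "Researcher": {"maxIterations": 60, "timeoutSec": 900, "enabledToolsets": ["file", "web"]},
    "Writer": {"maxIterations": 30, "timeoutSec": 600, "enabledToolsets": ["file", "memory"]},
}

# Hypothetical policy table: toolsets each agent must NOT have, per the prose above.
# "terminal" is an assumed toolset name.
FORBIDDEN = {
    "Intake": {"web", "terminal"},
    "Writer": {"web"},
}

def policy_violations(agents=AGENTS, forbidden=FORBIDDEN):
    """Return (agent, toolset) pairs where a config enables a forbidden toolset."""
    problems = []
    for name, banned in forbidden.items():
        enabled = set(agents[name]["enabledToolsets"])
        for toolset in sorted(enabled & banned):
            problems.append((name, toolset))
    return problems

print(policy_violations())  # an empty list means the configs match the prose
```

This only inspects local copies of the configs. It does not read anything from Paperclip itself.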

How does a client report get produced?

Each client engagement becomes a chain of Paperclip issues. One issue triggers the next:

1. PC-1: Parse client brief. Intake checks out the issue, reads the raw request, produces structured research questions, and creates follow-up issues.
2. PC-2 through PC-5: Research questions. Researcher works each issue. Structured notes with citations go to disk.
3. PC-6: Write report. Writer consumes the research notes and produces a 1,500-3,000 word draft with executive summary, comparison tables, and recommendations.
4. PC-7: Fact-check. Editor verifies citations, flags issues, and either approves or kicks back with corrections.
5. PC-8: Format and deliver. Ops takes the approved draft, formats it as a client-ready Markdown file with metadata, saves to output.

You only touch step 4 (PC-7, reviewing the Editor’s flags) and step 5 (PC-8, final QA before sending). Everything else runs automatically.
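The issue numbering shifts with the number of research questions Intake produces, so it helps to see the chain as data. The sketch below is a planning aid only: build_issue_chain is a hypothetical helper, it does not call Paperclip, and in the real pipeline Intake creates the follow-up issues itself.

```python
def build_issue_chain(num_questions=4, prefix="PC"):
    """Lay out one engagement's issues as (issue_id, agent, task) tuples."""
    chain = [(f"{prefix}-1", "Intake", "Parse client brief")]
    next_id = 2
    # One issue per research question, all handled by the Researcher.
    for i in range(num_questions):
        chain.append((f"{prefix}-{next_id}", "Researcher", f"Research question {i + 1}"))
        next_id += 1
    # The three closing stages always run once, in this order.
    closing = [("Writer", "Write report"), ("Editor", "Fact-check"), ("Ops", "Format and deliver")]
    for agent, task in closing:
        chain.append((f"{prefix}-{next_id}", agent, task))
        next_id += 1
    return chain

for issue_id, agent, task in build_issue_chain():
    print(issue_id, agent, task)
```

With the default of four research questions this reproduces the PC-1 through PC-8 layout above. A brief with six questions would end at PC-10.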

Trigger the first run manually to watch the logs:

# Check out an issue for the Intake agent
npx paperclipai issue checkout PC-1 --agent <intake-agent-id>

# Trigger one heartbeat immediately
npx paperclipai heartbeat run --agent <intake-agent-id>

# After it completes, verify
npx paperclipai issue get PC-1
npx paperclipai activity list --company <company-id>

What’s the revenue math?

The whole pipeline costs $0.09 per report. Here’s what the ramp looks like when you start selling:

Month 0 ($0 revenue). Produce 10 sample deliverables. Pick the 3 best as portfolio pieces. Set up profiles on Upwork, Fiverr, and LinkedIn. Set initial pricing below market rate. Your goal is testimonials, not money.

Month 1 ($500-$1,500). Land the first client. Run the job through the pipeline. Watch every step. Deliver, ask for feedback, capture it in client memory. Ask for a testimonial. One to two clients, one to three deliverables each, at $150-$250 per report.

Month 2 ($2,000-$4,000). You have a testimonial. Raise prices to market rate ($300+). Reduce review time from 20 minutes to 5 per report. Add a second client. Push for retainer pricing: “4 reports per month, $1,200/month” beats “1 report, $300.”

Month 3 ($4,000-$6,000). Systematize client intake with a standard form that feeds directly into the Intake agent’s queue. Target higher-value direct clients ($500/report) instead of freelance platform clients ($200/report). Five to seven clients, mix of retainer and per-project.

At 150 deliverables per month and $50+ per deliverable, you’re looking at $7,500+ in revenue against under $14 in agent API costs.
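A quick way to sanity-check any of these scenarios is to plug client count, volume, and price into one function. This is a rough sketch using the $0.086 per-report cost from the pipeline table; monthly_snapshot is a name made up here, and it ignores platform fees and your own review time.

```python
COST_PER_REPORT = 0.086  # sum of the per-agent costs in the pipeline table

def monthly_snapshot(clients, reports_per_client, price_per_report):
    """Return (revenue, agent_cost) in dollars for one month."""
    reports = clients * reports_per_client
    revenue = reports * price_per_report
    agent_cost = round(reports * COST_PER_REPORT, 2)
    return revenue, agent_cost

# Five clients on the "4 reports per month, $1,200/month" retainer ($300 per report).
print(monthly_snapshot(clients=5, reports_per_client=4, price_per_report=300))
```

Five retainer clients land at $6,000 in revenue, the top of the Month 3 range, on less than $2 of API spend.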

How do agents remember client preferences?

Each client gets a memory file:

~/.hermes/memories/clients/<client-name>.md

After the first report, save their feedback there: style preferences, competitor focus, formatting quirks, terminology. By the third report, your Writer agent writes in their voice. By the tenth, a new competitor can’t match your output quality for that client.
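If you capture feedback by hand, a small helper keeps the memory files consistent. This is a sketch under assumptions: the dated Markdown layout is invented here, record_feedback is a hypothetical helper, and Hermes may expect or manage a different format inside these files.

```python
import tempfile
from datetime import date
from pathlib import Path

def record_feedback(client, feedback, base_dir=None):
    """Append dated feedback bullets to a client's memory file and return its path."""
    # Default location matches the path above; base_dir lets you test elsewhere.
    base = Path(base_dir) if base_dir else Path.home() / ".hermes" / "memories" / "clients"
    base.mkdir(parents=True, exist_ok=True)
    path = base / f"{client}.md"
    lines = [f"\n## Feedback {date.today().isoformat()}\n"]
    lines += [f"- {item}\n" for item in feedback]
    with path.open("a", encoding="utf-8") as fh:
        fh.writelines(lines)
    return path

# Demo against a temporary directory so nothing touches your real memory files.
with tempfile.TemporaryDirectory() as tmp:
    saved = record_feedback("acme", ["Prefers tables over long prose"], base_dir=tmp)
    print(saved.read_text(encoding="utf-8"))
```

Appending rather than overwriting keeps the history of what each client asked for and when.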

Each client also maps to a Paperclip company. Issues represent deliverables. The activity log is your client history. Query it directly:

# How many issues shipped for this client last month?
npx paperclipai issue list --company <client-company-id> --state done

# Cost for a specific client
npx paperclipai activity list --company <client-company-id> --format json \
  | jq '[.[] | select(.type == "issue_completed") | .cost_usd] | add'

No separate project management tool. Paperclip is the tool.
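If you would rather do the cost roll-up in Python than in jq, the same filter is a few lines. This sketch assumes what the jq command above implies: that the activity list is a JSON array whose events carry type and cost_usd fields. The sample events are made up for illustration.

```python
import json

def total_cost(activity_json):
    """Sum cost_usd over issue_completed events in an activity-list JSON array."""
    events = json.loads(activity_json)
    return sum(
        event.get("cost_usd") or 0
        for event in events
        if event.get("type") == "issue_completed"
    )

# Made-up sample data in the shape the jq filter expects.
sample = json.dumps([
    {"type": "issue_completed", "cost_usd": 0.014},
    {"type": "issue_created"},
    {"type": "issue_completed", "cost_usd": 0.068},
])
print(round(total_cost(sample), 3))
```

Events without a cost_usd value count as zero here, which is slightly more forgiving than the jq version.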

What broke?

Three things will bite you in the first month.

Vague client briefs. If the Intake agent can’t turn the request into clear research questions, everything downstream produces garbage. Fix: use Paperclip’s approval system to gate the transition from PC-1 to research. Force clients to fill out a structured intake form instead of sending freeform emails.
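A structured intake form only helps if incomplete submissions are rejected before they reach the Intake agent. Here is a minimal sketch of that gate. The field names are hypothetical, so swap in whatever your form actually collects.

```python
# Hypothetical intake form fields; adjust to match your own form.
REQUIRED_FIELDS = ("company", "competitors", "questions", "deadline")

def is_blank(value):
    """True for None, whitespace-only strings, and empty collections."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) == 0
    return False

def missing_fields(form):
    """Return the required fields that are absent or empty in a submission."""
    return [field for field in REQUIRED_FIELDS if is_blank(form.get(field))]

submission = {"company": "Acme", "competitors": [], "questions": ["Who leads on pricing?"]}
print(missing_fields(submission))
```

Anything this returns goes back to the client as a request for more detail instead of into the queue.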

Fabricated citations. The Researcher will occasionally invent URLs that look plausible. The Editor should catch these with the citation-verification skill, but it misses some. Your final QA at step 5 is the backstop. Over time, the Editor’s skill library grows to recognize more hallucination patterns. Month 1 catch rate is maybe 60%. Month 3 is 80%. Month 6 is 90%.
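For the final QA pass, a cheap first filter is to pull every URL out of the draft and check that it resolves. This is a rough sketch, not the citation-verification skill: the function names are invented here, and a URL that resolves still says nothing about whether the page supports the claim.

```python
import http.client
import re
import urllib.request

URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]]+")

def extract_urls(text):
    """Return the unique URLs cited in a block of text, in first-seen order."""
    seen = []
    for url in URL_PATTERN.findall(text):
        url = url.rstrip(".,;:'\"")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen

def url_resolves(url, timeout=10):
    """Best-effort check: True if a HEAD request returns a non-error status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        return False

def unresolved_citations(text, check=url_resolves):
    """Return cited URLs that fail the given check."""
    return [url for url in extract_urls(text) if not check(url)]

# Demo with a stub checker so the example runs without network access.
draft = "See https://example.com/a. Also (https://example.com/b) and https://example.com/a again."
print(unresolved_citations(draft, check=lambda url: url.endswith("/a")))
```

Some sites reject HEAD requests, so treat a failure as "look at this one by hand" rather than proof of fabrication.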

Style drift after 20+ reports. You stop reviewing carefully. The agents produce output that’s 95% good, and the 5% accumulates until a client notices. Fix: do a deep review of one deliverable per client every month. Read it like the client would. Catch what you’ve been skipping.

What should you actually do?

If you’re starting from zero:

1. Install Hermes and Paperclip. Run npx paperclipai onboard if you haven’t.
2. Install the three skills: research-comparison-table, citation-verification, systematic-debugging.
3. Create one Paperclip company. Add the five agents with the configs above.
4. Write one fake client brief. Run it through the full pipeline. Watch every step. Fix what breaks.
5. Produce 10 sample reports. Pick the 3 best. Post them on Upwork and LinkedIn.
6. Price the first three clients below market rate. Get testimonials.
7. Raise prices after the first month. Push for retainers.

If you already have agents running:

Skip to step 4. You already have the stack. The business is config work and skill authoring. After three months of real client work, you have 20-30 domain-specific skills. A competitor starting today has none. That’s the moat.

Here’s a quick function to estimate your monthly agent costs at a given volume:

def research_agency_cost(reports_per_month: int = 100) -> dict:
    """Estimate monthly API cost for the 5-agent research pipeline."""
    # Per-report cost of each agent, from the pipeline table.
    cost_per_report = {
        "intake_sonnet4": 0.014,
        "researcher_qwen": 0.003,
        "writer_sonnet4": 0.068,
        "editor_qwen": 0.001,
        "ops_local": 0.000,
    }
    per_report = sum(cost_per_report.values())
    monthly = per_report * reports_per_month
    revenue = 350 * reports_per_month  # at the $350 client price used above
    return {
        "cost_per_report": f"${per_report:.3f}",
        "monthly_agent_cost": f"${monthly:.2f}",
        "reports": reports_per_month,
        "revenue_at_350_each": f"${revenue:,.0f}",
        # Three decimals so the margin does not round up to a misleading 100%.
        "margin": f"{(1 - monthly / revenue) * 100:.3f}%",
    }

print(research_agency_cost(100))
# {'cost_per_report': '$0.086', 'monthly_agent_cost': '$8.60',
#  'reports': 100, 'revenue_at_350_each': '$35,000',
#  'margin': '99.975%'}

Bottom line

Five agents, three model tiers, $0.09 per report. The production cost of running an AI automation agency is a rounding error on the revenue.
The bottleneck is client acquisition, not agent costs. Month 1 is about getting one client. Month 3 is about systematizing intake. The agents are the easy part.
Skills compound. After three months, your Hermes skill library has 20-30 domain-specific skills that a competitor starting today doesn’t have. That’s the moat. Not the technology. The technology is free. The knowledge the system accumulates is yours.
