What is a verification council?

A verification council is an AI setup that checks a single claim from four independent angles (logic, consistency, falsification, source) and returns a verdict with a confidence band. It is built to check whether something is true, not to vote on whether an idea is good.

Can an AI council fact-check a claim?

Partly. It catches internally inconsistent claims, claims that contradict well-known facts, and claims one model invented that others do not share. It cannot manufacture a fact no model has, so its honest answer is sometimes UNVERIFIABLE.

What does UNVERIFIABLE mean?

UNVERIFIABLE means the council could not confirm or deny the claim from what the models know, usually because it needs a current or citable source they do not have. It is a valid, useful answer that sends you to a primary source instead of a confident guess.

How to Verify AI Claims with a Council

Summary:

Build a /verify council that fact-gates a claim from four independent angles.

It returns SUPPORTED, CONTRADICTED, or UNVERIFIABLE with a confidence band, never false certainty.

The honest UNVERIFIABLE verdict is the feature, not a failure: it stops you acting on a fact you only thought you had.

Bonus: the full /verify skill, a worked fake-statistic takedown, and the audit trail you can show a skeptic.

To verify AI claims, you stop asking one confident model and convene a council whose only job is to check whether the claim holds. A council for ideas asks “is this any good.” A verification council asks “is this true.” Those are different jobs, and almost everyone builds the first when the moment calls for the second, then walks away with a well-reasoned, completely unchecked answer.

How do you verify AI claims with a council?

You check the claim four independent ways and let a chairman report how confident that check should make you. A verification council does not vote on whether a claim feels plausible; each advisor interrogates the claim from a different angle, so a shared assumption cannot sail through. One builder on Reddit described reaching for exactly this:

I actually find myself asking multiple LLMs the same question when I need a definitive answer without hallucinations.

That is from an r/n8n post (63 upvotes) about rebuilding the LLM Council. The honest version of “definitive,” though, is not “the council declares it true.” It is “the council checked it four ways and here is how much to trust it.”

Exploration vs verification: two different jobs

The tell for which council you need is simple. If reasonable people could disagree about the answer forever, it is exploration. If there is a true answer that exists whether or not you like it, it is verification. “Should we raise prices to $49” is exploration. “Is our churn rate actually above the industry average” is verification.

Point an idea council at a factual question and the advisors will reason elegantly about whether the number seems plausible while never checking it. That is the cracked-foundation failure one level up: a beautiful structure of reasoning built on a fact nobody verified. Verification needs a council built to interrogate the claim itself.

Build the /verify skill

Create .claude/skills/verify/SKILL.md. It defines four checks plus a chairman in verification mode:

---
description: Fact-gate a claim. Four advisors check it from independent angles;
  the chairman reports SUPPORTED / CONTRADICTED / UNVERIFIABLE with confidence.
---

# Verify

Treat the user's message as the CLAIM. Run four checks independently with the
Bash tool (route each to a different model family where you can):

  LOGIC: "Check this claim for internal consistency only. Do the numbers and
  statements hold together? Do not judge real-world truth yet. CLAIM: <CLAIM>"

  CONSISTENCY: "Does this fit or contradict well-established facts? Flag any
  conflict. State your confidence and what you are unsure about. CLAIM: <CLAIM>"

  FALSIFICATION: "Try to prove this FALSE. What would have to be true for it to
  be wrong, and is any of that the case? If you cannot, say so. CLAIM: <CLAIM>"

  SOURCE: "How would anyone actually know this? What evidence would settle it,
  and can a language model reliably have it? Rate how verifiable this is. CLAIM: <CLAIM>"

CHAIRMAN: "Given the four checks, output (1) VERDICT: SUPPORTED / CONTRADICTED /
UNVERIFIABLE, (2) CONFIDENCE: high / medium / low with one line why, (3) the
biggest reason for doubt, (4) UNRESOLVED disagreements. Do NOT manufacture
certainty. If the council cannot confirm it, UNVERIFIABLE is a valid answer.
CHECKS: <paste all four>"

The chairman here is deliberately different from the idea council’s. It cannot say GO or NO-GO, because truth is not a decision. It says SUPPORTED, CONTRADICTED, or UNVERIFIABLE, reports a confidence band, and is forbidden from manufacturing certainty. That last instruction is the most important line in the skill.
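If you want to see the same flow outside Claude Code, here is a minimal Python sketch. It builds the four check prompts and the chairman prompt from the skill text, runs the checks, and hands all four results to the chairman. The `ask` callable is a hypothetical stand-in for however you reach a model, and `run_council`, `CHECK_PROMPTS`, and `CHAIRMAN_PROMPT` are illustrative names, not part of the skill.

```python
from typing import Callable, Dict

# Prompt templates copied from the four checks in the skill.
CHECK_PROMPTS: Dict[str, str] = {
    "logic": (
        "Check this claim for internal consistency only. Do the numbers and "
        "statements hold together? Do not judge real-world truth yet. CLAIM: {claim}"
    ),
    "consistency": (
        "Does this fit or contradict well-established facts? Flag any conflict. "
        "State your confidence and what you are unsure about. CLAIM: {claim}"
    ),
    "falsification": (
        "Try to prove this FALSE. What would have to be true for it to be wrong, "
        "and is any of that the case? If you cannot, say so. CLAIM: {claim}"
    ),
    "source": (
        "How would anyone actually know this? What evidence would settle it, and "
        "can a language model reliably have it? Rate how verifiable this is. "
        "CLAIM: {claim}"
    ),
}

CHAIRMAN_PROMPT = (
    "Given the four checks, output (1) VERDICT: SUPPORTED / CONTRADICTED / "
    "UNVERIFIABLE, (2) CONFIDENCE: high / medium / low with one line why, "
    "(3) the biggest reason for doubt, (4) UNRESOLVED disagreements. "
    "Do NOT manufacture certainty. If the council cannot confirm it, "
    "UNVERIFIABLE is a valid answer.\nCHECKS:\n{checks}"
)


def run_council(claim: str, ask: Callable[[str, str], str]) -> str:
    """Run the four checks, then the chairman.

    `ask(role, prompt)` is a hypothetical callable: it sends `prompt` to
    whichever model you route `role` to and returns the reply text.
    """
    # Each check sees only the claim, never another advisor's answer.
    results = {
        angle: ask(angle, template.format(claim=claim))
        for angle, template in CHECK_PROMPTS.items()
    }
    # The chairman sees every check, labeled by angle.
    checks = "\n\n".join(f"{angle.upper()}: {text}" for angle, text in results.items())
    return ask("chairman", CHAIRMAN_PROMPT.format(checks=checks))
```

Passing `ask` in as a parameter keeps the routing decision (which model family answers which role) out of the council logic, so you can swap providers without touching the prompts.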

Watch it gate a real claim

Say you read in a pitch deck, “AI cuts support response times 70% on average,” and you are about to repeat it to your own board. You feed it to the council:

/verify AI cuts support response times 70% on average

The logic check finds nothing internally contradictory. The consistency check gets cautious: “70% on average” is a suspiciously round, suspiciously universal number that conflicts with how much results actually vary by company. The falsification advisor goes to work: this would be false if the figure came from one vendor’s marketing study generalized into a universal claim, which is how numbers like this usually originate. The source check delivers the finding: no model can confirm an “on average” statistic without a specific citable study, and it does not have one. The chairman weighs them, with no single check deciding alone:

VERDICT: UNVERIFIABLE. CONFIDENCE: low.

The directional idea (AI cuts support response times) is well supported. The
specific universal figure (70% on average) is not, and reads like a vendor
statistic generalized beyond its evidence.
Biggest reason for doubt: round, universal percentages almost always come from
one study with a narrow sample.
UNRESOLVED: the council cannot locate a credible source for the 70% figure.
Do this instead: cite the directional claim, drop the number, or find the source.

That verdict just saved you from putting a fake-precise statistic in front of your board. It did not say “false.” It said “I cannot confirm this, here is why, here is what to do instead,” which is the honest answer and a far more useful one.
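If you plan to act on verdicts in a script, it helps to parse the chairman's reply and reject anything outside the three allowed verdicts. The sketch below assumes the reply contains `VERDICT:` and `CONFIDENCE:` labels as in the example above; `Verdict` and `parse_verdict` are illustrative names, and a real reply may need looser matching.

```python
import re
from dataclasses import dataclass

ALLOWED_VERDICTS = {"SUPPORTED", "CONTRADICTED", "UNVERIFIABLE"}
ALLOWED_CONFIDENCE = {"high", "medium", "low"}


@dataclass
class Verdict:
    verdict: str
    confidence: str


def parse_verdict(text: str) -> Verdict:
    """Extract verdict and confidence from a chairman reply.

    Assumes the reply uses the 'VERDICT: X' and 'CONFIDENCE: y' labels.
    Raises ValueError if a label is missing or the value is not allowed.
    """
    verdict_match = re.search(r"VERDICT:\s*([A-Za-z]+)", text)
    confidence_match = re.search(r"CONFIDENCE:\s*([A-Za-z]+)", text)
    if not verdict_match or not confidence_match:
        raise ValueError("reply is missing VERDICT or CONFIDENCE")

    verdict = verdict_match.group(1).upper()
    confidence = confidence_match.group(1).lower()

    # A chairman that answers GO / NO-GO, or anything else, is out of spec.
    if verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"unexpected verdict: {verdict}")
    if confidence not in ALLOWED_CONFIDENCE:
        raise ValueError(f"unexpected confidence: {confidence}")
    return Verdict(verdict=verdict, confidence=confidence)
```

Failing loudly on an out-of-spec verdict is deliberate: a reply you cannot parse should send you back to the raw checks rather than be coerced into a yes or no.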

What a verification council can’t do

Here is the limit, stated sharply because verification is where overselling gets dangerous: a council narrows error; it does not manufacture missing facts. If a fact is simply not known to any model in the council, convening four of them does not summon it. Four models that all lack a piece of information will lack it together, confidently, and a unanimous council can be unanimously wrong about a fact none of them ever had.

Two consequences follow. First, recent claims (“did this company raise funding last week,” “is this regulation in effect now”) depend on current facts the models may not have, so the right verdict is usually UNVERIFIABLE. Second, a shared training gap is invisible from inside the council, which is exactly why the multi-model build matters: different families have different gaps, so a shared blind spot is less likely. Treat a high-confidence SUPPORTED as worth acting on, a CONTRADICTED as go-look-hard, and an UNVERIFIABLE as “the council did its job, now I need a human or a source.”
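That triage can be written down as a small lookup. This is a sketch with an illustrative function name, `next_step`. The branch for a SUPPORTED verdict below high confidence is an assumption, since the paragraph above only covers the high-confidence case.

```python
def next_step(verdict: str, confidence: str) -> str:
    """Map a council verdict to what you should do next."""
    if verdict == "UNVERIFIABLE":
        # The council did its job; the missing piece is outside the models.
        return "go to a human or a primary source"
    if verdict == "CONTRADICTED":
        return "look hard before repeating the claim"
    if verdict == "SUPPORTED":
        if confidence == "high":
            return "worth acting on"
        # Assumption: anything below high confidence is treated as
        # probable rather than proven.
        return "treat as probable, not proven"
    raise ValueError(f"unknown verdict: {verdict}")
```

No branch returns "true" or "false". The function only tells you how much weight the check can bear.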

Keep an audit trail you can show a skeptic

Verification is the council whose work you most need to show someone, because “the AI said so” does not survive a skeptic. So export a paper trail:

AUDIT-TRAIL EXPORT (write under verify-log/):
  - The CLAIM exactly as posed
  - Each check, labeled by angle (logic / consistency / falsification / source)
    and by which model family ran it
  - The verdict, confidence band, and reason for doubt
  - The UNRESOLVED points, verbatim
Keep dates out of the entry text; let the filesystem track timestamps.
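One way to produce that export is sketched below. It writes one JSON file per claim under verify-log/, with each check labeled by angle and model family. The function name `export_audit`, the JSON layout, and the hash-based filename are assumptions for illustration; the only requirements taken from the list above are the fields themselves and the absence of dates in the entry text.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def export_audit(
    claim: str,
    checks: List[Dict[str, str]],
    verdict: str,
    confidence: str,
    reason_for_doubt: str,
    unresolved: List[str],
    log_dir: str = "verify-log",
) -> Path:
    """Write one audit entry and return its path.

    Each item in `checks` is expected to carry the keys
    'angle', 'model_family', and 'output'.
    """
    entry = {
        "claim": claim,              # exactly as posed
        "checks": checks,            # labeled by angle and model family
        "verdict": verdict,
        "confidence": confidence,
        "reason_for_doubt": reason_for_doubt,
        "unresolved": unresolved,    # verbatim, not summarized
    }
    directory = Path(log_dir)
    directory.mkdir(parents=True, exist_ok=True)

    # No date in the name or the body; the filesystem records timestamps.
    name = hashlib.sha256(claim.encode("utf-8")).hexdigest()[:12]
    path = directory / f"{name}.json"
    path.write_text(json.dumps(entry, indent=2, ensure_ascii=False), encoding="utf-8")
    return path
```

Because the filename is derived from the claim text, verifying the identical claim again overwrites the earlier entry. If you want to keep every run, add a counter to the name.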

Now when someone challenges your conclusion, you hand them the export instead of arguing from authority: here is the claim, four independent checks, where they agreed and split, the confidence, and the unresolved doubt. That is a defensible artifact.

What should you actually do?

If the question has a true answer (a stat, a claim, a fact) → send it to /verify, not the idea council.
If the claim is recent or needs a live source → expect UNVERIFIABLE and go to a primary source. That is the council working, not failing.
If you need correctness → run the checks across different model families, so one shared gap does not pass as consensus.
If the verdict is SUPPORTED but the source check was weak → treat it as “probably,” not proof, and keep the audit trail.
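The model-family rule is easy to enforce mechanically. The sketch below spreads the four checks across whatever families you have and tells you whether every check got its own. `assign_families` and the family labels you pass in are hypothetical; how you map a label to an actual model is up to your setup.

```python
from itertools import cycle
from typing import Dict, List, Tuple

ANGLES = ["logic", "consistency", "falsification", "source"]


def assign_families(families: List[str]) -> Tuple[Dict[str, str], bool]:
    """Assign each check angle to a model family, round-robin.

    Returns the assignment and a flag that is True only when every
    angle runs on a different family.
    """
    if not families:
        raise ValueError("need at least one model family")

    # Drop duplicates while keeping the caller's order.
    unique = list(dict.fromkeys(families))

    # zip stops after the four angles, so cycle() cannot run forever.
    assignment = dict(zip(ANGLES, cycle(unique)))
    fully_diverse = len(unique) >= len(ANGLES)
    return assignment, fully_diverse
```

When the flag comes back False, at least two checks share a family, so agreement between those two is weaker evidence than it looks. That is worth recording in the audit trail.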

The bottom line

UNVERIFIABLE is a first-class answer. A verification council that always returns a confident yes or no is broken.
It reports calibrated certainty, not false certainty. That is the exact thing a single agreeable model stole from you.
Use it to lower your odds of believing something false. Do not treat it as an oracle, because the moment you do, you have rebuilt the yes-man.

How to Verify AI Claims with a Council

The LLM Council

How do you verify AI claims with a council?

Exploration vs verification: two different jobs

Build the /verify skill

Watch it gate a real claim

What a verification council can’t do

Keep an audit trail you can show a skeptic

What should you actually do?

The bottom line

Frequently Asked Questions

More from this book

Build an AI Board of Advisors That Argues Back

AI Code Review with a Multi-Model Council

How to Build an LLM Council in Claude Code

Build a Multi-Model AI Council with OpenRouter