I need a fact-checked answer, not a confident guess.

Ch. 01 What it is

A model will answer anything you ask in the same calm, certain voice — whether it read the source or invented it. The fix isn't a better prompt; it's refusing to trust a claim until you've watched it survive a source. Three ways to make it earn your trust, simplest first.

Ch. 02 The three ways to build it

Simplest path first. Every tier carries its real setup time and its honest trade-off — the cost is the part most write-ups leave out.

1
Tier 1 · simplest path
One careful, source-anchored search
Setup~5 min
- any search-capable model
- a browser
Ask the question, but change what you accept as an answer. Instead of 'what's the answer,' ask the model to find a source, quote the line that answers you, and link it — then answer from that line, not from memory. Read the quoted line yourself and click the link. If the source doesn't actually say what the answer claims, you've caught the guess before it cost you anything. That's the whole mechanism: you've moved from trusting a fluent sentence to trusting a sentence you can trace back to where it came from.
The trade-off
One source is one source. A single page can be wrong, out of date, or somebody's marketing — and a model is perfectly happy to quote it as gospel. This tier catches outright fabrication, the made-up statistic and the invented citation, which is most of what hurts you. It does not catch a real source that happens to be wrong. Use it for 'point me at the truth, don't make one up,' not for 'this number decides the budget.'
2
Tier 2
Multi-source cross-check
Setup~20 min
- a search-capable model
- a notes doc
Now make the claim survive more than one source. For each fact that matters, ask for two or three independent sources — not three links to the same press release — and have the answer note where they agree and, more usefully, where they don't. Agreement across sources that don't share a parent raises your confidence honestly. Disagreement is the gift: it surfaces exactly the claims that are contested, stale, or genuinely unknown, instead of letting the model paper over the gap with a confident average. You end with an answer that carries its own evidence and flags its own soft spots.
The trade-off
Cross-checking rewards diversity, and models will quietly cheat it — three citations that all trace to one original aren't three sources, they're one wearing three hats. Watch for the same wire story, the same dataset, the same author republished. And consensus isn't truth: a popular wrong answer cross-checks beautifully. This tier buys you well-evidenced and contested-flagged, which is most of what you need for a real decision — but it still trusts that the sources are independent and competent, and sometimes they're neither.
3
Tier 3
Adversarial fan-out + cited synthesis
Setup~half day to wire, minutes to run
- a research harness
- parallel search workers
- a model that cites
Stop asking the question once. Fan it out: break it into the sub-questions it actually depends on, run many searches in parallel, and gather a wide pool of sources. Then turn the model against its own draft — for each load-bearing claim, run a pass whose only job is to attack it, find the source that contradicts it, and demand the citation that backs it. Claims that survive the attack go into a synthesis where every sentence carries its source; claims that don't get marked unconfirmed rather than smoothed over. The output isn't a paragraph — it's a report you could hand to someone who'll check your work.
The trade-off
This is the version that earns trust on the hard questions, and it costs the most to run and to keep honest. Fan-out multiplies the bill — many searches, many model calls — and breadth without the adversarial pass just gathers more confident noise faster. The verification step is the whole point; skip it to save tokens and you've built an expensive way to be wrong at length. Reach for this tier when being wrong is genuinely expensive — a decision, a published claim, a number someone will act on — not for everyday questions a single careful search already settles.

Ch. 03 The detail

A model will answer anything you ask in the same calm, certain voice — whether it read the source or invented it. The fix isn't a better prompt; it's refusing to trust a claim until you've watched it survive a source. Three ways to make it earn your trust, simplest first.

Category: Knowledge-ops · Research & verification
Format: System
Level: intermediate
Provenance: Own-packaged

researchverificationfact-checkingdeep-researchknowledge-opscitations

The problem is the voice, not the answer

A model answers a question it researched and a question it invented in exactly the same tone: calm, fluent, certain. There’s no tell. The confident guess and the verified fact wear the same face, which is precisely why the guess is dangerous — it reads as authoritative right up until the moment you act on it and find the statistic was never real, the citation points nowhere, the quote was assembled rather than found.

So the fix isn’t a cleverer prompt that makes the model “be more accurate.” You can’t prompt your way out of a confident hallucination, because the model doesn’t know it’s hallucinating. The fix is to change what counts as an answer. Stop accepting a fluent sentence. Start accepting a sentence you can trace to a source you can open. Everything below is a way to force that trace — they differ only in how hard the claim has to work before you believe it.

Match the rigour to the cost of being wrong

The instinct, once you’ve been burned by a hallucination, is to reach straight for the heavy harness on every question. That’s the wrong reflex in the other direction. Most questions are settled cleanly by Tier 1 — make the model show you the source and quote the line, read it, done. Five minutes, and you’ve eliminated the failure that actually hurts most: the fact that was never there.

Climb when the cost of being wrong climbs. Tier 2 earns its twenty minutes when one source isn’t enough to stake a decision on — when you need the claim to agree across sources that don’t share a parent, and you want the disagreements surfaced rather than averaged away. Tier 3 earns its half-day of wiring when being wrong is genuinely expensive: a number that sets a budget, a claim you’re about to publish, a conclusion someone downstream will act on without rechecking. The adversarial pass — turning the model against its own draft to attack each claim and demand the citation — is what separates a real research report from a longer, more confident guess.

The honest version

No tier makes a model incapable of being wrong; they make it harder for a wrong answer to reach you unchallenged. Tier 1 catches the invented source. Tier 2 catches the lonely or contested one. Tier 3 catches the claim that sounds right and survives a casual read but folds the moment something is sent to argue against it. Each rung costs more — your attention, then your time, then your token bill — and each buys a stronger guarantee that the answer earned its place.

Pick the lightest rung that matches what the answer decides. A confident guess is cheap to produce and expensive to trust; the whole point of this ladder is to stop paying that second bill by accident.