PromptingBasics

Stop Hallucinations

By Dan Lee
Dec 20, 2025

Hallucinations: How to Spot and Reduce Them

Hallucinations are the awkward moment when an AI responds with total confidence… and it’s wrong. Not “typo wrong.” Invented facts, fake citations, incorrect calculations, made-up API behavior—that kind of wrong.

This happens because LLMs are trained to produce plausible text, not guaranteed truth. But here’s the good news: you can usually spot hallucinations faster and reduce them dramatically with a few reliable habits.

Quick Definition

A hallucination is when an AI outputs information that isn’t grounded in your inputs, data, or verified sources—especially when it states it as fact.

Spotting Hallucinations

Hallucinations have a “smell.” Here are the most common signals:

  • Overconfident tone with no evidence
    (“This law was updated in 2023…” with no source and no way to verify)

  • Unverifiable specifics
    Fake names, dates, metrics, or “studies” that you can’t trace.

  • Answering the wrong question
    The model fills gaps with assumptions instead of asking clarifying questions.

  • Formatting that looks authoritative
    Tables, bullet lists, or citations that look real but aren’t tied to any source.

  • Tool misuse (for builders)
    The model claims it “queried the database” with no tool trace to back it up, or returns SQL that doesn’t match your schema. The first case is easy to catch programmatically (see the sketch just below).
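
Here is a minimal sketch of that check. The response text and the tool_calls trace are placeholders for whatever your agent framework actually records, and the claim patterns are purely illustrative.

Python
import re

# Phrases that claim tool use; patterns are illustrative, extend for your agent.
TOOL_CLAIM_PATTERNS = [
    r"\bI (queried|checked|searched) the (database|index|web)\b",
    r"\baccording to the (database|API)\b",
]

def unbacked_tool_claims(response_text, tool_calls):
    """Return claimed tool usage when the recorded trace is empty."""
    if tool_calls:  # at least one real call was logged, nothing to flag here
        return []
    claims = []
    for pattern in TOOL_CLAIM_PATTERNS:
        claims += [m.group(0) for m in re.finditer(pattern, response_text, re.IGNORECASE)]
    return claims

# Usage: unbacked_tool_claims(answer_text, trace) -> ["I queried the database", ...]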

Reducing Hallucinations (What Actually Works)

The goal isn’t “never hallucinate.” The goal is: make hallucinations obvious and unlikely.

Here are practical guardrails:

  1. Ground the model in input data
  • Paste the relevant text, table, or snippet.
  • Ask it to quote or reference the exact lines it used.
  2. Force assumptions to be explicit
  • “If something is unknown, say ‘Unknown’ and list what you need.”
  3. Require a verification step
  • Add a second pass: “Critique your answer. Flag anything uncertain.” (See the sketch after this list.)
  4. Constrain output format
  • JSON schemas, checklists, and tables with “Evidence” columns reduce free-wheeling speculation.
  5. Use tools when accuracy matters
  • Retrieval (RAG), browsing, database queries, calculators, code execution.
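
If you are scripting this, guardrails 2 and 3 fit naturally into a two-pass flow. The sketch below assumes a placeholder call_llm function that wraps whatever model client you use; the prompt wording is just one way to phrase the constraints.

Python
# Two-pass guardrail: draft with explicit unknowns, then critique the draft.
# call_llm is a placeholder for your own client wrapper (any model/provider).
def call_llm(prompt):
    raise NotImplementedError("wire this to your model client")

def grounded_answer(question, source_text):
    draft_prompt = (
        "Answer using ONLY the input data below.\n"
        "If something is unknown, write 'Unknown' and list what you need.\n\n"
        f"Question: {question}\n\nInput data:\n{source_text}"
    )
    draft = call_llm(draft_prompt)

    critique_prompt = (
        "Critique the answer below against the input data.\n"
        "Flag any claim that is not directly supported and anything uncertain.\n\n"
        f"Input data:\n{source_text}\n\nAnswer:\n{draft}"
    )
    critique = call_llm(critique_prompt)
    return {"draft": draft, "critique": critique}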

Example 1: Research Safeguard (Non-Technical)

A marketer wants competitive research. The model loves filling gaps unless you constrain it.

Text
Context: I’m preparing a competitive brief for a marketing meeting.
Instruction: Summarize the differences between Product A and Product B.
Input Data: Use only the notes below. If a detail isn’t present, write “Unknown.”
[NOTES]
- Product A: target = SMB, pricing starts at $49/mo, key feature = automated reporting
- Product B: target = mid-market, pricing = not listed, key feature = workflow automation
Output Indicator: Provide a table with columns: Category, Product A, Product B, Evidence (quote from notes).

This prompt prevents the model from inventing pricing, features, or positioning—and it forces evidence.
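
If you also ask for the table as JSON rows (say, with an "evidence" field; a small tweak to the Output Indicator), you can verify grounding mechanically. A minimal sketch, assuming that tweak:

Python
import json

def unsupported_rows(model_json, notes):
    """Rows whose Evidence value is not a verbatim quote from the notes."""
    rows = json.loads(model_json)  # e.g. [{"category": ..., "evidence": ...}, ...]
    return [
        row for row in rows
        if row.get("evidence") and row["evidence"] not in notes
    ]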

Example 2: Engineering Guardrail (Tool + Critique)

You want SQL, but also safety and correctness.

Text
Context: You are assisting an AI engineer generating BigQuery SQL for a dashboard.
Instruction: Draft the SQL, then run a self-check for hallucinations and schema mismatches.
Input Data:
Schema:
- events(user_id STRING, event_name STRING, event_ts TIMESTAMP, country STRING)
Goal: Weekly active users by country for the last 8 complete weeks.
Output Indicator:
1) FINAL_SQL
2) VALIDATION_NOTES: list assumptions, AND list any columns/tables you were tempted to use but are not in the schema.

That “tempted to use” line is sneaky-effective. It catches the model when it tries to reach for imaginary fields like created_at or users.country.
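
You can back the model’s self-check with a crude check of your own. The sketch below flags identifiers in the generated SQL that are not in the schema; expect some noise from aliases and less common keywords, so treat the output as a prompt for review, not a verdict.

Python
import re

# Mirrors the events schema in the prompt above; extend for your own tables.
SCHEMA = {"events": {"user_id", "event_name", "event_ts", "country"}}

# Common keywords/functions to ignore; add more as needed.
SQL_NOISE = {
    "select", "from", "where", "group", "by", "order", "count", "distinct",
    "as", "and", "or", "on", "join", "not", "desc", "asc", "limit", "having",
    "date_trunc", "timestamp_trunc", "interval", "week", "current_date",
}

def unknown_identifiers(sql):
    """Identifiers in the SQL that are neither schema names nor known keywords."""
    known = set(SCHEMA) | {col for cols in SCHEMA.values() for col in cols}
    tokens = set(re.findall(r"[a-z_][a-z0-9_]*", sql.lower()))
    return tokens - known - SQL_NOISE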

Add an Evidence Column

If you’re seeing hallucinations, make “Evidence” a required field in the output. Models hallucinate less when they must justify each claim.
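
If your stack supports structured outputs, you can enforce this at the schema level rather than in prose. An illustrative JSON Schema (field names are examples, not a required convention):

Python
# Illustrative JSON Schema for structured output: every claim must carry
# a verbatim "evidence" quote. Field names here are examples, not a standard.
CLAIM_SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "claim": {"type": "string"},
            "evidence": {"type": "string", "description": "Verbatim quote from the source"},
        },
        "required": ["claim", "evidence"],
        "additionalProperties": False,
    },
}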

Takeaway

Hallucinations aren’t random—they’re predictable. Spot them by watching for confident claims without evidence, untraceable specifics, and assumptions masquerading as facts. Reduce them by grounding the model in real input data, forcing explicit unknowns, adding a verification pass, and using structured outputs (plus tools when needed).

Your best mental model: LLMs are brilliant draftspeople, not guaranteed truth engines. Give them guardrails, and they’ll reward you with reliability.

Dan Lee

DataInterview Founder (Ex-Google)

Dan Lee is an AI tech lead with 10+ years of industry experience across data engineering, machine learning, and applied AI. He founded DataInterview and previously worked as an engineer at Google.