Lesson 11 · Foundations · ~7 min

How to Spot When AI Is Wrong (and Verify Before You Trust)

Here's a small, slightly unsettling experiment. Ask an AI something you know cold — then ask it something obscure you'd normally have to look up. Read both answers back to back. They sound identical in tone: the same calm, the same certainty, the same tidy confidence. One of them might be completely made up. And nothing in the writing tells you which.

A few lessons back, we said to carry a healthy pinch of doubt and promised we'd come back to it properly. This is the coming-back. Time to turn that pinch of doubt into an actual skill — because doubt only helps if you know where to point it.

Why it sounds so sure, even when it's wrong

Quick reminder of why this happens, because it explains everything after it. When we looked at how an LLM thinks, the big idea was prediction, not lookup — it's a brilliant improviser, building each answer word by likely word, not pulling a fact off a shelf. So when it doesn't actually know something, it doesn't pause and say "not sure." It improvises the most plausible-sounding answer instead. And a plausible wrong answer comes out in exactly the same voice as a right one.
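If it helps to see that idea in miniature, here is a toy sketch in Python. It is not how a real model is built: the table `CANDIDATES`, its probabilities, the made-up names in it, and the functions `improvise` and `hidden_confidence` are all invented for illustration. What it shows is the shape of the problem: the sentence that comes out looks the same whether the guess behind it was strong or weak.

```python
# Toy illustration only. The questions, names, and probabilities below are
# invented; a real model learns vastly more, and far subtler, tendencies.
CANDIDATES = {
    # A well-known fact: one option dominates.
    "capital of France": {"Paris": 0.97, "Lyon": 0.02, "Nice": 0.01},
    # An obscure question: nothing stands out (the rest of the probability
    # is imagined to be spread thinly over many other names).
    "mayor of a tiny village in 1873": {
        "Jean Martin": 0.21,
        "Paul Durand": 0.20,
        "Louis Petit": 0.19,
    },
}


def improvise(question):
    """Return the most plausible-sounding answer as a confident sentence."""
    options = CANDIDATES[question]
    best = max(options, key=options.get)
    # The wording is identical no matter how weak the guess was.
    return f"The answer is {best}."


def hidden_confidence(question):
    """The number the reader never gets to see."""
    return max(CANDIDATES[question].values())


for q in CANDIDATES:
    print(improvise(q), "| hidden confidence:", hidden_confidence(q))
```

Both printed sentences have the same calm shape. Only the hidden number differs, and that number is exactly what the text of an answer leaves out.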

You've already met the word for this: a hallucination — a confident, made-up answer. It isn't lying. It genuinely has no idea it's doing it.

The one thing to burn into your brain

Here's the whole lesson in four words: confidence is not correctness. The assured tone you're reading carries zero information about whether the thing is true. Once that really lands — not as a scary warning, but as a plain fact about how the tool works — you stop being caught off guard, and you start checking the right things instead of everything.

The tells: when to get suspicious

You can't check everything — that'd be exhausting and pointless. The real skill is sensing which answers deserve a second look. A handful of patterns should make your antenna twitch:

  • Oddly exact details on obscure things. A precise date, an exact statistic, a word-for-word quote — for something niche or genuinely hard to look up. The more specific and the more obscure, the more suspicious. Invented facts are every bit as crisp as real ones.
  • Citations and sources. This is the big one. Ask for sources and it'll often hand you real-sounding papers, books, or links — proper titles, plausible authors, tidy years — that simply don't exist. A citation isn't proof. The citation is the thing to check.
  • Total confidence on a genuinely fuzzy question. If something should get a "well, it depends" and you get crisp certainty instead, that certainty may be papering over a guess.
  • The answer changes when you rephrase. Ask the same thing a different way. If the second answer meaningfully disagrees with the first, it doesn't have solid ground under it — it's improvising both times.
  • The "fill the quota" trap. Ask for five reasons when only two real ones exist, and it'll cheerfully invent three to round out the list. Forcing a number can manufacture fiction.

Picture asking for "three studies" on some niche topic. Back come three, with neat titles, real-sounding authors, the works. Two are genuine. The third was assembled out of thin air to complete the set, and it looks exactly as legitimate as the other two. That's not a rare glitch. It's one of the most common ways people get burned.
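If you're comfortable with a little Python, the tells above can be turned into a rough checklist maker. This is a minimal sketch, not a fact-checker: the function name `find_checkable_details`, the patterns it uses, and the sample sentence are all assumptions made up for this example. It can't tell you whether anything is true. It only pulls out the crisp specifics (years, percentages, quotes, links) that deserve a second look.

```python
import re


def find_checkable_details(answer):
    """Collect the kinds of specifics worth verifying by hand.

    Crude patterns, chosen for illustration; they will miss things and
    sometimes pick up things that are harmless.
    """
    return {
        # Four-digit years from 1500 to 2099.
        "years": re.findall(r"\b(?:1[5-9]\d{2}|20\d{2})\b", answer),
        # Figures such as 12% or 38.5%.
        "percentages": re.findall(r"\b\d+(?:\.\d+)?%", answer),
        # Anything presented inside straight double quotes.
        "quotes": re.findall(r'"([^"]+)"', answer),
        # Web links, stopping at whitespace or a closing bracket.
        "links": re.findall(r"https?://[^\s)\]]+", answer),
    }


# An invented AI answer, used only to exercise the function.
sample = (
    'A 2019 study by Dr. Lee found a 38.5% drop, calling it '
    '"a turning point" (see https://example.org/study-42 for details).'
)

for kind, items in find_checkable_details(sample).items():
    print(kind, "->", items)
```

Every item it lists is something to look up yourself. An empty list doesn't mean the answer is safe, only that these particular patterns found nothing.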

The habit that covers most of it

Now the good news: you don't need a five-point inspection routine. One rule does most of the work. If a claim matters, check it before you act on it. If it doesn't, let it ride.

Brainstorming dinner ideas? Let it ride — nobody's hurt if one suggestion is a dud. About to repeat a "fact" in a report, take a supplement because it said so, or send a figure to your boss? That matters. Check it.

And checking is lighter than it sounds:

  • Cross-check the one claim that matters with a quick independent search, or a source you already trust.
  • Ask it to show its source — then actually open the source. Don't settle for the fact that one supposedly exists.
  • Ask the same question a second way and see if the answer holds.
  • For anything current or live, lean on the AI tools that can search the web — the plain chatbot is working from memory, not looking out the window.
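The "ask it a second way" check can also be sketched in a few lines of Python. Treat this as a rough illustration under stated assumptions: the helper names `numbers_in` and `conflicting_numbers` and both sample answers are invented here, and comparing numbers is only one crude way to notice disagreement. You would paste in the two answers yourself; nothing below talks to any AI tool.

```python
import re


def numbers_in(text):
    """Return the set of numeric tokens in a piece of text.

    Simplification: figures written with commas (like 4,000) or as words
    ("four thousand") are not handled.
    """
    return set(re.findall(r"\d+(?:\.\d+)?", text))


def conflicting_numbers(first_answer, second_answer):
    """Numbers that show up in one answer but not in the other."""
    a = numbers_in(first_answer)
    b = numbers_in(second_answer)
    return sorted(a ^ b)  # symmetric difference, sorted as text


# Two invented answers to the same question, asked two different ways.
first = "The festival began in 1987 and drew 4000 visitors."
second = "It started in 1987 with about 12000 visitors."

print(conflicting_numbers(first, second))
```

Here the year matches in both answers, while the visitor figures do not, so the visitor count is the claim to go and verify. A mismatch isn't proof that either answer is wrong. It's a cue that the ground under that detail may be soft.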

Where being wrong gets expensive

Some mistakes cost nothing. Others end up in court. Slow right down in the obvious zones: anything medical, legal, or financial; exact figures; and anything you can't afford to get wrong and can't easily verify.

These aren't hypothetical. Lawyers have filed real court documents citing cases the AI invented, cases that flat-out don't exist, and been penalized by the judge for it. A company was held responsible when its customer-service chatbot confidently told a customer the wrong thing; "the bot said it, not us" didn't fly. A newspaper even ran an AI-generated summer reading list recommending books that don't exist, credited to famous authors who never wrote them.

None of those were exotic technical failures. They were ordinary people who trusted a confident answer and skipped the one verify step. That gap — checked it versus didn't — is the whole difference between using AI well and getting embarrassed by it.

A few things to keep to yourself

One more habit, quick — less about accuracy, more about safety. Treat the chat box like a semi-public space, not a private vault.

  • Don't paste in secrets you wouldn't post somewhere public: passwords, full card or bank numbers, government IDs, other people's private details.
  • Assume your conversations may be used to help improve the tool unless you've turned that off. Whether you can, and where the setting lives, varies by provider — so it's worth a peek in your settings.
  • Before you let an agent — the loop with hands from last lesson — act on something for you, give it the same glance you'd give your own work. Once it's pressed the button, it's pressed.

Nothing here is cause for alarm. It's basic hygiene — the digital version of not reading your PIN out loud in a crowded room.
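For the curious, here is what a "glance before you paste" could look like as a small Python sketch. The function names `luhn_ok` and `find_risky_bits`, the warning wording, and the patterns are assumptions made for this example, not a feature of any chat tool. It looks for just two things: digit runs that pass the Luhn checksum used by payment card numbers, and text that looks like a labeled password or PIN. It will miss plenty, so it's a prompt to pause, not a guarantee.

```python
import re


def luhn_ok(digits):
    """Luhn checksum: the check that payment card numbers are built to pass."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_risky_bits(text):
    """Return warnings for things you probably shouldn't paste into a chat."""
    warnings = []
    # 13 to 19 digits, optionally separated by spaces or hyphens.
    for match in re.finditer(r"\b(?:\d[ -]?){13,19}\b", text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):
            warnings.append("possible card number")
    # A label such as "password:" or "PIN =" followed by a value.
    if re.search(r"\b(password|passcode|pin)\b\s*[:=]", text, re.IGNORECASE):
        warnings.append("possible password or PIN")
    return warnings


# 4111 1111 1111 1111 is a widely used dummy test number, not a real card.
draft = "My card is 4111 1111 1111 1111 and password: hunter2"
print(find_risky_bits(draft))
print(find_risky_bits("Dinner ideas for 4 people, please"))
```

The first draft triggers both warnings and the second triggers none. Other people's private details, government IDs, and anything else sensitive still need your own judgment, since no short pattern list can catch them all.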

So where does that leave you

Put it together and you land somewhere genuinely powerful. You're not the person who swallows every confident answer whole, and you're not the one so spooked they won't touch the tool at all. You're the one who knows which jobs to hand over and which to double-check — which, way back in the lesson on what it's actually for, we said was the whole game. Now you can actually play it.

And that, genuinely, is the end of the foundation. Look back at where you started: unsure what half these words even meant. Now you can name the pieces, you know how it thinks, what it costs, what it can make, how it acts — and how to keep it honest. From here the road forks. If the tool you keep reaching for is Claude, the next course tours its whole house, room by room. And if building something of your own ever calls to you, you've already seen that door too. Either way — you walked in confused, and you're walking out fluent. That was the point all along.
