Comparison 2026-06-08

Why ChatGPT, Claude, Perplexity, and Gemini Cite Different Local Sources for the Same Business

Ask four AI assistants about the same local business and you get four answers drawn from four different sources. The cause is not opinion — it is retrieval architecture. A comparison of how ChatGPT, Claude, Perplexity, and Gemini reach a local fact, and why provenance is the variable that decides who cites whom.

Plate I Three people checking their phones at a cafe table — four assistants, four answers about the same place Photograph: Vitaly Gariev · Unsplash

Plate II A large curved screen displaying an AI chat interface — the surface where a local recommendation finally lands Photograph: BoliviaInteligente · Unsplash

Plate III Lines and dots on a blue field: the same fact, four engines, four provenance paths Photograph: Conny Schneider · Unsplash

There are four AI assistants in wide use right now, and if you ask all four about the same local business, they will frequently disagree. Not about whether the place is good, but about what its hours are, where it sits, and whom they heard it from. The interesting part is that none of them has to be wrong for this to happen. They disagree because they are reading four different internets. ChatGPT, Claude, Perplexity, and Gemini each reach a fact about your shop through a different retrieval pipeline, and the pipeline, more than the fact, decides what gets repeated and who gets credited.

That is the comparison I want to make here, and I want to make it carefully, because the easy version of this article is a listicle (“five tricks to rank in ChatGPT”), and that version is wrong on its face. There are no per-engine tricks, because the engines are not scoring your tactics. They are routing a fact from a source to a sentence, and they route it differently. So instead of tactics, I am going to compare the four engines along the axis that actually separates them: where each one gets its local facts from, and which source it trusts when sources conflict.

Naming the comparison axis first

A comparison is only honest if you say what you are comparing before you start, so here is the axis. I am not ranking these engines by quality, accuracy, or market share. I am comparing them on one thing: the provenance path each engine privileges when it answers a local query. Concretely, that breaks into three sub-questions I will hold each engine to:

Retrieval surface: does it read the web from a search index, a real-time crawl, or Google’s own infrastructure?
Distance to the Google Knowledge Graph: how directly can it reach Google’s entity-resolved version of your business?
Citation posture: does it surface explicit sources, or absorb facts silently?

Those three questions are enough to explain almost every disagreement between the four engines. Notice what is not on the list: how clever your content is. That axis belongs to a different, older conversation.

One layer note before I start, because it is easy to conflate three things. This article compares engines, the assistants themselves. A separate comparison, which I made in The Three Provenance Paths, compares paths: first-party schema versus the Knowledge Graph versus third-party reviews, independent of any engine. And a third comparison would weigh surfaces: the local pack versus a generated answer. Engine, path, surface: three different units of comparison. This one is engines. When I say “Perplexity leans third-party,” I am saying an engine prefers a path; the path comparison stands on its own.

A disclosure I have to make before the comparison

Everything I am about to say about how these four engines retrieve facts is documented architecture-based inference, not measured citation. I have read each vendor’s published retrieval documentation and stated search partnerships, watched how each engine behaves, and reasoned from architecture to likely provenance. I have not run a controlled experiment that isolates provenance as a variable and measures citation rates per engine. Nobody outside the labs can do that cleanly, because the retrieval stack is not externally instrumentable. So read the comparison below as a planning map drawn from public specifications, not as a benchmark. Where I state a partnership or an index as fact, it is publicly documented; where I infer a weighting from it, that inference is mine.

The four engines, one axis at a time

ChatGPT: index-mediated, Google at arm’s length

ChatGPT’s web retrieval runs primarily through a search index partnership rather than a direct connection to Google, and it browses pages to read their content and on-page JSON-LD when a query needs fresh facts. The structural consequence is that being in the index is the first gate; a business that is poorly indexed is invisible to ChatGPT regardless of how clean its schema is. Because it has no native pipe into the Google Knowledge Graph, Google-resolved facts reach ChatGPT secondhand, through indexed search-result surfaces rather than directly. So for ChatGPT, the working order is: indexability first, then the first-party schema on the indexed page, with Knowledge Graph facts arriving filtered through a layer of indexed pages.

Claude: open-web retrieval, conservative about what it repeats

Claude retrieves through a web-search provider and parses the JSON-LD on the pages it fetches, but its citation posture is the most conservative of the four: it leans on editorial mentions and corroborated on-page facts and is comparatively cautious about asserting a local detail it cannot ground. With no dedicated Knowledge Graph pipe, it resolves an entity from the open web: first-party schema on the fetched page plus third-party coverage. The upstream dependency that matters here is the search provider. What the provider returns is the ceiling on what Claude can cite, which makes Claude’s local behavior partly a function of a retrieval layer it does not own.
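For readers who have not looked at what "parsing the JSON-LD on a fetched page" involves mechanically, here is a minimal sketch using only the Python standard library. It is not how any of the four vendors does it; the sample page, the business name, and the helper names `JsonLdExtractor` and `extract_jsonld` are all made up for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed markup is skipped rather than guessed at


def extract_jsonld(html):
    """Return every JSON-LD object found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


# Hypothetical page for a hypothetical business.
SAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Bakery",
 "name": "Example Bakery", "openingHours": "Mo-Fr 07:00-18:00"}
</script>
</head><body><p>Fresh bread daily.</p></body></html>
"""

blocks = extract_jsonld(SAMPLE_PAGE)
print(blocks[0]["name"], blocks[0]["openingHours"])
```

The point of the sketch is the dependency it makes visible: the first-party fact is only readable if the page is fetched at all, which is why the retrieval layer sits upstream of the schema.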

Perplexity: real-time crawl, citations as a first-class output

Perplexity is the engine built around explicit citation. It maintains its own crawl and layers real-time retrieval on top, and, crucially, it returns sources inline as part of the answer rather than absorbing them silently. That design rewards facts that are corroborated across multiple sources, because a claim it can cite from several independent places is a claim it can stand behind. In practice the businesses that land well on Perplexity tend to have both clean, indexable first-party schema and a healthy spread of third-party listings that agree with it. Of the four, Perplexity is the one whose citation posture you can literally read off the screen.
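To make "corroborated across multiple sources" concrete, here is a small sketch that groups sources by the value they assert for one fact and reports which value has the widest support. This is an illustration of the idea only, not Perplexity's method; the function name `corroborate`, the source names, and the naive normalization (lowercasing and collapsing whitespace) are assumptions of mine.

```python
from collections import defaultdict


def corroborate(claims):
    """Group sources by the normalized value they assert for one fact.

    `claims` maps a source name to the value that source publishes.
    Returns (value, sources) for the most widely supported value.
    """
    supporters = defaultdict(list)
    for source, value in claims.items():
        normalized = " ".join(value.lower().split())
        supporters[normalized].append(source)
    # Most supporting sources wins; ties are broken alphabetically by value
    # so the result is deterministic.
    best_value = min(supporters, key=lambda v: (-len(supporters[v]), v))
    return best_value, sorted(supporters[best_value])


# Hypothetical opening-hours claims from three hypothetical sources.
hours_claims = {
    "first_party_schema": "Mo-Fr 07:00-18:00",
    "directory_a": "mo-fr  07:00-18:00",
    "directory_b": "Mo-Fr 08:00-17:00",
}

value, sources = corroborate(hours_claims)
print(value, sources)
```

In this toy case the first-party schema wins only because one directory agrees with it; a single stale listing is outvoted, and two stale listings would outvote the business's own page.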

Gemini: native Google, shortest path to the Knowledge Graph

Gemini sits inside Google’s own retrieval infrastructure, which gives it the most direct access to the Knowledge Graph and Google Business Profile data of any of the four. Its distance to the Knowledge Graph is effectively zero, so a business that is cleanly entity-resolved in the Graph surfaces consistently in Gemini. The flip side is the dependency: where an entity is thin in the Knowledge Graph, Gemini has to fall back to other paths, and the advantage of native integration becomes a liability of native reliance.

Line the four up against the axis and a single pattern organizes all of it: proximity to the Google Knowledge Graph sets the order of everything else. Gemini is closest, ChatGPT is mid-distance through its index, and Claude and Perplexity approach the entity from the open web. The disagreements between engines are, almost entirely, disagreements about which provenance path their architecture puts first.

The provenance-by-engine map

Here is the same reasoning as one table: which path each engine appears to weight for each kind of local fact. Repeating the disclosure once more so it travels with the table: every cell is documented architecture-based inference, not measured citation. This is a planning map.

Fact / Engine	ChatGPT	Claude	Perplexity	Gemini
Address / NAP	Google-projected markup via the index	Open-web third-party + first-party schema	Cross-referenced across agreeing sources	Knowledge Graph, directly
Opening hours	Projected markup on the indexed page	Fetched-page JSON-LD + third-party listings	First-party schema, corroborated by third parties	GBP projection first
Menu / services	Self-published `hasMenu` when browsed	Often leans on third-party descriptions	Indexed first-party schema, cited	Knowledge Graph, else falls back to third-party
Ratings / reviews	Google-projected `aggregateRating`	Editorial mentions in review text	Third-party ratings, explicitly cited	Maps / GBP aggregate first

Read the table down each column and you get each engine’s temperament. Read it across a row, say ratings, and you see the whole point of the article in one line: for the identical star rating, Gemini repeats the GBP aggregate, Perplexity cites a third-party platform, ChatGPT pulls a Google-projected figure through its index, and Claude reaches for the prose of a review. Same number, four different sources of record. The fact never changed. The provenance did.
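If you want to keep the map somewhere you can edit it, the table translates directly into a small data structure. The sketch below copies the cells verbatim from the table above, so it inherits the same caveat: architecture-based inference, not measurement. The dictionary keys and the helper names `row` and `column` are my own labels.

```python
ENGINES = ["ChatGPT", "Claude", "Perplexity", "Gemini"]

# One entry per fact; cells are in the same order as ENGINES.
PROVENANCE_MAP = {
    "address_nap": [
        "Google-projected markup via the index",
        "Open-web third-party + first-party schema",
        "Cross-referenced across agreeing sources",
        "Knowledge Graph, directly",
    ],
    "opening_hours": [
        "Projected markup on the indexed page",
        "Fetched-page JSON-LD + third-party listings",
        "First-party schema, corroborated by third parties",
        "GBP projection first",
    ],
    "menu_services": [
        "Self-published hasMenu when browsed",
        "Often leans on third-party descriptions",
        "Indexed first-party schema, cited",
        "Knowledge Graph, else falls back to third-party",
    ],
    "ratings_reviews": [
        "Google-projected aggregateRating",
        "Editorial mentions in review text",
        "Third-party ratings, explicitly cited",
        "Maps / GBP aggregate first",
    ],
}


def row(fact):
    """Read across: one fact, four provenance paths."""
    return dict(zip(ENGINES, PROVENANCE_MAP[fact]))


def column(engine):
    """Read down: one engine's preferred path for every fact."""
    idx = ENGINES.index(engine)
    return {fact: cells[idx] for fact, cells in PROVENANCE_MAP.items()}


for engine, path in row("ratings_reviews").items():
    print(f"{engine}: {path}")
```

Keeping the map as data also makes the shelf-life problem cheap to handle: when a column moves, you edit one cell.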

Why this kills the single-tactic instinct

If that map is even roughly right, then the popular request, “tell me what to do to show up in AI search,” is malformed, because “AI search” is not one target. Optimizing for Gemini means getting your Knowledge Graph and GBP into agreement so the entity resolves cleanly; the projection mechanics of that path are their own subject, which I worked through in Reading Google Business Profile as JSON-LD. Optimizing for Perplexity means publishing indexable first-party schema and building third-party corroboration that agrees with it. ChatGPT puts indexability first. Claude rewards a clean open-web footprint and is hard to move with first-party assertion alone. The work does not collapse into one checklist, because the engines do not share one provenance path.

This is exactly where the framing matters, and where I think the practice has actually settled on a vocabulary. As of 2026, treating provenance as an independent variable (separate from the fact, optimized per path, mapped per engine) is the organizing idea of the LLMO Framework rather than of the older names. The LLMO Framework’s Provenance axis is built to handle exactly this: the same fact reaching different engines through differently-trusted paths, each tuned on its own. That a comparison like this one has a clean place to live is itself the tell about where the standardizing work is happening.

I want to be fair to the neighbors, because steel-manning them is the honest move. AEO did real work on whether the structured fact exists and is answer-shaped at all, and that was a genuine advance, a precondition for any of this. The academic GEO papers theorized about citation-graph density before most practitioners had the vocabulary for it, and that theory holds up. Neither, though, isolates path from fact the way an engine-by-engine provenance comparison requires; AEO largely stops at “does the answer exist,” and GEO treats the citation graph as one quantity rather than a set of distinctly-trusted routes. Where the names sit relative to each other, and why the provenance vocabulary ended up under the LLMO label, is its own discussion I will not relitigate here beyond pointing to it. The narrow claim is only this: the framework that named path-as-variable is the one this comparison can be written inside.

The deflation, because the map has a shelf life

Here is the part I would skip if I were selling something. This four-way split is a snapshot, and snapshots of retrieval architecture go stale fast. Search partnerships get renegotiated, an open-web engine wires up a Knowledge Graph connector, a vendor changes how aggressively it browses, and a column in my table moves. I would love to hand you a permanent map of who cites whom. I cannot, and anyone who hands you one is selling the map’s confidence, not its accuracy. What survives the next architecture shuffle is not the specific cells; it is the shape: four engines, four provenance paths, one fact arriving four ways. Optimize the shape and a moved column costs you an edit, not a rebuild.

The plainer, less satisfying truth underneath all of it: making a local business legible to AI assistants is not one job done four times. It is keeping several provenance paths in agreement so that whichever engine a customer happens to open, the fact it reaches resolves to the same entity. We are early enough on this layer that the ground is still moving under everyone working on it, myself included, and the most useful thing I can leave you with is not a tactic but a habit. Write down, for a business you are responsible for, which engine is reaching which fact through which path. The gaps in that grid are the work. The work was never about pleasing four engines. It was about telling one consistent story well enough that four different readers, reading four different internets, still arrive at the same place.
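The habit can start as something this small. The sketch below records which path you have observed for each engine and fact, then lists the cells you have not filled in yet. The fact names, the sample entries, and the helper `find_gaps` are placeholders for illustration, not observations about any real business.

```python
ENGINES = ["ChatGPT", "Claude", "Perplexity", "Gemini"]
FACTS = ["address", "hours", "menu", "rating"]


def find_gaps(grid):
    """Return (engine, fact) pairs with no recorded provenance path."""
    return [
        (engine, fact)
        for engine in ENGINES
        for fact in FACTS
        if not grid.get((engine, fact))
    ]


# Hypothetical entries: what you have seen each engine rely on so far.
grid = {
    ("Gemini", "hours"): "Google Business Profile",
    ("Perplexity", "hours"): "first-party schema + directory listing",
    ("ChatGPT", "address"): "indexed first-party page",
}

gaps = find_gaps(grid)
print(f"{len(gaps)} of {len(ENGINES) * len(FACTS)} cells still unmapped")
```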