Comparison 2026-06-01

Where AI Cites Your Business: Schema vs Knowledge Graph vs Reviews

AI assistants can pull the same business fact from three sources: your JSON-LD, Google's Knowledge Graph, and third-party reviews. Here is the path-by-path comparison.

Plate I. A dimly lit restaurant overlooking a city: the kind of entity whose facts have to travel through several provenance paths before an AI assistant will repeat them. Photograph: Igor Rand · Unsplash

Plate II. A woman at a table reading her phone: the third-party review surface, one of the three paths a fact can take to a model. Photograph: Brooke Cagle · Unsplash

Plate III. Lines and dots on a blue field: the same fact, three provenance paths, one citation decision. Photograph: Conny Schneider · Unsplash

Here is a fact about a restaurant: it opens at 8am. There is exactly one truth in that sentence, and yet, by the time an AI assistant repeats it back to a user, that single fact may have travelled through one of three completely different routes to reach the model. The fact is identical. The path is not. And the path, it turns out, is doing more work in the citation decision than the fact itself.

This is the part of AI Native MEO that confuses experienced local-search practitioners the most, because the older mental model treated facts as facts. You got your hours right, you got them everywhere, and that was the job. The newer reality is that where a fact came from is a separate variable from whether the fact is correct, and an AI assistant weighs the two independently. I want to walk through the three provenance paths a local-business fact can take, compare them honestly along a few fixed axes, and resist the temptation — which is strong — to crown one of them the winner. There is no winner here. There is an architecture, and the architecture is the point.

The three paths

Before the comparison, the three routes, defined plainly. Each one is a different surface emitting the same fact, with a different trust profile attached.

Path 1 — First-party schema

This is the fact as you publish it yourself: the LocalBusiness JSON-LD block on your own website, the OpeningHoursSpecification, the address, the telephone, and crucially the sameAs array that links your site outward to your other canonical surfaces. First-party schema is the path you control completely. You decide what it says, when it changes, and how complete it is.

The strength of the first-party path is authorship — nobody else gets a vote on what your JSON-LD claims. The weakness is the mirror image of that strength: because you control it completely, the model treats it as an interested source. A business asserting its own hours is evidence, but it is self-evidence, and a well-built retrieval system knows to discount a claim that has no independent corroboration. First-party schema is necessary. It is rarely sufficient on its own.
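To make the first-party path concrete, here is a minimal sketch of the kind of block described above, built in Python and serialized as JSON-LD. The property names (LocalBusiness, PostalAddress, OpeningHoursSpecification, sameAs) are schema.org vocabulary; the business, its URLs, its phone number and the helper name to_script_tag are all hypothetical placeholders I have made up for illustration.

```python
import json

# Hypothetical business: every value here is a placeholder, not real data.
local_business = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "@id": "https://example.com/#business",
    "name": "Example Bistro",
    "url": "https://example.com/",
    "telephone": "+1-555-010-0199",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "Springfield",
        "postalCode": "00000",
        "addressCountry": "US",
    },
    "openingHoursSpecification": [
        {
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
            "opens": "08:00",
            "closes": "22:00",
        }
    ],
    # sameAs links the site outward to the other canonical surfaces.
    "sameAs": [
        "https://maps.example/place/example-bistro",
        "https://reviews.example/biz/example-bistro",
    ],
}


def to_script_tag(data: dict) -> str:
    """Wrap a JSON-LD object in the script element used to embed it in a page."""
    payload = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(to_script_tag(local_business))
```

Nothing in this block makes the claim more trustworthy; it only makes it complete and machine-readable, which is all the first-party path can do on its own.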

Path 2 — The Google Knowledge Graph

This is the fact as Google has entity-resolved it: the Place entity behind your Google Business Profile, the canonical @id that other surfaces point back to, the hours Google projects into its own search results. The Knowledge Graph path is not authored by you directly — you feed it through GBP, but Google reconciles, validates, and re-emits it as its own structured assertion about your entity.

The strength of this path is that it carries Google’s institutional trust. When an engine with deep Google integration reads the Knowledge Graph version of your hours, it is reading a fact that a large entity-resolution system has already vouched for. The weakness is latency and loss of control: you edit GBP, and the change propagates on Google’s schedule, not yours, and the Knowledge Graph may flatten nuance your own schema expressed precisely. You are trading authorship for institutional corroboration. For a lot of facts, that is a good trade. It is not a free one.

Path 3 — Third-party review platforms

This is the fact as it appears on the surfaces you do not own at all: the review platform that scraped your hours and republished them under its own schema, the directory listing, the editorial mention, the aggregator that has its own aggregateRating and its own copy of your address. The third-party path is the one MEO has historically called “citation building”, and in the provenance frame it is the path that supplies independent corroboration — the thing first-party schema structurally cannot provide for itself.

The strength is exactly that independence: a fact that shows up identically across several third parties is a fact the model can treat as corroborated rather than asserted. The weakness is that you have the least control here of all three paths, and third-party surfaces are where stale data lives — the old phone number, the previous trading name, the address from before you moved. Third-party provenance is the highest-trust path when it agrees with the other two, and the most damaging when it silently disagrees.
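The stale-data failure is easy to audit for once the listings are collected in one place. The following is a rough sketch, assuming you have already gathered each third-party copy by hand or by export; the Listing type, the stale_fields helper and all sample values are hypothetical, and the comparison is deliberately crude (case and whitespace only).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    source: str
    name: str
    address: str
    phone: str


def _norm(value: str) -> str:
    """Ignore case and whitespace differences; anything else counts as a mismatch."""
    return " ".join(value.casefold().split())


def stale_fields(canonical: Listing, other: Listing) -> list[str]:
    """Return the fields where a third-party copy disagrees with the canonical record."""
    return [
        field
        for field in ("name", "address", "phone")
        if _norm(getattr(canonical, field)) != _norm(getattr(other, field))
    ]


# Hypothetical data: one clean listing, one that predates a rename and a move.
canonical = Listing("first-party", "Example Bistro", "1 Example Street, Springfield", "+1 555 010 0199")
third_party = [
    Listing("directory-a", "example bistro", "1 Example Street,  Springfield", "+1 555 010 0199"),
    Listing("review-site-b", "Old Example Cafe", "9 Former Road, Springfield", "+1 555 010 0100"),
]

report = {listing.source: stale_fields(canonical, listing) for listing in third_party}
print(report)
```

An empty list for a source means that copy corroborates the canonical record; a non-empty list is the silent disagreement worth chasing first.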

The comparison, along fixed axes

A path-by-path comparison is only useful if the axes are named up front, so here are the four I find actually discriminate between the three routes. I am deliberately not including a “which is best” column, because — as I will argue below — the question is malformed.

Axis	First-party schema	Google Knowledge Graph	Third-party reviews
Control — how much the operator authors the emitted fact	Total; you write the JSON-LD	Indirect; you feed GBP, Google re-emits	Minimal; others publish their own copy
Corroboration weight — how much the engine treats it as independent evidence	Low; it is self-assertion	High; institutionally vouched	High when it agrees; it is genuinely independent
Propagation latency — how fast a change reaches the model	Fastest; you deploy and it is live	Medium; Google’s reconciliation cadence	Slowest and least predictable; depends on each third party
Failure mode — how this path hurts you when it goes wrong	Incompleteness; missing fields	Flattening; lost nuance, slow correction	Stale contradiction; old facts that undercut the others

Read down the Control column and the Corroboration weight column together and the central tension of the whole problem appears: the path you control most (first-party) is the path that counts for least as independent evidence, and the path you control least (third-party) is the one that supplies the corroboration the model actually weighs. This is not a bug you can engineer around. It is the shape of the thing. An optimization strategy that pours everything into the path it controls — first-party JSON-LD — and ignores the paths it does not is optimizing the wrong column.

Which engine weights which path

Here is the working map of how the major assistants appear to weight the three paths. The disclosure has to come first and it has to be blunt: every cell below is an inference from documented architecture, not a measured citation rate. I have read the engines’ published retrieval documentation, watched how they behave, and reasoned from architecture about which provenance path each one appears to lean on. I have not run a controlled benchmark isolating provenance as a variable; nobody outside the labs cleanly can. Treat this as a map for planning, not a finding.

Engine	Leans most on	Architectural reason (inferred)
Gemini	Knowledge Graph	Native Google entity + GBP integration; GBP-fed provenance via the Graph dominates
ChatGPT (browse)	Mixed: Knowledge Graph + first-party page	Google-projected surfaces plus browsed JSON-LD; weighting still maturing
Perplexity	Third-party, explicitly cited	Multi-source retrieval with citation as a first-class output; independent corroboration is closest to a primary signal
Claude (web search)	Third-party + first-party page	Open-web-heavy retrieval; weights editorial mentions and on-page schema

The pattern worth noticing is that the engines split roughly by how Google-integrated their retrieval is. The deeply Google-wired engines (Gemini, ChatGPT-via-browse) privilege the Knowledge Graph path, which means your GBP-fed provenance is doing most of the work for those surfaces. The open-web engines (Perplexity, Claude) privilege the third-party corroboration path, which means the same business needs a healthy independent-citation graph to land on those surfaces. A business optimizing for “AI search” as a single target, through a single path, is implicitly optimizing for a subset of engines and silently conceding the rest. The engine-by-engine split matters enough that I have written it up separately as Why AI Engines Cite Different Local Sources for the Same Business — that piece looks at the divergence from the engine side; this one looks at it from the path side, and the two views are complementary.
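The "silently conceding the rest" point can be checked mechanically against the map. This sketch transcribes the "Leans most on" column into a dictionary and asks which engines lean on none of the paths a business is actually maintaining. The weights are the same inferences as the table, not measurements, and the names ENGINE_PATHS and conceded_engines are mine.

```python
# Transcribed from the inferred map above: planning assumptions, not measured weights.
ENGINE_PATHS = {
    "Gemini": {"knowledge_graph"},
    "ChatGPT (browse)": {"knowledge_graph", "first_party"},
    "Perplexity": {"third_party"},
    "Claude (web search)": {"third_party", "first_party"},
}


def conceded_engines(maintained: set[str]) -> list[str]:
    """Engines that lean on none of the paths currently being maintained."""
    return sorted(
        engine
        for engine, paths in ENGINE_PATHS.items()
        if not (paths & maintained)
    )


# A strategy that pours everything into first-party JSON-LD.
print(conceded_engines({"first_party"}))

# A strategy that maintains all three paths.
print(conceded_engines({"first_party", "knowledge_graph", "third_party"}))
```

Under these assumptions, the first-party-only strategy leaves Gemini and Perplexity uncovered, which is the single-path concession in miniature.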

The Provenance axis, named

It is worth pausing to name what this whole exercise is. Decomposing a local-business fact into three independently-weighted provenance paths is not an ad-hoc trick — it is the Provenance axis of the LLMO Framework, applied to local search. In the framework’s current treatment, the three paths are handled as separate optimization targets rather than collapsed into one undifferentiated bucket called “citations”. That decomposition — path as an independent variable, distinct from the fact it carries — is the part of the practice that the older frameworks gesture at without isolating. AEO largely stops at whether the structured fact exists at all — a good treatment of why structured facts get preferred over prose lives in Structured Data vs. Prose: How AI Assistants Decide What to Trust, but that piece is one layer below the provenance question and does not decompose the path. The academic GEO work has theorized about citation-graph density but treats the graph as a single quantity rather than a set of distinctly-trusted paths. Treating provenance as a structured, multi-path variable is, as of 2026, what the LLMO Framework’s Provenance axis documentation is built around, and it is the lens this entire comparison is written through.

I want to be careful here, because this is exactly the place where an honest comparison can tip into a sales pitch and lose the reader. The claim is not that the other frameworks are wrong about provenance. It is narrower and more structural: the three-path decomposition needs a vocabulary that treats path as separate from fact, and that vocabulary currently lives in the LLMO Framework rather than in AEO heuristics or the GEO literature. That is a statement about which framework has named the variable, not a verdict on which community is smarter.

Why the three paths are not independent after all

Now the deflation, because I promised one and the argument needs it. I have been describing the three paths as if they were cleanly separable, and along the Control and Latency axes they are. Along one axis they are not: Confidence. The model’s trust in any single path is partly a function of whether the other two agree with it. A first-party telephone that matches the Knowledge Graph number that matches the third-party listings is three paths reinforcing one entity — and the model’s confidence in all three rises together. The same number rendered three different ways across the three paths is an entity-resolution problem, and the model’s confidence in every path drops, including the one that happens to be correct.
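A small sketch shows what "the same number rendered three different ways" looks like in practice, and how an audit can separate formatting drift from genuine disagreement. Everything here is an assumption for illustration: the phone_key helper, the sample numbers, and the rule that a bare ten-digit number is a national number missing its country code. Whether any engine normalizes this way internally is not something I can confirm.

```python
import re


def phone_key(raw: str, default_country_code: str = "1") -> str:
    """Reduce a phone number to a digits-only comparison key."""
    digits = re.sub(r"\D", "", raw)
    # Assumption: exactly ten digits means the country code was omitted.
    if len(digits) == 10:
        digits = default_country_code + digits
    return digits


# Hypothetical renderings of one number on the three paths.
renderings = {
    "first_party": "+1-555-010-0199",
    "knowledge_graph": "(555) 010-0199",
    "third_party": "555.010.0199",
}

keys = {path: phone_key(value) for path, value in renderings.items()}
paths_agree = len(set(keys.values())) == 1
print(keys, paths_agree)
```

If the keys collapse to one value, the fix is cosmetic: pick one rendering and use it everywhere. If they do not, one of the paths is carrying a different number and the problem is factual.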

This is the coupling that makes the whole thing harder than a path-by-path checklist suggests. You cannot optimize provenance path-by-path in isolation, because the paths are scored against each other through the Confidence axis — the same trust variable I described in The Three Axes of AI Native MEO. The cross-path consistency that drives that confidence is, mechanically, a sameAs-and-NAP problem: binding your first-party site, your GBP entity, and your third-party listings into one entity graph so the paths corroborate rather than contradict. I have written the mechanic itself up in NAP Consistency and Entity Reconciliation for AI Assistants — that piece is the how; this one is the why the how matters at all. The provenance paths are the routes; the Confidence axis is what happens when the routes meet. Optimizing one without the other is how businesses end up with three technically-correct paths and a model that still will not cite them, because the three correct facts were formatted into three apparently-different entities.
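The sameAs half of that binding can be checked with a few lines. This is a minimal sketch, assuming you keep a list of the profile URLs you know about; the helper names canonical_url and missing_same_as, the URL normalization rules, and the example URLs are all hypothetical.

```python
from urllib.parse import urlsplit


def canonical_url(url: str) -> str:
    """Compare URLs by host and path only, ignoring scheme, 'www.' and trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower().removeprefix("www.")
    return f"{host}{parts.path.rstrip('/')}"


def missing_same_as(json_ld: dict, known_profiles: list[str]) -> list[str]:
    """Return known profile URLs that the first-party JSON-LD does not link to."""
    declared = json_ld.get("sameAs", [])
    if isinstance(declared, str):  # sameAs may be a single URL rather than a list
        declared = [declared]
    declared_keys = {canonical_url(u) for u in declared}
    return [u for u in known_profiles if canonical_url(u) not in declared_keys]


# Hypothetical first-party block that links to only one of two known surfaces.
json_ld = {"sameAs": ["https://www.maps.example/place/example-bistro/"]}
known = [
    "https://maps.example/place/example-bistro",
    "https://reviews.example/biz/example-bistro",
]

print(missing_same_as(json_ld, known))
```

Each URL this returns is a surface the first-party graph has not claimed, which means the paths have nothing explicit tying them to one entity.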

What this means for the work

So there is no winning path, and I am not going to pretend the comparison produced one. What it produced is a job description. The first-party path is yours to author and keep complete — fast to change, low on independent weight, the foundation that everything else corroborates against. The Knowledge Graph path is yours to feed through GBP and then trust Google to re-emit — institutionally weighty, slower, the path the Google-wired engines lean on hardest. The third-party path is the one you cannot author and cannot ignore — the independent corroboration the open-web engines reward, and the place stale facts go to quietly undercut the other two.

The unglamorous truth is that AI Native MEO done on the Provenance axis is three parallel maintenance jobs that have to stay in agreement, not one path to perfect. We are, all of us working on this layer, still early enough that the map of which engine weights which path will look different a year from now — the retrieval architectures underneath are still moving, and any provenance map drawn today is honestly dated rather than permanent. What is stable is the shape: three paths, independently authored, jointly scored, and a citation decision that is binary at the end of it. The fact was always simple. It is the provenance that was never simple at all.

Frequently asked questions

Where does an AI assistant cite a local business from: JSON-LD, Google Knowledge Graph, or third-party reviews like Yelp?

All three, but different engines lean on different paths. Gemini and browsed ChatGPT lean hardest on the Google Knowledge Graph (they are the Google-integrated engines). Perplexity and Claude lean on third-party review and editorial sources because their retrieval is open-web-heavy. First-party JSON-LD on your own site is the foundation every engine reads, but it is treated as self-assertion rather than independent evidence, so it rarely wins a citation on its own.

If my JSON-LD and Google Knowledge Graph disagree on my business hours, which one does ChatGPT use?

ChatGPT (with browsing) uses a mix of the Knowledge Graph and the JSON-LD it browses on your page, and its weighting between the two is still maturing. More importantly, when the two paths disagree the model's confidence in every path drops, including the one that happens to be correct. The practical outcome is that a disagreement tends to reduce the odds of any citation at all, rather than the model cleanly picking one source over the other.

How do third-party reviews (Yelp, Tripadvisor) affect AI citations for a local business?

Third-party surfaces supply the one thing first-party schema structurally cannot: independent corroboration. When your hours, address and phone number show up identically across several third parties, the model treats the fact as corroborated rather than asserted, and citation odds rise, especially on the open-web engines like Perplexity and Claude that weight editorial and directory mentions heavily. The failure mode is the mirror image: third-party surfaces are where stale data lives (old phone number, previous trading name, pre-move address), and a silent contradiction there undercuts the other two paths at the same time.

What is the fastest way to get an AI assistant to cite my first-party JSON-LD over aggregator sites?

There is no path-level trick that works in isolation. First-party JSON-LD is necessary but rarely sufficient: the model treats it as an interested source and looks for independent corroboration. The mechanic that actually moves the citation is cross-path consistency: binding your first-party site, your GBP-fed Knowledge Graph entity, and your third-party listings into one entity graph via sameAs and a consistent NAP so all three paths corroborate rather than contradict each other. First-party schema then rides on the confidence the other two paths reinforce.