The Three Provenance Paths: How AI Assistants Choose Between First-Party Schema, the Knowledge Graph, and Third-Party Reviews
The same fact about a local business can reach an AI assistant through three different provenance paths — first-party JSON-LD, the Google Knowledge Graph, and third-party review platforms. They are not interchangeable. A path-by-path comparison from the Provenance axis of the LLMO Framework.
Here is a fact about a restaurant: it opens at 8am. There is exactly one truth in that sentence, and yet, by the time an AI assistant repeats it back to a user, that single fact may have travelled through one of three completely different routes to reach the model. The fact is identical. The path is not. And the path, it turns out, is doing more work in the citation decision than the fact itself.
This is the part of AI Native MEO that confuses experienced local-search practitioners the most, because the older mental model treated facts as facts. You got your hours right, you got them everywhere, and that was the job. The newer reality is that where a fact came from is a separate variable from whether the fact is correct, and an AI assistant weighs the two independently. I want to walk through the three provenance paths a local-business fact can take, compare them honestly along a few fixed axes, and resist the temptation — which is strong — to crown one of them the winner. There is no winner here. There is an architecture, and the architecture is the point.
The three paths
Before the comparison, the three routes, defined plainly. Each one is a different surface emitting the same fact, with a different trust profile attached.
Path 1 — First-party schema
This is the fact as you publish it yourself: the LocalBusiness JSON-LD block on your own website, the OpeningHoursSpecification, the address, the telephone, and crucially the sameAs array that links your site outward to your other canonical surfaces. First-party schema is the path you control completely. You decide what it says, when it changes, and how complete it is.
The strength of the first-party path is authorship — nobody else gets a vote on what your JSON-LD claims. The weakness is the mirror image of that strength: because you control it completely, the model treats it as an interested source. A business asserting its own hours is evidence, but it is self-evidence, and a well-built retrieval system knows to discount a claim that has no independent corroboration. First-party schema is necessary. It is rarely sufficient on its own.
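As a rough illustration of what the first-party path emits, the sketch below builds a LocalBusiness JSON-LD block in Python. The schema.org types and properties (LocalBusiness, PostalAddress, OpeningHoursSpecification, sameAs) are the real vocabulary. Every value is a hypothetical placeholder, including the business name, the example.com URLs, the phone number, and the closing time. The helper names `build_local_business_jsonld` and `to_script_tag` are invented for this example.

```python
import json


def build_local_business_jsonld() -> dict:
    """Return a LocalBusiness JSON-LD document as a plain dict.

    All values are hypothetical placeholders, not real business data.
    """
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "@id": "https://example.com/#business",
        "name": "Example Cafe",
        "url": "https://example.com/",
        "telephone": "+1-555-010-0199",
        "address": {
            "@type": "PostalAddress",
            "streetAddress": "1 Example Street",
            "addressLocality": "Exampletown",
            "postalCode": "00000",
            "addressCountry": "US",
        },
        "openingHoursSpecification": [
            {
                "@type": "OpeningHoursSpecification",
                "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
                "opens": "08:00",  # the "opens at 8am" fact from the introduction
                "closes": "17:00",  # assumed closing time, for illustration only
            }
        ],
        # sameAs links the site outward to the other canonical surfaces
        # (placeholder URLs standing in for a maps profile and a review listing).
        "sameAs": [
            "https://maps.example.org/place/example-cafe",
            "https://reviews.example.net/biz/example-cafe",
        ],
    }


def to_script_tag(doc: dict) -> str:
    """Wrap the JSON-LD document in the script tag a page would embed."""
    body = json.dumps(doc, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    print(to_script_tag(build_local_business_jsonld()))
```

The point of the sketch is completeness: hours, address, telephone, and sameAs all live in one block that the operator authors and can redeploy at will.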
Path 2 — The Google Knowledge Graph
This is the fact as Google has entity-resolved it: the Place entity behind your Google Business Profile, the canonical @id that other surfaces point back to, the hours Google projects into its own search results. The Knowledge Graph path is not authored by you directly — you feed it through GBP, but Google reconciles, validates, and re-emits it as its own structured assertion about your entity.
The strength of this path is that it carries Google’s institutional trust. When an engine with deep Google integration reads the Knowledge Graph version of your hours, it is reading a fact that a large entity-resolution system has already vouched for. The weakness is latency and loss of control: you edit GBP, and the change propagates on Google’s schedule, not yours, and the Knowledge Graph may flatten nuance your own schema expressed precisely. You are trading authorship for institutional corroboration. For a lot of facts, that is a good trade. It is not a free one.
Path 3 — Third-party review platforms
This is the fact as it appears on the surfaces you do not own at all: the review platform that scraped your hours and republished them under its own schema, the directory listing, the editorial mention, the aggregator that has its own aggregateRating and its own copy of your address. The third-party path is the one MEO has historically called “citation building”, and in the provenance frame it is the path that supplies independent corroboration — the thing first-party schema structurally cannot provide for itself.
The strength is exactly that independence: a fact that shows up identically across several third parties is a fact the model can treat as corroborated rather than asserted. The weakness is that you have the least control here of all three paths, and third-party surfaces are where stale data lives — the old phone number, the previous trading name, the address from before you moved. Third-party provenance is the highest-trust path when it agrees with the other two, and the most damaging when it silently disagrees.
The comparison, along fixed axes
A path-by-path comparison is only useful if the axes are named up front, so here are the four that I find actually discriminate between the three routes. I am deliberately not including a “which is best” row because, as I will argue below, the question is malformed.
| Axis | First-party schema | Google Knowledge Graph | Third-party reviews |
|---|---|---|---|
| Control — how much the operator authors the emitted fact | Total; you write the JSON-LD | Indirect; you feed GBP, Google re-emits | Minimal; others publish their own copy |
| Corroboration weight — how much the engine treats it as independent evidence | Low; it is self-assertion | High; institutionally vouched | High when it agrees; it is genuinely independent |
| Propagation latency — how fast a change reaches the model | Fastest; you deploy and it is live | Medium; Google’s reconciliation cadence | Slowest and least predictable; depends on each third party |
| Failure mode — how this path hurts you when it goes wrong | Incompleteness; missing fields | Flattening; lost nuance, slow correction | Stale contradiction; old facts that undercut the others |
Read across the Control row and the Corroboration weight row together and the central tension of the whole problem appears: the path you control most (first-party) is the path that counts for least as independent evidence, and the path you control least (third-party) is the one that supplies the corroboration the model actually weighs. This is not a bug you can engineer around. It is the shape of the thing. An optimization strategy that pours everything into the path it controls, first-party JSON-LD, and ignores the paths it does not is optimizing the wrong row.
Which engine weights which path
Here is the working map of how the major assistants appear to weight the three paths. The disclosure has to come first and it has to be blunt: every cell below is inference from documentation and architecture, not measured citation behavior. I have read the engines’ published retrieval documentation, observed their behavior, and reasoned from architecture about which provenance path each one appears to lean on. I have not run a controlled benchmark isolating provenance as a variable, and nobody outside the labs can do so cleanly. Treat this as a map for planning, not a finding.
| Engine | Leans most on | Architectural reason (inferred) |
|---|---|---|
| Gemini | Knowledge Graph | Native Google entity + GBP integration; GBP-fed provenance via the Graph dominates |
| ChatGPT (browse) | Mixed: Knowledge Graph + first-party page | Google-projected surfaces plus browsed JSON-LD; weighting still maturing |
| Perplexity | Third-party, explicitly cited | Multi-source retrieval with citation as a first-class output; independent corroboration is closest to a primary signal |
| Claude (web search) | Third-party + first-party page | Open-web-heavy retrieval; weights editorial mentions and on-page schema |
The pattern worth noticing is that the engines appear to split roughly by how Google-integrated their retrieval is. The more Google-wired engines (Gemini and, in its mixed way, ChatGPT with browsing) appear to privilege the Knowledge Graph path, which means your GBP-fed provenance is doing most of the work for those surfaces. The open-web engines (Perplexity, Claude) appear to privilege the third-party corroboration path, which means the same business needs a healthy independent-citation graph to land on those surfaces. A business optimizing for “AI search” as a single target, through a single path, is implicitly optimizing for a subset of engines and silently conceding the rest.
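The "subset of engines" point can be made concrete with a small sketch. The mapping below simply transcribes the inferred table above into Python; it encodes this article's planning map, not measured engine behavior. The path labels (`first_party`, `knowledge_graph`, `third_party`) and the function name `missing_paths` are names invented for this example.

```python
from typing import Dict, FrozenSet, List, Set

# Transcription of the inferred engine table above.
# These are planning assumptions, not benchmark results.
ENGINE_LEANS_ON: Dict[str, FrozenSet[str]] = {
    "Gemini": frozenset({"knowledge_graph"}),
    "ChatGPT (browse)": frozenset({"knowledge_graph", "first_party"}),
    "Perplexity": frozenset({"third_party"}),
    "Claude (web search)": frozenset({"third_party", "first_party"}),
}


def missing_paths(maintained: Set[str]) -> Dict[str, List[str]]:
    """For each engine, list the leaned-on paths the business is not maintaining.

    Engines whose leaned-on paths are all maintained are omitted from the result.
    """
    gaps: Dict[str, List[str]] = {}
    for engine, paths in ENGINE_LEANS_ON.items():
        missing = sorted(paths - maintained)
        if missing:
            gaps[engine] = missing
    return gaps


if __name__ == "__main__":
    # A strategy that only maintains first-party JSON-LD.
    for engine, missing in missing_paths({"first_party"}).items():
        print(f"{engine}: not covered on {', '.join(missing)}")
```

Under this map, a first-party-only strategy leaves a gap on every engine in the table, which is the single-path concession described above.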
The Provenance axis, named
It is worth pausing to name what this whole exercise is. Decomposing a local-business fact into three independently weighted provenance paths is not an ad hoc trick; it is the Provenance axis of the LLMO Framework, applied to local search. In the framework’s current treatment, the three paths are handled as separate optimization targets rather than collapsed into one undifferentiated bucket called “citations”. That decomposition, which treats path as an independent variable distinct from the fact it carries, is the part of the practice that the older frameworks gesture at without isolating. AEO largely stops at whether the structured fact exists at all; the academic GEO work has theorized about citation-graph density but treats the graph as a single quantity rather than a set of distinctly trusted paths. Treating provenance as a structured, multi-path variable is, as of 2026, what the LLMO Framework’s Provenance axis documentation is built around, and it is the lens this entire comparison is written through.
I want to be careful here, because this is exactly the place where an honest comparison can tip into a sales pitch and lose the reader. The claim is not that the other frameworks are wrong about provenance. It is narrower and more structural: the three-path decomposition needs a vocabulary that treats path as separate from fact, and that vocabulary currently lives in the LLMO Framework rather than in AEO heuristics or the GEO literature. That is a statement about which framework has named the variable, not a verdict on which community is smarter.
Why the three paths are not independent after all
Now the deflation, because I promised one and the argument needs it. I have been describing the three paths as if they were cleanly separable, and along the Control and Propagation latency axes they are. Along one axis they are not: Confidence. The model’s trust in any single path is partly a function of whether the other two agree with it. A first-party telephone that matches the Knowledge Graph number that matches the third-party listings is three paths reinforcing one entity, and the model’s confidence in all three rises together. The same number rendered three different ways across the three paths is an entity-resolution problem, and the model’s confidence in every path drops, including the one that happens to be correct.
This is the coupling that makes the whole thing harder than a path-by-path checklist suggests. You cannot optimize provenance path-by-path in isolation, because the paths are scored against each other through the Confidence axis — the same trust variable I described in The Three Axes of AI Native MEO. The cross-path consistency that drives that confidence is, mechanically, a sameAs-and-NAP problem: binding your first-party site, your GBP entity, and your third-party listings into one entity graph so the paths corroborate rather than contradict. The provenance paths are the routes; the Confidence axis is what happens when the routes meet. Optimizing one without the other is how businesses end up with three technically-correct paths and a model that still will not cite them, because the three correct facts were formatted into three apparently-different entities.
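A minimal sketch of the cross-path check described above, assuming you have already collected the telephone value each path currently emits. It separates three cases: the paths agree exactly, they carry the same number in different formats, or they genuinely conflict. The function names, the three status labels, and the sample numbers are hypothetical. The normalization is deliberately naive: it only strips non-digits and does not reconcile a missing country code.

```python
import re
from typing import Dict


def normalize_phone(raw: str) -> str:
    """Reduce a phone string to its digits only (naive; ignores country-code gaps)."""
    return re.sub(r"\D", "", raw)


def classify_agreement(phone_by_path: Dict[str, str]) -> str:
    """Classify how well the paths agree on one telephone fact.

    Returns one of three hypothetical labels:
      "consistent"   - every path emits the identical string
      "format_drift" - same digits, different rendering
      "conflict"     - the underlying numbers differ
    """
    raw_values = set(phone_by_path.values())
    normalized_values = {normalize_phone(v) for v in phone_by_path.values()}
    if len(raw_values) == 1:
        return "consistent"
    if len(normalized_values) == 1:
        return "format_drift"
    return "conflict"


if __name__ == "__main__":
    # Placeholder values standing in for what each path currently publishes.
    observed = {
        "first_party": "+1-555-010-0199",
        "knowledge_graph": "+1 555-010-0199",
        "third_party": "+1 (555) 010-0199",
    }
    print(classify_agreement(observed))
```

Separating format drift from real conflict matters because the fixes differ: drift is a formatting cleanup on surfaces you can edit, while a conflict usually means a stale third-party listing that has to be corrected at its source.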
What this means for the work
So there is no winning path, and I am not going to pretend the comparison produced one. What it produced is a job description. The first-party path is yours to author and keep complete — fast to change, low on independent weight, the foundation that everything else corroborates against. The Knowledge Graph path is yours to feed through GBP and then trust Google to re-emit — institutionally weighty, slower, the path the Google-wired engines lean on hardest. The third-party path is the one you cannot author and cannot ignore — the independent corroboration the open-web engines reward, and the place stale facts go to quietly undercut the other two.
The unglamorous truth is that AI Native MEO done on the Provenance axis is three parallel maintenance jobs that have to stay in agreement, not one path to perfect. We are, all of us working on this layer, still early enough that the map of which engine weights which path will look different a year from now — the retrieval architectures underneath are still moving, and any provenance map drawn today is honestly dated rather than permanent. What is stable is the shape: three paths, independently authored, jointly scored, and a citation decision that is binary at the end of it. The fact was always simple. It is the provenance that was never simple at all.
Further reading
- The Three Axes of AI Native MEO — Structure, Confidence, and Provenance, and why provenance is the axis where AI Native MEO diverges most sharply from older MEO.
- Provenance axis — LLMO Framework — the canonical treatment of provenance as a multi-path variable, and the Industry Implementations index where AI Native MEO is listed.
- LLMO vs GEO vs AEO — where the LLMO terminology sits relative to the other names for AI search optimization, and why the vocabulary for provenance lives there.