Engineering 2026-06-10

NAP Consistency as Entity Reconciliation: How AI Engines Merge a Business Across Sources

"Make your NAP consistent" is advice about a symptom. The real mechanism is entity reconciliation: the probabilistic string-matching layer where an AI engine decides that your Google Business Profile, your site, and a dozen directories all denote one place. Here is what that layer actually does, in code.

Plate I. A storefront and its parking lot from above: one real place an engine has to reassemble from scattered records. Photograph: Gene Gallin · Unsplash

Plate II. A person on the phone with a coffee: the query moment, where reconciliation happens before any answer is spoken. Photograph: Vitaly Gariev · Unsplash

Plate III. Separate water droplets on one wooden surface: distinct records that may or may not denote the same entity. Photograph: Steve A Johnson · Unsplash

Read ten local-SEO articles and you will meet the same sentence in all ten: keep your NAP consistent. NAP is Name, Address, Phone (the three fields that identify a business), and the promise is that if you spell them the same way everywhere on the web, you rank better. The advice is correct. It is also a description of a symptom dressed up as a mechanism, and the gap between the two is where the interesting engineering lives.

Here is the part the advice leaves out. When someone asks an assistant “where’s a good pharmacy near me,” the engine does not look up your business and check whether your phone number matches across sources. It does something stranger and more fragile first: it takes a scattered pile of records (your Google Business Profile, your own site, an Apple Maps listing, a Yelp page, three directory entries you forgot you submitted) and it has to decide which of them are you before it can decide whether to mention you at all. That decision is called entity reconciliation, and NAP consistency is simply what you feed it: the feature vector the matcher scores.

This piece reframes NAP consistency as a reconciliation problem rather than a citation-matching one. Drop the abstraction one level and the question changes from “are my listings tidy” to “will the engine resolve my records to exactly one entity, or to one-and-a-half.” That second number is the one that hurts.

What reconciliation is actually computing

Your business exists on the open web as a set of independent records. None of them is annotated “this is the same place as that one.” The engine starts with no such knowledge; it has strings.

Entity reconciliation (the classical database term is record linkage) is the process of collapsing those independent records onto a single real-world entity. Mechanically it takes pairs of records and assigns each pair a probability that they co-refer. In pseudocode it is unglamorous:

def co_reference_score(a, b) -> float:
    # Per-field similarity, each normalized to 0.0–1.0
    name_sim  = jaro_winkler(norm_name(a.name),   norm_name(b.name))
    addr_sim  = address_similarity(a.address,     b.address)
    phone_sim = 1.0 if e164(a.phone) == e164(b.phone) else 0.0

    # Weighted sum; an exact phone match is a strong signal
    score = 0.35 * name_sim + 0.30 * addr_sim + 0.35 * phone_sim
    return score   # above a threshold (say 0.85) → treat as the same entity

The load-bearing pieces are norm_name and e164: the normalization functions. “Blue Bottle Coffee Aoyama,” “Blue Bottle — Aoyama Cafe,” and “blue bottle coffee (aoyama)” look like one place to a human and are three distinct byte strings to a matcher. Normalization is what makes them comparable at all. And NAP consistency, stated precisely, is the act of pre-aligning your records on the source side so that no amount of normalizer cleverness is required to bring the strings together. You are doing the matcher’s job for it, in advance.
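For readers who want something they can run, here is a minimal sketch of the pseudocode above using only the Python standard library. Several pieces are my assumptions rather than anything a real engine is known to do: difflib's SequenceMatcher stands in for jaro_winkler, the GENERIC_WORDS stop-list is invented for the example, the address comparison is a crude lowercase string ratio, e164 assumes ten-digit national numbers with a default country code of 1, and the pharmacy name is hypothetical. A production normalizer would lean on a dedicated phone-parsing library and a real address parser.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical stop-list: generic words that tend to vary between listings.
GENERIC_WORDS = {"cafe", "coffee", "shop", "store", "the"}


def norm_name(name: str) -> str:
    """Lowercase, drop punctuation and generic words, sort the remaining tokens."""
    text = unicodedata.normalize("NFKC", name).lower()
    tokens = re.findall(r"\w+", text)
    kept = [t for t in tokens if t not in GENERIC_WORDS]
    return " ".join(sorted(kept))


def e164(phone: str, default_country_code: str = "1") -> str:
    """Reduce a phone string to +<digits>. Assumes 10-digit national numbers."""
    digits = re.sub(r"\D", "", phone)
    if phone.strip().startswith("+"):
        return "+" + digits
    if len(digits) == 10:
        return "+" + default_country_code + digits
    return "+" + digits


def string_sim(a: str, b: str) -> float:
    """Stand-in for jaro_winkler: difflib's ratio, also in the range 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def co_reference_score(a: dict, b: dict) -> float:
    """Weighted sum of per-field similarities, with the article's illustrative weights."""
    name_sim = string_sim(norm_name(a["name"]), norm_name(b["name"]))
    addr_sim = string_sim(a["address"].lower(), b["address"].lower())
    phone_sim = 1.0 if e164(a["phone"]) == e164(b["phone"]) else 0.0
    return 0.35 * name_sim + 0.30 * addr_sim + 0.35 * phone_sim


# The three name variants collapse to one normalized string.
variants = ["Blue Bottle Coffee Aoyama", "Blue Bottle — Aoyama Cafe", "blue bottle coffee (aoyama)"]
print({norm_name(v) for v in variants})

# Hypothetical records for the pharmacy example.
gbp = {"name": "High Street Pharmacy",
       "address": "12 High Street, Suite 4, Springfield, IL 62704",
       "phone": "+1 217-555-0143"}
directory = {"name": "High Street Pharmacy",
             "address": "12 High Street, Springfield IL",
             "phone": "(217) 555-9000"}
print(round(co_reference_score(gbp, directory), 3))
```

Sorting the tokens is a deliberate simplification: it makes word order irrelevant, which suits listing names but would be too aggressive for names where order carries meaning.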

Inconsistency hurts because it crosses a threshold

So why is spelling drift bad? Not because it lowers a rank. Because it drags the co-reference score under the threshold, and below the threshold the records stop being merged into one entity.

Make it concrete. Three sources, one pharmacy:

Source	Address as registered	Phone	Normalized addr_sim (vs GBP)
Google Business Profile	12 High Street, Suite 4, Springfield, IL 62704	+1 217-555-0143	1.00 (baseline)
Own website	12 High St., Springfield, Illinois	+1 217-555-0143	0.86
Yelp / directory	12 High Street, Springfield IL	(217) 555-9000 (tracking line)	0.74

The website drifts (“St.” for “Street,” a dropped suite and ZIP, a spelled-out state) and addr_sim falls to 0.86. But the phone matches exactly, phone_sim = 1.0 carries the pair, and the total clears the threshold: with the names matching, 0.35 + 0.30 × 0.86 + 0.35 = 0.958. Fine.

The directory record is the problem. The suite is gone (addr_sim 0.74) and the phone is a call-tracking number a marketer swapped in, so phone_sim = 0.0. Even with a perfect name match the total is 0.35 + 0.30 × 0.74 + 0 = 0.572, far under the bar. With these illustrative weights a phone mismatch alone caps the score at 0.65, and the weakened address only deepens the miss. The engine now has room to conclude this might be a different pharmacy. The failure mode is not a lower position in a list. It is a duplicate entity: one real business that the engine carries internally as roughly 1.5 businesses, splitting whatever authority and review signal should have accrued to one node.

I will resist writing that the phone number is therefore the field to guard above all. It is the strongest signal, true — and also the single most corrupted one in practice, the field that call-tracking numbers, reservation-proxy lines, and per-department extensions break first. The strongest feature is the most fragile feature. Reconciliation is full of that particular joke.

Where this sits in the LLMO stack

It is worth saying plainly what kind of layer this is, because the frameworks describing AI citation have started to formalize it. In LLMO, currently the most precise of those frameworks, NAP consistency is not a spelling chore; it sits at the probabilistic floor of entity resolution. The Coherence Signals component is, in effect, the requirement that your business tells the same story across every surface an engine can read, so that the records reconcile instead of fragmenting. That is a current-state description of how the layer is modeled, not a prediction about where it is going.

And “probabilistic floor” is the operative phrase, because there is a layer above it that does not deal in probabilities at all. NAP matching is inference: the engine concludes your records probably co-refer because their strings agree. One level up, you stop letting it infer and you declare identity outright, with @id and sameAs: resolvable URIs that say “these fragments carry the same identifier” rather than “these strings happen to look alike.” I worked through that upper layer in detail in wiring your business into the knowledge graph; the one-paragraph version of the difference is this:

String layer (this piece, NAP): evidence is surface similarity. Change the name on one source and the co-reference probability drops. It is the floor — necessary, exposed to every rebrand and bilingual variant and dropped suite number.
URI layer (@id / sameAs): evidence is declaration. Names can drift and identity holds, because identity was never riding on the name. It is the ceiling, and it rescues exactly the ambiguity the string layer leaks.

Read together, the two pieces describe the same merge from opposite ends: this one is about making the strings hard to mismatch; the other is about making the strings stop mattering. You want both, and you want to know which one you are fixing when you sit down to fix something.
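To make the URI layer concrete, here is a small sketch that builds a JSON-LD object with @id and sameAs and prints it. Everything specific in it is a placeholder I made up for illustration: the business name, the .example URLs, and the choice of the schema.org Pharmacy type. The point is only the shape: one stable identifier, plus a list of other pages declared to be the same thing.

```python
import json

# Placeholder identifier; a real @id should be a stable URL you control.
BUSINESS_ID = "https://your-pharmacy.example/#business"

entity = {
    "@context": "https://schema.org",
    "@type": "Pharmacy",
    "@id": BUSINESS_ID,
    "name": "High Street Pharmacy",  # hypothetical name
    "telephone": "+12175550143",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "12 High Street, Suite 4",
        "addressLocality": "Springfield",
        "addressRegion": "IL",
        "postalCode": "62704",
        "addressCountry": "US",
    },
    # Placeholder profile URLs: each one is declared to denote the same entity.
    "sameAs": [
        "https://directory.example/listing/high-street-pharmacy",
        "https://maps.example/place/high-street-pharmacy",
    ],
}

print(json.dumps(entity, indent=2))
```

If the name on one of those profiles drifts, the declaration still points at the same @id, which is the property the string layer cannot offer.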

How different engines weight the fields

The fields are not weighted identically across engines, and the differences are actionable. The caveat first, because it governs the whole table: the per-engine behavior below is documented architecture-based inference, not measured citation behavior. I can read the published descriptions of how these systems do entity resolution and reason about which fields are load-bearing; I cannot watch a model assign weights. With that stated:

Signal	Google / Gemini	ChatGPT (no search)	ChatGPT (with search)	Apple / Perplexity
Name normalized match	Strong	Training-data dependent	Used	Used
Address structural match	Strong	Fuzzy	Partial	Partial
Phone exact match (E.164)	Most weighted	Rarely used	Weakly used	Weakly used
Cross-source co-occurrence stability	Held in the index	Set by pre-training	Search-result dependent	Index + retrieval

The bottom row is the one most people miss. A search-grounded engine reconciles against a live index, so a freshly cleaned listing helps within a crawl cycle. A model answering from pre-training alone has no index to consult: it inherited whatever co-occurrence stability your NAP had over the years of text it trained on. Consistent NAP means the string “your business” was always surrounded by the same address and the same number, and that stability sharpens the entity’s outline in the weights. Inconsistent NAP scatters the co-occurrence, and the model holds a blurry entity with several candidate addresses. Clean everything tonight and the search-grounded engines update soon; the pre-trained ones keep their learned ambiguity for a while yet. Which engines reach for which sources in the first place varies more than you would expect, and I pulled those differences apart in AI engines cite different local sources.
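One way to get a feel for “co-occurrence stability” is a toy proxy: across every mention of the business you can find, what share carries the single most common (address, phone) pair? This is my own illustration, not a description of how any model stores entities, and the function name and sample mentions are invented for the example.

```python
from collections import Counter


def cooccurrence_stability(mentions: list[tuple[str, str]]) -> float:
    """Share of mentions that carry the most common (address, phone) pair."""
    if not mentions:
        return 0.0
    counts = Counter(mentions)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(mentions)


# Eight mentions that all agree.
consistent = [("12 high street suite 4", "+12175550143")] * 8

# Eight mentions split across three variants.
scattered = (
    [("12 high street suite 4", "+12175550143")] * 4
    + [("12 high st", "+12175550143")] * 2
    + [("12 high street", "+12175559000")] * 2
)

print(cooccurrence_stability(consistent))  # 1.0
print(cooccurrence_stability(scattered))   # 0.5
```

A search-grounded engine can see that number move after a cleanup; a model answering from pre-training is stuck with whatever the ratio was in its training text.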

Reconcile your own weakest link

Here is the one move worth making today, and it is not a checklist. Pull your business from three to five of the sources that matter (GBP, your site, Apple Maps, your top directory) and write each NAP out by hand. Put every phone number into E.164 form (+12175550143), normalize each address the way the pseudocode above would, and then read them against each other field by field, asking the only question that matters: does this pair clear the threshold, or not?

# Quick sanity check on your own JSON-LD before you compare sources by hand (reads the first block, assumed to be a single object)
curl -sL https://your-pharmacy.example/ \
  | tr '\n' ' ' \
  | grep -oE '<script[^>]*type="application/ld\+json"[^>]*>[^<]+</script>' \
  | head -n 1 \
  | sed -E 's|</?script[^>]*>||g' \
  | python3 -c 'import sys, json
d = json.load(sys.stdin)
a = d.get("address") or {}
print("name:", d.get("name"))
print("addr:", a.get("streetAddress"), a.get("addressLocality"))
print("tel :", d.get("telephone"))'

What you are counting is not the number of spelling variants. It is the lowest score in the set: the single worst-aligned record. Reconciliation is decided at the weakest link, not the average, and that one bad record is the one splitting your entity in two. If your companion concern is Google’s own projection of your data, reading GBP as JSON-LD covers the fields GBP exposes and why your job is to agree with it rather than contradict it. And if the vocabulary around all of this still feels slippery (LLMO versus SEO versus AEO versus GEO), the terminology guide draws the boundaries more carefully than I can in a clause.
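The weakest-link point is easy to check with numbers. The sketch below scores four sources against the GBP record using the illustrative weights and threshold from the pseudocode. The website and directory similarities come from the table earlier in the piece; name_sim = 1.0 throughout is my assumption, and the Apple Maps and second-directory rows are hypothetical values added so there is an average worth looking at.

```python
from statistics import mean

THRESHOLD = 0.85  # illustrative cut-off from the pseudocode


def score(name_sim: float, addr_sim: float, phone_sim: float) -> float:
    """Weighted sum with the article's illustrative weights."""
    return 0.35 * name_sim + 0.30 * addr_sim + 0.35 * phone_sim


# Each source scored against the GBP baseline.
scores = {
    "own website": score(1.0, 0.86, 1.0),           # from the table
    "tracking-line directory": score(1.0, 0.74, 0.0),  # from the table
    "Apple Maps": score(1.0, 0.95, 1.0),            # hypothetical
    "second directory": score(1.0, 1.00, 1.0),      # hypothetical
}

average = mean(scores.values())
weakest = min(scores, key=scores.get)
below = [source for source, value in scores.items() if value <= THRESHOLD]

print(f"average score: {average:.2f}")
print(f"weakest link : {weakest} ({scores[weakest]:.3f})")
print(f"not merged   : {below}")
```

The average sits above the threshold while the tracking-line directory sits far below it, which is why averaging your listings tells you nothing about whether the entity splits.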

NAP consistency, in the end, is not listing hygiene. It is the act of telling an engine, across every surface it can reach, “I am one place,” and saying it the same way every time so the claim survives the matcher. We may align our NAP perfectly tonight and find that next year an engine weights the three fields differently, or leans harder on the URI layer above this one. That part moves. But “be reconcilable to a single entity” is about as durable an instruction as you can encode in a name, an address, and a phone number.