Methodology · Public reference

How we measure AI search visibility — transparently

Every number you see in GEO Tracker AI (GEO Score, Share of Voice, mention rate) comes from a real prompt sent to a real AI engine, parsed by code you can audit. No traffic-lift claims, no black-box “AI optimization” magic, no “we got you cited” attribution. Here is exactly what we run, how we score it, and what we deliberately do not promise.

Section 1

The three engines we monitor

We track the three engines that account for the overwhelming majority of AI answer surface area today. Each one runs against a documented, vendor-supported endpoint — no scraping of consumer UIs, no “simulated” chat sessions.

| Engine | How we call it | Why it matters |
| --- | --- | --- |
| ChatGPT | OpenAI Responses API · gpt-5.4-mini with the web_search tool enabled | Default answer engine for hundreds of millions of weekly users. 45% weight in the composite GEO Score. |
| Perplexity | Perplexity API · sonar model with native citations | Citation-first answers: every response ships with the source URLs we parse into the dashboard. 25% weight. |
| Google AI Mode | DataForSEO serp/google/ai_mode/live/advanced ($0.0075 / query) | The Gemini-powered chatbot tab on google.com. The fastest-growing AI surface in 2026. 30% weight. |

Engines that return no results for a given scan are excluded from the score denominator, so we never penalize a brand for an outage on our side or an empty response from a vendor.

Section 2

How a scan actually works

One scan = one tracked question, run against every engine your tier covers, parsed by deterministic code, persisted as auditable rows in our database. Four steps:

  1. Send the prompt

    We send the exact question you tracked, verbatim, at low temperature for parsing consistency. The only addition is a concise system framing (“answer in 2–4 sentences, mention specific products by name”) that keeps responses comparable across engines.

  2. Parse the response

    Layer 1 is regex-based: did the response contain the brand hostname or token? Layer 2 is deterministic interpretation: tone, top-list position, citation-by-domain. Long-tail ambiguous cases get an optional gpt-5.4-nano refine pass. Any LLM error falls back to the deterministic verdict; a result is never silently dropped.

  3. Extract citations

    We pull every source URL the engine cited and surface them as a 30-day Citations view per question. A separate brand-level extractor names the products the engine actually recommended in prose (“HubSpot”, “Pipedrive”), not just URL hosts.

  4. Persist for replay

    Every scan, every response excerpt, every cited domain is stored. You can re-derive the score from raw rows months later. We track our own LLM spend per scan in the same ledger.
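
The parse and citation steps above can be sketched roughly as follows. This is a minimal illustration, not the production parser: the function names (`mentions_brand`, `cited_domains`, `resolve_verdict`) and the exact regex are assumptions made for this example, and the refine pass is represented by an arbitrary callable rather than a real model call.

```python
import re
from typing import Callable, Iterable, List, Optional
from urllib.parse import urlparse


def mentions_brand(response: str, hostname: str, token: str) -> bool:
    """Layer 1 (assumed shape): case-insensitive match on brand hostname or token."""
    pattern = re.compile(
        rf"(?<!\w)(?:{re.escape(hostname)}|{re.escape(token)})(?!\w)",
        re.IGNORECASE,
    )
    return pattern.search(response) is not None


def cited_domains(urls: Iterable[str]) -> List[str]:
    """Reduce cited source URLs to unique hostnames, preserving first-seen order."""
    domains: List[str] = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host and host not in domains:
            domains.append(host)
    return domains


def resolve_verdict(
    deterministic: str, refine: Optional[Callable[[], str]] = None
) -> str:
    """Use the optional refine pass, but fall back to the deterministic verdict on any error."""
    if refine is None:
        return deterministic
    try:
        return refine()
    except Exception:
        # The deterministic verdict is kept, so the result is never dropped.
        return deterministic
```

The point of `resolve_verdict` is that the LLM pass can only ever replace a verdict that already exists; a failed call leaves the deterministic result in place.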

Section 3

GEO Score = mention rate × quality, weighted by engine

The GEO Score is a 0–100 composite per engine, then a weighted average across engines. The intuition has two layers (mention rate and quality), followed by two adjustments (baseline lift and engine blend):

  • Mention rate — what fraction of your tracked questions returned an answer that named your brand at all.
  • Quality — when you were named, how prominently. We snap quality to four bands {0, 40, 70, 90} mapped to not_mentioned / mentioned / recommended / top_recommended.
  • Baseline lift — a 0.40 floor on quality so a single weak mention still gets partial credit; ramping up linearly toward 1.0 as quality improves.
  • Engine blend — ChatGPT 0.45 · Google AI Mode 0.30 · Perplexity 0.25. Engines with no results in the scan window are dropped from the denominator.

The exact formula (confidence weighting, edge cases, MVP quality bands) lives in the source — read the full reference in /docs or browse the engineering blog for a deeper walkthrough.
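
As a rough illustration of the parts stated explicitly above, the sketch below computes a mention rate from banded verdicts and blends per-engine scores with the published weights, dropping engines that returned nothing. It deliberately leaves out the baseline lift and confidence weighting, whose exact form is only in the source. The function names and the engine keys (`chatgpt`, `google_ai_mode`, `perplexity`) are assumptions for this example.

```python
from typing import Dict, Iterable, Optional

# Band values and engine weights as stated in this section.
QUALITY_BANDS = {
    "not_mentioned": 0,
    "mentioned": 40,
    "recommended": 70,
    "top_recommended": 90,
}
ENGINE_WEIGHTS = {"chatgpt": 0.45, "google_ai_mode": 0.30, "perplexity": 0.25}


def mention_rate(verdicts: Iterable[str]) -> float:
    """Fraction of tracked questions whose answer named the brand at all."""
    verdicts = list(verdicts)
    if not verdicts:
        return 0.0
    named = sum(1 for v in verdicts if QUALITY_BANDS[v] > 0)
    return named / len(verdicts)


def blend_engines(scores: Dict[str, Optional[float]]) -> Optional[float]:
    """Weighted average of per-engine scores (0-100).

    Engines with no result in the scan window are passed as None and are
    dropped from the denominator instead of counting as zero.
    """
    present = {engine: s for engine, s in scores.items() if s is not None}
    total_weight = sum(ENGINE_WEIGHTS[engine] for engine in present)
    if total_weight == 0:
        return None
    weighted = sum(ENGINE_WEIGHTS[engine] * s for engine, s in present.items())
    return weighted / total_weight
```

For example, with ChatGPT at 60, Perplexity at 40 and no Google AI Mode result, the blend is (0.45 × 60 + 0.25 × 40) / 0.70 ≈ 52.9, not the 37 you would get by treating the missing engine as a zero.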

Section 4

Smart cadence + confidence weighting

Running every tracked question against every engine every day would burn LLM budget on signal you cannot act on. Two safeguards, smart cadence and confidence weighting, keep cost and noise honest. In practice they break down into four mechanisms:

  • Cadence floor by tier. Daily pulse for monitored questions, weekly deep cycles for variant runs. Unmonitored questions sit at on-demand cadence until you flip them on.
  • Question Confidence buckets. Every tracked question gets a 4-signal trust score (LLM domain coherence + DataForSEO PAA demand + Search Console token overlap + AI scan baseline mention), bucketed as Verified / Worth tracking / Off-target.
  • Confidence-weighted scoring. The GEO Score weights each scan result by its source bucket — Verified 1.0, Worth tracking 0.7, Off-target 0.3. A noisy question cannot drag a real signal into the floor.
  • Outcome Loop measurement. When you mark an action done, we re-scan the affected questions 14 days later and report the actual GEO Score delta. Lift is measured, not promised.
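
To make the confidence weighting concrete, here is a small sketch using the bucket weights listed above. It assumes the weights enter as a plain weighted mean over per-scan values, which is an illustrative guess rather than the documented formula; the names `confidence_weighted_mean` and `rescan_due` are likewise hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable, Optional, Tuple

# Bucket weights as stated above; the snake_case keys are an assumption.
CONFIDENCE_WEIGHTS = {"verified": 1.0, "worth_tracking": 0.7, "off_target": 0.3}
RESCAN_DELAY_DAYS = 14  # Outcome Loop re-scan delay stated above


def confidence_weighted_mean(
    results: Iterable[Tuple[float, str]]
) -> Optional[float]:
    """Weighted mean of (value, bucket) scan results; None if there are no results."""
    weighted_sum = 0.0
    total_weight = 0.0
    for value, bucket in results:
        weight = CONFIDENCE_WEIGHTS[bucket]
        weighted_sum += weight * value
        total_weight += weight
    if total_weight == 0:
        return None
    return weighted_sum / total_weight


def rescan_due(action_done_on: date) -> date:
    """Date on which the affected questions would be re-scanned."""
    return action_done_on + timedelta(days=RESCAN_DELAY_DAYS)
```

With one Verified scan at 90 and one Off-target scan at 0, the plain mean is 45 while the weighted mean is 90 / 1.3 ≈ 69.2, which is the sense in which a noisy question cannot pull a real signal down to the floor.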

Section 5

What we do not claim

Most “rank in ChatGPT” vendors lead with a traffic-lift number: “216% average traffic increase”, “3× faster citations”, “dominate GEO in 30 days”. None of those numbers are falsifiable: there is no public instrument to audit them with. We sell the instrument, so we hold ourselves to a higher bar:

| What we do | What we deliberately do not do |
| --- | --- |
| Show you exactly which buyer questions name you and which name competitors instead. | Promise we “got you cited”. Citation drift is correlated with content work, never causally attributed. |
| Report the actual GEO Score delta 14 days after each action. | Quote a specific organic-traffic percentage lift. We don't run your analytics. |
| Surface the engine response excerpt verbatim with source URLs. | Auto-publish anything to your CMS, your social channels, or anywhere else without explicit per-post approval. |
| Persist raw scans so you can replay any score months later. | Hide the math behind a black box. Every formula, weight, and bucket is documented. |

If the answer to “did this actually work?” matters to you, measurement-grounded reporting beats SEO-by-promise every time. That is the entire thesis of this product.

See your own measurement, free

The free GEO Snapshot runs three real Perplexity Sonar scans against your domain — same engine, same parser, same scoring as the paid product. No credit card, no email gate beyond the shareable result. Upgrade later to add ChatGPT + Google AI Mode and daily cadence.

Pro starts with a 14-day trial that includes full ChatGPT + Perplexity + Google AI Mode coverage and the Mission Control cockpit. Cancel anytime in the customer portal.