Methodology · Public reference
How we measure AI search visibility — transparently
Every number you see in GEO Tracker AI (GEO Score, Share of Voice, mention rate) comes from a real prompt sent to a real AI engine, parsed by code you can audit. No traffic-lift claims, no black-box “AI optimization” magic, no “we got you cited” attribution. Here is exactly what we run, how we score it, and what we deliberately do not promise.
Section 1
The three engines we monitor
We track the three engines that account for the overwhelming majority of AI answer surface area today. Each one runs against a documented, vendor-supported endpoint — no scraping of consumer UIs, no “simulated” chat sessions.
| Engine | How we call it | Why it matters |
|---|---|---|
| ChatGPT | OpenAI Responses API · gpt-5.4-mini with the web_search tool enabled | Default answer engine for hundreds of millions of weekly users. 45 % weight in the composite GEO Score. |
| Perplexity | Perplexity API · sonar model with native citations | Citation-first answers — every response ships with the source URLs we parse into the dashboard. 25 % weight. |
| Google AI Mode | DataForSEO serp/google/ai_mode/live/advanced ($0.0075 / query) | The Gemini-powered chatbot tab on google.com. The fastest-growing AI surface in 2026. 30 % weight. |
Engines that return no result for a given scan are excluded from the score denominator: we never penalize a brand for an outage on our side or an empty response from a vendor.
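A minimal sketch of how the exclusion rule could work, assuming each engine produces a 0-100 score and an engine with no result is represented as `None`. The weights come from the table above; the function and key names (`blend_engine_scores`, `google_ai_mode`, and so on) are illustrative and not taken from the product's source.

```python
# Engine weights as listed in the table above.
ENGINE_WEIGHTS = {"chatgpt": 0.45, "google_ai_mode": 0.30, "perplexity": 0.25}


def blend_engine_scores(engine_scores):
    """Weighted average over engines that returned at least one result.

    engine_scores maps engine name -> 0-100 score, or None when the engine
    returned nothing for this scan and should leave the denominator.
    """
    live = {name: score for name, score in engine_scores.items() if score is not None}
    total_weight = sum(ENGINE_WEIGHTS[name] for name in live)
    if total_weight == 0:
        return None  # no engine produced a result, so there is nothing to report
    weighted_sum = sum(ENGINE_WEIGHTS[name] * score for name, score in live.items())
    return weighted_sum / total_weight
```

With ChatGPT at 80, Perplexity at 40, and Google AI Mode missing, the blend is (0.45 × 80 + 0.25 × 40) / 0.70 ≈ 65.7. Treating the missing engine as a zero would give 46 instead.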
Section 2
How a scan actually works
One scan = one tracked question, run against every engine your tier covers, parsed by deterministic code, persisted as auditable rows in our database. Four steps:
1 · Send the prompt
We send the exact question you tracked, verbatim, at low temperature for parsing consistency. The only addition is a concise system framing (“answer in 2–4 sentences, mention specific products by name”) that keeps responses comparable across engines.
2 · Parse the response
Layer 1 is regex-based: did the response contain the brand hostname or token? Layer 2 is deterministic interpretation: tone, top-list position, citation-by-domain. Long-tail ambiguous cases get an optional gpt-5.4-nano refine pass. Any LLM error falls back to the deterministic verdict, never silently drops.
3 · Extract citations
We pull every source URL the engine cited and surface them as a 30-day Citations view per question. A separate brand-level extractor names the products the engine actually recommended in prose (“HubSpot”, “Pipedrive”), not just URL hosts.
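The parse and citation steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the production parser: the function names, the whole-word matching rule, and the `www.` normalization are all hypothetical choices made for the example.

```python
import re
from urllib.parse import urlparse


def mentions_brand(response_text, hostname, brand_token):
    """Layer 1 (assumed form): case-insensitive match on hostname or whole-word token."""
    pattern = re.compile(
        rf"{re.escape(hostname)}|\b{re.escape(brand_token)}\b",
        re.IGNORECASE,
    )
    return pattern.search(response_text) is not None


def cited_domains(urls):
    """Reduce cited URLs to unique hostnames in first-seen order, dropping a leading 'www.'."""
    domains = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host and host not in domains:
            domains.append(host)
    return domains


def final_verdict(deterministic_verdict, refine=None):
    """Apply an optional LLM refine callable; on any error keep the deterministic verdict."""
    if refine is None:
        return deterministic_verdict
    try:
        return refine(deterministic_verdict)
    except Exception:
        return deterministic_verdict
```

The fallback in `final_verdict` mirrors the rule stated above: a failed refine pass never removes a result, it only leaves the deterministic answer in place.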
4 · Persist for replay
Every scan, every response excerpt, every cited domain is stored. You can re-derive the score from raw rows months later. We track our own LLM spend per scan in the same ledger.
Section 3
GEO Score = mention rate × quality, weighted by engine
The GEO Score is a 0–100 composite per engine, then a weighted average across engines. The intuition has two layers, mention rate and quality, plus a baseline lift and an engine blend:
- Mention rate: what fraction of your tracked questions returned an answer that named your brand at all.
- Quality: when you were named, how prominently. We snap quality to four bands {0, 40, 70, 90} mapped to not_mentioned / mentioned / recommended / top_recommended.
- Baseline lift: a 0.40 floor on quality so a single weak mention still gets partial credit; ramping up linearly toward 1.0 as quality improves.
- Engine blend: ChatGPT 0.45 · Google AI Mode 0.30 · Perplexity 0.25. Engines with no results in the scan window are dropped from the denominator.
The exact formula (confidence weighting, edge cases, MVP quality bands) lives in the source — read the full reference in /docs or browse the engineering blog for a deeper walkthrough.
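To make the per-engine intuition concrete, here is one possible reading of “mention rate × quality”, offered as a sketch only. It assumes the quality factor is the band divided by 100, floored at 0.40 once the brand is mentioned, and that quality is averaged over the answers that mention you. The formula in the source is authoritative and may differ; `quality_factor` and `engine_score` are hypothetical names.

```python
# Bands and floor as described in the list above.
QUALITY_BANDS = {"not_mentioned": 0, "mentioned": 40, "recommended": 70, "top_recommended": 90}
QUALITY_FLOOR = 0.40


def quality_factor(verdict):
    """Assumed mapping: band / 100, never below the 0.40 floor once mentioned."""
    band = QUALITY_BANDS[verdict]
    if band == 0:
        return 0.0
    return max(QUALITY_FLOOR, band / 100)


def engine_score(verdicts):
    """0-100 score for one engine: mention rate x average quality of the mentions."""
    if not verdicts:
        return None  # no results: the engine leaves the denominator
    mentioned = [v for v in verdicts if v != "not_mentioned"]
    if not mentioned:
        return 0.0
    mention_rate = len(mentioned) / len(verdicts)
    avg_quality = sum(quality_factor(v) for v in mentioned) / len(mentioned)
    return 100 * mention_rate * avg_quality
```

Under these assumptions, four tracked questions with one top recommendation, one plain mention, and two misses give a mention rate of 0.5 and an average quality of 0.65, so the engine scores 32.5.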
Section 4
Smart cadence + confidence weighting
Running every tracked question against every engine every day would burn LLM budget on signal you cannot act on. Two safeguards keep cost and noise honest:
- Cadence floor by tier. Daily pulse for monitored questions, weekly deep cycles for variant runs. Unmonitored questions sit at on-demand cadence until you flip them on.
- Question Confidence buckets. Every tracked question gets a 4-signal trust score (LLM domain coherence + DataForSEO PAA demand + Search Console token overlap + AI scan baseline mention), bucketed as Verified / Worth tracking / Off-target.
- Confidence-weighted scoring. The GEO Score weights each scan result by its source bucket — Verified 1.0, Worth tracking 0.7, Off-target 0.3. A noisy question cannot drag a real signal into the floor.
- Outcome Loop measurement. When you mark an action done, we re-scan the affected questions 14 days later and report the actual GEO Score delta. Lift is measured, not promised.
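The confidence weights and the 14-day re-scan window above can be sketched as follows. This assumes the weights are applied to a per-question mention flag, which is a simplification: the product may weight a different quantity. The names `weighted_mention_rate` and `rescan_date` are illustrative.

```python
from datetime import date, timedelta

# Bucket weights and re-scan delay as stated in the list above.
BUCKET_WEIGHTS = {"verified": 1.0, "worth_tracking": 0.7, "off_target": 0.3}
RESCAN_DELAY_DAYS = 14


def weighted_mention_rate(results):
    """results: list of (bucket, mentioned) pairs, one per scanned question."""
    total = sum(BUCKET_WEIGHTS[bucket] for bucket, _ in results)
    if total == 0:
        return None  # nothing scanned
    hits = sum(BUCKET_WEIGHTS[bucket] for bucket, mentioned in results if mentioned)
    return hits / total


def rescan_date(action_done_on):
    """Date on which the affected questions are re-scanned after an action is marked done."""
    return action_done_on + timedelta(days=RESCAN_DELAY_DAYS)
```

For example, a Verified hit, a Worth tracking hit, and an Off-target miss give 1.7 / 2.0 = 0.85, compared with 2 / 3 ≈ 0.67 unweighted: the Off-target miss pulls the rate down far less than a Verified miss would.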
Section 5
What we do not claim
Most “rank in ChatGPT” vendors lead with a traffic-lift number — “216 % average traffic increase”, “3× faster citations”, “dominate GEO in 30 days”. None of those numbers are falsifiable: there is no public instrument to audit them with. We sell the instrument, so we hold ourselves to a higher bar:
| What we do | What we deliberately do not do |
|---|---|
| Show you exactly which buyer questions name you and which name competitors instead. | Promise we “got you cited” — citation drift is correlated with content work, never causally attributed. |
| Report the actual GEO Score delta 14 days after each action. | Quote a specific organic-traffic percentage lift — we don't run your analytics. |
| Surface the engine response excerpt verbatim with source URLs. | Auto-publish anything to your CMS, your social channels, or anywhere else without explicit per-post approval. |
| Persist raw scans so you can replay any score months later. | Hide the math behind a black box — every formula, weight, and bucket is documented. |
If the answer to “did this actually work?” matters to you, measurement-grounded reporting beats SEO-by-promise every time. That is the entire thesis of this product.
See your own measurement, free
The free GEO Snapshot runs three real Perplexity Sonar scans against your domain — same engine, same parser, same scoring as the paid product. No credit card, no email gate beyond the shareable result. Upgrade later to add ChatGPT + Google AI Mode and daily cadence.
Pro starts with a 14-day trial that includes full ChatGPT + Perplexity + Google AI Mode coverage and the Mission Control cockpit. Cancel anytime in the customer portal.