Citation Source Intelligence — what it is and the three layers
Why we don't treat AI citations as undifferentiated URLs — and the three-layer architecture (extract, enrich, act) that turns a list of cited pages into a queue of paste-ready actions.
Most AI visibility tools stop at "ChatGPT cited this URL". That is the floor, not the ceiling. Citation Source Intelligence (CSI) is the discipline of treating every cited URL as a structured object with an authority score, a source type, an actionability state, and — for the ones still open — a paste-ready draft for the comment, pitch, or PR the user can send.
The premise is simple: a Wikipedia article, an archived Reddit thread, a 2022 listicle, and a Hacker News story two days old are all "AI citations". Treating them as the same row in a dashboard is the single biggest reason teams stare at GEO reports without ever taking action. CSI is the structure that makes them different.
What CSI actually is
CSI is not a single feature. It is the pipeline that runs over every URL the tracked AI engines reference for your queries. Three layers, each adding a different kind of structure on top of the raw citation:
Layer 1 — Extract
For every scan result, we parse the AI's response, identify URLs that were cited (linked, footnoted, or named as the source of a fact), and record them in citation_sources with the query, engine, and timestamp they came from. This is the cheapest layer — pattern matching, no LLM, no remote fetch. The output is a clean list: "these are the URLs the engine pulled from for your queries this week."
This is where most other AI visibility tools stop. CSI uses it as input.
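Layer 1 can be sketched in a few lines. This is a minimal illustration, not the production parser: the regex, function name, and row shape are assumptions, with only the `citation_sources` columns (url, query, engine, timestamp) taken from the description above.

```python
import re
from datetime import datetime, timezone

# Markdown links, footnotes, and bare URLs all reduce to the same
# capture: any http(s) URL in the response text.
URL_RE = re.compile(r"https?://[^\s)\]>\"']+")

def extract_citations(response_text: str, query: str, engine: str) -> list[dict]:
    """Layer 1: pattern matching only -- no LLM call, no remote fetch.
    Returns deduplicated rows shaped like citation_sources entries."""
    seen: set[str] = set()
    rows = []
    for url in URL_RE.findall(response_text):
        url = url.rstrip(".,;")  # strip trailing sentence punctuation
        if url in seen:
            continue
        seen.add(url)
        rows.append({
            "url": url,
            "query": query,
            "engine": engine,
            "cited_at": datetime.now(timezone.utc).isoformat(),
        })
    return rows
```

The point of the sketch is the cost profile: everything here is string work, which is why Layer 1 can run on every scan result unconditionally.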
Layer 2 — Enrich
For every cited URL, we fetch the page, classify the source type (reddit_thread, hn_story, listicle, news_article, awesome_list, docs_page, podcast, youtube_video, forum_thread, comparison_page, …), pull platform-specific signals (Reddit archived / locked flags, publication date, author, comment count, score), and run an actionability classifier that decides whether the venue is open for input. The four states are described in detail in the Citation actionability concept.
The output of Layer 2 is the difference between "ChatGPT cited a Reddit thread" and "ChatGPT cited a Reddit thread that was archived eight months ago — there is no comment box."
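A Layer 2 record and a toy actionability rule might look like the sketch below. The field names, the `classify_reddit` helper, and the 30-day threshold are illustrative assumptions; only the source types, the Reddit archived/locked signals, and the four states come from the text above.

```python
from dataclasses import dataclass, field

@dataclass
class EnrichedCitation:
    """Layer 2 output: one row per cited URL (field names hypothetical)."""
    url: str
    source_type: str   # reddit_thread, hn_story, listicle, ...
    actionability: str # live, limited, frozen, manual
    signals: dict = field(default_factory=dict)  # archived, age_days, score, ...

def classify_reddit(signals: dict) -> str:
    """Toy actionability rule for a reddit_thread, assuming the signals
    dict carries the platform flags Layer 2 scraped. Thresholds are
    illustrative, not the product's real boundaries."""
    if signals.get("archived") or signals.get("locked"):
        return "frozen"   # no comment box -- nothing to draft
    if signals.get("age_days", 0) > 30:
        return "limited"  # open, but replies get little visibility
    return "live"
```

This is the structure that distinguishes "cited a Reddit thread" from "cited a Reddit thread with no comment box."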
Layer 3 — Act
For URLs classified as Live or Limited in Layer 2, we generate a tailored draft for the action that fits the source type:
- reddit_thread → a value-first Reddit comment with affiliation disclosure, mention of two or three real alternatives, and the etiquette guards (no superlatives, ≤160 words, single link) that keep the comment from being auto-removed.
- hn_story → a Hacker News comment sized for the 14-day reply window and the HN tone (concrete, technical, no marketing voice).
- listicle / comparison_page / news_article → an email pitch to the author, framed as a useful update for a future piece rather than a request to be added.
- awesome_list / github_repo → a pull request description for the README entry.
- podcast → a guest pitch matched to the show's format.
- youtube_video → a partnership outreach rather than a comment (comments on stale uploads have low visibility).
For URLs classified as Frozen or Manual, the generator does not run. Drafting a comment for an archived thread is worse than drafting nothing — it wastes tokens, looks naïve, and trains the user to ignore the output.
How the layers connect
The three layers share data through the citation_sources and citation_actions tables. A given URL flows through them in order, but each layer is independently re-runnable: a refreshed actionability classification (Layer 2) does not require a new draft (Layer 3) unless the state changed; a new draft does not require a re-fetch unless the source content drifted.
This separation matters because the layers have different cost profiles. Layer 1 is free. Layer 2 costs one HTTP fetch per URL. Layer 3 costs an LLM call per draft — the largest cost in the pipeline — and is the one we most aggressively gate on actionability state.
Why this is its own concept (not just a feature)
Most product surfaces — the Citations tab in the dashboard, the action queue in the weekly insights digest, the per-source detail view — are read views over the CSI pipeline. Understanding the pipeline as a concept makes the UI behavior predictable: why a Reddit thread shows a "Frozen" badge and no draft button; why a fresh HN story shows a draft within minutes; why actionability re-classifies on its own when a thread crosses the 30-day or 180-day boundary; why some URLs show enrichment data (comment count, age, author) and others only show the URL.
Concepts that pair with this one:
- Mention vs citation — the upstream signal CSI operates over.
- Citation actionability — the four-state classification that gates Layer 3.
- AI engines — where citations come from in the first place.
For the dashboard view of CSI, see Citation Source Intelligence in the Citations tab. For the strategic argument behind why the action layer matters in the broader GEO category, the blog post "22 AI Visibility Tools, $25 to $699 a Month. Not One Tells You What to Do Next." maps where this fits in the competitive landscape.