Citation Source Intelligence — Citations tab

Where AI engines pull citations from for your queries, whether you can act on each source, and paste-ready drafts (Reddit, HN, email pitch, GitHub PR, podcast) for the ones that are open.

8 min read

The Citations tab on /dashboard/citations enriches every URL the tracked AI engines cite for your queries, classifies whether you can act on it, and — for the open ones — generates a tailored draft you can paste. It is the action layer that turns a visibility dashboard into a queue of concrete things to send next.

What problem this solves

Every AI visibility tool tells you where you are missing. Almost none tell you what to send next. Citation Source Intelligence (CSI) closes that gap end-to-end.

For each cited URL:

  1. We fetch the page (Reddit JSON for Reddit threads — bypasses the bot-wall interstitial; HTML otherwise).
  2. We classify the source type: reddit_thread, hn_story, listicle, comparison_page, blog_post, news_article, github_repo, awesome_list, podcast, youtube_video, twitter_thread, forum_thread, docs_page, or other.
  3. We extract metadata: title, author, author email (when discoverable), published date, comment / score signals, structured data, and a plain-text excerpt.
  4. We classify actionability into one of four states (see below).
  5. For supported source types in the Live or Limited buckets we generate a paste-ready draft on demand.

The four actionability states

BadgeWhat it meansWhat you get
LiveVenue is open, recent, and accepts contributions today.Draft generated on demand.
LimitedVenue is open but degraded — old article, low-traffic thread, news comments closed but the author is still reachable.Draft generated with a degraded-venue warning above the Generate button.
FrozenVenue does not exist or is permanently closed: archived Reddit thread, locked HN story, defunct domain, Wikipedia / MDN / IETF informational reference.No draft. A replacement strategy card shows up instead — fresh thread on the same subreddit, "submit your own Show HN", "find an active podcast in the same beat", or "build a canonical resource on your own domain." Frozen does not equal done.
ManualVenue is open but requires a logged-in identity we cannot automate (Twitter / X reply, YouTube comment)."Open thread / Open video" hand-off + a short manual-reply hint.

Every Frozen badge in the UI has a tooltip showing the exact reason (e.g. "Reddit thread is archived (>6 months old) — comments are auto-locked").

What gets drafted, per source type

Only the source types listed below produce drafts. Other types still appear in the Citations table with their actionability badge but the detail sheet says "Drafts coming soon" — we'd rather ship one platform well than five poorly.

Reddit comment

Triggered for reddit_thread source type with status Live or Limited. The draft:

  • Answers the OP's specific question first (no "+1", no "great post").
  • Shares one concrete data point or experience.
  • Mentions your brand naturally alongside one or two real alternatives.
  • Ends with a short disclosure line (Disclosure: I work on …).
  • Stays inside 80–160 words, no superlatives, no emoji, link-capped.

Per-subreddit etiquette overrides apply for r/SEO, r/SaaS, r/marketing, r/Entrepreneur, r/digital_marketing, r/programming, r/devops, r/webdev, r/startups, and r/ArtificialInteligence — each contributes both a system-prompt hint and a culture-specific warning rendered in the action card.

Hacker News reply

Triggered for hn_story ≤ 14 days old (older threads are auto-locked by HN and classified Frozen). HN drafts:

  • Open with the technical claim, not rapport.
  • Drop one specific signal — a number, a benchmark, a gotcha.
  • Mention the brand once with a short "I work on …" line (HN expects identifiable handles, no separate "Disclosure:" prefix).
  • Stay inside 60–150 words.

Email pitch — listicles, comparisons, blog posts, news

Triggered for listicle, comparison_page, blog_post, news_article. The draft includes:

  • A short, specific subject (max 8 words; Re: prefixes blocked).
  • A first sentence that references something concrete from the article (a tool they listed, a category they covered).
  • One — not three — concrete data point about your brand.
  • A soft ask matched to "If you're updating this in Q3" / "no rush".
  • A Thanks, [name] placeholder for you to fill.

We auto-detect the author's email across mailto: links, obfuscated patterns (jane [at] domain [dot] com), Contact: / Email: labels, and footer-region scans. When found, the secondary action button opens the user's email client via mailto: pre-filled with subject + body. When not found, we fall back to the author profile URL or a Hunter.io domain search.

GitHub awesome-list PR

Triggered for awesome_list. Output:

  • PR title (max 70 chars, "Add X to Y section" convention).
  • A "Why it fits" + "Format check" + "Suggested entry" PR body.
  • The exact markdown line to add, formatted to match the list's existing entries, in a triple-backtick block for easy copy.
  • A one-click "Open README in editor" button that lands you in GitHub's web editor on main/README.md ready to fork + edit.

The CONTRIBUTING.md reminder is added to every PR draft's etiquette warnings — every awesome list has its own format and PR-template rules.

Podcast guest pitch

Triggered for podcast. Output:

  • Subject (≤ 8 words, references the show concretely).
  • 2-3 specific episode angles (each = a tentative episode title + a one-sentence hook).
  • One-line bio with the most credible signal you have (real number, notable customer, prior podcast).
  • Soft ask matched to "next month / no rush".

mailto: pre-fill when the host email is detected; otherwise host profile URL or a Listen Notes search.

Etiquette guards — hard rules, not vibes

Every draft passes through a per-platform guard pass that flags issues in the action card before you can copy:

RuleWhere
Body must mention your brandAll five draft types
Affiliation disclosure detectedReddit (required), HN (recommended)
Subject ≤ 70 chars and not Re:Email pitch, podcast pitch
Length cap (per platform)All types
Superlatives blocked (best, leading, game-changer, world-class)All types
Hard-ask phrases blocked (hop on a call, do you have 15 minutes, circling back, just touching base)Email pitch, podcast pitch
Multi-link comments flaggedReddit, HN
One PR = one entry ruleGitHub PR
Multiple topic angles requiredPodcast pitch

Failed checks do not block the copy button — they appear in the action card so you see them before sending. Drafts, not auto-posts.

Closed-loop outcome tracking

Every action you mark posted / sent / submitted is re-evaluated 14 days later by a Monday-morning cron. The evaluator:

  1. Counts citations on the same source in the 14 days before the action.
  2. Counts citations on the same source in the 14 days after the action.
  3. Computes citation_delta and the distinct queries affected.
  4. Assigns a confidence band (high / medium / low) based on baseline volume and whether the post window had enough completed scans.

The action card then shows the outcome in plain English:

  • Drove +5 citations across 3 queries (high confidence) — green.
  • Lost 2 citations after 14 days — investigate (medium confidence) — red.
  • No measurable change after 14 days (medium confidence) — neutral.
  • Not enough scan data yet to attribute (low confidence) — neutral.

Without this signal, every citation tool is just another dashboard.

What runs in the background

CronSchedule (UTC)What it does
Daily citations enrich0 4 * * *Pulls cited URLs from the last 36h of completed scans, fetches + classifies + extracts up to 60 new URLs per domain. Cache hits are unlimited; spillover lands the next morning.
Weekly citations outcomes0 5 * * 1Re-evaluates every action with status='done' and marked_done_at older than 14 days; upserts citation_action_outcomes.
Immediate enrichmentAfter each manual scanTriggered via after() so it never blocks the scan response. Bounded to 15 new URLs — most fetches finish within 10–30 seconds.

Free tier is excluded from CSI persistence (Free users see the top-3 citation sources from their latest scan only, read-only).

Plan limits

PlanRead accessAction drafts / week
FreeTop 3 citation sources, read-only, no drafts0
ProFull citation history with 4-state classification5
BusinessFull citation history + outcome tracking surfaced on every done action15

Generating a draft for the same source twice (regenerate) does not double-count toward the weekly cap.

Honest limits

  • We never auto-post. Every draft requires a human review before send.
  • Email auto-detection works best on personal blogs and content sites; corporate webs with contact-form-only setups fall through to Hunter.io.
  • Twitter / X and YouTube are Manual today. Posting requires a logged-in identity we do not automate.
  • We have not built helpers for forum_thread, docs_page, or generic github_repo (non-awesome) yet — those source types appear in the table with the right actionability badge but the detail sheet says drafts are coming soon.