Guide · Updated 2026-05-13

How to set up llms.txt — a practical 2026 guide

A vendor-neutral walkthrough of what llms.txt is, why it matters in 2026, where to put it, how to write one that AI agents actually use, and — most importantly — how to verify it changed anything. Five-minute read, copy-paste templates, honest about what does and does not work.

What llms.txt actually is (and is not)

llms.txt is a plain-text file you host at the root of your website (e.g. yourdomain.com/llms.txt) that gives AI agents a Markdown map of which pages on your site matter and what each one is about. It is the AI-era equivalent of a sitemap: a one-page summary designed for single-fetch ingestion by an AI agent or research crawler, instead of multi-page HTML traversal.

The convention was proposed by Jeremy Howard in 2024 at llmstxt.org. It is informal, not standardised, and not enforced by any search engine. But it is also free, takes 60 seconds to ship, and the AI agents that do read it (Anthropic Claude clients, Perplexity, MCP servers, smaller open-source crawlers, AI research agents) get a measurably cleaner summary of your site than they would by inferring it from your homepage.

What it is not

  • Not a ranking signal. Google has explicitly stated llms.txt is not used in any AI Mode or AI Overviews ranking. Bing has not endorsed it either.
  • Not a guarantee of citation. Hosting a perfect file does not make ChatGPT or Perplexity cite you for "best X for Y".
  • Not a substitute for sitemap.xml or robots.txt. It is additive.

Why bother in 2026

Three honest reasons:

  1. Low effort, asymmetric upside. Five minutes of work. If even a fraction of the agents that read it cite your brand more accurately because of it, you have paid back the time.
  2. Trains your team to think in AI-citation hierarchies. Writing the file forces you to pick which pages on your site you actually want AI to surface. That prioritisation outlives the file itself.
  3. Signal of seriousness. Investors, partners, and prospects who check /llms.txt on your domain treat its presence as a sign that the team is paying attention to AI search. Whether or not the file changes ranking, its absence is now a small negative signal in some buying conversations.

What it is not a good reason for: chasing rank in Google AI Mode or AI Overviews. That is decided by the same signals that drive Google organic ranking (authority, entity clarity, structured data, freshness) plus citation footprint in third-party sources — none of which llms.txt influences.

Where the file goes

Two locations, both served as text/plain:

  • yourdomain.com/llms.txt — the navigational map. Required.
  • yourdomain.com/llms-full.txt — the full content of all referenced resources, concatenated in one Markdown document. Optional, recommended for sites with deep docs.

Both must be at the exact root path. /docs/llms.txt or /static/llms.txt do not work — most agents fetch a fixed path.

Structure — the llmstxt.org spec in 90 seconds

Five elements, in order:

  1. H1 — your brand name. Exactly one.
  2. Blockquote — one sentence describing what you do. Optional but strongly recommended.
  3. H2 sections — group your resources into categories (e.g. Documentation, Products, Articles, Optional).
  4. Markdown list items under each H2 — one per resource, format - [Title](URL): One-line description.
  5. End-of-file newline — keep it clean for parsers.

Concrete example:

# Acme Analytics

> B2B analytics platform that turns raw product events into board-ready dashboards in under five minutes.

## Documentation

- [Getting started](https://acme.com/docs/getting-started): Sign up, install the SDK, run your first dashboard in 5 minutes.
- [Event model](https://acme.com/docs/concepts/events): How Acme structures product events into entities and metrics.

## Products

- [Pricing](https://acme.com/pricing): Three tiers — Free, Pro $99/mo, Business $299/mo. Annual billing saves 20%.

## Articles

- [Event-naming conventions for product analytics](https://acme.com/blog/event-naming): A pragmatic 2026 framework for naming events without breaking dashboards later.

## Optional

- [Sitemap](https://acme.com/sitemap.xml): Full URL inventory.

Need it without typing? Open the free llms.txt generator at /tools/llms-txt-generator — fill the form, copy or download the file.
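A quick way to sanity-check a draft against the five rules above is a small script. This is a sketch, not an official validator; the checks simply mirror the spec, and the helper name and sample draft are illustrative:

```typescript
// Minimal structural check for a draft llms.txt, mirroring the five rules above.
function checkLlmsTxt(text: string): string[] {
  const problems: string[] = []
  const lines = text.split('\n')

  // 1. Exactly one H1.
  const h1Count = lines.filter((l) => l.startsWith('# ')).length
  if (h1Count !== 1) problems.push(`expected exactly one H1, found ${h1Count}`)

  // 2. A blockquote summary (optional in the spec, but worth flagging).
  if (!lines.some((l) => l.startsWith('> '))) problems.push('no blockquote summary')

  // 3. At least one H2 section.
  if (!lines.some((l) => l.startsWith('## '))) problems.push('no H2 sections')

  // 4. List items in "- [Title](URL): description" form.
  for (const item of lines.filter((l) => l.startsWith('- '))) {
    if (!/^- \[.+\]\(https?:\/\/\S+\)(: .+)?$/.test(item)) {
      problems.push(`malformed list item: ${item}`)
    }
  }

  // 5. End-of-file newline.
  if (!text.endsWith('\n')) problems.push('missing end-of-file newline')

  return problems
}

// In-memory example: an empty array means the draft passes every check.
const draft =
  '# Acme Analytics\n\n> B2B analytics platform.\n\n## Documentation\n\n' +
  '- [Getting started](https://acme.com/docs/getting-started): First dashboard in 5 minutes.\n'
console.log(checkLlmsTxt(draft))
```

Run it over your draft before publishing; most real-world failures are malformed list items and a missing trailing newline.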

llms.txt vs llms-full.txt — when to publish both

llms.txt is the index. llms-full.txt is the full content. Ship both when:

  • You have ≥ 5 docs articles worth offering as a single ingest.
  • The docs change infrequently enough that a daily or weekly rebuild is fine (don't hammer your CMS in real time).
  • You want AI research agents to be able to answer questions about your product without crawling each page.

Skip llms-full.txt when:

  • Your site is < 5 content pages. The index is enough.
  • Your content includes large reference tables, code samples, or marketing assets that lose meaning when flattened to Markdown.
  • Your docs change multiple times per day. Stale concat files are worse than no concat file.
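If you do ship llms-full.txt, the build step can be a short script that concatenates your docs at deploy time. A sketch, assuming plain Markdown source files; the brand name, directory layout, and horizontal-rule separator are my choices, not part of the spec:

```typescript
// Build llms-full.txt: one H1, then every doc's Markdown, separated by rules.
function buildLlmsFull(brand: string, docs: string[]): string {
  return `# ${brand}\n\n` + docs.map((d) => d.trim()).join('\n\n---\n\n') + '\n'
}

// At deploy time, feed it your rendered docs. Paths below are assumptions:
// import { readdirSync, readFileSync, writeFileSync } from 'node:fs'
// const docs = readdirSync('content/docs')
//   .filter((f) => f.endsWith('.md'))
//   .sort()
//   .map((f) => readFileSync(`content/docs/${f}`, 'utf8'))
// writeFileSync('public/llms-full.txt', buildLlmsFull('Acme Analytics', docs))
```

Wire this into your build command (or a scheduled job) rather than regenerating on every request, per the rebuild-cadence note above.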

Setup on Next.js / Vercel

If you already auto-generate, skip the form-based tool

The free generator at /tools/llms-txt-generator produces a static file you maintain by hand — useful for small sites or first drafts. But if you have a content collection (MDX, Sanity, Contentful, a database, a sitemap that updates often), the dynamic route below is strictly better: it never drifts, never needs manual re-edits, and rebuilds on every deploy. Don't maintain a hand-written file alongside a CMS — pick one path.

Two options. Pick based on whether your file is static or needs to update with your CMS.

Option A — fully static

Drop the file at public/llms.txt in your repo. Vercel serves it as text/plain automatically. This is the right choice if you write the file by hand and update it manually when content changes.

Option B — dynamic route

Build the file at request time from your content collection. App Router example:

// app/llms.txt/route.ts
import { getAllPosts } from '@/lib/blog'

// Built once at deploy time and served from the CDN.
export const dynamic = 'force-static'
export const revalidate = false

export function GET() {
  // Cap the list: the file should be a curated index, not a full inventory.
  const posts = getAllPosts().slice(0, 12)

  let body = '# Your Brand Name\n\n'
  body += '> One-line description of what you do.\n\n'

  body += '## Articles\n\n'
  for (const p of posts) {
    body += `- [${p.title}](https://yourdomain.com/blog/${p.slug}): ${p.description}\n`
  }

  // text/plain is required; some agents reject text/html.
  return new Response(body, {
    status: 200,
    headers: {
      'Content-Type': 'text/plain; charset=utf-8',
      'Cache-Control': 'public, max-age=3600, s-maxage=3600',
    },
  })
}

force-static means Vercel builds the file at deploy time and serves it from the CDN — zero per-request cost. If you want it to update without a redeploy, change the revalidate export to export const revalidate = 3600, which rebuilds it at most once an hour.

Setup on WordPress, Webflow, Framer, plain HTML

The file just needs to be at yourdomain.com/llms.txt, served as text/plain. Platform-specific options:

  • WordPress — upload via SFTP to your site root, or use a plugin like "File Manager" to drop it next to wp-config.php. If your .htaccess rewrites everything to index.php, add a passthrough rule for llms.txt.
  • Webflow — paid plans support /robots.txt; /llms.txt is not directly editable in the UI today. Workaround: host on a subdomain (docs.yourdomain.com/llms.txt) or use Cloudflare Workers in front.
  • Framer — same limitation as Webflow. Cloudflare Workers or a custom domain proxy is the practical route.
  • Plain HTML / static host (S3, Cloudflare Pages, Netlify) — just put the file in the root of your build output. Done.
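For the WordPress case above: WordPress's default .htaccess already skips real files on disk (via RewriteCond %{REQUEST_FILENAME} !-f), so uploading the file is usually enough. If your rules rewrite unconditionally, add a passthrough before the catch-all:

```apache
# Let llms.txt be served as a plain file instead of routed to index.php
RewriteRule ^llms\.txt$ - [L]
```

Apache serves .txt files as text/plain by default, so no Content-Type override is needed.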

Common mistakes

  • Serving as text/html instead of text/plain. Some agents reject the file. Always check the response header.
  • Linking to internal-only or login-gated pages. AI agents cannot read pages behind auth. Only link to public URLs.
  • Stuffing the file with every URL on your site. The point is curation, not coverage. 10–30 high-value links beat 500 noisy ones.
  • Writing descriptions in marketing-speak. "Industry-leading platform" teaches the model nothing. Write descriptions like a tech editor: what the page is, who it is for, what number or fact it contains.
  • Forgetting to update it. The file is a snapshot. If you rename a docs section or kill a product page, update llms.txt the same day. Stale files hurt more than missing ones.
  • Expecting it to fix ranking. It will not. See the next section.
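The first three mistakes in the list above are machine-checkable. A smoke-test sketch using the built-in fetch available in Node 18+; the function name and the placeholder domain are mine:

```typescript
// Smoke test: does /llms.txt resolve at the root, as text/plain, with an H1?
async function smokeTest(origin: string): Promise<string[]> {
  const problems: string[] = []
  const res = await fetch(`${origin}/llms.txt`)

  if (res.status !== 200) problems.push(`status ${res.status}, expected 200`)

  const type = res.headers.get('content-type') ?? ''
  if (!type.includes('text/plain')) problems.push(`content-type "${type}", expected text/plain`)

  const body = await res.text()
  if (!body.startsWith('# ')) problems.push('file does not start with an H1 line')

  return problems
}

// Point it at your live domain (placeholder shown):
// smokeTest('https://yourdomain.com').then((p) => console.log(p.join('\n') || 'OK'))
```

Run it after every deploy that touches routing or CDN config; a framework upgrade silently switching the response to text/html is a common regression.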

How to verify it actually works

Here is the honest part. Hosting llms.txt does not move Google AI Mode rank, does not guarantee Perplexity citation, and does not affect ChatGPT's answer to "best X for Y". To know whether your AI search visibility is actually improving, you need three measurements:

  1. Share of Voice — of all the buyer-questions your prospects ask AI engines, what percentage now cite your brand?
  2. AI citation count over time — is the count rising, flat, or falling per engine (ChatGPT, Perplexity, Google AI Mode)?
  3. Outcome of specific actions — when you ship llms.txt, a comparison page, a Reddit comment, does the Share of Voice on the target prompt actually move 14 days later, or is it noise?

These three are what GEO Tracker AI measures. You can run a free 60-second audit on a single domain at /grader. The full product runs a controlled benchmark scan through ChatGPT, Perplexity, and Google AI Mode on a schedule and attributes Share of Voice deltas to specific actions you ship. Without that, you are guessing whether the file helped — which is fine if the cost was five minutes, but worth knowing.

What to ship next

Once llms.txt is live, the highest-leverage next steps for AI search visibility are, in order:

  1. Fix robots.txt for AI crawlers. Most sites either block all AI bots (losing potential citations) or allow everything (including some abusive ones). Decide deliberately. Builder coming soon at /tools/robots-txt-for-ai.
  2. Ship Organisation JSON-LD on every page with consistent name, url, sameAs (your social URLs), and description. Entity clarity is one of the few signals all AI engines consistently use.
  3. Audit which competitor pages get cited for the buyer-questions you care about. If competitor X ranks because of a Reddit thread or a listicle, you can reverse-engineer the same surface.
  4. Measure Share of Voice for 30 days before shipping more. Without a baseline you cannot tell whether any of the above worked.
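For step 2, a minimal Organisation block might look like this. All values are placeholders; embed it on every page in a script tag with type application/ld+json, keeping name, url, and sameAs identical site-wide:

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Analytics",
  "url": "https://acme.com",
  "description": "B2B analytics platform that turns raw product events into board-ready dashboards.",
  "sameAs": [
    "https://www.linkedin.com/company/acme-analytics",
    "https://x.com/acmeanalytics"
  ]
}
```

The sameAs array is what lets engines reconcile your domain with your social profiles, so list every profile you actively maintain.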