llms.txt Explained: Does It Actually Move Citations?

Alejandro Rioja

July 22, 2026 · 7 min read

TL;DR

llms.txt is a plain-text file at yoursite.com/llms.txt that tells AI crawlers which pages to prioritize. Perplexity actively reads it; ChatGPT and Bing Copilot probably do not yet. It takes 20 minutes to implement and costs nothing — do it, but don't expect a citation spike next week.

Table of contents

What llms.txt actually is
What the file looks like
Which AI engines actually read it
How to implement it in 20 minutes
Should you also create llms-full.txt?
What the data actually shows
What to put in the blockquote
The operator’s bottom line

What llms.txt actually is

Think of it as a robots.txt for AI crawlers, but inverted. robots.txt says “don’t crawl this.” llms.txt says “when you’re building context about my site, here’s what matters most.”

The spec was proposed in late 2024 by Jeremy Howard (of fast.ai). The idea: put a file at yoursite.com/llms.txt that lists your most important pages in plain Markdown. An AI crawler scraping your site for context can read that file and know immediately what to prioritize — instead of guessing by PageRank or crawl depth.

There’s also an optional llms-full.txt variant that includes the full text of your key pages concatenated into one document. Some crawlers prefer that format because it reduces round-trips.

Neither file is a W3C standard yet. It’s a community proposal with growing adoption among technical founders and content teams.

What the file looks like

Here’s the llms.txt I run for alejandrorioja.com:

# Alejandro Rioja

> Operator, AI consultant, and founder of Pickleland. I write about GEO, AI agents, and growth for founders.

## Core pages

- [About](https://alejandrorioja.com/about/): Background, consulting services, and how to work with me.
- [Blog](https://alejandrorioja.com/blog/): All posts on GEO, SEO, AI agents, and founder growth.
- [Consultation](https://alejandrorioja.com/consultation/30/): Book a 30-minute paid session.

## Top posts

- [How to Get Cited in ChatGPT Answers](https://alejandrorioja.com/blog/how-to-get-cited-in-chatgpt-answers/): The GEO playbook I use across client sites.
- [AI Agent Architecture for Founders](https://alejandrorioja.com/blog/ai-agent-architecture-for-founders/): How to design multi-agent systems without a full eng team.
- [GEO vs SEO](https://alejandrorioja.com/blog/geo-vs-seo/): What changes when Google is no longer the only search engine that matters.

## Optional: ignore

- /drafts/
- /admin/

A few things to notice:

The H1 is your brand name.
The blockquote is a 1-2 sentence description of who you are. This is the most important line — it’s what an LLM will use to build a fast mental model of your site.
Sections group pages by purpose.
URLs are absolute. Some crawlers don’t resolve relative paths.
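The ## Optional: ignore section is not officially in the spec, but some implementations read it like robots.txt Disallow lines. Don’t confuse it with the section the spec does define, titled exactly Optional, which lists links a consumer may skip when it needs a shorter context.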
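
The points above can be turned into a quick structural check before you deploy. This is a minimal sketch, not a spec-compliant parser: the names lintLlmsTxt and LlmsLint and the issue messages are hypothetical, and it only looks at the H1, the first blockquote line, the H2 sections, and whether link URLs are absolute.

```typescript
// Minimal structural lint for an llms.txt file.
// Names and messages are illustrative assumptions, not part of the spec.
interface LlmsLint {
  title: string | null;   // text of the first H1
  summary: string | null; // text of the first blockquote line
  sections: string[];     // H2 headings, in document order
  issues: string[];       // human-readable problems found
}

function lintLlmsTxt(source: string): LlmsLint {
  const result: LlmsLint = { title: null, summary: null, sections: [], issues: [] };
  // Matches "- [Label](url)" list items and captures the label and URL.
  const linkPattern = /^-\s*\[([^\]]+)\]\(([^)]+)\)/;

  for (const raw of source.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("# ") && result.title === null) {
      result.title = line.slice(2).trim();
    } else if (line.startsWith("## ")) {
      result.sections.push(line.slice(3).trim());
    } else if (line.startsWith("> ") && result.summary === null) {
      result.summary = line.slice(2).trim();
    } else {
      const match = linkPattern.exec(line);
      // Flag links that do not start with http:// or https://.
      if (match && !/^https?:\/\//.test(match[2])) {
        result.issues.push(`Relative URL in link "${match[1]}": ${match[2]}`);
      }
    }
  }

  if (result.title === null) result.issues.push("Missing H1 with the brand name");
  if (result.summary === null) result.issues.push("Missing blockquote description");
  if (result.sections.length === 0) result.issues.push("No H2 sections found");
  return result;
}
```

Run it against the generated file in CI and fail the build when issues is non-empty. Bare paths like the ones under ## Optional: ignore are not Markdown links, so this sketch leaves them alone.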

Which AI engines actually read it

This is where I have to be honest with you: the landscape is fragmented and partially undocumented.

Perplexity: Yes, confirmed. Perplexity’s crawler (PerplexityBot) reads llms.txt when it indexes sites. Their engineering team has referenced the spec publicly. If Perplexity is a meaningful referral source for you, implementing llms.txt has a clear path to impact. Perplexity is also the highest-ROI GEO target for independent operators right now. See where to spend your GEO effort across Perplexity, ChatGPT, and Google AI Overviews for the full platform comparison.

ChatGPT / OpenAI: Not confirmed. OpenAI’s crawler (GPTBot) does not appear to read llms.txt as of mid-2026. Its crawl behavior is governed by robots.txt and OpenAI’s own internal prioritization. There’s no public statement from OpenAI acknowledging the spec.

Bing Copilot / Microsoft: Not confirmed. Similar situation to OpenAI. Bing’s crawler (Bingbot) follows robots.txt, but there’s no signal that it reads llms.txt.

Google AI Overviews / Gemini: Not confirmed. Google has its own structured-data ecosystem (schema.org, sitemaps) and has not indicated it will adopt third-party specs.

Anthropic: Not confirmed. Anthropic’s crawler (ClaudeBot) crawls the web for training data. There’s no public documentation that it reads llms.txt, but several GEO practitioners report better Claude citations after implementing it. That is correlation, not causation, but worth noting.

Smaller AI search engines: You.com, Phind, and several vertical AI search tools have stated or implied they read llms.txt. The spec is easier for smaller teams to adopt because they don’t have years of crawl infrastructure to refactor.

The honest summary: right now, llms.txt is a Perplexity optimization with some speculative benefit elsewhere. That ratio will probably shift as the spec matures.
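
Because so much of this is undocumented, your own access logs are the most direct evidence of who fetches the file. The sketch below counts requests for /llms.txt per crawler. It rests on assumptions: the function name countLlmsTxtHits is hypothetical, the log lines are assumed to contain both the request line and the User-Agent (as in the common "combined" format), and the tokens are just the crawler names mentioned in this post, matched as case-insensitive substrings. Real User-Agent strings may differ, and a user agent can be spoofed.

```typescript
// Crawler tokens named in this post; real User-Agent strings may differ.
const AI_BOT_TOKENS = ["PerplexityBot", "GPTBot", "ClaudeBot", "bingbot"];

// Counts GET requests for /llms.txt per crawler token in raw access-log lines.
// Assumes each line contains the quoted request line and the User-Agent string.
function countLlmsTxtHits(logLines: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const token of AI_BOT_TOKENS) counts[token] = 0;

  for (const line of logLines) {
    // Match only the exact path /llms.txt (optionally with a query string),
    // so requests for /llms-full.txt are not counted.
    if (!/"GET \/llms\.txt(\?[^ "]*)? HTTP/.test(line)) continue;

    const lower = line.toLowerCase();
    for (const token of AI_BOT_TOKENS) {
      if (lower.indexOf(token.toLowerCase()) !== -1) counts[token] += 1;
    }
  }
  return counts;
}
```

A zero for a given bot over a few weeks does not prove it never reads the file, but a steady non-zero count is a stronger signal than any third-party claim.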

How to implement it in 20 minutes

If you’re on a static site (Astro, Next.js with static export, Hugo, etc.), create the file in your static assets directory: public/llms.txt for Astro and Next.js, static/llms.txt for Hugo. It will be served at the root.

For a Next.js app router site, you can generate it dynamically:

// app/llms.txt/route.ts
import { allPosts } from "@/lib/content";

export async function GET() {
  const topPosts = allPosts
    .filter((p) => p.featured || p.views > 1000)
    .slice(0, 10);

  const lines = [
    "# Alejandro Rioja",
    "",
    "> Operator, AI consultant, founder of Pickleland. I write about GEO, AI agents, and growth for founders.",
    "",
    "## Top posts",
    "",
    ...topPosts.map(
      (p) => `- [${p.title}](https://alejandrorioja.com/blog/${p.slug}/): ${p.description}`
    ),
    "",
    "## Core pages",
    "",
    "- [About](https://alejandrorioja.com/about/): Services and background.",
    "- [Consultation](https://alejandrorioja.com/consultation/30/): Book a session.",
  ];

  return new Response(lines.join("\n"), {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}

For an Astro site, the equivalent is a .txt.ts endpoint in src/pages/:

// src/pages/llms.txt.ts
import type { APIRoute } from "astro";
import { getCollection } from "astro:content";

export const GET: APIRoute = async () => {
  const posts = await getCollection("posts", (p) => p.data.lang === "en");
  const top = posts
    .sort((a, b) => b.data.pubDate.valueOf() - a.data.pubDate.valueOf())
    .slice(0, 10);

  const body = [
    "# Alejandro Rioja",
    "",
    "> AI consultant and operator. Writing about GEO, AI agents, and founder growth.",
    "",
    "## Recent posts",
    "",
    ...top.map(
      (p) =>
        `- [${p.data.title}](https://alejandrorioja.com/blog/${p.slug}/): ${p.data.description}`
    ),
  ].join("\n");

  return new Response(body, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
};

After deploying, verify it with curl -s https://yoursite.com/llms.txt. If you see Markdown, you’re done.
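
If you’d rather automate that check, here is a small sketch of what "you see Markdown" can mean in code. The FetchedFile shape and the checkLlmsTxtResponse name are hypothetical. The function takes plain values rather than doing the request itself, so you can feed it from fetch, a test fixture, or a monitoring job. Accepting text/plain or text/markdown is my assumption about reasonable content types, not a requirement I can point to in the spec.

```typescript
// The response fields this check cares about.
interface FetchedFile {
  status: number;
  contentType: string; // value of the Content-Type header, or "" if absent
  body: string;
}

// Returns a list of problems; an empty list means the response looks right.
function checkLlmsTxtResponse(res: FetchedFile): string[] {
  const problems: string[] = [];
  const type = res.contentType.toLowerCase();

  if (res.status !== 200) {
    problems.push(`Expected HTTP 200, got ${res.status}`);
  }
  if (type.indexOf("text/plain") !== 0 && type.indexOf("text/markdown") !== 0) {
    problems.push(`Unexpected Content-Type: ${res.contentType || "(none)"}`);
  }
  // A catch-all route or SPA fallback can answer 200 with the HTML shell.
  if (/^\s*<(!doctype|html)\b/i.test(res.body)) {
    problems.push("Body looks like HTML, not Markdown");
  }
  // Look for a line that starts with "# " followed by text.
  if (!/^# \S/m.test(res.body)) {
    problems.push("No Markdown H1 found");
  }
  return problems;
}
```

To use it against a live site, pass it the status, the content-type header, and the text body from a fetch call to https://yoursite.com/llms.txt.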

Should you also create llms-full.txt?

Maybe. llms-full.txt is a concatenated dump of your key pages — title, URL, and full body text, one page after another, separated by ---. The idea is that a crawler can grab the whole thing in one request and have enough context to answer questions about your site without crawling individual pages.

The tradeoff: it’s a large file. Uncapped, mine ran about 400KB for the top 30 posts. Some crawlers may time out or truncate it. Others may weight it more heavily because the content is pre-digested.

My current approach: generate llms-full.txt but cap it at the 15 highest-performing posts by traffic. Keep it under 250KB. Regenerate on every deploy.
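
That capping logic can be sketched as a pure function you call at build time. Everything named here is an assumption for illustration: the Page shape, the monthlyViews field as the traffic metric, and the "# title, URL line, body" layout of each entry. The 15-page and 250KB defaults come from the approach above, and --- is the separator described earlier.

```typescript
interface Page {
  title: string;
  url: string;
  body: string;
  monthlyViews: number; // hypothetical traffic metric used for ranking
}

// UTF-8 byte length of a string, computed without platform APIs.
function utf8Length(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x80) bytes += 1;
    else if (c < 0x800) bytes += 2;
    else if (c >= 0xd800 && c <= 0xdbff) { bytes += 4; i++; } // surrogate pair
    else bytes += 3;
  }
  return bytes;
}

// Builds llms-full.txt from the highest-traffic pages, within both caps.
function buildLlmsFull(pages: Page[], maxPages = 15, maxBytes = 250 * 1024): string {
  const ranked = pages
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.monthlyViews - a.monthlyViews)
    .slice(0, maxPages);

  const separator = "\n\n---\n\n";
  const parts: string[] = [];
  let size = 0;

  for (const page of ranked) {
    const entry = `# ${page.title}\n\nURL: ${page.url}\n\n${page.body.trim()}`;
    const cost = utf8Length(entry) + (parts.length > 0 ? utf8Length(separator) : 0);
    // Stop at the first page that would exceed the byte budget, so the file
    // keeps only the highest-traffic pages.
    if (size + cost > maxBytes) break;
    parts.push(entry);
    size += cost;
  }
  return parts.join(separator);
}
```

Stopping at the first page that doesn’t fit, instead of skipping it and trying smaller ones, is a deliberate choice in this sketch: it keeps the file strictly ordered by traffic.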

What the data actually shows

I’ve been monitoring Perplexity citations for this site and three client sites since January 2026. Here’s what I’ve observed:

Sites with llms.txt: Average of 2.3x more Perplexity citations per month compared to their pre-implementation baseline. Sample size: 4 sites, 4 months of data. That is far too small to be statistically significant at any reasonable confidence level.
The confounding factor: Every site that added llms.txt also did other GEO work around the same time (better structured data, cleaner headings, more specific answer formatting). Attribution is impossible.
ChatGPT citations: No measurable difference on any site after adding llms.txt. Consistent with the lack of confirmed support.

The honest interpretation: llms.txt probably helps with Perplexity. The mechanism is clear — Perplexity reads it. Whether the lift is from llms.txt specifically or from the general GEO improvements that tend to accompany it, I can’t say yet.
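
If you want to track the same kind of number for your own sites, one plausible way to compute it is a per-site ratio of monthly citations after versus before, then the mean across sites. This is a sketch of that arithmetic, not a description of how the figures above were produced. The SiteCitations shape and the citationLift name are hypothetical. Reporting the minimum and maximum next to the mean matters with a handful of sites, because one outlier can carry the average.

```typescript
interface SiteCitations {
  site: string;
  baselinePerMonth: number; // average monthly citations before llms.txt
  afterPerMonth: number;    // average monthly citations after llms.txt
}

interface LiftSummary {
  perSite: number[]; // after / baseline ratio for each usable site
  mean: number;
  min: number;
  max: number;
}

function citationLift(sites: SiteCitations[]): LiftSummary {
  // A zero baseline makes the ratio undefined, so those sites are skipped.
  const perSite = sites
    .filter((s) => s.baselinePerMonth > 0)
    .map((s) => s.afterPerMonth / s.baselinePerMonth);

  if (perSite.length === 0) {
    throw new Error("No site has a non-zero baseline");
  }

  const mean = perSite.reduce((sum, ratio) => sum + ratio, 0) / perSite.length;
  return { perSite, mean, min: Math.min(...perSite), max: Math.max(...perSite) };
}
```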

What to put in the blockquote

The one-line description in the blockquote is the part I’d spend the most time on. This is the text an LLM will use to summarize who you are in a RAG context. It needs to be:

Specific: “AI consultant who runs production agents for SMBs” beats “entrepreneur and consultant.”
Keyword-aware: Include the terms you want to be cited for. If you want citations for “GEO,” put “GEO” in that line.
Entity-anchored: Mention proper nouns that help an LLM disambiguate you. Your name + your company + your city beats just your name.

Bad: > Helping businesses grow with AI.

Better: > Alejandro Rioja — AI consultant in Austin TX, founder of Pickleland, writing about GEO, AI agents, and founder growth since 2019.
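
The three criteria can be roughly approximated in code, which is handy if the blockquote is generated from CMS fields. Treat this as a sketch under loose assumptions: checkBlockquote is a hypothetical name, keywords are matched as case-insensitive substrings (so "GEO" would also match inside a longer word), entities are matched exactly, and the 8-word minimum is an arbitrary stand-in for "specific".

```typescript
// Returns a list of problems with a blockquote line; empty means it passes.
function checkBlockquote(text: string, keywords: string[], entities: string[]): string[] {
  const problems: string[] = [];
  // Drop the leading "> " marker if present.
  const body = text.replace(/^>\s*/, "").trim();
  const lower = body.toLowerCase();

  // Keyword-aware: each target term should appear somewhere in the line.
  for (const keyword of keywords) {
    if (lower.indexOf(keyword.toLowerCase()) === -1) {
      problems.push(`Missing keyword: ${keyword}`);
    }
  }
  // Entity-anchored: proper nouns must appear with their exact casing.
  for (const entity of entities) {
    if (body.indexOf(entity) === -1) {
      problems.push(`Missing entity: ${entity}`);
    }
  }
  // Specific: a crude proxy, since very short lines are rarely specific.
  const wordCount = body.split(/\s+/).filter((w) => w.length > 0).length;
  if (wordCount < 8) {
    problems.push("Too short to be specific (fewer than 8 words)");
  }
  return problems;
}
```

With keywords GEO and AI agents and entities Alejandro Rioja, Pickleland, and Austin, the "Bad" line above fails every check and the "Better" line passes.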

The operator’s bottom line

llms.txt takes 20 minutes to implement, costs nothing to serve, and has a confirmed read path with Perplexity. Do it. The spec will either become a real standard (in which case early adopters win) or fade out (in which case you’ve lost 20 minutes). The asymmetry is obvious. Just don’t let it distract you from the higher-ROI GEO work: structured data, clear entity signals, and answers formatted for snippet extraction. Those move every AI engine. llms.txt currently moves one. For the broader playbook, see how to get your brand cited in ChatGPT answers and which schema types punch above their weight for AI engines.

Free tool: generate a per-bot robots.txt (GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot, and the rest) with the AI robots.txt generator instead of hand-writing the directives.

Want help setting up your GEO stack? Get in touch — I run GEO consulting projects and can audit your citation surface across Perplexity, ChatGPT, and Google AI Overviews.
