The Complete Guide to llms.txt: What It Is, How to Create One, and Why It Matters for AI Search

llms.txt is a Markdown file placed at the root of your website that tells AI crawlers which pages matter most and how to describe them. Think of it as a curated table of contents written specifically for large language models.
This guide covers what the specification actually says, who proposed it, which bots currently read it, and how to create one for your site, including an honest assessment of where adoption stands right now.
Quick Answer
Create a Markdown file called llms.txt listing 10 to 20 important URLs with brief descriptions. Host it at https://yourdomain.com/llms.txt. Make sure AI bots are not blocked in your robots.txt. Keep it in sync with your sitemap and schema markup. The whole thing takes about 20 minutes to set up manually, or you can use Reaudit's free llms.txt generator to automate it.
Where llms.txt Came From
The llms.txt specification was proposed by Jeremy Howard, co-founder of Answer.AI (and previously co-founder of fast.ai), on September 3, 2024. The idea was straightforward: websites already have robots.txt to tell crawlers what they can and cannot access, but there was nothing to help LLMs understand what a site is actually about and which pages are most important.
Howard's proposal defined two files: /llms.txt as a navigation aid listing key pages with descriptions, and /llms-full.txt as an optional companion containing the full content of the site in a single Markdown document. The specification lives at llmstxt.org.
The concept gained traction quickly. Notable early adopters include Anthropic, Stripe, Cursor, Mintlify, Zapier, and FastHTML. As of mid-2025, 951 domains had published llms.txt files: a tiny fraction of the web, but a growing one.
The Format: How llms.txt Is Structured
The specification is deliberately simple. The file uses standard Markdown with a few conventions:
An H1 heading (#) with the project or brand name.
An optional blockquote (>) with a one-line summary.
H2 headings (##) to group pages into logical sections.
Bullet-point links in the format - [Title](URL): Brief description.
Here is a minimal example:
# Acme SaaS
> Cloud-based invoicing platform for small and medium businesses.
## Key Pages
- [Home](https://acme.io): Main landing page with product overview.
- [Pricing](https://acme.io/pricing): Subscription tiers and feature comparison.
## Documentation
- [API Guide](https://acme.io/docs/api): REST API reference for developers.
- [Getting Started](https://acme.io/docs/start): Quickstart tutorial for new users.
## Support
- [Help Center](https://acme.io/help): FAQs and troubleshooting guides.
The /llms-full.txt companion file follows the same Markdown format but contains the full text content of the site's key pages, concatenated into a single document. This lets LLMs ingest your entire knowledge base in one request.
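As a minimal sketch, assuming you keep local Markdown copies of each listed page (the content/ paths here are hypothetical), a shell concatenation assembles the body in one step; prepend the same H1 and blockquote summary by hand:
cat content/index.md content/pricing.md content/docs/api.md content/docs/start.md content/help.md > llms-full.txt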
How llms.txt Differs From robots.txt
These two files solve different problems. robots.txt controls access — it tells crawlers which URLs they are allowed or disallowed from visiting. It is a set of directives, not content.
llms.txt provides context. It does not control access; it offers a human-readable summary that AI models can use to understand what your site covers and which pages to prioritize. The two files complement each other: robots.txt must allow the AI bots you want to reach your llms.txt, otherwise the file will never be fetched.
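For example, a robots.txt that explicitly welcomes the major AI crawlers (using the user-agent tokens those bots publish) might include:
User-agent: GPTBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
If your robots.txt has no Disallow rules these bots are allowed by default, but an explicit record documents intent.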
Which AI Bots Actually Read llms.txt? An Honest Assessment
This is where many guides oversell the current state of things, so here is the straight picture.
No major LLM provider has officially committed to using llms.txt as a ranking or citation signal. Google's John Mueller stated in 2025 that Google has not adopted it. Analysis by Averi of 1,000 domains in August 2025 found zero visits from major LLM crawlers (GPTBot, ClaudeBot, PerplexityBot) to llms.txt files specifically.
That said, the picture is more nuanced than "nobody reads it." Profound's data showed some evidence of Microsoft and OpenAI bots crawling llms.txt files. Several AI platforms have adopted it for their own sites (Anthropic publishes one at anthropic.com/llms.txt). And the specification is gaining mindshare in the developer community; it is a low-effort implementation that positions your site for whatever comes next.
The honest take: llms.txt is an emerging convention, not an established standard. Implementing it costs almost nothing and cannot hurt your AI visibility. Whether it will become a meaningful signal depends on whether major AI providers decide to officially adopt it. The early signs are promising but not definitive.
Step-by-Step: Creating an llms.txt File
Option 1: Manual Creation
Pick 10 to 20 of your most important pages. These should be your highest-value evergreen content: homepage, product pages, pricing, key guides, help center, and case studies. Write a one-line summary for each page (keep descriptions under 20 words). Group them under logical H2 headings. Save the file as llms.txt in UTF-8 encoding and upload it to your web root.
Test it by running curl -I https://yourdomain.com/llms.txt; you should get a 200 response. If you also want to create llms-full.txt, concatenate the full Markdown content of each listed page into one file.
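For the curl check, a healthy response looks something like this (headers vary by server; the 200 status line is what matters, and the content type may be text/plain or text/markdown depending on configuration):
curl -I https://yourdomain.com/llms.txt
HTTP/2 200
content-type: text/plain; charset=utf-8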
Option 2: Using Reaudit's Free Generator
If you do not want to build it by hand, Reaudit's free llms.txt generator crawls your site, extracts page titles and meta descriptions, and assembles a ready-to-publish file. Enter your domain, review the auto-generated output, edit descriptions where needed, and download the file. The tool formats everything according to the specification so you do not have to worry about syntax.
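Under the hood, a generator like this visits each listed page and pulls the title and meta description out of the HTML. As a rough sketch of that step, assuming a page whose title tag sits on a single line, a curl and grep pair approximates it:
curl -s https://yourdomain.com/pricing | grep -oE "<title>[^<]*</title>"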
Real-World Examples by Industry
SaaS Platform
# CloudMetrics
> Real-time analytics platform for SaaS businesses.
## Product
- [Features](https://cloudmetrics.io/features): Dashboard, reporting, and integrations.
- [Pricing](https://cloudmetrics.io/pricing): Free tier through enterprise plans.
## Resources
- [API Documentation](https://cloudmetrics.io/docs): REST API with code examples.
- [Blog](https://cloudmetrics.io/blog): Product updates and analytics guides.
E-commerce Store
# GreenThread
> Sustainable fashion retailer shipping across the EU.
## Collections
- [Women's Dresses](https://greenthread.com/women/dresses): Organic cotton seasonal collection.
- [Men's Outerwear](https://greenthread.com/men/outerwear): Recycled material jackets and coats.
## Customer Info
- [Shipping Policy](https://greenthread.com/shipping): Free EU delivery over €50.
- [Returns](https://greenthread.com/returns): 30-day return window.
B2B Services
# BrightPath Consulting
> Digital transformation consulting for mid-market companies.
## Services
- [AI Strategy](https://brightpath.io/ai-strategy): Assessment and implementation roadmap.
- [Data Engineering](https://brightpath.io/data): Pipeline design and cloud migration.
## Proof
- [Case Studies](https://brightpath.io/cases): Client results across 12 industries.
Best Practices
Keep it lean. 10 to 20 URLs is the sweet spot. Stuffing in 50+ links dilutes the signal, and some crawlers may truncate the file.
Prioritize evergreen content. Product pages, help docs, and foundational guides are better candidates than time-sensitive blog posts.
Match your robots.txt. If your robots.txt blocks PerplexityBot or ClaudeBot, they will never reach your llms.txt. Make sure AI bots you want to read the file are explicitly allowed.
Add schema markup to linked pages. llms.txt points AI crawlers to your best content; schema markup on those pages (FAQPage, HowTo, Article, Organization) helps AI models extract structured answers from them. The combination is more powerful than either alone; a minimal example appears at the end of this section.
Update quarterly. Remove dead links, add new important pages, refresh descriptions. Treat it like your sitemap: a living document, not a set-and-forget file.
Version control it. Keep a copy in Git. You will want to track changes over time, especially as the specification evolves.
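To make the schema point concrete, here is a minimal FAQPage snippet of the kind you would place on a help page listed in your llms.txt (the question and answer text are placeholders):
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is your return window?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Items can be returned within 30 days of delivery."
    }
  }]
}
</script>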
Common Mistakes
Using HTML instead of Markdown. The spec calls for Markdown. HTML will not be parsed correctly by tools designed for the llms.txt format.
Blocking AI bots in robots.txt. This is the single most common issue. Your llms.txt is invisible if the bots cannot access it.
Overloading with too many URLs. More is not better. A focused list of your best pages outperforms a dump of every URL on your site.
Forgetting to test. After uploading, verify the file returns a 200 status code and the content renders correctly as plain text.
Neglecting the linked pages themselves. An llms.txt file pointing to poorly structured pages without schema markup is a missed opportunity. Optimize the destination pages too.
Where llms.txt Fits in Your AI Visibility Strategy
llms.txt is one piece of a broader AI optimization stack. It works best alongside three other elements:
Sitemaps ensure all your pages are discoverable by search crawlers. robots.txt controls which bots can access your site and which directories they can reach. Schema markup gives AI models the structured data they need to extract precise answers.
When all four layers are aligned — sitemap for discovery, robots.txt for access control, llms.txt for context, and schema for structured answers — you give AI platforms the best possible conditions to cite your content accurately.
For monitoring, Reaudit tracks which AI bots visit your site (GPTBot, ClaudeBot, PerplexityBot, and 50+ others), which pages they crawl most frequently, and whether your brand is being cited in AI-generated answers across 11 platforms.
Checklist
✅ Create /llms.txt with 10-20 URLs and brief descriptions in Markdown format.
✅ Optionally create /llms-full.txt with full page content.
✅ Verify the file returns a 200 response: curl -I https://yourdomain.com/llms.txt
✅ Confirm robots.txt allows PerplexityBot, ClaudeBot, GPTBot, and OAI-SearchBot.
✅ Add FAQPage, HowTo, and Article schema to pages listed in llms.txt.
✅ Monitor server logs for AI bot user-agents fetching the file.
✅ Schedule quarterly reviews to update URLs and descriptions.
FAQ
What is llms.txt and who created it?
llms.txt is a Markdown file that helps AI crawlers understand your site's most important pages. It was proposed by Jeremy Howard, co-founder of Answer.AI, in September 2024. The specification is published at llmstxt.org.
How does llms.txt differ from robots.txt?
robots.txt controls crawl access (allow or disallow). llms.txt provides context — a curated index with descriptions that AI models can use to understand what your site covers. Both files should be present.
Do AI bots actually read llms.txt right now?
Adoption is still early. No major LLM provider has officially committed to using it as a citation signal. Some evidence shows Microsoft and OpenAI bots visiting llms.txt files, and several AI companies publish their own. It is a low-cost implementation worth having.
How many URLs should I include?
Aim for 10 to 20 high-value evergreen pages. Including too many dilutes the signal, and some crawlers may truncate longer files.
Does llms.txt affect Google rankings?
No. Google's traditional search relies on sitemaps, structured data, and its own ranking algorithms. llms.txt is designed for AI-generated answers, not traditional SERP rankings.
Can I generate llms.txt automatically?
Yes. Reaudit's free llms.txt generator crawls your site and produces a ready-to-publish file formatted to the specification.
Should I also create llms-full.txt?
If your site has substantial documentation or knowledge base content, yes. The full-text companion file lets AI models ingest your complete content in a single request, which can improve the accuracy of how they represent your brand.
How do I verify AI bots are reading my llms.txt?
Check your server access logs for user-agents like PerplexityBot, ClaudeBot, and GPTBot. Reaudit's bot tracking dashboard surfaces these requests automatically across React, WordPress, Webflow, and Wix sites.
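As a quick sketch, assuming an Nginx access log at its default path (adjust the path for your server), a grep like this surfaces those requests:
grep -iE "gptbot|claudebot|perplexitybot" /var/log/nginx/access.log | grep "/llms.txt"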