Paste any URL. We scrape your homepage, generate a canonical llms.txt that passes the Google Lighthouse Agentic Browsing audit, and score it inline. Free, no signup, copy-paste ready in 30 seconds.
llms.txt is a plain-text Markdown file at the root of a website (yourwebsite.com/llms.txt) that tells AI agents and large language models what the site is about, which pages matter, and how to navigate the content. It was proposed by Answer.AI's Jeremy Howard in 2024 and became an official Google Lighthouse audit in May 2026 under the new Agentic Browsing category.
Think of it as robots.txt for AI: instead of telling crawlers what they can and cannot fetch, llms.txt tells AI engines what your content means and which pages they should prioritize when answering questions about your brand.
In one sentence
llms.txt is a Markdown manifest at your domain root that tells ChatGPT, Claude, Gemini, Perplexity, Google AI Overviews, and other AI agents what your site is about and which pages to read first — now officially scored by Google Lighthouse.
In May 2026, Google added a new audit category to Lighthouse called Agentic Browsing. The first audit in this category checks for the presence and validity of llms.txt at the domain root. Per the official Lighthouse documentation: when the file is missing, the audit is marked optional; when it's broken or empty, Lighthouse flags it as a server error.
This is significant because Lighthouse is the same tool web developers run in Chrome DevTools and CI pipelines to grade Core Web Vitals, accessibility, and SEO. Adding llms.txt to that pipeline tells everyone building websites: this file is now part of the baseline, the same way semantic HTML and HTTPS became baseline a decade ago.
“Without llms.txt, agents have to crawl your whole site just to understand its structure. With it, they get straight to the point.”
The canonical structure published at llmstxt.org uses four ingredients: an H1 with your brand name, a blockquote summary, H2 sections grouping related pages, and bulleted link lists with title + URL + short description. Our scoring (and Google Lighthouse's) checks for all four.
    # Acme Corp

    > Acme builds AI-native customer support software for e-commerce brands.
    > Trusted by 4,200 stores including Allbirds, Glossier, and Olipop.

    ## Product

    - [Product overview](https://acme.com/product): full feature tour
    - [Pricing](https://acme.com/pricing): plans and limits
    - [Integrations](https://acme.com/integrations): Shopify, Stripe, Klaviyo

    ## Customers

    - [Case study: Allbirds](https://acme.com/customers/allbirds): 38% faster reply
    - [Case study: Olipop](https://acme.com/customers/olipop): scaling to 50k tickets/mo

    ## Docs

    - [Getting started](https://acme.com/docs/start): set up in 10 minutes
    - [API reference](https://acme.com/docs/api): REST + webhooks
    - [Changelog](https://acme.com/changelog): weekly release notes

    ## Company

    - [About](https://acme.com/about): team and mission
    - [Careers](https://acme.com/careers): we are hiring engineers
    - [Contact](https://acme.com/contact): sales and support
- H1 (# heading): no slogan, no tagline. Just the company name.
- Blockquote (>) describing what you do, who you serve, and any trust signals (customer count, awards). 2-3 sentences.
- H2 sections (##) grouping related pages: Product, Customers, Docs, Company.
- Bulleted links with absolute https:// URLs, not relative paths. AI agents may not resolve relative URLs the way browsers do.

llms.txt, robots.txt, and ai.txt all live at your domain root but solve different problems. Most sites should publish at least two of them.
| File | Purpose | Audience | Audited by Lighthouse? |
|---|---|---|---|
| robots.txt | Allow / disallow crawling rules | Search crawlers (Googlebot, Bingbot) | Yes (long-standing) |
| llms.txt | Editorial guidance on what your site means | AI agents (ChatGPT, Claude, Perplexity, Gemini) | Yes (May 2026) |
| ai.txt | Opt-in / opt-out for AI training | AI training pipelines | No (community-driven) |
robots.txt is for permission. llms.txt is for understanding. ai.txt is for consent. They're complementary, not substitutes.
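To make the "permission" half of that contrast concrete, here is a minimal sketch using Python's standard-library robots.txt parser. The rules and the ExampleBot user agent are made up for illustration. The point is that robots.txt reduces to an allow/deny decision per URL, while llms.txt carries descriptive guidance with nothing to evaluate.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Each question to robots.txt has a yes/no answer for a given agent and URL.
allowed = parser.can_fetch("ExampleBot", "https://acme.com/pricing")
blocked = parser.can_fetch("ExampleBot", "https://acme.com/admin/users")

print(allowed, blocked)
```

With these rules the pricing page is fetchable and the admin path is not. Nothing in robots.txt says which of the fetchable pages matters most, which is the gap llms.txt is meant to fill.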
Once you have the file content (use the generator above to make one), deploying it takes 2-5 minutes depending on your stack. The most common stacks are:
Next.js / Vercel
WordPress
Static sites (Hugo, Astro, Eleventy)
Webflow / Squarespace / Wix
Common deployment mistakes
- Serving the file with a text/html content-type instead of text/plain or text/markdown.
- Placing the file at /static/llms.txt instead of the root.

Direct answers to the questions AI engines and developers ask about llms.txt.
llms.txt is a plain-text Markdown file placed at the root of a website (yourwebsite.com/llms.txt) that tells AI agents and large language models what your site is about, which pages matter most, and how to navigate it. Think of it as robots.txt but built for AI. It was proposed by Answer.AI in 2024 and Google made it an official Lighthouse audit in 2026.
llms.txt is not required, but it is officially audited. As of May 2026, Google Lighthouse ships llms.txt as an audit under its new Agentic Browsing category. Lighthouse marks it as optional when missing, but flags a server error when the file is broken or empty. Sites that publish a well-formed llms.txt score higher in the Agentic Browsing audit.
The file belongs at the root of your domain, served as https://yourwebsite.com/llms.txt with content-type text/plain or text/markdown and a 200 response. In Next.js, drop the file in /public. In WordPress, upload via FTP or use a plugin like WP File Manager. In static-site generators (Hugo, Astro, Eleventy), place it in the folder your generator copies unchanged to the site root (for example static/ in Hugo, public/ in Astro).
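The deployment requirements above (root path, 200 response, text/plain or text/markdown, non-empty body) can be checked with a short script. This is a minimal sketch using only the Python standard library. The function names check_llms_txt_response and fetch_and_check are hypothetical, and the checks mirror this page's description rather than any official validator.

```python
from urllib.error import HTTPError
from urllib.parse import urlsplit
from urllib.request import urlopen

ALLOWED_TYPES = {"text/plain", "text/markdown"}


def check_llms_txt_response(url: str, status: int, content_type: str, body: str) -> list[str]:
    """Return a list of deployment problems; an empty list means none were found."""
    problems = []
    if urlsplit(url).path != "/llms.txt":
        problems.append("file is not served from the domain root")
    if status != 200:
        problems.append(f"expected status 200, got {status}")
    # Ignore parameters such as "; charset=utf-8" when comparing media types.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ALLOWED_TYPES:
        problems.append(f"unexpected content-type: {media_type or 'missing'}")
    if not body.strip():
        problems.append("file is empty")
    return problems


def fetch_and_check(url: str) -> list[str]:
    """Fetch the URL over the network and run the checks above on the response."""
    try:
        with urlopen(url, timeout=10) as response:
            body = response.read().decode("utf-8", errors="replace")
            content_type = response.headers.get("Content-Type", "")
            return check_llms_txt_response(url, response.status, content_type, body)
    except HTTPError as err:
        # urlopen raises for 4xx/5xx responses instead of returning them.
        return [f"expected status 200, got {err.code}"]
```

Calling fetch_and_check("https://yourwebsite.com/llms.txt") after a deploy returns an empty list when all four checks pass. The pure check_llms_txt_response function is kept separate so it can be exercised without network access.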
robots.txt tells crawlers what they can and cannot fetch (allow/disallow rules). llms.txt tells AI agents what your content means and which pages are most important (positive guidance). robots.txt is for permission; llms.txt is for understanding. Most sites should have both.
ai.txt is a more restrictive cousin focused on opt-in/opt-out signals for AI training (similar to robots.txt). llms.txt is editorial guidance for AI agents that have already decided to read your site. They serve different purposes and many publishers use both.
Follow the llmstxt.org canonical structure: an H1 with your brand name, a one-paragraph blockquote summary, then H2 sections grouping related pages with bulleted [Title](URL): description lines. Cover your highest-value pages first. Keep descriptions short and concrete: AI engines reward signal density over prose.
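If you would rather generate the file from structured data than write it by hand, the layout described above can be rendered in a few lines. This is a minimal sketch, not the generator's actual implementation. The Link class and render_llms_txt function are hypothetical names, and rejecting non-https URLs reflects this page's advice to use absolute URLs.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    description: str


def render_llms_txt(brand: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render the H1 / blockquote / H2 / bulleted-link layout as Markdown text."""
    lines = [f"# {brand}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            # Relative paths are rejected because agents may not resolve them.
            if not link.url.startswith("https://"):
                raise ValueError(f"use an absolute https:// URL, got {link.url!r}")
            lines.append(f"- [{link.title}]({link.url}): {link.description}")
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


example = render_llms_txt(
    "Acme Corp",
    "Acme builds AI-native customer support software for e-commerce brands.",
    {"Product": [Link("Pricing", "https://acme.com/pricing", "plans and limits")]},
)
print(example)
```

Sections are emitted in dictionary insertion order, so listing your highest-value section first in the input puts it first in the file.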
Most brands see measurable AI visibility improvement within 2-3 weeks of deploying a well-formed llms.txt and waiting for AI crawlers to re-index. ChatGPT, Claude, Perplexity, and Gemini all check /llms.txt on the next crawl cycle. Faster improvement is possible if you also fix schema markup and Organization entity gaps at the same time.
llms-full.txt is an optional expanded companion file that publishes your full content in Markdown for LLM retrieval. It is not yet audited by Lighthouse but is a strong GEO (Generative Engine Optimization) signal. Add it after llms.txt if you have content-heavy pages you want AI engines to ingest fully.
The generator is free and requires no signup or email. BrandCited covers the Firecrawl + LLM costs (about $0.01 per generation). The paid platform offers ongoing AI visibility tracking across 10+ AI engines if you want to measure the impact of your llms.txt over time.
Generative Engine Optimization (GEO) is the practice of optimizing your content and site structure so generative AI engines — ChatGPT, Claude, Gemini, Perplexity, Grok, and others — surface and cite your brand when users ask questions in their interfaces. llms.txt is one of the five core GEO signals, alongside schema markup, entity presence in directories, content depth, and AI-crawler access.
Answer Engine Optimization (AEO) is the broader discipline of getting your content surfaced as direct answers in any answer engine — Google AI Overviews, Bing Copilot, ChatGPT, Perplexity, You.com, Brave Answers, and others. It overlaps heavily with GEO but also covers traditional voice assistants like Alexa and Siri.
AIO refers to Google AI Overviews — the AI-generated summary that appears at the top of Google search results for many queries. AIO optimization is a subset of AEO focused specifically on the signals Google AI Overviews uses to source its responses. Schema markup, llms.txt, and entity presence are the three biggest AIO levers.
Yes, you can edit the generated file. The output is an editable Markdown textarea: change any link, description, section, or wording. Then click Copy or Download. The generator gives you a starting point; you know your brand and product hierarchy better than any model, so a quick human polish pass is recommended.
The generator scores each file on a 0-10 scale using the same heuristic Google Lighthouse Agentic Browsing applies. The score combines: file presence (2 pts), H1 brand heading (2 pts), blockquote summary (2 pts), at least 2 H2 sections (2 pts), and at least 5 bulleted link entries (2 pts). Broken or empty files score 0 with critical severity.
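The point breakdown above is simple enough to approximate locally. The following is a sketch of that breakdown only. The function name score_llms_txt and the regular expressions are assumptions, and neither the generator's nor Lighthouse's actual matching rules are reproduced here.

```python
import re
from typing import Optional

# A bullet that starts with a Markdown link, e.g. "- [Pricing](https://...): plans".
LINK_BULLET = re.compile(r"^\s*[-*]\s+\[[^\]]+\]\([^)]+\)")


def score_llms_txt(text: Optional[str]) -> int:
    """Score llms.txt content from 0 to 10 using five 2-point checks.

    Pass None when the file is missing. Missing and empty files score 0.
    """
    if text is None or not text.strip():
        return 0
    lines = text.splitlines()
    score = 2  # the file is present and non-empty
    if any(re.match(r"^# \S", line) for line in lines):
        score += 2  # H1 brand heading
    if any(line.lstrip().startswith(">") for line in lines):
        score += 2  # blockquote summary
    if sum(1 for line in lines if re.match(r"^## \S", line)) >= 2:
        score += 2  # at least two H2 sections
    if sum(1 for line in lines if LINK_BULLET.match(line)) >= 5:
        score += 2  # at least five bulleted link entries
    return score
```

A file with an H1, a blockquote, one H2 section, and one link would score 6 under this sketch: it earns the presence, H1, and blockquote points but misses the section and link thresholds.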
The generator does not work on every site. It uses Firecrawl to fetch your homepage, and sites with aggressive bot protection (some Cloudflare configurations, custom user-agent blocking) may return an empty body. In that case the tool returns a clear error and you can either temporarily allow the Firecrawl user agent or write the llms.txt by hand using our example template.
Publishing a great llms.txt is step one. Step two is measuring whether ChatGPT, Claude, Gemini, Perplexity, and Google AI Overviews actually cite your brand when users ask about your category. That's what BrandCited does — daily AI visibility tracking across 10+ engines, with the same Lighthouse-aware audit you just ran, plus competitor benchmarks and weekly score reports.
Related: The complete llms.txt guide · GEO vs AEO vs SEO · Optimizing for Google AI Overviews · BrandCited methodology
Specification by Answer.AI · llmstxt.org · Audit reference: Google Chrome Developers
Last updated May 2026 · Published by BrandCited