Anthropic · training bot · Last updated 2026-05-22
How to allow or block ClaudeBot and the related Anthropic crawlers (Claude-SearchBot, Claude-User, anthropic-ai) in your robots.txt. Required directives + recommended setup.
ClaudeBot is Anthropic's primary training-data crawler for Claude. Anthropic operates four related user-agents: ClaudeBot (training), Claude-SearchBot (real-time search indexing for Claude's web-search feature), Claude-User (on-demand fetches when a Claude user references a specific URL), and the legacy anthropic-ai user-agent (older training crawler still respected). Each is a separate User-agent block in robots.txt.
Claude is a knowledge-presence engine: for most queries it answers from its training corpus, not live retrieval. Allowing ClaudeBot is how your content enters that training corpus and becomes citable when users ask Claude questions about your brand or category. Blocking ClaudeBot keeps new content out of future training data, which limits how far your Claude visibility can grow. Anthropic documents its crawler user-agents at docs.anthropic.com; verify the user-agent strings against that canonical source.
BrandCited recommendation
Allow all four Anthropic user-agents (ClaudeBot, Claude-SearchBot, Claude-User, anthropic-ai). They serve distinct surfaces and blocking any one creates a specific blind spot. Most brands should allow all; only block specific paths (/admin, /api, /dashboard) the same way you do for other crawlers.
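The exact directives to add to your robots.txt for ClaudeBot and the related Anthropic crawlers. Paste them at the end of your file. A crawler that finds a block naming its user-agent follows that block instead of the wildcard (User-agent: *) block, so any path exclusions you want applied must be repeated in each bot-specific block.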
robots.txt
# Allow all four Anthropic / Claude crawlers
User-agent: ClaudeBot
Allow: /
Disallow: /admin
Disallow: /api/
Disallow: /dashboard
User-agent: Claude-SearchBot
Allow: /
Disallow: /admin
Disallow: /api/
Disallow: /dashboard
User-agent: Claude-User
Allow: /
Disallow: /admin
Disallow: /api/
Disallow: /dashboard
# Legacy user-agent — still in some training pipelines
User-agent: anthropic-ai
Allow: /
Disallow: /admin
Disallow: /api/
Disallow: /dashboard
ClaudeBot crawls broadly for training data; Anthropic schedules its visits. Claude-User performs on-demand fetches when a Claude user pastes or references a URL in conversation. Blocking ClaudeBot opts you out of training; blocking Claude-User makes Claude return "I couldn't fetch that URL" to users who reference your page directly. Almost always allow both.
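To sanity-check how the robots.txt block above is evaluated, here is a minimal Python sketch of the two matching rules it relies on: a crawler uses the group that names it instead of the wildcard group, and within a group the longest matching path wins (with Allow winning a tie). The helper names parse_groups and is_allowed, and the /private wildcard rule, are hypothetical and only for illustration. The sketch assumes plain prefix paths (no * or $ patterns) and an exact, case-insensitive user-agent token; it is not Anthropic's implementation.

```python
# Minimal sketch of robots.txt matching (assumptions: prefix paths only,
# no * or $ wildcards, user-agent compared as an exact lowercase token).

ROBOTS_TXT = """\
User-agent: *
Disallow: /private

User-agent: ClaudeBot
Allow: /
Disallow: /admin
Disallow: /api/
Disallow: /dashboard
"""


def parse_groups(text):
    """Return {lowercased user-agent: [(directive, path), ...]}."""
    groups = {}
    agents = []            # user-agents named by the group being read
    reading_rules = False  # True once the current group has a rule line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if reading_rules:  # a User-agent line after rules starts a new group
                agents, reading_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            reading_rules = True
            for agent in agents:
                groups[agent].append((field, value))
    return groups


def is_allowed(text, user_agent, path):
    """Apply the longest matching rule from the crawler's own group."""
    groups = parse_groups(text)
    # A named group replaces the wildcard group; the two are not merged.
    rules = groups.get(user_agent.lower(), groups.get("*", []))
    best_len, allowed = -1, True  # no matching rule means allowed
    for directive, rule_path in rules:
        if not rule_path or not path.startswith(rule_path):
            continue
        longer = len(rule_path) > best_len
        tie_allow = len(rule_path) == best_len and directive == "allow"
        if longer or tie_allow:
            best_len, allowed = len(rule_path), directive == "allow"
    return allowed


print(is_allowed(ROBOTS_TXT, "ClaudeBot", "/blog/post"))    # Allow: / applies
print(is_allowed(ROBOTS_TXT, "ClaudeBot", "/admin/users"))  # Disallow: /admin is longer
print(is_allowed(ROBOTS_TXT, "ClaudeBot", "/private/x"))    # wildcard group is not used
```

The last call shows why the Disallow lines are repeated in every bot-specific block: once ClaudeBot has its own group, the wildcard's /private rule no longer applies to it. Note that Python's standard urllib.robotparser evaluates rules in file order, not by longest match, so it may report a different answer for a group that lists Allow: / first.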
anthropic-ai is the legacy user-agent from before Anthropic rebranded its training crawler to ClaudeBot. Older Anthropic training pipelines may still send it, and some robots.txt files written in 2023-2024 list only anthropic-ai. Best practice: include both ClaudeBot and anthropic-ai blocks for compatibility.
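A quick way to audit an existing file for this gap is to list which of the four Anthropic user-agents it names. The following is a rough Python sketch; the function names named_agents and missing_anthropic_agents are hypothetical, and the check only looks at User-agent lines, not at the rules under them.

```python
# Sketch: report which Anthropic user-agents have no explicit group.
ANTHROPIC_AGENTS = ("ClaudeBot", "Claude-SearchBot", "Claude-User", "anthropic-ai")


def named_agents(robots_txt):
    """Collect every User-agent value in the file, lowercased."""
    found = set()
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "user-agent":
            found.add(value.strip().lower())
    return found


def missing_anthropic_agents(robots_txt):
    """Return the Anthropic user-agents the file does not name, in list order."""
    found = named_agents(robots_txt)
    return [agent for agent in ANTHROPIC_AGENTS if agent.lower() not in found]


# Example: a file that only names the legacy user-agent.
legacy_only = """\
User-agent: anthropic-ai
Disallow: /
"""
print(missing_anthropic_agents(legacy_only))
# ['ClaudeBot', 'Claude-SearchBot', 'Claude-User']
```

Any user-agent reported as missing falls back to whatever the wildcard (User-agent: *) block says, which may not be what you intended for that crawler.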
https://docs.anthropic.com/en/docs/build-with-claude/web-crawling is the canonical source. It lists all four current user-agents (ClaudeBot, Claude-SearchBot, Claude-User, anthropic-ai), explains what each does, and confirms they respect standard robots.txt rules.
Anthropic doesn't publish exact rate limits, but ClaudeBot crawls significantly more slowly than Googlebot. A medium-sized site (5,000-10,000 pages) typically sees ClaudeBot complete a full crawl over several weeks. Don't set an aggressive Crawl-delay; the natural pace is already conservative.
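To see why a large Crawl-delay is costly, it helps to work out the minimum crawl time it implies. This back-of-the-envelope Python sketch assumes the crawler honors the directive, fetches one page per delay interval, and does not fetch in parallel; the function name min_crawl_hours is hypothetical, and the result is a lower bound on top of whatever pace the crawler already keeps.

```python
# Sketch: lower bound on full-crawl time implied by a Crawl-delay value,
# assuming one request per delay interval and no parallel fetching.
def min_crawl_hours(pages, crawl_delay_seconds):
    return pages * crawl_delay_seconds / 3600


for delay in (1, 10, 30):
    hours = min_crawl_hours(10_000, delay)
    print(f"Crawl-delay {delay:>2}s, 10,000 pages: at least {hours:.1f} hours")
```

For a 10,000-page site, a 10-second delay already adds more than a day of pure waiting, and a 30-second delay adds more than three days.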
Blocking ClaudeBot does not remove your content from Claude immediately. Content already in the training corpus persists in current Claude models. Blocking ClaudeBot prevents future inclusion, so the impact compounds over the next training cycle (months to a year). For immediate removal requests, Anthropic provides a separate opt-out form for specific content.
Cite this guide
BrandCited. (2026). ClaudeBot robots.txt — How to Allow, Block, or Audit. https://www.brandcited.ai/tools/robots-txt-auditor/claudebot
Each major AI engine operates one or more user-agents. Configure them in parallel for complete coverage.