OpenAI · training bot · Last updated 2026-05-22

GPTBot robots.txt — Allow, Block, or Audit

How to allow or block GPTBot (OpenAI's ChatGPT training crawler) in your robots.txt: why most brands should allow it, when blocking makes sense, and the exact directive to use.


What does GPTBot do?

GPTBot is OpenAI's crawler for collecting training data for ChatGPT and future GPT models. It is one of three OpenAI user-agents, alongside OAI-SearchBot (SearchGPT indexing) and ChatGPT-User (real-time fetches during conversations). GPTBot follows standard robots.txt rules and honors the rules in a User-agent: GPTBot group. When allowed, it crawls accessible pages on a paced schedule, and the content it collects can become part of the corpus for the next model training cycle.

Why GPTBot matters for AI visibility

GPTBot is the single most consequential AI crawler for long-term ChatGPT visibility. Pages it can read become candidates for citation when users ask ChatGPT questions on related topics months or years later. Blocking GPTBot today doesn't immediately remove your brand from ChatGPT (current training data persists), but it caps your future growth in the engine. Most brands should allow GPTBot — it's the deposit you make to be cited tomorrow.

BrandCited recommendation

Should you allow GPTBot?

Allow GPTBot on marketing sites, blogs, documentation, and any public content you want cited by ChatGPT. Block it only if you have a legal or licensing reason (paywalled premium content, exclusive partnerships). If you do block GPTBot, add explicit Allow rules for ChatGPT-User and OAI-SearchBot so that real-time fetches and search indexing keep working.

Copy-paste robots.txt block

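The exact directive to add to your robots.txt for GPTBot. Its position in the file does not matter: a crawler that finds a group naming its own user-agent follows that group and ignores the wildcard (User-agent: *) group entirely, so any wildcard Disallow rules you still want applied to GPTBot must be repeated here.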

robots.txt


# Allow GPTBot on public pages; keep it out of app paths
User-agent: GPTBot
Allow: /
Disallow: /admin
Disallow: /api/
Disallow: /dashboard
Disallow: /onboarding

# To OPT OUT of ChatGPT training while keeping real-time fetches working:
# User-agent: GPTBot
# Disallow: /
#
# User-agent: ChatGPT-User
# Allow: /
#
# User-agent: OAI-SearchBot
# Allow: /
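To see how the Allow and Disallow lines above interact, here is a minimal Python sketch of the longest-match rule described in RFC 9309: the most specific (longest) matching path wins, and Allow wins a tie. It is an illustration under stated assumptions, not OpenAI's implementation. The names GPTBOT_RULES and is_allowed are hypothetical, and the sketch handles plain path prefixes only (no * wildcards or $ anchors). Python's standard urllib.robotparser applies rules in file order rather than by longest match, which is why the matching is written out by hand here.

```python
# Minimal longest-match evaluator for one robots.txt group (a sketch, not a full parser).
# Assumption: rules are plain path prefixes; "*" wildcards and "$" anchors are not handled.

# The GPTBot group from the block above, as (directive, path) pairs.
GPTBOT_RULES = [
    ("allow", "/"),
    ("disallow", "/admin"),
    ("disallow", "/api/"),
    ("disallow", "/dashboard"),
    ("disallow", "/onboarding"),
]


def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Return True if `path` may be crawled under `rules`."""
    best_kind = "allow"  # a path that matches no rule is allowed
    best_len = -1
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_prefers_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_prefers_allow:
            best_kind, best_len = kind, len(prefix)
    return best_kind == "allow"


if __name__ == "__main__":
    samples = ["/", "/blog/gptbot-guide", "/admin", "/admin/users", "/api/v1/items", "/api"]
    for path in samples:
        verdict = "allow" if is_allowed(path, GPTBOT_RULES) else "block"
        print(f"{path:22} {verdict}")
```

Note that /api (no trailing slash) is still allowed under these rules, because the Disallow path is /api/. Drop the trailing slash in the directive if the bare path should be excluded too.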

Frequently asked questions about GPTBot

Does blocking GPTBot stop ChatGPT from citing my brand?

No, not immediately. Current ChatGPT model knowledge comes from training data already collected — blocking GPTBot today caps future training inclusion, not present citations. For real-time citations in ChatGPT browse mode, the relevant bot is ChatGPT-User (separate user-agent). Allowing GPTBot is about feeding the next model training cycle.

How do I check if GPTBot is hitting my site?

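GPTBot identifies itself with a User-Agent string that contains the token "GPTBot/" followed by a version number (the string itself begins with a browser-style "Mozilla/5.0" prefix). Grep your access logs for "GPTBot/"; most CDNs and server logs surface this directly. Because a User-Agent header can be spoofed, confirm the client IP against the source IP ranges OpenAI publishes at https://platform.openai.com/docs/bots.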
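The log check can be sketched in a few lines of Python. This is a rough example under several assumptions: the log is in combined format (client IP first, User-Agent as the last quoted field), the sample lines and the function name gptbot_hits are made up for illustration, and PLACEHOLDER_RANGES uses a documentation-only network rather than OpenAI's real ranges, which you would need to substitute from the published list.

```python
import ipaddress
import re

# Hypothetical placeholder range (an RFC 5737 documentation network), NOT OpenAI's real ranges.
# Replace it with the ranges OpenAI publishes before relying on the "verified" flag.
PLACEHOLDER_RANGES = [ipaddress.ip_network("192.0.2.0/24")]

# Assumes combined log format: client IP is the first field, User-Agent the last quoted field.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .*"(?P<ua>[^"]*)"\s*$')

# Illustrative user agents and log lines only; version numbers in real traffic may differ.
GPTBOT_UA = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.1; +https://openai.com/gptbot)"
SAMPLE_LINES = [
    f'192.0.2.10 - - [22/May/2026:10:00:00 +0000] "GET /blog HTTP/1.1" 200 512 "-" "{GPTBOT_UA}"',
    f'203.0.113.5 - - [22/May/2026:10:00:05 +0000] "GET /blog HTTP/1.1" 200 512 "-" "{GPTBOT_UA}"',
    '198.51.100.7 - - [22/May/2026:10:00:09 +0000] "GET / HTTP/1.1" 200 900 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]


def gptbot_hits(lines, ranges=PLACEHOLDER_RANGES):
    """Yield (ip, verified) for each log line whose User-Agent contains 'GPTBot/'."""
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or "GPTBot/" not in match.group("ua"):
            continue
        try:
            ip = ipaddress.ip_address(match.group("ip"))
        except ValueError:
            continue  # first field was not an IP address (for example, a hostname)
        yield str(ip), any(ip in net for net in ranges)


if __name__ == "__main__":
    for ip, verified in gptbot_hits(SAMPLE_LINES):
        print(ip, "verified" if verified else "claims GPTBot, IP not in configured ranges")
```

A hit that claims to be GPTBot but falls outside the configured ranges is worth a closer look before you count it as real crawler traffic.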

Does GPTBot respect Crawl-delay?

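Do not rely on it, and be careful either way. Crawl-delay is not part of the robots.txt standard (RFC 9309), and crawlers differ on whether they honor it. Where it is honored, a Crawl-delay of 30 seconds limits a crawler to one request every 30 seconds; at that pace, a 5,000-page site takes about 42 hours per crawl (5,000 × 30 s = 150,000 s). Old SEO robots.txt templates often set this; for AI training crawlers it slows coverage with no upside. Either remove Crawl-delay entirely or set it to 1-2 seconds.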
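The arithmetic behind that estimate is easy to rerun for your own site. A small Python sketch, assuming the crawler makes exactly one request per delay interval and fetches every page once (crawl_hours is a hypothetical helper name):

```python
def crawl_hours(pages: int, crawl_delay_seconds: float) -> float:
    """Hours needed to fetch `pages` URLs at one request per `crawl_delay_seconds`."""
    return pages * crawl_delay_seconds / 3600


if __name__ == "__main__":
    for delay in (30, 2, 1):
        print(f"Crawl-delay {delay:>2}s: {crawl_hours(5000, delay):.1f} h")
    # Crawl-delay 30s: 41.7 h
    # Crawl-delay  2s: 2.8 h
    # Crawl-delay  1s: 1.4 h
```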

Should I allow GPTBot but block /admin?

Yes, exactly. The standard pattern: Allow: / globally for GPTBot, then Disallow specific paths like /admin, /api/, /dashboard, /onboarding. This gives the training crawler access to public marketing pages while keeping authenticated app surfaces out of the crawl. robots.txt is advisory rather than access control, so those paths still need real authentication.

What's the difference between GPTBot, OAI-SearchBot, and ChatGPT-User?

Three distinct bots from OpenAI. GPTBot collects training data (slow, batched). OAI-SearchBot indexes pages for SearchGPT real-time search (faster, fresher). ChatGPT-User fetches a specific URL when a ChatGPT user references it in a conversation (on-demand, per query). Block one and you affect only that surface.
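The "block one, affect only that surface" behavior follows from how a crawler picks its group. The Python sketch below parses a robots.txt into groups and returns the rules each bot would follow. It is a simplified illustration: parse_groups and rules_for are hypothetical names, agents are matched by exact token (case-insensitive) with a fallback to the wildcard group, and the User-agent: * group in the sample is an added assumption to show that fallback.

```python
# Opt-out configuration from this guide, plus a hypothetical wildcard group for other bots.
OPT_OUT = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /
"""


def parse_groups(robots_txt: str) -> dict[str, list[tuple[str, str]]]:
    """Map each lowercased user-agent token to its (directive, path) rules."""
    groups: dict[str, list[tuple[str, str]]] = {}
    current_agents: list[str] = []
    reading_agents = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                current_agents = []  # a new group starts here
            reading_agents = True
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            reading_agents = False
            for agent in current_agents:
                groups[agent].append((field, value))
    return groups


def rules_for(bot: str, groups: dict[str, list[tuple[str, str]]]) -> list[tuple[str, str]]:
    """Use the bot's own group if one exists, otherwise the wildcard group."""
    return groups.get(bot.lower(), groups.get("*", []))


if __name__ == "__main__":
    groups = parse_groups(OPT_OUT)
    for bot in ("GPTBot", "ChatGPT-User", "OAI-SearchBot", "SomeOtherBot"):
        print(f"{bot:14} {rules_for(bot, groups)}")
```

Only GPTBot receives Disallow: /; the other two OpenAI agents follow their own Allow: / groups, and any bot without a named group falls back to the wildcard rules.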

Cite this guide

BrandCited. (2026). GPTBot robots.txt — How to Allow, Block, or Audit. https://www.brandcited.ai/tools/robots-txt-auditor/gptbot

Related AI crawlers

Each major AI engine operates one or more user-agents. Configure them in parallel for complete coverage.

ClaudeBot robots.txt

Anthropic · training

PerplexityBot robots.txt

Perplexity · search

Google-Extended robots.txt

Google · training

Applebot-Extended robots.txt

Apple · training
