Is Your Site Blocking AI Crawlers? How to Check

If GPTBot, ClaudeBot, or PerplexityBot is blocked, AI cannot cite your firm. Run this 10-minute audit to find and fix the robots.txt and Cloudflare rules.

Published by Result.st | 4 min read

If your site blocks AI crawlers, no amount of great content will get your firm cited, because the engines never read your pages. The fastest way to check is to open yourfirm.com/robots.txt and look for Disallow rules naming GPTBot, ClaudeBot, or PerplexityBot, then confirm your CDN, often Cloudflare, is not blocking them by default. The full audit below takes about 10 minutes.

Why would your site be blocking AI crawlers at all?

Most firms never chose to block AI; it happened by default. Three common causes:

  • Cloudflare added an automatic "Block AI bots" toggle, and many plans enabled it without a clear notice.
  • A developer or SEO plugin added Disallow rules for AI user-agents to "protect content."
  • A managed hosting provider or WordPress security plugin shipped restrictive defaults.

The result is the same: GPTBot or PerplexityBot requests your page, receives a block, and the engine has nothing to cite. This is one of the most common and most overlooked reasons firms vanish from AI answers. We cover the others in 7 reasons your firm doesn't show up in AI answers.

Which AI crawlers actually matter?

You do not need to allow every bot, but you should allow the ones that feed the major answer engines. Here are the user-agents that count:

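Crawler         | Powers                           | Why it matters
GPTBot          | ChatGPT training and answers     | Largest AI audience
OAI-SearchBot   | ChatGPT Search                   | Real-time citations
ClaudeBot       | Claude                           | Growing professional use
PerplexityBot   | Perplexity                       | Cites fast, often within days
Google-Extended | Gemini and Vertex AI (Google)    | Controls AI use of your content

Google-Extended is a robots.txt control token rather than a separate crawler: Google fetches pages with its existing user-agents and applies the token afterwards. It does not govern AI Overviews, which are part of Google Search and follow your Googlebot rules.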

Note that ChatGPT Search overlaps roughly 87% with Bing, so a healthy Bing presence helps too, but the crawlers above are what read your site directly. Allowing them does not affect your traditional search rankings, because AI crawlers are separate from Googlebot.

How do you run the 10-minute crawler audit?

Work through these steps in order:

  1. Open yourfirm.com/robots.txt in a browser. Scan for any line naming GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, or Google-Extended followed by Disallow.
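  2. Check for a blanket block: a User-agent: * paired with Disallow: / blocks every crawler that has no group of its own, including AI crawlers. A crawler that is named in its own group follows that group instead of the * rules.
  3. Log in to Cloudflare, go to Security, then Bots, and look for an AI bot blocking toggle. Disable it if you want citations.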
  4. Review any security or SEO plugin (Wordfence, Yoast, Rank Math) for AI bot rules.
  5. Use our AI crawler checker to test all the major agents against your live site in one pass.

If any crawler returns blocked, you have found your problem.
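
Steps 1 and 2 can also be scripted. The sketch below uses Python's standard urllib.robotparser to ask whether each user-agent from this article may fetch a path. The audit_robots helper and the SAMPLE rules are hypothetical names made up for illustration, and the parser applies the first matching rule in a group, which can differ from how a given crawler resolves overlapping Allow and Disallow lines. Treat the output as a first pass, not a verdict.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this article.
AI_AGENTS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def audit_robots(robots_txt: str, path: str = "/") -> dict[str, bool]:
    """Return {agent: True if these robots.txt rules let the agent fetch `path`}."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in AI_AGENTS}


# Hypothetical robots.txt: one named block plus a partial wildcard rule.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for agent, allowed in audit_robots(SAMPLE).items():
        print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

To run it against your own site, paste the contents of yourfirm.com/robots.txt in place of SAMPLE. This only reads robots.txt rules; it cannot see a Cloudflare or plugin block, which is why steps 3 to 5 still matter.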

How do you fix a robots.txt that blocks AI?

Edit robots.txt to explicitly allow the crawlers you want. A clean, permissive set of rules looks like this:

User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

Remove any conflicting Disallow rules for these agents. In Cloudflare, turn off the AI bot blocking setting for the zone. After making changes, re-test with the checker tool and confirm each bot now returns allowed. A robots.txt change takes effect the next time each crawler re-fetches the file, usually within hours to days.
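
Finding the conflicting rules by eye is error-prone in a long file. Here is a minimal sketch of a scanner that lists every non-empty Disallow line sitting in a group that names one of the AI agents. The find_conflicts function and the SAMPLE text are hypothetical, and the group handling is a simplified reading of the robots.txt format, so review each reported line before deleting it.

```python
# Lower-cased user-agent tokens named in this article.
AI_AGENTS = {"gptbot", "oai-searchbot", "claudebot", "perplexitybot", "google-extended"}


def find_conflicts(robots_txt: str) -> list[tuple[int, str, str]]:
    """List (line_number, agent, rule) for Disallow rules in groups naming an AI agent."""
    conflicts = []
    agents: list[str] = []  # user-agents of the group currently being read
    in_rules = False        # True once the current group has seen a rule line
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value:  # an empty Disallow permits everything
                for agent in agents:
                    if agent in AI_AGENTS:
                        conflicts.append((number, agent, f"Disallow: {value}"))
    return conflicts


# Hypothetical robots.txt with two conflicting groups.
SAMPLE = """\
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /admin/

User-agent: PerplexityBot
Allow: /
Disallow: /drafts/
"""

if __name__ == "__main__":
    for number, agent, rule in find_conflicts(SAMPLE):
        print(f"line {number}: {agent} -> {rule}")
```

The scanner deliberately ignores the wildcard group, since a User-agent: * rule is a separate decision that affects every crawler, not only the AI ones.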

What other rules can silently block AI besides robots.txt?

robots.txt is the obvious culprit, but it is not the only one. Several layers can quietly turn AI bots away even when your robots.txt looks clean:

  • WAF and firewall rules: a web application firewall may challenge or block unfamiliar user-agents, including AI crawlers, before they ever reach robots.txt.
  • Rate limiting: aggressive limits can drop crawler requests, leaving the engine with partial or no content.
  • JavaScript-only rendering: if your key content loads only after heavy client-side scripts, some crawlers may capture an empty page even when access is allowed.
  • Meta robots and X-Robots-Tag: a noindex directive at the page or header level can suppress a page independently of robots.txt.
  • Geo or bot fencing: country or bot-management rules that block traffic from data-center IP ranges, where many crawlers originate.

Check these if a crawler is allowed in robots.txt but still cannot retrieve your pages. The crawler checker helps isolate where the failure happens.
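
To see which layer is responsible, it helps to request a page while sending a crawler's user-agent and then read the response. The sketch below is one rough way to do that with the standard library. The names fetch_as and diagnose are hypothetical, the 50-word threshold is an arbitrary assumption, and the meta tag pattern only handles the common name-before-content attribute order. A spoofed user-agent also comes from your own IP address, so this cannot reproduce rules keyed to a crawler's real IP ranges.

```python
import re
import urllib.error
import urllib.request

# Simplified pattern: assumes name="robots" appears before content="...".
META_ROBOTS = re.compile(
    r'''<meta[^>]+name=["']robots["'][^>]*content=["']([^"']*)["']''',
    re.IGNORECASE,
)


def fetch_as(url: str, user_agent: str) -> tuple[int, dict[str, str], str]:
    """Fetch `url` with the given User-Agent; return (status, headers, body)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            body = response.read().decode("utf-8", "replace")
            return response.status, dict(response.headers.items()), body
    except urllib.error.HTTPError as error:  # 4xx and 5xx responses land here
        body = error.read().decode("utf-8", "replace")
        return error.code, dict(error.headers.items()), body


def diagnose(status: int, headers: dict[str, str], body: str) -> list[str]:
    """Return plain-language findings for one response."""
    findings = []
    lowered = {name.lower(): value for name, value in headers.items()}

    if status in (401, 403):
        findings.append("access denied: likely a WAF, firewall, or bot-management rule")
    elif status == 429:
        findings.append("rate limited: the crawler was asked to slow down")
    elif status >= 500:
        findings.append("server error: the crawler received no usable content")

    if "noindex" in lowered.get("x-robots-tag", "").lower():
        findings.append("X-Robots-Tag header contains noindex")

    match = META_ROBOTS.search(body)
    if match and "noindex" in match.group(1).lower():
        findings.append("meta robots tag contains noindex")

    # Crude heuristic: strip scripts, styles, and tags, then count remaining words.
    text = re.sub(
        r"<script.*?</script>|<style.*?</style>|<[^>]+>",
        " ",
        body,
        flags=re.DOTALL | re.IGNORECASE,
    )
    if status == 200 and len(text.split()) < 50:
        findings.append("little text in raw HTML: content may be rendered by JavaScript")
    return findings
```

A typical call would be diagnose(*fetch_as("https://yourfirm.com/", "GPTBot")). An empty list means none of these particular checks fired, not that every crawler can definitely read the page.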

What should you do after you unblock the crawlers?

Unblocking is the foundation, not the finish line. Once engines can read you, make sure they can understand and trust what they read: add schema markup for your practice so the content is machine-legible, then verify your visibility by checking what AI currently says about your firm. Re-run the audit quarterly, because CDN defaults and plugin updates can silently re-block bots.

Not sure whether a block is costing you citations? Get in touch through our contact page and we will run the full crawler and visibility audit for you.

Frequently asked questions

How do I know if my site blocks AI crawlers?

Check yourfirm.com/robots.txt for Disallow rules naming GPTBot, ClaudeBot, or PerplexityBot, and review your Cloudflare bot settings, which can block AI crawlers by default.

Which AI crawlers should I allow?

At minimum, allow GPTBot and OAI-SearchBot for ChatGPT, ClaudeBot for Claude, PerplexityBot for Perplexity, and Google-Extended for Google's Gemini models.

Will allowing AI crawlers hurt my SEO?

No. AI crawlers are separate from Googlebot. Allowing GPTBot or PerplexityBot does not affect how Google ranks your pages in traditional search.
