Thoth

Glossary Definition

robots.txt

/ˈroʊ.bɒts.tɛkst/ noun

A plain-text file served from the root of a website (at /robots.txt) that tells web crawlers which URL paths they may or may not crawl. Its directives are advisory: well-behaved crawlers honor them, but the file does not enforce access control. In 2026, robots.txt configuration has expanded beyond search engine bots to include AI crawlers. For most SaaS sites, the recommended AI crawler configuration is to allow GPTBot, ClaudeBot, and PerplexityBot while blocking high-volume training scrapers such as Bytespider.
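As a rough illustration of the configuration described above, the sketch below embeds one possible robots.txt and checks it with Python's standard urllib.robotparser module. The domain example.com, the /pricing path, the can_crawl helper, and the catch-all group for other crawlers are assumptions made for the example, not part of any recommended file.

```python
from urllib.robotparser import RobotFileParser

# One possible robots.txt for the policy described above (illustrative only).
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /

User-agent: Bytespider
Disallow: /

# Assumption: every other crawler is allowed by default.
User-agent: *
Allow: /
"""


def can_crawl(user_agent: str, url: str = "https://example.com/pricing") -> bool:
    """Return True if the rules above permit `user_agent` to fetch `url`.

    `can_crawl` and the example.com URL are hypothetical names for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for bot in ("GPTBot", "ClaudeBot", "PerplexityBot", "Bytespider"):
        print(bot, "allowed" if can_crawl(bot) else "blocked")
```

A check like this only confirms what the file says. Whether a given crawler actually obeys it is up to the crawler, so server logs remain the way to verify real behavior.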
