Generate Robots.txt Files Instantly (Control Crawlers, SEO-Ready)
Generate robots.txt file to control search engine crawlers. Create user-agent rules, allow/disallow paths, set crawl delays, and add sitemap URLs. Perfect for managing bot access to your website.
How to Use Robots.txt Generator
What is Robots.txt?
Robots.txt is a text file placed in your website root directory that tells search engine crawlers which pages or sections of your site they can or cannot access. It is part of the Robots Exclusion Protocol (REP), a group of web standards that regulate how robots crawl the web. While not all crawlers respect robots.txt, major search engines like Google, Bing, and Yahoo follow these directives.
How to Use This Tool
Step 1: Choose a Preset
Start with one of the preset configurations:
- Allow All Crawlers: Permit all search engines to crawl everything (default for most sites)
- Standard Website: Block admin, private, and API directories
- Blog/News Site: Block WordPress admin while allowing content directories
- E-commerce Store: Block cart, checkout, and account pages while allowing products
- Block All Crawlers: Prevent all crawlers from indexing your site (staging/development)
Click any preset to instantly load its configuration.
Step 2: Select User-agent
The user-agent specifies which crawler(s) the rules apply to:
All Crawlers (*)
- Applies rules to all search engine bots
- Most common choice for general sites
- Can be overridden by specific user-agent rules
Specific Crawlers:
- Googlebot: Google Search crawler
- Googlebot-Image: Google Image Search
- Bingbot: Microsoft Bing crawler
- Slurp: Yahoo Search crawler
- DuckDuckBot: DuckDuckGo crawler
- Baiduspider: Baidu (Chinese search engine)
- YandexBot: Yandex (Russian search engine)
- facebookexternalhit: Facebook link preview crawler
- Twitterbot: Twitter/X link preview crawler
You can create multiple user-agent blocks with different rules for each crawler.
Step 3: Configure Allow Rules
Allow rules explicitly permit crawlers to access specific paths:
When to use Allow:
- Override broader Disallow rules
- Permit access to specific subdirectories within blocked directories
- Example: block /admin/ but allow /admin/public/
Path Syntax:
- / = Allow root and everything (if no disallow rules)
- /blog/ = Allow blog directory and all subdirectories
- /products/ = Allow products directory
- Leave the field empty if you want to block everything
Best Practices:
- The longest (most specific) matching rule wins; Google breaks exact ties in favor of Allow
- Use Allow sparingly, primarily to create exceptions
- Most sites do not need explicit Allow rules
Step 4: Configure Disallow Rules
Disallow rules block crawlers from accessing specific paths:
Common Paths to Block:
- /admin/ = Admin panel, control panel
- /wp-admin/ = WordPress admin dashboard
- /private/ = Private files and directories
- /temp/ or /tmp/ = Temporary files
- /api/ = API endpoints
- /cgi-bin/ = CGI scripts
- /search/ = Search results pages (duplicate content)
- /cart/ = Shopping cart pages
- /checkout/ = Checkout flow pages
- /account/ = User account pages
- /login/ and /register/ = Authentication pages
Path Syntax:
- / = Block everything
- /admin/ = Block admin directory and all subdirectories
- /secret.html = Block a specific file
- /*? = Block all URLs with query parameters
- /*.pdf$ = Block all PDF files
- /*sessionid= = Block URLs with session IDs
Wildcards:
- * = Matches any sequence of characters
- $ = End of URL
- Example: /private/*.pdf$ blocks all PDFs in the private directory
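The wildcard semantics above can be sketched by translating a rule pattern into a regular expression. This is a minimal Python illustration of the matching behavior, not Google's exact matcher:

```python
import re

def path_matches(pattern: str, path: str) -> bool:
    """Check whether a robots.txt path pattern matches a URL path.

    '*' matches any sequence of characters, '$' anchors the end
    of the URL; otherwise patterns are prefix matches.
    """
    regex = ""
    for ch in pattern:
        if ch == "*":
            regex += ".*"
        elif ch == "$":
            regex += "$"
        else:
            regex += re.escape(ch)
    # re.match anchors at the start, giving prefix-match behavior
    return re.match(regex, path) is not None

# The example above: /private/*.pdf$ blocks all PDFs in /private/
print(path_matches("/private/*.pdf$", "/private/report.pdf"))      # True
print(path_matches("/private/*.pdf$", "/private/report.pdf?v=2"))  # False
print(path_matches("/admin/", "/administrator/"))                  # False
```

Note how the trailing $ keeps /private/report.pdf?v=2 unblocked: the URL no longer ends in .pdf once a query string is appended.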
Step 5: Set Crawl Delay (Optional)
Crawl-delay specifies the number of seconds crawlers should wait between requests:
When to use:
- Limit server load from aggressive crawlers
- Prevent bandwidth exhaustion
- Protect resource-intensive pages
Values:
- 0 = No delay (not recommended; omit the directive instead)
- 1-5 = Light delay for fast servers
- 10 = Standard delay for most sites (recommended)
- 30-60 = Heavy delay for slow servers or heavy scrapers
Important Notes:
- Google ignores Crawl-delay; use Google Search Console instead
- Bing and Yandex respect Crawl-delay
- Values that are too high may reduce how often your site is crawled
- Most modern sites do not need this unless experiencing crawler issues
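Given those caveats, a common pattern is to scope Crawl-delay only to the crawlers that honor it; the delay values here are just examples:

```
User-agent: Bingbot
Crawl-delay: 10

User-agent: YandexBot
Crawl-delay: 10
```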
Step 6: Add Sitemap URL (Optional but Recommended)
Sitemap directive tells crawlers where to find your XML sitemap:
Format:
Sitemap: https://example.com/sitemap.xml
- Must be an absolute URL (include https://)
- Can list multiple sitemaps on separate lines
Benefits:
- Helps search engines discover all your pages
- Improves indexing efficiency
- Provides metadata about page priority and update frequency
Common Sitemap Locations:
- /sitemap.xml = Root level (most common)
- /sitemap_index.xml = Sitemap index file
- /blog/sitemap.xml = Subdirectory sitemap
- Multiple sitemaps are allowed
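For example, multiple Sitemap lines can simply be listed together, typically at the end of the file (the URLs here are placeholders):

```
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/blog/sitemap.xml
```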
Step 7: Copy or Download the File
Two options to save your robots.txt:
Copy Button:
- Copies content to clipboard
- Paste into a text editor
- Save as robots.txt (exactly that name, with no extra extension)
Download Button:
- Downloads the file directly as robots.txt
- Ready to upload to your server
- Preserves correct formatting
Step 8: Upload to Your Website
Upload robots.txt to your website root directory:
File Location:
- Must be at: https://yoursite.com/robots.txt
- NOT in subdirectories: /blog/robots.txt will not work
- NOT with different names: robots.txt.txt is invalid
- Case-sensitive: robots.txt, not Robots.txt
Upload Methods:
FTP/SFTP:
- Connect to your server via FTP client (FileZilla, Cyberduck)
- Navigate to root directory (public_html, www, or htdocs)
- Upload robots.txt file
- Set file permissions to 644 (readable by all)
cPanel File Manager:
- Log into cPanel
- Open File Manager
- Navigate to public_html directory
- Upload robots.txt file
- Verify file is not hidden
WordPress:
- Use FTP to upload to root directory (same level as wp-config.php)
- Or use All in One SEO / Yoast SEO plugin robots.txt editor
- Some WordPress plugins auto-generate robots.txt (check first)
Next.js/Vercel:
- Place robots.txt in the /public/ directory
- Deployed to the site root automatically
- Or use next-sitemap package for dynamic generation
Nginx:
- Upload to web root directory (usually /var/www/html)
- Ensure proper permissions (644)
- Restart Nginx if needed
Step 9: Test Your Robots.txt
Verify your robots.txt file is working correctly:
Manual Check:
- Visit https://yoursite.com/robots.txt
- Verify the content displays correctly in your browser
- Check for any 404 errors
Google Search Console:
- Go to: search.google.com/search-console
- Select your property
- Open Settings → robots.txt to view the robots.txt report (the legacy robots.txt Tester has been retired)
- Check the fetch status and any parsing errors reported for the file
- Request a recrawl after updating the file
Bing Webmaster Tools:
- Go to: bing.com/webmasters
- Select your site
- Go to Configure My Site → Crawl Control
- View current robots.txt
- Test URLs against rules
Online Validators:
- Ryte.com robots.txt validator
- Technical SEO robots.txt tester
- Screaming Frog robots.txt analyzer
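You can also test rules offline with Python's standard-library parser. Note that urllib.robotparser implements the original exclusion standard and does not understand the * and $ wildcard extensions, so use it for plain prefix rules only:

```python
from urllib.robotparser import RobotFileParser

# Parse a robots.txt body directly; no network access needed
robots_txt = """\
User-agent: *
Disallow: /private/
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("*", "https://example.com/blog/post"))  # True
print(rp.can_fetch("*", "https://example.com/private/x"))  # False
```

For a live site you could instead call rp.set_url("https://yoursite.com/robots.txt") followed by rp.read(), which fetches the file over HTTP.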
Robots.txt Syntax and Rules
Basic Structure
User-agent: *
Allow: /
Disallow: /private/
Crawl-delay: 10
Sitemap: https://example.com/sitemap.xml
Multiple User-agent Blocks
You can define different rules for different crawlers:
# Allow Googlebot to access everything
User-agent: Googlebot
Allow: /
# Block other crawlers from certain paths
User-agent: *
Disallow: /private/
Disallow: /admin/
Path Matching Rules
Exact Path:
Disallow: /admin/
Blocks: /admin/, /admin/users/, /admin/settings.php
Allows: /administrator/ (different path)
Wildcard (*):
Disallow: /search*
Blocks: /search, /search/, /search?q=test, /searchresults
End Anchor ($):
Disallow: /*.pdf$
Blocks: /documents/file.pdf, /downloads/guide.pdf
Allows: /pdf/ (a directory, not a URL ending in .pdf)
Query Parameters:
Disallow: /*?
Blocks all URLs with query parameters
Case Sensitivity:
- Paths are case-sensitive: /Admin/ ≠ /admin/
- User-agent names are case-insensitive: Googlebot = googlebot
Allow vs Disallow Priority
When rules conflict, the most specific (longest) matching rule wins:
User-agent: *
Disallow: /admin/
Allow: /admin/public/
Result: /admin/public/ is allowed, rest of /admin/ is blocked
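That precedence rule can be sketched as a longest-match search. This sketch uses simple prefix matching only (no wildcards) and mirrors Google's tie-break in favor of Allow:

```python
def decide(rules, path):
    """Return 'allow' or 'disallow' for a path, picking the longest
    matching rule; on an exact length tie, Allow wins."""
    best = ("allow", "")  # default: allowed via a zero-length match
    for kind, pattern in rules:
        if path.startswith(pattern):
            longer = len(pattern) > len(best[1])
            tie_allow = len(pattern) == len(best[1]) and kind == "allow"
            if longer or tie_allow:
                best = (kind, pattern)
    return best[0]

rules = [("disallow", "/admin/"), ("allow", "/admin/public/")]
print(decide(rules, "/admin/public/page.html"))  # allow
print(decide(rules, "/admin/settings"))          # disallow
print(decide(rules, "/blog/post"))               # allow (no rule matches)
```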
Comments
Use # for comments:
# Block admin area
User-agent: *
Disallow: /admin/ # Admin panel
# Allow public resources
Allow: /public/
Common Use Cases
Standard Public Website
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
Allows all crawlers to index everything.
WordPress Site
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://example.com/sitemap.xml
Blocks WordPress admin while allowing AJAX endpoints.
E-commerce Store
User-agent: *
Allow: /
Allow: /products/
Allow: /categories/
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /search/
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
Allows product pages, blocks transactional pages.
Staging/Development Site
User-agent: *
Disallow: /
Blocks all crawlers from accessing the entire site.
Block Specific Crawlers
# Block bad bots
User-agent: BadBot
User-agent: ScraperBot
Disallow: /
# Allow good bots
User-agent: *
Allow: /
Blocks specific malicious crawlers.
Prevent Image Indexing
User-agent: Googlebot-Image
Disallow: /images/
User-agent: *
Allow: /
Blocks Google from indexing images while allowing text crawling.
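A generator like this tool essentially assembles such files from structured rules. Here is a minimal Python sketch; the group layout and helper name are illustrative, not this tool's implementation:

```python
def build_robots_txt(groups, sitemaps=()):
    """Assemble a robots.txt from (user_agent, allows, disallows) groups,
    with Sitemap lines appended at the end."""
    lines = []
    for user_agent, allows, disallows in groups:
        lines.append(f"User-agent: {user_agent}")
        lines += [f"Allow: {p}" for p in allows]
        lines += [f"Disallow: {p}" for p in disallows]
        lines.append("")  # blank line separates groups
    lines += [f"Sitemap: {url}" for url in sitemaps]
    return "\n".join(lines).rstrip() + "\n"

# The e-commerce example above, rebuilt programmatically:
text = build_robots_txt(
    [("*", ["/", "/products/"], ["/cart/", "/checkout/", "/admin/"])],
    sitemaps=["https://example.com/sitemap.xml"],
)
print(text)
```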
Important Limitations
Robots.txt is NOT Security
What it does NOT do:
- Does not prevent malicious bots from accessing pages
- Does not hide pages from search results if linked externally
- Does not remove pages already indexed by search engines
- Can be ignored by any crawler (it is just a request, not enforcement)
For actual security:
- Use password protection (.htaccess, server authentication)
- Implement IP whitelisting
- Use noindex meta tags or X-Robots-Tag headers
- Apply proper file permissions
Cannot Remove Indexed Pages
If pages are already indexed:
- Robots.txt will not remove them from search results
- Use a noindex meta tag instead: <meta name="robots" content="noindex">
- Or use the X-Robots-Tag HTTP header (do not also block the page in robots.txt, or crawlers will never see the noindex directive)
- Then request removal in Google Search Console
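For reference, the X-Robots-Tag variant is sent as an HTTP response header by the server rather than in the page markup:

```
HTTP/1.1 200 OK
Content-Type: text/html
X-Robots-Tag: noindex
```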
File Must Be Accessible
- Robots.txt must return 200 OK status
- Must be plain text (text/plain)
- Must be UTF-8 encoded
- Maximum size: 500 KiB (recommended under 100 KB)
- Cannot use redirects (301/302)
Troubleshooting
Robots.txt Not Working?
Check File Location:
- Must be at the exact path: /robots.txt
- Not in a subdirectory or under a different name
- Case-sensitive filename
Verify File Permissions:
- Set to 644 (readable by all)
- Not executable (do not use 777)
Test Accessibility:
- Visit https://yoursite.com/robots.txt in browser
- Should display text content
- Check for 404, 403, or 500 errors
Syntax Errors:
- No syntax errors in directives
- Check spelling: the conventional form is User-agent (parsers are lenient about case, but Useragent without the hyphen is invalid)
- No extra spaces or special characters
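A quick way to catch misspelled directives is a small lint pass over the file. This is a minimal sketch; real validators check path syntax, values, and grouping as well:

```python
KNOWN_DIRECTIVES = {"user-agent", "allow", "disallow", "crawl-delay", "sitemap"}

def lint_robots_txt(text):
    """Return (line_number, directive) pairs for unrecognized directives."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        directive = line.split(":", 1)[0].strip().lower()
        if directive not in KNOWN_DIRECTIVES:
            problems.append((lineno, directive))
    return problems

sample = "User-agent: *\nDissalow: /admin/\nSitemap: https://example.com/sitemap.xml\n"
print(lint_robots_txt(sample))  # [(2, 'dissalow')]
```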
Pages Still Being Indexed?
Solutions:
- Add a noindex meta tag: <meta name="robots" content="noindex">
- Wait for the next crawl (can take weeks)
- Use Google Search Console Removals tool
- Check for external links pointing to blocked pages