Free Robots.txt Tester
As a webmaster, you have control over access to your website. This includes whether or not you want to allow search engine bots to crawl your pages for information. The easiest way to manage this sort of access is through your site's robots.txt file.
What is a Robots.txt File?
A robots.txt file tells search engine crawlers which files or pages they can or cannot request from your website. Typically, webmasters use it to avoid overloading their sites with requests.
It is not, however, a tool for keeping a web page out of Google's search results. If you want to keep a page off Google (or another search engine), you need to use a "noindex" directive. You can also password-protect the page.
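For reference, a noindex directive is usually delivered one of two ways: as a meta tag in the page's HTML, or as an HTTP response header for non-HTML files. A minimal example of each (the header line shown is how a server would send it, not something you place in the page):

```html
<!-- In the page's <head>: tells compliant crawlers not to index this page -->
<meta name="robots" content="noindex">

<!-- Equivalent HTTP response header, useful for PDFs and other non-HTML files:
     X-Robots-Tag: noindex -->
```

Note that for a crawler to see either directive, the page must not be blocked by robots.txt — a blocked page can't be crawled, so the noindex is never read.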
Understanding Robots.txt File Structure
The robots.txt file is very simple and straightforward.
The basic format looks like this:
User-agent: [user-agent name]
Disallow: [URL string not to be crawled]
When you combine these two lines, you have a complete robots.txt file. But within each robots.txt file, it’s possible to have different user-agent directives. In other words, you can disallow certain bots from viewing specific pages, while allowing other bots access to crawl. It’s also possible to instruct bots to wait before crawling.
Consider the following:

User-agent: msnbot
Crawl-delay: 4
Disallow: /blog

This robots.txt file tells msnbot that it should wait four seconds before crawling each page and that it's NOT allowed to crawl the blog.
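You can sanity-check rules like these programmatically. Python's standard library ships a robots.txt parser, `urllib.robotparser`, that can load rules directly from a string — a quick sketch using the msnbot rules described above (the example.com URLs are placeholders):

```python
from urllib.robotparser import RobotFileParser

# Rule set matching the msnbot example above
rules = """\
User-agent: msnbot
Crawl-delay: 4
Disallow: /blog
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# msnbot may not fetch anything under /blog, but the homepage is fine
print(parser.can_fetch("msnbot", "https://example.com/blog/post"))  # False
print(parser.can_fetch("msnbot", "https://example.com/"))           # True
print(parser.crawl_delay("msnbot"))                                 # 4
```

For a live site, you would instead call `parser.set_url("https://example.com/robots.txt")` followed by `parser.read()` to fetch and parse the real file.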
Robots.txt: Other Things to Know
If you’re new to the concept of a robots.txt file, you may find the following information helpful:
- The file name is case sensitive. It must be “robots.txt” – not “Robots.txt” or any other variation.
- The robots.txt file must be placed in the top-level (root) directory of a website; otherwise, crawlers can’t find it.
- Every subdomain connected to a root domain uses its own robots.txt file. In other words, “example.com” and “blog.example.com” each need their own.
- Robots.txt files aren’t impenetrable. Malware robots, email address scrapers, and other unsavory crawlers may choose to ignore your requests.
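The first two placement rules above can be made concrete: a crawler always looks for robots.txt at the root of the exact host it is visiting, including the subdomain. A small sketch (the `robots_url` helper is illustrative, not a standard function) using Python's `urllib.parse`:

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the robots.txt URL a crawler would check for this page.

    Crawlers look at the root of the exact host (subdomain included),
    never in a subdirectory.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("https://example.com/products/widget"))
# https://example.com/robots.txt
print(robots_url("https://blog.example.com/2023/post"))
# https://blog.example.com/robots.txt
```

Note how the second URL resolves to the subdomain's own robots.txt — a file at example.com/robots.txt has no effect on blog.example.com.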
Testing Your Robots.txt
This free tool from SEO.co lets you quickly and effortlessly test your robots.txt files. Simply enter the appropriate URL, followed by your first name and email address. Click the green “Check” button and we’ll let you know whether crawlers are allowed to access that URL.
Partner With SEO.co
While our free robots.txt testing tool is useful for determining which pages are crawlable and which ones disallow bots, sometimes you need more advanced assistance. For help with white label SEO, link building, content marketing, PPC management, video, or SEO audits, please contact us today!