Do you need a robots.txt file? If you run a small site, you may be under the false assumption that you don't need one. You may even be saying to yourself, "I don't need a robots.txt file; my site is small, it's simple for the search engines to find, and since I want all pages indexed anyway, why bother?" Those were my thoughts in the beginning, back when I wasn't even aware of what a robots.txt file was or what it could do for my site. So I'll try to give you a little insight into what a robots.txt file is, how to use one, why you need one, and some basic instructions on creating one.
What Is a Robots.txt File?
To begin, we need to know what a web robot is, and is not. A web robot is sometimes called a spider or a web crawler. Robots should not be confused with your normal web browser: a browser is not a web robot, because a human being manually steers it.
The main use of a robots.txt file is to give robots instructions about what they may crawl and what they should not crawl. This gives you a little more control over the robots, which means you can issue indexing instructions to specific search engines.
Do you really need a Robots.txt file?
Do you really need a robots.txt file even if you're not excluding any robots? It's a good idea. Why? First and foremost, it's an invitation to the search engines. In addition, some well-behaved bots may step away from your website if they don't find a robots.txt file at the top level of your site.
Sometimes you may want to exclude some pages from the search engine’s eye. What type of pages?
1. Pages that are still under construction
2. Directories that you would prefer not to have indexed
3. Pages you want hidden from robots whose sole purpose is to collect email addresses, or from search engines in which you do not want your website to appear
What does a Robots.txt file look like?
The robots.txt file is a simple text file, which can be created in Notepad. It needs to be saved to the root directory of your site; that is, the directory where your home page or index page is located.
To create a simple robots.txt file that allows all robots to spider your site, you can use the following:
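The standard form is a wildcard user-agent with an empty Disallow rule:

```
User-agent: *
Disallow:
```

The `*` matches every robot, and an empty Disallow value means nothing is disallowed.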
That’s it. This will allow all robots to index all your pages.
If you don’t want a specific robot to have access to any of your pages, you can do the following:
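A minimal sketch of a record that blocks one robot entirely (the name "BadBot" here is a placeholder; substitute the actual user-agent string of the robot you want to exclude):

```
User-agent: BadBot
Disallow: /
```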
Here you have to name the robot, or a specific substring of its user-agent name. And you will need the "/", because that means "all directories".
For example, let's say you do not want Googlebot to index a page called "donotenter" in a directory called "nogoprivate". In the Disallow section you would put:
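Assuming the page is an HTML file (the .html extension is an assumption; match whatever your actual file is named), the record would look like this:

```
User-agent: Googlebot
Disallow: /nogoprivate/donotenter.html
```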
Now, if it's a complete directory you do not want indexed, you would put:
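Keeping the same example, disallowing the whole directory needs only the directory path, with a slash at the start and the end:

```
User-agent: Googlebot
Disallow: /nogoprivate/
```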
With the forward slash at the beginning and at the end of the path, you tell the search engine not to index anything inside that directory.
Getting Your Code Right
If your robots.txt file is a more complex piece of code, then it's always wise to do a quick check on the syntax. There are some nice free online robots.txt checkers you can use. One such free checker is the Robots Text Tester from Search Engine Promotion (http://www.searchenginepromotionhelp.com/m/robots-text-tester/robots-checker.php), or go to ClockWatchers (http://www.clockwatchers.com/robots_main.html), which can help you create a robots.txt file as well as give you info on how to create a file to eliminate bad bots.
To conclude, a robots.txt file can help you increase the number of search engines that spider your site, which means increased traffic and better indexing. In fact, this small file also helps you control what is and is not indexed by search engines, and which search engines can spider your site. So, let me ask you now: is a robots.txt file an important asset for your website? I'm sure you have to admit that yes, it is, even for a small website.