Everything you NEED to know about how to...
Get Your Web Pages Indexed Faster and Control Google's Spider with Robots.txt
by Esoos Bobnar & Stephen Mahaney

One of the most critical SEO tasks is controlling the search engine spiders (like Googlebot) that crawl and index your Web site. Mastering these spiders is key to preventing duplicate-content problems and to ensuring that search engines focus on your most important pages.

Spider? Bot? Crawler?
The terms spider, crawler, bot and robot all generally refer to the same thing. Technically, a bot is any program that downloads pages off the web, while a spider is a bot that the search engines use to build their index. But you'll often hear one being used to refer to the other, and the distinction isn't especially important.

Although it may seem a bit technical, spider control is actually easier than most people think. It's simply a matter of deploying an essential tool we call the robots.txt file. Robots.txt gives spiders (a.k.a. robots) the direction they need to find your most important pages. This file ensures that a spider's time on your site is spent efficiently rather than wasted indexing obscure pages (think Privacy Policy, About Us, cgi pages, etc.) that are either problematic for spiders or unessential to searchers looking for your products or services.
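For instance, a minimal robots.txt along these lines steers spiders away from low-value pages. The paths shown here (/cgi-bin/, /privacy-policy.html, /about-us.html) are hypothetical placeholders; you would substitute the directories and pages on your own site that you don't want crawled:

    # These rules apply to all spiders (Googlebot, Bingbot, etc.)
    User-agent: *

    # Keep crawlers out of script directories and low-value pages
    Disallow: /cgi-bin/
    Disallow: /privacy-policy.html
    Disallow: /about-us.html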

Controlling Search Spiders with Robots.txt

Picture your robots.txt file as the tour guide to your site for the search engines. It provides a map that tells search engines where to find the content you want them to crawl and index.
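One simple way to hand spiders that map is the Sitemap directive, which Google and the other major engines recognize inside robots.txt. A brief sketch, assuming a hypothetical sitemap sitting at the root of www.example.com:

    # Point spiders to the full list of URLs you want crawled
    Sitemap: https://www.example.com/sitemap.xml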
