Robots.txt Overview

Web-indexing robots can be controlled by placing a robots.txt file in the root directory of the web server with explicit instructions on where a spider is not permitted to crawl.

This method relies on robots/spiders voluntarily obeying the robots.txt file, which not every robot does. If you want additional protection from robots/spiders, consider stronger measures such as password-protecting the directories you wish to keep private.

You can define which paths are off limits to spiders. This is useful for such things as large directories of information, personal information, and parts of the website containing many recursive links, among others. Enter each disallowed path, one per line, in the box below. Each path must begin with a "/"; directory paths should also end with a "/".

Robots.txt examples:

To allow all robots/spiders to index your website
User-agent: *
Disallow:

To exclude all robots/spiders from your website
User-agent: *
Disallow: /

To exclude all robots/spiders from part of your website
User-agent: *
Disallow: /cgi-bin/
Disallow: /webpage.htm

To exclude a single robot/spider
User-agent: Googlebot
Disallow: /
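You can verify how these rules will be interpreted before deploying them. As a sketch, Python's standard-library `urllib.robotparser` module can parse the rules from the third example above and report whether a given URL may be fetched (the example.com URLs are placeholders):

```python
from urllib.robotparser import RobotFileParser

# Rules from the "exclude part of your website" example above
rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /webpage.htm
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Ordinary pages remain crawlable; the disallowed paths are blocked
print(rp.can_fetch("*", "https://example.com/index.html"))     # True
print(rp.can_fetch("*", "https://example.com/cgi-bin/script")) # False
print(rp.can_fetch("*", "https://example.com/webpage.htm"))    # False
```

Well-behaved crawlers perform this same check before requesting a page, which is why the protection is only as good as the robot's willingness to honor it.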
