Written by Oleksandr Gavenko (AKA gavenkoa), compiled on 2024-04-01 from rev 052223c22317.

Web site

Speeding up web site loading

robots.txt

To exclude all robots from the entire server:

User-agent: *
Disallow: /

To exclude all robots from part of the server:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/

To allow a single robot:

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /

To allow all robots complete access:

User-agent: *
Disallow:
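
To check how a set of rules will be interpreted before publishing it, the Python standard library module urllib.robotparser can evaluate them locally. The following is a minimal sketch, not a definitive test: it uses the partial-exclusion rules above, and the robot name "ExampleBot" and the host example.com are placeholders chosen for illustration. Real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the "exclude all robots from part of the server" example.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""


def can_fetch(rules: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # "ExampleBot" and example.com are hypothetical names used only for this check.
    print(can_fetch(RULES, "ExampleBot", "http://example.com/index.html"))      # True
    print(can_fetch(RULES, "ExampleBot", "http://example.com/tmp/cache.html"))  # False
```

Passing the rules as text keeps the check offline; RobotFileParser can also download a live file with set_url() followed by read().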

See:

http://www.robotstxt.org/
Describes common robots.txt practice and discusses possible standardization efforts.
http://www.robotstxt.org/robotstxt.html
About /robots.txt
http://www.robotstxt.org/faq.html
Frequently Asked Questions.
https://en.wikipedia.org/wiki/Robots_exclusion_standard
Wikipedia article on robots.txt.
http://googlewebmastercentral.blogspot.com/2008/06/improving-on-robots-exclusion-protocol.html
Improving on Robots Exclusion Protocol.

Sitemap

The Sitemaps protocol lets a webmaster inform search engines about the URLs on a website that are available for crawling.

http://www.sitemaps.org/protocol.html
Sitemap protocol.
http://en.wikipedia.org/wiki/Sitemaps
Wikipedia article.
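
A sitemap in its simplest form is an XML file with a urlset root element and one url/loc entry per page. The sketch below shows one way to generate such a file with the Python standard library. The function name build_sitemap and the example.com URLs are assumptions made for illustration; optional elements such as lastmod, changefreq and priority are left out.

```python
import xml.etree.ElementTree as ET

# Namespace required by the Sitemaps protocol for the <urlset> element.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML text listing the given absolute URLs (hypothetical helper)."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for loc in urls:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes &, < and > in the text when serializing.
        ET.SubElement(url, "loc").text = loc
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder URLs; replace with the real pages of the site.
    pages = ["http://example.com/", "http://example.com/about.html?a=1&b=2"]
    print(build_sitemap(pages))
```

The result should be saved as UTF-8, for example as sitemap.xml in the site root, since the declaration written by the helper states that encoding.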

Web document structure usage

http://dev.opera.com/articles/view/mama/
Metadata Analysis and Mining Application

Validation

Add search to your site

http://www.google.com/support/customsearch/
Custom Search Help
http://help.yahoo.com/l/uk/yahoo/search/basics/basics-13.html
Can I add a Yahoo! Search box to my site?