Quote:
Originally Posted by jelv
If (unlikely as it is) they do obey the robots.txt rules, we need a robots.txt file putting together which lists all the known valid agents and bars everything else with *
Well this is my robots.txt file:
Code:
### BEGIN FILE ###
#
# robots.txt
#
# 01/05/2008
#
#
# Allow Specified Only
#
#
# The use of robots or other automated means to access the site
# without the express permission of the web master is strictly
# prohibited. Notwithstanding the foregoing, the web master may
# permit automated access to access certain pages but soley for the
# limited purpose of including content in publicly available search
# engines. Any other use of robots or failure to obey the robots
# exclusion standards set forth at:
# http://www.robotstxt.org/wc/exclusion.html is strictly prohibited.
#
# v1
#
User-agent: Googlebot
Disallow:

User-agent: FreeFind
Disallow:

User-agent: ia_archiver
Disallow:

User-agent: *
Disallow: /

Sitemap: http://www.dhea.org.uk/sitemap.xml
### END FILE ###
Comments on it!
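For anyone who wants to sanity-check an allow-list like this before deploying it, here's a quick sketch using Python's standard-library urllib.robotparser (the bot name "EvilScraper" is just a made-up example of an unlisted agent):

```python
from urllib import robotparser

# The allow-list style robots.txt from the post, pasted in for local testing:
# named agents are allowed everything, all other agents are barred by the
# final "User-agent: *" group.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: FreeFind
Disallow:

User-agent: ia_archiver
Disallow:

User-agent: *
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Named agents may fetch anything; everyone else hits the catch-all ban.
print(rp.can_fetch("Googlebot", "http://www.dhea.org.uk/index.html"))    # True
print(rp.can_fetch("ia_archiver", "http://www.dhea.org.uk/index.html"))  # True
print(rp.can_fetch("EvilScraper", "http://www.dhea.org.uk/index.html"))  # False
```

One caveat: Python's parser does loose substring matching on agent names, but real crawlers match their own product token, so use the exact token ("Googlebot", not "Google") in the file itself.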