An interesting discovery relating to my BTOpenworld free ISP space. This site used to be the forwarding location for my domain (now moved to its own space). I now use it to host pages which simply redirect to the new site, to make sure that old Google links take people there.
I thought it would be an interesting experiment to prepare it for Webwise and then see if any Webwise visits occurred when the trials started. The setup: an appropriate Webwise cookie trap, a robots.txt that didn't forbid Google, a "body text" statement banning Phorm and all its works, and some 3rd party stats counter links to give me a logging facility (there aren't any logs available on my free ISP space). Webwise has not been given the URL for its blacklist.
I set up Google Webmaster Tools, including getting the site verified by Google, then just checked that robots.txt was working properly - I'm familiar with doing this from 2 other domains.
Then I discovered the problem - I can't actually set up a "valid" robots.txt for the ISP hosted pages, because as far as Google is concerned the only robots.txt it sees is the one at the root of the host - the ISP's own www.btinternet.com - and not one in a user's subdirectory.
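To illustrate why, here is a rough sketch in Python of how a well-behaved crawler works out where to look for robots.txt: it keeps only the scheme and host of the page and always asks for /robots.txt at the root. The function name robots_url and the page address (with its placeholder ~username and index.html) are my own illustration, not anything Google or BT publish.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt location a crawler would consult for page_url.

    Only the scheme and host are kept; the path of the page is ignored,
    so a robots.txt inside a user directory is never looked at.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page on ISP-hosted space ("~username" is a placeholder).
page = "http://www.btinternet.com/~username/index.html"
print(robots_url(page))  # http://www.btinternet.com/robots.txt
```

Whatever path the page has, the answer is the same file at the top of www.btinternet.com, which is the one the ISP controls.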
My site's robots.txt is at
www.btinternet.com/~username/robots.txt
The one Google sees is
www.btinternet.com/robots.txt which merely says
User-agent: *
Disallow: /Templates
Disallow: /virtualworlds
which seems to be the one that btinternet.com uses for all their hosted space.
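To check what those three lines actually permit, here is a minimal sketch using Python's standard urllib.robotparser module. The ~username path and file names are placeholders, Googlebot is only an example agent name, and the helper allowed is my own; this shows how a standards-following parser reads the file, not how any particular crawler is guaranteed to behave.

```python
from urllib.robotparser import RobotFileParser

# The ISP-level robots.txt exactly as quoted above.
ISP_ROBOTS = """\
User-agent: *
Disallow: /Templates
Disallow: /virtualworlds
"""


def allowed(agent: str, url: str) -> bool:
    """Report whether `agent` may fetch `url` under the ISP's robots.txt."""
    parser = RobotFileParser()
    parser.parse(ISP_ROBOTS.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical URLs; "~username" is a placeholder for a hosted user area.
user_page = "http://www.btinternet.com/~username/index.html"
template_page = "http://www.btinternet.com/Templates/page.html"

print("user page allowed:", allowed("Googlebot", user_page))
print("template page allowed:", allowed("Googlebot", template_page))
```

Under these rules only the two listed directories are closed off, so pages in a user area stay open to any robot that relies on robots.txt alone.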
So I suppose my question is - IF the "official BT/Phorm/Webwise" way of keeping Webwise out of my site is supposed to be by using robots.txt (not legally good enough I know, but let's put that on hold for a moment) - how could I do it? On my ISP hosted pages I CAN'T create a valid robots.txt that would keep Google out.
I don't actually WANT to keep Google or Phorm out of this test site by robots.txt - I want to use it as a test site - but IF I wanted to use the officially declared robots.txt method of banning Phorm/Webwise, I can't see any way of doing it.
And if the official way of doing this, recommended by my Webwise-prone ISP, is actually not possible on webspace provided by that same ISP, don't THEY have a problem?
Anyway - I'll throw this at them and see what happens and keep you posted.