Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]
View Poll Results: Will you be opting out of the Virgin Ad Deal?
Yes, Definitely. 958 95.51%
No, I am quite happy to share my surfing habits with anyone. 45 4.49%
Voters: 1003.

Old 13-05-2008, 03:06   #6421
Paul Delaney
Guest
 
Posts: n/a
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Quote:
Do you think he gives a ****

The man can get more $1000 whores than you can shake a stick at. And that's what it's all about.

Yes, I think he does. This is a battle of hearts and minds and it's being fought on our terms on forums like this one. We are thorns in his side, and there are plenty of laws he's yet to break. Don't worry, we'll complain and make a song and dance about every single one until in the end he'll ***** off and dream up some other scheme in some country where there aren't so many laws to break...


 
Old 13-05-2008, 03:27   #6422
pip08456
Sad Doig Fan!
 
 
Join Date: Aug 2007
Location: Barry South Wales
Age: 68
Services: With VM for BB 250Mb service.(Deal)
Posts: 11,685
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Haven't read all the thread as it's too long, so this may have been mentioned before. If you use Firefox as your browser, download the Dephormation add-on. Just Google for it.
Old 13-05-2008, 03:56   #6423
labougie
Inactive
 
Join Date: Mar 2008
Posts: 44
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Quote:
until in the end he'll ***** off and dream up some other scheme in some country where there aren't so many laws to break...
Global village
Old 13-05-2008, 07:20   #6424
jelv
Inactive
 
Join Date: Apr 2008
Posts: 128
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Quote:
Originally Posted by AlexanderHanff View Post
Re: Robots.txt
Phorm claimed at the PIA Public Meeting that before they push a request through for a GET from a user to a website they will visit the document root for the domain to see if there is a robots.txt which allows Google access; if there is they will profile the pages the user requests. There is no indication (in fact they refused to tell us) what the user-agent will be for this robots.txt request ... <snip>
Alex: I have had confirmation from Emma Sanderson at BT that they will be checking robots.txt for googlebot - see quote from email in post here: http://www.cableforum.co.uk/board/34...-post6398.html
Old 13-05-2008, 07:33   #6425
Rchivist
Inactive
 
Join Date: Apr 2008
Posts: 831
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Quote:
Originally Posted by popper View Post
hmm , i find this rather odd, the BBC will respond and defend the DrWho trademark and copyright, but so far, not a peep about the long standing BBC "webwise" trademark and copyright....

http://www.openrightsgroup.org/2008/...-from-the-web/

"...
“We note that you are supplying DR WHO items, and using trade marks and copyright owned by BBC.

You have not been given permission to use the DR WHO brand and we ask that you remove from your site any designs connected with DR WHO.

Please reply acknowledging receipt of this email, and confirm that you will remove the DR WHO items as requested.”
...
"
They responded to me a month or so ago, and they don't have Webwise copyrighted.

---------- Post added at 07:33 ---------- Previous post was at 07:27 ----------

Quote:
Originally Posted by AlexanderHanff View Post
They only specified Google but I presume they were just being flippant. I expect if there is anything at all representing permission to spider to anyone, they will use that as implied consent.

Alexander Hanff
I'm waiting for a BT answer to that specific point, following up their reply to someone else (which I have seen), where it did appear from their reply that they were talking about Google. The reply was specific enough to make me think in terms of contacting Google if I get similar confirmation, because it could mean webmasters singling out Google for a disallow while allowing other search engines.

It's possible that the spokesperson was simply confused, as they don't get on well with very specific questions about how Phorm/Webwise works - but the answer seemed clear enough even if it was wrong.
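
To make the "disallow Google, allow everyone else" scenario concrete, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt contents, the sample crawler names and the allowed() helper are illustrative assumptions; nothing here says how Phorm/Webwise would actually read the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot is shut out, every other crawler is allowed.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(robots_txt, user_agent, url="/index.html"):
    """Return True if a robot using this name token may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Sample name tokens only; any crawler other than Googlebot behaves the same.
    for agent in ("Googlebot", "Slurp", "msnbot"):
        print(agent, allowed(ROBOTS_TXT, agent))
```

If Webwise really does key off the Googlebot rule, a file like this would shut out both, which is exactly the side effect for Google described above.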
Old 13-05-2008, 07:38   #6426
jelv
Inactive
 
Join Date: Apr 2008
Posts: 128
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Quote:
Originally Posted by R Jones View Post
It's possible that the spokesperson was simply confused, as they don't get on well with very specific questions about how Phorm/Webwise works - but the answer seemed clear enough even if it was wrong.
I got an email from Emma on Friday saying:

Quote:
Apologies but the contact I needed to try and talk to re your additional second question was unavailable today. Please rest assured I have not forgotten that I owe you a response, I will follow up on Monday and then provide a response to this and your original email.
I think it is pretty clear that it is specifically googlebot.
Old 13-05-2008, 09:14   #6427
rryles
Inactive
 
Join Date: May 2008
Posts: 147
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Quote:
Originally Posted by AlexanderHanff View Post
Re: Robots.txt
Phorm claimed at the PIA Public Meeting that before they push a request through for a GET from a user to a website they will visit the document root for the domain to see if there is a robots.txt which allows Google access; if there is they will profile the pages the user requests. There is no indication (in fact they refused to tell us) what the user-agent will be for this robots.txt request but the user-agent for the user's GET requests will (I expect, although this has not been clarified either) be unchanged from the user's normal user-agent.
They've said that robots.txt will be cached* and not fetched for every phormed user. So it seems unlikely to me that they would pick a random user and forge her user agent string.

Also: If they by some miracle actually followed the robots.txt standard then the user-agent they match against and the one they send in the HTTP headers must match:

"The name token a robot chooses for itself should be sent
as part of the HTTP User-agent header, and must be well documented."**

Sources:

* From http://www.cl.cam.ac.uk/~rnc1/080404phorm.pdf "40. Once the robots.txt file (if any) has been fetched, it will be cached. The cache retention period will be value set by the website using standard HTTP cache-control mechanisms, or for one month if no period is specified. The minimum period that the file will be cached for is two hours."

** From http://www.robotstxt.org/norobots-rfc.txt
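
The caching rule quoted in the first source can be sketched in a few lines. This is only a reading of paragraph 40, not Phorm's code: it assumes "standard HTTP cache-control mechanisms" means the max-age directive, takes "one month" as 30 days, and ignores Expires, no-cache and the like. The function name retention_seconds is made up for the example.

```python
import re

ONE_MONTH = 30 * 24 * 3600  # assumption: "one month" taken as 30 days
TWO_HOURS = 2 * 3600        # minimum retention stated in the quoted paragraph


def retention_seconds(cache_control):
    """Seconds a fetched robots.txt would stay cached under the quoted rule.

    cache_control is the raw Cache-Control header value, or None if absent.
    """
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control or "", re.IGNORECASE)
    requested = int(match.group(1)) if match else ONE_MONTH
    # The site's own period is used, but never less than two hours.
    return max(requested, TWO_HOURS)


if __name__ == "__main__":
    for header in (None, "max-age=60", "public, max-age=86400"):
        print(repr(header), retention_seconds(header))
```

On this reading a webmaster who changes robots.txt could wait anything from two hours to a month before the change is noticed, depending on the headers the site sends.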
Old 13-05-2008, 09:35   #6428
AlexanderHanff
Permanently Banned
 
Join Date: Mar 2008
Posts: 1,028
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Quote:
Originally Posted by rryles View Post
They've said that robots.txt will be cached* and not fetched for every phormed user. So it seems unlikely to me that they would pick a random user and forge her user agent string.

Also: If they by some miracle actually followed the robots.txt standard then the user-agent they match against and the one they send in the HTTP headers must match:

"The name token a robot chooses for itself should be sent
as part of the HTTP User-agent header, and must be well documented."**

Sources:

* From http://www.cl.cam.ac.uk/~rnc1/080404phorm.pdf "40. Once the robots.txt file (if any) has been fetched, it will be cached. The cache retention period will be value set by the website using standard HTTP cache-control mechanisms, or for one month if no period is specified. The minimum period that the file will be cached for is two hours."

** From http://www.robotstxt.org/norobots-rfc.txt
You misunderstood me, I think. I was trying to explain that the system will consist of 2 stages. When you send out a web request to a web site, Phorm (not you) will go off and look for robots.txt (providing it is not already cached) to check if search engines are allowed to spider. This stage is the one where they refuse to tell us what user-agent they will use.

Then the second stage is them actually forwarding your original request (yes, there are some redirects and stuff going on in between, but let's try and keep it simple) where we can only assume your real user-agent will be used. Certainly there has been no indication from Phorm that they will be using a different user-agent for these requests (and realistically they wouldn't want to, as they could then be easily identified and blocked).

Alexander Hanff
Old 13-05-2008, 09:46   #6429
davews
Inactive
 
Join Date: May 2008
Location: Bracknell
Posts: 34
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

As Robert has already pointed out, using robots.txt as permission to profile is hopeless. Many sites do not use robots.txt, and others, like those of us with ISP-hosted sites, are unable to set one up anyway. If you are quite happy for Google to scan your site and therefore have no specific reason to include a robots.txt, then you won't bother. I imagine there are countless sites out there, including some significant ones, whose owners have never bothered with it, or in some cases don't even know what it is.

It would be nice to suggest that if robots.txt were missing then Phorm would not profile that site, but I guess that would be asking too much.....

On a similar theme, Phorm have never made clear if they look at and use the <meta> tags in sites. I would feel tempted to include loads of words in those to totally upset the meaning of any profiling information, after all you only need to ensure they are the ten most used words on the site....
Old 13-05-2008, 09:55   #6430
tarka
Inactive
 
Join Date: May 2008
Posts: 86
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Quote:
Originally Posted by AlexanderHanff View Post
When you send out a web request to a web site, Phorm (not you) will go off and look for robots.txt (providing it is not already cached) to check if search engines are allowed to spider. This stage is the one where they refuse to tell us what user-agent they will use.
This has just given me an idea, although not exactly a straightforward proposition.

If/When the BT trial happens, it's not going to be that difficult to work out what that user agent will be, just visit your own website (assuming you are phormed) then check your web logs.

Now assuming they don't forge a googlebot user agent and do use their own unique user agent, then it should be fairly simple to configure a web server to parse robots.txt as a script (I am sure I could set this up easily with apache/php) and serve different content based on the user agent. If it's a phorm user agent then deny the entire site, if not then serve your usual robots.txt.
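
A rough sketch of that "robots.txt as a script" idea, written as a small Python WSGI app rather than the Apache/PHP setup mentioned above. Everything specific is an assumption: the "phorm" token is a placeholder because no Phorm user agent has been disclosed, and robots_for() and app() are names made up for the example.

```python
# Placeholder token: the real Phorm/Webwise user agent (if any) is unknown.
PHORM_TOKEN = "phorm"

USUAL_ROBOTS = "User-agent: *\nDisallow:\n"   # the site's normal, permissive file
DENY_ALL = "User-agent: *\nDisallow: /\n"     # what the unwanted robot gets


def robots_for(user_agent):
    """Pick the robots.txt body to serve for a given User-Agent header."""
    if PHORM_TOKEN in (user_agent or "").lower():
        return DENY_ALL
    return USUAL_ROBOTS


def app(environ, start_response):
    """Minimal WSGI app that answers /robots.txt and nothing else."""
    if environ.get("PATH_INFO") != "/robots.txt":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found\n"]
    body = robots_for(environ.get("HTTP_USER_AGENT")).encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server
    make_server("127.0.0.1", 8000, app).serve_forever()
```

This only helps if the robots.txt fetch carries a distinguishable user agent and the robot then honours the file it is served.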

Although this still doesn't get round the implied consent/default opt-in issue for webmasters/content authors, it's something to think about.

Regards...

T
Old 13-05-2008, 09:57   #6431
Rchivist
Inactive
 
Join Date: Apr 2008
Posts: 831
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

I've posted the following on Google Webmaster tools forum
http://groups.google.com/group/Googl...90386b9ad852d0

and await the response. If anyone has a more direct route to Google management they are allowed to give me, I would be grateful - but the forum pigeons seemed the best ones to let the cat loose amongst.

Quote

As a Webmaster I am concerned about what my UK ISP (BTYahoo!/BT
Broadband) have told me about their plans to implement Webwise, a
technology patented by former spyware company 121Media, now known as
Phorm.Inc.
121 Media were formerly responsible for PeopleonPage, and for placing
difficult to remove rootkits on people's computers.

This technology uses Layer 7 interception of a user's complete HTTP
traffic to profile/mirror their browsing behaviour, and then use the
information to serve up targeted ads, based on an anonymised, cookie-
based UID placed on the user's computer. It also involves the forging
of a cookie, purporting to come from the website visited, even if that
website has a privacy policy that says it does not set cookies.

It is similar but not identical to the US company NebuAd technology.

There has been relatively little debate about the issues this
technology raises for webmasters.

BT have stated that Webwise/Phorm will assume implied consent of
webmasters, to profile copyrighted web content, copy it, and exploit
it for commercial gain IF THE WEBMASTER CONSENTS TO A GOOGLE SPIDER
visiting their site.

They are equating their deep level Layer 7 intrusive interception
technology with that of the Google search engine. This may even lead
to confusion in people's minds between Webwise and Google, and they
may think that Google is in some way linked to Phorm/Webwise.
They are refusing to give webmasters a way of excluding Webwise
specifically using robots.txt - instead they are saying if we let
Google in, we let Webwise in. They have specifically repeatedly named
Google as the search engine robots.txt directive they will be looking
for in order to establish what they claim will be implied consent for
Webwise on the part of Webmasters.

Of course the major search engines allow and even assist webmasters to
exclude their robot from spidering sites by the use of specific user
agent strings. Webwise will neither set a user agent string, nor
permit itself to be specifically excluded via robots.txt.

Google need to be aware of this as it means that one possible step a
webmaster might take is to allow all other search engines to crawl
their site, but exclude Google. That way, they can exclude Webwise,
because they have excluded Google.

I would imagine Google would not be happy about this as it has the
potential to adversely impact their business model, by linking access
to Google robots with Layer 7 interception by Phorm/Webwise.

I am happy to provide further information to Google on this, including
details of communications received from my ISP, if someone from
Google contacts me by email. A good starting point for information is
here
http://www.inphormationdesk.org/

followed by a text version of my disposable email address for google

End Quote
Old 13-05-2008, 10:00   #6432
rryles
Inactive
 
Join Date: May 2008
Posts: 147
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Quote:
Originally Posted by AlexanderHanff View Post
You misunderstood me, I think. I was trying to explain that the system will consist of 2 stages. When you send out a web request to a web site, Phorm (not you) will go off and look for robots.txt (providing it is not already cached) to check if search engines are allowed to spider. This stage is the one where they refuse to tell us what user-agent they will use.

Then the second stage is them actually forwarding your original request (yes, there are some redirects and stuff going on in between, but let's try and keep it simple) where we can only assume your real user-agent will be used. Certainly there has been no indication from Phorm that they will be using a different user-agent for these requests (and realistically they wouldn't want to, as they could then be easily identified and blocked).

Alexander Hanff
Yes, a slight misunderstanding. I thought we were just talking about the fetching of robots.txt as your paragraph that I quoted was titled "Re: Robots.txt". Back to the point anyway

I think you're probably right about them spoofing the user agent for the "second stage". Otherwise they couldn't be sure they were being served the same content. Many sites tailor content based on user-agent strings.

As for the robots.txt fetch they have a dilemma. Either:

They completely emulate googlebot's behaviour which may risk litigation from Google.

or:

They do something that differentiates them from googlebot and allows them to be denied. (e.g. the user-agent string they send in the HTTP headers is different so we serve a different robots.txt - not the easiest solution for a webmaster to implement and impossible unless you've got a proper hosting solution)

If they really wanted to create a new "gold standard" for user privacy then they would be a lot more open on these details.

"The name token a robot chooses for itself should be sent as part of the HTTP User-agent header, and must be well documented."
Old 13-05-2008, 10:02   #6433
AlexanderHanff
Permanently Banned
 
Join Date: Mar 2008
Posts: 1,028
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Quote:
Originally Posted by tarka View Post
This has just given me an idea, although not exactly a straightforward proposition.

If/When the BT trial happens, it's not going to be that difficult to work out what that user agent will be, just visit your own website (assuming you are phormed) then check your web logs.

Now assuming they don't forge a googlebot user agent and do use their own unique user agent, then it should be fairly simple to configure a web server to parse robots.txt as a script (I am sure I could set this up easily with apache/php) and serve different content based on the user agent. If it's a phorm user agent then deny the entire site, if not then serve your usual robots.txt.

Although this still doesn't get round the implied consent/default opt-in issue for webmasters/content authors, it's something to think about.

Regards...

T
I don't for one minute think Phorm would honour robots.txt if it explicitly denies them access. This is exactly why they won't tell us what user-agent they plan to use because they don't want to be denied access.

Let's not forget that robots.txt is not an access control mechanism, it is an honour based system which robots can either adhere to or ignore, it doesn't physically stop them accessing pages.

If their user-agent ever does get discovered, it would be useful to just add a script to your site which checks the user-agent and, if the Phorm user-agent is detected, builds a page which says something like "Get your hands off me you dirty ape!" or "Phorm is not welcome here, please go away." etc etc etc.
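
As a sketch of what such a script could look like, here is a small Python WSGI wrapper that answers with the "not welcome" page when a given token appears in the User-Agent header. The token "phorm" is purely a placeholder, since no Phorm user agent has been disclosed, and block_phorm(), site() and guarded are names invented for the example.

```python
# Placeholder token: stands in for a Phorm user agent that has not been disclosed.
BLOCKED_TOKEN = "phorm"
MESSAGE = b"Phorm is not welcome here, please go away.\n"


def block_phorm(app):
    """Wrap a WSGI app so requests from the blocked agent get a refusal page."""
    def wrapped(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if BLOCKED_TOKEN in agent:
            start_response("403 Forbidden",
                           [("Content-Type", "text/plain"),
                            ("Content-Length", str(len(MESSAGE)))])
            return [MESSAGE]
        # Everyone else is passed through to the real site untouched.
        return app(environ, start_response)
    return wrapped


def site(environ, start_response):
    """Stand-in for the real website."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, human visitor.\n"]


guarded = block_phorm(site)
```

Unlike robots.txt this does refuse the request itself, but it can only catch requests that actually carry the blocked token; page requests forwarded with the visitor's own user-agent would pass straight through.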

Alexander Hanff
Old 13-05-2008, 10:02   #6434
Rchivist
Inactive
 
Join Date: Apr 2008
Posts: 831
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Quote:
Originally Posted by tarka View Post
This has just given me an idea, although not exactly a straightforward proposition.

If/When the BT trial happens, it's not going to be that difficult to work out what that user agent will be, just visit your own website (assuming you are phormed) then check your web logs.

Now assuming they don't forge a googlebot user agent and do use their own unique user agent, then it should be fairly simple to configure a web server to parse robots.txt as a script (I am sure I could set this up easily with apache/php) and serve different content based on the user agent. If it's a phorm user agent then deny the entire site, if not then serve your usual robots.txt.

Although this still doesn't get round the implied consent/default opt-in issue for webmasters/content authors, it's something to think about.

Regards...

T
That's assuming that there IS a legitimate Phorm/Webwise user agent. My personal view is that there won't be one, based on analysing the silences and fudges from my ISP. What they DON'T say is far more revealing than what they DO say, which is why I keep asking them awkward questions: to find out which ones they don't answer.
Old 13-05-2008, 10:13   #6435
tarka
Inactive
 
Join Date: May 2008
Posts: 86
Re: Virgin Media Phorm Webwise Adverts [Updated: See Post No. 1, 77, 102 & 797]

Quote:
Originally Posted by AlexanderHanff View Post
I don't for one minute think Phorm would honour robots.txt if it explicitly denies them access. This is exactly why they won't tell us what user-agent they plan to use because they don't want to be denied access.

Let's not forget that robots.txt is not an access control mechanism, it is an honour based system which robots can either adhere to or ignore, it doesn't physically stop them accessing pages.

If their user-agent ever does get discovered, it would be useful to just add a script to your site which checks the user-agent and, if the Phorm user-agent is detected, builds a page which says something like "Get your hands off me you dirty ape!" or "Phorm is not welcome here, please go away." etc etc etc.

Alexander Hanff
I agree that this still relies on them honouring the robots.txt they get served, I am just going by what they have said so far and trying to come up with something that will block phorm but allow google. I know we shouldn't have to resort to this, I was just throwing the idea out there.

My suggestion was just for the request for robots.txt. For any other page (e.g. stage two, as you put it) I believe they just pass on the end user's user agent, which is useless in this situation.

Regards...

T

---------- Post added at 10:13 ---------- Previous post was at 10:08 ----------

Quote:
Originally Posted by R Jones View Post
That's assuming that there IS a legitimate Phorm/Webwise user agent. My personal view is that there won't be one, based on analysing the silences and fudges from my ISP. What they DON'T say is far more revealing than what they DO say, which is why I keep asking them awkward questions: to find out which ones they don't answer.
I did say it relies on them having a unique user agent but, as Alexander puts it, if they do not have a cached version of robots.txt they will make a request for one. This request, I would imagine, originates from the Phorm equipment and not the end user, so do they forge the user's user-agent, or Google's, or use their own? This is the key question, as you already know, but if/when the trial goes live it should be answered very quickly by looking at your own web server logs.
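
For anyone wanting to do that log check, here is a minimal sketch that tallies the user agents seen requesting /robots.txt. It assumes the server writes the common Apache "combined" log format; the sample lines, addresses and agent names are invented for illustration, and robots_agents() is just a name for the example.

```python
import re
from collections import Counter

# Matches the request, status, size, referer and user-agent fields of a
# "combined" format access log line (an assumption about the server's config).
LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Invented sample lines, not real traffic.
SAMPLE_LOG = [
    '192.0.2.1 - - [13/May/2008:10:00:00 +0100] "GET /robots.txt HTTP/1.1" '
    '200 68 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.7 - - [13/May/2008:10:00:05 +0100] "GET /robots.txt HTTP/1.1" '
    '200 68 "-" "ExampleUnknownAgent/0.1"',
    '192.0.2.9 - - [13/May/2008:10:00:06 +0100] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0"',
]


def robots_agents(lines):
    """Count user agents that requested /robots.txt in the given log lines."""
    counts = Counter()
    for line in lines:
        match = LINE.search(line)
        if match and match.group("path").split("?")[0] == "/robots.txt":
            counts[match.group("agent")] += 1
    return counts


if __name__ == "__main__":
    for agent, hits in robots_agents(SAMPLE_LOG).most_common():
        print(hits, agent)
```

Any agent in that list that is not a known search engine, and that shows up just before your own page request, would be the one to look at more closely.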
All times are GMT +1.