Quote:
Originally Posted by pseudonym
If I was designing a MITM system, I'd be very tempted to capture a hash of the user's cookie whenever they are updated for those sites, that way I'd have a good chance of IDing them next time they visit the site from their initial request just by comparing the cookie hash.
That certainly has potential. However, I think it would be very tricky to apply as a general technique to all websites. Cookies can be set by JavaScript (and Java), so you can’t rely on the Set-Cookie headers from the server alone. If you were to look at the Cookie headers sent by the browser instead, you wouldn’t have information on the path, domain or, most importantly, the expiry time.
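To illustrate the difference between the two headers, here is a minimal sketch using Python’s standard http.cookies module. The cookie name, value and dates are invented for the example; only the shape of the headers matters.

```python
from http.cookies import SimpleCookie

# Hypothetical response header from a server (values invented for illustration).
set_cookie = ("session=abc123; Domain=.example.com; Path=/; "
              "Expires=Wed, 09 Jun 2027 10:18:14 GMT")
# What the browser later sends back for the same cookie: name and value only.
cookie = "session=abc123"

from_server = SimpleCookie()
from_server.load(set_cookie)
from_browser = SimpleCookie()
from_browser.load(cookie)

# Keep only the attributes that were actually present in each header.
server_attrs = {k: v for k, v in from_server["session"].items() if v}
browser_attrs = {k: v for k, v in from_browser["session"].items() if v}

print("Set-Cookie attributes:", server_attrs)
print("Cookie attributes:    ", browser_attrs)
```

The request-side view comes back with no attributes at all, which is why an observer watching only Cookie headers cannot tell a session cookie from a long-lived one.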
If you’re not going to store raw information, but only one-way hash values, even using Set-Cookie headers has limitations.
- Cookies are assigned not just to a particular domain, but to a particular path within each domain. I don’t usually allow cookies to be set, but I went around collecting a representative sample, and they all had path=/. So, in practice, you could probably recognise the same user on any page of a website using the cookie hash taken from any other page.
- Cookies have an expiry time. If you wanted to recognise the same user both within the same browsing session and in a later one, you would need two different hash values: one covering all cookies and one covering only long-lived cookies. A cookie whose expiry time is in the very near future could be treated the same as a session cookie.
- Cookies for a domain can be set by more than one website. For example, site1.example.com and site2.example.com can both set cookies for the shared domain .example.com. So a request to site1.example.com could carry cookies set by site2.example.com, and vice versa.
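The two-hash idea from the expiry point could look roughly like the sketch below. It is only an illustration under several assumptions of mine: the function names, the one-hour cut-off for "very near future", the choice of SHA-256, and ignoring path and domain entirely (on the strength of the path=/ observation above).

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from http.cookies import SimpleCookie

# Assumption: cookies expiring within this window count as session cookies.
SHORT_LIVED = timedelta(hours=1)

def is_long_lived(morsel, now):
    """True if the cookie carries an expiry comfortably in the future."""
    if morsel["max-age"]:
        try:
            return timedelta(seconds=int(morsel["max-age"])) > SHORT_LIVED
        except ValueError:
            return False
    if morsel["expires"]:
        try:
            expires = parsedate_to_datetime(morsel["expires"])
        except (TypeError, ValueError):
            return False
        if expires.tzinfo is None:
            expires = expires.replace(tzinfo=timezone.utc)
        return expires - now > SHORT_LIVED
    return False  # no expiry attribute at all: a session cookie

def digest(pairs):
    """One-way hash over the sorted name=value pairs."""
    return hashlib.sha256("; ".join(sorted(pairs)).encode()).hexdigest()

def cookie_hashes(set_cookie_headers, now):
    """Return (hash of all cookies, hash of long-lived cookies only)."""
    all_pairs, long_pairs = [], []
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)
        for name, morsel in jar.items():
            pair = f"{name}={morsel.value}"
            all_pairs.append(pair)
            if is_long_lived(morsel, now):
                long_pairs.append(pair)
    return digest(all_pairs), digest(long_pairs)
```

The first hash would only match within one browsing session, while the second should survive a browser restart as long as the long-lived cookies are unchanged. Neither accounts for cookies set by JavaScript or by a sibling site on the same domain, which is the limitation described above.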
You could try to hash the cookie header sent with the request for the final object within a page that’s stored within the same domain as the page itself. However, it might be better to teach the system which cookies are important. If an international user clicks on, say, http://news.bbc.co.uk/2/hi/africa/7470304.stm, you’ll see something like:
GET /adj/bbccom.live.site.news/news_africa_content;... HTTP/1.1
Host: ad.doubleclick.net
Referer: http://news.bbc.co.uk/2/hi/africa/7470304.stm
Cookie: id=80000282f0e0ca4
If the user got to that BBC page by clicking through one of your doctored search pages, then even if DoubleClick aren’t one of your advertising networks, you can now link that user’s DoubleClick identifier to your own identifier for that user. You now get to track them across all websites that use DoubleClick, which I believe is a lot.
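To make clear how little is needed for that link, here is a rough sketch that pulls the relevant fields out of the request shown above. The function names and the returned dictionary layout are my own invention, and the elided part of the request line is left exactly as captured.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlsplit

# The request from the example above, with the elided path left as-is.
RAW_REQUEST = (
    "GET /adj/bbccom.live.site.news/news_africa_content;... HTTP/1.1\r\n"
    "Host: ad.doubleclick.net\r\n"
    "Referer: http://news.bbc.co.uk/2/hi/africa/7470304.stm\r\n"
    "Cookie: id=80000282f0e0ca4\r\n"
    "\r\n"
)

def parse_headers(raw):
    """Very small header parser: skips the request line, stops at the blank line."""
    headers = {}
    for line in raw.split("\r\n")[1:]:
        if not line:
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers

def third_party_link(raw):
    """Collect the third-party identifier and the page that triggered the request."""
    headers = parse_headers(raw)
    jar = SimpleCookie()
    jar.load(headers.get("cookie", ""))
    referer = headers.get("referer", "")
    return {
        "tracker_host": headers.get("host"),
        "tracker_id": jar["id"].value if "id" in jar else None,
        "visited_page": referer or None,
        "visited_host": urlsplit(referer).hostname,
    }

link = third_party_link(RAW_REQUEST)
print(link)
```

The Referer header names the page being read and the Cookie header carries the third-party identifier, so a single observed request is enough to associate the two.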