This year at DEFCON in Las Vegas, investigative journalist Svea Eckert and researcher Andreas Dewes demonstrated how easily they could deanonymize browsing datasets acquired from major browser plugin providers.
Their research resulted in a handful of significant findings:
- Ten “privacy” plugins provided the most voluminous datasets.
- The data provided is very granular.
- Contrary to popular belief, deanonymization techniques aren’t novel: most rely on pattern matching rather than complex maths.
- Using Publicly Available Information (PAI) for correlation makes deanonymization much simpler.
Let’s dig into their research [PDF] to identify the risks involved with plugin usage and techniques utilized for data exploitation. One notable finding even affected an active law enforcement investigation.
Browser Plugins as Data Providers
The security implications of using browser plugins are extensive. Risks abound, from download and installation through settings and permissions to possible monetization of user data.
One data provider identified in the research was Web of Trust (WOT), a plugin for Firefox that overlays site reputation information in the browser. To serve its purpose, WOT relies on the browser to collect which sites a user has visited.
In their description, the authors state:
This proved to be false. Web of Trust gave the research authors voluminous datasets after they posed as a marketing company, with only thin social engineering required on their part.
Eckert and Dewes note that in their study, 95% of available user browsing data originated from 10 extensions with large install bases. In addition, they point out that a significant number of extensions with smaller install bases are packaging and selling user data.
Findings: A Revealing Example
Eckert and Dewes identified a range of users through their deanonymization technique. The majority of their examples focus on relatively benign finds, such as matching anonymous data to users’ Google or Twitter profiles.
Their sample dataset included one month of information for 3 million German user IDs – in addition to real-time streaming for a 14-day period.
Looking deeper into the data, however, they were able to identify an active investigation by German law enforcement, namely the Landeskriminalamt (LKA) of one of Germany’s federal states.
The researchers noticed a user ID making a call to Google Translate. Google Translate passes the text as values in the URL string, giving the researchers full visibility into the German-language text a user entered for translation.
The English translation of the text entered by the user reads as follows [text sanitized by the researchers]:
“Ladies and Gentlemen, because of an investigation concerning computer fraud (file number), which I have dealt with here, § 113 TKG i.V.m. § 100j StPO.
I need information on following IP address: xxx.xxx.xxx.xxx Time stamp: xx.xx.2016, 10:05:31 CEST The data is needed to identify the offender.
Please send your answer by e-mail to the following address Firstname.email@example.com or by fax.
Detective Chief Place of county”
This sort of analysis can easily be replicated to surface a wide range of other high-value information.
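To illustrate why text carried in a URL is so exposed, here is a minimal sketch of recovering it from logged URLs. The host `translate.example`, the `text` query parameter, and the `#source/target/text` fragment layout are assumptions made for illustration; they are not taken from the research and may not match what any real translation service uses.

```python
from urllib.parse import urlsplit, parse_qs, unquote


def extract_translate_text(url):
    """Return user-entered text carried in a translation URL, or None."""
    parts = urlsplit(url)

    # Assumed shape 1: the text travels in a 'text' query parameter.
    query = parse_qs(parts.query)
    if "text" in query:
        return query["text"][0]  # parse_qs already percent-decodes values

    # Assumed shape 2: '#<source>/<target>/<text>' in the URL fragment.
    pieces = parts.fragment.split("/", 2)
    if len(pieces) == 3 and pieces[2]:
        return unquote(pieces[2])

    return None


# Hypothetical log entries for one anonymous user ID.
logged = [
    "https://translate.example/?sl=de&tl=en&text=Sehr%20geehrte%20Damen%20und%20Herren",
    "https://translate.example/#de/en/Aktenzeichen%20folgt",
    "https://news.example/article/42",
]
recovered = [extract_translate_text(u) for u in logged]
```

Anyone holding the raw URL log can run this kind of extraction; no access to the page content itself is needed.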
Eckert and Dewes used deanonymization techniques that ranged from simple to highly technical.
Their simplest is their “instant deanonymization” technique. The duo hunted for URLs that uniquely identify a user, such as profile pages, Twitter analytics pages, and admin panels.
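A rough sketch of this kind of pattern matching follows. The two URL patterns and the sample history are hypothetical; they stand in for whatever identifying URL shapes an analyst might look for and are not the researchers' actual rules.

```python
import re

# Hypothetical patterns: URL shapes that embed an account name.
IDENTIFYING_PATTERNS = [
    re.compile(r"^https://analytics\.twitter\.com/user/(?P<handle>[A-Za-z0-9_]+)/"),
    re.compile(r"^https://[^/]+/admin/users/(?P<handle>[A-Za-z0-9_.-]+)/edit"),
]


def instant_matches(history):
    """Map each anonymous ID to the account names exposed by its own URLs."""
    found = {}
    for anon_id, urls in history.items():
        for url in urls:
            for pattern in IDENTIFYING_PATTERNS:
                match = pattern.match(url)
                if match:
                    found.setdefault(anon_id, set()).add(match.group("handle"))
    return found


# Made-up browsing history keyed by anonymous user ID.
history = {
    "id-001": [
        "https://news.example/a",
        "https://analytics.twitter.com/user/jane_doe/home",
    ],
    "id-002": ["https://shop.example/cart"],
}
exposed = instant_matches(history)
```

A single matching URL is enough to tie the anonymous ID, and with it the rest of that ID's history, to a named account.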
Their most complex and successful analysis came from Combinatorial Deanonymization – fusing publicly available information with user browser data.
The basic idea is to use maths to measure how unique a user becomes given the set of domains they visit, and then to compare that set against publicly available data.
Using Twitter’s public API, the researchers pulled available information on URLs linked to within Tweets and correlated it with the user browsing data collected from the plugin vendors.
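The narrowing effect can be sketched with plain set intersection. This is a simplified illustration, not the researchers' implementation: it requires an exact match on every domain, and the `history` and `public_urls` data are invented.

```python
from urllib.parse import urlsplit


def domain(url):
    """Return the host part of a URL (empty string if there is none)."""
    return urlsplit(url).hostname or ""


def candidates(history, public_urls):
    """Narrow anonymous IDs to those that visited every publicly shared domain.

    history: {anonymous_id: set of visited domains}
    Returns the surviving IDs and the candidate count after each domain.
    """
    public_domains = {domain(u) for u in public_urls}
    remaining = set(history)
    sizes = []
    for d in sorted(public_domains):
        remaining = {uid for uid in remaining if d in history[uid]}
        sizes.append(len(remaining))
    return remaining, sizes


# Invented browsing data from a plugin dataset.
history = {
    "id-001": {"news.example", "blog.example", "forum.example"},
    "id-002": {"news.example", "shop.example"},
    "id-003": {"news.example", "blog.example"},
}
# Invented links that one public account shared.
public_urls = [
    "https://news.example/story",
    "https://blog.example/post/7",
    "https://forum.example/t/99",
]
matched, sizes = candidates(history, public_urls)
```

Each additional domain can only shrink the candidate set, which is why a handful of publicly shared links may be enough to single out one browsing history among millions. A real analysis would also need to tolerate noise, since people do not visit every link they share.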
Many solutions, from plugins to VPNs to anonymous protocols, promise “anonymous browsing.” Too often, they provide anything but anonymity.
The research results presented by Eckert and Dewes reinforce three basic tenets of security:
Your public profile may seem insignificant until more sophisticated analytical methods and datasets come into play. Continuously minimize how much publicly available information you willingly throw into the ether.
Your data is only as secure as the sites, services, and software you use. In this case, the use of an extension for security inadvertently introduces significant risk. Operate on the web with the assumption that any site/service you use is compromised.
Review the terms of service of your software and services: the devil is in the (fine print) details when it comes to how your user data is utilized. Often, the ToS make for a long and arduous read. Tip: CTRL + F is a handy shortcut to find telling terms like “marketing”, “anonymous”, “sell”, etc.
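For longer documents, the same search can be scripted. The sketch below is one possible approach, using the search terms suggested above and an invented sample text; the sentence splitting is deliberately naive.

```python
import re

TERMS = ["marketing", "anonymous", "sell"]


def find_terms(text, terms=TERMS):
    """Return {term: [sentences containing a word that starts with it]}."""
    # Naive split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hits = {}
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term), re.IGNORECASE)
        matched = [s for s in sentences if pattern.search(s)]
        if matched:
            hits[term] = matched
    return hits


# Invented terms-of-service excerpt.
sample = (
    "We may share anonymous usage data with partners. "
    "We never sell your password. "
    "You can contact support at any time."
)
hits = find_terms(sample)
```

Seeing the full sentence around each hit makes it easier to judge whether a term appears in a reassuring or a worrying context.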
To discuss minimizing your digital footprint, how to secure access to the web, or to learn more about Authentic8’s extensive research solutions, contact us for details. The subject matter experts on our Commercial and Federal teams will be happy to chat.
*** This is a Security Bloggers Network syndicated blog from Authentic8 Blog authored by Nicholas Espinoza. Read the original post at: https://authentic8.blog/research-deanonymizing-browser-data-made-easy/