Posted on August 2, 2017 at 11:37 AM
Two German researchers got their hands on the supposedly anonymous browsing histories of more than 3 million German residents and were able to extract personal data from them.
The worst part is that they didn't obtain the data illegally, from a hacker for example, but got it legitimately from a data broker.
Svea Eckert, a journalist, and Andreas Dewes, a data scientist, presented their findings at the Def Con hacking conference in Las Vegas. They acquired a database containing 3 billion URLs from 3 million German users, spread over 9 million different sites.
The pair set up a fake marketing company with everything a real company would have: a website, a LinkedIn page for the chief executive and so on. They then presented themselves as a company that had developed an algorithm for marketing more effectively to the masses, but said that they needed a large amount of data to make it work.
Eckert said that they had to contact a lot of companies and ask for people's clickstream data. It took them longer than expected to acquire it, which they attribute to the fact that they wanted data on German citizens only. The companies often offered them data on UK or US users instead.
A data broker eventually gave them the data for free, eager to help them test their algorithm. And although the data set was supposedly anonymous, the duo easily managed to discover the identities of many users.
Some users were easier to identify than others. The easy way was to check whether a person in the data set had visited their own analytics page on Twitter. Doing so leaves a URL in the browsing record that contains the Twitter username, and that page is visible only to the account's owner. All you need to do is find the URL and you can connect the data to a person. The same method worked for the German social networking site Xing.
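To make the Twitter trick concrete, here is a minimal sketch in Python. It assumes the analytics page lives at a URL of the form `analytics.twitter.com/user/<username>/...` and that the clickstream is a list of `(anonymous_id, url)` pairs. Both the URL layout and the function names are assumptions made for illustration, not details taken from the researchers' talk.

```python
from collections import defaultdict
from urllib.parse import urlparse


def twitter_handle_from_url(url):
    """Return the username embedded in a Twitter analytics URL, or None.

    Assumed URL layout: https://analytics.twitter.com/user/<username>/...
    """
    parts = urlparse(url)
    if parts.hostname != "analytics.twitter.com":
        return None
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) >= 2 and segments[0] == "user":
        return segments[1]
    return None


def identify_users(clickstream):
    """Map each anonymous ID to the Twitter usernames found in its URLs.

    `clickstream` is an iterable of (anonymous_id, url) pairs (hypothetical format).
    """
    handles = defaultdict(set)
    for anon_id, url in clickstream:
        handle = twitter_handle_from_url(url)
        if handle:
            handles[anon_id].add(handle)
    return dict(handles)


sample = [
    ("u1", "https://analytics.twitter.com/user/example_handle/tweets"),
    ("u1", "https://example.com/news"),
    ("u2", "https://example.org/"),
]
print(identify_users(sample))
```

A single matching URL is enough to tie every other entry recorded under the same anonymous ID to a named account.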
Harder cases required the duo to examine a user's so-called virtual fingerprint: as few as 10 visited URLs can be enough to identify a person. A similar strategy was used back in 2008, when a set of ratings published by Netflix was compared with ratings on IMDb, revealing the people behind Netflix accounts. That eventually led to a closeted lesbian suing Netflix for the privacy violation.
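The fingerprint idea can be illustrated with a simple set-overlap check. The sketch below is not the researchers' actual method, which the talk summary does not spell out. It assumes you already know some URLs a specific person has visited (for example, links they shared publicly) and looks for anonymous histories that contain enough of them. The function name, data layout and `min_overlap` threshold are hypothetical.

```python
def match_fingerprint(histories, known_urls, min_overlap=10):
    """Return (anonymous_id, overlap) pairs for histories that share at least
    `min_overlap` URLs with `known_urls`, best match first.

    `histories` maps an anonymous ID to the URLs that ID visited (assumed format).
    """
    known = set(known_urls)
    matches = []
    for anon_id, visited in histories.items():
        overlap = len(known & set(visited))
        if overlap >= min_overlap:
            matches.append((anon_id, overlap))
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return matches


# Toy data with a lowered threshold so the example stays short.
histories = {
    "a": ["x1", "x2", "x3"],
    "b": ["x1", "y"],
    "c": ["z"],
}
print(match_fingerprint(histories, ["x1", "x2", "x3"], min_overlap=2))
```

If exactly one anonymous ID clears the threshold, the known person and that ID are very likely the same, because few people share the same combination of visited pages.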
Google Translate is another source of data on people, because the site stores the text typed into it in the URL. This way, the researchers managed to discover operational details of a German cybercrime investigation, because one of the investigators had used the site to translate requests for foreign police forces.
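As a rough illustration of why a Google Translate URL leaks what was typed, the Python sketch below pulls the text back out of a recorded URL. The two URL layouts it handles (a `text` query parameter, and a `#source/target/text` fragment) are assumptions about how the site encodes input, and the function name is made up for this example.

```python
from urllib.parse import urlparse, parse_qs, unquote


def translated_text_from_url(url):
    """Return the text embedded in a Google Translate URL, or None.

    Assumed layouts:
      https://translate.google.com/?sl=de&tl=en&text=<text>
      https://translate.google.com/#de/en/<text>
    """
    parts = urlparse(url)
    if parts.hostname != "translate.google.com":
        return None
    query = parse_qs(parts.query)  # parse_qs already percent-decodes values
    if "text" in query:
        return query["text"][0]
    # Fragment form: source language, target language, then the text itself.
    pieces = parts.fragment.split("/", 2)
    if len(pieces) == 3 and pieces[2]:
        return unquote(pieces[2])
    return None


print(translated_text_from_url(
    "https://translate.google.com/?sl=de&tl=en&text=Guten%20Morgen"
))
```

Anyone holding the clickstream can run this kind of extraction over every recorded URL and read the original text in full.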