How A Cancer Researcher Could Steal Your Identity

Jan 29, 2015 at 2:13 PM ET

Scientists found that four pieces of personal data (seemingly innocent things, like where you ate lunch last week or how much you paid for your new hat) are all it takes to steal your identity. Although we often think of identity theft as a hacker’s game, a new study shows that even the public datasets used by scientists and pollsters are highly vulnerable to compromise.

After examining 1.1 million credit card records, the researchers found that they were able to reconstruct 90 percent of shoppers’ (supposedly anonymous) identities based solely on factors like where they had shopped and how much they had paid.

Data stripped of names seems safe enough. But if you know that your friend went to the bakery on 9/23 and the restaurant on 9/24, you can pick out his record (and see where he’s been since) by deduction.
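
That kind of deduction can be sketched in a few lines of Python. This is a toy illustration only: the records are made up, the pseudonymous ids are invented, and the helper `matching_users` is hypothetical rather than the method used in the study.

```python
# Made-up "anonymous" purchase log: (pseudonymous id, shop, date).
# None of this is data from the study; it only illustrates the idea.
records = [
    ("user_a", "bakery", "9/23"),
    ("user_a", "restaurant", "9/24"),
    ("user_a", "pharmacy", "9/27"),
    ("user_b", "bakery", "9/23"),
    ("user_b", "cinema", "9/24"),
    ("user_c", "restaurant", "9/24"),
    ("user_c", "bookstore", "9/25"),
]


def matching_users(records, known_points):
    """Return the ids whose history contains every known (shop, date) point."""
    visits = {}
    for user, shop, date in records:
        visits.setdefault(user, set()).add((shop, date))
    wanted = set(known_points)
    return sorted(user for user, points in visits.items() if wanted <= points)


# What an outsider happens to know about a friend's week.
one_point = matching_users(records, [("bakery", "9/23")])
two_points = matching_users(
    records, [("bakery", "9/23"), ("restaurant", "9/24")]
)

# Once a single id remains, the rest of that person's history is exposed.
exposed = [(shop, date) for user, shop, date in records if user in two_points]
```

In this toy log, one known purchase narrows the field to two candidates, and a second leaves only `user_a`, whose pharmacy visit on 9/27 is then visible too.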

Scientists (and journalists) routinely use large anonymous data sets to learn about the population. Cancer research, for instance, would be virtually impossible without access to reams of anonymous medical records. But if those anonymous records aren’t quite anonymous, that means our credit card information, browsing history and even phone records could be compromised. From the paper:

Like credit card and mobile phone metadata, Web browsing or transportation data sets are generated as side effects of human interaction with technology [and] are subjected to the same idiosyncrasies of human behavior. …These data can probably be relatively easily re-identified.

Although removing names and addresses from public data is clearly not enough to protect our identities, the researchers acknowledge that once you remove all possible identifiers there simply isn’t much data left to study. Finding that happy medium, they say, is necessary to both encourage big data analyses and protect our identities.