camera will get even less avoidable
as life recorders become more prevalent. Once enough people regularly record video
of what they are seeing, you’ll be in enough of their video footage that it’ll no
longer matter whether or not you’re wearing one. It’s kind of like herd immunity,
but in reverse.
UBIQUITOUS SURVEILLANCE
Philosopher Jeremy Bentham conceived of his “panopticon” in the late 1700s as a way
to build cheaper prisons. His idea was a prison where every inmate could be surveilled
at any time, unawares. The inmate would have no choice but to assume that he was always
being watched, and would therefore conform. This idea has been used as a metaphor
for mass personal data collection, both on the Internet and off.
On the Internet, surveillance is ubiquitous. All of us are being watched, all the
time, and that data is being stored forever. This is what an information-age surveillance
state looks like, and it’s efficient beyond Bentham’s wildest dreams.
3
Analyzing Our Data
In 2012, the New York Times published a story on how corporations analyze our data for advertising advantages.
The article revealed that Target Corporation could determine from a woman’s buying
patterns that she was pregnant, and would use that information to send the woman ads
and coupons for baby-related items. The story included an anecdote about a Minneapolis
man who’d complained to a Target store that had sent baby-related coupons to his teenage
daughter, only to find out later that Target was right.
The general practice of amassing and saving all kinds of data is called “big data,”
and the science and engineering of extracting useful information from it is called
“data mining.” Companies like Target mine data to focus their advertising. Barack
Obama mined data extensively in his 2008 and 2012 presidential campaigns for the same
purpose. Auto companies mine the data from your car to design better cars; municipalities
mine data from roadside sensors to understand driving conditions. Our genetic data
is mined for all sorts of medical research. Companies like Facebook and Twitter mine
our data for advertising purposes, and have allowed academics to mine their data for
social research.
Most of these are secondary uses of the data. That is, they are not the reason the data was collected in the first place. In fact, that’s the basic promise of big data:
save everything you can, and someday you’ll be able to figure out some use for it
all.
Big data sets derive value, in part, from the inferences that can be made from them.
Some of these are obvious. If you have someone’s detailed location data over the course
of a year, you can infer what his favorite restaurants are. If you have the list of
people he calls and e-mails, you can infer who his friends are. If you have the list
of Internet sites he visits—or maybe a list of books he’s purchased—you can infer
his interests.
Some inferences are more subtle. A list of someone’s grocery purchases might imply
her ethnicity. Or her age and gender, and possibly religion. Or her medical history
and drinking habits. Marketers are constantly looking for patterns that indicate someone
is about to do something expensive, like get married, go on vacation, buy a home,
have a child, and so on. Police in various countries use these patterns as evidence,
either in a court or in secret. Facebook can predict race, personality, sexual orientation,
political ideology, relationship status, and drug use on the basis of Like clicks
alone. The company knows you’re engaged before you announce it, and gay before you
come out—and its postings may reveal that to other people without your knowledge or
permission. Depending on the country you live in, that could merely be a major personal
embarrassment—or it could get you killed.
There are a lot of errors in these inferences, as all of us who’ve seen