tested with new technologies,
and widespread doping was detected.
The NSA stores a lot of historical data, which I’ll talk about more in Chapter 5.
We know that in 2008 a database called XKEYSCORE routinely held voice and e-mail content
for just three days, but it held metadata for a month. One called MARINA holds a year’s
worth of people’s browsing history. Another NSA database, MYSTIC, was able to store
recordings of all the phone conversations for Bermuda. The NSA stores telephone metadata for five years.
These storage limits pertain to the raw trove of all data gathered. If an NSA analyst
touches something in the database, the agency saves it for much longer. If your data
is the result of a query into these databases, your data is saved indefinitely. If
you use encryption, your data is saved indefinitely. If you use certain keywords,
your data is saved indefinitely.
How long the NSA stores data is more a matter of storage capacity than of respect for
privacy. We know the NSA needed to increase its storage capacity to hold all the cell
phone location data it was collecting. As data storage gets cheaper, assume that more
of this data will be stored longer. This is the point of the NSA’s Utah Data Center.
The FBI stores our data, too. During the course of a legitimate investigation in 2013,
the FBI obtained a copy of all the data on a site called Freedom Hosting, including
stored e-mails. Almost all the data was unrelated to the investigation, but the FBI
kept a copy of the entire site and has been accessing it for unrelated investigations
ever since. The state of New York retains license plate scanning data for at least
five years and possibly indefinitely.
Any data—Facebook history, tweets, license plate scanner data—can basically be retained
forever, or until the company or government agency decides to delete it. In 2010,
different cell phone companies held text messages for durations ranging from 90 days
to 18 months. AT&T beat them all, hanging on to the data for seven years.
MAPPING RELATIONSHIPS
Mass-surveillance data permits mapping of interpersonal relationships. In 2013, when
we first learned that the NSA was collecting telephone calling metadata on every American,
there was much ado about so-called hop searches and what they mean. They’re a new
type of search, theoretically possible before computers but only really practical
in a world of mass surveillance. Imagine that the NSA is interested in Alice. It will
collect data on her, and then data on everyone she communicates with, and then data
on everyone they communicate with, and then data on everyone they communicate with. That’s three hops away from Alice, which is the maximum the NSA
worked with.
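A hop search of this kind amounts to a breadth-first traversal of a contact graph, cut off at a fixed depth. The following is only a minimal sketch under stated assumptions: the `contacts` dictionary, the names in it, and the `hop_search` function are hypothetical illustrations, not anything described in NSA documents.

```python
from collections import deque


def hop_search(contacts, seed, max_hops=3):
    """Return {person: hops_from_seed} for everyone within max_hops of seed.

    `contacts` maps each person to the people they communicate with.
    This is a plain breadth-first search that stops expanding at max_hops.
    """
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in contacts.get(person, ()):
            if other not in distance:
                distance[other] = distance[person] + 1
                queue.append(other)
    return distance


# Hypothetical toy data: a chain of acquaintances starting at Alice.
contacts = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice"],
    "dave": ["bob", "erin"],
    "erin": ["dave", "frank"],
    "frank": ["erin"],
}

reach = hop_search(contacts, "alice", max_hops=3)
```

In this toy graph, Frank is four hops from Alice and so falls outside a three-hop search, while Erin, at exactly three hops, is swept in.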
The intent of hop searches is to map relationships and find conspiracies. Making sense
of the data requires being able to cull out the overwhelming majority of innocent
people who are caught in this dragnet, and the phone numbers common to unrelated people:
voice mail services, pizza restaurants, taxi companies, and so on.
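One simple way to picture the culling of shared numbers is to drop any node with an unusually large number of contacts before searching. This is an assumed heuristic for illustration only; the text does not say how the NSA actually filters such numbers, and the `cull_hubs` function, the `max_degree` threshold, and the sample data are all hypothetical.

```python
def cull_hubs(contacts, max_degree):
    """Remove nodes with more than max_degree distinct contacts.

    Returns a new contact map with the hub nodes deleted, both as keys
    and as entries in everyone else's contact list.
    """
    hubs = {
        node for node, peers in contacts.items()
        if len(set(peers)) > max_degree
    }
    return {
        node: [p for p in peers if p not in hubs]
        for node, peers in contacts.items()
        if node not in hubs
    }


# Hypothetical toy data: a pizza restaurant that everyone has called.
calls = {
    "pizza": ["alice", "bob", "zed", "yan"],
    "alice": ["pizza", "bob"],
    "bob": ["pizza", "alice"],
    "zed": ["pizza"],
    "yan": ["pizza"],
}

filtered = cull_hubs(calls, max_degree=3)
```

Without the filter, Zed and Yan would sit two hops from Alice purely because all three ordered pizza; with it, only the real Alice and Bob link remains.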
NSA documents note that the agency had 117,675 “active surveillance targets” on one
day in 2013. Even using conservative estimates of how many conversants each person
has and how much they overlap, the total number of people being surveilled by this
system easily exceeded 20 million. It’s the classic “six degrees of separation” problem;
most of us are only a few hops away from everyone else. In 2014, President Obama directed
the NSA to conduct two-hop analysis only on telephone metadata collected under one
particular program, but he didn’t place any restrictions on NSA hops for all the other
data it collects.
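The arithmetic behind a figure of that size can be sketched roughly. The parameters below are illustrative assumptions, not numbers from the NSA documents: `new_contacts` stands for the count of previously unseen people each person adds after overlap is discounted, and the model ignores overlap between different targets.

```python
def estimated_reach(targets, new_contacts, hops):
    """Rough count of people within `hops` of any target.

    Assumes every person contributes `new_contacts` previously unseen
    people at each hop (a deliberately crude, hypothetical model).
    """
    per_target = sum(new_contacts ** k for k in range(1, hops + 1))
    return targets * per_target


TARGETS = 117_675  # "active surveillance targets" figure cited in the text

# Assumed values: five or six net new contacts per person, three hops.
with_five = estimated_reach(TARGETS, new_contacts=5, hops=3)
with_six = estimated_reach(TARGETS, new_contacts=6, hops=3)
```

Under this crude model, six net new contacts per person is already enough to pass 20 million at three hops, while five falls just short, which suggests how conservative the underlying assumptions can be.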
Metadata from various sources is great for mapping relationships. Most of us use the
Internet for social interaction, and our relationships show up in that data. This is what
both the NSA and Facebook do, and it’s why the latter is so