Commentary

How Congress can rein in data brokers

Know your customer rules are a first step to address the risks of sensitive data — including on U.S. military servicemembers — sold online.

By Brady Allen Kruse

December 20, 2023

Pedestrians walk inside newly installed "bike rack" barricades outside the U.S. Capitol on March 21, 2023 in Washington, DC. (Photo by Win McNamee/Getty Images)

When the Federal Trade Commission earlier this year unsealed a revised complaint against the data broker Kochava, perhaps the most eye-opening revelation in the document was how easy it was to obtain the data offered for sale by the firm. Like many other data brokers, Kochava collects and sells stunningly accurate location data from numerous apps on a person’s phone. And this data is available to nearly any user of Amazon’s online data marketplace, where Kochava offers a free “data sample” — a text file with billions of data points collected on a whopping 60 million unique mobile devices.

The case against Kochava speaks to a much broader problem. A new report from our research team at Duke University finds that data brokers are making available sensitive data on a huge number of people with few barriers to exploitation. Our team purchased data on military servicemembers and their families from data brokers and found that, like Kochava, many brokers make virtually no effort to identify the purchaser before delivering sensitive data. And even when one broker refuses to sell data to an unknown customer, there is usually another one with similar data that will.

Data brokers represent the harsh reality of today’s internet economy, where the activity and data of ordinary people are logged, aggregated and then sold to the highest bidder. The data broker industry poses a huge risk to privacy and national security. The easy availability for purchase of personal data belonging to U.S. military servicemembers is just one example of how the data broker industry can be used to target sensitive populations. For a malicious actor, the only real limit on how the industry can enable fraud, abuse and violence is the actor’s imagination and persistence.

Because the industry operates in a regulatory void, it is imperative that Congress act swiftly by implementing know your customer rules, which would require data brokers to vet their clients. In the absence of broader privacy legislation, Congress should urgently prevent the worst abuses of the data broker industry by forcing its players to take basic steps to verify who their customers are. A risk-based approach, one that places the most sensitive data behind the biggest identification barriers, is a reasonable and effective solution.

Today, few data brokers attempt to identify their customers before allowing them to purchase data. The data sold by brokers include sensitive data fields — like home address, income or political affiliation — that can be used to blackmail military servicemembers, scam the elderly, and even find and murder a judge’s son. We found that know your customer (KYC) controls across the data broker industry are wildly inconsistent and easy to avoid; a malicious actor can purchase sensitive data anonymously by simply finding and exploiting brokers that do a poor job implementing KYC controls (if any at all).

You may not know data brokers, but they certainly know you. Data brokers are companies that quietly collect, trade, infer, aggregate, and ultimately sell data, including about people. The definition of data brokers that we use at Duke covers a wide variety of businesses, from well-known credit reporting agencies to companies that collect GPS data from cell phones and use it to infer information about an individual, such as their religion or hobbies. Many data brokers go to considerable lengths to obscure their data collection activities, and they possess highly sensitive data on typically unaware consumers, including contact information, religion and sexual orientation. Brokers work hard to quietly collect sensitive data, but when it comes to picking their buyers, we discovered that many brokers make virtually no effort to identify prospective customers.

“Know your customer” is a catch-all term for businesses performing a minimum level of customer identification and gauging the risk of the customer committing a crime. These controls range from regulatory requirements to internal processes implemented voluntarily by companies. For example, Title III of the Patriot Act imposed KYC requirements on financial institutions, which were instated to prevent terrorist financing as well as fraud and identity theft.

Data brokers generally face no such requirements. Our study on data brokers and military personnel made this clear: our team purchased data by emailing U.S. data brokers from the domains “datamarketresearch.org” and “dataanalytics.asia.” Despite using vague email domains, broad or nonexistent websites and Google Voice phone numbers that should have raised red flags, we were able to purchase data on thousands of active-duty military servicemembers without providing detailed information about our identities or intended use of the data. This data included names, home addresses, religions, marital status, number of children, and many other sensitive fields, all for as little as $0.12 per record.

We did all this without using deception. We did not proactively disclose our identities, but we never posed as employees of a legal entity or lied about our role. Instead, we used broad language: We were simply a “research team” doing “data market research.” For some data brokers, this was a problem; they asked for our legal company name or stopped responding to our emails. But for others, KYC controls were seemingly nonexistent.

One data broker asked us for a phone call to confirm our identity, but then offered to skip identity verification if we paid by wire (which we did). Another broker said it couldn’t “vet [our] company and it’s a .asia domain,” before trying to entice us to purchase more data than we initially asked for — and then sold it. Others asked if we planned to contact the individuals in the dataset and for a sample mailing piece, but did not attempt to verify our (truthful) answer that we wouldn’t contact anybody. Even buying data on American servicemembers from a .asia domain and email — including sensitive fields like religion and ethnicity geofenced to places like Fort Liberty (formerly, Fort Bragg) — did not raise any red flags.

This speaks to a major problem facing consumers as well as U.S. national security. A malicious actor could easily lie their way around many data brokers’ lax KYC controls, or simply find a broker with virtually no KYC practices whatsoever, like the location data broker Kochava. The end result is obtaining sensitive, non-public, individually identified data about military servicemembers and plenty of other Americans.

To address these risks, Congress should enact industry-wide KYC requirements for the sale of data by data brokers, including requirements to verify purchasers’ identities, intended use and justification for obtaining sensitive data fields. Per the Patriot Act, banks must implement a risk-based customer identification program, depending on the size of the bank and types of accounts it offers, among other factors. These programs typically verify a customer’s name, address and taxpayer identification number, such as a Social Security number. While applying this approach to the data broker industry would help prevent unverified individuals from purchasing data, it would create several other issues, such as requiring the government to spend effort verifying identifying documents and forcing customers to disclose sensitive information about themselves to data brokers (which have a history of lax cybersecurity practices).

Instead, Congress could enact a similarly risk-based approach but with a slightly lower bar for customer identification. To purchase low-risk data, such as non-sensitive bulk or aggregated data, a customer might only have to provide an email address and valid payment. To purchase identifiable data that is typically considered public information, a purchaser could be required to provide a viable intended use for the data, much more specific than “market research.” Sale of more sensitive information could be restricted to business entities that have a professional and complete website, a public address, and legitimate contact information (that is, not a Google Voice number). The most sensitive data, like GPS data or information on members of the military, might require a business to submit a valid business license or articles of incorporation that could verify the business. Such an approach places a reasonable standard of care on data brokers while also protecting the privacy of individuals.
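To make the tiered idea concrete, here is a minimal sketch of how a broker might encode such a check. Everything in it is an illustrative assumption rather than part of any proposed rule or existing system: the tier names, the `Purchaser` fields, the list of "too vague" intended uses, and the choice to make each tier's requirements cumulative are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class DataTier(IntEnum):
    """Hypothetical risk tiers, ordered from least to most sensitive."""
    LOW_RISK = 1             # non-sensitive bulk or aggregated data
    PUBLIC_IDENTIFIABLE = 2  # identifiable data typically considered public
    SENSITIVE = 3            # more sensitive information
    MOST_SENSITIVE = 4       # e.g. GPS data or data on military members


@dataclass
class Purchaser:
    """What a broker knows about a prospective buyer (hypothetical fields)."""
    email: str = ""
    has_valid_payment: bool = False
    intended_use: str = ""
    has_complete_website: bool = False
    has_public_address: bool = False
    has_non_voip_phone: bool = False        # i.e. not a Google Voice number
    has_verified_business_docs: bool = False  # license or articles of incorporation


# Assumption: stated uses this broad do not count as a "viable intended use".
VAGUE_USES = {"", "research", "market research", "data market research"}


def missing_requirements(purchaser: Purchaser, tier: DataTier) -> list:
    """Return the unmet requirements for a tier (assumed to be cumulative)."""
    missing = []
    if not purchaser.email:
        missing.append("email")
    if not purchaser.has_valid_payment:
        missing.append("valid payment")
    if tier >= DataTier.PUBLIC_IDENTIFIABLE:
        if purchaser.intended_use.strip().lower() in VAGUE_USES:
            missing.append("specific intended use")
    if tier >= DataTier.SENSITIVE:
        if not purchaser.has_complete_website:
            missing.append("complete business website")
        if not purchaser.has_public_address:
            missing.append("public address")
        if not purchaser.has_non_voip_phone:
            missing.append("non-VoIP contact number")
    if tier >= DataTier.MOST_SENSITIVE:
        if not purchaser.has_verified_business_docs:
            missing.append("business license or articles of incorporation")
    return missing


def may_purchase(purchaser: Purchaser, tier: DataTier) -> bool:
    """A sale is allowed only when no requirement for the tier is unmet."""
    return not missing_requirements(purchaser, tier)


if __name__ == "__main__":
    # A buyer resembling the anonymous "research team" described above.
    buyer = Purchaser(
        email="team@example.org",
        has_valid_payment=True,
        intended_use="data market research",
    )
    for tier in DataTier:
        print(tier.name, may_purchase(buyer, tier), missing_requirements(buyer, tier))
```

Under these assumptions, a buyer offering only an email address, a payment method and a vague purpose would clear the lowest tier and nothing above it. The important property is that the same check applies at every broker, so a purchaser cannot simply shop for a more permissive seller.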

Currently, if one broker is meticulous about identifying its customers before selling to them, a purchaser wishing to remain anonymous, such as a scammer or malicious foreign actor, could simply contact another broker. Passing a national privacy law or outlawing the sale of certain types of data altogether would be ideal, but both are long-term and difficult solutions. In the short term, industry-wide and risk-based KYC controls for brokers are a necessary and feasible step in the right direction.

Brady Allen Kruse is a Master of Public Policy student at Duke University with a background in computer science.
