A View Of The Future Of Our Data

Welcome to the era of data coalitions.

Julien Gachadoat for Noema Magazine
Matt Prewitt is the president of the RadicalxChange Foundation and one of the authors of the Data Freedom Act.

Spend time with people interested in data policy, and you will hear wild, florid language. Terms like “self-sovereignty,” “data colonialism” and “surveillance capitalism” share sentences with comparisons of data transactions to organ sales. These colorful metaphors beg for your attention, contrasting startlingly with the insipid transmissions of ones and zeroes to which they refer. Much of this talk is confused and misled, but it is not overheated. Language is simply sputtering before the vastness of the issue.

This happens to technical people regularly. In the 1990s, those who really understood the internet could not voice their predictions without eliciting eye rolls. During much of the last decade, blockchain enthusiasts sounded ridiculous to everyone except one another.

Today, it is policy thinkers who are dumbstruck regarding the question of data regulation. Because while not everyone quite sees it yet, the policy decisions now facing us will shape society and democracy for decades.

In this essay, addressed to the reader from 2022, I will depict and defend an attractive, feasible vision for the future of the data economy. But make no mistake: We’re going to have to fight for it.

A View From 2022

Following a nonpartisan groundswell, 2021 marked the beginning of a historic realignment of our governments and economies. Viewed from a distance, it looked like merely the latest policy skirmish around data, artificial intelligence and Big Tech. But more than met the eye was at stake when legislatures signed the “data coalition era” into law.

The law established a new class of organizations called data coalitions to redress the outsize power of technology companies. It ensconced data coalitions in the economy by requiring large companies to negotiate with them in order to obtain the rights to use data concerning anyone who had joined the coalition’s membership. In other words, it set up coalitions as unavoidable intermediaries in the data economy.

The coalitions themselves are legally independent, specially regulated entities with strong fiduciary duties to their members — a bit like credit unions or old-fashioned mutual insurance companies. By representing many members, they act as collective bargaining agents on behalf of members’ interests.

Coalitions have sprung up to represent many different types of data and philosophies of data use. Some people have opted into coalitions focused on privacy, while others have joined ones interested in research and progress. Many have joined coalitions focused on ensuring that social media platforms do not use information in ways that harm democracy. Still others have joined coalitions that aim to earn revenue for members.

Data coalitions have rapidly replaced the foundations of the digital economy. They have inhibited Silicon Valley’s most harmful and unethical data uses — and not coincidentally, have also reduced the market capitalization of many large technology companies. But at the same time, they are accelerating technology’s overall development and spawning a dynamic new entrepreneurial ecosystem, giving an advantage to businesses trying to access and use data in ways the public genuinely supports.

Most fundamentally, data coalitions are restoring to ordinary citizens the power over our lives, our communities and our world that we had ceded to Big Tech. Coalitions have become the main forums through which we express our views on the most important public question of modern times: How do we make technology work for us?

What Does The Data Coalition Era Look Like?

Selecting a coalition requires a fair bit of engagement, like voting. We all must weigh difficult questions of value: whether we want our medical, browsing, geolocation, or other data to be held close to the vest, used for the public good or used to make money.

All options, however, are better than the status quo ante. The law requires data coalitions to be fully member-controlled. Thus, they make decisions democratically through direct or delegated votes. Members can also make their voices heard by exiting: If we disagree with the direction of our data coalition, we can leave it for another one. As in any democratic organization, we sometimes choose to go along with our coalitions when they don’t make the decision we hoped for. But because data coalitions bargain for us collectively at vast scale (some quickly gained many millions of members), they have already secured far better privacy policies and terms of service than we could have hoped for before 2021.

Choosing one coalition over another sometimes means that we gain or lose access to particular digital services. For example, if your designated coalition fails to negotiate terms with Slack, then you can’t use Slack. A lot of people fretted about this at first. But it has become clear that the threat of coalition “strikes” forces technology companies to serve our true interests and opens up more room for plucky competitors. This new diversity of services, for example, pushes companies to focus on interoperability instead of trying to lock users into their ecosystems.

Protecting A New Category Of Information: Everything

Today, data coalitions represent our interests in a dizzying array of information. Before 2021, we had cognizable interests in, roughly speaking, just two “bundles” of information. The first, personal information, included things like our social security numbers and medical histories. The second, intellectual property, included copyrights to our expressive work and any patents or trademarks we might own.

But there was always a third, under-governed category: the data that we generate semi-spontaneously as we move through the world, interacting with services and sensors that capture all sorts of information about our location, interests, habits and behavior. This category of data, sometimes referred to as “data exhaust,” was previously collected by companies in an almost completely unrestricted manner and then used to predict our behavior, influence our decisions and train algorithms that could emulate our intellectual work.

Data coalitions now represent our interests concerning all of this information — including “exhaust” information that we previously (and mistakenly) considered non-sensitive. Suppose you walk past a CVS on a Saturday afternoon, and CVS’s foot-traffic monitors record that an unidentified person walked past the store at that time. Your coalition, alongside others, now negotiates the terms on which CVS can use even that simple datum.

This seems radical to those of us who remember the 20th century, but it is searingly necessary. Data coalitions do much more than protect our economic and privacy interests. In 2021, we realized that if we cannot make nimble democratic decisions about the uses of all types of information, we will never steer a middle path between tyranny and chaos.

Rhymes With Gutenberg

The adage that “knowledge is power” contains a complete blueprint of the data economy. Whoever controls information we rely on controls us and can therefore profit off us. Another adage (apocryphally credited to Mark Twain) says that history doesn’t repeat itself — it rhymes. We have always been slow to recognize when information becomes our master instead of our servant.

Consider the history of the printing press. We know that the technology of the printing press empowered capitalist press owners to challenge the state. A neglected side of the story is how it also empowered them to exploit writers.

Gutenberg’s invention changed writers’ lives in two important ways. First, it increased their potential distribution, and second, it decreased their control. Presses could produce thousands of copies of a text far more easily than manual copyists. But writers generally didn’t benefit, because they didn’t own these expensive new printing machines. To be sure, on a shoestring budget, they might rent a press and print a handful of copies of something they wrote. But if readers responded positively, larger publishers would react, rapidly saturating the market and excluding the writer from both profit and moral credit. Thus, power shifted away from creators toward the owners of the fastest information-processing machines.

I don’t mean to suggest that the printing press was harmful or even a double-edged sword. The point is that 15th-century technology’s facilitation of information distribution inexorably concentrated power among the owners of capital. It challenged the state but also made publishers into moguls and writers into their pawns.

Two hundred years later, after several religious wars driven by heretical pamphlets (the fake news of their day), countries began to respond. Following the end of the Thirty Years’ War in 1648, sovereigns set up harsh press censorship regimes. Shortly thereafter, they also moved to strengthen writers’ hands. In 1710, the British Parliament passed the first copyright act, giving authors legal tools to protect themselves from easy exploitation by press owners.

Centralized state censorship is not fondly remembered: The more progressive regimes (such as Sweden and the United States) rolled it back in the 18th century. Copyright, on the other hand, didn’t work out too badly. To be sure, it has never been perfect. And today, copyright law has a deservedly terrible reputation because global media companies have bent it to suit their interests. (For example, new texts were originally protected for a reasonable term of 14 years from their publication. Now, thanks to Disney’s lobbying, most copyrights last well over a century, generating absurd rents on old work.) But overall, the 300-year project of arming authors against exploitation has been one of modern law’s signal successes.

The historical lesson here is that when radical new information-processing technologies emerge, the balance of power between content producers, owners of the information processing devices and the state changes. It necessitates new legal infrastructure to restabilize these relationships in a way that permits the potential unlocked by technology to flourish, while protecting the weak from exploitation and preventing wealthy owners of technology from acting against public interests.

Shared Governance, Not Individual Property

Data coalitions do something very different from intellectual property. Still, it is useful to see that they address the same problem: helping the creators of valuable information protect themselves from the well-capitalized parties positioned to appropriate it.

Where intellectual property law gives individuals unilateral property rights, data coalitions instead give communities democratic authority. Printing presses enabled capital owners to appropriate certain texts assembled (in theory) by solitary writers. But big data and artificial intelligence enabled capital owners to appropriate all the information that everyone assembled collectively. This is why we need data coalitions instead of new individual rights.

Twenty-first-century data cannot be understood in the individualistic terms of the 18th century. Followers of John Locke and Immanuel Kant unsurprisingly restricted their protection of information to the artisanal-scale works of the “geniuses and great men” whom they parochially saw as the drivers of history. But the vast data that drives history today is not like that. Instead, it always both pertains to and originates from indefinite numbers of people, and it gains its value through aggregation. “My” data is valuable not because it is some masterpiece of self-expression, but because it contains deep, predictive insights about people I associate with. Thus, whenever one person discloses or withholds it, countless others are affected in ways we cannot simply ignore.

That is why data cannot be owned, but must be governed. Data must be the subject of shared democratic decisions rather than individual, unilateral ones. This presents particular challenges for liberal legal orders that have typically centered on individual rights.

Immediately prior to the data coalition era, some of the most earnest advocates for a better data economy remained stuck in a defunct, 18th-century proprietarian mindset, hoping to grant us “sovereignty” over our individual information. But in the context of 21st-century data, this made no sense. It was like imagining thousands of people owning different threads in the same blanket.

Similarly, many well-intentioned advocates of open data failed to see how free information has always concentrated power in the owners of the fastest information-processing machines. Like the publishers of centuries past, the richest technology companies will always lead in extracting value from open data, giving them unearned leverage over the rest of society. So putting data into the public domain actually does precisely the opposite of leveling the playing field.

If individual data ownership is Scylla, the mythical sea monster who devoured unwary sailors, then open data is Charybdis, the whirlpool near Scylla’s cave. Finding the narrow path between the two means treating data like a police force or a water system — that is, as the subject of widely shared yet deeply responsible governance.

Individual Responsibilities, Democratic Rights

It is important to emphasize that “your data” and “my data” do not exist as distinct things. My address is my father’s son’s address, and my genes are the genes of my cousins. My deepest desires are also the desires of my friends, not just because we have common backgrounds, but because we communicate and form our desires together.

For some, this is a destabilizing insight that threatens to submerge the individual into a collective. It might seem to neutralize the emancipatory dimensions of Enlightenment thinking. After all, modern democracies have done much good by making distinct human subjects the wielders of legal rights. If data does not pertain to individuals, how are we going to preserve this tradition in this crucial new context?

Data coalitions address this concern in a powerful but unexpected way. In fact, they keep individuals at the center of the picture. But instead of centering individual rights, they center individual responsibilities.

Although we cannot all have absolute individual rights to an inseparable corpus of shared information, we can and do have palpable individual responsibilities regarding how we make use of whatever information is at our disposal. After all, when we disclose information unilaterally, we compromise (or advance) others’ interests. The same is true when we withhold information, like when we refuse to participate in contact-tracing that could halt the spread of a contagious virus. Thus, our decisions about “our” information should look more like decisions at the ballot box (exercises of responsibility) than like decisions about the money in our wallets (exercises of power).

Before the data coalition era, we had no hope of meaningfully exercising these responsibilities. Races to the bottom ruled the day. Some new data-grabbing productivity app would frequently tempt some people to use it, giving early adopters a social or professional advantage, thus pressuring others to follow suit and ultimately revealing information about many third parties who — once they realized they had few privacy interests left to defend — adopted it themselves. For example, it was not worth trying to keep your information away from Gmail when most of the people you corresponded with were using it.

This kind of thing happens much less now. Today, coalition-wide votes indicate whether overall membership — thousands or millions of people — believes it would be served or disserved by the proliferation of particular services. The power of these shared decisions forces even the biggest developers to appeal to a deeper notion of the public interest in designing their services. Thus, by participating in these shared decisions, we both advance our interests and fulfill our responsibilities in ways that were impossible before 2021.

How Did It Happen?

The data coalition era could not have started without support from lawmakers. It would never have emerged solely through grassroots organization or new technological developments. There are three reasons why.

1. Data coalitions needed to be made into necessary counterparties for businesses looking to use data. Otherwise, Big Tech would have done everything possible to circumvent coalitions and go on assembling data from the individuals willing to part with it most cheaply. 

2. Data coalitions needed to be subjected to special regulations. These regulations ensured that they remained wholly independent of conventional data-exploiting businesses and that they really served their members’ interests. They also set forth necessary guidelines streamlining the complex disputes that arise between different coalitions representing partly overlapping data.

3. The individual rights to data created by legacy laws like the GDPR (in Europe) and CCPA (in California) needed to be made temporarily delegable to coalitions. If that didn’t happen, then coalitions could not have fulfilled their central purpose: facilitating shared decisions by their members about data use. Making individual data rights strictly non-delegable to coalitions would have neutered them in exactly the same way that “right to work” laws neuter labor unions.

Paradise Found?

Data coalitions have cut through the network monopolies of the internet giants like a sword through a Gordian knot. Those companies have not been pleased.

But data coalitions have neither diminished the quality of digital services, nor slowed technological progress. On the contrary, competition has increased and the technology sector as a whole is prospering. Today, any entrepreneur with a good idea has the same shot at getting access to large datasets as Amazon and Google. Consumers are happier, and the public discourse is healthier. Data coalitions have helped society make precisely the sort of intelligent judgments about technology and the public interest that were disastrously neglected for the previous two decades.

The data coalition era is neither a techno-utopia nor a Luddite fever-dream. It is simply a new political-economic settlement with a more reasonable synthesis of competing interests, better incentives and a less alarming concentration of power in Silicon Valley.

I wonder where we’d be heading without it.