The Right to Benefit from Big Data as a Public Resource

Mary D. Fan

The information that we reveal from interactions online and with electronic devices has massive value—for both private profit and public benefit, such as improving health, safety, and even commute times. Who owns the lucrative big data that we generate through the everyday necessity of interacting with technology? Calls for legal regulation regarding how companies use our data have spurred laws and proposals framed by the predominant lens of individual privacy and the right to control and delete data about oneself. By focusing on individual control over droplets of personal data, the major consumer privacy regimes overlook the important question of rights in the big data ocean.

This Article is the first to frame a right of the public to benefit from our consumer big data. Drawing on insights from property theory, regulatory advances, and open innovation, the Article introduces a model that permits controlled access to, and use of, big data for public interest purposes while protecting against privacy and other harms. I propose defining a right of access to pooled personal data for public purposes, with sensitive information safeguarded by a controlled-access procedure akin to that used by institutional review boards in medical research today. To encourage companies to voluntarily share data for public interest purposes, the Article also proposes regulatory sandboxes and safe harbors akin to those successfully deployed in other domains, such as antitrust, financial technology, and intellectual property law.

The Second Digital Disruption: Streaming and the Dawn of Data-Driven Creativity

Kal Raustiala, Christopher Jon Sprigman

This Article explores how the explosive growth of online streaming is transforming the market for creative content. Two decades ago, the popularization of the internet led to what we refer to here as the first digital disruption: Napster, file-sharing, and the re-ordering of numerous content industries, from music to film to news. The advent of mass streaming has led us to a second digital disruption, one driven by the ability of streaming platforms to harvest massive amounts of data about consumer preferences and consumption patterns. Coupled with powerful computing, the data that firms like Netflix, Spotify, and Apple collect allows those firms to know what consumers want in incredible detail. This knowledge has long shaped advertising; now it is beginning to shape the content that streaming firms purchase or even produce, a phenomenon we call “data-driven creativity.” This Article explores these phenomena across a range of firms and content industries. In particular, we take a close look at the firm that is perhaps farthest along in its use of data-driven creativity. We show how MindGeek, the little-known parent company of Pornhub and a leader in the market for adult entertainment, has leveraged streaming data not only to organize and suggest content to consumers but even to shape creative decisions. MindGeek is itself the product of the same forces (the shift to digital distribution and the accompanying explosion of free content) that transformed mainstream creative industries and paved the way for the rise of streaming. We first show how the adult industry adapted to the first digital disruption; that story aligns with similar accounts of how creative industries adapt to a loss of control over intellectual property. We then show how MindGeek and other streaming firms such as Netflix, Spotify, and Amazon are leveraging the second digital disruption, using data to make decisions about content promotion, aggregation, dissemination, and investment.

Finally, we consider what these trends suggest for competition and innovation in markets for creative work. By making creative production far less risky, data-driven creativity may drive down the need for strong IP rights and reshape conventional assumptions about the purpose and role of IP. At the same time, the rise of data-driven creativity may reinforce the tendency of online markets toward dominance by a few major firms, with significant implications for competition and innovation.

Data Standardization

Michal S. Gal, Daniel L. Rubinfeld

With data rapidly becoming the lifeblood of the global economy, the ability to improve its use significantly affects both social and private welfare. Data standardization is key to facilitating and improving the use of data when data portability and interoperability are needed. Absent data standardization, a “Tower of Babel” of different databases may be created, limiting synergetic knowledge production. Based on interviews with data scientists, this Article identifies three main technological obstacles to data portability and interoperability: metadata uncertainties, data transfer obstacles, and missing data. It then explains how data standardization can remove at least some of these obstacles and lead to smoother data flows and better machine learning. The Article next identifies and analyzes additional effects of data standardization. As shown, data standardization has the potential to support a competitive and distributed data collection ecosystem and to make policing easier in cases where rights are infringed or unjustified harms are created by data-fed algorithms. At the same time, increasing the scale and scope of data analysis can create negative externalities in the form of better profiling, increased harms to privacy, and cybersecurity harms. Standardization also has implications for investment and innovation, especially if lock-in to an inefficient standard occurs. Finally, the Article explores whether market-led standardization initiatives can be relied upon to increase welfare, and what role, if any, government-facilitated data standardization should play.