Reclaiming the Digital Commons: A Public Data Trust for Training Data

03/16/2023
by Alan Chan, et al.

Democratization of AI means not only that people can freely use AI, but also that people can collectively decide how AI is to be used. In particular, collective decision-making power is required to redress the negative externalities from the development of increasingly advanced AI systems, including degradation of the digital commons and unemployment from automation. The rapid pace of AI development and deployment currently leaves little room for this power. Monopolized in the hands of private corporations, the development of the most capable foundation models has proceeded largely without public input. There is currently no implemented mechanism for ensuring that the economic value generated by such models is redistributed to account for their negative externalities. The citizens who have generated the data necessary to train models do not have input on how their data are to be used. In this work, we propose that a public data trust assert control over training data for foundation models. In particular, this trust should scrape the internet as a digital commons and license the resulting data to commercial model developers in exchange for a percentage cut of revenues from deployment. First, we argue in detail for the existence of such a trust. We also discuss feasibility and potential risks. Second, we detail a number of ways for a data trust to incentivize model developers to use training data only from the trust. We propose a mix of verification mechanisms, potential regulatory action, and positive incentives. We conclude by highlighting other potential benefits of our proposed data trust and connecting our work to ongoing efforts in data and compute governance.


