Accountability in an Algorithmic Society: Relationality, Responsibility, and Robustness in Machine Learning

by A. Feder Cooper, et al.
Cornell University

In 1996, philosopher Helen Nissenbaum issued a clarion call concerning the erosion of accountability in society due to the ubiquitous delegation of consequential functions to computerized systems. Using the conceptual framing of moral blame, Nissenbaum described four types of barriers to accountability that computerization presented: 1) "many hands," the problem of attributing moral responsibility for outcomes caused by many moral actors; 2) "bugs," a way software developers might shrug off responsibility by suggesting software errors are unavoidable; 3) "computer as scapegoat," shifting blame to computer systems as if they were moral actors; and 4) "ownership without liability," a free pass to the tech industry to deny responsibility for the software they produce. We revisit these four barriers in relation to the recent ascendance of data-driven algorithmic systems — technology often folded under the heading of machine learning (ML) or artificial intelligence (AI) — to uncover the new challenges for accountability that these systems present. We then look ahead to how one might construct and justify a moral, relational framework for holding responsible parties accountable, and argue that the FAccT community is uniquely well-positioned to develop such a framework to weaken the four barriers.





1. Introduction

In 1996, writing against the backdrop of the meteoric rise of the commercial Internet (Leiner et al., 2009), Helen Nissenbaum warned of the erosion of accountability due to four barriers inimical to societies increasingly reliant on computerized systems (Nissenbaum, 1996). These barriers are: “many hands,” borrowing a term from philosophy to refer to the problem of attributing moral responsibility for outcomes caused by multiple moral actors; “bugs,” a way software developers might shrug off responsibility by suggesting software errors are unavoidable; “computer as scapegoat,” shifting blame to computers as if they were moral actors; and “ownership without liability,” a free pass to the software industry to deny responsibility, particularly via shrink-wrap and click-wrap Terms of Service agreements. Today, twenty-five years later, significant work has been done to address the four barriers, for example, through developments in professional practices of computer science (van Dorp, 2002; Jarke, 1998), organizational management (Javed and Zdun, 2014), and civil law (Mulligan and Bamberger, 2018; European Parliament Committee on Legal Affairs, 2017); however, on the whole, the effort to restore accountability remains incomplete. In the interim, the nature of computerized systems has once again been radically transformed — this time by the ascendance of data-driven algorithmic systems (since rule-based software systems are also “algorithmic,” we will take care to specify which of the meanings we intend in settings where the context of use does not disambiguate) characterized by machine learning (ML) and artificial intelligence (AI), which either have replaced or complemented rule-based software systems, or have been incorporated within them as essential elements (Carbin, 2019; Zhang and Tsai, 2003; Barstow, 1988; Mulligan and Bamberger, 2019; Kroll et al., 2017).

The resurgent interest in accountability, particularly among the FAccT community, is therefore timely for a world in which data-driven algorithmic systems are ubiquitous. In domains as varied as finance, criminal justice, medicine, advertising, entertainment, hiring, manufacturing, and agriculture, these systems are simultaneously treated as revolutionary, adopted in high-stakes decision software and machines (Angwin et al., 2016; Ajunwa, 2021; Kleinberg et al., 2018; American Association for Justice, 2017), and as novelties (Kleeman, 2016). The failure to comprehensively establish accountability within computational systems through the 1990s and 2000s has therefore left contemporary societies just as vulnerable to the dissipation of accountability, with even more at stake. We remain in need of conceptual, technical, and institutional mechanisms to assess how to achieve accountability for the harmful consequences of data-driven algorithmic systems — mechanisms that address both whom to hold accountable and how to hold them accountable for the legally cognizable harms of injury, property loss, and workplace hazards, and the not-yet-legally-cognizable harms increasingly associated with data-driven algorithmic systems, such as privacy violations (Citron and Solove, 2022), manipulative practices (Agarwal et al., 2019; Kreps et al., 2020), and unfair discrimination (Ajunwa, 2021). In light of growing concerns over accountability in computing, our paper revisits Nissenbaum’s “four barriers to accountability” to assess whether insights from that work remain relevant to data-driven algorithmic systems and, at the same time, how the ascendance of such systems complicates, challenges, obscures, and demands more of technical, philosophical, and regulatory work.

We next provide necessary background concerning recent developments in standards of care, law and policy, and computer science that will assist us in our analysis. Equipped with this context, we orient ourselves in a broad range of disciplines (Section 2) in order to examine each of Nissenbaum (1996)’s four barriers to accountability in relation to data-driven algorithmic systems (Section 3). Having re-introduced and updated the discussion of the barriers, we then integrate our conceptual framing to suggest how one might construct and justify a moral, relational framework for holding responsible parties accountable, and argue that the FAccT community is uniquely well-positioned to develop such a framework to weaken the four barriers (Section 4).

1.1. Contemporary Interventions in Accountability and Data-Driven Algorithmic Systems

Re-visiting the four barriers requires engaging with the significant body of contributions concerning accountability produced in the interim. Rather than comprehensively reviewing existing literature — an immense undertaking already addressed in work such as Wieringa (2020) and Kohli et al. (2018) — we touch on three areas of work that we find useful for our analysis of the four barriers in Section 3: standards of care, law and policy, and computer science.

Standards of care. Standards of care play a crucial role in building a culture of accountability through establishing best practices. They provide formal guidelines for ensuring and verifying that concrete practices align with agreed-upon values, such as safety and reliability (Nissenbaum, 1996). In engineering, standards of care dictate the behaviors and outputs that ought to be expected. Concerning data-driven algorithmic systems in particular, such standards of care have taken the form of model cards (Mitchell et al., 2019; Shen et al., 2021), data sheets (Gebru et al., 2021), annotations (Beretta et al., 2021), audits (Ajunwa, 2021), and frameworks (Hutchinson et al., 2021) concerning the appropriate use of data and other artifacts, which are often developed and used in the production of AI and ML systems (McMillan-Major et al., 2021; Boyd, 2021). Taken together, these standards of care enable accountability by making the intentions and expectations around such systems concrete; they provide a baseline against which one can evaluate deviations from expected behavior, and thus can be used to review and contest the legitimacy of specific uses of data-driven techniques.
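The artifacts above work by making expectations concrete and checkable. As a rough illustration only — the field names below are hypothetical stand-ins, not the schema of Mitchell et al. (2019) — a model card can be reduced to a small structured record against which a reviewer can compare a deployed model:

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ModelCard:
    """Minimal record of the kinds of facts a model card documents.
    Illustrative only: a real card covers more ground (evaluation data,
    ethical considerations, caveats), and these field names are invented."""
    model_name: str
    intended_use: str
    out_of_scope_uses: list = field(default_factory=list)
    training_data: str = "unspecified"
    metrics: dict = field(default_factory=dict)

    def to_dict(self) -> dict:
        # Serializable form, e.g. for publishing alongside the model.
        return asdict(self)

# A hypothetical card for a loan-screening model.
card = ModelCard(
    model_name="loan-risk-classifier-v2",
    intended_use="Ranking loan applications for human review",
    out_of_scope_uses=["fully automated denial decisions"],
    training_data="2015-2020 internal applications (see datasheet)",
    metrics={"accuracy": 0.87, "false_positive_rate": 0.06},
)
```

Recording an explicit `out_of_scope_uses` entry is what gives the artifact its accountability function: a deployment that falls under it deviates from a stated expectation and can be contested on that basis.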

In certain cases, scholars have re-framed accountability standards around harmed and vulnerable parties (Raji et al., 2020; Metcalf et al., 2021). This work — particularly that which focuses on transparency (Diakopoulos, 2020) and audits (Vecchione et al., 2021) — makes clear that standards of care and frameworks, while important for developing actionable notions of accountability, do not guarantee accountability on their own. Algorithmic impact assessments (AIAs) attempt to fill this gap (Moss et al., 2021). They task practitioners with assessing new technologies (Selbst, 2021) with respect to the impacts they produce (Metcalf et al., 2021). AIAs formalize accountability relationships in ways that systematically address and correct algorithmic harms.

Law and policy. Legal literature on data-driven algorithmic systems in democratic contexts generally concerns AI/ML-related harms and corresponding ex post interventions (Ajunwa, 2021; National Highway Traffic Safety Administration, 2016). For example, work on liability spans both anticipated harms related to new or forthcoming data-driven technology, including autonomous vehicles (AVs) and robotics (Abraham and Rabin, 2019; American Association for Justice, 2017; Surden and Williams, 2016; Elish, 2019), and not-yet-legally-cognizable harms, such as unfair discrimination due to demographically imbalanced, biased, or otherwise-discredited training data (Hellman, 2021; Okidegbe, 2022; Waldman, 2019), privacy violations (Crawford and Schultz, 2014; Citron and Solove, 2022; Kaminski, 2019), and manipulation (Kreps et al., 2020). Regulatory and administrative scholarship tends to analyze data-driven algorithmic systems in relation to pre-existing legislation and policy (Shah, 2018; Viljoen, 2021; Whittington et al., 2015; Sadowski et al., 2021; Mulligan and Bamberger, 2019). One notable exception is the GDPR — the nascent yet wide-reaching EU data-privacy regulation — which has also been applied to AI/ML systems (Kang, 2020; Wachter et al., 2017; Hamon et al., 2021).

Transparency is intimately connected to accountability. It is necessary for identifying responsible parties (in order to attribute harms to those who are responsible for them), and necessary for identifying the means by which harms came about (so that harms can be mitigated) (Diakopoulos, 2020). Transparency is therefore of broad import in democratic governance (Mulligan and Bamberger, 2018), and hence in law and policy. This work spans a variety of urgent issues regarding lack of transparency in data-driven algorithmic systems. Concerning data (Wachter and Mittelstadt, 2019; Levy and Johns, 2016), lack of transparency is implicated in the obfuscation of data provenance, particularly via the concentration of data ownership within data brokers (Federal Trade Commission, 2014; Lambdan, 2019; Young et al., 2019); concerning algorithms and models, insufficient transparency contributes to the inscrutability of algorithmic decisions (Cofone and Strandburg, 2019; Lehr and Ohm, 2017; Kroll et al., 2017). As a result, outsourcing legal decisions to automated tools, particularly data-driven tools that obscure underlying logic, can create a crisis of legitimacy in democratic decision-making (Mulligan and Bamberger, 2019; U.S. Government Accountability Office, 2021; Calo, 2021; Citron and Calo, 2020).

Computer science. Research in computer science, especially in ML, has increasingly treated accountability as a topic for scholarly inquiry. In updating Nissenbaum (1996)’s barriers, we address cases in which researchers explicitly recognize the relationship between their work and accountability (Kim and Doshi-Velez, 2021) — namely, in auditing and transparency — and work concerning robustness, which we identify as having significant implications for accountability, even when this work does not make that claim explicitly. Concerning auditing, recent work underscores the importance of being able to analyze algorithmic outputs to detect and correct for the harm of unfair discrimination (Adler et al., 2018; Raji et al., 2020). Transparency tends to be treated as a property of models, particularly with regard to whether a model is interpretable or explainable to relevant stakeholders (Doshi-Velez and Kim, 2018; Freitas, 2014; Bhatt et al., 2021). (It has long been contested whether metrics for model interpretability are sufficient for human comprehension (Chang et al., 2009).) More recently, computational work has begun to take a more expansive view of transparency, applying it to other parts of the ML pipeline, such as problem formulation, data provenance, and model selection choices (Forde et al., 2021; Kroll, 2021; Sivaprasad et al., 2020; Sloane et al., 2020). Lastly, broadly construed, robustness concerns whether a model behaves as expected — under expected, unexpected, anomalous, or adversarial conditions. Designing for and evaluating robustness implicates accountability, as it requires researchers to rigorously define their expectations concerning model performance; this in turn enables lines of inquiry aimed at both preventing deviations from those expectations, and identifying (and ideally correcting for) such deviations when they do occur. Robustness thus encompasses work in ML that aims to achieve proven theoretical guarantees in practice (Yang and Rinard, 2019; Meinke and Hein, 2019; Kacianka and Pretschner, 2021; Zhang et al., 2020), and work that, even in the absence of such guarantees, produces models that exhibit reproducible behavior empirically (Bouthillier et al., 2019; Raff, 2019). Robustness also includes the ability of ML models to generalize beyond the data on which they were trained (Neyshabur et al., 2017; Hu et al., 2020), ranging from natural cases of distribution shift (Ovadia et al., 2019; Koh et al., 2021) to handling the presence of adversaries that are trying to control or game model outputs (Goodfellow et al., 2015; Papernot et al., 2016; Szegedy et al., 2014; Cooper et al., 2021b).
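The expectation-setting role of robustness can be seen in a toy sketch. Everything below is an invented stand-in — a nearest-centroid "model" on synthetic Gaussian data, with a hand-crafted covariate shift — not a method from any of the works cited; the point is only that stating an expected accuracy in-distribution makes a deviation under shift detectable:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two Gaussian classes; a nearest-centroid classifier stands in for "the model".
def sample(n, shift=0.0):
    x0 = rng.normal(loc=-1.0 + shift, scale=1.0, size=(n, 2))
    x1 = rng.normal(loc=+1.0 + shift, scale=1.0, size=(n, 2))
    return np.vstack([x0, x1]), np.array([0] * n + [1] * n)

X_train, y_train = sample(500)
centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def predict(X):
    # Assign each point to its nearest class centroid.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def accuracy(X, y):
    return float((predict(X) == y).mean())

# Expectation set on in-distribution data ...
acc_iid = accuracy(*sample(500))
# ... and checked under a covariate shift of the kind deployment can bring.
acc_shifted = accuracy(*sample(500, shift=1.5))

print(f"in-distribution accuracy: {acc_iid:.2f}")
print(f"shifted accuracy:         {acc_shifted:.2f}")
```

The gap between the two numbers is the kind of measurable deviation from expectations that robustness work aims to prevent, detect, and correct.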

2. Conceptual Framing

We now turn to the conceptual framing we draw on for the remainder of the paper. We contextualize notions of accountability in relation to two areas: 1) contemporary moral philosophy, which situates accountability in a relationship between multiple actors; 2) the so-called “narrow” definition of accountability, largely influenced by Mark Bovens, whose framework for identifying accountability relationships has received attention from scholars interested in “algorithmic accountability,” especially within the FAccT community (Wieringa, 2020; Kroll, 2021; Kacianka and Pretschner, 2021).

2.1. Accountability in Moral Philosophy

There have been various attempts in moral philosophy to develop rigorous notions of accountability. We focus on two threads in the literature — blameworthiness, and relationships between moral actors — and the correspondences between them.

Blame. Nissenbaum (1996) anticipated a problem of diminishing accountability as societies become increasingly dependent on computerized systems. The concern in question, simply put, was a growing incidence of harms to individuals, groups, or even societies for which no one would step forward to answer. In explaining how and why barriers to accountability emerge, the article turned to the work of legal philosopher Joel Feinberg to ground its conception of accountability. According to Feinberg, blame, defined in terms of causation and faultiness, is assigned to moral agents for harms they have caused due to faulty actions (Feinberg, 1970, 1985). Neither of these two core elements is straightforward — each has, in fact, been the subject of centuries of philosophical and legal thinking — but we reach deeper only when our core lines of inquiry demand it. Dating back thousands of years, philosophers have grappled with causation both as a metaphysical challenge and as a moral one of assigning causal responsibility; to this day the fascination holds, claiming the attention of computer scientists (Halpern and Pearl, 2005). The concept of faultiness, likewise, is complex. To begin, it presumes free agency, itself a state whose metaphysical character and role in moral attribution have been debated for centuries. Among contemporary writings on free action, Harry Frankfurt’s work (Frankfurt, 2019), for example, has added further twists and expanded lines of scholarship. Faultiness is a basic concept in all legal systems; courts and legal scholarship alike have weighed in on it, giving rise to the widely adopted categories of intentional, reckless, and negligent harmful action, which have informed judgments of legal liability (Moore, 2019; Feinberg, 1970, 1968). The elements of recklessness and negligence are particularly relevant to our analysis because they presume implicit or explicit standards, or expectations, that an actor has failed to meet.

Following Feinberg, Nissenbaum (1996) conceives of actors as accountable when they step forward to answer for harms for which they’re blameworthy. This conception of accountability focuses on the circumstances in which harm arises and attempts to connect an agent to the faulty actions which brought about harm. Accordingly, the barriers to accountability that Nissenbaum (1996) identifies arise because the conditions of accountability are systematically obscured, due, at some times, to circumstances surrounding computerization, and at other times, to a societal breakdown in confronting willful failures to step forward. Many hands obscures establishing lines of causal responsibility (Section 3.1); bugs obscures the classification of errors as instances of faulty action (Section 3.2); scapegoating computers obscures answerable moral actors by misleadingly (or mistakenly) attributing moral agency to non-moral causes (Section 3.3); and ownership without liability bluntly severs accountability from blame (Section 3.4).

Relationality. Whereas the conception of accountability based on blameworthiness focuses on the actions and the actor causing harm through faulty behavior, recent philosophical work expands its focus to consider responsibility in light of the relationships between moral actors. Watson (1996), for example, argues that responsibility should cover more than attributability, a property assigned to an actor for bringing about a given outcome (Talbert, 2019). A second dimension, which he calls accountability, situates responsibility in a relationship among actors. For Watson, “Holding people responsible is not just a matter of the relation of an individual to her behavior; it also involves a social setting in which we demand (require) certain conduct from one another and respond adversely to another’s failures to comply with these demands” (Watson, 1996, p 229). Other work, including T.M. Scanlon’s theory of responsibility, provides accounts of both being responsible and being held responsible, where the latter describes situations when parties violate relationship-defined norms (Scanlon, 2000; Shoemaker, 2011). The moral obligations that actors have to one another are tied to the placement of responsibility and blame. Accordingly, the characteristics of a harmed party might dictate whether, or what, accountability is needed. For instance, if one harms in self-defense, there may be no moral imperative to hold them accountable.

The works discussed attempt to situate accountability in the relationships — social, political, institutional, and interpersonal — in which we are enmeshed. Accordingly, the relationship-defined obligations we have to one another — as spouses, citizens, employees, or friends — may dictate what it is we are responsible for, as well as the contours of accountability we can expect. By situating accountability not just as attributability between action and actor, but instead within a social framework, some of what has come out of the so-called “narrow” notion of accountability in political theory (discussed in Section 2.2) can be derived from the vantage of a more “pure” moral philosophy. Rather than formally pursuing this derivation here, we instead simply suggest that these notions of accountability need not be framed as alternatives to one another. Moral philosophy, then, can offer a conceptual infrastructure through which a given relational framing — be it interpersonal, institutional, or political — can be said to be legitimate and ethically viable. Similarly, for practitioners holding a variety of organizational positions, the moral responsibilities that individuals hold can shape the ethical obligations and specific forms of accountability at play.

2.2. A “Narrow” Definition

In the past few years, “algorithmic accountability” (concerns with the phrase are discussed in the “Scapegoat” section of this paper) has attracted growing interest in approaches that are institutional or structural in character. The work of political scientist Mark Bovens, particularly his “narrow definition” (Bovens, 2007; Bovens et al., 2014) of accountability, has informed recent work on accountability for “algorithmic systems” (Wieringa, 2020). Prompted by a concern that newly formed governmental structures and public authorities comprising the European Union lack “appropriate accountability regimes” (Bovens et al., 2014, p. 447), Bovens proposes an accountability framework comprising two key actors: an accountable actor and a forum. In the face of certain conditions, or in the wake of certain incidents, an accountable actor has an obligation to a forum to explain and justify itself — to address the forum’s questions and judgments, be judged by it, and possibly suffer sanctions. Bovens calls this a “relational” definition (we would argue that Bovens’s relational framework is also social in the sense given by Watson (1996), mentioned above) because it defines accountability as a social relation between one actor (e.g., a governmental department, a board, an agency, a committee, a public authority, or a person acting in an official capacity) and another (e.g., a governmental entity, watchdog group, oversight committee, or even an individual acting in a relevant capacity, such as a journalist). The two actors — accountable actor and forum — are defined in relation to each other. Unlike the work of the moral philosophers discussed above (Section 2.1), Bovens’s framework is directed not at the rights and obligations we have to one another as moral actors, but at those we have as actors in roles and capacities defined by the respective sociopolitical structures in which we live.

Bovens’s approach to accountability has emerged as an attractive framework for promoting accountability in societies that have given over key controls and decisions to AI/ML, as it has been recognized in FAccT literature that a relational framework may compel clarity on murky issues (Wieringa, 2020; Kacianka and Pretschner, 2021). For one, as discussed in Wieringa (2020), Bovens directly illuminates the sociopolitical stakes of transparency, explainability, and interpretability, illustrating why these concepts are necessary for any accountability framework for data-driven societies, even though they are ultimately not sufficient to constitute accountability in and of themselves (we return to this idea in Section 4). For another, Bovens allows us to highlight parameters for what constitutes accountability, and for which appropriate values need to be specified: i.e., who is accountable, for what, to whom, and under which circumstances? The values for these parameters may be deeply contextual, and the work of tuning these parameters may, as suggested by Metcalf et al. (2021), lie in the sociopolitical contestations, the “slow boring of hard boards” (Weber, 1978, p. 225), by the many constituencies implicated in any particular computational system.

The domain of philosophical work that could inform a moral conception of accountability is vast but not decisive in its nuances. Accordingly, we hold onto the bare bones of Joel Feinberg’s notion of blameworthiness (Section 2.1), with one amendment, as we revisit the four barriers in light of data-driven algorithmic systems. The amendment shines a spotlight on the moral claims of affected, or harmed, parties, whose position in relation to those whose actions have harmed them may obligate the latter to account for the harm. This work in moral philosophy aligns with work on accountability as a property of social structures, which holds it to be relational — not merely a requirement that an accountable party “own up” to blameworthy action, but an obligation to another. We reserve a fuller discussion of work on accountability as a property of social arrangements for Section 4. For now, we merely remark that this work is not an alternative to the conception of accountability as stepping forward for blameworthy action. Rather, the two co-exist, each applying to a different object: one an attribute of moral actors; the other, an attribute of institutional or societal structures.

As we demonstrate below in Section 3, we find that data-driven algorithmic systems heighten the barriers to accountability because the conditions of blame — causal responsibility and fault — are even further obscured and, in turn, blameworthy actors’ ability to obfuscate their roles reduces the pressure on them to step forward.

3. Revisiting the Four Barriers to Accountability

In a typical scenario in which software is integrated into a functional system — fully or partially displacing (groups of) human actors — it may not be obvious that accountability could be displaced along with the human actors who, frequently, are its bearers. The cumulative effect of such displacements is the increasing incidence of harmful outcomes for which no one answers, whether these outcomes are major or minor, immediate or long-term, or accrue to individuals or to societies. Resuscitating accountability is no simple task — so Nissenbaum (1996) argues — because computerization sets up particularly troublesome barriers to accountability: Many hands (3.1), Bugs (3.2), The computer as scapegoat (3.3), and Ownership without liability (3.4). These interdependent barriers are not necessarily an essential quality of computer software. Rather, they are a consequence of how software is produced, integrated into institutions, and embedded within cyber-physical systems; they are a function of the wonderment and mystique that has grown around computerization, and the prevailing political economy within which the computer and information industries have thrived. In the sections that follow, we revisit these barriers to accountability with an eye turned toward their implications amidst the massive growth and adoption of data-driven algorithmic technologies. We provide examples of the barriers in action, and defer discussion of how the barriers can be weakened to Section 4.

3.1. The Problem of Many Hands

The barrier of many hands concerns the issue that many actors are involved in the design, development, and deployment of complex computerized systems. When these systems cause harm, it can be difficult to isolate the component(s) at the source of that harm. In the absence of an accountability framework, this difficulty can conceal which specific individuals should step forward to take responsibility for the harm. As Nissenbaum summarized, “Where a mishap is the work of ‘many hands,’ it may not be obvious who is to blame because frequently its most salient and immediate causal antecedents do not converge with its locus of decision making” (Nissenbaum, 1996, p. 29).

Nissenbaum further analyzes the difficulty of the barrier of many hands by showing how it operates at four different levels: 1) software is produced in institutional, often corporate, settings in which there is no actor responsible for all development decisions; 2) within these settings, multiple, diffuse, groups of engineers contribute to different segments or modules of the overall deployed system, which additionally often depends on software designed and built by other actors (which, in today’s landscape, may result in licensed or freely-available open-source software); 3) individual software systems often interact with or depend on other software systems, which themselves may be unreliable or present interoperability issues; 4) hardware, not just software, often contributes to overall system function, particularly in cyber-physical systems, and it can be difficult to pinpoint if harms occur due to issues with the code, the physical machine, or the interface between the two. Any and all of these four levels of many hands problems can operate simultaneously, further obscuring the source of blame.

These particular difficulties of the problem of many hands remain today; moreover, they have been further complicated in numerous ways, given that computer systems are now ubiquitous rather than ascendant. We focus our discussion on how data-driven algorithmic systems further complicate this barrier, using two illustrative (though necessarily non-exhaustive) examples. First, we discuss how the machine learning pipeline — the multi-stage process by which machine-learned models are designed, trained, evaluated, and deployed — presents novel problems concerning many hands. Second, contemporary data-driven algorithmic systems rely significantly on the composability of openly-available ML toolkits and benchmarking suites; these toolkits, often developed and maintained by large tech companies, tend to be advertised as general- or multi-purpose, and are frequently (mis)used in specific, narrow applications.

The machine learning pipeline. The ML pipeline is a dynamic series of steps — each of which can involve multiple (groups of) actors, including designers, engineers, managers, researchers, and data scientists. The pipeline typically starts with problem formulation and, in commercial settings, results in the deployment (and continued monitoring) of a trained model (Passi and Barocas, 2019). Problem formulation can involve the collection, selection, or curation of a dataset, and then the operationalization of a concrete task to learn, such as classifying loan-granting decisions or generating natural-language text. The actors responsible for formulation may then hand off their work to others responsible for implementation: choosing the type of model to fit to the data and the learning procedure to use for model training. In selecting the type of model to train, these actors may custom-design their own architecture, or may defer to a pre-existing model, such as an off-the-shelf neural network, designed by others, potentially at another company or institution. The stage of model training and evaluation can then begin, in which a set of actors runs training multiple times, perhaps for multiple combinations of chosen model types, training procedures, and hyperparameter values (Sivaprasad et al., 2020). These actors compare trained models, from which they select some “best”-performing model (or ensemble of models), where “best” is often informed by a chosen quantitative metric, such as mean overall accuracy (Forde et al., 2021). These stages of the ML pipeline, from formulation to evaluation, are often repeated in a dynamic loop: until the model passes the threshold of some specified performance criteria, the process can cycle from (re)modeling to tuning and debugging. Finally, if the model is deployed in practice, yet another set of actors monitors the model’s behavior, ensuring that it aligns with the expectations developed during training and evaluation.
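The hand-offs in the pipeline just described can be compressed into a sketch. Everything below is synthetic and illustrative — the task, the ridge-regression "model family," and the hyperparameter sweep are invented stand-ins, not any system from the cited literature — but the comments mark where different (groups of) actors take over:

```python
import numpy as np

rng = np.random.default_rng(1)

# -- Problem formulation: one set of actors curates a dataset and
#    operationalizes a task (here: a synthetic binary decision).
X = rng.normal(size=(1000, 3))
y = (X @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=0.5, size=1000) > 0).astype(int)

# -- Hand-off: a different set of actors splits the data for implementation.
X_tr, X_val = X[:800], X[800:]
y_tr, y_val = y[:800], y[800:]

def train(X, y, l2):
    """One 'model type': ridge-regularized least squares used as a classifier.
    The penalty l2 plays the role of a hyperparameter chosen later."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + l2 * np.eye(d), X.T @ (2 * y - 1))

def accuracy(w, X, y):
    return float(((X @ w > 0).astype(int) == y).mean())

# -- Model selection: sweep hyperparameters, keep the "best" model
#    as judged by a single chosen quantitative metric.
candidates = {l2: train(X_tr, y_tr, l2) for l2 in (0.01, 0.1, 1.0, 10.0)}
best_l2, best_w = max(candidates.items(),
                      key=lambda kv: accuracy(kv[1], X_val, y_val))

# -- Monitoring: yet another set of actors would check deployed behavior
#    against the expectation recorded here.
print("chosen l2:", best_l2, "validation accuracy:", accuracy(best_w, X_val, y_val))
```

Even in this compressed form, responsibility is already spread across formulation, splitting, model-family choice, the sweep, and the metric; each comment boundary is a place where, in practice, a different set of hands takes over.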

Each stage of the ML pipeline involves numerous actors (in fact, potentially uncountably many if the pipeline employs third-party model architectures or ML toolkits, which we discuss below).⁷ Participatory design further expands the set of many hands to end-user stakeholders (Sloane et al., 2020). This illustrates an additional manifestation of the barrier: when harms occur, it is possible to shift blame to harmed end-users, as they were explicitly involved in the ML pipeline. Thus, if a trained model causes harms in practice, it can be extremely challenging to tease out the particular actors who should answer for them. For example, harms could originate from how actors operationalize the learning task at the start of the pipeline (Cooper and Abrams, 2021), move from high-level abstraction to concrete implementation (Selbst et al., 2019), or select hyperparameters or random seeds during model selection (Forde et al., 2021). Accountability could lie with actors in any part of the pipeline, or some combination thereof. Bias could creep in early on from the choice of dataset, and then accumulate and become magnified further downstream during model selection. In other words, the diffuse and dynamic nature of the pipeline makes locating accountability extremely challenging. This can be understood as an issue of transparency — beyond the specific problem of model interpretability — concerning who is responsible for what, and how this can be related to overarching accountability with respect to a model’s ultimate use in practice (Kroll, 2021).⁸ This indicates why transparency in the form of model interpretability may be important, but is ultimately not sufficient, for identifying actors accountable for harms; irrespective of interpretability, the pipeline can muddle transparency.

Multi-purpose toolkits. Practitioners and researchers often do not code model architectures or optimization algorithms from scratch. Just as Nissenbaum identified a problem of many hands arising from the integration of third-party software modules, builders of data-driven algorithmic systems today often rely on toolkits produced by others. Such toolkits, which demand tremendous mathematical and software engineering expertise to develop, are arguably indispensable to individual and small-business model developers and data scientists. To decrease the amount of time and money spent iterating on the ML pipeline, these actors depend on the investment of large tech companies with vast resources and large, concentrated pools of technical talent to develop and release efficient, correct, comprehensive, and user-friendly libraries of algorithm implementations, model architectures, and benchmark datasets (Abadi et al., 2015; Paszke et al., 2019). Of particular note is that, due to the time and expense of training increasingly larger models, some of these toolkits contain pre-trained models — large language models like BERT (Devlin et al., 2019), which can be used out of the box or fine-tuned for particular use cases. Although such models are conceived of as multi- or general-purpose, they have been shown to exhibit bias-related harms when used in narrow application contexts (Nadeem et al., 2021; Lovering et al., 2021; Kornblith et al., 2019). Determining blame for these types of harms is far from simple; for example, if intended use is under-specified, blame could lie at least partially with the model creator.

3.2. Bugs

Nissenbaum (1996) uses the term “bug” to cover a variety of issues common to software, including “modeling, design, and coding errors.” Bugs are said to be “inevitable,” “pervasive,” and “endemic to programming” — “natural hazards of any substantial system” (Nissenbaum, 1996, p. 32). Even with software debuggers and verification tools that can assure correctness, “bugs” emerge and cause unpredictable behavior when software systems are deployed and integrated with one another in the real world (Smith, 1985; MacKenzie, 2001). In short, “bugs” are predictable in their unpredictability; they serve as a barrier to accountability because they cannot be helped (except in obvious cases), and therefore are often treated as an accepted “consequence of a glorious technology for which we hold no one accountable” (Nissenbaum, 1996, p. 34). As Nissenbaum (1996) notes, what we consider “inevitable” can change over time as technology evolves, with certain types of “bugs” spilling over into the avoidable. For example, evolving norms and new debugging tools can rebrand the “inevitable” as sloppy or negligent implementation, at which point programmers can be held to account for such errors. Similarly, the advent of data-driven algorithmic systems has recently indicated that this malleability also extends in the other direction: new technological capabilities can both contract and expand what we consider “inevitable” buggy behavior. That is, while these systems contain “bugs” of the “modeling, design, and coding” varieties that Nissenbaum (1996) describes for rule-based programs, the statistical nature of data-driven systems presents additional types of harm-inducing errors, which may present an additional barrier to accountability.⁹ Of course, statistical software is not new to ML; however, the proliferation of data-driven algorithmic systems has clarified the prevalence of such errors.
For example, misclassifications, statistical error, and nondeterministic outputs may cause harm, and likewise may be treated as inevitable, which makes it difficult to place blame.

In relation to the treatment of “bugs” in 1996, it is important to note that labeling these types of errors “bugs” presents a complication, as they are an inherent part of machine learning attributable to its statistical nature. That is, misclassification, statistical error, and nondeterminism seem to turn the notion of “bug” on its head: many experts would just as readily call these features of machine learning, rather than bugs.¹⁰ We return to this idea in Section 3.3, in which we discuss the accountability barrier of treating properties of ML as a scapegoat. Nevertheless, regardless of where one attempts to draw the line, these errors share common elements with the “bugs” Nissenbaum (1996) describes — namely, they fit into an overarching category of software issues for which the ability to reason about causality and fault is elusive. In other words, they present a barrier to accountability by being treated as “inevitable,” “pervasive,” and a “consequence of a glorious technology for which we hold no one accountable” (Nissenbaum, 1996). To clarify this barrier, we next consider some concrete examples of “bugs” that data-driven algorithmic systems present.

Faulty modeling premises. Prior to implementation, as discussed in Section 3.1, data-driven algorithmic systems require significant modeling decisions. Choosing a model necessarily involves abstraction and can have significant ramifications (Selbst et al., 2019; Passi and Barocas, 2019); assumptions during this stage of the ML pipeline can bias the resulting computational solution space in a particular direction (Friedman and Nissenbaum, 1996). For instance, assuming a linear model is sufficient to capture patterns in data precludes the possibility of modeling non-linearities. When such biases involve over-simplified or faulty reasoning, they can result in model mis-specification and the introduction of “modeling error bugs.” Such mis-specifications may include the assumption that values like fairness and accuracy are correctly modeled as a trade-off to be optimized (Cooper and Abrams, 2021), or that physical characteristics can serve as legitimate classification signals for identifying criminals (Wu and Zhang, 2016) or inferring sexual orientation (Stark and Hutson, 2021; Wang and Kosinski, 2018). More generally, a common modeling error involves the assumption that a problem is amenable to classification — that it is possible to divide data examples into separable categories in the first place (Sun et al., 2019; Sloane et al., 2021). Being grounded in such false premises means that, even if it is possible to train mis-specified models like these to behave “accurately” (i.e., to return better-than-chance results after learning these tasks), the conclusions we can draw from them are unsound (Cooper and Abrams, 2021). In these cases, if modeling assumptions are unclear or elided, such that one is not able to attribute errors to them, it is easy to evade accountability by placing blame on the presence of inexplicable, unavoidable “bugs” endemic to computer software.

Individual errors. Avoiding faulty premises is not alone sufficient to guarantee that the ML pipeline produces a harm-free model; even well-specified models tend to exhibit some level of error, leading to individual instances of harm of different varieties, including disparate impact or manipulation (Feldman et al., 2015; Lovering et al., 2021; Nadeem et al., 2021; Kreps et al., 2020). ML has many metrics to quantify error (Botchkarev, 2019; Hardt et al., 2016); in training, an optimization algorithm attempts to minimize a chosen error metric. Nevertheless, even the most robust, well-trained models report imperfect accuracy. In fact, a model that achieves perfect accuracy is usually considered suspect: it has likely overfit to the training data and will exhibit poor performance when presented with new examples (Srivastava et al., 2014; Hastie et al., 2009). Therefore, when individual errors occur, they can be treated as inevitable, just like the “bugs” Nissenbaum (1996) describes, thus displacing responsibility for the harm such errors cause affected individuals.
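The point that perfect accuracy is suspect can be made concrete with a toy experiment. Everything below is hypothetical and invented for illustration — the data, the 10% label-noise rate, and both “models.” A model that memorizes its training data scores (near-)perfectly there yet near chance on fresh examples, while a sensible model tops out below 100% because of irreducible label noise:

```python
import random

random.seed(0)

# Toy task: the true label is the first bit of a 24-bit input, but 10% of
# labels are flipped, so no model can exceed ~90% accuracy on fresh data.
def make_data(n):
    data = []
    for _ in range(n):
        x = tuple(random.randint(0, 1) for _ in range(24))
        true_label = x[0]
        label = true_label if random.random() < 0.9 else 1 - true_label
        data.append((x, label))
    return data

train, test = make_data(2000), make_data(2000)

# A "model" that memorizes the training set: near-perfect training
# accuracy, the classic sign of overfitting.
memory = dict(train)
def memorizer(x):
    return memory.get(x, 0)

# A sensible model that captures the real pattern, noise and all.
def sensible(x):
    return x[0]

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

train_acc = accuracy(memorizer, train)   # suspiciously high: memorized
test_acc = accuracy(memorizer, test)     # near chance on unseen inputs
sensible_acc = accuracy(sensible, test)  # good, but never 100%
```

The residual error of `sensible` is exactly the kind of “inevitable” imperfection the text describes; the gap between `train_acc` and `test_acc` is why perfect accuracy invites suspicion rather than confidence.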

Bad model performance. Like the problem of excusing individual errors as unavoidable “bugs,” it is possible to treat unexpectedly bad overall model performance similarly. Consider a hypothetical example of a (well-formulated) computer vision system used to detect skin cancer, whose training and evaluation promise a mean accuracy rate of 94%. If, once deployed, the model coheres with (or out-performs) its promised performance, then mis-classifications can be said to have been anticipated or expected.¹¹ Individual instances of error can pose additional challenges for accountability, since the model may still overall exhibit an expected degree of error (i.e., be within a margin of error); in these cases, it is possible to treat “the computer as scapegoat” — another barrier to accountability — and say error is inherent to the fundamental statistical nature of ML. We discuss this further in Section 3.3. Since expected accuracy is a probabilistic claim about what is likely to occur, deviations from expectation can (and do) occur. If, when monitoring a deployed model, this deviation yields a decrease in expected model accuracy that is sustained over time, it is possible to evade accountability by ascribing the issue to the amorphous category of “bug.” That is, calling this type of error an inevitable “bug” avoids attributing under-performance to a particular source, which, rather than being unavoidable, could result from human negligence, poor generalization, distribution shift, or other faulty behavior.
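A monitoring step that refuses to wave away sustained under-performance as a “bug” can be as simple as a sliding-window check against the promised accuracy. The sketch below is hypothetical: the class name, window size, and tolerance are invented for illustration, not drawn from any particular system.

```python
from collections import deque

class AccuracyMonitor:
    """Flag sustained drops below a promised accuracy rate.

    A transient dip inside the window is tolerated (expected statistical
    deviation); a sustained drop across a full window is surfaced for
    investigation rather than dismissed as an inevitable "bug".
    """
    def __init__(self, promised=0.94, window=500, tolerance=0.02):
        self.promised = promised
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)  # True = correct prediction

    def record(self, prediction, actual):
        self.outcomes.append(prediction == actual)

    def sustained_underperformance(self):
        # Not enough evidence yet to call any deviation "sustained".
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        observed = sum(self.outcomes) / len(self.outcomes)
        return observed < self.promised - self.tolerance

# Illustrative usage with fabricated streams of outcomes:
healthy = AccuracyMonitor(promised=0.94, window=100, tolerance=0.02)
for _ in range(100):
    healthy.record(1, 1)        # every prediction correct

drifted = AccuracyMonitor(promised=0.94, window=100, tolerance=0.02)
for i in range(100):
    drifted.record(1, i % 2)    # only half the predictions correct
```

A flag from such a monitor does not assign blame by itself, but it converts “the model seems off” into a concrete, attributable observation that some actor must then answer for.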

3.3. The Computer as Scapegoat

Nissenbaum (1996) suggested that the practice of blaming a computer could pose a barrier to accountability, since “having found one explanation for an error or injury, the further role and responsibility of human agents tend to be underestimated” (Nissenbaum, 1996, p. 34). To explain why people could plausibly blame computers for a wrongdoing, Nissenbaum points to the fact that “computers perform tasks previously performed by humans in positions of responsibility”; whereas before the human would have been identified as the blameworthy party, the computer has now taken up that role. And yet, while computer systems have become more immediate causal antecedents to an expanding number of harms, they do not have moral agency and thus cannot be said to conduct ethically faulty behavior (Nissenbaum, 1996). We discuss how the barrier of scapegoating the computer has become more complicated within the landscape of ubiquitous data-driven algorithmic systems: the tendency to treat such systems as “intelligent” objects with moral agency (which they do not possess) provides unprecedented opportunities to scapegoat the system or its component algorithms (Turkle, 2005; Pradhan et al., 2019). In seeming opposition to treating the system as intelligent or rational, the non-determinism and stochasticity inherent in ML can cause inexplicable behavior or errors, which also presents an occasion to misplace blame.

Moral agency. As data-driven algorithmic systems have become pervasive in life-critical contexts, there has been a corresponding tendency to anthropomorphize them and equate technological processes to human cognition (Turkle, 2005; Pradhan et al., 2019). The “intelligence” in artificial intelligence and the “learning” in machine learning would suggest a sort of adaptive, informed decision apparatus that enables a neat placement of moral blame. However, directing blame toward data-driven algorithmic systems effectively imbues them with moral agency, ascribing to them the ability to act intentionally (Schlosser, 2019).¹² Proponents of the standard conception of agency in Schlosser (2019) include Davidson (1963), Goldman (2015), and Brand (1984). It is also applied to artificial agency by Himma (2009). Nissenbaum (1996) likens blaming a computer to blaming a bullet in a shooting: while the bullet can be said to play an active, causal role, it cannot be said to have been intentional in its behavior. In the same vein, a data-driven algorithmic system may play a central role in life-critical decisions, and may even be said to make a choice in a particular task, but such a choice does not carry the deliberate intention that constitutes moral agency (Schlosser, 2019).¹³ This is consistent with scholarship in both moral philosophy and legal theory concerning AI, algorithms, agency, and personhood (Véliz, 2021; Birhane and van Dijk, 2020; Himma, 2009; Bryson et al., 2017). The legal literature has called this, particularly in relation to robots, “social valence” (Calo, 2015) and “The Substitution Effect” (Balkin, 2015). Moreover, since US tort law is grounded in a notion of moral culpability, it has been ill-suited for application to AI-harm-related remedies (see, e.g., Lemley and Casey, 2019).

“Accountable algorithms”. This banner phrase makes algorithms the subject of accountability, even though algorithms cannot be said to hold moral agency and, by extension, moral responsibility. This popular term (Kroll et al., 2017), therefore, reduces accountability to a piecemeal, procedural quality that can be deduced from a technology, rather than a normative concept concerning the moral obligations that people have toward one another. Moreover, algorithms do not get deployed in practice; systems (which contain algorithms) do. When, for example, studies of fairness in AI/ML-assisted judicial bail decisions fixate on the biases that exist within an algorithm, they fail to capture broader inequities that are systemic in complex sociotechnical systems, of which AI/ML techniques are just one part (Abebe et al., 2020; Barabas et al., 2020; Cooper et al., 2021a).

Mathematical guarantees. Attempts to direct blame away from people and corporations can be either strategic or unconscious. In some cases, a group of harmed individuals does not know whom to blame (Section 3.1) and settles on blaming the system. In others, scapegoating the system can be a way for an actor to dodge and dissipate public ire. For example, consider the now-canonical example of Northpointe’s risk-assessment tool exhibiting bias (Angwin et al., 2016); rather than attributing this bias to a mistake or “bug,” Northpointe pointed to the fundamental incompatibility of different algorithmic operationalizations of fairness as the source of the problem (and pointed to a specific measure, for which bias was not detectable, as evidence of blamelessness).

More generally, mathematical guarantees, if blamed for harmful outcomes, can exhibit the barrier of scapegoating. In contrast to the example of unexpectedly poor model performance described above in “Bugs” (Section 3.2), consider the following case: engineers design a data-driven algorithmic system which they analytically prove — and empirically validate — meets some specified theoretical guarantee. In particular, let us consider the same case we discussed for the problem of individual errors in “Bugs”: the engineers prove that a system is 94% accurate in detecting tumors, and then validate that this is in fact the case in practice. This same example, depending on how it is unpacked, can exhibit the barrier of “bugs” or the barrier of scapegoating the computer. Above, we talked about this example in terms of individual errors, for which responsibility for harm could be excused due to “buggy” behavior. Here, rather than analyzing behavior at the level of individual decisions, we examine the behavior of the model overall. If the frequency of mis-classifications is within the model’s guaranteed error rate, the engineers could attempt to excuse all resulting harms by gesturing to the fact that the model is performing exactly as expected. In short, satisfied mathematical guarantees can serve as a scapegoat because pointing to mathematical claims satisfied at the model level can obscure the need to account for harms that occur at the individual-decision level.¹⁴ And, of course, one can see-saw back and forth between “bug” and scapegoating to evade accountability. If satisfying guarantees at the overall model level is for some reason rejected as a rationale for an individual harm, one could claim there is a “bug”; if calling an individual decision “buggy” is rejected, and the model is classifying within an expected threshold for error, one could then displace blame by arguing that the model is performing according to its specification.

Non-determinism. Data-driven algorithmic systems that involve ML exhibit non-determinism. They involve randomization to, for example, shuffle the order in which training data examples are presented to an algorithm. While such features of ML algorithms may seem like technical minutiae, they in fact introduce stochasticity into the outputs of machine-learned models: training the same model architecture on the same dataset with the same algorithm — but changing the order in which the training data are supplied to the algorithm — can yield models that behave very differently in practice (De Sa, 2020). For example, as Forde et al. (2021) show, changing the order in which data examples are presented when training a tumor-detection model can lead to surprisingly variable performance. The relationship between training-data ordering and the resulting variance in model performance is under-explored in the technical literature. Thus, such differences in model performance are often attributed to an inherent stochasticity in ML. The randomization used in ML algorithmic systems — randomization on which these systems depend — becomes the scapegoat for the harms it may cause, such as missed tumor detection.
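The data-ordering effect can be reproduced even in a deliberately tiny setting. The sketch below is hypothetical (a one-feature perceptron invented for illustration, not the models studied by Forde et al.): the same update rule, trained on the same two examples in two different orders, ends up with models that disagree on the same input.

```python
def train(examples, lr=1.0):
    """One pass of perceptron-style updates; returns the learned (w, b)."""
    w, b = 0.0, 0.0
    for x, y in examples:
        pred = 1 if w * x + b > 0 else 0
        w += lr * (y - pred) * x
        b += lr * (y - pred)
    return w, b

def predict(model, x):
    w, b = model
    return 1 if w * x + b > 0 else 0

# The same two (deliberately contradictory) labeled examples.
data = [(1.0, 1), (1.0, 0)]

model_a = train(data)                   # original presentation order
model_b = train(list(reversed(data)))   # reversed presentation order

# Same data, same algorithm, same hyperparameters -- different models:
# model_a and model_b disagree on the input x = 1.0.
```

Real training runs involve far more data and far subtler interactions, but the mechanism is the same: presentation order is part of the computation, so shuffling changes the model that comes out.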

3.4. Ownership without Liability

Nissenbaum (1996) highlights a dual trend in the computer industry: 1) strengthening property rights; 2) avoiding liability. Behavioral trends that informed these assertions have persisted in the decades since, with lively public debates over the fit of traditional forms of intellectual property (i.e., copyrights, patents, and trade secrets) to digital products such as software, data, databases, and algorithms, and subsequent expensive legal struggles among industry titans (Board, 2021). Similarly, we have seen explicit denials of liability expressed in shrink-wrap licenses, carried over into so-called “click-wrap” licenses, and terms of service disclaimers accompanying websites, web-based services, mobile apps, IoT devices, content moderation decisions, and the like (Kosseff, 2022; Citron and Solove, 2022; Levy, 2014; Tereszkiewicz, 2018).

Before addressing how we see these trends carrying forward in the contemporary landscape, we need to qualify our observations. Property and liability are weighty legal concepts with long histories and rich meanings. Narrowing our view to digital technologies, even before the 1996 paper a robust literature had grown around questions of ownership — questions that have persisted through numerous landmark court cases. Liability, too, is a core legal concept that is increasingly at issue in relation to the products and services of digital industries. It lies outside the scope of this paper to offer meaningful insights into these concepts as they manifest in scholarship, law, and the courts. However, it is useful to observe broad patterns and anticipate the likely actions of stakeholders.

For a start, it is not difficult to see how the trends toward strong ownership and weak liability reinforce barriers to accountability, and also to understand why industry incumbents might support them: liability is costly, and strong property rights enrich rights holders and empower them against competitors. Four lines of advocacy on behalf of industry interests are noted below, supplementary to those discussed in Nissenbaum (1996):

  1. Third-party providers of data-driven algorithmic systems refuse to expose their systems to scrutiny by independent auditors on grounds of trade secrets (Cofone and Strandburg, 2019; Fromer, 2019). As long as experts maintain that transparency is necessary to evaluate the ML pipeline and AI development, strong property rights that block scrutiny are barriers to accountability.

  2. Manufacturers and owners of cyber-physical systems, such as robots, IoT devices, drones, and autonomous vehicles, evade liability for harms by shifting blame to environmental factors or humans-in-the-loop (Lemley and Casey, 2019). In this respect, the barrier of ownership without liability for data-driven algorithmic systems suggests a twist on the problem of scapegoating (Section 3.3): treating “the human user as scapegoat.” That is, claiming the user has mis-used an AI- or ML-enabled system in order to obscure responsibility for unclear, under-specified, or deliberately misleading user interfaces or expected use, as has happened with Tesla and accidents concerning its (so-called) “AutoPilot” autonomous driving feature (Boudette, 2021).

  3. Almost without question, the computer industry, having metamorphosed into the data industry, has assumed ownership over data passing through its servers (Federal Trade Commission, 2014; Okidebe, 2022; Lambdan, 2019). We still do not have clear rules of liability for industry actors when their servers, holding unimaginable quantities of data, are breached (Sharkey, 2016).

  4. Technology companies hold unprecedented sway over regulation. Twenty-five years ago, although the software industry was already a force to be reckoned with, it successfully persuaded Congress that imposing legal constraints would stifle innovation — that societal well-being depended on a nascent industry that could not flourish under excessive regulatory and legal burden. Despite the obvious maturing of the industry and the emergence of global industry titans, the innovation argument seems to hold sway (Russell, 2014), this time (in the US) with a twist: concern over losing tech market dominance to emerging economies.

4. Beyond the Four Barriers: Looking Ahead

Nissenbaum (1996) warned of a waning culture of accountability — harms befalling individuals, groups, even societies, accepted as sufferers’ bad luck, because no one would step forward to answer for them. In the previous section, we revisited the four barriers in light of data-driven algorithmic systems and found that the framework still provides a useful lens through which to locate sources of the dissipation of accountability. Striking down, or even weakening, the barriers would clear the way for sound attribution of blame, in turn exposing blameworthy parties to a societal expectation to step forward. But we have also argued that the moral responsibility to step forward is a necessary, but insufficient, component of accountability. In spite of, or perhaps because of, the barriers, we argue that established institutional frameworks — calling some actors (individuals or groups) to account and empowering others to call them to account — offer a more promising alternative to case-by-case attention for building a culture of accountability (Section 4.1), and we demonstrate how the FAccT community is uniquely disposed to develop future work that would erode the barriers to both forms of accountability (Section 4.2).

4.1. Bringing together moral and relational accountability

A robust accountability framework needs to specify who is accountable, for what, to whom, and under which circumstances. Bovens’ notion of relational accountability suggests such a framework, including two key actors: a forum and an accountable actor, who has an obligation to justify itself to the members of the forum (Section 2.2). We have argued for the appropriateness and urgency of unifying this relational definition with a conception of blameworthiness, and now extend this argument to highlight the need for a moral, relational accountability framework for dealing with the harms of data-driven algorithmic systems. In light of this call, the moral conception of accountability, on which Nissenbaum (1996) depended, suggests one partial answer: those who have caused harm through faulty action are contenders for the class of accountable actors, and those who have suffered harm deserve a place among the members of the forum. This point usefully clarifies a confluence between accountability as answerability for blameworthy action and accountability as a social arrangement. Being blameworthy for harm is (almost always) a sufficient condition for being designated an accountable actor; being harmed through blameworthy action is (almost always) a sufficient condition for being designated a member of the forum. These two conceptions do not stand against one another as alternative solutions to the same problem; they are solutions to different problems that intersect in instructive ways. We show how a moral, relational accountability structure thus widens the scope — beyond the pair of harmed party and faulty actor — by providing examples of values for all four parameters:

Who is accountable. Accountable actors may include those who are not directly responsible for harm (e.g. engineers) but are designated as accountable (or liable) because of their deep pockets, capacities to render explanations, or positions in organizational hierarchies, such as corporate officers or government procurers of data-driven systems.

For what. Beyond legally-cognizable harms considered in tort law, such as bodily injury, property damage, pecuniary losses, and even non-physical harms, such as those to reputations, harms particularly associated with data-driven algorithmic systems include privacy violations (Citron and Solove, 2022), unfair discrimination (Ajunwa, 2021), and autonomy losses due to manipulation (Agarwal et al., 2019; Kreps et al., 2020). Take, as an example, privacy harms from AI/ML. The privacy rights of data subjects in the creation of training and testing datasets have rightfully drawn attention from advocates of datasheets (Gebru et al., 2021; Boyd, 2021), for example, and likewise serve as a rationale for differential privacy (Dwork, 2006), whose focus is the data subject. But this limited view of harmed parties fails to consider the privacy of parties affected by the uses of models, even if creation of the data from which the models are derived has followed recommended privacy-preserving standards (e.g., anonymization, contextual integrity, permission). As discussed in Hanley et al. (2020), a face dataset that has been created with the utmost attention to privacy may nevertheless cause privacy harm when a model derived from it is used to identify individuals as gay (Wang and Kosinski, 2018).¹⁵ A related point is discussed in Barocas and Nissenbaum (2014). In other words, once there is a trained model, it can be used to learn things about specific people; even if one were to anonymize a dataset for training, that dataset — by being used for training — can be operationalized and used to learn things about other people who are not in the dataset.

To whom. The members of the forum need not include only those who are themselves harmed (or placed in harm’s way through heightened risk). They may also include those deputized to represent and advocate on behalf of vulnerable parties, such as lawyers and public or special interest advocacy groups; or, beyond direct advocates, groups and individuals in oversight capacities, such as journalists, elected officials, government agencies, or professional societies.

Under which circumstances. This concerns the nature of the obligation, or, what accountable actors may owe to the forum — to explain, be judged, and address questions and challenges. Not every algorithmic system used in every domain requires the same approach to accountability. Rather, the specific responsibilities an actor has to others are inflected by the context in which they act.

4.2. Weakening the barriers

In the previous section, we laid out what would be needed to satisfy a moral and relational accountability framework. Any technical interventions that have already been developed — notably, those that we have emphasized concerning transparency, audits, and robustness — would need to be folded into such a framework, their use justified in these moral and relational terms. For example, whatever is proposed as a technical definition of transparency is unlikely to satisfy the needs of all those who comprise a forum, who may not be educated in the particulars of what it means for a model to be “interpretable.” Robustness says what expectations are, but leaves unanswered the question of the conditions under which deviations from expectations ought to be remedied. Relational treatments of these issues, it would seem, require that the obligation be tuned to the variable needs of all members of the forum.

Aside from justifying these pre-existing interventions, as we have demonstrated in Section 3, new interventions are also needed to weaken the barriers to accountability. For example, a moral and relational accountability framework opens the aperture, in principle, to many, if not all, of the “many hands” being designated as accountable actors, including dataset creators, model developers, decision and control systems designers, purveyors, and operators of these systems (Section 3.1). Developing rigorous standards of care could help mitigate the problems of inappropriate use of pre-trained models and unclear measures of quality control at different stages of the ML pipeline. For example, robust auditing mechanisms at each stage, rather than treating auditing as an end-to-end concern (Raji et al., 2020), could help clarify the relationship between stage-specific issues and resulting harms.

Various harms, depending on how they are contextualized, can implicate either the barrier of “bugs” or “scapegoating the computer” (Sections 3.23.3). For example, we note that the computer science community could have treated harms due to unfair discrimination as either a “bug” or blamed them on intrinsic aspects of AI/ML — and yet they did not. In relation to “bugs,” in the past, unfairness often was ascribed due to biased or imbalanced training data (Kallus and Zhou, 2018; Fish et al., 2016). Such biased historical data is arguably “pervasive” and unavoidable. Similarly, the community could have set some “tolerable” level of model unfairness and, as long as model met that specification, they could have attempted to evade accountability by blaming inherent properties of the model. And yet, the community does not redirect blame for the harms of unfair discrimination to these barriers.161616Arguably, it would be repugnant to do so; the particularly ugly nature of unfair-discrimination-related harms may be the reason they have escaped such treatment, which could be perceived as flippant, discriminatory, and thus a harm in itself. While unfair discrimination remains a serious issue in data-driven systems, FAccT and its antecedents have made a significant effort not to evade accountability for unfairness harms by attributing them to “bugs” or treating the computer as scapegoat. For example, significant attention has been paid to mitigating unfairness harms by developing training algorithms that are robust to such biased input data. The field of algorithmic fairness therefore serves an example that challenges the narrative these barriers — an example that could encourage similar treatment of other issues like robustness and its relationship to privacy violations, or adversarial ML and its relationship to manipulation. 
The community has demanded more from ML modelers concerning the treatment of unfair discrimination; it has set expectations concerning the necessity of interventions to root out and correct for unfairness, thereby overcoming the barriers of attributing harms to “bugs” or scapegoating the computer.

Lastly, being liable is related to, but not identical with, being accountable (Section 3.4). The latter applies to blameworthy parties who step forward to answer; the former, to parties who step forward to compensate victims of harm. Liability is often assigned to those who are found to be blameworthy. If lines of accountability are blurred, for example as a consequence of the barriers we have discussed, harms due to AI/ML and other data-driven algorithmic systems will be viewed as unfortunate accidents, and the cost of “bad luck” will settle on victims. Legal systems, however, have developed approaches, such as strict liability, to compensate victims harmed in certain types of incidents even without a showing of faulty behavior. Assigning strict liability to the actors best positioned to prevent harm is sound policy, as it is likely to motivate those actors to take extraordinary care with their products. If, indeed, barriers such as many hands make the attribution of blame impossible, strict liability for a range of algorithm-driven harms, including privacy breaches, unfair discrimination, manipulative practices, and traditional injuries, would at least shift the “bad luck” from victims to those best positioned to take extraordinary steps to mitigate and prevent such harms.

FAccT has a key role in developing these and other tools to help erode the barriers to accountability. However, the use of these tools needs to be justified. Just as mature political governance requires durable institutions and formal attributions of rights and duties, we have similar needs for the governance of producers, purveyors, and operators of data-driven algorithmic systems. That is, as we have contended throughout this paper, accountability is moral and relational. It depends on social, legal, and political structures that provide legitimacy for the checks actors and forums place on each other’s behavior; it depends on the way those checks are internalized as professional, personal, legal, and ethical duties that motivate actors’ personal responsibility. FAccT, given its proclaimed valuing of accountability and the array of expertise it brings together, is uniquely positioned to help develop a relational and moral accountability framework: the structures that provide legitimacy, as well as the professional codes and standards of care, disciplinary norms, and personal mores that tie it all together. The future work of creating these structures, as we noted earlier, is no small undertaking; it lies in the sociopolitical contestations, the “slow boring of hard boards” (Weber, 1978, p. 225), by the many constituencies implicated in any particular computational system.

5. Conclusion

In this paper we revisited Nissenbaum (1996)’s “four barriers” to accountability, with attention to the contemporary moment, in which data-driven algorithmic systems have become ubiquitous in consequential decision-making contexts. We draw on Nissenbaum (1996)’s use of the concept of blameworthiness (which, in turn, drew on the work of Joel Feinberg (Feinberg, 1970, 1985)) and show how it can be aligned with, rather than cast in opposition to, Mark Bovens’ work on accountability as a relational property of social structures (Bovens, 2007; Bovens et al., 2014). We demonstrate how data-driven algorithmic systems heighten the barriers to accountability with regard to determining the conditions of blame, and look ahead to how one might endeavor to weaken those barriers. In particular, drawing on both Nissenbaum and Bovens, we put forward the conditions necessary to satisfy a moral and relational accountability framework, discuss how the development of such a framework would weaken the barriers, and argue that the FAccT community is uniquely positioned to construct such a framework and to develop lines of inquiry that erode the barriers to accountability. Given our tender historical moment, addressing why particular parties belong in the forum or in the set of accountable actors, why their obligations are justified, and, of course, evaluating the numerous permutations the relational nature of the approach demands, is the province of future work. No easy formulations or operationalizations make sense until we have developed a rigorous approach to justification. In our view, this calls for expertise in relevant technologies, moral philosophy, the prevailing political economy of data and computing industries, organizational sociology, prevailing political and regulatory contexts, domain-area expertise, and more: areas that FAccT has successfully brought together under its sponsorship. It is not that all of these are needed all the time; but any of them may be called on to develop linkages between proposed values and social welfare.

The authors would like to thank the Digital Life Initiative at Cornell Tech for its generous support. A. Feder Cooper is additionally supported by the Artificial Intelligence Policy and Practice initiative at Cornell University and the John D. and Catherine T. MacArthur Foundation. Benjamin Laufer is additionally supported by NSF Grant CNS-1704527. Helen Nissenbaum is additionally supported by NSF Grants CNS-1704527 and CNS-1801307, and ICSI Grant H98230-18-D-006.


  • M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng (2015) TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Note: Software available from External Links: Link Cited by: §3.1.
  • R. Abebe, S. Barocas, J. Kleinberg, K. Levy, M. Raghavan, and D. G. Robinson (2020) Roles for computing in social change. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, New York, NY, USA, pp. 252–260. Cited by: §3.3.
  • K. S. Abraham and R. L. Rabin (2019) Automated Vehicles and Manufacturer Responsibility for Accidents: A New Legal Regime for a New Era. Virginia Law Review 105, pp. 127–171. Cited by: §1.1.
  • P. Adler, C. Falk, S. A. Friedler, T. Nix, G. Rybeck, C. Scheidegger, B. Smith, and S. Venkatasubramanian (2018) Auditing black-box models for indirect influence. Knowledge and Information Systems 54, pp. 95–122. Cited by: §1.1.
  • S. Agarwal, H. Farid, Y. Gu, M. He, K. Nagano, and H. Li (2019) Protecting World Leaders Against Deep Fakes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Long Beach, CA, pp. 8. Cited by: §1, §4.1.
  • I. Ajunwa (2021) An Auditing Imperative for Automated Hiring. Harv. J.L. & Tech. 34. Cited by: §1.1, §1.1, §1, §4.1.
  • American Association for Justice (2017) Driven to Safety: Robot Cars and the Future of Liability. Cited by: §1.1, §1.
  • J. Angwin, J. Larson, S. Mattu, and L. Kirchner (2016) Machine bias. ProPublica 23 (2016), pp. 139–159. Cited by: §1, §3.3.
  • J. M. Balkin (2015) The Path of Robotics Law. California Law Review Circuit 6, pp. 45–60. Cited by: footnote 13.
  • C. Barabas, C. Doyle, J. Rubinovitz, and K. Dinakar (2020) Studying up: reorienting the study of algorithmic fairness around issues of power. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, New York, NY, USA, pp. 167–176. Cited by: §3.3.
  • S. Barocas and H. Nissenbaum (2014) Big Data’s End Run around Anonymity and Consent. In Privacy, Big Data, and the Public Good: Frameworks for Engagement, J. Lane, V. Stodden, S. Bender, and H. Nissenbaum (Eds.), pp. 44–75. Cited by: footnote 15.
  • D. Barstow (1988) Artificial intelligence and software engineering. In Exploring Artificial Intelligence, pp. 641–670. Cited by: §1.
  • E. Beretta, A. Vetrò, B. Lepri, and J. C. D. Martin (2021) Detecting discriminatory risk through data annotation based on Bayesian inferences. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, New York, NY, USA, pp. 794–804. Cited by: §1.1.
  • U. Bhatt, J. Antorán, Y. Zhang, Q. V. Liao, P. Sattigeri, R. Fogliato, G. Melançon, R. Krishnan, J. Stanley, O. Tickoo, L. Nachman, R. Chunara, M. Srikumar, A. Weller, and A. Xiang (2021) Uncertainty as a Form of Transparency: Measuring, Communicating, and Using Uncertainty. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, pp. 401–413. Cited by: §1.1.
  • A. Birhane and J. van Dijk (2020) Robot rights? let’s talk about human welfare instead. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA, pp. 207–213. Cited by: footnote 13.
  • H. L. R. E. Board (2021) Google LLC v. Oracle America, Inc. External Links: Link Cited by: §3.4.
  • A. Botchkarev (2019) A New Typology Design of Performance Metrics to Measure Errors in Machine Learning Regression Algorithms. Interdisciplinary Journal of Information, Knowledge, and Management 14, pp. 045–076. External Links: ISSN 1555-1237 Cited by: §3.2.
  • N. E. Boudette (2021) Tesla Says Autopilot Makes Its Cars Safer. Crash Victims Say It Kills. External Links: Link Cited by: item 2.
  • X. Bouthillier, C. Laurent, and P. Vincent (2019) Unreproducible Research is Reproducible. In Proceedings of the 36th International Conference on Machine Learning, K. Chaudhuri and R. Salakhutdinov (Eds.), Proceedings of Machine Learning Research, Vol. 97, pp. 725–734. Cited by: §1.1.
  • M. Bovens, T. Schillemans, and R. E. Goodin (2014) Public accountability. The Oxford Handbook of Public Accountability 1 (1), pp. 1–22. Cited by: §2.2, §5.
  • M. Bovens (2007) Analysing and assessing accountability: a conceptual framework. European Law Journal 13 (4), pp. 447–468. Cited by: §2.2, §5.
  • K. L. Boyd (2021) Datasheets for datasets help ml engineers notice and understand ethical issues in training data. Proceedings of the ACM on Human-Computer Interaction 5 (CSCW2), pp. 1–27. Cited by: §1.1, §4.1.
  • M. Brand (1984) Intending and acting: toward a naturalized action theory. MIT Press, Cambridge, MA, USA. Cited by: footnote 12.
  • J. J. Bryson, M. E. Diamantis, and T. D. Grant (2017) Of, for, and by the people: the legal lacuna of synthetic persons. Artificial Intelligence and Law 25 (3), pp. 273–291. Cited by: footnote 13.
  • R. Calo (2015) Robotics and the Lessons of Cyberlaw. California Law Review 103 (3), pp. 513–563. Cited by: footnote 13.
  • R. Calo (2021) Modeling Through. Duke Law Journal 72. Note: SSRN Preprint External Links: Link Cited by: §1.1.
  • M. Carbin (2019) Overparameterization: a connection between software 1.0 and software 2.0. In 3rd Summit on Advances in Programming Languages (SNAPL 2019), Cited by: §1.
  • J. Chang, S. Gerrish, C. Wang, J. Boyd-Graber, and D. Blei (2009) Reading Tea Leaves: How Humans Interpret Topic Models. In Advances in Neural Information Processing Systems, Y. Bengio, D. Schuurmans, J. Lafferty, C. Williams, and A. Culotta (Eds.), Vol. 22, Red Hook, NY, USA, pp. . Cited by: footnote 2.
  • D. K. Citron and R. Calo (2020) The Automated Administrative State: A Crisis of Legitimacy. Note: Working Paper External Links: Link Cited by: §1.1.
  • D. K. Citron and D. J. Solove (2022) Privacy Harms. Boston University Law Review 102. Cited by: §1.1, §1, §3.4, §4.1.
  • I. Cofone and K. J. Strandburg (2019) Strategic Games and Algorithmic Secrecy. McGill Law Journal 623. Cited by: §1.1, item 1.
  • A. F. Cooper and E. Abrams (2021) Emergent Unfairness in Algorithmic Fairness-Accuracy Trade-Off Research. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA, pp. 46–54. External Links: ISBN 9781450384735 Cited by: §3.1, §3.2.
  • A. F. Cooper, K. Levy, and C. De Sa (2021a) Accuracy-Efficiency Trade-Offs and Accountability in Distributed ML Systems. In Equity and Access in Algorithms, Mechanisms, and Optimization, New York, NY, USA. External Links: ISBN 9781450385534 Cited by: §3.3.
  • A. F. Cooper, Y. Lu, J. Z. Forde, and C. De Sa (2021b) Hyperparameter Optimization Is Deceiving Us, and How to Stop It. In Advances in Neural Information Processing Systems, Vol. 34, Red Hook, NY, USA. Cited by: §1.1.
  • K. Crawford and J. Schultz (2014) Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms. B.C. Law Review 55, pp. 93–128. Cited by: §1.1.
  • D. Davidson (1963) Actions, Reasons, and Causes. The Journal of Philosophy 60 (23), pp. 685–700. Cited by: footnote 12.
  • C. M. De Sa (2020) Random Reshuffling is Not Always Better. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), Vol. 33, Red Hook, NY, USA, pp. 5957–5967. Cited by: §3.3.
  • J. Devlin, M. Chang, K. Lee, and K. Toutanova (2019) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, pp. 4171–4186. Cited by: §3.1.
  • N. Diakopoulos (2020) Accountability, Transparency, and Algorithms. The Oxford Handbook of Ethics of AI 17 (4), pp. 197. Cited by: §1.1, §1.1.
  • F. Doshi-Velez and B. Kim (2018) Considerations for Evaluation and Generalization in Interpretable Machine Learning. In Explainable and Interpretable Models in Computer Vision and Machine Learning, H. J. Escalante, S. Escalera, I. Guyon, X. Baró, Y. Güçlütürk, U. Güçlü, and M. van Gerven (Eds.), pp. 3–17. Cited by: §1.1.
  • C. Dwork (2006) Differential Privacy. In Automata, Languages and Programming, M. Bugliesi, B. Preneel, V. Sassone, and I. Wegener (Eds.), Berlin, Heidelberg, pp. 1–12. Cited by: §4.1.
  • M. C. Elish (2019) Moral Crumple Zones: Cautionary Tales in Human-Robot Interaction. Engaging Science, Technology, and Society. Cited by: §1.1.
  • European Parliament Committee on Legal Affairs (2017) Report with Recommendations to the Commission on Civil Law Rules on Robotics. Cited by: §1.
  • Federal Trade Commission (2014) A Call For Transparency and Accountability: A Report of the Federal Trade Commission. Cited by: §1.1, item 3.
  • J. Feinberg (1968) Collective Responsibility. In Sixty-Fifth Annual Meeting of the American Philosophical Association Eastern Division, Vol. 65, pp. 674–688. Cited by: footnote 3.
  • J. Feinberg (1970) Doing & Deserving: Essays in the Theory of Responsibility. Princeton University Press, Princeton, NJ, USA. Cited by: §2.1, footnote 17, footnote 3.
  • J. Feinberg (1985) Sua Culpa. In Ethical Issues in the Use of Computers, pp. 102–120. Cited by: §2.1, footnote 17.
  • M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian (2015) Certifying and Removing Disparate Impact. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’15, New York, NY, USA, pp. 259–268. Cited by: §3.2.
  • B. Fish, J. Kun, and Á. D. Lelkes (2016) A Confidence-Based Approach for Balancing Fairness and Accuracy. Note: Preprint Cited by: §4.2.
  • J. Z. Forde, A. F. Cooper, K. Kwegyir-Aggrey, C. D. Sa, and M. L. Littman (2021) Model Selection’s Disparate Impact in Real-World Deep Learning Applications. External Links: Link Cited by: §1.1, §3.1, §3.3.
  • H. G. Frankfurt (2019) Alternate Possibilities and Moral Responsibility / Alternative Möglichkeiten und moralische Verantwortung (Englisch/Deutsch). Reclam Great Papers Philosophie, Reclam Verlag, Ditzingen, Germany. Cited by: footnote 3.
  • A. A. Freitas (2014) Comprehensible classification models: a position paper. SIGKDD Explor. Newsl. 15 (1), pp. 1–10. External Links: ISSN 1931-0145 Cited by: §1.1.
  • B. Friedman and H. Nissenbaum (1996) Bias in Computer Systems. ACM Trans. Inf. Syst. 14 (3), pp. 330–347. External Links: ISSN 1046-8188 Cited by: §3.2.
  • J. C. Fromer (2019) Machines as the new Oompa-Loompas: trade secrecy, the cloud, machine learning, and automation. NYU L. Rev. 94, pp. 706. Cited by: item 1.
  • T. Gebru, J. Morgenstern, B. Vecchione, J. W. Vaughan, H. Wallach, H. D. Iii, and K. Crawford (2021) Datasheets for datasets. Communications of the ACM 64 (12), pp. 86–92. Cited by: §1.1, §4.1.
  • A. I. Goldman (2015) Theory of human action. Princeton University Press, Princeton, NJ, USA. Cited by: footnote 12.
  • I. J. Goodfellow, J. Shlens, and C. Szegedy (2015) Explaining and Harnessing Adversarial Examples. In ICLR (Poster), Cited by: §1.1.
  • J. Y. Halpern and J. Pearl (2005) Causes and Explanations: A Structural-Model Approach. Part I: Causes. British Journal for the Philosophy of Science 56 (4), pp. 843–887. Cited by: footnote 3.
  • R. Hamon, H. Junklewitz, G. Malgieri, P. D. Hert, L. Beslay, and I. Sanchez (2021) Impossible Explanations? Beyond explainable AI in the GDPR from a COVID-19 use case scenario. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, New York, NY, USA, pp. 549–559. Cited by: §1.1.
  • M. J. Hanley, A. Khandelwal, H. Averbuch-Elor, N. Snavely, and H. Nissenbaum (2020) An Ethical Highlighter for People-Centric Dataset Creation. Note: ArXiv Preprint Cited by: §4.1.
  • M. Hardt, E. Price, and N. Srebro (2016) Equality of Opportunity in Supervised Learning. In Advances in Neural Information Processing Systems, D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett (Eds.), Vol. 29, Red Hook, NY, USA. Cited by: §3.2.
  • T. Hastie, R. Tibshirani, and J. Friedman (2009) The elements of statistical learning: data mining, inference and prediction. 2 edition, Springer, USA. Cited by: §3.2.
  • D. Hellman (2021) Big Data and Compounding Injustice. Note: Forthcoming, SSRN preprint Cited by: §1.1.
  • K. E. Himma (2009) Artificial agency, consciousness, and the criteria for moral agency: What properties must an artificial agent have to be a moral agent?. Ethics and Information Technology 11 (1), pp. 19–29. Cited by: footnote 12, footnote 13.
  • W. Hu, Z. Li, and D. Yu (2020) Understanding Generalization of Deep Neural Networks Trained with Noisy Labels. In ICLR, Cited by: §1.1.
  • B. Hutchinson, A. Smart, A. Hanna, E. Denton, C. Greer, O. Kjartansson, P. Barnes, and M. Mitchell (2021) Towards accountability for machine learning datasets: Practices from software engineering and infrastructure. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, New York, NY, USA, pp. 560–575. Cited by: §1.1.
  • M. Jarke (1998) Requirements Tracing. Communications of the ACM 41 (12), pp. 32–36 (en). Cited by: §1.
  • M. A. Javed and U. Zdun (2014) A systematic literature review of traceability approaches between software architecture and source code. In Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering - EASE ’14, London, England, United Kingdom, pp. 1–10 (en). External Links: ISBN 978-1-4503-2476-2 Cited by: §1.
  • S. Kacianka and A. Pretschner (2021) Designing Accountable Systems. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’21, New York, NY, USA, pp. 424–437. External Links: ISBN 9781450383097 Cited by: §1.1, §2.2, §2.
  • N. Kallus and A. Zhou (2018) Residual Unfairness in Fair Machine Learning from Prejudiced Data. External Links: 1806.02887 Cited by: §4.2.
  • M. E. Kaminski (2019) Binary Governance: Lessons from the GDPR’s Approach to Algorithmic Accountability. S. Cal. L. Rev. 92. Cited by: §1.1.
  • S. S. Kang (2020) Algorithmic accountability in public administration: The GDPR paradox. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, New York, NY, USA, pp. 32–32. Cited by: §1.1.
  • B. Kim and F. Doshi-Velez (2021) Machine Learning Techniques for Accountability. AI Magazine 42 (1), pp. 47–52. Cited by: §1.1.
  • A. Kleeman (2016) Cooking with Chef Watson, I.B.M.’s Artificial-Intelligence App. The New Yorker. Cited by: §1.
  • J. Kleinberg, H. Lakkaraju, J. Leskovec, J. Ludwig, and S. Mullainathan (2018) Human Decisions and Machine Predictions. The Quarterly Journal of Economics 133 (1), pp. 237–293. Cited by: §1.
  • P. W. Koh, S. Sagawa, H. Marklund, S. M. Xie, M. Zhang, A. Balsubramani, W. Hu, M. Yasunaga, R. L. Phillips, I. Gao, T. Lee, E. David, I. Stavness, W. Guo, B. Earnshaw, I. Haque, S. M. Beery, J. Leskovec, A. Kundaje, E. Pierson, S. Levine, C. Finn, and P. Liang (2021) WILDS: A Benchmark of in-the-Wild Distribution Shifts. In Proceedings of the 38th International Conference on Machine Learning, M. Meila and T. Zhang (Eds.), Proceedings of Machine Learning Research, Vol. 139, pp. 5637–5664. Cited by: §1.1.
  • N. Kohli, R. Barreto, and J. A. Kroll (2018) Translation tutorial: A shared lexicon for research and practice in human-centered software systems. In 1st Conference on Fairness, Accountability, and Transparency, Vol. 7, New York, NY, USA, pp. 7. Note: Tutorial Cited by: §1.1.
  • S. Kornblith, J. Shlens, and Q. V. Le (2019) Do Better ImageNet Models Transfer Better? In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Cited by: §3.1.
  • J. Kosseff (2022) A User’s Guide to Section 230, and a Legislator’s Guide to Amending It (or Not). Berkeley Technology Law Journal 37. Cited by: §3.4.
  • S. Kreps, R. M. McCain, and M. Brundage (2020) All the News That’s Fit to Fabricate: AI-Generated Text as a Tool of Media Misinformation. Journal of Experimental Political Science, pp. 1–14. Cited by: §1.1, §1, §3.2, §4.1.
  • J. A. Kroll, J. Huey, S. Barocas, E. W. Felten, J. R. Reidenberg, D. G. Robinson, and H. Yu (2017) Accountable Algorithms. University of Pennsylvania Law Review 165, pp. 633–705. Cited by: §1.1, §1, §3.3.
  • J. A. Kroll (2021) Outlining Traceability: A Principle for Operationalizing Accountability in Computing Systems. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’21, New York, NY, USA, pp. 758–771. External Links: ISBN 9781450383097 Cited by: §1.1, §2, §3.1.
  • S. Lambdan (2019) When Westlaw Fuels ICE Surveillance: Legal Ethics in the Era of Big Data Policing. N.Y.U. Review of Law and Social Change 43, pp. 255–293. Cited by: §1.1, item 3.
  • D. Lehr and P. Ohm (2017) Playing with the Data: What Legal Scholars Should Learn About Machine Learning. U.C. Davis Law Review 51, pp. 653–717. Cited by: §1.1.
  • B. M. Leiner, V. G. Cerf, D. D. Clark, R. E. Kahn, L. Kleinrock, D. C. Lynch, J. Postel, L. G. Roberts, and S. Wolff (2009) A Brief History of the Internet. SIGCOMM Comput. Commun. Rev. 39 (5), pp. 22–31. External Links: ISSN 0146-4833 Cited by: §1.
  • M. A. Lemley and B. Casey (2019) Remedies for Robots. The University of Chicago Law Review 86 (5), pp. 1311–1396. Cited by: item 2, footnote 13.
  • K. E. Levy and D. M. Johns (2016) When open data is a Trojan Horse: The weaponization of transparency in science and governance. Big Data & Society 3 (1). Cited by: §1.1.
  • K. E. Levy (2014) Intimate Surveillance. Idaho L. Rev. 51, pp. 679. Cited by: §3.4.
  • C. Lovering, R. Jha, T. Linzen, and E. Pavlick (2021) Predicting inductive biases of pre-trained models. In International Conference on Learning Representations, Cited by: §3.1, §3.2.
  • D. MacKenzie (2001) Mechanizing proof: computing, risk, and trust. MIT Press, Cambridge, MA, USA. External Links: ISBN 0262133938 Cited by: §3.2.
  • A. McMillan-Major, S. Osei, J. D. Rodriguez, P. S. Ammanamanchi, S. Gehrmann, and Y. Jernite (2021) Reusable Templates and Guides For Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards. Note: ArXiv preprint Cited by: §1.1.
  • A. Meinke and M. Hein (2019) Towards neural networks that provably know when they don’t know. Note: ArXiv preprint Cited by: §1.1.
  • J. Metcalf, E. Moss, E. A. Watkins, R. Singh, and M. C. Elish (2021) Algorithmic impact assessments and accountability: The co-construction of impacts. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, New York, NY, USA, pp. 735–746. Cited by: §1.1, §2.2.
  • M. Mitchell, S. Wu, A. Zaldivar, P. Barnes, L. Vasserman, B. Hutchinson, E. Spitzer, I. D. Raji, and T. Gebru (2019) Model Cards for Model Reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency, New York, NY, USA, pp. 220–229. Cited by: §1.1.
  • M. Moore (2019) Causation in the Law. In The Stanford Encyclopedia of Philosophy, E. N. Zalta (Ed.), Note: Cited by: footnote 3.
  • E. Moss, E. A. Watkins, R. Singh, M. C. Elish, and J. Metcalf (2021) Assembling Accountability: Algorithmic Impact Assessment for the Public Interest. Note: SSRN preprint Cited by: §1.1.
  • D. K. Mulligan and K. A. Bamberger (2018) Saving governance-by-design. California Law Review 106, pp. 697–784. Cited by: §1.1, §1.
  • D. K. Mulligan and K. A. Bamberger (2019) Procurement As Policy: Administrative Process for Machine Learning. Berkeley Technology Law Journal 34, pp. 771–858. Cited by: §1.1, §1.1, §1.
  • M. Nadeem, A. Bethke, and S. Reddy (2021) StereoSet: Measuring stereotypical bias in pretrained language models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, pp. 5356–5371. Cited by: §3.1, §3.2.
  • National Highway Traffic Safety Administration (2016) Federal Automated Vehicles Policy: Accelerating the Next Revolution In Roadway Safety. Technical report U.S. Department of Transportation. Cited by: §1.1.
  • B. Neyshabur, S. Bhojanapalli, D. Mcallester, and N. Srebro (2017) Exploring Generalization in Deep Learning. In Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.), Vol. 30, Red Hook, NY, USA, pp. . Cited by: §1.1.
  • H. Nissenbaum (1996) Accountability in a computerized society. Science and Engineering Ethics 2 (1), pp. 25–42. Cited by: §1.1, §1.1, §1, §1, §2.1, §2.1, §3.1, §3.2, §3.2, §3.2, §3.3, §3.3, §3.4, §3.4, §3, §4.1, §4, §5, footnote 17.
  • N. Okidebe (2022) Discredited Data. Note: Forthcoming, Cornell Law Review, Vol. 107, shared privately with the authors Cited by: §1.1, item 3.
  • Y. Ovadia, E. Fertig, J. Ren, Z. Nado, D. Sculley, S. Nowozin, J. V. Dillon, B. Lakshminarayanan, and J. Snoek (2019) Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty under Dataset Shift. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Red Hook, NY, USA. Cited by: §1.1.
  • N. Papernot, P. D. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami (2016) The Limitations of Deep Learning in Adversarial Settings. In 1st IEEE European Symposium on Security & Privacy, Cited by: §1.1.
  • S. Passi and S. Barocas (2019) Problem Formulation and Fairness. In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* ’19, New York, NY, USA, pp. 39–48. External Links: ISBN 9781450361255 Cited by: §3.1, §3.2.
  • A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala (2019) PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett (Eds.), pp. 8024–8035. Cited by: §3.1.
  • A. Pradhan, L. Findlater, and A. Lazar (2019) “Phantom Friend” or “Just a Box with Information” Personification and Ontological Categorization of Smart Speaker-based Voice Assistants by Older Adults. In Proceedings of the ACM on Human-Computer Interaction, Vol. 3, New York, NY, USA, pp. 1–21. Cited by: §3.3, §3.3.
  • E. Raff (2019) A Step toward Quantifying Independently Reproducible Machine Learning Research. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Red Hook, NY, USA. Cited by: §1.1.
  • I. D. Raji, A. Smart, R. N. White, M. Mitchell, T. Gebru, B. Hutchinson, J. Smith-Loud, D. Theron, and P. Barnes (2020) Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, New York, NY, USA, pp. 33–44. Cited by: §1.1, §1.1, §4.2.
  • A. L. Russell (2014) Open Standards and the Digital Age: History, Ideology, and Networks. Cambridge Studies in the Emergence of Global Enterprise, Cambridge University Press, Cambridge, UK. Cited by: item 4.
  • J. Sadowski, S. Viljoen, and M. Whittaker (2021) Everyone should decide how their digital data are used — Not just tech companies. Nature Publishing Group. Cited by: §1.1.
  • T. Scanlon (2000) What we owe to each other. Belknap Press, Cambridge, MA, USA. Cited by: §2.1.
  • M. Schlosser (2019) Agency. In The Stanford Encyclopedia of Philosophy, E. N. Zalta (Ed.), Note: Cited by: §3.3, footnote 12.
  • A. D. Selbst, d. boyd, S. A. Friedler, S. Venkatasubramanian, and J. Vertesi (2019) Fairness and Abstraction in Sociotechnical Systems. In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* ’19, New York, NY, USA, pp. 59–68. External Links: ISBN 9781450361255 Cited by: §3.1, §3.2.
  • A. D. Selbst (2021) An institutional view of algorithmic impact assessments. Harvard Journal of Law & Technology 35. Cited by: §1.1.
  • H. Shah (2018) Algorithmic Accountability. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 376 (2128). Cited by: §1.1.
  • C. M. Sharkey (2016) Can Data Breach Claims Survive the Economic Loss Rule. DePaul Law Review 66, pp. 339. Cited by: item 3.
  • H. Shen, W. H. Deng, A. Chattopadhyay, Z. S. Wu, X. Wang, and H. Zhu (2021) Value Cards: An Educational Toolkit for Teaching Social Impacts of Machine Learning through Deliberation. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, New York, NY, USA, pp. 850–861. Cited by: §1.1.
  • D. Shoemaker (2011) Attributability, answerability, and accountability: Toward a wider theory of moral responsibility. Ethics 121 (3), pp. 602–632. Cited by: §2.1.
  • P. T. Sivaprasad, F. Mai, T. Vogels, M. Jaggi, and F. Fleuret (2020) Optimizer Benchmarking Needs to Account for Hyperparameter Tuning. In Proceedings of the 37th International Conference on Machine Learning, H. D. III and A. Singh (Eds.), Proceedings of Machine Learning Research, Vol. 119, pp. 9036–9045. Cited by: §1.1, §3.1.
  • M. Sloane, E. Moss, O. Awomolo, and L. Forlano (2020) Participation is not a Design Fix for Machine Learning. Note: ArXiv preprint Cited by: §1.1, footnote 7.
  • M. Sloane, E. Moss, and R. Chowdhury (2021) A Silicon Valley Love Triangle: Hiring Algorithms, Pseudo-Science, and the Quest for Auditability. Note: ArXiv preprint Cited by: §3.2.
  • B. C. Smith (1985) The Limits of Correctness. SIGCAS Comput. Soc. 14,15 (1,2,3,4), pp. 18–26. External Links: ISSN 0095-2737 Cited by: §3.2.
  • N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov (2014) Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research 15 (56), pp. 1929–1958. Cited by: §3.2.
  • L. Stark and J. Hutson (2021) Physiognomic Artificial Intelligence. Note: SSRN preprint Cited by: §3.2.
  • T. Sun, A. Gaut, S. Tang, Y. Huang, M. ElSherief, J. Zhao, D. Mirza, E. Belding, K. Chang, and W. Y. Wang (2019) Mitigating Gender Bias in Natural Language Processing: Literature Review. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, pp. 1630–1640. Cited by: §3.2.
  • H. Surden and M. Williams (2016) Technological Opacity, Predictability, and Self-Driving Cars. Cardozo Law Review 38, pp. 121–181. Cited by: §1.1.
  • C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. J. Goodfellow, and R. Fergus (2014) Intriguing properties of neural networks. In ICLR (Poster), Cited by: §1.1.
  • M. Talbert (2019) Moral Responsibility. In The Stanford Encyclopedia of Philosophy, E. N. Zalta (Ed.), Cited by: §2.1.
  • P. Tereszkiewicz (2018) Digital platforms: regulation and liability in the EU law. European Review of Private Law 26 (6). Cited by: §3.4.
  • S. Turkle (2005) The Second Self: Computers and the Human Spirit. MIT Press, Cambridge, MA, USA. Cited by: §3.3, §3.3.
  • U.S. Government Accountability Office (2021) Artificial Intelligence: An Accountability Framework for Federal Agencies and Other Entities. External Links: Link Cited by: §1.1.
  • K. van Dorp (2002) Tracking and tracing: a structure for development and contemporary practices. Logistics Information Management 15 (1), pp. 24–33. External Links: ISSN 0957-6053 Cited by: §1.
  • B. Vecchione, K. Levy, and S. Barocas (2021) Algorithmic auditing and social justice: lessons from the history of audit studies. In Equity and Access in Algorithms, Mechanisms, and Optimization, pp. 1–9. Cited by: §1.1.
  • C. Véliz (2021) Moral zombies: why algorithms are not moral agents. AI & Society 36. Cited by: footnote 13.
  • S. Viljoen (2021) A relational theory of data governance. Yale Law Journal 131 (2). Cited by: §1.1.
  • S. Wachter, B. Mittelstadt, and L. Floridi (2017) Transparent, Explainable, and Accountable AI for Robotics. Science (Robotics) 2 (6). Cited by: §1.1.
  • S. Wachter and B. Mittelstadt (2019) A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI. Columbia Business Law Review 2. Cited by: §1.1.
  • A. E. Waldman (2019) Power, Process, and Automated Decision-Making. Fordham Law Review 88. Cited by: §1.1.
  • Y. Wang and M. Kosinski (2018) Deep neural networks are more accurate than humans at detecting sexual orientation from facial images. Journal of Personality and Social Psychology 114, pp. 246–257. Cited by: §3.2, §4.1.
  • G. Watson (1996) Two Faces of Responsibility. Philosophical Topics 24 (2), pp. 227–248. External Links: ISSN 02762080, 2154154X Cited by: §2.1, footnote 5.
  • M. Weber (1978) Max Weber: Selections in Translation. Cambridge University Press, Cambridge, UK. Cited by: §2.2, §4.2.
  • J. Whittington, R. Calo, M. Simon, J. Woo, M. Young, and P. Schmiedeskamp (2015) Push, pull, and spill: A transdisciplinary case study in municipal open government. Berkeley Technology Law Journal 30 (3), pp. 1899–1966. Cited by: §1.1.
  • M. Wieringa (2020) What to account for when accounting for algorithms: a systematic literature review on algorithmic accountability. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, New York, NY, USA, pp. 1–18. Cited by: §1.1, §2.2, §2.2, §2.
  • X. Wu and X. Zhang (2016) Automated Inference on Criminality using Face Images. Note: ArXiv preprint Cited by: §3.2.
  • Y. Yang and M. Rinard (2019) Correctness Verification of Neural Networks. Note: NeurIPS 2019 Workshop on Machine Learning with Guarantees Cited by: §1.1.
  • M. Young, L. Rodriguez, E. Keller, F. Sun, B. Sa, J. Whittington, and B. Howe (2019) Beyond open vs. closed: Balancing individual privacy and public accountability in data sharing. In Proceedings of the Conference on Fairness, Accountability, and Transparency, New York, NY, USA, pp. 191–200. Cited by: §1.1.
  • D. Zhang and J. J. Tsai (2003) Machine learning and software engineering. Software Quality Journal 11 (2), pp. 87–119. Cited by: §1.
  • R. Zhang, A. F. Cooper, and C. M. De Sa (2020) Asymptotically Optimal Exact Minibatch Metropolis-Hastings. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), Vol. 33, Red Hook, NY, USA, pp. 19500–19510. Cited by: §1.1.