An Algorithmic Equity Toolkit for Technology Audits by Community Advocates and Activists

12/06/2019, by Michael Katell, et al. (University of Michigan, Middlebury College, University of Oxford, West Virginia University, University of Washington)

A wave of recent scholarship documenting the discriminatory harms of algorithmic systems has spurred widespread interest in algorithmic accountability and regulation. Yet effective accountability and regulation are stymied by a persistent lack of resources supporting public understanding of algorithms and artificial intelligence. Through interactions with a US-based civil rights organization and their coalition of community organizations, we identify a need for (i) heuristics that aid stakeholders in distinguishing between types of analytic and information systems in lay language, and (ii) risk assessment tools for such systems that begin by making algorithms more legible. The present work delivers a toolkit to achieve these aims. This paper both presents the Algorithmic Equity Toolkit (AEKit) as an artifact, and details how our participatory process shaped its design. Our work fits within human-computer interaction scholarship as a demonstration of the value of HCI methods and approaches to problems in the area of algorithmic transparency and accountability.

1. Introduction

Extensive evidence demonstrates that the harms of algorithmic and information technologies are significant and span highly varied applications. Automated pretrial and sentencing risk assessment systems used in courts of law are racially biased (Desmarais and Lowder, 2019; Dressel and Farid, 2018; Angwin et al., 2016), facial recognition is racially and gender biased (Buolamwini and Gebru, 2018), algorithmically supported hiring decisions are gender biased (Dastin, 2018), automated license plate readers lead to unwarranted police stops (Lecher, 2019), sensitive financial information has been stolen in major privacy breaches (Cowley, 2019), digital currencies are susceptible to price manipulation (Gandal et al., 2018; Krafft et al., 2018; Donovan and Friedberg, 2019), social media is susceptible to disinformation campaigns (Woolley and Howard, 2018; Tucker et al., 2018), labor platforms are exacerbating precarious labor conditions (Rosenblat and Stark, 2016), and much more.

Community organizations and civil rights groups concerned about the discriminatory risks of public sector technology adoption have pushed for algorithmic equity—including accountability, transparency, and fairness—through the implementation of municipal ordinances in several U.S. cities. These ordinances manage the acquisition and use of surveillance technologies and other automated decision systems. For instance, Berkeley, Cambridge, Nashville, Seattle, and others have passed surveillance ordinances to provide a degree of oversight for regulating such government technologies (10). Particular to the context of our research, the City of Seattle passed one of the first and strongest surveillance ordinances in 2017, mandating the publication of a “master list” of government surveillance technologies and a series of “surveillance impact reports” (SIR) that include input from both city personnel and designated community representatives (González, 2017; Harrell, 2018).

Yet existing legislation does not go far enough to address the risks at hand. Policy-makers and community members alike find algorithmic systems to be inscrutable and illegible. Risks already subject to existing legislation are not being recognized because they are tied to opaque and ill-understood algorithmic systems, and few pieces of enacted legislation aim regulation at the algorithmic level. Our prior work examined the Seattle Surveillance Ordinance as a case study and found that city personnel tasked with implementing that city's surveillance ordinance did not consider any of the surveillance technologies in their portfolio to be algorithmic systems, even when multiple technologies employed machine learning algorithms such as optical character recognition and facial recognition; these stakeholders instead focused on the technologies' data collection functions and privacy implications (Young et al., 2019a). This finding suggests a crisis of legibility in algorithmic regulation.

Our strategy to address the crisis of legibility around algorithmic systems is to empower community members to hold vendors and policy-makers accountable for these neglected algorithmic harms. Through interactions with a US-based civil rights organization and their coalition of community organizations, we identify a need for (i) heuristics that aid community organizers and community members in distinguishing between types of analytic and information systems in lay language, and (ii) risk assessment tools for such systems that begin by making algorithms more legible. The present work delivers a toolkit, the Algorithmic Equity Toolkit (AEKit), to achieve these aims. This paper both presents the AEKit as an artifact and provides a case study of the participatory process that shaped its design. We report on our iterative participatory design sessions with members of local civil rights organizations as the intended user base of the toolkit, and on the Diverse Voices panels (Young et al., 2019b) we conducted to elicit additional input from those with lived experience of the harms motivating the toolkit.

2. Related Work

The human computer interaction (HCI) research community is increasingly turning its attention to the role that algorithms play in shaping sociotechnical systems, such as social media platforms, labor markets, and reputation scores. Algorithms are opaque due to multiple factors, including trade secrets, technical unfamiliarity, and the complexity of machine learning and related techniques (Burrell, 2016). As a result, users are sometimes not aware of the role algorithms play in shaping social worlds (Eslami et al., 2019). Indeed, researchers have highlighted the seeming “invisibility” of algorithms in sociotechnical systems. How embedded an algorithm is in a system changes with respect to the viewer; that is, algorithms are “relational” like attributes of other infrastructural systems (Star, 1999).

Users form their own beliefs about how algorithms work (Rader and Gray, 2015), sometimes referred to as “algorithmic literacy” (Rainie and Anderson, 2017). These beliefs may not adhere closely to the way an algorithm works (Eslami et al., 2015). Nevertheless, these lay understandings, or “folk theories” of algorithms (DeVito et al., 2018a, b) shape user behavior (Nagy and Neff, 2015). Advanced users try to leverage what they know about how a system works in order to achieve more visibility on social media feeds (Bucher, 2012; Cotter, 2019; Bishop, 2019). Existing approaches for making the functioning of algorithmic systems more transparent have primarily adopted the form of textual explanations; for example, of what, how, or why a newsfeed algorithm performed as it did (Rader et al., 2018); counterfactual explanations of what set of circumstances would result in a different algorithmic decision (Wachter et al., 2017); and explanations of how a particular personalized ad was shown to a specific user (Eslami et al., 2018). Some work has explored the potential for regulation to mandate such explanations (Selbst and Powles, 2017). Amid growing interest in this area, researchers call for further application of methods found in HCI toward more user-centered design of these tools (Inkpen et al., 2019).

Our toolkit relates to and differs from existing HCI efforts addressing this need. Compared with similar toolkit efforts such as the AI Blindspot toolkit (1), we emphasize civically engaged community stakeholders as our intended users rather than the companies or government agencies that design and deploy the algorithms. In that way, similar to Woodruff et al. (Woodruff et al., 2018), we seek to empower historically marginalized communities. In contrast to the work of Woodruff et al., though, we do so by providing tools that enable recognition of and engagement with algorithms rather than soliciting perspectives on salient algorithms as they are readily understood. We also observed that other notable flowcharts that define and demystify AI for non-experts similarly rely on anthropomorphic metaphors, such as asking whether a system can “see” (https://www.technologyreview.com/s/612404/is-this-ai-we-drew-you-a-flowchart-to-work-it-out/). Our toolkit attempts to avoid these metaphors and adhere more closely to describing system functions. Because we designed the tool to be adaptable to a wide range of systems, it highlights that even conventional systems like Microsoft Excel could be used for automated decision system processes and should be subject to oversight. Finally, in comparison to methods in the area of explainable AI (e.g., (Miller, 2019)), we develop a tool for first identifying the presence of an algorithm as a path to understanding it, a necessary prerequisite to pursuing explainability, and we emphasize consideration of sociotechnical context (cf. (Selbst et al., 2019)).

3. Design Context

Our research takes place in a major U.S. city that has implemented a strong municipal surveillance ordinance. The state’s legislature also recently drafted a tech fairness bill that is a first step in the direction of broad algorithmic regulation. Yet, previous research indicates that even expert policymakers are not prepared to understand the particular risks of algorithmic systems as such. In this participatory research project, we designed a toolkit that can be adopted within government, by civil rights organizations, and by individual community organizers to strengthen existing, ongoing, and future regulatory efforts.

Our work intervenes in a critical gap in non-expert understanding of complex (and proprietary) algorithmic systems. Both within and beyond the public sector, grassroots and advocacy organizations desire visibility into systems that could have disparate impact on historically marginalized communities, but they lack domain knowledge and a set of recommended processes for exposing such systems to oversight. Furthermore, such systems are typically ”black boxes,” provided by vendors who are often unwilling to reveal key aspects of their functionality. Even when a system’s functions are well-documented, the vectors of disparate impact are not readily apparent. To remedy this gap in understanding, and to provide those affected with tools necessary to hold algorithmic systems accountable, we co-designed the Algorithmic Equity Toolkit (AEKit) with community stakeholders in order to equip non-experts with a process and tools for surfacing unintended impacts of systems in use.

4. Methods

We iteratively developed the AEKit through a participatory design process that engaged data science experts, community partners, and policy advocates, while also drawing upon an array of prior literature (Dillahunt et al., 2017; Erete et al., 2018; Green, 2018) and similar toolkit efforts (15; 1). Initially, based on the regulatory focus of prior academic research, we envisioned that the primary users of the Algorithmic Equity Toolkit would be employees in state and local government seeking to surface the potential for algorithmic bias in existing systems. We thought advocacy and grassroots organizations could also find the toolkit useful for understanding the social justice implications of public sector systems.

Through our participatory design process, we refined our audience and design goals to focus on helping civil rights advocates and community activists, rather than state employees, identify and audit algorithmic systems embedded in public-sector technology, including surveillance technology. We achieve this goal through three toolkit components: (1) a flowchart designed to help lay users identify algorithmic systems and their functions; (2) a Question Asking Tool (QAT) for surfacing the key issues of social and political concern for a given system; and (3) an interactive web tool that illustrates the underlying mechanics of facial recognition systems, such as the relationship between how models are trained and adverse social impacts. Together, the first two components reveal a system's technical failure modes (i.e., its potential for not working correctly, such as false positives) and its social failure modes (i.e., its potential for discrimination even when working correctly). In creating our toolkit, we followed a weekly prototyping schedule interspersed with stakeholder feedback and co-design sessions.

The underlying questions that drove our design of these components were: What ethical issues should civil rights advocates be concerned with regarding surveillance and automated decision systems? How are algorithmic systems reinforcing bias and discrimination? What do community organizers and non-technical experts understand about algorithmic tools and their impacts? What should they know about surveillance and automated decision systems in order to identify them and understand how they work? In comparison to existing resources, which tend to target software engineers and in some cases policymakers, we focused on policy advocates and community activists as users.

4.1. Team Composition

Our team consisted of a mix of students and researchers with expertise in policy analysis, qualitative research, human-centered design, computer science, data science, information ethics, and sociology.

4.2. Stakeholders

We envision our target users employing the toolkit to better inform their activism efforts around tech fairness policy. We foresee community organizers and organization leaders using the toolkit to aid their understanding of the different functions of government technologies and the potential biases in the use of algorithmic systems in their specific city and in society at large. Stakeholder engagement was a key component in the development of our toolkit. As the work began, the group articulated reasons for engaging directly with community stakeholders in our design activities. These reasons included: (i) scoping and defining the problem space; (ii) understanding the broader context of the problem; (iii) testing and interpreting motivating concepts; (iv) assessing the usefulness and accessibility of the toolkit; (v) prototyping; (vi) ensuring technical accuracy; and (vii) planning for toolkit implementation and stewardship.

We partnered with a prominent civil rights organization at the forefront of advocating for transparency and accountability from state and local governments, and with two member organizations from its network that advocate for the rights of historically marginalized communities. All of these stakeholders have collaborated on tech fairness and advocacy work and have demonstrated enthusiasm for our toolkit as a means of helping their members raise concerns about the potential harms of algorithmic systems to policy makers and other public officials.

4.3. Engagements

The team also met regularly with five members of a data science lab to receive feedback on the definitions and conceptualizations used in the toolkit.

4.4. Ethics

4.4.1. Messaging

Algorithmic harm is an issue reflective of systemic and structural inequality. When evaluating technologies used to manage and control populations, it is not uncommon to limit the discussion of merits to the scope of functionality, asking only whether results are "accurate," "effective," or "predictive." Our concern is broader: through the design of this toolkit, we seek also to interrogate the social context of technology and to surface risks to the goals of establishing and maintaining a just and civil society. When confronted with facial recognition systems, for example, in addition to questioning their accuracy or advocating for diversified data models to improve identification of women and people of color, we push further and also question whether this software should have a place in a democratic society at all, and whether the potential benefits of a perfectly accurate facial recognition system outweigh the panoptic harms.

4.4.2. Stakeholders

Our aim was to incorporate as much input as possible from our stakeholders without customizing the tool too closely to the needs or desires of any one group. One of our criticisms of existing tools is that they have not gone far enough to engage with stakeholder perspectives outside of academia and industry. In response, we chose to focus heavily on the needs of underrepresented populations and members of historically marginalized communities in particular. However, given both the quantity and diversity of community stakeholders, creating a tool that serves such a breadth of users presents a challenge. Thus, one ethical consideration is whether to design the tool with all, several, or just one stakeholder group in mind.

4.4.3. How we addressed ethical concerns

We addressed these concerns by connecting with diverse stakeholders and a coalition of community groups, and by co-designing with a leading civil rights organization and other stakeholders to ensure that our prototype and final product address stakeholders' concerns. While stakeholder engagement was fruitful and educational for the team, it also presented some constraints and limitations. Algorithmic systems are complex and are understood differently by lay and expert observers. Feedback about the toolkit content from community organizers differed starkly from that of the data scientists. The latter favored rich descriptions of the algorithmic processes that correctly identify machine learning concepts, such as clustering and classification, whereas community stakeholders were easily overwhelmed by such technical language. This tension revealed the challenge of making interpretive tools that are accessible to lay users while not misrepresenting the computational concepts they are designed to make salient. Balancing the competing desires of these stakeholder groups was difficult, but it ultimately forced us to continually revise our approach so that the tool remained accessible while still being meaningful.

5. Participatory Design

In this section we describe how participants impacted the design at the outset of the project and throughout. These impacts are summarized in Table 1. The AEKit grew out of initial conversations with our partnering civil rights organization. In those conversations we learned that the organization needed technical expert support for its technology policy advocacy, both to provide additional input on technology and to communicate why community engagement is needed as a robust part of policy-making efforts. In responding to those needs, we identified the opportunity to develop pedagogical tools for algorithmic literacy. While we initially conceptualized these pedagogical tools as oriented toward a policy-maker audience, our partner civil rights organization encouraged us instead to think about how we might empower community organizers and community members. The project benefited greatly from a standing coalition of community organizations assembled by this civil rights organization. Members of this coalition became further partners in our participatory design process. Through interactions with these community organizations, and with the Diverse Voices panels we convened, we placed further emphasis on the contexts of use of algorithmic technologies and on the simplicity and accessibility of the toolkit.

Time Impact
May 2018 In initial conversations with a partnering civil rights organization, we learned that the organization needed technical expert support on technology policy advocacy to provide additional input on technology, and communicating why community engagement is needed as a robust part of policy-making efforts.
June 2018 - September 2018 Initial research into the technical capabilities of disclosed surveillance technologies and the municipal oversight process emphasized the need for an intervention in public understanding of algorithmic technologies
January 2019 Initial design of Algorithmic Equity Toolkit proposed, targeting policy makers with a tool for identifying algorithmic systems and a checklist of red flags
January - February 2019 Partnering data science institute encourages inclusion of a technical component to the project to better suit learning objectives of students on the project. This technical component is formulated to be a web demo.
February 2019 Civil rights organization joins as a formal partner in the project
February 2019 We held planning conversations with partners asking, ”Given time and resource constraints, what process should our co-design follow?” Community partners requested continuous engagement in our process in addition to our initially planned Diverse Voices panels near the end. Based on this feedback, our team pivoted to a project that was participatory throughout.
June 2019 We held an initial meeting with our primary partner organization, asking ”What does your advocacy look like in this space?” ”Is there anything that would be useful to your organizing efforts?” After input from this partner, we pivoted to focus on supporting the organizing efforts of civil rights organizations through empowering community organizers and activists rather than targeting policymakers as our main audience in the first instance.
June 2019 We held initial envisioning sessions with small groups comprised of members of community partner organizations, asking ”What is your current capacity to advocate in the area of algorithmic decision systems?” and ”What would support your work?” Learning about the policy context led us to focus on intervening at the public comment period in the oversight of surveillance and ADS technologies.
July 2019 We held another round of co-design sessions with participants, sharing initial artifacts. Participants directed the toolkit design to be less technical to enable broader diffusion and use. As a result, the toolkit shifted from a focus on machine learning concepts to embracing the wider sociotechnical context of use (e.g., ”Is the operator being trained in the accuracy levels of the system?”)
August 2019 We conducted 3 Diverse Voices panels with members of communities historically harmed by surveillance and ADS technologies. Panelists identified several accessibility barriers; as a result of this input we modified the toolkit design to be more concise for field use, and more focused on algorithmic harm.
Table 1. A timeline of major impacts of participatory engagements.

6. The Algorithmic Equity Toolkit

At the time of writing, the toolkit has three components (the latest version is available online at https://github.com/anon770/toolkit_anon; the repository is organized to demonstrate how the toolkit evolved in response to stakeholder engagements):

  1. A flowchart for distinguishing surveillance and ADS’s and their different functions.

  2. A question-asking tool for surfacing the social context of a given system, its technical failure modes (i.e., potential for not working correctly, such as false positives), and its social failure modes (i.e. its potential for discrimination when working correctly).

  3. An interactive demo of facial recognition that reveals the underlying harms and mechanics of facial recognition technology.

Illustrations of the versions of these three components at the time of writing are shown in Figure 1.

Figure 1. The three primary components of the Algorithmic Equity Toolkit, consisting of (a) an identification guide for automated decision systems, (b) a questionnaire on potential harms, and (c) an interactive tool demonstrating algorithmic bias for a particular technology. In its initial scope, we conceived the primary users of the Algorithmic Equity Toolkit as employees in state and local government seeking to surface the potential for algorithmic bias in existing systems. The choice to center advocacy and grassroots organizations, particularly in support of their policy advocacy work in this space, emerged early on from our participatory design process.

The primary users of the Algorithmic Equity Toolkit will be community organizers, including civil rights advocacy and grassroots organizations, as well as anyone interested in algorithmic equity. A secondary target audience includes personnel tasked with implementing technologies for managing and controlling populations. A key goal is to overcome power asymmetries between individuals and systems of authority, such as government agencies that should be held accountable for the technologies they implement in their communities. The toolkit can be used when engaging with policymakers and other public officials, or in other contexts where individuals and groups want to learn more about surveillance and ADS technologies and their potential harms.

An ADS is a computerized implementation of an algorithmic system that assists in decision-making by humans or takes specified actions automatically. ADS's are increasingly used in our society to analyze data and make decisions more quickly and efficiently; however, their increasing use reduces transparency and accountability because of their complexity and the lack of public awareness about how they work. The steps for using the present toolkit are:

  • Start with the Surveillance and ADS Identification Guide. Use this guide to determine whether a government technology is a surveillance or ADS tool or system, and to understand the different functions of surveillance and ADS tools and systems. With the Surveillance and ADS ID Guide, civil rights advocates can better detect the presence of algorithms and understand what those features do.

  • Questionnaire. Use the questionnaire to inquire about the potential harms of surveillance or ADS technologies when engaging with policymakers and other public officials.

  • Interactive facial recognition web demo. Click on [link] to access the interactive facial recognition demo, which illustrates some of the harms of the technology.

6.1. Flowchart for identifying a machine learning or AI system

6.1.1. Unmet need:

Information technologies are an increasing part of our everyday lives. Some technologies are more impactful than others, potentially affecting individual and group autonomy, civil rights, and safety. Our work with community groups and civil rights activists suggests that ensuring that the effects of information technologies are mainly positive, or that their negative aspects are minimized, begins with recognizing and understanding the technologies in our midst. This is particularly true of public-sector technologies, where the principles of democratic governance require that state actors be accountable to the public for the tools and technologies they use to manage and control the population. Research by Young et al. (Young et al., 2019a) suggests that the public, including policy makers, needs assistance in identifying the opaque algorithmic aspects of public sector systems so that technology implementations can be sufficiently transparent and publicly accountable.

6.1.2. Meeting the need of helping community organizers understand: Where is the algorithm in this system—what is the algorithm doing?

As described by Young et al., lay observers, including professionals who should know, often do not recognize that a system is ”algorithmic.” At other times, people may know a technology is algorithmic but not know how the algorithm comes into play. In still other cases, there are systems that can be understood as algorithmic but whose harms are not necessarily of concern (e.g., simple calculators, thermostats). The goal of the flowchart tool is to signal the presence of algorithms likely to pose harms, especially harms that correspond to marginalized identities and histories of discriminatory state action. The tool represents a set of definitional criteria which, when applied to algorithmic systems, help to scope which technologies should be part of the conversation.

6.1.3. Form:

The tool we developed guides users through a process for identifying components of technical systems that are algorithmic. Many technological artifacts are ambiguous as to their inner functionality, leaving observers, including users, unaware of what kind of work the artifact does over and above its most obvious functions. To make embedded features more salient and open to questioning, our flowchart tool offers a decision tree for contemplating what has been disclosed or can be observed about a technology, providing a verdict about whether it might be an AI system. While some systems are relatively straightforward, either because their functions are obvious, publicized, or fully disclosed, other technologies are more challenging to unpack. An example of the former is booking photo comparison software (BPCS), which employs an algorithmic system that has already faced considerable public scrutiny: facial recognition. Many other artifacts contain algorithmic features that are much harder to detect simply by encountering them, or even by having them explained by a public official or software vendor.

The flowchart differentiates algorithmically-enhanced systems from systems that are merely surveillant (i.e., only a data collection tool and not a tool that performs, say, an analysis, renders action-guiding judgements, or takes its own actions). An automated license plate reader (ALPR) may appear at first to be merely surveillant, basically a device that captures license plate images. But embedded within are AI components such as computer vision, algorithms for recognizing alpha-numeric sequences, and the matching of results to lists of license plates of interest. It is helpful to understand these features because, over and above whatever functionality is most obvious (e.g., a camera), embedded systems have their own failure modes, design constraints, and social valences that can contribute to the artifact's impact on individuals and communities. For example, some ALPR systems do not detect the issuing state of a license plate, meaning that a driver from Arizona could be misidentified as a driver from Pennsylvania whose license plate contains a similar alpha-numeric sequence. Even when such a system accurately identifies a license plate of interest, there are questions about the social conditions that lead to drivers becoming subjects of detection, such as the correlation between unpaid parking tickets and racialized poverty, that cannot be asked without peeling back the layers of technology to the sociotechnical imaginaries bundled within.
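To make this identification step concrete, the following is a minimal sketch, in Python, of the kind of triage logic the flowchart walks users through; the question wording, ordering, and output labels are illustrative assumptions on our part rather than the published flowchart's content.

def classify_system(collects_data_about_people: bool,
                    interprets_or_scores_data: bool,
                    guides_or_takes_action: bool) -> str:
    """Rough triage of a government technology based on its observable functions."""
    if not collects_data_about_people:
        return "likely out of scope: no data about people is collected"
    if interprets_or_scores_data or guides_or_takes_action:
        # Interpretation, matching, scoring, or automated action suggests an
        # algorithmic / automated decision system rather than mere data collection.
        return "likely an automated decision system (ADS): continue to the QAT"
    return "likely a surveillance (data collection) tool: oversight still applies"

# Example: an ALPR both captures images and matches plates against hotlists,
# so it is more than merely surveillant.
print(classify_system(collects_data_about_people=True,
                      interprets_or_scores_data=True,
                      guides_or_takes_action=True))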

6.2. Asking the right questions

6.2.1. Unmet need:

Having identified an algorithmic system, the next step is to pose questions about it: about its functions and features, about the claims made about its efficacy, and about its potential to harm those to whom it is applied. Armed with a narrowly tailored set of questions, community organizers and activists can contest the narratives provided to them by authority figures and product vendors, proposing richer shared meanings for the technologies in question. Given a camera with facial recognition capabilities, for example, toolkit users will be able to address concerns about this technology, such as issues of race and gender detection parity and the potential for the tool to contribute to oppressive feedback loops in which systemic discrimination is reproduced through the use of the tool by institutions with a history of discriminatory action. In creating this tool, we set some baseline standards, including: (i) it must be intuitive and legible to non-technical users; and (ii) questions should employ familiar language to the extent possible.

6.2.2. Form:

The Question Asking Tool (QAT) guides users through the salient issues presented by an algorithmic system. Its goal is to surface social contexts and technical failure modes and to prompt questions that reveal potential harms, particularly harms to particular communities and identities. The QAT could also contribute to the algorithmic impact assessments required by local and international laws (e.g., the General Data Protection Regulation) and recommended by legal experts and other scholars (Article 29 Data Protection Working Party, 2018; Koops et al., 2016; Edwards and Veale, 2017), including the public accountability processes required by municipal surveillance ordinances in the United States. The tool can also be used by individual community members in dialogue with public officials and other authority figures. It distills known harms from the Fairness, Accountability, and Transparency literature and translates them for a non-specialist audience.

The QAT prompts toolkit users to identify the socio-ethical issues community advocates and civil rights activists should be concerned with in regards to algorithmic systems. In what ways does a particular type of algorithmic system reinforce bias and discrimination? What should individuals and groups with little or no technical expertise understand about the impacts of algorithmic tools? What answers should they demand from public officials and other authority figures implementing management and control technologies in their communities? The QAT contains a series of questions sorted into categories designed to assess an algorithmic system’s potential harms in regard to social impact, appropriate use, transparency and accountability, data security and privacy, and interpretability or operability.
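As a rough illustration of how the QAT's structure could travel into field use, the sketch below encodes the question categories named above as a simple Python data structure; the category names follow the text, while the example questions are hypothetical placeholders rather than the QAT's actual wording.

# Minimal sketch of the QAT's categories; example questions are illustrative only.
QAT_CATEGORIES = {
    "Social impact": [
        "Which communities are most likely to be affected if the system fails?",
    ],
    "Appropriate use": [
        "What uses of the system are explicitly prohibited, and who enforces that?",
    ],
    "Transparency and accountability": [
        "Who reviews the system's outputs, and how can a decision be appealed?",
    ],
    "Data security and privacy": [
        "What data is retained, for how long, and who can access it?",
    ],
    "Interpretability and operability": [
        "Are operators trained on the system's known accuracy limits?",
    ],
}

for category, questions in QAT_CATEGORIES.items():
    print(category)
    for question in questions:
        print("  -", question)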

6.3. Interactive demo of intersectional failures of facial recognition

6.3.1. Unmet need:

Observers may have heard that algorithmic systems are problematic but may have difficulty envisioning and internalizing what those problems are. The interactive demo makes at least some issues of algorithmic sorting and decision making salient to the user.

6.3.2. Form:

The interactive demo tool illustrates algorithmic harms such as bias in machine learning stemming from technical limitations and model representation, among other problems. Our demo involved running ten celebrity photos through OpenFace's model against a database of 60 celebrity photos collected from Labeled Faces in the Wild and Google image searches. We then selected the eight closest images for each of the ten celebrity photos to include in our demo. Across the ten celebrity photos, the minimum similarity score among the eight closest images was 0.15, between a photo of Aaron Peirsol and Ai Sugiyama, and the maximum similarity score was 1.384, between two different photos of LeBron James. Overall, celebrities with lighter skin tones had lower similarity scores than celebrities with darker skin tones. The differences in similarity scores along lines of skin tone shown by our demo are consistent with the literature on facial recognition accuracy across skin tones (Buolamwini and Gebru, 2018).
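For readers curious about the shape of the computation behind the demo, the following is a minimal sketch under stated assumptions: we assume face embeddings (e.g., the 128-dimensional vectors OpenFace produces) have already been extracted for a probe photo and a 60-photo gallery, and we use Euclidean distance with a top-8 selection purely for illustration, whereas the demo itself reports similarity scores.

import numpy as np

rng = np.random.default_rng(0)
gallery = rng.normal(size=(60, 128))   # stand-in for 60 precomputed face embeddings
probe = rng.normal(size=128)           # stand-in for one probe photo's embedding

# Euclidean distance between embeddings; smaller distance = more similar faces.
distances = np.linalg.norm(gallery - probe, axis=1)

# Take the 8 closest gallery photos, as the demo does for each probe image.
top8 = np.argsort(distances)[:8]
for rank, idx in enumerate(top8, start=1):
    print(f"rank {rank}: gallery photo {idx}, distance {distances[idx]:.3f}")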

7. Discussion

We observed that our efforts toward equity in public-sector algorithmic systems required articulation work, or alignment (Corbin and Strauss, 1993), between the expertise of three distinct groups: civil rights legal experts, technology experts, and those with the lived experience of being differentially targeted by ADS's. The shortest path to integrating these different knowledges was to traverse the social distance between them with a prototype in hand, letting each stakeholder interaction inform our subsequent encounters. Through frequent, concurrent probing with each of these groups, the territory of the intervention space began to reveal itself. Though we aim for the toolkit to serve as an educational aid, reinforcing connections between these three critical groups was no less important to us. The former is the foundation for individual awareness; the latter is the foundation for the collective action needed to propel tactical and just action that can move ADS use closer toward social equity and accountability. To effect such change, it is not enough for those impacted by algorithmic systems to better understand the mechanics of these technologies. They must have a sense of the recourse that may mitigate harms.

But pushing knowledge in one direction is not enough (cf. the failures of the “deficit model” of public understanding of science (Sturgis and Allum, 2004)). True change also demands that technologists better understand the cultural, social, and legal frames of these technologies, as well as the lived experience of those particularly impacted by their designs. Likewise, legal and political experts better align with the aims of civil society when they have a more grounded understanding of the technologies along with the lived experience of the affected. Such multi-directional co-learning necessitates a more demanding design process in which the problem and potential solutions are articulated by each respective expert. Initially this results in confusion and ambiguity, as the ways of conceiving of these technologies are not mutually intelligible. After several iterations of articulation (and rearticulation), a solution space can emerge that is truly reflective of all of these expertises. This co-produced understanding may be the most important contribution of this work. Yet the social and technological complexities of algorithmic technologies inevitably slow the progress of multilateral co-production. Our initial co-articulations are incomplete and provisional. We expect it will take many years of such effort to achieve a fully articulated, mutually understandable operational vision of algorithmic justice. This work is but one early starting point. For this reason, we reflect on this work as an example of Research through Design (Bardzell et al., 2015, 2016; Zimmerman et al., 2007; Gaver, 2012).

7.1. Limitations

The AEKit has several limitations. First, we found that there is little consensus about the definition and structure of concepts like artificial intelligence, machine learning, and automated decision systems. Second, meeting with many diverse stakeholders presented a challenge in building the toolkit. Such a diverse set of community organizers inevitably brought diverse and sometimes conflicting priorities, and it proved difficult to meet all expectations. We also faced the challenge of balancing technical detail against simplicity: understanding users' baseline level of knowledge, determining what would and would not be helpful for them to know, and connecting the flowchart to the checklist and interactive demo in a fluid way were all challenges. Finally, we did not want our toolkit to communicate that the harms we covered were comprehensive. For instance, while the facial recognition demo showcases inaccuracies in facial recognition, it runs the risk of communicating to users that our goal is accuracy in facial recognition. A more complete demo would attend more fully to the distinct and troubling harms of fully accurate face recognition and surveillance. To mitigate this risk, we decided to include case studies and quotes from stakeholders voicing this concern alongside the technical demo. In unpacking this tension, we also note that this failure may be inherent to the framing of fairness with respect to different social and demographic groups. As Hoffmann explains (Hoffmann, 2019), the hierarchical logic underpinning the discourse of “fairness” may reproduce disadvantage rather than mitigate it.

A broader concern is the potential negative impact of flowcharts and checklists on accountability and regulatory work. In the better-developed area of environmental regulation, the environmental review process and negative declarations are well intended and effective in many ways. However, that process has had unintended consequences that were not foreseen by those who designed and adopted the (now standard) environmental review procedures. For example, environmental reviews have been used very effectively as instruments of class conflict, with wealthier communities and individuals able to slow or stop any kind of development they find undesirable. This has had a tremendous and deleterious impact on affordable housing. The problem of hijacking environmental review for other ends appears unfixable at this point, four to five decades into a now well-established procedure. One idea for avoiding this kind of issue is to require an independent review of both risks and benefits.

8. Conclusion

Community organizers and civil rights activists throughout the U.S. are concerned about surveillance technologies being implemented in their communities. There is concern that these technologies are being used by law enforcement and other public officials to profile and target historically marginalized communities. Activists and advocates have pushed for algorithmic equity (accountability, transparency, and fairness) through the implementation of legislation such as municipal surveillance ordinances that regulate and supervise the acquisition and use of surveillance technology. Major cities, including Seattle, Berkeley, Nashville, Cambridge, and others, have implemented ordinances that differ in their scope, process, and power in regulating government technologies. However, most technology policy legislation in the U.S. fails to manage the growing use of automated decision systems such as facial recognition and predictive policing algorithms. Despite its limitations, the Algorithmic Equity Toolkit is a vital tool that community civil rights advocates can use to voice their concerns about these technologies during the decision-making processes surrounding their acquisition. Our work fits within HCI scholarship as a demonstration of the value of HCI methods and approaches to problems in the area of algorithmic transparency and accountability.

Acknowledgements.
Blinded for peer review.

References

  • [1] AI Blindspot: A Discovery Process for preventing, detecting, and mitigating bias in AI systems. Note: https://aiblindspot.media.mit.edu/ Cited by: §2, §4.
  • J. Angwin, J. Larson, S. Mattu, and L. Kirchner (2016) Machine Bias: There’s Software Used Across the Country to Predict Future Criminals. And it’s Biased Against Blacks.. ProPublica. Cited by: §1.
  • Article 29 Data Protection Working Party (2018) Guidelines on Automated individual decision-making and Profiling for the purposes of Regulation 2016/679 (wp251rev.01). Technical report Technical Report WP251rev.01. Cited by: §6.2.2.
  • J. Bardzell, S. Bardzell, P. Dalsgaard, S. Gross, and K. Halskov (2016) Documenting the Research Through Design Process. In Proceedings of the 2016 ACM Conference on Designing Interactive Systems - DIS ’16, Brisbane, QLD, Australia, pp. 96–107. Cited by: §7.
  • J. Bardzell, S. Bardzell, and L. Koefoed Hansen (2015) Immodest Proposals: Research Through Design and Knowledge. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems - CHI ’15, Seoul, Republic of Korea, pp. 2093–2102. Cited by: §7.
  • S. Bishop (2019) Managing visibility on youtube through algorithmic gossip. New Media & Society, pp. 1461444819854731. Cited by: §2.
  • T. Bucher (2012) Want to be on the top? algorithmic power and the threat of invisibility on facebook. New media & society 14 (7), pp. 1164–1180. Cited by: §2.
  • J. Buolamwini and T. Gebru (2018) Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. In Proceedings of Machine Learning Research, Vol. 81, New York, NY, pp. 15. Cited by: §1, §6.3.2.
  • J. Burrell (2016) How the machine ‘thinks’: understanding opacity in machine learning algorithms. Big Data & Society 3 (1), pp. 2053951715622512. Cited by: §2.
  • [10] Community Control Over Police Surveillance. Note: https://www.aclu.org/issues/privacy-technology/surveillance-technologies/community-control-over-police-surveillance Cited by: §1.
  • J. M. Corbin and A. L. Strauss (1993) The articulation of work through interaction. The sociological quarterly 34 (1), pp. 71–83. Cited by: §7.
  • K. Cotter (2019) Playing the visibility game: how digital influencers and algorithms negotiate influence on instagram. New Media & Society 21 (4), pp. 895–913. Cited by: §2.
  • S. Cowley (2019) Equifax to Pay at Least $650 Million in Largest-Ever Data Breach Settlement. The New York Times. Cited by: §1.
  • J. Dastin (2018) Amazon scraps secret AI recruiting tool that showed bias against women. Reuters. Cited by: §1.
  • [15] David Anderson, Joy Bonaguro, Miriam McKinney, and Andrew Nicklin Ethics & Algorithms Toolkit (beta). Note: https://ethicstoolkit.ai/ Cited by: §4.
  • S. L. Desmarais and E. M. Lowder (2019) Pretrial Risk Assessment Tools. pp. 12. Cited by: §1.
  • M. A. DeVito, J. Birnholtz, J. T. Hancock, M. French, and S. Liu (2018a) How people form folk theories of social media feeds and what it means for how we study self-presentation. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 120. Cited by: §2.
  • M. A. DeVito, J. T. Hancock, M. French, J. Birnholtz, J. Antin, K. Karahalios, S. Tong, and I. Shklovski (2018b) The algorithm and the user: how can hci use lay understandings of algorithmic systems?. In Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems, pp. panel04. Cited by: §2.
  • T. R. Dillahunt, S. Erete, R. Galusca, A. Israni, D. Nacu, and P. Sengers (2017) Reflections on Design Methods for Underserved Communities. In Companion of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing - CSCW ’17 Companion, Portland, Oregon, USA, pp. 409–413 (en). External Links: ISBN 978-1-4503-4688-7, Link, Document Cited by: §4.
  • J. Donovan and B. Friedberg (2019) Source hacking: media manipulation in practice. Technical report Data & Society Research Institute. Cited by: §1.
  • J. Dressel and H. Farid (2018) The accuracy, fairness, and limits of predicting recidivism. Science Advances 4 (1). Cited by: §1.
  • L. Edwards and M. Veale (2017) Slave to the Algorithm? Why a ’right to an explanation’ is probably not the remedy you are looking for. Duke Law & Technology Review 16, pp. 18–84. Cited by: §6.2.2.
  • S. Erete, A. Israni, and T. Dillahunt (2018) An intersectional approach to designing in the margins. Interactions 25 (3), pp. 66–69 (en). External Links: ISSN 10725520, Link, Document Cited by: §4.
  • M. Eslami, S. R. Krishna Kumaran, C. Sandvig, and K. Karahalios (2018) Communicating algorithmic process in online behavioral advertising. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 432. Cited by: §2.
  • M. Eslami, A. Rickman, K. Vaccaro, A. Aleyasen, A. Vuong, K. Karahalios, K. Hamilton, and C. Sandvig (2015) I always assumed that i wasn’t really that close to [her]: reasoning about invisible algorithms in news feeds. In Proceedings of the 33rd annual ACM conference on human factors in computing systems, pp. 153–162. Cited by: §2.
  • M. Eslami, K. Vaccaro, M. K. Lee, A. Elazari Bar On, E. Gilbert, and K. Karahalios (2019) User attitudes towards algorithmic opacity and transparency in online reviewing platforms. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 494. Cited by: §2.
  • N. Gandal, J. Hamrick, T. Moore, and T. Oberman (2018) Price manipulation in the Bitcoin ecosystem. Journal of Monetary Economics 95, pp. 86–96. Cited by: §1.
  • W. Gaver (2012) What should we expect from research through design?. In Proceedings of the 2012 ACM Annual Conference on Human Factors in Computing Systems - CHI ’12, Austin, Texas, USA, pp. 937. Cited by: §7.
  • M. L. González (2017) Seattle: Surveillance Ordinance (Seattle). Note: Ordinance 125376 Cited by: §1.
  • B. Green (2018) Data Science as Political Action: Grounding Data Science in a Politics of Justice. (en). External Links: Link Cited by: §4.
  • B. Harrell (2018) Seattle: Surveillance Ordinance Amendment. Cited by: §1.
  • A. L. Hoffmann (2019) Where fairness fails: data, algorithms, and the limits of antidiscrimination discourse. Information, Communication & Society 22 (7), pp. 900–915. Cited by: §7.1.
  • K. Inkpen, S. Chancellor, M. De Choudhury, M. Veale, and E. P. Baumer (2019) Where is the Human?: Bridging the Gap Between AI and HCI. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, pp. W09. Cited by: §2.
  • B. Koops, B. C. Newell, T. Timan, I. Škorvánek, T. Chokrevski, and M. Galič (2016) A typology of privacy. Note: 00012 Cited by: §6.2.2.
  • P. M. Krafft, N. Della Penna, and A. S. Pentland (2018) An experimental study of cryptocurrency market dynamics. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 605. Cited by: §1.
  • C. Lecher (2019) Privacy advocate held at gunpoint after license plate reader database mistake, lawsuit alleges. The Verge. Cited by: §1.
  • T. Miller (2019) Explanation in artificial intelligence: insights from the social sciences. Artificial Intelligence 267, pp. 1–38. Cited by: §2.
  • P. Nagy and G. Neff (2015) Imagined affordance: reconstructing a keyword for communication theory. Social Media+ Society 1 (2), pp. 2056305115603385. Cited by: §2.
  • E. Rader, K. Cotter, and J. Cho (2018) Explanations as mechanisms for supporting algorithmic transparency. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 103. Cited by: §2.
  • E. Rader and R. Gray (2015) Understanding user beliefs about algorithmic curation in the facebook news feed. In Proceedings of the 33rd annual ACM conference on human factors in computing systems, pp. 173–182. Cited by: §2.
  • L. Rainie and J. Anderson (2017) The need grows for algorithmic literacy, transparency and oversight. Cited by: §2.
  • A. Rosenblat and L. Stark (2016) Algorithmic labor and information asymmetries: A case study of Uber’s drivers. International Journal of Communication 10, pp. 3758–3784. Cited by: §1.
  • A. D. Selbst, D. Boyd, S. A. Friedler, S. Venkatasubramanian, and J. Vertesi (2019) Fairness and abstraction in sociotechnical systems. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pp. 59–68. Cited by: §2.
  • A. D. Selbst and J. Powles (2017) Meaningful information and the right to explanation. International Data Privacy Law 7 (4), pp. 233–242. Cited by: §2.
  • S. L. Star (1999) The ethnography of infrastructure. American behavioral scientist 43 (3), pp. 377–391. Cited by: §2.
  • P. Sturgis and N. Allum (2004) Science in society: re-evaluating the deficit model of public attitudes. Public understanding of science 13 (1), pp. 55–74. Cited by: §7.
  • J. A. Tucker, A. Guess, P. Barberá, C. Vaccari, A. Siegel, S. Sanovich, D. Stukal, and B. Nyhan (2018) Social media, political polarization, and political disinformation: a review of the scientific literature. Cited by: §1.
  • S. Wachter, B. Mittelstadt, and C. Russell (2017) Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR. Harv. JL & Tech. 31, pp. 841. Cited by: §2.
  • A. Woodruff, S. E. Fox, S. Rousso-Schindler, and J. Warshaw (2018) A qualitative exploration of perceptions of algorithmic fairness. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 656. Cited by: §2.
  • S. C. Woolley and P. N. Howard (2018) Computational propaganda: political parties, politicians, and political manipulation on social media. Oxford University Press. Cited by: §1.
  • M. Young, M. Katell, and P. Krafft (2019a) Municipal surveillance regulation and algorithmic accountability. Big Data & Society, Forthcoming. Cited by: §1, §6.1.1.
  • M. Young, L. Magassa, and B. Friedman (2019b) Toward inclusive tech policy design: a method for underrepresented voices to strengthen tech policy documents. Ethics and Information Technology 21 (2), pp. 89–103. Cited by: §1.
  • J. Zimmerman, J. Forlizzi, and S. Evenson (2007) Research through design as a method for interaction design research in HCI. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems - CHI ’07, San Jose, California, USA, pp. 493. Cited by: §7.
  • [54] J. Zimmerman, E. Stolterman, and J. Forlizzi An Analysis and Critique of Research through Design: towards a formalization of a research approach. pp. 10. Cited by: §7.