Despite the exponential growth of fairness in Machine Learning (ML) research, it remains centred on Western concerns and histories: the structural injustices (e.g., race and gender), the data (e.g., ImageNet), the measurement scales (e.g., Fitzpatrick scale), the legal tenets (e.g., equal opportunity), and the enlightenment values. Conventional Western AI fairness is becoming a universal ethical framework for AI; consider the AI strategies from India (141), Tunisia (Tunisia, 2018), Mexico (Martinho-Truswell et al., 2018), and Uruguay (Agesic, 2019) that espouse fairness and transparency, but pay less attention to what is fair in local contexts.
Conventional measurements of algorithmic fairness make several assumptions based on Western institutions and infrastructures. To illustrate, consider Facial Recognition (FR), where demonstration of AI fairness failures and stakeholder coordination have resulted in bans and moratoria in the US. Several factors led to this outcome:
Decades of scientific empiricism on proxies and scales that correspond to subgroups in the West (Fitzpatrick, 1988).
The existence of government representatives glued into technology policy, shaping AI regulation and accountability (website, 2020).
An active media that systematically scrutinises and reports on downstream impacts of AI systems (Kofman, 2016).
We argue that the above assumptions may not hold in much of the rest of the world. While algorithmic fairness keeps AI within ethical and legal boundaries in the West, there is a real danger that naive generalisation of fairness will fail to keep AI deployments in check in the non-West. Scholars have pointed to how neoliberal AI follows the technical architecture of classic colonialism through data extraction, impairing indigenous innovation, and shipping manufactured services back to the data subjects, among communities already prone to exploitation, under-development, and inequality from centuries of imperialism (Wallerstein, 1991; Kwet, 2019; Mohamed et al., 2020; Birhane, 2020; Roy, 2014). Without engagement with the conditions, values, politics, and histories of the non-West, AI fairness can be tokenistic at best and pernicious at worst for communities. If algorithmic fairness is to serve as the ethical compass of AI, it is imperative that the field recognise its own defaults, biases, and blindspots to avoid exacerbating historical harms that it purports to mitigate. We must take pains not to develop a general theory of algorithmic fairness based on the study of Western populations. Could fairness, then, have structurally different meanings in the non-West? Could fairness frameworks that rely on Western infrastructures be counterproductive elsewhere? How do social, economic, and infrastructural factors influence Fair-ML?
In this paper, we study algorithmic power in contemporary India, and holistically re-imagine algorithmic fairness in India. Home to 1.38 billion people, India is a pluralistic nation of multiple languages, religions, cultural systems, and ethnicities. India is the site of a vibrant AI workforce. Hype and promise are palpable around AI, which is envisioned as a force-multiplier of socio-economic benefit for a large, under-privileged population (141). AI deployments are prolific, including in predictive policing (Baxi, 2018) and facial recognition (Dixit, 2019). Despite the momentum on high-stakes AI systems, there is currently a lack of substantial policy or research on advancing algorithmic fairness for such a large population interfacing with AI.
We report findings from 36 interviews with researchers and activists working in the grassroots with marginalised Indian communities, and from observations of current AI deployments in India. We use feminist, decolonial, and anti-caste lenses to analyze our data. We contend that India is on a unique path to AI, characterised by pluralism, socio-economic development, technocratic nation-building, and uneven AI capital—which requires us to confront many assumptions made in algorithmic fairness. Our findings point to three factors that need attention in Fair-ML in India:
Data and model distortions: Infrastructures and social contracts in India challenge the assumption that datasets are faithful representations of people and phenomena. Models are over-fitted for digitally-rich profiles (typically middle-class men), further excluding the 50% without Internet access. Sub-groups like caste (endogamous, ranked social identities, assigned at birth (Ambedkar, 1916; Shanmugavelan, 2018)) [footnote 1], gender, and religion require different fairness implementations; but AI systems in India are under-analyzed for biases, mirroring the limited mainstream public discourse on oppression. Indic social justice, like reservations, presents new fairness evaluations [footnote 2].
Footnote 1: According to Shanmugavelan, “caste is an inherited identity that can determine all aspects of one’s life opportunities, including personal rights, choices, freedom, dignity, access to capital and effective political participation in caste-affected societies” (Shanmugavelan, 2018). Dalits (’broken men’ in Marathi), the most inferiorised category of people, are within the social hierarchy but excluded from caste categories (Shanmugavelan, 2018; Beteille, 1990; Ambedkar, 1916; Saracini and Shanmugavelan, 2019).
Footnote 2: We use the term ’Indic’ to refer to native Indian concepts.
Double standards and distance by ML makers: Indian users are perceived as ‘bottom billion’ data subjects, petri dishes for intrusive models, charitable beneficiaries, or given poor recourse—thus, effectively limiting their agency. While Indians are part of the AI workforce, a majority work in services, and engineers do not entirely represent marginalities, limiting re-mediation of distances.
Unquestioning AI aspiration: The AI imaginary is aspirational in the Indian state, media, and legislation. AI is readily adopted in high-stakes domains, often too early. Lack of an ecosystem of tools, policies, and stakeholders like journalists, researchers, and activists to interrogate high-stakes AI inhibits meaningful fairness in India.
In summary, we find that conventional Fair-ML may be inappropriate, insufficient, or even inimical in India if it does not engage with the local structures. In a societal context where the distance between models and dis-empowered communities is large—via technical distance, social distance, ethical distance, temporal distance, and physical distance—a myopic focus on localising fair model outputs alone can backfire. We call upon fairness researchers working in India to engage with end-to-end factors impacting algorithms, like datasets and models, knowledge systems, the nation-state and justice systems, AI capital, and most importantly, the oppressed communities to whom we have ethical responsibilities. We present a holistic framework to operationalise algorithmic fairness in India, calling for: re-contextualising data and model fairness; empowering oppressed communities by participatory action; and enabling an ecosystem for meaningful fairness.
First, our paper contributes by bringing to light how algorithmic power works in India, through a bottom-up analysis. Second, we present a holistic research agenda as a starting point to operationalise Fair-ML in India. The concerns we raise may well apply to other countries. The broader goal should be to develop global approaches to Fair-ML that reflect the needs of various contexts, while acknowledging that some principles are specific to context.
Recent years have seen the emergence of a rich body of literature on fairness and accountability in machine learning (e.g., Barocas et al., 2017; Mehrabi et al., 2019). However, most of this research is framed in the Western context, by researchers situated in Western institutions, for mitigating social injustices prevalent in the West, using data and ontologies from the West, and implicitly imparting Western values. For example, in the premier FAccT conference, of the 138 papers published in 2019 and 2020, only a handful even mention non-Western countries, and only one of them, Marda’s paper on New Delhi’s predictive policing system (Marda and Narayan, 2020), substantially engages with a non-Western context.
2.1. Western Orientation in Fair-ML
2.1.1. Axes of discrimination
The majority of fairness research looks at racial (Lum and Isaac, 2016; Sap et al., 2019; Davidson et al., 2019; Manzini et al., 2019; Buolamwini and Gebru, 2018) and gender biases (Bolukbasi et al., 2016; Buolamwini and Gebru, 2018; Sun et al., 2019; Zhao et al., 2017) in models, two dimensions that dominate the American public discourse. However, these categories are culturally and historically situated (Hanna et al., 2020). Even the categorisation of proxies in fairness analyses has Western biases and origins; e.g., the Fitzpatrick skin type is often used by researchers as a phenotype (Buolamwini and Gebru, 2018), but was originally developed to categorise UV sensitivity (Fitzpatrick, 1988). While other axes of discrimination and injustice such as disability status (Hutchinson et al., 2020), age (Diaz et al., 2018), and sexual orientation (Garg et al., 2019) have received some attention, biases relevant to other geographies and cultures are not explored (e.g., Adivasis (indigenous tribes of South Asia) and Dalits). As (Mulligan et al., 2019) point out, tackling these issues requires a deep understanding of the social structures and power dynamics therein, which points to a wide gap in the literature.
2.1.2. Legal framing
Since early inquiries into algorithmic fairness largely dealt with US law enforcement (predictive policing and recidivism risk assessment) as well as state regulations (e.g., in housing, loans, and education), the research framings often rely implicitly on US laws such as the Civil Rights Acts and Fair Housing Act, as well as on US legal concepts of discrimination. Indeed, researchers since the late 1960s have tried to translate US anti-discrimination law into statistical metrics (Hutchinson and Mitchell, 2019). The community also often repurposes terminology from US legal domains, such as “disparate impact”, “disparate treatment”, and “equal opportunity”, or uses them as points of triangulation, in order to compare technical properties of fairness through analogy with legal concepts (Green and Viljoen, 2020; Green, 2020).
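As a rough illustration of how such legal terms are commonly turned into statistics, the sketch below computes a disparate impact ratio (the ratio of favourable-outcome rates between two groups, often compared against the "four-fifths" threshold used in US employment guidelines) and an equal opportunity difference (the gap in true positive rates). The function names, the two groups, and the toy labels and predictions are hypothetical and are not taken from the works cited above.

```python
from typing import Sequence


def selection_rate(preds: Sequence[int]) -> float:
    """Fraction of individuals who receive the favourable outcome (1)."""
    return sum(preds) / len(preds)


def disparate_impact_ratio(preds_a: Sequence[int], preds_b: Sequence[int]) -> float:
    """Selection rate of group A divided by the selection rate of group B."""
    return selection_rate(preds_a) / selection_rate(preds_b)


def true_positive_rate(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Among truly positive individuals, the fraction predicted positive."""
    preds_for_positives = [p for y, p in zip(labels, preds) if y == 1]
    return sum(preds_for_positives) / len(preds_for_positives)


def equal_opportunity_difference(labels_a, preds_a, labels_b, preds_b) -> float:
    """True positive rate of group A minus that of group B."""
    return true_positive_rate(labels_a, preds_a) - true_positive_rate(labels_b, preds_b)


# Hypothetical toy data: 1 = favourable label or prediction, 0 = unfavourable.
labels_a, preds_a = [1, 1, 0, 0, 1], [1, 0, 0, 0, 0]
labels_b, preds_b = [1, 1, 0, 0, 1], [1, 1, 0, 1, 0]

di_ratio = disparate_impact_ratio(preds_a, preds_b)
eo_diff = equal_opportunity_difference(labels_a, preds_a, labels_b, preds_b)
below_four_fifths = di_ratio < 0.8  # the conventional 80% threshold

print(f"disparate impact ratio: {di_ratio:.2f}")
print(f"equal opportunity difference: {eo_diff:.2f}")
```

Both quantities presuppose that the relevant groups are recorded in the data and that the legal concepts behind them apply, which is exactly the assumption this section questions for non-US contexts.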
2.1.3. Philosophical roots
Connections have been made between algorithmic fairness and Western concepts such as egalitarianism (Binns, 2018), consequentialism (Roff, 2020; Mulligan et al., 2019), deontic justice (Binns, 2018; Mulligan et al., 2019), and Rawls’ distributive justice (Mulligan et al., 2019; Joseph et al., 2016). Indeed, notions of algorithmic fairness seem to fit within a broad arc of enlightenment and post-enlightenment thinking, including in actuarial risk assessment (Ochigame, 2020). Dr. B. R. Ambedkar’s (1891–1956; fondly called Babasaheb, the leader and dominant ideological source of today’s Dalit politics) anti-caste movement was rooted in social justice, distinct from Rawls’ distributive justice (Rodrigues, 2011) (also see Sen’s critique of Rawls’ idea of original position and inadequacies of impartiality-driven justice and fairness (Sen, 2009)). Fairness’ status as the de facto moral standard of choice and signifier of justice is itself a sign of cultural situatedness. Other moral foundations (Graham et al., 2013) of cultural importance may often be overlooked by the West, including purity/sanctity. Traditional societies often value restorative justice, which emphasises repairing harms (Boyes-Watson, 2014), rather than fairness; e.g., contemporary Australian Aboriginal leaders emphasise reconciliation rather than fairness in their political goals (Inayatullah, 2006). Furthermore, cultural relationships such as power distance and temporal orientation are known to mediate the importance placed on fairness (Lund et al., 2013).
2.2. Fairness perceptions across cultures
Social psychologists have argued that justice and fairness require a lens that goes beyond the Euro-American cultural confines (Leung and Stephan, 2001). While the concern for justice has a long history in the West (e.g., Aristotle, Rawls) and the East (e.g., Confucius, Chanakya), they show that the majority of empirical work on social justice has been situated in the US and Western Europe, grounding the understanding of justice in the Western cultural context. Summarising decades’ worth of research, (Leung and Stephan, 2001) say that the more collectivist and hierarchical societies in the East differ from the more individualistic and egalitarian cultures of the West in how different forms of justice (distributive, procedural, and retributive) are conceptualised and achieved. For instance, (Blake et al., 2015) compared the acquisition of fairness behaviour in seven different societies: Canada, India, Mexico, Peru, Senegal, Uganda, and the USA, and found that while children from all cultures developed aversion towards disadvantageous inequity (avoid receiving less than a peer), advantageous inequity aversion (avoid receiving more than a peer) was more prevalent in the West. Similarly, a study of children in three different cultures found that notions of distributive justice are not universal: “children from a partially hunter-gatherer, egalitarian African culture distributed the spoils more equally than did the other two cultures, with merit playing only a limited role” (Schäfer et al., 2015). See also the point above on Dr. B. R. Ambedkar’s centring of priority-based social justice for caste inequalities. The above works point to the dangers in defining fairness of algorithmic systems based solely on a Western lens.
2.3. Algorithmic fairness in the non-West
The call for a global lens in AI accountability is not new (Paul et al., 2018; Hagerty and Rubinov, 2019), but the ethical principles in AI are often interpreted, prioritised, contextualised, and implemented differently across the globe (Jobin et al., 2019). Recently, the IEEE Standards Association highlighted the monopoly of Western ethical traditions in AI ethics, and inquired how incorporating Buddhist, Ubuntu, and Shinto-inspired ethical traditions might change the processes of responsible AI (IEEE, 2019). Researchers have also challenged the normalisation of Western implicit beliefs, biases, and issues in specific geographic contexts; e.g., India, Brazil and Nigeria (Sambasivan and Holbrook, 2018), and China and Korea (Shin, 2019). Representational gaps in data are documented as one of the major challenges in achieving responsible AI from a global perspective (Arora, 2016; Shankar et al., 2017). For instance, NLP models have been shown to disproportionately fail to even detect names of people from non-Western backgrounds (… et al., 2020).
2.4. Accountability for unfairness
Discussion of accountability is critical to any discussions of fairness, i.e., how do we hold deployers of systems accountable for unfair outcomes? Is it fair to deploy a system that lacks in accountability? Accountability is fundamentally about answerability for actions (Kohli et al., 2018), and central to these are three phases by which an actor is made answerable to a forum: information-sharing, deliberation and discussion, and the imposition of consequences (Wieringa, 2020). Since outcomes of ML deployments can be difficult to predict, proposals for accountability include participatory design (Katell et al., 2020) and participatory problem formulation (Martin Jr et al., 2020), sharing the responsibility for designing solutions with the community. Nissenbaum distinguishes four barriers to responsibility in computer systems: (1) the problem of many hands, (2) bugs, (3) blaming the computer, and (4) ownership without liability (Nissenbaum, 1996). These barriers become more complicated when technology spans cultures: more, and more remote, hands are involved; intended behaviours may not be defined; computer-blaming may meet computer-worship head on (see Section 4); and questions of ownership and liability become more complicated.
Specific to the Indian context, scholars and activists have outlined opportunities for AI in India (Kalyanakrishnan et al., 2018), proposed policy deliberation frameworks that take into account the unique policy landscape of India (Marda, 2018), and questioned the intrusive data collection practices through Aadhaar (biometric-based unique identity for Indians) in India (Ramanathan, 2014, 2015). Researchers have documented societal biases in predictive policing in New Delhi (Marda and Narayan, 2020), caste (Thorat and Attewell, 2007) and ethnicity (Agrawal, 2020) biases in job applications, call-center job callbacks (Banerjee et al., 2009), caste-based wage-gaps (Madheswaran and Attewell, 2007), caste discrimination in agricultural loans decisions (Kumar, 2013), and even in online matrimonial ads (Rajadesingan et al., 2019).
Our research results come from a critical synthesis of expert interviews and discourse analysis. Our methods were chosen in order to provide an expansive account of who is building ML for whom, what the on-the-ground experiences are, what the various processes of unfairness and exclusions are, and how they relate to social justice.
We conducted qualitative interviews with 36 expert researchers, activists, and lawyers working closely with marginalised Indian communities at the grassroots. Expert interviews are a qualitative research technique used in exploratory phases, providing practical insider knowledge and surrogacy for a broader community (Bogner et al., 2009). Importantly, experts helped us gain access to a nascent and difficult topic, considering the early algorithmic deployments in the public sector in India. Our respondents were chosen from a wide range of areas to create a holistic analysis of algorithmic power in India. Respondents came from Computer Science (11), Activism (9), Law and Public Policy (6), Science and Technology Studies (5), Development Economics (2), Sociology (2), and Journalism (1). All respondents had 10-30 years of experience working with marginalised communities or on social justice. Specific expertise areas included caste, gender, labour, disability, surveillance, privacy, health, constitutional rights, and financial inclusion. 24 respondents were based in India, 2 in Europe, 1 in Southeast Asia, the rest in the USA; 25 of them self-identified as male, 10 as female, and 1 as non-binary.
In conjunction with qualitative interviews, we conducted an analysis of various algorithmic deployments and emerging policies in India, starting from Aadhaar (2009). We identified and analysed various Indian news publications (e.g., TheWire.in, Times of India), policy documents (e.g., NITI Aayog, Srikrishna Bill), community media (e.g., Roundtable India, Feminism in India), and prior research. Because we draw on many secondary sources, our citation count is on the higher side.
Recruitment and moderation. We recruited respondents via a combination of direct outreach and personal contacts, using purposeful sampling (Palinkas et al., 2015), i.e., identifying and selecting experts with relevant experience, iterating until saturation. We conducted all interviews in English (the preferred language of participants). The semi-structured interviews focused on 1) unfairness through discrimination in India; 2) technology production and consumption; 3) the historical and present role of fairness and ethics in India; 4) biases, stereotypes and proxies; 5) data; 6) laws and policy relating to fairness; and 7) canonical applications of fairness, evaluated in the Indian context. Respondents were compensated for the study (gift cards of 100 USD, 85 EUR, and 2000 INR), based on purchasing power parity and non-coercion. Employer restrictions prevented us from compensating government employees. Interviews lasted an hour each, and were conducted using video conferencing and captured via field notes and video recordings.
Analysis and coding. Transcripts were coded and analyzed for patterns using an inductive approach (Thomas, 2006). From a careful reading of the transcripts, we developed categories and clustered excerpts, conveying key themes from the data. Two team members created a code book based on the themes, with seven top-level categories (sub-group discrimination, data and models, law and policy, ML biases and harms, AI applications, ML makers, and solutions) and several sub-categories (e.g., caste, missing data, proxies, consent, algorithmic literacy, and so on). The themes that we describe in Section 4 were then developed and applied iteratively to the codes.
Our data is analysed using feminist, decolonial, and anti-caste lenses. A South Asian feminist stance allows us to examine oppressed communities as encountering and subverting forces of power, while locating them in contextual specifics of family, caste, class, and religion [footnote 3]. South Asian feminism is a critique of the white, Western feminism that saw non-Western women as powerless victims that needed rescuing (Mohanty, 2005; Chaudhuri, 2004). Following Dr. B. R. Ambedkar’s insight on how caste hierarchies and patriarchies are linked in India (Chakravarti, 1993), we echo that no social justice commitment in India can take place without examining caste, gender, and religion. A decolonial perspective (borrowed from Latin American and African scholars like (Escobar, 2011; Mignolo, 2011; Wa Thiong’o, 1992; Fanon, 2007; Dorfman and Mattelart, 1975)) helps us confront inequalities from colonisation in India, providing new openings for knowledge and justice in AI fairness research. To Dr. B. R. Ambedkar and Periyar E. V. Ramasamy, colonialism predates the British era, and decolonisation is a continuum. For Dalit emancipatory politics, deconstructing colonial ideologies of the powerful, superior, and privileged begins by removing influences and privileges of dominant-caste members [footnote 4].
Footnote 3: We are sympathetic to Dalit feminist scholars, like Rege, who have critiqued postcolonial or subaltern feminist thoughts for the lack of anti-caste scrutiny (Rege, 1998).
Footnote 4: Thanks to Murali Shanmugavelan for contributing these points.
Research ethics. We took great care to create a research ethics protocol to protect respondent privacy and safety, especially due to the sensitive nature of our inquiry. During recruitment, participants were informed of the purpose of the study, the question categories, and researcher affiliations. Participants signed informed consent acknowledging their awareness of the study purpose and researcher affiliation prior to the interview. At the beginning of each interview, the moderator additionally obtained verbal consent. We stored all data in a private Google Drive folder, with access limited to our team. To protect participant identity, we deleted all personally identifiable information in research files. We redact identifiable details when quoting participants. Every respondent was given the choice of default anonymity or being included in Acknowledgements.
All co-authors of this paper work at the intersection of under-served communities and technology, with backgrounds in HCI, critical algorithmic studies, and ML fairness. The first author constructed the research approach and has had grassroots commitments with marginalised Indian communities for nearly 15 years. The first and second author moderated interviews. All authors were involved in the framing, analysis, and synthesis. Three of us are Indian and two of us are White. All of us come from privileged positions of class and/or caste. We acknowledge the above are our interpretations of research ethics, which may not be universal.
We now present three themes (see Figure 1) that we found to contrast with views in conventional algorithmic fairness.
4.1. Data and model distortions
Datasets are often seen as reliable representations of populations. Biases in models are frequently attributed to biased datasets, presupposing the possibility of achieving fairness by “fixing” the data (Holstein et al., 2019). However, social contracts, informal infrastructures, and population scale in India lead us to question the reliability of datasets.
4.1.1. Data considerations
Missing data and humans. Our respondents discussed how data points were often missing because of social infrastructures and systemic disparities in India. Entire communities may be missing or misrepresented in datasets, exacerbated by digital divides, leading to wrong conclusions (Barocas and Selbst, 2016; Lerman, 2013; Crawford, 2013a, b) and residual unfairness (Kallus and Zhou, 2018). Half the Indian population lacks access to the Internet; the excluded half is primarily women, rural communities, and Adivasis (Rowntree, 2020; Kamath and Kumar, 2017; Jain, 2016; Shah, ; Pandey, 2020). Datasets derived from Internet connectivity will exclude a significant population; e.g., many argued that India’s mandatory COVID-19 contact tracing app excluded hundreds of millions due to access constraints, pointing to the futility of digital nation-wide tracing (also see (Kodali, 2020)). Moreover, India’s data footprint is relatively new, as the country is a recent entrant to 4G mobile data. Many respondents observed a bias towards upper-middle class problems, data, and deployments due to easier data access and revenue. As P8, CS/IS (computer science/information sciences) researcher, put it, “rich people problems like cardiac disease and cancer, not poor people’s Tuberculosis, prioritised in AI [in India]”, exacerbating inequities between those who benefit from AI and those who do not.
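The exclusion argument can be made concrete with a small arithmetic sketch. It assumes, as stated above, that roughly half the population is online, and it uses purely hypothetical prevalence figures for some attribute of interest in the online and offline halves. Under those assumptions, an estimate computed only from Internet-derived data misses the population-wide value by a predictable margin.

```python
def population_rate(share_online: float, rate_online: float, rate_offline: float) -> float:
    """Population-wide rate as a weighted average of the two halves."""
    return share_online * rate_online + (1.0 - share_online) * rate_offline


def coverage_bias(share_online: float, rate_online: float, rate_offline: float) -> float:
    """Error made by estimating the population rate from online users only."""
    return rate_online - population_rate(share_online, rate_online, rate_offline)


share_online = 0.5   # from the text: about half the population lacks Internet access
rate_online = 0.10   # hypothetical prevalence among connected users
rate_offline = 0.30  # hypothetical prevalence among excluded users

true_rate = population_rate(share_online, rate_online, rate_offline)
bias = coverage_bias(share_online, rate_online, rate_offline)

print(f"population rate: {true_rate:.2f}")
print(f"online-only estimate: {rate_online:.2f} (bias {bias:+.2f})")
```

The bias vanishes only when the two halves have the same rate, which is unlikely when exclusion follows gender, rurality, and indigeneity.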
Several respondents pointed to missing data due to class, gender, and caste inequities in accessing and creating online content; e.g., many safety apps use data mapping to mark areas as unsafe, in order to calculate an area-wide safety score for use by law enforcement (172; 40). (Women’s safety is a pressing issue in India, in public consciousness since the 2012 Nirbhaya gang rape (Gardiner, 2013).) P4 (anti-caste, communications researcher) described how rape is socially (caste, culture, and religion) contextualised and some incidents get more visibility than others, in turn becoming data, in turn getting fed into safety apps: a perpetual source of unfairness. Many respondents were concerned that the safety apps were populated by middle-class users and tended to mark Dalit, Muslim, and slum areas as unsafe, potentially leading to hyper-patrolling in these areas.
Data was reported to be ‘missing’ due to artful user practices to manipulate algorithms, motivated by privacy, abuse, and reputation concerns; e.g., studies have shown how women users have ‘confused’ algorithms, motivated by privacy needs (Sambasivan et al., 2018; Masika and Bailur, 2015). Another class of user practices that happened outside of applications led to ‘off data’ traces. As an example, P17, CS/IS researcher, pointed to how auto rickshaw drivers created practices outside of ride-sharing apps, like calling passengers to verify landmarks (as Indian addresses are harder to specify (Culture, 2018)) or cancelling rides in-app (which used mobile payments) to carry out rides for a cash payment. Respondents described how data like inferred gender, when lacking an understanding of context, was prone to inaccurate inferences.
Many respondents pointed to the frequent unavailability of socio-economic and demographic datasets at national, state, and municipal levels for public fairness research. Some respondents reported on how the state and industry apparati collected and retained valuable, large-scale data, but the datasets were not always made publicly available due to infrastructure and non-transparency issues. As P5, public policy researcher, described, “The country has the ability to collect large amounts of data, but there is no access to it, and not in a machine-readable format.” In particular, respondents shared how datasets featuring migration, incarceration, employment, or education, by sub-groups, were unavailable to the public. Scholarship like Bond’s caste report (Saracini and Shanmugavelan, 2019) argues that there is limited political will to collect and share socio-economic indicators by caste or religion.
A rich human infrastructure (Sambasivan and Smyth, 2010) from India’s public service delivery, e.g., frontline data workers, call-center operators, and administrative staff, extends into AI data collection. However, these workers face a disproportionate work burden, sometimes leading to data collection errors (Murali, 2019; Bhonsle and Prasad, 2020; Ismail and Kumar, 2018). Many respondents discussed how consent given to a data worker stemmed from high interpersonal trust and respect for the worker, relationships which may not transfer to the downstream AI applications. In some cases, though, data workers have been shown to fudge data without actual conversations with affected people; efforts like jan sunwais (public hearings) and Janata Information Systems organized by the Mazdoor Kisan Shakti Sangathan are examples of attempts to secure representation through data (Singh, 2018; Jenkins and Goetz, 1999).
Mis-recorded identities. Statistical fairness makes a critical assumption so pervasively that it is rarely even stated: that user data corresponds one-to-one to people [footnote 5]. However, the one-to-one correspondence in datasets often fails in India. Ground truth on full names, location, contact details, biometrics, and their usage patterns can be unstable, especially for marginalised groups. User identity can be mis-recorded by the data collection instrument, assuming a static individual correspondence or expected behaviour. Since conventional gender roles in Indian society lead to men having better access to devices, documentation, and mobility (see (Sambasivan et al., 2018; Donner et al., 2008)), women often borrowed phones. A few respondents pointed to how household dynamics impacted data collection, especially when using the door-to-door data collection method, e.g., how heads of households, typically men, often answered data-gathering surveys on behalf of women, but responses were recorded as women’s.
Footnote 5: This issue is somewhat related to what (Olteanu et al., 2019) call “Non-individual agents”.
Several AI-based applications use phone numbers as a proxy for identity in account creation (and content sharing) in the non-West, where mobile-first usage is common and e-mail is not (Donner, 2015). However, this approach fails due to device sharing patterns (Sambasivan et al., 2018), increased use of multiple SIM cards, and the frequency with which people change their numbers. Several respondents mentioned how location may not be permanent or tethered to a home, e.g., migrant workers regularly travel across porous nation-state boundaries.
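To illustrate why the one-to-one assumption matters, the sketch below checks a handful of invented usage records for the two failure modes described above: one phone number shared by several people, and one person using several numbers. The SIM labels and person labels are hypothetical; in a real dataset the person behind a record is precisely what is not observed, so this check is illustrative rather than something a deployed system could run.

```python
from collections import defaultdict

# Hypothetical records: (phone_number, person_actually_using_it).
# The second field is normally unobserved and is shown only to expose the mismatch.
records = [
    ("SIM-A", "husband"),
    ("SIM-A", "wife"),    # phone borrowed within the household
    ("SIM-B", "worker"),
    ("SIM-C", "worker"),  # same person using a second SIM card
]


def group_by(pairs):
    """Map each key to the set of values seen with it."""
    grouped = defaultdict(set)
    for key, value in pairs:
        grouped[key].add(value)
    return dict(grouped)


people_by_number = group_by(records)
numbers_by_person = group_by((person, number) for number, person in records)

# Numbers standing in for more than one person, and people behind more than one number.
shared_numbers = sorted(n for n, people in people_by_number.items() if len(people) > 1)
multi_sim_people = sorted(p for p, nums in numbers_by_person.items() if len(nums) > 1)
is_one_to_one = not shared_numbers and not multi_sim_people

print("shared numbers:", shared_numbers)
print("people with several numbers:", multi_sim_people)
print("one-to-one:", is_one_to_one)
```

Any per-user statistic keyed on the phone number would merge the two household members into one profile and split the worker into two, so group-level fairness metrics built on such profiles inherit both errors.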
Table 1: Sub-groups, Proxies and Harms
|Sub-group|Societal harms|Proxies|Tech harms|
|---|---|---|---|
|Caste (17% Dalits; 8% Adivasi; 40% Other Backward Class (OBC)) (Ministry of Home Affairs, )|Human rights atrocities. Poverty. Land, knowledge and language battles (Xaxa, 2011; Ambedkar, 2014; IDSN, 2010).|Surname. Skin tone. Occupation. Neighborhood. Language.|Low literacy and phone ownership. Online misrepresentation & exclusion. Accuracy gap of Facial Recognition (FR). Limits of Fitzpatrick scale. Caste-based discrimination in tech (Mukherji, ).|
|Gender (48.5% female) (Central Statistics Office and Implementation, 2018)|Sexual crimes. Dowry. Violence. Female infanticide.|Name. Type of labor. Mobility from home.|Accuracy gap in FR. Lower creditworthiness score. Recommendation algorithms favoring majority male users. Online abuse and ’racy’ content issues. Low Internet access.|
|Religion (80% Hindu, 14% Muslim, 6% Christians, Sikhs, Buddhists, Jains and indigenous) (Ministry of Home Affairs, )|Discrimination, lynching, vigilantism, and gang-rape against Muslims and others (Abraham and Rao, ).|Name. Neighborhood. Expenses. Work. Language. Clothing.|Online stereotypes and hate speech, e.g., Islamophobia. Discriminatory inferences due to lifestyle, location, appearance. Targeted Internet disruptions.|
|Ability (5%–8%+ persons with disabilities) (Region, 2009)|Stigma. Inaccessible education, transport & work.|Non-normative facial features, speech patterns, body shape & movements. Use of assistive devices.|Assumed homogeneity in physical, mental presentation. Paternalistic words and images. No accessibility mandate.|
|Class (30% live below poverty line; 48% on $2–$10/day) (Rangarajan et al., 2014)|Poverty. Inadequate food, shelter, health, & housing.|Spoken & written language(s). Mother tongue. Literacy. Feature / Smart Phone Ownership. Rural vs. urban.|Linguistic bias towards mainstream languages. Model bias towards middle class users. Limited or lack of internet access.|
|Gender Identity & Sexual Orientation (No official LGBTQ+ data)|Discrimination and abuse. Lack of acceptance and visibility, despite the recent decriminalisation (Tamang, 2020).|Gender declaration. Name.|FR “outing” and accuracy. Gender binary surveillance systems (e.g., in dormitories). M/F ads targeting. Catfishing and extortion abuse attacks.|
|Ethnicity (4% NorthEast) (Ministry of Home Affairs, )|Racist slurs, discrimination, and physical attacks.|Skin tone. Facial features. Mother tongue. State. Name.|Accuracy gap in FR. Online misrepresentation & exclusion. Inaccurate inferences due to lifestyle, e.g., migrant labor.|
Discriminated sub-groups and proxies
AI systems in India remain under-analysed for biases, mirroring the limited public discourse on oppression. In Table 1, we present a summary of discriminated sub-groups in India, derived from our interviews and enriched through secondary research and statistics from authoritative sources, to substantiate attributes and proxies. Furthermore, we describe below some common discriminatory proxies and attributes that came up during our interviews. While the proxies may be similar to those in the West, the implementation and cultural logics may vary in India, e.g., P19, STS researcher, pointed to how Hijra community members (a marginalised intersex or transgender community) may live together in one housing unit and be seen as fraudulent or invisible to models using family units. Proxies may not generalise even within the country, e.g., asset ownership: “If you live in Mumbai, having a motorbike is a nuisance. If rural, you’re the richest in town.” (P9, CS/IS researcher).
Occupation: traditional occupations may correspond to caste, gender, or religion; e.g., manual scavenging or butchery.
Expenditure: on dietary and lifestyle items may be proxies for religion, caste, or ethnicity; e.g., expenses on pork or beef.
Skin tone: may indicate caste, ethnicity, and class; however, unlike correlations between race and skin tone, correspondences to Indian sub-groups are weaker. Dark skin tones can be discriminated against in India (Dixit, July-2019). Many respondents described how datasets under-collected darker skin tones and how measurement scales like the Fitzpatrick scale are not calibrated on diverse Indian skin tones.
Language: Language can correspond to religion, class, ethnicity, and caste. Many AI systems serve in English, which only 10% of Indians understand (S, May-07). India has 30 languages with over a million speakers. Everyday slurs such as ‘dhobi’, ‘kameena’, ‘pariah’, or ‘junglee’ are reported to be rampant online (Kandukuri, 2018; Agrawal, 2020).
Devices and infrastructure: Internet access corresponds to gender, caste, and class, with 67% Internet users being males (Kala, 2019).
Documentation: Several AI applications require state-issued documentation like Aadhaar or birth certificates, e.g., in finance. The economically poor are also reported to be document-poor (S, 2020).[6]
[6] National IDs are contested in the non-West, where they are used to deliver welfare in an ‘objective’ manner, but lead to populations falling through the cracks (see research on IDs in Aadhaar (Ratclifee, 2019; Singh and Jackson, 2017), post-apartheid South Africa (Donovan, 2015) and Sri Lankan Tamils and Côte d’Ivoirians (Bailur et al., 2019)). Documentation has been known to be a weapon to dominate the non-literate in postcolonial societies (Gupta, 2012). Also see administrative violence (Spade, 2015).
4.1.2. Model considerations
Over-fitting models to the privileged
Respondents described how AI models in India overfitted to ‘good’ data profiles of the digitally-rich, privileged communities, as a result of poor cultural understanding and exclusion on the part of AI makers. Respondents noted that the sub-groups that had access to underlying variables for data-rich profiles, like money, mobility, and literacy, were typically middle-class men. Model inputs in India appear to be disproportionately skewed due to large disparities in digital access. For instance, P11, tech policy researcher, illustrated how lending apps instantly determined creditworthiness through alternative credit histories built from the user’s SMS messages, calls, and social networks (due to limited credit or banking history). Popular lending apps equate ‘good credit’ with whether the user called their parents daily, had stored over 58 contacts, played car-racing games, and could repay in 30 days (Dahir, 2019). Many respondents described how lending models imagined middle-class men as end-users, even with many microfinance studies showing that women have high loan repayment rates (D’espallier et al., 2011; Swain and Wallentin, 2009). In some cases, those with ‘poor’ data profiles subverted model predictions, as in P23’s (STS researcher) research on financial lending, where women overwhelmingly availed and paid back loans in the names of male relatives to avoid perceived gender bias in credit scoring. Model re-training left new room for bias, though, due to a lack of Fair-ML standards for India, e.g., an FR service used by police stations in eight Indian states retrained a western FR model on photos of Bollywood and regional film stars to mitigate the bias (Dixit, 2019); but Indian film stars are overwhelmingly fair-skinned, conventionally attractive, and able-bodied (Karan, 2008), and so not fully representative of the larger society.
Indic justice in models
Popular fairness techniques, such as equal opportunity and equalised odds, stem from epistemological and legal systems of the US (e.g., (Dobbe et al., 2018; Xiang and Raji, 2019)). India’s own justice approaches present new and relevant ways to operationalise algorithmic fairness locally. Nearly all respondents discussed reservations as a technique for algorithmic fairness in India. Reservations, one of the restorative justice measures to repair centuries of subordination, are a type of affirmative action enshrined in the Indian constitution (Richardson, 2012). Reservations assign quotas for marginalised communities at the national and state levels.[7] Originally designed for Dalits and Adivasis, reservations have expanded to include other backward castes, women, persons with disabilities, and religious minorities. Depending on the policy, reservations can allocate quotas from 30% to 80%. Reservations in India have been described as one of the world’s most radical policies (Baker, 2001) (see (Richardson, 2012) for more). Several studies have pointed to the successful redistribution of resources towards oppressed sub-groups through reservations (Pande, 2003; Duflo, 2005; Borooah et al., 2007).
[7] While the US Supreme Court has banned various quotas (Joshi, 2018), there is a history of quotas in the US, sometimes discriminatory, e.g., New Deal black worker quotas (Deal, ) and anti-semitic quotas in universities (Hollinger, 1998). Quotas in India are legal and common. Thanks to Martin Wattenberg for this point.
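For readers less familiar with the two US-derived criteria named at the start of this subsection, the following is a minimal sketch of how they are commonly measured. Equal opportunity asks that true-positive rates match across groups; equalised odds asks that both true-positive and false-positive rates match. The function names, group labels, and toy labels below are our own illustrative assumptions, not data from the study.

```python
from typing import Dict, List, Tuple


def tpr_fpr(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """True-positive rate and false-positive rate for binary (0/1) labels.

    Assumes the group contains at least one positive and one negative example.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


def fairness_gaps(groups: Dict[str, Tuple[List[int], List[int]]]) -> Dict[str, float]:
    """Largest between-group gap in TPR (equal opportunity) and in TPR or FPR (equalised odds)."""
    rates = [tpr_fpr(y_true, y_pred) for y_true, y_pred in groups.values()]
    tprs = [r[0] for r in rates]
    fprs = [r[1] for r in rates]
    tpr_gap = max(tprs) - min(tprs)
    fpr_gap = max(fprs) - min(fprs)
    return {
        "equal_opportunity_gap": tpr_gap,
        "equalised_odds_gap": max(tpr_gap, fpr_gap),
    }


# Hypothetical toy data: (true labels, model predictions) per group.
toy = {
    "group_a": ([1, 1, 0, 0], [1, 1, 0, 0]),
    "group_b": ([1, 1, 0, 0], [1, 0, 1, 0]),
}
gaps = fairness_gaps(toy)
print(gaps)
```

A gap of zero would satisfy the corresponding criterion exactly. Note that both measures presuppose reliable group labels and ground-truth outcomes, which is the assumption our respondents questioned for India.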
4.2. Double standards by ML makers
‘Bottom billion’ petri dishes
Several respondents discussed how AI developers, both regional and international, treated Indian user communities as ‘petri dishes’ or a ‘wild west’ for models, in contrast to how AI makers were moderately responsive to Western communities. Many criticised how neo-liberal AI tended to treat Indians as ‘bottom billion data subjects’ in ‘the periphery’ (Wallerstein, 1991): subject to intrusive models, poor tech policies, inadequate user research, and low-cost or free products of low standard; targeted for growth as ‘unsaturated markets’; and cast as the poster children of AI for social good projects. India’s diversity of languages, scripts, and populace has been reported to be attractive for improving model robustness and training data (Aggarwal, 2018). Many discussed how low quality designs, algorithms, and support were deployed for Indians, attributing this to weak tech policy and weak enforcement of accountability in India. Several respondents described how AI makers had a transactional mindset towards Indians, seeing them as agency-less data subjects that generated large-scale behavioural traces to improve AI models.
In contrast to how AI industry and research were moderately responsive to user bias reports in the West, recourse and redress for Indians were perceived to be non-existent. Respondents described that when recourse existed, it was often culturally-insensitive or dehumanising, e.g., a respondent was violently questioned about their clothing by staff of a ride-sharing application, during redressal for an assault faced in a taxi (also see (Sambasivan et al., 2019) for poor recourse). Several respondents described how lack of recourse was even more dire for marginalised users. e.g., P14 (CS/IS researcher) described, “[Ordering a ride-share] a person with a disability would choose electronic payment, but the driver insisted on cash. They said they are blind and wanted to pay electronically, but the driver declined and just moved on. No way to report it.” Even when feedback mechanisms were included, respondents shared that they were not always localised for India, and incidents were not always recognised unless an activist contacted the company staff directly. Many respondents shared how street-level bureaucrats, administrative offices, and front line workers—the human infrastructures (Sambasivan and Smyth, 2010) who played a crucial role in providing recourse to marginalised Indian communities—were removed in AI systems. Further, the high-tech illegibility of AI was noted to render recourse out of reach for groups marginalised by literacy, legal, and educational capital (see (Veeraraghavan, 2013) for ’hiding behind a computer’). 
As P12 (STS researcher) explained, “If decisions are made by a centralised server, communities don’t even know what has gone wrong, why [welfare] has stopped, they don’t know who to go to to fix the problem.” Many described how social audits and working with civil society created a better understanding and accountability.[8]
[8] Social audits like jan sanwais have long gauged effectiveness of civil programmes through village-level audits of documents, e.g., to curb corrupt funds siphoning (Patnaik, 2012).
Some respondents pointed to how Dalit and Muslim bodies were used as test subjects for AI surveillance, e.g., pointing to how human efficiency trackers were increasingly deployed among Dalit sanitation workers in cities like Panchkula and Nagpur. Equipped with microphones, GPS, cameras, and a SIM, the trackers allowed detailed surveillance of movement and work, leading to some women workers avoiding restrooms for fear of camera capture, avoiding sensitive conversations for fear of snooping, and waiting for the tracker to die before going home (Khaira, 2020). Such interventions were criticised for placing power in the hands of dominant-caste supervisors. P21 (legal researcher) pointed out that surveillance has historically been targeted at the working-poor, “the house cleaner who is constantly suspected of stealing dried fruits or jewellery. Stepping out of their house means that their every move is tracked. Someone recording the face, the heartbeat..under the pretext of efficiency. Her job is to clean faeces in the morning and now she is a guinea pig for new AI.”
Entrenched privilege and distance
Nearly all respondents described how AI makers and researchers, including regional makers, were heavily distant from the Indian communities they served. Several respondents discussed how Indian AI engineers were largely privileged class and caste males.[9] For example, P17 (CS/IS researcher) described, “Who is designing AI? Incredibly entitled, Brahmin, certainly male. They’ve never encountered discrimination in their life. These guys are talking about primitive women. If they’re designing AI, they haven’t got a clue about the rest of the people. Then it becomes fairness for who?” Many respondents described how the Indian technology sector claimed to be ‘merit-based’, open to anyone highly gifted in the technical sciences; but many have pointed to how merit is a function of caste privilege (Subramanian, 2015; Upadhya, 2007). Many, though not all, graduates of Indian Institutes of Technology, founders of pioneering Indian software companies, and nearly all Nobel prize winners of Indian origin have come from privileged castes and class backgrounds (Express, 2020; Subramanian, 2015). As P21 (legal researcher) explained the pervasive privilege in AI, “Silicon Valley Brahmins [Indians] are not questioning the social structure they grew up in, and white tech workers do not understand caste to spot and mitigate obvious harms.” While engineers and researchers are mostly privileged everywhere, the stark socio-economic disparities between Indian engineers and the marginalised communities may further amplify the distances.
[9] India fares slightly better than the US in gender representation in the tech workforce; however, gender roles and safety concerns lead to nearly 80% of women leaving computing by their thirties (coinciding with family/parenting responsibilities) (Thakkar et al., 2018).
4.3. Unquestioning AI aspiration
AI euphoria
Several respondents described how strong aspiration for AI for socio-economic upliftment was accompanied by high trust in automation, limited transparency, and the lack of an empowered Fair-ML ecosystem in India. Contrast this with the West, where a large, active stakeholder ecosystem (of civil society, journalists, and law makers) is AI-literate and has access to open APIs and data. Many respondents described how public sector AI projects in India were viewed as modernising efforts to overcome age-old inefficiencies in resource delivery (also in (Sambasivan, 2019)). The AI embrace was seen as following the trajectory of recent high-tech interventions (such as Aadhaar, MGNREGA payments, and the National Register of Citizens (NRC)). Researchers have pointed to the aspirational role played by technology in India, signifying symbolic meanings of modernity and progress via technocracy (Pal, 2015, 2008; Sambasivan and Aoki, 2017). AI for societal benefit is a pivotal development thrust in India, with a focus on healthcare, agriculture, education, smart cities, and mobility (141), influencing citizen imaginaries of AI. In an international AI perceptions survey (2019), Indians ranked first in rating AI as ‘exciting’, ‘futuristic’ and ‘mostly good for society’ (Kelley et al., 2019).
Several respondents pointed to how automation solutions had fervent rhetoric; whereas in practice, accuracy and performance of systems were low. Many described how disproportionate confidence in high-tech solutions, combined with limited technology policy engagement among decision-makers, appeared to lead to sub-par high-stakes solutions, e.g., the FR service used by Delhi Police was reported to have very low accuracy and failed to distinguish between boy and girl children (88).[10] Some respondents mentioned how a few automation solutions were announced following public sentiment, but turned into surveillance, e.g., how predictive policing in Delhi and FR in train stations were announced after Nirbhaya’s gruesome gangrape in 2012 and women’s safety incidents (BBC, 2019; Barik, 2020).
[10] A confidence threshold of 80-95% is recommended for law enforcement AI (Crumpler, 2020).
Disputing AI4All
Many respondents pointed to how emerging ‘4good’ deployments tended to leave out minorities. For example, P29 (LGBTQ+ activist) discussed how AI was justified in the public domain (e.g., surveillance for smart cities[11]) as women’s safety measures, but tended to invisibilise transgender members or increase monitoring of Dalit and Muslim areas, e.g., an FR system was deployed outside women’s restrooms to detect intrusion by non-female entrants, potentially leading to violence against transgender members.
[11] 98 Indian cities are smart city sites, to be equipped with intelligent traffic, waste and energy management, and CCTV crime monitoring. http://smartcities.gov.in/content.
Many respondents expressed concern over AI advances in detecting sexuality, criminality, or terrorism (e.g., (Seo et al., 2018; Wang and Kosinski, 2018)) potentially being exported to India and harming minorities. P29 remarked on targeted attacks (Banaji and Bhat, 2019), “part of the smart cities project is a Facial Recognition database where anyone can upload images. Imagine the vigilantism against dalit, poor, Muslims, trans persons if someone uploads a photo of them and it was used for sex offenders [arrests].”
End-to-end algorithmic opacity
In contrast to the ‘black box AI problem’, i.e., even the humans who design models do not always understand how variables are being combined to make inferences (Rudin and Radin, 2019), many respondents discussed an end-to-end opacity of inputs, model behaviour, and outcomes in India. Fairness in India was reported to suffer from a lack of access to contributing datasets, APIs, and documentation, with several respondents describing how challenging it was for researchers and civil society to assess the high-stakes AI systems. As P11 described, “Opacity is quite acute [in India]. People talk about blackboxes, reverse engineering inputs from outputs. What happens when you don’t have the output? What happens when you can’t reverse engineer at all?”
AI’s ‘neutral’ and ‘human-free’ associations lent credence to its algorithmic authority. In January 2020, over a thousand protestors were arrested during protests in Delhi, aided by FR. The official statement was, “This is a software. It does not see faith. It does not see clothes. It only sees the face and through the face the person is caught.” (88). While algorithms may not be trained on sub-group identification, proxies may correspond to Dalits, Adivasis, and Muslims disproportionately. e.g., according to the National Crime Records Bureau (NCRB) in 2015, 34% of undertrials were Dalits and Adivasis (25% of the population); 20% were Muslims (14% of population); and 70% were non-literate (26% of the population) (Tiwary, 2015).
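The NCRB comparison above can be read as a representation ratio: a group's share among undertrials divided by its share of the population. The following minimal sketch uses only the percentages quoted in the text; the function name `representation_ratio` is our own, for illustration.

```python
# Shares quoted in the text (Tiwary, 2015), in percent:
# (share of undertrials, share of population)
NCRB_2015 = {
    "Dalits and Adivasis": (34.0, 25.0),
    "Muslims": (20.0, 14.0),
    "Non-literate": (70.0, 26.0),
}


def representation_ratio(share_in_outcome: float, share_in_population: float) -> float:
    """A ratio above 1 means the group is over-represented in the outcome."""
    if share_in_population <= 0:
        raise ValueError("population share must be positive")
    return share_in_outcome / share_in_population


ratios = {
    group: round(representation_ratio(outcome, population), 2)
    for group, (outcome, population) in NCRB_2015.items()
}
for group, ratio in ratios.items():
    print(f"{group}: {ratio}x their population share")
```

On these figures, all three groups are over-represented among undertrials, by factors of roughly 1.4, 1.4, and 2.7 respectively. A model trained on such records could learn these skews through proxies even if it never sees a sub-group label.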
Several respondents discussed a lack of inclusion of diverse stakeholders in decision-making processes, laws, and policies for public sector AI. Some talked about a colonial mindset of tight control in decision-making on high-stakes AI laws, leading to reticent and monoscopic views by the judiciary and state. P5 (public policy researcher) pointed to how mission and vision statements for public sector AI tended to portray AI like magic, rather than contending with the realities of “how things worked on-the-ground in a developing country”. Additionally, respondents pointed to significant scope creep in high-stakes AI, e.g., a few mentioned how the tender for an FR system was initially motivated by detecting criminals, later missing children, to then being used to arrest protestors (88).
Questioning AI power
Algorithmic fairness requires a buffer zone of journalists, activists, and researchers to keep AI system builders accountable. Many respondents described how limited debate and analysis of AI in India led to a weak implementation of Fair-ML in India. Issues of algorithmic bias and ethics were not raised in the public consciousness in India at the time of the study. Respondents described how technology journalists in India, a keystone species for public discourse, covered app launches and investments more than algorithms and their impacts. P15 (journalist) pointed out that journalists may face disapproval for questioning certain narratives. “The broader issue is that AI biases do not come up in Indian press. Our definition of tech journalism has been about covering the business of tech […] It is different from the US, where there is a more combative relationship. We don’t ask tough questions of the state or tech companies.” A seminal report by Newslaundry/Oxfam described how privileged castes comprised Indian news media, invisibilising the vast oppressed majority in public discourse (Newslaundry and India, 2019).
5. Towards AI Fairness in India
In Section 4, we demonstrated that several underlying assumptions about algorithmic fairness and its enabling infrastructures fail in the Indian context. To meaningfully operationalise algorithmic fairness in India, we extend the spirit of Dr. B.R. Ambedkar’s call to action, “there cannot be political reform without social reform” (Ambedkar, 2014). We need to substantively innovate on how to empower oppressed communities and enable justice within the surrounding socio-political infrastructures in India. Missing Indic factors and values in data and models, compounded by double standards and disjuncture in ML, deployed in an environment of unquestioning AI aspiration, face the risk of reinforcing injustices and harms. We need to understand and design for end-to-end chains of algorithmic power, including how AI systems are conceived, produced, utilised, appropriated, assessed, and contested in India. We humbly submit that these are large, open-ended challenges that have perhaps not received much focus or are considered large in scope. However, a Fair-ML strategy for India needs to reflect its deeply plural, complex, and contradictory nature, and needs to go beyond model fairness. We propose an AI fairness research agenda in India, where we call for action along three critical and contingent pathways: Recontextualising, Empowering, and Enabling. Figure 2 shows the core considerations of these pathways. The pathways present opportunities for cross-disciplinary and cross-institutional collaboration. While it is important to incorporate Indian concepts of fairness into AI that impacts Indian communities, the broader goal should be to develop general approaches to fairness that reflect the needs of local communities and are appropriate to the local contexts.
5.1. Recontextualising Data and Models
How might existing algorithmic fairness evaluation and mitigation approaches be recontextualised to the Indian context, and what novel challenges does it give rise to?
5.1.1. Data Considerations
Data plays a critical role in measurements and mitigations of algorithmic bias. However, as seen in Section 4, social, economic, and infrastructural factors challenge the reliance on datasets in India. Based on our findings, we outline some recommendations and put forth critical research questions regarding data and its uses in India.
Due to the challenges to completeness and representation discussed in Section 4, we must be (even more than usual) sceptical of Indian datasets until they are trust-worthy. For instance, how could fairness research handle the known data distortions guided by traditional gender roles? What are the dangers of identifying caste in models? Should instances where models are deliberately ‘confused’ (see Section 4) be identified, and if so what should we do with such data? We must also account for data voids (Golebiewski and Boyd, 2019) for which statistical extrapolations might be invalid.
The vibrant role played by human infrastructures in providing, negotiating, collecting, and stewarding data points to new ways of looking at data as dialogue, rather than operations. Such data gathering via community relationships leads us to view data records as products of both beholder and the beheld. Building ties with community workers in the AI dataset collection process can be a starting point in creating high quality data, while respecting their situated knowledge. Combining observational research and dataset analysis will help us avoid misreadings of data. Normative frameworks (e.g., perhaps ethics of care (Held and others, 2006; Asaro, 2019; Zevenbergen, 2020)) may be relevant to take into account these social relations. A related question is how data consent might work fairly in India. One approach could be to create transitive informed consent, built upon personal relationships and trust in data workers, with transparency on potential downstream applications. Ideas like collective consent (Ruhaak, ), data trusts (Mulgan and Straub, ), and data co-ops may enhance community agency in datasets, while simultaneously increasing data reliability.
Finally, at a fundamental level, we must question the categories and constructs we model in datasets, and how we measure them. As well as the situatedness of social categories such as gender (cf. Hijra) and race (Hanna et al., 2020), ontologies of affect (sentiment, inappropriateness, etc.), taboos (Halal, revealing clothing, etc.), and social behaviours (headshakes, headwear, clothing, etc) are similarly contextual. How do we justify the “knowing” of social information by encoding it in data? We must also question if being “data-driven” is inconsistent with local values, goals and contexts. When data are appropriate for endemic goals (e.g., caste membership for quotas), what form should their distributions and annotations take? Linguistically and culturally pluralistic communities should be given voices in these negotiations in ways that respect Indian norms of representation.
5.1.2. Model and Model (Un)Fairness Considerations
The prominent axes of historical injustices in India listed in Table 1 could be a starting point to detect and mitigate unfairness issues in trained models (e.g., (Bolukbasi et al., 2016; Zhang et al., 2018)), alongside testing strategies, e.g., perturbation testing (Prabhakaran et al., 2019), data augmentation (Zmigrod et al., 2019), adversarial testing (Kurakin et al., 2016), and adherence to terminology guidelines for oppressed groups, such as SIGACCESS. However, it is important to note that operationalising fairness approaches from the West along these axes is often nontrivial. For instance, personal names act as a signifier for various socio-demographic attributes in India; however, there are no large datasets of Indian names (like the US Census data, or the SSA data) that are readily available for fairness evaluations. In addition, disparities along the same axes may manifest very differently in India. For instance, gender disparities in the Indian workforce follow significantly different patterns compared to the West. How would an algorithm made fairer along gender based on datasets from the West be decontextualised and recontextualised for India?
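Perturbation testing, one of the strategies mentioned above, can be sketched as follows: fill the same sentence template with terms associated with different sub-groups and compare the model's scores. This is an illustrative sketch only. `toy_score` is a deliberately biased stand-in for a trained model, and the placeholder tokens (`NAME_A1`, etc.) are hypothetical; a real evaluation would need a curated list of Indian names and proxies, which, as noted, is not readily available.

```python
from typing import Callable, Dict, List


def perturbation_gap(
    score: Callable[[str], float],
    templates: List[str],
    substitutions: Dict[str, List[str]],
) -> Dict[str, float]:
    """For each template, fill the {name} slot with every term of every group
    and return the largest difference in mean score between any two groups.

    Assumes every group has at least one substitution term.
    """
    gaps = {}
    for template in templates:
        group_means = []
        for terms in substitutions.values():
            scores = [score(template.format(name=term)) for term in terms]
            group_means.append(sum(scores) / len(scores))
        gaps[template] = max(group_means) - min(group_means)
    return gaps


def toy_score(text: str) -> float:
    """Hypothetical, deliberately biased scorer standing in for a real model."""
    return 0.2 if "NAME_B" in text else 0.8


templates = ["{name} applied for a loan.", "{name} is a software engineer."]
substitutions = {
    "group_a": ["NAME_A1", "NAME_A2"],  # placeholder terms, not real names
    "group_b": ["NAME_B1", "NAME_B2"],
}
gaps = perturbation_gap(toy_score, templates, substitutions)
for template, gap in gaps.items():
    print(f"{template!r}: gap = {gap:.2f}")
```

A gap near zero suggests the score is insensitive to the substituted term for that template; a large gap flags a sentence context worth investigating. The choice of terms is itself a contextual decision, which is where the recontextualisation questions raised above enter.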
Another important consideration is how algorithmic fairness interventions work with the existing infrastructures in India that surround decision-making processes. For instance, how do they work in the context of established fairness initiatives such as reservations/quotas? As an illustration, compared to the US undergraduate admission process of selection from a pool of candidates, undergraduate admissions in India are done through various joint seat allocation processes, over hundreds of programmes, across dozens of universities and colleges, that take into account test scores, ordered preference lists provided by students, as well as various affirmative action quotas (Baswana et al., 2019). The quota system gives rise to the problem of matching under distributional constraints, a known problem in economics (Kamada and Kojima, 2015; Goto et al., 2017; Ashlagi et al., 2020), but one that has not received attention within the FAccT community (although (Cotter et al., 2019) is related). First-order Fair-ML problems could include representational biases of caste and other sub-groups in NLP models, biases in Indic language NLP including challenges from code-mixing, Indian subgroup biases in computer vision, tackling online misinformation, benchmarking using Indic datasets, and fair allocation models in public welfare.
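To make the idea of allocation under quota constraints concrete, the following is a heavily simplified sketch for a single programme. It is not the joint seat allocation procedure described by Baswana et al. (2019), which also handles student preference lists across many programmes. We assume, for illustration only, that open seats are filled first by score from all candidates and that each reserved category then fills its own seats from the remaining candidates of that category. The category labels `R1` and `R2`, the candidate names, and the scores are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    name: str
    score: float
    category: str  # "OPEN" or a reserved-category label


def allocate(
    candidates: List[Candidate],
    open_seats: int,
    reserved_seats: Dict[str, int],
) -> Dict[str, List[Candidate]]:
    """Fill open seats by score from everyone, then reserved seats by score
    from the remaining candidates of each reserved category."""
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    admitted = {"OPEN": ranked[:open_seats]}
    remaining = ranked[open_seats:]
    for category, seats in reserved_seats.items():
        eligible = [c for c in remaining if c.category == category]
        admitted[category] = eligible[:seats]
    return admitted


pool = [
    Candidate("A", 95, "OPEN"),
    Candidate("B", 90, "R1"),
    Candidate("C", 85, "OPEN"),
    Candidate("D", 80, "OPEN"),
    Candidate("E", 70, "R1"),
    Candidate("F", 60, "R2"),
]
result = allocate(pool, open_seats=2, reserved_seats={"R1": 1, "R2": 1})
for seat_type, admitted in result.items():
    print(seat_type, [c.name for c in admitted])
```

In this toy run, candidate B takes an open seat on score, so the R1 seat passes to E, while C and D are not admitted despite scoring above E and F. A fairness metric that only compares score thresholds across groups would misread this outcome, which is why interventions need to be designed with the surrounding quota infrastructure in mind.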
5.2. Empowering Communities
Recontextualising data and models can only go so far without participatorily empowering communities in identifying problems, specifying fairness expectations, and designing systems.
5.2.1. Participatory Fair-ML knowledge systems
Context-free assumptions in Fair-ML, whether in homegrown or international AI systems, can not just fail, but produce harm inadvertently when applied to different infrastructural and cultural systems. As Mbembe describes the Western epistemic tradition, the knowing subject is enclosed in itself and produces supposedly objective knowledge of the world, “without being part of that world, and he or she is by all accounts able to produce knowledge that is supposed to be universal and independent of context” (Mbembe, 2015). How can we move away from the ‘common good’ defined by the researcher, the supposedly all-knowing entity who has the expertise and experience necessary to identify what is of benefit to all (Nathan et al., 2017)? Humbly creating grassroots commitments with vulnerable communities is an important first step. Our study discusses how caste, religion, and tribe are elided even within the Indian technology discourse and policy. Modes of epistemic production in Fair-ML should enable marginalised communities to produce knowledge about themselves in the policies or designs. Grassroots efforts like Deep Learning Indaba (51) and Khipu (106) are exemplars of bootstrapping AI research in communities. Initiatives like Design Beku (52) and SEWA (186) are decolonial examples of participatorily co-designing with under-served communities.
5.2.2. Low-resource considerations
India’s heterogeneous literacies, economics, and infrastructures mean that Fair-ML researchers’ commitment should go beyond model outputs, to deployments in accessible systems. Half the population of India is not online. Layers of the stack like interfaces, devices, connectivity, languages, and costs are important in ensuring access. Learning from computing fields where constraints have been embraced as design material like ICTD (Toyama, 2015) and HCI4D (Dray et al., 2012) can help, e.g., via delay-tolerant connectivity, low cost devices, text-free interfaces, intermediaries, and NGO partnerships (Brewer et al., 2005; Heimerl et al., 2013; Kumar and Anderson, 2015; Medhi et al., 2006; Sambasivan et al., 2010; Gandhi et al., 2007; Vivek et al., 2018). Data infrastructures to build and sustain localised datasets would enhance access equity (e.g., (114; N. Sambasivan, S. Kapania, H. Highfill, D. Akrong, P. Paritosh, and L. Aroyo (2021))).
5.2.3. First-world care in deployments
Critiques were raised in our study on how neo-liberal AI followed a broader pattern of extraction from the ‘bottom billion’ data subjects and labourers. Low costs, large and diverse populations, and policy infirmities have been cited as reasons for following double standards in India, e.g., in non-consensual human trials and waste dumping (Macklin, 2004) (also see (Mohamed et al., 2020)). Past disasters in India, like the fatal Union Carbide gas leak in 1984, one of the world’s worst industrial accidents, point to faulty design and low-quality double standards for the ‘third world’ (Ullah, ). Similarly, unequal standards, inadequate safeguards, and dubious applications of AI in the non-West can lead to catastrophic effects (similar analogies have been made for content moderation (Roberts, 2016; Sambasivan et al., 2019)). Fair-ML researchers should understand the systems into which they are embedding, engage with Indian realities and user feedback, and assess whether the recourse is meaningful.
5.3. Enabling Fair-ML Ecosystems
AI is increasingly perceived as a masculine, hype-filled, techno-utopian enterprise, with nations turning into AI superpowers (RT, 2017). AI is even more aspirational and consequential in non-Western nations, where it performs an ‘efficiency’ function in distributing scarce socio-economic resources and differentiating economies. For Fair-ML research to be impactful and sustainable, it is crucial for researchers to enable a critically conscious ecosystem.
5.3.1. Ecosystems for Accountability
Bootstrapping an ecosystem made up of civil society, media, industry, judiciary, and the state is necessary for accountability in Fair-ML systems (recall the US facial recognition example). Moving from ivory-tower research approaches to solidarity with various stakeholders through project partnerships, involvement in evidence-based policy, and policy maker education can help create a sustainable Fair-ML ecosystem based on sound empirical and ethical norms. For example, we should consider research with algorithmic advocacy groups like the Internet Freedom Foundation (89), which have advanced landmark changes in net neutrality and privacy.
5.3.2. Critical transparency
Inscrutability suppresses algorithmic fairness. Besides the role played by ecosystem-wide regulations and standards, radical transparency should be espoused and enacted by Fair-ML researchers committed to India. Transparency on datasets, processes, and models (e.g., (Mitchell et al., 2019; Gebru et al., 2018; Bender and Friedman, 2018)), together with open discussion of limitations, failure cases, and intended use-cases, can help establish the practical limits of applying computing to human problems. Such approaches can help move fairness in India from its ‘magic pill’ role as a checklist for ethical issues to a more pragmatic, flawed, and evolving function.
As AI becomes global, algorithmic fairness naturally follows. Context matters. We must take care not to copy-paste Western-normative fairness everywhere. We presented a qualitative study and discourse analysis of algorithmic power in India, and found that algorithmic fairness assumptions are challenged in the Indian context. We found that data was not always reliable due to socio-economic factors, that ML products for Indian users suffer from double standards, and that AI was seen with unquestioning aspiration. We called for an end-to-end re-imagining of algorithmic fairness that involves re-contextualising data and models, empowering oppressed communities, and enabling fairness ecosystems. The considerations we identified are certainly not limited to India; likewise, we call for inclusively evolving global approaches to Fair-ML.
Our thanks to the experts who shared their knowledge and wisdom with us: A. Aneesh, Aishwarya Lakshmiratan, Ameen Jauhar, Amit Sethi, Anil Joshi, Arindrajit Basu, Avinash Kumar, Chiranjeeb Bhattacharya, Dhruv Lakra, George Sebastian, Jacki O’Neill, Mainack Mondal, Maya Indira Ganesh, Murali Shanmugavelan, Nandana Sengupta, Neha Kumar, Rahul De, Rahul Matthan, Rajesh Veeraraghavan, Ranjit Singh, Ryan Joseph Figueiredo (Equal Asia Foundation), Savita Bailur, Sayomdeb Mukerjee, Shanti Raghavan, Shyam Suri, Smita, Sriram Somanchi, Suraj Yengde, Vidushi Marda, and Vivek Srinivasan, and others who wish to stay anonymous. To Murali Shanmugavelan for educating us and connecting this paper to anti-caste emancipatory politics and theories. To Jose M. Faleiro, Daniel Russell, Jess Holbrook, Fernanda Viegas, Martin Wattenberg, Alex Hanna, and Reena Jana for their invaluable feedback.
- Artificial intelligence for the digital government — english version. In AI whitepaper., Note: https://www.gub.uy/agencia-gobierno-electronico-sociedad-informacion-conocimiento/sites/agencia-gobierno-electronico-sociedad-informacion-conocimiento/files/documentos/publicaciones/IA%20Strategy%20-20english%20version.pdf Cited by: §1.
-  () 84% dead in cow-related violence since 2010 are muslim; 97% attacks after 2014 — indiaspend. Note: https://archive.indiaspend.com/cover-story/86-dead-in-cow-related-violence-since-2010-are-muslim-97-attacks-after-2014-2014(Accessed on 08/16/2020) Cited by: 1st item.
- Entity-switched datasets: an approach to auditing the in-domain robustness of named entity recognition models. arXiv preprint arXiv:2004.04123. Cited by: §2.3.
- India’s mess of complexity is just what ai needs — mit technology review. Note: https://www.technologyreview.com/2018/06/27/240474/indias-mess-of-complexity-is-just-what-ai-needs/(Accessed on 09/18/2020) Cited by: §4.2.
- Chutia— ’chutia not slang, but community where i belong’: assam woman’s online job application rejected due to surname — trending & viral news. Note: https://www.timesnownews.com/the-buzz/article/chutia-not-slang-but-community-where-i-belong-assam-womans-online-job-application-rejected-due-to-surname/625556(Accessed on 09/28/2020) Cited by: §2.4, 1st item, 7th item.
- We are implementing a one-year moratorium on police use of rekognition. Note: https://blog.aboutamazon.com/policy/we-are-implementing-a-one-year-moratorium-on-police-use-of-rekognition(Accessed on 08/29/2020) Cited by: 3rd item.
- Annihilation of caste: the annotated critical edition. Verso Books. Cited by: 1st item, §5.
- Castes in india: their mechanism, genesis and development (vol. 1). Columbia: Indian Antiquary. Ambedkar, BR (1936). Annihilation of Caste. Jullundur: Bheem Patrika Publications. Cited by: §1, footnote 1.
- Machine bias — propublica. Note: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing(Accessed on 07/30/2020) Cited by: 2nd item.
- Spectral housing and urban cleansing: notes on millennial mumbai. Public culture 12 (3), pp. 627–651. Cited by: 2nd item.
- Bottom of the data pyramid: big data and the global south. International Journal of Communication 10, pp. 19. Cited by: §2.3.
- AI ethics in predictive policing: from models of threat to an ethics of care. IEEE Technology and Society Magazine 38 (2), pp. 40–53. Cited by: §5.1.1.
- Assignment mechanisms under distributional constraints. Operations Research 68 (2), pp. 467–479. Cited by: §5.1.2.
- Women and id in a digital age: five fundamental barriers and new design questions. Note: https://savitabailur.com/2019/09/09/women-and-id-in-a-digital-age-five-fundamental-barriers-and-new-design-questions/(Accessed on 08/02/2020) Cited by: footnote 6.
- Bioethics and human rights: a historical perspective. Cambridge Quarterly of Healthcare Ethics 10 (3), pp. 241–252. Cited by: §4.1.2.
- WhatsApp vigilantes: an exploration of citizen reception and circulation of whatsapp misinformation linked to mob violence in india. Department of Media and Communications, LSE. Cited by: §4.3.
- Labor market discrimination in delhi: evidence from a field experiment. Journal of comparative Economics 37 (1), pp. 14–27. Cited by: §2.4.
- Facial recognition based surveillance systems to be installed at 983 railway stations across india. Note: https://www.medianama.com/2020/01/223-facial-recognition-system-indian-railways-facial-recognition/(Accessed on 10/03/2020) Cited by: §4.3.
- Fairness in machine learning. NIPS Tutorial 1. Cited by: §2.
- Big data’s disparate impact. Calif. L. Rev. 104, pp. 671. Cited by: §4.1.1.
- Centralized admissions for engineering colleges in india. INFORMS Journal on Applied Analytics 49 (5), pp. 338–354. Cited by: §5.1.2.
- Law enforcement agencies in india are using artificial intelligence to nab criminals. Note: https://www.forbes.com/sites/baxiabhishek/2018/09/28/law-enforcement-agencies-in-india-are-using-artificial-intelligence-to-nab-criminals-heres-how(Accessed on 08/30/2020) Cited by: §1.
- Nirbhaya case: four indian men executed for 2012 delhi bus rape and murder - bbc news. Note: https://www.bbc.com/news/world-asia-india-51969961(Accessed on 09/01/2020) Cited by: §4.3.
- Data statements for natural language processing: toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics 6, pp. 587–604. Cited by: §5.3.2.
- Race, caste and gender. Man, pp. 489–504. Cited by: footnote 1.
- Isolated by caste: neighbourhood-scale residential segregation in indian metros. IIM Bangalore Research Paper (572). Cited by: 2nd item.
- Counting cows, not rural health indicators. Note: https://ruralindiaonline.org/articles/counting-cows-not-rural-health-indicators/(Accessed on 08/02/2020) Cited by: §4.1.1.
- Fairness in machine learning: lessons from political philosophy. In Conference on Fairness, Accountability and Transparency, pp. 149–159. Cited by: §2.1.3.
- Algorithmic colonization of africa. SCRIPTed 17, pp. 389. Cited by: §1.
- The ontogeny of fairness in seven societies. Nature 528 (7581), pp. 258–261. Cited by: §2.2.
- Interviewing experts. Springer. Cited by: §3.
- Man is to computer programmer as woman is to homemaker? debiasing word embeddings. In Advances in neural information processing systems, pp. 4349–4357. Cited by: §2.1.1, §5.1.2.
- The effectiveness of jobs reservation: caste, religion and economic status in india. Development and change 38 (3), pp. 423–445. Cited by: §4.1.2.
- Suffolk university, college of arts & sciences. Center for Restorative Justice. Retrieved on November 28, 2015. Cited by: §2.1.3.
- The case for technology in developing regions. Computer 38 (6), pp. 25–38. Cited by: §5.2.2.
- Gender shades: intersectional accuracy disparities in commercial gender classification. In Conference on fairness, accountability and transparency, pp. 77–91. Cited by: 3rd item, §2.1.1.
- Women and men in india: a statistical compilation of gender related indicators in india. Technical report Government of India. Cited by: Table 1.
- Conceptualising brahmanical patriarchy in early india: gender, caste, class and state. Economic and Political Weekly, pp. 579–585. Cited by: §3.
- Feminism in india. Cited by: §3.
- (2020) Citizen cop foundation. Cited by: §4.1.1.
- ZIP code history: how they define us — the new republic. Note: https://newrepublic.com/article/112558/zip-code-history-how-they-define-us(Accessed on 09/24/2020) Cited by: 2nd item.
- Optimization with non-differentiable constraints with applications to fairness, recall, churn, and other goals.. Journal of Machine Learning Research 20 (172), pp. 1–59. Cited by: §5.1.2.
- The hidden biases in big data. Harvard business review 1 (1), pp. 814. Cited by: §4.1.1.
- Think again: big data. Foreign Policy 9. Cited by: §4.1.1.
- How accurate are facial recognition systems – and why does it matter? — center for strategic and international studies. Note: (Accessed on 07/28/2020) Cited by: footnote 10.
- Economic impact of discoverability of localities and addresses in india — emerging worlds. Note: http://mitemergingworlds.com/blog/2018/2/12/economic-impact-of-discoverability-of-localities-and-addresses-in-india(Accessed on 09/24/2020) Cited by: §4.1.1.
- Women and repayment in microfinance: a global analysis. World development 39 (5), pp. 758–772. Cited by: §4.1.2.
- Mobile loans apps tala, branch, okash face scrutiny in kenya — quartz africa. Note: https://qz.com/africa/1712796/mobile-loans-apps-tala-branch-okash-face-scrutiny-in-kenya/(Accessed on 08/04/2020) Cited by: §4.1.2.
- Racial bias in hate speech and abusive language detection datasets. arXiv preprint arXiv:1905.12516. Cited by: §2.1.1.
- African americans. Note: https://livingnewdeal.org/what-was-the-new-deal/new-deal-inclusion/african-americans-2/ (Accessed on 08/29/2020) Cited by: footnote 7.
- (2020) Deep learning indaba. Cited by: §5.2.1.
- (2020) Design beku. Cited by: §5.2.1.
- Addressing age-related bias in sentiment analysis. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, CHI ’18, New York, NY, USA, pp. 1–14. Cited by: §2.1.1.
- Fair, but not so lovely: india’s obsession with skin whitening — by neha dixit — bright magazine. Note: https://brightthemag.com/fair-but-not-so-lovely-indias-obsession-with-skin-whitening-beauty-body-image-bleaching-4d6ba9c9743d(Accessed on 09/25/2020) Cited by: 5th item.
- India is creating a national facial recognition system. Note: https://www.buzzfeednews.com/article/pranavdixit/india-is-creating-a-national-facial-recognition-system-and(Accessed on 08/30/2020) Cited by: §1, §4.1.2.
- A broader view on bias in automated decision-making: reflecting on epistemology and dynamics. arXiv preprint arXiv:1807.00553. Cited by: §4.1.2.
- “Express yourself” / “stay together”: tensions surrounding mobile communication in the middle-class indian family. J. Katz (Ed.), Handbook of mobile communication studies, pp. 325–337. Cited by: §4.1.1.
- After access: inclusion, development, and a more mobile internet. MIT press. Cited by: §4.1.1.
- The biometric imaginary: bureaucratic technopolitics in post-apartheid welfare. Journal of Southern African Studies 41 (4), pp. 815–833. Cited by: footnote 6.
- How to read donald duck. International General New York. Cited by: §3.
- Human–computer interaction for development: changing human–computer interaction to change the world. In The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies, and Emerging Applications, Third Edition, pp. 1369–1394. Cited by: §5.2.2.
- Why political reservations?. Journal of the European Economic Association 3 (2-3), pp. 668–678. Cited by: §4.1.2.
- Encountering development: the making and unmaking of the third world. Vol. 1, Princeton University Press. Cited by: §3.
- Most indian nobel winners brahmins: gujarat speaker rajendra trivedi. Note: https://indianexpress.com/article/cities/ahmedabad/most-indian-nobel-winners-brahmins-gujarat-speaker-rajendra-trivedi-6198741/(Accessed on 09/04/2020) Cited by: §4.2.
- The wretched of the earth. Grove/Atlantic, Inc.. Cited by: §3.
- The validity and practicality of sun-reactive skin types i through vi. Archives of dermatology 124 (6), pp. 869–871. Cited by: 1st item, §2.1.1.
- Digital green: participatory video for agricultural extension. In 2007 International conference on information and communication technologies and development, pp. 1–10. Cited by: §5.2.2.
- 5 in new delhi rape case face murder charges - the new york times. Note: https://www.nytimes.com/2013/01/04/world/asia/murder-charges-filed-against-5-men-in-india-gang-rape.html(Accessed on 09/13/2020) Cited by: §4.1.1.
- Counterfactual fairness in text classification through robustness. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pp. 219–226. Cited by: §2.1.1.
- Datasheets for datasets. arXiv preprint arXiv:1803.09010. Cited by: §5.3.2.
- Data voids: where missing data can easily be exploited. Data & Society. Cited by: §5.1.1.
- Designing matching mechanisms under general distributional constraints. American Economic Journal: Microeconomics 9 (2), pp. 226–62. Cited by: §5.1.2.
- Moral foundations theory: the pragmatic validity of moral pluralism. In Advances in experimental social psychology, Vol. 47, pp. 55–130. Cited by: §2.1.3.
- Algorithmic realism: expanding the boundaries of algorithmic thought. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 19–31. Cited by: §2.1.2.
- The false promise of risk assessments: epistemic reform and the limits of fairness. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 594–606. Cited by: §2.1.2.
- Red tape: bureaucracy, structural violence, and poverty in india. Duke University Press. Cited by: footnote 6.
- Global ai ethics: a review of the social impacts and ethical implications of artificial intelligence. arXiv, pp. arXiv–1907. Cited by: §2.3.
- Towards a critical race methodology in algorithmic fairness. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 501–512. Cited by: §2.1.1, §5.1.1.
- Local, sustainable, small-scale cellular networks. In Proceedings of the Sixth International Conference on Information and Communication Technologies and Development: Full Papers-Volume 1, pp. 2–12. Cited by: §5.2.2.
- The ethics of care: personal, political, and global. Oxford University Press on Demand. Cited by: §5.1.1.
- Science, jews, and secular culture: studies in mid-twentieth-century american intellectual history. Princeton University Press. Cited by: footnote 7.
- Improving fairness in machine learning systems: what do industry practitioners need?. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 1–16. Cited by: §4.1.
- 50 years of test (un) fairness: lessons for machine learning. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pp. 49–58. Cited by: §2.1.2.
- Social biases in nlp models as barriers for persons with disabilities. ACL. Cited by: §2.1.1.
- Two thirds of india’s dalits are poor - international dalit solidarity network. Note: https://idsn.org/two-thirds-of-indias-dalits-are-poor/(Accessed on 08/13/2020) Cited by: 1st item.
- The IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems. “Classical Ethics in A/IS”. In Ethically Aligned Design: A Vision for Prioritizing Human Well-being with Autonomous and Intelligent Systems, First Edition, pp. 36–67. Cited by: §2.3.
- Culture and fairness: the idea of civilization fairness. In Fairness, Globalization and Public Institutions, pp. 31–33. Cited by: §2.1.3.
-  (2020-03) India used facial recognition tech to identify 1,100 individuals at a recent riot — techcrunch. Note: https://techcrunch.com/2020/03/11/india-used-facial-recognition-tech-to-identify-1100-individuals-at-a-recent-riot(Accessed on 07/28/2020) Cited by: §4.3, §4.3, §4.3.
- (2020) Internet freedom foundation. Cited by: §5.3.1.
- Engaging solidarity in data collection practices for community health. Proceedings of the ACM on Human-Computer Interaction 2 (CSCW), pp. 1–24. Cited by: §4.1.1.
- India’s internet population is exploding but women are not logging in. Scroll.in. Cited by: §4.1.1.
- Accounts and accountability: theoretical implications of the right-to-information movement in india. Third world quarterly 20 (3), pp. 603–622. Cited by: §4.1.1.
- The global landscape of ai ethics guidelines. Nature Machine Intelligence 1 (9), pp. 389–399. Cited by: §2.3.
- Rawlsian fairness for machine learning. arXiv preprint arXiv:1610.09559 1 (2). Cited by: §2.1.3.
- Racial indirection. UCDL Rev. 52, pp. 2495. Cited by: footnote 7.
- High gender disparity among internet users in india - the financial express. Note: https://www.financialexpress.com/industry/high-gender-disparity-among-internet-users-in-india/1718951/(Accessed on 10/06/2020) Cited by: 8th item.
- Residual unfairness in fair machine learning from prejudiced data. arXiv preprint arXiv:1806.02887. Cited by: §4.1.1.
- Opportunities and challenges for artificial intelligence in india. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 164–170. Cited by: §2.4.
- Efficient matching under distributional constraints: theory and applications. American Economic Review 105 (1), pp. 67–99. Cited by: §5.1.2.
- In india, accessible phones lead to inaccessible opportunities. Note: https://thewire.in/caste/india-accessible-phones-still-lead-inaccessible-opportunities(Accessed on 01/14/2021) Cited by: §4.1.1.
- Casteist slurs you need to know - youtube. Note: https://www.youtube.com/watch?v=wJwkIxOpqZA(Accessed on 09/25/2020) Cited by: 7th item.
- Obsessions with fair skin: color discourses in indian advertising. Advertising & society review 9 (2). Cited by: §4.1.2.
- Toward situated interventions for algorithmic equity: lessons from the field. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 45–55. Cited by: §2.4.
- “Happy and assured that life will be easy 10years from now.”: perceptions of artificial intelligence in 8 countries. arXiv preprint arXiv:2001.00081. Cited by: §4.3.
- Surveillance slavery: swachh bharat tags sanitation workers to live-track their every move — huffpost india. Note: https://www.huffingtonpost.in/entry/swacch-bharat-tags-sanitation-workers-to-live-track-their-every-move_in_5e4c98a9c5b6b0f6bff11f9b?guccounter=1(Accessed on 07/28/2020) Cited by: §4.2.
- (2020) Khipu ai. Cited by: §5.2.1.
- Aarogya setu: a bridge too far? — deccan herald. Note: https://www.deccanherald.com/specials/sunday-spotlight/aarogya-setu-a-bridge-too-far-835691.html(Accessed on 08/01/2020) Cited by: §4.1.1.
- How facial recognition can ruin your life – intercept. Note: https://theintercept.com/2016/10/13/how-a-facial-recognition-mismatch-can-ruin-your-life/(Accessed on 07/30/2020) Cited by: 2nd item, 5th item.
- Translation tutorial: a shared lexicon for research and practice in human-centered software systems. In 1st Conference on Fairness, Accountability, and Transparency. New York, NY, USA, Vol. 7. Cited by: §2.4.
- Mobile phones for maternal health in rural india. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 427–436. Cited by: §5.2.2.
- Does access to formal agricultural credit depend on caste?. World Development 43, pp. 315–328. Cited by: §2.4.
- Adversarial machine learning at scale. arXiv preprint arXiv:1611.01236. Cited by: §5.1.2.
- Digital colonialism: us empire and the new imperialism in the global south. Race & Class 60 (4), pp. 3–26. Cited by: §1.
- (2020) Lacuna fund. Cited by: §5.2.2.
- Big data and its exclusions. Stan. L. Rev. Online 66, pp. 55. Cited by: §4.1.1.
- Social justice from a cultural perspective.. Cited by: §2.2.
- To predict and serve?. Significance 13 (5), pp. 14–19. Cited by: §2.1.1.
- Culture’s impact on the importance of fairness in interorganizational relationships. Journal of International Marketing 21 (4), pp. 21–43. Cited by: §2.1.3.
- Double standards in medical research in developing countries. Vol. 2, Cambridge University Press. Cited by: §5.2.3.
- Caste discrimination in the indian urban labour market: evidence from the national sample survey. Economic and political Weekly, pp. 4146–4153. Cited by: §2.4.
- Black is to criminal as caucasian is to police: detecting and removing multiclass bias in word embeddings. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 615–621. Cited by: §2.1.1.
- Data in new delhi’s predictive policing system. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 317–324. Cited by: §2.4, §2.
- Artificial intelligence policy in india: a framework for engaging the limits of data-driven decision-making. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 376 (2133), pp. 20180087. Cited by: §2.4.
- Participatory problem formulation for fairer machine learning through community based system dynamics. ICLR Workshop on Machine Learning in Real Life (ML-IRL). Cited by: §2.4.
- Towards an ai strategy in mexico: harnessing the ai revolution. In AI whitepaper. Cited by: §1.
- Negotiating women’s agency through icts: a comparative study of uganda and india. Gender, Technology and Development 19 (1), pp. 43–69. Cited by: §4.1.1.
- Decolonizing knowledge and the question of the archive. Cited by: §5.2.1.
- Text-free user interfaces for illiterate and semi-literate users. In 2006 international conference on information and communication technologies and development, pp. 72–82. Cited by: §5.2.2.
- A survey on bias and fairness in machine learning. arXiv preprint arXiv:1908.09635. Cited by: §2.
- The darker side of western modernity: global futures, decolonial options. Duke University Press. Cited by: §3.
- 2011 census data. Note: https://www.censusindia.gov.in/2011-Common/CensusData2011.html (Accessed on 08/26/2020) Cited by: Table 1, Table 1, Table 1.
- Model cards for model reporting. In Proceedings of the conference on fairness, accountability, and transparency, pp. 220–229. Cited by: §5.3.2.
- Decolonial ai: decolonial theory as sociotechnical foresight in artificial intelligence. Philosophy & Technology, pp. 1–26. Cited by: §1, §5.2.3.
- Why urban indian women turn down job opportunities away from home. Note: https://www.indiaspend.com/why-urban-indian-women-turn-down-job-opportunities-away-from-home-94002/ (Accessed on 09/25/2020) Cited by: 6th item.
- Feminism without borders: decolonizing theory, practicing solidarity. Zubaan. Cited by: §3.
- The cisco case could expose rampant prejudice against dalits in silicon valley. Note: https://thewire.in/caste/cisco-caste-discrimination-silicon-valley-dalit-prejudice (Accessed on 08/14/2020) Cited by: 3rd item.
-  () The new ecosystem of trust: how data trusts, collaboratives and coops can help govern data for the maximum public benefit — nesta. Note: https://www.nesta.org.uk/blog/new-ecosystem-trust/(Accessed on 08/21/2020) Cited by: §5.1.1.
- This thing called fairness: disciplinary confusion realizing a value in technology. Proceedings of the ACM on Human-Computer Interaction 3 (CSCW), pp. 1–36. Cited by: §2.1.1, §2.1.3.
- How india’s data labellers are powering the global ai race — factordaily. Note: https://factordaily.com/indian-data-labellers-powering-the-global-ai-race/(Accessed on 09/13/2020) Cited by: §4.1.1.
- Good for whom? unsettling research practice. In Proceedings of the 8th International Conference on Communities and Technologies, pp. 290–297. Cited by: §5.2.1.
-  (2018) National strategy for artificial intelligence #ai4all. Niti Aayog. Cited by: §1, §1, §4.3.
- Who Tells Our Stories Matters: Representation of Marginalised Caste Groups in Indian Newsrooms. Cited by: §4.3.
- Accountability in a computerized society. Science and engineering ethics 2 (1), pp. 25–42. Cited by: §2.4.
- The long history of algorithmic fairness. Phenomenal World. Cited by: §2.1.3.
- Social data: biases, methodological pitfalls, and ethical boundaries. Frontiers in Big Data 2, pp. 13. Cited by: footnote 5.
- Computers and the promise of development: aspiration, neoliberalism and “technolity” in india’s ictd enterprise. A paper presented at confronting the Challenge of Technology for Development: Experiences from the BRICS, pp. 29–30. Cited by: §4.3.
- Banalities turned viral: narendra modi and the political tweet. Television & New Media 16 (4), pp. 378–387. Cited by: §4.3.
- Purposeful sampling for qualitative data collection and analysis in mixed method implementation research. Administration and policy in mental health and mental health services research 42 (5), pp. 533–544. Cited by: §3.
- Can mandated political representation increase policy influence for disadvantaged minorities? theory and evidence from india. American Economic Review 93 (4), pp. 1132–1151. Cited by: §4.1.2.
- COVID-19 lockdown highlights india’s great digital divide. Note: https://www.downtoearth.org.in/news/governance/covid-19-lockdown-highlights-india-s-great-digital-divide-72514(Accessed on 01/14/2021) Cited by: §4.1.1.
- Social audits in india – a slow but sure way to fight corruption. Note: https://www.theguardian.com/global-development/poverty-matters/2012/jan/13/india-social-audits-fight-corruption(Accessed on 08/21/2020) Cited by: footnote 8.
- Reflecting the past, shaping the future: making ai work for international development. USAID. gov. Cited by: §2.3.
- Perturbation sensitivity analysis to detect unintended model biases. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 5744–5749. Cited by: §5.1.2.
- Smart, responsible, and upper caste only: measuring caste attitudes through large-scale analysis of matrimonial profiles. In Proceedings of the International AAAI Conference on Web and Social Media, Vol. 13, pp. 393–404. Cited by: §2.4.
- Biometrics use for social protection programmes in india–risk: violating human rights of the poor. United Nations Research Institute for Social Development 2. Cited by: §2.4.
- Considering social implications of biometric registration: a database intended for every citizen in india [commentary]. IEEE Technology and Society Magazine 34 (1), pp. 10–16. Cited by: §2.4.
- Report of the expert group to review the methodology for measurement of poverty. Technical report Government of India Planning Commission. Cited by: Table 1.
- How a glitch in india’s biometric welfare system can be lethal — india — the guardian. Note: https://www.theguardian.com/technology/2019/oct/16/glitch-india-biometric-welfare-system-starvation(Accessed on 07/29/2020) Cited by: footnote 6.
- Dalit women talk differently: a critique of ’difference’ and towards a dalit feminist standpoint position. Economic and Political Weekly, pp. WS39–WS46. Cited by: footnote 3.
- People with disabilities in india: from commitments to outcomes. Note: http://documents1.worldbank.org/curated/en/577801468259486686/pdf/502090WP0Peopl1Box0342042B01PUBLIC1.pdf(Accessed on 08/26/2020) Cited by: Table 1.
- Fairness and political equality: india and the u.s.. Note: https://law.utah.edu/event/fairness-and-political-equality-india-and-the-u-s/ Cited by: §4.1.2.
- Digital refuse: canadian garbage, commercial content moderation and the global circulation of social media’s waste. Wi: journal of mobile media. Cited by: §5.2.3.
- Justice as the lens: interrogating rawls through sen and ambedkar. Indian Journal of Human Development 5 (1), pp. 153–174. Cited by: §2.1.3.
- Expected utilitarianism. arXiv preprint arXiv:2008.07321. Cited by: §2.1.3.
- The mobile gender gap report 2020. London: GSMA.. Cited by: §4.1.1.
- Capitalism: a ghost story. Haymarket Books. Cited by: §1.
- ’Whoever leads in ai will rule the world’: putin to russian children on knowledge day — rt world news. Note: https://www.rt.com/news/401731-ai-rule-world-putin/(Accessed on 09/20/2020) Cited by: §5.3.
- Why are we using black box models in ai when we don’t need to? a lesson from an explainable ai competition. Harvard Data Science Review 1 (2). Cited by: §4.3.
- Mozilla foundation - when one affects many: the case for collective consent. Note: https://foundation.mozilla.org/en/blog/when-one-affects-many-case-collective-consent/ (Accessed on 08/21/2020) Cited by: §5.1.1.
- India’s poor are also document-poor. Note: https://www.livemint.com/news/india/india-s-poor-are-also-document-poor-11578300732736.html(Accessed on 09/13/2020) Cited by: 9th item.
- In india, who speaks in english, and where?. Note: https://www.livemint.com/news/india/in-india-who-speaks-in-english-and-where-1557814101428.html(Accessed on 09/25/2020) Cited by: 7th item.
- (2020) Safetipin. Cited by: §4.1.1.
- Imagined connectivities: synthesized conceptions of public wi-fi in urban india. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, pp. 5917–5928. Cited by: §4.3.
- “They don’t leave us alone anywhere we go”: gender and digital abuse in south asia. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 1–14. Cited by: §4.2, §5.2.3.
- “Privacy is not for me, it’s for those rich women”: performative privacy practices on mobile phones by women in south asia. In Fourteenth Symposium on Usable Privacy and Security (SOUPS 2018), pp. 127–142. Cited by: §4.1.1, §4.1.1, §4.1.1.
- Intermediated technology use in developing communities. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2583–2592. Cited by: §5.2.2.
- Toward responsible ai for the next billion users. interactions 26 (1), pp. 68–71. Cited by: §2.3.
- “Everyone wants to do the model work, not the data work”: data cascades in high-stakes ai. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. Cited by: §5.2.2.
- The human infrastructure of ictd. In Proceedings of the 4th ACM/IEEE international conference on information and communication technologies and development, pp. 1–9. Cited by: §4.1.1, §4.2.
- The remarkable illusions of technology for social good. interactions 26 (3), pp. 64–66. Cited by: §4.3.
- The risk of racial bias in hate speech detection. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 1668–1678. Cited by: §2.1.1.
- BOND: caste and development. Cited by: §4.1.1, footnote 1.
- Fair is not fair everywhere. Psychological science 26 (8), pp. 1252–1260. Cited by: §2.2.
- The idea of justice. Harvard University Press. Cited by: §2.1.3.
- Partially generative neural networks for gang crime classification with partial information. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 257–263. Cited by: §4.3.
- (2020) SEWA. Cited by: §5.2.1.
- Disability rights: wheelchair users cannot access most of delhi’s buses. Note: https://scroll.in/roving/894005/in-photos-why-wheelchair-users-in-delhi-find-it-difficult-to-use-buses-even-low-floor-ones (Accessed on 09/25/2020) Cited by: 6th item.
- #MissionCashless: few use mobiles, fewer know what internet is in adivasi belts of madhya pradesh. Note: https://scroll.in/article/824882/missioncashless-few-use-mobiles-fewer-know-what-internet-is-in-adivasi-belts-of-madhya-pradesh (Accessed on 08/14/2020) Cited by: §4.1.1.
- No classification without representation: assessing geodiversity issues in open data sets for the developing world. arXiv preprint arXiv:1711.08536. Cited by: §2.3.
- Everyday communicative practices of arunthathiyars: the contribution of communication studies to the analysis of caste exclusion and subordination of a dalit community in tamil nadu, india. Cited by: §1, footnote 1.
- Toward fair, accountable, and transparent algorithms: case studies on algorithm initiatives in korea and china. Javnost - The Public 26 (3), pp. 274–290. Cited by: §2.3.
- From margins to seams: imbrication, inclusion, and torque in the aadhaar identification project. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, pp. 4776–4824. Cited by: footnote 6.
- ‘The living dead’. Whispers from the Field: Ethnographic Poetry and Creative Prose, pp. 29–31. Cited by: §4.1.1.
- Normal life: administrative violence, critical trans politics, and the limits of law. Duke University Press. Cited by: footnote 6.
- Making merit: the indian institutes of technology and the social life of caste. Comparative Studies in Society and History 57 (2), pp. 291. Cited by: §4.2.
- Mitigating gender bias in natural language processing: literature review. arXiv preprint arXiv:1906.08976. Cited by: §2.1.1.
- Does microfinance empower women? evidence from self-help groups in india. International review of applied economics 23 (5), pp. 541–556. Cited by: §4.1.2.
- Section 377: challenges and changing perspectives in the indian society. Changing Trends in Human Thoughts and Perspectives: Science, Humanities and Culture Part I, pp. 68. Cited by: 1st item.
- The unexpected entry and exodus of women in computing and hci in india. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 1–12. Cited by: footnote 9.
- A general inductive approach for analyzing qualitative evaluation data. American journal of evaluation 27 (2), pp. 237–246. Cited by: §3.
- The legacy of social exclusion: a correspondence study of job discrimination in india. Economic and political weekly, pp. 4141–4145. Cited by: §2.4, 1st item.
- Almost 68 percent inmates undertrials, 70 per cent of convicts illiterate — the indian express. Note: https://indianexpress.com/article/india/india-news-india/almost-68-inmates-undertrials-70-of-convicts-illiterate/ (Accessed on 07/28/2020) Cited by: §4.3.
- Geek heresy: rescuing social change from the cult of technology. PublicAffairs. Cited by: §5.2.2.
- National ai strategy: unlocking tunisia’s capabilities potential. In AI workshop., Note: http://www.anpr.tn/national-ai-strategy-unlocking-tunisias-capabilities-potential/ Cited by: §1.
- Court told design flaws led to bhopal leak — environment — the guardian. Note: https://www.theguardian.com/world/2000/jan/12/1 (Accessed on 08/21/2020) Cited by: §5.2.3.
- Employment, exclusion and ‘merit’ in the indian it industry. Economic and Political Weekly, pp. 1863–1868. Cited by: §4.2.
- Dealing with the digital panopticon: the use and subversion of ict in an indian bureaucracy. In Proceedings of the Sixth International Conference on Information and Communication Technologies and Development: Full Papers-Volume 1, pp. 248–255. Cited by: §4.2.
- Are technology-enabled cash transfers really ‘direct’?. Economic and Political Weekly 53 (30). Cited by: §5.2.2.
- Decolonising the mind: the politics of language in african literature. East African Publishers. Cited by: §3.
- World system versus world-systems: a critique. Critique of Anthropology 11 (2), pp. 189–194. Cited by: §1, §4.2.
- Deep neural networks are more accurate than humans at detecting sexual orientation from facial images. Journal of personality and social psychology 114 (2), pp. 246. Cited by: §4.3.
- Jayapal joins colleagues in introducing bicameral legislation to ban government use of facial recognition, other biometric technology - congresswoman pramila jayapal. Note: https://jayapal.house.gov/2020/06/25/jayapal-joins-rep-pressley-and-senators-markey-and-merkley-to-introduce-legislation-to-ban-government-use-of-facial-recognition-other-biometric-technology/ (Accessed on 07/30/2020) Cited by: 4th item.
- What to account for when accounting for algorithms: a systematic literature review on algorithmic accountability. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 1–18. Cited by: §2.4.
- Tribes and social exclusion. CSSSC-UNICEF Social Inclusion Cell, An Occasional Paper 2, pp. 1–18. Cited by: 1st item.
- On the legal compatibility of fairness definitions. arXiv preprint arXiv:1912.00761. Cited by: §4.1.2.
- Internet users as vulnerable and at-risk human subjects: reviewing research ethics law for technical internet research. Ph.D. Thesis, University of Oxford. Note: Unpublished PhD thesis Cited by: §5.1.1.
- Mitigating unwanted biases with adversarial learning. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 335–340. Cited by: §5.1.2.
- Men also like shopping: reducing gender bias amplification using corpus-level constraints. arXiv preprint arXiv:1707.09457. Cited by: §2.1.1.
- Counterfactual data augmentation for mitigating gender stereotypes in languages with rich morphology. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 1651–1661. Cited by: §5.1.2.