The Big Picture: Ethical Considerations and Statistical Analysis of Industry Involvement in Machine Learning Research

06/08/2020
by   Thilo Hagendorff, et al.
Universität Tübingen

It is commonly believed among the machine learning (ML) community that industry influence on the community itself as well as the scientific process is increasing since tech companies have begun to allocate a large amount of human and monetary resources to ML. However, concrete ethical implications and the quantitative scale of this influence are rather unknown. For this purpose we have not only carried out an informed ethical analysis of the field, but have inspected all papers of the main ML conferences NeurIPS, CVPR and ICML of the last 5 years - almost 11000 papers in total. Our statistical approach focuses on conflicts of interest, innovation and gender equality. We have obtained four main findings: (1) Academic-corporate collaborations are growing in numbers. At the same time, we found that conflicts of interest are rarely disclosed. (2) Industry publishes papers about trending ML topics on average two years earlier than academia. (3) Industry papers are not lagging behind academic papers concerning social impact considerations. (4) Finally, we demonstrate that industrial papers fall short of their academic counterparts with respect to the ratio of gender diversity. The results have been reviewed in the light of related research works from ethics and other disciplines. For the first time we have quantitatively analysed the influence of industry on the ML community. We believe that this is a good starting point for further fine-grained discussion. The main recommendation that follows from our research is for the community to openly declare conflicts of interest, also subtle or only potential ones, to foster trustworthiness and transparency.

1 Introduction

The number of papers submitted and accepted at the major machine learning (ML) conferences is growing rapidly. Besides submissions from academia, big tech companies like Amazon, Apple, Google, and Microsoft submit a large number of papers. But the influence of these companies on science is unclear. Do they drive innovation? What are potential upsides and downsides of industry involvement in ML research? What are the possible ramifications of conflicts of interest? In order to investigate the mentioned topics, namely the industry involvement in ML research and its associated ramifications that range from questions of conflicts of interest, scientific progress and research agendas, as well as gender balance, we have conducted a statistical data analysis into the field. Our analysis serves to answer four overarching research questions. First of all, we will develop a quantitative analysis of the proportion of industry, academic, and academic-corporate collaboration papers within the three major ML conferences, namely the Conference and Workshop on Neural Information Processing Systems (NeurIPS), International Conference on Machine Learning (ICML), and Conference on Computer Vision and Pattern Recognition (CVPR). Secondly, we aim to find out whether conflicts of interest are disclosed in those cases in which they are pertinent. Answering these questions will be of importance in order to assess potential changes in conference policies on transparency statements. Thirdly, we are interested in the role industry papers play with regard to scientific progress and ethical concerns, as well as whether they are, in this respect, any different than academic research. And finally, we investigate gender balances, particularly with regard to proportions of women working on industry papers as well as acting in the position of principal/senior investigators. 
In the following paragraphs, we give a theoretical introduction into the mentioned issues and discuss their ethical implications.

1.1 The ethics of conflicts of interest

First, we will examine conflicts of interest, which are a common side effect of industry involvement in academic research in general. A substantial amount of literature is dedicated to reflecting on conflicts of interest that can occur in clinical practice, education, or research Rodwin.1993; Fickweiler.2017; Thompson.1993. As a consequence of conflicts of interest in research, medical journals require researchers to name funding sources. The public disclosure of funding sources, affiliations, memberships etc. is supposed to inform those who receive scientific information or advice in order to close information gaps. This allows them to assess the information or advice to its full extent. In this context, since 2020, NeurIPS, the largest ML conference in the world, has required researchers who make submissions to describe the potential broader impact their respective research has on society as well as to disclose financial conflicts of interest that could directly or indirectly affect the research in a funding transparency statement (see: https://neurips.cc/Conferences/2020/PaperInformation/FundingDisclosure).
But what exactly are conflicts of interest? While it is hard to find a universal definition, a common denominator is that conflicts of interest arise when personal interests interfere with requirements of institutional roles or professional responsibilities Komesaroff.2019. Here, interests can be seen as goals that are aligned with certain financial or non-financial values that have a particular, maybe detrimental effect on decision-making. Coexistence of conflicting interests results in an incompatibility of two or more lines of actions. In modern research settings, dynamic and complex constellations of conflicting interests frequently occur Komesaroff.2019. For instance, conflicts of interest do not only pose a problem in cases where researchers intentionally follow particular interests that undermine others. Many effects that arise from conflicts of interest take effect on subconscious levels Cain.2008; Moore.2004b; Dana.2003, where actions are rationalized by post hoc explanations Haidt.2001. Especially in the field of medical research many studies show that even when physicians report that they are not biased by financial incentives, they actually are Orlowski.1992; Avorn.1982. This means that despite researchers’ belief in their own integrity and the idea that financial opportunities, honorariums, grants, awards, or gifts have no influence in their line of action, opinion, or advice, the influence is, in fact, measurable.
Psychological research has shown that individuals often succumb to various biases that steer their behavior Chavalarias.2010; Ioannidis.2005; Kahneman.2012b; Tversky.1974. These are so-called “self-serving biases”, meaning that fairness criteria, assumptions about the susceptibility towards conflicts of interest, or other ways of evaluating issues are skewed towards one’s own favor McKinney.1990. One famous self-serving bias is exemplified by the fact that physicians assume that small gifts do not significantly influence their behavior, while actually, the opposite is true Brennan2006. Even small favors elicit the reciprocity principle, meaning that there is a clear influence or bias on an individual’s behavior. These biases are not necessarily associated with lacking moral integrity or even corruptibility. On the contrary, they can be assigned to an “ecological rationality”, meaning that an individual’s behavior is adapted to environmental structures and certain cognitive strategies Arkes.2016; Gigerenzer.2001. Nevertheless, conflicts of interest can have or actually do have dysfunctional effects on the scientific process. Hence, the scientific community does well in finding a way to deal with them properly. This is mostly done by obliging researchers to disclose conflicts of interest. While this is an accepted method in many scientific fields, it can actually have negative effects. These so-called “perverse effects” are described by Cain et al. Cain.2005; Crawford.1982. On the one hand, Cain and colleagues demonstrate that disclosing conflicts of interest does not lead people to relativize claims by biased experts sufficiently since disclosure can in some cases increase rather than decrease trust. On the other hand, and more importantly, experts who reveal conflicting interests may thus feel free to exaggerate their advice and claims since they have reduced their guilty conscience about spreading misleading or biased information. While transparency statements have side effects, they should certainly not be omitted entirely. Further efforts to introduce transparency declarations are to be welcomed, while at the same time a responsible interpretation of these declarations is required in order to ensure that disclosure brings about the intended effects Loewenstein.2012.
A further concern connected to industry funding is that research agendas are skewed. More applied topics and short-term benefits are favored over basic science and its potential long-term outcomes. This causes an agenda setting that leads to a strong orientation of research strands towards corporate interests Washburn.2008, or, more severely, to the plain distortion or suppression of certain research results in order to produce favorable outcomes for the respective sponsor. This is called “industry bias” Lundh.2017; Probst.2016; Krimsky.2013. This bias can occur due to payments for services, the commodification of intellectual property rights, job offers, startups or companies scientists own, consultation opportunities, and the like. But despite the manifold pitfalls that are caused by the intermingling of academia and industry, studies show that particularly corporate-sponsored research can be very valuable for science itself as well as for society as a whole Wright.2014 — see the example of microneedle fabrication.

1.2 Drivers of scientific progress and innovation

Hence, one has to discuss another concomitant of industry involvement in research, namely industry’s potential innovative strength. Industry involvement in the sciences can not only provide more jobs, lead to tangible applications of scientific insights, provide life-enhancing products, and increase a society’s wealth, but also lead to much-cited papers and spur innovations. Researchers Wright.2014 have shown that corporate-sponsored inventions resulted in licenses and patents more frequently than federally sponsored ones, although this alone does not mean that industry is more innovative per se. Furthermore, corporations are often seeking university partners in order to widen their portfolio of products, business models, and profit opportunities. This can nudge academic partners to act progressively, toward novel, unprecedented experiments, research ideas, and speculative approaches Evans.2010. In this indirect way, industry funds lead to scientific progress and innovation. Research on innovation processes has shown that organizations are typically not innovating internally, but in networks, in social relationships between members of different organizations, in technology transfer offices, science parks, and many other university-industry collaborations Perkmann.2007. These collaborations can emerge via research papers, conferences, meetings, informal information exchange, consulting, contract research, hired graduates, joint work on patents or licences, etc., and play a vital role in driving innovation processes Cohen.2002b. All in all, scientists’ sensitivity towards opportunities of industry funding may in fact cause “deformed” research agenda settings. This does not necessarily mean, though, that innovations, scientific progress, and its positive effects for society are diminished. Academic engagement, i.e. the involvement of researchers in university-industry knowledge transfer processes of all kinds, is a common by-product of academic success.
Scientists who are well established, more senior, have more social capital, more publications, and more government grants, are at the same time more likely to have industrial collaborators Perkmann.2013. This is due to the “Matthew effect”, meaning that researchers who are already successful in their field of research are more likely to reinforce this success with industry engagements whose returns continuously lead to more academic success. Researchers who are involved in commercialization activities publish more papers in comparison to their non-patenting colleagues Fabrizio.2008; Breschi.2007. Scientific success in ML research seems to go hand in hand with industry collaborations. However, industry-driven research or research that is intended to be commodified is, in most cases, more secretive and less accessible for the public.

1.3 Statistics on gender imbalances

The final issue we are going to investigate is that of gender aspects and their entanglement with industry research. Noticeably, male academics are significantly more likely to have industry partners than female scientists Perkmann.2013. This finding corresponds to the fact that ML research has a diversity imbalance, in that male researchers significantly outnumber female researchers. Statistics show that only a small share of authors at major conferences are women. Similar imbalances show up in the proportion of women among ML professors, in the heavy gender imbalances at tech companies, in women’s tendency to leave the technology sector, and in the fact that they are paid less than men MyersWest.2019; Simonite.2018. Further diversity dimensions like ethnicity, intersexuality, and many other minorities or marginalized groups are often not statistically documented. Tech companies have even thwarted access to diversity figures to avoid public attention on the under-representation of women and minorities Pepitone.2013. All in all, the “gender problem” of the ML sector does not only manifest itself on the level of lacking workforce diversity, but in the functionality of software applications, too Leavy.2018. Beyond these rather general observations and statistics, we want to find out whether gender imbalances are particularly pronounced in the context of industry ML research. Inspired by previous research on gender imbalances Andersen.2019, we scrutinise the ratio of female (last) authors in academia and industry papers.

2 Methods: Analysing 10772 ML papers

At this point, we will briefly describe the methods we have used to conduct our statistical analysis. More detailed information about the process can be found in the supplementary material. All in all, our analysis focuses on three major ML conferences: ICML, CVPR, and NeurIPS. We downloaded all available articles in the period from 2015 to 2019 from the respective conference proceedings. Altogether, the data set contains 10807 papers. The papers were downloaded using the Python tool Beautiful Soup (v. 4.8.2). We extracted the text with pdftotext (v. 0.62.0) and analyzed the text with a self-written script. With this method, we are able to search 10772 papers (99.7%). Some of the papers are not searchable, for example because their text is embedded as an image. We are not only analysing the papers themselves, we are also interested in the metadata, namely affiliations and authors. For the analysis of the affiliations, we extracted them from the texts and categorised them into academic and industrial affiliations. We define a paper as academic if it contains one of the following terms (see supplementary material S.1.5 for more information on why we use these terms only):

California Institute of Technology / Ecole / EPFL / ETH Zürich / INRIA / Kaist / Massachusetts Institute of Technology / MILA / MIT / ParisTech / Planck / RIKEN / TU Darmstadt / Université / Universiteit / University.

For the definition of a paper as industry we use the following terms:

Adobe / AITRICS / Alibaba / Amazon / Ant Financial / Apple / Bell Labs / Bosch / Criteo / Data61 / DeepMind / Expedia / Facebook / Google / Huawei / IBM / Intel / Kwai / Microsoft / NEC / Netflix / NTT / Nvidia / Petuum / Qualcomm / Salesforce / Siemens / Tencent / Toyota / Uber / Vector Institute / Xerox / Yahoo / Yandex.



Unless otherwise stated, we define a paper as academic if it does not contain an industry term and as belonging to industry if it does not contain an academic term. In total, 90.2% of all papers contain at least one of the terms from academia or industry listed above. These numbers depend entirely on the authors actually declaring all their affiliations in the paper. Furthermore, we extracted the names of the authors and the titles of the papers directly from the websites, not from PDF documents. For this purpose, we once again used Beautiful Soup. We extracted 41939 authors. However, many authors have multiple accepted papers, and thus, the number of authors is reduced to 18060 unique authors. All information in text and graphics about the number of authors refers to these unique authors. The genders of the names were determined using the name-to-gender service GenderAPI. GenderAPI offers the highest accuracy of the name-to-gender tools Santamaria2018 and is able to determine the gender of 17412 authors (96.4%). Finally, we downloaded the citations received for each individual paper using the Microsoft Academic Knowledge API Sinha2015 (date citations received: 03.29.2020). This was successful for 10616 papers (98.2%).
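The classification rule described above can be illustrated with a minimal sketch. It assumes that the affiliation text of a paper is already available as a plain string. The function name `classify_paper`, the label strings, and the abbreviated term lists are illustrative choices and not the self-written script used for the analysis.

```python
# Illustrative sketch only: both lists are abbreviated subsets of the
# term lists given in the methods section.
ACADEMIC_TERMS = ["University", "Université", "Universiteit", "ETH Zürich", "INRIA", "Planck"]
INDUSTRY_TERMS = ["Google", "DeepMind", "Facebook", "Microsoft", "IBM", "Amazon"]


def classify_paper(affiliation_text: str) -> str:
    """Return 'academic', 'industry', 'mixed' or 'unknown' (hypothetical labels)."""
    has_academic = any(term in affiliation_text for term in ACADEMIC_TERMS)
    has_industry = any(term in affiliation_text for term in INDUSTRY_TERMS)
    if has_academic and has_industry:
        return "mixed"      # academic-corporate collaboration
    if has_academic:
        return "academic"   # academic term present, no industry term
    if has_industry:
        return "industry"   # industry term present, no academic term
    return "unknown"        # none of the listed terms found
```

Plain substring matching is only one possible reading of "contains a term". Short terms can match inside longer words (for example, "Intel" inside "Intelligence"), so a sketch like this is better applied to affiliation strings than to the full paper text.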

Our approach has three (possible) limitations. Firstly, our results should be understood as general trends, not exact numbers, since it is not possible to extract data from the papers in all cases. A further limitation of our method that is particularly relevant for our analysis of conflicts of interest is that we cannot detect cases where researchers have academic and industry affiliations at the same time but state only one of them in the respective research paper. Moreover, we would like to point out that the data set is smaller for the industry analysis. Small data sets tend to produce extreme results, in both positive and negative directions. Nevertheless, we believe that this is not a problem in our case as our results, as we will see in the following section, are very robust.

3 Results

3.1 Subtle conflicts of interest in academia

Figure 1a plots the number of papers accepted at ICML, NeurIPS and CVPR between 2015 and 2019. The number of accepted papers is steadily increasing. Figure 1b shows whether the paper includes authors with affiliations from academia, industry or both. While the ratio of papers from industry is stable, an increasing ratio of papers have affiliations from both academia and industry. These collaborations are especially prone to conflicts of interest.
Ties between the two social systems Luhmann.1995b, university and industry, do seem to become tighter. Academic settings are increasingly migrating towards corporate tech environments. Moreover, academic papers with no industry affiliations are slightly on the decrease. This makes an appropriate approach to dealing with conflicts of interest all the more important. However, purely academic papers still make up the largest part of the submissions to all major conferences.

Figure 1: Progression of the number of papers at major ML conferences (a) and institutional affiliations (b). For better overview, the mixed papers are plotted in light green. Please note the numbers do not add up to 100% because we were only able to extract this information for 90% of the papers, see methods and supplementary information.

Furthermore, we extracted the acknowledgements of all papers from academia and searched them for terms of industry affiliations (Google, Facebook etc.). This gives us an insight into whether academic papers acknowledge industry funding, grants etc. In fact, we calculated the conditional probability that a paper contains an industry acknowledgement given that it is an academic paper. Figure 2 shows the share of academic papers that are potential cases of conflicts of interest. Finally, we also searched for the terms “conflict of interest” (the plural “conflicts of interest” does not lead to a single finding) and “disclosure” in order to identify whether such influences are named. Only three of more than ten thousand papers contain an explicit conflict of interest statement at all. This inquiry shows that on the one hand, conflicts of interest are present in many academic research papers, while on the other hand, those conflicts are not clearly stated. This is a further sign that it is reasonable for ML conferences to require researchers to add transparency statements to their submissions.
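The conditional probability used here can be sketched as a simple per-year ratio: among academic papers, the share whose acknowledgement section mentions an industry term. The record keys (`year`, `category`, `acknowledgement`), the function name, and the abbreviated term list are assumptions made for illustration, and the records are invented toy data rather than results from the study.

```python
from collections import defaultdict

# Abbreviated, illustrative subset of the industry terms listed in the methods.
INDUSTRY_TERMS = ["Google", "Facebook", "Microsoft", "Amazon", "Nvidia"]


def industry_ack_share(papers):
    """Estimate P(industry acknowledgement | academic paper) for each year.

    `papers` is an iterable of dicts with the hypothetical keys
    'year', 'category' and 'acknowledgement'.
    """
    totals = defaultdict(int)  # academic papers per year
    hits = defaultdict(int)    # academic papers with an industry term per year
    for paper in papers:
        if paper["category"] != "academic":
            continue
        totals[paper["year"]] += 1
        if any(term in paper["acknowledgement"] for term in INDUSTRY_TERMS):
            hits[paper["year"]] += 1
    return {year: hits[year] / totals[year] for year in totals}


# Invented toy records, not data from the study.
toy_papers = [
    {"year": 2019, "category": "academic", "acknowledgement": "Supported by a Google award."},
    {"year": 2019, "category": "academic", "acknowledgement": "Funded by a public grant."},
    {"year": 2019, "category": "industry", "acknowledgement": ""},
]
shares = industry_ack_share(toy_papers)
```

Conditioning on the academic category simply means that industry and mixed papers are left out of both the numerator and the denominator.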

Figure 2: Papers from academia with industry acknowledgement (red line).
Figure 3: Average number of citations received (a); see text for why we do not show error bars. Ratio of papers from academia and industry with the trending topics ’adversarial’ and ’reinforcement’ (b) and papers with social impact terms (c).

With recourse to the insights from figure 1b, there is no question that purely academic papers, not industry papers, make up the largest part of submissions to all major ML conferences. However, approximately 15-20% of academic papers contain conflicts of interest, as shown in figure 2, which is due to various kinds of industry involvement. This reduces the share of papers genuinely uninfluenced by industry in figure 1b to about 40%.

3.2 Publishing behaviour on trending topics: Industry is two years ahead of academia

Next, we want to gain insight into whether it is academia or industry that propels important parts of the ML field. Thus, we compared industry and academic papers with regard to the average number of citations they received. The results are shown in figure 3a. For industry papers, the plotted mean citation count is influenced by a few heavily cited papers. Therefore, error bars with standard deviations are very large and are not shown. Instead, we show the median in the supplementary material, which shows a similar trend, see supplementary figure SF.1.
While citation analyses are not particularly credible for papers that were published quite recently, since citations are slow to accumulate, citation analyses gain in significance over time. Thus, our analysis clearly shows that industry papers from 2015 were cited far more frequently than academic papers, giving evidence for the high scientific relevance of industry papers. This trend prevails throughout the following years, albeit on a smaller scale. Overall, there is no question that industry papers receive greater attention from the scientific community than academic papers. A confound of this analysis is that one may assume that academic researchers who are especially successful are likely to be hired by ML companies, which then causes industry papers to have more citations on average than academic papers. Thus, it is difficult to state whether industry research has more scientific impact because of the industry context itself or because of companies’ strategic hiring policies and the corresponding migration of successful university researchers to companies.
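The reason for reporting the median alongside the mean can be seen in a small sketch. The citation counts below are invented toy numbers chosen to contain one heavily cited paper, and the function name `summarize_citations` is hypothetical.

```python
from statistics import mean, median


def summarize_citations(citations):
    """Return the mean and the median of a list of citation counts."""
    return {"mean": mean(citations), "median": median(citations)}


# Invented toy counts: five modestly cited papers and one heavily cited paper.
toy_citations = [3, 5, 8, 10, 12, 900]
summary = summarize_citations(toy_citations)
```

In this toy list the single heavily cited paper pulls the mean far above the median, which is the kind of effect that makes error bars based on standard deviations very large.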
In order to gain further insights into whether it is academia or industry that drives the field of ML, we searched for two terms, the first of which is “adversarial”. This, on the one hand, corresponds to the very popular Generative Adversarial Networks introduced in 2014 by Goodfellow et al. Goodfellow2014 and, on the other, to adversarial attacks on neural networks Szegedy2013. We also included the term “reinforcement” for reinforcement learning. These are topics of increasing interest to the ML field Biggio201; Lipton2018. The results are shown in figure 3b. They show that academia is lagging roughly two years behind industry (ICML and NeurIPS). Similar trends can be found for the much more frequently used terms “convolution” and “deep” in supplementary figure SF.2.
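The term search behind this comparison can be sketched as the share of papers per category and year whose text contains a given term. Case-insensitive substring matching, the record keys (`category`, `year`, `text`), and the function name are assumptions for illustration; the records are invented toy data.

```python
def term_share_by_year(papers, term):
    """Share of papers containing `term`, keyed by (category, year).

    `papers` is an iterable of dicts with the hypothetical keys
    'category', 'year' and 'text'.
    """
    counts = {}  # (category, year) -> (number of papers, number containing the term)
    for paper in papers:
        key = (paper["category"], paper["year"])
        total, hits = counts.get(key, (0, 0))
        found = term.lower() in paper["text"].lower()
        counts[key] = (total + 1, hits + int(found))
    return {key: hits / total for key, (total, hits) in counts.items()}


# Invented toy records, not data from the study.
toy_papers = [
    {"category": "industry", "year": 2016, "text": "We train an adversarial network."},
    {"category": "industry", "year": 2016, "text": "A study of convex optimisation."},
    {"category": "academic", "year": 2016, "text": "A kernel method for regression."},
    {"category": "academic", "year": 2016, "text": "Bayesian inference at scale."},
]
adversarial_share = term_share_by_year(toy_papers, "adversarial")
```

Plotting these shares over the years for each category gives curves of the kind compared in figure 3b.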
In addition, we are interested in whether social aspects are of growing interest in the ML community. Thus, we included terms from the social impact category of NeurIPS 2020 (safety, fairness, accountability, transparency, privacy, anonymity, security) and added ’ethic’ as well as ’explainab’. We call these terms social impact terms. Overall, we can see growing attention by the ML community to these terms in the course of the past years. But while one may assume that academic papers put a stronger focus on social impact issues in comparison to industry research, this intuition does not hold true. The number of social impact terms is more or less equally shared between academia and industry, showing that ideas of an ethical “superiority” of academia do not bear scrutiny.

3.3 Gender equality: Female senior authors are underrepresented in industry

Finally, we analysed the contribution of male and female authors to ML conferences. Figure 4a shows the ratio of female to all authors across the conferences, indicating a slight increase in the ratio of female authors across the three major ML conferences. We show here only academic and industry papers, as we are not able to assign particular affiliations to individual authors. In general, the results are in line with other studies, which report that the proportion of women in ML research and in the workforce of major tech companies typically hovers between 10 and 20 percent Yuan.2020. What is somewhat noticeable, though, is that female authors were previously represented even less in industry papers than in academic papers. However, this difference seems to be disappearing, so that nowadays, no notable differences in the ratio of female authors exist between academia and industry.

Figure 4: Overall ratio of female authors in academia and industry (a) and ratio of last authors (b).

Taking up previous research Andersen.2019 and going into further detail, figure 4b shows the ratio of female last authors compared to all last authors. Being the last author corresponds to the principal investigator or the most senior author. Here, female authors are less represented in industry papers than in academic papers. This indicates that gender imbalances on the level of principal investigators are even more significant in contexts of industry research than in academia. However, we also analysed the ratio of female last authors in academic papers with industry acknowledgement, see supplementary figure SF.3. Due to the small differences, it is difficult to say whether there is no or only very little discrimination against women in the assignment of grants.
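The last-author ratio discussed in this section can be sketched as follows. The sketch assumes that each paper record carries an ordered author list and that a name-to-gender lookup is already available as a dictionary. Excluding last authors whose gender is unknown is an assumption of this sketch, and all names, keys, and records are invented for illustration.

```python
def female_last_author_ratio(papers, gender_of):
    """Ratio of female last authors among last authors with a known gender.

    `papers` is an iterable of dicts with the hypothetical key 'authors'
    (an ordered list of names); `gender_of` maps a name to 'female' or 'male'.
    Returns None if no last author has a known gender.
    """
    last_genders = [gender_of.get(paper["authors"][-1]) for paper in papers if paper["authors"]]
    known = [gender for gender in last_genders if gender is not None]
    if not known:
        return None
    return sum(gender == "female" for gender in known) / len(known)


# Invented toy records and lookup, not data from the study.
toy_papers = [
    {"authors": ["A. One", "B. Two"]},
    {"authors": ["C. Three", "D. Four"]},
    {"authors": ["E. Five", "F. Six"]},
]
toy_genders = {"B. Two": "female", "D. Four": "male"}  # "F. Six" is undetermined
ratio = female_last_author_ratio(toy_papers, toy_genders)
```

Applying the same function separately to academic and industry papers yields the two quantities compared in figure 4b.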

4 Conclusion

The scientific success of ML research has lured an increasing number of industry partners to coalesce with academia. The growing number of papers stemming from academic-corporate collaboration is an indication of this (see figure 1b). While medical journals require researchers to name conflicts of interest, the ML community is slowly following and obliging researchers to add transparency statements to their work. This seems reasonable, especially against the backdrop of an increasing number of academic-corporate collaborations and academic papers with industry acknowledgements. Up to now, though, only a handful of papers voluntarily add conflict of interest sections. On a related note, it is difficult to describe concrete ramifications on lines of action, opinions, or advice. In medical research, tangible and relatively direct influences from the pharmaceutical industry can be picked up. In ML research, industry influences are more fuzzy and hard to monitor. Hence, the concrete consequences of existing conflicts of interest can only be discovered by more in-depth, qualitative empirical social research. One can assume that in ML research ramifications mostly affect research agendas, so that scientists consciously or unconsciously steer their research in a direction that is most valuable for corporate interests or commercialization processes of all kinds. This bias can also suppress certain research results in order to avoid outcomes that are unfavorable or impractical for those interests or processes. After all, universities and companies follow different “symbolically generalized communication media” (i.e. money or truth, see Luhmann.1995b), which can make it difficult for researchers with academic-corporate cooperation to act in accordance with only one of those goals.
Despite the issue of conflicting interest, our data analysis also provides evidence for the fact that industry-driven research has a measurable impact and is setting research trends ahead of academia (see figure 3a and b). This insight stands in contrast to the rather industry-critical discourse on conflicts of interest and proves the irrefutable positive impact industry-driven research has on scientific progress in ML. In line with this insight, we show that industry papers receive significantly more citations than research from academia (see figure 3a), which is a clear sign that corporate ML research is of high importance for the scientific community. Besides the great attention that is directed towards industry papers, we demonstrate that these papers are not just orientated towards technical issues while omitting social aspects of technology, as one might be tempted to impute. Actually, the number of social impact terms that we used to measure the significance of social aspects is more or less equally shared between academic and industry papers. This finding hints at the fact that something like an ethical “superiority” of academia over industry does not exist, and it makes it difficult to ascribe any kind of ethical blindness to industry research in general.


Tangible problems, however, occur in view of diversity shortcomings (see figure 4). We show that the ratio of female authors compared to male authors of conference papers indicates a slight improvement of gender equality over time. But overall, the proportion of women in ML research is quite small, especially in industry research. Here, amendments are necessary, mainly comprising the creation of more inclusive workplaces, changes in hiring practices, but also an end of pay and opportunity inequalities Crawford.2019. In contrast to issues like innovative strength or citations, industry has a lot of catching up to do here.
In summary, we provide quantitative evidence for the increasing influence tech companies have on ML research. Our analysis reveals three main insights that can inform and differentiate future policies and principles of research ethics. Firstly, the analysis shows that besides the growing number of academic-corporate collaborations, conflicts of interest are not disclosed sufficiently. Secondly, it proves that industry-led papers are not only a strong driving force for promising scientific methods, but possess significantly more citations than academic papers, while being in no way inferior with regard to considerations on social impacts. Thirdly, we provide further evidence for the need to improve gender balance in ML research, especially in industry contexts. Consequently, we recommend to the ML community:

  • All potential conflicts of interest should be declared, especially grants and other support. Industry biases and susceptibility to financial incentives are always potent influences on behavior.

  • Industry papers are ahead of academia with regard to trending topics. It has not yet been determined whether this is due to the genuinely higher quality of industry papers, successful researchers being hired by companies, or if there is another reason. Therefore, more research is needed to disentangle the direction of causality.

  • Lastly, we join the voices demanding improved gender equality in ML research, especially with regard to staffing senior research positions with women.

Broader Impact

An increasing amount of literature is dedicated to technology impact assessment with regard to ML applications Floridi.2020; Madaio.2020; Morley.2019; Rahwan.2019; Reisman.2018; Anderson.2018b; McGonigal.2018; Mittelstadt.2016. Most of these approaches comprise ethics checklists data scientists are supposed to go through in order to check for pitfalls in data collection, storage, analysis, as well as modelling and deployment of their application. Those pitfalls comprise mostly privacy, fairness, accountability, security, or transparency issues Hagendorff.2020c; Jobin.2019. This approach is not suitable for assessing the broader impact of our work. Nevertheless, we want to give a few estimates about the outcomes and impact of our research.

Potential risks

A limitation of our data analysis is the poor performance of the name-to-gender service GenderAPI on names from Asian language families. However, we believe that this does not diminish the general validity of the data, since the gender could not be determined for only about 5% of all authors.
Regarding the ethical aspects and broader societal consequences of our work, we state that no discernible negative outcomes are to be feared with regard to the typical impact dimensions (access to goods, financial, property, reputation, emotional, safety, privacy, liberty, rights) Anderson.2018b. No particular person or organization is put at an unfair disadvantage due to our research. Our use of personal data is based on publicly available information. Moreover, personal data is only used and analysed in an aggregated manner. Our algorithms are fully explainable. Hence we do not face accountability issues. In summary, the overall direction of the paper’s impact is assumed to be positive, for academia as well as the public.

Who benefits from our research?

In contrast to other works which are solely focused on criticising industry involvement and on describing the negative side effects of industry-driven research on the academic world, we adopt a perspective that is, as far as possible, free from preset assumptions or normative opinions for or against industry. Thus, our research, which shows positive effects of industry involvement on scientific progress, actually serves to overcome prejudices. It can help to make public debate more objective, but also to support initiatives that demand a better gender balance and transparency policies in the ML field. Especially in view of the latter, our data analysis provides evidence of the disproportion between existing conflicts of interest and respective disclosure statements in the papers. This insight can be used to support policy measures like the one implemented by NeurIPS organizers, requiring conference submitters to disclose financial conflicts of interest.

Acknowledgements

This research was supported by the Cluster of Excellence “Machine Learning – New Perspectives for Science” funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – Reference Number EXC 2064/1 – Project ID 390727645. Additional funding was provided for K.M., in part, by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – project number 276693517 – SFB 1233, TP 4 Causal inference strategies in human vision.
We would like to thank Felix Wichmann, Ulrike von Luxburg, and Cornelius Schröder for very helpful comments on the manuscript, and Ella Gierss for helping us with the preparation of the manuscript.

References

Appendix Supplementary Material

Section S.1 provides in-depth explanation and all information necessary for reproducing our results. Section S.2 shows additional figures.

s.1 Data Analysis

s.1.1 Code Availability

Unfortunately, due to legal restrictions, we can offer neither our database of papers nor our script for downloading the papers to the public. Without this database, our analysis scripts are not usable. However, we look forward to receiving any questions regarding our approach and are happy to answer all inquiries to ensure reproducibility.

s.1.2 Paper download

In total, we have downloaded 10807 papers. The NeurIPS papers were downloaded from https://papers.nips.cc/ and ICML/CVPR from http://proceedings.mlr.press/. To avoid traffic overload on servers, we recommend a minimum delay between individual paper downloads. The automatic download of NeurIPS papers in particular sometimes fails. This problem can be caused by the download link having a different name than the paper or by the download link being too long for our Linux file system (Ubuntu 18.04 LTS 64 bit). Therefore, every time the download failed, we manually added the corresponding paper to our collection. This is also necessary for reproducing our results.
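Since the original download script cannot be shared, the following is only a minimal sketch of the two precautions described above: a fixed delay between requests and a list of failed downloads to be added manually afterwards. The function names, the `fetch` callable and the 255-byte file name limit (typical for Linux file systems such as ext4) are assumptions made for illustration, not part of the original pipeline.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple

# Assumption: 255 bytes is the usual per-file-name limit on Linux file systems such as ext4.
MAX_NAME_BYTES = 255


def safe_filename(name: str, suffix: str = ".pdf") -> str:
    """Shorten `name` so that name + suffix fits into MAX_NAME_BYTES (UTF-8 encoded)."""
    budget = MAX_NAME_BYTES - len(suffix.encode("utf-8"))
    truncated = name.encode("utf-8")[:budget]
    # errors="ignore" drops a multi-byte character that was cut in half by the truncation.
    return truncated.decode("utf-8", errors="ignore") + suffix


def download_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Dict[str, bytes], List[str]]:
    """Fetch every URL with a pause in between; collect failures for manual handling."""
    results: Dict[str, bytes] = {}
    failed: List[str] = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # pause between requests to limit server load
        try:
            results[url] = fetch(url)
        except Exception:
            failed.append(url)  # these papers have to be added by hand
    return results, failed
```

Passing `fetch` and `sleep` as parameters keeps the sketch independent of any particular HTTP library; in practice `fetch` would wrap whatever download call is used.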

s.1.3 Text conversion

It is a challenge to convert PDFs to text, since the PDF format is not meant for this task. We decided to use the command line tool pdf2text, which converts PDF documents into simpler txt documents. We made sure that the 2-column design is converted correctly. As a further pre-processing step we removed the watermark and headers of the ICML papers. We preserved the bibliography in each case, as this also gives an insight into the content. The supplementary materials, on the other hand, were not included.
With this procedure, we were able to obtain 10772 papers which contain at least the word “the”. The remaining papers are those in which the entire text was present as an image.

s.1.4 Text search

We wrote a simple function that searches the text files. This function is not case-sensitive and finds arbitrary subwords; for example, it finds the term “anonym” within the text segment “anonymous reviewers”. By adding blanks before and after a term, the search can be limited to whole words. To avoid unintentional results, we have compared all terms we have searched for in the dataset with the following list of English words: https://raw.githubusercontent.com/dwyl/english-words/master/words.txt
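The search behaviour described above can be sketched as follows. This is an illustrative reimplementation under the stated rules (case-insensitive substring matching, blanks as part of the term), not the original function; the names `contains_term` and `count_papers` are hypothetical.

```python
from typing import Iterable


def contains_term(text: str, term: str) -> bool:
    """Case-insensitive substring search; blanks in `term` are matched literally."""
    return term.lower() in text.lower()


def count_papers(texts: Iterable[str], term: str) -> int:
    """Number of texts in which `term` occurs at least once."""
    return sum(contains_term(text, term) for text in texts)


# Subword match: "anonym" is found inside "anonymous".
print(contains_term("We thank the anonymous reviewers.", "anonym"))
# Padding with blanks restricts the match to a separate word, so "admit" is not a hit.
print(contains_term("We admit that", " MIT "))
```

Note that padding with blanks is a deliberately simple word boundary: a term directly followed by punctuation or a line break (for example “MIT,”) is not matched by this sketch.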

For the search for social impact words we used the list from the corresponding NeurIPS 2020 subject area. However, we removed some words to improve the results. Caution is necessary here, as one might receive false positive results when, for example, “anonymity” is shortened to “anonym”, since many authors thank their anonymous reviewers. Finally, we used the terms “AI safety”, “fairn”, “accountab”, “transpar”, “privacy”, “anonymity”, “security” and the terms “ethic” and “explainab”.
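To illustrate why the full term “anonymity” was kept, the following sketch applies the final term list to two short example sentences. The term list is taken from the paragraph above; the function name and the example sentences are made up for illustration.

```python
from typing import List

# Final search terms as listed in the text (matched case-insensitively as subwords).
SOCIAL_IMPACT_TERMS = [
    "ai safety", "fairn", "accountab", "transpar", "privacy",
    "anonymity", "security", "ethic", "explainab",
]


def social_impact_hits(text: str) -> List[str]:
    """Return all social impact terms that occur in `text`."""
    lowered = text.lower()
    return [term for term in SOCIAL_IMPACT_TERMS if term in lowered]


# A typical acknowledgement sentence produces no hit with "anonymity" ...
print(social_impact_hits("We thank the anonymous reviewers."))
# ... whereas the shortened stem "anonym" would have matched it.
print("anonym" in "We thank the anonymous reviewers.".lower())
# Truncated stems still cover inflected forms such as "fairness" and "transparency".
print(social_impact_hits("We discuss fairness and transparency of the model."))
```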

s.1.5 Affiliation extraction

We are not only interested in analyzing the content of the papers, but also their origin. Therefore, we have tried to extract the headers of the papers. This was no problem for NeurIPS or CVPR papers. For these papers, we simply extracted all content before the word “abstract”. In most cases, there were no issues. Very rarely, a figure appeared before the abstract or authors changed the standard template. The same procedure worked for ICML 2015 and 2016. However, from 2017 onwards, the affiliations were shown in the lower left corner. They were not preceded by any keyword, only by a blank line. This was difficult to parse with our script. We thus decided to keep the first 5000 characters as header for these papers, but split it before the terms “international conference of machine learning”, which always ended the listing of authors. We think that this yields only a small number of false positives when we search for affiliations, since it is most likely that the academic and industry institutional terms will appear in the affiliations only.
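The two header rules described above can be sketched as follows. This is an illustration under the stated rules only; the function names are hypothetical, the marker strings are copied from the text, and the behaviour when a marker is missing (returning the unsplit text) is an assumption.

```python
import re

ABSTRACT_MARKER = "abstract"
# Marker string as given in the text; it ends the author listing in ICML papers from 2017 onwards.
ICML_MARKER = "international conference of machine learning"


def _split_before(text: str, marker: str) -> str:
    """Return everything before the first case-insensitive occurrence of `marker`."""
    match = re.search(re.escape(marker), text, flags=re.IGNORECASE)
    # Assumption: if the marker is missing, the text is returned unchanged.
    return text[: match.start()] if match else text


def extract_header(text: str, footer_layout: bool = False, max_chars: int = 5000) -> str:
    """Extract the part of a paper that is searched for affiliations.

    footer_layout=False: NeurIPS, CVPR, ICML 2015/2016 (affiliations above the abstract).
    footer_layout=True:  ICML from 2017 onwards (affiliations in the lower left corner).
    """
    if footer_layout:
        return _split_before(text[:max_chars], ICML_MARKER)
    return _split_before(text, ABSTRACT_MARKER)
```

Because the footer variant keeps part of the abstract and introduction in the header, institution names mentioned in the running text of the first page can lead to false positives, which is the trade-off discussed above.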

To get an impression of which institutions publish at NeurIPS, CVPR and ICML, we drew on existing analyses:

To prevent us from cherry-picking we only used terms appearing in the analyses above. We define a paper as academic if it contains one of the following terms:

California Institute of Technology / Ecole / EPFL / ETH Zürich / INRIA / Kaist / Massachusetts Institute of Technology / MILA / MIT / ParisTech / Planck / RIKEN / TU Darmstadt / Université / Universiteit / University.

For the definition of a paper as industry we use the following terms:

Adobe / AITRICS / Alibaba / Amazon / Ant Financial / Apple / Bell Labs / Bosch / Criteo / Data61 / DeepMind / Expedia / Facebook / Google / Huawei / IBM / Intel / Kwai / Microsoft / NEC / Netflix / NTT / Nvidia / Petuum / Qualcomm / Salesforce / Siemens / Tencent / Toyota / Uber / Vector Institute / Xerox / Yahoo / Yandex.

We perform a non-exclusive classification: papers may have both academic and industry affiliations. It is important to note that we included blanks before and after the terms MIT, NEC and Intel to avoid matches inside other words such as “admit”.
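A minimal sketch of this non-exclusive classification is given below. The term lists are copied from the text, with MIT, NEC and Intel padded with blanks; the function name and the lowercase matching are assumptions based on the case-insensitive search described in S.1.4.

```python
from typing import Tuple

# Terms from the text, lowercased; " mit ", " nec " and " intel " are padded with blanks.
ACADEMIC_TERMS = [
    "california institute of technology", "ecole", "epfl", "eth zürich", "inria",
    "kaist", "massachusetts institute of technology", "mila", " mit ", "paristech",
    "planck", "riken", "tu darmstadt", "université", "universiteit", "university",
]
INDUSTRY_TERMS = [
    "adobe", "aitrics", "alibaba", "amazon", "ant financial", "apple", "bell labs",
    "bosch", "criteo", "data61", "deepmind", "expedia", "facebook", "google", "huawei",
    "ibm", " intel ", "kwai", "microsoft", " nec ", "netflix", "ntt", "nvidia",
    "petuum", "qualcomm", "salesforce", "siemens", "tencent", "toyota", "uber",
    "vector institute", "xerox", "yahoo", "yandex",
]


def classify_header(header: str) -> Tuple[bool, bool]:
    """Return (is_academic, is_industry); both flags may be True for mixed papers."""
    lowered = header.lower()
    is_academic = any(term in lowered for term in ACADEMIC_TERMS)
    is_industry = any(term in lowered for term in INDUSTRY_TERMS)
    return is_academic, is_industry


print(classify_header("Jane Doe, University of Tübingen"))
print(classify_header("Jane Doe MIT and DeepMind"))
print(classify_header("We admit nothing"))
```

Unpadded short terms remain a possible source of false positives in such a scheme; for example, “uber” would also match inside a surname such as “Huber”.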

s.1.6 Extract acknowledgements

We extracted the acknowledgements for our conflict of interest analysis. In this particular analysis, we focused on academic papers. Our data set contains 6632 papers from academia. Of these, 5221 papers (78.7%) contain an acknowledgement section which we were able to parse. We included both spellings: “acknowledgement” and “acknowledgment”.
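One way to locate the acknowledgement section with both spellings is sketched below. The regular expression, the function name and the rule that the section ends at the word “References” are assumptions made for illustration; the original parsing script is not available.

```python
import re
from typing import Optional

# Matches "acknowledgement(s)" and "acknowledgment(s)" in any capitalisation.
ACK_PATTERN = re.compile(r"acknowledge?ments?", flags=re.IGNORECASE)
# Assumption: the acknowledgement section ends where the reference list begins.
REFERENCES_PATTERN = re.compile(r"\breferences\b", flags=re.IGNORECASE)


def extract_acknowledgements(text: str) -> Optional[str]:
    """Return the text after the first acknowledgement keyword, or None if absent."""
    start = ACK_PATTERN.search(text)
    if start is None:
        return None
    rest = text[start.end():]
    end = REFERENCES_PATTERN.search(rest)
    section = rest[: end.start()] if end else rest
    return section.strip()


# Share of academic papers with a parsable acknowledgement section, as reported above.
print(round(100 * 5221 / 6632, 1))
```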

s.1.7 Authors and genders extraction

The authors were not imported from the PDFs, but from the websites. We found a total of 41939 author entries. However, many authors have written more than one paper. Therefore, we decided to pool all authors with the same name. Of course, this has the effect that different authors with the same name are pooled as well. We believe that this effect is negligible. For authors with middle names, we kept only the first letter of the middle name, since there is great variation in how people give their middle name, e.g. T., T, or Tom. This procedure gives us 18060 unique authors.
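The pooling of author names can be sketched as follows. The function name and the exact handling of dots and whitespace are assumptions; the sketch only illustrates the rule that middle names are reduced to their first letter before identical names are pooled.

```python
from typing import Iterable, Set


def normalize_author(name: str) -> str:
    """Reduce all middle names to their first letter: 'John Tom Smith' -> 'John T Smith'."""
    parts = name.replace(".", " ").split()
    if len(parts) <= 2:
        return " ".join(parts)
    first, *middle, last = parts
    initials = [m[0].upper() for m in middle]
    return " ".join([first, *initials, last])


def unique_authors(names: Iterable[str]) -> Set[str]:
    """Pool all author entries that share the same normalized name."""
    return {normalize_author(name) for name in names}


entries = ["John T. Smith", "John T Smith", "John Tom Smith", "Jane Doe"]
print(unique_authors(entries))
```

As noted above, this pooling cannot distinguish different people who share the same normalized name.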

For these unique authors, we determined the gender using the commercial tool GenderAPI. GenderAPI also provides an estimate of the accuracy. The mean accuracy in our case was 87.1%. Unfortunately, we noticed that GenderAPI in most cases fails to recognize names from Asian language families. This is a clear bias in the underlying dataset of GenderAPI. Furthermore, we want to acknowledge that some people reject the idea that a name corresponds to a gender. However, we applied the gender analysis here to gain insight into the inequality of authors’ genders on average, not in single cases.

s.1.8 Downloading citations

Finally, we conducted a citation analysis. To do this, we first downloaded the titles of the papers from the websites with Beautiful Soup. We then wrote an automated script to access the Microsoft Academic Knowledge API Sinha2015. This was successful for 10616 papers (98.2%, date of citation download: 03.29.2020). The most common reason for a paper not being found in the database is the use of special characters in the title.

s.2 Additional Figures

s.2.1 Figure A1

Figure SF.1: Median number of citations received for pure academic (red), mixed (orange) and industry (blue) papers.

s.2.2 Figure A2

Figure SF.2: Ratio of academic (red), mixed (orange) and industry (blue) papers which include the term “convolution” (a) or “deep” (b).

s.2.3 Figure A3

Figure SF.3: Ratio of female last authors in all academic papers (blue line) and in academic papers with industry funding acknowledgement (red line).