Gender-based homophily in collaborations across a heterogeneous scholarly landscape

Using the corpus of JSTOR articles, we investigate the role of gender in collaboration patterns across the scholarly landscape by analyzing gender-based homophily–the tendency for researchers to co-author with individuals of the same gender. For a nuanced analysis of gender homophily, we develop methodology necessitated by the fact that the data comprises heterogeneous sub-disciplines and that not all authorships are exchangeable. In particular, we distinguish three components of gender homophily in collaborations: a structural component that is due to demographics and non-gendered authorship norms of a scholarly community, a compositional component which is driven by varying gender representation across sub-disciplines, and a behavioral component which we define as the remainder of observed homophily after its structural and compositional components have been taken into account. Using minimal modeling assumptions, we measure and test for behavioral homophily. We find that significant behavioral homophily can be detected across the JSTOR corpus and show that this finding is robust to missing gender indicators in our data. In a secondary analysis, we show that the proportion of female representation in a field is positively associated with significant behavioral homophily.


1 Measuring homophily in co-authorships

1.1 Quantifying assortativity

We consider the unit of analysis to be an authorship–an instance of co-authoring a single article–rather than an author who may have co-authored multiple articles; implications are addressed in the Discussion. For a set of authorships, Ted Bergstrom’s $\alpha$ (bergstrom2003algebra) is $\alpha = p - q$, where $p$ is the probability that a randomly selected co-author of a randomly selected male authorship is male, and $q$ is the probability that a randomly selected co-author of a randomly selected female authorship is male.

Our analysis framework is not dependent on a particular metric; however, we use $\alpha$ due to its interpretation as a difference in risks and its connection to Wright’s coefficient of inbreeding (wright1949genetical). Indeed, $\alpha$ is a generalization of Wright’s coefficient to the multi-author scenario, and the two measures are equivalent when all papers have two authors. bergstrom2016index show that $\alpha$ is equal to the observed coauthor-gender correlation in a given collection of papers, while wang2016relationship show that $\alpha$ is equivalent to Newman’s network-based assortativity coefficient (newman2003mixing) in an appropriately weighted network.

For a set of authorships, the possible values of $\alpha$ depend on various structural aspects such as the gender ratio, the total number of authorships, and the number of authorships on each paper. For concreteness, suppose all papers have two authors, and let $\rho$ be the proportion of the less frequent gender. If the proportion of female-male papers is what we would expect under random pairings, $2\rho(1-\rho)$, then $\alpha = 0$; if there are no female-male papers, $\alpha = 1$; if the proportion of female-male papers is the largest attainable value, $2\rho$, then $\alpha = -\rho/(1-\rho)$, which is between $-1$ and $0$.

1.2 Homophily in a heterogeneous setting

When analyzing papers from across the scholarly landscape, observing a large value of $\alpha$ is not sufficient to indicate that gender plays an active role in co-authorship decisions. To elucidate this point, we distinguish between the structural, behavioral, and compositional aspects of gender homophily.

Figure 1: An example co-authorship network that exhibits negative $\alpha$ in each of the fields $A$ and $B$, but results in positive $\alpha$ for the aggregated data. Each square represents a paper, circles represent authorships, and the color of each circle indicates gender. Compositional homophily drives this counter-intuitive result.

Within a tightly focused intellectual community, if co-authorships are formed randomly, the expected $\alpha$ should be near $0$, but the exact distribution of $\alpha$ in each community depends on the various structural aspects mentioned previously. For instance, in Figure 1, when permuting authorships within their fields without regard to gender, the expected values of $\alpha$ for $A$ and $B$ are not exactly $0$. We use structural homophily to describe the deviation of $\alpha$ from 0 that arises due to these structural aspects.

When examining aggregations of intellectual communities, we no longer expect to see teams formed completely at random because individuals are more likely to co-author with others who share intellectual interests. Given that gender representation varies across disciplines, collaboration along the lines of shared interests generates homophily that we denote compositional homophily. Figure 1 provides an illustrative example with two fields, $A$ and $B$, with 4 papers each. Within each field, the observed configuration of authorships results in a negative $\alpha$; however, when all 8 papers are aggregated, $\alpha$ is positive. This occurs because the proportion of females in $B$ is much greater than that in $A$. If the intellectual interests in $A$ differ from those in $B$, it would be reasonable to conclude that the observed gender homophily is actually driven by discipline homophily. blau1977inequality uses “consolidation” to describe homophily induced by factors (in this case discipline) which are associated with the factor of interest (in this case gender). In statistics, it is a case of Simpson’s paradox (simpson1951contingency) where homophily is confounded by gender imbalances across scholarly fields. In population genetics, this phenomenon is known as the Wahlund effect (walhund1928zuzammensetzung): random mating within subpopulations does not imply Hardy-Weinberg equilibrium in the population as a whole. Indeed, under the Wahlund effect, we would not expect Wright’s $F$ (which would be equivalent to $\alpha$) to be 0.
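The aggregation effect can be reproduced with a small hypothetical example. The two fields below are invented for illustration and are not the configuration drawn in Figure 1; the helper `alpha` is a compact, assumed implementation of Bergstrom's index for papers written as strings of "M"/"F" labels.

```python
def alpha(papers):
    """alpha = p - q, with each paper given as a string of "M"/"F" labels."""
    male, female = [], []
    for genders in papers:
        n_male = genders.count("M")
        for g in genders:
            # Share of this authorship's co-authors who are male.
            share = (n_male - (g == "M")) / (len(genders) - 1)
            (male if g == "M" else female).append(share)
    return sum(male) / len(male) - sum(female) / len(female)


# Hypothetical fields (not the configuration shown in Figure 1).
field_a = ["MM", "MM", "MM", "MF"]   # mostly male
field_b = ["FF", "FF", "FF", "MF"]   # mostly female

print(alpha(field_a))            # negative: -1/7
print(alpha(field_b))            # negative: -1/7
print(alpha(field_a + field_b))  # positive: 0.5
```

Within each hypothetical field the single minority authorship is paired across gender, so mixing is as high as it can be, yet pooling the two fields yields a clearly positive value because the fields differ sharply in gender composition.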

In contrast to structural and compositional homophily, which could occur even if authors select co-authors irrespective of gender, we use behavioral homophily to describe deviations of $\alpha$ from its expected value under structural and compositional homophily which could be due to explicit or implicit consideration of gender when selecting co-authors.

These notions of homophily map onto the two components of homophily discussed by mcpherson2001birds–baseline homophily and inbreeding homophily–in a context of voluntary and professional network ties such as those of friendship, support, and advice. Specifically, they described baseline homophily as homophily “created by the demography of the potential tie pool” and inbreeding homophily as “homophily measured as explicitly over and above the opportunity set” (mcpherson2001birds, p. 419). Our notion of structural homophily aligns with the baseline homophily of mcpherson2001birds, though we prefer the term “structural,” which does not have temporal connotations (i.e., baseline observations in longitudinal studies). mcpherson2001birds emphasize that their definition of inbreeding homophily does not refer to “choice homophily purified of structural factors,” but instead encompasses “homophily induced by social structures below the population level to homophily induced by other dimensions with which the focal dimension is correlated, and to homophily induced by personal preferences” (mcpherson2001birds, p. 419). Indeed, compositional homophily accounts for homophily induced by “structures below the population level” and the correlated dimension of intellectual interests, and we assume the remaining homophily–behavioral homophily–is “induced by personal preferences.” We acknowledge, however, that the behavioral homophily we measure may not be strictly due to gender and can be potentially correlated with other social stratification dimensions that we don’t observe in our data such as race or ethnicity.

1.3 Data

In the JSTOR corpus, “subpopulations” which may create compositional homophily correspond to tight intellectual communities that focus on similar research questions. To identify these communities, we apply a hierarchical implementation of the InfoMap network clustering algorithm to the citation network on the JSTOR corpus  (rosvall2008maps; rosvall2011multilevel). The algorithm reveals the hierarchical structure of the corpus through efficient coding of random walks on the citation network. At the lowest level of the clustering, each paper is grouped into one of 1,450 terminal fields which form the finest partition of the data. These terminal fields are indicative of scholarly communities tied by shared narrow research topics or methodologies. Each higher level of the clustering forms a progressively coarser partition of the documents by aggregating terminal fields into composite fields. Finally, there are 24 identified top-level fields that are indicative of disciplinary divisions (west2013role) such as molecular and cell biology, economics, statistics, and sociology. The hierarchical structure obtained from the InfoMap algorithm has up to 6 levels. At any given level of hierarchy, papers in a common field are more connected to each other via citations than they are to papers from neighboring fields. Likewise, the fields defined by a lower (finer) level of the hierarchy are more connected than fields defined by a higher (coarser) level in the hierarchical clustering. This hierarchical clustering allows us to test for behavioral homophily at varying levels of granularity. An interactive browser of the clustering can be accessed at the Eigenfactor browser: http://eigenfactor.org/projects/gender_homophily.

We include all papers clustered into one of the terminal fields that were published between 1960 and 2011 and have more than one author. After the cleaning procedure described in Section 4, this amounts to 252,413 papers with 807,588 authorships. We impute the gender of authorships using first names, as discussed in Section 4.1 and the supplement.

2 Results

2.1 Measuring Behavioral Homophily

To estimate the contributions of structural, compositional, and behavioral homophily to the overall observed homophily, we compare the observed $\alpha$ to $\alpha$ measured on plausible hypothetical configurations which aim to reflect all relevant aspects of co-authorship choice except behavioral homophily. Specifically, we sample these configurations from the null distribution–described below and given explicitly in (2)–that encodes the null hypothesis of no behavioral homophily. Roughly speaking, we fix the papers and field structure and shuffle authorships so that co-authorships are formed without regard to gender. Systematic differences between the observed $\alpha$ and the values from the null would suggest behavioral gender homophily.

To reflect the underlying structural homophily, we restrict the distribution to configurations that preserve structural aspects of our data: i.e., the total number of male/female authorships, the number of authorships on each paper, and the number of papers/authorships in each field. To capture compositional homophily and scholarly connectivity across terminal fields, we allow inter-terminal field swaps with a probability proportional to the flow of citations between the authorship’s original terminal field and other terminal fields in the corpus. Configurations with authorships in their original or nearby (as defined by citation flows) terminal fields are much more likely than configurations where they are far away; for almost all cases, an authorship remains in its original terminal field with probability above 0.9. This ensures that under the null distribution, the gender ratio of any terminal field stays close to the observed ratio. However, inter-field swaps may occur with small probability to reflect cross-field collaborations; this also makes the null distribution less sensitive to the otherwise discrete assignment of documents to terminal fields. Finally, we treat authorships within a terminal field as exchangeable; i.e., in the counterfactual world, all authorships are equally likely to appear on any other paper in the terminal field and co-authorships are formed without regard to gender.

We calculate $\alpha$ for each configuration and test for the presence of behavioral homophily. The p-value for each field is the proportion of $\alpha$ values from the null distribution which are greater than or equal to the observed $\alpha$. A small p-value implies that under only structural and compositional factors, the observed $\alpha$ is unlikely to occur and suggests the presence of behavioral homophily. Direct sampling from the distribution is intractable, so we use a Markov chain Monte Carlo Metropolis-Hastings procedure.
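The p-value calculation can be sketched for the simplest case of a single terminal field with no cross-field swaps, where uniformly shuffling gender labels over the fixed authorship slots is enough and no Markov chain is needed. This is a deliberate simplification of the null distribution described above; the names `alpha` and `permutation_p_value`, the sample count, and the seed are illustrative assumptions.

```python
import random


def alpha(papers):
    """alpha = p - q, with each paper given as a list of "M"/"F" labels."""
    male, female = [], []
    for genders in papers:
        n_male = genders.count("M")
        for g in genders:
            share = (n_male - (g == "M")) / (len(genders) - 1)
            (male if g == "M" else female).append(share)
    return sum(male) / len(male) - sum(female) / len(female)


def permutation_p_value(papers, n_samples=2000, seed=0):
    """Share of gender-blind shuffles whose alpha is at least the observed alpha."""
    rng = random.Random(seed)
    observed = alpha(papers)
    sizes = [len(p) for p in papers]
    pool = [g for p in papers for g in p]    # gender labels of all authorships
    hits = 0
    for _ in range(n_samples):
        rng.shuffle(pool)
        shuffled, start = [], 0
        for k in sizes:                      # refill papers, keeping their sizes
            shuffled.append(pool[start:start + k])
            start += k
        # Small tolerance so that ties are counted despite float rounding.
        if alpha(shuffled) >= observed - 1e-12:
            hits += 1
    return observed, hits / n_samples


segregated_field = [["M", "M"]] * 5 + [["F", "F"]] * 5
observed, p_value = permutation_p_value(segregated_field)
```

Shuffling labels preserves the number of male and female authorships and the number of authorships on each paper, which are the structural constraints named in the text; the cross-field component requires the weighted sampler described in Section 4.2.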

2.2 Main Analysis

Table 1 summarizes results for the entire JSTOR corpus and all top-level fields. The first column gives both the observed $\alpha$ and the expected $\alpha$ from the null distribution. The expected $\alpha$ is positive for every top-level field, implying that even when collaborator choices are gender-blind, same-gender co-authorships are expected to occur more often simply because of the structure and gender composition of these fields. Also, the observed $\alpha$ exceeds the expected $\alpha$ in all top-level fields. Figure 2 provides a representation of the hierarchical clustering for Economics; the observed $\alpha$ is .11 (Table 1), but given only structural and compositional homophily, we would expect an $\alpha$ of .02. Similar illustrations for all fields are available in the interactive browser.

For concrete interpretation, consider a setting where every field consists of 100 two-author papers. Then for a given $\alpha$ and proportion of female authorships $f$, $200 f (1-f)(1-\alpha)$ is the number of heterophilous (female-male) papers. In the FM-Papers column (Table 1), we report this quantity for the observed/expected $\alpha$ of each top-level field, setting $f$ to the proportion of observed female authorships in that field. Note that the magnitude of changes in $\alpha$ does not always correspond to observed papers in a direct way; e.g., Education and Organizational and Marketing have similar observed and expected $\alpha$ values, but the difference between the observed and expected heterophilous papers is larger in Education than in Organizational and Marketing (6.0 vs 4.2) because Education has a larger proportion of female authorships.
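The conversion between $\alpha$ and a count of female-male papers follows from the definition of $\alpha$ once every paper has two authors: with a female authorship share $f$ and a fraction $x$ of female-male papers, $\alpha = 1 - x / (2 f (1 - f))$. The sketch below encodes that relation and its inverse; the function names and the default of 100 papers are assumptions made for illustration.

```python
def heterophilous_papers(alpha, female_share, n_papers=100):
    """Female-male papers implied by alpha when every paper has two authors."""
    return 2 * n_papers * female_share * (1 - female_share) * (1 - alpha)


def alpha_from_heterophilous(n_fm, female_share, n_papers=100):
    """Inverse map: alpha implied by a given number of female-male papers."""
    return 1 - n_fm / (2 * n_papers * female_share * (1 - female_share))


print(heterophilous_papers(0.0, 0.5))   # 50.0, random pairing with equal shares
print(heterophilous_papers(1.0, 0.3))   # 0.0, no mixed papers
```

Because the count scales with $f(1-f)$, the same gap between observed and expected $\alpha$ translates into more papers in fields whose gender shares are closer to parity.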

We perform hypothesis tests for all top-level, composite, and terminal fields using p-values adjusted by the Benjamini-Yekutieli (benjamini2001control) procedure to control the false discovery rate. The procedure is likely quite conservative in our setting; in the supplement, we provide results for less conservative multiple testing procedures and different false discovery rates. We reject the null hypothesis of no behavioral homophily in the JSTOR corpus, in 20/24 (83%) top-level fields, in 82/280 (29%) of composite fields (not including the top fields), and in 124/1450 (9%) terminal fields. Across JSTOR, and in almost every top-level field, the incidence of significant behavioral homophily in composite fields is at least as large as the incidence among terminal fields. We posit two reasons for this. First, composite fields are an aggregation of terminal fields, and we expect that behavioral homophily in a composite field aggregates roughly as an “or” operator over its terminal fields (i.e., behavioral homophily in a single terminal field typically implies behavioral homophily in the corresponding composite field). However, as seen in the Eigenfactor browser, there are composite fields with significant homophily despite having no significant terminal fields. Thus, we also posit that composite fields have higher testing power due to their larger size. In general, there is a trade-off between increasing testing power by aggregating data and controlling for confounders by analyzing the data at a fine-grained level. This highlights the benefit of our approach, which allows testing homophily in composite fields, where we have more power, while still accounting for compositional effects.
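For readers unfamiliar with the adjustment, the following sketch computes Benjamini-Yekutieli adjusted p-values in plain Python: each sorted p-value is scaled by the number of tests times a harmonic-sum factor, divided by its rank, and monotonicity is enforced from the largest p-value downward. The function name is an illustrative assumption, and the input values are made up.

```python
def benjamini_yekutieli(p_values):
    """Return Benjamini-Yekutieli adjusted p-values in the input order."""
    m = len(p_values)
    if m == 0:
        return []
    c_m = sum(1.0 / k for k in range(1, m + 1))   # harmonic-sum penalty
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping a running minimum
    # so that adjusted values are non-decreasing in the raw p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m * c_m / rank)
        adjusted[i] = running_min
    return adjusted


example = benjamini_yekutieli([0.01, 0.04, 0.03, 0.5])
```

A field is declared significant when its adjusted p-value falls below the chosen false discovery rate; the harmonic-sum factor is what makes this procedure more conservative than the Benjamini-Hochberg adjustment.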

| Field | $\alpha$ (Obs / Exp) | FM-Papers (Obs / Exp) | P-value | Term (Signif / Total) | Comp (Signif / Total) |
|---|---|---|---|---|---|
| JSTOR | .11/.05 | 38.6/41.1 | .00 | 124/1450 | 82/280 |
| Mol/Cell Bio | .05/.01 | 38.2/39.8 | .00 | 29/178 | 19/44 |
| Eco/Evol | .06/.02 | 31.9/33.3 | .00 | 17/257 | 15/56 |
| Economics | .11/.02 | 18.7/20.8 | .00 | 9/136 | 11/28 |
| Sociology | .19/.07 | 38.5/44.2 | .00 | 13/94 | 12/21 |
| Prob/Stat | .09/.03 | 26/27.8 | .00 | 1/90 | 2/23 |
| Org/mkt | .16/.04 | 29/33.2 | .00 | 8/68 | 3/4 |
| Education | .16/.04 | 41.2/47.2 | .00 | 12/42 | 6/10 |
| Occ Health | .10/.02 | 41.7/45.5 | .00 | 12/24 | 1/1 |
| Anthro | .12/.03 | 38.5/42 | .00 | 5/63 | 2/8 |
| Law | .17/.08 | 29.7/32.9 | .00 | 0/98 | 1/16 |
| History | .16/.07 | 32.9/36.5 | .00 | 0/49 | 1/6 |
| Phys Anthro | .07/.01 | 34.7/36.8 | .00 | 1/32 | 2/10 |
| Intl Poli Sci | .09/.02 | 27.3/29.1 | .03 | 0/34 | 0/2 |
| US Poli Sci | .15/.07 | 25.2/27.4 | .00 | 2/37 | 1/6 |
| Philosophy | .10/.03 | 18.7/20.3 | .03 | 0/45 | 0/8 |
| Math | .04/.01 | 14.3/14.7 | 1.00 | 0/46 | 0/9 |
| Vet Med | .09/.01 | 38.3/41.6 | .00 | 7/19 | 1/2 |
| Cog Sci | .18/.09 | 35.7/39.4 | .00 | 4/14 | 3/3 |
| Radiation | .09/.01 | 34/36.8 | .00 | 3/14 | 1/5 |
| Demography | .15/.06 | 40.3/44.4 | .00 | 0/20 | 1/2 |
| Classics | .07/.01 | 38.7/41.2 | .27 | 0/35 | 0/8 |
| Opr Res | .03/.00 | 16.6/17.1 | .73 | 0/18 | 0/4 |
| Plant Phys | .08/.02 | 29.1/31 | .03 | 1/21 | 0/3 |
| Mycology | .03/.01 | 36.8/37.5 | 1.00 | 0/16 | 0/1 |

Table 1: Results for the JSTOR corpus and each top-level field, sorted largest to smallest (top to bottom) by number of authorships. The $\alpha$ column gives the observed value and the expected value under no behavioral homophily; the FM-Papers column gives the number of heterophilous papers corresponding to the observed and expected $\alpha$. The “P-value” column gives the Benjamini-Yekutieli adjusted p-value for the top-level field; “Term” and “Comp” give the numbers of significant/total terminal and composite fields, respectively.

Figure 2: Gender homophily across sub-disciplines. The top panel shows the hierarchical clustering of Economics; the height of each rectangle indicates its size relative to the top-level field, and a darker shade of green indicates a smaller p-value. The bottom panel shows the histogram of $\alpha$ values when accounting for structural and compositional effects, with the observed value indicated by the vertical red line. An interactive version of this figure for exploring other disciplines can be found in the online browser.

In the supplement, we compare results from our approach to those from a naive approach that does not account for compositional homophily and only accounts for structural homophily by treating all individuals within a given composite or top-level field as exchangeable. In this analysis, we find significant homophily in 21/24 top-level fields and 157/280 composite fields. Unsurprisingly, in almost all cases, the expected $\alpha$ under the null distribution which accounts for compositional homophily is larger than the expected $\alpha$ when only accounting for structural homophily.

2.3 Secondary Analysis

We also test whether certain characteristics of a terminal field are associated with significant behavioral homophily observed in the multi-author papers. We fit a logistic regression for all terminal fields where the outcome is whether or not statistically significant behavioral homophily is detected ($Y_i$) and the covariates are the ratio of the % of solo-authorships which are female to the % of authorships on multi-authored papers which are female ($R_i$), the proportion of all authorships (solo and multi) which are female ($F_i$), and the log of the number of authorships ($\log N_i$).

Previous work (boschini2007team) has shown that gender homophily is positively associated with increased female representation. We also include an indicator ($M_i$) for whether the field is majority female (i.e., $F_i > 0.5$) and the interaction term ($M_i F_i$); this allows the association between behavioral homophily and female % to change slope in majority female terminal fields since gender dynamics may systematically differ in majority female fields. Furthermore, where concerns about gender discrimination are common, we might observe a positive association across subfields between relative rates of solo-authorship for the lower-frequency gender (women in most cases) and increased behavioral homophily, since both would be rational choices in reaction to gender discrimination in collaboration (rubin2017discrimination; oconnor2016dynamics; ferber1980disadvantage; mcdowell1992effect).

We calculate robust standard errors using a generalized estimating equation procedure with a diagonal working covariance where the clusters correspond to top-level fields.

$$\operatorname{logit} P(Y_i = 1) = \beta_0 + \beta_1 R_i + \beta_2 F_i + \beta_3 \log N_i + \beta_4 M_i + \beta_5 M_i F_i \qquad (1)$$

The ratio of female solo-authors to multi-authors ($R_i$) is not significant; however, field size ($\log N_i$) and the proportion of females ($F_i$) have a statistically significant positive association with behavioral homophily. The estimate of the interaction term ($M_i F_i$) is negative, but neither the majority female indicator nor the interaction term is significant (p-values of .07 and .052). Further details are provided in the supplement.

2.4 Sensitivity to missing gender indicators

Our main analysis used gender indicators for the 87.9% of authorships with first names that are used predominantly for one gender and removed the other 12.1% of authorships (see Section 4.1). This rate of missingness compares favorably with previous studies (sugimoto2013global), and we explore the impact of missingness with a sensitivity analysis using two multiple imputation strategies.

The first strategy imputes each missing indicator according to the proportions of assigned genders in its original terminal field. This assumes that there is no behavioral homophily in the missing data because the imputed genders are conditionally independent given the terminal field, providing a reasonable lower bound on the homophily we might have obtained given the full data. The second strategy imputes each missing gender indicator according to the proportions of assigned genders on its original paper; if a paper contains only unassigned authorships, we impute a single gender for all authorships according to the proportions of assigned genders for the terminal field. By construction, papers with one or no assigned authorships are always homophilous; thus, this imputation strategy provides a reasonable upper bound on the homophily we might have observed given the full data. We repeat the main analysis procedure to test for behavioral homophily in each of the imputed data sets; details are given in the supplement. For each strategy, Table 2 shows the average proportion (across 10 imputations) of fields with significant behavioral homophily. In both strategies, we assume that the observed gender proportions are good estimates of the true gender proportions, and we do not address bias which may be induced if one gender is more likely to be unidentified than the other. As a point of reference, Sugimoto et al. (sugimoto2013global) hand-check a sample of 1000 authorships randomly selected across all fields. For names for which no prior records existed, the proportions of men and women (.68 and .32) were consistent with the proportions of men and women in the classified names (.69 and .31 for author-name combinations). In names which were not classified due to prevalent use for both genders, men were slightly overrepresented (.79 and .21).
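The two imputation strategies can be sketched for a single terminal field as follows. This is a minimal illustration under stated assumptions: papers are lists of "M", "F", or `None` (a missing indicator), the field contains at least one assigned authorship, and the function names `impute_low` and `impute_high` are hypothetical labels for the lower-bound and upper-bound strategies.

```python
import random


def female_share(papers):
    """Share of female labels among the assigned authorships of a field."""
    known = [g for p in papers for g in p if g is not None]
    return sum(g == "F" for g in known) / len(known)


def draw(share, rng):
    """Draw "F" with the given probability, otherwise "M"."""
    return "F" if rng.random() < share else "M"


def impute_low(papers, rng):
    """Strategy 1: impute each missing label from the field-level proportions."""
    f_field = female_share(papers)
    return [[g if g is not None else draw(f_field, rng) for g in p] for p in papers]


def impute_high(papers, rng):
    """Strategy 2: impute from the paper's own assigned labels when available."""
    f_field = female_share(papers)
    completed = []
    for p in papers:
        known = [g for g in p if g is not None]
        if known:
            f_paper = sum(g == "F" for g in known) / len(known)
            completed.append([g if g is not None else draw(f_paper, rng) for g in p])
        else:
            # No assigned authorships: one shared draw for the whole paper.
            completed.append([draw(f_field, rng)] * len(p))
    return completed


field = [["F", None, None], [None, None], ["M", "F", None], ["M", "M"]]
low = impute_low(field, random.Random(1))
high = impute_high(field, random.Random(1))
```

Under the second strategy a paper with exactly one assigned authorship always becomes single-gender, which is why that strategy pushes the measured homophily upward.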

| Analysis | Terminal | Composite | Top |
|---|---|---|---|
| Main | 0.09 | 0.29 | 0.83 |
| Sensitivity - low homophily | 0.07 | 0.25 | 0.78 |
| Sensitivity - high homophily | 0.54 | 0.82 | 1.00 |

Table 2: Impact of missing gender indicators. For each strategy, we show the average proportion of terminal, composite, and top-level fields exhibiting statistically significant behavioral homophily.

3 Discussion

When controlling for the hierarchical structure of scholarly communities and for field-specific cultures of collaboration, we observe behavioral gender homophily in co-authorships across wide swaths of the JSTOR corpus. This holds across all levels of granularity, from top-level scholarly fields to intellectually narrow terminal fields.

Although we focus on gender and co-authorship, our methodology generalizes to studying homophily in other contexts where confounding occurs. For example, racial homophily may be confounded by spatial structure and homophily by illicit substance use in adolescents may be confounded by age or peer environment. Using a similar sampling procedure to control for observable structures could allow for a more nuanced analysis of homophily in these contexts as well.

While this methodology represents a substantial step in understanding homophily by allowing identification of its structural, compositional and behavioral components, there are a number of other methodological issues that present fruitful avenues for future work. For example, there are likely compositional effects due to aggregating data across time because gender representation has changed over time. Our analysis addresses some temporal aspects explicitly by considering publications from a limited time-span and implicitly by using the hierarchical clustering which may capture some time dynamics. However, a future analysis might directly incorporate temporal information into the null distribution. Also, because disambiguating authorships across papers is difficult without additional identifying information (torvik2009author), our analysis considers authorships rather than authors. However, in terminal fields with very few female authors, this may actually overestimate structural and compositional homophily (and underestimate behavioral homophily) by allowing for configurations with large $\alpha$ where multiple authorships corresponding to the same female author are reassigned to the same paper. In addition, since individuals are more likely to co-author with previous co-authors, a future analysis with disambiguated authors could capture co-authorship dependency across papers. Finally, we choose not to include solo-authored papers, because their inclusion would require strong modeling assumptions about the decision to write a solo-author paper versus to collaborate on a multi-author paper. However, it is unclear whether this systematically biases our behavioral homophily estimates.

In the secondary analysis, we find that female representation and field size are positively associated with statistically significant behavioral homophily. Scientifically, this result may seem counterintuitive on its face; however, it is not surprising from the perspective of homophily (mcpherson1987homophily): as the representation of women increases, it becomes more likely that same-gender individuals who are sufficiently compatible along other key dimensions become available as prospective co-authors. This is consistent with prior research in Economics which finds that behavioral homophily tends to be larger in sub-fields with a higher proportion of females (boschini2007team). Indeed, if women’s representation increases in areas of scholarship that remain stereotyped as male, women may have implicit or explicit preferences to collaborate with other women to protect against stereotype threat: experiments demonstrate that the presence of other women enhances women’s confidence, performance, and motivation in male-stereotyped domains (murphy2007signaling; sekaquaptewa2003solo; inzlicht2000threatening; stout2011steming; marx2002female). However, because more balanced gender representation and larger field size also increase the power of our testing procedure, future work is needed to disentangle whether these factors are actually associated with increased homophily or whether the association simply reflects our increased ability to detect homophily. We also find that the ratio of the proportion of single-authored papers written by females to the proportion of female multi-authorships is not significantly associated with behavioral homophily. However, an analysis which directly models single-author papers could be more conclusive.

Future work should further evaluate the short-term and long-term strategic value of gender homophilous collaboration for women. In the short-term, do women who engage in gender homophilous relationships experience higher rates of retention in the authorship pool, productivity, and impact? (Sarsons (sarsons2015gender) shows that women who co-author, instead of solo-authoring, are less likely to receive tenure, but the effect is less pronounced if women co-author with women instead of men.) In the long-term, do gender-homophilous co-authorships give rise to gender-homophilous intellectual communities? And if so, does increasing the ratio of women in an intellectual community lead to its devaluation or diminished impact, just as increasing the ratio of women in an occupation can decrease its prestige (goldin2014pollution)?

While many open questions remain, the direct implications of our current results are important: since behavioral gender homophily is not due to structural and compositional aspects such as gender imbalances across subdisciplines, and is endemic to some of the smallest intellectual communities, it might only be mitigated by changing the current cultural norms and perceptions that drive behavioral gender homophily within those communities.

4 Data and Methods

4.1 JSTOR Data

We impute authorship gender from first names by using name-specific gender percentages from Social Security and crowd-sourced records. For each authorship, as in West et al. (west2013role), we treat gender as known if the respective first name (or one of the first names in the case of double names) is used for only one gender at least 95% of the time. We start by using United States Social Security Administration records, which allow gender imputation for 75.3% of authorships. Using genderize.io (wais2016gender), which obtains gender prevalence by first name from user profiles across major social networks, we impute gender for an additional 12.6% of authorships. The remaining 12.1% of authorship instances consists of 7.6% of authorships whose first names appear in neither database and 4.5% of authorships whose first names are used for both genders with at least 5% frequency. We provide detailed descriptive statistics in the supplement. In the main analysis, authorships with unimputed genders are omitted. For instance, a paper with 2 female, 1 male, and 2 unimputed authorships is treated as an article with 2 female and 1 male authorships.
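The 95% rule can be illustrated with a short sketch. The name counts below are made up, the function names are hypothetical, and taking the first part of a double name that clears the threshold is an assumption about a detail the text does not spell out.

```python
def gender_from_counts(female_count, male_count, threshold=0.95):
    """Assign a gender only if one gender accounts for >= threshold of uses."""
    total = female_count + male_count
    if total == 0:
        return None                      # name absent from the records
    if female_count / total >= threshold:
        return "F"
    if male_count / total >= threshold:
        return "M"
    return None                          # used for both genders too often


def impute_gender(first_name, counts, threshold=0.95):
    """Try each part of a (possibly double) first name; return the first match."""
    for part in first_name.replace("-", " ").split():
        female_count, male_count = counts.get(part.lower(), (0, 0))
        gender = gender_from_counts(female_count, male_count, threshold)
        if gender is not None:
            return gender
    return None


# Made-up counts for illustration only; these are not Social Security figures.
toy_counts = {"mary": (990, 10), "john": (5, 995), "jordan": (300, 700)}
print(impute_gender("Mary", toy_counts))         # F
print(impute_gender("Jordan", toy_counts))       # None
print(impute_gender("Jordan-Mary", toy_counts))  # F
```

In the pipeline described above, a lookup of this kind would be applied first to the Social Security records and then to the crowd-sourced records for names that remain unassigned.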

The alpha values and p-values for each discipline and all the other descriptive statistics reported in this paper will be made openly available on our project website http://eigenfactor.org/projects/gender_homophily. Because the raw publication data are provided by JSTOR under license to the authors, requests for the raw data should be made to JSTOR directly. Code for the analysis and plots is available at https://github.com/ysamwang/genderHomophily.

4.2 Sampling procedure

We use $a$ to denote an authorship in $\mathcal{A}$, the set of all authorship instances. Let $c = \{(f_c(a), d_c(a))\}_{a \in \mathcal{A}}$ be a configuration where $f_c(a)$ and $d_c(a)$ denote the terminal field and document to which authorship $a$ is assigned, and let $c^{\mathrm{obs}}$ denote the observed configuration, with $f_{\mathrm{obs}}(a)$ the terminal field of $a$ in $c^{\mathrm{obs}}$, and $w_{ij}$ denote the probability that an authorship originally observed in terminal field $i$ might instead author a paper in terminal field $j$. We define the gender-blind null distribution as follows:

$$P(c) = \frac{\prod_{a \in \mathcal{A}} w_{f_{\mathrm{obs}}(a),\, f_c(a)}}{\sum_{c' \sim c^{\mathrm{obs}}} \prod_{a \in \mathcal{A}} w_{f_{\mathrm{obs}}(a),\, f_{c'}(a)}} \quad \text{for } c \sim c^{\mathrm{obs}} \qquad (2)$$

The equivalence relation $c \sim c^{\mathrm{obs}}$ indicates that $c$ is a permutation of the authorships in $c^{\mathrm{obs}}$ which preserves the total authorships per terminal field, the total numbers of male and female authorships, and the number of authorships per paper. Because the denominator in (2) cannot be calculated easily, we sample gender-blind hypothetical configurations indirectly using a Markov chain Monte Carlo Metropolis-Hastings sampling procedure. A description of how the $w_{ij}$ are determined by the observed citation flows and the details of the sampling procedure are provided in the supplement. For the main analysis we use 75,000 samples from the null distribution after burn-in to calculate each p-value. For each of the sensitivity analyses, we use 9,000 samples from the null distribution after burn-in.
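A Metropolis-Hastings sampler for a distribution of this kind can be sketched with a swap proposal. The sketch assumes that each configuration is weighted by the product, over authorships, of the probability of moving from the authorship's original terminal field to its current one; the swap proposal, the function names, and the toy weights are illustrative assumptions, and the sampler actually used is described in the supplement.

```python
import random


def mh_swap_step(occupant, slot_field, home_field, w, rng):
    """One Metropolis-Hastings step: propose swapping two authorship slots."""
    s, t = rng.sample(range(len(occupant)), 2)
    a, b = occupant[s], occupant[t]
    fs, ft = slot_field[s], slot_field[t]
    current = w[home_field[a]][fs] * w[home_field[b]][ft]
    proposed = w[home_field[a]][ft] * w[home_field[b]][fs]
    # The proposal is symmetric, so the acceptance ratio is the weight ratio.
    # `current` stays positive when the chain starts at a positive-weight state.
    if proposed >= current or rng.random() < proposed / current:
        occupant[s], occupant[t] = b, a
        return True
    return False


def run_chain(slot_field, w, n_steps, seed=0):
    """Run the chain from the observed configuration; return the final state."""
    rng = random.Random(seed)
    home_field = list(slot_field)            # observed: authorship i sits in slot i
    occupant = list(range(len(slot_field)))  # occupant[s] = authorship in slot s
    for _ in range(n_steps):
        mh_swap_step(occupant, slot_field, home_field, w, rng)
    return occupant


# Toy setup: 8 authorship slots in two terminal fields, made-up weights.
slot_field = [0, 0, 0, 0, 1, 1, 1, 1]
w_open = [[0.9, 0.1], [0.1, 0.9]]
final_state = run_chain(slot_field, w_open, n_steps=2000, seed=1)
```

Swapping two slots leaves the number of authorships per paper, per terminal field, and per gender unchanged, so every visited state respects the constraints of the equivalence relation. Recording $\alpha$ for the genders implied by `occupant` at regular intervals after burn-in would give draws from the null distribution.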

References

Appendix A Measure of Homophily

Recall that $\alpha = p - q$, where $p$ is the probability that a randomly selected co-author of a randomly selected male authorship is male and $q$ is the probability that a randomly selected co-author of a randomly selected female authorship is male. We calculate $\alpha$ for the example given in Figure 1 in the main text.

Figure 3: Two fields with four papers each. Each square represents a paper, each circle represents an authorship, and color indicates gender (blue is male, red is female).

For Field A, there are males and females. To calculate $p$ in the following equation, we calculate the proportion of male co-authors for each male authorship and then take the average. The values in the equation correspond to authorships from left to right. To calculate $q$, we calculate the proportion of male co-authors for each female authorship and then take the average, again from left to right.

(3)

For Field B, there is male and females.

(4)

For Fields A and B combined, there are 12 males and 10 females.

(5)
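The averaging procedure used in these calculations can be sketched in a few lines. This is a minimal illustration, assuming each paper is given as a list of "M"/"F" labels; the function name `homophily_alpha` and the toy inputs are hypothetical and do not reproduce the papers shown in the figure.

```python
from typing import List, Optional, Sequence


def homophily_alpha(papers: Sequence[Sequence[str]]) -> Optional[float]:
    """alpha = p - q for papers given as lists of "M"/"F" labels.

    p: average, over male authorships, of the share of male co-authors.
    q: average, over female authorships, of the share of male co-authors.
    Single-author papers have no co-authors and are skipped.
    Returns None if either gender has no authorship on a multi-author paper.
    """
    male_shares: List[float] = []
    female_shares: List[float] = []
    for paper in papers:
        n = len(paper)
        if n < 2:
            continue
        n_male = sum(1 for g in paper if g == "M")
        for g in paper:
            if g == "M":
                # Exclude the authorship itself from its co-authors.
                male_shares.append((n_male - 1) / (n - 1))
            else:
                female_shares.append(n_male / (n - 1))
    if not male_shares or not female_shares:
        return None
    p = sum(male_shares) / len(male_shares)
    q = sum(female_shares) / len(female_shares)
    return p - q


fully_segregated = homophily_alpha([["M", "M"], ["F", "F"]])
fully_mixed = homophily_alpha([["M", "F"], ["M", "F"]])
one_mixed_paper = homophily_alpha([["M", "M", "F"]])
```

In the toy inputs, two single-gender papers give the largest possible value of 1, while two mixed-gender pairs give the smallest possible value of -1.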

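In Section 2.B of the main manuscript, we describe a concrete interpretation of $\alpha$ displayed in the “FM-Papers” column of Table 1. In particular, we assume a field consists of 100 2-author papers and let $f$ and $m = 1 - f$ be the proportion of female and male authorships respectively. We can calculate the number (which may be fractional) of Female-Male papers ($N_{FM}$), Male-Male papers ($N_{MM}$) and Female-Female papers ($N_{FF}$) which would result in a specific $\alpha$. In this setting

$$N_{FM} + N_{MM} + N_{FF} = 100$$

because there are 100 papers total, and $(2N_{FF} + N_{FM})/200 = f$ since $f$ is the proportion of female authorships. With two authors per paper, $p = 2N_{MM}/(2N_{MM} + N_{FM})$ and $q = N_{FM}/(N_{FM} + 2N_{FF})$, so that $\alpha = p - q = 1 - N_{FM}/(200 f m)$. Solving for $N_{FM}$, $N_{MM}$, and $N_{FF}$ then yields:

$$N_{FM} = 200\, f\, m\, (1 - \alpha), \qquad N_{MM} = 100\, m - \tfrac{1}{2} N_{FM}, \qquad N_{FF} = 100\, f - \tfrac{1}{2} N_{FM}.$$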
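As a numerical check on this calculation, the sketch below computes the three paper counts from a given homophily value and female proportion, and then recomputes the homophily value from the counts. It uses the closed form N_FM = 2 N f m (1 - alpha), which follows from the two constraints above together with alpha = p - q for 2-author papers; the function names are hypothetical.

```python
from typing import Tuple


def two_author_paper_counts(alpha: float, prop_female: float,
                            n_papers: float = 100.0) -> Tuple[float, float, float]:
    """Return (FM, MM, FF) paper counts for a field of 2-author papers."""
    f = prop_female
    m = 1.0 - f
    n_fm = 2.0 * n_papers * f * m * (1.0 - alpha)
    n_mm = n_papers * m - n_fm / 2.0  # male authorships not on FM papers, in pairs
    n_ff = n_papers * f - n_fm / 2.0  # female authorships not on FM papers, in pairs
    return n_fm, n_mm, n_ff


def alpha_from_counts(n_fm: float, n_mm: float, n_ff: float) -> float:
    """Recompute alpha = p - q from the counts of 2-author papers."""
    p = 2.0 * n_mm / (2.0 * n_mm + n_fm)  # male co-author share for male authorships
    q = n_fm / (n_fm + 2.0 * n_ff)        # male co-author share for female authorships
    return p - q


balanced = two_author_paper_counts(alpha=0.0, prop_female=0.5)
segregated = two_author_paper_counts(alpha=1.0, prop_female=0.5)
round_trip = alpha_from_counts(*two_author_paper_counts(alpha=0.11, prop_female=0.3))
```

With equal gender representation and no homophily, half of the 100 papers are Female-Male; with complete homophily there are none.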

Appendix B JSTOR Description

Table 3 shows the size of each of the 24 top level fields identified by the map equation. The values are calculated for all papers published in or after 1960. Note that the table describes the data prior to the data cleaning procedure, so counts of authorships, papers, terminal fields and composite fields shown here may differ from those given in the main manuscript which refer to data after the cleaning procedure. Specifically, Classics, Law, and Philosophy have entire terminal fields which are removed by the cleaning procedure. Table 4 presents the structural characteristics of each top level field. For the multi-author columns, we report the proportion amongst all authorships on a multi-author paper; e.g., all female authorships on multi-authored papers divided by the total count of authorships on multi-authored papers. For the intraclass correlation (ICC) of individuals with unimputed genders, we use the statistic from (ridout1999estimating). This gives a measure of how unimputed authorships cluster by paper. Anecdotally, unimputed authorships are often names which have been Romanized. Thus a high ICC may indicate homophily by race or ethnicity.

Authors Papers Terminal Composite
Label (Count) (Count) Fields Fields
Anthropology 37588 30499 63 8
Classical studies 10596 9061 37 8
Cognitive science 15715 5553 14 3
Demography 9653 5509 20 2
Ecology and evolution 264853 116327 257 56
Economics 95934 59096 136 28
Education 40188 23065 42 10
History 26449 24043 49 6
Law 23974 19779 105 16
Mathematics 18348 14125 46 9
Molecular & Cell biology 382971 92528 178 44
Mycology 7469 3679 16 1
Operations research 13716 7780 18 4
Organizational and marketing 34254 17963 68 4
Philosophy 21738 19126 46 8
Physical anthropology 29693 16703 32 10
Plant physiology 9159 5436 21 3
Political science - international 15283 11835 34 2
Political science-US domestic 12581 7824 37 6
Pollution and occupational health 50967 12359 24 1
Probability and Statistics 37471 22094 90 23
Radiation damage 14118 4215 14 5
Sociology 57146 31662 94 21
Veterinary medicine 17756 4796 19 2
Table 3: Size of each of the top level fields identified by the map equation hierarchical clustering
Label Prop-Single-Author(Papers) Prop-Single-Author(Auth) Single-Author(F) Single-Author(M) Single-Author(U) Multi-Author(F) Multi-Author(M) Multi-Author(U) ICC
Anthropology 0.86 0.70 0.27 0.63 0.10 0.28 0.61 0.12 0.10
Classical studies 0.93 0.79 0.22 0.70 0.08 0.27 0.65 0.08 0.05
Cognitive science 0.25 0.09 0.29 0.64 0.07 0.28 0.62 0.10 0.09
Demography 0.57 0.32 0.24 0.61 0.15 0.30 0.53 0.16 0.11
Ecology and evolution 0.37 0.16 0.14 0.79 0.08 0.20 0.70 0.11 0.10
Economics 0.55 0.34 0.08 0.81 0.11 0.11 0.77 0.12 0.10
Education 0.55 0.31 0.35 0.58 0.07 0.41 0.50 0.08 0.07
History 0.92 0.84 0.24 0.70 0.06 0.23 0.69 0.08 0.05
Law 0.85 0.70 0.17 0.78 0.06 0.22 0.71 0.06 0.05
Mathematics 0.75 0.58 0.06 0.76 0.18 0.06 0.73 0.21 0.19
Molecular & Cell biology 0.14 0.03 0.19 0.70 0.10 0.23 0.61 0.16 0.09
Mycology 0.45 0.22 0.20 0.71 0.09 0.22 0.65 0.13 0.08
Operations research 0.48 0.27 0.05 0.81 0.14 0.08 0.72 0.20 0.21
Organizational and marketing 0.40 0.21 0.18 0.72 0.10 0.19 0.68 0.12 0.12
Philosophy 0.89 0.78 0.09 0.82 0.08 0.10 0.78 0.11 0.12
Physical anthropology 0.66 0.37 0.22 0.72 0.06 0.22 0.68 0.09 0.07
Plant physiology 0.53 0.31 0.13 0.79 0.08 0.17 0.72 0.10 0.07
Political science - international 0.78 0.60 0.16 0.74 0.10 0.17 0.73 0.10 0.08
Political science-US domestic 0.57 0.35 0.17 0.76 0.07 0.18 0.75 0.07 0.05
Pollution and occupational health 0.22 0.05 0.24 0.65 0.11 0.31 0.53 0.17 0.19
Probability and Statistics 0.54 0.32 0.08 0.75 0.17 0.13 0.69 0.19 0.20
Radiation damage 0.23 0.07 0.22 0.66 0.11 0.22 0.62 0.16 0.14
Sociology 0.52 0.29 0.30 0.63 0.08 0.38 0.53 0.08 0.07
Veterinary medicine 0.26 0.07 0.27 0.60 0.12 0.25 0.59 0.15 0.22
Table 4: The structural characteristics of each top level field identified by the map equation hierarchical clustering. The “Prop Single-Author” columns display the proportion of papers (authorships) which are single-authored out of all papers (authorships). The “Single-Author” (“Multi-Author”) columns give the proportions of all authorships on single-authored (multi-authored) papers by the gender imputed from first name: F = Female; M = Male; U = Unimputed. The “ICC” column displays the intraclass correlation of unimputed authorships on multi-author papers.

The plots below show how the following quantities have changed over time for each top level field: the average number of authors per paper, the proportion of papers with multiple authors, and the imputed gender proportions. The values are calculated on the data before the data-cleaning procedure which removes authorship instances with unimputed genders.

Appendix C Data Cleaning Procedures

For the main analysis, we impute gender indicators for authorships with first names that are used for a single gender with at least 95% frequency in either the U.S. Social Security records or the genderizeR database. We consider the gender indicator to be missing for authorships whose first names either do not appear in those databases or are not used with at least 95% frequency for one gender. We subsequently remove authorships with unimputed genders from our main analysis. This removal turns some articles which originally had multiple authors into single-author papers, which are excluded from the analysis. The following table shows the proportion of authorships and papers which are lost solely due to unimputed genders. The denominator only includes papers with multiple authors which were published from 1960 to 2012. The unimputed-gender column gives the proportion of authors for which we do not impute a gender indicator. The “Prop Lost” columns give the proportion of authors (or papers) which are lost after removing the authorships with unimputed gender indicators and then removing the resulting single-author papers. For authorships, this proportion includes the authorships with unimputed genders.
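The two-stage removal and the resulting loss proportions can be sketched as follows. This is a minimal illustration, assuming each paper is a list of imputed labels with `None` marking an unimputed gender; the function names and the toy papers are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Paper = Sequence[Optional[str]]  # "F", "M", or None for an unimputed gender


def clean_papers(papers: Sequence[Paper]) -> List[List[str]]:
    """Drop unimputed authorships, then drop papers left with fewer than 2 authors."""
    cleaned = []
    for paper in papers:
        kept = [g for g in paper if g is not None]
        if len(kept) >= 2:
            cleaned.append(kept)
    return cleaned


def proportion_lost(papers: Sequence[Paper]) -> Tuple[float, float]:
    """Proportion of authorships and of papers lost, among multi-author papers."""
    multi = [p for p in papers if len(p) >= 2]  # denominator: multi-author papers only
    cleaned = clean_papers(multi)
    authors_before = sum(len(p) for p in multi)
    authors_after = sum(len(p) for p in cleaned)
    authors_lost = 1.0 - authors_after / authors_before
    papers_lost = 1.0 - len(cleaned) / len(multi)
    return authors_lost, papers_lost


toy_papers = [
    ["F", "F", "M", None, None],  # kept as a 3-author paper
    ["M", None],                  # becomes single-author, so it is dropped
    ["F", "M"],                   # unchanged
    ["M"],                        # single-author: outside the denominator
]
authors_lost, papers_lost = proportion_lost(toy_papers)
```

In the toy data the authorship loss (4 of 9) exceeds the unimputed share (3 of 9), because the imputed author on the second paper is also lost when that paper drops out.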

Label Prop-Unimputed(Authors) Authors(Remaining) Authors(Prop-Lost) Papers(Remaining) Papers(Prop-Lost)
Anthropology 0.12 9326 0.17 3466 0.15
Classical studies 0.08 1976 0.11 610 0.09
Cognitive science 0.10 12510 0.13 3814 0.07
Demography 0.16 5069 0.22 1930 0.17
Ecology and evolution 0.11 192091 0.13 66152 0.08
Economics 0.12 51691 0.19 22178 0.15
Education 0.08 24356 0.12 9396 0.09
History 0.08 3699 0.12 1596 0.11
Law 0.06 6526 0.10 2765 0.09
Mathematics 0.21 5319 0.31 2459 0.25
Molecular & Cell biology 0.16 303761 0.18 73357 0.07
Mycology 0.13 4828 0.17 1759 0.11
Operations research 0.20 7217 0.28 3025 0.21
Organizational and marketing 0.12 22299 0.18 9137 0.13
Philosophy 0.11 3897 0.18 1770 0.15
Physical anthropology 0.09 16463 0.12 5175 0.07
Plant physiology 0.10 5388 0.14 2287 0.10
Political science - international 0.10 5128 0.16 2247 0.14
Political science-US domestic 0.07 7269 0.11 3068 0.09
Pollution and occupational health 0.17 39703 0.18 8845 0.06
Probability and Statistics 0.19 18763 0.27 7600 0.21
Radiation damage 0.16 10710 0.18 2902 0.09
Sociology 0.08 35858 0.12 13600 0.09
Veterinary medicine 0.15 13741 0.17 3275 0.06
Total 0.14 807588 0.16 252413 0.11
Table 5: Data reduction due to unimputed gender indicators

Appendix D Sampler Details

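Recall from the main text that, for each authorship $a$ in the set of all authorships $\mathcal{A}$, we let $t_a$ denote the terminal field and $d_a$ denote the document to which $a$ is assigned. We denote the entire configuration of all authorships as $c = \{(t_a, d_a)\}_{a \in \mathcal{A}}$ and denote the configuration which we actually observe in the data as $c^{\mathrm{obs}}$, with terminal fields $t_a^{\mathrm{obs}}$. Writing $p_{ij}$ for the probability that an authorship originally observed in terminal field $i$ is re-assigned to terminal field $j$, we use a Markov chain Monte Carlo Metropolis-Hastings sampler to draw samples from the gender-blind null distribution:

$$P(c) = \frac{\prod_{a \in \mathcal{A}} p_{t_a^{\mathrm{obs}} t_a}}{\sum_{c' \sim c^{\mathrm{obs}}} \prod_{a \in \mathcal{A}} p_{t_a^{\mathrm{obs}} t'_a}} \quad \text{if } c \sim c^{\mathrm{obs}}, \qquad P(c) = 0 \quad \text{otherwise,} \tag{6}$$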

where the equivalence relation $\sim$ indicates that the total number of authorships per terminal field, the total numbers of male and female authorships, and the number of authorships per paper are the same in the hypothetical and observed configurations.

Define a permutation cycle of length $k$ to be a set of authorships $(a_1, \ldots, a_k)$ in which $a_i$ is reassigned to the current terminal field and document of $a_{i+1}$ for $i < k$, and $a_k$ is reassigned to the current terminal field and document of $a_1$. Any hypothetical configuration of authorship assignments can be decomposed into disjoint permutation cycles of the observed data. The sampling procedure starts with the observed assignments of authorships to papers within terminal fields and generates assignments by successively modifying the current state by a series of permutation cycles. We generate a proposal for each of these cycles by first randomly selecting a cycle length $k$ from a geometric distribution. Then, $k$ specific authorships are selected to form the permutation cycle. This proposed permutation cycle is then accepted or rejected with the appropriate Metropolis-Hastings probability.

For a terminal field $t$, let $S_t$ denote the set of authorships in a terminal field to which any authorship originally from terminal field $t$ could be re-assigned.

Step 1: Sample Cycle
Select authorship uniformly from
Select authorship uniformly from
Draw
while  do
     
     Select authorship uniformly from
end while
Step 2: Generate Proposal with Cycle
for  do
     
     
end for
Step 3: Accept or Reject
if  then
     Set
else
     Set
end if
Algorithm 1 Proposal Procedure
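To make the three steps concrete, the following sketch implements a simplified cycle-proposal Metropolis-Hastings update. It is not a transcription of Algorithm 1: as a labeled simplification, every member of the cycle is drawn uniformly (with replacement) from all authorships rather than from restricted sets, which keeps the proposal symmetric because a cycle and its reversal are equally likely. The target is taken to be proportional to the product of re-assignment probabilities from each authorship's observed field to its current field. All names (`mh_step`, `stop_prob`, the toy fields and probabilities) are hypothetical, and the sketch assumes every field can be re-assigned to itself with positive probability.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple

Slot = Tuple[str, str]  # (terminal field, document id)


def draw_cycle_length(rng: random.Random, stop_prob: float) -> int:
    """Cycle length k >= 2: two plus a geometric number of extensions."""
    k = 2
    while rng.random() > stop_prob:
        k += 1
    return k


def mh_step(slots: List[Slot], t_obs: List[str],
            p: Dict[str, Dict[str, float]], rng: random.Random,
            stop_prob: float = 0.5) -> bool:
    """One cycle proposal; mutates `slots` in place and returns True if accepted."""
    n = len(slots)
    k = draw_cycle_length(rng, stop_prob)
    cycle = [rng.randrange(n) for _ in range(k)]  # sampled with replacement
    if len(set(cycle)) < k:
        return False  # an authorship appears twice: not a valid cycle, reject
    # Authorship i on the cycle takes the current slot of the next authorship.
    new_slots = [slots[cycle[(i + 1) % k]] for i in range(k)]
    ratio = 1.0
    for a, new in zip(cycle, new_slots):
        ratio *= p[t_obs[a]].get(new[0], 0.0) / p[t_obs[a]][slots[a][0]]
    if rng.random() < min(1.0, ratio):
        for a, new in zip(cycle, new_slots):
            slots[a] = new
        return True
    return False


# Toy data: 7 authorships in 3 terminal fields; field C is isolated.
observed: List[Slot] = [("A", "a1"), ("A", "a1"), ("A", "a2"),
                        ("B", "b1"), ("B", "b1"), ("C", "c1"), ("C", "c1")]
p = {"A": {"A": 0.8, "B": 0.2}, "B": {"A": 0.3, "B": 0.7}, "C": {"C": 1.0}}
t_obs = [field for field, _ in observed]
slots = list(observed)
rng = random.Random(0)
accepted = sum(mh_step(slots, t_obs, p, rng) for _ in range(2000))
```

Because each accepted move only permutes existing slots among authorships, the number of authorships per paper and per terminal field is preserved automatically, and gender totals are unchanged since gender stays attached to the authorship.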

The length of the proposed cycle, $k$, is drawn from a geometric distribution whose parameter $\lambda$ is a tuning parameter which regulates the average cycle length. A larger value of $\lambda$ will yield longer cycles, resulting in larger changes in the proposal but a lower probability of acceptance; a smaller value of $\lambda$ will yield shorter cycles, resulting in smaller changes in the proposal but a higher probability of acceptance. In general, the maximum length of a permutation cycle in the decomposition could be up to $|\mathcal{A}|$, the number of authorships in our corpus. Thus, any distribution which has positive support over all cycle lengths up to $|\mathcal{A}|$ would be sufficient for irreducibility. Under the scheme proposed in Algorithm 1, the probability of the proposed configuration $c^*$ under the gender-blind null distribution, $P(c^*)$, could be 0 since we have not guaranteed that every re-assignment on the cycle has positive probability. In addition, since we are selecting authorships with replacement, $P(c^*) = 0$ if an authorship is selected twice on the cycle. However, if we were to sample without replacement we would need to condition on authorships that had been previously selected, so the proposal probabilities would no longer be symmetric, since the probability of traversing a cycle would not be invariant to the orientation of the cycle.

Remark 1.

Let $Q$ be the proposal distribution described in Algorithm 1. Then $Q$ is symmetric, such that $Q(c^* \mid c) = Q(c \mid c^*)$ for any two assignments $c$ and $c^*$.

Proof.

Let and be two assignments which differ by cycle . For notational convenience, let and . Then,

(7)

A proposal of the first assignment from the second requires traversing the cycle in the opposite direction.

(8)

Remark 2.

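The Markov chain produced from the proposal procedure in Algorithm 1 is irreducible if the re-assignment probabilities have symmetric support ($p_{ij} > 0$ if and only if $p_{ji} > 0$) and the cycle length is chosen from a distribution with support over all cycle lengths up to $|\mathcal{A}|$, where $|\mathcal{A}|$ is the number of authorship instances.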

Proof.

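For each configuration $c$ with positive probability under the null, $P(c) > 0$, there exists a decomposition of $c$ into disjoint permutation cycles $C_1, \ldots, C_m$ such that each $C_j$ is a permutation of some subset of the set of authorships $\mathcal{A}$. Let $c^{\mathrm{obs}} = c^{(0)}, c^{(1)}, \ldots, c^{(m)} = c$ be the sequence of assignments, starting from the observed configuration $c^{\mathrm{obs}}$, which correspond to updating the permutation cycles $C_j$, $j = 1, \ldots, m$. Since there are a finite number of disjoint cycles and the proposal for permuting each cycle is positive, the joint probability of permuting all cycles is also positive, so $c$ can be reached from $c^{\mathrm{obs}}$ with positive probability. Because the transition support is symmetric, we can also reverse each cycle to move with positive probability from $c$ to $c^{\mathrm{obs}}$.

Thus, any two states $c$ and $c'$ with positive probability under the null can be connected with positive probability by passing through $c^{\mathrm{obs}}$.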

To allow for collaboration across terminal fields, we use observed citation data from one terminal field to another to define the authorship re-assignment probability, $p_{ij}$, between terminal fields $i$ and $j$. Here, we make two simplifying assumptions. First, we threshold the citation flow between terminal fields at 5% of outgoing citations. Authorship re-assignments between terminal fields that have little connectivity are highly unlikely. Thresholding the citation data produces a network of terminal fields that is sparser (has a greater number of disjoint graph components), which allows the sampling procedure to be parallelized more efficiently. Second, to ensure that our sampling procedure can reach all configurations that have positive probability under the null distribution, we allow for authorship reassignment between terminal fields to be possible in both directions.

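More formally, let $\pi_{ij}$ be the observed proportion of citations from terminal field $i$ to terminal field $j$. We define the authorship re-assignment probabilities $p_{ij}$ between terminal fields as follows:

  1. Set any proportions $\pi_{ij} < .05$ to 0

  2. Set $p_{ij}$ to be positive whenever either $\pi_{ij}$ or $\pi_{ji}$ is positive

  3. Renormalize the proportions so $\sum_j p_{ij} = 1$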

This procedure allows us to take into account substantial connectivity between terminal fields and also ensures that authorship reassignments between terminal fields are possible in both directions: $p_{ij} > 0$ if and only if $p_{ji} > 0$.
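The three steps can be sketched as below. The symmetrization rule in step 2 is an assumption made for illustration: the sketch takes the larger of the two thresholded flows between a pair of fields, which is one simple way to guarantee support in both directions and may differ from the rule actually used. The function name and the toy citation proportions are hypothetical.

```python
from typing import Dict

Matrix = Dict[str, Dict[str, float]]


def reassignment_probabilities(citation_props: Matrix,
                               threshold: float = 0.05) -> Matrix:
    """Threshold, symmetrize (assumed: elementwise max), and renormalize rows."""
    fields = sorted(citation_props)
    # Step 1: zero out citation proportions below the threshold.
    kept: Matrix = {}
    for i in fields:
        kept[i] = {}
        for j in fields:
            value = citation_props[i].get(j, 0.0)
            kept[i][j] = value if value >= threshold else 0.0
    # Step 2 (assumption): make the support symmetric by taking the larger flow.
    sym: Matrix = {i: {j: max(kept[i][j], kept[j][i]) for j in fields}
                   for i in fields}
    # Step 3: renormalize each row so that it sums to one.
    probs: Matrix = {}
    for i in fields:
        total = sum(sym[i].values())
        if total == 0.0:
            raise ValueError(f"field {i!r} has no citation flow above the threshold")
        probs[i] = {j: sym[i][j] / total for j in fields}
    return probs


# Toy proportions of outgoing citations, invented for illustration.
pi = {
    "A": {"A": 0.90, "B": 0.07, "C": 0.03},
    "B": {"A": 0.02, "B": 0.98},
    "C": {"A": 0.00, "C": 1.00},
}
p = reassignment_probabilities(pi)
```

In the toy example the flow from A to C falls below the threshold, so C becomes its own graph component, while the A to B flow is retained and mirrored so that re-assignment from B to A is also possible.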

Appendix E Results

e.1 Sampler Convergence

As recommended by Gelman and Shirley (gelman2011inference), we take 3 separate chains and discard roughly the first half of each chain as burn-in. In particular, we take 3 chains of 45,000 MCMC samples each and discard the first 20,000 from each chain as burn-in. Such long chains and burn-in are necessary because all chains were initialized from the same starting values. We then combine the remaining 75,000 samples (25,000 from each chain) to estimate p-values for the observed $\alpha$ values. To check for convergence in the distribution of $\alpha$ for each relevant field, we compare the distributions of $\alpha$ from each chain using a Kolmogorov-Smirnov test as suggested by (brooks2003nonparametric). Figure 4 shows the p-values for a two-sided Kolmogorov-Smirnov test for equality of distributions. Because the distribution of $\alpha$ is discrete, instead of using the typical asymptotic distribution of the KS statistic to calculate p-values, we bootstrap p-values using the R function ks.boot from the Matching package (sekhon2011matching). We see that the p-values across all comparisons are relatively uniform, as we would expect if the distribution of $\alpha$ across all chains were similar.
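The idea behind a resampling-based Kolmogorov-Smirnov comparison of two chains can be sketched in plain Python. This is a stand-in rather than a port of ks.boot: it uses a permutation scheme (reshuffling the pooled draws) instead of that function's bootstrap, and the function names and toy chains are hypothetical.

```python
import random
from bisect import bisect_right
from typing import List, Sequence


def ks_statistic(x: Sequence[float], y: Sequence[float]) -> float:
    """Largest absolute gap between the two empirical CDFs (ties allowed)."""
    xs, ys = sorted(x), sorted(y)
    points = sorted(set(xs) | set(ys))
    return max(abs(bisect_right(xs, v) / len(xs) - bisect_right(ys, v) / len(ys))
               for v in points)


def ks_permutation_pvalue(x: Sequence[float], y: Sequence[float],
                          n_perm: int = 2000, seed: int = 0) -> float:
    """Two-sided p-value from reshuffling the pooled samples."""
    rng = random.Random(seed)
    observed = ks_statistic(x, y)
    pooled: List[float] = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = ks_statistic(pooled[:len(x)], pooled[len(x):])
        if stat >= observed - 1e-12:  # tolerance for floating point ties
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy "chains" of a discrete statistic, invented for illustration.
chain_1 = [0.0, 0.1, 0.1, 0.2, 0.2, 0.2, 0.3]
chain_2 = list(chain_1)
p_same = ks_permutation_pvalue(chain_1, chain_2, n_perm=200)
p_diff = ks_permutation_pvalue([0.0] * 30, [1.0] * 30, n_perm=200)
```

Resampling the null distribution of the statistic avoids relying on the asymptotic distribution, which assumes continuous data and is therefore unreliable when many sampled values are tied.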

Figure 4: For each field (terminal and composite), we compare the distributions of sampled values for each of the 3 chains against one another. The p-values for each Kolmogorov-Smirnov test for equality are plotted above. The top panel shows all fields. The bottom panel shows the subset of fields which had an unadjusted p-value from the main analysis below .05.

e.2 Comparison with Naive Approach

We can compare the expected value of $\alpha$ from the null distribution which accounts for structural and compositional homophily (Eq. (2) in the main document) to the expected value of $\alpha$ from a naive null distribution which only accounts for structural homophily. We construct a naive null distribution for each level $\ell$ in the full hierarchical clustering by preserving all structure of (terminal or composite) fields with depth less than $\ell$, but treating all fields of depth $\ell$ as a terminal field. We then recalculate the swap probabilities given the citation flows and run the sampler for 5,000 samples. We discard the first 1,000 as burn-in and use the remaining 4,000 samples to calculate an expected value and p-values.

Columns labeled “Struct” provide the expected value of $\alpha$, the expected number of female-male papers, the p-value for behavioral homophily, and the number of significant composite fields under the null hypothesis of only structural (but not compositional) homophily. Under only structural homophily, the expected value of $\alpha$ is smaller than the expected value when also preserving compositional homophily. In addition, the p-value decreases for all top-level fields when only considering structural homophily. In many top-level fields, the number of composite fields with significant behavioral homophily increases when only capturing structural homophily, and it never decreases.

Field α(Obs) α(Exp) α(Struct) FM-Papers(Obs) FM-Papers(Exp) FM-Papers(Struct) P-value(Main) P-value(Struct) Signif-Comp(Main) Signif-Comp(Struct)
JSTOR .11 .05 .00 38.6 41.1 43.3 .00 .00 82/280 157/280
Mol/Cell Bio .05 .01 .00 38.2 39.8 40.2 .00 .00 19/44 35/44
Eco/Evol .06 .02 .00 31.9 33.3 34.0 .00 .00 15/56 33/56
Economics .11 .02 .00 18.7 20.8 21.1 .00 .00 11/28 18/28
Sociology .19 .07 .00 38.5 44.2 47.4 .00 .00 12/21 19/21
Prob/Stat .09 .03 .00 26.0 27.8 28.6 .00 .00 2/23 12/23
Org/mkt .16 .04 .00 29.0 33.2 34.7 .00 .00 3/4 4/4
Education .16 .04 .00 41.2 47.2 49.2 .00 .00 6/10 9/10
Occ Health .10 .02 .00 41.7 45.5 46.3 .00 .00 1/1 1/1
Anthro .12 .03 .00 38.5 42.0 43.5 .00 .00 2/8 4/8
Law .17 .08 .00 29.7 32.9 35.7 .00 .00 1/16 4/16
History .16 .07 .00 32.9 36.5 39.1 .00 .00 1/6 2/6
Phys Anthro .07 .01 .00 34.7 36.8 37.1 .00 .00 2/10 2/10
Intl Poli Sci .09 .02 .00 27.3 29.1 29.8 .03 .00 0/2 1/2
US Poli Sci .15 .07 .00 25.2 27.4 29.6 .00 .00 1/6 2/6
Philosophy .10 .03 .00 18.7 20.3 20.8 .03 .00 0/8 0/8
Math .04 .01 .00 14.3 14.7 14.9 1.00 .12 0/9 0/9
Vet Med .09 .01 .00 38.3 41.6 42.0 .00 .00 1/2 2/2
Cog Sci .18 .09 .00 35.7 39.4 43.3 .00 .00 3/3 3/3
Radiation .09 .01 .00 34.0 36.8 37.3 .00 .00 1/5 4/5
Demography .15 .06 .00 40.3 44.4 47.3 .00 .00 1/2 1/2
Classics .07 .01 .00 38.7 41.2 41.7 .27 .01 0/8 0/8
Opr Res .03 .00 .00 16.6 17.1 17.1 .73 .33 0/4 0/4
Plant Phys .08 .02 .00 29.1 31.0 31.8 .03 .00 0/3 1/3
Mycology .03 .01 .00 36.8 37.5 37.9 1.00 .60 0/1 0/1
Table 6: Comparison of naive analysis only preserving structural homophily to main analysis.

e.3 Calculating and adjusting P-values

In a sampled configuration, if a field only contains authorships of a single gender, $\alpha$ is undefined. When calculating a p-value, we treat such a sample as being at least as large as the observed $\alpha$. This approach is conservative because it increases p-values, but in practice it has very little effect on our results.
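A minimal sketch of this convention follows, assuming a one-sided p-value computed as the share of null samples at least as large as the observed value, with undefined samples represented by `None`. The exact estimator used by the authors (for example, whether a +1 correction is applied) is not stated here, so the plain proportion is an assumption, and the function name is hypothetical.

```python
from typing import Optional, Sequence


def monte_carlo_pvalue(observed_alpha: float,
                       sampled_alphas: Sequence[Optional[float]]) -> float:
    """Share of null samples at least as large as the observed value.

    Undefined samples (None) are counted as at least as large as the
    observed value, which can only increase the p-value.
    """
    if not sampled_alphas:
        raise ValueError("at least one null sample is required")
    extreme = sum(1 for a in sampled_alphas
                  if a is None or a >= observed_alpha)
    return extreme / len(sampled_alphas)


with_undefined = monte_carlo_pvalue(0.10, [0.00, 0.05, 0.20, None])
without_undefined = monte_carlo_pvalue(0.10, [0.00, 0.05, 0.20, 0.00])
```

Comparing the two toy calls shows the conservative direction of the rule: replacing a defined small sample with an undefined one raises the p-value from 0.25 to 0.5.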

In the main manuscript, we control the false discovery rate at .05 with the Benjamini-Yekutieli procedure (benjamini2001control) which allows for arbitrary dependence of the p-values, but is more conservative than the Benjamini-Hochberg procedure (benjamini1995controlling), which only allows for certain types of positive dependence. Table 7 replicates the last 3 columns of Table 1 of the main manuscript using the Benjamini-Yekutieli procedure with an FDR of .005 as well as the Benjamini-Hochberg procedure with FDR rates of .05 and .005.
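The difference between the two procedures can be seen in a short step-up implementation. This is a generic sketch of the standard Benjamini-Hochberg and Benjamini-Yekutieli rules, not the code used for the tables; the function name and the toy p-values are hypothetical.

```python
from typing import List, Sequence


def fdr_rejections(pvalues: Sequence[float], rate: float,
                   method: str = "BY") -> List[bool]:
    """Step-up FDR control; returns one reject/keep flag per hypothesis.

    BH compares the i-th smallest p-value with i * rate / m.
    BY divides that threshold by c(m) = 1 + 1/2 + ... + 1/m, which makes
    it valid under arbitrary dependence but more conservative.
    """
    if method not in ("BH", "BY"):
        raise ValueError("method must be 'BH' or 'BY'")
    m = len(pvalues)
    c_m = sum(1.0 / i for i in range(1, m + 1)) if method == "BY" else 1.0
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0  # largest rank whose p-value falls under its threshold
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * rate / (m * c_m):
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


toy_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
bh = fdr_rejections(toy_pvalues, rate=0.05, method="BH")
by = fdr_rejections(toy_pvalues, rate=0.05, method="BY")
```

On these toy p-values Benjamini-Hochberg rejects the two smallest hypotheses while Benjamini-Yekutieli rejects only the smallest, illustrating why the latter yields fewer significant fields at the same rate.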

Field BY-.005(P-value) BY-.005(Term) BY-.005(Comp) BH-.005(P-value) BH-.005(Term) BH-.005(Comp) BH-.05(P-value) BH-.05(Term) BH-.05(Comp)
JSTOR .00 68/1450 65/280 .00 114/1450 81/280 .00 261/1450 125/280
Mol/Cell Bio .00 16/178 18/44 .00 28/178 19/44 .00 53/178 29/44
Eco/Evol .00 7/257 13/56 .00 14/257 15/56 .00 43/257 25/56
Economics .00 4/136 8/28 .00 8/136 11/28 .00 25/136 17/28
Sociology .00 9/94 9/21 .00 12/94 12/21 .00 26/94 14/21
Prob/Stat .00 0/90 1/23 .00 1/90 2/23 .00 7/90 7/23
Org/mkt .00 5/68 3/4 .00 6/68 3/4 .00 15/68 3/4
Education .00 6/42 4/10 .00 10/42 6/10 .00 17/42 6/10
Occ Health .00 8/24 1/1 .00 12/24 1/1 .00 15/24 1/1
Anthro .00 2/63 1/8 .00 5/63 2/8 .00 8/63 2/8
Law .00 0/98 0/16 .00 0/98 1/16 .00 1/98 4/16
History .00 0/49 0/6 .00 0/49 1/6 .00 1/49 1/6
Phys Anthro .00 1/32 2/10 .00 1/32 2/10 .00 6/32 3/10
Intl Poli Sci .03 0/34 0/2 .00 0/34 0/2 .00 3/34 0/2
US Poli Sci .00 0/37 0/6 .00 2/37 1/6 .00 4/37 3/6
Philosophy .03 0/45 0/8 .00 0/45 0/8 .00 5/45 0/8
Math 1.00 0/46 0/9 .16 0/46 0/9 .16 2/46 0/9
Vet Med .00 5/19 1/2 .00 7/19 1/2 .00 8/19 2/2
Cog Sci .00 2/14 3/3 .00 4/14 3/3 .00 7/14 3/3
Radiation .00 3/14 1/5 .00 3/14 1/5 .00 5/14 3/5
Demography .00 0/20 0/2 .00 0/20 0/2 .00 3/20 2/2
Classics .27 0/35 0/8 .03 0/35 0/8 .03 1/35 0/8
Opr Res .73 0/18 0/4 .09 0/18 0/4 .09 1/18 0/4
Plant Phys .03 0/21 0/3 .00 1/21 0/3 .00 4/21 0/3
Mycology 1.00 0/16 0/1 .27 0/16 0/1 .27 1/16 0/1
Table 7: Main results using different FDR procedures. “BY” indicates Benjamini-Yekutieli and “BH” indicates Benjamini-Hochberg.

e.4 Secondary Analysis

We examine whether certain terminal field characteristics are associated with statistically significant behavioral homophily. In particular, we fit a logistic regression where the dependent variable $Y_t$ is whether or not significant behavioral homophily was detected in terminal field $t$ using the Benjamini-Yekutieli FDR procedure (benjamini2001control). We include the following independent variables: the ratio of the % of solo-authorships which are female to the % of authorships on multi-authored papers which are female ($R_t$); the log of the number of authorships ($\log N_t$); the proportion of female authorships ($F_t$); an indicator of whether the field is majority female ($M_t$); and an interaction between $F_t$ and $M_t$. The interaction term allows the association of female proportion to differ depending on whether the field is majority female or not.

$$\operatorname{logit} P(Y_t = 1) = \beta_0 + \beta_1 \log N_t + \beta_2 F_t + \beta_3 M_t + \beta_4 R_t + \beta_5 F_t M_t \tag{9}$$

We fit the logistic regression specified in (9) using a generalized estimating equation (GEE) (gee2015); to account for dependency across terminal fields, we use robust standard errors and specify clusters aligning to top level field. We also specify a diagonal working covariance. The results are shown in Table 8. We see that the ratio of female solo-authorships to female multi-authorships is not significant, but the size of the terminal field and the proportion of female authorships are statistically significant. We also see that both the indicator for whether a field is majority female and the interaction term are not significant at the .05 level. While it is interesting to note that the estimate of the interaction term is negative, we also caution that the estimate may not be precise since the majority female indicator is positive for only a limited number of the terminal fields.
Estimate Robust S.E. Robust z P-value
Intercept -14.05 1.00 -14.09 0.00
log(Authorships) 1.45 0.09 15.25 0.00
Proportion Female 6.70 1.55 4.32 0.00
Majority Female Indicator 13.30 7.33 1.81 0.07
Ratio Solo vs Multi Females 0.18 0.35 0.52 0.60
Proportion Female × Majority Female Interaction -24.70 12.73 -1.94 0.052
Table 8: Results of the logistic regression using significant behavioral homophily under the Benjamini-Yekutieli FDR procedure as the dependent variable.

Alternatively, if we define the dependent variable as whether behavioral homophily was detected under the Benjamini-Hochberg (benjamini1995controlling) FDR control procedure, we see that the significant/non-significant covariates do not change, but the p-values for the majority female indicator and interaction term are much further away from .05 than when defining significance using the Benjamini-Yekutieli procedure. Results are shown in Table 9.

Estimate Robust S.E. Robust z P-value
Intercept -10.22 0.72 -14.26 0.00
log(Authorships) 1.20 0.09 12.75 0.00
Proportion Female 4.11 1.06 3.89 0.00
Majority Female Indicator 3.08 4.01 0.77 0.44
Ratio Solo vs Multi Females -0.20 0.22 -0.89 0.37
Proportion Female × Majority Female Interaction -5.63 7.22 -0.78 0.44
Table 9: Results of the logistic regression using significant behavioral homophily under the Benjamini-Hochberg FDR procedure as the dependent variable.

e.5 Sensitivity Analysis: Missing Gender Indicators

To evaluate how sensitive our main results are to the missing gender indicators, we impute gender for authorships with missing gender indicators under two scenarios:

  • Low homophily: Each authorship with a missing gender indicator is assigned a gender at random according to the proportions of observed genders on its original terminal field. This procedure assumes that there is no behavioral homophily in the imputed data because the imputed genders are conditionally independent given the terminal field. Thus, it gives a reasonable lower bound on the homophily we might have observed given the full data.

  • High homophily: Each authorship with a missing gender indicator is assigned a gender at random according to the proportions of observed genders on its original paper. If the original paper contains only authorships with missing gender indicators, we assign all authorships on the paper the same gender indicator which is drawn randomly according to the proportions of observed genders for its original terminal field. Because papers with at most one assigned gender indicator are homophilous by construction, this provides a reasonable upper bound on the homophily we might have observed given the full data.
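The two scenarios can be sketched as follows. This is a minimal illustration, assuming each paper is stored as a terminal field label plus a list of gender labels with `None` for missing indicators, and assuming every terminal field has at least one observed gender; the function names and toy papers are hypothetical.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple

Paper = Tuple[str, List[Optional[str]]]  # (terminal field, genders; None = missing)


def female_share(genders: Sequence[Optional[str]]) -> Optional[float]:
    """Share of "F" among observed genders, or None if nothing is observed."""
    observed = [g for g in genders if g is not None]
    if not observed:
        return None
    return sum(1 for g in observed if g == "F") / len(observed)


def draw(share: float, rng: random.Random) -> str:
    return "F" if rng.random() < share else "M"


def impute_missing(papers: Sequence[Paper], scenario: str,
                   rng: random.Random) -> List[Tuple[str, List[str]]]:
    """Fill missing genders under the "low" or "high" homophily scenario."""
    if scenario not in ("low", "high"):
        raise ValueError("scenario must be 'low' or 'high'")
    pooled: Dict[str, List[Optional[str]]] = {}
    for field, genders in papers:
        pooled.setdefault(field, []).extend(genders)
    field_share = {f: female_share(g) for f, g in pooled.items()}
    result = []
    for field, genders in papers:
        paper_share = female_share(genders)
        if scenario == "low":
            # Draw each missing gender from the terminal field proportions.
            filled = [g if g is not None else draw(field_share[field], rng)
                      for g in genders]
        elif paper_share is None:
            # No observed gender on the paper: one shared draw for all authors.
            filled = [draw(field_share[field], rng)] * len(genders)
        else:
            # Draw each missing gender from the paper's own proportions.
            filled = [g if g is not None else draw(paper_share, rng)
                      for g in genders]
        result.append((field, filled))
    return result


toy = [("A", ["F", None, None]), ("A", ["M", "M"]),
       ("A", [None, None]), ("A", ["F", "M", None])]
high = impute_missing(toy, "high", random.Random(1))
low = impute_missing(toy, "low", random.Random(1))
```

In the high homophily scenario the first toy paper becomes all female, because its only observed author is female, and the third paper receives a single shared draw; in the low homophily scenario the same missing entries are drawn independently from the field-level proportions.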

For each scenario, we carry out 10 imputations and then repeat the entire sampling and testing procedures used for the main analysis. Table 10 gives the resulting proportions of terminal, composite, and top level fields with significant behavioral homophily under the Benjamini-Yekutieli FDR procedure with an FDR of .05 under the low and high homophily missing data imputation scenarios. We observe that, on average, 7%, 25%, and 78% of terminal, composite, and top level fields, respectively, exhibit statistically significant behavioral homophily in the low homophily scenario; for the high homophily scenario the corresponding averages are 54%, 82%, and 100%.

Terminal Composite Top
Main Analysis 0.09 0.29 0.83
Low Imputation 1 0.06 0.23 0.83
Low Imputation 2 0.07 0.25 0.75
Low Imputation 3 0.06 0.25 0.75
Low Imputation 4 0.08 0.25 0.75
Low Imputation 5 0.06 0.26 0.75
Low Imputation 6 0.07 0.28 0.75
Low Imputation 7 0.07 0.25 0.83
Low Imputation 8 0.07 0.26 0.79
Low Imputation 9 0.07 0.25 0.75
Low Imputation 10 0.06 0.26 0.79
Low Imputation Avg 0.07 0.25 0.78
High Imputation 1 0.54 0.82 1.00
High Imputation 2 0.53 0.82 1.00
High Imputation 3 0.53 0.82 1.00
High Imputation 4 0.53 0.82 1.00
High Imputation 5 0.54 0.82 1.00
High Imputation 6 0.53 0.81 1.00
High Imputation 7 0.54 0.82 1.00
High Imputation 8 0.54 0.83 1.00
High Imputation 9 0.54 0.83 1.00
High Imputation 10 0.54 0.82 1.00
High Imputation Avg 0.54 0.82 1.00
Table 10: Each column shows the proportion of fields which exhibit statistically significant homophily for each of the individual imputations