Cultural transmission modes of music sampling traditions remain stable despite delocalization in the digital age

10/28/2018 ∙ by Mason Youngblood, et al. ∙ 0

Music sampling is a common practice among hip-hop and electronic producers that has played a critical role in the development of particular subgenres. Artists preferentially sample drum breaks, and previous studies have suggested that these may be culturally transmitted. With the advent of digital sampling technologies and social media the modes of cultural transmission may have shifted, and music communities may have become decoupled from geography. The aim of the current study was to determine whether drum breaks are culturally transmitted through musical collaboration networks, and to identify the factors driving the evolution of these networks. Using network-based diffusion analysis we found strong evidence for the cultural transmission of drum breaks via collaboration between artists, and identified several demographic variables that bias transmission. Additionally, using network evolution methods we found evidence that the structure of the collaboration network is no longer biased by geographic proximity after the year 2000, and that gender disparity has relaxed over the same period. Despite the delocalization of communities by the internet, collaboration remains a key transmission mode of music sampling traditions. The results of this study provide valuable insight into how demographic biases shape cultural transmission in complex networks, and how the evolution of these networks has shifted in the digital age.


Introduction

Music sampling, or the use of previously-recorded material in a new composition, is a nearly ubiquitous practice among hip-hop and electronic producers. The usage of drum breaks, or percussion-heavy sequences, ripped from soul and funk records has played a particularly critical role in the development of certain subgenres. For example, “Amen, Brother”, released by The Winstons in 1969, is widely regarded as the most sampled song of all time. Its iconic 4-bar drum break has been described as “genre-constitutive” [Whelan2009], and can be prominently heard in classic hip-hop and jungle releases by N.W.A and Shy FX [Collins2007]. Due to the consistent usage of drum breaks in particular music communities and subgenres [Whelan2009, Collins2007, Frane2017, Vakeva2010, Rodgers2003] some scholars have suggested that they may be culturally transmitted [Bown2009], which could occur as a direct result of collaboration between artists or as an indirect effect of community membership.

Before the digital age, artists may have depended upon collaborators for access to the physical source materials and expensive hardware required for sampling [Lloyd2014]. In the 1990s, new technologies like compressed digital audio formats and digital audio workstations made sampling more accessible to a broader audience [Bodiford2017]. Furthermore, the widespread availability of the internet and social media has delocalized communities [DiMaggio2001], and allowed global music “scenes” to form around shared interests beyond peer-to-peer file sharing [Ebare2003, Alexandraki2009]. Individuals in online music communities now have access to the collective knowledge of other members [Salavuo2006, Lazar2002], and there is evidence that online communities play a key role in music discovery [Garg2011]. Although musicians remain concentrated in historically important music cities (e.g. New York City and Los Angeles) [Florida2010, Graham2016], online music communities also make it possible for artists to establish collaborative relationships independently of geographic location [Kruse2010]. If more accessible sampling technologies and access to collective knowledge have allowed artists to discover sample sources independently of collaboration [Makelberge2012], then the strength of cultural transmission via collaboration may have decreased over the last couple of decades. Similarly, if online music communities have created opportunities for interactions between potential collaborators, then geographic proximity may no longer structure musical collaboration networks.

Studies of the cultural evolution of music have primarily investigated diversity in musical performances [Ellis2018] and traditions [Savage2014], macro-scale patterns and selective pressures in musical evolution [Mauch2015, Percino2014, Savage2015, RodriguezZivic2013], and the structure and evolution of consumer networks [Garg2011, Schlitter2009, Monechi2017]. Although several diffusion chain experiments have addressed how cognitive biases shape musical traits during transmission [Verhoef2012, Ravignani2017, Lumaca2017], few studies have investigated the mechanisms of cultural transmission at the population level [Rossman2008, Nakamura2018]. The practice of sampling drum breaks in hip-hop and electronic music is an ideal research model for cultural transmission because of (1) the remarkably high copy fidelity of sampled material, (2) the reliable documentation of sampling events, and (3) the availability of high-resolution collaboration and demographic data for the artists involved. Exhaustive online datasets of sample usage and collaboration make it possible to reconstruct networks of artists and track the diffusion of particular drum breaks from the early 1980s to today. Furthermore, the technological changes that have occurred over the same time period provide a natural experiment for how the digital age has impacted cultural transmission more broadly [Acerbi2016].

The aim of the current study was to determine whether drum breaks are culturally transmitted through musical collaboration networks, and to identify the factors driving the evolution of these networks. We hypothesized that (1) drum breaks are culturally transmitted through musical collaboration networks, and that (2) the strength of cultural transmission via collaboration would decrease after the year 2000. Previous studies have investigated similar questions using diffusion curve analysis [Rossman2008], but the validity of inferring transmission mechanisms from cumulative acquisition data has been called into question [Laland2003]. Instead, we applied network-based diffusion analysis (NBDA), a recently developed statistical method for determining whether network structure biases the emergence of a novel behavior in a population [Franz2009]. As NBDA is most useful in identifying social learning, an ability that is assumed to be present in humans, it has been primarily applied to non-human animal models such as birds, whales, and primates [Aplin2014, Allen2013, Hobaiter2014], but the ability to incorporate individual-level variables to nodes makes it uniquely suited to determining what factors bias diffusion more generally. Additionally, we hypothesized that (3) collaboration probability would be decoupled from geographic proximity after the year 2000. To investigate this we applied separable temporal exponential random graph modeling (STERGM), a dynamic extension of ERGM for determining the variables that bias network evolution [Krivitsky2010].

Methods

All data used in the current study was collected in September of 2018. For the primary analysis, the three most heavily sampled drum breaks of all time, “Amen, Brother” by The Winstons, “Think (About It)” by Lyn Collins, and “Funky Drummer” by James Brown, were identified using WhoSampled (https://www.whosampled.com/). The release year and credits for each song listed as having sampled each break were collected using Rcrawler (v 0.1.2), an R package designed for bulk data scraping. To avoid the need for name disambiguation, only artists, producers, and remixers with active Discogs links and associated IDs were included in the dataset. In order to investigate potential shifts in transmission strength around 2000, the same method was used to collect data for the eight songs in the “Most Sampled Tracks” on WhoSampled that were released after 1990 (see supporting information). One of these, “I’m Good” by YG, was excluded from the analysis because the sample is a producer tag used by a single artist. Each set of sampling events collected from WhoSampled was treated as a separate diffusion.

Collaboration data was retrieved from Discogs (https://www.discogs.com/), a crowdsourced database of music releases. Discogs was utilized because it has more extensive coverage than Allmusic (including unofficial releases) [VanVenrooij2015]. Collaborative releases in the database were extracted using xml2 (v 1.2.0), an R package designed for XML parsing, and converted to a master list of pairwise collaborations. For each diffusion, pairwise collaborations including two artists in the dataset were used to construct collaboration networks, in which nodes correspond to artists and weighted links correspond to collaboration number. Although some indirect connections between artists were missing from these subnetworks, conducting the analysis with the full dataset was computationally prohibitive and incomplete networks have been routinely used for NBDA in the past [Aplin2012, Allen2013, Aplin2014].
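The conversion from releases to weighted pairwise links can be sketched as follows. This is a minimal Python illustration rather than the study's R code: the input format (each release as a list of credited artist IDs) and the function name `build_collaboration_network` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_collaboration_network(releases, artists_in_diffusion):
    """Count pairwise collaborations among the artists in one diffusion.

    releases: iterable of lists of artist IDs credited on the same release
              (hypothetical input format).
    artists_in_diffusion: set of artist IDs that appear in the diffusion.
    Returns a Counter mapping a sorted artist pair to the number of shared
    releases, which serves as the link weight.
    """
    weights = Counter()
    for credits in releases:
        # Keep only artists in the dataset and ignore duplicate credits.
        members = sorted(set(credits) & artists_in_diffusion)
        for a, b in combinations(members, 2):
            weights[(a, b)] += 1
    return weights


releases = [["A", "B", "C"], ["A", "B"], ["B", "D"], ["E"]]
network = build_collaboration_network(releases, {"A", "B", "C", "E"})
```

Pairs that involve an artist outside the diffusion (here "D") are dropped, which mirrors the restriction to collaborations between two artists in the dataset.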

Individual-level variables for artists included in each collaboration network were collected from MusicBrainz (https://musicbrainz.org/), a crowdsourced database with more complete artist information than Discogs, and Spotify (https://www.spotify.com/), one of the most popular music streaming services. Gender and geographic location were retrieved from the MusicBrainz API. Whenever it was available, the “begin area” of the artist, or the city in which they began their career, was used instead of their “area”, or country of affiliation, to maximize geographic resolution. Longitudes and latitudes for each location, retrieved using the Data Science Toolkit and Google Maps, were used to calculate each artist’s mean geographic distance from other individuals. Albunack (http://www.albunack.net/), an online tool which draws from both MusicBrainz and Discogs, was used to convert IDs between the two databases. Popularity and followers were retrieved using the Spotify API. An artist’s popularity, an index of streaming count that ranges between 0 and 100, is a better indicator of their long-term success, whereas number of followers is a better indicator of current success. Discogs IDs are incompatible with the Spotify API, so artist names were URL-encoded and used as text search terms.
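A mean geographic distance of this kind could be computed roughly as below. The text does not state which distance formula was used, so the great-circle (haversine) distance, the Earth radius constant, and the function names are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_distances(coords):
    """coords: dict artist -> (lat, lon).

    Returns dict artist -> mean distance to every other artist, or None
    when there is no other artist to compare against.
    """
    result = {}
    for artist, (lat, lon) in coords.items():
        others = [haversine_km(lat, lon, *coords[o])
                  for o in coords if o != artist]
        result[artist] = sum(others) / len(others) if others else None
    return result


example = mean_distances({"x": (0, 0), "y": (0, 90), "z": (0, 180)})
```

Artists without coordinates would need to be excluded before this step or handled as missing values afterwards.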

In order to identify whether social transmission played a role in sample acquisition, order of acquisition diffusion analysis (OADA) was conducted using the R script for NBDA (v 1.2.13) provided on the Laland lab’s website (https://lalandlab.st-andrews.ac.uk/freeware/). OADA was utilized instead of time of acquisition diffusion analysis (TADA) because it makes no assumptions about the baseline rate of acquisition [Franz2009]. For each artist, order of acquisition was determined by the year that they first used the sample in their music. Sampling events from the same year were given the same order. Gender, popularity, followers, and mean distance were included as predictor variables. For gender, females were coded as -1, males were coded as 1, and individuals with other identities or missing values were coded as 0. For popularity, followers, and mean distance each variable was centered around zero and missing values were replaced with the mean according to [Allen2013]. Asocial, additive, and multiplicative models were fit to all three diffusions collectively with every possible combination of individual-level variables. Standard information theoretic approaches were used to rank the models according to Akaike’s Information Criterion corrected for sample size (AICc). Models with a ΔAICc < 2 were considered to have the best fit [Burnham2002]. The best fitting model with the most individual-level variables was run separately to assess the effects of each variable on social transmission. Effect sizes were calculated according to [Allen2013]. An additional OADA was conducted using the seven diffusions from after 1990 without individual-level variables. An additive model was fit to the OADA, and separate social transmission parameters were calculated for each diffusion to identify differences in transmission strength. Additive and multiplicative models give identical results in the absence of individual-level variables, so no model comparison was necessary.
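The variable coding described above can be illustrated with a short Python sketch. It is not the study's R code: the function names and the gender labels `"female"` and `"male"` are hypothetical, and dense ranking is only one assumed way to give events from the same year the same order.

```python
def acquisition_order(first_year):
    """Map artist -> order of acquisition; a shared year gives a shared order."""
    years = sorted(set(first_year.values()))
    rank = {year: i + 1 for i, year in enumerate(years)}
    return {artist: rank[year] for artist, year in first_year.items()}


def code_gender(value):
    """Female -> -1, male -> 1, any other or missing value -> 0."""
    return {"female": -1, "male": 1}.get(value, 0)


def center_with_mean_imputation(values):
    """Center observed values on zero and impute missing values (None).

    Replacing a missing value with the mean and then centering is the
    same as assigning it 0 on the centered scale.
    """
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [0.0 if v is None else v - mean for v in values]


orders = acquisition_order({"a": 1990, "b": 1988, "c": 1990})
centered = center_with_mean_imputation([10, None, 30])
```

Because imputed values sit exactly at zero after centering, artists with missing data contribute nothing to the estimated effect of that variable.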

In order to assess the effects of individual-level variables on network evolution, STERGM was conducted using statnet (v 2016.9), an R package for network analysis and simulation. Collaboration events involving artists from each diffusion were combined to construct static collaboration subnetworks for each year between 1984 and 2017, which were then converted into an undirected, unweighted dynamic network. Early years not continuous with the rest of the event data (i.e. 1978 and 1981) were excluded from the dynamic network. In order to determine whether the variables biasing network structure have changed over time, the analysis was conducted separately with the data from 1984-1999 and 2000-2017. For each time period a set of STERGM models with every possible combination of individual-level variables were fit to the dynamic network using conditional maximum likelihood estimation (CMLE). Although STERGM can be used to separately model both the formation and dissolution of links, this analysis was restricted to the former. Gender, popularity, and followers were included to investigate homophily, while mean distance was included to assess its effect on link formation. The models from each period were ranked according to AIC, and the best fitting models (ΔAIC < 2) with the most individual-level variables were run separately to assess the effects of each variable on network evolution.
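The data structure behind the dynamic network, and what "formation" means between consecutive years, can be sketched in Python as follows. This only illustrates the bookkeeping under an assumed event format of `(year, artist_a, artist_b)`; the model fitting itself was done with statnet and is not reproduced here, and the function names are hypothetical.

```python
from collections import defaultdict


def yearly_networks(events):
    """Build one undirected, unweighted edge set per year.

    events: iterable of (year, artist_a, artist_b) collaboration events.
    Returns dict year -> set of frozenset edges, so repeated or reversed
    pairs within a year collapse into a single link.
    """
    nets = defaultdict(set)
    for year, a, b in events:
        if a != b:  # ignore self-links
            nets[year].add(frozenset((a, b)))
    return dict(nets)


def formed_edges(nets, year):
    """Edges present in `year` that were absent in the previous year."""
    return nets.get(year, set()) - nets.get(year - 1, set())


events = [
    (1984, "A", "B"),
    (1984, "B", "A"),
    (1985, "A", "B"),
    (1985, "B", "C"),
]
nets = yearly_networks(events)
new_in_1985 = formed_edges(nets, 1985)
```

A formation model asks which node attributes make transitions like the one in `new_in_1985` more or less likely.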

Results

The three most heavily sampled drum breaks of all time were collectively sampled 6530 times (n1 = 2966, n2 = 2099, n3 = 1465). 4462 (68.33%) of these sampling events were associated with valid Discogs IDs, corresponding to 2432 unique artists (F: n = 143, 5.88%; M: n = 1342, 55.18%; Other or NA: n = 947, 38.94%), and included in the primary OADA and STERGM. The eight samples released after 1990 were collectively sampled 1751 times (n1 = 284, n2 = 260, n3 = 248, n4 = 198, n5 = 194, n6 = 193, n7 = 192, n8 = 182). 1305 (74.53%) of these sampling events were associated with valid Discogs IDs, corresponding to 1270 unique artists, and included in the additional OADA. All analyses were conducted in R (v 3.3.3).

0.1 NBDA

Both of the best fitting models from the primary OADA were multiplicative. The results for the second, which included all four individual-level variables, can be seen in Table 1. In support of our first hypothesis, a likelihood ratio test found strong evidence for social transmission over asocial learning (ΔAICc = 292; p < 0.001). Based on the effect sizes, transmission appears to be more likely among females (p < 0.001) and less likely among artists who are more popular (p < 0.001) and have more followers (p < 0.001). Mean distance is not a significant predictor of transmission (p = 0.60). The diffusion network and diffusion curve for all three drum breaks included in the primary OADA are shown in Figure 1 and S1, respectively. All other models fit to the primary OADA can be found in the supporting information.

Multiplicative Model - Order of Acquisition
                 Estimate   Effect size   p
Gender           -0.23      0.64          < 0.001
Popularity       -0.011     0.82          < 0.001
Followers        -7.6E-8    0.92          < 0.001
Mean distance    -7.1E-9    0.99          0.60

Likelihood Ratio Test
                               AICc    p
With social transmission       43346   < 0.001
Without social transmission    43638

Table 1: The results of the multiplicative model for the OADA including all individual-level variables. The top panel shows the model estimate, effect size, and p-value for each individual-level variable. The bottom panel shows the AICc for the model with and without social transmission and the p-value from the likelihood ratio test.
Figure 1: The diffusion of all three drum breaks through the combined collaboration network. At each time point uninformed individuals are shown as white squares, previously informed individuals are shown as blue circles, and newly informed individuals are shown as red circles.

The results from the additional OADA, conducted using the seven diffusions from after 1990, can be found in the supporting information. A likelihood ratio test found strong evidence for social transmission overall (ΔAICc = 88; p < 0.001). Contrary to our second hypothesis, linear regression found no significant relationships between either mean year of diffusion and social transmission estimate (R² = 0.20, p = 0.31) or median year of diffusion and social transmission estimate (R² = 0.17, p = 0.36) (see Figure S2).

0.2 STERGM

For both time periods the second best fitting STERGM models (ΔAIC < 2) included all four individual-level variables, the results of which can be seen in Table 2. All other models can be found in the supporting information. Across both periods there appears to be homophily based on popularity (p < 0.001) and gender (M: p < 0.001; F: ps < 0.05). In support of our third hypothesis, mean distance negatively predicts link formation only before 2000 (p < 0.001). Additionally, there is a heterophilic effect of followers only after 2000 (p < 0.001). Based on the effect sizes, there has been a nearly three-fold decrease in the strength of homophily among females. Conversely, the strength of homophily by popularity has actually increased since 2000. Linear regression found significant positive relationships between both popularity and number of collaborations (R² = 0.048, p < 0.001) and followers and number of collaborations (R² = 0.090, p < 0.001) (see Figure S3).

A goodness-of-fit analysis was conducted by generating simulated networks (n = 100) from the parameters of the best fitting model and comparing them to the observed network statistics [Hunter2008]. For both time periods, the global statistics (i.e. gender, popularity, followers, mean distance) from the simulated networks were not significantly different from those observed, indicating that both models are good fits for the variables in question. Structural statistics (i.e. degree, edgewise shared partner, minimum geodesic distance) from the simulated networks were significantly different from those observed, indicating that neither model is a good fit for the structural properties of the network. The results of this analysis can be found in the supporting information.

STERGM           1984-1999                 2000-2017
                 Effect size   p           Effect size   p
Gender (F)       6.83          < 0.001     2.41          0.01
Gender (M)       1.72          < 0.001     2.41          < 0.001
Popularity       0.78          < 0.001     0.53          < 0.001
Followers        1.03          0.38        2.03          < 0.001
Mean distance    0.87          < 0.001     0.96          0.30

Table 2: The results of the STERGM analyses for before and after 2000. The table shows the effect size and p-value for gender, popularity, followers, and mean distance during each time period.

Discussion

Using high-resolution collaboration and longitudinal diffusion data, we have provided the first quantitative evidence that music samples are culturally transmitted via collaboration between artists. Additionally, in support of the widespread assertion that the internet has delocalized artist communities, we have found evidence that geographic proximity no longer biases the structure of musical collaboration networks after the year 2000. Given that the strength of transmission has not weakened over the same time period, this finding indicates that collaboration remains a key cultural transmission mode for music sampling traditions. This result supports the idea that the internet has enhanced rather than disrupted existing social interactions [DiMaggio2001].

Gender appears to play a key role in both network structure and cultural transmission. Across the entire time period, collaborations were more likely to occur between individuals of the same gender. Additionally, the probability of cultural transmission appears to be much higher for female artists. This effect could be a result of the much higher levels of homophily among women before 2000. Previous work has suggested that high levels of gender homophily are associated with gender disparity [Glass2017, Crewe2018, Jadidi2018], which is consistent with the historic marginalization of women in music production communities [Ebare2003, Baker2008, Whelan2009]. Although the proportion of female artists in the entire dataset is extremely low (6%), the reduction in homophily among female artists after 2000 could be reflective of increasing inclusivity [Smith2014].

Artists with similar levels of popularity were also more likely to collaborate with each other. The increase in homophily by popularity after 2000 could be the result of an increase in skew, whereby fewer artists take up a greater proportion of the music charts [Ordanini2016]. In addition, the probability of cultural transmission appears to be higher among less popular artists, even though they are slightly less collaborative. This effect could be linked to cultural norms within “underground” music production communities. In these communities, collective cultural production is sometimes prioritized over individual recognition [Thornton1995, Hesmondhalgh1998]. This principle is best demonstrated by the historic popularity of the white-label release format, where singles are pressed to blank vinyl and distributed without artist information [Thornton1995, Hesmondhalgh1998]. In more extreme cases, individual artists who experience some level of mainstream success or press coverage risk losing credibility, and may even be perceived as undermining the integrity of their music community [Hesmondhalgh1998, Noys1995]. Concerns about credibility could cause individuals to selectively copy less popular artists or utilize more rare samples (e.g. De La Soul’s refusal to sample James Brown and George Clinton because of their use by other popular groups [Lena2004]). Future research should investigate whether the “high prestige attached to obscurity” [Hesmondhalgh1998] in these communities may be driving a model-based bias for samples used by less popular artists or a frequency-based bias for samples that are more rare in the population [Boyd1985]. A frequency-based novelty bias was recently identified in Western classical music using agent-based modeling [Nakamura2018], and similar methods could be utilized for sampling.

Similarly to popularity, the number of followers an individual has negatively predicts transmission probability. However, artists with similar numbers of followers were actually less likely to collaborate with each other after 2000. This result could be due to the fact that followers is a better indicator of current popularity, but has lower resolution further back in time. Newer artists with inflated follower counts who collaborate with older, historically-important artists with lower follower counts may still be expressing homophily based on overall popularity.

There are several limitations to this study that should be highlighted. Firstly, Discogs only documents official releases, which means that more recent releases on streaming sites like Bandcamp and Soundcloud are not well-represented in the dataset. Additionally, the time lag inherent in the user editing of WhoSampled means that older transmission records are more complete. Algorithms for sample-detection [Hockman2015] may allow researchers to reconstruct more complete transmission records in the future, but these approaches are not yet publicly available. Even with incomplete data, the fact that the social network was reconstructed from collaborations rather than temporal co-occurrence [Aplin2014] reduces the risk of observation error in the NBDA [Franz2010].

The results of this study provide valuable insight into how demographic variables, particularly gender and popularity, have biased both cultural transmission and the evolution of collaboration networks going into the digital age. In addition, we provide evidence that collaboration remains a key transmission mode of music sampling traditions despite the delocalization of communities by the internet. Future research should investigate whether decreased homophily among females is actually linked to greater inclusivity in the music industry (e.g. booking rates, financial compensation, media coverage), as well as whether the inverse effect of popularity on cultural transmission probability is a result of a model-based bias for obscurity or a frequency-based bias for novelty.

Acknowledgments

I would like to thank David Lahti and Carolyn Pytte, as well as all members of the Lahti lab, for their valuable conceptual and analytical feedback.

Data Availability Statement

All R scripts and data used in the study are available in the Harvard Dataverse repository: https://doi.org/10.7910/DVN/Q02JJQ.


Supporting information

NBDA

0.1 Primary OADA

The results of the multiplicative NBDA model fit to the primary OADA with all four individual-level variables are shown below.

Summary of Multiplicative Social Transmission Model
Order of acquisition data
Unbounded parameterisation.
Coefficients:
                           Estimate   Bounded           se           z            p
Social transmission 1  1.747643e-01 0.1487654           NA          NA           NA
gender                -2.266729e-01        NA 2.739639e-02  -8.2738230 1.110223e-16
popularity            -1.082929e-02        NA 9.314274e-04 -11.6265576 0.000000e+00
followers             -7.551632e-08        NA 1.527092e-08  -4.9451063 7.610233e-07
meandist              -7.090714e-09        NA 1.369750e-08  -0.5176646 6.046923e-01
Likelihood Ratio Test for Social Transmission:
Null model includes all other specified variables
Social transmission and asocial learning assumed to combine multiplicatively
                            Df LogLik   AIC  AICc     LR  p
With Social Transmission     5  21668 43346 43346 294.22  0
Without Social Transmission  4  21815 43638 43638
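The figures in the likelihood ratio test above can be cross-checked with a few lines of Python. This is an illustrative sketch with hypothetical function names, not part of the NBDA script. It assumes that the LogLik column holds negative log-likelihoods, which the reported values suggest because 2 × 21668 + 2 × 5 = 43346, and that the test has one degree of freedom (5 versus 4 parameters).

```python
import math


def aic(neg_log_lik, n_params):
    """AIC from a negative log-likelihood and a parameter count."""
    return 2 * neg_log_lik + 2 * n_params


def lrt_p_value_df1(lr_statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For one degree of freedom the chi-square survival function reduces to
    erfc(sqrt(x / 2)), so no external statistics library is needed.
    """
    return math.erfc(math.sqrt(lr_statistic / 2))


aic_social = aic(21668, 5)
aic_asocial = aic(21815, 4)
p_value = lrt_p_value_df1(294.22)
```

The resulting p-value is far below any conventional threshold, which is why the R output prints it as 0.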

The results of all NBDA models fit to the primary OADA. In the “Additive?” column TRUE means the model was additive, FALSE means the model was multiplicative, and NA means the model was asocial. In the “ILVs”, or individual-level variables, column the numbers correspond to the variables included in the model (1: gender; 2: popularity; 3: followers; 4: mean distance).

Additive?       ILVs    Social? AICc                    deltaAICc
FALSE           1 2 3   social  43344.1808622338        0
FALSE           1 2 3 4 social  43345.9171166924        1.74
FALSE           1 2     social  43373.8584641891        29.68
FALSE           1 2 4   social  43375.6738168026        31.49
FALSE           2 3     social  43407.5127988714        63.33
FALSE           2 3 4   social  43409.4693873732        65.29
TRUE            1 2     social  43426.2129577337        82.03
FALSE           2       social  43437.1101322225        92.93
FALSE           2 4     social  43439.0948370825        94.91
FALSE           1 3     social  43472.3636789154        128.18
FALSE           1 3 4   social  43473.9221794829        129.74
TRUE            2       social  43485.1298678497        140.95
FALSE           3       social  43530.4325608775        186.25
FALSE           3 4     social  43532.2902707822        188.11
FALSE           1       social  43609.4787479712        265.3
FALSE           1 4     social  43611.1862604718        267.01
TRUE            1       social  43612.45988987          268.28
NA              1 2     asocial 43640.9959016707        296.82
NA              0       social  43662.7306996825        318.55
FALSE           4       social  43664.6556869778        320.47
NA              2       asocial 43681.3580034879        337.18
NA              3       asocial 43770.164437537         425.98
NA              2 3     asocial 43772.1666674654        427.99
NA              1 3     asocial 43772.166667502         427.99
TRUE            3       social  43774.1519991679        429.97
NA              1 2 3   asocial 43774.1700138118        429.99
TRUE            2 3     social  43776.1553454759        431.97
TRUE            1 3     social  43776.1553455143        431.97
TRUE            1 2 3   social  43778.1598091375        433.98
NA              1       asocial 43779.4569488198        435.28
NA              3 4     asocial 43801.9449913924        457.76
NA              2 3 4   asocial 43803.9483377061        459.77
NA              1 3 4   asocial 43803.9483377387        459.77
NA              1 2 3 4 asocial 43805.9528013678        461.77
NA              0       asocial 43815.27922266          471.1
NA              4       asocial 43815.6063399776        471.43
TRUE            3 4     social  43816.0877037818        471.91
NA              2 4     asocial 43817.6085698873        473.43
TRUE            4       social  43817.6085699399        473.43
NA              1 4     asocial 43817.6085699426        473.43
TRUE            2 3 4   social  43818.0921674118        473.91
TRUE            1 3 4   social  43818.0921674435        473.91
TRUE            2 4     social  43819.611916231         475.43
NA              1 2 4   asocial 43819.6119162337        475.43
TRUE            1 4     social  43819.6119162863        475.43
TRUE            1 2 3 4 social  43820.0977493238        475.92
TRUE            1 2 4   social  43821.6163798927        477.44
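The deltaAICc column is each model's AICc minus the smallest AICc in the set, and models below the threshold of 2 are treated as best fitting. A minimal Python sketch of that ranking, using the three top rows of the table and hypothetical function and label names, might look like this:

```python
def rank_by_delta(models):
    """models: dict label -> AICc.

    Returns a list of (label, AICc, delta) tuples sorted by AICc, where
    delta is the difference from the lowest AICc in the set.
    """
    best = min(models.values())
    rows = [(label, score, score - best) for label, score in models.items()]
    return sorted(rows, key=lambda row: row[1])


def best_fitting(models, threshold=2.0):
    """Labels of the models whose delta falls below the threshold."""
    return [label for label, _, delta in rank_by_delta(models)
            if delta < threshold]


aicc = {
    "multiplicative 1 2 3": 43344.1808622338,
    "multiplicative 1 2 3 4": 43345.9171166924,
    "multiplicative 1 2": 43373.8584641891,
}
ranked = rank_by_delta(aicc)
top_models = best_fitting(aicc)
```

With these inputs, `top_models` contains the two multiplicative models reported as best fitting in the main text.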
Figure S1: The combined diffusion curve for all three drum breaks included in the primary OADA. The proportion of informed individuals is on the y-axis, and the year is on the x-axis. Although recent research suggests that inferring acquisition modes from diffusion curves is unreliable, it appears that the curve may have the S-shape indicative of social transmission prior to the early-2000s.

0.2 Additional OADA

The eight songs in the “Most Sampled Tracks” on WhoSampled that were released after 1990 are shown below. The fifth song, “I’m Good” by YG, was excluded from the additional OADA because it is a producer tag used by a single artist.

  1. “Crash Goes Love (Yell Apella)” by Loleatta Holloway (1992)

  2. “Shook Ones Part II” by Mobb Deep (1994)

  3. “C.R.E.A.M.” by Wu-Tang Clan (1993)

  4. “Sound of Da Police” by KRS-One (1993)

  5. “I’m Good” by YG (2011) [excluded producer tag]

  6. “Juicy” by The Notorious B.I.G. (1994)

  7. “Sniper” by DJ Trace and Pete Parsons (1999)

  8. “Who U Wit?” by Lil Jon and The East Side Boyz (1997)

The results of the additive NBDA model fit to the additional OADA are shown below. Because the fifth song was excluded, social transmission estimates 5, 6, and 7 in the output correspond to songs 6, 7, and 8 in the list above.

Summary of Additive Social Transmission Model
Order of acquisition data
Unbounded parameterisation
Coefficients
                        Estimate    Bounded
Social transmission 1 0.13558602 0.11939740
Social transmission 2 0.28805974 0.22363849
Social transmission 3 0.05184340 0.04928814
Social transmission 4 0.60154771 0.37560399
Social transmission 5 0.06816578 0.06381573
Social transmission 6 0.07600555 0.07063677
Social transmission 7 0.01547410 0.01523830
Likelihood Ratio Test for Social Transmission:
Null model includes all other specified variables
Social transmission and asocial learning assumed to combine additively
                            Df LogLik   AIC  AICc     LR  p
With Social Transmission     7   6450 12914 12914 101.99  0
Without Social Transmission  0   6501 13002 13002
Figure S2: The relationship between diffusion years and transmission strengths for all seven diffusions included in the additional OADA. The mean (left) and median (right) years of diffusion are on the x-axis, and the social transmission estimates from the additive model are on the y-axis. Linear regression found no significant relationships between either mean year of diffusion and social transmission estimate (R2 = 0.20, p = 0.31) or median year of diffusion and social transmission estimate (R2 = 0.17, p = 0.36).
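The "Bounded" column in the additive model summary above appears to be a simple transformation of the unbounded estimate s, namely s / (1 + s), which maps any non-negative estimate into the interval from 0 to 1. This relationship is inferred from the reported numbers rather than stated in the text, and the function name below is hypothetical. A short Python check:

```python
def bounded(s):
    """Map an unbounded social transmission estimate onto [0, 1)."""
    return s / (1 + s)


# (Estimate, Bounded) pairs copied from the additive model summary.
reported = [
    (0.13558602, 0.11939740),
    (0.28805974, 0.22363849),
    (0.05184340, 0.04928814),
    (0.60154771, 0.37560399),
    (0.06816578, 0.06381573),
    (0.07600555, 0.07063677),
    (0.01547410, 0.01523830),
]
max_error = max(abs(bounded(s) - b) for s, b in reported)
```

All seven reported pairs agree with this transformation to within rounding error.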

STERGM

The results of all formation models of the STERGM fit to the data from 1984-1999 are shown below. In the “ILVs” (individual-level variables) column, the numbers correspond to the variables included in the model (1: gender; 2: popularity; 3: followers; 4: mean distance), and 0 denotes the model with no individual-level variables.

ILVs    AIC                     deltaAIC
1 2 4   8364.97161369026        0
1 2 3 4 8366.24656444229        1.27495075203478
1 2     8384.05280723982        19.0811935495585
1 2 3   8385.29456801713        20.3229543268681
2 4     8404.6082794331         39.6366657428443
2 3 4   8405.18598860689        40.214374916628
1 3 4   8420.09735430591        55.125740615651
1 4     8422.0333077386         57.0616940483451
2       8423.86012894101        58.8885152507573
2 3     8424.43170945253        59.4600957622752
1 3     8438.77183840238        73.8002247121185
1       8440.62374953367        75.6521358434111
3 4     8460.98939725943        96.017783569172
4       8461.70808108896        96.7364673987031
3       8479.93401312549        114.96239943523
0       8480.63135860674        115.659744916484
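The deltaAIC column is each model's AIC minus the lowest AIC in the table. The sketch below recomputes it from the AIC values above and then lists the models that fall within a deltaAIC cutoff. The cutoff of 2 is an assumption made for illustration (it is a common rule of thumb, not a value stated in this section), and the variable names are hypothetical.

```python
# AIC values for the 1984-1999 formation models, copied from the table above.
aic_by_model = {
    "1 2 4": 8364.97161369026, "1 2 3 4": 8366.24656444229,
    "1 2": 8384.05280723982, "1 2 3": 8385.29456801713,
    "2 4": 8404.6082794331, "2 3 4": 8405.18598860689,
    "1 3 4": 8420.09735430591, "1 4": 8422.0333077386,
    "2": 8423.86012894101, "2 3": 8424.43170945253,
    "1 3": 8438.77183840238, "1": 8440.62374953367,
    "3 4": 8460.98939725943, "4": 8461.70808108896,
    "3": 8479.93401312549, "0": 8480.63135860674,
}


def n_ilvs(label):
    """Number of individual-level variables in a model label ("0" means none)."""
    return 0 if label == "0" else len(label.split())


# deltaAIC: difference from the lowest AIC among all candidate models.
best = min(aic_by_model, key=aic_by_model.get)
delta = {m: a - aic_by_model[best] for m, a in aic_by_model.items()}

CUTOFF = 2.0  # assumed rule-of-thumb threshold, not taken from the source
candidates = [m for m, d in delta.items() if d < CUTOFF]
most_ilvs = max(candidates, key=n_ilvs)

for m in sorted(delta, key=delta.get):
    print(f"{m:8s} {aic_by_model[m]:12.3f} {delta[m]:9.3f}")
print("within cutoff:", candidates, "| most ILVs:", most_ilvs)
```

With this cutoff, two models remain as candidates, and the one with the most individual-level variables is the full model (1 2 3 4).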

The results of the best-fitting formation model of the STERGM that includes the most individual-level variables, fit to the data from 1984-1999, are shown below.

==========================
Summary of model fit
==========================
Formula:   y.form ~ edges + nodecov("meandist") + absdiff("popularity") +
    absdiff("followers") + nodematch("gender", diff = TRUE)
Iterations:  10 out of 20
Monte Carlo MLE Results:
                      Estimate Std. Error MCMC % z value Pr(>|z|)
edges               -9.214e+00  1.020e-01      0 -90.295   <1e-04 ***
nodecov.meandist    -1.401e-07  3.536e-08      0  -3.962   <1e-04 ***
absdiff.popularity  -2.474e-02  3.500e-03      0  -7.069   <1e-04 ***
absdiff.followers    2.263e-08  2.566e-08      0   0.882    0.378
nodematch.gender.-1  1.922e+00  3.137e-01      0   6.126   <1e-04 ***
nodematch.gender.0   2.249e-01  1.947e-01      0   1.155    0.248
nodematch.gender.1   5.440e-01  1.063e-01      0   5.118   <1e-04 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
     Null Deviance: 6772784  on 4885531  degrees of freedom
 Residual Deviance:    8352  on 4885524  degrees of freedom
AIC: 8366    BIC: 846

The results of the goodness-of-fit analysis of the formation model of the STERGM with the most individual-level variables fit to the data from 1984-1999 are below.

================================
Formation model goodness of fit:
================================
Goodness-of-fit for degree
     obs   min     mean   max MC p-value
0  11383 11065 11141.98 11223       0.00
1    871  1129  1210.66  1280       0.00
2    167   103   119.53   137       0.00
3     49    15    21.40    29       0.00
4     15     5     7.90    11       0.00
5      2     0     0.50     3       0.10
6      6     0     0.96     2       0.00
7      1     0     0.05     1       0.10
8     12     3     6.57     8       0.00
9      1     0     0.45     4       0.72
10     1     0     0.00     0       0.00
11     2     0     0.87     1       0.00
12     2     0     1.08     2       0.24
13     0     0     0.05     1       1.00
Goodness-of-fit for edgewise shared partner
     obs min   mean max MC p-value
esp0 586 668 709.95 753       0.00
esp1 107  49  49.01  50       0.00
esp2  42  19  19.00  19       0.00
esp3   6   2   2.99   3       0.00
esp4   0   0   0.01   1       1.00
esp7  71  35  35.99  36       0.00
esp8   1   0   0.01   1       0.02
Goodness-of-fit for minimum geodesic distance
         obs      min        mean      max MC p-value
1        813      775      816.96      860       0.80
2        463      194      223.60      274       0.00
3        273       40       70.11      111       0.00
4        170       14       24.73       47       0.00
5         84        6       10.40       30       0.00
6         32        0        1.67       18       0.00
7         11        0        0.28        9       0.00
8          1        0        0.02        2       0.02
Inf 78266969 78267514 78267668.23 78267774       0.00
Goodness-of-fit for model statistics
                              obs           min          mean           max MC p-value
edges                      813.00        775.00        816.96        860.00       0.80
nodecov.meandist    -303419284.30 -386203374.87 -308269525.41 -209159754.62       0.80
absdiff.popularity       15000.66      13859.22      15133.10      16199.37       0.82
absdiff.followers    618981622.77  529697962.84  630469916.52  747967165.94       0.76
nodematch.gender.-1         21.00         13.00         21.10         29.00       1.00
nodematch.gender.0          70.00         57.00         69.97         86.00       1.00
nodematch.gender.1         423.00        393.00        422.94        452.00       0.98
==================================
Dissolution model goodness of fit:
==================================
Goodness-of-fit for degree
    obs   min     mean   max MC p-value
0 12439 12406 12437.70 12464       0.98
1    68    46    70.77    98       0.88
2     5     0     2.92     9       0.36
3     0     0     0.55     3       1.00
4     0     0     0.06     1       1.00
Goodness-of-fit for edgewise shared partner
     obs min  mean max MC p-value
esp0  36  25 38.90  57       0.74
esp1   3   0  0.34   4       0.22
esp2   0   0  0.01   1       1.00
Goodness-of-fit for minimum geodesic distance
         obs      min        mean      max MC p-value
1         39       25       39.25       57       1.00
2          2        0        4.37       14       0.74
3          0        0        1.43        8       1.00
4          0        0        0.47        5       1.00
5          0        0        0.12        3       1.00
6          0        0        0.02        1       1.00
Inf 78268775 78268748 78268770.34 78268789       0.76
Goodness-of-fit for model statistics
             obs        min       mean        max MC p-value
edges      39.00      25.00      39.25      57.00       1.0
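Each row of the goodness-of-fit tables above compares an observed network statistic with its distribution across simulated networks. The sketch below shows one way such a row could be computed. It assumes a two-sided Monte Carlo p-value of the form min(1, 2 * min(P(sim <= obs), P(sim >= obs))), and it uses a synthetic set of 100 simulated counts chosen only to be consistent with the degree-13 row of the formation model (min 0, mean 0.05, max 1). Neither the number of simulations nor the individual simulated values are given in the source, and the function name is hypothetical.

```python
def mc_gof_row(observed, simulated):
    """Summarise simulated statistics and compute a two-sided Monte Carlo p-value.

    Assumed convention: p = min(1, 2 * min(P(sim <= obs), P(sim >= obs))).
    """
    n = len(simulated)
    lower = sum(s <= observed for s in simulated) / n
    upper = sum(s >= observed for s in simulated) / n
    return {
        "obs": observed,
        "min": min(simulated),
        "mean": sum(simulated) / n,
        "max": max(simulated),
        "p": min(1.0, 2 * min(lower, upper)),
    }


# Synthetic illustration (hypothetical draws): 95 simulated networks with no
# degree-13 nodes and 5 with exactly one, compared with an observed count of 0.
row_degree_13 = mc_gof_row(0, [0] * 95 + [1] * 5)

# Synthetic illustration: no simulated network ever reaches the observed count,
# as in the degree-10 row (obs 1, max 0).
row_degree_10 = mc_gof_row(1, [0] * 100)

print(row_degree_13)
print(row_degree_10)
```

Read this way, a p-value near 1 means the observed statistic sits well inside the simulated range, while a p-value near 0 means the simulated networks rarely or never reproduce it.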

The results of all formation models of the STERGM fit to the data from 2000-2017 are shown below. In the “ILVs” (individual-level variables) column, the numbers correspond to the variables included in the model (1: gender; 2: popularity; 3: followers; 4: mean distance), and 0 denotes the model with no individual-level variables.

ILVs    AIC                     deltaAIC
1 2 3   13529.2558427881        0
1 2 3 4 13530.1094146641        0.853571875952184
2 3 4   13705.8555144574        176.599671669304
2 3     13706.3869501296        177.131107341498
1 2     13765.2875133222        236.031670534052
1 2 4   13765.5738510219        236.318008233793
1 3     13927.5697519705        398.313909182325
1 3 4   13928.7255045716        399.469661783427
2 4     13965.980285421         436.724442632869
2       13967.2225550059        437.966712217778
1       13999.0397193497        469.78387656156
1 4     13999.8463704623        470.590527674183
3 4     14110.16599843          580.910155641846
3       14110.50174635          581.245903561823
4       14197.1997775007        667.943934712559
0       14198.0100768805        668.75423409231

The results of the best-fitting formation model of the STERGM that includes the most individual-level variables, fit to the data from 2000-2017, are shown below.

==========================
Summary of model fit
==========================
Formula:   y.form ~ edges + nodecov("meandist") + absdiff("popularity") +
    absdiff("followers") + nodematch("gender", diff = TRUE)
Iterations:  11 out of 20
Monte Carlo MLE Results:
                      Estimate Std. Error MCMC %  z value Pr(>|z|)
edges               -8.522e+00  7.740e-02      0 -110.101  < 1e-04 ***
nodecov.meandist    -2.005e-08  1.915e-08      0   -1.047 0.295149
absdiff.popularity  -5.360e-02  3.056e-03      0  -17.539  < 1e-04 ***
absdiff.followers    1.883e-07  9.484e-09      0   19.850  < 1e-04 ***
nodematch.gender.-1  8.793e-01  3.607e-01      0    2.437 0.014790 *
nodematch.gender.0  -1.108e+00  2.962e-01      0   -3.742 0.000183 ***
nodematch.gender.1   8.813e-01  8.061e-02      0   10.934  < 1e-04 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
     Null Deviance: 7195554  on 5190495  degrees of freedom
 Residual Deviance:   13516  on 5190488  degrees of freedom
AIC: 13530    BIC: 13624
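The coefficients in the two model summaries are on the log-odds scale, so they can be read more easily after a little arithmetic. The sketch below recomputes the z value and two-sided p-value for the mean distance term in each period, converts one gender-matching coefficient to an odds ratio, and checks the reported BIC. It assumes a standard normal reference distribution for the Wald test and assumes that BIC is the residual deviance plus k * ln(n), with n taken as the null-deviance degrees of freedom; both are consistent with the printed values but are not stated in the output. Function names are hypothetical.

```python
import math


def wald(estimate, std_error):
    """Return the z value and two-sided p-value under a standard normal (assumed)."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


def bic(residual_deviance, n_params, n_obs):
    """BIC as deviance + k * ln(n) (assumed form)."""
    return residual_deviance + n_params * math.log(n_obs)


# nodecov.meandist estimates and standard errors, copied from the two summaries.
z_early, p_early = wald(-1.401e-07, 3.536e-08)   # 1984-1999
z_late, p_late = wald(-2.005e-08, 1.915e-08)     # 2000-2017

# Odds ratio for a tie between two artists who both have gender code 1 (2000-2017).
odds_ratio_gender_1 = math.exp(8.813e-01)

# BIC for the 2000-2017 model: 7 parameters, n taken as 5190495 (assumption).
bic_late = bic(13516, 7, 5190495)

print(f"1984-1999 meandist: z = {z_early:.3f}, p = {p_early:.2e}")
print(f"2000-2017 meandist: z = {z_late:.3f}, p = {p_late:.3f}")
print(f"odds ratio = {odds_ratio_gender_1:.2f}, BIC = {bic_late:.0f}")
```

The mean distance term is clearly negative in 1984-1999 and indistinguishable from zero in 2000-2017, which is the pattern behind the finding that geographic proximity no longer biases collaboration after 2000.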

The results of the goodness-of-fit analysis of the formation model of the STERGM with the most individual-level variables fit to the data from 2000-2017 are below.

================================
Formation model goodness of fit:
================================
Goodness-of-fit for degree
     obs   min     mean   max MC p-value
0  11235 10780 10869.76 10974       0.00
1   1464  1867  1961.64  2033       0.00
2    361   305   338.18   391       0.14
3    128    67    88.09   105       0.00
4     68    18    24.97    34       0.00
5     25     3     6.28    12       0.00
6      5     1     2.71     6       0.16
7      5     1     2.07     4       0.00
8      0     0     0.28     1       1.00
9      2     0     0.02     1       0.00
10     1     0     0.00     0       0.00
Goodness-of-fit for edgewise shared partner
      obs  min    mean  max MC p-value
esp0 1142 1284 1342.67 1400          0
esp1  291  143  144.56  150          0
esp2   75   35   36.11   38          0
esp3   22   10   10.02   11          0
Goodness-of-fit for minimum geodesic distance
         obs      min        mean      max MC p-value
1       1530     1474     1533.36     1590       0.98
2       1166      600      660.58      738       0.00
3       1091      308      371.34      450       0.00
4        937      162      221.00      308       0.00
5        648       78      121.39      193       0.00
6        380       42       73.18      134       0.00
7        183       22       40.71       85       0.00
8         79        7       18.39       47       0.00
9         16        1        6.75       23       0.18
10         1        0        1.72       12       1.00
11         0        0        0.39        6       1.00
12         0        0        0.11        4       1.00
13         0        0        0.03        2       1.00
14         0        0        0.01        1       1.00
Inf 88352540 88355144 88355522.04 88355776       0.00
Goodness-of-fit for model statistics
                              obs          min          mean           max MC p-value
edges                     1530.00       1474.0       1533.36       1590.00       0.98
nodecov.meandist    -168204110.01 -270781202.0 -167277975.89  -43285584.81       0.90
absdiff.popularity       23566.44      22293.3      23645.07      24905.35       0.96
absdiff.followers   2983403117.99 2661498427.5 2990092538.60 3422178615.52       0.92
nodematch.gender.-1         16.00         12.0         16.19         23.00       1.00
nodematch.gender.0          28.00         22.0         28.01         40.00       0.98
nodematch.gender.1         989.00        940.0        988.81       1060.00       0.98
==================================
Dissolution model goodness of fit:
==================================
Goodness-of-fit for degree
    obs   min     mean   max MC p-value
0 13189 13147 13185.20 13219       0.94
1    99    74   106.06   144       0.66
2     5     0     2.72     7       0.22
3     1     0     0.02     2       0.02
Goodness-of-fit for edgewise shared partner
     obs min  mean max MC p-value
esp0  56  38 55.66  75       0.98
esp1   0   0  0.12   3       1.00
Goodness-of-fit for minimum geodesic distance
         obs      min        mean      max MC p-value
1         56       38       55.78       75       1.00
2          8        0        2.66       10       0.02
3          1        0        0.10        1       0.20
Inf 88358506 88358492 88358512.46 88358532       0.50
Goodness-of-fit for model statistics
             obs        min       mean        max MC p-value
edges      56.00      38.00      55.78      75.00       1.0
Figure S3: The relationships between popularity, followers, and the number of collaborations for each artist in the dataset. Popularity and followers are on the x-axis, and number of collaborations is on the y-axis. Linear regression found significant positive relationships between popularity and number of collaborations (R² = 0.048, p < 0.001) and between followers and number of collaborations (R² = 0.090, p < 0.001).