Bayesian Analysis of Social Influence

by   Johan Koskinen, et al.

The network influence model is a model for binary outcome variables that accounts for dependencies between outcomes for units that are relationally tied. The basic influence model was previously extended to afford a suite of new dependence assumptions and, because of its relation to traditional Markov random field models, it is often referred to as the auto-logistic actor-attribute model (ALAAM). We extend current approaches for fitting ALAAMs by presenting a comprehensive Bayesian inference scheme that supports testing of dependencies across subsets of data and handles the presence of missing data. We illustrate different aspects of the procedures through three empirical examples: masculinity attitudes in an all-male Australian school class, educational progression in Swedish schools, and unemployment among adults in a community sample in Australia.



1 Introduction

In social statistics it has become commonplace to take dependencies between outcomes into account using multilevel models (e.g. Goldstein, 1995; Snijders and Bosker, 2011). Thus we may account for compositional or contextual factors in for example school classes when we consider educational outcomes by using random effects for class rooms or neighbourhoods. If we acknowledge the possibility of our observational units being connected with each other through social networks, we can account for some of the dependence this induces using multilevel models (Tranmer et al., 2014) but we cannot capture the detail and diversity of what the social networks literature has termed social influence (Robins, 2015). The notion of attitudes and information spreading through friendship networks was already a premise in Moreno’s (1934) seminal work explaining a runaway epidemic in a reformatory. Coleman, Katz, and Menzel’s (1957) study of the diffusion of the prescription of a novel drug among a network of physicians has been followed by numerous empirical studies of spread on different types of networks (for example Strang, 1991; Strang and Soule, 1998; Strang and Tuma, 1993; Valente, 1995, 2005, 2012).

Social influence in social network analysis can broadly be seen as representing processes whereby people tend to be, or become, similar to their friends (or contacts) in their behaviours, attitudes, or beliefs. Social influence is sometimes referred to as social contagion (e.g. Robins et al., 2012, Burt, 1987) by analogy to how diseases spread through contact between individuals. Network models are indeed frequently used to model disease spread and epidemics (Rolls et al. 2012, 2013a, 2013b; Jenness, Goodreau, Morris, 2016; Krivitsky and Morris 2017) even though the mechanisms of social contagion may differ. The current canonical empirical framework for investigating social influence is stochastic actor-oriented models, SAOM (Steglich et al., 2010). While a powerful tool, SAOM require longitudinal network data and researchers do not always have the resources or opportunity to collect network data at multiple points in time. For cross-sectional network data, even if you have to assert the existence of contagion or influence, controlling for dependencies is a statistical reality (see e.g. Bailey and Hoff, 2015) and neglecting these dependencies may have adverse effects (Doreian, Teuter, & Wang, 1984; Lubbers & Snijders, 2007). Consequently we defer to other work for discussions and analysis of identification (Manski, 1993; An, 2011; Bramoullé, Djebbari, and Fortin, 2009) and focus here on the inferential aspects of a well-defined framework for accounting for network dependence in individual outcomes.

We consider a class of models for investigating social influence for cross-sectional data called auto-logistic actor-attribute models (ALAAM) (Robins et al., 2001; Daraganova and Robins, 2013) where the outcome of interest is binary. A number of continuous models for social influence exist (Marsden & Friedkin, 1994; Leenders, 2002; Doreian, 1982; Agneessens and Koskinen, 2016; Sewell, 2017; Vitale et al., 2016) that can easily be modified to suit binary outcome variables (Koskinen and Stenberg, 2012; Zhang et al., 2013) but these do not afford specifying the types of dependencies that the ALAAM does.

Gibbs random fields, such as the auto-logistic Ising model (Besag, 1972), have been studied in great detail in statistics and employed in various forms in spatial statistics for modelling binary outcomes with neighbourhood dependencies. To accommodate interpretations in terms of the Behavioural and Social Sciences, Robins et al. (2001) elaborated on these Gibbs distributions and derived a class of ‘social influence’ models from a set of specific dependence assumptions. These were later extended by Daraganova (2009) to form a family of auto-logistic actor-attribute models for inferring contagion in cross-sectional data. Exponential random graph models (ERGM) are a related class of models aimed at modelling the network ties conditional on fixed actor covariates (see Lusher et al., 2013, for an introduction). There are known problems with ERGM (Handcock, 2003; Schweinberger, 2011) and it is well known that simple model specifications do not work (Snijders et al., 2006). For ALAAM this is less of an issue but inhomogeneous ALAAM still present considerable challenges relative to the simpler Ising model. Møller et al. (2006) proposed an auxiliary variable MCMC for Bayesian inference for auto-logistic models. While this works well for the Ising model, for the inhomogeneous ALAAM of the more elaborate model of Robins et al. (2001), the auxiliary variable MCMC fails and modified MCMC samplers are required (Koskinen, 2008). To accommodate the challenges presented by realistic ALAAM specifications with multiple covariates, we draw on an adaptation of the exchange algorithm (Murray et al., 2006) that has previously been applied by Caimo and Friel (2011) to exponential family random graph models.

Maximum likelihood estimation for the elaborated model as described in Daraganova and Robins (2013) is implemented in the statistical software package MPNet (Wang et al., 2014) and is becoming increasingly popular (some recent studies include acquisition of norms through networks, Kashima et al., 2013; network effects on performance, Letina, 2016; and ‘contagion’ of depression and PTSD, Bryant et al., 2017). Here we improve on a previous Bayesian inference approach (Koskinen, 2008) and provide a straightforward inference scheme. We demonstrate how this scheme caters to the practical issues often encountered when working with complex empirical network data, such as handling missing data, assessing goodness of fit, and choosing between competing models. We introduce the model by describing it in some detail. We then proceed to outline various aspects of inference for the model, which we then illustrate in three empirical examples.

2 Notational preliminaries

We consider directed graphs represented as di-graphs $G = (V, E)$ on a fixed set of nodes $V = \{1, \ldots, n\}$, with an arc-set $E \subseteq V \times V$. In social network research, the nodes typically represent individuals and the arcs the set of connections amongst them (Robins, 2015). We further assume a stochastic binary vertex labelling $y = (y_i)_{i \in V}$ that corresponds to the binary outcome variable of interest for the nodes of the graph. We represent $G$ by its binary adjacency matrix $X = (x_{ij})$, where the tie-indicators

$$x_{ij} = \begin{cases} 1, & \text{if } (i,j) \in E, \\ 0, & \text{otherwise,} \end{cases}$$

and the attribute indicators

$$y_i \in \{0, 1\}, \quad i \in V.$$

We denote the space of all adjacency matrices by $\mathcal{X}$ and the support of the attribute vector by $\mathcal{Y} = \{0,1\}^n$. We allow for binary and continuous, fixed and exogenous covariates, but suppress the notational dependency on these for the sake of exposition.

The data structure is thus of the kind where we can imagine that we survey employees in an organisation about whom they go to for advice and whether they approve of a certain work-place policy. In that scenario, we might ask whether people that share advice are also more likely to share the same opinion about the policy. In the examples to follow in Section 5, $V$ consists of 108 males in a Year 10 level Australian secondary school; 403 sixth grade students across 19 school classes in Sweden; and 551 adult individuals in Australia. For the first two cases, the network ties are friendship nominations (both directed) and for the third, nominations of whom you are close to and/or with whom you discuss employment matters (treated as undirected). The outcome variables ($y$) are a binary masculine attitudes index, progression to higher education (intention), and employment status, respectively.

3 The auto-logistic actor attribute model

The general form of the log-linear model used here is

$$p_\theta(y \mid X) = \exp\{\theta^\top z(y, X) - \psi(\theta)\},$$

where $z$ is a vector-valued function on $\mathcal{Y} \times \mathcal{X}$, $\theta \in \mathbb{R}^p$ are the natural parameters, and

$$\psi(\theta) = \log \sum_{y' \in \mathcal{Y}} \exp\{\theta^\top z(y', X)\}$$

is a normalising constant. The vector of sufficient statistics can be chosen arbitrarily but may also be derived from a set of dependence assumptions. In particular, following Frank and Strauss (1986), Robins et al. (2001) provided a set of dependence assumptions and derived their corresponding sufficient statistics. This was later elaborated upon and extended by Daraganova (2009).
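To make the role of the normalising constant concrete, the following is a minimal sketch that evaluates it exactly by enumerating every outcome vector on a toy graph. The two-term `stats` function (an intercept count and a contagion count) and the function name `log_normalising_constant` are illustrative assumptions, not the specification used in the paper. The enumeration visits 2^n vectors, which is only feasible for very small n.

```python
import itertools
import numpy as np

def stats(y, X):
    """Hypothetical two-term specification: intercept and contagion counts."""
    return np.array([y.sum(), y @ X @ y])

def log_normalising_constant(theta, X):
    """Exact log normalising constant, summing over all 2^n outcome vectors."""
    n = X.shape[0]
    log_terms = [theta @ stats(np.array(y, dtype=float), X)
                 for y in itertools.product([0, 1], repeat=n)]
    return np.logaddexp.reduce(log_terms)

# Toy directed 4-cycle; enumeration is feasible only because 2^4 = 16.
X = np.roll(np.eye(4), 1, axis=1)
psi_zero = log_normalising_constant(np.zeros(2), X)   # equals 4 * log(2)
psi = log_normalising_constant(np.array([-1.0, 0.5]), X)
```

With all parameters at zero every outcome vector is equally likely, so the log constant reduces to n log 2, which gives a simple sanity check.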

3.1 Dependence

The simplest form of an ALAAM is a model in which $y_i$ and $y_j$ are independent conditional on $X$ and a fixed set of exogenous covariates, for all $i \neq j$. In this case the ALAAM reduces to a logistic regression model. A simple relaxation of this independence is a model where the conditional probabilities obey the Markov property

$$\Pr(y_i \mid y_{-i}, X) = \Pr(y_i \mid y_j,\, j \in N_i, X),$$

where $y_{-i}$ denotes all outcomes other than $y_i$ and $N_i$ is the set of neighbours of $i$. This is a Markov random field, a family of models that has a long history in statistical mechanics. In particular, under some conditions Markov random fields can be expressed in terms of Gibbs distributions, like the auto-logistic Ising model (Besag, 1972).

To define a class of coherent models with identified sufficient statistics, we proceed by briefly summarising their definitions by following Robins et al. (2001). We begin by defining the chain graph, which is a directed dependence graph on the variables , where directed ties from variables () to variables (), indicate dependence on exogenous variables. Let be the index set for the variables in with the mapping . In Figure 1 (a) the tie-variables , , and , are considered exogenous and given, and the attribute variable is considered endogenous. The set of sufficient statistics associated with a dependence assumption is derived out of the moral graph (Lauritzen and Spiegelhalter, 1988) of the chain graph. The moral graph is obtained from marrying the parents of directed ties (the ‘senders’) and turning all ties un-directed. The sufficient statistics of the model are counts of all configurations in that correspond to cliques in the moral graph.

Let be defined as the chain-graph on , where there are directed ties from the Network block to the Attribute block , and where there may be undirected ties between nodes in the Attribute block. For a given chain graph , denote the moral graph by . The statistic is zero for any subset that is not a clique of (Robins et al., 2001). The relation to standard log-linear models lies in the fact that the relevant statistics are all interaction terms.

Figure 1: Dependence graph (a) and moral graph (b) of the network activity dependence model (Robins et al., 2001). The variables are arranged in a Network block and an Attribute block.

3.1.1 Network Activity Dependence

Robins et al. (2001) drew on the Markov dependence assumption for networks of Frank and Strauss (1986) to derive a model for $y$ that depends on $X$. An attribute variable $y_i$ is conditionally dependent on the network tie-variable $x_{jk}$ if and only if $i \in \{j, k\}$. This dependence assumption defines the chain graph (a) in Figure 1 whose moral graph is (b) in Figure 1. Cliques in the moral graph are the singletons of type , two-cliques of the type , , and , as well as three-cliques corresponding to labeled network 2-stars and four-cliques corresponding to labeled network 3-stars , as well as higher order cliques, all corresponding to the association of different stars with the attribute value . The full form of the ALAAM is

In the summation, denotes all -element sub-sets of . To reduce the number of parameters, the following homogeneity assumption may be imposed:

for all and . Thus, for example the interaction terms and both contribute to the same sufficient statistic. The homogeneity constraints are also crucial for estimation of ERGM (Koskinen et al., 2018).

While this network activity dependence model takes the dependence on network ties into account, the nodal outcomes $y_i$ and $y_j$ are conditionally independent for all $i \neq j$ conditional on $X$. Thus, while the Markov dependence assumption of Frank and Strauss (1986) for the conditional model of $X$ given $y$ induces dependencies, that dependence assumption does not induce dependence for the elements of $y$ given $X$.

3.1.2 Network Contagion Dependence

Although a number of different social influence and social contagion mechanisms have been specified in terms of networks (Friedkin, 1984; Marsden and Friedkin, 1994; Burt, 1987) we consider contagion here primarily in terms of an association of $y_i$ and $y_j$ for pairs $(i,j) \in E$. This might manifest itself in a large number of homophilous ties relative to the number of heterophilous ties, as exemplified in Figure 2. To induce dependence between outcomes $y_i$ and $y_j$ we may add undirected ties between variables in the attribute block in Figure 1. This would result in the graph in Figure 3 and consequently we would indeed obtain sufficient statistics of the form $y_i y_j x_{ij}$. However, $\{y_i, y_j\}$ would be a clique in the moral graph for all $i, j$ (as would, of course, interactions for all $k$-tuples). In other words, the dependence graph would be complete and all outcomes would become dependent and, consequently, a model could not be derived using the Hammersley-Clifford theorem. Robins et al. (2001) could not define a coherent dependence assumption like that of the Markov dependence assumption for ERGM and instead drew on item response theory to define a model that afforded a contagion term.

Following how realization-dependence (Baddeley & Møller, 1989) was used by Pattison and Robins (2002) and subsequently Snijders et al. (2006) for extra-Markov dependencies for ERGM, Daraganova (2009) derived a suite of statistics for ALAAM from dependence assumptions defined by partial dependence graphs. The partial dependence graph , is a graph on , where if variables and are conditionally dependent conditional on variables , and for . In the model, the parameter for the statistic is non-zero only if is a clique of and is a clique of for all .

The partial dependence graphs allow us to specify contagion dependence assumption with dependence between and for distinct conditional on outcomes in , in addition to the dependencies of the network activity model. For example, may be a clique in for , but does not have to be a clique in a graph where . That is, and may be conditionally independent when , but conditionally dependent when .

Making some further homogeneity assumptions and setting some higher-order statistics to zero, we arrive at the following contagion model


Note that while (1) includes an interaction term similar to that of Besag’s (1972) classic auto-logistic model, it is subtly different in the definition of the neighbourhood.
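As an illustration of the kind of counts a contagion specification involves, the sketch below computes four statistics for a directed network: an intercept count, a contagion count, and outdegree and indegree activity terms. The choice of these four terms mirrors the parameter names reported in Section 5.1.1, but the exact functional forms and the name `alaam_statistics` are assumptions made for illustration; in particular, the contagion term here counts arcs whose sender and receiver both have outcome 1.

```python
import numpy as np

def alaam_statistics(y, X):
    """Counts for a hypothetical four-term directed contagion specification."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    out_degree = X.sum(axis=1)   # number of arcs sent by each node
    in_degree = X.sum(axis=0)    # number of arcs received by each node
    return np.array([
        y.sum(),          # intercept: number of nodes with outcome 1
        y @ X @ y,        # contagion: arcs i -> j where both outcomes are 1
        y @ out_degree,   # outdegree: arcs sent by nodes with outcome 1
        y @ in_degree,    # indegree: arcs received by nodes with outcome 1
    ])

# Toy directed 3-cycle 0 -> 1 -> 2 -> 0 with outcomes (1, 1, 0).
X = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
y = np.array([1, 1, 0])
z = alaam_statistics(y, X)
```

In this toy example only the arc from node 0 to node 1 connects two nodes with outcome 1, so the contagion count is 1, while the two nodes with outcome 1 each send and receive one arc.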

Figure 2: Example statistics of a heterophilous (left) and a homophilous (right) tie.

Figure 3: Dependence graph (a) and moral graph (b) of the model with dependence between attributes that share tie-variables. The variables are arranged in a Network block and an Attribute block.

3.1.3 More Elaborate Dependence Assumptions

Daraganova (2009) elaborates further on outcome-dependent dependence assumptions for undirected graphs and derives statistics that correspond to more elaborate forms of contagion and influence processes. It has been argued that more long-range patterns of influence may occur, for example indirect influence through friends. Daraganova (2009) proposes an indirect partial dependence assumption whereby any two attribute variables $y_i$ and $y_j$ may be conditionally dependent given everything else if and only if there is a two-path connecting them, i.e. $x_{ik} x_{kj} = 1$ for some $k$. This leads to statistics (c) of Figure 4. Influence of this nature has been proposed by Burt (1987), who argues that individuals that are in structurally equivalent positions may be subject to the same types of influences. Robins (2015) uses the term generalised social influence for the phenomenon whereby a person’s position in a network might influence their individual-level outcomes. Further dependence assumptions may be defined to obtain statistics like those in Figure 5, which capture the effect of triadic closure.





Figure 4: More elaborate statistics. Grey node indicates , white node .



Figure 5: Statistics associated with closure. Grey node indicates , white node .

3.2 Simulation

The expression of Eq. 1 cannot be evaluated analytically as the normalising constant $\psi(\theta)$ is a sum over all $2^n$ elements of $\mathcal{Y}$. Simulation for Markov random fields is, however, straightforward and has a long history. Many algorithms have been proposed and they typically draw on the conditional independence that implies that

$$\Pr(y_i = 1 \mid y_{-i}, X) = \frac{\exp\{\theta^\top z(y^{i+}, X)\}}{\exp\{\theta^\top z(y^{i+}, X)\} + \exp\{\theta^\top z(y^{i-}, X)\}},$$

where $z$ is the vector of sufficient statistics, $y^{i+}$ is the vector $y$ with element $i$ set to one, and $y^{i-}$ is the vector $y$ with element $i$ set to zero. For a nearest neighbour algorithm we can update $y$ iteratively by selecting $i$ at random from $V$ and either updating $y_i$ using a Gibbs update, or through a Metropolis updating step by proposing to change the value from $y_i$ to $1 - y_i$. It is also possible to update multiple variables in parallel using Besag’s (1974) coding scheme approach for the Ising model. For example, outcomes for nodes that are isolated in a graph can be updated independently of all other values. Blocks of variables may also be updated independently of each other if they are well separated in the sense of Pattison et al. (2013).
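The single-site Metropolis update described above can be sketched as follows. This is a minimal illustration under assumptions: the two-term `stats` function (intercept and contagion counts) and the function name `simulate_alaam` are hypothetical, and the statistics are recomputed from scratch at each step rather than through change statistics, which a practical implementation would use.

```python
import numpy as np

def stats(y, X):
    """Hypothetical two-term specification: intercept and contagion counts."""
    return np.array([y.sum(), y @ X @ y])

def simulate_alaam(theta, X, n_steps, rng):
    """Single-site Metropolis sampler for the outcomes given the network."""
    n = X.shape[0]
    y = rng.integers(0, 2, size=n).astype(float)   # random starting vector
    z = stats(y, X)
    for _ in range(n_steps):
        i = rng.integers(n)                # pick a node uniformly at random
        y_new = y.copy()
        y_new[i] = 1.0 - y_new[i]          # propose toggling its outcome
        z_new = stats(y_new, X)
        # Accept with probability min{1, exp(theta' (z_new - z))};
        # the normalising constant cancels in this ratio.
        if np.log(rng.random()) < theta @ (z_new - z):
            y, z = y_new, z_new
    return y

rng = np.random.default_rng(1)
X = (rng.random((6, 6)) < 0.3).astype(float)
np.fill_diagonal(X, 0.0)
y_high = simulate_alaam(np.array([30.0, 0.0]), X, 500, rng)
y_low = simulate_alaam(np.array([-30.0, 0.0]), X, 500, rng)
```

The two extreme intercept values are chosen only as a sanity check: a very large positive intercept should drive every outcome to one and a very large negative intercept should drive every outcome to zero.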

4 Inference

4.1 Estimation

The main obstacle to Bayesian inference for the model Eq. 1 is that the posterior is doubly intractable in the sense that both the normalising constant of the posterior and the likelihood are intractable. With prior distribution $\pi(\theta)$, the posterior distribution is

$$\pi(\theta \mid y) = \frac{\exp\{\theta^\top z(y, X) - \psi(\theta)\}\,\pi(\theta)}{\int \exp\{\vartheta^\top z(y, X) - \psi(\vartheta)\}\,\pi(\vartheta)\,d\vartheta},$$

where we note that the numerator contains the intractable normalising constant and the denominator involves an intractable integral (of an intractable expression). MCMC may be performed by numerically approximating $\psi(\theta)$, either by interpolation on a grid (which is much more straightforward for auto-logistic models than for ALAAM) or through simulation-based methods. The auxiliary variable MCMC elegantly avoids having to evaluate $\psi(\theta)$ by drawing variables from an auxiliary distribution with the same set of parameters (Møller et al., 2006). The performance of the auxiliary variable MCMC relies critically on how well the values of the parameters in the auxiliary distribution represent the true but unknown posterior. The linked importance sample auxiliary variable MCMC (Koskinen, 2008) alleviates this issue by introducing bridging distributions, linking the reference distribution to the likelihood. The performance of the linked importance sample auxiliary variable MCMC is, however, dependent on a good choice of auxiliary variable parameters. The exchange algorithm (Murray et al., 2006) removes the need for the parameters of the auxiliary variable to be fixed and in the process not only reduces computational overheads but also automatically tunes the auxiliary parameters in the course of the MCMC. Caimo and Friel (2011) adapted the exchange algorithm to ERGM, approximating the Gibbs updating step by a Metropolis MCMC.

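For model Eq. 1, the exchange MCMC has as its target distribution the joint distribution

$$\pi(\theta, \theta^*, y^* \mid y) \propto p_\theta(y \mid X)\,\pi(\theta)\,q(\theta^* \mid \theta)\,p_{\theta^*}(y^* \mid X),$$

where $y^*$ is a variable with the same distributional form as $y$ but with a parameter $\theta^*$, the prior of which is $q(\theta^* \mid \theta)$. Marginalising this joint posterior with respect to $\theta^*$ and $y^*$, we obtain our desired posterior for $\theta$ given $y$.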

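We draw a sample by, in each iteration $t$, given the current value $\theta^{(t)}$, proposing $\theta^*$ from $q(\cdot \mid \theta^{(t)})$ and, conditional on the proposed value, drawing $y^*$ from $p_{\theta^*}(\cdot \mid X)$. Given these proposed values, we propose to swap the parameters and accept the swap with probability

$$\min\left\{1, \frac{p_{\theta^*}(y \mid X)\,\pi(\theta^*)\,q(\theta^{(t)} \mid \theta^*)\,p_{\theta^{(t)}}(y^* \mid X)}{p_{\theta^{(t)}}(y \mid X)\,\pi(\theta^{(t)})\,q(\theta^* \mid \theta^{(t)})\,p_{\theta^*}(y^* \mid X)}\right\},$$

setting $\theta^{(t+1)} = \theta^*$ if the swap is accepted and $\theta^{(t+1)} = \theta^{(t)}$ otherwise. As for ERGM (Caimo and Friel, 2011), the acceptance ratio simplifies (for non-curved forms of ERGM) to

$$\min\left\{1, \exp\left\{(\theta^{(t)} - \theta^*)^\top \left[z(y^*, X) - z(y, X)\right]\right\} \frac{\pi(\theta^*)\,q(\theta^{(t)} \mid \theta^*)}{\pi(\theta^{(t)})\,q(\theta^* \mid \theta^{(t)})}\right\},$$

in which the normalising constants cancel. It is convenient to use a symmetric proposal distribution for $\theta^*$. In particular, we propose a simple multivariate normal with mean vector $\theta^{(t)}$ and a variance-covariance matrix that is set to a scaling constant times the inverse of the information matrix $I(\theta)$, approximated from a short initial sample from the model defined by the initial value $\theta^{(0)}$. For exponential family models in canonical form we have that $I(\theta) = \mathrm{Cov}_\theta\{z(y, X)\}$ (this procedure was also used by Koskinen et al., 2013, for tuning the algorithm for ERGM).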
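The exchange updating step can be sketched in a few lines. The following is a minimal illustration under stated assumptions rather than the implementation used in the paper: the two-term `stats` function, the independent normal prior with standard deviation `prior_sd`, the fixed proposal covariance `prop_cov`, and the function names are all hypothetical, and the auxiliary outcome vector is drawn by a short Metropolis run instead of an exact draw, in the spirit of the approximation attributed above to Caimo and Friel (2011).

```python
import numpy as np

def stats(y, X):
    """Hypothetical two-term specification: intercept and contagion counts."""
    return np.array([y.sum(), y @ X @ y])

def simulate(theta, X, n_steps, rng):
    """Approximate draw of an outcome vector via single-site Metropolis."""
    n = X.shape[0]
    y = rng.integers(0, 2, size=n).astype(float)
    for _ in range(n_steps):
        i = rng.integers(n)
        y_new = y.copy()
        y_new[i] = 1.0 - y_new[i]
        if np.log(rng.random()) < theta @ (stats(y_new, X) - stats(y, X)):
            y = y_new
    return y

def exchange_sampler(y_obs, X, n_iter, prop_cov, rng, prior_sd=10.0, aux_steps=100):
    """Exchange algorithm with a symmetric normal random-walk proposal."""
    z_obs = stats(y_obs, X)
    theta = np.zeros(z_obs.size)
    chol = np.linalg.cholesky(prop_cov)
    draws = np.empty((n_iter, theta.size))
    accepted = 0
    for t in range(n_iter):
        theta_star = theta + chol @ rng.standard_normal(theta.size)
        y_star = simulate(theta_star, X, aux_steps, rng)   # auxiliary data
        # Log prior ratio for an assumed N(0, prior_sd^2 I) prior.
        log_prior = (theta @ theta - theta_star @ theta_star) / (2.0 * prior_sd ** 2)
        # Normalising constants cancel; the proposal is symmetric.
        log_ratio = (theta - theta_star) @ (stats(y_star, X) - z_obs) + log_prior
        if np.log(rng.random()) < log_ratio:
            theta = theta_star
            accepted += 1
        draws[t] = theta
    return draws, accepted / n_iter

rng = np.random.default_rng(0)
X = (rng.random((10, 10)) < 0.2).astype(float)
np.fill_diagonal(X, 0.0)
y_obs = (rng.random(10) < 0.5).astype(float)
draws, acc_rate = exchange_sampler(y_obs, X, 200, 0.25 * np.eye(2), rng)
```

The toy data are random and carry no substantive meaning. In practice the proposal covariance would be tuned, for example from an estimate of the information matrix as described above, and the chain would be run far longer.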

4.2 Missing data

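Assume that we observe data only for a subset of actors given by the missing data indicator $I = (I_i)_{i \in V}$, where $I_i = 1$ if the response is unobserved for $i$ and $I_i = 0$ if the response is observed for $i$. Following Rubin (1976) and Little and Rubin (1987) we define a missing data mechanism $f(I \mid y, \phi)$ conditional on the response variables, where the parameter $\phi$ is distinct from the model parameters $\theta$. Initialising by assigning initial values to missing entries, with a prior $\pi(\phi)$, the estimation is carried out as above with two additional updating steps in each iteration. The first consists of updating the missing values and is done by, for each $i$ with $I_i = 1$, proposing to set $y_i^* = 1 - y_i$, and accepting this with probability

$$\min\left\{1, \exp\left\{\theta^\top \left[z(\Delta_i y, X) - z(y, X)\right]\right\} \frac{f(I \mid \Delta_i y, \phi)}{f(I \mid y, \phi)}\right\},$$

where $\Delta_i y$ is $y$ with element $i$ toggled and set to $1 - y_i$. To update $\phi$, propose a move to $\phi^*$ drawn from a proposal distribution $q(\phi^* \mid \phi)$, and accept this with probability

$$\min\left\{1, \frac{f(I \mid y, \phi^*)\,\pi(\phi^*)\,q(\phi \mid \phi^*)}{f(I \mid y, \phi)\,\pi(\phi)\,q(\phi^* \mid \phi)}\right\}.$$

If data are missing not at random (MNAR) we can define a missing data generating mechanism $f$ to test the sensitivity of our inference for $\theta$ to deviations from data being missing at random (MAR).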
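A minimal sketch of the first additional updating step, the update of the missing outcomes, is given below. It assumes that the missing data mechanism does not depend on the outcomes, so that the mechanism cancels from the acceptance probability; the two-term `stats` function and the name `update_missing` are hypothetical.

```python
import numpy as np

def stats(y, X):
    """Hypothetical two-term specification: intercept and contagion counts."""
    return np.array([y.sum(), y @ X @ y])

def update_missing(y, X, theta, missing, rng):
    """One Metropolis sweep over the unobserved outcomes only."""
    y = y.copy()
    for i in np.flatnonzero(missing):
        y_new = y.copy()
        y_new[i] = 1.0 - y_new[i]            # propose toggling a missing entry
        # Assumed ignorable mechanism: only the model ratio enters.
        if np.log(rng.random()) < theta @ (stats(y_new, X) - stats(y, X)):
            y = y_new
    return y

rng = np.random.default_rng(2)
X = (rng.random((8, 8)) < 0.3).astype(float)
np.fill_diagonal(X, 0.0)
y = (rng.random(8) < 0.5).astype(float)      # missing entries hold initial values
missing = np.array([True, False, False, True, False, False, False, True])
y_imputed = update_missing(y, X, np.array([-0.5, 0.2]), missing, rng)
```

Observed entries are never proposed for change, so they remain fixed across sweeps. Under a mechanism that does depend on the outcomes, the ratio of the mechanism evaluated at the toggled and current vectors would have to be added to the log acceptance ratio.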

When analysing social selection using ERGM, there are a number of ways of accounting for missing tie-variables (Handcock and Gile, 2010; Koskinen et al., 2010). When modelling tie-variables, accounting for key missing attributes is more complicated. While methods for jointly handling missing ties and missing attributes have been proposed (Koskinen et al., 2013), a more straightforward (albeit less principled) approach would be to perform multiple imputation for binary attributes using the ALAAM procedure above and, for example, using the network effects model (Marsden and Friedkin, 1994; Leenders, 2002; implemented in the routine lnam in the R-package sna, Butts, 2016) for continuous attributes.

4.3 Goodness of fit

For ERGM it has become standard practice to evaluate model fit by considering the predictive distributions for a range of different features of the network (Hunter et al., 2008; Robins and Lusher, 2013). This is partly because the high-dimensional network space admits a number of projections. The outcomes in an influence model have a range-space that is considerably more straightforward to summarise. Given the suite of different statistics that the different dependence assumptions of Daraganova (2009) imply, it is, however, still necessary to consider a number of functions of $y$ as these may inform us of dependencies in data that we have not captured. Similar to the Bayesian goodness-of-fit (GOF) for ERGM (Koskinen et al., 2010; Koskinen et al., 2013), the GOF distribution is the posterior predictive distribution, marginalised over the parameters (and not based on a plug-in point estimate, as in MPNet for ALAAM, Wang et al., 2014). The predictive distribution

$$p(y^{\mathrm{rep}} \mid y, X) = \int p_\theta(y^{\mathrm{rep}} \mid X)\,\pi(\theta \mid y)\,d\theta$$

is obtained from drawing $y^{\mathrm{rep}}$ from the ALAAM defined by the posterior draw $\theta^{(t)}$. In the MCMC that generates the posterior draws, whenever $\theta$ is updated we set $\theta^{(t+1)} = \theta^*$; as the auxiliary variable $y^*$ is drawn from the distribution $p_{\theta^*}$, the draw of $y^*$ is also a draw from the posterior predictive distribution. Thus, if we let $y^{\mathrm{rep},(t)} = y^*$ for every $t$ such that the proposed $\theta^*$ is accepted, and $y^{\mathrm{rep},(t)} = y^{\mathrm{rep},(t-1)}$ otherwise, we have draws from the posterior predictive distribution at the termination of the estimation algorithm. Note that the Bayesian GOF is based on draws of replicate data from the predictive distribution and as such accounts for uncertainty in parameters. Assuming that we have $K$ models with posterior distributions $\pi_k(\theta_k \mid y)$, we can average the predictive distributions over models.
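Once replicate outcome vectors are available, comparing an observed summary with its predictive distribution is a small computation. The sketch below assumes the replicate statistics have already been computed from predictive draws; the function name `posterior_predictive_pvalue` and the numbers are purely illustrative.

```python
import numpy as np

def posterior_predictive_pvalue(rep_stats, obs_stat):
    """Share of replicate statistics at least as large as the observed one."""
    rep_stats = np.asarray(rep_stats, dtype=float)
    return float(np.mean(rep_stats >= obs_stat))

# Illustrative replicate values of a single statistic and an observed value.
rep = np.array([3, 5, 6, 8, 9, 11, 12, 14])
p_val = posterior_predictive_pvalue(rep, 9)
```

Values close to 0 or 1 indicate that the observed statistic sits in a tail of the predictive distribution, which suggests a dependency in the data that the model has not captured.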

4.4 Model selection

Caimo and Friel (2013) propose an across-model procedure to evaluate model evidence for ERGM. They note that within-model estimation of evidence that relies on density estimation of the posterior breaks down for high-dimensional parameter vectors (greater than 5). Friel (2013) proposes an elegant method for estimation of Bayes factors of pair-wise nested models based on the MCMC updating in the exchange algorithm and demonstrates its application to two simple Markov random field models (the Ising model and a Markov two-star ERGM). Everitt et al. (2017) propose direct estimation of the marginal likelihood using an importance sampling scheme that circumvents the need to evaluate $\psi(\theta)$ using the trick of Møller et al. (2006). Here we aim to provide a within-model estimation scheme that works for the types of complex models that you would expect when modelling outcomes in the social and behavioural sciences. We follow an adaptation of Chib and Jeliazkov (2001) that has previously been used for ERGM (Koskinen, 2004). First we note from the so-called basic marginal likelihood identity that

$$m(y) = \frac{p_\theta(y \mid X)\,\pi(\theta)}{\pi(\theta \mid y)},$$

where $m(y)$ is the marginal likelihood or, equivalently, the normalising constant of the posterior distribution of $\theta$ given $y$. This equality holds for any choice of $\theta$ and thus we can calculate the marginal likelihood by picking any value $\theta^\dagger$ and evaluating the basic marginal likelihood identity for $\theta = \theta^\dagger$. We can use the path sampler to evaluate the likelihood ordinate (as in Hunter and Handcock, 2006, and Caimo and Friel, 2013; for details see e.g. Gelman and Meng, 1998) but obtaining a good numerical approximation of the posterior ordinate is hard.

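The exchange algorithm constitutes a reversible MCMC that converges to draws from our target distribution. If we analyse it as such we can think of it as a Metropolis algorithm that uses a proposal distribution $q(\theta^* \mid \theta)$ and then accepts this move with a probability $\alpha(\theta, \theta^*)$. Here, the proposal is clearly $q(\theta^* \mid \theta)$ but what can we say about $\alpha(\theta, \theta^*)$? In order for the move to be accepted, we need to draw $y^*$ from $p_{\theta^*}(\cdot \mid X)$ and conditional on this we accept with probability $\alpha(\theta, \theta^*; y^*)$. Thus, if we marginalise with respect to the probability of $y^*$ we obtain the implied acceptance probability unconditional on $y^*$:

$$\alpha(\theta, \theta^*) = \sum_{y^* \in \mathcal{Y}} \alpha(\theta, \theta^*; y^*)\,p_{\theta^*}(y^* \mid X).$$

Proceeding by the method of Chib and Jeliazkov (2001), by reversibility of the Metropolis algorithm

$$\pi(\theta \mid y)\,q(\theta^\dagger \mid \theta)\,\alpha(\theta, \theta^\dagger) = \pi(\theta^\dagger \mid y)\,q(\theta \mid \theta^\dagger)\,\alpha(\theta^\dagger, \theta)$$

for any choice of $\theta$ and $\theta^\dagger$. Integrating both sides of this expression with respect to $\theta$ and solving for $\pi(\theta^\dagger \mid y)$ we see that we can write the posterior ordinate in terms of a ratio of two expectations

$$\pi(\theta^\dagger \mid y) = \frac{E_{\theta \mid y}\left[\alpha(\theta, \theta^\dagger)\,q(\theta^\dagger \mid \theta)\right]}{E_{q(\theta \mid \theta^\dagger)}\left[\alpha(\theta^\dagger, \theta)\right]}. \qquad (2)$$

We can evaluate the numerator using the Monte Carlo estimate, taking $\theta^{(g)}$ from our posterior draws, and for the inner expectation we can take a sample of auxiliary variables $y^*$ for each $\theta^{(g)}$. We may also change the order of the expectations in the numerator, meaning that we draw one large sample from the distribution defined by $\theta^\dagger$ and average $\alpha(\theta^{(g)}, \theta^\dagger; y^*)\,q(\theta^\dagger \mid \theta^{(g)})$. For the denominator we draw a number of $\theta$ from the proposal distribution $q(\cdot \mid \theta^\dagger)$ and similarly calculate Monte Carlo averages of the conditional acceptance probability across samples from $p_\theta(\cdot \mid X)$.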

With missing data in , the likelihood is given by where . We can evaluate using the path sampler with the restriction that are fixed for such that . With missing data Eq. (2) needs to be modified to account for the uncertainty in the missing outcomes. When evaluating the numerator of Eq. (2), the Monte Carlo average over the joint posterior of and by taking the corresponding draws from the joint posterior. For the denominator of Eq. (2), the Monte Carlo average will be taken with respect to draws from and draws of from the conditional distribution . The Monte Carlo estimate of (2) with missing data is written


where are posterior draws of , are draws from , are independent draws from , are draws of from the model , and finally are draws from . The numerator in Eq. 3 is computationally cheap to evaluate as we only need one large sample from the distribution defined by . The denominator in Eq. 3 does however require a sample of size from the model defined by for all . The variance of the estimator is not very sensitive to the size of .

4.4.1 Prior distributions

There are good reasons for performing inference for ALAAM with prior distributions that are proper (probability distributions). With an improper prior distribution for $\theta$, the posterior distribution is proper if the observed vector of statistics $z(y, X)$ is in the relative interior of the convex hull on $\mathcal{Z}$, where $\mathcal{Z}$ is the image of $\mathcal{Y}$ under $z$ (this follows trivially from the properties of exponential families, Barndorff-Nielsen, 1978; see e.g. Koskinen et al., 2010). Since there are instances where $z(y, X)$ does not fall in the (relative interior of the) convex hull on $\mathcal{Z}$ (Handcock, 2003), a proper prior distribution formally is a safeguard against the risk of the posterior not being defined. The Bayes factor

$$BF_{12} = \frac{m_1(y)}{m_2(y)}$$

is only properly defined if the prior distributions for the parameters of both models are proper. A convenient choice for prior distribution for the canonical parameters for an exponential family model is a multivariate normal distribution $N_p(\mu, \Sigma)$. While it can be motivated to set $\mu = 0$ a priori to reflect no bias on the parameters, setting the scale through $\Sigma$ is less straightforward. For related binomial models, Chen et al. (2008) argue the merits of using Jeffreys’ prior (Jeffreys, 1946). Here, this would translate to the prior being , for which motivates the scalable normal prior with variance covariance matrix . The information matrix is straightforward to obtain as the Monte Carlo estimate of the variance covariance of the model sufficient statistics under . For some data sets where is small, it may be motivated to use a data-dependent prior with , where . As will shrink the prior distribution and pull parameters towards the origin, setting will reduce the influence of the intercept which is largely a nuisance.
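The Monte Carlo estimate of the information matrix mentioned above can be sketched as the sample covariance of simulated sufficient statistics. The example below is an illustration under assumptions: it simulates from an independence model (contagion parameter zero, intercept only), where exact draws are simply independent Bernoulli variables, and it uses a hypothetical two-term `stats` function. The multiplier `scale` is likewise a hypothetical value, not one taken from the paper.

```python
import numpy as np

def stats(y, X):
    """Hypothetical two-term specification: intercept and contagion counts."""
    return np.array([y.sum(), y @ X @ y])

def information_matrix(intercept, X, n_sims, rng):
    """Covariance of the statistics under an intercept-only (independent) model."""
    n = X.shape[0]
    p_one = 1.0 / (1.0 + np.exp(-intercept))   # Pr(y_i = 1) under independence
    sims = np.array([stats((rng.random(n) < p_one).astype(float), X)
                     for _ in range(n_sims)])
    return np.cov(sims, rowvar=False)

rng = np.random.default_rng(3)
X = (rng.random((20, 20)) < 0.2).astype(float)
np.fill_diagonal(X, 0.0)
info = information_matrix(0.0, X, 1000, rng)

scale = 2.0                                    # hypothetical scaling constant
prior_cov = scale * np.linalg.inv(info)        # scaled inverse information
```

The same estimate can serve as a starting point for the proposal covariance in the exchange algorithm of Section 4.1, with a different scaling constant.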

4.4.2 Posterior deviance

To evaluate model fit with constant or reference priors, posterior predictive p-values (Meng, 1994) may be applied for any function of the network and attributes that are typically used in GOF (Hunter et al., 2008; Robins and Lusher, 2013). For single value summaries of model fit we may also consider functions of the deviance. In the context of complex network models, Aitkin et al. (2017) considered evidence in terms of the posterior distribution of the deviance (a full discussion of this approach is given in Aitkin, 2010). This provides a useful graphical representation of relative fit of a model that can be summarised using the deviance information criterion (Spiegelhalter et al., 2002; Gelman et al., 2004). As the likelihood of Eq. 1 is intractable, we need to evaluate the log-likelihood for each draw numerically using the path-sampler (Hunter and Handcock, 2006; Gelman and Meng, 1998). With missing data defined as in Section 4.2, the likelihood is estimated as . Here the likelihood is estimated using the path sampler relative to the MLE for a nested independent model.
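Given deviance values evaluated at the posterior draws, the deviance information criterion of Spiegelhalter et al. (2002) is a short calculation. The sketch below assumes the deviances have already been approximated numerically (for instance with a path sampler, which is not implemented here); the function name `dic` and the numbers are illustrative only.

```python
import numpy as np

def dic(deviance_draws, deviance_at_posterior_mean):
    """Deviance information criterion and effective number of parameters."""
    d_bar = float(np.mean(deviance_draws))        # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean      # effective number of parameters
    return d_bar + p_d, p_d

# Illustrative deviance values at three posterior draws and at the posterior mean.
dic_value, p_d = dic(np.array([10.0, 12.0, 14.0]), 11.0)
```

Here the mean deviance is 12 and the deviance at the posterior mean is 11, giving an effective number of parameters of 1 and a criterion value of 13. Lower values indicate better relative fit when comparing models on the same data.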

5 Applications

We demonstrate the proposed inference procedures using three datasets. The first dataset explores the dependence of attitudes of individuals on their peers and is of a canonical network form, with a network census of a single school class, which allows us to illustrate the basic features of the modelling framework. The second dataset has several features that you might expect to come across in studies of substance use, attitudes, and behaviours in adolescents and children. We illustrate how to perform the inference procedure in the face of missing data. The dataset also has outcomes for pupils across several school classes, which allows us to explore how the strength of social influence may be tested across settings. Finally, we adapt the Bayesian inference scheme to snowball-sampled data.

5.1 Masculine attitudes in a school class

Figure 6: Friendship network among 106 pupils in an all-male school. Dominant culture indicated by squares (1) and circles (0), and outcome black (), grey ().

Lusher and Dudgeon (2007) (see also Lusher and Robins, 2009) developed a scale, MAI, for measuring male dominance attitudes. In school classes it may be of interest to know whether a (male) pupil’s attitudes to masculinity are contingent on those of his friends. MAI scores as well as friendship nominations were collected for 106 pupils in a Year 10 level school class in a single-sex, religious secondary school in Australia (Lusher, 2011). Our response variable is the MAI dichotomised at the mean. Controls are: ‘dominant culture’ (indicates whether the pupil has an Anglo-Australian ethno-cultural background (1) or not (0)); the socio-economic status of the pupil’s household (as measured by standardised SES based on postcode); the occupational score for the pupil’s father (original range is 0 to 100 according to Jones and McMillan, 2001, but here standardised); and the equivalently defined occupational score for the pupil’s mother (see Lusher, 2011, for further details of the network data).

5.1.1 Direct contagion

Figure 7 provides the MCMC output for a model defined as in Eq. (1), based on 20,000 draws using the standard settings of Section 4.1 (with and estimated from a simulation of statistics under the MLE for a logistic regression with contagion parameter set to 0). The sample autocorrelation (SACF) for the contagion parameter is fairly large even at large lags. This can be improved upon by setting the proposal covariance matrix equal to the posterior covariance matrix of the parameters, which reduces the SACF greatly. According to the posterior summaries provided in Table 1, there is evidence for a positive contagion parameter.

Figure 7: MCMC output for contagion-model for masculine attitudes in an Australian school

| parameter | mean | sd | ESS | SACF 10 | SACF 30 | 2.5 perc | 97.5 perc |
|---|---|---|---|---|---|---|---|
| intercept | -4.94 | 5.78 | 269.73 | 0.72 | 0.37 | -16.23 | 6.33 |
| contagion | 0.17 | 0.07 | 251.07 | 0.70 | 0.35 | 0.03 | 0.29 |
| outdegree | 0.03 | 0.07 | 292.16 | 0.72 | 0.37 | -0.10 | 0.16 |
| indegree | -0.12 | 0.06 | 293.03 | 0.74 | 0.42 | -0.23 | -0.00 |
| domculture | -0.32 | 0.43 | 235.78 | 0.70 | 0.38 | -1.28 | 0.47 |
| SES | 5.04 | 6.14 | 271.11 | 0.73 | 0.38 | -7.90 | 17.11 |
| father | -0.00 | 0.01 | 281.46 | 0.68 | 0.34 | -0.02 | 0.01 |
| mother | 0.00 | 0.01 | 285.92 | 0.68 | 0.31 | -0.01 | 0.01 |

Table 1: Posterior summaries for model (1) with controls estimated for contagion-model for masculine attitudes in a school in Australia
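The SACF and ESS columns can be illustrated with a small sketch. It assumes the usual sample autocorrelation estimator and an effective sample size that truncates the autocorrelation sum at the first non-positive lag; the estimator actually used for the tables is not stated here, so this is one common choice rather than a reproduction. The chain below is synthetic.

```python
import random


def sacf(chain, lag):
    """Sample autocorrelation of a scalar chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    c0 = sum((x - mean) ** 2 for x in chain) / n
    if c0 == 0.0:
        return 0.0  # a constant chain carries no autocorrelation information
    ck = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag)) / n
    return ck / c0


def effective_sample_size(chain, max_lag=None):
    """n / (1 + 2 * sum of autocorrelations), truncated at the first lag <= 0."""
    n = len(chain)
    if max_lag is None:
        max_lag = n - 1
    rho_sum = 0.0
    for lag in range(1, max_lag + 1):
        rho = sacf(chain, lag)
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


# Synthetic, strongly autocorrelated AR(1) chain (illustration only).
rng = random.Random(1)
chain = [0.0]
for _ in range(1999):
    chain.append(0.9 * chain[-1] + rng.gauss(0.0, 1.0))

print("SACF 10:", sacf(chain, 10))
print("SACF 30:", sacf(chain, 30))
print("ESS:", effective_sample_size(chain))
```

A chain with slowly decaying autocorrelation yields an ESS far below the nominal number of draws, which is the pattern seen in the ESS column relative to the 20,000 iterations.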

5.1.2 Indirect contagion

To infer whether there is evidence of influence on MAI being transmitted through indirect ties we add the statistic , corresponding to the configuration (c) in Figure 4. In addition we include a statistic for the number of nodes that are reachable from an individual, the number of indirect ties . The potential for a brokerage effect on is controlled for by the mixed 2-path effect . The results for the elaborated model are provided in Table 2. The introduction of the additional contagion effect reduces the direct contagion (posterior correlation of ), making interpretation less conclusive than in the simpler model. The parameter for the number of indirect ties is positive with a large posterior probability, suggesting that pupils that are indirectly connected to many others are likely to have masculine attitudes.

| parameter | mean | sd | ESS | SACF 10 | SACF 30 | 2.5 perc | 97.5 perc |
|---|---|---|---|---|---|---|---|
| intercept | -6.83 | 6.53 | 314.11 | 0.78 | 0.44 | -20.14 | 5.31 |
| contagion | 0.21 | 0.13 | 281.64 | 0.79 | 0.51 | -0.05 | 0.48 |
| indirect cont | -0.02 | 0.02 | 275.48 | 0.79 | 0.52 | -0.06 | 0.01 |
| outdegree | -0.48 | 0.27 | 242.09 | 0.80 | 0.54 | -1.02 | 0.02 |
| indegree | 0.07 | 0.16 | 324.60 | 0.79 | 0.50 | -0.27 | 0.38 |
| brokerage | -0.01 | 0.02 | 323.54 | 0.79 | 0.51 | -0.05 | 0.02 |
| indirect ties | 0.07 | 0.03 | 211.01 | 0.80 | 0.53 | 0.02 | 0.13 |
| domculture | -0.41 | 0.49 | 292.16 | 0.79 | 0.51 | -1.30 | 0.59 |
| SES | 7.28 | 6.64 | 305.18 | 0.77 | 0.43 | -6.32 | 20.32 |
| father | -0.01 | 0.01 | 1256.67 | 0.78 | 0.46 | -0.03 | 0.01 |
| mother | -0.00 | 0.01 | 305.47 | 0.79 | 0.49 | -0.02 | 0.01 |

Table 2: Posterior summaries for model with indirect contagion estimated for masculine attitudes in a school in Australia

5.1.3 GOF

The posterior predictive distributions for some functions of are provided in Table 4. For reference predictive distributions for a latent network effects model (LNAM) are provided. For this model we assume that the vector and the standard network effects model (Marsden and Friedkin, 1994) , where is the row-normalised adjacency matrix, is the network effects parameter, is a matrix with the same fixed covariates as for the model in Table 1, and are i.i.d. standard normal variates. As is binary, we use as the latent variable for a probit link-function by letting . Estimation of and largely follows Koskinen and Stenberg (2012). We assume the same form prior, as described in Section 4.4, for the in the network effects model and in the complex contagion model with the exception that the in the former is not included in the regression parameters.
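To make the reference model concrete, the sketch below simulates binary outcomes from a latent network effects model. It assumes the standard form in which the latent vector equals the network effects parameter times the row-normalised adjacency matrix times the latent vector, plus covariate effects and standard normal errors, with the binary outcome equal to one when the latent value is positive. The equations did not survive in the text above, so this form, the variable names (`alpha`, `beta`, `latent`) and the toy data are all assumptions for illustration, and the fixed-point iteration is a convenience that relies on the network effects parameter being smaller than one in absolute value.

```python
import random


def row_normalise(adj):
    """Divide each row of a 0/1 adjacency matrix by its row sum (isolates keep zero rows)."""
    weights = []
    for row in adj:
        total = sum(row)
        weights.append([v / total for v in row] if total > 0 else [0.0] * len(row))
    return weights


def simulate_lnam(adj, covariates, beta, alpha, rng, n_iter=200):
    """Draw latent values solving latent = alpha * W latent + X beta + eps, then threshold at 0."""
    w = row_normalise(adj)
    n = len(adj)
    # Covariate part plus i.i.d. standard normal errors.
    base = [
        sum(b * x for b, x in zip(beta, covariates[i])) + rng.gauss(0.0, 1.0)
        for i in range(n)
    ]
    latent = base[:]
    # Fixed-point iteration; converges for |alpha| < 1 because rows of W sum to at most 1.
    for _ in range(n_iter):
        latent = [
            alpha * sum(w[i][j] * latent[j] for j in range(n)) + base[i]
            for i in range(n)
        ]
    outcome = [1 if z > 0 else 0 for z in latent]
    return latent, outcome


# Toy directed network with three nodes; intercept plus one covariate (made-up values).
adj = [[0, 1, 1], [1, 0, 0], [0, 1, 0]]
covariates = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.5]]
latent, outcome = simulate_lnam(adj, covariates, beta=[-0.5, 1.0], alpha=0.49, rng=random.Random(7))
print(outcome)
```

Repeating the simulation for each posterior draw and recording network statistics of the simulated outcomes would give predictive distributions of the kind compared in the goodness-of-fit table.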

| parameter | mean | sd |
|---|---|---|
| intercept | -7.52 | 3.91 |
| alpha | 0.49 | 0.18 |
| outdegree | -0.04 | 0.03 |
| indegree | 0.13 | 0.04 |
| domculture | -0.61 | 0.31 |
| SES | 7.85 | 4.11 |
| father | -0.01 | 0.01 |
| mother | 0.01 | 0.00 |

Table 3: Posterior summaries for latent network autocorrelation model (LNAM) with controls estimated for contagion-model for masculine attitudes in a school in Australia

For this relatively limited set of attribute and network interactions, the ALAAM more or less consistently outperforms the LNAM judging by the posterior predictive p-values of Table 4. However, for this dataset there is no clear evidence of the LNAM completely failing to reproduce any statistic. The goodness-of-fit also illustrates that the simpler specification of ALAAM is sufficient for explaining higher-order dependencies such as indirect contagion.

| statistic | observed | ALAAM mean | ALAAM p-value | LNAM mean | LNAM p-value |
|---|---|---|---|---|---|
| intercept | 55.00 | 55.87 | 0.21 | 53.02 | 0.20 |
| direct contagion | 272.00 | 277.94 | 0.25 | 250.54 | 0.17 |
| reciprocal contagion | 75.00 | 78.17 | 0.24 | 74.39 | 0.22 |
| indirect contagion | 2069.00 | 2132.74 | 0.26 | 1880.53 | 0.17 |
| closure contagion | 763.00 | 790.04 | 0.26 | 695.57 | 0.17 |
| transitive contagion | 456.00 | 534.64 | 0.24 | 456.68 | 0.28 |
| indegree | 428.00 | 432.53 | 0.24 | 393.71 | 0.14 |
| outdegree | 478.00 | 488.87 | 0.21 | 481.77 | 0.23 |
| two-paths | 3859.00 | 3972.61 | 0.22 | 3786.91 | 0.22 |
| out-2-star | 2362.00 | 2404.59 | 0.22 | 2452.69 | 0.17 |
| in-2-star | 1780.00 | 2082.39 | 0.19 | 1701.24 | 0.18 |
| out-triangles | 1368.00 | 1404.57 | 0.21 | 1385.58 | 0.22 |
| in-triangles | 1210.00 | 1191.38 | 0.23 | 1048.03 | 0.11 |
| transitive triangles | 1004.00 | 1033.12 | 0.23 | 951.75 | 0.19 |
| indirect ties | 3957.00 | 3914.89 | 0.25 | 3895.86 | 0.24 |

Table 4: Posterior predictive p-values for ALAAM and latent network autocorrelation models
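A posterior predictive p-value of the kind reported above can be sketched as a tail proportion over replicated data sets. The snippet assumes that one statistic has been computed on each outcome vector simulated from the posterior predictive distribution, and it uses a one-sided upper-tail convention; the exact convention behind the table is not stated, so this is an assumption, and the replicated values are made up.

```python
def posterior_predictive_p(observed, replicated):
    """Proportion of replicated statistics that are at least as large as the observed one."""
    return sum(1 for s in replicated if s >= observed) / len(replicated)


def predictive_mean(replicated):
    """Mean of the statistic across replicated data sets."""
    return sum(replicated) / len(replicated)


# Made-up replicated values of a single statistic (illustration only).
replicated = [250.0, 262.0, 270.0, 275.0, 281.0, 290.0, 301.0, 266.0]
observed = 272.0
print(predictive_mean(replicated), posterior_predictive_p(observed, replicated))
```

Values close to 0 or 1 would indicate that the model fails to reproduce the statistic, whereas intermediate values indicate that the observed statistic is typical of the replicated data.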

5.2 Stockholm Birth Cohort

Figure 8: Best friend network in four schools in SBC. Sex indicated by squares (girl) and circles (boys), and outcome black (), grey (), and white for missing.

The Stockholm Birth Cohort is a large cohort study in the Stockholm Metropolitan area that includes detailed surveys and school-class network data (Stenberg and Vågerö, 2006; Stenberg et al., 2007; Stenberg, 2018). The networks are the best-friend nominations of school children and for each pupil there is a range of sociological, psychological, and educational variables. The survey was carried out in May 1966 when the pupils were nearing the end of the sixth grade. This is the time when they would have started considering whether they were going to proceed to higher secondary education (grades 10 and above) and been talking about this with their peers. We chose for our example 19 school classes out of the 1966 survey. We let be the directed best-friend network (this had a cap of three nominations), and be indicators of whether pupils said that they intended to proceed to higher secondary school, and otherwise (in accordance with the model of Koskinen and Stenberg, 2012). By design there are no ties between pupils in different school classes. The average network size is , the average density is with a minimum of and a maximum of (the most dense network is the smallest network consisting of 5 pupils). The proportion of missing entries ranges from to with an average of . We apply the ALAAM specified by Eq. 1 but set the parameter for out-stars (of the form ) to zero as the nominations were capped at three and there is little variance in the out-degree distribution. In addition to this structural part, we control for: sex (female: 1); family support (an 11-point scale measuring the family’s attitude toward school ranging from least positive, 0, to most positive, 10); average school marks (scaled to range from 0 to 10); and an indicator of whether the father’s social class is one (the highest social class in a 5-point ordinal categorisation).

The results from the MCMC with 10,000 iterations with constant priors are summarised in Figure 9 and posterior summaries are provided in Table 5 (the table is based on default settings with a burn-in of 1000 and thinning of 20 iterations). Mixing of the MCMC can be said to be satisfactory with default settings. There is strong evidence that a positive family attitude to school and high marks increase the likelihood of the intention to proceed to higher education. The evidence is inconclusive for other effects. In particular, the contagion parameter is positive with posterior probability .

Figure 9: MCMC output for contagion-model for progression to upper-secondary school in SBC

| parameter | mean | sd | ESS | SACF 10 | SACF 30 | 2.5 perc | 97.5 perc |
|---|---|---|---|---|---|---|---|
| intercept | -9.67 | 1.11 | 178.03 | 0.68 | 0.32 | -11.83 | -7.51 |
| contagion | 0.16 | 0.10 | 183.10 | 0.68 | 0.32 | -0.04 | 0.35 |
| indegree | -0.07 | 0.11 | 183.55 | 0.67 | 0.32 | -0.29 | 0.13 |
| sex | -0.09 | 0.29 | 134.35 | 0.70 | 0.39 | -0.66 | 0.47 |
| family attitude | 0.48 | 0.09 | 164.22 | 0.70 | 0.32 | 0.33 | 0.65 |
| marks | 0.99 | 0.15 | 168.66 | 0.68 | 0.32 | 0.69 | 1.28 |
| social class 1 | 0.59 | 0.32 | 198.40 | 0.66 | 0.24 | -0.06 | 1.19 |

Table 5: Model 1 posterior summaries for model (1) with controls estimated for contagion-model for progression to upper-secondary school in SBC (posterior means, sd, and probability interval based on a thinned sample of 10,000 iterations, taking every 20th iteration, with burn-in of 1000; SACF and ESS based on un-thinned sample)
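The summaries in the table can be illustrated with a short sketch that discards a burn-in, keeps every 20th draw, and reports the posterior mean, standard deviation, a 95% probability interval and the posterior probability that a parameter is positive. The quantile rule (linear interpolation) and the function names are assumptions, and the chain is synthetic rather than output from the model.

```python
import random
from statistics import mean, stdev


def quantile(sorted_values, q):
    """Quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def summarise(chain, burn_in=1000, step=20):
    """Posterior summaries from a scalar chain after burn-in and thinning."""
    kept = chain[burn_in::step]
    ordered = sorted(kept)
    return {
        "n_kept": len(kept),
        "mean": mean(kept),
        "sd": stdev(kept),
        "2.5 perc": quantile(ordered, 0.025),
        "97.5 perc": quantile(ordered, 0.975),
        "prob_positive": sum(1 for v in kept if v > 0) / len(kept),
    }


# Synthetic chain of 10,000 draws (illustration only, not model output).
rng = random.Random(3)
chain = [rng.gauss(0.2, 0.1) for _ in range(10000)]
print(summarise(chain))
```

With 10,000 iterations, a burn-in of 1000 and thinning of 20, the summaries rest on 450 retained draws.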

5.2.1 Testing difference in contagion

The classes come from 4 schools that differ in the socio-economic status of their uptake areas, as reflected in the composition of social class of pupils. We divide the schools into one subset with less than 15% of students (across school classes) from the highest social class and a subset with more than 15% of students from the highest social class. Table 6 presents the results for a model () that includes an interaction of the contagion parameter and an indicator for the type of school ( for schools with low proportion of pupils from the highest social class) as well as the main effect. There is stronger evidence than for model 1 for a contagion effect (the contagion parameter is positive with posterior probability). There is weak evidence for contagion being absent in schools with a lower proportion of pupils from the highest social class (the posterior distribution for has a mean of and a standard deviation of and is negative with posterior probability).

| parameter | mean | sd | ESS | SACF 10 | SACF 30 | 2.5 perc | 97.5 perc |
|---|---|---|---|---|---|---|---|
| intercept | -10.13 | 1.19 | 168.32 | 0.76 | 0.44 | -12.81 | -8.04 |
| contagion | 0.24 | 0.12 | 143.31 | 0.72 | 0.39 | 0.02 | 0.48 |
| indegree | -0.08 | 0.12 | 122.80 | 0.75 | 0.41 | -0.33 | 0.13 |
| sex | -0.09 | 0.28 | 126.04 | 0.76 | 0.45 | -0.69 | 0.47 |
| family attitude | 0.48 | 0.08 | 140.26 | 0.72 | 0.38 | 0.34 | 0.65 |
| marks | 1.01 | 0.14 | 265.08 | 0.72 | 0.40 | 0.76 | 1.31 |
| composition | 0.91 | 0.55 | 137.33 | 0.74 | 0.39 | -0.25 | 1.97 |
| social class 1 | 0.57 | 0.34 | 143.59 | 0.73 | 0.37 | -0.07 | 1.21 |
| contagion social class 1 | -0.21 | 0.16 | 152.15 | 0.72 | 0.37 | -0.51 | 0.11 |

Table 6: Model 2 posterior summaries for model (1) with controls estimated for contagion-model for progression to upper-secondary school in SBC (posterior means, sd, and probability interval based on a thinned sample of 10,000 iterations, taking every 20th iteration, with burn-in of 1000; SACF and ESS based on un-thinned sample)

Figure 10: The posteriors of contagion parameters for model 1 and model 2 and the interaction of contagion and composition with 95% probability intervals

There is an indication of a difference in the posteriors in Figure 10 but can we test against ? Consider first evaluating the evidence for against based on the results in Tables 5 and 6 that are based on improper priors. We estimate the likelihood as in Section 4.4.2, relative to the MLE for a model with the contagion parameter, composition, and contagion interaction set to zero (we estimate for a thinned sample of 226 posterior draws, using 20 bridges and 100 samples for each; in fact, using half, 113, of these posterior draws, and only 5 samples for each gives virtually identical results). Figure 11 (left panel) shows that the deviance distributions are stochastically ordered (Aitkin et al., 2017) and that model 2 is the preferred model. Based on the posterior deviances of Figure 11 (left panel), we provide two versions of the DIC measure (Spiegelhalter et al., 2002; Gelman et al., 2004) in Table 7, both of which suggest that is preferred over .

Figure 11: Left panel: the posterior deviance for model 1 (solid) and model 2 (dashed). Right panel: model evidence for model 1 () and model 2 () for prior with scale and
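Whether two posterior deviance distributions are stochastically ordered can be checked numerically by comparing their empirical distribution functions. The sketch below assumes two vectors of deviance draws, one per model, and declares the first stochastically smaller when its empirical CDF is nowhere below that of the second; the function names and the numbers are hypothetical.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical CDF of a sample as a function."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def stochastically_smaller(a, b):
    """True if the empirical CDF of a is nowhere below that of b."""
    cdf_a, cdf_b = ecdf(a), ecdf(b)
    grid = sorted(set(a) | set(b))
    return all(cdf_a(x) >= cdf_b(x) for x in grid)


# Made-up deviance draws for two models (illustration only).
deviance_model_1 = [212.0, 215.0, 218.0, 221.0]
deviance_model_2 = [208.0, 211.0, 214.0, 217.0]
print(stochastically_smaller(deviance_model_2, deviance_model_1))
```

With Monte Carlo draws the ordering may fail at a few grid points purely by chance, so in practice a plot of the two empirical CDFs is a useful complement to the strict check.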

Model 1 Model 2

Table 7: DIC for Models 1 and 2 fitted to SBC

Examining the evidence for the two models in Figure 11 (right panel), the interaction model, Model 2, is preferred for between 1 and 4. As gets larger, the prior variance increases, penalising model complexity and thus favouring the more parsimonious model (cf. Bartlett, 1957). The precision of the estimates of the evidence here is, however, not sufficient to draw firm conclusions.

5.2.2 Sensitivity to MAR assumption

To test the sensitivity of the posteriors to violations of the missing at random assumption, we posit the MNAR missing data mechanism assuming the logistic form

independently for all conditional on . With the interpretation would be that pupils that do not intend to proceed to higher secondary education are less likely to respond. Assuming that receiving few best-friend nominations is associated with social isolation, a negative would mean that socially relatively isolated pupils are more likely to be missing. Fixing only will affect inference as the covariate dependent and the intercept cancel out in simulating for missing cases. Figure 12 plots the change in credibility intervals for some of the parameters of model 2. If missingness is strongly predicted by an intention to proceed to higher secondary education ( positive), the contagion effect is weakened. If pupils not intending to proceed are more likely to be missing, the contagion effect is strengthened. The strength of the MNAR mechanism also affects the composition parameter and the interaction of composition and contagion. The bias, as represented by , does however need to be strong to have an effect (at , almost with probability 1 for missing cases). While missingness in is not automatically associated with being a non-responder and having missing out-going nominations, pupils with missing values on nominate fewer people, and if these missing values are imputed by , this means increasing the number of isolates that have when in fact these pupils might have missing values on their out-going ties.

Figure 12: Posterior summaries (95% CI) for parameters of model 2 for SBC against MNAR parameter

5.3 Unemployment in a large network

When the node-set of a network is not unambiguously defined or the population size is too big to allow for a complete census of the network, we may still want to estimate network-related effects from a sample of the population network. We consider here the case of the dataset analysed previously by Daraganova and Pattison (2013) that consists in total of 551 individuals obtained from a three-wave snowball sample (Frank, 2005; Frank and Snijders, 1994; Goodman, 1961) in Australia. Drawing on Besag’s (1974) coding scheme, Pattison et al. (2013) demonstrated how the dependence assumptions of an ERGM can be used to define a conditional inference scheme. For ALAAM this translates to estimating the model as described above with the condition that remains fixed at their observed values for , where is a set that separates (Pattison et al., 2013) data in . For the three-wave snowball sample this means conditioning on the outcomes of nodes in zone 3 (184 nodes), and conditional on this modelling the outcomes of the seed nodes, and zones 1 and 2 (367 nodes). The outcome variable of interest is employment status, coded in two categories: (1) ‘employed individuals’ were those individuals who worked full or part time, and students who worked part time (); and (2) ‘unemployed individuals’ were those individuals who did not work at the time of the interview (). In addition we use a reduced set of other variables, namely the number of network partners (degree); sex (male: 0; female: 1); and age (ranging from 19 to 67 with a mean of 37).
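The partition used for conditional estimation can be sketched as follows. The snippet assumes that snowball zones coincide with shortest-path distance from the seed nodes in the observed (undirected) network, truncated at the last wave; in a real study the zone of each respondent would normally be recorded during data collection. The node labels and the toy network are hypothetical.

```python
from collections import deque


def snowball_zones(neighbours, seeds, last_wave):
    """Assign each reachable node its zone (distance from the seeds), up to last_wave."""
    zone = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        v = queue.popleft()
        if zone[v] == last_wave:
            continue  # nodes in the last wave are not followed up
        for u in neighbours.get(v, ()):
            if u not in zone:
                zone[u] = zone[v] + 1
                queue.append(u)
    return zone


def split_for_conditioning(zone, last_wave):
    """Nodes whose outcomes are modelled versus nodes whose outcomes are held fixed."""
    modelled = sorted(v for v, z in zone.items() if z < last_wave)
    fixed = sorted(v for v, z in zone.items() if z == last_wave)
    return modelled, fixed


# Toy path network 0 - 1 - 2 - 3 - 4 with a single seed (illustration only).
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
zone = snowball_zones(neighbours, seeds=[0], last_wave=3)
modelled, fixed = split_for_conditioning(zone, last_wave=3)
print(modelled, fixed)
```

In an MCMC update of the outcomes, only the nodes in `modelled` would be toggled, while the outcomes of the nodes in `fixed` enter the statistics but are never changed.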

The results of Table 8 largely agree with the analysis of Daraganova and Pattison (2013), and there is clear evidence of a positive association between people that are relationally tied (the posterior mean is ) and a greater risk of being unemployed the more people that you are connected to. Of course, for a sample of a community network we cannot discount the possibility that the network and outcomes are spatially clustered (Butts, 2003; Daraganova et al., 2012; Koskinen and Lomi, 2013) or that there are other geographical network effects (Sohn et al., 2019).

mean sd 2.5 perc 97.5 perc intercept contagion degree sex age

Table 8: Posterior summaries for a model for employment status for a sample from Victoria, Australia, estimated conditional on outcomes in waves 3 and greater

6 Summary

Building on previous work on ALAAM (Robins et al., 2001; Daraganova, 2009; Daraganova and Robins, 2013) we draw on advances in modelling Markov random fields (Friel, 2013; Caimo and Friel, 2011) to improve on previous Bayesian estimation schemes (Koskinen, 2008) for the social influence model.

We illustrated various aspects of fitting the model using three example datasets. We found that pupils that have friends that have male-dominance attitudes also tend to have male-dominance attitudes themselves. Posterior predictive p-values show that a simple model with direct contagion is sufficient for explaining more complex interactions and in addition show that the ALAAM compares favourably with an alternative network dependence model. For a Swedish dataset we found that pupils that have friends that intend to proceed to higher education are more likely to have the same intention themselves. We also found tentative evidence for this ‘contagion’ effect to be present in schools with a high proportion of pupils from the highest social class but not in schools with a lower proportion of pupils from a high social class. The estimates for the contagion effect were demonstrated to be robust to violations of the missing at random assumption. Finally, a dataset collected using snowball sampling in Australia showed that people that have unemployed friends are more likely to be unemployed themselves.

A benefit of the Bayesian estimation approach for ALAAM is that the coherent treatment of uncertainty allows greater flexibility in handling missing data and performing model evaluation relative to the maximum likelihood approach. This likelihood-based framework is also readily extended to hierarchical modelling so that we can, for example, analyse social influence jointly for multiple datasets (cf. the continuous case, Agneessens and Koskinen, 2016). This multilevel approach would be better suited to investigate network-level effects like the case of differences in contagion for different types of schools. Further extensions to the model itself are of course possible and we may consider multiple types of ties, e.g. for independent effects contagion for relations ; but we may also consider algebras (Pattison, 1993) on in our modelling of peer effects, that is, which combinations of different types of social relations are most predictive of an outcome.

While distinguishing social influence and social contagion from social selection, the tendency for ties to change conditional on nodal variables, requires longitudinal data (Steglich et al., 2010), when only cross-sectional network data are available, accounting for peer-dependence through network ties is still necessary. A Bayesian ALAAM framework allows us to take a number of different types of network dependencies into account. Investigating social influence, social contagion, and social diffusion has a long tradition of empirical applications in network research (Robins, 2015) and here we have not touched on the potentially rich source of network data that online sources constitute. Indeed, the study of online opinion formation and the spread of fake news through online networks is an increasingly popular field (see e.g., Brady et al., 2017).


  • 1 Agneessens, F., Koskinen J. (2016). Modelling individual outcomes using a multilevel social influence (MSI) model. Pp 81–105 in Emmanuel Lazega and Tom Snijders (Eds.) Multilevel Network Analysis for the Social Sciences. London: Springer.
  • 2 Aitkin, M. (2010) Statistical Inference: an Integrated Bayesian/Likelihood Approach. Boca Raton: Chapman and Hall–CRC.
  • 3 Aitkin, M., Vu, D., and Francis, B. (2017). Statistical modelling of a terrorist network. Journal of the Royal Statistical Society: Series A, 180: 751–768.
  • 4 An, W. (2011). Models and methods to identify peer effects. The Sage handbook of social network analysis. London: Sage, 515–532.
  • 5 Baddeley, A. & Möller, J. (1989). Nearest–neighbour Markov point processes and random sets. International Statistical Review/Revue Internationale de Statistique, 89–121.
  • 6 Fosdick, B. K., & Hoff, P. D. (2015). Testing and Modeling Dependencies Between a Network and Nodal Attributes. Journal of the American Statistical Association, 110(511), 1047–1056.
  • 7 Barndorff-Nielsen, O.E. (1978). Information and Exponential Families in Statistical Theory, New York: Wiley.
  • 8 Bartlett, M. (1957). A Comment on D. V. Lindley’s Statistical Paradox. Biometrika, 44(3/4), 533-534.
  • 9 Besag, J. E. (1972) Nearest-neighbour Systems and the Auto-Logistic Model for Binary Data. Journal of the Royal Statistical Society Series B (Methodological), 34(1):75–83.
  • 10 Besag, J. E. (1974) Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society Series B (Methodological), 36, 96–127 (with discussion).
  • 11 Brady, W. J., Wills, J. A., Jost, J. T., Tucker, J. A., & Van Bavel, J. J. (2017). Emotion shapes the diffusion of moralized content in social networks. Proceedings of the National Academy of Sciences, 114(28), 7313–7318.
  • 12 Bramoullé, Y., Djebbari, H., & Fortin, B. (2009). Identification of peer effects through social networks. Journal of econometrics, 150(1), 41–55.
  • 13 Bryant, R. A., Gallagher, H. C., Gibbs, L., Pattison, P., MacDougall, C., Harms, L., Block, K., Baker, E., Sinnott, V., Breton, G., Richardson, J., Forbes, D., Lusher, D. (2017). Mental health and social networks after disaster. American Journal of Psychiatry, 174(3), 277–285.
  • 14 Burt, R. S. (1987). Social contagion and innovation: Cohesion versus structural equivalence. American journal of Sociology, 92(6), 1287-1335.
  • 15 Butts, C. T. (2003). Predictability of Large-Scale Spatially Embedded Networks. Pp 313–323 in C. T. Butts, R. Breiger, K. Carley, and P. Pattison. (eds.) Dynamic Social Network Modeling and Analysis, Washington, DC: National Academies Press.
  • 16 Butts, Carter T. (2016). sna: Tools for Social Network Analysis. R package version 2.4.
  • 17 Caimo, A., Friel, N., (2011). Bayesian inference for exponential random graph models. Social Networks 33, 41–55.
  • 18 Caimo, A., & Friel, N. (2013). Bayesian model selection for exponential random graph models. Social Networks, 35(1), 11–24.
  • 19 Chen, Ming-Hui, Ibrahim, Joseph G., and Kim, Sungduk (2008). Properties and Implementation of Jeffreys’s Prior in Binomial Regression Models. Journal of the American Statistical Association, 103(484): 1659–1664.
  • 20 Chib, S. and Jeliazkov, I. (2001). Marginal likelihood from the Metropolis–Hastings output. Journal of the American Statistical Association, 96(453), 270–281.
  • 21 Coleman, J., Katz, E., & Menzel, H. (1957). The diffusion of an innovation among physicians. Sociometry, 20(4), 253–270.
  • 22 Daraganova, G. (2009). Statistical models for social networks and network-mediated social influence processes: Theory and Applications, University of Melbourne, unpublished PhD thesis.
  • 23 Daraganova, G., Pattison, P. (2013). Autologistic Actor Attribute Model Analysis of Unemployment: Dual Importance of Who You Know and Where You Live. Pp. 237–247 In Lusher, D., Koskinen, J., Robins, G. (eds.) Exponential Random Graph Models for Social Networks: Theory, Methods and Applications, New York: Cambridge University Press.
  • 24 Daraganova, G., Pattison, P., Koskinen, J., Mitchell, B., Bill, A., Watts, M., & Baum, S. (2012). Networks and geography: modelling community network structures as the outcome of both spatial and network processes. Social Networks, 34(1), 6–17.
  • 25 Daraganova, G., Robins, G. (2013). Autologistic Actor Attribute Model. Pp. 102–114 In Lusher, D., Koskinen, J., Robins, G. (eds.) Exponential Random Graph Models for Social Networks: Theory, Methods and Applications, New York: Cambridge University Press.
  • 26 Doreian, P. (1982). Maximum likelihood methods for linear models. Sociological Methods and Research, 10, 243–269.
  • 27 Doreian, P., Teuter, K., & Wang, C. (1984). Network autocorrelation models: Some Monte Carlo evidence. Sociological Methods and Research, 13, 155–200.
  • 28 Everitt, R.G., Johansen, A.M., Rowing, E., and Evdemon-Hogan, M. (2017). Bayesian model comparison with un-normalised likelihoods. Statistics and Computing, 27(2), 403–422.
  • 29 Frank, Ove (2005). Network sampling and model fitting. Pp. 31–56 in Carrington, Peter J., John Scott, and Stanley Wasserman, (Eds.) Models and methods in social network analysis. Vol. 28. New York: Cambridge university press.
  • 30 Frank, O., and Snijders, T.A.B., (1994). Estimating the size of hidden populations using snowball sampling. Journal of Official Statistics 10, 53–67.
  • 31 Frank, O., and Strauss, D (1986). Markov Graphs. Journal of the American Statistical Association, 81, 832–842.
  • 32 Friel, N. (2013). Evidence and Bayes Factor Estimation for Gibbs Random Fields. Journal of Computational and Graphical Statistics, 22:3, 518–532.
  • 33 Friedkin, Noah E. (1984). Structural Cohesion and Equivalence Explanations of Social Homogeneity. Sociological Methods and Research 12:235–61.
  • 34 Gelman, Andrew, Carlin, John B., Stern, Hal S., Rubin, Donald B. (2004). Bayesian Data Analysis: Second Edition. Texts in Statistical Science. CRC Press.
  • 35 Gelman, A., Meng, X.L. (1998). Simulating normalizing constants: From importance sampling to bridge sampling to path sampling, Statistical Science, 13: 163–185.
  • 36 Goldstein, H. (1995). Multilevel statistical models. London: Edward Arnold
  • 37 Goodman, Leo A. (1961). Snowball sampling. The annals of mathematical statistics, 148-170.
  • 38 Handcock, M.S. (2003). Assessing degeneracy in statistical models of social networks, Working Paper no. 39, Center for Statistics and the Social Sciences, University of Washington (available from
  • 39 Handcock, M. S., and Gile, K. (2010). Modeling social networks from sampled data, The Annals of Applied Statistics, vol. 4, pp. 5–25.
  • 40 Hunter, D.R., Goodreau, S.M., Handcock, M.S., (2008). Goodness of fit of social network models. Journal of the American Statistical Association 103.
  • 41 Hunter, D.R., Handcock, M.S., (2006). Inference in curved exponential family models for networks. Journal of Computational and Graphical Statistics 15, 565–583.
  • 42 Jeffreys, H. (1946). An Invariant Form for the Prior Probability in Estimation Problems. Proceedings of the Royal Society of London, Series A, 186: 453–461.
  • 43 Jenness S.M., Goodreau S.M., Morris M. (2016). EpiModel: Mathematical Modeling of Infectious Disease. R Package Version 1.2.6. DOI: 10.5281/zenodo.16767
  • 44 Jones, F. L., and McMillan, J. (2001). Scoring Occupational Categories for Social Research: A Review of Current Practice, with Australian Examples. Work, Employment and Society, 15, 539–563.
  • 45 Kashima, Y., Wilson, S., Lusher, D., Pearson, L. J., & Pearson, C. (2013). The acquisition of perceived descriptive norms as social category learning in social networks. Social networks, 35(4), 711–719.
  • 46 Koskinen, J. H. (2004). Bayesian Analysis of Exponential Random Graphs - Estimation of Parameters and Model Selection. Research Report 2004:2, Department of Statistics, Stockholm University
  • 47 Koskinen, J. H. (2008). The Linked Importance Sampler Auxiliary Variable Metropolis Hastings Algorithm for Distributions with Intractable Normalising Constants. MelNet Social Networks Laboratory Technical Report 08-01, Department of Psychology, School of Behavioural Science, University of Melbourne, Australia
  • 48 Koskinen, J., and Lomi, A. (2013). The Local Structure of Globalization: The Network Dynamics of Foreign Direct Investments in the International Electricity Industry. Journal of Statistical Physics, 151(3), 523–548.
  • 49 Koskinen, J. H., Robins, G. L., and Pattison, P. E. (2010). Analysing Exponential Random Graph (p-star) Models with Missing Data Using Bayesian Data Augmentation. Statistical Methodology, Vol. 7(3), 366–384.
  • 50 Koskinen, J. H., Robins, G. L., Wang, P., Pattison, P. E. (2013). Bayesian analysis for partially observed network data, missing ties, attributes and actors. Social Networks, vol. 35(4), 514–527.
  • 51 Koskinen, J. H., Wang, P., Robins, G. L., Pattison, P. E. (2018). Outliers and Influential Observations in Exponential Random Graph Models. Psychometrika, vol. 83(4), 809–830.
  • 52 Koskinen, J. H., and Stenberg, S.-Å. (2012). Bayesian Analysis of Multilevel Probit Models for Data with Friendship Dependencies. Journal of Educational and Behavioural Statistics. 37(2):203–230.
  • 53 Krivitsky, P.N., and Morris, M. (2017). Inference for social network models from egocentrically sampled data, with application to understanding persistent racial disparities in HIV prevalence in the US. The Annals of Applied Statistics, Vol. 11 (1), 427-455.
  • 54 Lauritzen, S.L., & Spiegelhalter, D.J. (1988). Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society Series B, 50, 157–224.
  • 55 Leenders, R. T. A. J. (2002). Modelling social influence through network autocorrelation: constructing the weight matrix. Social Networks, 24, 21–47.
  • 56 Letina, S. (2016). Network and actor attribute effects on the performance of researchers in two fields of social science in a small peripheral community. Journal of informetrics, 10(2), 571–595.
  • 57 Little, R.J.A., Rubin, D.B. (1987). Statistical Analysis with Missing Data. New York: Wiley.
  • 58 Lubbers, M. J., & Snijders, T. A. (2007). A comparison of various approaches to the exponential random graph model: A reanalysis of 102 student networks in school classes. Social Networks, 29, 489–507.
  • 59 Lusher, Dean (2011). Masculinity, educational achievement and social status: a social network analysis. Gender and Education, 23(6): 655–675.
  • 60 Lusher, D., and Dudgeon, P. (2007). The Masculine Attitudes Index (MAI). Working Paper, The University of Melbourne.
  • 61 Lusher, D., Koskinen, J., Robins, G., (2013). Exponential Random Graph Models for Social Networks: Theory, Methods and Applications, New York: Cambridge University Press.
  • 62 Lusher, D., and Robins, G. (2009). Hegemonic and other masculinities in local social contexts. Men & Masculinities 11: 387–423.
  • 63 Manski, C. F. (1993). Identification of endogenous social effects: The reflection problem. The Review of Economic Studies, 60, 531–542.
  • 64 Marsden, P.V., & Friedkin, N.E. (1994). Network studies of social influence. In S. Wasserman & J. Galaskiewicz (Eds.), Advances in social network analysis (pp. 3–25). Thousand Oaks, CA: Sage.
  • 65 Meng, X. L. (1994). Posterior predictive p-values. The Annals of Statistics, 22(3), 1142–1160.
  • 66 Moreno, J. L. (1934). Who shall survive?: A new approach to the problem of human interrelations. Washington, DC, US: Nervous and Mental Disease Publishing Co.
  • 67 Morris, M., (2004). Network Epidemiology: A Handbook for Survey Design and Data Collection. Oxford: Oxford University Press.
  • 68 Murray, I., Ghahramani, Z., MacKay, D., (2006). MCMC for doubly-intractable distributions. In: Proceedings of the 22nd Annual Conference on Uncertainty in Artificial Intelligence (UAI-06). AUAI Press, Arlington, Virginia.
  • 69 Møller, J., Pettitt, A. N., Berthelsen, K. K., and Reeves, R.W. (2006), An Efficient Markov Chain Monte Carlo Method for Distributions with Intractable Normalising Constants, Biometrika, 93, 451–458.
  • 70 Pattison, P., (1993). Algebraic Models for Social Networks. Cambridge: Cambridge University Press.
  • 71 Pattison, P., Robins, G., (2002). Neighborhood based models for social networks. Sociological Methodology 32, 301–337.
  • 72 Pattison, P.E., Robins, G.L., Snijders, T.A.B., Wang, P. (2013). Conditional estimation of exponential random graph models from snowball sampling designs. Journal of Mathematical Psychology, 57: 284–296.
  • 73 Robins, G. (2015). Doing social network research: Network-based research design for social scientists. London: Sage.
  • 74 Robins, G., Lewis, J. M., & Wang, P. (2012). Statistical network analysis for analyzing policy networks. Policy Studies Journal, 40(3), 375–401.
  • 75 Robins, G. L., Lusher, D., (2013). Illustrations: Simulation, Estimation, and Goodness of Fit. Pp. 167–185 in: Lusher, D., Koskinen, J., Robins, G. (Eds.) Exponential Random Graph Models for Social Networks: Theory, Methods, and Applications. Cambridge: Cambridge University Press.
  • 76 Robins, G., Pattison, P., & Elliott, P. (2001). Network models for social influence models. Psychometrika, 66, 161–190.
  • 77 Rolls, D. A., G. Daraganova, R. Sacks-Davis, M. Hellard, R. Jenkinson, E. McBryde, P. E. Pattison, and G. L. Robins. (2013a). Modelling a disease-relevant contact network of people who inject drugs. Social Networks, 35(4), 699-710.
  • 78 Rolls, D. A., Sacks-Davis, R., Jenkinson, R., McBryde, E., Pattison, P., Robins, G., & Hellard, M. (2013b). Hepatitis C transmission and treatment in contact networks of people who inject drugs. PloS one, 8(11), e78286.
  • 79 Rolls, D. A., Daraganova, G., Sacks-Davis, R., Hellard, M., Jenkinson, R., McBryde, E., Pattison, P., & Robins, G. L. (2012). Modelling hepatitis C transmission over a social network of injecting drug users. Journal of Theoretical Biology, 297, 73-87.
  • 80 Rubin, D.B. (1976). Inference and missing data (with discussion). Biometrika 63, 581–592.
  • 81 Schweinberger, M. (2011). Instability, sensitivity, and degeneracy of discrete exponential families. Journal of the American Statistical Association, 106(496), 1361-1370.
  • 82 Sewell, D. K. (2017). Network autocorrelation models with egocentric data. Social Networks, 49, 113-123.
  • 83 Snijders, T. A., & Bosker, R. J. (2011). Multilevel analysis: An introduction to basic and advanced multilevel modeling, 2nd edition. London: Sage.
  • 84 Snijders, T.A.B., Pattison, P.E., Robins, G.L., & Handcock, M.S. (2006). New specifications for exponential random graph models. Sociological Methodology 36, 99–153.
  • 85 Sohn, C., Christopoulos, D., and Koskinen, J. (2019). Borders moderating distance: A Social Network Analysis of Spatial Effects on Policy Interactions. Geographical Analysis (forthcoming).
  • 86 Spiegelhalter, David J., Best, Nicola G., Carlin, Bradley P., van der Linde, Angelika (2002). Bayesian measures of model complexity and fit (with discussion). Journal of the Royal Statistical Society, Series B. 64 (4): 583–639.
  • 87 Steglich, C. E. G., Snijders, T. A. B., & Pearson, M. (2010). Dynamic Networks And Behavior: Separating Selection From Influence. Sociological Methodology, 40, 329–393.
  • 88 Stenberg, S.-Å. (2018). Born in 1953. Stockholm: Stockholm University Press.
  • 89 Stenberg, S.-Å., & Vågerö, D. (2006). Cohort profile: The Stockholm birth cohort of 1953. International Journal of Epidemiology, 35, 546–548.
  • 90 Stenberg, S.-Å., Vågerö, D., Österman, R., Arvidsson, E., von Otter, C., & Jansson, C.-G. (2007). Stockholm birth cohort study 1953-2003: A new tool for life-course studies. Scandinavian Journal of Public Health, 35, 104–110.
  • 91 Strang, D. (1991). Adding social structure to diffusion models: An event history framework. Sociological Methods and Research 19, 324–353.
  • 92 Strang, D. and S. A. Soule (1998). Diffusion in organizations and social movements: From hybrid corn to poison pills. Annual Review of Sociology 24, 265–290.
  • 93 Strang, D. and N. B. Tuma (1993). Spatial and temporal heterogeneity in diffusion. American Journal of Sociology 99(3), 614–639.
  • 94 Tranmer, M., Steel, D., & Browne, W. J. (2014). Multiple‐membership multiple‐classification models for social network and group dependences. Journal of the Royal Statistical Society: Series A (Statistics in Society), 177(2), 439–455.
  • 95 Valente, T. W. (1995). Network models of the diffusion of innovations. New York: Hampton Press.
  • 96 Valente, T. W. (2005). Network models and methods for studying the diffusion of innovations. In P. J. Carrington, J. Scott, and S. Wasserman (Eds.), Models and Methods in Social Network Analysis, pp. 98–116. Cambridge: Cambridge University Press.
  • 97 Valente, T.W. (2012) Network Interventions. Science, 337, 49-53.
  • 98 Wang, P., Robins, G., Pattison, P., and Koskinen, J. (2014). MPNet, Program for the Simulation and Estimation of (p*) Exponential Random Graph Models for Multilevel Networks: USER MANUAL. Melbourne School of Psychological Sciences, The University of Melbourne, Australia.
  • 99 Vitale, M. P., Porzio, G. C., & Doreian, P. (2016). Examining the effect of social influence on student performance through network autocorrelation models. Journal of Applied Statistics, 43(1), 115-127.
  • 100 Zhang, B., Thomas, A. C., Doreian, P., Krackhardt, D., & Krishnan, R. (2013). Contrasting multiple social network autocorrelations for binary outcomes, with applications to technology adoption. ACM Transactions on Management Information Systems (TMIS), 3(4), 18.