Arguably, human beings are both products and creators of their social environments, that is, of the collection of relationships individuals engage in. Although the influence of individuals’ social environments on their socio-economic choices has been widely studied, the converse influence, namely that humans’ behaviors shape their social environment, has, perhaps surprisingly, been overlooked. Substantively, recognizing that friendships are part of one’s choice offers a novel perspective on the mode of operation of social conformity pressures (instead of yielding to peer pressure, individuals may alter their friendships), and on the economic (and statistical) analysis and evaluation of public policies. More specifically, it shifts the focus from questions such as how the social network propagates changes in individuals’ behaviors, say, due to a policy change, to questions such as how the social network responds to policies targeting human behaviors.
While individuals may selfishly choose their optimal behaviors, aiming to maximize their individual utility (Nash, 1950), they commit to relationships that result from communication, coordination, and cooperation (Jackson and Wolinsky, 1996). The tension between the instinct for selfish choices and the consensual nature of humans’ friendships can be prototyped as a game of link and node statuses, and this paper contributes to the understanding of this environment: (i) by proposing a novel formulation of individuals’ decision problem and a family of equilibria that capture both the selfish nature of individuals’ strategic interactions and the consensual nature of human relationships; (ii) by obtaining both ordinal and cardinal probabilistic rankings of these equilibria, rankings that have broader implications, including for both estimation of and simulation from the proposed model; (iii) by presenting empirical evidence that: (a.) the response of the friendship network to increases in tobacco price amplifies the intended policy effect on smoking, (b.) racially desegregating high schools, by stimulating social interaction among students who each have a different intrinsic propensity to smoke, decreases the overall prevalence of smoking, (c.) the response of the social network is quantitatively important in assessing the aggregate spillovers from changes in the behavior of a subset of the population, (d.) the estimation biases when the network externalities are misspecified and when peer effects are omitted are of the same sign.
Individuals’ behaviors and relationships materialize fundamentally different decision processes. Whereas for an individual to engage in a certain behavior she need not consider any but her own incentives, for a relationship to emerge and be sustained, there needs to be a consensus between both parties. The consensual nature of human relationships places particular restrictions, which I term stability constraints, on the outcome of individuals’ decisions to engage in a given behavior or in a given relationship. Simply put, individuals confine their choices of friendships to those for which there is a mutual consent.
Given an agent’s incentives, her observed links and behaviors are likely to compare favorably against her alternatives, i.e., are likely to be robust against a set of feasible deviations. Reasoning about the complexity of individuals’ decision problem (there are possible link deviations and only possible one-link-at-a-time link deviations) motivates a family of equilibria indexed by the diameter of permissible deviations. For a population of size , Nash equilibrium in a (-player) stable network (NESN) is a configuration of links and node statuses in which no player has a profitable deviation when contemplating his action status and a subset of of his links, given, of course, the stability constraints for this subset. For , NESN is less demanding on the players in comparison with a Nash play (i.e., when ) and is, therefore, a more tenable assumption in large populations.
Friends’ influences have often been pointed to as a major driver of human behaviors and have been associated with the potential to create a domino effect. (This effect is also known as the bandwagon effect or social multiplier. See the pioneering study of Leibenstein (1950) and, for a comprehensive treatment, the volume in Benhabib, Bisin and Jackson (2010).) The main premise of this study is that a converse mechanism may be in place, in which a change in individual A’s decisions re-shapes who her peers are as opposed to pressuring her peers to follow her decisions; e.g., an individual who contemplates ceasing smoking (after an increase in tobacco prices) may at the same time reconsider her friendship network! This motivates a model in which individuals decide on both their friendships and their behaviors (the decision to smoke). Importantly, this model ought to acknowledge the possible externalities, on the one hand between smokers and their peers, e.g. peer effects, and on the other hand among peers themselves. Externalities, in turn, give rise to multiplicity of equilibria.
The -player consensual dynamic (CD) is a family of myopic dynamic processes in which, every period, an individual meets potential friends and decides whether or not to befriend each of them as well as whether or not to revise her action choice. In the presence of random preference shocks, the CD family induces a stationary distribution over the entire set of possible outcomes, which embeds the family of NESN in an intuitive way (each NESN is -neighborhood local mode of the stationary distribution) and which, because of its invariance to , ranks probabilistically each equilibrium within the family, even for different -s. In addition to the cardinal ranking, analysis of the -CD independently delivers, as a by-product, a reaffirming ordinal ranking of these equilibria. The larger is, the faster CDs approach the stationary distribution, i.e., the more likely the stationary distribution is to represent those CDs, and the more probable the rest points of these CDs (and, of course, NESN states) are (see Nash, 1950; Foster and Young, 1990; Kandori et al., 1993; Blume, 1993; Jackson and Watts, 2002; some of the ideas I exploit are encountered in Cournot, 1838, Chapter VII).
The convergence properties of CDs have immediate implications for the implementation of the proposed model. The model’s likelihood is given by the (unique) stationary distribution of the CD family. This distribution pertains to the Exponential Random Graph Models (see Frank and Strauss, 1986; Wasserman and Pattison, 1996), for which both direct estimation and simulation from the model with known parameters are computationally infeasible (an evaluation of the likelihood involves a summation with terms, e.g. for , terms).
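To give a sense of why direct evaluation is infeasible, the sketch below counts the configurations over which such a summation would run. The counting rule is an assumption made here for illustration, since the exact figure is not legible in the text above: each of the individuals has one binary action, and each ordered pair (or, alternatively, each unordered pair) of individuals has one binary link status. The function names are hypothetical.

```python
def num_binary_variables(n: int, directed: bool = True) -> int:
    """Count binary choice variables for n individuals.

    Assumption (illustrative): one binary action per individual, plus one
    binary link status per ordered pair (directed) or unordered pair.
    """
    links = n * (n - 1) if directed else n * (n - 1) // 2
    return n + links


def num_states(n: int, directed: bool = True) -> int:
    """Number of distinct action/network configurations under that assumption."""
    return 2 ** num_binary_variables(n, directed)


if __name__ == "__main__":
    for n in (3, 5, 10):
        print(n, num_binary_variables(n), num_states(n))
```

Under either counting rule the number of terms grows exponentially in the square of the population size, which is why estimation has to rely on simulation rather than on enumeration.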
For these models, the double Metropolis-Hastings sampler offers a Bayesian estimation strategy that nevertheless relies on simulations from the stationary distribution via Markov chains (see Murray et al., 2006; Liang, 2010; Mele, 2017). The CD family has varying convergence rates, indexed by , which in turn suggest a transparent strategy for designing these Markov chains. (Poor convergence properties are associated with local Markov chains, where each update is of size (Bhamidi et al., 2011). The novelty here is sampling with varying , on the support , which speeds up the convergence of the proposed algorithm. I thank a referee for pointing me in this direction.)
The model is estimated with data on smoking behavior, friendship networks, and home environment (parental education background and parental smoking behavior) from the National Longitudinal Study of Adolescent Health. (Details about the Add Health data, including the sample construction, are in the appendix.) This is a longitudinal study of a nationally representative sample of adolescents in the United States, who were in grades 7–12 during the 1994–1995 school year.
1.1 Conclusions from the empirical analysis
The estimation exercise reveals the role of network data availability for the analysis of the determinants of teen risky activities. In particular, ignoring the existence of peer effects or misspecifying the externalities in the network formation leads to larger biases in estimating the price coefficient compared to when the social network is kept fixed or network data are not available. When comparing the estimates for other determinants of adolescent smoking (presence of a smoker in the household or maternal education), the estimation biases when the network externalities are misspecified and when peer effects are omitted are of the same sign. This observation is robust to an alternative specification in which income (allowances) is included instead of the price.
Counterfactual experiments with the estimated model quantify the response of the friendship network in various settings. The first experiment asks whether this response is relevant for policies working through changes in tobacco prices. To motivate this exercise, compare how individuals respond to a price increase in fixed versus endogenous network environments. There are two effects to consider. The direct effect of changing tobacco prices is the first-order response and, intuitively, will be larger whenever individuals are free to change their friendships, i.e. more individuals are likely to respond immediately to changes in tobacco prices provided they are not confined to their (smoking) friends. The indirect (ripple) effect of changing tobacco prices is the effect on smoking which is due, in part, to the fact that one’s friends have stopped smoking. In contrast to the direct effect, the indirect effect is propelled by a fixed network, e.g. an individual who changes her smoking status is bound to exert pressure on her friends (most likely smokers) and, thus, is likely to alter her friends’ decisions to smoke. It is, then, an empirical question how these two opposing effects balance out. Simulations with the estimated model suggest that the direct effect dominates, i.e. that following an increase in tobacco prices the response of the friendship network amplifies the intended policy effects of this increase.
The second experiment asks whether school racial composition has an effect on adolescent smoking. When students from different racial backgrounds study in the same school, they interact and are likely to become friends. Being from different racial backgrounds, students have different intrinsic propensities to smoke, and the question is what the equilibrium behavior in these mixed-race friendships is: do those who do not smoke start smoking, or do those who smoke stop smoking? Simulations from the model suggest that redistributing students from racially segregated schools into racially balanced schools decreases the overall smoking prevalence.
The last experiment examines a school from the sample with a particularly high smoking prevalence (). The experiment simulates a policy intervention capable of influencing students’ smoking decisions but targeting only part of this school’s population. (The policy may consist of providing direct incentives or information about the health risks associated with smoking tobacco. It may be too expensive to treat the entire school and, instead, the policy maker may engage only a small part of the school with the purpose of altering the social norms.) The (empirical) question of interest is whether, when treated individuals return to the school, their friends will follow their example, i.e. extending the effect of the proposed policy beyond the set of treated individuals and thus creating a domino effect, or whether their pre-treatment friends will un-friend them. In essence this is a question about the magnitude of the spillover effects, and this paper’s contribution is to account for the possibility that the friendship network adjusts to the proposed treatment. This study suggests that this spillover effect is in the neighborhood of folds.
1.2 Related literature
The empirical models in Nakajima (2007) and Mele (2017) inspired the proposed framework. Nakajima (2007) studies peer effects abstracting from friendships and Mele (2017) obtains large network asymptotics of a model with link formation only. The approaches in these papers are fundamentally compatible, so these models can be unified in a joint model, as in Hsieh and Lee (2014), Boucher (2016) and Hsieh et al. (2016). Compared to existing frameworks, including Cabrales et al. (2011a) and Canen et al. (2016), I depart from purely non-cooperative formulations of individuals’ decision problem. More specifically, the proposed solution concept embeds the necessity of consensus for forming a relationship (Jackson and Wolinsky, 1996).
A handful of theoretical papers consider (broadly related) adaptive link dynamics or model network formation along with other choices potentially affected by the network (see Jackson and Watts, 2002; Goyal and Vega-Redondo, 2005; Cabrales et al., 2011b; Hiller, 2012; König et al., 2014; Baetz, 2015; Lagerås and Seim, 2016). For example, König et al. (2014) obtain a characterization, later generalized by Lagerås and Seim (2016), of the Nash equilibria in our model as nested split graphs, that is, graphs where nodes’ neighborhoods have strict hierarchical structures, i.e. a node’s neighborhood contains the neighborhoods of all nodes with lower degrees. Similar results are obtained by Hiller (2012) in the context of -player stable networks. (An analogous characterization of the family of NESN should be of independent interest; given the focus of this paper, I leave this for another occasion. In similar settings, Jackson and Watts (2002) and Goyal and Vega-Redondo (2005) obtain conditions under which the risk-dominant or the efficient action prevails when they analyze similar dynamic processes, as opposed to the equilibrium networks.) Importantly, the available theoretical frameworks are meant to provide focused insights into isolated features of networks and deliver sharp predictions, but are not easily adapted for the purposes of estimation. (A typical approach is to focus the analysis on a particular equilibrium as opposed to discussing all equilibria. Multiplicity of equilibria reflects the possibility of network and behavioral externalities, which is an indispensable feature of our settings.)
Potential function representation as a dimensionality reduction tool is widely used in (algorithmic) game theory, computer science and the economics of networks for processes on fixed networks, for processes of link formation and, more recently, for combined processes, e.g. Foster and Young (1990), Blume (2003), Jackson and Watts (2002), Nakajima (2007), Bramoullé et al. (2014), Bourlés et al. (2017), Mele (2017), Boucher (2016), Hsieh and Lee (2014), Hsieh et al. (2016). (Congestion games were the first class of games exhibiting this property (Beckmann, McGuire and Winsten, 1956; Rosenthal, 1973). Monderer and Shapley (1996) recognize that congestion games are instances of games with potential, propose several notions of potential functions for games in strategic form, and obtain a characterization of potential games.) In the proposed framework, the role of the potential function is to justify the gravitation of a family of adaptive dynamics around equilibria of the link/behavior game. This study also suggests a direction for addressing the poor statistical properties of those models. (This is an important point for the implementation of those models. For more background, see the discussion in Bhamidi et al. (2011) and Chandrasekhar and Jackson (2016).)
Econometric models of networks and actions are proposed in Goldsmith-Pinkham and Imbens (2013) and Hsieh and Lee (2014), where the decisions to form friendships influence the decision to engage in a particular activity. The focus of their research, however, is neither on policy analysis nor on accounting for the possible endogenous response of the friendship network to changes in the decision environment. Thus, they are able to abstract from the converse feedback from actions to friendship formation and, also, to avoid the equilibrium microfoundations of a strategic model and an explicit treatment of the possibility of multiplicity. In contrast, the framework proposed in Boucher (2016) is microfounded as a particular equilibrium in a non-cooperative model of friendships and behaviors. Related work by Hsieh et al. (2016) proposes a two-stage estimation procedure, with an application to R&D, which relies on conditional independence of links delivered by abstracting from link externalities. Auerbach (2016) obtains identification results within large network asymptotics which also rely on conditional independence of links. While the assumptions leading to conditional independence are convenient, they limit the scope for studying peer effects, in addition to ruling out externalities in the network formation. (It is important to realize that network economics owes its appeal precisely to the fact that networks conceptualize so naturally situations with externalities, so ruling these out in essence defeats the purpose of using network economics.) Finally, there are recent contributions to the econometrics literature which focus on link formation, though these are not easily extended to include action choice as well (see Sheng (2014), Chandrasekhar and Jackson (2016), Leung (2014), de Paula et al. (2016), Graham (2017), Menzel (2015) and the reviews in Chandrasekhar (2015), de Paula (2016), and Bramoullé et al. (2016)).
The theoretical analysis in Jackson (2018) argues that variability in individuals’ popularity (degree in a social network) leads to biased perceptions of the social norm, which in turn leads to higher levels of activities compared to a situation in which there is no variability in individuals’ popularity. (This increase is further amplified if the most popular individuals are those with the highest proclivity for extreme behaviors.) Although the model in this paper is different in that individuals choose their friends and I exploit different information assumptions, one of my counterfactual experiments hints at such amplification mechanisms (in quite different settings), where as a result of the endogenous response of the social network to a price increase the intended effects on overall smoking are amplified.
The empirical analysis of friendship networks and smoking behaviors lends support to a host of results which are related to the large body of empirical work on social interactions and teen risky behaviors. Typically, empirical studies on peer effects either lack data on the friendship network or take the friendship network as given (see, for example, Liu et al. (2014), who distinguish between local aggregate and local average peer effects, and the references therein). Also, the approaches range from models that directly relate an individual’s choices to mean characteristics of his peer groups (see Powell et al. (2005), Ali and Dwyer (2009a), and the references therein) to models with elaborate equilibrium micro-foundations, such as those in Brock and Durlauf (2001, 2007); Krauth (2005); Calvó-Armengol, Patacchini and Zenou (2009). In terms of estimates, this paper takes a first step in explaining how not accounting for the response of the social network (e.g. as a result of lack of data) could bias the estimates. (It is difficult, if not impossible, to account for the empirical contributions of the large literature on peer effects and teen risky behaviors. For a small sample of papers obtaining estimates of peer effects see Chaloupka and Wechsler (1997), Ali and Dwyer (2009b) and the references in (CDC, 2000, Surgeon General’s Report).) Similarly, this paper pioneers a mechanism capable of explaining the role of the school composition, or more generally the determinants of the social fabric, in overall teen risky behaviors. The possibility of such a role was theorized by Graham et al. (2014) and experimentally discovered by Carrell et al. (2013).
It is important to recognize that this study has limitations. The presence of unobservables which affect both individuals’ propensity to smoke and their friendship decisions could, in principle, influence the model’s predictions (see Manski (1993) and the discussion in Blume et al. (2015)). While the model can be augmented to accommodate an extension along these lines, identification of the model’s parameters becomes delicate, and frequentist and Bayesian perspectives differ substantially on this issue. Since I do not want to exploit the disagreement between those perspectives, I discuss this point only briefly and leave it for future research.
2 A game on an endogenous network
The model captures two primitives of a (strategic) environment where individuals choose both their behaviors and their relationships jointly. First, individuals’ behaviors and relationships materialize fundamentally different decision processes. While for an individual to engage in a certain behavior she need not consider any but her own incentives, for a relationship to emerge and be sustained there needs to be a consensus between both parties. At the same time, the incentives which substantiate individuals’ relationship decisions remain selfish (and asymmetric), so that pure strategies and payoff functions remain appropriate means for a mathematical description of the game. To reflect the consensual nature of human relationships, I augment the individual’s decision problem with stability constraints. The point is that, in contemplating their optimal play, individuals are aware that, were the stability constraint violated, a relationship could not be established. (In essence, the stability constraints embed the consensual nature of the pairwise stability of Jackson and Wolinsky (1996).)
The second primitive of the model is the presence of externalities, which models of network formation are, indeed, well suited to capture. These externalities exist not only within the relationships per se, but also between the action decisions and the relationships. Externalities between relationships stem from the motives for sharing common friends or, alternatively, the motives for exclusivity of a relationship. In addition, externalities between individuals’ behaviors and relationships reflect the motives for conformism with friends’ behaviors or, conversely, the motives to associate with those who share common habits.
The model is developed in two stages. First, agents’ strategic behavior is analyzed in static settings, and then a family of myopic dynamic processes is used to approximate the predictions of the static model in an inferentially suitable way.
2.1 Players and preferences
Each individual (say ), in a finite population , decides on a binary action and her social network, given by the set of her relationships for . For the empirical application, is the collection of all student cohorts in a given high school at a given time period, where if student smokes and if there is a friendship between and . (Alternatively, could be the population of a geographically isolated area. In general, any closed collection of individuals who draw friends from within themselves will fit the assumptions of the model.)
Agent selects her friendship and action statuses from her choice set to maximize her payoff, which depends both on her exogenous characteristics , e.g. age, gender, etc., and on her endogenous characteristics, e.g. network position, decisions of her network neighbors, etc. Let and . Formally ’s payoff function, , orders the outcomes in given :
Here , and are functions of agents’ (exogenous) characteristics.
Note that the incremental payoff of changing ’s action status, , depends on ’s observables, her friendship links, and the choices of the overall population. The coefficient captures the possibility that the intrinsic preferences over different action statuses may depend on an agent’s attributes. The coefficient in captures conformity pressures. (This is an example of a positive externality. Alternatively, the local externality term may capture competitive pressures, in which case this term may have a negative sign.) In the case of friendships, one may be influenced more strongly by the behavior of one’s own friends than by that of casual acquaintances. Continuing, in the second summation captures the aggregate externalities: may be influenced by the behaviors of the surrounding population, irrespective of whether these are friends or not. (In principle, could be a function of an individual’s exogenous attributes capturing, for example, a situation where males are more likely to be affected by the observed behavior of other males as opposed to the observed behavior of females.)
Consider the incremental payoff of a friendship, . In that incremental payoff, the term captures the baseline benefit of friendship, which may depend on ’s and ’s degree of similarity, i.e., same gender, race, etc. The terms with capture link externalities. Mechanically, if links to and links to then may be more likely to link to (thus closing the triangle). On the contrary, if there is friendship rivalry these forces will have the opposite effect, i.e. will be negative.
It is important to emphasize that the labels of the terms in (2) are figurative and are only meant to draw intuition from identification concerns that are well recognized in the literature on social interactions. While friends’ influences have often been pointed to as a major driver of human behaviors and have been associated with the potential to create a domino effect, the main premise of this paper is that a converse mechanism may be in place, where a change in ’s decisions re-shapes who her peers are as opposed to pressuring her peers to follow ’s decisions, e.g. an individual who contemplates ceasing smoking (after an increase in tobacco prices) may at the same time reconsider her friendship network!
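The verbal description of the payoff terms can be made concrete with a small numerical sketch. The functional form below is an illustrative assumption and is not the paper's equation (2): `v` stands for intrinsic preferences over the action, `phi` for conformity with friends, `h` for the aggregate externality, `w` for the baseline benefit of a friendship, and `q` for the triangle-closing link externality. All of these names and all parameter values are hypothetical.

```python
def payoff(i, a, g, v, phi, h, w, q):
    """Illustrative payoff of player i (an assumed form, not the paper's).

    a: list of binary actions; g: symmetric 0/1 adjacency matrix.
    """
    n = len(a)
    others = [j for j in range(n) if j != i]
    # Action part: intrinsic taste, conformity with friends, aggregate externality.
    action = a[i] * (
        v[i]
        + phi * sum(g[i][j] * a[j] for j in others)
        + h * sum(a[j] for j in others)
    )
    # Link part: baseline benefit of each friendship.
    baseline = sum(g[i][j] * w[i][j] for j in others)
    # Link externality: triangles through i (each counted once per ordered pair).
    triangles = q * sum(
        g[i][j] * g[j][k] * g[k][i] for j in others for k in others if k != j
    )
    return action + baseline + triangles


# Toy example: three mutually linked students, two of whom smoke.
a = [1, 1, 0]
g = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
v = [-1.0, -1.0, -1.0]
w = [[0.2] * 3 for _ in range(3)]
params = dict(v=v, phi=0.5, h=0.1, w=w, q=0.3)

u_smoke = payoff(0, a, g, **params)
u_quit = payoff(0, [0, 1, 0], g, **params)
delta_action = u_smoke - u_quit  # incremental payoff of player 0's action
```

In this toy example the incremental payoff of the action depends on the player's own taste, on her friends' actions, and on the population's actions, while the link terms cancel out of the difference, mirroring the decomposition described in the text.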
2.2 Equilibrium play
There is a tension between the selfish nature of humans’ behaviors, aiming to (selfishly) maximize individual’s utility, and the consensual nature of humans’ friendships calling for cooperative decisions. I assume that players internalize the necessity for consensus by confining their choices to the stability constraints. (Stability constraints bear similarities with incentive compatibility constraints from the theory of mechanism design, which have been discussed as early as Hurwicz (1960, 1972).)
Definition 1 (NESN)
A profile of actions and a network is a Nash equilibrium in a (-player) stable network, provided is a solution of decision problem on :
where , and , for all and .
In particular, in a NESN no player has permissible, by the stability constraints (4), and profitable deviations in any restriction of the game on nodes.
In a NESN, an individual’s play is optimal with respect to a reference set of contemplated strategies, and the size of this reference set depends on .
Assume that and are symmetric in their arguments.
In proposition 1, the existence of a non-trivial solution to the individual’s decision problem in (3-4) and an equilibrium is guaranteed by the potential representation for this game. The proof is in appendix A (p. A).
For , any NESN play is robust against a single link deviation and hence pairwise stable (this holds for any payoffs). The larger is, the more elaborate the deviations individuals contemplate when choosing an optimal play. In our settings, for , NESNs are Nash networks. Further, pairwise stability is interpreted as a necessary condition for the existence of a NESN (proposition 2 part 3). The proof is in appendix A (p. A).
3 -player dynamic and a random utility framework
As is typical for an equilibrium, NESNs present an intuitive way to conceptualize the outcome of the (non-cooperative and cooperative) forces which drive action and relationship statuses in our model, without specifying the decision process leading to this outcome. This abstraction is challenged by two primitives of our settings: (a.) the complexity of agent’s decision problem due to the size of her action space and (b.) the possibility for multiple equilibria due to the presence of externalities. Turning to a framework based on adaptive dynamics simplifies agents’ decision process and delivers a way to embed the multiplicity of NESN in an inferentially convenient way.
The formulation of individuals’ decision problem (3-4) provides a basis for an adaptive process describing the evolution of networks towards (or around) NESNs. The proposed dynamic builds on Nash (1950), who suggested that equilibrium might arise from a simple (myopic) adaptive dynamic rather than from a complex reasoning process, and Jackson and Watts (2002), who, in the context of network evolution, introduced the notion of “improving paths” in which the adaptive dynamic is consensual and moves incrementally (see also Cournot (1838); Blume (1993)). In Jackson and Watts (2002), each period a pair of players meet and update the status of their relationship (a single link of theirs), while in the literature studying the myopic best reply dynamic, agents take turns to update their strategies (i.e., in our settings, all links). The proposed dynamic spans these approaches.
3.1 -player consensual dynamic (-CD)
Every period a randomly chosen individual (say ) reconsiders of her friendships and her behavior . In particular, myopically solves her decision problem on from (3–4), i.e. cannot establish a relationship with unless and . A stochastic meeting process determines who makes choices and which individuals are considered as potential friends. Formally:
The sequence of meetings and players’ optimal decisions induces a sequence of network states , which is indexed by a time subscript and which will be referred to as the -player consensual dynamic (-PCD). (In the simplest case, any meeting is equally probable so that: . Note that in large networks the meetings may be biased towards people with similar characteristics (or people sharing friends) as in Currarini et al. (2009). Such meetings will result in the presence of triangles in the friendship network. My approach is to place little structure on the meeting process and, instead, to include a triangle motive directly in the utility function. In this study the school networks are of relatively small size: in compact social spaces everybody is likely to have met everybody else, so that meeting frictions are less likely to influence the patterns of friendships.)
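One period of such a dynamic can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formal definition: links are undirected, the payoff built by `make_utility` is a hypothetical toy form, meetings are uniformly random, and the stability constraint is taken to mean that a partner must weakly prefer having the link to not having it. All function and parameter names are invented for this example.

```python
import itertools
import random


def make_utility(v, w, phi):
    """Hypothetical payoff: own-action taste plus, per friend, a link benefit
    and a conformity term (an assumed form, for illustration only)."""
    def u(i, a, g):
        n = len(a)
        return a[i] * v[i] + sum(
            g[i][j] * (w[i][j] + phi * a[i] * a[j]) for j in range(n) if j != i
        )
    return u


def set_link(g, i, j, status):
    """Return a copy of g with the undirected link ij set to status."""
    g2 = [row[:] for row in g]
    g2[i][j] = g2[j][i] = status
    return g2


def consents(j, i, a, g, u):
    """Assumed stability constraint: j weakly prefers having the link ij."""
    return u(j, a, set_link(g, i, j, 1)) >= u(j, a, set_link(g, i, j, 0))


def k_cd_step(a, g, i, meet, u):
    """Player i myopically re-optimizes her action and her links to `meet`."""
    best = None
    for a_i in (0, 1):
        for links in itertools.product((0, 1), repeat=len(meet)):
            a2 = list(a)
            a2[i] = a_i
            g2 = g
            for j, status in zip(meet, links):
                g2 = set_link(g2, i, j, status)
            # Keep only proposals to which every linked partner in `meet` consents.
            if not all(consents(j, i, a2, g2, u)
                       for j, status in zip(meet, links) if status == 1):
                continue
            value = u(i, a2, g2)
            if best is None or value > best[0]:
                best = (value, a2, g2)
    return best[1], best[2]


def simulate(a, g, k, periods, u, seed=0):
    """Run the dynamic with uniformly random meetings of k - 1 potential friends."""
    rng = random.Random(seed)
    n = len(a)
    for _ in range(periods):
        i = rng.randrange(n)
        meet = rng.sample([j for j in range(n) if j != i], k - 1)
        a, g = k_cd_step(a, g, i, meet, u)
    return a, g


# Toy example: player 2 dislikes a link with player 0, so it never forms.
v = [0.5, 0.5, 0.5]
w = [[0, 1, 1], [1, 0, 1], [-5, 1, 0]]
u = make_utility(v, w, phi=0.2)
a0 = [0, 0, 0]
g0 = [[0] * 3 for _ in range(3)]
a1, g1 = k_cd_step(a0, g0, i=0, meet=[1, 2], u=u)
aT, gT = simulate(a0, g0, k=2, periods=50, u=u)
```

In the toy run, player 0 would like to befriend both other players, but only the link to player 1 is established because player 2 withholds consent. The enumeration over all joint link choices within the meeting set is what makes the step grow in cost with the number of links reconsidered.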
Any meeting is possible for all , , and .
Let and suppose that assumption 2 holds then
Any is absorbing i.e. if then for all .
Independently of the initial condition (distribution)
Indeed the NESNs are exactly the rest points of a simple decision process, the -PCDs. The proof is in Appendix A (p. A).
3.2 A random utility framework
The assumption below introduces a random preference shock to this discrete choice problem (see Thurstone (1927); Marschak (1960); McFadden (1974)). A similar stochastic element has been considered by the literature on stochastic stability, where, when shocks vanish over time, it serves as an equilibrium selection device (see Foster and Young (1990); Kandori et al. (1993)).
Suppose that the utilities in (2) contain a random preference shock. More specifically, let
with across time and network states. Moreover, suppose that has c.d.f. and unbounded support on .
Suppose that the preference shock is distributed .
Suppose that for the meeting probability : (i) does not depend on the relationship status between and any . (ii) does not depend on . These together imply for all and .
The matching process and the sequence of optimal choices, in terms of friends selection and individual actions, induce a Markov chain of network configurations on (see the stochastic-choice dynamics in Blume (1993)). The above assumption guarantees that this chain obeys some desirable properties, which are formalized below. (The proof is in the appendix on p. A.)
The first part is not surprising in that it asserts that CD is well behaved so that standard convergence results apply. The uniqueness of precludes any ambiguity in the predictions of the process, while the ergodicity is relevant whenever we want to draw predictions from the model. However, the second part of the theorem has novel implications.
Note how in (1) the stationary distribution does not depend on and, thus, delivers an approach to unify the family of . As we formally argue in theorem 3, the stationary distribution offers a probabilistic ranking of the set of equilibria in the family (within and across different s). In addition, the closed-form expression for the stationary distribution has advantages for the empirical implementation of the proposed framework, where can be treated as the likelihood. In particular, one can explore a transparent argument for the identification of the model’s parameters. It is clear that, given the variation in the data of individual choices , friendships and attributes , functional forms for will be identified as long as the different parameters induce different likelihoods of the data. Also, a closed-form expression for facilitates the use of likelihood-based estimation methods, including Bayesian ones. [Footnote 37: To provide intuition behind this result consider two states . It can be shown that the probability of moving from to is proportional to the probability of returning from to by a factor that does not depend on . The formal argument can be found in the appendix.]
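The detailed-balance intuition can be checked numerically on a toy example. The sketch below is not the paper's model: it assumes three binary statuses, a made-up potential, and a logit revision rule in which one status is redrawn per period. All names (`potential`, `logit_kernel`, and so on) are hypothetical. It verifies that the distribution proportional to the exponentiated potential is stationary for this chain.

```python
import itertools
import math

# Toy state space: 3 binary statuses (hypothetical stand-ins for link/node statuses).
N_SITES = 3
STATES = list(itertools.product((0, 1), repeat=N_SITES))

def potential(s):
    """Hypothetical potential: pairwise complementarities minus a per-unit cost."""
    pairs = s[0] * s[1] + s[1] * s[2] + s[0] * s[2]
    return 0.8 * pairs - 0.5 * sum(s)

def gibbs():
    """Candidate stationary distribution, proportional to exp(potential)."""
    weights = [math.exp(potential(s)) for s in STATES]
    z = sum(weights)
    return [w / z for w in weights]

def logit_kernel():
    """One-step transition matrix: pick a site uniformly, revise it by a logit rule."""
    index = {s: i for i, s in enumerate(STATES)}
    kernel = [[0.0] * len(STATES) for _ in STATES]
    for s in STATES:
        for site in range(N_SITES):
            options = [s[:site] + (v,) + s[site + 1:] for v in (0, 1)]
            weights = [math.exp(potential(o)) for o in options]
            for o, w in zip(options, weights):
                kernel[index[s]][index[o]] += w / (sum(weights) * N_SITES)
    return kernel

def stationarity_gap(pi, kernel):
    """Largest absolute entry of pi*K - pi; zero means pi is stationary."""
    n = len(pi)
    return max(abs(sum(pi[i] * kernel[i][j] for i in range(n)) - pi[j])
               for j in range(n))

def detailed_balance_gap(pi, kernel):
    """Largest violation of pi_i * K_ij = pi_j * K_ji."""
    n = len(pi)
    return max(abs(pi[i] * kernel[i][j] - pi[j] * kernel[j][i])
               for i in range(n) for j in range(n))
```

Both gaps are zero up to floating-point error in this toy, which mirrors the reversibility argument sketched in the footnote.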
Our next result hints at the role of the dimension of the meeting process in the -player dynamic. Our formulation is for the most stark case when is the only factor differentiating the properties of the -player dynamic. [Footnote 38: In general, the shape of the potential, i.e. the terms of the potential function, and the geography of the network will likely influence the speed of convergence. For more see Bhamidi et al. (2011). As far as I know, treatment of the most general case remains out of reach.] The proof is in appendix A (p. A).
[-PD ranking] Suppose that for all . For , the CD converges strictly faster than -CD to the stationary equilibrium . In particular, the second eigenvalue of CD is given by
There are two rationales behind the pursuit of a characterization of the speed of convergence of CD. As anticipated (and formally established shortly) probabilistically ranks the family of NESN. In a dual fashion, the differential speed of convergence provides a means to rank probabilistically the family of CD. [Footnote 39: so that the limiting behavior of this exponentiation is governed by the second eigenvalue.] Theorem 2 suggests that for large s, CD converge faster and the , to which these CD converge, have higher probability.
The second reason for studying how well the CD is represented by the stationary distribution is highlighted by Bhamidi et al. (2011), who showed that adaptive dynamics with local updates converge very slowly. [Footnote 40: A Markov chain is local if at most links are updated at a time. See also the discussion in Chandrasekhar and Jackson (2016).] Note that CD encompasses not only local updates, e.g. , and thus offers a way to address the problem of slow convergence (poor approximation). In addition, theorem 2 offers insights into an important trade-off for sampling design: the Markov chain faces a trade-off between speed of convergence and complexity in simulating the next step. Whenever is small, convergence to the stationary distribution is slower, but the computational difficulty of updating the network at each step is smaller. The opposite holds when is large. [Footnote 41: While this is true in general, the particular structure of the problem here permits computational shortcuts where, for each , the cost of updating the Markov chain is constant.]
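The speed-versus-cost trade-off can be illustrated on a toy chain. The sketch below is a simplification under stated assumptions, not the paper's dynamic: three binary statuses, a made-up potential, and a rule that redraws a uniformly chosen block of `k` statuses jointly by a logit rule. It estimates the second-largest eigenvalue of the transition matrix by power iteration for `k = 1` and `k = 2`. A larger block costs more per step (the number of options grows as 2 to the power `k`) but yields a smaller second eigenvalue, hence faster convergence, in this toy.

```python
import itertools
import math

N_SITES = 3  # toy size; each site is a hypothetical binary link or node status
STATES = list(itertools.product((0, 1), repeat=N_SITES))
INDEX = {s: i for i, s in enumerate(STATES)}

def potential(s):
    """Hypothetical potential with complementarities between statuses."""
    pairs = s[0] * s[1] + s[1] * s[2] + s[0] * s[2]
    return 0.8 * pairs - 0.5 * sum(s)

WEIGHTS = [math.exp(potential(s)) for s in STATES]
TOTAL = sum(WEIGHTS)
PI = [w / TOTAL for w in WEIGHTS]  # distribution proportional to exp(potential)

def block_kernel(k):
    """Transition matrix: pick k sites uniformly, redraw them jointly by a logit rule."""
    blocks = list(itertools.combinations(range(N_SITES), k))
    kernel = [[0.0] * len(STATES) for _ in STATES]
    for s in STATES:
        for block in blocks:
            options = []
            for values in itertools.product((0, 1), repeat=k):
                t = list(s)
                for site, v in zip(block, values):
                    t[site] = v
                options.append(tuple(t))
            total = sum(WEIGHTS[INDEX[o]] for o in options)
            for o in options:
                kernel[INDEX[s]][INDEX[o]] += WEIGHTS[INDEX[o]] / (total * len(blocks))
    return kernel

def second_eigenvalue(kernel, iterations=2000):
    """Power iteration on PI-centred functions; returns the final Rayleigh quotient."""
    n = len(PI)
    f = [float(i + 1) for i in range(n)]
    rayleigh = 0.0
    for _ in range(iterations):
        mean = sum(p * x for p, x in zip(PI, f))
        g = [x - mean for x in f]                      # remove the constant eigenvector
        norm = math.sqrt(sum(p * x * x for p, x in zip(PI, g)))
        g = [x / norm for x in g]
        f = [sum(kernel[i][j] * g[j] for j in range(n)) for i in range(n)]
        rayleigh = sum(p * x * y for p, x, y in zip(PI, g, f))
    return rayleigh

if __name__ == "__main__":
    for k in (1, 2):
        print(k, round(second_eigenvalue(block_kernel(k)), 4))
```

Both kernels leave `PI` invariant, so only the speed of convergence differs across block sizes.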
3.3.1 Probabilistic ranking and long-run equilibria
The stationary distribution obtained in theorem 1 gives an intuitive (probabilistic) ranking of the family of . Under , a network state will receive a positive probability, although it may not be an equilibrium in any sense. It is desirable, however, that in the vicinity of an equilibrium, the equilibrium receives the highest probability. Relatedly, the mode of (i.e. the state with the highest probability) has a special role. This offers a new perspective on the theoretical results on equilibrium selection from evolutionary game theory, namely equilibrium ranking.
To formalize our discussion, define the neighborhood of as:
Next, define a state as a long-run equilibrium of the network formation model if for any sequence of vanishing preference shocks, the stationary distribution places a positive probability on (Kandori et al., 1993).
A state is a Nash equilibrium in a pair-wise stable network iff it receives the highest probability in its neighborhood .
The most likely network states (the ones where the network spends most of its time) are pairwise Nash networks.
The long-run equilibria of the underlying evolutionary model are given by , which need not be Pareto efficient.
3.3.2 Meeting process of random dimension
Now consider what appears to be a very unrestrictive meeting process, where in every period a random individual meets a set of potential friends of random size and composition. Let be a discrete process with support and augment the meeting process with an additional initialization step with respect to the dimension of . In particular, at each period first is realized and then is drawn just as before. It is relatively straightforward to establish, without any assumptions on the process , that this augmented process has the same stationary distribution as the one from theorem 1. [Footnote 42: I omit here the formal statement and the proof as it essentially follows the one from theorem 1.] This is another demonstration of the fact that different meeting processes result in observationally equivalent models.
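The invariance claim can be sketched in one line. The notation here is assumed for illustration: $P_k$ denotes the one-step transition kernel when the meeting has dimension $k$, $q_{k,t}$ is the probability that dimension $k$ is realized in period $t$ (taken not to depend on the current network state), and $\pi$ is the stationary distribution from theorem 1, so that $\pi P_k = \pi$ for every $k$.

```latex
\[
  \pi \Big( \sum_{k} q_{k,t}\, P_k \Big)
  \;=\; \sum_{k} q_{k,t}\, \pi P_k
  \;=\; \sum_{k} q_{k,t}\, \pi
  \;=\; \pi ,
  \qquad \text{since } \sum_{k} q_{k,t} = 1 .
\]
```

Under these assumptions, each period's mixed kernel leaves $\pi$ invariant whatever the weights $q_{k,t}$ are, which is consistent with the statement that no restriction on the dimension process is needed.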
4 Data and estimation
The model is implemented with data on smoking and friendships from the National Longitudinal Study of Adolescent Health (Add Health). I employ a Bayesian estimation technique based on MCMC approximations of the likelihood. [Footnote 43: Direct estimation via maximizing the likelihood is not feasible because the likelihood can be evaluated only up to an intractable constant.] The particular MCMC algorithm I propose is a modification of Murray et al. (2006) and Mele (2017) drawing on theorems 1 and 2, which have broader implications for constructing algorithms for estimation and simulation from the model.
4.1 The Add Health data
The National Longitudinal Study of Adolescent Health is a longitudinal study of a nationally representative sample of adolescents in grades – in the United States in the – school year. In total, 80 high schools were selected together with their “feeder” schools. The sample is representative of US schools with respect to region of country, urbanicity, school size, school type, and ethnicity. The students were first surveyed in-school and then at home in four follow-up waves conducted in –, , –, and –. This paper makes use of Wave I of the in-home interviews, which contain rich data on individual behaviors, home environment, and friendship networks. [Footnote 44: In addition to the in-home interview from Wave I, data on friendship are available from the in-school and Wave III interviews. However, the in-school questionnaire itself does not provide information on important dimensions of an individual’s socio-economic and home environment, such as student allowances, parental education, and parental smoking behaviors. On the other hand, during the collection of the Wave III data, the respondents were not in high school anymore. For more details on Add Health research design, see www.cpc.unc.edu/projects/addhealth/design.]
To provide unbiased and complete coverage of the social network, all enrolled students in the schools from the so-called saturated sample were eligible for in-home interviews. These were 16 schools, of which 2 were large (with a total combined enrollment exceeding 3,300) and 14 were small (with enrollments of fewer than 300). One of the large schools is predominantly white and is located in a mid-sized town. The other is ethnically heterogeneous and is located in a major metropolitan area. The 14 small schools, some public and some private, are located in both rural and urban areas.
In addition, Add Health data have been merged with existing databases with information about respondents’ neighborhoods and communities. For example, the American Chamber of Commerce Research Association (ACCRA) compiles a cost of living index, which is linked to the Add Health data on the basis of state and county FIPS codes for the year in which the data were collected. From the ACCRA, I use administrative data on the average price of a carton of cigarettes. [Footnote 45: For details see the Council for Community and Economic Research www.c2er.org, formerly the American Chamber of Commerce Research Association.] Additional details about the estimation sample including sample construction and sample statistics are presented in the appendix.
4.2 Bayesian estimation
The model delivers a network state dynamic, which is a Markov chain with a unique stationary distribution . Because no information is available on when the network process started or on its initial state, the best prediction about the network state is given by . Thus, for estimation purposes, the stationary distribution can be thought of as the likelihood. Given a single network observation , the likelihood is given by:
where is the potential (evaluated at ) and is an (intractable) normalizing constant. [Footnote 46: The size of and the summation in calculating are so large, even for small networks, that the value of cannot be calculated directly for practical purposes. For the summation includes terms.]
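To give a sense of why the normalizing constant is intractable, the sketch below counts the terms in the sum for a small network. It assumes, for illustration only, that a state consists of one binary action per individual and one binary status per unordered pair of individuals; with directed links the count would be larger still. The function name is hypothetical.

```python
def count_network_states(n):
    """Number of states with n binary actions and one binary status per unordered pair.

    Assumption (not taken from the source): links are undirected, so there are
    n * (n - 1) / 2 potential links in addition to the n individual actions.
    """
    n_links = n * (n - 1) // 2
    return 2 ** (n + n_links)

if __name__ == "__main__":
    for n in (3, 5, 10):
        print(n, count_network_states(n))
```

Even ten individuals already give 2 to the power 55 states under this assumption, so direct summation is out of reach for school-sized networks.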
The specific form of the likelihood pertains to the exponential family, whose application to graphical models has been termed Exponential Random Graph Models. [Footnote 47: For more see Frank and Strauss (1986); Wasserman and Pattison (1996).] Various generative and descriptive approaches have been proposed, both within frequentist and Bayesian paradigms, to address this specific tractability problem. I adopt a Bayesian estimation and propose an algorithm for sampling from the posterior which is a modification of the double M-H algorithm of Murray et al. (2006) and Mele (2017), informed by theorem 2.
The posterior sampling algorithm is exhibited in table 1. In the original double M-H algorithm, an M-H sampling of from is nested in an M-H sampling of from the posterior . The novelty of my approach is the random dimension of the meeting process in step . Theorem 2 suggests that varying improves the convergence and theorem 1 demonstrates that changing leaves the stationary distribution unaltered. The validity of the particular implementation is proven in proposition 4.
Input: initial , number of iterations , size of the Monte Carlo , data S
|6.||Draw a meeting where and from|
|8.||Propose where are drawn from|
|11.||If then else|
|15.||If then else|
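The nesting of the two M-H samplers can be sketched on a toy exponential-family model. This is an illustration under stated assumptions rather than the algorithm of table 1: the state is a short binary vector, the sufficient statistic `stat` is made up, the prior is normal, the outer proposal is a uniform random walk, and the inner step flips a randomly sized set of sites instead of redrawing friendships and actions. The point it illustrates is that the intractable normalizing constants cancel in the outer acceptance ratio.

```python
import math
import random

def stat(y):
    """Hypothetical sufficient statistic: adjacent pairs that are both 1."""
    return sum(a * b for a, b in zip(y, y[1:]))

def inner_chain(y, theta, steps, rng):
    """Inner M-H targeting exp(theta * stat(.)), started at the data y."""
    w = list(y)
    for _ in range(steps):
        k = rng.choice((1, 2))                # random dimension of the update
        sites = rng.sample(range(len(w)), k)  # symmetric proposal: flip k sites
        proposal = list(w)
        for i in sites:
            proposal[i] = 1 - proposal[i]
        log_ratio = theta * (stat(proposal) - stat(w))
        if math.log(1.0 - rng.random()) < log_ratio:
            w = proposal
    return w

def double_mh(y, n_iter, inner_steps, step=0.5, prior_sd=3.0, seed=0):
    """Outer M-H over theta with a normal prior and a uniform random-walk proposal."""
    rng = random.Random(seed)
    theta, draws = 0.0, []
    for _ in range(n_iter):
        theta_new = theta + rng.uniform(-step, step)
        w = inner_chain(y, theta_new, inner_steps, rng)  # auxiliary draw at theta_new
        log_prior = (theta ** 2 - theta_new ** 2) / (2.0 * prior_sd ** 2)
        # Exchange-type ratio: the unknown normalizing constants cancel.
        log_accept = log_prior + (theta_new - theta) * (stat(y) - stat(w))
        if math.log(1.0 - rng.random()) < log_accept:
            theta = theta_new
        draws.append(theta)
    return draws

if __name__ == "__main__":
    data = [1, 1, 1, 0, 1, 1, 0, 0]
    posterior = double_mh(data, n_iter=2000, inner_steps=50)
    print(sum(posterior[500:]) / len(posterior[500:]))
```

The inner chain is only run for a finite number of steps, so the auxiliary draw is approximate; the number of inner steps plays the role of the size of the Monte Carlo in the input list of the table.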
[Varying double M-H algorithm] Let and suppose assumptions 2 and 3 hold. If in the algorithm of table 1, the proposal density conditional on meeting of random dimension , is symmetric, then the unconditional proposal is symmetric. In particular, the acceptance ratio of the inner M-H step 9 does not depend on and .
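The symmetry argument can be written out as follows. The notation is assumed for illustration: $q_k(S' \mid S)$ is the proposal density conditional on a meeting of dimension $k$, $\Pr(\kappa = k)$ is the probability of that dimension (taken not to depend on the current state), and $\mathcal{P}$ is the potential.

```latex
\[
  q(S' \mid S)
  \;=\; \sum_{k} \Pr(\kappa = k)\, q_k(S' \mid S)
  \;=\; \sum_{k} \Pr(\kappa = k)\, q_k(S \mid S')
  \;=\; q(S \mid S') ,
\]
\[
  \text{so that}\qquad
  \frac{\pi(S')\, q(S \mid S')}{\pi(S)\, q(S' \mid S)}
  \;=\; \frac{\pi(S')}{\pi(S)}
  \;=\; \exp\{\mathcal{P}(S') - \mathcal{P}(S)\} .
\]
```

Under these assumptions the acceptance ratio involves only the difference in potential between the proposed and the current state, and the distribution of the meeting dimension drops out.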
Finally, the Bayesian estimator requires specifying prior distributions and proposal densities. The prior
is at least a standard deviation away from the posterior mean. Proposals, , and are uniform over their respective domains.
The payoffs from (2) and (2) have five sets of parameters: , , , and . In the empirical specification, the first three are functions of the data , , . I explore a wide set of parametrizations which are informed by the salient features of the data and by the particular experiments I am interested in. [Footnote 48: Importantly, the trade-off between flexibility and data limitation is tight. Recall that our identification framework is cast in the many-networks asymptotics and, in the end, we have 16 networks.] Careful scrutiny of the data motivates the final specification, which is discussed in the appendix (appendix B.3 on page B.3).
The identification, within the framework of many networks, follows from the connection of the model with the family of exponential random graph models (ERGM). These are a broad class of statistical models, capable of incorporating arbitrary dependencies among the links of a network. As a result, ERGM have been very popular in estimating statistical models of network formation. A corollary of theorem 1 is that the likelihood of the model falls in the family of ERGM.
The likelihood , where .
As the number of networks grows to infinity, identification follows from the theory of the exponential family. In particular, it is enough that the sufficient statistics are linearly independent functions on . In the structural parametrization of the model above, this condition is established immediately. [Footnote 49: Most of the parameters are identified in the asymptotic frame, where the size of the network grows to infinity (as opposed to the number of networks going to infinity). Further discussion of this point is outside the scope of this work. For more see Xu (2011); Chandrasekhar and Jackson (2016).]
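The linear-independence condition can be checked mechanically on a small example. The sketch below is illustrative only: it enumerates all states of a hypothetical three-person network, evaluates a made-up vector of sufficient statistics (not the paper's specification), and computes the rank of the matrix whose rows are a constant plus the statistics. Full column rank means that no statistic is an affine combination of the others.

```python
import itertools
from fractions import Fraction

PAIRS = list(itertools.combinations(range(3), 2))  # the three potential links

def statistics(actions, links):
    """Hypothetical sufficient statistics for a 3-person network."""
    g = dict(zip(PAIRS, links))
    smokers = sum(actions)
    n_links = sum(links)
    both_smoke = sum(g[(i, j)] * actions[i] * actions[j] for i, j in PAIRS)
    triangles = links[0] * links[1] * links[2]
    return [smokers, n_links, both_smoke, triangles]

def design_matrix(stat_fn=statistics):
    """One row [1, statistics...] per state (actions and links) of the network."""
    rows = []
    for actions in itertools.product((0, 1), repeat=3):
        for links in itertools.product((0, 1), repeat=len(PAIRS)):
            rows.append([1] + stat_fn(actions, links))
    return rows

def rank(rows):
    """Exact matrix rank by Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

if __name__ == "__main__":
    matrix = design_matrix()
    print(rank(matrix), len(matrix[0]))
```

If a redundant statistic is added, for example a linear combination of the number of smokers and the number of links, the rank stays the same while the number of columns grows, which signals a failure of the condition.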
Unobservable heterogeneity in friendship selection and decision to smoke
In addition to the model’s parameters for observable attributes, it is possible to incorporate agent-specific unobservable types which may influence both the utility for friendships, e.g. could include a term , and also the propensity to smoke, e.g. could include a term . In this case the likelihood has to integrate out :
There are a couple of approaches to discuss identification in this case. Within the Bayesian paradigm, identification casually obtains as long as the data provides information about the parameters. To understand the issues, note that even a weakly informative prior can introduce curvature into the posterior density surface that facilitates numerical maximization and the use of MCMC methods. However, the prior distribution is not updated in directions of the parameter space in which the likelihood function is flat (see An and Schorfheide (2007)). From a frequentist perspective, the heuristic identification argument goes as follows. Individuals who are far away in observables must have realizations of the unobservables very close by. If in the data those individuals are either smokers or non smokers with very high probability, then it must be the case that is large. However, formalizing this argument is neither immediate, nor is it clear whether this argument will support non-parametric identification, so this endeavor is left for future research.
Adding unobservable heterogeneity imposes substantial computational costs to the estimation and simulation algorithms because these need to include an additional step—drawing from . For the Add Health data, such an extension is out of reach because of the limited number of well-sampled networks.
4.5 Estimation results
|Utility of smoking|
|Parameter||No Net Data||Fixed Net||No PE||No Tri||Model|
|1||Baseline probability of smoking|
|4||Mom edu (HS&CO)|
|7||of the school smokes||—|
|Utility of friendships|
|Parameter||No SNet Data||Fixed Net||No PE||No Tri||Model|
|8||Baseline number of friends||—||—|
Note: MP stands for the estimated marginal probability in percentage points and MP for estimated marginal probability in percent, relative to the baseline probability. The posterior sample contains simulations before discarding the first . Posterior mean outside of the shortest credible sets is indicated by // respectively.
Table 2 presents the posterior means of the model’s parameters, which have been transformed for ease of interpretation to baseline probabilities, marginal probabilities ( in ppt) and relative marginal probabilities ( in pct). [Footnote 50: For example, the baseline probability of smoking is derived from the intercept as . Superscript MP stands for marginal probability and MP stands for marginal probability in percentages with respect to the baseline probability of smoking . Appendix B.3 on page B.3 provides details.] The estimates (in the last column) suggest a substantial role for friends and home environment in adolescents’ decisions to smoke. In particular, one additional friend who is a smoker increases the conditional probability of smoking by ppt. [Footnote 51: Because both friendships and smoking are choices in the model, this parameter should be interpreted with caution. In particular, the estimate cannot be interpreted as the effect on the likelihood of smoking from a randomly assigned friend who is a smoker as individuals cannot be forced into friendships.] If of the students in a school smoke, all other things being equal, then an individual is ppt more likely to smoke (row ). Also the presence of a smoker in the household increases the likelihood of smoking by ppt. Note that these marginal effects are first order approximations which do not take into account equilibrium adjustments of individuals’ friendships. [Footnote 52: This is addressed in the following section.]
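The transformation from coefficients to probabilities can be sketched as follows. The exact mapping is given in the paper's appendix and is not reproduced here; the sketch assumes a logistic link, which is the natural choice under a logit-type preference shock. The function names and the numbers in the usage lines are hypothetical.

```python
import math

def logistic(v):
    """Logistic link, assumed here for illustration."""
    return 1.0 / (1.0 + math.exp(-v))

def marginal_probabilities(intercept, coefficient):
    """Return (baseline probability, marginal effect in ppt, marginal effect in pct).

    The marginal effect compares the probability at intercept + coefficient with
    the baseline probability at the intercept alone; the percentage version is
    expressed relative to the baseline.
    """
    baseline = logistic(intercept)
    shifted = logistic(intercept + coefficient)
    mp_ppt = 100.0 * (shifted - baseline)
    mp_pct = 100.0 * (shifted - baseline) / baseline
    return baseline, mp_ppt, mp_pct

if __name__ == "__main__":
    # Hypothetical intercept and coefficient, not estimates from the paper.
    print(marginal_probabilities(-1.0, 0.4))
```

As the text notes, such numbers are first-order approximations: they hold friendships and others' behavior fixed.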
In addition to the model’s posteriors (in the last column), table 2 presents the posterior means for different scenarios: (a.) without network data, (b.) with fixed network, (c.) without peer effects, (d.) without externalities of common friends (triangles). While the estimates have particularly limited interpretation, it is worth pointing out that the estimate for the price coefficient does not vary much and, as figure 1 suggests, the largest biases arise when the peer effects terms or the externalities from common friends are omitted. In contrast, when the network is held fixed or there is no network data available, the bias is relatively small. [Footnote 53: The hypotheses of equal means between the model’s posterior and each of the other posteriors in figure 1 are rejected with by -tests.] In addition, when comparing the estimates for other determinants of adolescent smoking (presence of a smoker in the household or maternal education), the estimation biases when the network externalities are mis-specified and when peer effects are omitted are of the same sign. This observation is robust to an alternative specification where income (allowances) is included instead of the price. [Footnote 54: See table 10 in the appendix.]
4.6 Model fit
Selected (non-targeted) moments
|Coleman Homophily Index||1.001||1.001|
|Freeman Segregation Index||0.352||0.368|
|Smoker||76% (57.1)||24% (18.1)||77% (52.8)||23% (15.5)|
|Nonsmoker||41% (18.1)||59% (26.3)||41% (15.5)||59% (22.8)|
Table 3 compares selected statistics from the data to those from a sample simulated with the estimated model. [Footnote 55: Using the parameter estimates, a Markov chain of size from the -player dynamic is simulated from which, to reduce the auto dependence, every element is sampled.] In addition to statistics that are directly targeted by the model’s parameters (overall prevalence, density, and reciprocity), statistics which are only indirectly governed by model parameters are reported in tables 3 and 4, e.g. maximum degree, certain friendship configurations, mixing etc. Overall, the model fits the smoking decisions and the network features of the data well. The only caveat is the number of triangles as a fraction of the size of the network, which in the data is while in the simulations from the model is . The most likely reason for this discrepancy is that in the model triangles are generated only via a single parameter which does not depend on observables, i.e. race, sex etc. This parsimonious specification is dictated by the small sample size of only networks and further exploration of this feature of the data is left for the future.
5 Policy experiments
5.1 A. Changes in the price of tobacco
The estimated model serves as a numerical prototype for the equilibrium response to policy interventions and, more generally, for the analysis of the determinants of teen smoking. [Footnote 56: Each policy is simulated as draws from the stationary distribution of the model.] Table 5 presents the effects of increases in tobacco prices ranging from to cents (in the sample tobacco prices average at for a pack). The second through the fifth columns report predictions for the change in overall smoking prevalence in percentage points for various scenarios: the full model, the model when agents are restricted from adjusting their friendship links, the model when agents are not subject to peer effects ( and are set to zero) and, finally, the predictions from a model which has been estimated without peer effects altogether. [Footnote 57: Note that the scenario “PE off” entails stronger restrictions on individuals’ social environments than just keeping friendships fixed. In particular, when simulating individuals’ smoking decisions, I keep constant their friendship choices, their friends’ smoking statuses, and the average smoking behavior of the population overall (i.e., the number of smokers in the population).]
|Price increase||Model||Fixed net||PE off||Model w/o PE|
Note: The first column shows proposed increases in tobacco prices in cents. The average price of a pack of cigarettes is $1.67 so that 20 cents is approximately 10%. The second through fourth columns show the predicted increase in the overall smoking (baseline 33%) in ppt from the full model, from the model when the friendship network is fixed, and from the model when the coefficients and are set to zero respectively. The last column shows the predicted effect produced from a model which is estimated with no peer effects.
Table 5 suggests that social interactions, including the response of the social network, are quantitatively relevant in measuring the intended effect of increasing tobacco prices. The baseline smoking in the sample is so that a price increase of (about ) reduces the smoking to . Comparison between the model’s predictions with and without friendship adjustments (columns two and three) reveals that the freedom of breaking friendships and changing smoking behavior induces a larger decrease in overall smoking compared to the situation when individuals are held in their existing (fixed) social networks. Figuratively, a price change has two effects on the decision to smoke: the direct effect operates through changing individuals’ exogenous decision environment and the indirect/spillover effect operates through changing the peer norm, which then puts additional pressure on the individuals to follow the change. When comparing the endogenous to the fixed network, the direct effect is likely to be stronger in the former environment while the indirect effect is likely to be stronger in the latter environment. [Footnote 58: Additional analysis and simulations are provided in the appendix on page 3.]
Comparison of the model’s predictions with and without peer effects (columns two and four) suggests that social interactions account for roughly of the decrease in smoking following a price increase. Finally, note that the prediction from a model relying on coarse calculations, if one is to completely discard the presence of peer effects, and the predictions from the model when the social network is held fixed are broadly in line.
5.2 B. Changes in the racial composition of schools
Suppose that in a given neighborhood there are two racially segregated schools: “White School” consisting of only white students and “Black School” consisting of only black students. One would expect that the smoking prevalence in the White School is much higher than in the Black School because, in the sample, black high school students smoke three times less than white high school students. Consider a policy aiming to promote racial desegregation, which prevents schools from enrolling more than percent of students of the same race. If such a policy is in place, will students from different races form friendships, and will these friendships systematically impact the overall smoking in one direction or another?
To simulate the effect of the proposed policy I consider one of the racially balanced schools in my sample. [Footnote 59: The school has 150 students of which are Whites and are Blacks. It incorporates students from grades 7 to 12.] The Whites and the Blacks from this school serve as prototypes for the White School and Black School respectively. [Footnote 60: As an alternative to splitting one school into two racially segregated schools, one could consider two schools from the data that are already racially segregated. However, the only school with a high ratio of Blacks in the sample incorporates students from grades 7 and 8. If this school serves as a prototype for the Black School, then I am faced with two options for the choice of the White School. If the White School incorporates only 7th and 8th graders, then smoking prevalence will be low regardless, since these grades are mostly nonsmoking. Alternatively, if the White School incorporates higher grades, the simulation results will be driven in part by the asymmetry in the population (7th and 8th graders do not make friends with students from higher grades). Consequently, the school that incorporates black students in grades 7 and 8 only cannot properly serve as a prototype for the Black School.] To implement the proposed policy I randomly select a set of students from the White School and a set of students from the Black School and swap them. For example, to simulate the effect of a cap on the same-race students in a school, I need to simulate a swap of .
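The swap itself can be sketched in a few lines. The example is illustrative only: the school sizes, the race labels, and the function names are hypothetical, and the sketch covers just the reassignment of students, not the subsequent simulation of friendships and smoking from the model.

```python
import random

def swap_students(white_school, black_school, m, seed=0):
    """Swap m randomly chosen students between two schools (sizes are preserved)."""
    rng = random.Random(seed)
    out_w = set(rng.sample(range(len(white_school)), m))
    out_b = set(rng.sample(range(len(black_school)), m))
    stay_w = [s for i, s in enumerate(white_school) if i not in out_w]
    stay_b = [s for i, s in enumerate(black_school) if i not in out_b]
    move_w = [white_school[i] for i in sorted(out_w)]
    move_b = [black_school[i] for i in sorted(out_b)]
    return stay_w + move_b, stay_b + move_w

def same_race_share(school):
    """Share of the most common race label in a school."""
    counts = {}
    for race in school:
        counts[race] = counts.get(race, 0) + 1
    return max(counts.values()) / len(school)

if __name__ == "__main__":
    # Hypothetical schools of 10 students each; a swap of 3 yields a 70% same-race share.
    white, black = ["W"] * 10, ["B"] * 10
    new_white, new_black = swap_students(white, black, 3)
    print(same_race_share(new_white), same_race_share(new_black))
```

With two schools of equal size n that start fully segregated, a same-race cap of share c requires swapping at least (1 - c) * n students; with unequal sizes the larger school determines the binding swap size.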
Note: A cap of same-race students is implemented with a swap of students. Columns 2, 3 and 4 show smoking prevalence. Columns 5 and 6 show the p-value for the hypothesis of mean equality in overall smoking. Columns 7 and 8 show the p-value for the hypothesis of equality in distribution of overall smoking (two-sample Kolmogorov-Smirnov test). Both tests fail to reject the hypotheses of equal mean/equal distribution of the overall smoking for two cases: (a.) same-race cap 30% and same-race cap 20% and (b.) same-race cap 50% and same-race cap 40%, suggesting a non-linear relation between school composition and overall smoking.
Table 6 presents the simulation results, which suggest that racial composition affects the overall smoking prevalence. The first column shows the size of the set of students which is being swapped. The second, third, and fourth columns show the simulated smoking prevalence in the White School, Black School, and both, respectively. The table suggests that overall smoking prevalence is lower when schools are racially balanced, thus supporting policies promoting racial integration in the context of fighting high smoking rates. Finally, the tests in the rightmost four columns lend statistical support to this finding. [Footnote 61: The appendix examines the changes in the distribution of the overall smoking rates. In particular, figure 4 illustrates the shift in the distribution for three of the segregation scenarios (, and swaps) suggesting that de-segregation decreases the likelihood of outcomes consistent with higher smoking prevalence (the mass above ).]
It is important to note that the simulations here offer only suggestive evidence on the role of racial desegregation on the overall prevalence of smoking. There are many factors, e.g. the profile of all observables for the entire schools (income, home environment, tobacco price, …), that are likely to influence the outcome of desegregation. Unfortunately, the Add Health data does not offer sufficient variation in those factors. [Footnote 62: There is only 1 racially balanced school in the data sample! The simulations in this section are done using a particular (racially balanced) school from the data as a prototype.] The author hopes that this paper will stimulate further research into this question.
5.3 C. Cascade effects of an anti-smoking campaign
The smoking prevalence in the school with the highest smoking rate is . For this school, I consider the effects of an anti-smoking campaign that can prevent with certainty a given number of students from smoking, e.g. a group of students are invited to a weekend-long information camp on the health consequences of smoking. The camp is very effective in terms of preventing students from smoking; however, it is too costly to engage all students. The question is: once the “treated students” come back, will their smoking friends follow their example and stop smoking, or will their friends un-friend them and continue smoking?
|Campaign (%)||Smoking||Predicted effect||Actual||Multiplier|
Note: The first column lists the alternative attendance rates. The second and third columns display the smoking rate and the change in smoking rate respectively if the decrease would be proportional to the intervention, i.e. computes a baseline without peer effects. The last column computes the ratio between the percentage change in the number of smokers and the attendance rate. Note that attendance is random with respect to the smoking status of the students. If the campaign is able to target only students who are currently smokers, the spillover effects will be even larger.
Table 7 presents the simulation results, which suggest that an anti-smoking campaign may have a large impact on the overall prevalence of smoking, without necessarily being able to directly engage a large part of the student population. [Footnote 63: The policy is simulated times, where each time a new random draw of attendees is considered.] In particular, the multiplier factor (the ratio between the actual effect and the effect constrained to the treated sub-population) indicates substantial spillover effects, operating through the social network, from those who attended the camp to the rest of the school.
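The columns of table 7 can be related to each other with a small calculation. The sketch below follows the definitions in the table note: the predicted smoking rate scales the baseline down in proportion to attendance, and the multiplier divides the percentage change in smoking by the attendance rate. The function name and the numbers in the usage lines are hypothetical and are not results from the paper.

```python
def campaign_summary(baseline_rate, attendance_rate, simulated_rate):
    """Return (proportional rate, predicted change, actual change, multiplier).

    All rates are fractions between 0 and 1. The proportional rate is the
    benchmark without peer effects; the multiplier is the relative fall in
    smoking divided by the attendance rate.
    """
    proportional_rate = baseline_rate * (1.0 - attendance_rate)
    predicted_change = baseline_rate - proportional_rate
    actual_change = baseline_rate - simulated_rate
    multiplier = (actual_change / baseline_rate) / attendance_rate
    return proportional_rate, predicted_change, actual_change, multiplier

if __name__ == "__main__":
    # Hypothetical inputs: 40% baseline smoking, 10% attendance, 32% simulated smoking.
    print(campaign_summary(0.40, 0.10, 0.32))
```

A multiplier of one corresponds to no spillovers beyond the treated students; values above one indicate that untreated students also reduce smoking.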
6 Concluding remarks
The premise of this paper is that social norms and behaviors are shaped jointly and that individuals confine their choices of friendships to those for which there is a mutual consent (Jackson and Wolinsky, 1996). The paper proposes a novel formulation of a decision environment where the requirement for mutual consent is captured by the stability constraints. This environment gives rise to a family of equilibria indexed by the complexity of the contemplated deviations, e.g. changing behaviors and a single friendship, and two friendships, etc. The presence of complementarities may result in multiple equilibria, ranked in a probabilistic sense as these equilibria arise in the consensual dynamic, an adaptive dynamic in which relationships arise from mutual consent.
Application of the proposed framework to adolescents’ friendship selection and decisions to smoke illustrates the opportunities for public policy arising from accounting for the instability of the social network in response to increases in tobacco prices or to changing the racial composition of schools. This study also sheds light on the estimation biases due to lack of social network data and on the non-linear relationship between the social norms and individuals’ behaviors theorized by Graham et al. (2014) and experimentally discovered in Carrell et al. (2013). Overall this study formulates an avenue to study the complementarities and coordination in social networks with accents on the presence of consent in forming relationships and on the possibility for multiple equilibrium outcomes.
Appendix A Proofs
The preferences of each player are summarized by the potential function (which does not depend on ):
Derive the incremental changes of individuals’ utilities and the potential function:
Thus and . Further, the symmetry of preferences (necessary for the existence of a potential) is sufficient to show that (i.) the stability constraints are never binding, so that the unconstrained optimum is feasible, and (ii.) for any and the solution of the individual’s decision problem (3)–(4) is given by
Note that the argmax in (17) is always non-empty since is a discrete function on a finite set, which implies part 1. Part 2 follows from a similar argument noting that
For definition 1 directly implies that the is pairwise stable and this is a general observation, independent of the particular payoff structure here. For , observe that the s are the states that maximize the potential function. Consider then the following strategies in a normal form link-announcement game (conditional on the equilibrium behavior ): each player announces his links. Given that others will announce exactly the links of a , it is easy to see that no player has a profitable deviation. Finally, part 3 follows from proposition 5.
That any Nash equilibrium in a -player stable network is absorbing follows from the definition of . The second part follows from observing that is a submartingale:
So that converges almost surely. Since the network is of finite size it follows that is constant for large . Because of assumption 2 this can happen only if .
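The submartingale step can be sketched as follows, in notation assumed for illustration: $\mathcal{P}$ is the potential and $S_t$ is the network state in period $t$ of the dynamic without preference shocks. Because each revision is a best response and utility differences coincide with potential differences, a revision never lowers the potential.

```latex
\[
  \mathcal{P}(S_{t+1}) \;\ge\; \mathcal{P}(S_t) \quad \text{a.s.}
  \qquad \Longrightarrow \qquad
  \mathbb{E}\big[\, \mathcal{P}(S_{t+1}) \,\big|\, S_t \,\big] \;\ge\; \mathcal{P}(S_t) .
\]
```

Since the state space is finite, $\mathcal{P}(S_t)$ is bounded, so the martingale convergence theorem gives almost sure convergence; a convergent sequence taking values in a finite set is eventually constant, which is the step used in the text.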
Note that the -PCD induces a finite state Markov chain which is irreducible, positive recurrent, and aperiodic. The first part of the theorem follows from standard results on convergence of Markov chains. For the second part, we need to integrate out the meeting process to obtain the one step transition probabilities for , and then show that .
be the set of all probability distributions on