1 Introduction
Many forms of social interaction can be represented in terms of a multigraph, where each individual interaction corresponds to an edge in the graph, and repeated interactions may occur between two individuals. For example, we might have multigraphs where the value of an edge corresponds to the number of emails between two individuals, or the number of packages sent between two computers.
Recently, the class of edge-exchangeable graphs CaiCampbellBroderick2016 ; CraneDempsey2018 ; Williamson2016 has been proposed for modeling networks as exchangeable sequences of edges. These models are able to capture many properties of large-scale social networks, such as sparsity, community structure, and power-law degree distributions.
Being explicit models for sequences of edges, the edge-exchangeable models are appropriate for networks that grow over time: we can add more edges by expanding the sequence, and their nonparametric nature means that we expect to introduce previously unseen vertices as the network expands. However, their exchangeable nature precludes graphs whose properties change over time. In practice, the dynamics of social interactions tend to vary over time. In particular, in models that aim to capture community dynamics, the popularity of a given community can wax and wane over time.
We propose a new model for sparse multigraphs with clustered edges, which breaks the exchangeability of existing models by preferentially assigning edges to clusters that have been recently active. We show that incorporating dynamics using a mechanism based on the distance-dependent Chinese restaurant process (ddCRP) BleiFrazier2011 leads to improved test-set predictive likelihood over exchangeable models. Further, when used in a link prediction task, we show improved performance over both its exchangeable counterpart and a range of state-of-the-art dynamic network models.
2 Background and related work
Our goal is to construct a Bayesian model for sparse interaction multigraphs where the interaction patterns can vary over time. Like many dynamic graph models, our model incorporates temporal dynamics into a previously defined stationary model. We begin by discussing Bayesian stationary models, before moving on to dynamic models. We end this section by discussing dynamic extensions of the Chinese restaurant process, which we will use in our construction.
2.1 Bayesian models for multigraphs
Bayesian models for multigraphs can loosely be divided into three camps. First, we have graphs where the value of an edge between vertices $i$ and $j$ is a random variable parametrized by the value of some function $f(\theta_i, \theta_j)$ of vertex-specific latent variables. In the multigraph setting we consider in this paper, that edge value might be Poisson distributed.
We refer to these graphs as jointly vertex-exchangeable, since the distribution over the adjacency matrix is invariant to jointly permuting the row and column indices. In this class, we have models such as the stochastic blockmodel, where vertices are clustered into finitely many communities and a parameter is associated with each community-community pair Snijders:Nowicki:1997 ; Karrer:Newman:2011 ; the infinite relational model, where the number of communities is unbounded irm ; mixed-membership stochastic blockmodels, where edges are generated according to an admixture model airoldi2008mixed ; the latent feature relational model, where parameter values are distributed according to a latent feature model based on the Indian buffet process miller2009nonparametric ; and Poisson factor analysis, where the parameter values are distributed according to a gamma process-based latent factor model ZhouCarin2013 ; gopalan2015scalable . While these models are able to capture interesting community structure, the resulting graphs are dense almost surely Aldous1981 ; Hoover:1979 . This makes them a poor choice for large networks, which are typically sparse.
Second, we have multigraphs where the edges occur according to a Poisson process on the space of potential edges. The main example of such a model is CaronFox2017 , where the edges are sampled according to a Poisson process whose rate measure is distributed according to a generalized gamma process. This model has been extended to incorporate community structure lee2018bayesian . These models yield sparse graphs with power-law degree distribution, properties that are common in large social networks.
Third, we have multigraphs constructed using an exchangeable sequence of edges CaiCampbellBroderick2016 ; CraneDempsey2018 . Here, we assume the edges are generated by sequentially sampling pairs of vertices. These pairs of vertices are i.i.d. given some nonparametric prior, such as a Dirichlet process, a normalized generalized gamma process, or a Pitman–Yor process, resulting in a sparse multigraph. Unlike the Poisson process-based models, edge-exchangeable multigraphs can grow over time by adding new edges, either between new or previously seen vertices.
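To make this construction concrete, the following sketch (an illustrative assumption on our part, not code from any of the cited papers) samples a multigraph by drawing each edge endpoint from a shared Chinese restaurant process, the predictive distribution of a Dirichlet process; swapping in a Pitman–Yor or normalized generalized gamma predictive would change the degree distribution:

```python
import numpy as np

def sample_edge_exchangeable_graph(n_edges, alpha=1.0, seed=0):
    """Sample a multigraph as a sequence of edges whose endpoints are
    drawn from a shared Chinese restaurant process (the predictive
    distribution of a Dirichlet process).  New vertices keep appearing
    as the edge sequence grows, yielding a sparse multigraph."""
    rng = np.random.default_rng(seed)
    counts = []   # counts[v] = number of times vertex v has appeared so far
    edges = []
    for _ in range(n_edges):
        pair = []
        for _ in range(2):                     # sender, then recipient
            total = sum(counts)
            probs = np.array(counts + [alpha], dtype=float) / (total + alpha)
            v = rng.choice(len(probs), p=probs)
            if v == len(counts):               # previously unseen vertex
                counts.append(0)
            counts[v] += 1
            pair.append(int(v))
        edges.append(tuple(pair))
    return edges
```

Because popular vertices are reinforced while new vertices keep arriving, the resulting vertex degrees are heavy-tailed and the graph is sparse in the number of edges.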
Several extensions have been made to these edge-exchangeable graphs to incorporate community structure Williamson2016 ; herlau2016completely . Most relevant to this paper, the mixture of Dirichlet network distributions (MDND) Williamson2016 uses a mixture of edge-exchangeable models with shared infinite-dimensional support. Concretely, the MDND assumes exchangeable sequences of links are generated according to
$$\beta \sim \mathrm{GEM}(\gamma), \qquad \phi_k, \psi_k \mid \beta \overset{\text{iid}}{\sim} \mathrm{DP}(\tau, \beta), \qquad z_n \mid z_{1:n-1} \sim \mathrm{CRP}(\alpha), \qquad s_n \mid z_n \sim \phi_{z_n}, \quad r_n \mid z_n \sim \psi_{z_n}. \qquad (1)$$
Each link $(s_n, r_n)$ is associated with a cluster $z_n$, which governs which edge-exchangeable model it is generated from. The clusters are distributed according to a Chinese restaurant process (CRP), allowing an unbounded number of clusters. A hierarchical Dirichlet process formulation doi:10.1198/016214506000000302 ensures that the component edge-exchangeable models have common support, so that a vertex can appear in edges associated with multiple clusters.
2.2 Models for dynamic graphs
There has been significant research attention on dynamic (time-evolving) network modelling, ranging from non-Bayesian methods such as dynamic extensions of the exponential random graph model (ERGM) guo2007recovering , or matrix and tensor factorization-based methods dunlavy2011temporal , to Bayesian latent variable models pmlrv2sarkar07a ; ishiguro2010dynamic ; sarkar2014nonparametric ; durante2014nonparametric ; schein2016bayesian ; PallaCaronTeh2016 ; ng2017dynamic ; yang2018dependent . A common approach extends a static network model to a dynamic framework. We focus here on dynamic extensions of Bayesian models of the forms discussed in Section 2.1.

Most dynamic Bayesian network models extend jointly vertex-exchangeable graphs. For example, xu2014dynamic extends the stochastic blockmodel using an extended Kalman filter (EKF) based algorithm, and the stochastic block transition model xu2015sbtm relaxes a hidden Markov assumption on the edge-level dynamics, allowing the presence or absence of edges to directly influence future edge probabilities. Several methods have also been used to incorporate temporal dynamics into the mixed-membership stochastic blockmodel framework fu2009dynamic ; xing2010state ; ho2011evolving and the latent feature relational model foulds2011dynamic ; heaukulani2013dynamic ; kim2013nonparametric . Most recently, several models have extended Poisson factor analysis. The dynamic gamma process Poisson factorization (DGPPF) acharya2015nonparametric introduces dependency by incorporating a Markov chain of marginally gamma random variables into the latent representation. The dynamic Poisson gamma model (DPGM) yang2018poisson extends a bilinear form of Poisson factor analysis zhou2015infinite in a similar manner; the dynamic relational gamma process model (DRGPM) yang2018dependent also incorporates a temporally dependent thinning process.

Much less work has been carried out on dynamic extensions of the sparse graphs generated using Poisson processes or via a sequence of exchangeable edges. In the Poisson process-based space, PallaCaronTeh2016 use a time-dependent base measure, and assume edges have a geometric lifespan. In the edge-exchangeable case, ng2017dynamic incorporates temporal dynamics into the MDND by introducing a latent Gaussian Markov chain, and a Poisson vertex birth mechanism.
2.3 Dynamic nonparametric priors
Our model extends the MDND by replacing the exchangeable CRP-based clustering mechanism with a temporally varying clustering mechanism. A number of methods exist for incorporating temporal dynamics into the CRP, e.g. maceachern2000dependent ; lin2010construction ; Ren:2008:DHD:1390156.1390260 . For our purposes, we choose the distance-dependent CRP (ddCRP) BleiFrazier2011 . Recall that the CRP can be described in terms of a restaurant analogy, where customers select tables (clusters) with probability proportional to the number of people seated at that table, or sit at a new table with probability proportional to a concentration parameter $\alpha$. The ddCRP modifies this by encouraging customers to sit next to “similar” customers. In a time-dependent setting, similarity is evaluated based on arrival time using some non-negative, non-increasing decay function $f$ such that $f(\infty) = 0$. Concretely, if customers $m$ and $n$ arrive at times $t_m \le t_n$, let $d_{nm} = t_n - t_m$. Then the $n$th customer picks a customer $c_n$ to sit next to (and therefore a cluster) according to

$$P(c_n = m \mid D, \alpha) \propto \begin{cases} f(d_{nm}) & m \neq n \\ \alpha & m = n. \end{cases} \qquad (2)$$

The CRP is recovered if $f(d) = 1$ for all $d$.
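A minimal sketch of this sequential sampling step, under the decay-function assumptions above (the function names and the cluster-resolution helper are ours; clusters are the connected components of the follow-link graph):

```python
import numpy as np

def sample_ddcrp_links(times, decay, alpha=1.0, seed=0):
    """Sample 'follows' links c_n for the time-dependent ddCRP.

    times : arrival times t_1 <= ... <= t_N
    decay : non-negative, non-increasing function f
    Customer n links to an earlier customer m with probability
    proportional to f(t_n - t_m), or to itself with probability
    proportional to alpha."""
    rng = np.random.default_rng(seed)
    n = len(times)
    links = np.zeros(n, dtype=int)
    for i in range(n):
        weights = np.array([decay(times[i] - times[m]) for m in range(i)] + [alpha])
        links[i] = rng.choice(i + 1, p=weights / weights.sum())
    return links

def links_to_clusters(links):
    """Resolve follow-links into cluster labels by chasing each chain
    back to a self-link (links always point to earlier customers)."""
    n = len(links)
    labels = -np.ones(n, dtype=int)
    for i in range(n):
        chain, j = [], i
        while labels[j] == -1 and links[j] != j:
            chain.append(j)
            j = links[j]
        root = labels[j] if labels[j] != -1 else j
        labels[j] = root
        for k in chain:
            labels[k] = root
    return labels
```

With `decay = lambda d: 1.0`, every earlier customer is equally attractive and the construction reduces to the standard CRP, as noted above.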
3 A dynamic edge-clustering graph model for time-evolving sparse graphs
We propose a dynamic extension of the MDND, which is appropriate for sparse, structured graphs with temporal dynamics. The MDND is based on a collection of Dirichlet processes (see Eqn 1). One Dirichlet process (the distribution over the cluster indicators $z_n$, represented in Eqn 1 in terms of a CRP) governs the clustering structure of the edges. Another (the distribution over $\beta$) controls the number of vertices, and their overall popularity, within the graph. Finally, the distributions over the $\phi_k$ and the $\psi_k$ control the cluster-specific distributions over the “sender” and “recipient” of edges in the graph.

Any of these distributions could be replaced with dynamic or dependent clustering models to generate a temporally evolving graph. In practice, replacing all of the distributions with dynamic alternatives is likely to lead to over-specification of the dependencies, making inference challenging. We choose to retain stationary models for $\beta$, the $\phi_k$ and the $\psi_k$, implying that a cluster’s representation stays stable over time, and allow the cluster popularities to vary by making the sequence of cluster indicators $(z_n)$ time-varying.
We capture this variation using a ddCRP (see Section 2.3), yielding the generative process (for some decay function $f$)

$$\beta \sim \mathrm{GEM}(\gamma), \qquad \phi_k, \psi_k \mid \beta \overset{\text{iid}}{\sim} \mathrm{DP}(\tau, \beta), \qquad c_n \mid t_{1:n} \sim \mathrm{ddCRP}(\alpha, f), \qquad s_n \mid z_n \sim \phi_{z_n}, \quad r_n \mid z_n \sim \psi_{z_n}, \qquad (3)$$

where the cluster assignment $z_n$ is obtained by resolving the ddCRP follow-links $c_{1:n}$.
The ddCRP is well-suited to our use case, as it captures the intuition that clusters that have been active recently are likely to appear again. In an interaction network context, this implies that we are more likely to see modes of communication that have been popular in recent time periods than modes of communication that have fallen out of popularity. Another reason to favor the ddCRP is ease of inference: its construction lends itself to an easy-to-implement Gibbs sampler, allowing us to apply our method to larger graphs. By contrast, many other dependent Dirichlet processes have much more complicated inference algorithms, which would limit scalability.
A limitation of the ddCRP is that it assumes that all data has been observed up to the current time point; the distribution is not invariant to adding edges at previously observed time points. This is not a concern in our setting, since we are typically able to observe past instances of the full graph, and are interested in predicting future edges.
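As an illustration, the generative process in Eqn (3) can be sketched with a finite truncation of the hierarchical component (the truncation level, the numerical floor on the Dirichlet concentrations, and all function names are our simplifying assumptions, not part of the model specification):

```python
import numpy as np

def sample_dyn_mdnd(times, decay, alpha=1.0, gamma=1.0, tau=1.0,
                    n_vertices=50, seed=0):
    """Finite-truncation sketch of the DynMDND generative process.

    Edge clusters come from a time-dependent ddCRP over arrival times;
    each cluster draws sender/recipient distributions phi_k, psi_k from
    a Dirichlet centred on shared global vertex weights beta, so that
    clusters overlap in their vertex support.  The hierarchy is
    truncated to `n_vertices`, with a small floor on the Dirichlet
    concentrations for numerical stability."""
    rng = np.random.default_rng(seed)
    n = len(times)
    # ddCRP follow-links (each link points to an earlier edge or itself)
    links = np.zeros(n, dtype=int)
    for i in range(n):
        w = np.array([decay(times[i] - times[m]) for m in range(i)] + [alpha])
        links[i] = rng.choice(i + 1, p=w / w.sum())
    # Resolve follow-links into cluster labels (links point backwards)
    labels = np.arange(n)
    for i in range(n):
        labels[i] = labels[links[i]]
    # Shared global vertex weights and per-cluster edge distributions
    beta = rng.dirichlet(np.full(n_vertices, gamma / n_vertices))
    phi, psi = {}, {}
    edges = []
    for i in range(n):
        k = int(labels[i])
        if k not in phi:
            phi[k] = rng.dirichlet(tau * beta + 1e-2)
            psi[k] = rng.dirichlet(tau * beta + 1e-2)
        s = int(rng.choice(n_vertices, p=phi[k]))
        r = int(rng.choice(n_vertices, p=psi[k]))
        edges.append((s, r, k))
    return edges
```

Because all clusters share the same global weights `beta`, a vertex that is popular overall can appear in edges from many clusters, mirroring the common-support property of the MDND.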
4 Inference
We perform inference by combining the ddCRP sampler of BleiFrazier2011 with the original MDND sampler Williamson2016 . The original MDND sampler is based on the direct-assignment inference algorithm for the hierarchical Dirichlet process doi:10.1198/016214506000000302 , which assigns observations (in our case, links) to “tables” and represents $\beta$ using a finite-dimensional vector $(\beta_1, \dots, \beta_V, \beta_u)$, where $V$ is the number of observed vertices. Concretely, let $m_{kv}^{(s)}$ and $m_{kv}^{(r)}$ be the number of tables in cluster $k$ associated with vertex $v$ as a sender and a recipient, respectively. Our procedure for inferring $\beta$, $m_{kv}^{(s)}$ and $m_{kv}^{(r)}$ exactly follows Williamson2016 .

Rather than sample the cluster assignment $z_n$ of link $n$ directly, we sample the link $c_n$ that link $n$ “follows” or sits next to. Following BleiFrazier2011 , we first set the $n$th link to follow itself, i.e. $c_n = n$, and then sample a new value for $c_n$ based on the conditional probability that $c_n = m$,
$$P(c_n = m \mid \mathbf{c}_{-n}, E) \propto \begin{cases} f(d_{nm}) \, P(E \mid z(\mathbf{c}_{-n} \cup \{c_n = m\})) & m \neq n \\ \alpha \, P(E \mid z(\mathbf{c}_{-n} \cup \{c_n = n\})) & m = n, \end{cases} \qquad (4)$$

where $E$ represents the edge structure of the graph and $z(\mathbf{c})$ is the clustering implied by the follow-links $\mathbf{c}$.
Rather than directly calculate $P(E \mid z(\mathbf{c}))$, the likelihood of the graph given the entire partition, we calculate the ratio

$$\Lambda_{nm} = \frac{P(E \mid z(\mathbf{c}_{-n} \cup \{c_n = m\}))}{P(E \mid z(\mathbf{c}_{-n} \cup \{c_n = n\}))}, \qquad (5)$$

which is 1 if the clustering structure implied by $c_n = m$ is the same as that implied by $c_n = n$. Alternatively, if setting $c_n = m$ joins two clusters $k$ and $\ell$ that would be separate under $c_n = n$, then the ratio becomes

$$\Lambda_{nm} = \frac{P(E_{k \cup \ell})}{P(E_k) \, P(E_\ell)}, \qquad (6)$$

where $E_k$ is the subset of edges that are in cluster $k$ (note that, since we have assigned edge $n$ to follow itself, edge $n$ belongs to one of these two clusters). We have

$$P(E_k) = \prod_{\rho \in \{s, r\}} \frac{\Gamma(\tau)}{\Gamma(\tau + N_k)} \prod_{v} \frac{\Gamma(\tau \beta_v + n_{kv}^{\rho})}{\Gamma(\tau \beta_v)}, \qquad (7)$$

where $n_{kv}^{\rho}$ is the number of times vertex $v$ appears in cluster $k$ in role $\rho$, and $N_k$ is the number of edges in cluster $k$. The inference algorithm, which scales as $O(N^2)$ in the number of edges $N$, is summarized in Algorithm 1.
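The per-cluster marginal likelihood and the merge ratio reduce to sums of log-gamma terms. A sketch, assuming the Dirichlet-multinomial form of Eqn (7) with fixed vertex weights `beta` and concentration `tau` (function names and the count representation are ours):

```python
from math import lgamma

import numpy as np

def log_cluster_likelihood(sender_counts, recip_counts, beta, tau):
    """Log marginal likelihood of one cluster's edges under the
    Dirichlet-multinomial form of Eqn (7): one factor per role."""
    total = 0.0
    for counts in (sender_counts, recip_counts):
        n_k = sum(counts)                       # N_k: edges in this cluster
        total += lgamma(tau) - lgamma(tau + n_k)
        for v, c in enumerate(counts):
            if c > 0:
                total += lgamma(tau * beta[v] + c) - lgamma(tau * beta[v])
    return total

def log_merge_ratio(counts_k, counts_l, beta, tau):
    """Log of the ratio in Eqn (6): merged-cluster likelihood over the
    product of the separate-cluster likelihoods."""
    merged = tuple(np.add(a, b) for a, b in zip(counts_k, counts_l))
    return (log_cluster_likelihood(*merged, beta, tau)
            - log_cluster_likelihood(*counts_k, beta, tau)
            - log_cluster_likelihood(*counts_l, beta, tau))
```

Working in log space avoids the overflow that the raw gamma-function products in Eqn (7) would cause for clusters with more than a handful of edges.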
5 Experiments
In this section, we address the following questions: (1) How well does DynMDND capture the underlying network behavior, as evaluated using test-set log likelihoods? (2) How well does DynMDND perform in a link prediction task, compared with other state-of-the-art dynamic network models? We explore these questions on four real-world datasets, detailed in Section 5.1.
We evaluated DynMDND using two decay functions: an exponential decay, $f(d) = \exp(-d/a)$, and a logistic decay, $f(d) = \exp(-d+a)/(1 + \exp(-d+a))$. We explored several values of the decay parameter $a$ between 0.5 and 2, and found a single setting in this range worked well on all datasets. We placed Gamma(1, 1) priors on the concentration parameters, and sampled their values using the augmented samplers described in doi:10.1198/016214506000000302 . We initialized our algorithm using the Louvain graph clustering method. All experiments were run on a single node of a compute cluster with 48 cores at 2.67 GHz, using the Python code attached to this submission (code will be made public following submission).
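For concreteness, the two decay functions and the constant baseline can be written as follows (the parameterization mirrors the standard ddCRP decays; treating `a` as a shared scale parameter is our assumption):

```python
import numpy as np

def exponential_decay(d, a=1.0):
    """f(d) = exp(-d / a): a past edge's influence decays geometrically."""
    return np.exp(-d / a)

def logistic_decay(d, a=1.0):
    """f(d) = exp(-d + a) / (1 + exp(-d + a)): roughly flat up to lag a,
    then decays smoothly towards zero."""
    return np.exp(-d + a) / (1.0 + np.exp(-d + a))

def crp_decay(d, a=None):
    """f(d) = 1 for all d: recovers the exchangeable CRP/MDND baseline."""
    return np.ones_like(np.asarray(d, dtype=float))
```

The exponential decay discounts the past immediately, while the logistic decay keeps recent history near full weight before falling off, which changes how quickly a cluster's popularity fades.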
5.1 Datasets
We evaluate our model on four real-world networks: (1) The Face-to-Face dynamic contacts network (FFDC; http://www.sociopatterns.org/datasets/high-school-contact-and-friendship-networks/) mastrandrea2015contact records face-to-face contacts among high-school students over several school days in Marseilles, France. We treat each day as one time slot, and include an edge between a pair of students in a given time slot if they have at least one contact recorded in that slot. (2) The Social Evolution network (SocialEv; http://realitycommons.media.mit.edu/socialevolution.html) madan2011sensing , released by the MIT Human Dynamics Lab, tracks the everyday life of a whole undergraduate dormitory using mobile phones. We treat the proximity surveys (observed Bluetooth connections), calls, and SMSs as the event-time observations. This network has a high clustering coefficient. (3) DBLP asur2009event maintains information on more than 800,000 computer science publications; we consider a subgraph of 958 authors over ten years (1997–2006) in 28 conferences, and extract a subset of the most connected authors over the full time period. We choose the snapshot interval to be one year, resulting in ten consecutive snapshot networks. (4) Enron (https://www.cs.cmu.edu/~enron/) contains email interactions among users over 38 months (May 1999 – June 2002). We include an edge between a pair of users in a given month if they have at least one email recorded in that month, and use an initial subset of the monthly snapshots for the evaluation results.
5.2 Evaluation metrics
We study the effectiveness of DynMDND by evaluating it on two tasks: dynamic test-set likelihood prediction and dynamic link prediction.
Test-set log likelihood. We held out 100 test-set allocations from each time slot, and trained our model on the remaining data. (We held out the values of the sender and recipient for the test set, but kept the time stamps, since the ddCRP method assumes the arrival times of all edges are known.) We then used a Chib-style estimator Wallach:2009:EMT:1553374.1553515 to estimate the log likelihood of the test set, and report the mean and standard error of the log likelihood for each decay method. We compare our dynamic model with various decays against the exchangeable MDND Williamson2016 . We implement the MDND using the same code, but with a distance of 1 between all edges, so that the decay weight is constant; this setting reduces the ddCRP to the CRP.

Link prediction. Test-set log likelihood is useful for evaluating whether the model is a good fit for the data. However, in practical applications we often want to make concrete predictions of future network values. In the datasets described in Section 5.1, each discrete time step is associated with multiple edges, and within each time period there are no repeated edges. To predict the next $k$ edges in this context, we consider the probability distribution over the location of the next edge, and pick the $k$ highest-probability edges. This task allows us to compare with models that are not explicitly designed for edge prediction. We consider three state-of-the-art network models, discussed in Section 2.2: DRGPM yang2018dependent , DPGM yang2018poisson , and DGPPF acharya2015nonparametric . None of these models is designed for explicit link prediction, but all can be modified to give predictions using the above procedure of selecting the $k$ highest-probability edges. These models also have the limitation of assuming a fixed number of vertices. While the edge-based dynamic model of ng2017dynamic is a dynamic extension of the MDND and an appropriate comparison method, we were not able to compare against it due to a lack of available code.

For performance comparison, we use the F1 score, MAP@$k$ and Hits@$k$. The F1 score is 2(precision × recall)/(precision + recall), where precision is the fraction of predicted edges present in the true network, and recall is the fraction of edges of the true network present in the predicted network. MAP@$k$ is the classical mean average precision measure, and Hits@$k$ is the fraction of the top $k$ ranked edges that appear in the true network.
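The three metrics can be sketched as follows (representing edges as vertex pairs, and these particular function signatures, are our assumptions):

```python
import numpy as np

def f1_score(pred_edges, true_edges):
    """F1 = 2PR/(P+R) over sets of predicted vs. observed edges."""
    pred, true = set(pred_edges), set(true_edges)
    tp = len(pred & true)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(true)
    return 2 * precision * recall / (precision + recall)

def hits_at_k(ranked_edges, true_edges, k):
    """Fraction of the top-k ranked edges present in the true network."""
    return len(set(ranked_edges[:k]) & set(true_edges)) / k

def map_at_k(ranked_edges, true_edges, k):
    """Mean average precision over the top-k ranked edges: precision is
    accumulated at each rank where a true edge appears."""
    true = set(true_edges)
    hits, precisions = 0, []
    for i, e in enumerate(ranked_edges[:k], start=1):
        if e in true:
            hits += 1
            precisions.append(hits / i)
    return float(np.mean(precisions)) if precisions else 0.0
```

Unlike Hits@$k$, MAP@$k$ rewards placing the correct edges near the top of the ranking, so the two metrics can disagree on models with similar recall.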
5.3 Results
Table 1: Test-set predictive log likelihoods (mean ± standard deviation).

Dataset   | MDND              | DynMDND-Logistic  | DynMDND-Exponential
FFDC      | 383,094.6 ± 146   | 286,833.9 ± 220   | 344,683.9 ± 105
Enron     | 1032.94 ± 147.18  | 640.73 ± 90.51    | 700.88 ± 22.97
DBLP      | 980,928.4 ± 532.1 | 798,568.4 ± 998.5 | 649,521 ± 195.0
SocialEv  | 173,708.1 ± 223.6 | 23,087.3 ± 91.0   | 19,820.1 ± 93.6
Table 1 shows the predictive log likelihood computed by our DynMDND method using the two different decays (exponential and logistic), in comparison to the exchangeable MDND (equivalently, a constant CRP decay). At each time slot, we use 80% of the network data for training the model and the remaining 20% as the test set. Incorporating time dependency into our mixture model results in a better log likelihood on the prediction task. We use a sample size of 1000. We repeat each experiment 10 times and report the mean and standard deviation of the results over the four real networks.
Figure 1 compares the dynamic test-set log likelihood of DynMDND over the underlying evolving network, on all four datasets. DynMDND outperforms the CRP baseline, suggesting that incorporating time into the model significantly improves inference.
Figure 2 illustrates the F1 score, MAP@$k$ and Hits@$k$ for DynMDND with all three decay types (exponential, logistic, and CRP) vs. DRGPM, DPGM and DGPPF for dynamic link prediction. We use the networks of time slots 1 through $t$ as the training set and predict the network edges of time slot $t+1$. We report results on three datasets, FFDC, DBLP and Enron, using time intervals of one day, one year and one month, respectively. For each task, we repeat the experiment 10 times and report the mean and standard deviation of each evaluation metric.
We see that DynMDND significantly outperforms DRGPM, DPGM and DGPPF on all metrics for the task of dynamic link prediction. We hypothesise that this is due to several reasons. First, DynMDND is explicitly designed in terms of a predictive distribution over edges, making it well-suited to predicting future edges. Second, DynMDND is able to increase the number of vertices over time, and so is likely better able to capture natural network growth. Conversely, the other methods assume the number of vertices is fixed, and explicitly incorporate the absence of edges at earlier time points into the likelihood.
6 Conclusions
We have presented a new model for interaction networks that can be represented in terms of sequences of links, such as email interaction graphs and collaboration graphs. Using a nonparametric sequence of links makes our model well-suited to predicting future links, and, unlike many vertex-based models, allows for an unbounded number of vertices.
Unlike previous edge-sequence models, we explicitly incorporate temporal dynamics into our construction. As we saw in Section 5, this allows us to make more accurate predictions in real-world multigraphs where the underlying patterns of behavior change over time.
In this paper, we incorporate dynamics using a ddCRP model, which encourages edges to belong to clusters that have been recently active. An interesting avenue for future research would be to explore alternative forms of dependency, and incorporate mechanisms that can capture link reciprocity blundell2012modelling .
References
 (1) Diana Cai, Trevor Campbell, and Tamara Broderick. Edgeexchangeable graphs and sparsity. In Advances in Neural Information Processing Systems, pages 4249–4257, 2016.
 (2) Harry Crane and Walter Dempsey. Edge exchangeable models for interaction networks. Journal of the American Statistical Association, 113(523):1311–1326, 2018.

 (3) Sinead A Williamson. Nonparametric network models for link prediction. The Journal of Machine Learning Research, 17(1):7102–7121, 2016.
 (4) David M Blei and Peter I Frazier. Distance dependent Chinese restaurant processes. Journal of Machine Learning Research, 12(Aug):2461–2488, 2011.
 (5) T.A.B. Snijders and T. Nowicki. Estimation and prediction for stochastic blockmodels for graphs with latent block structure. Journal of Classification, 14(1):75–100, 1997.
 (6) B. Karrer and M.E.J. Newman. Stochastic blockmodels and community structure in networks. Physical Review E, 83(1):016107, 2011.

 (7) C. Kemp, J.B. Tenenbaum, T.L. Griffiths, T. Yamada, and N. Ueda. Learning systems of concepts with an infinite relational model. In National Conference on Artificial Intelligence (AAAI), pages 381–388, 2006.
 (8) Edoardo M Airoldi, David M Blei, Stephen E Fienberg, and Eric P Xing. Mixed membership stochastic blockmodels. Journal of Machine Learning Research, 9(Sep):1981–2014, 2008.
 (9) Kurt Miller, Michael I Jordan, and Thomas L Griffiths. Nonparametric latent feature models for link prediction. In Advances in neural information processing systems, pages 1276–1284, 2009.
 (10) Mingyuan Zhou and Lawrence Carin. Negative binomial process count and mixture modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(2):307–320, 2013.
 (11) Prem Gopalan, Jake M Hofman, and David M Blei. Scalable recommendation with hierarchical Poisson factorization. In UAI, pages 326–335, 2015.

 (12) David J Aldous. Representations for partially exchangeable arrays of random variables. Journal of Multivariate Analysis, 11(4):581–598, 1981.
 (13) D.N. Hoover. Relations on probability spaces and arrays of random variables. Preprint, Institute for Advanced Study, Princeton, 1979.
 (14) François Caron and Emily B Fox. Sparse graphs using exchangeable random measures. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79(5):1295–1366, 2017.
 (15) Juho Lee, Lancelot F James, Seungjin Choi, and François Caron. A Bayesian model for sparse graphs with flexible degree distribution and overlapping community structure. arXiv preprint arXiv:1810.01778, 2018.
 (16) Tue Herlau, Mikkel N Schmidt, and Morten Mørup. Completely random measures for modelling blockstructured sparse networks. In Advances in Neural Information Processing Systems, pages 4260–4268, 2016.
 (17) Yee Whye Teh, Michael I Jordan, Matthew J Beal, and David M Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566–1581, 2006.
 (18) Fan Guo, Steve Hanneke, Wenjie Fu, and Eric P Xing. Recovering temporally rewiring networks: A modelbased approach. In Proceedings of the 24th international conference on Machine learning, pages 321–328. ACM, 2007.
 (19) Daniel M Dunlavy, Tamara G Kolda, and Evrim Acar. Temporal link prediction using matrix and tensor factorizations. ACM Transactions on Knowledge Discovery from Data (TKDD), 5(2):10, 2011.
 (20) Purnamrita Sarkar, Sajid M. Siddiqi, and Geoffrey J. Gordon. A latent space approach to dynamic embedding of co-occurrence data. In International Conference on Artificial Intelligence and Statistics, volume 2, pages 420–427, 21–24 Mar 2007.
 (21) Katsuhiko Ishiguro, Tomoharu Iwata, Naonori Ueda, and Joshua B Tenenbaum. Dynamic infinite relational model for timevarying relational data analysis. In Advances in Neural Information Processing Systems, pages 919–927, 2010.
 (22) Purnamrita Sarkar, Deepayan Chakrabarti, Michael Jordan, et al. Nonparametric link prediction in large scale dynamic networks. Electronic Journal of Statistics, 8(2):2022–2065, 2014.
 (23) Daniele Durante and David B Dunson. Nonparametric Bayes dynamic modelling of relational data. Biometrika, 101(4):883–898, 2014.
 (24) Aaron Schein, Mingyuan Zhou, David M Blei, and Hanna Wallach. Bayesian Poisson Tucker decomposition for learning the structure of international relations. In Proceedings of the 33rd International Conference on Machine Learning, pages 2810–2819, 2016.
 (25) Konstantina Palla, Francois Caron, and Yee Whye Teh. Bayesian nonparametrics for sparse dynamic networks. arXiv preprint arXiv:1607.01624, 2016.
 (26) Yin Cheng Ng and Ricardo Silva. A dynamic edge exchangeable model for sparse temporal networks. arXiv:1710.04008, 2017.
 (27) Sikun Yang and Heinz Koeppl. Dependent relational gamma process models for longitudinal networks. In International Conference on Machine Learning, pages 5547–5556, 2018.
 (28) Kevin S Xu and Alfred O Hero. Dynamic stochastic blockmodels for timeevolving social networks. IEEE Journal of Selected Topics in Signal Processing, 8(4):552–562, 2014.
 (29) Kevin Xu. Stochastic block transition models for dynamic networks. In International Conference on Artificial Intelligence and Statistics, pages 1079–1087, 2015.
 (30) Wenjie Fu, Le Song, and Eric P Xing. Dynamic mixed membership blockmodel for evolving networks. In Proceedings of the 26th annual international conference on machine learning, pages 329–336, 2009.
 (31) Eric P Xing, Wenjie Fu, Le Song, et al. A state-space mixed membership blockmodel for dynamic network tomography. The Annals of Applied Statistics, 4(2):535–566, 2010.
 (32) Qirong Ho, Le Song, and Eric Xing. Evolving cluster mixedmembership blockmodel for timeevolving networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pages 342–350, 2011.
 (33) James Foulds, Christopher DuBois, Arthur Asuncion, Carter Butts, and Padhraic Smyth. A dynamic relational infinite feature model for longitudinal social networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pages 287–295, 2011.
 (34) Creighton Heaukulani and Zoubin Ghahramani. Dynamic probabilistic models for latent feature propagation in social networks. In International Conference on Machine Learning, pages 275–283, 2013.
 (35) Myunghwan Kim and Jure Leskovec. Nonparametric multigroup membership model for dynamic networks. In Advances in neural information processing systems, pages 1385–1393, 2013.
 (36) Ayan Acharya, Joydeep Ghosh, and Mingyuan Zhou. Nonparametric bayesian factor analysis for dynamic count matrices. arXiv preprint arXiv:1512.08996, 2015.
 (37) Sikun Yang and Heinz Koeppl. A Poisson gamma probabilistic model for latent node-group memberships in dynamic networks. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
 (38) Mingyuan Zhou. Infinite edge partition models for overlapping community detection and link prediction. In Artificial Intelligence and Statistics, pages 1135–1143, 2015.
 (39) Steven N MacEachern. Dependent Dirichlet processes. Unpublished manuscript, Department of Statistics, The Ohio State University, pages 1–40, 2000.
 (40) Dahua Lin, Eric Grimson, and John W Fisher. Construction of dependent Dirichlet processes based on Poisson processes. In Advances in Neural Information Processing Systems, pages 1396–1404, 2010.
 (41) Lu Ren, David B. Dunson, and Lawrence Carin. The dynamic hierarchical Dirichlet process. In Proceedings of the 25th International Conference on Machine Learning, ICML ’08, pages 824–831, New York, NY, USA, 2008. ACM.
 (42) Rossana Mastrandrea, Julie Fournet, and Alain Barrat. Contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys. PloS one, 10(9):e0136497, 2015.
 (43) Anmol Madan, Manuel Cebrian, Sai Moturu, Katayoun Farrahi, et al. Sensing the “health state” of a community. IEEE Pervasive Computing, 11(4):36–45, 2011.
 (44) Sitaram Asur, Srinivasan Parthasarathy, and Duygu Ucar. An eventbased framework for characterizing the evolutionary behavior of interaction graphs. ACM Transactions on Knowledge Discovery from Data (TKDD), 3(4):16, 2009.
 (45) Hanna M. Wallach, Iain Murray, Ruslan Salakhutdinov, and David Mimno. Evaluation methods for topic models. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 1105–1112, 2009.
 (46) Charles Blundell, Jeff Beck, and Katherine A Heller. Modelling reciprocating relationships with Hawkes processes. In Advances in Neural Information Processing Systems, pages 2600–2608, 2012.