In recent years there has been an increasing interest in the study of probabilistic
models defined on graphs in order to describe the random interactions in a network
system. The Exponential Random Graph Model (ERGM), pioneered by Frank & Strauss (1986), allows the representation of a large number of dependencies found in real networks, such as social networks (Robins et al., 2007). The ERGM is a family of probability distributions on graphs belonging to the exponential family (known as Gibbs distributions in Statistical Physics) such that the probability of a given graph depends only on sufficient statistics of the graph, such as the number of edges, stars, triangles and so on. Despite its applicability, this model does not incorporate properties of the underlying space where the vertices are located. For some real networks, it is reasonable to consider that the network connections might depend on the physical space where the graph is embedded. In Mourrat et al. (2018), the authors introduce and study the behavior of a Spatial Gibbs Random Graph defined on a one-dimensional space that gives more weight to graphs with small average distance between vertices. They also consider that the existence of each edge in the graph has a cost that depends only on the underlying space.
In this paper we introduce a random graph model that describes a balance between the statistics of the graph and the distance between the vertices in the underlying space. For a finite vertex set, we define a Gibbs measure with weights depending on the length of the edges and on a sufficient statistic which is a function of the graph and describes the interaction among the edges. We propose a graphical construction of the spatial Gibbs measure and we prove, under some sufficient conditions, the existence and uniqueness of an infinite volume measure as the limit of the finite volume measures. In infinite volume, a vertex of the graph could in principle be connected with infinitely many other vertices and thus have infinite degree. This is not the case for the proposed model, whose infinite volume measure is concentrated on the set of graphs with finite degree. To our knowledge, Ferrari et al. (2010) is the first attempt to study the existence of such limits for a spatial Gibbs random graph measure that favors graphs with short edges and penalizes vertices with degree other than one. Their results also involve percolation properties of the Gibbs measure. The model proposed by Ferrari et al. (2010) is a particular case of the model proposed in this work. Since the uniqueness of the infinite volume measure was not addressed by the authors, our results can be seen as a complement to their seminal work.
The clan of ancestors graphical construction used in this paper was originally proposed in Fernández et al. (2001). This construction is based on the graphical representation of a birth and death process defined on the set of graphs that has the Gibbs measure as its invariant measure. In this process edges try to appear in the graph with an exponential rate, but they are in fact added to the graph according to some probability depending on the present configuration of the graph. The edges are removed from the graph with rate 1. The process described above is dominated by a birth and death process in which edges are added to the graph every time they try to be born. This dominating process allows the presence of multiple edges in the network, giving rise to an independent multigraph process. To use the independent multigraph process to determine whether a given edge is present at a given time in the dependent process, it is necessary to look back in the past at the edges born earlier that could have influenced the existence of that edge at that time; these edges form its clan of ancestors. Once the clan is determined, it is necessary to perform a cleaning procedure forward in time to erase the edges that should not have been added to the graph. This construction directly induces a perfect simulation algorithm to sample a subgraph from the invariant measure, as proposed in Ferrari et al. (2002). This algorithm does not assume any monotonicity property of the process used in its construction, as is required in the case of perfect sampling methods for the ERGM (Cerqueira et al., 2017).
This paper is organized as follows. In Section 2 we introduce the definitions and notation to be used throughout the paper. Section 3 contains the main results and some examples. Sections 4 and 5 contain the graphical construction that is the key ingredient in the perfect simulation scheme as well as in the proofs, which are presented in Sections 6 and 7 respectively.
2 Graph definitions
Let be a finite set and define the set of all simple graphs with vertex set by
where . We denote by and by . We shall use to denote a graph where is if there exists an edge between and and otherwise, for . For simplicity, we write for .
Define as the graph which coincides with for all edges other than and . In the same way, is the graph which coincides with for all edges other than and .
We denote the set of pairs of vertices intersecting by
We define the restricted graph to the set by .
Let be the norm on given by and define the length of an edge by . For any denote by
the ball of radius centered at .
Let be the degree of vertex in the graph and let be the degree of vertex restricted to the box , that is,
Define the set .
2.1 Spatial Gibbs Random Graphs
The random graph model considered in this work describes an interplay between the sufficient statistics of the graph and the underlying space. We focus on random graphs with vertex set given by subsets of . We consider a model that penalizes connections between distant nodes in such a way that the Gibbs distribution defined on favors graphs with short edges. Inspired by the ERGM, our model also penalizes edges through a sufficient statistic which is a function of the whole graph. In this way, for we define the following Hamiltonian
For each fixed , the finite volume Gibbs distribution is given by
where is the normalizing constant.
In this paper we consider functions such that
only depends on edges that are connected with vertex or
there exists a finite constant , , (which does not depend on ) such that
for all finite .
Notice that the RHS of (2.4) is always finite if is finite.
In Ferrari et al. (2010), the authors consider a Gibbs measure that favors graphs with short edges, few vertices with degree zero and few vertices with degree greater than or equal to 2. To this end, they define the function by
where are fixed parameters.
In this particular case,
Although the authors have proved the existence of the infinite measure , the uniqueness of this measure has not been addressed by them.
Thus, as a complement to their work, we prove in Theorem 3.1-(1) that the uniqueness of is guaranteed under mild conditions over and . Further discussion of these conditions is given after Theorem 3.1-(1).
Our main interest is in models of the form (2.3) that take into account statistics of the graph such as k-stars and triangles. In these cases, the constant , given by (2.4), is equal to . In the particular case that the model gives more weight to graphs with short edges and penalizes 2-stars, the Hamiltonian can be written as
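As an illustration of a Hamiltonian that favors short edges and penalizes 2-stars, the following minimal sketch evaluates such an energy on a small graph. The specific form used here (a hypothetical parameter `beta` times the total edge length plus a hypothetical parameter `theta` times the number of 2-stars), the choice of the max-norm for the edge length, and all function names are assumptions made for illustration only; they are not taken from the displayed formulas.

```python
def edge_length(u, v):
    """Length of the edge {u, v}; the max-norm is an assumption of this sketch."""
    return max(abs(a - b) for a, b in zip(u, v))


def count_two_stars(edges):
    """Number of 2-stars, i.e. unordered pairs of distinct edges sharing a vertex."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # A vertex of degree d is the centre of d * (d - 1) / 2 two-stars.
    return sum(d * (d - 1) // 2 for d in degree.values())


def hamiltonian(edges, beta, theta):
    """Assumed energy: beta * (total edge length) + theta * (number of 2-stars)."""
    total_length = sum(edge_length(u, v) for u, v in edges)
    return beta * total_length + theta * count_two_stars(edges)


# A path with two unit-length edges has total length 2 and exactly one 2-star.
path = [((0, 0), (1, 0)), ((1, 0), (2, 0))]
energy = hamiltonian(path, beta=1.0, theta=0.5)
```

Under these assumptions, a larger `beta` makes long edges more costly and a larger `theta` makes vertices of high degree more costly, which is the qualitative balance the model is meant to capture.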
2.2 Dependent graph process
For a finite or infinite set and a real continuous function on , we define a Markov process on for which the generator of the process is defined by
It is worth noting that by the definition of in (2.4) we have that , for all and all .
The process defined above has the following dynamics: when the current graph is , the edge attempts to be born with rate and it is added to the graph with probability if it is not already in the graph. An edge belonging to the current graph is removed at rate 1.
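The dynamics just described can be sketched as a continuous-time simulation on a finite set of vertex pairs. This is only a sketch under stated assumptions: `birth_rate` and `accept_prob` are hypothetical user-supplied functions standing in for the attempt rate and the acceptance probability of the model, each present edge is assumed to die at rate 1, and the process is started from the empty graph.

```python
import random


def simulate_dependent_process(pairs, birth_rate, accept_prob, t_max, rng):
    """Simulate the edge birth-and-death dynamics up to time t_max.

    pairs       -- list of candidate edges (hashable and sortable)
    birth_rate  -- hypothetical function: pair -> attempt rate (> 0)
    accept_prob -- hypothetical function: (pair, current graph) -> value in [0, 1]
    """
    graph = set()                        # assumption: start from the empty graph
    weights = [birth_rate(e) for e in pairs]
    total_birth = sum(weights)
    t = 0.0
    while True:
        # Total event rate: all birth attempts plus one unit-rate death per edge.
        total = total_birth + len(graph)
        t += rng.expovariate(total)
        if t > t_max:
            return graph
        if not graph or rng.random() * total < total_birth:
            # Birth attempt: pick a pair with probability proportional to its rate.
            e = rng.choices(pairs, weights=weights)[0]
            # The edge is added only if absent and if the acceptance test succeeds.
            if e not in graph and rng.random() < accept_prob(e, frozenset(graph)):
                graph.add(e)
        else:
            # Death: every present edge is equally likely to be the one removed.
            graph.remove(rng.choice(sorted(graph)))


example_pairs = [(0, 1), (0, 2), (1, 2)]
final_graph = simulate_dependent_process(
    example_pairs,
    birth_rate=lambda e: 1.0,
    accept_prob=lambda e, g: 0.5,
    t_max=5.0,
    rng=random.Random(0),
)
```

Because a birth attempt on an edge that is already present is discarded, the state is always a simple graph, in contrast with the dominating multigraph process.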
For finite, it is easy to see that the invariant measure of the process defined above is given by (2.3). The next theorem guarantees the existence of at least one invariant measure of the process in the case of graphs with an infinite number of vertices.
For any (infinite) the Markov process with generator exists and admits at least one invariant measure.
3 Main Results
For , define
Our first result guarantees the existence and uniqueness of an infinite volume distribution as a limit along sub-sequences of , as . Furthermore, we show that under all graphs have finite vertex degree with probability 1.
Theorem 3.1.
If , then the following statements hold:
(1) For any there exists a unique process with generator . The process has a unique invariant measure given by . For finite, is the measure defined by (2.3). For , we denote by .
(2) Weak convergence. As , converges weakly to and is concentrated on
For any , define the set as the smallest set of vertices that fully determines the function . More precisely, it is uniquely determined by the following conditions
If is any other finite vertex set for which whenever , then .
In the same way, for any , the set is the smallest set of pairs of vertices on which depends, and it is uniquely determined by
If is any other finite set of pairs of vertices for which whenever , then .
Denote by and the subsets of which have finite and respectively.
For , define the distance
Theorem 3.2 (Exponential space convergence).
Let be a finite subset of and assume that . If is a measurable function with , then
for any . If is a measurable function with , then
for any .
Examples 3.3 and 3.4 below illustrate how Theorem 3.2 can be applied in order to better understand the relation between the infinite and finite measures through some characteristics of the graph. In particular, we obtain an upper bound for the absolute difference between the expected degree and the degree distribution with respect to the infinite and finite measures.
Example 3.3 (Expectation of the restricted degree).
Example 3.4 (Distribution of the degree of a vertex).
For any and , set . Since , and , using Theorem 3.2 we get
Theorem 3.5 states the mixing property for the finite measure .
Theorem 3.5 (Exponential mixing).
Let be a finite (infinite) subset of and assume that . If and are measurable functions with , then
for any . If and are measurable functions with , then
for any .
The last result in this section is a generalization of a central limit theorem for graphs.
Theorem 3.7 (Central limit theorem).
Let be a measurable function on with finite vertex support such that , for some . Let be a translation by and assume that and . Then,
Denote by the degree of the vertex located at the origin on . For all define the function and its translation . Assuming that and define
then we have that
4 Graphical Representation
In this section we present the graphical construction of the birth and death process inspired by Fernández et al. (2001), which is the key ingredient both in the proofs of the results stated above and in the perfect simulation scheme. The construction and the proofs are very similar to those in Fernández et al. (2001), so we omit the proofs.
To each pair of vertices we associate an independent marked Poisson process on with rate . Let be the ordered occurrence times of the Poisson process such that . To each occurrence time we associate an independent mark exponentially distributed with mean 1 and an independent mark uniformly distributed on [0, 1]. In a nutshell, an edge is born at the random time and survives time units.
We define the random family of marked Poisson processes. Each quartet can be represented by a marked rectangle with basis , birth time , lifetime and mark . In this way, for a rectangle we denote , , , and .
For an initial graph , we associate an independent random initial lifetime , exponentially distributed with mean 1, and an independent uniform mark on [0, 1] with each edge in the graph . Define the set of initial rectangles
For , , define the set of rectangles born on the time interval by
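The marked rectangles born in a bounded time window can be generated directly from their description: a Poisson stream of birth times per pair, an exponential lifetime with mean 1, and a uniform mark. The sketch below assumes a finite list of pairs and a hypothetical function `birth_rate` giving the Poisson rate of each pair; the `Rectangle` record and the function names are introduced here for illustration only.

```python
import random
from collections import namedtuple

# One rectangle per occurrence: (basis, birth time, lifetime, uniform mark).
Rectangle = namedtuple("Rectangle", ["edge", "birth", "life", "mark"])


def sample_rectangles(pairs, birth_rate, s, t, rng):
    """Sample all rectangles born in the time interval [s, t], sorted by birth time."""
    rectangles = []
    for e in pairs:
        time = s
        while True:
            # Successive gaps of a Poisson process are independent exponentials.
            time += rng.expovariate(birth_rate(e))
            if time > t:
                break
            life = rng.expovariate(1.0)   # lifetime: exponential with mean 1
            mark = rng.random()           # mark: uniform on [0, 1)
            rectangles.append(Rectangle(e, time, life, mark))
    rectangles.sort(key=lambda r: r.birth)
    return rectangles


window = sample_rectangles([(0, 1), (1, 2)], lambda e: 2.0, -3.0, 0.0, random.Random(0))
```

Each pair is handled independently of the others, which is exactly what makes the resulting family a product of independent marked Poisson processes.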
For the model defined by (2.3), Assumption 1 guarantees that the existence of the edge in the graph depends only on the edges that are connected to vertex or . In general, we say that there exists a dependence relation between two edges in the graph if they share a common vertex. We define this dependence relation between edges by
and between rectangles by
In the next sections, we consider the probability space given by the product of the spaces generated by the rectangles and initial rectangles . We denote it by . We also write for the respective expectation.
4.1 Construction of the independent multigraph process
To make the notation easier to follow, we reserve bold Roman letters to represent graphs and Greek letters to represent the processes defined on graphs.
For , define the process on by
The process described above is a product of independent birth-and-death processes on with initial graph whose generator is given by
In this “free” process an edge is added to the graph every time it tries to be born. Because of this lack of restriction, an edge may be added to the graph when it is already present, giving rise to a multigraph structure. In this case, corresponds to the number of edges connecting and .
The invariant and reversible measure for this process, denoted by , is the product distribution on for which the (marginal) number of multiple edges is given by a Poisson random variable with mean , that is,
In a nutshell, the invariant measure is defined on the set of multigraphs with independent edges. Because of this well-defined structure, this measure will be called the independent multigraph distribution.
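Since the edge multiplicities are independent Poisson random variables under this measure, a sample from it on a finite set of pairs can be drawn pair by pair. The sketch below assumes that the Poisson mean of a pair equals its birth rate, which is what one obtains for a birth-and-death process with unit death rate; the function `birth_rate` and the example rate `exp(-|u - v|)` are hypothetical choices for illustration.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate by multiplying uniforms (Knuth's method)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_independent_multigraph(pairs, birth_rate, rng):
    """Return a dict pair -> multiplicity, with independent Poisson multiplicities."""
    return {e: sample_poisson(birth_rate(e), rng) for e in pairs}


line_pairs = [(0, 1), (0, 2), (1, 2)]
multigraph = sample_independent_multigraph(
    line_pairs,
    birth_rate=lambda e: math.exp(-abs(e[0] - e[1])),  # assumed: rate decays with length
    rng=random.Random(0),
)
```

The multiplication method is adequate for the small means that arise when rates decay with edge length; for large means a different sampler would be preferable.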
4.2 Finite volume construction of the dependent process
As defined in Section 4.1, the independent multigraph process is constructed using the graphical representation by rectangles introduced in Section 4. In this section, we describe the cleaning operation that should be applied in the independent multigraph process in order to construct the process on , for finite, with generator given by (2.8).
Let . To construct the independent process on , for finite, we use the set of rectangles and the set of initial rectangles associated with the initial graph . To construct the dependent process with generator given by (2.8), some rectangles are erased from the set , using a cleaning procedure, resulting in the set of kept rectangles at time . The cleaning procedure used to decide which rectangles are erased or kept is described below.
At time we include all rectangles of in . Since is finite we can move forward ordering the birth and death marks as .
We construct the process as follows:
We set .
Suppose that is already defined, and that . We set
if , then
If is a death time, that is, for some , then we delete the edge from the graph by setting, for all ,
Go back to step .
If is a birth time, that is, for some , then if
we add the edge to the graph by setting
and we keep the rectangle , that is, is added to the set of kept rectangles . In either case, set and go back to step .
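The forward cleaning steps above can be sketched as a single sweep over the rectangles in order of birth time. This is a simplified illustration rather than the exact procedure: it assumes an empty initial graph (no initial rectangles), represents each rectangle as a plain tuple `(edge, birth, life, mark)`, and takes a hypothetical function `accept_prob` in place of the acceptance probability of the model.

```python
def clean(rectangles, accept_prob):
    """Return the kept rectangles after a forward sweep in time.

    rectangles  -- iterable of tuples (edge, birth, life, mark)
    accept_prob -- hypothetical function: (edge, current graph) -> value in [0, 1]
    """
    kept = []
    alive = {}  # edge -> death time of its currently kept rectangle
    for rect in sorted(rectangles, key=lambda r: r[1]):
        edge, birth, life, mark = rect
        # Death times: drop every edge whose kept rectangle ended before this birth.
        alive = {e: death for e, death in alive.items() if death > birth}
        # Birth time: keep the rectangle only if the edge is absent and the
        # uniform mark passes the acceptance test for the current graph.
        if edge not in alive and mark < accept_prob(edge, frozenset(alive)):
            kept.append(rect)
            alive[edge] = birth + life
    return kept


def graph_at(kept, t):
    """Edges of the dependent process at time t, read off the kept rectangles."""
    return {edge for edge, birth, life, _ in kept if birth <= t < birth + life}


rects = [
    ("a", 0.0, 2.0, 0.1),  # kept: edge absent, mark below 0.5
    ("a", 1.0, 1.0, 0.1),  # erased: edge "a" is still alive at time 1.0
    ("b", 1.5, 1.0, 0.9),  # erased: mark above the acceptance probability
    ("a", 3.0, 1.0, 0.2),  # kept: the first "a" rectangle died at time 2.0
]
kept_rects = clean(rects, accept_prob=lambda edge, graph: 0.5)
```

Reading the graph at a given time then only requires checking which kept rectangles are alive at that time, which is how the process is recovered from the set of kept rectangles.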
We can also construct the process described above directly from the set of kept rectangles. To do this, we first generate rectangles by running the independent multigraph process from to with initial graph . After that, we decide which rectangles are kept by successively applying the test given by (4.6). Basically, the test does not allow multiple edges in the graph and it adds an absent edge to the graph only with the probability given by (2.9). Using the kept rectangles directly, the process on is defined by
We show in Theorem 5.1 that has generator given by (2.8) restricting the sums to the set of pairs of edges contained in . Since is reversible for this process and we have an irreducible Markov process with a finite state space , converges in distribution to for any initial graph . This implies that is the unique invariant measure for this process.
Take two initial graphs and such that , for all . We construct the processes and using the same and the same set of initial rectangles for common edges in and . Since in the independent multigraph process all rectangles are kept we have that
The construction given by (4.7) can be done in a stationary way for . Indeed, since is a finite set, there exists a sequence of random times with as , such that corresponds to the empty graph, that is , for all . In other words, at each no rectangle is alive. Thus, we can construct the set of kept rectangles independently in each random interval using the rectangles of and forgetting the set of initial rectangles. Let us denote by the resulting set of kept rectangles and the process defined as in (4.7). By construction, has a time translation-invariant distribution. The process has generator given by (2.8) and distribution independent of given by . This implies that, for any and any ,
Taking , we have that
since for all and has Poisson distribution with mean