1 Introduction
Co-occurrence networks can be modeled efficiently using hyper-bag-graphs (hbgraphs for short), introduced in [1]. Depending on the information the co-occurrence network carries, the ranking of the information held by the associated hbgraph has to be performed on different features, with the emphasis placed on the lower, higher or medium values. Hence the need to extend the exchange-based diffusion, which is already coupled to a biased random walk in [2], to a more general approach using biases. We give the background in Section 2. We then propose a framework to achieve this kind of diffusion in Section 3 and evaluate it in Section 4, before concluding in Section 5.
2 Mathematical Background and Related Work
A hbgraph $\mathcal{H} = (V, E)$ is a family of multisets $E = (e_j)_{j \in [p]}$ of same universe $V = \{v_i : i \in [n]\}$. The elements of $E$ are called the hbedges; each hbedge $e_j$ is a multiset of universe $V$ and of multiplicity function $m_j : V \to \mathbb{R}^+$. The m-cardinality of a hbedge $e_j$ is $\#_m e_j = \sum_{i \in [n]} m_j(v_i)$. For a full introduction to hbgraphs, the interested reader can refer to [3]. A weighted hbgraph has hbedges having a weight given by $w_e : E \to \mathbb{R}^+$.
In [4], the authors introduce an abstract information function $f : V \to \mathbb{R}^+$, which is associated to a probability for each vertex $v_i \in V$: $p(v_i) = \dfrac{f(v_i)}{\sum_{j \in [n]} f(v_j)}$.
In [5], a bias is introduced in the transition probability of a random walk in order to explore communities in a network. The bias is either related to a vertex property, such as the degree, or to an edge property, such as the edge multiplicity or the shortest-path betweenness. For a vertex property $x_i$ attached to each vertex $v_i$, the new transition probability between vertex $v_i$ and $v_j$ is given by: $T_{ij} = \dfrac{a_{ij} e^{\beta x_j}}{\sum_{l} a_{il} e^{\beta x_l}}$, where $A = (a_{ij})$ is the adjacency matrix of the graph and $\beta$ is a parameter. The same kind of bias can be defined on the edges and combined with the former to obtain the overall transition probability from one vertex to another.
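As an illustration of this vertex-biased random walk, the following Python sketch builds the transition matrix of a small graph. It assumes an exponential bias of the form $T_{ij} \propto a_{ij} e^{\beta x_j}$; the function name `biased_transition_matrix`, the toy adjacency matrix and the choice of the degree as vertex property are illustrative assumptions and are not taken from [5].

```python
import math


def biased_transition_matrix(adjacency, vertex_property, beta):
    """Row-stochastic transition matrix of a vertex-biased random walk.

    Assumed form: T[i][j] = a[i][j] * exp(beta * x[j]), normalised over row i.
    """
    n = len(adjacency)
    transition = []
    for i in range(n):
        weights = [
            adjacency[i][j] * math.exp(beta * vertex_property[j])
            for j in range(n)
        ]
        total = sum(weights)
        if total == 0:
            raise ValueError(f"vertex {i} has no neighbour")
        transition.append([w / total for w in weights])
    return transition


# Hypothetical toy graph: a triangle (0, 1, 2) with a pendant vertex 3 on 2.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
degree = [sum(row) for row in adjacency]

unbiased = biased_transition_matrix(adjacency, degree, beta=0.0)
biased = biased_transition_matrix(adjacency, degree, beta=1.0)

# From vertex 0, a positive beta favours the higher-degree neighbour 2.
print(unbiased[0])
print(biased[0])
```

With `beta = 0` the walk reduces to the usual unbiased random walk; a negative `beta` would instead favour low-degree neighbours.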
3 Biased Diffusion in Hbgraphs
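We consider a weighted hbgraph $\mathcal{H} = (V, E, w_e)$ with $V = \{v_i : i \in [n]\}$ and $E = (e_j)_{j \in [p]}$; we write $H = [m_j(v_i)]_{i \in [n], j \in [p]}$ the incidence matrix of the hbgraph.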
3.1 Abstract Information Functions and Bias
We consider a hbedge based vertex abstract information function $f_V : V \times E \to \mathbb{R}^+$. The exchange-based diffusion presented in [6, 2] is a particular example of biased diffusion, where the biases are given in Table 1. An unbiased diffusion would be to have a vertex abstract function and a hbedge vertex function that are set to 1 for all vertices and hbedges, i.e. equiprobability for all vertices and all hbedges.
Table 1: Features and biases used in the exchange-based diffusion of [6, 2].
Hbedge based vertex abstract information function  
Vertex abstract information function  
Vertex bias function  
Vertex overall bias  
Vertexbased hbedge abstract information function  
Hbedge abstract information function  
Hbedge bias function  
Hbedge overall bias 
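The vertex abstract information function is defined as the function $F_V : V \to \mathbb{R}^+$ such that: $F_V(v_i) = \sum_{j \in [p]} f_V(v_i, e_j)$. The probability corresponding to this hbedge based vertex abstract information is defined as: $p^{f_V}(e_j \mid v_i) = \dfrac{f_V(v_i, e_j)}{F_V(v_i)}$. If we now consider a vertex bias function $g_V : \mathbb{R}^+ \to \mathbb{R}^+$ applied to $f_V(v_i, e_j)$, we can define a biased probability on the transition from vertices to hbedges as:
$\tilde{p}_V(e_j \mid v_i) = \dfrac{g_V(f_V(v_i, e_j))}{G_V(v_i)}$,
where $G_V(v_i)$, the vertex overall bias, is defined as:
$G_V(v_i) = \sum_{j \in [p]} g_V(f_V(v_i, e_j))$.
Typical choices for $g_V$ are $g_V(x) = x^{\alpha}$ or $g_V(x) = e^{\alpha x}$. When $\alpha > 0$, higher values of $f_V$ are encouraged and, on the contrary, when $\alpha < 0$, smaller values of $f_V$ are encouraged.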
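Similarly, the vertexbased hbedge abstract information function is defined as the function $f_E : E \times V \to \mathbb{R}^+$. The hbedge abstract information function is defined as the function $F_E : E \to \mathbb{R}^+$ such that: $F_E(e_j) = \sum_{i \in [n]} f_E(e_j, v_i)$. The probability corresponding to the vertexbased hbedge abstract information is defined as: $p^{f_E}(v_i \mid e_j) = \dfrac{f_E(e_j, v_i)}{F_E(e_j)}$. Considering a hbedge bias function $g_E : \mathbb{R}^+ \to \mathbb{R}^+$ applied to $f_E(e_j, v_i)$, a biased probability on the transition from hbedges to vertices is defined as:
$\tilde{p}_E(v_i \mid e_j) = \dfrac{g_E(f_E(e_j, v_i))}{G_E(e_j)}$,
where the hbedge overall bias $G_E(e_j)$ is defined as:
$G_E(e_j) = \sum_{i \in [n]} g_E(f_E(e_j, v_i))$.
Typical choices for $g_E$ are $g_E(x) = x^{\alpha}$ or $g_E(x) = e^{\alpha x}$. When $\alpha > 0$, higher values of $f_E$ are encouraged and, on the contrary, when $\alpha < 0$, smaller values of $f_E$ are encouraged.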
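The biased probabilities defined in this section can be sketched in a few lines of Python. The sketch assumes that the feature of a vertex in a hbedge is its multiplicity times the hbedge weight, that only incident vertex/hbedge pairs (strictly positive feature) take part in a transition, and that every vertex belongs to at least one hbedge; the helper names (`power_bias`, `exponential_bias`, `biased_probabilities`) and the toy values are hypothetical.

```python
import math


def power_bias(alpha):
    """Bias function x -> x**alpha."""
    return lambda x: x ** alpha


def exponential_bias(alpha):
    """Bias function x -> exp(alpha * x)."""
    return lambda x: math.exp(alpha * x)


def biased_probabilities(features, bias):
    """Normalise bias(feature) over each row of a feature matrix.

    Assumption: a zero feature means "not incident" and gets probability 0,
    and every row has at least one strictly positive feature.
    """
    rows = []
    for row in features:
        biased = [bias(x) if x > 0 else 0.0 for x in row]
        overall_bias = sum(biased)  # overall bias of this vertex (or hbedge)
        rows.append([b / overall_bias for b in biased])
    return rows


# Toy hbgraph: 3 vertices, 2 hbedges; multiplicity[i][j] = m_j(v_i).
multiplicity = [[1, 0], [2, 1], [1, 3]]
weight = [1.0, 0.5]

# Assumed feature: multiplicity times hbedge weight, seen from the vertices...
f_vertex = [[m * w for m, w in zip(row, weight)] for row in multiplicity]
# ...and the same values seen from the hbedges (transposed matrix).
f_hbedge = [list(column) for column in zip(*f_vertex)]

toward_high = biased_probabilities(f_vertex, power_bias(2))   # alpha > 0
reference = biased_probabilities(f_vertex, power_bias(1))     # g(x) = x
toward_low = biased_probabilities(f_vertex, power_bias(-2))   # alpha < 0
hbedge_side = biased_probabilities(f_hbedge, exponential_bias(-1))

print(toward_high[1], reference[1], toward_low[1])
```

For the second vertex, whose features are 2.0 and 0.5, a positive exponent pushes the transition towards the first hbedge, while a negative exponent pushes it towards the second one.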
3.2 Biased Diffusion by Exchange
A two-phase diffusion step by exchange is now considered, with an approach similar to [6, 2], taking into account the biased probabilities on vertices and hbedges.
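The vertices hold an information value at time $t$ given by: $\alpha_t : V \to [0, 1]$.
The hbedges hold an information value at time $t$ given by: $\epsilon_t : E \to [0, 1]$.
We write $P_{V,t} = (\alpha_t(v_i))_{i \in [n]}$ the row state vector of the vertices at time $t$ and $P_{E,t} = (\epsilon_t(e_j))_{j \in [p]}$ the row state vector of the hbedges. We call information value of the vertices the value $I_t(V) = \sum_{i \in [n]} \alpha_t(v_i)$, and $I_t(E) = \sum_{j \in [p]} \epsilon_t(e_j)$ the one of the hbedges. We write: $I_t(\mathcal{H}) = I_t(V) + I_t(E)$. The initialisation is done such that $I_0(\mathcal{H}) = 1$. At the start of the diffusion process, the vertices concentrate uniformly and exclusively all the information value. Writing $\alpha_{\text{ref}} = \frac{1}{n}$, we set $\alpha_0(v_i) = \alpha_{\text{ref}}$ for all $i \in [n]$ and $\epsilon_0(e_j) = 0$ for all $j \in [p]$.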
At every time step, the first phase starts at time $t$ and ends at $t + \frac{1}{2}$, where the values held by the vertices are shared completely to the hbedges, followed by the second phase between time $t + \frac{1}{2}$ and $t + 1$, where the exchanges take place the other way round. The exchanges between vertices and hbedges aim at being conservative on the global value of $\alpha_t$ and $\epsilon_t$ distributed over the hbgraph.
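During the first phase between time $t$ and time $t + \frac{1}{2}$, the contribution to the value $\epsilon_{t+\frac{1}{2}}(e_j)$ from the vertex $v_i$ is given by:
$\delta\epsilon_{t+\frac{1}{2}}(e_j \mid v_i) = \tilde{p}_V(e_j \mid v_i)\, \alpha_t(v_i)$
and:
$\epsilon_{t+\frac{1}{2}}(e_j) = \sum_{i \in [n]} \delta\epsilon_{t+\frac{1}{2}}(e_j \mid v_i)$.
We have:
$\alpha_{t+\frac{1}{2}}(v_i) = \alpha_t(v_i) - \sum_{j \in [p]} \delta\epsilon_{t+\frac{1}{2}}(e_j \mid v_i)$.
Claim 1 (No information on vertices at $t + \frac{1}{2}$).
It holds: $I_{t+\frac{1}{2}}(V) = 0$.
Proof.
For all $i \in [n]$: $\alpha_{t+\frac{1}{2}}(v_i) = \alpha_t(v_i) \left(1 - \dfrac{\sum_{j \in [p]} g_V(f_V(v_i, e_j))}{G_V(v_i)}\right) = 0$.
∎
Claim 2 (Conservation of the information of the hbgraph at $t + \frac{1}{2}$).
It holds: $I_{t+\frac{1}{2}}(\mathcal{H}) = I_t(\mathcal{H})$.
Proof.
We have: $I_{t+\frac{1}{2}}(\mathcal{H}) = I_{t+\frac{1}{2}}(V) + I_{t+\frac{1}{2}}(E) = \sum_{j \in [p]} \sum_{i \in [n]} \tilde{p}_V(e_j \mid v_i)\, \alpha_t(v_i) = \sum_{i \in [n]} \alpha_t(v_i) \sum_{j \in [p]} \tilde{p}_V(e_j \mid v_i) = I_t(V) = I_t(\mathcal{H})$, since the hbedges hold no value at time $t$.
∎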
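We introduce the vertex overall bias matrix $G_V = \operatorname{diag}\left((G_V(v_i))_{i \in [n]}\right)$ and the biased vertex-feature matrix $B_V = [g_V(f_V(v_i, e_j))]_{i \in [n],\, j \in [p]}$. It holds:
$P_{E,t+\frac{1}{2}} = P_{V,t}\, G_V^{-1} B_V$. (1)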
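During the second phase, which starts at time $t + \frac{1}{2}$, the values held by the hbedges are transferred to the vertices. The contribution to $\alpha_{t+1}(v_i)$ given by a hbedge $e_j$ is proportional to $\epsilon_{t+\frac{1}{2}}(e_j)$, in a factor corresponding to the biased probability $\tilde{p}_E(v_i \mid e_j)$: $\delta\alpha_{t+1}(v_i \mid e_j) = \tilde{p}_E(v_i \mid e_j)\, \epsilon_{t+\frac{1}{2}}(e_j)$.
Hence, we have: $\alpha_{t+1}(v_i) = \sum_{j \in [p]} \delta\alpha_{t+1}(v_i \mid e_j)$ and: $\epsilon_{t+1}(e_j) = \epsilon_{t+\frac{1}{2}}(e_j) - \sum_{i \in [n]} \delta\alpha_{t+1}(v_i \mid e_j)$.
Claim 3 (The hbedges have no value at $t + 1$).
It holds: $I_{t+1}(E) = 0$.
Proof.
Similar to the one of the first phase for $I_{t+\frac{1}{2}}(V)$.
∎
Claim 4 (Conservation of the information of the hbgraph at $t + 1$).
It holds: $I_{t+1}(\mathcal{H}) = I_{t+\frac{1}{2}}(\mathcal{H})$.
Proof.
Similar to the one for the first phase.
∎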
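We now introduce $G_E = \operatorname{diag}\left((G_E(e_j))_{j \in [p]}\right)$, the diagonal matrix of size $p \times p$, and the biased hbedge-feature matrix $B_E = [g_E(f_E(e_j, v_i))]_{j \in [p],\, i \in [n]}$; it comes:
$P_{V,t+1} = P_{E,t+\frac{1}{2}}\, G_E^{-1} B_E$, (2)
$P_{V,t+1} = P_{V,t}\, G_V^{-1} B_V\, G_E^{-1} B_E$. (3)
It is valuable to keep a trace of the intermediate state $P_{E,t+\frac{1}{2}} = P_{V,t}\, G_V^{-1} B_V$, as it records the information on hbedges.
Writing $T = G_V^{-1} B_V\, G_E^{-1} B_E$, it follows from (3): $P_{V,t+1} = P_{V,t}\, T$.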
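The two-phase step can be checked numerically on a toy example. The following Python sketch assumes that the two biased feature matrices are already available as nested lists (the values below are hypothetical); row-normalising them gives the vertex-to-hbedge and hbedge-to-vertex biased probabilities, and one diffusion step is then two vector-matrix products.

```python
def row_normalise(matrix):
    """Divide every row by its sum (assumes each row sum is positive)."""
    return [[x / sum(row) for x in row] for row in matrix]


def vector_times_matrix(vector, matrix):
    """Row vector times matrix, both given as plain Python lists."""
    return [
        sum(vector[i] * matrix[i][j] for i in range(len(vector)))
        for j in range(len(matrix[0]))
    ]


def diffusion_step(p_vertices, vertex_to_hbedge, hbedge_to_vertex):
    """One two-phase step: vertices -> hbedges, then hbedges -> vertices."""
    p_hbedges_half = vector_times_matrix(p_vertices, vertex_to_hbedge)
    p_vertices_next = vector_times_matrix(p_hbedges_half, hbedge_to_vertex)
    return p_hbedges_half, p_vertices_next


# Hypothetical biased feature matrices for 3 vertices and 2 hbedges.
biased_vertex_features = [[1.0, 0.0], [4.0, 0.25], [1.0, 2.25]]   # n x p
biased_hbedge_features = [[1.0, 2.0, 1.0], [0.0, 0.5, 1.5]]       # p x n

vertex_to_hbedge = row_normalise(biased_vertex_features)
hbedge_to_vertex = row_normalise(biased_hbedge_features)

# Initialisation: the vertices hold all the information, uniformly.
p_vertices = [1.0 / 3] * 3
p_hbedges_half, p_vertices_next = diffusion_step(
    p_vertices, vertex_to_hbedge, hbedge_to_vertex
)

# The total information is conserved at each half step.
print(sum(p_hbedges_half), sum(p_vertices_next))
```

Keeping `p_hbedges_half` alongside `p_vertices_next` mirrors the remark that the intermediate state records the information on hbedges.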
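Claim 5 (Stochastic transition matrix).
$T$ is a square row stochastic matrix.
Proof.
Let $A = G_V^{-1} B_V = (a_{ij})_{i \in [n],\, j \in [p]}$ and $B = G_E^{-1} B_E = (b_{jk})_{j \in [p],\, k \in [n]}$. $A$ and $B$ are nonnegative rectangular matrices. Moreover:
$a_{ij} = \tilde{p}_V(e_j \mid v_i)$ and $\sum_{j \in [p]} a_{ij} = 1$,
and:
$b_{jk} = \tilde{p}_E(v_k \mid e_j)$ and $\sum_{k \in [n]} b_{jk} = 1$.
We have $T = AB$, where: $T_{ik} = \sum_{j \in [p]} a_{ij} b_{jk}$.
It yields: $\sum_{k \in [n]} T_{ik} = \sum_{j \in [p]} a_{ij} \sum_{k \in [n]} b_{jk} = \sum_{j \in [p]} a_{ij} = 1$.
Hence $T$ is a nonnegative square matrix with its row sums all equal to 1: it is a row stochastic matrix.
∎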
Claim 6 (Properties of T).
Assuming that the hbgraph is connected, the biased feature exchange-based diffusion matrix $T$ is aperiodic and irreducible.
Proof.
This stochastic matrix is aperiodic, since any vertex of the hbgraph retrieves a part of the value it has given to a hbedge; hence $T_{ii} > 0$ for all $i \in [n]$. Moreover, as the hbgraph is connected, the matrix is irreducible, as any state can be reached from any other state.
∎
The fact that $T$ is an aperiodic and irreducible stochastic matrix for a connected hbgraph ensures that $(P_{V,t})_{t \in \mathbb{N}}$ converges to a stationary state, which is the probability vector $\pi_V$ associated to the eigenvalue 1 of $T$. Nonetheless, due to the presence of the different functions for vertices and hbedges, the simplifications of [6, 2] do not occur anymore, and thus we do not have an explicit expression for the stationary state vector of the vertices. The same occurs for the expression of the hbedge stationary state vector $\pi_E$, which is still calculated from $\pi_V$ using the following formula: $\pi_E = \pi_V\, G_V^{-1} B_V$.
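Since no closed form is available, the stationary state can be approximated numerically by iterating the diffusion. The Python sketch below is a plain power iteration, not the authors' implementation; the biased feature matrices are hypothetical toy values and the helper names are illustrative.

```python
def row_normalise(matrix):
    """Divide every row by its sum (assumes each row sum is positive)."""
    return [[x / sum(row) for x in row] for row in matrix]


def matrix_product(a, b):
    """Product of two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def vector_times_matrix(vector, matrix):
    """Row vector times matrix."""
    return [
        sum(vector[i] * matrix[i][j] for i in range(len(vector)))
        for j in range(len(matrix[0]))
    ]


def stationary_state(transition, iterations=200):
    """Approximate the stationary row vector by repeated multiplication."""
    n = len(transition)
    state = [1.0 / n] * n  # uniform start, as in the initialisation
    for _ in range(iterations):
        state = vector_times_matrix(state, transition)
    return state


# Hypothetical biased feature matrices for 3 vertices and 2 hbedges.
vertex_to_hbedge = row_normalise([[1.0, 0.0], [4.0, 0.25], [1.0, 2.25]])
hbedge_to_vertex = row_normalise([[1.0, 2.0, 1.0], [0.0, 0.5, 1.5]])

# Transition matrix between vertices for one full two-phase step.
transition = matrix_product(vertex_to_hbedge, hbedge_to_vertex)

pi_vertices = stationary_state(transition)
# Hbedge stationary state, obtained from the vertex one by a half step.
pi_hbedges = vector_times_matrix(pi_vertices, vertex_to_hbedge)

print(pi_vertices)
print(pi_hbedges)
```

Sorting the vertices by `pi_vertices` and the hbedges by `pi_hbedges` gives the two rankings; the fixed number of iterations is a simplification, and a convergence test on successive states could be used instead.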
4 Results and Evaluation
We consider different biases on a randomly generated hbgraph, still using the same features as in the exchange-based diffusion realized in [6, 2]. We generate hbgraphs with 200 collaborations, built out of 10,000 potential vertices, with a maximum m-cardinality of 20. The hbgraph has five groups, each generated with two vertices, chosen out of a group of 10, that have to occur in each of its collaborations; there are also 20 vertices that stand as central vertices, i.e. that ensure the connectivity between the different groups of the hbgraph.
The approach is similar to the one taken in [6, 2], using the same hbedge based vertex abstract information function and the same vertexbased hbedge abstract information function, but with different biases, as presented in Table 2.
Table 2: Biases used in the experiments.
Experiment  1  2  3  4  5  
Vertex bias function  
Hbedge bias function  
Experiment  6  7  8  9  
Vertex bias function  
Hbedge bias function  
Experiment  10  11  12  13  14  15 
Vertex bias function  
Hbedge bias function 
We compare the rankings obtained on vertices and hbedges after 200 iterations of the exchange-based diffusion, using the strict and large Kendall tau correlation coefficients, for the different biases proposed in Table 2. We present the results as a visualisation of correlation matrices in Figure 1 and in Figure 2, the rows and columns of these matrices being ordered by the experiment index given in Table 2.
We write $\sigma_{i,t}$ the ranking obtained with the biases of Experiment $i$, with $t \in \{V, E\}$ indicating whether the ranking is performed on vertices or hbedges; the absence of $t$ means that the statement holds for both rankings. The ranking obtained by Experiment 1 is called the reference ranking.
In Experiments 2 to 5, the same bias is applied to both vertices and hbedges. In Experiments 2 and 3, the biases are increasing functions, while in Experiments 4 and 5 they are decreasing functions.
Experiments 2 and 3 lead to rankings that are well correlated with the reference ranking, given the value of the large Kendall tau correlation coefficient. The higher value of the large coefficient compared to the strict one marks the fact that the rankings obtained with pairs of similar biases agree on the ties in this case. The exponential bias yields a ranking that is more granular in the tail for vertices, and reshuffles the way the hbedges are ranked; similar observations can be made for both the vertex and hbedge rankings in Experiments 2 and 3.
In Experiments 4 and 5, the rankings remain well correlated with the reference ranking, but the large Kendall tau correlation coefficient values show that there is much less agreement on the ties; this disagreement is, however, localised in the rankings, with again more discrimination with an exponential bias. These slight changes imply a reshuffling of the hbedge rankings in both cases, significantly emphasized by the exponential form.
None of these simultaneous pairs of biases reshuffles the head of the vertex rankings very differently, but most of them have implications on the head of the hbedge rankings: typical examples are given in Figure 3. This would need further investigation using the Jaccard index.
Dissimilarities in rankings occur when the bias is applied only to vertices or only to hbedges. For vertices, the strict Kendall tau correlation coefficients between the reference ranking and the rankings obtained with the biases of Experiments 6 to 9 (bias on vertices) and 10 to 13 (bias on hbedges) show weak consistency, with values around 0.4 (Figure 1 (a)), while the large Kendall tau correlation coefficient values show a small disagreement, with values around 0.1 (Figure 1 (b)). For hbedges, the gap is much smaller between the strict values, around 0.7 as shown in Figure 2 (a), and the large Kendall tau correlation coefficient values, around 0.6 as shown in Figure 2 (b).
Biases with the same monotonicity (the two increasing biases on the one hand, the two decreasing biases on the other hand) have similar effects, whether they are applied to vertices only or to hbedges only. It is also worth remarking that increasing biases lead to rankings that have no specific agreement or disagreement with the rankings of decreasing biases (see the corresponding entries of the correlation matrices).
We also remark that increasing biases applied only to vertices correlate with the corresponding decreasing biases applied only to hbedges, and vice versa. This is the case for Experiments 6 and 12, Experiments 7 and 13, Experiments 8 and 10, and Experiments 9 and 11, for both vertices (Figures 1 (a) and (b)) and hbedges (Figures 2 (a) and (b)).
Finally, we conduct two more experiments, Experiments 14 and 15, combining an increasing bias and a decreasing bias in two different manners. Unsurprisingly, they reinforce the disagreement with the reference ranking both on vertices and hbedges, with a stronger disagreement when the decreasing bias is put on vertices. We can remark that Experiment 14 has the strongest correlations with the rankings of dissimilar biases that are similar either to its bias on vertices (Experiments 6 and 7) or to its bias on hbedges (Experiments 12 and 13).
Figure 1: Kendall tau correlation coefficients between the vertex rankings of the experiments: (a) strict; (b) large.
Figure 2: Kendall tau correlation coefficients between the hbedge rankings of the experiments: (a) strict; (b) large.
Figure 3: Typical examples of comparisons between a first and a second ranking obtained with different pairs of biases, panels (a) to (d).
A last remark concerns the variability of the results: although the values of the correlation coefficients change from one hbgraph to another, the phenomenon observed remains the same whatever hbgraph is considered; moreover, the number of experiments performed already limits the fluctuation in these results.
5 Further Comments
The biased exchange-based diffusion proposed in this Chapter provides a tunable diffusion that can be integrated into the hbgraph framework to tune adequately the ranking of the facets. The results obtained on randomly generated hbgraphs still have to be confirmed on real hbgraphs, with the known difficulty of connectedness; this will be addressed in future work. There remains a lot to explore on the subject in order to refine the query results obtained with real searches. A remaining difficulty is that, in ground truth classification by experts, only a few criteria can be retained, which in most cases ends up in pairwise comparisons of elements and, hence, does not account for higher order relationships.
References
 [1] X. Ouvrard, J.-M. Le Goff, and S. Marchand-Maillet, “Adjacency and Tensor Representation in General Hypergraphs. Part 2: Multisets, Hb-graphs and Related e-adjacency Tensors,” arXiv preprint arXiv:1805.11952, 2018.
 [2] X. Ouvrard, J.-M. Le Goff, and S. Marchand-Maillet, “Diffusion by Exchanges in HB-Graphs: Highlighting Complex Relationships Extended version,” arXiv preprint arXiv:1809.00190v2, 2019.
 [3] X. Ouvrard, J.-M. Le Goff, and S. Marchand-Maillet, “On Hb-graphs and their Application to General Hypergraph e-adjacency Tensor,” MCCCC32 Special Volume of the Journal of Combinatorial Mathematics and Combinatorial Computing, to be published, 2019.
 [4] M. Dehmer and A. Mowshowitz, “A history of graph entropy measures,” Information Sciences, vol. 181, pp. 57–78, Jan. 2011.
 [5] V. Zlatić, A. Gabrielli, and G. Caldarelli, “Topologically biased random walk and community finding in networks,” Physical Review E, vol. 82, p. 066109, Dec. 2010.
 [6] X. Ouvrard, J.-M. Le Goff, and S. Marchand-Maillet, “Diffusion by Exchanges in HB-Graphs: Highlighting Complex Relationships,” CBMI Proceedings, 2018.