Selection of Random Walkers that Optimizes the Global Mean First-Passage Time for Search in Complex Networks

We design a method to optimize the global mean first-passage time (GMFPT) of multiple random walkers searching a complex network for a general target, without specifying any property of the target node. According to the Laplace-transformed formula for the GMFPT, we can equivalently minimize the overlap between the probability distributions of the sites visited by the random walkers. We employ a mutation-only genetic algorithm to solve this optimization problem, using a population of walkers with different starting positions and a corresponding mutation matrix to modify them. Numerical experiments on two kinds of random networks, Watts-Strogatz (WS) and Barabási-Albert (BA), show satisfactory results in selecting the origins of the walkers to achieve minimum overlap. Our method thus provides guidance for setting up a search by multiple random walkers on complex networks.

0.1 Introduction

Networks provide a succinct mathematical representation of the interactions between components in complex systems [1]. One recent topic in network research is the study of information spreading. A particularly useful measure of signal spreading in a network is the first-passage time (FPT) needed by a random walker to reach a target node starting from a chosen origin [2]. Among the many results on the mean first-passage time (MFPT) [3], the relation between the asymptotic form of the MFPT distribution and the structural properties of a network can be employed for efficient search, provided that we have some prior knowledge of the target [4, 5]. However, without information about the target node, search efficiency remains an open problem: we do not know how to minimize the global mean first-passage time (GMFPT) to a general target when multiple random walkers are employed. In this paper, we address this problem by designing a method to optimize the GMFPT by choosing the initial positions of multiple random walkers.

Consider several independent random walkers on a network. For each random walker taking a given number of steps, there is a probability distribution over the set of visited sites. Intuitively, larger path distances between the initial positions of the walkers lead to a smaller overlap between the walkers' probability distributions. Since a smaller overlap implies less time wasted by the walkers, the average search time can be shortened. This intuition is supported by a theoretical analysis of the case of two walkers. We would thus like to find the set of origins of the random walkers that minimizes the overlap. In this paper, we use a genetic algorithm [6] to find an optimal set of initial nodes. We employ a special evolutionary framework with a mutation matrix, called the mutation-only genetic algorithm (MOGA), to design the optimization algorithm. MOGA was first introduced by Szeto and Zhang [7] and later generalized by Law and Szeto to include crossover [8, 9]. Simulations are performed on two main kinds of random networks, WS and BA, with satisfactory results.

0.2 Global Mean First-Passage Time and Overlap

Consider a connected, undirected, finite network of $N$ nodes with adjacency matrix $A$. The degree of node $i$ is denoted by $k_i = \sum_j A_{ij}$. A random walk is a discrete-time stochastic process: a walker at node $i$ chooses one of its neighbors as the next step with equal probability. For a random walker that starts at the initial node $i$, the probability of finding it at node $j$ at time $t$ is denoted by $P_{ij}(t)$. We may introduce the stochastic matrix $W$ with entries $W_{ij} = A_{ij}/k_i$ and the probability distribution row vector $P_i(t) = (P_{i1}(t), \dots, P_{iN}(t))$. Then the master equation of a random walk is $P_i(t+1) = P_i(t)\,W$.
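The random-walk update described above can be sketched numerically. This is a minimal illustration, not the authors' code: the function names `transition_matrix` and `evolve` are hypothetical, and the row-vector convention (the distribution is multiplied by the stochastic matrix from the left) is an assumption.

```python
import numpy as np

def transition_matrix(A):
    """Row-stochastic matrix: entry (i, j) is A[i, j] divided by the degree of node i."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)  # node degrees
    if np.any(k == 0):
        raise ValueError("every node needs at least one neighbor")
    return A / k[:, None]

def evolve(W, origin, steps):
    """Distribution of a walker started at `origin` after `steps` steps."""
    p = np.zeros(W.shape[0])
    p[origin] = 1.0
    for _ in range(steps):
        p = p @ W  # one application of the master equation
    return p

# Example: a ring of four nodes (assumed toy network).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
W = transition_matrix(A)
p2 = evolve(W, origin=0, steps=2)
print(p2)
```

On this four-node ring, two steps from node 0 leave the walker at node 0 or node 2 with probability 1/2 each.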

Our goal is to optimize the global mean first-passage time (GMFPT) of a search by multiple walkers. However, computing the GMFPT requires the global topological structure of the network and is time-consuming, since it converges slowly when we simulate random walks. To design an algorithm that optimizes the GMFPT, we first look for a quickly computable quantity that is monotonically related to it. Following our intuition that an overlap between the probability distributions of distinct walkers wastes search time, we expect that a smaller overlap results in a smaller GMFPT.

We now verify this intuition by a rigorous derivation. Theoretically, we need only consider the case of two walkers starting at nodes and , as one can easily generalize to the case of more walkers. For a walker starting at node , we define as the probability that node has not been visited by the walker up to time . Then the GMFPT is . The overlap is defined as . We claim that they are positively correlated with each other. We start from the deep relation between and [4]. Applying the discrete Laplace transform, we have, . Then, using the discrete Laplace transform formula of a product, we can write the GMFPT and the overlap in terms of and as, , and, . Note that the part which depends only on becomes a constant under the summation over in both equations. We also find that is holomorphic in the region enclosed by and that its value on the positive real axis is always real and positive. From these facts we conclude that and are positively correlated with each other.
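For small networks, the two-walker GMFPT can be evaluated directly from the non-visit probabilities described above. The sketch below rests on stated assumptions rather than on the paper's formula: the walkers are independent, so their joint non-visit probability is a product; the mean first-passage time to a target is the sum of that product over time; the average runs over all nodes as targets; and the sum is truncated at a hypothetical `t_max`. The names `survival` and `gmfpt_two_walkers` are illustrative.

```python
import numpy as np

def survival(W, origin, target, t_max):
    """S[t]: probability that a walker from `origin` has not visited `target` up to time t."""
    n = W.shape[0]
    if origin == target:
        return np.zeros(t_max)  # the target is found at t = 0
    Q = W.copy()
    Q[:, target] = 0.0          # probability arriving at the target is removed
    p = np.zeros(n)
    p[origin] = 1.0
    S = np.empty(t_max)
    for t in range(t_max):
        S[t] = p.sum()          # mass that has not yet reached the target
        p = p @ Q
    return S

def gmfpt_two_walkers(W, i, j, t_max=2000):
    """Truncated GMFPT of two independent walkers, averaged over all target nodes."""
    n = W.shape[0]
    total = 0.0
    for s in range(n):
        total += np.sum(survival(W, i, s, t_max) * survival(W, j, s, t_max))
    return total / n

# Example: complete graph on four nodes (assumed toy network).
n = 4
A = np.ones((n, n)) - np.eye(n)
W = A / A.sum(axis=1, keepdims=True)
value = gmfpt_two_walkers(W, 0, 1, t_max=200)
print(value)
```

On the complete graph with four nodes this evaluates to 0.9: the two origin nodes are found at time zero, and each of the other two targets is missed by both walkers with probability 4/9 per step, giving a mean of 9/5.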

Following the analysis above, we may focus on minimizing the total overlap to optimize the GMFPT. This is the starting point of the genetic algorithm introduced in this paper, in which we define the total overlap as the fitness function. Computing the mutual overlap is faster than computing the GMFPT and requires only the local topological structure around the initial nodes, provided that we impose an appropriate cutoff time.
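A cutoff-limited overlap can be sketched as follows. The exact overlap formula is not recoverable from the text here, so this example substitutes a simple proxy and should be read as an assumption: each walker's occupation probabilities are summed over times 0 to `cutoff` to give a visit density, and the overlap of two walkers is the inner product of their densities. The names `visit_density`, `overlap`, and `total_overlap` are hypothetical.

```python
import numpy as np
from itertools import combinations

def visit_density(W, origin, cutoff):
    """Occupation probabilities summed over t = 0, ..., cutoff (assumed proxy)."""
    p = np.zeros(W.shape[0])
    p[origin] = 1.0
    density = p.copy()
    for _ in range(cutoff):
        p = p @ W
        density += p
    return density

def overlap(W, i, j, cutoff):
    """Proxy overlap of two walkers: inner product of their visit densities."""
    return float(visit_density(W, i, cutoff) @ visit_density(W, j, cutoff))

def total_overlap(W, origins, cutoff):
    """Sum of pairwise proxy overlaps; smaller is treated as fitter."""
    return sum(overlap(W, i, j, cutoff) for i, j in combinations(origins, 2))

# Example: ring of eight nodes (assumed toy network).
n = 8
A = np.zeros((n, n))
for v in range(n):
    A[v, (v + 1) % n] = A[(v + 1) % n, v] = 1.0
W = A / A.sum(axis=1, keepdims=True)

near = overlap(W, 0, 1, cutoff=3)
far = overlap(W, 0, 4, cutoff=3)
print(near, far)
```

With a cutoff of three steps, only nodes within three hops of an origin contribute, which is the sense in which the computation is local. In this toy case the adjacent pair has the larger proxy overlap, consistent with the intuition that distant origins waste less effort.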

0.3 Mutation Only Genetic Algorithm

We start by formally defining the optimization problem as the "optimal set of initial nodes" problem. The input is an undirected network, represented by its adjacency list, together with the number of random walkers. The output is the optimal set of initial nodes for these random walkers. To construct the set of initial nodes, we start from a randomly chosen node and iteratively run the genetic algorithm to select a new initial node that has minimum overlap with the existing ones, and then add it to the set. With this strategy, the search space of the genetic algorithm is reduced to the set of nodes of the network.
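The incremental construction of the set of initial nodes can be sketched as a greedy loop. To keep the example short, the inner genetic-algorithm search is replaced here by an exhaustive scan over all nodes, which is only practical for small networks. The overlap measure is the same assumed proxy as a summed visit density, and `greedy_origins`, `cutoff`, and `first` are hypothetical names.

```python
import numpy as np

def visit_density(W, origin, cutoff):
    """Occupation probabilities summed over t = 0, ..., cutoff (assumed proxy)."""
    p = np.zeros(W.shape[0])
    p[origin] = 1.0
    density = p.copy()
    for _ in range(cutoff):
        p = p @ W
        density += p
    return density

def greedy_origins(W, m, cutoff, first=0):
    """Add origins one at a time, each minimizing the overlap with those already chosen.

    The exhaustive argmin below stands in for the genetic-algorithm search.
    """
    n = W.shape[0]
    dens = np.array([visit_density(W, v, cutoff) for v in range(n)])
    chosen = [first]
    while len(chosen) < m:
        load = dens[chosen].sum(axis=0)  # combined density of the walkers placed so far
        cost = dens @ load               # overlap of every candidate with the chosen set
        cost[chosen] = np.inf            # never pick the same node twice
        chosen.append(int(np.argmin(cost)))
    return chosen

# Example: ring of twelve nodes (assumed toy network), two walkers.
n = 12
A = np.zeros((n, n))
for v in range(n):
    A[v, (v + 1) % n] = A[(v + 1) % n, v] = 1.0
W = A / A.sum(axis=1, keepdims=True)

origins = greedy_origins(W, m=2, cutoff=3, first=0)
print(origins)
```

On the twelve-node ring with the first walker at node 0, the second origin selected is the antipodal node 6.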

We use a special class of genetic algorithm, the mutation-only genetic algorithm (MOGA; not to be confused with the multi-objective genetic algorithm), which defines a more general formulation of mutation. Suppose the chromosomes, each with the same number of bits, are sorted in decreasing order of fitness. We assign a separate mutation probability to each locus of each chromosome. These probabilities form the mutation matrix, which has one row per chromosome and one column per locus: each entry is the probability of mutating the corresponding locus of the corresponding chromosome.
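Applying a mutation matrix to a binary population can be sketched in a few lines. This is an illustration under assumptions: the population is stored as a 0/1 array with one row per chromosome, already sorted by decreasing fitness, and the function name `mutate` and the example matrix are hypothetical.

```python
import numpy as np

def mutate(population, M, rng):
    """Flip bit (i, j) of the population with probability M[i, j]."""
    population = np.asarray(population)
    M = np.asarray(M, dtype=float)
    if population.shape != M.shape:
        raise ValueError("mutation matrix must match the population shape")
    flips = rng.random(population.shape) < M  # independent draw per locus
    return np.where(flips, 1 - population, population)

rng = np.random.default_rng(0)
population = np.array([[1, 0, 1, 1],   # fittest chromosome
                       [0, 0, 1, 0],
                       [1, 1, 0, 0]])  # least fit chromosome
# Assumed example: mutation probability grows with rank, uniform across loci.
M = np.array([[0.0] * 4, [0.5] * 4, [1.0] * 4])
children = mutate(population, M, rng)
print(children)
```

With this example matrix the fittest row is always kept unchanged and the least fit row is always fully flipped, while the middle row changes at random.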

The design of the chromosome representation requires a one-to-one correspondence between a set of fixed-length binary chromosomes and the set of nodes. However, if a chromosome encodes the degree and the clustering coefficient of a node, nodes with similar encodings are not contiguous on the network, so we cannot easily find the new node represented by a mutated chromosome. One way to solve this problem is to define the chromosome as the sequence of characters of the nodes on a specific path from a preset node. We then only need to search among nearest neighbors to find the new node corresponding to the mutated chromosome, and we obtain the candidate node from a path by picking its last node. A difficulty of this representation is that the chromosomes may have variable length. However, we can easily extend a chromosome by filling the remaining space with the characters of the last node in the path. In practice, we extend all chromosomes to a fixed length that is larger than the diameter of the network.
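The path-based representation can be illustrated with plain node labels instead of a binary encoding. This is a simplified sketch, and the helper names `pad_path`, `decode`, and `is_valid` are hypothetical. It shows the padding with the last node, the decoding of the candidate node, and a check that consecutive entries are neighbors.

```python
def pad_path(path, length):
    """Extend a path to a fixed length by repeating its last node."""
    if not path:
        raise ValueError("a path needs at least one node")
    if len(path) > length:
        raise ValueError("path is longer than the chromosome length")
    return list(path) + [path[-1]] * (length - len(path))

def decode(chromosome):
    """The candidate node is the last node of the path."""
    return chromosome[-1]

def is_valid(chromosome, adj):
    """Each step must stay in place (padding) or move to a neighbor."""
    return all(b == a or b in adj[a] for a, b in zip(chromosome, chromosome[1:]))

# Example: path network 0 - 1 - 2 - 3 given as an adjacency list (assumed toy input).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
chromosome = pad_path([0, 1, 2], length=5)
print(chromosome, decode(chromosome), is_valid(chromosome, adj))
```

Because a valid chromosome is a walk on the network, changing a late entry only requires looking at the neighbors of the preceding node, which is the locality property the representation is designed for.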

The design of the mutation matrix is special because our chromosomes represent paths. To achieve effective exploration and exploitation at the same time, we should mutate only the last several loci of a fit chromosome, which perturbs the end of the path and searches the area surrounding the current node, while mutating nearly all the loci of an unfit chromosome, so as to generate new paths that search regions far from the current solution. Since a specific locus has no fixed meaning, the mutation probabilities should not depend on the statistics of the loci. We define the mutation matrix as,

The special triangular form with each element being perturbs the ends of fit paths and uniformly generates new paths simultaneously. Numerical results verify that the special design of the chromosomes and the mutation matrix helps the genetic algorithm provide solutions close to the optimum in reasonable time.
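The qualitative shape of such a matrix can be sketched as a staircase. The entries of the paper's matrix are not recoverable from the text here, so the form below is purely an assumption for illustration: a chromosome of rank r (1 is the fittest) out of n has its last ceil(r L / n) loci mutated with probability one and the rest left alone. The name `staircase_mutation_matrix` is hypothetical.

```python
import numpy as np

def staircase_mutation_matrix(n_chrom, n_loci):
    """Assumed triangular 0/1 matrix: fitter rows mutate only their last few loci."""
    ranks = np.arange(1, n_chrom + 1)                      # 1 = fittest
    tail = np.ceil(ranks * n_loci / n_chrom).astype(int)   # loci mutated per row
    cols = np.arange(n_loci)
    return (cols[None, :] >= (n_loci - tail)[:, None]).astype(float)

M = staircase_mutation_matrix(4, 4)
print(M)
```

For four chromosomes of four loci this gives a lower-right triangle of ones: the fittest path has only its endpoint perturbed, and the least fit path is regenerated entirely.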

0.4 Numerical Results and Conclusions

We run the genetic algorithm on two kinds of random networks, Watts-Strogatz (WS) and Barabási-Albert (BA). First, we evaluate the performance of our algorithm by computing the fitness of the fittest chromosome in each generation, as plotted in Figure 1. We vary the size and the average degree of the BA networks and the rewiring probability of the WS networks. From Figure 1(a), we see that the average improvement of fitness within a fixed number of generations does not explicitly depend on the network size, but decreases with increasing average degree. From Figure 1(b), we find that the improvement strictly decreases with the rewiring probability. Note that a small rewiring probability in the WS model leads to a flat degree distribution and a large average path length. From these observations, we conclude that our algorithm performs better on networks with small average degree and large average path length. This may be because the GMFPTs of all possible sets of initial nodes in networks with large average degree and small average path length are sharply distributed, so that there is limited room for improvement.

(a) BA networks with size or , and average degree or .
(b) WS networks with size , average degree , and , , or
Figure 1: Normalized average fitness versus the number of generations when running the genetic algorithm on BA and WS networks. The fitness is normalized with respect to its initial value and averaged over 30 independent experiments.

Second, we compute the path lengths between all pairs of initial nodes produced by the algorithm. We expect every two initial nodes to be far from each other so that the total overlap is small. Consider the special case of a regular ring lattice, in which the optimal positions of the random walkers should be arranged with equal spacing on the ring so that the path lengths between them are approximately , where is the diameter. For WS networks with small rewiring probability, which are not far from regular ring lattices, we expect the distribution of path lengths of the set of initial nodes to also have a sharp peak at . This is verified by the numerical results shown in Figure 2.
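The pairwise path lengths used in this check can be computed with a breadth-first search from each selected node. The sketch below is a generic illustration on an assumed toy ring, and the names `bfs_distances` and `pairwise_path_lengths` are hypothetical.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Shortest path length from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def pairwise_path_lengths(adj, nodes):
    """Path lengths between all pairs of the given nodes."""
    return [bfs_distances(adj, u)[v] for u, v in combinations(nodes, 2)]

# Example: three equally spaced origins on a ring of twelve nodes.
n = 12
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
lengths = pairwise_path_lengths(ring, [0, 4, 8])
print(lengths)
```

For equally spaced origins on a ring, every pair is the same distance apart, so the histogram of these lengths collapses to a single value; a sharp peak in the measured histogram is the signature of a similar arrangement.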

(a) Path lengths between all pairs of initial nodes
(b) Path lengths between all pairs of nodes
Figure 2: Normalized path-length histograms of all pairs of nodes in the set of initial nodes produced by the algorithm and in the whole network. The algorithm runs on a WS network with size , average degree , and . The distribution in Figure 2(a) is an average over 20 experiments.

Finally, we calculate the percentage reduction of the GMFPT of the selected set of initial positions compared to the average. Table 1 summarizes the reduction of the GMFPT of BA networks with various and WS networks with various .

BA () WS (,)
GMFPT
Improvement 2.6% 1.5% 1.0% 4.9% 2.2% 1.8%
Uncertainty 0.6% 0.4% 0.4% 0.9% 0.5% 0.6%
Table 1: The percentage reduction of the GMFPT of initial nodes produced by the algorithm compared to the average. Each result is an average of 20 experiments.

We conclude that we have found a method to optimize the global mean first-passage time (GMFPT) by choosing the starting nodes of multiple random walkers that search for a general target in a complex network. As the GMFPT of multiple random walkers depends monotonically on the overlap, we can minimize the total overlap to reduce the GMFPT. We achieve this with the mutation-only genetic algorithm (MOGA), which finds the initial position of a new walker that has minimum overlap with the existing ones. Special forms of the chromosomes and the mutation matrix are introduced to achieve a balance between exploration and exploitation. Numerical experiments on WS and BA networks confirm the effectiveness of our method, as shown in Table 1. Our method may be useful in speeding up searches on large networks. We expect further improvement from a more sophisticated MOGA with crossover [8].

References

  • [1] R. Albert and A.L. Barabási. Statistical mechanics of complex networks. Rev. Mod. Phys., 74:47–97, Jan 2002.
  • [2] B.D. Hughes. Random walks and random environments. Vol. 1. Random walks. Oxford, 1995.
  • [3] S. Condamin, O. Bénichou, and M. Moreau. Random walks and Brownian motion. Phys. Rev. E, 75:021111, Feb 2007.
  • [4] J. D. Noh and H. Rieger. Random walks on complex networks. Phys. Rev. Lett., 92:118701, Mar 2004.
  • [5] H. W. Lau and K. Y. Szeto. Asymptotic analysis of first passage time in complex networks. EPL (Europhysics Letters), 90(4):40005, 2010.
  • [6] J. H. Holland. Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. University of Michigan Press, 1975.
  • [7] K. Y. Szeto and J. Zhang. Adaptive Genetic Algorithm and Quasi-parallel Genetic Algorithm: Application to Knapsack Problem, pages 189–196. Springer, Berlin, Heidelberg, 2006.
  • [8] N. L. Law and K. Y. Szeto. Adaptive genetic algorithm with mutation and crossover matrices. In 12th Intl. Joint Conf. on Artificial Intelligence (IJCAI’07), pages 2330–2333, 2007.
  • [9] D. G. Wu and K. Y. Szeto. Applications of genetic algorithm on optimal sequence for Parrondo games. In 6th Intl. Joint Conf. on Computational Intelligence (IJCCI’14), page 30, 2014.