Improved Bounds on Information Dissemination by Manhattan Random Waypoint Model

09/19/2018 ∙ by Aria Rezaei, et al. ∙ The University of Utah ∙ Stony Brook University ∙ California State University, Northridge

With the popularity of portable wireless devices, it is important to model and predict how information or contagions spread by natural human mobility, both for understanding the spreading of deadly infectious diseases and for improving delay tolerant communication schemes. Formally, we model this problem by considering $M$ moving agents, where each agent initially carries a distinct bit of information. When two agents are at the same location or in close proximity to one another, they share all their information with each other. We would like to know the time it takes until all bits of information reach all agents, called the flood time, and how it depends on the way agents move, the size and shape of the network and the number of agents moving in the network. We provide rigorous analysis for the Manhattan Random Way-point model (which takes paths with minimum number of turns), a convenient model used previously to analyze mobile agents, and find that with high probability the flood time is bounded by $O(N \log M \lceil (N/M) \log(NM) \rceil)$, where $M$ agents move on an $N \times N$ grid. In addition to extensive simulations, we use a data set of taxi trajectories to show that our method can successfully predict flood times in both experimental settings and the real world.


1. Introduction

It has always been an interesting research topic to understand human mobility and how contagions spread via such motion. One of the motivations is to understand how infectious diseases spread by moving agents. Think of a strain of an infectious virus such as SARS or Ebola. When two individuals are at the same location at the same time, there is a possibility for one to spread that strain of virus to the other. Therefore the spread of contagions in a population is highly dependent on the density of the population and how individuals move around. In another example, human motion can be used for our benefit. It has become common that people carry wireless devices around. Short range low cost wireless communication can be established at the contact events to allow energy efficient information exchange. In this case the mobility model influences how long it takes for a piece of information from one node to reach all nodes in the network.

There have been two main approaches to study human mobility in the literature: data-driven methods versus theoretical analysis. In recent years wireless technology has made it possible to collect a large amount of mobility data through wireless devices. There has been a lot of work on finding exciting patterns in human movements and information spreading in real-world data (Ding et al., 2015; Frias-Martinez et al., 2011; Bajardi et al., 2011). It has been shown that human mobility is immensely complex. Accurate models can be built using historic traffic data to predict agents’ locations and social ties (Song et al., 2010; Eagle et al., 2009). But these models each work only for a specific scenario. It is unclear how these models can help us analyze asymptotic behavior of moving agents or whether a model generalizes to a different geographical location, a different travel modality, or a different group of people. Furthermore, long-term mobility data can be identity revealing. Even with great efforts to anonymize the data and with removing big fractions of it, individuals are identifiable by their movement patterns. A seminal work revealed that four randomly selected points in an hourly location sequence of a person, recorded over 15 months via cellphone antennas, are enough to make that person identifiable among 1.5 million individuals (De Montjoye et al., 2013). As a result, mobility data sets are usually not published by companies due to concerns over user privacy, except for a few special cases of shared vehicles (taxis or shared bikes).

On the other hand, an extensive amount of work has been dedicated to mathematical models of mobility and their asymptotic behaviors. Although these models cannot compete with the accuracy of data-driven models in the presence of enough historic mobility data, they have been used for their rigorous analysis and their ability to predict future events with provable certainty. Over the years theoretical models have evolved from simplistic models, inspired by known physical phenomena in the real world, to more sophisticated ones that take into account the complexity of human mobility. We briefly review these models and their analytical results below. For a comprehensive survey of these models, refer to (Camp et al., 2002).

  • Random Walk: Perhaps the most studied movement model. Inspired by the movement of floating particles in a liquid or gas, called Brownian Motion (Feynman et al., 2013), in its simplest version an agent starts its movement in an arbitrary node in a given network. At each time step, the agent chooses one of the neighbors of its current node uniformly at random and moves to that neighbor. There are variations in which agents can rest in their current position for a period of time or use a non-uniform transition probability when choosing a neighbor.

  • Random Direction Model: In this model, an agent chooses a random direction and possibly a random velocity, then moves in that direction until it collides with the boundary of the network. The agent then chooses a new direction and velocity and continues as before.

  • Random Way-point: In this model, an agent, starting from an initial position in the network, chooses the next destination from all nodes in the network uniformly at random. Then, using one of the shortest paths, moves towards the destination and after reaching it finds a new destination with the same method. This has been widely used in modeling human motion and in many prior simulations for mobile networks (Hu and Johnson, 2000; Johnson and Maltz, 1996; Perkins et al., 2001; Broch et al., 1998; Boudec and Vojnovic, 2006).

  • Manhattan Random Way-point: This is a special case of the Random Way-point model (Boudec and Vojnovic, 2006). In grid (or torus) networks, agents move to the destination with as few turns as possible. Thus they travel first horizontally and then vertically (or vice versa) to the destination. This model is inspired by the fact that in urban streets, turning can be time-consuming (Crescenzi et al., 2009; Clementi et al., 2011).

  • Lévy Walk: Studies on intelligent moving agents, especially humans, have revealed that the distance to the next destination chosen by such agents seems to follow a fat-tailed distribution (González et al., 2008; Barabási, 2005; Viswanathan et al., 2002). A popular movement model in this category is called Lévy Walk (Shlesinger et al., 1999). This model is similar to Random Way-point, but instead of choosing a new destination uniformly at random, an agent chooses a node as its next destination with probability proportional to the inverse of its distance from the agent’s current position, raised to some power $\alpha$.
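
To make the last rule in the list concrete, the following is a minimal sketch of the destination distribution for a Lévy Walk on a torus. The function names, the use of Manhattan distance on the torus, the exclusion of the current node, and the exponent name `alpha` are assumptions made for illustration; they are not taken from the cited works.

```python
def torus_distance(a, b, n):
    """Manhattan distance between nodes a and b on an n x n torus."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return min(dx, n - dx) + min(dy, n - dy)


def levy_destination_probs(cur, n, alpha):
    """Probability of each node being the next destination.

    Each node other than `cur` gets weight distance**(-alpha);
    excluding `cur` itself is an assumption of this sketch.
    """
    weights = {}
    for x in range(n):
        for y in range(n):
            d = torus_distance(cur, (x, y), n)
            if d > 0:
                weights[(x, y)] = d ** (-alpha)
    total = sum(weights.values())
    return {node: w / total for node, w in weights.items()}
```

With `alpha = 0` every other node is equally likely, which recovers the uniform choice of Random Way-point; larger values of `alpha` concentrate the probability on nearby nodes, which moves the behavior toward that of a Random Walk.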

1.1. Our Contributions

In this paper we provide improved upper bounds on the rate of information dissemination when agents move according to the Manhattan Random Way-point model. Through a series of simulations we show that, combined with bounds on Random Walk, our bounds lead to a new conjecture on the time it takes for information to disseminate through a network when agents’ movements follow the Lévy Walk model, a challenging question that remains widely open. Finally, we report the results of a series of experiments we have designed, which show that our model is capable of predicting trends in experimental settings, as well as in real-world data.

We formally define our problem as follows. Consider a set of $M$ autonomous agents, each starting at time 0 with a unique bit of information, at a node selected uniformly at random in an $N \times N$ torus (a grid where nodes in the boundaries are each connected to their corresponding node in the opposite boundary). Agents all follow the same movement model and they share information with each other when they meet during their move. Meeting is defined as being at the same location at the same time, where the location can be inside a node or on an edge between two nodes. Agents start their movement at the beginning of each time step in a synchronized fashion. We consider uniform speed for all agents. (In some previous definitions of Random Way-point the agents choose their speed uniformly at random from a range, but this choice leads to the average moving speed decreasing over time (Yoon et al., 2003); also, in reality vehicles and pedestrians often move with a fixed speed.) For simplicity, the transmission radius is practically zero, as agents have to be collocated in order to pass along information. However, our findings can be extended to an arbitrary constant transmission radius.

In the above setting, we are interested in finding bounds on the time it takes until every agent finds out about every piece of information. This value is called the flood time ($T_F$). An equally important statistic is the time it takes for all agents to learn a specific bit of information, called the broadcast time and denoted by $T_B$. Clearly $T_B \le T_F$. Using the union bound, any upper bound on $T_B$ extends to $T_F$, too, if the probability of the bound occurring is sufficiently high (the failure probability for a single bit must remain small after being multiplied by the number of bits, $M$). Both $T_B$ and $T_F$ are important statistics in various applications. In the case of disease spreading, $T_F$ corresponds to the time when an infection of any agent would have been passed to the entire population. In a delay-tolerant wireless mobile network, $T_F$ corresponds to the time in the past from which we can assume all information has been shared across the network; we can predicate the start of a new protocol on the assumption that all agents are up to date after this delay. In mobile social networks, $T_B$ corresponds to the time it takes a new piece of information to permeate society. Overall, both $T_B$ and $T_F$ are important statistics which capture information flow in a network, and will be the focus of our study.
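
The setting above can be sketched as a small simulation. This is an illustrative sketch under stated assumptions, not the simulator used in the experiments: all names are hypothetical, a re-drawn destination may coincide with the current node (in which case the agent idles for one step), and a meeting on an edge is modeled as two agents swapping adjacent nodes in the same step.

```python
import random


def ring_step(cur, dst, n):
    """One unit step from cur toward dst on a ring of n nodes, the shorter way round."""
    if cur == dst:
        return cur
    forward = (dst - cur) % n
    return (cur + 1) % n if forward <= n - forward else (cur - 1) % n


def simulate_flood_time(n, m, seed=0, max_steps=1_000_000):
    """Steps until all m agents on an n x n torus know all m bits (None if not reached)."""
    rng = random.Random(seed)
    pos = [(rng.randrange(n), rng.randrange(n)) for _ in range(m)]
    dest = list(pos)              # forces a fresh destination on the first step
    horiz_first = [True] * m
    info = [1 << i for i in range(m)]   # bitmask of bits known by each agent
    full = (1 << m) - 1

    def share(group):
        merged = 0
        for i in group:
            merged |= info[i]
        for i in group:
            info[i] = merged

    for t in range(max_steps + 1):
        # Agents standing on the same node exchange everything they know.
        by_node = {}
        for i, p in enumerate(pos):
            by_node.setdefault(p, []).append(i)
        for group in by_node.values():
            share(group)
        if all(bits == full for bits in info):
            return t

        old = list(pos)
        for i in range(m):
            if pos[i] == dest[i]:
                dest[i] = (rng.randrange(n), rng.randrange(n))
                horiz_first[i] = rng.random() < 0.5
            (x, y), (dx, dy) = pos[i], dest[i]
            # Finish one coordinate completely before turning to the other.
            if (horiz_first[i] and x != dx) or y == dy:
                x = ring_step(x, dx, n)
            else:
                y = ring_step(y, dy, n)
            pos[i] = (x, y)

        # Two agents that swapped adjacent nodes met on the edge between them.
        for i in range(m):
            for j in range(i + 1, m):
                if pos[i] == old[j] and pos[j] == old[i] and pos[i] != old[i]:
                    share([i, j])
    return None
```

Information exchanged on an edge is only registered at the next integer step, so the returned value can exceed the true flood time by at most one step. Averaging `simulate_flood_time(n, m, seed=s)` over several seeds gives an empirical estimate of the flood time for a given pair of grid size and agent count.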

Our contributions are as follows:

  • We find a new upper bound for the broadcast time $T_B$ and the flood time $T_F$ that is tight for a wide range of settings. Specifically, when agents move on an $N \times N$ grid with torus topology, we show $T_F = O(N \log M \lceil (N/M) \log(NM) \rceil)$. This bound improves upon recent upper bounds for topologies with more complex boundary conditions.

  • We analyze the relation between Random Walk, Manhattan Random Way-point and Lévy Walk. Through simulations, we show empirically that the Lévy Walk model can be understood by carefully interpolating between the heavily studied Random Walk model and our new results on the Manhattan Random Way-point model.

  • We validate the theoretical bounds in a number of empirical studies using simulated scenarios as well as bike and taxi trajectory data sets.

2. Background and Related Works

Since the advent of social networks, researchers have studied how information (e.g., rumors or viral videos) spreads through a network (Easley and Kleinberg, 2010; Gomez Rodriguez et al., 2010; Kempe et al., 2003). The rate of change in these networks is slow enough that they can be considered static throughout the course of information dissemination. This assumption simplifies the mathematical models tremendously. These studies do not fit dynamic mobile networks, due to the high rate of topology changes. The spreading behavior heavily depends on how agents move (Kleinberg, 2007).

For analyzing the mobility models mentioned earlier, arguably the simplest model for a geographically spread network is a 2D grid. In some metropolitan settings the downtown area is a reasonable grid. Wrapping the grid around to a torus has been commonly adopted in prior papers that analyze information spreading in a mobile network. The benefit of the torus model is to get rid of the boundary effect, thus simplifying the analysis. The torus also removes the boundary effect in the Random Way-point model, which often caused unwanted artifacts in simulations (Chu and Nikolaidis, 2002; Bettstetter, 2001; Royer et al., 2001).

For direct comparison, we review prior results in the same format as ours, where $N$ and $M$ are decoupled and there are no assumptions on their ratio, where the transmission radius is near zero, and where each move consists of walking along a path from source node to destination node, rather than jumping to the destination instantaneously.

The Random Walk movement model has been extensively studied and tight bounds on many characteristics have been fully resolved (Dimitriou et al., 2006; Kesten and Sidoravicius, 2005; Lawler, 2013). One of the tightest bounds for Random Walk is given by Pettarin et al. (Pettarin et al., 2011). They found that even with a very small transmission radius, broadcast time does not depend on the relation between the mobility speed and the transmission radius. They prove that with high probability (w.h.p.):

(1)

Keeping track of a single bit of information, they first divide the initial grid into smaller cells and find the time by which an arbitrary cell is infiltrated by an agent carrying that bit. Then they show that this infiltrating agent will inform the majority of agents near this cell and spread the information locally. After the local spread is done within the cell, information leaks into adjacent cells and this whole process is repeated. Ultimately, every cell is infiltrated and every agent in the network finds out about the bit.

Clementi et al. (Clementi et al., 2013) have proved bounds for the flood time in Manhattan-like grids where agents move according to the Manhattan Random Way-point model. Their setting slightly differs from ours as they do not employ boundary loops, which results in two zones with very different traffic of agents. On the one hand, nodes in the central zone are visited by agents across all nodes in the grid with high probability. On the other hand, the periphery, called the suburb areas, is starved of agents: the probability that an agent passes through these areas is significantly lower than that of the central zone. They prove a w.h.p. bound that depends on the transmission radius and the agents’ speed. We can rewrite this in our setting as:

(2)

They found that the bulk of the Flooding Time is devoted to carrying the information to the “suburbs,” as it requires a flow of informed agents traveling from the central zone to the suburbs.

A different line of work focused on solving the problem in a general platform, oblivious to geometric considerations. Clementi et al. (Clementi et al., 2015) have derived an upper bound for the flood time in which the mixing time of the Markov chain corresponding to agent movements (the time needed for a Markov chain to reach its stationary distribution, starting from an arbitrary distribution), as well as how independent the collisions between different pairs are, play a major role. Applying the Manhattan Random Way-point model to their general bound yields a bound that depends on the maximum speed of any agent. Rewritten in our setting, the bound is:

(3)

Since their method is very general, their bound is not competitive with the bounds on specific movement models and networks. Table 1 shows that our bound is a significant improvement over these bounds on different movement models.

Table 1. Recent bounds on Manhattan Random Way-point (MRWP), Random Way-point (RWP) and Random Walk (RW).

Authors                                         Model   Bound
Ours                                            MRWP    $O(N \log M \lceil (N/M) \log(NM) \rceil)$
Clementi et al. 2010 (Clementi et al., 2013)    MRWP
Clementi et al. 2015 (Clementi et al., 2015)    RWP
Pettarin et al. 2011 (Pettarin et al., 2011)    RW

Rigorous analysis of the Lévy Walk model is much more challenging due to the strong spatiotemporal correlation (Birand et al., 2011; Shinki et al., 2017; Lee et al., 2011). The most relevant work is done by Wang et al. (Wang et al., 2014) where the authors have analyzed the distribution of the minimum time needed until a piece of information reaches a certain region. However, to the best of our knowledge, there are no bounds for when the agents are moving according to a Lévy Walk.

Besides theoretical work, there has been a lot of empirical analysis of how information can spread through opportunistic peer communication among mobile agents using simulations or empirical evaluations. Protocols for reducing communication cost (e.g., to avoid a message being delivered to a node multiple times) have been studied extensively (Chu et al., 2002; Intanagonwiwat et al., 2000; Peng and Lu, 2000). Lastly, there has also been work on a model assuming that a supporting static wireless network is in place, which helps to cache and propagate these events (Zhou and Gao, 2009). This model is different from ours.

3. Bounds on Flood Time

In this section, we present and prove our main theoretical results on the flooding time on torus networks. We start with a trivial lower bound.

Theorem 3.1.

For $M$ agents initially positioned at uniformly random nodes and moving with constant speed in an $N \times N$ torus, with constant probability, we have:

$T_F \ge T_B = \Omega(N)$ (4)
Proof.

Let $I_a$ be the bit of information initially carried by agent $a$, and $b$ the farthest agent from $a$ at time 0. The time it takes until $I_a$ reaches $b$ is a lower bound for both the broadcast time and the flood time. Since we are assuming uniformly random initial positions, with constant probability the distance between $a$ and $b$ is $\Omega(N)$. Also note that information can only travel as fast as agents can. This means that at each time step the distance between $b$ and the closest copy of $I_a$ is reduced by at most 2 units. As a result, with constant probability, the time until $I_a$ reaches $b$ is $\Omega(N)$, which completes the proof. ∎

Now we move forward to our main theorem on the upper bound for Manhattan Random Way-point model.

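Theorem 3.2.

For $M$ agents moving according to the Manhattan Random Way-point model with constant speed in an $N \times N$ torus, with high probability (that is, with probability at least one minus an inverse polynomial, for some constant exponent), the flood time $T_F$ satisfies:

$T_F = O(N \log M \lceil (N/M) \log(NM) \rceil)$ (5)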

We present the proof in two parts. The first part analyzes when and with what probability two agents have a collocation event so they can share information. This involves three necessary conditions: time-wise, the moves made by the two agents must overlap; geometrically, their trajectories must intersect; and finally, the agents must arrive at the intersection at the same time. The second part analyzes the global property: how information sharing enabled by collocation events leads to global dissemination. The next two subsections focus on these two parts of the proof respectively.

3.1. Bounding Collocation Probability

We consider agents moving non-stop following the Manhattan Random Way-point model, and partition the mobility trace of each agent into moves between randomly chosen destinations. First, observe that any move by an agent following the Manhattan Random Way-point model takes $\Theta(N)$ time with constant probability. We call the two straight parts of a movement, one horizontal and the other vertical, segments. Note that any move in this model can have at most two segments of different directions appearing in an arbitrary order.

We say two agents have a connection during their moves if they are at the same location at the same time. For that to happen, the moves of the two agents must at least overlap over time.

Definition 3.3.

Strongly overlapping moves are moves made by two agents where the time interval of a segment of an agent’s move is completely contained within the time interval of the other agent’s whole move.

Note that this overlap only needs to happen in the time interval of two moves and no condition is imposed on their geometric locations.
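
The definition can be checked mechanically. The sketch below assumes a hypothetical representation in which a move is a triple `(start, turn, end)` of times, so that its two segments occupy the intervals `[start, turn]` and `[turn, end]`; segments of zero duration are ignored.

```python
def segments(move):
    """Time intervals of the (at most two) segments of a move with positive duration."""
    start, turn, end = move
    return [s for s in ((start, turn), (turn, end)) if s[1] > s[0]]


def contains(outer, inner):
    """True if the time interval `inner` lies completely inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def strongly_overlap(move_a, move_b):
    """True if a segment of one move lies within the whole time span of the other move."""
    whole_a = (move_a[0], move_a[2])
    whole_b = (move_b[0], move_b[2])
    return (any(contains(whole_b, s) for s in segments(move_a))
            or any(contains(whole_a, s) for s in segments(move_b)))
```

For example, `strongly_overlap((0, 4, 10), (2, 5, 12))` holds because the second segment of the first move, spanning times 4 to 10, lies inside the span 2 to 12 of the second move. The check uses only time intervals, in line with the note that no condition is imposed on geometric locations.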

Lemma 3.4.

Every move $m_a$ of an agent $a$ strongly overlaps with at least one of the moves of another agent $b$, say $m_b$. The starting moments of the two moves can be at most $N$ time-steps apart and, with constant probability, a segment of $m_a$ will have a time duration overlap of $\Omega(N)$ with a segment of $m_b$.

Proof.

Let $m_a$ be the move of agent $a$ under consideration. First consider the move $m_b$ of the other agent $b$ that is still active when $m_a$ starts. If $m_b$ ends after the first segment of $m_a$, then $m_b$ covers the first segment of $m_a$ and the two moves strongly overlap (Figure 1 case (i)). If not, then consider the next move of $b$, $m'_b$. If $m'_b$ ends after $m_a$ ends, $m'_b$ completely covers the second segment of $m_a$ and the two moves strongly overlap (Figure 1 case (ii)). If not, then $m'_b$ must be completely covered by $m_a$ and the two moves strongly overlap (Figure 1 case (iii)). We have shown that a strong overlap occurs. Since the time duration of each move can be at most $N$, the starting point of $m_a$ is at most $N$ time-steps apart from the starting points of both $m_b$ and $m'_b$.

Figure 1. Any move of one agent strongly overlaps with at least one of the moves of another agent.

Without loss of generality, assume a segment $s_a$ of $m_a$ is covered completely by $m_b$. With constant probability, the time duration of $s_a$ (equal to its length due to the constant speed assumption) is $\Omega(N)$. Now take the segment of $m_b$ with the longest time duration overlap with $s_a$, and denote it by $s_b$. Since $m_b$ has at most two segments and covers $s_a$, this overlap is at least half the duration of $s_a$. As a result, with constant probability, the time duration of the overlap between $s_a$ and $s_b$ is $\Omega(N)$, as required. ∎

Figure 2. The segments are trimmed to the duration of their overlap, $\ell$. Left: The collocation event of two agents happens if the distances between the intersection point and the starting points of both segments are equal. Right: Connection happens if there is a non-zero overlap between two segments, and there are $\Theta(\ell)$ such placements for the blue segment if the red segment is fixed.

Lemma 3.5.

If two agents $a$ and $b$ have two segments $s_a$ and $s_b$ with a time interval overlap of $\ell$, they meet with probability $\Theta(\ell / N^2)$.

Proof.

We trim both segments to the duration of their overlap, making them of equal length $\ell$. All cases of a connection between the two agents can be reduced to the three main cases below by rotating the torus or swapping the agents:

  1. $s_a$ is horizontal and $s_b$ vertical. Here, the two agents connect if the two segments intersect geometrically at a point $p$ and $p$ is at equal distance from the starting points of $s_a$ and $s_b$ (see Figure 2, left). Since the network is a torus, we can fix the position of $s_a$ in our analysis. Out of all $N^2$ possible placements of $s_b$ on the torus, there are $\Theta(\ell)$ placements that result in an intersection that meets the above condition. Hence the probability of a connection between $a$ and $b$ is $\Theta(\ell / N^2)$ in this scenario.

  2. Both $s_a$ and $s_b$ are horizontal and in opposite directions. In this case, any geometric intersection is enough for a connection to happen (see Figure 2, right). If both agents move in the same row, there would be $\Theta(\ell)$ placements of $s_b$, once the position of $s_a$ is fixed, that result in an intersection. The probability of both agents moving in the same row is $1/N$, which makes the overall probability $\Theta(\ell / N^2)$.

  3. Both $s_a$ and $s_b$ are horizontal and in the same direction. In this case, the starting points of $s_a$ and $s_b$ have to be in the same exact node, which happens with probability $1/N^2$.

Based on the possible directions of each segment, there are 16 possible cases for $s_a$ and $s_b$, all of which happen with equal probability and can be reduced to one of the cases above. The overall probability of a connection between two agents can be bounded as:

(6)
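
The counting argument for the first case (one horizontal and one vertical segment) can be verified by brute force on a small torus. The sketch below is illustrative only: the function name is hypothetical, the horizontal agent is fixed to start at node (0, 0) moving right, the vertical agent moves up, and both are assumed to start at the same time and move for `length` unit steps.

```python
def count_meeting_placements(n, length):
    """Count start nodes of the vertical segment that lead to a meeting.

    Agent a starts at (0, 0) and moves right; agent b starts at (x, y)
    and moves up. They meet if they occupy the same node at the same step.
    """
    count = 0
    for x in range(n):
        for y in range(n):
            if any((t % n, 0) == (x, (y + t) % n) for t in range(length + 1)):
                count += 1
    return count
```

For `length < n` the count is `length + 1` out of `n * n` placements, so the meeting probability in this case grows linearly with the overlap and is inversely proportional to the number of nodes. Perpendicular segments can only share a node, never an edge, so ignoring edge meetings loses nothing here.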

An immediate consequence of Lemmas 3.5 and 3.4 is:

Lemma 3.6.

Two agents with strongly overlapping moves have a connection with probability $\Theta(1/N)$.

Proof.

For the probability of two strongly overlapping agents connecting we have:

(7)

where is the probability that the time duration of the overlap between segments of and is . From Lemma 3.4 we know that . As a result we can rewrite (7) as:

The upper bound in (6) yields:

3.2. Bounding the Flood Time

Given the probability of two agents sharing information during their moves, we are now ready to argue for how information propagates to the entire network. For simplicity, we find an upper bound for the broadcast time (one specific message reaching everyone) and extend it to the flood time (all messages reaching everyone). There are two issues that we need to address. First, the positions of an agent are temporally correlated—but fortunately, if sufficiently far apart in time, the positions of agents are independent which will help to simplify our analysis. Second, we need to track agents who have been informed and who have not, and analyze how information spreads from the informed ones to the uninformed ones.

Lemma 3.7.

The locations of an agent at two times that are $2N$ steps apart are independent of each other.

Proof.

Let the sequence of destinations chosen by an agent $a$ be $d_1, d_2, \ldots$. Observe that regardless of what $d_i$ is, every node in the torus (including $d_i$) has the same probability of being $d_{i+1}$. Also, nodes visited between two consecutive destinations are only dependent on those two destinations. Let $p_t$ and $p_{t+2N}$ be the positions of $a$ at time $t$ and $t + 2N$. Since the agents move along the shortest path towards their destination, a move is at most $N$ steps long. As a result, at least two destinations will be visited between time $t$ and $t + 2N$ by $a$. Take the destination immediately after $p_t$, $d_j$, and the destination immediately before $p_{t+2N}$, $d_k$. As noted above, $p_t$ is only dependent on $d_{j-1}$ and $d_j$, while $p_{t+2N}$ is only dependent on $d_k$ and $d_{k+1}$, where $d_j$ and $d_k$ are distinct entries of the sequence. This makes $p_t$ and $p_{t+2N}$ completely independent. ∎

As mentioned earlier, we track the spread of a single bit of information among the agents. We first divide the time span of the whole process into windows of size time steps, called cycles. The cycle starts at time and ends at time . Each agent would visit at least destinations between time and , which yield at least moves independent of other cycles in an agent’s trajectory (note the margin between selected moves in each cycle, which guarantees independence). According to Lemma 3.4, we know that a move by an agent will strongly overlap with a move by another agent and the starting point of the two moves are at most time steps apart. As a result, for each pair of agents and each cycle, we can find two strongly overlapping moves independent of other cycles, which according to Lemma 3.6 have a chance of connection, where is a constant. This essentially makes collocation events between a fixed pair of agents in different cycles i.i.d.

We now divide the whole process into consecutive epochs. Each epoch consists of a number of cycles (determined below), and starts when we have a set of agents who know about the tracked bit (referred to as informed agents) and a set of agents who do not (referred to as uninformed agents). An epoch ends when the number of informed agents doubles, or the number of uninformed agents drops to zero. Since the number of informed agents doubles in every epoch, there are at most $\lceil \log_2 M \rceil$ epochs. For the broadcast time of the tracked bit, we can write:

(8)

We now find the number of cycles needed so that, w.h.p., each informed agent is paired with a distinct uninformed agent during the epoch. By artificially forcing informed agents to find distinct partners, we only slow down the process of information spread, and an upper bound found in this manner is valid as an upper bound for the main problem. The reason we require distinct partners for each informed agent is to ensure that connections between different pairs are independent.

The probability of an informed agent connecting to any uninformed agent is as follows (the arrow shows the direction of information exchange):

(9)

Let there be an arbitrary order for agents in and one for agents in . The first informed agent can match to any of the uninformed agents. After the first matching is done, there will be potential matches for the second informed agent and so on. Assuming that there are pairs at the end of this epoch () and using Equation (3.2), the probability of this happening (i.e., having pairs of matched informed/uninformed agents) is:

For the above to happen w.h.p., for , we need:

The last step is due to the fact that . To make sure that is a non-zero integer and we have at least one cycle, we set:

(10)

Substituting (10) into (8), we have:

(11)
(12)

Since our bound for the broadcast time holds with arbitrarily high probability, it extends to the flood time using the union bound. This completes the proof of Theorem 3.2.

This is a fairly tight bound. Consider a semi-dense scenario where $M = \Omega(N \log N)$, so that the ceiling term in the bound is a constant: our bound becomes $O(N \log M)$, which nearly meets the trivial lower bound of Equation (4).
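
To see how the bound behaves across densities, the sketch below evaluates its shape numerically. It assumes the bound reads $N \log M \lceil (N/M) \log(NM) \rceil$ as stated in the abstract, drops the hidden constant, and uses natural logarithms; the function name is hypothetical.

```python
import math


def flood_bound_shape(n, m):
    """Shape of the upper bound, N * log(M) * ceil((N / M) * log(N * M)), constants dropped."""
    ceil_factor = math.ceil((n / m) * math.log(n * m))
    return n * math.log(m) * ceil_factor
```

For `n = 10` and `m = 1000` the ceiling factor equals 1 and the value reduces to `n * log(m)`, the dense regime discussed above. For `n = 100` and `m = 10` the ceiling factor is 70, which shows how strongly sparsity inflates the bound.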

Compared to the previously mentioned bounds for the Random Way-point model in (2) and (3), our bound is stronger. For the same movement model, although under slightly different network assumptions, Clementi et al. found the bound in (2) (Clementi et al., 2013); the gap is mainly due to their choice of not having a torus, as they intended to study the impact of rarely visited areas on the total information spread time. Our bound also improves that of (Clementi et al., 2015) by a large margin. This can be attributed to the fact that their method is a general framework for finding an upper bound on the flood time. Our version of the Manhattan Random Way-point model assumes that agents complete their move in one coordinate and then start moving in the other. As suggested in (Clementi et al., 2015), this assumption increases the probability of connection between two agents, which in turn leads to a better upper bound for the broadcast and flood times. Further, the Manhattan Random Way-point model is a better fit for mobility in urban areas, as it implicitly incorporates the cost of turning during movement.

Depending on the application, one can think of various extensions to our model, such as an arbitrary transmission radius or a random waiting time between two consecutive moves of an agent. In this case, an approach similar to ours can be adopted to find a bound for the flood time if these three components are available: (1) suitably sized independent time windows (called cycles here), (2) a guarantee of a long enough time interval overlap between segments of moves by two agents, and (3) the probability of connection between those two segments.

4. Experiments

In this section, through a series of experiments we test the accuracy of our discovered bound. First, we test our model where agents are moving in a torus-like grid, following Manhattan Random Way-point model. Next, using bike sharing records in 3 major cities, we create synthetic trajectories in real-world road networks, and compare simulated behavior of flood time against our model. Finally, using GPS traces of taxis in a major city, we verify our model against a real-world case of information dissemination via mobile agents.

Figure 3. (left) The sigmoid-like behavior of Lévy Walk, sandwiched by Random Walk and Manhattan Random Way-point. (middle, right) Our bound accurately captures the changes in flood time as $M$ and $N$ are varied while other parameters are fixed.

4.1. Simulated Movements in a Grid

We simulate the movement of agents in a torus-like grid following a Manhattan Random Way-point model, and compare the average flood times against our bound. First, we fix $N$ and find the average flood time for a range of values of $M$ by averaging over multiple realizations. To fit the resulting values to our bound, we use a function with the same form as the bound in which each coefficient is a positive constant, accounting for the fixed $N$ and the constants in our asymptotic analysis. The results of the simulation along with the fitted values are shown in Figure 3, middle. Our bound accurately captures the changes in flood time, as reflected by the coefficient of determination, $R^2$, of the fit. Next, using a similar procedure, we fix $M$ and find the average flood time for a range of values of $N$. We fit the values to a function of the same form, where again each coefficient is a positive constant. The simulation results and the fitted values are depicted in Figure 3, right. As expected, the fit is good here too, yielding the same $R^2$ (the equality between the two values is coincidental).
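
The fitting step can be sketched as follows. This is a simplified stand-in, not the fitting function used above: it assumes the shape of the bound has already been evaluated at each parameter value (the `xs`), fits only a scale and an offset by ordinary least squares, and reports the coefficient of determination. All names are hypothetical.

```python
def fit_scale_and_offset(xs, ys):
    """Ordinary least squares for ys ~ a * xs + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination of `predicted` against the observations `ys`."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot
```

Here `xs` would hold the value of the bound's shape for each simulated setting and `ys` the measured average flood times; an $R^2$ close to 1 indicates that the shape of the bound explains most of the variation in the measurements.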

4.2. Simulated Movements in Real Networks

Figure 4. Actual (empty circles) and fitted (dotted lines) flood times for two movement policies in 3 cities.

To test our model against non-grid networks, we use the bike rental records of 3 major US cities (Citi Bike, 2017; Capital Bikeshare, 2017; Hubway, 2017). Each city has a unique road network and a set of fixed stations, which are used as the set of possible destinations each agent can choose from. The goal here is to test our model against a setting beyond the grid network and uniformly random selection of destinations. The data sets include an origin and a destination station for each trip made. Using these records, we can estimate the probability of choosing a destination station given that an agent is currently positioned in a particular station, called the transition probability between the two stations. We can also calculate the probability of initiating a trajectory from any given station, from here on called the initiation probability. To gradually move our tests away from the theoretical settings, we use the following movement models throughout our simulations:

  1. Similar to Section 4.1, we select each station in the sequence of stations visited by an agent uniformly at random. This is equivalent to the Random Way-point model and denoted by RWP in this experiment.

  2. Next, we use the calculated transition and initiation probabilities to build synthetic trajectories. We call this model DATA throughout this experiment.
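The DATA model above can be illustrated with a short sketch. It assumes that both probabilities are estimated as simple relative frequencies of the trip records, and that an agent reaching a station with no recorded outgoing trips restarts from the initiation distribution; neither choice is specified in the text, and the function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def estimate_probabilities(trips):
    """Estimate initiation and transition probabilities from trip records.

    trips is an iterable of (origin_station, destination_station) pairs.
    """
    starts = Counter()
    moves = defaultdict(Counter)
    for origin, dest in trips:
        starts[origin] += 1
        moves[origin][dest] += 1
    total = sum(starts.values())
    # Assumption: initiation probability = share of all trips starting here.
    initiation = {s: c / total for s, c in starts.items()}
    transition = {
        s: {d: c / sum(dests.values()) for d, c in dests.items()}
        for s, dests in moves.items()
    }
    return initiation, transition


def sample_station_sequence(initiation, transition, length, rng):
    """Draw a synthetic sequence of visited stations of the given length."""
    stations = list(initiation)
    start_weights = [initiation[s] for s in stations]
    current = rng.choices(stations, weights=start_weights)[0]
    sequence = [current]
    while len(sequence) < length:
        options = transition.get(current)
        if not options:
            # Assumption: no outgoing trips recorded, so start a new trajectory.
            current = rng.choices(stations, weights=start_weights)[0]
        else:
            dests = list(options)
            current = rng.choices(dests, weights=[options[d] for d in dests])[0]
        sequence.append(current)
    return sequence
```

Replacing the estimated probabilities with uniform ones over all stations recovers the RWP model of item 1.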

In each experiment, after selecting a sequence of stations visited by each agent, we find the shortest paths between consecutive stations using Routino (Bishop, 2017) and OpenStreetMap (OpenStreetMap contributors, 2017) extracts. Two agents connect if at any time they are within a fixed transmission radius of each other. Finally, for each of these experiments, we iterate over a range of values of the number of agents and report the average flood time, aggregated over 25 realizations.

Using a function similar to the one in Section 4.1, we can fit the simulation results to our bound. Figure 4 shows the actual and fitted values for the 3 cities, along with the $R^2$ value of the fit. In these simulations, our bound closely approximates the flood time, even when the movement policy used is data-specific rather than the Random Way-point model. The $R^2$ values are compared in Table 2. The fit is good across all settings, which shows the flexibility of our model to variations of network and movement policy. Note that here the network was a real-world road network, and far from a torus.
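For reference, the fitting score used throughout this section can be computed as in the sketch below. The one-coefficient least-squares fit and the `shape` argument are illustrative assumptions: `shape` stands in for whatever functional form is being fitted and is not the expression of our bound.

```python
def fit_scale(xs, ys, shape):
    """Least-squares coefficient c for the model y = c * shape(x)."""
    g = [shape(x) for x in xs]
    return sum(y * gi for y, gi in zip(ys, g)) / sum(gi * gi for gi in g)


def r_squared(actual, fitted):
    """Coefficient of determination between observed and fitted values."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

A score of 1 means the fitted curve reproduces every observed average flood time exactly, while a score of 0 means it does no better than predicting their mean.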

Many different factors can contribute to the flood time in networks as complicated as urban maps, and these are beyond the scope of this study. Here, we tried to explore the limits of our model's prediction capabilities by tweaking the settings of the experiment in a controlled manner. Further investigation into the effects of structural properties of road networks, and of different distributions of frequent origins and destinations, on the flood time is needed to fully understand the process of information dissemination by human mobility in real road networks.

City RWP DATA
Boston
New York City
Washington, D.C.
Table 2. Fitting score, $R^2$, for all movement policies in all cities.

4.3. Real-World Data

Figure 5. Actual (empty circles) and fitted (dotted lines) average flood times for Shenzhen.

Next, we try to fit our model to real-world GPS traces. Ideally, one may want to experiment on personal trajectories, as the behavior of a single moving agent is best understood by looking at individuals' mobility traces. However, due to the sensitivity of such data, large and high-quality data sets containing personal mobility traces are extremely rare. As a substitute, we can study the mobility of shared vehicles, as we did in the previous section. Here, we study GPS traces of taxi cabs in the city of Shenzhen in China (Ding et al., 2015). Over the course of 24 hours, the locations of the taxis are sampled at a fixed interval. We set a fixed transmission radius and, for simplicity, assume that connections can happen only at the sampled points in time. To generate different numbers of moving agents, we have to subsample from the set of all taxis. Since these trajectories are fixed, we cannot extend them in the event that the information does not flood. Hence, we filter out those taxis that meet fewer than a threshold number of distinct taxis during the whole 24 hours and keep the rest. We iterate over different numbers of agents, each time finding the flood time in hours, and report the average over multiple realizations for each number. We follow the same procedure as before to fit the simulation values to our bound. The results and the fitted line are shown in Figure 5. This shows that our model is capable of predicting flood times for real-world scenarios to some degree. It is worth noting that the real-world experiments did not have significant fluctuations in the flood time and, similar to the controlled experiments in the two previous sections, show smooth behavior, even with only hundreds of moving agents in some cases.
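The flood time computation on sampled traces can be sketched as follows. The sketch assumes that positions are already projected to planar coordinates in meters, that all traces are aligned on the same sample times, and that information travels a single hop per sample (an agent receives only what its neighbors held before that sample). These are simplifying assumptions for illustration, and `flood_time_from_traces` is a hypothetical name.

```python
import math


def flood_time_from_traces(traces, radius):
    """Index of the first sample at which every agent holds every bit.

    traces[i][t] is the (x, y) position in meters of agent i at sample t.
    Returns None if the information does not flood within the traces.
    """
    m = len(traces)
    steps = min(len(trace) for trace in traces)
    known = [{i} for i in range(m)]  # bits currently held by each agent
    for t in range(steps):
        updated = [set(k) for k in known]
        for i in range(m):
            for j in range(i + 1, m):
                # Agents connect when strictly closer than the radius.
                if math.dist(traces[i][t], traces[j][t]) < radius:
                    updated[i] |= known[j]
                    updated[j] |= known[i]
        known = updated
        if all(len(k) == m for k in known):
            return t
    return None
```

Multiplying the returned index by the sampling interval converts it to elapsed time. A `None` result corresponds to the case discussed above, where fixed trajectories end before the flood completes.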

5. Towards Bounding the Lévy Walk

Compared to other mobility models, the Lévy Walk is far less studied. Formally, in a Lévy Walk, given a constant parameter, an agent positioned at its current destination chooses a node as its next destination with the following probability:

$$P(u \to v) = \frac{1}{Z \cdot d(u, v)^{\alpha}} \qquad (13)$$

where $\alpha$ is the constant parameter, $Z$ is the normalizing factor and $d(u, v)$ is the distance between the current node $u$ and the candidate node $v$, such as Manhattan Distance in a grid, Euclidean Distance in the 2D plane or Graph Shortest Path Distance in any given network. Figure 6 compares a Random, a Lévy and a Random Way-point walker, simulated for 250 steps. Notice that a Random Way-point walker tends to take big steps and cover a vast area in the grid, while a Random walker is concentrated in a small area around its initial position. A Lévy walker shows a mixture of the two behaviors. It roams around in a small area most of the time, but occasionally makes a long move to a different region in the grid.

Figure 6. A Random Way-point walker (left), a random walker (middle), and a Lévy walker after 250 steps.

To compare how information propagates in the three movement models, we first observe that Random Walk (or Brownian motion) and Random Way-point can be thought of as the two extreme ends of the spectrum of all possible Lévy Walks. In (13), setting the constant to zero (and applying the corresponding normalizing factor) yields a constant probability regardless of the distance between the two nodes, similar to Random Way-point. On the other hand, given any time limit, we can make the constant large enough that, with high probability, no agent selects a destination more than one unit of distance away at any time within that limit, effectively forcing the agents to follow Random Walk.
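The two extremes can be checked numerically with a small sketch. It assumes that the probability in (13) is proportional to the distance raised to the power of minus the constant (written `alpha` below), that distance is the Manhattan distance on an n x n torus, and that the current node is excluded from the candidates to avoid a zero distance. All names are hypothetical.

```python
import random


def torus_manhattan(u, v, n):
    """Manhattan distance between cells u and v on an n x n torus."""
    dx, dy = abs(u[0] - v[0]), abs(u[1] - v[1])
    return min(dx, n - dx) + min(dy, n - dy)


def levy_destination_probs(u, n, alpha):
    """Probability of each candidate destination, proportional to d^(-alpha)."""
    weights = {}
    for x in range(n):
        for y in range(n):
            v = (x, y)
            if v != u:  # assumption: the current cell is not a candidate
                weights[v] = torus_manhattan(u, v, n) ** (-alpha)
    z = sum(weights.values())  # normalizing factor
    return {v: w / z for v, w in weights.items()}


def sample_destination(u, n, alpha, rng):
    """Draw the next destination of a Lévy walker currently at u."""
    probs = levy_destination_probs(u, n, alpha)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[v] for v in nodes])[0]
```

With `alpha = 0` every candidate receives the same probability, as in Random Way-point. With a large `alpha` almost all of the probability mass falls on the four cells at distance one, which mimics a Random Walk step.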

Figure 3 (left) shows the simulated average flood time of a system where agents move by the Random Walk, Manhattan Random Way-point or Lévy Walk model, the last with different values of the Lévy Walk constant. The other system parameters are held fixed, the constant is increased in fixed increments, and each point is created by aggregating the results of 100 realizations. Additionally, with our newly discovered bound for the Manhattan Random Way-point model, the flood time bounds for Random Walk and Manhattan Random Way-point have become very close. We now have reason to believe that any future bound for Lévy Walk should be close to the bounds for these two movement models. Since their bounds are close, it is worth investigating whether a careful interpolation of the bounds for Random Walk and Manhattan Random Way-point is a good predictor of how information spreads among Lévy walkers in a network.

6. Conclusion

Thanks to ever-present portable devices, there has been a growing interest in a better understanding of mobile networks (also called vehicular networks), where autonomous agents move independently and are capable of carrying and transmitting information. We studied the case of $M$ agents moving in an $N \times N$ torus.

We improved the flood time bound for the Manhattan Random Way-point model, obtaining a bound that is tight for a wide range of problem settings. To the best of our knowledge, this bound is stronger than all previous bounds found for this movement model. Through extensive experiments, we showed that our bound can accurately predict flood time for a wide variety of simulated and real-world settings.

Lastly, given the shrinking difference between the bounds for Random Walk and Random Way-point, and the fact that Lévy Walk behaves in between these two movement models, it is now worth investigating whether a careful interpolation of Random Walk and Random Way-point can describe Lévy Walk accurately enough. Finding theoretical bounds for Lévy Walk would be valuable future work that further expands our knowledge of the relation between these three movement models and ultimately of human mobility.

Acknowledgements

A. Rezaei and J. Gao acknowledge support through NSF CCF-1535900, CNS-1618391, and DMS-1737812. J. Phillips acknowledges support by NSF CCF-1350888, ACI-1443046, CNS-1514520, and CNS-1564287. Research by C.D. Tóth was supported in part by NSF CCF-1422311 and CCF-1423615. The experiments were conducted with equipment purchased through NSF CISE Research Infrastructure Grant No. 1405641. The authors thank Dagstuhl Seminar 15111 on Computational Geometry during which some of the ideas were developed.

References

  • Bajardi et al. (2011) Paolo Bajardi, Chiara Poletto, Jose J Ramasco, Michele Tizzoni, Vittoria Colizza, and Alessandro Vespignani. 2011. Human mobility networks, travel restrictions, and the global spread of 2009 H1N1 pandemic. PloS one 6, 1 (2011), e16591.
  • Barabási (2005) Albert-László Barabási. 2005. The origin of bursts and heavy tails in human dynamics. Nature 435, 7039 (2005), 207–211.
  • Bettstetter (2001) Christian Bettstetter. 2001. Smooth is Better than Sharp: A Random Mobility Model for Simulation of Wireless Networks. citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.23.3460 (2001).
  • Birand et al. (2011) B Birand, M Zafer, G Zussman, and K W Lee. 2011. Dynamic Graph Properties of Mobile Networks under Levy Walk Mobility. In 2011 IEEE Eighth International Conference on Mobile Ad-Hoc and Sensor Systems. 292–301.
  • Bishop (2017) Andrew M. Bishop. 2017. Router for OpenStreetMap Data. http://www.routino.org/. (2017).
  • Boudec and Vojnovic (2006) J Y Le Boudec and M Vojnovic. 2006. The Random Trip Model: Stability, Stationary Regime, and Perfect Simulation. IEEE/ACM Trans. Netw. 14, 6 (Dec. 2006), 1153–1166.
  • Broch et al. (1998) Josh Broch, David A Maltz, David B Johnson, Yih-Chun Hu, and Jorjeta Jetcheva. 1998. A Performance Comparison of Multi-hop Wireless Ad Hoc Network Routing Protocols. In Proceedings of the 4th Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom ’98). ACM, New York, NY, USA, 85–97.
  • Camp et al. (2002) Tracy Camp, Jeff Boleng, and Vanessa Davies. 2002. A survey of mobility models for ad hoc network research. Wireless communications and mobile computing 2, 5 (2002), 483–502.
  • Capital Bikeshare (2017) Capital Bikeshare. 2017. Washington DC Trip Histories. https://www.capitalbikeshare.com/system-data. (2017).
  • Chu et al. (2002) Maurice Chu, Horst Haussecker, and Feng Zhao. 2002. Scalable information-driven sensor querying and routing for ad hoc heterogeneous sensor networks. The International Journal of High Performance Computing Applications 16, 3 (2002), 293–313.
  • Chu and Nikolaidis (2002) Tommy Chu and Ioanis Nikolaidis. 2002. On the Artifacts of Random Waypoint Simulations. In International Conference on Internet Computing.
  • Citi Bike (2017) Citi Bike. 2017. New York City Trip Histories. https://www.citibikenyc.com/system-data. (2017).
  • Clementi et al. (2011) Andrea Clementi, Angelo Monti, and Riccardo Silvestri. 2011. Modelling mobility: A discrete revolution. Ad Hoc Networks 9, 6 (Aug. 2011), 998–1014.
  • Clementi et al. (2013) Andrea Clementi, Angelo Monti, and Riccardo Silvestri. 2013. Fast flooding over Manhattan. Distributed computing 26, 1 (2013), 25–38.
  • Clementi et al. (2015) Andrea Clementi, Riccardo Silvestri, and Luca Trevisan. 2015. Information spreading in dynamic graphs. Distributed Computing 28, 1 (2015), 55–73.
  • Crescenzi et al. (2009) Pilu Crescenzi, Miriam Di Ianni, Andrea Marino, Gianluca Rossi, and Paola Vocca. 2009. Spatial Node Distribution of Manhattan Path Based Random Waypoint Mobility Models with Applications. In Structural Information and Communication Complexity (Lecture Notes in Computer Science). Springer, Berlin, Heidelberg, 154–166.
  • De Montjoye et al. (2013) Yves-Alexandre De Montjoye, César A Hidalgo, Michel Verleysen, and Vincent D Blondel. 2013. Unique in the crowd: The privacy bounds of human mobility. Scientific reports 3 (2013), 1376.
  • Dimitriou et al. (2006) Tassos Dimitriou, Sotiris Nikoletseas, and Paul Spirakis. 2006. The infection time of graphs. Discrete Applied Mathematics 154, 18 (2006), 2577–2589.
  • Ding et al. (2015) Jiaxin Ding, Jie Gao, and Hui Xiong. 2015. Understanding and modelling information dissemination patterns in vehicle-to-vehicle networks. In Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 41.
  • Eagle et al. (2009) Nathan Eagle, Alex (Sandy) Pentland, and David Lazer. 2009. Inferring friendship network structure by using mobile phone data. Proceedings of the National Academy of Sciences 106, 36 (2009), 15274–15278. https://doi.org/10.1073/pnas.0900282106
  • Easley and Kleinberg (2010) David Easley and Jon Kleinberg. 2010. Networks, crowds, and markets: Reasoning about a highly connected world. Cambridge University Press.
  • Feynman et al. (2013) Richard P Feynman, Robert B Leighton, and Matthew Sands. 2013. The Feynman Lectures on Physics, Desktop Edition Volume I. Vol. 1. Basic books.
  • Frias-Martinez et al. (2011) Enrique Frias-Martinez, Graham Williamson, and Vanessa Frias-Martinez. 2011. An agent-based model of epidemic spread using human mobility and social network information. In 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom). IEEE, 57–64.
  • Gomez Rodriguez et al. (2010) Manuel Gomez Rodriguez, Jure Leskovec, and Andreas Krause. 2010. Inferring networks of diffusion and influence. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 1019–1028.
  • González et al. (2008) Marta C González, César A Hidalgo, and Albert-László Barabási. 2008. Understanding individual human mobility patterns. Nature 453, 7196 (2008), 779–782.
  • Hu and Johnson (2000) Yih-Chun Hu and David B Johnson. 2000. Caching Strategies in On-demand Routing Protocols for Wireless Ad Hoc Networks. In Proceedings of the 6th Annual International Conference on Mobile Computing and Networking (MobiCom ’00). ACM, New York, NY, USA, 231–242.
  • Hubway (2017) Hubway. 2017. Boston Trip Histories. https://www.thehubway.com/system-data. (2017).
  • Intanagonwiwat et al. (2000) Chalermek Intanagonwiwat, Ramesh Govindan, and Deborah Estrin. 2000. Directed diffusion: A scalable and robust communication paradigm for sensor networks. In Proceedings of the 6th annual international conference on Mobile computing and networking. ACM, 56–67.
  • Johnson and Maltz (1996) David B Johnson and David A Maltz. 1996. Dynamic Source Routing in Ad Hoc Wireless Networks. In Mobile Computing. Springer, Boston, MA, 153–181.
  • Kempe et al. (2003) David Kempe, Jon Kleinberg, and Éva Tardos. 2003. Maximizing the spread of influence through a social network. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 137–146.
  • Kesten and Sidoravicius (2005) Harry Kesten and Vladas Sidoravicius. 2005. The spread of a rumor or infection in a moving population. Annals of Probability (2005), 2402–2462.
  • Kleinberg (2007) Jon Kleinberg. 2007. Computing: The wireless epidemic. Nature 449, 7160 (2007), 287–288.
  • Lawler (2013) Gregory Lawler. 2013. Intersections of random walks. Springer Science & Business Media.
  • Lee et al. (2011) K Lee, Y Kim, S Chong, I Rhee, and Y Yi. 2011. Delay-capacity tradeoffs for mobile networks with Lévy walks and Lévy flights. In 2011 Proceedings IEEE INFOCOM. 3128–3136.
  • OpenStreetMap contributors (2017) OpenStreetMap contributors. 2017. Planet dump retrieved from https://planet.osm.org . https://www.openstreetmap.org. (2017).
  • Peng and Lu (2000) Wei Peng and Xi-Cheng Lu. 2000. On the reduction of broadcast redundancy in mobile ad hoc networks. In Proceedings of the 1st ACM international symposium on Mobile ad hoc networking & computing. IEEE Press, 129–130.
  • Perkins et al. (2001) C E Perkins, E M Royer, S R Das, and M K Marina. 2001. Performance comparison of two on-demand routing protocols for ad hoc networks. IEEE Pers. Commun. 8, 1 (Feb. 2001), 16–28.
  • Pettarin et al. (2011) Alberto Pettarin, Andrea Pietracaprina, Geppino Pucci, and Eli Upfal. 2011. Tight bounds on information dissemination in sparse mobile networks. In Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing. ACM, 355–362.
  • Royer et al. (2001) E M Royer, P M Melliar-Smith, and L E Moser. 2001. An analysis of the optimum node density for ad hoc mobile networks. In ICC 2001. IEEE International Conference on Communications. Conference Record (Cat. No.01CH37240), Vol. 3. 857–861.
  • Shinki et al. (2017) K Shinki, M Nishida, and N Hayashibara. 2017. Message Dissemination Using Lévy Flight on Unit Disk Graphs. In 2017 IEEE 31st International Conference on Advanced Information Networking and Applications (AINA). 355–362.
  • Shlesinger et al. (1999) Michael F Shlesinger, Joseph Klafter, and Gert Zumofen. 1999. Above, below and beyond Brownian motion. American Journal of Physics 67, 12 (1999), 1253–1259.
  • Song et al. (2010) Chaoming Song, Zehui Qu, Nicholas Blumm, and Albert-László Barabási. 2010. Limits of Predictability in Human Mobility. Science 327, 5968 (2010), 1018–1021. https://doi.org/10.1126/science.1177170
  • Viswanathan et al. (2002) G M Viswanathan, F Bartumeus, Sergey V. Buldyrev, J Catalan, U L Fulco, Shlomo Havlin, M G E da Luz, M L Lyra, E P Raposo, and H Eugene Stanley. 2002. Lévy flight random searches in biological phenomena. Physica A: Statistical Mechanics and its Applications 314, 1 (Nov. 2002), 208–213.
  • Wang et al. (2014) S Wang, X Wang, X Cheng, J Huang, and R Bie. 2014. The Tempo-Spatial Information Dissemination Properties of Mobile Opportunistic Networks with Levy Mobility. In 2014 IEEE 34th International Conference on Distributed Computing Systems. 124–133.
  • Yoon et al. (2003) J Yoon, M Liu, and B Noble. 2003. Random waypoint considered harmful. In IEEE INFOCOM 2003. Twenty-second Annual Joint Conference of the IEEE Computer and Communications Societies (IEEE Cat. No.03CH37428), Vol. 2. 1312–1321.
  • Zhou and Gao (2009) Dengpan Zhou and Jie Gao. 2009. Opportunistic Processing and Query of Motion Trajectories in Wireless Sensor Networks. In Proceedings of the 28th Annual IEEE Conference on Computer Communications (INFOCOM’09). 1197–1205.