## I Introduction

One aspect of evasion for prey in nature is deceiving predators about the escape trajectory. The prey can use randomness in its decision-making process so that the predator cannot accurately anticipate its future positions along the escape trajectory, in contrast to the case in which the prey always escapes directly away from the predator [1]. In nature, this behaviour is demonstrated by mantises, moths, and green lacewings, which perform sudden, unpredictable changes in flight path to avoid attacks by bats [2]. Cockroaches also randomly select their escape bearing from a set of possible trajectories at fixed angles away from the threat [3]. Randomness in the escape trajectory seems to be shared by many other insects [4]. The aim of this paper is to draw inspiration from these prey to devise stochastic evasive maneuvers for engineering applications.

Investigation of evasive maneuvers is an active area of research in robotics and aerospace engineering; see [5, 6, 7, 8, 9], spanning over five decades. In these studies, the underlying methodology for evasion is typically deterministic. Analysing the dynamics of the predator, specifically its constraints and shortcomings, such as maximum speed and turning radius, reveals a deterministic path for evasion (based on mechanical and geometric advantages). This sometimes results in a game-theoretic approach to decision making, dubbed pursuit-evasion games [10]. The application and analysis of these games do not, however, end with engineering [11, 12, 13, 14].

Fundamental differences between theoretical investigations of optimal evasion trajectories and the behaviour of prey in nature were also observed within biological studies [15], where it was noted that theoretical studies of optimal evasion trajectories often fall back on the relative speeds of predator and prey [16, 17, 18] and do not accommodate one of the main properties postulated for evasion trajectories, namely their unpredictability, which is fundamental for preventing predators from learning a repeated pattern of prey response [19, 20, 21].

## II Modelling

Consider a prey that has been detected or observed at an initial position , where for planar and for aerial prey, at time by a predator. The dynamics of the prey are given by

(1a) | ||||

(1b) |

where is the state of the prey (e.g., its position, velocity, and orientation), is the control input, and is the output measurement, which is the position in this case. Note that, due to consistency, . In what follows, the dynamical system (1) is assumed to be controllable in the sense that there exists a sequence of actions that takes it from any state to any other state within a non-zero horizon. Controllability is a reasonable assumption because, otherwise, the prey could not perform the maneuvers necessary to escape, potentially rendering it vulnerable. Although any state is reachable, some maneuvers consume more energy and may not be practically feasible.

The predator is assumed to be interested in estimating the position of the prey in

seconds, at time instant . This is motivated by the predator aiming to intercept the prey at that position. By anticipating the trajectory of the prey, the predator can gain an advantage and capture the prey more easily. For instance, snakes anticipate the future behaviour of their prey and strike where it will be later [23]. Other predators have also been noted to actively predict the future movements of their prey [24, 25].

At , the state of the prey is . The position of the prey is determined by the control actions of the prey over the window of time of interest denoted by . The problem of interest for the prey is to determine the position so as to balance energy consumption against its ability to evade (in the sense of making the predator’s estimation error large enough to remain unpredictable).

In order to accommodate unpredictability of the prey along its evasion trajectories, which is postulated to be a fundamental aspect of evasion in nature [19, 20, 21, 4], the prey selects the desired position according to a randomized policy captured by a probability density . This implies that for any Lebesgue-measurable set . Note that is not a conditional probability function as

is not necessarily a random variable (there is no assumption regarding a prior for

). The notation emphasizes possible dependency of the probability density function of on as a parameter^{1}. For instance, at high altitudes, downward dives might be more likely. The task at hand, in this paper, is to find to balance between evasion efficiency and energy consumption.

^{1} As observed in the remainder of the paper, in some cases, the optimal policy does not depend on .

Since the prey is controllable, there always exist control actions to bring the prey from the position , with a given initial state of , to the position over the horizon . The minimum required energy for the maneuver is

(2a) | ||||

(2b) | ||||

(2c) | ||||

(2d) |

where relates the energy consumption of the prey to its control input and state. Here, To keep its energy consumption low, the prey can select so that is kept low. Since the prey’s policy is stochastic, it keeps the following average cost low:

(3) |

where is the support set of the density function .

Another facet of evasion is to avoid heading back towards the predator, even when reacting unpredictably. In fact, it is desired to head away from the predator so that the evasion is complete. Thus the preferred directions head away from the predator but with enough choices for unpredictable maneuvers. To capture this, define the set

angle between | ||||

(4) |

where is the position of the predator at time and . The search is restricted to the set of probability density functions that , i.e., the set of probability density functions that, with probability one, pick to head away from the predator (according to the property that the angle between and is smaller than or equal to ). In what follows, is the boundary of .
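The membership condition behind (4) can be illustrated with a short test. This is only a sketch under stated assumptions: the names `y0` (prey position), `z0` (predator position), `yT` (candidate destination), and `phi` (maximum admissible angle) are ours, and the condition is read as requiring the angle between the displacement `yT - y0` and the direction `y0 - z0` pointing away from the predator to be at most `phi`.

```python
import numpy as np

def heads_away(y0, z0, yT, phi):
    """Return True if the angle between (yT - y0) and (y0 - z0) is at most phi (radians).

    Assumed reading of the admissible set: the prey's displacement must lie in a
    cone of half-angle phi around the direction pointing away from the predator.
    """
    move = np.asarray(yT, dtype=float) - np.asarray(y0, dtype=float)
    away = np.asarray(y0, dtype=float) - np.asarray(z0, dtype=float)
    norm_move, norm_away = np.linalg.norm(move), np.linalg.norm(away)
    if norm_move == 0.0 or norm_away == 0.0:
        # The angle is undefined; this sketch treats the point as inadmissible.
        return False
    cos_angle = np.clip(move @ away / (norm_move * norm_away), -1.0, 1.0)
    return bool(np.arccos(cos_angle) <= phi)

# Predator to the left of the prey: moving right is admissible, moving left is not.
print(heads_away([0, 0], [-1, 0], [1, 0], np.pi / 4))
print(heads_away([0, 0], [-1, 0], [-1, 0], np.pi / 4))
```

A policy restricted to this set places zero probability on destinations for which `heads_away` returns `False`.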

Effective tracking of the prey improves the predator’s chances of capturing it. The predator’s prediction of the location of the prey is denoted by . A measure of the effectiveness of the predator’s ability to predict the behaviour of the prey is . The objective of the prey is to ensure that is maximized (albeit in balance with other factors such as energy consumption to ensure the policy is implementable), irrespective of the predator’s policy for determining . The independence from the predator’s policy is motivated by the fact that the prey does not know the estimation policy of new predators, or that it might want to be prepared for the worst-case outcome.

The following assumptions make the problem of searching for this policy more tractable.

###### Assumption 1

(i) is twice continuously differentiable, (ii) has zero Lebesgue-measure for all and for all , and (iii) is bounded for all .

Assumption 1 (i) simplifies the search for the optimal policy by allowing the use of calculus of variations [26]. Assumption 1 (ii) ensures that the Fisher information matrix, in Theorem 1, is well-defined and inversely related to the estimation error of the predator, and that its trace is a convex function of the density function . Finally, in the absence of Assumption 1 (iii), the average energy required for realising is unbounded and thus such a policy cannot be physically realized. This paper adopts the notation to denote the gradient of a continuously differentiable mapping

, which is assumed to be a column vector.

###### Theorem 1

Under Assumption 1, it can be shown that

(5) |

where is the Fisher information associated with the density function defined as

(6) |

###### Proof:

See Appendix A.
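The bound can be checked numerically for an isotropic Gaussian policy. The sketch below rests on two assumptions that should be compared against (5) and (6): that the bound reads E‖y_T − ŷ_T‖² ≥ p² / Tr(I), and that the Fisher information is the expected outer product of the gradient of the log-density with respect to the destination. All function and variable names are illustrative.

```python
import numpy as np

def gaussian_score(y, mean, sigma):
    """Gradient of log N(mean, sigma^2 I) with respect to y, one row per sample."""
    return -(y - mean) / sigma**2

def fisher_information(scores):
    """Monte Carlo estimate of E[score score^T] from rows of score vectors."""
    return scores.T @ scores / scores.shape[0]

rng = np.random.default_rng(0)
p, sigma, n = 2, 1.5, 200_000
mean = np.zeros(p)
samples = rng.normal(mean, sigma, size=(n, p))

info = fisher_information(gaussian_score(samples, mean, sigma))
bound = p**2 / np.trace(info)  # assumed form of the lower bound
# Mean-squared error of a predator that always predicts the mean of the density.
mse = np.mean(np.sum((samples - mean) ** 2, axis=1))

print(f"Tr(I) = {np.trace(info):.4f}, bound = {bound:.4f}, mse = {mse:.4f}")
```

For this density the estimated bound and the estimated mean-squared error agree up to sampling noise, which is consistent with the tightness remark for Gaussian densities.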

Note that, strictly speaking, (6) is not the classic definition of the Fisher information matrix; see, e.g., [27]. This can be observed by noting that in the definition of the Fisher information matrix the variables with respect to which derivatives and integrations are performed are not the same, contrary to (6). This is due to the fact that, in the Cramér-Rao bound, a randomized measurement is first realized and from that some deterministic parameters are estimated. However, in this paper, a deterministic measurement (regarding the initial position of the prey) is first taken and then a randomized movement in the state is realized, with the ultimate aim being to estimate the said random state and not the deterministic initial condition. Setting aside this “philosophical” difference, the proof techniques are similar to those of the Cramér-Rao bound with subtle, yet important, differences. Due to these similarities, the name Fisher information matrix is adopted for . The bound in (5) is tight in the sense that there exist and such that (5) holds with equality^{2}.

^{2} Let be a Gaussian distribution with zero mean and unit variance (for which ) and set to be the maximum likelihood estimator (i.e., implying that ). Then, if .

The problem of finding the optimal policy for striking a balance between evasion efficiency and energy consumption while running away from the predator can be posed as an infinite-dimensional optimization problem in

(7a) | ||||

(7b) |

where is the set of all policies that satisfy Assumption 1 and is a constant balancing the trade-off between evasion efficiency and energy consumption. As increases, the emphasis on energy consumption also increases (i.e., the prey acts more sedentarily).

## III Main Results

The following theorem captures the optimal stochastic evasive maneuver, in the sense of the solution of (7). It shows that the optimal probability density function of the actions of the prey is characterized by the Schrödinger equation.

###### Theorem 2

###### Proof:

See Appendix B.

Theorem 2 shows that (the square root of) the prey’s probability density function for selecting its destination must satisfy the time-invariant (stationary) Schrödinger equation. This is an interesting observation, suggesting that quantum particles can be viewed as playing an elaborate game of pursuit-evasion with measurement devices. Within the context of quantum mechanics, the same formula can be obtained by choosing Fisher information as a measure of disorder [28]; however, the choice of Fisher information as a measure of disorder in quantum mechanics is philosophical, while in this paper the choice stems from stochastic evasion.

As an example, consider a robot with single-integrator dynamics moving on the ground:

(9a) | ||||

(9b) |

where is the state (only the position for single-integrator robots) and is the control input (the velocity in each direction). Furthermore, assume that For these systems, can be explicitly calculated as Therefore, the energy required for moving from to is captured by the distance between those points. Set without loss of generality. Assume that the predator is located in . We can compute the optimal policy using Theorem 2; see Appendix C for the complete algebraic computations. Figure 1 illustrates for , , and . The dashed lines show . The probability density function is larger at the darker areas and thus the prey is more likely to appear at those points at time . By changing the constant the size of the dark region changes.
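The statement that the energy is captured by the distance between the two points can be checked numerically. The sketch below assumes a quadratic running cost, i.e., the energy is the integral of ‖u(t)‖² over the horizon; under that assumption the constant-velocity control is optimal for a single integrator and the minimum energy equals ‖y_T − y_0‖² / T. The function names and the numerical values are illustrative only.

```python
import numpy as np

def min_energy(y0, yT, T):
    """Closed-form minimum of the integral of ||u||^2 for dx/dt = u (assumed quadratic cost)."""
    d = np.asarray(yT, dtype=float) - np.asarray(y0, dtype=float)
    return float(d @ d) / T

def control_energy(u, dt):
    """Riemann-sum approximation of the integral of ||u(t)||^2 for piecewise-constant u."""
    return float(np.sum(u * u) * dt)

y0, yT, T, n = np.array([0.0, 0.0]), np.array([3.0, 4.0]), 2.0, 1000
dt = T / n

# Constant-velocity control that reaches yT exactly at time T.
u_const = np.tile((yT - y0) / T, (n, 1))

# A perturbed control with the same endpoint: the perturbation integrates to zero.
wiggle = np.sin(2 * np.pi * np.arange(n) / n)[:, None] * np.array([1.0, -0.5])
u_wiggle = u_const + wiggle

print(min_energy(y0, yT, T), control_energy(u_const, dt), control_energy(u_wiggle, dt))
```

Both controls reach the same destination, but the perturbed one spends strictly more energy, in line with the closed-form minimum.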

If the evader repeats the optimal policy for evasion at multiple time instances, it must follow the waypoints , which are random variables with the probability density function . Note that is a function of , which is the position of the predator at time , as the set changes with . For this example, consider the case where the predator is also a single-integrator robot with the dynamics in (9). Assume that the predator uses the least mean square estimator [29, p. 30] for estimating the prey’s position as . The predator then tries to intercept the prey by following the waypoints given by Figure 2 illustrates a realization of the trajectory of the prey when using the optimal evasion policy in Theorem 2 (black) and the trajectory of the predator (red). The predator and the prey never cross paths. To observe this, Figure 3 shows the distance between the predator and the prey (solid black) for the trajectories in Figure 2. The dashed black line illustrates the statistical lower bound on the distance between predator and prey given by (5), which follows from that Note that variations in the distance between the prey and predator are expected as the guarantee in (5) only holds in expectation. Figure 4 illustrates the probability density function of the distance between predator and prey across time. This density is estimated using randomly generated trajectories. Based on this figure, it can be seen that .

Finally, it remains to discuss the relationship between energy consumption and evasion capability. As expected, with increasing energy consumption, the prey can implement more intensive maneuvers to escape. Figure 5 shows the relationship between Fisher information and expected energy consumption ; various points on this curve can be achieved with different .

## IV Conclusions and Future Work

We proved that the optimal probability density function of the actions of the prey, trading off unpredictability against energy consumption when implementing stochastic evasion maneuvers, is characterized by the stationary Schrödinger equation. Future work can focus on computing the optimal probability density function for practical robot dynamics.

## References

- [1] G. M. Card, “Escape behaviors in insects,” Current Opinion in Neurobiology, vol. 22, no. 2, pp. 180–186, 2012.
- [2] D. Yager, M. May, and M. Fenton, “Ultrasound-triggered, flight-gated evasive maneuvers in the praying mantis Parasphendale agrionina. I. Free flight,” Journal of Experimental Biology, vol. 152, no. 1, pp. 17–39, 1990.
- [3] P. Domenici, D. Booth, J. M. Blagburn, and J. P. Bacon, “Cockroaches keep predators guessing by using preferred escape trajectories,” Current Biology, vol. 18, no. 22, pp. 1792–1796, 2008.
- [4] P. Domenici, J. M. Blagburn, and J. P. Bacon, “Animal escapology II: escape trajectory case studies,” Journal of Experimental Biology, vol. 214, no. 15, pp. 2474–2494, 2011.
- [5] Y. Ho, A. Bryson, and S. Baron, “Differential games and optimal pursuit-evasion strategies,” IEEE Transactions on Automatic Control, vol. 10, no. 4, pp. 385–389, 1965.
- [6] T. Shima and J. Shinar, “Time-varying linear pursuit-evasion game models with bounded controls,” Journal of Guidance, Control, and Dynamics, vol. 25, no. 3, pp. 425–432, 2002.
- [7] D. W. Oyler, P. T. Kabamba, and A. R. Girard, “Pursuit–evasion games in the presence of obstacles,” Automatica, vol. 65, pp. 1–11, 2016.
- [8] M. Casini and A. Garulli, “An improved lion strategy for the lion and man problem,” IEEE Control Systems Letters, vol. 1, no. 1, pp. 38–43, 2017.
- [9] J. Shinar and S. Gutman, “Three-dimensional optimal pursuit and evasion with bounded controls,” IEEE Transactions on Automatic Control, vol. 25, no. 3, pp. 492–496, 1980.
- [10] R. Isaacs, Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. New York, NY: John Wiley and Sons, 1965.
- [11] G. F. Miller and D. Cliff, “Protean behavior in dynamic games: Arguments for the co-evolution of pursuit-evasion tactics,” From Animals to Animats, vol. 3, pp. 411–420, 1994.
- [12] S. Morice, S. Pincebourde, F. Darboux, W. Kaiser, and J. Casas, “Predator–prey pursuit–evasion games in structurally complex environments,” Integrative and Comparative Biology, vol. 53, no. 5, pp. 767–779, 2013.
- [13] A. Soto, W. J. Stewart, and M. J. McHenry, “When optimal strategy matters to prey fish,” Integrative and Comparative Biology, vol. 55, no. 1, pp. 110–120, 2015.
- [14] L. Dugatkin and H. Reeve, Game Theory and Animal Behavior. Oxford University Press, 2000.
- [15] P. Domenici, J. M. Blagburn, and J. P. Bacon, “Animal escapology I: theoretical issues and emerging trends in escape trajectories,” Journal of Experimental Biology, vol. 214, no. 15, pp. 2463–2473, 2011.
- [16] S. Arnott, D. Neil, and A. Ansell, “Escape trajectories of the brown shrimp Crangon crangon, and a theoretical consideration of initial escape angles from predators,” Journal of Experimental Biology, vol. 202, no. 2, pp. 193–209, 1999.
- [17] P. Domenici, “The visually mediated escape response in fish: Predicting prey responsiveness and the locomotor behaviour of predators and prey,” Marine and Freshwater Behaviour and Physiology, vol. 35, no. 1–2, pp. 87–110, 2002.
- [18] D. Weihs and P. Webb, “Optimal avoidance and evasion tactics in predator-prey interactions,” Journal of Theoretical Biology, vol. 106, no. 2, pp. 189–206, 1984.
- [19] C. Comer, “Behavioral biology: inside the mind of proteus?,” Current Biology, vol. 19, no. 1, pp. R27–R28, 2009.
- [20] J.-G. J. Godin, “Evading predators,” in Behavioural ecology of teleost fishes (J.-G. J. Godin, ed.), pp. 191–236, Oxford, UK: Oxford University Press, 1997.
- [21] D. Humphries and P. Driver, “Protean defence by prey animals,” Oecologia, vol. 5, no. 4, pp. 285–302, 1970.
- [22] L. Markus, “Controllability of nonlinear processes,” Journal of the Society for Industrial and Applied Mathematics, Series A: Control, vol. 3, no. 1, pp. 78–90, 1965.
- [23] K. C. Catania, “Tentacled snakes turn C-starts to their advantage and predict future prey behavior,” Proceedings of the National Academy of Sciences, vol. 106, no. 27, pp. 11183–11187, 2009.
- [24] B. G. Borghuis and A. Leonardo, “The role of motion extrapolation in amphibian prey capture,” Journal of Neuroscience, vol. 35, no. 46, pp. 15430–15441, 2015.
- [25] M. Mischiati, H.-T. Lin, P. Herold, E. Imler, R. Olberg, and A. Leonardo, “Internal models direct dragonfly interception steering,” Nature, vol. 517, p. 333, 12 2014.
- [26] D. Kirk, Optimal Control Theory: An Introduction. Dover Books on Electrical Engineering, Dover Publications, 2012.
- [27] J. Shao, Mathematical Statistics. Springer Texts in Statistics, Springer-Verlag New York, 2003.
- [28] B. R. Frieden, “Fisher information, disorder, and the equilibrium distributions of physics,” Phys. Rev. A, vol. 41, pp. 4265–4276, 1990.
- [29] B. D. O. Anderson and J. B. Moore, Optimal Filtering. Dover Books on Electrical Engineering, Dover Publications, 2012.
- [30] R. Larson and B. Edwards, Calculus. Cengage Learning, 2016.
- [31] R. T. Rockafellar, Convex Analysis. Princeton Landmarks in Mathematics and Physics, Princeton University Press, 1997.
- [32] V. Jeyakumar and H. Wolkowicz, “Zero duality gaps in infinite-dimensional programming,” Journal of Optimization Theory and Applications, vol. 67, no. 1, pp. 87–108, 1990.
- [33] C. H. Edwards, Advanced Calculus of Several Variables. Academic Press, 1973.

## Appendix A Proof of Theorem 1

For all constants , with for all , it could be shown that

(10) |

Notice that

where the first and the second equalities, respectively, follow from

and

Here, denotes a vector of all zeros except for the -th entry, which is equal to one. Now, it can be deduced that because of Gauss’s theorem (a.k.a. the divergence theorem) [30] and for all because is bounded in light of Assumption 1 (iii). Therefore, it can be seen that

(11) |

On the other hand, by the definition of the Fisher information matrix in (6), it can be deduced that

(12) |

Substituting (11) and (12) into (10) while setting , , results in

(13) |

Noting that the left-hand side of (13) is always greater than or equal to zero (because it is the expectation of a non-negative random variable), it can be deduced that where the second inequality follows from Jensen’s inequality [31, p. 25], the fact that the mapping is convex over , and the fact that for all (since is positive semi-definite). This completes the proof.
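The Jensen step can be written out explicitly. The following is a sketch in assumed notation: $\lambda_1,\dots,\lambda_p$ denote the eigenvalues of the Fisher information matrix $\mathcal{I}$, taken here to be strictly positive so that the inverse exists, and the first inequality of the chain is assumed to bound the estimation error from below by $\operatorname{Tr}(\mathcal{I}^{-1})$. Convexity of $x \mapsto 1/x$ on the positive reals then gives

```latex
\operatorname{Tr}\!\left(\mathcal{I}^{-1}\right)
  = \sum_{i=1}^{p} \frac{1}{\lambda_i}
  = p \cdot \frac{1}{p} \sum_{i=1}^{p} \frac{1}{\lambda_i}
  \;\ge\; p \cdot \frac{1}{\frac{1}{p} \sum_{i=1}^{p} \lambda_i}
  = \frac{p^{2}}{\operatorname{Tr}(\mathcal{I})} .
```

Equality holds when all eigenvalues coincide, which is the case for an isotropic Gaussian density.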

## Appendix B Proof of Theorem 2

First, we prove that is convex in , . Note that Define as , which is convex over because its Hessian is positive semi-definite over its domain. Let and satisfy Assumption 1 and belong to . We may define for any such that . Clearly, satisfies Assumption 1 as well. Further, and, as a result, . For any , where , it can be shown that

(14) |

due to convexity of over . Now, note that, if ,

for . Thus, because , with denoting the complement of set , is a zero-measure set (see Assumption 1), . Similarly, In light of these identities, the proof of convexity of follows from integrating both sides of (14).

Noting that the cost function and the constraint set are convex, the stationarity condition (that the variational derivative is equal to zero) is sufficient for optimality. Further, if multiple density functions satisfy the sufficiency conditions, they all exhibit the same cost.

In the rest of the proof, the stationarity condition is rewritten in a simpler form. Following the result of [32], the Lagrangian can be constructed as

where is the Lagrange multiplier corresponding to the equality constraint . Using Theorem 5.3 in [33, p. 440], it can be seen that the extrema must satisfy

Introducing the change of variable results in Now, it can be deduced that Note that, if for some , the equality cannot be satisfied with any .
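The effect of a square-root change of variable on the Fisher-information term can be made explicit. The following is a sketch in assumed notation: $\rho$ is the density, $\psi = \sqrt{\rho}$ is the new variable (in line with the remark after Theorem 2 that the square root of the density satisfies the Schrödinger equation), and $\mathcal{I}$ is the Fisher information matrix, read as the expected outer product of $\nabla \log \rho$. Integrals are taken over the support of $\rho$.

```latex
\nabla \rho = 2 \psi \nabla \psi
\quad \Longrightarrow \quad
\operatorname{Tr}(\mathcal{I})
  = \int \frac{\|\nabla \rho(y)\|_2^2}{\rho(y)} \, \mathrm{d}y
  = \int \frac{4 \psi(y)^2 \, \|\nabla \psi(y)\|_2^2}{\psi(y)^2} \, \mathrm{d}y
  = 4 \int \|\nabla \psi(y)\|_2^2 \, \mathrm{d}y .
```

In the new variable the information term is quadratic in $\psi$, as are the expected energy and the normalization constraint, so the stationarity condition becomes a linear second-order equation in $\psi$ rather than a nonlinear one in $\rho$.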

## Appendix C Single-Integrator Robots

For single integrator robots, . With change of variable , we get . The polar coordinates, in addition to , require angle such that . We get . Changing from Cartesian coordinates to the polar coordinates and substituting with

into the partial differential equation in Theorem 2 results in

Finally, note that the square root of the following density function solves this partial differential equation:
