 # Information Geometry of Sensor Configuration

In problems of parameter estimation from sensor data, the Fisher Information provides a measure of the performance of the sensor; effectively, in an infinitesimal sense, how much information about the parameters can be obtained from the measurements. From the geometric viewpoint, it is a Riemannian metric on the manifold of parameters of the observed system. In this paper we consider the case of parameterized sensors and answer the question, "How best to reconfigure a sensor (vary the parameters of the sensor) to optimize the information collected?" A change in the sensor parameters results in a corresponding change to the metric. We show that the change in information due to reconfiguration exactly corresponds to the natural metric on the infinite dimensional space of Riemannian metrics on the parameter manifold, restricted to finite-dimensional sub-manifold determined by the sensor parameters. The distance measure on this configuration manifold is shown to provide optimal, dynamic sensor reconfiguration based on an information criterion. Geodesics on the configuration manifold are shown to optimize the information gain but only if the change is made at a certain rate. An example of configuring two bearings-only sensors to optimally locate a target is developed in detail to illustrate the mathematical machinery, with Fast-Marching methods employed to efficiently calculate the geodesics and illustrate the practicality of using this approach.

## 1 Introduction

This paper is an attempt to begin the construction of an abstract theory of sensor management, in the hope that it will help to provide both a theoretical underpinning for the solution of practical problems and insights for future work. A key component of sensor management is the amount of information a sensor in a given configuration can gain from a measurement, and how that information gain changes as the configuration does. In this vein, it is interesting to observe how information theoretic surrogates have been used in a range of applications as objective functions for sensor management; see, for instance, [5, 12, 35]. Our aim here is to abstract from various papers including these, those by Kershaw and Evans  as well as others, the mathematical principles required for this theory. Our approach is set within the mathematical context of differential geometry.

The problem of estimation is expressed in terms of a likelihood; that is, a probability density $p(x|\theta)$, where $x$ denotes the measurement and $\theta$ the parameter to be estimated. It is well known that the Fisher Information associated with this likelihood provides a measure of the information gained from the measurement.

While the concepts discussed here are, in some sense, generic and can be applied to any sensor system that has the capability to modify its characteristics, for simplicity and to keep in mind a motivating example, we focus on a particular problem: that of localization of a target. Of course, a sensor system is itself just a (more complex) aggregate sensor, but it will be convenient, for the particular problems we will discuss, to assume a discrete collection of disparate (at least in terms of location) sensors that together provide measurements of aspects of the location of the target. This distributed collection of sensors, each drawing measurements that provide partial information about the location of a target, using known likelihoods, defines a particular sensor configuration state. As the individual sensors move, they change their sensing characteristics and thereby the collective Fisher Information associated with estimation of target location. The Fisher information matrix defines a metric, the Fisher-Rao metric, over the physical space where the target resides [28, 17], or in more generality a metric over the parameter space in which the estimation process takes place, and this metric is a function of the location of the sensors. This observation permits analysis of the information content of the system, as a function of sensor parameters, in the framework of differential geometry (“information geometry”) [1, 2]. A considerable literature is dedicated to the problem of optimizing the configuration so as to maximize information retrieval (see Section 2.1) [23, 24, 21, 5]. The mathematical machinery of information geometry has led to advances in several signal processing problems, such as blind source separation [3], gravitational wave parameter estimation, and dimensionality reduction for image retrieval or shape analysis.

In sensor management/adaptivity applications, the performance of the sensor configuration (in terms of some function of the Fisher Information) becomes a cost associated with finding the optimal sensor configuration, and tuning the metric by changing the configuration is important. Literally hundreds of papers, going back to the seminal work of  and perhaps beyond, use the Fisher Information as a measure of sensor performance. In this context, parametrized families of Fisher-Rao metrics arise (e.g. [22, 4]). Sensor management then becomes one of choosing an optimal metric (based on the value of the estimated parameter), from among a family of such, to permit the acquisition of the maximum amount of information about that parameter.

As stated above, the motivating example of this paper is that of estimating the location of a target using measurements from mobile sensors (cf. [37, 9]). The information content of the system depends both on the location of the target and on the spatial locations of the sensors, because the covariance of measurements is sensitive to the distances and angles between the sensors and the target. As the sensors move in space, the associated likelihoods vary, as do the resulting Fisher matrices, which describe the information content of the system for every possible sensor arrangement. It is this interaction between sensors and target that this paper sets out to elucidate in the context of information geometry.

The collection of all Riemannian metrics on a Riemannian manifold itself admits the structure of an infinite-dimensional Riemannian manifold [15, 11]. Of interest to us is only the subset of Riemannian metrics corresponding to Fisher informations of sensor configurations, and this allows us to restrict attention to a finite-dimensional sub-manifold of the manifold of metrics, called the sensor manifold [23, 24]. In particular, a continuous adjustment of the sensor configuration, say by moving one of the individual sensors, results in a continuous change in the associated Fisher metric and so a movement in the sensor manifold.

Though computationally difficult, the idea of regarding the Fisher metric as a measure of performance of a given sensor configuration and then understanding variation in sensor configuration in terms of the manifold of such metrics is powerful. It permits questions concerning optimal target trajectories, as discussed here, to minimize information passed to the sensors and, as will be discussed in a subsequent paper, optimal sensor trajectories to maximize information gleaned about the target. In particular, we remark that the metric on the space of Riemannian metrics that appears naturally in a mathematical context in [15, 11], also has a natural interpretation in a statistical context.

Our aim here is to further develop the information geometry view of sensor configuration begun in [23, 24]. While the systems discussed are simple and narrow in focus, they already point to concepts of information collection that appear to be new. Specifically, we set up the target location problem in an information geometric context and we show that the optimal (in a manner to be made precise in Section 2) sensor trajectories, in a physical sense, are determined by solving the geodesic equations on the sensor manifold (Section 3). Various properties of geodesics on this space are derived, and the mathematical machinery is demonstrated using concrete physical examples (Section 4).

## 2 The Information in Sensor Measurements

Figure 1: Diagrammatic representation of the sensor model; sensors are at $\lambda_i \in M$ taking measurements of a target at $\theta \in M$. A measure of distance between different sensor configurations, physically corresponding to change in information content, is obtained through a suitable restriction of the metric $G_g$ (10) to the configuration manifold $\mathcal{M}(\Gamma) \subset \mathcal{M}$, where $\mathcal{M}$ is the space of all Riemannian metrics on $M$. $M$ is almost certainly not topologically spherical; it is merely drawn here as such for simplicity.

In general, sensor measurements, as considered in this paper, can be formulated as follows. Suppose we have, in a fixed manifold $M$, a collection of $N$ sensors located at $\lambda_i \in M$, $i = 1, \dots, N$. For instance, the manifold may be Euclidean space, with location meaning position in the usual sense. The measurements from these sensors are used to estimate the location $\theta$ of a target also in $M$ (see the left of Figure 1). Each sensor draws measurements $x_i$ from a distribution with probability density function (PDF) $p_i(x_i|\theta)$. A measurement $x = \{x_i\}_{i=1}^{N}$ is the collected set of individual measurements from each of the sensors, with likelihood

$$p(x|\theta) = \prod_{i=1}^{N} p_i(x_i|\theta). \tag{1}$$

Measurements here are assumed independent between sensors and over time. (While this assumption is probably not necessary, it allows one to define the aggregate likelihood (1) as a simple product over the individual likelihoods, which renders the problem computationally more tractable.)

Given a measurement $x$ of a target at $\theta$, the likelihood that the same measurement could be obtained from a target at $\theta'$, $L(\theta,\theta')$, is given by the log odds expression

$$L(\theta,\theta') = \log\frac{p(x|\theta)}{p(x|\theta')},$$

and the average over all measurements is, by definition, the Kullback-Leibler divergence $D(\theta\|\theta')$:

$$D(\theta\|\theta') = E_x[L(\theta,\theta')] = \int p(x|\theta)\log\frac{p(x|\theta)}{p(x|\theta')}\,dx. \tag{2}$$

This would, ostensibly, be a good measure of the information in the sensor measurements, as it is non-negative and vanishes when $\theta' = \theta$, but it lacks desirable features of a metric: it is not symmetric and does not satisfy the triangle inequality. We recall that the Kullback-Leibler divergence is related to mutual information, and refer the reader to Section 17.1 of Cover and Thomas [10] for a discussion of this connection. It is widely used as a measure of the difference in information available about the target between the locations $\theta$ and $\theta'$.

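In the limit as $\theta' \to \theta$, the first non-zero term in the series expansion for $D(\theta\|\theta')$ is second order, viz.

$$\lim_{\theta'\to\theta} D(\theta\|\theta') = \lim_{\theta'\to\theta} (\theta-\theta')^{T} g\,(\theta-\theta') + O\big((\theta-\theta')^{3}\big), \tag{3}$$

where $g$ is an $n \times n$ symmetric matrix, and $n$ is the dimension of the manifold $M$.

This location-dependent matrix defines a metric over $M$, the Fisher Information Metric [28, 17]. It can also be calculated, under mild conditions, as the expectation of the tensor product of the gradients of the log-likelihood $\ell = \log p(x|\theta)$ as

$$g = E_{x|\theta}\left[d_\theta \ell \otimes d_\theta \ell\right]. \tag{4}$$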
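As a minimal numerical sketch of the expectation in (4), the block below estimates the Fisher matrix by averaging outer products of the score over simulated measurements. The measurement model (an isotropic Gaussian centred on the target with standard deviation `sigma`) and the function names are illustrative assumptions, not taken from the paper; for this model the exact answer is the identity divided by `sigma**2`.

```python
import random


def score(x, theta, sigma):
    """Gradient with respect to theta of log N(x; theta, sigma^2 I)."""
    return [(xi - ti) / sigma ** 2 for xi, ti in zip(x, theta)]


def fisher_monte_carlo(theta, sigma, n_samples=20000, seed=0):
    """Estimate g = E[score (x) score] by simple Monte Carlo averaging."""
    rng = random.Random(seed)
    dim = len(theta)
    g = [[0.0] * dim for _ in range(dim)]
    for _ in range(n_samples):
        # Draw one measurement from the assumed Gaussian model.
        x = [rng.gauss(t, sigma) for t in theta]
        s = score(x, theta, sigma)
        for i in range(dim):
            for j in range(dim):
                g[i][j] += s[i] * s[j] / n_samples
    return g


if __name__ == "__main__":
    print(fisher_monte_carlo([1.0, -2.0], sigma=0.5))
```

The same averaging applies to any likelihood whose score can be evaluated; only `score` and the sampling line would change.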

Since Fisher Information is additive over independent measurements, the Fisher Information Metric provides a measure of the instantaneous change in information the sensors can obtain about the target. In this paper, we adopt a relatively simplistic view that the continuous case of measurements is a limit of measurements discretized over time. Because sensor measurements depend on the relative locations of the sensors and target, this incremental change depends on the direction the target is moving; the Fisher metric (4) can naturally be expressed in coordinates that represent the sensor locations (see Section 4), but also depends on parameters that represent target location, which may be functions of time in a dynamical situation. Once the Fisher metric (4) has been evaluated, one can proceed to optimize it in an appropriate manner.

### 2.1 D-Optimality

Because the information of the sensor system described in Section 2 is a matrix-valued function, it is not obvious what it means to maximize the ‘total information’ with respect to the sensor parameters. We require a definition of ‘optimal’ in an information theoretic context. Several different optimization criteria exist (e.g. [36, 40]), defined by constructing scalar invariants from the matrix entries of $g$, and maximizing those functions in the usual way.

We adopt the notion of D-optimality in this paper; we consider the maximization of the determinant of (4). Equivalently, D-optimality maximizes the differential Shannon entropy of the system with respect to the sensor parameters, and minimizes the volume of the elliptical confidence regions for the sensors' estimate of the location of the target $\theta$.

A complication in applying the D-optimality (or any other) criterion to this problem is that the sensor locations and distributions are not fixed. Conventionally, measurements are drawn from sensors with fixed properties, with a view to estimating a parameter $\theta$. Permitting sensors to move throughout $M$ produces an infinite family of sensor configurations, and hence Fisher-Rao metrics (4), parametrized by the locations of the sensors. One aim of this structure is to move sensors to locations that serve to maximize information content, given some prior distribution for a particular $\theta$. This necessitates a tool to measure the difference between information provided by members of a family of Fisher-Rao metrics; this is explored in Section 3.
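The link between the determinant and the confidence region can be made concrete with a small sketch. It assumes a two-dimensional parameter and the usual large-sample Gaussian approximation in which the estimate has covariance $g^{-1}$; the function names and the example matrices are hypothetical. Under those assumptions the region $\{\delta : \delta^T g\,\delta \le c\}$ has area $\pi c/\sqrt{\det g}$, so a larger determinant means a smaller ellipse.

```python
import math


def det2(g):
    """Determinant of a 2x2 matrix given as nested sequences."""
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]


def confidence_ellipse_area(g, prob=0.95):
    """Area of the region {d : d^T g d <= c} for a 2x2 Fisher matrix g.

    c is the chi-square quantile with 2 degrees of freedom, which has the
    closed form -2 log(1 - prob). Assumes the estimate is approximately
    Gaussian with covariance g^{-1}.
    """
    c = -2.0 * math.log(1.0 - prob)
    return math.pi * c / math.sqrt(det2(g))


if __name__ == "__main__":
    weak = ((4.0, 0.0), (0.0, 1.0))    # hypothetical Fisher matrices
    strong = ((4.0, 0.0), (0.0, 4.0))
    for name, g in (("weak", weak), ("strong", strong)):
        print(name, det2(g), confidence_ellipse_area(g))
```

Ranking candidate configurations by `det2` is therefore equivalent, under these assumptions, to ranking them by the area of the confidence ellipse.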

### 2.2 Geodesics on the Sensor Manifold

We now consider the case where the target is in motion, so that $\theta$ varies along a path $\gamma(t)$. The instantaneous information gain by the sensor(s) at time $t$ is then $g(\gamma'(t),\gamma'(t))$, where $g$ is the Fisher Information Metric (4). This observation is based on the assumption that the measurements are all independent. The total information gained along $\gamma$ is

$$I(T) = \int_0^T g(\gamma'(t),\gamma'(t))\,dt, \tag{5}$$

which is the equivalent of the energy functional in differential geometry (e.g. Chapter 9 of [6]), and this has the same extremal paths as $l_g(\gamma)$, the arc-length of the path $\gamma$,

$$l_g(\gamma) = \int_0^T \sqrt{g(\gamma'(t),\gamma'(t))}\,dt. \tag{6}$$

Paths with extremal values of this length are geodesics, and these can be interpreted as the evasive action that can be taken by the target to minimize the amount of information it gives to the sensors.
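The two functionals (5) and (6) are straightforward to approximate for a sampled path. The sketch below uses a midpoint rule on a polyline traversed in equal time steps; the callable `metric`, which returns the matrix $g$ at a point, and the function names are assumptions made for illustration.

```python
import math


def quad_form(g, v):
    """Evaluate g(v, v) for a matrix g and a vector v."""
    n = len(v)
    return sum(g[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


def energy_and_length(metric, path, total_time):
    """Midpoint-rule approximations of I(T) in (5) and l_g in (6).

    `path` is a list of points visited at equally spaced times in
    [0, total_time]; `metric(point)` returns the matrix g at that point.
    """
    dt = total_time / (len(path) - 1)
    energy = 0.0
    length = 0.0
    for a, b in zip(path, path[1:]):
        mid = [(p + q) / 2.0 for p, q in zip(a, b)]
        vel = [(q - p) / dt for p, q in zip(a, b)]
        speed_sq = quad_form(metric(mid), vel)
        energy += speed_sq * dt
        length += math.sqrt(speed_sq) * dt
    return energy, length


if __name__ == "__main__":
    euclid = lambda p: ((1.0, 0.0), (0.0, 1.0))
    line = [(0.3 * k, 0.4 * k) for k in range(11)]
    print(energy_and_length(euclid, line, total_time=2.0))
```

Re-sampling the same curve with unequal spacing (that is, traversing it at varying speed) leaves the length unchanged but changes the energy, which is the distinction Section 2.3 turns on.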

### 2.3 Kinematic Conditions on Information

While the curves that are extrema of the information functional and of arc-length are the same as sets, a geodesic only minimizes the information functional if traversed at speed

$$\frac{dl_g}{dt} = +\sqrt{g(\gamma'(t),\gamma'(t))}. \tag{7}$$

In differential geometric terms, this is equivalent to requiring the arc-length parametrization of the geodesic to fulfill the energy condition. In order to minimize information about its location, the target should move along a geodesic of $g$ at exactly the speed (7). This direct kinematic condition on information is unusual and difficult to reconcile with our current view of information theory.
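One standard way to see why the rate of traversal matters is the Cauchy-Schwarz inequality applied to (5) and (6). The short derivation below is a textbook argument added here for illustration, under the assumption that the path and the total time $T$ are held fixed:

$$
l_g(\gamma)^2
= \left(\int_0^T 1\cdot\sqrt{g(\gamma'(t),\gamma'(t))}\,dt\right)^{2}
\le \left(\int_0^T 1\,dt\right)\left(\int_0^T g(\gamma'(t),\gamma'(t))\,dt\right)
= T\,I(T),
$$

with equality if and only if $\sqrt{g(\gamma'(t),\gamma'(t))}$ is constant in $t$. Hence $I(T) \ge l_g(\gamma)^2/T$ for every parametrization of the same curve, and the bound is attained only when the curve is traversed at the constant speed $l_g(\gamma)/T$. Any speeding up or slowing down along the way strictly increases the information functional.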

While aspects of this speed constraint are still unclear to us, an analogy that may be useful is to regard the target as moving through a “tensorial information fluid”. Moving slower relative to the fluid will result in “pressure” building behind, requiring information (energy) to be expended to maintain the slower speed. Moving faster also requires more information to push through the slower moving fluid. In the fluid dynamics analogy, the energy expended in moving through a fluid is proportional to the square of the difference in speed between the fluid and the object. The local energy is proportional to the difference between actual speed and the speed desired by the geodesic; that is, the speed that minimizes the energy functional. Pursuing the fluid dynamics analogy, the density of the fluid mediates the relationship between the energy and the relative speed:

$$E \propto g(\delta v, \delta v)$$

In particular, the scalar curvature, which depends on $g$, influences the energy and hence the information flow. We will explore this issue further in a future publication.

## 3 The Information of Sensor Configurations

A sensor configuration $\Gamma$ is a set of sensor parameters. The Fisher-Rao metric $g$ can be viewed as a function of $\Gamma$ as well as $\theta$, the location of the target. Calculating the likelihood that a measurement came from one configuration rather than another requires evaluating the likelihood at the target location $\theta$, which is difficult as the value of $\theta$ is not known exactly. Measurements can be used to construct an estimate $\hat\theta$; however, the distribution of this estimate is hard to quantify and even harder to calculate. Instead, here, the maximum entropy distribution is used. This is normally distributed with mean $\hat\theta$ and covariance $g^{-1}(\Gamma,\hat\theta)$, the inverse of the Fisher information metric at the estimated target location.

The information gain due to the sensor configuration is now $D(p\|1)$, because there was no prior information about the location (the uniform distribution) before the sensors were configured, compared with the maximum entropy distribution after. (Note that the uniform distribution is, in general, an improper prior in the Bayesian sense unless the manifold is of finite volume. It may be necessary, therefore, to restrict attention to some (compact) submanifold of $M$ for the divergence to be well-defined; see also the discussion below equation (9).) Evaluating this gives

$$D(p\|1) = \log\left((2\pi e)^{n} \det g^{-1}(\Gamma,\hat\theta)\right). \tag{8}$$

The Fisher Information metric for this divergence can be calculated from

$$G(h,k) = E\left[d_g^{2} D(p\|1)\right] = \int_M \mathrm{tr}\left(g^{-1} h\, g^{-1} k\right)\mathrm{vol}(g)\,d\mu, \tag{9}$$

where $h$ and $k$ are tangent vectors to the space of metrics. The integral defining (9) may not converge for non-compact $M$, so restriction to a compact submanifold $\Omega$ of $M$ is assumed throughout as necessary (cf. Figure 1).

### 3.1 The Manifold of Riemannian Metrics

The set of all Riemannian metrics over a manifold $M$ can itself be imbued with the structure of an infinite-dimensional Riemannian manifold [15, 11], which we call $\mathcal{M}$. Points of $\mathcal{M}$ are Riemannian metrics on $M$; i.e. each point $g \in \mathcal{M}$ bijectively corresponds to a positive-definite, symmetric $(0,2)$-tensor field on $M$. Under reasonable assumptions, an $L^2$ metric on $\mathcal{M}$ [8, 15] may be defined as:

$$G(h,k) = \int_M \mathrm{tr}\left(g^{-1} h\, g^{-1} k\right)\mathrm{vol}(g)\,d\mu, \tag{10}$$

which should be compared to (9).

It should be noted that the points of the manifold $\mathcal{M}$ comprise all of the metrics that can be put on $M$, most of which are irrelevant for our physical sensor management problem. We restrict consideration to a sub-manifold of $\mathcal{M}$ consisting only of those Riemannian metrics that are members of the family of Fisher information matrices (4) corresponding to feasible sensor configurations. This particular sub-manifold is called the ‘sensor’ or ‘configuration’ manifold [23, 24] and is denoted by $\mathcal{M}(\Gamma)$, where the objects $h$ and $k$ are now elements of the finite-dimensional tangent space $T\mathcal{M}(\Gamma)$. The dimension of $\mathcal{M}(\Gamma)$ is $N \times \dim(M)$, since each point of $\mathcal{M}(\Gamma)$ is uniquely described by the locations of the $N$ sensors, each of which requires $\dim(M)$ numbers to denote its coordinates. A visual description of these spaces is given in Figure 1. For all cases considered in this paper, the integral defined in (10) is well-defined and converges (see however the discussion in ).

For the purposes of computation, it is convenient to have an expression for the metric tensor components of (10) in some local coordinate system. In particular, in a given coordinate basis $z^i$ over $\mathcal{M}(\Gamma)$ (not to be confused with the coordinates on $M$; see Section 4), the metric (10) reads

$$G(h,k) = \int_\Omega g^{nk} g^{\ell m} h_{mn} k_{\ell k}\,\mathrm{vol}(g), \tag{11}$$

where $h$ and $k$ are tangent vectors in $T\mathcal{M}(\Gamma)$ given in coordinates by

$$T\mathcal{M}(\Gamma) = \mathrm{span}\left\{\frac{\partial}{\partial z^i}\, g_{mn}\right\}_{i=1}^{\dim \mathcal{M}(\Gamma)}. \tag{12}$$

From the explicit construction (11), all curvature quantities of $\mathcal{M}(\Gamma)$, such as the Riemann tensor and Christoffel symbols, can be computed.
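As a quick consistency check between the trace form of the integrand in (10) and the index form in (11), the sketch below evaluates both at a single point for 2x2 matrices. The helper names and the sample matrices are hypothetical; the two expressions agree whenever $h$ and $k$ are symmetric, as tangent vectors to the space of metrics are.

```python
def inverse2(g):
    """Inverse of a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def integrand_trace(g, h, k):
    """tr(g^-1 h g^-1 k), the pointwise integrand of (10)."""
    gi = inverse2(g)
    prod = matmul2(matmul2(gi, h), matmul2(gi, k))
    return prod[0][0] + prod[1][1]


def integrand_index(g, h, k):
    """g^{nq} g^{lm} h_{mn} k_{lq}, the pointwise integrand of (11).

    The summation index written k in (11) is renamed q here to avoid a
    clash with the tangent vector k.
    """
    gi = inverse2(g)
    return sum(gi[n][q] * gi[l][m] * h[m][n] * k[l][q]
               for n in range(2) for q in range(2)
               for l in range(2) for m in range(2))


if __name__ == "__main__":
    g = [[2.0, 0.5], [0.5, 1.0]]
    h = [[1.0, 0.2], [0.2, 3.0]]
    k = [[0.5, -1.0], [-1.0, 2.0]]
    print(integrand_trace(g, h, k), integrand_index(g, h, k))
```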

### 3.2 D-Optimal Configurations

D-optimality in the context of the sensor manifold described above is discussed in this section. Suppose that the sensors are arranged in some arbitrary configuration $\Gamma_0$. The sensors now move in anticipation of target behaviour; a prior distribution is adopted to localize a target position $\theta$. The sensors move continuously to a new configuration $\Gamma_1$, where $\Gamma_1$ is determined by maximizing the determinant of $G$, i.e. $\Gamma_1$ corresponds to the sensor locations for which $\det(G)$, computed from (11), is maximized. The physical situation is depicted graphically in Figure 2. This process can naturally be extended to the case where real measurement data is used. In particular, as measurements are drawn, a (continuously updated) posterior distribution for $\theta$ becomes available, and this can be used to update the Fisher metric $g$ (and hence the metric $G$) to define a sequence of optimal configurations; see Section 5.

Figure 2: Graphical illustration of D-optimal sensor dynamics; a sensor configuration $\Gamma_0$ evolves to a new configuration $\Gamma_1$ by moving the $N$ sensors in $\Omega$-space to new positions that are determined by maximizing the determinant of the metric $G$, given by equation (11), on the sensor manifold $\mathcal{M}(\Gamma)$. Each sensor $\lambda_i$ traverses a path $\gamma_i$ through $\Omega$-space to end up in the appropriate position constituting $\Gamma_1$. As shown in Section 3.3, the paths $\gamma_i$ are entropy-minimizing if they are geodesic on the sensor manifold $\mathcal{M}(\Gamma)$. Note that the target is shown as stationary in this particular illustration.

### 3.3 Geodesics for the Configuration Manifold

While D-optimality allows us to determine where the sensors should move given some prior, it provides us with no guidance on which path the sensors should traverse, as they move through $\Omega$, to reach their new, optimized positions.

A path $\Upsilon$ from one configuration to another is a set of paths $\gamma_i$, one for each sensor $\lambda_i$, from its initial to its final location. Varying the sensor locations is equivalent to varying the metric $g$ and the estimate of the target location $\hat\theta$. The information gain along $\Upsilon$ is then

$$\int_\Upsilon G_{g(t)}\big(g'(t), g'(t)\big)\,dt, \tag{13}$$

and the extremal paths are again the geodesics of the metric $G$. The speed constraint observed earlier in Section 2.3 is also in place here for the sensor geodesics, and is given by

$$\frac{dl_G}{dt} = \sqrt{G_{g(t)}\big(g'(t), g'(t)\big)}. \tag{14}$$

Again, this leads to the conclusion that there are kinematic constraints on the rate of change of sensor parameters that lead to the collection of the maximum amount of information.

## 4 Configuration for Bearings-only Sensors

To illustrate the mathematical machinery developed in the previous sections, consider first the configuration metric for two bearings-only sensors. The physical space $\Omega$, where the sensors and the target reside, is chosen to be the square $[-10,10]\times[-10,10]$. Effectively, we assume an uninformative (uniform) prior over the square $\Omega$.

The goal is to estimate a position $\theta$ from bearings-only measurements taken by the sensors, as in previous work [23, 24]. We assume that measurements are drawn from a Von Mises distribution,

$$M_n \sim p_n(\cdot|\theta) = \frac{e^{\kappa\cos[\,\cdot\, - \arg(\theta-\lambda_n)]}}{2\pi I_0(\kappa)}, \tag{15}$$

where $\kappa$ is the concentration parameter, $I_0$ is the modified Bessel function of the first kind of order zero, and $\lambda_n$ is the location of the $n$-th sensor in Cartesian coordinates. Note that, in reality, the parameter $\kappa$ will depend on location, since the signal-to-noise ratio will decrease as the sensors move farther away from the target. This is beyond the scope of this paper and will be addressed in future work.

For the choice (15), the Fisher metric (4) can be computed easily, and has components

 (16)
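The sketch below evaluates (4) for the likelihood (15) directly rather than transcribing (16), so it should be read as an independent derivation under stated assumptions, not as the paper's own expression. Writing $\beta_n(\theta) = \arg(\theta - \lambda_n)$, the score of (15) is $\kappa \sin(x - \beta_n)\,\nabla\beta_n$, and the standard Von Mises identity $E[\sin^2(x-\beta_n)] = I_1(\kappa)/(\kappa I_0(\kappa))$ gives $g = \kappa\,\frac{I_1(\kappa)}{I_0(\kappa)} \sum_n \nabla\beta_n \otimes \nabla\beta_n$, assuming a common, location-independent $\kappa$. The function names are hypothetical, and the Bessel functions are computed from their power series to keep the block dependency-free.

```python
import math


def bessel_i(order, x, terms=60):
    """Modified Bessel function of the first kind, via its power series.

    Adequate for moderate x; a library routine would be preferable for
    large arguments.
    """
    return sum((x / 2.0) ** (2 * m + order)
               / (math.factorial(m) * math.factorial(m + order))
               for m in range(terms))


def bearing_fisher_metric(target, sensors, kappa):
    """Fisher metric at `target` for independent Von Mises bearing sensors."""
    gain = kappa * bessel_i(1, kappa) / bessel_i(0, kappa)
    g = [[0.0, 0.0], [0.0, 0.0]]
    for sx, sy in sensors:
        dx, dy = target[0] - sx, target[1] - sy
        r2 = dx * dx + dy * dy
        # Gradient of arg(theta - lambda_n) with respect to theta.
        grad = (-dy / r2, dx / r2)
        for i in range(2):
            for j in range(2):
                g[i][j] += gain * grad[i] * grad[j]
    return g


if __name__ == "__main__":
    print(bearing_fisher_metric((-1.0, -3.0), [(-7.0, -6.0), (0.0, 1.0)], 1.0))
```

Each sensor contributes a rank-one term, so with two sensors the matrix is singular whenever the target is collinear with both of them, and it grows without bound as the target approaches a sensor.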

### 4.1 Target Geodesics

A geodesic, $\gamma$, starting at $p$ with initial direction $v$ for a manifold with metric $g$ is the solution of the coupled second-order initial value problem for the components $\gamma^i$:

$$\frac{d^2\gamma^i}{dt^2} = -\Gamma^i_{jk}\,\frac{d\gamma^j}{dt}\frac{d\gamma^k}{dt}, \qquad \gamma(0)=p,\quad \gamma'(0)=v, \tag{17}$$

where $\Gamma^i_{jk}$ are the Christoffel symbols for the metric $g$.

Figure 3 shows directly integrated solutions to the geodesic equation (17) for a target at $(-1,-3)$ and sensors at $(-7,-6)$ and $(0,1)$. The differing paths correspond to the initial direction vector $(\cos\theta,\sin\theta)$ of the target, varying as $\theta$ varies from $0$ to $2\pi$ radians in steps of $0.25$ radians.
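Direct integration of the geodesic equation can be sketched as follows, assuming a two-dimensional manifold, Christoffel symbols obtained by central finite differences of a user-supplied `metric(point)` function, and a fixed-step fourth-order Runge-Kutta scheme. None of these choices is stated in the paper; they are illustrative. The hyperbolic half-plane is used as a test metric only because its vertical geodesic $y(t) = e^{t}$ is known in closed form.

```python
def christoffel(metric, p, h=1e-5):
    """Gamma[i][j][k] at p from central differences of a 2x2 metric."""
    dg = []  # dg[l][j][k] approximates d g_{jk} / d p^l
    for l in range(2):
        hi, lo = list(p), list(p)
        hi[l] += h
        lo[l] -= h
        gp, gm = metric(hi), metric(lo)
        dg.append([[(gp[j][k] - gm[j][k]) / (2 * h) for k in range(2)]
                   for j in range(2)])
    g = metric(p)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = ((g[1][1] / det, -g[0][1] / det), (-g[1][0] / det, g[0][0] / det))
    return [[[0.5 * sum(ginv[i][l] * (dg[j][l][k] + dg[k][l][j] - dg[l][j][k])
                        for l in range(2))
              for k in range(2)] for j in range(2)] for i in range(2)]


def rhs(metric, state):
    """First-order form of the geodesic equation: (x, y, vx, vy) -> derivative."""
    x, y, vx, vy = state
    gam = christoffel(metric, (x, y))
    v = (vx, vy)
    acc = [-sum(gam[i][j][k] * v[j] * v[k] for j in range(2) for k in range(2))
           for i in range(2)]
    return (vx, vy, acc[0], acc[1])


def integrate_geodesic(metric, p, v, t_end, steps=200):
    """Integrate from gamma(0)=p, gamma'(0)=v with classical RK4."""
    state = (p[0], p[1], v[0], v[1])
    dt = t_end / steps
    path = [state[:2]]
    for _ in range(steps):
        k1 = rhs(metric, state)
        k2 = rhs(metric, tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = rhs(metric, tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = rhs(metric, tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        path.append(state[:2])
    return path


def half_plane(p):
    """Hyperbolic half-plane metric (dx^2 + dy^2) / y^2, used as a check."""
    return ((1.0 / p[1] ** 2, 0.0), (0.0, 1.0 / p[1] ** 2))


if __name__ == "__main__":
    print(integrate_geodesic(half_plane, (0.0, 1.0), (0.0, 1.0), 1.0)[-1])
```

Supplying a function that returns the Fisher metric at a target location in place of `half_plane`, and sweeping the initial direction, would give a family of curves of the kind described for Figure 3; the step size would need care near the sensors, where the metric is singular.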

An alternative way to numerically compute the geodesics connecting two points on the manifold is to use the Fast Marching Method (FMM) [31, 32, 38]. Since the Fisher-Rao Information Metric is a Riemannian metric on the Riemannian manifold $M$, one can show that the geodesic distance map $u$, the geodesic distance from the initial point $p$ to a point $\theta$, satisfies the Eikonal equation

$$\|\nabla u(\theta)\|_{g_\theta^{-1}} = 1 \tag{18}$$

with initial condition $u(p) = 0$. By using a mesh over the parameter space, the (isotropic or weakly anisotropic) Eikonal equation (18) can be solved numerically by Fast Marching. The geodesic is then extracted by numerically integrating the ordinary differential equation

$$\frac{d\gamma(t)}{dt} = -\eta_t\, g^{-1}_{\theta(t)} \nabla u(\theta(t)), \tag{19}$$

where $\eta_t$ is the mesh size. The computational complexity of FMM is $O(N\log N)$, where $N$ is the total number of mesh grid points. For Eikonal equations with strong anisotropy, a generalized version of FMM, the Ordered Upwind Method (OUM), is preferred.
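The Fast Marching idea can be sketched in a few lines for the isotropic special case, that is, assuming $g = c(\theta)^2 I$ so that (18) reduces to $|\nabla u| = c$. This is a deliberate simplification: the Fisher metric is generally anisotropic and would need the anisotropic variants mentioned above. The grid layout, the first-order upwind update, and the function name are all assumptions of this sketch.

```python
import heapq
import math


def fast_marching(cost, source, h=1.0):
    """First-order Fast Marching for |grad u| = cost on a rectangular grid.

    `cost` is a 2-D list of positive local costs, `source` an (i, j) index
    where u = 0, and `h` the grid spacing. Returns the distance map u.
    """
    rows, cols = len(cost), len(cost[0])
    u = [[math.inf] * cols for _ in range(rows)]
    accepted = [[False] * cols for _ in range(rows)]

    def axis_min(cells):
        # Smallest already-accepted value among the two neighbours on one axis.
        vals = [u[p][q] for p, q in cells
                if 0 <= p < rows and 0 <= q < cols and accepted[p][q]]
        return min(vals, default=math.inf)

    def local_update(i, j):
        a = axis_min(((i - 1, j), (i + 1, j)))
        b = axis_min(((i, j - 1), (i, j + 1)))
        f = h * cost[i][j]
        if abs(a - b) < f:
            # Both axes contribute: solve (u-a)^2 + (u-b)^2 = f^2 for u.
            return 0.5 * (a + b + math.sqrt(2.0 * f * f - (a - b) ** 2))
        return min(a, b) + f

    u[source[0]][source[1]] = 0.0
    heap = [(0.0, source[0], source[1])]
    while heap:
        _, i, j = heapq.heappop(heap)
        if accepted[i][j]:
            continue  # stale heap entry
        accepted[i][j] = True
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols and not accepted[ni][nj]:
                new = local_update(ni, nj)
                if new < u[ni][nj]:
                    u[ni][nj] = new
                    heapq.heappush(heap, (new, ni, nj))
    return u


if __name__ == "__main__":
    grid = [[1.0] * 21 for _ in range(21)]
    dist = fast_marching(grid, (10, 10))
    print(dist[10][20], dist[20][20])
```

With unit cost the result approximates Euclidean distance: it is exact along the grid axes and overestimates along diagonals, an error that shrinks as the mesh is refined. Geodesics would then be traced by descending the gradient of `u`, as in (19).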

Compare Figure 3 with Figure 4 which uses a Fast Marching algorithm to calculate the geodesic distance from the same point.

Figure 5 shows the speed required along a geodesic travelling in the direction of the vector $(1,-1)$. In the case of these bearings-only sensors, the trade-off over speed is between time-on-target and rate of change of relative angle between the sensors and the target. Travelling slowly means more time for the sensors to integrate measurements but the change in angle is slower, resulting in more accurate position estimates. Conversely, faster movement than the geodesic speed results in larger change in angle measurements but less time for measurements, again resulting in more accurate position estimates. Only at the geodesic speed is the balance reached and the minimum information criterion achieved.

Figure 3: Solutions to the geodesic equation on $\Omega=[-10,10]\times[-10,10]$ for a target starting at $(-1,-3)$ and sensors at $(-7,-6)$ and $(0,1)$. The differing paths correspond to the initial direction vector $(\cos\theta,\sin\theta)$ of the target, varying as $\theta$ varies from $0$ to $2\pi$ radians in steps of $0.25$ radians.

Figure 4: Geodesic distance on $\Omega=[-10,10]\times[-10,10]$ from the point $(-1,-3)$ with sensors at $(-7,-6)$ and $(0,1)$. The distance was calculated using a Fast Marching formulation of the geodesic equation. The fact that the geodesics follow the gradient of this distance allows comparison with Figure 3.

Figure 5: Geodesic speed for each point in $\Omega=[-10,10]\times[-10,10]$ for targets departing in the direction of the vector $(1,-1)$.

### 4.2 Configuration Metric Calculations

The coordinates on $\mathcal{M}(\Gamma)$ are the Cartesian coordinates of the two sensor locations $\lambda_1$ and $\lambda_2$. The sensor management problem amounts to, given some initial configuration, identifying a choice of $\lambda_1, \lambda_2$ for which the determinant of (11) is maximized, and traversing geodesics in $\mathcal{M}(\Gamma)$, starting at the initial locations and ending at the D-optimal locations; see Figure 2. We assume that the target location is given by an uninformative prior distribution; that is, $p(\theta)$ is constant for all $\theta \in \Omega$.

To make the problem tractable, we consider a simple case where one of the sensor trajectories is fixed; that is, we consider a 2-dimensional submanifold of $\mathcal{M}(\Gamma)$ parametrized by the coordinates of $S_2$ only. Figure 6 shows a contour plot of $\det(G)$, as a function of the position of $S_2$, for the case where $S_1$ moves from $(0,1)$ (yellow dot) to $(2,3)$ (black dot). The second sensor $S_2$ begins at the point $(-6,-7)$ (red dot). Figure 6 demonstrates that, provided $S_1$ moves from $(0,1)$ to $(2,3)$, $\det(G)$ is maximized at the point $(-1,-5.5)$, implying that moving $S_2$ to $(-1,-5.5)$ is the D-optimal choice. The geodesic path beginning at $(0,1)$ and ending at $(2,3)$ is shown by the dashed yellow curve; this is the information-maximizing path of $S_1$ through $\Omega$. Similarly for $S_2$, the geodesic path beginning at $(-6,-7)$ and ending at $(-1,-5.5)$ is shown by the dotted red curve.

Figure 6: Contour plot of $\det(G)$, for $G$ given by (11), for the case where $S_1$ moves from $(0,1)$ (yellow dot) to $(2,3)$ (black dot). The second sensor $S_2$ begins at the point $(-6,-7)$ (red dot). Brighter shades indicate a greater value of $\det(G)$; $\det(G)$ is maximum at $(-1,-5.5)$. The geodesic path linking the initial and final positions of $S_1$ is shown by the dashed yellow curve, while the dashed red curve shows the geodesic path linking $(-6,-7)$, the initial position of $S_2$, to the D-optimal location $(-1,-5.5)$.
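A rough numerical sketch of the 2x2 configuration metric on this submanifold is given below. It rests on several assumptions that are not taken from the paper: the bearings-only Fisher metric is the rank-one sum derived from (4) and (15) with the overall factor absorbed into a hypothetical `gain` parameter, $S_1$ is held at a single position rather than moved, $\mathrm{vol}(g)$ is read as $\sqrt{\det g}$ times the cell area of a uniform target grid, tangent vectors are central finite differences of $g$ in the coordinates of $S_2$, and grid points where $g$ is nearly singular are skipped using an arbitrary threshold.

```python
import math


def bearing_metric(target, sensors, gain=1.0):
    """Rank-one-sum Fisher metric for bearings-only sensors (assumed form)."""
    g = [[0.0, 0.0], [0.0, 0.0]]
    for sx, sy in sensors:
        dx, dy = target[0] - sx, target[1] - sy
        r2 = dx * dx + dy * dy
        grad = (-dy / r2, dx / r2)
        for i in range(2):
            for j in range(2):
                g[i][j] += gain * grad[i] * grad[j]
    return g


def det2(g):
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]


def weighted_inner(g, h, k):
    """tr(g^-1 h g^-1 k) * sqrt(det g) for 2x2 matrices."""
    det = det2(g)
    gi = ((g[1][1] / det, -g[0][1] / det), (-g[1][0] / det, g[0][0] / det))
    trace = sum(gi[a][b] * h[b][c] * gi[c][d] * k[d][a]
                for a in range(2) for b in range(2)
                for c in range(2) for d in range(2))
    return trace * math.sqrt(det)


def configuration_metric(s1, s2, grid, cell_area, eps=1e-4, min_det=1e-10):
    """Approximate G of (11) in the coordinates of s2, with s1 held fixed."""
    G = [[0.0, 0.0], [0.0, 0.0]]
    for theta in grid:
        g = bearing_metric(theta, [s1, s2])
        if det2(g) < min_det:
            continue  # target nearly collinear with both sensors
        tangents = []
        for axis in range(2):
            hi, lo = list(s2), list(s2)
            hi[axis] += eps
            lo[axis] -= eps
            gp = bearing_metric(theta, [s1, hi])
            gm = bearing_metric(theta, [s1, lo])
            tangents.append([[(gp[i][j] - gm[i][j]) / (2 * eps)
                              for j in range(2)] for i in range(2)])
        for i in range(2):
            for j in range(i, 2):
                G[i][j] += weighted_inner(g, tangents[i], tangents[j]) * cell_area
    G[1][0] = G[0][1]
    return G


if __name__ == "__main__":
    step = 0.5
    grid = [(-9.75 + step * i, -9.75 + step * j)
            for i in range(40) for j in range(40)]
    G = configuration_metric((0.0, 1.0), (-6.0, -7.0), grid, step * step)
    print(G, det2(G))
```

Evaluating `det2(G)` over a grid of candidate positions for the second sensor would give a surface analogous in spirit to the contour plot described for Figure 6, though the values depend strongly on the quadrature grid and on the singularity threshold, so no numerical agreement with the figure should be expected.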

### 4.3 Visualizing Configuration Geodesics

Figure 7 shows solutions to the geodesic equation on $\Omega$ for a sensor starting at $(0,1)$ with the other sensor stationary at $(-7,-6)$ and the target at $(-1,-3)$. The differing paths correspond to varying $\phi$ from $0$ to $2\pi$ in steps of $0.25$ radians in the sensor's initial direction vector $(\cos\phi,\sin\phi)$. It is interesting to note the concentration in direction of the geodesics in spite of the even distribution of initial directions. Both groups improve information, in a way that is well understood for bearings-only sensors, by increasing the rate of change of bearing. One group achieves this by closing with the target, the other by moving at right angles. This should be compared with Figure 8, which uses a Fast Marching algorithm [19, 27] to calculate the geodesic distance from the same point. Figure 9 shows the speed required along a geodesic travelling through each point in the direction of the vector (1,1).

Figure 7: Solutions to the geodesic equation on $\Omega=[-10,10]\times[-10,10]$ for a sensor starting at $(0,1)$ with the other sensor stationary at $(-7,-6)$ and the target at $(-1,-3)$. The differing paths correspond to the initial direction vector $(\cos\phi,\sin\phi)$ of the sensor, varying as $\phi$ varies from $0$ to $2\pi$ radians in steps of $0.25$ radians.

Figure 8: Geodesic distance on $\Omega=[-10,10]\times[-10,10]$ for a sensor starting at $(0,1)$ with the other sensor stationary at $(-7,-6)$ and using an uninformative prior for the target. The distance was calculated using a Fast Marching formulation of the geodesic equation. The fact that the geodesics follow the gradient of this distance allows comparison with Figure 7.

Figure 9: Geodesic speed for a sensor moving along a geodesic in direction (1,-1) at each point in $\Omega=[-10,10]\times[-10,10]$. The other sensor is stationary at $(-7,-6)$, and an uninformative prior is used for the target.

## 5 Discussion

We consider the problem of sensor management from an information-geometric standpoint [1, 2]. A physical space houses a target, with an unknown position, and a collection of mobile sensors, each of which takes measurements with the aim of gaining information about target location [23, 24, 21]. The measurement process is parametrized by the relative positions of the sensors. For example, if one considers an array of sensors that take bearings-only measurements to the target (15), the amount of information that can be extracted regarding target location clearly depends on the angles between the sensors. In general, we illustrate that in order to optimize the amount of information the sensors can obtain about the target, the sensors should move to positions which maximize the norm of the volume form (‘D-optimality’) on a particular manifold imbued with a metric (11) which measures the distance (information content difference) between Fisher matrices [30, 14]. We also show that, if the sensors move along geodesics [with respect to (11)] to reach the optimal configuration, the amount of information that they give away to the target is minimized. This paves the way for (future) discussions about game-theoretic scenarios where both the target and the sensors are competitively trying to acquire information about one another from stochastic measurements; see e.g. [29, 16] for a discussion on such games. Differential games along these lines will be addressed in forthcoming work.

We hope that this work may eventually have realistic applications to signal processing problems involving parameter estimation using sensors. We have demonstrated that there is a theoretical way of choosing sensor positions, velocities, and possibly other parameters in an optimal manner so that the maximum amount of useful data can be harvested from a sequence of measurements taken by the sensors. For example, with sensors that take continuous or discrete measurements, this potentially allows one to design a system that minimizes the expected amount of time taken to localize (with some given precision) the position of a target. If the sensors move along paths that are geodesic with respect to (11), then the target, in some sense, learns the least about its trackers. This allows the sensors to prevent either intentional or unintentional evasive manoeuvres; a unique aspect of information-geometric considerations. Ultimately, these ideas may lead to improvements on search or tracking strategies available in the literature (e.g. [25, 34]). Though we have only considered simple sensor models in this paper, the machinery can, in principle, be adapted to systems of arbitrary complexity. It would certainly be worth testing the theoretical ideas presented in this paper experimentally using various sensor setups.

## Acknowledgements

This work was supported in part by the US Air Force Office of Scientific Research (AFOSR) under Grant No. FA9550-12-1-0418. Simon Williams acknowledges an Outside Studies Program grant from Flinders University. All the authors declare that they have no further conflict of interest.

## References

• Amari  Amari S (2001) Information geometry on hierarchy of probability distributions. IEEE Transactions on Information Theory 47(5):1701–1711

• Amari and Nagaoka  Amari Si, Nagaoka H (2007) Methods of information geometry, vol 191. American Mathematical Soc.
• Amari et al  Amari Si, Cichocki A, Yang HH (1996) A new learning algorithm for blind signal separation. In: Advances in neural information processing systems, pp 757–763
• Beldjoudi et al  Beldjoudi G, Rebuffel V, Verger L, Kaftandjian V, Rinkel J (2012) An optimised method for material identification using a photon counting detector. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 663(1):26–36
• Bell  Bell MR (1993) Information theory and radar waveform design. IEEE Transactions on Information Theory 39(5):1578–1597
• Carmo  Carmo MPd (1992) Riemannian geometry. Birkhäuser
• Carter et al  Carter KM, Raich R, Finn WG, Hero III AO (2011) Information-geometric dimensionality reduction. IEEE Signal Processing Magazine 28(2):89
• Clarke  Clarke B (2009) The completion of the manifold of Riemannian metrics with respect to its metric. PhD thesis
• Cochran and Hero  Cochran D, Hero AO (2013) Information-driven sensor planning: Navigating a statistical manifold. In: Proceedings of IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp 1049–1052
• Cover and Thomas  Cover TM, Thomas JA (2012) Elements of Information Theory. John Wiley & Sons
• DeWitt  DeWitt BS (1967) Quantum theory of gravity. I. The canonical theory. Physical Review 160(5):1113
• Donoho  Donoho DL (2006) Compressed sensing. IEEE Transactions on Information Theory 52(4):1289–1306
• Forbes et al  Forbes C, Evans M, Hastings N, Peacock B (2011) Statistical Distributions. John Wiley & Sons
• Ford et al  Ford I, Titterington D, Kitsos CP (1989) Recent advances in nonlinear experimental design. Technometrics 31(1):49–60
• Gil-Medrano and Michor  Gil-Medrano O, Michor PW (1991) The Riemannian manifold of all Riemannian metrics. Quarterly Journal of Mathematics 42:183–202
• Hamadene and Lepeltier  Hamadene S, Lepeltier JP (1995) Zero-sum stochastic differential games and backward equations. Systems & Control Letters 24(4):259–263
• Jeffreys  Jeffreys H (1946) An invariant form for the prior probability in estimation problems. Proceedings of the Royal Society of London Series A 186(1007):453–461

• Kershaw and Evans  Kershaw DJ, Evans RJ (1994) Optimal waveform selection for tracking systems. IEEE Transactions on Information Theory 40(5):1536–1550
• Kimmel and Sethian  Kimmel R, Sethian JA (1998) Computing geodesic paths on manifolds. Proceedings of the National Academy of Sciences 95(15):8431–8435
• Kullback and Leibler  Kullback S, Leibler RA (1951) On information and sufficiency. The Annals of Mathematical Statistics 22(1):79–86
• Mentre et al  Mentre F, Mallet A, Baccar D (1997) Optimal design in random-effects regression models. Biometrika 84(2):429–442
• Montúfar et al  Montúfar G, Rauh J, Ay N (2014) On the Fisher metric of conditional probability polytopes. Entropy 16(6):3207–3233
• Moran et al [2012a] Moran B, Howard S, Cochran D (2012a) An information-geometric approach to sensor management. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 5261–5264
• Moran et al [2012b] Moran W, Howard SD, Cochran D, Suvorova S (2012b) Sensor management via Riemannian geometry. In: Proceedings of 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp 358–362
• Padula and Kincaid  Padula SL, Kincaid RK (1999) Optimization strategies for sensor and actuator placement. Langley Research
• Peter and Rangarajan  Peter A, Rangarajan A (2006) Shape analysis using the Fisher-Rao Riemannian metric: unifying shape representation and deformation. In: Proceedings of 3rd IEEE International Symposium on Biomedical Imaging: Nano to Macro, pp 1164–1167
• Peyré et al  Peyré G, Péchaud M, Keriven R, Cohen LD (2010) Geodesic methods in computer vision and graphics. Foundations and Trends® in Computer Graphics and Vision 5(3–4):197–397

• Rao  Rao CR (1945) Information and accuracy attainable in estimation of statistical parameters. Bulletin of the Calcutta Mathematical Society 37:81–91
• Rhodes and Luenberger  Rhodes I, Luenberger D (1969) Differential games with imperfect state information. IEEE Transactions on Automatic Control 14(1):29–38
• Sebastiani and Wynn  Sebastiani P, Wynn HP (1997) Bayesian experimental design and Shannon information. In: Proceedings of the Section on Bayesian Statistical Science, vol 44, pp 176–181

• Sethian  Sethian JA (1996) A fast marching level set method for monotonically advancing fronts. Proceedings of the National Academy of Sciences 93(4):1591–1595
• Sethian  Sethian JA (1999) Fast marching methods. SIAM Review 41(2):199–235
• Sethian and Vladimirsky  Sethian JA, Vladimirsky A (2003) Ordered upwind methods for static Hamilton-Jacobi equations: Theory and algorithms. SIAM Journal on Numerical Analysis 41(1):325–363
• Sheng and Hu  Sheng X, Hu YH (2005) Maximum likelihood multiple-source localization using acoustic energy measurements with wireless sensor networks. IEEE Transactions on Signal Processing 53(1):44–53
• Sowelam and Tewfik  Sowelam SM, Tewfik AH (2000) Waveform selection in radar target classification. IEEE Transactions on Information Theory 46(3):1014–1029
• Steinberg and Hunter  Steinberg DM, Hunter WG (1984) Experimental design: review and comment. Technometrics 26(2):71–97
• Suvorova et al  Suvorova S, Moran B, Howard SD, Cochran D (2013) Control of sensing by navigation on information gradients. In: Proceedings of IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp 197–200
• Tsitsiklis  Tsitsiklis JN (1995) Efficient algorithms for globally optimal trajectories. IEEE Transactions on Automatic Control 40(9):1528–1538
• Vallisneri  Vallisneri M (2008) Use and abuse of the Fisher information matrix in the assessment of gravitational-wave parameter-estimation prospects. Physical Review D 77(4):042001
• Walter and Pronzato  Walter É, Pronzato L (1990) Qualitative and quantitative experiment design for phenomenological models - a survey. Automatica 26(2):195–213