Owing to its wide range of applications, such as environmental monitoring, target/source motion tracking and localization, and agriculture, sensor management has been studied extensively in the robotics and automation literature in terms of communication management, sensor trajectory planning, etc. The closely related problems of sensor scheduling and sensor placement have also received much attention in the control community.
The problem of managing the trajectories of one or more sensor-equipped mobile robots to maximize the information gathered about a target system or process is known as Active Information Acquisition (AIA). The AIA problem is commonly formulated as a stochastic control problem in which the mutual information between sensor measurements and the target state is optimized. When the target’s motion dynamics are linear and driven only by Gaussian noise, and the sensor’s observation model is linear in the target state, it is well known that the stochastic AIA problem reduces to a deterministic optimal control problem for which open-loop solutions are optimal. Recent works have also established tree-search-based algorithms that efficiently approximate the optimal policy while maintaining suboptimality guarantees.
However, the assumptions made regarding the target motion model in the aforementioned works are often limiting. For example, the target may be subject to arbitrary unknown disturbances that are difficult, if not impossible, to interpret or model statistically. Typical examples in applications include systems under faults or attacks, tracking and localisation of targets subject to abrupt maneuvers, advanced vehicle applications under complex tire-ground interactions, estimation of unmeasured forces in grasping and manipulation, etc. In fact, filtering and estimation under arbitrary unknown inputs have received much attention in the control literature and have found numerous applications, including robustness/security analysis and synthesis of resilient autonomous robots and connected vehicle systems.
Motivated by the above considerations, here we consider targets whose dynamics are subject to arbitrary unknown disturbances. In this case, it is of interest to track the evolution of both the target and the unknown disturbances. To circumvent the complexities of such an approach, we formulate and solve the AIA problem for tracking the target state and analyse the resulting performance of both target state and unknown disturbance tracking. Firstly, we show that both the state and input error covariance update maps given in existing unknown input filtering works are concave and monotone. To the best of our knowledge, these properties have not previously been explored. Secondly, inspired by the concepts of Reduced Value Iteration (RVI), we propose a suboptimal solution to the AIA problem using Forward Value Iteration (FVI) with pruning according to an information dominance metric. Concrete suboptimality performance guarantees for tracking both the target state and the unknown disturbance are established. Finally, we use a target tracking example to show the merits of the proposed solution in comparison to a greedy policy.
II Preliminaries and Problem Formulation
II-A Preliminaries of filtering under unknown inputs
Consider a mobile sensor with discrete dynamics model:
where is the sensor state with being an dimensional state space with metric , and is the control input with as a finite space of admissible controls. Suppose there exists a target with linear time-varying motion model:
is the target state vector, the target process noise, i.e.
is normally distributed with zero-mean and covariance, represents arbitrary unknown inputs whose models or statistical properties are not assumed to be known and are known matrices of compatible dimensions. Without loss of generality, we assume that While in operation, the sensor has observation model:
where, is the measurement, is a known measurement matrix, and the measurement noise with . For brevity we drop dependence of , , on the sensor state in the remainder of the paper.
Unknown input estimation:
as the unknown input estimation error, the filtered state error, and their respective covariances.
) are unbiased estimates if and only if the initial state guess is unbiased and the unknown input filter gain satisfies
unknown input filter gain in the minimum variance sense is given by
where , and is the filtered state error covariance at time step . Given
, one may transform the state estimation problem into a standard Kalman filtering problem and find a resulting optimal gain matrix. The resulting optimal gain is in general non-unique . For simplicity, in this paper we take the choice
II-B Problem Formulation
Given an initial sensor state and a prior distribution of the target state, the problem of interest is to optimize the trajectory of the sensor over a planning horizon of length to best track the evolution of the target dynamics and the unknown input. Expanding upon the problem formulation in the existing AIA literature, we consider the following optimal control problem
where is a sequence of admissible controls, and , are the state and unknown input error covariance update maps defined in (11), with the first measurement taken at sampling instant .
Although the unknown inputs are not assumed to follow any specific probability distribution, one could give a statistical interpretation of the optimization problem (12) similar to that in existing works for the case without unknown inputs. This can be done by first posing the unknown input as a Gaussian noise process with a given variance and deriving the statistical interpretation of problem (12). The lack of prior information regarding the unknown input can then be expressed by letting this variance tend to infinity. Due to limited space, we will not pursue this point further.
Finding the optimal solution to problem (12) amounts to exploring the large space of sensor states and error covariances allowed by the admissible controls over a planning horizon and finding the optimal path via tree search. To obtain a compromise between complexity and optimality of search tree construction, we adopt the concepts of the RVI algorithm. Conceptually, if a set of nodes are sufficiently close in sensor configuration space (i.e. they δ-cross) and one node’s covariance is not as informative as that of nearby nodes (i.e. it is ε-algebraically redundant), it is discarded from the tree. This method reduces computational complexity and gives suboptimality bounds for the resulting solution. δ-crossing and ε-algebraic redundancy are formalized below.
 Two sensor trajectories δ-cross at time if for .
 Let and be a finite set with . Then a matrix is ε-algebraically redundant with respect to if there exists a set of nonnegative constants such that
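The two definitions above can be illustrated with a minimal sketch. It assumes a Euclidean metric on the sensor state space and the redundancy condition in the form commonly used in the RVI literature, namely that the candidate covariance plus ε times the identity dominates a convex combination of the other covariances in the positive semidefinite order. The function names, the restriction to symmetric 2x2 matrices, and the caller-supplied weights are all illustrative assumptions, not part of the paper.

```python
import math

def delta_cross(x1, x2, delta):
    """True if two sensor states lie within distance delta (Euclidean metric assumed)."""
    return math.dist(x1, x2) <= delta

def min_eig_sym2(m):
    """Smallest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    return (a + d) / 2.0 - math.hypot((a - d) / 2.0, b)

def is_redundant(sigma, others, alphas, eps, tol=1e-12):
    """Check sigma + eps*I >= sum_i alphas[i] * others[i] in the PSD order.

    Assumed form of the redundancy test; the convex weights are supplied by
    the caller rather than searched for.
    """
    if any(w < 0 for w in alphas) or abs(sum(alphas) - 1.0) > 1e-9:
        raise ValueError("alphas must be nonnegative and sum to one")
    gap = [[sigma[i][j] + (eps if i == j else 0.0)
            - sum(w * s[i][j] for w, s in zip(alphas, others))
            for j in range(2)] for i in range(2)]
    # The gap matrix is PSD exactly when its smallest eigenvalue is nonnegative.
    return min_eig_sym2(gap) >= -tol
```

In this sketch a larger ε makes the redundancy test easier to pass, so more nodes would be pruned, which matches the role of the tuning parameters described in the text.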
To prune nodes according to their algebraic redundancy and approximately solve (12), one must consider separately how informative δ-crossing nodes are for state evolution tracking and for unknown input evolution tracking. That is, Definition 2 must be checked for both the state and the unknown input error covariances. This may result in nodes that are highly informative for state evolution tracking being pruned because of a mediocre contribution to unknown input tracking, making suboptimality bounds for the resulting solution to (12) difficult to analyse.
However, the close relationship between state and unknown input error covariance update maps seen in (11) allows one to prune according to state tracking performance only while still having concrete performance guarantees for unknown input tracking. The following sections of the paper therefore address the reduced problem
and derive suboptimality bounds for both state and unknown input tracking that result from this simplified approach.
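The pruning idea behind the reduced problem can be sketched on a toy scalar example. Everything here is an assumption made for illustration: the sensor moves on an integer line, the covariance update is the standard scalar Kalman recursion used as a stand-in for the map in (11), the cost is the final covariance only, and `noise_var` is a hypothetical noise profile. With zero tolerances, pruning in the scalar case reduces to keeping the smallest covariance at each sensor position, and monotonicity of the update makes this lossless for a terminal cost.

```python
from itertools import product

CONTROLS = (-1, 0, 1)

def noise_var(x):
    """Hypothetical measurement noise variance, smallest when the sensor is at x = 3."""
    return 0.2 + abs(x - 3)

def update(p, x, a=1.0, q=0.5):
    """Scalar Kalman covariance update at sensor position x (stand-in map)."""
    r = noise_var(x)
    return a * a * p * r / (p + r) + q

def fvi_pruned(x0, p0, horizon):
    """Forward value iteration that keeps one node (smallest covariance) per position."""
    nodes = {x0: (p0, ())}
    for _ in range(horizon):
        children = {}
        for x, (p, seq) in nodes.items():
            for u in CONTROLS:
                new_x, new_p = x + u, update(p, x + u)
                # Nodes at the same position cross; the less informative one is dropped.
                if new_x not in children or new_p < children[new_x][0]:
                    children[new_x] = (new_p, seq + (u,))
        nodes = children
    x, (p, seq) = min(nodes.items(), key=lambda kv: kv[1][0])
    return x, p, seq

def brute_force(x0, p0, horizon):
    """Exhaustive search over all control sequences, for comparison."""
    best = None
    for seq in product(CONTROLS, repeat=horizon):
        x, p = x0, p0
        for u in seq:
            x += u
            p = update(p, x)
        if best is None or p < best[1]:
            best = (x, p, seq)
    return best
```

The pruned search visits a number of nodes that grows linearly with the horizon in this toy, whereas the exhaustive search grows exponentially. In the matrix case the positive semidefinite order is only partial, which is where nonzero tolerances and the resulting suboptimality bounds come into play.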
III Suboptimality bounds for state and unknown disturbance evolution tracking
The extension of RVI to tracking targets with unknown dynamics and the derivation of suboptimality bounds is our main focus. In our approach we solve (13) approximately with Algorithm 1 to find a control sequence without optimizing for unknown input evolution tracking. The suboptimality of state evolution tracking incurred by pruning nodes can then be upper bounded via a worst-case analysis. We then leverage the relationship between the state and unknown input estimation error covariance maps to derive corresponding bounds for unknown input evolution tracking. To begin, we require the following assumptions.
 The sensor motion model is Lipschitz continuous in with Lipschitz constant for every fixed , i.e. .
 For any two nodes , . Let , be the updated state estimation error covariances after applying control to each node. Then
, where for some . Note for some , if then .
III-A Suboptimality bounds for target state evolution tracking
The following properties of the state estimation covariance update map in (11) are key for performance analysis.
The state estimation covariance update map is:
Monotone: if then
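As a numerical illustration of what monotonicity and concavity of a covariance update map mean, the sketch below checks both properties on a grid. The map used is the standard scalar Kalman covariance recursion, chosen as a stand-in for the unknown input filter map in (11); the parameter values and helper names are hypothetical.

```python
def riccati(p, a=1.0, c=1.0, q=0.1, r=0.5):
    """One step of the scalar Kalman covariance recursion (stand-in map)."""
    filtered = p * r / (c * c * p + r)   # measurement update
    return a * a * filtered + q          # time update

def is_monotone(f, grid, tol=1e-12):
    """Check f(lo) <= f(hi) for consecutive points of the sorted grid."""
    pts = sorted(grid)
    return all(f(lo) <= f(hi) + tol for lo, hi in zip(pts, pts[1:]))

def is_concave(f, grid, lam=0.5, tol=1e-12):
    """Check f(lam*p1 + (1-lam)*p2) >= lam*f(p1) + (1-lam)*f(p2) on all grid pairs."""
    return all(f(lam * p1 + (1 - lam) * p2) + tol >= lam * f(p1) + (1 - lam) * f(p2)
               for p1 in grid for p2 in grid)

GRID = [0.0, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
```

A grid check of this kind is only a sanity test, not a proof; the matrix-valued statements rely on the operator arguments given in the appendix.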
It is important in our worst case analysis to consider recursive update of the error covariance over a long horizon. We therefore introduce the k-horizon mapping  , which maps the state error covariance matrix at time 0 to time according to the first elements of the control sequence :
Monotonicity and concavity of the k-horizon mapping naturally follow from Lemma 1 and the definition in (14). As a direct result of concavity, the k-horizon mapping is bounded by its first order Taylor approximation, i.e.
is the directional derivative of the k-horizon mapping at along an arbitrary direction . The directional derivative can therefore be interpreted as the impact an early perturbative error will have on the error covariance at a later time provided no further perturbations occur. This interpretation becomes pertinent for studying the consequences of pruning nodes if one frames the term in Definition 2 as a perturbative error. This motivates the study of the directional derivative.
The directional derivative of the state estimation covariance update map at along the arbitrary direction is given by
where is defined as in (11). The directional derivative of the k-horizon mapping at along an arbitrary direction is given by
, with .
Suppose such that , then we have
where and is the minimum eigenvalue of
is the minimum eigenvalue of.
The above bound implies that, provided the state error covariance is bounded for all time, the effect of a perturbation at an early time step decays exponentially as time evolves. Using the above results in a worst-case performance analysis yields an upper bound on the suboptimality of the state error covariance found by Algorithm 1. Denoting ,
Let be the peak state estimation error of the optimal trajectory, i.e. . Then we have
where , , .
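The exponential decay of an early perturbation, which underlies the bound above, can be observed numerically. The sketch below again uses the scalar Kalman covariance recursion as a stand-in for the map in (11), with hypothetical parameter values, and tracks the gap between a nominal covariance trajectory and one whose initial covariance is perturbed.

```python
def riccati(p, a=1.0, c=1.0, q=0.1, r=0.5):
    """One step of the scalar Kalman covariance recursion (stand-in map)."""
    return a * a * p * r / (c * c * p + r) + q

def perturbation_gaps(p0, eps, steps):
    """Gap between covariance trajectories started at p0 and at p0 + eps."""
    p, p_pert, gaps = p0, p0 + eps, []
    for _ in range(steps):
        p, p_pert = riccati(p), riccati(p_pert)
        gaps.append(p_pert - p)
    return gaps
```

For example, `perturbation_gaps(1.0, 0.1, 10)` returns a strictly decreasing sequence of positive gaps, so a pruning error of size ε introduced early in the horizon has a diminishing effect on the final covariance in this toy setting.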
III-B Suboptimality bounds for unknown input tracking
In this section, we show that despite considering only minimisation of the cost function for state estimation as written in (13), one can still derive concrete suboptimality bounds for the resulting unknown input evolution tracking.
Once again we introduce a “k-horizon” update map for unknown input estimation error,
where is as in (14). From this definition, we see that the control sequence that solves the reduced problem (13), which considers the state error covariance only, should give that minimises (16). The performance of unknown input tracking should therefore be closely linked to that of the state evolution tracking. However, the sensor state found by solving (13) may not coincide with the sensor state required to minimise (16) over both arguments. This is an important observation that has a direct impact on the performance of unknown input tracking under a control sequence tailored for state evolution tracking. This impact will become apparent in Theorem 2.
Monotonicity and concavity of the unknown input error covariance update map are again crucial properties for describing the evolution of nodes.
The unknown input error covariance update map is monotone and concave.
The directional derivative of at in the direction is given by
where is the directional derivative of the state k-horizon update map.
As in Lemma 3, we find that the effect of a perturbation in the state error covariance on the unknown input error covariance dampens with time provided and are bounded for all .
Suppose such that . Then
where is the maximum eigenvalue of .
The propagated error incurred on the unknown input error covariance by a perturbation in the state covariance is therefore a multiple of that found for the state. Hence, given the state result in Lemma 3, the unknown input analogue can also be found. We now provide an upper bound on the final unknown input error covariance found by Algorithm 1.
Let , be the peak state and input estimation errors of the optimal trajectory respectively. That is, and . Then
where , is the maximum eigenvalue of , and .
We observe the same behaviours of the unknown input bounds with respect to as the state bounds found in Theorem 1. Here is a factor of from the state bounds, again highlighting the close relationship between the two bounds. Most notably, we see the previously mentioned impact of unknown input estimation under a control sequence optimized for state estimation; the bound grows with . This result is expected – if the distance between optimal sensor positions for state estimation and unknown input estimation is large, the performance of unknown input estimation resulting from optimizing only state estimation via Algorithm 1 worsens.
IV Illustrative Simulations
In this section, we illustrate the theoretical results with a two-dimensional target tracking problem in which the target dynamics is subject to an unknown input signal. Suppose a sensor with state defined by its position-velocity vector is mounted on a robot with the dynamic model:
with control input , where and is a small time translation. The goal of the robot is to track and estimate the position and velocity of a constant-velocity vehicle driven by Gaussian noise and an unknown input in the form of abrupt accelerations:
where is the position-velocity vector of the target state at time and is a diffusion strength scalar. The tracking takes place over 51 time steps. At , is a maneuver that takes the form of a sharp acceleration in some direction.
The sensor takes noisy position measurements of the target and uses them to obtain the target’s velocity by differentiation. For simplicity, the sensor observation model in (3) is given by with the measurement noise increasing linearly with the distance between robot and target. Certain areas of the environment are “cloudy”, depicted as grey areas in Figure 1, and increase the robot’s measurement noise. Upon entering a cloud, the robot should slow down for safety under poor visibility. Beyond a maximum range of 20 metres the measurement noise is effectively infinite.
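A minimal sketch of a noise model with the qualitative features described above is given below. Only the 20 metre maximum range comes from the text; the base level, slope, cloud multiplier, and the function name are hypothetical values chosen for illustration, and whether a cloud applies is left to the caller.

```python
import math

MAX_RANGE_M = 20.0  # from the text: beyond this range the noise is effectively infinite

def noise_level(robot_xy, target_xy, in_cloud, base=0.5, slope=0.1, cloud_factor=3.0):
    """Hypothetical measurement noise level for a range-limited sensor."""
    distance = math.dist(robot_xy, target_xy)
    if distance > MAX_RANGE_M:
        return math.inf                   # no usable measurement out of range
    level = base + slope * distance       # grows linearly with distance
    return level * cloud_factor if in_cloud else level
```

An infinite noise level corresponds to a measurement that carries no information, so a planner using such a model is encouraged to keep the target within range and out of cloudy regions.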
For the Monte-Carlo simulations, Algorithm 1 is used to track the target with , , , , . The performance of our proposed algorithm is compared to a greedy approach in Figure 1. The RVI algorithm’s long planning horizon predicts that the target will enter and remain in an area of high measurement noise in future time steps, and thus prioritises avoiding this area over remaining close to the target. In contrast, the greedy algorithm minimises the cost function at each time step only and therefore lacks the foresight to avoid these areas.
The trajectory costs of the two policies in Figure 1 elucidate the impact that RVI’s non-myopic planning has on the performance of the found solution. We see that RVI incurs much less cost than the greedy policy. Further, comparison of the average root mean square error (RMSE) of the policies’ state estimates shows that for all time steps our algorithm more successfully tracks target evolution in the presence of unknown inputs. These results are promising confirmation of our theoretical expansion of RVI to tracking targets subject to arbitrary, unknown disturbances.
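One common way to compute a per-time-step RMSE over Monte-Carlo runs is sketched below. The exact averaging used for the reported results is not specified in the text, so the definition here (root of the mean squared error norm across runs at each time step) and the data layout are assumptions.

```python
import math

def rmse_per_step(errors):
    """errors[run][t] is the estimation error vector of one run at time step t."""
    n_runs, n_steps = len(errors), len(errors[0])
    return [
        math.sqrt(sum(sum(e * e for e in errors[run][t]) for run in range(n_runs)) / n_runs)
        for t in range(n_steps)
    ]
```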
V Conclusion and Discussions
In this work, we studied the AIA problem for targets subject to arbitrary unknown disturbances. We have shown that both the state and input error covariance update maps given in existing unknown input filtering works are concave and monotone. These properties were used to derive suboptimality guarantees for both state and unknown disturbance tracking by the proposed method. Notably, we have shown that one may consider tracking only the target state without loss of performance guarantees for unknown disturbance tracking, owing to the close relationship between unknown disturbance and target state estimation. The suboptimality bounds are linear in the tuning parameters that dictate the strictness of node pruning, and thus the optimal solution is recovered when the tuning parameters are set to zero. Simulations demonstrated that the proposed algorithm performs well in tracking a target undertaking unknown maneuvers. Future work will focus on the more general case with unknown disturbances affecting both the target motion and sensor observation models. A decentralized extension of the AIA problem considered here will also be pursued.
VI Appendix A: Proofs for main results
VI-A Proofs for results in Section III-A
VI-B Proofs for results in Section III-B
To prove Lemma 4, we require some preparatory results:
Let be a constant. Then , we have .
Proof.  For , denote . We have
when and Thus, . Since is order reversing for any matrix , we have . Additionally, note that
For , we have .
is a convex function of .
Proof.  We note that is monotone. Then, by Corollary V.2.6 in , is operator convex. For and , let . We can prove that
So is also operator convex.
Proof. [Proof of Lemma 4] We note that is monotone. Then for any with , we have . Then, as matrix multiplication is order preserving, and matrix inversion is order reversing, it immediately follows that . Hence, monotonicity is proved. We next prove concavity. For and , let . Then, from Lemma 8, and since we have . Inverting this expression, utilising Lemma 7, and remembering that is a convex operation  gives
thus proving concavity.
Proof. [Proof of Lemma 5] Denoting and , it is simple to show that
Putting these together and simplifying gives the result.
Proof. [Proof of Lemma 6] The proof follows straightforwardly from the cyclic property of the trace operator and the submultiplicativity of the Frobenius norm.
VI-C Proof of Theorem 1
VI-D Proof of Theorem 2
There exists a real constant such that :
Proof.  Consider any two nodes , . Then, applying control to each node we have from Assumption 2. Hence,
where the last inequality follows from monotonicity of and . Denote
then it is simple to show