In multi-target tracking, the (normalized) likelihoods of the associations between tracks and measurements are calculated using motion and sensor models, and for some tracking algorithms these suffice to define a maximum likelihood solution. However, there are situations in which the probabilities of the association hypotheses are also important, or even required by the algorithm. One example is the Joint Probabilistic Data Association Filter (JPDAF), where the evaluation of target-measurement association probabilities is a necessary part of the algorithm [1]. Another example is the Generalized Labeled Multi-Bernoulli (GLMB) filter as described in [3]; here the probabilities are not required, but it would be good to know, quantitatively, how much truncation error has occurred: when we keep, for example, only the top 100 hypotheses, are we keeping 90% of the probability mass, or just 50%?
To obtain such probabilities we need to normalize by the sum of the likelihoods of all permissible association hypotheses, whose number grows combinatorially. If we construct a “likelihood matrix” whose entries are derived from pairwise target-measurement likelihoods, and augment it with diagonal matrices for missed detections and target deaths, as is done in [2], then under an independence assumption each hypothesis likelihood is a product of “non-conflicting” terms from this matrix, and the normalizing factor we seek is the permanent of the matrix [1].
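As a sketch of this construction, the following builds such an augmented matrix with one row per target and three column blocks. The function name and the particular factors (survival and detection probabilities multiplying the pairwise likelihoods) are simplified placeholders for illustration, not the exact GLMB weights of [2].

```python
def build_association_matrix(pairwise, p_detect, p_survive):
    """Augmented likelihood matrix: one row per (existing or newborn)
    target; columns are [pairwise target-measurement likelihoods |
    diagonal block for missed detections | diagonal block for death].
    Entries and factors are illustrative placeholders."""
    n_t = len(pairwise)          # number of targets
    n_m = len(pairwise[0])       # number of measurements
    A = [[0.0] * (n_m + 2 * n_t) for _ in range(n_t)]
    for i in range(n_t):
        for j in range(n_m):
            # target i survives, is detected, and generates measurement j
            A[i][j] = p_survive[i] * p_detect[i] * pairwise[i][j]
        # target i survives but is undetected (diagonal block)
        A[i][n_m + i] = p_survive[i] * (1.0 - p_detect[i])
        # target i is dead or unborn (diagonal block)
        A[i][n_m + n_t + i] = 1.0 - p_survive[i]
    return A
```

Because each row must be assigned to exactly one column, the two diagonal blocks guarantee that every target has a missed-detection and a death option that conflicts with no other target's options.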
Exact matrix permanent algorithms, such as Ryser’s [5, 1], scale exponentially with the matrix size [6, 7]. For a matrix with nonnegative entries, a fully polynomial-time randomized approximation scheme (FPRAS) based on Markov Chain Monte Carlo (MCMC) is presented in [8]; it can compute a solution within a relative error factor of $(1+\epsilon)$ for any given $\epsilon > 0$. This algorithm is quite complex to analyze and implement. On the other hand, as is shown in [9], even “crude” approximations may turn out to be useful for estimating various probabilities. With this motivation, this letter brings to the attention of the tracking community a recent result by Bero Roos [10] that provides first- and second-order approximations to the permanent of a rectangular matrix, both with error bounds (for higher-order approximations, see for example [11]).
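For reference, Ryser's exact formula for a square matrix admits a compact, if unoptimized, implementation. The sketch below evaluates the inclusion-exclusion sum directly in $O(2^n n^2)$ time; a Gray-code traversal of the subsets would save a further factor of $n$.

```python
from itertools import combinations

def ryser_permanent(A):
    """Exact permanent of a square matrix via Ryser's inclusion-exclusion
    formula: per(A) = (-1)^n * sum over nonempty column subsets S of
    (-1)^{|S|} * prod_i sum_{j in S} a[i][j]."""
    n = len(A)
    total = 0.0
    for size in range(1, n + 1):
        for S in combinations(range(n), size):
            prod = 1.0
            for row in A:
                prod *= sum(row[j] for j in S)
            total += (-1) ** size * prod
    return (-1) ** n * total
```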
II. Roos’ approximations
We use, for concreteness, the matrix layout in Figure 1 of [2] for the (normalized) likelihoods (without taking the logarithm): each row corresponds to either an existing target or a potential newborn target from a Labeled Multi-Bernoulli birth model. Each column corresponds to one of the following situations: (1) a measurement for a surviving and detected target, (2) a surviving but undetected target, or (3) a dead or unborn target. An association hypothesis essentially “picks” likelihood entries from the matrix, such that exactly one entry is picked in each row and at most one entry in each column. Measurements that are not picked automatically become clutter and need not be dealt with explicitly (which may explain why contemporary GLMB filters are more efficient than the classical Hypothesis-Oriented Multiple Hypothesis Tracker, HO-MHT [12]).
Thus the matrix always has more columns than rows. However, in order to follow the presentation in [10] closely, we will describe the algorithm for a “thin” matrix with more rows than columns; this means that, in computation, we apply Roos’ algorithm to the transpose of our likelihood matrix.
Let $P(n,k)$ denote the set of all $k$-permutations of $\{1,\dots,n\}$, i.e., the ordered arrangements of a $k$-element subset of an $n$-set, and let $C(n,k)$ denote the set of all $k$-combinations of $\{1,\dots,n\}$, i.e., the unordered $k$-element subsets of an $n$-set (to iterate over such sets in a memory-efficient way, see [13]). Then the permanent of a thin $n \times k$ matrix $A = (a_{ij})$ with $n \ge k$ is defined as

$$\mathrm{per}(A) = \sum_{\sigma \in P(n,k)} \prod_{j=1}^{k} a_{\sigma(j)\,j}. \qquad (1)$$
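A brute-force evaluation of this definition, feasible only for small matrices, makes it concrete: sum, over every ordered choice of $k$ distinct rows, the product of the chosen entries down the columns.

```python
from itertools import permutations

def thin_permanent(A):
    """Permanent of an n x k matrix A with n >= k: sum over all
    k-permutations sigma of the n row indices of prod_j A[sigma[j]][j]."""
    n, k = len(A), len(A[0])
    total = 0.0
    for sigma in permutations(range(n), k):
        prod = 1.0
        for j in range(k):
            prod *= A[sigma[j]][j]
        total += prod
    return total
```

For a square matrix this reduces to the usual permanent; the number of terms, $n!/(n-k)!$, is what the approximations below avoid enumerating.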
For $j \in \{1,\dots,k\}$, set the column average to

$$\bar{a}_j = \frac{1}{n} \sum_{i=1}^{n} a_{ij}.$$
Define, for an index subset $J \subseteq \{1,\dots,k\}$, the product

$$\bar{a}_J = \prod_{j \in J} \bar{a}_j.$$
Using a Matlab-type notation “:” to denote consecutive integers, we define
and state the first- and second-order approximations, respectively, as
where definitions used in the error bounds are given below. To save space, we skip special cases and only describe those where .
For , define
Define a shorthand notation for the row difference
where for , the constants are given by
The functions are defined as
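As a quick sanity check on the first-order approximation, the sketch below assumes it takes the form $\mathrm{per}(A) \approx \frac{n!}{(n-k)!} \prod_{j=1}^{k} \bar{a}_j$, the number of $k$-permutations times the product of the column averages; this assumed closed form is for intuition only, and Roos' paper [10] should be consulted for the exact statement and its error bounds. Note that this form is exact whenever every column of $A$ is constant.

```python
from math import perm  # math.perm(n, k) = n! / (n - k)!, Python 3.8+

def roos_first_order(A):
    """Assumed first-order permanent approximation for a thin n x k
    matrix: the number of k-permutations of n times the product of the
    column averages.  Illustrative form only; see Roos' paper for the
    exact statement and bounds."""
    n, k = len(A), len(A[0])
    approx = float(perm(n, k))
    for j in range(k):
        approx *= sum(A[i][j] for i in range(n)) / n
    return approx
```

The cost is $O(nk)$, versus the combinatorial cost of exact evaluation, which is why the bounds accompanying the approximation matter.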
III. An example of ideal usage
We will illustrate one use of Roos’ permanent approximations in the framework of GLMB filtering [3]. For ease of exposition we consider the case where at a given time step there is only one hypothesis, which gives rise to a set of child hypotheses at the next time step, assuming that we enumerate them all. The weight of each hypothesis is proportional to the product of likelihoods inside the summation in Equation (1), noting that the matrix has been transposed. After normalizing the weights by the permanent, we obtain the probability of each hypothesis.
However, in any practical application of GLMB we cannot enumerate all child hypotheses, and have to truncate to some number of them. The weights are then normalized by the sum of the retained weights, not the sum of all weights, which is given by the permanent. The truncation error is analyzed in [3], which confirms our intuition that we should keep the hypotheses with the highest weights (or the best weights we can find within a computation budget using, for example, Gibbs sampling). If we take the negative log of the likelihood matrix, then the best assignments can be enumerated by Murty’s algorithm [14, 15], which calls as a subroutine the Munkres algorithm for finding the best bipartite matching [16, 17, 18].
It would be quite useful, even if done offline, to know quantitatively what the truncation error is: Do these hypotheses represent 90% of the probability mass, or only 50%?
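On toy problems this fraction can be computed exactly by brute force. The sketch below (function name and random setup are illustrative) enumerates every hypothesis weight of a small thin matrix, sorts them, and reports the probability mass captured by the top few:

```python
from itertools import permutations
import random

def truncation_mass(A, top_k):
    """For a thin n x k likelihood matrix, enumerate every association
    hypothesis weight (one product per k-permutation of the rows), then
    return the fraction of total probability mass captured by the top_k
    highest-weight hypotheses.  Brute force; small matrices only."""
    n, k = len(A), len(A[0])
    weights = []
    for sigma in permutations(range(n), k):
        w = 1.0
        for j in range(k):
            w *= A[sigma[j]][j]
        weights.append(w)
    weights.sort(reverse=True)
    total = sum(weights)  # equals the permanent of A
    return sum(weights[:top_k]) / total

random.seed(0)
A = [[random.random() for _ in range(3)] for _ in range(6)]
mass = truncation_mass(A, 20)  # mass kept by the top 20 of P(6,3) = 120
```

The point of the permanent approximations is precisely to estimate the denominator `total` when enumeration is infeasible.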
To illustrate the point, we create a toy example with a random likelihood matrix of size 4 by 12 and run Murty’s algorithm on it, recording the cumulative likelihood with each increment of the number of hypotheses kept. This is shown as the blue curve in Figure 1 (if we use Gibbs sampling instead of Murty’s algorithm, the curve will still be increasing, but not necessarily concave). We also calculate the exact permanent by Ryser’s algorithm and mark it as the black dotted line. The approximate permanent calculated by Roos’ second approximation is marked as the red line, with its upper and lower bounds as the cyan and green lines.
It can be seen from the figure that if we stop at about the top 25 hypotheses, we are guaranteed to capture about of the total probability, and there is even hope that the percentage can be as high as . If we continue to obtain the top 100 hypotheses, the lower bound is no longer informative, but the upper bound guarantees that we have about of the probability. Empirically we have observed that Roos’ second approximation is often quite accurate but with conservative bounds, so the percentage may be close to the truth, which we know is .
IV. The issue of computation time
Ryser’s algorithm scales exponentially while Roos’ approximations scale polynomially, so for large matrices the latter should be faster to compute. However, for “mid-sized” matrices, Roos’ first approximation is fast but conservative, while Roos’ second approximation is more useful but slow, often slower than Ryser’s algorithm. This point is illustrated by the experiment shown in Figure 2, where computation times for random likelihood matrices are plotted, based on unoptimized Matlab code. The structure of the likelihood matrix corresponds to 10 targets (existing and birthing) and 10 to 15 measurements, i.e., 10 rows and 30 to 35 columns.
The fact that both Ryser’s algorithm and Roos’ second approximation are unbearably slow for matrices of this size indicates the following possible ways of improvement:
optimized implementation in a compiled language;
exploitation of the diagonal structure of the second and third blocks of the likelihood matrix (used in GLMB filtering);
better approximation algorithms.
In this letter we have presented Roos’ approximation algorithms, with bounds, for the matrix permanent. We illustrated their use in estimating data association probabilities, such as in GLMB filtering where only the top hypotheses are kept. We pointed out the challenge in computation time and proposed directions for improvement.
The author would like to thank Professor Bero Roos for discussions and clarifications, and Professor Jeffrey Uhlmann for providing a cited paper. He would also like to thank Ms. Emily Polson for support.
[1] D. F. Crouse and P. Willett, “Computation of Target-Measurement Association Probabilities Using the Matrix Permanent,” IEEE Transactions on Aerospace and Electronic Systems, vol. 53, no. 2, pp. 698–702, Apr. 2017. http://dx.doi.org/10.1109/taes.2017.2664479
[2] B.-N. Vo, B.-T. Vo, and H. Hoang, “An Efficient Implementation of the Generalized Labeled Multi-Bernoulli Filter,” IEEE Transactions on Signal Processing, vol. 65, no. 8, pp. 1975–1987, Apr. 2017. http://dx.doi.org/10.1109/tsp.2016.2641392
[3] B.-N. Vo, B.-T. Vo, and D. Phung, “Labeled Random Finite Sets and the Bayes Multi-Target Tracking Filter,” IEEE Transactions on Signal Processing, vol. 62, no. 24, pp. 6554–6567, Dec. 2014. http://dx.doi.org/10.1109/tsp.2014.2364014
[4] Wikipedia, “Permanent — Wikipedia, the free encyclopedia,” 2018. [Online; accessed 16-February-2018]. https://en.wikipedia.org/w/index.php?title=Permanent&oldid=795641630
[5] H. J. Ryser, Combinatorial Mathematics, ser. Carus Mathematical Monographs. New York: Mathematical Association of America; distributed by Wiley, 1963. https://books.google.com/books?id=wOruAAAAMAAJ
[6] L. G. Valiant, “The complexity of computing the permanent,” Theoretical Computer Science, vol. 8, no. 2, pp. 189–201, Jan. 1979. http://dx.doi.org/10.1016/0304-3975(79)90044-6
[7] A. Lyons, “Polynomial-Time Approximation of the Permanent,” course project for MATH 100, 2011.
[8] M. Jerrum, A. Sinclair, and E. Vigoda, “A Polynomial-time Approximation Algorithm for the Permanent of a Matrix with Nonnegative Entries,” Journal of the ACM, vol. 51, no. 4, pp. 671–697, Jul. 2004. http://dx.doi.org/10.1145/1008731.1008738
[9] J. K. Uhlmann, “Matrix permanent inequalities for approximating joint assignment matrices in tracking systems,” Journal of the Franklin Institute, vol. 341, no. 7, pp. 569–593, Nov. 2004. http://dx.doi.org/10.1016/j.jfranklin.2004.07.003
[10] B. Roos, “New permanent approximation inequalities via identities,” arXiv preprint arXiv:1612.03702, 2017. https://arxiv.org/pdf/1612.03702.pdf
[11] ——, “On Bobkov’s approximate de Finetti representation via approximation of permanents of complex rectangular matrices,” Proceedings of the American Mathematical Society, vol. 143, no. 4, pp. 1785–1796, 2015. http://dx.doi.org/10.1090/s0002-9939-2014-12429-4
[12] D. Reid, “An algorithm for tracking multiple targets,” IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, Dec. 1979. http://dx.doi.org/10.1109/tac.1979.1102177
[13] A. Nijenhuis, W. Rheinboldt, and H. S. Wilf, Combinatorial Algorithms. Academic Press, 1978. http://www.worldcat.org/isbn/9780125192606
[14] K. G. Murty, “Letter to the Editor—An Algorithm for Ranking all the Assignments in Order of Increasing Cost,” Operations Research, vol. 16, no. 3, pp. 682–687, 1968. http://dx.doi.org/10.1287/opre.16.3.682
[15] M. L. Miller, H. S. Stone, and I. J. Cox, “Optimizing Murty’s ranked assignment method,” IEEE Transactions on Aerospace and Electronic Systems, vol. 33, no. 3, pp. 851–862, Jul. 1997. http://dx.doi.org/10.1109/7.599256
[16] J. Munkres, “Algorithms for the Assignment and Transportation Problems,” Journal of the Society for Industrial and Applied Mathematics, vol. 5, no. 1, pp. 32–38, Mar. 1957. http://dx.doi.org/10.1137/0105003
[17] R. A. Pilgrim, “Munkres’ Assignment Algorithm Modified for Rectangular Matrices,” http://csclab.murraystate.edu/~bob.pilgrim/445/munkres.html, 2017.
[18] Y. Cao, “munkres.m,” 2011. [Downloaded from Matlab Central, http://goo.gl/9YPMi7]