Sliding-Window Optimization on an Ambiguity-Clearness Graph for Multi-object Tracking

11/28/2015 ∙ by Qi Guo, et al.

Multi-object tracking remains challenging due to the frequent occurrence of occlusions and outliers. To handle this problem, we propose an Approximation-Shrink Scheme for sequential optimization. The scheme is realized by introducing an Ambiguity-Clearness Graph, which avoids conflicts and keeps the result independent of the optimization sequence, together with a sliding window optimization framework, which constrains the size of the state space and guarantees convergence. Within this window-wise framework, the states of targets are clustered in a self-organizing manner. Moreover, we show that the traditional online and batch tracking methods are both covered by the window-wise framework. Experiments indicate that, with only a small window, the optimization performance can be much better than that of online methods and approach that of batch methods.




1 Introduction

With the development of computer vision techniques, more and more researchers have begun to focus on understanding the behavior and context of objects from visual information. Tracking targets in video sequences, a core topic with wide applications in video surveillance, has advanced rapidly with the rise of tracking-by-detection (TBD) methods [26]. TBD reconstructs the states of targets from detection responses by assigning an identity to each detection and optimizing the trajectories [2, 21]. The success of TBD in recent years has raised interest in a more challenging topic: multi-object tracking (MOT) with an unknown number of targets. MOT remains difficult due to the complex settings of sequences, e.g., intricate trajectories of targets, varying illumination, camera movement, etc.

The MOT problem can be handled in an online fashion, which suits time-critical applications. However, traditional online methods are susceptible to outliers caused by occlusions and noise, e.g., false positives, false negatives, duplicate detections of a single target, etc. These outliers can cause ambiguities in data association. Some methods tackle the problem with sparse appearance models [19, 28], and others by predicting the states in future frames [3]. However, the dynamics and appearances of the targets are unpredictable in some cases. Batch tracking methods handle outliers more easily than online methods because they optimize associations and trajectories globally. Terms that penalize mutual exclusion violations and the number of tracklets [21, 9] have been added to the energy function to regularize the trajectories.

Despite these advantages, a major problem of batch methods is that the global optimization involves the frames of the whole sequence [4], which is unsuitable for real-time applications. Some batch methods also require initial solutions, e.g., [21]. In this paper we therefore propose a method that aims to combine the advantages of online and batch methods while avoiding their disadvantages. We derive an iterative Approximation-Shrink Scheme (A-S Scheme) from the maximum a posteriori (MAP) formulation using sequential approximation. We show that the state space can be shrunk effectively, but conflicts may arise in the sequential optimization and the results may vary with the optimization sequence. To avoid these problems, an Ambiguity-Clearness Graph (A-C Graph) is formulated to efficiently represent the tracklet fragments and the ambiguities in the association. A set of rules and procedures is defined for changing the nodes and edges of the graph, e.g., connections, disconnections, transforms, merges, etc. A sliding Window-of-Ambiguity (WOA) is defined in the A-C Graph for the sequential optimization of the layers of the graph. Based on the A-C Graph and the sliding WOA optimization, MOT is conducted in a window-wise manner, which disambiguates the association and accelerates the optimization. We also show that the traditional online and batch approaches fit into this framework as different window sizes.

Our main contributions can be summarized as: (1) an approximation-shrink scheme that iteratively approximates the global optimization, (2) a window-wise optimization framework based on the novel A-C Graph, which covers the traditional online and batch methods, (3) a unified analysis of window-wise approaches with different window sizes using a search tree.

2 Related Works

Unlike earlier tracking methods [24, 12], TBD reconstructs the trajectories of targets by associating the detections provided by object detectors. Most researchers build their MOT algorithms on the TBD framework; these algorithms can be categorized as online and batch approaches.

In batch tracking approaches [21, 22, 2, 7, 25, 10, 8], a conditional random field (CRF) is often used to learn and model affinities such as appearance and motion in order to discriminate among different trajectories [29, 30]. In [30], global and pairwise models are learned online to form an energy function, which is minimized offline via heuristic search. Despite the popularity of the CRF model, it needs extensive training. A continuous energy model was introduced in a series of works [21, 22, 2]. Milan et al. [21] built a comprehensive continuous energy function by linearly combining terms for appearance, motion, mutual exclusion, trajectory persistence, etc. Continuous energy functions are easier to optimize than discrete ones, but they have many parameters and are hard to tune. Network flow was first applied to tracking by Zhang et al. [32]. A graph is formed with the states of targets as nodes and the associations as edges, and the likelihoods of the states are represented as the capacities of the edges. Butt and Collins [7] improved the network structure by defining each node as a candidate pair of matching observations in consecutive frames. To model occlusions better, Segal and Reid [25] designed a latent data association framework. Instead of assigning each detection to a corresponding track, they assume that each detection is its own track and assign a latent variable to each node to represent the association. In addition to the general modeling of targets, some works track targets with specific characteristics, e.g., Dicle et al. [10] focus on tracking targets with similar appearance but different motion patterns.

Online tracking [3, 4, 5, 6, 11, 31, 19] has become increasingly popular. Network flow has also been adopted in online tracking: Ben Shitrit et al. [5] formulate multi-object tracking as a multi-commodity network flow problem and use sparse appearance to reduce the computational complexity. Lu et al. [19] constructed a dictionary from already tracked objects and assigned new detections by minimizing an L1-regularized function. Wang and Wang [27] found that the representation residuals follow a Laplacian distribution and used this to improve the sparse representation method for tracking. The Hungarian algorithm was first introduced into tracking problems by Joo and Chellappa [14] to solve the bipartite graph model they proposed. The frame-by-frame scheme of online tracking benefits greatly from the Hungarian algorithm. Bae and Yoon [3] designed a tracklet confidence based on length, occlusion and affinity, and applied different strategies to tracklets with high and low confidence; the Hungarian algorithm is employed for both the local and the global association. However, the Hungarian algorithm associates detections in consecutive frames greedily, which may miss the global optimum and cause identity switches. Besides the Hungarian algorithm for association, the Bayesian framework is one of the most popular models for target modeling. Bae and Yoon [4] improved their previous work [3] by performing data association with a track existence probability: the provided detections are associated with the existing tracks and the corresponding track existence probabilities are updated. Yoon et al. [31] constructed a Relative Motion Network (RMN) that factors out camera motion by considering the motion context of multiple objects, and incorporated the RMN into a Bayesian framework.

3 Approximate-Shrink Scheme

Given the observations of a real-time video sequence, where the number of observations may differ from frame to frame, we assume: (1) each observation corresponds to a state [25], (2) states in the same frame are independent, (3) some of the states are already clear given the observations. We formulate MOT as maximum a posteriori (MAP) estimation of the states given the observations (Equation 1).


Based on Assumption (2), this objective is decomposed over the states (Equation 2).


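Assumption (3) suggests that some of the states are already determined by the observations. We call these states Clear states (C states) and the remaining states Ambiguous states (A states). The global optimization in Equation 2 can then be relaxed into two separate optimizations (Equations 3 and 4).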




Performing these two optimizations separately is an approximation to Equation 2. First, we sequentially optimize every C state via Equation 4 (approximation step). Then we fix the optimized C states as the evidence for the A states and rewrite Equation 3 accordingly (shrink step). We iteratively split the remaining states into C states and A states and repeat the above steps to shrink the search space.

This Approximate-Shrink Scheme (A-S Scheme) iteratively searches and narrows down the state space. The fixed C states serve as nuclei of trajectories, which attract other states to associate with them. Some nuclei merge during the iterations to form longer tracklets. However, the space is still too large, and convergence is not guaranteed. More approximations are needed to accelerate the scheme and ensure its convergence. Moreover, a data structure is needed that avoids conflicting associations of states and removes the effect of the optimization sequence on the results. We therefore propose a self-organizing A-C Graph and a window-wise optimization framework to meet these demands.
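
As a sketch of the scheme in explicit notation, one consistent reading is given below. The symbols $X$ (all states), $Z$ (all observations), $X^{C}$ (C states) and $X^{A}$ (A states), as well as the exact form of each line, are assumptions introduced here for illustration and are not taken from the original equations.

```latex
\begin{align*}
\hat{X} &= \arg\max_{X} P(X \mid Z),
  \qquad X = X^{C} \cup X^{A}
  && \text{(MAP over all states)} \\
P(X \mid Z) &= P(X^{C} \mid Z)\, P(X^{A} \mid X^{C}, Z)
  && \text{(chain rule)} \\
\hat{X}^{C} &= \arg\max_{X^{C}} P(X^{C} \mid Z)
  && \text{(approximation step)} \\
\hat{X}^{A} &= \arg\max_{X^{A}} P(X^{A} \mid \hat{X}^{C}, Z)
  && \text{(shrink step)}
\end{align*}
```

Maximizing the two factors one after the other does not in general maximize their product, which is why the two-step procedure is only an approximation of the joint optimization.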

4 Window-wise Optimization for Tracking

Figure 1: Visualization of the Window of Ambiguity (WOA) from frame 12 to frame 21 in the TUD-Stadtmitte dataset. Panels: (a) frames 11 to 17, (b) frames 12 to 18, (c) frames 13 to 19, (d) frames 14 to 20. Each association is directed from parent to child, and the A-C Graph is directed and acyclic. From (a) to (b), one state was connected to another as a C state child and merged with a state in the same frame, while the associations this made obsolete were removed. From (c) to (d), a state was inserted into an existing tracklet between two of its states. The figure is best viewed in color.

4.1 Ambiguity-Clearness Graph

Given the states and observations of a real-time video sequence (in TBD multi-object tracking the detections serve as observations) and two predefined affinity thresholds (set in Section 5), we define a state to be the parent of another state if it lies in an earlier frame and there exists an association between the two; the latter state is then its child. The determined parent of a state is its only parent, and the affinity score of their association is greater than the corresponding threshold.

We now formally define the C states and A states. If a state has one determined parent or has no parent, it is a Clear state (C state). Conversely, if it has parent states but no determined parent, it is an Ambiguous state (A state). Note that a C state can only have zero or one parent. All the parents of a state form its active set. We require that a state has at most one C state child and that any A state child of a state lies in an earlier frame than its C state child.

A clear association is the association between a C state and its parent, and a tracklet is a group of states connected by clear associations. The C states that follow a state in its tracklet are its descendants. Taking states and associations as vertices and edges, we form the A-C Graph of the MOT problem. In this paper, we use the terms states and associations instead of vertices and edges when discussing the A-C Graph. The A-C Graph of the TUD-Stadtmitte dataset is visualized in Figure 1, where clear associations are shown as solid lines and states that belong to the same tracklet share the same color.

As every association is directed from parent to child, the A-C Graph is a directed acyclic graph. In an A-C Graph, we define the Window-of-Ambiguity (WOA) as a time period outside of which there are only clear associations. The tracklets outside the WOA are determined and fixed, and changes to states and associations can only take place inside the WOA. One can therefore restrict the size of the state space by setting the length of the WOA.
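
The definitions above can be sketched as a small data structure. This is only an illustration under stated assumptions: the names `State`, `ACGraph`, `clear_threshold` and `clear_ancestors` are hypothetical, and a single threshold stands in for the threshold that decides whether an only parent is a determined parent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class State:
    """One vertex of the graph: a state located in a given frame."""
    sid: int
    frame: int
    parents: Set[int] = field(default_factory=set)   # ids of all parent states
    determined_parent: Optional[int] = None          # set only for a clear association

    def is_clear(self) -> bool:
        # C state: no parent at all, or exactly one (determined) parent.
        if not self.parents:
            return True
        return (self.determined_parent is not None
                and self.parents == {self.determined_parent})


class ACGraph:
    """Hypothetical container for states and parent-to-child associations."""

    def __init__(self, clear_threshold: float) -> None:
        self.clear_threshold = clear_threshold  # assumed affinity threshold
        self.states: Dict[int, State] = {}

    def add_state(self, sid: int, frame: int) -> State:
        self.states[sid] = State(sid, frame)
        return self.states[sid]

    def associate(self, parent_id: int, child_id: int, affinity: float) -> None:
        parent, child = self.states[parent_id], self.states[child_id]
        if parent.frame >= child.frame:
            # Parents must come from earlier frames, which keeps the graph acyclic.
            raise ValueError("a parent must lie in an earlier frame than its child")
        child.parents.add(parent_id)
        # A determined parent is the only parent and has a high enough affinity.
        if len(child.parents) == 1 and affinity > self.clear_threshold:
            child.determined_parent = parent_id
        else:
            child.determined_parent = None

    def clear_ancestors(self, sid: int) -> List[int]:
        """Follow clear associations backwards; returns ids in frame order."""
        chain = [sid]
        current = self.states[sid]
        while current.determined_parent is not None:
            chain.append(current.determined_parent)
            current = self.states[current.determined_parent]
        return chain[::-1]
```

In this sketch a state becomes ambiguous as soon as a second parent is attached, and turns clear again only once the surplus parents are removed by the actions of Section 4.2.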

4.2 Actions

As mentioned in Section 3, the actions in the A-C Graph should help avoid conflicts, e.g., multiple parents for a C state, multiple C state children, clear associations that form a cycle, etc. Meanwhile, the actions should be symmetrical so that the result does not depend on the chronological order. The basic actions of the A-C Graph are initialization, disconnection, connection and merging between two states. Table 1 shows the functions and symbols used in defining these actions.

Functions and Symbols: Description
Check whether a set is empty.
Find the C states in a set.
Find all the parents of a state.
Find all the children of a state.
Find the frame number of a state.
Judge whether a state is a clear state.
The affinity scores between a state and all of its parents.
Table 1: The functions and symbols used in this paper.

For a newly entered state, we first initialize its active set by enumerating all its potential parents. As required in Section 4.1, the new state can connect to those states in the previous frames that either have no C state child or whose C state child lies after the new state. Procedure 1 shows the pseudocode for initializing the active set.
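
The initialization rule can be sketched as follows. The names `Node`, `c_child_frame` and `init_active_set` are hypothetical, and the window is assumed here to span the frames from `latest_frame - window` up to the frame before the new state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    sid: int
    frame: int
    c_child_frame: Optional[int] = None  # frame of the C state child, None if there is none


def init_active_set(new: Node, states: List[Node],
                    latest_frame: int, window: int) -> List[int]:
    """Return the ids of all potential parents of `new` inside the window."""
    first_frame = latest_frame - window  # assumed start of the window
    active = []
    for s in states:
        # Only states in earlier frames inside the window can be parents.
        if not (first_frame <= s.frame < new.frame):
            continue
        # Eligible: no C state child yet, or the C state child lies after `new`.
        if s.c_child_frame is None or s.c_child_frame > new.frame:
            active.append(s.sid)
    return active
```

A state whose C state child already sits in or before the frame of the new state is skipped, so that the new state never competes with an existing clear association.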

We disconnect two states by removing the association between them and updating both states.

As shown in Procedure 2, we assign a child state to a parent state as its A state child. The procedure terminates if the child is already a C state. If not, we check the descendants of the parent. If the parent has no descendants, we directly add an association between the parent and the child; otherwise, we find the latest C state descendant in the tracklet of the parent that is not after the child. If this descendant is in the same frame as the child, the procedure terminates. If it is before the child, we add the association between this descendant and the child.
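
The steps above can be sketched as a single function. The names `Node`, `c_child`, `a_children` and `connect_as_a_child` are hypothetical, and the feature update mentioned in Procedure 2 is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Node:
    sid: int
    frame: int
    clear: bool = False                       # True for a C state
    c_child: Optional[int] = None             # id of the C state child, if any
    parents: Set[int] = field(default_factory=set)
    a_children: Set[int] = field(default_factory=set)


def connect_as_a_child(nodes: Dict[int, Node], child_id: int, parent_id: int) -> bool:
    """Attach `child_id` as an A state child; returns True if an association was added."""
    child = nodes[child_id]
    if child.clear:
        return False  # a C state keeps its determined parent
    parent = nodes[parent_id]
    # Walk along the tracklet to the latest C state descendant not after the child.
    while parent.c_child is not None and nodes[parent.c_child].frame <= child.frame:
        parent = nodes[parent.c_child]
    if parent.frame >= child.frame:
        return False  # the descendant shares the child's frame: nothing to add
    parent.a_children.add(child.sid)
    child.parents.add(parent.sid)
    return True
```

Attaching the child to the latest eligible descendant, rather than to the state that was passed in, keeps every A state child in an earlier frame than the C state child of its parent.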

Procedure 5 describes the action of connecting a child state to a parent state as its C state child. If the child is currently not a C state, the existing parents of the child are removed. If the parent has no C state child, we directly add a connection between the parent and the child; otherwise, we find the parent's latest C state descendant that is not after the child. If this descendant is in the same frame as the child, the two are merged via Procedure 6. If it is before the child, an association is added between the descendant and the child. All the A state and C state children of the descendant that lie after the child are removed from the descendant and reconnected to the child following Procedures 2 and 5 respectively. If the child is currently a C state and the parent is not, the parent is inserted into the child's tracklet, using Procedure 5 if the tracklet has no state in the parent's frame and Procedure 6 if it has one. If both are C states, the two tracklets are grouped into one by recursively calling Procedures 5 and 6, as shown in Procedure 5. If only one of the two states is in a tracklet, the other state is inserted into that tracklet.

Procedure 6 describes the process of merging one state into another state of the same frame. As we cannot change the states and tracklets outside the WOA, we ensure that the two states are not both C states, which avoids merging states outside the WOA. The descendants of the two states are recursively merged into one tracklet by Procedure 5. For each A state child of the merged state, we simply remove its association with the merged state and connect it to the remaining state via Procedure 2.

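Input: new state, latest frame number, size of the Window of Ambiguity (WOA)
Output: the active set containing all the potential parents of the new state
for all the states in the frames of the WOA before the new state do
      if the state has no C state child or its C state child is after the new state then
            Add the state to the active set
      end if
end for
Procedure 1 Initialize the active set for a new state.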
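Input: child state, parent state
Output: the updated network
if the child is not a C state then
      while the parent has a C state child that is not after the child do
             parent ← the C state child of the parent
      end while
      if the parent is before the child then
            Add the child to the children of the parent
            Add the parent to the parents of the child
            Update the features of the child and the parent
      end if
end if
Procedure 2 Connect a child state to a parent state as its A state child.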
Input: child state, parent state, latest frame number, size of the Window of Ambiguity (WOA)
Output: the updated network
while  not and  do
       the C state child of
end while
if  then
      Do Procedure 6 with ,,, as input
      Remove all parents of
      Remove all children of in the same frame with
      for all children of in the frames after  do
            Remove the association between and
            if  then
                 Do Procedure 5 with ,,, as input
                 Do Procedure 2 with , as input
            end if
      end for
end if
Procedure 3 Connect a child state to a parent state as its C state child, where the child is currently an A state.
Input: child state, parent state, latest frame number, size of the Window of Ambiguity (WOA)
Output: the updated network
while  and  do
       the determined father of
end while
while  and  do
       the determined father of
end while
if  then
      Do Procedure 5 with ,,, as input
else if  then
      Do Procedure 5 with ,,, as input
      if  and  then
            if  then
                 Remove the parent of ,
                 Do Procedure 6 with ,,, as input
                 Remove the parent of ,
                 Do Procedure 6 with ,,, as input
            end if
      else if  then
            Do Procedure 6 with ,,, as input
            Do Procedure 6 with ,,, as input
      end if
end if
Procedure 4 Connect a child state to a parent state as its C state child, where the child is currently a C state.
Input: child state, parent state, latest frame number, size of the Window of Ambiguity (WOA)
Output: the updated network
if the child is currently an A state then
      Do Procedure 3 with the same input
else
      Do Procedure 4 with the same input
end if
Procedure 5 Connect a child state to a parent state as its C state child.
Input: two states in the same frame, latest frame number, size of the Window of Ambiguity (WOA)
Output: the updated network
if  then
      Remove from its parent , if any
      Do Procedure 5 with ,,, as input
      Remove from its parents
      For all , do Procedure 2 with , as input
end if
for all the children of  do
      if  then
            Remove from
            Do Procedure 5 with ,,, as input
            Remove from
            Do Procedure 2 with ,,, as input
      end if
end for
Procedure 6 Merge one state with another state in the same frame.

Although the actions are recursive, it can easily be proved that the recursion in Procedures 2, 5 and 6 cannot form an endless loop, and that the order in which actions are carried out on a set of states does not affect the structure of the A-C Graph. A visualization of these actions on the TUD-Stadtmitte dataset can be found in Figure 1. In Figure 1(b), the newly entered states were connected to their initial active sets via Procedures 1, 2 and 5. From Figure 1(a) to 1(b), one state was connected to another as a C state child by Procedure 5 and merged with a state in the same frame using Procedure 6.

4.3 Sliding Window Optimization

For a real-time sequence, new states from the latest frame are continuously added to the A-C Graph. The WOA should slide forward, both to keep its size from growing too large and to remove the ambiguities so that tracks are generated. We therefore set an upper bound on the size of the WOA.

The sliding window optimization consists of three steps. First, for every newly entered state in the latest frame, we find its active set via Procedure 1 and compute the affinity score between the new state and each state in the active set. If the score exceeds the threshold for clear associations, the new state is connected via Procedure 5; if it exceeds the threshold for ambiguous associations, the new state is connected via Procedure 2. Second, going through the frames of the WOA in order, we recompute the affinity scores between the states of each frame and their parents and reconnect them according to the new affinities. Third, the Hungarian Algorithm [1] is carried out on the states in the earliest frame of the WOA and their parent states to obtain the best arrangement of associations and to clear all the ambiguity in that frame. All states in that frame are transformed to C states and the WOA shifts forward. If the latest frame has not reached the end of the sequence, we move on to the next frame and return to the first step; otherwise, we redo the third step for the frames remaining in the WOA. The outline of the optimization process is shown in Procedure 7.

Input: size of the Window-of-Ambiguity (WOA)
Output: the final A-C Graph and the association result
1. Associate the newly entered states in the latest frame to their initial active sets.
2. Sequentially shrink the active set of each A state in the WOA.
3. Determine the association of the states in the earliest frame of the WOA using the Hungarian Algorithm [1].
if the latest frame has not reached the end of the sequence then
      Move on to the next frame, return to 1
else
      Continue with the frames remaining in the WOA, return to 3
end if
Procedure 7 Conduct sliding window optimization for MOT.

The sliding window optimization carries out the A-S Scheme in a window-wise manner. Procedures 5 and 6 in steps one and two serve as the approximation step, and updating the affinity scores in step two corresponds to the shrink step. Step three forces the states in the earliest frame of the WOA to determine their connections, which guarantees convergence.
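
The control flow can be sketched in two small functions. Both are illustrations under stated assumptions: `resolution_schedule` assumes that the frame resolved in step three is the oldest frame of a window of fixed size, and `best_assignment` replaces the Hungarian Algorithm with an exhaustive search that returns the same optimum on tiny inputs but does not scale. All names are hypothetical.

```python
from itertools import permutations
from typing import List, Optional, Tuple


def resolution_schedule(num_frames: int, window: int) -> List[Tuple[int, int]]:
    """List (latest_frame, resolved_frame) pairs in the order frames are cleared."""
    schedule = []
    for latest in range(num_frames):
        oldest = latest - window + 1
        if oldest >= 0:
            schedule.append((latest, oldest))
    # End of the sequence: clear the frames still inside the window.
    for frame in range(max(num_frames - window + 1, 0), num_frames):
        schedule.append((num_frames - 1, frame))
    return schedule


def best_assignment(affinity: List[List[float]]) -> List[Optional[int]]:
    """affinity[i][j] scores child i with parent j; returns one parent index
    (or None) per child so that the summed affinity is maximal."""
    n_children = len(affinity)
    n_parents = len(affinity[0]) if affinity else 0
    # None slots let a child stay without a parent and start a new tracklet.
    slots: List[Optional[int]] = list(range(n_parents)) + [None] * n_children
    best_score = float("-inf")
    best: List[Optional[int]] = [None] * n_children
    for perm in permutations(slots, n_children):
        score = sum(affinity[i][j] for i, j in enumerate(perm) if j is not None)
        if score > best_score:
            best_score, best = score, list(perm)
    return best
```

With `window=1` every frame is resolved as soon as it arrives, and with a window at least as long as the sequence nothing is resolved before the last frame has been seen.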

4.4 Online, Delayed and Batch Methods

Based on the definition of the A-C Graph and the sliding window optimization, we obtain a window-wise framework that includes online, delayed and batch methods as special cases that differ in the size of the WOA. Figure 2 demonstrates the formation of a trajectory in the A-C Graph via these three methods. At each iteration, the window-wise optimization explores a relatively small search tree whose size is determined by the size of the WOA. For an online method (Figure 2(b)), the window is minimal and the search is greedy. For a delayed method (Figure 2(c)), a heuristic search is conducted within the window. For a batch method (Figure 2(d)), the search space remains unchanged, so local search methods, e.g., hill climbing, simulated annealing, etc., are often used to move iteratively toward a local optimum. The experimental analysis of the relation between the size of the WOA and the optimization results is provided in Section 5.2.

Figure 2: Formation of a trajectory with different sizes of the WOA. Panels: (a) search space, (b) online, (c) delayed, (d) batch. (a) illustrates the original search space. (b), (c) and (d) show the search process with the local search tree, where each level corresponds to the sliding window optimization (SWO) of one frame. The red lines represent the associations. For online and delayed approaches, the trajectory is formed from top to bottom, while for batch approaches, the trajectory is formed and optimized iteratively.

5 Experimental Evaluation

5.1 Implementation

Affinity model: We implemented a basic affinity model, following [3], which consists of an appearance model, a motion model and a shape model. The appearance model measures the Bhattacharyya distance between the histograms of two states. If a state is in a tracklet, instead of using the Incremental Linear Discriminant Analysis (ILDA) of [3], we simply average the appearance histograms of all states in the tracklet with an exponential discount factor. A first-order Kalman filter is applied to smooth and predict the positions of the targets and the shapes of the bounding boxes. We compute the normalized distances of target positions and bounding box shapes and map them to a Gaussian distribution to obtain the affinity scores. The overall affinity combines the three models.

Dataset description: We use the MOT Benchmark [18] for training and evaluation; the benchmark contains separate sets of sequences for training and for testing. The sequences have different frame rates and resolutions, and only pedestrians are tracked.

Parameter Settings: The two thresholds of Section 4.1 are fixed in our experiments. We estimate the length of every occlusion (the number of frames with an overlap above 0.4) in the training set of the MOT Benchmark and study the distribution of occlusion lengths. As shown in Figure 3, the overlaps mostly last at most 5 s, and a large part of them at most 1 s. The delay time is chosen accordingly, and the length of the WOA is the frame rate multiplied by the delay time. The Gaussian distributions in the motion model and the shape model use a fixed variance. Other parameters of the affinity model are the same as in [3].
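
The ingredients of the affinity model described above can be sketched as follows. This is an illustration, not the implementation used in the experiments: the function names are hypothetical, the Bhattacharyya coefficient is used as the similarity that underlies the Bhattacharyya distance, and combining the three scores by a product is an assumption.

```python
import math
from typing import List, Sequence


def bhattacharyya_coefficient(p: Sequence[float], q: Sequence[float]) -> float:
    """Similarity of two normalized histograms: 1 for identical, 0 for disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def discounted_mean_histogram(histograms: Sequence[Sequence[float]],
                              discount: float) -> List[float]:
    """Average the histograms of a tracklet; the last (most recent) one weighs most."""
    n = len(histograms)
    weights = [discount ** (n - 1 - k) for k in range(n)]
    total = sum(weights)
    bins = len(histograms[0])
    return [sum(w * h[b] for w, h in zip(weights, histograms)) / total
            for b in range(bins)]


def gaussian_affinity(distance: float, sigma: float) -> float:
    """Map a normalized distance to (0, 1] with an unnormalized Gaussian."""
    return math.exp(-0.5 * (distance / sigma) ** 2)


def overall_affinity(appearance: float, motion: float, shape: float) -> float:
    # Assumed combination rule: the product of the three scores.
    return appearance * motion * shape
```

For example, the appearance score between a tracklet and a new detection would compare `discounted_mean_histogram` of the tracklet with the histogram of the detection, while the motion and shape scores would pass the respective normalized distances through `gaussian_affinity`.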

Figure 3: Distribution of the lengths of bounding box overlaps in the ground truth sequences of the MOT Benchmark [18], showing the share of overlaps that last at most 5 s and at most 1 s.

5.2 Analysis of Window-of-Ambiguity

To analyze the connection of WOA size and the quality of the window-wise optimization, we define the energy of an A-C Graph as


Figure 4 presents the final energy for varying sizes of the WOA on the TUD-Stadtmitte, TUD-Campus and PETS-S2L2 sequences of the MOT Benchmark. The X-axis is in logarithmic scale. Interestingly, the final energy of these sequences drops significantly as the window grows from its smallest size, while the decrease becomes much slower beyond a moderate window size. The settings of these sequences, e.g., target density, viewpoint, etc., are different, but the patterns of the energy change are almost identical. The trend of the final energy thus appears to depend only on the size of the WOA. The tracking results can be improved considerably with a small WOA compared to the online method, which illustrates experimentally that delayed methods perform better than online ones in the window-wise optimization framework. The final energy does not decrease much once the window grows beyond that moderate size, which indicates that the sliding window approximation has only a minor effect on the final performance. Growing the WOA further becomes a trade-off between speed and better results.

Figure 4: The final energy with varying size of the Window-of-Ambiguity (WOA) on different sequences. Panels: (a) TUD-Stadtmitte, (b) TUD-Campus, (c) PETS-S2L2. The X-axis is in logarithmic scale. The energy decreases rapidly while the window is small and more slowly once it grows beyond a moderate size.

5.3 Performance Evaluation

Evaluation Metrics: We apply the CLEAR MOT metrics [15] and the metrics of [29, 16] when evaluating our results. The multiple object tracking accuracy (MOTA) gives a combined accuracy based on the number of false positives (FP), identity switches (IDS) and missed targets (FN). The multiple object tracking precision (MOTP) measures the overlap between the bounding boxes of the ground truth and those given by the tracker. MT and ML indicate the numbers of mostly tracked and mostly lost targets. FG represents the number of fragmented tracks.
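
As a worked example of how the three error counts combine, the CLEAR MOT accuracy is one minus the ratio of all errors to the number of ground truth objects, both summed over all frames. The function name `mota` and its argument names are chosen here for illustration.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_ground_truth: int) -> float:
    """MOTA = 1 - (FN + FP + IDS) / GT, with all counts summed over the sequence."""
    if num_ground_truth <= 0:
        raise ValueError("the sequence must contain ground truth objects")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_ground_truth
```

Because the errors are not bounded by the number of ground truth objects, the score can become negative for a tracker that produces many false positives.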

Evaluation: As shown in Table 2, our method clearly outperforms the TC_ODAL method, which uses the same affinity model, and not only in MOTA. On some sequences, shown in Table 3, our method with the basic affinity model even reaches the performance of methods that use state-of-the-art affinity models.

Method                                 Type
AC-MOT (proposed, affinity of [3])     Delayed
TBD [13]                               Batch
TC_ODAL [3]                            Online
DP_NMS [23]                            Batch
Table 2: Performance evaluation. The best outcomes are marked in bold. ↑ means that higher is better, while ↓ means that lower is better. Methods evaluated using the same set of affinity descriptors are marked with the same symbol.
Method      AC-MOT    CEM [21]    MotiCon [17]    SegTrack [20]
Type        Delayed   Batch       Batch           Batch
KITTI-16 38.8 40.2
Table 3: MOTA on some sequences of the MOT Benchmark. We compare our method, using the affinity model of [3], with methods that use state-of-the-art affinity models.

6 Conclusion

This paper proposed an A-S Scheme for sequential approximation and a window-wise optimization framework based on the A-C Graph. The core idea of the method is to cluster the states subject to several constraints, e.g., states in the same frame cannot be clustered into one group. The A-C Graph together with the sliding window optimization transforms the global clustering into a sequential local clustering that self-organizes the structure in a relatively small state space, which can be done efficiently with little loss in the handling of occlusions. We showed experimentally that the characteristics of the window-wise optimization framework rarely change with the settings of the sequence. As the affinity model serves as the distance metric in the clustering, it influences the clustering results. A comparison of optimization models is therefore fair only if similar affinity models are adopted. The experimental results show that, with only the basic affinity model, our method achieves competitive performance even in an unfair test. In future work we will combine more state-of-the-art affinity models with the window-wise optimization model. We also plan to design a unified interface that makes it much easier to embed affinity models into different optimization models.


  • [1] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network flows. Prentice Hall, 1993.
  • [2] A. Andriyenko, K. Schindler, and S. Roth. Discrete-continuous optimization for multi-target tracking. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pages 1926–1933. IEEE.
  • [3] S.-H. Bae and K.-J. Yoon. Robust online multi-object tracking based on tracklet confidence and online discriminative appearance learning. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pages 1218–1225. IEEE.
  • [4] S.-H. Bae and K.-J. Yoon. Robust online multi-object tracking with data association and track management. 2014.
  • [5] H. Ben Shitrit, J. Berclaz, F. Fleuret, and P. Fua. Multi-commodity network flow for tracking multiple people. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 36(8):1614–1627, 2014.
  • [6] M. D. Breitenstein, F. Reichlin, B. Leibe, E. Koller-Meier, and L. Van Gool. Online multiperson tracking-by-detection from a single, uncalibrated camera. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 33(9):1820–1833, 2011.
  • [7] A. A. Butt and R. T. Collins. Multi-target tracking by lagrangian relaxation to min-cost network flow. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pages 1846–1853. IEEE.
  • [8] C. Canton-Ferrer, J. R. Casas, M. Pardàs, and E. Monte. Multi-camera multi-object voxel-based monte carlo 3d tracking strategies. EURASIP Journal on Advances in Signal Processing, 2011(1):1–15, 2011.
  • [9] W. Choi. Near-online multi-target tracking with aggregated local flow descriptor. arXiv preprint arXiv:1504.02340, 2015.
  • [10] C. Dicle, O. I. Camps, and M. Sznaier. The way they move: Tracking multiple targets with similar appearance. In Computer Vision (ICCV), 2013 IEEE International Conference on, pages 2304–2311. IEEE.
  • [11] C. Fantacci, B.-N. Vo, B.-T. Vo, G. Battistelli, and L. Chisci. Consensus labeled random finite set filtering for distributed multi-object tracking. arXiv preprint arXiv:1501.01579, 2015.
  • [12] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe. Sonar tracking of multiple targets using joint probabilistic data association. Oceanic Engineering, IEEE Journal of, 8(3):173–184, 1983.
  • [13] A. Geiger, M. Lauer, C. Wojek, C. Stiller, and R. Urtasun. 3d traffic scene understanding from movable platforms. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 36(5):1012–1025, 2014.
  • [14] S.-W. Joo and R. Chellappa. A multiple-hypothesis approach for multiobject visual tracking. Image Processing, IEEE Transactions on, 16(11):2849–2854, 2007.
  • [15] K. Bernardin and R. Stiefelhagen. Evaluating multiple object tracking performance: the clear mot metrics. EURASIP Journal on Image and Video Processing, 2008, 2008.
  • [16] C.-H. Kuo and R. Nevatia. How does person identity recognition help multi-person tracking? In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 1217–1224. IEEE.
  • [17] L. Leal-Taixé, M. Fenzi, A. Kuznetsova, B. Rosenhahn, and S. Savarese. Learning an image-based motion context for multiple people tracking. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pages 3542–3549. IEEE, 2014.
  • [18] L. Leal-Taixé, A. Milan, I. Reid, S. Roth, and K. Schindler. Motchallenge 2015: Towards a benchmark for multi-target tracking. arXiv preprint arXiv:1504.01942, 2015.
  • [19] W. Lu, C. Bai, K. Kpalma, and J. Ronsin. Multi-object tracking using sparse representation. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, pages 2312–2316. IEEE.
  • [20] A. Milan, L. Leal-Taixé, K. Schindler, and I. Reid. Joint tracking and segmentation of multiple targets.
  • [21] A. Milan, S. Roth, and K. Schindler. Continuous energy minimization for multitarget tracking. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 36(1):58–72, 2014.
  • [22] A. Milan, K. Schindler, and S. Roth. Detection-and trajectory-level exclusion in multiple object tracking. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pages 3682–3689. IEEE.
  • [23] H. Pirsiavash, D. Ramanan, and C. C. Fowlkes. Globally-optimal greedy algorithms for tracking a variable number of objects. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 1201–1208. IEEE, 2011.
  • [24] D. B. Reid. An algorithm for tracking multiple targets. Automatic Control, IEEE Transactions on, 24(6):843–854, 1979.
  • [25] A. V. Segal and I. Reid. Latent data association: Bayesian model selection for multi-target tracking. In Computer Vision (ICCV), 2013 IEEE International Conference on, pages 2904–2911. IEEE.
  • [26] A. W. Smeulders, D. M. Chu, R. Cucchiara, S. Calderara, A. Dehghan, and M. Shah. Visual tracking: An experimental survey. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 36(7):1442–1468, 2014.
  • [27] B. Wang and F. Wang. Multi-object tracking using least absolute deviation. In Image and Signal Processing (CISP), 2014 7th International Congress on, pages 60–65. IEEE.
  • [28] D. Wang, H. Lu, and M.-H. Yang. Least soft-threshold squares tracking. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pages 2371–2378. IEEE.
  • [29] B. Yang and R. Nevatia. An online learned crf model for multi-target tracking. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pages 2034–2041. IEEE.
  • [30] B. Yang and R. Nevatia. Multi-target tracking by online learning a crf model of appearance and motion patterns. International Journal of Computer Vision, 107(2):203–217, 2014.
  • [31] J. H. Yoon, M.-H. Yang, J. Lim, and K.-J. Yoon. Bayesian multi-object tracking using motion context from multiple objects. In Applications of Computer Vision (WACV), 2015 IEEE Winter Conference on, pages 33–40. IEEE.
  • [32] L. Zhang, Y. Li, and R. Nevatia. Global data association for multi-object tracking using network flows. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1–8. IEEE.