Dual approach for object tracking based on optical flow and swarm intelligence

08/15/2018
by   Rajesh Misra, et al.

In computer vision, object tracking is an old and complex problem. Although several algorithms for object tracking exist, several challenges remain unsolved. For instance, variation in illumination, noise, occlusion, sudden starts and stops of the moving object, and shading make object tracking a complex problem for both dynamic and static backgrounds. In this paper we propose a dual approach for object tracking based on optical flow and swarm intelligence. The optical-flow-based KLT (Kanade-Lucas-Tomasi) tracker tracks the dominant points of the target object from the first frame to the last frame of a video sequence, whereas the swarm-intelligence-based PSO (Particle Swarm Optimization) tracker simultaneously tracks the boundary information of the target object from the second frame to the last frame of the same video sequence. This dual function makes the tracker robust with respect to the problems stated above. The flexibility of our approach is that it can be applied successfully to variable backgrounds as well as static backgrounds. We evaluate the proposed dual tracking algorithm on several benchmark datasets and obtain very competitive results in general and superior results in most cases. We also compare the proposed dual tracker with some existing PSO-based tracking algorithms and achieve better results.


1 Introduction

Object tracking means following an object for as long as its movement can be captured by a camera, in environments with either a variable or a static background. Moving object detection and tracking pose a challenge in real-world scenarios such as automatic surveillance, traffic monitoring and vehicle navigation. In many scenarios, where the background changes dynamically due to camera motion, abrupt changes in the speed of the tracked object, changes in illumination, noise, occlusion etc., tracking becomes very complex and challenging. A tracking algorithm for such situations should therefore be robust, flexible and adaptive, and it should be capable of real-time execution.

Moving object tracking has been studied for several decades, and many methods have been proposed with a certain degree of accuracy and effectiveness. Several challenging problems nevertheless remain, for the reasons stated above. In this paper we adopt a dual tracking approach based on optical flow and swarm intelligence so that tracking becomes robust under these challenges.

Optical flow is the pattern of apparent motion between two consecutive frames, generated by movement of the object or the camera. The resulting optical flow vector is the displacement vector that carries the position of each pixel from the first frame to the second. Optical flow thus provides a good amount of motion information about a moving object, which has encouraged researchers to apply it to moving object detection as well as tracking. Several optical flow methods exist, such as the Lucas–Kanade method Tomasi and Kanade (1991), the Horn–Schunck method Horn and Schunck (1993), the Buxton–Buxton method Buxton and Buxton (1984) and the Black–Jepson method Jepson and Black (1993). Among all these optical flow methods Barron et al. (1994), Horn–Schunck and Lucas–Kanade are more popular than the others. Both have their own merits and demerits.
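For reference, most of these methods start from the brightness constancy assumption. The notation below is introduced here for illustration and is not taken from the paper: I(x, y, t) is the image intensity, (u, v) is the flow vector at a pixel, and subscripts denote partial derivatives.

$$ I(x, y, t) = I(x + u,\; y + v,\; t + 1) \quad\Longrightarrow\quad I_x u + I_y v + I_t \approx 0 $$

This is a single equation in two unknowns. Lucas–Kanade resolves the ambiguity by assuming that (u, v) is constant within a small window and solving in the least-squares sense, while Horn–Schunck adds a global smoothness term on the flow field.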

In object tracking, the KLT method has been applied by Sundaram et al. (2010) for multiple-point tracking in a parallel environment. Chen et al. (2011) segment the video object and apply an optical flow method to track each segment. Schwarz et al. (2012) use optical flow in subsequent intensity image frames to obtain motion information about the moving body and apply a graph-based representation to track the entire object. Aslani and Mahdavi-Nasab (2013) use optical flow together with some image processing methods to estimate the position of the object in consecutive frames, and track the whole object using those pixel positions. Kale et al. (2015) use optical flow to compute a motion vector that provides an estimate of the object position in consecutive frames. Although optical flow has been applied extensively in object detection and tracking, no existing method extracts the flow perfectly. Thus, the use of optical flow in object tracking is still a widely open problem Husseini (2017).

Wu et al. (2013) provide an object tracking benchmark together with the tracking results of some of the top-performing object tracking algorithms. In "Visual Tracking via Adaptive Structural Local Sparse Appearance Model" (ASLA) Jia et al. (2012), Jia et al. use a sparse representation to find the possible match to the target template with minimum reconstruction error. In "Beyond semi-supervised tracking: Tracking should be as simple as detection, but not simpler than recognition" (BSBT) Stalder et al. (2009), Stalder et al. use multiple supervised and semi-supervised classifiers to perform detection, recognition and tracking. In "Color-based probabilistic tracking" (CPF) Pérez et al. (2002), Pérez et al. use a Monte Carlo tracking method with a particle filter. In "Exploiting the circulant structure of tracking-by-detection with kernels" (CSK) Henriques et al. (2012), Henriques et al. use the theory of circulant matrices with the Fast Fourier Transform to detect and track the moving object. In "Real-time compressive tracking" (CT) Zhang et al. (2012), Zhang et al. create an appearance model based on features extracted from a multi-scale image space and compute a sparse measurement matrix; using that sparse matrix they compress foreground and background targets and perform tracking with a naive Bayes classifier.

Moudgil and Gandhi (2017) provide a benchmark dataset of long-duration video sequences, which they name "Track Long and Prosper" (TLP). This dataset is important because most tracking algorithms work well on short sequences but fail drastically on long, challenging video sequences. The dataset contains 50 long videos totalling nearly 400 minutes. The paper Moudgil and Gandhi (2017) includes some recent tracking algorithms. In "Learning multi-domain convolutional neural networks for visual tracking" Nam and Han (2016), Nam and Han use a convolutional neural network composed of domain-specific layers that are trained to capture different parts of the moving object. In "Fully-convolutional siamese networks for object tracking" Bertinetto et al. (2016), Bertinetto et al. create a fully-convolutional Siamese network trained on the ILSVRC15 dataset for video tracking. In "Crest: Convolutional residual learning for visual tracking" Song et al. (2017), Song et al. reformulate discriminative correlation filters as a one-layer convolutional neural network and apply residual learning to take appearance changes into account. In "Action-decision networks for visual tracking with deep reinforcement learning" Yoo et al. (2017), Yoo et al. propose a tracking algorithm that sequentially pursues actions learned by deep reinforcement learning. In "MEEM: robust tracking via multiple experts using entropy minimization" Zhang et al. (2014), Zhang et al. propose a multi-expert restoration method for the model drift problem in online tracking: they create an expert ensemble from which the best expert is selected by a minimum entropy criterion to correct undesirable model updates.

Bio-inspired methods are effective tools for object tracking and have received extensive attention in the past few decades Sevilla-Lara and Learned-Miller (2012). Among bio-inspired methods such as the Genetic Algorithm (GA) and Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO) has emerged quickly because of its efficiency, robustness and quick convergence. Some earlier works successfully addressed tracking problems using PSO. Zheng and Meng (2007), Zheng and Meng (2008) apply PSO in a high-dimensional feature space to search for the optimal match among Haar-like features detected by a predefined classifier set. Zhang et al. (2008) calculate the temporal continuity between two frames and use that information to guide the flight of the swarm particles. John et al. (2010) construct a human body model as a collection of numbered truncated cones, and the PSO cost function checks how well a pose matches the data taken from multiple cameras. Multiple-people tracking is considered by Ching-Han and Miao-Chun (2011), where the target object is modeled by a feature vector and PSO particles then search the search space for the optimal match. Keyrouz (2012) shows the use of multiple swarms for tracking multiple parts of an object; the swarms share information with each other so that the object is tracked as a whole. Multiple object tracking using PSO is also considered by Hsu and Dai (2012). They construct a feature model using a grey-level histogram and apply PSO particles to track the difference between the grey-level histogram information of consecutive frames in a video sequence. Kwolek (2013) presents an approach in which the object is represented by an image template on which a covariance matrix is formed; using a similarity measure, PSO tracks the difference between the movement of the object and the target template Kim et al. (2012).

We compare our proposed approach with some other PSO-based tracking algorithms. In "Multiple object tracking using particle swarm optimization" Hsu and Dai (2012), Hsu and Dai first create a grey-level histogram feature model and then distribute PSO particles, with the target object used as the fitness function. In "Real-Time Multiview Human Body Tracking using GPU-Accelerated PSO" Rymut and Kwolek (2014), Rymut and Kwolek track movement by placing a 3D human model in the pose described by each particle and then rasterizing it in each particle's 2D plane. In "Hierarchical Annealed Particle Swarm Optimization for Articulated Object Tracking" Nguyen et al. (2013), Nguyen et al. track articulated objects by decomposing the search space into subspaces and then using particle swarms to optimize over these subspaces hierarchically. In "Monocular Video Human Motion Tracking based on Hybrid PSO" Zhang (2014), Zhang tracks human pose in monocular video using a hybrid PSO method. In "Object tracking using Particle Swarm Optimization and Earth mover's distance" Xia and Ludwig (2017), Xia and Ludwig use PSO as the object localization method within a Bayesian tracking framework.

Although several approaches to object tracking have existed for decades, none of them considers a dual tracking approach, which is the major novelty of the present work. The proposed dual tracking approach is robust for short video sequences as well as long, challenging video sequences, under both static and variable backgrounds. An exhaustive survey of existing object tracking algorithms and a thorough review of deep learning approaches to object tracking are not within the scope of the present work; in this particular context, however, we would like to make a critical appreciation of deep neural network (DNN) learning for object tracking. Through several experimental studies we find that deep learning approaches to object tracking (based on Microsoft's ResNet, Google's Inception, Oxford's VGGNet etc.) are good for known classes of objects in fixed environments. Under a variable background (in a time-critical situation) with several uncertainties such as unknown objects, variation in illumination, noise, occlusion, sudden starts and stops of the moving object, shading etc., we observe that deep learning methods for object tracking do not produce satisfactory results, and in many cases the DNN requires further training for the new environment and the unknown objects Simpson (2015). In Simpson (2015), Simpson states that "It is an embarrassing fact that while deep neural networks(DNN) are frequently compared to the brain, and even their performance found to be similar in specific static tasks, there remains a critical difference; DNN do not exhibit the fluid and dynamic learning of the brain but are static once trained. For example, to add a new class of data to a trained DNN it is necessary to add the respective new training data to the preexisting training data and re-train (probably from scratch) to account for the new class. By contrast, learning is essentially additive in the brain – if we want to learn a new thing, we do".

Based on this observation Simpson (2015), a critical appraisal Marcus (2018) and our recent experimental studies on VGGNet, we categorize the DNN approach as a representation of the crystallized intelligence Horn (1968), Cattell (1963) of the network, built on learned or accumulated knowledge, with low capability of handling unknown environments, especially with unknown objects. We suggest that, for new environments with unknown objects, a DNN should have an added feature of fluid intelligence with some working memory that can handle novel or abstract problem-solving environments Schneider and McGrew (2012), Blair (2006), Horn (1968), Gray et al. (2003), Cattell (1987), Cattell (1963), Ashton and Lee (2006), Ackerman et al. (2005), Ackerman (2000). However, all such challenging issues and several other proposals Wolff (2018), Wolff (2004), Wolff (2013), Wolff (2014a), Wolff (2014b), Wolff (2016a), Wolff (2016b), Wolff (????), Wolff and Wolff (2017) should be thoroughly reevaluated before we come to any conclusion. Such issues should be considered separately elsewhere as an independent work.

In this paper we essentially try to extract the merit of fusing the KLT tracker with a PSO-based tracker. The two trackers supplement each other in an intelligent fashion, so that the dual tracker becomes simple, robust and cost-effective under variable and static backgrounds, and capable of handling the object tracking challenges that, as stated above, DNN-based tracking algorithms cannot tackle. The proposed dual tracking algorithm can successfully track objects in short video sequences as well as long, challenging video sequences. For an unknown object, the dual tracking approach simply needs to recalculate the dominant points on the contour of the unknown target object (or objects); there is no need to spend a large amount of time learning the unknown environment and object from the beginning of tracking, as we have seen with DNN-based tracking algorithms. The KLT tracker, based on optical flow, tracks the dominant points of the target object from the first frame to the last frame of the video sequence. This tracking of dominant points is supplemented by the swarm-intelligence-based PSO (Particle Swarm Optimization) tracker, which tracks the boundary information of the target object from frame 2 to the last frame. The flexibility of our approach is that it can be applied successfully to variable backgrounds as well as static backgrounds. The basic tracking sequence of the proposed dual tracking approach is as follows.

In the first frame of the video sequence we obtain the dominant points of the target object and start tracking them with the KLT tracker until the last frame. In frame 2 of the same video sequence the boundary of the target object is polygonally approximated for the first time. A multiswarms environment is generated and an annular ring (strip) of swarms is formed, within which the approximated polygon is embedded. From frame 3 onwards the shape of the annular ring (strip) of multiswarms changes, because the dynamically generated polygon, which captures the boundary information of the target object at each frame, changes shape with the movement of the target object, which is in general a non-rigid body. The newly generated polygon, embedded in the newly generated annular ring (strip) of multiswarms, is tracked from frame 3 by the PSO tracker along with the KLT tracker. This process of dual tracking continues until the last frame, with the shape of the approximated polygon and the shape of the annular ring (strip) changing dynamically at each frame of the video sequence.

If, in the course of tracking, some dominant points are lost due to unpredictable disturbances, the tracking procedures of both KLT and PSO are disturbed. In that case, instead of recomputing the dominant points, we reinitialize the missing dominant points by a heuristic approach that exploits the intelligence level of the swarms. Similarly, the positions of particles of an individual swarm may sometimes need to be reinitialized when disturbances pull them away from their swarm. When all the swarms around the boundary of the target object reach the optimal solution, a bounding box is generated around the target object based on the final positions of the particles. The entire method uses only one feature, namely the dominant points on the contour of the target object; the remaining boundary information of the target object is captured by the annular ring (strip) of multiswarms within which the polygonally approximated target object is embedded. Note that the notion of using dominant points on the contour of a target object as good features for tracking is derived from the concept of interest points proposed by Shi et al. (1994) and Tomasi and Kanade (1991). In our approach, instead of searching for interest points of an object, we directly compute dominant points on the contour of the target object and thereby reduce the search complexity of the KLT algorithm. In Section 2.3.3 we demonstrate experimentally that the set of dominant points on the contour of the target object is a subset of the interest points. Note further that the use of dominant points as good features for object tracking is an important and unique concept that is not used by the classical KLT algorithm.

The robustness of the proposed dual tracking algorithm under the challenges stated earlier is verified through several experimental studies on the benchmark datasets Wu et al. (2013), Moudgil and Gandhi (2017), Geiger et al. (2012). Another specialty of the proposed algorithm is its robustness on short video sequences as well as long, challenging video sequences, where most existing classical approaches fail Moudgil and Gandhi (2017). Note that in the proposed algorithm the KLT tracker, which tracks the dominant points of the target object, is continuously supplemented by the PSO tracker from frame 2 to the last frame. Because the polygonal approximation of the target object is embedded in the annular ring (strip) of multiswarms, the target object is tightly captured throughout the tracking sequence by the multiswarms environment, and no meaningful information about the target object is lost during tracking. Hence the proposed dual tracking algorithm is inherently robust. The proposed dual tracking algorithm has several striking features. Its overall performance on several benchmark datasets is very competitive and, in most cases, superior to that of the others.

The paper is organized as follows. Section 2 discusses the basic concepts, tools and techniques required for the dual tracking algorithm. Section 3 deals with the salient features of the proposed dual tracking algorithm. Section 4 describes the proposed algorithm pictorially. Section 5 provides the pseudocode and complexity analysis of the proposed algorithm. Section 6 provides detailed experimental studies on three benchmark datasets, together with analysis and performance measures of the proposed algorithm. Section 7 provides the conclusion and future work.

2 Dual approach for object tracking

2.1 Basic concepts

In this paper we propose a dual approach for object tracking based on optical flow and swarm intelligence. The optical-flow-based tracker, i.e. KLT, tracks the dominant points of the target object from frame 1 to the last frame, whereas the swarm-intelligence-based PSO (Particle Swarm Optimization) tracker simultaneously tracks the boundary information of the target object from frame 2 to the last frame. This dual function makes the tracker robust with respect to the problems stated above. In our approach, we calculate the dominant points of the target object in the first frame of the video sequence and track them until the last frame. From frame 2 of the same video sequence, the boundary information of the target object is captured by a dynamically generated polygon. The polygonal approximation of the target object at each frame is obtained by joining each pair of consecutive dominant points on the target object by a straight line segment. In frame 2, a group of particles is distributed randomly over the image search space. These particles form a swarm over each line segment of the dynamically generated polygon: each particle joins the swarm of the line segment to which its distance is smallest.

Thus a multiswarms environment is formed, and an annular ring (strip) of swarms is generated in which the dynamically generated polygon of the target object is embedded. If the boundary of the target object is a closed digital curve, an annular ring of swarms is formed, as shown in Figure 1; otherwise a strip of swarms is formed, as shown in Figure 2.

Figure 1: Annular ring of multiswarms.
Figure 2: Strip of multiswarms.
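The swarm-formation rule described above (each particle joins the swarm of its nearest polygon edge) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `point_segment_distance` and `form_swarms`, the Euclidean point-to-segment distance and the `closed` flag (ring versus strip) are all introduced here.

```python
import numpy as np

def point_segment_distance(points, a, b):
    """Euclidean distance from each row of `points` (N, 2) to the segment a-b."""
    ab = b - a
    denom = float(ab @ ab)
    if denom == 0.0:                       # degenerate edge: both ends coincide
        return np.linalg.norm(points - a, axis=1)
    # Projection parameter, clamped so the closest point stays on the segment.
    t = np.clip((points - a) @ ab / denom, 0.0, 1.0)
    closest = a + t[:, None] * ab
    return np.linalg.norm(points - closest, axis=1)

def form_swarms(particles, vertices, closed=True):
    """Return, for each particle, the index of its nearest polygon edge.

    Edge i joins vertices[i] and vertices[i + 1]; with closed=True the last
    vertex is joined back to the first (annular ring), otherwise not (strip).
    """
    vertices = np.asarray(vertices, dtype=float)
    particles = np.asarray(particles, dtype=float)
    n = len(vertices)
    n_edges = n if closed else n - 1
    dists = np.stack([
        point_segment_distance(particles, vertices[i], vertices[(i + 1) % n])
        for i in range(n_edges)
    ])                                     # shape (n_edges, N)
    return np.argmin(dists, axis=0)

# Unit-square polygon; edges 0..3 are bottom, right, top, left.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
particles = [(0.5, -0.2), (1.3, 0.5), (0.5, 1.1), (-0.4, 0.5)]
labels = form_swarms(particles, square)
print(labels)
```

Each label is the swarm (edge) index of one particle, so particles sharing a label form one swarm of the ring or strip.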

The vertices (dominant points) of the polygon are tracked by the KLT tracker, and the boundary information of the target object, approximated by the dynamically generated polygon embedded in the annular ring (strip) of multiswarms, is tracked by the PSO tracker from frame 2 to the last frame. At frame 3 the shape of the annular ring (strip) changes, because the dynamically generated polygon, which continuously captures the boundary information of the target object, changes shape with the movement of the target object, which is in general a non-rigid body. During this shape change, each swarm rearranges the positions of its particles so that they converge on the corresponding line segment of the newly generated polygon. Until all particles of every swarm have converged on the line segments of the newly generated polygon, the particles update their velocity and position based on the previous local best and global best positions, and the local best and global best positions are updated in turn. The newly generated polygon, embedded in the newly generated annular ring (strip) of multiswarms, is then tracked from frame 3 by the PSO tracker along with the KLT tracker. This process of dual tracking continues until the last frame, with the shape of the polygon and the shape of the annular ring (strip) changing dynamically at each frame of the video sequence. Thus the dual tracking approach tracks the dominant points on the contour of the target object and simultaneously tracks the tightly captured, embedded polygonal approximation of the target object.
The basic purpose of the dual tracking approach is that, during tracking, the multiswarms environment in which the approximated polygon is embedded continuously supplements the KLT tracker for dominant points from frame 2 to the last frame. Because the polygonally approximated target object is embedded and tightly captured within the ring (strip) of multiswarms, the target object is not lost midway through a video sequence under any of the environmental disturbances stated earlier. Another specialty of this dual tracking approach is that it successfully tracks long, challenging video sequences on which many classical tracking approaches fail drastically. This is mainly because the polygonal approximation of the target object is embedded and tightly captured in the multiswarms environment, so there is very little possibility that the target object is lost during tracking in a long, challenging video sequence.
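The velocity and position update mentioned above can be illustrated with the standard PSO rule applied to one swarm and one line segment. This is a sketch under assumptions, not the paper's algorithm: the fitness (distance of a particle to the segment), the parameter values `w`, `c1`, `c2` and the names `segment_distance` and `pso_converge` are chosen here for illustration.

```python
import numpy as np

def segment_distance(points, a, b):
    """Distance from each row of `points` (N, 2) to the segment a-b (a != b)."""
    ab = b - a
    t = np.clip((points - a) @ ab / float(ab @ ab), 0.0, 1.0)
    return np.linalg.norm(points - (a + t[:, None] * ab), axis=1)

def pso_converge(x, a, b, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Move one swarm towards segment a-b with the standard PSO update."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float).copy()
    v = np.zeros_like(x)
    pbest, pbest_f = x.copy(), segment_distance(x, a, b)   # local (personal) bests
    g = np.argmin(pbest_f)
    gbest, gbest_f = pbest[g].copy(), pbest_f[g]           # global best of the swarm
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        # Inertia + pull towards the local best + pull towards the global best.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        f = segment_distance(x, a, b)
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        g = np.argmin(pbest_f)
        if pbest_f[g] < gbest_f:
            gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    return x, pbest, gbest, gbest_f

a, b = np.array([0.0, 0.0]), np.array([10.0, 0.0])
start = np.random.default_rng(1).uniform(-5.0, 15.0, size=(20, 2))
x, pbest, gbest, gbest_f = pso_converge(start, a, b)
print("best distance to segment:", gbest_f)
```

By construction the local and global best fitness values never get worse from one iteration to the next, which is the property the convergence process relies on.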

If, in the course of tracking, a dominant point (or several) is lost due to environmental disturbances, the tracking procedures of both KLT and PSO are disturbed. In that case, instead of recomputing the dominant points, we reinitialize the missing ones by a heuristic approach that exploits the intelligence level of the swarms. Similarly, particles of an individual swarm may require reinitialization during the convergence process, which runs from frame 3 to the last frame. If, during convergence, a particle of a swarm is found to diverge from (instead of converge to) its global best position even after several iterations, its position is reinitialized to a position from which it can converge on the line segment of the swarm it was displaced from. After all particles of each swarm have converged on their line segment of the polygon, a bounding box around the target object is formed by a new PSO-based bounding box generation algorithm. Note that, as stated earlier, at frame 2 a group of particles is distributed randomly over the image search space. These particles take part in the formation of swarms on the individual line segments of the dynamically generated polygon of the target object. The population of particles is not fixed: it depends on the needs of the problem and is essentially a heuristic parameter Röhler and Chen (2011), Xueyan and Zheng (2015). If the population of particles at frame 2 is very large, the computational complexity of the entire algorithm may increase.
With this in mind we have to select the population of particles at frame 2. The flexibility of our approach is that it can be applied successfully to variable backgrounds as well as static backgrounds. The tools and techniques used to implement the basic concepts of dual tracking are discussed in the following.
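The text does not spell out the PSO-based bounding box generation at this point. As a purely illustrative stand-in, the sketch below assumes the simplest reading, an axis-aligned box around the final positions of all converged particles; the function name `bounding_box` and the (x_min, y_min, x_max, y_max) layout are assumptions made here.

```python
import numpy as np

def bounding_box(swarms):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around all particles.

    `swarms` is a list with one (N_i, 2) collection of final particle
    positions per line segment of the polygon.
    """
    pts = np.vstack([np.asarray(s, dtype=float) for s in swarms])
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return float(x_min), float(y_min), float(x_max), float(y_max)

# Three small swarms that have settled on three edges of a target.
swarms = [[(2, 3), (4, 3)], [(4, 3), (4, 7)], [(1, 5)]]
box = bounding_box(swarms)
print(box)
```

A larger particle population gives a denser sampling of the boundary but increases the work per frame, which is the trade-off noted above.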

2.2 Dominant Point Detection

For the detection of dominant points on the contour of the target object we use the methods of Ray and Ray (1992), Ray and Ray (2013) and Wu (2003a), Wu (2003b). We first perform contour tracking of the target object to find its chain code, based on Freeman's chain code Freeman (1961). The Freeman chain code gives us the list of pixels around the object body. Among those pixels we eliminate linear points (pixels), as they do not provide any significant curvature information. For the elimination of linear points we consider the following rule:

$$ c_{i-1} = c_i \;\Longrightarrow\; p_i \text{ is a linear point} \tag{1} $$

where $c_{i-1}$ is the previous chain code value and $c_i$ is the current one at the point $p_i$.

After excluding the linear points, the remaining points (pixels) are called breakpoints; these are the candidates for dominant points. We need to consider the region of support only for those breakpoints, so we calculate the length of support of each breakpoint. Rather than considering all breakpoints at once, we collect them in groups of 10 for a variable background and groups of 5 for a static background. The group size is decided by the type of background on which we perform the tracking. On a variable background the object shape normally changes fast, so we need to measure curvature over a shorter stretch of the contour in which a large number of breakpoints lie close to each other; that is why we choose a larger number of breakpoints. On a static background the object is more stable and we can measure curvature over a much longer stretch, so a smaller number of breakpoints suffices for dominant point calculation.
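The elimination of linear points can be sketched as below, assuming that a contour point is linear when the chain code entering it equals the chain code leaving it. The direction table `DIRS` (8-connected, counter-clockwise from east, y axis pointing up), the closed-contour wrap-around and the function names are assumptions made for this illustration.

```python
# Freeman 8-connected directions: code k -> (dx, dy), counter-clockwise from east.
DIRS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def contour_points(start, chain):
    """Pixels of a closed contour; point i is left along direction chain[i]."""
    pts = [start]
    x, y = start
    for c in chain[:-1]:            # the last move returns to the start pixel
        dx, dy = DIRS[c]
        x, y = x + dx, y + dy
        pts.append((x, y))
    return pts

def breakpoints(start, chain):
    """Keep the points whose incoming and outgoing chain codes differ."""
    pts = contour_points(start, chain)
    # chain[i - 1] is the code arriving at point i (wraps around for i = 0).
    return [p for i, p in enumerate(pts) if chain[i - 1] != chain[i]]

# A 2x2 square traced counter-clockwise from its lower-left corner.
chain = [0, 0, 2, 2, 4, 4, 6, 6]
print(breakpoints((0, 0), chain))
```

On this square the four mid-edge pixels are discarded as linear points and only the four corners remain as breakpoints.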

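For each group of breakpoints we calculate the k-cosine value of every breakpoint in the group, starting with k = 1 and increasing k by 1 until all breakpoints in the group have been reached. In its standard form, the k-cosine at a point $p_i = (x_i, y_i)$ is

$$ \cos_{ik} = \frac{\vec{a}_{ik} \cdot \vec{b}_{ik}}{|\vec{a}_{ik}|\,|\vec{b}_{ik}|}, \qquad \vec{a}_{ik} = (x_i - x_{i+k},\; y_i - y_{i+k}), \qquad \vec{b}_{ik} = (x_i - x_{i-k},\; y_i - y_{i-k}) \tag{2} $$

We choose as dominant points those breakpoints with the maximum k-cosine value in their group $G$, i.e.

$$ p_i \text{ is a dominant point if } \cos_{ik} = \max_{p_j \in G} \cos_{jk} \tag{3} $$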

Thus the entire procedure for calculating dominant points can be summarized as follows:

  • Use the Freeman chain code to perform contour tracking. Collect the contour pixels and store them in a file.

  • From the stored pixels, eliminate linear points using Eq. (1). Save the remaining points in a file and call them breakpoints.

  • Compute the k-cosine for each of the breakpoints using Eq. (2).

  • Select as dominant points those points which have the maximum k-cosine values, and collect the set of dominant points as per Eq. (3).
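The last two steps of this procedure can be sketched as follows. The sketch assumes the standard k-cosine (the cosine of the angle between the vectors from a point to its k-th neighbours on either side), a closed contour, and a single fixed `k` per call rather than the paper's incremental choice of k; `k_cosine` and `select_dominant` are hypothetical names.

```python
import numpy as np

def k_cosine(contour, i, k):
    """k-cosine at contour[i] on a closed contour (indices wrap around)."""
    n = len(contour)
    p = np.asarray(contour[i], dtype=float)
    a = p - np.asarray(contour[(i + k) % n], dtype=float)
    b = p - np.asarray(contour[(i - k) % n], dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_dominant(contour, candidates, group_size, k=1):
    """From each consecutive group of candidate indices keep the max k-cosine one."""
    dominant = []
    for s in range(0, len(candidates), group_size):
        group = candidates[s:s + group_size]
        scores = [k_cosine(contour, i, k) for i in group]
        dominant.append(group[int(np.argmax(scores))])
    return dominant

# Contour of a 2x2 square; even indices are corners, odd indices lie mid-edge.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
picked = select_dominant(square, list(range(8)), group_size=2)
print(picked)
```

A straight stretch gives a k-cosine of -1 and a right-angle corner gives 0, so the maximum within each group picks the sharpest turn. The group size would be 10 or 5 in the setting described above; 2 is used here only to keep the example small.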

2.3 Tracking of Dominant point(points) by KLT

2.3.1 Feature selection

Before tracking any moving object, the most fundamental step is the selection of "trackable" features. First we have to determine the parameters that identify good features. According to Tomasi and Kanade (1991), 'a single pixel cannot be tracked until it has a very distinctive brightness with respect to all of its neighbors'. Hence they prefer a "window" of pixels that contains sufficient texture. By "texture" we mean a group of neighboring pixels (a window of pixels) that shows significant variation in intensity or brightness between consecutive frames. Areas with a varying texture pattern are mostly unique in an image, while areas of uniform or linear intensity are common and not unique. Based on these guidelines we proceed as follows.

2.3.2 Selecting Dominant point(points) as good feature

The main reason for choosing a dominant point as a trackable feature is that, by definition Ray and Ray (2013), a dominant point holds maximum curvature information on the contour of a target object. A window centered at a dominant point should therefore always give us enough texture for tracking from one frame to another. The area of such a window can vary, depending on the number of features. The dominant point acts as an “interest point” which captures maximal local intensity information. Every basic KLT algorithm starts by finding corners or interest points satisfying the following equation Shi et al. (1994):

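$$\min(\lambda_1, \lambda_2) > \lambda \tag{4}$$

where $\lambda_1$ and $\lambda_2$ are the two eigenvalues and $\lambda$ is a predefined threshold. Rather than applying a separate algorithm for finding good interest points which satisfy equation-(4), we consider the dominant points as our interest points. Let us define the image gradient as follows:

$$g = \begin{bmatrix} g_x \\ g_y \end{bmatrix} = \begin{bmatrix} \partial I / \partial x \\ \partial I / \partial y \end{bmatrix}$$

We consider the product of the gradient and its transpose as follows:

$$g g^T = \begin{bmatrix} g_x^2 & g_x g_y \\ g_x g_y & g_y^2 \end{bmatrix}$$

If we integrate the matrix defined above over the area W (the selected window), we get

$$Z = \iint_W \begin{bmatrix} g_x^2 & g_x g_y \\ g_x g_y & g_y^2 \end{bmatrix} dx\, dy \tag{5}$$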

Z is a 2x2 matrix containing texture information along the X and Y axes. By analyzing the eigenvalues of the matrix Z we obtain W, the window of pixels that is trackable. The equation for Z forms an integral part of the Kanade-Lucas-Tomasi tracking algorithm. It is necessary to establish a minimum threshold for the eigenvalues: if the two eigenvalues of Z are $\lambda_1$ and $\lambda_2$, we accept a window which satisfies equation-(4).
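The acceptance test for a window can be illustrated with a short NumPy sketch. It is a minimal example under stated assumptions rather than our implementation: the names `structure_matrix` and `is_trackable`, the window half-width and the threshold value are hypothetical, the integral over W is replaced by a sum over the window pixels, and the gradients are approximated with `numpy.gradient`.

```python
import numpy as np

def structure_matrix(image, row, col, half):
    """Sum of g g^T over a (2*half+1) x (2*half+1) window centered at (row, col)."""
    gy, gx = np.gradient(image.astype(float))   # derivatives along rows (y) and columns (x)
    win = (slice(row - half, row + half + 1), slice(col - half, col + half + 1))
    gx_w, gy_w = gx[win], gy[win]
    return np.array([[np.sum(gx_w * gx_w), np.sum(gx_w * gy_w)],
                     [np.sum(gx_w * gy_w), np.sum(gy_w * gy_w)]])

def is_trackable(image, row, col, half=3, threshold=1.0):
    """Accept the window when the smaller eigenvalue of Z exceeds the threshold."""
    eigenvalues = np.linalg.eigvalsh(structure_matrix(image, row, col, half))
    return bool(eigenvalues.min() > threshold)

# Toy image: a bright quadrant whose corner lies at pixel (10, 10).
img = np.zeros((20, 20))
img[10:, 10:] = 1.0
corner_ok = is_trackable(img, 10, 10)   # intensity varies in both directions
edge_ok = is_trackable(img, 15, 10)     # intensity varies in one direction only
flat_ok = is_trackable(img, 3, 3)       # no intensity variation
```

In this toy image only the window around the corner has two large eigenvalues; along the straight edge one eigenvalue is zero, and in the flat region both are zero.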

Figure 3: Original image; no dominant point is selected so far.
(a) Image gradient along the X-axis.
(b) Image gradient along the Y-axis.
Figure 4: Image gradients along the X-axis and the Y-axis, from left to right.
Figure 5: RED dots indicate the pixels which satisfy equation-(4).

2.3.3 Dominant points as subset of interest points

In section 2.2 we stated that a dominant point holds maximum curvature information on the contour of a target object and provides enough texture for tracking. In this subsection we clarify this concept further through a simple experiment, which shows that the dominant points are a subset of the interest points that are the key elements of the KLT tracking algorithm. In figure-(5) the RED dots are the interest points as per equation-(4), and the dominant points chosen for the target object are a subset of this set of RED dots.

In figure-(3) we show the original image and in figure-(4) the image gradient along the x axis and the y axis. Figure-(5) shows the feature points which satisfy equation-(4). Experimentally we found that the dominant points calculated using equation-(3) Ray and Ray (1992) are a subset of the interest points of a selected window of feature points, as stated above. So we can move to the next step of the KLT tracking algorithm by considering the dominant points as our interest points, which we then do not have to search for Shi et al. (1994).

2.3.4 Concepts of tracking dominant points by KLT

The basic notion of tracking by KLT can be explained by looking at two images in an image sequence. Let us assume that the first image is captured at time $t$ and the second image is captured at time $t + \Delta t$. It is important to keep in mind that the incremental time $\Delta t$ depends on the frame rate of the video camera and should be as small as possible. An image can be represented as a function of the variables x and y. We denote a window in the image taken at time $t + \Delta t$ as $I(x, y, t + \Delta t)$. The basic assumption of the KLT tracking algorithm is

$$I(x, y, t + \Delta t) = I(x - \Delta x,\; y - \Delta y,\; t) \tag{6}$$

From equation-(6) it is clear that every point in the second window can be obtained by shifting every point in the first window by an amount $(\Delta x, \Delta y)$. This amount is defined as the displacement $d = (\Delta x, \Delta y)$, and the main goal of tracking is to calculate d.

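2.3.5 Calculation of feature displacement

Now we have the basic information needed to solve for the displacement d mentioned above. The solution is explained in Birchfield (1997). According to Birchfield (1997), we can calculate the displacement d from image frame I to image frame J. Thus we obtain the dissimilarity

$$\epsilon = \iint_W \left[ J\!\left(\mathbf{x} + \frac{\mathbf{d}}{2}\right) - I\!\left(\mathbf{x} - \frac{\mathbf{d}}{2}\right) \right]^2 w(\mathbf{x})\, d\mathbf{x} \tag{7}$$

where $\mathbf{x} = [x, y]^T$, the displacement $\mathbf{d} = [d_x, d_y]^T$, and the weighting function $w(\mathbf{x})$ is usually set to the constant 1. The Taylor series expansion of J about a point $\mathbf{a} = [a_x, a_y]^T$, truncated to the linear term, is

$$J(\boldsymbol{\xi}) \approx J(\mathbf{a}) + (\xi_x - a_x)\frac{\partial J}{\partial x}(\mathbf{a}) + (\xi_y - a_y)\frac{\partial J}{\partial y}(\mathbf{a}) \tag{8}$$

where $\boldsymbol{\xi} = [\xi_x, \xi_y]^T$. Following the derivation, we let $\mathbf{x} + \mathbf{d}/2 = \boldsymbol{\xi}$ and $\mathbf{a} = \mathbf{x}$ to get

$$J\!\left(\mathbf{x} + \frac{\mathbf{d}}{2}\right) \approx J(\mathbf{x}) + \frac{d_x}{2}\frac{\partial J}{\partial x}(\mathbf{x}) + \frac{d_y}{2}\frac{\partial J}{\partial y}(\mathbf{x}) \tag{9}$$

In continuation of equation-(9) we calculate

$$\frac{\partial \epsilon}{\partial \mathbf{d}} = 2\iint_W \left[ J\!\left(\mathbf{x} + \frac{\mathbf{d}}{2}\right) - I\!\left(\mathbf{x} - \frac{\mathbf{d}}{2}\right) \right] \left[ \frac{\partial J(\mathbf{x} + \mathbf{d}/2)}{\partial \mathbf{d}} - \frac{\partial I(\mathbf{x} - \mathbf{d}/2)}{\partial \mathbf{d}} \right] w(\mathbf{x})\, d\mathbf{x} \approx 2\iint_W \left[ J(\mathbf{x}) - I(\mathbf{x}) + \mathbf{g}^T(\mathbf{x})\,\mathbf{d} \right] \mathbf{g}(\mathbf{x})\, w(\mathbf{x})\, d\mathbf{x} \tag{10}$$

where $\mathbf{g} = \left[ \frac{\partial}{\partial x}\!\left(\frac{I + J}{2}\right),\; \frac{\partial}{\partial y}\!\left(\frac{I + J}{2}\right) \right]^T$. To calculate the displacement d, we set the derivative to 0:

$$\iint_W \left[ J(\mathbf{x}) - I(\mathbf{x}) + \mathbf{g}^T(\mathbf{x})\,\mathbf{d} \right] \mathbf{g}(\mathbf{x})\, w(\mathbf{x})\, d\mathbf{x} = 0 \tag{11}$$

Solving further, we get the simplified equation

$$Z\,\mathbf{d} = \mathbf{e} \tag{12}$$

where Z is the 2x2 matrix $Z = \iint_W \mathbf{g}(\mathbf{x})\,\mathbf{g}^T(\mathbf{x})\, w(\mathbf{x})\, d\mathbf{x}$ and e is the 2x1 vector $\mathbf{e} = \iint_W \left[ I(\mathbf{x}) - J(\mathbf{x}) \right] \mathbf{g}(\mathbf{x})\, w(\mathbf{x})\, d\mathbf{x}$. So the displacement d is the solution of equation-(12).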
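As a numerical illustration of equation-(12), the following NumPy sketch estimates the displacement of one window in a single step. It is a minimal example under stated assumptions, not our implementation: the function name `klt_displacement` and the synthetic test pattern are hypothetical, the weighting function is set to 1, the integrals are replaced by sums over the window, Z is built from the averaged gradients of the two frames, e from the frame difference I - J, and the gradients are approximated with `numpy.gradient`. No iteration or image pyramid is used, so the sketch is only meaningful for sub-pixel displacements.

```python
import numpy as np

def klt_displacement(I, J, window):
    """Solve Z d = e for one window; returns d = (dx, dy)."""
    gy_i, gx_i = np.gradient(I)              # derivatives along rows (y) and columns (x)
    gy_j, gx_j = np.gradient(J)
    gx = 0.5 * (gx_i + gx_j)[window]         # averaged gradient of the two frames
    gy = 0.5 * (gy_i + gy_j)[window]
    diff = (I - J)[window]
    Z = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                  [np.sum(gx * gy), np.sum(gy * gy)]])
    e = np.array([np.sum(diff * gx), np.sum(diff * gy)])
    return np.linalg.solve(Z, e)

def pattern(x, y):
    """Smooth synthetic texture with variation along x, y and the diagonal."""
    return np.sin(0.2 * x) + np.cos(0.15 * y) + np.sin(0.1 * (x + y))

ys, xs = np.mgrid[0:40, 0:40].astype(float)
true_d = np.array([0.3, -0.2])
I = pattern(xs, ys)
J = pattern(xs - true_d[0], ys - true_d[1])  # frame I shifted by true_d
window = (slice(5, 35), slice(5, 35))
d = klt_displacement(I, J, window)           # close to true_d for this smooth pattern
```

For larger motions the usual practice is to repeat this step after warping the window by the current estimate; that refinement is outside the scope of this sketch.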

2.3.6 KLT algorithm

We summarize the KLT algorithm as follows:
Step 1: Find the dominant points which satisfy $\min(\lambda_1, \lambda_2) > \lambda$ (see equation-(4)).
Step 2: For each dominant point compute displacement to next frame using the Lucas-Kanade method (see equation -(12)).
Step 3: Store displacement of each dominant point, update the position of the dominant point.
Step 4: Go to step 2 until all dominant points are exhausted.

2.4 Particles Swarm Optimization (PSO) method for tracking

In 1995 James Kennedy and Russell Eberhart proposed an evolutionary algorithm that created a ripple among bio-inspired algorithms. This algorithm is called Particle Swarm Optimization (PSO) Eberhart and Kennedy (1995). In simple terms it is a method for the optimization of continuous non-linear functions. The method is influenced by swarming theory from the biological world, such as fish schooling and bird flocking Ahmed and Glasgow (2012).
PSO is effectively applied to problems in which each solution can be considered as a point in a solution space. Particle is the term associated with such a point. As an analogy, suppose there is a food source and a swarm of birds tries to reach it. Every bird tries to reach the food source by its own choice. Whichever bird has reached, or nearly reached, the food source shares that information with the other birds which are its close neighbors. Like a ripple in water, the information flows through the entire swarm, and every bird synchronously updates its velocity and position if this gives it a better position, i.e. one nearer to the food source. As a result, after a certain period of time the entire swarm eventually gathers at the food source. In the same way, every solution, considered as a particle, computes its value based on some cost function until it satisfies a certain criterion known as the stopping condition. It keeps updating its velocity and position as long as its neighbor has a better solution.
Position and velocity are the two quantities associated with each particle in Particle Swarm Optimization. The position of every particle is updated using the particle’s own velocity. Let $x_i(t)$ denote the position of particle i in the search space at time t. The position update formula is as follows:

$$x_i(t+1) = x_i(t) + v_i(t+1) \tag{13}$$

where $v_i(t+1)$ is the velocity of particle i at time (t+1), which is computed using the following formula:

$$v_i(t+1) = v_i(t) + c_1 r_1 \left( P_{lbest} - x_i(t) \right) + c_2 r_2 \left( P_{gbest} - x_i(t) \right) \tag{14}$$

where $c_1$ and $c_2$ represent the relative influence of the cognitive and social components, respectively. They are also known as learning rates and are often set to the same constant value, to give each component equal weight.

$r_1$, $r_2$ = random values associated with the learning rate components, to give more robustness.

$P_{lbest}$ = particle local best position: the historically best position achieved so far by the particle.

$P_{gbest}$ = particle global best position: the historically best position of the entire swarm, which is the position of the particle that has achieved the closest solution.

Equation (14) is Kennedy and Eberhart’s original idea. Much research has followed it, and from that research a remarkable idea emerged: Shi and Eberhart (1998) added a new factor called the “inertia weight”, or “w”. After the addition of the inertia weight, Eq (14) becomes

$$v_i(t+1) = w\, v_i(t) + c_1 r_1 \left( P_{lbest} - x_i(t) \right) + c_2 r_2 \left( P_{gbest} - x_i(t) \right) \tag{15}$$

This inertia weight helps to balance local and global search abilities: a small weight favours local search and a larger weight favours global search Carlisle and Dozier (2001). Pseudo code of the basic PSO algorithm is given in the appendix.
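One iteration of the inertia-weight update can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the pseudo code of the appendix: the function name `pso_step` and its argument names are hypothetical, the random factors are drawn independently per coordinate from [0, 1), and the default values of w and of the two learning rates are placeholders.

```python
import numpy as np

def pso_step(x, v, p_lbest, p_gbest, w=0.3, c1=0.1, c2=0.1, rng=None):
    """One velocity update with inertia weight followed by the position update."""
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(x.shape)                       # random factor of the cognitive term
    r2 = rng.random(x.shape)                       # random factor of the social term
    v_new = w * v + c1 * r1 * (p_lbest - x) + c2 * r2 * (p_gbest - x)
    x_new = x + v_new
    return x_new, v_new

# Two particles in the image plane, starting at rest at their own best positions.
x = np.array([[0.0, 0.0], [4.0, 2.0]])
v = np.zeros_like(x)
gbest = np.array([2.0, 1.0])
x_new, v_new = pso_step(x, v, x.copy(), gbest, rng=np.random.default_rng(0))
```

Because both particles start at rest at their own local best, only the social term is active in this example and each particle moves a small step towards the global best.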

In this paper the PSO based tracker tracks the dynamically approximated polygon of the target object and continuously supplements the tracking of the dominant points of the target object by KLT.

2.4.1 Setting PSO parameters and Initialization

Because of the dynamic nature of the problem, setting the PSO parameters to the right values is a crucial task. Below we discuss some of the major parameters.

  • Multiswarms - In the proposed dual tracking algorithm one tracker is PSO based. In the basic concept of section 2.1 we explained that a key feature of the dual tracking algorithm is the ring (strip) of multiswarms within which the approximated polygonal representation is embedded during tracking. The number of swarms is decided by the number of dominant points of the target object. If a target object has D dominant points, the number of line segments which polygonally approximate the target object is equal to (D - 1). Thus there will be (D - 1) swarms. During tracking, due to the several disturbances stated earlier, dominant points of the target object may be lost midway, and some of the particles of the swarm based on such a dominant point will then also be distracted. In such cases, as described in section 3.1 and section 3.2, the algorithm automatically reinitializes the lost dominant points and the lost particles of the swarms. In our tracking experiments we have seen that in the worst case approximately 10% of the particles, including dominant points, need to be reinitialized.

  • Population of particles - We initialize the population of particles needed to construct a swarm around each individual line segment of the approximated polygon of the target object. It is chosen heuristically depending on the needs of the application. For object tracking under a static background the particle population is 25, and for object tracking under a variable background it is 33. Increasing the population size increases the computational complexity of the PSO tracker. Also, the longer a line segment is, the more particles will converge on it to form a swarm as per the PSO algorithm.

  • Position and velocity initialization - According to the PSO methodology we need to initialize the position and velocity of every particle of the swarm. The position of each particle of each swarm lies inside the search space and is randomly defined. In our case we first consider the range of the image space within which the target object lies. If the image space is represented by the range [x,y] for each frame, then we select the values of the velocity components in the x-direction and y-direction to be Shi et al. (2001)

    $1 \le v_x, v_y \le 3$

    where $v_x$ and $v_y$ denote the velocity in the X and Y directions.
    Velocity signifies how far a single particle will jump. As we are working on image pixels, it cannot be a negative or fractional value, and setting a high value is not practical either, as the particles work in close vicinity of the dominant points.

  • Local best value of particle i ($P_{lbest}$) - The local best value of an individual particle in a swarm is the best position it has achieved so far while converging on the target line segment between two consecutive dominant points. We initialize each particle’s $P_{lbest}$ value with its initial position inside the search space [x,y], which is randomly defined at the very beginning as stated above. Later it is modified according to the $P_{lbest}$ updating rule.

  • Global best value ($P_{gbest}$) - In a particular swarm, the particle which holds the best position, i.e. the one closest to the line segment between two consecutive dominant points, is considered the global best particle and its position is $P_{gbest}$. Each particle first computes its perpendicular distance from the line segment connecting the dominant points. The position of the particle which has the minimum distance is taken as $P_{gbest}$.

    $$P_{gbest} = \arg\min_{(x_0, y_0)} \frac{\left| (y_2 - y_1)\,x_0 - (x_2 - x_1)\,y_0 + x_2 y_1 - y_2 x_1 \right|}{\sqrt{(y_2 - y_1)^2 + (x_2 - x_1)^2}} \tag{16}$$

    where $(x_1, y_1)$ is the position of the 1st dominant point, $(x_2, y_2)$ is the position of the 2nd dominant point and $(x_0, y_0)$ is the coordinate of the particle.

  • Initialization of w, $c_1$ and $c_2$ - We defined the meaning of these terms earlier, in equation-(14) and equation-(15). The initialization of these variables depends entirely on the application. In this paper, after some experiments, we choose w = 0.3, $c_1$ = 0.1, $c_2$ = 0.1, and an integer between 1 and 3 for each velocity component. The convergence of the PSO algorithm depends on these parameters, and we are basically guided by the information provided in Shi et al. (2001).

2.4.2 Polygonal approximation of the target object

For polygonal approximation of the target object we draw small line segments between consecutive dominant points of the target object. Let us consider two dominant points $D_1$ and $D_2$ which are calculated using equation-(3). The path between these two points is the small line segment joining them. There could be infinitely many curves (not straight line segments) passing through these two dominant points, but in this paper we consider the Euclidean distance between the two points.
In Cartesian coordinates, $D_1 = (x_1, y_1)$ and $D_2 = (x_2, y_2)$ are the two points in Euclidean space. The distance between these two points is calculated as follows:

$$d(D_1, D_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{17}$$

We illustrate this using the arbitrary curve C shown in figure-(6), which contains several dominant points. The small line segments pass through pairs of consecutive dominant points. Although a line segment is not exactly the curve connecting two dominant points, the figure shows that it serves the purpose of approximately representing the contour (boundary) of the curve, as in fig-1. Thus we obtain a polygonal approximation of the arbitrarily chosen curve C. As we are not detecting or tracking the exact contour of the target object, but focus only on the moving area of the target object approximated by a polygon, the polygonal approximation does not pose any serious threat to tracking. It is always possible to construct the exact curvature between two consecutive dominant points of the target object, but such a construction is time consuming and does not really improve the tracking result. This can be a scope for future work.

Figure 6: This Example Diagram shows approximate Curvature calculation using Pythagorean formula

2.4.3 Fitness Function for PSO tracker

Every PSO model is based on some cost function. Each particle of the swarm computes that fitness function in each iteration to confirm whether it converges to the final solution or not. In this paper, our cost function is the perpendicular distance of the particle i to the small line segment which is a part of the approximated polygon of the arbitrary curve C.

Figure 7: This Diagram shows Fitness Function Calculation and how it is selected

Based on figure-(6), we draw figure-(7). Here we have a particle $P_0 = (x_0, y_0)$ and a small line segment L. We compute the perpendicular distance from the point $P_0$ to the line L.
As L passes through the two dominant points $D_1 = (x_1, y_1)$ and $D_2 = (x_2, y_2)$, the distance from the point $P_0$ is

$$\mathrm{dist}(P_0, L) = \frac{\left| (y_2 - y_1)\,x_0 - (x_2 - x_1)\,y_0 + x_2 y_1 - y_2 x_1 \right|}{\sqrt{(y_2 - y_1)^2 + (x_2 - x_1)^2}} \tag{18}$$

The denominator is the length between $D_1$ and $D_2$. The numerator is twice the area of the triangle with its vertices at the three points $P_0$, $D_1$ and $D_2$. For every particle we compute the perpendicular distance from the particle to the small line segment as stated above; if the distance is within the acceptable range the iteration stops, otherwise the procedure continues.
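The fitness function can be written in a few lines of Python. This is an illustrative sketch only: the function names `perpendicular_distance` and `has_converged` and the tolerance value are hypothetical, and the distance is measured to the infinite line through the two dominant points, as in the description above.

```python
import math

def perpendicular_distance(p0, d1, d2):
    """Distance from particle p0 to the line through dominant points d1 and d2."""
    (x0, y0), (x1, y1), (x2, y2) = p0, d1, d2
    # Numerator: twice the area of the triangle (p0, d1, d2).
    numerator = abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1)
    # Denominator: length of the segment between d1 and d2.
    denominator = math.hypot(x2 - x1, y2 - y1)
    return numerator / denominator

def has_converged(p0, d1, d2, tolerance=1.0):
    """Stopping test: the particle is within the acceptable range of the line."""
    return perpendicular_distance(p0, d1, d2) <= tolerance

fitness = perpendicular_distance((0.0, 1.0), (-1.0, 0.0), (1.0, 0.0))
```

In the example the particle lies one pixel above a horizontal segment, so the fitness value is 1.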

2.4.4 Formation of Multiswarms

Once the task of constructing the polygon of the target object is completed, we distribute particles over the entire image space for the first time at frame 2 of the video sequence. Note that the population of the particles is a heuristic parameter which depends on the needs of the problem and for which several options exist, as stated in de Melo and Delbem (2012), Melo et al. (2009), Jiang et al. (2007). These particles form a swarm over each small line segment of the polygon according to the smallest value of the fitness function.
Each particle of frame 2 first measures its perpendicular distance from each small line segment and chooses the nearest line segment as the line on which it will lie to form a swarm. Thus all particles of the image space of frame 2 are distributed over the small line segments of the approximated polygon of the target object and form a multiswarm scenario at frame 2 of the video sequence. This multiswarm scenario is nothing but an annular ring (strip) of swarms within which the approximated polygon of the target object is embedded (see figure-1 and 2).
Dual tracking of the target object starts from frame 2. The vertices of the polygon, which are essentially the dominant points of the target object computed at frame 1 of the video sequence, are tracked at frame 2, where they are embedded in the annular ring (strip) of the multiswarms as dominant points of the approximated polygon of the target object. These vertices (dominant points) are tracked by KLT, which follows them from frame 1 to the last frame of the video sequence, whereas the entire target object, approximated by the polygon and embedded in the multiswarm environment, is tracked by the PSO tracker from frame 2 to the last frame.
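The formation of the multiswarms can be illustrated by assigning each particle to the line segment with the smallest fitness value. The sketch below is a minimal NumPy example under stated assumptions, not our implementation: the names `line_distances` and `form_multiswarms` are hypothetical, the D dominant points are assumed to be given in contour order so that they define (D - 1) segments, and the distance is taken to the infinite line through each pair of dominant points.

```python
import numpy as np

def line_distances(particles, d1, d2):
    """Perpendicular distance of every particle to the line through d1 and d2."""
    x0, y0 = particles[:, 0], particles[:, 1]
    (x1, y1), (x2, y2) = d1, d2
    numerator = np.abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1)
    return numerator / np.hypot(x2 - x1, y2 - y1)

def form_multiswarms(particles, dominant_points):
    """Return, for each particle, the index of the segment whose swarm it joins."""
    segments = list(zip(dominant_points[:-1], dominant_points[1:]))
    distances = np.stack([line_distances(particles, a, b) for a, b in segments], axis=1)
    return np.argmin(distances, axis=1)

# Three dominant points give two segments: one along y = 0 and one along x = 10.
dominant = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
particles = np.array([[5.0, 1.0], [9.0, 6.0], [2.0, 0.5]])
labels = form_multiswarms(particles, dominant)
```

Here the first and third particles join the swarm of the horizontal segment and the second particle joins the swarm of the vertical segment.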

2.4.5 Updation and Reshaping of the annular ring of the multiswarms

When the dual trackers arrive at frame 3 of the video sequence, the shape of the polygon has automatically changed due to the movement of the target object, which is in general non-rigid. In the case of a rigid object the shape of the polygon of the target object remains the same. Once the shape of the polygon changes at frame 3, the particles inside a swarm are redistributed on the small line segments of the changed polygon according to the fitness function. Until all the particles inside a swarm have successfully converged on the small line segment of the changed polygon, they keep updating their velocity and position using formula-(14) and (13) respectively.
$P_{lbest}$ and $P_{gbest}$ are updated as

(19)

=

(20)

2.4.6 Reinitialization of the particle of the individual swarm

While the positions of the particles of the individual swarms are being updated, a particle, instead of converging on the small line segments of the changed polygon, may drift far away from those line segments even after several update iterations. In that case we need to reinitialize the particle. Reinitializing particles over the entire image space is certainly feasible but not a practical idea. For further illustration see section-3.

2.5 Bounding Box formulation

To identify the tracked target object, a rectangular bounding box is usually used. Some pre-defined algorithms exist for this purpose, but here we design our own bounding box based on the PSO particle positions, which best suits our object tracking algorithm.

The main idea is that whenever all particles in all swarms have successfully converged for a particular image frame, we find the p particles which have the smallest X-direction and smallest Y-direction values. These particles are close to (0,0) in our image space. They can be, for example, the first 10 particles with minimum values in the X and Y directions. The choice of this number depends entirely on the application. In our experimental experience it should lie between 10 and 20. We take the average of these p points, which is the starting point for the formation of the bounding box. Let us call this point q; it is calculated as follows:

$$q = (x_q, y_q) = \left( \frac{1}{p}\sum_{i=1}^{p} x_i,\; \frac{1}{p}\sum_{i=1}^{p} y_i \right) \tag{21}$$
Figure 8: Explaining how particles are converging towards object boundary and based on that we are calculating bounding boundary

Figure-8 shows the point q. Now we compute the length and breadth of the bounding box. The length is the vertical line; both quantities are calculated as follows.

First select the l particles which have maximum X-direction values and minimum Y-direction values. They can be, for example, the first 10 particles which are maximum in the X-direction and minimum in the Y-direction. This number depends on the designer's choice and the application. According to our experience it is effective to take the first 10 to 20 such particles. Averaging them, we get the point $q_l$ as follows.

$$q_l = (x_l, y_l) = \left( \frac{1}{l}\sum_{j=1}^{l} x_j,\; \frac{1}{l}\sum_{j=1}^{l} y_j \right) \tag{22}$$

The length is the Euclidean distance from the point $q(x_q, y_q)$ to the point $q_l(x_l, y_l)$, as shown below.

$$L = \sqrt{(x_l - x_q)^2 + (y_l - y_q)^2} \tag{23}$$

Similarly, for the breadth calculation, first select the b particles which have minimum X-direction values and maximum Y-direction values.

$$q_b = (x_b, y_b) = \left( \frac{1}{b}\sum_{j=1}^{b} x_j,\; \frac{1}{b}\sum_{j=1}^{b} y_j \right) \tag{24}$$

The breadth is the Euclidean distance from the point $q(x_q, y_q)$ to the point $q_b(x_b, y_b)$, as follows.

$$B = \sqrt{(x_b - x_q)^2 + (y_b - y_q)^2} \tag{25}$$

Once we get the 3 parameters; length(L), breadth(B) and starting point q, using equation-(26) we construct the bounding box as shown in Fig-9.

Figure 9: How Bounding Box actually calculated
(26)
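The computation of the starting point, length and breadth can be sketched as follows. This is a hedged NumPy illustration, not our implementation, and it relies on assumptions that are not stated in the text: the names `mean_of_extreme` and `bounding_box` are hypothetical, "smallest X and smallest Y" is interpreted as the smallest sum x + y, "maximum X and minimum Y" as the smallest y - x, "minimum X and maximum Y" as the smallest x - y, and the same count is used for all three groups of particles.

```python
import numpy as np

def mean_of_extreme(particles, score, count):
    """Average of the `count` particles with the smallest score."""
    order = np.argsort(score)[:count]
    return particles[order].mean(axis=0)

def bounding_box(particles, count=10):
    """Return the starting point q, the length and the breadth of the box."""
    x, y = particles[:, 0], particles[:, 1]
    q = mean_of_extreme(particles, x + y, count)        # small x and small y
    far_x = mean_of_extreme(particles, y - x, count)    # large x and small y
    far_y = mean_of_extreme(particles, x - y, count)    # small x and large y
    length = float(np.linalg.norm(far_x - q))
    breadth = float(np.linalg.norm(far_y - q))
    return q, length, breadth

# Toy example: four converged particles at the corners of a 10 x 5 rectangle.
corners = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 5.0], [10.0, 5.0]])
q, length, breadth = bounding_box(corners, count=1)
```

With one particle per group the starting point is the corner nearest to the origin and the two distances are the side lengths of the rectangle.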


3 Salient features of the proposed algorithm

3.1 Re-initialization of missing dominant points

Due to background clutter, occlusion, illumination variation, low resolution and scale variation in various video sequences, the image background changes frequently. As a result, the optical flow based Kanade-Lucas-Tomasi (KLT) tracker, which is basically a point tracker, is unable to track a single point throughout the duration of the video. Hence the proposed algorithm is based on the fusion of optical flow and swarm intelligence. After the first frame of tracking using KLT, PSO provides continuous support by capturing the overall information of the object, including its dominant points, through the automatic generation of a polygon whose vertices are the dominant points of the object being tracked. This polygon is automatically updated from frame to frame as the object moves. As the object (usually non-rigid) moves, its shape changes, and this change is captured by the newly generated polygon at each frame. Thus the dual function of optical flow and swarm intelligence provides the tracking algorithm with the total information, in an approximate sense.

We track the dominant points using KLT. As the video sequence changes a lot, there is a very high probability that the KLT tracker may lose some of these dominant points in the course of tracking. If the tracking of a dominant point is disturbed, then the particles of the corresponding swarm are also distracted and the PSO tracker may fail to track. To avoid this situation we propose the reinitialization of missing dominant points.
Dominant point re-initialization.

3.1.1 Pictorial Illustration of Reinitialization of missing dominant points

We explain the reinitialization of missing dominant points using an example. Let us consider a curve C whose start (s) and end (e) points are dominant points. We track these dominant points using the KLT method.

In figure-(10) there are two dominant points (s and e), which are marked in RED. These two points are tracked by the KLT tracking algorithm, and all yellow points are swarm particles which spread over the line joining the two dominant points.

Figure 10: Initial curvature C(s,e) with two dominant points (RED dot) tracked by KLT and particle swarms(yellow dot)

If the curved object being tracked is moving from left to right, then due to the various reasons stated above KLT may lose track of one of the dominant points, as shown in figure-(11).

Figure 11: Dominant point (s) lost indicated by RED point shifted left. Blue arrow show direction of movement of the object. One of the particle(yellow dot) lost and other particle still successfully track the curvature

As we can see in figure-(11), KLT has lost track of dominant point s. We also assume that some of the swarm particles may be lost during tracking because they are distributed over the path which originates from point s. This is shown by the yellow dot near the lost dominant point.

In figure-(12) we show how re-initialization happens. The PSO algorithm tracks the curvature of the object, which is approximated by a straight line between s and e. When we observe over a few frames that s is not moving, i.e. its pixel position is not changing, we consider that KLT has lost point s. Once we detect that loss, we need another dominant point to continue our curvature tracking. We do not compute another dominant point using formula-(3) Ray and Ray (1992), because recomputing a lost dominant point is not feasible in real-time tracking. Instead, we assign the role of the dominant point to the moving PSO particle which is nearest to the lost dominant point s. As we already keep track of all particles which follow the path joining the two dominant points, this is a much more feasible and viable approach. It requires no additional computation, and we do not need to find where the closest particle will be.
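The detection and replacement of a lost dominant point can be sketched as follows. This is a minimal illustration under stated assumptions, not our implementation: the names `is_lost` and `reinitialize_dominant_point`, the number of frames inspected and the motion tolerance are hypothetical, and the candidate particles are assumed to be those of the corresponding swarm that are still following the moving boundary.

```python
import numpy as np

def is_lost(track, frames=3, tol=0.5):
    """track: positions of one dominant point, newest last.
    The point is considered lost when it has not moved over the last `frames` frames."""
    if len(track) < frames + 1:
        return False
    recent = np.asarray(track[-(frames + 1):], dtype=float)
    steps = np.linalg.norm(np.diff(recent, axis=0), axis=1)
    return bool(np.all(steps < tol))

def reinitialize_dominant_point(lost_point, swarm_particles):
    """Replace the lost dominant point by the nearest particle of its swarm."""
    particles = np.asarray(swarm_particles, dtype=float)
    offsets = particles - np.asarray(lost_point, dtype=float)
    nearest = np.argmin(np.linalg.norm(offsets, axis=1))
    return particles[nearest]

stuck_track = [(5.0, 5.0)] * 4                          # the point s has stopped moving
moving_particles = [(8.0, 5.0), (6.0, 6.0), (11.0, 7.0)]
new_s = reinitialize_dominant_point(stuck_track[-1], moving_particles)
```

In the example the stationary point is flagged as lost and the particle at (6, 6) takes over its role.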

Figure 12: The nearest neighbor point is selected as the new dominant point. The other particles track as usual. There is no considerable delay.

In figure- (13) we have shown, after selection of a new dominant point, the entire curvature tracking is resumed.

With this approach we neither lose our tracking nor introduce any delay or break in tracking due to the loss of a dominant point. But the question is whether replacing a lost dominant point by a new one, rather than actually computing a dominant point, is feasible. We need to remember that we are not tracking the exact contour of the tracked object, so we do not need to follow the exact curvature of the object body. Rather, the PSO particles track the approximated curvature of the object, which in our case is simply a straight line joining two dominant points. Although the newly selected point is not an exact dominant point, it easily serves our purpose of following the object boundary. This heuristic approach to designing a tracker is basically an attempt to extract the element of intelligence of a swarm.

Figure 13: Newly selected point marked as s and curvature tracking is resumed as earlier

3.2 Re initialization of the particles of the swarms

Reinitialization of particles is sometimes required, as it is inherent in the nature of PSO that a few particles diverge too far from their desired position and even several updates may not bring them towards their goal. In our case it is also possible that some particles are too far away from the curvature boundary and are still unable to converge after a finite number of iterations. Then we need to reinitialize those particles. Reinitializing particles over the entire image space is certainly feasible but not a practical idea, because they may diverge again. We have a better chance of convergence if we place the diverged particle at the current position of a dominant point. Diagrammatically we can represent this as in figure-(14).

Figure 14: at frame number-f we detect two swarm particle are diverged marked as RED points

Let us consider two swarm particles which start diverging at frame number f, and suppose we detect this after a few more frames have been processed. Say that at frame (f + t) we find that the two particles have diverged. In frame (f+t+1) we take action to reposition them. A lot of research has been done on the repositioning of diverged particles Röhler and Chen (2011), Richards and Ventura (2004), de Melo and Delbem (2012). In Richards and Ventura (2004), Richards et al. use generators from centroidal Voronoi tessellations as the starting points for the swarm. In de Melo and Delbem (2012), de Melo et al. consider an algorithm named Smart Sampling (SS), which finds regions with a high possibility of containing a global optimum; a meta-heuristic can then be initialized inside each region to find that optimum. Smart Sampling (SS) and Differential Evolution (DE) are combined into the SSDE algorithm to evaluate the approximate position of the diverged particles. We could choose and apply any of these methods that works successfully. In the present context, however, we consider another approach which we think is more effective in our case.

At frame number (f+t+1), we simply consider the X and Y positions of those diverged particles and update them according to the following formulas.

$$x_k = \mathit{dompts}_x \tag{27}$$
$$y_k = \mathit{dompts}_y \tag{28}$$

where
$x_k$ is the latest X-directional position value of the k-th diverged particle of the swarm,
$y_k$ is the latest Y-directional position value of the k-th diverged particle of the swarm,
$\mathit{dompts}_x$ is the X-directional value of the dominant point of the swarm,
$\mathit{dompts}_y$ is the Y-directional value of the dominant point of the swarm.
We need to keep in mind that dompts is a set of dominant points which is continuously updated frame by frame as per our algorithm. So whenever we mention dompts we always refer to the latest updated dominant points. Another point worth mentioning is that the swarm particles try to converge on the straight line joining two dominant points. So if a swarm particle has diverged from its desired location over a particular line segment and we place it directly on a dominant point of that line segment according to the above formulas, the question is which of the two dominant points at the ends of the segment we should choose. In practice it makes no serious difference if we choose either of these two dominant points arbitrarily.

The basic intuition behind the above equations is that the main objective of diverged particles is to get as near as possible to the straight line joining two consecutive dominant points. So rather than computing a complex mathematical function and performing an extensive number of iterations, we take the simplest approach and place the diverged particles directly on either of the two dominant points around which the swarm was already formed and from which the particles diverged. Figure-(15) shows the newly updated positions of the diverged particles.
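The repositioning rule can be sketched in a few lines. This is an illustrative NumPy example under stated assumptions, not our implementation: the name `reinitialize_diverged` and the distance threshold are hypothetical, divergence is detected here by a simple threshold on the perpendicular distance instead of by counting unsuccessful iterations, and the first of the two dominant points is chosen arbitrarily as the new position.

```python
import numpy as np

def reinitialize_diverged(particles, d1, d2, max_distance):
    """Place every particle that is too far from the line (d1, d2) on dominant point d1."""
    particles = np.asarray(particles, dtype=float).copy()
    x0, y0 = particles[:, 0], particles[:, 1]
    (x1, y1), (x2, y2) = d1, d2
    numerator = np.abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1)
    distance = numerator / np.hypot(x2 - x1, y2 - y1)
    diverged = distance > max_distance
    particles[diverged] = d1           # new x and y are those of the dominant point
    return particles, diverged

swarm = [[5.0, 0.5], [5.0, 40.0]]      # the second particle has drifted far away
updated, diverged = reinitialize_diverged(swarm, (0.0, 0.0), (10.0, 0.0), max_distance=5.0)
```

After the call the second particle sits on the dominant point (0, 0), while the first particle is left untouched.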

Figure 15: Convergence of the diverged swarm particle at frame (f+t+1)

4 Some further illustration on the proposed dual tracking algorithm

Step 1: First we extract the first frame from the input video and convert it into a binary image. By trial and error we find a pixel point on the boundary of the target object, as shown in figure-(16). Note that for simplicity of illustration we consider the front view of an object. In practice it can be any given aspect of the object to be tracked.

Figure 16: A human body on which a boundary point is mapped using the trial and error method

Step 2: Apply the Freeman chain code, starting from this boundary point, to find the breakpoints as shown in figure-(17).

Figure 17: Boundary points are detected using the Freeman chain code

Step 3: Find the dominant points using the maximum cosine values Ray and Ray (1992). The procedure first eliminates all linear points and subsequently finds those points which have maximum cosine values. We denote the set of dominant points as D. The result is shown in figure-(18).

Figure 18: Selected Dominant points calculated from breakpoints are shown here

Step 4: The dominant points, shown by RED dots in figure-(18), are tracked by the KLT tracker from frame 1 to frame 2. In figure-(19) the blue arrows show how KLT tracks each dominant point independently from frame to frame. On frame 2 we distribute the PSO particles randomly over the image space. The distribution of the PSO particles is shown in figure-(20). Again, for simplicity of illustration we show the front view of an object in frame 2. In practice frame 2 can show any given aspect of the object.

Figure 19: Tracking of dominant points as shown by Red Dots,by KLT tracker from one frame to another.
Figure 20: PSO particles distributed over the image space randomly

Step 5: On frame 2, as stated above and as shown in figure-(21), we draw the lines joining consecutive dominant points. The green dots are PSO particles and the RED dots are dominant points. The straight lines joining consecutive dominant points are shown in black, and the green PSO particles spread along those straight lines. The yellow arrows show the movement of the PSO particles. The right hand object of figure-(21) is a polygonal approximation of the left hand object of figure-(21). The vertices of the polygonal approximation (i.e. of the right hand object) represent the dominant points of the object at frame 2.

Figure 21: Polygonal approximation of the Object Boundary at frame 2.
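
A minimal sketch of how one swarm per polygon edge might be seeded, assuming particles are placed at uniformly random positions along each line segment joining consecutive dominant points. The function name `seed_swarms` and the fixed random seed are illustrative assumptions.

```python
import random


def seed_swarms(dominant_pts, particles_per_swarm, rng=None):
    """Place one swarm of particles on each edge of the closed polygon."""
    rng = rng or random.Random(0)  # fixed seed only to make the example repeatable
    swarms = []
    n = len(dominant_pts)
    for i in range(n):
        (x0, y0), (x1, y1) = dominant_pts[i], dominant_pts[(i + 1) % n]
        swarm = []
        for _ in range(particles_per_swarm):
            t = rng.random()  # position along the edge, 0 at the first vertex
            swarm.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
        swarms.append(swarm)
    return swarms


polygon = [(0, 0), (4, 0), (4, 3)]
swarms = seed_swarms(polygon, particles_per_swarm=5)
```

Each edge of the polygon gets its own swarm, which matches the multi-swarm picture described in the following steps.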

Step 6: Figure-(22) shows that, from frame-2 to frame-3, optical flow and swarm intelligence simultaneously perform the task of dual tracking. KLT (based on the concept of optical flow) tracks the dominant points of the object (i.e. the vertices of the polygon) and PSO (based on the concept of swarm intelligence) tracks the boundary of the object (approximated by the straight lines of the polygon).
The green points are PSO particles, distributed along the straight line between two consecutive dominant points. The PSO particles distributed over each small line segment of the dynamically generated polygon of the target object form a swarm. Thus a multi-swarm scenario is generated around the polygonally approximated target object, and the approximated polygon is embedded in the annular ring (strip) of the multi-swarms. Dominant points are marked in RED. Blue arrows show the tracking of the dominant points performed by KLT. Yellow arrows show the tracking of the boundary of the curved object (approximated by straight lines) performed by the PSO particles.
Step 7: Note that, in the process of dual tracking, when the polygonally approximated target object reaches the 3rd frame of the video sequence, the shape of the annular ring (strip) of the multi-swarm changes, because the shape of the dynamically generated polygon changes with the movement of the non-rigid target object. The newly generated polygon of the target object is automatically embedded in the changed annular ring (strip) of the multi-swarm, and the process of dual tracking proceeds from frame-3 to the last frame. Figure-(23) shows the polygon created by PSO particles at frame no. 4. Thus, in addition to tracking the dominant points of the object, which are the vertices of the polygon dynamically created by PSO (see frame-4 of figure-23), we simultaneously track, with PSO particles, the boundary of the curved object, which is approximated by the straight lines of the polygon (see frame-4 of figure-23).

Figure 22: Simultaneous tracking of moving object by both KLT and PSO.
Figure 23: Fourth object shows the polygonal approximation dynamically created by PSO particles.

Step 8: From frame 2 up to the last frame of the video sequence, a bounding box is constructed from the positions of the PSO particles, as shown in figure-(24).

Figure 24: Bounding box is designed based on PSO particles position

5 Algorithmic summary and Complexity analysis

The Dual Tracking Algorithm (DTA) is presented in pseudocode below, followed by its complexity analysis.

5.0.1 Pseudocode

Procedure: DualTrackingAlgorithm(DTA)(videoSequence_with_target_object)
         [Get frames from the input video]
         Frames ← CALL Algorithm Frame_Extraction(videoSequence_with_target_object) [see Appendix]
         [Calculate breakpoints of the target object and store those points in the “brpts” variable]
         brpts ← CALL Algorithm BrPtCal(Frames) [see Appendix]
         [Calculate dominant points from the breakpoints “brpts”]
         dompts ← CALL Algorithm DominantPt(brpts) [see Appendix]

         [Define number of swarms and number of particles in each swarm]
         nSwarm ← number of swarms
         ss ← number of particles per swarm

         [Initialization of each particle’s position and velocity for each swarm]
             For swarm ← 1 to nSwarm
                 For particle ← 1 to ss
                      Initialize the particle’s velocity and position.
                      Initialize plbest and pgbest.
                      Compute Procedure FitnessComputePSO
                 End For
             End For

             For frame ← 1 to Frames
         [Track dominant points from the 1st frame to the last frame using the KLT tracker]
                 For dmpt ← 1 to dompts
                      old_dmpt ← dompts(dmpt)
                      dominantpts ← CALL klt(dmpt) [see Appendix]
         [Dominant point re-initialization]
                      If dominantpts(dmpt) - old_dmpt(dmpt) = 0
                          CALL DominantPointReInitialization(AcceptedParticles, dompts(dmpt)) [see Appendix]
                      End If
                 End For
         [Track all PSO particles around the curvature for each frame]
                 For swarm ← 2 to nSwarm
                      For particle ← 1 to ss
                          Compute Procedure FitnessComputePSO()
                          If not “accepted” in FitnessComputePSO() then
                               Update position of particle using equation – (1)
                               Update velocity of particle using equation – (2)
                          End If
                      End For
                 End For
             End For
         [Draw bounding box based on the particle positions available from the “AcceptedParticles” vector]
         CALL Algorithm BoundingBox(AcceptedParticles) [see Appendix]
End Procedure DualTrackingAlgorithm(DTA)

5.0.2 Complexity analysis

To calculate the time complexity of Procedure DualTrackingAlgorithm(DTA)() we compute the complexity of every sub-algorithm it calls; summing these gives the approximate time complexity of the algorithm.

Complexity of Algorithm Frame_Extraction(Video_input): reading a video file, extracting each frame and storing the frames in a separate file requires O(f), where f is the number of frames.

Complexity of Algorithm BrPtCal(Frames): if the boundary of the target object contains p pixels, then the Freeman chain code checks at most 8 directions for the boundary condition at each pixel. So the maximum time required to find all the breakpoints of the object is O(7*p), which we can consider as linear time O(q), where q is approximately 7*p.

Complexity of Algorithm DominantPt(brpts): let the number of breakpoints be b; then no_region = b/[5-10]. Computing a k-cosine value requires constant time. So the final time required is O(no_region * b * k), where k is a constant time, no_region is the number of regions and b is the number of breakpoints.

Complexity of Algorithm klt(dmpt): assume that the number of warp parameters is n and the number of pixels in T is N. The total computational cost of each iteration of the Lucas-Kanade algorithm is O(n^2 * N + n^3); a detailed discussion is given in Lucas-Kanade 20 Years On: A Unifying Framework: Part 1 Baker and Matthews (2004).

Complexity of Algorithm BoundingBox(AcceptedParticles): to get the time complexity of this algorithm we only need the time complexity of the height and breadth procedures. The height procedure is O(h), where h is the number of particles that must be checked to see whether they lie within the object boundary range. Similarly, the breadth procedure is O(br), where br is the number of particles in the breadth computation. So the overall time complexity of the BoundingBox() algorithm is O(br + h).

Complexity of Algorithm DominantPointReInitialization(AcceptedParticles, dompts(dmpt)):
If the total number of accepted particles is n, then sorting the vector containing those n particles takes at best O(n log n). After sorting we take the first particle as the new dominant point, so the overall time complexity is O(n log n).

Final complexity of DualTrackingAlgorithm(DTA)(): Frame_Extraction(), BrPtCal() and DominantPt() are each called only once, so the time required for them is
O(f) + O(q) + O(no_region * b * k)
= O(f) + O(q) + O(no_region * b) [k is constant]
= O(f) + O(q) + O(b^2) [no_region is much less than b]
= O(f + q) + O(b^2) = O(b^2)
Initialization of the PSO particles takes O(n) time, where n is the number of particles. KLT is called for every frame, so for f frames this gives f * O(n^2 * N + n^3), where n here is the number of warp parameters. DominantPointReInitialization() is called very few times, so its cost is approximately O(n log n), where n is the number of accepted particles. Finally, PSO runs for each frame, which requires O(f), where f is the number of frames.
Total time complexity = [O(b^2) + f * O(n^2 * N + n^3) + O(f)], where b is the number of breakpoints, f is the number of frames, n is the number of warp parameters and N is the number of pixels in KLT tracking.

6 Experimental Result and Analysis

6.1 Experimental Setting

The proposed dual tracking approach, for variable and static backgrounds under the different challenges stated earlier, is tested in MATLAB 2015a on a 64-bit PC with a 3 GHz Intel i5 processor. The frame size is 180 × 144. The static-background video is 20 sec long, whereas the variable-background video is 13 sec long.

6.2 Experimental Dataset-1 (Wu et al.) Wu et al. (2013)

All the experimental datasets are taken from the benchmark library created by Yi Wu, Jongwoo Lim and Ming-Hsuan Yang Wu et al. (2013) and Wu et al. (2013), which is available at http://pami.visual-tracking.net

6.2.1 Tracking Results of the proposed method

From the experimental test data set we pick 3 video streams that have a static background and in which the point of interest moves at a high to moderate rate: from the TB-50 sequences, Girl[SV,OCC,IPR,OPR], Walking2[SV,OCC,LR] and Walking[LR,IV]. We also pick 3 video streams from the TB-100 sequences for the dynamic background: jogging(1)(2)[OCC,DEF,OPR], Suv[OCC,IPR,OV] and Walking[OCC]. The attributes we consider are: SV = Scale Variation, OCC = Occlusion, DEF = Deformation, IPR = In-Plane Rotation, OPR = Out-of-Plane Rotation, OV = Out of View, BC = Background Clutter, IV = Illumination Variation, LR = Low Resolution.

Based on these video streams we demonstrate the tracking results of the proposed dual tracking algorithm.

6.2.2 Static background

The first experiment is on a static background. We consider the Walking[LR,IV], Girl[SV,OCC,IPR,OPR] and Walking2[SV,OCC,LR] datasets in figure-(25) to figure-(30).

Figure 25: A sequence of frames of a single person moving towards the camera against a static background. The proposed dual tracking algorithm successfully tracks the target object as it moves in the Walking dataset. Green dots show the dominant points and RED dots show the swarm particles.
Figure 26: In the Walking dataset, a bounding box based on the PSO-based algorithm is shown. For representational clarity the dominant points and swarm particles are not shown explicitly inside the bounding box.
Figure 27: The dual tracking approach tracks the movement of the face in the Girl dataset. Green dots show the dominant points and RED dots show the swarm particles.
Figure 28: The bounding box is shown around the face in the Girl dataset. For representational clarity the dominant points and swarm particles are not shown explicitly inside the bounding box.
Figure 29: The dual tracking approach tracks the movement of the target object in the Walking2 dataset. Green dots show the dominant points and RED dots show the swarm particles.
Figure 30: The bounding box is shown around the target object in the Walking2 dataset. For representational clarity the dominant points and swarm particles are not shown explicitly inside the bounding box.

6.2.3 Variable Background

Now we perform our experiment on videos in which the background moves along with the object, i.e. videos taken by a moving camera. Here we consider 3 video sequences from the TB-100 sequences, namely jogging[1][2][OCC,DEF,OPR], Suv[OCC,IPR,OV] and Walking[OCC,BC]. Both the tracking results and the bounding box representations are shown below in figure-(31) to figure-(36).

Figure 31: The tracking result obtained by the dual tracking approach for the jogging (1)(2) dataset. Green dots show the dominant points and RED dots show the swarm particles.
Figure 32: The bounding box representation for the jogging (1)(2) dataset. For representational clarity the dominant points and swarm particles are not shown explicitly inside the bounding box.
Figure 33: The tracking result obtained by the dual tracking approach for the SUV dataset. Green dots show the dominant points and RED dots show the swarm particles.
Figure 34: The bounding box representation for the SUV dataset. For representational clarity the dominant points and swarm particles are not shown explicitly inside the bounding box.
Figure 35: The tracking result obtained by the dual tracking approach for the Walking dataset. Green dots show the dominant points and RED dots show the swarm particles.
Figure 36: The bounding box representation for the Walking dataset. For representational clarity the dominant points and swarm particles are not shown explicitly inside the bounding box.

We have tested the overall performance of our proposed Dual Tracking Algorithm (DTA) not only on the above datasets but also on other datasets, such as DOG[SV,DEF,OPR], FOOTBALL[OCC,IPR,OPR,BC], HUMAN2[IV,SB,OPR], HUMAN3[SV,OCC,DEF], GIRL[SV,OCC,IPR,OPR], SINGER1[IV,SV,OCC,OPR], SKATER2[SV,DEF,FM,IPR,OPR] and WOMEN[IV,SV,OCC,DEF]. The tracking results are shown in figure-(37).

Figure 37: The tracking results obtained by the dual tracking approach for the above-mentioned datasets. Green dots show the dominant points and RED dots show the swarm particles.

6.2.4 Analysis and Evaluation

We evaluate the proposed dual tracking algorithm (DTA) using three parameters: True Detection (TD), False Detection (FD) and Missed Detection (MD). We also consider Frames per Second (FPS), since a high-speed (high FPS) video sequence has a substantial impact on tracking. TD is evaluated as the percentage of frames in each video sequence in which the object is successfully detected and tracked. The mathematical formulation of TD is:

\[ TD = \frac{n_{td}}{N} \times 100 \tag{29} \]

where N = total number of frames in a video sequence and $n_{td}$ = number of frames that qualify as truly detected. Successful object detection and tracking is decided by the following rule: a frame is counted as successful if the proposed bounding box around the detected and tracked object overlaps the bounding box of the given ground truth. The rule is stated in terms of the centroids Ct of the ground truth and of the target object and of the bounding boxes B, using the ratio of the area of the proposed bounding box of the target object and the area of the bounding box of the ground truth. If the object position detected by our algorithm does not match the position indicated by the ground truth, or the two boxes do not overlap, then the detection is false and is represented as:

(30)

If the algorithm is unable to detect target object in a frame, but ground truth value exists; then the situation is considered as Missed Detection and is represented as:

(31)

We have compared the proposed Dual Tracking Algorithm (DTA) with other state-of-the-art algorithms: Visual tracking via adaptive structural local sparse appearance model (ASLA) Jia et al. (2012), Beyond semi-supervised tracking: Tracking should be as simple as detection, but not simpler than recognition (BSBT) Stalder et al. (2009), Color-based probabilistic tracking (CPF) Pérez et al. (2002), Exploiting the circulant structure of tracking-by-detection with kernels (CSK) Henriques et al. (2012) and Real-time compressive tracking (CT) Zhang et al. (2012).
Table-1 compares the execution speed, in FPS, of the tracking algorithms listed above for each of the six attributes. DTA achieves the highest value for every attribute except OPR, where BSBT is higher (72 versus 70).

Attributes  ASLA Jia et al. (2012)  BSBT Stalder et al. (2009)  CPF Pérez et al. (2002)  CSK Henriques et al. (2012)  CT Zhang et al. (2012)  DTA
OCC         47                      56                          34                       45                           67                      78
SV          57                      45                          32                       45                           61                      67
DEF         49                      45                          67                       33                           45                      71
OPR         62                      72                          56                       55                           65                      70
IPR         58                      46                          66                       71                           56                      74
BC          59                      78                          34                       78                           56                      87
Table 1: Attribute-wise execution time based on Frames per Second (FPS) on the benchmark datasets Wu et al. (2013) and Wu et al. (2013).
  • red: rank1, blue: rank2

For each of the above-mentioned tracking algorithms, True Detection (TD), False Detection (FD) and Missed Detection (MD) are evaluated for the six attributes and presented in Table-2.

           Mean TD(%)                                  Mean FD(%)                            Mean MD(%)
Attribute  ASLA   BSBT   CPF    CSK    CT     DTA      ASLA  BSBT  CPF   CSK   CT    DTA     ASLA  BSBT  CPF   CSK   CT    DTA
OCC        77     73.77  72.36  72.9   77.5   80.1     4.72  4.2   4.81  4.12  3.92  3.63    2.0   3.41  3.67  3.12  3.41  1.97
SV         85.23  80     78.1   71.1   79.23  88.19    4.20  2.9   2.2   2.7   2.9   2.0     3.17  1.2   1.43  2.8   2.91  1.2
DEF        70.63  67.21  67.92  60.51  69.86  75.13    4.25  4.21  3.16  3.84  4.3   3.27    3.17  2.92  2.86  3.62  3.19  2.57
OPR        60.45  68     57.20  61.4   62.4   69.23    5.67  5.91  5.57  5.9   5.7   5.28    3.13  2.92  3.96  2.7   3.7   2.4
IPR        60.24  61     52.1   59.1   59.4   65.23    5.7   5.81  5.77  6.3   6.7   5.21    3.19  3.32  3.95  3.12  3.17  3.0
BC         51.45  68     57.1   60.1   62.4   69.73    5.93  5.81  5.77  5.99  5.87  5.28    3.13  3.92  3.83  2.6   3.9   2.1
  • red: rank1, blue: rank2

Table 2: Attribute-wise experimental results on the benchmark datasets Wu et al. (2013)

In figure-(38), we show that the proposed Dual Tracking Algorithm (DTA) achieves superior performance in terms of Precision, Recall and F-measure in comparison with other state-of-the-art algorithms.

Figure 38: The tracking result obtained by the proposed Dual Tracking Algorithm (DTA) for the above-mentioned datasets.

In Table-3 we represent (average number of success) / (average number of failure) based on 6 attributes with respect to 5 tracking algorithms, where the overlap threshold value is 0.5.

Attributes  ASLA Jia et al. (2012)  BSBT Stalder et al. (2009)  CPF Pérez et al. (2002)  CSK Henriques et al. (2012)  CT Zhang et al. (2012)  DTA
OCC         56.0/3.8                32.0/7.7                    47.9/5.8                 52.7/5.1                     43.7/6.4                68.5/3.4
SV          54.0/3.9                32.5/7.7                    44.1/6.6                 52.0/5.0                     40.2/6.8                64.6/4.5
DEF         50.5/4.5                20.6/8.7                    44.2/6.8                 48.1/5.7                     39.4/6.8                51.3/3.8
OPR         56.3/3.7                30.4/7.9                    47.6/5.9                 44.9/6.1                     51.5/4.4                54.4/3.9
IPR         52.1/4.1                31.2/7.6                    43.1/6.4                 42.8/6.0                     48.2/5.3                54.3/5.3
BC          59.2/3.0                30.8/7.8                    42.8/6.4                 42.2/5.9                     50.7/4.9                62.3/2.8
Table 3: Performance measurement of different algorithms on different attributes.
  • red: rank1, blue: rank2

Here, the proposed Dual Tracking Algorithm (DTA) attains the highest average success for every attribute except OPR, where ASLA is higher (56.3 versus 54.4), and the lowest average number of failures for OCC, DEF and BC; a lower number of failures can be read as a lower error rate.

6.3 Experimental dataset-2 (TLP dataset) Moudgil and Gandhi (2017)

Recently, Moudgil et al. developed a new benchmark dataset containing long-duration video sequences, which they named “Track Long and Prosper” (TLP). This dataset contains 50 real-world videos, approximately 400 minutes and nearly 676K frames in total. The dataset is important because most tracking algorithms work best on short sequences but fail drastically on long, challenging video sequences. We experiment with our proposed dual tracking approach on 6 different sequences from this benchmark suite, which is available at https://amoudgl.github.io/tlp/

From the TLP dataset we pick 6 video streams which have 5 attributes: Scale Variation (SV), Motion Blur (MB), Occlusion (OCC), Multiple Instances (MI) and Out of View (OV). These 5 attributes are very challenging: OV indicates a situation where the target is momentarily fully outside the viewing window; similarly, MI indicates that more than one object with an appearance similar to the target exists in the sequence and interacts with it. The six video sequences and their attribute distributions are: Lion(SV,MB,OCC,MI), Badminton1(MB,OV,OCC,MI), Boat(SV,OV,MB), Carchase1(SV,OCC,MI,OV), Helicopter(SV) and Jet5(SV,MB,OCC,OV). These 6 datasets contain other attributes as well; we mention only those attributes which we consider for comparison with the proposed dual tracking algorithm.

Figure-(39) shows tracking results on 6 challenging video streams consecutively.

Figure 39: The tracking results obtained by the proposed Dual Tracking Algorithm (DTA) for the above-mentioned datasets. Green dots show the dominant points and RED dots show the swarm particles; the bounding box is also present in every dataset.

6.3.1 Analysis and Evaluation Methodology

We further compare the proposed Dual Tracking Algorithm (DTA) in terms of three evaluation methods: Precision plot, Success plot and Longest Subsequence Measure (LSM) Moudgil and Gandhi (2017).

Precision plot - It is the most common and widely used method in object tracking Wu et al. (2013) Grabner et al. (2008). It shows the percentage of frames whose calculated pixel position (location) in the image is within a given threshold distance of the ground truth value. We use a threshold distance of 20 Babenko et al. (2011).
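
A small sketch of the precision measure at a single threshold, assuming that the location of each frame is represented by the centre of its bounding box and that distances are Euclidean pixel distances. The function name and the sample coordinates are illustrative only.

```python
import math


def precision_at(pred_centers, gt_centers, threshold=20.0):
    """Fraction of frames whose centre location error is within `threshold` pixels."""
    hits = sum(
        1
        for p, g in zip(pred_centers, gt_centers)
        if math.dist(p, g) <= threshold
    )
    return hits / len(gt_centers)


# Made-up centres for four frames: errors of 10, 30, 19 and about 19.2 pixels.
predicted = [(10, 10), (50, 50), (100, 100), (0, 0)]
ground_truth = [(10, 20), (80, 50), (100, 119), (12, 15)]
precision_20 = precision_at(predicted, ground_truth, threshold=20.0)
```

Sweeping `threshold` over a range of values and plotting the result would give the full precision plot.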

Success plot - Another evaluation metric is the success plot Wu et al. (2013). It is based on the Intersection over Union (IoU) between the computed and ground-truth bounding boxes, and it counts the successful frames, i.e. those whose IoU is larger than a given threshold. If the computed bounding box of the target object is $r_t$ and the given ground-truth bounding box is $r_g$, then the overlap score Everingham et al. (2010) is

\[ S = \frac{|r_t \cap r_g|}{|r_t \cup r_g|} \tag{32} \]

where $\cap$ and $\cup$ represent the intersection and union operators respectively and $|\cdot|$ represents the number of pixels in that region. We take the Average Overlap Score (AOS) as the performance metric. The AOS value decides whether a frame is successfully tracked or not.
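
The overlap score described above can be sketched for axis-aligned boxes as follows. The `(x, y, width, height)` box layout, the function names and the sample boxes are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_rate(pred_boxes, gt_boxes, threshold=0.5):
    """Fraction of frames whose overlap score exceeds the threshold."""
    scores = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    return sum(s > threshold for s in scores) / len(scores)


def average_overlap(pred_boxes, gt_boxes):
    """Mean overlap score over all frames."""
    scores = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    return sum(scores) / len(scores)


pred = [(0, 0, 10, 10), (0, 0, 10, 10)]
gt = [(0, 0, 10, 10), (5, 0, 10, 10)]
```

In the second frame the boxes share 50 of 150 pixels, so the overlap score is 1/3 and the frame fails a 0.5 threshold.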

LSM plot - It reports Moudgil and Gandhi (2017), for each tracking algorithm, the length of the longest successfully tracked subsequence per sequence. If F percent of the frames of a subsequence of a long video sequence are successfully tracked, we call it a Longest Subsequence (LS), where F is an appropriately large value.
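
One way to read the LSM definition is sketched below: given a per-frame success flag, find the longest contiguous window in which at least a fraction F of the frames are successes. This is an interpretation under stated assumptions, using a simple quadratic search; the function name and the sample flags are hypothetical.

```python
def longest_tracked_subsequence(success_flags, f=0.95):
    """Length of the longest contiguous window with a success fraction >= f."""
    n = len(success_flags)
    # prefix[i] holds the number of successful frames among the first i frames.
    prefix = [0]
    for flag in success_flags:
        prefix.append(prefix[-1] + (1 if flag else 0))
    best = 0
    for start in range(n):
        # Only windows longer than the current best can improve the result.
        for end in range(start + best + 1, n + 1):
            length = end - start
            if prefix[end] - prefix[start] >= f * length:
                best = length
    return best


flags = [1, 1, 0, 1, 1, 1, 1, 0, 0, 1]
lsm_length = longest_tracked_subsequence(flags, f=0.8)
lsm_ratio = lsm_length / len(flags)
```

With F = 0.8 the first seven frames qualify (6 of 7 tracked), while no longer window does; with F = 1.0 only the unbroken run of four frames counts.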

We pick 5 state-of-the-art algorithms from the TLP dataset Moudgil and Gandhi (2017): Learning Multi-Domain Convolutional Neural Networks for Visual Tracking (MDNet) Nam and Han (2016), Fully-convolutional siamese networks for object tracking (SiamFC) Bertinetto et al. (2016), CREST: Convolutional Residual Learning for Visual Tracking (CREST) Song et al. (2017), Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning (ADNet) Yoo et al. (2017) and MEEM: Robust Tracking via Multiple Experts using Entropy Minimization (MEEM) Zhang et al. (2014). We compare them with the proposed Dual Tracking Algorithm (DTA) in terms of the Success plot, Precision plot and LSM plot. The proposed Dual Tracking Algorithm (DTA) achieves superior results in all three categories. In figure-(40) we show the success plot and the precision plot, and in figure-(41) we show the LSM plot.

Figure 40: Precision and Success plots evaluated on the TLP dataset with five other state-of-the-art algorithms.
Figure 41: Longest Subsequence Measure (LSM) plots evaluated on the TLP dataset with five other state-of-the-art algorithms.

6.4 Experimental dataset-3(Performance analysis with other PSO algorithms)

We extend our experiment to other state-of-the-art Particle Swarm Optimization algorithms. We choose several published PSO algorithms and compare their tracking performance with the proposed dual tracking algorithm (DTA). We consider the Object Tracking Evaluation 2012 from The KITTI Vision Benchmark Suite Geiger et al. (2012). The web link is http://www.cvlibs.net/datasets/kitti/eval_tracking.php.

The DTA algorithm is evaluated with the CLEAR metrics Stiefelhagen et al. (2006). We consider the following parameters: Multiple Object Tracking Accuracy [MOTA], which counts all missed targets, false positives and identity mismatches; Multiple Object Tracking Precision [MOTP], which considers the normalized distance between the ground truth location and the actual location; Mostly Tracked [MT]; and Mostly Lost [ML]. Table-4 compares the performance on all these parameters with 5 state-of-the-art PSO algorithms.
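
For orientation, the two CLEAR scores can be sketched in their commonly used form: MOTA subtracts the ratio of all errors to the number of ground-truth objects from 1, and MOTP averages the matching error over all matched pairs. The argument names and the sample counts are assumptions for this example, and the paper's exact normalisation of the MOTP distance is not specified here.

```python
def mota(misses, false_positives, id_switches, num_gt_objects):
    """Multiple Object Tracking Accuracy: 1 - (all errors) / (ground-truth objects).

    All counts are totals summed over every frame of the sequence.
    """
    return 1.0 - (misses + false_positives + id_switches) / num_gt_objects


def motp(match_errors):
    """Multiple Object Tracking Precision as the mean error over all matched pairs."""
    return sum(match_errors) / len(match_errors)


# Hypothetical counts: 5 misses, 3 false positives, 2 identity switches, 100 objects.
example_mota = mota(misses=5, false_positives=3, id_switches=2, num_gt_objects=100)
example_motp = motp([0.2, 0.4])
```

MOTA can become negative when the tracker makes more errors than there are ground-truth objects.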

Parameters  Hsu and Dai (????)  Rymut and Kwolek (2014)  Nguyen et al. (2013)  Zhang (2014)  Xia and Ludwig (2017)  DTA
MOTA        81%                 87.6%                    89.5%                 93.6%         77.5%                  98.2%
MOTP        79%                 74.1%                    81.1%                 88.6%         92.8%                  90.3%
MT          -                   83.4%                    78.2%                 72.8%         79.3%                  84.1%
ML          -                   2.3%                     3.5%                  2.9%          3.9%                   2.6%
Table 4: Quantitative comparison of our proposed Dual Tracking Algorithm (DTA) with other state-of-the-art algorithms. Red, green and blue represent the first, second and third top performance values respectively.
Figure 42: Performance graph of our proposed approach with other state-of-the-art PSO algorithms.

We also perform an experiment on the video stream Crowd_PETS09_S2_L3_Time_14-41_View_01 from the KITTI Vision Benchmark Suite. In figure-(42) we show how successfully the proposed dual tracking algorithm (DTA) performs the tracking in comparison with other PSO algorithms. In figure-(43) we show the tracking results with other competitive PSO algorithms.

Figure 43: Successful tracking images on Crowd_PETS09_S2_L3_Time_14-41_View_01. From top to bottom and left to right, frames number 101, 134, 160, 170, 184, 197, 209, 216, 228, 232, 237 and 239 are tracked successfully.

7 Conclusion and Future Work

In this paper we propose a dual tracking algorithm based on optical flow and swarm intelligence. The KLT tracker, which tracks the dominant points of the target object, is based on the optical flow method, whereas the PSO tracker tracks the boundary information of the target object, which is approximated by a polygon. The proposed dual tracking algorithm (DTA) is inherently robust for two main reasons: i) each tracker continuously supplements the performance of the other and thus acts as a corrective measure under the several disturbances during tracking stated earlier in this paper; ii) the annular ring (strip) of multi-swarms, in which the approximated polygon of the target object is embedded, captures the target object very tightly, so that the target object is not lost during tracking under the undesirable disturbances stated earlier.
Hence the proposed dual tracking algorithm is robust for short video sequences and for long, challenging video sequences. In both cases DTA is equally effective under a static background and under a variable background.
We consider the dominant point as a primary feature of the target object. It is considered a good feature to track Shi et al. (1994). Another major advantage of choosing dominant points as good features to track is that they help construct the approximated polygon of the target object simply by joining consecutive dominant points. Thus from frame-2 the PSO tracker, which is an important part of the dual tracking algorithm, is readily supplied with the approximated polygon of the target object, and a multi-swarm environment is generated which provides the automatic mechanism for the robustness of the dual tracking algorithm. The fitness function of the PSO algorithm is also based on the coordinates of the dominant points. Thus the dominant points of the target object have many important roles to play in the dual tracking algorithm (DTA). It is also very easy to calculate the dominant points of a new object which may arrive at any instant during tracking. If a new object appears, there is no need to restart the tracking from the very beginning: the dual tracking algorithm (DTA) will calculate the dominant points and automatically approximate the contour of the new object in the form of a polygon, which will then be embedded in the multi-swarm annular ring (strip). The construction of the bounding box around the target object is unique and is based on the concept of the PSO algorithm. We test the performance of the dual tracking algorithm on several benchmark datasets and show that the performance of the proposed dual tracking algorithm (DTA) is superior to that of the existing algorithms, as shown in section-6. The proposed dual tracking algorithm can be further improved by finer tuning of the parameters, such as w, of the PSO tracker. The basic concepts of the multi-swarm environment, with certain modifications, can be further extended to object recognition and action recognition problems.

8 Appendix

Algorithm Frame_Extraction(Video_input)
         [Read the video file “Video_input”]
         mov = VideoReader(Video_input)
         [Get the number of frames using the MATLAB property “NumberOfFrames”]
         numFrames = mov.NumberOfFrames
         [Write frames into a separate file]
         For frame ← 1 to numFrames
             Frames ← write frame into a separate file
         End For
         [Return the resultant file]
         Return Frames
End Procedure

Algorithm BrPtCal(Frames)
         [Implementation of the Freeman chain code algorithm]
         [Convert the first image into a binary image]
         bm ← im2bw(image, 0.5)
         Direction ← define 8 directions
         [Define a point which lies exactly on the object boundary]
         Statically define a row and column value pair (r,c) which resides exactly on the object boundary; we did this by a trial and error process.
         [Loop through the Freeman chain code]
         While (ending coordinates ≠ (r,c))
             brpts ← check each direction and mark whether it is on the object boundary or not, according to the Freeman chain code rule.
         End While
         [Return breakpoints]
         Return (brpts)
End Procedure

Algorithm DominantPt(brpts)
         [Define region size: how many breakpoints should be considered for each dominant point]
         region_size ← rand(5,10)
         no_region ← (breakpoints / region_size)
         [Calculate dominant points using Ray and Ray (1992)]
         For region ← 1 to no_region
             For br ← 1 to no_of_brpts
                 Dom_set ← compute k-cosine value using rule – (4)
             End For
             Dompts ← select max k-cosine from Dom_set using rule – (7)
         End For
         [Return set of dominant points]
         Return Dompts
End Procedure

Algorithm klt(dmpt)
         For pt ← 1 to dmpt
             Call MATLAB procedure vision.PointTracker
             New_pos ← vision.PointTracker
         End For
         [Return new position obtained by the KLT tracker]
         Return New_pos
End Procedure

Algorithm BoundingBox(AcceptedParticles)
         [Sort the list of AcceptedParticles in ascending order]
         Sorted_particle ← sort(AcceptedParticles)
         [Determine how many particles should be considered for the starting point of the bounding box design; this is decided by experiment; we choose 10 for our experiment, which gives a satisfactory result]
          p ← 10
         [Choose the first p points and find the average]
          q ← ( )/2
         [Decide height and breadth thresholds]
         height_threshold ← random number between (5-10)
         breadth_threshold ← random number between (5-10)

[Finding height of the object]
         For h ← 2 to y-coordinate(Sorted_particle)
             If y-coordinate(Sorted_particle(h)) - y-coordinate(Sorted_particle(h-1)) ≤ height_threshold
                 Store that particle in another vector named Y_particle
             End If
         End For
         ObjectHeight ← difference between the first and last particle of Y_particle

[Finding breadth of the object]
         For b ← 2 to x-coordinate(Sorted_particle)
             If x-coordinate(Sorted_particle(b)) - x-coordinate(Sorted_particle(b-1)) ≤ breadth_threshold
                 Store that particle in another vector named X_particle
             End If
         End For
         ObjectBreadth ← difference between the first and last particle of X_particle
         [Bounding box design]
         Draw a straight line connecting (q, q + ObjectBreadth)
         Draw a straight line connecting (q, q + ObjectHeight)
         Draw a straight line connecting [(q + ObjectHeight), (q + ObjectHeight + q + ObjectBreadth)]
         Draw a straight line connecting [(q + ObjectBreadth), (q + ObjectHeight + q + ObjectBreadth)]
End Procedure
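
A simplified sketch of the bounding box idea: sort the particle coordinates along each axis and extend the box only while consecutive particles stay within a gap threshold, so that isolated stray particles do not inflate the box. Unlike the pseudocode above, this version stops at the first large gap and starts from the smallest coordinate; the function names and the default threshold are assumptions.

```python
def contiguous_extent(values, gap_threshold):
    """Return (low, high) of the run of sorted values, starting at the smallest,
    whose consecutive gaps stay within gap_threshold."""
    ordered = sorted(values)
    low = high = ordered[0]
    for v in ordered[1:]:
        if v - high > gap_threshold:
            break  # large gap: treat the remaining particles as outliers
        high = v
    return low, high


def bounding_box(particles, gap_threshold=10):
    """Return (x, y, breadth, height) of the box around the accepted particles."""
    x_low, x_high = contiguous_extent([p[0] for p in particles], gap_threshold)
    y_low, y_high = contiguous_extent([p[1] for p in particles], gap_threshold)
    return x_low, y_low, x_high - x_low, y_high - y_low


# Four particles on the object and one stray particle far to the right.
accepted = [(10, 20), (12, 25), (15, 22), (18, 30), (60, 24)]
box = bounding_box(accepted, gap_threshold=10)
```

The stray particle at x = 60 is excluded from the breadth because it lies more than 10 pixels beyond its nearest neighbour.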

Algorithm DominantPointReInitialization(AcceptedParticles, dompts(dmpt))
         For neighborParticle ← 1 to AcceptedParticles
             Find the nearest particle to dompts(dmpt)
             [Is the nearest particle moving or not?]
             If (its last two positions are different)
                 Accept it as the new dominant point
             Else
                 Move to the next swarm particle
             End If
         End For
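
The re-initialization rule can be sketched as follows, assuming each accepted particle is stored with its previous and current position: the particles are sorted by distance to the stalled dominant point and the nearest one that has moved is taken as the replacement. The function name and the data layout are hypothetical.

```python
import math


def reinitialize_dominant_point(stalled_pt, particle_tracks):
    """Pick the nearest moving particle as the new dominant point.

    particle_tracks holds one (previous_position, current_position) pair
    per accepted particle.
    """
    by_distance = sorted(
        particle_tracks,
        key=lambda track: math.dist(track[1], stalled_pt),
    )
    for previous, current in by_distance:
        if previous != current:  # the particle moved between the last two frames
            return current
    return stalled_pt  # no moving particle found: keep the old point


tracks = [((1, 1), (1, 1)), ((3, 0), (2, 0)), ((9, 9), (8, 8))]
new_point = reinitialize_dominant_point((0, 0), tracks)
```

The sort dominates the cost, which is consistent with the O(n log n) estimate given in the complexity analysis.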

Algorithm FitnessComputePSO (Particle set)
         For each particle Pi
             Compute PerDist(Pi)
             If PerDist(Pi) < acceptable range
                 particle accepted
             End If
         End For
End FitnessComputePSO
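
A minimal sketch of the acceptance test, assuming PerDist denotes the perpendicular distance from a particle to the polygon edge between two consecutive dominant points. Clamping the projection to the segment end points is a choice made for this example, and the function names are illustrative.

```python
import math


def per_dist(p, a, b):
    """Distance from point p to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)  # degenerate segment: a == b
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def accepted_particles(particles, a, b, acceptable_range):
    """Keep the particles that lie within acceptable_range of the edge a-b."""
    return [p for p in particles if per_dist(p, a, b) < acceptable_range]


edge_start, edge_end = (0, 0), (4, 0)
kept = accepted_particles([(1, 0.5), (2, 3)], edge_start, edge_end, 1.0)
```

Particles that fail the test would then be moved by the PSO position and velocity updates.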

Procedure 1 PsoAlgorithm(Particle set)
For each particle
       Initialize particle
End
Do
       For each particle
           Calculate fitness value
           If the fitness value is better than the best fitness value (pBest) in history
               set current value as the new pBest
       End
       Choose the particle with the best fitness value of all the particles as the gBest
       For each particle
           Calculate particle velocity according to equation (2)
           Update particle position according to equation (1)
       End
While (maximum iterations or minimum error criteria is not attained)
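
The loop above can be sketched in Python as follows. The sketch assumes that equations (1) and (2) take the common inertia-weight form, x ← x + v and v ← w·v + c1·r1·(pBest − x) + c2·r2·(gBest − x), and that a lower fitness value is better. The parameter values (w, c1, c2, swarm size, search bounds) are illustrative defaults rather than the settings used in the paper, and the function name pso_minimize is hypothetical.

```python
import random


def pso_minimize(fitness, dim, n_particles=30, max_iters=200, min_error=0.0,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimize fitness over a dim-dimensional space; return (gbest, gbest_val)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Initialize each particle with a random position and zero velocity.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [float("inf")] * n_particles
    gbest, gbest_val = pos[0][:], float("inf")

    for _ in range(max_iters):
        # Evaluate fitness and update each particle's personal best.
        for i in range(n_particles):
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
        # Choose the best personal best of the whole swarm as the global best.
        g = min(range(n_particles), key=lambda i: pbest_val[i])
        gbest, gbest_val = pbest[g][:], pbest_val[g]
        if gbest_val <= min_error:
            break
        # Update velocity first, then position (assumed standard form).
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
    return gbest, gbest_val


# Usage: minimize the sphere function, whose minimum lies at the origin.
best, best_val = pso_minimize(lambda p: sum(v * v for v in p), dim=2)
```

Selecting the global best only after every particle has been evaluated follows the order of the pseudocode; many implementations update it inside the evaluation loop instead.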
