Bregman Iteration for Correspondence Problems: A Study of Optical Flow

Bregman iterations are known to yield excellent results for denoising, deblurring and compressed sensing tasks, but so far this technique has rarely been used for other image processing problems. In this paper we give a thorough description of the Bregman iteration, thereby unifying results of different authors within a common framework. Then we show how to adapt the split Bregman iteration, originally developed by Goldstein and Osher for image restoration purposes, to optical flow, which is a fundamental correspondence problem in computer vision. We consider some classic and modern optical flow models and present detailed algorithms that exhibit the benefits of the Bregman iteration. By making use of the results of the Bregman framework, we address the issues of convergence and error estimation for the algorithms. Numerical examples complement the theoretical part.


1 Introduction

In 2005, Osher et al. Osher2005 proposed an algorithm for the iterative regularisation of inverse problems that was based on findings of Bregman Bregman1967 . They used this algorithm, nowadays called Bregman iteration, for image restoration purposes such as denoising and deblurring. Especially in combination with the Rudin-Osher-Fatemi (ROF) model for denoising ROF92 they were able to produce excellent results. Their findings caused a subsequent surge of interest in the Bregman iteration. Among the numerous application fields, it has for example been used to solve the basis pursuit problem Cai2009 ; Osher2008 ; Yin2007 and was later applied to medical imaging problems in Lin2006 . Further applications include deconvolution and sparse reconstructions ZBBO09 , wavelet based denoising Xu2006 , and nonlinear inverse scale space methods Burger2006 ; Burger2005 . Important adaptations of the Bregman iteration are the split Bregman method (SBM) Goldstein2009 and the linearised Bregman approach Cai2009 . The SBM can be used to solve ℓ1-regularised inverse problems in an efficient way. Its benefits stem from the fact that differentiability is not a necessary requirement on the underlying model and that it decomposes the original optimisation task into a series of significantly easier problems that can be solved very efficiently, especially on parallel architectures. The Bregman algorithms belong to the family of splitting schemes as well as to the primal-dual algorithms E2009 ; Setzer2009 which enjoy great popularity in the domain of image processing and which are still a very active field of ongoing research ODBP2015 ; GLY2015 .

The aim of this paper is to contribute to the mathematical foundation of the rapidly evolving area of computer vision. We explore the use of the Bregman framework, especially the application of the split Bregman method, for the problem of optical flow (OF) which is of fundamental importance in that field, cf. Aubert2006 ; KSK98 ; TV98 . We give a thorough discussion of the Bregman framework, thereby unifying results of several recent works. Then we show how to adapt the SBM to several classic and modern OF models. Detailed descriptions of corresponding algorithms are presented. Employing the Bregman framework, we show that convergence for these methods can be established and error estimates can be given.

1.1 The optical flow problem

The OF problem is an ill-posed inverse problem. It consists in determining the displacement field between different frames of a given image sequence by looking for correspondences between pixels. In many cases such correspondences are not unique or simply fail to exist because of various problems such as noise, illumination changes and overlapping objects. Nevertheless, the study of the OF problem is of fundamental importance for dealing with correspondence problems such as stereo vision, where accurate flow fields are necessary BS07 ; MMK98 ; SBW2005 ; MJBB2015 . For solving the OF problem in a robust way, variational formulations and regularisation strategies are among the most successful techniques. Those methods have been studied for almost three decades, starting from the approach of Horn and Schunck HS81 . During this period of time, much effort has been spent on improving the models; see BA96 ; ZBWS09 ; BBPW04 ; MP02 ; NBK08 ; WCPB09 ; WPB10 ; XJM10 ; ZBWVSRS09 ; BZW2011 ; ZBW2011 for an account of that field.

While many developments have been made on the modelling side, there are just a few works concerned with the mathematical validation of algorithms. In ki08 ; mm04 it has been shown that the classic numerical approach of Horn and Schunck converges. Furthermore, the authors of mm04 showed that the linear system obtained through the Euler-Lagrange equations has a symmetric and positive definite matrix and thus allows the usage of many efficient solvers. The authors of Wedel2008 ; Pock2007 developed an algorithm that solves the so-called TV-L1 model through an alternating minimisation scheme. This is applied to a variational formulation that augments the original energy functional with an additional quadratic term. This quadratic term allows the authors to divide their objective into simpler subproblems for which efficient solvers exist. In practice their approach yields excellent results. However, in general it does not converge towards the solution of the original energy functional but to a solution of the augmented variational formulation. Alternative approaches to minimise the occurring variational models include CP2011 ; OCBP2014 ; OBP2015 . These well performing algorithms possess good convergence properties, but may require additional regularity conditions, such as strong convexity of the considered energy, see OBP2015 . The author of br06 discusses the usage of efficient algorithms such as the multigrid approach B1973 ; BL2011 ; BHM00 ; Hac85 ; Wes92 and the so-called Lagged-Diffusivity or Kačanov method CM99 ; FKN73 ; KNPS68 . Finally, it is also possible to consider the solutions of the Euler-Lagrange equations as a steady state of a corresponding diffusion-reaction system that one may solve by means of a steepest descent approach wb05 ; WBPB04 . Recent developments have extended the study of the OF problem to dynamic non-Euclidean settings where the motion is estimated on an evolving surface KLS2015 . Other trends include the use of powerful (deep) learning strategies SRLB2008 ; DFIH2015 or even combinations of variational and machine learning approaches WRHS2013 .

1.2 Our contribution

We present an approach to the OF problem by exploring the Bregman framework. Despite their usefulness, Bregman iterations have received little attention in the context of OF up to now. Early attempts include LBS2010 ; H2010 . Here, we propose mathematically validated methods for OF models, among them the prominent model of Brox et al. BBPW04 .

The main contribution of this work lies in the thorough presentation of the Bregman framework and the proof of convergence of the algorithms in the context of OF, thus giving the numerical solution of the OF problem a solid mathematical basis. To this end, we adapt the general convergence theory of the Bregman framework to the OF algorithms and show that the SBM iterates converge towards a minimiser of the considered energy functionals. Related questions that are important in the context of the numerical processing will also be discussed here. For instance, we will show that the arising linear systems have a symmetric and positive definite matrix. The assumptions for this are quite weak and naturally met in almost all setups.

1.3 Paper Organisation

In Section 2, we give a brief account of mathematical prerequisites, whereas Section 3 elaborates on the Bregman framework. Next, in Section 4 we give an account of the OF models we consider, and how to formulate the corresponding algorithms in terms of the SBM. Finally, we complement the theoretical developments by some numerical experiments given in Section 5 and finish the paper by some conclusions.

2 Mathematical prerequisites

In this work we strongly rely on the notion of subdifferentiability, as it grants us the ability to handle non-differentiable robust regularisers, such as the ℓ1 norm, in a similar style as smooth functions. For a thorough analysis of this important concept in convex optimisation we refer to the excellent presentations in Ekeland1999 ; Rockafellar1997 ; rj98 . Here, we merely recall the definition of a subdifferential. The subdifferential of a convex function φ:Rn→R at position x is the set-valued mapping given by

 ∂φ(x)\coloneqq{p∈Rn:φ(y)⩾φ(x)+⟨p,y−x⟩ for all y∈Rn}. (1)

Its elements are called subgradients. Without further requirements on φ this set may contain infinitely many elements or be empty. A common example is the subdifferential of the absolute value function |⋅|, where a simple computation shows that

 ∂(|⋅|)(x)=⎧⎨⎩{−1},x<0,[−1,1],x=0,{1},x>0. (2)

For strictly concave functions the subdifferential is always empty. On the other hand, convex functions always have at least one subgradient in the interior of their domain. Subdifferentials exhibit many properties of usual derivatives. One of the most important properties is for example that 0∈∂φ(x) is a necessary condition for x being a minimiser of φ.

Robust regularisers involving the ℓ1 norm are quite common in variational image analysis models. Their optimisation often leads to subproblems of the kind

 argminx∈Rn{∥x∥1+λ2∥x−b∥22} (3)

with a positive parameter λ and an arbitrary vector b∈Rn. A closed form solution can be derived in terms of the well known soft shrinkage operator: x=shrink(b,1λ), where

 shrink(y,α)\coloneqq⎧⎨⎩y−α,y>α,0,y∈[−α,α],y+α,y<−α. (4)

For vector valued arguments, the shrinkage operator is applied componentwise. Unfortunately, the ℓ1 norm is not rotationally invariant and in many applications promotes undesired structures parallel to the coordinate axes. A possible workaround consists in adapting the considered models such that we are led to tasks of the form

 argminx∈Rn{∥x∥2+λ2∥x−b∥22} (5)

with λ>0 and b∈Rn as before. Here, the closed form solution can be expressed in terms of the generalised shrinkage operator.

Definition 1 (Generalised Shrinkage)

Let y be a vector in Rn and α>0; then we define the generalised shrinkage operator as

 gshrink(y,α)\coloneqqmax(∥y∥2−α,0)y∥y∥2 (6)

where we adopt the convention 0∥0∥2\coloneqq0.

The solutions of (5) can then be expressed in terms of this generalised shrinkage. It holds x=gshrink(b,1λ). The proof is lengthy but not difficult. One has to find those x for which 0 is a subgradient of the cost function. This can be done by discerning the cases x=0 and x≠0.
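Both shrinkage operators are straightforward to implement. The following NumPy sketch (the function names `shrink` and `gshrink` are chosen to match the notation above) illustrates them:

```python
import numpy as np

def shrink(y, alpha):
    """Soft shrinkage (4), applied componentwise for vector arguments."""
    return np.sign(y) * np.maximum(np.abs(y) - alpha, 0.0)

def gshrink(y, alpha):
    """Generalised shrinkage (6), with the convention 0/0 := 0."""
    norm = np.linalg.norm(y)
    if norm == 0.0:
        return np.zeros_like(y)
    return max(norm - alpha, 0.0) * (y / norm)

# shrink(b, 1/lam) solves (3), gshrink(b, 1/lam) solves (5)
b, lam = np.array([1.5, -0.2, 0.7]), 2.0
x1 = shrink(b, 1.0 / lam)    # componentwise soft thresholding
x2 = gshrink(b, 1.0 / lam)   # shrinks the whole vector towards 0
```

Note that `shrink` acts on each component separately, whereas `gshrink` shrinks the Euclidean length of the whole vector, which is exactly the rotational invariance discussed above.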

3 The Bregman Framework

We begin by recalling the standard Bregman iteration as developed by Osher et al. Osher2005 . Furthermore, we present an alternative but equivalent formulation of the Bregman algorithm which has been discussed in Goldstein2009 . We make use of this formulation as it simplifies the proof of a convergence assertion and the description of the SBM introduced by Goldstein and Osher Goldstein2009 . As indicated, the SBM will be the basis for our OF algorithms. Let us emphasise that many approaches to the Bregman framework exist in the literature. Bregman himself Bregman1967 wanted to describe non-orthogonal projections onto convex sets. Further research in that direction can for example be found in BB1997 . In E2009 ; Setzer2009 a certain number of equivalences between different optimisation techniques are discussed. They allow us to interpret the Bregman algorithms by means of conjugate duality for convex optimisation. Thus, the Bregman framework may also be interpreted as a splitting scheme or a primal-dual algorithm. The presentation in this work relies more on similarities between constrained and unconstrained convex optimisation problems. The convergence theory that we recall and present here is based on results in Brune2009 ; Burger2006 ; BRH07 ; Cai2009a ; Goldstein2009 ; Osher2005 . The authors of these works employed different mathematical settings. Some of the results require Hilbert spaces, others are stated in rather general vector spaces only equipped with semi-norms. We unify the results here within one framework using finite dimensional, normed vector spaces. This set-up allows us to use a common set of requirements and to clarify the relations between the different works. While doing this, we also add some new results on the SBM.

We note that this mathematical setting suffices for the typical application in computer vision where one ultimately needs to resort to a discretised problem.

Let us now introduce the mathematical formalism that we will require in the forthcoming sections. One of the central concepts behind the Bregman iteration is the Bregman divergence. It has been presented by Bregman in 1967 Bregman1967 , where it has been used to solve convex optimisation problems through non-orthogonal projections onto convex sets.

Definition 2 (Bregman Divergence)

The Bregman divergence of a proper convex function φ is defined as Dpφ(x,y)\coloneqqφ(x)−φ(y)−⟨p,x−y⟩. Thereby p is a subgradient of φ at y.
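For intuition, the divergence is easy to check numerically. The following sketch assumes a differentiable φ, so that the subgradient is simply the gradient; for φ(x) = ∥x∥2² the Bregman divergence reduces to the squared Euclidean distance, while for a general convex φ it is non-negative but typically asymmetric:

```python
import numpy as np

def bregman_div(phi, grad_phi, x, y):
    # D_phi^p(x, y) = phi(x) - phi(y) - <p, x - y>  with  p = grad_phi(y)
    return phi(x) - phi(y) - grad_phi(y) @ (x - y)

phi = lambda x: x @ x            # phi(x) = ||x||_2^2
grad_phi = lambda x: 2.0 * x

x = np.array([1.0, 2.0])
y = np.array([0.0, -1.0])
d = bregman_div(phi, grad_phi, x, y)   # equals ||x - y||_2^2 for this phi
```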

The aim of the Bregman iteration is to have a formulation that can handle convex non-differentiable cost functions and that avoids ill-conditioned formulations. To illustrate the main idea, one may consider, for example, the following optimisation problem:

 x(k+1)=argminx∈Rn {Dpφ(x,x(k))+ι{0}(Ax−b)} (7)

where ι{0} is the indicator function of the set {0}, i.e. ι{0}(x) is 0 if x=0 and +∞ else. In case the linear system Ax=b has multiple solutions or if it has a very large system matrix, then it might be difficult to determine the iterates x(k+1). Therefore, one may reformulate (7) in terms of a regularised and unconstrained problem

 x(k+1)=argminx∈Rn{Dpφ(x,x(k))+λ∥Ax−b∥22} (8)

with some fixed λ>0 to approximate (7). This iterative strategy motivates Definition 3, which coincides with the formulation found in Goldstein2009 ; Osher2005 ; Yin2007 .

Let us note that in Osher2005 ; Yin2007 the Bregman iteration has been formulated as a method for minimising convex functionals of the form J(u)+λH(u). However, the convergence theory presented below that is derived from Brune2009 ; Burger2006 ; BRH07 ; Cai2009a ; Goldstein2009 ; Osher2005 states that the iterates converge towards the solution of a constrained formulation. Therefore, we define the algorithm from the beginning on as a method for solving constrained optimisation problems. In the following we silently assume that J and H are always two proper convex functions defined on the whole Rn. Further, H will be a non-negative differentiable function with minH=0, and this minimum is reached at some point in Rn.

Definition 3 (Bregman iteration)

The Bregman iteration of the constrained optimisation problem

 argminu∈Rn{J(u)+ι{0}(H(u))} (9)

is given by:

1. Choose u(0) arbitrarily, p(0)∈∂J(u(0)), and a fixed λ>0.

2. Compute iteratively

 u(k+1)=argminu∈Rn{Dp(k)J(u,u(k))+λH(u)} (10)

where p(k)∈∂J(u(k)), and we iterate until a fixed point is reached.

From our assumptions on J it follows that it has at least one subgradient at every point. Thus p(k) always exists, but it is not necessarily unique. In general settings the computation of a subgradient may not always be simple. The following result from Yin2007 provides a comfortable strategy to obtain a single specific subgradient.

Proposition 1

If H is differentiable, the second step of the Bregman iteration from Definition 3 becomes

 u(k+1)=argminu∈Rn{Dp(k)J(u,u(k))+λH(u)},p(k+1)=p(k)−λ∇H(u(k+1)). (11)
Proof

It suffices to show that p(k+1) is a subgradient of J at position u(k+1). The definition of the iterates implies that u(k+1) is a minimiser of Dp(k)J(u,u(k))+λH(u). Expanding the definition of the Bregman divergence and removing the constant terms, we see that 0 is a subgradient of J(u)−⟨p(k),u⟩+λH(u) at position u(k+1). Since the subdifferential of a sum coincides with the sum of the subdifferentials, it follows that there must exist p(k+1)∈∂J(u(k+1)) that fulfils the equation

 0=p(k+1)−p(k)+λ∇H(u(k+1)). (12)

Although one could basically use any subgradient of J at u(k+1), the previous proposition gives us a convenient way of finding a specific one that is easy to obtain. This makes the computation of the iterates much simpler and improves the overall speed of the algorithm.

Our next goal is to analyse the convergence behaviour of the Bregman iteration given in Definition 3, and to show that its iterates obtained from

 argminu∈Rn {Dp(k)J(u,u(k))+λH(u)} (13)

converge towards a solution of

 argminu∈Rn{J(u)+ι{0}(H(u))}. (14)

For well-posedness reasons we will assume that (14) as well as the iterative formulation in (13) are always solvable. If these requirements are not met, then our iterative strategy cannot be carried out or fails to converge. We emphasise that the existence of a solution of either (13) or (14) cannot always be deduced from the existence of a solution of the other formulation. Even if H(u)=0 cannot be fulfilled, it might still be possible that all iterates in (13) exist.

The results compiled by the Propositions 2, 3, and 4, as well as Corollary 1 were already discussed in Osher2005 . There, the authors discussed iterative regularisation strategies in the space of functions with bounded variation. Furthermore, they established the link between their algorithm and the Bregman divergence. The proofs from Osher2005 for the variational setting carry over verbatim to the finite dimensional set-up that we use within this work. Thus, we just recall the statements without proofs.

Proposition 2

The sequence (H(u(k)))k∈N is monotonically decreasing. We have for all k∈N:

 H(u(k+1))⩽H(u(k)) (15)

and strict inequality when Dp(k)J(u(k+1),u(k)) is positive.

In this context we remark that the Bregman divergence is always non-negative for convex J, and if J is even strictly convex, then DpJ(x,y)=0 can hold if and only if x=y.

Proposition 3

We have for all u∈Rn and all k⩾1:

 Dp(k)J(u,u(k))+Dp(k−1)J(u(k),u(k−1))−Dp(k−1)J(u,u(k−1))⩽λ(H(u)−H(u(k))). (16)
Corollary 1

For the particular choice u=~u, where ~u is a solution of H(u)=0, we immediately get:

 0⩽Dp(k)J(~u,u(k))⩽Dp(k−1)J(~u,u(k−1)). (17)

One can easily infer from the above assertions that, for strictly convex J, the iterates converge towards a solution of H(u)=0. The next proposition gives an estimate of how fast this convergence is, and it shows that the strict convexity is in fact not necessary. Let us note that its proof relies on Propositions 2 and 3.

Proposition 4

If ~u is a solution of H(u)=0 and if p(0)∈∂J(u(0)) for some starting value u(0), then one obtains for all k∈N∗

 0=H(~u)⩽H(u(k))⩽Dp(0)J(~u,u(0))λk. (18)

Therefore, the iterates always converge towards a solution of H(u)=0.

So far we have seen that the iterates converge towards a solution of H(u)=0. But at this point we do not know whether this solution also minimises our cost function J. If H(u)=0 has a unique solution, then the above theory is already sufficient.

The following proposition states that, under certain assumptions, the iterates that solve H(u)=0 also minimise our cost function, even if H(u)=0 has multiple solutions. This highly important result was first pointed out in Yin2007 , where the authors analysed the convergence behaviour of the Bregman iteration within the context of the basis pursuit problem. There, the authors analysed the Bregman framework in a finite dimensional setting and further showed an interesting relationship to augmented Lagrangian methods.

Proposition 5

Assume there exists u(0) such that it is possible to choose p(0)=0 in (11). Furthermore, assume that H(u)=h(Au−b), where A is some matrix, b an arbitrary vector, and h a differentiable non-negative convex function that only vanishes at 0. If an iterate u(k) fulfils H(u(k))=0, i.e. it solves Au=b, then that iterate is also a solution of the constrained optimisation problem of (9).

Concerning the proof of Proposition 5 from Yin2007 , let us note that this proposition requires that h vanishes only at 0. However, the linear system Au=b can have multiple solutions. Thus H(u)=0 can have multiple solutions, too. The requirement that h only vanishes at 0 is essential in the proof, as it enforces that every zero of H solves the linear system.

We conclude that convergence is guaranteed if H has the form described in Proposition 5 and if we can choose u(0) such that p(0)=0 is a valid subgradient. The latter requirement is in fact rather weak, so that only the former is of importance. For the formulation of the OF problems, these conditions will fit naturally into the modelling.

We will now focus on the special case of interest for us, namely H(u)=12∥Au−b∥22. In that case it is possible to derive an estimate for the error at each iteration step. Such a result was presented in BRH07 , where the authors discussed the convergence behaviour of the Bregman iteration in the context of inverse scale space methods for image restoration purposes. Their setting included a variational formulation and used function spaces such as the space of functions of bounded variation. Furthermore, they had to formulate certain convergence results in terms of the weak-* topology. The usage of finite dimensional settings allows a more consistent formulation. In our mathematical setting, the proof can be done analogously to the one in BRH07 .

In order to prepare the presentation of the SBM formulation, we consider in the following a more concrete optimisation task. Therefore, assume now that A∈Rm×n is a given matrix and b a known vector in Rm. The problem that we consider is

 argminu∈Rn{J(u)+ι{0}(12∥Au−b∥22)}. (19)

Proposition 1 implies that we have the following algorithm:

 u(k+1)=argminu∈Rn{Dp(k)J(u,u(k))+λ2∥Au−b∥22},p(k+1)=p(k)+λAT(b−Au(k+1)). (20)
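As a toy illustration of (20) (not one of the OF models), take J(u) = ½∥u∥2², so that ∂J(u) = {u} and the u-update amounts to solving a linear system; the iterates then approach the minimum-norm solution of the underdetermined system Au = b. The concrete data below are purely illustrative:

```python
import numpy as np

A = np.array([[1.0, 1.0]])     # underdetermined system Au = b
b = np.array([1.0])
lam = 1.0

n = A.shape[1]
u = np.zeros(n)
p = u.copy()                   # p^(0) = u^(0) is a subgradient of J at u^(0)
M = np.eye(n) + lam * A.T @ A  # system matrix of the inner problem
for _ in range(50):
    # u-update: argmin_u 1/2||u||^2 - <p, u> + lam/2 ||Au - b||^2
    u = np.linalg.solve(M, p + lam * A.T @ b)
    # subgradient update as in (20)
    p = p + lam * A.T @ (b - A @ u)
```

For this example the minimum-norm solution is (0.5, 0.5), and the iterates converge to it linearly.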

For technical reasons we will continue to assume that it is possible to choose u(0) such that p(0)=0 can be used. This can always be done as long as J attains its minimum at some finite point, which is the case in our framework. It is useful to consider the following two definitions, which stem from BRH07 .

Definition 4 (Minimising Solution)

A vector ~u is called a minimising solution of (19) if A~u=b and J(~u)⩽J(u) for all other u that fulfil Au=b.

Definition 5 (Source Condition)

Let ~u be a minimising solution of (19). We say ~u satisfies the source condition if there exists a vector q such that ATq∈∂J(~u).

The source condition can, in a certain sense, be interpreted as an additional regularity condition that we impose on the solution. Not only do we require that the minimising solution has a subgradient, we even want that there exists a subgradient that lies in the range of AT. Requirements like this are a frequent tool in the analysis of inverse problems. The next theorem, adopted from BRH07 , shows that it is possible to give an estimate for the error if this source condition holds.

Theorem 3.1

Let ~u be a minimising solution of (19), and assume that the source condition holds, i.e. there exists a vector q such that ATq=p for some subgradient p∈∂J(~u). Furthermore, assume that it is possible to choose u(0) such that p(0)=0 is a subgradient of J at u(0). Then we have the following estimate for the iterates of (20):

 Dp(k)J(~u,u(k))⩽∥q∥222λk∀k∈N∗. (21)

The result of the following proposition can be found in Yin2007 .

Proposition 6 (Alternative formulation)

The normal Bregman iteration for solving the constrained optimisation problem

 argminu∈Rn{J(u)+ι{0}(12∥Au−b∥22)} (22)

can also be expressed in the following iterative form:

 u(k+1)\coloneqqargminu∈Rn{J(u)+λ2∥∥Au−b(k)∥∥22},b(k+1)\coloneqqb(k)+b−Au(k+1). (23)

if we set b(0)\coloneqqb and choose u(0) such that p(0)=0∈∂J(u(0)).

Because of the equivalence of the two formulations the iterates given by this alternative Bregman algorithm have the same properties as the ones of the standard Bregman iteration. Thus, all the convergence results for the standard set-up also apply in this case.
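A minimal sketch of the alternative formulation (23), with J(u) = ∥u∥1 and A the identity, so that the inner problem is solved exactly by soft shrinkage; after a few iterations the added-back residual b − Au^(k+1) pushes the iterates onto the constraint u = b. The data are illustrative:

```python
import numpy as np

def shrink(y, alpha):
    # soft shrinkage (4)
    return np.sign(y) * np.maximum(np.abs(y) - alpha, 0.0)

b = np.array([1.0, -3.0, 0.2])   # constraint u = b (A is the identity)
lam = 1.0

bk = b.copy()                    # b^(0) = b
u = np.zeros_like(b)
for _ in range(10):
    # inner problem: argmin_u ||u||_1 + lam/2 ||u - b^(k)||^2
    u = shrink(bk, 1.0 / lam)
    # add the residual back: b^(k+1) = b^(k) + b - u^(k+1)
    bk = bk + b - u
```

For this tiny example the iteration reaches u = b exactly after finitely many steps, illustrating the "adding back the residual" mechanism.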

3.1 The Split Bregman Method

The split Bregman method (SBM) proposed in Goldstein2009 extends the Bregman iteration presented so far. It aims at minimising unconstrained convex energy functionals.

While we mainly follow Goldstein2009 for the description of the algorithm, we will also give some new results. We will for example discuss how the convergence estimate of Brune et al. Brune2009 can be applied to the SBM.

The split Bregman formulation is especially useful for solving the following two problems:

 argminu∈Rn{∥Φ(u)∥1+G(u)}andargminu∈Rn{∥Φ(u)∥2+G(u)}. (24)

The function Φ is an affine mapping, i.e. Φ(u)=Λu+b for some matrix Λ and some vector b. G should be a convex function from Rn to R. The difficulty in minimising these cost functions stems from the fact that neither ∥⋅∥1 nor ∥⋅∥2 is differentiable in 0.

The basic idea behind the SBM is to introduce an additional variable that enables us to separate the non-differentiable terms from the differentiable ones. This is done by rewriting (24) as a constrained optimisation:

 argmind,u∈Rn{∥d∥k+G(u)+ι{0}(d−Φ(u))}. (25)

The previous section has shown us how to handle constrained optimisation tasks of this kind. The main idea of SBM is to apply standard Bregman to (25). In order to simplify the presentation, we employ the following aliases:

 η\coloneqq(u,d)T, (26)

 J(η)\coloneqq∥d∥k+G(u), (27)

 A(η)\coloneqqd−Λu. (28)

Obviously J is again a convex function and A is a linear mapping. Using the new notations, (25) can be rewritten as

 argminη∈Rn{J(η)+ι{0}(12∥A(η)−b∥22)}. (29)

We assume at this point that it is possible to choose η(0) such that p(0)=0 is a subgradient of J at η(0). This is always possible if J attains its minimum.

By applying the Bregman algorithm from Proposition 6 one obtains the following iterative procedure:

 η(k+1)\coloneqqargminη{J(η)+μ2∥∥A(η)−b(k)∥∥22},b(k+1)\coloneqqb(k)+b−A(η(k+1)) (30)

with μ being a constant positive parameter. Reintroducing the definitions of J and A leads to a simultaneous minimisation in u and d. Since such an optimisation is difficult to perform, we opt for an iterative alternating optimisation with respect to u and d. Goldstein et al. Goldstein2009 suggest to do a single sweep. In this paper we allow a more flexible handling with up to M⩾1 alternating optimisations. All in all, we have to solve for j=0,…,M−1:

 u(k,j+1)=argminu∈Rn{G(u)+μ2∥∥d(k,j)−Λu−b(k)∥∥22}, (31)

 d(k,j+1)=argmind∈Rn{∥d∥1+μ2∥∥d−Λu(k,j+1)−b(k)∥∥22}. (32)

The first optimisation step depends largely on the exact nature of G. As a consequence one cannot make any general claims about it. We just note that for the case where G(u)=λ2∥Bu−c∥22 for some matrix B and a vector c, the cost function becomes differentiable and the minimiser can be obtained by solving a linear system of equations with a positive semi-definite matrix. If either B or Λ has full rank, then the system matrix will even be positive definite. This will especially be true for the upcoming applications to optic flow. The second optimisation has a closed form solution in terms of shrinkage operations. The solution is given by

 d(k,j+1)=shrink(Λu(k,j+1)+b(k),1μ) (33)

where the computation is done componentwise. If we replace the ℓ1 norm by the Euclidean norm, then we have to resort to the generalised shrinkage operator and the solution is given by

 d(k,j+1)=gshrink(Λu(k,j+1)+b(k),1μ). (34)

The detailed formulation of the SBM with its Bregman iterations and alternating minimisation steps for solving (24) is depicted in Algorithm 1.
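To make the procedure concrete, here is a compact NumPy sketch of the SBM for the model argmin_u { ∥u∥1 + λ/2 ∥Bu − c∥2² }, i.e. Λ = I and G(u) = λ/2 ∥Bu − c∥2² in (24). The data B, c and all parameter values are purely illustrative:

```python
import numpy as np

def shrink(y, alpha):
    # soft shrinkage (4)
    return np.sign(y) * np.maximum(np.abs(y) - alpha, 0.0)

rng = np.random.default_rng(0)
B = rng.standard_normal((8, 5))
c = rng.standard_normal(8)
lam, mu = 1.0, 1.0

n = B.shape[1]
u = np.zeros(n)
d = np.zeros(n)
bk = np.zeros(n)                    # Bregman variable b^(k)
M = lam * B.T @ B + mu * np.eye(n)  # u-step system matrix (positive definite here)
for _ in range(300):
    # u-step (31): differentiable quadratic problem, solved exactly
    u = np.linalg.solve(M, lam * B.T @ c + mu * (d - bk))
    # d-step (32): closed form via soft shrinkage (33)
    d = shrink(u + bk, 1.0 / mu)
    # Bregman update: add back the splitting residual
    bk = bk + u - d
```

At convergence the splitting variable d agrees with u, so the constraint d = Λu is enforced and u minimises the original unconstrained energy.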

Since the SBM relies on the Bregman iteration, it is clear that all the related convergence results also hold for the SBM. In particular, Theorem 3.1 gives us an estimate for the convergence speed if certain regularity conditions are met. In the following, we would like to analyse whether these conditions can be fulfilled for the SBM. We are going to consider the following problem

 argminu{N∑k=1∥Aku+bk∥2+λ2∥Bu−c∥22} (35)

where Ak and B are matrices, bk and c some vectors, and λ a positive real-valued parameter. This model represents a generic formulation that also includes all forthcoming OF models. Thus, all statements concerning this model are automatically valid for our OF methods, too. The corresponding split Bregman algorithm of (35) solves

 argminu,d1,…,dN{N∑k=1∥dk∥2+λ2∥Bu−c∥22}such thatN∑k=1∥dk−Aku−bk∥22=0. (36)

Note that the necessary conditions for the application of the split Bregman algorithm are met. The cost function attains its global minimum for dk=0 for all k and u a minimiser of ∥Bu−c∥22. The constraining condition obviously also has a solution. Let us now define the matrix Λ by

 Λ\coloneqq⎡⎢ ⎢ ⎢ ⎢ ⎢⎣−A1I0…0−A20I…0⋮⋮⋮⋱⋮−AN00…I⎤⎥ ⎥ ⎥ ⎥ ⎥⎦ (37)

where I is the identity matrix of appropriate size. If we further define ~b\coloneqq(b1,…,bN)T, then (36) can be rewritten as

 argminu,d1,…,dN{N∑k=1∥dk∥2+λ2∥Bu−c∥22\eqqcolonJ(u,d1,…,dN)}such that∥∥Λ(u,d1,…,dN)T−~b∥∥22=0. (38)

Now assume that we have found (~u,~d1,…,~dN), a minimising solution of (38). In order to apply Theorem 3.1 we need to know what the subgradients of J look like. So assume (w,w1,…,wN) is a subgradient. By definition we must have for all u and all dk:

 N∑k=1(∥dk∥2−∥∥~dk∥∥2)+λ2(∥Bu−c∥22−∥B~u−c∥22)⩾⟨w,u−~u⟩+N∑k=1⟨wk,dk−~dk⟩. (39)

Since this must hold for all possible choices, it must hold especially for dk=~dk with arbitrary u. But then we see that w must be a subgradient of λ2∥B⋅−c∥22 at ~u. Setting u=~u and all dk but one to ~dk yields in the same way that every wk must be a subgradient of ∥⋅∥2 at ~dk. It follows that we have the following representation

 w=λBT(B~u−c), (40)

 wk=~dk∥∥~dk∥∥2 for k=1,…,N. (41)

We assume here that all ~dk are different from 0. If this is not the case, then the choice of the subgradient is not unique anymore, which would complicate the following discussion. Theorem 3.1 requires that there is a vector q such that ΛTq coincides with the subgradient given by (40) and (41). From the structure of the matrix Λ we deduce that the following condition must be fulfilled

 N∑k=1ATk~dk∥∥~dk∥∥2=λBT(c−B~u). (42)

If this relation holds for the minimising solution, then the estimate given in Theorem 3.1 also holds for the split Bregman algorithm.

Let us close this section with two small remarks concerning the previous results.

Should any of the ~dk be 0, then any vector with Euclidean norm less than or equal to 1 would be a valid subgradient of ∥⋅∥2 at ~dk. In that case we gain additional degrees of freedom in the above formula, which increases the chances that it can be fulfilled.

The SBM still converges even if (42) is not fulfilled. Theorem 3.1 only gives an estimate for the convergence speed, not for the convergence itself. The convergence is guaranteed by Propositions 2–4 and Proposition 5. We refer to NF2014 for an additional discussion on the necessary criteria to assert convergence. Further convergence investigations under duality considerations are exhibited in YMO2013 . A discussion on convergence rates under strong convexity and smoothness assumptions can be found in GB2016 ; G2016 . These works also include findings on optimal parameter choices.

4 Optic Flow: The Setup

The purpose of this section is to present the OF models that are addressed in this work. First we briefly consider basic model components. Then we summarise the models that are of interest here, in a variational set-up as well as in the discrete setting.

4.1 Optical Flow Models

Let us denote by f a given image from a sequence f:Ω×[0,T]→R, where Ω is a subset of R2 representing the (rectangular) image domain and [0,T] is a time interval. We restrict our attention to grey value images. Extensions to colour images are possible, they just render the proceeding more cumbersome and offer little insight into the underlying mathematics. The aim of the OF problem is to determine the displacement field of f between two consecutive frames at the moments t and t+1. The two components of this displacement field are denoted by u and v.

The general form of a variational model that determines the unknown displacement field as the minimiser of an energy functional can then be written as

 argmin_{(u,v)} { ∫_Ω ( D(u,v) + λ S(∇u,∇v) ) dx dy }.   (43)

Thereby, D denotes a data confidence term (or just data term), while the so-called smoothness term S regularises the energy, and λ > 0 is a regularisation parameter. The operator ∇ corresponds as usual to the spatial gradient. Such variational formulations have the advantages that they allow a transparent modelling and that the resulting flow fields are dense.

We employ a modern approach that combines the two following model assumptions:

Grey value constancy: One assumes here that the following equality holds

 f(x+u(x,y),y+v(x,y),t+1)=f(x,y,t). (44)

Surprisingly, this assumption is relatively often fulfilled when the displacements remain small enough. Unfortunately we have two unknowns but only one equation. Thus, there is no chance to recover the complete displacement field based on this equation alone. This problem is known in the literature as the aperture problem.

Gradient constancy: Here one assumes that the spatial image gradient remains constant along the displacement:

 ∇f(x+u, y+v, t+1) = ∇f(x,y,t).   (45)

Here we have two unknowns and two equations. As a consequence the aperture problem is not always present. This assumption is of interest as it remains fulfilled when the image undergoes global illumination changes, whereas the grey value constancy does not.

Our constancy assumptions represent nonlinear relationships between the data f and the flow field (u, v)^⊤. As a remedy we assume that all displacements are small. In this setting we may approximate the left-hand sides of the equations above by their corresponding first order Taylor expansions. Then (44) becomes

 f_x u + f_y v + f_t = 0   (46)

and (45) becomes

 f_{xx} u + f_{xy} v + f_{xt} = 0,   f_{xy} u + f_{yy} v + f_{yt} = 0   (47)

where the indices designate the partial derivatives with respect to the corresponding variables. Deviations of the left-hand sides from 0 can be considered as errors and will be penalised in our models. Making use of a weight parameter, interesting combinations of these models can be found in Table 1.
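As a small illustrative sketch (not the paper's algorithm), the linearised grey value constancy (46) can be checked numerically: for a ramp image shifted by one pixel, the residual f_x u + f_y v + f_t vanishes for the true displacement. The finite-difference approximations and the test image below are my own choices.

```python
import numpy as np

def linearised_ofc_residual(f1, f2, u, v):
    """Residual f_x*u + f_y*v + f_t of the linearised grey value
    constancy (46), using simple finite differences."""
    fx = np.gradient(f1, axis=1)   # spatial derivative in x
    fy = np.gradient(f1, axis=0)   # spatial derivative in y
    ft = f2 - f1                   # temporal derivative between frames
    return fx * u + fy * v + ft

# tiny example: a linear ramp image shifted by one pixel in x-direction
x = np.arange(8, dtype=float)
f1 = np.tile(x, (8, 1))            # f(x, y) = x
f2 = np.tile(x - 1.0, (8, 1))      # same ramp, shifted right by 1 pixel
u = np.ones((8, 8))                # the true displacement is (1, 0)
v = np.zeros((8, 8))
res = linearised_ofc_residual(f1, f2, u, v)
print(np.abs(res).max())           # residual vanishes for the true flow
```

For a linear image the Taylor expansion is exact, so the residual is zero; for general images it only vanishes approximately and for small displacements.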

We also address three smoothness terms of interest in OF models; see Table 1.

The data term D_1 is optimal from a theoretical point of view because it is convex and smooth. However, it is not robust with respect to outliers in the data. The data term D_2 is more robust since its penalisation is sub-quadratic. Its disadvantage is that it is not differentiable. The most interesting smoothness terms are the two sub-quadratic ones. Both are not differentiable, but they offer a sub-quadratic penalisation, and one of them is even rotationally invariant. While the quadratic smoothness term is convex and differentiable and thus offers attractive theoretical properties, its quadratic penalisation may cause an oversmoothing of discontinuities in the motion field.
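The robustness difference can be seen in one dimension: with a quadratic penaliser the minimiser is the mean of the observations and is dragged towards an outlier, while a sub-quadratic penaliser stays near the median. The Charbonnier function √(s² + ε²) used below is one common sub-quadratic choice, picked here only for illustration.

```python
import numpy as np

# observations with one gross outlier
r = np.array([0.1, -0.2, 0.15, 0.05, 10.0])
xs = np.linspace(-1, 11, 100001)

# quadratic penalisation of the residuals x - r_i
quad = ((xs[:, None] - r) ** 2).sum(axis=1)
# sub-quadratic (Charbonnier) penalisation, a smoothed L1 penalty
eps = 1e-3
subq = np.sqrt((xs[:, None] - r) ** 2 + eps ** 2).sum(axis=1)

x_quad = xs[np.argmin(quad)]   # the mean: pulled towards the outlier
x_subq = xs[np.argmin(subq)]   # near the median: outlier-robust
print(round(x_quad, 2), round(x_subq, 2))
```

The quadratic minimiser lands at the mean 2.02, far from the bulk of the data, whereas the sub-quadratic one stays at 0.1.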

Now that we have presented the smoothness and data terms, we can combine them to different energy functionals. In Table 2 we summarise the possible choices and cite some references where these models have been successfully applied.

4.2 Algorithmic Aspects

The following details concern pre- and postprocessing steps that improve the quality of our results. Most of these strategies are generic, and many of them are applied in various successful OF algorithms. We emphasise that they do not interfere with the Bregman framework that we use for the minimisation.

As usual for countless imaging applications, we convolve each frame of our image sequence with a Gaussian kernel of small standard deviation in order to deal with noise. For image sequences with large displacements, we follow BBPW04 and embed the minimisation of our energy into a coarse-to-fine multiscale warping approach. In all our experiments we set the scaling factor to 0.9. During warping, we employ a procedure from Wedel2008, where the authors proposed to apply a median filter to the flow components obtained from the coarser grid. We point to HB2011 for an analysis of the benefits of this strategy. Furthermore, we disable the data term at occlusions. This can be achieved by multiplying the data term with an occlusion indicator function that is 0 if a pixel is occluded and 1 if a pixel is visible. For the detection of occlusions we follow the popular cross-checking technique from cm92; pgpo94. The occlusion handling is especially important for approaches with a quadratic data term.
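Cross-checking marks a pixel as occluded when the backward flow, sampled at the forward-displaced position, fails to cancel the forward flow. The following is a minimal sketch of this idea; the function name, nearest-neighbour sampling, and tolerance are my own simplifications, not the exact procedure of cm92; pgpo94.

```python
import numpy as np

def occlusion_mask(fwd, bwd, tol=0.5):
    """Cross-checking: returns 1 where a pixel is visible and 0 where it
    is occluded. fwd and bwd are (h, w, 2) flow fields with (dx, dy)
    components; sampling is nearest-neighbour for simplicity."""
    h, w = fwd.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # forward-displaced positions, rounded and clipped to the image domain
    xt = np.clip(np.round(xs + fwd[..., 0]).astype(int), 0, w - 1)
    yt = np.clip(np.round(ys + fwd[..., 1]).astype(int), 0, h - 1)
    # for visible pixels the backward flow cancels the forward flow
    diff = fwd + bwd[yt, xt]
    visible = np.linalg.norm(diff, axis=-1) <= tol
    return visible.astype(float)

# consistent forward/backward flows: every pixel is marked visible
fwd = np.zeros((4, 4, 2)); fwd[..., 0] = 1.0   # displacement (1, 0)
bwd = -fwd                                     # exactly cancels it
print(occlusion_mask(fwd, bwd).min())          # 1.0
```

Multiplying the data term pointwise with this mask disables it at the detected occlusions, as described above.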

5 Optical Flow: The Bregman Framework

In this section we elaborate on the formulation of the SBM for the considered OF models. From an algorithmic point of view the most important questions have already been answered. It remains to show that the optic flow models can be cast into a form which is suitable for the application of the Bregman algorithms.

5.1 The OSB model

First, we consider the model we denoted as OSB. A straightforward discretisation yields

 argmin_{u,v} { (λ/2) D_1(u,v) + ∑_{i,j} √( ‖∇u_{i,j}‖_2² + ‖∇v_{i,j}‖_2² ) }   (48)

where the summation runs over all pixel coordinates. Before we start applying the Bregman algorithm, let us have a look at the smoothness term first. It can be reformulated in the following way

 ∑_{i,j} √( ‖∇u_{i,j}‖_2² + ‖∇v_{i,j}‖_2² ) =: ∑_{i,j} ‖ (∇u_{i,j}, ∇v_{i,j})^⊤ ‖_2.   (49)

Thus, this model can also be written in the following more compact form

 argmin_{u,v} { (λ/2) D_1(u,v) + ∑_{i,j} ‖ (∇u_{i,j}, ∇v_{i,j})^⊤ ‖_2 }.   (50)
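The reformulation (49) is simply the observation that the square root of a sum of squared 2-norms is the 2-norm of the stacked vector. A one-pixel numerical check (random gradient values, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
grad_u = rng.normal(size=2)   # (∂x u, ∂y u) at one pixel
grad_v = rng.normal(size=2)   # (∂x v, ∂y v) at one pixel

# left-hand side of (49): square root of the summed squared norms
lhs = np.sqrt(np.sum(grad_u ** 2) + np.sum(grad_v ** 2))
# right-hand side: 2-norm of the stacked 4-vector (∇u, ∇v)^T
rhs = np.linalg.norm(np.concatenate([grad_u, grad_v]))
print(np.isclose(lhs, rhs))   # True
```

This stacking is what couples the two flow components in the smoothness term and later allows a joint shrinkage step.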

The constrained formulation is now easily deduced. The best way to cast this model into the SBM framework is to introduce slack variables d_u and d_v for the non-differentiable smoothness term and to add an equality constraint between the new variables and the derivatives of our flow field.

 argmin_{u,v} { (λ/2) D_1(u,v) + ∑_{i,j} ‖ (d_{u,i,j}, d_{v,i,j})^⊤ ‖_2 }
 such that (1/2) ∑_{i,j} ‖ (d_{u,i,j}, d_{v,i,j})^⊤ − (∇u_{i,j}, ∇v_{i,j})^⊤ ‖_2² = 0.   (51)

A straightforward reordering and grouping of all the involved terms leads us to the following expression

 argmin_{u,v} { (λ/2) D_1(u,v) + ∑_{i,j} ‖ (d_{u,i,j}, d_{v,i,j})^⊤ ‖_2 }
 such that (1/2) ‖ (d_u, d_v)^⊤ − (∇u, ∇v)^⊤ ‖_2² = 0   (52)

which is well suited for applying the Bregman framework. For convenience we have grouped all variables u_{i,j} and v_{i,j} into large vectors u and v, respectively, while the vectors d_u and d_v contain the corresponding derivative information. The constraining condition admits a trivial solution and thus does not pose any problem. The cost function obviously has a minimum, too. The variables d_u and d_v act independently of u and v. Simply setting them all to 0 and determining the minimising u and v of D_1 by solving a least squares problem yields the desired existence of a minimiser. This implies that the cost function attains its minimum and that there exists a point where 0 is a subgradient. Note that D_1 can always be minimised since we operate in a finite dimensional space, where such problems are always solvable. It follows that the split Bregman algorithm is applicable. Following the notational convention from Section 3.1, we set

 η = (u_{i,j}, v_{i,j}, d_{u,i,j}, d_{v,i,j})^⊤,
 J(η) = (λ/2) D_1(u,v) + ∑_{i,j} ‖ (d_{u,i,j}, d_{v,i,j})^⊤ ‖_2,
 A(η) = (d_u, d_v)^⊤ − (∇u, ∇v)^⊤,   b = 0.   (53)

The application of the SBM algorithm is now straightforward. In the alternating optimisation steps, the minimisation with respect to u and v requires solving a linear system of equations with a symmetric and positive definite matrix; we refer to Section 6 for a proof. As mentioned in Goldstein2009, it is enough to solve this system with very little accuracy: a few Gauß-Seidel iterations are already sufficient. In YO2012, the authors also discuss the robustness of the Bregman approach with respect to inaccurate iterates and provide a mathematically sound explanation. The minimisation with respect to d_u and d_v can be expressed through shrinkage operations and does not pose any problem. A detailed listing of the complete algorithm is given in Algorithm 1.
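The shrinkage step mentioned above is the standard generalised (vector-valued) soft thresholding familiar from split Bregman methods; the sketch below shows the operator in isolation, with a hypothetical helper name and example numbers of my own choosing.

```python
import numpy as np

def shrink(p, thresh):
    """Generalised vector shrinkage: reduce the Euclidean norm of p
    towards zero by `thresh`, keeping the direction. This is the
    closed-form minimiser of  ||d||_2 + 1/(2*thresh) * ||d - p||_2^2."""
    n = np.linalg.norm(p)
    if n <= thresh:
        return np.zeros_like(p)   # norms below the threshold collapse to 0
    return (1.0 - thresh / n) * p

# per pixel, the stacked vector (d_u, d_v)^T is shrunk jointly,
# which is what makes the smoothness term rotationally coupled
p = np.array([3.0, 4.0])          # example vector with norm 5
d = shrink(p, 1.0)                # norm shrinks from 5 to 4
print(d)                          # [2.4 3.2]
```

Because the update has this closed form, the d-minimisation costs only a few arithmetic operations per pixel, which is why the linear system for u and v dominates the runtime.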

5.2 The Model of Brox et al.

The model that we discuss in this section differs only very little from the previous one in terms of the Bregman iterations. Although we have robustified the data term and rendered the smoothness term rotationally invariant, the differences to the previous Bregman iterative scheme are surprisingly small. After discretising the variational formulation we obtain

 argmin_{u,v} { λ D_2(u,v) + ∑_{i,j} √( ‖∇u_{i,j}‖_2² + ‖∇v_{i,j}‖_2² ) }.   (54)

Here, we observe that none of the terms of the energy functional is differentiable. In the same way as for OSB, the smoothness term can be rewritten in the following way

 argminu,v{λD2(u,v)+∑i,j∥∥∥(∇ui,j∇vi,j