1 Efficient Superimposition Recovering Algorithm
With estimated transformation parameters , we align the transmitted layers by warping mixtures with . Then our mixing model is rewritten as:
(1) 
Here is the latent transmitted layer, is the reflected layer in th (mixtures), is the mixing coefficients. With this new mixing model, the influence of parametric transformations can be ignored in the intermediate recovering process. For simplicity, we use to represent . and denote and , respectively. Let stand for the extracted gradients from . To recover high quality latent image layers, we propose to employ penalty on the extracted gradients and nonnegative constraints on the layers’ intensities along with the loss of the mixing model. Thus our recovering objective function is written as:
(2)  
where is a large vector containing all pixel values in all latent layers. The first term enforces agreement between the reconstructed layer gradients and the extracted layer gradients, while the second term enforces our mixing model. Since the extracted gradients are nonzero at very few coordinates, the norm term not only prefers layers with sparse gradients but also avoids oversmoothed results. is a trade-off coefficient. To solve the nonsmooth convex optimization model (2) efficiently, we denote
(3)  
Here is the penalty on the extracted gradients and corresponds to the loss and nonnegative constraints. can be formulated in the following matrix form:
(4)  
where is continuously differentiable and , of which Lipschitz constant , and is the unit matrix. We note that the objective function in (2) is a composite function of a differentiable term and a nondifferentiable term . Denote
(5)  
which is the first-order Taylor expansion of at , with the squared Euclidean distance between and as the regularization term. The traditional gradient descent algorithm obtains the solution at the th iteration by with a proper step size (greater than ). We instead employ accelerated gradient descent [1, 2] to solve the reconstruction problem, and name the resulting method the Efficient Superimposition Recovering Algorithm (ESRA). ESRA generates the solution at the th iteration by computing the following proximal operator
(6) 
where and for . We note that is a linear combination of and . The combination coefficient plays an important role in the convergence of the algorithm. As suggested by [3], we set and for . According to the theoretical analysis in [3], this accelerated gradient descent method can get within of the optimal objective value after steps. Because solving problem (6) directly is still very challenging, we propose the Parallel Algorithm with Constrained Total Variation (PACTV) to find the optimal solution, which is presented in the sequel.
(7)  
(8)  
(9) 
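The accelerated iteration described above can be sketched as follows. This is a minimal illustration in the standard notation of [3], not the ESRA implementation itself: the momentum sequence `t`, the function names `fista`, `grad_f`, `prox_g`, and `soft_threshold`, and the toy problem are all assumptions introduced for the example. The proximal step of ESRA (problem (6)) is replaced here by a simple soft-thresholding operator so that the block is self-contained.

```python
import numpy as np

def fista(grad_f, prox_g, lipschitz, x0, n_iter=100):
    """Accelerated proximal gradient iteration in the style of [3].

    grad_f    : gradient of the differentiable term
    prox_g    : prox_g(v, step) = proximal operator of the nondifferentiable term
    lipschitz : Lipschitz constant of grad_f (gives a constant step size 1/L)
    """
    x_prev = x0.copy()
    y = x0.copy()        # search point: linear combination of the last two iterates
    t = 1.0              # momentum parameter, t_1 = 1
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        # Proximal step evaluated at the search point y, not at the last iterate
        x = prox_g(y - step * grad_f(y), step)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x + ((t - 1.0) / t_next) * (x - x_prev)
        x_prev, t = x, t_next
    return x_prev

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (stand-in for the ESRA proximal step)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

# Hypothetical toy problem: min_x 0.5 * ||A x - b||^2 + lam * ||x||_1
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20))
b = rng.standard_normal(40)
lam = 0.5
L_f = np.linalg.norm(A, 2) ** 2   # Lipschitz constant of x -> A^T (A x - b)
x_hat = fista(lambda x: A.T @ (A @ x - b),
              lambda v, s: soft_threshold(v, lam * s),
              L_f, np.zeros(20), n_iter=500)
```

The only difference from plain proximal gradient descent is the search point `y`: dropping the momentum term (setting `y = x`) recovers the traditional scheme with its slower rate.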
2 PACTV via dual approach
We observe that problem (6) is block separable. If we denote , we can split into separable parts. Then, using the definitions in (3), we transform (6) into the following form:
(10)  
As illustrated in (10), finding amounts to solving the following separable problems with constrained total variation in parallel:
(11) 
Here , and represent , respectively. Similar to the image denoising problem [3, 4], we propose a dual approach to solve (11). We first introduce some notation:

is the set of matrix pairs where and that satisfy . We assume , for every .

The linear operation is defined by the formula

The operator which is adjoint to is given by where and .

is the orthogonal projection operator on the convex closed set .
Equipped with this notation, we derive a dual problem of (11) and give the following proposition, which states the relation between the primal and dual optimal solutions.
Proposition 1.
Let be the optimal solution of the problem
(12)  
where for every . Then the optimal solution of (11) is given by .
Proof.
First note the following relation holds true:
(13) 
Hence, we can write
(14) 
where,
(15)  
With this notation we have
(16) 
Thus the original problem (11) becomes
(17)  
Since the objective function is convex in and concave in , we can exchange the order of the minimum and maximum and get
(18)  
which can be written as
(19)  
Thus the optimal solution of the inner minimization problem is
(20) 
Finally, plugging the above expression for back into (19) and ignoring the constant term, we obtain a dual problem that is the same as (12). ∎
Moreover, given (12), we easily obtain the following lemma.
Lemma 1.
The objective function of (12) is continuously differentiable and its gradient is given by
(21) 
Let be the Lipschitz constant of ; then .
Proof.
Consider the function defined by
(22) 
Then the dual function (12) can be written as:
(23)  
Obviously, is continuously differentiable and its gradient is given by
(24) 
Therefore,
(25)  
Then for every two pairs of matrices where and for , we have
(26)  
Here the inequalities follow from the nonexpansiveness of the orthogonal projection operator and the properties of the linear operators . From [4], we have . Therefore, implying that and hence . ∎
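The operator-norm bound borrowed from [4] in the proof above can be worked out explicitly. The block below is a sketch in the notation of [4], which is an assumption here: $x$ is an $m \times n$ image, $\mathcal{L}^{T}(x) = (p, q)$ with $p_{i,j} = x_{i,j} - x_{i+1,j}$ and $q_{i,j} = x_{i,j} - x_{i,j+1}$, and the dual gradient has the form $\nabla h(p,q) = -2\lambda\,\mathcal{L}^{T} P_{C}[\,b - \lambda \mathcal{L}(p,q)\,]$, where $b$ is the observed image and $\lambda$ the regularization weight. The constants in this paper's own lemma may differ if its dual objective is scaled differently.

```latex
\begin{aligned}
\|\mathcal{L}^{T}(x)\|^{2}
 &= \sum_{i=1}^{m-1}\sum_{j=1}^{n} (x_{i,j}-x_{i+1,j})^{2}
  + \sum_{i=1}^{m}\sum_{j=1}^{n-1} (x_{i,j}-x_{i,j+1})^{2} \\
 &\le 2\sum_{i=1}^{m-1}\sum_{j=1}^{n} \bigl(x_{i,j}^{2}+x_{i+1,j}^{2}\bigr)
  + 2\sum_{i=1}^{m}\sum_{j=1}^{n-1} \bigl(x_{i,j}^{2}+x_{i,j+1}^{2}\bigr) \\
 &\le 8\sum_{i=1}^{m}\sum_{j=1}^{n} x_{i,j}^{2} \;=\; 8\,\|x\|_{F}^{2},
\end{aligned}
\qquad\text{so}\qquad
\|\mathcal{L}\|^{2} = \|\mathcal{L}^{T}\|^{2} \le 8 .

\begin{aligned}
\|\nabla h(p_{1},q_{1}) - \nabla h(p_{2},q_{2})\|
 &\le 2\lambda\,\|\mathcal{L}^{T}\|\,
      \bigl\|P_{C}[\,b-\lambda\mathcal{L}(p_{1},q_{1})\,] - P_{C}[\,b-\lambda\mathcal{L}(p_{2},q_{2})\,]\bigr\| \\
 &\le 2\lambda^{2}\,\|\mathcal{L}^{T}\|\,\|\mathcal{L}\|\,
      \|(p_{1},q_{1})-(p_{2},q_{2})\| \\
 &\le 16\lambda^{2}\,\|(p_{1},q_{1})-(p_{2},q_{2})\| .
\end{aligned}
```

The first chain uses $(a-b)^{2} \le 2(a^{2}+b^{2})$ and the fact that each pixel appears in at most four difference terms; the second uses the nonexpansiveness of $P_{C}$.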
With the definitions of and , fast gradient projection (FGP) is applied to the dual problem (12). The complexity of each FGP iteration is . In summary, our proposed Parallel Algorithm with Constrained Total Variation (PACTV) uses FGP to solve the dual problems (12) in parallel. We then concatenate the optimal and reshape them into vector form to obtain .
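The split, solve in parallel, and concatenate pattern described here can be sketched as follows. The function name `solve_blocks_in_parallel`, the `shapes` argument, and the use of a thread pool are assumptions made for illustration; `solve_block` stands for whatever per-layer solver is used (FGP in the paper), and a trivial nonnegativity projection is substituted in the usage example.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def solve_blocks_in_parallel(v, shapes, solve_block, max_workers=None):
    """Split a stacked vector into image blocks, solve each one independently,
    and concatenate the results back into a single vector.

    v           : 1-D array holding all layers, stacked one after another
    shapes      : list of (rows, cols) giving the image shape of each layer
    solve_block : callable mapping one 2-D block to its solved 2-D block
    """
    sizes = [rows * cols for rows, cols in shapes]
    offsets = np.cumsum([0] + sizes)
    blocks = [v[offsets[i]:offsets[i + 1]].reshape(shapes[i])
              for i in range(len(shapes))]
    # The subproblems share no variables, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        solved = list(pool.map(solve_block, blocks))
    return np.concatenate([s.ravel() for s in solved])

# Usage with a placeholder solver (projection onto nonnegative images)
v = np.array([1.0, -2.0, 3.0, -4.0, 5.0, -6.0, 7.0, -8.0, 9.0, -1.0])
result = solve_blocks_in_parallel(v, [(2, 2), (2, 3)],
                                  lambda blk: np.maximum(blk, 0.0))
```

Threads are used only to keep the sketch short; a process pool or GPU batching would be equally valid ways to exploit the same separability.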
Given the above proposition and lemma, we can apply FGP to the dual problem (12). FGP is outlined in Algorithm 2. Here means projecting the matrix pair onto the set . Finally, we obtain the optimal solution of (11). Our recovering method ESRA is outlined in Algorithm 1.
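For reference, the baseline FGP scheme of [4] on which this step builds can be sketched as below. This is the standard constrained TV denoising problem from [4], min over nonnegative x of ||x - b||_F^2 + 2*lam*TV(x) with the anisotropic (l1) total variation, and not the exact subproblem (11), whose penalty involves the extracted gradients. The names `L_op`, `Lt_op`, `fgp_tv_denoise`, the step size 1/(8*lam), and the relative-change stopping rule are assumptions taken from or modeled on [4].

```python
import numpy as np

def L_op(p, q):
    """L(p, q): maps a pair, p of shape (m-1, n) and q of shape (m, n-1),
    to an m x n matrix (a discrete negative divergence)."""
    m, n = q.shape[0], p.shape[1]
    out = np.zeros((m, n))
    out[:-1, :] += p   # + p[i, j]
    out[1:, :] -= p    # - p[i-1, j]
    out[:, :-1] += q   # + q[i, j]
    out[:, 1:] -= q    # - q[i, j-1]
    return out

def Lt_op(x):
    """Adjoint of L: vertical and horizontal forward differences of x."""
    return x[:-1, :] - x[1:, :], x[:, :-1] - x[:, 1:]

def fgp_tv_denoise(b, lam, n_iter=100, tol=1e-6):
    """Fast gradient projection on the dual of the constrained TV problem."""
    m, n = b.shape
    project_C = lambda x: np.maximum(x, 0.0)   # nonnegativity constraint
    p, q = np.zeros((m - 1, n)), np.zeros((m, n - 1))
    r, s = p.copy(), q.copy()                  # extrapolated dual variables
    t = 1.0
    x = project_C(b)
    for _ in range(n_iter):
        x_old = x
        # Gradient step on the dual (step 1/(8*lam)), then projection onto
        # the set of pairs with all entries in [-1, 1]
        gp, gq = Lt_op(project_C(b - lam * L_op(r, s)))
        p_new = np.clip(r + gp / (8.0 * lam), -1.0, 1.0)
        q_new = np.clip(s + gq / (8.0 * lam), -1.0, 1.0)
        # Momentum update, same sequence as in the outer accelerated scheme
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        r = p_new + ((t - 1.0) / t_new) * (p_new - p)
        s = q_new + ((t - 1.0) / t_new) * (q_new - q)
        p, q, t = p_new, q_new, t_new
        # Primal solution recovered from the current dual pair
        x = project_C(b - lam * L_op(p, q))
        if np.linalg.norm(x - x_old) <= tol * max(np.linalg.norm(x_old), 1.0):
            break
    return x

denoised = fgp_tv_denoise(np.array([[0.0, 1.0]]), lam=0.1)
```

For the two-pixel image above, the minimizer can be checked by hand: the objective x1^2 + (x2 - 1)^2 + 0.2*|x1 - x2| is minimized at x1 = 0.1, x2 = 0.9, which is what the iteration returns.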
In our implementations, we set the total number of ESRA iterations to 100 and the FGP tolerance to , and we also set to ensure a constant step size. The initial value of is zero. The final recovered reflected layers of (2) should be warped with and have their intensity enhanced by 2 to be visible. Our recovering method provides a general optimization framework and can be extended to solve other reconstruction problems, such as those in [5, 6].
References
 [1] A. Nemirovski, “Efficient methods in convex programming,” 2005.
 [2] Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, vol. 87, Springer, 2004.
 [3] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM Journal on Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.
 [4] A. Beck and M. Teboulle, “Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems,” TIP, vol. 18, no. 11, pp. 2419–2434, 2009.
 [5] K. Gai, Z. Shi, and C. Zhang, “Blind separation of superimposed images with unknown motions,” in Proc. CVPR, 2009, pp. 1881–1888.
 [6] K. Gai, Z. Shi, and C. Zhang, “Blind separation of superimposed moving images using image statistics,” TPAMI, no. 99, pp. 1–1.