We focus on the following convex minimization problem with linear equality constraints,
where $f$ is a proper closed convex (but possibly nonsmooth) function, $A$ and $b$ are a given matrix and vector, respectively, and $\mathcal{X}$ is a closed convex set. Throughout, the solution set of problem (1), denoted by $\mathcal{X}^*$, is assumed to be nonempty.
where $\beta > 0$ and $\lambda$ denote the penalty parameter and the Lagrange multiplier w.r.t. the equality constraint, respectively. ALM can be viewed as an application, to the dual of (1), of the well-known proximal point algorithm (PPA), which dates back to the seminal works of Martinet and Rockafellar. Obviously, the efficiency of ALM depends heavily on the solvability of the $x$-subproblem, that is, on whether the core $x$-subproblem admits a closed-form solution. Unfortunately, in many real applications [2, 4, 9, 10], the coefficient matrix $A$ is not the identity (nor is $A^\top A$ a multiple of the identity), which makes this subproblem of ALM difficult, or even infeasible, to solve. To overcome this difficulty, Yang and Yuan proposed a linearized ALM that linearizes the $x$-subproblem so that its closed-form solution can be easily derived; we refer to [3, 6] for recent progress in this direction.
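To make the contrast concrete, here is a minimal Python sketch of one linearized-ALM iteration for the model problem $\min\{\|x\|_1 : Ax = b\}$; the parameter names `beta`, `r` and all sizes are illustrative, not taken from the paper:

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding: the proximity operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def linearized_alm(A, b, beta=1.0, iters=5000):
    """Linearized ALM sketch for min ||x||_1 s.t. Ax = b.
    Linearizing the quadratic penalty in the x-subproblem yields a
    closed-form soft-thresholding update; r > beta * ||A^T A|| keeps
    the implicit proximal term positive definite."""
    m, n = A.shape
    r = 1.01 * beta * np.linalg.norm(A, 2) ** 2
    x, lam = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        grad = A.T @ (beta * (A @ x - b) - lam)  # gradient of smooth AL part
        x = soft(x - grad / r, 1.0 / r)          # closed-form x-update
        lam -= beta * (A @ x - b)                # multiplier update
    return x
```

Without the linearization, the $x$-subproblem couples all coordinates through $A^\top A$ and has no closed form in general; with it, each iteration costs only two matrix-vector products and a componentwise shrinkage.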
Under a basic regularity condition, it is well known that $x^*$ is an optimal solution of (1) if and only if there exists a multiplier $\lambda^*$ such that the following variational inequality holds
By the aforementioned assumption on the solution set of problem (1), the solution set of (2), denoted by $\Omega^*$, is also nonempty. When PPA is applied to the variational inequality (2), it reads as the following unified updating scheme: at the $k$-th iteration, find $w^{k+1}$ satisfying
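For orientation, in the standard customized-PPA notation (our naming: $w = (x, \lambda)$ and the usual monotone operator $F$) the characterization (2) and the scheme (3) typically read:

```latex
% Sketch in standard customized-PPA notation; w = (x, \lambda).
w^* \in \Omega, \quad
f(x) - f(x^*) + (w - w^*)^\top F(w^*) \ge 0 \quad \forall\, w \in \Omega,
\qquad
F(w) = \begin{pmatrix} -A^\top \lambda \\ Ax - b \end{pmatrix},
% and the PPA scheme (3):
f(x) - f(x^{k+1}) + (w - w^{k+1})^\top
  \bigl( F(w^{k+1}) + M\,(w^{k+1} - w^k) \bigr) \ge 0
  \quad \forall\, w \in \Omega .
```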
We call $M$ the proximal matrix; it is usually required to be symmetric positive definite to ensure the convergence of (3). To our knowledge, this idea was initiated by He et al. [HLHY02]. Clearly, different structures of $M$ result in different versions of PPA. From a computational perspective, our motivation is to design a multi-parameterized PPA for solving problem (1) that maintains the efficiency of the linearized ALM, although the feasible starting point may differ. Interestingly, many customized proximal matrices in [5, 7, 10, 16] turn out to be special cases of our multi-parameterized proximal matrix (see Remark 2.2 for details). In this sense, the proposed algorithm can be viewed as a general customized PPA for solving problem (1). Moreover, we adopt a relaxation strategy to accelerate convergence.
2 Main Algorithm
In this paper, we design the following multi-parameterized proximal matrix
where $I$ denotes the identity matrix, $\tau$ is an arbitrary real scalar, and
The notation $\|\cdot\|$ represents the spectral norm of a matrix. It is easy to check that the above matrix $M$ is symmetric positive definite for any parameters satisfying (5).
Now, substituting the matrix $M$ into (3), we have
By the equation in (6), it can be deduced that
which further reduces (7) to
Based on the inequality in (6), i.e., the first-order optimality condition of the $x$-subproblem, we obtain
Then, our relaxed multi-parameterized PPA (RM-PPA) is described as Algorithm 2.1, where we use $\tilde{w}^k$ to denote the output of (3) from the given iterate $w^k$, and $w^{k+1}$ to denote the new iterate after the relaxation step. Finally, the inequality (3) becomes
Note that the parameter in step 5 actually plays the role of the penalty parameter in ALM, while the other parameter can be treated as the proximal parameter used in the customized PPA. The quadratic term
plays a second penalty role for the equality constraint, relating to the $k$-th iteration. By its updating rule, it uses a convex combination of the feasibility errors at the current and the previous iterations. The parameterized matrix designed in this paper is more general than several in the literature:
For one choice of the parameters, our parameterized proximal matrix reduces to the matrix in [5, page 158]; for another, it is identical to that in [10, Eq. (3.1)], although Algorithm 2.1 uses an additional relaxation step for faster convergence. Moreover, we establish a worst-case sublinear ergodic convergence rate for both the objective function value error and the feasibility error.
Setting step 6 aside, it is easy to check that, for a suitable choice of parameters, Algorithm 2.1 is a linearization of ALM:
Specifically, with this choice of parameters, the scheme (10) is ALM with an extra proximal term that eliminates the quadratic coupling term from the iteration; Algorithm 2.1 is then a linearized ALM. Besides, our parameter range is more general and flexible than those in the literature.
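Regardless of the specific proximal matrix, the prediction-plus-relaxation structure of Algorithm 2.1 can be sketched generically; the names `predictor` and `gamma` are ours, not the paper's:

```python
import numpy as np

def relaxed_ppa(predictor, w0, gamma=1.5, iters=200):
    """Generic relaxed-PPA loop (a sketch): the predictor returns the
    PPA output w_tilde from the current iterate, and the relaxation
    step w <- w - gamma * (w - w_tilde), gamma in (0, 2), follows."""
    w = np.asarray(w0, dtype=float)
    for _ in range(iters):
        w_tilde = predictor(w)         # prediction, e.g. solving (3)
        w = w - gamma * (w - w_tilde)  # relaxation (correction) step
    return w

# Toy predictor: the proximal map of f(w) = ||w||^2 / 2 (unit proximal
# parameter) is w / 2, whose fixed point is the minimizer 0.
w_star = relaxed_ppa(lambda w: w / 2.0, [4.0, -2.0])
```

With $\gamma = 1$ the loop is plain PPA; values $\gamma \in (1, 2)$ over-relax the step, which is the acceleration mechanism adopted here.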
3 Convergence Analysis
Before analyzing the global convergence and the sublinear convergence rate of Algorithm 2.1, we first give a fundamental lemma.
Proof. According to the inequality (2) and the skew-symmetry of the underlying operator, i.e.,
the inequality (9) with this setting gives the following. Note that step 6 shows
so we have
Since the matrix $M$ can be decomposed as
where the zero block denotes the zero matrix of appropriate size and the other block is given by (11), we thus obtain
Then, applying the identity
to the left-hand side of (14), the following inequality holds immediately
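The identity invoked here is presumably the standard quadratic expansion used throughout PPA contraction proofs; for any vectors $a, b, c, d$ and any symmetric matrix $G$:

```latex
(a - b)^\top G\,(c - d)
  = \tfrac{1}{2}\bigl( \|a - d\|_G^2 - \|a - c\|_G^2 \bigr)
  + \tfrac{1}{2}\bigl( \|c - b\|_G^2 - \|d - b\|_G^2 \bigr),
```

which follows by expanding all four $G$-norms and cancelling terms.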
Substituting (13) into the above expression, it can be deduced that
This completes the proof.
Lemma 3.1 shows that the sequence generated by Algorithm 2.1 is contractive under the $M$-norm w.r.t. the solution set $\Omega^*$, since the matrix $M$ is positive definite and the remaining term is nonnegative. Following the convergence proofs in the literature and the proof of Lemma 3.1, the global convergence and the sublinear convergence rate of Algorithm 2.1 can be established as below; the proofs are omitted for conciseness.
Theorem 3.1 shows that Algorithm 2.1 converges globally with a sublinear ergodic convergence rate. Furthermore, by making use of the second result in Theorem 3.1, we can deduce a compact result as the following corollary. For any fixed index, let
where the second equality and the final inequality use the definitions above. Then, it follows from (18) that
which, by the definition in (17), completes the proof.
holds with the constant given by (17). Rearranging the above inequality, we have
Hence, we also have the following, showing that
4 Numerical Experiments
In this section, we apply the proposed algorithm to the following $\ell_1$-minimization problem from signal processing, which aims to reconstruct a length-$n$ sparse signal from $m$ observations:
Note that this is a special case of (1) with the specifications $f(x) = \|x\|_1$ and $\mathcal{X} = \mathbb{R}^n$. Applying Algorithm 2.1 to problem (21), we derive the explicit iterative scheme, where the proximity operator is defined as
which can be expressed explicitly by the shrinkage operator, implemented by the MATLAB built-in function ‘wthresh’. Following Lemma 3.1, we use the following stopping criteria under a given tolerance:
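A Python analogue of this shrinkage step, together with a typical pair of stopping tests (our naming; the paper's exact criteria are the displayed ones):

```python
import numpy as np

def shrink(v, t):
    """Soft shrinkage operator, the proximity operator of t * ||.||_1;
    equivalent to MATLAB's wthresh(v, 's', t)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def stop(x_new, x_old, A, b, tol=1e-6):
    """Illustrative stopping rule: relative iterate change (It_err-style)
    and equality-constraint residual (Eq_err-style) both below tol."""
    it_err = np.linalg.norm(x_new - x_old) / max(1.0, np.linalg.norm(x_old))
    eq_err = np.linalg.norm(A @ x_new - b)
    return max(it_err, eq_err) < tol
```

Because the shrinkage acts componentwise, the per-iteration cost of the $x$-update is linear in $n$ apart from the matrix-vector products with $A$.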
All of the forthcoming experiments use the same starting points and are carried out in MATLAB R2018a (64-bit) on a Windows 10 system with an Intel Core i7-8700K CPU (3.70 GHz) and 16 GB of memory.
Consider an original signal containing 180 spikes with amplitude $\pm 1$. The measurement matrix $A$ is first drawn from the standard normal distribution and then each of its rows is normalized. The observation $b$ is generated by $b = A\bar{x} + \omega$, where the noise $\omega$
is drawn from a Gaussian distribution. With the tuned parameters, computational results under different values of the remaining parameter are shown in Table 1, in which we present the number of iterations (IT), the CPU time in seconds (CPU), the final residuals It_err and Eq_err, and the recovery error. The results in Table 1 indicate that the choice of this parameter has a great effect on the performance of our algorithm w.r.t. IT and CPU, and a moderate setting appears to be a reasonable choice for saving CPU time and reducing the number of iterations. The reconstruction results are shown in Fig. 1, from which the solution obtained by our algorithm always has the correct number of pieces and is close to the original noiseless signal.
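The data-generation recipe just described can be reproduced as follows; the dimensions `n`, `m` and the noise level are illustrative, and only the 180 spikes follow the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 2000, 800, 180       # illustrative sizes; k spikes as in the text

# sparse ground-truth signal with k spikes of amplitude +/- 1
x_true = np.zeros(n)
idx = rng.choice(n, size=k, replace=False)
x_true[idx] = rng.choice([-1.0, 1.0], size=k)

# Gaussian measurement matrix with normalized rows
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=1, keepdims=True)

# noisy observations b = A x_true + omega
omega = 0.01 * rng.standard_normal(m)
b = A @ x_true + omega
```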
Table 1: Results of Algorithm 2.1 with different parameter values.
We compare the following algorithms:
- Algorithm 2.1 without the relaxation step (“M-PPA”);
- the customized PPA (“C-PPA”) with the parameters suggested therein;
- the parameterized PPA (“P-PPA”) with the parameters suggested therein.
We show comparative results on the convergence behaviors of the residuals in Fig. 2. The effect of the different algorithms on recovering the original signal is shown in Fig. 3. Here, we emphasize that the parameter values in [7, 10] cannot make C-PPA and P-PPA terminate on our examples, so we set the same value as ours but keep the rest as in their experiments. From Figs. 2-3, we observe that M-PPA is competitive with P-PPA, and that RM-PPA (that is, Algorithm 2.1) performs better than the other three algorithms.
-  J. Bai, H. Zhang, J. Li, A parameterized proximal point algorithm for separable convex optimization, Optim. Lett. 12 (2018) 1589-1608.
-  J. Bai, J. Liang, K. Guo, Y. Jing, Accelerated symmetric ADMM and its applications in signal processing, (2019) arXiv:1906.12015v2.
-  D. Donoho, Y. Tsaig, Fast solution of $\ell_1$-norm minimization problems when the solution may be sparse, IEEE Trans. Inform. Theory, 54 (2008) 4789-4812.
-  G. Gu, B. He, X. Yuan, Customized proximal point algorithms for linearly constrained convex minimization and saddle-point problems: a unified approach, Comput. Optim. Appl. 59 (2014) 135-161.
-  B. He, F. Ma, X. Yuan, Optimal proximal augmented Lagrangian method and its application to full Jacobian splitting for multi-block separable convex minimization problems, IMA J. Numer. Anal. (2019) doi:10.1093/imanum/dry092.
-  B. He, X. Yuan, W. Zhang, A customized proximal point algorithm for convex minimization with linear constraints, Comput. Optim. Appl. 56 (2013) 559-572.
-  M. Hestenes, Multiplier and gradient methods, J. Optim. Theory Appl. 4 (1969) 303-320.
-  S. Kim, K. Koh, M. Lustig, S. Boyd, D. Gorinevsky, An interior-point method for large-scale $\ell_1$-regularized least squares, IEEE J-STSP, 1 (2007) 606-617.
-  F. Ma, M. Ni, A class of customized proximal point algorithms for linearly constrained convex optimization, Comp. Appl. Math. 37 (2018) 896-911.
-  B. Martinet, Brève communication. Régularisation d'inéquations variationnelles par approximations successives, ESAIM: Math. Model. Numer. Anal. 4(R3), (1970) 154-159.
-  M. Powell, A method for nonlinear constraints in minimization problems, Optimization (R. Fletcher ed.). New York: Academic Press, (1969) 283-298.
-  R. Rockafellar, Augmented Lagrangians and applications of the proximal point algorithm in convex programming, Math. Oper. Res. 1(1976) 97-116.
-  R. Rockafellar, Monotone operators and the proximal point algorithm, SIAM J. Control Optim. 14, (1976) 877-898.
-  J. Yang, X. Yuan, Linearized augmented Lagrangian and alternating direction methods for nuclear norm minimization, Math. Comput. 82 (2013) 301-329.
-  Y. Zhu, J. Wu, G. Yu, A fast proximal point algorithm for $\ell_1$-minimization problem in compressed sensing, Appl. Math. Comput. 270 (2015) 777-784.