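In this paper, we consider designing a unified numerical method for computing the solution of the following three types of problems: projection onto the intersection of an $\ell_1$ ball and an $\ell_2$ ball:
$$\min_{x\in\mathbb{R}^n}\ \|x-v\|_2^2 \quad \text{s.t.}\quad \|x\|_1\le t,\ \|x\|_2\le 1, \tag{1.1}$$
projection onto the intersection of an $\ell_1$ ball and an $\ell_2$ sphere:
$$\min_{x\in\mathbb{R}^n}\ \|x-v\|_2^2 \quad \text{s.t.}\quad \|x\|_1\le t,\ \|x\|_2= 1, \tag{1.2}$$
and projection onto the intersection of an $\ell_1$ sphere and an $\ell_2$ sphere:
$$\min_{x\in\mathbb{R}^n}\ \|x-v\|_2^2 \quad \text{s.t.}\quad \|x\|_1= t,\ \|x\|_2= 1. \tag{1.3}$$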
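Here the $\ell_2$ (i.e., Euclidean) norm on $\mathbb{R}^n$ is denoted by $\|\cdot\|_2$, with the unit $\ell_2$ ball (sphere) denoted by $\mathbb{B}_2$ ($\mathbb{S}_2$), and the $\ell_1$ norm is denoted by $\|\cdot\|_1$, with the $\ell_1$ ball (sphere) of radius $t$ denoted by $\mathbb{B}_1^t$ ($\mathbb{S}_1^t$). Notice that $\|x\|_2\le\|x\|_1\le\sqrt{n}\,\|x\|_2$. The trivial cases for the problems of interest are: (a) $t\le 1$: in this case $\|x\|_1\le t$ implies $\|x\|_2\le 1$, meaning $\mathbb{B}_1^t\subseteq\mathbb{B}_2$, and $\mathbb{B}_1^t\cap\mathbb{B}_2=\mathbb{B}_1^t$. (b) $t\ge\sqrt{n}$: in this case $\|x\|_2\le 1$ implies $\|x\|_1\le t$, meaning $\mathbb{B}_2\subseteq\mathbb{B}_1^t$, and $\mathbb{B}_1^t\cap\mathbb{B}_2=\mathbb{B}_2$. Without loss of generality, we assume $1<t<\sqrt{n}$ in the remainder of this paper.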
Problems (1.1), (1.2) and (1.3) arise widely in modern science, engineering and business. For example, gradient projection methods for Sparse Principal Component Analysis (sPCA) [1, 2, 3, 4] often involve problems (1.1) or (1.3), and (1.3) is also an integral part of efficient sparse non-negative matrix factorization [6, 7, 8] and of dictionary learning with sparseness-enforcing projections. Problem (1.2) often arises in Sparse Generalized Canonical Correlation Analysis (SGCCA), and Witten et al. use (1.2) for computing the rank-1 approximation of a given matrix within a block coordinate descent method, which can be applied to sparse principal components and canonical correlation.
Our contribution in this paper can be summarized as follows:
We propose a unified analysis for solving these problems. Specifically, we show that their solutions can all be determined by the root of a piecewise quadratic auxiliary function.
We establish a series of properties of the proposed auxiliary function, which give a detailed characterization of the solutions of these problems.
We design a unified method, with or without sorting, for finding the root of the auxiliary function, which accounts for the major computational effort in solving these problems.
In the remainder of this section, we outline our notation and introduce various concepts that will be employed throughout the paper. In §2, we discuss the most closely related problems and algorithms. In §3, we introduce our proposed auxiliary function and establish a series of its properties. We use the proposed auxiliary function to characterize the optimal solutions in §4. A unified algorithm for finding the root of the auxiliary function is proposed in §5. Numerical results are reported in §6. Concluding remarks are provided in §7.
For any , let be the -th element of and be the nonnegative orthant of , i.e., . Denote the soft thresholding operator in with threshold by , i.e., for any , for . Given , denote as the projection of onto the nonnegative orthant , i.e. . The norm of is defined as with and , where is the number of groups. For a compact set and , denote . If the function is convex, then the subdifferential of at is given by
Let be the vector of all ones. The largest and the second-largest components of are denoted as and , respectively. To simplify the analysis, we assume are the distinct components of such that with and .
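The two operators introduced in this notation paragraph can be sketched as follows. This is a minimal illustration, assuming the standard componentwise definitions: soft thresholding maps $v_i$ to $\mathrm{sign}(v_i)\max(|v_i|-\lambda,0)$, and the projection onto the nonnegative orthant maps $v_i$ to $\max(v_i,0)$. The function names `soft_threshold` and `project_nonneg` and the sample vector are chosen here for illustration only.

```python
import numpy as np

def soft_threshold(v, lam):
    """Componentwise soft thresholding: sign(v_i) * max(|v_i| - lam, 0)."""
    v = np.asarray(v, dtype=float)
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def project_nonneg(v):
    """Projection onto the nonnegative orthant: max(v_i, 0) componentwise."""
    return np.maximum(np.asarray(v, dtype=float), 0.0)

# Illustrative data (not from the paper).
v = np.array([3.0, -1.5, 0.2])
shrunk = soft_threshold(v, 1.0)   # entries with |v_i| <= 1 are set to zero
clipped = project_nonneg(v)       # negative entries are set to zero
```

For a nonnegative vector the two operators coincide up to a shift: `soft_threshold(v, lam)` equals `project_nonneg(v - lam)` whenever all entries of `v` are nonnegative and `lam >= 0`.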
2 Related methods
We discuss the most closely related work in this section.
Projection onto the $\ell_1$ ball. Many algorithms have been proposed for projection onto a single $\ell_1$ ball. It can be shown [9, 10, 11] that the projection of onto can be characterized by the root of the auxiliary function
The properties of are summarized in the proposition below.
The function is continuous, strictly decreasing and piecewise linear on with breakpoints , and for any .
By Proposition 2.1, has a unique root on since as and . The algorithms for computing the $\ell_1$ ball projection are summarized and compared in , which also proposes an efficient algorithm with worst-case complexity and observed complexity .
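The root-finding characterization of the $\ell_1$ ball projection can be sketched in code. The sketch below assumes the auxiliary function is the standard one, $\lambda\mapsto\sum_i\max(|v_i|-\lambda,0)-t$, where $v$ is the point to project and $t$ the radius (both symbol names are chosen here for illustration). It uses the classical sort-based search over breakpoints, which costs $O(n\log n)$, and is not the faster algorithm referred to above.

```python
import numpy as np

def project_l1_ball(v, t):
    """Project v onto {x : ||x||_1 <= t} by locating the root of
    f(lam) = sum_i max(|v_i| - lam, 0) - t among the sorted breakpoints."""
    v = np.asarray(v, dtype=float)
    a = np.abs(v)
    if a.sum() <= t:                 # already inside the ball
        return v.copy()
    u = np.sort(a)[::-1]             # breakpoints in descending order
    css = np.cumsum(u)
    k = np.arange(1, u.size + 1)
    # Largest k such that the k largest entries stay positive after shifting.
    rho = np.nonzero(u * k > css - t)[0][-1]
    lam = (css[rho] - t) / (rho + 1)  # root of f on the linear piece found
    return np.sign(v) * np.maximum(a - lam, 0.0)

# Illustrative data (not from the paper).
x = project_l1_ball([3.0, 1.0, 0.5], 2.0)
```

Because the function is strictly decreasing and piecewise linear, once the correct linear piece is identified the root is available in closed form, which is what the last two lines before the return statement compute.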
Group ball projection. The first related work is the Euclidean projection onto the intersection of and norm balls ( or ) proposed by Su et al. . With and one group, this problem reduces to (1.1). They stated that the projection can be reduced to finding the root of an auxiliary function
Lemma 2.2 ( Theorem 1).
The following statements hold true: (i) is continuous piece-wise smooth on ; (ii) is monotonically decreasing and has a unique root in .
Remark. Part (ii) of this lemma does not hold in general, as the following two counterexamples show.
Example 1. Consider and . Then for
For this instance, has no root on . Therefore, the existence claim in Lemma 2.2(ii) fails.
Example 2. Consider and . Then for
Clearly, any point in is a root of , so the uniqueness claim in Lemma 2.2(ii) fails.
Sparseness-enforcing projection operator. Another related work is the “sparseness-enforcing projection operator” proposed by Hoyer , which requires the solution to satisfy a normalized smooth “sparseness measure” defined by
This leads to problem (1.3).
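For concreteness, the sparseness measure can be sketched as follows, assuming it is the commonly used definition attributed to Hoyer, $\sigma(x)=\bigl(\sqrt{n}-\|x\|_1/\|x\|_2\bigr)/\bigl(\sqrt{n}-1\bigr)$ for nonzero $x\in\mathbb{R}^n$ with $n\ge 2$. The function names below are illustrative. Under this definition, prescribing a sparseness level on the unit $\ell_2$ sphere fixes the $\ell_1$ norm, which is how the $\ell_1$ sphere constraint arises.

```python
import numpy as np

def hoyer_sparseness(x):
    """Sparseness measure: 1 for a one-hot vector, 0 when all entries
    have equal magnitude. Assumes x is nonzero and has at least 2 entries."""
    x = np.asarray(x, dtype=float)
    n = x.size
    l1 = np.abs(x).sum()
    l2 = np.sqrt((x ** 2).sum())
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1.0)

def l1_radius_for_sparseness(sigma, n):
    """l1 norm that a unit-l2-norm vector in R^n must have to reach
    sparseness sigma (obtained by inverting the formula above)."""
    return np.sqrt(n) - sigma * (np.sqrt(n) - 1.0)

# Illustrative data (not from the paper).
dense = hoyer_sparseness([1.0, 1.0, 1.0, 1.0])
sparse = hoyer_sparseness([0.0, 0.0, 1.0])
```

For example, `l1_radius_for_sparseness(0.5, 4)` returns 1.5, which lies strictly between 1 and $\sqrt{4}=2$.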
Theis et al. showed that the projection is almost surely unique for drawn from a continuous distribution and that, if it is unique, the projection is determined by the root of . We summarize these results in Lemma 2.3. Algorithms for solving (1.3) mainly include the alternating projection method in [6, 14], the method of Lagrange multipliers based on sorted in , and the method in based on computing the root of the auxiliary function .
([8, Lemma 3 in Appendix]) Let be a point such that is unique and . Then is well defined and the following hold:
is continuous on ;
is differentiable on .
is strictly decreasing on , and is constant on .
has a unique root , and .
Projection onto the intersection of an $\ell_1$ ball and an $\ell_2$ sphere. Tenenhaus et al. provided a closed-form solution of (1.2). The algorithms for solving (1.2) mainly include root finding with bisection proposed by and root finding with sorting by . Let and suppose the elements are sorted in descending order. They analyzed the properties of in the following lemma.
([15, Proposition 1]) The following statements hold true.
is continuous and decreasing.
Let be the number of elements of equal to . For , there exists and such that .
is a solution of a second degree polynomial equation.
Remark. Part (ii) of Lemma 2.4 states that is a sufficient condition for to have a root on . However, Example 1 is a counterexample showing that is not sufficient to guarantee that has a root on .
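The bisection approach to (1.2) mentioned above can be sketched as follows. This is a hedged illustration, not the cited authors' implementation: it assumes the solution has the form $S_\lambda(v)/\|S_\lambda(v)\|_2$ with $S_\lambda$ the soft thresholding operator, that $\lambda=0$ if the normalized input already satisfies the $\ell_1$ constraint, and otherwise that $\lambda$ is chosen so the $\ell_1$ norm equals the radius $t$. It further assumes $v\neq 0$, $1<t<\sqrt{n}$, and that the largest absolute entry of $v$ is unique. When the maximum is repeated, the norm ratio may never drop to $t$, which is the kind of failure the remark above points out. All names are illustrative.

```python
import numpy as np

def project_l1ball_l2sphere(v, t, tol=1e-12, max_iter=200):
    """Bisection on lam for ||S_lam(v)||_1 / ||S_lam(v)||_2 = t.
    Assumes v != 0, 1 < t < sqrt(n), and a unique largest |v_i|."""
    v = np.asarray(v, dtype=float)
    a = np.abs(v)

    def normalized(lam):
        s = np.maximum(a - lam, 0.0)
        return s / np.linalg.norm(s)

    x = normalized(0.0)
    if x.sum() <= t:                 # l1 constraint inactive
        return np.sign(v) * x
    lo, hi = 0.0, float(a.max())     # ratio > t at lo, tends to 1 < t near hi
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if normalized(mid).sum() > t:
            lo = mid                 # still too dense: threshold harder
        else:
            hi = mid
    return np.sign(v) * normalized(lo)

# Illustrative data (not from the paper).
x = project_l1ball_l2sphere([3.0, -1.0, 0.5], 1.2)
```

Bisection needs no sorting but only converges to a tolerance, whereas a sort-based method can identify the active piece exactly.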
3 Proposed auxiliary function
As discussed in §2, most existing algorithms for projection onto the intersection of $\ell_1$ and $\ell_2$ balls/spheres are built on the auxiliary function . Our proposed methods are based on a different auxiliary function for characterizing the properties of the projections, which is the main focus of this section.
Using the symmetry of the feasible region stated in Proposition 3.1, we can transform the original problems (1.1), (1.2) and (1.3) into the corresponding problems restricted to , so that from now on we can focus on the following problems
We define the following univariate function for given and :
Denote the index set of components greater than or equal to a given :
The sums of those components and of their squares are denoted as and , respectively. For simplicity, for the distinct values in , we write
Notice that since ,
In particular, it is obvious that
Therefore, we can rewrite as
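The rewriting step can be checked numerically. The defining equations are not reproduced in the text above, so the sketch below rests on an assumption: the auxiliary function is taken to be $\phi(\lambda)=\|(v-\lambda e)_+\|_1^2-t^2\|(v-\lambda e)_+\|_2^2$ for a nonnegative vector $v$. With $m$ the size of the index set $\{i: v_i\ge\lambda\}$, $s$ the sum of those components and $w$ the sum of their squares (all three names are illustrative), expanding the squares gives the quadratic $m(m-t^2)\lambda^2-2s(m-t^2)\lambda+s^2-t^2w$ on each piece.

```python
import numpy as np

def phi_direct(v, t, lam):
    """Assumed auxiliary function: ||(v - lam)_+||_1^2 - t^2 ||(v - lam)_+||_2^2."""
    r = np.maximum(np.asarray(v, dtype=float) - lam, 0.0)
    return r.sum() ** 2 - t ** 2 * (r ** 2).sum()

def phi_quadratic(v, t, lam):
    """Same value via the index set {i : v_i >= lam} and its partial sums."""
    v = np.asarray(v, dtype=float)
    active = v >= lam
    m = int(active.sum())            # size of the index set
    s = v[active].sum()              # sum of active components
    w = (v[active] ** 2).sum()       # sum of squared active components
    return m * (m - t ** 2) * lam ** 2 - 2.0 * s * (m - t ** 2) * lam + s ** 2 - t ** 2 * w

# Illustrative data (not from the paper).
v, t = [3.0, 2.0, 1.0, 0.5], 1.5
pairs = [(phi_direct(v, t, lam), phi_quadratic(v, t, lam))
         for lam in (-1.0, 0.0, 0.7, 1.0, 2.5, 3.0)]
```

The two evaluations agree at breakpoints as well, since a component equal to $\lambda$ contributes zero to both norms. This is consistent with the function being continuous across pieces.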
For , , , , define
For brevity, let which must exist by the fact that and .
The properties of , which depend on , are analyzed below.
For , is concave on and strictly increasing on .
If and , is convex and strictly decreasing on . If and , is convex on and strictly decreasing on and
where the equality holds only if .
For and , for any .
For , the smaller root for is
There is no root for if .
(i) It follows from (3.6) that the first and second derivatives of are
Note that for any . Therefore, the signs of both and are determined by the sign of . For , on and on since by the definition of .
(iii) It holds naturally that
Plugging this into yields that . Moreover, it can be easily verified that for ,
If , then , meaning for . In addition,
for any . Therefore, for , it holds that It then follows that for any , completing the proof of (iii).
(iv) The discriminant of is Now we discuss the sign of . By the Cauchy-Schwarz inequality
where the inequality holds strictly for since there are at least two distinct values in the summation. Therefore, if , then and the smaller root of is given by (3.8). In particular, if , then and has a unique root . Moreover, if , then since and , implying has no root. This completes the proof of (iv). ∎
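The discriminant argument in part (iv) can be illustrated numerically. Formula (3.8) is not reproduced in the text above, so this sketch derives a closed form from an assumed quadratic piece $m(m-t^2)\lambda^2-2s(m-t^2)\lambda+s^2-t^2w$, where $m$ is the number of active components, $s$ their sum and $w$ the sum of their squares (illustrative names). Its discriminant is $4t^2(m-t^2)(mw-s^2)$, and $mw-s^2\ge 0$ by the Cauchy-Schwarz inequality, so real roots require $m>t^2$ unless all active components are equal.

```python
import math

def smaller_root(m, s, w, t):
    """Smaller root of m(m-t^2) lam^2 - 2 s (m-t^2) lam + s^2 - t^2 w,
    derived under the assumed form of the quadratic piece; requires m > t^2."""
    if m <= t ** 2:
        raise ValueError("the closed form below requires m > t^2")
    gap = m * w - s ** 2             # nonnegative by Cauchy-Schwarz
    return (s - t * math.sqrt(max(gap, 0.0) / (m - t ** 2))) / m

def quadratic_piece(m, s, w, t, lam):
    """Evaluate the assumed quadratic piece at lam."""
    return m * (m - t ** 2) * lam ** 2 - 2.0 * s * (m - t ** 2) * lam + s ** 2 - t ** 2 * w

# Illustrative data: active components (3, 2, 1) and t = 1.5.
m, s, w, t = 3, 6.0, 14.0, 1.5
root = smaller_root(m, s, w, t)
residual = quadratic_piece(m, s, w, t, root)
```

In this instance the quadratic reduces to $2.25(\lambda^2-4\lambda+2)$, whose smaller root is $2-\sqrt{2}$.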
The following statements hold true.
is continuous on .
Suppose . Then is decreasing, piecewise convex and quadratic on .
Suppose . Then is decreasing, piecewise convex and quadratic on and on .
Suppose . Then is increasing, piecewise concave and quadratic on . Furthermore, if , then is decreasing, piecewise convex and quadratic on ; if , then is decreasing, piecewise convex and quadratic on , and on
For any ,
and is convex on . Furthermore, for for and .
Part (i) is trivial.
Part (iii) follows naturally from Part (ii) and Lemma 3.2(ii).
Using Proposition 3.3, we can summarize the behavior of as follows.
For , the following statements hold true:
If , then for any .
If , then for any and for any .
If , possesses a unique root on and this root lies in . Furthermore, possesses a unique root on if and only if .
(i) If , then . By Proposition 3.3(ii), is strictly decreasing on . Therefore, part (i) is true.
(ii) If , then and since . By Proposition 3.3(iii), is decreasing on . Hence part (ii) is true.
(iii) If , then and is strictly increasing on by Proposition 3.3 (iv). Now we consider two cases. If , is decreasing on by Proposition 3.3 (iv); this, together with the fact that is continuous and , implies that part (iii) is true. If , is strictly decreasing on and takes a negative constant value by Proposition 3.3 (iv) because and . This implies that attains 0 only once on ; more precisely, the root lies in . Overall, part (iii) is true. ∎
4 Characterizing the solution
In this section, we use to characterize the solutions of (3.1), (3.2) and (3.3). Notice that (3.1) is convex, while (3.2) and (3.3) are nonconvex. We develop a unified framework using partial Lagrangian duality, which takes the form
Here, for each problem, the dual variable is associated with the ball/sphere constraint and is associated with the ball/sphere constraint. The dual function is given by
The properties of are analyzed in the following lemma.
For given , the following hold.
If or and , then .
(i) Suppose . We have Clearly, if , the optimal solution of () is ; if , the solution must satisfy (4.1). The rest of (i) is trivial.
(ii) Suppose . Let be the multipliers for . The optimal must satisfy
Now, suppose is stationary for . It holds that , implying and
Conversely, if with , letting
we can see and . Hence is stationary for . This completes the proof of part (ii).
(iii) It can be verified trivially.