1 Introduction
Recovery problems in many fields are commonly solved under the paradigm of maximum likelihood estimation. Despite the rich theory it enjoys, in many instances the parameter space is exponentially large and non-convex, often rendering the computation of the maximum likelihood estimator (MLE) intractable. It is then common to settle for heuristics, such as expectation-maximization. Unfortunately, popular iterative heuristics often get trapped in local minima, and one usually does not know whether the global optimum was achieved.
A common alternative to these heuristics is the use of convex relaxations: one attempts to optimize (usually, to minimize the negative log-likelihood) over a larger convex set that contains the parameter space of interest, as that allows one to leverage the power of convex optimization (e.g. [Candes and Tao(2010)]). The downside is that the solution obtained might not be in the original feasible set, forcing one to take an extra, potentially suboptimal, rounding step. The upside is that if it does lie in the original set, then no rounding step is needed and one is guaranteed to have found the optimal solution to the original problem. Fortunately, this often seems to be the case in several problems.
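We focus on a particular type of convex relaxation, namely semidefinite programming (SDP) based relaxations, and illustrate both these techniques and the open problem we wish to pose via the generalized orthogonal Procrustes problem ([Nemirovski(2007)]). There is an unknown underlying point cloud of $k$ points in $\mathbb{R}^d$, $A \in \mathbb{R}^{d \times k}$ (where the columns of $A$ represent the coordinates of the points), and $n$ unknown orthogonal transformations in $O(d)$, $O_1, \dots, O_n$, satisfying $O_i^T O_i = I_{d \times d}$. We are given $n$ noisy measurements of the form $A_i = O_i^T A + \sigma Z_i$ and, for simplicity, assume the noise to be white Gaussian, i.e., $\sigma \geq 0$ and $Z_i$ is a $d \times k$ matrix with i.i.d. $\mathcal{N}(0,1)$ entries. The MLE for the unknowns $O_1, \dots, O_n$ and $A$ is the minimizer of $\sum_{i=1}^n \|A_i - O_i^T A\|_F^2$ constrained to $O_i \in O(d)$. With the mild assumption that $\|A\|_F$ is fixed, the MLE is equivalent to an optimization problem over the orthogonal transformations only, which we will refer to as the quasi-MLE (if $\|A\|_F$ is fixed, the MLE for $A$ is the maximizer of $\langle A, \sum_{i=1}^n O_i A_i \rangle$, whose optimum is a multiple of $\sum_{i=1}^n O_i A_i$):
$$\min_{O_1, \dots, O_n \in O(d)} \sum_{i,j=1}^n \|O_i A_i - O_j A_j\|_F^2 \tag{1}$$
Unfortunately, the non-convexity and the exponential size of the search space render problem (1) intractable in general. Note that minimizing $\|O_i A_i - O_j A_j\|_F^2$ has the same solution as maximizing $\mathrm{tr}\left(O_i^T O_j A_j A_i^T\right)$, which is in turn, by taking $G \in \mathbb{R}^{nd \times nd}$ and $C \in \mathbb{R}^{nd \times nd}$ with $d \times d$ blocks $G_{ij} = O_i^T O_j$ and $C_{ij} = A_i A_j^T$, equivalent to
$$\max_{G \in \mathbb{R}^{nd \times nd}} \mathrm{tr}(CG) \quad \text{s.t.} \quad G_{ii} = I_{d \times d}, \; G \succeq 0, \; \mathrm{rank}(G) = d \tag{2}$$
The semidefinite relaxation in [Bandeira et al.(2013)Bandeira, Kennedy, and Singer] can then be obtained by dropping the non-convex rank constraint, giving the following SDP:
$$\max_{G \in \mathbb{R}^{nd \times nd}} \mathrm{tr}(CG) \quad \text{s.t.} \quad G_{ii} = I_{d \times d}, \; G \succeq 0 \tag{3}$$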
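The passage from (1) to the trace objective can be checked numerically. The following is a minimal sketch, assuming orthogonal $d \times d$ matrices $O_i$, $d \times k$ measurements $A_i$, and block matrices with $G_{ij} = O_i^T O_j$ and $C_{ij} = A_i A_j^T$; the function names and the sizes are illustrative choices, not taken from the cited works. It verifies that $\sum_{i,j} \|O_i A_i - O_j A_j\|_F^2 = 2n \sum_i \|A_i\|_F^2 - 2\,\mathrm{tr}(CG)$, so that minimizing (1) amounts to maximizing $\mathrm{tr}(CG)$.

```python
import numpy as np

def random_orthogonal(d, rng):
    # The Q factor of a Gaussian matrix is an orthogonal d x d matrix.
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q

def pairwise_cost(Os, As):
    # Objective of (1): sum over ordered pairs (i, j) of ||O_i A_i - O_j A_j||_F^2.
    aligned = [O @ A for O, A in zip(Os, As)]
    return sum(np.linalg.norm(X - Y, "fro") ** 2 for X in aligned for Y in aligned)

def block_matrices(Os, As):
    # G has d x d blocks O_i^T O_j; C has d x d blocks A_i A_j^T.
    V = np.vstack([O.T for O in Os])   # shape (n d, d)
    W = np.vstack(As)                  # shape (n d, k)
    return V @ V.T, W @ W.T

rng = np.random.default_rng(0)
n, d, k = 5, 3, 7                      # arbitrary small sizes (assumption)
Os = [random_orthogonal(d, rng) for _ in range(n)]
As = [rng.standard_normal((d, k)) for _ in range(n)]
G, C = block_matrices(Os, As)

lhs = pairwise_cost(Os, As)
rhs = 2 * n * sum(np.linalg.norm(A, "fro") ** 2 for A in As) - 2 * np.trace(C @ G)
print(np.isclose(lhs, rhs))
```

The matrix `G` built this way is positive semidefinite, has identity diagonal blocks, and has rank $d$, so it satisfies all the constraints of the rank-constrained formulation.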
When $d = 1$ (so that the orthogonal transformations are simply signs $\pm 1$), [Abbe et al.(2014)Abbe, Bandeira, Bracher, and Singer] showed that, below a certain level of outlier-based noise, the solution of (3) achieves exact recovery (with high probability) and thus is a feasible point of (2). This is shown by constructing a dual certificate for the optimal point. However, for $d > 1$, the nature of (1), together with the fact that there are infinitely many orthogonal transformations, renders exact recovery under noise impossible, even by solving (2). Remarkably, we observe that, even then, the relaxation (3) often manages to recover the quasi-MLE, the solution to (2). This type of behavior has also been observed in the multireference alignment problem by [Bandeira et al.(2014)Bandeira, Charikar, Singer, and Zhu], in the global registration problem by [Chaudhury et al.(2013)Chaudhury, Khoo, and Singer], and in camera motion estimation by [Ozyesil et al.(2013)Ozyesil, Singer, and Basri]. Yet, to the best of our knowledge, there is no theoretical understanding of this rank recovery phenomenon. We note that there has been work on understanding the rank of solutions of random SDPs by [Amelunxen and Bürgisser(2014)], but the results hold only under specific distributions and do not apply to these problems. The difficulty of analyzing rank recovery lies in the fact that, unlike in exact recovery, we cannot identify the exact form of the MLE, rendering dual certificate arguments very difficult to carry out.
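Rank recovery can be tested on any candidate solution of the relaxation. The sketch below is a hedged illustration, assuming a symmetric positive semidefinite $nd \times nd$ matrix with identity $d \times d$ diagonal blocks; the helper names `numerical_rank` and `factor_blocks` and the tolerance are hypothetical choices. When the rank equals $d$, the matrix factors as $V V^T$ with orthogonal $d \times d$ blocks $V_i$, and the relative transformations are read off directly as $V_i V_j^T$ without any rounding step. The identity matrix is included as a feasible point of the relaxation that has full rank and is therefore not feasible for the rank-constrained problem.

```python
import numpy as np

def numerical_rank(G, tol=1e-8):
    # Count eigenvalues of the symmetric matrix G above a relative threshold.
    eigvals = np.linalg.eigvalsh(G)
    return int(np.sum(eigvals > tol * max(eigvals.max(), 1.0)))

def factor_blocks(G, d):
    # For a feasible G of rank d, write G = V V^T with V of shape (n d, d).
    # Each d x d block V_i is then orthogonal and V_i V_j^T equals the block G_ij.
    eigvals, eigvecs = np.linalg.eigh(G)          # eigenvalues in ascending order
    V = eigvecs[:, -d:] * np.sqrt(eigvals[-d:])   # keep the top-d eigenpairs
    n = G.shape[0] // d
    return [V[i * d:(i + 1) * d, :] for i in range(n)]

rng = np.random.default_rng(1)
n, d = 4, 3                                       # arbitrary small sizes (assumption)
Os = [np.linalg.qr(rng.standard_normal((d, d)))[0] for _ in range(n)]
V_true = np.vstack([O.T for O in Os])

G_low = V_true @ V_true.T     # rank d: feasible for the rank-constrained problem
G_full = np.eye(n * d)        # rank n d: feasible for the relaxation only

blocks = factor_blocks(G_low, d)
print(numerical_rank(G_low), numerical_rank(G_full))
print(np.allclose(blocks[0] @ blocks[1].T, Os[0].T @ Os[1]))
```

The factor $V$ is only determined up to a global orthogonal transformation, which is why the comparison is made on the products $V_i V_j^T$ rather than on the blocks themselves.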
2 An open problem
Although we observe rank recovery in a variety of problems, we formulate conjectures for two particularly simple ones: the generalized Procrustes problem (Conjecture 2) described above, and the multireference alignment problem treated in [Bandeira et al.(2014)Bandeira, Charikar, Singer, and Zhu]. Numerical evidence supporting these conjectures is given in Figure 1. We first pose the conjecture for generalized Procrustes.
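Conjecture 2. Let $A \in \mathbb{R}^{d \times k}$ represent $k$ random points in $\mathbb{R}^d$ (with i.i.d. uniform random coordinates in a fixed bounded interval), and let $O_1, \dots, O_n$ be any sequence of orthogonal transformations. Let $A_i = O_i^T A + \sigma Z_i$, where $Z_i$ is a $d \times k$ matrix with i.i.d. standard Gaussian entries. There exists $\sigma_\ast > 0$ such that, for $\sigma < \sigma_\ast$, with high probability, the solution of (3) has rank $d$, hence matching the quasi-MLE (the solution of (2)).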
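To experiment with the setting of the conjecture, one can sample synthetic instances along the following lines. This is only a sketch under stated assumptions: the coordinates are drawn uniformly from $[0, 1)$ (the interval is an assumption made here), the orthogonal transformations are drawn at random although the conjecture allows any sequence, and the names `sample_instance`, `cost_matrix`, and `ground_truth_gram` are hypothetical. The resulting cost matrix is what would be passed to an SDP solver for the relaxation; the solver call itself is not shown.

```python
import numpy as np

def sample_instance(n, d, k, sigma, rng):
    # Point cloud with i.i.d. uniform coordinates; the interval [0, 1) is an assumption.
    A = rng.uniform(0.0, 1.0, size=(d, k))
    # Random orthogonal matrices (one possible choice; any sequence is allowed).
    Os = [np.linalg.qr(rng.standard_normal((d, d)))[0] for _ in range(n)]
    # Noisy measurements A_i = O_i^T A + sigma * Z_i with standard Gaussian Z_i.
    As = [O.T @ A + sigma * rng.standard_normal((d, k)) for O in Os]
    return A, Os, As

def cost_matrix(As):
    # (n d) x (n d) matrix with d x d blocks A_i A_j^T.
    W = np.vstack(As)
    return W @ W.T

def ground_truth_gram(Os):
    # (n d) x (n d) matrix with d x d blocks O_i^T O_j.
    V = np.vstack([O.T for O in Os])
    return V @ V.T

rng = np.random.default_rng(2)
n, d, k = 6, 3, 10                     # arbitrary small sizes (assumption)

# Noiseless sanity check: every block pair contributes tr(A A^T) = ||A||_F^2.
A, Os, As = sample_instance(n, d, k, sigma=0.0, rng=rng)
value = np.trace(cost_matrix(As) @ ground_truth_gram(Os))
print(np.isclose(value, n ** 2 * np.linalg.norm(A, "fro") ** 2))

# A noisy instance, as in the conjecture, for a small noise level.
A_noisy, Os_noisy, As_noisy = sample_instance(n, d, k, sigma=0.1, rng=rng)
C_noisy = cost_matrix(As_noisy)
```

In the noiseless case the ground-truth matrix attains $n^2 \|A\|_F^2$, which is a quick way to confirm that the measurements and the block matrices are assembled with consistent transposes.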
The multireference alignment problem consists in estimating an $L$-dimensional signal by observing shifted noisy copies of it. For the sake of brevity, we will not describe the problem or the SDP-based relaxation here (we refer the reader to [Bandeira et al.(2014)Bandeira, Charikar, Singer, and Zhu]), but take the opportunity to conjecture that a similar phenomenon happens: below a certain noise level, the solution to the SDP-based relaxation in [Bandeira et al.(2014)Bandeira, Charikar, Singer, and Zhu] has rank $L$, thus matching the quasi-MLE. Without going into details, we note that the SDP for this problem is considerably different from (3); in particular, it has positivity constraints. Also, this problem is discrete, and so exact recovery is possible. However, exact recovery can be shown to be possible only for asymptotically vanishing levels of noise, whereas here we conjecture that rank recovery happens at a constant level of noise.
References
- [Abbe et al.(2014)Abbe, Bandeira, Bracher, and Singer] E. Abbe, A. S. Bandeira, A. Bracher, and A. Singer. Linear inverse problems on Erdős-Rényi graphs: Information-theoretic limits and efficient recovery. IEEE International Symposium on Information Theory (ISIT 2014), to appear, 2014.
- [Amelunxen and Bürgisser(2014)] D. Amelunxen and P. Bürgisser. Intrinsic volumes of symmetric cones and applications in convex programming. Mathematical Programming, pages 1–26, 2014.
- [Bandeira et al.(2013)Bandeira, Kennedy, and Singer] A. S. Bandeira, C. Kennedy, and A. Singer. Approximating the little Grothendieck problem over the orthogonal and unitary groups. Available online at arXiv:1308.5207 [cs.DS], 2013.
- [Bandeira et al.(2014)Bandeira, Charikar, Singer, and Zhu] A. S. Bandeira, M. Charikar, A. Singer, and A. Zhu. Multireference alignment using semidefinite programming. 5th Innovations in Theoretical Computer Science (ITCS 2014), 2014.
- [Candes and Tao(2010)] E. J. Candes and T. Tao. The power of convex relaxation: Near-optimal matrix completion. IEEE Transactions on Information Theory, 56(5):2053–2080, May 2010.
- [Chaudhury et al.(2013)Chaudhury, Khoo, and Singer] K. N. Chaudhury, Y. Khoo, and A. Singer. Global registration of multiple point clouds using semidefinite programming. arXiv:1306.5226 [cs.CV], 2013.
- [Nemirovski(2007)] A. Nemirovski. Sums of random symmetric matrices and quadratic optimization under orthogonality constraints. Mathematical Programming, 109(2-3):283–317, 2007.
- [Ozyesil et al.(2013)Ozyesil, Singer, and Basri] O. Ozyesil, A. Singer, and R. Basri. Camera motion estimation by convex programming. Available online at http://arxiv.org/abs/1312.5047 [cs.CV], 2013.