1 Introduction
The classic phase retrieval problem concerns the reconstruction of a function from the magnitude of its Fourier transform. Let $f \in L^2(\mathbb{R})$. It is well known that $f$ can be uniquely reconstructed from $\hat{f}$, where $\hat{f}$ denotes the Fourier transform of $f$. In many applications such as X-ray crystallography, however, we can only measure the magnitude $|\hat{f}|$ of the Fourier transform while the phase information is lost. This raises the question of whether reconstruction of $f$ from $|\hat{f}|$ (namely, recovery of the lost phase information) is possible, up to some obvious ambiguities such as translation and reflection.
Recent focus has been largely on the finite dimensional generalization of the phase retrieval problem. In this setting, one aims to recover a real or complex vector (signal) $x$ from the magnitudes of some linear measurements of $x$. Our paper studies phase retrieval in this setting. On the finite dimensional space $\mathbb{H}^d$, where $\mathbb{H} = \mathbb{R}$ or $\mathbb{C}$, a set of elements $F = \{f_1, \dots, f_N\}$ in $\mathbb{H}^d$ is called a frame if it spans $\mathbb{H}^d$. Given this frame, any vector $x \in \mathbb{H}^d$ can be reconstructed from the inner products $\{\langle x, f_j \rangle\}_{j=1}^N$. Often it is convenient to identify the frame with the corresponding frame matrix $F = [f_1, \dots, f_N]$. The phase retrieval problem in $\mathbb{H}^d$ is:

The Phase Retrieval Problem. Let $F = \{f_1, \dots, f_N\}$ be a frame in $\mathbb{H}^d$. Can we reconstruct any $x \in \mathbb{H}^d$, up to a unimodular scalar, from $\{|\langle x, f_j \rangle|\}_{j=1}^N$, and if so, how?
$F$ is said to have the phase retrieval (PR) property if the answer is affirmative. The above phase retrieval problem has important applications in imaging, optics, communication, audio signal processing and more \cite{chai2010array, harrison1993phase, heinosaari2013quantum, millane1990phase, walther1963question}. One of the many challenges is the “how” part of the problem, namely finding robust and efficient algorithms for phase retrieval. This turns out to be much more difficult than it looks.
The phase retrieval problem is an example of a more general problem: the recovery of a vector from quadratic measurements. Here one would like to recover a vector $x \in \mathbb{H}^d$ from a finite number of quadratic measurements $x^* A_j x$, where each $A_j$ is a Hermitian matrix in $\mathbb{H}^{d \times d}$. This is the so-called generalized phase retrieval problem, which was first studied in \cite{wang2017generalized} from a theoretical angle, and earlier in special cases, such as that of orthogonal projection matrices, by others \cite{edidin2017projections, heinosaari2013quantum, cahill2013phase}.
To computationally recover the signal in phase retrieval, the greatest challenge comes from the nonconvexity of the objective function when the problem is phrased as an optimization problem. Let $F = \{f_j\}_{j=1}^N$ in $\mathbb{H}^d$ be the measurement frame for the phase retrieval problem. Assume that $y_j = |\langle f_j, x \rangle|^2$, $j = 1, \dots, N$. A typical setup is to solve the optimization problem
$$\min_{z \in \mathbb{H}^d}\; f(z) := \frac{1}{2N} \sum_{j=1}^{N} \bigl( |\langle f_j, z \rangle|^2 - y_j \bigr)^2. \qquad\qquad (1)$$
Clearly the objective function here is nonconvex. The same holds for other objective functions used for phase retrieval. As a result, for a general frame, finding the global minimizer of the optimization problem (1) is extremely challenging, if not intractable.
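As an illustration, the least-squares objective above can be sketched numerically as follows. This is a minimal NumPy sketch assuming the standard form $f(z) = \frac{1}{2N}\sum_j (|\langle f_j, z\rangle|^2 - y_j)^2$ in the real case; the dimensions and the Gaussian frame are illustrative choices, not part of the paper's model.

```python
import numpy as np

def pr_objective(z, F, y):
    """Least-squares phase retrieval objective f(z) = (1/(2N)) * sum((|<f_j, z>|^2 - y_j)^2).

    F: (N, d) array whose rows are the measurement vectors f_j.
    y: (N,) array of intensity measurements y_j = |<f_j, x>|^2.
    """
    residuals = np.abs(F @ z) ** 2 - y
    return residuals @ residuals / (2 * len(y))

rng = np.random.default_rng(0)
d, N = 8, 64
x = rng.standard_normal(d)          # true signal (illustrative)
F = rng.standard_normal((N, d))     # measurement vectors as rows
y = np.abs(F @ x) ** 2

# Both x and -x attain objective value zero: the unimodular (here: sign)
# ambiguity already rules out convexity of the problem.
assert np.isclose(pr_objective(x, F, y), 0.0)
assert np.isclose(pr_objective(-x, F, y), 0.0)
```

The sign ambiguity visible in the two assertions is exactly the "up to a unimodular scalar" caveat in the problem statement.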
Nevertheless, one class of phase retrieval problems for which very efficient reconstruction algorithms have been extensively studied is the one where the measurements are i.i.d. Gaussian random measurements. Several approaches based on convex relaxation techniques have been developed, such as PhaseLift \cite{candes2014solving}, PhaseCut and MaxCut \cite{candes2015phase1, waldspurger2015phase}, PhaseMax \cite{goldstein2018phasemax}, and the work by Bahmani and Romberg \cite{bahmani2016phase}. Such convex methods can be computationally prohibitive for high-dimensional problems, which has led to the development of various nonconvex optimization approaches. AltMinPhase \cite{netrapalli2013phase} and the Kaczmarz method \cite{wei2015solving} first estimate the missing phase information and then solve the phase retrieval problem through the least squares method and the Kaczmarz method, respectively. It is shown that AltMinPhase converges linearly to the true solution up to a unimodular scalar. The Wirtinger Flow (WF) algorithm introduced in
\cite{candes2015phase} is guaranteed to converge linearly to the global minimizer for Gaussian measurements when the number of measurements is on the order of $d \log d$. Various other techniques, such as truncated methods \cite{chen2015solving, wang2017solving}, have been developed to improve its efficiency and robustness with Gaussian measurements. Other techniques, such as the Gauss-Newton method \cite{gao2017phaseless}, a rank-1 alternating minimization algorithm \cite{cai2017fast} and a composite optimization algorithm \cite{duchi2017solving}, have all provided theoretical convergence analyses for Gaussian random measurements. Some of the aforementioned methods, such as the WF algorithm, also work for Fourier measurements with a very specially designed random mask, namely the Coded Diffraction model \cite{candes2015phase}. However, those are virtually the only models for which provably fast phase retrieval algorithms have been developed. In the big picture, the lack of phase retrieval models that go beyond Gaussian measurements is extremely conspicuous.

The main objective of this paper is to fill the above void by analyzing phase retrieval models for sub-gaussian measurements and developing efficient algorithms for such models. More specifically, we consider phase retrieval problems where sub-gaussian random measurements are used instead of the traditional Gaussian measurements. It turns out that this change makes the analysis significantly more challenging due to the lack of rotational symmetry. We overcome the challenge through a more refined analysis and a slightly weakened result.
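Concretely, "sub-gaussian measurements" covers many ensembles beyond the Gaussian one, e.g. any bounded distribution. The sketch below (illustrative dimensions; not the paper's specific conditions) shows three unit-covariance sub-gaussian ensembles one might plug into the model:

```python
import numpy as np

rng = np.random.default_rng(2)
d, N = 16, 1000

# Three measurement ensembles with i.i.d. sub-gaussian entries.
# Gaussian is the classical case; any bounded distribution (Rademacher,
# uniform) is also sub-gaussian but lacks rotational symmetry.
ensembles = {
    "gaussian":   rng.standard_normal((N, d)),
    "rademacher": rng.choice([-1.0, 1.0], size=(N, d)),
    "uniform":    rng.uniform(-np.sqrt(3), np.sqrt(3), size=(N, d)),  # variance 1
}

# All three are normalized so that the empirical covariance is close to
# the identity, matching a unit-covariance assumption on the measurements.
for name, F in ensembles.items():
    cov = F.T @ F / N
    assert np.linalg.norm(cov - np.eye(d), 2) < 0.5, name
```

The lack of rotational invariance of the Rademacher and uniform ensembles is precisely what breaks the classical Gaussian analysis and motivates the refined arguments in this paper.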
Key to any nonconvex method for phase retrieval is the initialization step, which produces an approximation of the true solution. This approximate solution then serves as the initial guess for the iterative steps that converge to the true solution. Take Wirtinger Flow as an example: it uses the so-called spectral initialization to obtain an initial guess and then refines the result by gradient descent iterations. When this initial guess is close enough to the true solution, the gradient descent is guaranteed to converge to the true solution. Spectral initialization and other initialization methods work well for the Gaussian model (and for the admissible Coded Diffraction model), but they fail for general sub-gaussian random measurement models; this is precisely why the random variables in the Coded Diffraction model are required to be admissible. In this paper we develop a more general spectral initialization whose requirements are less stringent than before, and which can therefore be applied to most sub-gaussian random measurement models to solve the corresponding phase retrieval problems efficiently.

Our generalized spectral initialization aims to provide an initial approximation for the phase retrieval problem with sub-gaussian random measurements. Consider the phase retrieval problem of recovering $x$ from the quadratic measurements $y_j = |\langle f_j, x \rangle|^2$, where the $f_j$ are i.i.d. sub-gaussian random vectors. We will require the $f_j$ to be sampled from a given distribution satisfying certain properties. More precisely, our model requires the following conditions for Generalized Spectral Initialization:
Conditions for Generalized Spectral Initialization:

(I) The measurement vectors $f_1, \dots, f_N$ are i.i.d. sub-gaussian random vectors in $\mathbb{H}^d$ with $\mathbb{E}\, f_j f_j^* = I_d$. Furthermore, with probability one $f_j$ is not purely imaginary. Here, for a matrix $A$, $\mathrm{diag}(A)$ denotes the diagonal matrix corresponding to the diagonal part of $A$.

(II) There exist constants independent of $N$ and $d$ such that the moment bound

(2)

holds for all unit vectors in $\mathbb{H}^d$.
We shall prove that under this model a good approximation to the true solution of the phase retrieval problem can be obtained provided that the measurement vectors satisfy conditions (I) and (II) and the number of measurements is sufficiently large. We also develop an efficient algorithm for solving the phase retrieval problem under this model.
The rest of the paper is organized as follows. In Section 2, we present the generalized spectral initialization and prove that, with high probability, it achieves a good initial guess provided the number of measurements is sufficiently large. In Section 3, we prove that when the measurements satisfy conditions (I) and (II), the gradient descent iteration converges linearly to the global minimizer. Finally, we provide the details of the proofs as well as some auxiliary results in Section 5 and the Appendix, respectively.
2 Generalized Spectral Initialization
Let $f_1, \dots, f_N$ be i.i.d. sub-gaussian random vectors satisfying conditions (I) and (II) for generalized spectral initialization, and set $F = [f_1, \dots, f_N]$. Now for any $x \in \mathbb{H}^d$ we denote $y_j = |\langle f_j, x \rangle|^2$. The goal of phase retrieval is of course to recover $x$, up to a unimodular constant, from the measurements $\{y_j\}_{j=1}^N$. The generalized spectral initialization introduced here aims to provide a good first approximation to $x$, and we describe how it works. Define
$$Y := \frac{1}{N} \sum_{j=1}^{N} y_j f_j f_j^*. \qquad\qquad (1)$$
Note that
Generalized Spectral Initialization: Let $f_1, \dots, f_N$ be i.i.d. sub-gaussian random vectors in $\mathbb{H}^d$ satisfying conditions (I) and (II). Set $y_j = |\langle f_j, x \rangle|^2$ for $j = 1, \dots, N$. Set

(2)

where $\mathrm{diag}(Y)$ denotes the diagonal matrix consisting of only the diagonal part of the matrix $Y$.
Definition 2.1
Let $z_0$ be the eigenvector corresponding to the largest eigenvalue of the matrix defined in (2).
We shall show that the vector $z_0$ provides a good initial approximation to the true solution when we have enough measurements, much like the classical spectral initialization for Gaussian measurements.
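The generalized construction above modifies the spectral matrix via its diagonal part; since the exact correction is given in (2), the sketch below shows only the classical spectral initialization it generalizes (real Gaussian case, illustrative sizes): form the matrix $Y = \frac{1}{N}\sum_j y_j f_j f_j^T$, take its top eigenvector, and rescale by an estimate of $\|x\|$.

```python
import numpy as np

def spectral_init(F, y):
    """Classical spectral initialization (real-case sketch).

    F: (N, d) matrix whose rows are the measurement vectors f_j.
    y: (N,) intensities y_j = |<f_j, x>|^2.
    Returns z0 approximating the true signal up to sign.
    """
    N, d = F.shape
    Y = (F.T * y) @ F / N                  # Y = (1/N) sum_j y_j f_j f_j^T
    eigvals, eigvecs = np.linalg.eigh(Y)   # eigenvalues in ascending order
    v = eigvecs[:, -1]                     # top eigenvector
    scale = np.sqrt(np.mean(y))            # E[y_j] = ||x||^2 under unit covariance
    return scale * v

rng = np.random.default_rng(1)
d, N = 16, 4000
x = rng.standard_normal(d)
F = rng.standard_normal((N, d))
y = np.abs(F @ x) ** 2

z0 = spectral_init(F, y)
# measure the error up to the sign ambiguity
dist = min(np.linalg.norm(z0 - x), np.linalg.norm(z0 + x))
```

With enough measurements the top eigenvector aligns with $x/\|x\|$, so `dist` is a small fraction of $\|x\|$; the paper's generalized version is designed to retain this behavior for non-Gaussian sub-gaussian ensembles.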
Lemma 2.1
Let satisfy conditions (I) and (II). Then we have , and .
Proof. Since all are identically distributed we will examine conditions (I) and (II) for . Write . Taking in (2) yields
Since for some we have by the assumption that we have in this case . It follows that . Thus .
Now taking with and looking at the off diagonal elements in (2) we have
It is easy to see that this yields . Since we must have . But is not purely imaginary, so there must exist such that . It follows that .
Theorem 2.2
Let $f_1, \dots, f_N$ be i.i.d. sub-gaussian random vectors in $\mathbb{H}^d$ satisfying conditions (I) and (II), and set $y_j = |\langle f_j, x \rangle|^2$. For the phase retrieval problem, given the measurements $\{y_j\}_{j=1}^N$, let $z_0$ be the corresponding generalized spectral initialization. Then for any $\delta > 0$, there exist constants $c, \gamma > 0$ depending on $\delta$ such that with probability at least $1 - c e^{-\gamma N}$ we have
$$\min_{\theta \in \mathbb{H},\, |\theta| = 1} \| z_0 - \theta x \| \leq \delta \| x \| \qquad\qquad (3)$$
provided that the number of measurements $N$ is sufficiently large relative to the dimension $d$.
Proof. We shall leave the proof of this theorem to Section 5.
The above theorem is a key ingredient for solving the phase retrieval problem with sub-gaussian measurements.
3 Phase Retrieval with Sub-Gaussian Random Measurements
Throughout this section we shall assume that we have random measurements satisfying conditions (I) and (II). The generalized spectral initialization combined with the Wirtinger Flow (WF) method can solve the phase retrieval problem with sub-gaussian measurements.
As before, and throughout the rest of the paper, we denote $y_j = |\langle f_j, x \rangle|^2$. Given $x \in \mathbb{H}^d$ (where $\mathbb{H} = \mathbb{R}$ or $\mathbb{C}$) we have measurements $y_j$, $j = 1, \dots, N$. To recover $x$ we solve the following minimization problem:
$$\min_{z \in \mathbb{H}^d}\; f(z) := \frac{1}{2N} \sum_{j=1}^{N} \bigl( |\langle f_j, z \rangle|^2 - y_j \bigr)^2. \qquad\qquad (1)$$
The target function $f(z)$ is a fourth-order polynomial in $z$ and is not convex.
Definition 3.1
We also define the $\epsilon$-neighborhood of the true solution $x$ by
To solve the optimization problem (1), where the measurements satisfy conditions (I) and (II), we start from an initial guess $z_0$ and iterate via
$$z_{t+1} = z_t - \mu \nabla f(z_t), \qquad t = 0, 1, 2, \dots \qquad\qquad (2)$$
with $\mu > 0$ being the stepsize, where as before $f$ is the target function. We shall show that, with a proper generalized spectral initialization as the initial guess, these iterations converge linearly to the global minimizer.
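The iteration above can be sketched as plain gradient descent on the objective. This is a minimal real-case NumPy sketch under stated assumptions: the objective $f(z) = \frac{1}{2N}\sum_j ((f_j^T z)^2 - y_j)^2$, an illustrative constant stepsize, and an initial guess placed near the solution (standing in for the spectral initialization).

```python
import numpy as np

def grad(z, F, y):
    """Gradient of f(z) = (1/(2N)) * sum(((f_j . z)^2 - y_j)^2) for real signals."""
    N = len(y)
    Fz = F @ z
    return (2.0 / N) * F.T @ ((Fz ** 2 - y) * Fz)

def gradient_descent(F, y, z0, mu=0.05, n_iter=500):
    """Fixed-stepsize iteration z_{t+1} = z_t - mu * grad f(z_t)."""
    z = z0.copy()
    for _ in range(n_iter):
        z = z - mu * grad(z, F, y)
    return z

rng = np.random.default_rng(3)
d, N = 8, 200
x = rng.standard_normal(d)
x /= np.linalg.norm(x)                      # normalize, matching the ||x|| = 1 convention
F = rng.standard_normal((N, d))
y = (F @ x) ** 2

z0 = x + 0.05 * rng.standard_normal(d)      # stand-in for a spectral initial guess
z = gradient_descent(F, y, z0)
dist = min(np.linalg.norm(z - x), np.linalg.norm(z + x))
```

Started close enough to $\pm x$, the iterates contract toward a global minimizer, which is the local linear convergence behavior established below; the stepsize and iteration count here are illustrative, not the theoretically prescribed values.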
The linear convergence will follow from the two key lemmas below. By the scaling property of the target function $f$, without loss of generality we may assume the true solution $x$ to the optimization problem (1) satisfies $\|x\| = 1$. Throughout the paper, we adopt the notation $a^+ = \max(a, 0)$ and $a^- = \max(-a, 0)$, which represent the positive and negative parts of any $a \in \mathbb{R}$, respectively. Throughout the paper we use $C$, $c$, or subscripted forms of them to denote constants whose value may change from instance to instance but depends only on the sub-gaussian norm of the distribution of the measurement vectors.
Lemma 3.1 (Local Curvature Condition)
Let $x$ be the solution of the optimization problem (1) with $\|x\| = 1$. Assume that the measurement vectors satisfy conditions (I) and (II). For any sufficiently small $\epsilon > 0$ there exist constants depending on $\epsilon$ such that, for $N$ sufficiently large, with probability greater than we have
(3) |
for all and , where
(4) |
with and .
Proof. Since are i.i.d. sub-gaussian random vectors, we may without loss of generality assume . By the definition of sub-gaussian random vectors, with probability greater than for some constant we have .
Let with . By definition we have and
To establish (3), it suffices to prove that
holds for all satisfying , . Equivalently, we only need to prove that for all satisfying , and for all with ,
By Lemma A.3 for , with probability greater than , we have
for any with . Therefore to establish the local curvature condition (3) it suffices to show that
(5) |
To prove this inequality, we first prove it for a fixed , and then use a covering argument. To simplify the statement, we use the shorthand
For a fixed , according to the expectations given in the proof of Lemma A.4 and we have
Now define . First, since , . Second, we bound using Hölder's inequality with :
Here is a constant depending only on and the second inequality is by the definition of sub-gaussian norm
(6) |
Applying Lemma A.1 with and ,
Therefore, with probability at least , we have
Here the second inequality comes from Lemma A.4.
The inequality above holds for a fixed and a fixed value . To prove (5) for all and all with , define
and
Recall that and , we have . Moreover, for any unit vectors ,
So we have
Thus when ,
(7) |
Let $\mathcal{N}_\epsilon$ be an $\epsilon$-net for the unit sphere of $\mathbb{H}^d$ with cardinality obeying $|\mathcal{N}_\epsilon| \leq (1 + 2/\epsilon)^{2d}$. Applying (5) together with the union bound, we conclude that for all and a fixed ,
(8) | ||||
The last line follows by choosing as before such that , where is a sufficiently large constant. Now for any on the unit sphere of , there exists a vector such that . By combining (7) and (8), holds with probability at least for all and for a fixed . Applying a similar covering number argument over we can further conclude that for all and ,
holds with probability at least as long as . Thus when , (3) holds with probability greater than .
Lemma 3.2 (Local Smoothness Condition)
Proof. Set . For any with , let . Recall that , and we calculate
Without loss of generality, we assume that . As before the inequality holds with probability at least . Combining this fact with the Cauchy-Schwarz inequality and Lemma 5.3, we obtain