# Phase retrieval for sub-Gaussian measurements

Generally, phase retrieval problem can be viewed as the reconstruction of a function/signal from only the magnitude of the linear measurements. These measurements can be, for example, the Fourier transform of the density function. Computationally the phase retrieval problem is very challenging. Many algorithms for phase retrieval are based on i.i.d. Gaussian random measurements. However, Gaussian random measurements remain one of the very few classes of measurements. In this paper, we develop an efficient phase retrieval algorithm for sub-gaussian random frames. We provide a general condition for measurements and develop a modified spectral initialization. In the algorithm, we first obtain a good approximation of the solution through the initialization, and from there we useWirtinger Flow to solve for the solution. We prove that the algorithm converges to the global minimizer linearly.

## Authors

• 2 publications
• 6 publications
• 188 publications
01/10/2021

### Solving phase retrieval with random initial guess is nearly as good as by spectral initialization

The problem of recovering a signal 𝐱∈ℝ^n from a set of magnitude measure...
06/04/2021

### Phase Retrieval for L^2([-π,π]) via the Provably Accurate and Noise Robust Numerical Inversion of Spectrogram Measurements

In this paper, we focus on the approximation of smooth functions f: [-π,...
05/31/2022

### Optimizing Intermediate Representations of Generative Models for Phase Retrieval

Phase retrieval is the problem of reconstructing images from magnitude-o...
01/12/2019

### Stability estimates for phase retrieval from discrete Gabor measurements

Phase retrieval refers to the problem of recovering some signal (which i...
04/09/2020

### Well conditioned ptychograpic imaging via lost subspace completion

Ptychography, a special case of the phase retrieval problem, is a popula...
09/06/2021

### Stable Gabor phase retrieval in Gaussian shift-invariant spaces via biorthogonality

We study the phase reconstruction of signals f belonging to complex Gaus...
04/23/2019

### Phase Retrieval for Binary Signals: Box Relaxation and Uniqueness

Recovering a signal from its Fourier magnitude is referred to as phase r...
##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

The classic phase retrieval problem concerns the reconstruction of a function from the magnitude of its Fourier transform. Let . It is well known that can be uniquely reconstructed from , where denotes the Fourier transform of . In many applications such as X-ray crystallography, however, we can only measure the magnitude of the Fourier transform while the phase information is lost. This raises the question whether reconstruction of (namely recovery of the lost phase information) is possible, up to some obvious ambiguities such as translation and reflection.

Recent focus has been largely on the finite dimensional generalization of the phase retrieval problem. In this setting, one aims to recover a real or complex vector (signal)

from the magnitude of some linear measurements of . Our paper studies phase retrieval in this setting. On the finite dimensional space where or , a set of elements in is called a frame if it spans . Given this frame, any vector can be reconstructed from the inner products . Often it is convenient to identify the frame with the corresponding frame matrix . The phase retrieval problem in is:

The Phase Retrieval Problem.  Let be a frame in . Can we reconstruct any up to a unimodular scalar from , and if so, how?

is said to have the phase retrieval (PR) property if the answer is affirmative. The above phase retrieval problem has important applications in imaging, optics, communication, audio signal processing and more chai2010array ; harrison1993phase ; heinosaari2013quantum ; millane1990phase ; walther1963question . One of the many challenges is the “how” part of the problem, namely to find robust and efficient algorithms for phase retrieval. This turns out to be much more difficult than it looks.

The phase retrieval problem is an example of a more general problem: the recovery of a vector from quadratic measurements. For this problem we would like to recover a vector from a finite number of quadratic measurements where each is a Hermitian matrix in . This is the so-called generalized phase retrieval problem, which was first studied in wang2017generalized from a theoretical angle, but earlier in special cases such as that for orthogonal projection matrices by others edidin2017projections ; heinosaari2013quantum ; cahill2013phase .

To computationally recover the signal in phase retrieval, the greatest challenge comes from the nonconvexity of the objective function when it is phrased as an optimization problem. Let in be the measurement frame for the phase retrieval problem. Assume that . A typical set up is to solve the optimization problem

 ^x=argminx∈Fd1NN∑j=1(|⟨x,fj⟩|2−yj)2. (1)

Clearly here the objective function is nonconvex. The same holds for other objective functions used for phase retrieval. As a result, for a general frame, finding the global minimizer of the optimization problem (1) is extremely challenging if not intractable.

Nevertheless one class of phase retrieval problems for which very efficient reconstructive algorithms have been extensively studied is when the measurements are i.i.d. Gaussian random measurements. Several approaches based on convex relaxation techniques, such as PhaseLift candes2014solving , PhaseCut and MaxCut have been developed, see candes2015phase1 ; waldspurger2015phase , PhaseMax goldstein2018phasemax and the work by Bahmani and Romberg bahmani2016phase . Such convex methods can be computationally challenging for large dimensional problems or high computational complexity, which had led to the development of various non-convex optimization approaches. The methods by AltMinPhase netrapalli2013phase and Karczmarz wei2015solving

first estimate the missing phase information and solve the phase retrieval problem through the least square method and Karczmarz method, respectively. It is shown that AltMinPhase converges linearly to the true solution up to a unimodular scalar. The Wirtinger Flow (WF) algorithm introduced in

candes2015phase is guaranteed to converge linearly to the global minimizer for Gaussian measurements when the number of measurements is in the order of . Various other techniques, such as truncated methods chen2015solving ; wang2017solving , have been developed to improve its efficiency and robustness with Gaussian measurements. Other techniques, such as Gauss-Newton’s method gao2017phaseless , rank-1 alternating minimization algorithm cai2017fast and composite optimization algorithm duchi2017solving have all provided theoretical convergence analysis for Gaussian random measurements. Some of the aforementioned methods such as the WF algorithm also work for Fourier measurements with a very specially designed random mask, namely the Coded Diffraction model candes2015phase . However, those are virtually the only models for which provable fast phase retrieval algorithms have been developed. In a big picture, the lack of phase retrieval models that go beyond Gaussian measurements is extremely conspicuous.

The main objective of this paper is to fill the above void by analyzing phase retrieval models for sub-gaussian measurements and developing efficient algorithm for such models. More specifically we consider phase retrieval problems where sub-gaussian random measurements are used instead of the traditional Gaussian measurements. It turns out that this change causes significant more challenge in the analysis due to the lack of rotational symmetry. We overcome the challenge through more refined analysis and a slightly weakened result.

Key to any non-convex methods for phase retrieval is the initialization step, from which an approximation of the true solution is obtained. This approximated solution can then be used to serve as the initial guess for iteration steps to converge to the true solution. Especially, we use Wirtinger Flow as an example, which uses the so-called spectral initialization to obtain an initial guess and then refine the result by gradient descent iterations. When this initial guess is close enough to the true solution, the gradient descent is guaranteed to converge to the true solution. Spectral initialization or other initialization methods work well for the Gaussian model (and for the admissible Coded Diffraction model), but it fails for general sub-gaussian random measurements models. That’s the reason why we require the random variables in the Coded Diffraction model to be

admissible. Here in this paper we develop a more general spectral initialization that is less stringent than before, and thus can be applied to most sub-gaussian random measurements models and efficiently solve corresponding phase retrieval problem computationally.

Our generalized spectral initialization aims to provide an initial approximation for phase retrieval problem with sub-gaussian random measurements. Consider the phase retrieval problem of recovering from quadratic measurements , where are i.i.d. sub-gaussian random vectors. We will require to be sampled randomly from a given distribution satisfying certain properties. More precisely, our model requires the following conditions for Generalized Spectral Initialization:

Conditions for Generalized Spectral Initialization:

• Let are i.i.d. sub-gaussian random vectors in and

. Furthermore with probability one

is not pure imaginary and , where denotes the diagonal matrix corresponding to the diagonal part of .

• There exist constants independent of , , such that , , and

 E((x∗Ajx)Aj)=τ2∥x∥2I+τ3xx∗+τ4diag([|x1|2,…,|xd|2]) (2)

for all .

We shall prove that under this model a good approximation to the true solution of the phase retrieval problem can be obtained provided that with satisfying conditions (I) and (II). We also develop an efficient algorithm for solving the phase retrieval problem under this model.

The rest of the paper is organized as follows: In Section 2, we give the generalized spectral initialization and prove that the method can with high probability achieve good initial results provided . In Section 3, we prove that when the measurements satisfy the conditions (I) and (II), then gradient descent iteration can linearly converge to the global minimizer. Finally, we provide the details of the proofs as well as some auxiliary results in Section 5 and the Appendix, respectively.

## 2 Generalized Spectral Initialization

Let be i.i.d. sub-gaussian random vectors satisfying conditions (I) and (II) for generalized spectral initialization and set . Now for any we denote . The goal of phase retrieval is of course to recover up to a unimodular constant from the measurements . The generalized spectral initialization introduced here aims to provide a good first approximation to , and we describe how it works. Define

 Y:=1NN∑j=1yjAj=1NN∑j=1(x∗Ajx)Aj. (1)

Note that

 E(x∗Ajx)=x∗E(Aj)x=τ1∥x∥2.

Generalized Spectral Initialization:  Let be i.i.d. sub-gaussian random vectors in satisfying conditions (I) and (II). Set for . Let . Denote . Set

 M=Y−τ4τ3+τ4D(Y−τ2ρ2I), (2)

where denotes the diagonal matrix consisting only the diagonal part of matrix .

###### Definition 2.1

Let

be the eigenvector corresponding to the largest eigenvalue of

in (2) normalized to . We shall call the generalized spectral initialization for the measurements .

We shall show that and the vector provides a good initial approximation to the true solution if we have enough measurements, much like the classical spectral initialization for Gaussian measurements.

###### Lemma 2.1

Let satisfy conditions (I) and (II). Then we have , and .

Proof.  Since all are identically distributed we will examine conditions (I) and (II) for . Write . Taking in (2) yields

 E(a2kk)=τ2+τ3+τ4,  E(akkamm)=τ2~{}if m≠k,  and E(akkamn)=0~{}if m≠n.

Since for some we have by the assumption that we have in this case . It follows that . Thus .

Now taking with and looking at the off diagonal elements in (2) we have

 E(akm(akk+amm+akm+amk))=E(amk(akk+amm+akm+amk))=τ3.

It is easy to see that this yields . Since we must have . But is not pure imaginary, so there must exist such that . It follows that .

###### Theorem 2.2

Let be i.i.d. sub-gaussian random vectors in satisfying conditions (I) and (II) and set . For the phase retrieval problem, given the measurements let be the corresponding generalized spectral initialization. Then for any , there exist constants depending on , such that with probability at least we have

 dist(z0,x)≤ε∥x∥ (3)

provided .

Proof.  We shall leave the proof of this theorem to Section 5.

The above theorem is a key ingredient for solving the sub-gaussian measurements phase retrieval problem.

## 3 Phase Retrieval with Sub-Gaussian Random Measurements

Throughout this section we shall assume that we have random measurments satisfying conditions (I) and (II). The generalized spectral initialization combined with the Wirtinger Flow (WF) method can solve the phase retrieval with sub-gaussian measurements.

As before and throughout the rest of the paper we denote . Given (where or ) we have measurements . To recover we solve the following minimization problem:

 ^z=argminz∈Fd12NN∑j=1(z∗Ajz−yj)2. (1)

The target function is a 4-th order polynomial and is not convex.

###### Definition 3.1

Let be the solution of (1) where . For any we define as

 θ(z):=argminθ∈[0,2π)∥z−xeiθ∥.

The distance between and is defined as

 dist(z,x)=∥z−xeiθ(z)∥.

We also define the -neighborhood of by

 \SS(x,ε):={z∈Cd:dist(z,x)≤ε∥x∥}.

To solve the optimization problem (1) where the measurements satisfying conditions (I) and (II), we start from an initial guess and iterate via

 (2)

with being the stepsize, where as before is the target function. We shall show that with proper generalized spectral initialization for the initial guess such iterations converge to the global minimizer linearly.

The linear convergence will follow from the two key lemmas below. From the scaling property of the target function , without loss of generality we may assume the true solution to the optimization problem (1) satisfies . Throughout the paper, we adopt the notation and , which represent the positive and negative parts of any respectively. Throughout the paper, we use , or subscript forms of them to denote constants, whose value may change from instance to instance but depend only on the sub-gaussian norm of the distribution of the measurements .

###### Lemma 3.1 (Local Curvature Condition)

Let be the solution of the optimization problem (1) with . Assume that the measurement vectors satisfy conditions (I) and (II). For any sufficiently small there exist constants where depend on , such that for , with probability greater than we have

 Re(⟨∇zEx(z),z−xeiθ(z)⟩)≥β−δ4⋅dist2(z,x)+110NN∑j=1∣∣(z−xeiθ(z))∗Aj(z−xeiθ(z))∣∣2 (3)

for all and , where

 ε0:=1027α⎛⎝√36|τ4|2+27αβ10−6|τ4|⎞⎠ (4)

with and .

Proof.  Since are i.i.d. sub-gaussian random vectors, we may without loss of generality assume . By the definition of sub-gaussian random vectors, with probability greater than for some constant we have .

Let with . By definition we have and

 Re(⟨∇zEx(z),z−xeiθ(z)⟩)=1NN∑j=1(2(Re(h∗Ajx))2+3Re(h∗Ajx)(h∗Ajh)+|h∗Ajh|2).

To establish (3), it suffices to prove that

 1NN∑j=1(2(Re(h∗Ajx))2+3Re(h∗Ajx)(h∗Ajh)+|h∗Ajh|2)−110NN∑j=1|h∗Ajh|2≥β−δ4∥h∥2

holds for all satisfying , . Equivalently, we only need to prove that for all satisfying , and for all with ,

 1NN∑j=1(2(Re(h∗Ajx))2+3sRe(h∗Ajx)(h∗Ajh)+9s210|h∗Ajh|2)≥β−δ4.

By Lemma A.3 for , with probability greater than , we have

for any with . Therefore to establish the local curvature condition (3) it suffices to show that

 1NN∑j=1(52(Re(h∗Ajx))2+3sRe(h∗Ajx)(h∗Ajh)+9s210|h∗Ajh|2)≥β4+E(Re2(h∗Ajx))2 (5)

To prove this inequality, we first prove it for a fixed , and then use a covering argument. To simplify the statement, we use the shorthand

 Uj: =52(Re(h∗Ajx))2+3sRe(h∗Ajx)(h∗Ajh)+9s210|h∗Ajh|2

For a fixed , according to the expectations given in the proof of Lemma A.4 and we have

 uj =EUj≤52(τ32+τ2+τ32+|τ4|)+3(τ2+τ3+|τ4|)+910(τ2+τ3+|τ4|) <7(τ2+τ3+|τ4|).

Now define . First, since , . Second, we bound using Holder’s inequality with :

 EX2j≤EU2j =254E(Re4(h∗Ajx))+81100s4E(|h∗Ajh|4)+272s2E(Re2(h∗Ajx)|h∗Ajh|2) +15sE(Re3(h∗Ajx)|h∗Ajh|)+27s35E(Re(h∗Ajx)|h∗Ajh|3) ≤254√E(|a∗jh|8)E(|a∗jx|8)+81100s4E(|a∗jh|8)+27s22√E(|a∗jh|12)E(|a∗jx|4) +15s√E(|a∗jh|10)E(|a∗jx|6)+27s35√E(|a∗jh|14)E(|a∗jx|2) ≤(254⋅84+81100s4⋅84+27s22⋅√126⋅16 +15s⋅√63⋅105+27s35⋅√147⋅2)⋅∥aj∥8ψ2 :=Cs.

Here is a constant depending only on and the second inequality is by the definition of sub-gaussian norm

 E(|a∗v|p)≤pp2∥a∥pψ2. (6)

Appling Lemma A.1 with and ,

 P(Nuj−N∑j=1Uj≥β8N)≤e−3γN.

Therefore, with probability at least , we have

 1NN∑j=1Uj ≥uj−β8 =12E(Re2(h∗Ajx))+2E(Re2(h∗Ajx))+3sE(Re(h∗Ajx)|h∗Ajh|)+910s2E(|h∗Ajh|2)−β8 ≥12E(Re2(h∗Ajx))+β2−β8 ≥12E(Re2(h∗Ajx))+3β8.

Here the second inequality comes from Lemma A.4.

The inequality above holds for a fixed and a fixed value . To prove (5) for all and all with , define

and

 p(h,s)=1NN∑j=1p2j(h,s).

Recall that and , we have . Moreover, for any unit vectors ,

 |pj(u,s)−pj(v,s)|≤√52∣∣Re((u−v)∗Ajx)∣∣+√910s(|a∗ju|+|a∗jv|)|a∗j(u−v)|<8dlogN∥u−v∥.

So we have

 |p(u,s)−p(v,s)| =∣∣1NN∑j=1(p2j(u,s)−p2j(v,s))∣∣ =∣∣1NN∑j=1(pj(u,s)+pj(v,s))(pj(u,s)−pj(v,s))∣∣ <96d2log2N∥u−v∥.

Thus when ,

 p(v,s)≥p(u,s)−β16. (7)

Let be an net for the unit sphere of with cardinality obeying . Applying (5) together with the union bound, we conclude that for all and a fixed ,

 P(p(u,s) ≥12E(Re2(u∗Ajx))+3β8)≥1−|Nη|exp(−3γN) (8) ≥1−(1+3072d2log2N/β)2dexp(−3γN) ≥1−exp(−2γN).

The last line follows by choosing as before such that , where is a sufficiently large constant. Now for any on the unit sphere of , there exists a vector such that . By combining (7) and (8), holds with probability at least for all and for a fixed . Applying a similar covering number argument over we can further conclude that for all and ,

 p(h,s)≥E(Re2(h∗Ax))2+β4

holds with probability at least as long as . Thus when , (3) holds with probability greater than .

###### Lemma 3.2 (Local Smoothness Condition)

Under the same assumptions as in Lemma 3.1 and with being given in (4), for and any , with probability at least we have

 ∥∇zEx(z)∥2≤R(β−δ8dist2(z,x)+110NN∑j=1∣∣(z−xeiθ(z))Aj(z−xeiθ(z))∣∣2). (9)

Here is defined as

 R≥max(96^α2(1+δ2)β−δ,270^α(1+δ)+60dτ1(1+δ)ε20)

with , and constants depending on .

Proof.  Set . For any with , let . Recall that , and we calculate

 |∇zEx(z)∗u|2 =∣∣1NN∑j=1(2Re(x∗Ajh)+h∗Ajh)(v∗Ajx+v∗Ajh)∣∣2 =∣∣1NN∑j=1(2Re(x∗aja∗jh)+h∗aja∗jh)(v∗aja∗jx+v∗aja∗jh)∣∣2 ≤(1NN∑j=13|x∗aj||v∗aj||h∗aj|2+2|h∗aj||v∗aj||x∗aj|2+|h∗aj|3|v∗aj|)2 :=27I1+12I2+3I3.

Without loss of generality, we assume that . As before the inequality holds with probability at least . Combined this fact with the Cauchy-Schwarz inequality and Lemma 5.3 we obtain

 I1 ≤ (1NN∑j=1|h∗aj|4)⋅(1NN∑j=1|v∗aj|2|x∗aj|2) ≤ 1NN∑j=1|h∗Ajh|2⋅v∗(1NN∑j=1(x∗Ajx)Aj)v ≤ ^α(1+δ)⋅1N