I Introduction
Seismic data interpolation is crucial for accurate inversion and imaging procedures such as full-waveform inversion [32], reverse-time migration [4, 25], and multiple removal methods (SRME, EPSI) [31, 20]. Dense acquisition is prohibitively expensive in these applications, motivating a reduction in the number of seismic measurements. On the other hand, using subsampled sources and receivers without interpolation gives unwanted imaging artifacts. A range of trace interpolation methodologies have been proposed that exploit the low-dimensional structure of seismic data, including sparsity ([14, 23]) and low rank ([1, 18, 24, 29, 34]), with theoretical guarantees for low-rank matrix recovery available in a range of contexts ([11, 26]). The main goal is to simultaneously sample and compress a signal using optimization to replace dense acquisition, thus enabling a range of applications in seismic data processing at a fraction of the cost.
Low-rank matrix completion techniques ([7, 6]) have been successfully applied to seismic trace interpolation. Here we focus on residual-constrained formulations, which minimize a regularizer subject to a constraint on the data misfit. This formulation is particularly appealing when practitioners have an estimate of the noise floor [30, 1]. In [1], the authors combine explicit low-rank factorization ([26, 28]) with level-set optimization techniques ([30, 3, 2]), and apply the resulting approach to seismic data interpolation in the midpoint-offset domain. A recent extension [22] treats seismic data sampled on unstructured grids, with recovery error bounds provided for the methodology.
Here, we develop a competitive optimization scheme based on alternating minimization for factorized formulations. In our context, the alternating approach yields convex residual-constrained subproblems, which we solve using the primal-dual splitting techniques of [8]. The resulting scheme is simple, and can be applied to extremely large-scale problems.
The paper proceeds as follows. In Section III, we formulate seismic data interpolation as a low-rank optimization problem, highlighting the transform domain that makes low-rank techniques applicable. In Section IV, we present the alternating optimization scheme for the residual-constrained formulation, together with an algorithm to solve the convex subproblems in each factor. The new approach is illustrated on 3D seismic data volumes in Section V, where we show that it compares favorably to the level-set approach of [1]. We end with a discussion and future directions in Section VI.
II Notation
III Low-rank matrix completion
The goal of low-rank matrix completion is to accurately estimate the unobserved entries of a data matrix $X \in \mathbb{C}^{n \times m}$ from the observed entries and prior knowledge that the matrix exhibits low-rank structure, i.e. has few nonzero singular values, or can be accurately approximated by such a low-rank matrix. Letting $\Omega$ be the set of observed entries, we define a sampling operator $\mathcal{A}$ by its elementwise action

$$\mathcal{A}(X)_{ij} = \begin{cases} X_{ij} & (i,j) \in \Omega \\ 0 & \text{otherwise.} \end{cases}$$

We can write our observations as $b = \mathcal{A}(X) + e$, where $e$ models data corruption on the subset of observed entries, with a given noise floor $\|e\|_F \leq \eta$. Rather than minimizing the rank of $X$, which is a combinatorial problem, we can consider the nuclear norm $\|X\|_* = \sum_i \sigma_i(X)$; in many contexts this allows exact recovery guarantees on $X$ [11, 26]. This convex relaxation gives us the interpolated data volume as the minimizer of
$$\min_{X} \; \|X\|_* \quad \text{subject to} \quad \|\mathcal{A}(X) - b\|_F \leq \eta, \qquad (1)$$
where $\|\cdot\|_F$ denotes the Frobenius norm. This procedure finds the $X$ with lowest nuclear norm that still fits the observations up to the noise level $\eta$. In real-world applications, $\eta$ may require estimation by cross-validation or other techniques.
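To make the objects in (1) concrete, the short sketch below (illustrative only; the sizes, seed, and names such as `mask` are our own choices) builds a rank-$r$ matrix, applies an entrywise sampling operator, and verifies that the nuclear norm is carried by exactly $r$ nonzero singular values:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 60, 50, 4

# Rank-r test matrix and an entrywise sampling operator A (names are ours).
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))
mask = rng.random((n, m)) < 0.3        # Omega: the set of observed entries
A = lambda Z: np.where(mask, Z, 0.0)   # A(Z): keeps entries in Omega, zeroes the rest
b = A(X)                               # noise-free observations

# Nuclear norm = sum of singular values; a rank-r matrix carries it
# on exactly r nonzero singular values (up to floating-point noise).
s = np.linalg.svd(X, compute_uv=False)
print(int(np.sum(s > 1e-8 * s[0])))    # numerical rank: 4
print(bool(np.all(b[~mask] == 0)))     # A zeroes unobserved entries: True
```

A nuclear-norm solver for (1) then searches for the matrix of smallest $\sum_i \sigma_i$ whose restriction to `mask` matches `b` up to the noise level.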
Matrix completion via nuclear norm minimization has been extensively studied ([26, 7, 6, 11]). Typical results show that if $\Omega$ is generated uniformly at random, then any "incoherent" rank-$r$ matrix can be reconstructed from as few as $\mathcal{O}(rn\,\mathrm{polylog}(n))$ observed entries. When $r \ll n$, this methodology provides a great reduction in the number of measurements needed compared to typical sampling procedures used in many applications. These insights have spurred work aimed at solving (1) efficiently.
Many implementations to solve (1), such as singular value projection and singular value thresholding ([15, 5]), require full or partial singular value decomposition (SVD) computations of the data matrix. Furthermore, treating the entire matrix $X$ as a decision variable requires storage and manipulation of an $n \times m$ object at every iteration. Storage and manipulation of large volumes becomes prohibitive for huge-scale problems, making (1) and equivalent formulations impractical for realistic seismic interpolation, where the interpolated spatial grid for 3D seismic data acquisition is extremely large. To rectify these issues, [1, 26, 28] propose a matrix factorization approach. Stipulating a maximal rank $k$ for $X$, the authors write the representation $X = LR^H$, with $L \in \mathbb{C}^{n \times k}$, $R \in \mathbb{C}^{m \times k}$, and $\cdot^H$ denoting the Hermitian transpose. As shown in [26, 28],

$$\|X\|_* = \min_{X = LR^H} \; \tfrac{1}{2}\left(\|L\|_F^2 + \|R\|_F^2\right),$$

motivating the formulation of [1]:
$$\min_{L,R} \; \tfrac{1}{2}\left\|\begin{bmatrix} L \\ R \end{bmatrix}\right\|_F^2 \quad \text{subject to} \quad \|\mathcal{A}(LR^H) - b\|_F \leq \eta. \qquad (2)$$
If $k \ll \min(n, m)$, we have reduced the memory requirements from $nm$ to $k(n+m)$, while projection onto the Frobenius norm ball is linear in the size of the variable, avoiding expensive SVD computations.
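To illustrate the savings, the sketch below (sizes are arbitrary illustrative choices) compares the storage counts and evaluates entries of $LR^H$ on a sampled set without ever forming the full product:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 2000, 1500, 10
L = rng.standard_normal((n, k))
R = rng.standard_normal((m, k))

# Factorized storage: k(n + m) numbers instead of n*m.
print(n * m // (k * (n + m)))            # compression factor

# Evaluate A(L R^H) on observed entries (i, j) without forming X:
# the (i, j) entry of L R^H is the inner product of row i of L
# with row j of R.
rows = rng.integers(0, n, size=5)
cols = rng.integers(0, m, size=5)
vals = np.einsum("ij,ij->i", L[rows], R[cols])
full = (L @ R.T)[rows, cols]             # dense check (small sample only)
print(np.allclose(vals, full))           # True
```

This is why matrix-free methods on the factors scale to volumes where the dense $n \times m$ matrix could never be stored.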
When $X$ is a full-rank matrix that can be accurately approximated by a rank-$k$ matrix, implementing (2) with this choice of $k$ provides reconstruction error bounds proportional to the error of the best rank-$k$ approximation of $X$ (see for example [6]). This holds true for many applications where, e.g., the data matrix is not low rank but exhibits quickly decaying singular values, so that the error of the best rank-$k$ approximation is small for appropriately chosen $k$. When the underlying rank is not known, minimizing a regularization functional subject to a data constraint as in (2) has an important practical consequence: as the nominal rank $k$ (the number of columns in $L$ and $R$) increases, we do not overfit the data [1].
Simply considering standard seismic data volumes does not immediately yield low-rank structure. In the next section, we show how seismic data can be transformed in order to find and exploit low-rank representations for interpolation.
III-A Low-rank structure of seismic data
For 3D seismic data acquisition, each monochromatic tensor is a 4-dimensional volume with source dimensions $x_s$, $y_s$ and receiver dimensions $x_r$, $y_r$, respectively. In order to exploit low-rank structure for 3D seismic data acquisition, [10, 19] proposed two choices of matricization, i.e. of unfolding the tensor into a matrix along specific dimensions: we either place the $x$ dimensions $(x_s, x_r)$ in the rows and the $y$ dimensions $(y_s, y_r)$ in the columns (Figure 1a), or the receiver dimensions $(x_r, y_r)$ in the rows and the source dimensions $(x_s, y_s)$ in the columns (Figure 1c). In either matricization, the operator $\mathcal{A}$ samples observed rows and/or columns, i.e. observed data corresponding to source and/or receiver combinations. Placing both receiver coordinates along the rows, i.e., the $(x_r, y_r)$ matricization, yields a matrix that has high rank (Figure 2a, red curve), and the action of the subsampling operator (in particular, removal of rows and/or columns) can only decrease the rank (Figure 2b, red curve). Therefore, interpolation of the data by rank penalization is doomed to failure in this matricization, since the fully sampled object has higher rank (and larger singular values) than the subsampled dataset.
We therefore look for a transform domain in which the fully sampled data matrix exhibits low-rank structure, while subsampling by $\mathcal{A}$ increases the rank and/or destroys the fast decay of the singular values. In this context, we can expect low-rank penalization to help. We use the rank-revealing transforms of [1, 9, 19]. To motivate these transforms, we first explore the singular value decay of full and subsampled volumes in the context of 3D seismic data acquisition. Note that monochromatic seismic data matrices are full rank even in the transformed domain, but their singular values decay quickly there, which means that these matrices can be well approximated by low-rank matrices. We therefore say that a seismic data matrix exhibits low-rank structure when its singular values decay quickly, and high-rank structure when they decay slowly. For 3D seismic data acquisition, the $(x_s, x_r)$ matricization yields fast decay of the singular values (Figure 2a, blue curve) for the original fully sampled data volume, while subsampling causes the singular values to decay at a slower rate (Figure 2b, blue curve). Therefore, we select this matricization for 3D seismic data acquisition.
Working in an appropriate rank-revealing transform domain, the subsampling operator is composed with a transformation operator that maps the seismic data from the source-receiver domain to the rank-revealing transform domain; its adjoint reverses the operation. We call this composite sampling-transformation operation $\mathcal{A}$, and minimize (2) with the original sampling operator replaced by this composite $\mathcal{A}$.
IV Methodology
While several formulations are available for the equality-constrained version of (2) (i.e. with $\eta = 0$), see [26], and for penalized formulations, e.g. [28], few focus on the case where $\eta > 0$ is provided by the user. The level-set factorized approach, which we call LR-BPDN following [1], solves (2) for a prescribed $\eta$ by defining the value function

$$v(\tau) = \min_{L,R} \left\{ \|\mathcal{A}(LR^H) - b\|_F \;:\; \tfrac{1}{2}\left\|\begin{bmatrix} L \\ R \end{bmatrix}\right\|_F^2 \leq \tau \right\}$$

and applying Newton's method to find the root of $v(\tau) = \eta$. In practice this means inexactly solving a sequence of optimization problems to evaluate $v(\tau)$, and using duality theory to compute $v'(\tau)$ in order to compute the next iterate [30, 1]. The overall approach is reliable but can take many iterations, especially for high-fidelity data fitting.
Here, we propose a block-coordinate descent scheme, outlined in Algorithm 1. Steps 3 and 4 focus on a single matrix factor at a time. Alternating approaches are common in matrix factorization ([21, 16, 12]). The main competing algorithms for the constrained matrix completion formulation (2) use a level-set approach along the lines of [1] (see e.g. [21]). We implement steps 3 and 4 using a matrix-free approach, described in detail below.
IV-A Solving Steps 3 and 4 with Primal-Dual Splitting
All of the work of Algorithm 1 occurs in steps 3 and 4. Each step is an inequality-constrained convex optimization problem. In typical primal-only iterative optimization algorithms, such as the projected-gradient method, we must project onto the constraint set at every iteration; in large-scale problems, such a projection itself requires an iterative algorithm. Instead, we solve steps 3 and 4 using the primal-dual approach developed in [8], each iteration of which consists solely of matrix-vector products and scalar multiplications, and which is completely free of projections, making it suitable for extremely large-scale problems. Moreover, it is very simple to implement: the full details of the approach are contained in Algorithm 2 for the $L$ update (step 3). The $R$ update (step 4) is obtained immediately by exchanging the roles of $L$ and $R$. The matrix norm used by Algorithm 2 for step-size selection is the largest singular value of the input matrix.
To understand Algorithm 2, we formulate a convex-concave saddle-point representation of step 3 of Algorithm 1:

$$\min_{L} \max_{y} \; \tfrac{1}{2}\|L\|_F^2 + \langle \mathcal{A}(LR^H) - b, \, y \rangle - \bar\eta \|y\|_2. \qquad (3)$$

Computing the maximum over $y$ in (3) immediately recovers the primal problem in $L$ for step 3 of Algorithm 1, since this maximum is equal to $0$ when $\|\mathcal{A}(LR^H) - b\|_2 \leq \bar\eta$, and is infinity otherwise. To align Algorithm 2 with the notation of [8], set $\mathcal{K}(\cdot) = \mathcal{A}(\cdot\, R^H)$ and

$$f(z) = \begin{cases} 0 & \|z - b\|_2 \leq \bar\eta \\ +\infty & \text{otherwise,} \end{cases}$$

i.e., $f$ is the convex indicator of the scaled norm ball centered at $b$. The convex conjugate can be immediately computed: $f^*(y) = \langle b, y \rangle + \bar\eta \|y\|_2$ (see e.g. [27]). Then steps 5 and 7 of Algorithm 2 correspond to the proximal updates

$$L^{j+1} = \mathrm{prox}_{\tau g}\!\left(L^{j} - \tau \mathcal{K}^H(y^{j})\right), \qquad y^{j+1} = \mathrm{prox}_{\sigma f^*}\!\left(y^{j} + \sigma \mathcal{K}(\bar L^{j+1})\right),$$

where $g(L) = \tfrac{1}{2}\|L\|_F^2$ and $\bar L^{j+1}$ is defined as in step 6 of Algorithm 2. The reader can check that any fixed point of this iteration solves the saddle-point formulation (3), and hence the problem in step 3 of Algorithm 1. See [8] for a detailed convergence analysis and rate results.
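As a concrete illustration of these proximal updates, the following sketch (our own notation; the problem sizes, step sizes, and iteration count are illustrative assumptions, and we use real arithmetic with entrywise sampling for $\mathcal{A}$) applies the two updates to a small randomly generated $L$ subproblem:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, k = 40, 30, 3

# Illustrative L subproblem with R held fixed (all names are our own):
#   min_L 0.5*||L||_F^2   subject to   ||A(L R^T) - B||_F <= etabar
R = rng.standard_normal((m, k))
mask = rng.random((n, m)) < 0.5
B = np.where(mask, rng.standard_normal((n, k)) @ R.T, 0.0)  # observed data
etabar = 0.1 * np.linalg.norm(B)

K = lambda L: np.where(mask, L @ R.T, 0.0)   # forward map A(. R^T)
Kt = lambda Y: np.where(mask, Y, 0.0) @ R    # its adjoint

# Step sizes: sigma * tau * ||K||^2 < 1, and ||K|| <= ||R||_2 because
# entrywise masking is non-expansive.
sigma = tau = 0.99 / np.linalg.norm(R, 2)

L = np.zeros((n, k))
Y = np.zeros((n, m))
for _ in range(8000):
    L_old = L
    L = (L - tau * Kt(Y)) / (1.0 + tau)      # prox of g(L) = 0.5*||L||_F^2
    L_bar = 2.0 * L - L_old                  # over-relaxation step
    W = Y + sigma * (K(L_bar) - B)
    # prox of f*(y) = <B, y> + etabar*||y||_2: a soft shrinkage of W
    Y = W * max(0.0, 1.0 - sigma * etabar / max(np.linalg.norm(W), 1e-30))

print(np.linalg.norm(K(L) - B) / etabar)     # approaches 1 as L converges
```

The only nontrivial operations per iteration are the products with `K` and `Kt`; both prox maps are closed-form scalar rescalings, which is exactly why the scheme is projection-free.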
IV-B Relaxation of the constraint
We found that running Algorithm 1 with $\bar\eta = \eta$, the final target value, is too aggressive, especially in the context of the alternating scheme in $L$ and $R$. In particular, in the early steps, when $L$ and $R$ are far from the optimum, it is to our advantage to solve approximate subproblems. As the algorithm progresses, we tighten $\bar\eta$, and eventually solve the problem of interest with $\bar\eta = \eta$.
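A minimal sketch of this geometric continuation (the initial value, decrease factor, and target below are illustrative assumptions, not values from the experiments):

```python
# Geometric continuation from a loose constraint level toward the target eta.
eta_target, eta0, gamma = 0.05, 0.50, 0.5   # illustrative values

etabar = eta0
schedule = []
while etabar > eta_target:
    schedule.append(etabar)
    etabar = max(gamma * etabar, eta_target)  # never overshoot the target
schedule.append(eta_target)                   # finish at etabar = eta

print(schedule)   # [0.5, 0.25, 0.125, 0.0625, 0.05]
```

Each level of the schedule is handed to Algorithm 1 as $\bar\eta$; the final subproblems are solved with $\bar\eta = \eta$.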
V Numerical Experiments
We perform seismic data interpolation using the proposed primal-dual alternating scheme on a 5D synthetic data set from a realistic, geologically complex model. Our goal is to interpolate data from 3D geological models with very fine-scale features and complex sedimentary environments (where interpolation problems are very challenging, even for densely sampled data), comparing level-set and primal-dual methods for residual-constrained formulations.
The 5D seismic synthetic data set is generated using the Compass velocity model provided to us by the BG Group, with receivers spaced by 25 m and sources spaced by 25 m as well, and a temporal sampling interval of 0.004 s. It consists of two receiver and two source coordinates, with time as the fifth dimension. We use ocean-bottom node style acquisition, where receivers are placed at the ocean bed and sources are placed at the sea surface. In all experiments, we initialize $L$ and $R$ using standard Gaussian random matrices. More sophisticated initialization schemes have been proposed ([17, 1]), but they are expensive for large-scale seismic data since they require computing the largest singular vectors.
We compare the practical performance (signal-to-noise ratio (SNR) of the reconstructions, in dB) and computational time (in seconds) of the new primal-dual formulation to that of LR-BPDN [1]. For simplicity, we perform interpolation for a missing-sources scenario, though the missing-receivers scenario is similar. During seismic data acquisition, the sampling operator only acts along the source and receiver coordinates and does not depend on the time axis. We therefore first perform the Fourier transform along the time axis, and then interpolate each monochromatic data matrix independently. This approach avoids repeated application of Fourier and inverse Fourier transforms during the optimization process. Once interpolation is finished for each monochromatic data slice, we perform a single inverse Fourier transform along the frequency axis to get the final reconstructed seismic data volume in the time domain.
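The workflow in this paragraph can be sketched as follows (array sizes are arbitrary, and an identity stand-in replaces the per-frequency low-rank solve, which is our own simplification for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
nt, ns, nr = 64, 16, 16
data = rng.standard_normal((nt, ns, nr))   # time x sources x receivers

# FFT along the time axis once, up front.
D = np.fft.rfft(data, axis=0)              # stack of monochromatic slices D[f]

# Interpolate each monochromatic slice independently (identity stand-in
# here; in practice this is the low-rank recovery of Section IV).
recov = np.empty_like(D)
for f in range(D.shape[0]):
    recov[f] = D[f]                        # <- per-frequency solve goes here

# A single inverse FFT back to the time domain at the end.
out = np.fft.irfft(recov, n=nt, axis=0)
print(np.allclose(out, data))              # True for the identity stand-in
```

The point of this structure is that the expensive transforms are applied exactly once each, while all optimization happens slice by slice in the frequency domain.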
We use the jittered sampling scheme of [13] to remove 80% of the sources, resulting in a minimum spatial sampling of 25 m and a maximum spatial sampling of 225 m between two consecutive sources. The spatial sampling of the final interpolated grid is 25 m. Since seismic data is band-limited because of the band-limited nature of the source wavelet, we perform the interpolation within the frequency bandwidth of the seismic data, which ranges from 3 to 70 Hz. In practice, successful recovery means that an actionable seismic volume (preserving all the coherent energy, including late arrivals) can be obtained by computational techniques from merely 20% of the nominal acquisition.
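A minimal sketch of jittered undersampling in the spirit of [13] (the grid size and seed are our own; one source is kept per bin, which is what bounds the maximum gap, unlike i.i.d. random sampling):

```python
import numpy as np

rng = np.random.default_rng(4)
n_src, keep_frac = 100, 0.2                 # keep 20% of 100 source positions
bin_size = int(round(1.0 / keep_frac))      # one kept source per bin of 5

# Jittered undersampling: pick one uniformly random position inside each
# bin; the gap between kept sources is then at most 2*bin_size - 1 cells.
kept = np.array([b * bin_size + rng.integers(bin_size)
                 for b in range(n_src // bin_size)])

gaps = np.diff(kept)
print(len(kept), int(gaps.max()) <= 2 * bin_size - 1)
```

With a 25 m source grid and `bin_size = 5`, the minimum possible gap is one cell (25 m) and the maximum is nine cells (225 m), consistent with the figures quoted above.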
To choose the rank $k$ of the factors $L$ and $R$ for interpolation, we consider the 5 Hz and 70 Hz frequency slices, subsample the columns, and perform reconstruction. The best values (according to SNR) for the 5 and 70 Hz monochromatic slices were 30 and 100, respectively. Hence, we work with all of the monochromatic slices and adjust the rank linearly from 30 to 100 when moving from lower to higher frequencies. To select the target value $\eta$ of the data-fitting constraint, we use a value proportional to the norm of the observed data $b$. As described in Section IV-B, we start Algorithm 1 with a relatively loose value of $\bar\eta$, and decrease it geometrically until we arrive at the target value of $\eta$.
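The linear rank ramp described above can be written down directly (the endpoint ranks and frequencies come from the text; the rounding rule is our own choice):

```python
# Linear rank ramp across the band: rank 30 at 5 Hz up to rank 100 at 70 Hz.
f_lo, f_hi, r_lo, r_hi = 5.0, 70.0, 30, 100

def rank_for(f):
    """Rank used for the monochromatic slice at frequency f (Hz)."""
    return int(round(r_lo + (f - f_lo) * (r_hi - r_lo) / (f_hi - f_lo)))

print(rank_for(5.0), rank_for(37.5), rank_for(70.0))   # 30 65 100
```

Each monochromatic subproblem then allocates its factors as $L \in \mathbb{C}^{n \times k}$, $R \in \mathbb{C}^{m \times k}$ with $k = $ `rank_for(f)`.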
Figures 3 and 4 show fully sampled and subsampled (80% missing sources) data from the 5D seismic volume. Figures 3(c) and 4(c) show a zoomed section of a common-receiver gather from the true and subsampled data. The reconstructed data volume and corresponding residual plots are shown in Figures 5(a, b) and 5(c, d), using the proposed alternating primal-dual splitting method and the LR-BPDN-based level-set approach. We boost the amplitudes of the residual plots (Figures 5(c, d)) by a factor of 100 to show that we are not losing any coherent energy. Both methods are able to reconstruct the coherent energy of the 5D seismic data volume while preserving the late-arrival energy, as can also be seen from the residual plots. We also plot zoomed sections (Figure 6) of the reconstructed data and corresponding residuals to strengthen these observations.
We compare the signal-to-noise ratio (SNR, in dB) and computational times (in seconds) across the frequency spectrum, as shown in Figures 7(a, b). The alternating primal-dual method works better in the higher-frequency regime, whereas the level-set method works better in the lower-frequency regime. Computationally, the primal-dual method is faster by a factor of two. In addition, the computational time of the primal-dual scheme is remarkably consistent across the frequency spectrum, while LR-BPDN requires more time as the frequency increases. We run both methods on a system with two 10-core Intel E5-2690 v2 Ivy Bridge CPUs at 3.00 GHz.
VI Discussion and conclusions
Residual-constrained formulations are well suited for practical interpolation schemes, since they provide a simple way for practitioners to fit to a target data-fitting level. In this work, we propose a new approach for residual-constrained formulations, using a block-coordinate alternating optimization scheme. As with other factorized schemes, the current approach represents the full data volume implicitly via the outer product $LR^H$, and achieves time and memory savings by working with the factors $L$ and $R$ instead of the full volume $X$.
Even though the overall problem is nonconvex, each subproblem in $L$ and $R$ is a convex residual-constrained problem, which we solve using the primal-dual splitting approach detailed in Algorithm 2. The approach requires only matrix-vector products with the sampling-transformation operator and its adjoint. It is simpler than the level-set approach, which requires variational computations with the value function. Experimental results on large-scale seismic interpolation show that the proposed approach can match the practical performance of level-set methods, obtaining comparable results in half the time, and working consistently across the range of frequencies of interest.
An important consequence of this work is that it opens future directions in regularized data interpolation. One can, for example, consider constraints on the factors themselves, or additional constraints on the product $LR^H$. When working with additional convex constraints, Algorithm 1 can be left completely unchanged, and Algorithm 2 can be adapted. The key point is that with $L$ or $R$ fixed, the subproblems remain convex. We leave these developments for future work.
Acknowledgements
We would like to thank the BG Group for permission to use the synthetic Compass velocity model. The authors wish to acknowledge the SENAI CIMATEC Supercomputing Center for Industrial Innovation, with support from BG Brazil and the Brazilian Authority for Oil, Gas and Biofuels (ANP), for the provision and operation of computational facilities and the commitment to invest in Research & Development. This work was financially supported in part by the Natural Sciences and Engineering Research Council of Canada Collaborative Research and Development Grant DNOISE II (CDRP J 37514208). This research was carried out as part of the SINBAD II project with the support of the member organizations of the SINBAD Consortium. Aleksandr Aravkin was partially supported by the Washington Research Foundation Data Science Professorship.
References
 [1] A. Aravkin, R. Kumar, H. Mansour, B. Recht, and F. J. Herrmann. Fast methods for denoising matrix completion formulations, with applications to robust seismic data interpolation. SIAM Journal on Scientific Computing, 36(5):S237–S266, 2014.
 [2] A. Y. Aravkin, J. V. Burke, D. Drusvyatskiy, M. P. Friedlander, and S. Roy. Level-set methods for convex optimization. arXiv preprint arXiv:1602.01506, 2016.
 [3] A. Y. Aravkin, J. V. Burke, and M. P. Friedlander. Variational properties of value functions. SIAM Journal on optimization, 23(3):1689–1717, 2013.
 [4] E. Baysal, D. D. Kosloff, and J. W. Sherwood. Reverse time migration. Geophysics, 48(11):1514–1524, 1983.
 [5] J.F. Cai, E. J. Candès, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4):1956–1982, 2010.
 [6] E. J. Candes and Y. Plan. Matrix completion with noise. Proceedings of the IEEE, 98(6):925–936, 2010.
 [7] E. J. Candès and B. Recht. Exact matrix completion via convex optimization. Foundations of Computational mathematics, 9(6):717–772, 2009.
 [8] A. Chambolle and T. Pock. A firstorder primaldual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision, 40(1):120–145, 2011.
 [9] C. Da Silva and F. Herrmann. Hierarchical Tucker tensor optimization - applications to 4D seismic data interpolation. In 75th EAGE Conference & Exhibition incorporating SPE EUROPEC 2013, 2013.
 [10] C. Da Silva and F. J. Herrmann. Optimization on the Hierarchical Tucker manifold - applications to tensor completion. Linear Algebra and its Applications, 481:131–173, 2015.
 [11] M. Fazel. Matrix rank minimization with applications. PhD thesis, Stanford University, 2002.
 [12] T. Hastie, R. Mazumder, J. Lee, and R. Zadeh. Matrix completion and low-rank SVD via fast alternating least squares. arXiv preprint arXiv:1410.2596, 2014.
 [13] G. Hennenfent and F. J. Herrmann. Simply denoise: wavefield reconstruction via jittered undersampling. Geophysics, 73(3):V19–V28, 05 2008.
 [14] F. J. Herrmann and G. Hennenfent. Nonparametric seismic data recovery with curvelet frames. Geophysical Journal International, 173(1):233–248, 2008.
 [15] P. Jain, R. Meka, and I. S. Dhillon. Guaranteed rank minimization via singular value projection. In Advances in Neural Information Processing Systems, pages 937–945, 2010.

 [16] P. Jain, P. Netrapalli, and S. Sanghavi. Low-rank matrix completion using alternating minimization. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 665–674. ACM, 2013.
 [17] P. Jain, P. Netrapalli, and S. Sanghavi. Low-rank matrix completion using alternating minimization. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 665–674. ACM, 2013.
 [18] N. Kreimer, A. Stanton, and M. D. Sacchi. Tensor completion based on nuclear norm minimization for 5d seismic data reconstruction. Geophysics, 78(6):V273–V284, 2013.
 [19] R. Kumar, C. Da Silva, O. Akalin, A. Y. Aravkin, H. Mansour, B. Recht, and F. J. Herrmann. Efficient matrix completion for seismic data reconstruction. Geophysics, 80(5):V97–V114, 2015.
 [20] T. T. Lin and F. J. Herrmann. Robust estimation of primaries by sparse inversion via onenorm minimization. Geophysics, 78(3):R133–R150, 05 2013.
 [21] O. Lopez, R. Kumar, and F. J. Herrmann. Rank minimization via alternating optimizationseismic data interpolation. In 77th EAGE Conference and Exhibition 2015, 2015.
 [22] O. Lopez, R. Kumar, Ö. Yılmaz, and F. J. Herrmann. Offthegrid lowrank matrix recovery and seismic data reconstruction. IEEE Journal of Selected Topics in Signal Processing, 10(4):658–671, 2016.
 [23] H. Mansour, F. J. Herrmann, and Ö. Yılmaz. Improved wavefield reconstruction from randomized sampling via weighted onenorm minimization. Geophysics, 78(5):V193–V206, 2013.
 [24] V. Oropeza and M. Sacchi. Simultaneous seismic data denoising and reconstruction via multichannel singular spectrum analysis. Geophysics, 76(3):V25–V32, 2011.
 [25] R.E. Plessix. A review of the adjointstate method for computing the gradient of a functional with geophysical applications. Geophysical Journal International, 167(2):495–503, 2006.
 [26] B. Recht, M. Fazel, and P. A. Parrilo. Guaranteed minimumrank solutions of linear matrix equations via nuclear norm minimization. SIAM review, 52(3):471–501, 2010.
 [27] R. T. Rockafellar. Convex Analysis. Princeton Landmarks in Mathematics. Princeton University Press, 1970.
 [28] N. Srebro, J. Rennie, and T. S. Jaakkola. Maximummargin matrix factorization. In Advances in neural information processing systems, pages 1329–1336, 2004.
 [29] S. Trickett, L. Burroughs, A. Milton, L. Walton, R. Dack, et al. Rankreductionbased trace interpolation. In 2010 SEG Annual Meeting. Society of Exploration Geophysicists, 2010.
 [30] E. Van Den Berg and M. P. Friedlander. Probing the pareto frontier for basis pursuit solutions. SIAM Journal on Scientific Computing, 31(2):890–912, 2008.
 [31] D. J. Verschuur, A. Berkhout, and C. Wapenaar. Adaptive surfacerelated multiple elimination. Geophysics, 57(9):1166–1177, 1992.
 [32] J. Virieux and S. Operto. An overview of fullwaveform inversion in exploration geophysics. Geophysics, 74(6):WCC1–WCC26, 2009.
 [33] Y. Xu and W. Yin. A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM Journal on imaging sciences, 6(3):1758–1789, 2013.
 [34] Y. Yang, J. Ma, and S. Osher. Seismic data reconstruction via matrix completion. UCLA CAM Report, pages 12–14, 2012.