In a multi-receiver radar system, a transmit source signal is reflected from sparsely located targets and measured at the receivers. The received signals are modeled as convolutions of the source signal and sparse filters that depend on the targets' locations relative to the receivers [3, 4]. Often, the source signal is unknown at the receiver due to distortion during transmission. With an unknown source, the problem of determining sparse filters is known as sparse multichannel blind-deconvolution (S-MBD). This model is ubiquitous in many other applications, such as seismic signal processing, room impulse response modeling, sonar imaging, and ultrasound imaging [27, 28]. In these applications, the receivers' hardware and computational complexity depend on the number of measurements required at each receiver, or channel, to determine the sparse filters uniquely. Hence, it is desirable to compress the number of measurements on each channel.
Prior works have proposed computationally efficient and robust algorithms to recover the filters from the measurements. Examples include $\ell_1$-norm methods [29, 17, 6], sparse dictionary calibration [16, 22], truncated power iteration methods, and convolutional dictionary learning. The works in [18, 29, 10] establish theoretical guarantees on identifying the S-MBD problem. However, these methods are computationally demanding, require a series of iterations for convergence, and need access to the full measurements.
Prior work proposed an autoencoder, called RandNet, for dictionary learning. In that work, compression is achieved by projecting data into a lower-dimensional space through a data-independent, unstructured random matrix. Mulleti et al. proposed a data-independent, structured compression operator that can be realized through a linear filter. Specifically, the compressed measurements are computed by applying a specific filter to the received signals, followed by truncation of the filtered measurements. Two natural questions are: (1) can we learn a compression filter from a given set of measurements, rather than applying a fixed filter? (2) will this data-driven approach result in better compression for a given estimation accuracy?
We propose a model-based neural network that learns a hardware-efficient, data-driven, and structured compression matrix to recover sparse filters from reduced measurements. Our approach takes inspiration from filter-based compression, learned compression operators, and the model-based compressed learning approach for S-MBD. The architecture, which we call learned (L) structured (S) compressive multichannel blind-deconvolution network (LS-MBD), learns a filter for compression, recovers the unknown source, and estimates the sparse filters. In contrast to [22, 21, 8], LS-MBD improves in the following ways. In the fixed-filter approach, a data-independent filter is used for compression, and the reconstruction is independent of the compression. In LS-MBD, we learn a filter that enables compression and lets us estimate the sparse filters accurately. The approach is computationally efficient and results in lower reconstruction error for a given compression ratio compared with the fixed-filter approach. Unlike the unstructured alternative, our compression operator is linear, is used recurrently in the architecture, has fewer parameters to learn, and admits a computationally efficient implementation.
2 Problem Formulation
Consider a set of $N$ signals given as
$$\mathbf{y}_n = \mathbf{s} * \mathbf{x}_n, \quad n = 1, \ldots, N, \tag{1}$$
where $*$ denotes the linear-convolution operation; equivalently, $\mathbf{y}_n = \mathbf{S}\mathbf{x}_n$, where the matrix $\mathbf{S}$ is the convolution matrix constructed from the source vector $\mathbf{s} \in \mathbb{R}^{K}$. We make the following structural assumptions on the signals: (A1) the source $\mathbf{s}$ is common to all channels, and (A2) the filters $\mathbf{x}_n \in \mathbb{R}^{L}$ are sparse for $n = 1, \ldots, N$. Due to convolution, each $\mathbf{y}_n$ has length $K + L - 1$. In an S-MBD problem, the objective is to determine the common source signal $\mathbf{s}$ and the sparse filters $\{\mathbf{x}_n\}_{n=1}^{N}$ from the measurements $\{\mathbf{y}_n\}_{n=1}^{N}$.
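To make the model concrete, the following sketch generates synthetic multichannel measurements according to it. The dimensions, sparsity level, and Gaussian source width below are hypothetical stand-ins, since the paper's exact experimental values were not recoverable from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: source length K, filter length L, N channels,
# and a fixed number of non-zeros per sparse filter.
K, L, N, sparsity = 31, 50, 10, 3

# Gaussian-shaped source, normalized to unit norm.
t = np.arange(K) - K // 2
s = np.exp(-(t ** 2) / (2 * 4.0 ** 2))
s /= np.linalg.norm(s)

# Sparse filters: supports drawn uniformly at random, uniform amplitudes.
X = np.zeros((N, L))
for n in range(N):
    support = rng.choice(L, size=sparsity, replace=False)
    X[n, support] = rng.uniform(0.5, 1.5, size=sparsity)

# Each channel measures the linear convolution of the source with its filter.
Y = np.stack([np.convolve(s, X[n]) for n in range(N)])
print(Y.shape)  # (10, 80): each channel has length K + L - 1
```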
In compressive S-MBD, the objective is to estimate the unknowns from a set of compressive measurements $\mathbf{z}_n = \Phi(\mathbf{y}_n) \in \mathbb{R}^{M}$, where $M < K + L - 1$. The compression operator $\Phi$ is a mapping from $\mathbb{R}^{K+L-1}$ to $\mathbb{R}^{M}$. This operator could be either linear or nonlinear, random or deterministic, structured or unstructured, and data-dependent or data-independent (see the structured compressed sensing literature for details).
We consider the problem of designing a linear, structured, and data-driven compression operator that enables accurate estimation of the sparse filters from the compressed measurements. Specifically, our goal is to jointly learn the source $\mathbf{s}$ and a structured, practically realizable, and data-driven operator $\Phi$ such that the sparse filters $\{\mathbf{x}_n\}_{n=1}^{N}$ can be determined from the compressive measurements $\{\mathbf{z}_n\}_{n=1}^{N}$.
3 Learning Structured Compressive S-MBD
3.1 Compression Operator
We impose a Toeplitz structure on the compression operator and denote the filter associated with the operator by $\mathbf{h}$. Such compression matrices have several advantages compared to random projections; they facilitate the use of computationally efficient matrix operations and can be practically implemented as a filtering operation in hardware. The compression operator involves a convolution followed by a truncation. As the operation is the same for all channels, we drop the channel-index superscript for simplicity. Consider a causal filter $\mathbf{h}$ of length $P$. The truncated convolution samples are
$$z[m] = (\mathbf{h} * \mathbf{y})[m + P - 1] = \sum_{k=0}^{P-1} h[k]\, y[m + P - 1 - k], \quad m = 0, \ldots, M - 1,$$
with $M = K + L - P$, which are the convolution samples where $\mathbf{h}$ and $\mathbf{y}$ have complete overlap. This choice of samples ensures that maximum information from $\mathbf{y}$ is retained in $\mathbf{z}$. The corresponding measurement matrix is the $M \times (K + L - 1)$ Toeplitz matrix
$$\mathbf{\Phi} = \begin{bmatrix} h[P-1] & \cdots & h[0] & & \\ & \ddots & & \ddots & \\ & & h[P-1] & \cdots & h[0] \end{bmatrix}.$$
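The filtering implementation and the Toeplitz-matrix view coincide, as the following sketch verifies; the measurement and filter lengths used here are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
T, P = 80, 21        # full-measurement length and filter length (illustrative)
M = T - P + 1        # number of fully overlapping convolution samples

y = rng.standard_normal(T)
h = rng.standard_normal(P)
h /= np.linalg.norm(h)  # unit-norm filter, as in the learned operator

# Filtering view: keep only the samples where h and y completely overlap.
z_filter = np.convolve(h, y, mode="valid")

# Matrix view: each row of the Toeplitz matrix is a shifted, flipped copy of h.
Phi = np.zeros((M, T))
for m in range(M):
    Phi[m, m:m + P] = h[::-1]
z_matrix = Phi @ y

print(np.allclose(z_filter, z_matrix))  # True
```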
3.2 Network Architecture
We aim to minimize the following objective:
$$\min_{\mathbf{s},\, \mathbf{h},\, \{\mathbf{x}_n\}} \ \sum_{n=1}^{N} \frac{1}{2}\big\|\mathbf{\Phi}\mathbf{y}_n - \mathbf{\Phi}\mathbf{S}\mathbf{x}_n\big\|_2^2 + \lambda\|\mathbf{x}_n\|_1 \quad \text{s.t.} \quad \|\mathbf{s}\|_2 = \|\mathbf{h}\|_2 = 1,$$
where $\lambda$ is a sparsity-enforcing parameter, and the norm constraints are to avoid scaling ambiguity. Following a similar approach to [15, 26, 20], we construct an autoencoder whose encoder maps $\mathbf{z}_n = \mathbf{\Phi}\mathbf{y}_n$ into a sparse filter by unfolding $T$ iterations of a variant of the accelerated proximal gradient algorithm FISTA for sparse recovery. Specifically, each unfolding layer performs the iteration
$$\mathbf{x}^{(t)} = \mathcal{S}_{b}\!\left(\mathbf{w}^{(t)} + \alpha\,\mathbf{H}^{\mathrm{T}}\!\big(\mathbf{z} - \mathbf{H}\mathbf{w}^{(t)}\big)\right), \qquad \mathbf{w}^{(t+1)} = \mathbf{x}^{(t)} + \frac{c_t - 1}{c_{t+1}}\big(\mathbf{x}^{(t)} - \mathbf{x}^{(t-1)}\big),$$
where $\mathbf{H} = \mathbf{\Phi}\mathbf{S}$, the momentum sequence $c_t$ is set as in FISTA, $\alpha$ is the unfolding step-size, and $\mathcal{S}_{b}$ is the sparsity-promoting soft-thresholding operator $\mathcal{S}_{b}(v) = \operatorname{sign}(v)\max(|v| - b, 0)$, where $b = \lambda\alpha$.
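A minimal NumPy sketch of the FISTA iterations that the encoder unfolds, run on a generic toy sparse-recovery problem rather than the paper's setup; the matrix sizes, regularization weight, and iteration count are all illustrative.

```python
import numpy as np

def soft_threshold(v, b):
    """Proximal operator of b*||.||_1: shrink entries toward zero by b."""
    return np.sign(v) * np.maximum(np.abs(v) - b, 0.0)

def fista(H, z, lam, n_iters=300):
    """Plain (non-unfolded) FISTA for min_x 0.5||z - Hx||^2 + lam*||x||_1.
    The step size alpha is set from the largest singular value of H."""
    alpha = 1.0 / np.linalg.norm(H, 2) ** 2
    x_prev = np.zeros(H.shape[1])
    w = x_prev.copy()
    c = 1.0
    for _ in range(n_iters):
        # Gradient step on the data-fidelity term, then soft-thresholding.
        x = soft_threshold(w + alpha * H.T @ (z - H @ w), lam * alpha)
        # FISTA momentum update.
        c_next = (1.0 + np.sqrt(1.0 + 4.0 * c ** 2)) / 2.0
        w = x + ((c - 1.0) / c_next) * (x - x_prev)
        x_prev, c = x, c_next
    return x_prev

# Toy noiseless sparse-recovery check with a random, well-conditioned H.
rng = np.random.default_rng(2)
H = rng.standard_normal((60, 50))
x_true = np.zeros(50)
x_true[[5, 17, 33]] = [1.0, -0.8, 1.2]
z = H @ x_true
x_hat = fista(H, z, lam=0.01)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(rel_err)  # small on this easy, noiseless problem
```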
One may leave the bias in each layer unconstrained to be learned. However, theoretical analysis of unfolded ISTA has proved that the bias converges to $0$ as the number of layers goes to infinity. Hence, we set $b^{(t)} = \lambda\gamma^{t}$ with $\gamma < 1$. In this regard, $\gamma$ is a scalar that can be either tuned or learned. We keep the unfolded layers tied, as this leads to better generalization in applications with limited data.
The decoder reconstructs the data using the learned source. We call this network, shown in Figure 1, LS-MBD. In this architecture, the source $\mathbf{s}$ and the compression filter $\mathbf{h}$ correspond to weights of the neural network and are learned by backpropagation in two stages. This method's novelty is in the hardware-efficient design of the compression operator, which captures data information when compressing. This architecture reduces to RandNet when the compression matrix has no structure (e.g., Toeplitz).
3.3 Training Procedure
We follow a two-stage approach to learn the compression filter. In the first stage, we start with a limited set of full measurements and estimate the source and filters. In the second, we learn the compression filter associated with $\mathbf{\Phi}$, given the estimated source. This two-stage training disentangles source estimation from learning of the compression. Hence, source estimation is performed only once and used for various compression ratios (CRs). Besides, a low CR does not affect the quality of source estimation, and the number of measurements required for filter recovery can be optimized independently.
Specifically, having access to the full measurements $\{\mathbf{y}_n\}_{n=1}^{N}$, we set the compression operator to be the identity matrix (i.e., $\mathbf{\Phi} = \mathbf{I}$), hence reducing the autoencoder architecture to a variant of CRsAE, and learn $\mathbf{s}$ by minimizing the reconstruction loss. Then, given the learned source matrix $\mathbf{S}$, we run the forward pass of the encoder to estimate the sparse filters, which we denote $\{\hat{\mathbf{x}}_n\}_{n=1}^{N}$. In the second stage, for a given compression ratio, we set the decoder to the learned source and train $\mathbf{h}$ within the encoder by minimizing the same loss on the compressed measurements.
In practice, the method is useful in the following scenario. Consider a multi-receiver radar setting where a set of receivers is already in place and operates with full measurements. Suppose a new set of receivers must be added, either to replace the existing ones or to gather more information. While designing this new set of receivers, one can use the full measurements available from the existing receivers to estimate the source and then design optimal compression filters for the new receivers. Thus, the new receivers sense compressed measurements, which results in reduced cost and complexity. In essence, the approach is similar to the deep-sparse array method, where a full array is used in one scan to learn sparse arrays for the rest of the scans.
4.1 Data Generation
We considered the noiseless case and generated measurements following the model in (1). The source follows a Gaussian shape and is then normalized to unit norm. The supports of the filters are generated uniformly at random, and their amplitudes are drawn from a uniform distribution.
In the aforementioned assumptions, we impose restrictions neither on the minimum separation of any two non-zero components of the sparse filter nor on the filter components' relative amplitudes. In the presence of noise, both of these factors play a crucial role in filter estimation, and the recovery error is expected to increase. Our method's recovery performance and stability in the presence of noise is a direction of future research.
4.2 Network and Training Parameters
We implemented the network in PyTorch. We unfolded the encoder for a fixed number of iterations. We set the regularization parameter $\lambda$ and decreased the threshold by a factor of $\gamma$ at every iteration (i.e., $b^{(t)} = \lambda\gamma^{t}$). To ensure stability, we chose the step-size parameter $\alpha$ to be less than the reciprocal of the largest singular value of the matrix $\mathbf{\Phi}\mathbf{S}$ at initialization. In the second stage, we tuned $\lambda$ and $\gamma$ finely by grid search. Given the non-negativity of the sparse filters, we used a one-sided soft-thresholding operator, $\mathcal{S}_{b}(v) = \max(v - b, 0)$.
In the first stage, we initialize the source's entries randomly according to a standard normal distribution. The network is trained using full-batch gradient descent. We use the ADAM optimizer with a decaying learning rate. To achieve convergence without frequent skipping of the minimum of interest, we set the momentum parameters of the optimizer accordingly. Given the learned source, and following a similar approach in generating sparse filters, we generate separate sets of examples for training, validation, and testing. Then, we run the encoder, given the learned source, to estimate the sparse filters. In the second stage, we initialize the compression filter similarly to the source. We again use the ADAM optimizer with a decaying learning rate, and we train with mini-batches for a fixed number of epochs for each CR.
The iterative usage of $\mathbf{s}$ and $\mathbf{h}$ within each layer of the architecture, a property that unstructured approaches do not have, allows us to use a combination of analytic and automatic differentiation to compute the gradient; in this regard, we backpropagate only through the last few iterations within the encoder and treat the earlier representation as constant with respect to the trainable parameters. This allows us to unfold the network for a large number of iterations in the forward pass without increasing the computational and space complexity of backpropagation, a property that unfolded networks with untied layers do not possess. Lastly, we normalize the source and compression filter after every gradient update.
For each CR, we compare five methods detailed below.
LS-MBD: the compression operator is learned (L) and structured (S).
LS-MBD-L: the compression operator is learned (L) and structured (S). Motivated by the fast convergence of LISTA, the encoder performs only a few steps of the proximal gradient iteration. In the second stage of training, we learn the encoder parameters as well.
GS-MBD: the compression operator is random Gaussian (G) and structured (S).
FS-MBD: the compression operator is fixed (F) and structured (S). In that work, the authors derive identifiability conditions in the Fourier domain and design the filters to enable computation of the specific Fourier measurements. The authors also show that for FS-MBD, the sparse filters are uniquely identifiable from the compressed measurements of any two channels together with the compressed measurements of the rest of the channels. Here, we applied the blind-dictionary calibration approach together with FISTA for the sparse-coding step.
G-MBD: the compression operator is an unstructured random Gaussian (G) matrix. We consider G-MBD as an oracle baseline that implements a computationally expensive compression operator.
LS-MBD, GS-MBD, and G-MBD use the architecture shown in Figure 1, each with a different compression operator.
We show that LS-MBD is superior to GS-MBD, FS-MBD, and LS-MBD-L. We evaluate performance in terms of how well we can estimate the filters and the source. Let $\hat{\mathbf{x}}_n$ be an estimate of the true filter $\mathbf{x}_n$. We use the normalized MSE in dB as a comparison metric and call a method successful if this error is below a fixed threshold in dB. Letting $\hat{\mathbf{s}}$ denote an estimate of the source, we quantify the quality of source recovery using the error $\mathrm{err}(\mathbf{s}, \hat{\mathbf{s}})$, which ranges from $0$ to $1$, where $0$ corresponds to exact source recovery.
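A minimal implementation of the normalized-MSE metric in dB; the example vectors are hypothetical, and the success threshold used in the paper was not recoverable from the text.

```python
import numpy as np

def normalized_mse_db(x_true, x_hat):
    """Normalized MSE between a true filter and its estimate, in dB."""
    nmse = np.linalg.norm(x_hat - x_true) ** 2 / np.linalg.norm(x_true) ** 2
    return 10.0 * np.log10(nmse)

# Hypothetical example: a small perturbation of a sparse filter.
x_true = np.array([0.0, 1.0, 0.0, -0.5, 0.0])
x_hat = x_true + 0.01  # constant offset as a stand-in for estimation error
print(round(normalized_mse_db(x_true, x_hat), 2))  # -33.98
```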
Figures 2(a), 2(b), and 2(c) visualize the results from the first stage of training. Figure 2(a) shows that the source estimation error and the training loss both decrease and converge to zero as a function of epochs. Figure 2(b) shows the filter estimation error and successful recovery of the filters upon training. Figure 2(c) shows the source before and after training, where the learned and true sources match.
Table 1 shows the inference runtime of the methods averaged over 20 independent trials. All methods except FS-MBD are run on GPU. LS-MBD-L has the fastest inference because it only unfolds a small number of proximal gradient iterations. Despite its speed, LS-MBD-L has the worst recovery performance.
Table 2 shows the filter recovery error for various CRs on the test set. G-MBD and GS-MBD results are averages over ten independent trials. For LS-MBD-L, we report only one CR, which already shows a very high recovery error. Among the three structured compression matrix methods, we observe that the proposed LS-MBD approach outperforms the others. Theoretical results suggest that for structured compression, successful recovery of the sparse filters is guaranteed above a certain number of measurements. LS-MBD goes beyond this guarantee and achieves successful recovery at even lower CRs.
Comparison of LS-MBD and LS-MBD-L highlights the importance of deep unfolding/encoder depth and the use of a model-based encoder (i.e., a large number of unfolded iterations and tied encoder/decoder weights). LS-MBD-L, even with a learned encoder, fails to recover the sparse filters. Besides, we observed that LS-MBD generalizes better (i.e., closer test and training errors) than LS-MBD-L, highlighting the importance of limiting the number of trainable parameters by weight-tying in applications where data are limited.
For FS-MBD, we observe that for the given set of data, the approach fails to estimate the filters accurately. In large part, this is due to the failure of the FISTA algorithm in the sparse-coding step. To verify this, we consider the non-blind case where the source is assumed to be known. In this case, the normalized MSE in the estimation of the filters from the compressed Fourier measurements using FISTA is comparable to that of FS-MBD. In comparison, LS-MBD learns a filter that enables both compression and accurate sparse coding, which results in a lower MSE. We observe that our oracle method, G-MBD, outperforms the structure-based compression methods. We attribute this superior performance to the fact that the compression matrix in G-MBD has independent random entries that result in low mutual coherence among its columns. In the rest of the methods, the compression matrices have a Toeplitz structure, which results in higher coherence. Despite being less accurate than G-MBD, the compression matrix in LS-MBD has fewer degrees of freedom, is practically feasible to implement, and its Toeplitz structure can be used to speed up matrix computations in the recovery process. Table 3 compares the memory storage and computational costs of the unstructured (G-MBD) and structured (LS-MBD) compression operators. The table highlights the efficiency of LS-MBD; in this case, we report the complexity of the operation performed using the fast Fourier transform.
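The FFT-based implementation of the structured operator can be sketched as follows; the sizes below are arbitrary, chosen only to contrast the O(TP) direct cost with the O(T log T) FFT cost.

```python
import numpy as np

rng = np.random.default_rng(3)
T, P = 4096, 512           # full-measurement and filter lengths (arbitrary)
y = rng.standard_normal(T)
h = rng.standard_normal(P)

# Direct structured compression: O(T * P) multiply-adds.
z_direct = np.convolve(h, y, mode="valid")

# FFT-based implementation: O(T log T), exploiting the Toeplitz structure.
n_fft = T + P - 1
z_full = np.fft.irfft(np.fft.rfft(h, n_fft) * np.fft.rfft(y, n_fft), n_fft)
z_fft = z_full[P - 1 : T]  # keep only the fully overlapping samples

print(np.allclose(z_direct, z_fft))  # True
```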
In LS-MBD, for the initial compression ratio, the filter corresponding to the compression operator is initialized at random. For lower ratios, we "warm-start" the network using a shortened version of the previously learned filter. Figure 2(d) visualizes the filter recovered using a test example at a fixed CR. Figure 3 shows, in the time domain, the compression filters learned for various compression ratios. Figure 4 depicts the magnitude of the discrete Fourier transform of the source (black), the learned compression filter (blue), and the compressed measurements (red). The alignment of the source and compression-filter spectra indicates that the filtering operation performed by the learned filter preserves information from the source, which may explain the success of LS-MBD compared to the other methods.
We proposed a compressive sparse multichannel blind-deconvolution method, named LS-MBD, based on unfolded neural networks. LS-MBD is an autoencoder that recovers sparse filters at the output of its encoder and whose convolutional decoder corresponds to the source of interest. In this framework, we learn an efficient and structured compression matrix that allows faster and more accurate sparse filter recovery than other methods. We attribute our framework's superiority over FS-MBD to learning a compression operator optimized for both reconstruction and filter recovery.
- (2020) Super-efficiency of automatic differentiation for functions defined as a minimum. In Proc. Int. Conf. on Machine Learning (ICML), pp. 1–10.
- (2016) Learning sparsely used overcomplete dictionaries via alternating minimization. SIAM J. Opt. 26, pp. 2775–2799.
- Identification of parametric underspread linear systems and super-resolution radar. IEEE Trans. Signal Process. 59 (6), pp. 2548–2561.
- (2014) Sub-Nyquist radar via Doppler focusing. IEEE Trans. Signal Process. 62 (7), pp. 1796–1811.
- (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imag. Sci. 2 (1), pp. 183–202.
- (2014) Convex optimization approaches for blind sensor calibration using sparsity. IEEE Trans. Signal Process. 62 (18), pp. 4847–4856.
- (1981) Time delay estimation for passive sonar signal processing. IEEE Trans. Acoust., Speech, Signal Process. 29 (3), pp. 463–470.
- RandNet: deep learning with compressed measurements of images. In Proc. Workshop on Machine Learning for Signal Process. (MLSP), pp. 1–6.
- Theoretical linear convergence of unfolded ISTA and its practical weights and thresholds. In Proc. Advances in Neural Info. Process. Sys. (NeurIPS), pp. 9061–9071.
- (2017) A note on the blind deconvolution of multiple sparse signals from unknown subspaces. Proc. SPIE 10394.
- (2011) Structured compressed sensing: from theory to applications. IEEE Trans. Signal Process. 59 (9), pp. 4053–4085.
- (2019) Cognitive radar antenna selection via deep learning. IET Radar, Sonar & Navigation 13 (6), pp. 871–880.
- (2012) Compressed sensing: theory and applications. Cambridge University Press.
- (2018) Convolutional dictionary learning: a comparative review and new algorithms. IEEE Trans. Comput. Imag. 4 (3), pp. 366–381.
- (2010) Learning fast approximations of sparse coding. In Proc. Int. Conf. Machine Learning (ICML), pp. 399–406.
- (2012) Blind calibration for compressed sensing by convex optimization. In Proc. IEEE Int. Conf. Acoust., Speech and Signal Process. (ICASSP), pp. 2713–2716.
- (2014) Sparse multichannel blind deconvolution. Geophysics 79 (5), pp. V143–V152.
- (2017) Identifiability in bilinear inverse problems with applications to subspace or sparsity-constrained blind gain and phase calibration. IEEE Trans. Info. Theory 63 (2), pp. 822–842.
- (2019) Blind gain and phase calibration via sparse spectral methods. IEEE Trans. Info. Theory 65 (5), pp. 3097–3123.
- (2019) Algorithm unrolling: interpretable, efficient deep learning for signal and image processing. arXiv preprint arXiv:1912.10557.
- (2015) A deep learning approach to structured signal recovery. In Proc. Allerton Conf. Commun., Control, and Comput. (Allerton), pp. 1336–1343.
- (2020) Identifiability conditions for compressive multichannel blind deconvolution. IEEE Trans. Signal Process. 68, pp. 4627–4642.
- (2018) Improving sparse multichannel blind deconvolution with correlated seismic data: foundations and further results. IEEE Signal Process. Mag. 35 (2), pp. 41–50.
- Sparse parametric modeling of the early part of acoustic impulse responses. In Proc. European Signal Process. Conf. (EUSIPCO), pp. 678–682.
- (2018) Scalable convolutional dictionary learning with constrained recurrent sparse auto-encoders. In Proc. Workshop on Machine Learning for Signal Process. (MLSP), pp. 1–6.
- Deep residual autoencoders for expectation maximization-inspired dictionary learning. IEEE Trans. Neural Netw. Learn. Syst., pp. 1–15.
- (2011) Innovation rate sampling of pulse streams with application to ultrasound imaging. IEEE Trans. Signal Process. 59 (4), pp. 1827–1842.
- (2012) Compressed beamforming in ultrasound imaging. IEEE Trans. Signal Process. 60 (9), pp. 4643–4657.
- (2016) Blind deconvolution from multiple sparse inputs. IEEE Signal Process. Lett. 23 (10), pp. 1384–1388.