1 Introduction
In a multi-receiver radar system, a transmitted source signal is reflected from sparsely located targets and measured at the receivers. The received signals are modeled as convolutions of the source signal with sparse filters that depend on the targets' locations relative to the receivers [3, 4]. Often, the source signal is unknown at the receiver due to distortion during transmission. With an unknown source, the problem of determining the sparse filters is known as sparse multichannel blind deconvolution (SMBD). This model is ubiquitous in many other applications, such as seismic signal processing [23], room impulse response modeling [24], sonar imaging [7], and ultrasound imaging [27, 28]. In these applications, the receivers' hardware and computational complexity depend on the number of measurements required at each receiver, or channel, to determine the sparse filters uniquely. Hence, it is desirable to compress the number of measurements on each channel.
Prior works have proposed computationally efficient and robust algorithms to recover the filters from the measurements. Examples are norm-based methods [29, 17, 6], sparse dictionary calibration [16, 22], truncated power iteration methods [19], and convolutional dictionary learning [14]. The works in [18, 29, 10] establish theoretical guarantees on the identifiability of the SMBD problem. However, these methods are computationally demanding, require a series of iterations to converge, and need access to the full measurements.
Recent works proposed model-based neural networks to address computational efficiency [26, 25], but these still require full measurements for recovery. To enable compression, Chang et al. [8] proposed an autoencoder, called RandNet, for dictionary learning. In that work, compression is achieved by projecting data into a lower-dimensional space through a data-independent, unstructured random matrix. Mulleti et al. [22] proposed a data-independent, structured compression operator that can be realized as a linear filter. Specifically, the compressed measurements are computed by applying a specific filter to the received signals, followed by truncation of the filtered measurements. Two natural questions are: (1) can we learn a compression filter from a given set of measurements, rather than applying a fixed filter as in [22]? (2) will this data-driven approach result in better compression for a given estimation accuracy?

We propose a model-based neural network that learns a hardware-efficient, data-driven, and structured compression matrix to recover sparse filters from reduced measurements. Our approach takes inspiration from filter-based compression [22], learned compression operators [21], and the model-based compressed learning approach for SMBD [8]. The architecture, which we call the learned (L) structured (S) compressive multichannel blind-deconvolution network (LSMBD), learns a filter for compression, recovers the unknown source, and estimates the sparse filters. In contrast to [22, 21, 8], LSMBD improves in the following ways. In [22], a fixed, data-independent filter is used for compression, and the reconstruction is independent of the compression. In LSMBD, we learn a filter that enables compression and lets us estimate the sparse filters accurately. The approach is computationally efficient and results in lower reconstruction error for a given compression ratio compared with [22]. Unlike in [21], our compression operator is linear, is used recurrently in the architecture, has fewer parameters to learn, and has a computationally efficient implementation [11].
2 Problem Formulation
Consider a set of signals given as
(1) 
where denotes the linear-convolution operation and the matrix
is the convolution matrix constructed from the vector
. We make the following structural assumptions on the signals: (A1) and (A2) and for . Due to the convolution, each has length . In an SMBD problem, the objective is to determine the common source signal and the sparse filters from the measurements . In compressive SMBD, the objective is to estimate the unknowns from a set of compressive measurements , where . The compression operator is a mapping from to . This operator can be linear or nonlinear, random or deterministic, structured or unstructured, and data-dependent or data-independent (see [11] for details).
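As a concrete reference, the multichannel convolution model above can be sketched in a few lines of NumPy; the lengths, number of channels, sparsity level, and amplitude range below are illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k, num_channels = 64, 32, 3, 5   # source length, filter length, sparsity, channels (illustrative)

# Common source signal, shared by every channel.
x = rng.standard_normal(n)

def sparse_filter(m, k, rng):
    """A length-m filter with k nonzero entries at random locations."""
    h = np.zeros(m)
    support = rng.choice(m, size=k, replace=False)
    h[support] = rng.uniform(1.0, 2.0, size=k)
    return h

# Each channel measures the linear convolution of the source with its own sparse filter.
filters = [sparse_filter(m, k, rng) for _ in range(num_channels)]
measurements = [np.convolve(x, h) for h in filters]   # each has length n + m - 1
```

Blind deconvolution asks for `x` and the `filters` given only `measurements`; the compressive variant additionally shortens each measurement before recovery.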
We consider the problem of designing a linear, structured, and data-driven compression operator that enables accurate estimation of the sparse filters from the compressed measurements. Specifically, our goal is to jointly learn and a structured, practically realizable, and data-driven compression operator such that the sparse filters are determined from the compressive measurements.
3 Learning Structured Compressive SMBD
3.1 Compression Operator
We impose a Toeplitz structure on the compression operator and denote the filter associated with the operator by . Such compression matrices have several advantages over random projections: they facilitate computationally efficient matrix operations and can be implemented in hardware as a filtering operation. The compression operator consists of a convolution followed by a truncation. As the operation holds for all , we drop the channel-index superscript for simplicity. Consider a causal filter of length . The truncated convolution samples are
(2) 
which are the convolution samples where and have complete overlap. This choice of samples ensures that maximal information from is retained in . The corresponding measurement matrix is a Toeplitz matrix given as
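A minimal sketch of this operator, with illustrative sizes: the truncated ("valid") convolution with the compression filter coincides with multiplying the full measurement by a Toeplitz matrix whose rows carry the flipped filter:

```python
import numpy as np

rng = np.random.default_rng(0)
n, l = 12, 4                      # measurement and compression-filter lengths (illustrative)
y = rng.standard_normal(n)        # a channel's full measurement
g = rng.standard_normal(l)        # compression filter

# Truncated convolution: keep only the samples where g and y fully overlap.
z = np.convolve(g, y, mode="valid")        # length n - l + 1

# Equivalent Toeplitz measurement matrix, with the flipped filter on each row.
Phi = np.zeros((n - l + 1, n))
for i in range(n - l + 1):
    Phi[i, i:i + l] = g[::-1]

assert np.allclose(z, Phi @ y)
```

The matrix form is useful for analysis, while the filtering form is what a hardware implementation would actually run.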
3.2 Network Architecture
We aim to minimize the following objective
(3)  
where is a sparsity-enforcing parameter and the norm constraints avoid scaling ambiguity. Following an approach similar to [15, 26, 20], we construct an autoencoder whose encoder maps into a sparse filter by unfolding iterations of FISTA [5], an accelerated proximal gradient algorithm for sparse recovery. Specifically, each unfolded layer performs the following iteration
(4) 
where , is set as in FISTA, is the unfolding step size, and is the sparsity-promoting soft-thresholding operator, where .
One may leave the bias in each layer unconstrained and learn it. However, theoretical analysis of unfolded ISTA [9] has proved that converges to as goes to infinity. Hence, we set with . In this regard, is a scalar that can be either tuned or learned. We keep the unfolded layers tied, as this leads to better generalization in applications with limited data.
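For reference, the unfolded iteration with tied layers can be sketched in plain NumPy; the function and variable names are ours, and the generic soft-thresholding (rather than the paper's exact nonnegative variant) is an assumption:

```python
import numpy as np

def soft_threshold(v, tau):
    """Soft-thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def unfolded_fista(y, X, lam, num_layers=500):
    """Estimate a sparse h from y ≈ X h by unfolding FISTA layers with tied weights."""
    eta = 1.0 / np.linalg.norm(X, 2) ** 2     # step size below 1 / sigma_max(X)^2
    h_prev = np.zeros(X.shape[1])
    w = h_prev.copy()
    t = 1.0
    for _ in range(num_layers):
        # Gradient step on the data-fidelity term, then the sparsifying proximal step.
        h = soft_threshold(w - eta * X.T @ (X @ w - y), eta * lam)
        # FISTA momentum update.
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        w = h + ((t - 1.0) / t_next) * (h - h_prev)
        h_prev, t = h, t_next
    return h_prev
```

Because the layers are tied, the number of trainable parameters does not grow with the unfolding depth.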
The decoder reconstructs the data using . We call this network, shown in Figure 1, LSMBD. In this architecture, and correspond to the weights of the neural network and are learned by backpropagation in two stages. The novelty of this method lies in the hardware-efficient design of the operator , which captures data information while compressing. This architecture reduces to RandNet [8] when and have no structure (e.g., Toeplitz) and .

3.3 Training Procedure
We follow a two-stage approach to learn the compression filter. In the first stage, we start with a limited set of full measurements and estimate the source and the filters. In the second, we learn the compression filter associated with , given the estimated source. This two-stage training disentangles source estimation from learning the compression. Hence, source estimation is performed only once and reused for various compression ratios (CRs). Moreover, a low CR does not affect the quality of the source estimation, and the number of measurements required for filter recovery can be optimized independently.
Specifically, having access to the full measurements , we set the compression operator to be the identity matrix (i.e., ), thus reducing the autoencoder architecture to a variant of CRsAE [25], and learn by minimizing . Then, given the learned source matrix , we run the forward pass of the encoder to estimate the sparse filters, which we denote . In the second stage, for a given compression ratio , we set to the learned source and train within the encoder by minimizing .

In practice, the method is useful in the following scenario. Consider a multi-receiver radar setting where a set of receivers is already in place and operates with full measurements. Suppose a new set of receivers must be added, either to replace the existing ones or to gather more information. While designing these new receivers, one can use the full measurements available from the existing receivers to estimate the source and then design optimal compression filters for the new receivers. Thus, the new receivers sense compressed measurements, which reduces cost and complexity. In essence, the approach is similar to the deep sparse-array method [12], where a full array is used in one scan to learn sparse arrays for the remaining scans.
4 Experiments
4.1 Data Generation
We considered the noiseless case and generated measurements following the model in (1), where , , and . The source has a Gaussian shape, generated as where , and then normalized such that . The supports of the filters are generated uniformly at random over the set , and their amplitudes are drawn from a uniform distribution .

Under the aforementioned assumptions, we impose no restrictions on the minimum separation between any two nonzero components of a sparse filter or on the relative amplitudes of the filter components. In the presence of noise, both of these factors play a crucial role in filter estimation, and the recovery error is expected to increase. Analyzing our method's recovery performance and stability in the presence of noise is a direction for future research.
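A minimal sketch of the source generation; the length and width parameter below are assumed values, since the paper's constants are not shown in this copy:

```python
import numpy as np

n = 64                                    # source length (illustrative)
sigma = 4.0                               # pulse width (assumed; not the paper's value)
t = np.arange(n)
x = np.exp(-0.5 * ((t - n / 2) / sigma) ** 2)   # Gaussian-shaped source
x /= np.linalg.norm(x)                    # normalize so that the source has unit l2 norm
```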
4.2 Network and Training Parameters
We implemented the network in PyTorch. We unfolded the encoder for iterations and set . We set the regularization parameter and decreased by a factor of at every iteration (i.e., ). To ensure stability, we chose the parameter to be less than the reciprocal of the largest singular value of the matrix at initialization. In the second stage, we tuned and via a fine grid search over the ranges and , respectively. Given the nonnegativity of the sparse filters, we let .

In the first stage, we initialize the source's entries randomly according to a standard normal distribution. The network is trained for epochs using full-batch gradient descent. We use the ADAM optimizer with a learning rate of , which we decrease by a factor of every epochs. To achieve convergence without frequently skipping over the minimum of interest, we set the parameter of the optimizer to . Given the learned source, and following an approach similar to the one used to generate the sparse filters, we generate examples for training, for validation, and for testing. Then, we run the encoder given to estimate the sparse filters . In the second stage, we initialize the filter for similarly to the source. We use the ADAM optimizer with and a learning rate of with a decay factor of every epochs. We use a batch size of and train for epochs for each CR.

The iterative use of and within each layer of the architecture, a property absent from [21], allows us to use a combination of analytic and automatic differentiation [1] to compute the gradient; specifically, we backpropagate only through the last iterations of the encoder and treat the preceding representation as independent of the trainable parameters. This allows us to unfold the network for large in the forward pass without increasing the computational and space complexity of backpropagation, a property that unfolded networks with untied layers do not possess. Lastly, we normalize the source and the compression filter after every gradient update.
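The truncated-backpropagation trick can be sketched as follows; the encoder below is a plain nonnegative ISTA unfolding with tied layers, and the depths and names are illustrative assumptions:

```python
import torch

def unfolded_encoder(z, A, lam, num_layers=400, backprop_last=20):
    """Unfolded nonnegative ISTA encoder that backpropagates only through the
    last `backprop_last` layers. Earlier layers run under no_grad, so the
    forward pass can be deep without growing the autograd graph."""
    eta = 1.0 / torch.linalg.matrix_norm(A, ord=2).detach() ** 2
    h = torch.zeros(A.shape[1])
    with torch.no_grad():                 # early layers: treated as a fixed warm start
        for _ in range(num_layers - backprop_last):
            h = torch.relu(h - eta * A.T @ (A @ h - z) - lam * eta)
    for _ in range(backprop_last):        # only these layers contribute gradients
        h = torch.relu(h - eta * A.T @ (A @ h - z) - lam * eta)
    return h
```

The output of the `no_grad` block is treated as a constant input to the last layers, which is exactly the "representation is not a function of the trainable parameters" assumption made in the text.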
4.3 Baselines
For each CR, we compare the five methods detailed below.
LSMBD: the compression operator is learned (L) and structured (S).
LSMBDL: the compression operator is learned (L) and structured (S). Motivated by the fast convergence of LISTA, the encoder performs steps of the proximal gradient iteration . In the second stage of training, we learn .
GSMBD: the compression operator is random Gaussian (G) and structured (S).
FSMBD: the compression operator is fixed (F) and structured (S) [22]. In [22], the authors derive identifiability in the Fourier domain and design the filters to enable computation of the specific Fourier measurements. The authors also show that, for FSMBD, the sparse filters are uniquely identifiable from compressed measurements from any two channels and compressed measurements from the rest of the channels. Here, we applied the blind dictionary calibration approach from [22] together with FISTA [5] for the sparse-coding step.
GMBD: the compression operator is an unstructured random Gaussian (G) matrix. We consider GMBD an oracle baseline that implements a computationally expensive compression operator.
LSMBD, GSMBD, and GMBD use the architecture shown in Figure 1, each with a different .
4.4 Results
We show that LSMBD is superior to GSMBD, FSMBD, and LSMBDL. We evaluate performance in terms of how well we can estimate the filters and the source. Let denote an estimate of the true filters . We use the normalized MSE in dB as a comparison metric and call a method successful if this error is below dB. Letting denote an estimate of the source, we quantify the quality of source recovery using the error [2], which ranges from to , where corresponds to exact source recovery.
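For concreteness, these metrics can be computed as below. The exact sign convention for the MSE and the precise form of the source error from [2] are not shown in this copy, so both definitions are assumptions of a common convention:

```python
import numpy as np

def nmse_db(h_true, h_est):
    """Normalized MSE in decibels (one common convention; the paper's exact
    sign and success threshold are assumptions here)."""
    return 10.0 * np.log10(np.sum((h_true - h_est) ** 2) / np.sum(h_true ** 2))

def source_error(x_true, x_est):
    """Scale-invariant source mismatch in [0, 1]; 0 means exact recovery up to
    scaling (an assumed form of the error used in the paper)."""
    c = abs(np.dot(x_true, x_est)) / (np.linalg.norm(x_true) * np.linalg.norm(x_est))
    return float(np.sqrt(max(0.0, 1.0 - c ** 2)))
```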
Figure 2 visualizes the results of the first stage of training: the source estimation error and the training loss both decrease and converge to zero as a function of the epochs; the filter estimation error shows successful recovery of the filters upon training; and a comparison of the source before and after training shows that the learned and true sources match.
Table 1 shows the inference runtime of the methods averaged over 20 independent trials. All methods except FSMBD are run on GPU. LSMBDL has the fastest inference because it only unfolds a small number of proximal gradient iterations. Despite its speed, LSMBDL has the worst recovery performance.
Method | GMBD | FSMBD | LSMBD | LSMBDL
runtime [s] | 164
CR [%] | GMBD | GSMBD | FSMBD | LSMBD | LSMBDL
99 | 54.05 | 44.93 | 43.96 | 53.27 | 26.54
80 | 55.07 | 40.55 | 26.52 | 52.80 |
70 | 52.43 | 40.00 | 22.76 | 51.50 |
62 | 53.63 | 37.13 | 21.86 | 54.71 |
50 | 53.36 | 28.57 | 8.40 | 51.41 |
47 | 50.60 | 26.11 | 6.84 | 50.35 |
45 | 52.98 | 23.17 | 6.14 | 43.61 |
40 | 47.39 | 14.75 | 5.13 | 17.07 |
Table 2 shows the filter recovery error on the test set for various CRs. The GMBD and GSMBD results are averages over ten independent trials. For LSMBDL, we report results only for CR = 99%, which already shows very high recovery error. Among the three structured compression matrix methods, the proposed LSMBD approach outperforms the others. Theoretical results in [22] suggest that, for structured compression, when (i.e., ), we will have successful recovery of the sparse filters. LSMBD goes beyond this bound and achieves successful recovery for CRs lower than , down to .
Comparison of LSMBD and LSMBDL highlights the importance of encoder depth through deep unfolding and of a model-based encoder (i.e., a large number of tied encoder/decoder layers). LSMBDL, even with a learned encoder, fails to recover the sparse filters. Moreover, we observed that LSMBD generalizes better (i.e., closer test and training errors) than LSMBDL, highlighting the importance of limiting the number of trainable parameters through weight tying in applications where data are limited.
For FSMBD, we observe that, on the given dataset, the approach fails to estimate the filters accurately for . This is largely due to the failure of the FISTA algorithm in the sparse-coding step. To verify this, we considered the non-blind case, where the source is assumed known. In this case, the normalized MSEs of the filter estimates obtained from the compressed Fourier measurements using FISTA are comparable to those of FSMBD. In comparison to [22], LSMBD learns a filter that enables both compression and accurate sparse coding, which results in a lower MSE. We observe that our oracle method, GMBD, outperforms the structure-based compression methods. We attribute this superior performance to the fact that the compression matrix in GMBD has independent random entries, which result in low mutual coherence among its columns [13]. In the remaining methods, the compression matrices have a Toeplitz structure, which results in higher coherence. Although less accurate than GMBD, the compression matrix in LSMBD has fewer degrees of freedom, is practically feasible to implement, and its Toeplitz structure can be used to speed up matrix computations in the recovery process. Table 3 compares the memory storage and computational costs of the unstructured (GMBD) and structured (LSMBD) compression operators. The table highlights the efficiency of LSMBD; in its case, we report the complexity of the operation performed using the fast Fourier transform.
Method | Memory Storage | Complexity
Structured (LSMBD) | |
Unstructured (GMBD) | |
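The FFT route behind the structured operator's complexity can be checked numerically: zero-padding both the filter and the measurement to the full convolution length turns circular convolution into linear convolution, and the valid segment matches the explicit Toeplitz product. Sizes below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, l = 1024, 64                   # measurement and filter lengths (illustrative)
y = rng.standard_normal(n)
g = rng.standard_normal(l)

# Structured operator: O(l) storage for the filter, O(n log n) compute via the FFT.
nfft = n + l - 1                  # pad so circular convolution equals linear convolution
z_fft = np.fft.irfft(np.fft.rfft(g, nfft) * np.fft.rfft(y, nfft), nfft)[l - 1:n]

# Unstructured equivalent: an explicit dense matrix with O(n^2)-scale storage/compute.
Phi = np.zeros((n - l + 1, n))
for i in range(n - l + 1):
    Phi[i, i:i + l] = g[::-1]

assert np.allclose(z_fft, Phi @ y)
```

The dense matrix stores roughly `(n - l + 1) * n` entries, whereas the structured operator stores only the `l` filter taps.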
In LSMBD, when , the filter corresponding to is initialized at random. For lower ratios, we "warm-start" the network using a shortened version of the filter learned when . Figure 2 visualizes the filter recovered from a test example at a CR of . Figure 3 shows, in the time domain, the compression filters learned for various compression ratios. Figure 4 depicts the magnitudes of the discrete Fourier transforms of the source (black), the learned compression filter (blue), and the compressed measurements (red) when . The alignment of and indicates that the filtering operation performed by the learned filter preserves information from the source, which may explain the success of LSMBD compared with the other methods.
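The warm start can be sketched as a truncation plus renormalization; this is an assumed form, since the paper does not spell out the exact procedure, and the function name is ours:

```python
import numpy as np

def warm_start_filter(g_full, l_new):
    """Warm-start a shorter compression filter by truncating a previously
    learned one and renormalizing to unit l2 norm (an assumed form of the
    warm start described in the text)."""
    g = np.array(g_full[:l_new], dtype=float)
    return g / np.linalg.norm(g)
```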
5 Conclusions
We proposed a compressive sparse multichannel blind-deconvolution method, named LSMBD, based on unfolded neural networks [20]. LSMBD is an autoencoder that recovers sparse filters at the output of its encoder and whose convolutional decoder corresponds to the source of interest. In this framework, we learn an efficient, structured compression matrix that yields faster and more accurate sparse filter recovery than competing methods. We attribute our framework's superiority over FSMBD [22] to learning a compression operator optimized for both reconstruction and filter recovery.
References
[1] (2020) Super-efficiency of automatic differentiation for functions defined as a minimum. In Proc. Int. Conf. on Machine Learning (ICML), pp. 1–10.
[2] (2016) Learning sparsely used overcomplete dictionaries via alternating minimization. SIAM J. Opt. 26, pp. 2775–2799.
[3] (2011) Identification of parametric underspread linear systems and super-resolution radar. IEEE Trans. Signal Process. 59 (6), pp. 2548–2561.
[4] (2014) Sub-Nyquist radar via Doppler focusing. IEEE Trans. Signal Process. 62 (7), pp. 1796–1811.
[5] (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imag. Sci. 2 (1), pp. 183–202.
[6] (2014) Convex optimization approaches for blind sensor calibration using sparsity. IEEE Trans. Signal Process. 62 (18), pp. 4847–4856.
[7] (1981) Time delay estimation for passive sonar signal processing. IEEE Trans. Acoust., Speech, Signal Process. 29 (3), pp. 463–470.
[8] (2019) RandNet: deep learning with compressed measurements of images. In Proc. Workshop on Machine Learning for Signal Process. (MLSP), pp. 1–6.
[9] Theoretical linear convergence of unfolded ISTA and its practical weights and thresholds. In Proc. Advances in Neural Info. Process. Sys. (NeurIPS), pp. 9061–9071.
[10] (2017) A note on the blind deconvolution of multiple sparse signals from unknown subspaces. Proc. SPIE 10394.
[11] (2011) Structured compressed sensing: from theory to applications. IEEE Trans. Signal Process. 59 (9), pp. 4053–4085.
[12] (2019) Cognitive radar antenna selection via deep learning. IET Radar, Sonar & Navigation 13 (6), pp. 871–880.
[13] (2012) Compressed sensing: theory and applications. Cambridge University Press.
[14] (2018) Convolutional dictionary learning: a comparative review and new algorithms. IEEE Trans. Comput. Imag. 4 (3), pp. 366–381.
[15] (2010) Learning fast approximations of sparse coding. In Proc. Int. Conf. Machine Learning (ICML), pp. 399–406.
[16] (2012) Blind calibration for compressed sensing by convex optimization. In Proc. IEEE Int. Conf. Acoust., Speech and Signal Process. (ICASSP), pp. 2713–2716.
[17] (2014) Sparse multichannel blind deconvolution. Geophysics 79 (5), pp. V143–V152.
[18] (2017) Identifiability in bilinear inverse problems with applications to subspace or sparsity-constrained blind gain and phase calibration. IEEE Trans. Info. Theory 63 (2), pp. 822–842.
[19] (2019) Blind gain and phase calibration via sparse spectral methods. IEEE Trans. Info. Theory 65 (5), pp. 3097–3123.
[20] (2019) Algorithm unrolling: interpretable, efficient deep learning for signal and image processing. arXiv preprint arXiv:1912.10557.
[21] (2015) A deep learning approach to structured signal recovery. In Proc. Allerton Conf. Commun., Control, and Comput. (Allerton), pp. 1336–1343.
[22] (2020) Identifiability conditions for compressive multichannel blind deconvolution. IEEE Trans. Signal Process. 68, pp. 4627–4642.
[23] (2018) Improving sparse multichannel blind deconvolution with correlated seismic data: foundations and further results. IEEE Signal Process. Mag. 35 (2), pp. 41–50.
[24] (2017) Sparse parametric modeling of the early part of acoustic impulse responses. In Proc. European Signal Process. Conf. (EUSIPCO), pp. 678–682.
[25] (2018) Scalable convolutional dictionary learning with constrained recurrent sparse autoencoders. In Proc. Workshop on Machine Learning for Signal Process. (MLSP), pp. 1–6.
[26] (2020) Deep residual autoencoders for expectation-maximization-inspired dictionary learning. IEEE Trans. Neural Netw. Learn. Syst., pp. 1–15.
[27] (2011) Innovation rate sampling of pulse streams with application to ultrasound imaging. IEEE Trans. Signal Process. 59 (4), pp. 1827–1842.
[28] (2012) Compressed beamforming in ultrasound imaging. IEEE Trans. Signal Process. 60 (9), pp. 4643–4657.
[29] (2016) Blind deconvolution from multiple sparse inputs. IEEE Signal Process. Lett. 23 (10), pp. 1384–1388.