I. Introduction
Digital signal processing systems operate on finite-bit representations of continuous-amplitude physical signals. The mapping of an analog signal into a digital representation over a finite dictionary is referred to as quantization [1]. This representation is commonly selected to accurately match the quantized signal, in the sense of minimizing a distortion measure, such that the signal can be recovered with minimal error from the quantized measurements [2], [3, Ch. 10]. In many relevant scenarios, however, the task of the system is to recover some underlying parameters, and not to accurately represent the observed signal. In these cases, it was shown that by accounting for the system task in the design of the quantizers, namely, by utilizing task-based quantization, the performance can be improved without increasing the number of bits used [4, 5, 6, 7].
In practice, quantizers are typically implemented using analog-to-digital converters (ADCs), which operate on the input signal in a serial scalar manner. In such systems, the quantization rule is based on a uniform partition of a subspace of the real line, determined by the support of the quantizer. This quantization logic is very limited due to its simplicity: except for the specific case where the input is uniformly distributed over the support of the quantizer, uniform quantization is far from optimal [8, Sec. 22], namely, a more accurate representation can be obtained with the same number of bits. Furthermore, such quantizers typically do not account for the system task, namely, they are task-ignorant. While the distortion induced by such inefficient quantization can be mitigated by assigning more bits to the digital representation, i.e., by using high-resolution quantizers, it can severely degrade the performance of bit-constrained systems.

Recent years have witnessed a growing interest in systems operating with low-resolution ADCs. In particular, the power consumption of ADCs typically grows with the bandwidth and the quantization resolution [9]. To maintain feasible cost and power usage when acquiring multiple signals at large frequency bands, low-resolution quantizers may be used. An example where such bit-constrained systems are popular is multiple-input multiple-output (MIMO) communication receivers, which simultaneously acquire and process multiple analog signals in order to recover the transmitted symbols and/or estimate the underlying channel, i.e., for a specific task. MIMO receivers operating at large spectral bands, e.g., millimeter-wave systems [10], are commonly designed to acquire the channel output with low-resolution quantizers, and a large body of work focuses on schemes for carrying out the aforementioned tasks from coarsely discretized measurements, see, e.g., [11, 12, 13, 14, 15, 16, 17].
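The mismatch between uniform quantization and non-uniform inputs noted above can be illustrated with a short numerical sketch (our illustration, not part of the original study): a mid-rise uniform quantizer matched to an input uniformly distributed over its support is compared, at the same bit budget, against the same quantizer applied to a Gaussian input.

```python
import numpy as np

def uniform_quantize(x, n_bits, support):
    """Mid-rise uniform quantizer over [-support, support]."""
    levels = 2 ** n_bits
    step = 2 * support / levels
    # Clip to the quantizer support, then map to the nearest cell centre.
    clipped = np.clip(x, -support, support - 1e-12)
    idx = np.floor((clipped + support) / step)
    return -support + (idx + 0.5) * step

rng = np.random.default_rng(0)
n, bits = 100_000, 3
x_unif = rng.uniform(-1, 1, n)    # matched to the quantizer support
x_gauss = rng.standard_normal(n)  # mismatched: Gaussian input

mse_unif = np.mean((x_unif - uniform_quantize(x_unif, bits, 1.0)) ** 2)
mse_gauss = np.mean((x_gauss - uniform_quantize(x_gauss, bits, 1.0)) ** 2)
print(mse_unif, mse_gauss)  # the Gaussian input incurs a much larger error
```

For the matched uniform input the empirical MSE approaches the classical step²/12 value, while the Gaussian input suffers both granular and overload distortion at the same number of bits.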
Quantizers are inherently non-linear systems. Hence, the design and implementation of practical quantizers which provide an accurate discrete representation while accounting for the system task is difficult in general. Two notable challenges are associated with designing such task-based quantization systems: First, in order to design the quantization scheme, one must have full knowledge of the stochastic model of the underlying signal [2, 1], which may be unavailable in practice. Second, even when the stochastic model is perfectly known, the scalar continuous-to-discrete rule which minimizes the representation error is unknown for most distributions under finite-resolution quantization [8, Ch. 23.1]. A possible approach to tackle the second challenge is to use a uniform quantization rule, while applying additional processing in the analog domain prior to quantization, resulting in a hybrid analog-digital system [18, 19]. While such hybrid systems were shown to yield substantially improved performance for signal recovery tasks under bit constraints [5, 6, 7], their design is commonly restricted to a subset of analog mappings, e.g., linear processing [5], and to specific stochastic models, such as Gaussian observations [6, 7]. Furthermore, these model-based quantization systems assume uniform quantizers; hence, they do not exploit the ability to utilize arbitrary quantization rules, while requiring accurate knowledge of the underlying statistical model.
An alternative approach to inferring the quantization system from the model is to learn it from a set of training samples in a data-driven fashion. In particular, by utilizing machine learning methods, one can implement task-based quantizers without the need to explicitly know the underlying model or to analytically derive the proper quantization rule. Existing works on deep learning for quantization typically focus on image compression [20, 21, 22, 23, 24], where the goal is to represent the analog image using a single quantization rule, i.e., quantization which is not task-based. Alternatively, a large body of deep learning related works consider deep neural network (DNN) model compression [25, 26, 27], where a DNN operates with quantized instead of continuous weights. The work [28] used DNNs to compress and quantize high-dimensional channel state information in a massive MIMO feedback setup. The design of DNNs for processing one-bit quantized measurements in the digital domain, i.e., in the presence of task-ignorant quantizers, was considered for signal recovery in [29], while DNN-based MIMO receivers with one-bit quantizers were studied in [30, 31]. To the best of our knowledge, despite the importance of quantization with scalar ADCs in digital signal processing, the application of deep learning in such systems has not yet been studied.

In this paper we consider the design of data-driven task-based quantizers utilizing scalar ADCs. Following [5, 6, 7], we propose a hybrid quantization system in which the analog mapping, the quantization rule, and the digital processing are learned from training in an end-to-end fashion. The operation of the scalar ADCs is modeled as an intermediate activation layer. Unlike previous works which combined fixed uniform quantizers as part of a neural network [22, 23, 28]
, our method is specifically designed for learning scalar quantization mappings. We consider two generic tasks: estimating a set of parameters taking values in a continuous set from the quantized observations, and classifying the acquired signals. Our main target application is bit-constrained MIMO receivers, in which these tasks may represent, for example, channel estimation and symbol detection, respectively.

Since the continuous-to-discrete mappings applied in the quantization process are inherently non-differentiable, standard deep learning training algorithms, such as stochastic gradient descent (SGD), cannot be applied in a straightforward manner. To overcome this difficulty, previous works used a simplified model of the quantizer, in which the quantization error is replaced by additive i.i.d. noise [22, 23, 28]. As the quantization error is a deterministic function of the analog input [32], the resulting model is relatively inaccurate, inducing a mismatch which, as we numerically demonstrate, degrades the ability to properly optimize the system in light of the task. Furthermore, this model is limited to fixed uniform continuous-to-discrete mappings, namely, the quantization mapping cannot be learned during training. Here, we approximate the continuous-to-discrete mapping during training with a differentiable one which faithfully represents the operation of the quantizer, facilitating the application of backpropagation while allowing the quantization mapping to be learned as part of an end-to-end network.
We numerically evaluate the performance of our proposed DNN-based system in MIMO communication scenarios. We first consider channel estimation, and compare our data-driven task-based quantizer to previous channel estimators operating on task-ignorant quantized measurements, as well as to the model-based task-based quantization system proposed in our previous work [5]. We also compare with the fundamental limits on channel estimation performance in MIMO systems with quantized observations, derived using indirect rate-distortion theory, which are achievable using optimal vector quantizers [8, Ch. 23]. Our results demonstrate that, even when the DNN-based quantizer is trained with samples taken from setups with a different signal-to-noise ratio (SNR), it is still able to approach the performance of the optimal task-based quantizers with ADCs for varying SNRs, which is within a small gap of the fundamental performance limits.
Next, we test the data-driven quantizer for the task of symbol detection in multiuser MIMO communications. Here, we show that our quantizer achieves performance which is comparable to applying the maximum a-posteriori probability (MAP) rule without any quantization constraints, and is notably more robust to inaccurate channel state information (CSI). Furthermore, our deep task-based quantizer significantly outperforms the previously used approach of modeling quantization as additive noise during training, and we illustrate that this gap stems from the use of a more accurate model of the quantization mapping. We also discuss how the proposed approach can be exploited to construct trainable task-based ADCs, by combining neuromorphic electronic systems [33] with digital neural networks, giving rise to robust, efficient, and accurate data-driven methods for the acquisition of analog signals.

The rest of this paper is organized as follows: Section II formulates the problem; the implementation of the data-driven task-based quantizer is presented in Section III; Section IV numerically evaluates the proposed quantizer in MIMO communication scenarios; finally, Section V provides some concluding remarks.
Throughout the paper, we use boldface lowercase letters for vectors, e.g., $\mathbf{x}$, and boldface uppercase letters for matrices, e.g., $\mathbf{X}$. Sets are denoted with calligraphic letters, e.g., $\mathcal{X}$. We use $\mathbf{I}_n$ to represent the $n \times n$ identity matrix. Transpose, Euclidean norm, stochastic expectation, real part, and imaginary part are written as $(\cdot)^T$, $\|\cdot\|$, $\mathbb{E}\{\cdot\}$, $\mathrm{Re}\{\cdot\}$, and $\mathrm{Im}\{\cdot\}$, respectively; $\mathbb{R}$ is the set of real numbers, and $\mathbb{C}$ is the set of complex numbers.
II. Preliminaries and Problem Statement
II-A. Preliminaries in Quantization Theory
To formulate the problem, we first briefly review the standard quantization setup. While parts of this review also appear in our previous work [5], it is included for completeness. We begin with the definition of a quantizer:
Definition 1 (Quantizer).
A quantizer $Q_M^{n,k}$ with $\log_2 M$ bits, input size $n$, input alphabet $\mathcal{X}$, output size $k$, and output alphabet $\hat{\mathcal{X}}$, consists of: 1) An encoding function $g^{\rm e}: \mathcal{X}^n \mapsto \{1, 2, \ldots, M\}$ which maps the input into a discrete index. 2) A decoding function $g^{\rm d}: \{1, 2, \ldots, M\} \mapsto \hat{\mathcal{X}}^k$ which maps each index $j$ into a codeword $\mathbf{q}_j \in \hat{\mathcal{X}}^k$.
We write the output of the quantizer with input $\mathbf{x} \in \mathcal{X}^n$ as $\hat{\mathbf{x}} = g^{\rm d}\left(g^{\rm e}(\mathbf{x})\right) \triangleq Q_M^{n,k}(\mathbf{x})$. Scalar quantizers operate on a scalar input, i.e., $n = 1$ and $\mathcal{X}$ is a scalar space, while vector quantizers have a multivariate input. When the input size and the output size are equal, $n = k$, we write $Q_M^{n}(\cdot) \equiv Q_M^{n,n}(\cdot)$.
In the standard quantization problem, a quantizer is designed to minimize some distortion measure $d: \mathcal{X}^n \times \hat{\mathcal{X}}^n \mapsto \mathbb{R}^+$ between its input and its output. The performance of a quantizer is characterized using two measures: the quantization rate, defined as $R \triangleq \frac{1}{n} \log_2 M$, and the expected distortion $\mathbb{E}\{d(\mathbf{x}, \hat{\mathbf{x}})\}$. For a fixed input size $n$ and codebook size $M$, the optimal quantizer is

$$Q_M^{{\rm opt},n}(\cdot) = \mathop{\arg\min}_{Q_M^{n}(\cdot)} \mathbb{E}\left\{d\left(\mathbf{x}, Q_M^{n}(\mathbf{x})\right)\right\}. \qquad (1)$$
Characterizing the optimal quantizer via (1) and its trade-off between distortion and quantization rate is in general a very difficult task. Optimal quantizers are thus typically studied assuming either a high quantization rate, i.e., $R \rightarrow \infty$, see, e.g., [34], or asymptotically large inputs, namely, $n \rightarrow \infty$, commonly with i.i.d. inputs, via rate-distortion theory [3, Ch. 10].
In task-based quantization, the design objective of the quantizer is some task other than minimizing the distortion between its input and output. In the following, we focus on the generic task of acquiring a random vector $\mathbf{s} \in \mathcal{S}$ from a statistically dependent random vector $\mathbf{x}$. The set $\mathcal{S}$ represents the possible values of the unknown vector: it can be continuous, representing an estimation task; discrete, for classification tasks; or binary, for detection tasks. This formulation accommodates a broad range of applications, including channel estimation and symbol detection, which are the common tasks considered in bit-constrained hybrid MIMO communication receivers [6], the main target systems considered in this work.
When quantizing for the task of estimation under the objective of minimizing the MSE distortion, i.e., $d(\mathbf{s}, \hat{\mathbf{s}}) = \|\mathbf{s} - \hat{\mathbf{s}}\|^2$, it was shown in [35] that the optimal quantizer applies vector quantization to the minimum mean-squared error (MMSE) estimate of the desired vector $\mathbf{s}$ from the observed vector $\mathbf{x}$. While the optimal system utilizes vector quantization, the fact that such pre-quantization processing can improve the performance in estimation tasks was also demonstrated in [5], which considered scalar quantizers. However, it was also shown in [5] and [7] that the pre-quantization processing which is optimal with vector quantizers, i.e., recovery of the MMSE estimate of $\mathbf{s}$ from $\mathbf{x}$, is no longer optimal when using scalar quantization, and that characterizing the optimal pre-quantization processing in such cases is very difficult in general. The fact that processing the observations in the analog domain is beneficial in task-based quantization motivates the hybrid system model which is the focus of the current work, detailed in the following subsection. Due to the difficulty of analytically characterizing the optimal hybrid system, we consider a data-driven design, described in Section III.
II-B. Problem Statement
As discussed in the introduction, practical digital signal processing systems typically obtain a digital representation of physical analog signals using scalar ADCs. Since in such systems each continuous-amplitude sample is converted into a discrete representation using a single quantization rule, this operation can be modeled using identical scalar quantizers. In this work we study the implementation of task-based quantization systems with scalar ADCs in a data-driven fashion.
The considered signal acquisition system with scalar ADCs is modeled using the hybrid setup depicted in Fig. 1, where a set of analog signals is converted to digital in order to extract some desired information. This model can represent, e.g., sensor arrays or MIMO receivers, and subsumes the case of a single analog input signal. While acquiring a set of analog signals in digital hardware involves both sampling, i.e., continuous-to-discrete time conversion, and quantization, namely, continuous-to-discrete amplitude mapping, we henceforth focus only on the quantization aspect, assuming a fixed sampling mechanism, and leave the data-driven design of the overall system for future investigation.
We consider the recovery of an unknown random vector $\mathbf{s} \in \mathcal{S}$ based on an observed vector $\mathbf{x} \in \mathbb{R}^n$ quantized with up to $\log_2 M$ bits. The observed $\mathbf{x}$ is related to $\mathbf{s}$ via a conditional probability measure $P_{\mathbf{x}|\mathbf{s}}$, which is assumed to be unknown. For example, in a communications setup, the conditional probability measure encapsulates the noisy channel. The input to the ADC, denoted $\mathbf{z} \in \mathbb{R}^p$, where $p$ denotes the number of scalar quantizers, is obtained from $\mathbf{x}$ using some pre-quantization mapping carried out in the analog domain. Then, $\mathbf{z}$ is quantized using an ADC modeled as $p$ identical scalar quantizers with resolution $\tilde{M}$. The overall number of bits is $\log_2 M = p \log_2 \tilde{M}$. The ADC output is processed in the digital domain to obtain the quantized representation $\hat{\mathbf{s}}$.
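To make the signal path concrete, the following sketch (our illustration; the linear analog and digital mappings and the quantizer thresholds and levels are hypothetical placeholders for the components learned later) traces an observation through the hybrid chain: analog mapping, identical scalar quantizers, and digital mapping.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dimensions: n observations, p ADC inputs, k task parameters.
n, p, k = 8, 4, 2
A = rng.standard_normal((p, n))   # stands in for the learned analog mapping
B = rng.standard_normal((k, p))   # stands in for the learned digital mapping
thresholds = np.array([-0.5, 0.0, 0.5])        # shared decision thresholds
levels = np.array([-0.75, -0.25, 0.25, 0.75])  # 2-bit representation levels

def scalar_adc(z):
    """Apply the same 2-bit scalar quantizer to every entry of z."""
    return levels[np.searchsorted(thresholds, z)]

y = rng.standard_normal(n)     # observed vector x
s_hat = B @ scalar_adc(A @ y)  # estimate produced from p * 2 bits in total
print(s_hat.shape)
```

The digital side only ever sees the p quantized values, so the total budget here is p·log₂(4) bits regardless of the input dimension n.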
Our goal is to design a generic machine-learning based architecture for task-based quantization with scalar ADCs. The proposed system operates in a data-driven manner, namely, it is capable of learning the analog transformation, the quantization mapping, and the digital processing from a training data set consisting of $T$ independent realizations of $\mathbf{s}$ and $\mathbf{x}$, denoted $\{(\mathbf{s}_t, \mathbf{x}_t)\}_{t=1}^{T}$. In general, the training samples may be taken from a set of joint distributions, and not only from the true (unknown) joint distribution of $\mathbf{s}$ and $\mathbf{x}$, as we consider in our numerical study in Section IV. We focus on two tasks which are relevant for MIMO receivers: an estimation task, in which $\mathcal{S}$ is a continuous set, representing, e.g., channel estimation; and classification, in which $\mathcal{S}$ is a finite set, modeling, e.g., symbol detection. Our design is based on machine-learning methods, and specifically, on the application of DNNs.

III. Deep Task-Based Quantization
In the following, we present a deep task-based quantizer, which implements the system depicted in Fig. 1 in a data-driven fashion using DNNs. To that aim, we first discuss the proposed network architecture in Subsection III-A. Then, in Subsection III-B we elaborate on the continuous-to-discrete mapping and its training method, and provide a discussion on the resulting system in Subsection III-C.
III-A. DNN Architecture
We propose to implement a data-driven task-based quantizer using machine-learning methods. In particular, we realize the pre- and post-quantization mappings using dedicated DNNs, jointly trained in an end-to-end manner, as illustrated in Fig. 2.
In the proposed architecture, the serial scalar ADC, which implements the continuous-to-discrete mapping, is modeled as an activation function between two intermediate layers. The trainable parameters of this activation function determine the quantization rule, allowing it to be learned during training. The DNN structure cannot contain any skip connections between the layers prior to quantization (the analog domain) and those after quantization (the digital domain), representing the fact that all analog values must first be quantized before being processed in digital. The pre- and post-quantization networks are henceforth referred to as the analog DNN and the digital DNN, respectively. The system input is the observed vector $\mathbf{x}$, and we use $\boldsymbol{\theta}$ to denote the trainable parameters of the network. As detailed in Subsection II-B, we consider two main types of tasks:
Estimation: Here, the deep task-based quantizer should learn to recover a set of unknown parameters taking values in a continuous set, i.e., $\mathcal{S} \subseteq \mathbb{R}^k$. By letting $f_{\boldsymbol{\theta}}(\cdot)$ denote the mapping implemented by the overall system, the output is given by the vector $\hat{\mathbf{s}} = f_{\boldsymbol{\theta}}(\mathbf{x})$, which is used as a representation of the desired vector $\mathbf{s}$. The loss function is the empirical MSE, given by

$$\mathcal{L}(\boldsymbol{\theta}) = \frac{1}{T} \sum_{t=1}^{T} \left\| \mathbf{s}_t - f_{\boldsymbol{\theta}}(\mathbf{x}_t) \right\|^2. \qquad (2)$$
Classification: In such tasks, the deep task-based quantizer should decide between a finite number of options based on its analog input. Here, $\mathcal{S}$ is a finite set, and we use $|\mathcal{S}|$ to denote its cardinality. The last layer of the digital DNN is a softmax layer, and thus the network mapping $f_{\boldsymbol{\theta}}(\mathbf{x})$ is an $|\mathcal{S}| \times 1$ vector whose entries represent the estimated conditional probability of each possible value of $\mathbf{s}$ given the input $\mathbf{x}$. By letting $[f_{\boldsymbol{\theta}}(\mathbf{x})]_{\mathbf{s}}$ be the output entry corresponding to $\mathbf{s}$, the decision is selected as the most probable one, i.e., $\hat{\mathbf{s}} = \mathop{\arg\max}_{\mathbf{s} \in \mathcal{S}} [f_{\boldsymbol{\theta}}(\mathbf{x})]_{\mathbf{s}}$. The loss function is the empirical cross-entropy, given by

$$\mathcal{L}(\boldsymbol{\theta}) = -\frac{1}{T} \sum_{t=1}^{T} \log \left[ f_{\boldsymbol{\theta}}(\mathbf{x}_t) \right]_{\mathbf{s}_t}. \qquad (3)$$
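One plausible concrete reading of these two training objectives is sketched below (our illustration; the helper names and the per-sample weighting are ours): the empirical MSE averages the squared error over the training set, and the empirical cross-entropy averages the negative log-probability assigned to the correct label.

```python
import numpy as np

def empirical_mse(s_true, s_pred):
    """Empirical MSE in the spirit of (2): mean squared error over the set."""
    return np.mean(np.sum((s_true - s_pred) ** 2, axis=-1))

def empirical_cross_entropy(labels, probs):
    """Empirical cross-entropy in the spirit of (3): negative mean
    log-probability that the softmax output assigns to the true label."""
    rows = np.arange(len(labels))
    return -np.mean(np.log(probs[rows, labels]))

# Toy check on hand-crafted values.
mse = empirical_mse(np.array([[1.0, 0.0]]), np.array([[0.5, 0.5]]))
ce = empirical_cross_entropy(np.array([0, 1]),
                             np.array([[0.9, 0.1], [0.2, 0.8]]))
print(mse, ce)
```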
By utilizing DNNs, we expect the resulting system to be able to approach the optimal achievable distortion for a fixed quantization rate $R$ and input size $n$, without requiring explicit knowledge of the underlying distribution $P_{\mathbf{x}|\mathbf{s}}$. Such performance is illustrated in the numerical example presented in Subsection IV-A.
The proposed architecture is generic, and its main novelty is in the introduction of the learned quantization layer, detailed in the following subsection. Our structure can thus be combined with existing dedicated networks, which are trainable in an endtoend manner, as a form of transfer learning. For example,
sliding bidirectional recurrent neural networks (SBRNNs) were shown to achieve good performance for the task of symbol detection in non-quantized communication systems with long memory [36]. Consequently, one can design a deep symbol detector operating under quantization constraints, as common in, e.g., millimeter-wave communications [10], by implementing the digital DNN of Fig. 2 as an SBRNN. In this work we focus on fully-connected analog and digital DNNs, and leave the combination with dedicated networks for future investigation.

III-B. Quantization Activation
Our proposed deep task-based quantizer implements scalar quantization as an intermediate activation in a joint hybrid analog-digital DNN. This layer converts its continuous-amplitude input into a discrete digital representation. The non-differentiable nature of such continuous-to-discrete mappings induces a major challenge in applying SGD to optimize the parameters of the network. In particular, the quantization activation, which can be modeled as a superposition of step functions determining the continuous regions jointly mapped into a single value, nullifies the gradient of the cost function. Consequently, a straightforward application of SGD fails to properly set the pre-quantization network. To overcome this drawback, we first review the common approach, referred to henceforth as passing gradient, and then propose a new method, referred to as soft-to-hard quantization.
III-B1. Passing Gradient
In this approach the quantized values are modeled as the analog values corrupted by mutually independent i.i.d. noise [22, 23, 28], and thus quantization does not affect the backpropagation procedure. Since the quantization error is deterministically determined by the analog value [32], the resulting model is quite inaccurate. Specifically, while under some input distributions the quantization noise can be modeled as being uncorrelated with the input [32], the two are not mutually independent. In fact, in order for the quantization error to be independent of the input, one should use subtractive dithered quantization [37], which does not represent the operation of practical ADCs. Consequently, using this model for quantization during training results in a mismatch between the trained system and the tested one.
Under this model, the continuous-to-discrete mapping is fixed, representing, e.g., uniform quantization, and the training algorithm backpropagates the gradient value intact through the quantization layer. An illustration of this approach is depicted in Fig. 3(a). We expect the resulting system to obtain poor performance when non-negligible distortion is induced by the quantizers. In our numerical study presented in Subsection IV-B, it is illustrated that this method achieves relatively poor performance at low quantization rates, where scalar quantization induces an error term which is non-negligible and depends on the analog input. It is therefore desirable to formulate a network structure which accounts for the presence of scalar quantizers during training, and is not restricted to fixed uniform quantizers.
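A minimal sketch of the passing-gradient surrogate follows (our illustration; function names are ours): during training the uniform quantizer is replaced by additive noise, so gradients pass through the layer unchanged, whereas at test time the actual rounding rule applies, and its error is a deterministic, bounded function of the input rather than independent noise.

```python
import numpy as np

def additive_noise_quantizer(z, step, rng):
    """Training-time surrogate: the quantizer is modeled as additive noise
    uniform on [-step/2, step/2], leaving the backward pass untouched."""
    return z + rng.uniform(-step / 2, step / 2, size=z.shape)

def uniform_quantizer(z, step):
    """Test-time fixed uniform quantization actually applied by the ADC."""
    return step * np.round(z / step)

rng = np.random.default_rng(2)
z = rng.standard_normal(10_000)
err = uniform_quantizer(z, 0.5) - z
# Unlike the i.i.d. noise assumed during training, the true error is a
# deterministic function of z, bounded by half a quantization step.
print(np.max(np.abs(err)))
```

The mismatch between the two models is exactly the training/test discrepancy discussed above.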
III-B2. Soft-to-Hard Quantization
Our proposed approach is based on approximating the non-differentiable quantization mapping by a differentiable one. Here, we replace the continuous-to-discrete transformation with a non-linear activation function which has approximately the same behavior as the quantizer, as illustrated in Fig. 3(b). Specifically, we use a sum of shifted hyperbolic tangents, which are known to closely resemble step functions in the presence of large-magnitude inputs. The resulting scalar quantization mapping is given by:
$$q(x) = \sum_{i=1}^{\tilde{M}-1} a_i \tanh\left(c_i \cdot x - b_i\right), \qquad (4)$$

where $\{(a_i, b_i, c_i)\}_{i=1}^{\tilde{M}-1}$ is a set of real-valued parameters. Note that as the parameters $\{c_i\}$ increase, the corresponding hyperbolic tangents approach step functions. Since we use a differentiable activation to approximate a set of non-differentiable functions [20], we refer to this method as soft-to-hard quantization.
In addition to learning the weights of the analog and digital DNNs, this soft-to-hard approach allows the network to learn its quantization activation function, and particularly, the best suited constants $\{a_i\}$ (the amplitudes) and $\{b_i\}$ (the shifts). These tuned parameters are later used to determine the decision regions of the scalar quantizer, resulting in a learned quantization mapping. The parameters $\{c_i\}$, which essentially control the resemblance of (4) to an actual continuous-to-discrete mapping, do not affect the quantization decision regions (controlled by $\{b_i\}$) or their associated digital values (determined by $\{a_i\}$), and are thus not learned from training. The set $\{c_i\}$ can either be set according to the quantization resolution $\tilde{M}$, or alternatively modified using annealing-based optimization [38], where the $\{c_i\}$ are manually increased during training. The proposed optimization is achieved by including the parameters $\{a_i, b_i\}$ as part of the network parameters $\boldsymbol{\theta}$. Due to the differentiability of (4), one can now apply standard SGD to optimize the overall network, including the analog and digital DNNs as well as the quantization rule, in an end-to-end manner.
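Assuming the tanh-sum parameterization of (4), the soft activation can be sketched in a few lines (our illustration; all parameter values are hypothetical). With a shared steepness $c$, the transition of the $i$-th term is centred at $b_i/c$, so scaling the shifts with $c$ keeps the transitions in place while the curve sharpens into a staircase.

```python
import numpy as np

def soft_quantizer(x, a, b, c):
    """Soft-to-hard activation: a sum of shifted hyperbolic tangents.
    Larger c makes the smooth curve approach a staircase."""
    x = np.asarray(x, dtype=float)[..., None]
    return np.sum(a * np.tanh(c * x - b), axis=-1)

# Hypothetical parameters of a 4-level (2-bit) mapping with 3 transitions.
a = np.array([0.25, 0.25, 0.25])         # amplitudes (jump sizes are 2*a_i)
transitions = np.array([-0.5, 0.0, 0.5]) # desired transition locations
for c in (2.0, 50.0):
    b = c * transitions                  # keep transitions fixed as c grows
    print(c, soft_quantizer(np.array([-1.0, 0.25, 1.0]), a, b, c))
```

For small $c$ the mapping is smooth and trainable by SGD; for large $c$ its outputs are already close to the discrete representation levels.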
Once training is concluded, we replace the learned activation (4) with a scalar quantizer whose decision regions are dictated by the tuned parameters. In particular, since $\tanh(c_i \cdot x - b_i) \approx \mathrm{sign}(x - b_i/c_i)$ for large $c_i$, we use the set $\{b_i/c_i\}$ to determine the decision regions of the quantizer, and set the value of (4) at each decision region center as its corresponding representation level. Without loss of generality, we assume that $b_1/c_1 < b_2/c_2 < \cdots < b_{\tilde{M}-1}/c_{\tilde{M}-1}$ (when this condition is not satisfied, the parameters are sorted and re-indexed accordingly). The resulting quantizer is given by

$$\hat{q}(x) = \sum_{i=1}^{\tilde{M}-1} a_i \, \mathrm{sign}\left(x - \frac{b_i}{c_i}\right). \qquad (5)$$
An illustration of how the differentiable mapping (4) is converted into a continuous-to-discrete quantization rule via (5) is depicted in Fig. 4. The dashed smooth curve in Fig. 4 represents the differentiable function after training is concluded, and the solid curve is the resulting scalar quantizer.
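Assuming the tanh-sum form of (4) with amplitudes $a_i$, shifts $b_i$, and steepness $c_i$, the hardening step can be sketched as follows (our illustration; helper names and parameter values are ours): thresholds are read off as $b_i/c_i$, and since each saturated tanh term contributes $\pm a_i$, the representation levels run from $-\sum_i a_i$ upward in jumps of $2a_i$.

```python
import numpy as np

def harden(a, b, c):
    """Convert trained soft parameters into a hard scalar quantizer:
    thresholds at b_i / c_i; levels are the saturated values of the
    tanh sum, i.e., -sum(a) plus cumulative jumps of 2 * a_i."""
    t = np.sort(b / c)  # decision thresholds
    levels = -np.sum(a) + 2 * np.concatenate(([0.0], np.cumsum(a)))
    def quantize(x):
        return levels[np.searchsorted(t, x)]
    return t, levels, quantize

a = np.array([0.25, 0.25, 0.25])    # hypothetical trained amplitudes
c = 50.0                            # steepness used during training
b = c * np.array([-0.5, 0.0, 0.5])  # thresholds at -0.5, 0, 0.5
t, levels, quantize = harden(a, b, c)
print(levels, quantize(np.array([-1.0, -0.2, 0.2, 1.0])))
```

The resulting lookup is the deployment-time quantizer; the smooth activation is only needed while gradients must flow.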
In the simulation study presented in Subsection IV-A, it is illustrated that the proposed method, which faithfully represents the presence of scalar quantizers during training and is capable of optimizing their decision regions, can outperform the model-based MSE-minimizing task-based quantizer with scalar ADCs of [5], which requires complete knowledge of the underlying model yet is restricted to uniform quantizers.
III-C. Discussion
The deep task-based quantizer proposed in Subsection III-A implements hybrid multivariate acquisition using a set of identical scalar ADCs with learned decision regions, combined with DNN-based analog and digital transformations. While the digital DNN can be realized in software, the analog DNN requires dedicated hardware with tunable weights and activations. Such hardware networks, commonly referred to as neuromorphic electronic systems [33], implement configurable DNNs using analog components. Recent advances in memristor technology substantially facilitate the implementation of these hardware devices [39], contributing to the feasibility of our proposed deep task-based quantizer.
It is noted that in some applications, constrained analog structures may be preferable. For example, in MIMO receivers with a large number of antennas, i.e., massive MIMO, pre-quantization analog processing is commonly limited to phase shifting [18]. In this case, the analog DNN is replaced with a single layer whose weights are restricted to have unit magnitude, and this constraint has to be accounted for in training. Here we focus on generic analog DNNs in which the weights are not constrained.
Our task-based quantizer can thus be implemented as a system consisting of adjustable analog hardware, configurable scalar quantizers, and software. The natural approach to setting the parameters of the network would be to train the system model offline in software using an a-priori acquired training set. The network weights and quantization decision regions obtained from this trained model can then be configured into the hardware components and the tunable ADCs, resulting in the desired task-based quantization system.
One can also envision an online trainable task-based quantizer, which is capable of further tuning its parameters in real-time to track dynamic environments, as in, e.g., [40]. For example, a communication receiver using a deep task-based quantizer for symbol detection can exploit a-priori knowledge of pilot sequences as labels corresponding to inputs acquired in real-time. A major challenge in implementing such a system stems from the fact that both the labels and the inputs are required in order to update the network coefficients using conventional training algorithms, e.g., SGD. However, in our system the digital processor does not have direct access to the analog signal, but only to its quantized digital representation. Consequently, if the processor only utilizes digital values, it can only train the digital DNN using SGD. This challenge may be handled by allowing access to a high-resolution quantized version of the analog signals, acquired in the specific time instances for which labels are available. An alternative approach is to utilize an error-correction based update algorithm [41] instead of SGD, or reinforcement learning methods [42], since these techniques typically do not require direct access to the network input.

IV. Application to MIMO Receivers
While the generic deep task-based quantizer proposed in Section III is applicable to a multitude of different setups, our main target application, studied in this section, is uplink multiuser MIMO communications. The problem of MIMO communications with low-resolution quantization is the focus of many recent works, including, e.g., [12, 13, 19, 6, 30]. Here, we consider a single-cell multiuser MIMO system, in which single-antenna users are served by a multi-antenna base station (BS) which operates under quantization constraints. We focus on two tasks encountered in such setups: the first is channel estimation, detailed in Subsection IV-A, for which we are capable of quantifying the performance gap of our system from optimality as well as comparing it to model-based designs; then, in Subsection IV-B, we focus on symbol detection, which we treat as a classification task.
IV-A. Channel Estimation Task
We first consider channel recovery, an estimation task commonly encountered in MIMO systems. We focus on a specific scenario for which we can compute both the fundamental performance limits, namely, a lower bound on the achievable recovery accuracy which holds for any bit-constrained system, as well as the performance of the best hybrid system restricted to linear operations and uniform quantization, derived in [5]. These performance measures, which correspond to model-based systems, are used as a basis for comparison in evaluating our proposed data-driven task-based quantizer. The main motivation for the study detailed in this subsection is thus to compare the performance achievable using our proposed deep task-based quantizer to model-based techniques and to the fundamental performance limits in a specific scenario where these values are computable.
In the following, we consider a channel estimation task carried out in a time-division duplexing manner as in [6], using orthogonal pilot sequences of length $\tau$. We use $\mathbf{\Phi}$ to denote the known pilot sequence matrix, where the orthogonality of the pilots implies that $\mathbf{\Phi}^H \mathbf{\Phi} = \tau \rho \, \mathbf{I}$, and $\rho > 0$ is the SNR. Additionally, let $\mathbf{g}$ be a random vector whose entries are i.i.d. zero-mean unit-variance complex normal channel coefficients, and $\mathbf{w}$ be a random vector with i.i.d. zero-mean unit-variance complex normal entries, mutually independent of $\mathbf{g}$, representing the additive noise at the BS. The observed signal $\mathbf{y}$, used by the BS to estimate $\mathbf{g}$, can be written as [12, Eq. (4)]:

$$\mathbf{y} = \left(\mathbf{\Phi}^T \otimes \mathbf{I}\right) \mathbf{g} + \mathbf{w}, \qquad (6)$$
where $\otimes$ denotes the Kronecker product.
To put the setup in (6) into the framework of our problem formulation, which considers real-valued signals, we write the observations as $\mathbf{x} = [\mathrm{Re}\{\mathbf{y}\}^T, \mathrm{Im}\{\mathbf{y}\}^T]^T$ and the unknown channel as $\mathbf{s} = [\mathrm{Re}\{\mathbf{g}\}^T, \mathrm{Im}\{\mathbf{g}\}^T]^T$. Consequently, the number of real measurements $n$ is twice the number of complex observations, the number of unknown real parameters $k$ is twice the number of complex channel coefficients, and their ratio $n/k$, determined by the pilot sequence length $\tau$, is not smaller than one.
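The real-valued reformulation can be sketched as follows (our illustration; helper names are ours): stacking real and imaginary parts doubles the dimension while preserving all the information in the complex vector.

```python
import numpy as np

def to_real(v):
    """Stack real and imaginary parts into a single real vector."""
    return np.concatenate([v.real, v.imag])

def to_complex(r):
    """Inverse mapping: recombine the two halves."""
    m = len(r) // 2
    return r[:m] + 1j * r[m:]

rng = np.random.default_rng(3)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)
x = to_real(y)  # 4 complex observations become 8 real measurements
print(x.shape, np.allclose(to_complex(x), y))
```

Since the mapping is invertible, estimating the stacked real channel vector is equivalent to estimating the complex channel itself.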
The performance measure for evaluating the quantization systems here is the average MSE, $\frac{1}{k}\mathbb{E}\{\|\mathbf{s} - \hat{\mathbf{s}}\|^2\}$. For the above model, the average MMSE, which is the optimal performance achievable with no quantization constraints, is given by $\sigma_{\rm mmse}^2 = \frac{1}{1 + \tau\rho}$. In the presence of quantization constraints, the optimal approach is to quantize the MMSE estimate [35], and the resulting average distortion is obtained from rate-distortion theory [3, Ch. 10.3] as

$$\mathcal{D}(R) = \frac{1}{1 + \tau\rho} + \frac{\tau\rho}{1 + \tau\rho} \cdot 2^{-2R}. \qquad (7)$$

Note that $\mathcal{D}(R)$ is achievable using optimal vector quantization in the limit $n \rightarrow \infty$. For finite $n$ and scalar quantizers, (7) serves as a lower bound on the achievable performance. We thus refer to $\mathcal{D}(R)$ as the fundamental performance limit.
We now numerically evaluate our proposed deep task-based quantizer, compared to the fundamental performance limit in (7), as well as to the performance of the task-based quantizer with scalar uniform ADCs designed in [5]. It is noted that while our proposed system can modify the quantization regions, the model of [5] assumes fixed uniform quantizers. Consequently, the average MSE of the system of [5] does not necessarily lower bound the performance of our proposed system. We also note that the system of [5] requires full knowledge of the underlying statistical model, namely, the SNR ρ as well as the distributions of h and w.
We simulate a multi-user MIMO network in which a BS equipped with N antennas serves K users. The SNR ρ and the number of pilots τ are fixed throughout the experiment. As in [12], we fix the pilot matrix Φ to be the first K columns of the τ × τ discrete Fourier transform matrix. In the implementation of the deep quantizers, we set the pre- and post-quantization DNNs to consist of linear layers. The motivation for using linear layers stems from the fact that, for the considered setup, the MMSE estimate is a linear function of the observations. Furthermore, this setting guarantees a fair comparison with the model-based system of [5], which focused on linear analog and digital processing. Following [5, Cor. 1], we evaluate the average MSE of our proposed systems with scalar quantizers. We consider two training sets of equal size: in the first training set, representing optimal training, the realizations are sampled from the true joint distribution of h and y; in the second training set, representing SNR uncertainty, the realizations are sampled from the joint distribution of h and y with different values of ρ, uniformly randomized for each realization. At the end of the training session, we fix the quantizer to implement the continuous-to-discrete rule in (5). We numerically evaluate our trained deep quantizer using independent channel realizations.

In Fig. 5 we depict the resulting performance versus the quantization rate. The empirical performance is compared to three theoretical measures: the MMSE; the fundamental performance limit of channel estimation from quantized measurements, given by (7); and the performance of the analytically derived task-based quantizer with scalar ADCs [5]. Since [5] requires perfect knowledge of the underlying model, and particularly of the SNR ρ, and as this information may not be available accurately in practice, we also consider the case where [5] utilizes a noisy estimate of ρ, corrupted by zero-mean Gaussian noise. Finally, we compute the average MSE of the BLMMSE estimator proposed in [12] via [12, Eq. (15)]. Since the BLMMSE estimator quantizes the observed signal without analog preprocessing, it is applicable only when the number of scalar quantizers equals the number of observed measurements.
Observing Fig. 5, we note that the performance of our soft-to-hard deep quantizer is within a relatively small gap of the fundamental performance limit. Furthermore, the fact that the soft-to-hard method is not restricted to uniform quantizers allows it to outperform the model-based design of [5], especially at lower quantization rates. Finally, we note that in the presence of SNR uncertainty, the performance of the soft-to-hard method is similar to that of [5] with a noisy SNR estimate, and that both outperform the BLMMSE estimator of [12]. This indicates that our proposed scheme is applicable also when the training data is not generated from the exact same distribution as the test data. Our results demonstrate the ability of deep task-based quantization to implement a feasible quantization system that approaches optimality in a data-driven fashion.
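While the exact parameterization of the trained quantization rule is given in (4)-(5), the soft-to-hard idea can be sketched under an assumed sum-of-shifted-tanh model (the amplitudes and thresholds below are illustrative): each hard threshold is replaced by a tanh during training, and by a sign function once training ends.

```python
import numpy as np

def soft_quantizer(x, a, b, c=10.0):
    """Differentiable soft-to-hard approximation of a scalar quantizer:
    a sum of shifted tanh functions, one per decision threshold.
    a: amplitudes (level gaps), b: thresholds, c: sharpness."""
    return sum(a_i * np.tanh(c * (x - b_i)) for a_i, b_i in zip(a, b))

def hard_quantizer(x, a, b):
    """The continuous-to-discrete rule obtained after training by
    replacing each tanh with a sign function."""
    return sum(a_i * np.sign(x - b_i) for a_i, b_i in zip(a, b))

# 2-bit (4-level) example with uniform thresholds over [-1, 1].
a = np.array([0.5, 0.5, 0.5])   # equal level spacing of 1.0
b = np.array([-0.5, 0.0, 0.5])  # three decision thresholds
x = np.linspace(-1.4, 1.4, 8)
print(hard_quantizer(x, a, b))  # values drawn from four output levels
```

As the sharpness c grows, the soft rule converges pointwise to the hard rule away from the thresholds, which is what permits end-to-end gradient-based training.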
IV-B Symbol Detection Task
The main task of a communication receiver is to recover the transmitted messages. Channel estimation, studied in the previous subsection, is intended to facilitate the recovery of the unknown symbols. Consequently, we next consider the task of symbol recovery, in which the receiver learns to recover a set of constellation points from its quantized channel output.
As shown in the previous subsection, multivariate complex-valued (baseband) channels can be represented as real vector channels of extended dimensions. Therefore, here we focus on communications over a real-valued MIMO channel. In particular, we consider a BS equipped with n antennas, serving K single-antenna users. The users transmit i.i.d. binary phase shift keying (BPSK) symbols, represented via the vector s ∈ {−1, 1}^K. The received signal at the BS, denoted y ∈ R^n, is given by

(8) y = H s + w,

where H ∈ R^{n×K} is the channel matrix, and w is additive Gaussian noise with zero-mean i.i.d. entries of variance σ².
Here, the task of the BS is to recover the transmitted symbols vector s from the channel output y, i.e., in this scenario the input dimension is n and the task dimension is K. We use a DNN architecture consisting of two fully connected layers in the analog domain and two fully connected layers in the digital domain. As this is a classification task, the output layer is a softmax function producing 2^K probabilities, one per candidate symbol vector, and the overall network is trained to minimize the cross-entropy loss. An illustration of the DNN structure is depicted in Fig. 6. Unlike the scenario considered in the previous subsection, for which the number of quantizers can be set according to the analytical results in [5], here this value was determined based on empirical evaluations, such that each scalar quantizer in the hybrid system is assigned at least one bit.
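The forward pass of such a hybrid architecture can be sketched as follows; the layer widths, the number of scalar quantizers p, and the quantizer support are illustrative assumptions rather than the values used in the experiments, and a fixed uniform quantizer stands in for the learned rule.

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical dimensions: n-dim channel output, p scalar quantizers,
# K users -> 2**K candidate BPSK vectors (classification).
n, p, K = 8, 4, 2
W1, W2 = rng.normal(size=(16, n)), rng.normal(size=(p, 16))     # analog DNN
W3, W4 = rng.normal(size=(16, p)), rng.normal(size=(2**K, 16))  # digital DNN

def uniform_quantize(z, levels=4, support=2.0):
    """Element-wise uniform quantizer over [-support, support]."""
    step = 2 * support / levels
    centers = np.floor(z / step) * step + step / 2
    return np.clip(centers, -support + step / 2, support - step / 2)

def forward(y):
    analog = W2 @ relu(W1 @ y)             # pre-quantization processing
    digital_in = uniform_quantize(analog)  # p scalar ADCs
    return softmax(W4 @ relu(W3 @ digital_in))  # symbol-vector probabilities

probs = forward(rng.normal(size=n))
```

The key structural point is that the analog network compresses the n-dimensional observation to p values before the scalar quantizers, so the bit budget is spent on a reduced-dimension, task-oriented representation.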
We compare the achievable bit error rate (BER) of our proposed deep task-based quantizer with soft-to-hard training to that of the same architecture with passing-gradient training, namely, where the quantizers are replaced during training with additive i.i.d. noise uniformly distributed over the decision regions, which is the approach used to train neural networks with intermediate quantization in [22, 23, 28]. In particular, for the passing-gradient method we used a uniform quantization rule over a fixed symmetric support. The DNNs are trained using a relatively small training set consisting of realizations sampled from the joint distribution of y and s.
The aforementioned data-driven systems are compared to two model-based symbol detectors, which require accurate CSI, i.e., knowledge of H and σ², from which the joint distribution of y and s can be inferred using (8):

The MAP rule for recovering s from y without quantization constraints, i.e.,

(9) ŝ_MAP(y) = argmax_{s ∈ {−1,1}^K} Pr(s | y).

The performance of the MAP detector with perfect CSI constitutes a lower bound on the achievable BER of any recovery scheme.

The MAP rule for recovering s from a uniformly quantized y with rate R, namely,

(10) ŝ_qMAP(y) = argmax_{s ∈ {−1,1}^K} Pr(s | Q_u(y)),

where Q_u(·) represents the element-wise uniform quantization rule over a fixed interval using 2^R decision regions. The performance of the quantized MAP detector represents the achievable BER when processing is carried out solely in the digital domain, i.e., without using analog processing and/or tuning the quantization mapping in light of the task.
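Both model-based detectors can be sketched as follows; the dimensions, noise level, and quantizer support are illustrative assumptions, and the candidate search is brute-force over the 2^K BPSK vectors.

```python
import itertools
import math
import numpy as np

def map_detect(y, H, sigma):
    """Unquantized MAP for equiprobable BPSK: minimum-distance rule."""
    cands = list(itertools.product([-1.0, 1.0], repeat=H.shape[1]))
    return min(cands, key=lambda s: np.sum((y - H @ np.array(s)) ** 2))

def gauss_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def quantized_map_detect(q_idx, H, sigma, edges):
    """MAP from uniformly quantized outputs: for each candidate s, the
    likelihood is the product over entries of the Gaussian probability
    mass falling in the observed region [edges[k], edges[k+1])."""
    cands = list(itertools.product([-1.0, 1.0], repeat=H.shape[1]))
    def loglik(s):
        m = H @ np.array(s)
        return sum(math.log(max(gauss_cdf((edges[k + 1] - mu) / sigma)
                                - gauss_cdf((edges[k] - mu) / sigma), 1e-300))
                   for k, mu in zip(q_idx, m))
    return max(cands, key=loglik)

# Toy example with assumed dimensions and an assumed support of [-2, 2].
rng = np.random.default_rng(2)
H = rng.normal(size=(4, 2))
s_true = np.array([1.0, -1.0])
sigma = 0.1
y = H @ s_true + sigma * rng.normal(size=4)
edges = np.array([-np.inf, -1.0, 0.0, 1.0, np.inf])  # 2-bit regions
q_idx = np.searchsorted(edges, y, side="right") - 1  # observed region index
print(map_detect(y, H, sigma), quantized_map_detect(q_idx, H, sigma, edges))
```

Note that the quantized rule only sees the region indices, which is why its error floor at low rates motivates analog pre-quantization processing.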
Unlike the detectors based on the MAP rules in (9)-(10), data-driven task-based quantizers do not require CSI, namely, no a priori knowledge of H or σ² is used in the detection procedure. Instead, a set of training samples is needed. In order to study the resiliency of our deep task-based quantizer to inaccurate training, we also compute the BER under CSI uncertainty, namely, when the training samples are randomized from a joint distribution of y and s in which the entries of the matrix H in (8) are corrupted by additive i.i.d. Gaussian noise, whose variance is the magnitude of the corresponding entry. For comparison, we also evaluate the BER of the MAP rule (9) with the same level of CSI uncertainty. The numerically computed BER values are averaged over Monte Carlo simulations.
The simulated BER values versus SNR, defined here as 1/σ², are depicted in Figs. 7-8 for two quantization rates. Observing Figs. 7-8, we note that in the presence of accurate CSI, the BER of our deep task-based quantizer is comparable to that achievable using the MAP rule operating without quantization constraints: the deep task-based quantizer attains the BER values of the rate-independent MAP detector within small SNR gaps at both considered quantization rates. For comparison, the quantized MAP rule, which operates only in the digital domain, requires a notably higher SNR and twice the number of bits used by the deep task-based quantizer to achieve the same error rate. This demonstrates the benefit of applying pre-quantization processing in the analog domain, which reduces the dimensionality of the input to the scalar quantizers, thus making it possible to utilize more accurate quantization while keeping the semantic information required to classify the symbols from the channel output.
The performance gain of the hybrid DNN architecture stems from the ability to properly model the scalar quantizers during training using our soft-to-hard approach. This model makes it possible to jointly train both the analog and digital DNNs as well as the decision regions of the quantizers, while accurately reflecting the quantization mapping. For comparison, it is observed in Figs. 7-8 that using the passing-gradient approach, i.e., replacing quantization with additive uniformly distributed i.i.d. noise as was done in [22, 23, 28], leads to substantially degraded BER values compared to the proposed soft-to-hard approach. To understand whether the improved gains of soft-to-hard modeling over passing gradient stem from the better approximation of the continuous-to-discrete mapping or from the ability to use non-uniform quantizers, we compare in Fig. 9, for the scenario of Fig. 7, the performance of the task-based quantizers with soft-to-hard modeling, with passing-gradient modeling, and with soft-to-hard modeling restricted to a fixed uniform quantizer. In particular, for the uniform soft-to-hard quantizer we used the model in (4) during training with the parameters fixed to a uniform partition of the quantizer support, i.e., not optimized during training. It is clearly observed in Fig. 9 that most of the gain follows from the usage of an accurate differentiable approximation of the continuous-to-discrete quantization mapping, which allows the system to be trained in an end-to-end manner while faithfully representing quantization. The gains due to optimizing the decision regions are rather small, indicating that our proposed approach can also lead to substantial improvements when restricted to using uniform scalar quantizers.
The results in Figs. 7-8 also demonstrate the improved robustness to inaccurate CSI. The performance of the model-based MAP detector is very sensitive to CSI uncertainty, resulting in a notable increase in BER due to the model mismatch. However, the performance of the deep task-based quantizer trained under CSI uncertainty is within a small SNR gap of its achievable performance when trained using accurate CSI. Furthermore, the deep task-based quantizer with CSI uncertainty substantially outperforms the MAP rule without quantization constraints with the same level of uncertainty for all considered scenarios, and outperforms the quantized MAP with accurate CSI at the lower quantization rate. This demonstrates the gains of using DNNs, with their established generalization properties, for overcoming the sensitivity of model-based approaches to inaccurate knowledge of the underlying parameters.
Next, we evaluate the BER of the considered quantization systems versus the quantization rate R. The results are depicted in Figs. 10-11 for two SNR values. Observing Figs. 10-11, we note that the gain of the proposed deep task-based quantizer is most dominant when operating with low quantization rates. As the quantization rate approaches three bits per channel input, the BER of applying the MAP rule in the digital domain via (10) is within only a small gap of that of the hybrid quantizer with soft-to-hard training. However, for lower quantization rates, as well as in the presence of CSI uncertainty, the proposed deep task-based quantizer maintains the superiority observed in Figs. 7-8. Furthermore, it is noted that when using the passing-gradient training approach, there is a very small gap between the performance achievable with and without CSI uncertainty. This observation is likely due to the fact that when modeling quantization as additive independent noise during training, the network is trained on a mismatched model, regardless of whether the training samples are taken from the same distribution as the test samples. Consequently, such data-driven quantizers operate under some level of uncertainty even when trained using an optimal training set.
Finally, we note that the DNNs used in this subsection were trained using a relatively small training set. This indicates that such architectures can be used to realize an online-trainable dynamic ADC, as discussed in Subsection III-C.
V Conclusions
In this work we designed a data-driven task-based quantization system, operating with scalar ADCs, using DNNs. We proposed a method for handling the non-differentiability of quantization by approximating its mapping as a smooth function. Our proposed model faithfully represents such continuous-to-discrete mappings while making it possible to learn the quantization rule from training data. We discussed how this strategy can be used for designing dynamic machine-learning-based ADCs for various tasks. Our numerical results, which considered channel estimation and symbol recovery in bit-constrained MIMO systems, demonstrate that the performance achievable with the proposed deep task-based quantizer is comparable with the fundamental limits for this setup, which are achievable using optimal vector quantizers. Furthermore, we showed that our soft-to-hard method for training the network in an end-to-end fashion allows the system to be accurately trained with a relatively small training set, and that it notably outperforms the common approach for training DNNs with intermediate quantization.
References
 [1] R. M. Gray and D. L. Neuhoff. “Quantization”. IEEE Trans. Inform. Theory, vol. 44, no. 6, Oct. 1998, pp. 2325–2383.
 [2] T. Berger and J. D. Gibson. “Lossy source coding”. IEEE Trans. Inform. Theory, vol. 44, no. 6, Oct. 1998, pp. 2693–2723.
 [3] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley Press, 2006.
 [4] M. R. D. Rodrigues, N. Deligiannis, L. Lai, and Y. C. Eldar. “Rate-distortion trade-offs in acquisition of signal parameters”. Proc. IEEE ICASSP, New Orleans, LA, Mar. 2017, pp. 6105–6109.
 [5] N. Shlezinger, Y. C. Eldar, and M. R. D. Rodrigues. “Hardware-limited task-based quantization”. IEEE Trans. Signal Process., early access, 2019.
 [6] N. Shlezinger, Y. C. Eldar, and M. R. D. Rodrigues. “Asymptotic task-based quantization with application to massive MIMO”. IEEE Trans. Signal Process., vol. 67, no. 15, Aug. 2019, pp. 3995–4012.
 [7] S. Salamatian, N. Shlezinger, Y. C. Eldar, and M. Medard. “Task-based quantization for recovering quadratic functions using principal inertia components”. Proc. IEEE ISIT, Paris, France, Jul. 2019.
 [8] Y. Polyanskiy and Y. Wu. Lecture Notes on Information Theory. 2015.
 [9] R. H. Walden. “Analog-to-digital converter survey and analysis”. IEEE J. Sel. Areas Commun., vol. 17, no. 4, Apr. 1999, pp. 539–550.
 [10] M. Xiao, S. Mumtaz, Y. Huang, L. Dai, Y. Li, M. Matthaiou, G. K. Karagiannidis, E. Bjornson, K. Yang, and I. Chih-Lin. “Millimeter wave communications for future mobile networks”. IEEE J. Sel. Areas Commun., vol. 35, no. 9, Sep. 2017, pp. 1909–1935.
 [11] J. Mo, P. Schniter, and R. W. Heath. “Channel estimation in broadband millimeter wave MIMO systems with few-bit ADCs”. IEEE Trans. Signal Process., vol. 66, no. 5, Mar. 2018, pp. 1141–1154.
 [12] Y. Li, C. Tao, G. Seco-Granados, A. Mezghani, A. L. Swindlehurst, and L. Liu. “Channel estimation and performance analysis of one-bit massive MIMO systems”. IEEE Trans. Signal Process., vol. 65, no. 15, Aug. 2017, pp. 4075–4089.
 [13] J. Choi, J. Mo, and R. W. Heath. “Near maximum-likelihood detector and channel estimator for uplink multiuser massive MIMO systems with one-bit ADCs”. IEEE Trans. Commun., vol. 64, no. 5, May 2016, pp. 2005–2018.
 [14] S. Jacobsson, G. Durisi, M. Coldrey, U. Gustavsson, and C. Studer. “Throughput analysis of massive MIMO uplink with low-resolution ADCs”. IEEE Trans. Wireless Commun., vol. 16, no. 6, Jun. 2017, pp. 4038–4051.
 [15] H. Pirzadeh and A. L. Swindlehurst. “Spectral efficiency of mixed-ADC massive MIMO”. IEEE Trans. Signal Process., vol. 66, no. 13, Jul. 2018, pp. 3599–3613.
 [16] C. Mollen, J. Choi, E. G. Larsson, and R. W. Heath. “Uplink performance of wideband massive MIMO with one-bit ADCs”. IEEE Trans. Wireless Commun., vol. 16, no. 1, Jan. 2017, pp. 87–100.
 [17] C. Studer and G. Durisi. “Quantized massive MU-MIMO-OFDM uplink”. IEEE Trans. Commun., vol. 64, no. 6, Jun. 2016, pp. 2387–2399.
 [18] S. Stein and Y. C. Eldar. “A family of hybrid analog-digital beamforming methods for massive MIMO systems”. IEEE Trans. Signal Process., vol. 67, no. 12, Jun. 2019, pp. 3243–3257.
 [19] J. Mo, A. Alkhateeb, S. Abu-Surra, and R. W. Heath. “Hybrid architectures with few-bit ADC receivers: Achievable rates and energy-rate tradeoffs”. IEEE Trans. Wireless Commun., vol. 16, no. 4, Apr. 2017, pp. 2274–2287.
 [20] E. Agustsson, F. Mentzer, M. Tschannen, L. Cavigelli, R. Timofte, L. Benini, and L. van Gool. “Soft-to-hard vector quantization for end-to-end learning compressible representations”. Proc. NIPS, Long Beach, CA, 2017, pp. 1141–1151.

 [21] G. Toderici, D. Vincent, N. Johnston, S. J. Hwang, D. Minnen, J. Shor, and M. Covell. “Full resolution image compression with recurrent neural networks”. Proc. IEEE CVPR, Honolulu, HI, 2017.
 [22] J. Balle, V. Laparra, and E. P. Simoncelli. “End-to-end optimized image compression”. arXiv preprint, arXiv:1611.01704, 2016.
 [23] J. Balle, D. Minnen, S. Singh, S. J. Hwang, and N. Johnston. “Variational image compression with a scale hyperprior”. arXiv preprint, arXiv:1802.01436, 2018.
 [24] N. Johnston, D. Vincent, D. Minnen, M. Covell, S. Singh, T. Chinen, S. J. Hwang, J. Shor, and G. Toderici. “Improved lossy image compression with priming and spatially adaptive bit rates for recurrent networks”. arXiv preprint, arXiv:1703.10114, 2017.
 [25] S. Han, H. Mao, and W. J. Dally. “Compressing deep neural networks with pruning, trained quantization and Huffman coding”. arXiv preprint, arXiv:1510.00149, 2015.
 [26] K. Ullrich, E. Meeds, and M. Welling. “Soft weight-sharing for neural network compression”. arXiv preprint, arXiv:1702.04008, 2017.
 [27] I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, and Y. Bengio. “Quantized neural networks: Training neural networks with low precision weights and activations”. Journal of Machine Learning Research, vol. 18, no. 187, Apr. 2018, pp. 1–30.
 [28] Q. Yang, M. B. Mashhadi, and D. Gunduz. “Deep convolutional compression for massive MIMO CSI feedback”. arXiv preprint, arXiv:1907.02942, 2019.
 [29] S. Khobahi, N. Naimipour, M. Soltanalian, and Y. C. Eldar. “Deep signal recovery with one-bit quantization”. arXiv preprint, arXiv:1812.00797, 2018.
 [30] E. Balevi and J. G. Andrews. “One-bit OFDM receivers via deep learning”. IEEE Trans. Commun., vol. 67, no. 6, Jun. 2019, pp. 4326–4336.
 [31] J. Choi, Y. Cho, B. L. Evans, and A. Gatherer. “Robust learning-based ML detection for massive MIMO systems with one-bit quantized signals”. arXiv preprint, arXiv:1811.12645, 2018.
 [32] B. Widrow, I. Kollar, and M. C. Liu. “Statistical theory of quantization”. IEEE Trans. Instrumentation and Measurement, vol. 45, no. 2, Apr. 1996, pp. 353–361.
 [33] C. Mead. “Neuromorphic electronic systems”. Proc. IEEE, vol. 78, no. 10, Oct. 1990, pp. 1629–1636.
 [34] J. Li, N. Chaddha, and R. M. Gray. “Asymptotic performance of vector quantizers with a perceptual distortion measure”. IEEE Trans. Inform. Theory, vol. 45, no. 4, May 1999, pp. 1082–1091.
 [35] J. K. Wolf and J. Ziv. “Transmission of noisy information to a noisy receiver with minimum distortion”. IEEE Trans. Inform. Theory, vol. 16, no. 4, Jul. 1970, pp. 406–411.
 [36] Y. Liao, N. Farsad, N. Shlezinger, Y. C. Eldar, and A. Goldsmith. “Deep symbol detection for millimeter wave communications”. Proc. GLOBECOM, Waikoloa, HI, Dec. 2019.
 [37] R. A. Wannamaker, S. P. Lipshitz, J. Vanderkooy, and J. N. Wright. “A theory of nonsubtractive dither”. IEEE Trans. Signal Process., vol. 48, no. 2, Feb. 2000, pp. 499–516.
 [38] K. Rose, E. Gurewitz, and G. C. Fox. “Vector quantization by deterministic annealing”. IEEE Trans. Inform. Theory, vol. 38, no. 4, Apr. 1992, pp. 1249–1257.
 [39] L. Danial, N. Wainstein, S. Kraus, and S. Kvatinsky. “Breaking through the speed-power-accuracy tradeoff in ADCs using a memristive neuromorphic architecture”. IEEE Trans. Emerg. Topics Comput. Intell., vol. 2, no. 5, Oct. 2018, pp. 396–409.
 [40] N. Shlezinger, N. Farsad, Y. C. Eldar, and A. J. Goldsmith. “ViterbiNet: A deep learning based Viterbi algorithm for symbol detection”. arXiv preprint, arXiv:1905.10750, 2019.

 [41] B. Widrow and M. A. Lehr. “30 years of adaptive neural networks: perceptron, madaline, and backpropagation”. Proc. IEEE, vol. 78, no. 9, Sep. 1990, pp. 1415–1442.
 [42] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.