Energy Optimization of Faulty Quantized Min-Sum LDPC Decoders

08/26/2021 ∙ by Mohamed Yaoumi, et al. ∙ Corporation de l'École Polytechnique de Montréal

The objective of this paper is to minimize the energy consumption of a quantized Min-Sum LDPC decoder, by considering aggressive voltage downscaling of the decoder circuit. Since low power supply may introduce faults in the memories used by the decoder architecture, this paper proposes to optimize the energy consumption of the faulty Min-Sum decoder while satisfying a given performance criterion. The proposed optimization method relies on a coordinate descent algorithm that optimizes code and decoder parameters which have a strong influence on the decoder energy consumption: codeword length, number of quantization bits, and failure probability of the memories. Optimal parameter values are provided for several codes defined by their protographs, and significant energy gains are observed compared to non-optimized setups.


I Introduction

Energy consumption is an important criterion in the design of electronic circuits, and can be greatly reduced by aggressive voltage scaling of the circuit. Low power supply may however introduce faults in the computation operations and memories of the circuit [gupta2012underdesigned]. In this paper, we address this issue in the area of channel coding, and more specifically for Low Density Parity Check (LDPC) codes. The objective is to find the best compromise between decoder circuit energy consumption and LDPC decoding performance under circuit faults.

Two energy consumption models are provided in [ganesan2015towards] for non-faulty LDPC decoders: the first model estimates the decoding complexity, while the second evaluates the wire length in the circuit. Then, [nguyen2016non] introduces a method to minimize the alphabet size of the quantized messages exchanged in the decoder, aiming to lower the memory energy consumption. Finally, [smith2010design] proposes to optimize the code rate and the irregular code degree distribution in order to minimize the decoder complexity, and therefore its energy consumption.

In addition, the performance of LDPC decoders implemented on faulty hardware was widely studied in the literature. In [huang2013gallager], the authors assume that the LDPC decoder is subject to both transient and permanent errors. Transient errors make faulty gates or memory units provide an erroneous output from time to time with a non-zero probability. Permanent errors make a fraction of the gates and memories stuck at the same output. When dealing with energy consumption issues, we consider process diversity strategies, where the permanent errors turn into transient errors [leduc2018modeling]. The authors in [varshney2011performance, balatsoukas2014density, ngassa2015density, leduc2018modeling] theoretically investigate the effect of transient errors on various LDPC decoders, such as Gallager A and B or quantized Min-Sum. However, none of these works relate the amount of faults introduced in the decoder to its energy consumption.

In this work, our objective is to minimize the energy consumption of a faulty LDPC decoder while satisfying a given performance criterion. For this, we consider protograph-based LDPC codes and quantized Min-Sum decoders for their easy hardware implementation [karkooti2004semi]. In [marchand2011architecture], it is shown that memories represent the largest part, in terms of area and energy consumption, of LDPC decoder circuits. Therefore, in this paper, we assume that the decoder faults are introduced in the memory units, and we consider the noise model introduced in [chatterjee2016energy], which relates the noise in the stored bits to the energy consumption of a memory cell.

To estimate the LDPC decoder energy consumption, we update the non-faulty memory energy model of [yaoumi2019optimization], in order to apply it to faulty decoders. This energy model depends on several code and decoder parameters, such as the protograph, the noise level, the number of quantization bits for the messages, the codeword length, and the number of iterations performed by the decoder. In order to properly evaluate the proposed energy model, we consider the method of [yaoumi2020energy], which relies on Density Evolution (DE) to estimate the average number of decoder iterations required for a given codeword length. Then, protograph optimization being a difficult problem in itself [yaoumi2019optimization], we consider a fixed protograph, and we propose a method to optimize both the codeword length and the decoder parameters (number of quantization bits and noise level) in order to minimize the decoder energy consumption. This method is based on a coordinate descent algorithm that successively optimizes each parameter, assuming that the other ones are fixed, and repeats the process over several iterations. Simulation results provide the values of optimized parameters for several protographs, and show the energy gains compared to non-optimized decoders.

II LDPC codes and decoders

We consider a codeword of length $N$ to be transmitted over an additive white Gaussian noise (AWGN) channel of variance $\sigma^2$, with binary phase-shift keying (BPSK) modulation. We use $y_i$ to denote the $i$-th channel output, and $x_i$ to denote the $i$-th modulated coded bit. The channel Signal-to-Noise Ratio (SNR), denoted $\xi$ in the following, is defined from the noise variance $\sigma^2$ and expressed in dB. In this section, we introduce our notation for protograph-based LDPC codes, and describe the considered faulty quantized Min-Sum decoder.

II-A Protograph-based LDPC codes

LDPC codes are represented by a sparse parity-check matrix $H$ of size $M \times N$, where $H_{j,i}$ is the coefficient in the $j$-th row and $i$-th column of $H$. Assuming that $H$ is full rank, the code rate is $r = K/N$, where $K = N - M$ is the information length. The matrix $H$ can also be represented by a Tanner graph with $N$ variable nodes and $M$ check nodes. The $i$-th Variable Node (VN) and the $j$-th Check Node (CN) are connected in the Tanner graph if $H_{j,i} = 1$. We use $\mathcal{V}(j)$ to denote the set of all VNs connected to CN $j$, and we use $\mathcal{C}(i)$ to denote the set of all CNs connected to VN $i$.

In this work, we consider LDPC codes constructed from protographs [fang2015survey]. A protograph $S$ is a matrix of size $m \times n$ that gives the number of connections between each VN and CN in the reduced Tanner graph representing the protograph. We can construct an LDPC code of length $N = Zn$ by first copying the protograph $Z$ times, where $Z$ is called the lifting factor, and then by interleaving the edges to get the parity-check matrix $H$.
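To illustrate the copy-and-permute construction, here is a minimal Python sketch, assuming a hypothetical 2x4 protograph and uniformly random edge permutations; practical constructions use structured (e.g., quasi-cyclic) permutations chosen to avoid short cycles, and this sketch ignores possible collisions between parallel edges:

```python
import numpy as np

def lift_protograph(S, Z, seed=0):
    """Expand an m x n protograph S (entries = numbers of parallel edges)
    into a binary parity-check matrix H of size (m*Z) x (n*Z)."""
    rng = np.random.default_rng(seed)
    m, n = S.shape
    H = np.zeros((m * Z, n * Z), dtype=np.uint8)
    for j in range(m):
        for i in range(n):
            for _ in range(S[j, i]):
                # replace each protograph edge by Z edges along a permutation
                perm = rng.permutation(Z)
                H[j * Z + np.arange(Z), i * Z + perm] = 1
    return H

# Hypothetical 2 x 4 protograph of rate 1/2 (illustrative values)
S = np.array([[1, 2, 1, 1],
              [2, 1, 1, 1]])
H = lift_protograph(S, Z=8)  # parity-check matrix of a length N = 32 code
```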

II-B Faulty quantized Min-Sum decoder

In this paper, we consider a quantized offset Min-Sum decoder [balatsoukas2014density, ngassa2015density, leduc2018modeling], implemented with the architecture proposed in [dupraz2018low]. For simplicity, no pipeline stages are considered, which corresponds to a row-layered scheduling. This enables the use of fewer decoding iterations and reduces the size of the circuit. The decoder messages are quantized on $q$ bits and take values between $-Q$ and $Q$, where $Q = 2^{q-1} - 1$. We consider the following quantization function:

$\mathcal{Q}(x) = \operatorname{sign}(x) \min\left( \lfloor |x| + 1/2 \rfloor, Q \right)$    (1)

where $\operatorname{sign}(x) = 1$ if $x \geq 0$, and $\operatorname{sign}(x) = -1$ if $x < 0$.
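A direct transcription of (1) as reconstructed above (the saturated-rounding convention is an assumption):

```python
import numpy as np

def quantize(x, q):
    """Saturated uniform quantizer of (1): q-bit messages taking
    integer values in [-Q, Q], with Q = 2^(q-1) - 1."""
    Q = 2 ** (q - 1) - 1
    sign = np.where(np.asarray(x) >= 0, 1, -1)
    return sign * np.minimum(np.floor(np.abs(x) + 0.5), Q)
```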

Since memory units are responsible for a large part of the decoder energy consumption [marchand2011architecture], we assume that faults are introduced during memory read operations in the faulty decoder. The error model corresponds to a bit-wise XOR operation $\oplus$ between the memory read port output and a noise term $e$. This noise term can be represented on $q$ bits as $e = (e^{(1)}, \ldots, e^{(q)})$. We assume that the random variables $e^{(k)}$ are independent and identically distributed, and that each $e^{(k)}$ follows a Bernoulli distribution with parameter $p$.
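A minimal sketch of this read-noise model, assuming a sign-magnitude mapping between message values and the $q$ stored bits (the actual bit mapping is an implementation detail not specified here):

```python
import numpy as np

def read_noisy(value, q, p, rng):
    """Model a faulty memory read: XOR each of the q stored bits with an
    i.i.d. Bernoulli(p) noise bit. Sign-magnitude encoding is assumed."""
    Q = 2 ** (q - 1) - 1
    v = int(value)
    stored = (abs(v) & Q) | ((1 << (q - 1)) if v < 0 else 0)
    flips = rng.random(q) < p                    # q i.i.d. Bernoulli(p) draws
    noise = int((flips.astype(np.int64) << np.arange(q)).sum())
    corrupted = stored ^ noise
    sign = -1 if (corrupted >> (q - 1)) & 1 else 1
    return sign * (corrupted & Q)
```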

In order to initialize the decoder, we compute a quantized log-likelihood ratio (LLR) $r_i$ for each received value $y_i$ as:

$r_i = \mathcal{Q}(\alpha y_i)$    (2)

where $\alpha$ is a scaling parameter. In the architecture proposed in [dupraz2018low], the VN messages at iteration $\ell$ are updated as

$\gamma_i^{(\ell)} = r_i + \sum_{j' \in \mathcal{C}(i)} \left( \beta_{j' \to i}^{(\ell)} \oplus e_{j' \to i}^{(\ell)} \right)$    (3)

where $\beta_{j \to i}^{(\ell)}$ represents the message sent from CN $j$ to VN $i$ at iteration $\ell$, and $e_{j \to i}^{(\ell)}$ is the noise introduced when the messages are read from their dedicated memory. The messages $\gamma_i^{(\ell)}$ are quantized on $\tilde{q}$ bits, with $\tilde{q} > q$, in order to avoid any saturation issue when writing the message into the memory. In the considered architecture, the message $\alpha_{i \to j}^{(\ell)}$ sent from VN $i$ to CN $j$ at iteration $\ell$ is calculated during the CN update, which is as follows:

$\alpha_{i \to j}^{(\ell)} = \left( \gamma_i^{(\ell-1)} \oplus \tilde{e}_i^{(\ell)} \right) - \beta_{j \to i}^{(\ell-1)}$    (4)
$\beta_{j \to i}^{(\ell)} = \left( \prod_{i' \in \mathcal{V}(j) \setminus \{i\}} \operatorname{sign}\left( \alpha_{i' \to j}^{(\ell)} \right) \right) \max\left( \min_{i' \in \mathcal{V}(j) \setminus \{i\}} \left| \alpha_{i' \to j}^{(\ell)} \right| - \lambda, 0 \right)$    (5)

where $\lambda$ is an offset parameter, and where $\tilde{e}_i^{(\ell)}$ represents the noise introduced when reading the memory where the variable-node messages $\gamma_i$ are stored. The decoder stops when a stopping criterion is satisfied, or when the maximum number of iterations $L$ is reached.
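To make the update rules concrete, the following non-optimized sketch performs one decoding iteration in flooding form, reusing the quantize and read_noisy helpers sketched above; q_t stands for the widened width $\tilde{q}$ (its exact value is an architectural assumption), and the row-layered architecture of [dupraz2018low] would interleave these steps CN row by CN row:

```python
def minsum_iteration(r, beta, rows, q, q_t, p, lam, rng):
    """One faulty offset Min-Sum iteration (flooding form, for readability).
    r[i]: quantized channel LLR; beta[j][i]: CN-to-VN message from the
    previous iteration; rows[j]: VNs connected to CN j (degree >= 2 assumed);
    lam: offset parameter."""
    # VN update (3): total sums, with noisy reads of the CN-to-VN messages
    gamma = list(r)
    for j, vns in enumerate(rows):
        for i in vns:
            gamma[i] += read_noisy(beta[j][i], q, p, rng)
    gamma = [int(quantize(g, q_t)) for g in gamma]  # stored on q_t > q bits
    # CN update (4)-(5): extrinsic inputs use noisy reads of the VN memory
    new_beta = [dict() for _ in rows]
    for j, vns in enumerate(rows):
        alpha = {i: read_noisy(gamma[i], q_t, p, rng) - beta[j][i]
                 for i in vns}                                        # (4)
        for i in vns:
            others = [alpha[k] for k in vns if k != i]
            sgn = 1
            for a in others:                 # sign(0) = +1, as in (1)
                sgn = -sgn if a < 0 else sgn
            mag = max(min(abs(a) for a in others) - lam, 0)
            new_beta[j][i] = int(quantize(sgn * mag, q))              # (5)
    return gamma, new_beta
```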

III Finite-length performance evaluation

DE [richardson2001design, richardson2002multi] makes it possible to estimate the error probability of an LDPC decoder, for a given protograph, at a given SNR $\xi$ and iteration number $\ell$. However, DE computes this error probability under the assumption that the codeword length tends to infinity. As an alternative, [leduc2016finite] provides a method to estimate the error probability of an LDPC decoder at finite length $N$. This method estimates the error probability as

$P_e(N, \xi) = \int_{-\infty}^{+\infty} p_e(s)\, \phi(s; \xi, \sigma_N^2)\, ds$    (6)

In this expression, $\sigma_N^2$ is inversely proportional to the codeword length $N$, and $p_e(s)$ is the error probability evaluated with standard DE at SNR value $s$. The function $\phi(\,\cdot\,; \xi, \sigma_N^2)$ is the probability density function of a Gaussian random variable with mean $\xi$ and variance $\sigma_N^2$.

For simplicity, we implemented the DE equations by considering a flooding scheduling. Following [sharon2007efficient], we empirically obtained the same error probabilities as with a row-layered scheduling, provided that the number of iterations is doubled. In addition, the DE equations were derived by considering that the memory faults are introduced after computation of the check-to-variable messages $\beta_{j \to i}$, as in [balatsoukas2014density]. This slightly differs from the hardware decoder of [dupraz2018low] described in Section II, where the faults are introduced when the VN messages are read. However, we observed through simulations a negligible difference in the obtained error probabilities.

Then, in order to evaluate the decoder energy consumption, we need to estimate the number of iterations required by the decoder. Therefore, we use the method of [yaoumi2020energy] to evaluate the average number of iterations at codeword length $N$:

$\bar{L}(N, \xi) = \int_{-\infty}^{+\infty} L(s)\, \phi(s; \xi, \sigma_N^2)\, ds$    (7)

where $L(s)$ is the number of iterations needed by standard DE to converge at SNR value $s$. As for the error probability defined in (6), the expression of $\bar{L}(N, \xi)$ takes into account the channel variability, but does not evaluate the effect of cycles on the decoder performance. However, as shown in [leduc2016finite, yaoumi2019optimization], these two formulas accurately predict the finite-length decoder performance for long codewords.
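Numerically, (6) and (7) amount to integrating a precomputed DE curve against a Gaussian density. A sketch, where the DE quantity is given on a grid of SNR values and the proportionality constant v in sigma_N^2 = v/N is an assumed placeholder:

```python
import numpy as np

def finite_length_average(f_grid, snr_grid, xi, N, v=1.0):
    """Numerically integrate a DE quantity f(s) against the Gaussian
    density phi(s; xi, v/N), as in (6) and (7). f_grid: f evaluated on
    snr_grid; v: assumed proportionality constant for sigma_N^2."""
    var = v / N
    phi = np.exp(-(snr_grid - xi) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    return np.trapz(f_grid * phi, snr_grid)

# Usage: Pe   = finite_length_average(pe_de, snrs, xi=2.0, N=10000)
#        Lavg = finite_length_average(L_de, snrs, xi=2.0, N=10000)
```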

IV Energy Model

This section introduces the memory energy model we consider for the faulty Min-Sum decoder described in Section II.

IV-A Faults-vs-energy model

We consider the generic faults-vs-energy model introduced in [chatterjee2016energy], which relates the memory failure parameter $p$ to the energy level $e$ as

$p = c_1 \exp(-c_2\, e)$    (8)

where $c_1$ and $c_2$ are positive constants that depend on the circuit technology. In order to specify this model in our setup, we express the energy $e$ of writing one bit in memory as

$e = \eta\, E_{\text{nom}}$    (9)

where $\eta$ takes values in $[0, 1]$, and $E_{\text{nom}}$ is referred to as the nominal energy. Therefore, $\eta = 0$ means that the device does not consume any energy, and $\eta = 1$ means that the device operates with nominal energy $E_{\text{nom}}$.

We now give an example of how to set the parameters $c_1$, $c_2$, and $E_{\text{nom}}$, depending on the technology. First, for $\eta = 0$ we consider a failure probability of $p = 1/2$, since a memory cell written with zero energy returns a uniformly random bit, which gives $c_1 = 1/2$. Then, for a typical SRAM cell, the failure probability at nominal voltage reported in [dreslinski2010near] fixes the value of $c_2$. The value of $E_{\text{nom}}$ can be estimated based on [horowitz:2014], where $10$ pJ is the storage energy for a $64$-bit access from an $8$ KB cache, which gives $E_{\text{nom}} \approx 0.16$ pJ per bit. These values will be considered in our simulations.
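As a small numerical sketch of this calibration, with the nominal-voltage failure probability P_NOM taken as a hypothetical placeholder rather than a value from the paper:

```python
import numpy as np

E_NOM = 10.0 / 64.0  # pJ per bit, from the 8 KB cache figure of [horowitz:2014]
C1 = 0.5             # p = 1/2 at zero energy (eta = 0)
P_NOM = 1e-9         # hypothetical failure probability at eta = 1 (assumption)
C2 = -np.log(P_NOM / C1) / E_NOM  # solve p = c1*exp(-c2*e) at e = E_NOM

def failure_prob(eta):
    """Failure parameter p as a function of the energy fraction eta in [0,1]."""
    return C1 * np.exp(-C2 * eta * E_NOM)

def energy_fraction(p):
    """Inverse map: energy fraction eta needed to reach failure parameter p."""
    return -np.log(p / C1) / (C2 * E_NOM)
```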

IV-B Memory energy model

The energy model proposed in [yaoumi2019optimization] estimates the overall memory energy consumption of a non-faulty quantized Min-Sum decoder by counting the total number of bits written into memory during the decoding process. In [yaoumi2019optimization], the total number of bits written in memory is evaluated from the facts that: (i) at a VN, the message $\gamma_i$ is stored on $\tilde{q}$ bits, (ii) since we are using a row-layered scheduling, a VN updates its messages every time one of its neighboring check nodes is updated, (iii) at a CN, one bit is stored for the sign of each output message, and two minimum absolute values of $q-1$ bits each are stored.

Here, we consider the memory energy consumption per information bit, in order to properly capture the effect of an increased codeword length $N$. As a result, the following energy model will be considered in the optimization:

$E(q, p, N) = \frac{E_{\text{tot}}}{K} = \frac{\bar{L}(N, \xi)\, N_b\, e}{K}$    (10)

where $E_{\text{tot}}$ is the total memory energy consumption of the decoder, $\bar{L}(N, \xi)$ is the average number of iterations given in (7) for codeword length $N$, $N_b$ is the number of bits written into memory per iteration, obtained from the counts (i)-(iii), and the energy $e$ per written bit can be expressed with respect to the failure probability $p$ from (8) and (9).
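Putting the pieces together, a sketch of (10) following the bit counts (i)-(iii); it reuses E_NOM and energy_fraction from the previous sketch, and the exact accounting of [yaoumi2019optimization] may differ in detail:

```python
def decoder_energy(q, q_t, p, N, K, n_edges, n_checks, L_avg):
    """Memory energy per information bit, following (10).
    n_edges/n_checks: edge and CN counts of the length-N Tanner graph;
    q_t: widened VN storage width (q_t > q); L_avg: average iterations (7)."""
    # (i)-(ii): each edge update rewrites a q_t-bit VN message per iteration
    vn_bits = n_edges * q_t
    # (iii): one sign bit per edge, plus two (q-1)-bit minima per CN
    cn_bits = n_edges + 2 * (q - 1) * n_checks
    N_b = vn_bits + cn_bits                # bits written per iteration
    e_bit = energy_fraction(p) * E_NOM     # pJ per written bit, from (8)-(9)
    return L_avg * N_b * e_bit / K         # eq. (10)
```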

V Energy Optimization

We now propose an optimization method to minimize the decoder energy consumption while satisfying a certain performance criterion.

V-A Optimization problem

As a performance criterion for the optimization, we fix a target error probability $P_e^{(t)}$ to be reached at a target SNR value $\xi_t$. For simplicity, we assume that the code rate $r$ and the protograph $S$ are fixed. We propose to minimize the energy consumption with respect to the quantization level $q$, the noise parameter $p$, and the codeword length $N$, while satisfying the performance criterion. The optimization problem can then be formulated as

$\min_{q, p, N} E(q, p, N) \quad \text{s.t.} \quad P_e^{\star}(q, p, N) \leq P_e^{(t)}$    (11)

where $P_e^{\star}(q, p, N) = \min_{\alpha, \lambda} P_e(N, \xi_t)$. In the above optimization problem, $E(q, p, N)$ is given by (10) and $P_e(N, \xi_t)$ is calculated from (6). In addition, $P_e^{\star}(q, p, N)$ gives the minimum error probability that can be reached by optimizing the scaling parameter $\alpha$ and the offset parameter $\lambda$.

V-B Optimization method

The optimization problem (11) is difficult to solve because it involves the discrete parameters $q$ and $N$. In addition, it is computationally expensive to evaluate the average number of iterations $\bar{L}(N, \xi)$ and the error probability $P_e(N, \xi_t)$ for given parameters $(q, p, N)$, because this requires numerically evaluating integrals. Therefore, we want to lower the number of evaluations of these terms.

In order to solve the optimization problem (11), we first define search intervals for the parameters involved in the optimization. First, according to Section IV, the continuous parameter $p$ lies in the interval $[p_{\text{nom}}, 1/2]$, where $p_{\text{nom}}$ is the failure probability at nominal energy ($\eta = 1$). Then, we assume that the discrete parameters $q$ and $N$ take values in the sets $\{q_{\min}, \ldots, q_{\max}\}$ and $\{N_{\min}, \ldots, N_{\max}\}$, respectively. The range of values for $q$ and $N$ must be selected so as to satisfy the performance criterion at least for the largest values $q_{\max}$ and $N_{\max}$. For instance, in our simulations, $q_{\max}$ was set such that the performance of the quantized Min-Sum decoder is almost the same as the performance of the non-quantized decoder.

Once the search intervals are set, we perform a coordinate descent optimization, which consists of alternately optimizing each of the three parameters $q$, $p$, and $N$, over several iterations. Since we consider a constrained optimization problem, we verify at each iteration that the selected parameters meet the performance criterion of the optimization problem. For this reason, we first initialize our algorithm with the three parameters $q^{(0)}$, $p^{(0)}$, and $N^{(0)}$. Then, at iteration $k$, we successively solve the following three optimization problems:

  1. Given $p^{(k)}$ and $N^{(k)}$, solve

    $q^{(k+1)} = \arg\min_{q} E(q, p^{(k)}, N^{(k)}) \quad \text{s.t.} \quad P_e^{\star}(q, p^{(k)}, N^{(k)}) \leq P_e^{(t)}$    (12)
  2. Given $q^{(k+1)}$ and $N^{(k)}$, solve

    $p^{(k+1)} = \arg\min_{p} E(q^{(k+1)}, p, N^{(k)}) \quad \text{s.t.} \quad P_e^{\star}(q^{(k+1)}, p, N^{(k)}) \leq P_e^{(t)}$    (13)
  3. Given $q^{(k+1)}$ and $p^{(k+1)}$, solve

    $N^{(k+1)} = \arg\min_{N} E(q^{(k+1)}, p^{(k+1)}, N) \quad \text{s.t.} \quad P_e^{\star}(q^{(k+1)}, p^{(k+1)}, N) \leq P_e^{(t)}$    (14)

In (12), the parameter $q$ is optimized by exhaustive search, since the search interval is small. Then, for the optimization of $p$ and $N$ in (13) and (14), we retain the parameters that satisfy the performance criterion and minimize the energy among a certain number of values between $p_{\text{nom}}$ and $1/2$, and between $N_{\min}$ and $N_{\max}$, respectively. To further reduce the computation time, we first evaluate the performance criterion $P_e^{\star}$, and then evaluate the corresponding energy only if the performance criterion is satisfied. A sketch of the resulting procedure is given below.
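The sketch assumes helper functions energy(q, p, N) and pe_star(q, p, N) that implement (10) and the constraint of (11); grids, initial values, and iteration count are illustrative:

```python
import numpy as np

def coordinate_descent(energy, pe_star, pe_target, q_grid, p_grid, n_grid,
                       q0, p0, n0, n_iters=5):
    """Alternately minimize energy over q, p, N under pe_star <= pe_target,
    as in (12)-(14). The constraint is checked before the costly energy call."""
    q, p, N = q0, p0, n0
    for _ in range(n_iters):
        for name, grid in (("q", q_grid), ("p", p_grid), ("N", n_grid)):
            best, best_e = None, np.inf
            for v in grid:
                cand = {"q": (v, p, N), "p": (q, v, N), "N": (q, p, v)}[name]
                if pe_star(*cand) > pe_target:
                    continue  # skip energy evaluation if criterion violated
                e = energy(*cand)
                if e < best_e:
                    best, best_e = v, e
            if best is not None:  # keep previous value if no feasible point
                q, p, N = {"q": (best, p, N), "p": (q, best, N),
                           "N": (q, p, best)}[name]
    return q, p, N
```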

Finally, the coordinate descent approach guarantees that the energy criterion is reduced at each iteration. It also ensures that the final solution satisfies the performance criterion. However, there is a risk that the algorithm falls onto a local minimum. This issue is discussed in the next section, in which we evaluate through numerical simulations the proposed optimization method.

VI Numerical results

In this section, we consider four different protographs, denoted $S_1$ to $S_4$ and given in Table I, all with the same protograph size and code rate. The protographs $S_1$ and $S_2$ were constructed using a genetic algorithm called Differential Evolution [richardson2001design] that optimizes protographs for performance only. When applying Differential Evolution, the protographs were optimized by considering a large quantization level $q$, in order to get a performance very close to that of the non-quantized decoder. We also consider the protographs $S_3$ and $S_4$, which were obtained in [yaoumi2019optimization] by optimizing the decoder energy consumption.

For the four protographs, we set a target error probability $P_e^{(t)}$ to be achieved at a target SNR $\xi_t$. We then find the optimum parameters $q$, $p$, and $N$ from the method proposed in Section V. The optimal parameters are provided in Table I, along with the corresponding energy value $E$. In order to verify over these four protographs that the coordinate descent algorithm did not fall into a local optimum, we also solved the optimization problem (11) by exhaustive search, and found the same optimum values given in Table I for the three parameters $(q, p, N)$. We further observe that, for every protograph, the minimum energy is achieved for the same quantization level $q$, and that the optimal failure probabilities $p$ are close to each other, which roughly corresponds to using between 80% and 85% of the nominal energy $E_{\text{nom}}$. On the other hand, the optimal code length $N$ strongly depends on the considered protograph.

We also compare the obtained minimum energy values with two setups in which only a part of the parameters is optimized. In Table I, $E_N$ gives the minimum energy value when only $N$ is optimized, with $q = q_{\max}$ and nominal energy ($\eta = 1$). And $E_{N,q}$ gives the minimum energy value when both $N$ and $q$ are optimized, still at nominal energy. A significant energy gain is observed in the case where all the parameters are optimized.

TABLE I: Minimum energy value $E$ and optimal parameters for the four considered protographs. The left part of the table represents the case where the three parameters $(q, p, N)$ are optimized. The right part gives the energy values $E_N$ and $E_{N,q}$, when only $N$ is optimized, and when only $N$ and $q$ are optimized, respectively.
Fig. 1: Energy values $E$, $E_N$, and $E_{N,q}$, with respect to SNR, for the protograph $S_1$.
Fig. 2: BER with respect to SNR for the protograph $S_1$, evaluated from the finite-length DE method and from Monte Carlo simulations.

We now focus on the protograph $S_1$. Figure 1 gives the values of $E$, $E_N$, and $E_{N,q}$ with respect to SNR, evaluated from (10). As explained in Section II-B, the memory fault model considered for DE is simplified, and slightly differs from the hardware implementation. To validate the accuracy of the model, the energy is evaluated from the average number of iterations obtained through

  1. simulation of the decoder using the hardware memory fault model,

  2. simulation of the decoder using the simplified model,

  3. DE (equation (7)) using the simplified model.

In every considered case, we use the parameters $q$, $p$, and $N$ provided in Table I, which were optimized for the target SNR $\xi_t$. We see that although the optimization is performed at one single SNR value, the ordering of the energy curves is preserved at every SNR. Furthermore, the energy obtained with the simplified memory fault model shows a good match with the simulated models. The figure also confirms the energy gain obtained by optimizing the three parameters together.

Finally, for protograph $S_1$, Figure 2 shows the bit error rate (BER) with respect to SNR, evaluated both from the finite-length method of Section III and from Monte Carlo simulations. Again, the three sets of parameters leading to $E$, $E_N$, and $E_{N,q}$ are considered. We first observe that the finite-length method of Section III accurately predicts the decoder error probabilities, with only a small SNR gap. In addition, as expected, the case where the three parameters are jointly optimized shows a slightly degraded BER performance, since the energy minimization pushes the decoder toward the performance criterion. We conclude that optimizing the decoder parameters allows reducing the decoder energy consumption while respecting the decoding performance criterion.

VII Conclusion

In this paper, we introduced an energy model for faulty quantized Min-Sum decoders. We then proposed a method to optimize the number of quantization bits, the code length, and the failure probability, in order to minimize the energy consumption while satisfying a given decoding performance criterion. Simulation results show that using the optimal parameters greatly reduces the energy consumption while satisfying the performance criterion.

References