I. Introduction
Polar codes were proposed by Arıkan in [1] and achieve the capacity of binary-input, memoryless, output-symmetric (BMS) channels with low encoding and decoding complexity. Given $N$ independent copies of a BMS channel $W$, polarization occurs through channel combining and splitting, resulting in perfect bit channels or completely noisy bit channels as $N$ approaches infinity. The fraction of perfect bit channels is exactly the symmetric capacity of the underlying channel $W$. Polar codes transmit information bits through the perfect bit channels and fix the bits in the completely noisy channels. Accordingly, the bits transmitted through the completely noisy channels are called frozen bits.
The construction of polar codes (selecting the good bit channels from all bit channels) is presented in [1, 2, 3, 4, 5, 6, 7]. In [1], Arıkan proposes Monte Carlo simulations to sort the bit channels, with a complexity of $O(MN\log N)$ (where $M$ is the number of Monte Carlo iterations). In [2, 3], density evolution is used in the construction of polar codes. Since density evolution involves function convolutions, its precision is limited by the affordable complexity. Bit channel approximations are proposed in [4] ($\mu$ is a user-defined parameter that controls the output alphabet size at each approximation stage). In [7, 6, 5], the Gaussian approximation (GA) is used to construct polar codes for additive white Gaussian noise (AWGN) channels.
To achieve arbitrary code lengths and code rates, puncturing and shortening of polar codes are reported in [8, 9, 10, 11, 12, 13, 14]. In [8], a channel-independent puncturing procedure is proposed that involves the minimum stopping set of each bit. In [8], the punctured bits are unknown to the decoder, and this is therefore called the unknown puncturing type. The quasi-uniform puncturing (QUP) algorithm is proposed in [9], which simply punctures the bit-reversed versions of indices $1$ to $p$ ($p$ is the number of bits to be punctured). QUP is of the unknown puncturing type. Reordering the bit channels after puncturing with the GA method is proposed in [10], and selecting the punctured bits from the frozen positions is also proposed in [10]. The puncturing in [10] is also of the unknown type. Another type of puncturing, called known puncturing or shortening, is proposed in [12, 11, 13, 14]. The reversal quasi-uniform puncturing (RQUP) algorithm proposed in [11] simply punctures the bit-reversed versions of indices $N-p+1$ to $N$. The shortening in [12] is based on the column weights of the generator matrices. A low-complexity construction of shortened and punctured polar codes from a unified view is proposed in [13]. In [14], an optimization algorithm is proposed to find a shortening pattern and a set of frozen symbols for polar codes.
Regardless of the puncturing or shortening pattern, a reordering of bit channels is necessary when these operations are performed. In AWGN channels, GA can be used to reorder the bit channels when some of the coded bits are punctured or shortened. Puncturing and shortening are equivalent to the case in which the channel carrying the selected coded bit is no longer the original underlying channel $W$. As some of the underlying channels change, the bit channels constructed from them differ from the original bit channels without puncturing or shortening. Reordering these new bit channels is necessary to avoid a deterioration of performance. However, the GA method is only applicable to AWGN channels. New procedures are therefore needed when studying puncturing or shortening of polar codes. This is the motivation of the work in this paper.
To study the construction of polar codes in which some of the coded bits are punctured or shortened, we first generalize the problem by allowing the underlying channels to be independent BMS channels (not necessarily identical ones). For BEC channels, recursive equations are proposed in [15] to calculate the Bhattacharyya parameter of each bit channel. The construction complexity is the same as the original complexity in [1]. For other channel types, the general construction in this paper is based on Tal-Vardy's procedure in [4]. The symmetric property of polar codes, first stated in [1], is proven to hold in the new setting in which the underlying channels can differ. The degradation relationship (the foundation of Tal-Vardy's procedure) is also proven to hold. Based on this theoretical analysis, a modification of the Tal-Vardy algorithm [4] applicable to any BMS channels is proposed to reorder the bit channels when the underlying channels are independent BMS channels (which, again, could be different channels). For continuous-output channels such as AWGN channels, a conversion to BMS channels can be performed first, and the modified Tal-Vardy algorithm can then be applied, analogous to the Tal-Vardy algorithm itself. The general construction can therefore be applied to reorder the bit channels under puncturing or shortening. Depending on the puncturing type, the punctured channel must first be modeled equivalently. Then, the recursion for BEC channels or the modified Tal-Vardy procedure for all other channels can be applied to reorder the bit channels. Simulation results show that the reordering greatly improves the error performance of polar codes.
Utilizing the property that the beginning of the source bits are usually frozen bits ('0's, in other words), the encoding throughput can be improved. As the code length increases, the area of the encoder grows rapidly. Folding [16] is a technique to reduce the area by time-multiplexing the modules. By exploiting the structural similarity between polar encoding and the fast Fourier transform (FFT), [17] first applies the folding technique to polar encoding based on [18]. A folded systematic polar encoder is implemented in [19]. Moreover, [20] designs an auto-generation folded polar encoder, which can output the hardware code directly given the code length and the level of parallelism. By exploiting the structure of the puncturing mode, the current folded encoder can be pruned further. In this paper, a pruned folded polar encoder is proposed. It avoids the initial calculations involving the frozen '0' bits; therefore, the latency can be reduced significantly. Implementation results also prove the feasibility of the pruned encoder, which provides a throughput improvement.

The remainder of this paper is organized as follows. In Section II, we briefly introduce the basics of polar codes. The general construction based on Tal-Vardy's procedure is presented in Section III. Numerical results of applying the BEC construction and the general construction of Section III are provided in Section IV. Section V proposes a pruned folded polar encoder architecture and compares the results with the state of the art. The paper ends with concluding remarks.
II. Background on Polar Codes
II-A. Polarization Process
For a given BMS channel $W: \mathcal{X} \to \mathcal{Y}$, its input alphabet, output alphabet, and transition probabilities are $\mathcal{X}$, $\mathcal{Y}$, and $W(y|x)$, respectively, where $x \in \mathcal{X}$, $y \in \mathcal{Y}$, and $\mathcal{X} = \{0, 1\}$. Two parameters represent the quality of a BMS channel $W$: the symmetric capacity and the Bhattacharyya parameter. The symmetric capacity can be expressed as
$$I(W) = \sum_{y\in\mathcal{Y}}\sum_{x\in\mathcal{X}} \frac{1}{2} W(y|x) \log_2 \frac{W(y|x)}{\frac{1}{2}W(y|0)+\frac{1}{2}W(y|1)}. \quad (1)$$
The Bhattacharyya parameter is
$$Z(W) = \sum_{y\in\mathcal{Y}} \sqrt{W(y|0)\,W(y|1)}. \quad (2)$$
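As a concrete illustration, (1) and (2) can be evaluated numerically for any discrete-output BMS channel. The dictionary representation and function names below are our own; this is a minimal sketch, not part of the construction algorithms discussed later.

```python
from math import log2, sqrt

def capacity_and_bhattacharyya(W):
    """W: dict mapping output symbol y -> (W(y|0), W(y|1)).
    Returns (I(W), Z(W)) per equations (1) and (2)."""
    I = 0.0
    Z = 0.0
    for p0, p1 in W.values():
        q = 0.5 * (p0 + p1)          # output probability under uniform input
        for p in (p0, p1):
            if p > 0:
                I += 0.5 * p * log2(p / q)
        Z += sqrt(p0 * p1)
    return I, Z

def bsc(eps):
    """Binary symmetric channel with crossover probability eps."""
    return {0: (1 - eps, eps), 1: (eps, 1 - eps)}
```

For a noiseless BSC this gives $I = 1$, $Z = 0$; for crossover probability $1/2$ it gives $I = 0$, $Z = 1$, matching the extremes of the polarization picture.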
The matrix $G_N$ denotes the generator matrix: $G_N = B_N F^{\otimes n}$, where $N$ is the code length ($N = 2^n$), $B_N$ is the permutation matrix of the bit-reversal operation, $F = \left[\begin{smallmatrix} 1 & 0 \\ 1 & 1 \end{smallmatrix}\right]$, and $F^{\otimes n}$ denotes the $n$-th Kronecker power of $F$. The channel polarization is divided into two phases: channel combining and channel splitting. Channel combining refers to the combination of $N$ copies of a given BMS channel $W$ to produce a vector channel $W_N: \mathcal{X}^N \to \mathcal{Y}^N$, defined as
$$W_N(y_1^N \mid u_1^N) = W^N(y_1^N \mid u_1^N G_N). \quad (3)$$
Channel splitting splits $W_N$ back into a set of $N$ binary-input channels $W_N^{(i)}$, $1 \le i \le N$, defined as
$$W_N^{(i)}(y_1^N, u_1^{i-1} \mid u_i) = \sum_{u_{i+1}^N \in \mathcal{X}^{N-i}} \frac{1}{2^{N-1}} W_N(y_1^N \mid u_1^N). \quad (4)$$
The channel $W_N^{(i)}$ is called bit channel $i$, which indicates that it is the channel that bit $u_i$ experiences through the channel combining and splitting stages. Bit channel $i$ can be viewed as a BMS channel $W_N^{(i)}: \mathcal{X} \to \mathcal{Y}^N \times \mathcal{X}^{i-1}$.
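The encoding map $x_1^N = u_1^N G_N$ in (3) can be sketched directly from the definition $G_N = B_N F^{\otimes n}$. The helper names below are ours; this is a plain sketch, not an optimized encoder.

```python
import numpy as np

def polar_generator(n):
    """G_N = B_N F^{otimes n}, with B_N the bit-reversal permutation."""
    F = np.array([[1, 0], [1, 1]], dtype=int)
    G = np.array([[1]], dtype=int)
    for _ in range(n):
        G = np.kron(G, F)                      # build F^{otimes n}
    N = 1 << n
    # bit-reversal permutation of the row indices
    rev = [int(format(i, f'0{n}b')[::-1], 2) for i in range(N)]
    B = np.eye(N, dtype=int)[rev]
    return (B @ G) % 2

u = np.array([1, 0, 1, 1])                     # source vector u_1^4
x = u @ polar_generator(2) % 2                 # codeword x_1^4 = u G_4
```

In a real encoder the matrix product is replaced by the butterfly network of Fig. 1, but the matrix form is convenient for checking small cases.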
Polar codes can also be constructed recursively in a tree structure [1]. The tree structure is expanded fully in Fig. 1 for $N = 8$. There are eight independent and identical BMS channels $W$ at the right-hand side. In Fig. 1, from right to left, there are three levels: level one, level two, and level three, each containing $N/2 = 4$ Z-shapes. A Z-shape is the basic one-step transformation with the transition probabilities defined in (4) with $N = 2$. This one-step transformation converts two input channels into two output channels: the upper left channel and the lower left channel. For bit channel $i$ ($1 \le i \le N$), the binary expansion of $i-1$ is denoted as $b_1 b_2 \cdots b_n$ ($b_1$ being the MSB). The bit $b_j$ determines whether bit channel $i$ takes the upper left channel or the lower left channel of the Z-shape at the corresponding level: if $b_j = 0$, bit channel $i$ takes the upper left channel; otherwise, it takes the lower left channel. At level $\lambda$, there are $N/2^{\lambda}$ Z-shapes with the same input channels. For example, in Fig. 1, at level one, all Z-shapes have the same input channels $(W, W)$. At level two, there are two Z-shapes with the same input channels: the two dashed-line Z-shapes share one pair of input channels, and the two solid-line Z-shapes share another. The Z-shapes with the same input channels are grouped together at each level. Then, at level one, there is one group (containing four Z-shapes) sharing the same input channels. By contrast, at level two, there are two groups (each containing two Z-shapes) that share the same input channels (the dashed-line group and the solid-line group). At level three, all four Z-shapes have different input channels. To construct polar codes, the one-step transformation of the Z-shapes in the same group only needs to be calculated once [1, 4].
II-B. Motivation for the General Construction
The original code length of polar codes is limited to powers of two, i.e., $N = 2^n$. To obtain other code lengths, puncturing or shortening is typically performed. In the puncturing mode, some coded bits are punctured at the encoder, and the decoder has no a priori information about these bits. In the shortening mode, the values of the shortened coded bits are known by both the encoder and the decoder.
The code length of both the punctured and the shortened codes is denoted by $M$. Let $p$ denote the number of punctured (or shortened) bits, with $M = N - p$. The code rate of the punctured or shortened code is $R = K/M$, where $K$ is the number of information bits. In the punctured mode, the decoder has no a priori information on the punctured bits. A punctured channel $W_p$ of this type can be modeled as a BMS channel with $W_p(y|0) = W_p(y|1)$, since for the received symbol $y$, the likelihoods of $0$ and $1$ being transmitted are equal. The following lemma can be easily checked.
Lemma 1
For a punctured channel $W_p$ with $W_p(y|0) = W_p(y|1)$ (for all $y \in \mathcal{Y}$), the symmetric capacity of $W_p$ is $I(W_p) = 0$.
The proof of this lemma can be found in the Appendix.
For the shortened mode, the shortened bits are known to the decoder; the corresponding channel can be modeled via the following lemma.
Lemma 2
A shortened channel $W_s$ with shortened bits known to the receiver can be modeled as a binary symmetric channel (BSC) with a crossover probability of zero: $W_s(y|x) = 1$ for $y = x$ and $W_s(y|x) = 0$ for $y \ne x$. The capacity of a shortened channel is therefore $I(W_s) = 1$.
The focus of this lemma is the model of the shortened channel $W_s$. The capacity of a BSC with crossover probability zero is a well-known result [21].
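Both lemmas can be checked numerically. The sketch below evaluates $I(\cdot)$ and $Z(\cdot)$ for the two channel models; the representation and names are ours, and the two-symbol output alphabet of the punctured channel is an arbitrary choice for illustration.

```python
from math import log2, sqrt

def capacity(W):
    """Symmetric capacity I(W); W maps y -> (W(y|0), W(y|1))."""
    I = 0.0
    for p0, p1 in W.values():
        q = 0.5 * (p0 + p1)
        for p in (p0, p1):
            if p > 0:
                I += 0.5 * p * log2(p / q)
    return I

def bhattacharyya(W):
    return sum(sqrt(p0 * p1) for p0, p1 in W.values())

# Lemma 1: punctured channel -- both inputs give identical output statistics
W_punct = {0: (0.5, 0.5), 1: (0.5, 0.5)}
# Lemma 2: shortened channel -- a BSC with crossover probability zero
W_short = {0: (1.0, 0.0), 1: (0.0, 1.0)}
```

As expected, the punctured model gives $I = 0$, $Z = 1$ and the shortened model gives $I = 1$, $Z = 0$.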
Once some of the coded bits are punctured or shortened, the underlying channels are no longer $N$ identical copies of $W$ as originally assumed in [1]. The bit channels constructed from the channel combining and splitting stages therefore have different qualities and must be reordered. Fig. 2 shows an example of the Bhattacharyya parameters of bit channels constructed from an underlying BEC channel with erasure probability 0.5. The blue dots are the Bhattacharyya parameters of the original bit channels. The red asterisks are the Bhattacharyya parameters of the bit channels with $p$ punctured coded bits, where the punctured bits are unknown to the receiver. Equivalently, among the original $N$ independent BEC channels, $p$ channels are now completely noisy channels ($Z = 1$). The Bhattacharyya parameters of the bit channels in this case are worse than those of the original bit channels, as indicated by the red asterisks in Fig. 2. The black circles are the Bhattacharyya parameters of the bit channels with $p$ shortened coded bits, where the shortened bits are known to the receiver. Equivalently, among the original $N$ independent BEC channels, $p$ channels are now completely perfect channels ($Z = 0$). Therefore, the good bit channels should be reselected from the new set of bit channels, whose Bhattacharyya parameters differ due to puncturing or shortening.
II-C. Construction in BEC Channels
First, we consider the one-step transformation and generalize the original transformation from two identical independent underlying channels to two independent underlying channels. The two independent channels can be different, as indicated in Fig. 3, where $W': \mathcal{X} \to \mathcal{Y}'$ and $W'': \mathcal{X} \to \mathcal{Y}''$ are two BMS channels. With puncturing or shortening, one of them can be a completely noisy channel with $Z = 1$ (puncturing) or a perfect channel with $Z = 0$ (shortening). With the generalization in Fig. 3, the synthesized channel $W_2: \mathcal{X}^2 \to \mathcal{Y}' \times \mathcal{Y}''$ can be expressed as follows:
$$W_2(y_1, y_2 \mid u_1, u_2) = W'(y_1 \mid u_1 \oplus u_2)\, W''(y_2 \mid u_2). \quad (5)$$
The split channels $W^{-}$ and $W^{+}$ can be expressed as follows:
$$W^{-}(y_1, y_2 \mid u_1) = \frac{1}{2} \sum_{u_2 \in \mathcal{X}} W'(y_1 \mid u_1 \oplus u_2)\, W''(y_2 \mid u_2), \quad (6)$$
$$W^{+}(y_1, y_2, u_1 \mid u_2) = \frac{1}{2}\, W'(y_1 \mid u_1 \oplus u_2)\, W''(y_2 \mid u_2). \quad (7)$$
Borrowing the notation from [15], the above one-step transformation can be written as follows:
$$W^{-} = W' \boxplus W'', \quad (8)$$
$$W^{+} = W' \boxtimes W''. \quad (9)$$
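For discrete-output channels given as probability tables, the generalized transforms (6) and (7) can be sketched directly. The dict-based representation and function names are ours; symbols with zero probability under both inputs are kept for simplicity.

```python
def chan_minus(W1, W2):
    """W^- from (6): output (y1, y2), input u1, with u2 averaged out."""
    out = {}
    for y1, (a0, a1) in W1.items():
        for y2, (b0, b1) in W2.items():
            p0 = 0.5 * (a0 * b0 + a1 * b1)   # u1 = 0
            p1 = 0.5 * (a1 * b0 + a0 * b1)   # u1 = 1
            out[(y1, y2)] = (p0, p1)
    return out

def chan_plus(W1, W2):
    """W^+ from (7): output (y1, y2, u1), input u2."""
    out = {}
    for y1, (a0, a1) in W1.items():
        for y2, (b0, b1) in W2.items():
            for u1 in (0, 1):
                a = (a0, a1)
                p0 = 0.5 * a[u1] * b0        # u2 = 0 -> W'(y1 | u1)
                p1 = 0.5 * a[u1 ^ 1] * b1    # u2 = 1 -> W'(y1 | u1 xor 1)
                out[(y1, y2, u1)] = (p0, p1)
    return out

# BEC(0.5) with outputs {0, 'e', 1}
BEC = {0: (0.5, 0.0), 'e': (0.5, 0.5), 1: (0.0, 0.5)}
```

For two copies of BEC(0.5), the resulting Bhattacharyya parameters are 0.75 and 0.25, matching (10) and (11) below, where the BEC satisfies them with equality.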
For BEC channels, the one-step transformation from $W'$ and $W''$ yields the following Bhattacharyya parameters [15]:
$$Z(W^{-}) = Z(W') + Z(W'') - Z(W')\,Z(W''), \quad (10)$$
$$Z(W^{+}) = Z(W')\,Z(W''). \quad (11)$$
The construction of polar codes in BEC channels can be performed by recursively applying these two equations, where the underlying channels are independent BEC channels (with possibly different erasure probabilities).
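This recursion can be sketched as follows. The pairing convention (at the channel-facing stage, position $i$ pairs with position $i + N/2$) is our assumption for illustration; it reproduces the standard values when all underlying channels are identical.

```python
def bec_bit_channel_z(z):
    """Bhattacharyya parameters of all bit channels (in u-index order),
    given the Z values of the N underlying BECs; applies (10) and (11).
    Assumes the stride-N/2 pairing convention at the channel-facing stage."""
    N = len(z)
    if N == 1:
        return list(z)
    half = N // 2
    minus = [z[i] + z[i + half] - z[i] * z[i + half] for i in range(half)]
    plus = [z[i] * z[i + half] for i in range(half)]
    return bec_bit_channel_z(minus) + bec_bit_channel_z(plus)
```

For four identical BEC(0.5) channels this returns the familiar values 0.9375, 0.5625, 0.4375, 0.0625; setting one input Z to 1 models an unknown-punctured position and visibly shifts the bit-channel ordering.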
III. General Construction Based on the Tal-Vardy Procedure
Tal-Vardy's construction of polar codes [4] is based on the fact that polar codes can be constructed in $n$ levels for a block length $N = 2^n$, as shown in Fig. 1. In each level, the one-step transformation defined in (6) or (7) is performed. Note that in the original one-step transformation in [1], the underlying channels are identical: $W' = W''$ (also shown in Fig. 1). From one level to the next, the size of the output alphabet at least squares. The output channel in each level is still a BMS channel. The idea in [4] is to approximate the output BMS channel in each level by a new BMS channel with a controlled output alphabet size. This size is denoted as $\mu$, which indicates that the output alphabet has at most $\mu$ symbols in each level. With this controlled size, each approximated bit channel can be evaluated in terms of its error probability.
As shown in [4, 15], the degradation relation is preserved by the one-step channel transformation. In addition, as shown by Proposition 6 of [4], the output of the approximation procedure remains a BMS channel: given an input BMS channel, the output of the approximation process is still a BMS channel. Therefore, the keys to applying Tal-Vardy's approximation function to the general construction are that 1) the output channels of (6) and (7) are still BMS channels, and 2) the degradation relation is preserved under (6) and (7).
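The optimal degrading merge of [4] chooses which output symbols to merge so as to minimize the capacity loss. As a cruder stand-in that still has the property used here — merging output symbols always yields a degraded BMS channel — the sketch below merges the symbols whose posteriors are closest. The representation and names are ours; this is not the algorithm of [4].

```python
def degrading_merge(W, mu):
    """Degrade a BMS channel W, given as a list of output-symbol pairs
    (p0, p1) = (W(y|0), W(y|1)) with p0 + p1 > 0, to at most mu symbols.
    Simplified stand-in for the optimal degrading merge of [4]:
    repeatedly merge the two adjacent symbols (after sorting by the
    posterior P(x=0 | y)) whose posteriors are closest."""
    post = lambda s: s[0] / (s[0] + s[1])
    syms = sorted(W, key=post)
    while len(syms) > mu:
        i = min(range(len(syms) - 1),
                key=lambda k: post(syms[k + 1]) - post(syms[k]))
        a, b = syms[i], syms[i + 1]
        syms[i:i + 2] = [(a[0] + b[0], a[1] + b[1])]   # sum the probabilities
    return syms
```

Merging preserves the total probability mass under each input, so the result is a valid (degraded) channel, just with a coarser output alphabet.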
In the following, we first prove that the output of the generalized one-step transformation remains a BMS channel. Then, the degradation relation under (6) and (7) is shown to be preserved. The modification of Tal-Vardy's algorithm follows.
III-A. Symmetric Property
Some notation is needed first. A BMS channel $W$ has a permutation $\pi_1$ on $\mathcal{Y}$ with $\pi_1^{-1} = \pi_1$ and $W(y|1) = W(\pi_1 y|0)$ for all $y \in \mathcal{Y}$. Let $\pi_0$ be the identity permutation on $\mathcal{Y}$. As in [1], a simpler expression is used: $x \cdot y$ replaces $\pi_x(y)$, for $x \in \mathcal{X}$ and $y \in \mathcal{Y}$. Obviously, the equation $W(y \mid x \oplus a) = W(a \cdot y \mid x)$ holds for $x, a \in \mathcal{X}$ and $y \in \mathcal{Y}$. It is also shown in [1] that $(x_1 \oplus x_2) \cdot y = x_1 \cdot (x_2 \cdot y)$. For $x_1^N \in \mathcal{X}^N$ and $y_1^N \in \mathcal{Y}^N$, let
$$x_1^N \cdot y_1^N = (x_1 \cdot y_1, \ldots, x_N \cdot y_N). \quad (12)$$
This is an element-wise permutation.
Proposition 1
For two independent BMS channels $W'$ and $W''$, the channel $W_2$ defined in (5) is also symmetric in the sense that
$$W_2(y_1, y_2 \mid u_1 \oplus a_1, u_2 \oplus a_2) = W_2((a_1 \oplus a_2) \cdot y_1, a_2 \cdot y_2 \mid u_1, u_2), \quad (13)$$
where $(a_1, a_2) \in \mathcal{X}^2$.
Proof:
Observe that $W_2(y_1, y_2 \mid u_1 \oplus a_1, u_2 \oplus a_2) = W'(y_1 \mid u_1 \oplus u_2 \oplus a_1 \oplus a_2)\, W''(y_2 \mid u_2 \oplus a_2)$. From the symmetric property of $W'$ and $W''$, we also have $W'(y_1 \mid u_1 \oplus u_2 \oplus a_1 \oplus a_2) = W'((a_1 \oplus a_2) \cdot y_1 \mid u_1 \oplus u_2)$ and $W''(y_2 \mid u_2 \oplus a_2) = W''(a_2 \cdot y_2 \mid u_2)$. Therefore, $W_2(y_1, y_2 \mid u_1 \oplus a_1, u_2 \oplus a_2) = W'((a_1 \oplus a_2) \cdot y_1 \mid u_1 \oplus u_2)\, W''(a_2 \cdot y_2 \mid u_2)$, which is exactly (13).
Proposition 2

For two independent BMS channels $W'$ and $W''$, the split channels $W^{-}$ and $W^{+}$ defined in (6) and (7) are symmetric in the sense that, for $a \in \mathcal{X}$,
$$W^{-}(y_1, y_2 \mid u_1) = W^{-}(a \cdot y_1, y_2 \mid u_1 \oplus a), \quad (14)$$
$$W^{+}(y_1, y_2, u_1 \mid u_2) = W^{+}(a \cdot y_1, a \cdot y_2, u_1 \mid u_2 \oplus a). \quad (15)$$
More generally, a bit channel $W_N^{(i)}$ constructed from $N$ independent BMS channels satisfies, for $a_1^N \in \mathcal{X}^N$,
$$W_N^{(i)}(y_1^N, u_1^{i-1} \mid u_i) = W_N^{(i)}(x_1^N \cdot y_1^N, u_1^{i-1} \oplus a_1^{i-1} \mid u_i \oplus a_i), \quad (16)$$
where $x_1^N = a_1^N G_N$.

Proof:

Observe that
$$W^{-}(a \cdot y_1, y_2 \mid u_1 \oplus a) = \frac{1}{2} \sum_{u_2} W'(a \cdot y_1 \mid u_1 \oplus a \oplus u_2)\, W''(y_2 \mid u_2).$$
From the symmetric property of $W'$, $W'(a \cdot y_1 \mid u_1 \oplus a \oplus u_2) = W'(y_1 \mid u_1 \oplus u_2)$, so the sum equals $W^{-}(y_1, y_2 \mid u_1)$. This proves the first claim in (14). Next, we prove the second claim in (15). With the symmetric properties of $W'$ and $W''$, the channel $W^{+}$ can be written as follows:
$$W^{+}(a \cdot y_1, a \cdot y_2, u_1 \mid u_2 \oplus a) = \frac{1}{2}\, W'(a \cdot y_1 \mid u_1 \oplus u_2 \oplus a)\, W''(a \cdot y_2 \mid u_2 \oplus a). \quad (17)$$
Then, observe that $W'(a \cdot y_1 \mid u_1 \oplus u_2 \oplus a) = W'(y_1 \mid u_1 \oplus u_2)$ and $W''(a \cdot y_2 \mid u_2 \oplus a) = W''(y_2 \mid u_2)$. Thus, the second claim in (15) is established. Finally, we prove the final claim. Observe that every bit channel $W_N^{(i)}$ is obtained by $n$ recursive applications of the one-step transformations (6) and (7) to the underlying channels. Then, applying (14) and (15) at each level and collecting the resulting permutations level by level yields exactly the element-wise permutation $x_1^N \cdot y_1^N$ with $x_1^N = a_1^N G_N$. Thus, the third claim in (16) is also established.
III-B. Degradation Relation
First, the notation $\preceq$ denotes the degradation relationship, as in [15]. Degradation of a channel $\tilde{W}$ with respect to $W$ is denoted as $\tilde{W} \preceq W$.
Lemma 3

Let $W'$, $W''$, and $\tilde{W}''$ be BMS channels with $\tilde{W}'' \preceq W''$. Let $W^{-}$ and $W^{+}$ be obtained from $(W', W'')$ via (6) and (7), and let $\tilde{W}^{-}$ and $\tilde{W}^{+}$ be obtained from $(W', \tilde{W}'')$ in the same fashion. Then
$$\tilde{W}^{-} \preceq W^{-} \quad \text{and} \quad \tilde{W}^{+} \preceq W^{+}. \quad (18)$$

Proof:

Let $P: \mathcal{Y}'' \to \tilde{\mathcal{Y}}''$ be an intermediate channel that degrades $W''$ to $\tilde{W}''$. That is, for all $\tilde{y} \in \tilde{\mathcal{Y}}''$ and $x \in \mathcal{X}$, we have
$$\tilde{W}''(\tilde{y} \mid x) = \sum_{y \in \mathcal{Y}''} W''(y \mid x)\, P(\tilde{y} \mid y). \quad (19)$$
We first prove the first part of this lemma: $\tilde{W}^{-} \preceq W^{-}$. According to (6), $\tilde{W}^{-}$ can be written as
$$\tilde{W}^{-}(y_1, \tilde{y}_2 \mid u_1) = \frac{1}{2} \sum_{u_2} W'(y_1 \mid u_1 \oplus u_2)\, \tilde{W}''(\tilde{y}_2 \mid u_2) \quad (20)$$
$$= \frac{1}{2} \sum_{u_2} W'(y_1 \mid u_1 \oplus u_2) \sum_{y_2 \in \mathcal{Y}''} W''(y_2 \mid u_2)\, P(\tilde{y}_2 \mid y_2). \quad (21)$$
As shown in [4], a channel can be considered to be both degraded and upgraded with respect to itself. Let $Q$ be the channel that degrades $W'$ to itself:
$$Q(y \mid y') = 1 \quad \text{if } y = y', \quad (22)$$
$$Q(y \mid y') = 0 \quad \text{otherwise}. \quad (23)$$
Define another intermediate channel $P^{-}$ as, for all $y_1, y_1' \in \mathcal{Y}'$, $y_2 \in \mathcal{Y}''$, and $\tilde{y}_2 \in \tilde{\mathcal{Y}}''$:
$$P^{-}(y_1', \tilde{y}_2 \mid y_1, y_2) = Q(y_1' \mid y_1)\, P(\tilde{y}_2 \mid y_2). \quad (24)$$
Applying (24) to (21), we have
$$\tilde{W}^{-}(y_1', \tilde{y}_2 \mid u_1) = \sum_{y_1, y_2} W^{-}(y_1, y_2 \mid u_1)\, P^{-}(y_1', \tilde{y}_2 \mid y_1, y_2),$$
which shows that $\tilde{W}^{-}$ is degraded with respect to $W^{-}$: $\tilde{W}^{-} \preceq W^{-}$. This proves the first part of this lemma. In the same fashion, the second part of the lemma can be proven: $\tilde{W}^{+} \preceq W^{+}$.
Lemma 4

Let $W'$, $W''$, and $\tilde{W}'$ be BMS channels with $\tilde{W}' \preceq W'$, and let $\tilde{W}^{-}$ and $\tilde{W}^{+}$ be obtained from $(\tilde{W}', W'')$ via (6) and (7). Then $\tilde{W}^{-} \preceq W^{-}$ and $\tilde{W}^{+} \preceq W^{+}$.

The proof of this lemma follows immediately from the proof of Lemma 3. Please note that proofs of Lemma 3 and Lemma 4 are also provided in [22], where the increasing convex ordering property is invoked. Here, we provide another way to prove the degradation preservation of the general one-step polar transformation.
Proposition 3
Suppose there is a degrading (upgrading) algorithm that approximates the output BMS channel of the one-step transformation defined in (6) or (7) with another BMS channel. Apply the approximation algorithm to each of the one-step transformations. For the $i$-th ($1 \le i \le N$) bit channel, denote the final approximate bit channel as $\hat{W}_N^{(i)}$. Then, $\hat{W}_N^{(i)}$ is a BMS channel that is degraded (upgraded) with respect to $W_N^{(i)}$.
Proof:
From Proposition 1 and Proposition 2, the output of the one-step transformation is still a BMS channel. From Lemma 3 and Lemma 4, the degradation relation is preserved under the transformations defined in (6) and (7). Then, by induction on the one-step transformations, the final bit channel obtained by applying any degrading (upgrading) algorithm is a degraded (upgraded) version of the original bit channel.
In the following subsection, the approximation procedure is chosen to be Tal-Vardy's [4].
III-C. Modified Tal-Vardy Algorithm
The Tal-Vardy algorithm is used to construct polar codes in [4]. The algorithm obtains an approximating bit channel with an output alphabet of at most $\mu$ symbols using the degrading merge function or the upgrading merge function. The underlying channels are assumed to be independent and identical BMS channels. In this part, we propose a modification of Tal-Vardy's approximation procedure that can take $N$ independent, possibly different, BMS underlying channels.
Fig. 4 shows an example of the general construction, indicating the key difference between this construction problem and the original one. The labeling of the intermediate channels in Fig. 4 differs from the labeling in Fig. 1. For example, the four output channels of level one in Fig. 1 are identical BMS channels, whereas in Fig. 4, these channels could be independent but different BMS channels. The superscript is therefore extended with the position of the channel, counting from top to bottom in that level, to differentiate these channels. Originally, the four level-one output channels located at positions 1, 3, 5, and 7 are identical; in the general construction, these are possibly different channels. The labeling of the output channels at level two follows the same fashion. Originally, the two Z-shapes composed of these four channels belong to the same group, thus requiring only one calculation of the one-step transformation. In the general construction, since the four channels could be different, the two Z-shapes composed of them both need to be evaluated.
We use approximateFun to represent the degrading or upgrading procedure in [4]. The channel vector contains $N$ sections, with the $j$-th section representing the transition probabilities of the $j$-th underlying channel. As in [4], suppose each section is sorted according to the ascending order of the likelihood ratios. The modified Tal-Vardy procedure is presented in Algorithm 1. Algorithm 2 locates the transition probabilities of the two underlying channels for a given Z-shape at a given level.
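The role of Algorithm 2 — locating the two input channels of each Z-shape — can be sketched as below. The indexing convention (level one adjacent to the underlying channels, pairing at stride $N/2^{\lambda}$ within blocks of $N/2^{\lambda-1}$) is our assumption for illustration and need not match the exact bookkeeping of Algorithm 2.

```python
def zshape_inputs(N, level):
    """Index pairs (into the length-N channel vector of the previous level)
    feeding the Z-shapes at a given level. Level 1 is assumed adjacent to
    the underlying channels; stride pairing within blocks is assumed."""
    block = N >> (level - 1)        # channels per block at this level
    stride = block >> 1
    pairs = []
    for start in range(0, N, block):
        for i in range(stride):
            pairs.append((start + i, start + i + stride))
    return pairs
```

For $N = 8$, level one yields the four stride-4 pairs, while level three yields the four adjacent pairs; the modified algorithm must evaluate every pair, whereas the original algorithm evaluates one representative per group.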
III-D. Differences and Complexity Analysis Compared with Tal-Vardy's Procedure
Originally, the $N$ independent underlying channels are identical: $N$ independent copies of a BMS channel $W$. The independent underlying channels of the general construction in Section III-C can be different. The Z-shape (or one-step transformation) of the general construction takes the form in Fig. 3. The main difference between the modified Tal-Vardy algorithm and the original algorithm lies in the number of calculations of the one-step transformation. In the original Tal-Vardy algorithm, all input channels to Z-shapes of the same group are the same at each level, requiring one calculation of the one-step transformation per group (please refer to Section II-A for this discussion). Therefore, for the original Tal-Vardy algorithm, the number of calculations of the one-step transformation is $2^{\lambda-1}$ for level $\lambda$. Suppose all output channels have size $\mu$, and consider the approximation process in which all channels of a level are stored so that all bit channels can be approximated at the last level. At level $\lambda$, the memory space to store these channels is $O(2^{\lambda}\mu)$, leading to a largest memory space of $O(\mu N)$. The total number of one-step transformations over the $n$ levels for the original Tal-Vardy algorithm is therefore $\sum_{\lambda=1}^{n} 2^{\lambda-1} = N - 1$, requiring the approximation procedure to be applied $O(N)$ times. By contrast, in the modified Tal-Vardy algorithm, Z-shapes in the same group at each level can have different input channels, leading to a complete calculation of all one-step transformations in each group. The approximation procedure is applied to each of the $N/2$ one-step transformations in each level. The total number of one-step transformations is $(N/2)\log_2 N$. These one-step transformations require $O(N \log N)$ approximation procedures and $O(\mu N)$ memory space. Table I summarizes this complexity discussion.


Table I
                    | Largest Memory Space | Number of Approximate Procedures
Tal-Vardy           | $O(\mu N)$           | $O(N)$
Modified Tal-Vardy  | $O(\mu N)$           | $O(N \log_2 N)$
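The transformation counts behind Table I can be reproduced with a short sketch (constants aside), using the per-level group count $2^{\lambda-1}$ for the original algorithm and the $N/2$ Z-shapes per level for the modified one, as discussed above.

```python
from math import log2

def onestep_counts(N):
    """Number of one-step (Z-shape) computations needed for construction.
    Original Tal-Vardy: one per group per level -> sum of 2^(l-1) = N - 1.
    Modified algorithm: all N/2 Z-shapes at each of the log2(N) levels."""
    n = int(log2(N))
    original = sum(2 ** (l - 1) for l in range(1, n + 1))   # = N - 1
    modified = (N // 2) * n                                 # = (N/2) log2 N
    return original, modified
```

For example, at $N = 1024$ the modified construction performs 5120 one-step transformations versus 1023 for the original — the price of supporting heterogeneous underlying channels.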

IV. Simulation Results of the General Construction
In this section, the construction of polar codes in BEC channels and the modified Tal-Vardy algorithm in Algorithm 1 for all other channels are used to construct polar codes with puncturing and shortening, echoing the motivation in Section II-B. In the puncturing mode, the receiver has no knowledge of the punctured bits. The punctured coded bits are not transmitted, and the corresponding punctured channels are modeled as the channel in Lemma 1. For BEC channels, this puncturing is equivalent to receiving an erasure at the punctured position. In AWGN channels with BPSK modulation, it is equivalent to a received value of zero at the punctured position. With shortening, the shortened bits are known at the decoder side and are all set to zero in our simulations. The transmission process is therefore modeled in the following steps:

1) Construction preparation:

a) According to the puncturing/shortening mode, the punctured/shortened coded bit positions are obtained.

b) From the resulting underlying channels, the construction algorithm is employed to reorder the bit channels. For BEC channels, the recursive equations (10) and (11) can be employed to calculate the Bhattacharyya parameters of the bit channels. For AWGN channels, GA [9, 10] or the proposed modified Tal-Vardy procedure can be employed to reorder the bit channels. For all other BMS channels, the proposed modified Tal-Vardy procedure can be employed, since GA is no longer applicable.

c) The information set and the frozen set are obtained from this reordering of the bit channels.

2) The decoding process:

The successive cancellation (SC) [1] decoding process carries over from the initial LR values, the information set, and the frozen set, just as in the original SC decoding process.
Fig. 5 shows the error probability of polar codes in BEC channels with $p$ punctured (or shortened) bits, resulting in a final code length of $M = N - p$. The punctured and shortened coded bit positions are the bit-reversed versions of the first $p$ and the last $p$ indices, respectively, as in [9, 11]. In Fig. 5, the frame-error-rate (FER) performance is shown (a frame is a code block), where the x-axis is the erasure probability of the underlying BEC channels. The label 'puncture: No reordering' indicates that, even with puncturing, the good bit channels are selected from the original sorting of bit channels as though there were no puncturing. The label 'puncture: Reordering' indicates that the bit channels are reselected from the Bhattacharyya parameters recursively calculated according to equations (10) and (11). The label 'shorten: Reordering' corresponds to the FER performance of reordering bit channels when shortening is performed. The initial Bhattacharyya parameters at the punctured positions are set to one; those at the shortened positions are set to zero. It can be observed that reordering the bit channels under puncturing and shortening improves the FER performance of the polar codes.
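The puncturing and shortening patterns used here can be sketched as follows. The sketch is 0-indexed; [9] and [11] state the patterns with 1-indexed positions, and the exact index convention is our assumption.

```python
def bit_reverse(i, n):
    """Reverse the n-bit binary representation of i."""
    return int(format(i, f'0{n}b')[::-1], 2)

def qup_pattern(N, p):
    """QUP [9]: puncture the bit-reversed images of the first p indices."""
    n = N.bit_length() - 1
    return sorted(bit_reverse(i, n) for i in range(p))

def rqup_pattern(N, p):
    """RQUP [11]: shorten the bit-reversed images of the last p indices."""
    n = N.bit_length() - 1
    return sorted(bit_reverse(i, n) for i in range(N - p, N))
```

For $N = 8$ and $p = 3$, QUP punctures coded positions {0, 2, 4} and RQUP shortens positions {3, 5, 7} (0-indexed).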
Fig. 6 shows the error performance of puncturing and shortening in binary symmetric channels (BSC). Puncturing and shortening are conducted in the same fashion as in the BEC channels with the same parameters. The x-axis is the crossover probability of the underlying BSC channels. Originally, the bit channels are sorted according to Tal-Vardy's degrading merge procedure [4] as though there were no puncturing or shortening. The performance of such a construction is shown in Fig. 6 by the lines with asterisks (with legend 'puncture: No reordering'). When applying Algorithm 1 to reorder the bit channels, the output alphabet size in each approximation process is kept at the same $\mu$. At a punctured position $j$, the transition probabilities are set to $W_j(y|0) = W_j(y|1) = 1/2$. At the shortened positions, the transition probabilities are $W_j(y = x \mid x) = 1$ and $W_j(y \ne x \mid x) = 0$, since the shortened bits are set to zero. As shown in Fig. 6, if the bit channels are not reordered, the FER performance is degraded compared with the performance with reordering.
Note that polar codes with puncturing or shortening in BSC channels cannot be constructed using the Gaussian approximation [6, 7]. However, the proposed modified Tal-Vardy procedure can be employed to construct polar codes with independent BMS channels.
Fig. 7 shows the error performance of puncturing and shortening in AWGN channels. The underlying AWGN channel is first converted to a BMS channel as in [4] with a controlled output alphabet size. Then, puncturing and shortening are carried out in the same fashion as in the BEC channels with the same puncturing and shortening parameters. The bit channels are first sorted according to Tal-Vardy's degrading merge procedure (assuming no puncturing or shortening). The FER performance of the polar codes constructed from this sorting is shown by the lines with asterisks in Fig. 7. Applying Algorithm 1, the bit channels are reordered with the same output alphabet size. At the punctured positions, the channels are treated as receiving a zero (i.e., an LR of 1); at the shortened positions, the channels are treated as known channels with an LR of infinity. As shown in Fig. 7, without reordering of the bit channels, the FER performance is degraded compared with the performance with reordering.
V. Hardware Implementation
In this section, a general folded polar encoder architecture is studied. Based on it, a pruned architecture is proposed that reduces the latency and improves the throughput significantly.
V-A. General Folded Polar Encoder
Folding is a transformation technique that reduces the hardware area by time-multiplexing the processing modules. When polar codes are applied in ultra-reliable scenarios, the code length is long and the hardware area is large. By exploiting the similarity between polar encoding and the FFT, folded polar encoders are proposed in [17], [23].
Inspired by [20], a general description of the folded polar encoder is summarized here that requires fewer registers than [20]. The folded architecture can be represented by a general equation. The folded polar encoder is composed of three basic modules, which are shown in Fig. 8. The XOR-or-PASS module is the arithmetic unit and is abbreviated as XP. The S module is a commutator with delay elements that switches the signal flow. The P module is a permutation module with multiple inputs and outputs; Fig. 8 shows an example of the P module.
Suppose that the parallelism degree is $P$; the overall architecture can then be represented as in (26), where the multiplier of each module denotes the number of its copies at each stage. An example of the resulting architecture is shown in Fig. 9.
Based on this equation, the hardware code of the folded polar encoder can be generated automatically. In this paper, this architecture is abbreviated as the "auto encoder".
V-B. Pruned Folded Polar Encoder
Based on the partial orders [3, 24, 25] of polar codes, it can be concluded that the bit channels of polar codes with large indices tend to be good bit channels, and those with small indices tend to be frozen bit channels. Of course, there are regions where frozen and information bits are tangled together. For example, in Fig. 10, the beginning part of the source bits contains frozen bits ('0' bits) and the last part contains the information bits. With puncturing such as [9], the first bit channels become frozen, meaning that there are consecutive frozen bits at the beginning of the source bits. The bottom figure of Fig. 10 shows an example of the distribution of the frozen and information bits. Intuitively, the number of '0's (frozen bits) at the beginning increases with puncturing.
Since the XOR of two '0's is still '0', the processing of the beginning '0' bits can be avoided. In a parallelism-$P$ folded polar encoder, the source bits enter the circuit in blocks of length $P$. Define the latency as the time from the first data-in to the first data-out; the latency of the encoder is then $N/P$ input cycles. According to the bit distribution, denote by $N_z$ the number of '0's at the beginning of the source bits. The data-in cycles for these bits can be pruned. Hence, the first $\lfloor N_z/P \rfloor$ input cycles can be skipped, and the latency is reduced accordingly. Correspondingly, the last data-out cycles should be combined into one cycle.
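The latency saving can be sketched as follows, assuming one length-$P$ input block per cycle and that only whole all-zero input blocks are skipped. This is a simplification of the pipeline schedule, not the exact cycle-accurate behavior of the implementation.

```python
def folded_latency(N, P, Nz=0):
    """First-data-in to first-data-out latency, in input cycles, of a
    parallelism-P folded encoder when the first Nz source bits are '0'.
    Assumes one length-P block per cycle; only whole all-zero blocks are
    skipped. A sketch under our assumptions, not the exact schedule."""
    assert N % P == 0
    return N // P - Nz // P
```

For example, with $N = 16$, $P = 4$, and $N_z = 6$ leading zeros, one full input cycle is skipped and the latency drops from 4 to 3 input cycles.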
Taking Fig. 9 as an example, the clock cycles of the original folded polar encoder and the pruned folded polar encoder are illustrated in Table II. It can be seen that, when pipelined, the latency of the pruned encoder is reduced compared with that of the original encoder.


Table II: Clock-cycle schedules (data-in and data-out signals per clock cycle) of the original and pruned folded polar encoders.