I Introduction
The structure of concatenating a convolutional code (CC) with a cyclic redundancy check (CRC) code has been a popular paradigm since 1994, when it was proposed in the context of hybrid automatic repeat request (ARQ) [13]. It was subsequently adopted in the cellular communication standards of both 3G [17] and 4G LTE [10]. In general, the CRC code serves as an outer error-detecting code that verifies whether a codeword has been correctly received, whereas the CC serves as an inner error-correcting code to combat channel errors.
Recently, there has been renewed interest in designing powerful short blocklength codes. This renewed interest is mainly driven by the development of finite blocklength information theory by Polyanskiy et al. [12] and the stringent requirement of ultra-reliable low-latency communication (URLLC) for mission-critical IoT (Internet of Things) service [5]. In [12], Polyanskiy et al. developed a new achievability bound, known as the random-coding union (RCU) bound, and a new converse bound, known as the meta-converse (MC) bound. Together, these two bounds characterize the error probability of the best short blocklength code of length with codewords. The URLLC for mission-critical IoT requires that the time-to-transmit latency is within 500 while maintaining a block error rate no greater than .
Several short blocklength code designs have been proposed in the literature. Important examples include the tail-biting (TB) convolutional codes decoded using the wrap-around Viterbi algorithm (WAVA) [4], extended BCH codes under ordered statistics decoding [2, 20], non-binary low-density parity-check (LDPC) codes [3], non-binary turbo codes [9], and polar codes under CRC-aided successive-cancellation list decoding [16]. Recent advances also include the polarization-adjusted convolutional codes by Arıkan [1]. As a comprehensive overview, Coşkun et al. [2] surveyed most of the contemporary short blocklength code designs of the recent decade. We refer the reader to [2] for additional information.
In [19], Yang et al. proposed the CRC-aided CCs as a powerful short blocklength code for binary-input (BI) additive white Gaussian noise (AWGN) channels. In [19], the convolutional encoder of interest is of rate and is either zero-terminated (ZT) or TB. In order to construct a good CRC-aided CC, Yang et al. designed a distance-spectrum optimal (DSO) CRC generator polynomial for the given CC. The resulting concatenated code generated by the DSO CRC polynomial and the convolutional encoder is a good CRC-aided CC.
The nature of the concatenation naturally permits the use of the serial list Viterbi decoding (SLVD), an efficient algorithm originally proposed by Seshadri and Sundberg [15]. Yang et al. showed that the expected list rank of the SLVD of the CRC-aided CC is small at a target low error probability, thus achieving a low average decoding complexity. Yang et al. demonstrated that several concatenated codes generated by the DSO CRC polynomial and the TBCC, or in short, CRC-TBCCs, approach the RCU bound. In [14], Schiavone extended this line of work by looking at the parallel list Viterbi decoding with a bounded list size. However, these works did not consider rate CCs. It remains open as to whether this framework can be extended to CCs with arbitrary rate such that the resulting concatenated code can approach the RCU bound at a low decoding complexity.
In this paper, we consider designing good CRC-aided CCs for rate CCs at short blocklength for the BI-AWGN channel, where the CC is either ZT or TB. We consider systematic, rate convolutional encoders. The resulting concatenated codes are respectively called a CRC-ZTCC and a CRC-TBCC. We assume that the SLVD has a sufficiently large list size such that no negative acknowledgement is produced. Thus, the SLVD is an implementation of maximum-likelihood decoding. The frame error rate (FER) is in fact the undetected error probability. Simulations show that in the short-blocklength regime, our rate CRC-TBCCs still perform close to the RCU bound.
A work related to this line of research is that of Karimzadeh & Vu [6]. They considered designing the optimal CRC polynomial for multi-input CCs. In their framework, the information sequence is first divided into streams, one for each input rail, and they aim at designing an optimal CRC polynomial for each rail. Unlike their architecture, in this paper, the information sequence is first encoded with a single CRC polynomial and is then divided into streams. Simulation results show that our framework can yield better FER performance than that of Karimzadeh & Vu.
For rate CCs, the SLVD on the primal trellis requires high decoding complexity because of the outgoing branches at each node. SLVD implementation becomes more complicated when there are more than two outgoing branches per state. In order to simplify SLVD implementation and reduce complexity, we utilize the dual trellis pioneered by Yamada et al. [18]. The dual trellis expands the length of the primal trellis by a factor of , while reducing the number of outgoing branches at each node from to at most two.
The remainder of this paper is organized as follows. Section II reviews systematic encoding for convolutional codes and describes the dual trellis construction. Section III considers CRC-ZTCCs for rate CCs. It addresses the zero-termination issue, presents DSO CRC design for high-rate ZTCCs, and shows CRC-ZTCC simulation results. Section IV considers CRC-TBCCs for rate CCs. It addresses how to find the tail-biting initial state over the dual trellis, describes DSO CRC design for TBCCs, and shows CRC-TBCC simulation results. Section V concludes the paper.
Notation: Let and denote the information length and blocklength in bits. Let denote the rate of the CRC-aided CC. A degree CRC polynomial is of the form , where , . For brevity, a CRC polynomial is represented in hexadecimal when its binary coefficients are written from the highest to lowest order. For instance, 0xD represents x^3 + x^2 + 1. The codewords are BPSK modulated. The SNR is defined as (dB), where represents the BPSK amplitude and the noise is always distributed as a standard normal.
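The hexadecimal convention above can be illustrated with a short Python sketch. The function names (`crc_hex_to_coeffs`, `poly_to_string`, `crc_remainder`) are hypothetical helpers introduced here for illustration, and the CRC encoding step assumes the usual systematic form in which the remainder of the shifted message is appended to the message.

```python
def crc_hex_to_coeffs(hex_str):
    """Binary coefficients of a CRC polynomial given in hex, highest order first."""
    value = int(hex_str, 16)
    return [int(b) for b in bin(value)[2:]]


def poly_to_string(coeffs):
    """Human-readable form of a GF(2) polynomial, highest order first."""
    degree = len(coeffs) - 1
    terms = []
    for i, c in enumerate(coeffs):
        if c:
            power = degree - i
            terms.append("1" if power == 0 else "x" if power == 1 else f"x^{power}")
    return " + ".join(terms)


def crc_remainder(message_bits, coeffs):
    """Remainder of message(x) * x^m divided by the CRC polynomial over GF(2)."""
    m = len(coeffs) - 1
    work = list(message_bits) + [0] * m
    for i in range(len(message_bits)):
        if work[i]:
            # Subtract (XOR) the divisor aligned at position i.
            for j, c in enumerate(coeffs):
                work[i + j] ^= c
    return work[-m:]


coeffs = crc_hex_to_coeffs("0xD")
print(poly_to_string(coeffs))                 # x^3 + x^2 + 1
print(crc_remainder([1, 0, 1, 1], coeffs))    # CRC bits appended to the message
```

Appending the returned remainder to the message yields a sequence that is divisible by the CRC polynomial, which is the property the decoder checks.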
II Systematic Encoding and Dual Trellis
This section describes systematic encoding and introduces the dual trellis proposed by Yamada et al. [18] for convolutional codes with rate , where represents the overall constraint length.
II-A Systematic Encoding
We briefly follow [8, Chapter 11] in describing a systematic convolutional encoder. A systematic convolutional encoder is represented by its parity check matrix
(1) 
where each is a polynomial of degree up to in delay element associated with the th code stream, i.e.,
(2) 
where . For convenience, we represent each in octal form. For instance, can be concisely written as . Define , . The systematic encoding matrix associated with is given by
(3) 
The first output bit is a coded bit whereas the remaining output bits are a direct copy of the corresponding input bits.
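As a rough illustration of systematic encoding from a parity-check description, the following Python sketch computes the single coded stream so that the syndrome is zero at every time instant. It is only a sketch under stated assumptions: the coefficient lists in `h` are hypothetical (not taken from [8]), `h[0]` is assumed to belong to the coded stream and to have a nonzero constant term, and the encoder is assumed to start in the zero state.

```python
def systematic_encode(info_streams, h):
    """Compute the coded (parity) stream of a systematic encoder.

    info_streams: list of equal-length bit lists, one per input rail.
    h: list of coefficient lists (lowest order first); h[0] is assumed to
       belong to the coded stream and h[i] to input rail i-1.
    """
    assert h[0][0] == 1, "sketch assumes a nonzero constant term for the coded stream"
    length = len(info_streams[0])
    parity = []
    for t in range(length):
        bit = 0
        # Feedback contribution from earlier coded bits.
        for j in range(1, len(h[0])):
            if h[0][j] and t - j >= 0:
                bit ^= parity[t - j]
        # Contribution from the information rails.
        for i, stream in enumerate(info_streams, start=1):
            for j, coeff in enumerate(h[i]):
                if coeff and t - j >= 0:
                    bit ^= stream[t - j]
        parity.append(bit)
    return parity


def syndrome(streams, h):
    """Syndrome sequence of the output streams; all zeros for a valid codeword."""
    length = len(streams[0])
    out = []
    for t in range(length):
        s = 0
        for stream, poly in zip(streams, h):
            for j, coeff in enumerate(poly):
                if coeff and t - j >= 0:
                    s ^= stream[t - j]
        out.append(s)
    return out


h = [[1, 1, 1], [1, 0, 1], [0, 1, 1]]          # hypothetical polynomials
info = [[1, 0, 1, 1, 0], [0, 1, 1, 0, 1]]      # two input rails
parity = systematic_encode(info, h)
print(parity, syndrome([parity] + info, h))
```

The information rails are copied to the output unchanged, so the full output at each time is the coded bit followed by the input bits.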
II-B Dual Trellis
The primal trellis associated with an ZTCC has outgoing branches per state. In the case where , SLVD over the primal trellis will become complicated. In [19], the low decoding complexity of SLVD for rate convolutional codes relies on the fact that only two outgoing branches are associated with each state. In order to efficiently perform SLVD, we consider the dual trellis proposed by Yamada et al. [18].
We briefly explain the dual trellis construction for parity check matrix . First, define the maximum instant response order as
(4) 
The state of the dual trellis is represented by the partial sums of adders in the observer canonical form of . At time index , , the state is given by
(5) 
Next, we show how state evolves in terms of the output bits , , so that a dual trellis can be established.
Dual trellis construction for :

At time , , where . Namely, only states exist at .

At time , , draw branches from each state to state by
(6) 
At time , draw branches from each state to state by
(7) where .

For time , draw a branch from each state according to (6) only for .
After repeating the above construction for each , , we obtain the dual trellis associated with the convolutional code. Since the primal trellis is of length , whereas the dual trellis is of length , the dual trellis can be thought of as expanding the primal trellis length by a factor of , while reducing the number of outgoing branches per state from to no greater than two.
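The idea of processing one output bit per trellis stage can be sketched in Python as follows. This is a minimal syndrome-former style illustration, not a reproduction of (5)-(7): it assumes each parity-check polynomial is stored as an integer mask (bit j is the coefficient of D^j), that the state is the syndrome-former register, and that the syndrome bit is checked and the register shifted after the last bit of each block. The function names and the example masks are hypothetical.

```python
def build_dual_trellis(h_masks, num_blocks):
    """Build a dual trellis reachable from the zero state.

    Returns a list of layers; layers[k][state] is a list of
    (output_bit, next_state) branches for stage k.
    """
    n = len(h_masks)
    layers = []
    current = {0}
    for _ in range(num_blocks):
        for i, mask in enumerate(h_masks):
            layer = {}
            for s in current:
                branches = []
                for bit in (0, 1):
                    nxt = s ^ (mask if bit else 0)
                    if i == n - 1:
                        if nxt & 1:
                            continue      # nonzero syndrome bit: branch not allowed
                        nxt >>= 1         # shift the register at the block boundary
                    branches.append((bit, nxt))
                layer[s] = branches
            layers.append(layer)
            current = {nxt for brs in layer.values() for _, nxt in brs}
    return layers


def count_paths(layers):
    """Number of distinct paths through the trellis starting from state 0."""
    counts = {0: 1}
    for layer in layers:
        nxt_counts = {}
        for s, c in counts.items():
            for _, nxt in layer[s]:
                nxt_counts[nxt] = nxt_counts.get(nxt, 0) + c
        counts = nxt_counts
    return sum(counts.values())


masks = [0b101, 0b110, 0b111]   # hypothetical polynomials with three output bits per block
layers = build_dual_trellis(masks, num_blocks=4)
print(len(layers), max(len(b) for layer in layers for b in layer.values()))
print(count_paths(layers))
```

In this sketch every state has at most two outgoing branches, and the trellis has one stage per output bit, which is the trade described in the text: a longer trellis in exchange for a smaller branching factor.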
III ZTCC with DSO CRC via Dual Trellis SLVD
This section considers CRC-ZTCCs for rate CCs. Section III-A presents a zero termination method over the dual trellis. Section III-B describes our DSO CRC polynomial search procedure. Finally, Section III-C presents simulation results of the CRC-ZTCC compared with the RCU bound. As a case study, this paper mainly focuses on the rate systematic feedback convolutional codes in [8, Table 12.1(e)].
Table I: DSO CRC polynomials for ZTCCs.
DSO CRC (first encoder): 0x9, 0x4D, 0x31B
DSO CRC (second encoder): 0x9, 0x7B, 0x3F1
III-A Zero Termination of Dual Trellis
For an CC, zero termination over the dual trellis requires at most steps. In our implementation, a breadth-first search identifies the zero-termination input and output bit patterns that provide a trajectory from each possible state to the zero state. The input and output bit patterns have lengths and respectively.
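A breadth-first search of this kind can be sketched in Python as below. The sketch reuses an assumed syndrome-former style branch rule (integer masks for the parity-check polynomials, syndrome check and shift after the last bit of each block) and returns only the output bit pattern; the names `step` and `termination_bits` and the example masks are hypothetical.

```python
from collections import deque


def step(state, phase, bit, h_masks):
    """One dual-trellis branch; returns the next state or None if not allowed."""
    nxt = state ^ (h_masks[phase] if bit else 0)
    if phase == len(h_masks) - 1:
        if nxt & 1:
            return None          # would produce a nonzero syndrome bit
        nxt >>= 1                # shift at the block boundary
    return nxt


def termination_bits(start_state, h_masks):
    """Shortest output-bit pattern driving start_state to the zero state.

    Both the start and the end are taken at block boundaries.
    Returns None if the zero state cannot be reached.
    """
    n = len(h_masks)
    queue = deque([(start_state, 0, [])])
    seen = {(start_state, 0)}
    while queue:
        state, phase, bits = queue.popleft()
        if state == 0 and phase == 0:
            return bits
        for bit in (0, 1):
            nxt = step(state, phase, bit, h_masks)
            key = (nxt, (phase + 1) % n)
            if nxt is not None and key not in seen:
                seen.add(key)
                queue.append((nxt, key[1], bits + [bit]))
    return None


masks = [0b101, 0b110, 0b111]    # hypothetical polynomials
for start in range(4):
    print(start, termination_bits(start, masks))
```

Because the search is breadth-first, the first pattern found for each starting state is a shortest one, so a table of these patterns can be precomputed once per encoder.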
III-B Design of DSO CRCs for High-Rate ZTCCs
In general, a DSO CRC polynomial provides the optimal distance spectrum which minimizes the union bound on the FER at a specified SNR [19]. In this paper, we focus on the low FER regime. Thus, the DSO CRC polynomials identified in this paper simply maximize the minimum distance of the concatenated code. Examples in [19] indicate that DSO CRC polynomials designed in this way can provide optimal or near-optimal performance for a wide range of SNRs.
The design procedure of the DSO CRC polynomial for high-rate ZTCCs essentially follows from the DSO CRC design algorithm for low target error probability in [19]. The first step is to collect the irreducible error events (IEEs), which are ZT paths on the trellis that deviate from the zero state once and rejoin it once. In order to maintain efficiency, we only consider IEEs with output Hamming weight up to some threshold . Dynamic programming constructs all ZT paths of length equal to and output weight no greater than . Finally, we use the resulting set of ZT paths to identify the degree DSO CRC polynomial for the rate CC.
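The final search step can be sketched in Python as follows, under the assumption that each low-weight ZT path has already been reduced to a pair (input polynomial, output weight) and that a path is undetected exactly when its input polynomial is divisible by the candidate CRC polynomial. The toy list `events` is hypothetical data for illustration, not output of the actual trellis search, and `search_crc` is a hypothetical name.

```python
def gf2_mod(dividend, divisor):
    """Remainder of GF(2) polynomial division; bit i is the coefficient of x^i."""
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend


def search_crc(error_events, degree):
    """Pick the degree-m CRC polynomial maximizing the minimum undetected weight.

    error_events: list of (input_polynomial, output_weight) pairs.
    Returns (polynomial, minimum_undetected_weight); the weight is infinity
    when none of the listed events is undetected.
    """
    best = None
    # Candidates have nonzero x^degree and constant terms.
    for poly in range((1 << degree) | 1, 1 << (degree + 1), 2):
        undetected = [w for u, w in error_events if gf2_mod(u, poly) == 0]
        d_min = min(undetected) if undetected else float("inf")
        if best is None or d_min > best[1]:
            best = (poly, d_min)
    return best


events = [(0b1001, 4), (0b1011, 5), (0b1101, 6), (0b11110, 7)]  # hypothetical
poly, d_min = search_crc(events, degree=3)
print(hex(poly), d_min)
```

A full design would list all paths up to the weight threshold, so that the minimum distance reported for the winning polynomial is exact rather than a bound from a partial list.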
Table I presents the DSO CRC polynomials for ZTCCs generated with and . The design assumes a fixed blocklength bits. Due to the overhead caused by the CRC bits and by zero termination, the rates of CRC-ZTCCs are less than . Specifically, for a given information length , CRC degree and an encoder, the blocklength for a CRC-ZTCC is given by
(8) 
yielding a rate
(9) 
We see from (8) that the convolutional encoder can accept any CRC degree as long as is divisible by .
III-C Results and Comparison with RCU Bound
Fig. 1 shows the performance of CRC-ZTCCs with increasing CRC degrees and and a fixed blocklength bits. We see that at the target FER of , increasing the CRC degree reduces the gap to the RCU bound. With and , the CRC-ZTCC approaches the RCU bound within dB.
In [6], Karimzadeh et al. considered designing optimal CRC polynomials for each input rail of a multi-input CC. In their setup, an information sequence for an encoder needs to be split into subsequences before CRC encoding. In contrast, the entire information sequence in our framework is encoded with a single CRC polynomial. Then the resulting sequence is evenly divided into subsequences, one for each rail. To compare the performance between these two schemes, we design three degree optimal CRC polynomials, one for each rail, for ZTCC with . The three CRC polynomials jointly maximize the minimum distance of the CRC-ZTCC. For the single-CRC design, we use the single degree DSO CRC polynomial for the same encoder from Table I. Both CRC-ZTCCs have an information length and blocklength . Fig. 2 shows the performance comparison between these two codes, showing that a single degree DSO CRC polynomial outperforms three degree DSO CRC polynomials, one for each rail. This suggests that a single DSO CRC polynomial may suffice to provide superior protection for each input rail.
IV TBCC with DSO CRC and Dual Trellis SLVD
The performance of CRC-aided list decoding of ZTCCs relative to the RCU bound is constrained by the termination bits appended to the end of the original message, which are required to bring the trellis back to the all-zero state. TBCCs avoid this overhead by replacing the zero termination condition with the TB condition that the final state of the trellis is the same as the initial state of the trellis [11].
In this section, we apply the SLVD of CRC-TBCCs over the dual trellis. We will discuss how to determine the initial state for the TBCC to ensure that the tail-biting condition is met and demonstrate designs of DSO CRCs for rate TB codes. Decoding complexity and performance are analyzed at the end of this section.
IV-A List Decoding for TBCC with CRC
There are two primary differences between our development and analysis of list decoding of ZTCCs, as described in Section III, and what is needed for TBCCs. One difference is that since the ZT condition is replaced with the TB condition, the encoder must determine the initial trellis state so that the TB condition is satisfied. The other difference is that SLVD on the dual trellis must be adapted to handle the TB condition.
To satisfy the TB condition, encoding is attempted from every initial state to identify the initial state that satisfies the TB condition. This is required because our recursive encoder cannot simply achieve the TB condition by setting the initial encoder memory to be the final bits of the information sequence.
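The exhaustive search over initial states can be sketched in Python with a toy recursive encoder. The encoder below (feedback polynomial 1 + D + D^2 with two memory elements) is a hypothetical stand-in chosen for brevity, not one of the encoders studied in this paper; only the search pattern is the point.

```python
def run_encoder(info_bits, state):
    """Toy recursive encoder with feedback 1 + D + D^2 (hypothetical example).

    state = (s1, s2) holds the two most recent feedback bits.
    Returns the coded bits and the final state.
    """
    s1, s2 = state
    parity = []
    for u in info_bits:
        a = u ^ s1 ^ s2           # feedback sum
        parity.append(a ^ s2)     # feedforward taps 1 + D^2
        s1, s2 = a, s1
    return parity, (s1, s2)


def tail_biting_states(info_bits):
    """All initial states whose final state equals the initial state."""
    states = [(a, b) for a in (0, 1) for b in (0, 1)]
    return [s for s in states if run_encoder(info_bits, s)[1] == s]


info = [1, 0, 1, 1]
print(tail_biting_states(info))
```

For this toy encoder and a length-4 input, exactly one initial state satisfies the TB condition. For input lengths that are a multiple of the period of the feedback polynomial, a recursive encoder may have no valid initial state or several, which is a known caveat of tail-biting with feedback encoders.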
To adapt SLVD on the dual trellis to handle the TB condition, we propose an efficient way to keep track of the path metrics and find the next path with minimum metric through an additional root node as shown in Fig. 3. The root node connects to all terminating states after forward traversing the dual trellis. The metric of the branch connecting any terminating state to this root node is zero. This additional root node allows the trellis to end in a single state, so that the basic SLVD approach for a ZTCC may be applied. During SLVD, if the current path does not pass either the CRC or TB check, the path with the minimum metric among all remaining paths is selected as the next path to check.
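The control flow of this serial search can be sketched in Python as follows. The sketch assumes the candidate paths and their metrics are already available as a list, whereas an actual SLVD generates the next-best path on the fly from the trellis; the function name, the two check functions, and the toy candidates are hypothetical.

```python
import heapq


def serial_list_decode(candidates, passes_crc, passes_tb, max_list_size=None):
    """Return (path, list_rank) for the lowest-metric path passing both checks.

    candidates: iterable of (metric, path) pairs, with paths given as tuples.
    Returns (None, rank) if no candidate passes within the list size.
    """
    heap = list(candidates)
    heapq.heapify(heap)
    rank = 0
    while heap and (max_list_size is None or rank < max_list_size):
        _, path = heapq.heappop(heap)    # remaining path with minimum metric
        rank += 1
        if passes_crc(path) and passes_tb(path):
            return path, rank
    return None, rank


# Toy stand-ins for the real checks (hypothetical):
passes_crc = lambda p: sum(p) % 2 == 0   # even parity in place of a CRC check
passes_tb = lambda p: p[0] == p[-1]      # first symbol equals last symbol

candidates = [
    (1.2, (1, 0, 0)),
    (0.5, (1, 1, 1)),
    (2.0, (1, 0, 1)),
    (1.7, (1, 1, 0)),
]
print(serial_list_decode(candidates, passes_crc, passes_tb))
```

The returned rank is the number of paths examined before acceptance, which corresponds to the list rank whose expectation is studied later in this section.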
IV-B Design of DSO CRCs for High-Rate TBCCs
The design of DSO CRCs for high-rate TBCCs follows the two-phase design algorithm in [19]. This algorithm is briefly explained below.
Consider a TB trellis of length , where denotes the set of output alphabet, denotes the set of states, and denotes the set of edges described in an ordered triple with and [7]. Assume and let . Define the set of IEEs at state as
(10) 
where
(11) 
The IEEs at state can be thought of as “building blocks” for an arbitrarily long TB path that starts and ends at the same state .
The first phase is called the collection phase, during which the algorithm collects with output Hamming weight no greater than the threshold over a sufficiently long TB trellis. The second phase is called the search phase, during which the algorithm first reconstructs all TB paths of length and output weight no greater than via concatenation of the IEEs and circular shifting of the resulting path. Then, using these TB paths, the algorithm searches for the degree DSO CRC polynomial by maximizing the minimum distance of the undetected TB path.
Table II presents the DSO CRC polynomials for TBCCs generated with and . The design assumes a fixed blocklength . TB encoding avoids the rate loss caused by the overhead of the zero termination. Specifically, for a given information length , CRC degree and an encoder, the blocklength for a CRC-TBCC is given by
(12) 
yielding a rate
(13) 
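As a small worked example of the bookkeeping, the Python sketch below computes a tail-biting blocklength and rate under explicit assumptions that are not taken from (12) and (13): the encoder is assumed to accept n-1 input bits and emit n output bits per trellis section, the CRC-encoded sequence of K + m bits is assumed to be split evenly over the input rails with no termination overhead, and the rate is taken as information bits over blocklength. The symbol names and the numbers in the example are hypothetical.

```python
def tb_blocklength_and_rate(K, m, n):
    """Blocklength and rate of a tail-biting code under the stated assumptions.

    K: information length in bits, m: CRC degree,
    n: output bits per trellis section (n - 1 input bits per section).
    """
    if (K + m) % (n - 1) != 0:
        raise ValueError("K + m must be divisible by the number of input rails")
    sections = (K + m) // (n - 1)
    N = sections * n
    return N, K / N


N, rate = tb_blocklength_and_rate(K=87, m=9, n=4)
print(N, rate)
```

Under these assumptions, increasing the CRC degree at a fixed blocklength lowers the information length one bit at a time, since no termination bits are spent.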
Table II: DSO CRC polynomials for TBCCs.
DSO CRC (first encoder): 0x9, 0x7D, 0x1CF, 0x38F, 0x73F
DSO CRC (second encoder): 0x9, 0x4F, 0x173, 0x3BF, 0x697
IV-C Complexity Analysis
In [19], the authors provided the complexity expression for SLVD of CRC-ZTCCs and CRC-TBCCs. Observe that the dual trellis has no more than two outgoing branches per state, similar to the trellis of a rate CC. Thus, we directly apply their complexity expression to the SLVD over the dual trellis.
As noted in [19], the overall average complexity of the SLVD can be decomposed into three components:
(14) 
where denotes the complexity of a standard soft Viterbi (SSV), denotes the complexity of the additional traceback operations required by SLVD, and denotes the average complexity of inserting new elements to maintain an ordered list of path metric differences.
The first component is the complexity of add-compare-select (ACS) operations and the initial traceback operation. For CRC-ZTCCs,
(15) 
For CRC-TBCCs, this quantity is given by
(16) 
The second component for CRC-ZTCCs is given by
(17) 
For CRC-TBCCs, it is given by
(18) 
The third component, which is the same for ZT and TB, is
(19) 
where is the expected number of insertions to maintain the sorted list of path metric differences. For CRC-ZTCCs,
(20) 
and for CRC-TBCCs,
(21) 
In the above expressions, and are two computer-specific constants that characterize implementation-specific differences in the implemented complexity of traceback and list insertion (respectively) as compared to the ACS operations of Viterbi decoding. In this paper, we assume that and use (20) and (21) to estimate for CRC-ZTCCs and CRC-TBCCs, respectively.
Fig. 4 shows the trade-off between the SNR gap to the RCU bound and the average decoding complexity at the target FER . The average decoding complexity of SLVD is evaluated according to the aforementioned expressions. We see that for the same ZTCC or TBCC, increasing the CRC degree significantly reduces the gap to the RCU bound, at the cost of a small increase in complexity. However, for the same CRC degree , increasing the overall constraint length dramatically increases the complexity, while achieving a minimal reduction in the SNR gap to the RCU bound.
IV-D Results, Analysis, and Expected List Rank of SLVD
Fig. 5 shows the FER vs. SNR for three CRC-TBCCs at blocklength . At the target FER of , the SNR gap is reduced to dB for the CRC-TBCC with . Fig. 6 shows the trade-off between the expected list rank and the FER. We see that the expected list rank for achieving the target FER of , implying a low average decoding complexity of SLVD.
V Conclusion
This paper shows that list decoding of a rate CC concatenated with a DSO CRC yields block codes that approach the RCU bound for the BI-AWGN channel. CRC-TBCCs have FERs within 0.15 dB of the RCU bound for FER at blocklength N=128 bits. Adding one CRC bit can improve the FER far more than adding a memory element to the CC.
Acknowledgment
The authors thank Dariush Divsalar for helpful discussions on construction of the dual trellis. We thank Ethan Liang and Linfang Wang for guidance and mentorship. Thanks to Mai Vu and Mohammad Karimzadeh for helpful collaboration.
References
 [1] (2019-08-26) From sequential decoding to channel polarization and back again. http://arxiv.org/abs/1908.09594.
 [2] (2019) Efficient error-correcting codes in the short blocklength regime. Physical Commun. 34, pp. 66–79. ISSN 1874-4907.
 [3] (2014) Nonbinary protograph-based LDPC codes: enumerators, analysis, and designs. IEEE Trans. Inf. Theory 60 (7), pp. 3913–3941.
 [4] (2017-02) On the performance of short tail-biting convolutional codes for ultra-reliable communications. In SCC 2017; 11th Int. ITG Conf. Syst., Commun., and Coding, pp. 1–6.
 [5] (2018) Ultra-reliable and low-latency communications in 5G downlink: physical layer aspects. IEEE Wireless Commun. 25 (3), pp. 124–130.
 [6] (2020) Optimal CRC design and serial list Viterbi decoding for multi-input convolutional codes. In 2020 IEEE Global Commun. Conf., pp. 1–6.
 [7] (2003-09) The structure of tail-biting trellises: minimality and basic principles. IEEE Trans. Inf. Theory 49 (9), pp. 2081–2105. ISSN 1557-9654.
 [8] (2004) Error control coding: fundamentals and applications. Pearson Prentice Hall, New Jersey, USA.
 [9] (2013) Short turbo codes over high order fields. IEEE Trans. Commun. 61 (6), pp. 2201–2211.
 [10] (2018) LTE; evolved universal terrestrial radio access (E-UTRA); multiplexing and channel coding; 3GPP TS 36.212 version 15.2.1 release 15. Technical report, European Telecommunications Standards Institute.
 [11] (1986-02) On tail biting convolutional codes. IEEE Trans. Commun. 34 (2), pp. 104–111. ISSN 0090-6778.
 [12] (2010-05) Channel coding rate in the finite blocklength regime. IEEE Trans. Inf. Theory 56 (5), pp. 2307–2359. ISSN 1557-9654.
 [13] (1994) Comparative analysis of two realizations for hybrid-ARQ error control. In 1994 IEEE Global Commun. Conf., pp. 115–119.
 [14] (2021) Channel coding for massive IoT satellite systems. Master's Thesis, Polytechnic University of Turin (Polito).
 [15] (1994-02) List Viterbi decoding algorithms with applications. IEEE Trans. Commun. 42 (2/3/4), pp. 313–323. ISSN 0090-6778.
 [16] (2015) List decoding of polar codes. IEEE Trans. Inf. Theory 61 (5), pp. 2213–2226.
 [17] (2006) Universal mobile telecommunications system (UMTS); multiplexing and channel coding (FDD); 3GPP TS 25.212 version 7.0.0 release 7. Technical report, European Telecommunications Standards Institute.
 [18] (1983) A new maximum likelihood decoding of high rate convolutional codes using a trellis. Elec. and Commun. in Japan, Part I: Commun. 66, pp. 11–16.
 [19] (2021) CRC-aided list decoding of convolutional codes in the short blocklength regime. ArXiv abs/2104.13905.
 [20] (2021) A revisit to ordered statistics decoding: distance distribution and decoding rules. IEEE Trans. Inf. Theory 67 (7), pp. 4288–4337.