Memory-Assisted Quantized LDPC Decoding
Abstract
We enhance coarsely quantized LDPC decoding by reusing computed check node messages from previous iterations. Typically, variable and check nodes generate new messages and replace the old ones in every iteration. We show that, under coarse quantization, discarding old messages involves a significant loss of mutual information. The loss is avoided with additional memory, improving performance by up to 0.36 dB. We propose a modified information bottleneck algorithm to design node operations that take messages from the previous iteration(s) into account as side information. Finally, we present a 2-bit row-layered decoder that can operate within 0.25 dB w.r.t. 32-bit belief propagation.
Index Terms:
LDPC decoder, layered decoding, rate-compatible, coarse quantization, information bottleneck
I Introduction
Efficient and reliable decoding of low-density parity-check (LDPC) codes is vital in modern technologies with high data rate requirements, such as 5G [1]. In particular, the exchange of messages in iterative message passing decoding algorithms like belief propagation incurs significant complexity [2]. To overcome this bottleneck, many works focus on reducing the bit width of the exchanged messages in these algorithms through quantization operations, see e.g. [3, 4, 5, 6, 7, 8, 9, 10, 11].
The quantized messages represent reliability levels that encode the reliability information exchanged between variable nodes (VNs) and check nodes (CNs). The choice of reliability levels is crucial for excellent decoding performance with low-resolution messages [11]. Information-optimum reliability levels can be found with the information bottleneck (IB) method, a clustering framework that enables the design of compression operations that maximize the preserved relevant mutual information [12, 13, 8, 9, 10, 11]. Relevant mutual information measures the average amount of information shared between the transmitted code bits and the exchanged decoding messages.
Typically, calculated messages from a previous iteration are replaced by updated messages from the current iteration [3, 8, 9, 4, 5, 10, 11, 6, 7]. One might question whether discarding previously computed and exchanged messages wastes valuable information. Indeed, under coarse quantization, this work confirms that preserving old messages of the previous iteration can significantly improve the decoding performance. For the design of a memory-assisted decoder we modify the sequential IB algorithm from [8] to be aware of the messages retained in memory. This algorithm is specifically suited for the design of deterministic compression mappings realized with symmetric thresholds. It has significantly reduced computational costs compared to more general solutions[13].
We combine the memory-assisted decoder structure with our recently proposed region-specific CN-aware quantizer design [11]. Region-specific quantization allows individual alphabets of reliability levels for subsets of exchanged messages, which particularly improves low-resolution decoding of highly irregular 5G-LDPC codes. A CN-aware quantizer design for the VN extends the optimization scope to maximize the preserved relevant information at the output of the subsequent CN update [10]. The combination of this work and [11] yields a gain of up to 0.68 dB w.r.t. 2-bit decoding without those techniques.
II Preliminaries on LDPC Decoding with Mutual Information Maximizing Quantization
Most standards define LDPC codes through a base matrix with entries . The base matrix can be represented by a Tanner graph illustrated in Fig. 1. Each column turns into a variable node (VN) and each row into a check node (CN). The non-negative entries are edges between VNs and CNs. The node degree, i.e., the number of connected edges to a node, is for a VN and for a CN.
Lifting replaces every edge with edges that are subjected to -cyclic permutation. The lifted graph can be equivalently represented by a lifted parity check matrix . The encoder maps the information bits to code bits such that [11]. The decoder in the receiver assumes a memoryless channel. For every a binary channel LLR is quantized to a -bit message as input to the decoder. The quantization maximizes mutual information between and as in [11].
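The lifting step can be illustrated with a minimal sketch. It assumes the common convention that a negative base matrix entry marks "no edge" and that a non-negative entry s becomes an identity block cyclically shifted by s columns. The names `lift_base_matrix`, `base` and `Z` (lifting size) are hypothetical, and the shift direction is an assumption rather than a statement about a specific standard.

```python
import numpy as np

def lift_base_matrix(base, Z):
    """Expand a base matrix into a binary parity-check matrix.

    Assumed convention: entries < 0 become Z x Z zero blocks; an entry
    s >= 0 becomes the Z x Z identity cyclically shifted by s columns.
    """
    base = np.asarray(base)
    rows, cols = base.shape
    H = np.zeros((rows * Z, cols * Z), dtype=np.uint8)
    identity = np.eye(Z, dtype=np.uint8)
    for r in range(rows):
        for c in range(cols):
            shift = int(base[r, c])
            if shift >= 0:
                block = np.roll(identity, shift % Z, axis=1)
                H[r * Z:(r + 1) * Z, c * Z:(c + 1) * Z] = block
    return H

# Toy base matrix (not a 5G base graph): 2 CNs, 3 VNs, 4 edges.
base = np.array([[0, 1, -1],
                 [2, -1, 0]])
H = lift_base_matrix(base, Z=3)
```

Each edge of the base graph turns into Z edges, so the toy example yields a 6 x 9 matrix with 12 ones, and every lifted CN keeps the degree of its base row.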
II-A Decoding with Arbitrary Schedules
Message passing decoding computes and exchanges messages between VNs and CNs to aggregate soft information for error correction from the parity check constraints. Each edge of the graph contains a VN and CN memory location enumerated with for the VN-to-CN messages and for the CN-to-VN messages (cf. Fig. 1). A memory location stores messages after lifting the graph as illustrated within the orange box in Fig. 1. The sets and specify target memory locations for VN and CN updates, respectively. The decoding schedule defines the order in which memory locations are updated as
(1) |
followed by a final hard decision update that uses the most recent updated CN messages.
II-B Node Operations
We introduce the discrete random variables , and for modeling the channel, VN and CN messages. A realization of takes values from an LLR-sorted alphabet where and . We set and in this paper. For the design of the decoder, we keep track of and that change with every VN and CN update. Updating a VN memory location yields[11]
(2) |
with extrinsic CN locations . An LLR reconstruction of a message is denoted as using the aligned variables introduced in the next subsection II-C. The hard decision yields with the a-posteriori probability (APP) LLR . Row-layered decoder structures, such as the one in Fig. 8, typically use the APP LLR for computing (2) efficiently as . Those decoders initialize the APP LLR with . Hence, the reconstruction of the channel message must be done only once for all iterations. In an implementation the reconstruction in (2) is typically carried out with integer scaled LLRs of bit width to avoid performance loss[11].
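The relation between the extrinsic VN update and the APP LLR can be sketched as follows. This is a floating-point illustration only: it assumes the LLR convention log p(0)/p(1), so that a negative APP LLR decides for bit 1, and it works on already reconstructed LLRs. The function name `vn_update` and the toy values are hypothetical.

```python
import numpy as np

def vn_update(llr_ch, llr_cn):
    """VN update for one variable node on reconstructed LLRs.

    llr_ch: reconstructed channel LLR
    llr_cn: reconstructed LLRs of all incoming CN-to-VN messages
    Returns the extrinsic LLR per edge, the APP LLR and the hard decision.
    """
    llr_cn = np.asarray(llr_cn, dtype=float)
    llr_app = llr_ch + llr_cn.sum()   # APP LLR: channel plus all CN messages
    llr_ext = llr_app - llr_cn        # extrinsic: drop the target edge's own message
    hard = int(llr_app < 0)           # assumed convention: negative LLR -> bit 1
    return llr_ext, llr_app, hard

llr_ext, llr_app, hard = vn_update(-0.5, [1.5, -0.25, 2.0])
```

Subtracting the own CN message from the APP LLR gives the same value as summing the channel LLR and the remaining CN messages directly, which is why layered structures can keep a single APP value per VN.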
II-C Alignment Regions
The reconstruction in (2) and quantization in (3) can be designed such that decoding messages from different memory locations can share the same functions. Using the same functions reduces the overall number of parameters to be designed and implemented. Common functions are realized through an alignment operation applied to the variables and before the node design as[11]
(4) |
where comprises all elements from the same region. This work considers a row-alignment or matrix-alignment [11].
II-D Region-Specific Quantization with Check Node Awareness
A compact version of the CN messages can be obtained with threshold quantization (cf. Fig. 2(a)). The objective is to maximize the mutual information preserved by any CN message in the alignment region . The optimization of can be performed with the sequential IB algorithm[8], also described in section IV. This algorithm requires the distribution . In case of a layered schedule, each update points to a subset of all memory locations . A single quantizer design suffices for each region where enumerates the distinct regions. The notation builds a set with unique elements, e.g., reduces to . The quantizer designed for region is used only for updating the locations defined by [11].
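To make the objective concrete, the following sketch evaluates the mutual information preserved by a symmetric 2-bit threshold quantizer on a discrete, LLR-sorted alphabet. It uses an exhaustive search over the single free boundary instead of the sequential IB algorithm, and the toy joint distribution (a discretized binary-input Gaussian model) is an assumption for illustration, not measured decoder data. All names are hypothetical.

```python
import numpy as np

def mutual_information(p_xt):
    """I(X;T) in bits from a joint distribution p(x,t) given as a 2-D array."""
    p_xt = np.asarray(p_xt, dtype=float)
    p_x = p_xt.sum(axis=1, keepdims=True)
    p_t = p_xt.sum(axis=0, keepdims=True)
    prod = p_x @ p_t
    mask = p_xt > 0
    return float((p_xt[mask] * np.log2(p_xt[mask] / prod[mask])).sum())

def quantize_joint(p_xy, boundaries):
    """Merge adjacent columns of p(x,y) into clusters split at `boundaries`."""
    edges = [0, *boundaries, p_xy.shape[1]]
    return np.stack([p_xy[:, a:b].sum(axis=1)
                     for a, b in zip(edges[:-1], edges[1:])], axis=1)

# Toy LLR-sorted alphabet with 16 levels (assumed model, not decoder data).
centers = np.linspace(-3.75, 3.75, 16)
lik0 = np.exp(-(centers - 1.0) ** 2 / 2.0)   # x = 0 mapped to +1
lik1 = np.exp(-(centers + 1.0) ** 2 / 2.0)   # x = 1 mapped to -1
p_xy = 0.5 * np.stack([lik0 / lik0.sum(), lik1 / lik1.sum()])

N = p_xy.shape[1] // 2
# Symmetric 2-bit quantizer: boundaries (k, N, 2N - k) with one free parameter k.
best_k = max(range(1, N),
             key=lambda k: mutual_information(quantize_joint(p_xy, [k, N, 2 * N - k])))
mi_full = mutual_information(p_xy)
mi_2bit = mutual_information(quantize_joint(p_xy, [best_k, N, 2 * N - best_k]))
mi_1bit = mutual_information(quantize_joint(p_xy, [N]))
```

By the data processing inequality, the 2-bit result cannot exceed the unquantized mutual information, and it cannot fall below the 1-bit (sign-only) result.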
III Memory-Assisted Reconstruction
A small bit width of the exchanged messages significantly lowers the decoder complexity for several reasons:
- One quantization operation uses comparisons[10].
- The routing network size scales with .
- The min-CN update can be carried out much faster. For example, using instead of bits potentially reduces the logic gate delay by a factor of 4 [10].
Unfortunately, with bit these major complexity savings can lead to a noticeable performance degradation[11]. This section proposes a novel approach to overcome most of the degradation when using e.g. instead of bits. Fig. 3 extends the setup by preserving the old CN message of the previous iteration as . We remark that preserving for another iteration would give only marginal performance gains.
III-A Modification of Existing Decoder Design
Instead of aiming for we propose to take the statistics of the message into account. The message already provides a certain amount of mutual information . The optimization of the quantizer shall optimize preservation of additional mutual information . The variable node update now takes into account as
(5) |
For the design of in (3) we measure the joint distribution which considers correlations between and . Therefore, we generate a large set of decoding messages under a specific design-. In the next section IV we introduce a side-information aware IB algorithm which is used for optimization of the quantization thresholds. We define , , , and as the relevant, observed, compressed and side-information variable, , , , and , respectively. The optimization aims for . We remark that the alphabet is not strictly LLR-sorted as defined in (7) as a result of the CN minimum approximation.
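The notion of additional mutual information beyond what the stored message already provides can be illustrated with the chain rule I(X;T,S) = I(X;S) + I(X;T|S). The sketch below uses the labels X (relevant), T (compressed) and S (side information) as assumed names, and a random toy distribution in place of a measured one.

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a distribution given as an array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mi_x_ts(p_xts):
    """I(X;T,S) from p(x,t,s) with axes ordered (x, t, s)."""
    return entropy(p_xts.sum(axis=(1, 2))) + entropy(p_xts.sum(axis=0)) - entropy(p_xts)

def mi_x_s(p_xts):
    """I(X;S): information already provided by the side information."""
    p_xs = p_xts.sum(axis=1)
    return entropy(p_xs.sum(axis=1)) + entropy(p_xs.sum(axis=0)) - entropy(p_xs)

def cond_mi_x_t_given_s(p_xts):
    """I(X;T|S) = H(X,S) + H(T,S) - H(X,T,S) - H(S)."""
    return (entropy(p_xts.sum(axis=1)) + entropy(p_xts.sum(axis=0))
            - entropy(p_xts) - entropy(p_xts.sum(axis=(0, 1))))

# Random toy joint distribution (hypothetical, for checking the identity only).
rng = np.random.default_rng(0)
p_xts = rng.random((2, 4, 4))
p_xts /= p_xts.sum()
total = mi_x_ts(p_xts)
from_memory = mi_x_s(p_xts)
additional = cond_mi_x_t_given_s(p_xts)
```

Since I(X;S) does not depend on the quantizer, maximizing the additional term I(X;T|S) is equivalent to maximizing the joint term I(X;T,S).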
IV Information Bottleneck Algorithm with Side-Information Awareness
Typically, an IB setup is defined by the relevant, observed and compressed discrete random variables , and that form a Markov chain . The IB method is a generic clustering framework for designing compression operations with optimization objective . The choice of allows trading preservation of relevant information against compression. Of very high practical interest is the case where because it can be achieved with a deterministic mapping through threshold quantization
(6) |
with outer thresholds and , and identifying the th element of the ordered set . The mapping with thresholds can be information-optimum if [14]
(7) |
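A threshold quantizer with symmetric thresholds can be sketched in a few lines. The outer thresholds at minus and plus infinity are implicit, so a w-bit quantizer needs 2^w - 1 inner thresholds, of which only the positive half must be designed. The function names and the threshold value 1.5 are hypothetical.

```python
import numpy as np

def symmetric_thresholds(positive):
    """Build the inner thresholds (-tau_k, ..., 0, ..., +tau_k) from the positive half."""
    positive = np.sort(np.asarray(positive, dtype=float))
    return np.concatenate([-positive[::-1], [0.0], positive])

def threshold_quantize(y, thresholds):
    """Map each input to the index of the threshold interval it falls into."""
    return np.searchsorted(thresholds, y, side="right")

tau = symmetric_thresholds([1.5])                  # 2-bit: thresholds -1.5, 0, +1.5
y = np.array([-3.0, -0.2, 0.4, 2.7])
labels = threshold_quantize(y, tau)                # labels sorted by underlying LLR
mirrored = threshold_quantize(-y, tau)
```

With symmetric thresholds, negating the input mirrors the label, so only the magnitude boundaries have to be stored in an implementation.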
This section extends the conventional setup with a fourth variable which provides side-information about [13], see Fig. 4. Prior works[12, 13] also consider a setup with side information, however, those works do not explicitly provide a low-complexity solution for optimizing a threshold quantizer in the context of LDPC decoding. We propose Algorithm 1 which is a modified variant of the sequential IB algorithm from [8] taking into account the side information . The algorithm exploits (7) to sequentially optimize initial random boundaries defining the target clusters . To avoid local optima, we run 500 different initializations in parallel. We enforce symmetric thresholds , reducing the number of design and implementation parameters.
(8) |
IV-A Merger Costs with Side Information
In line 16, the element and counterpart element are moved into singleton clusters and , respectively. The temporary decompression is modeled with a discrete random variable . Line 18 optimizes the deterministic mapping with . The algorithm restricts merging into an adjacent cluster or . Thus, two mapping options exist with
(9) |
The mutual information loss from merging is
(10) | ||||
(11) | ||||
(12) |
where the individual merger costs are
(13) |
with .
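The choice between the two merging options can be sketched by brute force. The example below does not evaluate the merger costs in (13). Instead, it recomputes I(X;T,S) for both candidate boundary positions and keeps the better one, which selects the same option because I(X;S) is unaffected by the clustering. It also ignores the symmetric counterpart element. All names and the random toy distribution are hypothetical.

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a distribution given as an array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def info_x_ts(p_xys, boundaries):
    """I(X;T,S) in bits after clustering the y axis of p(x,y,s) at `boundaries`."""
    edges = [0, *boundaries, p_xys.shape[1]]
    p_xts = np.stack([p_xys[:, a:b].sum(axis=1)
                      for a, b in zip(edges[:-1], edges[1:])], axis=1)
    return (entropy(p_xts.sum(axis=(1, 2))) + entropy(p_xts.sum(axis=0))
            - entropy(p_xts))

def place_border_element(p_xys, boundaries, i):
    """Decide whether element y = boundaries[i] stays in the right cluster
    or joins the left cluster (boundary moves up by one)."""
    stay = list(boundaries)
    move = list(boundaries)
    move[i] += 1
    upper = boundaries[i + 1] if i + 1 < len(boundaries) else p_xys.shape[1]
    options = [stay] if move[i] >= upper else [stay, move]  # never empty a cluster
    return max(options, key=lambda b: info_x_ts(p_xys, b))

# Random toy joint p(x, y, s): 2 bit values, 8 observed levels, 4 memory levels.
rng = np.random.default_rng(1)
p_xys = rng.random((2, 8, 4))
p_xys /= p_xys.sum()
start = [2, 4, 6]
updated = place_border_element(p_xys, start, 0)
```

Repeating such a step over all border elements and many random initializations gives a simple, if slow, reference against which a cost-based implementation could be checked.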
V Evaluation with 5G Codes
This section investigates the performance of the proposed decoders with memory-assistance. As in [11] we use a 5G-LDPC code with length , base graph 1 and various code rates [1]. Furthermore, we consider an AWGN channel with BPSK modulation. All decoders use the initialization schedule described in [11] to avoid useless CN updates resulting from processing punctured messages. If not mentioned otherwise, the remaining schedule follows the flooding scheme with a maximum of 30 decoder iterations. Each decoder design uses a large set of training data with 10000 transmitted and received code words generated for a specific design . Analytical tracking of joint probabilities for the design of memory-assisted decoders seems infeasible. For a fair comparison, the conventional decoders are also designed with the training data, leading to slightly different results compared to our work [11].
V-A Evolution of Mutual Information
Fig. 5 depicts the evolution of mutual information between code bit and the corresponding hard decision for every iteration. The quantized messages are matrix-aligned such that all messages use the same alphabet of reliability levels in one iteration. The design process is initialized with the same design . Particularly under 2-bit decoding, the mutual information gains per iteration are significantly improved with the proposed memory-assistance. For 3-bit decoding those gains appear smaller. Nevertheless, the proposed 3-bit decoding almost achieves the same performance as the conventional 4-bit decoding.
V-B Boundary Placement for Memory-Assisted Reconstruction
This section analyzes the placement of quantizer boundaries for every iteration to explain the performance gains achieved with the proposed structure. Now, all decoders use an individual design- so that the mutual information converges after 30 iterations . A careful optimization is very important to ensure minimum frame error rate for a given budget of decoding iterations.
Fig. 6 shows boundary levels which increase for higher iterations as the reliability of messages improves. One key observation is that the boundary magnitudes for the proposed 2-bit decoder exhibit an alternating rising and falling trend. Thus, memory-assistance enhances the resolution by using different quantizers in successive iterations. This behavior clearly shows the effectiveness of the side-information aware IB algorithm: A CN message is a compressed version of the non-quantized LLR in (3) from the current iteration. A CN message is a compressed version of from the previous iteration. The difference is sufficiently small on average, such that can approximately resolve . Different boundaries for and improve their combined capability to resolve in relevant ranges. The combined resolution approaches the 3-bit quantizer in Fig. 6.
We remark that the side information is only actively used during reconstruction . For example, consider a sign-magnitude alphabet sorted by underlying LLR with . For the reconstructed LLR is more reliable if the message in memory is matching, e.g., . It is less reliable if does not agree, e.g., . Thus, . Combined knowledge about the message and allows better inference of the non-quantized LLR .
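The effect of the stored message on the reconstruction can be sketched with a lookup table of LLRs indexed by the current and the stored label. The toy model below assumes that both 2-bit messages are conditionally independent given the code bit, which a real decoder does not satisfy, since the joint distribution is measured from correlated decoding messages. The probabilities and names are hypothetical.

```python
import numpy as np

def llr_table(p_xts):
    """L(t, s) = log p(x=0, t, s) / p(x=1, t, s) for a joint with axes (x, t, s)."""
    return np.log(p_xts[0] / p_xts[1])

# Hypothetical 2-bit label statistics, labels 0..3 sorted by underlying LLR.
p_label_given_x0 = np.array([0.05, 0.15, 0.30, 0.50])
p_label_given_x = np.stack([p_label_given_x0, p_label_given_x0[::-1]])

# Toy joint p(x, t, s) with uniform x and conditionally independent t and s.
p_xts = 0.5 * np.einsum("xt,xs->xts", p_label_given_x, p_label_given_x)

table = llr_table(p_xts)
llr_match = table[3, 3]      # current and stored message agree (both strongly positive)
llr_mismatch = table[3, 0]   # stored message strongly disagrees
```

In this toy model the same current label 3 is reconstructed to a large LLR when the stored label agrees and to a value near zero when it disagrees, which mirrors the qualitative behavior described above.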
V-C Reduced Complexity with Merged Check Node Messages
In the previous sections the side information is equal to from the previous iteration and is discarded after it has been used. Instead of discarding , this section considers a side information which is merged with into a compressed message obtained through a compression table . The table is designed using the IB method with a measured joint distribution . The reconstruction in (5) now uses instead of . The side information can propagate over many iterations as where is another compression for reducing the table size of .
The compression reduces the memory demand from to bits in a row-layered decoder structure depicted in Fig. 8. The row-layered schedule sequentially updates orthogonal sets of rows of [11]. One layer update consists of a partial VN update and a full CN update (see section II-B). A partial VN update computes the VN-to-CN message as with . The reconstruction translates from the previous iteration to a -bit integer representation of . Significant loss is avoided if bits[11]. A permutation performs a cyclic shift of parallel messages. A full CN update computes CN-to-VN messages for all connected VNs according to (3). After an inverse permutation, a CN-to-VN message is merged with the side information . Finally, .
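The data flow of one layer update can be sketched for a single check node. This is a simplified floating-point illustration under several assumptions: the quantizer is applied at the CN output, the stored label is reconstructed without merged side information, permutations are omitted, and the final step is taken to be an APP update. The thresholds, reconstruction levels and function names are hypothetical.

```python
import numpy as np

def min_sum_extrinsic(llr_in):
    """Min-sum CN update: per edge, sign product and minimum magnitude of the other inputs."""
    llr_in = np.asarray(llr_in, dtype=float)
    sign = np.where(llr_in < 0, -1.0, 1.0)
    mag = np.abs(llr_in)
    order = np.argsort(mag)
    min1, min2 = mag[order[0]], mag[order[1]]
    # The edge holding the smallest magnitude receives the second smallest one.
    ext_mag = np.where(np.arange(len(mag)) == order[0], min2, min1)
    return sign.prod() * sign * ext_mag

def layer_update(llr_app, old_labels, thresholds, recon):
    """One check node of a layer: partial VN update, CN update, APP update."""
    v2c = llr_app - recon[old_labels]                 # remove the old CN contribution
    c2v = min_sum_extrinsic(v2c)                      # full CN update
    new_labels = np.searchsorted(thresholds, c2v, side="right")
    return v2c + recon[new_labels], new_labels        # assumed APP update

thresholds = np.array([-1.0, 0.0, 1.0])               # hypothetical 2-bit thresholds
recon = np.array([-2.5, -0.5, 0.5, 2.5])              # hypothetical LLR per label
llr_app = np.array([2.0, -1.0, 3.0])
old_labels = np.array([2, 1, 2])
new_app, new_labels = layer_update(llr_app, old_labels, thresholds, recon)
```

In the toy example the second VN flips from a negative to a positive APP LLR, so the hard decisions satisfy the parity check after one layer update.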
V-D Frame Error Rate Performance
The decoding performance with different resolutions, code rates and schedules is depicted in Fig. 7. If not indicated otherwise, all quantized decoders apply CN-aware quantization with row-alignment as in [11]. For rate , the proposed -bit () memory-assisted decoder improves performance by 0.36 dB (0.1 dB) compared to the conventional decoders. Relative to a decoder with matrix alignment, no CN-aware quantization and no memory-assistance, the performance gain is 0.68 dB. The -bit memory-assisted decoder slightly outperforms a -bit decoder. The labels (3 4 2) and (2 3 2) specify the bit widths ( ) of the decoders with merged old and new CN messages (see section V-C), respectively. They reduce the memory demand from 6 to 4 and 4 to 3 bits without losing performance compared to the non-merged decoders. The row-layered schedule with 15 iterations achieves nearly the same performance as the flooding schedule with 30 iterations. The improvements also translate to other code rates to . Remarkably, under high rate , the (2 3 2)-bit decoder operates within dB compared to a belief propagation decoder with accurate box-plus operation and 32-bit LLR messages [3].
VI Conclusions
This letter proposed coarsely quantized decoding which is assisted through messages retained in memory. We extended an existing IB algorithm for the design of threshold quantization which is aware of side information provided by the memory. Further, we proposed a new structure which merges the side information with a newly generated CN message to reduce the memory overhead. In summary, 2-bit (layered) decoding is improved by up to 0.36 dB by increasing the memory from 2 to 3 bits, while preserving high-speed 2-bit operations for quantization, the routing network and the CN update.
References
- [1] 3GPP, “5G NR: Multiplexing and Channel Coding, TS 38.212,” 2018.
- [2] R. Gallager, “Low-Density Parity-Check Codes,” IRE Transactions on Information Theory, vol. 8, no. 1, pp. 21–28, 1962.
- [3] J. Chen, A. Dholakia et al., “Reduced-complexity decoding of LDPC codes,” IEEE Trans. on Communications, vol. 53, no. 8, pp. 1288–1299, 2005.
- [4] P. Kang, K. Cai et al., “Generalized Mutual Information-Maximizing Quantized Decoding of LDPC Codes With Layered Scheduling,” IEEE Trans. on Vehicular Technology, vol. 71, no. 7, pp. 7258–7273, 2022.
- [5] L. Wang, C. Terrill et al., “Reconstruction-Computation-Quantization (RCQ): A Paradigm for Low Bit Width LDPC Decoding,” IEEE Trans. on Communications, vol. 70, no. 4, pp. 2213–2226, 2022.
- [6] M. Geiselhart, A. Elkelesh et al., “Learning quantization in LDPC decoders,” in 2022 IEEE Globecom Workshops, 2022, pp. 467–472.
- [7] Y. Ren, H. Harb et al., “A Generalized Adjusted Min-Sum Decoder for 5G LDPC Codes: Algorithm and Implementation,” 2024.
- [8] J. Lewandowsky and G. Bauch, “Information-Optimum LDPC Decoders Based on the Information Bottleneck Method,” IEEE Access, vol. 6, pp. 4054–4071, 2018.
- [9] M. Stark, L. Wang et al., “Decoding Rate-Compatible 5G-LDPC Codes With Coarse Quantization Using the Information Bottleneck Method,” IEEE Open Journal of the Comm. Society, vol. 1, pp. 646–660, 2020.
- [10] P. Mohr and G. Bauch, “A Variable Node Design with Check Node Aware Quantization Leveraging 2-Bit LDPC Decoding,” in GLOBECOM 2022 - 2022 IEEE Global Communications Conf., 2022, pp. 3484–3489.
- [11] P. Mohr and G. Bauch, “Region-Specific Coarse Quantization with Check Node Awareness in 5G-LDPC Decoding,” arXiv:2406.14233, 2024.
- [12] G. Chechik and N. Tishby, “Extracting Relevant Structures with Side Information,” in Advances in Neural Inf. Proc. Sys. MIT Press, 2002.
- [13] S. Steiner and V. Kuehn, “Distributed compression using the information bottleneck principle,” in ICC 2021 - IEEE International Conference on Communications, 2021, pp. 1–6.
- [14] B. M. Kurkoski and H. Yagi, “Quantization of Binary-Input Discrete Memoryless Channels,” IEEE Transactions on Information Theory, vol. 60, no. 8, pp. 4544–4552, Aug. 2014.