

Title:
AMPLITUDE ENCODING USING A BAND-ESS ALGORITHM AND A LOGARITHMIC-BASED CCDM ALGORITHM
Document Type and Number:
WIPO Patent Application WO/2023/211270
Kind Code:
A1
Abstract:
The application presents a method of encoding data for being transmitted from a transmitter to a receiver over a channel, wherein the method is performed by a control circuitry, and comprises the step of: encoding data as a shaped coded modulation signal by shaping the signal based on an amplitude selection algorithm that shapes a sequence of input data bits into a sequence of amplitudes, wherein the amplitude selection algorithm includes: constructing a trellis having a sequence of energy-bounded amplitude values (l), wherein the energy bound (L) is determined based on an estimation of the channel, wherein constructing the trellis includes: storing only a reduced band of values of the trellis, wherein the band has a predetermined height (h) and slope (s), and storing and computing remaining values of the trellis corresponding to the initial width (wi) and to the final width (wf) of the trellis. A complementary method and corresponding transceiver devices are also presented herein.

Inventors:
GÜLTEKIN YUNUS CAN (NL)
WILLEMS FRANS M J (NL)
ALVARADO ALEX (NL)
Application Number:
PCT/NL2023/050220
Publication Date:
November 02, 2023
Filing Date:
April 25, 2023
Assignee:
UNIV EINDHOVEN TECH (NL)
International Classes:
H04L1/00; H04L27/34
Other References:
GULTEKIN YUNUS CAN ET AL: "Mitigating Nonlinear Interference by Limiting Energy Variations in Sphere Shaping", 2022 OPTICAL FIBER COMMUNICATIONS CONFERENCE AND EXHIBITION (OFC), 6 March 2022 (2022-03-06), USA, pages 1 - 3, XP034109976
AMARI ABDELKERIM ET AL: "Introducing Enumerative Sphere Shaping for Optical Communication Systems With Short Blocklengths", IEEE JOURNAL OF LIGHTWAVE TECHNOLOGY, vol. 37, no. 23, 1 December 2019 (2019-12-01), USA, pages 5926 - 5936, XP011758388, ISSN: 0733-8724, [retrieved on 20191126], DOI: 10.1109/JLT.2019.2943938
GULTEKIN YUNUS CAN ET AL: "Band-ESS: Streaming Enumerative Coding with Applications to Probabilistic Shaping", IEEE GLOBECOM 2022, 4 December 2022 (2022-12-04), USA, pages 3223 - 3228, XP034269089, DOI: 10.1109/GLOBECOM48099.2022.10000599
PATRICK SCHULTE ET AL: "Constant Composition Distribution Matching", vol. 62, no. 1, 31 January 2016 (2016-01-31), USA, pages 430 - 434, XP002777173, Retrieved from the Internet [retrieved on 20180109]
GULTEKIN YUNUS CAN ET AL: "Log-CCDM: Distribution Matching via Multiplication-free Arithmetic Coding", 2022 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 26 June 2022 (2022-06-26), USA, pages 55 - 60, XP034160586, DOI: 10.1109/ISIT50566.2022.9834834
Attorney, Agent or Firm:
ALGEMEEN OCTROOI- EN MERKENBUREAU B.V. (NL)
Claims:
CLAIMS

1. A method of encoding data for being transmitted from a transmitter to a receiver over a channel, wherein the method is performed by a control circuitry, and comprises the step of: encoding data as a shaped coded modulation signal by shaping the signal based on an amplitude selection algorithm resulting in a symmetrical input and by constructing a trellis having a bounded energy sequence of amplitude values by consulting a look-up table, computing and storing a plurality of channel-related energy constraints based on use of a non-linear estimation process, and providing an index for the bounded energy sequences of amplitudes, wherein constructing the trellis includes storing only a reduced band of values of the trellis, and storing and computing on-the-fly computation remainders ensuing from the non-linear estimation process for representing other parts of the trellis.

2. The method according to claim 1, wherein the reduced band of values of a trellis occupies a portion of one of the primary diagonals of a matrix comprising the index for bounded energy sequences of amplitudes.

3. The method according to any of the previous claims, wherein the control circuitry is a logic circuitry.

4. The method according to any of the previous claims, wherein the storing and computing of the computation remainders is implemented using shift registers.

5. The method according to any of the previous claims, wherein the storing and computing of the computation remainders is performed in a logarithmic domain.

6. The method according to any of the previous claims, wherein the size of the reduced band of the trellis is determined based on channel quality.

7. The method according to claim 6, wherein a larger size of the reduced band of the trellis is used for a channel of high quality.

8. The method according to any of the previous claims, wherein encoding data comprises a sliding window algorithm based on a non-linear estimation process.

9. The method according to any of the previous claims, wherein the method is employed in a digital communications system.

10. A method of encoding data for being transmitted from a transmitter to a receiver over a channel, wherein the method is performed by a control circuitry, wherein the method comprises the steps of: implementing a Full Precision - Constant Composition Distribution matching, FP-CCDM, algorithm in a logarithmic domain, by consulting a plurality of Look-Up Tables, LUTs, that allow a mapping of the symbols in an arithmetic domain to a logarithmic domain; performing the CCDM algorithm in the logarithmic domain by making use of shift registers to perform the addition and subtraction steps in the logarithmic domain.

11. The method according to claim 10, wherein the plurality of LUTs are stored locally at the transmitter.

12. The method according to any of claims 10-11, comprising three LUTs.

13. The method according to claim 12, wherein a first of the three LUTs provides a mapping between a symbol and an exponential function F(s), and wherein a second and a third LUT provide a mapping between values in an arithmetic domain and a logarithmic domain.

14. A transceiver in a digital communication system, the transceiver comprising:

- control circuitry,

- encode-decode equipment communicatively coupled with the control circuitry and arranged for encoding data as a shaped coded modulation signal by shaping the signal based on an amplitude selection algorithm resulting in a symmetrical input; the control circuitry being arranged for constructing a trellis having a bounded energy sequence of amplitude values by consulting a look-up table, computing and storing a plurality of channel-related energy constraints based on use of a non-linear estimation process, and providing an index for the bounded energy sequences of amplitudes, wherein constructing the trellis includes storing only a reduced band of values of the trellis, and storing and computing on-the-fly computation remainders ensuing from the non-linear estimation process for representing other parts of the trellis.

15. The transceiver according to claim 14, further arranged to perform a method according to any of the claims 2 - 9.

16. A transceiver in a digital communication system, the transceiver comprising:

- control circuitry;

- process circuitry communicatively coupled to the control circuitry, further arranged for implementing a Full Precision - Constant Composition Distribution matching, FP-CCDM, algorithm in a logarithmic domain, by consulting a plurality of Look-Up Tables, LUTs, that allow a mapping of the symbols in an arithmetic domain to a logarithmic domain; and performing the CCDM algorithm in the logarithmic domain by making use of shift registers to perform the addition and subtraction steps in the logarithmic domain.

17. The transceiver according to claim 16, further arranged to perform a method according to any of claims 11 - 13.

Description:
Title

AMPLITUDE ENCODING USING A BAND-ESS ALGORITHM AND A LOGARITHMIC-BASED CCDM ALGORITHM

Technical Field

The present invention generally relates to the field of digital communication, and more specifically to a method of encoding/decoding data and corresponding devices.

Brief Description of the drawings

Fig. 1 illustrates the shell occupation in an N-dimensional sphere for ESS and B-ESS.

Fig. 2 illustrates an enumerative sphere shaping trellis for N=3.

Fig. 3 illustrates a band-trellis for N=7.

Fig. 4 illustrates the element-wise ratio of adjacent columns of the band.

Fig. 5 illustrates a plot of the maximum growth factor ρ̃ that satisfies (7) vs. the number of columns that are stored at the rightmost part of the band.

Fig. 6 illustrates a band-trellis for N=9.

Fig. 7 illustrates different slopes for B-ESS in the n-dimensional sphere.

Fig. 8 illustrates the duality of source coding and distribution matching.

Fig. 9 illustrates CCDM for C=[3,2] and A=2 via arithmetic coding.

Fig. 10 illustrates the exponential function F(s) for 1 ≤ s ≤ 2S and log functions.

Fig. 11 illustrates the rate k/N of Log-CCDM.

Description

In a first aspect of the present disclosure, there is presented a method of encoding data for being transmitted from a transmitter to a receiver over a channel, wherein the method is performed by a control circuitry, and comprises the step of encoding data as a shaped coded modulation signal by shaping the signal based on an amplitude selection algorithm resulting in a symmetrical input and by constructing a trellis having a bounded energy sequence of amplitude values by consulting a look-up table, computing and storing a plurality of channel-related energy constraints based on use of a non-linear estimation process, and providing an index for the bounded energy sequences of amplitudes, wherein constructing the trellis includes storing only a reduced band of values of the trellis, and storing and computing on-the-fly computation remainders ensuing from the non-linear estimation process for representing other parts of the trellis.

According to an example, the reduced band of values of a trellis occupy a portion of one of the primary diagonals of a matrix comprising the index for bounded energy sequences of amplitudes.

In an example, the control circuitry is a logic circuitry.

In an embodiment, the storing, and computing the computation remainders is implemented using shift registers.

According to an example, the storing and computing the computation remainders is performed in a logarithmic domain.

In an embodiment, the size of the reduced band of the trellis is determined based on channel quality.

In a further embodiment, a larger size of the reduced band of the trellis is used for a channel of high quality.

According to an example, encoding data comprises a sliding window algorithm based on a non-linear estimation process.

In an example, the method is employed in a digital communications system.

In a second aspect of the present disclosure, there is presented a method of encoding data for being transmitted from a transmitter to a receiver over a channel, wherein the method is performed by a control circuitry, wherein the method comprises the steps of implementing a Full Precision - Constant Composition Distribution matching, FP-CCDM, algorithm in a logarithmic domain, by consulting a plurality of Look-Up Tables, LUTs, that allow a mapping of the symbols in an arithmetic domain to a logarithmic domain, and performing the CCDM algorithm in the logarithmic domain by making use of shift registers to perform the addition and subtraction steps in the logarithmic domain.

According to an embodiment, the plurality of LUTs are stored locally at the transmitter.

According to a further embodiment, the method comprises three LUTs.

In an embodiment, a first of the three LUTs provides a mapping between a symbol and an exponential function F(s), and wherein a second and a third LUT provide a mapping between values in an arithmetic domain and a logarithmic domain.
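To illustrate how such LUTs can replace multiplications, the sketch below uses a logarithmic-number-system-style scheme. This is an assumption for illustration only, not the patent's FP-CCDM construction: the names to_log, from_log, PHI, and log_add, and the precision PREC, are all hypothetical.

```python
import math

# Illustrative LUT-based log-domain arithmetic (LNS-style sketch; names and
# precision are assumptions, NOT taken from the patent).
PREC = 8  # fractional bits of the fixed-point log representation

def to_log(x):
    # LUT direction: arithmetic domain -> logarithmic domain (fixed point).
    return round(math.log2(x) * (1 << PREC))

def from_log(lx):
    # LUT direction: logarithmic domain -> arithmetic domain.
    return 2.0 ** (lx / (1 << PREC))

# Correction table for addition in the log domain:
# log2(a + b) = max + log2(1 + 2^-(max - min)), second term tabulated over d.
PHI = {d: round(math.log2(1.0 + 2.0 ** (-d / (1 << PREC))) * (1 << PREC))
       for d in range(16 << PREC)}

def log_add(la, lb):
    # Addition becomes a max, a subtraction, and one table lookup.
    hi, lo = max(la, lb), min(la, lb)
    return hi + PHI.get(hi - lo, 0)  # far-apart operands: correction ~ 0

# Multiplication in the log domain is a plain addition of log values.
print(from_log(to_log(3) + to_log(5)))          # ~15
print(from_log(log_add(to_log(3), to_log(5))))  # ~8
```

In hardware, the additions and subtractions of these fixed-point log values are the kind of operations that map naturally onto shift registers, in line with the claims above.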

In a third aspect of the present disclosure, there is presented a transceiver in a digital communication system, the transceiver comprising:

- control circuitry,

- encode-decode equipment communicatively coupled with the control circuitry and arranged for encoding data as a shaped coded modulation signal by shaping the signal based on an amplitude selection algorithm resulting in a symmetrical input; the control circuitry being arranged for constructing a trellis having a bounded energy sequence of amplitude values by consulting a look-up table, computing and storing a plurality of channel-related energy constraints based on use of a non-linear estimation process, and providing an index for the bounded energy sequences of amplitudes, wherein constructing the trellis includes storing only a reduced band of values of the trellis, and storing and computing on-the-fly computation remainders ensuing from the non-linear estimation process for representing other parts of the trellis.

According to an exemplary embodiment, the transceiver of the third aspect of the present disclosure is further arranged to perform a method according to any of the exemplary embodiments of the first aspect of the present disclosure.

In a fourth aspect of the present disclosure, there is presented a transceiver in a digital communication system, the transceiver comprising:

- control circuitry;

- process circuitry communicatively coupled to the control circuitry, further arranged for implementing a Full Precision - Constant Composition Distribution matching, FP-CCDM, algorithm in a logarithmic domain, by consulting a plurality of Look-Up Tables, LUTs, that allow a mapping of the symbols in an arithmetic domain to a logarithmic domain; and performing the CCDM algorithm in the logarithmic domain by making use of shift registers to perform the addition and subtraction steps in the logarithmic domain.

According to an exemplary embodiment, the transceiver of the fourth aspect of the present disclosure is further arranged to perform a method according to any of the exemplary embodiments of the second aspect of the present disclosure.

For the optical fiber channel, energy variations in the input signal play an important role in the generation of nonlinear interference. Recently, band-trellis enumerative sphere shaping (B-ESS) was introduced as a constellation shaping technique that limits these energy variations. In this work, we study the implementation of B-ESS. B-ESS is based on a signal trellis in which the signals with high energy variations are pruned. We show that, thanks to the trellis structure created by this pruning, B-ESS can be implemented with an insignificant storage complexity, in the sense that the trellis computation reduces to a set of recursive multiplications with a scalar factor. Moreover, this scalar factor can be adjusted such that the trellis computation is further simplified and realized with binary shifts only. This way, B-ESS can be implemented for arbitrarily long input block-lengths in a streaming mode. As an example, with a quaternary input symbol alphabet, a rate 1.28 bit/sym shaping encoder can be operated at virtually any block-length requiring 0.35 kilobytes of storage.

Probabilistic amplitude shaping (PAS) is a coded modulation strategy that combines constellation shaping and forward error correction. The constellation shaping block is used to control the channel input distribution and this way, the capacity of the additive white Gaussian noise (AWGN) channel can be achieved. Moreover, PAS has attracted considerable attention and been shown to increase the achievable rates for a variety of communication channels, e.g., wireless channels, optical fiber channels, free-space optical channels, underwater optical channels, etc.

The nonlinear interference (NLI) generated during the propagation of an optical signal over the fiber can be modelled as time-varying inter-symbol interference. Thus, the NLI experienced by a symbol transmitted during a certain channel use depends on the content of the adjacent channel uses, creating memory in the channel. The presence of memory in the nonlinear fiber channel implies that good input sequences having certain temporal structures could experience a lower NLI. Motivated by this, temporal shaping has also been studied in the literature. More recently, achievable rates of communication over the fiber channel were computed based on the selection and transmission of good sequences. In certain studies, variations in the energy of the transmitted signal are given as the reason behind NLI, thus hinting that symbol sequences with limited energy variations would experience lower NLI.

Recently, we have proposed band-trellis enumerative sphere shaping (B-ESS) for generating shaped channel input sequences with limited energy variations. B-ESS was implemented by modifying the ESS algorithm. ESS operates based on an amplitude trellis that represents signal points within a sphere as in Fig. 1 (left). We modified ESS such that only a band-like portion of the trellis is considered, in which only the sequences with a gradual increase in energy are represented.

These signal points are located on (but do not fill) a number of outermost shells of the sphere as in Fig. 1 (right). Numerical results demonstrated that B-ESS provides significant gains in terms of effective SNR and error probability with respect to both uniform signalling and ESS. As an additional benefit, we indicated that the band structure enables a streaming implementation, i.e., the shaping block-length can be increased arbitrarily without an increase in storage complexity. Increasing the shaping block-length has been shown to improve the performance for linear channels, and for the nonlinear fiber channel when a carrier phase recovery algorithm is implemented. In this disclosure, we study the implementation of B-ESS. We demonstrate that thanks to the band-like structure imposed on the enumerative trellis, B-ESS can be implemented with minimal storage complexity. This is because the trellis can now be computed via recursive scalar multiplications. Furthermore, we show that this scalar multiplicative factor can be selected such that the trellis computation consists only of binary shifts.

A. Enumerative Sphere Shaping

Enumerative sphere shaping (ESS) assigns an index to the sequences in a sphere shaping set, where A is the amplitude alphabet. For this purpose, an amplitude trellis is created as shown in Fig. 2. The sphere shaping set A* in (1) (and hence the trellis) is specified by N, A, and E_max. For alphabets of the form A = {1, 3, ..., M - 1}, which is the amplitude alphabet of M-ary amplitude-shift keying (M-ASK) with M = 2^m, the maximum energy can be written as E_max = 8(L - 1) + N, i.e., L = (E_max - N)/8 + 1. Typically, the inputs of a wireless communication channel are drawn from the M²-ary quadrature amplitude modulation (M²-QAM) alphabet, which is the Cartesian product of the M-ASK alphabet with itself. Dual-polarized inputs of optical communication channels, however, are drawn from the Cartesian product of the M²-QAM alphabet with itself. Here, L is the number of nodes in the final column, i.e., the height of the trellis, while the width of the trellis is N + 1. Thus, the nodes in the trellis are specified by (l, n) for n = 0, 1, ..., N and l = 0, 1, ..., L - 1.

In Fig. 2, the nodes (l, n) are labelled with (e), the energy level e = 8l + n that they represent, which is the accumulated energy of the sequences over their first n amplitudes, more precisely, Σ_{i=1}^{n} |a_i|². Therefore, the nodes can equivalently be specified by (n, e). Branches that arrive at a node (l, n) in the nth column represent a_n. Each N-sequence is, therefore, represented by an N-branch path, starting from node (0, 0) and arriving at a node (l, N) in the final, i.e., rightmost, column. As an example, the dashed path in Fig. 2 represents (3, 1, 3) with energy 19.

The numbers written inside each node (l, n) give the number of ways to arrive at the final column, starting from (l, n). We denote the column vector that stores the values in the nth column of a trellis by c_n = [c_{L-1,n} c_{L-2,n} ... c_{0,n}]^T. The value inside node (0, 0), i.e., c_{0,0}, is the number of sequences represented in the trellis, which is 11 for Fig. 2. The trellis can be computed recursively via c_n = A c_{n+1} for n = N - 1, N - 2, ..., 0, where the initial condition is c_N = [1 1 ... 1]^T. Here, we call the L-by-L matrix A the ESS adjacency matrix. Illustratively, a_{i,j} = 1 if there is a connection between the nodes (i, n) and (j, n + 1). As an example, the adjacency matrix of the trellis in Fig. 2 is given in (3). The structure of this adjacency matrix depends on A and L.

The trellis can be stored in the form of a matrix C = [c_0 c_1 ... c_N].

As an example, the C of the trellis in Fig. 2 is given in (4). Note that in an amplitude trellis, the nodes that correspond to energy levels that are not possible are not included. For instance, the node (l = 2, n = 1), which would require the existence of an amplitude with energy 17, is not present in Fig. 2. Accordingly, the elements of C that correspond to these nodes are filled with zeros, as in (4). Based on the trellis, ESS realizes a lexicographical mapping from binary strings, i.e., information sequences, to the amplitude sequences in a sphere. For this, C must be stored. The input length of ESS and the corresponding shaping rate are defined in (5) and (6), in bits and bits per amplitude (b/amp), respectively. A very detailed explanation of the ESS algorithms and discussions of their storage and computational complexity can be found in the literature.
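The counting recursion behind this trellis can be sketched as a plain dynamic program. This is a hedged sketch: the parameters are chosen so that the count matches the 11 sequences of Fig. 2 (assuming A = {1, 3, 5} and E_max = 27, which give L = 4), and the function name and dictionary layout are illustrative, not the patent's implementation.

```python
# Hedged sketch of the counting recursion behind the amplitude trellis:
# c[n][e] holds the number of ways to complete a sequence from column n
# when the accumulated energy so far is e.
def build_trellis(amps, N, E_max):
    c = [dict() for _ in range(N + 1)]
    for e in range(E_max + 1):
        c[N][e] = 1  # final column: one (empty) completion per admissible energy
    for n in range(N - 1, -1, -1):
        for e in range(E_max + 1):
            # A branch for amplitude a adds energy a**2 (cf. c_n = A c_{n+1}).
            total = sum(c[n + 1].get(e + a * a, 0) for a in amps)
            if total:
                c[n][e] = total
    return c

# Assumed parameters for Fig. 2: A = {1, 3, 5}, N = 3, E_max = 27 (L = 4).
trellis = build_trellis((1, 3, 5), 3, 27)
print(trellis[0][0])  # -> 11, the number of sequences in the sphere
```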

B. Approximate Enumerative Sphere Shaping

When C is computed using (2), it satisfies (7) with equality. We have shown that the invertibility of ESS is guaranteed for any trellis that satisfies (7). For the same set of parameters N, A, and L, the only difference between the exact trellis that satisfies (7) with equality and an approximate trellis that does not is that some sequences inside the sphere are lost in the approximate trellis. This is in the sense that the ESS algorithm operating on this approximate trellis cannot output some sequences from the sphere, which causes a rate loss.

This property can be used to create a trade-off between the performance (via rate) and the complexity (via required precision and storage) as follows. After the computation of a number c_{l,n} (either via (2) or via the right-hand side of (7)), the result can be rounded down to n_α bits and approximated as c_{l,n} ≈ α 2^β, where α is an n_α-bit mantissa and β is an n_β-bit exponent. This approximation rounds all but the most significant n_α bits of c_{l,n} down to zero. The number c_{l,n} can then approximately be stored in the form (α, β) using (n_α + n_β) bits. The exponent size n_β should be selected as in (9), in bits. The resulting rate loss is upper-bounded by -log2(1 - 2^{-(n_α - 1)}) b/amp. By proper selection of n_α, the required storage for C can be decreased significantly with a negligible rate loss. We call enumerative trellises stored with this principle bounded-precision trellises.
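The mantissa-exponent rounding just described can be sketched in a few lines. Function names are illustrative, and a plain integer stands in for a trellis value c_{l,n}:

```python
# Hedged sketch of the bounded-precision storage described above: keep the
# n_alpha most significant bits of a trellis value as the mantissa alpha,
# round the remaining low bits down to zero, and record the shift as the
# exponent beta, so that c is approximated by alpha * 2**beta.
def round_down(c, n_alpha):
    beta = max(c.bit_length() - n_alpha, 0)  # number of discarded low bits
    return c >> beta, beta

c = 123_456_789
alpha, beta = round_down(c, 8)
approx = alpha << beta  # alpha * 2**beta <= c always (rounding down)
# The relative error stays below 2**-(n_alpha - 1), consistent with the
# rate-loss bound -log2(1 - 2**-(n_alpha - 1)) quoted in the text.
assert (c - approx) / c < 2 ** -(8 - 1)
```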

C. Band-trellis: Altering the Trellis to Limit Energy Variations

To discard sequences with large energy variations, we consider a band-like portion of the amplitude trellis as shown in Fig. 3. Unlike Fig. 2, only the trellis values are included in Fig. 3, and the nodes are not labelled with their corresponding energy levels for the sake of simplicity. The motivation behind the modification in Fig. 3 is that the sequences represented in this band will have a smaller energy variance var[A²] than the others. As an example, consider the sequence (7, 3, 1, 1, 1, 1, 1), which is highlighted in yellow in Fig. 3. This sequence is in the complete trellis, but not in the band. Also consider the sequence (3, 3, 3, 3, 3, 3, 3), which is drawn in red and belongs to the band. The first sequence has an energy variance var[A²] = 274.3, while the second has no variations. Thus, we expect the sequences in the band to have a lower average energy variance than the sequences in the complete sphere. We emphasize that the band-trellis employs double pruning of sequences with respect to uniform signalling, which has an N-cubical signal set. The first pruning is due to the sphere constraint imposed by E_max, which leads to Fig. 1 (left). The second is due to the newly introduced band structure, which then leads to Fig. 1 (right).

We specify the band-trellis with three values (in addition to N, A, and L): the initial height h_i and initial width w_i of the active nodes in the top-right portion of the trellis, and the slope s of the band (in nodes per column). For Fig. 3, h_i = w_i = 3 and s = 1. From these parameters, which are design choices, two other parameters can be derived: the final width w_f = N + 1 - (L - h_i)/s on the bottom-left portion of the trellis, and the height of the band h = h_i + ⌊s(w_i - 1)⌋. The parameter h will be used when computing the trellis via matrix multiplications in the next section. The parameter w_f, on the other hand, can be used to check the validity of the design choices, since it must be an integer larger than 0. Finally, since the trellis is computed from right to left, we call the rightmost w_i - 1 columns of the trellis the initial portion, the leftmost w_f - 1 columns the final portion, and the middle N - w_i - w_f + 3 columns the band portion.
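As a quick check of these definitions, the derived parameters can be computed from the design choices. A sketch, with an illustrative function name; the assertion encodes the requirement that w_f be a positive integer:

```python
import math

# Hedged sketch: deriving w_f (final width) and h (band height) from the
# design choices N, L, h_i, w_i, s, exactly as defined above.
def band_params(N, L, h_i, w_i, s):
    w_f = N + 1 - (L - h_i) / s          # must be an integer larger than 0
    h = h_i + math.floor(s * (w_i - 1))  # height of the band
    assert w_f > 0 and w_f == int(w_f), "invalid design choice"
    return int(w_f), h

# Example 1 of the text: N = 128, L = 129, h_i = w_i = 3, s = 1.
print(band_params(128, 129, 3, 3, 1))  # -> (3, 5)
```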

The trellis can again be stored in the form of a matrix C, which is given in (10) for Fig. 3. Again, the elements of C that correspond to nodes that are not available are filled with zeros. The ESS algorithms can also be operated on this band-trellis. The corresponding shaping approach is called band-ESS (B-ESS). For this, the trellis C must again be stored. The input length and the shaping rate can again be computed using (5)-(6).

D. Comparison: ESS vs. B-ESS

In Table I, we compare ESS and B-ESS at a shaping rate of 1.5 b/amp for N = 108 with A = {1, 3, 5, 7}. B-ESS has a larger L, i.e., shells which are outside the sphere of ESS are considered, as illustrated in Fig. 1. This is because the band structure removes some sequences from the trellis, which causes rate loss. To compensate for this loss, B-ESS starts with a larger sphere with a larger rate than that of ESS, and then imposes the band structure. We see for the example in Table I that B-ESS leads to a 0.66 dB degradation in average energy and thus, it is expected to perform worse than ESS for the AWGN channel and in the linear regime of optical channels. However, the energy variance for B-ESS is 0.68 dB smaller than that of ESS, from which we expect a gain in the nonlinear regime. Here, we computed the energy variance for the unit-energy random variable for a fair comparison. Finally, B-ESS also has a smaller kurtosis, which has likewise been demonstrated to improve the performance in the nonlinear regime.

III. STREAMING SHAPING ENCODERS WITH B-ESS

We now focus on the computation and the storage of the band portion of the trellis. As N → ∞, this part will dominate the required storage. We define a new matrix B with h rows that represents the non-zero values of C for this band portion. Accordingly, we denote the column vector that stores the non-zero values in the nth column by b_n = [b_{h-1,n} b_{h-2,n} ... b_{0,n}]^T for n = N - w_i + 1, N - w_i, ..., w_f - 1. For Fig. 3, this matrix is given in (11).

With a slight abuse of notation, we do not start the column indices of B at zero, to be consistent with the indices of C. This can be confirmed by comparing (10), (11), and Fig. 3 (dash-dotted parallelogram). Note that B consists of some columns of C (the band portion) without the zero entries. As an example, B in (11) consists of the values written in red in (10).

Similar to (2), B can be computed recursively via b_n = A b_{n+1}, i.e., (12), initialized with b_{N-w_i+1} (i.e., the rightmost column of the band) and using an h-by-h B-ESS adjacency matrix A. As an example, for the trellis in Fig. 3, the adjacency matrix is given in (13).

The structure of the adjacency matrix A used in (12) depends on A, s, and h. Finally, we also define another column vector, which we call the ratio vector r_n = [r_{h-1,n} r_{h-2,n} ... r_{0,n}]^T, that stores the element-wise ratios r_{i,n} = b_{i,n}/b_{i,n+1} of adjacent columns b_n and b_{n+1} for i = 0, 1, ..., h - 1. As an example, for the trellis in Fig. 3 (and for the B in (11)), r_3 = [11/4 17/7 26/10 27/12 18/8]^T.

A. Evolution of the Band-trellis

An interesting property of the band-trellis is that for large enough values of N, the elements of r_n become identical as n → w_f. In other words, as one moves towards node (0, 0), i.e., towards the left, adjacent columns become approximately linked to each other by a scalar multiplicative factor. Moreover, this scalar multiplicative factor ρ(A) is the spectral radius of A, i.e., the greatest (in absolute value) eigenvalue of A. This is a result of the Perron-Frobenius theorem for nonnegative and irreducible square matrices. The irreducibility of A can be shown by observing that its associated directed graph is strongly connected.

Example 1 (Growth of the Band with Rate ρ(A)). Consider the band-trellis constructed for N = 128, L = 129, A = {1, 3, 5, 7}, h_i = w_i = 3, and s = 1. The final width of the trellis is w_f = N + 1 - (L - h_i)/s = 3. The height of the band is h = h_i + ⌊s(w_i - 1)⌋ = 5 and thus, the ratio vector r_n is 5-by-1 for n = N - w_i, N - w_i - 1, ..., w_f - 1 = 125, 124, ..., 2. The adjacency matrix A for this trellis is already given in (13). The eigenvalues of A include 1, and the largest in absolute value is 2.4422.

Thus, ρ(A) = 2.4422. In Fig. 4, we plot the elements of r_n as a function of n. We see that the ratios quickly converge to ρ(A). At n = 110, the ratios are equal to ρ(A) up to the fifth digit after the decimal point. At n = 60, the ratios are equal to ρ(A) up to the twelfth digit after the decimal point.
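The convergence of the ratios to the spectral radius can be reproduced with any nonnegative irreducible matrix. The sketch below uses an illustrative 4-by-4 matrix, NOT the adjacency matrix in (13), so its spectral radius differs from 2.4422:

```python
import numpy as np

# Illustration of the Perron-Frobenius behaviour described above: for a
# nonnegative irreducible matrix, the element-wise ratios of successive
# iterates b_n = A b_{n+1} converge to the spectral radius rho(A).
# NOTE: this matrix is illustrative only; it is NOT the matrix (13).
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
rho = max(abs(np.linalg.eigvals(A)))  # spectral radius

b = np.ones(4)
for _ in range(60):
    b_next = A @ b                       # one step of the column recursion
    ratios = b_next / b                  # element-wise ratio of adjacent columns
    b = b_next / np.linalg.norm(b_next)  # normalise to avoid overflow
print(ratios)  # all entries converge to rho(A), about 2.618 for this matrix
```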

Using the convergence of the r_{i,n}'s, the trellis can be approximately computed via b_n ≈ ρ(A) b_{n+1}, i.e., (14), instead of (12), initialized with b_{N-w_i+1}. However, the corresponding C does not necessarily satisfy (7), especially as n → N, due to the fluctuation of the ratios around ρ(A), see Fig. 4. To avoid this problem, we can (1) store y additional columns b_{N-w_i-y+2}, b_{N-w_i-y+3}, ..., b_{N-w_i+1} from the band (in addition to the initial portion of the trellis), and (2) select a backed-off growth factor ρ̃ < ρ(A). Then we can use b_n ≈ ρ̃ b_{n+1}, i.e., (15), instead of (14), initialized with b_{N-w_i-y+2}, to compute the trellis. This backoff should be adjusted such that the C that corresponds to the B computed with (15) satisfies (7). It will, however, inevitably lead to some rate loss, since (7) will no longer be satisfied with equality.

Example 2 (Growth of the Band with Rate ρ̃ < ρ(A)).

In Fig. 5, we show the maximum ρ̃ that satisfies (7) when used in (15) to compute the trellis for the set of parameters given in Example 1. We also show the behavior when N and L are increased to 256 and 257, respectively. The curves are plotted as a function of the number y of columns that are stored from the band (in addition to the initial portion of the trellis). It is seen that as the number of stored columns y increases, the required backoff decreases. This is because the ratios r_{i,n} converge to ρ(A), as shown in Fig. 4. On the other hand, for small values of y, the backoff is so large that the resulting rate loss decreases the input length of the shaper significantly. To keep k = 164 and 329 for N = 128 and N = 256, respectively, which are the values obtained with the exact computation, we need to store at least 7 and 9 columns from the band, respectively.

B. Trellis Computation via Binary Shifts

We now focus on the practical case in which B-ESS is implemented in hardware using a binary number representation. Moreover, we assume that the trellis is stored with bounded precision as explained in Sec. II-B: each c_{l,n} is stored with an n_α-bit mantissa and an n_β-bit exponent, see (8).

The approximate relation in (14) can be extended to b_n ≈ ρ(A)^p b_{n+p}, i.e., (16). If the period p is selected such that ρ(A)^p is slightly larger than 2^t for some integer t, we can write (16) as b_n ≈ 2^t b_{n+p}, i.e., (17), which can be used to compute the trellis by t-bit shifts. For a column b_{n+p} whose elements are stored as mantissa-exponent pairs, (17) tells us that the elements of b_n have the same mantissas, while their exponents are increased by t. In such a case, we only need to store the first y > p additional columns b_{N-w_i-y+2}, b_{N-w_i-y+3}, ..., b_{N-w_i+1} from the band (in addition to the initial portion of the trellis). The rest can then be computed in a block-wise fashion, p columns at a time, by just increasing the exponents of the previous p columns by t, i.e., by appending t zeros to the contents of the previous p columns.
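The selection of p and t can be sketched as follows; the function name and the search bound p_max are illustrative:

```python
import math

# Hedged sketch: pick a period p so that rho(A)**p exceeds an integer power
# of two, 2**t, by as little as possible; trellis columns can then be
# extended by t-bit shifts, p columns at a time, as in (17).
def find_shift_period(rho, p_max=16):
    best = None
    for p in range(1, p_max + 1):
        t = math.floor(p * math.log2(rho))  # largest t with 2**t <= rho**p
        excess = p * math.log2(rho) - t     # log2 of how far rho**p exceeds 2**t
        if best is None or excess < best[2]:
            best = (p, t, excess)
    return best

p, t, _ = find_shift_period(2.4422)
print(p, t)  # -> 7 9, matching the text: rho**7 = 518.24 is slightly above 2**9
```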

C. Storage Complexity of B-ESS

Consider the set of parameters given in Example 1, for which ρ(A) = 2.4422, and thus ρ(A)^7 = 518.24 > 2^9, i.e., p = 7 and t = 9. Here we switch to the bounded-precision implementation, i.e., instead of storing the numbers with full precision using k + 1 = 165 bits, we only use n_α + n_β = 16 + 8 = 24 bits.

With this approximation, k is still 164, as shown in row † in Table II. Then, by only storing 14 columns (w_i − 1 = 2 from the initial portion, y = 12 from the band), the rest of the trellis can be computed by appending p = 7 new columns recursively via b_n ≈ 2^9 b_(n+7). In the mantissa-exponent representation, this means the mantissas of b_(i,n) and b_(i,n+7) are the same, while the latter has its exponent increased by 9. Therefore, the rate is not decreased due to bounded precision or due to the approximation in (17). However, the required storage, which was 13.12 and 1.27 kB in the full- and bounded-precision (FP and BP) cases (rows * and ★), resp., is now 0.19 kB.

Next, the same B-ESS trellis is computed for other values of the mantissa length n_α (and the corresponding values of n_β, see (9)). We see from Table II that as we gradually decrease n_α from 16 to 8, the number of columns we need to store increases from 14 to 31. This is because as n_α decreases, the inaccuracy due to bounded precision, combined with the approximation in (17), decreases k below 164, i.e., incurs a rate loss of 1/N = 1/128 b/amp. To compensate for this, we need to store more columns that are not affected by the approximation in (17), and start this approximation at a later column of the band. We see from Table II that the minimum required storage (0.14 kB) is obtained by storing 14 columns and using n_α = 10 and n_β = 8.

Then, the B-ESS trellis is extended, i.e., N is increased while the other parameters are kept fixed at L = N + 1, w_i = h_i = 3, s = 1, and n_α = 8. We first observe that since k increases with increasing N, n_β also increases via (9). As an example, n_β = 11 for N = 1024. We therefore fix n_α = 8 and n_β = 11. For these values, by storing w_i + y − 1 = 31 columns in total, the trellis can be extended from N = 128 to 256, 512, and 1024 without requiring any additional storage, see rows • in Table II. In other words, with a fixed storage complexity, we can increase the shaping block-length to arbitrarily large values of N at a constant rate (R_s = 1.28 here). Consequently, we achieve block-length adaptivity at constant rate: with 0.35 kB of storage, we can operate at virtually any N with k ≈ 1.28N, i.e., the storage is independent of N. We numerically see from Table II that in the FP case, the required storage scales with N^2, while it scales with N in the BP case.

Remark 1 (Sliding-window shaping). When a BP trellis is used with n_α-bit mantissas and n_β-bit exponents, the shaping effectively consists of successive n_α-bit subtractions of c_(i,n) from the k-bit input. The locations of these subtractions are determined by the exponents: a subtraction skips a number of least significant bits of the input given by the exponent, and affects only the least significant n_α bits of the remaining part. This is illustrated in known documents. Combined with the approximate B-ESS trellis computation (17) based on shift registers, the shaping procedure reduces to the implementation of successive subtractions with subtrahends taken from a limited set, since the mantissas repeat themselves via (17) while only the exponents change.

D. Non-integer Band Slopes

In case a relatively small shaping rate is targeted or an amplitude alphabet with relatively small cardinality is used, in other words when L is smaller than N, the slope of the band is s ≈ L/N < 1. In such cases, the slope can be selected in the form s = 1/ŝ where ŝ is an integer, and the band can be constructed in a "stair-like" manner where each step is ŝ columns long. This can be related to the signal space by considering Fig. 7. Since a smaller slope s roughly means a smaller L, it translates into a signal space that consists of shells with smaller radii. Here, illustratively, the signal points on the outer (green) shells can be indexed, e.g., by B-ESS with s = 1, while the ones on the inner (red) shells can be indexed, e.g., by B-ESS with s = 1/2.

As an example, consider the band-trellis shown in Fig. 6 where s = 1/ŝ = 1/2. Now, the band-like portion of the trellis evolves by moving 1 node vertically per ŝ = 2 columns. For instance, the second and the third columns are horizontally at the same level, while the fourth and the fifth are placed 1 node above. In this form of the trellis, the relation between b_2 and b_3 is not the same as the relation between b_3 and b_4. More precisely, for Fig. 6, consecutive columns are related through the alternating adjacency matrices A_1 and A_2 given in (19). For such trellises, we define the ratio vector slightly differently: we now consider the ratio of columns separated by ŝ = 2 columns. As an example, for the trellis in Fig. 6, r_4 = [5/1, 16/4, 30/7, 53/11, 61/14]^T. With this definition of the ratio vector, the elements of r_n become identical as n → w_f, and now this value ρ(A) is the spectral radius of A = A_1 A_2. Note that the eigenvalues, i.e., the spectra, of A_1 A_2 and A_2 A_1 are identical, and thus the result would not change as long as the ratio vector is defined for columns separated by ŝ = 2 columns.

Example 3 (Growth of the Band with Rate ρ(A), Revisited).

Consider the band-trellis constructed for N = 128, L = 67, A = {1, 3, 5, 7}, h_i = w_i = 4, and s = 1/2. For this trellis, w_f = 3 and h = 5, and thus the ratio vector r_n is 5-by-1. The adjacency matrices A_1 and A_2 for this trellis are already given in (19). The eigenvalues of A_1 A_2 are 1.07 exp(±41j), 4.3866, 2.2695, and 0.1993. Thus, ρ(A) = 4.3866. In Fig. 4, we plotted the elements of r_n as a function of n. We see again that the elements of the ratio vectors quickly converge to ρ(A).

Recent years have seen renewed attention to arithmetic coding (AC). This is thanks to the use of AC for distribution matching (DM) to control the channel input distribution in probabilistic amplitude shaping. There are two main problems inherent to AC: (1) its required arithmetic precision grows linearly with the input length, and (2) high-precision multiplications and divisions are required.
Here, we introduce a multiplication-free AC-based DM technique via three lookup tables (LUTs) which solves both problems above. These LUTs are used to approximate the high-precision multiplications and divisions by additions and subtractions. The required precision of our approach is shown to grow logarithmically with the input length. We prove that this approximate technique maintains the invertibility of DM. At an input length of 1024 symbols, the proposed technique achieves negligible rate loss (< 0.01 bit/sym) against full-precision DM, while requiring less than 4 kilobytes of storage.

Distribution matching (DM) converts a binary string into a symbol sequence with a desired distribution. DM is an essential block in probabilistic amplitude shaping (PAS) that controls the channel input distribution. Thanks to this control, PAS achieves the capacity of the additive white Gaussian noise (AWGN) channel. PAS has also become popular in wireless and fiber-optical communications.

The DM technique used when PAS was introduced was constant composition distribution matching (CCDM). In CCDM, binary strings — which are the data sequences in this context — are converted into amplitude sequences with a fixed composition. By selecting this composition such that the resulting amplitude distribution resembles a one-sided Gaussian distribution and by selecting the signs uniformly, a Gaussian-like channel input is obtained, which increases the achievable rates for the AWGN channel and optical channels.

When CCDM was first introduced, it was implemented using arithmetic coding (AC). AC is a source coding algorithm that represents nonuniform source sequences by subintervals of the unit interval. Since DM is the dual operation to compression, AC can be used to realize DM in an "inverted encoder-decoder pair" setup as illustrated in Fig. 1: matching is realized via AC decoding (decompression), while dematching is realized via AC encoding (compression).

There are two main problems inherent to AC. First, the required arithmetic precision grows linearly with the input length, which makes AC very challenging to realize for inputs longer than a thousand symbols. Long block-lengths are necessary to decrease the rate loss of CCDM. Second, multiplications and divisions are necessary to realize the algorithm. There is a large body of research dedicated to solving these problems in the context of source coding. For DM, the first problem has been solved via a finite-precision implementation in the prior art. To the best of our knowledge, there is no study solving the second problem for DM.

In this disclosure, we introduce "Log-CCDM", an approximate logarithmic-domain (log-domain) AC-based algorithm that solves both problems above. Log-CCDM works based on three lookup tables (LUTs). The high-precision multiplications and divisions required in AC are approximated by low-precision additions and subtractions in the log domain. The required arithmetic precision of Log-CCDM grows logarithmically with the input length. We prove that the invertibility of DM is maintained. We demonstrate that the performance of this approximate algorithm (in terms of rate) depends on the sizes of the LUTs. Requiring less than a few kilobytes (kB), the rate loss of Log-CCDM against full-precision CCDM (FP-CCDM) is negligible, i.e., < 0.01 bit/sym. This performance is achieved using 20-bit arithmetic operations, which is comparable to that of the finite-precision algorithm of [16] that requires multiplications and divisions.

II. DISTRIBUTION MATCHING VIA ARITHMETIC CODING

Consider a composition C = [n_0, n_1, ..., n_(A−1)] of the symbols in A = {0, 1, ..., A − 1}, resp., where Σ_(a∈A) n_a = N and A is the symbol alphabet. The corresponding set C_cc of CC symbol sequences x = (x_1, x_2, ..., x_N) consists of all sequences that satisfy Σ_(i=1..N) 1[x_i = a] = n_a for a ∈ A. Here, 1[·] is the indicator function, which is 1 when its argument is true and 0 otherwise. CCDM is a shaping technique that maps k-bit strings (indices) v = (v_1, v_2, ..., v_k) to CC symbol sequences x ∈ C_cc. This is called matching, while the reverse operation is called dematching. The matching rate is then defined as k/N bit/sym. CCDM can be realized via AC.

In AC, the binary string v is represented by a number d(v) ∈ [0, 1) which is defined as d(v) ≜ Σ_(i=1..k) v_i 2^(−i). Further, each symbol sequence x corresponds to a subinterval l(x) of the interval [0, 1). These subintervals partition the interval [0, 1). During arithmetic encoding, the interval l(x) is found based on x, and then a number d(v) ∈ l(x) is determined. The outcome of this process is v. During arithmetic decoding, the input v is mapped to the sequence x if d(v) ∈ l(x).

The interval l(x) can be found based on the probability model p_n(a) ≜ Pr{X_n = a | X_1, X_2, ..., X_(n−1)} for n = 1, 2, ..., N and a ∈ A. The base b_N and width w_N of the interval l(x) = [b_N, b_N + w_N) are computed via the recursions

b_n = b_(n−1) + w_(n−1) Σ_(a<x_n) p_n(a), (20)
w_n = w_(n−1) p_n(x_n), (21)

for n = 1, 2, ..., N, where b_0 = 0 and w_0 = 1. From (21), we see that the width w_N = Π_(n=1..N) p_n(x_n), which is equal to p(x) via the chain rule. Thus, recursions (20)-(21) result in intervals l(x) of length p(x). Typically, d(v) ∈ l(x) is selected as the number that leads to the v with the shortest representation. The minimum length of v is ⌈−log p(x)⌉ bits, and AC generates variable-length v in general. Here, log denotes the binary logarithm.
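The interval computation above can be sketched with exact rational arithmetic. The function name `ac_interval` is ours, and the i.i.d. uniform probability model used for the check is an illustrative assumption rather than the CCDM model discussed below:

```python
from fractions import Fraction

def ac_interval(x, model):
    """Compute l(x) = [b_N, b_N + w_N) via the standard AC recursions:
    b_n = b_(n-1) + w_(n-1) * sum_{a < x_n} p_n(a),  w_n = w_(n-1) * p_n(x_n)."""
    b, w = Fraction(0), Fraction(1)
    for n, xn in enumerate(x, start=1):
        p = model(n, x[:n - 1])            # p_n(a) for each symbol a
        b += w * sum(p[:xn], Fraction(0))  # shift base past smaller symbols
        w *= p[xn]                         # shrink width by p_n(x_n)
    return b, w

# Illustrative model (an assumption): i.i.d. uniform bits, p_n(0) = p_n(1) = 1/2.
uniform = lambda n, prefix: [Fraction(1, 2), Fraction(1, 2)]

b, w = ac_interval([1, 0, 1], uniform)
assert (b, w) == (Fraction(5, 8), Fraction(1, 8))  # l(x) = [0.101b, 0.110b)
```

With the uniform model the final width is 2^(−N), matching w_N = Π p_n(x_n); substituting a different `model` changes only the probability computation, not the recursions.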

In the case of CCDM, the order of the coding operations is inverted. At the transmitter, arithmetic decoding is used to map v to an x ∈ C_cc, while arithmetic encoding realizes the reverse mapping at the receiver. For CCDM, the probability model p_n(a) can be obtained by considering the composition C of the sequence x and the composition of the already-processed symbols (x_1, x_2, ..., x_(n−1)). It is given by (22), the ratio of the number of not-yet-processed occurrences of a to the number of not-yet-processed symbols.

Via (22), the symbol probabilities are initialized as p_1(a) = n_a/N, and recursively updated by decreasing n_a whenever a symbol x = a is processed. From (22), we see that p(x) = Π_(n=1..N) p_n(x_n) = (Π_(a=0..A−1) n_a!)/N! = 1/|C_cc| for all x ∈ C_cc. A one-to-one mapping from v to x ∈ C_cc is established if each l(x) contains at most one d(v). Since the intervals are of equal length 1/|C_cc|, the maximum k which guarantees this is k = ⌊log |C_cc|⌋, and AC becomes fixed-length coding.
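The input length k follows directly from the composition via the multinomial count |C_cc| = N!/Π n_a!. A small sketch (the function name is ours):

```python
from math import factorial, prod

def ccdm_params(composition):
    """Return |C_cc| = N! / prod(n_a!) and k = floor(log2 |C_cc|)."""
    N = sum(composition)
    n_seq = factorial(N) // prod(factorial(na) for na in composition)
    k = n_seq.bit_length() - 1  # floor(log2) of a positive integer
    return n_seq, k

# Composition [3, 2]: N = 5, C(5, 2) = 10 sequences, so k = floor(log2 10) = 3.
assert ccdm_params([3, 2]) == (10, 3)
```

Using `bit_length() - 1` keeps the floor of the logarithm exact even for counts far beyond floating-point range.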

Example 1 (Binary-output CCDM).

Consider the composition C = [3, 2] for A = {0, 1}, i.e., we want to generate binary sequences of length N = 5 containing 2 ones. There are 10 such sequences and k = 3. As an example, let us find the CC sequence x that is mapped to v = (1, 1, 0) (d(v) = 0.75), which is written in blue in Fig. 9. The first symbol x_1 partitions the interval [0, 1) in fractions p_1(0) = n_0/N = 3/5 and p_1(1) = n_1/N = 2/5. Since d(v) = 0.75 ∈ [6/10, 1), this interval is selected and x_1 = 1. Then the second symbol x_2 partitions the interval [6/10, 1) in fractions p_2(0) = n_0/(N − 1) = 3/4 and p_2(1) = (n_1 − 1)/(N − 1) = 1/4. Since d(v) = 0.75 ∈ [6/10, 9/10), this interval is selected and x_2 = 0. This procedure is repeated until x_5, and the final sequence x = (1, 0, 0, 1, 0) is obtained.

During matching, the interval l(x) that includes d(v) can also be found without storing the base b_n, but instead by successively subtracting it from d(v). The corresponding dematching algorithm can also be realized straightforwardly. A pseudocode for this FP-CCDM is given in Algorithm 1 for A = {0, 1}.
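A minimal Python sketch of this matching/dematching pair for A = {0, 1}, using exact fractions as a stand-in for full-precision arithmetic. The function names are ours, and the grid-point recovery in the dematcher is guaranteed only for sequences produced by the matcher:

```python
from fractions import Fraction
from math import ceil

def ccdm_match(v, composition):
    """Map a k-bit index v to a CC binary sequence, tracking only the
    remainder of d(v): the base is subtracted off, as in the recursive
    base subtraction of Algorithm 1."""
    n0, n1 = composition
    l = sum(Fraction(bit, 2 ** i) for i, bit in enumerate(v, start=1))  # d(v)
    w, x = Fraction(1), []
    while n0 + n1 > 0:
        w0 = w * Fraction(n0, n0 + n1)  # width of the subinterval for symbol 0
        if l < w0:
            x.append(0); n0 -= 1; w = w0
        else:
            x.append(1); n1 -= 1; l -= w0; w -= w0
    return x

def ccdm_dematch(x, composition, k):
    """Recover v: re-accumulate the interval base, then take the k-bit grid
    point d(v) inside l(x) (unique, since grid spacing >= interval width)."""
    n0, n1 = composition
    b, w = Fraction(0), Fraction(1)
    for xn in x:
        w0 = w * Fraction(n0, n0 + n1)
        if xn == 0:
            n0 -= 1; w = w0
        else:
            n1 -= 1; b += w0; w -= w0
    idx = ceil(b * 2 ** k)  # smallest multiple of 2^-k at or above the base
    return [int(bit) for bit in format(idx, f"0{k}b")]

# Example 1: v = (1, 1, 0), C = [3, 2] maps to x = (1, 0, 0, 1, 0).
assert ccdm_match([1, 1, 0], [3, 2]) == [1, 0, 0, 1, 0]
assert ccdm_dematch([1, 0, 0, 1, 0], [3, 2], 3) == [1, 1, 0]
```

The highlighted subtraction of Algorithm 1 appears here as `l -= w0`: the matcher never stores the absolute base, only the remainder of the input index within the current interval.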

The recursive subtraction of base from the input index is realized in line 5 (highlighted in red). This algorithm can also be realized in the log domain, which is the main idea behind Log-CCDM. Then the width would be updated by an addition and a subtraction, instead of the multiplication and the division in lines 6 and 11 (highlighted in blue).

III. LOG-CCDM WITH LUTS

We will explain Log-CCDM for the binary case for simplicity. Binary DM can be used to approximately shape the channel inputs. Extension to nonbinary alphabets is possible for all the techniques discussed in this section.

A. Log-CCDM

We define an exponential function F(s) in (23) for positive integer s, where r > 0 and d > 0 are integers. In (23), both S and M are positive integers. We see from (23) that F(s) can be computed for any positive integer s by storing F(s) only for s = 1, 2, ..., S.

This requires a LUT with S entries. For s > S, F(s) can be computed with only base-2 shifts, thanks to the 2^(−d) factor in (23). We assume that M is an integer power of two, and each entry of this LUT is stored with log M bits. Note that with log M bits, only the nonnegative numbers below M can be stored in an exact manner. This, i.e., F(s) < M for 1 ≤ s ≤ S, is ensured when S < M from (23). An example of F(s) is shown in Fig. 10.
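The LUT-plus-shifts structure can be sketched as follows. Since the exact form of (23) is not reproduced here, the stored values are computed from an assumed concrete form, F(r) = ⌈M·2^(−r/S)⌉ for r = 1, ..., S; what the sketch relies on from the text is only the extension rule F(r + dS) = 2^(−d) F(r):

```python
from fractions import Fraction
from math import ceil

S, M = 16, 64  # small illustrative sizes; both positive integers, S < M

# Assumed concrete form of the stored range of (23), for illustration only:
# F(r) = ceil(M * 2^(-r/S)) for r = 1, ..., S, kept in a LUT of S entries.
lut = [ceil(M * 2 ** (-r / S)) for r in range(1, S + 1)]

def F(s):
    """Extend to any positive integer s = r + d*S via F(r + dS) = 2^(-d) F(r),
    i.e., a d-bit right shift of the stored entry."""
    d, r = divmod(s - 1, S)
    return Fraction(lut[r], 2 ** d)

assert all(1 <= entry < M for entry in lut)  # each entry fits in log2(M) bits
assert F(5 + S) == F(5) / 2                  # the shift property behind (23)
```

Only S mantissa values are ever stored; everything beyond s = S is exponent bookkeeping, mirroring the mantissa-exponent representation used throughout.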

Consider Algorithm 1 realized in the binary domain. Here, 0 ≤ l_n < 1 and 0 ≤ w_n < 1 for n = 0, 1, ..., N, and hence their integer parts are always 0. Thus, we neglect their integer parts and focus on their fractional parts. We approximate the fractional part of w_n by F(s_n), where s_0 = 1. At step n of Algorithm 1, there are two possible choices for the width w_(n+1), corresponding to symbols 1 and 0, resp., see lines 6 and 11. We approximate these choices by first defining

Then we observe from (23) that F(r + dS) = 2^(−d) F(r), which implies F(r − S log K) = K F(r) for integer S log K. Then, for x_(n+1) = 1, 0, resp., (24) holds for n = 1, 2, ..., N. In the context of Algorithm 1, (24) corresponds to the width updates in lines 6 and 11. Updating w_n to w_(n+1) using multiplications is equivalent to updating s_n to s_(n+1) using subtractions.

However, we observe that (24) requires the computation of a log function in full precision, which is typically no less complex than a multiplication. To circumvent this complexity, we define two logarithmic functions, Lg+(n) and Lg−(n), in (26)-(27), for n = 1, 2, ..., N. This computation can be realized offline. The resulting functions can be stored in two LUTs, each having N entries. Examples of Lg+(n) and Lg−(n) are shown in Fig. 10.
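The role of the two logarithmic LUTs can be sketched as follows. The exact definitions (26)-(27) are not reproduced here, so the rounding directions below are our illustrative choice: quantized versions of S·log2(n), rounded so that an approximated multiplication or division can only widen a subinterval (overlaps are tolerable in DM, gaps are not; see Sec. III-B):

```python
from math import ceil, floor, log2

S, N = 512, 1024  # F-LUT size and maximum input length

# Assumed concrete forms (illustrative): Lg_plus for multiplications,
# Lg_minus for divisions; index 0 is unused padding.
Lg_plus = [None] + [ceil(S * log2(n)) for n in range(1, N + 1)]
Lg_minus = [None] + [floor(S * log2(n)) for n in range(1, N + 1)]

def next_s(s, n0, n1, symbol):
    """Width update in the log domain: w *= n_a / (n0 + n1) becomes one
    table-lookup addition and one table-lookup subtraction on the state s."""
    n_a = n1 if symbol == 1 else n0
    return s + Lg_minus[n0 + n1] - Lg_plus[n_a]

# Both tables stay within one quantization step of S * log2(n), so each entry
# needs roughly log2(S) + log2(log2(N)) bits (cf. Sec. IV-B):
assert all(0 <= Lg_plus[n] - S * log2(n) < 1 for n in range(1, N + 1))
assert all(0 <= S * log2(n) - Lg_minus[n] < 1 for n in range(1, N + 1))
```

No multiplication or division survives in `next_s`; the entire CCDM probability update has become integer addition and subtraction on s.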

We approximate multiplications and divisions involving F(s) and an integer n by (28) and (29), resp. The relations in (28)-(29) are approximate due to the way Lg+ and Lg− are defined in (26)-(27), and due to the ceiling operation in (23). Finally, combining (25) and (28)-(29) gives the updates (30) for symbols 1 and 0, resp. This way, the FP-CCDM in Algorithm 1 is converted into the Log-CCDM in Algorithm 2 by (1) representing the subinterval width w_(n+1) by F(s_(n+1)), and (2) keeping track of this width through s_(n+1), which is updated via (30) in lines 6 and 11 (highlighted in blue). The rest of the two algorithms is identical.

B. Representability of Data Sequences

When AC is used for compression, if the arithmetic operations are implemented with finite precision (or in any approximate way), the decodability of the code becomes questionable. Decodability is ensured only when the subdivisions do not create overlapping subintervals, which would cause two different source sequences (the red circles in Fig. 8) to be mapped to the same code sequence (the blue circle). This makes unique decodability during decompression impossible.

On the other hand, gaps between subintervals only decrease the efficiency of the source code, i.e., increase its rate. When AC is used for DM, which is the dual of source coding, see Sec. I, the dual of decodability is representability. AC creates an invertible mapping from data sequences to CC sequences only if the subintervals do not have gaps in between. Overlaps are allowed. As discussed above via Fig. 8, overlaps cause two amplitude sequences to be mapped to the same data sequence. However, since matching is realized first in the case of DM (at the transmitter), this does not create a problem: some channel sequences are simply never generated. This only decreases the efficiency of the code, i.e., decreases its rate.

With an FP implementation, AC creates subintervals that neither overlap nor have gaps, satisfying the decodability and representability criteria as in Fig. 9. For our Log-CCDM algorithm, the representability, i.e., "no gaps", criterion can be expressed from (30) as in (31).

Lemma 1. The condition in (31) is satisfied for F(s), Lg+(n), and Lg−(n) defined in (23), (26), and (27), resp.

Proof. First, observe that the inequalities (32)-(33) are satisfied from (26)-(27) by definition for any positive integer s. Then, for the first case in (30), we can write (34).

Similarly, for the second case in (30), we can write (35).

Combined, (34) and (35) imply that (31) is satisfied.

A. Input Length of the Matcher

The final width for any input in Algorithm 2 is F(s_N), where s_N is the same for every input, since the composition fixes the multiset of updates applied to s.

Thus, all CC sequences are represented with an interval of identical width. This allows us to represent each interval with a fixed length k-bit index as discussed in Sec. II.

There are F(s_0)/F(s_N) CC sequences, i.e., |C_cc| = F(s_0)/F(s_N). Then, as in Sec. II, k = ⌊log(F(s_0)/F(s_N))⌋.

The parameters γ and k can be computed online.

B. Storage Complexity and Arithmetic Precision

The storage complexity of Log-CCDM is S log M + 2N(log S + log log N) bits. To store F(s) in (23), we need a LUT with S entries, each stored with log M bits as discussed in Sec. III-A. Thus, the storage requirement of this LUT is S log M bits. From (23) and (28)-(29), it can be shown that both Lg+(n) and Lg−(n) are approximately equal to S log n.

Therefore, the entries in the corresponding LUTs can be stored with approximately log S + log log N bits assuming S and N are integer powers of two. Thus, the total storage requirement of these two LUTs is 2N(log S + log log N) bits.

In line 5 of Algorithm 2 (highlighted in red), values of F(·) are recursively subtracted from the input index. These subtractions require an arithmetic precision of log M + log N bits, assuming again that N is an integer power of two. The proof of this consists of a lemma and a theorem.

Lemma 2. If l_n < F(s_n), then Algorithm 2 guarantees that l_(n+1) < F(s_(n+1)). This implies that if l_0 < F(s_0), then all l_n for n = 0, 1, ..., N satisfy l_n < F(s_n).

Proof. We first define s_(n,0) and s_(n,1).

Note that s_(n,0) and s_(n,1) are the candidates for s_(n+1) in lines 11 and 6 of Algorithm 2, resp. We first observe that

Then, there are two options at the nth step of Algorithm 2:

Since l_0 < 1 ≤ F(s_0 = 1) due to (23) and Algorithm 2, we conclude that l_n < F(s_n) for n = 0, 1, ..., N.

Theorem 1. The subtraction l_n − F(s_(n,0)) in line 5 of Algorithm 2 requires an arithmetic precision of at most log M + log N bits, assuming N is an integer power of two.

Proof. First, observe from (34) that F(s_(n,0)) can be at most N times smaller than F(s_n). Thus, since l_n < F(s_n) from Lemma 2, the subtrahend of the subtraction l_n − F(s_(n,0)) in line 5 of Algorithm 2 can be at most N times smaller than its minuend. Since each entry of the LUT of F(·) is log M bits long, this subtraction requires an arithmetic precision of at most log M + log N bits.

Remark 1. Log-CCDM can be implemented with two (log M + log N)-bit shift registers. The k-bit input index (and some trailing zeros if necessary) is gradually loaded into the first register, which stores the minuend. The log M-bit F(·) values are loaded into the second register, which stores the subtrahend. The dematching can be implemented with the same principle via (log M + log N)-bit additions.

Remark 2. It is not straightforward to evaluate the complexity of FP-CCDM. In this work, we have transformed FP-CCDM, which requires no storage but is based on multiplications and divisions, into Log-CCDM, which requires a small amount of storage but is based on additions and subtractions.

Then the selection among these two depends on how costly the multiplications and divisions are in comparison to a certain amount of storage for a given hardware.

C. Rate Loss

There are two sources of inaccuracy in Log-CCDM. The first is due to (23), which leads to an imprecise representation of the log of the interval width. The second is due to (26) and (27), and then to (32) and (33), which lead to imprecise multiplications (with n_0 or n_1) and divisions (with n_0 + n_1). These two inaccuracies lead to overlapping intervals as discussed in Sec. III-B. Thus, some CC sequences that would be generated by FP-CCDM are never produced by Log-CCDM, which decreases the rate k/N, causing a rate loss.

In Fig. 11, we show k/N as a function of N for a composition C = [0.75N, 0.25N] for FP-CCDM and Log-CCDM with different (S, M) pairs. First, we see that FP-CCDM is asymptotically optimum for large N, as shown in [10]. This is in the sense that it has a matching rate k_max/N converging to the (binary) entropy H(0.75) = 0.8113 bits of the resulting distribution, see the black curve in Fig. 11.

However, Log-CCDM is suboptimal, i.e., it does not converge to the entropy.

By increasing S and M, this rate can be made closer to the entropy, while the required storage increases, see Sec. IV-B. Next, we see that for increasing S and M, the matching rate of Log-CCDM gets closer to k_max/N. When using LUTs with S = 512 and N ≤ 1024 entries (the red circles), the gap to k_max/N is around 0.007 bit/sym. The gap increases to 0.03 bit/sym for LUTs with S = 128 and N ≤ 1024 entries (the blue curve with squares). Thus, there is a trade-off between the rate loss with respect to k_max/N and the table sizes. Assume we operate at N = 1024, S = 512, M = 1024, and k/N = 0.7988 < 0.8063 = k_max/N bit/sym (the filled red circle).

Then the required storage for the three LUTs is S log M + 2N(log S + log log N) = 3.79 kB (bottom-right inset figure), and the required arithmetic precision is log M + log N = 20 bits (top-left inset figure), see Sec. IV-B. Last, we see that for S = 256, the required storage is virtually identical for M = 512 and 1024 (bottom-right inset figure). However, the rate loss of (S = 256, M = 1024) (dashed green) is smaller. For instance, the same rate loss as (S = 512, M = 1024, N = 1024) (filled red circle) is obtained via (S = 256, M = 1024, N = 1024) (dashed green), requiring 3.22 kB of storage instead of 3.79. This demonstrates that M plays the more important role in determining the rate loss. For a given M, a smaller S can be chosen to decrease the storage complexity, since the rate loss is relatively insensitive to changes in S.
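The storage and precision figures quoted above can be reproduced with a few lines; decimal kilobytes (8000 bits per kB) are assumed, and the function name is ours:

```python
from math import log2

def log_ccdm_cost(S, M, N):
    """Storage (bits) of the three LUTs and arithmetic precision (bits),
    per Sec. IV-B: S*log2(M) + 2N*(log2(S) + log2(log2(N))), and
    log2(M) + log2(N)."""
    storage_bits = S * log2(M) + 2 * N * (log2(S) + log2(log2(N)))
    precision_bits = log2(M) + log2(N)
    return storage_bits, precision_bits

bits, prec = log_ccdm_cost(S=512, M=1024, N=1024)
assert abs(bits / 8000 - 3.79) < 0.01  # ~3.79 kB of LUT storage
assert prec == 20                      # 20-bit additions/subtractions
```

Rerunning with S = 256 gives roughly 3.22 kB, matching the dashed-green operating point discussed above.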

V. CONCLUSION

In this disclosure, we have introduced Log-CCDM: an approximate algorithm to realize arithmetic-coding-based constant composition distribution matching. Log-CCDM operates in the logarithmic domain and is based on three simple lookup tables (LUTs). Thanks to these LUTs, it is possible to realize CCDM (1) with an arithmetic precision that grows logarithmically with the input length (instead of linearly), and (2) using only additions and subtractions (instead of multiplications and divisions). This decreases the computational complexity but increases the storage complexity (to a few kilobytes) due to the LUTs. The performance of Log-CCDM in terms of rate depends on the sizes of these LUTs.