The University of Western Australia
School of Electrical, Electronic and Computer Engineering
Crawley, WA 6009
On A Posteriori Probability Decoding of
Linear Block Codes over Discrete
Channels
Wayne Bradley Griffiths
BCM(Hons), Dip.Mod.Lang.
This thesis is presented for the degree of
Doctor of Philosophy
of
The University of Western Australia.
June 2008
Abstract
One of the facets of the mobile or wireless environment is that errors quite often
occur in bursts. Thus, strong codes are required to provide protection against such
errors. This in turn motivates the employment of decoding algorithms which are simple to implement, yet still take the dependence, or memory, of the channel model into account in order to give optimal decoding estimates.
Furthermore, such algorithms should be able to be applied for a variety of channel
models and signalling alphabets.
The research presented within this thesis describes a number of algorithms which
can be used with linear block codes. Given the received word, these algorithms
determine the symbol which was most likely transmitted, on a symbol-by-symbol
basis. Due to their relative simplicity, a collection of algorithms for memoryless
channels is reported first. This is done to establish the general style and principles
of the overall collection. The concept of matrix diagonalisation may or may not
be applied, resulting in two different types of procedure. Ultimately, it is shown
that the choice between them should be motivated by whether storage space or
computational complexity has the higher priority. As with all other procedures
explained herein, the derivation is first performed for a binary signalling alphabet
and then extended to fields of prime order.
These procedures form the paradigm for algorithms used in conjunction with
finite state channel models, where errors generally occur in bursts. In such cases,
the necessary information is stored in matrices rather than as scalars. Finally,
by analogy with the weight polynomials of a code and its dual as characterised
by the MacWilliams identities, new procedures are developed for particular types
of Gilbert-Elliott channel models. Here, the calculations are derived from three
parameters which profile the occurrence of errors in those models. The decoding
is then carried out using polynomial evaluation rather than matrix multiplication.
Complementing this theory are several examples detailing the steps required to
perform the decoding, as well as a collection of simulation results demonstrating the
practical value of these algorithms.
Acknowledgements
As one can imagine, submitting oneself to the rigours of a higher degree by research
is not an easy task, and it would become even more arduous if one had to face that
task alone. Fortunately, there were a few people and organisations which provided
me with assistance, in different ways, during my period of candidature. I would like
to take this opportunity to thank them.
Firstly, I must thank my principal discipline supervisor Professor Dr.-Ing. Hans-Jürgen Zepernick. If he had not accepted me as one of his students, this thesis
would not exist. He has been my compass from the beginning, always there to
direct me onto the next task. As evidenced by the amount of time and resources he contributed, he is a wealth of knowledge, someone who was supportive when things went badly, and an excellent resource and aid in making my written work more polished.
He was also instrumental in permitting me to study for six months at Blekinge Tekniska Högskola in Ronneby, Sweden. Without this opportunity, I would never
have experienced many different things, nor would I have grown as a person as much
as I believe I have done. Speaking of BTH, I wish to thank all the staff and doctoral
students from Avdelningen för Signalbehandling for accepting yet another Aussie
through their automatic doors.
I am also indebted to my co-supervisor Dr Manora Caldera. She spent countless
hours reading through my work and always provided suggestions to improve the readability of my written output. For that, I thank her. Thanks also to my principal
administrative supervisor Professor Sven Nordholm who ensured all the necessary
paperwork was filled out in a timely fashion and thus allowed me to optimise my
time spent researching.
There are a few organisations to which I must express my gratitude. Firstly, I
thank The University of Western Australia for offering me a scholarship position.
Their monetary assistance allowed me to have a life outside of study. In a similar
vein, I am grateful to the Australian Telecommunications Cooperative Research
Centre for providing additional financial and travel aid for me, as well as organising
a number of conferences at which I was able to gain valuable experience in presenting
my work.
Thank you also to the Western Australian Telecommunications Research Institute for furnishing me with a pleasant research environment over the past four years,
providing monetary assistance when other income sources expired, and supplying
the necessary equipment on which to carry out my simulations. Additionally, I wish
to acknowledge the various fellow students who have been cohabitants of “the office” during my time at WATRI. They have given support when it was needed and
provided a welcome source of joviality in my life.
Finally, I thank my parents Colleen and Raymond, whose boundless support,
love and understanding have sustained me through these years and instilled in me
the determination to succeed at everything to which I set my mind.
Wayne Griffiths
2007.
Table of Contents
Abstract iii
Acknowledgements v
List of Abbreviations xi
List of Common Symbols xiii
List of Figures xix
List of Tables xxiii
1 Introduction 1
1.1 Thesis Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Summary of Major Contributions . . . . . . . . . . . . . . . . . . . . 5
1.3 List of Relevant Publications . . . . . . . . . . . . . . . . . . . . . . . 8
2 Foundations 11
2.1 Communication System Model . . . . . . . . . . . . . . . . . . . . . . 12
2.2 Channel Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.1 Memoryless channels . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.2 Channels with memory . . . . . . . . . . . . . . . . . . . . . . 19
2.3 Error Control Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3.1 Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3.2 Decoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.3.3 Trellis representations of linear block codes . . . . . . . . . . . 45
3 APP Decoding on Discrete Channels without Memory 53
3.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.2 Original Domain Matrix Representations of Linear Block Code Trellises 56
3.2.1 Matrix representation for APP decoding on BSCs . . . . . . . 56
3.2.2 Matrix representation for APP decoding on DMCs . . . . . . 61
3.3 Spectral Domain Matrix Representations of Linear Block Code Trellises 64
3.4 Instructive Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.4.1 Example of decoding in the original domain . . . . . . . . . . 70
3.4.2 Example of decoding in the spectral domain . . . . . . . . . . 72
3.5 Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4 APP Decoding on Binary Channels with Memory 83
4.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.2 Binary APP Decoding . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.2.1 Original domain . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.2.2 Spectral domain . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.3 Complexity Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.3.1 Computational complexity . . . . . . . . . . . . . . . . . . . . 94
4.3.2 Storage requirements . . . . . . . . . . . . . . . . . . . . . . . 96
4.4 Instructive Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.4.1 Example of decoding in the original domain . . . . . . . . . . 98
4.4.2 Example of decoding in the spectral domain . . . . . . . . . . 100
4.5 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.5.1 Description of parameter values in these simulations . . . . . . 103
4.5.2 Observations from the simulations . . . . . . . . . . . . . . . . 103
4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5 APP Decoding on Non-binary Channels with Memory 109
5.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.2 Non-binary APP Decoding . . . . . . . . . . . . . . . . . . . . . . . . 111
5.2.1 Original domain . . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.2.2 Spectral domain . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.3 Properties of the Conditional Spectral Coefficients . . . . . . . . . . . 120
5.4 Complexity Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
5.4.1 Computational complexity . . . . . . . . . . . . . . . . . . . . 124
5.4.2 Storage requirements . . . . . . . . . . . . . . . . . . . . . . . 125
5.5 Instructive Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.5.1 Example of decoding in the original domain . . . . . . . . . . 126
5.5.2 Example of decoding in the spectral domain . . . . . . . . . . 129
5.6 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
5.6.1 Non-binary Hamming codes . . . . . . . . . . . . . . . . . . . 131
5.6.2 The ISBN-10 code . . . . . . . . . . . . . . . . . . . . . . . . 135
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
6 Generalised Weight Polynomials for Binary Restricted GECs 141
6.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.2 Burst-error Characteristics . . . . . . . . . . . . . . . . . . . . . . . . 143
6.3 Derivation of APPs using Generalised Weight Polynomials . . . . . . 145
6.4 Instructive Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.5 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
6.5.1 (16,8) cyclic code . . . . . . . . . . . . . . . . . . . . . . . . . 159
6.5.2 (22,13) Chen code . . . . . . . . . . . . . . . . . . . . . . . . . 160
6.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
7 Generalised Weight Polynomials for Non-binary Restricted GECs 163
7.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
7.2 The Channel Reliability Factor for a Non-binary GEC . . . . . . . . 165
7.3 Derivation of Non-binary APPs using Generalised Weight Polynomials 167
7.4 Instructive Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
7.5 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
7.5.1 (4,2) Hamming code over GF (3) . . . . . . . . . . . . . . . . . 177
7.5.2 (26,22) BCH code over GF (3) . . . . . . . . . . . . . . . . . . 178
7.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
8 Conclusion and General Discussion 183
8.1 Summary of Major Findings and Contributions . . . . . . . . . . . . 183
8.2 Future Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
Appendices 191
Appendix A Proof of (3.44) 193
Appendix B Proof of Lemma 5.3.1 195
Bibliography 197
List of Acronyms
APP A posteriori probability
BCJR Bahl, Cocke, Jelinek and Raviv
BER Bit error rate
BSC Binary symmetric channel
DMC Discrete memoryless channel
GEC Gilbert-Elliott channel
GWP Generalised weight polynomial
HMM Hidden Markov model
ISBN International Standard Book Number
LDPC Low-density parity-check
LLR Log likelihood ratio
MAP Maximum a posteriori probability
ML Maximum likelihood
SER Symbol error rate
List of Common Symbols
arg(·) Argument function
bin(·) Function which gives the binary representation in vector form
of its input
c(λ) Characteristic polynomial in terms of an indeterminate λ. For
a p× p matrix K, this equals det(λIp − K)
circ(l1, l2, . . . , lp) Circulant p× p matrix consisting of all p cyclic permutations
of a list of entries l1, l2, . . . lp
d Coset leader used in syndrome decoding
d(C) Hamming distance of a code C
d(u1,u2) Hamming distance between two codewords u1 and u2
dec(·) Function which gives the base 10 scalar representation of its
input
det(K) Determinant of a matrix K
diag{f(i)} Diagonal matrix with the ith diagonal element given by a function f(i)
e All-ones column vector
ei ith standard basis (row) vector
f ∗(D) Monic irreducible polynomial in indeterminate D used in
defining a field of polynomials
gi ith row of generator matrix G for a block code
gj(D) jth polynomial constraint in indeterminate D for a convolutional code
hi ith column of parity check matrix H
i Index within the set of k information symbol positions
i Vector of information symbols to be encoded and transmitted
over a channel
i Vector of decoded information symbols
j Index within the set of n codeword symbol positions
Square root of -1. The imaginary unit
k Number of information symbols per codeword in a code C
logp(·) Base p logarithm
max(·) Maximisation function
n Length of codewords in a code C
p Positive prime integer, order of the Galois field GF (p)
pb Average bit error probability for a binary GEC
ps Average symbol error probability for a non-binary GEC
psi Crossover probability within a GEC for the DMC corresponding to the state si
psisi+1 Crossover probability within a binary general two-state Markov channel for the DMC corresponding to a transition from state si at time instant i to state si+1 at time instant i+1
s Index within the set of pn−k dual codewords
s p-ary vector representation of the decimal number s
si State of a stochastic sequential machine at time instant i
si Partial syndrome at depth i
t Index for the original domain used to refer to the cosets of a
linear block code C
tr(K) Trace of a matrix K
u Transmitted codeword of a code C
ui ith transmitted symbol of a code C
ui Estimate of the ith transmitted symbol ui obtained by decoding
u⊥s sth codeword of a dual code C⊥
u⊥s,j jth symbol of the sth codeword of a dual code C⊥
v Vector of received symbols
vi ith received symbol of received word v
vj Invert or logical NOT of the jth received bit vj
vecp(·) Function which outputs the base p representation in vector
form of its input
w Complex pth root of unity
wi ith row of the Walsh-Hadamard transformation matrix
x Average fade to connection time ratio for a GEC
y Burst factor for a GEC
z Channel reliability factor for a GEC
Aj Number of words of weight j in a code C
A(z) Weight polynomial for a code C
B ‘Bad’ state of a GEC
B Branch set of a trellis
Bj Number of words of weight j in a dual code C⊥
B(ui)(x, y, z) Generalised weight polynomial in terms of burst-error characteristics x, y and z for the ith transmitted symbol ui
B(z) Weight polynomial for a dual code C⊥
C Linear block code
C Field of complex numbers
C Capacity of a channel
C⊥ Dual of the code C
D Stochastic sequential machine
D Stochastic automaton for a channel model with memory
D Stochastic state transition matrix for a channel with memory
D0 Matrix probability corresponding to correct transmission over
a channel with memory
D1 Matrix probability for binary codes corresponding to incorrect
transmission over a channel with memory
Dǫ Matrix probability for non-binary codes corresponding to incorrect transmission over a channel with memory
Dfi Matrix probability for non-binary codes corresponding to receiving the symbol which is fi greater (mod p) than the transmitted symbol ui over a channel with memory
E Set of nodes in a trellis at depth zero
F Set of nodes in a trellis at depth n
G ‘Good’ state of a GEC
G Generator matrix for a linear block code C
GF (p) Galois field of order p
H Parity check matrix for a linear block code C
Ipn−k Identity matrix of order pn−k
KH Hermitian of a matrix K
KT Transpose of a matrix K
K∗ Complex conjugate of a matrix K
K−1 Inverse of a matrix K
[K](l) lth principal leading submatrix of a matrix K
M Set of matrix probabilities using the spectral domain for a restricted GEC. Equal to {D, δ}
Mhj(uj) Matrix representation of trellis section corresponding to column hj for transmitted symbol uj
Mh(u) Elementary trellis matrix
N Set of non-negative integers
N Node set of a trellis
Nt Set of nodes of a trellis at depth t which can be reached from
the set E of nodes at depth zero
O Big-O notation for the asymptotic upper bound on complexity
OC(u) Orthogonal complement of a vector u
P State transition probability from ‘good’ state G to ‘bad’ state
B in a GEC
Pt Coset probability of the coset Vt in syndrome decoding
Pt(ui|v) APP for transmitted symbol ui given received word v when
encoding using the tth coset
P(ui|v) Vector of APPs for transmitted symbol ui given received word
v
Q State transition probability from ‘bad’ state B to ‘good’ state
G in a GEC
Q(ui|v) Vector of conditional spectral coefficients for ith transmitted
symbol ui and received word v
Qs(ui|v) sth conditional spectral coefficient for ith transmitted symbol
ui and received word v
Qs(ui|v) sth conditional spectral coefficient matrix for ith transmitted
symbol ui and received word v
Q(ui)s(x, y, z) sth conditional spectral polynomial for the ith transmitted symbol ui, in terms of burst-error characteristics x, y and z
R Code rate of a code C. Equal to k/n
S Number of states in a stochastic sequential machine
S Set of states in a stochastic sequential machine
U Alphabet of symbols which can be transmitted on the channel
Uhj Weighted trellis section matrix for column hj of H regardless of transmitted symbol
Uhj(uj) Weighted trellis section matrix for column hj of H and transmitted symbol uj
UH(ui) Matrix representation for entire weighted trellis described by
parity check matrix H for ith transmitted symbol ui
V Alphabet of symbols which can be received on the channel
Vt tth coset of a code C
W Spectrum of eigenvalues of the elementary trellis matrices
Wpn−k Walsh-Hadamard transformation matrix of order pn−k
Y Storage requirement for implementation of an algorithm in
terms of the size in memory of a single real number
Z+ Set of positive integers
Z[x, y, z] Ring of polynomials with integer coefficients and in indeterminates x, y and z
δ Difference matrix for a restricted GEC
δa,b Dirac-delta function, equal to 1 if a = b and 0 otherwise
δorig Homomorphism resulting in a circulant matrix representation
δspec Homomorphism resulting in a diagonal matrix representation
ǫ Crossover probability for a BSC or DMC
ι0 First row of the Walsh-Hadamard matrix Wpn−k
σ0 Stationary state distribution for a stochastic sequential machine
σsi Stationary state probability for state si
τ0 Vector post-multiplied by the representation of a decoding trellis in the original domain in order to select paths commencing at the 0th node
∆ Difference matrix for a channel with memory
∆0,∆1 Difference weights on a weighted diagonal trellis for decoding
over a BSC
Λhj(uj) Spectral matrix representation of trellis section corresponding
to column hj for transmitted symbol uj
Λh(u) Elementary spectral matrix
Θhj Weighted spectral matrix for column hj of H regardless of transmitted symbol
Θhj(uj) Weighted spectral matrix for column hj of H and transmitted
symbol uj
ΘH(ui) Spectral matrix representation for entire weighted diagonal
trellis described by parity check matrix H for ith transmitted
symbol ui
Θs,i(ui) Diagonal weightings used in the weighted spectral matrix
ΘH(ui) for ith transmitted symbol ui
List of Figures
2.1 Block diagram of the considered digital communication system. . . . 13
2.2 Transition probability diagram of a BSC. . . . . . . . . . . . . . . . . 16
2.3 Transition probability diagram of the p-ary symmetric DMC model: (a) standard model, (b) alternative model. . . . . . . . . . . . . . . . . 18
2.4 Structure of the binary general two-state Markov model. . . . . . . . 23
2.5 Structure of the binary GEC model. . . . . . . . . . . . . . . . . . . . 25
2.6 Structure of the standard non-binary GEC model. . . . . . . . . . . . 26
2.7 Structure of the binary Gilbert channel model. . . . . . . . . . . . . . 29
2.8 Structure of the non-binary Gilbert channel model using the standard p-ary DMC model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.9 Structure of the binary restricted GEC model. . . . . . . . . . . . . . 32
2.10 Structure of the standard non-binary restricted GEC model. . . . . . 33
2.11 Structure of the alternative non-binary restricted GEC model. . . . . 34
2.12 A (2,1,3) convolutional encoder constructed with generator polynomials g1(D) = D2+D+1 and g2(D) = D2+1. . . . . . . . . . . . . . . 38
2.13 A basic trellis with eight nodes and seven branches. . . . . . . . . . . 46
2.14 Trellis representations of the binary (4,2) linear block code C: (a) standard syndrome trellis, (b) minimal trellis (Dashed: si+1 = si, Solid: si+1 = si ⊕ hTi+1). . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.15 Trellis representations of the binary (4,2) linear block code C suitable for computing APPs: (a) P (u2 = 0|v), (b) P (u2 = 1|v) (Dashed: si+1 = si, Solid: si+1 = si ⊕ hTi+1). . . . . . . . . . . . . . . . . . . . 51
3.1 Original domain APP decoding trellises for the binary (4,2) linear block code C which allow for computation of the conditional probabilities (a) P (u2 = 0|v) and (b) P (u2 = 1|v). (Dashed: sj+1 = sj, Solid: sj+1 = sj ⊕ hTj+1.) . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.2 Illustration of the relationship between the codewords u⊥s, s = 0, 1, 2, 3, of the dual code C⊥ and the spectral section matrices Λhj(uj), j = 1, 2, 3, 4, uj = 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.3 Weighted diagonal trellises of the binary (4, 2) linear block code C used for computing the conditional spectral coefficients (a) Qs(u2 = 0|v) and (b) Qs(u2 = 1|v); s = 0, 1, 2, 3. . . . . . . . . . . . . . . . . . 78
3.4 BER performance of some binary block codes on a BSC. . . . . . . . 80
3.5 SER performance of some block codes over GF (3) on a ternary DMC. 81
4.1 Original domain weighted APP decoding trellises for the binary (4,2) linear block code used to compute (a) P (u2 = 0|v) and (b) P (u2 = 1|v). (Dashed: sj+1 = sj, Solid: sj+1 = sj ⊕ hTj+1.) . . . . . . . . . . . . . . 99
4.2 Weighted diagonal trellises of the binary (4, 2) linear block code used for computing the spectral coefficients (a) Qs(u2 = 0 | v) and (b) Qs(u2 = 1 | v); s = 0, 1, 2, 3. . . . . . . . . . . . . . . . . . . . . . . 102
4.3 Performance of the (7,4) Hamming code on a GEC with Q=0.01 and (a) pB = 0.1, (b) pB = 0.5. . . . . . . . . . . . . . . . . . . . . . . . . 104
4.4 Performance of the (7,4) Hamming code on a GEC with Q=0.3 and (a) pB = 0.1, (b) pB = 0.5. . . . . . . . . . . . . . . . . . . . . . . . . 105
5.1 Weighted diagonal trellis of the (4, 2) linear block code C over GF (3) used to compute spectral coefficients Qs(u2 = 0|v); s = 0, 1, . . . , 8. . . 132
5.2 Weighted diagonal trellis of the (4, 2) linear block code C over GF (3) used to compute spectral coefficients Qs(u2 = 1|v); s = 0, 1, . . . , 8. . . 133
5.3 Weighted diagonal trellis of the (4, 2) linear block code C over GF (3) used to compute spectral coefficients Qs(u2 = 2|v); s = 0, 1, . . . , 8. . . 134
5.4 Performance of the (4,2) Hamming code over GF (3) on a GEC with state transition probabilities P=0.05 and Q=0.2. . . . . . . . . . . . . 136
5.5 Performance of the (6,4) Hamming code over GF (5) on a GEC with state transition probabilities P=0.05 and Q=0.2. . . . . . . . . . . . . 136
5.6 Performance of the (8,6) Hamming code over GF (7) on a GEC with state transition probabilities P=0.05 and Q=0.2. . . . . . . . . . . . . 137
5.7 Performance of the ISBN-10 code on a GEC with state transition probabilities P=0.05 and Q=0.2. . . . . . . . . . . . . . . . . . . . . . 137
6.1 Weighted diagonal trellises of the binary (4, 2) linear block code C used to compute spectral polynomials (a) Q(u2=0)s(x, y, z) and (b) Q(u2=1)s(x, y, z); s = 0, 1, 2, 3 for a binary restricted GEC. . . . . . . 158
6.2 Performance of the (16,8) block code on a binary restricted GEC and pb = 1%. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
6.3 Performance of the (22,13) Chen code on a binary restricted GEC with average fade to connection time ratio x = 0.004. . . . . . . . . . 161
7.1 The relationship between the channel reliability function z and the mean SER, ps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
7.2 Weighted diagonal trellis of the (4, 2) linear block code C over GF (3) used to compute conditional spectral polynomials Q(u2=0)s(x, y, z); s = 0, 1, . . . , 8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
7.3 Weighted diagonal trellis of the (4, 2) linear block code C over GF (3) used to compute conditional spectral polynomials Q(u2=1)s(x, y, z); s = 0, 1, . . . , 8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
7.4 Weighted diagonal trellis of the (4, 2) linear block code C over GF (3) used to compute conditional spectral polynomials Q(u2=2)s(x, y, z); s = 0, 1, . . . , 8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
7.5 Performance of the (4,2) Hamming code over GF (3) on restricted GECs with pB = 1/3, ps = 1%. . . . . . . . . . . . . . . . . . . . . . . 179
7.6 Performance of the (26,22) BCH code over GF (3) on restricted GECs with pB = 1/3, ps = 1%. . . . . . . . . . . . . . . . . . . . . . . . . . 179
List of Tables
2.1 Addition and multiplication table of the Galois field GF (2). . . . . . 37
8.1 Elements of GF (32) and their ternary vector images. . . . . . . . . . 187
Chapter 1
Introduction
Human beings have always been searching for ways in which their lives can be made
easier and/or more efficient. For example, the development of wireless communications through the 20th century and beyond has made life more convenient, since
it is becoming increasingly possible to communicate with anyone or any machine,
anywhere, and at any time. There no longer needs to be a wired connection to a
network in order to communicate.
As attested to by [1], the demand for wireless communications has undergone
exponential growth since the 1990s. Not only are people talking wirelessly across
the globe in increasing numbers, but there has also been a boom in wireless internet
usage. It must be taken into account, however, that wireless communications are usually less reliable than wired channels. Imagine, for example, the following
common scenario. A mobile telephone user wishes to receive a call from another
person. Whether that other person is using a landline or mobile phone is irrelevant.
The signal will need to be transmitted from a base station to the mobile handset.
Even though base stations are often in raised locations, there will not necessarily
be a line of sight path for the signal. There will however usually be a number of
paths between the transmitter and receiver which result from reflections of the signal
off obstacles which occupy the space between them. The effects of these multiple
paths include, but are not limited to, a time delay between different versions of
the received signal, and differences in phase or amplitude. The signals received by
the handset can interfere with each other, sometimes constructively and sometimes
destructively. All of this is further complicated by the fact that the receiver will
usually be moving in space. There are times when the signal will be good, such
that its strength is above a certain threshold and not many errors will occur. By
contrast, there will be times when the signal is not good and many errors will occur.
On these occasions, the signal is said to be experiencing fading.
The wireless communication environment is also extremely complex. Probabilities of correct reception are highly time- and space-dependent. In order to simplify
concepts and testing, channel models are used. One of the simplest models for this,
as evidenced by [2], is a Gilbert-Elliott channel (GEC) model [3]. A GEC model
consists of two distinct states, and the channel occupies exactly one of them at any time. Transitions between states occur according to a random process. One state
corresponds to times when there are relatively few errors, and the other corresponds
to times of relatively many errors. It is also important to note that it is difficult to
determine exactly how many states are required to model the mobile environment
and the figure may vary with the actual data [4].
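The behaviour of such a two-state model can be sketched in a few lines of code. The following is an illustrative simulation only; the parameter names (p_gb for the good-to-bad transition probability, p_bg for bad-to-good, and the per-state crossover probabilities pe_good and pe_bad) are chosen here for readability and are not taken from the thesis.

```python
import random

def simulate_gec(n_bits, p_gb, p_bg, pe_good, pe_bad, seed=1):
    """Simulate the error pattern of a two-state Gilbert-Elliott channel.

    p_gb: probability of moving from the 'good' to the 'bad' state,
    p_bg: probability of moving from the 'bad' to the 'good' state,
    pe_good, pe_bad: crossover (bit error) probabilities in each state.
    Returns a list of 0s (correct bit) and 1s (bit error).
    """
    rng = random.Random(seed)
    state = 'G'
    errors = []
    for _ in range(n_bits):
        pe = pe_good if state == 'G' else pe_bad
        errors.append(1 if rng.random() < pe else 0)
        # Random state transition before the next bit
        if state == 'G':
            state = 'B' if rng.random() < p_gb else 'G'
        else:
            state = 'G' if rng.random() < p_bg else 'B'
    return errors

# A bursty channel: rare but persistent 'bad' periods with many errors
pattern = simulate_gec(10_000, p_gb=0.01, p_bg=0.3, pe_good=0.001, pe_bad=0.5)
print(sum(pattern) / len(pattern))  # average bit error rate
```

With these illustrative values the channel spends most of its time in the good state, yet almost all errors cluster inside the short bad periods, which is exactly the bursty behaviour described above.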
One method of protecting data from channel errors, or rather, retrieving the
data after such errors have occurred, is to employ error control coding. Redundancy
is used to assist in determining the transmitted symbols. There is also a decision
to be made over which signalling alphabet to use. The traditional approach is to
use binary symbols, which permit only yes-or-no decisions. If a higher resolution of
decisions is required, a logical choice is to use the symbols from a non-binary field.
In particular, ternary is looked upon the most favourably in terms of economy of
information representation [5]. However, as the amount of information contained in each symbol increases, so does the amount one stands to lose if that symbol is corrupted during transmission. Extension fields, which have an order given by a prime
raised to a power of two or more, are also an option. Here, several symbols may be
used to represent a single element of the extension field.
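As a concrete illustration of this vector representation, the maps written vecp(·) and dec(·) in the list of symbols convert between an integer and its base-p digit vector. The sketch below is a straightforward interpretation of those definitions, not code from the thesis.

```python
def vecp(x, p, m):
    """Base-p representation of integer x as a length-m vector
    (most significant digit first)."""
    digits = []
    for _ in range(m):
        digits.append(x % p)
        x //= p
    return digits[::-1]

def dec(v, p):
    """Inverse map: base-10 scalar value of a p-ary vector."""
    x = 0
    for d in v:
        x = x * p + d
    return x

# Each element of GF(3^2) can be labelled by a pair of ternary symbols:
print(vecp(5, 3, 2))   # [1, 2]
print(dec([1, 2], 3))  # 5
```

In this way a single element of an extension field of order p^m is carried by m symbols from GF(p), so corrupting one of those symbols damages only part of the element's representation.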
If error control coding is used, then a method is required for obtaining an estimate of the transmitted data, in other words, decoding. For the past 60
years, there has been a search for codes which will meet the so-called Shannon
limit [6]. With recent advances such as the discovery of turbo codes [7], there
is renewed interest in decoding algorithms. In particular, a posteriori probability
(APP) decoding algorithms make use of soft information, resulting in performance which approaches the Shannon limit. Considering the above discussion points, there is
thus a need to investigate APP decoding algorithms which can be easily implemented
with a variety of codes over a field of any size, and on a variety of channels described
by a finite number of states.
Several APP decoding algorithms have been developed over the years assuming
discrete memoryless channels (DMCs) or perfectly interleaved flat-fading channels
in the presence of additive white Gaussian noise. The first major advance in the development of favourable decoding algorithms was the trellis-based decoding scheme
by Bahl, Cocke, Jelinek and Raviv [8] referred to as the BCJR algorithm. Other
research in the area of APP decoding has focussed on reducing the computational
complexity of the decoding to a single-sweep algorithm [9,10], use of general input-
output hidden Markov models [11], or on exploiting features of the code itself [12,13],
or the dual code [14]. There has been increased recent interest in non-binary computing [15], promoting the use of codes over such fields [9, 16, 17] and over rings
in [18]. However, the implementation of APP decoding algorithms can often be too
complex in practice. This is especially true for codes over large fields because more
probabilities need to be calculated. It is also interesting to note the most recent research on decoding of general low-density parity-check (LDPC) codes over Abelian
groups [19] and non-binary LDPC codes over prime and extension fields [20]. This
recent research is motivated by the observation that the error performance of LDPC
codes can be improved for moderate code length by increasing the order of the
associated group or field.
It is also noted that due to the increasing demand for efficient utilisation of
the available bandwidth, especially in wireless communications and mobile radio
systems, higher rate codes are becoming more desirable, which promotes the use
of block codes. Although there has been some research into turbo decoding of
convolutional codes over a Markov channel [21,22] and a Gilbert-Elliott channel (GEC) [23], and some recent
work on the analysis of LDPC codes for the GEC [24], little can be found on APP
decoding of linear block codes over a GEC.
This lack of APP decoding methods for linear block codes over channels with
memory has motivated the research presented in this thesis. In particular, APP de-
coding algorithms for binary and non-binary linear block codes over discrete channels
with and without memory are developed. The proposed algorithms can be classified
into the field of single-sweep APP decoding techniques that use matrix multiplica-
tions and exploit the concepts of ‘dual APP’ decoding [9, 10, 14]. By formulating
suitable matrix representations for the different channel-decoder cascades, the tools
of linear algebra are used to develop APP decoding algorithms in an ‘original’ do-
main and a ‘spectral’ domain. The relationship between these two domains can
be formulated by a similarity transformation. In this way, it is possible to develop
efficient APP decoding algorithms for linear block codes on discrete memoryless
channels as well as on channels with memory that can be modelled by stochastic
sequential machines. As far as channels with memory are concerned, the focus of
this thesis will be the GEC.
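The relationship between the 'original' and 'spectral' domains via a similarity transformation can be illustrated with a small numerical sketch. The matrix below is an arbitrary toy example, not one of the channel-decoder matrices derived in later chapters; NumPy is assumed to be available.

```python
import numpy as np

# Arbitrary diagonalisable matrix standing in for a trellis-section matrix.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigendecomposition: A = P D P^{-1}, where D is diagonal.
eigvals, P = np.linalg.eig(A)
D = np.diag(eigvals)

# Working in the 'spectral' domain: repeated products of A reduce to
# repeated products of the diagonal matrix D.
A_cubed_direct = np.linalg.matrix_power(A, 3)
A_cubed_spectral = P @ np.linalg.matrix_power(D, 3) @ np.linalg.inv(P)

assert np.allclose(A_cubed_direct, A_cubed_spectral)
```

Products of diagonal matrices cost far less than products of full matrices, which is the computational motivation for working in the spectral domain.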
By fixing the crossover probability in the ‘bad’ state of the GEC such that for
a given transmitted symbol, all symbols are equally likely to be received, an APP
decoding decision can efficiently be reached by evaluating trivariate polynomials.
It should also be stressed that this particular class of GEC is described by three
variables, namely, the average fade to connection time ratio, the burst factor, and
the channel reliability factor. As these variables can be deduced from error sequence
measurements, relevance to practical digital communication scenarios is readily pro-
vided. This contribution of the thesis is aesthetically pleasing as it reveals similarities
to the way that the well-known MacWilliams identities [25] use polynomials to de-
scribe the relationship between the weight distribution of a code and its dual. In this
sense, a relationship between the APPs of the original domain and the coefficients
of the spectral domain is presented. By analogy with the MacWilliams identities,
the polynomial expressions derived in this thesis, for binary and non-binary codes
respectively, are termed generalised weight polynomials.
It should be noted, however, that the thesis is not intended as a database of
a set of possible codes and their corresponding error correction performances over
different channels. The purpose of the thesis is instead to demonstrate the means
by which one could determine such capabilities, by reporting the necessary theory
and developing the analytical framework.
1.1 Thesis Structure
It is intended that the chapters of this thesis be read in the order of their presenta-
tion, with each one building on information from its predecessors. The thesis begins
with some of the basic concepts about channel models and error control coding.
Then, the middle part of the thesis deals with APP decoding algorithms for mem-
oryless channels and some standard channels with memory. In the later chapters,
polynomial-based APP decoding algorithms are reported for a particular type of
finite state channel. Finally, there is a brief discussion on the future
directions in which this research could progress.
The following is a breakdown of the main content of each chapter.
Chapter 2 presents the necessary background material to ensure easy under-
standing of the concepts concerning channel models, coding and decoding which
occur later in the thesis. Specifically, it covers symmetric DMC models, as well
as channels with memory constructed using a hidden Markov model (HMM). In
addition, the basics of coding theory are covered, beginning with finite field arith-
metic. This progresses through explanations of convolutional and block encoding,
including some required information on dual codes. Some decoding strategies are
also discussed, including syndrome decoding, sequence estimation decoding, symbol
estimation decoding and the corresponding APP decoding formulations. All of this
leads up to the final section on trellis decoding, which provides a foundation for the
decoding procedures presented in later chapters.
The first of the APP decoding algorithms are presented in Chapter 3. They are
designed to function with memoryless channels. Initially, the matrix representation
of the decoding trellis is constructed, based entirely on the structure of the code.
The matrices must then be weighted by the probabilities of transmission error or
non-error which form part of the channel model. Memoryless channels may be
viewed as degenerate examples of channels with memory having one state rather
than multiple states, and so these weightings are matrices of dimension one, i.e., scalars. It
is possible to perform the decoding using the representation of the original trellis,
or alternatively to diagonalise the matrices representing the code and work in the
spectral domain.
The adaptation of the original and spectral domain algorithms for use with
channels described by a stochastic sequential machine, such as a GEC, is covered
in the subsequent two chapters. To be specific, Chapter 4 reports the procedure
as applied to binary codes, while Chapter 5 expands the idea to include non-
binary fields and analyses the conditional spectral coefficients of the spectral domain.
The computational complexity and storage requirements of these procedures are
examined in both chapters.
Specialisations to the restricted GEC models are the focus of Chapters 6 and
7. Such models contain two states, one of which has the property that for a given
transmitted symbol, each symbol of the signalling alphabet is equally likely to be
received. Polynomial representations of the conditional spectral coefficients from
Chapters 4 and 5 for this specific channel type are derived. Instead of the basic error
and state transition probabilities which form the matrix elements that are used in
the procedures of the preceding chapters, the three variables of these polynomial
expressions reflect the nature of the error bursts on the channel. The paradigm of
generalised weight polynomials for binary and non-binary codes is also derived in
Chapters 6 and 7, respectively.
Finally, Chapter 8 summarises the contributions of the thesis and explores some
of the areas in which future research related to this topic could be undertaken.
1.2 Summary of Major Contributions
A list of the significant items contained within the body of this thesis which represent
additions to scholarship is given below. These items have each been categorised
into one of four areas, which are channel modelling, APP decoding for discrete
channels without memory, APP decoding for discrete channels with memory, and
APP decoding using generalised weight polynomials for restricted GECs.
Channel modelling
Symmetric discrete channels without memory can be described with the aid of a
parameter called the crossover probability. It was noted early in the research period
that two distinct descriptions of the non-binary varieties of such a class of channel
could be made. In order to emphasise that both viewpoints are equally correct,
much of the research in this thesis is presented using both models. Non-binary
symmetric DMCs are vital to the structure of non-binary restricted GEC models.
It is possible to describe such models using three parameters, one of which is known
as the channel reliability factor. Although the definition of this parameter has
already been established for a GEC [26], its derivation for a non-binary restricted
GEC represents new material. Thus, the achievements of this thesis in the area of
channel modelling are:
• Characterisation of non-binary symmetric DMCs using two different view-
points, depending on how the crossover probability is defined.
• Application of the concept of the channel reliability factor to non-binary re-
stricted GECs.
APP decoding for discrete channels without memory
In order to discuss the proposed APP decoding methods for channels which possess
memory, an important first step is the development of similar procedures for use
when the channel is without memory. Such theory has already been researched [27];
however, the author's consideration of such schemes has revealed a new way
of interpreting the involved matrix operations. Additionally, the implementation of
these methods using computer simulation of the communication system resulted in
the collection of performance data for several binary and non-binary linear block
codes. Therefore, the contributions of the thesis in this area are:
• Use of the trace of the weighted spectral matrix of the full weighted trellis to
deliver the APPs which are required for the decoding procedure.
• Numerical examples obtained by computer simulations which demonstrate the
range of performance analysis options that are supported by the proposed APP
decoding procedures for DMCs.
APP decoding for discrete channels with memory
A key contribution of this thesis is the novel application of the APP
decoding algorithms for memoryless channels to a channel with memory. This is
chiefly handled by adapting the weighting mechanism for the multi-dimensional
weights which are a feature of these channels. This approach carries through in both
the original and spectral domains, thus making possible the development of a suite of
APP decoding algorithms. The usefulness of such a suite is supported by an analysis
of both the computational complexity and the storage requirements of the reported
algorithms, in terms of the parameters of the linear block code and the number
of states of the channel. In particular, the storage benefits of the spectral domain
approach are highlighted. These methods are also considerably more efficient than
some other known approaches, allowing faster recovery of corrupted data. Whilst
doing this research, the author was also able to prove an interesting fact about
the coefficients of the spectral domain, which are counterparts of the APPs of the
original domain. Hence, the main contributions of this thesis to scholarship in the
area of APP decoding for discrete channels with memory are:
• Development of an APP decoding procedure using the original domain for
binary linear block codes over a binary channel described by a stochastic au-
tomaton.
• Adaptation of the above algorithm to non-binary linear block codes over a
non-binary channel described by a stochastic automaton.
• Employing matrix diagonalisation to deduce spectral domain alternatives to
the preceding two algorithms.
• Derivation of a result concerning the conditional spectral coefficients which is
of interest to probability theorists.
• Complexity analysis for the implementation of the above four algorithms, as
well as an examination of their storage requirements. From this investigation,
the benefits of the spectral domain approach for high rate codes become clear.
• Instructive examples which showcase the differences in the presented APP
decoding approaches between channels with and without memory.
• Numerical examples displaying a variety of performance analysis options avail-
able when investigating Hamming codes over GECs in conjunction with these
APP decoding procedures.
APP decoding using generalised weight polynomials for restricted GECs
Another advancement presented in this thesis involves using generalised weight poly-
nomials on restricted GECs. Previously, APP decoding methods oriented to this
specific channel model had not been published, as the closest contribution [26] con-
sidered a different type of decoding. The other main achievement in this area of the
thesis is the extension of this method for use with non-binary codes over non-binary
restricted GECs. Not only is a proof provided here of a conjecture in [26] regarding
the expression of conditional spectral coefficients in terms of the burst-error charac-
teristics of a restricted GEC, but the result is also extended to non-binary restricted
GECs. Thus, the major contributions of this thesis to the area of APP decoding for
restricted GECs are:
• Derivation of the conditional spectral coefficients used in APP decoding on
both binary and non-binary restricted GECs, in terms of the burst-error char-
acteristics of the channel.
• A new proof of the conjecture in [26] regarding the definition of the condi-
tional spectral coefficients in terms of burst-error characteristics for a binary
restricted GEC, as well as the generalisation of this theory for a non-binary
restricted GEC.
• Discussion of the link between the conditional spectral polynomials and the
binary and non-binary MacWilliams identities.
• Formulation of APP decoding procedures for both binary and non-binary re-
stricted GEC models using generalised weight polynomials.
• Numerical examples highlighting some of the many possible applications of
the theoretical framework developed in this thesis.
1.3 List of Relevant Publications
The following publications, authored or co-authored by the author of this thesis
during his candidature, are relevant to the material contained herein.
(P.1) H.-J. Zepernick, W. Griffiths and M. Caldera, “APP decoding of binary
linear block codes on Gilbert-Elliott channels,” in Proc. 2nd IEEE Int. Symp.
on Wireless Commun. Systems, Siena, Italy, Sept. 2005, pp. 434-437.
The objective of (P.1) was to demonstrate the existence of APP decoding
algorithms which could be applied to binary linear block codes when commu-
nicating over a GEC. Although two algorithms were developed, one using the
original domain and one using the spectral domain, only the latter is described
explicitly. The paper is augmented by an example and simulation results. This
research forms the basis of Chapter 4.
(P.2) W. Griffiths, “APP decoding of linear block codes on Fritchman channels,”
in Proc. 5th Australian Telecommun. Cooperative Research Centre Workshop,
Melbourne, Australia, Nov. 2005, pp. 50-53.
This paper has a similar goal to (P.1); however, it is adapted for use with a
single error state Fritchman channel model instead of a GEC. Such a channel
model can have a large number of states in order to perhaps model the bursty
nature of the wireless channel more closely. This increase in states comes
with a corresponding increase in computational complexity when performing
APP decoding. The overall strategy in (P.1) and (P.2) is however the same.
Chapter 4 presents the general algorithms for the spectral and also the original
domain, regardless of the state structure of the finite state Markov model.
(P.3) W. Griffiths, H.-J. Zepernick and M. Caldera, “APP decoding of block
codes over Gilbert-Elliott channels using generalized weight polynomials,” in
Proc. IEEE Veh. Technol. Conf., Melbourne, Australia, May 2006, pp. 1998-
2002.
The majority of the material in Chapter 6 has its origins in (P.3). Here, the
channel model used is a particular type of binary GEC. This model requires
only three parameters in order to describe its behaviour. The paper demon-
strates how this fact can be used to produce an algorithm based on polynomials
rather than matrices. The polynomials are related to the weight polynomials
of the MacWilliams identity [25], since they describe a relationship between
the dual and original domains. A short example and some numerical results
are also provided in (P.3).
(P.4) W. Griffiths, H.-J. Zepernick and M. Caldera, “On APP decoding of
non-binary block turbo codes over discrete channels,” in Proc. Int. Symp. on
Inf. Theory and its Appl., Parma, Italy, Oct. 2004, pp. 362-366.
Finally, the concept of schemes like the one presented in (P.4) is alluded to
in Chapter 8, under the heading of Future Research. This thesis is predom-
inantly concerned with maximum likelihood (ML) algorithms which assume
equally likely a priori information and do not reuse reliability information.
Nevertheless, this information is available for use when dealing with iterative
decoding or concatenated codes. These schemes are slightly more computationally
complex than those presented here; however, they can give better error
correcting performance. The paper (P.4) presents one possibility for deploying
iterative decoding techniques, although it concerns memoryless channels only. Additional
possible research areas are discussed in Chapter 8.
Chapter 2
Foundations
When modelling a system, it is beneficial to use a model which is simple to under-
stand, yet is still sufficiently complex to include all of its intricacies. Furthermore,
the problem to be solved, as well as any additional devices used in combating it,
must be expressible within the model. The purpose of this chapter is to set up
the framework used in this thesis for the discussion of transmission
and reception of information over discrete channels. In particular, the framework is
oriented towards using a posteriori probability decoding algorithms to attempt to
retrieve information which has been encoded using linear block codes.
One objective of this thesis is to design decoding algorithms for use over a variety
of communication channels. In order to do this, a method of modelling the channels
is established. Such models must reflect the different symbol alphabets which can
be used for transmission, as well as the way in which the errors occur. A model
for a channel where the errors occur independently is likely to be simpler than one
where there is dependence in the error generation process.
A common strategy to protect data from the effect of errors is to employ channel
encoding. As this thesis concerns linear block codes, the focus will be on this type
of code and related concepts such as the dual code and Hamming distance. Equally
important in the overall model is the channel decoding step, as this allows the
transmitted symbols to be estimated. Many methods exist for achieving this, and
they may be classified in different ways. Such classifications include whether or not
reliability information is used, and whether decoding estimates are made symbol-by-
symbol or word-by-word. Some decoding algorithms are trellis-based. Trellises can
display information about the code and the likelihoods of receiving particular symbol
sequences using such a code. Later chapters of this thesis will use representation
theory to map the information given in a trellis into matrix form.
This chapter is organised as follows. A general overview of the communica-
tion system used in this thesis is presented in Section 2.1. Section 2.2 provides
an overview and discussion of some of the discrete channel models for which the
decoding algorithms presented in the following chapters can be used. These chan-
nel models can be divided into two broad categories, those with memory and those
without. The focus of Section 2.3 is how coding can be used to attempt to recover
information which has been corrupted upon arrival at the receiver. The first part of
Section 2.3 deals with the different kinds of encoding processes which are available.
Next, a brief overview of the main types of decoding methods is given. Finally,
it is shown how trellis representations of linear block codes may assist with both
syndrome and APP decoding.
The main contributions to the body of knowledge in communications theory to
be found in this chapter are:
• Characterisation of a non-binary discrete memoryless channel in two distinct
ways, depending on the definition of the term “crossover probability”.
• Definition of the channel reliability factor for non-binary restricted Gilbert-
Elliott channels.
2.1 Communication System Model
The fundamental task in communications is the transmission of information from
a source to a sink. Besides the source, the transmitter may comprise units such as
a source encoder, channel encoder and a modulator. The objective of the source
encoder is to reduce redundancy which may be contained in the information released
from the source, in order to reduce the total amount of data that needs to be
transmitted. In this way, system resources such as power and bandwidth may be
conserved. Given the increased susceptibility of information that has been com-
pressed by source encoding to channel errors, some form of channel encoding is
commonly employed in communication systems to protect the information prior to
transmission. In particular, error control coding is used, through which redundancy
is added to the source encoded information in a systematic way. This will enable
the receiver to detect, and if possible, correct transmission errors to some degree.
The receiver would then include a demodulator, channel decoder, and source de-
coder to perform the inverse operations. In this way, an estimate of the sequence of
transmitted symbols is produced for release to the sink.
The spatial separation between transmitter and receiver is bridged by a transmission
channel. This channel may be associated with a wired or a wireless environment
and can thus possess a variety of stochastic characteristics.

Figure 2.1: Block diagram of the considered digital communication system.

Obtaining an accurate
description of how information is passed from source to sink requires the generation
of a channel model which is as precise as possible with respect to the physical channel
under study. Parameters of the channel model include specifications such as the size
of the signalling alphabet and the degree of statistical dependencies between error
events that usually impair the transmission of information. The resulting errors in
the received sequence of information may originate from a number of causes, such
as signal attenuation, multipath propagation, interference and additive noise at the
receiver. These errors cause information degradation at the receiver, so that unless
action is taken, the information will arrive incorrectly at the sink.
A block diagram of the components of the communication system model consid-
ered in this thesis is given in Fig. 2.1. As the main focus of the work presented is
related to the channel-decoder cascade, the functionalities of source encoding and
decoding are not considered. Instead it is assumed that perfect source encoding
has been achieved, which supports the concept of the source releasing symbols or
sequences of given length with equal probability. In addition, the involved signals
are assumed to be discrete in both time and value. Such signals are then referred to
as digital signals. The considered communication system is therefore also known as
a digital communication system over which digital transmission is performed. The
components of the considered digital communication system that will be adopted
for this thesis are then as follows:
Source: The discrete source produces sequences of given length k composed of
information symbols that are taken from a finite signalling alphabet. The
result is also referred to as an information word and will be denoted by
i = [i1, i2, . . . , ik]. (2.1)
Channel encoding: The information words i are processed by a channel encoder,
which performs a mapping of the input words i of length k onto output words
u of length n. The resulting word is referred to as a codeword and also consists
of symbols from a finite alphabet. Codewords will be denoted by
u = [u1, u2, . . . , un]. (2.2)
In this thesis, the paradigms of binary and non-binary linear block codes will
be considered to perform the task of systematically introducing redundancy
to the information words.
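As a concrete illustration of this mapping, a binary linear block code can be encoded by multiplying the information word by a generator matrix over GF(2). The sketch below uses one common systematic generator matrix for the (7,4) Hamming code; the matrix and function names are illustrative choices, not taken from the thesis.

```python
import numpy as np

# A systematic generator matrix G = [I_4 | A] for the (7,4) Hamming code
# (one common choice; other equivalent forms exist).
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

def encode(i):
    """Map an information word i (length k = 4) to a codeword u (length n = 7),
    with all arithmetic performed modulo 2."""
    return np.dot(i, G) % 2

i = np.array([1, 0, 1, 1])
u = encode(i)
print(u)  # the first k symbols reproduce i, since the encoding is systematic
```

The three appended parity symbols are the redundancy that the channel decoder later exploits.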
Discrete channel: The transmission medium will be modelled as a discrete chan-
nel and can be considered as an abstraction of the modulator, the physical
channel, and the demodulator. As such, the discrete channel processes code-
words u and releases received words
v = [v1, v2, . . . , vn]. (2.3)
As a consequence, the particular modulation and demodulation schemes to-
gether with the complex description of the statistical properties of the physical
channel are replaced by a discrete model and the related pattern of error se-
quences. In the case that the error events are statistically independent for
each discrete time instant, the channel is considered as being memoryless,
otherwise it is said to possess memory. The concept of a discrete channel is
particularly suited to conducting a performance assessment of potential chan-
nel coding schemes within a given application and an early design phase of a
digital communication system. It has been adopted, for example, in analysing
the performance of linear block codes on channels with and without memory
when using syndrome decoding [28], the analysis of turbo codes on channels
with memory [23], and the analysis of LDPC codes for the GEC [24].
Channel decoder: The channel decoder processes the received word v to produce
an estimate û of the transmitted codeword u and, eventually, an estimate î
of the information word i released by the source. In this thesis, symbol-by-symbol
APP decoding schemes for binary and non-binary linear block codes in
conjunction with different classes of discrete channel models with and without
memory are considered.
2.2 Channel Models
A sequence transmitted in a digital communication system is usually subject to
channel noise and other impairments. These may result in the received sequence
being different to the transmitted sequence. There can be differences in the lengths
and/or values of the sequences. In the latter case, the channel causes errors to occur.
In terms of modelling the behaviour of an arbitrary channel, there are two pos-
sibilities for describing the relationship between these errors. Firstly, errors could
occur independently of the errors that have occurred in all previous time periods.
In this situation, the channel model is termed memoryless, because the model is
not required to keep a history of its error events. An example of this model type
is the binary symmetric channel (BSC) model. Alternatively, the pattern of errors
in previous time instants could affect the probability of there being an error in the
current time instant. The corresponding type of channel model needs to retain a
description of past events and is said to have memory. One such channel model is
the GEC model.
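A minimal sketch of such a two-state error process is given below. The parameter values and names (transition probabilities p_gb, p_bg and per-state error probabilities eps_good, eps_bad) are purely illustrative, not calibrated to any channel discussed later.

```python
import random

def gec_errors(n, p_gb=0.02, p_bg=0.2, eps_good=0.001, eps_bad=0.3, seed=1):
    """Generate n binary error flags from a two-state, Gilbert-Elliott-style model.
    p_gb / p_bg are the good->bad / bad->good state transition probabilities;
    eps_good / eps_bad are the per-symbol error probabilities in each state.
    All values are illustrative only."""
    random.seed(seed)
    state = 'good'
    errors = []
    for _ in range(n):
        eps = eps_good if state == 'good' else eps_bad
        errors.append(1 if random.random() < eps else 0)
        # State transition for the next time instant.
        if state == 'good' and random.random() < p_gb:
            state = 'bad'
        elif state == 'bad' and random.random() < p_bg:
            state = 'good'
    return errors

e = gec_errors(1000)
print(sum(e), "errors in 1000 symbols")
```

Because errors are far more likely in the 'bad' state, and that state persists over several time instants, the generated error flags cluster into bursts rather than occurring independently.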
In order to obtain the truest reconstruction of real communication channels,
choosing the correct type of channel model is paramount. For example, wireless
channels often experience fades due to multipath fading as well as Doppler shifts
due to the movement of the mobile station or objects in the environment. The
result is that the errors may occur in bursts during some time periods. This type
of behaviour is more naturally modelled by a channel with memory, as the bursts
typically last for more than one time instant.
This section describes some of the channels in conjunction with which the de-
coding methods as presented in Chapters 3 to 7 can be used. Firstly, a selection of
memoryless channels designed for different signalling alphabets will be presented.
This is followed by the descriptions of several channels with memory.
2.2.1 Memoryless channels
Let the alphabet of symbols which may be transmitted and received be denoted U
and V, respectively. The alphabets U and V are also called input alphabet or input
set, and output alphabet or output set, respectively. In this thesis, input and output
alphabets are assumed to be of equal size, i.e., |U| = |V|. Furthermore, for discrete
time instant i ∈ Z+, denote the ith transmitted and received symbols by ui ∈ U and
vi ∈ V, respectively.

Figure 2.2: Transition probability diagram of a BSC.
A memoryless channel is characterised by the property that for each input se-
quence or input word u of length n, and output sequence or output word v of length
n, the conditional probability P (v|u) can be written as the product
P(v|u) = ∏_{i=1}^{n} P(vi|ui)    (2.4)
of the so-called transition probabilities P (vi|ui). In other words, each output symbol
vi ∈ V depends only on the corresponding input symbol ui ∈ U at discrete time
instant i, but not on the history of input symbols prior to the current time instant.
In terms of the error process, the property of being memoryless means that the
underlying noise affects each input symbol independently.
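The product in (2.4) can be evaluated directly. The sketch below assumes a binary channel with a single crossover probability eps, so that each factor is either eps or 1 − eps; the function name is an illustrative choice.

```python
def word_likelihood(v, u, eps):
    """P(v|u) for a memoryless binary channel with crossover probability eps:
    the product of per-symbol transition probabilities, as in (2.4)."""
    p = 1.0
    for vi, ui in zip(v, u):
        p *= eps if vi != ui else (1.0 - eps)
    return p

# Two disagreements in five positions, so P(v|u) = eps^2 * (1 - eps)^3.
print(word_likelihood([1, 0, 1, 1, 0], [1, 1, 1, 0, 0], 0.1))
```

For such a channel the likelihood depends on v and u only through the number of positions in which they differ, which is what makes the Hamming distance a natural decoding metric.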
Binary symmetric channel
The input alphabet U and output alphabet V of a BSC consist of two elements. For
convenience, denote these binary elements as 0 and 1. The conditional probabilities
of a BSC are symmetric and are given by the transition probabilities
P(vi|ui) = { 1 − ǫ  if vi = ui,
             ǫ      if vi ≠ ui.    (2.5)
As can be seen from (2.5), a BSC is completely described by the parameter ǫ, which
is referred to as the crossover probability. Also, the structure of a BSC may be
concisely represented by a transition probability diagram as shown in Fig. 2.2. A
binary channel model would be suitable for systems that incorporate binary channel
codes, where the BSC is a rather idealistic but simple model.
In this context, the operations on pairs of elements in the considered binary
alphabet {0, 1} are defined as addition and multiplication. These two operations
are performed modulo 2 with respect to the algebra defined by a Galois field or
finite field (see also Section 2.3)
GF(2) = < {0, 1}, ⊕, · >.    (2.6)
In brief, a Galois field is a finite set equipped with two binary operations which
can be applied to its elements. These operations are usually referred to as addi-
tion and multiplication as mentioned above, and must possess certain properties.
Namely, each operation must have a different identity element, and be closed and
associative. Each element is required to have an additive inverse, addition must
be commutative, and multiplication must distribute over addition. A set satisfying
these properties is called a ring. If all elements
besides the additive identity possess a unique multiplicative inverse, then the set is
a field. In view of the properties of the Galois field GF (2), a transition from an
input symbol ui ∈ U to an output symbol vi ∈ V may be formulated as
vi = ui ⊕ fi, (2.7)
where ⊕ denotes addition modulo 2 and fi ∈ {0, 1} represents the absence or pres-
ence of an error event.
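The additive error description in (2.7) can be sketched directly: the received word is the codeword plus the error pattern, symbol-wise modulo 2. The codeword and error pattern below are arbitrary examples.

```python
def transmit(u, f):
    """vi = ui ⊕ fi: apply the binary error pattern f to codeword u,
    where ⊕ is addition modulo 2 (XOR)."""
    return [ui ^ fi for ui, fi in zip(u, f)]

u = [1, 0, 1, 1, 0, 1, 0]  # an arbitrary codeword
f = [0, 0, 1, 0, 0, 0, 0]  # a single error event, in the third position
v = transmit(u, f)
```

Note that adding f to v again recovers u, since every element of GF(2) is its own additive inverse.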
An ultimate measure for the maximum rate of information in terms of bits per
transmission over a channel is given by the channel capacity C. As a consequence of
Shannon’s landmark 1948 paper [6], for all transmission rates less than the channel
capacity C it is possible to transmit with arbitrarily small error probability by using
sufficiently strong channel coding. In the case of a BSC with crossover probability
ǫ, the channel capacity in bits is given by
C = 1 + (1 − ǫ) log2(1 − ǫ) + ǫ log2 ǫ. (2.8)
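Equation (2.8) can be evaluated directly, with the usual convention that 0 · log₂ 0 = 0: a noiseless channel (ǫ = 0) attains C = 1 bit per transmission, while ǫ = 0.5 gives C = 0. A short sketch (the function name is an illustrative choice):

```python
from math import log2

def bsc_capacity(eps):
    """Channel capacity of a BSC with crossover probability eps, in bits,
    as given by (2.8)."""
    def plog(p):
        return p * log2(p) if p > 0 else 0.0  # convention: 0 * log2(0) = 0
    return 1.0 + plog(1.0 - eps) + plog(eps)

print(bsc_capacity(0.0))  # noiseless channel: 1 bit per use
print(bsc_capacity(0.5))  # output independent of input: 0 bits per use
```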
Discrete memoryless channel
The BSC is a special case of a class of channel models known as DMCs, which allow
for non-binary input and output symbols. In this thesis, the non-binary input and
output alphabets U and V , respectively, are assumed to be of size |U| = |V| = p,
where p is a prime. This generalisation allows investigation of transmission systems
that use non-binary codes such as the ternary Golay code [29], non-binary Hamming
codes, and many others.
In this case, the involved alphabets U and V can be considered as the set of
integers {0, 1, . . . , p−1}. The operations of addition and multiplication on pairs of
Figure 2.3: Transition probability diagram of the p-ary symmetric DMC model: (a) standard model, (b) alternative model.
elements of this set are performed modulo p according to the algebra defined by a
Galois field (see also Section 2.3), which is denoted for brevity as
GF (p) = < {0, 1, . . . , p−1},⊕, · > . (2.9)
In addition, the considered p-ary DMC models are assumed to be symmetric
and differentiate only between error-free and erroneous transmission. The actual
value of the discrete error is not taken into account. The only parameter required
to describe such a type of p-ary DMC is the crossover probability. As the definition
of the crossover probability is somewhat ambiguous for a p-ary DMC, the following
two forms of model may be described.
Standard model: The transition probability diagram of a p-ary symmetric DMC
in standard form is shown in Fig. 2.3(a). For clarity, only the branches ema-
nating from the input node for an element g ∈ GF (p) have been shown. There
is a symmetrical set of p branches emanating from each of the other p−1 input
nodes. Given the crossover probability ε1, the weights of the branches on the
transition probability diagram are defined by
P(vi|ui) = { 1 − (p−1)ε1   if vi = ui,
           { ε1             if vi ≠ ui.          (2.10)
Alternative model: The alternative form of the p-ary symmetric DMC model is
shown in Fig. 2.3(b). In this alternative model, the probability ε2 is defined
as
ε2 = ∑_{vi ∈ GF(p), vi ≠ ui} P(vi|ui), (2.11)
and the weights of the transition probability diagram are given by
P(vi|ui) = { 1 − ε2        if vi = ui,
           { ε2/(p−1)      if vi ≠ ui.           (2.12)
Although ε1 ≠ ε2 holds for p > 2, the relationship between the parameters ε1
and ε2 can be easily deduced from (2.10) and (2.12) as
ε2 = (p−1)ε1. (2.13)
The channel capacities C1 and C2 of the standard and alternative p-ary symmetric
DMC models, respectively, are given as
C1 = 1 + [1 − (p−1)ε1] logp[1 − (p−1)ε1] + (p−1)ε1 logp ε1, (2.14)
C2 = 1 + (1 − ε2) logp(1 − ε2) + ε2 logp(ε2/(p−1)). (2.15)
Applying (2.13) in (2.15) shows that the two models have equal channel capacities.
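The equality of the two capacities is also easy to confirm numerically. A hedged sketch (function names and parameter values illustrative) that evaluates (2.14) and (2.15) under the relation (2.13):

```python
from math import log

def log_p(x, p):
    """Logarithm to base p."""
    return log(x) / log(p)

def capacity_standard(eps1, p):
    """C1 of the standard p-ary symmetric DMC, eq. (2.14)."""
    return (1 + (1 - (p - 1) * eps1) * log_p(1 - (p - 1) * eps1, p)
              + (p - 1) * eps1 * log_p(eps1, p))

def capacity_alternative(eps2, p):
    """C2 of the alternative p-ary symmetric DMC, eq. (2.15)."""
    return (1 + (1 - eps2) * log_p(1 - eps2, p)
              + eps2 * log_p(eps2 / (p - 1), p))

# with eps2 = (p-1)*eps1 as in (2.13), the two capacities coincide
for p, eps1 in [(3, 0.05), (5, 0.02), (7, 0.01)]:
    assert abs(capacity_standard(eps1, p)
               - capacity_alternative((p - 1) * eps1, p)) < 1e-9
```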
Unless otherwise stated, the convention in this document will be to use the
standard p-ary DMC model as presented in Fig. 2.3(a). However, there will be some
cases where calculations will be done using both models, to emphasise the equality
of the two models.
2.2.2 Channels with memory
Many practical communication channels, particularly wireless channels, do not pro-
duce single independent errors. Instead, the errors are likely to occur in bursts [30].
It is for this reason that such a channel is referred to as one having memory. Each er-
ror or non-error inherits a statistical dependence on the previous channel behaviour.
As such, the memoryless channel models as described in Section 2.2.1 are not com-
plex enough to represent such a dependence.
In order to furnish a model with memory, it is usually assumed that the channel
must reside in one of a finite number of states at each discrete time instant. These
states are indicators of different error likelihoods with respect to the symbol trans-
mission. It is then the transference of the model between these states which results
in periods with different clusterings of transmission errors. This may be observed
on a physical discrete channel with memory as error bursts and gaps between these
bursts. In this section, prominent models of channels with memory are reported and
discussed to the extent needed for the formulation of APP decoding algorithms in
later chapters.
Stochastic sequential machines
One of the first discussions of a finite state channel model, also known as a stochastic
sequential machine, was in Shannon’s watershed paper [6]. The introduction of a
finite number of states into the channel model has proven to be a relatively simple
method of incorporating memory. The following is the interpretation of a stochastic
sequential machine which is adopted in this thesis as a means of describing channels
with memory.
A stochastic sequential machine D is a discrete time device consisting of an input
set U of cardinality |U|, an output set V of cardinality |V|, a finite set S of S states
and a set of up to |U|·|V| matrix probabilities {D(vi|ui)} of size S×S. The notation
adopted is stated as
D = (U ,V ,S, {D(vi|ui)}). (2.16)
At each time instant i ∈ Z+, a symbol ui ∈ U is transmitted and a symbol vi ∈ V is
received. The channel is in a state si ∈ S during this symbol transmission, and it will
subsequently assume a state si+1 ∈ S at time i+1. The stochastic process associated
with crossovers of symbols and state transitions is contained in the elements of the
matrix probabilities D(vi|ui) and will be detailed later in this section for a number
of major channels with memory. In order to complete the channel model, an initial
state distribution is required which will be described by a row vector σ0 of length
S. The stochastic sequential machine D together with the initial state distribution
vector σ0 forms a stochastic automaton
D = (D,σ0). (2.17)
It should be mentioned that the marginalisation of the matrix probabilities
D(vi|ui) over all output symbols vi ∈ V results in the so-called state transition
matrix
D = ∑_{vi∈V} D(vi|ui). (2.18)
In view of (2.18), the stationary state distribution is used in this thesis to initialise
the sequence of state transitions over discrete time rather than randomly choosing
an arbitrary initial state distribution. Given the constraint that the sum of the
stationary state probabilities represented by the elements of the stationary state
distribution vector σ0 must be one as
σ0e = 1, (2.19)
where e denotes an all-ones column vector of length S, the stationary state distri-
bution can be easily obtained by solving the following equation:
σ0D = σ0. (2.20)
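For a small number of states, (2.20) subject to (2.19) can be solved by fixed-point (power) iteration. A minimal sketch, assuming the chain is ergodic so the iteration converges:

```python
def stationary_distribution(D, iters=1000):
    """Solve sigma0 * D = sigma0 with entries summing to one, eqs. (2.19)-(2.20).

    Power iteration on the row vector; assumes the state transition matrix D
    (given as nested lists) describes an ergodic chain, so the iterates
    converge to the unique stationary distribution.
    """
    S = len(D)
    sigma = [1.0 / S] * S          # start from the uniform distribution
    for _ in range(iters):
        sigma = [sum(sigma[j] * D[j][i] for j in range(S)) for i in range(S)]
    return sigma

# two-state example with P = 0.1, Q = 0.3
P, Q = 0.1, 0.3
D = [[1 - P, P], [Q, 1 - Q]]
sigma = stationary_distribution(D)
assert abs(sigma[0] - Q / (P + Q)) < 1e-9   # closed form for two states
assert abs(sigma[1] - P / (P + Q)) < 1e-9
```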
Hidden Markov models
It remains to be shown how the matrix probabilities D(vi|ui) of a stochastic sequen-
tial machine facilitate the modelling of channel memory. In essence, the error and
state sequences form a HMM, and as such, they result in two layers of stochasticity.
The error sequence representing the difference between the transmitted and received
symbol sequences is the observable result of a random process. However, the under-
lying sequence of states which in part produced the error sequence is hidden.
More specifically, if a discrete channel model with a memory of size l ∈ Z+ is re-
quired, then the state sequence must form a Markov chain of order l. In other words,
the state sequence obeys the lth order Markovian property that if a sequence of dis-
crete random variables S1, S2, . . . , Si produces a sequence of realisations s1, s2, . . . , si,
where sj ∈ S, ∀j ∈ {1, 2, . . . , i}, then
P(Si+1 = si+1|Si = si, Si−1 = si−1, . . . , S1 = s1)
= P(Si+1 = si+1|Si = si, Si−1 = si−1, . . . , Si−l+1 = si−l+1). (2.21)
Accordingly, the probability that the channel assumes state Si+1 = si+1 at time
instant i + 1 is conditioned only on the sequence of l preceding states but not the
entire history of states. As only channels with a memory of size l = 1 will be
considered in this thesis, (2.21) simplifies to
P (Si+1 = si+1|Si = si, Si−1 = si−1, . . . , S1 = s1) = P (Si+1 = si+1|Si = si). (2.22)
By analogy, conditional probabilities that account for both the error sequence
and the state sequence can be formulated. Suppose the input symbol ui ∈ U and
the output symbol vi ∈ V as realisations of the random variables Ui and Vi at time
instant i, respectively, and a channel model with memory of size l = 1 are given.
Then, an element dsi,si+1(Vi = vi|Ui = ui) of the matrix probability D(vi|ui) of
size S × S for each pair of the current state Si = si ∈ S and subsequent state
Si+1 = si+1 ∈ S can be expressed as
dsi,si+1(Vi = vi|Ui = ui) = P (Vi = vi, Si+1 = si+1|Si = si, Ui = ui). (2.23)
Simplifying notation, applying Bayes’ rule, and assuming independence between the
underlying source and state processes, the expression in (2.23) may be written as
dsi,si+1(vi|ui) = P(vi, si+1|si, ui)
               = P(vi|si+1, si, ui) P(si+1|si, ui)
               = P(vi|si+1, si, ui) P(si+1|si). (2.24)
For each combination of si ∈ S and si+1 ∈ S, assume that there is a p-ary sym-
metric DMC where the probabilities of correct and erroneous symbol transmission
are known. For the binary case, two matrix probabilities may be defined as
Dui⊕vi = D(vi|ui) = [ dsi,si+1(vi|ui) ]S×S ∈ {D0, D1}. (2.25)
Thus, the probabilities P (vi|si+1, si, ui) and P (si+1|si) are required in order to de-
scribe the channel model. A more specific description of some prominent two-state
binary and non-binary channel models with memory follows.
Binary general two-state Markov channel
Consider a two-state Markov channel comprising a ‘good’ state G and a ‘bad’ state
B. The terms “good” and “bad” are used to indicate favourable and difficult channel
conditions in terms of providing correct transmission, respectively. The related error
processes can be quantified by probabilities P (vi|si+1, si, ui), which, for the general
binary channel with ui, vi ∈ {0, 1}, depend on the state si at time instant i, as
well as the subsequent state si+1 at time instant i+ 1. Assuming that a BSC model
applies for each of the possible pairs of the current state si ∈ {G,B} and subsequent
state si+1 ∈ {G,B}, the corresponding crossover probabilities may be denoted as
P(vi ≠ ui|Si+1 = G, Si = G, ui) = pGG, (2.26)
P(vi ≠ ui|Si+1 = B, Si = G, ui) = pGB, (2.27)
P(vi ≠ ui|Si+1 = G, Si = B, ui) = pBG, (2.28)
P(vi ≠ ui|Si+1 = B, Si = B, ui) = pBB. (2.29)
In addition, define the time-invariant state transition probabilities for the con-
sidered class of two-state channel models with S = {G,B} as
P (Si+1 = G|Si = G) = 1 − P, (2.30)
P (Si+1 = B|Si = G) = P, (2.31)
P (Si+1 = G|Si = B) = Q, (2.32)
P (Si+1 = B|Si = B) = 1 −Q. (2.33)
Figure 2.4: Structure of the binary general two-state Markov model.
The structure of the error process and state transitions along with the related
parameters can be summarised as shown in Fig. 2.4. The state transition diagram
shows the connection between the ‘good’ state G and the ‘bad’ state B, where each
branch is labelled with its corresponding state transition probability. As far as the
four BSC models are concerned, these may also be considered as being associated
with one of the four branches in the state transition diagram.
For the considered symmetric channel with memory, the parameters given in
(2.26)-(2.33) may be used to concisely describe the general two-state Markov channel
by matrix probabilities. As the considered channel is symmetric with respect to
transmission error, the two possible matrix probabilities may be written as
D0 = D(vi = ui|ui) = [ (1 − pGG)(1 − P)   (1 − pGB)P       ]
                     [ (1 − pBG)Q         (1 − pBB)(1 − Q) ] ,   (2.34)
D1 = D(vi ≠ ui|ui) = [ pGG(1 − P)   pGB P       ]
                     [ pBG Q        pBB(1 − Q) ] .   (2.35)
Then, the state transition matrix for the general two-state Markov channel model
can be reported as
D = D0 + D1 = [ 1 − P   P     ]
              [ Q       1 − Q ] .   (2.36)
In addition, the difference matrix ∆ between the matrix probabilities D0 and
D1 will be frequently used in the following chapters for the proposed APP decoding
algorithms. Such a difference matrix can be represented for the general two-state
Markov channel as
∆ = D0 − D1 = [ (1 − 2pGG)(1 − P)   (1 − 2pGB)P       ]
              [ (1 − 2pBG)Q         (1 − 2pBB)(1 − Q) ] .   (2.37)
To complete the channel model, the stationary state distribution vector σ0 can
be calculated by solving (2.20) using (2.36) and subject to the constraint in (2.19)
as
σ0 = [σG, σB] = [ Q/(P+Q), P/(P+Q) ]. (2.38)
Binary Gilbert-Elliott channel
If the crossover probabilities of the binary general two-state Markov channel model
were to depend only on the current state Si of the channel, rather than both the
current state Si and the subsequent state Si+1, then
P (vi|Si+1 = si+1, Si = si, ui) = P (vi|Si = si, ui). (2.39)
Only two different crossover probabilities would then need to be specified for the
characterisation of the error process, that is
pG = pGG = pGB, (2.40)
pB = pBG = pBB, (2.41)
where pG and pB refer to the crossover probabilities in the ‘good’ state G and ‘bad’
state B, respectively. The condition pG ≤ pB could be applied to identify the states
G and B.
The resulting channel model is known as the GEC model [3] and may be illus-
trated as shown in Fig. 2.5. Its two matrix probabilities are given by
D0 = D(vi = ui|ui) = [ 1 − pG   0      ] [ 1 − P   P     ]
                     [ 0        1 − pB ] [ Q       1 − Q ] ,   (2.42)
D1 = D(vi ≠ ui|ui) = [ pG   0  ] [ 1 − P   P     ]
                     [ 0    pB ] [ Q       1 − Q ] .   (2.43)
The stationary state distribution vector σ0 and state transition matrix D remain
unchanged from (2.38) and (2.36), respectively, while the difference matrix ∆ may
be expressed as
∆ = D0 − D1 = [ 1 − 2pG   0        ]
              [ 0         1 − 2pB ] D. (2.44)
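An error sequence of the GEC is simple to simulate from this description. A hedged sketch (function name, seed, and parameter values illustrative; the chain is started in the ‘good’ state for simplicity):

```python
import random

def gec_error_sequence(n, P, Q, pG, pB, seed=1):
    """Generate n error flags f_i from a binary GEC.

    The hidden state follows the two-state Markov chain with transition
    probabilities P (G -> B) and Q (B -> G); an error occurs with
    probability pG in state G and pB in state B, as in (2.39)-(2.41).
    """
    rng = random.Random(seed)
    state = 'G'
    errors = []
    for _ in range(n):
        p_err = pG if state == 'G' else pB
        errors.append(1 if rng.random() < p_err else 0)
        if state == 'G':                 # state transition for the next symbol
            state = 'B' if rng.random() < P else 'G'
        else:
            state = 'G' if rng.random() < Q else 'B'
    return errors

f = gec_error_sequence(20000, P=0.01, Q=0.2, pG=0.001, pB=0.3)
# errors appear in bursts; the long-run error rate mixes pG and pB
assert 0.0 < sum(f) / len(f) < 0.1
```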
Figure 2.5: Structure of the binary GEC model.
Channel models usually do not have a unique parametrisation. The binary GEC
model has been defined above in terms of four parameters, which are connected
to the theoretical concept of Markov chains. Specifically, these parameters were
the state transition probabilities P and Q, and the crossover probabilities pG and
pB. Alternatively, a parametrisation more strongly related to the distribution of the
bursts of transmission errors may be considered. These types of parameters could
be extracted by measuring error patterns from an actual channel. The following
three alternative channel parameters have been reported in [26]:
Average fade to connection time ratio: The average fade to connection time
ratio x describes how much time, on average, the channel spends in the ‘bad’
state B relative to the ‘good’ state G. It is related to the state transition probabilities
P and Q of the GEC model by
x = P/Q. (2.45)
Burst factor: The burst factor y relates to the correlation function of the error
process and indicates how clustered the discrete error events may appear on
the channel. A burst factor of y = 0 is obtained for the case of statistically
independent errors, such as for a DMC. As y increases, the errors occur more
frequently in bursts, as each becomes increasingly dependent on the existence
or non-existence of previous errors. A burst factor of y = 1 then indicates max-
imal error dependency. The relationship to the state transition probabilities
P and Q is given by
y = 1 − P −Q. (2.46)
Figure 2.6: Structure of the standard non-binary GEC model.
Channel reliability factor: This parameter is directly related to the average
bit error rate (BER) of the channel, which is a quantity of greater practical
significance compared to the individual crossover probabilities in each channel
state. Accordingly, the average BER pb of a binary GEC can be written as
pb = pG σG + pB σB = pG · Q/(P+Q) + pB · P/(P+Q), (2.47)
where the subscript b signifies BER rather than symbol error rate (SER). Given
the average BER in (2.47), the channel reliability factor z for the binary GEC
has been defined in [26] as
z = 1 − 2pb = (1 − 2pG) · Q/(P+Q) + (1 − 2pB) · P/(P+Q). (2.48)
A channel reliability factor of z = 0 is obtained for a totally unreliable channel
(pb = 0.5). On the other hand, an error-free channel (pb = 0) gives a channel
reliability factor of z = 1.
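The mapping from the Markov parameters (P, Q, pG, pB) to the burst-error characteristics is then immediate. A minimal sketch with illustrative values (function name not from the thesis):

```python
def burst_parameters(P, Q, pG, pB):
    """Map binary GEC parameters to (x, y, z), eqs. (2.45)-(2.48)."""
    x = P / Q                                   # average fade to connection time ratio
    y = 1 - P - Q                               # burst factor
    sG, sB = Q / (P + Q), P / (P + Q)           # stationary distribution (2.38)
    z = (1 - 2 * pG) * sG + (1 - 2 * pB) * sB   # channel reliability factor (2.48)
    return x, y, z

x, y, z = burst_parameters(P=0.01, Q=0.2, pG=0.001, pB=0.3)
assert 0 <= y < 1 and 0 < z <= 1

# z = 1 - 2*pb, with pb the average BER of (2.47)
pb = 0.001 * (0.2 / 0.21) + 0.3 * (0.01 / 0.21)
assert abs(z - (1 - 2 * pb)) < 1e-12
```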
Non-binary Gilbert-Elliott channel
It is straightforward to extend the binary GEC model to accommodate non-binary
input and output symbols. In this situation, the BSCs associated with the two states
of the binary GEC model are replaced by p-ary DMCs. The state transition process
between the two channel states and the related state transition probabilities remain
the same. The structure of a non-binary GEC model may then be illustrated as
shown in Fig. 2.6. Note that the p-ary DMC model in standard form as shown in
Fig. 2.3(a) has been used here, although either p-ary DMC model may be applied.
The non-binary GEC model is able to be described by a stochastic automaton
D = (D, σ0) as defined in (2.17), with stationary state distribution vector σ0 as
given in (2.38), and stochastic sequential machine D defined by
D = (U = {0, 1, . . . , p− 1},V = {0, 1, . . . , p− 1},S = {G,B}, {D(vi|ui)}). (2.49)
The elements of the matrix probabilities D(vi|ui) can be specified by noting that
the following property holds for the p-ary symmetric DMCs in each state Si = si ∈ S:
P(vi = ui ⊕ fi|si, ui) = P(vi = u′i ⊕ gi|si, u′i) ∀ ui, u′i ∈ GF(p), (2.50)
whenever both symbols fi, gi ∈ GF(p) \ {0} or if fi = gi = 0. Adopting the notation
presented in [31], the set of matrix probabilities may then be represented as
{D(vi|ui)} = {Dfi := D(vi = ui ⊕ fi|ui) | fi, ui ∈ GF(p)}. (2.51)
In view of (2.51), the set of matrix probabilities is of size p and its elements can be
identified through the value of the error symbols fi ∈ GF (p). Using the notation
vi ⊖ ui ≡ vi − ui (mod p), (2.52)
the set of matrix probabilities may be related to the events of correct and erroneous
symbol transmission as
D(vi|ui) = Dvi⊖ui = Dfi = { D0   if fi = 0,
                          { Dε   if fi ∈ GF(p) \ {0}.        (2.53)
As for the binary GEC, only two 2 × 2 matrix probabilities are required to de-
scribe this discrete non-binary channel with memory. The two matrix probabilities
accounting for correct and erroneous transmission can be written using (2.24) and
(2.39) in terms of the parameters of a p-ary symmetric DMC as
D0 = [ 1 − (p−1)pG   0            ] [ 1 − P   P     ]
     [ 0             1 − (p−1)pB ] [ Q       1 − Q ] ,   (2.54)
Dε = [ pG   0  ] [ 1 − P   P     ]
     [ 0    pB ] [ Q       1 − Q ] .   (2.55)
The marginalisation of the matrix probabilities with respect to the output symbol
vi ∈ GF (p) leads to the state transition matrix D, while the concept of the difference
between correct and erroneous transmission results in the difference matrix ∆. These
may be expressed respectively as
D = D0 + (p−1)Dε = [ 1 − P   P     ]
                   [ Q       1 − Q ] ,   (2.56)
∆ = D0 − Dε = [ 1 − ppG   0        ]
              [ 0         1 − ppB ] D.   (2.57)
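The consistency of (2.54)-(2.57) can be checked mechanically. In the sketch below (illustrative ternary parameter values, helper name not from the thesis), the marginalisation D0 + (p−1)Dε reproduces the state transition matrix:

```python
p = 3                                     # ternary example, illustrative values
P, Q, pG, pB = 0.05, 0.25, 0.01, 0.1
T = [[1 - P, P], [Q, 1 - Q]]              # state transition factor of (2.36)

def scale_rows(a, b, M):
    """Multiply the G-row of M by a and the B-row by b (diagonal left factor)."""
    return [[a * M[0][0], a * M[0][1]],
            [b * M[1][0], b * M[1][1]]]

D0   = scale_rows(1 - (p - 1) * pG, 1 - (p - 1) * pB, T)   # eq. (2.54)
Deps = scale_rows(pG, pB, T)                               # eq. (2.55)

# D0 + (p-1)*Deps must reproduce the state transition matrix, eq. (2.56)
for i in range(2):
    for j in range(2):
        assert abs(D0[i][j] + (p - 1) * Deps[i][j] - T[i][j]) < 1e-12

# the difference matrix (2.57) carries the diagonal factors 1-p*pG, 1-p*pB
Delta = [[D0[i][j] - Deps[i][j] for j in range(2)] for i in range(2)]
assert abs(Delta[0][0] - (1 - p * pG) * T[0][0]) < 1e-12
```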
It may also be beneficial to report an alternative parametrisation for the non-binary
GEC model in terms of the average fade to connection time ratio x, the burst factor
y, and the channel reliability factor z. The former two parameters depend only on
the state transition probabilities, and are therefore given by the same expressions
(2.45) and (2.46) as for the binary GEC. However, the calculation of the channel reli-
ability factor z also involves the crossover probabilities pG and pB of the constituent
DMCs, and can be derived as follows.
Without loss of generality, consider the standard non-binary GEC model shown
in Fig 2.6. Assuming the input symbols ui ∈ GF (p) appear with equal probability,
that is
∀ ui ∈ U : P(ui) = 1/p, (2.58)
then the average SER for this model can be calculated as
ps = ∑_{si∈{G,B}} ∑_{ui∈GF(p)} P(fi ≠ 0|si, ui) P(si) P(ui)
   = (1/p) ∑_{ui∈GF(p)} (p−1)(pG σG + pB σB)
   = (p−1)(pG σG + pB σB). (2.59)
By analogy with the difference matrix ∆ defined in (2.57), the channel reliability
factor z can be regarded as an average difference in the probabilities for correct and
erroneous transmission for a given transmission error fi = vi ⊖ ui = g ∈ GF (p).
As averaging is being performed with respect to the channel states G and B, the
Figure 2.7: Structure of the binary Gilbert channel model.
channel reliability factor z may be calculated as
z = ∑_{si∈{G,B}} ∑_{ui∈GF(p)} [P(fi = 0|si, ui) − P(fi = g ≠ 0|si, ui)] P(si) P(ui)
  = (1/p) ∑_{ui∈GF(p)} [(1 − ppG)σG + (1 − ppB)σB]
  = (1 − ppG)σG + (1 − ppB)σB
  = 1 − p/(p−1) · ps. (2.60)
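The closing identity of (2.60) can likewise be verified numerically against the average SER of (2.59). A small sketch with illustrative values:

```python
p = 5                                        # illustrative field size and parameters
P, Q, pG, pB = 0.02, 0.3, 0.01, 0.04
sG, sB = Q / (P + Q), P / (P + Q)            # stationary distribution, eq. (2.38)

ps = (p - 1) * (pG * sG + pB * sB)           # average SER, eq. (2.59)
z  = (1 - p * pG) * sG + (1 - p * pB) * sB   # channel reliability factor, eq. (2.60)

# final line of (2.60): z = 1 - p/(p-1) * ps
assert abs(z - (1 - p / (p - 1) * ps)) < 1e-12
```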
Binary Gilbert channel
There are special cases of the GEC model which deserve particular consideration.
Firstly, suppose that the BSC in the ‘good’ state G is perfect in the sense that
it does not cause transmission errors and therefore has a crossover probability of
pG = 0. This channel model is known as the Gilbert model [32], named after its
developer Edgar N. Gilbert. It has the advantage of fairly simple analytical error
probability calculations [33]. Figure 2.7 shows a schematic diagram of the binary
Gilbert model.
In terms of the matrix probabilities, the only change from the binary GEC model
is the restriction of pG = 0 and therefore
D0 = [ 1   0      ] [ 1 − P   P     ]
     [ 0   1 − pB ] [ Q       1 − Q ] ,   (2.61)
D1 = [ 0   0  ] [ 1 − P   P     ]
     [ 0   pB ] [ Q       1 − Q ] .   (2.62)
Figure 2.8: Structure of the non-binary Gilbert channel model using the standard p-ary DMC model.
The state transition matrix and difference matrix are then calculated as
D = [ 1 − P   P     ]
    [ Q       1 − Q ] ,   (2.63)
∆ = [ 1   0        ]
    [ 0   1 − 2pB ] D.   (2.64)
Non-binary Gilbert channel
The non-binary Gilbert channel is obtained from the non-binary GEC model by
setting the crossover probability pG of the channel in the ‘good’ state G to zero. A
diagram of the non-binary Gilbert channel model using the standard p-ary DMC
model as in Fig. 2.3(a) is shown in Fig. 2.8.
It follows that the state transition matrix D is defined as in (2.63), while the
remaining three matrices are given as
D0 = [ 1   0            ]
     [ 0   1 − (p−1)pB ] D,   (2.65)
Dε = [ 0   0  ]
     [ 0   pB ] D,   (2.66)
and
∆ = [ 1   0        ]
    [ 0   1 − ppB ] D.   (2.67)
Binary restricted Gilbert-Elliott channel
In the same way that the Gilbert channel considers the extreme case of the channel
being error-free in the ‘good’ state G, another special case would be to consider a
channel that is totally unreliable in the ‘bad’ state B. In other words, with the
crossover probability pB = 0.5, the channel capacity of the BSC in the ‘bad’ state
B is zero. This case will be considered below for the binary GEC model and will
be referred to as the binary restricted GEC model. The structure of this restricted
GEC is displayed in Fig. 2.9.
Firstly consider the matrix probabilities D0 and D1 for a binary restricted GEC
model (pB = 0.5), which may be expressed as
D0 = [ 1 − pG   0   ]
     [ 0        0.5 ] D,   (2.68)
D1 = [ pG   0   ]
     [ 0    0.5 ] D,   (2.69)
where the state transition matrix D remains unchanged from that given in (2.36).
However, the difference matrix for the binary restricted GEC model has a particu-
larly sparse form and is therefore given the special designation
δ = ∆(pB = 0.5) = [ 1 − 2pG   0 ]
                  [ 0         0 ] D.   (2.70)
As the restricted GEC model is defined by a set of three parameters, either
the mathematical parameters P , Q, and pG, or the more measurement-related pa-
rameters x, y, and z may be used. In the latter case, it is possible to formulate
APP decoding based on polynomials in the three variables x, y, and z as will be
shown in Chapter 6. In addition, duality concepts similar to those contained in the
MacWilliams identity [25] can be revealed for the proposed APP decoding algorithms
on restricted GEC models by using the polynomials in x, y, and z.
While the average fade to connection time ratio x and the burst factor y do
not depend on crossover probabilities and therefore are given in (2.45) and (2.46),
respectively, the channel reliability factor z for the binary restricted GEC is obtained
Figure 2.9: Structure of the binary restricted GEC model.
by applying the constraint pB = 0.5 to (2.48), resulting in
z = (1 − 2pG) · Q/(P+Q). (2.71)
This allows the stationary state distribution vector σ0, state transition matrix D,
and difference matrix δ = ∆(pB = 0.5) given in (2.38), (2.36) and (2.70), respectively,
to be expressed in terms of the burst-error characteristics x, y, and z as
σ0(x) = [ 1/(1+x), x/(1+x) ], (2.72)
D(x, y) = 1/(1+x) · [ 1 + xy   x − xy ]
                    [ 1 − y    x + y  ] , (2.73)
δ(x, y, z) = z(1 + x) [ 1   0 ]
                      [ 0   0 ] D(x, y). (2.74)
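These polynomial expressions can be cross-checked against the (P, Q, pG) parametrisation. The sketch below (illustrative values) confirms that (2.72)-(2.74) agree with (2.38), (2.36) and (2.70):

```python
P, Q, pG = 0.1, 0.4, 0.02        # restricted GEC, so pB = 0.5 (illustrative values)
x, y = P / Q, 1 - P - Q          # eqs. (2.45)-(2.46)
z = (1 - 2 * pG) * Q / (P + Q)   # eq. (2.71)

# sigma0(x) of (2.72) equals [Q/(P+Q), P/(P+Q)] of (2.38)
assert abs(1 / (1 + x) - Q / (P + Q)) < 1e-12
assert abs(x / (1 + x) - P / (P + Q)) < 1e-12

# D(x, y) of (2.73) reproduces the state transition matrix of (2.36)
D_xy = [[(1 + x * y) / (1 + x), (x - x * y) / (1 + x)],
        [(1 - y) / (1 + x),     (x + y) / (1 + x)]]
D_PQ = [[1 - P, P], [Q, 1 - Q]]
assert all(abs(D_xy[i][j] - D_PQ[i][j]) < 1e-12
           for i in range(2) for j in range(2))

# the scalar factor of (2.74): z * (1 + x) recovers 1 - 2*pG of (2.70)
assert abs(z * (1 + x) - (1 - 2 * pG)) < 1e-12
```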
Non-binary restricted Gilbert-Elliott channel
In order to obtain a channel capacity of zero in the p-ary symmetric DMC for the
‘bad’ state B of the non-binary GEC, a crossover probability of pB = 1/p is required
if using the standard model shown in Fig. 2.3(a). In the case of the alternative model
shown in Fig. 2.3(b), a crossover probability of pB = (p−1)/p is required. The resulting
non-binary restricted GEC models are shown in Figs. 2.10 and 2.11, respectively. In
both cases, the crossover probabilities in the ‘bad’ state B are identical.
The matrix probabilities D0 and Dǫ under the standard DMC model can be
Figure 2.10: Structure of the standard non-binary restricted GEC model.
written as
D0 = [ 1 − (p−1)pG   0   ]
     [ 0             1/p ] D,   (2.75)
Dε = [ pG   0   ]
     [ 0    1/p ] D,   (2.76)
where the state transition matrix D remains as given in (2.36). The difference
matrix δ can be represented as
δ = ∆(pB = 1/p) = [ 1 − ppG   0 ]
                  [ 0         0 ] D.   (2.77)
Under the alternative model, the state transition matrix D remains as in (2.36),
while the other matrix probabilities can be reported as
D0 = [ 1 − pG   0   ]
     [ 0        1/p ] D,   (2.78)
Dε = [ pG/(p−1)   0   ]
     [ 0          1/p ] D,   (2.79)
δ = ∆(pB = (p−1)/p) = [ 1 − p·pG/(p−1)   0 ]
                      [ 0                0 ] D.   (2.80)
Figure 2.11: Structure of the alternative non-binary restricted GEC model.
In order to formulate APP decoding algorithms for non-binary restricted GEC
models using polynomials in terms of the burst-error characteristics x, y, and z (see
Chapter 7), the same rationale as outlined for the binary restricted GEC model
applies. Clearly, the stationary state distribution vector σ0(x) and state transition
matrix D(x, y) in x and y are identical to those for the binary restricted GEC, as
these are only dependent on the state transitions and are unrelated to the error
process specified by the crossover probabilities. It is also straightforward to show
that the difference matrix δ(x, y, z) is the same for both the standard and alternative
DMC model formulations of the non-binary restricted GEC. The only difference is
the channel reliability factor z, which is defined differently for each model. For
the standard DMC model, substituting pB = 1/p into (2.60) results in the channel
reliability factor
z = (1 − ppG)σG, (2.81)
while under the alternative DMC model, the expression
z = (1 − p·pG/(p−1)) σG (2.82)
for the channel reliability factor can be derived. Both models result in the same for-
mulation of the difference matrix δ(x, y, z) as given in (2.74) for the binary restricted
GEC.
Other channel models
The number of states in the HMM for the channel can also be increased. In order to
limit the complexity of such a model, assume binary transmission and that the error
probabilities in each state are deterministic. That is, let there be one or more ‘good’
states where there are no errors and the crossover probability is zero, as well as one
or more ‘bad’ states which always produce errors and where the crossover probability
is one. Additionally, assume that state transitions between any two different error
states and any two different error-free states are not permitted. There are thus only
alternating variable-length periods of correct and incorrect reception. Observe that
these are semi-hidden Markov models as the current state of the channel cannot
be determined completely from the error sequence, however some information can
be gathered. These models were proposed by Fritchman in [34]. More information
concerning them can be found in [4, 35, 36, 37].
A common task in telecommunications is to find the best model for an observed
sample of a real channel. This is usually referred to as the parameter estimation
problem and a tool used to solve it is the Baum-Welch algorithm [38]. In [39], it is
demonstrated that the three-state Fritchman model is equivalent to the two-state
Gilbert model. This equivalence of models highlights the fact that there may not
be a unique solution to the parameter estimation problem.
Alternatively, assume that the error probabilities are no longer deterministic, but suppose that state transitions are
allowed only when errors occur. This is the premise of the McCullough model, also
referred to as the binary regenerative channel [40]. Although its state diagram is
more complex than that of the Gilbert model, its parameters are better related to the
statistical properties of the noise involved. Another model sometimes used is that of
Swoboda [41], which is particularly useful for pulse code modulation channels [42].
There are thus many different models to which the decoding procedures derived in
Chapters 4 and 5 may be applied.
2.3 Error Control Coding
The second major functional block of the digital communication system model con-
sidered in this thesis deals with channel coding. A review of some fundamentals of
error control coding is therefore given in the following sections. The ideas behind
finite fields, convolutional codes and linear block codes are provided. With a focus
on block codes, the concepts of syndrome decoding, sequence estimation decoding,
and symbol estimation decoding are described. Then, the syndrome trellis is intro-
duced as a systematic framework for supporting symbol-by-symbol APP decoding.
As the trellis plays an important role in the derivation of APP decoding algorithms
in this thesis, an instructive example of constructing different types of trellises for
a simple linear block code is provided.
2.3.1 Encoding
In the context of error control coding, encoding refers to the process by which a
sequence of information symbols created by the source may gain protection from
errors as it is transmitted over a channel. This is achieved by introducing some
form of redundancy to the original sequence of information symbols.
In this section, the arithmetic of finite fields will be briefly reviewed, in order to
understand how redundancy may be produced from the information symbols. Then,
the two major paradigms of convolutional and block encoding will be discussed.
Finite fields
A code is a subset of a vector space, which is a mathematical object taken over a
set of scalars. This set could for example be a group, a ring or a field. Examples
of codes over rings and cyclic groups can be found in [43] and [44], respectively.
Although fields are the most restrictive of these three, vector spaces over a finite
field will be the assumed environments for codes in this thesis. The set of field
elements corresponds to the input and output signalling alphabets of the channel.
It is a famous result of Galois Theory that it is only possible to construct fields
of order p^a, where p is a prime and a ∈ Z^+. To determine how addition and
multiplication are performed, polynomials are used as the field elements. For a field
of order p^a, consider all possible polynomials in indeterminate D having the form

f(D) = Σ_{i=0}^{a−1} f_i D^i,  (2.83)

where each coefficient f_i ∈ Z_p and Z_p denotes the set of integers {0, 1, . . . , p−1} modulo p. This results in p^a different polynomials which can be added and multiplied. However, this set is not closed under multiplication. The problem is solved by
choosing one particular monic irreducible polynomial of degree a, say f*(D). That is, f*(D) cannot be written as the product of two or more polynomials of degree d, where 1 ≤ d ≤ a−1. Multiplication in the field is then performed modulo f*(D), so that a product has degree at most a−1; this makes the set closed under multiplication and ensures that each nonzero element has a unique multiplicative inverse.
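The arithmetic just described can be illustrated with a short sketch. The snippet below implements GF(2^2) with the monic irreducible polynomial f*(D) = D^2 + D + 1; the choice of p, a and f*(D) here is purely illustrative.

```python
p, a = 2, 2
f_star = (1, 1, 1)  # coefficients (f0, f1, f2) of f*(D) = 1 + D + D^2

def add(x, y):
    # componentwise addition of coefficient tuples modulo p
    return tuple((xi + yi) % p for xi, yi in zip(x, y))

def mul(x, y):
    # ordinary polynomial product of x and y ...
    prod = [0] * (2 * a - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            prod[i + j] = (prod[i + j] + xi * yj) % p
    # ... reduced modulo f*(D), so the degree stays below a
    for d in range(len(prod) - 1, a - 1, -1):
        c = prod[d]
        if c:
            for j in range(a + 1):
                prod[d - a + j] = (prod[d - a + j] - c * f_star[j]) % p
    return tuple(prod[:a])
```

For example, since D · (D+1) = D^2 + D ≡ 1 modulo f*(D), the elements (0,1) and (1,1) are multiplicative inverses of one another.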
Table 2.1: Addition and multiplication table of the Galois field GF (2).
⊕ | 0 1        · | 0 1
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1
Fields for which a ≥ 2 are sometimes called extension fields, since they in essence
extend the concept of the ground field GF (p). Examining the case a = 1 in detail, by
(2.83), the field elements are simply the integers {0, 1, . . . , p−1}. As the irreducible
polynomial f ∗(D) must be monic and of degree one,
f ∗(D) = D. (2.84)
In other words, addition and multiplication are calculated using modulo p arithmetic.
For example, the addition and multiplication operations for the Galois field GF (2)
using modulo 2 arithmetic are shown in Table 2.1. For more information on finite
fields, the interested reader is directed to [45] or [46].
Convolutional codes
A convolutional encoder can be modelled as a deterministic finite state machine.
The different states correspond to the contents of a series of memory registers. The
term “convolutional” is used because the input symbols are convolved with the
impulse responses of the machine in order to produce the output symbols. The
complete structure is referred to as an (n, k,m) encoder, where n is the number
of output symbols, k is the number of input symbols, and m is the number of
memory registers. The definition of the encoder for the field GF (p) also requires
stipulation of a set of n p-ary generator polynomials g1(D), g2(D), . . . , gn(D), each
in indeterminate D and of degree k ·m.
According to [47], the encoder consists of a horizontal array of m groups of k
memory registers. Initially, the rightmost k(m − 1) registers are filled with zeroes.
These registers define the state of the encoder. Each register contains one of p symbols, resulting in p^{k(m−1)} possible states. Information symbols are introduced from
the left in batches of size k. The modulo p addition circuitry defined by generator
polynomials g1(D), g2(D), . . . , gn(D) is then used to output encoded symbols u1 to
un. At each time instant, the contents of all registers are transferred k registers
to the right. After the final symbols are flushed out of the encoder by inputting
a tail of zeroes, the collection of output symbols at the righthand end of the en-
coder forms the encoded sequence. Convolutional codes are able to operate with a
[Figure 2.12 appears here: a shift-register circuit in which the input bit enters from the left and two modulo-2 adders produce u1, the first output bit, and u2, the second output bit.]
Figure 2.12: A (2,1,3) convolutional encoder constructed with generator polynomials g1(D) = D^2 + D + 1 and g2(D) = D^2 + 1.
stream of information symbols of almost any length, and the asymptotic rate is k/n as more and more symbols are encoded. A diagram of a (2, 1, 3) convolutional encoder
with operations in GF (2) and binary inputs is given in Fig. 2.12. More information
regarding convolutional codes can be found in [48].
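The encoding action of the encoder in Fig. 2.12 can be sketched as a mod-2 convolution of the input bits with the taps of each generator polynomial; the register-level circuit is abstracted away in this simplified model.

```python
g1 = (1, 1, 1)  # taps of g1(D) = D^2 + D + 1
g2 = (1, 0, 1)  # taps of g2(D) = D^2 + 1

def conv_encode(bits, taps=(g1, g2)):
    # flush the registers with a tail of two zeroes
    padded = list(bits) + [0, 0]
    out = []
    for t in range(len(padded)):
        for g in taps:
            # mod-2 convolution of the input with the generator taps
            u = sum(g[j] * padded[t - j]
                    for j in range(len(g)) if t - j >= 0) % 2
            out.append(u)
    return out
```

Encoding 4 information bits produces 12 output bits here because of the tail; as the input stream grows, the rate approaches k/n = 1/2.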
Block codes
By contrast, block codes encode words of a particular length individually. There is
a fixed number k of information symbols per word, and additional parity symbols
are produced in the encoding process, resulting in codewords of a fixed length n.
This is referred to as an (n, k) block code C, where the code rate R of such a block
code is given by
R = k/n.  (2.85)
Generator matrix: Suppose that C is a set of codewords over the Galois field
GF (p). If C ≠ span(C), where span(C) denotes the linear span of a set of
vectors, then C is a non-linear block code. Otherwise, an (n, k) block code C
is linear and is a k-dimensional subspace of the finite vector space [GF (p)]n
spanned by a basis of k linearly independent p-ary vectors of length n. Let
these vectors be denoted gi, 1 ≤ i ≤ k, and form the k × n generator matrix
G = [ g1 ]
    [ g2 ]
    [ ⋮  ]
    [ gk ] .  (2.86)
Thus, G can be viewed as a transformation from the vector space [GF (p)]k
to the subspace C of [GF (p)]n. A row vector i of k information symbols is
transformed into a codeword u of length n using the equation
u = i · G. (2.87)
For a k× (n−k) matrix A, a standard generator matrix G for an (n, k) linear
block code C satisfies the form
G = [Ik|A]. (2.88)
The ith row of the identity matrix Ik of order k is also known as the ith
standard basis vector ei, 1 ≤ i ≤ k. If the columns of a generator matrix
G contain all k transposes of the standard basis vectors of order k, but not
necessarily consecutively or in order, then G is systematic. Linear block codes
with a systematic generator matrix are advantageous because in the decoding
process, the information symbols are simply read off in the same order as the
k columns of Ik appear in G.
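Encoding via (2.87) reduces to a vector-matrix product modulo p. The sketch below uses the standard generator matrix of the binary (4,2) code that reappears in Example 2.1; the specific matrix is chosen purely for illustration.

```python
# Encoding u = i . G over GF(2) via (2.87), with G = [I_k | A] in
# standard form (2.88).
G = [[1, 0, 1, 0],
     [0, 1, 1, 1]]

def encode(info, G, p=2):
    n = len(G[0])
    # each codeword symbol is an inner product of the information
    # vector with one column of G, taken modulo p
    return [sum(info[r] * G[r][c] for r in range(len(G))) % p
            for c in range(n)]
```

Since this G is systematic, the first k = 2 symbols of every codeword are the information symbols themselves.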
Denote the addition operation for the n-dimensional vector space over GF (p)
by ⊕. Since the rows of G are a basis for the vector space C, performing
elementary row operations of the form
f1gi ⊕ f2gj → gi, (2.89)
where f1, f2 ∈ GF (p) and gi and gj are row vectors of G, can give another
generator matrix for the same code. In most cases, performing these elemen-
tary row operations can deliver a standard generator matrix for C. If it is
not possible to obtain a standard generator matrix, there must exist a set of
k columns in a generator matrix which are linearly independent, since it is a
basis for a k-dimensional vector space. It is always possible to perform ele-
mentary row operations so that the k standard basis vectors appear in those
columns. Then, permuting the order of the columns of the matrix, it can be
transformed into a standard generator matrix for a code which is equivalent
to the original code. Therefore, the following result is true.
Theorem 2.3.1. Every linear block code either has a standard generator ma-
trix which can be found by performing elementary row operations, or it is
equivalent to another code of the same size which does possess a standard gen-
erator matrix.
Dual code: The orthogonal complement OC of a codeword u ∈ [GF (p)]n is the set
of all vectors u⊥ ∈ [GF(p)]^n which have an inner product of 0 with u. That is,

OC(u) = {u⊥ ∈ [GF(p)]^n | <u, u⊥> = 0},  (2.90)

where for u = [u1, u2, . . . , un] and u⊥ = [u⊥_1, u⊥_2, . . . , u⊥_n], the inner product is defined as

<u, u⊥> = Σ_{i=1}^{n} u_i u⊥_i (mod p).  (2.91)
The dual C⊥ of a code C is defined as the set of all vectors which are in the
orthogonal complement of every element of C. In other words,
C⊥ = {u⊥ ∈ [GF (p)]n | u⊥ ∈ OC(u), ∀u ∈ C}. (2.92)
It is a simple task to show that C⊥ is also a linear block code over GF (p),
of dimension n−k. Any potential generator matrix H for the dual code C⊥
must satisfy certain conditions. Namely, it must be an (n−k) × n matrix of
full rank, since its rows must form a basis for the dual code. Also, to ensure
the rows of H are in OC(C), a potential matrix H must satisfy the constraint
GH^T = 0,  (2.93)
where 0 is the k× (n−k) zero matrix. If G is a standard generator matrix for
C, a generator matrix for C⊥ is easily found by consideration of (2.93) and
the independence of the standard basis vectors.
Theorem 2.3.2. If a linear block code C has a standard generator matrix as
defined by (2.88), then H = [−A^T | I_{n−k}] is a generator matrix for C⊥.
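Theorem 2.3.2 can be checked numerically. The sketch below does so for a binary (4,2) code with G = [I2 | A], where −A^T = A^T since −1 = 1 in GF(2); the particular matrices are illustrative.

```python
# Checking G H^T = 0 (2.93) for G = [I_2 | A] and H = [-A^T | I_2]
# over GF(2), with A = [[1, 0], [1, 1]].
G = [[1, 0, 1, 0],
     [0, 1, 1, 1]]
H = [[1, 1, 1, 0],
     [0, 1, 0, 1]]

def mul_transpose(G, H, p=2):
    # returns the k x (n-k) product G . H^T modulo p
    return [[sum(g[c] * h[c] for c in range(len(g))) % p for h in H]
            for g in G]
```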
Hamming distance: The Hamming distance d(u1,u2) between codewords u1 and
u2 of a code C is defined as the number of positions in which u1 and u2 differ.
However, the Hamming distance d(C) of a code C is defined as the minimum
Hamming distance over all distinct pairs of codewords of C. That is,
d(C) = min{d(u1, u2) | u1, u2 ∈ C, u1 ≠ u2}.  (2.94)
Consider a code with Hamming distance d where hyperspheres of radius ⌊(d−1)/2⌋ centred at each codeword partition the vector space [GF(p)]^n such that all p^n
vectors lie in exactly one hypersphere. Such a code is called perfect. Hamming
codes are one of the few examples of perfect linear block codes. Furthermore,
they can be easily described in terms of their parity check matrices.
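For small codes, d(C) can be computed directly from (2.94); for a linear code it also equals the minimum weight of a nonzero codeword. A sketch for the binary (4,2) code used in Example 2.1:

```python
# Hamming distance d(C) of the (4,2) example code via (2.94).
C = [(0, 0, 0, 0), (0, 1, 1, 1), (1, 0, 1, 0), (1, 1, 0, 1)]

def hamming(u1, u2):
    # number of positions in which u1 and u2 differ
    return sum(a != b for a, b in zip(u1, u2))

d_C = min(hamming(u1, u2)
          for idx, u1 in enumerate(C) for u2 in C[idx + 1:])
```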
2.3.2 Decoding
Once a word has been received after transmission through the channel, the objective
is to decode that word and reconstruct the information symbols. The following de-
coding schemes can be performed with all types of linear block codes. In particular,
syndrome decoding, sequence estimation decoding, and symbol estimation decoding
are considered. In addition, the concept of a syndrome trellis and how it can be
used for APP decoding is described.
Syndrome decoding
The method of syndrome decoding in conjunction with a decoding array takes ad-
vantage of properties of the parity check matrix to decode a received word. It is a
relatively simple scheme which aims to find the nearest codeword to the received
vector. Assume C is an (n, k) linear block code over GF (p) with generator matrix
G and parity check matrix H. Firstly, note that (2.93) is in essence a system of k
linear equations having the form
g_i H^T = 0  (2.95)
for each row vector g_i of G, i ∈ {1, 2, . . . , k}, and where 0 is a row vector of n−k zeroes. Furthermore,
f_i g_i H^T ⊕ f_j g_j H^T = 0,  (2.96)
where fi, fj ∈ GF (p) and i, j ∈ {1, 2, . . . , k}. In other words,
uH^T = 0,  (2.97)
where u is any codeword in C. On the other hand, the received word v may or may
not be a codeword and may therefore be expressed as
v = u ⊕ d, (2.98)
where u ∈ C and d is the displacement of v from u. It then follows that
vH^T = (u ⊕ d)H^T = uH^T ⊕ dH^T = 0 ⊕ dH^T = dH^T = s,  (2.99)
where s is a p-ary vector of length n−k called the syndrome of the received word v.
For each of the p^{n−k} possible values of s, there are p^k received words which result in that syndrome s. This partitions the n-dimensional vector space [GF(p)]^n into cosets V_t, t ∈ {0, 1, . . . , p^{n−k}−1}, where the code C is V_0 and each of the other cosets
consists of words with the same syndrome. In many cases, particularly if the channel
is memoryless, the displacement d for each coset is chosen as a vector with minimal
weight in that coset. These particular displacements are known as the coset leaders
d. The one-to-one correspondence between coset leaders and syndromes leads to
the following decoding procedure.
Procedure 2.1. Syndrome decoding of an (n, k) linear block code C over GF (p)
using the coset leader - syndrome correspondence.
Step 1. Construct a table with two columns, one listing all p^{n−k} syndromes and the other containing, in the corresponding cells, a word of length n of minimum weight from each associated coset.
Step 2. For a received word v, calculate its syndrome s = vH^T.
Step 3. Obtain the estimate d̂ of the displacement d as the coset leader corresponding to the syndrome s from Step 2.
Step 4. By (2.98), decode û = v ⊖ d̂ as the estimate of the codeword u. Here, ⊖ denotes element-by-element subtraction modulo p.
Procedure 2.1 is capable of correcting only the error patterns given by the set
of coset leaders. However, there are several options for how coset leaders may be
chosen, such as the following:
• Choose the coset leader d at random,
• Choose the coset leader d as a word with minimal weight in that coset,
• Choose the coset leader d as the first vector when the set of valid coset leaders
is ordered lexicographically, either left-to-right or right-to-left, or
• Choose the coset leader d in that coset as the most likely error pattern for a
given channel.
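Procedure 2.1 can be sketched as follows for the binary (4,2) code of Example 2.1, with each coset leader chosen as a word of minimal weight in its coset (the second option above):

```python
from itertools import product

H = [[1, 1, 1, 0],
     [0, 1, 0, 1]]

def syndrome(v, H, p=2):
    # Step 2: s = v H^T modulo p
    return tuple(sum(h[j] * v[j] for j in range(len(v))) % p for h in H)

# Step 1: table of coset leaders, one minimum-weight word per syndrome
leaders = {}
for v in product(range(2), repeat=4):
    s = syndrome(v, H)
    if s not in leaders or sum(v) < sum(leaders[s]):
        leaders[s] = v

def decode(v, H, p=2):
    # Steps 3-4: look up the leader and subtract it from v, modulo p
    d_hat = leaders[syndrome(v, H)]
    return tuple((vi - di) % p for vi, di in zip(v, d_hat))
```

With n−k = 2, the table contains 2^2 = 4 coset leaders, and any single error whose pattern is a coset leader is corrected.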
Decoding based on Procedure 2.1 may be either correct or erroneous. An erroneous decoding occurs if the coset leader d̂ obtained by Procedure 2.1 is not the actual displacement imposed by the transmission channel. In general, decoding
of linear block codes using other algebraic approaches may also result in the detection
of an error pattern that cannot be corrected. In this case, the received word may
be discarded or its retransmission may be requested if a feedback channel from the
receiver to the transmitter is available.
Sequence estimation decoding
In the context of sequence estimation decoding, an APP is the conditional probabil-
ity P (u|v) that a sequence or word u was transmitted, given the sequence or word
v was received. In addition, a decoder may or may not use information about the
reliability of the decisions it makes.
Suppose a codeword u ∈ C was transmitted over the channel, and a word v was
received. In performing the decoding with respect to sequences that form a word of
length n, the aim is to maximise the probability that the estimate u is equal to the
transmitted word u, given the received word v. This objective may be stated as
û = arg max_{u∈C} {P(u|v)}.  (2.100)
In the unlikely event that two codewords are equally and maximally likely, one is chosen at random. Under the formulation in (2.100), only correct or incorrect decoding can
result. Using Bayes’ rule, (2.100) can be rewritten as
û = arg max_{u∈C} { P(v|u)P(u) / P(v) }.  (2.101)
As the maximisation operation is independent of the probability P (v) of a given
received word v, the calculation can be expressed as the maximum a posteriori
probability (MAP) sequence estimation equation of
û = arg max_{u∈C} {P(v|u)P(u)},  (2.102)
where P (u) is also known as the a priori probability of the word u. Examples of
algorithms developed to solve this equation are the Viterbi algorithm [49, 50, 51],
and other related sequential decoding algorithms such as [52].
Further simplification to the MAP criterion is obtained when the a priori prob-
ability is equal for all codewords. The related decoding schemes are called ML
algorithms. In this case, (2.102) simplifies to
û = arg max_{u∈C} {P(v|u)}.  (2.103)
Symbol estimation decoding
The other strategy for decoding is to estimate one symbol at a time, rather than
a complete word. Assuming an (n, k) linear block code C in standard form, an
information symbol can be decoded for each position i ∈ {1, 2, . . . , k}. In symbol-
by-symbol MAP decoding, the aim is to find the a posteriori probabilities P (ui|v)
of each transmitted symbol ui for each position i given the received vector v. On
the basis of these APPs, the most likely symbol is selected as the estimate û_i for the transmitted symbol ui at time instant i.
A frequently used approach to perform symbol-by-symbol MAP decoding for
binary codes is based on log likelihood algebra, which has been extended in [17] to
non-binary codes using log likelihood ratio (LLR) vectors. However, the complexity
of performing the exact calculations in the log domain can be prohibitive. For that reason, algorithms using approximations which can be evaluated efficiently were developed. One example is the Max-Log-MAP algorithm [53]. It is faster than the standard LLR MAP algorithm; however, its approximations are coarse and significantly degrade performance. As a compromise, the Log-
MAP algorithm [54] was suggested. This method involves an alternative exact
calculation, but stores some commonly used values in a lookup table. Thus it is more
efficient than LLR MAP. Perhaps the most famous of the symbolwise MAP decoding
algorithms however was the BCJR algorithm [8], which has regained significant
attention with the advent of turbo codes and iterative decoding.
Alternatively, under an ML scheme, a uniform a priori probability distribution
is assumed. The estimate for the ith symbol is the field element which maximises
the APP, so that by Bayes’ rule,
û_i = arg max_{g∈GF(p)} {P(u_i = g|v)}
    = arg max_{g∈GF(p)} { Σ_{u∈C: u_i=g} P(u|v) }
    = arg max_{g∈GF(p)} { Σ_{u∈C: u_i=g} P(v|u)P(u) }
    = arg max_{g∈GF(p)} { Σ_{u∈C: u_i=g} P(v|u) }.  (2.104)
The values for conditional probabilities P (v|u) can be obtained from the channel
model being employed. For example if the channel is memoryless, then
P(v|u) = Π_{i=1}^{n} P(v_i|u_i).  (2.105)
Thus, by substituting (2.105) into (2.104), the symbolwise ML estimates for a mem-
oryless channel can be calculated as
û_i = arg max_{g∈GF(p)} { Σ_{u∈C: u_i=g} Π_{j=1}^{n} P(v_j|u_j) }.  (2.106)
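For small codes, (2.106) can be evaluated by brute force. The following sketch does so for the binary (4,2) code of Example 2.1 on a BSC; the crossover probability is an assumed value.

```python
C = [(0, 0, 0, 0), (0, 1, 1, 1), (1, 0, 1, 0), (1, 1, 0, 1)]
eps = 0.1  # assumed BSC crossover probability

def likelihood(v, u):
    # memoryless channel: product of per-symbol probabilities, cf. (2.105)
    prob = 1.0
    for vj, uj in zip(v, u):
        prob *= (1 - eps) if vj == uj else eps
    return prob

def decode_symbol(v, i):
    # compare the summed likelihoods of u_i = 0 and u_i = 1, cf. (2.106)
    score = {g: sum(likelihood(v, u) for u in C if u[i] == g)
             for g in (0, 1)}
    return max(score, key=score.get)
```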
However, if the channel model is not memoryless then (2.105) is not applicable. For
the models presented in Section 2.2.2, both the hidden sequence of state transitions
and the symbol error processes within each state must be accounted for. This is
handled precisely by the matrix probabilities D(v_j|u_j) ∈ {D_0, D_ε} in (2.53). For
initial state distribution σ0 and all-ones column vector e as defined in (2.19), the
symbolwise ML estimates can be expressed as
û_i = arg max_{g∈GF(p)} { σ_0 ( Σ_{u∈C: u_i=g} Π_{j=1}^{n} D(v_j|u_j) ) e }.  (2.107)
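A numerical sketch of (2.107) for a hypothetical two-state channel with memory is given below. The matrix probabilities D0 and De, the initial state distribution sigma0, and the use of the (4,2) example code are all illustrative assumptions, chosen only so that D0 + De is a stochastic matrix.

```python
# Evaluating (2.107) by brute force: sigma0 . (sum over codewords of
# products of matrix probabilities) . e, with e the all-ones column
# vector. All parameter values below are assumed for illustration.
C = [(0, 0, 0, 0), (0, 1, 1, 1), (1, 0, 1, 0), (1, 1, 0, 1)]
sigma0 = [0.8, 0.2]
D0 = [[0.94, 0.01], [0.05, 0.60]]  # matrix probability of correct reception
De = [[0.04, 0.01], [0.05, 0.30]]  # matrix probability of an error

def mat_mul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def decode_symbol(v, i):
    score = {}
    for g in (0, 1):
        total = 0.0
        for u in C:
            if u[i] != g:
                continue
            # product of matrix probabilities along the codeword
            M = [[1.0, 0.0], [0.0, 1.0]]
            for vj, uj in zip(v, u):
                M = mat_mul(M, D0 if vj == uj else De)
            # sigma0 . M . e
            total += sum(sigma0[r] * sum(M[r]) for r in range(2))
        score[g] = total
    return max(score, key=score.get)
```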
2.3.3 Trellis representations of linear block codes
According to [49], a trellis T = (N ,B) is a directed graph with node set N and
branch set B where each element of B possesses a label. Branches in a trellis of
length n may only join a node at depth i ∈ {0, 1, . . . , n−1} to a node at depth
i+1. Let E and F be the sets of nodes at depths zero and n, respectively. A path
of length l in a trellis T is defined for branches b1, b2, . . . , bl ∈ B as a sequence of branches (b1, b2, . . . , bl) which is traversable in T and extends from depth η to depth η+l for
some η ∈ N. An example of a trellis defined by the sets
N = {1, 2, . . . , 8}, B = {b1, b2, . . . , b7}, E = {1, 2}, F = {8} (2.108)
is shown in Fig. 2.13. This trellis contains the 13 paths (b1), (b2), . . . , (b7), (b1, b2),
(b3, b6), (b4, b6), (b6, b7), (b3, b6, b7) and (b4, b6, b7).
A code trellis for an (n, k) linear block code C over GF (p) is a trellis containing
a path of length n corresponding to each codeword of C. This definition of a code
trellis is used in the construction of a syndrome trellis for C. However, in order for
Figure 2.13: A basic trellis with eight nodes and seven branches.
trellis decoding algorithms to be efficient, it may be beneficial in some applications
to construct a trellis of minimal size, usually requiring some branches and nodes to
be removed. Trellises can also be used to represent the symbolwise ML estimates in
(2.106) and (2.107). However in this case, a different set of branches is removed as
will be illustrated below.
Syndrome trellis
Consider an (n, k) linear block code C over GF (p) and suppose C has parity check
matrix H defined by its n columns of length n−k:
H = [h1, h2, . . . , hn].  (2.109)
Also, let the syndrome s of a given received word v = [v1, v2, . . . , vn] be calculated
recursively with the reception of the symbols vi ∈ GF (p) over discrete time as
s_{i+1} = s_i + v_{i+1} h^T_{i+1}.  (2.110)
Each of the p^{n−k} levels of the syndrome trellis is associated with a different syndrome and, by the arguments presented in Section 2.3.2, a corresponding coset V_t, t ∈ {0, 1, . . . , p^{n−k} − 1}, where V_0 = C. The actual value of the subscript t of a coset V_t may be calculated as the decimal representation of the syndrome s = [s_{n−k−1}, . . . , s1, s0] as
t = dec(s) = Σ_{j=0}^{n−k−1} s_j p^j.  (2.111)
The set of nodes corresponding to partial syndromes si at depth i lying on a path
from E is denoted Ni. To begin, set
E = N0 = {[0, 0, . . . , 0]} (2.112)
as only codewords will result in an all-zero syndrome. A branch joins a node in N_i at depth i, at the level corresponding to the partial syndrome s_i, to a node at depth i+1, at the level corresponding to the partial syndrome s_{i+1} = s_i ⊕ v_{i+1} h^T_{i+1}, for each v_{i+1} ∈ GF(p). Set

N_{i+1} = {s_i ⊕ v_{i+1} h^T_{i+1}},  (2.113)
where all partial syndromes si represented by nodes in Ni are considered. Repeating
this process for i ∈ {0, 1, . . . , n − 1}, the syndrome trellis of the code can be con-
structed. Clearly, the syndrome trellis provides a systematic way of decomposing
the n-dimensional vector space [GF (p)]n into the different cosets.
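The recursion (2.112)-(2.113) can be sketched directly. For the parity check matrix of the (4,2) code used in Example 2.1 below, it yields the node sets of the syndrome trellis:

```python
# Building the node sets N_i of the syndrome trellis via (2.113),
# starting from the all-zero partial syndrome (2.112).
H = [[1, 1, 1, 0],
     [0, 1, 0, 1]]
n, p = 4, 2
# the columns h_1, ..., h_n of H, as tuples of length n - k
h_cols = [tuple(H[r][c] for r in range(len(H))) for c in range(n)]

N = [{(0, 0)}]  # E = N_0 contains only the all-zero partial syndrome
for i in range(n):
    nxt = set()
    for s in N[i]:
        for v in range(p):
            # s_{i+1} = s_i + v * h_{i+1}^T, componentwise modulo p
            nxt.add(tuple((sj + v * hj) % p
                          for sj, hj in zip(s, h_cols[i])))
    N.append(nxt)
```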
The construction of a code trellis that is minimal in size, in the sense of comprising only the paths of codewords through the trellis, can be achieved in at least two ways. One of these is the elimination of illegal branches of the syndrome trellis, performed with the parity check matrix. It is shown in [49] that the removal of these illegal paths ensures that the resulting trellis is minimal in terms of the total number |N| of nodes in the trellis. Another method of minimising the
code trellis with respect to the overall number of nodes uses the generator matrix in
a special form; however, this method will not be considered in detail here. Basically,
the trellis is built from the Shannon product [55, 56] of trellises from each row of
the generator matrix, but it requires the LR algorithm [49] to format the generator
matrix in order to produce the minimal trellis. The method involving the parity
check matrix is however simpler and more practical.
Example 2.1. Consider the (4, 2) linear block code C over GF (2) defined by the
parity check matrix
H = [h1, h2, h3, h4] = [ 1 1 1 0 ]
                       [ 0 1 0 1 ] .  (2.114)
The four codewords of this binary block code are given as the rows of the set
C = [ 0 0 0 0 ]
    [ 0 1 1 1 ]
    [ 1 0 1 0 ]
    [ 1 1 0 1 ] .  (2.115)
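The cosets of this code can be recovered programmatically by sorting all 16 words of [GF(2)]^4 by syndrome, with each coset indexed by t = dec(s) as in (2.111):

```python
# Partitioning [GF(2)]^4 into the cosets V_t of the (4,2) example
# code by syndrome.
from itertools import product

H = [[1, 1, 1, 0],
     [0, 1, 0, 1]]

def syndrome(v):
    return tuple(sum(h[j] * v[j] for j in range(4)) % 2 for h in H)

cosets = {}
for v in product(range(2), repeat=4):
    s = syndrome(v)
    t = 2 * s[0] + s[1]  # dec(s) of (2.111) for p = 2
    cosets.setdefault(t, []).append(v)
```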
[Figure 2.14(a) shows the standard syndrome trellis, with the four levels labelled by the syndromes [0 0], [0 1], [1 0] and [1 1], and its terminating nodes labelled by the cosets
V0 = C = {0000, 0111, 1010, 1101},  V1 = {0001, 0110, 1011, 1100},
V2 = {0010, 0101, 1000, 1111},  V3 = {0011, 0100, 1001, 1110}.
Figure 2.14(b) shows the minimal trellis, in which the sets E and F each contain a single node.]
Figure 2.14: Trellis representations of the binary (4,2) linear block code C: (a) standard syndrome trellis, (b) minimal trellis (Dashed: s_{i+1} = s_i, Solid: s_{i+1} = s_i ⊕ h^T_{i+1}).
The structure of the syndrome trellis for this code is shown in Fig. 2.14(a) along
with the corresponding syndromes on the four nodes from the set E of originating
nodes, and cosets on the four nodes from the set F of terminating nodes.
The minimal trellis for the considered (4, 2) linear block code C defined by the
parity check matrix H is presented in Fig. 2.14(b). In this case, only the paths
through the trellis taken by codewords are considered and the sets E and F contain
one node each corresponding to the all-zeroes syndrome:
E = N0 = {[0, 0]},  (2.116)
F = N4 = {[0, 0]}.  (2.117)
Trellises for APP decoding
A trellis used to perform symbol-by-symbol APP decoding provides a direct graph-
ical representation of the summands of the decision rule given in (2.106) or (2.107).
Its construction is similar to the full syndrome trellis, see Fig. 2.14(a) for example.
However, there are two notable changes to be imposed on the construction of a trellis
for APP decoding. Firstly, from the summation indices, the symbol ui in the ith
position of interest is required to have a particular value. Thus, in order to make
comparisons using the maximisation operation, a total of p different trellises are
constructed. The ith section of the trellis for u_i = g ∈ GF(p), lying between depths i and i+1, only allows the at most p^{n−k} branches from the level corresponding to a partial syndrome s_i ∈ N_i to the level corresponding to the partial syndrome s_{i+1} = s_i ⊕ g h^T_{i+1} ∈ N_{i+1}. This section will be much sparser than the other n−1
sections due to this constraint. Secondly, trellis paths may start at a level corresponding to any of the p^{n−k} syndrome states in order to support a description of
the trellis structure by trellis matrices.
Example 2.2. To illustrate the features involved in the structure of an APP de-
coding trellis, the two trellises for estimating the second symbol u2 in a codeword
u = [u1, u2, u3, u4] of the (4, 2) linear block code C of Example 2.1 are displayed in
Fig. 2.15. Accordingly, the trellis shown in Fig. 2.15(a) provides a systematic way of
arranging those words in the 4-dimensional vector space [GF (2)]4 into sets of words
that result in the same syndrome and in addition have the symbol u2 = 0 at position
i = 2. On the other hand, the trellis shown in Fig. 2.15(b) also structures words in
the 4-dimensional vector space [GF (2)]4 into sets of words that result in the same
syndrome but have symbol u2 = 1 at position i = 2. In this example, the following
unions of subsets may be performed to produce the code and different cosets:
C  = { [0 0 0 0], [1 0 1 0] } ∪ { [0 1 1 1], [1 1 0 1] },  (2.118)
V1 = { [0 0 0 1], [1 0 1 1] } ∪ { [0 1 1 0], [1 1 0 0] },  (2.119)
V2 = { [0 0 1 0], [1 0 0 0] } ∪ { [0 1 0 1], [1 1 1 1] },  (2.120)
V3 = { [0 0 1 1], [1 0 0 1] } ∪ { [0 1 0 0], [1 1 1 0] }.  (2.121)
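These unions can be verified by splitting each coset according to the value of the symbol u2, the same split performed graphically by the two trellises of Fig. 2.15:

```python
# Splitting each coset V_t of the (4,2) example code by the value of
# the second symbol u2, reproducing the unions (2.118)-(2.121).
from itertools import product

H = [[1, 1, 1, 0],
     [0, 1, 0, 1]]

def syndrome(v):
    return tuple(sum(h[j] * v[j] for j in range(4)) % 2 for h in H)

split = {}
for v in product(range(2), repeat=4):
    s = syndrome(v)
    t = 2 * s[0] + s[1]  # coset index, cf. (2.111)
    split.setdefault((t, v[1]), []).append(v)
```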
[Figure 2.15(a) shows the trellis whose terminating nodes collect the word sets {0000, 1010} ⊂ V0 = C, {0001, 1011} ⊂ V1, {0010, 1000} ⊂ V2 and {0011, 1001} ⊂ V3; Figure 2.15(b) collects {0111, 1101} ⊂ V0 = C, {0110, 1100} ⊂ V1, {0101, 1111} ⊂ V2 and {0100, 1110} ⊂ V3.]
Figure 2.15: Trellis representations of the binary (4,2) linear block code C suitable for computing APPs: (a) P(u2 = 0|v), (b) P(u2 = 1|v) (Dashed: s_{i+1} = s_i, Solid: s_{i+1} = s_i ⊕ h^T_{i+1}).
Chapter 3
APP Decoding on Discrete
Channels without Memory
The BSC is one of the simplest channel models developed to describe the error
characteristics of a transmission channel assuming errors occur independently. It
is specified by a single parameter referred to as the crossover probability, which
quantifies the probability of error for every bit transmitted. The generalisation of
this concept to an alphabet of size p is given by the p-ary DMC. In order to provide
a comprehensive assembly of APP decoding algorithms for a variety of classes of
discrete channels, these two memoryless channels form the starting point. These
idealised discrete channel models can also be deployed in the formulation of models
for channels with memory as outlined in Chapter 2. As such, the fundamental
concepts behind the APP decoding strategies in this chapter can subsequently be
adapted to be employed for APP decoding on discrete channels with memory.
One approach for APP decoding is to work directly on the code trellis. Some of
the more prominent trellis-based APP decoding approaches include the BCJR algo-
rithm [8] and the Viterbi algorithm [51]. Wolf [57] improved on Viterbi’s approach
by minimising the trellis. Johansson and Zigangirov [9] made a significant contribu-
tion to APP decoding over memoryless channels by reducing the BCJR algorithm to
one requiring only a single sweep of the trellis. Other algorithms are predominantly
algebra-based. One such example is an algorithm involving generating functions and
Fourier transforms given in [14]. Clearly, the aim is to derive algorithms which are
as efficient to execute and as simple to implement as possible.
The methods described in this chapter, however, use representation theory and
linear algebra as suggested in [27] to perform the decoding. That is, the information
about the algebraic characteristics of the linear block code contained in the trellis
is mapped into a matrix group by a homomorphism. Two sorts of homomorphism
are covered here. One is a direct mapping into a cyclic subgroup of the group of
permutation matrices, whilst the other uses a diagonalisation technique to map the
necessary information into diagonal matrices. The calculations are then performed
using matrix algebra, and the required APPs are extracted from the resulting ma-
trix representations. Since the trellises need to be weighted according to the error
characteristics of the channel, the same applies to the matrices used in the calcu-
lations. For memoryless channels, this is achieved by scalar multiplication, so that
weighting does not increase the size of the matrices used. With this strategy, a de-
coding decision can be made efficiently using either of the homomorphisms, although
the diagonal matrix approach clearly has merit due to the simplicity of performing
arithmetic with such matrices.
This chapter is structured as follows. Section 3.1 formulates the APP decoding
problem which will be solved in this chapter. Then, Section 3.2 demonstrates how
a matrix representation of a code trellis in the original domain can be constructed
for both binary and non-binary linear block codes. This leads to elementary trellis
matrices, trellis section matrices, and the trellis matrix of the full code trellis. On
this basis, the stochastic characteristics of the discrete channel without memory are
included in terms of the crossover probability. The derivations result in weighted
trellis section matrices and the related weighted trellis matrix of the full weighted
trellis. Eventually, the APP decoding procedures for BSCs and DMCs are formu-
lated in the original domain. Having established matrix representations of the code
trellis and the weighted code trellis, the tools of linear algebra with respect to eigen-
values and eigenvectors of matrices are used in Section 3.3 to perform a similarity
transformation of the matrices in the original domain into diagonal matrices in the
spectral domain. It should be mentioned that the term “spectral domain” has been
chosen because the spectrum of a transformation on a finite dimensional vector space
is given by the set of all its eigenvalues. Such eigenvalues appear in the matrix rep-
resentations of the spectral domain, namely, elementary spectral matrices, spectral
section matrices, and spectral matrices corresponding to a full code trellis, as well
as the different weighted spectral matrices. The ultimate outcome in the algorithmically much simpler spectral domain is a set of conditional spectral coefficients. These
coefficients can be related to the a posteriori probabilities needed for deriving an
APP decoding decision by a straightforward inverse transformation. In Section 3.4,
instructive examples are provided to demonstrate the algorithmic components in-
volved in the original domain and the spectral domain. In particular, it is shown
how the dual code and the elements of the received word control the APP decoding
in the spectral domain. Numerical examples are contained in Section 3.5, describing
the performance of APP decoding for several selected codes on DMCs using com-
puter simulations. It should be noted that these performance investigations are used
to indicate options for the applications of the presented APP decoding approach,
but they are not considered to provide an exhaustive investigation into good code
and channel combinations. Finally, the chapter is summarised in Section 3.6.
The principal contributions of this chapter are:
• A formulation of the a posteriori probabilities required for decoding in terms
of the trace of the weighted spectral matrix of the full weighted trellis.
• Instructive examples showing the algorithmic differences between the original
and spectral domain approaches to APP decoding for a memoryless channel.
• A selection of numerical examples obtained by computer simulations which
demonstrate the range of performance analysis options that are supported by
the proposed APP decoding on discrete channels without memory.
3.1 Problem Statement
Consider a systematic (n, k) linear block code C in standard form over GF (p).
Furthermore, it is assumed that the linear block code C is used on a DMC, which
is characterised by conditional probabilities P (vj|uj), j = 1, 2, . . . , n. In view of
(2.106) under the assumption of a uniform a priori probability distribution, the APP
decoding problem of finding an estimate ûi, i = 1, 2, . . . , k, of the ith transmitted
symbol ui of codeword u can then be formulated as

    ûi = arg max_{g∈GF(p)} { Σ_{u∈C: ui=g} ∏_{j=1}^{n} P(vj|uj) }.    (3.1)
In the binary case, (3.1) may be rewritten as
    ûi = 0  if Σ_{u∈C: ui=0} ∏_{j=1}^{n} P(vj|uj) ≥ Σ_{u∈C: ui=1} ∏_{j=1}^{n} P(vj|uj),
         1  otherwise,    (3.2)
or in terms of difference probabilities as
    ûi = 0  if Σ_{u∈C: ui=0} ∏_{j=1}^{n} P(vj|uj) − Σ_{u∈C: ui=1} ∏_{j=1}^{n} P(vj|uj) ≥ 0,
         1  otherwise.    (3.3)
In both the binary and non-binary problem setting, a total of p^(k−1) products of
conditional probabilities

    P(v|u) = ∏_{j=1}^{n} P(vj|uj)    (3.4)

have to be calculated and summed in an order such that the involved sequences of
symbols uj establish codewords u = [u1, u2, . . . , un] subject to the symbol ui at the
position i of interest being a given value g ∈ GF(p). This ordering of conditional
probabilities can be performed and evaluated either in the original domain using
the code trellis and its associated matrix representation, or it may be transformed
into a spectral domain using the corresponding spectral matrix representation. The
analytical framework for both domains will be presented in the next two sections.
3.2 Original Domain Matrix Representations of
Linear Block Code Trellises
A trellis representation of a linear block code provides a systematic means of decom-
posing an n-dimensional vector space into the actual code and its other cosets (see
Section 2.3.3). In order to develop an analytical framework for APP decoding, the
structure of a code trellis needs to be described by mathematical expressions. This
can be achieved by using the concepts of linear algebra and matrix representations.
In particular, trellis matrices can be derived which account for the algebraic proper-
ties of the code, while weighted trellis matrices include the stochastic properties of
the channel in the representation. Full details on the development of these matrix
representations can be found in [27], and the main concepts are summarised in this
chapter. As the mathematical framework described in the following sections relates
directly to the code trellis without the imposition of additional modifications, it is
referred to as being examined in the original domain.
3.2.1 Matrix representation for APP decoding on BSCs
Consider a binary (n, k) linear block code C defined by a parity check matrix
    H = [h1, h2, . . . , hn],    (3.5)
where hj, j ∈ {1, 2, . . . , n}, is a binary column vector of length n − k. Without
loss of generality, it is assumed that the block code C is given in standard form by
the related generator matrix G. Accordingly, k information bits are mapped onto
codewords of length n such that they appear as the first k elements of a codeword
prior to transmission over a BSC with crossover probability ǫ.
The key component in developing a mathematical framework that allows for
the construction of a code trellis is given by the partial syndrome, as will now be
explained. As mentioned in Section 2.3.3, the syndrome s of a word u is given by
s = uHT (3.6)
and can be considered as a state of the trellis. Instead of computing the syndrome
s using (3.6), it is beneficial to produce the syndrome recursively such that the
so-called partial syndromes
    sj = sj−1 ⊕ uj hj^T,   j = 1, 2, . . . , n,    (3.7)
are generated step-by-step with the processing of the bits uj. In this way, it is
possible to determine the transitions from all possible partial syndromes or states
sj−1 to the related subsequent partial syndromes or states sj for uj = 0 and uj = 1.
Clearly, this reveals the structure of the jth section of the code trellis. An analytical
construction of the sections of a code trellis may then be derived using a matrix
representation for the partial syndrome computation of (3.7) as presented in [27].
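As a quick illustration of (3.7), the sketch below (numpy assumed; the parity check matrix is the one of the (4, 2) example used later in Section 3.4) walks a codeword through the partial-syndrome states; a valid codeword drives the state from all-zero back to all-zero:

```python
import numpy as np

H = np.array([[1, 1, 1, 0],
              [0, 1, 0, 1]])          # parity check matrix of the (4,2) example
u = np.array([1, 0, 1, 0])            # a codeword of the code defined by H

s = np.zeros(H.shape[0], dtype=int)   # initial state s_0 = 0
states = [tuple(s)]
for j in range(H.shape[1]):
    s = (s + u[j] * H[:, j]) % 2      # partial syndrome update (3.7)
    states.append(tuple(s))
print(states)                         # the codeword's path through the trellis states
```

The final state is the full syndrome (3.6), so a codeword always terminates in the all-zero state.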
To further break this problem down into smaller components, it is instructive to
first consider only the scalar components
    s′ ≡ s + t (mod 2)   with   t ≡ u · h (mod 2)    (3.8)
of the partial syndrome calculation (3.7) and formulate a corresponding matrix
representation. The resulting matrices are referred to as elementary trellis matrices
and constitute the fundamental entities from which the trellis section matrices are
constructed. In this way it is possible to analytically describe the code trellis by
matrix representations and then combine this description with the characteristics
of the considered discrete channel without memory to produce an APP decoding
decision.
Elementary trellis matrices and trellis section matrices
In view of the above, firstly consider the elementary trellis matrices as the building
blocks for the formulation of trellis section matrices. For this purpose, it is noted
that a matrix representation of a group G1 of order p is a homomorphism from G1
into a group G2 of p× p matrices. In the context of the considered problem setting
of deriving a matrix representation for the operation given in (3.8), G1 is taken to
be the additive group of GF (p) and G2 is a group of transformations from a finite
p-dimensional vector space V over the field C of complex numbers to itself. Then,
define the particular homomorphism for the original domain as
    δorig : t ↦ M(t) = circ(0, . . . , 0, 1, 0, . . . , 0),    (3.9)
where the circulant matrix circ(0, . . . , 0, 1, 0, . . . , 0) has t zeroes preceding the single
one entry and p−t−1 zeroes following that one entry. The first row of a circulant
matrix is the same as its arguments in order from left to right, and the successive
rows are cyclic shifts of this vector, one position to the right per row [58].
Noting that the argument t of the circulant matrix M(t) is given in the form of
t ≡ u · h (mod p), an elementary trellis matrix can be defined as
Mh(u) = M(u · h). (3.10)
As the Galois field GF (p) is closed under multiplication and the case of binary linear
block codes C is being considered here, there are in fact only two distinct elementary
trellis matrices:
    Mh(u) = [ 1 0 ]   for h · u = 0,
            [ 0 1 ]

            [ 0 1 ]   for h · u = 1.    (3.11)
            [ 1 0 ]
A matrix representation for each column of a parity check matrix H can then be
constructed from the elementary trellis matrices defined in (3.11). It has been
demonstrated in [27] that a representation for the jth column hj of H, under the
assumption that the jth transmitted symbol was uj, is given by the 2^(n−k) × 2^(n−k)
trellis section matrix

    Mhj(uj) = ⊗_{µ=1}^{n−k} M_{h_{n−k−µ,j}}(uj),    (3.12)
where the operator ⊗ denotes the Kronecker or tensor product. Given a matrix A
of size c×d and a matrix B of size l×m, then the related Kronecker product A⊗B
is a matrix of size cl × dm and may be written as
    A ⊗ B = [ a_{1,1}B  a_{1,2}B  . . .  a_{1,d}B ]
            [ a_{2,1}B  a_{2,2}B  . . .  a_{2,d}B ]
            [    ⋮         ⋮                ⋮    ]
            [ a_{c,1}B  a_{c,2}B  . . .  a_{c,d}B ].    (3.13)
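The construction (3.12) can be reproduced in a few lines (a sketch assuming numpy; the example column is arbitrary, and the component ordering is chosen to match the worked example of Section 3.4, top component of the column first):

```python
import numpy as np
from functools import reduce

# Elementary trellis matrices of (3.11): identity for h*u = 0, swap for h*u = 1
M = {0: np.eye(2, dtype=int),
     1: np.array([[0, 1], [1, 0]])}

def trellis_section_matrix(h_col, u):
    """Trellis section matrix of (3.12): Kronecker product of the elementary
    matrices selected by the components of column h_col (listed top to bottom)."""
    return reduce(np.kron, [M[(h * u) % 2] for h in h_col])

# Example: column h_j = [1, 0]^T of some parity check matrix, bit u_j = 1
print(trellis_section_matrix([1, 0], 1))
```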
Weighted trellis matrices
Having established an analytical expression for the trellis sections in terms of trellis
section matrices, it is now possible to include the characteristics of the discrete
channel into the description to support the APP decision rule given in (3.2). This
can be simply done by weighting the trellis section matrices by the conditional
probabilities
    P(vj|uj) = 1 − ǫ   if uj = vj,
               ǫ       if uj ≠ vj,    (3.14)
as defined in (2.5). The resulting weighted trellis section matrix Uhj(uj) for the jth
section assuming that the symbol uj was transmitted is then given as
    Uhj(uj) = P(vj|uj) · Mhj(uj) = (1 − ǫ) · Mhj(uj)   if uj = vj,
                                   ǫ · Mhj(uj)         if uj ≠ vj,    (3.15)
whilst the weighted trellis section matrix Uhj for the jth trellis section irrespective
of the bit transmitted at that position is given by

    Uhj = Uhj(0) + Uhj(1).    (3.16)
In order to include all valid paths in the trellis that are required for deriving an
APP decision ui for the ith transmitted bit ui according to (3.2), it is necessary to
calculate the matrix sum over the transitions caused by both uj = 0 and uj = 1
for all trellis sections except for the ith section. At the position of interest i, only
one transition type, caused by either ui = 0 or ui = 1, is considered. A matrix
representation of the entire weighted trellis for calculating the conditional probability
P (ui|v) that the ith transmitted bit was ui ∈ {0, 1} given the received word v, can
therefore be formulated as
    UH(ui) = ∏_{j=1}^{i−1} Uhj · Uhi(ui) · ∏_{j=i+1}^{n} Uhj.    (3.17)
Determining the a posteriori probabilities for a BSC
There are two more conditions which must be met in order that only paths through
the weighted trellis corresponding to codewords are considered. The first of these
conditions is that according to (3.2), the summation of probabilities must only be
taken over paths which begin in the all-zero state. This condition can be accounted
for by pre-multiplying the weighted trellis matrix UH(ui) by a row vector τ 0 of
length 2^(n−k) such that only the first row of the weighted trellis matrix is selected.
The suitable row vector is given by
    τ0 = [1, 0, . . . , 0].    (3.18)
The suggested vector-matrix product resulting in a vector P(ui|v) of APPs Pt(ui|v),
t = 0, 1, . . . , 2^(n−k) − 1, for the ith transmitted bit ui given the received word v, is
then obtained as

    P(ui|v) = [P0(ui|v), P1(ui|v), . . . , P_{2^(n−k)−1}(ui|v)] = τ0 UH(ui).    (3.19)
Due to the operation of the considered type of code trellis as a device that
organises the elements of the n-dimensional vector space [GF(p)]^n into the code
C = V0 and the related cosets Vt, t = 1, 2, . . . , 2^(n−k) − 1, (3.19) gives the corresponding
conditional probabilities P0(ui|v) and Pt(ui|v), t = 1, 2, . . . , 2^(n−k) − 1. However,
the second condition of (3.2) is that trellis paths must only correspond to valid
codewords. Therefore, only encodings that correspond to paths through the code
trellis which originate from the all-zero state and terminate at the all-zero state
have to be considered. As such, only the element P0(ui|v) of the probability vector
P(ui|v) is of interest for deriving the APP decision as will be formulated below.
APP decoding procedure for a BSC
The fundamental steps of APP decoding when using a binary (n, k) linear block
code C in standard form on a BSC can now be summarised as a formal procedure.
As this procedure is based directly on the weighted code trellis, the procedure is
referred to as being performed in the original domain.
Procedure 3.1. Given is a binary (n, k) linear block code C in standard form to
be used on a BSC with crossover probability ǫ. The linear block code C shall be
defined by a parity check matrix H. A codeword u is transmitted over the channel
and a word v is received. APP decoding in the original domain is comprised of the
following steps.
Step 1. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ {0, 1}, compute the trellis section matrix
Mhj(uj) for column hj of parity check matrix H and jth transmitted symbol
uj using (3.11) and (3.12).
Step 2. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ {0, 1}, compute the weighted trellis section
matrix Uhj(uj) using (3.14) and (3.15).
Step 3. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ {0, 1}, compute the weighted trellis matrix
UH(ui) for the full weighted trellis using (3.17) with (3.15) and (3.16).
Step 4. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ {0, 1}, calculate the a posteriori probability
P0(ui|v) using (3.19).
Step 5. Derive an estimate ûi for the transmitted symbol ui of codeword u at each
position i ∈ {1, 2, . . . , k} using

    ûi = 0   if P0(ui = 0|v) ≥ P0(ui = 1|v),
         1   if P0(ui = 0|v) < P0(ui = 1|v).    (3.20)
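Procedure 3.1 translates almost line for line into code. A prototype (numpy assumed; the function name and the example H are illustrative, with H chosen to match the (4, 2) example of Section 3.4):

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0., 1.], [1., 0.]])

def weighted_trellis_decode(H, v, eps):
    """Prototype of Procedure 3.1: original-domain APP decoding of a binary
    (n, k) linear block code in standard form on a BSC."""
    n_k, n = H.shape
    k = n - n_k
    def U(j, u):
        # Steps 1-2: trellis section matrix (3.12) weighted per (3.15)
        Mj = reduce(np.kron, [X if (H[r, j] * u) % 2 else I2
                              for r in range(n_k)])
        return ((1 - eps) if u == v[j] else eps) * Mj
    U_sum = [U(j, 0) + U(j, 1) for j in range(n)]          # (3.16)
    est = []
    for i in range(k):
        P0 = []
        for ui in (0, 1):
            # Step 3: full weighted trellis matrix (3.17)
            UH = reduce(np.matmul, [U(j, ui) if j == i else U_sum[j]
                                    for j in range(n)])
            P0.append(UH[0, 0])      # Step 4: paths from and to the all-zero state
        est.append(0 if P0[0] >= P0[1] else 1)             # Step 5: decision (3.20)
    return est

H = np.array([[1, 1, 1, 0], [0, 1, 0, 1]])   # parity check matrix of the (4,2) example
print(weighted_trellis_decode(H, [1, 1, 1, 0], 0.1))
```

Here P0(ui|v) is read off as the (0, 0) entry of UH(ui), which combines the selection by τ0 in (3.18) and (3.19) with the restriction to paths that terminate in the all-zero state.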
3.2.2 Matrix representation for APP decoding on DMCs
The same APP decoding approach in the original domain as presented above for a
binary linear block code C may be adapted for use with a linear block code C over
GF (p). Accordingly, the corresponding discrete channel without memory is given by
a non-binary DMC model. The following exposition will be based on the standard
model of a p-ary DMC as defined in Fig. 2.3(a). It is noted that using the
alternative p-ary DMC model shown in Fig. 2.3(b) would change only the weighting
of the code trellis but not the actual APP decoding procedure. The major change
compared to the binary case is seen in a broader definition of the elementary trellis
matrices and the related trellis section matrices.
Elementary trellis matrices and trellis section matrices
The generalisation of an elementary trellis matrix Mh(u) for a binary linear block
code to an (n, k) linear block code C over GF (p) is achieved by expanding the
range of the regular representation δorig from the two circulant matrices of order 2
to the set of p circulant matrices of order p. Given the definition in (3.9), the set of
elementary trellis matrices for the considered class of non-binary linear block codes
comprises the p× p matrices
s′ ≡ s+ u · h (mod p) 7→ Mh(u) =
0 1 0 . . . 0
0 0 1...
0. . . 0
... 1
1 0 0 . . . 0
u·h
. (3.21)
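Equivalently, each p-ary elementary trellis matrix is a power of the basic cyclic shift, which makes the homomorphism property M(a)M(b) = M(a + b mod p) easy to confirm (numpy sketch; the helper name is illustrative):

```python
import numpy as np

def elementary_trellis_matrix(t, p):
    """M(t) of (3.9)/(3.21): the p x p circulant whose first row has a single 1
    preceded by t zeros, i.e. the basic cyclic shift matrix raised to the power t."""
    shift = np.roll(np.eye(p, dtype=int), 1, axis=1)   # circ(0, 1, 0, ..., 0)
    return np.linalg.matrix_power(shift, t % p)

# Homomorphism property: M(a) M(b) = M(a + b mod p)
a, b, p = 1, 2, 3
assert (elementary_trellis_matrix(a, p) @ elementary_trellis_matrix(b, p)
        == elementary_trellis_matrix((a + b) % p, p)).all()
```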
As for the case of a binary linear block code, the elementary trellis matrices
are the building blocks for the construction of the trellis section matrices of a non-
binary linear block code. A trellis section matrix Mhj(uj), assuming symbol uj was
transmitted, captures the impact of column hj of the parity check matrix H on the
state transitions in the jth trellis section and is given by the Kronecker product
    Mhj(uj) = ⊗_{µ=1}^{n−k} M_{h_{n−k−µ,j}}(uj)    (3.22)

of size p^(n−k) × p^(n−k). Again, the arrangement of elements h ∈ GF(p) in a column h
defines the order in which elementary trellis matrices Mh(u) appear in the construc-
tion of a trellis section matrix Mh(u) for a given transmitted symbol u ∈ GF (p).
Weighted trellis matrices
The stochastic characteristics of a p-ary DMC in terms of crossover probabilities can
now easily be combined with the trellis section matrices to generate weighted trellis
matrices. For this purpose, assume that the standard model of the p-ary DMC given
in Fig. 2.3(a) is used. It then follows that (3.15) applies, but now with the weights
defined as
    P(vj|uj) = 1 − (p − 1)ǫ   if uj = vj,
               ǫ              if uj ≠ vj.    (3.23)
This results in the weighted trellis section matrices
    Uhj(uj) = P(vj|uj) · Mhj(uj) = [1 − (p − 1)ǫ] · Mhj(uj)   if uj = vj,
                                   ǫ · Mhj(uj)                if uj ≠ vj.    (3.24)
Additionally, the weighted trellis section matrix Uhj for the jth trellis section
irrespective of the symbol transmitted at that position is given by

    Uhj = Σ_{uj=0}^{p−1} Uhj(uj).    (3.25)
It follows that the entire weighted trellis for calculating the conditional probability
P (ui|v) that the ith transmitted symbol was ui ∈ GF (p) for a received word v, is
given by
    UH(ui) = ∏_{j=1}^{i−1} Uhj · Uhi(ui) · ∏_{j=i+1}^{n} Uhj.    (3.26)
Determining the a posteriori probabilities for a DMC
Clearly, a vector P(ui|v) of APPs Pt(ui|v), t = 0, 1, . . . , p^(n−k) − 1, can be calculated
using the same rationale as in the binary case as

    P(ui|v) = [P0(ui|v), P1(ui|v), . . . , P_{p^(n−k)−1}(ui|v)] = τ0 UH(ui).    (3.27)
Although this vector is of length p^(n−k), only the APP P0(ui|v) is needed because this
APP relates to the paths through the trellis that originate in the all-zero state, and
after processing n symbols, also terminate in the all-zero state.
APP decoding procedure for a DMC
With the above presented analytical framework, it is thus possible to describe an
APP decoding procedure in the original domain for (n, k) linear block codes over
GF (p) on a discrete channel modelled by a p-ary DMC as follows.
Procedure 3.2. Given is an (n, k) linear block code C in standard form over GF (p)
to be used on a p-ary DMC also given in standard form. The linear block code C
shall be defined by a parity check matrix H. A codeword u is transmitted over the
channel and a word v is received. APP decoding in the original domain comprises
the following steps.
Step 1. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ GF (p), compute the trellis section matrix
Mhj(uj) for column hj of parity check matrix H and jth transmitted symbol
uj using (3.21) and (3.22).
Step 2. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ GF (p), compute the weighted trellis section
matrix Uhj(uj) using (3.24).
Step 3. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ GF (p), compute the weighted trellis matrix
UH(ui) of the full weighted trellis using (3.26) with (3.24) and (3.25).
Step 4. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ GF (p), calculate the a posteriori probability
P0(ui|v) using (3.27).
Step 5. Derive an estimate ûi for the transmitted symbol ui of codeword u at each
position i ∈ {1, 2, . . . , k} using

    ûi = arg max_{ui∈GF(p)} {P0(ui|v)}.    (3.28)
3.3 Spectral Domain Matrix Representations of
Linear Block Code Trellises
It is possible to find another set of p homomorphisms to form the matrix representa-
tion of the decoding trellis. From (3.17) and (3.26), one of the most computationally
expensive tasks in Procedures 3.1 and 3.2 is the multiplication of matrices. Storage
of the matrix elements is also an issue, since the trellis matrices in the original do-
main are relatively dense. Using the spectral domain, a product of diagonal matrices
can be used instead. Matrix multiplication of two n × n matrices is approximately
an O(n^2.41) operation [59], whereas it becomes O(n) when the two matrices are
diagonal. Additionally, only n values need to be stored for a diagonal n × n matrix.
A method for obtaining this representation for p-ary transmission is shown in this
section.
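The storage and complexity argument is visible in a two-line numpy check: multiplying diagonal matrices reduces to the elementwise product of their stored diagonals (a minimal sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
a, b = rng.random(n), rng.random(n)   # store only n values per diagonal matrix
dense = np.diag(a) @ np.diag(b)       # dense product: n^2 storage, superlinear work
diag_product = a * b                  # O(n) product on the diagonals alone
assert np.allclose(dense, np.diag(diag_product))
```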
Elementary spectral matrices and spectral section matrices
Given a suitable transformation matrix Wp of order p, a similarity transformation of
each elementary trellis matrix M(t) = Mh(u) onto a diagonal matrix Λ(t) = Λh(u),
∀t ∈ GF (p) can be formulated as
    Λ(t) = Wp^(−1) M(t) Wp.    (3.29)
It is a basic result from linear algebra that the composition of the diagonal
matrix Λ(t) may be given by the eigenvalues λs, s = 0, 1, . . . , p−1, of the matrix
M(t) in some order whilst the rows of the transformation matrix Wp represent the
corresponding eigenvectors ws. In particular, the set of eigenvalues can be obtained
by calculating the roots of the characteristic polynomial c(λ) for λ ∈ C as
c(λ) = det[λIp − M(t)], (3.30)
where det(K) denotes the determinant of a matrix K. As M(t) is a diagonal matrix
itself for t = 0, consider the cases t = 0 and t > 0 individually as follows:
t = 0: In this case, M(0) is equal to an identity matrix Ip of order p and the char-
acteristic polynomial c(λ) for λ ∈ C is obtained as
    det[λIp − M(t)] = (λ − 1)^p.    (3.31)
The roots of this characteristic polynomial are λs = 1, s = 0, 1, . . . , p− 1.
t > 0: In this case, the non-zero entries of the matrix which is the argument to the
determinant operation are given in a form such as
    λIp − M(t) = [  λ      −1        ]
                 [     λ       ⋱    ]
                 [          ⋱   −1  ]
                 [ −1     ⋱         ]
                 [      −1       λ  ] ,    (3.32)

where the placement of the −1 in the first row depends on the value t ∈ GF(p)
and then follows the circular structure of M(t). Using Gaussian elimination,
(3.32) can be transformed into an upper triangular matrix where the main
diagonal comprises p − 1 entries of λ and one entry of λ − λ^(1−p). Noting that
determinants of matrices are invariant under Gaussian elimination and the
fact that the determinant of an upper triangular matrix is the product of its
main diagonal entries, the characteristic polynomial can be calculated as
    c(λ) = det[λIp − M(t)] = λ^p − 1.    (3.33)
Solving c(λ) = 0 reveals the eigenvalues of M(t) for t > 0 as being the pth
roots of unity

    λs = w^s;   w = e^(−j2π/p),    (3.34)

where s = 0, 1, . . . , p − 1 and j = √−1. The spectrum of eigenvalues of the
matrix M(t) is therefore given by the set

    W = {1, w^1, w^2, . . . , w^(p−1)}.    (3.35)
Having established the spectrum W of eigenvalues λ for the cases of t = 0 and t > 0,
a suitable transformation matrix Wp of order p that supports the desired diagonal-
isation of M(t) needs to be reported. It can easily be shown that an eigenvector ws
corresponding to an eigenvalue λs = w^s, s = 0, 1, . . . , p − 1, may be represented as

    ws = [w^(s·0), w^(s·1), w^(s·2), . . . , w^(s·(p−1))].    (3.36)
These eigenvectors can then be used as rows of the transformation matrix such that
    Wp = [ 1    1         1         . . .   1         ]
         [ 1    w         w^2       . . .   w^(p−1)   ]
         [ 1    w^2       w^4       . . .   w^(p−2)   ]
         [ ⋮    ⋮         ⋮                ⋮         ]
         [ 1    w^(p−1)   w^(p−2)   . . .   w         ] ,    (3.37)

where all exponents are understood modulo p.
Using the transformation matrix Wp as defined in (3.37), the similarity transforma-
tion formulated in (3.29) results in a diagonal matrix
    Λ(t) = diag{w^0, w^t, w^(2t), . . . , w^((p−1)t)} = diag{w^(st)}.    (3.38)
As the diagonal matrix Λ(t) contains the spectrum of eigenvalues of M(t) in its main
diagonal, it is referred to as a spectral matrix. Since the transformation matrix Wp is
kept fixed, the arrangement of eigenvalues in the spectral matrix Λ(t) corresponding
to the matrix M(t) depends on the value t ∈ GF (p). In view of this property, the
set of homomorphisms for the spectral domain may be specified as
    δspec : t ↦ Λ(t) = diag{w^(st)}.    (3.39)
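The diagonalisation can be verified numerically for any prime p: conjugating the circulant M(t) by Wp of (3.37) yields exactly the diagonal spectral matrix of (3.38). A numpy sketch with arbitrarily chosen p and t:

```python
import numpy as np

p, t = 5, 2
w = np.exp(-2j * np.pi / p)                       # primitive pth root of unity (3.34)
s = np.arange(p)
Wp = w ** np.outer(s, s)                          # transformation matrix (3.37)
Mt = np.linalg.matrix_power(np.roll(np.eye(p), 1, axis=1), t)  # circulant M(t)
Lam = np.linalg.inv(Wp) @ Mt @ Wp                 # similarity transformation (3.29)
assert np.allclose(Lam, np.diag(w ** (s * t)))    # the spectral matrix (3.38)
```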
Analogously to the elementary trellis matrices in the original domain and noting
that t ≡ u · h (mod p), elementary spectral matrices can be defined for the spectral
domain as
    Λh(u) = Λ(u·h) = diag{w^(s·u·h)} = diag{w^(u·s·h)} = diag{w^(u·u⊥)},    (3.40)
where u⊥ = s · h is used to indicate the relationship to the dual code C⊥. Following
simple arguments of linear algebra, the spectral section matrix Λhj(uj) for an (n, k)
linear block code C over GF (p) defined by a parity check matrix H and a given
transmitted symbol uj for j ∈ {1, 2, . . . , n} can then be deduced from the similarity
transformation
    Λhj(uj) = W^(−1)_(p^(n−k)) Mhj(uj) W_(p^(n−k)).    (3.41)
Such a matrix for each trellis section is constructed from the Kronecker product of
elementary spectral matrices, given by
    Λhj(uj) = ⊗_{µ=1}^{n−k} Λ_{h_{n−k−µ,j}}(uj) = diag{w^(uj·u⊥s,j)},    (3.42)
where u⊥s,j is the jth symbol of the sth dual codeword u⊥s = sH, and s = dec(s)
denotes the decimal representation of the p-ary vector s. It should also be mentioned
that the transformation matrix W_(p^(n−k)) can be constructed recursively using the
elementary transformation matrix Wp, as shown in [27], by
    W_(p^(n−k)) = W_(p^(n−k−1)) ⊗ Wp.    (3.43)
For p = 2, the transformation matrix W_(2^(n−k)) of order 2^(n−k) is known as a Walsh-
Hadamard matrix, while for prime number p > 2, the obtained matrix W_(p^(n−k)) of
order p^(n−k) may be considered as a generalised Walsh-Hadamard matrix. As far as
the inverse matrix W^(−1)_(p^(n−k)) of matrix W_(p^(n−k)) is concerned, it is easy to show that

    W^(−1)_(p^(n−k)) = (1/p^(n−k)) · W^H_(p^(n−k)),    (3.44)
where (·)H denotes the Hermitian of the argument (see Appendix A).
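For p = 2, both the recursion (3.43) and the inverse formula (3.44) are easy to confirm (numpy sketch; m stands in for n − k and is chosen arbitrarily):

```python
import numpy as np
from functools import reduce

W2 = np.array([[1, 1], [1, -1]])      # W_p of (3.37) for p = 2, where w = -1
m = 3                                 # plays the role of n - k
W = reduce(np.kron, [W2] * m)         # W_{2^m} built by the recursion (3.43)
# Inverse via the scaled Hermitian transpose, as stated in (3.44)
assert np.allclose(np.linalg.inv(W), W.conj().T / 2 ** m)
```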
Weighted spectral matrices
With the above results, it is also possible to perform a similarity transformation of
the weighted trellis matrices of the original domain into weighted spectral matrices
of the spectral domain. This operation can be formulated to produce a weighted
spectral matrix ΘH(ui) for the whole weighted trellis and a given argument ui at
position i ∈ {1, 2, . . . , k} of interest as
    ΘH(ui) = W^(−1)_(p^(n−k)) UH(ui) W_(p^(n−k))
           = ∏_{j=1}^{i−1} [W^(−1)_(p^(n−k)) Uhj W_(p^(n−k))] · [W^(−1)_(p^(n−k)) Uhi(ui) W_(p^(n−k))]
             · ∏_{j=i+1}^{n} [W^(−1)_(p^(n−k)) Uhj W_(p^(n−k))]
           = ∏_{j=1}^{i−1} Θhj · Θhi(ui) · ∏_{j=i+1}^{n} Θhj.    (3.45)
The individual factors in the product (3.45) shall be referred to as weighted spectral
section matrices Θhj(uj) and are given, for the jth section assuming that symbol uj
was transmitted over a p-ary DMC in standard form, as
    Θhj(uj) = W^(−1)_(p^(n−k)) Uhj(uj) W_(p^(n−k)) = P(vj|uj) · Λhj(uj).    (3.46)
Incorporating (3.23), (3.46) may be further specified as
    Θhj(uj) = [1 − (p − 1)ǫ] · Λhj(uj)   if uj = vj,
              ǫ · Λhj(uj)                if uj ≠ vj,

            = diag{Θs,j(uj) = [1 − (p − 1)ǫ] · w^(uj·u⊥s,j)}   if uj = vj,
              diag{Θs,j(uj) = ǫ · w^(uj·u⊥s,j)}                if uj ≠ vj.    (3.47)
Similarly, the weighted spectral section matrix Θhj for the jth section regardless
of the transmitted symbol uj can be expressed as

    Θhj = W^(−1)_(p^(n−k)) Uhj W_(p^(n−k)) = Σ_{uj=0}^{p−1} Θhj(uj)
        = diag{ Θs,j = Σ_{uj=0}^{p−1} P(vj|uj) · w^(uj·u⊥s,j) }.    (3.48)
By substituting (3.47) and (3.48) into (3.45), the weighted spectral matrix ΘH(ui)
with focus on symbol ui at position i ∈ {1, 2, . . . , k} can be reformulated as
    ΘH(ui) = diag{ Qs(ui|v) = Θs,i(ui) · ∏_{j=1, j≠i}^{n} Θs,j },    (3.49)
where the so-called conditional spectral coefficients Qs(ui|v), s = 0, 1, . . . , p^(n−k) − 1,
of the spectral domain represent the counterpart to the conditional probabilities
Pt(ui|v), t = 0, 1, . . . , p^(n−k) − 1, of the original domain. The conditional spectral
coefficients may be expressed as
    Qs(ui|v) = P(vi|ui) · w^(ui·u⊥s,i) · ∏_{j=1, j≠i}^{n} Σ_{uj=0}^{p−1} P(vj|uj) · w^(uj·u⊥s,j).    (3.50)
Determining the a posteriori probabilities in the spectral domain
With the proposed similarity transformations as outlined above, it is straightforward
to determine the APPs using the conditional spectral coefficients of the spectral
domain as follows. In the first step, the transformation matrix W_(p^(n−k)) is applied to
rewrite the vector P(ui|v) of conditional probabilities Pt(ui|v), t = 0, 1, . . . , p^(n−k) − 1,
for transmitted symbol ui, i ∈ {1, 2, . . . , k}, as
    P(ui|v) = τ0 W_(p^(n−k)) · W^(−1)_(p^(n−k)) UH(ui) W_(p^(n−k)) · W^(−1)_(p^(n−k))
            = τ0 W_(p^(n−k)) · ΘH(ui) · W^(−1)_(p^(n−k))
            = ι0 ΘH(ui) · W^(−1)_(p^(n−k))
            = Q(ui|v) · W^(−1)_(p^(n−k)).    (3.51)
In other words, instead of calculating the vector of conditional probabilities directly
in the original domain, the initial vector τ0 and the weighted trellis matrix UH(ui)
may be transformed into the spectral domain, resulting in the vector of conditional
spectral coefficients
    Q(ui|v) = [Q0(ui|v), Q1(ui|v), . . . , Q_(p^(n−k)−1)(ui|v)] = ι0 ΘH(ui),    (3.52)

where

    ι0 = τ0 W_(p^(n−k)) = [1, 1, . . . , 1].    (3.53)
Exploiting the simple computational structure of the spectral domain and then
performing an inverse transformation to return to the original domain eventually
leads to the required a posteriori probability

    P0(ui|v) = (1/p^(n−k)) · tr[ΘH(ui)] = (1/p^(n−k)) · Σ_{s=0}^{p^(n−k)−1} Qs(ui|v),    (3.54)
where tr(K) denotes the trace of a matrix K. The remaining elements of the vector
of conditional probabilities may be obtained using the inverse transform
    Pt(ui|v) = (1/p^(n−k)) · Σ_{s=0}^{p^(n−k)−1} Qs(ui|v) · w^(−<s,t>),    (3.55)
where the operator < s, t > denotes the scalar product in modulo p arithmetic
between s = vecp(s) and t = vecp(t), which are p-ary vectors representing the
decimal numbers s and t, respectively.
APP decoding procedure in the spectral domain
The following procedure formulates APP decoding in the spectral domain and out-
puts the estimated sequence of information symbols for a linear block code over a
BSC or DMC.
Procedure 3.3. Given is an (n, k) linear block code C in standard form over GF (p)
to be used on a p-ary DMC also given in standard form. The linear block code C
shall be defined by a parity check matrix H. The codeword u is transmitted over
the channel and the received word is obtained as v. APP decoding in the spectral
domain comprises the following steps.
Step 1. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ GF (p), compute the spectral section matrix
Λhj(uj) for column hj of parity check matrix H and jth transmitted symbol
uj using (3.40) and (3.42).
Step 2. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ GF (p), compute the weighted spectral section
matrix Θhj(uj) using (3.23) and (3.47).
Step 3. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ GF (p), compute the weighted spectral matrix
ΘH(ui) relating to the full weighted trellis using (3.45) with (3.47) and (3.48).
Step 4. Derive an estimate ûi for the transmitted symbol ui of codeword u at each
position i ∈ {1, 2, . . . , k} using

    ûi = arg max_{ui∈GF(p)} {tr[ΘH(ui)]}.    (3.56)
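Because every weighted spectral matrix is diagonal, Procedure 3.3 reduces to scalar products taken over the symbols of the dual codewords. A prototype (numpy assumed; the function name and the example H are illustrative, with H matching the (4, 2) example of Section 3.4):

```python
import numpy as np

def spectral_app_decode(H, v, eps, p=2):
    """Prototype of Procedure 3.3: each conditional spectral coefficient
    Q_s(ui|v) of (3.50) is a product over the symbols of the dual codeword
    s*H, and the trace formula (3.54) averages the coefficients."""
    n_k, n = H.shape
    k = n - n_k
    w = np.exp(-2j * np.pi / p)
    duals = [np.array(s) @ H % p for s in np.ndindex(*([p] * n_k))]
    def P(u, vj):
        return 1 - (p - 1) * eps if u == vj else eps     # DMC weights (3.23)
    est = []
    for i in range(k):
        P0 = []
        for ui in range(p):
            Q = [P(ui, v[i]) * w ** (ui * d[i])
                 * np.prod([sum(P(u, v[j]) * w ** (u * d[j]) for u in range(p))
                            for j in range(n) if j != i])
                 for d in duals]
            P0.append(np.mean(Q).real)                   # trace formula (3.54)
        est.append(int(np.argmax(P0)))
    return est

H = np.array([[1, 1, 1, 0], [0, 1, 0, 1]])   # the (4,2) example code
print(spectral_app_decode(H, [1, 1, 1, 0], 0.1))
```

No p^(n−k) × p^(n−k) matrix is ever stored: only the diagonals are manipulated, which is precisely the complexity advantage argued at the start of this section.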
3.4 Instructive Examples
Here, an example is provided to illustrate the algorithmic components involved in
APP decoding of linear block codes on discrete channels without memory using
the original domain and spectral domain as formulated in Procedures 3.1 and 3.3,
respectively. For this purpose, consider the binary (4, 2) linear block code C defined
by the parity check matrix H given in (2.114) (see Example 2.1). Accordingly, the
related discrete channel without memory shall be modelled by a BSC with crossover
probability ǫ. Furthermore, without loss of generality, assume that the objective is
to obtain an APP decoding decision for the symbol u2 at position i = 2 of codeword
u, given the received word
v = [1, 1, 1, 0]. (3.57)
3.4.1 Example of decoding in the original domain
Using the parity check matrix H defined in (2.114), the trellis section matrices
Mhj(uj) for the four sections of the trellis and argument uj = 0, j = 1, 2, 3, 4, can
be calculated using (3.12) as
    Mh1(0) = Mh2(0) = Mh3(0) = Mh4(0) = I4.    (3.58)
Similarly, for argument uj = 1, the trellis section matrices Mhj(uj), j = 1, 2, 3, 4,
are obtained using (3.12) as
    Mh1(1) = M1(1) ⊗ M0(1) = [ 0 0 1 0 ]
                             [ 0 0 0 1 ]
                             [ 1 0 0 0 ]
                             [ 0 1 0 0 ],    (3.59)

    Mh2(1) = M1(1) ⊗ M1(1) = [ 0 0 0 1 ]
                             [ 0 0 1 0 ]
                             [ 0 1 0 0 ]
                             [ 1 0 0 0 ],    (3.60)

    Mh3(1) = M1(1) ⊗ M0(1) = [ 0 0 1 0 ]
                             [ 0 0 0 1 ]
                             [ 1 0 0 0 ]
                             [ 0 1 0 0 ],    (3.61)

    Mh4(1) = M0(1) ⊗ M1(1) = [ 0 1 0 0 ]
                             [ 1 0 0 0 ]
                             [ 0 0 0 1 ]
                             [ 0 0 1 0 ].    (3.62)
Given the received vector v = [1, 1, 1, 0], the corresponding weighted trellis section
matrices Uhj(uj) for argument uj = 0, j = 1, 2, 3, 4, can be calculated with (3.15)
and (3.58) as
    Uhj(0) = ǫ · I4         for j = 1, 2, 3,
             (1 − ǫ) · I4   for j = 4.    (3.63)
On the other hand, the weighted trellis section matrices Uhj(uj) for argument
uj = 1, j = 1, 2, 3, 4, can be calculated with (3.15) and the trellis section matri-
ces given by (3.59)-(3.62) as
    Uhj(1) = (1 − ǫ) · Mhj(1)   for j = 1, 2, 3,
             ǫ · Mhj(1)         for j = 4.    (3.64)
Then using (3.16), a weighted matrix representation of the whole trellis in order to
determine the likelihood of u2 = 0 and u2 = 1, respectively, can be obtained by
computing the following matrix products (see also Fig. 3.1):
    UH(u2 = 0) = Uh1 · Uh2(0) · Uh3 · Uh4,    (3.65)

    UH(u2 = 1) = Uh1 · Uh2(1) · Uh3 · Uh4.    (3.66)
After some elementary algebra, the related probabilities can be determined from
(3.65) and (3.66) by (3.19) as
    P0(u2 = 0|v) = ǫ(1 − ǫ)(2ǫ^2 − 2ǫ + 1),    (3.67)

    P0(u2 = 1|v) = 2ǫ^2(1 − ǫ)^2.    (3.68)
The APP decoding decision can be deduced from the following expression:
    û2 = 0   if ǫ(1 − ǫ)(2ǫ^2 − 2ǫ + 1) ≥ 2ǫ^2(1 − ǫ)^2,
         1   if ǫ(1 − ǫ)(2ǫ^2 − 2ǫ + 1) < 2ǫ^2(1 − ǫ)^2.    (3.69)
Noting that the difference probability

    P0(u2 = 0|v) − P0(u2 = 1|v) = ǫ(1 − ǫ)(2ǫ − 1)^2    (3.70)

is non-negative for all possible crossover probabilities ǫ ∈ [0, 1], the decision rule
given by (3.3) produces the estimate û2 for the second symbol u2 of a codeword
u = [u1, u2, u3, u4] ∈ C as

    û2 = 0.    (3.71)
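The closed-form expressions (3.67) and (3.68) can be sanity-checked over a grid of crossover probabilities (numpy sketch); expanding their difference shows it factors as ǫ(1 − ǫ)(2ǫ − 1)^2, which is non-negative on [0, 1]:

```python
import numpy as np

eps = np.linspace(0.0, 1.0, 101)
P0 = eps * (1 - eps) * (2 * eps**2 - 2 * eps + 1)   # P0(u2 = 0|v), eq. (3.67)
P1 = 2 * eps**2 * (1 - eps)**2                      # P0(u2 = 1|v), eq. (3.68)
# The difference factors as eps*(1-eps)*(2*eps-1)^2 and is never negative,
# so the decision u2 = 0 holds for every crossover probability.
assert np.allclose(P0 - P1, eps * (1 - eps) * (2 * eps - 1) ** 2)
assert (P0 - P1 >= -1e-12).all()
```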
3.4.2 Example of decoding in the spectral domain
Alternatively, the same APP decoding problem can be solved using the Walsh-
Hadamard matrix of order four to diagonalise the trellis matrices given in Sec-
tion 3.4.1. This allows direct consideration of the related spectral section matri-
ces as defined in (3.42). The elements u⊥s,j, j = 1, 2, 3, 4, of the dual codewords
u⊥s = [u⊥s,1, u⊥s,2, u⊥s,3, u⊥s,4], s = 0, 1, 2, 3, needed to evaluate (3.42) are given as the
rows of the set

C⊥ = [ 0 0 0 0 ]
     [ 0 1 0 1 ]
     [ 1 1 1 0 ]
     [ 1 0 1 1 ].   (3.72)
[Figure 3.1 appears here.]

Figure 3.1: Original domain APP decoding trellises for the binary (4,2) linear block code C which allow for computation of the conditional probabilities (a) P(u2 = 0|v) and (b) P(u2 = 1|v). (Dashed: sj+1 = sj; solid: sj+1 = sj ⊕ h^T_{j+1}.)
The spectral section matrices Λhj(uj) for uj = 0, j = 1, 2, 3, 4 are given by (3.42)
as
Λh1(0) = Λh2(0) = Λh3(0) = Λh4(0) = I4,   (3.73)
while for uj = 1, the four spectral section matrices obtained may be expressed as
Λh1(1) = Λ1(1) ⊗ Λ0(1) = diag{+1,+1,−1,−1}, (3.74)
Λh2(1) = Λ1(1) ⊗ Λ1(1) = diag{+1,−1,−1,+1}, (3.75)
Λh3(1) = Λ1(1) ⊗ Λ0(1) = diag{+1,+1,−1,−1}, (3.76)
Λh4(1) = Λ0(1) ⊗ Λ1(1) = diag{+1,−1,+1,−1}. (3.77)
It may be instructive to visualise the structure of the spectral section matrices for
uj = 1, j = 1, 2, 3, 4, given in (3.74)-(3.77) by a diagonal trellis as shown in Fig. 3.2.
Comparing the set of codewords given in (3.72) and the weights of the diagonal
trellis in Fig. 3.2, the following relationship can be seen:
C⊥ = [ 0 0 0 0 ]      [ +1 +1 +1 +1 ]
     [ 0 1 0 1 ]  ↔   [ +1 −1 +1 −1 ]
     [ 1 1 1 0 ]      [ −1 −1 −1 +1 ]
     [ 1 0 1 1 ]      [ −1 +1 −1 −1 ].   (3.78)
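The correspondence in (3.78) can be generated directly: each dual codeword is sH over GF(2), and each bit maps to the weight (−1)^bit. (A short Python sketch; the ordering s = dec(s) follows the decimal value of the binary vector s.)

```python
import numpy as np

H = np.array([[1, 1, 1, 0],
              [0, 1, 0, 1]])  # parity check matrix of the binary (4,2) code

# Dual codewords u_s-perp = sH over GF(2), with s = dec(s) for s = (s1, s0)
dual = np.array([(np.array([s >> 1, s & 1]) @ H) % 2 for s in range(4)])

# Spectral weights: bit b maps to (-1)^b
weights = (-1) ** dual

print(dual)     # the rows of (3.72)
print(weights)  # the +1/-1 pattern of (3.78)
```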
In other words, the algebraic characteristics of the linear block code C under con-
sideration are represented in the spectral domain by the corresponding dual code
C⊥. As such, state transitions in the trellis of the original domain are transformed
into a pattern of +1 and −1 weights in the diagonal trellis of the spectral domain.
Then, the eight spectral matrices Λhj(uj) given in (3.73)-(3.77) must be weighted
by the conditional probabilities P (vj|uj), j = 1, 2, 3, 4. Given the received word
v = [1, 1, 1, 0], these probabilities may be expressed as
[P(1|u1), P(1|u2), P(1|u3), P(0|u4)] = { [ǫ, ǫ, ǫ, 1 − ǫ],           uj = 0, j = 1, 2, 3, 4,
                                        { [1 − ǫ, 1 − ǫ, 1 − ǫ, ǫ],  uj = 1, j = 1, 2, 3, 4.    (3.79)
The resulting weighted spectral matrices Θhj(uj) for a transmitted symbol of uj = 0,
j = 1, 2, 3, 4, can then be expressed using (3.47) as
Θh1(0) = diag{ǫ, ǫ, ǫ, ǫ}, (3.80)
Θh2(0) = diag{ǫ, ǫ, ǫ, ǫ}, (3.81)
Θh3(0) = diag{ǫ, ǫ, ǫ, ǫ}, (3.82)
Θh4(0) = diag{1 − ǫ, 1 − ǫ, 1 − ǫ, 1 − ǫ}, (3.83)
[Figure 3.2 appears here.]

Figure 3.2: Illustration of the relationship between the codewords u⊥s, s = 0, 1, 2, 3, of the dual code C⊥ and the spectral section matrices Λhj(uj), j = 1, 2, 3, 4, uj = 1.
whilst those for a transmitted symbol of uj = 1, j = 1, 2, 3, 4, can be expressed as
Θh1(1) = diag{(+1)(1 − ǫ), (+1)(1 − ǫ), (−1)(1 − ǫ), (−1)(1 − ǫ)}, (3.84)
Θh2(1) = diag{(+1)(1 − ǫ), (−1)(1 − ǫ), (−1)(1 − ǫ), (+1)(1 − ǫ)}, (3.85)
Θh3(1) = diag{(+1)(1 − ǫ), (+1)(1 − ǫ), (−1)(1 − ǫ), (−1)(1 − ǫ)}, (3.86)
Θh4(1) = diag{(+1)(ǫ), (−1)(ǫ), (+1)(ǫ), (−1)(ǫ)}. (3.87)
In a fashion similar to the relationship between the dual code and the weights
illustrated in (3.78), the products shown in (3.84)-(3.87) relate to the elements of
the codewords in the dual code but now include information about the error process
induced by the discrete channel in terms of the crossover probability.
Using (3.48) and (3.49), the matrix descriptions of the complete weighted spectral
trellis in the two cases u2 = 0 and u2 = 1 can be formulated as
ΘH(u2 = 0) = Θh1 Θh2(0) Θh3 Θh4 = diag{Qs(u2 = 0|v)},   (3.88)
ΘH(u2 = 1) = Θh1 Θh2(1) Θh3 Θh4 = diag{Qs(u2 = 1|v)}.   (3.89)
After performing the multiplications given in (3.88) and (3.89), respectively, the
conditional spectral coefficients Qs(0|v), s = 0, 1, 2, 3, for u2 = 0 are obtained as
Q0(u2 = 0|v) = ǫ,   (3.90)
Q1(u2 = 0|v) = ǫ(1 − 2ǫ),   (3.91)
Q2(u2 = 0|v) = ǫ(2ǫ − 1)²,   (3.92)
Q3(u2 = 0|v) = ǫ(2ǫ − 1)²(1 − 2ǫ),   (3.93)
while for the case of u2 = 1, the results may be expressed as
Q0(u2 = 1|v) = 1 − ǫ,   (3.94)
Q1(u2 = 1|v) = (1 − ǫ)(2ǫ − 1),   (3.95)
Q2(u2 = 1|v) = (ǫ − 1)(2ǫ − 1)²,   (3.96)
Q3(u2 = 1|v) = (1 − ǫ)(2ǫ − 1)²(1 − 2ǫ).   (3.97)
The weights of the diagonal trellis for u2 = 0 and their relationship with the con-
ditional spectral coefficients Qs(0|v), s = 0, 1, 2, 3 as found in (3.90)-(3.93) can be
seen in Fig. 3.3(a). However, in order to illustrate the impact of the dual code C⊥
and the received word v on the composition of the conditional spectral coefficients
Qs(ui|v), it may be instructive to consider the weights of the diagonal trellis for
u2 = 1 as shown in Fig. 3.3(b). For ease of exposition, define the difference weights
∆0 = 1 − 2ǫ and ∆1 = 2ǫ − 1.   (3.98)
The impact of the received word v and the dual code C⊥ on the order of weights in
the diagonal trellis can then be deduced from the following relationship:
C⊥ = [ 0 0 0 0 ]      [ 1   +(1 − ǫ)  1   1  ]
     [ 0 1 0 1 ]  ↔   [ 1   −(1 − ǫ)  1   ∆0 ]
     [ 1 1 1 0 ]      [ ∆1  −(1 − ǫ)  ∆1  1  ]
     [ 1 0 1 1 ]      [ ∆1  +(1 − ǫ)  ∆1  ∆0 ].   (3.99)
For positions j = 1, 3, 4, it can be seen from (3.99) that the element u⊥j = 0 in
a codeword u⊥ of the dual code C⊥ relates to the weight +1 while the element
u⊥j = 1 relates to the weight ∆0 or ∆1 depending on the elements of the received
word v. In this context, it is noted that the difference value ∆0 is actually used
when vj = 0 whereas the difference value ∆1 is used when vj = 1. For i = 2,
the position at which the APP decision is to be established in this example, the
elements u⊥s,2 = 0 and u⊥s,2 = 1, s = 0, 1, 2, 3, relate to the factors +1 and −1 in the
weights, respectively. The properties of the discrete channel are accounted for by the
conditional probability P (v2|u2), which for the given transmitted symbol u2 = 1 is
determined by the element v2 = 1 of the received word v as P (1|1) = 1− ǫ. Clearly,
these structural characteristics can be used to efficiently implement APP decoding
over discrete channels without memory. In particular, only the set of weights of the
diagonal trellis has to be produced and the order of their appearance in the product
leading to the conditional spectral coefficients is determined by the dual code C⊥
and the received word v. It is to be noted that similar findings extend to the case
of discrete channels with memory, subject to the modification that the set of scalar
weights is replaced by corresponding matrices.
Having computed the related conditional spectral coefficients Qs(u2 = 0|v) and
Qs(u2 = 1|v), respectively, the mapping to the conditional probabilities P0(u2 = 0|v)
and P0(u2 = 1|v) can be derived using the inverse transform (3.54) as
P0(u2 = 0|v) = (1/4) Σ_{s=0}^{3} Qs(u2 = 0|v) = ǫ(1 − ǫ)(2ǫ² − 2ǫ + 1),   (3.100)

P0(u2 = 1|v) = (1/4) Σ_{s=0}^{3} Qs(u2 = 1|v) = 2ǫ²(1 − ǫ)².   (3.101)
Clearly, the APP decoding decision deduced from the spectral domain characteristics
in terms of conditional spectral coefficients through the expression
û2 = { 0 if Σ_{s=0}^{3} Qs(u2 = 0|v) ≥ Σ_{s=0}^{3} Qs(u2 = 1|v),
     { 1 if Σ_{s=0}^{3} Qs(u2 = 0|v) < Σ_{s=0}^{3} Qs(u2 = 1|v),   (3.102)
then leads to the same outcome as in the original domain. It produces the estimate
û2 for the transmitted symbol u2 at position i = 2 of a codeword u as

û2 = 0.   (3.103)
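The spectral route can be checked numerically as well. The sketch below (a Python illustration) forms each conditional spectral coefficient directly from the dual codewords and the received word, as described above, and recovers the APPs through the inverse transform of (3.100) and (3.101).

```python
import numpy as np

dual = [[0, 0, 0, 0], [0, 1, 0, 1], [1, 1, 1, 0], [1, 0, 1, 1]]  # rows of C-perp
v = [1, 1, 1, 0]   # received word
eps = 0.1          # BSC crossover probability (arbitrary test value)
i = 1              # decode position 2 (0-indexed)

def P_cond(vj, uj):
    """BSC channel probability P(v_j | u_j)."""
    return 1 - eps if vj == uj else eps

def Q(s, u2):
    """Conditional spectral coefficient Q_s(u2 | v) as a product of diagonal weights."""
    q = 1.0
    for j in range(4):
        if j == i:  # decoded position: fixed input u2
            q *= (-1) ** (u2 * dual[s][j]) * P_cond(v[j], u2)
        else:       # sum over both inputs: P(v_j|0) + (-1)^(dual bit) P(v_j|1)
            q *= P_cond(v[j], 0) + (-1) ** dual[s][j] * P_cond(v[j], 1)
    return q

# Inverse transform: average of the four spectral coefficients
P0 = {u2: sum(Q(s, u2) for s in range(4)) / 4 for u2 in (0, 1)}
print(P0[0], eps * (1 - eps) * (2 * eps**2 - 2 * eps + 1))
print(P0[1], 2 * eps**2 * (1 - eps)**2)
```

Both pairs agree, confirming that the spectral domain reproduces the original-domain APPs.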
[Figure 3.3 appears here.]

Figure 3.3: Weighted diagonal trellises of the binary (4, 2) linear block code C used for computing the conditional spectral coefficients (a) Qs(u2 = 0|v) and (b) Qs(u2 = 1|v); s = 0, 1, 2, 3.
3.5 Numerical Examples
Computer simulations were carried out for several binary and non-binary linear
block codes to examine the BER performance of these codes over BSCs and DMCs,
respectively. In particular, APP decoding in the spectral domain as formulated in
Procedure 3.3 was used. The linear block codes were chosen such that they have code
rates of R ≥ 0.5, which ensures the complexity benefits of the spectral domain can
be utilised. It must be noted that these simulations are not intended to provide an
exhaustive performance investigation of APP decoding on discrete channels without
memory but rather to verify the applicability of the derived theoretical framework.
Simulation results for binary linear block codes on BSCs
The particulars of the considered binary (n, k) linear block codes C used together
with the BSC model are given below.
(7,4) Hamming code: This one-error correcting Hamming code of rate R = 0.57
can be defined by the parity check matrix
H = [ 0 1 1 1 1 0 0 ]
    [ 1 0 1 1 0 1 0 ]
    [ 1 1 0 1 0 0 1 ].   (3.104)
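As a quick check of the stated error-correcting capability, one can enumerate the codewords satisfying Hu^T = 0 over GF(2) and confirm that the minimum nonzero weight is 3, i.e. single-error correction. (Python sketch, not part of the thesis.)

```python
import itertools
import numpy as np

H = np.array([[0, 1, 1, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])  # parity check matrix (3.104)

# All length-7 binary vectors whose syndrome is zero
codewords = [u for u in itertools.product((0, 1), repeat=7)
             if not (H @ np.array(u) % 2).any()]

d_min = min(sum(u) for u in codewords if any(u))
print(len(codewords), d_min)  # 16 codewords, minimum distance 3
```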
(16,8) Cyclic code: This two-error correcting binary block code of rate R = 0.5
is defined by a generator polynomial with coefficients given as the vector
g = [1, 1, 1, 0, 1, 0, 1, 1, 1]. Accordingly, the equivalent standard form of the
parity check matrix for this code can be obtained as
H = [ 1 0 1 0 0 1 0 1 1 0 0 0 0 0 0 0 ]
    [ 0 1 1 1 0 1 1 1 0 1 0 0 0 0 0 0 ]
    [ 0 0 0 1 1 1 1 0 0 0 1 0 0 0 0 0 ]
    [ 1 0 0 0 1 1 1 1 0 0 0 1 0 0 0 0 ]
    [ 1 1 1 0 0 0 1 0 0 0 0 0 1 0 0 0 ]
    [ 1 1 1 1 0 0 0 1 0 0 0 0 0 1 0 0 ]
    [ 1 1 0 1 1 1 0 1 0 0 0 0 0 0 1 0 ]
    [ 0 1 0 0 1 0 1 1 0 0 0 0 0 0 0 1 ].   (3.105)
(22,13) Chen code: This two-error correcting binary (22,13) linear block code
of rate R = 0.59 reported by Chen, Fan, and Jin has been defined in [60], and
is henceforth referred to as the Chen code. The code can be defined by parity
[Figure 3.4 appears here.]

Figure 3.4: BER performance of some binary block codes on a BSC.
check matrix
H = [ 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 1 1 1 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 ]
    [ 0 0 1 1 0 0 1 0 0 1 1 1 0 0 0 1 0 0 0 0 0 0 ]
    [ 0 1 0 1 0 1 0 0 1 0 1 1 1 0 0 0 1 0 0 0 0 0 ]
    [ 0 1 1 0 1 0 0 1 0 1 0 1 1 0 0 0 0 1 0 0 0 0 ]
    [ 1 0 0 1 0 1 1 0 1 1 0 1 0 0 0 0 0 0 1 0 0 0 ]
    [ 1 0 1 0 1 1 1 1 0 1 1 0 1 0 0 0 0 0 0 1 0 0 ]
    [ 1 1 0 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 1 0 ]
    [ 1 1 1 0 0 0 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 1 ].   (3.106)
The BER performance of these block codes on BSCs was obtained through com-
puter simulations and is shown in Fig. 3.4. The performance for transmitting data
without coding is shown for comparison. As expected, the channel codes improve
the performance compared to an uncoded transmission. The (22,13) Chen code and
the (16,8) cyclic code outperform the weaker (7,4) Hamming code due to their supe-
rior error-correcting capabilities. In all cases, the BER increases with the crossover
probability ǫ.
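Simulations of this kind can be reproduced with a short brute-force decoder: for each bit position, the APP sums over the 16 codewords are weighted by ǫ^d (1 − ǫ)^(n−d), where d is the Hamming distance to the received word. (A minimal Python sketch, not the simulator used for Fig. 3.4; the parameters are illustrative, and bit errors are counted over all codeword positions here for simplicity.)

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
H = np.array([[0, 1, 1, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])  # (7,4) Hamming code, (3.104)
codebook = np.array([u for u in itertools.product((0, 1), repeat=7)
                     if not (H @ np.array(u) % 2).any()])

def app_decode(v, eps):
    """Bitwise APP decoding on a BSC: compare codeword-sum likelihoods per position."""
    d = (codebook != v).sum(axis=1)           # Hamming distances to v
    w = eps**d * (1 - eps) ** (7 - d)         # P(v | u) for each codeword
    p1 = (w[:, None] * codebook).sum(axis=0)  # mass of codewords with u_i = 1
    return (p1 > w.sum() / 2).astype(int)     # decide 1 iff P(1) > P(0)

eps, blocks, errs, bits = 0.05, 5000, 0, 0
for _ in range(blocks):
    u = codebook[rng.integers(len(codebook))]
    v = u ^ (rng.random(7) < eps).astype(int)  # BSC: flip each bit w.p. eps
    errs += int((app_decode(v, eps) != u).sum())
    bits += 7
print(errs / bits)  # noticeably below the uncoded BER of eps
```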
[Figure 3.5 appears here.]

Figure 3.5: SER performance of some block codes over GF(3) on a ternary DMC.
Simulation results for ternary linear block codes on DMCs
The specifications of the considered (n, k) linear block codes C over GF(3) used on
the ternary DMC model are as follows.
(4,2) Hamming code over GF(3): This one-error correcting ternary Hamming
code of rate R = 0.5 is a perfect code and is defined by the parity check matrix
H = [ 2 2 1 0 ]
    [ 1 2 0 1 ].   (3.107)
(11,6) Golay code over GF(3): The parity check matrix of this two-error correcting perfect block code of rate R = 0.54 is given by
H = [ 2 0 2 1 1 2 1 0 0 0 0 ]
    [ 2 2 0 2 1 1 0 1 0 0 0 ]
    [ 2 1 2 0 2 1 0 0 1 0 0 ]
    [ 2 1 1 2 0 2 0 0 0 1 0 ]
    [ 2 2 1 1 2 0 0 0 0 0 1 ].   (3.108)
The simulated SER performance obtained for these linear block codes over GF(3)
on ternary DMCs is displayed in Fig. 3.5. It should be mentioned that the standard
DMC model shown in Fig. 2.3(a) has been used in the simulations. Again, as
expected, it is observed that channel coding produces a gain over the uncoded sce-
nario. In this case, however, the lower rate of the Hamming code does not result in
a better performance when compared with the Golay code. Both are perfect codes,
but the Golay code can correct two errors whilst the Hamming code is single-error
correcting.
3.6 Summary
There are many different trellis-based decoding algorithms available. However, work-
ing with trellises can be cumbersome, especially if the length n of the codewords or
order p of the considered Galois field is large. It is therefore convenient to be able
to create a matrix representation of the trellis. In this way, trellis operations are
converted to addition and multiplication of matrices, which can be handled easily
by digital signal processors and computers.
The solution to the APP decoding problem in (3.1) involves forming the ma-
trix representation in either the original or the spectral domain, based solely on the
structure of the code or dual code. APP decoding algorithms were given for both
the original and spectral domains. In the original domain, the matrices are usually
non-sparse due to the many intersecting paths in the trellis. However, this approach
has the advantage that the APPs can be obtained directly at the terminating end of
the weighted trellis. On the other hand, the spectral domain approach involves
diagonalising the matrix descriptions of the code trellis. This results in a diagonal trellis
with all paths parallel. The calculation of the conditional spectral coefficients can
be done relatively fast, as it requires only addition and multiplication of diagonal
matrices. These coefficients must be transformed back to APPs in order to make the
decoding decision. The involved APP decoding concepts were demonstrated in an
instructive example, verifying that the same answer is obtained when using either
domain. Some performance examples for selected linear block codes on channels
without memory were obtained by computer simulation to illustrate potential areas
of application of the presented APP decoding approach.
Chapter 4

APP Decoding on Binary Channels with Memory
The errors which occur in physical wireless channels are not usually independent and
therefore memoryless models provide too coarse an approximation [61]. One solution
to the problem of obtaining an accurate model is to force the current behaviour of
the model to be dependent upon its recent behaviour and thus endow the model with
a memory. An information theoretic argument [62] demonstrates that consideration
of the behaviour of the model during only the previous symbol-period usually results
in an acceptable approximation to a mobile channel experiencing fading.
The behaviour of such a model can be described using a hidden Markov model
which represents the different concentrations of errors in a received sequence. Since
the probabilities of all possible state transitions must be incorporated into an APP
decoding algorithm, matrices are used instead of the scalar crossover probabilities
of Chapter 3, and thus the complexity increases with the number of states.
Turbo decoding algorithms [23, 63] have been developed for convolutional codes
over channels with memory, but their block code counterparts are less prevalent.
This motivates the exposition in this chapter of two APP decoding algorithms for
binary channels described by a hidden Markov model. As in the memoryless case,
one operates in the original domain and the other uses the spectral domain.
This chapter is organised as follows. Firstly, Section 4.1 defines the main prob-
lem to be solved in this chapter, which concerns APP decoding of binary linear
block codes over channels described by stochastic automata. Section 4.2 develops
and describes two procedures for performing such decoding. The solution to the
APP decoding problem is first formulated using the original domain and then the
equivalent procedure is derived in the spectral domain. In Section 4.3, the computa-
tional complexity and storage requirements of both procedures are examined. The
theory involved in the two procedures can be made more tangible by demonstration
in examples. Section 4.4 contains two such examples, specifically APP decoding on
a binary GEC using each domain. Section 4.5 displays some performance results
obtained by computer simulation for several codes with the spectral domain APP
decoding procedure developed in Section 4.2. In particular, it is shown how the BER
performance of the decoder is affected by changing the parameters of the channel
model. Finally, Section 4.6 summarises the chapter.
The main contributions to research of this chapter are:
• Development of an APP decoding procedure using the original domain for
binary linear block codes over channels described by stochastic automata.
• Through diagonalisation, the development of an alternative procedure using
the spectral domain to perform the same task.
• Demonstration of the benefits of the spectral domain approach for high rate
codes in terms of the storage space required.
• Numerical examples showing a variety of available options when investigating
the performance of Hamming codes with APP decoding using these proce-
dures, through computer simulation of transmission of information over GECs.
4.1 Problem Statement
Suppose that C is a systematic binary (n, k) linear block code which is to be used
on a channel described by a stochastic automaton
D = (D,σ0). (4.1)
Here, the stochastic sequential machine
D = (U ,V ,S, {D(vj|uj)}) (4.2)
has binary input and output sets U = {0, 1} and V = {0, 1} and a set S of S states.
In general, there are four S×S matrix probabilities D(vj|uj) for uj ∈ U and vj ∈ V.
Additionally, σ0 is a row vector of length S representing the initial or stationary state
distribution of the automaton D. For an all-ones column vector e, the equations to
be solved for APP decoding of binary linear block codes over finite state channel
models can be expressed as
ûi = arg max_{g ∈ GF(2)} σ0 ( Σ_{u∈C, ui=g} Π_{j=1}^{n} D(vj|uj) ) e.   (4.3)
However, if the channel in each state is a BSC, then the four matrix probabilities
D(vj|uj) can be replaced by a set of two matrix probabilities using the equation
D(vj|uj) = Duj⊕vj ∈ {D0, D1}.   (4.4)
It follows from (4.3) and (4.4) that an APP decoding rule can be formulated as
ûi = { 0 if σ0 ( Σ_{u∈C, ui=0} Π_{j=1}^{n} Duj⊕vj ) e ≥ σ0 ( Σ_{u∈C, ui=1} Π_{j=1}^{n} Duj⊕vj ) e,
     { 1 otherwise.   (4.5)
If it is further assumed that the block code C is in standard form, then the problem
to be solved is to find an estimate ûi of the transmitted bit ui for i = 1, 2, . . . , k.
This is done by comparing the two matrix products for all information bits, using
either the original or the spectral domain.
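To make the quantities in (4.1)-(4.5) concrete, the sketch below builds the two matrix probabilities D0 and D1 and the stationary distribution σ0 for a two-state Gilbert-Elliott channel. The parameter values are illustrative, and the factorisation De = A · diag(per-state error probabilities), i.e. a state transition followed by an error drawn in the new state, is one common convention assumed here rather than taken from the text.

```python
import numpy as np

# Two-state GEC: good state G, bad state B (illustrative parameters)
g, b = 0.1, 0.3            # P(G -> B) and P(B -> G)
eps_G, eps_B = 0.01, 0.2   # per-state BSC crossover probabilities

A = np.array([[1 - g, g],
              [b, 1 - b]])             # state transition matrix
D1 = A @ np.diag([eps_G, eps_B])       # matrix probability of a bit error
D0 = A @ np.diag([1 - eps_G, 1 - eps_B])  # matrix probability of no error

# Stationary distribution: sigma0 A = sigma0, entries sum to one
sigma0 = np.array([b, g]) / (g + b)

print(np.allclose(D0 + D1, A))    # True: the two matrix probabilities partition A
print(np.allclose(sigma0 @ A, sigma0))  # True: sigma0 is stationary
```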
4.2 Binary APP Decoding
The foundations of trellis representations using matrices were presented in Section
3.2. Furthermore, the elementary trellis matrices Mh(u) were introduced for a block
code C over GF (p). In this section, it is shown how to obtain a description of an
entire weighted trellis for a channel with memory by means of these elementary
matrices. As before, let C be defined by an (n−k) × n parity check matrix
H = [ h1, h2, . . . , hn ],   (4.6)

where the jth column, 1 ≤ j ≤ n, is given by

hj = [ hn−k−1,j, hn−k−2,j, . . . , h0,j ]^T.   (4.7)
A solution to (4.5) will first be presented using the original domain. This will be
followed by an analogous procedure in the spectral domain.
4.2.1 Original domain
The decoding procedure using the original domain is developed by first weighting
the trellis section matrices by the appropriate matrix probabilities for bit error and
non-error. The APPs are calculated from the matrix representation of the trellis
and a decoding decision is then reached.
Weighted trellis matrices
The matrix representation for a trellis section must be weighted according to the
state transition and error probabilities of the channel. This is achieved by taking
the Kronecker product of the matrix for that trellis section with the required matrix
probability. For uj, vj ∈GF (2), a binary channel produces four matrix probabilities
D(vj|uj). If each state contains a BSC, then there are only two matrix probabilities
D0 = D(vj = uj|uj), (4.8)
D1 = D(vj 6= uj|uj), (4.9)
and the overall structure forms a GEC model. In this case, two possible weighted
trellis section matrices exist for each column of the parity check matrix H. One
corresponds to correct reception of the transmitted bit, and one corresponds to the
situation where an error has occurred. Incorporating (4.8) and (4.9) as extensions
of (3.15), the two (2^{n−k}S) × (2^{n−k}S) weighted trellis section matrices can be written as

Uhj(uj = 0) = Mhj(0) ⊗ D(vj|0) = I_{2^{n−k}} ⊗ Dvj,
Uhj(uj = 1) = Mhj(1) ⊗ D(vj|1) = Mhj ⊗ Dv̄j,   (4.10)

where v̄j = vj ⊕ 1, I_{2^{n−k}} is the identity matrix of order 2^{n−k}, and the trellis section
matrices Mhj(uj) are described in (3.11) and (3.12).
being decoded. For all sections j 6= i of an APP decoding trellis, one horizontal
and one oblique branch leave each node. These correspond to transmitted bits 0
and 1, respectively. Given that all paths through the trellis must be considered, the
weighted trellis section matrices Uhj, irrespective of the transmitted bit, are used for
all but the ith section. It is possible to represent the complete trellis section in these
n−1 cases as
Uhj = Uhj(uj = 0) + Uhj(uj = 1) = I_{2^{n−k}} ⊗ Dvj + Mhj ⊗ Dv̄j.   (4.11)
Assigning a probability to each path through the trellis is accomplished by mul-
tiplication of the weighted trellis section matrices in order, from the first to the
nth section. However, the summation bound “u ∈ C, ui = g” in (2.107) stipulates
that the matrix Uhi(g) is the ith multiplicand in the overall product. The matrix
representation of the entire trellis for the ith transmitted bit ui is thus given by
UH(ui) = Π_{j=1}^{i−1} Uhj · Uhi(ui) · Π_{j=i+1}^{n} Uhj.   (4.12)
Determining the a posteriori probabilities in the original domain
A vector P(ui|v) of 2^{n−k} APPs must be extracted from the matrix product of size
(2^{n−k}S) × (2^{n−k}S) in (4.12). This is done by calculating the expected value of the
product of trellis branch values over all possible initial states. The entries of the
stationary state distribution vector σ0 are used as the values of the probability
distribution for the initial state of the finite state channel. More formally, as shown
in [26] and [27], a vector P(ui|v) of APPs may be calculated using the equation
P(ui|v) = (τ0 ⊗ σ0) · UH(ui) · (I_{2^{n−k}} ⊗ e),   (4.13)
where e is a length-S column vector of ones, and
τ0 = [ 1, 0, . . . , 0 ]   (4.14)

represents a vector of length 2^{n−k}. The row vector τ0 has this form because all
paths through a decoding trellis in the original domain must commence at the 0th
node. Paths commencing at any of the other nodes are not allowed. The zeroes in
(4.14) show that these paths would not contribute to the APP. The resulting vector
P(ui|v) consists of 2^{n−k} entries Pt(ui|v) for t = 0, 1, . . . , 2^{n−k} − 1, which denote the
probability that the ith transmitted bit was ui, given that a word v was received
and that the encoding was performed by mapping a binary vector of k information
bits onto length-n words of the tth coset Vt. In this description, as was the case
for memoryless channels, the 0th coset V0 corresponds to the code C. Hence, the
required APP is P0(ui|v).
Although all rows and columns of each matrix Uhj or Uhi(ui) are used in the
calculation of the matrix representation UH(ui) of the entire trellis for the ith trans-
mitted bit ui, once this has been calculated, only the upper S rows of UH(ui) are
used in calculating the product (τ0 ⊗ σ0) · UH(ui). This is due to the zeroes in all
positions of τ 0 except the first. Since only the first element of the resulting product
of (τ0 ⊗ σ0) · UH(ui) with I_{2^{n−k}} ⊗ e is required to find P0(ui|v), it follows that only
the first S columns of UH(ui) are ultimately relevant.
For a matrix K and l ∈ Z+, define the lth principal leading submatrix [K](l) as
the square submatrix consisting of the intersection of the first l rows and l columns
of K. For example, if K = [ki,j]_{3×4}, then

[K](2) = [ k1,1 k1,2 ]
         [ k2,1 k2,2 ].   (4.15)
With this notation, (4.13) can be rewritten in order to calculate P0(ui|v) as
P0(ui|v) = σ0 · [UH(ui)](S) · e. (4.16)
APP decoding procedure for the original domain
It is now possible to report a procedure which performs APP decoding for a binary
linear block code using the original domain. For a binary channel described by a
stochastic automaton, the procedure is as follows.
Procedure 4.1. Given is a binary (n, k) linear block code C in standard form,
to be used on a channel which is described by a stochastic automaton containing
a stochastic sequential machine which has S < ∞ states. The linear block code C
shall be defined by parity check matrix H. A codeword u is transmitted over the
channel and a word v is received. APP decoding in the original domain comprises
the following steps.
Step 1. Use the state transition and crossover probabilities to find the stationary
state distribution vector σ0 and the two S×S matrix probabilities D0 and D1.
Step 2. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ {0, 1}, compute the matrix representation
Mhj(uj) for column hj and jth transmitted symbol uj using (3.12).
Step 3. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ {0, 1}, compute the weighted trellis section
matrix Uhj(uj) using (4.10).
Step 4. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ {0, 1}, compute the matrix representation of
the trellis UH(ui) using (4.12).
Step 5. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ {0, 1}, let [UH(ui)](S) be the Sth principal
leading submatrix of UH(ui) and calculate P0(ui|v) using (4.16).
Step 6. Derive an estimate ûi for each position i ∈ {1, 2, . . . , k} using

ûi = { 0 if P0(ui = 0|v) ≥ P0(ui = 1|v),
     { 1 if P0(ui = 0|v) < P0(ui = 1|v).   (4.17)
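Procedure 4.1 can be exercised end to end on a small example. The sketch below decodes one received word of the (4,2) code from Section 3.4 over a two-state GEC and cross-checks each APP against the brute-force sum of (4.3) over the four codewords. The GEC parameters and the convention De = A · diag(per-state error probabilities) are illustrative assumptions, not values from the text.

```python
import numpy as np

# Step 1: two-state GEC with De = A @ diag(per-state error probabilities)
g, b, eG, eB = 0.1, 0.3, 0.01, 0.2
A = np.array([[1 - g, g], [b, 1 - b]])
D = {0: A @ np.diag([1 - eG, 1 - eB]), 1: A @ np.diag([eG, eB])}
sigma0 = np.array([b, g]) / (g + b)
e = np.ones(2)

# The (4,2) code with H = [[1,1,1,0],[0,1,0,1]]; dec(h_j) = 2, 3, 2, 1
h, N = [2, 3, 2, 1], 4
codewords = [(0, 0, 0, 0), (1, 0, 1, 0), (0, 1, 1, 1), (1, 1, 0, 1)]
v = [1, 1, 1, 0]  # received word

def M(c, u):  # Step 2: trellis section matrix
    if u == 0:
        return np.eye(N)
    P = np.zeros((N, N))
    for s in range(N):
        P[s, s ^ c] = 1.0
    return P

def U(j, u):  # Step 3: weighted section matrix, (4.10)
    return np.kron(M(h[j], u), D[u ^ v[j]])

def P0(i, ui):  # Steps 4-5: trellis product (4.12) and APP (4.16)
    prod = np.eye(N * 2)
    for j in range(4):
        prod = prod @ (U(j, ui) if j == i else U(j, 0) + U(j, 1))
    return sigma0 @ prod[:2, :2] @ e

def brute(i, ui):  # direct evaluation of the sum in (4.3)
    total = 0.0
    for u in codewords:
        if u[i] != ui:
            continue
        prod = np.eye(2)
        for j in range(4):
            prod = prod @ D[u[j] ^ v[j]]
        total += sigma0 @ prod @ e
    return total

for i in range(2):           # information positions
    for ui in (0, 1):
        print(i, ui, P0(i, ui), brute(i, ui))  # pairs agree
```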
An algorithm for APP decoding of binary linear block codes over a finite state
channel has been provided. In essence, it is a generalisation of the procedure shown
in Chapter 3 for memoryless channel models, but now the sequence of state transi-
tions plays an additional role. Procedure 4.1 uses the columns of the parity check
matrix H to construct a matrix representation of the decoding trellis, but another
procedure with strong connections to the dual code may also be developed.
4.2.2 Spectral domain
A decoding procedure for the spectral domain can be developed in a similar way
to the original domain approach by using diagonalised versions of the trellis section
matrices. After weighting these diagonal matrices by the appropriate matrix prob-
abilities of bit error and non-error, a transformation back to the original domain in
order to obtain the APPs is made.
Spectral section matrices
As outlined in Section 3.3, a set of elementary spectral matrices can be constructed,
each representing a different section of the trellis for a particular assumed transmit-
ted bit in that position. For column hj of the parity check matrix and transmitted
bit uj, the spectral matrix Λhj(uj) defined in terms of a variable s, which takes
values between 0 and 2n−k−1, may be expressed as
Λhj(uj) = diag{(−1)uj ·u
⊥
s,j}, (4.18)
where u⊥s,j denotes the jth symbol of the sth dual codeword u⊥s = sH and s = dec(s)
is the decimal representation of the binary vector s.
Weighted spectral matrices
The spectral section matrices must be weighted by the state transition and error
probabilities of the channel, as was the case in the original domain. One method of
achieving this is to apply a similarity transformation directly to the weighted trellis
section matrices derived in the original domain. When considered for a transmitted
bit uj, this relationship may be expressed as
Θhj(uj) = T^{−1} Uhj(uj) T   (4.19)

for a transformation matrix T. The Walsh-Hadamard transformation is applied, as
was the case for memoryless models. However, here it must be applied over all S
states and so the matrix T in (4.19) can be expressed as

T = W_{2^{n−k}} ⊗ IS.   (4.20)
Since the Walsh-Hadamard matrix W_{2^{n−k}} is symmetric and its distinct rows are
mutually orthogonal, it follows that

W_{2^{n−k}}^2 = 2^{n−k} I_{2^{n−k}}   (4.21)

and therefore the inverse of the Walsh-Hadamard matrix can be written as

W_{2^{n−k}}^{−1} = (1/2^{n−k}) W_{2^{n−k}}.   (4.22)
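The property (4.21) is easy to confirm with the Sylvester construction of W_{2^m}. (A short Python sketch.)

```python
import numpy as np

def walsh_hadamard(m):
    """Sylvester-type Walsh-Hadamard matrix of order 2^m."""
    W = np.array([[1.0]])
    for _ in range(m):
        W = np.block([[W, W], [W, -W]])
    return W

m = 3
W = walsh_hadamard(m)
print(np.allclose(W, W.T))                       # True: W is symmetric
print(np.allclose(W @ W, 2**m * np.eye(2**m)))   # True: W^2 = 2^m I
```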
Substituting (4.10), (4.20) and (4.22) into (4.19) produces the (2^{n−k}S) × (2^{n−k}S)
weighted spectral matrix

Θhj(uj) = (1/2^{n−k}) W_{2^{n−k}} Mhj(uj) W_{2^{n−k}} ⊗ D(vj|uj)
        = Λhj(uj) ⊗ D(vj|uj)
        = diag{(−1)^{uj·u⊥s,j} Duj⊕vj},   (4.23)

which is a block diagonal matrix. Considering the weighted spectral matrix in (4.23)
for a specific input uj gives

Θhj(uj) = { diag{Dvj}                   if uj = 0,
          { diag{(−1)^{u⊥s,j} Dv̄j}     if uj = 1,

        = diag{ūj Dvj + uj (−1)^{u⊥s,j} Dv̄j}.   (4.24)
Weighted spectral matrices for the diagonal trellis sections j ≠ i, irrespective of the
transmitted symbol uj, are given by the sum of the Θhj(uj) matrices over all uj ∈ U.
Thus,

Θhj = Θhj(0) + Θhj(1) = diag{Dvj + (−1)^{u⊥s,j} Dv̄j}.   (4.25)
Assuming that an estimate of the probability that the ith bit is equal to ui is required,
a weighted (2^{n−k}S) × (2^{n−k}S) matrix ΘH(ui) for the entire diagonal trellis, which
consists of 2^{n−k} parallel paths, is calculated by multiplying the weighted spectral
matrices for each trellis section in order, from the first to the nth section. Due
to the definition of the APPs, each of the weighted spectral matrices will be taken
irrespective of the transmitted symbol, apart from the ith factor, where the weighted
spectral matrix for an input of ui will be used. That is,

ΘH(ui) = Π_{j=1}^{i−1} Θhj · Θhi(ui) · Π_{j=i+1}^{n} Θhj.   (4.26)
Each factor of ΘH(ui) is a block diagonal matrix with square, invertible submatrices
on the main diagonal. Additionally, diagonal matrices over C with the same square
block structure of invertible submatrices form a group under matrix multiplication.
Therefore, ΘH(ui) is also a block diagonal matrix, and it is possible to write
ΘH(ui) = diag{Qs(ui|v)}, (4.27)
where
Qs(ui|v) = Π_{j=1}^{i−1} [ Dvj + (−1)^{u⊥s,j} Dv̄j ] × [ ūi Dvi + ui (−1)^{u⊥s,i} Dv̄i ] × Π_{j=i+1}^{n} [ Dvj + (−1)^{u⊥s,j} Dv̄j ].   (4.28)
Observing the structure of Θhj in (4.25), when n ≥ 3, there are more instances of
D0 ± D1 than either D0 or D1 alone. For most codes, it is thus beneficial to rewrite
(4.28) using the notation

D = D0 + D1,
∆ = D0 − D1.   (4.29)
Firstly, for columns j ≠ i,

Dvj + (−1)^{u⊥s,j} Dv̄j = (−1)^{vj·u⊥s,j} (ū⊥s,j D + u⊥s,j ∆).   (4.30)

Secondly, for the ith column,

ūi Dvi + ui (−1)^{u⊥s,i} Dv̄i = (1/2) [ (−1)^{ui·u⊥s,i} D + (−1)^{(ui·ū⊥s,i)+vi} ∆ ].   (4.31)
Using (4.30) and (4.31) in (4.28) yields the conditional spectral coefficient matrices

Qs(ui|v) = Π_{j=1}^{i−1} [ (−1)^{vj·u⊥s,j} (ū⊥s,j D + u⊥s,j ∆) ] × (1/2) [ (−1)^{ui·u⊥s,i} D + (−1)^{(ui·ū⊥s,i)+vi} ∆ ] × Π_{j=i+1}^{n} [ (−1)^{vj·u⊥s,j} (ū⊥s,j D + u⊥s,j ∆) ].   (4.32)
In particular, note the correspondence between the zeroes and ones of the dual codewords and the arrangement of D and ∆ matrix probabilities within the matrices Q_s(u_i|v). The conditional spectral coefficients Q_s(u_i|v), which factor in the relative likelihoods of the model commencing in each of the states, can be obtained from the conditional spectral coefficient matrices Q_s(u_i|v) using the conversion

Q_s(u_i|v) = σ0 · Q_s(u_i|v) · e.    (4.33)

Together, the 2^{n−k} conditional spectral coefficients Q_s(u_i|v) form the vector of conditional spectral coefficients

Q(u_i|v) = [ Q_0(u_i|v), Q_1(u_i|v), . . . , Q_{2^{n−k}−1}(u_i|v) ].    (4.34)
Determining the a posteriori probabilities in the spectral domain
The conditional spectral coefficient matrices cannot be used directly to perform the
APP decoding. These matrices are constructed from the spectral coefficients and
in order to calculate the required APPs, coefficients from the original domain must
be used. The two sets of coefficients are related by the Walsh-Hadamard matrix W_{2^{n−k}} of order 2^{n−k}. The spectral domain equivalent of (4.13) can be determined using properties of the Kronecker product. Firstly, substituting (4.12) into (4.13) produces

P(u_i|v) = (τ_0 ⊗ σ0) · ∏_{j=1}^{i−1} U_{h_j} · U_{h_i}(u_i) · ∏_{j=i+1}^{n} U_{h_j} · (I_{2^{n−k}} ⊗ e).    (4.35)
Then, application of (4.19), (4.20) and (4.26) results in

P(u_i|v) = (τ_0 ⊗ σ0)(W_{2^{n−k}} ⊗ I_S) · Θ_H(u_i) · (W_{2^{n−k}}^{−1} ⊗ I_S)(I_{2^{n−k}} ⊗ e).    (4.36)

It is also important to note that

(τ_0 ⊗ σ0)(W_{2^{n−k}} ⊗ I_S) = ι_0 ⊗ σ0,    (4.37)

where ι_0 is the all-ones vector of length 2^{n−k}, which is equal to the first row of W_{2^{n−k}} as defined in (3.53). A further simplification may be made using the properties of identity matrices, so that

(W_{2^{n−k}}^{−1} ⊗ I_S)(I_{2^{n−k}} ⊗ e) = W_{2^{n−k}}^{−1} ⊗ e.    (4.38)

The substitution of (4.27), (4.37) and (4.38) into (4.36) results in

P(u_i|v) = (ι_0 ⊗ σ0) · diag{ Q_s(u_i|v) } · (W_{2^{n−k}}^{−1} ⊗ e).    (4.39)
Applying (4.22) and considering the structure of the vector of conditional spectral coefficients defined in (4.33) and (4.34) produces

P(u_i|v) = (1/2^{n−k}) Q(u_i|v) W_{2^{n−k}}.    (4.40)

Then, the required APP is the first element of the vector P(u_i|v), that is

P_0(u_i|v) = (1/2^{n−k}) ∑_{s=0}^{2^{n−k}−1} Q_s(u_i|v).    (4.41)

Since the estimate û_i for the transmitted bit u_i is the binary symbol that maximises P_0(u_i|v), (4.41) gives

û_i = arg max_{u_i ∈ GF(2)} { ∑_{s=0}^{2^{n−k}−1} Q_s(u_i|v) }.    (4.42)
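The conversion (4.40)–(4.41) relies only on the fact that the first column of the Walsh-Hadamard matrix is all ones, so the first element of the transformed vector is the average of the spectral coefficients. A small numerical sketch makes this concrete; the Q values below are hypothetical, and the Sylvester construction of W is assumed to match the definition in (3.53):

```python
import numpy as np

def hadamard(m):
    """Walsh-Hadamard matrix of order 2^m via Sylvester's construction."""
    W = np.array([[1.0]])
    for _ in range(m):
        W = np.kron(W, np.array([[1.0, 1.0], [1.0, -1.0]]))
    return W

m = 2                                  # n - k = 2, as for a (4,2) code
W = hadamard(m)
Q = np.array([3.0, 1.0, 4.0, 1.5])     # hypothetical spectral coefficients

P = Q @ W / 2**m                       # counterpart of (4.40)
# The first column of W is all ones, so P[0] is the average of the
# coefficients Q_s -- exactly the accumulation in (4.41).
assert np.isclose(P[0], Q.mean())
assert np.allclose(W @ W, 2**m * np.eye(2**m))   # since W^{-1} = W / 2^m
```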
APP decoding procedure for the spectral domain
Assuming the same conditions as those for Procedure 4.1, an alternative APP de-
coding procedure using the spectral domain is summarised below.
Procedure 4.2. Given is a binary (n, k) linear block code C in standard form, to be used on a channel which is described by a stochastic automaton containing a stochastic sequential machine which has S < ∞ states. The linear block code C shall be defined by parity check matrix H. A codeword u is transmitted over the channel and a word v is received. APP decoding in the spectral domain is comprised of the following main steps.

Step 1. Use the state transition and crossover probabilities to find the stationary state distribution vector σ0 and the matrix probabilities D0 and D1. Formulate D and ∆ in terms of D0 and D1 using (4.29).

Step 2. ∀s = dec(s) ∈ {0, 1, . . . , 2^{n−k} − 1}, compute the dual codewords

u⊥_s = s · H = [ u⊥_{s,1}, u⊥_{s,2}, . . . , u⊥_{s,n} ] ∈ C⊥,    (4.43)

which are used in defining the arrangement of D and ∆ matrices in (4.32).

Step 3. ∀i ∈ {1, 2, . . . , k} and ∀u_i ∈ {0, 1}, compute the conditional spectral coefficients Q_s(u_i|v) using (4.32) and (4.33).
Step 4. ∀i ∈ {1, 2, . . . , k} and ∀u_i ∈ {0, 1}, accumulate these coefficients to compute the APPs P_0(u_i|v) using (4.41).

Step 5. Derive an estimate û_i of the ith transmitted bit u_i of codeword u at each position i ∈ {1, 2, . . . , k} using

û_i = 0 if ∑_{s=0}^{2^{n−k}−1} Q_s(u_i = 0|v) ≥ ∑_{s=0}^{2^{n−k}−1} Q_s(u_i = 1|v), and û_i = 1 otherwise.    (4.44)
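The five steps above can be sketched directly in numpy. This is a minimal illustration, not the thesis's implementation: the Gilbert-Elliott construction (emission with the current state's crossover probability, followed by the state transition) and the particular parity check matrix H for the (4,2) code are assumptions, chosen to be consistent with the worked example of Section 4.4.

```python
import numpy as np
from itertools import product

# Step 1: channel quantities for a GEC (illustrative parameters).
Pt, Qt, pG, pB = 0.01, 0.2, 0.001, 0.3
A = np.array([[1 - Pt, Pt], [Qt, 1 - Qt]])      # G <-> B state transitions
D0 = np.diag([1 - pG, 1 - pB]) @ A              # no channel error
D1 = np.diag([pG, pB]) @ A                      # channel error
sigma0 = np.array([Qt, Pt]) / (Pt + Qt)         # stationary distribution
e = np.ones(2)
D, Dlt = D0 + D1, D0 - D1                       # (4.29)

# Step 2: dual codewords s*H over GF(2); this H is one valid parity
# check matrix of the (4,2) code of Example 2.1.
H = np.array([[0, 1, 0, 1],
              [1, 1, 1, 0]])
duals = [np.mod(np.array(s) @ H, 2) for s in product((0, 1), repeat=len(H))]

def app0(i, ui, v):
    """Steps 3-4: P0(u_i = ui | v) by accumulating the Q_s of (4.32)."""
    total = 0.0
    for up in duals:                            # one diagonal-trellis row per s
        vec = sigma0
        for j, vj in enumerate(v):
            if j == i:                          # the weighted i-th factor
                sign = (-1) ** (ui * up[j])
                M = 0.5 * sign * (D + (-1) ** ((vj + ui) % 2) * Dlt)
            else:                               # the factors of (4.30)
                M = D if up[j] == 0 else (-1) ** vj * Dlt
            vec = vec @ M
        total += vec @ e
    return total / 2 ** len(H)                  # the accumulation (4.41)

# Step 5: decide the second bit (index 1, 0-based) of v = [1, 1, 1, 0].
v = (1, 1, 1, 0)
apps = {g: app0(1, g, v) for g in (0, 1)}
u_hat = max(apps, key=apps.get)
print(u_hat, round(apps[0], 5), round(apps[1], 5))   # 0 0.00834 0.00325
```

The printed APPs agree with the values derived for this received word in the examples of Section 4.4.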
4.3 Complexity Analysis
There are two common methods of analysing the complexity of an algorithm. Firstly,
one can investigate how much computation time is required for the execution of the
algorithm in terms of the sizes of its input arguments. Secondly, one may also
look at the amount of computer memory which would be required in order to run
the algorithm. This section provides such an analysis of Procedures 4.1 and 4.2 in
comparison to each other and to alternative approaches.
4.3.1 Computational complexity
A reasonable approximation to the execution time requirements of the procedures
may be made by determining the number of multiplications which would be needed
to decode a single word. The addition operations are ignored, since they are of
lower computational complexity. An analysis of the number of operations is made
in terms of the sizes of three main input arguments. These arguments are the total
number n of bits in a codeword, the number k of information bits per codeword,
and the number S of states in the channel model. This analysis is provided using
O notation. Specifically, if f(x) and g(x) are real-valued functions of one variable,
then f(x) is O(g(x)) if and only if there exists x0 ∈ R and a positive constant c such
that
|f(x)| ≤ c · |g(x)| (4.45)
for all x values greater than the threshold x0. Multivariate extensions are similar. In
this way, O notation provides an asymptotic upper bound to a function and permits
a comparison of functions as the magnitude of their arguments tend to infinity.
Computational complexity of the original domain approach
Procedure 4.1 is oriented only towards calculation of the APPs P_0(u_i|v). Although alternative analyses could be made for determining P_s(u_i|v) ∀s ∈ {1, 2, . . . , 2^{n−k}−1}, as is considered in [64] for linear unequal error protection codes, a similar conclusion as to the computational complexity of such an approach would be reached, and this scenario will not be considered here. Therefore, assume the APP P_0(u_i|v) needs to be calculated for both u_i = 0 and u_i = 1, for all values of i ∈ {1, 2, . . . , k}. To calculate each such APP, a sum of 2^{k−1} matrix products of length n needs to be created. The exponent is k−1 because there are 2^k choices of information bits for the codewords, however only half of these will satisfy the condition that u_i has a specific value. The parity check matrix H then ensures that no more choices can be made for the remaining n−k bit positions. Calculation of each matrix product of length n, when pre-multiplied by the stationary state distribution vector σ0 and post-multiplied by the all-ones column vector e, requires O(nS^2) operations when the channel model comprises S states. The overall complexity for decoding a single word using the original domain is therefore O(2^k knS^2).
Computational complexity of the spectral domain approach
Suppose a model has S states and a word is to be decoded using Procedure 4.2. The number of multiplications needed is approximately n × 2^{n−k} × S^2 for each of the two possible binary symbols and for each position i ∈ {1, 2, . . . , k}. This is because there are 2^{n−k} rows in the diagonal trellis, each requiring a vector of length S to be multiplied by an S × S matrix n times. Therefore the overall complexity per word is O(2^{n−k+1} knS^2).
Whilst storage requirements also need to be considered, a major factor in the decision whether the original or spectral domain should be used is determining which of them results in a lower computational complexity. From the discussions in the previous two paragraphs, the only difference in these complexities is the exponent to which the base 2 is raised, namely k versus n−k+1. Simple arithmetic shows that the spectral domain approach of Procedure 4.2 is preferred whenever

k > (n + 1)/2    (4.46)

and the original domain should be used if this is not the case. In summary, the original domain is more suited to lower rate codes and the spectral domain is more appropriate for codes which have a high rate.
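This crossover rule can be captured in a small helper. The function below is purely illustrative (its name and the tie-breaking towards the original domain are choices made here, not part of the thesis):

```python
def preferred_domain(n, k):
    """Choose the decoding domain with the smaller multiplication count.

    Original domain: O(2^k * k*n*S^2); spectral domain: O(2^(n-k+1) * k*n*S^2).
    The common k*n*S^2 factor cancels, so only k versus n-k+1 matters, and
    the spectral domain wins exactly when k > (n + 1)/2, as in (4.46).
    Ties are resolved towards the original domain here.
    """
    return "spectral" if k > (n + 1) / 2 else "original"

assert preferred_domain(15, 11) == "spectral"   # a high-rate Hamming code
assert preferred_domain(4, 2) == "original"     # the rate-1/2 example code
```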
Computational complexity of other known approaches
Some APP decoding algorithms which use a trellis to perform their calculations have been published. For example, the BCJR algorithm [8] uses both a forward and a backward recursion along the decoding trellis. It requires O(2^{n−k+1}kn) operations per codeword of a binary (n, k) linear block code C, since each of the k information bits to be decoded requires two passes along a trellis of size 2^{n−k} × n. The one-sweep algorithm by Johansson and Zigangirov, presented in [9], is a significant improvement over the BCJR algorithm, since it reduces the number of passes per trellis required from two to one. Thus, its complexity is approximately half that of the algorithm in [8]. However, algorithms such as those described in [8] and [9] were developed for memoryless channels, without the concept of states. The advantages of Procedures 4.1 and 4.2 are that they were developed specifically for channel models which possess states, and also, when compared to the BCJR algorithm, that they require only a forward recursion.
4.3.2 Storage requirements
The other main factor in determining the cost of an algorithm is the amount of
data it needs to store in order to carry out its tasks. An algorithm may have a low
computational complexity, however its desirability will be decreased if it has large
storage requirements. In the following analysis, let Y be the quantity of real number storage spaces which would be required for the execution of an algorithm. The O notation again indicates the asymptotic size of such storage as the parameters n, k and S increase.
Storage requirements of the original domain approach
Even though the original domain approach of Procedure 4.1 is based upon calculating the S×S matrix [U_H(u_i)]^{(S)}, the entire matrix U_H(u_i) must in essence be determined. This requires storage of (2^{n−k}S)^2 real numbers. Calculation of P_0(u_i|v) using (4.16) requires a temporary vector of length S, and the space for two real numbers is used to store the results P_0(u_i|v) for u_i = 0 and u_i = 1. Therefore Procedure 4.1 requires space to store 4^{n−k}S^2 + S + 2 real numbers and hence

Y = O(4^{n−k}S^2).    (4.47)
Storage requirements of the spectral domain approach
There is a definite advantage in using the spectral domain when compared to the original domain, because the conditional spectral coefficients Q_s(u_i|v) can be calculated one at a time. Such calculations involve multiplications of a row vector of length S by an S×S matrix. After multiplication by the column vector e, the final result is a real number. As each of the 2^{n−k} conditional spectral coefficients is calculated, it is added to the previous tally. Thus, only the storage space of a single real number is required to determine P_0(u_i|v). For the binary case, only two of these are required in order to arrive at a decoding decision for position i. Therefore, ignoring the space required to store the vectors σ0 and v, as well as the matrices D and ∆, Procedure 4.2 needs storage space of approximately S^2 + S + 2 real numbers. Thus, for the spectral domain,

Y = O(S^2).    (4.48)
Storage requirements of other known approaches
To provide a comparison to these analyses, the BCJR algorithm requires storage of 2^{n−k} vectors, each of length n, and Johansson and Zigangirov's improvement reduces this figure to just 2^{n−k} real numbers [9]. The high storage costs of the BCJR algorithm have prompted many alterations to be made, such as in [65], where the amount of storage required is reduced to

Y = O(2^{n−k})    (4.49)

at the expense of a 1/3 increase in the computational complexity. This analysis shows the attractiveness of using Procedure 4.2 if storage requirements are a factor.
4.4 Instructive Examples
To demonstrate the calculation of the a posteriori probabilities required in Proce-
dures 4.1 and 4.2, a decoding example in both the original and the spectral domain
for communications on a GEC will be provided. The code used in this example
is the same binary (4,2) linear block code as for the example in Section 3.4, thus
making clear the effect of altering the channel model from BSC to GEC. Define the
stationary state distribution vector σ0, the all-ones vector e, the state transition
probabilities P and Q and the crossover probabilities pG and pB for the GEC as
done in Section 2.2.2.
4.4.1 Example of decoding in the original domain
Suppose a codeword from the binary (4,2) linear block code
C = {[0, 0, 0, 0], [0, 1, 1, 1], [1, 0, 1, 0], [1, 1, 0, 1]} (4.50)
of Example 2.1 is sent over a GEC. Assume v = [1, 1, 1, 0] is received and the aim
is to use the original domain to decode the second bit transmitted. The first step
is to calculate the trellis matrices Mhj(uj) for each of the four columns hj of the
parity check matrix H and for each possible transmitted bit uj. These were given in
(3.58) to (3.62). The trellis matrices must be weighted by the matrix probabilities
D0 and D1, as defined in (2.42) and (2.43), respectively, in order to determine the
weighted trellis section matrices with respect to a specific transmitted bit uj. The
matrix probability D0 is used when u_j = v_j, as there has not been a transmission error. When u_j ≠ v_j, a transmission error has occurred, and D1 is used instead. The weighted matrices representing the first trellis section for transmitted bits u_1 = 0 and u_1 = 1 can be calculated as
U_{h_1}(0) = M_{h_1}(0) ⊗ D1 =
[ D1  0   0   0
  0   D1  0   0
  0   0   D1  0
  0   0   0   D1 ],    (4.51)

U_{h_1}(1) = M_{h_1}(1) ⊗ D0 =
[ 0   0   D0  0
  0   0   0   D0
  D0  0   0   0
  0   D0  0   0 ].    (4.52)

A weighted matrix representation of the first trellis section regardless of transmitted bit u_1 is given by

U_{h_1} = U_{h_1}(0) + U_{h_1}(1) =
[ D1  0   D0  0
  0   D1  0   D0
  D0  0   D1  0
  0   D0  0   D1 ].    (4.53)
Weighted matrix representations of the third and fourth trellis sections are calculated similarly. It then follows from (4.16) that the APPs required in order to decode the second transmitted bit are given by

P_0(u_2 = 0|v) = σ0 · (D1 D1 D1 D0 + D0 D1 D0 D0) · e,
P_0(u_2 = 1|v) = σ0 · (D0 D0 D1 D1 + D1 D0 D0 D1) · e.    (4.54)
Figure 4.1: Original domain weighted APP decoding trellises for the binary (4,2) linear block code used to compute (a) P(u_2 = 0|v) and (b) P(u_2 = 1|v). (Dashed: s_{j+1} = s_j, solid: s_{j+1} = s_j ⊕ h^T_{j+1}.)
Figure 4.1 shows how the components of the complete set of APPs P_s(u_2|v), s ∈ {0, 1, 2, 3}, are derived from the two trellises for u_2 = 0 and u_2 = 1. The final decoding decision is made by substituting the values of P, Q, pG and pB into (2.38), (2.42) and (2.43), and subsequently into (4.54). For example, if the channel parameters are given as

P = 0.01,  Q = 0.2,  pG = 0.001,  and  pB = 0.3,    (4.55)

then

P_0(u_2 = g|v) = 8.34 × 10^{−3} if g = 0, and 3.25 × 10^{−3} if g = 1.    (4.56)

Therefore P_0(u_2 = 0|v) > P_0(u_2 = 1|v), and so by (4.17), û_2 = 0.
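The two APPs in (4.54) are easy to check numerically. The sketch below assumes one common Gilbert-Elliott construction, in which each matrix probability combines the current state's crossover probability with the state transitions; under that assumption it reproduces the values in (4.56):

```python
import numpy as np

P, Q, pG, pB = 0.01, 0.2, 0.001, 0.3            # parameter set (4.55)
A = np.array([[1 - P, P], [Q, 1 - Q]])          # G <-> B state transitions
D0 = np.diag([1 - pG, 1 - pB]) @ A              # no transmission error
D1 = np.diag([pG, pB]) @ A                      # transmission error
sigma0 = np.array([Q, P]) / (P + Q)             # stationary distribution
e = np.ones(2)

# the two sums of weighted trellis paths from (4.54)
app0 = sigma0 @ (D1 @ D1 @ D1 @ D0 + D0 @ D1 @ D0 @ D0) @ e
app1 = sigma0 @ (D0 @ D0 @ D1 @ D1 + D1 @ D0 @ D0 @ D1) @ e
print(round(app0, 5), round(app1, 5))           # 0.00834 0.00325, as in (4.56)
assert app0 > app1                              # hence the decision u2 = 0
```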
4.4.2 Example of decoding in the spectral domain
The elements u⊥_{s,j}, j = 1, 2, 3, 4, of the dual codewords u⊥_s = [u⊥_{s,1}, u⊥_{s,2}, u⊥_{s,3}, u⊥_{s,4}], s = 0, 1, 2, 3, which are needed to evaluate (4.18) can be expressed as the rows of the set

C⊥ =
[ 0 0 0 0
  0 1 0 1
  1 1 1 0
  1 0 1 1 ].    (4.57)
Equation (4.32) gives

Q_s(u_2|v) = (−1)^{v_1·u⊥_{s,1}} ( ū⊥_{s,1} D + u⊥_{s,1} ∆ ) × (1/2) [ (−1)^{u_2·u⊥_{s,2}} D + (−1)^{(u_2·u⊥_{s,2}) + (v_2 ⊕ u_2)} ∆ ] × ∏_{j=3}^{4} [ (−1)^{v_j·u⊥_{s,j}} ( ū⊥_{s,j} D + u⊥_{s,j} ∆ ) ],    (4.58)
where D and ∆ are defined in (2.36) and (2.44), respectively. In total, eight conditional spectral coefficient matrices need to be calculated. One is required for each value of s ∈ {0, 1, 2, 3}, and both u_2 = 0 and u_2 = 1 must be treated in each of these four cases. Substituting the necessary values of the transmitted bit u_2, the received word v = [1, 1, 1, 0] and the entries u⊥_{s,j} of the dual codewords yields

Q_0(u_2 = 0|v) = D · ((D−∆)/2) · D · D,
Q_1(u_2 = 0|v) = D · ((D−∆)/2) · D · ∆,
Q_2(u_2 = 0|v) = (−∆) · ((D−∆)/2) · (−∆) · D,
Q_3(u_2 = 0|v) = (−∆) · ((D−∆)/2) · (−∆) · ∆,    (4.59)
for u_2 = 0 and

Q_0(u_2 = 1|v) = D · ((D+∆)/2) · D · D,
Q_1(u_2 = 1|v) = D · (−(D+∆)/2) · D · ∆,
Q_2(u_2 = 1|v) = (−∆) · (−(D+∆)/2) · (−∆) · D,
Q_3(u_2 = 1|v) = (−∆) · ((D+∆)/2) · (−∆) · ∆,    (4.60)
for u2 = 1. The arrangement of the plus and minus signs preceding the factors of
(4.59) and (4.60) is a reflection of the pattern of zeroes and ones in C⊥ as defined
in (4.57) and in the received word v. The two diagonal trellises of Fig. 4.2 (a) and
(b) demonstrate how the conditional spectral coefficient matrices are used in the
calculation of the scalars Qs(u2|v). Namely, in each row of the trellis, the stationary
state distribution vector σ0 is successively multiplied by each of the four matrices
of the corresponding Qs(u2|v) matrix product and then finally multiplied by the
column vector e to arrive at a scalar result.
To calculate the two APPs P_0(u_2|v), it is necessary to find the mean of these four scalars. The required calculations may be written as

P_0(u_2 = 0|v) = (1/4) ∑_{s=0}^{3} Q_s(u_2 = 0|v)
= (1/8) σ0 ( D^4 − D∆D^2 + D^3∆ − D∆D∆ + ∆D∆D − ∆^3D + ∆D∆^2 − ∆^4 ) e,    (4.61)

P_0(u_2 = 1|v) = (1/4) ∑_{s=0}^{3} Q_s(u_2 = 1|v)
= (1/8) σ0 ( D^4 + D∆D^2 − D^3∆ − D∆D∆ − ∆D∆D − ∆^3D + ∆D∆^2 + ∆^4 ) e.    (4.62)
Once the values P, Q, pG and pB of a particular parameter set have been substituted into D, ∆ and σ0, in (2.36), (2.44) and (2.38) respectively, the results can be used in (4.61) and (4.62) to determine the APPs. Assuming the same parameter set as for the original domain as given in (4.55), it follows that

P_0(u_2 = g|v) = 8.34 × 10^{−3} if g = 0, and 3.25 × 10^{−3} if g = 1.    (4.63)

This is the same result as for the original domain, and therefore û_2 = 0. Procedures 4.1 and 4.2 are two different ways of obtaining the same answer.
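The equality of the two domains can also be confirmed by evaluating the expansions (4.61) and (4.62) term by term. As before, the Gilbert-Elliott construction below (emission with the current state's crossover probability, then the state transition) is an assumption consistent with the numbers of this example; L stands for ∆:

```python
import numpy as np

P, Q, pG, pB = 0.01, 0.2, 0.001, 0.3            # parameter set (4.55)
A = np.array([[1 - P, P], [Q, 1 - Q]])          # state transitions
D0 = np.diag([1 - pG, 1 - pB]) @ A              # no transmission error
D1 = np.diag([pG, pB]) @ A                      # transmission error
sigma0 = np.array([Q, P]) / (P + Q)             # stationary distribution
e = np.ones(2)
D, L = D0 + D1, D0 - D1                         # (4.29); L denotes Delta

# term-by-term evaluations of (4.61) and (4.62)
app0 = sigma0 @ (D@D@D@D - D@L@D@D + D@D@D@L - D@L@D@L
                 + L@D@L@D - L@L@L@D + L@D@L@L - L@L@L@L) @ e / 8
app1 = sigma0 @ (D@D@D@D + D@L@D@D - D@D@D@L - D@L@D@L
                 - L@D@L@D - L@L@L@D + L@D@L@L + L@L@L@L) @ e / 8
print(round(app0, 5), round(app1, 5))           # 0.00834 0.00325, matching (4.63)
```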
4.5 Simulation Results
To provide a sample of the performance of codes used for transmission over a channel
described by a stochastic automaton in conjunction with Procedure 4.2, computer
Figure 4.2: Weighted diagonal trellises of the binary (4,2) linear block code used for computing the spectral coefficients (a) Q_s(u_2 = 0|v) and (b) Q_s(u_2 = 1|v); s = 0, 1, 2, 3.
simulations were carried out using MATLAB®. As there are many combinations
of parameter values which can be chosen, and many binary linear block codes in
existence, only some are presented here. Note that Procedure 4.1 would have made
the same decoding decisions and hence produced the same results. However, as was
discussed in Section 4.3, more storage space would be required if the original domain
were used.
4.5.1 Description of parameter values in these simulations
In these simulations, the binary (7,4) Hamming code in systematic form as described
in (3.104) is used for transmission over a GEC. The model of this channel has four
parameters which may independently vary. A complete analysis of APP decoding
of this code would then require an investigation in four different dimensions. A
small subset of the parameter space is selected for simulation, and is defined by the
constraints

P ∈ {10^{−7}, 10^{−6}, 10^{−5}, 3×10^{−5}, 10^{−4}, 3×10^{−4}, . . . , 1},
Q ∈ {0.01, 0.3},
pG ∈ {10^{−4}, 10^{−3}, 10^{−2}, 10^{−1}},
pB ∈ {0.1, 0.5}.    (4.64)
After decoding a large number of received words for each choice of the four parameter
values, the BER was calculated. The results are displayed in Fig. 4.3 for the case
Q = 0.01, and Fig. 4.4 for the case Q = 0.3. To further distinguish the results, the
BERs for situations where pB is set to 0.1 are presented in subfigure (a); those for
situations where pB is set to 0.5 are presented in subfigure (b).
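A Monte-Carlo run of this kind needs a source of bursty error patterns. A minimal sampler for Gilbert-Elliott error sequences could look as follows; the function name is illustrative, and the convention that a state emits with its own crossover probability before making its transition is an assumption:

```python
import numpy as np

def gec_errors(length, P, Q, pG, pB, rng):
    """Sample one error pattern from a two-state Gilbert-Elliott channel."""
    errs = np.empty(length, dtype=int)
    # start from the stationary distribution [Q, P] / (P + Q)
    state = 0 if rng.random() < Q / (P + Q) else 1
    for t in range(length):
        errs[t] = rng.random() < (pG if state == 0 else pB)
        if state == 0:
            state = 1 if rng.random() < P else 0    # G -> B with prob. P
        else:
            state = 0 if rng.random() < Q else 1    # B -> G with prob. Q
    return errs

rng = np.random.default_rng(1)
errs = gec_errors(200_000, 0.01, 0.2, 0.001, 0.3, rng)
# the long-run error rate should approach the stationary average
# (Q*pG + P*pB) / (P + Q) = 0.01524 for this parameter set
```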
4.5.2 Observations from the simulations
As expected, for each value of the crossover probability pG in the ‘good’ state G, the
performance degrades as the value of the state transition probability P is increased,
since this corresponds to an increased probability that the channel is in the ‘bad’
state B. In addition, the BER decreases as pG decreases.
For each value of pG, an error floor is reached at a different value of P . This is
because as P approaches zero, the GEC is in essence a BSC with crossover prob-
ability pG, which produces a particular BER when used in conjunction with this
code and decoding procedure. The differing error floors exist due to the different
values of pG relative to P . The curves for pG = pB = 0.1 in Figs. 4.3(a) and 4.4(a)
are horizontal. Since the crossover probability is equal in both states, the BER is
Figure 4.3: Performance of the (7,4) Hamming code on a GEC with Q = 0.01 and (a) pB = 0.1, (b) pB = 0.5.
Figure 4.4: Performance of the (7,4) Hamming code on a GEC with Q = 0.3 and (a) pB = 0.1, (b) pB = 0.5.
independent of P. The channel is in fact a BSC. Finally, note that the channel models in Fig. 4.4, where the state transition probability Q = 0.3, reach their error floor at higher values of P compared to the corresponding models in Fig. 4.3, where Q = 0.01. This supports the idea that the average fade to connection time ratio x provides a better indication of the channel's behaviour than P or Q individually.
In summary, Figs. 4.3 and 4.4 demonstrate that Procedure 4.2 produces the
expected results in terms of the parameters of binary GECs when used with the
(7,4) Hamming code. The link to a BSC as a degenerate case of a GEC was noted,
as was the importance of the ratio between state transition probabilities P and Q.
4.6 Summary
This chapter has provided two solutions to the problem of APP decoding for bi-
nary linear block codes over channels described by stochastic automata. As in the
memoryless channel model situation from Chapter 3, one solution used the origi-
nal domain, whilst the other used the spectral domain. However, these solutions
are more advanced than their predecessors because they use matrix probabilities
to consider all possible state transitions introduced by the discrete channel with
memory.
In the original domain approach, the trellis section matrices were weighted by
one of four possible matrix probabilities representing the stochastic properties of
the channel. For a GEC, there are only two such matrix probabilities due to the
symmetry of the constituent BSCs. The result was a collection of weighted trellis
section matrices that could be multiplied together in the right combinations to
produce matrix representations of complete weighted trellises. It was shown how
principal leading submatrices could be used to calculate the necessary APPs from
which the decoding decisions could be made.
After application of the Walsh-Hadamard transform, expressions for weighted
spectral matrices were obtained. A spectral domain trellis can be constructed from
these weighted spectral matrices, however it is easier to work with the matrices
themselves. In this case, strong ties with the dual codewords were observed, and
a change in notation involving the sum and difference of the matrix probabilities
was made to reflect these ties. The APPs for the spectral domain procedure were
calculated by accumulating the conditional spectral coefficients. A decoding decision
in each information bit position was made by selecting the bit which optimised the
APP.
The computational complexity and storage requirements of both approaches were
then analysed. It was shown that the spectral domain method required less storage
space for its execution compared to that of the original domain, and was ideal for
use with block codes of high rate, since its computational complexity was related
to the dimension of the dual code. Many other APP decoding algorithms have
been developed for memoryless channels, whereas the procedures presented in this
chapter were designed for channels with memory.
A decoding example was also provided in both domains to illustrate the various
calculations which are required, and to demonstrate the equality of the two solutions
obtained. Finally, simulations of the transmission of information encoded by a
Hamming code over various GEC models were carried out. The BER performance
observed suggested that increases in either of the crossover probabilities or in the
probability of transition from the ‘good’ state G to the ‘bad’ state B would degrade
the performance of the code. By contrast, increasing the probability of transition
from the ‘bad’ state B to the ‘good’ state G appeared to improve the performance.
Chapter 5
APP Decoding on Non-binary Channels with Memory
The formulation of APP decoding given in Chapter 4 only works for situations where
binary data is being transmitted on the channel. Once the order of the field from
which the symbols are to be selected for transmission rises above two, that decoding
methodology will cease to function. Since more branches on the original domain and
diagonal trellises are required due to the larger number of choices for each of the
transmitted symbols, the elementary trellis and spectral matrices must accordingly
be made larger in size. Additionally, the bipartite “error” or “non-error” model of
the matrix probabilities is no longer adequate. However, if the channel is symmetric,
this problem can be dealt with in a simple way.
Although codes over GF (p), for p an odd prime, are not as common in practice as
binary codes, they do promote the use of symbols containing a higher resolution of
information. For example, the International Standard Book Number (ISBN) system
for library items is based on a code over GF (11). Also, the (11,6) Golay code [66]
as described in (3.108) is one of the few perfect linear codes and is constructed over
GF (3). In addition, Hamming codes constructed over GF (p) are perfect. There
is therefore a need to find good decoding algorithms for non-binary codes. How-
ever, most of the algorithms already developed have been for memoryless channels.
For example, [67] presents MAP decoding methods for non-binary block and con-
volutional codes on a time-discrete memoryless channel. There is a definite lack of
powerful decoding methods for channels with memory such as the GEC.
This chapter is, in a sense, a non-binary analogue of Chapter 4. Section 5.1 gives
the description of the problem which is solved in this chapter. The majority of the
theory is presented in Section 5.2. Section 5.2.1 defines the weighted trellis matrices
for a code over GF (p) on a channel with memory, which leads to the specification of
an APP decoding algorithm for the original domain. Similarly, in Section 5.2.2 the
formulation of the weighted spectral matrices leads to an APP decoding algorithm
for the spectral domain. Further information is given in Section 5.3 about some
of the probabilities involved in the necessary calculations for the spectral domain
algorithm. In Section 5.4, the computational complexity and storage requirements
of the algorithms are discussed. Examples of applying the algorithms to specific
non-binary codes over a finite state channel are given in Section 5.5. Addition-
ally, Section 5.6 provides results of simulations of the algorithms developed in this
chapter. Finally, Section 5.7 summarises the major findings of this study into APP
decoding of non-binary codes for channels with memory.
Therefore, the major contributions of this chapter are as follows:
• Description of an APP decoding procedure for linear block codes over GF (p)
in conjunction with a channel with memory, using the original domain.
• The spectral domain equivalent of the aforementioned procedure.
• Proof of a result regarding probability theory and the conditional spectral
coefficients.
• An analysis of the requirements of both procedures in terms of execution time
and memory usage.
• Computer-based simulations of the spectral domain procedure as applied to
several non-binary linear block codes over a non-binary GEC.
5.1 Problem Statement
Assume C is a systematic (n, k) linear block code over GF(p) defined by the parity check matrix

H = [ h_1, h_2, . . . , h_n ],    (5.1)

where the jth column of H is given by the vector

h_j = [ h_{n−k−1,j}, h_{n−k−2,j}, . . . , h_{0,j} ]^T ∈ [GF(p)]^{n−k}.    (5.2)
Let the channel over which the data encoded by C is transmitted be described by
the stochastic automaton
D = (D,σ0), (5.3)
where the stochastic sequential machine D has identical input and output sets equal to {0, 1, . . . , p−1}, a set S of S states, and p^2 matrix probabilities D(v_j|u_j) of size S×S for a transmitted symbol u_j and received symbol v_j. Furthermore, σ0 is a vector of length S which represents the stationary state distribution of the automaton. According to (2.107), for a received word v = [v_1, v_2, . . . , v_n] the APP decoding decision û_i for the ith symbol of the transmitted codeword u is given by

û_i = arg max_{g∈GF(p)} { σ0 [ ∑_{u∈C, u_i=g} ∏_{j=1}^{n} D(v_j|u_j) ] e },    (5.4)

for i ∈ {1, 2, . . . , k}, where e is an all-ones column vector of length S. If the DMC corresponding to each of the S states is symmetric, then it was shown in Section 2.2.2 that the matrix probabilities D(v_j|u_j) depend on whether u_j and v_j are equal or not equal. Specifically, they may be written as

D(v_j|u_j) = D_{v_j⊖u_j} ∈ {D0, Dǫ}.    (5.5)

The value of û_i is found for each position i ∈ {1, 2, . . . , k} by comparing the p different APP values which result from the matrix arithmetic within (5.4) and selecting the estimate which results in the largest APP value. Again, this may be achieved in either the original or spectral domain.
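The decision rule (5.4) can already be evaluated by brute force at this point, before any trellis machinery is introduced. The sketch below uses an illustrative ternary single parity check code and a hypothetical two-state channel whose per-state DMCs are p-ary symmetric, so that (5.5) applies; all parameter values are chosen for illustration only:

```python
import numpy as np
from itertools import product

p = 3                                            # GF(3) signalling alphabet
# ternary (3,2) single parity check code: u1 + u2 + u3 = 0 (mod 3)
C = [u + ((-sum(u)) % p,) for u in product(range(p), repeat=2)]

# two-state channel; each state is a p-ary symmetric DMC, so the matrix
# probabilities depend only on whether v_j equals u_j, as in (5.5)
A = np.array([[0.99, 0.01], [0.2, 0.8]])         # state transitions
pG, pB = 0.01, 0.3                               # per-state error rates
D0 = np.diag([1 - pG, 1 - pB]) @ A               # v_j = u_j
De = np.diag([pG / (p - 1), pB / (p - 1)]) @ A   # each of the p-1 wrong symbols
sigma0 = np.array([0.2, 0.01]) / 0.21            # stationary distribution
e = np.ones(2)

def decode(i, v):
    """APP decision (5.4) for position i by summing over the codewords."""
    apps = []
    for g in range(p):
        M = sum(np.linalg.multi_dot([D0 if uj == vj else De
                                     for uj, vj in zip(u, v)])
                for u in C if u[i] == g)
        apps.append(sigma0 @ M @ e)
    return int(np.argmax(apps))

# a received word that is itself a codeword decodes to its own symbols
assert [decode(i, (1, 2, 0)) for i in range(3)] == [1, 2, 0]
```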
5.2 Non-binary APP Decoding
The decoding methods presented in this section combine two aspects. One is
the theory of matrix representations for non-binary codes, which was presented
in Chapter 3. The other is the two decoding strategies for channels with memory,
which were presented in Chapter 4. Firstly, in order to reach a decoding decision for
each information symbol position, matrix representations of all viable paths through
the original domain trellis for all values of information symbol are found. After each
matrix representation of a viable path is weighted by its probability of occurring,
a decoding decision for each information symbol position is reached. Then in the
spectral domain, the matrix representations are diagonalised using a similarity trans-
formation. After calculation of the conditional spectral coefficients, conversion to
the original domain APPs is simple.
5.2.1 Original domain
The first step in APP decoding using the original domain for a non-binary linear
block code when the channel has memory is the creation of a weighted matrix
representation of each trellis section. The correct combinations of these matrix
representations must then be multiplied in order to arrive at values for the APPs,
from which decoding decisions can be made.
Weighted trellis matrices
If the jth transmitted symbol is uj, then a matrix representation of the jth trellis
section is the pn−k ×pn−k matrix Mhj(uj), as defined in (3.21) and (3.22). However,
the trellis section matrices must be weighted by the relevant matrix probabilities for
the channel model. Suppose the set of possible states for the stochastic sequential
machine D is
S = \{S_m \mid m \in \{1, 2, \ldots, S\}\}, \qquad (5.6)
with a symmetric DMC corresponding to each state. A weighted trellis matrix for
the jth trellis section is then obtained by taking the Kronecker product of the trellis
matrix representation for column hj and transmitted symbol uj with the relevant
matrix probability based on the sent symbol uj and received symbol vj. That is,
U_{h_j}(u_j) = M_{h_j}(u_j) \otimes D_{v_j \ominus u_j}, \qquad (5.7)
where by (2.53),
D_{v_j \ominus u_j} =
\begin{cases}
D_0 & \text{if } u_j = v_j, \\
D_\epsilon & \text{if } u_j \neq v_j.
\end{cases} \qquad (5.8)
By analogy with the binary case, a weighted trellis section matrix irrespective of
symbol uj is given by the sum of the weighted trellis section matrices Uhj(uj) over
all uj ∈GF (p). This relationship may be expressed as
U_{h_j} = \sum_{g=0}^{p-1} M_{h_j}(g) \otimes D_{v_j \ominus g}. \qquad (5.9)
Using the same methodology for APP decoding as presented in Section 4.2.1 for
binary codes, the representation of the entire weighted trellis for the ith transmitted
symbol ui is the square matrix UH(ui) of dimension pn−kS. This matrix may be
written as
U_H(u_i) = \prod_{j=1}^{i-1} U_{h_j} \cdot U_{h_i}(u_i) \cdot \prod_{j=i+1}^{n} U_{h_j}
= \prod_{j=1}^{i-1} \Big[ \sum_{u_j=0}^{p-1} M_{h_j}(u_j) \otimes D_{v_j \ominus u_j} \Big] \times \big[ M_{h_i}(u_i) \otimes D_{v_i \ominus u_i} \big] \times \prod_{j=i+1}^{n} \Big[ \sum_{u_j=0}^{p-1} M_{h_j}(u_j) \otimes D_{v_j \ominus u_j} \Big]. \qquad (5.10)
Determining the a posteriori probabilities in the original domain
In order to obtain a length-pn−k vector of APPs, the p-ary analogue of (4.13) is used:
P(u_i|v) = (\tau_0 \otimes \sigma_0) \cdot U_H(u_i) \cdot (I_{p^{n-k}} \otimes e), \qquad (5.11)
where
\tau_0 = [1, 0, \ldots, 0] \qquad (5.12)
denotes a vector of length pn−k. In this way, only paths commencing at the 0th
node of the trellis are used. The vector P(ui|v) consists of entries Pt(ui|v) for
t = 0, 1, \ldots, p^{n-k}-1, with the tth entry denoting the probability that the ith transmitted symbol was u_i, given that a word v was received and that the encoding was
performed by mapping a p-ary vector of k information symbols onto length-n words
of the tth coset Vt. The required APP for encoding onto the 0th coset, which is C
itself, is thus P0(ui|v). This is the only situation which will be considered here.
Then (5.11) can be rewritten in order to calculate P0(ui|v) as
P_0(u_i|v) = \sigma_0 \cdot [U_H(u_i)]^{(S)} \cdot e. \qquad (5.13)
The estimate \hat{u}_i for the ith transmitted symbol is the element u_i of GF(p) resulting in the highest value of P_0(u_i|v), so that
\hat{u}_i = \arg\max_{u_i \in GF(p)} \{P_0(u_i|v)\}. \qquad (5.14)
APP decoding procedure in the original domain
The following procedure can be used to perform APP decoding in the original domain
for an (n, k) linear block code over GF (p) and a non-binary channel described by a
stochastic automaton.
Procedure 5.1. Given is an (n, k) linear block code C in standard form over GF (p),
to be used on a channel which is described by a stochastic automaton containing a
stochastic sequential machine which has S < ∞ states. A p-ary DMC in standard
form corresponds to each of the states. The linear block code C shall be defined by
parity check matrix H. A codeword u is transmitted over the channel and a word v
is received. APP decoding in the original domain consists of the following steps.
Step 1. Use the state transition and crossover probabilities to find the stationary
state distribution vector σ0 and the two S×S matrix probabilities D0 and Dǫ.
Step 2. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ GF (p), compute the trellis section matrix
Mhj(uj) for column hj and transmitted symbol uj using (3.22).
Step 3. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ GF (p), compute the weighted trellis section
matrix Uhj(uj) using (5.7).
Step 4. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ GF (p), compute the weighted trellis matrix
UH(ui) for the full weighted trellis using (5.10).
Step 5. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ GF (p), let [UH(ui)](S) be the Sth principal
leading submatrix of UH(ui) and calculate the a posteriori probability P0(ui|v)
using (5.13).
Step 6. Derive an estimate \hat{u}_i for the transmitted symbol u_i of codeword u at each position i ∈ {1, 2, . . . , k} using (5.14).
Procedure 5.1 performs the same task as Procedure 4.1 for a code over an al-
phabet of size p, rather than an alphabet of size 2. The major differences are that
the matrix probability D1 corresponding to an error event has been replaced by the
matrix probability Dǫ, and the elementary trellis matrices are now of size p × p,
causing the overall trellis structure to be correspondingly larger. As was shown
for binary codes in Chapter 4, the decoding process can also be carried out in the
spectral domain. This approach is described in the next subsection.
5.2.2 Spectral domain
Here, the APPs are calculated through the intermediate step of determining the
conditional spectral coefficients. That is, diagonalised matrix representations of the
original domain are weighted by the necessary probabilities of symbol error and
non-error. Products of combinations of these diagonal matrices, corresponding to
the entire trellis, are then calculated. Finally, a matrix transformation back to the
original domain delivers the APPs, through which the decoding decisions may be
reached.
Weighted spectral matrices
From Chapter 3, the spectral representation of the trellis section corresponding to
the jth column hj of H under the postulate that uj is the jth transmitted symbol is
given by
\Lambda_{h_j}(u_j) = \mathrm{diag}\{w^{u_j \cdot u^\perp_{s,j}}\}, \qquad (5.15)
where w = e^{-2\pi i/p} is a complex pth root of unity. Note that (5.15) corresponds to
the original domain representation of the jth trellis section after application of the
similarity transformation Wpn−k , resulting in a diagonal matrix. In other words,
\Lambda_{h_j}(u_j) = W^{-1}_{p^{n-k}} M_{h_j}(u_j) W_{p^{n-k}}, \qquad (5.16)
where W_{p^{n-k}} is the complex generalisation of the binary Walsh-Hadamard transformation matrix, defined by (3.37) and (3.43). Recall that the inverse of the matrix W_{p^{n-k}} is \frac{1}{p^{n-k}} W^H_{p^{n-k}}, as given in (3.44). The spectral matrices for each
trellis section must then be weighted by the relevant matrix probabilities D(vj|uj).
This can be seen by considering the transformed versions of the weighted trellis sec-
tion matrices Uhj(uj) of the original domain. Following the same strategy as was
adopted in Section 4.2.2 for binary channels with memory, applying the generalised
Walsh-Hadamard transform to the weighted trellis section matrix for the jth trellis
section and transmitted symbol uj results in the weighted spectral section matrix
\Theta_{h_j}(u_j) = (W^{-1}_{p^{n-k}} \otimes I_S) \cdot U_{h_j}(u_j) \cdot (W_{p^{n-k}} \otimes I_S)
= W^{-1}_{p^{n-k}} M_{h_j}(u_j) W_{p^{n-k}} \otimes D_{v_j \ominus u_j}
= \Lambda_{h_j}(u_j) \otimes D_{v_j \ominus u_j}
= \mathrm{diag}\{w^{u_j \cdot u^\perp_{s,j}} D_{v_j \ominus u_j}\}. \qquad (5.17)
A weighted spectral section matrix irrespective of the transmitted symbol is given
by the sum, over all uj ∈ GF (p), of the weighted trellis section matrices defined in
(5.17). Therefore,
\Theta_{h_j} = \sum_{u_j \in GF(p)} \Theta_{h_j}(u_j) = \mathrm{diag}\Big\{ \sum_{u_j=0}^{p-1} w^{u_j \cdot u^\perp_{s,j}} D_{v_j \ominus u_j} \Big\}. \qquad (5.18)
The complete spectral matrix is then constructed by multiplying each of the weighted
spectral section matrices in sequence to create a matrix product. When decoding
the ith symbol, the ith weighted spectral section matrix used in this product must
be taken with respect to transmitted symbol ui. The matrix representation of the
entire diagonal trellis can be written as
\Theta_H(u_i) = \prod_{j=1}^{i-1} \Theta_{h_j} \cdot \Theta_{h_i}(u_i) \cdot \prod_{j=i+1}^{n} \Theta_{h_j} = \mathrm{diag}\{\mathbf{Q}_s(u_i|v)\}, \qquad (5.19)
where the block diagonal submatrices of ΘH(ui) can be expressed as
\mathbf{Q}_s(u_i|v) = \prod_{j=1}^{i-1} \Big( \sum_{u_j=0}^{p-1} w^{u_j \cdot u^\perp_{s,j}} D_{v_j \ominus u_j} \Big) \times \big( w^{u_i \cdot u^\perp_{s,i}} D_{v_i \ominus u_i} \big) \times \prod_{j=i+1}^{n} \Big( \sum_{u_j=0}^{p-1} w^{u_j \cdot u^\perp_{s,j}} D_{v_j \ominus u_j} \Big). \qquad (5.20)
Some simplifications to (5.20) can be made by considering (5.8). Firstly,
\sum_{u_j=0}^{p-1} w^{u_j \cdot u^\perp_{s,j}} D_{v_j \ominus u_j} = w^{v_j \cdot u^\perp_{s,j}} D_0 + \big[ w^{0 \cdot u^\perp_{s,j}} + w^{1 \cdot u^\perp_{s,j}} + \ldots + w^{(p-1) \cdot u^\perp_{s,j}} - w^{v_j \cdot u^\perp_{s,j}} \big] D_\epsilon. \qquad (5.21)
(5.21)
The expression in (5.21) has a different value depending on whether u⊥s,j is zero or
nonzero. For the cases where u⊥s,j = 0, the result can be expressed as
\sum_{u_j=0}^{p-1} w^{u_j \cdot u^\perp_{s,j}} D_{v_j \ominus u_j} = D_0 + (p-1) D_\epsilon. \qquad (5.22)
Before considering the case of u^\perp_{s,j} \neq 0, observe the following lemma concerning
complex roots of unity.
Lemma 5.2.1. For a complex pth root of unity w \neq 1, \sum_{g=1}^{p-1} w^g = -1.
Proof. The sum may be written explicitly as
\sum_{g=1}^{p-1} w^g = w + w^2 + \ldots + w^{p-1}. \qquad (5.23)
Then, multiplication by (1 − w) results in
(1-w) \sum_{g=1}^{p-1} w^g = w + w^2 + \ldots + w^{p-1} - w^2 - w^3 - \ldots - w^p
= w - w^p
= w - 1, \qquad (5.24)
which in turn implies that
\sum_{g=1}^{p-1} w^g = \frac{w-1}{1-w} = -1. \qquad (5.25)
For the case u^\perp_{s,j} \neq 0, applying Lemma 5.2.1 to evaluate (5.21) results in
\sum_{u_j=0}^{p-1} w^{u_j \cdot u^\perp_{s,j}} D_{v_j \ominus u_j} = w^{v_j \cdot u^\perp_{s,j}} D_0 + \big[ (w^0 + w^1 + \ldots + w^{p-1}) - w^{v_j \cdot u^\perp_{s,j}} \big] D_\epsilon = w^{v_j \cdot u^\perp_{s,j}} (D_0 - D_\epsilon). \qquad (5.26)
Reviewing the structure of (5.20) with the benefit of (5.22) and (5.26) reveals that
for n ≥ 3, changing the notation to reflect the connection with the dual code makes
it more compact. This alternative notation was first discussed in (2.56) and (2.57).
Explicitly, set
D = D_0 + (p-1) D_\epsilon, \qquad \Delta = D_0 - D_\epsilon. \qquad (5.27)
It then follows from (5.22), (5.26) and (5.27) that
\sum_{u_j=0}^{p-1} w^{u_j \cdot u^\perp_{s,j}} D_{v_j \ominus u_j} =
\begin{cases}
D & \text{if } u^\perp_{s,j} = 0, \\
w^{v_j \cdot u^\perp_{s,j}} \Delta & \text{if } u^\perp_{s,j} \neq 0,
\end{cases}
= w^{v_j \cdot u^\perp_{s,j}} \big[ \delta_{u^\perp_{s,j},0} D + (1 - \delta_{u^\perp_{s,j},0}) \Delta \big], \qquad (5.28)
where δ_{a,b} denotes the Kronecker delta, which has value 1 if a and b are equal and value 0 otherwise.
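The collapse of the sum in (5.28) to either D or w^{v_j·u⊥_{s,j}}∆ can be checked numerically. The identity is linear in D_0 and D_ε, so it holds for any pair of matrices; the sketch below (variable names mine) uses random ones.

```python
import numpy as np

# Check (5.28): sum over u_j of w^(u_j*h) * D_{v_j - u_j} equals D when
# h = 0 and w^(v_j*h) * Delta otherwise, where h plays the role of the
# dual-codeword entry u-perp_{s,j}.
rng = np.random.default_rng(1)
p, S = 5, 2
w = np.exp(-2j * np.pi / p)
D0, De = rng.random((S, S)), rng.random((S, S))
D, Delta = D0 + (p - 1) * De, D0 - De            # (5.27)
for vj in range(p):
    for h in range(p):
        lhs = sum(w ** (uj * h) * (D0 if uj == vj else De) for uj in range(p))
        rhs = D if h == 0 else w ** (vj * h) * Delta
        assert np.allclose(lhs, rhs)
```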
The central term w^{u_i \cdot u^\perp_{s,i}} D_{v_i \ominus u_i} in (5.20) has two distinct values depending on whether u_i and v_i are equal or not. Using the notation in (5.27), the term may be written as
w^{u_i \cdot u^\perp_{s,i}} D_{v_i \ominus u_i} =
\begin{cases}
\frac{w^{u_i \cdot u^\perp_{s,i}}}{p} \big[ D + (p-1)\Delta \big] & \text{if } u_i = v_i, \\
\frac{w^{u_i \cdot u^\perp_{s,i}}}{p} (D - \Delta) & \text{if } u_i \neq v_i,
\end{cases}
= \frac{w^{u_i \cdot u^\perp_{s,i}}}{p} \big[ D + (\delta_{u_i,v_i}\, p - 1) \Delta \big]. \qquad (5.29)
Finally, substituting (5.28) and (5.29) into (5.20) produces the conditional spectral coefficient matrices
\mathbf{Q}_s(u_i|v) = \prod_{j=1}^{i-1} w^{v_j \cdot u^\perp_{s,j}} \big[ \delta_{u^\perp_{s,j},0} D + (1 - \delta_{u^\perp_{s,j},0}) \Delta \big] \times \frac{w^{u_i \cdot u^\perp_{s,i}}}{p} \big[ D + (\delta_{u_i,v_i}\, p - 1) \Delta \big] \times \prod_{j=i+1}^{n} w^{v_j \cdot u^\perp_{s,j}} \big[ \delta_{u^\perp_{s,j},0} D + (1 - \delta_{u^\perp_{s,j},0}) \Delta \big]. \qquad (5.30)
These are converted to the conditional spectral coefficients Q_s(u_i|v), as a weighted average over the S initial states, using the relationship
Q_s(u_i|v) = \sigma_0 \cdot \mathbf{Q}_s(u_i|v) \cdot e. \qquad (5.31)
The set of p^{n-k} such scalars is collected together to form the vector of conditional spectral coefficients
Q(u_i|v) = \big[ Q_0(u_i|v), Q_1(u_i|v), \ldots, Q_{p^{n-k}-1}(u_i|v) \big]. \qquad (5.32)
Determining the a posteriori probabilities in the spectral domain
In order to convert between the APPs of the original domain and the spectral do-
main coefficients, the first row of the complex generalisation of the Walsh-Hadamard
matrix is required.
Lemma 5.2.2. The first row or column of the complex Walsh-Hadamard transform
matrix Wpn−k consists entirely of ones, and hence the sum of the entries in this row
or column is pn−k.
Proof. This may be proved inductively. The base case is given by the definition of
W_p in (3.37). For d ∈ Z^+, assume that the first row or column of W_{p^d} consists entirely of ones. Then, (3.43) shows that the first row or column of W_{p^{d+1}} contains only ones. Thus, by the Principle of Mathematical Induction, for any value of n−k, the first row or column of W_{p^{n-k}} consists entirely of ones, and the sum of its entries is p^{n-k}.
Rearranging (5.17) and (5.19) before substituting into (5.11) gives
P(u_i|v) = (\tau_0 \otimes \sigma_0)(W_{p^{n-k}} \otimes I_S) \cdot \Theta_H(u_i) \cdot (W^{-1}_{p^{n-k}} \otimes I_S)(I_{p^{n-k}} \otimes e). \qquad (5.33)
By properties of the Kronecker product, it follows that
(\tau_0 \otimes \sigma_0)(W_{p^{n-k}} \otimes I_S) = \iota_0 \otimes \sigma_0, \qquad (5.34)
where
\iota_0 = [1, 1, \ldots, 1]. \qquad (5.35)
Additionally, the rightmost two products of (5.33) may be simplified as
(W^{-1}_{p^{n-k}} \otimes I_S)(I_{p^{n-k}} \otimes e) = W^{-1}_{p^{n-k}} \otimes e. \qquad (5.36)
Further consideration of (5.19), (5.34) and (5.36) in relation to (5.33) reveals that
P(u_i|v) = (\iota_0 \otimes \sigma_0) \cdot \mathrm{diag}\{\mathbf{Q}_s(u_i|v)\} \cdot (W^{-1}_{p^{n-k}} \otimes e)
= \frac{1}{p^{n-k}} (\iota_0 \otimes \sigma_0) \cdot \mathrm{diag}\{\mathbf{Q}_s(u_i|v)\} \cdot (W^H_{p^{n-k}} \otimes e) \qquad (5.37)
via application of (3.44). Looking at the structure of the vector of conditional
spectral coefficients described in (5.31) and (5.32) allows (5.37) to be rewritten as
P(u_i|v) = \frac{1}{p^{n-k}} Q(u_i|v) W^H_{p^{n-k}}. \qquad (5.38)
Since the APPs which are required for decoding are P_0(u_i|v), for u_i ∈ GF(p), the first element of the vector in (5.38) is extracted with the assistance of (5.35), resulting in
P_0(u_i|v) = \frac{1}{p^{n-k}} \sum_{s=0}^{p^{n-k}-1} Q_s(u_i|v). \qquad (5.39)
A decision rule to determine the estimate \hat{u}_i of the ith transmitted symbol u_i in the codeword u can be formulated by comparing the sums of the conditional spectral coefficients. This decision rule may be expressed as
\hat{u}_i = \arg\max_{g \in GF(p)} \Big\{ \sum_{s=0}^{p^{n-k}-1} Q_s(u_i = g|v) \Big\}. \qquad (5.40)
APP decoding procedure in the spectral domain
Procedure 5.2 is an alternative to Procedure 5.1 which uses the spectral domain. It
requires knowledge of the parity check matrix H, the received word v, and the state
transition and crossover probabilities of the channel model.
Procedure 5.2. Given is an (n, k) linear block code C in standard form over GF (p),
to be used on a channel which is described by a stochastic automaton containing a
stochastic sequential machine which has S < ∞ states. A p-ary DMC in standard
form corresponds to each of the states. The linear block code C shall be defined by
parity check matrix H. A codeword u is transmitted over the channel and a word v
is received. APP decoding in the spectral domain consists of the following steps.
Step 1. Use the state transition and crossover probabilities to find the stationary
state distribution vector σ0 and the matrix probabilities D0 and Dǫ. Formulate
the state transition matrix D and the difference matrix ∆ in terms of the
matrix probabilities D0 and Dǫ using (5.27).
Step 2. ∀s = dec(s) ∈ {0, 1, . . . , p^{n−k} − 1}, compute dual codewords
u^\perp_s = s \cdot H = \big[ u^\perp_{s,1}, u^\perp_{s,2}, \ldots, u^\perp_{s,n} \big] \in C^\perp, \qquad (5.41)
which define the arrangement of the D and ∆ matrices in (5.30).
Step 3. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ GF (p), compute the complete set of condi-
tional spectral coefficients Qs(ui|v) using (5.30) and (5.31).
Step 4. ∀i ∈ {1, 2, . . . , k} and ∀u_i ∈ GF(p), accumulate all p^{n−k} of these coefficients to compute the APPs P_0(u_i|v) using (5.39).
Step 5. Use the decision rule in (5.40) to determine an estimate \hat{u}_i of the ith transmitted symbol u_i, for each position i ∈ {1, 2, . . . , k}.
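Procedure 5.2 can likewise be sketched in Python. The sketch below (function and variable names mine, not from the thesis) is wired up with the GF(3) example treated in Section 5.5: the GEC parameters of (5.60), the received word v = [1, 2, 2, 0], and a parity check matrix reconstructed from the dual codewords listed in (5.81). It evaluates (5.30), (5.31) and (5.39) directly.

```python
import numpy as np

p, n, k, S = 3, 4, 2, 2
H = np.array([[2, 2, 1, 0], [1, 2, 0, 1]])      # rows generate the dual code, cf. (5.81)
P, Q, pG, pB = 1e-4, 1e-2, 1e-3, 1e-2           # GEC parameters, (5.60)
T = np.array([[1 - P, P], [Q, 1 - Q]])
D0 = np.diag([1 - 2 * pG, 1 - 2 * pB]) @ T      # (5.61)
De = np.diag([pG, pB]) @ T                      # (5.62)
D, Dl = D0 + (p - 1) * De, D0 - De              # (5.27)
sigma0 = np.array([Q / (P + Q), P / (P + Q)])   # (5.64)
e = np.ones(S)
w = np.exp(-2j * np.pi / p)
v = np.array([1, 2, 2, 0])                      # received word

def app_spectral(i, ui):
    """P0(u_i = ui | v) via the conditional spectral coefficients,
    following (5.30), (5.31) and (5.39); i is 1-based as in the text."""
    total = 0j
    for s in range(p ** (n - k)):
        dual = (np.array([s // p, s % p]) @ H) % p          # u-perp_s, eq. (5.41)
        Qs = np.eye(S, dtype=complex)
        for j in range(n):
            if j == i - 1:                                  # centre factor, (5.29)
                f = w ** (ui * dual[j]) / p * (D + (p * (ui == v[j]) - 1) * Dl)
            else:                                           # outer factors, (5.28)
                f = w ** (v[j] * dual[j]) * (D if dual[j] == 0 else Dl)
            Qs = Qs @ f
        total += sigma0 @ Qs @ e                            # (5.31)
    return (total / p ** (n - k)).real                      # (5.39)

apps = [app_spectral(2, g) for g in range(p)]               # decode the 2nd symbol
```

On this example the three values reproduce (5.79) to the precision quoted there, and the argmax gives the estimate û_2 = 1.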
5.3 Properties of the Conditional Spectral Coefficients
In order to provide more information concerning the properties of the vectors of
conditional spectral coefficients Q(ui|v), for ui ∈ GF (p), it is shown in this section
that the first entries Q0(ui|v) of these vectors form a probability distribution.
Given (5.38), it is sensible to examine the complex Walsh-Hadamard transform
matrix Wpn−k in more detail. The following results give information about each of
the rows of the complex conjugate of this matrix. Initially, the first row is considered.
Then, the sum of the entries in all other rows is investigated.
Corollary 5.3.1. The first row or column of W^*_{p^{n-k}}, the complex conjugate of W_{p^{n-k}}, consists entirely of ones and the sum of its entries is p^{n-k}.
Proof. Note that 1^* = 1, since 1 = 1 + 0i. The result follows in the same way as Lemma 5.2.2.
Lemma 5.3.1. The sum of the entries in any row or column except the first of the
complex Walsh-Hadamard transform matrix Wpn−k is zero.
Proof. This can be proved in several ways. One proof is given in Appendix B.
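Both row-sum properties (Lemma 5.2.2 and Lemma 5.3.1) are easy to confirm numerically. The sketch below assumes the DFT-style definition W_p = [w^{rc}], r, c ∈ {0, ..., p−1}, extended by Kronecker products; (3.37) and (3.43) are not reproduced in this chapter, so the exact form of W_p here is an assumption.

```python
import numpy as np

p, d = 3, 2
w = np.exp(-2j * np.pi / p)
Wp = w ** np.outer(np.arange(p), np.arange(p))   # assumed base matrix, cf. (3.37)
W = Wp
for _ in range(d - 1):                           # W_{p^d} as a d-fold Kronecker power
    W = np.kron(W, Wp)
N = p ** d

assert np.allclose(W[0, :], 1) and np.allclose(W[:, 0], 1)   # Lemma 5.2.2
sums = W.sum(axis=1)
assert np.isclose(sums[0], N)                    # first row sums to p^(n-k)
assert np.allclose(sums[1:], 0)                  # Lemma 5.3.1: other rows sum to zero
# consistency with (3.44): the inverse of W is (1/p^d) times its conjugate transpose
assert np.allclose(np.linalg.inv(W), W.conj().T / N)
```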
Another minor result concerning the sum of the entries in the complex conjugate
of a vector is required.
Lemma 5.3.2. Let x ∈ C^{p^{n-k}} be a vector such that the sum of its entries is zero. Then the sum of the entries in x^* is also zero.
Proof. Let the entries of x be x_i = a_i + \mathrm{i}\, b_i, for 1 ≤ i ≤ p^{n-k}. Then
\sum_{i=1}^{p^{n-k}} x_i = 0 \;\Rightarrow\; \sum_{i=1}^{p^{n-k}} a_i = 0 \text{ and } \sum_{i=1}^{p^{n-k}} b_i = 0
\;\Rightarrow\; \sum_{i=1}^{p^{n-k}} a_i - \mathrm{i} \sum_{i=1}^{p^{n-k}} b_i = 0
\;\Rightarrow\; \sum_{i=1}^{p^{n-k}} x_i^* = 0. \qquad (5.42)
Using Lemma 5.3.2 in conjunction with Lemma 5.3.1 shows that the sum of the entries in every row and column except the first of W^*_{p^{n-k}} is zero. This leads to a proof of an important result.
Theorem 5.3.1. The set of first elements Q0(ui = g|v) of the conditional spectral
coefficient vectors Q(ui = g|v), for g∈GF (p), form a probability distribution.
Proof. It is first shown that the sum of the first entries of the conditional spectral
coefficient vectors Q(ui = g|v) over ui ∈ GF (p) is one. Then it is proven that
0 ≤ Q0(ui = g|v) ≤ 1 for each g ∈ GF (p).
Clearly, considering the ith transmitted symbol ui,
P(u_i = 0|v) + P(u_i = 1|v) + \ldots + P(u_i = p-1|v) = 1. \qquad (5.43)
In addition, each of the APPs P (ui = g|v) assumes an encoding using exactly one
of the cosets. This is normally, but not necessarily, performed using the code itself.
Therefore,
P(u_i = g|v) = \sum_{s=0}^{p^{n-k}-1} P_s(u_i = g|v). \qquad (5.44)
Then, combining (5.43) and (5.44) gives
\sum_{g=0}^{p-1} \sum_{s=0}^{p^{n-k}-1} P_s(u_i = g|v) = 1. \qquad (5.45)
Examining the vector P(u_i|v) of APPs, it follows from (5.38) and the symmetry of W^H_{p^{n-k}} that
P(u_i = g|v) = \frac{1}{p^{n-k}} Q(u_i = g|v) \cdot W^*_{p^{n-k}}. \qquad (5.46)
Access the entry in the rth row and cth column of the complex conjugate of the generalised Walsh-Hadamard matrix using the notation
W^*_{p^{n-k}} = [\phi_{r,c}]_{p^{n-k} \times p^{n-k}}. \qquad (5.47)
If (5.46) is expanded and substituted into (5.45), the result may be stated as
\frac{1}{p^{n-k}} \sum_{g=0}^{p-1} \sum_{c=1}^{p^{n-k}} \sum_{r=1}^{p^{n-k}} Q_{r-1}(u_i = g|v)\, \phi_{r,c} = 1. \qquad (5.48)
Expanding and then simplifying (5.48) produces
\frac{1}{p^{n-k}} \Bigg[ \sum_{g=0}^{p-1} Q_0(u_i = g|v) \sum_{c=1}^{p^{n-k}} \phi_{1,c} + \sum_{r=2}^{p^{n-k}} \Big( \sum_{g=0}^{p-1} Q_{r-1}(u_i = g|v) \sum_{c=1}^{p^{n-k}} \phi_{r,c} \Big) \Bigg] = 1. \qquad (5.49)
By Corollary 5.3.1, Lemma 5.3.1 and Lemma 5.3.2, it follows that
\sum_{c=1}^{p^{n-k}} \phi_{r,c} =
\begin{cases}
p^{n-k} & \text{if } r = 1, \\
0 & \text{if } r \neq 1.
\end{cases} \qquad (5.50)
Applying (5.50) to (5.49) implies that
\frac{1}{p^{n-k}} \Bigg[ \sum_{g=0}^{p-1} Q_0(u_i = g|v)\, p^{n-k} + \sum_{r=2}^{p^{n-k}} \Big( \sum_{g=0}^{p-1} Q_{r-1}(u_i = g|v) \cdot 0 \Big) \Bigg] = \sum_{g=0}^{p-1} Q_0(u_i = g|v) = 1. \qquad (5.51)
Thus, the sum of the first elements Q0(ui|v) of the conditional spectral coefficient
vectors Q(ui|v) over all ui ∈ GF (p) is one. The other condition which must be
checked is whether Q0(ui|v) lies between 0 and 1 inclusive for all values of ui. This
will hold if none of the coefficients are negative. Consider (5.30) for the case s = 0.
The result can be expressed as
\mathbf{Q}_0(u_i|v) = \frac{1}{p} D^{i-1} \big[ D + (\delta_{u_i,v_i}\, p - 1) \Delta \big] D^{n-i}. \qquad (5.52)
Therefore by (5.27) and (5.31),
Q_0(u_i|v) =
\begin{cases}
\sigma_0 D^{i-1} D_0 D^{n-i} e & \text{if } u_i = v_i, \\
\sigma_0 D^{i-1} D_\epsilon D^{n-i} e & \text{if } u_i \neq v_i.
\end{cases} \qquad (5.53)
All values in the matrices in (5.53) are non-negative, because they are probabilities.
The same applies to the vector σ0, since its values correspond to the stationary state
probability distribution. Post-multiplication by e is simply a summation of these
non-negative values, and hence the overall product is a non-negative real number,
regardless of the value of u_i. Combining
Q_0(u_i|v) \geq 0 \quad \forall u_i \in GF(p) \qquad (5.54)
with (5.51) implies that
Q_0(u_i|v) \leq 1 \quad \forall u_i \in GF(p). \qquad (5.55)
Therefore {Q0(ui|v) | ui ∈ GF (p)} constitutes a probability distribution.
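Theorem 5.3.1 can be checked concretely on the GEC example of Section 5.5. A useful observation (mine) is that D = D_0 + (p−1)D_ε equals the state transition matrix of the underlying chain, which is row stochastic, so by (5.53) the sum over GF(3) telescopes to σ_0 D^n e = 1.

```python
import numpy as np
from numpy.linalg import matrix_power as mp

# Build the example channel matrices from the GEC parameters of (5.60).
P, Q, pG, pB = 1e-4, 1e-2, 1e-3, 1e-2
T = np.array([[1 - P, P], [Q, 1 - Q]])
D0 = np.diag([1 - 2 * pG, 1 - 2 * pB]) @ T
De = np.diag([pG, pB]) @ T
D = D0 + 2 * De                                  # (5.27) with p = 3; equals T here
s0 = np.array([Q / (P + Q), P / (P + Q)])
e = np.ones(2)

n, i = 4, 2                                      # decoding the 2nd of 4 symbols
# By (5.53): one candidate u_i matches v_i (factor D0), the other two do not (De).
Q0 = [s0 @ mp(D, i - 1) @ M @ mp(D, n - i) @ e for M in (D0, De, De)]
assert all(0 <= q <= 1 for q in Q0)              # each Q0 is a valid probability
assert np.isclose(sum(Q0), 1.0)                  # and together they sum to one
```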
In (5.45), it is important to note that the summation occurs over all s values from 0 to p^{n−k} − 1. It follows that
\sum_{g=0}^{p-1} P_0(u_i = g|v) \leq 1. \qquad (5.56)
Although equality in (5.56) is technically a possibility, in general the set of APPs
{P0(ui|v) | ui ∈ GF (p)} corresponding to the coset V0 do not sum to one and thus
do not form a probability distribution. Therefore, decoding decisions can only be
made by comparing the APPs P0(ui|v) for each and every ui ∈ GF (p).
5.4 Complexity Analysis
It is possible to analyse the requirements of Procedures 5.1 and 5.2 in terms of ex-
ecution time and memory needed. Again the O notation will be used to provide
an asymptotic upper bound on these requirements, as the values of the considered
parameters approach +∞. Since these two procedures for non-binary codes are gen-
eralisations of their counterparts in Chapter 4 for binary codes, their computational
complexities are generalisations of those derived in Section 4.3.
5.4.1 Computational complexity
An idea of the execution time for Procedures 5.1 and 5.2 is obtained by considering
the number of multiplications required in terms of four parameters. These param-
eters are the number, n, of symbols per codeword, the number, k, of information
symbols per codeword, the number, S, of states in the model, and the size, p, of the
Galois field.
Computational complexity of the original domain approach
Assume that the decoding is performed in the original domain using only the APPs
P_0(u_i|v). To decode a single word, one such probability must be calculated for all p values of u_i, for each value of i ∈ {1, 2, . . . , k}. Each of these k · p APPs requires p^{k−1} matrix products to be summed. The structure of the parity check matrix H allows a choice of p possible symbols for the first k positions. Once these are chosen, there is only one choice for codeword symbols in the remaining n−k positions. There would thus be p^k possible codewords or matrix products per trellis. However, the condition "u ∈ C, u_i = g" forces a particular value to occur at one of these k positions, and thus the total is p^{k−1} matrix products. The computation of each such matrix product requires O(nS^2) multiplications, since a row vector of length S is multiplied by an S × S matrix n times. Thus, the total complexity of Procedure 5.1 per codeword is O(knp^k S^2).
Computational complexity of the spectral domain approach
The decoding of a word using Procedure 5.2 would require the equivalent of constructing k · p trellises. That is, p different trellises for the decoding at each of the k information symbol positions are needed. One scalar conditional spectral coefficient must be calculated for each of the p^{n−k} rows of each trellis. In turn, calculation of each scalar requires O(nS^2) multiplications, since a row vector of length S is multiplied by an S × S matrix n times. Note that these numbers are complex rather than real. However, supposing each is stored as a real and an imaginary part, each complex multiplication is really a series of four real multiplications, which still requires O(nS^2) multiplications per trellis overall. The total number of multiplications is therefore O(knp^{n−k+1} S^2). Basic arithmetic then demonstrates the
preference towards the spectral domain for high-rate codes, where
k > \frac{n+1}{2}, \qquad (5.57)
whereas the original domain approach will be more efficient when the rate of the
code is low.
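The crossover in (5.57) follows directly from comparing the two counts O(knp^k S^2) and O(knp^{n−k+1} S^2): the shared factor knS^2 cancels, leaving p^k against p^{n−k+1}. A one-line helper (the function name is mine, not from the thesis) makes the comparison explicit:

```python
def cheaper_domain(n, k, p):
    """Compare the dominant factors p^k (Procedure 5.1, original domain)
    and p^(n-k+1) (Procedure 5.2, spectral domain); ties favour the
    original domain here."""
    return "spectral" if p ** (n - k + 1) < p ** k else "original"
```

For the (8, 6) Hamming code over GF(7) of Section 5.6 this returns "spectral" (7^3 versus 7^6 matrix products per trellis), while for the (4, 2) code over GF(3) it returns "original".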
5.4.2 Storage requirements
The quantity Y of real number storage spaces needed for execution of Procedures
5.1 and 5.2 can be determined. These analyses are extrapolations of those for the
binary case in Section 4.3.
Storage requirements of the original domain approach
Calculation of the weighted trellis matrix U_H(u_i) for the ith transmitted symbol u_i is required in order to extract the Sth principal leading submatrix [U_H(u_i)]^{(S)} of (5.13). Thus, a block of (p^{n−k}S)^2 real number storage spaces must be allocated, along with a temporary vector of length S whilst the multiplications are performed, as well as p spaces for the APPs P_0(u_i|v) before choosing the maximum of these as the decoded symbol. Therefore for Procedure 5.1,
Y = O(p^{2(n-k)} S^2). \qquad (5.58)
Storage requirements of the spectral domain approach
When using the spectral domain with non-binary codes, it must be remembered that there will be times when multiplication of complex numbers is required. This can be accommodated by a vector of length 2S and a matrix holding 2S^2 real entries. Each of the p^{n−k} conditional spectral coefficients is added to the previous tally, so that two real numbers will be sufficient for storage of P_0(u_i|v) for each of the p possible values of u_i. It follows that Procedure 5.2 requires memory to store
Y = O(S^2 + p) \qquad (5.59)
real numbers. This analysis advocates the use of the spectral domain over the
original domain, particularly if the size of the field and/or the dual code is large.
5.5 Instructive Examples
Consider the task of using APP decoding to estimate the second transmitted symbol u_2 when the word v = [1, 2, 2, 0] is received after transmission of a codeword u from
the (4,2) linear block code over GF (3) described in (3.107). Assume a ternary GEC
model is used, with state transition probabilities P and Q and crossover probabilities
pG and pB in the ‘good’ and ‘bad’ states G and B, respectively. Suppose that
P = 10−4, Q = 10−2, pG = 10−3 and pB = 10−2. (5.60)
This results in matrix probabilities D0 and Dǫ, which have values
D_0 = \begin{bmatrix} 1-2p_G & 0 \\ 0 & 1-2p_B \end{bmatrix} \begin{bmatrix} 1-P & P \\ Q & 1-Q \end{bmatrix} = \begin{bmatrix} 0.9979 & 0.0000998 \\ 0.0098 & 0.9702 \end{bmatrix}, \qquad (5.61)

D_\epsilon = D_1 = D_2 = \begin{bmatrix} p_G & 0 \\ 0 & p_B \end{bmatrix} \begin{bmatrix} 1-P & P \\ Q & 1-Q \end{bmatrix} = \begin{bmatrix} 0.0009999 & 0.0000001 \\ 0.0001 & 0.0099 \end{bmatrix}. \qquad (5.62)
By (5.27), the matrix probabilities D and ∆ are calculated as
D = \begin{bmatrix} 0.9999 & 0.0001 \\ 0.01 & 0.99 \end{bmatrix}, \qquad \Delta = \begin{bmatrix} 0.9969003 & 0.0000997 \\ 0.0097 & 0.9603 \end{bmatrix}. \qquad (5.63)
The stationary state distribution vector for this model may be expressed as
\sigma_0 = \Big[ \tfrac{Q}{P+Q},\; \tfrac{P}{P+Q} \Big] = \Big[ \tfrac{100}{101},\; \tfrac{1}{101} \Big], \qquad (5.64)
and, as the channel model consists of two states,
e = [1, 1]^T. \qquad (5.65)
The calculations involved in performing this task using Procedures 5.1 and 5.2 are
shown in the following two subsections. It is demonstrated that both cases result in
the same answer.
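The channel quantities above can be reproduced with a few lines of NumPy; the sketch below (variable names of my choosing) checks the entries quoted in (5.61)–(5.64).

```python
import numpy as np

P, Q, pG, pB = 1e-4, 1e-2, 1e-3, 1e-2            # GEC parameters, (5.60)
T = np.array([[1 - P, P], [Q, 1 - Q]])           # state transition matrix
D0 = np.diag([1 - 2 * pG, 1 - 2 * pB]) @ T       # correct-symbol matrix probability
De = np.diag([pG, pB]) @ T                       # per-error-symbol matrix probability
D, Delta = D0 + 2 * De, D0 - De                  # (5.27) with p = 3
sigma0 = np.array([Q / (P + Q), P / (P + Q)])    # stationary state distribution

assert np.allclose(D0, [[0.9979002, 0.0000998], [0.0098, 0.9702]])   # (5.61)
assert np.allclose(De, [[0.0009999, 0.0000001], [0.0001, 0.0099]])   # (5.62)
assert np.allclose(D,  [[0.9999, 0.0001], [0.01, 0.99]])             # (5.63)
assert np.allclose(Delta, [[0.9969003, 0.0000997], [0.0097, 0.9603]])
assert np.allclose(sigma0 @ T, sigma0)           # sigma0 really is stationary
```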
5.5.1 Example of decoding in the original domain
Using (3.21), the set of nine elementary trellis matrices Mh(u) for h, u ∈ GF (3) may
be listed in terms of the 3 × 3 circulant matrices as
M0(0) = M1(0) = M2(0) = M0(1) = M0(2) = I3, (5.66)
M1(1) = M2(2) = circ(0, 1, 0), (5.67)
M2(1) = M1(2) = circ(0, 0, 1). (5.68)
There are n · p = 12 trellis section matrices to consider for a jth transmitted symbol
uj. The three such matrices relating to the first transmitted symbol u1 may be
determined as
M_{h_1}(0) = M_2(0) \otimes M_1(0) = I_9, \qquad (5.69)
M_{h_1}(1) = M_2(1) \otimes M_1(1) =
0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 1 0 0
0 1 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0
0 0 0 0 0 1 0 0 0
0 0 0 1 0 0 0 0 0
, (5.70)
and
M_{h_1}(2) = M_2(2) \otimes M_1(2) =
0 0 0 0 0 1 0 0 0
0 0 0 1 0 0 0 0 0
0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0
. (5.71)
Trellis matrices for the remaining sections may be calculated similarly.
The weighted trellis section matrices for codes over GF(3) are given by
U_{h_j}(0) = M_{h_j}(0) \otimes D(v_j|0) = I_9 \otimes D_{v_j},
U_{h_j}(1) = M_{h_j}(1) \otimes D(v_j|1) = M_{h_j}(1) \otimes D_{v_j \ominus 1},
U_{h_j}(2) = M_{h_j}(2) \otimes D(v_j|2) = M_{h_j}(2) \otimes D_{v_j \ominus 2}. \qquad (5.72)
Applying (5.72) to (5.69)–(5.71) by setting j = 1 results in
U_{h_1}(0) = I_9 \otimes D_{v_1} = \mathrm{diag}\{D_1, D_1, D_1, D_1, D_1, D_1, D_1, D_1, D_1\}, \qquad (5.73)
U_{h_1}(1) = M_{h_1}(1) \otimes D_{v_1 \ominus 1} =
0 0 0 0 0 0 0 D0 0
0 0 0 0 0 0 0 0 D0
0 0 0 0 0 0 D0 0 0
0 D0 0 0 0 0 0 0 0
0 0 D0 0 0 0 0 0 0
D0 0 0 0 0 0 0 0 0
0 0 0 0 D0 0 0 0 0
0 0 0 0 0 D0 0 0 0
0 0 0 D0 0 0 0 0 0
, (5.74)
and
U_{h_1}(2) = M_{h_1}(2) \otimes D_{v_1 \ominus 2} =
0 0 0 0 0 D2 0 0 0
0 0 0 D2 0 0 0 0 0
0 0 0 0 D2 0 0 0 0
0 0 0 0 0 0 0 0 D2
0 0 0 0 0 0 D2 0 0
0 0 0 0 0 0 0 D2 0
0 0 D2 0 0 0 0 0 0
D2 0 0 0 0 0 0 0 0
0 D2 0 0 0 0 0 0 0
. (5.75)
Weighted trellis matrices for the other sections can also be derived using (5.72).
In order to decode the second symbol, the matrix representations of the first,
third and fourth trellis sections irrespective of the sent symbol are required. The
representation of the first section may be calculated as
U_{h_1} = U_{h_1}(0) + U_{h_1}(1) + U_{h_1}(2) =
D1 0 0 0 0 D2 0 D0 0
0 D1 0 D2 0 0 0 0 D0
0 0 D1 0 D2 0 D0 0 0
0 D0 0 D1 0 0 0 0 D2
0 0 D0 0 D1 0 D2 0 0
D0 0 0 0 0 D1 0 D2 0
0 0 D2 0 D0 0 D1 0 0
D2 0 0 0 0 D0 0 D1 0
0 D2 0 D0 0 0 0 0 D1
. (5.76)
According to (5.10), the entire weighted trellis matrices are of size 18 × 18 and are
given by
U_H(u_2) = U_{h_1} U_{h_2}(u_2)\, U_{h_3} U_{h_4}. \qquad (5.77)
The complete structures of these three matrices are irrelevant, because only the 2nd principal leading submatrices [U_H(u_2)]^{(2)} need to be considered when calculating the APPs. Therefore, by (5.13), the first element of the vector of APPs in the three cases can be expressed as
P_0(u_2 = 0|v) = \sigma_0 \cdot \big[ D_1 (D_2)^2 D_0 + D_0 D_2 (D_1)^2 + (D_2)^2 D_0 D_2 \big] \cdot e,
P_0(u_2 = 1|v) = \sigma_0 \cdot \big[ D_2 D_1 D_2 D_1 + D_0 D_1 (D_0)^2 + (D_1)^3 D_2 \big] \cdot e,
P_0(u_2 = 2|v) = \sigma_0 \cdot \big[ D_2 D_0 D_1 D_0 + D_1 (D_0)^2 D_1 + (D_0)^2 (D_2)^2 \big] \cdot e. \qquad (5.78)
Then, substituting (5.61), (5.62), (5.64) and (5.65) into (5.78) gives
P_0(u_2 = 0|v) = 3.15 \times 10^{-8},
P_0(u_2 = 1|v) = 1.08 \times 10^{-3},
P_0(u_2 = 2|v) = 5.77 \times 10^{-6}. \qquad (5.79)
Evaluating (5.14) with (5.79), it follows that \hat{u}_2 = 1 under these channel conditions.
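The arithmetic of (5.78)–(5.79) can be verified numerically. In this sketch D_1 = D_2 = D_ε for the symmetric ternary DMC, and matrix_power abbreviates the repeated factors:

```python
import numpy as np
from numpy.linalg import matrix_power as mp

P, Q, pG, pB = 1e-4, 1e-2, 1e-3, 1e-2
T = np.array([[1 - P, P], [Q, 1 - Q]])
D0 = np.diag([1 - 2 * pG, 1 - 2 * pB]) @ T       # (5.61)
De = np.diag([pG, pB]) @ T                       # (5.62), with D1 = D2 = De
s0 = np.array([Q / (P + Q), P / (P + Q)])        # (5.64)
e = np.ones(2)

# The three sums of matrix products from (5.78), term by term.
P0 = [
    s0 @ (mp(De, 3) @ D0 + D0 @ mp(De, 3) + mp(De, 2) @ D0 @ De) @ e,        # u2 = 0
    s0 @ (2 * mp(De, 4) + D0 @ De @ mp(D0, 2)) @ e,                          # u2 = 1
    s0 @ (De @ D0 @ De @ D0 + De @ mp(D0, 2) @ De + mp(D0, 2) @ mp(De, 2)) @ e,  # u2 = 2
]
```

The argmax is 1, reproducing the decoding decision û_2 = 1, and the three values agree with (5.79) to the precision quoted there.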
5.5.2 Example of decoding in the spectral domain
The calculations in the spectral domain tend to be more concise, because only
diagonal and block-diagonal matrices are involved. As the code used in this example
is over GF (3), let
w = e^{-2\pi i/3}. \qquad (5.80)
Since C is self-dual, the dual codewords are simply those of C. However, their order
is important. Applying (3.107) to (5.41) gives
u⊥0 = [0, 0, 0, 0],
u⊥1 = [1, 2, 0, 1],
u⊥2 = [2, 1, 0, 2],
u⊥3 = [2, 2, 1, 0],
u⊥4 = [0, 1, 1, 1],
u⊥5 = [1, 0, 1, 2],
u⊥6 = [1, 1, 2, 0],
u⊥7 = [2, 0, 2, 1],
u⊥8 = [0, 2, 2, 2].
(5.81)
Equation (5.30) produces
\mathbf{Q}_s(u_2|v) = w^{v_1 \cdot u^\perp_{s,1}} \big[ \delta_{u^\perp_{s,1},0} D + (1 - \delta_{u^\perp_{s,1},0}) \Delta \big] \times \frac{w^{u_2 \cdot u^\perp_{s,2}}}{3} \big[ D + (3\,\delta_{u_2,v_2} - 1) \Delta \big] \times \prod_{j=3}^{4} w^{v_j \cdot u^\perp_{s,j}} \big[ \delta_{u^\perp_{s,j},0} D + (1 - \delta_{u^\perp_{s,j},0}) \Delta \big], \qquad (5.82)
where, for j ∈ {1, 2, 3, 4} and s ∈ {0, 1, . . . , 8}, u^\perp_{s,j} is the jth entry in the sth dual codeword u^\perp_s as defined in (5.81). In total, there are 27 conditional spectral coefficient matrices to calculate. Substituting the values of u_2, u^\perp_{s,j} and v = [1, 2, 2, 0] in (5.82) gives

\mathbf{Q}_0(u_2=0|v) = D \,\tfrac{D-\Delta}{3}\, D\, D,
\mathbf{Q}_1(u_2=0|v) = w\Delta \,\tfrac{D-\Delta}{3}\, D\, \Delta,
\mathbf{Q}_2(u_2=0|v) = w^2\Delta \,\tfrac{D-\Delta}{3}\, D\, \Delta,
\mathbf{Q}_3(u_2=0|v) = w^2\Delta \,\tfrac{D-\Delta}{3}\, w^2\Delta\, D,
\mathbf{Q}_4(u_2=0|v) = D \,\tfrac{D-\Delta}{3}\, w^2\Delta\, \Delta,
\mathbf{Q}_5(u_2=0|v) = w\Delta \,\tfrac{D-\Delta}{3}\, w^2\Delta\, \Delta,
\mathbf{Q}_6(u_2=0|v) = w\Delta \,\tfrac{D-\Delta}{3}\, w\Delta\, D,
\mathbf{Q}_7(u_2=0|v) = w^2\Delta \,\tfrac{D-\Delta}{3}\, w\Delta\, \Delta,
\mathbf{Q}_8(u_2=0|v) = D \,\tfrac{D-\Delta}{3}\, w\Delta\, \Delta, \qquad (5.83)

\mathbf{Q}_0(u_2=1|v) = D \,\tfrac{D-\Delta}{3}\, D\, D,
\mathbf{Q}_1(u_2=1|v) = w\Delta \,\tfrac{w^2(D-\Delta)}{3}\, D\, \Delta,
\mathbf{Q}_2(u_2=1|v) = w^2\Delta \,\tfrac{w(D-\Delta)}{3}\, D\, \Delta,
\mathbf{Q}_3(u_2=1|v) = w^2\Delta \,\tfrac{w^2(D-\Delta)}{3}\, w^2\Delta\, D,
\mathbf{Q}_4(u_2=1|v) = D \,\tfrac{w(D-\Delta)}{3}\, w^2\Delta\, \Delta,
\mathbf{Q}_5(u_2=1|v) = w\Delta \,\tfrac{D-\Delta}{3}\, w^2\Delta\, \Delta,
\mathbf{Q}_6(u_2=1|v) = w\Delta \,\tfrac{w(D-\Delta)}{3}\, w\Delta\, D,
\mathbf{Q}_7(u_2=1|v) = w^2\Delta \,\tfrac{D-\Delta}{3}\, w\Delta\, \Delta,
\mathbf{Q}_8(u_2=1|v) = D \,\tfrac{w^2(D-\Delta)}{3}\, w\Delta\, \Delta, \qquad (5.84)

\mathbf{Q}_0(u_2=2|v) = D \,\tfrac{D+2\Delta}{3}\, D\, D,
\mathbf{Q}_1(u_2=2|v) = w\Delta \,\tfrac{w(D+2\Delta)}{3}\, D\, \Delta,
\mathbf{Q}_2(u_2=2|v) = w^2\Delta \,\tfrac{w^2(D+2\Delta)}{3}\, D\, \Delta,
\mathbf{Q}_3(u_2=2|v) = w^2\Delta \,\tfrac{w(D+2\Delta)}{3}\, w^2\Delta\, D,
\mathbf{Q}_4(u_2=2|v) = D \,\tfrac{w^2(D+2\Delta)}{3}\, w^2\Delta\, \Delta,
\mathbf{Q}_5(u_2=2|v) = w\Delta \,\tfrac{D+2\Delta}{3}\, w^2\Delta\, \Delta,
\mathbf{Q}_6(u_2=2|v) = w\Delta \,\tfrac{w^2(D+2\Delta)}{3}\, w\Delta\, D,
\mathbf{Q}_7(u_2=2|v) = w^2\Delta \,\tfrac{D+2\Delta}{3}\, w\Delta\, \Delta,
\mathbf{Q}_8(u_2=2|v) = D \,\tfrac{w(D+2\Delta)}{3}\, w\Delta\, \Delta. \qquad (5.85)
Calculation of the conditional spectral coefficients Q_s(u_2|v) by (5.31) is depicted in the trellises of Figs. 5.1–5.3, with the terms in (5.83)–(5.85) appearing on their branches. In sections 1, 3 and 4 of the trellises, note how the zeroes in the dual codewords of (5.81) correspond with D matrices, and how the nonzero elements produce ∆ matrices. The values of the three APPs P_0(u_2|v) are found by (5.39) to be
P_0(u_2 = 0|v) = \tfrac{1}{9} \sum_{s=0}^{8} Q_s(u_2 = 0|v) = 3.15 \times 10^{-8},
P_0(u_2 = 1|v) = \tfrac{1}{9} \sum_{s=0}^{8} Q_s(u_2 = 1|v) = 1.08 \times 10^{-3},
P_0(u_2 = 2|v) = \tfrac{1}{9} \sum_{s=0}^{8} Q_s(u_2 = 2|v) = 5.77 \times 10^{-6}, \qquad (5.86)
which match the original domain APPs in (5.79). It can thus again be concluded that \hat{u}_2 = 1. Procedures 5.1 and 5.2 are therefore two ways of arriving at the same decoding estimate.
5.6 Simulation Results
An analysis of the performance of large classes of codes when used over a large
number of possible channels and decoded using the two methods presented in this
chapter would take an inordinate amount of time and computing power. Therefore,
only a select few have been chosen for simulation using an implementation of Procedure 5.2 in MATLAB®. In particular, some non-binary Hamming codes and the ISBN-10 code are investigated.
5.6.1 Non-binary Hamming codes
The details of the considered non-binary (n, k) Hamming codes used over a non-
binary GEC are as follows:
(4,2) Hamming code over GF (3): As defined in (3.107).
(6,4) Hamming code over GF (5): The parity check matrix for this one-error
correcting Hamming code of order n−k = 2 and rate R = 0.67 is given as
H = [ 1 1 1 1 1 0
      1 2 3 4 0 1 ] .    (5.87)
(8,6) Hamming code over GF (7): Similarly, this one-error correcting Hamming
code of order two and rate R = 0.75 is defined by the parity check matrix
H = [ 1 1 1 1 1 1 1 0
      1 2 3 4 5 6 0 1 ] .    (5.88)
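A q-ary Hamming code of order two corrects a single error precisely because no column of its parity check matrix is a scalar multiple of another, and this property can be spot-checked directly. The sketch below is purely illustrative (it is not one of the thesis procedures) and verifies the property for the parity check matrix of the (8,6) code over GF(7):

```python
import itertools

# Columns of the parity check matrix of the (8,6) Hamming code over GF(7).
q = 7
columns = [(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 0), (0, 1)]

def proportional(c1, c2, q):
    """True if c2 = a*c1 (mod q) for some nonzero scalar a in GF(q)."""
    return any(all((a * u) % q == v % q for u, v in zip(c1, c2))
               for a in range(1, q))

# No two distinct columns may be proportional: the minimum distance is then
# at least three, so the code corrects any single symbol error.
ok = not any(proportional(c1, c2, q)
             for c1, c2 in itertools.combinations(columns, 2))
print(ok)  # -> True
```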
[Trellis diagram omitted from this text version; its branch labels are the terms of (5.83).]
Figure 5.1: Weighted diagonal trellis of the (4, 2) linear block code C over GF(3) used to compute spectral coefficients Qs(u2 = 0|v); s = 0, 1, . . . , 8.
[Trellis diagram omitted from this text version; its branch labels are the terms of (5.84).]
Figure 5.2: Weighted diagonal trellis of the (4, 2) linear block code C over GF(3) used to compute spectral coefficients Qs(u2 = 1|v); s = 0, 1, . . . , 8.
[Trellis diagram omitted from this text version; its branch labels are the terms of (5.85).]
Figure 5.3: Weighted diagonal trellis of the (4, 2) linear block code C over GF(3) used to compute spectral coefficients Qs(u2 = 2|v); s = 0, 1, . . . , 8.
Figures 5.4-5.6 display the SER performance obtained after data was transmitted
over a GEC model and decoded using Procedure 5.2, which operates in the spectral
domain. The results for the (4,2) Hamming code over GF (3) are depicted in Fig.
5.4, those for the (6,4) Hamming code over GF (5) are given in Fig. 5.5, and the SER
performance obtained with the (8,6) Hamming code over GF (7) is displayed in Fig.
5.6. In all three cases, the state transition probabilities of the channel model were
fixed at P = 0.05 and Q = 0.2. Simulations were carried out for pairs of crossover
probability values taken from the sets
pG ∈ {10^−4, 10^−3, 10^−2},    pB ∈ {0.01, 0.02, . . . , 0.1}.    (5.89)
A vertical scale incorporating both SER and BER is provided in order to aid
performance comparisons between the different codes. There are a number of ways
in which the conversion from SER to an equivalent BER may be carried out; the
one used here, which converts an error rate in p-ary units of information to one in
terms of bits, is given by

BER = SER / log2(p).    (5.90)
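This conversion is straightforward to implement; the sketch below assumes only (5.90), with an illustrative function name:

```python
import math

def ser_to_ber(ser: float, p: int) -> float:
    """Convert a symbol error rate over a p-ary alphabet to an equivalent
    bit error rate by dividing by the number of bits per symbol, as in (5.90)."""
    return ser / math.log2(p)

# e.g. an SER over GF(4) halves when expressed as a BER, since log2(4) = 2
print(ser_to_ber(0.02, 4))
```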
As would be expected, in all three cases, the SER decreases as the crossover prob-
ability pB in the ‘bad’ state B decreases. The SER also decreases as the crossover
probability pG in the ‘good’ state G decreases within the range given; however,
further decreases in this crossover probability did not produce curves which were
appreciably discernible from the curves for pG = 10^−4. To improve legibility, such
results have been omitted from the figures. In addition, the BER increases as the
order of the field increases. Although each code is designed to correct one trans-
mission error, the codes are between four and eight symbols in length. In particular,
the code over GF(3) is able to correct one symbol out of four, the code over GF(5)
can correct one out of six, while the code over GF(7) is only capable of correcting
one out of eight. This explains the differences in the error rates.
5.6.2 The ISBN-10 code
Let C be the (10,9) single parity check code over GF (11) where the parity symbol
u10 is defined by the equation
u1 + 2u2 + 3u3 + 4u4 + 5u5 + 6u6 + 7u7 + 8u8 + 9u9 + 10u10 ≡ 0 (mod 11). (5.91)
[Plot omitted from this text version: SER and BER versus pB, with curves for pG = 10^−2, 10^−3, 10^−4.]
Figure 5.4: Performance of the (4,2) Hamming code over GF(3) on a GEC with state transition probabilities P = 0.05 and Q = 0.2.
[Plot omitted from this text version: SER and BER versus pB, with curves for pG = 10^−2, 10^−3, 10^−4.]
Figure 5.5: Performance of the (6,4) Hamming code over GF(5) on a GEC with state transition probabilities P = 0.05 and Q = 0.2.
[Plot omitted from this text version: SER and BER versus pB, with curves for pG = 10^−2, 10^−3, 10^−4.]
Figure 5.6: Performance of the (8,6) Hamming code over GF(7) on a GEC with state transition probabilities P = 0.05 and Q = 0.2.
[Plot omitted from this text version: SER and BER versus pB, with curves for pG = 10^−2, 10^−3, 10^−4.]
Figure 5.7: Performance of the ISBN-10 code on a GEC with state transition probabilities P = 0.05 and Q = 0.2.
This produces the parity check matrix

H = [ 1 2 3 4 5 6 7 8 9 X ],    (5.92)
where “X” is used to represent “10”, thus avoiding any confusion between “10” and
“1 0”. However, this parity check matrix is not yet in systematic form. To fix this,
multiply H by 10 to obtain

H = [ X 9 8 7 6 5 4 3 2 1 ].    (5.93)
Since the ISBN-10 code has Hamming distance d = 2, it is only of real use for error
detection as it detects d− 1 = 1 error but is not guaranteed to correct any errors.
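The detection check amounts to evaluating the parity condition (5.91); a minimal sketch (the function name is illustrative):

```python
def isbn10_valid(isbn: str) -> bool:
    """Check the ISBN-10 parity condition (5.91):
    sum of i*u_i for i = 1..10 must be 0 (mod 11); 'X' stands for 10."""
    digits = [10 if c == 'X' else int(c) for c in isbn]
    return sum(i * d for i, d in enumerate(digits, start=1)) % 11 == 0

print(isbn10_valid("0306406152"))  # a valid ISBN-10 -> True
print(isbn10_valid("0306406151"))  # single-symbol error detected -> False
```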
For example, if the barcode is scanned and it is not recognised as a valid ISBN by
the scanner, then an error is detected and the barcode may be scanned again. Note
that the above encoding process refers to the now-obsolete ISBN-10 code. From
2007, the decimal (13,12) ISBN-13 code [68] is used in practice. The transition
occurred because the number of individual codewords available for issue with new
publications was being depleted. There are three more information symbol positions
in the new ISBN-13 code compared to the ISBN-10 code. The parity symbol u13 for
an ISBN-13 codeword can be calculated using the modulo-10 equation
u1+u3+u5+u7+u9+u11+u13+3(u2+u4+u6+u8+u10+u12) ≡ 0 (mod 10). (5.94)
This produces the parity check matrix

H = [ 1 3 1 3 1 3 1 3 1 3 1 3 1 ],    (5.95)
and as ISBN-13 is a decimal code, the “X” symbol is not required.
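The corresponding modulo-10 check for ISBN-13, with the weights alternating between 1 and 3 as in (5.94), can be sketched as follows (function name illustrative):

```python
def isbn13_valid(isbn: str) -> bool:
    """Check the ISBN-13 parity condition (5.94): the digit sum with
    weights alternating 1, 3, 1, 3, ... must be 0 modulo 10."""
    return sum(int(c) * (3 if i % 2 else 1)
               for i, c in enumerate(isbn)) % 10 == 0

print(isbn13_valid("9780306406157"))  # a valid ISBN-13 -> True
```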
A notion of the performance of the ISBN-10 code may be obtained from Fig.
5.7. In this simulation, Procedure 5.2 was again used to decode transmissions over
a GEC with parameters P = 0.05 and Q = 0.2. Pairs of crossover probability values
were taken from the sets
pG ∈ {10^−4, 10^−3, 10^−2},    pB ∈ {0.01, 0.02, . . . , 0.09}.    (5.96)
It can be shown that an 11-ary symmetric DMC which has crossover probability
pB within the half-closed interval (1/11, 1/10] has the same capacity as a DMC with
crossover probability less than 1/11, and thus crossover probabilities greater than 0.09
are not considered in the simulations. The results suggest that, in the same way
as for the Hamming codes, the performance degrades when either of the crossover
probabilities increases. Additionally, the BER is much higher for the ISBN-10 code
than for the Hamming codes. This occurs for two main reasons. Firstly, the rate of
the ISBN-10 code is higher than those of the Hamming codes, at 0.9. Secondly, the
ISBN-10 code is only designed to detect errors. It is not designed to correct them,
as its Hamming distance is only two.
5.7 Summary
The transition from a memoryless model to one with memory requires the decoder
to factor in all possible state sequences. This is the major difference between the
decoding problems of Chapter 3 compared to those of Chapters 4 and 5. The
APP decoding procedures for channels described by stochastic automata employ
the same underlying algorithms as their memoryless channel counterparts, but the
scalar weightings of crossover probabilities must be replaced by matrix probabilities.
In general, there is one such matrix probability for each pair of possible transmitted
and received symbols. However, the number of such matrix probabilities is reduced
to two when the DMC in each state is symmetric. For a GEC model, there is one
matrix probability corresponding to correct reception of a symbol, considering all
possible state transitions, and another for incorrect reception.
The change from a binary signalling alphabet to one of higher order is handled
in the same way as for the memoryless case. That is, the circulant matrices of
the same order as the size of the signalling alphabet are used to define the matrix
representation of a trellis section in the original domain. Alternatively, the complex
generalisation of the Walsh-Hadamard matrix can be used to perform a similarity
transformation, which produces the diagonal matrices of the spectral domain. In
this case the conditional spectral coefficient matrices are strongly connected to the
structure of the dual code. Capitalising on these ties in the case of a GEC means
converting from the two standard matrix probabilities in the original domain to
the state transition and difference matrices which are representative of the spectral
domain. Either domain may be used to perform the necessary calculations. The
choice of domain should be motivated by the code rate.
For a systematic non-binary linear block code in conjunction with a channel de-
scribed by a stochastic automaton which includes a stochastic sequential machine
with finitely many states, an analysis of the execution time and memory require-
ments for the APP decoding procedures using the original and spectral domains
developed in this chapter was provided. The computational complexity of the spec-
tral domain approach was favoured for codes with a high code rate, whilst it was
more efficient to use the original domain when the code rate was low. The diago-
nalisation process allowed the large matrix multiplications required in the original
domain to be replaced by individual chains of multiplications of smaller matrices.
As the conditional spectral coefficients could be
accumulated one at a time, this led to a significant reduction in the amount of
memory used by Procedure 5.2 compared to Procedure 5.1.
It was also proved that the set of the first elements of the conditional spectral
coefficient vectors over all possible transmitted symbols forms a probability distri-
bution. By contrast, the first elements of the vectors of APPs, considered over all
values of the transmitted symbol, do not form a probability distribution; conse-
quently, the APPs for the transmission of each possible symbol, given the received
vector, must be calculated in order to reach a decision for each information symbol
position.
An instructive example of APP decoding for a self-dual code over GF (3) was
given using both domains in order to demonstrate the types of calculations which
are necessary in the two procedures. The equality of the two answers to the same
decoding task was verified. Finally, a selection of numerical results from computer
simulations of decoding over various fields was provided. It was observed by con-
version to equivalent BERs that for Hamming codes of order two, raising the order
of the field is likely to increase the error rate. However, decreasing either of the
crossover probabilities of a GEC appears to improve the performance. The poor er-
ror correcting capabilities of the ISBN-10 code were also observed. It must be noted
that these simulations are only a small portion of the possibilities for analysis. A
complete investigation into the performance of non-binary linear block codes on a
GEC is beyond the scope of this thesis.
Chapter 6
Generalised Weight Polynomials
for Binary Restricted GECs
A GEC model does not have a unique parametrisation. That is, there is more than
one way to describe the sequence of state transitions and the error generation process
for that model. Perhaps the simplest paradigm discussed in previous chapters was
one dominated by the probabilities of error and state transition events at each
discrete time instant. This chapter examines another paradigm for modelling a
binary GEC, which is instead based on the distribution of error bursts. These
characteristics may in fact be easier to measure, making the method described herein
more useful in a practical sense.
Additionally, the procedures for APP decoding developed in Chapters 4 and 5
depend heavily on vector and matrix multiplication. Four parameters were required
to fully describe the channel. However, restricting the channel by fixing the value
of one parameter allows the model to be described by three variables; then, by
examination of all possible matrix products which could be involved in the APP
decoding calculations, the matrix multiplication can be replaced by the evaluation of
trivariate polynomials. This is the motivation for the study of restricted GECs as introduced
in Section 2.2.2.
The expression of the APPs in terms of these polynomials in three variables is
pleasing from an aesthetic point of view. It allows a more direct connection with the
fading profile of the channel. Furthermore, these polynomials provide information
about probabilities of codeword symbols as a function of artifacts of the dual code.
This is a similar purpose to the single-variable weight polynomials of a code and
its dual as related by the MacWilliams identity [25]. Since the polynomials derived
in this chapter have three variables, as opposed to the weight enumerators of one
variable, they shall be referred to as generalised weight polynomials (GWPs). This
concept of GWPs has been introduced in [26] for use with syndrome decoding and is
adopted in this thesis for APP decoding. Thus, the APP decoding method developed
in this chapter is based on a generalisation of one of the most profound results in
coding theory, the MacWilliams identity.
This chapter is structured as follows. The statement of the APP decoding task
for binary restricted GECs using GWPs is formulated in Section 6.1. Section 6.2
discusses the alternative method of parameterising the channel model for binary
GECs in general using burst-error characteristics. In particular, the relationship
between the channel reliability factor and the matrix probabilities of the spectral
domain is highlighted. It is then possible to describe the APPs in terms of the
trivariate GWPs. This method is outlined in Section 6.3. An example of using
these polynomials to perform APP decoding is given in Section 6.4, while simulation
results for two binary linear block codes are shown in Section 6.5. Finally, Section
6.6 concludes the chapter.
The principal contributions of this chapter are:
• Formulation of the relationship between the channel reliability factor z and
the state transition matrix D and difference matrix ∆ for a binary GEC.
• Derivation of the conditional spectral coefficients, which are necessary for APP
decoding on a binary restricted GEC, in terms of the burst-error characteris-
tics.
• Discussion of the connection between the binary MacWilliams identity and
the polynomial form of the conditional spectral coefficients.
• Description of an APP decoding algorithm using burst-error characteristics
and polynomial evaluation rather than matrix multiplication, for a binary
restricted GEC.
• Numerical examples which highlight some of the many possible applications
of this theory.
6.1 Problem Statement
Assume that C is a binary (n, k) linear block code in standard form which is used
to protect data transmitted over a channel described by a stochastic automaton D,
where D is a binary restricted GEC as discussed in Section 2.2.2 together with an
initial state distribution σ0. That is, the binary input, binary output channel has
state set S = {G,B} with the crossover probability in state B given by
pB = 0.5. (6.1)
Since D also falls under the broader classification of a GEC, the APP decoding de-
cisions for each information bit position are given by (4.42). However, that equation
is, at its most elementary level, a statement in terms of four parameters P , Q, pG
and pB. It is also a statement about matrix products. Given (6.1), the task to be
completed in order to perform APP decoding is to find a closed form polynomial
expression for ui involving at most three variables. That is, to find 2k polynomials
f^(ui)(x1, x2, x3) = ∑_{s=0}^{2^{n−k}−1} Qs(ui|v),    (6.2)
for i ∈ {1, 2, . . . , k} and ui ∈ GF (2), and where x1, x2 and x3 are variables which
completely define the particular binary restricted GEC model being used. It is then
a consequence of (4.42) that the decoding decisions can be made according to
ui = { 0 if f^(0)(x1, x2, x3) ≥ f^(1)(x1, x2, x3),
       1 otherwise.    (6.3)
6.2 Burst-error Characteristics
One possible set of parameters for describing a binary GEC is the set of two state
transition probabilities P and Q, combined with the two crossover probabilities
pG and pB. This description is adequate if the behaviour of the channel has been
described in terms of the underlying theoretical Markov chain. However, if the
error patterns from a physical channel are being measured, it may be easier or more
practical to use other parameters which are more closely related to the distribution
of the bursts of transmission errors.
Another set of parameters for describing a GEC suggested in Chapter 2 was the
so-called burst-error characteristics. The likelihood of the model being in the ‘good’
state G or the ‘bad’ state B can be retrieved from the average fade to connection
time ratio x and the burst factor y as defined in (2.45) and (2.46) respectively as
x = P/Q,    (6.4)
y = 1 − P − Q.    (6.5)
However, these two parameters give no information about the likelihood of transmis-
sion errors. This aspect of the channel is described by the parameter conventionally
known as z. It is called the channel reliability factor and is directly related to the
average symbol error rate of the channel, which is an easier quantity to measure than
the crossover probabilities. Hence, in the discussion that follows it will be necessary
to recall the definition of the average BER for a GEC as given in (2.47) by
pb = pGσG + pBσB. (6.6)
From [26], it is known that the relationship between the channel reliability factor
and the average BER for any binary GEC can be expressed as
z = 1 − 2pb, (6.7)
and in Chapter 2 it was defined as an expected value over both states G and B of
the difference between the probabilities of correct and erroneous transmission. The
following derivation of the channel reliability factor will be based upon the spectral
representation of a trellis, given that calculations are often simpler to perform in
that domain. Instead of the conventional APP decoding trellis, the summation
condition “u ∈ C, ui = g” in (4.3) is relaxed to “u ∈ C”, to give a description of
the syndrome trellis of Section 2.3.3. Such a trellis is not biased toward determining
probabilities for any particular position or bit. It is a particular case of a result
in [31], or alternatively it can be derived from Section 4.2, that for a binary code of
length n, a description of the syndrome trellis in terms of spectral matrices is given
by
ΘH = ∏_{j=1}^{n} diag{ Θ_{s,hj} },    (6.8)

where

Θ_{s,hj} = D0 + D1(−1)^{<s,hj>} = { D if <s,hj> = 0,
                                    ∆ if <s,hj> = 1.    (6.9)
Here, s = bin(s) and bin(·) denotes the function which gives the binary representation
of its input in vector form. Furthermore, <s,hj> refers to the dot product of the
vectors s and hj, where 0 ≤ s ≤ 2^{n−k} − 1 and 1 ≤ j ≤ n. In the spectral domain,
the two matrix probabilities D and ∆ are the only two choices available to describe
the reliability of the channel and each is examined individually. For a binary GEC,
the difference matrix ∆ may be written as
∆ = [ 1 − 2pG  0 ; 0  1 − 2pB ] · D.    (6.10)
For all four possible state transitions, ∆ supplies the probability of correct trans-
mission but negatively affected by the probability of incorrect transmission. By
contrast, D does not completely reflect the behaviour of the channel as it is entirely
described by the parameters x and y as shown in (2.73), and is thus independent of
transmission errors. Therefore, the prime candidate for expressing the channel reli-
ability is ∆, which can be converted to a scalar variable by calculating the weighted
average over the two states of the channel in the stationary state distribution. In
other words,
z = σ0∆e. (6.11)
Simplifying,

z = [ σG, σB ] [ 1 − 2pG  0 ; 0  1 − 2pB ] [ 1 − P  P ; Q  1 − Q ] [ 1 ; 1 ]
  = σG(1 − 2pG) + σB(1 − 2pB)
  = 1 − 2pb.    (6.12)
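The identity (6.12) is easy to spot-check numerically; the parameter values below are illustrative only (not taken from any simulation in this thesis):

```python
import numpy as np

# Illustrative GEC parameters: state transitions P, Q and crossovers pG, pB.
P, Q, pG, pB = 0.05, 0.2, 1e-3, 0.1

D = np.array([[1 - P, P],
              [Q, 1 - Q]])                     # state transition matrix
Delta = np.diag([1 - 2*pG, 1 - 2*pB]) @ D      # difference matrix (6.10)
sigma0 = np.array([Q, P]) / (P + Q)            # stationary state distribution
e = np.ones(2)

z = sigma0 @ Delta @ e                         # channel reliability factor (6.11)
pb = pG * sigma0[0] + pB * sigma0[1]           # average BER (6.6)
print(abs(z - (1 - 2*pb)) < 1e-12)  # -> True, matching (6.12)
```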
The average BER pb of the channel is easier to deduce from measurements than the
two individual crossover probabilities, because the state is hidden for this channel
model. Given the stationary state distribution σ0, the model may be described
by the parameters x, y, z, plus either pB or pG. Fixing one of these two crossover
probabilities could then allow the channel to be described by the three burst-error
characteristics. This approach would allow a reformulation of the conditional spec-
tral coefficient matrices, the details of which are shown in the next section.
6.3 Derivation of APPs using Generalised Weight
Polynomials
Examining in detail Procedure 4.2, which is an APP decoding algorithm for a
GEC using the spectral domain, one becomes aware of its dependence on the non-
commutative multiplication of D and ∆ matrices. A desirable situation would
therefore arise if the pre-multiplication of ∆ by a 2 × 2 matrix K rendered one of
the two columns of K irrelevant to the remaining multiplications. The possibility
of this occurring for pG, pB ∈ [0, 1] is now examined.
The maximum value of crossover probability which needs to be considered on a
BSC is 0.5, since a model with crossover probability 0.5 + ε is equivalent to one with
crossover probability 0.5 − ε where the received bits are inverted before decoding.
Assuming that pG < pB so that the ‘good’ state G is indeed better than the ‘bad’
state B, only the pB = 0.5 scenario as stated in (6.1) will guarantee that a column
of any such matrix K can effectively be ignored in the matrix multiplication. Under
such an assumption, the binary restricted GEC as shown in Fig. 2.9 results. Recall
the definitions of the state transition matrix D and the difference matrix δ, given
respectively in (2.36) and (2.70) as
D = [ 1 − P  P ; Q  1 − Q ],    (6.13)

δ = [ 1 − 2pG  0 ; 0  0 ] D.    (6.14)
Then the expression for the conditional spectral coefficients Qs(ui|v) in (4.32) and
(4.33) under the additional assumption of (6.1) may be rewritten as

Qs(ui|v) = σ0 ∏_{j=1}^{i−1} [ (−1)^{vj·u⊥s,j} ((1 − u⊥s,j)D + u⊥s,j δ) ]
           × (1/2) [ (−1)^{ui·u⊥s,i} D + (−1)^{(ui·u⊥s,i)+vi} δ ]
           × ∏_{j=i+1}^{n} [ (−1)^{vj·u⊥s,j} ((1 − u⊥s,j)D + u⊥s,j δ) ] e    (6.15)
         = c1 σ0 A D B e + c2 σ0 A δ B e,

where

A = ∏_{j=1}^{i−1} ((1 − u⊥s,j)D + u⊥s,j δ),    (6.16)
B = ∏_{j=i+1}^{n} ((1 − u⊥s,j)D + u⊥s,j δ),    (6.17)
c1 = (1/2) (−1)^{ui·u⊥s,i} ∏_{j=1, j≠i}^{n} (−1)^{vj·u⊥s,j},    (6.18)
c2 = (1/2) (−1)^{(ui·u⊥s,i)+vi} ∏_{j=1, j≠i}^{n} (−1)^{vj·u⊥s,j}.    (6.19)
It is important that each matrix in both of the products A and B is either the
state transition matrix D or the difference matrix δ, as certain patterns of D and
δ sequences may then be replaced by simpler expressions when the conversion
to the burst-error characteristics x, y and z is performed. This conversion for σ0,
D and δ was explained in (2.72), (2.73) and (2.74). Let
M = {D, δ} (6.20)
represent the set of possible matrices from which each matrix in the products A and
B is taken. Define a function g for a length n binary code used over a restricted
GEC with a given stationary state distribution σ0 by
g : M × M × · · · × M → Z[x, y, z],
g(K(1), K(2), . . . , K(n)) = σ0 ∏_{j=1}^{n} K(j) e,    (6.21)
where K(j) ∈ M ∀j ∈ {1, 2, . . . , n} and Z[x, y, z] denotes the set of polynomials in
indeterminates x, y and z with integer coefficients. In this notation, x, y and z
represent the three burst-error characteristics. One matrix probability corresponds
to each section of the spectral domain trellis as it is traversed from left to right.
A series of lemmas will quickly establish the polynomials output by g for all 2^n
possible inputs K(j). As reported in [26] for the restricted GEC and proven in [69]
for a simplified Gilbert channel, the following result concerning powers of the state
transition matrix D is true.
Lemma 6.3.1. D^n(x, y) = 1/(1 + x) [ 1 + xy^n  x − xy^n ; 1 − y^n  x + y^n ]  ∀n ∈ N.
Proof. The proof is by induction on n. Firstly, the lemma holds for n = 0 because
D^0(x, y) = I2 = 1/(1 + x) [ 1 + xy^0  x − xy^0 ; 1 − y^0  x + y^0 ].    (6.22)
Also note the case n = 1 is true by the formulation of the state transition matrix
D(x, y) given in (2.73). Assume the lemma holds for some n ∈ N. Examining the
(n+ 1)th power of D(x, y) reveals that
D^{n+1}(x, y) = 1/(1 + x) [ 1 + xy^n  x − xy^n ; 1 − y^n  x + y^n ] · 1/(1 + x) [ 1 + xy  x − xy ; 1 − y  x + y ]
= 1/(1 + x)^2 [ (1 + x)(1 + xy^{n+1})  (x + x^2)(1 − y^{n+1}) ; (1 + x)(1 − y^{n+1})  (1 + x)(x + y^{n+1}) ]
= 1/(1 + x) [ 1 + xy^{n+1}  x − xy^{n+1} ; 1 − y^{n+1}  x + y^{n+1} ].    (6.23)
Therefore, if the lemma is true for an exponent of n, it is also true for an exponent
of n+1. By the Principle of Mathematical Induction, the lemma holds ∀n ∈ N.
The vector σ0 is unaffected by post-multiplication by the state transition matrix
D to the power of any non-negative integer. This is because the stationary state
distribution vector σ0 is an eigenvector of Dn corresponding to the eigenvalue 1.
Lemma 6.3.2. σ0 = σ0Dn ∀n ∈ N.
Proof. This proof is also by induction on the exponent n. The lemma can be shown
to hold for n = 0 by considering
σ0D0 = σ0I2 = σ0. (6.24)
Additionally, the lemma is true for n = 1 because
σ0D = [ 1/(1 + x), x/(1 + x) ] · 1/(1 + x) [ 1 + xy  x − xy ; 1 − y  x + y ]
    = 1/(1 + x)^2 [ 1 + xy + x − xy,  x − xy + x^2 + xy ]
    = 1/(1 + x) [ 1, x ]
    = σ0.    (6.25)
Assume the lemma is true for some n ∈ N. By this assumption and (6.25),
σ0D^{n+1} = σ0D^n D
          = σ0D    (6.26)
          = σ0.
Hence the lemma holds for n+1 and by the Principle of Mathematical Induction,
σ0 = σ0Dn ∀n ∈ N.
Since Dn is a stochastic matrix, the sum of the entries in both of its rows is one.
This concept can also be expressed using multiplication by the column vector e.
Lemma 6.3.3. Dne = e ∀n ∈ N.
Proof.
D^n e = 1/(1 + x) [ 1 + xy^n  x − xy^n ; 1 − y^n  x + y^n ] [ 1 ; 1 ]
      = 1/(1 + x) [ 1 + x ; 1 + x ]
      = e.    (6.27)
A consequence of Lemmas 6.3.2 and 6.3.3 is that g can be calculated for an input
of K(j) = D, ∀j ∈ {1, 2, . . . , n}.
Corollary 6.3.1. g(D,D, . . . ,D) = 1 ∀n ∈ N.
Proof. By the definition of g given in (6.21),
g(D, D, . . . , D) = σ0 D^n e = σ0 e = 1.    (6.28)
It is possible to treat more of the 2^n possible sequences of matrices in M^n with
the assistance of the following lemmas. In the next case to consider, the initial and
final matrix probabilities in the matrix product are both δ, while the rest are D.
Lemma 6.3.4. g(δ, D(1), D(2), . . . , D(n−2), δ) = z^2(1 + xy^{n−1}) ∀n ∈ N.
Proof. By Lemma 6.3.3, the output of the function g in this situation is found to be

σ0 δ D^{n−2} δ e = 1/(1 + x) [ 1, x ] (1 + x)z [ 1  0 ; 0  0 ] D^{n−1} (1 + x)z [ 1  0 ; 0  0 ] D e
= (1 + x)z^2 [ 1, 0 ] · 1/(1 + x) [ 1 + xy^{n−1}  x − xy^{n−1} ; 1 − y^{n−1}  x + y^{n−1} ] [ 1  0 ; 0  0 ] D e
= z^2 [ 1 + xy^{n−1},  x − xy^{n−1} ] [ 1  0 ; 0  0 ] e
= z^2(1 + xy^{n−1}).    (6.29)
Lemma 6.3.4 is easily generalised to inputs of multiple instances of a sequence
of D matrices enclosed between two δ matrices.
Lemma 6.3.5. For r ∈ Z+ and each value ci ∈ N, where i ∈ {1, 2, . . . , r},

g(δ, D(1), D(2), . . . , D(c1), δ, D(1), . . . , D(c2), δ, . . . , δ, D(1), . . . , D(cr), δ)
= z^{r+1}(1 + xy^{c1+1})(1 + xy^{c2+1}) · · · (1 + xy^{cr+1}).
Proof. The proof is by induction on r. Cases where ci = 0 must be considered here,
as two or more consecutive δ matrices are possible. Firstly, g is evaluated for r=1
as
g(δ, D(1), D(2), . . . , D(c1), δ) = σ0 δ D^{c1} δ e = z^2(1 + xy^{c1+1}),    (6.30)
which holds by Lemma 6.3.4. Assume that the current lemma is true for some
r ∈ Z+. That is, it is possible to write

σ0 δ D^{c1} δ D^{c2} δ · · · δ D^{cr} = [ fG(x, y, z), fB(x, y, z) ]    (6.31)

for some polynomials fG(x, y, z), fB(x, y, z) ∈ Z[x, y, z] satisfying

[ fG(x, y, z), fB(x, y, z) ] δ e
= [ fG(x, y, z), fB(x, y, z) ] (1 + x)z [ 1  0 ; 0  0 ] · 1/(1 + x) [ 1 + xy  x − xy ; 1 − y  x + y ] [ 1 ; 1 ]
= z [ fG(x, y, z) · (1 + xy),  fG(x, y, z) · (x − xy) ] [ 1 ; 1 ]
= (1 + x)z fG(x, y, z).    (6.32)

Therefore, the Inductive Hypothesis may be formulated as

(1 + x)z fG(x, y, z) = z^{r+1}(1 + xy^{c1+1})(1 + xy^{c2+1}) · · · (1 + xy^{cr+1}).    (6.33)
By Lemma 6.3.3 and (6.33), extending the examination to the case for r + 1 gives
σ0 δ D^{c1} · · · δ D^{cr} δ D^{c_{r+1}} δ e
= [ fG(x, y, z), fB(x, y, z) ] δ D^{c_{r+1}} δ e
= (1 + x)z [ fG(x, y, z), 0 ] · 1/(1 + x) [ 1 + xy^{c_{r+1}+1}  x − xy^{c_{r+1}+1} ; 1 − y^{c_{r+1}+1}  x + y^{c_{r+1}+1} ] δ e
= z^2 fG(x, y, z) · (1 + x) [ 1 + xy^{c_{r+1}+1},  x − xy^{c_{r+1}+1} ] [ 1  0 ; 0  0 ] e
= (1 + x)z fG(x, y, z) · z(1 + xy^{c_{r+1}+1})
= z^{r+1}(1 + xy^{c1+1}) · · · (1 + xy^{cr+1}) · z(1 + xy^{c_{r+1}+1}).    (6.34)

Thus it follows that

σ0 δ D^{c1} δ D^{c2} δ · · · δ D^{c_{r+1}} δ e = z^{r+2}(1 + xy^{c1+1})(1 + xy^{c2+1}) · · · (1 + xy^{c_{r+1}+1}),    (6.35)
and the lemma is true for r+1 sequences of state transition matrices enclosed between
difference matrices in the parameter list for g. By the Principle of Mathematical
Induction, the lemma is true for all choices of {ci | 1 ≤ i ≤ r} and all r ∈ Z+.
An extension of the previous case is where there are additional state transition
matrices D at the start and/or the end of the list of inputs to the function g. The
following lemma shows they can effectively be ignored when evaluating g.
Lemma 6.3.6. For n1, n2 ∈ N,
g(D(1),D(2), . . . ,D(n1), δ, . . . , δ,D(1),D(2), . . . ,D(n2)) = g(δ, . . . , δ),
where there is some pattern of state transition and/or difference matrices in the list
of inputs to g between the two δ matrices given.
Proof. Firstly by application of the definition of g and Lemma 6.3.2, it can be
established that
g(D(1), . . . , D(n1), δ, . . . , δ, D(1), . . . , D(n2)) = σ0 D^{n1} δ · · · δ D^{n2} e
= σ0 δ · · · δ D^{n2} e
= [ fG(x, y, z), fB(x, y, z) ] D^{n2} e,    (6.36)
for polynomials fG(x, y, z), fB(x, y, z) ∈ Z[x, y, z] where by Lemma 6.3.5,
fG(x, y, z) + fB(x, y, z) = g(δ, . . . , δ). (6.37)
It then follows from Lemma 6.3.1 that

g(D(1), D(2), . . . , D(n1), δ, . . . , δ, D(1), D(2), . . . , D(n2))
= [ fG(x, y, z), fB(x, y, z) ] · 1/(1 + x) [ 1 + xy^{n2}  x − xy^{n2} ; 1 − y^{n2}  x + y^{n2} ] [ 1 ; 1 ]
= fG(x, y, z) + fB(x, y, z)
= g(δ, . . . , δ).    (6.38)
It is now possible to prove the conjecture in [26], which is summarised by the
following theorem.
Theorem 6.3.1. Let g be defined as in (6.20) and (6.21). Then

g(K(1), K(2), . . . , K(n)) = { z^β if β = 0, 1,
                               z^β ∏_{r=1}^{n−1} (1 + xy^r)^{γr} if 2 ≤ β ≤ n,    (6.39)

where the expression ∏_{j=1}^{n} K(j) contains β matrices of the δ variety and γr occurrences
of D^{r−1} embedded between two δ matrices.
Proof. Firstly, if K(j) = D, ∀j ∈ {1, 2, . . . , n}, then β = 0 and g = 1 = z^0 by
directly applying Corollary 6.3.1.
Secondly, if ∏_{j=1}^{n} K(j) contains exactly one δ matrix, in the jth position for some
j ∈ {1, 2, . . . , n}, then β = 1. By Lemmas 6.3.2 and 6.3.3, it follows that

g(D(1), D(2), . . . , D(j−1), δ, D(j+1), . . . , D(n)) = σ0 δ e
= 1/(1 + x) [ 1, x ] (1 + x)z [ 1  0 ; 0  0 ] D e
= z [ 1, x ] [ 1  0 ; 0  0 ] [ 1 ; 1 ]
= z.    (6.40)
If neither of these two cases holds, then \prod_{j=1}^{n} K^{(j)} must contain two or more \delta matrices. Assume that there are \gamma_\alpha instances of D^\alpha occurring between two \delta matrices, 0 \le \alpha \le n-2. For l \in N and c_0, c_1, \ldots, c_{l+1} \in N, application of Lemmas 6.3.5 and 6.3.6 produces

g(D^{(1)}, \ldots, D^{(c_0)}, \delta, D^{(1)}, \ldots, D^{(c_1)}, \delta, \ldots, \delta, D^{(1)}, \ldots, D^{(c_l)}, \delta, D^{(1)}, \ldots, D^{(c_{l+1})})
    = \sigma_0 \delta D^{c_1} \delta \cdots \delta D^{c_l} \delta e
    = z^{l+1} (1 + xy^{c_1+1}) \cdots (1 + xy^{c_l+1})
    = z^{l+1} \prod_{\alpha=0}^{n-2} (1 + xy^{\alpha+1})^{\gamma_\alpha}.    (6.41)

The result in (6.39) follows since there are \beta = l+1 matrices of the \delta variety. There are no other possible products \prod_{j=1}^{n} K^{(j)}, since they must contain either zero, one or at least two \delta matrices. Hence the theorem is proved.
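The closed form in (6.39) can be checked numerically against the defining matrix product. The sketch below is a reconstruction, not quoted from (2.72)-(2.74): it assumes the forms of \sigma_0(x), D(x,y) and \delta(x,y,z) implied by (6.38) and (6.40), namely \sigma_0 = [1, x]/(1+x), D = [[1+xy, x-xy],[1-y, x+y]]/(1+x), and \delta = (1+x)z \, diag(1,0) \, D = z[[1+xy, x-xy],[0, 0]].

```python
import numpy as np

def g_direct(pattern, x, y, z):
    """Evaluate g as the matrix product sigma_0 K(1)...K(n) e.

    pattern is a string of 'D' (state transition matrix) and 'd' (difference
    matrix).  The matrix forms below are reconstructions implied by (6.38)
    and (6.40), not quotations of (2.72)-(2.74)."""
    sigma0 = np.array([1.0, x]) / (1.0 + x)
    D = np.array([[1 + x * y, x - x * y],
                  [1 - y,     x + y   ]]) / (1.0 + x)
    delta = z * np.array([[1 + x * y, x - x * y],
                          [0.0,       0.0      ]])
    M = {'D': D, 'd': delta}
    row = sigma0.copy()
    for k in pattern:
        row = row @ M[k]
    return row @ np.ones(2)

def g_closed(pattern, x, y, z):
    """Evaluate g via the closed form of Theorem 6.3.1."""
    idx = [i for i, k in enumerate(pattern) if k == 'd']
    beta = len(idx)
    val = z ** beta
    for a, b in zip(idx, idx[1:]):
        # a gap of b - a positions holds D^(r-1) with r = b - a,
        # contributing a factor (1 + x y^r)
        val *= 1 + x * y ** (b - a)
    return val

x, y, z = 0.05122, 0.96552, 0.94043
for pat in ['DDDD', 'DdDD', 'dDdD', 'dDdd', 'dddd', 'DdDDdDd']:
    assert abs(g_direct(pat, x, y, z) - g_closed(pat, x, y, z)) < 1e-9
```

The agreement of the two evaluations on patterns with zero, one and several \delta matrices mirrors the three cases of the proof.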
The conditional spectral coefficients Q_s(u_i|v) can now be written as polynomials in x, y and z using Theorem 6.3.1. Define the notation

Q_s(u_i|v) = \begin{cases} Q_s^{(0)}(x, y, z) & \text{for } u_i = 0, \\ Q_s^{(1)}(x, y, z) & \text{for } u_i = 1. \end{cases}    (6.42)
Also, let K_A^{(m)} denote the mth matrix in the product A in (6.16) and let K_B^{(m+i)} denote the mth matrix in the product B in (6.17). Then by (6.15) and (6.42), the conditional spectral polynomials can be expressed in terms of the three burst-error characteristics x, y and z as
Q_s^{(u_i)}(x, y, z) = c_1 g(K_A^{(1)}, K_A^{(2)}, \ldots, K_A^{(i-1)}, D, K_B^{(i+1)}, K_B^{(i+2)}, \ldots, K_B^{(n)})
    + c_2 g(K_A^{(1)}, K_A^{(2)}, \ldots, K_A^{(i-1)}, \delta, K_B^{(i+1)}, K_B^{(i+2)}, \ldots, K_B^{(n)})
    = \begin{cases} c_1 + c_2 z & \text{for } \beta = 0, \\ c_1 z^{\beta} \prod_{l=1}^{n-1} (1 + xy^l)^{\gamma_l} + c_2 z^{\beta+1} \prod_{r=1}^{n-1} (1 + xy^r)^{\gamma_r} & \text{for } \beta \ge 1, \end{cases}    (6.43)

where \beta is the number of \delta(x, y, z) matrices in A D(x, y) B, \gamma_l is the multiplicity of D^{l-1}(x, y) embedded between two \delta(x, y, z) matrices in A D(x, y) B, and \gamma_r is the multiplicity of D^{r-1}(x, y) embedded between two \delta(x, y, z) matrices in A \delta(x, y, z) B.
MacWilliams identity
Some of the procedures detailed in this thesis have used the spectral domain trellis,
which corresponds to the dual code, rather than to the code itself. It has been
shown that for certain codes, the use of such procedures is advantageous in terms
of computational complexity and/or storage requirements. There are similarities
between the relationship of the APPs to the conditional spectral polynomials and
the weight distribution of a code compared to its dual. Suppose that a systematic
binary (n, k) linear block code C contains Aj codewords of weight j and that its
(n, n−k) dual code C⊥ contains Bj codewords of weight j, where 0 ≤ j ≤ n. Define
the weight polynomial for the code C as

A(z) = \sum_{j=0}^{n} A_j z^j,    (6.44)

and the weight polynomial for the dual code C⊥ as

B(z) = \sum_{j=0}^{n} B_j z^j.    (6.45)
A concise way to describe the relationship between these two weight polynomials is the binary version of the MacWilliams identity [25]:

B(z) = 2^{-k} (1+z)^n A\left(\frac{1-z}{1+z}\right).    (6.46)
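Identity (6.46) is easy to confirm numerically for a small code. The sketch below uses the binary (3,1) repetition code, chosen purely for brevity rather than taken from this thesis: its weight polynomial is A(z) = 1 + z^3, and its (3,2) even-weight dual has B(z) = 1 + 3z^2.

```python
from itertools import product

def weight_poly(gen_rows, n):
    """Weight enumerator coefficients [A_0, ..., A_n] of the binary code
    spanned by the given generator rows."""
    coeffs = [0] * (n + 1)
    for bits in product([0, 1], repeat=len(gen_rows)):
        word = [0] * n
        for b, row in zip(bits, gen_rows):
            if b:
                word = [(w + r) % 2 for w, r in zip(word, row)]
        coeffs[sum(word)] += 1
    return coeffs

def evaluate(coeffs, z):
    return sum(c * z ** j for j, c in enumerate(coeffs))

n, k = 3, 1
A = weight_poly([[1, 1, 1]], n)                 # (3,1) repetition code: 1 + z^3
B = weight_poly([[1, 1, 0], [0, 1, 1]], n)      # its (3,2) dual: 1 + 3z^2

# Check B(z) = 2^{-k} (1+z)^n A((1-z)/(1+z)) at several sample points.
for z in [0.0, 0.1, 0.37, 0.5, 0.9]:
    lhs = evaluate(B, z)
    rhs = 2 ** (-k) * (1 + z) ** n * evaluate(A, (1 - z) / (1 + z))
    assert abs(lhs - rhs) < 1e-12
```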
Identity (6.46) is particularly useful when investigating the performance of a binary code over a BSC using syndrome decoding [26, 28]. In general, the weight polynomial for a coset V_t, 0 \le t \le 2^{n-k} - 1, is given by

B_t(z) = \sum_{j=0}^{n} B_{jt} z^j,    (6.47)

where V_t contains B_{jt} words of weight j. Syndrome decoding involves calculation of coset probabilities P_t based on the displacement vector d of the received vector from the transmitted codeword. The coset probabilities are defined as

P_t = P(d \in V_t).    (6.48)

In [26] it is shown that for a BSC with crossover probability \epsilon,

P_t = \frac{1}{2^{n-k}} B_t(1 - 2\epsilon),    (6.49)

and when t = 0, the above equation becomes the MacWilliams identity (6.46).
Such a derivation only applies for a memoryless channel and there is only one
variable involved. The work presented in [26] considered the case of the restricted
GEC and derived generalised weight polynomials for linear block codes used on this
particular type of channel with focus on calculating the performance of syndrome
decoding.
It is also possible to deploy the concept of GWPs for APP decoding and with a channel which has memory. The generalisation does not provide weight distributions or probabilities relating to decodability of received words; rather, it is used to directly calculate the necessary APPs. In addition, it is a function of three variables. Rephrasing in terms of the three burst-error characteristics the original/dual relationship in (4.41) and (4.42) which determines each APP decoding decision, the result may be stated as

B^{(u_i)}(x, y, z) = \sum_{s=0}^{2^{n-k}-1} Q_s^{(u_i)}(x, y, z),    (6.50)

where u_i \in \{0, 1\}. Since the polynomials B^{(u_i)}(x, y, z) generalise the concept of weight polynomials, they shall be referred to by analogy as generalised weight polynomials.
APP decoding procedure using generalised weight polynomials
Given (6.50), an APP decoding procedure for a code used on a binary restricted GEC
defined by burst-error characteristics x, y and z can be described in the following
way.
Procedure 6.1. Given is a binary (n, k) linear block code C in standard form, to
be used on a binary restricted GEC defined by burst-error characteristics x, y and z.
The linear block code C shall be defined by parity check matrix H. A codeword u is
transmitted over the channel and a word v is received. APP decoding using GWPs
can be performed using the following steps.
Step 1. \forall s \in \{0, 1, \ldots, 2^{n-k} - 1\}, compute the dual codeword

u_s^{⊥} = sH = [u_{s,1}^{⊥}, u_{s,2}^{⊥}, \ldots, u_{s,n}^{⊥}] \in C^{⊥}.    (6.51)

Step 2. \forall s \in \{0, 1, \ldots, 2^{n-k} - 1\}, \forall i \in \{1, 2, \ldots, k\} and \forall u_i \in GF(2), compute coefficients c_1 and c_2 using (6.18) and (6.19).

Step 3. \forall s \in \{0, 1, \ldots, 2^{n-k} - 1\}, \forall i \in \{1, 2, \ldots, k\} and \forall u_i \in GF(2), compute the conditional spectral polynomials Q_s^{(u_i)}(x, y, z) using (6.43).

Step 4. \forall i \in \{1, 2, \ldots, k\} and \forall u_i \in GF(2), compute the generalised weight polynomials B^{(u_i)}(x, y, z) by accumulating the 2^{n-k} conditional spectral polynomials Q_s^{(u_i)}(x, y, z) as in (6.50).

Step 5. For each position i \in \{1, 2, \ldots, k\}, derive an APP decoding decision as

\hat{u}_i = \begin{cases} 0 & \text{if } B^{(0)}(x, y, z) \ge B^{(1)}(x, y, z), \\ 1 & \text{if } B^{(0)}(x, y, z) < B^{(1)}(x, y, z). \end{cases}    (6.52)
Note that if the channel is described in terms of state transition and crossover probabilities, the burst-error characteristics can be quickly found using (2.45), (2.46), (2.47) and (6.12). Procedure 6.1 is only applicable for a GEC with the imposed restriction pB = 0.5. This is because GEC models with other values of that parameter do not give matrix probabilities of the correct form, which are easily rendered into the three-variable polynomials in (6.43). Nevertheless, whenever a GEC model is used and the BSC corresponding to the 'bad' state has minimal capacity, Procedure 6.1 provides an alternative to the matrix-multiplication-dependent Procedure 4.2. An example demonstrating how it can be used for decoding is given in the next section.
6.4 Instructive Example
Consider the binary (4,2) linear block code C described in Example 2.1. However, this time assume that v = [1, 0, 0, 1] is received on a binary restricted GEC and the goal is to find an estimate \hat{u}_2 for the transmitted symbol u_2 at position i = 2. The diagonal weighted trellises as calculated using (6.18), (6.19) and (6.43) for u_2 = 0 and u_2 = 1 are shown in Fig. 6.1 (a) and (b), respectively. The correspondence between the zeroes and ones of the codewords of the dual code C⊥ as listed in (4.57), and the positions of the state transition matrices D and difference matrices \delta in the first, third and fourth trellis sections, is clear. The conditional spectral coefficients can be calculated as a sum of products involving the matrix probabilities D and \delta. Firstly, for u_2 = 0, the four coefficients for s = 0, 1, 2, 3 may be expressed as
Q_0(u_2 = 0|v) = \tfrac{1}{2}\sigma_0 D^4 e + \tfrac{1}{2}\sigma_0 D\delta D^2 e,
Q_1(u_2 = 0|v) = -\tfrac{1}{2}\sigma_0 D^3\delta e - \tfrac{1}{2}\sigma_0 D\delta D\delta e,
Q_2(u_2 = 0|v) = -\tfrac{1}{2}\sigma_0 \delta D\delta D e - \tfrac{1}{2}\sigma_0 \delta^3 D e,
Q_3(u_2 = 0|v) = \tfrac{1}{2}\sigma_0 \delta D\delta^2 e + \tfrac{1}{2}\sigma_0 \delta^4 e.    (6.53)
On the other hand, for u_2 = 1 the resulting four coefficients may be reported as

Q_0(u_2 = 1|v) = \tfrac{1}{2}\sigma_0 D^4 e - \tfrac{1}{2}\sigma_0 D\delta D^2 e,
Q_1(u_2 = 1|v) = \tfrac{1}{2}\sigma_0 D^3\delta e - \tfrac{1}{2}\sigma_0 D\delta D\delta e,
Q_2(u_2 = 1|v) = \tfrac{1}{2}\sigma_0 \delta D\delta D e - \tfrac{1}{2}\sigma_0 \delta^3 D e,
Q_3(u_2 = 1|v) = \tfrac{1}{2}\sigma_0 \delta D\delta^2 e - \tfrac{1}{2}\sigma_0 \delta^4 e.    (6.54)
These eight sums or differences of matrix products can then be converted to polynomials in x, y and z using (6.43). In the case u_2 = 0, the four conditional spectral polynomials are obtained as

Q_0^{(0)}(x, y, z) = \tfrac{1}{2}(1 + z),
Q_1^{(0)}(x, y, z) = -\tfrac{1}{2}[z + (1 + xy^2)z^2],
Q_2^{(0)}(x, y, z) = -\tfrac{1}{2}[(1 + xy^2)z^2 + (1 + xy)^2 z^3],
Q_3^{(0)}(x, y, z) = \tfrac{1}{2}[(1 + xy)(1 + xy^2)z^3 + (1 + xy)^3 z^4].    (6.55)
If u_2 = 1, a set of four slightly different conditional spectral polynomials results. These polynomials can be listed as

Q_0^{(1)}(x, y, z) = \tfrac{1}{2}(1 - z),
Q_1^{(1)}(x, y, z) = \tfrac{1}{2}[z - (1 + xy^2)z^2],
Q_2^{(1)}(x, y, z) = \tfrac{1}{2}[(1 + xy^2)z^2 - (1 + xy)^2 z^3],
Q_3^{(1)}(x, y, z) = \tfrac{1}{2}[(1 + xy)(1 + xy^2)z^3 - (1 + xy)^3 z^4].    (6.56)
The two generalised weight polynomials are then given by

B^{(0)}(x, y, z) = \tfrac{1}{2}\left[1 - 2(1 + xy^2)z^2 + xy(y - 1)(1 + xy)z^3 + (1 + xy)^3 z^4\right],    (6.57)

B^{(1)}(x, y, z) = \tfrac{1}{2}\left[1 + xy(y - 1)(1 + xy)z^3 - (1 + xy)^3 z^4\right].    (6.58)
Assume the channel has the same values for the parameters P, Q and p_G as given in the example in [70]. These three values are listed as

P = 1.68 \times 10^{-3},
Q = 3.28 \times 10^{-2},
p_G = 5.7 \times 10^{-3}.    (6.59)

Clearly the value for p_B in [70] of 2.19 \times 10^{-1} cannot be used here, as it is required that p_B = 0.5. The burst-error characteristics (to 5 d.p.) for this channel model can thus be calculated as

x = 0.05122,
y = 0.96552,
z = 0.94043.    (6.60)
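These values can be reproduced in a few lines. The relations x = P/Q, y = 1 - P - Q and z = \sigma_G(1 - 2p_G) with \sigma_G = Q/(P + Q) are reconstructions of (2.45), (2.46) and (6.12) for the case p_B = 0.5, since those equations are not reproduced in this section.

```python
# Channel parameters from (6.59).
P, Q, pG = 1.68e-3, 3.28e-2, 5.7e-3

x = P / Q                    # average fade to connection time ratio, cf. (2.45)
y = 1 - P - Q                # burst factor, cf. (2.46)
sigma_G = Q / (P + Q)        # stationary probability of the 'good' state
z = sigma_G * (1 - 2 * pG)   # channel reliability factor for pB = 0.5, cf. (6.12)

print(round(x, 5), round(y, 5), round(z, 5))   # 0.05122 0.96552 0.94043
```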
Substituting the above three values into (6.57) and (6.58), it follows that the values of the two relevant GWPs for this decoding decision and this channel can be expressed as

B^{(0)}(x, y, z) = 2.46 \times 10^{-2},
B^{(1)}(x, y, z) = 4.72 \times 10^{-2}.    (6.61)

Comparing these two values according to (6.52) means that the decoded bit can be determined as

\hat{u}_2 = 1.    (6.62)
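The whole of this worked example can be reproduced by a direct implementation of Procedure 6.1. Two ingredients of the sketch below are reconstructions rather than quotations: the parity check matrix H is a hypothetical choice whose dual codewords reproduce the D/\delta patterns visible in (6.53), since the H of Example 2.1 is not restated here, and c_1, c_2 follow the non-binary expressions (7.16) and (7.17) of the next chapter specialised to p = 2 and w = -1, since (6.18) and (6.19) are likewise not restated.

```python
from itertools import product

def g_closed(pattern, x, y, z):
    """Closed form of Theorem 6.3.1 for a pattern of 'D' and 'd' (delta)."""
    idx = [i for i, c in enumerate(pattern) if c == 'd']
    val = z ** len(idx)
    for a, b in zip(idx, idx[1:]):
        val *= 1 + x * y ** (b - a)   # gap D^(r-1) contributes (1 + x y^r)
    return val

def gwp(H, v, i, ui, x, y, z):
    """B^(ui)(x, y, z) of (6.50) for decision position i (1-based)."""
    n_k, n = len(H), len(H[0])
    total = 0.0
    for s_bits in product([0, 1], repeat=n_k):        # all 2^(n-k) choices of s
        dual = [sum(sb * H[r][j] for r, sb in enumerate(s_bits)) % 2
                for j in range(n)]                    # dual codeword sH, (6.51)
        sign = (-1) ** sum(v[j] * dual[j] for j in range(n) if j != i - 1)
        c1 = 0.5 * (-1) ** (ui * dual[i - 1]) * sign  # reconstructed c1
        c2 = c1 * (1 if ui == v[i - 1] else -1)       # reconstructed c2
        base = ['d' if dual[j] else 'D' for j in range(n)]
        pat_D = base[:i - 1] + ['D'] + base[i:]       # middle matrix D
        pat_d = base[:i - 1] + ['d'] + base[i:]       # middle matrix delta
        total += c1 * g_closed(pat_D, x, y, z) + c2 * g_closed(pat_d, x, y, z)
    return total

H = [[0, 1, 0, 1], [1, 1, 1, 0]]      # hypothetical H consistent with (6.53)
v = [1, 0, 0, 1]
x, y, z = 0.05122, 0.96552, 0.94043

B0 = gwp(H, v, 2, 0, x, y, z)
B1 = gwp(H, v, 2, 1, x, y, z)
print(round(B0, 4), round(B1, 4), 0 if B0 >= B1 else 1)   # 0.0246 0.0472 1
```

The two printed GWP values match (6.61), and the decision rule (6.52) again returns \hat{u}_2 = 1.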
[Figure: two four-section weighted diagonal trellises, panels (a) and (b), each with four paths from \sigma_0(x) to e terminating in Q_3, Q_2, Q_1, Q_0; the edge labels are drawn from D(x, y), \pm\delta(x, y, z) and (D(x, y) \pm \delta(x, y, z))/2.]

Figure 6.1: Weighted diagonal trellises of the binary (4, 2) linear block code C used to compute the spectral polynomials (a) Q_s^{(u_2=0)}(x, y, z) and (b) Q_s^{(u_2=1)}(x, y, z), s = 0, 1, 2, 3, for a binary restricted GEC.
6.5 Simulation Results
A description of the performance of some binary linear block codes when used over binary restricted GECs and decoded using Procedure 6.1 is provided in this section. These investigations into the performance which could be expected on channel models with a range of parameter values were conducted using simulations run in MATLAB®. Not all possible parameter sets have been simulated; the results given here are only a sample of the possible applications of Procedure 6.1, intended to give an indication of the variety of its uses. Two codes which can correct two transmission errors per word are examined.
6.5.1 (16,8) cyclic code
Computer simulations were carried out for the (16,8) cyclic block code described in (3.105) over a GEC with p_B = 0.5. The structure of this code is defined by the generator polynomial coefficient vector

g = [1, 1, 1, 0, 1, 0, 1, 1, 1],    (6.63)

and the columns of the generator matrix are then permuted into standard form. The BER performance is shown in Fig. 6.2 for an average bit error probability

p_b = p_G \sigma_G + 0.5 \sigma_B = 1\%.    (6.64)

This value of p_b is typical for a mobile radio channel. By (6.12), it follows that the channel reliability factor is given by

z = 0.98.    (6.65)

Simulations were performed for pairs of x and y values taken from the sets

x \in \{0.004, 0.008, \ldots, 0.02\},
y \in \{0, 0.05, \ldots, 1\}.    (6.66)
As expected, the performance improves as x decreases, since the channel has a higher
probability of being in the ‘good’ state G where the crossover probability pG is lower.
The performance also improves with a lower burst factor y, which corresponds to
increased statistical independence of errors.
[Figure: BER versus burst factor y (0 to 1), logarithmic BER axis from 10^{-4} to 10^{-1}, with one curve for each of x = 0.004, 0.008, 0.012, 0.016 and 0.020.]

Figure 6.2: Performance of the (16,8) block code on a binary restricted GEC with p_b = 1%.
6.5.2 (22,13) Chen code
Another example of a code capable of correcting two errors is the quasi-perfect code by Chen, Fan and Jin [60]. The 9 \times 22 parity check matrix for this code is given in (3.106). In this case, computer simulations were run for values of the burst-error characteristics taken from the sets

x = 0.004,
y \in \{0, 0.1, \ldots, 1\},
z \in \{0.95, 0.98, 0.99\},    (6.67)

thus corresponding to average bit error probabilities of 2.5%, 1% and 0.5%.
It can be observed from the plots of the BER performance of this code given in
Fig. 6.3 that for each of the channel reliability factor z values investigated, a decrease
in the burst factor y produces a decrease in the post-decoding BER. Additionally, it
is observed that a decrease in the channel reliability factor z results in an increase in
the post-decoding BER. This happens because a lower z value for a binary restricted
GEC corresponds to an increased crossover probability pG in the ‘good’ state G, and
thus more errors are likely to occur.
[Figure: BER versus burst factor y (0 to 1), logarithmic BER axis from 10^{-5} to 10^{-2}, with one curve for each of z = 0.95, 0.98 and 0.99.]

Figure 6.3: Performance of the (22,13) Chen code on a binary restricted GEC with average fade to connection time ratio x = 0.004.
There is relatively little change in the performance of the Chen code over a
binary restricted GEC with z = 0.95, from y = 0 to y = 1. In these cases, the
crossover probability pG in the ‘good’ state has value 2.31 × 10−2. Whether the
channel is experiencing errors in bursts or independently, the crossover probabilities
are high and the code is not effective in correcting errors. There is a more profound
difference between the cases y=0 and y=1 for the situations where z=0.99. Here,
the corresponding value of the crossover probability pG is 3.02 × 10−3. The code is
not able to correct more than two errors per word, and thus it does not perform
particularly well when the burst factor y is high. However, the low value of pG
means that when errors occur more independently as y approaches zero, such errors
can be corrected and a low BER is observed.
6.6 Summary
A second and more practically-oriented system of parameters for a binary restricted
GEC has been the focus of this chapter. This system has been designed to consider
the distribution of the error bursts rather than the crossover and state transition
probabilities.
An expression for the channel reliability factor was given in terms of the average
error rate over the two states of the channel. This was established by considering the
representation of a syndrome trellis, that is, one which is not used in the calculation
of a particular APP. The objective was then to take advantage of the restriction
on pB in order to determine expressions for the conditional spectral coefficients. A
large portion of the chapter presented a proof of the structure of this expression in
terms of any possible spectral domain trellis.
The binary MacWilliams identity was discussed as an entity providing infor-
mation as a function of one variable about the weight distribution of a code by
consideration of the weight distribution of its dual. Similarly, information about
how a received word should be decoded, as a function of the three burst-error char-
acteristics, is related through the GWPs to the conditional spectral coefficients. The
structure of these coefficients in terms of matrix probabilities depends on the ele-
ments of the dual codewords. These matrix probabilities in turn are dependent on
the burst-error characteristics.
As a result of this correspondence, an APP decoding procedure for a binary restricted GEC was obtained. An example demonstrating the tasks involved in decoding a received word was included in order to reinforce the concepts involved in this procedure. In particular, the steps of converting the conditional spectral coefficients in matrix form to an alternative description in terms of the burst-error characteristics, and then constructing the GWPs leading to a decoding decision, were outlined. Finally, simulations of the decoding procedure on some binary linear
block codes were carried out, where it was observed that the resulting BER appears
to grow in proportion to both the average fade to connection time ratio and the
burst factor. By contrast, an increase in the value of the channel reliability factor
appears to result in a decreased BER, since the channel reliability factor is inversely
proportional to the average probability of a transmission error occurring. This sam-
ple of simulation results indicates the large number of options for further analysis of
the performance of binary linear block codes on a restricted GEC when combined
with APP decoding.
Chapter 7
Generalised Weight Polynomials
for Non-binary Restricted GECs
It is also possible to perform APP decoding using GWPs when the code used is non-
binary. As in the binary case, the behaviour of a non-binary restricted GEC can be
described in terms of the burst-error characteristics. The channel reliability factor
does however need to be calculated differently when non-binary data is transmitted
over the channel. The GWPs again provide a method of calculating APPs using
polynomial evaluation, rather than matrix multiplication. A simple and familiar
relationship between the spectral domain trellis entries and the structure of the
GWPs is established. It is noted that this relationship has similarities with the
non-binary version of the MacWilliams identity.
This chapter is organised as follows. Firstly, the problem to be solved in this
chapter is stated in Section 7.1. In Section 7.2, the channel reliability factor for a
non-binary GEC is examined in further detail. Then, the three burst-error char-
acteristics are used to express the conditional spectral coefficients for a non-binary
restricted GEC in Section 7.3. A resemblance to the MacWilliams identity is dis-
cussed, after which the decoding algorithm can be described with reference to the
GWPs. An example of the decoding process is given in Section 7.4, and Section
7.5 contains a discussion of simulation results. Finally, the chapter is concluded in
Section 7.6.
The major contributions of this chapter are:
• The relationship between the channel reliability factor z and the matrix prob-
abilities D and ∆ for a non-binary GEC using the standard DMC model is
established.
• Derivation of the conditional spectral coefficients for APP decoding on a non-binary restricted GEC is given in terms of the three burst-error characteristics.
• Similarities between these conditional spectral polynomials and the non-binary
MacWilliams identity are noted.
• Description of an APP decoding algorithm for a non-binary restricted GEC
which is based on the evaluation of polynomials is given.
• Numerical examples which provide an indication of the possible uses of this
APP decoding algorithm are reported.
7.1 Problem Statement
Let D be a non-binary restricted GEC with a p-ary DMC in both states of its state
set S = {G,B}, together with an initial state distribution vector σ0. Since D is a
restricted channel, the probability of receiving any symbol whilst in state B given
a transmitted symbol must be identical for all possible received symbols. That is,
under the standard DMC model in Fig. 2.3(a), the crossover probability in the ‘bad’
state can be written as
p_B = \frac{1}{p}.    (7.1)
If the alternative model in Fig. 2.3(b) was used, then the value of pB would be
slightly different. This scenario will not be considered in as much detail as the
standard model.
Suppose that C is an (n, k) linear block code in standard form over GF(p). The linear block code C can be used to encode data prior to transmission over D. As D is a particular type of GEC, the APP decoding decisions \hat{u}_i, for i \in \{1, 2, \ldots, k\}, can be found by computing (5.40). This equation is a statement about matrix products, the entries of which are constructed from four channel parameters P, Q, p_G and p_B. Incorporating (7.1), the challenge is to determine a closed-form polynomial expression for \hat{u}_i which instead involves the three burst-error characteristics. That is, the task is to find a method of expressing the sums of conditional spectral coefficients Q_s(u_i|v) in (5.40), not in terms of \sigma_0, D, \delta and e, but in terms of the average fade to connection time ratio x, the burst factor y, and the channel reliability factor z. Decoding a received word then involves determining k \cdot p polynomials B^{(u_i)}(x, y, z), one for each of the k values of i \in \{1, 2, \ldots, k\} and each of the p values of u_i \in GF(p). By analogy with (5.40), it follows that the decoding decision for each information symbol can be made by evaluating

\hat{u}_i = \arg\max_{u_i \in GF(p)} \left\{ B^{(u_i)}(x, y, z) \right\}.    (7.2)
7.2 The Channel Reliability Factor for a Non-binary GEC
Two sets of parameters have been discussed for the non-binary GEC model. The
matrix multiplication algorithms given in Chapter 5 involved matrices containing
elements which were composed of the parameters P,Q, pG and pB. However in
Chapter 2, it was also explained that the behaviour of the channel could be de-
scribed by burst-error characteristics. Since the system by which the state changes
is identical to that of the binary GEC, the average fade to connection time ratio x
and the burst factor y are defined as in (2.45) and (2.46). The channel reliability
factor z is, however, defined slightly differently.
Here the results of [31] are applied directly. For a code of length n, a description
of the syndrome trellis, which does not consider the likelihood of any specific trans-
mitted symbol in any specific position and uses matrices of the spectral domain, is
given by
\Theta_H = \prod_{j=1}^{n} \mathrm{diag}\left\{ \Theta_{s h_j} \right\},    (7.3)

where

\Theta_{s h_j} = \sum_{g \in GF(p)} D_g w^{\langle s, g h_j \rangle} = \begin{cases} D & \text{if } \langle s, h_j \rangle = 0, \\ \Delta & \text{if } \langle s, h_j \rangle \ne 0. \end{cases}    (7.4)

In this formulation, w is a complex pth root of unity, s = vec_p(s), and vec_p(\cdot) denotes the p-ary vector representation of its decimal input. Additionally, \langle s, h_j \rangle refers to the dot product of the vectors s and h_j, where 0 \le s \le p^{n-k} - 1 and 1 \le j \le n.
The result in (7.4) can be derived using Lemma 5.2.1 and (5.27). By the same
arguments presented in Section 6.2 for binary codes, selecting D as the definition of
the channel reliability factor is also implausible for non-binary codes because of its
complete independence from the crossover probabilities of the channel. Hence the
definition using the difference matrix \Delta is selected. The matrix-to-scalar conversion is again performed using the stationary state distribution \sigma_0 and the column vector e of all ones, so that the channel reliability factor for a non-binary GEC can be expressed as

z = \sigma_0 \Delta e.    (7.5)
[Figure: z decreasing linearly from 1 at p_s = 0 to 0 at p_s = (p-1)/p.]

Figure 7.1: The relationship between the channel reliability function z and the mean SER, p_s.
This confirms the expression for z in terms of the average SER for a GEC using the standard DMC model as given in Section 2.2. Explicitly, using the average SER of that non-binary GEC model as given in (2.59), the relationship may be expressed as

z = [\sigma_G, \sigma_B] \begin{bmatrix} 1 - p p_G & 0 \\ 0 & 1 - p p_B \end{bmatrix} \begin{bmatrix} 1 - P & P \\ Q & 1 - Q \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
    = \sigma_G(1 - p p_G) + \sigma_B(1 - p p_B)
    = 1 - \frac{p}{p-1} p_s.    (7.6)

Although not explicitly derived here, it is to be noted that the expression for the channel reliability factor z in the case of a non-binary GEC using the alternative DMC model is also that of (7.6).
As shown in the graph of the channel reliability factor z as a function of the average SER p_s in Fig. 7.1, the maximum value of z is one, which occurs when p_s is zero. The minimum value of z is zero and occurs when p_s reaches its maximum value of (p-1)/p. The mean SER p_s of the channel may be practically easier to obtain than the two individual crossover probabilities, due to the channel being described as a Markov model where the current state is hidden.

The crossover probabilities for a non-binary GEC are restricted to be at most 1/p, since all possible capacities of symmetric DMCs can be observed by limiting the crossover probability to [0, 1/p]. This can be shown in the following way.
Lemma 7.2.1. The function representing the channel capacity of a symmetric DMC over GF(p), for a positive prime p, assumes all possible values in [0, 1] when restricting its domain to [0, 1/p].
Proof. The capacity function in p-ary units of information for a standard DMC model with crossover probability \epsilon is defined in (2.14) as

f(\epsilon) = 1 + [1 - (p-1)\epsilon] \log_p[1 - (p-1)\epsilon] + (p-1)\epsilon \log_p \epsilon.    (7.7)

The result in (7.7) can be verified with the alternative DMC model in [71]. Although f(\epsilon) is undefined for \epsilon < 0, f is continuous on (0, 1/p], as f consists of products and sums of logarithms which are themselves continuous on (0, 1/p]. Furthermore,

\lim_{\epsilon \to 0^+} f(\epsilon) = 1    (7.8)

is sufficient to ensure continuity at the left endpoint of the domain of f. A DMC model can be defined for 0 \le \epsilon \le \frac{1}{p-1}; however, for all primes p, the inequality

\frac{1}{p} < \frac{1}{p-1}    (7.9)

holds, and thus it can be said that f is continuous on the shorter interval [0, 1/p]. The values of f at the endpoints of this interval are

f(0) = 1, \qquad f\left(\frac{1}{p}\right) = 0.    (7.10)

Then by the Intermediate Value Theorem, for every capacity c \in (0, 1) there exists a crossover probability \epsilon \in (0, 1/p) such that f(\epsilon) = c. Combining this fact with (7.10) means that restricting the domain of f to [0, 1/p] is sufficient to obtain the maximal range of f, which is [0, 1].
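The endpoint values in (7.10) and the limit (7.8) are easy to check numerically. A minimal sketch of f from (7.7), using the standard convention 0 log 0 = 0 so that f(0) takes its limiting value:

```python
import math

def capacity(eps, p):
    """Capacity f(eps) of a symmetric p-ary DMC in p-ary units, per (7.7)."""
    def xlog(t):
        # t * log_p(t), with the convention 0 * log 0 = 0
        return t * math.log(t, p) if t > 0 else 0.0
    return 1 + xlog(1 - (p - 1) * eps) + (p - 1) * xlog(eps)

for p in (2, 3, 5, 7):
    assert abs(capacity(0.0, p) - 1.0) < 1e-12   # f(0) = 1
    assert abs(capacity(1.0 / p, p)) < 1e-9      # f(1/p) = 0
    assert capacity(1e-9, p) > 0.999             # f -> 1 as eps -> 0+
```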
Thus for any GEC model, only crossover probabilities up to and including 1/p need to be considered. The behaviour of a GEC may then be described by the parameters x, y, z and p_B, where 0 \le p_B \le 1/p. It is also possible to express the conditional spectral coefficient matrices as developed in Chapter 5 in terms of these four parameters.
7.3 Derivation of Non-binary APPs using Generalised Weight Polynomials
In Chapter 6, it was demonstrated how the ability to describe a binary restricted
GEC in terms of three parameters produced an alternative to expressing the condi-
tional spectral coefficients Qs(ui|v) as matrix products. The alternative was to write
the conditional spectral coefficients as polynomials with a structure determined by
the distribution of the state transition matrices D and difference matrices δ in each
matrix product. The same task is performed here for the non-binary restricted GEC.
The similarity of the binary and non-binary results will be shown.
For brevity, only the non-binary restricted GEC using the standard DMC model will be discussed. Application of the restriction with the standard model means

p_B = \frac{1}{p}.    (7.11)
With this restriction in place, the channel can be described by the three burst-
error characteristics x, y and z. As this is a GEC, the definitions of the burst-error
characteristics x and y in terms of P andQ as given in (2.45) and (2.46), respectively,
are applicable here. Also, (2.38) and (2.81) combine to produce the expression

z = (1 - p p_G) \frac{Q}{P + Q}    (7.12)

for the channel reliability factor. Equation (7.11) implies the use of the difference matrix \delta to describe the conditional spectral coefficients Q_s(u_i|v). Consequently, the spectral domain is appropriate to use, and the notation of the state transition matrix D and the difference matrix \delta as given in (2.36) and (2.77), respectively, will be adopted. The subsequent combination of the expression for the conditional spectral coefficient matrices Q_s(u_i|v) given in (5.30) with (5.31) and (7.11) gives a new way of writing the conditional spectral coefficients as
Q_s(u_i|v) = \sigma_0 \prod_{j=1}^{i-1} \left\{ w^{v_j \cdot u^{⊥}_{s,j}} \left[ \delta_{u^{⊥}_{s,j},0} D + (1 - \delta_{u^{⊥}_{s,j},0}) \delta \right] \right\}
    \times \frac{w^{u_i \cdot u^{⊥}_{s,i}}}{p} \left[ D + (\delta_{u_i,v_i} p - 1) \delta \right] \times \prod_{j=i+1}^{n} \left\{ w^{v_j \cdot u^{⊥}_{s,j}} \left[ \delta_{u^{⊥}_{s,j},0} D + (1 - \delta_{u^{⊥}_{s,j},0}) \delta \right] \right\} e    (7.13)
    = c_1 \sigma_0 A D B e + c_2 \sigma_0 A \delta B e,

where

A = \prod_{j=1}^{i-1} \left[ \delta_{u^{⊥}_{s,j},0} D + (1 - \delta_{u^{⊥}_{s,j},0}) \delta \right],    (7.14)

B = \prod_{j=i+1}^{n} \left[ \delta_{u^{⊥}_{s,j},0} D + (1 - \delta_{u^{⊥}_{s,j},0}) \delta \right],    (7.15)

and

c_1 = \frac{w^{u_i \cdot u^{⊥}_{s,i}}}{p} \prod_{j=1, j \ne i}^{n} w^{v_j \cdot u^{⊥}_{s,j}},    (7.16)

c_2 = \frac{w^{u_i \cdot u^{⊥}_{s,i}}}{p} (\delta_{u_i,v_i} p - 1) \prod_{j=1, j \ne i}^{n} w^{v_j \cdot u^{⊥}_{s,j}},    (7.17)

where \delta_{a,b} denotes the Kronecker delta, as distinct from the difference matrix \delta.
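The coefficients (7.16) and (7.17) amount to straightforward complex arithmetic with a pth root of unity. A minimal sketch for p = 3, where the dual codeword, received word, decision position and candidate symbol are all hypothetical values chosen only for illustration:

```python
import cmath

p = 3
w = cmath.exp(2j * cmath.pi / p)     # complex p-th root of unity

# Hypothetical example data over GF(3): dual codeword u_s^perp, received
# word v, decision position i (1-based) and candidate symbol u_i.
dual = [1, 2, 0, 1]
v    = [2, 0, 1, 1]
i, ui = 2, 1

# Product of w^(v_j * u_{s,j}^perp) over all positions j != i.
phase = 1
for j in range(len(v)):
    if j != i - 1:
        phase *= w ** (v[j] * dual[j])

c1 = (w ** (ui * dual[i - 1]) / p) * phase                    # (7.16)
kron = 1 if ui == v[i - 1] else 0                             # Kronecker delta
c2 = (w ** (ui * dual[i - 1]) / p) * (kron * p - 1) * phase   # (7.17)

# |c1| is always 1/p, since it is a product of unit-modulus roots of unity
# scaled by 1/p, and c2 differs from c1 only by the real factor (kron*p - 1).
assert abs(abs(c1) - 1 / p) < 1e-12
assert abs(c2 - c1 * (kron * p - 1)) < 1e-12
```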
In addition, let

M = \{D, \delta\}    (7.18)

and note that each matrix within the products A and B is a member of the set M. It is now possible to reformulate (7.13) in terms of the burst-error characteristic definitions of the stationary state distribution vector \sigma_0(x), the state transition matrix D(x, y) and the difference matrix \delta(x, y, z) as given in (2.72), (2.73) and (2.74), respectively.

Define a function g for a linear block code of length n over GF(p) being used on a restricted GEC with stationary state distribution vector \sigma_0 as

g : M \times M \times \cdots \times M \to Z[x, y, z],
g(K^{(1)}, K^{(2)}, \ldots, K^{(n)}) = \sigma_0 \prod_{j=1}^{n} K^{(j)} e,    (7.19)

where

K^{(j)} \in M \quad \forall j \in \{1, 2, \ldots, n\}.    (7.20)
Then, Theorem 6.3.1 also applies in the non-binary case and

g(K^{(1)}, K^{(2)}, \ldots, K^{(n)}) = \begin{cases} z^{\beta} & \text{if } \beta = 0, 1, \\ z^{\beta} \prod_{r=1}^{n-1} (1 + xy^r)^{\gamma_r} & \text{if } 2 \le \beta \le n, \end{cases}    (7.21)

where the expression \prod_{j=1}^{n} K^{(j)} contains \beta instances of the difference matrix \delta and \gamma_r occurrences of D^{r-1}, that is, r-1 consecutive instances of the state transition matrix D, embedded between two \delta matrices.
It then becomes possible to rewrite the conditional spectral coefficients Q_s(u_i|v) as polynomials in terms of the burst-error characteristics x, y and z. Let this polynomial be denoted Q_s^{(u_i)}(x, y, z). To reference each matrix in (7.14) and (7.15), define the notation

A = K_A^{(1)} K_A^{(2)} \cdots K_A^{(i-1)},    (7.22)

B = K_B^{(i+1)} K_B^{(i+2)} \cdots K_B^{(n)}.    (7.23)
Then, (7.13) can be rewritten using (7.21) as

Q_s^{(u_i)}(x, y, z) = c_1 g(K_A^{(1)}, K_A^{(2)}, \ldots, K_A^{(i-1)}, D, K_B^{(i+1)}, K_B^{(i+2)}, \ldots, K_B^{(n)})
    + c_2 g(K_A^{(1)}, K_A^{(2)}, \ldots, K_A^{(i-1)}, \delta, K_B^{(i+1)}, K_B^{(i+2)}, \ldots, K_B^{(n)})
    = \begin{cases} c_1 + c_2 z & \text{for } \beta = 0, \\ c_1 z^{\beta} \prod_{l=1}^{n-1} (1 + xy^l)^{\gamma_l} + c_2 z^{\beta+1} \prod_{r=1}^{n-1} (1 + xy^r)^{\gamma_r} & \text{for } \beta \ge 1, \end{cases}    (7.24)

where A, B, c_1 and c_2 are defined in (7.14)-(7.17). Furthermore, \beta is the number of difference matrices \delta(x, y, z) in A D(x, y) B, \gamma_l is the multiplicity of D^{l-1}(x, y) embedded between two \delta(x, y, z) matrices in A D(x, y) B, and \gamma_r is the multiplicity of D^{r-1}(x, y) embedded between two \delta(x, y, z) matrices in A \delta(x, y, z) B.
Non-binary MacWilliams identity
There is also a non-binary version of the MacWilliams identity [25]. For a systematic (n, k) linear block code C over GF(p) containing A_j codewords of weight j, and its (n, n-k) dual code C⊥ containing B_j codewords of weight j, 0 \le j \le n, the identity may be expressed in terms of an indeterminate z as

A\left(\frac{1 - z}{1 + (p-1)z}\right) = \frac{p^k}{[1 + (p-1)z]^n} B(z).    (7.25)

In this formulation, the weight polynomial A(z) for the linear block code C is

A(z) = \sum_{j=0}^{n} A_j z^j    (7.26)

and the weight polynomial B(z) for the dual code C⊥ is

B(z) = \sum_{j=0}^{n} B_j z^j.    (7.27)
There is thus a transformation involved in obtaining the weight polynomial A(z) for the code C from the weight polynomial B(z) for the dual code C⊥. In a similar way, (5.38) demonstrated that the vector P(u_i|v) of a posteriori probabilities is related to the vector Q(u_i|v) of conditional spectral coefficients through a transformation matrix W_{p^{n-k}}. Specifically, the relationship between the original and spectral domains was given by

P(u_i|v) = \frac{1}{p^{n-k}} Q(u_i|v) W^{H}_{p^{n-k}}.    (7.28)
As shown in Chapter 5, the APP decoding decisions are made by comparing the first
elements P_0(u_i|v) of the vectors P(u_i|v) for the different values of u_i. In (7.2), the
polynomials B^{(u_i)}(x, y, z) were conceived as generalisations of these first elements
P_0(u_i|v) to trivariate polynomials. Since (7.24) demonstrates how to express the
conditional spectral coefficients Q_s(u_i|v) as functions Q_s^{(u_i)}(x, y, z) of the three
burst-error characteristics x, y and z, consideration of the structure of (7.2) and (7.28)
reveals that

\[
B^{(u_i)}(x, y, z) = \sum_{s=0}^{p^{n-k}-1} Q_s^{(u_i)}(x, y, z) \qquad (7.29)
\]

provides the necessary link between the original domain polynomials B^{(u_i)}(x, y, z)
and the spectral domain polynomials Q_s^{(u_i)}(x, y, z). The weight polynomials of (7.25)
possess a similar dual relationship, but since (7.29) has generalised the relationship
of the polynomials B^{(u_i)}(x, y, z) with their spectral domain counterparts to three
variables, the polynomials B^{(u_i)}(x, y, z) shall be referred to as generalised weight
polynomials. It therefore follows that the final decoding decision for each position
i ∈ {1, 2, . . . , k} is given by

\[
\hat{u}_i = \arg\max_{u_i \in GF(p)} \left\{ B^{(u_i)}(x, y, z) \right\}. \qquad (7.30)
\]
APP decoding procedure using generalised weight polynomials
It is now possible to describe a procedure which performs APP decoding of non-
binary linear block codes over a restricted GEC.
Procedure 7.1. Given is an (n, k) linear block code C over GF (p). The code is
in standard form, defined by parity check matrix H, and is to be used on a p-ary
restricted GEC defined by burst-error characteristics x, y and z. A codeword u is
transmitted over the channel and a word v is received. APP decoding using GWPs
consists of the following steps.
Step 1. ∀s ∈ {0, 1, . . . , p^{n-k} - 1}, compute the dual codeword

\[
\mathbf{u}_s^{\perp} = \mathbf{s}H = \left[ u_{s,1}^{\perp}, u_{s,2}^{\perp}, \ldots, u_{s,n}^{\perp} \right] \in C^{\perp}. \qquad (7.31)
\]
Step 2. ∀s ∈ {0, 1, . . . , p^{n-k} - 1}, ∀i ∈ {1, 2, . . . , k} and ∀u_i ∈ GF(p), compute
coefficients c_1 and c_2 using (7.16) and (7.17).

Step 3. ∀s ∈ {0, 1, . . . , p^{n-k} - 1}, ∀i ∈ {1, 2, . . . , k} and ∀u_i ∈ GF(p), compute
conditional spectral polynomials Q_s^{(u_i)}(x, y, z) using (7.24).
Step 4. ∀i ∈ {1, 2, . . . , k} and ∀u_i ∈ GF(p), compute the generalised weight
polynomials B^{(u_i)}(x, y, z) by accumulating the p^{n-k} conditional spectral
polynomials Q_s^{(u_i)}(x, y, z) as in (7.29).

Step 5. Derive an APP decoding decision û_i for the ith transmitted symbol u_i for
each position i ∈ {1, 2, . . . , k} using (7.30).
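Step 1 above can be sketched in a few lines: each index s is expanded into its p-ary digit vector of length n - k and multiplied by H as in (7.31). The parity check matrix below is an arbitrary stand-in chosen for illustration, not the matrix of the example that follows.

```python
# Sketch of Step 1 of Procedure 7.1: enumerate all p^(n-k) dual codewords
# u_s = s H over GF(p). H is an example (n-k) x n matrix over GF(3).
p = 3
H = [[1, 0, 1, 1],
     [0, 1, 1, 2]]
n_minus_k = len(H)

def dual_codeword(s_index):
    # p-ary digit expansion of s_index into the vector s of length n-k,
    # then the product s H reduced modulo p, one entry per column of H.
    s = [(s_index // p**t) % p for t in range(n_minus_k)]
    return tuple(sum(s_t * h for s_t, h in zip(s, col)) % p
                 for col in zip(*H))

dual_code = [dual_codeword(s) for s in range(p**n_minus_k)]
```

The p^{n-k} = 9 words produced are the codewords of the dual code, with s = 0 giving the all-zero word.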
7.4 Instructive Example
Let C be the (4,2) linear block code over GF(3) as given in (3.107). Assume that
v = [1, 2, 2, 0] is received through a ternary restricted GEC with p_B = 1/3 and it is
required that the second symbol transmitted u_2 be estimated using APP decoding.
The estimate û_2 of this symbol can be found by applying Procedure 7.1. Diagonal
weighted trellises for the three possibilities u_2 = 0, u_2 = 1 and u_2 = 2 are similar to
those in Figs. 5.1-5.3. However, the replacement of δ for ∆ has been made since the
channel is restricted. Additionally, the conditional spectral polynomials Q_s^{(u_i)}(x, y, z)
are given in terms of the burst-error characteristics x, y and z. The results are
presented in Figs. 7.2-7.4. Referring to the codewords of C⊥ in (5.81), note the
correspondence in the first, third and fourth trellis sections between the locations
of the state transition matrices D(x, y) and the positions of the zero entries of the
dual codewords. Furthermore, the locations of the difference matrices δ(x, y, z) in
trellis sections one, three and four correspond to the positions of nonzero entries of
the dual codewords. Since p = 3, w is fixed at e^{-2πj/3}. The nine conditional spectral
polynomials in each of the three cases are obtained by multiplying across each row of
the trellis and using (7.16), (7.17) and (7.24). That is, considering the case u_2 = 0,
the conditional spectral polynomials can be listed as
\[
\begin{aligned}
Q_0^{(0)}(x, y, z) &= \tfrac{1}{3}(1 - z),\\
Q_1^{(0)}(x, y, z) &= \tfrac{1}{3}\,w\!\left[z^2(1 + xy^3) - z^3(1 + xy)(1 + xy^2)\right],\\
Q_2^{(0)}(x, y, z) &= \tfrac{1}{3}\,w^2\!\left[z^2(1 + xy^3) - z^3(1 + xy)(1 + xy^2)\right],\\
Q_3^{(0)}(x, y, z) &= \tfrac{1}{3}\,w\!\left[z^2(1 + xy^2) - z^3(1 + xy)^2\right],\\
Q_4^{(0)}(x, y, z) &= \tfrac{1}{3}\,w^2\!\left[z^2(1 + xy) - z^3(1 + xy)^2\right],\\
Q_5^{(0)}(x, y, z) &= \tfrac{1}{3}\left[z^3(1 + xy)(1 + xy^2) - z^4(1 + xy)^3\right],\\
Q_6^{(0)}(x, y, z) &= \tfrac{1}{3}\,w^2\!\left[z^2(1 + xy^2) - z^3(1 + xy)^2\right],\\
Q_7^{(0)}(x, y, z) &= \tfrac{1}{3}\left[z^3(1 + xy)(1 + xy^2) - z^4(1 + xy)^3\right],\\
Q_8^{(0)}(x, y, z) &= \tfrac{1}{3}\,w\!\left[z^2(1 + xy) - z^3(1 + xy)^2\right].
\end{aligned} \qquad (7.32)
\]
If u_2 = 1, then the conditional spectral polynomials can be reported as

\[
\begin{aligned}
Q_0^{(1)}(x, y, z) &= \tfrac{1}{3}(1 - z),\\
Q_1^{(1)}(x, y, z) &= \tfrac{1}{3}\left[z^2(1 + xy^3) - z^3(1 + xy)(1 + xy^2)\right],\\
Q_2^{(1)}(x, y, z) &= \tfrac{1}{3}\left[z^2(1 + xy^3) - z^3(1 + xy)(1 + xy^2)\right],\\
Q_3^{(1)}(x, y, z) &= \tfrac{1}{3}\left[z^2(1 + xy^2) - z^3(1 + xy)^2\right],\\
Q_4^{(1)}(x, y, z) &= \tfrac{1}{3}\left[z^2(1 + xy) - z^3(1 + xy)^2\right],\\
Q_5^{(1)}(x, y, z) &= \tfrac{1}{3}\left[z^3(1 + xy)(1 + xy^2) - z^4(1 + xy)^3\right],\\
Q_6^{(1)}(x, y, z) &= \tfrac{1}{3}\left[z^2(1 + xy^2) - z^3(1 + xy)^2\right],\\
Q_7^{(1)}(x, y, z) &= \tfrac{1}{3}\left[z^3(1 + xy)(1 + xy^2) - z^4(1 + xy)^3\right],\\
Q_8^{(1)}(x, y, z) &= \tfrac{1}{3}\left[z^2(1 + xy) - z^3(1 + xy)^2\right].
\end{aligned} \qquad (7.33)
\]
Finally, under the supposition of u_2 = 2, the conditional spectral polynomials can
be calculated as

\[
\begin{aligned}
Q_0^{(2)}(x, y, z) &= \tfrac{1}{3} + \tfrac{2}{3}z,\\
Q_1^{(2)}(x, y, z) &= \tfrac{1}{3}\,w^2 z^2(1 + xy^3) + \tfrac{2}{3}\,w^2 z^3(1 + xy)(1 + xy^2),\\
Q_2^{(2)}(x, y, z) &= \tfrac{1}{3}\,w z^2(1 + xy^3) + \tfrac{2}{3}\,w z^3(1 + xy)(1 + xy^2),\\
Q_3^{(2)}(x, y, z) &= \tfrac{1}{3}\,w^2 z^2(1 + xy^2) + \tfrac{2}{3}\,w^2 z^3(1 + xy)^2,\\
Q_4^{(2)}(x, y, z) &= \tfrac{1}{3}\,w z^2(1 + xy) + \tfrac{2}{3}\,w z^3(1 + xy)^2,\\
Q_5^{(2)}(x, y, z) &= \tfrac{1}{3}\,z^3(1 + xy)(1 + xy^2) + \tfrac{2}{3}\,z^4(1 + xy)^3,\\
Q_6^{(2)}(x, y, z) &= \tfrac{1}{3}\,w z^2(1 + xy^2) + \tfrac{2}{3}\,w z^3(1 + xy)^2,\\
Q_7^{(2)}(x, y, z) &= \tfrac{1}{3}\,z^3(1 + xy)(1 + xy^2) + \tfrac{2}{3}\,z^4(1 + xy)^3,\\
Q_8^{(2)}(x, y, z) &= \tfrac{1}{3}\,w^2 z^2(1 + xy) + \tfrac{2}{3}\,w^2 z^3(1 + xy)^2.
\end{aligned} \qquad (7.34)
\]
The three generalised weight polynomials which will be used in order to determine
û_2 are obtained by adding each set of nine polynomials. After performing these
additions, the GWPs can be determined as

\[
B^{(0)}(x, y, z) = \frac{1 - z - z^2 m(x, y) + z^3(5 + 2xy + 3xy^2)(1 + xy) - 2z^4(1 + xy)^3}{3}, \qquad (7.35)
\]
\[
B^{(1)}(x, y, z) = \frac{1 - z + 2z^2 m(x, y) - 4z^3(1 + xy)^2 - 2z^4(1 + xy)^3}{3}, \qquad (7.36)
\]
\[
B^{(2)}(x, y, z) = \frac{1 + 2z - z^2 m(x, y) - 4z^3(1 + xy)^2 + 4z^4(1 + xy)^3}{3}, \qquad (7.37)
\]

where

\[
m(x, y) = 3 + xy(1 + y + y^2) \qquad (7.38)
\]

denotes an expression which is common to all three GWPs. The values for the burst-
error characteristics x, y and z can now be substituted into (7.35)-(7.38). According
Figure 7.2: Weighted diagonal trellis of the (4,2) linear block code C over GF(3)
used to compute the conditional spectral polynomials Q_s^{(u_2=0)}(x, y, z); s = 0, 1, . . . , 8.
Figure 7.3: Weighted diagonal trellis of the (4,2) linear block code C over GF(3)
used to compute the conditional spectral polynomials Q_s^{(u_2=1)}(x, y, z); s = 0, 1, . . . , 8.
Figure 7.4: Weighted diagonal trellis of the (4,2) linear block code C over GF(3)
used to compute the conditional spectral polynomials Q_s^{(u_2=2)}(x, y, z); s = 0, 1, . . . , 8.
to (7.30), the three polynomials B^{(u_2)}(x, y, z) must be evaluated and then compared
in magnitude. The estimate û_2 for the transmitted symbol u_2 is the element of GF(3)
which produces the highest evaluation amongst the three GWPs after substitution
of the three burst-error characteristics x, y and z. For example, if the GEC model
has burst-error characteristics

\[
x = 1.2 \times 10^{-2}, \qquad y = 5 \times 10^{-2}, \qquad z = 9.85 \times 10^{-1}, \qquad (7.39)
\]

then the values of the three GWPs can be determined as

\[
B^{(0)}(x, y, z) = 3.57 \times 10^{-5}, \qquad B^{(1)}(x, y, z) = 4.14 \times 10^{-2}, \qquad B^{(2)}(x, y, z) = 1.19 \times 10^{-3}. \qquad (7.40)
\]

Therefore the result of the decoding is û_2 = 1.
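The final evaluation can be reproduced directly from (7.35)-(7.38). The sketch below substitutes the burst-error characteristics of (7.39) into the three GWPs and recovers both the magnitudes of (7.40) and the decision û_2 = 1.

```python
# Evaluate the generalised weight polynomials (7.35)-(7.37) at the
# burst-error characteristics of (7.39) and decode as in (7.30).
x, y, z = 1.2e-2, 5e-2, 9.85e-1

m = 3 + x * y * (1 + y + y**2)  # the common term m(x, y) of (7.38)

B0 = (1 - z - z**2 * m + z**3 * (5 + 2*x*y + 3*x*y**2) * (1 + x*y)
      - 2 * z**4 * (1 + x*y)**3) / 3
B1 = (1 - z + 2 * z**2 * m - 4 * z**3 * (1 + x*y)**2
      - 2 * z**4 * (1 + x*y)**3) / 3
B2 = (1 + 2*z - z**2 * m - 4 * z**3 * (1 + x*y)**2
      + 4 * z**4 * (1 + x*y)**3) / 3

# argmax over GF(3) as in (7.30); B1 dominates, so the estimate is 1.
u2_hat = max(range(3), key=lambda u: (B0, B1, B2)[u])
```

The three computed values agree with (7.40) to three significant figures, and u2_hat is 1.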
7.5 Simulation Results
Some computer simulations of APP decoding using Procedure 7.1 were carried out
using MATLAB®. The SER performance for a selection of different burst-error
characteristic values was obtained for two codes. The objective of the simulations is
not to provide a complete performance analysis of the codes, but rather to make
some initial observations and indicate the possibilities for such analysis.
7.5.1 (4,2) Hamming code over GF(3)
Computer simulations of decoding using Procedure 7.1 were first carried out for the
(4,2) Hamming code over GF(3) described in (3.107). The channel is a ternary
restricted GEC. A plot of the SER values obtained is given in Fig. 7.5. The value
of the channel reliability factor z was fixed at 9.85 × 10^{-2+1} = 9.85 × 10^{-1}, corresponding to a mean
symbol error rate of p_s = 1%. Pairs of values for the average fade to connection
time ratio x and the burst factor y were chosen from the sets

\[
x \in \{0.003, 0.006, \ldots, 0.015\}, \qquad y \in \{0, 0.05, \ldots, 1\}. \qquad (7.41)
\]
As can be observed from Fig. 7.5, a decrease in the average fade to connection time
ratio x means the GEC is more likely to be in the ‘good’ state G, where the crossover
probability is lower. Thus, the simulation results are consistent with the hypothesis
that post-decoding SER increases with the average fade to connection time ratio x,
as was observed in Section 6.5.
There is a marked improvement in the performance of this code as the burst
factor y decreases, particularly for high values of the average fade to connection
time ratio x as the burst factor y tends toward zero from above. In this case, the
error bursts are both rare and decreasing in duration. Thus, it becomes increasingly
probable that at most one transmission error per received word has occurred. So
the code is increasingly likely to be able to correct such errors and the resulting SER
decreases sharply.
It is also to be observed that the five values of the average fade to connection time
ratio x produce the same post-decoding SER when the burst factor y is zero. It can
be seen that this is the expected behaviour by considering (7.24) for y = 0. Suppose
the ith transmitted symbol u_i of a codeword from a code C is being estimated and
let β be the number of nonzero entries in the sth codeword of the dual code C⊥,
0 ≤ s ≤ p^{n-k} - 1, other than in the ith position of that word. It can then be shown
that the conditional spectral polynomial Q_s^{(u_i)}(x, y, z) for this situation is given by

\[
Q_s^{(u_i)}(x, 0, z) = z^{\beta}(c_1 + c_2 z). \qquad (7.42)
\]
Thus by (7.29) and (7.30), when the burst factor y is zero, the APP decoding decisions
are independent of the average fade to connection time ratio x, which explains the
intersection of the five curves in Fig. 7.5. A similar argument involving the evaluation
of Q_s^{(u_i)}(0, y, z) can be used to show that when the average fade to connection
time ratio x is zero, the SER achieved is independent of the burst factor y.
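For the GWPs of the example in Section 7.4 this independence is immediate to confirm numerically: with y = 0, m(x, 0) = 3 and 1 + xy = 1, so every occurrence of x disappears from (7.35)-(7.37). The short check below evaluates one of those GWPs at y = 0 for the five simulated values of x.

```python
# Confirm that a GWP from the Section 7.4 example is independent of x
# when the burst factor y is zero, as predicted by (7.42).
def gwp_B1(x, y, z):
    m = 3 + x * y * (1 + y + y**2)
    return (1 - z + 2 * z**2 * m - 4 * z**3 * (1 + x*y)**2
            - 2 * z**4 * (1 + x*y)**3) / 3

z = 0.985
vals = [gwp_B1(x, 0.0, z) for x in (0.003, 0.006, 0.009, 0.012, 0.015)]
# With y = 0 the polynomial collapses to (1 - z + 6z^2 - 4z^3 - 2z^4)/3,
# which involves no x, so all five evaluations coincide.
```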
7.5.2 (26,22) BCH code over GF(3)
The Bose-Chaudhuri-Hocquenghem (BCH) codes were one of the major advances in
the history of coding theory [72]. Although better codes have since been discovered,
at the time they represented a class of codes with good error correction capabilities.
Their design was general enough so that they could be implemented over any field.
In fact, the Hamming codes are a subclass of the BCH codes. Gorenstein and
Zierler [73] noted the codes’ ability for correction of errors which occurred in bursts,
rather than independently. In this example, a code over GF (3) is constructed. The
BCH codes are cyclic, so if the coefficients of the generator polynomial are given as

\[
g = [1, 2, 1, 2, 2] \qquad (7.43)
\]
Figure 7.5: Performance of the (4,2) Hamming code over GF(3) on restricted GECs
with p_B = 1/3, p_s = 1%.

Figure 7.6: Performance of the (26,22) BCH code over GF(3) on restricted GECs
with p_B = 1/3, p_s = 1%.
and n is set as 3^3 - 1 = 26, then the result is a 22 × 26 generator matrix with zeroes
in its upper right and lower left regions, and g cyclically shifted one position to the
right in each row. This is however not in standard form. By performing elementary
row operations, a generator matrix in standard form for an equivalent code C is
found as

\[
G = [\,I_{22} \mid K\,], \qquad (7.44)
\]
where

\[
K = \begin{bmatrix}
2 & 1 & 1 & 1 & 1 & 0 & 1 & 2 & 0 & 1 & 1 & 2 & 1 & 1 & 2 & 0 & 2 & 0 & 1 & 0 & 0 & 2\\
1 & 1 & 0 & 0 & 0 & 1 & 2 & 2 & 2 & 2 & 0 & 2 & 1 & 0 & 2 & 2 & 1 & 2 & 2 & 1 & 0 & 1\\
2 & 2 & 2 & 1 & 1 & 0 & 2 & 1 & 2 & 0 & 0 & 2 & 0 & 2 & 2 & 2 & 1 & 1 & 0 & 2 & 1 & 2\\
1 & 1 & 1 & 1 & 0 & 1 & 2 & 0 & 1 & 1 & 2 & 1 & 1 & 2 & 0 & 2 & 0 & 1 & 0 & 0 & 2 & 2
\end{bmatrix}^{T}. \qquad (7.45)
\]
This results in a 4 × 26 parity check matrix of

\[
H = \begin{bmatrix}
1 & 2 & 2 & 2 & 2 & 0 & 2 & 1 & 0 & 2 & 2 & 1 & 2 & 2 & 1 & 0 & 1 & 0 & 2 & 0 & 0 & 1 & 1 & 0 & 0 & 0\\
2 & 2 & 0 & 0 & 0 & 2 & 1 & 1 & 1 & 1 & 0 & 1 & 2 & 0 & 1 & 1 & 2 & 1 & 1 & 2 & 0 & 2 & 0 & 1 & 0 & 0\\
1 & 1 & 1 & 2 & 2 & 0 & 1 & 2 & 1 & 0 & 0 & 1 & 0 & 1 & 1 & 1 & 2 & 2 & 0 & 1 & 2 & 1 & 0 & 0 & 1 & 0\\
2 & 2 & 2 & 2 & 0 & 2 & 1 & 0 & 2 & 2 & 1 & 2 & 2 & 1 & 0 & 1 & 0 & 2 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1
\end{bmatrix}. \qquad (7.46)
\]
It is clear from (7.44) and (7.45) that d(C) = 3 and so this code of rate R = 0.85
uses its four parity symbols to correct one error.
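The construction can be checked mechanically. With Kᵀ transcribed from (7.45), the standard-form parity check matrix is H = [-Kᵀ | I₄] (mod 3), reproducing (7.46), and every no-wraparound right-shift of the generator coefficients g of (7.43) must then have zero syndrome. A sketch of this check:

```python
# Verify the (26,22) ternary BCH construction: with K^T from (7.45),
# H = [-K^T | I_4] (mod 3) as in (7.46), and every right-shift of the
# generator coefficients g of (7.43) is annihilated by H.
p, n, k = 3, 26, 22
g = [1, 2, 1, 2, 2]

KT = [  # the four printed rows of (7.45), i.e. the columns of K
    [2,1,1,1,1,0,1,2,0,1,1,2,1,1,2,0,2,0,1,0,0,2],
    [1,1,0,0,0,1,2,2,2,2,0,2,1,0,2,2,1,2,2,1,0,1],
    [2,2,2,1,1,0,2,1,2,0,0,2,0,2,2,2,1,1,0,2,1,2],
    [1,1,1,1,0,1,2,0,1,1,2,1,1,2,0,2,0,1,0,0,2,2],
]

H = [[(-v) % p for v in row] + [1 if i == j else 0 for j in range(n - k)]
     for i, row in enumerate(KT)]

def syndrome(c):
    return [sum(h * ci for h, ci in zip(row, c)) % p for row in H]

# Rows of the 22 x 26 cyclic generator matrix: g shifted j places right.
for j in range(k):
    codeword = [0] * j + g + [0] * (n - len(g) - j)
    assert syndrome(codeword) == [0, 0, 0, 0]
```

All 22 rows of the cyclic generator matrix pass the syndrome check, confirming that the row-reduced standard form describes the same code.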
Computer simulations of the transmission of data encoded using the (26,22) BCH
code over GF(3) as described above were also carried out. The ternary restricted
GEC was again used with

\[
z = 0.985, \qquad (7.47)
\]

or an average symbol error rate p_s of one percent. The values for the average fade
to connection time ratio x and the burst factor y were selected from the sets

\[
x \in \{0.005, 0.01, 0.015\}, \qquad y \in \{0, 0.1, \ldots, 1\}. \qquad (7.48)
\]
The results of these simulations are displayed in Fig. 7.6. As with the Hamming
code above, this BCH code is capable of correcting one symbol error. However,
this code is much longer and its code rate much higher than those of the (4,2)
Hamming code. This explains the higher post-decoding SER in almost all cases for the BCH
code compared to the Hamming code when both have an average transmission SER
of 1%. The probability of having an error pattern which can be detected but not
corrected with 26 symbols is higher than that with four symbols, assuming the same
frequency of transmission errors. However, the same relationships between the SER
and both x and y as for the (4,2) Hamming code are observed.
The gradients of the curves in Fig. 7.6 are lower than those in Fig. 7.5, particularly
when the burst factor y is small and the channel more closely resembles one with
independent errors. In such a situation, the value of the channel reliability factor z
is low enough, and the code length n = 26 is large enough, that in many cases, a
detectable error pattern will occur which cannot be corrected. This contributes to
a relatively high post-decoding SER and thus the high gradients at the left of Fig.
7.5 are not observed in Fig. 7.6.
It can be observed that the same intersection of the curves at y = 0 occurs in
Fig. 7.6, as occurred in Fig. 7.5. The reason for this is given in (7.42). That is,
for any linear block code and a fixed value of the channel reliability factor z, the
post-decoding SER obtained on a restricted GEC which has a burst factor y of zero
appears to be independent of the value of the average fade to connection time ratio
x.
7.6 Summary
The focus of this chapter has been the design of a procedure to perform APP de-
coding for linear block codes over a non-binary restricted GEC. This was achieved
through a number of steps. Firstly, the channel reliability factor was discussed for
a general non-binary GEC and a formula was found for this parameter in terms
of the average transmission symbol error rate of the channel. The expression was
found by consideration of the syndrome trellis. A proof showing that only crossover
probabilities of at most 1/p need to be considered in order to deal with all possible
channel capacities was also given.
Then, the next section dealt specifically with how to obtain conditional spectral
polynomials from the conditional spectral coefficients. Expressions for each row of
the spectral domain trellis could be obtained principally by consideration of the
structure of the dual code along with the received vector. The theorem developed
in Chapter 6 for a binary restricted GEC also applies in the non-binary case to give
the conditional spectral polynomials in terms of the burst-error characteristics of
the average fade to connection time ratio x, the burst factor y, and the channel
reliability factor z.
The relationship between the weight polynomial for a systematic non-binary code
and that of its dual is given by the MacWilliams identity and its instantiation given
in this chapter was in terms of one variable. Similarly, the generalised weight poly-
nomials in this chapter give a posteriori probabilities of the original domain based on
probabilities which are derived from the spectral domain. The weight polynomials
are termed “generalised” because they are functions of three variables, specifically
the three burst-error characteristics which describe a non-binary restricted GEC.
An APP decoding procedure was developed using the evaluation of these GWPs.
A decoding example for a ternary linear block code was then given in order
to illustrate the steps involved in the procedure. Firstly, the conditional spectral
polynomials were calculated, and these were used to determine the GWPs. Once the
values of the three burst-error characteristics were substituted, a decoding decision
could be made. Finally, simulation results were obtained for two ternary codes
over restricted GECs with various burst-error characteristics and some observations
were made. In particular, increases in either the average fade to connection time
ratio x or the burst factor y appear to increase the post-decoding SER and hence
degrade the performance of the code. It was also noted that if the burst factor y
for a restricted GEC is zero, then for a fixed value of the channel reliability factor
z, the post-decoding SER appears to be the same for all values of the average fade
to connection time ratio x.
Chapter 8

Conclusion and General Discussion
Wireless communication systems were a step forward in the quest to improve
communications over large distances. However, relative to a memoryless system, this
more complex scheme brought its own set of problems: such channels are likely to
experience error bursts. This necessitated the use of error correcting codes to protect
the information from such errors and, hand-in-hand with this, effective and efficient
decoding algorithms. The research presented in this thesis has been a step towards
achieving that goal.
This chapter is structured as follows. In Section 8.1, the principal ideas and
findings of this thesis are summarised in order to demonstrate what has been ac-
complished, and for which channels the methods are applicable. However, this is
by no means a closed subject matter. There are still many more problems to be
solved and there are ways in which the strategies may be improved. For this reason,
Section 8.2 explores some of the directions which future research in this exciting and
pertinent area may take.
8.1 Summary of Major Findings and Contributions
Given a channel model and a code for transmission of data over that channel, it is also
necessary to have a method of retrieving the encoded information after reception.
More specifically, there is a need to develop decoding algorithms for the mobile
environment. It is desirable that these algorithms fulfil certain criteria. Namely,
that they are able to correct some patterns of transmission errors, that they are
practical for a large number of codes, and that they are based on a simple concept.
The errors which occur with a wireless system do not occur independently. In-
stead they occur in bursts and this adds additional complexity to the model. So to
simplify things, the basic method of the algorithms was first developed for a memo-
ryless channel. After presentation of the necessary background material in Chapter
2, Chapter 3 revealed that the evaluation of the APPs could be carried out in either
of two domains. In both the original and spectral domains, the computations could
be performed using a trellis or alternatively a representation of the trellis in matrix
form could be found. This was a succinct way to represent the information, and
also provided a link between the two domains with the use of a similarity matrix
transformation.
In the case of the original domain, the parity check matrix was used in the
creation of a trellis for the code. The entries on the branches of this trellis were then
weighted by the error probabilities of the channel model and the details of obtaining
the matrix representation of such a trellis were summarised. The steps required
to perform APP decoding with this matrix representation were then given. For a
systematic code, the possibility of each information symbol position taking on each
value in the signalling alphabet must be investigated. It should be noted that each of
the algorithms developed in this thesis can be extended for use with non-systematic
codes. In such cases, the weighted trellis matrices, conditional spectral coefficients,
conditional spectral polynomials, or generalised weight polynomials are calculated
for all positions of the code, producing APPs for each and every transmitted symbol.
The complexity of such methods is greater than those presented here for systematic
codes, as a higher number of APPs must be determined in order to perform the
decoding.
It was next shown how to obtain a spectral domain matrix representation of
the trellis. This involved diagonalising the matrices which were used in the original
domain and the resulting trellis had a very simple structure. After weighting the
individual components according to the error probabilities induced by the channel,
they could be summed and the same decoding decisions as for the original domain
could be made.
The above methodology for memoryless channels formed the basis for the fol-
lowing two chapters. The phenomenon of errors occurring in bursts in the mobile
environment is however modelled better by a finite state channel model. In other
words, it is the nature of wireless communications that there will be times when
errors are fairly likely and other times when they are not. The different states are
designed to correspond to different error likelihoods and the model transfers between
these states according to the probabilities in the state transition matrix. However,
it is in general not possible to ascertain which state the model is in at any time. It
is only possible to observe the received sequence of symbols.
Chapters 4 and 5 demonstrated the possibility of APP decoding for this type
of channel model. The error probabilities were represented by matrices instead of
scalars, with one entry for each pair of current and successive states. This has two
chief drawbacks in terms of the calculations required. Namely, matrix multiplication
is in general non-commutative, and is more computationally complex than scalar
multiplication. Hence the procedures developed for a finite state channel model will
not be as efficient as those for a memoryless channel. The tradeoff is being better
equipped to deal with burst errors.
Procedures were developed for both binary and non-binary codes operating in
either the original or the spectral domain. It was also noted that some of the con-
ditional spectral coefficients formed a probability distribution. However, there is an
underlying question as to which domain to use. An analysis of the computational
complexities and storage requirements of the two approaches concluded that the
spectral domain is suitable for codes of high rate, and that it also requires less
storage space than the original domain. However, if storage space is not a primary
concern, then the original domain is preferred when the code has a low rate. It
should be noted that the methods presented herein are comparable with
multiple-state extensions of algorithms which have been developed for memoryless
channels.
Simulation results have indicated that increases in the order of the field, the
crossover probabilities and the probability of transferring to the 'bad' state of a GEC
are all factors which increase the SER. On the other hand, an increase in the
probability of transferring to the 'good' state, which has a lower crossover probability,
will usually decrease the SER.
Chapters 6 and 7 of this thesis focussed on representing the behaviour of a special
type of GEC in terms of burst-error characteristics. Although this has already been
achieved for the binary case, the description for the non-binary restricted GEC
is novel, and in particular the treatment of the channel reliability factor for the
non-binary restricted GEC is new. The value of this burst-error characteristic is
increased with higher probabilities of correct symbol reception, but it must also be
handicapped by higher probabilities of incorrect reception.
This thesis discussed the concept of a restricted GEC, which has the property
that when in the ‘bad’ state, given a transmitted symbol, all symbols are equally
likely to be received. A proof was given of a statement presented in [26] concern-
ing the definition of the conditional spectral coefficients in terms of the burst-error
characteristics of a binary restricted GEC, and thus the conditional spectral polyno-
mials were obtained. Using the same strategy of proof, this result was extended to
non-binary restricted GECs. The structure of the conditional spectral polynomials
depended heavily on the positions of the zero and nonzero elements of the codewords
of the dual code. A similarity was noted between the weight polynomials, which are
often functions of a single variable, as described in the MacWilliams identities and
the expressions derived for the conditional spectral coefficients in this thesis as they
both concerned a relationship between a code and its dual. Since the polynomials in
terms of the burst-error characteristics are functions of three variables rather than
one, they have been named generalised weight polynomials.
This theory permitted the development of two additional APP decoding algo-
rithms, specifically for restricted GECs. One algorithm handled the binary case
and the other was the non-binary extension. Ultimately, these methods gave more
pleasing results than those of Chapters 4 and 5. This is because the calculations of
the a posteriori probabilities are constructed in terms of burst-error characteristics,
which provide a more useful description of a wireless channel since they focus on the
bursts themselves rather than the states of the model. Additionally, the similarities
between these trivariate polynomial expressions for the APPs and the weight poly-
nomials of the MacWilliams identity are aesthetically pleasing. The error correction
capabilities of a sample of linear block codes using these methods for decoding on
a restricted GEC were shown. Decreases in either the average fade to connection
time ratio or the burst factor appeared to result in a lower SER. Additionally, for
a fixed value of the channel reliability factor and a burst factor of zero, the same
post-decoding SER was obtained for each of the average fade to connection time
ratio values simulated. This was verified by setting the burst factor to zero in the
expression for the conditional spectral polynomials.
8.2 Future Research
There are at least three directions along which future research in this area of APP
decoding of linear block codes over discrete channels may proceed. These advances
can be developed independently, which means that an extension in one direction
from the material presented here will not hinder the possibilities of advances in the
other directions. The possible extensions are the signalling alphabet, the use of
reliability information, and the channel model.
Table 8.1: Elements of GF(3^2) and their ternary vector images.

Exponential   Polynomial   [GF(3)]^2 image
0             0            [0,0]
D^0           1            [0,1]
D^1           D            [1,0]
D^2           2D + 1       [2,1]
D^3           2D + 2       [2,2]
D^4           2            [0,2]
D^5           2D           [2,0]
D^6           D + 2        [1,2]
D^7           D + 1        [1,1]
Signalling alphabet
Firstly, the procedures reported here have been developed for linear block codes over
GF (p), where p is either 2 (Chapters 3, 4 and 6) or an odd prime (Chapters 3, 5 and
7). There are however many codes used in practice which are defined over a field
GF (pa), for a > 1. Examples include BCH codes and Reed-Solomon codes, which
have applications such as compact discs and Digital Versatile Discs and are resilient
against burst errors.
One method of processing elements of GF(p^a) is to use their p-ary image as
a vector. Recall from Chapter 2 that the elements of GF(p^a) can be regarded as
polynomials modulo a monic irreducible polynomial of degree a. These polynomials
have a coefficients chosen from GF(p) and thus there is a bijection ϑ between them
and the set of p-ary vectors of length a. If the polynomials are in terms of an
indeterminate D, then define

\[
\vartheta : GF(p^a) \to [GF(p)]^a, \qquad
c_{a-1}D^{a-1} + \ldots + c_1 D + c_0 \mapsto \left[ c_{a-1}, \ldots, c_1, c_0 \right]. \qquad (8.1)
\]
The following example should clarify these concepts. Knowing that the nonzero
elements of a Galois field form a cyclic multiplicative group, a field of order nine can
be constructed by taking the elements {0, 1, D, D^2, . . . , D^7} defined modulo the
monic polynomial

\[
f^{*}(D) = D^2 + D + 2, \qquad (8.2)
\]

which is irreducible over the base field GF(3). The elements in exponential form are
listed in Table 8.1 along with their polynomial equivalents and ternary vector images
listed in Table 8.1 along with their polynomial equivalents and ternary vector images
as defined by (8.1). Using p-ary symbols instead of bits can be advantageous because
it permits the transmission of more information per time unit. On the other hand,
if such a symbol is decoded incorrectly, then correspondingly more information is
lost. For this reason, coding over the base field is sometimes preferred to p^a-ary
symbols, particularly if the channel conditions are harsh. In the example in Table
8.1, each ternary symbol carries 1.58 bits, as opposed to 3.17 bits if 9-ary symbols
were used. By decoding the a components of each p-ary image separately, the
non-binary procedures reported in this research can recover symbols in GF(p^a),
hence a code over any field can be used. Ideally, collections of a symbols would be
combined into a single signal during the modulation process, and then demodulated
into a symbols after transmission through the channel. The performance of such
schemes, however, has not been investigated.
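The construction in Table 8.1 can be reproduced with a short calculation. The following sketch (an illustration, not code from the thesis) builds GF(3^2) by repeated multiplication by D modulo f*(D) = D^2 + D + 2 over GF(3), and prints each element's ternary image as defined by (8.1).

```python
# Sketch: constructing GF(3^2) as polynomials modulo f*(D) = D^2 + D + 2 over
# GF(3). An element c1*D + c0 is stored as its ternary image [c1, c0].

P = 3  # characteristic of the base field

def gf_mult(a, b):
    """Multiply two GF(9) elements given as ternary vectors [c1, c0]."""
    a1, a0 = a
    b1, b0 = b
    # (a1*D + a0)(b1*D + b0) = a1*b1*D^2 + (a1*b0 + a0*b1)*D + a0*b0
    d2 = a1 * b1
    d1 = a1 * b0 + a0 * b1
    d0 = a0 * b0
    # Reduce using D^2 = 2D + 1, since D^2 + D + 2 = 0 over GF(3)
    return [(d1 + 2 * d2) % P, (d0 + d2) % P]

# Generate D^0, D^1, ..., D^7 by repeated multiplication by D = [1, 0]
table = {}
elem = [0, 1]  # D^0 = 1
for e in range(8):
    table[e] = elem[:]
    elem = gf_mult(elem, [1, 0])

for e, (c1, c0) in sorted(table.items()):
    print(f"D^{e} -> {c1}*D + {c0} -> image [{c1},{c0}]")
```

The printed rows agree with Table 8.1, and multiplying D^7 by D once more returns 1, confirming that the nonzero elements form a cyclic group of order 8.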
Reliability information
It is important to remember that the values being calculated with the procedures
described in this thesis are probabilities. As such, they are small in magnitude. This
is especially the case when many are multiplied together, or when the number of
possible received symbols is large, meaning the individual error probabilities are tiny.
This phenomenon is discussed in [74]. When the size of the code and/or the size of
the signalling alphabet is large, implementing these APP decoding procedures as
stated will result in underflow on most computers. That is, all symbols will be
decoded as zero and the SER will be intolerably high. This is further motivation to
refrain from working directly with elements of GF(p^a). One solution given in [74],
at least for memoryless channels, is to normalise the APPs by dividing by the sum
of all APPs found for each information symbol position. This will also work for
channels with memory.
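The normalisation just described is straightforward to express in code. This sketch is an illustration only; the function name and data layout are assumptions, not the thesis's implementation.

```python
# Sketch: normalising the APPs of each information symbol position by their
# sum, as suggested in [74], so that the values stay in a numerically safe
# range instead of underflowing towards zero.

def normalise_apps(apps):
    """apps[i][v] = unnormalised APP that information symbol i equals v.
    Returns the same quantities scaled so each position sums to 1."""
    out = []
    for position in apps:
        total = sum(position)
        # If every APP has already underflowed to exactly zero, fall back
        # to a uniform distribution rather than dividing by zero.
        if total == 0.0:
            out.append([1.0 / len(position)] * len(position))
        else:
            out.append([p / total for p in position])
    return out

# Example: tiny unnormalised APPs for two ternary symbol positions
raw = [[1e-300, 3e-300, 1e-300], [2e-250, 1e-250, 1e-250]]
norm = normalise_apps(raw)
print(norm[0])  # approximately [0.2, 0.6, 0.2]
```

The hard decision via 'arg max' is unaffected, since each position is scaled by a common positive factor.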
The underflow problem in [74] resulted from the consideration of concatenated
codes. These, along with iterative decoding, are two ways of reusing the reliability
information; as discussed in Section 2.3.2, this requires a MAP decoding algorithm.
In basic iterative decoding, the same word is decoded multiple times. For the first
iteration, the a priori probabilities of all symbols are equal. For all subsequent
iterations, the APP derived from the previous iteration becomes the a priori
probability for the next, so the algorithm converges to a solution over time,
although this solution may still not be the correct one. When sufficient iterations have been
performed, a hard decision is made using the ‘arg max’ operation. Note that iterative
decoding generally lowers the SER at the expense of increasing the amount of time
or energy required to perform the decoding. It is possible to adapt the procedures
reported in this research to accommodate iterative decoding. It would require the
use of the penultimate rather than the final line of (2.104), which would then lead
to different problem statements in (2.106) for memoryless channels and (2.107) for
channels with memory. In essence, each path through the trellis is weighted by the
relevant APPs obtained in the previous decoding iteration. These weightings are
now a priori information. The summation over sets of trellis paths is performed in
the same way as for the non-iterative case to calculate APPs to be used as a priori
information for the following iteration, and so forth.
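The iterative schedule described above can be sketched as follows. Here `app_decode` is a hypothetical stand-in for any of the APP decoding procedures in this thesis; a real implementation would perform the weighted summation over trellis paths of (2.104) rather than the simple per-symbol blend shown.

```python
# Sketch of the iterative schedule: the APPs of one iteration become the
# a priori probabilities of the next, with a hard 'arg max' decision at the end.

def app_decode(received, a_priori):
    """Hypothetical placeholder decoder: blends channel evidence with the
    a priori values and renormalises each symbol position."""
    posterior = []
    for channel_evidence, prior in zip(received, a_priori):
        unnorm = [c * p for c, p in zip(channel_evidence, prior)]
        total = sum(unnorm)
        posterior.append([u / total for u in unnorm])
    return posterior

def iterative_decode(received, q, iterations):
    """received[i][v]: channel likelihood that symbol i is v (q-ary alphabet)."""
    a_priori = [[1.0 / q] * q for _ in received]   # first iteration: uniform
    for _ in range(iterations):
        app = app_decode(received, a_priori)
        a_priori = app                             # APP becomes next a priori
    # final hard decision via 'arg max'
    return [max(range(q), key=lambda v: app[i][v]) for i in range(len(app))]

likelihoods = [[0.2, 0.5, 0.3], [0.6, 0.2, 0.2]]   # toy ternary example
print(iterative_decode(likelihoods, q=3, iterations=4))  # -> [1, 0]
```

With this toy decoder the iterations merely sharpen the per-symbol decision, which illustrates both the convergence behaviour and the cost: each extra iteration repeats the full decoding work.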
Reliability information can also be reused when decoding parallel concatenated
block turbo codes, also known as product codes. Suppose there exist linear block
codes C_1, C_2, . . . , C_l, each over GF(p), where for 1 ≤ i ≤ l, C_i is an (n_i, k_i)
code in standard form with Hamming distance d_i. To encode, the ∏_{i=1}^{l} k_i
information symbols are arranged into an l-dimensional hypermatrix. All vectors
in the first dimension are encoded using C_1. The resulting vectors are encoded
in the second dimension using C_2. This process continues until the encoding of
all vectors in the lth dimension produces a codeword of ∏_{i=1}^{l} n_i symbols.
The code has a Hamming distance of ∏_{i=1}^{l} d_i. Decoding is carried out in the
reverse order of the encoding.
The APP of an information symbol when decoding in one dimension becomes the a
priori probability of that symbol in the following dimension. The viability of such a
scheme has been investigated for two-dimensional single parity check product codes
in [74] and [75], although only for memoryless channels. The APP decoding schemes
in this research have yet to be considered for product codes over channels with
memory.
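For the two-dimensional single parity check case mentioned above, the encoding stage can be illustrated as follows. This is an illustrative sketch over GF(3) with invented parameters, not the exact construction evaluated in [74] and [75].

```python
# Sketch: a two-dimensional single parity check (SPC) product code over GF(3).
# Each row is extended with a parity symbol making its sum 0 mod 3, then each
# column of the resulting array is extended likewise.

P = 3

def spc_encode_row(row):
    """Append the parity symbol that makes the row sum to 0 modulo P."""
    return row + [(-sum(row)) % P]

def product_encode(info):
    """info: k1 x k2 array of GF(3) symbols -> (k1+1) x (k2+1) codeword array."""
    rows = [spc_encode_row(r) for r in info]                  # encode dimension 1
    coded_cols = [spc_encode_row(list(c)) for c in zip(*rows)]  # encode dimension 2
    return [list(r) for r in zip(*coded_cols)]

info = [[1, 2], [0, 2]]          # k1 = k2 = 2 information symbols
word = product_encode(info)
for row in word:
    print(row)
```

Every row and every column of the resulting 3 x 3 array sums to zero modulo 3, and the Hamming distance of the product code is d_1 · d_2 = 2 · 2 = 4, matching the general formula above.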
Channel model
Another theme discussed briefly in [74] is using the structure of GF(p) for p > 2 to
perform the necessary calculations in a different way which may lower the compu-
tational complexity of the decoding procedures. It may be possible to implement a
similar scheme for channels with memory so that the time needed to retrieve the in-
formation symbols is shortened. As reported in Section 5.4.1, increasing the number
of states of the channel model by just one leads to large increases in the time taken
for the procedures to be executed.
This research has only considered channel models with a memory of zero or one
symbol duration. In order to consider more complex models with greater memory,
it would be necessary to work with matrix probabilities which are different from
D_0, D_ε, D and Δ. It may also be possible to find expressions for the conditional
spectral coefficients in terms of burst-error characteristics as discussed in Chapters
6 and 7 for channel models other than a restricted GEC. Thus, the concepts of this
research could be applied to a wide variety of practical situations.
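The style of calculation that such extensions would generalise can be illustrated for the two-state case. In the sketch below, the probability of an error sequence on a Gilbert-Elliott channel is obtained by chaining matrix probabilities, here denoted D[0] and D[1] in the spirit of the matrix probabilities mentioned above; the parameter values are invented for illustration.

```python
# Sketch: P(e_1, ..., e_n) = pi * D_{e_1} * ... * D_{e_n} * 1 for a two-state
# Gilbert-Elliott channel, where D_e[s][t] = P(error symbol e, next state t
# | current state s). Models with longer memory would need larger matrices.

P_trans = [[0.95, 0.05], [0.10, 0.90]]   # state transitions: good (0), bad (1)
err = [0.01, 0.30]                       # error probability in each state
pi = [2 / 3, 1 / 3]                      # assumed initial state distribution

# Matrix probabilities for an error-free symbol (e = 0) and an error (e = 1)
D = [[[P_trans[s][t] * (err[s] if e else 1 - err[s]) for t in range(2)]
      for s in range(2)] for e in (0, 1)]

def sequence_prob(errors):
    """Chain the matrix probabilities along the error sequence, then sum out
    the final state."""
    v = pi[:]
    for e in errors:
        v = [sum(v[s] * D[e][s][t] for s in range(2)) for t in range(2)]
    return sum(v)

print(sequence_prob([0, 1, 1, 0]))
```

Since D[0] + D[1] is the stochastic transition matrix, the probabilities of all error sequences of a given length sum to one, which provides a convenient sanity check on any such model.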
Appendices
Appendix A
Proof of (3.44)
It is claimed in (3.44) that W^{-1}_{p^{n-k}} = (1/p^{n-k}) W^H_{p^{n-k}}. A proof of
this result is given below.
Proof. Firstly note that the complex conjugate of a pth root of unity is that same
root with a negative exponent. That is, if w^γ is a pth root of unity then

    (w^γ)^* = w^{-γ}.     (A.1)
If the ith row of the symmetric matrix W_{p^{n-k}} is given by [w_{i,1}, w_{i,2}, . . . , w_{i,p^{n-k}}],
for 1 ≤ i ≤ p^{n-k}, then the entry in row i and column j of the matrix product
W_{p^{n-k}} · W^H_{p^{n-k}} = [ψ^{(p^{n-k})}_{i,j}]_{p^{n-k} × p^{n-k}} of W_{p^{n-k}} and its Hermitian may be
determined as

    ψ^{(p^{n-k})}_{i,j} = ⟨[w_{i,1}, w_{i,2}, . . . , w_{i,p^{n-k}}], [(w_{j,1})^*, (w_{j,2})^*, . . . , (w_{j,p^{n-k}})^*]⟩
                        = w_{i,1} w_{-j,1} + w_{i,2} w_{-j,2} + . . . + w_{i,p^{n-k}} w_{-j,p^{n-k}}     by (A.1)
                        = Σ_{m=1}^{p^{n-k}} w_{i,m} w_{-j,m}.     (A.2)
Then for the diagonal entries, i = j and

    ψ^{(p^{n-k})}_{i,i} = Σ_{m=1}^{p^{n-k}} w_{i,m} w_{-i,m}
                        = Σ_{m=1}^{p^{n-k}} 1
                        = p^{n-k}.     (A.3)
For a non-diagonal element in row i and column j ≠ i, it can be shown by induction
that Σ_{m=1}^{p^{n-k}} w_{i,m} w_{-j,m} is zero. Thus, the value of ψ^{(p^d)}_{i,j} is first
examined for d = 1. The ith row of W_p can be given as [w^{i·0}, w^{i·1}, w^{i·2}, . . . , w^{i·(p-1)}],
and similarly the jth row as [w^{j·0}, w^{j·1}, w^{j·2}, . . . , w^{j·(p-1)}]. Then

    ψ^{(p^1)}_{i,j} = Σ_{m=1}^{p} w^{(i-j)·m} = 0,     (A.4)

where (A.4) follows because, for p prime and i ≢ j (mod p), the exponents (i-j)·m
run through all residue classes modulo p, so the summands are exactly the p complex
pth roots of unity, whose sum is zero. So it has been shown that any two different
rows of W_p have a dot product of zero. Now assume ψ^{(p^d)}_{i,j} equals zero for a
positive integer d. By definition,
    W_{p^{d+1}} = W_p ⊗ W_{p^d}
                = [ w^0 W_{p^d}   w^0 W_{p^d}       . . .   w^0 W_{p^d}
                    w^0 W_{p^d}   w^1 W_{p^d}       . . .   w^{p-1} W_{p^d}
                        ...           ...                       ...
                    w^0 W_{p^d}   w^{p-1} W_{p^d}   . . .   w^{(p-1)(p-1)} W_{p^d} ].     (A.5)

For notational efficiency, let a and b be the rth and sth rows of W_{p^d}, respectively.
The dot product of the (h·p^d + r)th and the (i·p^d + s)th rows of W_{p^{d+1}}, where
0 ≤ h, i ≤ p-1 and 1 ≤ r, s ≤ p^d, may be calculated as

    ⟨[w^0 a, w^h a, w^{2h} a, . . . , w^{(p-1)h} a], [w^0 b, w^i b, w^{2i} b, . . . , w^{(p-1)i} b]⟩
        = w^0 ⟨a, b⟩ + w^h w^i ⟨a, b⟩ + w^{2h} w^{2i} ⟨a, b⟩ + . . . + w^{(p-1)h} w^{(p-1)i} ⟨a, b⟩
        = ⟨a, b⟩ [w^0 + w^{h+i} + w^{2(h+i)} + . . . + w^{(p-1)(h+i)}]
        = 0 · [w^0 + w^{h+i} + w^{2(h+i)} + . . . + w^{(p-1)(h+i)}]     by the Inductive Hypothesis
        = 0.     (A.6)
Thus ψ^{(p^{d+1})}_{i,j} equals zero and, by the Principle of Mathematical Induction,
ψ^{(p^d)}_{i,j} equals zero for all positive integers d. In summary, it has been shown
that

    W_{p^{n-k}} · W^H_{p^{n-k}} = p^{n-k} I_{p^{n-k}},     (A.7)

or alternatively that

    W^{-1}_{p^{n-k}} = (1/p^{n-k}) W^H_{p^{n-k}}.     (A.8)
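The identity (A.7)–(A.8) can be checked numerically for a small prime. The sketch below assumes the convention, consistent with the row structure used in the proof, that W_p has entries w^{ij} for 0 ≤ i, j ≤ p-1 with w = exp(2πi/p).

```python
# Numerical check of (A.7): W_p * W_p^H = p * I for a small prime p,
# where W_p is the p x p matrix of pth roots of unity with entries w^{ij}.

import cmath

p = 5
w = cmath.exp(2j * cmath.pi / p)
W = [[w ** (i * j) for j in range(p)] for i in range(p)]

def psi(i, j):
    """Entry (i, j) of W * W^H: dot product of row i with conjugated row j."""
    return sum(W[i][m] * W[j][m].conjugate() for m in range(p))

for i in range(p):
    for j in range(p):
        expected = p if i == j else 0
        assert abs(psi(i, j) - expected) < 1e-9
print("W * W^H = p*I verified for p =", p)
```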
Appendix B
Proof of Lemma 5.3.1
Lemma 5.3.1. The sum of the entries in any row or column except the first of the
complex Walsh-Hadamard transform matrix W_{p^{n-k}} is zero.
Proof. This proof is given by induction on d = n-k. Let Σ(r, d) stand for the
sum of the entries in the rth row of W_{p^d}. It is required to show that Σ(r, d) = 0
for all r ∈ {2, 3, . . . , p^d}. Firstly,

    Σ(r, 1) = w^{(r-1)·0} + w^{(r-1)·1} + . . . + w^{(r-1)·(p-1)},

so that

    (1 - w^{r-1}) Σ(r, 1) = (1 - w^{r-1}) [w^{(r-1)·0} + w^{(r-1)·1} + . . . + w^{(r-1)·(p-1)}]
                          = w^{0(r-1)} + w^{1(r-1)} + . . . + w^{(p-1)(r-1)} - w^{r-1} - w^{2(r-1)} - . . . - w^{p(r-1)}
                          = w^0 - w^{p(r-1)}
                          = 0.     (B.1)
Since 2 ≤ r ≤ p, it follows that 1 - w^{r-1} is nonzero. Therefore (B.1) implies that
Σ(r, 1) equals zero and the lemma is true for n-k = 1. Now assume that the
lemma is true for a positive integer n-k = d. This means the sum of the entries
in any row except the first row of W_{p^d} is zero. Examine the (i·p^d + s)th row of
W_{p^{d+1}} for some i ∈ {0, 1, . . . , p-1} and some s ∈ {1, 2, . . . , p^d}, excluding the
case where i = 0 and s = 1, as this was covered in Lemma 5.2.2. Let b be the sth
row of W_{p^d}. Then the structure of the (i·p^d + s)th row of W_{p^{d+1}} can be reported
as [w^0 b, w^i b, w^{2i} b, . . . , w^{(p-1)i} b]. There are two cases to consider. Firstly,
assume s = 1 and i ≠ 0. Then by Lemma 5.2.2, the row vector b can be expressed as

    b = [1, 1, . . . , 1],     (B.2)

and using the base case in (B.1), the sum of the entries in the (i·p^d + 1)th row may
be calculated as

    Σ(i·p^d + 1, d+1) = p^d(w^0) + p^d(w^i) + p^d(w^{2i}) + . . . + p^d[w^{(p-1)i}]
                      = p^d [w^0 + w^i + w^{2i} + . . . + w^{(p-1)i}]
                      = p^d (0)
                      = 0.     (B.3)
The other case to consider is where s ≠ 1. By the Inductive Hypothesis, the sum of
the entries in the sth row of W_{p^d} is zero. Then

    Σ(i·p^d + s, d+1) = w^0 Σ(s, d) + w^i Σ(s, d) + w^{2i} Σ(s, d) + . . . + w^{(p-1)i} Σ(s, d)
                      = [w^0 + w^i + w^{2i} + . . . + w^{(p-1)i}] Σ(s, d)     (B.4)
                      = 0.

Therefore, the lemma is true for n-k = d+1 and, by the Principle of Mathematical
Induction, the lemma is true for all positive integers d. Since W_{p^{n-k}} is symmetric,
the result also holds for all of its columns. Thus, the sum of the entries in every row
and column of W_{p^d} except the first is zero.
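Lemma 5.3.1 can likewise be checked numerically via the Kronecker-product construction W_{p^{d+1}} = W_p ⊗ W_{p^d}, assuming W_p has entries w^{ij} with w = exp(2πi/p).

```python
# Numerical check of Lemma 5.3.1: every row of W_{p^2} = W_p (x) W_p except
# the first sums to zero; the first (all-ones) row sums to p^2.

import cmath

p = 3
w = cmath.exp(2j * cmath.pi / p)
W_p = [[w ** (i * j) for j in range(p)] for i in range(p)]

def kron(A, B):
    """Kronecker product of two square complex matrices (lists of lists)."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

W_p2 = kron(W_p, W_p)
row_sums = [sum(row) for row in W_p2]
assert abs(row_sums[0] - p ** 2) < 1e-9          # first row sums to p^2
assert all(abs(s) < 1e-9 for s in row_sums[1:])  # all other rows sum to zero
print("row sums verified for p^2 =", p ** 2)
```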
Bibliography
[1] Q. Bi, G. I. Zysman, and H. Menkes, “Wireless mobile communications at the
start of the 21st century,” IEEE Commun. Mag., vol. 39, no. 1, pp. 110–116,
Jan. 2001.
[2] J. Chen and R. M. Tanner, “A hybrid coding scheme for the Gilbert-Elliott
channel,” IEEE Trans. Commun., vol. 54, no. 10, pp. 1787–1796, Oct. 2006.
[3] E. O. Elliott, “Estimates of error rates for codes on burst-noise channels,” Bell
System Technical Journal, vol. 42, pp. 1977–1997, Sept. 1963.
[4] J.-Y. Chouinard, M. Lecours, and G. Y. Delisle, “Simulation of error sequences
in a mobile communications channel with Fritchman’s error generation model,”
in Proc. IEEE Pacific Rim Conf. on Commun., Computers and Signal Process.,
Victoria, Canada, June 1989, pp. 134–137.
[5] B. Hayes, “Third base,” American Scientist, vol. 89, no. 6, pp. 489–494, Nov.-
Dec. 2001.
[6] C. E. Shannon, “A mathematical theory of communication,” Bell System Tech-
nical Journal, vol. 27, no. 3 and 4, pp. 379–423 and 623–656, July and Oct.
1948.
[7] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error-
correcting coding and decoding: Turbo-codes (1),” in Proc. IEEE Int. Conf.
on Commun., vol. 2, Geneva, Switzerland, May 1993, pp. 1064–1070.
[8] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear
codes for minimizing symbol error rate,” IEEE Trans. Inf. Theory, vol. 20,
no. 2, pp. 284–287, Mar. 1974.
[9] T. Johansson and K. Zigangirov, “A simple one-sweep algorithm for optimal
APP symbol decoding of linear block codes,” IEEE Trans. Inf. Theory, vol. 44,
no. 7, pp. 3124–3129, Nov. 1998.
[10] H.-J. Zepernick, “A forward-only recursion algorithm for MAP decoding of
linear block codes,” Int. J. Adaptive Control and Signal Process., vol. 16, no. 8,
pp. 577–588, Sept. 2002.
[11] W. Turin, “MAP decoding in channels with memory,” IEEE Trans. Commun.,
vol. 48, no. 5, pp. 757–763, May 2000.
[12] L. Ping and K. L. Yeung, “Symbol-by-symbol APP decoding of the Golay code
and iterative decoding of concatenated Golay codes,” IEEE Trans. Inf. Theory,
vol. 45, no. 7, pp. 2558–2562, Nov. 1999.
[13] Y. Kaji, R. Shibuya, T. Fujiwara, T. Kasami, and S. Lin, “MAP and LogMAP
decoding algorithms for linear block codes using a code structure,” IEICE
Trans. on Fund., vol. E83-A, no. 10, pp. 1884–1890, Oct. 2000.
[14] C. R. Hartmann and L. D. Rudolph, “An optimum symbol-by-symbol decoding
rule for linear codes,” IEEE Trans. Inf. Theory, vol. 22, no. 5, pp. 514–517, Sept.
1976.
[15] E. Dubrova, Y. Jamal, and J. Mathew, “Non-silicon non-binary computing:
Why not?” in Proc. 1st Workshop on Non-Silicon Computation, Boston, USA,
2002, pp. 23–29.
[16] J. Berkmann, “On turbo decoding of nonbinary codes,” IEEE Commun. Lett.,
vol. 2, no. 4, pp. 94–96, Apr. 1998.
[17] A. C. Reid, D. P. Taylor, and T. A. Gulliver, “Non-binary turbo codes,” in
Proc. Int. Symp. on Inf. Theory, Lausanne, Switzerland, July 2002, p. 57.
[18] J. Berkmann, “A symbol-by-symbol MAP decoding rule for linear codes over
rings using the dual code,” in Proc. Int. Symp. on Inf. Theory, Cambridge,
USA, Aug. 1998, p. 90.
[19] A. Goupil, M. Colas, G. Gelle, and D. Declercq, “FFT-based BP decoding of
general LDPC codes over Abelian groups,” IEEE Trans. Commun., vol. 55,
no. 4, pp. 644–649, Apr. 2007.
[20] D. Declercq and M. Fossorier, “Decoding algorithms for nonbinary LDPC codes
over GF (q),” IEEE Trans. Commun., vol. 55, no. 4, pp. 633–643, Apr. 2007.
[21] J. Garcia-Frias and J. D. Villasenor, “Combining hidden Markov source models
and parallel concatenated codes,” IEEE Commun. Lett., vol. 1, no. 4, pp. 111–
113, July 1997.
[22] ——, “Turbo codes for continuous Markov channels with unknown parameters,”
in Proc. IEEE Global Telecommun. Conf., Rio de Janeiro, Brazil, Dec. 1999.
[23] ——, “Turbo decoding of Gilbert-Elliot channels,” IEEE Trans. Commun.,
vol. 50, no. 3, pp. 357–363, Mar. 2002.
[24] A. W. Eckford, F. R. Kschischang, and S. Pasupathy, “Analysis of low-density
parity-check codes for the Gilbert-Elliott channel,” IEEE Trans. Inf. Theory,
vol. 51, no. 11, pp. 3872–3889, Nov. 2005.
[25] F. J. MacWilliams, “A theorem on the distribution of weights in a systematic
code,” Bell System Technical Journal, vol. 42, pp. 79–94, Jan. 1963.
[26] L. Kittel and H.-J. Zepernick, “Generalized weight polynomials for linear binary
block codes used on a burst error channel,” in Proc. Int. Symp. on Inf. Theory
and its Appl., Honolulu, USA, Nov. 1990, pp. 175–178.
[27] H.-J. Zepernick, “A posteriori probability decoding of linear block codes over
prime fields,” in Proc. Int. Conf. Optimization Techniques and Appl., Hong
Kong, China, Dec. 2001, pp. 1497–1504.
[28] ——, “On computing the performance of linear block codes in nonindependent
channel errors,” in Proc. IEEE Int. Conf. on Commun., vol. 2, Dallas, USA,
June 1996, pp. 989–994.
[29] M. J. Golay, “Notes on digital coding,” Proc. IRE, vol. 37, no. 6, p. 657, June
1949.
[30] L. N. Kanal and A. R. K. Sastry, “Models for channels with memory and their
applications to error control,” Proc. IEEE, vol. 66, no. 7, pp. 724–744, July
1978.
[31] H.-J. Zepernick, “Modal analysis of linear nonbinary block codes used on
stochastic finite state channels,” in Proc. IEEE Int. Symp. on Inf. Theory,
Whistler, Canada, Sept. 1995, p. 287.
[32] E. N. Gilbert, “Capacity of a burst-noise channel,” Bell System Technical Jour-
nal, vol. 39, pp. 1253–1265, Sept. 1960.
[33] B. Wong and C. Leung, “On computing undetected error probabilities on the
Gilbert channel,” IEEE Trans. Commun., vol. 43, no. 11, pp. 2657–2661, Nov.
1995.
[34] B. D. Fritchman, “A binary channel characterization using partitioned Markov
Chains,” IEEE Trans. Inf. Theory, vol. 13, no. 2, pp. 221–227, Apr. 1967.
[35] J.-Y. Chouinard, M. Lecours, and G. Y. Delisle, “Estimation of Gilbert’s and
Fritchman’s models parameters using the Gradient Method for digital mobile
radio channels,” IEEE Trans. Veh. Technol., vol. 37, no. 3, pp. 158–166, Aug.
1988.
[36] A. Semmar, M. Lecours, J.-Y. Chouinard, and J. Ahern, “Characterization
of error sequences in UHF digital mobile radio channels,” IEEE Trans. Veh.
Technol., vol. 40, no. 4, pp. 769–776, Nov. 1991.
[37] W. Griffiths, “APP decoding of linear block codes on Fritchman channels,”
in Proc. 5th Australian Telecommun. Cooperative Research Centre Workshop,
Melbourne, Australia, Nov. 2005, pp. 50–53.
[38] L. E. Baum, T. Petrie, G. Soules, and N. Weiss, “A maximization technique
occurring in the statistical analysis of probabilistic functions of Markov chains,”
Annals of Mathematical Statistics, vol. 41, no. 1, pp. 164–171, Feb. 1970.
[39] W. Turin, Performance Analysis and Modeling of Digital Transmission Systems,
3rd ed. New York, USA: Kluwer Academic/Plenum Publishers, 2004.
[40] R. H. McCullough, “The binary regenerative channel,” Bell System Technical
Journal, vol. 47, pp. 1713–1735, Oct. 1968.
[41] J. Swoboda, “Ein statistisches Modell für die Fehler bei binärer
Datenübertragung auf Fernsprechkanälen (in German),” AEÜ, vol. 23, pp. 313–
332, 1969.
[42] C. White, P. Farrell, J. Hagan, M. Reimean, H. Rudin, A. Goldstein, and
H. Ohnsorge, “Meeting reports,” IEEE Commun. Mag., vol. 17, no. 4, pp.
28–34, June 1979.
[43] I. F. Blake, “Codes over integer residue rings,” Information and Control, vol. 29,
no. 4, pp. 295–300, Dec. 1975.
[44] G. Caire and E. Biglieri, “Linear block codes over cyclic groups,” IEEE Trans.
Inf. Theory, vol. 41, no. 5, pp. 1246–1256, Sept. 1995.
[45] J. B. Fraleigh, A First Course in Abstract Algebra, 7th ed. Addison Wesley,
2002.
[46] R. Lidl and H. Niederreiter, Introduction to finite fields and their applications.
New York, USA: Cambridge University Press, 1986.
[47] C. Langton, “Coding and decoding with convolutional codes,” http://www.
complextoreal.com/convo.htm, 1999.
[48] R. E. Blahut, Theory and Practice of Error Control Codes. Reading, USA:
Addison-Wesley Publishing Company, 1983.
[49] M. Bossert, Channel Coding for Telecommunications. Chichester, England:
John Wiley & Sons, Ltd, 1999.
[50] A. J. Viterbi, “Convolutional codes and their performance in communication
systems,” IEEE Trans. Commun. Technol., vol. COM-19, no. 5, pp. 751–772,
Oct. 1971.
[51] G. D. Forney, “The Viterbi algorithm,” Proc. IEEE, vol. 61, no. 3, pp. 268–278,
Mar. 1973.
[52] F. Jelinek, “Fast sequential decoding algorithm using a stack,” IBM J. Res.
Develop., vol. 13, no. 6, pp. 675–685, Nov. 1969.
[53] J. Erfanian and S. Pasupathy, “Low-complexity parallel-structure symbol-by-
symbol detection for ISI channels,” in Proc. IEEE Pacific Rim Conf. on Com-
mun., Computers and Signal Process., Victoria, Canada, June 1989, pp. 350–
353.
[54] P. Robertson, E. Villebrun, and P. Hoeher, “A comparison of optimal and sub-
optimal MAP decoding algorithms operating in the log domain,” in Proc. IEEE
Int. Conf. on Commun., vol. 2, Seattle, USA, June 1995, pp. 1009–1013.
[55] C. E. Shannon, “The zero error capacity of a noisy channel,” IRE Trans. Inf.
Theory, vol. IT-2, no. 3, pp. S8–S19, Sept. 1956.
[56] V. Sidorenko, G. Markarian, and B. Honary, “Minimal trellis design for linear
codes based on the Shannon product,” IEEE Trans. Inf. Theory, vol. 42, no. 6,
Part 1, pp. 2048–2053, Nov. 1996.
[57] J. K. Wolf, “Efficient Maximum Likelihood decoding of linear block codes using
a trellis,” IEEE Trans. Inf. Theory, vol. 24, no. 1, pp. 76–80, Jan. 1978.
[58] D. Geller, I. Kra, S. Popescu, and S. Simanca, “On circulant matrices,”
State University of New York at Stony Brook,
http://www.math.sunysb.edu/~sorin/eprints/circulant.pdf, 2002.
[59] D. Coppersmith and S. Winograd, “Matrix multiplication via arithmetic pro-
gressions,” in Proc. 19th Annual ACM Conf. on Theory of Computing, New
York, USA, May 1987, pp. 1–6.
[60] Z. Chen, P. Fan, and F. Jin, “On a new binary [22, 13, 5] code,” IEEE Trans.
Inf. Theory, vol. 36, no. 1, pp. 228–229, Jan. 1990.
[61] L. N. Kanal and A. R. K. Sastry, “Models for channels with memory and their
applications to error control,” Proc. IEEE, vol. 66, no. 7, pp. 724–744, July
1978.
[62] M. Zorzi, R. R. Rao, and L. B. Milstein, “On the accuracy of a first-order
Markov model for data transmission on fading channels,” in Proc. Fourth IEEE
Int. Conf. on Universal Personal Commun. Record, vol. 1, Tokyo, Japan, Nov.
1995, pp. 211–215.
[63] J. Garcia-Frias and J. D. Villasenor, “Turbo decoders for Markov channels,”
IEEE Commun. Lett., vol. 2, no. 9, pp. 257–259, Sept. 1998.
[64] H.-J. Zepernick and B. Rohani, “On symbol-by-symbol MAP decoding of linear
UEP codes,” in Proc. IEEE Global Telecommun. Conf., vol. 3, San Francisco,
USA, Nov. 2000, pp. 1621–1626.
[65] A. Trofimov and T. Johansson, “A memory-efficient optimal APP symbol-
decoding algorithm for linear block codes,” IEEE Trans. Commun., vol. 52,
no. 9, pp. 1429–1434, Sept. 2004.
[66] J. H. van Lint, “A survey of perfect codes,” Rocky Mountain Journal of Math-
ematics, vol. 5, no. 2, pp. 199–224, 1975.
[67] J. Berkmann, “Symbol-by-symbol MAP decoding of nonbinary codes,” in Proc.
ITG Fachtagung: Codierung fur Quelle, Kanal und Ubertragung, Aachen, Ger-
many, Mar. 1998, pp. 95–100.
[68] Z. Wykes, ISBN-13 For Dummies®, Special Edition. Indianapolis, USA: Wiley
Publishing Inc., 2005.
[69] J. R. Yee and E. J. Weldon, Jr., “Evaluation of the performance of error-
correcting codes on a Gilbert channel,” in Proc. IEEE Int. Conf. on Commun.,
vol. 2, New Orleans, USA, May 1994, pp. 655–659.
[70] L. Wilhelmsson and L. B. Milstein, “On the effect of imperfect interleaving
for the Gilbert-Elliott channel,” IEEE Trans. Commun., vol. 47, no. 5, pp.
681–688, May 1999.
[71] B. Dronma, “Codes over different alphabets and signal sets,” Master’s thesis,
Department of Mathematics, University of Bergen, Bergen, Norway, May 2004.
[72] E. R. Berlekamp, Key Papers in The Development of Coding Theory. New
York: Institute of Electrical and Electronics Engineers, 1974.
[73] D. Gorenstein and N. Zierler, “A class of error-correcting codes in pm symbols,”
Journal of the Society for Industrial and Applied Mathematics, vol. 9, no. 2, pp.
207–214, June 1961.
[74] W. Griffiths, H.-J. Zepernick, and M. Caldera, “On APP decoding of non-
binary block turbo codes over discrete channels,” in Proc. Int. Symp. on Inf.
Theory and its Appl., Parma, Italy, Oct. 2004, pp. 362–366.
[75] M. Caldera and H.-J. Zepernick, “APP decoding of nonbinary SPC product
codes over discrete memoryless channels,” in Proc. 10th Int. Conf. on Telecom-
mun., vol. 2, Papeete, French Polynesia, Feb. 2003, pp. 1167–1170.