


Coordinated Multiuser Communications

by

CHRISTIAN SCHLEGEL
University of Alberta, Edmonton, Canada

and

ALEX GRANT
University of South Australia, Adelaide, Australia


A C.I.P. Catalogue record for this book is available from the Library of Congress.

ISBN-10 1-4020-4074-1 (HB)

ISBN-13 978-1-4020-4074-0 (HB)

ISBN-10 1-4020-4075-X (e-book)

ISBN-13 978-1-4020-4075-7 (e-book)

Published by Springer, P.O. Box 17, 3300 AA Dordrecht, The Netherlands.

Printed on acid-free paper

All Rights Reserved

No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.

Printed in the Netherlands.

© 2006 Springer

www.springer.com


to Rhonda and Robyn


Contents

Preface
List of Figures
List of Tables

1 Introduction
  1.1 The Dawn of Digital Communications
  1.2 Multiple Terminal Networks
  1.3 Multiple-Access Channel
  1.4 Degrees of Coordination
    1.4.1 Transmitter and Receiver Cooperation
    1.4.2 Synchronization
    1.4.3 Fixed Allocation Schemes
  1.5 Network vs. Signal Processing Complexity
  1.6 Future Directions

2 Linear Multiple-Access
  2.1 Continuous Time Model
  2.2 Discrete Time Model
  2.3 Matrix-Algebraic Representation
  2.4 Symbol Synchronous Model
  2.5 Principles of Detection
    2.5.1 Sufficient Statistics and Matched Filters
    2.5.2 The Correlation Matrix
    2.5.3 Single-User Matched Filter Detector
    2.5.4 Optimal Detection
    2.5.5 Individually Optimal Detection
  2.6 Access Strategies
    2.6.1 Time and Frequency Division Multiple-Access
    2.6.2 Direct-Sequence Code Division Multiple-Access
    2.6.3 Narrow Band Multiple-Access
    2.6.4 Multiple Antenna Channels
    2.6.5 Cellular Networks
    2.6.6 Satellite Spot-Beam Channels
  2.7 Sequence Design
    2.7.1 Orthogonal and Unitary Sequences
    2.7.2 Hadamard Sequences

3 Multiuser Information Theory
  3.1 Introduction
  3.2 The Multiple-Access Channel
    3.2.1 Probabilistic Channel Model
    3.2.2 The Capacity Region
  3.3 Binary-Input Channels
    3.3.1 Binary Adder Channel
    3.3.2 Binary Multiplier Channel
  3.4 Gaussian Multiple-Access Channels
    3.4.1 Scalar Gaussian Multiple-Access Channel
    3.4.2 Code-Division Multiple-Access
  3.5 Multiple-Access Codes
    3.5.1 Block Codes
    3.5.2 Convolutional and Trellis Codes
  3.6 Superposition and Layering
  3.7 Feedback
  3.8 Asynchronous Channels

4 Multiuser Detection
  4.1 Introduction
  4.2 Optimal Detection
    4.2.1 Jointly Optimal Detection
    4.2.2 Individually Optimal Detection: APP Detection
    4.2.3 Performance Bounds – The Minimum Distance
  4.3 Sub-Exponential Complexity Signature Sequences
  4.4 Signal Layering
    4.4.1 Correlation Detection – Matched Filtering
    4.4.2 Decorrelation
    4.4.3 Error Probabilities and Geometry
    4.4.4 The Decorrelator with Random Spreading Codes
    4.4.5 Minimum-Mean Square Error (MMSE) Filter
    4.4.6 Error Performance of the MMSE
    4.4.7 The MMSE Receiver with Random Spreading Codes
    4.4.8 Whitening Filters
    4.4.9 Whitening Filter for the Asynchronous Channel
  4.5 Different Received Power Levels
    4.5.1 The Matched Filter Detector
    4.5.2 The MMSE Filter Detector

5 Implementation of Multiuser Detectors
  5.1 Iterative Filter Implementation
    5.1.1 Multistage Receivers
    5.1.2 Iterative Matrix Solution Methods
    5.1.3 Jacobi Iteration and Parallel Cancellation Methods
    5.1.4 Stationary Iterative Methods
    5.1.5 Successive Relaxation and Serial Cancellation Methods
    5.1.6 Performance of Iterative Multistage Filters
  5.2 Approximate Maximum Likelihood
    5.2.1 Monotonic Metrics via the QR-Decomposition
    5.2.2 Tree-Search Methods
    5.2.3 Lattice Methods
  5.3 Approximate APP Computation
  5.4 List Sphere Detector
    5.4.1 Modified Geometry List Sphere Detector
    5.4.2 Other Approaches

6 Joint Multiuser Decoding
  6.1 Introduction
  6.2 Single-User Decoding
    6.2.1 The Projection Receiver (PR)
    6.2.2 PR Receiver Geometry and Metric Generation
    6.2.3 Performance of the Projection Receiver
  6.3 Iterative Decoding
    6.3.1 Signal Cancellation
    6.3.2 Convergence – Variance Transfer Analysis
    6.3.3 Simple FEC Codes – Good Codeword Estimators
  6.4 Filters in the Loop
    6.4.1 Per-User MMSE Filters
    6.4.2 Low-Complexity Iterative Loop Filters
    6.4.3 Examples and Comparisons
  6.5 Asymmetric Operating Conditions
    6.5.1 Unequal Received Power Levels
    6.5.2 Optimal Power Profiles
    6.5.3 Unequal Rate Distributions
    6.5.4 Finite Numbers of Power Groups
  6.6 Proof of Lemma 6.7

A Estimation and Detection
  A.1 Bayesian Estimation and Detection
  A.2 Sufficiency
  A.3 Linear Cost
  A.4 Quadratic Cost
    A.4.1 Minimum Mean Squared Error
    A.4.2 Cramer-Rao Inequality
    A.4.3 Jointly Gaussian Model
    A.4.4 Linear MMSE Estimation
  A.5 Hamming Cost
    A.5.1 Minimum Probability of Error
    A.5.2 Relation to the MMSE Estimator
    A.5.3 Maximum Likelihood Estimation

References

Author Index

Subject Index

List of Figures

1.1 Basic setup for Shannon's channel coding theorem.
1.2 Multi-terminal networks.
1.3 A historical overview of multiuser communications.
1.4 Multiple-access channel.
1.5 Degrees of cooperation.

2.1 Simplified two-user linear multiple-access channel.
2.2 Continuous time linear multiple-access channel.
2.3 Sampling of the modulation waveform.
2.4 The modulation vectors sk[i].
2.5 Synchronous model.
2.6 Symbol synchronous matched filtered model.
2.7 Structure of the cross-correlation matrix.
2.8 Symbol synchronous single-user correlation detection for antipodal modulation.
2.9 Optimal joint detection.
2.10 Modulating waveform built from chip waveforms.
2.11 Chip match-filtered model.
2.12 Multiple transmit and receive antennas.
2.13 Simplified cellular system.
2.14 Satellite spot beam up-link.

3.1 Two-user multiple-access channel.
3.2 Example of a discrete memoryless multiple-access channel.
3.3 Coded multiple-access system.
3.4 Two-user achievable rate region.
3.5 Three-user achievable rate region.
3.6 Two-user binary adder channel.
3.7 Convex hull of two achievable rate regions for the two-user binary adder channel.
3.8 Capacity region of the two-user binary adder channel.
3.9 Channel as seen by user two.
3.10 Capacity region of the two-user binary multiplier channel.
3.11 Example of Gaussian multiple-access channel capacity region.
3.12 Rates achievable with orthogonal multiple-access.
3.13 Convergence of spectral density.
3.14 Spectral efficiency of DS-CDMA with optimal, orthogonal and random spreading. Eb/N0 = 10 dB.
3.15 Spectral efficiency of DS-CDMA with random spreading.
3.16 Random sequence capacity with Rayleigh fading.
3.17 Finding the asymptotic spectral efficiency via a geometric construction.
3.18 Rates achieved by some existing codes for the BAC.
3.19 Combined two-user trellis for the BAC.
3.20 Two-user nonlinear trellis code for the BAC.
3.21 Successive cancellation approach to achieve a vertex of the capacity region.
3.22 Two-user MAC with perfect feedback.
3.23 Capacity region for the two-user binary adder channel with feedback.
3.24 Simple feedback scheme for the binary adder channel.
3.25 Channel seen by V.
3.26 Capacity region for the two-user GMAC channel with feedback.
3.27 Capacity region for the symbol-asynchronous two-user Gaussian multiple-access channel.
3.28 Capacity region for the two-user collision channel without feedback.

4.1 Classification of multiuser detection and decoding methods.
4.2 A joint detector considers all available information.
4.3 Matched filter bank serving as a front-end for an optimal multiuser CDMA detector.
4.4 Illustration of the correlation matrix R for three asynchronous users.
4.5 Illustration of the recursive computation of the quadratic form in (4.11).
4.6 Illustration of a section of the CDMA trellis used by the optimal decoder, shown for three interfering users, i.e. K = 3, causing 8 states. Illustrated is the merger at state s, where each of the paths arrives with the metric (4.12).
4.7 Illustration of the forward and backward recursions of the APP algorithm for individually optimal detection.
4.8 Bounded tree search.
4.9 Histograms of the distribution of the minimum distances of a CDMA system with length-31 random spreading sequences, for K = 31 (dashed lines) and K = 20 users (solid lines), and maximum width of the search tree.

4.10 Linear preprocessing used to condition the channel for a given user (shaded).
4.11 Information theoretic capacities of various preprocessing filters.
4.12 Geometry of the decorrelator.
4.13 Shannon bounds for the AWGN channel and the random CDMA decorrelator-layered channel. Compare with Figure 4.11.
4.14 Shannon bounds for the AWGN channel and the MMSE layered single-user channel for random CDMA. Compare with Figures 4.11 and 4.13.
4.15 The partial decorrelating feedback detector uses a whitened matrix filter as a linear processor, followed by successive cancellation.
4.16 Shannon bounds for an MMSE joint detector for the unequal received power scenario of one strong user and equal power for the remaining users.
4.17 Shannon bounds for an MMSE joint detector for the case of two power classes with equal numbers of users in each group.

5.1 Illustration of the asynchronous blocks in the correlation matrix R.
5.2 The multistage receiver for synchronous and asynchronous CDMA systems.
5.3 Example performance of a parallel cancellation implementation of the decorrelator as a function of the number of iteration steps.
5.4 BER performance of Jacobi receivers versus system load for an equal power system, i.e. A = I.
5.5 Visualization of the Gauss-Seidel update method as an iterative minimization procedure, minimizing one variable at a time. The algorithm starts at point A.
5.6 Iterative MMSE filter implementations for random CDMA systems.
5.7 Shannon bounds for the AWGN channel and multistage filter approximations of the MMSE filter for a random CDMA system with load β = 0.5. The iteration constant τ was chosen according to (5.39).
5.8 Similar Shannon bounds for multistage filter approximations of the decorrelator with load β = 0.5.
5.9 Three-user binary tree.
5.10 Performance of the IDDFD for a system with 20 active users and random spreading sequences of length 31.
5.11 Performance of the IDDFD under an unequal received power situation with one strong user.
5.12 Performance of different multiuser decoding algorithms as a function of the number of active users at Eb/σ2 = 7 dB.
5.13 Sphere detector performance.
5.14 Sphere detector average complexity.

6.1 Comparison of the per-dimension capacity of optimal and linearly processed random CDMA channels. The solid lines are for β = 0.5 for both linear and optimal processing; the dashed lines are for a full load β = 1.
6.2 Diagram of a "coded CDMA" system, i.e. a CDMA system using FEC coding.
6.3 Projection Receiver block diagram using an embedded decorrelator.
6.4 Projection Receiver diagram using an embedded decorrelator.
6.5 Lower bound on the performance loss of the PR.
6.6 Performance examples of the PR for random CDMA. The dashed lines are from applying the bound from Theorem 6.1. The values of Eb/N0 are in dB.
6.7 Iterative multiuser decoder with soft information exchange.
6.8 Soft cancellation variance transfer curve.
6.9 Code VT curves for a selection of low-complexity FEC codes.
6.10 VT chart and iteration example for a highly loaded CDMA system. The FEC code VT curve is dashed; the cancellation VT curve is solid.
6.11 Illustration of the turbo effect of iterative joint CDMA detection.
6.12 Illustration of variance transfer curves of various powerful error control codes, such as practical-sized LDPC codes of length N = 5000 and code rate R = 0.5, as well as two serially concatenated turbo codes.
6.13 VT transfer chart and iteration example for a highly loaded CDMA system using a strong serially concatenated turbo code (SCC 2).
6.14 Determination of the limiting performance of FEC coded CDMA systems via VT curve matching for Eb/N0 → ∞.
6.15 Bit error performance of SCC2 from Table 6.1 as a function of the number of iterations.
6.16 Interference cancellation with a weak rate R = 1/3 repetition code, acting as a non-linear layering filter.
6.17 Achievable spectral efficiencies using linear and nonlinear layering processing in equal power CDMA systems.
6.18 Variance transfer curves for matched filter (simple) cancellation and per-user MMSE filter cancellation (dashed lines) for β = 2 and Es/N0 = 0 dB and Es/N0 → ∞.
6.19 Variance transfer curves for various multi-stage loop filters for β = 2 and two values of the signal-to-noise ratio: P/σ2 = 3 dB and P/σ2 = 23 dB.
6.20 Variance transfer chart for an iterative decoder using convolutional error control codes and a two-stage loop filter, showing an open channel at Eb/N0 = 4.5 dB.
6.21 Bit error rate performance of an iterative canceler with a two-stage loop filter for 1, 10, 20, and 30 iterations, compared to the performance of an MMSE loop filter.
6.22 Optimal and linear preprocessing capacities for various system loads for random CDMA signaling.
6.23 Performance of low-rate repetition codes in high-load CDMA systems, compared to single-user layered capacities for matched and MMSE filter systems.
6.24 CDMA spectral efficiencies achievable with iterative decoding with different power groups, assuming ideal FEC coding with rates R = 1/3 for simple cancellation as well as MMSE cancellation.
6.25 Capacity polytope illustrated for a three-dimensional multiple-access channel. User 2 is decoded first, then user 1, and finally user 3.
6.26 CDMA spectral efficiencies achievable with iterative decoding with equal power groups, assuming ideal FEC coding with optimized rates according to (6.124).
6.27 Illustration of different power levels and average VT characteristics, shown for both a serial turbo code (left) and a convolutional code (right). The system parameters are K1 = 22, K2 = 18, K3 = 16 for the SCC system at Eb/N0 = 13.45 dB, and K1 = K2 = K3 = 20 at Eb/N0 = 8.88 dB.

List of Tables

3.1 Coding schemes shown in Figure 3.18.
3.2 Uniquely decodeable rate 1.29 code for the two-user BAC.
3.3 Non-uniquely decodeable code for the two-user BAC.
3.4 Rate R = (0.571, 0.558) uniquely decodeable code for the two-user binary adder channel.

6.1 Serially concatenated codes of total rate 1/3 whose VT curves are shown in Figure 6.12. For details on serially concatenated turbo codes, see [120, Chapter 11].


Preface

Mathematical communications theory as we know it today is a fairly young but rapidly maturing science, just over 50 years old. Multiple-user theories extend back to the same recent birthplace, but are only now showing the first signs of maturation. The goal of this book is to present both classical and new approaches to the problems of designing coordinated communications systems for large numbers of users.

The problems of reliable information transfer are in most cases intertwined with the problems of allocating the sparse resources available for use. The multiuser philosophy attempts to optimize whole systems by combining the multiple-access and information transmission aspects.

It is the purpose of this book to introduce the reader to the concepts involved in designing multiple-user communications systems. To achieve this goal, conventional multiple-access strategies are contrasted with newer techniques, such as joint detection and interference cancellation. Emphasizing the theory and practice of unifying accessing and transmission aspects of communications, we hope that this book will be a valuable reference for students and engineers, presenting many years of research in book format.

Chapter 2 sets out the main area of interest of the book, namely the linear multiple-access channel. The emphasis is on obtaining a general model with wide application. Chapter 3 gives an overview of results from multiuser information theory, concentrating on the multiple-access channel. The remainder of the book, Chapters 4–6, is devoted to the design and analysis of multiuser detectors and decoders. Chapter 4 describes joint detection strategies for uncoded systems, and implementation details for such detectors are considered in Chapter 5. Joint decoders for systems with error control coding are the subject of Chapter 6, which concentrates on the iterative decoding paradigm.

The multiple-user communications philosophy does not solve the world's communications needs. With every problem addressed, others are peering out of dark places under the guise of complexity. It is the goal of this book to put the tools, techniques and, most importantly, the philosophy of multiple-user communications systems into the hands and minds of the reader.



As we write these final words, it seems that the information and communications theory community is embarking on a renewed multiple-user revolution, far beyond the scope of this book. We look forward with great anticipation to what the future holds for communications networks.

Park City, Utah and Adelaide, South Australia
May 2005

Christian Schlegel
Alex Grant

1 Introduction

1.1 The Dawn of Digital Communications

Early in the last century, a fundamental result by Nyquist, the sampling theorem, ushered in the era of digital communications. Nyquist [91] showed that a band-limited signal, that is, a signal whose spectral representation is sharply contained in a given frequency band, can be represented by a finite number of samples. These samples can be used to exactly reconstruct the original signal. In other words, the sampling theorem showed that it is sufficient to know a signal at discrete time intervals, and there is no need to store an entire signal waveform. This discretization of time for purposes of information transmission was a very important starting point for the sweeping success of digital information representation and communications later in the 20th century.
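The reconstruction guaranteed by the sampling theorem can be illustrated numerically. The sketch below (a minimal illustration, not from the book; the helper names are our own) applies Whittaker-Shannon sinc interpolation to uniform samples of a signal band-limited well below half the sampling rate, and recovers its value at an instant that falls between sample points:

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, fs, t):
    """Whittaker-Shannon interpolation: x(t) = sum_n x[n] * sinc(fs*t - n)."""
    return sum(s * sinc(fs * t - n) for n, s in enumerate(samples))

fs = 100.0  # sampling rate, comfortably above twice the highest frequency
signal = lambda t: (math.sin(2 * math.pi * 3.0 * t)
                    + 0.5 * math.cos(2 * math.pi * 7.0 * t))  # band-limited to 7 Hz
samples = [signal(n / fs) for n in range(2000)]  # 20 seconds of samples

t0 = 7.01234  # an instant between sample points
error = abs(reconstruct(samples, fs, t0) - signal(t0))
print(error)  # small residual, due only to truncating the infinite sum
```

In practice the interpolation sum must be truncated, so the reconstruction is near-exact rather than exact; the error here stems from the finite observation window, not from the sampling itself.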

In 1948, Shannon [123] showed that the time-discrete samples used to represent a communications signal could also be discretized in amplitude, and that the number of levels of amplitude discretization depends on the noise present in the communications channel. This is in essence Shannon's celebrated channel coding theorem, which assigns to any given communications channel a capacity, representing the largest number of (digital) information bits that can be reliably transported through this channel.

Combined with Nyquist's sampling theorem, Shannon's channel coding theorem states that information can be transported in discrete amounts at discrete time intervals without compromising optimality. That is, packaging information into time-discrete units with discrete (possibly fractional) numbers of bits in each unit is the optimal way of transmitting information. This realization has had a profound impact on communications and information processing. Virtually all information nowadays is represented, processed, and transported in discrete digital form.

The Shannon channel coding theorem clearly played a pivotal role in this drive towards digital signaling. It quantifies the fundamental limit of the information carrying capacity of a communications channel between a single transmitter and a single user. This set-up is illustrated in Figure 1.1, where a transmitter sends time-discrete symbols from a (typically) finite signaling alphabet through a transmission channel. The channel is the sum total of all that happens to the signal from transmitter to receiver. It includes distortion, noise, interference, and other effects the environment has on the signal. The receiver extracts the transmitted information from the channel output signal. It can do so only if the transmission rate R, in bits per symbol, is smaller than the channel capacity C, also measured in bits per symbol.

Fig. 1.1. Basic setup for Shannon's channel coding theorem.

Shannon’s channel coding theorem says more, however. The above state-ment is generally known as the converse to the channel coding theorem, stat-ing what is not possible, i.e. where the limits are in terms of admissible rates.The direct part of the theorem states that there exist encoding and decodingprocedures that allow the transmitted rate R approach the channel capacityarbitrarily closely with an error rate that can be made arbitrarily small. Thecost to achieve this lies in the length of the code, that is, the block of datathat is processed as a unit has to grow in size. Additionally, the complexity ofthe decoding algorithm increases as well, leading to ever more complex errorcontrol decoding circuits which can push rates closer to the capacity of thechannel.

Typically, the computation of the channel capacity is a fairly straightforward exercise, while the design, study, and analysis of capacity achieving encoding and decoding methods can be exceedingly difficult. For example, if the transmitter is restricted to transmit with average power P and the channel is affected only by additive white Gaussian noise with power N, and has a bandwidth of W Hz, its capacity is given by

C = W log2(1 + P/N)   [bits/s].   (1.1)

Equation (1.1) is arguably the most famous of Shannon's formulas, and its simplicity belies the depth of the channel coding theorem associated with it.
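As a quick numerical illustration of (1.1) (our own sketch, not from the text), the capacity of the band-limited additive white Gaussian noise channel can be evaluated directly:

```python
import math

def awgn_capacity(bandwidth_hz: float, snr: float) -> float:
    """Shannon capacity C = W log2(1 + P/N) in bits per second,
    where snr is the power ratio P/N."""
    return bandwidth_hz * math.log2(1.0 + snr)

# A 1 MHz channel at a signal-to-noise power ratio of 15 (about 11.8 dB)
C = awgn_capacity(1e6, 15.0)
print(C)  # 4000000.0 bits/s, since log2(1 + 15) = 4
```

Note the logarithmic return on power: doubling P/N adds only W bits/s, whereas doubling W at a fixed power ratio doubles C.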


We will encounter channels such as the one in Figure 1.1 repeatedly throughout this book.

It behooves us to recall that the tremendous progress towards realizing the potential of a communications channel via complex encoding and decoding algorithms was mainly fueled by the enormous strides that the technology of very large-scale integrated (VLSI) circuits and systems has made over the last 5 decades since the invention of the transistor, also in 1948.

1.2 Multiple Terminal Networks

In real-world situations, the clean arrangement of a single transmitter and a single receiver is increasingly becoming a special case, as most transmissions occur in a multi-terminal network environment, which consists of a potentially large number of transmitters and receivers. Communication takes place over a common physical channel. The messages from each transmitter are only intended for some subset of the receivers, but may be received by all receivers. This is illustrated in Figure 1.2, in which network nodes are shown as circles and transmissions as links.

Fig. 1.2. Multi-terminal networks.

Transmitters cause interference for the non-intended receivers. Traditionally, this interference has been lumped into the channel noise, and a set of single-channel models has been applied to multiple terminal communications. This is still the case with modern spread-spectrum based multiple-access systems such as the cdma2000 standard [134].


Multiple terminal networks can be decomposed into more basic components depending on the functionality of the system or the service.

• In a multiple-access channel (MAC) a number of terminals attempt to reach a common receiver via a shared joint physical channel (e.g. nodes 1, 3, 4 and 2 in Figure 1.2). This scenario is the closest to a single-user channel and is best understood theoretically.

• In broadcast channels the reverse is the case. A common transmitter attempts to reach a multitude of receivers (e.g. nodes 5, 8, and 9). The goal is to reach each receiver with possibly different messages at different data rates, at a minimum of resources. The simple description of the broadcast channel belies its theoretical challenge, and very little is known for general broadcast channels.

• In a relay channel information is relayed over many communications links from a transmitter to a receiver (e.g. nodes 4, 5, and 8). The relays may have no information of their own to transmit, and exist to assist the transmission of other users. In the simplest case, these links are single-user channels. The Internet uses such channels, where information traverses many communications links from source to destination.

• In an interference channel each transmitter has one intended receiver and one or more unintended receivers (e.g. nodes 4, 5, 6 and 7).

• In two-way channels both terminals act as transmitters and receivers and share a common physical channel which allows transmission to occur in both directions.

A data network comprises an arbitrary combination of all of these channels, and a general theory of network communication is still in its infancy. An early overview of multi-terminal channels can be found in [87]. A more recent treatment can be found in [24].

In this book we will deal almost exclusively with the multiple-access channel, for several reasons. First, the multiple-access channel is the most natural extension of the single-user channel, and Shannon's fundamental results are fairly easily extendible to this case. The multiple-access channel is also the most basic physical level channel, since, even in a more general data network, it describes the behavior of a single receiver node which is within reach of several transmitters, and the fundamental limits of the multiple-access channel apply to this situation. Lastly, the information transmission problem for the multiple-access channel, in contrast to the other multiple terminal network arrangements, can be addressed by transmitter and receiver designs conceptually analogous to single-channel designs, largely by designing appropriate physical layer systems. The multiple-access model includes some very important modern-day examples, such as the code-division multiple-access (CDMA) channel and the multiple-antenna channel, a recently popularized example of a multiple-input multiple-output (MIMO) channel (see Chapter 3).


Figure 1.3 shows a coarse time-line of some major developments in multiple-access information theory, signaling, and receiver design for the multiple-access channel.

Multiple-terminal information theory began with Shannon, who considered the two-way channel. He also claimed to have found the capacity region of the multiple-access channel, but this was never published, and it was not until the early 1970s that research into multiple-terminal information theory became widespread. Successively more detailed and more general channel models have been considered since then, and this progress is the subject of Chapter 3, which also describes some of the early attempts at code design for the multiple-access channel.

In the mid-1970s it was realized that the performance of multiple-access receivers for uncoded transmissions could be improved using joint detection, at a cost of increased implementation complexity. This motivated much research into practical signal processing methods for joint detection, particularly linear filtering methods. This is the subject of Chapters 4 and 5, the latter dealing specifically with implementation details.

After about 1996, much of the receiver design and analysis has focused on iterative turbo-type receivers for coded transmissions. These methods are discussed in detail in Chapter 6 of this book.

Fig. 1.3. A historical overview of multiuser communications.


1.3 Multiple-Access Channel

Figure 1.4 shows the basic situation of a multiple-access channel where a number of terminals communicate to a joint receiver using the same channel. As mentioned earlier, this situation arises naturally in a number of important practical cases, such as cellular telephone networks. Much is known about the multiple-access channel, such as its capacity, or more appropriately its capacity region, discussed in Chapter 3, which is the analog of the channel capacity in the single-channel case. The information theoretic properties of the multiple-access channel were first investigated in [2].

An important innovation that grew out of the information theoretic results for the multiple-access channel is that users should be decoded jointly, rather than independently while treating other users as interference. The joint receiver makes use of the known structure of all the users' transmitted signals, and potentially significant gains can be accomplished by doing this. The how and why of joint decoding is the major theme of this book. As can be appreciated from Figure 1.4, the capacity limits of the multiple-access channel, and algorithmic ways to approach them, are of major importance in any kind of multi-terminal network, since the data flow through the multi-access node is limited by the multiple-access channel it forms with the transmitting terminals. These limits apply whether data is destined for the multi-access node or not. We wish to restate that most of the concepts and results find application in the physical layer of a communications system, although the next higher layer, the medium access layer, will also have to be involved in efficient overall network design. Functions which are to be executed at the medium access layer include the selection of power levels, transmission rates, and transmission formats.

Fig. 1.4. Multiple-access channel.


1.4 Degrees of Coordination

One important conceptual and practical extension that needs to be added to the multiple-access communications problem is the coordination among the different communicating terminals. Different levels of coordination have a strong impact on the ability of the multi-access receiver to operate efficiently, even though the information theoretic capacity of the channel can be quite insensitive to these different levels of cooperation. The main cooperation concepts are i) source cooperation and ii) synchronization:

1.4.1 Transmitter and Receiver Cooperation

Different levels of transmitter and receiver cooperation correspond to different statistical assumptions concerning the users' transmissions, and different assumptions about the type of receiver signal processing. Three types of cooperation are represented in Figure 1.5.

No cooperation: Each source independently transmits information. The received signal is decoded separately by different receivers. The decoder for each user treats every other user as unwanted noise. This situation essentially turns each of the channels into an independent single-user channel, where all the effects of concurrent users are lumped into channel impairments. Current cellular telephony systems and the vast majority of communications systems to date apply this methodology.

Receiver cooperation: Each source independently transmits information. The receiver makes full use of the received signal by performing joint decoding. This is usually what we mean by coordinated multiuser communications. The resulting channel is the multiple-access channel with independent transmissions.

Full cooperation: The sources may fully coordinate their transmissions. Joint decoding is used at the receiver. If full cooperation is allowed (requiring communication between users), there is no restriction of independence on the joint transmission of the different users. Full cooperation allows the channel resource to be used as if by a single "super-source", which can be used as a benchmark. In certain cases, full cooperation may not add to the fundamental capacity (region) of many practically important multiple-access channels (such as the Gaussian multiple-access channel). In such cases, from a theoretical perspective, it is unimportant if the communicating terminals are physically separated and independent or are co-located and can coordinate their transmissions. Full cooperation can however be very important in practice to reduce the complexity burden of the receiver.

The remaining possibility, transmitter cooperation without receiver cooperation, is also of interest. This corresponds to the broadcast channel, which we do not consider in this book. Receiver cooperation with independent transmitters is the main theme of this book.


Fig. 1.5. Degrees of cooperation: (a) no cooperation; (b) receiver cooperation; (c) full cooperation.


1.4.2 Synchronization

The other important type of coordination between users in a network is synchronization. Whereas the different levels of cooperation described above are concerned with the sharing of transmitted data, or coordinating transmission signals, various degrees of synchronism are a consequence of the availability of a common time reference.

Depending upon the time reference available, users may align their symbols or frames.

Symbol Synchronism: Users strictly align their symbol epochs to common time boundaries. It is important to note that the reference location is the receiver, i.e. the transmitted symbols must align when received.

Frame Synchronism: The users align their codewords to common time boundaries. This is a less restrictive level of synchronization and therefore easier to accomplish than symbol synchronism.

Phase/Frequency Synchronism: Especially in wireless transmissions the phase and frequency of the carrier signal are very important. Knowing these, a receiver can be built with significantly less effort than without that knowledge. Phase and frequency synchronization are typically only feasible when full cooperation is possible. This is the case for multiple-antenna transmission systems, for example.

We shall see that loss of either type of synchronism may change the information rates that can be achieved.

1.4.3 Fixed Allocation Schemes

One conceptually simple way to share a common channel is to use some kind of fixed allocation. This is the current state of the art, and the following allocation methods are widely used:

Time-Division Multiple-Access (TDMA): Users do not transmit simultaneously; they transmit in orthogonal time periods, and therefore, from a "Shannon" point of view, they all use single channels in the conventional sense. This does require that transmissions are (frame) synchronized in time, which typically adds significant system overhead.

Frequency-Division Multiple-Access (FDMA): Users transmit in orthogonal frequency bands instead of time intervals. The available channel is sliced up into a number of frequency channels, each of which is used by a single user at a time. This requires coordination and frequency synchronization of the transmissions. FDMA and TDMA behave similarly on an ideal channel; on mobile channels, however, an FDMA system may suffer more significantly from signal fading. The pan-European cellular telephony system GSM [37] uses a combination of FDMA and TDMA.


Code-Division Multiple-Access (CDMA): This relative newcomer uses orthogonal, or nearly orthogonal, signaling waveforms for each of the users. It should properly be called signal-space-division multiple-access, but historically its most popular representative was CDMA, where the signal waveforms were generated by spreading codes, hence the terminology. Both time- and frequency-division can be viewed as special instances of signal-space-division. Code-division multiple-access also requires transmitter synchronization. Furthermore, the signals used must be orthogonal at the receiver. Many real-world channels destroy this orthogonality, and the channel turns into a proper multiple-access channel. CDMA finds application in cellular-based telephony systems such as IS-95 [135], cdma2000 [134], and future third generation systems.

Mathematical and physical details of these accessing schemes are discussed in detail in Chapter 2.
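Each of these fixed-allocation schemes can be viewed as assigning every user a signature waveform orthogonal to all the others: time slots for TDMA, tones for FDMA, and spreading codes for CDMA. The following sketch (our own illustration; the discrete signature matrices are assumed, not taken from the text) checks this orthogonality for four users:

```python
import numpy as np

L = 4  # discrete samples per symbol period, four users

# Unit-energy signature waveforms, one row per user, for each scheme
tdma = np.eye(L)                                   # user k owns time slot k
fdma = np.array([[np.exp(2j * np.pi * f * j / L)   # user f owns tone f
                  for j in range(L)] for f in range(L)]) / np.sqrt(L)
cdma = np.array([[1,  1,  1,  1],                  # Walsh-Hadamard
                 [1, -1,  1, -1],                  # spreading codes
                 [1,  1, -1, -1],
                 [1, -1, -1,  1]]) / 2.0

# In every case the signatures are orthonormal (S S^H = I), so a bank of
# correlators separates the users perfectly on an ideal channel.
for S in (tdma, fdma, cdma):
    assert np.allclose(S @ S.conj().T, np.eye(L))
```

On a real channel, delays and fading perturb these Gram matrices away from the identity, which is precisely when joint detection becomes valuable.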

1.5 Network vs. Signal Processing Complexity

State-of-the-art networks such as cellular telephony systems use mostly orthogonal allocation schemes to simplify the transmission technology. This, however, puts an organizational and computational burden on the network controls. Networks using fixed allocation schemes must maintain synchronism – this requires reliable feedback channels from the terminals. The network must also perform the resource allocation, which requires that the complete state of the network is known to the controller. These requirements can lead to very complex network control operations, which will become more and more complex as wireless networks migrate to packet transmission formats and ad-hoc operation. The precise power control mechanism required in CDMA cellular networks is an example of such a centralized network function [154].

Joint accessing of a MAC using multiple-user coding strategies has the potential to significantly increase the system capacity while reducing the overhead associated with network control functions. Complexity is transferred into the receiver, which is equipped with signal processing functions which can extract information streams from the correlated composite signal. With the rapid advances of VLSI technology, receiver designs which seemed impossible just years ago are now well within reach, as VLSI chips push processing powers into the teraflop region.

A well-designed multiuser receiver will relax or obviate the requirements on network synchronization and coordination. The properties of the signal processing functions provide for access coordination. For example, since a multiuser detector can process signals with different received power levels (see Chapter 6), power control no longer needs to be realized by the network. Since a multiuser detector can potentially approach the capacity of the multiple-access channel, the physical channel resources can be used optimally. This can lead to dramatic increases in system capacity.


Clearly, we are not getting anything for free, and the complexity of a multiuser receiver can be significantly higher than that of single-channel receivers. In Chapter 5 we will present joint receiver structures which can be viewed as bridges between single-channel receivers and joint receivers, in that they rely on only moderately complex extensions of conventional single-channel receivers. In general, however, it is fair to say that complexity is transferred from the network to the receiver, where it is realized in VLSI circuits. The ability to implement complex systems with comparable ease allows the exploitation of channel resources right at the receiver, rather than having to try to compensate for inefficient receiver processing with expensive network-level measures.

1.6 Future Directions

As pressure rises to use expensive channel resources such as power and bandwidth more efficiently, highly resource-efficient systems will replace less efficient solutions. Multiuser receivers are required to exploit these resources optimally at the physical layer of a multiple-access channel.

A level higher up, resources will likely have to be allocated dynamically as needed rather than as fixed allocations. This requires a high level of coordination. Intelligent medium-access control layer algorithms have to select resource allocations, but physical layer multiuser receivers will allow a chosen allocation to take optimal effect. Such receivers will also alleviate the pressure on the medium-access control layer by allowing the receivers to adjust to many channel aspects, such as different received power levels.

The dropping cost of signal processing allows efficient exploitation of channel resources right at the receiver, making it possible to approach the fundamental limits of the communications channel. Joint signal processing at the receiver, combined with novel medium-access control layer protocols, will allow future data networks to harness the intrinsic capacity of the multitude of channels in a multi-terminal network.

Laying the foundations of the physical layer processing methodologies for such future networks is the purpose of this book.


2 Linear Multiple-Access

Most of the multiple-access channel models of interest in this book are linear, meaning that the channel output is a linear transformation of the users' input signals, affected by additive noise. A simplified example of a two-user linear multiple-access channel is shown in Figure 2.1. The corresponding mathematical model is

r(t) = d1(t) + d2(t) + z(t),

where r(t) is the received signal, d1(t) and d2(t) are the signals transmitted by users 1 and 2, and z(t) is an additive noise process.

Fig. 2.1. Simplified two-user linear multiple-access channel.

The assumption of linearity may be based, for example, on the underlying linear superposition of signals which occurs in radio transmissions. In many real-world applications, however, various non-linear effects (for example due to amplifier non-linearities) may be present. Such non-linearities will be ignored in this book.
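As a toy numerical illustration of this two-user model (our own sketch, with assumed BPSK data and an assumed noise level), the receiver observes only the superposition of the users' signals:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10  # number of symbol intervals

# Assumed BPSK symbol streams for the two users, plus mild Gaussian noise
d1 = rng.choice([-1.0, 1.0], size=n)
d2 = rng.choice([-1.0, 1.0], size=n)
z = 0.1 * rng.standard_normal(n)

# The linear multiple-access channel simply adds the signals and the noise
r = d1 + d2 + z

# Noise-free, r takes values in {-2, 0, +2}; the level 0 arises from both
# (+1, -1) and (-1, +1), so the users cannot be separated symbol by symbol
# without further structure, which is what signature waveforms and joint
# detection provide.
assert np.all(np.isin(d1 + d2, [-2.0, 0.0, 2.0]))
```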

The purpose of this Chapter is to develop a reasonably generic mathematical model for the linear multiple-access channel, to be adopted for the remainder of this book. This model describes a number of users accessing a common channel by modulating their data signals with specially designed waveforms. We shall begin in Section 2.1 by describing the channel model in continuous time, and further develop a signal-space representation. We shall then discretize time in Section 2.2 via sampling, and derive a linear matrix-algebraic model in Section 2.3 that will form the basis for much of the discussion in the remainder of this book. This matrix-algebraic model applies to both the asynchronous and synchronous case, which facilitates the later discussion of detection algorithms. The special case of the symbol synchronous model will be discussed in Section 2.4.

Following on from this, in Section 2.5 we give a brief overview of the principles of single-user and multiple-user detection for these channels. In particular, the single-user correlation detector, described in Section 2.5.3, has motivated the design of many existing multiple-access systems (historically due to its low implementation complexity).

In Section 2.6 we describe some of these multiple-access systems, including time- and frequency-division methods, and show how the discrete-time linear model is specialized to each case. Of particular interest is the direct-spread code-division multiple-access channel of Section 2.6.2, which forms a common theme for the remainder of this book. The Chapter concludes with an investigation into the design of signaling waveforms for single-user detection in Section 2.7.

2.1 Continuous Time Model

Figure 2.2 shows a schematic representation of a K user continuous time linear multiple-access channel. Each user's transmitter consists of a data source, an encoder (i.e. forward error correction), and a modulator. Typically the data source and encoder output binary symbols (bits). It will be assumed that any data compression is contained within the data source, and we do not consider it explicitly. Modulation is performed in two stages: first a baseband modulator maps the coded data onto a complex signaling constellation; the resulting complex samples are then amplified and multiplied by a channel modulation waveform, which is the method used to access the common channel. The receiver observes a signal which is the sum of delayed versions of the modulated users' signals, together with noise.

In Figure 2.2, a dashed box shows the portion of the system that we refer to as the multiple-access channel. Strictly speaking, from an information theoretic point of view, the channel modulation waveforms should not be regarded as part of the channel. For the purposes of channel modeling, however, we will consider the modulation waveforms as part of the channel. We will now develop a detailed mathematical model for this linear multiple-access channel.

Considering the output of the baseband modulator, each user k = 1, . . . , K generates a sequence of n data symbols dk[i], i = 1, 2, . . . , n. The subscript k indexes the user and the square-bracket indexing i denotes the symbol time index. Each data symbol is selected from a baseband modulation alphabet, dk[i] ∈ Dk.

Fig. 2.2. Continuous time linear multiple-access channel.

The users’ alphabets need not be the same for all users, and may be dis-crete or continuous. In general, they may be subsets of the complex numbers,Dk ⊂ C, which allows us to use the sequences dk[i] to model various formsof complex baseband modulation. For example antipodal modulation, suchas binary phase-shift keying can be modeled by Dk = {−1,+1}. Alterna-tively, bi-orthogonal, e.g. quaternary phase-shift keyed data can be modeledby Dk = {1 + j,−1 + j,−1− j, 1− j}/

√2. See [104, 167] for basic concepts of

signal space and the complex baseband representation of signals. In Figure 2.1the thicker lines indicate complex signal paths.
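For concreteness, both example alphabets can be written out in code (our own sketch); note that each has unit energy per symbol:

```python
import math

# Antipodal (BPSK) alphabet
D_bpsk = [-1.0, +1.0]

# QPSK alphabet {±1 ± j}/sqrt(2)
D_qpsk = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]

# Every symbol in either alphabet lies on the unit circle, so the
# per-symbol energy |d|^2 equals 1 in both cases.
for d in D_bpsk + D_qpsk:
    assert abs(abs(d) - 1.0) < 1e-12
```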

For the purposes of channel modeling, we shall not be concerned with how the discrete-time symbol sequences dk[i] are generated. For example, they could represent either uncoded or coded information sequences. In Chapter 3, we shall be more concerned with statistical models for the users' messages, which is of relevance for information theory.

Each data symbol has duration T seconds. We turn the discrete-time symbol sequences dk[i] into continuous-time sequences (a function of the continuous time variable t ≥ 0) by writing

dk(t) = dk[⌈t/T⌉],

where ⌈·⌉ is the ceiling function, which returns the smallest integer that is greater than or equal to its argument.

We shall use the convention that discrete-time sequences are indicated with square-bracket indexing by i. Integer indices such as k and i shall start from 1. Continuous time waveforms will be indicated with round-bracket indexing by t ≥ 0.

The symbol sequences for each user are amplified via multiplication by the amplitudes √Pk ≥ 0. The resulting scaled sequences are then linearly modulated with a real or complex continuous-time waveform sk(t). It will be assumed without loss of generality that the per-symbol energy of the modulating waveforms is normalized so that

(1/nT) ∫_0^nT |sk(t)|^2 dt = 1

(recalling that n consecutive symbols are transmitted with total duration nT).

It will also be assumed that

sk(t) = 0 for t < 0 and t > nT.

Under the further assumption that the modulation waveforms sk(t) are independent of the data sequences dk[i], this results in a per-symbol transmit energy Pk. Appropriate selection of the modulation waveforms allows us to represent a wide variety of linear modulation schemes. Some examples will be developed in Section 2.6.

Each transmitted waveform experiences a time delay of τk seconds. These delays can model the propagation delays inherent in the channel, and the fact that the individual users may not be synchronized in time. Without loss of generality, we can assume that 0 ≤ τk < T (delays longer than one symbol period can be modeled by re-indexing the users' symbols).

Thus (in the absence of delay) the signal contribution xk(t) of user k = 1, 2, . . . , K is

xk(t) = dk[⌈t/T⌉] √Pk sk(t).

The delay and noise-affected channel output waveform r(t) is therefore given by

r(t) = Σ_{k=1}^K xk(t − τk) + z(t)   (2.1)
     = Σ_{k=1}^K dk[⌈(t − τk)/T⌉] √Pk sk(t − τk) + z(t),   (2.2)

where z(t) is a continuous time noise process, which is usually assumed to be Gaussian (real or complex, to match the modulating waveforms and data alphabet). See [49] for a discussion of the Gaussian noise process in relation to communications systems.


2.2 Discrete Time Model

For the purposes of analysis, and indeed implementation, it can be more convenient to work with an equivalent discrete-time model. By a discrete-time model, we mean a representation of r(t) in terms of an integer-indexed sequence r[j], j = 1, 2, . . . . By equivalent, we require r[j] to be a sufficient statistic for detection of the users' data, according to Section A.2. A discrete-time model is obtained by representing the received signal r(t) using a complete orthonormal basis (see [49, Chapter 8]). Let the functions φj(t), j = 1, 2, . . . be a complete orthonormal basis. Then the discrete-time representation is

r(t) = Σ_j r[j] φj(t),   where   r[j] = ∫ r(t) φj*(t) dt,

where (·)* denotes complex conjugation. There are many possible choices for the basis φj, depending on the structure of the transmitted waveforms.

In any practical system, the modulating waveforms sk(t) will be (at least approximately) band-limited to an interval of frequencies [−W, W], and (approximately) time-limited to an interval [0, nT]. In this case, one common choice of basis is the set of sampling functions

φj(t) = sin(2Wπ(t − (j−1)/(2W))) / (√(2W) π(t − (j−1)/(2W))),   j = 1, 2, . . . , 2WnT + 1.   (2.3)

Using this basis (2.3), a finite-dimensional discrete-time model is obtained by sampling the received waveform (2.2) at intervals of 1/(2W) seconds:

r[j] = r((j − 1)/(2W))   (2.4)
     = Σ_{k=1}^K dk[⌈((j − 1)/(2W) − τk)/T⌉] √Pk sk((j − 1)/(2W) − τk) + z((j − 1)/(2W)),   (2.5)

where j = 1, 2, . . . , 2WnT + 1 is the sample time index (recalling our convention to start integer indices from 1). With reference to Figure 2.3, let

Tk = ⌈2Wτk⌉

be the delay of user k, rounded up and measured in samples, and for future reference define

Tmax = max_k Tk.

The first non-zero sample instant for user k is at an offset of

τ0k = Tk/(2W) − τk


seconds from the start of the modulating waveform. Define the sampled modulation waveforms

sk[j] = sk((j − 1)/(2W) − τk),   (2.6)

which are zero for j < Tk + 1 and j ≥ nL + Tk + 1, where

L = ⌈2WT⌉

is the number of samples per symbol period.

Using these definitions, we arrive at the following discrete-time model,
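The sample-domain bookkeeping above, Tk = ⌈2Wτk⌉, Tmax, τ0k = Tk/(2W) − τk and L = ⌈2WT⌉, can be sketched numerically. All values here are arbitrary illustrations, not from the text, in normalized time units chosen so that the sampling rate 2W equals one sample per time unit.

```python
import math

# Arbitrary illustration values (normalized units so that 2W = 1).
W = 0.5                     # one-sided bandwidth, so the sampling rate is 2W = 1
T = 16                      # symbol period
tau = [0.0, 3.7, 9.2]       # delays tau_k for K = 3 users

L = math.ceil(2 * W * T)                    # samples per symbol, L = ceil(2WT)
Tk = [math.ceil(2 * W * t) for t in tau]    # T_k = ceil(2W tau_k), in samples
Tmax = max(Tk)
# Offset of the first non-zero sample from the start of each user's
# modulating waveform, tau0_k = T_k/(2W) - tau_k.
tau0 = [Tk[k] / (2 * W) - tau[k] for k in range(len(tau))]

print(L, Tk, Tmax)          # 16 [0, 4, 10] 10
```

Note how a user whose delay is not a whole number of sample periods (τ2 = 3.7 here) gets rounded up to the next sample instant, leaving a fractional offset τ0k inside one sample period.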

r[j] = Σ_{k=1}^{K} dk[i] √Pk sk[j] + z[j].   (2.7)

In the above equation, the data symbol index i of user k corresponding to sample number j is

i = ⌈(j/(2W) − τk)/T⌉,

valid for j > Tk.

The sequence z[j] = z((j − 1)/(2W)) contains the noise samples. Throughout this book, the additive noise will be modeled by a zero-mean, circularly symmetric complex white Gaussian process with power spectral density N0/L. In this case, the noise samples z[j] have independent zero-mean Gaussian real and imaginary parts, each with variance N0/(2L), and these samples are also independent and identically distributed (i.i.d.) over time. The received signal-to-noise ratio γk for user k is therefore

γk = Pk/N0    for complex modulation,
γk = 2Pk/N0   for real modulation.

2.3 Matrix-Algebraic Representation

Much insight into the behavior of the linear multiple-access channel and the design of multiple-user detectors is gained through expressing the discrete-time model (2.7) in terms of matrix-vector operations. In order to accomplish this, we need to define some appropriate vectors and matrices.

Modulation Matrix

Collect each user’s sampled modulation waveforms into length nL + Tmax

column vectors sk[i], one vector for each symbol period of each user, with zeropadding corresponding to the delays τk. The vector for user k and symbol i is


Fig. 2.3. Sampling of the modulation waveform.

sk[i] = (0, . . . , 0, sk[L(i−1)+Tk+1], sk[L(i−1)+Tk+2], . . . , sk[Li+Tk], 0, . . . )^t,   (2.8)

where the first L(i−1) + Tk entries are zero.

Note that

Σ_{i=1}^{n} sk[i] = (sk[1], sk[2], . . . , sk[nL + Tmax])^t

is a vector containing the entire sampled modulation waveform for user k.

Example 2.1. Figure 2.4 shows a schematic representation of the modulation vectors for user k in a system with L = 3, Tk = 2, Tmax = 3 and n = 3. The zero part of each vector is omitted for clarity.

Now define the (Ln + Tmax) × Kn modulation matrix S, which takes the vectors sk[i], k = 1, 2, . . . , K, i = 1, 2, . . . , n as columns,

S = ( s1[1] s2[1] . . . sK[1]  s1[2] s2[2] . . . sK[2]  . . .  s1[n] s2[n] . . . sK[n] ),

i.e. all the users' vectors for symbol 1 (in order of user number), followed by all the vectors for symbol 2, and so on. Column l of S is the sampled modulation sequence of user

k = ((l − 1) mod K) + 1   (2.9)

corresponding to symbol

i = ⌈l/K⌉.   (2.10)

Amplitude Matrix

Define a diagonal Kn × Kn matrix

A = diag(√P1, √P2, . . . , √PK, √P1, √P2, . . . , √PK, . . . )

which contains the user amplitudes, repeated n times. Note that time-varying amplitudes are also easily accommodated.



Fig. 2.4. The modulation vectors sk[i].

Data Vector

Collect the users’ symbols into a single length n column vector

d = (d1[1], d2[1], . . . , dK [1], . . . , d1[i], d2[i], . . . , dK [i], . . . , dK [n])t.

The elements of this vector are in the same ordering as the columns of S,namely data symbol 1 for each user followed by data symbol 2 for each userand so on.

The Basic Model for the Received Vector

Finally define the length nL + Tmax column vectors

r = (r[1], r[2], . . . , r[nL + Tmax])^t
z = (z[1], z[2], . . . , z[nL + Tmax])^t

which respectively contain the received samples and the noise samples. We can now re-write the discrete-time model (2.7) using the following matrix-vector representation:


r = SAd + z. (2.11)

This is the canonical linear-algebraic model for the linear multiple-access channel.

Using the notation established so far, there are two ways to refer to elements of these matrices. In cases where we wish to emphasize user and time indices, we will use the sequence notation sk[i]. When we wish to emphasize the linear algebraic structure, we will use matrix-vector indices s_jl, with row index j and column index l.

As explained above, the conversion between user/time indexes k, i androw/column indexes is via (2.9) and (2.10).

Example 2.2. The equation below shows the structure of this linear model for four users (K = 4) and two symbols (n = 2), with T1 = 0, T2 = 1, T3 = 3, T4 = 0 and L = 4. All zero elements are omitted for clarity.

With nL + Tmax = 11 rows, r = (r[1], . . . , r[11])^t, z = (z[1], . . . , z[11])^t,

d = (d1[1], d2[1], d3[1], d4[1], d1[2], d2[2], d3[2], d4[2])^t,
A = diag(√P1, √P2, √P3, √P4, √P1, √P2, √P3, √P4),

and, with the columns ordered s1[1], s2[1], s3[1], s4[1], s1[2], s2[2], s3[2], s4[2],

     ⎛ s1[1]                 s4[1]                              ⎞
     ⎜ s1[2]  s2[2]          s4[2]                              ⎟
     ⎜ s1[3]  s2[3]          s4[3]                              ⎟
     ⎜ s1[4]  s2[4]  s3[4]   s4[4]                              ⎟
     ⎜        s2[5]  s3[5]          s1[5]                 s4[5] ⎟
S =  ⎜               s3[6]          s1[6]  s2[6]          s4[6] ⎟
     ⎜               s3[7]          s1[7]  s2[7]          s4[7] ⎟
     ⎜                              s1[8]  s2[8]  s3[8]   s4[8] ⎟
     ⎜                                     s2[9]  s3[9]         ⎟
     ⎜                                            s3[10]        ⎟
     ⎝                                            s3[11]        ⎠

so that r = SAd + z.

2.4 Symbol Synchronous Model

In the case τ1 = τ2 = · · · = τK = 0, the system is symbol synchronous. This is obviously a less general model than the asynchronous model presented above, but it allows us to write a per-symbol matrix model that is very convenient for the development of the main ideas of this book. The degree to which symbol synchronism can be achieved is a technological issue, and in practice it is typically difficult to achieve perfect synchronism across all users.

For the symbol synchronous model we have the following symbol-wise discrete-time matrix model,

r[i] = S[i]Ad[i] + z[i],   (2.12)

where

r[i] = (r[(i−1)L + 1], r[(i−1)L + 2], . . . , r[iL])^t
z[i] = (z[(i−1)L + 1], z[(i−1)L + 2], . . . , z[iL])^t
d[i] = (d1[i], d2[i], . . . , dK[i])^t
A = diag(√P1, . . . , √PK)


and S[i] is an L × K matrix with column k being the modulation waveform samples for user k corresponding to symbol i,

sk[i] = (sk[(i−1)L + 1], sk[(i−1)L + 2], . . . , sk[iL])^t.

It is important to note that the specific waveform samples are different from those used in the asynchronous model. Since the users are completely synchronous, τ0k = 0 for each user and

sk[j] = sk((j − 1)/(2W)),   (2.13)

which corresponds to different sampling instants than in the asynchronous model (2.11) (compare (2.13) to (2.6)).

Note that the symbol-synchronous (2.12) and the general asynchronous (2.11) models share the same basic mathematical form. In both cases, the observation (whether it is for a single symbol, or an entire block of symbols) is a noise-affected linear transformation of the corresponding input. The only modification required is in the specification of the matrices and vectors involved. In the case of the symbol synchronous model, it is common practice to drop the symbol indexing when it does not cause confusion, and write

r = SAd + z. (2.14)

With this convention, we can use row/column indexing as explained above to write expressions that apply to both the synchronous (2.14) and asynchronous (2.11) models. This is very useful, since it allows easy translation of results between the two cases. In particular, detection algorithms developed for the synchronous channel can usually be directly applied to the asynchronous channel simply via a re-definition of the linear model.

The symbol-synchronous discrete-time channel model is shown in Figure 2.5, where the summation is now a vector sum.

Example 2.3. The equation below shows the structure of the symbol synchronous linear model at symbol i, for three users (K = 3) and L = 4. All zero elements are omitted for clarity.

⎛ r[(i−1)L+1] ⎞   ⎛ s1[(i−1)L+1]  s2[(i−1)L+1]  s3[(i−1)L+1] ⎞ ⎛ √P1           ⎞ ⎛ d1[i] ⎞   ⎛ z[(i−1)L+1] ⎞
⎜ r[(i−1)L+2] ⎟ = ⎜ s1[(i−1)L+2]  s2[(i−1)L+2]  s3[(i−1)L+2] ⎟ ⎜      √P2      ⎟ ⎜ d2[i] ⎟ + ⎜ z[(i−1)L+2] ⎟
⎜ r[(i−1)L+3] ⎟   ⎜ s1[(i−1)L+3]  s2[(i−1)L+3]  s3[(i−1)L+3] ⎟ ⎝           √P3 ⎠ ⎝ d3[i] ⎠   ⎜ z[(i−1)L+3] ⎟
⎝ r[iL]       ⎠   ⎝ s1[iL]        s2[iL]        s3[iL]       ⎠                               ⎝ z[iL]       ⎠

2.5 Principles of Detection

So far we have developed sample-level discrete-time models for both the asynchronous and the symbol synchronous linear multiple-access channels. Using these models, we may apply existing results from detection theory to develop a variety of detection techniques with different levels of performance, at the expense of different levels of implementation complexity. In almost all cases


Fig. 2.5. Synchronous model.

of interest, implementation of the optimal detector is prohibitively complex: its complexity increases exponentially with the number of users. The high cost of implementation for the optimal detector motivates the development of reduced-complexity sub-optimal detectors. See Appendix A for a brief introduction to Bayesian estimation and detection.

Most existing multiple-access strategies have been designed with a particular sub-optimal detection method in mind, namely single-user correlation detection, also known as the single-user matched filter. Historically, the correlation detector pre-dates the optimal detector, and was the motivation behind the development of many different multiple-access techniques, for example direct-sequence code-division multiple-access.

The purpose of this section is to give a brief introduction to the principles of detection as they apply to the linear multiple-access channel. In particular, we will introduce the single-user correlation receiver as motivation for the access strategies to be developed in Section 2.6. Chapters 4 to 6 develop in further detail a whole range of detection and decoding strategies, providing a trade-off between performance and complexity.

As explained in Sections 2.3 and 2.4, the discrete time asynchronous and symbol-synchronous models share the mathematical formulation

r = SAd + z   (2.15)

under different definitions of these matrices and vectors.

In the asynchronous case, the modulation matrix S is (nL + Tmax) × nK and has as column l the zero-padded portion of the sampled modulation waveform of user ((l − 1) mod K) + 1 associated with symbol ⌈l/K⌉. The nK × nK diagonal matrix A has l-th diagonal element √P_((l−1) mod K)+1. The length nK data vector d has as l-th element d_((l−1) mod K)+1[⌈l/K⌉].


In the symbol synchronous case, (2.15) is a symbol-wise model and at each symbol i, the L × K modulation matrix S has as columns the users' sampled modulation waveforms associated with symbol i. The K × K amplitude matrix is A = diag(√Pk), and the length K data vector contains the users' data symbols for time i.

Subject to the Gaussian noise model, in both cases the noise vector z contains either real or circularly symmetric complex i.i.d. Gaussian noise samples with

E[zz∗] = N0 I      for complex noise,
E[zz∗] = (N0/2) I  for real noise.

We will assume complex modulation.

In cases where it causes no confusion, we shall simply refer to the linear model without specifying the degree of synchronism. In this way, the principles can be clearly explained without the need to describe a proliferation of special cases. This is one of the main advantages of this linear model.

The basic design objective for the multiple-user detector is to find estimates d̂k[i] satisfying various optimality criteria.

2.5.1 Sufficient Statistics and Matched Filters

According to (2.15), the observation r, conditioned on knowledge of S, A, d and N0, is Gaussian with mean SAd and covariance N0 I, denoted r ∼ N(SAd, N0 I). The matched filter output y, obtained by multiplying the channel output with the Hermitian transpose of the modulation sequence matrix,

y = S∗r
  = S∗SAd + S∗z
  = RAd + S∗z,

is a sufficient statistic for the detection of the transmitted symbols d (see Section A.2). The matrix R = S∗S is called the correlation matrix. The resulting noise S∗z affecting the matched filter output is correlated according to

E[(S∗z)(S∗z)∗] = N0 R.

The matched-filter model is shown schematically in Figure 2.6 (for the symbol synchronous case).

A sufficient statistic preserves the statistical properties of the observation. Hence the statistic y is an information-lossless, reduced-dimension representation of the original observation r. In the asynchronous case, R is nK × nK. For the synchronous model, R is K × K. The correlation matrix is of particular importance in multiuser detection, and its structure will be described in further detail in Section 2.5.2.


Fig. 2.6. Symbol synchronous matched filtered model.
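The claim that the matched-filter noise is correlated with covariance N0 R can be checked empirically. This sketch (all parameters arbitrary) matched-filters pure complex white noise many times and compares the sample covariance of S∗z with N0 R:

```python
import numpy as np

rng = np.random.default_rng(2)
K, L, N0, trials = 3, 8, 1.0, 20000      # arbitrary illustration values

S = rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)
R = S.conj().T @ S                        # correlation matrix R = S*S

# Matched-filter complex white noise (E[z z*] = N0 I) many times.
Z = np.sqrt(N0 / 2) * (rng.standard_normal((L, trials))
                       + 1j * rng.standard_normal((L, trials)))
Y = S.conj().T @ Z                        # noise at the matched-filter output
cov = (Y @ Y.conj().T) / trials           # empirical covariance of S*z

# The deviation from N0 R shrinks on the order of 1/sqrt(trials).
print(np.max(np.abs(cov - N0 * R)))
```

The diagonal entries of the estimate approach N0 (each column of S has unit energy), while the off-diagonal entries approach the scaled cross-correlations N0 R_kk′.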

Using either r or y, we can find minimum probability of error or minimum variance estimates of d (see Section A.2). In many cases it is more convenient to work with y, but this is not necessary. In fact, any invertible linear transformation of r is also a sufficient statistic. We shall later see other transformations that are useful for coded systems.

2.5.2 The Correlation Matrix

The matrix R = S∗S is called the correlation matrix, and appears frequently in the design and analysis of multiple-user detectors. We shall therefore consider this matrix in a little more detail.

Asynchronous Case

In the asynchronous case, column l of S is the spreading sequence for user k = ((l − 1) mod K) + 1 at symbol period i = ⌈l/K⌉ (reproducing (2.8) for reference),

sk[i] = (0, . . . , 0, sk[L(i−1)+Tk+1], sk[L(i−1)+Tk+2], . . . , sk[Li+Tk], 0, . . . )^t,   (2.16)

where the first L(i−1) + Tk entries are zero.

The correlation matrix R of an asynchronous system has elements

Page 42: Coordinated Multiuser Communications

26 2 Linear Multiple-Access

Rlm = sk∗[i] sk′[i′],   where   l = K(i − 1) + k,   m = K(i′ − 1) + k′.

If symbol i for user k does not overlap in time with symbol i′ for user k′, then the resulting Rlm = 0. Thus the correlation matrix is band-diagonal with bandwidth 2K + 1, since each symbol can only overlap with at most two symbols from any one other user.

Example 2.4. Figure 2.7 shows the structure of R for a K = 3 system and i = 1, 2, 3. We have, without loss of generality, assumed τ1 ≤ τ2 ≤ τ3. Hence R is a 9 × 9 band-diagonal matrix with bandwidth 5. The columns of the matrix are labeled with values for k and i. The rows are labeled with values for k′ and i′. The entries correspond to the cross-correlation between user k, symbol i and user k′, symbol i′.

Note that the matrix consists of a block-diagonal part, together with lower- and upper-triangular matrices to fill in the band-diagonal structure. The square matrices on the diagonal are the cross-correlations between all the users for a given symbol interval. The upper-triangular matrices contain the inter-symbol interference (ISI) terms from users at the previous symbol interval. The lower-triangular matrices contain the terms for the ISI from users at the next symbol interval.

Fig. 2.7. Structure of the cross-correlation matrix.
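A numerical sketch of Example 2.4 (arbitrary ±1 waveform samples, not from the text): with the delays ordered τ1 ≤ τ2 ≤ τ3 as in the example, every entry of R further than K − 1 positions from the diagonal corresponds to two symbols that never overlap in time, leaving the five nonzero diagonals of the 9 × 9 matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
K, n, L = 3, 3, 4
Tk = [0, 1, 2]             # sorted sample delays, tau_1 <= tau_2 <= tau_3
Tmax = max(Tk)
rows = n * L + Tmax

# Asynchronous modulation matrix: column iK + k holds symbol i of user k,
# placed at its delayed position (cf. eq. 2.8), with unit-energy entries.
S = np.zeros((rows, K * n))
for i in range(n):
    for k in range(K):
        lo = i * L + Tk[k]
        S[lo:lo + L, i * K + k] = rng.choice([-1.0, 1.0], size=L) / np.sqrt(L)

R = S.T @ S                # 9 x 9 correlation matrix

# Entries beyond |l - m| <= K - 1 are structurally zero: the corresponding
# symbols occupy disjoint sample ranges, so the inner product vanishes.
print(np.allclose(np.triu(R, K), 0) and np.allclose(np.tril(R, -K), 0))
# True
```

The zeros here are exact, not approximate: they come from the non-overlapping supports of the columns, regardless of the particular waveform values.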


Symbol Synchronous Case

For a symbol-synchronous model, there is no inter-symbol interference and the correlation matrix R is block-diagonal,

R = diag(R[1], R[2], . . . ),

where R[i] is a K × K symmetric matrix with elements

Rkk′[i] = sk∗[i] sk′[i].

With reference to Figure 2.7, if the system is symbol synchronous, the upper- and lower-triangular components of R are zero, and the remaining square matrices on the diagonal are exactly R[1], R[2], etc.

2.5.3 Single-User Matched Filter Detector

The correlation receiver, also known as the single-user matched filter receiver, holds a special place in the history of multiple-user detection. It is an essentially single-user technique that forms its decision d̂k for user k based only on the matched filter output for user k. Now the matched filter output for user k consists of the symbol of user k as well as additive multiple-access interference,

yk[i] = sk∗[i] r   (2.17)

      = dk[i] + Σ_{(k′,i′)≠(k,i)} sk∗[i] sk′[i′] dk′[i′] + zk[i]   (2.18)

      = dk[i] + Σ_{(k′,i′)≠(k,i)} Rlm dk′[i′] + zk[i],   (2.19)

where the sum in (2.19) is the multiple-access interference and zk[i] is the thermal noise.

The last line above shows how the entries of the correlation matrix affect the amount of multiple-access interference (where, as before, l = K(i − 1) + k and m = K(i′ − 1) + k′). The correlator receiver is motivated by a Gaussian assumption regarding the distribution of the multiple-access interference. The folklore argument in favor of the correlator receiver goes as follows.

“The matched filter output is optimal (in terms of minimizing each user's error probability) when the multiple-access interference is uncorrelated Gaussian noise. If there are many users, the central limit theorem ensures that this interference is in fact asymptotically Gaussian distributed.”

Invoking the Gaussian assumption on the multiple-access interference, the correlator receiver applies standard detection methods individually to the output of each matched filter. In the case of binary antipodal modulation, Dk = {−1, +1}, each user's symbol estimate is obtained by hard-limiting the matched filter output as follows.


d̂k[i] = sgn(yk[i]).

Figure 2.8 shows the structure of the correlator receiver for the symbol-synchronous channel and antipodal modulation. In the case of other modulation alphabets, the threshold device is replaced by the appropriate decision device (based on the Gaussian assumption).

Fig. 2.8. Symbol synchronous single-user correlation detection for antipodal modulation.
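A sketch of the correlator receiver for antipodal modulation (real-valued model with unit powers, all parameters arbitrary): each decision uses only that user's matched-filter output, and the decomposition (2.19) into data, multiple-access interference, and filtered noise is made explicit.

```python
import numpy as np

rng = np.random.default_rng(4)
K, L, N0 = 3, 16, 0.05     # arbitrary illustration values

S = rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)   # unit-energy columns
d = rng.choice([-1.0, 1.0], size=K)           # antipodal data, unit powers
z = np.sqrt(N0 / 2) * rng.standard_normal(L)  # real noise, variance N0/2
r = S @ d + z

y = S.T @ r                 # bank of single-user matched filters
R = S.T @ S
# Decomposition of eq. (2.19): desired symbol + MAI (off-diagonal R) + noise.
assert np.allclose(y, d + (R - np.eye(K)) @ d + S.T @ z)

d_hat = np.sign(y)          # per-user hard decisions, sgn(y_k)
print(d_hat)
```

The off-diagonal entries of R appear directly as interference coefficients, which is exactly why the correlator's performance degrades as the cross-correlations grow.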

Largely due to the pioneering work of [121, 142, 147], the Gaussian assumption argument has been refuted, and since then a proliferation of multiple-user detection strategies exploiting the inherent structure of the multiple-access interference has occurred. The sub-optimality of the single-user correlator is basically due to the fact that although the vector y is a sufficient statistic for the detection of d, the single element yk is not (in general) a sufficient statistic for the detection of dk. By ignoring the other matched filter outputs, information essential for optimal detection is lost.

In Chapter 3 we shall see that not only is the correlator sub-optimal from an uncoded bit-error probability point of view, but if each user performs single-user decoding based on the output of the correlator, a penalty is paid in terms of spectral efficiency. Nevertheless, the simplicity of implementation of the correlator makes it an attractive choice for system designers, and it is the method of choice for most existing systems.


2.5.4 Optimal Detection

We shall now develop the optimal detector for d, given the observation r. Whenever we talk about optimality, we need to clearly specify what function is being optimized. Of particular interest in this detection problem is the probability of error, and there are two main criteria of interest, corresponding respectively to joint or individual optimality.

The jointly optimal detector minimizes the probability of a sequence error,

Pr(d̂[i] ≠ d[i]),   (2.20)

whereas the individually optimal detector minimizes the individual error probabilities,

Pr(d̂k[i] ≠ dk[i]).

Let us now develop the jointly optimal detector [121, 142] (the individually optimal detector will be developed in the next section). According to the discussion of minimum error probability detection in Section A.5.1, the jointly optimal detector outputs the data vector with maximum a-posteriori probability. Using Bayes' rule, the jointly optimal detector¹ is given by

d̂MAP = arg max_{d ∈ D^K} Pr(d) f(r | d),   (2.21)

where Pr(d) is the prior probability distribution on the transmitted data and (in the case of complex noise)

f(r | d) = (πN0)^−L exp(−(1/N0) ‖r − SAd‖²₂)

is the conditional density of the channel output. Alternatively, this detector could have been defined as a function of the matched filter output, since y is a sufficient statistic. The development of this detector, as well as its extension to the asynchronous channel, will be given in Chapter 4.

Note that the estimate of the entire vector d is based on the entire vector r, or equivalently on the output of every matched filter [142], as depicted in Figure 2.9.

In the case that Pr(d) is the uniform distribution, as might be the case for uncoded data, the joint MAP detector becomes the joint maximum likelihood (ML) detector. For Gaussian noise, the joint ML detector minimizes Euclidean distance,

d̂ML = arg min_{d ∈ D^K} ‖r − SAd‖²₂.   (2.22)

The general problem of MAP or ML estimation is NP-hard, and brute force evaluation is exponentially complex in K [149]. The complexity is due

¹ Also assuming identical modulation alphabets Dk = D for each user.


Fig. 2.9. Optimal joint detection.

to the discrete nature of the support, which has |D|^K elements. Brute force evaluation involves calculation of f(y | d) for each element of the support.
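A brute-force sketch of the joint ML detector (2.22), enumerating all |D|^K hypotheses; the model parameters are arbitrary illustrations and a real-valued antipodal alphabet is used for brevity:

```python
import itertools
import numpy as np

rng = np.random.default_rng(5)
K, L, N0 = 4, 8, 0.2        # arbitrary illustration values

S = rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)
A = np.diag(np.sqrt([1.0, 1.0, 2.0, 0.5]))   # amplitudes sqrt(P_k)
d = rng.choice([-1.0, 1.0], size=K)
r = S @ A @ d + np.sqrt(N0 / 2) * rng.standard_normal(L)

# Exhaustive minimization of ||r - SAd||^2 over all |D|^K = 16 hypotheses,
# following eq. (2.22); the cost of this search grows as 2^K.
best = min(itertools.product([-1.0, 1.0], repeat=K),
           key=lambda cand: np.sum((r - S @ A @ np.array(cand)) ** 2))
d_ml = np.array(best)
print(d_ml)
```

Doubling K squares the number of hypotheses, which is the exponential blow-up the text refers to.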

The ML estimator may be implemented using an exponentially complex trellis search. This prohibitive level of complexity motivates many sub-optimal reduced-complexity detection methods, which are the subject of most of the remainder of this book. In particular, Chapter 5 describes reduced-complexity methods, mostly based on reduced tree searches for near-optimal detection, which approximate the action of either the jointly optimal or the individually optimal detectors.

2.5.5 Individually Optimal Detection

The individually optimal detector minimizes the symbol error probability,

Pr(d̂k[i] ≠ dk[i]),

rather than the sequence error probability (2.20). This is accomplished by outputting the symbol which maximizes the marginal a-posteriori probability, Pr(dk[i] | y). For the symbol synchronous system, this corresponds to

d̂k[i] = arg max_{d ∈ D} Σ_{d : dk = d} Pr(d) f(y[i] | d).   (2.23)

There are |D|^{K−1} terms in the above summation. Brute-force evaluation of the individually optimal decision is exponentially complex in the number of users. Note also that the individually optimal decision is still a function of the entire channel output.
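The marginalization in (2.23) can likewise be sketched by brute force (uniform prior, real-valued model, arbitrary parameters); note that every hypothesis vector, not just user k's own symbol, enters each marginal:

```python
import itertools
import numpy as np

rng = np.random.default_rng(6)
K, L, N0 = 3, 8, 0.5        # arbitrary illustration values

S = rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)
d = rng.choice([-1.0, 1.0], size=K)
r = S @ d + np.sqrt(N0 / 2) * rng.standard_normal(L)

def app(k, value):
    # Unnormalized marginal APP of d_k = value under a uniform prior:
    # sum exp(-||r - Sd||^2 / N0) over all hypotheses with entry k fixed.
    return sum(np.exp(-np.sum((r - S @ np.array(cand)) ** 2) / N0)
               for cand in itertools.product([-1.0, 1.0], repeat=K)
               if cand[k] == value)

# Individually optimal (marginal MAP) decision for each user.
d_io = np.array([1.0 if app(k, 1.0) > app(k, -1.0) else -1.0
                 for k in range(K)])
print(d_io)
```

Each marginal sums 2^{K−1} likelihood terms, so this detector is just as exponentially complex as the joint one when evaluated by enumeration.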


2.6 Access Strategies

There are many different existing multiple-access strategies. Each strategy can be described by a specific choice of modulation waveforms, usually intended to have properties particularly suitable for the application of the single-user matched filter receiver. In this section we describe how to formulate several well-known multiple-access schemes within the generic linear multiple-access framework (2.2). We will focus on symbol synchronous discrete-time versions of each channel, and hence the various access strategies will be parameterized by their choices of the modulation matrix S.

2.6.1 Time and Frequency Division Multiple-Access

Perhaps the most obvious way to share a given bandwidth is to use time-division or frequency-division multiple-access. The basic idea is to allocate non-overlapping subsets of the entire time/bandwidth resource to each user.

Time division multiple-access (TDMA) allows each user to transmit using the entire available bandwidth, restricted however to non-overlapping time intervals. Assuming, without loss of generality, that each user transmits one complex baseband symbol per allocated time interval, the resulting modulation matrix for the K-user symbol synchronous TDMA channel is

STDMA = IK .

This is the most obvious example of an orthogonal modulation matrix, i.e.

RTDMA = IK .

With reference to (2.19), this means that there is no multiple-access interference. In this case, the single-user matched filter receiver is indeed optimal.

Frequency division multiple-access (FDMA) allows each user to transmit continuously in non-overlapping frequency bands. Assuming again, without loss of generality, that each user transmits a single symbol per allocated frequency band, the resulting modulation matrix for the symbol synchronous channel has elements

(S_FDMA)_jk = (1/√K) exp(2π√−1 jk/K).

This is in fact the Fourier transform matrix F, and once again

R_FDMA = I_K,

resulting in optimality of the correlation receiver.

Time- and frequency-division multiple-access are duals of each other via the Fourier transform; indeed (somewhat trivially)

Page 48: Coordinated Multiuser Communications

32 2 Linear Multiple-Access

SFDMA = FSTDMA

STDMA = F∗SFDMA.

The TDMA and FDMA methods just described are two examples of orthog-onal multiple-access, in which the modulation matrix is orthogonal, S∗S = I.

Dividing time or frequency between the users makes a certain amount of “intuitive” sense, especially for engineers who are familiar with time-frequency representations of signals. More generally, however, what is going on is that there are 2WT signaling dimensions to be shared among K users, and there are many more ways to do this.
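Both orthogonal schemes are easy to check numerically. This sketch builds S_TDMA = I_K and the normalized Fourier matrix S_FDMA and verifies R = S∗S = I_K (the value of K is arbitrary):

```python
import numpy as np

K = 8                       # arbitrary number of users

S_tdma = np.eye(K)          # TDMA: identity modulation matrix

# FDMA: normalized Fourier matrix, (S)_jk = exp(2*pi*sqrt(-1)*j*k/K)/sqrt(K).
jj, kk = np.meshgrid(np.arange(K), np.arange(K), indexing="ij")
S_fdma = np.exp(2j * np.pi * jj * kk / K) / np.sqrt(K)

# Both matrices are unitary, so R = S*S = I: no multiple-access
# interference, and the single-user matched filter is optimal.
R_tdma = S_tdma.conj().T @ S_tdma
R_fdma = S_fdma.conj().T @ S_fdma
print(np.allclose(R_tdma, np.eye(K)), np.allclose(R_fdma, np.eye(K)))
# True True
```

Replacing either matrix with any other unitary K × K matrix leaves R = I unchanged, which is the change-of-basis equivalence noted in the text.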

2.6.2 Direct-Sequence Code Division Multiple Access

Direct-sequence code-division multiple-access (DS-CDMA) uses modulating waveforms sk(t) that are built from component waveforms ϕ(t) called chip waveforms. These chip waveforms are of short duration compared to the length of the modulating waveforms themselves. We now specialize the linear multiple-access model to the case of chip-synchronous direct-sequence modulation. Let ϕ(t) be a chip waveform satisfying

∫ ϕ(t) ϕ∗(t − jTc) dt = 0,   0 ≠ j ∈ Z   (2.24)

∫ |ϕ(t)|² dt = 1,   (2.25)

i.e. the chip waveform has unit energy and is orthogonal to any translation of itself by integer multiples of Tc, the chip period. We assume an integer number of chips L per symbol period, T = LTc.

For theoretical purposes, we may consider rectangular chip waveforms with support [0, Tc), which however results in an infinite bandwidth. In practice, band-limited pulses, such as pulses with a raised cosine spectrum, may be used.

The modulating waveform for each user is composed of copies of the chip waveform,

sk(t) = Σ_{j=0}^{nL} sk[j] ϕ(t − jTc),   (2.26)

where the integer j denotes the chip index and the chip-rate sequences sk[j] are the real or complex chip amplitudes. Note that we consider the case in which each user has the same chip waveform, and therefore these waveforms are not indexed by the user number k. The resulting modulating waveforms are made different through the multiplication of the chip waveforms by the chip amplitudes, which can differ from user to user.

Code-division multiple-access was motivated by spread-spectrum communications, and typically each user's modulation waveform sk(t) occupies a

Page 49: Coordinated Multiuser Communications

2.6 Access Strategies 33

bandwidth considerably larger than that required by Shannon theory for transmission of the data alone [83]. This is the case when Tc ≪ T, or equivalently, L ≫ 1. These sequences sk[j] are sometimes referred to as spreading sequences, signature sequences or spreading codes (we will however reserve the word code to mean forward error control codes). The elements of the spreading sequence sk[j] are usually chosen from a finite alphabet, e.g. {−1, +1}/√L.

The concept of spectrum spreading can be viewed more abstractly as the scenario in which each user occupies only a small fraction of the total available signaling dimensions. In the context of multiple-access communications, however, the entire signal space may be occupied, even though each user spans only a small subspace. These concepts are better understood from an information-theoretic point of view, and in Chapter 3 we will discuss this issue in greater depth.

Multiplication of the symbol-rate data dk by the spreading sequence sk is called direct-sequence spreading, and the resulting form of multiple-access is known as direct-sequence code-division multiple-access.

Example 2.5. Figure 2.10 shows an example modulating waveform built from chips, corresponding to a Nyquist chip waveform with roll-off 0.3 and the chip amplitude sequence sk[1], . . . , sk[10] = 1, 1, 1, 1, −1, −1, 1, 1, −1, 1.

Fig. 2.10. Modulating waveform built from chip waveforms.

With the above definitions, each user's undelayed noise-free contribution is given by

xk(t) = dk[i] √Pk Σ_{j=0}^{nL} sk[j] ϕ(t − jTc),

where i = ⌈t/T⌉.

Under the convenient (albeit unrealistic) assumption of chip synchronism (but not symbol synchronism) across the users, we assume that the integers τk measure the delay in whole numbers of chip periods. Although this assumption would rarely hold true in practice, it is a model commonly used in the


literature, and suffices for the purposes of the development of the material in this book.

The output of the chip-synchronous direct-sequence modulated multiple-access channel is therefore given by

r(t) = Σ_{k=1}^{K} xk(t − τkTc) + z(t)

     = Σ_{k=1}^{K} dk[⌈(t − τkTc)/T⌉] √Pk Σ_{j=1}^{nL} sk[j] ϕ(t − (j + τk)Tc) + z(t).

Previously, we obtained a discrete-time model (2.11) from the continuous-time model (2.2) using the sampling functions (2.3) as a complete orthonormal basis. Of course, any orthonormal basis will do, and in the case of chip synchronous DS-CDMA, it is convenient to use the set of delayed chip functions,

φj(t) = ϕ(t − jTc),

which according to (2.24) and (2.25) are orthonormal. Furthermore, since we have by design constructed the modulation waveforms as linear combinations (2.26) of the φj(t), and the users are chip synchronous, these functions form a complete orthonormal basis. We can therefore obtain a discrete-time model for the chip synchronous channel by applying a chip matched filter (assuming that the receiver knows the chip boundaries and therefore the optimal sampling instants). The resulting sequence r[j] is a sufficient statistic for d. This set-up is shown in Figure 2.11.

Fig. 2.11. Chip match-filtered model.


From (2.26) we can see that the modulation waveform samples resulting from the chip matched filter are precisely the chip amplitudes sk[j]. Application of the chip matched filter and sampling at a rate of 1/Tc samples per second results in the following discrete-time chip synchronous model,

r[j] = Σ_{k=1}^{K} dk[⌈(j − τk)/L⌉] √Pk sk[j − τk] + z[j],   (2.27)

where z[j] is a sampled Gaussian noise sequence. Note that this model only applies in the case of no inter-chip interference, which in addition to chip synchronism requires that chip waveforms offset by an integer number of chip periods, ϕ(t) and ϕ(t + jTc), are orthogonal. This is true by our assumption (2.24). Note that this discrete-time CDMA channel model (2.27) is identical in form to the generic discrete-time asynchronous model (2.7) presented earlier. The only difference is in the definition of the modulation waveform samples.

This discrete-time model can be re-written in the familiar matrix-vector form (2.11)

r = SAd + z,

where the definitions of A, d and z are identical to those given in Section 2.3. The modulation matrix has exactly the same structure as (2.8), except that the entries are the chip amplitudes rather than the sampled waveforms (this is a somewhat pedantic difference, since here we are in fact using the chip matched filter to achieve the same goal as sampling).
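As a concrete illustration, the matrix-vector model r = SAd + z and the bank of single-user matched filters can be simulated directly. The parameters below (K = 3 users, spreading factor L = 8, binary antipodal symbols, the noise level) are illustrative choices, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
K, L = 3, 8                                # illustrative: 3 users, 8 chips/symbol

# Unit-norm spreading sequences form the columns of the L x K modulation matrix S
S = rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)
A = np.diag(np.sqrt([1.0, 2.0, 4.0]))      # received amplitudes sqrt(P_k)
d = rng.choice([-1.0, 1.0], size=K)        # one data symbol per user
z = 0.1 * rng.standard_normal(L)           # chip-rate noise samples

r = S @ A @ d + z                          # received chip vector: r = SAd + z
y = S.T @ r                                # bank of single-user matched filters
R = S.T @ S                                # correlation matrix, so y = RAd + S^T z
```

When S happens to be orthogonal (possible only for K ≤ L), R = I and each matched-filter output yk depends only on dk plus noise.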

If the delays τk are known to the transmitter, and do not change over time, the modulation matrix S can be chosen (via selection of the sk[j]) to have desired properties. For example, it may be possible to choose an orthogonal S such that R = I, resulting in zero multiple-access interference and optimality of the single-user matched filter. We have already encountered two possible orthogonal matrices, I and F, corresponding to time- and frequency-division multiple-access. There is nothing particularly special about these choices, and it is clear that any orthogonal (or unitary) matrix can be used with the same result. Rather than assigning specific time or frequency dimensions to individual users, an arbitrary orthogonal modulation matrix assigns orthogonal signaling dimensions, from a total of 2WT dimensions. All of these orthogonal multiple-access strategies are equivalent via change of basis.

One problem with orthogonal modulation however is maintaining synchronism. A modulation matrix S designed for one set of user delays may no longer have the desired properties if the delays change. In particular, orthogonality may be hard to maintain over a wide range of possible delays. There are also many cases in which it is not reasonable to expect that the transmitter either knows the user delays, or that it has the option to adapt the modulation sequences (it may be desired that the modulation sequences are fixed for all time).


We shall distinguish between two broad classes of spreading sequences: periodic sequences, where the same sequence is used to modulate each data symbol for a given user (but different sequences for each user); and random sequences, in which the spreading sequence corresponding to each symbol is randomly selected.

Definition 2.1 (Periodic Spreading). The spreading sequence sk[j] is periodic if the same sequence is used for each symbol period:

sk[j] = sk[j + L].

Periodic spreading may be used when symbol synchronism can be enforced, and the sequences may be designed to have desired properties, such as low cross-correlation.

Definition 2.2 (Random Spreading). The spreading sequence is random if each element sk[j] is selected i.i.d. from a fixed distribution with zero mean, unit variance, and finite higher moments.2

Note that this definition includes uniform selection of chips from an appropriately normalized binary (or any other finite cardinality) alphabet. It also includes such choices as Gaussian distributed chips.

Random spreading may be approximated by using long pseudo-random sequences, such as m-sequences. The use of the term “random spreading” does not mean that the sequences are unknown to the receiver. In fact we usually assume that the receiver also knows the sequences. This is possible when the sequences are only really pseudo-random (i.e. the receiver can generate the same pseudo-random sequence as the transmitter).

Random spreading is a useful model for systems in which the period of the spreading is much greater than the symbol duration. As we shall also see, random spreading is a useful theoretical tool for system analysis.
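A short numerical sketch of why random spreading is benign: for unit-norm random sequences per Definition 2.2, the mean-square cross-correlation between two independent sequences is 1/L, so interference between users shrinks as the spreading factor grows. The sequence lengths and trial count below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_sequence(L):
    """One spreading sequence of i.i.d. +/-1 chips, normalized to unit norm."""
    return rng.choice([-1.0, 1.0], size=L) / np.sqrt(L)

def rms_crosscorr(L, trials=500):
    """Root-mean-square of s1.s2 over independent random sequence pairs."""
    vals = [random_sequence(L) @ random_sequence(L) for _ in range(trials)]
    return float(np.sqrt(np.mean(np.square(vals))))

# E[(s1.s2)^2] = 1/L, so the rms cross-correlation decays like 1/sqrt(L)
rms16, rms1024 = rms_crosscorr(16), rms_crosscorr(1024)
```

For L = 16 the rms cross-correlation is near 1/4; for L = 1024 it is near 1/32.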

2.6.3 Narrow Band Multiple-Access

Rather than assigning wide-band signaling waveforms, we can consider a system in which each user transmits using the same modulation waveform s(t). In the symbol synchronous case the signal model is

r(t) = ∑_{k=1}^{K} d_k[⌈t/T⌉] √P_k s(t) + z(t),   (2.28)

2 This mild restriction on higher moments is a technical requirement for large-system capacity results.


and the corresponding discrete-time matrix model is

r[i] = S[i]A[i]d[i] + z[i]

where the modulation matrix S[i] has identical columns. In that case, the correlation matrix R is the all-ones matrix,

y[i] = 1A[i]d[i] + S[i]∗z[i],

and the output of each user’s matched filter is identical and is given by

y[i] = ∑_{k=1}^{K} √P_k d_k[i] + z[i].   (2.29)

Note that we could have obtained (2.29) directly from (2.28) via matchedfiltering for s(t).
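This collapse of the narrow band channel into a single superposition can be checked numerically; with identical columns in S, the correlation matrix is all ones and every matched-filter output reduces to the same sum. The particular powers, symbols, and spreading factor below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(2)
K, L = 4, 16
s = rng.choice([-1.0, 1.0], size=L) / np.sqrt(L)   # the single shared waveform
S = np.tile(s[:, None], (1, K))                    # modulation matrix: identical columns
P = np.array([1.0, 1.0, 4.0, 9.0])                 # illustrative symbol energies
d = rng.choice([-1.0, 1.0], size=K)
z = 0.1 * rng.standard_normal(L)

r = S @ (np.sqrt(P) * d) + z
y = s @ r            # one matched filter suffices: y = sum_k sqrt(P_k) d_k + noise
R = S.T @ S          # the all-ones correlation matrix of (2.29)'s derivation
```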

2.6.4 Multiple Antenna Channels

In certain scenarios, such as in the presence of multipath propagation, it may be advantageous for the receiver and each transmitter to use multiple antennas. Let us develop an idealized channel model for a K user system in which each user has M transmit antennas, and the receiver has N antennas. Extension to different numbers of antennas for each transmitter is straightforward.

Figure 2.12 shows a two-user example in which each user has two transmit antennas, and there are three receive antennas. The figure is simplified to concentrate on the multiple antenna aspect of the system.


Fig. 2.12. Multiple transmit and receive antennas.


To allow for the most general transmission strategies, let each user employ a different modulation waveform for each transmit antenna and transmit different data symbols over each antenna. To this end, let s_k^μ(t) and d_k^μ[i] be the respective modulation waveforms and data symbols for user k = 1, 2, . . . , K and antenna μ = 1, 2, . . . , M. Each user may also transmit using different symbol energies from each transmit antenna. Let P_k^μ denote the symbol energy for antenna μ of user k.

We can therefore write the undelayed transmitted signal for antenna μ of user k as

x_k^μ(t) = d_k^μ[⌈t/T⌉] √P_k^μ s_k^μ(t).

Let h_k^{νμ} be the channel gain between transmit antenna μ of user k and receive antenna ν = 1, 2, . . . , N. Then the signal received at antenna ν is

r_ν(t) = ∑_{k=1}^{K} ∑_{μ=1}^{M} h_k^{νμ}(t) x_k^μ(t − τ_k) + z_ν(t)
       = ∑_{k=1}^{K} ∑_{μ=1}^{M} h_k^{νμ}(t) d_k^μ[⌈(t − τ_k)/T⌉] √P_k^μ s_k^μ(t − τ_k) + z_ν(t).

The received signal at each antenna shares the same formulation as a KM user system in which each user has a single transmit antenna (consider the mapping (k, μ) ↦ K(μ − 1) + k).

Sampling the output of each receive antenna leads to a discrete-time model which may be represented in matrix form. Under the assumption of symbol synchronism we can write

V[i] = S[i]A[i]D[i]H[i] + Z[i]   (2.30)

where the various matrices in (2.30) are defined as follows. The L×N matrix V[i] contains the received samples for symbol period i. Element (V)_{jν} of this matrix is sample j from antenna ν. Thus each column is the output of one antenna.

The L×KM matrix S[i] is defined in a similar way to (2.12). Its columns are the sampled modulation waveforms for each transmit antenna of each user. More specifically, column l is the sampled waveform for transmit antenna μ of user k, according to

k = ⌈l/M⌉   (2.31)
μ = ((l − 1) mod M) + 1,   (2.32)

i.e. we get each transmit antenna of user 1, followed by each antenna of user 2, and so on.

The KM×KM matrices A[i] and D[i] are both diagonal. The diagonal elements of A are the transmit amplitudes for each transmit antenna of each user. The diagonal elements of D are the transmitted symbols for each antenna of each user. The ordering is the same as that used for S, i.e. subject to (2.31) and (2.32),

(D[i])_{ll} = d_k^μ[i]
(A[i])_{ll} = √P_k^μ[i].

Subject to the assumption that the channel gains h_k^{νμ} are constant for each symbol interval, but that they may change from symbol interval to symbol interval, the KM×N matrix H[i] contains the channel gains, ordered according to (2.31) and (2.32),

(H[i])_{lν} = h_k^{νμ}[i].

With the introduction of appropriate statistical models, the channel gains may model effects such as fast or slow flat fading, or they may be used to model phased arrays.

Note that we may recover the single antenna model of (2.12) via M = N = 1 and H[i] = 1, a K×1 all-ones vector. With these definitions, (2.30) is identical to (2.12), since d[i] = D[i]1.
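The dimensions in (2.30) are easy to get wrong, so a quick sketch that builds each matrix and multiplies them out can serve as a sanity check. All numerical values here (numbers of users, antennas, samples, the fading model) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
K, M, N, L = 2, 2, 3, 8   # users, tx antennas per user, rx antennas, samples/symbol

# Columns of S ordered per (2.31)-(2.32): all antennas of user 1, then user 2, ...
S = rng.standard_normal((L, K * M))           # sampled modulation waveforms
A = np.diag(rng.uniform(0.5, 2.0, K * M))     # per-antenna amplitudes sqrt(P_k^mu)
D = np.diag(rng.choice([-1.0, 1.0], K * M))   # per-antenna data symbols
H = rng.standard_normal((K * M, N))           # flat-fading gains (H)_{l,nu}
Z = 0.1 * rng.standard_normal((L, N))

V = S @ A @ D @ H + Z                         # eq. (2.30): one column per rx antenna
```

With M = N = 1 and H the all-ones K×1 vector, the single column of V reduces to the single-antenna model r = SAd + z.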

2.6.5 Cellular Networks

Cellular networks spatially re-use signaling dimensions (time and frequency) in order to increase the overall system capacity. The idea is that, given enough spatial separation, the signal attenuation due to radio propagation will be sufficient to limit the effects of multiple-access interference from other cells operating in the same signal space.

An idealized model of a narrow band cellular multiple-access channel was developed in [170]. In this model, a user contributes to the received signal of a base station only if it is in that base station’s cell, or if it is in an immediately adjacent cell. This is shown schematically in Figure 2.13. This figure shows ten hexagonal cells, each with a base station represented by a circle. Two mobile terminals are shown. Each transmits an intended signal to its own base station (solid lines) and interference to adjacent base stations (dashed lines).

Consider the up-link of a cellular network in which there are L cells, each with a single base station. A total of K users populate the network. Assume that each user transmits using identical modulation waveforms and that the system is symbol synchronous. Then (according to the narrow-band model developed in Section 2.6.3), the discrete-time signal model is

r[i] = S[i]Ad[i] + z[i].

In this model, rj[i], j = 1, 2, . . . , L, is the signal received at base station j. According to Wyner’s idealized model, the matrix S defines the connection gains between user k and base station j according to



Fig. 2.13. Simplified cellular system.

s_jk = { 1,  user k is in cell j;
         α,  user k is in a cell adjacent to cell j;
         0,  otherwise.

Thus each user is received in its own cell with symbol energy P and in adjacent cells with symbol energy Pα².

In this simplified model, the modulation matrix acts as a connection matrix. This is a crude model for perfect power control within each cell, and a fixed path loss between cells. In a more general model, the gains s_jk would be random variables and would include the effects of path loss, shadowing and fading.

Example 2.6. The cellular arrangement shown in Figure 2.13 is modeled by

S = ( α 0
      α 0
      α α
      1 α
      0 α
      α 0
      α 1
      α α
      0 α
      0 α ) .
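Connection matrices of this kind are mechanical to build from a cell-adjacency description. The sketch below uses a made-up three-cell linear topology (not the layout of Figure 2.13) purely to illustrate the construction; `wyner_matrix` is a hypothetical helper name:

```python
import numpy as np

def wyner_matrix(num_cells, user_cell, adjacency, alpha):
    """Build s_jk: 1 in the user's own cell, alpha in adjacent cells, 0 elsewhere."""
    S = np.zeros((num_cells, len(user_cell)))
    for k, c in enumerate(user_cell):
        S[c, k] = 1.0
        for j in adjacency[c]:
            S[j, k] = alpha
    return S

# Toy 3-cell linear network: cell 1 sits between cells 0 and 2 (illustrative only)
adjacency = {0: [1], 1: [0, 2], 2: [1]}
S = wyner_matrix(3, user_cell=[0, 2], adjacency=adjacency, alpha=0.5)
# Each user arrives with gain 1 at its own base station and 0.5 next door,
# i.e. symbol energy P in-cell and P * alpha**2 in adjacent cells.
```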


2.6.6 Satellite Spot-Beam Channels

Satellite spot-beam channels [89, 90] share a similar formulation to cellular networks. With reference to Figure 2.14, terrestrial user terminals communicate with a ground station via a satellite which employs a multi-beam antenna (usually implemented using a phased array). The satellite spatially re-uses time and frequency resources by using a spot beam antenna. Each beam of the spot beam antenna illuminates a geographical region, and the antenna is designed to provide isolation between the beams. It is however difficult to achieve perfect isolation between beams, and multiple-access interference results. In the cellular model, the average gain between each user and base station pair is determined largely by path loss considerations. In the spot-beam channel, the gain between each user and beam is determined by the antenna roll-off.

Under the assumption of symbol synchronism and identical modulating waveforms for each user, the satellite spot-beam channel is the same as the cellular network described in Section 2.6.5. The modulation matrix now contains the gain from each mobile earth terminal to each antenna beam.


Fig. 2.14. Satellite spot beam up-link.


Example 2.7. The seven-beam example shown in Figure 2.14 is modeled by

S = ( 1 α α α α α α
      α 1 α 0 0 0 α
      α α 1 α 0 0 0
      α 0 α 1 α 0 0
      α 0 0 α 1 α 0
      α 0 0 0 α 1 α
      α α 0 0 0 α 1 ) .

2.7 Sequence Design

Most modulation sequence design is motivated by the use of sub-optimal detection strategies, in particular the correlation receiver described in Section 2.5.3. Conditioned on this choice of sub-optimal detection strategy, it is natural to try to optimize the performance of the receiver by carefully choosing the modulation sequences. To this end, one could try to choose sequences that minimize the average error probability, or that minimize the total cross-correlation.

2.7.1 Orthogonal and Unitary Sequences

If L ≥ K, it is always possible to find modulation matrices S that result in zero multiple-access interference. In the case of real matrices, this means R = SᵗS = I and S is called orthogonal. In the case of complex matrices, R = S∗S = I, and S is called unitary. Obviously orthogonal matrices are also unitary if we consider the real elements to be complex numbers with zero imaginary part.

Unitary matrices have the property that each column is orthogonal to every other column, and these columns are the modulating vectors for each user. Starting from any given vector, it is possible to find K − 1 other vectors while maintaining orthogonality, via a Gram-Schmidt process.

In the case of a unitary S, the matched-filter output is

y = S∗r = S∗Sd + S∗z = d + z̃,

where z̃ = S∗z. It is a property of the Gaussian density that it is invariant to unitary transformations (i.e. it is isotropic). This means that, unlike under most transformations, z̃ has the same distribution as z. Therefore each user k may be detected based only on its own matched filter output.

Some examples of unitary matrices that we have already encountered include I, the identity matrix, and F, the Fourier transform matrix. Use of these matrices results respectively in time- and frequency-division multiple-access.

Page 59: Coordinated Multiuser Communications

2.7 Sequence Design 43

Although it is always possible to construct unitary matrices for L ≥ K, it is not always possible to ensure that the elements of S are members of a given modulation alphabet. It may be required, for instance, that the elements are binary, sk[j] ∈ {−1, +1}.

2.7.2 Hadamard Sequences

Let us now consider the problem of designing modulation sequences to optimize the performance of the correlation receiver. The total variance σ_k² of the multiple-access interference as seen by user k is the k-th row (or column) sum of the squared entries of R, minus the diagonal term:

σ_k² = ∑_{j=1}^{K} R_{jk}² − 1,

and a reasonable goal for sequence design is to find binary modulation sequences S that minimize the total squared cross-correlation

σ_tot² = ∑_{k=1}^{K} σ_k²,

subject to a fairness condition σ_k² = σ_tot²/K. Now if K ≤ L, it is obvious that we can achieve σ_tot² = 0 by using orthogonal modulation sequences, since in that case R = I. The correlation receiver is in fact optimal for orthogonal sequences. Binary matrices S with the property S∗S = I are called Hadamard matrices, and can only exist for L = 1, 2 or a multiple of 4. (It is not known whether they exist for every multiple of 4.)

For arbitrary given K and L (i.e. K not necessarily smaller than L), how close to orthogonal can we get? The following theorem gives a bound on the total squared cross-correlation [159].

Theorem 2.1 (Welch’s Bound). Let s_k ∈ C^L and ‖s_k‖² = 1, k = 1, 2, . . . , K. Then

∑_{i=1}^{K} ∑_{j=1}^{K} R_{ij}² ≥ K²/L.

Equality is achieved if and only if S has orthogonal rows. Equality implies

∑_{j=1}^{K} R_{ij}² = K/L.


It has been shown in [85] that sequences that meet the Welch bound with equality are optimal if correlation detection is being used. We shall also see in Section 3.4.2 that such sequences are optimal from an information theoretic point of view.

Example 2.8 (Welch bound equality sequences). The following non-orthogonal sequence set achieves Welch’s bound (where for clarity, + denotes +1 and − denotes −1):

S = ( + + + + + + + +
      + + + + − − − −
      + + − − + + − −
      + − + − + − + − ) .

The requirement for equality in Welch’s bound is the orthogonality of the rows of S. Recall that the modulation sequences themselves form the columns of S. In the case K ≤ L, we can have orthogonality of both the rows and the columns. If however K > L, the sequences themselves cannot be orthogonal, yet the correlator performance is optimized by having orthogonal rows.
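Welch bound equality is easy to verify numerically for the sequence set of Example 2.8 (K = 8 sequences of length L = 4, normalized to unit norm, so the bound is K²/L = 16):

```python
import numpy as np

# The columns are the K = 8 length-4 sequences of Example 2.8, scaled to unit norm
S = np.array([[+1, +1, +1, +1, +1, +1, +1, +1],
              [+1, +1, +1, +1, -1, -1, -1, -1],
              [+1, +1, -1, -1, +1, +1, -1, -1],
              [+1, -1, +1, -1, +1, -1, +1, -1]], dtype=float) / 2.0

L, K = S.shape
R = S.T @ S                       # K x K correlation matrix
total = float(np.sum(R ** 2))     # sum_{i,j} R_ij^2
welch = K ** 2 / L                # Welch lower bound: 64 / 4 = 16

# Equality holds exactly when the rows of S are orthogonal (S S^T = (K/L) I)
rows_orthogonal = np.allclose(S @ S.T, (K / L) * np.eye(L))
```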


3 Multiuser Information Theory

3.1 Introduction

In this chapter we consider some of the Shannon theoretic results and coding strategies for the multiple-access channel. The underlying assumption is that signals are encoded, and that no restrictions will be placed on the capability of the receiver processing. Information theory provides fundamental limits on data compression, reliable transmission and encryption for single-user or point-to-point channels. In the context of multiple-user communications, there are many interesting and relevant questions that we could ask:

• What are the fundamental limits for multiple-user communications?
• What guidance can information theory give for system design?
• What is the impact of system features such as feedback, or synchronism?
• What modulation and/or coding techniques should we use?
• How should we deal with multiple-access interference from an engineering point of view?
• What is the cost of using sub-optimal reception techniques?

We will approach some of these questions from an information theoretic perspective, concentrating on the multiple-access channel.

We begin in Section 3.2 by formalizing our definition of a multiple-access channel. From this basis, we introduce in Section 3.2.2 the capacity region, which is the multiple-user analog of the channel capacity.

In Section 3.3 we consider some simple binary-input channels, namely the binary adder channel and the binary multiplier channel. These serve to illustrate some of the main ideas regarding the computation of the capacity region for discrete channels.

Multiple-access channels with additive Gaussian noise are developed in Section 3.4. Of particular interest is the direct-sequence code-division multiple-access channel, which is the subject of Section 3.4.2. We discuss the relationship between the selection of signature sequences and the channel capacity,


and give an overview of large systems analysis, applicable to random signature sequences. This section also serves as an introduction to Chapters 4 – 6, which discuss the design of multiple-user receivers for DS-CDMA.

In Sections 3.5.1 and 3.5.2, we discuss the design of uniquely decodeable block and trellis codes, which are error-control codes specifically designed for the multiple-access channel. Another coding strategy for the multiple-access channel is superposition coding, discussed in Section 3.6. This approach allows the multiple-access channel to be viewed as a series of single-user channels, with no theoretical loss in channel capacity.

Various modifications of the channel assumptions have different effects upon the size and shape of the capacity region. In Section 3.7, we discuss the effect of feedback, and in Section 3.8 we discuss asynchronous transmission.

3.2 The Multiple-Access Channel

3.2.1 Probabilistic Channel Model

In Chapter 2, we developed signal-based mathematical models for the linear multiple-access channel. In order to obtain an information theoretic understanding of the multiple-access channel, we must first define an appropriate probabilistic model (the underlying probabilistic models were implicit in Chapter 2).

Let us begin by introducing some concise notation to describe many-user systems. For a K user system, let

K = {1, 2, . . . , K}

be the set of user index numbers. For an arbitrary subset of users, S ⊆ K, the subscript S denotes objects indexed by the elements of S. For example,

XS = {Xk, k ∈ S}

are those users indexed by the elements of S. Likewise, letting Rk be the information rate of user k,

RS = {Rk, k ∈ S} .

For objects with a suitably defined addition function (rates, powers), let the functional notation

R(S) = ∑_{k∈S} R_k

be the sum over those users indexed by S. We are now ready to define the discrete memoryless multiple-access channel.


Definition 3.1 (Discrete Memoryless Multiple-Access Channel). A discrete memoryless multiple-access channel (DMMAC) consists of K finite input alphabets, X1, X2, . . . , XK (denoted X_K in the shorthand notation introduced above), a set of transition probabilities,

p(y | x_K) = p(y | x1, x2, . . . , xK),

and an output alphabet Y. We shall refer to such a channel by the triple (X_K; p(y | x_K); Y), which emphasizes the entities that completely define the channel.

Figure 3.1 shows a schematic representation of a two-user multiple-access channel. Transmission over the DMMAC takes place as follows. Time is discrete, requiring symbol transmissions to be aligned to common boundaries. At each symbol interval, every user k = 1, 2, . . . , K transmits a symbol xk, drawn from Xk according to a source probability distribution pk(xk). The received symbol, y ∈ Y, is observed according to the transition probabilities p(y | x_K).


Fig. 3.1. Two-user multiple-access channel.

Although we have only formally defined a discrete memoryless multiple-access channel, it is straightforward to extend this definition to some other types of channel. For example, we can model the discrete-input continuous-output MAC by allowing the received symbols to be real-valued, Y ⊆ R. Accordingly, the conditional transition probabilities must now be densities, rather than mass functions. Similarly, continuous-input continuous-output channels (usually subject to input power constraints) allow the sources to take values from the reals, X_K ⊆ R^K, according to some density function pk(xk), yielding now a joint probability density p(y | x_K).

Although we have defined the individual users’ (marginal) source probabilities, we have not mentioned any statistical dependence between users. The form of the joint source distribution p(x_K) depends on the degree of cooperation allowed between the users.

If the users cannot cooperate, p(x_K) is restricted to be a product distribution,


p(x_K) = ∏_{k=1}^{K} p_k(x_k).

Such absence of cooperation between users can be caused, for example, by physical separation of the users, who have no direct communication links with each other. This is the case in most mobile radio networks.

Alternatively, if full cooperation between users is allowed (which implies the existence of a communication link between the users), there is no restriction on the joint distribution p(x_K). Such full cooperation of sources allows the complete channel resource to be used as if by a single “super-source” with alphabet X1 × X2 × · · · × XK, and can therefore be used as a benchmark for systems with independent users.

Note that our definition of a DMMAC did not include the source distributions. The system designer is typically free to choose the most advantageous set of source distributions (for example, to maximize the rate of transmission).

A discrete memoryless multiple-access channel may be schematically represented in a fashion similar to single-user channels.

Example 3.1. Figure 3.2 shows the transition probability diagram for an example two-user channel with X1 = X2 = {0, 1}. Each user transmits independently from uniformly distributed binary alphabets. The arrows on the graph show the probability of receiving a particular y ∈ Y = {0, 1, 2}, corresponding to a matrix of transition probabilities (the outputs index the columns)

( 0.9   0.1   0
  0.05  0.8   0.15
  0     0.3   0.7
  0     0     1    ) .

Note that when the users transmit independently, the resulting multiple-access channel model is very similar to that for a single-user discrete channel with memory, e.g. an inter-symbol interference (ISI) channel. Instead of the source labels referring to the current and previous transmitted bit (in an ISI channel with memory one), they now refer to user 1 and user 2.
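As a check on Example 3.1, the output distribution under independent uniform inputs follows directly from the transition matrix; each input pair has probability 1/4, so p(y) is just the scaled column sum:

```python
import numpy as np

# Transition matrix p(y | x1, x2) of Example 3.1: rows are the input pairs
# (0,0), (0,1), (1,0), (1,1); columns are the outputs y = 0, 1, 2.
P = np.array([[0.9,  0.1,  0.0],
              [0.05, 0.8,  0.15],
              [0.0,  0.3,  0.7],
              [0.0,  0.0,  1.0]])

# Independent uniform binary inputs give each input pair probability 1/4
p_y = 0.25 * P.sum(axis=0)   # output marginal p(y)
```

This reproduces the output marginal p(Y) = (0.2375, 0.3, 0.4625) shown in Figure 3.2.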

3.2.2 The Capacity Region

The capacity region is the multiple-user equivalent of the channel capacity for a single-user system. In a single-user system there is a single number, C, which is the channel capacity.1 The channel capacity is the rate of transmission below which there exist encoders and decoders that allow arbitrarily low probability of decoding error.

In a multiple-user system with K users, there are K possibly different rates, one for each user, and the capacity region is the set of allowable rate

1 Provided the channel has a capacity.


(With independent uniform inputs, each input pair (x1, x2) has probability 1/4, and the resulting output marginal is p(Y) = (0.2375, 0.3, 0.4625) for y = 0, 1, 2.)

Fig. 3.2. Example of a discrete memoryless multiple-access channel.

combinations, which we represent geometrically in K-dimensional Euclidean space as rate vectors, R = (R1, R2, . . . , RK). These rate combinations may be used such that arbitrarily low error probabilities can be theoretically achieved. First, we must define the concept of a channel code, and the probability of error, before any capacity theorem can make sense.

With reference to Figure 3.3, a code for a multiple-access channel consists of K encoding functions, one for each source. Since the sources are independent, we can describe the multiple-access code in terms of the individual codes for each user (rather than a single joint encoder).

Let M_k = {1, 2, . . . , M_k} represent the set of possible messages that user k may wish to transmit. Thus the message for user k is a random variable U_k ∈ M_k, and the encoding function f_k for user k maps from this set of M_k messages onto an n-dimensional code vector, whose elements are drawn from the source alphabet X_k:

f_k : M_k → X_k^n.

The parameter n is called the codeword length. Assuming symbol synchronous transmission of the users, we can represent the entire process as a multiple-access encoder f_K mapping as follows

f_K : M_1 × M_2 × · · · × M_K → X_1^n × X_2^n × · · · × X_K^n,

where in this instance × denotes the Cartesian product. We shall abbreviate this mapping by f_K : M_K → X_K^n. For user k, define

R_k = (1/n) log_2 M_k

to be the code rate, in bits per symbol.


The decoding function for the multiple-access code is a mapping

g : Y^n → M_K

from the set of possible received vectors onto the message sets of the individual users. This definition implies the use of receiver cooperation, i.e. joint decoding. The K messages output by the decoder are random variables, Ûk.


Fig. 3.3. Coded multiple-access system.

A decoding error is said to have occurred if the message set at the output of the decoder does not agree completely with the input message set. In other words, an error occurs if the decoded message for any of the users is in error. The error probability conditioned on a particular set of transmitted messages m_K = (m1, m2, . . . , mK) is therefore

Pr(Error | U_K = m_K) = Pr(g(Y^n) ≠ m_K).

Assuming equi-probable transmission of messages, the average error probability for a multiple-access code is given by

P_e = 2^{−nR(K)} ∑_{m_K ∈ M_K} Pr(g(Y^n) ≠ m_K).

Definition 3.2 (Achievable Rate Vector). A particular rate vector R = (R1, . . . , RK) is achievable if, for any ε > 0, there exists a multiple-access code with rate vector R such that P_e ≤ ε. Otherwise it is said to be not achievable.


Now that we have formalized our description of the multiple-access system, we can present the theorem which describes the entire set of achievable rates. We describe the capacity region in terms of the achievable rate regions for fixed source distributions.

Theorem 3.1 (Achievable Rate Region). Consider the multiple-access channel

(X_K; p(y | x_K); Y).

For a fixed product distribution on the channel input symbols,

π(x_K) = ∏_k p_k(x_k),

every rate vector R in the set

R[π(x_K), p(y | x_K)] ≜ { R : R(S) ≤ I(X_S; Y | X_{K\S}), ∀ S ⊆ K }   (3.1)

is achievable in the sense of Definition 3.2.

Polytopes which are defined by a system of constraints such as (3.1) are known as bounded polymatroids [138]. The following example writes out the achievable rate region for the two-user case.

Example 3.2 (Two-user achievable rate region). Consider a two-user DMMAC. For a fixed input distribution π(x1, x2), the following region is achievable:

R[p1(x1) p2(x2), p(y | x_K)] = { (R1, R2) :
    R1 ≤ I(X1; Y | X2)
    R2 ≤ I(X2; Y | X1)
    R1 + R2 ≤ I(X1, X2; Y) } .

This region is shown in Figure 3.4.

Example 3.3 (Three-user achievable rate region). Consider a three-user DMMAC. For a fixed input distribution π(x1, x2, x3), the following region is achievable:

R[π(x_K), p(y | x_K)] = { (R1, R2, R3) :
    R1 ≤ I(X1; Y | X2, X3)
    R2 ≤ I(X2; Y | X1, X3)
    R3 ≤ I(X3; Y | X1, X2)
    R1 + R2 ≤ I(X1, X2; Y | X3)
    R1 + R3 ≤ I(X1, X3; Y | X2)
    R2 + R3 ≤ I(X2, X3; Y | X1)
    R1 + R2 + R3 ≤ I(X1, X2, X3; Y) } .

This region is shown in Figure 3.5.



Fig. 3.4. Two-user achievable rate region.


Fig. 3.5. Three-user achievable rate region.


In general there are 2^K − 1 rate constraints, one for each non-empty subset of users. Note that in some cases there may be a smaller number of effective constraints, since some inequalities may be implied by others.
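The mutual information constraints of Example 3.2 can be evaluated numerically for any two-user DMMAC given its transition probabilities and input marginals. The sketch below (a hypothetical helper, not from the text) is checked against a channel whose region is easy to work out by hand, the two-user binary adder channel of Section 3.3.1 with uniform inputs, where I(X1;Y|X2) = I(X2;Y|X1) = 1 bit and I(X1,X2;Y) = H(Y) = 1.5 bits:

```python
import numpy as np

def mutual_informations(Pt, p1, p2):
    """Return I(X1;Y|X2), I(X2;Y|X1), I(X1,X2;Y) in bits for a two-user DMMAC.

    Pt[x1, x2, y] = p(y | x1, x2); p1, p2 are independent input marginals."""
    joint = p1[:, None, None] * p2[None, :, None] * Pt   # p(x1, x2, y)

    def H(p):
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))

    Hy = H(joint.sum(axis=(0, 1)))
    Hy_x1x2 = H(joint.ravel()) - H(joint.sum(axis=2).ravel())   # H(Y | X1, X2)
    Hy_x2 = H(joint.sum(axis=0).ravel()) - H(p2)                # H(Y | X2)
    Hy_x1 = H(joint.sum(axis=1).ravel()) - H(p1)                # H(Y | X1)
    return Hy_x2 - Hy_x1x2, Hy_x1 - Hy_x1x2, Hy - Hy_x1x2

# Two-user binary adder channel: Y = X1 + X2 (deterministic), uniform inputs
Pt = np.zeros((2, 2, 3))
for x1 in range(2):
    for x2 in range(2):
        Pt[x1, x2, x1 + x2] = 1.0
I1, I2, Isum = mutual_informations(Pt, np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```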

The system designer can arbitrarily select the source product distribution π(·), and hence the union of achievable regions (over the set of all source product distributions) is achievable. Furthermore, suppose R1 and R2 are both achievable rate vectors. Every point on the line connecting R1 and R2 is also achievable, by using the codebook corresponding to R1 for λn symbols and that corresponding to R2 for the remaining (1 − λ)n symbols, 0 ≤ λ ≤ 1. This process is known as time-sharing and implies that the convex hull2 of an achievable region is achievable. A common time reference3 is however required, so the users know when to change codebooks.

Theorem 3.2 (Multiple-Access Capacity Region). Assuming the availability of a common time reference to all users, the capacity region is given by the closure of the convex hull (denoted co) of the union of all achievable rate regions over the entire family of product distributions on the sources:

C[p(y | x_K)] = co ⋃_{π(x_K)} R[π(x_K), p(y | x_K)].   (3.2)

The capacity region was first determined by [2], [129] and [76]. More accessible proofs of the multiple-access channel coding theorem may be found in [51], using the concept of jointly typical sequences (see also [24]), and in [50], using random coding arguments.

The DMMAC is rather special in that it admits a single-letter description of its capacity region, i.e. it can be expressed in terms of quantities such as I(X; Y), rather than limiting expressions4 such as

lim_{n→∞} I(X[1], X[2], . . . , X[n]; Y[1], Y[2], . . . , Y[n]).

There are not many other examples of multi-terminal channels in which the general solution is single-letter. Although single-letter achievable regions may be easily constructed for most multi-terminal channels, it appears difficult in general to obtain single-letter converse theorems.

2 The convex hull (convex cover) of a set of points A is defined in [32] as the set of points which is the intersection of all the convex sets that contain A.
3 This qualification shall be examined in more detail in Section 3.8.
4 It is possible to obtain “single-letter” expressions at the expense of unbounded increase in the source cardinality, but this is cheating.

Page 70: Coordinated Multiuser Communications

54 3 Multiuser Information Theory

3.3 Binary-Input Channels

In order to illustrate the concepts introduced above, let us now consider some simple examples, concentrating on binary-input channels. In particular, we shall consider the binary adder channel and the binary multiplier channel.

3.3.1 Binary Adder Channel

The binary adder channel is a K-user DMMAC in which each user transmits from a binary alphabet Xk = {0, 1}, k = 1, . . . , K, and the output is given by the real addition

Y = Σ_{k=1}^K Xk .

The channel output will therefore be a member of Y = {0, 1, . . . , K}.

What are the allowable rates of transmission for the binary adder channel? A simple upper bound based on the cardinality of Y is R (K) ≤ log(K + 1). We can however say a lot more than this.

Two Users

First, we shall find the capacity region for the two-user case. Let user 1 and user 2 transmit a 0 with probabilities p1(0) = p and p2(0) = q respectively. The transition diagram of the channel is shown in Figure 3.6.

Fig. 3.6. Two-user binary adder channel. The input pairs (x1, x2) ∈ {(0,0), (0,1), (1,0), (1,1)} produce the output y = x1 + x2 ∈ {0, 1, 2}, with output distribution pY (0) = pq, pY (1) = p + q − 2pq, pY (2) = (1 − p)(1 − q).

By Theorem 3.1, the achievable rate region for a given p and q is given by the rate vectors (R1, R2) which satisfy

R1 ≤ I (X1;Y | X2) (3.3)
R2 ≤ I (X2;Y | X1) (3.4)
R1 + R2 ≤ I (X1, X2;Y ) (3.5)

Consider the first inequality (3.3).

I (X1;Y | X2) = H (X1 | X2) − H (X1 | X2, Y ) = H (X1) = h (p)

where h (p) is the binary entropy function. The first step follows from the independence of the two users (no transmitter cooperation), and the fact that if we know both Y and X2, we can always determine X1. Similarly, (3.4) is found to be

I (X2;Y | X1) = h (q) .

The sum constraint (3.5) is determined as follows:

I (X1, X2;Y ) = H (Y ) − H (Y | X1, X2)
             = H (Y )
             = −pq log pq − (1 − p)(1 − q) log(1 − p)(1 − q) − (p + q − 2pq) log(p + q − 2pq)

since the two users' inputs uniquely determine the output.

It is now necessary to take the closure of the convex hull over the union of all such regions, where 0 ≤ p, q ≤ 1. Fortunately, there is a trick to be employed, but for the moment, let us see what this statement means. Figure 3.7 shows an example.

The region bounded by the thick line is due to p = 0.1, q = 0.6. The second region, bounded with a thin line, is due to p = 0.5, q = 0.8. The shaded region shows the additional rates found by taking the convex hull over these two regions. In principle, one continues this way for all p, q combinations. It is not hard to see that this process would be rather tedious, since p and q have uncountable support.

Instead, note that for p = q = 0.5, we obtain the maxima h (p) = h (q) = 1.0. Fortunately, it is precisely these distributions which also maximize the third constraint, giving max_{p,q} H (Y ) = 1.5. Since it is possible to simultaneously maximize all constraints, the resulting region must contain all other regions.

The resulting capacity region is therefore

R1 ≤ 1
R2 ≤ 1
R1 + R2 ≤ 1.5.
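These constraints are easy to evaluate numerically. The following sketch (illustrative only; the function names `h` and `adder_region` are ours, not the book's) computes the bounds (3.3)–(3.5) for given (p, q) and confirms that p = q = 0.5 simultaneously maximizes all three:

```python
from math import log2

def h(p):
    """Binary entropy function h(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p*log2(p) - (1-p)*log2(1-p)

def adder_region(p, q):
    """Bounds (3.3)-(3.5) for the two-user binary adder channel,
    where p = p1(0) and q = p2(0)."""
    pY = [p*q, p + q - 2*p*q, (1-p)*(1-q)]   # P(Y=0), P(Y=1), P(Y=2)
    HY = -sum(x*log2(x) for x in pY if x > 0)
    return h(p), h(q), HY                    # bounds on R1, R2, R1+R2

r1, r2, rsum = adder_region(0.5, 0.5)
# searching a grid of (p, q) never exceeds the sum rate 1.5
best = max(adder_region(i/20, j/20)[2] for i in range(21) for j in range(21))
```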

Page 72: Coordinated Multiuser Communications

56 3 Multiuser Information Theory

Fig. 3.7. Convex hull of two achievable rate regions for the two-user binary adder channel, for (p, q) = (0.1, 0.6) and (p, q) = (0.5, 0.8).

This region is a pentagon and is shown bounded by the thick line in Figure 3.8.

Single-user Decoding and Time Sharing

Two other regions are shown on Figure 3.8, corresponding to different transmission and decoding strategies. The triangle defined by

R1 + R2 < 1

and labeled no cooperation is the set of rates achievable with single-user decoding, in which the decoder for each user operates independently and treats the unwanted user as noise.

This region is sometimes referred to as the time-sharing region, since it may be achieved by time-sharing the channel between the two users. To achieve a rate point R1 = λ, R2 = 1 − λ via time sharing, the input distribution p = 1/2, q = 0 is used for a fraction λ of the time, and for the remaining fraction 1 − λ the input distribution p = 0, q = 1/2 is used. Note that such time-sharing implies a degree of cooperation between the users, namely that they have access to a common time reference.

It is perhaps less well-known that time sharing is not really necessary to achieve points with R1 + R2 = 1. If each user is decoded independently and treats the other user as noise, the rate constraints are

Page 73: Coordinated Multiuser Communications

3.3 Binary-Input Channels 57

Fig. 3.8. Capacity region of the two-user binary adder channel, showing the no cooperation, joint decoding and full cooperation regions, and the corner points A and B.

R1 ≤ H (Y ) − H (Y | X1) = H (Y ) − h (q)
R2 ≤ H (Y ) − H (Y | X2) = H (Y ) − h (p) .

Setting q = 1/2 results in

R1 = (1/2) h (p) ≤ 1/2
R2 = 1 − (1/2) h (p) ≥ 1/2,

and the portion of the line R1 + R2 = 1 from R = (0, 1) to R = (1/2, 1/2) may be achieved by varying p. Alternatively, setting p = 1/2 results in

R1 = 1 − (1/2) h (q) ≥ 1/2
R2 = (1/2) h (q) ≤ 1/2,

which yields the remaining portion of the line R1 + R2 = 1 from R = (1/2, 1/2) to R = (1, 0).
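A quick numerical verification of this claim (an illustrative sketch, not from the text — the helper `single_user_rates` is our own naming) shows that with q = 1/2 the two single-user rates always sum to exactly one bit:

```python
from math import log2

def h(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p*log2(p) - (1-p)*log2(1-p)

def single_user_rates(p, q):
    """Independent decoding on the two-user binary adder channel, each user
    treating the other as noise: R1 = H(Y) - h(q), R2 = H(Y) - h(p)."""
    pY = [p*q, p + q - 2*p*q, (1-p)*(1-q)]
    HY = -sum(x*log2(x) for x in pY if x > 0)
    return HY - h(q), HY - h(p)

# with q = 1/2: R1 = h(p)/2 and R2 = 1 - h(p)/2, so R1 + R2 = 1
r1, r2 = single_user_rates(0.3, 0.5)
```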

The full cooperation region is bounded by the sum rate R1 + R2 = log2 3, which is achieved if the users cooperate in their transmissions to ensure a uniform distribution over the three output symbols. This region is outside the multiple-access channel capacity region, because without cooperating in their transmissions, the required uniform distribution of output symbols cannot be obtained.

Cancellation and the Corner Points

It is worthwhile discussing the corner points of the joint decoding capacity region, marked A and B on the figure. Consider the first of these points, A, which is at (I (X1;Y | X2) , I (X2;Y )) = (1, 0.5). The form of the expression for this point is suggestive of a method for transmission on the channel, namely interference cancellation. Let user 2 transmit at rate R2 = I (X2;Y ), which is the mutual information between user 2 and the output, treating user 1 as noise. Recall that p = q = 0.5, hence user 2 sees the single-user channel shown in Figure 3.9. This is the binary erasure channel, with erasure probability

Fig. 3.9. Channel as seen by user two: input 0 yields output 0 and input 1 yields output 2, each with probability 1/2; otherwise the output is 1, an erasure.

δ = 0.5. The capacity of this channel is known to be 1 − δ = 0.5 [24, p. 187], hence user 2 may indeed transmit with arbitrarily low error probability (using a separate single-user decoder) at 0.5 bits/channel use. Assuming that we have decoded user 2 error free, user 1 may transmit at R1 = I (X1;Y | X2) = 1 bit per channel use, which is the maximal mutual information between user 1 and the output, given complete knowledge of user 2. Thus we can achieve the point A, using only single-user coding techniques, combined with interference cancellation. Point B may be achieved in a similar way, decoding user 1 first. Any point on the line connecting A and B may be achieved by time-sharing between these two schemes.
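The two mutual informations defining point A can be checked directly from the joint distribution (a sketch with our own helper names, assuming p = q = 1/2):

```python
from math import log2

# joint distribution of (x1, x2, y) for p = q = 1/2 on the binary adder channel
triples = [(x1, x2, x1 + x2) for x1 in (0, 1) for x2 in (0, 1)]  # each w.p. 1/4

def H(dist):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(p*log2(p) for p in dist.values() if p > 0)

def marginal(idx):
    """Marginal distribution over the coordinates listed in idx."""
    d = {}
    for t in triples:
        key = tuple(t[i] for i in idx)
        d[key] = d.get(key, 0) + 0.25
    return d

# I(X2;Y) = H(X2) + H(Y) - H(X2,Y): rate of user 2 treating user 1 as noise
I_x2_y = H(marginal([1])) + H(marginal([2])) - H(marginal([1, 2]))
# I(X1;Y|X2) = H(Y|X2) - H(Y|X1,X2) = H(X2,Y) - H(X2), since Y is a
# deterministic function of (X1, X2)
I_x1_y_given_x2 = H(marginal([1, 2])) - H(marginal([1]))
```

Together the two rates reach the sum capacity 1.5 bits, as claimed for the corner point A.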

K > 2 Users

For the general case of K > 2 users, the capacity region is given [18, 19] as the closure of the convex hull of the union of the following regions:

{ (R1, R2, . . . , RK) : 0 ≤ R (S) ≤ Σ_{k=0}^{|S|} [ (|S| choose k) / 2^{|S|} ] log2 [ 2^{|S|} / (|S| choose k) ] , ∀S ⊆ K } .


Often, we are interested in the maximum total transmission rate, or sum capacity. A good approximation to the sum rate constraint R (K) ≤ Csum for this channel is [19, 165]

Csum ≈ (1/2) log2 (πeK/2) bits. (3.6)

In fact the following bounds show the asymptotic tightness of this approximation [18]:

(1/2) log2 (πK/2) ≤ max R (K) ≤ { (1/2) log2 (πeK/2)        K even,
                                 { (1/2) log2 (πe(K + 1)/2)  K odd.

From (3.6), the total throughput is proportional to log K. As the system grows larger, the sum capacity increases, with individual users transmitting less.

It is interesting to note that

lim_{K→∞} [ (1/2) log2 (πeK/2) ] / log2(K + 1) = 1/2 .

In the limit, the lack of transmitter cooperation reduces the available sum capacity by a factor of two. As the number of users grows, the distribution of the channel output becomes Gaussian (for independent users), rather than uniform (with transmitter cooperation).
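The approximation (3.6) is just the Gaussian entropy of the output Y, which for uniform inputs is Binomial(K, 1/2). A small numerical sketch (our own helper names, not from the text) shows how close the exact sum capacity is to (3.6) already for K = 100:

```python
from math import comb, log2, pi, e

def bac_sum_capacity(K):
    """Exact sum capacity of the K-user binary adder channel with uniform
    independent inputs: the entropy of Y ~ Binomial(K, 1/2)."""
    probs = [comb(K, k) / 2**K for k in range(K + 1)]
    return -sum(p * log2(p) for p in probs)

def approx(K):
    """Approximation (3.6)."""
    return 0.5 * log2(pi * e * K / 2)

exact = bac_sum_capacity(100)
est = approx(100)
```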

3.3.2 Binary Multiplier Channel

The previous example showed how joint detection could be used to increase the region of achievable rates (and how full cooperation increases this region even further). In the next example, we see that this is not always the case.

The two-user binary multiplier channel performs the real multiplication Y = X1X2, where X1, X2 ∈ {0, 1}. Hence Y ∈ {0, 1}.

The cardinality of the output alphabet places a restriction on the maximum sum rate, R1 + R2 ≤ 1. The capacity region is shown in Figure 3.10. In this case, there is nothing to be gained over time-sharing of the channel, or even by complete transmitter cooperation.

3.4 Gaussian Multiple-Access Channels

3.4.1 Scalar Gaussian Multiple-Access Channel

The Gaussian multiple-access channel (GMAC) differs from the channels examined so far, in that each user transmits from an infinite alphabet.


Fig. 3.10. Capacity region of the two-user binary multiplier channel.

Definition 3.3 (Gaussian Multiple-Access Channel). The Gaussian multiple-access channel is a K-user multiple-access channel in which each user transmits from an infinite alphabet, Xk ∈ R, subject to an average power constraint E[|Xk|²] ≤ Pk. The channel output Y ∈ R is given by the real addition

Y = Σ_{k=1}^K Xk + z,

where z is a zero mean, variance σ², white Gaussian random variable, denoted z ∼ N(0, σ²).

The above definition is easily extended to a complex channel, in which the users transmit complex numbers and the additive noise is a zero mean circularly symmetric complex Gaussian random variable (i.e. independent real and imaginary parts, each with variance σ²/2).

The capacity region of the Gaussian multiple-access channel is found by extending the work of [2], [129] and [76], which was for finite alphabets, to the case of infinite alphabets. This was done independently by [22] and [169]. The resulting capacity region (in the real case) is given by the following constraints:

R (S) ≤ (1/2) log2 (1 + P (S)/σ²) bits, ∀S ⊆ K.

For complex channels, the factor of 1/2 is omitted.

From this we can see that although the users transmit independently, the total rate is equal to that obtained by a single user, with average power


P (K), with exclusive access to the channel. This implies that no rate loss is suffered due to the distributed nature of the various users. As was the case for the binary adder channel, the total sum rate is proportional to log K, with ever increasing capacity as the number of users increases, with each user transmitting a smaller and smaller amount.

It is convenient to introduce the following notation for Gaussian channel capacity:

C (P, σ²) = { (1/2) log (1 + P/σ²)   real channel
            { log (1 + P/σ²)         complex channel     (3.7)

Using this notation, Figure 3.11 shows an example of a capacity region, in which P1 > P2.

Fig. 3.11. Example of Gaussian multiple-access channel capacity region. The single-user constraints are R1 ≤ C(P1, σ²) and R2 ≤ C(P2, σ²), the corner points have coordinates C(P1, P2 + σ²) and C(P2, P1 + σ²), and the dominant face is the line R1 + R2 = C(P1 + P2, σ²).

Orthogonal Multiple-Access

What rate vectors are achievable with only single-user decoding on the GMAC? Consider the use of an orthogonal multiple-access technique, such as time sharing (i.e. time-division multiple-access). For the purposes of illustration, consider a two-user channel and suppose user 1 has exclusive access to the channel for λn symbols out of n. Correspondingly, user 2 has exclusive access for (1 − λ)n symbols.

User 1 transmits only a fraction λ of the time, and hence when it does transmit it may do so with power P1/λ while still maintaining the long-term average power P1. The resulting rate of transmission during the active periods is therefore C (P1/λ, σ²) (noting that there is no interference since user 2 does not transmit at the same time). The overall transmission rate for user 1 is


reduced by the factor λ. Similar arguments apply to user 2 and the achievable rate pair is

R1 = λ C (P1/λ, σ²)
R2 = (1 − λ) C (P2/(1 − λ), σ²) .

Figure 3.12 shows the region obtained by varying 0 ≤ λ ≤ 1 compared to the capacity region (in this example P1 = P2 = σ² = 1).

Orthogonal multiple-access touches the outer boundary of the capacity region at the trivial points R = (0, C (P2, σ²)) and R = (C (P1, σ²), 0), in which case either user 1 or user 2 has exclusive access to the channel. The more interesting case is achieved by setting

λ = P1/(P1 + P2),

which results in the non-trivial boundary point

R1 = [P1/(P1 + P2)] C (P1 + P2, σ²)
R2 = [P2/(P1 + P2)] C (P1 + P2, σ²) .

Figure 3.12 shows how almost the entire capacity region of the two-user GMAC can be achieved, with the only cooperation required being a common time reference to schedule the transmissions.
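The sum-rate optimality of the power-proportional time shares is easy to confirm numerically (an illustrative sketch with our own helper names, using the real-channel C of (3.7)):

```python
from math import log2

def C(P, s2):
    """Real Gaussian channel capacity (3.7), bits per channel use."""
    return 0.5 * log2(1 + P / s2)

def tdma_rates(P1, P2, s2, lam):
    """Time sharing with power boosting: each user bursts at Pk divided by
    its time share, preserving its long-term average power."""
    return lam * C(P1 / lam, s2), (1 - lam) * C(P2 / (1 - lam), s2)

P1, P2, s2 = 2.0, 1.0, 1.0
# the boundary-touching time share lambda = P1/(P1 + P2)
r1, r2 = tdma_rates(P1, P2, s2, P1 / (P1 + P2))
```

With this choice, R1 + R2 equals the sum capacity C(P1 + P2, σ²); any other λ falls strictly inside the region.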

Similarly, a point on the boundary of the K-user capacity region of a GMAC with user powers P1, P2, . . . , PK can be achieved with user k having exclusive channel access a proportion

λk = Pk/P (K)

of the time. The resulting rate for user k is

Rk = [Pk/P (K)] C (P (K) , σ²) ,

and the sum of these rates is clearly C (P (K) , σ²).

There is of course nothing special about the use of time-division as the orthogonal accessing method. The same rates result from the use of any orthogonal access method, where the variable λ accounts for the proportion of degrees of freedom used by user 1.

For example, in a band-limited system with bandwidth W , we could consider allocating orthogonal frequency bands. Allocating λW Hz to user 1 and (1 − λ)W Hz to user 2 results in


R1 = λW C (P1, λWN0) = λW C (P1/λ, WN0)
R2 = (1 − λ)W C (P2, (1 − λ)WN0) = (1 − λ)W C (P2/(1 − λ), WN0) .

Fig. 3.12. Rates achievable with orthogonal multiple-access.

3.4.2 Code-Division Multiple-Access

The remaining chapters of this book are devoted to the design of multiuser receivers for linear multiple-access channels, typically exemplified by direct-sequence code-division multiple-access. Therefore let us now consider the CDMA channel from an information theoretic point of view. The approach that we take is to model the spreading as part of the channel specification, and to calculate the resulting capacity. Using this approach we can gain insight into several important questions concerning CDMA channels:

• What are the fundamental limitations of such systems?
• We know from the data processing theorem that spreading cannot increase capacity. However spreading is useful for reasons other than capacity. Are there cases when spreading does not decrease capacity?
• How much spectrum spreading should be used?
• What sort of performance improvement can be expected through the use of receiver cooperation, i.e. joint decoding?
• How do we design sequences to maximize capacity?

There are two main cases of interest: firstly, time-invariant spreading, according to Definition 2.1; secondly, random spreading, according to Definition 2.2, which models aperiodic sequences (or sequences with period much longer than the symbol interval).

Time-invariant spreading sequences S[i] = S model periodic, synchronous spreading sequences. The following theorem gives the capacity of the time-invariant CDMA channel (normalized by the number of signaling dimensions) as a function of the spreading sequences [112, 148]. We will concentrate on the complex channel (with complex spreading). Real channel results are obtained via a factor 1/2.

Theorem 3.3 (Synchronous CDMA). The capacity region of the time-invariant CDMA channel with spreading sequence matrix S, average energy constraints W and noise variance σ² is the closure of the region

C [S] = ⋂_{T ⊆ K, T ≠ ∅} { R : R (T ) ≤ (1/L) log det ( I + (1/σ²) S_T W_T S*_T ) } .

S_T is the matrix S, with the columns whose indices do not belong to T omitted. W_T is formed from W by removing rows and columns whose indices do not belong to T . Capacity is achieved for Gaussian inputs X.

Using our notation (3.7), we can also write the sum rate constraint Csum(S) for real or complex channels in terms of λ1, λ2, . . . , λL, the eigenvalues of SWS*:

R (K) ≤ Csum(S) = (1/L) Σ_{l=1}^L C (λl, σ²) . (3.8)

We shall see that the latter expression is useful for calculating the capacity for randomly spread systems.

An upper bound on Csum is the capacity of the channel with L = 1, namely the K-user Gaussian multiple-access channel with average energy constraint wtot = P (K). It turns out that there exist choices for the sequences S that achieve this bound [111, 112].
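The equivalence underlying (3.8) — that the log-determinant equals the sum over eigenvalues — can be checked on a toy system small enough to solve by hand (a sketch with hypothetical numbers: L = 2 chips, K = 3 users, W = I, ±1/√2 chips):

```python
from math import log2, sqrt

L = 2
s2 = 0.5  # noise variance
S = [[1/sqrt(2),  1/sqrt(2), 1/sqrt(2)],
     [1/sqrt(2), -1/sqrt(2), 1/sqrt(2)]]

# M = S S^T (since W = I), a 2x2 matrix
M = [[sum(S[i][k]*S[j][k] for k in range(3)) for j in range(L)] for i in range(L)]

# log2 det(I + M/sigma^2) computed directly for the 2x2 case
a, b = 1 + M[0][0]/s2, 1 + M[1][1]/s2
c = M[0][1]/s2
logdet = log2(a*b - c*c)

# the same quantity via the eigenvalues of M (2x2 quadratic formula)
tr, det = M[0][0] + M[1][1], M[0][0]*M[1][1] - M[0][1]*M[1][0]
lam1 = (tr + sqrt(tr*tr - 4*det))/2
lam2 = (tr - sqrt(tr*tr - 4*det))/2
eig_sum = log2(1 + lam1/s2) + log2(1 + lam2/s2)
```

The identity det(I + M/σ²) = Π_l (1 + λ_l/σ²) is what lets the capacity be written as the eigenvalue sum in (3.8).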

Theorem 3.4 (Optimal Sequences for Time-Invariant CDMA). Let W = P I. The Gaussian multiple-access channel upper bound

Csum ≤ CGMAC = C (wtot, σ²) nats per chip

is achieved by use of sequences that satisfy SSᵗ = I_L.


Now the spreading sequences are the columns of S. The condition for equality in the above theorem requires the rows to be orthogonal. It is not required that the sequences themselves be orthogonal. A requirement for row orthogonality is that K ≥ L, otherwise S will be rank deficient. Sequences satisfying this condition also satisfy the Welch lower bound on total squared correlation [159]. Hence, by restricting the amount of spreading to L ≤ K, no capacity penalty need be suffered. Note that these are the same sequences that are optimal for the correlation detector, as described in Section 2.5.3.

Let us now consider the effect of using randomly selected sequences, according to Definition 2.2.

Now, according to (3.8), for any particular set of spreading sequences S, we can write the sum capacity Csum(S) in terms of the eigenvalues of SSᵗ (we shall for the moment consider systems with equal powers for all the users, W = I). This motivates interest in the eigenvalue distribution of random matrices of the form SSᵗ. Now since the spreading sequences are being chosen randomly, the eigenvalues corresponding to any particular S[i] (and the corresponding capacity) are also random.

The random nature of these spreading sequences therefore makes it difficult to make statements about the capacity. However, it turns out that if we instead consider a large systems limit, we can calculate the asymptotic capacity. By large systems we mean that we take the limit K → ∞ such that K/L → β, a constant. In this scenario, results from random matrix theory state that the distribution of eigenvalues tends to a non-random, known limit. This in turn enables us to compute the limiting capacity to which the random capacity for a finite system converges, as we increase the system dimensions.

The following theorem shows how the eigenvalue spectrum of a large random matrix converges to a non-random limit distribution; see [64, 127, 156, 172].

Theorem 3.5 (Asymptotic Spectral Distribution). Let F (x) be the empirical cumulative distribution of a randomly selected eigenvalue of (1/K) SSᵗ, where S is selected according to Definition 2.2. For large systems F (x) → F(x), where

F′(x) = f(x) = { √((x − a(β))(b(β) − x)) / (2πβx)   a(β) ≤ x ≤ b(β)
               { 0                                   otherwise

a(β) = (√β − 1)²,  b(β) = (√β + 1)².
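As a quick consistency check on the limit density (a numerical sketch, not from the text; the names `mp_density` and `integrate` are ours), f integrates to one and has unit mean for any β; here we verify β = 1:

```python
from math import sqrt, pi

def mp_density(x, beta):
    """Limit density f(x) of Theorem 3.5 (equal-power case)."""
    a, b = (sqrt(beta) - 1)**2, (sqrt(beta) + 1)**2
    if x <= a or x >= b:
        return 0.0
    return sqrt((x - a)*(b - x)) / (2*pi*beta*x)

def integrate(g, a, b, n=100000):
    """Simple midpoint rule over [a, b]."""
    h = (b - a)/n
    return sum(g(a + (i + 0.5)*h) for i in range(n)) * h

beta = 1.0
mass = integrate(lambda x: mp_density(x, beta), 0.0, 4.0)
mean = integrate(lambda x: x*mp_density(x, beta), 0.0, 4.0)
```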

Example 3.4 (Convergence of Eigenvalue Distribution). Figure 3.13 shows how the eigenvalue distribution of SSᵗ converges to the theoretical limit. The figure compares simulated eigenvalue histograms to the limit distribution f(x) for K = L = 5, 10 and 50. We see that the convergence is quite fast. Although


the histogram and density do not match very well at K = L = 5, they do match well already at K = L = 10.

Fig. 3.13. Convergence of spectral density for K = L = 5, 10 and 50.

Using Theorem 3.5, it is possible to find the large-systems sum capacity Cr of the randomly spread CDMA channel [150].

Theorem 3.6 (Spectral Efficiency of Random CDMA). For large systems, the sum capacity of random CDMA where the users have identical powers wtot/K is

Cr = ∫ C (γx, 1) f(x) dx

   = (β/2) log ( 1 + γ − (1/4)F(γ, β) ) + (1/2) log ( 1 + γβ − (1/4)F(γ, β) ) − (log e / (8γ)) F(γ, β)

bits per chip, where γ = wtot/(Lσ²) and

F(x, z) = ( √(x(1 + √z)² + 1) − √(x(1 − √z)² + 1) )² .


The convergence is in probability, i.e. as we increase the system dimensions, Csum(S)/L → Cr. A direct solution for a closed-form expression for this capacity is also given in [106].

Using similar techniques, one may also find the capacity of the randomly spread CDMA channel with no cooperation (i.e. each user independently decodes based only on the output of its own matched filter) [150].
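The closed form of Theorem 3.6 can be cross-checked against direct integration of the limit density of Theorem 3.5 (an illustrative sketch for β = 1, not from the text; the ½ inside the numerical integral reflects the real-channel convention of (3.7)):

```python
from math import sqrt, pi, log2, e

def mp_density(x):
    """Limit density of Theorem 3.5 for beta = 1 (support [0, 4])."""
    return 0.0 if x <= 0 or x >= 4 else sqrt(x*(4 - x))/(2*pi*x)

def F(x, z):
    """The function F(x, z) from Theorem 3.6."""
    return (sqrt(x*(1 + sqrt(z))**2 + 1) - sqrt(x*(1 - sqrt(z))**2 + 1))**2

def Cr_closed(gamma, beta=1.0):
    f = F(gamma, beta)
    return (beta/2)*log2(1 + gamma - f/4) + 0.5*log2(1 + gamma*beta - f/4) \
           - log2(e)/(8*gamma)*f

def Cr_numeric(gamma, n=100000):
    """Midpoint-rule evaluation of the real-channel integral
    ∫ ½ log2(1 + γx) f(x) dx over the support [0, 4]."""
    h = 4.0/n
    return sum(0.5*log2(1 + gamma*(i + 0.5)*h)*mp_density((i + 0.5)*h)
               for i in range(n))*h
```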

Theorem 3.7 (No Cooperation Spectral Efficiency). For large systems, the sum capacity of match-filtered random CDMA with no cooperation is

CMF = (β/2) log ( 1 + γ/(1 + γβ) ) .

Finally, if no spreading (or Welch Bound Equality sequences) is used, the spectral efficiency is the solution of the following equation:

COPT = (1/2) log ( 1 + 2COPT Eb/N0 ) .

If orthogonal sequences are used (K ≤ L), we have

C⊥ = βCOPT.
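The implicit equation for COPT is easily solved by fixed-point iteration (a numerical sketch, not from the text; the function name is ours):

```python
from math import log2

def spectral_efficiency(eb_n0, iters=200):
    """Solve C = (1/2) log2(1 + 2*C*Eb/N0) by fixed-point iteration.
    eb_n0 is the Eb/N0 ratio in linear units."""
    c = 1.0
    for _ in range(iters):
        c = 0.5*log2(1 + 2*c*eb_n0)
    return c

c10 = spectral_efficiency(10.0)  # Eb/N0 = 10 dB, i.e. a linear ratio of 10
```

The map is a contraction near its fixed point, so a couple of hundred iterations are far more than sufficient.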

These results are illustrated in Figure 3.14, which is plotted for Eb/N0 = 10 dB. For a fixed Eb/N0, the spectral efficiency of either the optimal detector or the correlation detector is maximized by letting β → ∞. In the case of correlation detection,

lim_{β→∞} CMF = (log2 e)/2 − (1/2)(Eb/N0)⁻¹,

which converges to log2 e/2 ≈ 0.72 as Eb/N0 → ∞ [62, 150].

Figure 3.15 shows the spectral efficiency for both optimal detection and correlation detection as a function of Eb/N0 for K → ∞. The spectral efficiency of the optimal system grows without bound as the SNR increases. In fact, comparing to (3.6), we see that for L = 1 and a fixed, large K, the spectral efficiency behaves like

(1/2) log2 (πeK/2)

with increasing SNR. This is the same limit obtained for the K-user binary adder channel in Section 3.3.1.

In the limiting case of many users or high signal-to-noise ratios, it is possible to write a closed-form expression for the sum capacity of the randomly spread channel [56].


Fig. 3.14. Spectral efficiency of DS-CDMA with optimal, orthogonal and random spreading, Eb/N0 = 10 dB. The curves show no spreading/WBE spreading, random spreading with joint decoding, random spreading with matched filtering, and orthogonal spreading.

Fig. 3.15. Spectral efficiency of DS-CDMA with random spreading, as a function of Eb/N0, for joint decoding and matched-filter detection.


Theorem 3.8 (Limiting Capacity). For large systems (many users) or for high signal-to-noise ratios, the average sum capacity of a direct-sequence CDMA channel with equal powers is given by

lim_{γ→∞} Cr = log γ + ((K − L)/L) ln (K/(K − L)) − 1 nats, K ≥ L.

Furthermore,

lim_{γ→∞} (CGMAC − Cr) = { 0   L/K → 0,
                          { 1   L/K → 1.

From this limiting expression we can see that the penalty for using randomly selected, rather than optimal, spreading sequences in large systems is at most 1 nat. As L/K decreases, random spreading becomes optimal.

All of the large systems results that we have presented have been for the case when all the users transmit with equal powers. If unequal powers are used, the analysis may be extended by using appropriate results from random matrix theory. The results end up being expressed in terms of Stieltjes transforms.

Definition 3.4 (Stieltjes Transform). The Stieltjes transform m(z), z ∈ C⁺, of a cumulative distribution F (x) with probability density function F′(x) = f(x) is given by

m(z) = ∫ f(λ)/(λ − z) dλ,

and possesses the inversion formula

∫_a^b f(x) dx = (1/π) lim_{η→0⁺} ∫_a^b Im [m(ξ + iη)] dξ . (3.9)

We shall continue using the convention of using upper-case letters for cumulative distributions and the corresponding lower-case letters for densities.5

As a special case of the result presented in [128] we have the following.

5 Most of the following results are usually presented in measure-theoretic terms. We assume the existence of probability density functions in order to avoid measure-theoretic concepts.


Theorem 3.9 (Limiting Spectrum for Unequal Powers). Let the elements of S ∈ C^{L×K} be chosen according to Definition 2.2. Let W = diag(w1, w2, . . . , wK), wi ∈ R, be independent of S and let the empirical distribution of {w1, w2, . . . , wK} converge almost surely to a cumulative distribution function H with density H′ = h as K → ∞. Then for large systems, the empirical distribution function of the eigenvalues of the matrix SWSᵗ/L converges to a non-random probability distribution whose Stieltjes transform m(z) ∈ C⁺ is the unique solution to

m(z) = −( z − (1/β) ∫ τh(τ)/(1 + τm(z)) dτ )⁻¹ . (3.10)

We can now describe the large-systems capacity result for the unequal power case.

Theorem 3.10. At each symbol interval, let the matrix of spreading sequences S be randomly selected as in Theorem 3.9 and let the empirical distribution function of the users' energies converge to a non-random cumulative distribution function H. Then

Cr = ∫ log ( 1 + xL/σ² ) f(x) dx, (3.11)

where the Stieltjes transform of the cumulative distribution function F (x) satisfies (3.10).

This theorem shows that for fixed β, the sum capacity depends only upon the noise variance and the limiting distribution of the users' energies, albeit through the rather awkward (3.10), combined with the inverse Stieltjes transform.

Example 3.5. Let the power distribution be discrete, consisting of J point masses at powers 0 < P1 < P2 < · · · < PJ ,

h(τ) = Σ_{j=1}^J αj δ(τ − Pj),

where Σ αj = 1. Substituting into (3.10) and using the sifting property of the Dirac delta function results in

m(z) = −( z − (1/β) Σ_{j=1}^J Pjαj/(1 + Pjm(z)) )⁻¹ . (3.12)
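The fixed-point equation (3.12) and the inversion formula (3.9) can be exercised numerically (an illustrative sketch, not from the text; the damped iteration and helper names are our own choices, and for a single unit mass with β = 1 the recovered density should match Theorem 3.5):

```python
from math import pi, sqrt

def stieltjes(z, powers, weights, beta, iters=500):
    """Damped fixed-point iteration for (3.12)."""
    m = 1j
    for _ in range(iters):
        g = -1.0/(z - sum(P*a/(1 + P*m) for P, a in zip(powers, weights))/beta)
        m = 0.5*m + 0.5*g   # damping helps the iteration settle
    return m

# single unit point mass, beta = 1: the equal-power limit law of Theorem 3.5
eta = 1e-2
m = stieltjes(1.0 + 1j*eta, [1.0], [1.0], 1.0)
density_est = m.imag/pi                      # inversion formula (3.9)
density_true = sqrt(1*(4 - 1))/(2*pi*1)      # f(1) from Theorem 3.5
```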


The resulting equation for m(z) is a degree J + 1 polynomial. For a single point mass (the equal power case), solution of the corresponding quadratic, together with the inversion formula (3.9), results in the expression for F given in Theorem 3.5.

Capacity Computation for Arbitrary Power Distributions

In all but the simplest of cases, direct computation of the Stieltjes transform m(z) of the limiting eigenvalue distribution F (λ) using (3.10) is difficult, if not intractable (for example, four point masses result in a quintic for m).

It is possible however to find a parametric relation for the random sequence capacity that side-steps the problem of solving (3.10). This parametric equation is in terms of the capacity of a parallel (orthogonal modulation) channel with the same power distribution.

Definition 3.5 (Shannon Transform). Let h(x) be a probability density on the positive real numbers. Define

ηH (γ) = (1/2) ∫₀^∞ log (1 + γx) h(x) dx.

Theorem 3.11. Retaining the definitions and assumptions of Theorem 3.10, the spectral efficiency of the randomly spread multiple-access channel is given parametrically via

Cr = ηF (γ) = (1/β) ηH (s) − ln (s/γ) + s/γ − 1 (3.13)

γ = s ( 1 − (s/β) dηH (s)/ds )⁻¹ (3.14)

where for convenience γ = L/σ².

Example 3.6 (Equal Powers). Let h(x) = δ(x − 1) be a single point mass, corresponding to the unit equal power case. Consider β = 1. Then ηH (s) = ln(1 + s) and hence γ = s(1 + s), from (3.14). Solving for s in terms of γ gives

s = (1/2) ( −1 + √(1 + 4γ) ),

which when substituted into (3.13) gives

ηF (γ) = (√(4γ + 1) − 1)/(2γ) + 2 ln (√(4γ + 1) + 1) − 1 − 2 ln 2,
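The algebra of this example is easy to verify numerically (a sketch with our own function names, following the example's convention ηH (s) = ln(1 + s)):

```python
from math import log, sqrt

def eta_F_closed(g):
    """Closed form from Example 3.6 (equal powers, beta = 1), in nats."""
    r = sqrt(4*g + 1)
    return (r - 1)/(2*g) + 2*log(r + 1) - 1 - 2*log(2)

def eta_F_parametric(s, beta=1.0):
    """Theorem 3.11 with eta_H(s) = ln(1+s); returns the point (gamma, eta_F)."""
    gamma = s*(1 + s)                                      # from (3.14)
    etaF = log(1 + s)/beta - log(s/gamma) + s/gamma - 1    # from (3.13)
    return gamma, etaF

g, v = eta_F_parametric(2.0)
```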


which is the same result obtained in [150, (9)], and which was given in Theorem 3.6 above.

Although this parametric form appears to simplify the problem of finding closed-form capacity results, it suffers from the same basic problem as the fixed-point equation (3.10). For instance, the J-level discrete distribution described in Example 3.5 results again in a degree J + 1 polynomial.

Example 3.7 (Two Point Masses). Consider a two-level power distribution with point masses at powers P1 > 1 > P2, normalized such that the average power over all users is 1,

h(x) = [(1 − P2)/(P1 − P2)] δ(x − P1) + [(P1 − 1)/(P1 − P2)] δ(x − P2).

Then ηH (s) and γ(s) are given by

ηH (s) = [ (P1 − 1) log (sP2 + 1) + (1 − P2) log (sP1 + 1) ] / (P1 − P2) (3.15)

γ(s) = s (sP1 + 1)(sP2 + 1) / [ (P1 + P2 − 1)s + 1 ] . (3.16)

From (3.16) it can be seen that solution for s in terms of γ (which would yield a closed-form expression for ηF (γ)) involves solution of the same cubic equation arising in (3.12).

Nevertheless, it is straightforward to parametrically obtain plots of ηF (γ) using (3.15) and (3.16) in Theorem 3.11. Thus the polynomial is still present, there is just no need to solve it.

Example 3.8 (Rayleigh Fading). Let each user's transmission experience independent Rayleigh fading, namely

h(x) = e^{−x/2} / (√(2π) √x).

Then ηH (s) may be easily found via numerical integration, and

γ(s) = e^{−1/(2s)} / [ √(2π) s^{3/2} ( 1 − erf(1/(√2 √s)) ) ] .

The resulting random sequence capacity can therefore be numerically determined. The result is shown in Figure 3.16, as a function of γ.

Furthermore, it is possible to find ηF (γ) directly from ηH (s), in the following way. For each point (s, ηH (s)) there is a single corresponding point (γ, ηF (γ)), which can be obtained directly from the plot of ηH (s) by a simple point-wise geometric construction. Plot the curve β⁻¹ηH (s) and extend a tangent line


Fig. 3.16. Random sequence capacity with Rayleigh fading, compared to the equal power case, as a function of γ.

back from s to the s = 0 intercept, say η0. Denote the vertical drop of this line by α = β⁻¹ηH (s) − η0, which is always less than 1. The point (s, β⁻¹ηH (s)) corresponds to a point (γ, ηF (γ)) = (s + Δs, β⁻¹ηH (s) + Δη) above and to the right, according to the following theorem, illustrated in Figure 3.17.

Theorem 3.12. The following point-wise geometrical construction obtains ηF (γ) from ηH (s):

ηF (γ) = (1/β) ηH (s) + Δη
γ = s + Δs,

where the coordinate increments Δη and Δs are given in terms of

α = (s/β) dηH (s)/ds (3.17)

as

Δη(α) = −ln(1 − α) − α
Δs(α) = αs/(1 − α).

The utility of this theorem is due to the fact that typically ηH (s) and its derivative are easily computed, and in some cases may be found in closed form.
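The construction can be sketched for the equal-power case (illustrative only; we take ηH (s) = ln(1 + s) as in Example 3.6, and α as the tangent drop (3.17)):

```python
from math import log

def geometric_construction(s, beta=1.0):
    """Theorem 3.12 for eta_H(s) = ln(1+s): from the point (s, eta_H(s)/beta),
    step by (delta_s, delta_eta) to reach (gamma, eta_F(gamma))."""
    etaH = log(1 + s)
    alpha = (s/beta)*(1/(1 + s))          # tangent drop, (3.17)
    d_eta = -log(1 - alpha) - alpha
    d_s = alpha*s/(1 - alpha)
    return s + d_s, etaH/beta + d_eta     # the point (gamma, eta_F(gamma))

g, v = geometric_construction(2.0)
```

For s = 2 and β = 1 this lands on γ = s(1 + s) = 6 with ηF = 2 ln 3 − 2/3, matching the parametric form of Theorem 3.11.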

3.5 Multiple-Access Codes

Although it is important to know the ultimate information rate limits for the multiple-access channel, practical systems require channel coding schemes


s

ηH (s) /β

α

Δη

Δs

γ

ηF (γ)

Fig. 3.17. Finding the asymptotic spectral efficiency via a geometric construction.

to achieve these rates. In addition to providing error detection/correctioncapability in the presence of noise, multiple access channel codes should alsohave the property that they separate the users’ transmissions (reduce MAI).A code that can perform this task without error is called uniquely decodeable.

Definition 3.6 (Uniquely Decodeable). A code for the multiple-access channel is uniquely decodeable (UD) if, in the absence of noise, the mapping MK → Yn is one-to-one.

Codes that fulfill this definition have the property that the combination of the users' encoding function and the channel function is uniquely invertible. Both block and trellis uniquely decodeable codes have been constructed for a variety of multiple-access channels. In particular, much effort has been spent on the binary adder channel. In the following sections, we shall summarize some of the currently available techniques. For a survey of channel codes up to 1980, one is directed to Farrell [40].

Note that the requirement of unique decodeability is related to the concept of the zero-error capacity region. If we require codes that can be decoded with zero probability of error, then we should compare their performance to the zero-error capacity region, rather than the ε-error region. In general, the zero-error capacity region for the multiple-access channel is unknown (apart from special cases such as [84]). It has in fact been shown that for the two-user binary adder channel, the zero-error capacity region is strictly smaller than the ε-error region [141].

Much of the literature on coding is for the binary adder channel, and as such, is a useful basis for comparison of various schemes. Figure 3.18 summarizes graphically various codes of historical interest, including the current best known code (in terms of sum rate).

[Figure: the (R1, R2) unit square with the single-user constraints R1 ≤ 1 and R2 ≤ 1; numbered points 1–15 mark the rate pairs of the codes listed in Table 3.1.]

Fig. 3.18. Rates achieved by some existing codes for the BAC.

3.5.1 Block Codes

Let us begin by introducing some notation for block codes. Denote the codebook for user k by Ck, the number of codewords by Mk = |Ck|, and the rate by Rk = (1/n) log2 Mk.

Block Codes for the Binary Adder Channel

Let us take a look at our first multiuser code for the two-user BAC, and in doing so, we shall illustrate the principle of unique decodeability.

Point  R1     R2     R1 + R2  Reference  Type
1      0.5    0.792  1.292    [67]       Block
2      0.571  0.558  1.129    [67]       Block
3      0.512  0.793  1.306    [14]       Block
4      0.5    0.5    1.0      [96]       Convolutional
5      0.75   0.5    1.25     [20]       Trellis
6      0.33   0.878  1.208    [27]       Block asynch.
7      0.100  0.999  1.099    [14]       Block
8      0.143  0.999  1.143    [14]       Block
9      0.178  0.998  1.177    [14]       Block
10     0.222  0.994  1.216    [73]       Block
11     0.288  0.972  1.260    [14]       Block
12     0.307  0.961  1.268    [14]       Block
13     0.414  0.885  1.300    [14]       Block
14     0.434  0.869  1.303    [14]       Block
15     0.512  0.798  1.310    [141]      Block

Table 3.1. Coding schemes shown in Figure 3.18.

Example 3.9 (Uniquely Decodeable Code for the Two-User BAC). Until 1985, the best known code for the 2-BAC was also one of the simplest. It assigns to user X1 the codebook C1 = {00, 11}, and to user X2 the words C2 = {00, 01, 10}. The output words resulting from C1 × C2 are shown in Table 3.2. This code was first given in [67]. For this code, R1 = 0.5 and R2 = 0.792. The sum rate is R1 + R2 = 1.29. This code is represented by point 1 in Figure 3.18. From the table, one can verify that all output codewords are distinct, hence the code is uniquely decodeable.

      C2
C1    00  01  10
00    00  01  10
11    11  12  21

Table 3.2. Uniquely decodeable rate 1.29 code for the two-user BAC.

Let us now assume that we want to increase the rate of the code by including another codeword for user 1, say 01. Table 3.3 shows the new codebook pair, and the corresponding channel outputs. We now have the problem that two channel outputs are ambiguous: the output 01 could have been caused by the transmission (user 1, user 2) of either (01, 00) or (00, 01). Similarly, the output 11 is ambiguous. The best that the decoder could do in such cases would be to choose the message pair at random, which would of course result in an irreducible error rate.

      C2
C1    00  01  10
00    00  01  10
11    11  12  21
01    01  02  11

Table 3.3. Non uniquely decodeable code for the two-user BAC.

It is interesting to note that the sum rate of the code in the previous example, R1 + R2 = 1.2925, made it the best code in terms of sum rate until 1985, when it was beaten by an R1 + R2 = 1.306 code [14]. More recently, this record was pushed to R1 + R2 = 1.30999 [141]. The difficulty in improving the sum rate of such codes may turn out to be a limitation of the zero-error capacity of the channel.
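The unique-decodeability check illustrated by Tables 3.2 and 3.3 is mechanical: form every componentwise sum and look for collisions. A minimal sketch:

```python
from itertools import product

# Mechanical check of Definition 3.6 for the two-user BAC: a code pair is
# uniquely decodeable iff all componentwise-sum outputs are distinct.
def uniquely_decodeable(C1, C2):
    outputs = {tuple(a + b for a, b in zip(c1, c2))
               for c1, c2 in product(C1, C2)}
    return len(outputs) == len(C1) * len(C2)

C1 = [(0, 0), (1, 1)]
C2 = [(0, 0), (0, 1), (1, 0)]
print(uniquely_decodeable(C1, C2))             # True: the code of Table 3.2
print(uniquely_decodeable(C1 + [(0, 1)], C2))  # False: Table 3.3 is ambiguous
```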

Kasami and Lin were the first to consider code construction for the two-user BAC [67]. They considered code constructions in which C1 is taken to be a linear (n, k) block code [77] which contains the all-ones codeword 11...1. The members of C2 are then chosen from the cosets of C1. One particular UD construction is to select the members of C2 as the coset leaders of the cosets of C1 which are not equal to C1, their one's complements, and the all-zeros codeword 00...0. Using this technique one can construct codes with rate vector

R = ( k/n, (1/n) log2( 2^{n−k+1} − 2 ) ).

They also provide bounds on the number of vectors that can be taken from the cosets of C1.

Example 3.10 (Kasami-Lin Construction). Let us now use this method to construct a two-user code. Let the codebook for user 1, C1, be the (7, 4) Hamming code. This code is perfect [77], which means that it has as coset leaders every error pattern of weight 0 or 1. We include these eight coset leaders in the codebook for user 2. According to the construction, we also include the one's complement of each of these codewords, except for the all-one word, which is already in C1. This gives |C2| = 15. Therefore this code has rate vector R = (0.571, 0.558) and is point 2 on Figure 3.18. The two codebooks are shown in Table 3.4.
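Example 3.10 can be reproduced programmatically. In the sketch below the generator rows are one conventional choice for the cyclic (7,4) Hamming code (an assumption, since the text does not fix a generator matrix); C2 is built as in the construction, and unique decodeability on the 2-BAC is verified exhaustively.

```python
from itertools import product

# Sketch of the Kasami-Lin construction of Example 3.10. The generator rows
# below are one conventional choice for the cyclic (7,4) Hamming code. C2 is
# the all-zero word, the seven weight-one coset leaders, and the one's
# complements of the weight-one words.
n = 7
gen = [(1, 1, 0, 1, 0, 0, 0), (0, 1, 1, 0, 1, 0, 0),
       (0, 0, 1, 1, 0, 1, 0), (0, 0, 0, 1, 1, 0, 1)]

C1 = {tuple(sum(m * g[i] for m, g in zip(msg, gen)) % 2 for i in range(n))
      for msg in product([0, 1], repeat=4)}

weight1 = [tuple(int(i == j) for i in range(n)) for j in range(n)]
C2 = {(0,) * n} | set(weight1) | {tuple(1 - b for b in w) for w in weight1}

outputs = {tuple(a + b for a, b in zip(c1, c2)) for c1 in C1 for c2 in C2}
print(len(C1), len(C2), len(outputs) == len(C1) * len(C2))
```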

Coebergh van den Braak and van Tilborg have described another construction for the 2-BAC [14]. They construct code pairs at sum rates above 1.29, but only marginally. The highest rate code presented has n = 7 and R = (0.512, 0.793), giving sum rate 1.30565. This is shown as point 3 in Figure 3.18. Apart from codes with high sum rate, they also construct many codes achieving points very close to the single-user constraint (i.e. with rate for one of the users very close to 1, and a nonzero rate for the other user).

C1        C2
0000000   0000000
1101000   0000001
0110100   0000010
1011100   0000100
1110010   0001000
0011010   0010000
1000110   0100000
0101110   1000000
1010001   1111110
0111001   1111101
1100101   1111011
0001101   1110111
0100011   1101111
1001011   1011111
0010111   0111111
1111111

Table 3.4. Rate R = (0.571, 0.558) uniquely decodeable code for the two-user binary adder channel.

Kasami and Lin also investigated codes for the noisy BAC, and in [68] give bounds on the achievable rates for certain block coding schemes for the noisy channel. In [69] they present a reduced-complexity decoding scheme, which however is still exponentially complex as the code length increases. Van Tilborg gives a further upper bound on the sum rate for the two-user BAC, where C1 is linear [136]. Kasami et al. [70] have used graph-theoretic approaches to improve upon lower bounds for the Kasami–Lin codes. They relate the code design problem to the independent set problem of graph theory, and use the Turán theorem, which gives a lower bound on the independence number in terms of the numbers of vertices and edges of the graph. It is interesting however to note that if R1 > 0.5, the best code pairs require both codes to be non-linear [160].

K-user binary adder channel

Chang and Weldon [18] have constructed codes using an iterative method for the K-BAC. Their construction for the noiseless case (they also present a similar construction for the noisy channel) is based on a difference matrix d ∈ {−1, 0, 1}^{K×n} with linearly independent rows. User k is assigned two codewords, c_{k,1} and c_{k,2}, obtained from d_k, the kth row of d, in the following way:

( (c_{k,1})_l , (c_{k,2})_l ) =  (0, 0) if (d_k)_l = 0
                                 (1, 0) if (d_k)_l = 1
                                 (0, 1) if (d_k)_l = −1

The iteration is on the matrix d. Let d_0 = [1]. Then

d_j = [ d_{j−1}      d_{j−1}
        d_{j−1}     −d_{j−1}
        I_{2^{j−1}}  0_{2^{j−1}} ]

defines a K = (j + 2) 2^{j−1} user code of length 2^j, where I_{2^{j−1}} and 0_{2^{j−1}} are the 2^{j−1} × 2^{j−1} identity and all-zero matrices, respectively.

Example 3.11 (Chang-Weldon Construction). Let us consider the Chang-Weldon construction for 3 users. We generate the difference matrix

d_1 = [ 1   1
        1  −1
        1   0 ]

This gives the codebooks C1 = {11, 00}, C2 = {10, 01}, C3 = {10, 00}. The code has rate vector R = (1/2, 1/2, 1/2).

This construction is interesting because of the following theorem, which says that these codes are asymptotically optimal in a certain sense.

Theorem 3.13. The Chang-Weldon construction is asymptotically good, in the sense that as the number of users increases, the ratio of the sum code rate to the sum capacity approaches unity:

lim_{K→∞} R(K) / Csum = 1.

Although the ratio of the code rate to the sum capacity approaches one, the absolute difference increases without bound, growing with the logarithm of j.

Ferguson [41] generalizes these codes by replacing the identity and all-zero matrices in the iteration above with arbitrary matrices A, B with entries from {−1, 0, 1} such that the modulo-2 reduction of the sum A + B is an invertible matrix. Ferguson also determines the size of the equivalence classes of the Chang–Weldon codes.

Hughes and Cooper [61] have investigated codes for the K-user binary adder channel, where the users have different rates. They modify the Chang-Weldon construction [18] to distribute the codewords among as few users as possible. The main result is that K-user codes may be constructed for any rate point within the polytope derived from the capacity region by subtracting 1.09 bits/channel use from each constraint. In particular, they construct a family of K-user codes with sum rate R(K) ≥ Csum − 0.547.

Frame-Asynchronous Binary Adder Channel

Block codes for the frame-asynchronous two-user BAC have been considered by Deaett and Wolf [27, 165]. One such simple code assigns two codewords of length n1 to user 1: the all-one word 11...1 and the all-zero word 00...0. The codebook for the second user is all binary n2-tuples such that the first symbol is 0 and the word does not contain n1 consecutive ones. The maximum rate pair is achieved for n1 = 3: R = (0.33, 0.878). This is point 6 on the graph. Plotnik [98] considers the code construction problem for the frame-asynchronous channel, with random access to the channel.

Example 3.12 (Deaett-Wolf Construction). Let us now construct a code for the two-user frame-asynchronous channel, with n1 = 2 and n2 = 4. To user 1 we assign the codebook C1 = {00, 11}. According to the construction, we assign to user 2 the codebook {0000, 0001, 0010, 0100, 0101}.
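The second codebook is easily enumerated for any (n1, n2). A sketch reproducing the codebook of this example:

```python
from itertools import product

# Enumerate the Deaett-Wolf codebook C2: all binary n2-tuples whose first
# symbol is 0 and which contain no run of n1 consecutive ones.
def deaett_wolf_C2(n1, n2):
    return [''.join(w) for w in product('01', repeat=n2)
            if w[0] == '0' and '1' * n1 not in ''.join(w)]

print(deaett_wolf_C2(2, 4))  # ['0000', '0001', '0010', '0100', '0101']
```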

K Users, N-ary Orthogonal Signal Channel

Chang and Wolf [19] have constructed codes for a generalization of the K-BAC to larger input alphabets. In particular, they consider a channel in which each user may, at each symbol interval, activate one of N orthogonal frequencies, {f1, f2, ..., fN}. Of course, we can consider the use of any set of orthogonal signals, but for convenience, we shall refer to frequencies. The receiver observes the output of each frequency. Depending upon what type of observation is made of each frequency, two channels are defined: the channel with intensity information, and the channel without intensity information. For the former case, the number of users transmitting on each frequency is available to the receiver. The latter case provides only a binary output of active/not active for each frequency.

Three code constructions are given for the channel without intensity information. All of these constructions are characterized by the use of unique frequencies, or frequency patterns, as markers for each user. This has the effect of transforming the system into a frequency division multi-access system.

The first construction, for K < N, assigns two codewords of length one to each user, Ck = {f1, fk+1}. This construction has sum rate N − 1. The remaining constructions are for the K = 2 channel, for which the reader is referred to the original work. Wilhelmsson and Zigangirov construct codes for this channel with polynomial decoding complexity [162].

The construction for the channel with intensity information is a generalization of the approach of Chang and Weldon [18]. As for the Chang–Weldon codes, their construction is asymptotically optimal if one increases K → ∞ while fixing N. For further results concerning coding for the K-BAC, the interested reader is referred to [17]. Mathys considers the case when random access to the channel is allowed [86].

Binary Switching Channel

Vanroose [144] considers coding for the two-user binary switching MAC, which is a counterpart to the two-user BAC. Each user transmits symbols from {0, 1}, but the output is Y = x1/x2, where division by 0 results in the ∞ symbol. This is of interest, as it is the only other ternary-output, binary-input MAC form. The capacity region is determined, and is found to touch the total cooperation line. In addition, the capacity region is also shown to be the zero-error capacity region.

3.5.2 Convolutional and Trellis Codes

The subject of designing trellis codes for multiple-access channels has received considerably less attention than that of block codes. Work has focused entirely on the binary adder channel.

Peterson and Costello have investigated convolutional codes for the two-user binary adder channel [96]. They introduce the concept of a combined two-user trellis (see Figure 3.19), and define a distance measure, the L-distance, between any two channel output sequences. They go on to prove several results, among them conditions for unique decodeability and conditions for catastrophicity. In a rather striking theorem, they show that no convolutional code pair for the two-user BAC exists at a sum rate greater than 1, which could have been achieved anyway with no cooperation.

Figure 3.19 shows a combined trellis for a two-user uniquely decodeable code. The branch labels on the trellis are of the form u1u2 | v1v2. If user 1 and user 2 input u1 and u2 respectively to their separate convolutional encoder inputs, the channel output is the symbol v1 followed by the symbol v2. The state labels on the combined trellis are of the form s1s2, where sk is the state of user k's individual code.

Decoding takes place by simply applying the Viterbi decoding algorithm [152] to the combined trellis, using the L-distance as the metric. This code achieves the upper bound set for convolutional codes on this channel, and is shown as point 4 in Figure 3.18.

Chevillat has investigated trellis coding for the K-BAC [20]. He finds convolutional code pairs with large dL,free and gives a two-user non-linear trellis code for the BAC, found by computer search. This code is shown in Figure 3.20. It possesses sum rate 1.25, and is point 5 on the comparison figure. Peterson and Costello compute bounds on error probability and free distance for arbitrary two-user multiple-access channels [95]. Sorace gives an algebraic random coding performance bound [130].

3.6 Superposition and Layering

We shall now see that the vertices of achievable rate regions possess special properties that provide a "single-user" interpretation of the capacity region. First, let us formally describe these vertices, or extreme points, of the achievable rate region. Vertices are rate points R that have (after a possible re-indexing of the users) elements of the following form⁶:

6 See [57] for the one-to-one correspondence between such points and vertices of the rate region.

[Figure: single-user two-state trellises for user 1 (branch labels 0|00, 1|11, 0|01, 1|10) and user 2 (branch labels 0|00, 1|11, 0|10, 1|01), and the combined trellis on states 00, 01, 10, 11 with branch labels u1u2 | v1v2, e.g. 00|00, 01|11, 10|11, 11|22 out of state 00.]

Fig. 3.19. Combined 2-user trellis for the BAC.

[Figure: two-state trellises for user 1 and user 2, each branch labeled with its set of assigned 4-bit outputs (e.g. for user 2, the branches carry the sets 0001,1110; 0011,1100; 0000,1111; 0011,1100).]

Fig. 3.20. Two-user nonlinear trellis code for the BAC.

Ri = I( Xi; Y | X{1,2,...,i−1} ),    (3.18)

that is, R looks something like

R = ( I(X1; Y), I(X2; Y | X1), ..., I(XK; Y | X1, X2, ..., XK−1) ).

Note that by definition of the achievable rate region (3.1), such points are on the boundary of the region. In Section 3.3.1, we saw that the vertices of the achievable rate region (in that case coincident with the capacity region) were achievable by interference cancellation. This property holds for all vertices, and is in fact due to the form of the vertices (3.18). At a vertex, the users may be ordered such that the maximum rate is given by the mutual information between each user and the output, conditioned only on knowledge of "previous" users, where previous is defined by the ordering.

[Figure: successive cancellation block diagram. Sources 1-3 encode X1, X2, X3 onto a linear multiple-access channel with additive noise. Decoder 1 decodes X1 at rate I(X1; Y) with X2, X3 as interference; X1 is subtracted, and decoder 2 achieves I(X2; Y | X1) with X3 as interference; after subtracting X2, decoder 3 achieves I(X3; Y | X1, X2) with no interference.]

Fig. 3.21. Successive cancellation approach to achieve vertex of capacity region.

With reference to Figure 3.21, the achievability of the vertices is proved using a successive decoding argument as follows. The first user treats all other users as noise, and sees a discrete memoryless channel. By the standard random-coding argument, this user may transmit at rate R1 ≤ I(X1; Y) with arbitrarily low error probability, which is the rate point required by (3.18). We now assume that we have used such a coding scheme for this user, and that we know perfectly the transmitted data, which is now available for subsequent users. The second user now treats the remaining users k > 2 as noise, and once again from the single-user random coding argument, may transmit at R2 ≤ I(X2; Y | X1), recalling that the users transmit independently. We continue this argument down the chain for all users, each transmitting using single-user codes which are decoded one by one, using knowledge of all previous users. For linear channels, such as the binary adder channel or the Gaussian multiple-access channel, the previous users' data may be incorporated into the decoding process by subtracting it from the channel output, hence the name interference cancellation.

A common objection to such schemes is the "error propagation" argument, whereby errors in one decoding step lead to even more errors in the next. However, in proving the achievability of these rate points by successive cancellation, we do not suffer from this problem, as the single-user coding theorem guarantees the existence of codes with vanishing error probabilities. In fact, it is possible to bound the probability that an error is made anywhere in the step-by-step process by a function which tends to zero exponentially with the codeword length. Error propagation is only a concern for a practical implementation of the scheme.

In order to properly understand the effect of asynchronism in the next section, we first need to better understand the role of the convex hull operation in Theorem 3.1. This convex hull operation is due to the idea of time-sharing. Given that two rate vectors R1 and R2 are achievable, every point on the line connecting R1 and R2 is also achievable, simply by using the codebook corresponding to the point R1 for λn symbols and that corresponding to R2 for the remaining (1 − λ)n, 0 ≤ λ ≤ 1. In general, we can achieve any additional point that is a convex combination of any number of achievable points. Note that in order to implement this time-sharing scheme, a common time reference must be available, in order for the users to agree upon when to change transmission strategies.

Caratheodry’s theorem [32] states that every point in the convex closureof a connected compact set A is contained within at least one simplex7 whichtakes its vertices from A. This implies that every point within the capacityregion for a K user channel may be achieved by time-sharing between at mostK + 1 vertices, requiring each user to have access to K + 1 codebooks.

We now see that we may achieve any point in the capacity region by time-sharing between at most K + 1 successive cancellation schemes, each with K cancellation steps.

7 A simplex in d-dimensional space is a polytope with d + 1 affinely independent vertices, e.g. a tetrahedron for 3-dimensional Euclidean space.

3.7 Feedback

In this section, we will consider some results for the multiple-access channel under the assumption of perfect feedback, by which we mean that the encoder for each user has available all previous channel outputs. Figure 3.22 shows a two-user channel with feedback. It is a somewhat surprising result

[Figure: sources 1 and 2 with their encoders transmitting over a common channel, whose output is fed back to both encoders.]

Fig. 3.22. Two-user MAC with perfect feedback.

of single-user information theory that noiseless feedback from the receiver to the transmitter does not increase the capacity of a memoryless channel, a fact proved by Shannon in 1956 [124]. Feedback does however increase the capacity of channels with memory, by aiding the transmitter in predicting future noise, using the time-correlation properties of the channel. It is therefore not altogether surprising that feedback can increase the set of achievable rates for the multiple-access channel. Such feedback essentially enables the users to cooperate in their transmissions to some degree. We shall also see that the use of feedback can simplify the coding schemes required for transmission. As a motivation, let us see how we can use feedback in a simple way to increase the set of achievable rates. The following example is due to Gaarder and Wolf [48], and was historically the first example of feedback increasing the set of achievable rates.

Example 3.13 (Gaarder-Wolf Feedback Scheme). Consider the two-user binary adder channel, X1, X2 ∈ {0, 1}, Y = X1 + X2 ∈ {0, 1, 2}. At the output, Y = 1 is the only ambiguous symbol. Call this output an erasure. Transmission will take place in two stages.

Stage 1: In the first stage, each user transmits at rate 1. At the output of the channel, the decoder will not be able to successfully decode, since there are erasures. However, each user observes the channel output, which, combined with the knowledge of its own transmitted data, allows the deduction of the transmitted data of the other user.

Stage 2: The users can now cooperate to re-transmit those bits of one user, say user 1, which suffered erasure. This can be done at a rate of log2 3, since both users know from stage 1 what must be transmitted. The receiver can easily determine the second user's erased bits from those of the first.

What rates can be achieved using this method? Consider a block of n transmissions. Let the first stage use λn symbols. With probability exponentially approaching 1 with n, there will be close to (1/2)λn erasures in the first stage. For each erasure, we must provide 1 bit of information in stage 2, i.e. we must have

(1 − λ) n log2 3 = (1/2) λn.

Solving for λ, which is the rate for each user, we find

R1 = R2 = λ = 2 log2 3 / (1 + 2 log2 3).

The resulting sum rate 1.52 exceeds the sum constraint of 1.5 for the channel with no feedback.
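The rate balance of this example is a one-line computation; as a check (in bits):

```python
import math

# Arithmetic of Example 3.13: stage 2 must carry one bit per stage-1
# erasure, i.e. (1 - lam) * log2(3) = lam / 2, giving the per-user rate lam.
log2_3 = math.log2(3)
lam = 2 * log2_3 / (1 + 2 * log2_3)
assert abs((1 - lam) * log2_3 - lam / 2) < 1e-12  # rate balance holds
print(round(2 * lam, 4))  # sum rate 1.5204 > 1.5 (no-feedback sum bound)
```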

This two-stage approach can be thought of in the following way. In the first stage, the users transmit at the highest possible rate to each other via the feedback link. The receiver cannot fully decode, but may be able to partially decode, e.g. with a list decoder. During the second stage, the users cooperate to resolve any ambiguity at the output. For the case of list decoding, they would transmit the index of the correct codeword in the list. This list-decoding method was in fact used to show the following achievable rate region for the two-user discrete memoryless MAC, where either one or both users can observe the channel output.

Theorem 3.14 (Cover-Leung Achievable Rate Region). The following is an achievable rate region for the K = 2 discrete memoryless multiple-access channel (XK; p(y | xK); Y) with perfect feedback to one or both users:

R[pU,X1,X2,Y(u, x1, x2, y)] = { R : 0 ≤ R1 ≤ I(X1; Y | X2, U),
                                    0 ≤ R2 ≤ I(X2; Y | X1, U),
                                    0 ≤ R1 + R2 ≤ I(X1, X2; Y) },

where the joint distribution pU,X1,X2,Y(u, x1, x2, y) is of the form

pU,X1,X2,Y(u, x1, x2, y) = pU(u) pX1|U(x1 | u) pX2|U(x2 | u) pY|X1,X2(y | x1, x2)    (3.19)

and U ∈ U is an arbitrary random variable, with

|U| ≤ min{ |X1| |X2| + 1, |Y| + 2 }.

The achievability of this region was originally shown in [23], for the case of feedback to both users. It was shown to also be achievable with feedback to only one user in [164].

For a certain class of channels, the achievable rate region of Theorem 3.14 coincides with the capacity region. The converse required to show the optimality of the Cover-Leung region for these channels was proved by Willems in [163].

Theorem 3.15 (Optimality of Cover-Leung Region). Consider the class of two-user discrete memoryless multiple-access channels in which at least one input is a deterministic function of the output and the other input, i.e. either H(X1 | Y, X2) = 0 or H(X2 | Y, X1) = 0. For channels within this class, the rate region of Theorem 3.14 is the capacity region, for the case of perfect feedback to one or both users.

The binary adder channel is a member of the class of channels described in Theorem 3.15.

Example 3.14 (Binary Adder Channel with Feedback). The capacity region for the binary adder channel with feedback is shown in Figure 3.23. Also shown is the total cooperation line, R1 + R2 = log2 3 = 1.58496. The maximum sum rate achievable with feedback is at the equal rate point, (R1, R2) = (0.7911, 0.7911), giving a sum value of 1.5822, slightly less than that for total cooperation. For the binary adder channel, as the number of users increases, this difference vanishes, and with feedback, the total cooperation rate may be achieved at the equal rate point.

We now describe a simple scheme due to Kraemer [74], which achieves the capacity region for the two-user BAC (and in fact any channel for which H(X1 | X2, Y) = 0). The system diagram is shown in Figure 3.24. X1 and X2 generate new information such that Pr(X1 = 0) = Pr(X2 = 0) = p. The random variable V ∈ {0, 1}, which is common to both transmitters, is generated from the feedback signal in the following way. The encoder for V takes as its input the fed-back output Y, via a mapping which outputs 1 if Y = 1, and 0 otherwise (i.e. the mapping is an erasure detector). This feedback signal is encoded, using an identical random block code for each user, at a rate RV. Each user transmits the modulo-2 sum Xk ⊕ V. The cooperative random variable V serves to resolve the ambiguity about the erased Y = 1 channel outputs.

In order for the scheme to work, note that for each Y = 1 erasure, V must successfully transmit one bit of information to the receiver, i.e. we must have

RV ≥ Pr(Y = 1),    (3.20)

with arbitrarily low error probability for V. Under these conditions, the rate point achieved will be (R1, R2) = (h(p), h(p)). Let us now calculate these rates. V sees the binary symmetric erasure channel shown in Figure 3.25. The capacity of this channel can be found as follows:

RV ≤ max_{Pr(V=0)} I(V; Y) = [ (1 − p)² + p² ] · [ 1 − h( p² / ((1 − p)² + p²) ) ],    (3.21)

[Figure: R1-R2 plane.]

Fig. 3.23. Capacity region for the two-user binary adder channel with feedback.

for Pr(V = 0) = 1/2. Note also that

Pr(Y = 1) = 2p(1− p). (3.22)

Substituting the maximum value for RV (3.21) and the erasure probability (3.22) into (3.20) and solving for p, we find p = 0.23766, hence

(R1, R2) = (0.7911, 0.7911),

which is on the boundary of the capacity region.

This scheme is a special case of the Cover-Leung list decoding scheme. For each received codeword of length n, we have a certain number of erasures, e. The decoder for Y generates a list with 2^e entries, consisting of each possible codeword, assuming all combinations of values for the erased symbols. However, superimposed upon this "fresh" information is the resolution information V, which can be thought of as the index into the list (for the previous codeword).
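The threshold p = 0.23766 is the root at which the capacity (3.21) of the channel seen by V equals the erasure probability (3.22). A bisection sketch (the bracketing interval is an assumption, checked numerically):

```python
import math

# Numerically solve (3.21) = (3.22) for p by bisection: the capacity of the
# channel seen by V must equal the erasure probability Pr(Y = 1).
def h2(x):
    """Binary entropy function in bits."""
    return 0.0 if x in (0.0, 1.0) else -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def excess(p):
    """Capacity of V's channel, eq. (3.21), minus Pr(Y = 1) = 2p(1 - p)."""
    q = (1 - p) ** 2 + p ** 2
    return q * (1 - h2(p ** 2 / q)) - 2 * p * (1 - p)

lo, hi = 0.1, 0.4  # assumed bracket: excess is positive at lo, negative at hi
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if excess(mid) > 0 else (lo, mid)

p = 0.5 * (lo + hi)
print(round(p, 5), round(h2(p), 4))  # p close to 0.23766, rate close to 0.7911
```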

Example 3.15 (Gaussian Multiple-Access Channel with Feedback). The capacity region for the white Gaussian noise multiple-access channel with feedback was found by Ozarow [93]. Consider the two-user white Gaussian multiple-access channel in which, for k = 1, 2, each source Xk ∈ R is subject to an average power constraint E[Xk²] ≤ Pk. The channel output is given by

[Figure: sources X1 and X2 are each combined (via the map) with the common feedback-derived variable V before encoding; the two encoder outputs are added on the channel.]

Fig. 3.24. Simple feedback scheme for the binary adder channel.

[Figure: the binary-input, ternary-output channel seen by V, with inputs V ∈ {0, 1}, outputs Y ∈ {0, 1, 2}, transition probabilities (1 − p)² and p², and erasure probability Pr(Y = 1) = 2p(1 − p).]

Fig. 3.25. Channel seen by V.

Y = X1 + X2 + z, where z ∼ N(0, σ²). The capacity region for this channel under the assumption of perfect feedback to both users is given by

C = ⋃_{0 ≤ ρ ≤ 1} { (R1, R2) :

    0 ≤ R1 ≤ (1/2) log( 1 + P1(1 − ρ²)/σ² ),

    0 ≤ R2 ≤ (1/2) log( 1 + P2(1 − ρ²)/σ² ),

    0 ≤ R1 + R2 ≤ (1/2) log( 1 + (P1 + P2 + 2ρ√(P1P2))/σ² ) }.

The feedback capacity region (for the case P1 = P2 = σ² = 1) is shown in Figure 3.26. Also shown for reference is the non-feedback region.

[Figure: R1-R2 plane.]

Fig. 3.26. Capacity region for the two-user GMAC channel with feedback.

In the case of non-white noise, the capacity region is upper-bounded by the non-feedback region, scaled by a factor of two [92].
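For P1 = P2 = σ² = 1, the maximum sum rate of this region can be found by sweeping ρ; the sketch below (rates in bits) confirms that feedback strictly improves on the non-feedback sum rate (1/2) log2 3 ≈ 0.7925.

```python
import math

# Sweep rho to find the maximum sum rate of Ozarow's feedback region for
# P1 = P2 = sigma^2 = 1 (rates in bits). For each rho the sum rate is
# limited by both the joint bound and the sum of the individual bounds.
best = 0.0
for i in range(10001):
    rho = i / 10000.0
    individual = math.log2(2.0 - rho ** 2)    # R1max + R2max
    joint = 0.5 * math.log2(3.0 + 2.0 * rho)  # sum-rate bound
    best = max(best, min(individual, joint))

no_feedback = 0.5 * math.log2(3.0)
print(round(best, 4), round(no_feedback, 4))  # feedback strictly helps
```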

3.8 Asynchronous Channels

In the preceding discussion of the capacity region, two types of synchronism were assumed. The first was symbol synchronism: the users strictly align their symbol epochs to common time boundaries. The second type of synchronism was frame synchronism, in which the users align their codewords to common time boundaries. We shall see that loss of either type of synchronism changes the region of achievable rates.

Frame-Asynchronism

First, we shall consider the case in which symbol synchronism is maintained, but frame synchronism is not. It has been shown in [100] and [63] that the capacity region for this channel is given by the following theorem.

Theorem 3.16. The capacity region of the symbol synchronous, frame asynchronous memoryless multiple-access channel (X_K ; p(y | x_K) ; Y) is the closure of

C [p(y | x_K)] = ⋃_{π(x_K)} R [π(x_K), p(y | x_K)],

where R[·] is defined in (3.1) and π(x_K) is the family of product distributions on the sources.

In other words, the loss of frame synchronism removes only those extra rate points included by the closure of the convex hull operation which are not already included in the union of achievable rate regions. Intuitively, we can see that if frame synchronism is lost, the users cannot coordinate their transmissions to time-share between two achievable points, and thus we cannot apply the convex hull operation. The proof of Theorem 3.16 however still requires that the remaining "union" points are shown to be achievable without frame synchronism. For many channels, the union region is already convex, for example the binary adder channel and the Gaussian multiple-access channel. For such channels, the removal of frame synchronism has no effect upon the capacity region. We shall see however that there are channels where this is not true.

An interesting question is: Does the loss of frame-synchronization destroy our single-user interpretation of the capacity region, which relied on a globally known time reference, as discussed in Section 3.6? The answer is no, but we must slightly change our single-user strategy [57].

Complete Asynchronism

In modeling the completely asynchronous case, a new concept must be introduced. Whereas the synchronous and frame-asynchronous channels were discrete time, the completely asynchronous channel must be modeled as a continuous time multiple-access channel⁸. The general form of the capacity region for the completely asynchronous channel is at present unknown. We now present two examples for which the capacity region is known, namely the completely asynchronous Gaussian MAC [148] and the collision channel without feedback [84].

⁸ It is tempting to try to model the channel as discrete time, with smaller time increments, but this is really the same as the symbol synchronous case.


Example 3.16 (Completely Asynchronous Gaussian MAC). The completely asynchronous two-user Gaussian multiple-access channel has been studied by Verdu [148]. The model differs a little from the synchronous GMAC model, since we now have a continuous time waveform channel.

Each user k transmits a length n codeword (xk[1], xk[2], . . . , xk[n]) ∈ ℝⁿ, at a rate of one codeword symbol every T seconds, by sending the linear modulation of a fixed "signature" waveform sk(t). The signature waveforms are zero outside the interval [0, T]. The channel output is therefore represented as

y(t) = Σ_{i=1}^{n} x1[i] s1(t − iT − τ1) + Σ_{i=1}^{n} x2[i] s2(t − iT − τ2) + z(t),

where z(t) is white Gaussian noise with power spectral density σ², and the delays τ1, τ2 ∈ [0, T) introduce the symbol asynchronism. It is necessary to assume that the receiver knows these delays, but they are unknown to the transmitters. We must also apply an energy constraint,

(1/n) Σ_{i=1}^{n} xk²[i] ≤ wk.

The capacity region is found by considering an "equivalent" discrete time channel, which has the same capacity as the continuous time one. This can be done if the outputs of the discrete time channel are sufficient statistics for the original channel. It can be shown that the outputs of filters matched to s1(t) and s2(t), sampled at iT + τ1 and iT + τ2 respectively, are indeed such sufficient statistics. The capacity region is given by

C = ⋃ { (R1, R2) :

    R1 ≤ (1/4π) ∫_{−π}^{π} log( 1 + S1(ω)/σ² ) dω,

    R2 ≤ (1/4π) ∫_{−π}^{π} log( 1 + S2(ω)/σ² ) dω,

    R1 + R2 ≤ inf_{(ρ12, ρ21)∈Γ} (1/4π) ∫_{−π}^{π} log( 1 + (S1(ω) + S2(ω))/σ² + (S1(ω)S2(ω)/σ⁴) · [1 − ρ12² − ρ21² − 2ρ12ρ21 cos ω] ) dω },

where the inputs to the channel are stationary Gaussian processes with power spectral densities S1(ω) and S2(ω), with Sk(ω) ≥ 0, ω ∈ [−π, π], and the union is over all processes conforming to the energy constraint

(1/2π) ∫_{−π}^{π} Sk(ω) dω = wk. (3.23)

The ρij terms are the cross correlations between the signature waveforms (assuming τ1 ≤ τ2):


ρ12 = ∫_0^T s1(t) s2(t + τ1 − τ2) dt,

ρ21 = ∫_0^T s1(t) s2(t + T + τ1 − τ2) dt.

The infimum in the last constraint is over Γ = {(ρ12, ρ21)}, which is the set of possible cross correlations, given the signature waveforms. This represents the fact that the users do not know their relative time offsets, and hence do not know their cross correlations.
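The set Γ can be tabulated numerically for concrete waveforms. The sketch below uses assumed unit-energy rectangular-chip signatures and sweeps the relative offset δ = τ2 − τ1 over [0, T); for unit-energy waveforms every pair in Γ satisfies ρ12² + ρ21² ≤ 1, since the two shifted segments of s2 have disjoint supports:

```python
import math

def chip_waveform(chips, T=1.0):
    """Unit-energy rectangular chip waveform on [0, T) (an assumed example)."""
    L = len(chips)
    def s(t):
        return chips[int(t / T * L)] / math.sqrt(T) if 0.0 <= t < T else 0.0
    return s

def cross_correlations(s1, s2, delta, T=1.0, M=8000):
    """rho12 = integral of s1(t) s2(t - delta), rho21 = integral of s1(t) s2(t + T - delta),
    with delta = tau2 - tau1 in [0, T); midpoint-rule numerical integration."""
    dt = T / M
    ts = [(i + 0.5) * dt for i in range(M)]
    rho12 = sum(s1(t) * s2(t - delta) for t in ts) * dt
    rho21 = sum(s1(t) * s2(t + T - delta) for t in ts) * dt
    return rho12, rho21

s1 = chip_waveform([1, 1, 1, 1])     # assumed length-4 chip signatures
s2 = chip_waveform([1, -1, 1, -1])   # orthogonal to s1 when time-aligned
gamma = [cross_correlations(s1, s2, dlt / 16.0) for dlt in range(16)]
```

With these two signatures, aligned waveforms (δ = 0) give (ρ12, ρ21) = (0, 0), while intermediate offsets give the nonzero cross correlations over which the infimum in the sum-rate constraint is taken.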

The capacity region therefore depends upon the cross-correlation between the users' signature waveforms. It is interesting to see what happens if the users are assigned the same waveform. In this case, Γ contains (0, 1) and 1 − ρ12² − ρ21² − 2ρ12ρ21 cos ω = 0, which is the minimum value that it can take. It is now easy to see that the resulting capacity region is exactly the same as that of the synchronous channel of Example 3.4⁹. If the waveforms are however different, the resulting capacity region is a larger polytope, with rounded corners. The rounded corners are due to the fact that there is no combination of S1(ω) and S2(ω) which simultaneously maximizes all constraints. Figure 3.27 shows the capacity regions for an equal power asynchronous channel. Region A is for the completely correlated waveform case. Region B is for waveforms that are orthogonal when τ1 = τ2. Note that in either case, the capacity region is still convex.

⁹ If it is not so easy to see, consider each user as a white Gaussian process, which together with (3.23) gives Sk(ω) = wk.

Example 3.17 (Collision Channel without Feedback). The collision channel without feedback was proposed by Massey and Mathys [84]. This channel has a number of distinct features, which makes it an interesting comparison to the channels already described. In particular, it attempts to model random accessing of the channel.

The channel is described as follows. User k sends a packet of fixed duration T, with some probability pk. Users' transmissions are not synchronized in any way, and there exists no feedback path to the users, so they can never determine the degree of asynchronism, whereas the receiver can. At the receiver, a collision is said to have occurred if two or more packets overlap by any non-zero time duration. Any packet involved in such a collision is destroyed, and its information lost. In the absence of a collision, packets are assumed to be received successfully, and all the information contained is retrieved.

It is shown in [84] that the capacity region and the zero-error¹⁰ capacity region coincide, and this region further coincides with the corresponding regions obtained if slot (frame) synchronism is allowed. The outer boundary of the capacity region is given by

¹⁰ By zero-error it is meant that there exists a coding scheme such that Pe = 0. See [124].

[Figure: axes R1, R2; region A for identical waveforms, larger region B with rounded corners for waveforms orthogonal at τ1 = τ2.]

Fig. 3.27. Capacity region for symbol-asynchronous two-user Gaussian multiple-access channel.

Rk ≤ pk ∏_{j=1, j≠k}^{K} (1 − pj),

where pj ≥ 0 and Σ_{j=1}^{K} pj = 1. This is shown for two users in Figure 3.28. This region is not convex, and in fact the convexity of its first orthant complement was proved by Post in [103]. The symmetric capacity (the maximum sum rate achievable with every user at the same rate) of this channel approaches 1/e for large systems, which is equal to the maximum throughput of slotted ALOHA for an infinite number of identical users [13].
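The outer boundary and the 1/e limit are easy to check numerically (a sketch; the function names are ours):

```python
import math

def boundary_rate(p, k):
    """Outer boundary of the collision channel: R_k = p_k * prod over j != k of (1 - p_j)."""
    r = p[k]
    for j, pj in enumerate(p):
        if j != k:
            r *= 1.0 - pj
    return r

def symmetric_sum_rate(K):
    """All K users transmit with probability 1/K: sum rate (1 - 1/K)^(K-1)."""
    p = [1.0 / K] * K
    return sum(boundary_rate(p, k) for k in range(K))
```

With pk = 1/K the sum rate is (1 − 1/K)^{K−1}, which decreases monotonically toward 1/e ≈ 0.368 as K grows, matching the slotted-ALOHA throughput quoted above.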

Fig. 3.28. Capacity region for two-user collision channel without feedback.

4 Multiuser Detection

4.1 Introduction

In this chapter we explore basic detection techniques for linear multiple-access channels. These techniques exploit the structure of the interference signals and do not assume the interference to be uncorrelated Gaussian noise as is the case in conventional correlation detection. They make use of the fact that the interference signal is information bearing. We will make a distinction between multiuser detection, which deals with (uncoded) systems and essentially demodulates the transmitted signals without regard to any time correlation among the data symbols (coding), and multiuser decoding, which explicitly includes data dependencies in the form of forward error control coding (FEC). Multiuser decoding is treated in detail in Chapter 6, but these decoding methods are built on the detection techniques discussed here.

Multiuser detection was first proposed for CDMA systems in [121, 142, 147]. In [147] multiuser detection was shown to be able to handle the debilitating effect of different received power levels, the so-called near-far problem. This problem occurs when the signal of a close-by user drowns the signals of users further away which are received with less power. With conventional correlation reception, CDMA signals are very susceptible to the near-far problem, see Section 4.5.1. In cellular networks [134, 135], power control assures that all the users' received powers are kept approximately equal. Since no joint detection is attempted, the near-far problem can only be avoided by carefully monitoring and adjusting the transmission power levels of the transmitting users. The complexity of power control resides in the network. Multiuser detection, however, can alleviate or eliminate this network complexity by translating it into computational receiver complexity. Apart from the promise of higher spectral and power system efficiencies, it is this capability to eliminate the near-far problem which makes multiuser detection so interesting. Given the inexorable tendencies of Moore's law, multiuser detection and decoding have come close to being practical and economically interesting. Avoiding


unnecessary network complexity and locating it in the receiver instead has the potential to make future networks far more resource efficient.

Figure 4.1 lists some of the classes of multiuser detectors (and decoders) discussed in this book, loosely ordered by computational complexity and performance (not to scale). At the bottom of the diagram we find the conventional correlation detector, which does not use explicit joint detection principles and treats the signals of the interfering users as additive uncorrelated noise. It is a most simple receiver, based on conventional point-to-point transmission techniques, and affords adequate performance as long as the number of users is limited, usually to a fraction of the processing gain of the system [97, 154]. This kind of receiver is based on low-complexity terminals, but requires a sophisticated network system which provides control functions such as power control.

[Figure: classes ordered from low to high complexity and performance — Correlation Detection (IS-95, cdma2000, 3GPP; not near-far resistant); Linear Front End Decoders / Statistical Interference Cancellation (Decorrelator, LMMSE, Multistage; polynomial complexity); Cancellation/Bounded Search (Tree Search, Sphere Detector, List Detection); Iterative Decoders (Turbo Decoders); Optimal Detection and Optimal Decoding (Viterbi Algorithm, APP Algorithm; near-far resistant, exponential complexity); uncoded systems on one side, coded systems on the other.]

Fig. 4.1. Classification of multiuser detection and decoding methods.

At the top of the diagram of Figure 4.1, the most complex receivers perform a maximum-likelihood (ML) estimate of the data sequences of all the users. This receiver is near-far resistant, and forms a benchmark for ideal performance. Its computational complexity makes it largely uninteresting as a practical detector. We will discuss optimal detection in detail in Section 4.2. We also note that optimal, or ML-decoding, is not necessary to approach the capacity limit of the multiuser channel, since in the proof of Shannon's capacity theorems (Chapter 3) an ML detector does not have to be assumed.

Between the extremes, we find a number of detectors, which all share the property that they are near-far resistant, or nearly so. We have somewhat arbitrarily grouped them into three classes. The first class are the statistical interference cancelers. The decorrelator and minimum mean-square error detectors discussed in Sections 4.4.2 and 4.4.5 are the main representatives of this class. The term statistical interference cancellation refers to the fact that these receivers do not attempt to perform actual interference signal cancellation, but use statistical properties of the received and interfering signals in order to provide improved performance. These methods work surprisingly well with complexities much lower than that for optimal detection, primarily due to the fact that for the purpose of statistical cancellation, complete receivers for the interfering users are not needed.

The next class of multiuser detectors are the actual interference cancelers. Many of these detectors were originally of ad-hoc design. Typically they either decode users via some form of successive cancellation, or they are approximations to the optimum detector. Systems which successively cancel users' signals start by power-ordering the users, then subtract the influence of stronger users from the received signal and proceed recursively to decode subsequent users. This works well as long as the detection of the different users' transmitted symbols is correct. In fact, assuming error-free detection at each stage, we show in Section 5.2.2 that such successive cancellation receivers can achieve the capacity of the multiple-access channel. Without accurate decoding, however, such receivers suffer from potentially debilitating error propagation. In Chapter 5 on implementation aspects of multiuser detectors, we show that such interference cancelers can be viewed as variants of iterative implementations of the statistical interference cancelers, making cancellation techniques the most important practical detection and decoding concept for multiuser systems. The interference cancelers approximate the optimal detector. Among other approximations to the optimal detector are limited search algorithms such as branch-and-bound tree search algorithms as well as the sphere decoder. They typically suffer from losing the correct signal hypothesis, but provide excellent performance for a given computational effort.

More recently, principles of iterative decoding have successfully been applied to the problem of joint decoding of CDMA systems. Iterative decoding has become popular with the comet-like ascent of turbo and low-density parity-check codes [12, 81, 120] in recent years. Their application to linear multiple-access channels has led to very efficient iterative decoders. We will treat such iterative decoding systems in Chapter 6.

All of the CDMA multiuser detection methods are based on the particular channel model which arises from CDMA transmission, i.e. on the canonical linear-algebraic channel from Section 2.3

r = SAd + z. (4.1)


Recall that S is the modulation matrix whose columns are the spreading sequences of the different users and different transmission time intervals. In the sequel we will set the amplitude matrix A = I in some of the derivations for convenience. We do this only where A does not play an important role and its inclusion is a straightforward exercise. Note that the linearity of (4.1) makes many of the multiuser detectors possible, and indeed feasible complexity-wise.
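As a concrete illustration of (4.1), the following sketch (toy sizes and variable names are ours) builds a small synchronous random-spreading system, forms r = SAd + z, and computes both the matched-filter front end y = S∗r and the correlation matrix R = S∗S:

```python
import random

random.seed(1)
K, n, L = 3, 4, 16                 # users, symbol intervals, processing gain (toy sizes)
N = K * n                          # number of columns of S (one per user and interval)

# S: columns are +/- 1/sqrt(L) spreading sequences; in the synchronous case the
# columns of interval i occupy rows i*L .. (i+1)*L - 1 only.
S = [[0.0] * N for _ in range(n * L)]
for i in range(n):
    for k in range(K):
        for c in range(L):
            S[i * L + c][i * K + k] = random.choice((-1.0, 1.0)) / L ** 0.5

A = [1.0, 0.7, 0.4] * n            # diagonal of the amplitude matrix (a near-far profile)
d = [random.choice((-1, 1)) for _ in range(N)]
sigma = 0.1

# r = S A d + z
r = [sum(S[row][j] * A[j] * d[j] for j in range(N)) + random.gauss(0.0, sigma)
     for row in range(n * L)]

# Matched-filter front end y = S* r and correlation matrix R = S* S
y = [sum(S[row][j] * r[row] for row in range(n * L)) for j in range(N)]
R = [[sum(S[row][a] * S[row][b] for row in range(n * L)) for b in range(N)]
     for a in range(N)]
```

Because columns of different intervals have disjoint support here, R comes out block-diagonal with unit diagonal, which is exactly the structure exploited by the detectors below.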

4.2 Optimal Detection

A joint detector considers all available information, either in an optimal or an approximate way. It therefore must jointly decode all the accessing users' data streams as shown in Figure 4.2, which is a simplified rendition of Figure 2.9. Brute-force optimal joint detection typically translates into a large complexity of the joint detector, as we will show below. If the different users employ FEC codes, the complexity of the decoder sharply increases unless cancellation or iterative decoders are used. At any rate, for full exploitation of the CDMA channel capacity the view taken in Figure 4.2 needs to be adopted.

[Figure: several sources, each followed by an encoder/modulator, share a channel with additive noise; a joint detector for all users produces estimates of u or d.]

Fig. 4.2. A joint detector considers all available information.

4.2.1 Jointly Optimal Detection

We will now turn our attention to the derivation of the optimal detector for the CDMA channel [121, 142, 147], which is applicable in general to channels of the form (4.1). By "jointly optimal" we mean the detector which produces the maximum-likelihood estimate of the transmitted uncoded symbols d, and, consequently, minimizes the (vector) error probability Pr(d̂ ≠ d) for the uncoded symbols.

The received vector of noise samples z is (mostly) caused by thermal receiver noise, and hence is well described by independent Gaussian noise samples with variance σ² = N0/2, where N0 is the one-sided noise power spectral density¹. Armed with these assumptions, we expressed the conditional probability of r given d by a multi-variate Gaussian distribution in Section 2.5.4, and derived the ML estimate as

d̂ML = arg min_{d∈D^{Kn}} ‖r − Sd‖₂². (4.2)

Note that in the synchronous case this minimization can be carried out independently over each symbol period, which, ironically, does not reduce the complexity of the algorithm, as we will see. We may furthermore neglect the term r∗r in (4.2) and write

d̂ML = arg max_{d∈D^{Kn}} (2d∗y − d∗S∗Sd). (4.3)

Note that the vector y = S∗r has dimension Kn and its j-th entry, given by (S∗r)[j] = s∗j r, is the correlation of the received vector with the spreading sequence of the k-th user at symbol interval i, where i = ⌈j/K⌉ and k = j − (i − 1)K. (This correlation can also be computed as the output of a filter matched to sj and sampled at time T_i^{(k)} = jT + τk. That is, S∗r is the output of a bank of filters matched to the spreading sequences of the K users.) The receiver of (4.3) is shown in Figure 4.3, reproduced from Figure 2.6 for convenience.

If the spreading sequences which make up the channel matrix S are orthogonal, S∗S = I, and the channel symbols d are uncorrelated and have constant energy, (4.3) reduces to

d̂ML = arg max_d (d∗y), (4.4)

which is easily solved by d̂ML = sgn(y). This is the correlation receiver which performs no joint detection. In general, however, the term d∗S∗Sd needs to be considered. This term is the source of the complexity of the joint detector, but also of its performance advantage.
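The reduction of (4.3) to the correlation receiver (4.4) under orthogonal spreading can be checked directly with an exhaustive search over D^N (a sketch; ml_detect is our name, not the book's):

```python
import itertools, random

random.seed(7)

def ml_detect(y, R):
    """Brute-force maximization of 2 d'y - d'Rd over d in {-1,+1}^N, as in (4.3)."""
    N = len(y)
    best, best_d = None, None
    for d in itertools.product((-1, 1), repeat=N):
        lam = 2 * sum(d[i] * y[i] for i in range(N)) \
              - sum(d[i] * R[i][j] * d[j] for i in range(N) for j in range(N))
        if best is None or lam > best:
            best, best_d = lam, list(d)
    return best_d

# Orthogonal spreading: R = I, and the ML decision collapses to sgn(y) as in (4.4)
N = 4
R_I = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
y = [random.gauss(0.0, 1.0) for _ in range(N)]
d_ml = ml_detect(y, R_I)
```

For a non-identity R the quadratic term d∗Rd makes the search genuinely joint, which is exactly where the complexity (and the gain) of the optimal detector comes from.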

The spreading sequences sj = sk[i] may be time-varying, i.e. depend not only on k but also on the index i. This is the case for time-varying CDMA which uses sequences for each of the users with periods much longer than L [135]. This is also referred to as random spreading. The other alternative is to use time-invariant CDMA, i.e. sk[i] = sk, where identical, length-L, spreading sequences are used for all symbol transmissions. There are advantages

¹ For an introductory discussion of Gaussian noise see e.g. [49, 104, 166].


[Figure: data streams d1[i], . . . , dK[i] modulate √P1 s1[i], . . . , √PK sK[i]; the sum plus noise z[i] gives r[i], which feeds a bank of matched filters s1[i], . . . , sK[i] producing y1[i], . . . , yK[i] for the optimal detector.]

Fig. 4.3. Matched filter bank serving as a front-end for an optimal multiuser CDMA detector.

and disadvantages to both systems; however, random CDMA is becoming the more popular variant in practical applications [134, 135]. The system model of Figure 4.3 and equation (4.3) is equally applicable to both alternatives, as well as synchronous and asynchronous transmission, as discussed in Chapter 2.

Since optimal detection based on y is possible, the outputs of the correlation detector provide what is known as a sufficient statistic (see Definition A.5), and, while making hard decisions on y is suboptimal in general, the bank of correlators or matched filters serves as a front-end of the optimal detector in Figure 4.3. No information is lost due to the linear correlation operations.

We now show that optimal detection can be performed by a finite-state machine with 2^{K−1} states [147], more specifically by a trellis decoder [120]. The observation is based on realizing that the term d∗S∗Sd in (4.3) has band-diagonal form with K − 1 off-diagonal terms and can be recursively evaluated by formulating the problem as a finite-state system that evolves through the row index. The algorithm is identical to the trellis decoding algorithms discussed in [120, Chapter 6], with only minor differences in how the branch metrics are generated.

The operation that causes the computational complexity of the optimal detector is the evaluation of the quadratic form d∗S∗Sd. The correlation matrix R = S∗S of the spreading sequences has dimensions Kn × Kn and its l, m-th entry is given by (Section 2.5.2)

Rlm = s∗l sm = s∗k[i] sk′[i′]. (4.5)


Recall that the subscripts k and k′ refer to users, and the arguments i and i′ to time intervals. Note that the indexing l, m of the spreading sequences is in the order in which they appear in S, since the distinction between users and time units is irrelevant to the optimal detector, but technically sl = sk[⌈l/K⌉]; k = l mod K.
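The band-diagonal structure of R for asynchronous users can be illustrated numerically. The sketch below (assumed integer chip delays and toy sizes, all names ours) builds columns sk[i] with user k delayed by tau[k] chips and verifies that Rlm = 0 whenever |l − m| ≥ K:

```python
import random

random.seed(3)
K, n, L = 3, 4, 8                # users, intervals, chips per symbol (toy sizes)
tau = [0, 2, 5]                  # distinct, sorted chip delays (assumed)
N = K * n
rows = n * L + L                 # room for the delayed tail of the last interval

# Column l = i*K + k holds the chips of user k's i-th symbol, delayed by tau[k]
S = [[0.0] * N for _ in range(rows)]
for i in range(n):
    for k in range(K):
        for c in range(L):
            S[i * L + tau[k] + c][i * K + k] = random.choice((-1.0, 1.0)) / L ** 0.5

R = [[sum(S[r][a] * S[r][b] for r in range(rows)) for b in range(N)] for a in range(N)]

# Columns whose joint indices differ by K or more never overlap in time,
# so R is band-diagonal with width 2K - 1:
bandwidth_ok = all(R[a][b] == 0.0 for a in range(N) for b in range(N)
                   if abs(a - b) >= K)
```

This is the structure sketched in Figure 4.4, and it is what makes the trellis formulation below possible.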

Figure 4.4 shows the structure of the cross-correlation matrix R for three asynchronous users. The fact that R is band-diagonal with width 2K − 1 allows us to evaluate d∗Rd with a trellis decoder with 2^{K−1} states as follows. We must evaluate (4.3) for every possible sequence d, that is, we need to calculate a sequence metric

λ(d) = 2d∗y − d∗S∗Sd = 2 Σ_{i=1}^{Kn} di yi − Σ_{i=1}^{Kn} Σ_{j=1}^{Kn} di Rij dj (4.6)

and then select

d̂ML = arg max_{d∈D^{Kn}} (λ(d)). (4.7)

[Figure: the Kn × Kn matrix R for K = 3 asynchronous users, rows and columns indexed by interval i and user k; only the entries R(l−2)l, R(l−1)l, Rll within the band around the diagonal are nonzero.]

Fig. 4.4. Illustration of the correlation matrix R for three asynchronous users.

To do this, let us define the partial metric at time l − 1 as

λl−1(d) = 2 Σ_{i=1}^{l−1} di yi − Σ_{i=1}^{l−1} Σ_{j=1}^{l−1} di Rij dj (4.8)

and write the entire sequence metric in recursive form as

λl(d) = λl−1(d) + 2dl yl − Σ_{i=1}^{l−1} 2di Ril dl − Rll |dl|², (4.9)

where we have used the crucial fact that R is symmetric. The key observation, first made by Ungerboeck in [140] in the context of inter-symbol interference channel equalization, is to note that λl(d) depends only on di, i ≤ l, and not on "future" symbols di, i > l – see Figure 4.4. Furthermore, since we are assuming binary modulation, dk ∈ {+1, −1}, we may neglect all terms |dl|² in the metrics since their contributions are identical for all metrics. We now may modify our partial metrics to

λl(d) = λl−1(d) + 2dl yl − Σ_{i=1}^{l−1} 2di Ril dl (4.10)
      = λl−1(d) + bl, (4.11)

where bl, implicitly defined above, plays the role of the branch metric in a trellis decoder [120]. (In (4.10), we refer to 2dl yl as "term 0" and to the sum Σ_{i=1}^{l−1} 2di Ril dl as "term 1".)

Figure 4.5 illustrates the recursive nature of the computation of λl(d). At each time index l, the previous metric λl−1(d) is updated by two terms, term 0 and term 1. The first term, term 0, depends only on the received signal at time l, i.e., yl, and the data symbol at time l. Term 1 depends on the K − 1 previous data symbols di, i = l−K+1, · · · , l−1, which will need to be stored. The incremental terms in (4.11) are represented by the wedge-like slice highlighted in Figure 4.5. At the next time interval, a new such slice is added, until the entire quadratic form is computed. Note that it makes no difference if the system is synchronous or asynchronous. In a synchronous system, the triangular components are simply zero, as discussed in Section 2.5.2; however, at its widest point in the matrix, there are still K − 1 symbols which need to be stored to compute the increment.

From Figure 4.5 it is evident that the branch metric bl can be calculated requiring only the K − 1 most recent values (dl−1, . . . , dl−K+1) and the present value dl from the symbol vector d.

We therefore define a decoder state sl−1 = (dl−1, . . . , dl−K+1) and note that there are a total of 2^{K−1} such states. The sequence metrics λl(d) can now be calculated for all d by a trellis decoder as illustrated in Figure 4.6 for three users, where each state has two branches leaving, corresponding to the two possible values of dl, and two branches entering, corresponding to the two possible values of the oldest stored symbol.



[Figure: the quadratic form computed up to time l; the wedge-like slice at time l contains term 0 on the diagonal and term 1 along the band.]

Fig. 4.5. Illustration of the recursive computation of the quadratic form in (4.11).

[Figure: trellis section with predecessor states q1, q2 merging into state s; state at l: (dl, dl−1, dl−2).]

Fig. 4.6. Illustration of a section of the CDMA trellis used by the optimal decoder, shown for three interfering users, i.e. K = 3, causing 8 states. Illustrated is the merger at state s, where each of the paths arrives with the metric (4.12).


Since we will be working with state metrics we rewrite (4.11) as

λl(sl) = λl−1(ql−1) + bl(ql−1 → sl), (4.12)

where sl ranges over all possible 2^{K−1} states at time l, and ql−1 is a state one time unit earlier in the trellis for which a connection to sl exists. There are two such states, since there are two paths merging at each state sl. Of these two paths we may eliminate the partial path with the lesser partial metric without compromising (4.7). The resulting optimal detection algorithm has long been known in the error control community as the "Viterbi" algorithm [45, 77, 120, 152]. The Viterbi algorithm, which computes the complete sequence metrics for optimal decoding of correlated signal sets, is described in Algorithm 4.1.

Algorithm 4.1 (Viterbi decoder for optimal detection).

Step 1: Initialize each of the S = 2^{K−1} states s of the detector with a metric m0(s) = −∞ and survivor d̂(s) = (). Initialize the starting state of the encoder, state s = (0, · · · , 0), with the metric m0(0) = 0. Let l = 1.

Step 2: Calculate the branch metric

bl = dl yl − Σ_{j=1}^{K−1} dl−j R(l−j)l dl, (4.13)

for each branch stemming from each state s at time l − 1 for each extension dl.

Step 3: (Add-Compare-Select) For each state s at time l form the sum ml−1(q) + bl for both previous states q which connect to s, and select the larger to become the new state metric, i.e. ml(s) = max_q (ml−1(q) + bl). Retain the path with the largest metric as survivor for state s, and append the symbol dl on the surviving branch to the state survivor d̂(s) = (d̂(q), dl).

Step 4: If l < Kn, let l = l + 1 and go to Step 2.

Step 5: Output the survivor d̂(sm) = (d̂1, . . . , d̂Kn)(sm) corresponding to the state sm = arg max_s (mKn(s)) which maximizes mKn(sm) as the maximum-likelihood estimate of the transmitted sequence.
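A minimal rendering of Algorithm 4.1 in Python, checked against exhaustive maximization of (4.6) on a small synchronous system (function and variable names are ours; the state is the tuple of the K − 1 most recent symbols):

```python
import itertools, random

random.seed(5)

def viterbi_detect(y, R, K):
    """Maximize 2 d'y - d'Rd for band-diagonal R (half-bandwidth K-1)
    with states given by the K-1 most recent symbols, as in Algorithm 4.1."""
    N, W = len(y), K - 1
    states = {(): (0.0, [])}                       # state -> (metric, survivor)
    for l in range(N):
        nxt = {}
        for state, (m, path) in states.items():
            for dl in (-1, 1):
                b = 2 * dl * y[l] - R[l][l]        # |dl|^2 term kept for exactness
                for j, dprev in enumerate(reversed(state)):
                    b -= 2 * dprev * R[l - 1 - j][l] * dl
                ns = (state + (dl,))[-W:] if W > 0 else ()
                cand = (m + b, path + [dl])
                if ns not in nxt or cand[0] > nxt[ns][0]:   # add-compare-select
                    nxt[ns] = cand
        states = nxt
    return max(states.values(), key=lambda v: v[0])[1]

def brute_force_detect(y, R):
    N = len(y)
    def lam(d):
        return 2 * sum(d[i] * y[i] for i in range(N)) \
               - sum(d[i] * R[i][j] * d[j] for i in range(N) for j in range(N))
    return list(max(itertools.product((-1, 1), repeat=N), key=lam))

# Small synchronous system: R is block-diagonal with K x K blocks (bandwidth <= K-1)
K, n, L = 3, 2, 8
N = K * n
S = [[0.0] * N for _ in range(n * L)]
for i in range(n):
    for k in range(K):
        for c in range(L):
            S[i * L + c][i * K + k] = random.choice((-1.0, 1.0)) / L ** 0.5
R = [[sum(S[r][a] * S[r][b] for r in range(n * L)) for b in range(N)] for a in range(N)]
d_true = [random.choice((-1, 1)) for _ in range(N)]
r = [sum(S[row][j] * d_true[j] for j in range(N)) + random.gauss(0.0, 0.3)
     for row in range(n * L)]
y = [sum(S[row][j] * r[row] for row in range(n * L)) for j in range(N)]
d_hat = viterbi_detect(y, R, K)
```

The dynamic program visits 2·2^{K−1} branches per symbol instead of 2^{Kn} sequences, yet returns the same maximizer as the exhaustive search whenever R respects the band structure.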

The proof in Step 3 above that the merging partial path with the lesser metric can be eliminated without discarding the maximum likelihood solution is standard and can be found, e.g. in [77, 120]. This fact is referred to as the Theorem of Irrelevance.


In an asynchronous system with large frame length n it is not necessary to wait until l = Kn before the decisions on the decoded sequence can be made. We may modify the algorithm and obtain a fixed-delay decoder by adding Step 4b and changing Step 5 above as follows:

Step 4b: If l ≥ nt, where nt is a delay taken large enough, typically around 5K, called the truncation length, output d̂l−nt as the estimated symbol at time l − nt from the survivor d̂(s) = (d̂1, . . . , d̂l)(s) with the largest partial metric ml(s). If l < Kn, let l = l + 1 and go to Step 2.

Step 5: Output the remaining estimated symbols d̂l; Kn − nt < l ≤ Kn, from the survivor d̂(sm), sm = arg max_s (mKn(s)).

4.2.2 Individually Optimal Detection: APP Detection

While we derived an optimal decoder for the entire set of transmitted symbols, there is no guarantee that the output symbols d̂j in (4.3) are optimal for any given user k. They may even be poor estimates for some of the users. Individually optimal detection calculates

d̂k[i] = arg max_{d∈D} Pr(dk[i] = d | y) (4.14)

as the marginalization

d̂k[i] = arg max_{d∈D} Σ_{d : dk[i]=d} Pr(y | d) Pr(d) (4.15)

as shown in Section 2.5.5. The a priori probabilities Pr(d) are identically and uniformly distributed, meaning that no prior information about d is available, unless iterative decoding systems make repeated use of (4.15), in which case Pr(d) is computed by external soft-decision decoders, as is the practice with turbo decoding systems [120].

The marginalization sum in (4.15) grows exponentially with the number of users. It can, however, be calculated relatively efficiently by a bi-directional trellis search algorithm, the BCJR or APP algorithm [9, 120], which operates in the same trellis as the Viterbi algorithm discussed above. The state space complexity is still exponential in K − 1, however, so exact calculation of (4.15) is rarely an option.

The purpose of the bi-directional a posteriori probability (APP) algorithm is the calculation of (4.14), by carrying out the marginalization (4.15) in an efficient manner using the trellis description of the correlation between symbols. To accomplish this, the algorithm first calculates the probability that the trellis model of the CDMA system traversed a specific transition, i.e. the algorithm computes Pr[sl−1 = q, sl = s | y], where sl is the state at time l, and sl−1 is the state at time l − 1. The algorithm computes this probability as the product of three terms, i.e.


Pr[sl−1 = q, sl = s | y] = (1/Pr(y)) Pr[sl−1 = q, sl = s, y]
                         ∝ αl−1(q) γl(s, q) βl(s). (4.16)

The α-values are internal variables of the algorithm and are computed by a forward recursion through the CDMA trellis

αl−1(q) = Σ_{states p} αl−2(p) γl−1(q, p). (4.17)

This forward recursion evaluates α-values at time l − 1 from previously calculated α-values at time l − 2, and the sum is over all states p at time l − 2 that connect with state q at time l − 1. The α-values are initialized as α0(0) = 1, α0(s ≠ 0) = 0. This enforces the boundary condition that the encoder starts in state s = (0, · · · , 0).

The β-values are calculated by an analogous procedure, called the backward recursion

βl(s) = Σ_{states t} βl+1(t) γl+1(t, s), (4.18)

which is initialized as βKn(0) = 1, βKn(s ≠ 0) = 0 to enforce the terminating condition of the trellis representing the CDMA system. The sum is over all states t at time l + 1 to which state s at time l connects. The forward and backward recursions are illustrated in Figure 4.7.

[Figure: trellis around time l; the forward recursion combines states p1, p2 into state q, and the backward recursion combines states t1, t2 into state s.]

Fig. 4.7. Illustration of the forward and backward recursion of the APP algorithm for individually optimal detection.


The γ-values are conditional transition probabilities, and are the inputs to the algorithm. In order to compute the γl(s, q) values, recall that the algorithm needs to compute (4.15), where we now focus on the sequence d that carries the trellis path through the states q and s as illustrated in Figure 4.7. The probability Pr(d) breaks into a product of individual probabilities, and the one affecting the transition in question is Pr[sl = s | sl−1 = q] = Pr(dl). The term

Pr(d | y) ∝ exp (2d∗y − d∗S∗Sd) (4.19)

can be broken into three terms for the path through (q, s), and decomposed using the partial sequence metric formulation (4.12) from the preceding section into

Pr(sl−1 = q, sl = s|y) ∝ exp (λl−1(q))︸ ︷︷ ︸αl−1(q)

exp (bl(q → s))︸ ︷︷ ︸γl(s,q)

exp

⎛⎝ Kn∑

j=l+1

bj

⎞⎠

︸ ︷︷ ︸βl(s)

(4.20)

which are the three factors from (4.16). From (4.20) we see that

γ_l(s, q) = exp (b_l(q → s)) Pr(d_l) (4.21)
          = exp ( d_l y_l − Σ_{j=1}^{K−1} d_{l−j} R_{(l−j)l} d_l ) Pr(d_l). (4.22)

This factor Pr(d_l) can be used to account for a priori probability information on the user data d in iterative systems.

The a posteriori symbol probabilities Pr(d_k[i] = d|y) can now be calculated from the a posteriori transition probabilities (4.16) or (4.20) by summing over all transitions corresponding to d_l = 1, and, separately, by summing over all transitions corresponding to d_l = −1 (with d_l = d_k[i]).

A formal description of this algorithm can be found in [120], where a rigorous derivation is presented. This derivation was first given by Bahl et al. [9].
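The two recursions are compact enough to sketch directly. The following is a minimal numpy sketch (not from the book) of a forward-backward pass over a table of precomputed, unnormalized γ-values; the state indexing and the per-step rescaling, a standard numerical-stability device, are choices of this example:

```python
import numpy as np

def forward_backward(gamma):
    """Forward-backward (APP) pass per (4.16)-(4.18).

    gamma[l, q, s] is the transition weight gamma_{l+1}(s, q) from state q
    at time l to state s at time l+1.  The trellis is assumed to start and
    terminate in the all-zero state (state index 0).  Returns p[l, q, s]
    proportional to Pr[s_l = q, s_{l+1} = s | y].
    """
    n, num_states, _ = gamma.shape
    alpha = np.zeros((n + 1, num_states))
    beta = np.zeros((n + 1, num_states))
    alpha[0, 0] = 1.0                       # boundary condition at the start
    beta[n, 0] = 1.0                        # terminating condition
    for l in range(n):                      # forward recursion (4.17)
        alpha[l + 1] = alpha[l] @ gamma[l]
        alpha[l + 1] /= alpha[l + 1].sum()  # rescale to avoid underflow
    for l in range(n - 1, -1, -1):          # backward recursion (4.18)
        beta[l] = gamma[l] @ beta[l + 1]
        beta[l] /= beta[l].sum()
    # transition APPs (4.16): alpha_l(q) * gamma_{l+1}(s, q) * beta_{l+1}(s)
    p = alpha[:n, :, None] * gamma * beta[1:, None, :]
    return p / p.sum(axis=(1, 2), keepdims=True)
```

The a posteriori symbol probabilities then follow by summing p[l, q, s] over all transitions (q, s) carrying d_l = +1 and, separately, d_l = −1, as described above.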

4.2.3 Performance Bounds – The Minimum Distance

Determining the performance of a general trellis decoder analytically is a known difficult problem, and one is often satisfied with finding bounds or approximations. The minimum squared Euclidean distance between two possible sequences (signal points) can serve as a measure of performance for an optimal detector. It is defined as

d²_min = min_{d, d′ ∈ D^{Kn}, d ≠ d′} ‖S(d − d′)‖²₂. (4.23)


A coarse approximation of the error performance for small noise values can be obtained as (see [104, Page 256])

P_e ≈ A_{d_min} Q ( √( d²_min / (2N₀) ) ), (4.24)

where Q(√(d²_min/(2N₀))) is simply the probability of error between two equally likely signals with squared Euclidean distance (energy) d²_min in additive white Gaussian noise of complex variance N₀, and A_{d_min} is the number of such minimum distance neighbors, which is known as the multiplicity of the minimum distance signal pair.

Unfortunately, the calculation of d²_min via (4.23) is computationally intensive except for small values of K, since the search space grows exponentially with K. In fact, the calculation of (4.23) is known to be NP-complete [149]. However, this does not necessarily mean that the calculation of d²_min is impossible [118].

We need to calculate

d²_min = min_{d, d′ ∈ D^{Kn}, d ≠ d′} (d − d′)* A S*S A (d − d′), (4.25)

where we have re-introduced the matrix A of amplitudes. If S is not rank deficient, S*S is positive definite, and there exists a unique lower-triangular², non-singular matrix F of dimension Kn × Kn, such that S*S = F*F. This is known as the Cholesky decomposition [55].

Using this decomposition and (4.25) we obtain

d²_min = min_{d, d′ ∈ D^{Kn}, d ≠ d′} ‖FA(d − d′)‖²₂. (4.26)

Now let ε_j = d_j − d′_j for j = 1, . . . , Kn. Then equation (4.26) can be rewritten as the sum of squares given by

d²_min = min_{d, d′ ∈ D^{Kn}, d ≠ d′} Σ_{l=1}^{Kn} δ²_l, (4.27)

where, due to the lower triangular nature of F,

² Sometimes the Cholesky factorization yields an upper-triangular matrix F, but the matrix PF, where P is the permutation matrix with ones on its anti-diagonal, is lower-triangular and also complies with the decomposition, i.e. R = (PF)*PF = F*F.

Page 126: Coordinated Multiuser Communications

4.2 Optimal Detection 111

δ²_l = ( Σ_{j=1}^{l} √P_j F_{lj} ε_j )². (4.28)

Since δ²_l depends only on ε₁, . . . , ε_l, we can use a bounded tree search to evaluate the minimum value of (4.27). This branch-and-bound algorithm starts at a root node and branches into three new nodes at each level, one for each value of ε_l ∈ {−2, 0, 2}. The nodes at level l are labeled with the symbol differences (ε₁, . . . , ε_l) leading to them, and the node weight is Δ²_l = Σ_{j=1}^{l} δ²_j. The branch connecting the two nodes (ε₁, . . . , ε_l) and (ε₁, . . . , ε_{l+1}) is labeled by δ²_{l+1}.

The key observation now is that only a small part of this tree needs to be explored. Due to the fact that δ²_l is positive, the node weights can only increase, and most nodes can quickly be discarded from future consideration if their weight exceeds some threshold. This threshold can initially be chosen to be an estimate of the minimum distance. For instance, from the single-user bound, achieved by orthogonal spreading sequences, we know that d²_min ≥ 2 min(P_j).

If we are interested in the minimum distance of a specific symbol k, we modify (4.27) to

d²_{min,k} = min_{d, d′ : d_k ≠ d′_k} Σ_{l=1}^{Kn} δ²_l. (4.29)

The algorithm is an adaptation of the T-algorithm [120] to this search problem. This basic concept of a tree search, combined with branch-and-bound methods, forms the basis for a large number of decoding algorithms, such as the sphere decoders, to be discussed in Chapter 5.

Algorithm 4.2 (Finding d²_min of a CDMA Signal Set).

Step 1: Initialize l = 1 and activate the root node, denoted by (), at level 0 of the search tree with Δ²₀ = 0.
Step 2: Compute the node weight Δ²_l = Δ²_{l−1} + δ²_l for all extensions from active nodes at level l − 1.
Step 3: Deactivate nodes whose weight exceeds the preset threshold.
Step 4: Let l = l + 1, and if l < Kn go to Step 2, otherwise stop and output d²_min = min over active nodes of Δ²_{Kn}.

Note that since nodes whose weight increases above the threshold are dropped, the number of branches searched by this algorithm is significantly smaller than the total number of tree branches of the full tree (3^{Kn}).

Algorithm 4.2, illustrated in Figure 4.8, finds the minimum distance of user k.
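As an illustration, the bounded search can be sketched in a few lines. This is a loose, depth-first variant of Algorithm 4.2, not the book's procedure: numpy's Cholesky routine returns a lower-triangular G with R = GG^T, so the sketch works with the upper-triangular F = G^T and grows the tree from the last index, and it tightens the threshold whenever a better candidate is found. Unit-norm columns of S and the default unit powers are assumptions of the example:

```python
import numpy as np

def dmin2_branch_and_bound(S, P=None):
    """Bounded tree search for the minimum squared distance (4.27) of a
    synchronous (n = 1) CDMA signal set with unit-norm signatures.

    S: L x K signature matrix; P: per-user powers (default: all ones).
    delta_l here depends only on eps_l, ..., eps_{K-1} because the upper
    triangular factor F = G^T satisfies F^T F = R = S^T S.
    """
    L, K = S.shape
    P = np.ones(K) if P is None else np.asarray(P, float)
    a = np.sqrt(P)
    F = np.linalg.cholesky(S.T @ S).T      # upper triangular, F.T @ F == R
    best = 4.0 * P.min()                   # single-user distance: attainable upper bound
    eps = np.zeros(K)

    def search(l, weight, nonzero):
        nonlocal best
        if weight >= best:
            return                         # prune: node weights only grow
        if l < 0:
            if nonzero:
                best = weight              # complete nonzero difference vector
            return
        for e in (-2.0, 0.0, 2.0):         # possible symbol differences
            eps[l] = e
            delta = F[l, l:] @ (a[l:] * eps[l:])
            search(l - 1, weight + delta * delta, nonzero or e != 0.0)
        eps[l] = 0.0

    search(K - 1, 0.0, False)
    return best
```

For small K the result can be cross-checked by enumerating all 3^K − 1 nonzero difference vectors directly.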


Fig. 4.8. Bounded tree search, with branches labeled by the weights δ²_l.

However, the problem (4.28) is still NP-complete, as shown in [118]. The key lies in the fact that the above algorithm finds d²_min very efficiently in most cases. In fact, Schlegel and Lei [118] performed searches for a synchronous (n = 1) CDMA system with length-31 random spreading sequences, and found the empirical distribution for the minimum distances shown in Figure 4.9. The figure also shows the width of the search tree at each depth for the worst cases found for both search experiments. The maximum width of the active tree of 48,479 for 31 users is significantly less than the total number of possible tree branches, which is 3³¹ = 6.2 × 10¹⁴.

4.3 Sub-Exponential Complexity Signature Sequences

While optimal detection is, in general, NP-hard, meaning that an exact optimal detector, whether jointly or individually optimal, must expend a computational effort which is exponential in the number of users, we have seen that there exist clever search algorithms which obtain "near-optimal" results with much reduced complexity. It is difficult to precisely gauge the complexity of such algorithms as a function of the number of users K, and many different claims can be found in the literature.

However, there exist specific signature sets which have a provably lower complexity, no matter how large the set. For example, if the cross-correlations of the signature sets are all non-positive, then there exists an optimal detection algorithm with a complexity of order O(K³) [113].

Another set for which sub-exponential optimal detection is possible will be presented here. We assume that the cross-correlation values are all equal (or


Fig. 4.9. Histograms of the distribution of the minimum distances of a CDMA system with length-31 random spreading sequences, for K = 31 (dashed lines) and K = 20 users (solid lines), and maximum width of the search tree.

maybe nearly so), and given by ρ. Such a class of signature sequences has been used as a benchmark [88] for general CDMA systems, and includes signature sequence sets of practical interest, such as synchronous CDMA systems using cyclically shifted m-sequences [139].

Given our assumption of equal cross-correlations, i.e.

R_ij = { 1, i = j
       { ρ, i ≠ j,   (4.30)

we can rewrite (4.3) as (note we are focusing on a synchronous example with n = 1)

2d*y − d*S*Sd = 2 Σ_{i=1}^{K} d_i y_i − Σ_{i=1}^{K} Σ_{j=1}^{K} d_i d_j R_ij (4.31)
             = 2 Σ_{i=1}^{K} d_i y_i − ρ Σ_{i=1}^{K} Σ_{j=1}^{K} d_i d_j − K(1 − ρ). (4.32)

The key observation is that Σ_{i=1}^{K} Σ_{j=1}^{K} d_i d_j depends only on the number of negative (or positive) elements in d, and not their arrangement. Therefore, let N(d) be the number of elements in d which are negative, i.e.

N(d) = (1/2) ( K − Σ_{i=1}^{K} d_i ). (4.33)

With this definition, further define the functions

Page 129: Coordinated Multiuser Communications

114 4 Multiuser Detection

T₁(N(d)) = ρ Σ_{i=1}^{K} Σ_{j=1}^{K} d_i d_j = ρ (2N(d) − K)² (4.34)

T₂(d) = −2 Σ_{i=1}^{K} d_i y_i. (4.35)

Now, since we can ignore the constant term K(1 − ρ), we see that maximizing (4.31) is equivalent to

d̂ = arg min_{d ∈ D^K} ( T₁(N(d)) + T₂(d) ). (4.36)

Upon inspection, the following observations about the functions T₁ and T₂ can be made. T₁(N(d)) is convex and possesses a unique extreme point at N(d) = K/2, which is a minimum for ρ > 0, and a maximum for ρ < 0. The second term, T₂(d), is minimized by d = sgn(y), and although it depends on the arrangement of the negative elements in d, we may form a lower bound that depends only on the number of negative terms that appear. For N(d) = 0, 1, · · · , K define

σ(N(d)) = { −2 Σ_{i=1}^{K} y_i sgn(y_i),            N(d) = η
          { σ(η) + 4 Σ_{i=η+1}^{N(d)} |y_{π(i)}|,   N(d) > η    (4.37)
          { σ(η) + 4 Σ_{i=N(d)+1}^{η} |y_{π(i)}|,   N(d) < η

where we have introduced the notational simplification η = N(sgn(y)), and π is a permutation of the indices such that y_{π(1)} ≤ y_{π(2)} ≤ · · · ≤ y_{π(K)}. Now, we conclude that σ(N(d)) is convex with a minimum at η, and it is furthermore clear that

T₂(d) ≥ σ(N(d)). (4.38)

Equality is achieved in (4.38) if the elements of d are arranged such that d_{π(i)} = −1 for i = 1, 2, · · · , N(d), and the remaining K − N(d) elements are +1.


Algorithm 4.3 (Optimal Decoding for Equal Cross-Correlations).

Step 1: Let η = N(sgn(y)).
Step 2: Let π be the permutation of the indices 1, 2, · · · , K such that y_{π(i)} is non-decreasing with i.
Step 3: Calculate the functions T₁ and σ according to (4.34) and (4.37). Also, find

j = arg min_{m=1,··· ,K} ( T₁(m) + σ(m) ). (4.39)

j is the number of negative elements in the minimizing vector d̂.
Step 4: Output the vector d̂ given by

d̂_{π(i)} = { −1, i = 1, · · · , j
           { +1, i = j + 1, · · · , K.    (4.40)

The optimality of Algorithm 4.3 is a direct consequence of minimizing the lower bound (4.38) by the permutation π, and the fact that the output vector meets this bound with equality.

The complexity of this detection algorithm is dominated by the operation in Step 2, which, in the worst case, is of order O(K log K). However, in many cases a full sort and search will not be required.
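A direct transcription of Algorithm 4.3 into code is short; in this sketch the function name is illustrative and, as a slight widening of Step 3, the minimization also includes m = 0 (the all-positive vector):

```python
import numpy as np

def equal_rho_ml_detect(y, rho, K):
    """Optimal detection for equal cross-correlations (Algorithm 4.3).

    y: matched filter output vector of length K; R_ij = rho for i != j.
    Minimizes T1(N(d)) + T2(d) of (4.36) using the lower bound sigma(m)
    of (4.37), which the returned vector (4.40) meets with equality.
    """
    y = np.asarray(y, float)
    pi = np.argsort(y)                        # Step 2: y[pi] is non-decreasing
    ys = y[pi]
    eta = int(np.sum(ys < 0.0))               # Step 1: eta = N(sgn(y))
    sigma = np.empty(K + 1)
    sigma[eta] = -2.0 * np.sum(np.abs(ys))    # first case of (4.37)
    for m in range(eta + 1, K + 1):           # m > eta: flip smallest positives
        sigma[m] = sigma[m - 1] + 4.0 * abs(ys[m - 1])
    for m in range(eta - 1, -1, -1):          # m < eta: unflip smallest-|.| negatives
        sigma[m] = sigma[m + 1] + 4.0 * abs(ys[m])
    T1 = rho * (2.0 * np.arange(K + 1) - K) ** 2   # (4.34) as a function of m
    j = int(np.argmin(T1 + sigma))            # Step 3
    d = np.ones(K)
    d[pi[:j]] = -1.0                          # Step 4: most negative outputs get -1
    return d
```

For small K, a brute-force comparison over all 2^K candidate vectors confirms that the output attains the maximum of 2d*y − d*Rd.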

Schlegel and Grant [117] generalize this decoding algorithm to the case where blocks of signature sequences have a fixed cross-correlation, and show that the algorithm is exponential only in the number of unique cross-correlation values.

We have seen that in many cases optimal detection algorithms with sub-exponential complexity can be found, or that approximate algorithms can obtain solutions with error probabilities close to those of optimal detectors. Nonetheless, a general practical solution to optimal decoding of CDMA is both elusive, and not necessary, as we will see in subsequent sections and chapters.

4.4 Signal Layering

Preprocessing of the received signal r has two functions. A preprocessor can act as a multiuser detector and suppress multiple access interference, as is the case in the linear filter receivers we discuss in subsequent sections. A preprocessor, however, also acts by conditioning the channel of a single user to generate an "improved" channel for that user. This viewpoint is illustrated


in Figure 4.10, where a linear preprocessor is used to create a single-userchannel for a given user.

Fig. 4.10. Linear preprocessing used to condition the channel for a given user (shaded).

This single-user channel has an information theoretic capacity which we will calculate for a number of popular preprocessors. From the results of conventional error control coding it is well known that this single-user capacity can be approached closely by powerful forward error control coding systems [120], that is, single-user capacity achieving codes can be used to approach the "layered capacity". Calculation of these layered capacities therefore allows us to determine how much capacity is lost by converting the multiple-access channel into parallel single-user channels. This process shall be called channel layering. It can be very effective in reducing the complexity of the joint decoding problem, and, as we will see, in many cases can achieve a performance which rivals or equals that of a full joint detector.

Figure 4.11 shows the layered information theoretic capacities for several cases as a function of the signal-to-noise ratio E_b/N₀. More precisely, the curves are Shannon bounds which relate spectral efficiency in bits/Hz (y-axis) to power efficiency (x-axis). The Shannon bound for an AWGN channel is quickly computed from the capacity formula, i.e. from C_AWGN = (1/2) log₂(1 + 2P/N₀), we obtain

E_b/N₀ ≥ (2^{2C_AWGN} − 1) / (2C_AWGN), (4.41)

by realizing that at the maximum rate E_b C_AWGN = P. Equation (4.41) is the well-known Shannon bound for the AWGN channel.


The Shannon bounds calculated in Figure 4.11 are for random CDMA systems. The CDMA channel capacity was calculated in Chapter 3, where we showed that the loss w.r.t. the AWGN capacity is minimal and vanishing for large loads β → ∞, or large P/N₀ → ∞. The two filters used are the minimum mean-square error (MMSE) filter discussed in Section 4.4.5, which produces the best linear layered channel, and matched filtering, which represents conventional correlation detection. It is important to note that the capacity bounds are calculated for equal received powers of all the different users.

Fig. 4.11. Information theoretic capacities of various preprocessing filters (AWGN capacity; CDMA, matched filter, and MMSE curves for loads β = 0.5 and β = 2).

The figure³ distinguishes between two system loads, defined as β = K/L, i.e. the ratio of (active) users to available dimensions, also known as the processing gain. For lower loads, β < 0.5, the layered capacities of linear filters with equal received power are virtually equivalent to the optimal system capacity, and very little is lost by layering. In contrast, for large loads β > 1,

³ The range of E_b/N₀ is chosen to represent typical values in wireless communications systems. Note that E_b/N₀ values in excess of 16dB are rare, since they would imply constellations with 256 points per complex symbol, or larger.


layering via linear filters becomes inefficient, and significant extra capacity can be gained by going to more complex joint receivers.

Some interesting conclusions can be drawn from these results. First, if the signal-to-noise ratio (SNR) is small, the system is mostly fighting additive noise, and multiuser detection has only a limited effect on capacity and its complexity may not be warranted. Also, in the low SNR regime, a loss of about 1dB is incurred by using random spreading codes versus Welch-bound equivalent codes. Furthermore, it becomes evident that not only are linear filters not capable of extracting the channel's capacity at higher loads, but their performance actually degrades below the capacity of a lighter loaded system. Both of these effects are due to the fact that the linear filter is suppressing signal space dimensions in the absence of more detailed information about the transmitted signal. This will be explored later in this chapter, as well as the influence of unequal powers on the system capacity.

4.4.1 Correlation Detection – Matched Filtering

We have encountered the correlation or matched filter bank as a front-end of the optimal detector in Section 4.2. If we make simple sign decisions on the received matched filter output vector y, we have the conventional correlation receiver. Interference in the correlation receiver is only suppressed by the processing gain of the spreading sequences used. This suppression is not ideal, unless the sequences are orthogonal, and, on average, each user in the system contributes an amount of interference which is approximately equal to its power divided by the processing gain [154].

More precisely, the detection process is given by

d̂ = sgn (y). (4.42)

For a given user k this means

d̂_k = sgn ( d_k + Σ_{j≠k} d_j R_kj + z_k ). (4.43)

Since R_kj is the product of two spreading sequences according to (4.5), its variance is straightforward to calculate if we assume that the chips of these spreading sequences are chosen randomly and with equal probability to be ±1/√L. The interference term in (4.43) is then the sum of a (large) number of random variables with bounded variance, which allows us to apply the central limit theorem and replace

I_k = Σ_{j≠k} d_j R_kj + z_k (4.44)

by a Gaussian noise source with zero-mean and variance


var(I_k) = ((K − 1)/L) P + σ²  −→ (K, L → ∞)  βP + σ². (4.45)

The channel for a single user now simply looks like an AWGN channel with noise variance (4.45), and therefore with capacity [166]

C_MF = (1/2) log₂ ( 1 + P / (βP + σ²) ) bits/use. (4.46)

The capacity per dimension of the overall system is K times this, divided by the processing gain, i.e.

C_MF = (β/2) log₂ ( 1 + P / (βP + σ²) ) bits/dimension. (4.47)

Substituting the limiting rate relation C_MF E_b/β = P into the equation above gives the Shannon bound curve shown in Figure 4.11.
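For a quick numerical look at this bound, the resulting implicit equation can be solved by fixed-point iteration. The substitution below (P = R E_b with per-user rate R = C/β, and σ² = N₀/2) is one reading of the text's normalization, and the function name and starting point are illustrative:

```python
import numpy as np

def mf_spectral_efficiency(ebn0_db, beta, iters=200):
    """Fixed-point evaluation of the matched filter Shannon bound.

    From (4.47) with x = P/sigma^2 = 2*(C/beta)*Eb/N0:
        C = (beta/2) * log2(1 + x/(beta*x + 1)).
    Simple fixed-point iteration from an arbitrary positive start.
    """
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    C = 0.1
    for _ in range(iters):
        x = 2.0 * (C / beta) * ebn0           # per-user P/sigma^2
        C = (beta / 2.0) * np.log2(1.0 + x / (beta * x + 1.0))
    return C
```

As E_b/N₀ grows, C saturates at (β/2) log₂(1 + 1/β), the interference-limited ceiling of the matched filter visible in Figure 4.11.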

The deleterious effect of unequal received power can also easily be seen from (4.43). A user whose signal arrives at the receiver κ times stronger will show up as κ virtual users. This can quickly lead to a significant loss in system capacity.

4.4.2 Decorrelation

Decorrelation has intuitive appeal since it is a linear filtering operation which completely eliminates the multiple-access interference, and one is tempted to conclude that its performance should therefore be close to optimal. We will however see that this is not the case, since in the process of eliminating the multiple access interference significant and detrimental enhancement of the additive channel noise can occur. This limits what is theoretically possible by decorrelation, and this limitation is especially severe for large system loads.

We start by considering the output signals of the matched filter outputs, which were calculated as

y = S*r = S*SAd + S*z. (4.48)

If the spreading sequences in S are linearly independent, the correlation matrix R = S*S has an inverse, which can be applied to y to obtain

d̃ = R⁻¹y = Ad + R⁻¹S*z. (4.49)

From (4.49) we see that d̃ is an interference-free estimate of d. The matrix H† = R⁻¹S* is the pseudo-inverse of S [55, 131], and we define the decorrelator as


Definition 4.1 (Decorrelator). The decorrelator detector outputs

d̂_DEC = sgn ( R⁻¹S*r ).

The decorrelating detector, or the decorrelator for short, has been studied in the literature in [65, 79, 80]. While the pseudo-inverse H† always exists, a complete separation of the interfering users is only possible whenever R⁻¹ exists, which is the case whenever S has full column rank, that is, all the users employ linearly independent spreading sequences. This assumption is reasonable, since if the signals of two or more users form a linearly dependent set, they become inseparable by linear operations, and the detection of the users' data sequences is made much more complicated. In fact, sets of linearly dependent users would remain correlated after decorrelation and joint detection would have to be applied to these subsets.

An interesting derivation of the decorrelator proceeds as follows. Let

d̂_DEC = sgn ( arg min_d ‖r − SAd‖² ), (4.50)

where d ∈ ℝ^{Kn}, in contrast to (4.2) where d was restricted to be from the set {+1, −1}^{Kn}. This simplification in the search space turns an NP-complete discrete minimization problem into a continuous quadratic minimization problem. We define v = Ad and minimize (4.50) by taking partial derivatives, i.e.

∂/∂v ‖r − Sv‖² = 2S*r − 2S*Sv = 0  ⇒  v̂ = (S*S)⁻¹S*r, (4.51)

obtaining the same solution as (4.49). If the amplitudes A of the transmitted symbols are unknown, then we have just proven that v̂ is the maximum likelihood estimate of v.

4.4.3 Error Probabilities and Geometry

Let us calculate the probability of error for the decorrelator. Assume that d_k = −1. For binary modulation, an error occurs if d̂_DEC,k = 1, and its probability is given by

Pr ( d̂_DEC,k = 1 | d_k = −1 ) = Pr ( (R⁻¹S*z)_k > √P_k ). (4.52)

The quantity η_k = (R⁻¹S*z)_k is a Gaussian random variable since it is the sum of weighted Gaussian components. Its expectation and variance are easily computed as



E [η_k] = 0; (4.53)

var(η_k) = E [ R⁻¹S*zz*SR⁻¹ ]_kk = σ² [R⁻¹]_kk. (4.54)

With this, the error probability is given by the standard Gaussian integral

Pr ( d̂_DEC,k = 1 | d_k = −1 ) = Q ( √( P_k / (σ² [R⁻¹]_kk) ) ), (4.55)

where P_k is the energy per symbol of user k. As can be seen from (4.55), the only difference between antipodal signaling in AWGN and the error probability of the decorrelator is the noise enhancement factor [R⁻¹]_kk, which can be shown to be always larger than or equal to unity; that is, the decorrelator will always increase the noise and the bit error probability. Furthermore, the performance of user k is independent of the powers of any of the other users, since multiple-access interference is completely eliminated. The actual error rate depends very strongly on the signature sequences, but as P/σ² → ∞, the error probability P_e → 0. This has been referred to as near-far resistance.

Furthermore, as the minimum eigenvalue of R, λ_min → 0, the noise enhancement coefficient [R⁻¹]_kk → ∞, as can be seen by writing the inverse in terms of the spectral decomposition of R. Hence, any two spreading sequences with a high cross-correlation will degrade the performance of the decorrelator substantially.

Since σ² [R⁻¹]_kk ≥ σ², there is an inherent loss of performance suffered by decorrelation w.r.t. interference-free transmission. For example, consider a synchronous two-user system with

R = diag ( [1 ρ; ρ 1], · · · , [1 ρ; ρ 1] ), (4.56)

i.e. the block-diagonal matrix with 2 × 2 blocks [1 ρ; ρ 1], for which the noise variances can be calculated as follows: E[η*η] = σ²R⁻¹, and from there σ² [R⁻¹]_kk = σ²/(1 − ρ²). Hence, a correlation of 50% (ρ = 0.5) implies a loss of 1.25dB in signal-to-noise ratio. If we do not ignore the noise correlation, the users could share information about the noise and better performance would be possible using an optimal detector.

Figure 4.12 shows a geometrical perspective of the decorrelator. Let us start with the decorrelator output as the pseudo-inverse of the spreading matrix (using a synchronous system for illustration) multiplying the received chip-sampled vector

d̃_DEC = R⁻¹S*r = H†r. (4.57)

The rows h_j* of the pseudo-inverse H† are orthogonal to the spreading sequences s_i, i ≠ j, in S, as can be seen from


[ h₁* ; · · · ; h_K* ] [ s₁, · · · , s_K ] = [ (S*S)⁻¹S* ] S = I. (4.58)

The vector v_j in Figure 4.12 is the projection of s_j onto h_j, which is calculated as

v_j = (h_j* s_j) h_j / ‖h_j‖² = d̃_DEC,j h_j / ‖h_j‖². (4.59)

It must be orthogonal to all s_i, ∀i ≠ j, from (4.58). The length of the projection vector equals d̃_DEC,j multiplied by 1/‖h_j‖. This dependence of the projected length on h_j and d̃_DEC,j is unimportant for symbol-by-symbol detection, but is very important for multiuser decoders using FEC coding. There the metrics need to be adjusted to this scaling factor (see Section 6.2.1).

Fig. 4.12. Geometry of the decorrelator (showing s_j, h_j, v_j, span{s_k}_{k≠j} and its orthogonal complement).

4.4.4 The Decorrelator with Random Spreading Codes

As discussed above, the actual performance of the decorrelator depends heavily on the set of spreading sequences used, and few general conclusions can be drawn. If we consider the use of random spreading sequences, however, very precise results exist.

As derived in (4.57) ff., the output of the decorrelator for a given user k consists of a signal power component P_k (h_k*s_k)² and a noise component σ² s_k*s_k, giving a signal-to-noise ratio of the k-th user of

SNR_k = (P_k/σ²) (h_k*s_k)² / (s_k*s_k). (4.60)

We will show that for large systems the following theorem holds:


Theorem 4.1 (Decorrelator Performance in Random CDMA). If K → ∞, and L → ∞, such that the load β = K/L is constant, then

SNR_k −→ (P_k/σ²) (L − K + 1)/L. (4.61)

Proof. Assume that ‖h_k‖ = 1 and let Q = [q₁, · · · , q_{K−1}] be a basis for span{s_j}_{j≠k} of dimension K − 1. Since s_k is random, i.e. each component is randomly chosen,

E [q_l* s_k] = 0 (4.62)

and

E [q_l* s_k s_k* q_l] = q_l* (I/L) q_l = 1/L. (4.63)

That is, the length of the average projection onto an arbitrary basis vector is 1/L, irrespective of whether the system is synchronous or not. The average total length of the projection onto span{s_j}_{j≠k} is therefore (see Figure 4.12)

E [v_k* v_k] = E [ Σ_{l=1}^{K−1} Σ_{m=1}^{K−1} q_l* s_k s_k* q_m ] = Σ_{l=1}^{K−1} E [ q_l* (I/L) q_l ] = (K − 1)/L (4.64)

and, since v_k* v_k + v_k^⊥* v_k^⊥ = s_k* s_k = 1,

E [v_k^⊥* v_k^⊥] = (L − K + 1)/L. (4.65)

It remains to show that var(v_k^⊥* v_k^⊥) → 0 as L → ∞, which is accomplished in a similar fashion.
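Theorem 4.1 can also be checked empirically with a small Monte Carlo experiment; all parameter values below are illustrative, and the finite-size average only approximates the limit:

```python
import numpy as np

def decorrelator_snr_user0(L, K, sigma2=0.5, trials=200, seed=0):
    """Monte Carlo estimate of the decorrelator output SNR of user 0 for
    random +/-1/sqrt(L) signatures and unit powers (P_k = 1), to be
    compared with the large-system value (P/sigma^2)(L - K + 1)/L."""
    rng = np.random.default_rng(seed)
    acc = 0.0
    for _ in range(trials):
        S = rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)
        H = np.linalg.inv(S.T @ S) @ S.T      # pseudo-inverse; rows are h_k^*
        # decorrelator output for user 0 is d_0 + h_0^* z, so the SNR is
        # P_0 / (sigma^2 * ||h_0||^2)
        acc += 1.0 / (sigma2 * (H[0] @ H[0]))
    return acc / trials
```

For L = 64, K = 32, and σ² = 0.5 the empirical average lands close to the predicted (1/σ²)(L − K + 1)/L ≈ 1.03.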

From this theorem we can calculate the information theoretic capacity of the decorrelator layered single-user channel using random CDMA. Equation (4.61) states that the symbol energy, or the signal-to-noise ratio, of an arbitrary user k is reduced by the factor 1 − β + 1/L with respect to interference-free transmission. At the same time, the system has K channels in L dimensions. Therefore, the decorrelator (system) capacity is given as

C_DEC = (β/2) log₂ ( 1 + 2 (E_b/N₀) (1 − β + 1/L) ) bits/dimension. (4.66)

We recall the Shannon bound for an AWGN channel (4.41), and, from (4.66), we can analogously derive the Shannon bound for the decorrelator layered channel as


E_b/N₀ ≥ (2^{2C/β} − 1) / (2C(1 − β)), (4.67)

which is shown in comparison to the AWGN Shannon bound of (4.41) in Figure 4.13 for a few different system loads β.

Fig. 4.13. Shannon Bounds for the AWGN channel and the random CDMA decorrelator-layered channel (loads β = 0.25, 0.5, 0.75, 0.9). Compare with Figure 4.11.

As can be seen from Figure 4.13, system loads in the range of 0.5 ≤ β ≤ 0.75 are most efficient. For larger loads the energy loss of the decorrelator is too big, and for smaller loads the spectrum utilization is inefficient.

4.4.5 Minimum Mean-Square Error (MMSE) Filter

The MMSE detector is an estimation theory concept transplanted to the field of (multiuser) detection. The MMSE filter minimizes the variance at the output of the filter taking into account both channel noise and interference, but ignoring the data structure. As long as this structure is disregarded,


the MMSE filter provides the best possible preprocessing. Since the residual noise is asymptotically Gaussian [101], the minimum variance filter is also the best overall minimum variance estimator, and maximizes the capacity of the resulting layered channels.

The MMSE filter minimizes the squared error between the transmitted signals and the output of the filter. It is found as

M = arg min_M E [ ‖d − Mr‖² ]. (4.68)

Carrying out this minimization is standard [60], and makes use of the orthogonality principle:

∂/∂M E [ ‖d − Mr‖² ] = 2 E [ (d − Mr) r* ] = 0. (4.69)

Evaluating the various terms we obtain

E [dr*] = M E [rr*]
AS* = M ( SAA*S* + σ²I )
M = AS* ( SWS* + σ²I )⁻¹, (4.70)

where W = AA*. See also Section A.4.3. Using the matrix inversion lemma⁴ an alternate form of the MMSE filter (4.70) can be found as follows:

See also Section A.4.3. Using the matrix inversion lemma4 an alternate formof the MMSE filter (4.70) can be found as follows:

M = AS∗( S︸︷︷︸B

W︸︷︷︸C

S∗︸︷︷︸D

+ σ2I︸︷︷︸A

)−1

=1σ2

AS∗ − 1σ2

AS∗S(S∗ 1

σ2S + W−1

)−1 1σ2

=1σ2

A(I−R

(R + σ2W−1

)−1)S∗

= A∗−1(R + σ2W−1)−1

S∗. (4.71)

Scaling the filter output by A*⁻¹ does not affect hard decisions, hence we propose the following

Definition 4.2 (Minimum Mean-Square Error Detector). The Minimum Mean-Square Error Filter (MMSE) Detector outputs

d̂_MMSE = sgn ( ( R + σ²W⁻¹ )⁻¹ S*r ).

⁴ Given as (A + BCD)⁻¹ = A⁻¹ − A⁻¹B ( DA⁻¹B + C⁻¹ )⁻¹ DA⁻¹.


The MMSE detector ignores the data structure of the interference, and takes into account only its signal structure. In this class of receivers, the MMSE is optimal, and the capacity of the layered channel is maximized. The MMSE filter application to CDMA joint detection was first proposed in [171], and then further developed in [107] and [82].
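Definition 4.2 translates directly into a few lines of linear algebra. This sketch assumes a synchronous system (n = 1) and diagonal W = diag(P); the function name is illustrative:

```python
import numpy as np

def mmse_detect(r, S, P, sigma2):
    """MMSE detector of Definition 4.2: sgn((R + sigma^2 W^-1)^-1 S^* r),
    with R = S^*S and W = A A^* = diag(P) the users' received powers."""
    R = S.conj().T @ S
    W_inv = np.diag(1.0 / np.asarray(P, float))
    # solve the linear system instead of forming the inverse explicitly
    return np.sign(np.linalg.solve(R + sigma2 * W_inv, S.conj().T @ r).real)
```

For σ² → 0 the filter tends to the decorrelator, and for large σ² it approaches a (scaled) matched filter.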

4.4.6 Error Performance of the MMSE

The error probability of the MMSE receiver can be calculated analogously to the decorrelator. Again, assume d_k = −1. An error occurs if d̂_MMSE,k = 1 (for BPSK), and the error probability is calculated as

Pr ( d̂_MMSE,k = 1 | d_k = −1 ) = Pr ( ( ( R + σ²W⁻¹ )⁻¹ y )_k > 0 ). (4.72)

Introducing the abbreviation T = ( R + σ²W⁻¹ )⁻¹, we continue as follows:

Pr ( d̂_MMSE,k = 1 | d_k = −1 ) = Pr ( ( T ( RAd + S*z ) )_k > 0 )
                                = Pr ( (TRAd)_k + (TS*z)_k > 0 ). (4.73)

It is straightforward to show that

E [(TS*z)_k] = 0; var((TS*z)_k) = [ T S* E[zz*] S T ]_kk = σ² [TRT]_kk (4.74)

and that the error probability of user k is therefore given by

Pr ( d̂_MMSE,k = 1 | d_k = −1 ) = 2^{1−K} Σ_{d : d_k = −1} Q ( −(TRAd)_k / √(σ² [TRT]_kk) ). (4.75)

The error formula assuming d_k = 1 is completely analogous.

From (4.75) we can draw a few conclusions. First, the error probability depends on the powers of the interfering users, unlike for the decorrelator. The error probability also depends strongly on the spreading sequences S which are used, but, as σ² → 0, the error probability P_e → 0.

It is very informative to express the error of the MMSE in terms of the spectral decomposition of the correlation matrix. Evaluating the mean square error of the MMSE receiver we obtain

E [ ‖d − Mr‖² ] = tr ( I − A*S* ( SWS* + σ²I )⁻¹ SA )
                = K − tr ( SWS* ( SWS* + σ²I )⁻¹ )
                = K − Σ_{i=1}^{L} λ_i/(λ_i + σ²) = Σ_{i=1}^{K} σ²/(λ_i + σ²), (4.76)

where the eigenvalues λ_i are the K non-zero eigenvalues of SWS*. This particular expression will be useful in the next section.


4.4.7 The MMSE Receiver with Random Spreading Codes

As for the decorrelator, little can be said in general for any given specific set of spreading sequences. However, for the class of random spreading sequences, the results that are obtained are very precise, and in the limit of large systems, these results are no longer random, but become deterministic, as the variances of the random quantities involved vanish. It is this property, and the widespread acceptance of random spreading codes as practical alternatives, which make them popular both for theory and application.

First we note from (4.76) that the sum of the mean square errors of all K users equals

Σ_{k=1}^{K} MMSE_k = Σ_{i=1}^{K} σ²/(λ_i + σ²). (4.77)

In a random system MMSE_k will be equal for all k if P_k = P_{k′}, ∀k, k′, and

MMSE_k = Σ_{i=1}^{K} (1/K) · 1/(λ_i/σ² + 1)  −→ (K, L → ∞)  ∫₀^∞ 1/((P/σ²)λ + 1) f_λ(λ) dλ, (4.78)

where f_λ(λ) is the limiting eigenvalue distribution of SS*/K. We have come across this distribution in Chapter 3, Theorem 3.5. It was calculated by Bai and Yin [10] as

where f_λ(λ) is the limiting eigenvalue distribution of SS∗/K. We have come across this distribution in Chapter 3, Theorem 3.5; it was calculated by Bai and Yin [10] as

f_λ(x) = [1 − β^{-1}]^+ δ(x) + √( [x − a(β)]^+ [b(β) − x]^+ ) / (2πβx),

a(β) = (√β − 1)²;   b(β) = (√β + 1)²;   [z]^+ = max(0, z).

Verdú and Shamai [150] have shown that the following closed-form formula exists for the mean square error of an arbitrary user k in the case of synchronous CDMA:

MMSE_k = [ 1 + P/σ² − (1/4)F(P/σ², β) ]^{-1},   (4.79)

where the function F(x, z) is defined as

F(x, z) = [ √(x(1 + √z)² + 1) − √(x(1 − √z)² + 1) ]².   (4.80)

The signal-to-noise ratio and the mean square error of the MMSE receiver are related by

MMSE_k = 1/(1 + SNR_k)   (4.81)

and, conversely,

SNR_k = 1/MMSE_k − 1.   (4.82)

Combining (4.82) and (4.79) we arrive at the following theorem.


Theorem 4.2 (MMSE Performance in Random CDMA). If K → ∞ and L → ∞ such that the load β = K/L is constant, then, for synchronous CDMA,

SNR_k → P/σ² − (1/4)F(P/σ², β)   as K, L → ∞.   (4.83)
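The large-system formulas above are easy to check numerically. The following Python sketch (our illustration, not from the text; all function names are ours) compares the closed form (4.79) against the eigenvalue expression (4.77) for a finite random system with equal powers P = 1 and unit-energy ±1 spreading sequences:

```python
import numpy as np

def F(x, z):
    # Eq. (4.80)
    return (np.sqrt(x * (1 + np.sqrt(z))**2 + 1)
            - np.sqrt(x * (1 - np.sqrt(z))**2 + 1))**2

def mmse_large_system(snr, beta):
    # Verdu-Shamai closed form, eq. (4.79), with P/sigma^2 = snr
    return 1.0 / (1.0 + snr - 0.25 * F(snr, beta))

def mmse_empirical(snr, beta, L, rng):
    # Per-user MMSE from the eigenvalue form (4.77): mean of sigma^2/(lambda_i + sigma^2)
    K = int(beta * L)
    S = rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)   # unit-energy sequences
    lam = np.linalg.eigvalsh(S.T @ S)                       # non-zero eigenvalues of SWS*, W = I
    sigma2 = 1.0 / snr
    return np.mean(sigma2 / (lam + sigma2))
```

For P/σ² = 4 and β = 0.5 the two values agree to within a few percent already at L = 500; the corresponding layered SNR then follows from (4.82) as 1/MMSE_k − 1.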

Since the residual noise of the MMSE receiver is Gaussian, the capacity of the corresponding single-user channel follows as a straightforward application of the capacity formula (4.41), where the Gaussian noise SNR is replaced by (4.83). Using P = RE_b in order to normalize, we obtain

C_MMSE = (β/2) log[ 1 + 2RE_b/N_0 − (1/4)F(2RE_b/N_0, β) ].   (4.84)

Setting βR = C_MMSE leads to an implicit equation for the maximum spectral efficiency of the MMSE receiver, which is plotted in Figure 4.14 below for a few load values.

Observation of these capacity curves reveals some interesting properties of the respective filters. The MMSE receiver is most efficient in the low signal-to-noise ratio regime, where thermal noise dominates the channel impairments. Here, however, the simple matched filter performs almost as well. As the signal-to-noise ratio improves, the differences between the decorrelator and the MMSE receiver diminish. Like the decorrelator, albeit not as severely, the MMSE receiver loses efficiency as the system load is increased in the high signal-to-noise ratio regime.

Like the decorrelator, the MMSE receiver requires the inversion of a K × K correlation matrix at each symbol period. This represents an enormous complexity burden and is not desirable for actual system implementations. We will present lower-complexity iterative approximations for both of these preprocessors in the next chapter.

4.4.8 Whitening Filters

The whitening filter approach represents an intermediate stage between filtering and cancellation receivers. It is based on a matrix decomposition method, namely the Cholesky decomposition [55], which leads directly to the detector structure. The symmetric positive definite matrix R allows the decomposition R = F∗F, where F is lower triangular (and F∗ upper triangular), as already encountered in Section 4.2.3. The Cholesky decomposition can be accomplished in O(n³/6) operations, where n is the size of the matrix [55], and therefore has a complexity comparable to that of inverting the matrix. Furthermore, if R is invertible, so are F and F∗.

We now pre-multiply the correlator output (4.48) by F∗−1 and obtain


[Figure: capacity (bits/dimension) versus Eb/N0 in dB; curves for the AWGN capacity, the MMSE capacities at β = 0.5, 1, and 2, and the decorrelator at β = 0.5.]

Fig. 4.14. Shannon bounds for the AWGN channel and the MMSE layered single-user channel for random CDMA. Compare with Figures 4.11 and 4.13.

y_WF = F∗⁻¹y = FAd + z_WF.   (4.85)

The correlation of the filtered noise vector z_WF is easily calculated as

E[z_WF z∗_WF] = F∗⁻¹S∗ E[zz∗] SF⁻¹ = σ² F∗⁻¹RF⁻¹ = σ²I,   (4.86)

that is, the noise samples z_WF are white (uncorrelated) with common variance σ² per dimension. This property has given the filter F∗⁻¹ the name noise whitening filter. In a sense, F∗⁻¹ accomplishes the complement of the decorrelator, which whitens the data.

The benefit of y_WF is that, since F is lower triangular,

y_WF = ⎡ y_WF,1  ⎤   ⎡ F_11 √P_1 d_1                   ⎤
       ⎢ y_WF,2  ⎥ = ⎢ F_21 √P_1 d_1 + F_22 √P_2 d_2   ⎥ + z_WF   (4.87)
       ⎢   ⋮     ⎥   ⎢   ⋮                             ⎥
       ⎣ y_WF,Kn ⎦   ⎣ ∑_{i=1}^{Kn} F_{Kn,i} √P_i d_i  ⎦


and a successive interference cancellation detector suggests itself, which starts with d̂_1 = sgn(y_WF,1) and proceeds to decode successively

d̂_i = sgn( y_WF,i − ∑_{j=1}^{i−1} F_ij √P_j d̂_j ),   (4.88)

as shown in Figure 4.15. Since F is lower triangular with width K (see Section 4.4.9), even in the asynchronous case

d̂_i = sgn( y_WF,i − ∑_{j=1}^{K−1} F_{i,i−j} √P_{i−j} d̂_{i−j} ),   (4.89)

and we never need to cancel more than K − 1 symbols. However, in contrastto the decorrelator, and as for the MMSE receiver, this decoder requires theamplitudes A of all the users in order to function properly, a complicationwhich must be handled by the estimation part of the receiver.
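A minimal numpy sketch of the recursion (4.88) is given below (ours, assuming real-valued BPSK signalling; the helper lower_factor, which computes R = F∗F with F lower triangular via a row/column-reversed Cholesky factorization, is our construction and not from the text):

```python
import numpy as np

def lower_factor(R):
    # Return lower-triangular F with R = F^T F (real case), using the
    # exchange-matrix trick: J R J = C C^T with C = cholesky(J R J).
    J = np.eye(R.shape[0])[::-1]
    C = np.linalg.cholesky(J @ R @ J)
    return (J @ C @ J).T               # F is lower triangular

def whitening_sic(y, R, amps):
    # Eq. (4.88): whiten y with F^{*-1}, then cancel previous decisions.
    F = lower_factor(R)
    y_wf = np.linalg.solve(F.T, y)     # y_WF = F^{*-1} y
    d_hat = np.zeros(len(y))
    for i in range(len(y)):
        d_hat[i] = np.sign(y_wf[i] - F[i, :i] @ (amps[:i] * d_hat[:i]))
    return d_hat
```

In the noise-free case the detector recovers all symbols exactly; with noise, decisions should be ordered by decreasing F_kk²P_k, as discussed next.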

Since F is lower triangular, only previous decisions are required at each level, forming the input to the i-th decision device. If all previous i − 1 decisions are correct, i.e. d̂_j = d_j, j = 1, …, i − 1, the input to the i-th decision device is given by

y_WF,i − ∑_{j=1}^{i−1} F_ij √P_j d_j = F_ii √P_i d_i + z_WF,i,   (4.90)

and y_WF,i is free of interference from the other users' symbols. The discrete signal-to-noise ratio at the input of the decision device can now be calculated as

SNR_i = F_ii² P_i / σ².   (4.91)

Since we are starting with i = 1, it is important to rearrange the users such that they are ordered with decreasing SNR_i, or equivalently with decreasing F_ii²P_i. This minimizes error propagation: if one of the decisions d̂_i is erroneous, equation (4.91) no longer applies, and we have an instance of the error propagation typical for decision-feedback systems.

It is interesting to note that the filter F∗⁻¹ maximizes SNR_i for each symbol d_i, given correct previous decisions, as originally shown in [30, 31] and restated in

Theorem 4.3 (Local Optimality of the Whitening Filter). Among all causal decision-feedback detectors, F∗⁻¹ maximizes SNR_i for all i, given correct previous decisions d̂_j = d_j, j = 1, …, i − 1.


Fig. 4.15. The partial decorrelating feedback detector uses a whitened matrix filter as a linear processor, followed by successive cancellation.

Proof. Assume M is any filter such that My = T′d + z′, where T′ is lower triangular. We may write T′ = TF, where T must necessarily also be lower triangular. But E[z′z′∗] = σ²TT∗, and

SNR_i = t_ii² F_ii² P_i / ( σ² ∑_{j=1}^{i} t_ij² ),

which is maximized for all i by setting T = I.

Using optimal error control codes at each stage before cancellation, user 1 could operate at a maximum error-free rate per dimension of R_1 = (1/2) log(1 + F_11²P_1/σ²), and after cancellation the k-th decoder could operate at R_k = (1/2) log(1 + F_kk²P_k/σ²), yielding the maximum sum rate

R_sum = ∑_{k=1}^{K} R_k = ∑_{k=1}^{K} (1/2) log( 1 + F_kk²P_k/σ² ).   (4.92)

Since F is triangular, det(F) = ∏_{k=1}^{K} F_kk, and hence det(F∗WF) = ∏_{k=1}^{K} F_kk²P_k. We conclude further, since F∗F = S∗S, that from (4.92) we obtain

R_sum = (1/2) log det( I + SWS∗/σ² ) = C_sum,   (4.93)

which is the information theoretic capacity of the CDMA channel as derived in Chapter 3. Whitened filtering with cancellation therefore theoretically suffers no performance loss: it is a capacity-achieving serial cancellation method. Note, furthermore, that power ordering is no longer required if each level uses capacity-achieving coding. We will later show in Chapter 6 that a bank of MMSE filters with a successive cancellation structure like that in Figure 4.15 can also achieve the capacity of the CDMA channel.

Despite this, there are a number of difficulties associated with partialdecorrelation. Not only is exact knowledge of the power levels Pk required,but the transmission rates have to be tailored according to (4.92), and thatrequires accurate backward communication with the transmitters, i.e. feed-back. The decoding delay may also become a problem since each stage has towait until the previous stage has completed decoding of a particular symbolbefore cancellation can occur.

4.4.9 Whitening Filter for the Asynchronous Channel

For asynchronous CDMA and continuous transmission the frame length nmay become quite large. As a consequence the filter decomposition R = F∗Fvia a straightforward Cholesky decomposition becomes too complex. Since thedifferent symbol periods do not decouple like in the synchronous case, we needsome way of calculating this decomposition efficiently, exploiting the fact thatmost entries in the asynchronous correlation matrix R are zero.

We will make use of the fact that the matrix R has the special form of a band-diagonal matrix of width 2K − 1, i.e.

R = ⎡ R_0[0]   R∗_1[1]                         ⎤
    ⎢ R_1[1]   R_1[0]   R∗_2[1]                ⎥
    ⎢          R_2[1]   R_2[0]   R∗_3[1]       ⎥ .   (4.94)
    ⎢                   R_3[1]   R_3[0]   ⋱    ⎥
    ⎣                              ⋱      ⋱    ⎦

It can then be shown ([55], or by considering F∗F = R) that F is lower triangular with width K, or in block form

F = ⎡ F_0[0]                            ⎤
    ⎢ F_0[1]   F_1[0]                   ⎥
    ⎢          F_1[1]   F_2[0]          ⎥ ,   (4.95)
    ⎢                   F_2[1]   F_3[0] ⎥
    ⎣                            ⋱      ⎦

where

(i) F_i[0] is lower triangular, and
(ii) F_i[1] is strictly upper triangular.

Expanding R = F∗F we obtain the following set of equations for the individual blocks of F:

R_{n−1}[0] = F∗_{n−1}[0] F_{n−1}[0]   (4.96)
R_i[1] = F∗_i[0] F_{i−1}[1];   0 < i ≤ n − 1   (4.97)
R_i[0] = F∗_i[0] F_i[0] + F∗_i[1] F_i[1];   0 ≤ i < n − 1.   (4.98)


These equations can be solved recursively, starting with (4.96), which is a K × K Cholesky factorization. Substituting into (4.97) we find F_{n−2}[1] = (F∗_{n−1}[0])⁻¹ R_{n−1}[1], and F_{n−2}[0] can then be found by a Cholesky factorization of R_{n−2}[0] − F∗_{n−2}[1]F_{n−2}[1]. At this point the process becomes recursive. Note, however, that this decomposition starts at the end of the frame with i = n − 1, and we hence need to wait until the entire frame is received before processing can commence – an undesirable restriction.

For large values of n we cannot wait for the entire frame to be received, and we need another approach. The idea is to perform the Cholesky factorization "locally", which leads to a sliding window approximation of the algorithm above. Let us substitute the solution of (4.97), i.e.

F_i[1] = (F∗_{i+1}[0])⁻¹ R_{i+1}[1],   (4.99)

into (4.98), and we obtain

R_i[0] = F∗_i[0]F_i[0] + R∗_{i+1}[1] ( F∗_{i+1}[0]F_{i+1}[0] )⁻¹ R_{i+1}[1]
       = Y_i + R∗_{i+1}[1] Y⁻¹_{i+1} R_{i+1}[1],   (4.100)

where we have defined Y_i = F∗_i[0]F_i[0] for convenience. Again, if we start out at the end of the frame, i.e. i = n − 1, (4.100) can be solved recursively.

However, in order to obtain a continuous algorithm, we apply a sliding window to the equation above and start the recursion at i = r + j rather than at the end of the frame. This gives the following sliding window factorization algorithm [4, 6]:

Algorithm 4.4 (Asynchronous Cholesky Factorization).

Step 1: Set Y_{r+j−1} = R_{r+j−1}[0]. Let i = 2 and m = r + j − 2.
Step 2: Compute
        Y_m = R_m[0] − R∗_{m+1}[1] Y⁻¹_{m+1} R_{m+1}[1].   (4.101)
Step 3: If i < j, let i = i + 1, m = r + j − i, and go to Step 2.
Step 4: Y_{m=r} is the j-th-step approximation of the solution to (4.100). Find F_r[0] via the Cholesky factorization Y_r = F∗_r[0]F_r[0], and

        F_{r−1}[1] = (F∗_r[0])⁻¹ R_r[1].   (4.102)

Furthermore, it can be shown that the solutions of the above algorithm converge monotonically toward the correct solutions of (4.100) [4, 6]. The parameter j determines the accuracy of the approximate algorithm.
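The following Python sketch (ours; the block values are synthetic and purely illustrative) implements the backward recursion (4.100) started at the true frame end, together with the windowed j-step approximation of Algorithm 4.4:

```python
import numpy as np

def exact_Y(R0, R1):
    # Backward recursion (4.100), started at the frame end: Y_{n-1} = R_{n-1}[0]
    n = len(R0)
    Y = [None] * n
    Y[n - 1] = R0[n - 1]
    for i in range(n - 2, -1, -1):
        Y[i] = R0[i] - R1[i + 1].T @ np.linalg.solve(Y[i + 1], R1[i + 1])
    return Y

def window_Y(R0, R1, r, j):
    # Algorithm 4.4: initialize Y_{r+j-1} = R_{r+j-1}[0], iterate (4.101) down to m = r
    Y = R0[r + j - 1]
    for m in range(r + j - 2, r - 1, -1):
        Y = R0[m] - R1[m + 1].T @ np.linalg.solve(Y, R1[m + 1])
    return Y
```

With diagonally dominant blocks the window estimate converges rapidly toward the exact Y_r as the window length j grows, illustrating the monotone convergence stated above.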


4.5 Different Received Power Levels

A major complication with linear layering filters is their sensitivity to largepower fluctuations. Proper signal cancellation cannot be accomplished withoutdata estimation of the interfering users, and hence, linear preprocessors allsuppress the interfering signal spaces in one form or another. This rendersthem inefficient in scenarios with unequal received power distributions, as wewill show in this section.

4.5.1 The Matched Filter Detector

The situation of different received power levels is fairly easy to analyze for the correlation receiver. We have already mentioned that for the correlation, or matched filter, receiver, users with larger received power levels appear implicitly as "additional" users, reducing the capacity for the low-power users. More precisely, if we assume a large user population and impose the Lindeberg condition of asymptotic negligibility on the powers of the users, given formally as

P_k / ∑_{j=1, j≠k}^{K} P_j → 0   as K → ∞, ∀k,   (4.103)

the interference variance from (4.45) approaches for all users the value

var(I_k) = (1/L) ∑_{j=1}^{K} P_j + σ² = σ²_MF,   (4.104)

as K, L → ∞.

Analogously to (4.47) we can compute the system capacity as

C_MF = (1/2L) ∑_{k=1}^{K} log₂( 1 + P_k/σ²_MF )   bits/dimension.   (4.105)

Furthermore, applying Jensen's inequality to bring the sum inside the logarithm in (4.105), we obtain

(1/2L) ∑_{k=1}^{K} log₂( 1 + P_k/σ²_MF ) ≤ (β/2) log₂( 1 + P̄/σ²_MF ),   (4.106)

where P̄ = (1/K) ∑_{k=1}^{K} P_k is the average energy of the users. Comparing (4.106) with (4.47), we now see that unequal received power levels hurt the overall system efficiency, and that this loss can be dramatic. It is therefore not surprising that power control is such a vitally important aspect of current CDMA systems [154].
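The capacity loss predicted by (4.106) can be made concrete with a small numerical example (ours; the power profiles are illustrative). Both profiles below carry the same total power, so σ²_MF in (4.104) is identical for the two cases and only Jensen's inequality separates them:

```python
import numpy as np

def c_mf(P, L, sigma2):
    # Matched-filter system capacity (4.105) with interference variance (4.104)
    sigma2_mf = np.sum(P) / L + sigma2
    return np.sum(np.log2(1.0 + P / sigma2_mf)) / (2 * L)

L, sigma2 = 32, 0.1
P_equal = np.ones(16)                                            # beta = 0.5, equal powers
P_spread = np.concatenate([0.2 * np.ones(8), 1.8 * np.ones(8)])  # same total power
```

Evaluating c_mf for both profiles places the spread-power system strictly below the equal-power one, as (4.106) demands.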


The decorrelator shares the same fate as the matched filter receiver. In the decorrelator the signals are whitened, and the capacity of each layered channel is given by

C_DEC,k = (1/2) log( 1 + (P_k/σ²)(1 − β + 1/L) )   bits/use.   (4.107)

Applying Jensen's inequality in exactly the same manner as above shows that the system capacity of the decorrelator is maximized by the equal received power distribution, and that a substantial loss can occur if large power variations are present. Note that the result for the decorrelator does not depend on whether the Lindeberg condition (4.103) is met, since interferers are ideally suppressed.

4.5.2 The MMSE Filter Detector

The MMSE filter does not fare any better; it too achieves maximum system capacity for an equal power distribution. Let us start with the interference term. As for the matched filter receiver, the MMSE filter leaves a residual interference term that has noise and signal components. Following [138], it can be shown that this term is zero-mean with a variance given by

var(I_k) = σ² + (1/L) ∑_{i=1, i≠k}^{K} P_i P_k / ( P_k + P_i P_k/var(I_k) ).   (4.108)

We see that, contrary to the matched filter case, the interference variance is available only as an implicit equation. If we let γ_k = P_k/var(I_k), the above leads to an implicit equation for the signal-to-noise ratios of the different layered channels, given by

γ_k = P_k / ( σ² + (1/L) ∑_{i=1, i≠k}^{K} P_i P_k / ( P_k + P_i γ_k ) ).   (4.109)
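Since the right-hand side of (4.109) is monotone in γ_k, the implicit equation can be solved by simple fixed-point iteration. A sketch (ours; it assumes the large-system expression holds for the finite power profile handed in):

```python
import numpy as np

def mmse_sir(P, k, L, sigma2, iters=500):
    # Fixed-point iteration of eq. (4.109) for user k's layered SNR gamma_k
    P = np.asarray(P, dtype=float)
    others = np.delete(P, k)
    gamma = P[k] / sigma2                 # start at the interference-free SNR
    for _ in range(iters):
        eff = np.sum(others * P[k] / (P[k] + others * gamma)) / L
        gamma = P[k] / (sigma2 + eff)
    return gamma
```

For equal powers the fixed point reproduces the closed form of Theorem 4.2; e.g. with P = 1, σ² = 0.1, and β = 1/2, both give γ ≈ 5.75.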

We now apply the same arguments as above, i.e. the total system capacity is calculated as

C_MMSE = (1/2L) ∑_{k=1}^{K} log₂(1 + γ_k) ≤ (β/2) log₂(1 + γ̄),   (4.110)

where γ̄ = (1/K) ∑_{k=1}^{K} γ_k is the average signal-to-noise ratio of the users. Showing that γ̄ is maximized if all γ_k are equal establishes equality in the upper bound (4.110), and that unequal receiver signal-to-noise ratios lead to an overall system capacity loss. Tse and Hanly have established the related result that if we require all receivers to have the same γ_k, then the equal received power distribution minimizes the total power. Again, the MMSE performance suffers in an unequal power environment.

The above analytical derivations are difficult to visualize, and we therefore present a few example scenarios. Four power scenarios will be discussed: the first is the optimal equal power situation; the second is the strong user case, where a single user has significantly larger power than the rest of the users, which have equal received power. The situation is reversed in the weak user case, where one user has substantially less power than the remaining users, and the fourth case is that of two equal-sized groups, each with a different power.

The equal power case will be used as a reference. The weak user case is easily dismissed, since the weak user has no effect on the remaining users; the situation is simply that the weak user has a very poor channel. The remaining two cases are more interesting.

Starting from the first equality in (4.110), we compute the average bit energy to noise power spectral density ratio as E_b/N_0 = ∑_{k=1}^{K} P_k/(σ²C_MMSE). Equation (4.110) is the spectral efficiency; plotting it against E_b/N_0 gives the Shannon bounds shown in Figures 4.16 and 4.17. The examples are calculated for an eight-user system and the two cases of the strong user (Figure 4.16) and the two equal-sized power groups (Figure 4.17). The resulting Shannon bounds are drawn for four levels of power difference, 10 dB, 15 dB, 20 dB, and 30 dB, and for a system load of β = 8/15. It becomes evident that the single powerful user has the most detrimental effect on the overall system efficiency, since, in essence, everybody else is trying to suppress the strong user and thus decreases their layered capacities.

The case of a single strong user is, in a certain sense, the worst-case situation. If we require a minimum power level for the lowest-power user, the specific power distribution with a single strong user cannot be written as a weighted combination of other power distributions. Furthermore, it can be shown that both the decorrelator and MMSE capacities are convex in the powers W. Consequently, any distribution other than the ones with a single strong user, which can be written as a linear combination of the former, has a larger capacity.

We will see in Chapter 6 that interference cancellation, where the signals of these strong users are detected or estimated, fares much better and does not suffer from this caveat of unequal received powers. In fact, there we will show that the equal received power situation is the worst-case scenario, not the best case as it is here with linear preprocessing. One can, of course, argue that the average energy is not an appropriate measure of the system energy load, but similar conclusions can be drawn with other definitions of power efficiency.


Fig. 4.16. Shannon bounds for an MMSE joint detector for the unequal received power scenario of one strong user and equal power for the remaining users.


Fig. 4.17. Shannon bounds for an MMSE joint detector for the case of two power classes with equal numbers of users in each group.

5 Implementation of Multiuser Detectors

5.1 Iterative Filter Implementation

We have seen in the previous chapter that linear filtering is a very powerful joint detection method. Unfortunately, most of the efficient linear multiuser receivers require a matrix inversion, such as the MMSE detector of Section 4.4.5 and the decorrelator of Section 4.4.2. Each requires a matrix inverse of the size of the system, that is, the maximum number of users. For high-speed implementations, the required complexity may be prohibitive. Furthermore, if random spreading sequences are used, as in IS-95 [135], cdma2000 [134], and 3GPP [38], these inverses cannot be found via adaptive signal processing methods, and have to be computed for each symbol interval.

The situation with a straightforward matrix inversion is further complicated if the system is asynchronous, see Figure 2.7, where the correlation matrix becomes band-diagonal. Recursive ways to invert such a matrix are known; however, they require a windowed procedure analogous to the one discussed in Section 4.4.9. In the remainder of this chapter we discuss iterative matrix inversion techniques [8, 34, 35] applied to linear multiuser detectors which require a matrix inverse. These techniques have been developed in the linear algebra community to reduce the complexity of solving linear algebraic equations. Well-known methods such as the Jacobi, first- and second-order stationary, Chebyshev, and Gauss–Seidel iterative methods can all be applied to multiuser detection. In subsequent sections we explore these methods and their convergence behavior as applied to random CDMA systems.

5.1.1 Multistage Receivers

We start this chapter with the concept of a multistage receiver, and then show that this receiver is, in fact, the realization of an iterative matrix inversion method. Multistage receivers were developed as a response to the complexity problem of multiuser receivers, initially without reference to iterative matrix solution methods.


Let us start with the outputs of the bank of matched filters, i.e. with (4.48),

y = S∗r = RAd + z′.   (4.48)

Let y[j] be the sub-vector of K matched filter outputs corresponding to the j-th symbol interval, i.e. y[j] = (y_{jK+1}, …, y_{(j+1)K}), and therefore y = (y[1], …, y[n]). Decomposing (4.48) into equations for each symbol interval as shown in Figure 5.1, we obtain

y[j] = R_1[j]Ad[j−1] + R_0[j]Ad[j] + R∗_1[j+1]Ad[j+1] + z[j],   (5.1)

where A = diag(√P_1, …, √P_K) is the synchronous amplitude matrix (see Section 2.4). Rearranging terms in (5.1) and ignoring the noise we obtain

d[j] = A⁻¹( y[j] − R_1[j]Ad[j−1] − (R_0[j] − I)Ad[j] − R∗_1[j+1]Ad[j+1] ).   (5.2)

Fig. 5.1. Illustration of the asynchronous blocks in the correlation matrix R.

If we have available an initial estimate of the symbols d[j−1],d[j],d[j+1],denoted by d(0)[j−1],d(0)[j],d(0)[j + 1], we may use the above equation tocompute iteratively



d^{(m+1)}[j] = A⁻¹( y[j] − R_1[j]Ad^{(m)}[j−1] − R_0,lower[j]Ad^{(m)}[j] − R_0,upper[j]Ad^{(m)}[j] − R∗_1[j+1]Ad^{(m)}[j+1] ),   (5.3)

from the previous estimates d^{(m)}[j−1], d^{(m)}[j], d^{(m)}[j+1], where R_0,lower[j] is the strictly lower triangular part of R_0[j], and R_0,upper[j] the strictly upper triangular part. The initial estimate may be taken from (4.48), i.e. Ad^{(0)}[j] = y[j]. Of course, multiplication and division by A need not be carried out in (5.3), since we can iterate over Ad[j] rather than d[j]. This is, in essence, the basic idea behind the multistage detectors originally proposed and considered for implementation in [29, 94, 145, 173]. The block diagram of such a multistage receiver is shown in Figure 5.2.

Fig. 5.2. The multistage receiver for synchronous and asynchronous CDMA systems.

If we study this detector carefully we see that we may replace the previousestimates as soon as the new ones become available, rather than wait untilan entire iteration has been carried out. We then obtain a modified iterativeupdate equation, given by


d^{(m+1)}[j] = A⁻¹( y[j] − R_1[j]Ad^{(m+1)}[j−1] − R_0,lower[j]Ad^{(m+1)}[j] − R_0,upper[j]Ad^{(m)}[j] − R∗_1[j+1]Ad^{(m)}[j+1] ).   (5.4)

It turns out that this version of the multistage receiver converges much faster to a good solution; we will see why in the next subsection.

In the implementations presented here, the intermediate estimates d^{(m)}[j] are real-valued. The original multistage receivers used hard decisions at each stage, arguing that the symbols satisfy d_k ∈ {±1}. It turns out that hard-decision multistage receivers show slightly better performance (see Section 5.1.2), but are significantly more difficult to analyze, since the linear analysis techniques discussed below do not apply.
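The faster convergence of the decision-updating version can be made concrete by comparing the spectral radii of the two iteration matrices for a synchronous system. The sketch below (ours) uses the standard Jacobi and Gauss–Seidel splittings of R; for symmetric positive definite R the Gauss–Seidel radius is guaranteed below one, and at light loads it is typically well below the Jacobi radius:

```python
import numpy as np

rng = np.random.default_rng(3)
L, K = 256, 8
S = rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)
R = S.T @ S                          # unit-diagonal correlation matrix

B_jacobi = np.eye(K) - R             # splitting S = I, T = I - R, as in (5.3)
S_gs = np.tril(R)                    # lower triangle incl. diagonal, as in (5.4)
B_gs = np.linalg.solve(S_gs, S_gs - R)

rho_j = np.max(np.abs(np.linalg.eigvals(B_jacobi)))
rho_gs = np.max(np.abs(np.linalg.eigvals(B_gs)))
```

The smaller spectral radius of the Gauss–Seidel matrix translates directly into fewer stages for the same accuracy.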

5.1.2 Iterative Matrix Solution Methods

We will now show that (5.4) is in fact an iterative version of the decorrelator equation (4.49); specifically, (5.4) is the well-known Gauss–Seidel method [55, 131] for iteratively computing the solution of a linear algebraic equation.

Borrowing from the ample mathematical literature devoted to the problem of iterative equation solving, see e.g. [8, 55, 60, 131], we start by considering an arbitrary linear system equation, given by Mx = b, and let M = S − T be a splitting of M. Now

Sx = Tx + b   ⟹   Sx^{(m+1)} = Tx^{(m)} + b.   (5.5)

We have thus generated an iterative solution approach for x:

x^{(m+1)} = S⁻¹( Tx^{(m)} + b ).   (5.6)

The question is whether repeated iterations of (5.6) will lead to the solution of Mx = b. This question is quickly settled by studying the error e^{(m)} at iteration m. Let x^{(m)} = x + e^{(m)}; then

S(x + e^{(m+1)}) = T(x + e^{(m)}) + b
Se^{(m+1)} = Te^{(m)}
e^{(m+1)} = S⁻¹Te^{(m)} = (S⁻¹T)^{m+1} e^{(0)}.   (5.7)

Using the spectral decomposition theorem to write S⁻¹T = QΛQ∗, where Λ is a diagonal matrix of the eigenvalues of S⁻¹T and Q is unitary, the iteration error is expressed by

e^{(m+1)} = QΛ^{m+1}Q∗e^{(0)}.   (5.8)

The matrix Λ^{m+1} contains the powers of the eigenvalues λ_k of S⁻¹T on its diagonal, and therefore e^{(m+1)} is dominated by the largest eigenvalue of S⁻¹T, also known as the spectral radius of S⁻¹T, i.e. λ_max = ρ(S⁻¹T). As long as ρ(S⁻¹T) < 1, the error e^{(m+1)} will decrease at each iteration, and e^{(m+1)} → 0, irrespective of the starting vector x^{(0)}. Different iterative solution methods are now obtained by choosing different splittings of the original matrix (for an in-depth discussion, see [8]).


5.1.3 Jacobi Iteration and Parallel Cancellation Methods

If we let M = R and solve M(Ad) = y, then

d^{(m+1)} = A⁻¹( (I − R)Ad^{(m)} + y )   (5.9)

is an iterative solution for the decorrelator output A⁻¹R⁻¹y. The particular splitting used is S = D = I (the unit diagonal of R), and yields the iterative solution known as Jacobi's method. Writing (5.9) out in block matrix form for an asynchronous system, we obtain the exact form of the multistage equation (5.3).

The iterations may be started with an arbitrary initial vector, e.g. d^{(0)} = y. Of course, the initial error affects the number of iteration steps required to achieve a given accuracy, and the initial estimate should be found with low complexity, which is the case if we use the matched filter outputs r_MF. Furthermore, it has been noted that using the matched filter outputs as a starting point is faster than initializing the iterations with the all-zero estimate [66].

Equation (5.3) corresponds to parallel interference cancellation, whose circuit implementation is shown in Figure 5.2; i.e., if the soft decisions from the previous stage are used in the cancellation procedure, the Jacobi iterative decorrelator is identical to the multistage receiver presented in the last section. This connection of iterative matrix inversion with parallel and serial interference cancellation has been explored by a number of authors [34, 35, 58].

Figure 5.3 shows the simulated performance of an iterative Jacobi decorrelator as a function of the number of iterations for a system load of β = 0.16 and Es/N0 = 7 dB. The performance is compared to the first- and second-order stationary iteration methods discussed below. The performance approaches that of the decorrelator within about 5 iterations, then dips below the decorrelator error probability, and finally asymptotically approaches its performance again, as predicted by the theory.

From the convergence criterion it is clear that the Jacobi decorrelator converges if ρ(R − I) < 1 ⟹ ρ(R) < 2, for a given fixed set of signature sequences S, and irrespective of the amplitudes A.

If the spreading sequences are random, we have the following

Theorem 5.1. For large systems with random spreading codes, the Jacobi iterative receiver using the splitting S = D = I converges to the decorrelator if and only if K < L(√2 − 1)², and the asymptotic convergence factor¹ is given by

ρ_J = β + 2√β.   (5.10)

[Figure: average bit error rate (10⁻³ to 10⁻¹) versus number of iterations (2 to 16).]

Fig. 5.3. Example performance of a parallel cancellation implementation of the decorrelator as a function of the number of iteration steps.

Proof. The Jacobi iteration converges if and only if ρ(I − R) < 1. Now, for any eigenvalue λ of R, 1 − λ is an eigenvalue of I − R [60]. The iteration therefore converges if and only if max_{λ(R)} |1 − λ| < 1. According to Theorem 3.5 in Chapter 3, ρ(R) converges in probability to (√β + 1)² if K, L → ∞ and K/L = β is fixed, and the minimum eigenvalue of R converges to (√β − 1)². For β ∈ [0, 1], therefore, |1 − λ_min| = 2√β − β ≤ |1 − λ_max| = 2√β + β, and hence the condition 2√β + β < 1 implies that K < L(√2 − 1)².

The above proof is strictly true only for synchronous systems due to The-orem 3.5, for which extensions to asynchronous matrices are not readily avail-able. However, numerical evidence indicates that asynchronous systems followthe same performance and convergence laws.
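The prediction ρ_J = β + 2√β of Theorem 5.1 is easy to check empirically for a synchronous system (a sketch, ours):

```python
import numpy as np

def jacobi_radius(L, beta, rng):
    # Spectral radius of I - R for random unit-energy spreading sequences
    K = int(beta * L)
    S = rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)
    lam = np.linalg.eigvalsh(S.T @ S)
    return np.max(np.abs(1.0 - lam))

rng = np.random.default_rng(4)
print(jacobi_radius(1000, 0.1, rng))   # close to 0.1 + 2*sqrt(0.1), i.e. roughly 0.73
```

Above the threshold β = (√2 − 1)² ≈ 0.17 the empirical radius exceeds one and the iteration diverges, matching the theorem.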

Since the decorrelator and the MMSE filter (4.71) differ only in the diagonal components of the filter matrix, we can equally well implement an iterative Jacobi MMSE filter. From Chapter 4 the MMSE solution is given as

Ad_MMSE = ( R + σ²W⁻¹ )⁻¹ y.   (4.71)

Using the splitting S = I + σ²W⁻¹ = W⁻¹(W + σ²I) leads to the iterative implementation¹

¹ For a given norm of an iteration matrix B, the k-step convergence factor is given by ‖B^k‖. However, for any matrix norm, lim_{k→∞} ‖B^k‖^{1/k} = ρ(B). Thus the average convergence factor per step approaches the spectral radius [60].


d^{(m+1)} = A⁻¹( W + σ²I )⁻¹ W ( (I − R)Ad^{(m)} + y ).   (5.11)

For this iterative receiver the following convergence result is proven in [58]:

Theorem 5.2. For large systems and random spreading sequences, the Jacobi iterative receiver converges to the MMSE receiver if

K < L( √(2 + γ⁻¹_max) − 1 )²,   where γ_max = max_k (w_k/σ²).   (5.12)

Moreover, for equal power users (5.12) is also tight, and therefore necessary.

Proof. Note that

ρ(S⁻¹T) = ρ( (W + σ²I)⁻¹ W (I − R) )   (5.13)
        ≤ ρ( (W + σ²I)⁻¹ W ) ρ(I − R)   (5.14)
        = (λ_max − 1) max_{i=1,…,K} w_i/(w_i + σ²)   (5.15)
        = (λ_max − 1) · 1/(1 + γ⁻¹_max),   (5.16)

since the spectral radius is a matrix measure. After some straightforward algebra, (5.12) follows.

For equal powers, (W + σ²I) is just a scaled identity matrix and (5.14) becomes an equality. The iterations (5.11) can be made to converge for all β ≤ 1 if γ⁻¹_max ≥ 2 ⟹ γ_max ≤ −3 dB. However, in this uninteresting case the performance for all users would be quite poor.

Figure 5.4 shows simulation results for K = 8 to illustrate that the asymptotic results are approached quite rapidly. The curve labeled "hard" uses hard decisions at each iteration, i.e. d^{(m)} → sgn(d^{(m)}) before the next iteration. This was the procedure suggested in the original papers on multistage decoding [145, 146]. Clearly observable is the failure of the Jacobi decorrelator for β > 0.16, even though K is small.

It is worth noting that the hard-iteration Jacobi receiver outperforms the decorrelator for small loads β, a phenomenon we have seen before in Figure 5.3. This derives from the facts that the decorrelator does not achieve the minimum mean-square error, and that we are measuring hard-decision error rates obtained from the soft estimates d^{(m)}.

From iterative matrix solution theory, several more advanced methods exist, such as higher-order stationary methods using a fixed iteration scheme, or non-stationary methods such as the Chebyshev method [8, 60]. All of these methods lead to variations of parallel interference cancellation, the details of which are discussed in [58]. Furthermore, the asymptotic convergence factors differ slightly between the methods; they are tabulated for comparison at the end of the next section.

Fig. 5.4. BER performance of Jacobi receivers versus system load for an equal power system, i.e. A = I.

Note also that any invertible K × K matrix M has a series expression of the inverse M^{-1} with at most K terms, since each matrix fulfills its own characteristic equation [60, 131]. This is the celebrated Cayley–Hamilton theorem:

Theorem 5.3. A matrix M fulfills its own characteristic equation c(λ) = det(M − λI) = 0, i.e.

c(M) = c_K M^K + c_{K−1} M^{K−1} + · · · + c_1 M = −c_0 I.   (5.17)

From Theorem 5.3 we can find an exact inverse for M as

M^{-1} = −(1/c_0)(c_K M^{K−1} + · · · + c_1 I).   (5.18)

However, computation of the series coefficients c_k is tantamount to inverting the matrix and is thus not practical.
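The inverse formula (5.18) can be checked numerically on a small example. The sketch below is purely illustrative: it uses numpy.poly, which returns the coefficients of det(λI − M) rather than det(M − λI), so the computed coefficients differ from the c_k in (5.17) by a common sign; the inverse formula carries over unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
K = 4
G = rng.standard_normal((K, K))
M = G @ G.T + K * np.eye(K)          # symmetric positive definite, hence invertible

a = np.poly(M)                       # det(lambda*I - M): a[0] = 1, ..., a[K]
# Cayley-Hamilton: M^K + a[1] M^(K-1) + ... + a[K] I = 0, so
# M^{-1} = -(1/a[K]) * (M^(K-1) + a[1] M^(K-2) + ... + a[K-1] I)
Minv = np.zeros_like(M)
for coeff in a[:-1]:                 # Horner evaluation of the degree-(K-1) polynomial
    Minv = Minv @ M + coeff * np.eye(K)
Minv *= -1.0 / a[-1]

err_ch = np.max(np.abs(Minv @ M - np.eye(K)))
```

As the text notes, this is no computational shortcut: obtaining the coefficients already costs as much as a matrix inversion.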

[Figure 5.4 plot: average bit error rate versus system load β for the Jacobi, hard-decision, and decorrelator receivers at Eb/N0 = Es/N0 = 7dB; the threshold β = (√2 − 1)² is indicated.]


5.1.4 Stationary Iterative Methods

The limitations of the Jacobi iterative method can be avoided by generalizing the iteration equation (5.9). This is accomplished by the first-order stationary iterative solution, given by

Ad^(m+1) = Ad^(m) − τ^(m)(MAd^(m) − y),   (5.19)

where τ^(m) is a sequence of constants [8]. If τ^(m) = τ for all m, the method is called stationary. Clearly, we now want to choose τ such that the spectral radius of the iteration matrix S^{-1}T is minimized. This is accomplished by the following

Theorem 5.4. For large systems with β < 1, the first-order stationary iteration

Ad^(m+1) = (1/(1 + β)) y − ((1/(1 + β)) R − I) Ad^(m)   (5.20)

converges to the decorrelator for any initial vector d^(0), and the constant τ = 1/(1 + β) maximizes the convergence speed. The asymptotic convergence factor is given by

ρ_S1 = 2√β/(1 + β).   (5.21)

Proof. According to [8, Theorem 5.6], for M symmetric positive definite, the parameter which results in the fastest convergence of the first-order stationary iteration is τ_opt = 2/(λ_min + λ_max). Applying the limiting distribution of the eigenvalues of R from Theorem 3.5 completes the proof.

A second-order iteration is a generalization of (5.19) using the two most recent estimates, given by

Ad^(m+1) = a^(m) Ad^(m) + (1 − a^(m)) Ad^(m−1) − b^(m)(MAd^(m) − y).   (5.22)

For a stationary iteration we set a^(m) = a and b^(m) = b, as well as a^(0) = 1 and b^(0) = τ_opt.

In [58] the following theorem is proven:


Theorem 5.5. For large systems with β < 1, the second-order stationary iteration

Ad^(m+1) = y − (R − (1 + β)I) Ad^(m) − β Ad^(m−1)   (5.23)

with

Ad^(1) = y − (1/(1 + β))(R − I) Ad^(0)   (5.24)

converges to the decorrelator for any initial vector d^(0), and the chosen parameters are optimal for second-order stationary methods. The asymptotic convergence factor is given by

ρ_S2 = √β.   (5.25)

Letting M = R + σ²W^{-1}, both the first- and second-order stationary methods can also be used to implement the MMSE filter.
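The two stationary iterations are compactly summarized in the following sketch (our own illustration, assuming equal powers A = I and a random-spreading system with β = 0.125): both the first-order recursion (5.20) and the second-order recursion (5.23)–(5.24) are run for a fixed number of steps and compared against the exact decorrelator output.

```python
import numpy as np

rng = np.random.default_rng(2)
L_dim, K = 64, 8
beta = K / L_dim                     # system load
S = rng.choice([-1.0, 1.0], size=(L_dim, K)) / np.sqrt(L_dim)
R = S.T @ S
r = S @ rng.choice([-1.0, 1.0], size=K) + 0.3 * rng.standard_normal(L_dim)
y = S.T @ r
x_dec = np.linalg.solve(R, y)        # exact decorrelator output (A = I)

# First-order stationary iteration (5.20), tau = 1/(1 + beta):
x1 = y.copy()
for _ in range(100):
    x1 = y / (1 + beta) - (R / (1 + beta) - np.eye(K)) @ x1

# Second-order stationary iteration (5.23) with the start-up step (5.24):
x_old = y.copy()
x2 = y - (R - np.eye(K)) @ x_old / (1 + beta)
for _ in range(100):
    x2, x_old = y - (R - (1 + beta) * np.eye(K)) @ x2 - beta * x_old, x2

err1 = np.max(np.abs(x1 - x_dec))
err2 = np.max(np.abs(x2 - x_dec))
```

The second-order recursion converges faster per step, in line with the factors √β versus 2√β/(1 + β) of Theorems 5.4 and 5.5.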

5.1.5 Successive Relaxation and Serial Cancellation Methods

In this section we explore a very efficient splitting, which leads to the so-called successive relaxation technique known in linear algebra [60]. It will also lead to a modified cancellation method, namely serial cancellation. We start with the matrix splitting M = S − T with S = (1/ω)D − L and T = ((1 − ω)/ω)D + L*, where −L is the strictly lower triangular part of M and D is its diagonal, so that M = D − L − L*. The parameter ω is the relaxation parameter. The resulting iterative solution can be obtained after a few algebraic manipulations as

((1/ω)D − L) Ad^(m+1) = y + (((1 − ω)/ω)D + L*) Ad^(m).   (5.26)

As long as ω ∈ (0, 2), (5.26) always converges [58]. In fact, what is required is that ρ(((1/ω)D − L)^{-1}(((1 − ω)/ω)D + L*)) < 1. It can be shown that this is always the case for symmetric matrices M with positive diagonal elements, and hence for the two matrices M = R and M = R + σ²W^{-1}, which are of most interest to joint linear detection.

The relaxation parameter can be designed to optimize the convergence speed and lies between 1 and 1.8 for β ∈ [0, 1]. Note that we have again assumed the starting vector Ad^(0) = y.

If we choose ω = 1, a particularly simple successive iteration scheme results, known as the Gauss-Seidel method, and given by

(D − L) Ad^(m+1) = y + L* Ad^(m).   (5.27)


Since D − L is lower triangular and L* is upper triangular, the iteration solution d^(m+1) can be calculated successively by starting with its first entry d^(m+1)[1], which depends only on previous iteration results d^(m), and continues as

√P_k d^(m+1)[k] = y[k] − Σ_{i=1}^{k−1} R_ki √P_i d^(m+1)[i] − Σ_{i=k+1}^{K} R_ki √P_i d^(m)[i].   (5.28)

Inspection of (5.28) and (5.4) reveals that multistage decoding according to (5.4) is an instance of the Gauss-Seidel iterative matrix solution method for an asynchronous system.

Let us consider an alternate derivation [39] of the Gauss-Seidel method which will allow us to gain further intuitive understanding of its operation. Consider the quadratic function

J(Ad) = ‖r − SAd‖²₂   (5.29)
      = d*ARAd − d*AS*r − r*SAd + c   (5.30)

J(x) = x*Rx − b*x − x*b + c,   (5.31)

with x = Ad and b = S*r = y. Since R is positive definite, the function J(x) is convex and has a single minimum. This minimum can be computed exactly by setting the gradient of J(x) equal to zero, and will lead to the closed-form MMSE solution discussed in Section 4.4.5.

However, in an iterative fashion the Gauss-Seidel method approaches this minimum with successive minimization steps over each element of x, assuming the rest of the elements of x to be fixed. Algorithmically, if the i-th element of x is varied to minimize J(x), its partial derivative

∂J(x)/∂x[i] = 2 Σ_{l=1}^{K} R_il x[l] − 2b[i]   (5.32)

is set to zero, leading to

x^(m+1)[i] = (1/R_ii)(b[i] − Σ_{l=1}^{i−1} R_il x^(m+1)[l] − Σ_{l=i+1}^{K} R_il x^(m)[l])
           = x^(m)[i] + (1/R_ii)(b[i] − Σ_{l=1}^{i−1} R_il x^(m+1)[l] − Σ_{l=i}^{K} R_il x^(m)[l]).   (5.33)

Executing the Gauss-Seidel updates according to (5.33) (second line) not only is numerically robust, but also allows for a simple visualization of the update procedure, shown in Figure 5.5 for a system with two variables, where the ellipses are the contour lines for a constant value of J(x). Note that the Gauss-Seidel method optimizes each variable in turn, leading to the indicated zig-zag trajectory towards the global minimum, which represents the MMSE solution.

Fig. 5.5. Visualization of the Gauss-Seidel update method as an iterative minimization procedure, minimizing one variable at a time. The algorithm starts at point A.
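The coordinate-wise minimization just described amounts to only a few lines of code. The following sketch (our own, illustrative; equal powers, A = I) implements the second line of (5.33) — note that the matrix-vector product R[i, :] @ x automatically picks up the already-updated entries x[l] for l < i and the old entries for l ≥ i:

```python
import numpy as np

rng = np.random.default_rng(3)
L_dim, K = 64, 8                     # spreading length and number of users
S = rng.choice([-1.0, 1.0], size=(L_dim, K)) / np.sqrt(L_dim)
R = S.T @ S                          # positive definite correlation matrix
r = S @ rng.choice([-1.0, 1.0], size=K) + 0.3 * rng.standard_normal(L_dim)
b = S.T @ r                          # matched filter output, b in (5.31)

x = b.copy()                         # starting vector
for _ in range(200):                 # Gauss-Seidel sweeps, second line of (5.33)
    for i in range(K):
        # R[i, :] @ x already contains the updated entries x[l], l < i
        x[i] += (b[i] - R[i, :] @ x) / R[i, i]

err_gs = np.max(np.abs(x - np.linalg.solve(R, b)))
```

Each inner step is exactly one serial cancellation update, which is why the Gauss-Seidel method coincides with serial interference cancellation.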

Figure 5.6 shows the simulated performance of some of the advanced iterative receiver systems, including the Gauss-Seidel method, for a load of β = 8/32 = 0.25 at a signal-to-noise ratio of Es/N0 = 10dB. The target filter was the MMSE filter. As can be seen, the stationary methods have roughly the same performance, while the advanced methods are slightly faster. The difference between different over-relaxation methods is minor, which means that the simple Gauss-Seidel method is preferable in practical applications due to its simplicity. The conjugate gradient method is in general the fastest, but it is more complex than the simpler parallel and serial cancellation methods, which have the additional advantage that their implementations can be pipelined to reduce latency.

Using the results on the asymptotic behavior of the eigenvalues of random matrices as discussed in Chapter 3, Grant and Schlegel [58] have derived the following table of convergence factors for the different iterative methods:


Fig. 5.6. Iterative MMSE filter implementations for random CDMA systems.

Method         Optimal Parameters                                    Converges    Factor
Jacobi         –                                                     β < 0.17     β + 2√β
Stationary 1   τ_opt = (1 + β)^{-1}                                  β < 1        2√β/(1 + β)
Stationary 2   a_opt = 1 + β, b_opt = 1                              β < 1        √β
Chebyshev      τ_k = [1 + β + 2√β cos((k − 1/2)π/(k_max + 1))]^{-1}  β < 1        √β
Succ. Relax.   ω_opt = 2/(2 − β)                                     β < 1        β^{1/4}

All but the successive relaxation method are parallel interference cancellation schemes.

5.1.6 Performance of Iterative Multistage Filters

In the previous sections we have seen how linear filters requiring matrix inversions can be implemented via multistage filtering structures which realize iterative solutions to the linear algebraic matrix problem. The performance of these filters will, in the limit of a large number of stages, exactly approach that of the original filters, and the performance limits discussed in the previous chapter therefore apply. In this section we again look at the ultimate performance potential of a system using capacity arguments. We are interested in the performance for a finite number of stages, since low-complexity implementations will attempt to minimize the number of iteration stages. An understanding of the complexity/performance tradeoffs involved is thus important.



The first-order stationary iterative filter is probably the structure most analyzed and best understood. It is both simple and efficient, and allows a fair amount of analytical treatment. We will concentrate our treatment on first-order iterative filters in this section. First note that the first-order iterative approximation of the MMSE filter², given by

d^(m+1) = τy − (τR − (1 − τσ²/P)I) d^(m),   (5.34)

can be rewritten as an equivalent linear filter in the following way:

d^(m+1) = [ Σ_{i=0}^{m} τ((1 − τσ²/P)I − τR)^i + ((1 − τσ²/P)I − τR)^{m+1} ] y.   (5.35)

Using the spectral decomposition of the Hermitian matrix R = QΛQ* we can express (5.35) as

d^(m+1) = Q [ Σ_{i=0}^{m} τ((1 − τσ²/P)I − τΛ)^i + ((1 − τσ²/P)I − τΛ)^{m+1} ] Q*y
        = QΓQ*y.   (5.36)

The composite matrix Γ is diagonal with l-th entry given as

γ_l = τ Σ_{i=0}^{m} (1 − τ(σ²/P + λ_l))^i + (1 − τ(σ²/P + λ_l))^{m+1}
    = [1 − (1 − τ(σ²/P + λ_l))^{m+1}]/(σ²/P + λ_l) + (1 − τ(σ²/P + λ_l))^{m+1}
    = (1/g_l)(1 − (1 − g_l)(1 − τg_l)^{m+1}),   (5.37)

where g_l = σ²/P + λ_l. For random spreading sequences and equal powers the eigenvalues λ_l of R are known and given by Theorem 3.5. The values g_l therefore have a known distribution, which is that of λ_l shifted by the noise power σ²/P.

² We choose the implementation for the MMSE filter here, but an equivalent derivation holds for any other filter involving an inverse. Furthermore, we assume for the remainder of this section an equal-power system with A = √P I.
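The equivalence between the iterated filter (5.34) and the closed-form weights (5.37) can be verified directly. In the sketch below (our own illustration; equal powers P = 1, arbitrary sizes and seed, and the iteration constant taken from (5.39)), m + 1 steps of the recursion are compared with Q Γ Q* y built from the eigendecomposition of R:

```python
import numpy as np

rng = np.random.default_rng(4)
L_dim, K, P, sigma2, m = 64, 8, 1.0, 0.5, 5
beta = K / L_dim
S = rng.choice([-1.0, 1.0], size=(L_dim, K)) / np.sqrt(L_dim)
R = S.T @ S
d = rng.choice([-1.0, 1.0], size=K)
y = S.T @ (S @ d + np.sqrt(sigma2) * rng.standard_normal(L_dim))

tau = 1.0 / (1 + beta + 2 * sigma2 / P)        # iteration constant (5.39)

# m + 1 steps of the first-order recursion (5.34), started from d(0) = y:
d_it = y.copy()
for _ in range(m + 1):
    d_it = tau * y - (tau * R - (1 - tau * sigma2 / P) * np.eye(K)) @ d_it

# Equivalent one-shot filter Q Gamma Q* y with the weights (5.37):
lam, Q = np.linalg.eigh(R)
g = sigma2 / P + lam
gamma = (1 - (1 - g) * (1 - tau * g) ** (m + 1)) / g
d_filter = Q @ (gamma * (Q.T @ y))

err_ms = np.max(np.abs(d_it - d_filter))
```

The agreement is exact (up to floating point) for any m, since (5.36)–(5.37) are algebraic identities rather than asymptotic statements.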


Furthermore, using the formula for optimal convergence,

τ_opt = 2/(λ_min + λ_max),   (5.38)

as in Theorem 5.4, together with the limiting eigenvalues of the random matrix, we finally find

τ_opt = 1/(1 + β + 2σ²/P).   (5.39)

It is important to note that for τ > τ_opt a severe degradation of the performance can occur. This point was stressed in [137], where extensive numerical simulations were carried out. For a finite number of stages, the iteration constant τ may therefore have to be backed off from its theoretical value (5.39).

The output d^(m+1) of the multistage filter will contain signal components, noise components, and incompletely canceled signal components. The noise components are given by

z_MS = QΓQ*S*z,   (5.40)

which has a correlation matrix given by

R_z = E[z_MS z*_MS] = σ²QΓΛΓQ*.   (5.41)

The total noise power over all components equals σ²_tot = σ² tr(QΓΛΓQ*), and therefore, by symmetry, the noise power in each component of the multistage filter equals

σ²_MS = σ² E[γ_l² λ_l] = σ² E[(λ_l/g_l²)(1 − (1 − g_l)(1 − τg_l)^{m+1})²],   (5.42)

where the expectation is over the distribution of the eigenvalues λ_l of R given by the density in Theorem 3.5.

Likewise, the signal components in the output d^(m+1) are given by

d_MS = QΓQ*S*SAd.   (5.43)

Assuming the data to be independently distributed, i.e. E[dd*] = I, we calculate the total signal and interference power as

tr(E[d_MS d*_MS]) = P tr(Γ²Λ²),   (5.44)

which contains signal as well as interference components. Again, due to symmetry, the signal and interference power in any given dimension is given by

P_S+I = P E[(λ_l²/g_l²)(1 − (1 − g_l)(1 − τg_l)^{m+1})²].   (5.45)

Since the received vector is given by (5.36), the signal part of the K different layered channels is given by the diagonal values

d_S = diag{QΓΛQ*} Ad = diag{B} Ad.   (5.46)

The signal powers are the squared diagonal entries of B. We can bound the sum of their values using Jensen's inequality as follows:

Σ_{i=1}^{K} B_ii² ≥ (1/K)(Σ_{i=1}^{K} B_ii)².   (5.47)

Furthermore, for asymptotically large systems

Σ_{i=1}^{K} B_ii² −→ (1/K)(Σ_{i=1}^{K} B_ii)²  as K, L → ∞,   (5.48)

and the bound quickly becomes tight.

Assuming independence between the signal and interference terms, and using tr(B) = tr(ΓΛ), the sum of the weighted eigenvalues, the signal power of any given layered signal stream converges to

P_S −→ (P/K²) E[(Σ_{l=1}^{K} (λ_l/g_l)(1 − (1 − g_l)(1 − τg_l)^{m+1}))²]  as K, L → ∞,   (5.49)

and the interference power is

σ²_IP = P_S+I − P_S.   (5.50)

From the above we can compute the signal-to-noise ratio of the iterative filter as

γ_MS = P_S/(σ²_MS + σ²_IP),   (5.51)

from which the capacity of an individual layered channel can be computed.

Trichard et al. [137] use a recursive moment generating function for the moments of the eigenvalues of S*S which allows the evaluation of the above integrals as recursive algebraic equations.

Following the same procedure as in Chapter 4, an implicit equation for the maximum spectral efficiency can be computed from the per-dimension capacity formula

C_MS = (β/2) log(1 + γ_MS)   (5.52)

as follows. The overall (interference-free) signal-to-noise ratio equals

P/σ² = C_MS E_b/(βσ²) = 2C_MS E_b/(βN_0)   (5.53)

for one-dimensional systems, and therefore, in terms of E_b/N_0, the capacity equals


C_MS = (β/2) log(1 + (2C_MS (E_b/N_0) P_S) / (2C_MS (E_b/N_0) σ²_IP + (P/σ²) β σ²_MS)).   (5.54)

Figure 5.7 shows numerical evaluations of the Shannon bound equation (5.54) for an example load of β = 0.5 and equal power for all users – other loads produce quite similar results. What is remarkable is that in the range of moderate values of Eb/N0 of about 0–4dB, simple iterative filters with as few as two stages are sufficient to approach the capacity of the complete MMSE filter. Such values of Eb/N0 are not uncommon in typical coded wireless links, suggesting that the complexity of linear multiuser detection could be as little as a few times that of simple matched filtering.

Furthermore, note that the number of stages does not depend on the system size, i.e. even if K, L → ∞, the same small number of stages is required as for a small system to achieve a certain per-user capacity.


Fig. 5.7. Shannon bounds for the AWGN channel and multistage filter approximations of the MMSE filter for a random CDMA system with load β = 0.5. The iteration constant τ was chosen according to (5.39).

Note that thus far, including Figure 5.7, the iterative filter has been based on the minimum mean-square error filter R + σ²W^{-1}. If we are to use an iterative approximation to the decorrelator filter R, some of the equations simplify, and the noise variance (5.42) becomes

σ²_DEC = σ² E[(1/λ_l)(1 − (1 − λ_l)(1 − τλ_l)^{m+1})²],   (5.55)

while the signal and interference power (5.45) is given by

P_S+I = P E[(1 − (1 − λ_l)(1 − τλ_l)^{m+1})²],   (5.56)

and the signal power (5.49) by

P_S −→ (P/K²) E[(Σ_{l=1}^{K} (1 − (1 − λ_l)(1 − τλ_l)^{m+1}))²]  as K, L → ∞.   (5.57)

Equation (5.39) for the optimal iteration constant still applies, but now becomes independent of the signal-to-noise ratio P/σ².

Using (5.55), (5.56), and (5.57) in (5.54) produces the different Shannon bounds of Figure 5.8 for the finite-stage approximation of the decorrelator. The major difference is that higher-order filters show a poorer performance at low values of Eb/N0; that is, a higher number of stages is counterproductive for Eb/N0 less than approximately 2dB. This is not surprising, since the iterative filter approaches the performance of the decorrelator, which is significantly poorer at low signal-to-noise ratios than that of the MMSE filter. Somewhat surprisingly, however, at lower values of Eb/N0 a low-stage iterative approximation of the decorrelator approaches the performance of the MMSE filter. We have already encountered a phenomenologically similar behavior of the iterative approximation to the decorrelator in Figure 5.3.

The performance of the MMSE filter constitutes an upper bound on the capacities of these iterative filters, while the lower of the two values of either the decorrelator or the matched filter receiver forms a lower bound.



Fig. 5.8. Similar Shannon bounds for multistage filter approximations of the decorrelator with load β = 0.5.


5.2 Approximate Maximum Likelihood

As discussed in Section 4.2.1, jointly optimal detection minimizes the vector error probability Pr(d̂ ≠ d). Recalling (4.2), this is achieved for uniform prior probabilities by

d_ML = arg min_{d ∈ D^{nK}} ‖r − SAd‖²₂.   (5.58)

The main impediment to low-cost implementation of jointly optimal detection is the cardinality of the set D^{nK}. Brute-force implementation of (5.58) requires computation of |D|^{nK} metrics. This appears to be a fundamental limitation in light of the NP-hard nature of the problem [149].

The Viterbi algorithm implementation of optimal detection described in Section 4.2.1 computes the metric ‖r − SAd‖²₂ for every possible d ∈ D^{nK} and operates on a trellis with a maximum width of 2^{K−1} states.

Despite the apparent difficulty of the jointly optimal detection problem, in practice only a relatively small subset of D^{nK} has any significant posterior probability (depending upon the noise level in the system). The goal of low-complexity sub-optimal approaches is to make effective use of this phenomenon.

If it were possible to restrict attention to only those (hopefully few) sequences with significant posterior probability, detector complexity would be significantly reduced. The main design problem is therefore to determine which sequences to consider.

In the context of the trellis representation introduced in Section 4.2.1, the problem is to design an effective tree search that does not visit every node. The idea is to avoid following paths through the tree that can be determined early on to have little chance of being the optimal one. There are many such search methods in the literature, and this basic idea has been used widely in the area of error control coding. The effectiveness of restricted search algorithms therefore depends heavily upon the ability to prune sections of the tree as early as possible.

In Section 5.2.2 we will describe reduced-complexity tree search methods. Other search methods motivated by lattice structures will be developed in Section 5.2.3. First, however, we need to develop a more suitable recursive metric.

In Section 4.2.1, we developed a recursive metric for use in a Viterbi implementation of the jointly optimal detector. Let us review the development of this metric. The relevant terms of the Euclidean distance

‖r − Sd‖²₂   (5.59)

for the purposes of optimal detection are

−2y*d + d*Rd,   (5.60)

which (using the symmetry of R) led directly to the recursive metric (4.11)


λ_l(d) = λ_{l−1}(d) − 2d_l y_l + Σ_{i=1}^{l−1} 2d_i d_l R_il,   (5.61)

where the last two terms constitute the branch metric. An important observation from (5.61) is that the branch metric

b_l(d_1, d_2, . . . , d_l) = −2d_l y_l + Σ_{i=1}^{l−1} 2d_i d_l R_il   (5.62)

can be either positive or negative. This is not a problem for an algorithm that searches the entire tree, but it does cause difficulties for restricted tree search algorithms. This is due to the fact that a partial metric along a given path may in fact decrease further along the path. This means that some paths that appear very unlikely part of the way through the tree may later become more likely through the addition of negative branch metrics. This possible future reduction in metric makes it very hard to compare two partial paths. It would be preferable for the partial metric to be monotonic. This would mean that unlikely paths cannot become any more likely based on future observations, and would motivate early pruning of unlikely portions of the tree.

5.2.1 Monotonic Metrics via the QR-Decomposition

It is possible to obtain just such a monotonic metric via the well-known QR-decomposition. This is a matrix decomposition of the form S = QU, where Q is unitary, Q*Q = I, and U is upper triangular. (We use U rather than the more traditional R to emphasize the upper triangular structure and to avoid confusion with the correlation matrix.) For our purposes, it will be more convenient to work with a modified QR-decomposition that yields a lower triangular factor. This can be accomplished as follows. Let

J =
⎛ 0 ⋯ 0 1 ⎞
⎜ ⋮ ⋰ 1 0 ⎟
⎜ 0 ⋰ ⋰ ⋮ ⎟
⎝ 1 0 ⋯ 0 ⎠

be the identity matrix with its columns in reverse order, and let

QU = SJ

be the regular QR-decomposition of the re-ordered modulation matrix. Then, since J² = I,

S = QUJ = (QJ)(JUJ) = QL,


where Q = QJ is unitary and L = JUJ is lower triangular.

Considering again the Euclidean distance metric (5.59), we can use the modified QR-decomposition of S to write

‖r − Sd‖²₂ = ‖r − QLd‖²₂   (5.63)
           = ‖Q(Q*r − Ld)‖²₂   (5.64)
           = ‖Q*r − Ld‖²₂,   (5.65)

where the last line is due to the invariance of the 2-norm under unitary transformations.

Noting that

R = S*S = L*Q*QL = L*L,

we obtain the Cholesky decomposition of R, i.e. the lower triangular matrix L = F is in fact the Cholesky factor of R described in Section 4.4.8. In fact,

Q*r = L^{-*}L*Q*r = L^{-*}S*r = L^{-*}y = r_WF

according to (4.85). Thus the above manipulations using the QR-decomposition are identical to the application of the noise whitening filter in Section 4.4.8. Recall that the noise whitening filter is a kind of dual to the decorrelator. The decorrelator whitens the data, at the expense of correlated noise, whereas the application of L^{-*} to the matched filter output results in white noise, at the expense of correlated data.

Equations (5.63)–(5.65) demonstrate that computing the Euclidean distance based on the noise-whitened matched filter output results in no loss of optimality.
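The column-reversal construction is straightforward to reproduce. The sketch below (our own illustration, with an arbitrary full-rank S) builds L = JUJ from an ordinary QR-decomposition and confirms that it is lower triangular, that S = QL, and that L*L recovers the correlation matrix R:

```python
import numpy as np

rng = np.random.default_rng(5)
L_dim, K = 16, 4
S = rng.standard_normal((L_dim, K))  # any full-rank modulation matrix

J = np.fliplr(np.eye(K))             # identity with reversed columns, J @ J = I
Qr, U = np.linalg.qr(S @ J)          # ordinary QR of the re-ordered matrix
Q = Qr @ J                           # modified orthonormal factor
Lt = J @ U @ J                       # candidate lower triangular factor

R = S.T @ S
is_lower = np.allclose(Lt, np.tril(Lt))
err_qr = np.max(np.abs(Q @ Lt - S))           # S = Q L
err_chol = np.max(np.abs(Lt.T @ Lt - R))      # L* L = R (Cholesky factor of R)
```

Reversing rows and columns of the upper triangular U is what turns it into the lower triangular factor needed for the causal, user-by-user metric recursion.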

The form of the metric (5.65) offers a particular advantage, namely the lower triangular structure of F = L. Rather than performing manipulations similar to (5.60) above (where the 2-norm was expanded and certain terms dropped), a recursive partial metric can be written directly in terms of the Euclidean metric as follows.

The entire metric for a given data hypothesis d is

μ_nK(d, F) = ‖r_WF − Fd‖²₂ = Σ_{l=1}^{nK} (r_WF,l − Σ_{j=1}^{l} F_lj d_j)²,


which can be written recursively as

μ_l(d, F) = μ_{l−1}(d, F) + (r_WF,l − Σ_{j=1}^{l} F_lj d_j)²,   (5.66)

where the last term is the branch metric.

Just like the Ungerböck metric λ_l(d), this new metric depends only on previous symbols, μ_l(d) = μ_l(d_1, d_2, . . . , d_l), and is therefore suitable for use in a tree or trellis search.

However, unlike (5.62), the branch metrics

b_l(d_1, d_2, . . . , d_l) = (r_WF,l − Σ_{j=1}^{l} F_lj d_j)²   (5.67)

are always non-negative.

The complexity of noise whitening lies in the computation of the QR-decomposition, or Cholesky factorization, which is of order O(K³) both in the synchronous and asynchronous cases. However, the resulting monotonic branch metrics allow for relatively low complexity extensions to the detection algorithm that can yield dramatic performance improvements.

5.2.2 Tree-Search Methods

Optimal detection may be viewed as operating on a full D-ary tree of depth nK (assuming without loss of generality that each user transmits symbols from an identical modulation alphabet D). Starting from the root node, each node branches into |D| children, one for each possible transmitted symbol.

Each node at depth l corresponds to a particular choice of data symbol hypothesis d_l ∈ D. Let us label each node at depth l with the partial data sequence d_l = d_1, d_2, . . . , d_l and the corresponding partial metric μ_l(d_1, d_2, . . . , d_l). According to the recursive formulation (5.66), this can be computed by

μ_l(d_l) = μ_{l−1}(d_{l−1}) + b(d_l).

Thus the node metric is the sum of the parent node's metric and the branch metric.

The path from the root node to any of the |D|^{nK} leaves corresponds to one particular hypothesis for the entire vector d. Optimal detection therefore consists of searching the entire tree and choosing the best sequence by comparing the corresponding metrics

μ_nK(d) = Σ_{l=1}^{nK} b_l(d_l).


Example 5.1. Figure 5.9 shows a three-user symbol-synchronous example, where each user transmits binary symbols. One path corresponding to d = (−1, +1, +1) is highlighted.


Fig. 5.9. Three-user binary tree.

One approach to complexity reduction is to limit the search space using a bounded tree, retaining a maximum number M(l) of partial sequences (and hence nodes) d_l at each step l. The partial sequences retained for the next step are those which minimize the partial metrics μ_l(d_l). The larger we choose M(l), the larger the likelihood that the set of retained sequences will contain d_ML at the end of the frame, in which case the algorithm will deliver the same decision as the optimal decoder. Since the tree grows starting from the root node, M(l) = min(2^l, M), where M is the search-breadth parameter. This algorithm is known in the coding community as the M-algorithm [120] and was first applied to the multiuser detection problem in [158], where it was called the improved decorrelating decision feedback detector (IDDFD). It is summarized in Algorithm 5.1.

The tree search algorithm is very similar to the algorithm we used to find the minimum distance of a sequence set in Section 4.2.3.


Algorithm 5.1 (Improved decorrelating decision feedback detector).

Step 1: Initialize l = 1 and activate the root node d_0 = () of the decoding tree and its metric μ_0 = 0.

Step 2: Compute the node metrics μ_l(d_l) = μ_{l−1}(d_{l−1}) + b(d_l) for all active nodes, where b(d_l) is calculated according to (5.67).

Step 3: Select the M(l) nodes with the best metrics and deactivate all other nodes.

Step 4: If l ≥ n_t, where n_t is some convenient truncation length as in Section 4.2, output d_{l−n_t} from the node in the set of active nodes at depth l which possesses the smallest node metric μ_l as the estimated symbol at time l − n_t. Let l = l + 1, and, if l ≤ nK, go to Step 2.

Step 5: Output the remaining estimated symbols d_l, nK − n_t < l ≤ nK, from the node with the smallest node metric.

Note that if we set M = 1, the M-algorithm is identical to the partial decorrelating decision feedback receiver from Section 4.4.8, and if we choose M = 2^{Kn}, we have optimal decoding. Between these two extremes, the M-algorithm provides a complexity versus performance trade-off.
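A compact sketch of the breadth-limited search is given below (our own illustration: symbol-synchronous BPSK, random spreading, no output truncation, i.e., decisions are released only at the end of the tree). The whitened front end is built with a column-reversed Cholesky factorization so that F*F = R, the M best partial paths under the branch metric (5.67) are retained at each depth, and the result is compared against an exhaustive search:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(6)
L_dim, K, M = 32, 6, 8               # spreading length, users, search breadth M
S = rng.choice([-1.0, 1.0], size=(L_dim, K)) / np.sqrt(L_dim)
d = rng.choice([-1.0, 1.0], size=K)
r = S @ d + 0.1 * rng.standard_normal(L_dim)

# Lower triangular factor F with F.T @ F = R, and the whitened output:
R = S.T @ S
Jf = np.fliplr(np.eye(K))
C = np.linalg.cholesky(Jf @ R @ Jf)
F = Jf @ C.T @ Jf                    # lower triangular
r_wf = np.linalg.solve(F.T, S.T @ r)

# Breadth-limited tree search (Algorithm 5.1 without output truncation):
paths = [((), 0.0)]                  # (partial symbol tuple, partial metric)
for l in range(K):
    children = []
    for seq, metric in paths:
        for s in (-1.0, 1.0):
            new = seq + (s,)
            branch = (r_wf[l] - F[l, :l + 1] @ np.array(new)) ** 2   # (5.67)
            children.append((new, metric + branch))
    paths = sorted(children, key=lambda p: p[1])[:M]
d_m = np.array(paths[0][0])

# Exhaustive ML search for comparison:
d_ml = np.array(min(product((-1.0, 1.0), repeat=K),
                    key=lambda c: float(np.sum((r_wf - F @ np.array(c)) ** 2))))
m_alg = float(np.sum((r_wf - F @ d_m) ** 2))
m_ml = float(np.sum((r_wf - F @ d_ml) ** 2))
```

With M = 8 and K = 6 only 8 of up to 64 paths per level are retained, yet at reasonable signal-to-noise ratios the restricted search typically returns the maximum likelihood decision.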

Figure 5.10 shows the performance of a synchronous CDMA system with 20 users employing random spreading sequences of length 31. As can be seen, in this case the decoder approaches the single-user bound with moderate values of M. Figure 5.11 also illustrates the near-far resistance of the IDDFD. This simulation was performed for the situation which presents the worst case for the original DDFD, i.e., the first user has the least power. Users 2–31 have equal power (x-axis of the figure), and user 1 has 3dB less power. All users achieve their respective single-user bounds with use of the IDDFD. The codes used in this experiment were Gold codes [54], which possess low cross-correlation properties, and the system was fully loaded, i.e. β = 1, K = L = 31.

Finally, Figure 5.12 shows a comparison of the error performance of these different multiuser decoding methods, plotted as a function of the system load for a CDMA system using random spreading codes at a value of Eb/N0 = 8dB. The IDDFD, as the most complex decoding algorithm, achieves the best performance and operates very close to the single-user bound for loads β ≤ 2/3. The correlation receiver fails for system loads larger than about β = 0.1, while the DDFD shows an almost linear degradation of the bit error rate with system load. The results in Figure 5.12 were obtained for an equal received power scenario.




Fig. 5.10. Performance of the IDDFD for a system with 20 active users and random spreading sequences of length 31.

5.2.3 Lattice Methods

An alternative framework to tree search methods is a class of search methods that operate within the framework of a lattice. Given a lattice basis L, an integer lattice is the set of all points {Lq : q_i ∈ Z}.

For simplicity, assume that the signal constellations D for all users are identical, with |D| = Q points, such that they may be linearly transformed onto a set of consecutive integers

Q = αD + β = {0, 1, . . . , Q − 1},

and define q = αd + β1. We can now transform the jointly optimal detection problem into a problem of finding the lattice point closest to a given point. Starting with the output of the noise-whitened matched filter,

r_WF = Fd + z = (1/α)F(αd + β1) − (β/α)F1 + z,



Fig. 5.11. Performance of the IDDFD under an unequal received power situation with one strong user.

and hence

r_WF + (β/α)F1 = (1/α)Fq + z.

Defining the translated whitened matched filter output y = r_WF + (β/α)F1 and the scaled lower triangular matrix H = F/α results in

y = Hq + z.

Thus the optimal detection problem (5.58) is equivalent to the constrained lattice search problem

min_{q ∈ Q^{nK}} ‖y − Hq‖²₂.   (5.68)
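For BPSK, D = {−1, +1}, the transformation uses α = β = 1/2, which maps the symbols onto Q = {0, 1}. The following sketch (our own, illustrative; a random lower triangular F stands in for the whitening factor, and beta_off denotes the offset β to avoid confusion with the system load) confirms that the transformed metric ‖y − Hq‖² coincides with the original ‖r_WF − Fd‖²:

```python
import numpy as np

rng = np.random.default_rng(7)
K = 4
F = np.tril(rng.standard_normal((K, K)))      # stand-in lower triangular factor
np.fill_diagonal(F, np.abs(np.diag(F)) + 1.0) # positive diagonal
d = rng.choice([-1.0, 1.0], size=K)           # BPSK: D = {-1, +1}, Q = 2
r_wf = F @ d + 0.1 * rng.standard_normal(K)   # whitened matched filter model

alpha, beta_off = 0.5, 0.5                    # Q = alpha*D + beta_off = {0, 1}
q = alpha * d + beta_off * np.ones(K)         # integer data vector
H = F / alpha                                 # scaled lower triangular matrix
y = r_wf + (beta_off / alpha) * F @ np.ones(K)

# The two metrics agree for every hypothesis d <-> q:
err_map = abs(float(np.sum((y - H @ q) ** 2)) - float(np.sum((r_wf - F @ d) ** 2)))
```

Since the metrics are identical up to this change of variables, minimizing over q ∈ Q^{nK} returns the same decision as (5.58).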

The Sphere Detector

The objective of the sphere detector [42, 99, 122] is to find all lattice points q ∈ Z^{nK} within a given radius r of the received point y, namely

S_r(y) = {q ∈ Z^{nK} : μ(q, H) = ‖y − Hq‖²₂ < r²}.




Fig. 5.12. Performance of different multiuser decoding algorithms as a function of the number of active users at Eb/σ² = 7 dB.

Provided Sr(y)∩QnK is non-empty, this set contains the maximum likelihoodpoint.

Just like the tree search methods described above, the sphere detector makes use of the monotonic property of the μl(q,H) metric. Specifically, the condition μ(q,H) < r² implies μl(q,H) < r² for all l = 1, 2, . . . , nK. Considering user l, and using the recursive property of μl, this translates into

μl(q) = μl−1(q) + bl(q) ≤ r².

Substituting for the branch metric,


μ_{l−1}(q) + ( y_l − ∑_{j=1}^{nK} H_{lj} q_j )² ≤ r²   (5.69)

y_l − ∑_{j=1}^{nK} H_{lj} q_j ≤ √( r² − μ_{l−1}(q) )   (5.70)

−H_{ll} q_l ≤ −y_l + ∑_{j=1}^{l−1} H_{lj} q_j + √( r² − μ_{l−1}(q) )   (5.71)

q_l ≥ (1/H_{ll}) ( y_l − ∑_{j=1}^{l−1} H_{lj} q_j − √( r² − μ_{l−1}(q) ) )   (5.72)

which gives a lower bound on q_l in terms of the radius. Taking the negative square root in (5.70) results in a similar upper bound, and we have

Ll(q1, . . . , ql−1) ≤ ql ≤ Ul(q1, . . . , ql−1)

where

L_l(q_1, . . . , q_{l−1}) = ⌈ (1/H_{ll}) ( y_l − ∑_{j=1}^{l−1} H_{lj} q_j − √( r² − μ_{l−1}(q) ) ) ⌉   (5.73)

U_l(q_1, . . . , q_{l−1}) = ⌊ (1/H_{ll}) ( y_l − ∑_{j=1}^{l−1} H_{lj} q_j + √( r² − μ_{l−1}(q) ) ) ⌋ .   (5.74)

Algorithm 5.2 describes the sphere detector algorithm, which is a depth-first search of the full Q-ary tree of depth nK. Starting from the root node, the algorithm performs a “one-step look ahead” and discards (prunes) portions of the tree for which the bounds Ll or Ul are not met. The algorithm is parameterized by an ordering, or permutation function π : {0, 1, . . . , Q−1} → {0, 1, . . . , Q−1}, which determines the order in which the branches of the tree are searched. Various choices of π have been suggested in the literature. The algorithm descends the tree until it either reaches a dead end, meaning all child nodes have been pruned (upon which it ascends one level), or it reaches a leaf node. Upon reaching a leaf node, the corresponding path is compared to the current best path. If it is better (has a smaller metric), the current best path is reset and the sphere radius is reset to the metric of the new best path.

Thus the algorithm successively contracts the radius each time it finds a complete path, until there is only one full path inside the sphere, namely the maximum likelihood estimate.


Algorithm 5.2 (Sphere Detector).
Require: r² > 0
1: ν ← ROOT
2: l ← 0
3: μbest ← ∞
4: while l ≥ 0 do
5:   if ν is a LEAF then
6:     if μ < r² then
7:       r² ← μ
8:       d̂ ← d associated with ν
9:     end if
10:    ν ← PARENT of ν
11:    l ← l − 1
12:  end if
13:  Compute node metric μl
14:  Compute L and U according to (5.73) and (5.74)
15:  Compute permutation π
16:  Prune all CHILD nodes of ν with q < L or q > U
17:  if ν has remaining CHILD nodes then
18:    ν ← CHILD of ν with smallest π(q)
19:    l ← l + 1
20:  else
21:    ν ← PARENT of ν
22:    l ← l − 1
23:  end if
24: end while

The choice of branch ordering π and initial radius r is critical to the computational complexity of the algorithm. If the radius is too large, or the ordering poorly chosen, the search sphere will be very slow to contract, and the algorithm may end up searching the entire tree. Conversely, the better the first point, the faster the algorithm will converge.

Fincke-Pohst enumeration [42, 99] uses the natural ordering π(q) = q. Fincke-Pohst enumeration can however be computationally inefficient [1]. An improved ordering was given by [122], alternating about the midpoint between L and U. The Schnorr-Euchner enumeration has the useful property that the first path found by the algorithm is the zero-forcing decision feedback estimate (ZF-DFE) [1, 26]. Other improved orderings can be found in [47] and [168].

Since the Schnorr-Euchner enumeration finds the ZF-DFE point first, the initial radius may be set to a maximum value r = ∞. In the case of Fincke-Pohst, more care must be taken. Selection of a suitable radius is somewhat of an art, and it is typically found in various heuristic ways.
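The contracting-radius search can be sketched in a few lines. The following toy implementation is our own simplification of Algorithm 5.2: it uses the natural Fincke-Pohst branch order (not Schnorr-Euchner) and prunes on the running metric alone rather than via the bounds (5.73)-(5.74):

```python
import numpy as np

def sphere_detect(y, H, Q=2, r2=np.inf):
    """Depth-first sphere search for min_q ||y - H q||^2, q in {0,..,Q-1}^n.

    H must be lower triangular so the metric accumulates one level at a
    time; r2 (the squared radius) is contracted to the metric of every
    improving leaf, as in Algorithm 5.2.
    """
    n = H.shape[0]
    best, best_q = r2, None
    q = np.zeros(n, dtype=int)

    def descend(l, mu):
        nonlocal best, best_q
        for cand in range(Q):                  # natural (Fincke-Pohst) order
            q[l] = cand
            branch = (y[l] - H[l, :l + 1] @ q[:l + 1]) ** 2
            if mu + branch >= best:            # prune: the metric is monotonic
                continue
            if l == n - 1:                     # leaf: contract the radius
                best, best_q = mu + branch, q.copy()
            else:
                descend(l + 1, mu + branch)

    descend(0, 0.0)
    return best_q, best

# Tiny hypothetical example: 3 levels, binary alphabet, small perturbation.
H = np.array([[2.0, 0.0, 0.0], [0.6, 1.8, 0.0], [0.3, 0.5, 1.5]])
q_true = np.array([1, 0, 1])
y = H @ q_true + np.array([0.05, -0.1, 0.08])
q_hat, metric = sphere_detect(y, H)
assert np.array_equal(q_hat, q_true)
```

Because the perturbation is small relative to the lattice spacing, the closest lattice point is the transmitted one, and the search visits only a few of the 2³ leaves.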


The look-ahead bounds Ll and Ul are most effective with high-order modulation. For small constellations, it may be more efficient to not use these bounds, and to simply prune the tree based on the metrics μl alone. In the absence of these bounds, the resulting algorithm corresponds to a depth-first T-algorithm, where the threshold T is reset each time a better complete path is found.

Figure 5.13 shows the average bit error rate performance of the optimal detector (implemented using the Schnorr-Euchner sphere detector). We emphasize that there is no loss from optimality in these results, since the sphere detector is simply an implementation of the optimal detector. In this result, antipodal modulation and random spreading of length L = 32 is used. Figure 5.14 shows the corresponding average number of leaves (terminal nodes), i.e. candidate data vectors found by the Schnorr-Euchner sphere detector. Typically fewer than ten candidates are found on average, compared to 2^31 for a K = 31 user system.

In general, the complexity of the sphere detector is O(|Q|^K), since in the worst case it may have to search the entire tree. In practice however, the portion of the tree searched depends upon the operating conditions, e.g. the signal-to-noise ratio. If necessary, a limited search width may be enforced (similar to the M-algorithm). This would ensure a manageable upper bound on complexity, at the cost of a performance degradation.


Fig. 5.13. Sphere detector performance: average bit error rate versus Eb/N0 [dB] for 24 users, 31 users, and a single user.



Fig. 5.14. Sphere detector average complexity: average number of leaves found versus Eb/N0 [dB] for 24 and 31 users.

5.3 Approximate APP Computation

In Section 4.2.2, the individually optimal detector computes the marginal a-posteriori probability p(dk[i] = d|r) that a particular symbol d ∈ D was transmitted at time i by user k according to

p(dk[i] = d|r) ∝ ∑_{d ∈ D^{nK} : dk[i] = d} p(r|d) p(d),   (5.75)

where p(r|d) is the conditional probability of the observation r, conditioned on the candidate transmit vector d, and p(d) is the prior probability. In most cases of interest, it is assumed that the individual transmit dimensions are independent, namely that

p(d) = ∏_{l=1}^{nK} p_l(d_l).

The sum (5.75) has |D|^{nK−1} terms, therefore brute force computation of the marginal posterior is impractical for all but the smallest of systems.

Typically however, only a relatively small number of candidate vectors need be considered in order to generate a good estimate of the true APP. The problem is how to identify a suitable (hopefully small) list of vectors L(r,S) ⊂ D^{nK} to marginalize over in order to produce an accurate APP


estimate with reduced complexity. Thus the basic approach is similar to that taken in Section 5.2.

In an additive Gaussian channel, the APP approximation (up to a normalization factor) is

p(dl = d|r) ∝ ∑_{d ∈ L : dl = d} e^{−Λ(r,d)}   (5.76)

where

Λ(r,d) = ‖r − Sd‖₂² − ∑_{l=1}^{nK} ln pl(dl)   (5.77)

= μ_{nK}(d,S) − ∑_{l=1}^{nK} ln pl(dl).   (5.78)

In the log domain, the metric consists of the Euclidean distance term already encountered in joint ML detection (5.58), with a correction factor representing the prior probability.
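The list-based marginalization can be sketched end-to-end for a toy system (a self-contained sketch; the spreading matrix, noise variance, and noise draw are hypothetical values of our own; with uniform priors the ln-prior term of (5.77) is a constant and is dropped):

```python
import numpy as np
from itertools import product

# Toy system: K = 2 BPSK users, spreading length L = 3, mild correlation.
S = np.array([[0.9, 0.3],
              [0.3, 0.9],
              [0.3, -0.3]])
sigma = 0.5
d_true = np.array([1.0, -1.0])
z = np.array([0.2, -0.1, 0.15])                # fixed noise draw for the example
r = S @ d_true + z

# Metric (5.77)/(5.78) with uniform priors: only the Euclidean term remains;
# the 1/(2 sigma^2) scaling of the Gaussian likelihood is made explicit here.
cands = np.array(list(product([-1.0, 1.0], repeat=2)))
lam = np.array([np.sum((r - S @ d) ** 2) for d in cands]) / (2 * sigma**2)

# Approximate APP (5.76): marginalize exp(-Lambda) over the candidate list,
# subtracting the minimum metric for numerical stability.
w = np.exp(-(lam - lam.min()))
p1_plus = w[cands[:, 0] > 0].sum() / w.sum()   # p(d_1 = +1 | r)
p2_plus = w[cands[:, 1] > 0].sum() / w.sum()   # p(d_2 = +1 | r)

assert p1_plus > 0.5 and p2_plus < 0.5         # soft decisions agree with d_true
```

Here the list is the full candidate set, since K = 2; the point of the list detectors below is to obtain essentially the same weights w from a short list of low-metric candidates.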

5.4 List Sphere Detector

In the absence of a-priori information (uniform priors), one approach is to only marginalize over the most likely symbol vectors. These are the vectors d falling within a sphere of radius r centered on the receive point r.

A list sphere detector finds all vectors which satisfy

‖y − Hq‖₂² ≤ r².

In fact Algorithm 5.2 already performs this task. Instead of lines 6–9, which update the best path and contract the radius, the list sphere detector simply adds every complete path that it finds to a list L (and leaves the radius unchanged). At the termination of the algorithm, the list L consists of all paths corresponding to leaves reached by the algorithm.

The radius r must be set large enough such that a good estimate of the posterior probabilities can be obtained from (5.76). Determining this radius is non-trivial, since even the desired list size may be unknown. However, if the desired list size |L| is known, and is sufficiently large, the volume of the desired sphere is approximately |L| times the fundamental lattice volume, and the corresponding radius is

r = ( |L| |det H| / V_{nK} )^{1/nK}

where


V_{nK} = π^{nK/2} / Γ(nK/2 + 1)

is the volume of a unit sphere of dimension nK.

Since the goal of the list detector is to find all points inside the sphere, the branch search order π is irrelevant. Any choice of π will result in the same list. In this case, the Fincke-Pohst natural order is a good choice, since it does not require any additional computations. The branch ordering is only important if it is only the joint ML vector that is required.
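This volume heuristic is easy to compute; a small sketch (function names are our own):

```python
import math

def unit_ball_volume(n):
    """Volume of the n-dimensional unit sphere, pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def list_sphere_radius(list_size, det_H, n):
    """Heuristic radius: a sphere of this radius has volume approximately
    list_size times the fundamental lattice volume |det H|."""
    return (list_size * abs(det_H) / unit_ball_volume(n)) ** (1.0 / n)

# Example: target list of 64 points in a 16-dimensional lattice, det H = 1.
r = list_sphere_radius(64, 1.0, 16)

# Sanity check: the sphere of radius r holds ~64 fundamental cells.
vol = unit_ball_volume(16) * r ** 16
assert abs(vol - 64.0) < 1e-9
```

Since the lattice points are not uniformly dense near the receive point, the actual list size fluctuates around the target; the formula is only a starting value.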

5.4.1 Modified Geometry List Sphere Detector

The assumed independence of the prior probabilities in (5.78) allows a direct modification of the branch metrics used in the recursive metric computation (5.66). Furthermore, since −ln pl(dl) ≥ 0, the resulting recursive metric is still monotonic,

μl(d,F) = μl−1(d,F) + b(d1, . . . , dl) (5.79)

where

b(d1, . . . , dl) = ( r_{WF,l} − ∑_{j=1}^{l} F_{lj} dj )² − ln pl(dl) ≥ 0.   (5.80)

This new metric can be used directly in any tree-search algorithm, or it can be used to find modified bounds Ll and Ul for use in a list sphere detector. It should be noted that the resulting sphere detector no longer finds all points within a Euclidean sphere. With non-trivial priors, the prior probabilities define a non-Euclidean geometry where the relevant sphere is defined by

Λ(r,d) ≤ r².

This is the approach used by [151].

5.4.2 Other Approaches

There are several other approaches for APP approximation described in the literature.

A dual-list approach has been proposed by [109, 110]. This maintains two lists. The first list L1 is simply a list of the most likely sequences according to the Euclidean distance metric, which can be found using the list sphere detector of Section 5.4, or approximated using any restricted tree-search algorithm. The second list L2 is computed based only on the prior probabilities. The P-shortest paths algorithm (e.g. [36]) is used to find the P vectors with greatest a-priori probability. This can be accomplished in


O(K + P + (K + 1) log K) operations. The final list L = L1 ∪ L2 is found by taking the union of the two lists. Unlike the modified list sphere detector, this approach is not guaranteed to return the most likely vectors. The main advantage of this approach is found when used in an iterative decoding framework. The list L1 (which typically dominates the overall complexity via the QR-decomposition and list sphere detector/tree search) only needs to be computed once for each symbol, rather than for every iteration of the decoder. The second list L2 is re-computed each iteration, but this computation has very low complexity.

An alternative approach is the LISS detector of [11, 59, 75]. In this approach a straightforward depth-first (sequential) tree search is applied with a restriction on the total effort, e.g. the total number of branches searched. With any such finite-memory depth-first search there is no guarantee that even a single full path will be found. The idea of the LISS detector is to complete each partial path with soft decisions computed from the prior probabilities.


6 Joint Multiuser Decoding

6.1 Introduction

In order to achieve rates that approach the information theoretic capacity of any communications channel, we need to use redundant signaling, referred to as forward error control (FEC) coding. Dating back to Shannon's ground-breaking theory [123], it has been known that large FEC codes can approach the capacity of a channel arbitrarily closely, and families of very powerful, capacity-approaching codes have been discovered (or rediscovered) over the last two decades. The most well-known code families are trellis codes [78, 120], turbo codes [120], and LDPC codes [120].

FEC thus forms an indispensable component of modern communications systems, and therefore also plays a fundamental role in multiple access communications. In this chapter we add FEC to the problem of multiuser communications and study how methods and designs are impacted. Our purpose is to illustrate the design and analysis of coordinated multiple-user systems which incorporate FEC, and to gain understanding of the role of FEC in such systems.

We concentrate our discussion on code-division multiple-access (CDMA) due to its practical importance and widespread use. Many of the results, however, apply to other systems as well, in particular to any linear multiple-access systems. Among these are multiple-input multiple-output (MIMO) channels which have recently gained prominence as multiple-antenna channels due to their potential to dramatically increase the information theoretic capacity of a wireless link [46, 132].

Two basic approaches are studied in this chapter. First, it is often the case that the single-user channel created by linear pre-processing exhibits a capacity which is close to the per-user capacity of the original multiple-access channel, and therefore linear pre-processing is completely sufficient. This methodology has been studied in detail in Chapter 4. Decomposing the multiple-access channel into single-user channels via linear pre-processing has the advantage that basic error control methods designed for single-user AWGN


channels can be used “off the shelf” to build a multiple-access system with FEC. One situation where pre-processing is sufficient is when the CDMA channel is only “lightly loaded”, i.e. when the ratio of users to signal dimensions is less than unity.

On the other hand, in cases where optimal processing exhibits a significant increase in capacity over linear pre-processing, more complex joint detectors will have to be used to harness the channel's capacity. While optimal decoders can be formulated in many cases, they do not have much practical appeal since they usually require a prohibitively large complexity (for CDMA, see [52, 53]); nor do they hold much appeal to the theorist since their analysis is typically intractable. In the cases where joint detection is indicated, we focus on iterative receiver systems, whose genesis can be traced to the turbo principle invented for decoding of turbo codes [12]. We will see that these iterative receivers are very effective, achieving with low complexity what would otherwise require far more complex receiver structures.

Figure 6.1 shows the information theoretic capacities for random CDMA systems using both a linear minimum mean-square error filter and optimal processing for two system loads. The two capacities for β = 0.5 show very little difference, and it is therefore “nearly optimal” to use a minimum mean-square error filter as linear preprocessor which converts the CDMA channel into parallel single-user AWGN channels where well-known FEC methods can be used. This is the case for a system load β of approximately less than 0.5 in random CDMA channels.

As the system load increases, linear preprocessing becomes increasingly less efficient, even for equal power channels, and for highly loaded systems joint detection can provide significant extra gains. When and where joint detection provides a capacity advantage depends on the particular system, but the capacity formulas discussed in Chapter 4 can be used to determine this. In general, systems with a small number of users per signal dimension do not profit significantly from joint processing.

Figure 6.2 shows the system diagram of CDMA using FEC. We also assume receiver cooperation, i.e. the sources independently encode their information. Since the different encoders in this model are assumed to be located at different physical locations, transmitter cooperation in a one-way communication system like the one we consider here is not possible. The receiver, however, makes maximum use of the received signal by joint decoding. Figure 6.2 is an extension of Figure 2.1 where the incoming information bits have been labeled by uk[j] at time j for user k, and the interleavers (π1, · · · , πK) have been added between the FEC encoders and the channel. An interleaver is nothing more than a large permutation operation, where the symbols in the output blocks of length n are reordered according to some rule. Here, as is customary with turbo coding, this relabeling is assumed to be completely random. Some sophisticated rules in the construction of these interleavers have been successfully applied to turbo coding [120] to improve performance. However, in the context of the system in Figure 6.2, the structure of the interleaver is



Fig. 6.1. Comparison of the per-dimension capacity of optimal and linearly processed random CDMA channels. The solid lines are for β = 0.5 for both linear and optimal processing, the dashed lines are for a full load β = 1.

less important, and has not been researched much. In fact, we will show that the performance of the system is determined primarily by the error control code. Random interleavers make achieving this performance possible and also allow for a precise analysis.

If the FEC code already contains an interleaver as part of its internal structure, as is the case with turbo codes or LDPC codes, no additional interleaver is required between the FEC encoder and the channel. For reasons of analysis, we will however assume that the interleavers (π1, · · · , πK) are always there, and are of large size.

We will see that depending on the system load, the role of the error control codes can be quite different. For low loads, the ability of the error control codes to operate in noisy environments dominates, while for large loads the dominant role of the error control codes is to resolve the multiple-access interference. We will see that in an iterative joint decoder the FEC codes have the two major functions of resolving the multiple-access interference



Fig. 6.2. Diagram of a “coded CDMA” system, i.e. a CDMA system using FEC coding.

by assuring convergence of the iterations, and providing low error probabilities by overcoming the channel noise after cancellation.

The following main strategies have been proposed for decoding coded multiple-access signals:

1. Optimal Decoding is an extension of the maximum likelihood sequence detection principle discussed in Chapter 4 to include the constraints imposed by the error control codes. The immediate and debilitating disadvantage of this approach is that the decoder state space, already large for uncoded optimal detection, is further increased by the product of the code state spaces of the different users. Optimal decoding has been proposed and studied for convolutionally coded CDMA systems in [52, 53]. We will not further discuss optimal decoding for two reasons: its practical importance is very limited due to its excessive complexity, and its derivation is a direct extension of the optimal uncoded detector (Chapter 4).

2. Linear Pre-Processing and Single-User Decoding is appropriate whenever the user density is relatively low. Different filters can be used as pre-processors and are combined with conventional error control codes in single-pass decoding systems. For random CDMA systems, where the signal geometry changes at each symbol interval, the generation of a proper metric for the error control decoder is important [7, 115, 116]. This is discussed in the next section.

3. Iterative Decoders arise from the perspective of viewing a coded CDMA system as the concatenation of FEC codes with the CDMA channel. The idea is to use soft-output decoders for both the CDMA channel and the


FEC codes in an iterative decoding arrangement. This methodology was introduced in [5, 88, 108] and further developed in [3, 33, 126, 157]. Moher [88] proposed the use of a channel a-posteriori probability detector, i.e. the channel “decoding” was performed optimally at the cost of a complexity of O(2^K). All other proposals used some form of reduced-complexity channel decoding, with Alexander et al. [5] proposing simple, yet efficient interference cancellation achieving excellent results. Wang and Poor [157] introduced the per-user MMSE filter, and Shi and Schlegel [125] provide an analysis of using linear cancellation techniques. We study these methods in subsequent sections of this chapter.

6.2 Single-User Decoding

As discussed in Chapter 4, linear pre-processing, such as decorrelation and MMSE filtering, converts the multiple-access channel into parallel channels with additive noise. Using central limit theorem arguments [101], it can be shown that the residual noise is well approximated by uncorrelated Gaussian statistics. These results formed the basis for the capacity calculations in Chapter 4.

The use of error control codes now seems straightforward, since the two problems of decoding and multiple-access interference resolution have been decoupled. However, due to the fact that the multiple-access channel may be time-varying, in particular for random CDMA, care must be taken in how the metrics for the error control decoders are derived. The simple optimal AWGN channel squared-distance metric

λ = ‖r − d‖²   (6.1)

is not optimal in all situations. This will become clearer when we discuss the relationship between the decorrelator pre-processor and the projection receiver.

6.2.1 The Projection Receiver (PR)

The projection receiver (PR) is a single-pass detection method which combines FEC decoding and multiuser interference cancellation. The PR makes no hard decisions on the transmitted symbols d, but instead works with sequence metrics λ(d) for the encoded bit streams. The PR does not actually perform joint decoding, but decoding with linear interference suppression.

We start our derivation with the formulation of the maximum-likelihood (ML) detector, given in (4.2), i.e.

d_ML = arg min_{d ∈ D^{Kn}} ‖r − SAd‖₂² ,   (6.2)


where we recall that D^{Kn} is the set of sequence candidates d taken from {−1,+1}, and constrained by the error control codes. The projection receiver reduces the complexity of this ML search by partitioning the symbols in d into two sets,

d → {dc,du}. (6.3)

The complexity reduction is now achieved by estimating du over the real (unconstrained) domain, and detecting only dc over the constrained domain, now denoted by Dc. This leads to the partitioned minimization

d_PR = arg min_{dc ∈ Dc} min_{du ∈ ℝ^{Kn}} ( ‖r − SAd‖² ).   (6.4)

The constrained portion dc of the data can be any combination of transmitted symbols, but typically it would be the sequence of a single user, i.e. dc = dk = [dk[1], . . . , dk[n]], or that of a few “desired” users. The complexity savings of (6.4) over (6.2) are substantial. In the fully projected case, where dc contains the symbols of only one user, the savings over maximum-likelihood decoding are a factor of 2^{K−1}!

The minimization over du is quadratic and can be accomplished in closed form. Let us rewrite the inner minimization in (6.4) as

min_{du} ‖r − SAd‖₂² = min_{du} ‖ r − [Sc Su] [Ac dc ; Au du] ‖₂²

= min_{du} ‖r − Sc Ac dc − Su v‖₂² ,   (6.5)

where we have partitioned the matrix S into the component Sc with the columns corresponding to the symbols in du removed, and Su which contains these columns. The diagonal matrix A is partitioned likewise. Furthermore we define v = Au du as the product of the unconstrained symbols and their amplitudes.

We now take the partial derivative

(∂/∂v) ‖r − Sc Ac dc − Su v‖² = −2 Su* (r − Sc Ac dc − Su v)   (6.6)

which we set to zero to obtain

v = (Su* Su)^{−1} Su* (r − Sc Ac dc)   (6.7)

for the joint estimate of interfering symbols and amplitudes. Note that this is the decorrelator filter output of the unconstrained symbols with the hypothesis dc removed or canceled. Substituting the unconstrained estimate (6.7) into (6.5), the detected constrained symbols are now calculated as

d̂_c,PR = arg min_{dc ∈ Dc} ‖M_PR (r − Sc Ac dc)‖₂² ,   (6.8)


where

M_PR = I − Su (Su* Su)^{−1} Su* = I − Pu.   (6.9)

The matrix M_PR is the projection matrix onto the null space of Su, and Pu is the projection onto the subspace spanned by Su, hence the name “projection receiver”. Note that the amplitudes Au of the unconstrained users are not required for detection. This also makes the receiver near-far resistant, that is, its performance is independent of the power levels of the interfering users.

Furthermore, since the amplitudes can be complex, given the different phases of the different users, the receiver also does not need phase knowledge of the interfering users in order to generate the metrics in (6.8), that is, the receiver is non-coherent as far as the interferers are concerned. These properties are shared with the linear pre-processing filters discussed in Chapter 4.

The minimization in (6.8) can be further simplified by observing that M_PR x ⊥ Pu x. This leads to

d̂_c,PR = arg min_{dc ∈ Dc} ‖M_PR r − Sc Ac dc + Pu Sc Ac dc‖²

= arg min_{dc ∈ Dc} { ‖M_PR r − Sc Ac dc + Pu Sc Ac dc‖² + ‖−Pu Sc Ac dc‖² }

= arg min_{dc ∈ Dc} ‖M_PR r − Sc Ac dc‖² ,   (6.10)

where the addition of the second term in the second step does not change the result of the arg(·) function for binary modulations with dk[i] ∈ {−1,+1}, since this term is orthogonal to the first and independent of the signs in dc.
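The equivalence of the two metric forms in (6.8) and (6.10) is easy to check numerically. The following toy sketch (all matrix and amplitude values are hypothetical, chosen only for illustration) builds M_PR, evaluates both forms for a single constrained BPSK symbol, and verifies that they differ by a constant, so both select the same symbol; note that the interferer amplitudes enter only through r:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy synchronous one-shot system: K = 4 users, spreading length L = 6.
# User 1 is the constrained user; users 2-4 are unconstrained interferers.
L, K = 6, 4
S = rng.standard_normal((L, K)) / np.sqrt(L)
sc, Su = S[:, 0], S[:, 1:]
Ac = 1.3                                          # desired user's amplitude
A = np.diag([Ac, 0.8, 1.1, 2.0])                  # interferer amplitudes: unused below
d = np.array([1.0, -1.0, 1.0, 1.0])
r = S @ A @ d + 0.1 * rng.standard_normal(L)

M = np.eye(L) - Su @ np.linalg.inv(Su.T @ Su) @ Su.T   # projection matrix (6.9)

metrics = {}
for dc in (-1.0, 1.0):
    m1 = np.sum((M @ (r - sc * Ac * dc)) ** 2)    # ||M_PR (r - sc Ac dc)||^2, (6.8)
    m2 = np.sum((M @ r - sc * Ac * dc) ** 2)      # ||M_PR r - sc Ac dc||^2, (6.10)
    metrics[dc] = (m1, m2)

# The gap m2 - m1 = ||Pu sc Ac dc||^2 is identical for dc = -1 and dc = +1,
# so both forms pick the same symbol, as the derivation claims.
gaps = [m2 - m1 for m1, m2 in metrics.values()]
assert abs(gaps[0] - gaps[1]) < 1e-9
assert min(metrics, key=lambda dc: metrics[dc][0]) == min(metrics, key=lambda dc: metrics[dc][1])
```

The interferer power levels (0.8, 1.1, 2.0) are never used by the detector itself, illustrating the near-far resistance noted above.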

The projection front-end of the receiver now produces a metric

λ(dc) = ‖M_PR (r − Sc Ac dc)‖² ≡ ‖M_PR r − Sc Ac dc‖²   (6.11)

for a given codeword dc, and the detector will select as final estimate the codeword dc for each user with the smallest metric λ(dc), i.e.

d̂_c,PR = arg min_{dc ∈ Dc} λ(dc).   (6.12)

From d̂_c,PR the bit estimates û_c,PR are easily computed via the error control encoder inverse.

If the projection metrics have to be computed for all users, it is more economical to approximate

M_PR r ≈ r − Su (S*S)^{−1} S* r = r − Su R^{−1} r_MF.   (6.13)

The error thus introduced is negligible, but this formulation nicely illustrates the method's relationship to cancellation techniques and decorrelation, as shown in Figure 6.3.

Figure 6.3 illustrates that the PR does not need to obtain phase estimates of the interfering signals; the metric calculations and cancellations according


Fig. 6.3. Projection Receiver block diagram using an embedded decorrelator.

to (6.11) and (6.13) can be performed without that knowledge, that is, non-coherently. Phase (and possibly frequency) synchronization only needs to be performed for the user of interest.

Note that using the output vk directly as input to the single-user error control decoder corresponds to the decorrelator pre-processing discussed in Chapter 4. We show later that this shortcut can cause a loss in performance for random CDMA, and that the metric generation according to the projection principle is the “correct” way.

6.2.2 PR Receiver Geometry and Metric Generation

Let us spend some time on developing a geometric view of the metric generation (6.11). The matrix (6.9) is the projection onto the space orthogonal to the space spanned by the columns in Su, i.e. onto Span{Su}⊥. This is illustrated in Figure 6.4. The metric for a candidate sequence dc is produced by first subtracting its effect from the received sequence r, i.e. we generate r − Sc Ac dc. This requires knowledge of the amplitudes of the constrained user(s). The


resulting vector is then projected by the matrix M_PR. The length of the resulting projected vector is the metric λ(dc). The receiver, in effect, ignores the dimensions in signal space which are “contaminated” by the interference. This results in a shortening of the received signal vector (since ‖M_PR x‖ ≤ ‖x‖), which is equivalent to an energy loss. In Section 6.2.3 we will calculate this energy loss and learn that it is not severe, as long as the load β < 1 and not close to unity. It also becomes obvious now why the PR ideally is near-far resistant. Since it operates in the space orthogonal to the interference, the power levels of the interferers are irrelevant.

Computing the metrics (6.11) for all possible sequences dc is a formidable task for asynchronous CDMA systems. In a practical implementation, an iterative implementation of the inversion in Figure 6.3 would be preferred, as discussed in Chapter 5. This iterative implementation is identical for both synchronous and asynchronous systems (see Chapter 5).


Fig. 6.4. Projection Receiver diagram using an embedded decorrelator.

Alternately, the metric calculation can be performed in a recursive fashion. To this end let us partition M_PR into L × L blocks. M_PR is, in general, a full matrix, but its off-diagonal terms decrease rapidly with distance from the diagonal. This allows us to work with a block-diagonal approximation M_PR,s, retaining s of the off-diagonal blocks on either side of the diagonal. So, for example


M_PR,1 =
⎡ M_11  M_12  0     · · ·  0    ⎤
⎢ M_21  M_22  M_23         ...  ⎥
⎢ 0     M_32  M_33  M_34   ...  ⎥
⎢ ...         . . .        ...  ⎥
⎣ 0     · · ·              M_nn ⎦ .   (6.14)
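A helper illustrating the banded approximation (our own construction, not from the text; s counts the retained off-diagonal L × L blocks per side):

```python
import numpy as np

def block_band(M, L, s):
    """Zero out all L x L blocks of M more than s block positions off the
    diagonal, giving the block-banded approximation M_PR,s of (6.14)."""
    n = M.shape[0] // L
    Ms = np.zeros_like(M)
    for i in range(n):
        for j in range(max(0, i - s), min(n, i + s + 1)):
            Ms[i*L:(i+1)*L, j*L:(j+1)*L] = M[i*L:(i+1)*L, j*L:(j+1)*L]
    return Ms

# Example: 4 blocks of size 2; keep one off-diagonal block on each side.
M = np.arange(64, dtype=float).reshape(8, 8)
M1 = block_band(M, L=2, s=1)
assert np.all(M1[:2, 4:] == 0)          # blocks beyond the band are zeroed
assert np.all(M1[:2, :4] == M[:2, :4])  # blocks inside the band are kept
```

Since the off-diagonal blocks of M_PR decay quickly, small s (often s = 1) already captures most of the matrix.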

Note that due to (6.9), M_ij = M*_ji. In what follows we shall concentrate on the case where the constrained symbols are those of a single user, i.e. dc = dk. Writing out the sequence metric (6.10) in terms of individual symbol components we obtain

λ(dk) = ‖ M_PR r − ∑_{i=1}^{n} √Pk sk[i] dk[i] ‖²

= −2 r* M_PR ∑_{i=1}^{n} √Pk sk[i] dk[i] + ‖ ∑_{i=1}^{n} √Pk sk[i] dk[i] ‖² ,   (6.15)

where the sequence energy term ‖M_PR r‖² has been discarded since it does not affect the decoding decision. Since the spreading sequences sk[i] of user k are not overlapping in time, the second term in (6.15) is equal for all sequences as long as binary modulation is used, and can therefore be omitted. This simplifies (6.15) further and we obtain an equivalent additive sequence metric

λ(dk) ≜ ∑_{i=1}^{n} λ(dk)[i] = −r* M_PR ∑_{i=1}^{n} √Pk sk[i] dk[i].   (6.16)

We now decompose r = (r[1], . . . , r[n]) into n partial vectors of length L, associated with each symbol transmission, and obtain

λ(dk)[i] = − ∑_{j=i−s}^{i−1} r*[j] M_ji √Pk sk[i] dk[i] − r*[i] ∑_{j=i−s}^{i} M_ij √Pk sk[j] dk[j].   (6.17)

The decomposition of (6.16) into (6.17) follows analogous steps as those used in Section 4.2.1 to decompose the optimal metric. The above metric is reminiscent of the case of a single-input single-output channel with inter-symbol interference [104]. But in our case the interference between adjacent symbols is created by the asynchronicity of the CDMA channel in conjunction with the projection filtering.

For synchronous CDMA [7, 119] the off-diagonal elements of M_PR vanish, i.e. M_ij = 0 for i ≠ j, and the metrics in (6.17) simplify to

−λ(dk)[i] = r∗[i]Mii

√Pksk[j]dk[i]. (6.18)

Detection via (6.17) requires an extended sequence detector, except for synchronous CDMA. In the case of an error control code with a trellis representation¹, this results in an augmentation of the code state space, similar to the case of coding for inter-symbol interference (ISI) channels [155]. The augmented code state space contains 2^s times as many states as the original code state space. This is the price to be paid for the asynchronous nature of the channel. The augmented code state space can be searched by any type of trellis search algorithm, for example the Viterbi algorithm.

6.2.3 Performance of the Projection Receiver

In Section 4.4.4 we proved that the signal-to-noise ratio of the decorrelator output for random spreading sequences converges to the limiting value

\[
\mathrm{SNR}_k = \frac{P_k}{\sigma^2}\, \frac{L-K+1}{L} ,
\tag{6.19}
\]

as both K, L → ∞. For finite values of K and L, the signal-to-noise ratio in (6.19) is not constant, which means that the decorrelator outputs cannot be fed directly into a standard error control decoder without incurring a loss, since in that case the decorrelator outputs are no longer ML metrics. This loss is caused by "surges" in the noise value, which distort the decision statistics, i.e. different symbols are not weighed equally. These surges can be thought of as caused by spreading sequence combinations during certain symbol intervals which cause particularly large noise enhancement by the decorrelator. This effect is particularly noticeable for codes with small to moderate free distance.

We will show in this section that the PR correctly compensates for thisdistortion, and that its performance can be superior to that of decorrelation.We approach this development by first showing that the performance of thePR is lower bounded by that of a channel with the asymptotic signal-to-noiseratio (6.19), with equality as K,L →∞. In a second step we then show thatthe performance of decorrelation is lower bounded by the performance of thePR. Simulation results will give quantitative examples of this effect.

In the sequel we prove the following theorem:

Theorem 6.1. The performance of the PR with random spreading codes (synchronous or asynchronous) using the metric (6.11) suffers an energy degradation of \(\mathcal{L}\) with respect to interference-free single-user performance, where

\[
\mathcal{L} \ge 10 \log_{10} \left( \frac{L}{L-K+1} \right) \; [\mathrm{dB}] .
\tag{6.20}
\]

This loss is plotted in Figure 6.5 as a function of the system load β = K/L. We see that below 50% system load, the loss is less than 3dB.

¹ We assume that an error control code which can be represented by a trellis is used. The most obvious such choice would be a convolutional code. However, block codes, too, have a code trellis [120] which can be used.

Fig. 6.5. Lower bound on the performance loss of the PR (loss in dB versus system load β).

The bound in Theorem 6.1 on the performance loss is very tight, even for certain non-random spreading sequences, as shown by the simulation results presented in Figure 6.6. The FEC code was chosen to be a rate R = 1/2 convolutional code with 4 states and generator polynomials (5,7) [104, 120]. Simulation results are shown for K/L = 5/15, 8/15, and 10/15, leading to bounded loss factors of 1.3dB, 2.7dB, and 4dB, illustrated by the dashed lines in the figure. The error flares that occur for the higher loads at lower bit error rates are mainly due to the fact that the inter-symbol interference was neglected in the decoder.

For other classes of spreading codes, similar results can be obtained. For example, if Gold codes [54] are used as spreading sequences in a synchronous system, a loss of

\[
\mathcal{L} = 10 \log_{10} \left( \frac{(L-K+2)\,L}{(L-K+1)(L+1)} \right) \; [\mathrm{dB}]
\tag{6.21}
\]

can be calculated [7]. Gold codes are nearly orthogonal, causing, for example, a loss of only 0.39dB with K = 10, L = 15.
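The loss figures quoted in this section follow directly from (6.20) and (6.21); a small numerical sketch (the function names are our own):

```python
import math

def pr_loss_db(K, L):
    """Lower bound (6.20) on the PR energy loss, in dB, for random sequences."""
    return 10 * math.log10(L / (L - K + 1))

def gold_loss_db(K, L):
    """Loss (6.21), in dB, for Gold sequences in a synchronous system."""
    return 10 * math.log10((L - K + 2) * L / ((L - K + 1) * (L + 1)))
```

For K/L = 5/15, 8/15, and 10/15 this reproduces the bounded losses of about 1.3dB, 2.7dB, and 4dB cited for Figure 6.6, and `gold_loss_db(10, 15)` gives the 0.39dB quoted above.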

Fig. 6.6. Performance examples of the PR for random CDMA. The dashed lines follow from the bound of Theorem 6.1. The values of Eb/N0 are in dB.

Before we proceed with the proof of Theorem 6.1, we outline the results of this section; the reader not interested in the mathematical details of the proofs can thus skip them. First, the bound in (6.20) becomes tight not only as K, L → ∞, but also as the minimum Hamming distance of the code becomes large. The proof of this is analogous to the proof in Section 4.4.4 and is omitted. Next, we show in Lemma 6.1 that the performance of asynchronous CDMA is strictly poorer than that of synchronous CDMA, even though the difference is negligible in most cases. Finally, in Lemma 6.2 we prove that decorrelation produces a larger error rate than using the PR metrics. This lemma holds both for asynchronous and synchronous CDMA. These results show that the PR metrics (6.17) are the correct way of interfacing the multiuser pre-processor with the FEC decoder. These conclusions remain valid if other linear pre-processors are used, such as the MMSE filter.

We now proceed with the somewhat tedious proof of Theorem 6.1. We analyze the performance of the PR in terms of decoded bit error probability with random spreading codes. We first address the degradation of the PR system relative to the interference-free single-user performance in general terms, and then proceed to develop the bound (6.20).

For mathematical convenience we assume unit received powers for all users, i.e. A = I, but note that the result is independent of the power distribution of the users. We follow the standard approach for calculating error probabilities in coded systems by considering the probability that an incorrect codeword d'_k of user k's codebook is decoded. The resulting two-codeword error probability can then be used in standard distance spectrum analyses to find bounds for convolutional or block codes [120].

We must derive an expression for the probability P_w that the correct codeword d_k, which is assumed to have Hamming distance w with respect to d'_k, has the larger metric. The Hamming-weight-w error vector

\[
\varepsilon = d_k - d'_k
\tag{6.22}
\]

therefore has w elements equal to ±2. We place the symbol indexes of these error positions into the set G.

The two-codeword error probability for such a pair-wise error event, based on the PR metrics of (6.11), is given by

\[
\begin{aligned}
P_w &= P\big( \lambda(d_k) > \lambda(d'_k) \big) \\
&= P\big( \| M_{PR} (r - S_k d_k) \|^2 > \| M_{PR} (r - S_k d'_k) \|^2 \big) \\
&= P\big( \| M_{PR} S_k d_k \|^2 - \| M_{PR} S_k d'_k \|^2 - 2\, r^* M_{PR} S_k (d_k - d'_k) > 0 \big) .
\end{aligned}
\tag{6.23}
\]

Using r^* = d_k^* S^* + z^* and (6.22), we obtain

\[
\begin{aligned}
P_w &= P\big( \| M_{PR} S_k d_k \|^2 - \| M_{PR} S_k d'_k \|^2 - 2\, d^* S^* M_{PR} S_k \varepsilon - 2\, z^* M_{PR} S_k \varepsilon > 0 \big) \\
&= P\big( -\| M_{PR} S_k \varepsilon \|^2 - 2\, z^* M_{PR} S_k \varepsilon > 0 \big) = P(\eta_w > 0) .
\end{aligned}
\tag{6.24}
\]

The random variable η_w is Gaussian distributed as

\[
\eta_w \sim \mathcal{N}\big( -\| M_{PR} S_k \varepsilon \|^2 ,\; 4\sigma^2 \| M_{PR} S_k \varepsilon \|^2 \big) ,
\tag{6.25}
\]

which leads to the two-codeword error probability

\[
P_w = Q\left( \sqrt{ \frac{ \| M_{PR} S_k \varepsilon \|^2 }{ 4\sigma^2 } } \right) .
\tag{6.26}
\]

This equation is analogous to the error probability equation for an AWGN channel; the only difference is the energy-reducing projection M_PR. The projection receiver therefore suffers an energy loss which depends on the spreading sequences used, quantified by (6.26).

We will now break the argument of this error probability into a sum over individual symbol metrics. In order to obtain a bound, we average P_w over all possible combinations of the error vector ε. This leads to the bound

\[
E[P_w] \ge Q\left( \sqrt{ \frac{ E\big[ \| M_{PR} S_k \varepsilon \|^2 \big] }{ 4\sigma^2 } } \right)
\tag{6.27}
\]

\[
= Q\left( \sqrt{ \frac{ \sum_{i \in G} \| M_{PR}\, s_k[i] \|^2 }{ \sigma^2 } } \right) .
\tag{6.28}
\]

The inequality is established by the fact that Q(√x) is a convex function of x. Equality in (6.28) occurs if and only if

\[
M_{PR}\, s_k[i] \perp M_{PR}\, s_k[j] \, ; \quad \forall\, i, j \in G, \; i \ne j .
\tag{6.29}
\]

This occurs, for example, if the CDMA channel is synchronous. We have thus established the following:

Lemma 6.1. The error probability of the PR with asynchronous CDMA is strictly larger than the error probability of the PR with synchronous CDMA.

For the purpose of this lemma, a CDMA system is called asynchronous if (6.29) does not hold for some i, j. Note, however, that with random CDMA the difference in error performance is typically very small.

We now study the behavior of P_w for random CDMA, proceeding analogously to the proof of Theorem 4.1 in Section 4.4.4. Let Q = [q_1, ..., q_{n(L−K+1)}] be an orthonormal basis for span S_u^⊥, of dimension u = n(L−K+1). In terms of Q, the projection matrix M_PR is given by

\[
M_{PR} =
\begin{bmatrix}
M_{11} & \cdots & M_{n1} \\
\vdots & & \vdots \\
M_{1n} & \cdots & M_{nn}
\end{bmatrix}
= Q Q^* ,
\tag{6.30}
\]

where Q has dimension nL × u, and the L × L sub-matrix M_{ij} of M_PR is given by

\[
M_{ij} = Q_i Q_j^* ,
\tag{6.31}
\]

where Q_i has dimension L × u and is defined as

\[
Q_i =
\begin{bmatrix}
\text{row } iL \text{ of } Q \\
\text{row } iL+1 \text{ of } Q \\
\vdots \\
\text{row } iL+L-1 \text{ of } Q
\end{bmatrix} .
\tag{6.32}
\]

Since Q^* Q = I_u,

\[
\operatorname{tr}(Q^* Q) = u
= \operatorname{tr}\bigg( \sum_{i=1}^{n} Q_i^* Q_i \bigg)
= \sum_{i=1}^{n} \operatorname{tr}(Q_i^* Q_i) .
\tag{6.33}
\]

The term tr(Q_i^* Q_i) is a random variable, but from (6.33) we conclude that

\[
E\big[ \operatorname{tr}(Q_i^* Q_i) \big] = \frac{u}{n} = L-K+1 .
\tag{6.34}
\]

In the synchronous channel, Q_i has only L non-zero columns, and therefore tr(Q_i^* Q_i) = L − K + 1 and (6.34) holds trivially. In the asynchronous channel, however, the number of non-zero columns is increased by the dispersion of the signal caused by the projection filter.

Now, from (6.26) and (6.30),

\[
\begin{aligned}
P_w &= Q\left( \sqrt{ \frac{1}{\sigma^2} \sum_{i \in G} s_k^*[i]\, M_{ii}\, s_k[i]
 + \frac{1}{2\sigma^2} \sum_{i \in G} \sum_{\substack{j \in G \\ j \ne i}} \varepsilon_i\, s_k^*[i]\, M_{ij}\, \varepsilon_j\, s_k[j] } \right) \\
&= Q\left( \sqrt{ \frac{1}{\sigma^2} \sum_{i \in G} s_k^*[i]\, Q_i Q_i^*\, s_k[i]
 + \frac{1}{2\sigma^2} \sum_{i \in G} \sum_{\substack{j \in G \\ i \ne j}} \varepsilon_i\, s_k^*[i]\, Q_i Q_j^*\, \varepsilon_j\, s_k[j] } \right) ,
\end{aligned}
\tag{6.35}
\]

and using x^* x = tr(x x^*),

\[
P_w = Q\left( \sqrt{ \frac{1}{\sigma^2} \sum_{i \in G} \operatorname{tr}\big( Q_i^*\, s_k[i] s_k^*[i]\, Q_i \big)
 + \frac{1}{2\sigma^2} \sum_{i \in G} \sum_{\substack{j \in G \\ i \ne j}} \operatorname{tr}\big( Q_i^*\, \varepsilon_i s_k[i]\, \varepsilon_j s_k^*[j]\, Q_j \big) } \right) .
\tag{6.36}
\]

We can now evaluate the expectation over the channel, where we obtain a lower bound by moving the expectation inside the Q(√·) function:


\[
\begin{aligned}
E[P_w] &\ge Q\left( \sqrt{ \frac{1}{\sigma^2}\, E\bigg[ \sum_{i \in G} \operatorname{tr}\big( Q_i^*\, s_k[i] s_k^*[i]\, Q_i \big) \bigg]
 + \frac{1}{2\sigma^2}\, E\bigg[ \sum_{i \in G} \sum_{\substack{j \in G \\ i \ne j}} \operatorname{tr}\big( Q_i^*\, \varepsilon_i s_k[i]\, \varepsilon_j s_k^*[j]\, Q_j \big) \bigg] } \right) \\
&\overset{(1)}{=} Q\left( \sqrt{ \frac{1}{\sigma^2} \sum_{i \in G} \frac{u}{n}
 + \frac{1}{2\sigma^2} \sum_{i \in G} \sum_{\substack{j \in G \\ i \ne j}} E\big[ \operatorname{tr}\big( Q_i^*\, \varepsilon_i s_k[i]\, \varepsilon_j s_k^*[j]\, Q_j \big) \big] } \right) \\
&= Q\left( \sqrt{ \frac{1}{\sigma^2} \sum_{i \in G} \frac{u}{n} } \right)
= Q\left( \sqrt{ \frac{w\, P\, (L-K+1)}{\sigma^2} } \right) ,
\end{aligned}
\tag{6.37}
\]

where the inequality follows from the convexity of Q(√x) in x and an application of Jensen's inequality. Equality in (1) results from the facts that E[s_k[i] s_k^*[i]] = I, that Q_i and s_k[i] are statistically independent with E[s_k[i]] = 0, and that E[s_k[i] s_k^*[j]] = 0 for i ≠ j. We note that the probability of an error depends only on the Hamming weight w of the error path of the constrained user, and, furthermore, that the degradation is identical for the synchronous and the asynchronous channel. Together with the corresponding interference-free single-user error probability P_w = Q(√(wPL/σ²)), Theorem 6.1 is established.

This last result justifies our extensive treatment of the PR receiver:

Theorem 6.2. The performance using the PR metrics (6.15) is strictly better than the performance using the distance metrics of the output of the decorrelator, for both synchronous and asynchronous random CDMA.

Proof. First we upper bound the error probability (6.26) of the PR by lower bounding the numerator in the argument of the function Q(√·):

\[
\bigg\| M_{PR} \sum_{i \in G} s_k[i]\, \varepsilon_i \bigg\|^2
\ge \bigg\| \sum_{i \in G} M_i\, s_k[i]\, \varepsilon_i \bigg\|^2 ,
\tag{6.38}
\]

where M_i is the projection onto span{S \ s_k[i]}^⊥, i.e. onto the orthogonal complement of the span of all the spreading sequences except s_k[i].

This is true because

\[
\| M_{PR}\, s_k[i] \| \ge \| M_i\, s_k[i] \| ,
\tag{6.39}
\]

since the space onto which M_i projects has fewer dimensions, and

\[
P_w \le Q\left( \sqrt{ \frac{ \sum_{i \in G} \| M_i\, s_k[i] \|^2 }{ \sigma^2 } } \right) .
\tag{6.40}
\]

On the other hand, using the block matrix description

\[
R =
\begin{bmatrix}
s_c^* s_c & s_c^* S_u \\
S_u^* s_c & S_u^* S_u
\end{bmatrix} ,
\tag{6.41}
\]

letting s_c = s_k[i], and applying the partitioned matrix inversion lemma, we obtain for the decorrelator output at time i

\[
v_k[i] = \frac{ s_k^*[i]\, M_i\, r }{ \| M_i\, s_k[i] \|^2 } ,
\tag{6.42}
\]

which is Gaussian distributed with mean and variance

\[
v_k[i] \sim \mathcal{N}\left( 1 ,\; \frac{ \sigma^2 }{ \| M_i\, s_k[i] \|^2 } \right) .
\tag{6.43}
\]

This leads to the following two-codeword error probability if symbol-wise decorrelation is used:

\[
P_{w,\mathrm{DEC}} = Q\left( \sqrt{ \frac{ w^2 }{ \sigma^2 \sum_{i \in G} \| M_i\, s_k[i] \|^{-2} } } \right) .
\tag{6.44}
\]

Since the function 1/x is convex,

\[
\frac{1}{w} \sum_{i \in G} \| M_i\, s_k[i] \|^2 \ge \frac{1}{ \frac{1}{w} \sum_{i \in G} \| M_i\, s_k[i] \|^{-2} } ,
\tag{6.45}
\]

and therefore

\[
P_{w,\mathrm{DEC}} \ge Q\left( \sqrt{ \frac{ \sum_{i \in G} \| M_i\, s_k[i] \|^2 }{ \sigma^2 } } \right) \ge P_w
\tag{6.46}
\]

for any two codewords. This proves our theorem.

6.3 Iterative Decoding

Iterative decoding results from interpreting the joint multiple-access system with error control coding in Figure 6.2 as a serially concatenated coding system [120], in which the error control codes take the role of outer codes and the multiple-access channel takes the role of the inner code. With this interpretation, a soft information exchange decoding algorithm, similar to the turbo decoding algorithm, suggests itself, and has proven very effective. A diagram of such an iterative multiuser decoder is shown in Figure 6.7.

Fig. 6.7. Iterative multiuser decoder with soft information exchange. The received vector r is passed through a CDMA interference resolution function and per-user filters w_k^*; each user branch consists of a deinterleaver π_k^{-1}, a soft-output FEC decoder producing λ(d_k), a tanh(·/2) soft-bit mapping, and an interleaver π_k feeding back to the cancellation stage.

The error control decoders are soft-output a posteriori probability (APP) decoders [120] which generate the a posteriori probabilities of the individual transmitted coded bits, i.e.

\[
\Pr(d_k[i] = 1 \mid r_k) ; \quad \Pr(d_k[i] = -1 \mid r_k) ,
\tag{6.47}
\]

which are typically calculated as the symbol-wise log-likelihood ratio (LLR)

\[
\lambda(d_k[i]) = \log\left( \frac{ \Pr(d_k[i] = 1 \mid r_k) }{ \Pr(d_k[i] = -1 \mid r_k) } \right) .
\tag{6.48}
\]
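The relation (6.48) is easy to invert; a minimal sketch (the helper names are our own):

```python
import math

def llr(p_plus):
    """Symbol-wise LLR (6.48) from the a posteriori probability Pr(d = +1 | r)."""
    return math.log(p_plus / (1.0 - p_plus))

def posteriors(l):
    """Invert (6.48): recover (Pr(d = +1 | r), Pr(d = -1 | r)) from the LLR."""
    p_plus = 1.0 / (1.0 + math.exp(-l))
    return p_plus, 1.0 - p_plus
```

A zero LLR corresponds to a completely unreliable bit, Pr(d = +1 | r) = Pr(d = −1 | r) = 1/2.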

The interleavers π_1, ..., π_K are assumed to be large enough to destroy any correlation between different LLR values. This independence assumption is very important for the statistical analysis of the detector, as well as for its successful operation.

Figure 6.7 shows the canonical form of an iterative multiuser decoder. Different receivers mainly differ in the type of FEC codes used, in the soft-output decoding algorithms applied, in the CDMA interference resolution function employed, and in the loop filters w_k^* used. We will see in the next sections that a cancellation front-end is not only an attractive low-complexity solution, but, in conjunction with different received power levels or code rates, can approach the capacity of the multiple-access channel.

6.3.1 Signal Cancellation

The interference resolution block has the function of extracting soft estimates from the received vector r, using as input soft estimates of the coded symbols from the soft-output error control decoders. One possibility is to calculate the APP probabilities of the coded symbols, this time using the channel constraints; i.e., from Chapter 4, equation (4.15), we calculate

\[
\hat d_k[i] = \arg\max_{d \in \mathcal{D}} \; \sum_{\substack{\mathbf{d} \in \mathcal{D}^{Kn} : \\ d_k[i] = d}} \Pr(r \mid \mathbf{d})\, \Pr(\mathbf{d}) ,
\tag{6.49}
\]

where Pr(r|d) is the multiple-access channel transition probability, which is Gaussian for CDMA, and Pr(d) is the vector of a priori probabilities of the transmitted symbols learned from a previous iteration.

Equation (6.49) represents the "optimal" APP detector proposed by Moher [88]. Unfortunately, even a cursory analysis shows that the number of terms in the sum grows exponentially in K and n; the computation thus quickly becomes unmanageable and is not of much practical interest.

Guided by information-theoretic thinking, or simple intuition, one can argue that the next best thing to do is to cancel the interference for each individual user by subtracting an appropriate estimate √P_l d̂_l[j] s_l[j] of the interference caused by the other users. This operation generates a new received sequence r_k^{(2)} for user k as

\[
\begin{aligned}
r_k^{(2)} &= \sum_{j=1}^{n} \sqrt{P_k}\, d_k[j]\, s_k[j]
 + \overbrace{ \sum_{j=1}^{n} \sum_{\substack{l=1 \\ l \ne k}}^{K} \sqrt{P_l} \big( d_l[j] - \hat d_l[j] \big) s_l[j] }^{I_k^{(1)}} + z \\
&= \sum_{j=1}^{n} \sqrt{P_k}\, d_k[j]\, s_k[j] + I_k^{(1)} + z .
\end{aligned}
\tag{6.50}
\]

Note that the received sequence r_k^{(i)} as well as the interference I_k^{(i−1)} change with each iteration i, as denoted by the superscripts. We will usually omit these superscripts unless they are specifically needed. The received signal r_k of the k-th user is distorted by additive noise and the interference I_k. Central limit theorem arguments quickly establish that this interference, which results from incomplete cancellation of the interfering users' signals, is Gaussian – see below.

The filters w_1^*, ..., w_K^* are receiver loop filters inserted in the processing chain of each user. In the absence of multiple-access interference these filters would simply be discrete signal matched filters.

The question now turns to how to generate the estimate of the interference. Clearly, if we knew all of the interfering symbols precisely, perfect cancellation would be possible. However, the other APP decoders are not, initially, capable of generating exact estimates. We therefore attempt to minimize the error d_l[j] − d̂_l[j] between actual and estimated symbols in the mean-square sense. This is accomplished by using as estimated symbols the conditional expectations

\[
\hat d_l[j] = E\big[\, d_l[j] \,\big|\, \lambda(d_l[j]) \,\big] = \tanh\left( \frac{ \lambda(d_l[j]) }{2} \right) ,
\tag{6.51}
\]

which leads to the tanh(·) functions generating the soft bits d̂_l[j]. These soft bits minimize the variance of I_k given the locally optimal probability estimates from the code APP decoders.

The received signals r_k are now considered to be distorted only by Gaussian noise, and matched filters w_k^*[j] = s_k^*[j] are therefore the optimal receiver filters.
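A single-symbol-interval sketch of the cancellation step (6.50) with the soft bits (6.51); the notation (one column of S per user, matched filtering omitted) and the helper name are our own choices:

```python
import numpy as np

def soft_cancel(r, S, P, llrs, k):
    """Soft interference cancellation for user k over one symbol interval:
    subtract sqrt(P_l) * tanh(llr_l / 2) * s_l for every interferer l != k."""
    d_soft = np.tanh(np.asarray(llrs) / 2.0)   # conditional-mean soft bits (6.51)
    rk = np.asarray(r, dtype=float).copy()
    for l in range(S.shape[1]):
        if l != k:
            rk -= np.sqrt(P[l]) * d_soft[l] * S[:, l]
    return rk
```

With perfectly reliable LLRs (|λ| large, soft bits ≈ ±1) the interference is removed completely and r_k collapses to √P_k d_k s_k plus noise.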

6.3.2 Convergence – Variance Transfer Analysis

With random CDMA, the interference is proportional to the ratio of users to processing gain, as shown in Chapter 4. Furthermore, the interference is proportional to the variance of the estimation error

\[
v_{d,l} = E\Big[ \big( d_l[j] - \hat d_l[j] \big)^2 \Big] .
\tag{6.52}
\]

Assuming equal powers P_k = P for all users, all estimation error variances are equal, v_{d,l} = v_d. Further, using matched receiver filters, i.e. w_k^*[j] = s_k^*[j], leads to a particularly simple expression for the variance of the residual interference plus noise at the input of the error control decoder:

\[
v_{IC}(v_d) = \sigma^2 + \frac{K-1}{L}\, P\, v_d
\;\; \overset{K,L \to \infty}{\Longrightarrow} \;\;
\sigma^2 + \beta P v_d ,
\tag{6.53}
\]

which is intuitive: there are K−1 interferers for user k, each being suppressedby the processing gain L. The APP decoder estimation error variance vd isthe average power of the residual interference.

Therefore, given an average residual symbol interference variance v_d, equation (6.53) describes the noise variance in the signal r_k for the next iteration through the FEC decoders. For matched filters in the loop, this variance transfer (VT) curve is a linear function of the system load and the residual symbol variance, and is conveniently visualized in Figure 6.8.

Fig. 6.8. Soft cancellation variance transfer curve: v_IC grows linearly in the cancelled symbol variance v_d, with slope given by the system load, starting from the channel noise level (the single-user noise level) at v_d = 0.

The more accurate the symbol estimates of the interferers, the smaller the residual noise that the error control decoder has to overcome. Note that even if the interfering symbols are exactly known, v_d = 0, there is still the AWGN which the error control decoder has to overcome.

The idea of iterative decoding is that, through the iterations, the variance on all the single-user channels can be improved until complete cancellation of all interference is achieved. In order to understand this process, we need to understand the operation of the soft-output FEC APP decoders.

The input sequence r_k to the k-th APP decoder has an additive Gaussian noise distortion with variance v_IC per symbol, so the decoder analysis can be handled analogously to an AWGN channel. The outputs of the decoder are the soft bits (6.51), and the primary measure of their reliability is the variance v_d. While extensive numerical analysis [28, 133] indicates that the LLRs (6.48) are Gaussian distributed to good accuracy, the soft bits are not Gaussian distributed. We will calculate their distribution later assuming that the LLRs are Gaussian. However, since in the limit of many users the number of terms in I_k is large, I_k is still asymptotically Gaussian by central limit theorem arguments, and therefore v_d is a sufficient measure of the reliability of d̂_l[j].

Unfortunately, there exists no closed-form expression for v_d as a function of v_IC, other than for very simple codes such as the [2,1,2] parity check code. The variance transfer (VT) curves v_d = g(v_IC) must therefore be found numerically for all codes of interest. Figure 6.9 shows such VT curves for a selection of low-complexity FEC codes. Note that, for reasons which will soon become clear, the input v_IC is plotted along the vertical axis.

Fig. 6.9. Code VT curves v_d = g(v_IC) for a selection of low-complexity FEC codes (input noise variance v_IC on the vertical axis).

The operation of an iterative receiver now proceeds in the following way: cancelled signals r_k for each user are generated according to (6.50) and passed to the FEC APP decoders. These, in turn, generate soft estimates of the coded signals which are used to form interference estimates for a new cancellation iteration. If the interleavers used in the system are large enough to ensure that the implicit assumption of independence between signals is accurate, the process of iterative cancellation and decoding can be captured by the variance transfer chart shown in Figure 6.10 for the rate R = 1/3, ν = 3 convolutional code and a system load of β = 3.

Fig. 6.10. VT chart and iteration example for a highly loaded CDMA system (rate 1/3 convolutional code, K = 30, L = 10, β = 3, Eb/N0 = 10dB). The FEC code VT curve is dashed, the cancellation VT curve is solid.

Decoding starts at point (0), where the FEC decoders have to work with the full interference and noise. Nonetheless, after one iteration the system manages to reduce the interference by subtracting estimates of the interfering signals. This leads to point (1), whose vertical component is the now reduced noise variance at the input of the soft-output error control decoder. In the next iteration, the noise variance can be further reduced to point (2), and so on, until the iterations reach the intersection point of the two curves, the solid point in the enlargement. At this point virtually all the interference has been canceled, and only channel noise is left. This intersection point is a fixed point of the one-dimensional iterated variance transfer map, which models the behavior of the iterative decoder. The performance of the individual decoders at that point is essentially that of the decoders in Gaussian noise, and we call it the noise limitation of the iterative decoder.

Note that along the iterative trajectory there is a section which forms a narrow channel through which the trajectory progresses in small steps. This area is the interference limitation. In this example, a small increase in the system load would create another intersection point between the two VT curves, and the variance would stop improving at that new fixed point rather than at the noise limitation fixed point: the decoder fails due to an excessive system load. As the load decreases, or the signal-to-noise ratio increases, the channel opens up and convergence to the noise limitation fixed point is suddenly enabled. This happens at a very sharp signal-to-noise threshold, and gives rise to the abrupt, cliff-like behavior of the error rates in such systems.
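The fixed-point behavior just described can be mimicked with the one-dimensional map v_IC ← σ² + βP·g(v_IC) from (6.53); the logistic code curve `g_toy` below is a toy stand-in for a real decoder VT characteristic (our own assumption, not the convolutional code of Figure 6.10):

```python
import math

def iterate_vt(g, beta, sigma2, P=1.0, iters=100):
    """Iterate the cancellation/decoder variance-transfer map, starting from
    full interference (v_d = 1, i.e. nothing cancelled yet, point (0))."""
    v_ic = sigma2 + beta * P          # first decoder input: g = 1
    for _ in range(iters):
        v_ic = sigma2 + beta * P * g(v_ic)
    return v_ic

# Toy code VT curve with a convergence threshold near v_IC = 0.5 (assumed).
g_toy = lambda v: 1.0 / (1.0 + math.exp(-12.0 * (v - 0.5)))
```

With this toy curve, a load of β = 0.5 converges close to the channel noise level (noise limitation), while β = 3 stalls at the interference-limited fixed point near σ² + βP.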

Figure 6.11 shows the bit error curves for the system whose VT curves are shown in Figure 6.10. Clearly visible is the sharp onset of the drop in error rates as the signal-to-noise ratio is increased sufficiently to open the convergence channel and allow the decoder to progress beyond the interference limitation. At sufficient signal-to-noise ratios, the bit error rates closely follow those of the FEC code operated in additive noise alone; this is the noise limitation and is only a function of the FEC code used.

Fig. 6.11. Illustration of the turbo effect of iterative joint CDMA detection: bit error rate versus Eb/N0 after 1, 5, 10, 15, 20, and 30 iterations, together with the single-user reference curve.

The question now arises as to which FEC code should be used for efficient communication. Examining stronger error control codes, such as turbo codes or LDPC codes, will be beneficial in the noise limitation area, where the single-user performance of the codes comes to bear. Figure 6.12 shows the VT curves of a selection of powerful error control codes: several rate R = 0.5 LDPC codes at their first iteration and at iteration 20, as well as two rate R = 1/3 serially concatenated turbo codes, whose rates and component codes are given in Table 6.1. All of these codes have good to excellent performance on AWGN channels, and their VT curves all show a general trend: the more powerful the code, the more abrupt the transition between unreliable and reliable code output behavior.

Table 6.1. Serially concatenated codes of total rate 1/3 whose VT curves are shown in Figure 6.12. For details on serially concatenated turbo codes, see [120, Chapter 11].

Code 1: inner code g_1 = 1/(1+D); outer code g_2 = [1+D², 1+D+D²+D³, 1+D+D²+D³]; R_i = 1, R_o = 1/3.

Code 2: inner code g_1 = [1, 0, (1+D²)/(1+D+D²); 0, 1, (1+D)/(1+D+D²)]; outer code: rate-1/2 repetition code; R_i = 2/3, R_o = 1/2.

In fact, for very large LDPC or turbo codes, which exhibit a sharp threshold signal-to-noise ratio, the following upper bound on the code variance transfer function is quite tight:

\[
v_d = g\left( \frac{v_{IC}}{P_k} \right) \le
\begin{cases}
1 & \text{if } v_{IC} \ge \tau P_k \\
0 & \text{if } v_{IC} < \tau P_k ,
\end{cases}
\tag{6.54}
\]

where 1/τ is the signal-to-noise ratio at which the original code has its convergence threshold on an additive white Gaussian noise channel. This bound on the VT curve will allow us to derive optimal rate and power distributions in multiple-access systems where either rates or power levels can be chosen to optimize system efficiency. This will be done in Section 6.5.2.
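For equal received powers, one simple reading of the step bound (6.54) — our own sketch, not a result from the text — is that cancellation converges in a single pass exactly when the full-interference input variance σ² + βP already lies below the code threshold τP:

```python
def max_load_step_code(tau, sigma2, P=1.0):
    """Largest load beta for which an ideal step-VT code (6.54), with equal
    received powers P, sees its full-interference input variance
    sigma^2 + beta * P fall below the threshold tau * P."""
    return max(0.0, tau - sigma2 / P)
```

The interesting unequal-power and unequal-rate cases, where the bound really pays off, are the subject of Section 6.5.2.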

Figure 6.13 shows an iteration example of the serially concatenated code SCC 2 for a load of β = K/L = 23/10 = 2.3 at a signal-to-noise ratio of Eb/N0 = 10dB. The achieved performance in terms of spectral efficiency is inferior to that achieved by the weaker convolutional code (Figure 6.10). In both cases, convergence of the iterations to low error rates is guaranteed as long as the VT curves of the cancellation function and the FEC APP decoder do not intersect at the interference-limited point.

In order to find the limiting performance of a given system in terms of system load and spectral efficiency, one increases either the noise level or the system load β until the cancellation VT line becomes tangential to the FEC decoder VT curve. This is illustrated in Figure 6.14 for the two FEC codes discussed and for Eb/N0 → ∞. From Figure 6.14 one can read off the limiting supportable loads of β = 2.7 for the concatenated system and β = 3.5 for the convolutional code.

Fig. 6.12. Variance transfer curves of various powerful error control codes: practical-sized LDPC codes of length N = 5000 and code rate R = 0.5 ((2,4), (3,6), and (4,8) regular as well as irregular LDPC codes, shown at iteration 1 and iteration 20), and the two serially concatenated turbo codes of Table 6.1.

In order for the system to overcome the convergence bottleneck created at the interference limitation intersection point, it is desirable that the FEC code VT curve have a steep slope throughout, to closely match the steep slope of the cancellation VT curve caused by a high β. This puts the stronger codes, such as the LDPC and turbo codes, at a disadvantage in handling high system loads, since their VT curves have shallow slopes right at their turbo cliff noise variance level. Granted, for low loads these same turbo codes can overcome interference and high noise; however, we have seen in Chapter 4 that for low loads (β ≤ 0.5) no iterative detection is needed, since linear filters preserve most of the channel capacity. The use of strong turbo-like error control codes in iterative receivers for highly loaded CDMA systems may thus be counter-productive.

The error behavior of such iterative decoders follows the typical pattern of turbo codes, with a sharp error cliff which occurs at the threshold signal-to-noise ratio, above which error rates rapidly drop to very low values before approaching the single-user error curve. This error floor domain is completely determined by the error behavior of the error control code in additive white Gaussian noise.

While an individual iteration has a very low complexity, the number of iterations increases rapidly as the cutoff load, or signal-to-noise ratio, is approached. Figure 6.15 shows the performance of the receiver for the code SCC 2 from Table 6.1 as a function of the number of iterations. This code has a cutoff signal-to-noise ratio of Eb/N0 = 1.3dB, where the VT curves become tangential for a load of β = 1.

Fig. 6.13. VT chart and iteration example for a highly loaded CDMA system using a strong serially concatenated turbo code (SCC 2; rate 1/3, K = 23, L = 10, β = 2.3, Eb/N0 = 10dB).

6.3.3 Simple FEC Codes – Good Codeword Estimators

The difficulty faced by the FEC codes in an iterative system, and the reasonwhy weak codes perform better is related to how well the soft-output decodercan estimate the transmitted codewords, and not to how well the decoder candecode the information sequences contained in these codewords. In order toovercome the multiple-access interference, accurate estimation of the transmit-ted codewords is required, such that the mutual interference can be reducedeffectively, while for low-error detection, good error control properties are re-quired, such as large distances between codewords. These two requirementsare largely contradictory, as can be seen by the following intuitive arguments.It is easy to construct a code which allows for perfect codeword estimation,this code would simply map each message into the same codeword, or into aset of codewords with very small Euclidean distances. The decoder will thenalways be able to estimate the transmitted codeword with essentially arbi-trary accuracy, and thus cancellation will be very accurate. That is, the VTtransfer characteristics of such a code would be a horizontal line that coincideswith the y-axis. The problem obviously is that such a code cannot produce alow decoded bit error rate, even after perfect cancellation of all interference;for that to be possible we need codes that map the the information sequences


[Figure: soft-bit variance v_d vs. cancelled MU variance v_IC.]

Fig. 6.14. Determination of the limiting performance of FEC coded CDMA systems via VT curve matching for Eb/N0 → ∞.

into codewords as far apart from each other as possible, such that the noise spheres around each codeword do not overlap, as required by Shannon's theory (see, e.g., [167] for geometrical arguments and error bounds).

The practical codes whose VT curves we have presented in Figures 6.10 and 6.12 are representatives of these two groups of error control codes, and the weakest code, a simple parity check code, achieves the highest system load, but requires substantial signal-to-noise ratio for acceptable bit error rates.

As we will see later, contrary to linear filter detection, the equal power, equal rate situation represents a worst-case scenario for an iterative cancellation receiver, since neither power nor rate distributions can be used to maximize spectral efficiency, as will be done in Section 6.5.4. As discussed, weaker error control codes, which exhibit a steeper VT curve than stronger codes, achieve a significantly higher system load β. In a "weak" code, however, the distinction between interference limitation and noise limitation is blurred, making it more difficult to determine the maximal supportable system load. Consider Figure 6.16, which shows the VT curve of a R = 1/3 repetition code. The distinction between interference and noise limitation has vanished, and low BER performance is not possible due to the weak nature of the repetition code, quite independently of the load. This leads to a new view, which is to


[Figure: average bit error rate (1 down to 10^-4) vs. number of iterations (1–30); curves for Eb/N0 = 1.3, 1.4, 1.45, and 1.5 dB.]

Fig. 6.15. Bit error performance of SCC2 from Table 6.1 as a function of the number of iterations.

consider the iterative cancellation system using a weak error control code as a non-linear separation filter, rather than a detector.

The intersection point, which corresponds to the fixed point of the iterative system equation, corresponds to a certain variance in the log-likelihood ratio of the decoded information bit. (This is not the value of the x-coordinate of the intersection point, which is the variance of the coded bit.) The LLR of the information bit, furthermore, is Gaussian distributed, and we therefore have a binary-input, Gaussian-output channel between input information bits and their corresponding output information bit LLR values. The channel is binary since we assume BPSK modulation of our CDMA system; for other modulation formats, we would obtain the corresponding discrete-input, Gaussian-output channel. The iterative decoder now acts as a non-linear filter, suppressing interference, but not completing the decoding operation due to the weakness of the repetition code. The system can also be viewed as a non-linear signal layering system (compare Chapter 4). Its spectral efficiency is then given by


C = \beta R\, C_B\!\left(\frac{P}{\sigma^2_{\mathrm{llr}}(\beta, C)}\right), \qquad (6.55)

where C_B(SNR) is the capacity of the binary-input channel with signal-to-noise ratio SNR. This is so since these output binary LLR values can be understood as noisy versions of the transmitted data bits, and by Shannon's theorem, they can carry at most a rate given by (6.55). The final output variance \sigma^2_{\mathrm{llr}}


[Figure: v_IC (0–3.5) vs. soft-bit variance v_d (0–1); VT curve of the rate R = 1/3 repetition code, with the intersection point (v_∞^2, v_d) marked.]

Fig. 6.16. Interference cancellation with a weak rate R = 1/3 repetition code, acting as non-linear layering filter.

of the (Gaussian) log-likelihood ratio of the data bits at the decoder output corresponds to that achieved at the iteration fixed point. The (final) output signal-to-noise ratio P/\sigma^2_{\mathrm{llr}} does depend on the system load β as well as on the particular code C used, as discussed above.
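The binary-input capacity C_B(SNR) appearing in (6.55) has no closed form, but is straightforward to evaluate numerically. The following sketch is our own illustration (the Monte Carlo estimator and function names are not from the text):

```python
import numpy as np

def c_binary(snr, n=200_000, seed=0):
    """Monte Carlo estimate of the binary-input AWGN capacity (bits/symbol):
    C_B(SNR) = 1 - E[ log2(1 + exp(-2*sqrt(SNR)*(sqrt(SNR) + Z))) ], Z ~ N(0,1)."""
    z = np.random.default_rng(seed).standard_normal(n)
    a = np.sqrt(snr)
    return 1.0 - np.mean(np.log2(1.0 + np.exp(-2.0 * a * (a + z))))

def spectral_efficiency(beta, R, P, sigma_llr2):
    # spectral efficiency of the layered system, in the spirit of eq. (6.55)
    return beta * R * c_binary(P / sigma_llr2)
```

C_B approaches 1 bit/symbol for large SNR and vanishes for small SNR, which is what makes the trade-off between the load β and the per-user rate in (6.55) non-trivial.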

In general it is very difficult to calculate the variance transfer function of a given code, and this makes optimization over the codes C in (6.55) a difficult task as well. The VT functions of typical error control codes, like the ones shown in Figure 6.12, are found via numerical methods by simulating the input-output behavior of the FEC code. However, the output extrinsic LLR of any one of the coded bits in a length-N repetition code at iteration m is simply


\lambda^{(E)}(d_k[j]) = \sum_{i \neq j} \lambda\!\left(y_k^{(m)}[i]\right) = \frac{2}{v_d} \sum_{i \neq j} r_k^{(m)}[i], \qquad (6.56)

where y_k^{(m)}[j] = s_k[j]\, r_k^{*(m)}[j] is the loop matched filter output, and the information bit LLR can be calculated as


\lambda(u) = \sum_{i=1}^{N} \lambda\!\left(y_k^{(m)}[i]\right). \qquad (6.57)

From the extrinsic LLR values \lambda^{(E)}(d_k[j]) of the coded bits the decoder computes the soft bits \tilde d_k[j] = \tanh\!\left(\lambda^{(E)}(d_k[j])/2\right) used for cancellation. Decoding is very simple, since most of the complexity lies in the tanh(·) function.
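These operations are trivially implemented. A minimal sketch of the repetition-code soft decoder of (6.56)–(6.57) (our own illustration; the variable names are not from the text):

```python
import numpy as np

def repetition_soft_decode(r, v_d):
    """Soft bits and information-bit LLR of a length-N repetition code.
    r   : the N loop matched-filter outputs for one information bit
    v_d : residual interference-plus-noise variance in the loop"""
    lam = 2.0 * r / v_d              # per-chip LLRs lambda(y[i])
    lam_u = lam.sum()                # information-bit LLR, eq. (6.57)
    lam_ext = lam_u - lam            # leave-one-out extrinsic sums, eq. (6.56)
    soft = np.tanh(lam_ext / 2.0)    # soft bits used for cancellation
    return soft, lam_u

# toy check: a noisy all-(+1) codeword yields soft bits close to +1
r = 1.0 + 0.1 * np.random.default_rng(1).standard_normal(8)
soft, lam_u = repetition_soft_decode(r, v_d=1.0)
```

Note that each extrinsic sum excludes the bit's own observation, which is what keeps the cancellation loop from feeding a bit's noise back to itself.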

The PDFs of both the a priori and extrinsic LLRs are Gaussian due to (6.56) and (6.57), with variances


v_E^2 = \frac{v_d}{P(N-1)} \qquad \text{and} \qquad \frac{\sigma^2_{\mathrm{llr}}}{P} = \frac{v_d}{PN}, \qquad (6.58)

respectively. However, the PDF of the soft bits \tilde d_k[j] is not Gaussian. It can be calculated as


p\big(\tilde d \,\big|\, d_k[j] = 1\big) = \frac{2 \exp\!\left(-\frac{1}{2 v_E^2}\left(\log\!\left(\frac{1+\tilde d}{1-\tilde d}\right) - \frac{2}{v_E^2}\right)^{\!2}\right)}{\sqrt{2\pi v_E^2}\,\big(1 - \tilde d^2\big)} \qquad (6.59)

via standard random variable transformation. Applying the central limit theorem to the large number of terms in (6.50) as K → ∞, and using the fact that var(\tilde d_k[j]) ≤ 1, the residual interference term I_k converges to a Gaussian distribution with variance βv_d, where

v_d = \int_{-1}^{1} p\big(\tilde d \,\big|\, d_k[j] = 1\big)\,\big(1 - \tilde d\big)^2 \, d\tilde d = g_d(v_E^2). \qquad (6.60)

Equation (6.60) needs to be evaluated by numerical methods. Using (6.60) and (6.58) leads to the following lemma:

Lemma 6.2. The system spectral efficiency in (6.55) using repetition codes is maximized for N → ∞.

Proof. Let the point (x, y) = (g(v_\infty^2), v_\infty^2) be the intersection between the load line \sigma^2 + \beta v_d and the variance transfer curve g(v_d/P) of the repetition code, i.e.

\beta x = g_d^{(-1)}(x) - \sigma^2. \qquad (6.61)

The spectral efficiency of the layered channels is then given from (6.55) as

C = \beta \frac{1}{N}\, C_B\!\left(\frac{NP}{v_\infty^2}\right) = \frac{g^{(-1)}(x) - \sigma^2}{x}\, \frac{1}{N}\, C_B\!\left(\frac{NP}{v_\infty^2}\right) = \frac{(N-1)v_E^2 - \sigma^2}{g_d(v_E^2)}\, \frac{1}{N}\, C_B\!\left(\frac{N}{(N-1)v_E^2}\right), \qquad (6.62)

which can easily be shown to be monotonically increasing with N.
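The function g_d used here must be evaluated numerically, as noted below. A Monte Carlo sketch of this evaluation, assuming the extrinsic LLR is modelled as a consistent Gaussian with mean 2/v_E^2 and variance 4/v_E^2 (one common LLR parameterization, and our assumption here rather than a statement from the text):

```python
import numpy as np

def g_d(vE2, n=400_000, seed=0):
    """Monte Carlo estimate of the soft-bit variance (6.60),
    v_d = E[(1 - d_tilde)^2], for transmitted bit d_k[j] = +1.
    Assumption: extrinsic LLR ~ N(2/vE2, 4/vE2) (consistent Gaussian)."""
    rng = np.random.default_rng(seed)
    lam = 2.0 / vE2 + (2.0 / np.sqrt(vE2)) * rng.standard_normal(n)
    d_tilde = np.tanh(lam / 2.0)
    return float(np.mean((1.0 - d_tilde) ** 2))
```

The estimate is monotonically increasing in v_E^2: reliable extrinsic LLRs (small v_E^2) give soft bits near +1 and hence accurate cancellation.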


For the repetition code we obtain from (6.56)

g\!\left(\frac{v_d}{P}\right) = \tanh\!\left(\frac{P}{v_d} \sum_{i \neq j} r_k[i]\right) = \tanh\!\left(\frac{(N-1)P}{v_d} + \sqrt{\frac{(N-1)P}{v_d}}\,\xi\right), \qquad (6.63)

where \xi \sim \mathcal{N}(0, 1). Now

g(b) = \mathrm{E}\!\left[1 - \tanh\!\left(b^2 + b\xi\right)\right]; \qquad b = \sqrt{\frac{(N-1)P}{v_d}}. \qquad (6.64)

For our further development, we will make use of some fundamental approximations, given in the following

Lemma 6.3. This lemma consists of two parts:

i) The function g(b), b ≥ 0, is monotonically decreasing, and for any b > 0 the following relation holds:

\mathrm{E}\!\left[\big(1 - \tanh(b^2 + b\xi)\big)^2\right] = 4 e^{-b^2/2} \sum_{n=0}^{\infty} (-1)^n\, e^{b^2 (2n+1)^2/2}\, Q\big[b(2n+1)\big], \qquad (6.65)

where Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-u^2/2}\, du is the Gaussian tail function.

ii) For any b ≥ 0 the following inequality is true:

\mathrm{E}\!\left[\big(1 - \tanh(b^2 + b\xi)\big)^2\right] \leq \min\left\{\frac{1}{1+b^2},\; \pi Q(b)\right\} = \begin{cases} \frac{1}{1+b^2}, & b < 1\\ \pi Q(b), & b \geq 1. \end{cases} \qquad (6.66)

Remarks. Equation (6.66), taken as an approximation, has an accuracy better than 90% for all b ≥ 0. Note also that \lim_{b\to\infty} g(b)/(\pi Q(b)) = 1.

Proof. See [15].
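Both the bound (6.66) and the asymptotic tightness of the πQ(b) branch are easy to check by simulation. The following is our own verification sketch, not part of the text:

```python
import numpy as np
from math import erfc, pi, sqrt

def Q(x):
    # Gaussian tail function Q(x) = 0.5 * erfc(x / sqrt(2))
    return 0.5 * erfc(x / sqrt(2.0))

def g(b, n=1_000_000, seed=0):
    # g(b) = E[(1 - tanh(b^2 + b*xi))^2], xi ~ N(0,1), by Monte Carlo
    xi = np.random.default_rng(seed).standard_normal(n)
    return float(np.mean((1.0 - np.tanh(b * b + b * xi)) ** 2))

def bound(b):
    # right-hand side of (6.66)
    return min(1.0 / (1.0 + b * b), pi * Q(b))
```

For b ≥ 1 the πQ(b) branch is the smaller one, and the ratio g(b)/(πQ(b)) approaches 1, in line with the remarks above.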

Now, let v_IC at iteration m be denoted by v_m^2. Using matched-filter interference cancellation and equal unit powers P_k = 1 for simplicity, we find

v_{m+1}^2 = \beta\, g(v_m^2) + \sigma^2. \qquad (6.67)

Using

g(v_m^2) \leq \frac{1}{1 + \frac{N-1}{v_m^2}} \qquad (6.68)

by applying the first upper bound of (6.66) for a repetition code of length N, we obtain


v_{m+1}^2 \leq \frac{\beta v_m^2}{v_m^2 + N - 1} + \sigma^2. \qquad (6.69)

For the unique fixed-point variance we obtain the simple quadratic boundary equation


v_\infty^2 = \frac{\beta v_\infty^2}{v_\infty^2 + N - 1} + \sigma^2, \qquad (6.70)

which has the solution

v_\infty^2 = \frac{1}{2}\left(\beta + \sigma^2 - (N-1) + \sqrt{\big(\beta + \sigma^2 - (N-1)\big)^2 + 4(N-1)\sigma^2}\right). \qquad (6.71)

Equation (6.71) follows from using the first term in (6.66) as the binding constraint, which is the case for v_\infty^2 > 1. Defining the uncoded load \beta' = \beta/(N-1), i.e. the load with respect to the information bit rate, we obtain

v_\infty^2 \leq \frac{N-1}{2}\left(\beta' - 1 + \frac{\sigma^2}{N-1} + \sqrt{\left(\beta' - 1 + \frac{\sigma^2}{N-1}\right)^{\!2} + \frac{4\sigma^2}{N-1}}\right). \qquad (6.72)

Recall that in Chapter 4 we calculated the single-user signal-to-noise ratio of the MMSE layering filter under equal received power levels, given by equation (4.83). Solving it for the residual noise and interference variance, using (4.80), we obtain

v_{\mathrm{mmse}}^2 = \frac{1}{2}\left(\beta - 1 + \sigma^2 + \sqrt{(\beta - 1 + \sigma^2)^2 + 4\sigma^2}\right). \qquad (6.73)

From (6.72) and (6.73) we see that the MMSE filter with load β and noise variance σ^2 can be directly related to the performance of the iterative decoder with load β(N−1) and noise variance σ^2(N−1). We express this finding in the following

Theorem 6.3. Iterative cancellation of equal-power random CDMA users using rate 1/N repetition codes with N → ∞ has a per-chip capacity at least as large as linear minimum mean-square error filtering. The capacity of iterative cancellation is strictly larger for β < 2 − 2σ^2/P.

Proof. The variance (6.72) of an iterative receiver is at most as large as that of an MMSE filter with load β/(N−1) and noise variance σ^2/(N−1). Assuming BPSK modulation for both, we obtain

C = \frac{\beta}{N}\, C_B\!\left(\frac{NP}{v_\infty^2}\right) \qquad (6.74)


for the iterative filter, and

C_{\mathrm{mmse}} = \frac{\beta}{N-1}\, C_B\!\left(\frac{(N-1)P}{v_\infty^2}\right) \qquad (6.75)

for the MMSE filter. Furthermore, equality in (6.71) holds for iterative cancellation if equality holds in the first inequality of (6.66), i.e. if

v_\infty^2 < N - 1. \qquad (6.76)

From this we easily derive the condition of the theorem.
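The closed-form fixed point (6.71) can be checked against a direct iteration of the bounding recursion, i.e. (6.69) taken with equality. A small sketch with our own function names:

```python
import math

def v2_iterated(beta, sigma2, N, iters=500):
    # iterate v^2 <- beta*v^2/(v^2 + N - 1) + sigma^2, eq. (6.69) with equality
    v2 = 1e9   # formally start from a very large variance
    for _ in range(iters):
        v2 = beta * v2 / (v2 + N - 1.0) + sigma2
    return v2

def v2_closed(beta, sigma2, N):
    # closed-form fixed point (6.71) of the quadratic equation (6.70)
    a = beta + sigma2 - (N - 1.0)
    return 0.5 * (a + math.sqrt(a * a + 4.0 * (N - 1.0) * sigma2))
```

Both agree for any load, noise level, and repetition length, and the limit indeed satisfies the fixed-point equation (6.70).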

Figure 6.17 shows the spectral efficiencies in bits/dimension achievable with MMSE filtering and non-linear iterative filtering. The iterative system operates at an N−1 times higher system load than the MMSE filter, which maximizes spectral efficiency for load values 0.5 ≤ β ≤ 1 and rapidly degrades with higher loads. Furthermore, iterative filtering significantly outperforms linear MMSE filtering in the regime where Eb/N0 > 6 dB. The curious curvature of the iterative capacity curves in Figure 6.17 is related to the change in curvature of g(v_d), and occurs as the intersection point traverses the section of zero curvature of the repetition code VT function – see Figure 6.16. This happens where the binding bound in (6.66) switches.

6.4 Filters in the Loop

6.4.1 Per-User MMSE Filters

In Section 6.3 the processing filters w_k^*[j] in Figure 6.7 are filters matched to the spreading waveform of user k. Since the function of the CDMA interference canceler and the filters is the efficient suppression of the interference, Wang and Poor [157] concluded that local minimum mean-square error (MMSE) filters could give better performance. In fact, from a variance point of view the MMSE filter is the optimal choice. We therefore consider the filter


w_k^*[j] = \arg\min_{w \in \mathbb{R}^L} \mathrm{E}\!\left[\left\| d_k[j] - w^* r_k[j] \right\|^2\right] \qquad (6.77)

which is the best linear filter in the receiver chain. The advantage of the MMSE filter over the matched filter comes from its capability to include the correlation inherent in the residual interference signal. This is evident from its derivation, which we develop now.

From the orthogonality principle of estimation theory (see Appendix A), the optimal filter in (6.77) obeys the following relation:

\mathrm{E}\!\left[\big(d_k[j] - w_k^*[j]\, r_k[j]\big)\, r_k^*[j]\right] = 0 \quad\Longrightarrow\quad \sqrt{P_k}\, s_k[j] = \mathrm{E}\!\left[r_k[j]\, r_k^*[j]\right] w_k[j]. \qquad (6.78)


[Figure: capacity in bits/dimension (0.1–10, log scale) vs. Eb/N0 (−2 to 16 dB).]

Fig. 6.17. Achievable spectral efficiencies using linear and nonlinear layering processing in equal power CDMA systems.

If we assume for the moment that there is no interference – or that the interference is assumed to be white – then r_k[j] = \sqrt{P_k}\, d_k[j]\, s_k[j] + z, and

w_k[j] = \frac{\sqrt{P_k}\, s_k[j]}{P_k + \sigma^2}; \qquad (6.79)

that is, apart from an unimportant scaling factor, the required filter is a spreading-sequence matched filter as used in previous sections.

Next we will take the correlation of the interference into account. In order to do this, it is advantageous to slightly rewrite equation (6.78) as

\mathrm{E}\!\left[d_k[j]\, r_k^*[j]\right] = w_k^*[j]\, \mathrm{E}\!\left[r_k[j]\, r_k^*[j]\right]. \qquad (6.80)

We now confine our treatment to the synchronous case, noting that asynchronous CDMA can be handled in a completely analogous manner, but with a much more complicated notational apparatus. Furthermore, as we have seen in Chapter 5 and will see in the next section, iterative filter implementations are


identical for both synchronous and asynchronous systems, and do not suffer the complications of directly inverting an asynchronous correlation matrix.

We now proceed by evaluating

M[j] = \mathrm{E}\!\left[r_k[j]\, r_k^*[j]\right] = S[j]\, W\, S^*[j] + \sigma^2 I = s_k[j]\, P_k\, s_k^*[j] + S_k[j]\, W_k\, S_k^*[j] + \sigma^2 I \qquad (6.81)

as the interference correlation matrix calculated in Chapter 4, where we have separated the term into an "interference and noise" term and the signal part. The inverse of M is the minimum mean-square error matrix filter. We now apply the matrix inversion lemma², and drop the time index j for this calculation, to obtain

\left(\underbrace{s_k P_k s_k^*}_{BCD} + \underbrace{S_k W_k S_k^* + \sigma^2 I}_{A}\right)^{-1} = \left(\alpha I - \left(S_k W_k S_k^* + \sigma^2 I\right)^{-1} s_k P_k s_k^*\right)\left(S_k W_k S_k^* + \sigma^2 I\right)^{-1},

\alpha = P_k + s_k^*\left(S_k W_k S_k^* + \sigma^2 I\right)^{-1} s_k. \qquad (6.82)

Furthermore, using

\mathrm{E}\!\left[d_k\, r_k^*\right] = \sqrt{P_k}\, s_k^*, \qquad (6.83)

the per-user, per-iteration MMSE filter in (6.80) can be calculated as

w_k^* = \frac{\sqrt{P_k}\, s_k^* \left(S_k W_k S_k^* + \sigma^2 I\right)^{-1}}{P_k + s_k^* \left(S_k W_k S_k^* + \sigma^2 I\right)^{-1} s_k}, \qquad (6.84)

where the time index j is implicit for a synchronous system, while for asynchronous CDMA the matrices in (6.84) have to be properly redefined.

We see that this filter now depends not only on the spreading sequences of the interfering users, but also on their powers W_k. Furthermore, since these are residual powers, the matrix inverse in (6.84) has to be recomputed for every iteration – a truly tremendous effort.
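The rank-one structure exploited in (6.82) is the matrix inversion lemma with B = s_k, C = P_k, D = s_k^*. A quick numerical sanity check of that identity (the dimensions and values below are arbitrary demo choices of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
L = 6
# a random positive-definite "interference plus noise" matrix A and a
# random vector s with power P (stand-ins for S_k W_k S_k^* + sigma^2 I,
# s_k and P_k)
G = rng.standard_normal((L, 8)) / np.sqrt(L)
A = G @ G.T + 0.1 * np.eye(L)
s = rng.standard_normal(L) / np.sqrt(L)
P = 1.5

# matrix inversion lemma with B = s, C = P, D = s^*:
# (A + s P s^*)^{-1} = A^{-1} - A^{-1} s (s^* A^{-1} s + 1/P)^{-1} s^* A^{-1}
Ainv = np.linalg.inv(A)
lhs = np.linalg.inv(A + P * np.outer(s, s))
rhs = Ainv - np.outer(Ainv @ s, s @ Ainv) / (s @ Ainv @ s + 1.0 / P)
```

Because the update is rank one, the correction term requires only matrix–vector products once A^{-1} is available, which is what makes the per-user filter tractable at all.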

The factor \sqrt{P_k}/\alpha is a metric adjustment factor and has the same corrective role as for the projection receiver discussed in Section 6.2.3. Its importance diminishes as the system size grows and the variance of α diminishes. Evidently, it also has no effect on calculations of signal-to-noise ratio values.

² Given as (A + BCD)^{-1} = A^{-1} - A^{-1} B \left(D A^{-1} B + C^{-1}\right)^{-1} D A^{-1}.

In order to proceed with our analysis we make use of a result from Tse and Hanly [138] which relates the asymptotic signal-to-interference ratio (SIR) for


an arbitrary user in a CDMA system with random spreading codes where the receiver uses a joint MMSE filter. The different users are allowed to have different power levels, and the result is asymptotic as K, L → ∞. It is given by the implicit formulas of the following theorem, proven for synchronous CDMA:

Theorem 6.4 (Tse and Hanly [138]). The asymptotic signal-to-interference ratio γ_k for user k as K, L → ∞, K/L = β, in a random CDMA system with MMSE receiver filtering obeys the implicit equation

\gamma_k = \frac{P_k}{\sigma^2 + \beta\, \mathrm{E}\!\left[I(P, P_k, \gamma_k)\right]}, \qquad (6.85)

where the effective interference is

I(P, P_k, \gamma_k) = \frac{P\, P_k}{P_k + P\, \gamma_k}. \qquad (6.86)

The expectation in (6.85) is taken over the distribution of the interfering users' power P.

Equation (6.85) is an implicit equation which has no closed-form solution in general. The signal-to-interference ratio of user k depends on the powers of all the other users, or more precisely, on the power distribution of the infinite population of interfering users.

For our purposes we are interested in the equal power case, where P_k = P for all users k. In this case a closed-form expression for γ_k can be found as

\gamma_k = \frac{P(1-\beta) - \sigma^2}{2\sigma^2} + \sqrt{\frac{P^2(\beta-1)^2}{4\sigma^4} + \frac{P(\beta+1)}{2\sigma^2} + \frac{1}{4}}. \qquad (6.87)
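A quick consistency check of the closed form (6.87) against the implicit equation (6.85)–(6.86) for the equal power case (our own verification sketch):

```python
import math

def gamma_eq(P, sigma2, beta):
    # closed-form equal-power MMSE SIR, eq. (6.87)
    return (P * (1.0 - beta) - sigma2) / (2.0 * sigma2) + math.sqrt(
        P * P * (beta - 1.0) ** 2 / (4.0 * sigma2 ** 2)
        + P * (beta + 1.0) / (2.0 * sigma2) + 0.25)

def implicit_rhs(gamma, P, sigma2, beta):
    # right-hand side of (6.85): with equal powers the effective
    # interference (6.86) reduces to I(P, P, gamma) = P / (1 + gamma)
    return P / (sigma2 + beta * P / (1.0 + gamma))
```

gamma_eq solves γ = implicit_rhs(γ) for any load and noise level, and reduces to the single-user value P/σ² as β → 0.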

Let us consider user k in the case of a joint iterative decoder. During the iterations, the effective interference power of the other users will decrease according to

P_l^{(m)} \rightarrow v_m^2\, P_l \qquad (6.88)

at iteration m. Now note that the per-user MMSE filter (6.84) does not depend on P_k other than via a trivial scaling factor, and depends only on the common equal power levels of (6.88). We can therefore apply (6.87) for the signal-to-noise ratio, which we simply need to adjust to the power P_k, i.e.


v_m^2\, \gamma_k^{(m)} = \frac{v_m^2 P(1-\beta) - \sigma^2}{2\sigma^2} + \sqrt{\frac{v_m^4 P^2(\beta-1)^2}{4\sigma^4} + \frac{v_m^2 P(\beta+1)}{2\sigma^2} + \frac{1}{4}}. \qquad (6.89)

Assuming that all users use the same coding system, let us study the iteration behavior of a per-user MMSE filtered cancellation receiver analogously to the


simple matched filter cancellation receiver. Normalizing P_k = 1, the residual variance v_IC = 1/γ_k is plotted in Figure 6.18 for a load of β = 2 and different signal-to-noise ratios.

[Figure: residual MU variance v_IC (0–2.5) vs. soft-bit variance v_d (0–1); curves for P/N0 = 0 dB through P/N0 → ∞.]

Fig. 6.18. Variance transfer curves for matched filter (simple) cancellation and per-user MMSE filter cancellation (dashed lines) for β = 2 and Es/N0 = 0 dB and Es/N0 → ∞.

What becomes evident in Figure 6.18 is substantiated by the following theorem, namely that the variance transfer curve of the per-user MMSE filter is nearly linear, and thus affords us a tool to compare its performance directly with that of simple matched filter cancellation.

Theorem 6.5. The variance transfer function (6.87) of the per-user MMSE filter canceler with P → v_d P is a linear function of v_d with slope β − 1 as P/σ^2 → ∞.

Proof. The slope is given by

\lim_{P/\sigma^2 \to \infty} \frac{v_{IC}}{v_d} = \lim_{P/\sigma^2 \to \infty} \frac{1}{\gamma_k\, v_d} = \lim_{P/\sigma^2 \to \infty} \frac{2}{\dfrac{\beta+1}{\beta-1} + \dfrac{\sigma^2}{2 v_d (\beta-1) P} - 1} = \beta - 1, \qquad (6.90)


where we have used (6.89) and a number of detailed but simple algebraic steps.

Theorem 6.5 has the following corollary as an immediate consequence:

Corollary 6.1. If matched filter cancellation can support a load of β, the use of per-user MMSE filters instead of the matched filters allows a load of β + 1 for high signal-to-noise ratios.

Proof. Choose the signal-to-noise ratio large enough such that the variance transfer function of the per-user MMSE filtered canceler is linear in v_d with slope β − 1. The limiting load is found where the canceler VT curve and that of the error control code intersect. If this happens for a slope β = β_IC for matched filter cancellation, the load for per-user MMSE filtering can be increased until β − 1 = β_IC.
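The near-linearity claimed by Theorem 6.5 is easy to observe numerically. Under our reading of (6.89) — the residual variance of the per-user MMSE loop filter is v_IC = v_d / γ(v_d P), with γ(·) the equal-power SIR of (6.87) — the slope indeed approaches β − 1 at high signal-to-noise ratio:

```python
import math

def gamma_eq(P, sigma2, beta):
    # closed-form equal-power MMSE SIR, eq. (6.87)
    return (P * (1.0 - beta) - sigma2) / (2.0 * sigma2) + math.sqrt(
        P * P * (beta - 1.0) ** 2 / (4.0 * sigma2 ** 2)
        + P * (beta + 1.0) / (2.0 * sigma2) + 0.25)

def v_ic(v_d, P, sigma2, beta):
    # residual variance when the interferers' effective power is v_d * P
    # (our interpretation of the power substitution in (6.89))
    return v_d / gamma_eq(v_d * P, sigma2, beta)

slopes = [v_ic(v, 1.0, 1e-8, 2.0) / v for v in (0.1, 0.5, 0.9)]
```

All three ratios are ≈ β − 1 = 1; the VT curve is a straight line through the origin in the high-SNR limit.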

6.4.2 Low-Complexity Iterative Loop Filters

While a highly specialized loop filter like the MMSE filter can produce a marked advantage for a certain range of channel loads, its implementation is significantly more complex than that of a matched filter due to the inverse in (6.84). However, if we replace the inverse

K_k^{-1} = \left(S_k W_k S_k^* + \sigma^2 I\right)^{-1} \qquad (6.91)

in (6.84) by the first-order stationary multi-stage approximation

K_k^{-1} \approx \left(I - \tau K_k\right)^{s} + \sum_{p=0}^{s-1} \tau \left(I - \tau K_k\right)^{p} = \varphi_s(K_k), \qquad (6.92)

according to the principles explored in Chapter 5, the resulting system complexity is much reduced, depending only on the number of stages s.

As in the previous section, the reader may interpret (6.92) as a symbol-wise equation for a synchronous system where we have omitted the symbol index j, or as a full asynchronous equation using the appropriate extensions of the matrices and vectors. The point is that the key conclusions are the same, and that the implementation of (6.92) via multistage filtering methods also does not depend on whether the system is synchronous or not.

We will show now that the number of stages s does not depend on L or K, and that very few stages recover almost all of the difference between a matched-filter iterative decoder and one using per-user MMSE filters.

Instead of (6.84), the loop filter is now

w_k = \varphi_s(K_k)\, s_k, \qquad (6.93)


where we have dropped the normalization factor in (6.84), since its variance goes to zero for large systems with K, L → ∞.

The output of the multi-stage filter contains (desired) signal components and noise components; more precisely,

w_k^* r_k = s_k^*\, \varphi_s(K_k) \left(\sqrt{P_k}\, d_k\, s_k + \sum_{\substack{j=1\\ j \neq k}} \sqrt{P_j}\, \big(d_j - \tilde d_j\big)\, s_j + z_k\right). \qquad (6.94)

Under the assumption that (d_j − \tilde d_j) has zero mean, the interference in (6.94) is also zero mean, and the filtered signal has a signal-to-noise ratio of

\gamma_k = \frac{SP_k}{IP_k} = \frac{\left(s_k^*\, \varphi_s(K_k)\, s_k\right)^2}{s_k^*\, \varphi_s(K_k)\, K_k\, \varphi_s(K_k)\, s_k}, \qquad (6.95)

where we have simply averaged over the noise z. For large systems, as K, L → ∞, the variances of both SP_k and IP_k vanish, and (6.95) becomes a fixed system signal-to-noise ratio, independent of any specific time and k.

From random matrix theory, Section 3.4.2 and [128], it is known that the random variable s_k^* (S_k S_k^*)^p s_k converges in probability to the deterministic value \int_0^\infty \lambda^p f_\lambda(\lambda)\, d\lambda, where λ is an eigenvalue of S_k S_k^* and f_\lambda(\lambda) is the distribution of these eigenvalues. Since both K_k and \varphi_s(K_k) are polynomial expressions of the random matrix S_k S_k^*, these random matrix theory results can be used to compute the average signal-to-noise ratio. In particular, since the eigenvalues of a polynomial function of a matrix are the polynomial functions of the original eigenvalues of that matrix, we obtain

SP_k \longrightarrow \left(\int_0^\infty \varphi_s(\lambda)\, f_\lambda(\lambda)\, d\lambda\right)^{\!2}, \qquad (6.96)

where \varphi_s(\lambda) is the corresponding eigenvalue of \varphi_s(K_k). The convergence in (6.96) is in probability.

Similarly, we can compute

IP_k \longrightarrow \int_0^\infty \varphi_s^2(\lambda)\, \kappa(\lambda)\, f_\lambda(\lambda)\, d\lambda, \qquad (6.97)

where \kappa(\lambda) = \lambda + \sigma^2 is an eigenvalue of K_k.

Equations (6.96) and (6.97) can now be used to find multiuser front-end VT curves by numerical integration. Figure 6.19 below shows such VT curves for a number of stages and two values of the signal-to-noise ratio. It is remarkable that one or two stages are almost completely sufficient to achieve the VT characteristics of the MMSE loop filter over a wide range of input soft-bit variance v_d and channel signal-to-noise ratios.
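The convergence of the quadratic forms s_k^*(S_kS_k^*)^p s_k to deterministic eigenvalue moments is easy to observe. For random spreading with unit-norm columns the first two moments are β and β² + β (standard Marchenko–Pastur moments; the matrix sizes below are an arbitrary demo choice of ours):

```python
import numpy as np

rng = np.random.default_rng(3)
L, K = 1000, 2000
beta = K / L
S = rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)   # random chips, unit-norm columns
A = S @ S.T
s = rng.choice([-1.0, 1.0], size=L) / np.sqrt(L)        # the user's own sequence

q1 = s @ (A @ s)           # -> first eigenvalue moment  E[lambda]   = beta
q2 = (A @ s) @ (A @ s)     # -> second eigenvalue moment E[lambda^2] = beta^2 + beta
```

Both quadratic forms concentrate around their deterministic limits as L grows, which is exactly the self-averaging that turns (6.95) into the fixed-point integrals (6.96) and (6.97).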


[Figure: residual MU variance v_IC (0–2.5) vs. soft-bit variance v_d (0–1); per-user MMSE filter, matched filter interference cancellation, and multi-stage loop filters with 1, 2, 4, 5, and 10 stages, at P/σ² = 3 dB and P/σ² = 23 dB.]

Fig. 6.19. Variance transfer curves for various multi-stage loop filters for β = 2 and two values of the signal-to-noise ratio: P/σ² = 3 dB and P/σ² = 23 dB.

We use the following numerical example to illustrate the accuracy of the analysis as well as the power of the multi-stage loop filters. The test example is made up of a convolutional rate R = 1/3 four-state code with G(D) = [1 + D^2, 1 + D + D^2, 1 + D + D^2] and a two-stage first-order stationary loop filter with stationarity parameter

\tau = \frac{2}{2\sigma^2 + \left(\sqrt{\frac{K-1}{L}} + 1\right)^{\!2}}

to guarantee convergence.

Figure 6.20 shows the matching between the two-stage loop filter canceler and the 4-state convolutional code used as FEC. At a signal-to-noise ratio of Eb/N0 = 4.5 dB, both the two-stage loop filter and a full MMSE loop filter for the load of β = 3 converge, while a matched filter canceler would no longer achieve a low error rate, since its VT curve intersects with that of the FEC code at high values of v_d, that is, before the turbo cliff.

Figure 6.21 shows bit error rates for this example system, with the typical turbo cliff occurring as soon as the convergence channel opens at around 4.5 dB. The number of users is K = 45 and the spreading gain is L = 15, creating a system load of β = 3. Note that a full MMSE canceler would gain only about 0.3 dB in signal-to-noise ratio over the two-stage loop filter – hardly worth the additional complexity.


[Figure: residual MU variance v_IC vs. soft-bit variance v_d; K = 45, L = 15, n = 5000.]

Fig. 6.20. Variance transfer chart for an iterative decoder using convolutional error control codes and a two-stage loop filter, showing an open channel at Eb/N0 = 4.5 dB.

6.4.3 Examples and Comparisons

Let us explore the potential of these iterative cancellation schemes. Matched filter cancellation has the lowest complexity, especially in conjunction with a simple error control decoder such as a repetition code, whose APP decoder is rather trivial. Per-user MMSE filtering, on the other hand, provides somewhat better performance in the context of iterative cancellation decoding, but its complexity is large. Other cancellation methods will lie in between these two extremes. A comparison in terms of performance is therefore a valuable tool. We first consider the ultimate capacity limits as computed in Chapter 4. Figure 6.22 shows the capacities of optimal decoding and linear MMSE filtering (without iterations) for three different system loads β = 2, 1, and β = 0.5. From these figures we can deduce the following proposition: the use of complex joint detection methods, such as the iterative schemes discussed in this section, is efficient only for large system loads β > 1. For smaller loads, linear filtering followed by FEC decoding is nearly as efficient, and would thus constitute a simpler system alternative, especially if iterative filter implementations are used.

Let us then focus on large system loads for the application of iterative cancellation receivers. As Lemma 6.1 asserts, the advantage of per-user MMSE filtering diminishes as the system load increases. In Figure 6.23 we plot the


[Figure: bit error rate (1 down to 10^-5) vs. Eb/N0 (2.5–5 dB); single-user reference curve shown.]

Fig. 6.21. Bit error rate performance of an iterative canceler with a two-stage loop filter for 1, 10, 20, and 30 iterations, compared to the performance of an MMSE loop filter.

[Figure: three panels for loads β = 2, β = 1, and β = 0.5; capacity in bits/dimension (0.1–10) vs. Eb/N0; curves for orthogonal, optimal, and MMSE processing.]

Fig. 6.22. Optimal and linear preprocessing capacities for various system loads for random CDMA signaling.


performance of repetition codes versus different capacity curves of random CDMA. As the rate of the repetition code is decreased, the number of simultaneous transmissions that can be supported increases, i.e. the system load increases. We observe two noteworthy effects. Firstly, the overall efficiency of the system monotonically increases as the code rate decreases, due to a higher supportable system load. The different rate curves are drawn for repetition codes of rate 1/2 to 1/12 using loop matched filters. The range of performances achieved by a per-user MMSE filter lies between the thin dashed lines, which delimit the performance of rate-1/2 codes at the lower end and rate-1/12 codes at the upper end. A rate-1/2 system with MMSE filtering performs about as well as a rate-1/4 system with loop matched filters, while for rate 1/12 there is virtually no difference. This is in agreement with Theorem 6.5, and confirms that, for large system loads, the advantage of per-user MMSE filtering vanishes and the performance of simple cancellation becomes practically identical to that of the more complex MMSE filtering system.

The capacity curves of optimal detection and the two preprocessing filters are drawn for rate R = 1/3 coding, which serves as our baseline rate. The system load is adjusted such that the respective capacities are maximized. Compare this to Figure 4.11, where the system load β is kept constant.

The figure shows performance points for simple cancellation, that is, limiting system loads where the VT curves intersect. The dashed lines show the performance limits of per-user MMSE filtering for both the rate R = 1/2 and the rate R = 1/12 repetition coded systems, illustrating the vanishing advantage of the MMSE filter. At the left ends of the performance curves, a bit error rate of 10^{-4} is reached. For lower signal-to-noise ratios the system does not act as a decoder anymore, but as the non-linear filter discussed in Section 6.3.3. The performance curves are virtually identical to the non-linear filtering capacities shown in Figure 6.17. Note that at the closest point these simple cancellation schemes achieve about 70% of the equal-power joint processing Shannon capacity of the system.

6.5 Asymmetric Operating Conditions

What we have considered thus far are symmetric operating conditions, such as those aspired to in conventional CDMA systems with correlation reception [21, 135, 161]. The conditions are symmetric in that the received power levels are kept constant to within a small tolerance using open- and closed-loop power control mechanisms. The systems are also symmetric in the sense that all users transmit at equal rates and use the same error control coding. Guaranteeing power control, in particular, might be unnecessarily complex, and multiuser decoders typically benefit from asymmetric operating conditions, as we will see below. This is in marked contrast to the linear joint detectors, which achieve the largest power and bandwidth efficiency at symmetric operating conditions – see Section 4.5. We treat the unequal power case first, and


[Figure: capacity in bits/dimension (0.1–10) vs. Eb/N0 (−2 to 16 dB).]

Fig. 6.23. Performance of low-rate repetition codes in high-load CDMA systems, compared to single-user layered capacities for matched and MMSE filter systems.

show that there exist optimal power distributions which achieve the Shannon limits of the channel.

6.5.1 Unequal Received Power Levels

Restricting attention to matched filters w_k[j] = s_k[j] in the loop, the residual interference and noise signal after matched filtering is generalized from (6.50) to

\eta_k[j] = \sum_{i=1}^{n} \sum_{\substack{l=1\\ l \neq k}}^{K} \sqrt{P_l}\, \big(d_l[i] - \tilde d_l[i]\big)\, \big(s_k^*[j]\, s_l[i]\big) + z_k[j], \qquad (6.98)

where η_k[j] is the combined noise and interference sample at the output of user k at time j. The different users' powers are now assumed variable and unequal. Assuming unbiased estimates of the i.i.d. distributed coded symbols d_k[j] allows us to calculate the mean and variance of η_k[j] as



\mathrm{E}\!\left[\eta_k[j]\right] = 0,

\mathrm{E}\!\left[\eta_k^2[j]\right] = \sum_{i=1}^{n} \sum_{\substack{l=1\\ l \neq k}}^{K} P_l\, \mathrm{E}\big[d_l[i] - \tilde d_l[i]\big]^2\, \mathrm{E}\big[s_k^*[j]\, s_l[i]\big]^2 + \sigma^2. \qquad (6.99)

For random spreading, where each chip of each sequence at each time is chosen independently and with equal probability, it follows, even in the asynchronous case, that \mathrm{E}\big[s_k^*[j]\, s_l[i]\big]^2 = 1/L, and


2 = 1/L and

\sigma_k^2 = \mathrm{E}\!\left[\eta_k^2[j]\right] = \frac{1}{L} \sum_{\substack{l=1\\ l \neq k}}^{K} P_l\, \mathrm{E}\big[d_l[j] - \tilde d_l[j]\big]^2 + \sigma^2, \qquad (6.100)

which is independent of the time index j; we can therefore drop it in equation (6.100). For equal powers this equation obviously reduces to (6.53).
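The key statistical fact behind (6.100) — that random spreading gives E[(s_k^* s_l)^2] = 1/L regardless of timing — is easily confirmed by simulation (a small stand-alone check of our own):

```python
import numpy as np

rng = np.random.default_rng(4)
L, trials = 32, 50_000
# pairs of independent random +-1/sqrt(L) spreading sequences
sk = rng.choice([-1.0, 1.0], size=(trials, L)) / np.sqrt(L)
sl = rng.choice([-1.0, 1.0], size=(trials, L)) / np.sqrt(L)
rho2 = float(np.mean(np.sum(sk * sl, axis=1) ** 2))   # estimates E[(s_k^* s_l)^2]
```

The estimate lands on 1/L = 1/32 to within Monte Carlo accuracy; the cross terms of the squared correlation average to zero because the chips are i.i.d. and zero mean.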

The output symbol estimation variance

\mathrm{E}\big[d_l - \tilde d_l\big]^2 = g\!\left(\frac{\sigma_l^2}{P_l}\right) \qquad (6.101)

is a function of the code VT characteristic and the power of user l. We have seen that non-pathological VT functions g(x) are monotonically increasing with range [0, 1].

With iterations, the residual noise and interference variance \sigma_k^2 will change from iteration to iteration. If we denote this variance for user k at iteration v by \sigma_{k,v}^2, then we can formulate the following recursive formula:


\sigma_{k,v}^2 = \frac{1}{L} \sum_{\substack{l=1\\ l \neq k}}^{K} P_l\, g\!\left(\frac{\sigma_{l,v-1}^2}{P_l}\right) + \sigma^2. \qquad (6.102)

Fortunately, the sequence \sigma_{k,v}^2, v = 0, 1, 2, \ldots, is monotonically decreasing, which we show as follows. Formally we may set \sigma_{k,0}^2 = \infty, and assume that \sigma_{k,0}^2 > \sigma_{k,1}^2 \geq \cdots \geq \sigma_{k,v-1}^2. Using the monotonicity property of the function g(x), we now show that


\sigma_{k,v}^2 = \frac{1}{L} \sum_{\substack{l=1\\ l \neq k}}^{K} P_l\, g\!\left(\frac{\sigma_{l,v-1}^2}{P_l}\right) + \sigma^2 \;\leq\; \frac{1}{L} \sum_{\substack{l=1\\ l \neq k}}^{K} P_l\, g\!\left(\frac{\sigma_{l,v-2}^2}{P_l}\right) + \sigma^2 = \sigma_{k,v-1}^2, \qquad (6.103)

which shows, by standard induction, that the sequence \sigma_{k,v}^2 is monotonically decreasing for all k, and, since \sigma_{k,v}^2 \geq 0, the sequence must converge to a limit which satisfies



    σ²_{k,∞} = (1/L) Σ_{l=1}^{K} P_l g(σ²_{l,∞} / P_l) − (P_k/L) g(σ²_{k,∞} / P_k) + σ².    (6.104)

This is good news, since it tells us that the iterations improve the performance monotonically for all users, and that there exists a fixed point towards which these iterations converge.

Furthermore, in cases where the second term in (6.104) can be neglected, i.e. if K and L are large, the residual interference in fact becomes independent of the user considered. Note that this does not mean the signal-to-noise ratio becomes user independent; that is the case only in equal power systems. To formalize this situation, we impose what is known as the Lindeberg condition of asymptotic negligibility on the powers of the users, given as

    P_k / Σ_{l=1, l≠k}^{K} P_l  →  0  as K → ∞;  ∀k.    (6.105)

Given (6.105), the v-th iteration residual interference and noise variance (6.102), as well as the limit variance (6.104) of the iterative process, become independent of the user index and approach for all users the value

    σ²_v = (1/L) Σ_{l=1}^{K} P_l g(σ²_{v−1} / P_l) + σ².    (6.106)

This decisive observation is quite independent of which loop filtering method is used. It enables us to describe the behavior of iterative decoding systems in the case of unequal powers by a single-parameter dynamical system, as we have done in the equal power case. Before we look at specific cases, we will study some limiting optimal asymmetric operating conditions.
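The recursion (6.106) can be iterated directly. The sketch below uses a hypothetical smooth VT function g (a placeholder, not one of the codes analyzed in this chapter) and illustrative powers, and exhibits the monotone decrease of σ²_v toward its fixed point:

```python
def residual_variance_trajectory(powers, g, sigma2, L, n_iter=50):
    """Iterate the recursion (6.106):
    sigma_v^2 = (1/L) * sum_l P_l * g(sigma_{v-1}^2 / P_l) + sigma^2,
    starting from sigma_0^2 = infinity, i.e. g(.) = 1 for every user."""
    var = sum(powers) / L + sigma2        # first step from sigma_0^2 = infinity
    trajectory = [var]
    for _ in range(n_iter):
        var = sum(P * g(var / P) for P in powers) / L + sigma2
        trajectory.append(var)
    return trajectory

# Hypothetical monotonically increasing VT function with range [0, 1).
g = lambda x: x / (1.0 + x)

powers = [1.0, 2.0, 4.0] * 10             # K = 30 users in three power groups
traj = residual_variance_trajectory(powers, g, sigma2=0.1, L=64)

# Monotone decrease toward the fixed point characterized by (6.104):
assert all(a >= b for a, b in zip(traj, traj[1:]))
```

Because g is monotonically increasing, the trajectory decreases regardless of the power profile, mirroring the induction argument of (6.103).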

6.5.2 Optimal Power Profiles

We concentrate on FEC codes with an ideal VT characteristic, given by equation (6.54), i.e. by

    g(σ²_v / P_j) = 1 if σ²_v ≥ τP_j,    g(σ²_v / P_j) = 0 if σ²_v < τP_j.

Recall that 1/τ is the threshold signal-to-noise ratio of a powerful turbo or LDPC code (or any other code with such a threshold characteristic). Now assume that we have separated the K users into J groups of powers P_1, …, P_J, and P_j is the received power of the users in group j. Also, let the powers be ordered as P_1 ≤ P_2 ≤ ··· ≤ P_J, even though this is irrelevant in practice due to the parallel nature of the decoder.

Furthermore define β = Σ_{j=1}^{J} K_j/L = Σ_j β_j as the sum of the partial loads β_j of the j-th group. Due to (6.54), a given user group will completely decode before the next lower-power group can be decoded, and we have effectively a group-serial decoder even though the process is run as a parallel algorithm. Assuming then that K_j ≫ 1, in fact ensuring the non-dominance condition, and beginning with j = 1, we observe that P_1 must obey the following condition for successful decoding of the last user group #1, after all other J − 1 groups have been decoded and successfully canceled:

    τP_1 ≥ β_1 P_1 + σ²    (6.107)

and

    τP_j ≥ Σ_{l=1}^{j} β_l P_l + σ².    (6.108)

From the above we recursively find

    P_j ≥ (Σ_{l=1}^{j−1} β_l P_l + σ²) / (τ − β_j);    1 ≤ j ≤ J.    (6.109)

The minimum powers are achieved in (6.109) when the inequalities are met with equality, in which case the power levels are given by

    P_j = σ² τ^{j−1} Π_{l=1}^{j} 1/(τ − β_l);    1 ≤ j ≤ J.    (6.110)

We now consider optimization of the partial loads β_j in order to maximize the spectral efficiency, or equivalently, minimize the average power per user (assuming that all users use identical code rates); that is, we want to minimize P̄ given by

    P̄ = min_{β_j} (Σ_{j=1}^{J} P_j β_j) / (Σ_{j=1}^{J} β_j).    (6.111)

The solution to (6.111) is given by the following

Lemma 6.4. The average power in (6.111) is minimized by the uniform load distribution β_j = β/J, ∀j.

Proof. We wish to minimize the average power

    P̄ = (σ²/β) Σ_{i=1}^{J} β_i τ^{i−1} / Π_{l=1}^{i} (τ − β_l)

given the constraint Σ_{i=1}^{J} β_i = β. Letting x_i = β_i/τ for i = 1, …, J, we minimize

    P̄_min = min_{x_1,…,x_J} (σ²/β) Σ_{i=1}^{J} x_i / Π_{l=1}^{i} (1 − x_l)    given    Σ_{i=1}^{J} x_i = β/τ.

Note that

    Σ_{i=1}^{J} x_i / Π_{l=1}^{i} (1 − x_l)
      = Σ_{i=1}^{J} (x_i − 1) / Π_{l=1}^{i} (1 − x_l) + Σ_{i=1}^{J} 1 / Π_{l=1}^{i} (1 − x_l)
      = Σ_{i=1}^{J} 1 / Π_{l=1}^{i} (1 − x_l) − 1 − Σ_{i=2}^{J} 1 / Π_{l=1}^{i−1} (1 − x_l)
      = 1 / Π_{l=1}^{J} (1 − x_l) − 1
      = exp{ −J Σ_{l=1}^{J} (1/J) ln(1 − x_l) } − 1
      ≥ exp{ −J ln(1 − (1/J) Σ_{l=1}^{J} x_l) } − 1
      = exp{ −J ln(1 − β/(Jτ)) } − 1 = 1/(1 − β/(Jτ))^J − 1,

where we have used the standard inequality E[ln ξ] ≤ ln(E[ξ]) (Jensen's inequality), and equality is achieved for β_1 = … = β_J. Hence the minimum average power is attained by the uniform load distribution β_j = β/J.

Lemma 6.4 shows that equal-sized groups are in fact optimal, and under these conditions, the group powers (6.110) are given by

    P_j = σ²/(τ − β/J) · (1 − β/(τJ))^{1−j} = (σ²/τ) (1 − β/(τJ))^{−j};    1 ≤ j ≤ J,    (6.112)

that is, the powers follow an exponential distribution with optimal average power (6.111) given by

    P̄_opt = Σ_{j=1}^{J} P_j β_j / β = (1/J) Σ_{j=1}^{J} P_j = (σ²/(τJ)) Σ_{j=1}^{J} (1 − β/(τJ))^{−j}
           = (σ²/β) [ (τ/(τ − β/J))^J − 1 ].    (6.113)

In (6.113), the optimal power is a function of σ², β and τ. The corresponding bit-normalized signal-to-noise ratio is then

    [E_b/N_0]^opt = P̄_opt/(2Rσ²) = (1/(2Rβ)) [ (τ/(τ − β/J))^J − 1 ],    (6.114)

where R is the code rate common to all the FEC codes.

Figure 6.24 shows the spectral efficiency achievable with rate R = 1/3 Shannon-type strong codes, for both matched filter cancellation (solid curves) and MMSE loop filter cancellation (dashed). Curves are shown for J = 1, 5, and 10 optimized power groups, using an ideal Shannon-capacity achieving code whose (iterative) detection threshold τ = σ²/E_s is related to its rate via the well-known Shannon bound [120]

    τ = 1/(2^{2R} − 1).    (6.115)

The following observations can be made. The achievable capacity is close to the Shannon limit for AWGN channels for all values of E_b/N_0. This is contrary to the situation in the equal power case, where we were unable to find capacities that grew with increasing E_b/N_0 – see Figures 6.17 and 6.23.

Fig. 6.24. CDMA spectral efficiencies achievable with iterative decoding with different power groups, assuming ideal FEC coding with rates R = 1/3, for simple cancellation as well as MMSE cancellation. (Sum capacity [bits/dimension] versus E_b/N_0 [dB]; curves for J = 1, 5, and 10.)

There is a caveat in such power-optimized distributions: the initial parallel iterative interference cancellation receiver is transformed into a successive canceler, in the sense that the different power groups are de facto decoded successively. However, the novelty of parallel iterative detection over classic serial cancellation is that cancellation does not need to be error free in order for the entire system to converge, as we have shown in general. On the contrary, if weak error control codes are used, convergence is achieved in a manner that one could call partial parallel cancellation, whereby convergence in any given group of users may not yet be complete while convergence in the next lower power groups has already begun.

The attainable spectral efficiency is evidently improved as the number of power groups is increased. The following lemma formalizes this observation:

Lemma 6.5. The sequence (1 − β/(τJ))^{−J} is monotonically decreasing with J and converges to e^{β/τ} in the range [J*, ∞), where J* denotes the smallest integer such that β/(J*τ) < 1.

Proof. Since

    (τ/(τ − β/J))^J = exp{ J ln [τ/(τ − β/J)] }

we consider the function

    f(J) = J ln [τ/(τ − β/J)].

Using the standard inequality ln x ≤ x − 1, we have

    f′(J) = ln [τ/(τ − β/J)] − β/(Jτ − β) ≤ 0,

from which the lemma follows.

In the limit as J → ∞ the required threshold E_b/N_0 obeys

    [E_b/N_0]^lim = lim_{J→∞} (1/(2Rβ)) [ (τ/(τ − β/J))^J − 1 ] = (e^{β/τ} − 1)/(2Rβ),    (6.116)

where we have used (1/(1 − x))^{1/x} → e as x → 0. The optimal signal-to-noise ratio follows approximately an exponential function of the total system load β. The minimum signal-to-noise ratio for CDMA can be obtained by letting β → 0 in (6.116), and equals 1/(2Rτ), which is the threshold of the FEC code in single-user channels.

Alternatively, for R → 0 we obtain the following

Theorem 6.6. Single-user Shannon-capacity achieving codes obeying (6.115) achieve the multiple-access channel Shannon bound as R → 0, given the optimal received power distribution of (6.110); i.e., their signal-to-noise ratio is given by

    [E_b/N_0]^lim = (2^{2C} − 1)/(2C);    C = Rβ.    (6.117)


Proof. Using (6.115) in (6.116) we obtain

    [E_b/N_0]^lim = (e^{β(2^{2R} − 1)} − 1)/(2Rβ).    (6.118)

Now let C = Rβ be the spectral efficiency of the system and apply the rule of l'Hôpital to the 0/0 limit in the exponent of the exponential to obtain the Shannon bound (6.117).

Equation (6.117) is identical to the Shannon limit for an additive white Gaussian noise channel [120]. It states that in order to approach the multiple-access channel's capacity limit, low-rate coding needs to be used. For a general code threshold τ_b < τ, it is easy to see that

    C = τ_b ln 2 · log₂(1 + 2C E_b/N_0),    (6.119)

that is, the AWGN Shannon limit is multiplied by the scaling factor 2τ_b ln 2 < 1. Codes that have a threshold τ_b = (1/2) log₂ e make the overall system efficiency approach that of an interference-free AWGN channel. This again necessitates R → 0. Furthermore, (6.119) demonstrates that, in conjunction with the optimal power distribution (6.112), matched filter cancellation is completely sufficient to achieve the capacity of the multiple-access channel.

The exponential power distribution is well known to achieve the capacity of a (Gaussian) multiple-access channel if successive cancellation is performed, where each user is completely decoded and its signal canceled from the received signal before the next lower-power user is decoded. That is, if β_j → 0 in (6.109), i.e. the load in each group is small, then

    P_j = (1/τ) ( Σ_{m=1}^{j−1} β_m P_m + σ² ),    (6.120)

that is, the threshold signal-to-noise ratio is given by

    1/τ = P_j / ( Σ_{m=1}^{j−1} β_m P_m + σ² ),    (6.121)

which implies that the code rate is bounded by

    R ≤ (1/2) log( 1 + P_j / ( Σ_{m=1}^{j−1} β_m P_m + σ² ) ),    (6.122)

which in turn equals the rate constraint for the Gaussian multiple-access channel if the signals of the users in groups j+1, …, J are known [24]. The optimal power assignment therefore adjusts the group powers such that (6.122) is an equality for all j. The decoder then progresses by following a sequence of corner points in the capacity polytope, as predicted by the successive cancellation principle. Such a power distribution was suggested by Viterbi for low-rate convolutional coding in CDMA systems in [153]. The process is illustrated by the three-dimensional example capacity polytope in Figure 6.25. Equal rates are achieved by adjusting the power such that the active corner point lies on the equal rate line.

Fig. 6.25. Capacity polytope illustrated for a three-dimensional multiple-access channel. User 2 is decoded first, then user 1, and finally user 3. (Axes: rates R_1, R_2, R_3.)

Theorem 6.6 seems at odds with the result in Chapter 3, which stated that the equal power distribution achieves capacity of the random CDMA channel. This situation is clarified by realizing that the maximum of the CDMA channel for large systems is not unique. In fact, for K, L → ∞, the exponential distribution (6.112), as well as all linear combinations of the permuted exponential distributions, also achieves the AWGN capacity. In practice, the number of power groups J does not have to be very large to achieve high spectral efficiencies (see Figure 6.24).
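To make these asymptotics concrete, the sketch below (with illustrative parameters, not values from the text) evaluates the optimal E_b/N_0 of (6.114) for a growing number of equal-load power groups and compares it with the J → ∞ limit (6.116):

```python
import math

def ebno_opt(R, beta, J):
    """Optimal Eb/N0 (6.114) with J equal-load power groups; tau from (6.115)."""
    tau = 1.0 / (2 ** (2 * R) - 1)
    return ((tau / (tau - beta / J)) ** J - 1) / (2 * R * beta)

def ebno_limit(R, beta):
    """J -> infinity limit (6.116)."""
    tau = 1.0 / (2 ** (2 * R) - 1)
    return (math.exp(beta / tau) - 1) / (2 * R * beta)

R, beta = 1/3, 1.5    # illustrative code rate and total system load
vals = [ebno_opt(R, beta, J) for J in (5, 10, 50, 500)]

# More groups always help (Lemma 6.5), approaching the limit from above:
assert all(a > b for a, b in zip(vals, vals[1:]))
assert vals[-1] > ebno_limit(R, beta)
```

Consistent with the text, even a modest number of groups already lands close to the limiting value.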

6.5.3 Unequal Rate Distributions

What can be achieved with different powers can also be achieved by using codes of different coding rates. This has the similar effect of moving the code VT curves, and causes users with low-rate codes to be able to decode first. We again assume J groups of users, who all transmit at the same symbol power level P. We assign rates R_1 > R_2 > ··· > R_J to the different groups, and assume Shannon-capacity achieving codes with a VT characteristic bounded by (6.54). Since the maximum tolerable noise variance for rate R_j codes equals τ_j = 1/(2^{2R_j} − 1), which is monotonically decreasing with R_j, lower-rate users converge before higher-rate users, as suggested above.

The power constraint equation for group j, in the case of matched-filter interference cancellation, is given by

    τ_j P ≥ Σ_{m=1}^{j} β_m P + σ²,    (6.123)

which implies the rate constraints

    R_j ≤ (1/2) log( 1 + 1 / ( Σ_{m=1}^{j} β_m + σ²/P ) ).    (6.124)

The variational problem to be solved is that of maximizing the sum of the rates R_j in (6.124) for a given system load, i.e. maximize

    g({β_j}, J, b) = (1/2) Σ_{j=1}^{J} β_j log( 1 + 1 / ( Σ_{m=1}^{j} β_m + σ²/P ) )    given that    Σ_{j=1}^{J} β_j = β    (6.125)

over all possible choices {β_j}.

The first lemma establishes that the sequence of partial loads β_j is monotonically increasing:

Lemma 6.6. The maximizing solution of (6.125) is a monotonically increasing sequence of partial loads, i.e. β_1 < … < β_J.

Proof. In (6.125) take any two adjacent partial loads, say β_j and β_{j+1}, and consider maximizing (6.125) with respect to β_j, β_{j+1}, constraining β_j + β_{j+1} = a and keeping all other {β_j} fixed. This is equivalent to considering the case of only two variable groups, lumping all other interference into B = σ²/P + Σ_{m=1}^{j−1} β_m, for which we have, with β_j = x,

    g(β_j, β_{j+1}) = g(x) = x log(1 + 1/(x + B)) + (a − x) log(1 + 1/(a + B)).

We have

    g′(x) = ln(1 + 1/(x + B)) − ln(1 + 1/(a + B)) − x/((x + B)(x + B + 1)),
    g″(x) < 0;  g′(0) > 0;  g′(a/2) < 0.

Then the maximum of g(x) is attained for some 0 < x < a/2, from which the lemma follows.


The next lemma gives upper and lower bounds on the achievable sum rate.

Lemma 6.7. Let β_0 = max_j β_j; then the following inequalities hold:

    (1/2) ∫_0^β log( 1 + 1/(u + β_0 + b) ) du  ≤  g({β_j}, J, b)  ≤  (1/2) ∫_0^β log( 1 + 1/(u + b) ) du,    (6.126)

where b = σ²/P. The latter integral can be evaluated as (β + b + 1) log(β + b + 1) − (β + b) log(β + b) − (b + 1) log(b + 1) + b log b.

Proof. Due to its length, the proof has been relegated to the end of the chapter, in Section 6.6.

We can draw a number of interesting conclusions from Lemma 6.7. First, as long as β_0 ≪ σ²/P, i.e. the maximum partial load is small with respect to the inverse SNR σ²/P, any load distribution gives the same maximum achievable sum rate.

A corollary to this is that in the case of a single group, the lower limit in (6.126) is minimized, i.e. the equal rate case is the worst case for cancellation using Shannon-capacity achieving codes.

The achievable capacity is given by

    C(β, b) = ((β + b + 1)/2) log(β + b + 1) − ((β + b)/2) log(β + b) − ((b + 1)/2) log(b + 1) + (b/2) log b
            = (1/2) log( 1 + β/(b + 1) ) + ((β + b)/2) log( 1 + 1/(β + b) ) − (b/2) log( 1 + 1/b ).

Expressing the bit energy to noise power ratio in terms of capacity, signal power, and noise variance, we obtain

    E_b/N_0 = β / (2b C(β, b));    b = σ²/P.    (6.127)
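The two forms of C(β, b) above can be cross-checked numerically, together with the corresponding E_b/N_0 of (6.127); here log denotes log₂ and the parameter values are arbitrary:

```python
import math

def C_expanded(beta, b):
    """First form of C(beta, b): difference of x*log2(x) terms."""
    f = lambda x: x * math.log2(x) if x > 0 else 0.0
    return 0.5 * (f(beta + b + 1) - f(beta + b) - f(b + 1) + f(b))

def C_compact(beta, b):
    """Second form of C(beta, b)."""
    return (0.5 * math.log2(1 + beta / (b + 1))
            + (beta + b) / 2 * math.log2(1 + 1 / (beta + b))
            - b / 2 * math.log2(1 + 1 / b))

beta, b = 4.0, 0.25
assert math.isclose(C_expanded(beta, b), C_compact(beta, b))

ebno = beta / (2 * b * C_expanded(beta, b))   # equation (6.127)
```

The identity holds for any β > 0 and b > 0, so either form can be used when plotting the spectral efficiency curves of Figure 6.26.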

If we consider large system loads where β → ∞, we necessarily have low total signal-to-noise ratios, that is, b → ∞. In this case

    C(β, b) = (1/2) log(1 + β/b) − (β log e)/(4b(β + b)) + O(1/b²),    b → ∞.    (6.128)

Furthermore, using

    β/b ≈ 2^{2C(β,b)} e^{β/(2b(β+b))} − 1    (6.129)

in (6.127) we finally obtain

    E_b/N_0 ≈ ((2^{2C} − 1)/(2C)) [ 1 + (2^{2C} − 1)/(2β) ],    β → ∞,    (6.130)

where we have used the abbreviation C = C(β, b). For large system loads β, (6.130) again assumes the familiar form of the Shannon bound, similar to (6.117), which we express in the following


Theorem 6.7. Shannon-type strong codes can achieve the AWGN Shannon bound as long as the number of user groups is sufficiently large such that the maximum partial load β_0 ≪ σ²/P. We then obtain

    [E_b/N_0]^lim = (2^{2C} − 1)/(2C);    β → ∞.    (6.131)

Figure 6.26 shows the resulting spectral efficiencies, i.e. equation (6.127), for different finite values of β. It becomes evident that fairly large system loads would be required to approach the Shannon bound, but that it is theoretically possible.

Fig. 6.26. CDMA spectral efficiencies achievable with iterative decoding with equal power groups, assuming ideal FEC coding with optimized rates according to (6.124). (Sum capacity [bits/dimension] versus E_b/N_0 [dB]; curves for β = 1, 5, 10, 20, and 50.)


6.5.4 Finite Numbers of Power Groups

The asymptotic situations considered in the previous sections show that the ideal, maximal capacity of the multiple-access channel can be achieved if a (relatively) large number of received power groups and different rate groups can be established. In general, we learn that differences in received power levels and/or rates are beneficial to the operation of the joint iterative decoder, which is the opposite of the situation for single-pass linear filter detectors.

In fact, in practice not very many different power levels would be needed to achieve good performance. Furthermore, the advantage of weak codes in terms of achievable spectral efficiency also quickly disappears as more power groups are allowed.

Let us first proceed by extending our graphical variance transfer method to the case of unequal received power levels. Recall equation (6.106) for the residual interference and noise variance of an arbitrary user, given by

    σ²_v = (1/L) Σ_{j=1}^{K} P_j g(σ²_{v−1} / P_j) + σ².    (6.106)

Expressing (6.106) in terms of power groups we find

    σ²_v = Σ_{j=1}^{J} (K_j P_j / L) g(σ²_{v−1} / P_j) + σ²,    (6.132)

where K_j is the number of users in group j. We need a notion of an average system load, since different users now weigh in with different received power levels. This we define as β_av = Σ_{l=1}^{J} β_l P_l. We now obtain

∑Jl=1 βlPl. We now obtain

σ2v = βav

J∑j=1

βjPj

βavg

(σ2

v−1

Kj

)+ σ2

= βavgav

(σ2

v−1

)+ σ2, (6.133)

where g_av(σ²_{v−1}) is an "average" variance transfer function. It is obtained by weighing the individual code VT functions with the weight factors β_j P_j/β_av composed of their loads and powers. Equation (6.133) is linear in the average load, and can be visualized and plotted analogously to the equal power case, where the FEC VT function of the code in the latter corresponds to the composite FEC VT function g_av(·) in the case of unequal power levels.

Figure 6.27 presents this combined VT dynamical equation (6.133) for two types of FEC codes and three power groups. The "strong" code systems all use the serial concatenated code 2 from Table 6.1 discussed in Section 6.3.2, and the "weak" code is a 4-state convolutional code. The power levels are P_1 = P, P_2 = 2P, P_3 = 4P. The solid VT curve presents the average VT curve; the individual VT curves are dashed and reflect the different power levels. It is noteworthy that the average VT curve is fairly "diagonal" for both cases, and matches well with the cancellation VT curve. Consequently, there is little difference in the supportable loads of both systems. The exact numbers depend on the error criterion chosen. The weak-code system supports about 1.3 bits/dim at around 8.5 dB, which is quite close to the theoretical maximum. The convergence trajectories are single simulations illustrating the analysis technique.

Fig. 6.27. Illustration of different power levels and average VT characteristics, shown for both a serial turbo code (left panel, "Strong Codes") and a convolutional code (right panel, "Weak Codes"). The system parameters are K_1 = 22, K_2 = 18, K_3 = 16 for the SCC system at E_b/N_0 = 13.45 dB, and K_1 = K_2 = K_3 = 20 at E_b/N_0 = 8.88 dB.

In order to find a numerical technique to optimize the power levels, we now assume that there are J different power levels, each with a partial load β_j, as defined in Section 6.5.2; however, this time the number of levels J is arbitrary, and simply determines the complexity of the numerical method and its accuracy. The key observation, first made by Caire et al. [16], is that optimizing the partial loads will solve the problem and turn the optimization into a well-known linear programming problem. Note that, given enough levels J, optimizing the partial loads is synonymous with optimizing a power distribution.

Let us start with the residual recursive interference equation (6.132),

    σ²_v = Σ_{j=1}^{J} β_j P_j g(σ²_{v−1} / P_j) + σ² = f(P, β, x = σ²_{v−1}),

where P = (P_1, …, P_J) and β = (β_1, …, β_J). The condition that the VT curves of the FEC decoders and the multiuser interference cancelers do not intersect can be reformulated as

    f(P, β, x) < x;    x ∈ [σ²_min, ∞),    (6.134)

where the lower limit σ²_min is an arbitrary limit dictated by some minimal performance criterion, such as the error probability of the lowest-power user group. Given their power P_1 and the error control codes used, we can calculate the maximum tolerable error variance σ²_min, and (6.134) ensures that no intersection point exists for larger residual interference variances, therefore enforcing this minimal performance criterion.

With these preliminaries, we now have the following optimization problem:

    minimize Σ_{j=1}^{J} β_j P_j    subject to:
        f(P, β, x) ≤ x − ε;    x ∈ [σ²_min, ∞)
        Σ_{j=1}^{J} β_j = β
        β_j ≥ 0.    (6.135)

The optimization criterion in (6.135) minimizes the average E_b/N_0, see (6.114), which is equivalent to optimizing the system load for a given average E_b/N_0. This seems a fair optimization criterion and, above all, leads to (6.135), which is a linear optimization problem for which excellent numerical techniques exist.

The parameter ε controls the width of the "convergence tunnel" in that it ensures a minimal opening through which the iterations have to proceed; see Figure 6.27. A wider channel opening allows for faster convergence, but at the cost of a decreased load of the system. The optimization strategy is very versatile: different power groups can use different error control codes, i.e. different VT transfer functions in (6.132). Furthermore, in [16] it is shown that similar linear optimization strategies can also be formulated for the more complex receivers using MMSE filtering at the multiuser stage, or even a posteriori probability (fully complex) estimation.
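The reason (6.135) is a linear program is that, for each fixed x, the condition f(P, β, x) ≤ x − ε is linear in the loads β_j. The sketch below (a discretized interpretation under assumed parameters, with a hypothetical VT function g and candidate power levels) builds the constraint matrix on a grid of x values:

```python
import numpy as np

def lp_constraints(powers, g, sigma2, eps, x_grid):
    """Discretize the convergence condition in (6.135): at each grid point x,
    f(P, beta, x) = sum_j beta_j * P_j * g(x / P_j) + sigma^2 <= x - eps
    is a linear inequality in the loads beta_j.  Returns (A_ub, b_ub) so that
    feasibility reads 'A_ub @ beta <= b_ub', ready for any LP solver."""
    A_ub = np.array([[P * g(x / P) for P in powers] for x in x_grid])
    b_ub = np.array([x - eps - sigma2 for x in x_grid])
    return A_ub, b_ub

# Hypothetical smooth VT function and candidate power levels.
g = lambda x: x / (1.0 + x)
powers = [1.0, 2.0, 4.0]
sigma2_min = 0.2                              # minimal-performance limit
x_grid = np.linspace(sigma2_min, 5.0, 200)    # finite stand-in for [sigma2_min, inf)
A_ub, b_ub = lp_constraints(powers, g, sigma2=0.01, eps=0.01, x_grid=x_grid)

# Feasibility check of one candidate load vector (all load on the lowest power):
beta = np.array([1.0, 0.0, 0.0])
assert np.all(A_ub @ beta <= b_ub)
```

Minimizing Σ_j β_j P_j subject to these rows, the equality Σ_j β_j = β, and β_j ≥ 0 can then be handed to a standard LP solver such as scipy.optimize.linprog.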

6.6 Proof of Lemma 6.7

We consider the following variational problem:

    min_{β_1,…,β_J} Σ_{j=1}^{J} β_j    given    g({β_j}, J, b) = Σ_{j=1}^{J} β_j ln( 1 + 1 / ( Σ_{i=1}^{j} β_i + b ) ) = C,    (6.136)

where b = σ²/P.

The problem (6.136) is equivalent to

    max_{β_1,…,β_J} Σ_{j=1}^{J} β_j ln( 1 + 1 / ( Σ_{i=1}^{j} β_i + b ) )    provided    Σ_{j=1}^{J} β_j = A.    (6.137)


We first note that the solution to both problems, (6.136) and (6.137), is strictly monotonically increasing, i.e. β_1 < … < β_J.

This is shown as follows: in (6.137) take any two adjacent variables, say β_j and β_{j+1}, and consider the maximization with respect to β_j, β_{j+1}, assuming that β_j + β_{j+1} = A and all other {β_i} remain fixed. Then, letting β_j = β, it is sufficient to consider the maximization over β of

    g(β_j, β_{j+1}) = g(β) = β ln(1 + 1/(β + B)) + (A − β) ln(1 + 1/(A + B)),

where B is constant. We observe that

    g′(β) = ln(1 + 1/(β + B)) − ln(1 + 1/(A + B)) − β/((β + B)(β + B + 1));
    g″(β) < 0,  g′(0) > 0,  g′(A/2) < 0.

Then the maximum of g(β) is attained for some 0 < β < A/2, from which the statement follows.

Now we obtain lower and upper bounds for g({β_j}, J, b) that work well for large J and well-behaved {β_j}.

The following statement holds: with β_0 = max_j β_j, the inequalities

    ∫_0^{u_1} ln( 1 + 1/(u + β_0 + b) ) du  ≤  g({β_j}, J, b)  ≤  ∫_0^{u_1} ln( 1 + 1/(u + b) ) du    (6.138)

hold, where u_1 = Σ_{j=1}^{J} β_j.

This part is shown as follows: for a non-decreasing function β(x), x ≥ 0, with β(0) ≥ 0, define the functional

    g(β(x), J, b) = ∫_0^J β(z) ln( 1 + 1 / ( ∫_0^z β(u) du + b ) ) dz.

Also, for a non-negative sequence β_1, …, β_J, let β(x), 0 ≤ x ≤ J, be the step-wise function such that β(x) = β_j for j − 1 < x ≤ j, j = 1, …, J. Then for j = 1, …, J we have

    Σ_{i=1}^{j} β_i = ∫_0^j β(x) dx,

and

    β_j ln( 1 + 1 / ( Σ_{i=1}^{j} β_i + b ) ) = ∫_{j−1}^{j} β(x) ln( 1 + 1 / ( ∫_0^j β(u) du + b ) ) dx
      ≤ ∫_{j−1}^{j} β(x) ln( 1 + 1 / ( ∫_0^x β(u) du + b ) ) dx,

and therefore

    g({β_j}, J, b) ≤ ∫_0^J β(x) ln( 1 + 1 / ( ∫_0^x β(u) du + b ) ) dx
      = ∫_0^J ln( 1 + 1 / ( ∫_0^x β(y) dy + b ) ) d( ∫_0^x β(y) dy )
      = ∫_0^{u_1} ln( 1 + 1/(u + b) ) du
      = (u_1 + b + 1) ln(u_1 + b + 1) − (u_1 + b) ln(u_1 + b) − (b + 1) ln(b + 1) + b ln b,

where

    u_1 = ∫_0^J β(y) dy = Σ_{j=1}^{J} β_j,

from which the right-hand side of inequalities (6.138) follows. On the other hand, we have

    β_j ln( 1 + 1 / ( Σ_{i=1}^{j} β_i + b ) ) ≥ ∫_{j−1}^{j} β(x) ln( 1 + 1 / ( ∫_0^x β(u) du + β_0 + b ) ) dx,

from which the left-hand side of inequalities (6.138) follows. For a given Σ_{j=1}^{J} β_j and large J, both bounds (6.138) are close to each other if β_0 ≪ b. Note that the function

    f(β) = β ln( 1 + 1/(β + B) );    B ≥ 0,

increases monotonically with β ≥ 0.

A Estimation and Detection

A.1 Bayesian Estimation and Detection

Throughout this book, we take the Bayesian approach to estimation and detection. The purpose of this appendix is to provide a short refresher on the subject. It is by no means an exhaustive introduction to estimation theory, and it is limited in scope to the basic concepts that are used frequently throughout the book. There are many textbook introductions to estimation and detection theory, for example [71, 72, 102, 114, 143].

When we speak of estimation in the context of random signals, we generally mean the process whereby we attempt to determine one or more unknown parameters x from an observation y. The Bayesian approach is to assign a known prior probability density (or mass function) f_X(x) to the parameters, which means they are considered as random variables. This is in contrast to the "classical" approach, which treats the parameters as deterministic, but unknown. The Bayesian approach is particularly relevant to communications theory, since Shannon's theory of information is firmly founded on the idea that unknown messages are represented as random variables.

The underlying probabilistic structure is therefore the joint probability density of the observation y ∈ 𝒴 and the parameter x ∈ 𝒳, namely

    f_{X,Y}(x, y) = f_X(x) f_{Y|X}(y | x).

The supports 𝒳 and 𝒴 may be discrete or continuous. Typically there may be a sequence y_1, y_2, …, y_n of samples, and this may be accommodated by allowing 𝒴 to be a vector space 𝒴ⁿ. Vector parameters can likewise be accommodated by allowing 𝒳 to be a vector space.

There are two main cases of interest, depending upon the form of the parameter support 𝒳. If 𝒳 is discrete, meaning that the parameter is one of a countable number of alternatives, the estimation problem is a decision problem. In this case, the problem is usually referred to as detection, or hypothesis testing. The term estimation is usually reserved for the case when 𝒳 is continuous. Despite differences of terminology, the underlying problem is still the same.

Definition A.1 (Statistic). A statistic for x ∈ 𝒳 is a deterministic function

    t : 𝒴 → 𝒳.

t is also known as an estimator for continuous 𝒳 and a detector for discrete 𝒳. A particular value x̂ is referred to as an estimate of x. Note that t is in fact a well-defined random variable.

It is sometimes useful to also consider functions that can take values outside of the parameter set.

The basic problem at hand is the design of estimators t with desirable properties. Suppose that there exists a certain cost involved with accepting x̂, rather than x, as the true value of the parameter in question. Let this cost be defined by a cost function C(x̂, x). For example, the cost function could be the square difference (x̂ − x)² or the absolute difference |x̂ − x|.

Definition A.2 (Risk). For a given cost function C, the risk incurred through use of some estimator t(y) : 𝒴 → 𝒳 is

    R(t, x) = ∫_𝒴 C(t(y), x) f_{Y|X}(y | x) dy,

which is the expected cost, given that x is the true value of the parameter.

The Bayesian approach to estimation is based on minimization of the risk, averaged over the parameter prior.

Definition A.3 (Bayes Criterion). Under the criterion of Bayes, the estimator should minimize the average risk,

    E[R(t, x)] = ∫_𝒳 R(t, x) f_X(x) dx.

Such an estimator is called a Bayes estimator.

We can use Bayes' rule to write the Bayes estimate in terms of the average conditional risk, defined as follows.


Definition A.4 (Conditional Risk).

    R(t | y) = ∫_𝒳 f_{X|Y}(x | y) C(t(y), x) dx
             = ∫_𝒳 [ f_X(x) f_{Y|X}(y | x) / f_Y(y) ] C(t(y), x) dx.    (A.1)

Using this conditional risk, re-write the average risk E[R(t, x)] as the average with respect to the prior distribution of the samples,

    E[R(t, x)] = ∫_𝒴 R(t | y) f_Y(y) dy,    (A.2)

which is minimized if R(t | y) is minimized for each value of y ∈ 𝒴 (since the conditional density is non-negative). This is summarized in the following lemma.

Lemma A.1 (Bayes Estimator). The Bayes estimate t(y) minimizes the conditional risk R(t(y) | y) for each value of y.
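For the quadratic cost C(t, x) = (t − x)², minimizing the conditional risk yields the posterior mean. A small Monte Carlo sketch with a scalar Gaussian model (illustrative variances, not values from the text) confirms that it dominates other estimators in average risk:

```python
import random

random.seed(1)

# Scalar Gaussian model: x ~ N(0, sx2), y = x + n with n ~ N(0, sn2).
sx2, sn2 = 2.0, 1.0
posterior_mean = lambda y: y * sx2 / (sx2 + sn2)   # Bayes estimator for squared error

def avg_risk(estimator, n=100_000):
    """Monte Carlo estimate of the average risk E[(t(y) - x)^2]."""
    total = 0.0
    for _ in range(n):
        x = random.gauss(0.0, sx2 ** 0.5)
        y = x + random.gauss(0.0, sn2 ** 0.5)
        total += (estimator(y) - x) ** 2
    return total / n

mmse = avg_risk(posterior_mean)             # theoretical value: sx2*sn2/(sx2+sn2) = 2/3
assert mmse < avg_risk(lambda y: y)         # raw observation does worse
assert mmse < avg_risk(lambda y: 0.5 * y)   # a wrong shrinkage factor does worse
```

The same minimization, carried out per observation y rather than on average, is exactly the statement of Lemma A.1.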

A.2 Sufficiency

The concept of sufficiency is an important one in estimation theory. It is usually developed within the framework of classical estimation. There is, however, also a strong connection to Bayesian estimation. The following definition sets the scene.

Definition A.5 (Sufficient Statistic). A statistic t is said to be sufficient for estimation of a parameter x if

    f_{Y|T}(y | t) = f_{Y|T,X}(y | t, x).    (A.3)

In other words, conditioned on the statistic, the observation is independent of the parameter. This means that once we have formed the sufficient statistic, the observation gives us no further information about the parameter – the statistic serves as a summary of the observation.

It is a consequence of Definition A.1 that the parameter, observation and statistic form a Markov chain, denoted

    x → y → t,

which means the joint density factors in the following way:


    f_{X,Y,T}(x, y, t) = f_X(x) f_{Y|X}(y | x) f_{T|Y}(t | y).    (A.4)

Conditioned on the observation, the statistic is independent of the parameter,

    f_{T|Y,X}(t | y, x) = f_{T|Y}(t | y).

It is a known property of Markov chains that the reverse of a Markov chain is also a Markov chain [24]. Hence

    t → y → x

is also a Markov chain. Conditioned upon the observation, the parameter is therefore independent of the statistic,

    f_{X|Y,T}(x | y, t) = f_{X|Y}(x | y).

Sufficiency introduces the extra conditional independence (A.3) that is not implied by the Markov structure. We can use the Markov structure, together with sufficiency, to prove the following useful lemma.

Lemma A.2. Let t be a sufficient statistic. Then

    f_{X|Y}(x | y) = f_{X|T}(x | t).    (A.5)

Proof. From the Markov property, f_{X|Y,T}(x | y, t) = f_{X|Y}(x | y). The goal is to use sufficiency to show that f_{X|Y,T}(x | y, t) = f_{X|T}(x | t). This may be accomplished using nothing more than the chain rule for probability and the definition of sufficiency, as follows:

    f_{X|Y,T}(x | y, t) = f_{X,Y,T}(x, y, t) / f_{Y,T}(y, t)
                        = f_{X|T}(x | t) f_{Y|X,T}(y | x, t) / f_{Y|T}(y | t)
                        = f_{X|T}(x | t).

Now, in Lemma A.1 we expressed the Bayes estimator in terms of the minimal average conditional risk (A.1). In the case of a sufficient statistic, however, Lemma A.2 allows us to write the conditional risk as

    ∫_𝒳 f_{X|Y}(x | y) C(t(y), x) dx = ∫_𝒳 f_{X|T(Y)}(x | t(y)) C(t(y), x) dx.

The consequence of this is the following theorem.

The consequence of this is the following Theorem.

Theorem A.1. The Bayes estimator for any cost function C is afunction of any sufficient statistic.


This shows more directly how a sufficient statistic indeed summarizes an observation. Once we have a sufficient statistic, we no longer need the observation to construct a Bayes estimate.

There are many sufficient statistics for any given parameter. It is known that if t is invertible then x → t → y is also a Markov chain, and in this case, the resulting Markov structure implies the sufficiency condition (A.3).

Lemma A.3. Any invertible transformation of the observation is sufficient.

The main motivation for finding a sufficient statistic is however to replace a sequence of observations y1, . . . , yn or a vector observation with a statistic of reduced dimension. We may even desire a sufficient statistic that is minimal, in the following sense.

Definition A.6. A minimal sufficient statistic is a function of all other sufficient statistics.

Of particular interest in this book are the Gaussian linear models arising in the context of multiuser detection.

Theorem A.2. Let y ∼ N (Ax, Σ) be a conditionally Gaussian random vector with mean Ax (depending on the vector parameter x) and covariance Σ, where A and Σ are known. Then

AtΣ−1y

is a minimal sufficient statistic for x.

Note that via the application of At, the dimensionality of the statistic may be reduced. The minimal sufficient statistic given in the above Theorem is not unique. For any invertible matrix B, the statistic BAtΣ−1y is also minimal.
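
To make the dimensionality reduction concrete, here is a minimal sketch (pure Python, with a hypothetical 4×2 matrix A and white noise Σ = σ²I, so that Σ−1 = I/σ²) that computes the statistic AtΣ−1y of Theorem A.2.

```python
# Minimal sketch (hypothetical numbers): the minimal sufficient statistic
# A^t Sigma^{-1} y for the model y ~ N(Ax, Sigma), with Sigma = sigma2 * I
# so that Sigma^{-1} = I / sigma2.

sigma2 = 0.5                       # assumed noise variance
A = [[1.0, 0.5],                   # assumed 4x2 matrix: 4 observations, 2 parameters
     [1.0, -0.5],
     [-1.0, 0.5],
     [1.0, 1.0]]
y = [0.9, 1.2, -0.7, 1.6]          # hypothetical observation vector

# t = A^t Sigma^{-1} y = (1/sigma2) A^t y: a 2-dimensional summary of y
t = [sum(A[i][k] * y[i] for i in range(len(y))) / sigma2 for k in range(2)]

print(len(y), "->", len(t))        # dimensionality reduction: 4 -> 2
print(t)
```

The statistic t, not the full observation y, is all a Bayes estimator of x needs (Theorem A.1); multiplying t by any invertible 2×2 matrix B would give another minimal sufficient statistic.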

A.3 Linear Cost

Perhaps the simplest measure of cost is the estimation error, C(t, x) = t − x. For this particular cost function, the average risk is called bias.

Definition A.7 (Bias). The bias of an estimator is defined to be the expected value of the error,

E [t − x] ,

where the expectation is over fY,X(y, x). An estimator is unbiased if its bias is zero.


Note that trying to define a Bayes estimator from C(t, x) = t − x leads to the meaningless estimate t(y) = −∞ (the problem here is that C is not a positive definite function). Alternatively, setting t(y) = E[x] results in zero average risk, which is again meaningless, since it does not depend on the observation. Rather than trying to define a Bayes estimate based on linear cost, bias is considered as a property of any given estimator.

While it is desirable that an estimator be unbiased, the property itself says little about how good the estimator really is. It does not take too much imagination to construct a pathological unbiased estimator which, irrespective of how many samples are taken, has in probability as large an error as can be defined on X .

The following definition describes a desirable asymptotic property.

Definition A.8 (Consistent Estimator). An estimator tn : Yn → X is consistent if tn → x in probability as n → ∞, by which we mean that for every δ > 0 and ε > 0, there exists an integer N such that for all n ≥ N,

Pr (|tn − x| > ε) < δ.
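
Consistency can be illustrated numerically. The sketch below uses an assumed setup, not taken from the text: the sample mean of n i.i.d. N(x, 1) observations, whose deviation probability Pr(|tn − x| > ε) is estimated by Monte Carlo for two values of n.

```python
# Monte Carlo sketch (assumed Gaussian setup): the sample mean of n i.i.d.
# N(x, 1) samples as an estimator t_n of x. Consistency predicts that
# Pr(|t_n - x| > eps) shrinks as n grows.
import random

random.seed(1)
x, eps, trials = 0.0, 0.5, 2000

def deviation_prob(n):
    """Empirical Pr(|t_n - x| > eps) for the sample-mean estimator."""
    count = 0
    for _ in range(trials):
        t_n = sum(random.gauss(x, 1.0) for _ in range(n)) / n
        if abs(t_n - x) > eps:
            count += 1
    return count / trials

p10, p1000 = deviation_prob(10), deviation_prob(1000)
print(p10, p1000)   # the deviation probability collapses as n grows
```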

A.4 Quadratic Cost

Perhaps the simplest positive definite cost function is the squared error, C(t, x) = (t − x)2. The resulting average risk is the mean-squared error, which for unbiased estimators is equal to var(t). In the case of vector parameters, the quadratic cost is defined C(t, x) = ‖t − x‖22 = (t − x)t(t − x). The resulting Bayes estimator is known as the minimum mean-squared error estimate.

A.4.1 Minimum Mean Squared Error

Theorem A.3 (Bayes Estimator for Quadratic Cost). Theminimum mean-squared error estimator is

t(y) = E [x | y] (A.6)

Proof. We want to minimize the conditional risk for each observation. Now the cost function is positive definite and convex, which means we may perform this minimization by equating the gradient of the conditional risk to zero. Now

∂/∂t ∫ (t − x)t(t − x) fX|Y (x | y) dx = 2 ∫ (t − x) fX|Y (x | y) dx = 2 (t − E [x | y]) ,


and clearly this is set to zero by choosing t(y) = E [x | y].

It is interesting to note that the more general quadratic cost (t − x)tA(t − x), where A is positive definite symmetric, results in the same Bayes estimate. The matrix A has no effect on the minimization of conditional risk (since ∂/∂t (t − x)tA(t − x) = 2A(t − x) and expectation is linear).

Although the Bayes estimator has a very compact description, the conditional expectation may be hard to compute in practice.
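
For a discrete parameter, the claim of Theorem A.3 is easy to check by brute force. The sketch below assumes a hypothetical posterior mass function and scans a grid of candidate estimates t, confirming that the conditional risk is minimized at the conditional mean.

```python
# Sketch (hypothetical posterior): for a discrete parameter with posterior
# masses f(x|y), the conditional risk sum_x (t - x)^2 f(x|y) is minimized
# at t = E[x|y], as Theorem A.3 asserts. We verify by scanning candidate t.

posterior = {0.0: 0.2, 1.0: 0.5, 2.0: 0.3}    # assumed f_{X|Y}(x | y)

cond_mean = sum(x * p for x, p in posterior.items())    # E[x|y]

def risk(t):
    """Conditional risk of the estimate t under quadratic cost."""
    return sum((t - x) ** 2 * p for x, p in posterior.items())

# scan a fine grid of candidate estimates
best_t = min((i / 1000 for i in range(-1000, 3001)), key=risk)

print(cond_mean, best_t)    # the grid minimizer lands on E[x|y]
```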

A.4.2 Cramer-Rao Inequality

Development of the minimum mean-squared error estimator was based on the conditional risk, which is the expectation of ‖t − x‖22 with respect to the conditional density fX|Y (x | y).

One of the most famous results from classical estimation theory is the Cramer-Rao inequality [25, Section 32.3] and [105]. This inequality provides a bound on the expectation of (t − x)2 with respect to the conditional density fY |X(y | x) (note that in classical estimation theory there is no fX(x)). It gives a bound on the mean-squared error as a function of the realization of the parameter.

Theorem A.4 (Cramer-Rao Inequality). Every sufficiently well-behaved estimator t : Yn → X , a function of n i.i.d. samples selected according to fY,X(y, x), with x ∈ X , satisfies

E[(t − x)2 | x] ≥ (∂/∂x E [t | x])2 / (n J(x)) , (A.7)

where

J(x) = ∫ (∂/∂x ln fY |X(y | x))2 fY |X(y | x) dy. (A.8)

J(x) is also known as the Fisher information of x, see [24, Section 12.11]. Note that for unbiased estimators, (A.7) reduces to

var [t | x] ≥ 1/(n J(x)) . (A.9)

There are two required conditions for equality in (A.7) or (A.9):

(a). t must be sufficient.
(b). It must be possible to write

∂/∂x ln fY |X(y | x) = k(x) (t(y) − x) , (A.10)

where k(x) does not depend upon the observations, y1, y2, . . . , yn.


See [24] for this theorem in vector form. This theorem, as it stands, has a certain “Shannon” flavor to it, i.e. it says how good an estimator may be, but does not give clues on how to find such estimators. Nevertheless, we make the following definition, due to [43, 44].

Definition A.9 (Efficient Estimator). An unbiased estimator t is efficient if it satisfies (A.9) with equality. Such an estimator is a minimum variance unbiased (MVU) estimator.

Efficient estimators do exist, but they are somewhat rare. In order for an estimator to be efficient, it must be unbiased and satisfy conditions (a) and (b) of Theorem A.4.

To summarize, under the mean-squared error criterion, the Bayes estimator minimizes the conditional risk

∫ (t − x)2 fX|Y (x | y) dx

for each value of the observation, whereas an efficient estimator minimizes a different conditional risk,

∫ (t − x)2 fY |X(y | x) dy

for each value of the parameter.
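
As a numerical illustration of the bound, consider the classical (but here assumed) example of n i.i.d. N(x, σ²) samples, for which the per-sample Fisher information is J(x) = 1/σ², so the bound becomes σ²/n; the sample mean is the standard efficient estimator in this case. The Monte Carlo sketch below checks that its empirical variance sits at the bound.

```python
# Sketch (assumed Gaussian example): for n i.i.d. samples of N(x, sigma2),
# the per-sample Fisher information is J(x) = 1/sigma2, so the Cramer-Rao
# bound reads var[t|x] >= sigma2/n. The sample mean attains it.
import random

random.seed(7)
x_true, sigma2, n, trials = 2.0, 4.0, 25, 20000

estimates = []
for _ in range(trials):
    samples = [random.gauss(x_true, sigma2 ** 0.5) for _ in range(n)]
    estimates.append(sum(samples) / n)       # the sample-mean estimator

mean_t = sum(estimates) / trials
var_t = sum((t - mean_t) ** 2 for t in estimates) / trials

crb = sigma2 / n                             # the bound sigma2/n
print(mean_t, var_t, crb)                    # empirical variance hugs the bound
```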

A.4.3 Jointly Gaussian Model

In the jointly Gaussian model, the minimum mean-squared error estimator has a particularly convenient form.

Theorem A.5. Suppose y and x are jointly Gaussian vectors,

    [ y ]        ( [ E[y] ]   [ cov[y]    cov[y, x] ] )
    [ x ]  ∼  N ( [ E[x] ] ,  [ cov[x, y] cov[x]    ] )

Then the minimum mean-squared estimate is

E [x | y] = E[x] + cov[x, y](cov[y])−1 (y − E[y]) (A.11)

Proof. It is straightforward to show that the parameter x is conditionally Gaussian with mean (A.11).

Note that this is simply a linear transformation of the observation.


A.4.4 Linear MMSE Estimation

The linear form of the minimum mean-squared estimator for the jointly Gaussian model is one of the gems of estimation theory. In general, however, the conditional mean (A.6) will be some non-linear function of the observation. In the interests of saving computational effort, one could be interested in finding the optimal linear estimator in the general case. This amounts to finding the best estimator of the form

t(y) = Ay

where A is a matrix of suitable dimension, i.e.

A = arg min_A E[ ‖Ay − x‖22 ]

Theorem A.6 (Linear Minimum Mean Squared Error Estimation). Let the parameter x and observation both be zero-mean. Then the linear minimum mean-squared error estimator is

cov[x, y](cov[y])−1y

Example A.1. Suppose the observation vector y is related to the zero mean, identity covariance parameter x via

y = Ax + z,

where z is independent from x and A and cov[z] are known. Then the LMMSE estimate of x is

At(AAt + cov[z])−1y.
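
Example A.1 can be exercised numerically. The sketch below builds the LMMSE matrix W = At(AAt + cov[z])−1 for a hypothetical 2×2 model with cov[z] = σ²I, and checks by Monte Carlo the orthogonality of the error to the observation, E[(Wy − x)yt] = 0, a standard property of LMMSE estimators invoked here as an assumption.

```python
# Sketch (hypothetical 2x2 model): LMMSE estimator W = A^t (A A^t + cov[z])^{-1}
# from Example A.1, with cov[z] = sigma2 * I. We verify the orthogonality
# principle E[(Wy - x) y^t] = 0 by Monte Carlo.
import random

random.seed(3)
A = [[1.0, 0.4], [0.3, 1.0]]
sigma2 = 0.5

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def inv2(P):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = P
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# cov[y] = A A^t + sigma2 I, then W = A^t (cov[y])^{-1}
covy = matmul(A, transpose(A))
for i in range(2):
    covy[i][i] += sigma2
W = matmul(transpose(A), inv2(covy))

# Monte Carlo check that the error Wy - x is orthogonal to the observation y
trials = 20000
cross = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(trials):
    x = [random.gauss(0, 1) for _ in range(2)]
    z = [random.gauss(0, sigma2 ** 0.5) for _ in range(2)]
    y = [A[i][0] * x[0] + A[i][1] * x[1] + z[i] for i in range(2)]
    e = [W[i][0] * y[0] + W[i][1] * y[1] - x[i] for i in range(2)]
    for i in range(2):
        for j in range(2):
            cross[i][j] += e[i] * y[j] / trials

print(cross)   # each entry should be near zero
```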

A.5 Hamming Cost

In detection problems (discrete X ), we may be interested in the following cost

C(t, x) = 0 if t = x, and C(t, x) = 1 if t ≠ x.

The average cost in this case is in fact the probability of error,

E [C(t, x)] = Pr (t ≠ x) .


A.5.1 Minimum Probability of Error

The Bayes estimator therefore minimizes the probability of error. Since we are considering a discrete parameter, fX|Y (x | y) is a probability mass function. Hence the conditional risk is

∑_{x ≠ t(y)} fX|Y (x | y) = 1 − fX|Y (t(y) | y) ,

which is minimized by choosing t(y) to maximize fX|Y (t(y) | y). This is the maximum a-posteriori (MAP) estimator.

Theorem A.7 (Bayes Estimator for Hamming Cost). The minimum probability of error estimator is

t(y) = arg max_x fX|Y (x | y) .

Using Bayes’ theorem, we can write the MAP estimator as

xMAP = arg max_{x ∈ X} fY |X(y | x) fX(x),

where fX(x) is the prior distribution of the parameter, and we have removed the term fY (y), since it contributes nothing to the maximization over x.
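
For a binary parameter observed in Gaussian noise, the MAP rule takes a simple threshold form. The following sketch is an assumed scalar example (not from the text) that evaluates arg max over x of fY |X(y | x)fX(x) directly and compares it with the closed-form threshold.

```python
# Sketch (assumed scalar example): MAP detection of x in {-1, +1} from
# y = x + z, z ~ N(0, sigma2), with a non-uniform prior. The MAP rule
# argmax_x f(y|x) p(x) reduces to comparing y against a prior-shifted threshold.
import math

sigma2 = 1.0
prior = {-1: 0.2, +1: 0.8}            # assumed prior f_X(x)

def map_detect(y):
    """argmax over x of f_{Y|X}(y|x) f_X(x) for the Gaussian likelihood."""
    def score(x):
        return math.log(prior[x]) - (y - x) ** 2 / (2 * sigma2)
    return max(prior, key=score)

# Equivalent closed form: decide +1 iff y > (sigma2/2) ln(p(-1)/p(+1))
threshold = (sigma2 / 2) * math.log(prior[-1] / prior[+1])

print(threshold)                      # negative: the prior biases us toward +1
print(map_detect(0.0), map_detect(-1.0))
```

With a uniform prior the threshold returns to zero and the rule reduces to the ML decision of Section A.5.3.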

A.5.2 Relation to the MMSE Estimator

Suppose that the mode of the posterior density (or mass function) fX|Y (x | y) of the parameter coincides with the conditional mean E[x | y]. Then the MMSE and MAP estimators are the same.

This is indeed the case for the jointly Gaussian model of Theorem A.5, and hence the MAP estimator for the Gaussian model is given by (A.11).

A.5.3 Maximum Likelihood Estimation

It often happens that the prior distribution of x is unknown, in which case we must treat the prior distribution of x as uniform (which is the maximum entropy prior for a discrete set), and obtain the maximum likelihood estimate, which minimizes error probability in the case of a uniform prior.


Definition A.10 (Maximum Likelihood Estimator).

xML = arg max_{x ∈ X} fY |X(y | x),

equivalently, a solution to

∂/∂x ln fY |X(y | x) = 0. (A.12)

Maximum-likelihood estimators have some important properties, as we shall now show. The following theorem is a corollary of the Cramer-Rao lower bound.

Theorem A.8.

(1). If an efficient estimator teff exists, it is maximum-likelihood, i.e. teff is the unique solution to (A.12).

(2). If a sufficient statistic tsuff exists, any maximum-likelihood estimate is a function of it, i.e. every solution to (A.12) is a function of tsuff.

Theorem A.8 makes statements about the form taken by efficient estimators and sufficient statistics, relating them to the ML estimate. The following theorem, proved in [25, p. 500], describes the nature of the converse relationship.

Theorem A.9 (Asymptotic Properties of ML Estimators). Every maximum-likelihood estimator is

1. Asymptotically efficient.
2. Consistent.
3. Asymptotically normally distributed, with mean x, and variance given by (A.7).

It may happen that a parameter is known to be an invertible function of some “underlying” parameter. In such cases, the maximum-likelihood estimates of the two parameters obey a simple, but very useful relationship.


Theorem A.10 (ML Estimation via Invertible Functions). Let x ∈ X be related to α ∈ A by a uniquely invertible function g : A → X . Let αML and xML be the respective maximum likelihood estimators for α and x. Then

αML = g−1 (xML) .

This is the so-called principle of invariance for maximum likelihood estimators [114, p. 217].

Proof. It is easy to see for any γ ∈ X , that x = γ and α = g−1(γ) are the same event, and therefore result in the same value of the likelihood function for x, namely

fY |X(y | x = γ) = fY |X(y | α = g−1(γ)).

It should be noted that this result does not hold for MAP estimation, or any method which incorporates the a-priori distribution on α, as the associated distribution on x will result from a transformation of variables under g.

Example A.2. Let the observation vector y be related to a zero mean vector parameter x ∈ Rn via the noisy linear transformation

y = Ax + z

where A is a known invertible matrix and z is a zero-mean, zero-mode noise vector. Then the maximum likelihood estimate of x is

A−1y.

This can be seen by considering the ML estimate of Ax, which is simply y (due to the zero-mode of z), and Theorem A.10.

Example A.3. Let the observation vector y be related to a zero mean vector parameter x ∈ {−1,+1}n via the noisy linear transformation

y = Ax + z

where A is a known matrix and z is a zero-mean, zero-mode noise vector. Unlike the previous example, in this case the ML estimate of x must be found by computing the likelihood for each of the 2n possible vectors x. This shows how the form of the ML estimate can depend very strongly on the parameter support.
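
The exhaustive search of Example A.3 is easily sketched. The code below uses a hypothetical 3×3 matrix A and a fixed small noise realization; assuming white Gaussian z, maximizing the likelihood is equivalent to minimizing ‖y − Ax‖22, so the ML estimate is found by enumerating all 2n candidates.

```python
# Sketch (hypothetical Gaussian model): exhaustive ML detection for Example A.3.
# With x in {-1,+1}^n and y = Ax + z, z white Gaussian, the ML estimate is the
# candidate minimizing ||y - Ax||^2, found by enumerating all 2^n vectors.
from itertools import product

A = [[1.0, 0.3, 0.2],
     [0.3, 1.0, 0.3],
     [0.2, 0.3, 1.0]]            # assumed 3x3 matrix
x_true = [1, -1, 1]
z = [0.1, -0.2, 0.05]            # a fixed (small) noise realization

y = [sum(A[i][j] * x_true[j] for j in range(3)) + z[i] for i in range(3)]

def metric(x):
    """Squared Euclidean distance ||y - Ax||^2."""
    return sum((y[i] - sum(A[i][j] * x[j] for j in range(3))) ** 2
               for i in range(3))

# brute force over all 2^n hypotheses
x_ml = min(product([-1, 1], repeat=3), key=metric)

print(list(x_ml))   # with small noise, ML recovers x_true
```

The enumeration cost grows as 2n, which is exactly why the suboptimal and iterative detectors of the main chapters matter.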


References

[1] E. Agrell, T. Eriksson, A. Vardy, and K. Zeger. Closest point search in lattices. IEEE Trans. Inform. Theory, 48:2201–2214, 2002.

[2] R. Ahlswede. Multi-way communication channels. In 2nd Int. Symp. Inform. Theory, pages 23–52, Tsahkadsor, Armenian S.S.R., 1971. Hungarian Academy of Sciences.

[3] P. Alexander, M. Reed, J. Asenstorfer, and C. Schlegel. Iterative multiuser interference reduction: turbo CDMA. IEEE Trans. Commun., 47(7):1008–1014, July 1999.

[4] P. D. Alexander. Coded Multiuser CDMA. PhD thesis, University of South Australia, 1996.

[5] P. D. Alexander, A. J. Grant, and M. C. Reed. Iterative detection on code-division multiple-access with error control coding. Europ. Trans. Telecommun., 9(5):419–426, 1998.

[6] P. D. Alexander and L. K. Rasmussen. Processing filter derivation for asynchronous time-varying CDMA. In International Symposium on Information Theory and Its Applications, volume 1, pages 151–154, Sydney, Australia, 1994.

[7] P. D. Alexander, L. K. Rasmussen, and C. B. Schlegel. A linear receiver for coded multiuser CDMA. IEEE Trans. Commun., 45(5):605–610, 1997.

[8] O. Axelsson. Iterative Solution Methods. Cambridge University Press, 1994.

[9] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv. Optimal decoding of linear codes for minimising symbol error rate. IEEE Trans. Inform. Theory, 20:284–287, 1974.

[10] Z. D. Bai and Y. Q. Yin. Limit of the smallest eigenvalue of large dimensional sample covariance matrix. Ann. Probab., 21:1275–1294, 1993.

[11] S. Baro, J. Hagenauer, and M. Witzke. Iterative detection of MIMO transmission using a list-sequential (LISS) detector. In IEEE Conf. Commun., pages 2653–2657, 2003.


[12] C. Berrou and A. Glavieux. Near optimum error correcting coding and decoding: turbo-codes. IEEE Trans. Commun., 44(10):1261–1271, 1996.

[13] D. Bertsekas and R. Gallager. Data Networks. Prentice Hall, New Jersey, 1992.

[14] P. A. B. M. C. v. d. Braak and H. C. A. v. Tilborg. A family of good uniquely decodeable code pairs for the two-access binary adder channel. IEEE Trans. Inform. Theory, IT-31(1):3–9, 1985.

[15] M. V. Burnashev, C. Schlegel, W. A. Krzymien, and Z. Shi. Characteristics analysis of successive interference cancellation methods. Problemy Peredachi Informatsii, May 2004. Submitted.

[16] G. Caire, R. R. Muller, and T. Tanaka. Iterative multiuser joint decoding: optimal power allocation and low-complexity implementation. IEEE Trans. Inform. Theory, 50(9):1950–1973, Sept. 2004.

[17] S.-C. Chang. Further results on coding for T-user multiple access channels. IEEE Trans. Inform. Theory, IT-30(2):411–415, 1984.

[18] S.-C. Chang and E. Weldon. Coding for T-user multiple access channels. IEEE Trans. Inform. Theory, IT-25(6):684–691, 1979.

[19] S. C. Chang and J. K. Wolf. On the T-user M-frequency noiseless multiple access channel with and without intensity information. IEEE Trans. Inform. Theory, IT-27(1):41–48, 1981.

[20] P. R. Chevillat. N-user trellis coding for a class of multiple-access channels. IEEE Trans. Inform. Theory, IT-27(1):114–120, 1981.

[21] G. E. Corazza and A. Vanelli-Coralli. cdma2000. In Wiley Encyclopedia of Telecommunications, volume 1, pages 358–369. John Wiley & Sons, Hoboken, NJ, 2003.

[22] T. M. Cover. Some advances in broadcast channels. In Advances in Communication Systems, volume 4, pages 229–260. Academic Press, New York, 1975.

[23] T. M. Cover and S. K. Leung. An achievable rate region for the multiple access channel with feedback. IEEE Trans. Inform. Theory, IT-27(3):292–298, 1981.

[24] T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley, New York, 1991.

[25] H. Cramer. Mathematical Methods of Statistics. Princeton University Press, Princeton NJ, 1946.

[26] M. O. Damen, H. El Gamal, and G. Caire. On maximum likelihood detection and the search for the closest lattice point. IEEE Trans. Inform. Theory, 49(10):2389–2402, Oct. 2003.

[27] M. A. Deaett and J. K. Wolf. Some very simple codes for the nonsynchronized two-user multiple access adder channel with binary inputs. IEEE Trans. Inform. Theory, IT-24(5):635–636, 1978.

[28] D. Divsalar, S. Dolinar, and F. Pollara. Iterative turbo decoder analysis based on density evolution. Technical Report 42-144, TMO Progress Report, Feb. 15 2001.


[29] D. Divsalar and M. K. Simon. CDMA with interference cancellation for multiprobe missions. Technical Report 42-120, JPL-TDA, Feb. 1995. Progress Report.

[30] A. Duel-Hallen. Decorrelating decision-feedback multiuser detector for synchronous code-division multiple-access channel. IEEE Trans. Commun., 41(2):285–290, 1993.

[31] A. Duel-Hallen. A family of multiuser decision-feedback detectors for asynchronous code-division multiple-access channels. IEEE Trans. Commun., 43(2):421–434, 1995.

[32] H. G. Eggleston. Convexity. Cambridge University Press, London, 1958.

[33] H. El-Gamal and E. Geraniotis. Iterative multiuser detection for coded CDMA signals in AWGN and fading channels. IEEE J. Select. Areas Commun., 18(1):30–41, 2000.

[34] H. Elders-Boll, A. Busboom, and H. D. Schotten. Spreading sequences for zero-forcing DS-CDMA multiuser detectors. In 8th IEEE Int. Symp. PIMRC ’97, pages 53–57, Helsinki, Finland, Sept. 1997.

[35] H. Elders-Boll, H.-D. Schotten, and A. Busboom. Efficient implementation of linear multiuser detectors for asynchronous CDMA systems by linear interference cancellation. Europ. Trans. Telecommun., 9(5):427–438, Sept-Oct 1998. Special Issue on Code Division Multiple Access Techniques for Wireless Communication Systems.

[36] D. Eppstein. Finding the k shortest paths. Technical report, Dep. of Info. and Comp. Sci., University of California, 1994.

[37] ETSI. ETSI/TC GSM Recommendation 05.05: Transmission and reception, 1988.

[38] ETSI/SMG/SMG2. The ETSI UMTS terrestrial radio access (UTRA) ITU-R RTT candidate submission, 1998.

[39] B. Farhang-Boroujeny. MMSE linear processing for MIMO channel. Private communication, 2005.

[40] P. G. Farrell. Survey of channel coding for multi-user systems. In J. Skwirynski, editor, New Concepts in Multi-User Communication, NATO Advanced Study Institute Series, pages 133–159. Sijthoff and Noordhoff, Alphen aan den Rijn, The Netherlands, 1981.

[41] T. J. Ferguson. Generalised T-user codes for multiple-access channels. IEEE Trans. Inform. Theory, IT-28(5):775–779, 1982.

[42] U. Fincke and M. Pohst. Improved methods for calculating vectors of short length in a lattice, including a complexity analysis. Mathematics of Computation, 44(170):463–471, 1985.

[43] R. A. Fisher. On the mathematical foundations of theoretical statistics. Phil. Trans. Royal Soc., London, A, 222:309, 1921.

[44] R. A. Fisher. Theory of statistical estimation. Proc. Cambridge Phil. Soc., 22:700, 1925.

[45] G. D. Forney, Jr. The Viterbi algorithm. Proc. IEEE, 61:268–278, 1973.


[46] G. Foschini. Layered space-time architecture for wireless communication in a fading environment when using multi-element antennas. Bell Labs Tech. J., 1(2):41–59, Aug. 1996.

[47] G. J. Foschini, G. D. Golden, R. A. Valenzuela, and P. W. Wolniansky. Simplified processing for high spectral efficiency wireless communication employing multi-element arrays. IEEE J. Select. Areas Commun., 17(11):1841–1852, Nov. 1999.

[48] N. T. Gaarder and J. K. Wolf. The capacity of a multiple access discrete memoryless channel can increase with feedback. IEEE Trans. Inform. Theory, IT-21:100–102, 1975.

[49] R. G. Gallager. Information Theory and Reliable Communication. John Wiley and Sons, New York, 1968.

[50] R. G. Gallager. A perspective on multiaccess channels. IEEE Trans. Inform. Theory, IT-31(2):124–142, 1985.

[51] A. E. Gamal and T. M. Cover. Multiple user information theory. Proc. IEEE, 68(12):1466–1483, 1980.

[52] T. R. Giallorenzi and S. G. Wilson. Multiuser ML sequence estimator for convolutionally coded asynchronous DS-CDMA systems. IEEE Trans. Commun., 44(8):997–1008, 1996.

[53] T. R. Giallorenzi and S. G. Wilson. Suboptimum multiuser receivers for convolutionally coded asynchronous DS-CDMA systems. IEEE Trans. Commun., 44(9):1183–1196, 1996.

[54] R. Gold. Optimum binary sequences for spread spectrum multiplexing. IEEE Trans. Inform. Theory, 13:619–621, 1967.

[55] G. Golub and C. V. Loan. Matrix Computations. The Johns Hopkins University Press, 2nd edition, 1989.

[56] A. Grant and P. Alexander. Random sequence multisets for synchronous code-division multiple-access channels. IEEE Trans. Inform. Theory, 44(7):2832–2836, 1998.

[57] A. Grant, B. Rimoldi, R. Urbanke, and P. Whiting. Rate-splitting multiple access for discrete memoryless channels. IEEE Trans. Inform. Theory, 47(3):873–890, 2001.

[58] A. Grant and C. Schlegel. Iterative implementations for linear multiuser detectors. IEEE Trans. Commun., 49(10):1824–1834, Oct. 2001.

[59] J. Hagenauer. A soft-in/soft-out list sequential (LISS) decoder for turbo schemes. In IEEE Int. Symp. Inform. Theory, page 382, 2003.

[60] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, 1990.

[61] B. Hughes and A. B. Cooper. Nearly optimal multiuser codes for the binary adder channel. IEEE Trans. Inform. Theory, 42(2):387–398, 1996.

[62] J. Hui. Throughput analysis for code division multiple accessing of the spread spectrum channel. IEEE J. Select. Areas Commun., SAC-2(4):482–486, 1984.


[63] J. Y. N. Hui and P. A. Humblet. The capacity region of the totally asynchronous multiple access channel. IEEE Trans. Inform. Theory, IT-31(2):207–216, 1985.

[64] D. Jonsson. Some limit theorems for the eigenvalues of a sample covariance matrix. J. Mult. Anal., 12:1–38, 1982.

[65] P. Jung and J. Blanz. Joint detection with coherent receiver antenna diversity in CDMA mobile radio systems. IEEE Trans. Veh. Technol., 44(1):76–88, 1995.

[66] M. J. Juntti, B. Aazhang, and J. O. Lilleberg. Iterative implementations of linear multiuser detection for asynchronous CDMA systems. IEEE Trans. Commun., 46(4):503–508, April 1998.

[67] T. Kasami and S. Lin. Coding for a multiple-access channel. IEEE Trans. Inform. Theory, IT-22(2):129–137, 1976.

[68] T. Kasami and S. Lin. Bounds on the achievable rates of block coding for a memoryless multiple-access channel. IEEE Trans. Inform. Theory, IT-24(2):187–197, 1978.

[69] T. Kasami and S. Lin. Decoding of linear δ-decodeable codes for a multiple-access channel. IEEE Trans. Inform. Theory, IT-24(5):633–636, 1978.

[70] T. Kasami, S. Lin, V. Wei, and S. Yamamura. Graph theoretic approaches to the code construction for the two-user multiple-access binary adder channel. IEEE Trans. Inform. Theory, IT-29(1):114–130, 1983.

[71] S. M. Kay. Fundamentals of Statistical Signal Processing Volume I: Estimation Theory. Prentice Hall, New Jersey, 1993.

[72] S. M. Kay. Fundamentals of Statistical Signal Processing Volume II: Detection Theory. Prentice Hall, New Jersey, 1998.

[73] G. K. Khachatrian. On the construction of codes for noiseless synchronized 2-user channel. Prob. Contr. Inform. Theory, 11:319–324, 1987.

[74] G. Kramer. Feedback strategies for a class of two-user multiple-access channels. IEEE Trans. Inform. Theory, 45(6):2054–2059, Sept. 1999.

[75] C. Kuhn and J. Hagenauer. Iterative list-sequential (LISS) detector for fading multiple-access channels. In IEEE Global Commun. Conf., pages 330–335, 2004.

[76] H. Liao. Multiple access channels. PhD thesis, Dept. of Electrical Eng., University of Hawaii, 1972.

[77] S. Lin and D. J. Costello, Jr. Error Control Coding: Fundamentals and Applications. Prentice-Hall, New Jersey, 1983.

[78] S. Lin and D. J. Costello, Jr. Error Control Coding. Prentice-Hall, Englewood Cliffs, 2nd edition, 2004.

[79] R. Lupas and S. Verdu. Linear multiuser detectors for synchronous code-division multiple-access channels. IEEE Trans. Inform. Theory, 35(1):123–136, 1989.

[80] R. Lupas and S. Verdu. Near-far resistance of multiuser detectors in asynchronous channels. IEEE Trans. Commun., 38(4):496–508, 1990.


[81] D. J. C. MacKay. Good error-correcting codes based on very sparse matrices. IEEE Trans. Inform. Theory, IT-45(2):399–431, 1999.

[82] U. Madhow and M. L. Honig. MMSE interference suppression for direct-sequence spread-spectrum CDMA. IEEE Trans. Commun., 42(12):3178–3188, 1994.

[83] J. L. Massey. Information theory aspects of spread-spectrum communications. In The Fifth International Symposium on Personal, Indoor and Mobile Radio Communications, pages 16–20, The Hague, The Netherlands, 1994.

[84] J. L. Massey and P. Mathys. The collision channel without feedback. IEEE Trans. Inform. Theory, IT-31(2):192–204, 1985.

[85] J. L. Massey and T. Mittelholzer. Welch’s bound and sequence sets for code-division multiple-access systems. In Sequences II, Methods in Communication, Security and Computer Science. Springer-Verlag, New York, 1993.

[86] P. Mathys. A class of codes for a T-active users out of N multiple access communication system. IEEE Trans. Inform. Theory, 36(6):1206–1219, 1990.

[87] E. C. v. d. Meulen. A survey of multi-way channels in information theory. IEEE Trans. Inform. Theory, IT-23(1):1–37, 1977.

[88] M. Moher. An iterative multiuser decoder for near-capacity communications. IEEE Trans. Commun., 46(7):870–880, 1998.

[89] M. Moher. Multiuser decoding for multibeam systems. IEEE Trans. Veh. Technol., 49(4):1226–1234, July 2000.

[90] M. Moher, L. Erup, and R. G. Lyons. Interference statistics for multibeam satellites. In Second European Workshop on Mobile and Personal Satellite Communications, pages 366–384, Oct. 1996.

[91] H. Nyquist. Certain topics in telegraph transmission theory. AIEE Trans., 47:617–644, 1928.

[92] E. Ordentlich. On the factor-of-two bound for Gaussian multiple-access channels with feedback. IEEE Trans. Inform. Theory, 42(6):2231–2235, 1996.

[93] L. H. Ozarow. The capacity of the white Gaussian multiple access channel with feedback. IEEE Trans. Inform. Theory, IT-30:623–629, 1984.

[94] P. Patel and J. Holtzman. Analysis of a simple successive interference cancellation scheme in a DS/CDMA system. IEEE J. Select. Areas Commun., 12(5):796–807, June 1994.

[95] R. Peterson and D. Costello. Error probability and free distance bounds for two-user tree codes on multiple access channels. IEEE Trans. Inform. Theory, IT-26(6):658–670, 1980.

[96] R. L. Peterson and D. J. Costello. Binary convolutional codes for a multiple-access system. IEEE Trans. Inform. Theory, IT-25(1):101–105, 1979.


[97] R. L. Peterson, R. E. Ziemer, and D. E. Borth. Introduction to Spread Spectrum Communications. Prentice-Hall, 1995.

[98] E. Plotnik. Code construction for asynchronous random multiple-access to the adder channel. IEEE Trans. Inform. Theory, 39(1):195–197, 1993.

[99] M. Pohst. On the computation of lattice vectors of minimal length, successive minima and reduced basis with applications. ACM SIGSAM, 15:37–44, 1981.

[100] G. S. Poltyrev. Coding for channel with asynchronous multiple access. Probl. Peredachi Informatsii, 19(3):12–21, 1983.

[101] H. Poor and S. Verdu. Probability of error in MMSE multiuser detection. IEEE Trans. Inform. Theory, pages 858–871, May 1997.

[102] H. V. Poor. An Introduction to Signal Detection and Estimation. Springer-Verlag, New York, 2nd edition, 1994.

[103] K. A. Post. Convexity of the nonachievable rate region for the collision channel without feedback. IEEE Trans. Inform. Theory, IT-31(2):205–206, 1985.

[104] J. G. Proakis. Digital Communications. McGraw-Hill, 4th edition, 2001.

[105] C. R. Rao. Information and the accuracy attainable in the estimation of statistical parameters. Bull. Calcutta Math. Soc., 37:81–91, 1945.

[106] P. B. Rapajic and D. Popescu. Information capacity of a random signature multiple-input multiple-output channel. IEEE Trans. Commun., 48(8):1245–1248, Aug. 2000.

[107] P. B. Rapajic and B. S. Vucetic. Adaptive receiver structures for asynchronous CDMA systems. IEEE J. Select. Areas Commun., 12(4):685–697, 1994.

[108] M. Reed, C. Schlegel, P. Alexander, and J. Asenstorfer. Iterative multiuser detection for CDMA with FEC: near-single-user performance. IEEE Trans. Commun., 46(12):1693–1699, Dec. 1998.

[109] A. B. Reid, A. J. Grant, and P. D. Alexander. List detection for multi-access channels. In IEEE Global Commun. Conf., pages 1083–1087, 2002.

[110] A. B. Reid, A. J. Grant, and A. P. Kind. Low-complexity list-detection for high-rate multiple-antenna channels. In IEEE Int. Symp. Inform. Theory, page 273, Yokohama, Japan, 2003.

[111] M. Rupf. Coding for CDMA Channels and Capacity. PhD thesis, Eidgenössische Technische Hochschule, 1994.

[112] M. Rupf and J. L. Massey. Optimum sequence multisets for synchronous code-division multiple-access channels. IEEE Trans. Inform. Theory, 40(4):1261–1266, 1994.

[113] C. Sankaran and A. Ephremides. Solving a class of optimum multiuser detection problems with polynomial complexity. IEEE Trans. Inform. Theory, 44:1958–1961, 1998.

[114] L. L. Scharf. Statistical Signal Processing. Addison-Wesley, Reading, Massachusetts, 1991.


[115] C. Schlegel, P. Alexander, and S. Roy. Coded asynchronous CDMA and its efficient detection. IEEE Trans. Inform. Theory, 44(7):2837–2847, 1998.

[116] C. Schlegel, P. D. Alexander, S. Roy, and Z. Xiang. Multi-user projection receivers. IEEE J. Select. Areas Commun., 14(8):1610–1618, 1996.

[117] C. Schlegel and A. Grant. Polynomial complexity optimal detection of certain multiple access systems. IEEE Trans. Inform. Theory, 46(6):2246–2248, Sept. 2000.

[118] C. Schlegel and L. Wei. A simple way to compute the minimum distance in multiuser CDMA. IEEE Trans. Commun., 45(5):532–535, May 1997.

[119] C. Schlegel and Z. Xiang. Multiuser projection receivers. In IEEE Int. Symp. Inform. Theory, page 318, Whistler, Canada, 1995.

[120] C. B. Schlegel and L. Perez. Trellis and Turbo Coding. IEEE Press, Piscataway, NJ, 2004.

[121] K. S. Schneider. Optimum detection of code division multiplexed signals. IEEE Transactions on Aerospace and Electronic Systems, AES-15(1):181–185, Jan. 1979.

[122] C. P. Schnorr and M. Euchner. Lattice basis reduction: improved practical algorithms and solving subset sum problems. Mathematical Programming, 66:181–191, 1994.

[123] C. E. Shannon. A mathematical theory of communication. Bell Syst. Tech. J., 27:379–423, 623–656, 1948.

[124] C. E. Shannon. The zero error capacity of a noisy channel. IRE Trans. Inform. Theory, IT-2:8–19, 1956.

[125] Z. Shi and C. Schlegel. Asymptotic iterative multi-user detection of coded random CDMA. Submitted to IEEE Trans. Signal Processing, 2005.

[126] Z. Shi and C. Schlegel. Joint iterative decoding of serially concatenated error control coded CDMA. IEEE J. Select. Areas Commun., pages 1646–1653, 2001.

[127] J. W. Silverstein. Eigenvalues and eigenvectors of large dimensional sample covariance matrices. Contemporary Mathematics, 50:153–159, 1986.

[128] J. W. Silverstein and Z. D. Bai. On the empirical distribution of eigenvalues of a class of large dimensional random matrices. J. Multivariate Anal., 54(2):175–192, Aug. 1995.

[129] D. Slepian and J. K. Wolf. A coding theorem for multiple access channels with correlated sources. Bell Syst. Tech. J., 52(7), 1973.

[130] R. Sorace. Trellis coding for the multiple access channel. IEEE Trans. Inform. Theory, IT-29(4):606–611, 1983.

[131] G. Strang. Linear Algebra and its Applications. Harcourt Brace Jovanovich, 1988.

[132] I. E. Telatar. Capacity of multi-antenna Gaussian channels. Europ. Trans. Telecommun., 10(6):585–595, Nov.-Dec. 1999. Originally published as Tech. Memo., Bell Laboratories, Lucent Technologies, October 1995.

[133] S. ten Brink. Convergence behavior of iteratively decoded parallel concatenated codes. IEEE Trans. Commun., 49(10):1727–1737, Oct. 2001.

[134] TIA. TIA/IS-2000 interim standard: Introduction to cdma2000 spread spectrum systems, July 2004.

[135] TIA/EIA. IS-95: Mobile station-base station compatibility standard for dual-mode wideband spread spectrum cellular system, July 1993.

[136] H. v. Tilborg. An upper bound for codes in a two-access binary erasure channel. IEEE Trans. Inform. Theory, IT-24(1):112–117, 1978.

[137] L. G. F. Trichard, J. S. Evans, and I. B. Collings. Large system analysis of linear parallel interference cancellation. IEEE Trans. Commun., 50(11):1778–1786, Nov. 2002.

[138] D. Tse and S. Hanly. Linear multiuser receivers: Effective interference, effective bandwidth and user capacity. IEEE Trans. Inform. Theory, 45(2):641–657, 1999.

[139] S. Ulukus and R. Yates. Optimum multiuser detection is tractable for synchronous CDMA using m-sequences. IEEE Commun. Lett., 2:89–91, 1998.

[140] G. Ungerboeck. Adaptive maximum-likelihood receiver for carrier-modulated data-transmission systems. IEEE Trans. Commun., 22(5):624–636, 1974.

[141] R. Urbanke and Q. Li. The zero-error capacity region of the 2-user synchronous BAC is strictly smaller than its Shannon capacity region. In IEEE Information Theory Workshop, page 61, Killarney, Ireland, 1998.

[142] W. van Etten. Maximum likelihood receiver for multiple channel transmission system. IEEE Trans. Commun., 24(2):276–283, 1976.

[143] H. L. van Trees. Detection, Estimation, and Modulation Theory, Part I. John Wiley & Sons, 1968.

[144] P. Vanroose. Code construction for the noiseless binary switching multiple-access channel. IEEE Trans. Inform. Theory, 34(5):1100–1106, 1988.

[145] M. Varanasi and B. Aazhang. Multistage detection in asynchronous code-division multiple-access communication. IEEE Trans. Commun., 38(4):509–519, Apr. 1990.

[146] M. Varanasi and B. Aazhang. Near optimum detection in synchronous code-division multiple-access systems. IEEE Trans. Commun., 39(5):725–736, May 1991.

[147] S. Verdu. Minimum probability of error for asynchronous Gaussian multiple-access channels. IEEE Trans. Inform. Theory, 32(1):85–96, 1986.

[148] S. Verdu. The capacity region of the symbol-asynchronous Gaussian multiple-access channel. IEEE Trans. Inform. Theory, 35(4):733–751, 1989.


[149] S. Verdu. Computational complexity of optimum multiuser detection. Algorithmica, pages 303–312, 1989.

[150] S. Verdu and S. Shamai. Spectral efficiency of CDMA with random spreading. IEEE Trans. Inform. Theory, 45(2):622–640, 1999.

[151] H. Vikalo, B. Hassibi, and T. Kailath. Iterative decoding for MIMO channels via modified sphere decoder. IEEE Trans. Wireless Commun., 3(6):2299–2311, Nov. 2004.

[152] A. J. Viterbi. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inform. Theory, IT-13:260–269, 1967.

[153] A. J. Viterbi. Very low rate convolutional codes for maximum theoretical performance of spread-spectrum multiple-access channels. IEEE J. Select. Areas Commun., 8:641–649, 1990.

[154] A. J. Viterbi. CDMA: Principles of Spread Spectrum Communication. Addison-Wesley, 1995.

[155] A. J. Viterbi and J. K. Omura. Principles of Digital Communication and Coding. McGraw-Hill, 1979.

[156] K. W. Wachter. The strong limits of random matrix spectra for sample matrices of independent elements. Ann. Probab., 6:1–18, 1978.

[157] X. Wang and V. Poor. Iterative (turbo) soft interference cancellation and decoding for coded CDMA. IEEE Trans. Commun., 47(7):1046–1061, July 1999.

[158] L. Wei and C. Schlegel. Synchronous DS-SSMA with improved decorrelating decision-feedback multiuser detection. IEEE Trans. Veh. Technol., 43(3):767–772, 1994. Special Issue on Future PCS Technologies.

[159] L. R. Welch. Lower bounds on the maximum cross correlation of signals. IEEE Trans. Inform. Theory, IT-20(3):397–399, 1974.

[160] E. J. Weldon. Coding for a multiple access channel. Information and Control, 36:256–274, 1978.

[161] K. Wesolowski. Mobile Communication Systems. John Wiley & Sons, Chichester, England, 2002.

[162] L. Wilhelmsson and K. S. Zigangirov. On the transmission over the T-user q-ary multiple access permutation channel. In IEEE Int. Symp. Inform. Theory, page 56, Trondheim, Norway, 1994.

[163] F. M. J. Willems. The feedback capacity of a class of discrete memoryless multiple access channels. IEEE Trans. Inform. Theory, IT-28(1):93–95, 1982.

[164] F. M. J. Willems and E. C. v. d. Meulen. Partial feedback for the discrete memoryless multiple access channel. IEEE Trans. Inform. Theory, 29(2):287–290, 1983.

[165] J. K. Wolf. Multi-user communications networks. In J. Skwirynski, editor, Communication Systems and Random Process Theory, NATO Advanced Study Institute Series, pages 37–53. Sijthoff and Noordhoff, Alphen aan den Rijn, The Netherlands, 1978.


[166] J. M. Wozencraft and I. M. Jacobs. Principles of Communications Engineering. J. Wiley and Sons, New York, 1965.

[167] J. M. Wozencraft and I. M. Jacobs. Principles of Communications Engineering. Waveland Press, 1993.

[168] D. Wubben, B. Bohnke, J. Rinas, V. Kuhn, and K. D. Kammeyer. Efficient algorithm for decoding layered space-time codes. IEE Electron. Lett., 37:1348–1350, Oct. 2001.

[169] A. D. Wyner. Recent results in Shannon theory. IEEE Trans. Inform. Theory, IT-20(1):2–10, 1974.

[170] A. D. Wyner. Shannon-theoretic approach to Gaussian cellular multiple-access channel. IEEE Trans. Inform. Theory, 40(6):1713–1727, 1994.

[171] Z. Xie, R. T. Short, and C. K. Rushforth. A family of suboptimum detectors for coherent multiuser communications. IEEE J. Select. Areas Commun., 8(4):683–690, 1990.

[172] Y. Q. Yin and Z. D. Bai. Spectra for large-dimensional random matrices. Contemporary Mathematics, 50:161–167, 1986.

[173] Y. Yoon, R. Kohno, and H. Imai. A spread-spectrum multiaccess system with cochannel interference cancellation for multipath fading channels. IEEE J. Select. Areas Commun., 11:1067–1075, Sept. 1993.


Author Index

Agrell et al. [2002] 168, 249

Ahlswede [1971] 6, 53, 60, 249

Alexander and Rasmussen [1994] 133, 249

Alexander et al. [1997] 178, 184, 186, 249

Alexander et al. [1998] 179, 249

Alexander et al. [1999] 179, 249

Alexander [1996] 133, 249

Axelsson [1994] 139, 142, 146, 147, 249

Bahl et al. [1974] 107, 109, 249

Bai and Yin [1993] 127, 249

Baro et al. [2003] 173, 249

Berrou and Glavieux [1996] 99, 176, 249

Bertsekas and Gallager [1992] 94, 250

Braak and Tilborg [1985] 76, 77, 250

Burnashev et al. [2004] 207, 250

Caire et al. [2004] 233, 234, 250

Chang and Weldon [1979] 58, 59, 78–80, 250

Chang and Wolf [1981] 58, 59, 80, 250

Chang [1984] 80, 250

Chevillat [1981] 76, 81, 250

Corazza and Vanelli-Coralli [2003] 219, 250

Cover and Leung [1981] 86, 250

Cover and Thomas [1991] 4, 53, 58, 227, 240, 243, 244, 250

Cover [1975] 60, 250

Cramer [1946] 243, 247, 250

Damen et al. [2003] 168, 250

Deaett and Wolf [1978] 76, 79, 250

Divsalar and Simon [1995] 141, 250

Divsalar et al. [2001] 196, 250

Duel-Hallen [1993] 130, 251

Duel-Hallen [1995] 130, 251

ETSI/SMG/SMG2 [1998] 139, 251

ETSI [1988] 9, 251

Eggleston [1958] 53, 84, 251

El-Gamal and Geraniotis [2000] 179, 251

Elders-Boll et al. [1997] 139, 143, 251

Elders-Boll et al. [1998] 139, 143, 251

Eppstein [1994] 172, 251

Farhang-Boroujeny [2005] 149, 251

Farrell [1981] 74, 251

Ferguson [1982] 79, 251

Fincke and Pohst [1985] 165, 168, 251

Fisher [1921] 244, 251

Fisher [1925] 244, 251

Forney [1973] 106, 251

Foschini [1996] 175, 251

Foschini et al. [1999] 168, 252

Gaarder and Wolf [1975] 85, 252

Gallager [1968] 16, 17, 101, 252

Gallager [1985] 53, 252

Gamal and Cover [1980] 53, 252

Giallorenzi and Wilson [1996] 176, 178, 252

Gold [1967] 163, 186, 252

Golub and Loan [1989] 110, 119, 128, 132, 142, 252

Grant and Alexander [1998] 67, 252

Grant and Schlegel [2001] 143, 145–148, 150, 252


Grant et al. [2001] 81, 91, 252
Hagenauer [2003] 173, 252
Horn and Johnson [1990] 125, 142, 144, 146, 148, 252
Hughes and Cooper [1996] 79, 252
Hui and Humblet [1985] 91, 252
Hui [1984] 67, 252
Jonsson [1982] 65, 253
Jung and Blanz [1995] 120, 253
Juntti et al. [1998] 143, 253
Kasami and Lin [1976] 76, 77, 253
Kasami and Lin [1978] 77, 253
Kasami et al. [1983] 78, 253
Kay [1993] 237, 253
Kay [1998] 237, 253
Khachatrian [1987] 76, 253
Kramer [1999] 87, 253
Kuhn and Hagenauer [2004] 173, 253
Liao [1972] 53, 60, 253
Lin and Costello [1983] 77, 106, 253
Lin and Costello [2004] 175, 253
Lupas and Verdu [1989] 120, 253
Lupas and Verdu [1990] 120, 253
MacKay [1999] 99, 253
Madhow and Honig [1994] 126, 254
Massey and Mathys [1985] 74, 91, 93, 254
Massey and Mittelholzer [1993] 44, 254
Massey [1994] 33, 254
Mathys [1990] 80, 254
Meulen [1977] 4, 254
Moher et al. [1996] 41, 254
Moher [1998] 113, 179, 194, 254
Moher [2000] 41, 254
Nyquist [1928] 1, 254
Ordentlich [1996] 90, 254
Ozarow [1984] 88, 254
Patel and Holtzman [1994] 141, 254
Peterson and Costello [1979] 76, 81, 254
Peterson and Costello [1980] 81, 254
Peterson et al. [1995] 98, 254
Plotnik [1993] 80, 255
Pohst [1981] 165, 168, 255
Poltyrev [1983] 91, 255
Poor and Verdu [1997] 125, 179, 255
Poor [1994] 237, 255
Post [1985] 94, 255
Proakis [2001] 15, 101, 110, 184, 186, 255
Rao [1945] 243, 255
Rapajic and Popescu [2000] 67, 255
Rapajic and Vucetic [1994] 126, 255
Reed et al. [1998] 179, 255
Reid et al. [2002] 172, 255
Reid et al. [2003] 172, 255
Rupf and Massey [1994] 64, 255
Rupf [1994] 64, 255
Sankaran and Ephremides [1998] 112, 255
Scharf [1991] 237, 248, 255
Schlegel and Grant [2000] 115, 256
Schlegel and Perez [2004] XVII, 99, 102, 106, 107, 109, 111, 116, 162, 175, 176, 185–187, 192, 193, 200, 225, 227, 256
Schlegel and Wei [1997] 110, 112, 256
Schlegel and Xiang [1995] 184, 256
Schlegel et al. [1996] 178, 256
Schlegel et al. [1998] 178, 255
Schneider [1979] 28, 29, 97, 100, 256
Schnorr and Euchner [1994] 165, 168, 256
Shannon [1948] 1, 175, 256
Shannon [1956] 85, 93, 256
Shi and Schlegel [2005] 179, 256
Shi and Schlegel [2001] 179, 256
Silverstein and Bai [1995] 69, 215, 256
Silverstein [1986] 65, 256
Slepian and Wolf [1973] 53, 60, 256
Sorace [1983] 81, 256
Strang [1988] 119, 142, 146, 256
TIA/EIA [1993] 10, 97, 101, 102, 139, 219, 257
TIA [2004] 3, 10, 97, 102, 139, 257
Telatar [1999] 175, 256
Tilborg [1978] 78, 257
Trichard et al. [2002] 153, 154, 257
Tse and Hanly [1999] 51, 135, 211, 212, 257
Ulukus and Yates [1998] 113, 257
Ungerboeck [1974] 104, 257
Urbanke and Li [1998] 75, 76, 257
Vanroose [1988] 80, 257
Varanasi and Aazhang [1990] 141, 145, 257
Varanasi and Aazhang [1991] 145, 257


Verdu and Shamai [1999] 66, 67, 72, 127, 258
Verdu [1986] 28, 97, 100, 102, 257
Verdu [1989] 29, 64, 91, 92, 110, 158, 257
Vikalo et al. [2004] 172, 258
Viterbi and Omura [1979] 185, 258
Viterbi [1967] 81, 106, 258
Viterbi [1990] 228, 258
Viterbi [1995] 10, 98, 118, 134, 258
Wachter [1978] 65, 258
Wang and Poor [1999] 179, 209, 258
Wei and Schlegel [1994] 162, 258
Welch [1974] 43, 65, 258
Weldon [1978] 78, 258
Wesolowski [2002] 219, 258
Wilhelmsson and Zigangirov [1994] 80, 258

Willems and Meulen [1983] 86, 258

Willems [1982] 87, 258

Wolf [1978] 59, 79, 258

Wozencraft and Jacobs [1965] 101, 119, 258

Wozencraft and Jacobs [1993] 15, 203, 259

Wubben et al. [2001] 168, 259

Wyner [1974] 60, 259

Wyner [1994] 39, 259

Xie et al. [1990] 126, 259

Yin and Bai [1986] 65, 259

Yoon et al. [1993] 141, 259

ten Brink [2001] 196, 257

van Etten [1976] 28, 29, 97, 100, 257

van Trees [1968] 237, 257


Subject Index

a posteriori probability 29, 30, 193
achievable rate 50
  region 51, 53, 86
algorithm
  M- 162
  T 111, 169
  BCJR 107
  bi-directional 107
  shortest path 172
  sphere detector 168
  Viterbi 106, 185
amplitude 16
  matrix 19, 24, 35
approximate APP 170
asymmetric operation 219
asymptotic negligibility 134, 222
asynchronous 91, 132
  frame 79, 91
  users 103
band-diagonal 102, 139
bandwidth 17, 31, 33, 62
basis
  lattice 164
  orthonormal 17, 189
Bayes
  criterion 238
  estimation 237
  estimator 239, 240, 242, 244, 246
  risk 238
  rule 29, 246
bias 241
block code 75
bound
  Shannon 116, 123, 136, 226
  Welch 43, 65, 67, 118
branch metric 104, 159, 161, 172
branch-and-bound 99
cancellation 58
  parallel interference 143
  perfect 195
  serial 148
  signal 194
  successive 227
capacity
  information theoretic 116
  per chip 208
  per dimension 177
  polytope 228
  upper bound 156
capacity region 48, 53
  asynchronous 91–93
  binary adder channel 55, 87
  binary multiplier channel 59
  code-division multiple-access 64, 66
  Gaussian multiple-access channel 60, 88
  vertex 58, 81
Cayley Hamilton theorem 146
CDMA 175
  asynchronous 186
  joint decoding 99
  synchronous 186
  time-varying 101
cellular network 39
central limit theorem 179, 195


channel
  binary input 204
  CDMA 178
  MIMO 175
  multiple-access 116, 193
  single-user 116, 175
channel layering 116
chip 32, 33
  matched filter 34
Cholesky factorization 110
code
  block 75
  convolutional 81
  Gold 186
  LDPC 175
  multiple-access 49, 73
  outer 192
  parity check 197
  powerful 222
  repetition 203, 206, 217
  serial concatenated 200
  Shannon-type 224
  simple 202
  strong 199, 231
  trellis 175
  turbo 175
  weak 200
code-division multiple-access 32, 63
  capacity region 64, 66
  channel model 22, 35
  direct sequence 33
codebook 187
codeword estimator 202
coding
  error control 192
  low-density parity-check 99
  turbo 99
collision channel 93
complexity
  state space 107
  sub-exponential 112
concatenation 178
  serial 192
convergence
  asymptotic factor 143, 147
  bottleneck 200
  optimal 153
  speed 147
  tunnel 234
convex hull 53, 55, 58, 84, 91
convexity 191
convolutional code 81
cooperation
  receiver 50
  transmitter 47, 55, 57, 59
correlation
  detector 98
  equal 113
  reception 97
correlation matrix 24, 25, 27
  asynchronous channel 26
  narrow-band channel 37
  orthogonal 31
  synchronous channel 27
correlator 23, 27, 42, 67
decision-feedback 130
decoder
  APP 193
  iterative multiuser 193
  state 104
decoding
  iterative 192, 196
  optimal 178
  single-user 178, 179
  trellis 102
  turbo 107
decomposition
  Cholesky 128, 160
  QR 159
  spectral 152
decorrelator 99, 119
  improved decision feedback 162
  information theoretic capacity 123
  partial 131
  random spreading 122
  Shannon bound 123
delay 16–18, 21, 33, 35, 92
detector 238
  a posteriori probability 107
  correlation 118
  individually optimal 30
  jointly optimal 29, 158
  minimum mean-square error 124, 125, 135
  optimal 29, 67, 100, 194, 246, 247
  sub-exponential 112
  single-user 23, 27, 42, 56, 61, 67, 81


  sphere 165
    initial radius 168
    list 171
    Schnorr-Euchner 169
distance
  Euclidean 109, 158, 202
  Hamming 188
  minimum 109
dynamical system 222
effective interference 212
eigenvalue 64, 65, 142, 215
  distribution 65, 66, 70, 153
  minimum 144
energy degradation 185
energy loss 183
equation
  characteristic 146
error cliff 198, 201
error event
  pair-wise 188
error floor 201
error probability 27–30, 42, 50, 246
  projection receiver 189
  two-codeword 192
estimator 238
  consistent 242, 247
  efficient 244, 247
Euclidean distance 109, 202
exponential distribution 224
factorization
  Cholesky 110
    asynchronous 133
feedback 84–88
filter
  loop
    low-complexity 214
    two-stage 216
  loop matched 205
  MMSE
    per-user 212, 217
  non-linear 204
  whitening 128, 160
    asynchronous 132
Fincke-Pohst enumeration 168
finite-state machine 102
first-order stationary method 147
fix point 198
forward error control 175
frequency-division multiple-access 31, 42, 62
Gauss-Seidel 142
Gaussian
  assumption 27
  distribution 24, 36, 42, 64
  linear model 241, 244, 245
  multiple-access channel 59, 64
  noise 16, 18, 24, 29, 35, 60, 101, 196
generator polynomial 186
geometry 120
  projection receiver 182
  sphere detector 172
gradient 149
Hamming
  distance 245
  weight 188, 191
hard decision 145, 179
independence assumption 193
inequality
  Jensen 135, 154, 191
inter-symbol interference 26, 48, 185
interference cancellation
  statistical 99
  successive 99
interleaver 176
  random 177
isotropic 42
iterated map 198
iterative
  inversion 139
  minimization 150
  update equation 141
iterative cancellation 208
iterative decoding 192
iterative layering
  capacity 208
Jacobi
  decorrelator 145
  method 143
  performance 143
  receiver 145
Jensen’s inequality 134


large systems analysis 65, 66, 70, 143, 145, 212
Lindeberg condition 134, 135, 222
linear
  preprocessor 116
linear optimization 234
load
  average system 232
  partial
    maximum 231
  supportable 233
  uncoded 208
log-likelihood ratio 193, 196
  extrinsic 206
loop filter 194, 195, 209
low-complexity 194
lower diagonal 149
lower triangular 110, 128, 148
marginalization 107
matched filter 24, 27–29, 31, 34, 67, 92, 196, 209, 241
  whitened 160
matched filtering 118
matrix
  block 192
  cross-correlation 103
  inversion lemma 125
  iterative inversion 139
  lower triangular 160
  partitioned
    inversion lemma 192
  permutation 110
  projection 189
  splitting 142
maximum a posteriori detector 30, 246
maximum likelihood detector 29, 247
maximum-likelihood
  approximate 158
  estimate 98, 106
method
  Chebyshev 146
  Gauss-Seidel 148
  lattice 164
  tree search 158, 161
metric
  branch 159, 166
  Euclidean 160
  monotonic 159
  monotonic property 166
  recursive 158
  Ungerboeck 161
minimum distance 109
minimum mean-square error
  detector 99, 124, 125
  system capacity 135
  performance 126
  random spreading 127
minimum mean-squared error 242–244
  layering 208
  linear 245
minimum variance filter 125
MMSE solution 149
modulation
  baseband 15
  BPSK 15, 204
  matrix 18, 22, 35, 42, 100
    asynchronous 23
  waveform 16, 18, 32, 36, 92
modulation matrix 31
modulation waveform 16
multi-stage
  approximation 214
multiple antenna channel 37
multiple-access channel 46
  asynchronous 90
  code 49, 73
  continuous time model 14
  discrete memoryless 47
  discrete time model 17
  feedback 84
  Gaussian 59
    asynchronous 92
    feedback 88
  matrix model 18, 35
  narrow band 36
  orthogonal 61
  synchronous 21
multiplicity 110
mutual information 51, 53
near-far
  problem 97
  resistance 163
near-optimal 112
noise limitation 199
NP-hard 29, 158


optimal detector 67
  individually optimal 30
  jointly optimal 29
orthogonality principle 209
parent node 161
partial load 222
partitioning 180
permutation function 167
phase-shift keying 15
power
  average 223
  control 219
  equal 225
  group 222
  optimal distribution 222, 226
  received 119
  received levels 134
  transmitted 16
  unequal 219
power group
  finite 232
pre-processing 178
projection
  filtering 184
  matrix 189
projection receiver
  performance 185, 191
pseudo-inverse 121
quadratic
  form 102, 104
  function 149
rate
  unequal 228
Rayleigh fading 72
receiver
  iterative multistage 151
  multistage 139, 141
  projection 179
relaxation parameter 148
root node 162
sampling 17, 18, 22, 34, 35
satellite 41
scaling factor 210
scenario
  worst-case 136
Schnorr-Euchner enumeration 168
sequence
  m 113
  detector
    extended 184
    maximum-likelihood 178
  metrics 103
  signature 112
Shannon bound 116, 123, 155
  unequal power 137
signal layering 115
signal-to-interference
  asymptotic 211
signal-to-noise ratio
  threshold 200
single-user detector 23, 27, 42, 56, 61, 67, 81
sliding window 133
soft bits 195
soft-output 193
spectral efficiency 206
spectral radius 147
sphere decoder 99
spreading 16, 18, 22, 32, 100
  periodic 36, 64
  random 36, 65, 69, 70, 101, 112, 122, 127, 139, 143, 185, 212
  sequence 33
  sequences 33
stationarity parameter 216
stationary method 147
statistic 238
Stieltjes transform 69, 70
strong user 136
successive relaxation 148
sufficient statistic 17, 24, 28, 29, 34, 102, 240, 243
  minimal 241
superposition 58, 81
supportable load 203, 233
survivor 106
symbol period 15
synchronous
  chip 32, 34, 35
  phase 182
  symbol 21, 22, 27, 31, 49, 91, 101
system capacity
  correlation receiver 134
  decorrelator 135


system load 216
theorem of irrelevance 106
time sharing 53, 84, 91
  region 56
time-division multiple-access 31, 42, 61, 62
trajectory 198
tree search 99
  breadth-first 162
  depth-first 167, 173
trellis code
  nonlinear 81
turbo cliff 201, 216
uniquely decodeable 81
unitary transformation 160
upper bound 191
upper diagonal 149
upper triangular 110
variance transfer 232
variance transfer analysis 195
variance transfer function 213
variational problem 234
Viterbi algorithm 106, 185
weak user 136
whitening filter
  local optimality 130

