
A Foundation in Digital Communication

Second edition

Amos Lapidoth
ETH Zurich, Swiss Federal Institute of Technology

© 2016 Amos Lapidoth

To my family

Contents

    Preface to the Second Edition xvi

    Preface to the First Edition xviii

    Acknowledgments for the Second Edition xxvi

    Acknowledgments for the First Edition xxvii

    1 Some Essential Notation 1

2 Signals, Integrals, and Sets of Measure Zero 4
   2.1 Introduction 4
   2.2 Integrals 4
   2.3 Integrating Complex-Valued Signals 5
   2.4 An Inequality for Integrals 6
   2.5 Sets of Lebesgue Measure Zero 7
   2.6 Swapping Integration, Summation, and Expectation 10
   2.7 Additional Reading 11
   2.8 Exercises 11

3 The Inner Product 14
   3.1 The Inner Product 14
   3.2 When Is the Inner Product Defined? 17
   3.3 The Cauchy-Schwarz Inequality 18
   3.4 Applications 20
   3.5 The Cauchy-Schwarz Inequality for Random Variables 23
   3.6 Mathematical Comments 23
   3.7 Exercises 24

4 The Space L2 of Energy-Limited Signals 27
   4.1 Introduction 27
   4.2 L2 as a Vector Space 27
   4.3 Subspace, Dimension, and Basis 29
   4.4 ‖u‖2 as the “length” of the Signal u(·) 31
   4.5 Orthogonality and Inner Products 33
   4.6 Orthonormal Bases 37
   4.7 The Space L2 49
   4.8 Additional Reading 51
   4.9 Exercises 52

5 Convolutions and Filters 55
   5.1 Introduction 55
   5.2 Time Shifts and Reflections 55
   5.3 The Convolution Expression 56
   5.4 Thinking About the Convolution 56
   5.5 When Is the Convolution Defined? 57
   5.6 Basic Properties of the Convolution 59
   5.7 Filters 60
   5.8 The Matched Filter 60
   5.9 The Ideal Unit-Gain Lowpass Filter 62
   5.10 The Ideal Unit-Gain Bandpass Filter 63
   5.11 Young’s Inequality 63
   5.12 Additional Reading 63
   5.13 Exercises 63

6 The Frequency Response of Filters and Bandlimited Signals 66
   6.1 Introduction 66
   6.2 Review of the Fourier Transform 66
   6.3 The Frequency Response of a Filter 78
   6.4 Bandlimited Signals and Lowpass Filtering 81
   6.5 Bandlimited Signals Through Stable Filters 91
   6.6 The Bandwidth of a Product of Two Signals 92
   6.7 Bernstein’s Inequality 95
   6.8 Time-Limited and Bandlimited Signals 95
   6.9 A Theorem by Paley and Wiener 98
   6.10 Picket Fences and Poisson Summation 98
   6.11 Additional Reading 100
   6.12 Exercises 101

7 Passband Signals and Their Representation 105
   7.1 Introduction 105
   7.2 Baseband and Passband Signals 105
   7.3 Bandwidth around a Carrier Frequency 108
   7.4 Real Passband Signals 112
   7.5 The Analytic Signal 113
   7.6 Baseband Representation of Real Passband Signals 120
   7.7 Energy-Limited Passband Signals 134
   7.8 Shifting to Passband and Convolving 142
   7.9 Mathematical Comments 143
   7.10 Exercises 143

8 Complete Orthonormal Systems and the Sampling Theorem 148
   8.1 Introduction 148
   8.2 Complete Orthonormal System 148
   8.3 The Fourier Series 152
   8.4 The Sampling Theorem 153
   8.5 The Samples of the Convolution 157
   8.6 Closed Subspaces of L2 157
   8.7 An Isomorphism 162
   8.8 Prolate Spheroidal Wave Functions 162
   8.9 Exercises 163

9 Sampling Real Passband Signals 169
   9.1 Introduction 169
   9.2 Complex Sampling 170
   9.3 Reconstructing xPB from its Complex Samples 171
   9.4 Exercises 174

10 Mapping Bits to Waveforms 177
   10.1 What Is Modulation? 177
   10.2 Modulating One Bit 178
   10.3 From Bits to Real Numbers 179
   10.4 Block-Mode Mapping of Bits to Real Numbers 180
   10.5 From Real Numbers to Waveforms with Linear Modulation 182
   10.6 Recovering the Signal Coefficients with a Matched Filter 183
   10.7 Pulse Amplitude Modulation 185
   10.8 Constellations 186
   10.9 Uncoded Transmission 188
   10.10 Bandwidth Considerations 189
   10.11 Design Considerations 190
   10.12 Some Implementation Considerations 192
   10.13 Exercises 194

11 Nyquist’s Criterion 196
   11.1 Introduction 196
   11.2 The Self-Similarity Function of Energy-Limited Signals 197
   11.3 Nyquist’s Criterion 200
   11.4 The Self-Similarity Function of Integrable Signals 209
   11.5 Exercises 209

12 Stochastic Processes: Definition 214
   12.1 Introduction and Continuous-Time Heuristics 214
   12.2 A Formal Definition 216
   12.3 Describing Stochastic Processes 217
   12.4 Additional Reading 217
   12.5 Exercises 218

13 Stationary Discrete-Time Stochastic Processes 221
   13.1 Introduction 221
   13.2 Stationary Processes 221
   13.3 Wide-Sense Stationary Stochastic Processes 222
   13.4 Stationarity and Wide-Sense Stationarity 223
   13.5 The Autocovariance Function 224
   13.6 The Power Spectral Density Function 226
   13.7 The Spectral Distribution Function 230
   13.8 Exercises 231

14 Energy and Power in PAM 234
   14.1 Introduction 234
   14.2 Energy in PAM 234
   14.3 Defining the Power in PAM 237
   14.4 On the Mean of Transmitted Waveforms 239
   14.5 Computing the Power in PAM 240
   14.6 A More Formal Account 251
   14.7 Exercises 255

15 Operational Power Spectral Density 259
   15.1 Introduction 259
   15.2 Motivation 260
   15.3 Defining the Operational PSD 264
   15.4 The Operational PSD of Real PAM Signals 268
   15.5 A More Formal Account 272
   15.6 Operational PSD and Average Autocovariance Function 278
   15.7 The Operational PSD of a Filtered Stochastic Process 285
   15.8 The Operational PSD and Power 287
   15.9 Exercises 294

16 Quadrature Amplitude Modulation 297
   16.1 Introduction 297
   16.2 PAM for Passband? 298
   16.3 The QAM Signal 299
   16.4 Bandwidth Considerations 301
   16.5 Orthogonality Considerations 302
   16.6 Spectral Efficiency 305
   16.7 QAM Constellations 305
   16.8 Recovering the Complex Symbols via Inner Products 307
   16.9 Filtering QAM Signals 311
   16.10 Exercises 313

17 Complex Random Variables and Processes 316
   17.1 Introduction 316
   17.2 Notation 317
   17.3 Complex Random Variables 318
   17.4 Complex Random Vectors 325
   17.5 Discrete-Time Complex Stochastic Processes 330
   17.6 Limits of Proper Complex Random Variables 336
   17.7 On the Eigenvalues of Large Toeplitz Matrices 339
   17.8 Exercises 339

18 Energy, Power, and PSD in QAM 343
   18.1 Introduction 343
   18.2 The Energy in QAM 343
   18.3 The Power in QAM 346
   18.4 The Operational PSD of QAM Signals 351
   18.5 A Formal Account of Power in Passband and Baseband 356
   18.6 A Formal Account of the PSD in Baseband and Passband 363
   18.7 Exercises 372

19 The Univariate Gaussian Distribution 376
   19.1 Introduction 376
   19.2 Standard Gaussian Random Variables 376
   19.3 Gaussian Random Variables 378
   19.4 The Q-Function 381
   19.5 Integrals of Exponentiated Quadratics 385
   19.6 The Moment Generating Function 386
   19.7 The Characteristic Function of Gaussians 387
   19.8 Central and Noncentral Chi-Square Random Variables 389
   19.9 The Limit of Gaussians Is Gaussian 393
   19.10 Additional Reading 395
   19.11 Exercises 395

20 Binary Hypothesis Testing 398
   20.1 Introduction 398
   20.2 Problem Formulation 398
   20.3 Guessing in the Absence of Observables 400
   20.4 The Joint Law of H and Y 401
   20.5 Guessing after Observing Y 403
   20.6 Randomized Decision Rules 406
   20.7 The MAP Decision Rule 408
   20.8 The ML Decision Rule 410
   20.9 Performance Analysis: the Bhattacharyya Bound 411
   20.10 Example 411
   20.11 (Nontelepathic) Processing 414
   20.12 Sufficient Statistics 419
   20.13 Implications of Optimality 427
   20.14 Multi-Dimensional Binary Gaussian Hypothesis Testing 428
   20.15 Guessing in the Presence of a Random Parameter 434
   20.16 Mathematical Notes 436
   20.17 Exercises 436

21 Multi-Hypothesis Testing 444
   21.1 Introduction 444
   21.2 The Setup 444
   21.3 Optimal Guessing 445
   21.4 Example: Multi-Hypothesis Testing for 2D Signals 450
   21.5 The Union-of-Events Bound 454
   21.6 Multi-Dimensional M-ary Gaussian Hypothesis Testing 461
   21.7 Additional Reading 467
   21.8 Exercises 467

22 Sufficient Statistics 471
   22.1 Introduction 471
   22.2 Definition and Main Consequence 472
   22.3 Equivalent Conditions 474
   22.4 Identifying Sufficient Statistics 484
   22.5 Sufficient Statistics for the M-ary Gaussian Problem 489
   22.6 Irrelevant Data 490
   22.7 Testing with Random Parameters 492
   22.8 Additional Reading 494
   22.9 Exercises 494

23 The Multivariate Gaussian Distribution 497
   23.1 Introduction 497
   23.2 Notation and Preliminaries 498
   23.3 Some Results on Matrices 500
   23.4 Random Vectors 506
   23.5 A Standard Gaussian Vector 512
   23.6 Gaussian Random Vectors 513
   23.7 Jointly Gaussian Vectors 526
   23.8 Moments and Wick’s Formula 530
   23.9 The Limit of Gaussian Vectors Is a Gaussian Vector 531
   23.10 Conditionally-Independent Gaussian Vectors 532
   23.11 Additional Reading 536
   23.12 Exercises 537

24 Complex Gaussians and Circular Symmetry 543
   24.1 Introduction 543
   24.2 Scalars 543
   24.3 Vectors 551
   24.4 Exercises 561

25 Continuous-Time Stochastic Processes 563
   25.1 Notation 563
   25.2 The Finite-Dimensional Distributions 563
   25.3 Definition of a Gaussian SP 566
   25.4 Stationary Continuous-Time Processes 567
   25.5 Stationary Gaussian Stochastic Processes 569
   25.6 Properties of the Autocovariance Function 571
   25.7 The Power Spectral Density of a Continuous-Time SP 574
   25.8 The Spectral Distribution Function 576
   25.9 The Average Power 579
   25.10 Stochastic Integrals and Linear Functionals 581
   25.11 Linear Functionals of Gaussian Processes 588
   25.12 The Joint Distribution of Linear Functionals 594
   25.13 Filtering WSS Processes 597
   25.14 The PSD Revisited 603
   25.15 White Gaussian Noise 606
   25.16 Exercises 616

26 Detection in White Gaussian Noise 623
   26.1 Introduction 623
   26.2 Setup 623
   26.3 From a Stochastic Process to a Random Vector 624
   26.4 The Random Vector of Inner Products 629
   26.5 Optimal Guessing Rule 631
   26.6 Performance Analysis 635
   26.7 The Front-End Filter 637
   26.8 Detection in Passband 640
   26.9 Some Examples 641
   26.10 Detection in Colored Gaussian Noise 655
   26.11 Multiple Antennas 665
   26.12 Detecting Signals of Infinite Bandwidth 667
   26.13 Exercises 668

27 Noncoherent Detection and Nuisance Parameters 673
   27.1 Introduction and Motivation 673
   27.2 The Setup 675
   27.3 From a SP to a Random Vector 676
   27.4 The Conditional Law of the Random Vector 678
   27.5 An Optimal Detector 681
   27.6 The Probability of Error 683
   27.7 Discussion 684
   27.8 Extension to M ≥ 2 Signals 686
   27.9 Exercises 688

28 Detecting PAM and QAM Signals in White Gaussian Noise 691
   28.1 Introduction and Setup 691
   28.2 A Random Vector and Its Conditional Law 692
   28.3 Other Optimality Criteria 694
   28.4 Consequences of Orthonormality 696
   28.5 Extension to QAM Communications 699
   28.6 Additional Reading 706
   28.7 Exercises 706

29 Linear Binary Block Codes with Antipodal Signaling 710
   29.1 Introduction and Setup 710
   29.2 The Binary Field F2 and the Vector Space F2^κ 711
   29.3 Binary Linear Encoders and Codes 714
   29.4 Binary Encoders with Antipodal Signaling 717
   29.5 Power and Operational Power Spectral Density 718
   29.6 Performance Criteria 722
   29.7 Minimizing the Block Error Rate 723
   29.8 Minimizing the Bit Error Rate 728
   29.9 Assuming the All-Zero Codeword 733
   29.10 System Parameters 737
   29.11 Hard vs. Soft Decisions 738
   29.12 The Varshamov and Singleton Bounds 738
   29.13 Additional Reading 739
   29.14 Exercises 740

30 The Radar Problem 743
   30.1 The Setup 743
   30.2 The Radar and the Knapsack Problems 749
   30.3 Pareto-Optimality and Linear Functionals 750
   30.4 One Type of Error Is Not Allowed 751
   30.5 Likelihood-Ratio Tests 754
   30.6 A Gaussian Example 762
   30.7 Detecting a Signal in White Gaussian Noise 763
   30.8 Sufficient Statistics 765
   30.9 A Noncoherent Detection Problem 766
   30.10 Randomization Is Not Needed 771
   30.11 The Big Picture 775
   30.12 Relative Entropy 779
   30.13 Additional Reading 784
   30.14 Exercises 785

31 A Glimpse at Discrete-Time Signal Processing 789
   31.1 Discrete-Time Filters 789
   31.2 Processing Discrete-Time Stochastic Processes 792
   31.3 Discrete-Time Whitening Filters 797
   31.4 Processing Discrete-Time Complex Processes 800
   31.5 Additional Reading 804
   31.6 Exercises 804

32 Intersymbol Interference 806
   32.1 The Linearly-Dispersive Channel 806
   32.2 PAM on the ISI Channel 806
   32.3 Guessing the Data Bits 810
   32.4 QAM on the ISI Channel 821
   32.5 From Passband to Baseband 826
   32.6 Additional Reading 830
   32.7 Exercises 831

A On the Fourier Series 833
   A.1 Introduction and Preliminaries 833
   A.2 Reconstruction in L1 835
   A.3 Geometric Considerations 838
   A.4 Pointwise Reconstruction 842

B On the Discrete-Time Fourier Transform 844

C Positive Definite Functions 848

D The Baseband Representation of Passband Stochastic Processes 851

Bibliography 861

Theorems Referenced by Name 867

Abbreviations 868

List of Symbols 869

Index 879

Preface to the Second Edition

Without conceding a blemish in the first edition, I think I had best come clean and admit that I embarked on a second edition largely to adopt a more geometric approach to the detection of signals in white Gaussian noise. Equally rigorous, yet more intuitive, this approach is not only student-friendly, but also extends more easily to the detection problem with random parameters and to the radar problem.

The new approach is based on the projection of white Gaussian noise onto a finite-dimensional subspace (Section 25.15.2) and on the independence of this projection and the difference between noise and projection; see Theorem 25.15.6 and Theorem 25.15.7. The latter theorem allows for a simple proof of the sufficiency of the matched-filters’ outputs without the need to define sufficient statistics for continuous-time observables. The key idea is that—while the receiver cannot recover the observable from its projection onto the subspace spanned by the mean signals—it can mimic the performance of any receiver that bases its decision on the observable using three steps (Figure 26.1 on Page 626): use local randomness to generate an independent stochastic process whose law is equal to that of the difference between the noise and its projection; add this stochastic process to the projection; and feed the result to the original receiver.

But the new geometric approach was not the only impetus for a second edition. I also wanted to increase the book’s scope. This edition contains new chapters on the radar problem (Chapter 30), the intersymbol interference (ISI) channel (Chapter 32), and on the mathematical preliminaries needed for its study (Chapter 31). The treatment of the radar problem is fairly standard with two twists: we characterize all achievable pairs of false-alarm and missed-detection probabilities (pFA, pMD) and not just those that are Pareto-optimal. Moreover, we show that when the observable has a density under both hypotheses, all achievable pairs can be achieved using deterministic decision rules.

As to ISI channels, I adopted the classic approach of matched filtering, discrete-time noise whitening, and running the Viterbi Algorithm. I only allow (bounded-input/bounded-output) stable whitening filters, i.e., filters whose impulse response is absolutely summable; others often only require that the impulse response be square summable. While my approach makes it more difficult to prove the existence of whitening filters (and I do recommend that the proof be skipped), it is conceptually much cleaner because the convolution of the noise sequence with the impulse response exists with probability one. This results in the convolution being associative, and therefore greatly simplifies the proof that no information is lost in the whitening process. It is also in line with the book’s philosophy of obtaining all results sample-wise.

The chapter on ISI channels also includes Section 32.5, which treats the detection of QAM signals where—as in most practical receivers—matched filtering is not performed at the carrier frequency but after conversion to baseband (or to an intermediate frequency). An analysis of the complex stochastic process that results when a (real) stationary Gaussian passband stochastic process is converted to baseband can be found in Appendix D.

In addition, some original chapters were expanded. New sections on the operational power spectral density (in Chapter 15) and a new section on conditionally independent Gaussians and the zeros of their precision matrix (Section 23.10) are now included.

Last but not least, I have added over a hundred new exercises. Most reinforce and test, but some present additional results. Enjoy!

Preface to the First Edition

Claude Shannon, the father of Information Theory, described the fundamental problem of point-to-point communications in his classic 1948 paper as “that of reproducing at one point either exactly or approximately a message selected at another point.” How engineers solve this problem is the subject of this book. But unlike Shannon’s general problem, where the message can be an image, a sound clip, or a movie, here we restrict ourselves to bits. We thus envision that the original message is either a binary sequence to start with, or else that it was described using bits by a device outside our control and that our job is to reproduce the describing bits with high reliability. The issue of how images or text files are converted efficiently into bits is the subject of lossy and lossless data compression and is addressed in texts on information theory and on quantization.

The engineering solutions to the point-to-point communication problem greatly depend on the available resources and on the channel between the points. They typically bring together beautiful techniques from Fourier Analysis, Hilbert Spaces, Probability Theory, and Decision Theory. The purpose of this book is to introduce the reader to these techniques and to their interplay.

The book is intended for advanced undergraduates and beginning graduate students. The key prerequisites are basic courses in Calculus, Linear Algebra, and Probability Theory. A course in Linear Systems is a plus but not a must, because all the results from Linear Systems that are needed for this book are summarized in Chapters 5 and 6. But more importantly, the book requires a certain mathematical maturity and patience, because we begin with first principles and develop the theory before discussing its engineering applications. The book is for those who appreciate the views along the way as much as getting to the destination; who like to “stop and smell the roses;” and who prefer fundamentals to acronyms. I firmly believe that those with a sound foundation can easily pick up the acronyms and learn the jargon on the job, but that once one leaves the academic environment, one rarely has the time or peace of mind to study fundamentals.

In the early stages of the planning of this book I took a decision that greatly influenced the project. I decided that every key concept should be unambiguously defined; that every key result should be stated as a mathematical theorem; and that every mathematical theorem should be correct. This, I believe, makes for a solid foundation on which one can build with confidence. But it is also a tall order. It required that I scrutinize each “classical” result before I used it in order to be sure that I knew what the needed qualifiers were, and it forced me to include background material to which the reader may have already been exposed, because I needed the results with all the fine print. Hence Chapters 5 and 6 on Linear Systems and Fourier Analysis. This is also partly the reason why the book is so long. When I started out my intention was to write a much shorter book. But I found that to do justice to the beautiful mathematics on which Digital Communications is based I had to expand the book.

Most physical-layer communication problems are at their core of a continuous-time nature. The transmitted physical waveforms are functions of time and not sequences synchronized to a clock. But most solutions first reduce the problem to a discrete-time setting and then solve the problem in the discrete-time domain. The reduction to discrete-time often requires great ingenuity, which I try to describe. It is often taken for granted in courses that open with a discrete-time model from Lecture 1. I emphasize that most communication problems are of a continuous-time nature, and that the reduction to discrete-time is not always trivial or even possible. For example, it is extremely difficult to translate a peak-power constraint (stating that at no epoch is the magnitude of the transmitted waveform allowed to exceed a given constant) to a statement about the sequence that is used to represent the waveform. Similarly, in Wireless Communications it is often very difficult to reduce the received waveform to a sequence without any loss in performance.

The quest for mathematical precision can be demanding. I have therefore tried to precede the statement of every key theorem with its gist in plain English. Instructors may well choose to present the material in class with less rigor and direct the students to the book for a more mathematical approach. I would rather have textbooks be more mathematical than the lectures than the other way round. Having a rigorous textbook allows the instructor in class to discuss the intuition knowing that the students can obtain the technical details from the book at home.

The communication problem comes with a beautiful geometric picture that I try to emphasize. To appreciate this picture one needs the definition of the inner product between energy-limited signals and some of the geometry of the space of energy-limited signals. These are therefore introduced early on in Chapters 3 and 4. Chapters 5 and 6 cover standard material from Linear Systems. But note the early introduction of the matched filter as a mechanism for computing inner products in Section 5.8. Also key is Parseval’s Theorem in Section 6.2.2 which relates the geometric pictures in the time domain and in the frequency domain.

Chapter 7 deals with passband signals and their baseband representation. We emphasize how the inner product between passband signals is related to the inner product between their baseband representations. This elegant geometric relationship is often lost in the haze of various trigonometric identities. While this topic is important in wireless applications, it is not always taught in a first course in Digital Communications. Instructors who prefer to discuss baseband communication only can skip Chapters 7, 9, 16, 17, 18, 24, 27, and Sections 26.8, 28.5, 30.9, 31.4, 32.4, 32.5. But it would be a shame.

Chapter 8 presents the celebrated Sampling Theorem from a geometric perspective. It is inessential to the rest of the book but is a striking example of the geometric approach. Chapter 9 discusses the Sampling Theorem for passband signals.

Chapter 10 discusses modulation. I have tried to motivate Linear Modulation and Pulse Amplitude Modulation and to minimize the use of the “that’s just how it is done” argument. The use of the Matched Filter for detecting (here in the absence of noise) is emphasized. This also motivates the Nyquist Theory, which is treated in Chapter 11. I stress that the motivation for the Nyquist Theory is not to avoid intersymbol interference at the sampling points but rather to guarantee the orthogonality of the time shifts of the pulse shape by integer multiples of the baud period. This ultimately makes more engineering sense and leads to cleaner mathematics: compare Theorem 11.3.2 with its corollary, Corollary 11.3.4.

The result of modulating random bits is a stochastic process, a concept which is first encountered in Chapter 10; formally defined in Chapter 12; and revisited in Chapters 13, 17, 25, and 31. It is an important concept in Digital Communications, and I find it best to first introduce man-made synthesized stochastic processes (as the waveforms produced by an encoder when fed random bits) and only later to introduce the nature-made stochastic processes that model noise. Stationary discrete-time stochastic processes are introduced in Chapter 13 and their complex counterparts in Chapter 17. These are needed for the analysis in Chapter 14 of the power in Pulse Amplitude Modulation and for the analysis in Chapter 18 of the power in Quadrature Amplitude Modulation. They are revisited in Chapter 31, which presents additional results that are needed in the study of intersymbol interference channels (Chapter 32).

I emphasize that power is a physical quantity that is related to the time-averaged energy in the continuous-time transmitted waveform. Its relation to the power in the discrete-time modulating sequence is a nontrivial result. In deriving this relation I refrain from adding random timing jitters that are often poorly motivated and that turn out to be unnecessary. (The transmitted power does not depend on the realization of the fictitious jitter.) The Power Spectral Density in Pulse Amplitude Modulation and Quadrature Amplitude Modulation is discussed in Chapters 15 and 18. The discussion requires a definition for Power Spectral Density for nonstationary processes (Definitions 15.3.1 and 18.4.1) and a proof that this definition coincides with the classical definition when the process is wide-sense stationary (Theorem 25.14.3).

Chapter 19 opens the second part of the book, which deals with noise and detection. It introduces the univariate Gaussian distribution and some related distributions. The principles of Detection Theory are presented in Chapters 20–22. I emphasize the notion of Sufficient Statistics, which is central to Detection Theory. Building on Chapter 19, Chapter 23 introduces the all-important multivariate Gaussian distribution. Chapter 24 treats the complex case.

Chapter 25 deals with continuous-time stochastic processes with an emphasis on stationary Gaussian processes, which are often used to model the noise in Digital Communications. This chapter also introduces white Gaussian noise. My approach to this topic is perhaps new and is probably where this text differs the most from other textbooks on the subject.

I define white Gaussian noise of double-sided power spectral density N0/2 with respect to the bandwidth W as any measurable,¹ stationary, Gaussian stochastic process whose power spectral density is a nonnegative, symmetric, integrable function of frequency that is equal to N0/2 at all frequencies f satisfying |f| ≤ W. The power spectral density at other frequencies can be arbitrary. An example of the power spectral density of such a process is depicted in Figure 1. Adopting this definition has a number of advantages. The first is, of course, that such processes exist. One need not discuss “generalized processes,” Gaussian processes with infinite variances (that, by definition, do not exist), or introduce the Itô calculus to study stochastic integrals. (Stochastic integrals with respect to the Brownian motion are mathematically intricate and physically unappealing. The idea of the noise having infinite power is ludicrous.) The above definition also frees me from discussing Dirac’s Delta, and, in fact, Dirac’s Delta is never used in this book. (A rigorous treatment of Generalized Functions is beyond the engineering curriculum in most schools, so using Dirac’s Delta always gives the reader the unsettling feeling of being on unsure footing.)

Figure 1: The power spectral density SNN(f) of a white Gaussian noise process of double-sided power spectral density N0/2 with respect to the bandwidth W.

¹This book does not assume any Measure Theory and does not teach any Measure Theory. (I do define sets of Lebesgue measure zero in order to be able to state uniqueness theorems.) I use Measure Theory only in stating theorems that require measurability assumptions. This is in line with my attempt to state theorems together with all the assumptions that are required for their validity. I recommend that students ignore measurability issues and just make a mental note that whenever measurability is mentioned there is a minor technical condition lurking in the background.
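To make the definition concrete, the following numerical sketch (ours, not the book’s construction) simulates a discrete-time stand-in for such a process: sampling at rate 2W with i.i.d. Gaussian samples of variance N0·W yields a periodogram estimate of about N0/2 across the band. The values of N0 and W are arbitrary example choices.

    import numpy as np
    N0, W = 2.0, 100.0          # assumed example values
    fs = 2 * W                  # sampling rate matched to the bandwidth W
    n = 2**16
    rng = np.random.default_rng(0)
    x = rng.normal(scale=np.sqrt(N0 * W), size=n)  # i.i.d. samples, variance N0*W
    X = np.fft.rfft(x)
    psd = (np.abs(X) ** 2) / (n * fs)   # periodogram estimate of the (two-sided) PSD
    print(psd.mean())                   # close to N0/2 = 1.0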

The detection problem in white Gaussian noise is treated in Chapter 26. No course in Digital Communications should end without Theorem 26.3.1. Roughly speaking, this theorem states that if the mean-signals are bandlimited to W Hz and if the noise is white Gaussian noise with respect to the bandwidth W, then there is no loss of optimality in basing our guess on the projection of the received waveform onto the subspace spanned by the mean-signals. Numerous examples as well as a treatment of colored noise are also discussed in this chapter. Extensions to noncoherent detection are addressed in Chapter 27 and implications for Pulse Amplitude Modulation and for Quadrature Amplitude Modulation in Chapter 28.

Coding is introduced in Chapter 29. It emphasizes how the code design influences the transmitted power, the transmitted power spectral density, the required bandwidth, and the probability of error. The construction of good codes is left to texts on Coding Theory.

Motivated by the radar problem, Chapter 30 introduces the Neyman-Pearson theory of hypothesis testing as well as the Kullback-Leibler divergence. And after some mathematical preliminaries in Chapter 31, the book concludes with Chapter 32, which introduces the intersymbol interference channel and the Viterbi Algorithm.

    Basic Latin

Mathematics sometimes reads like a foreign language. I therefore include here a short glossary for such terms as “i.e.,” “that is,” “in particular,” “a fortiori,” “for example,” and “e.g.,” whose meaning in Mathematics is slightly different from the definition you will find in your English dictionary. In mathematical contexts these terms are actually logical statements that the reader should verify. Verifying these statements is an important way to make sure that you understand the math.

What are these logical statements? First note the synonym “i.e.” = “that is” and the synonym “e.g.” = “for example.” Next note that the term “that is” often indicates that the statement following the term is equivalent to the one preceding it: “We next show that p is a prime, i.e., that p is a positive integer larger than one that is not divisible by any positive integer other than one and itself.” The terms “in particular” or “a fortiori” indicate that the statement following them is implied by the one preceding them: “Since g(·) is differentiable and, a fortiori, continuous, it follows from the Mean Value Theorem that the integral of g(·) over the interval [0, 1] is equal to g(ξ) for some ξ ∈ [0, 1].” The term “for example” can have its regular day-to-day meaning but in mathematical writing it also sometimes indicates that the statement following it implies the one preceding it: “Suppose that the function g(·) is monotonically nondecreasing, e.g., that it is differentiable with a nonnegative derivative.”

Another important word to look out for is “indeed,” which in this book typically signifies that the statement just made is about to be expanded upon and explained. So when you read something that is unclear to you, be sure to check whether the next sentence begins with the word “indeed” before you panic.

The Latin phrases “a priori” and “a posteriori” show up in Probability Theory. The former is usually associated with the unconditional probability of an event and the latter with the conditional. Thus, the “a priori” probability that the sun will shine this Sunday in Zurich is 25%, but now that I know that it is raining today, my outlook on life changes and I assign this event the a posteriori probability of 15%.

The phrase “prima facie” is roughly equivalent to the phrase “before any further mathematical arguments have been presented.” For example, the definition of the projection of a signal v onto the signal u as the vector w that is collinear with u and for which v − w is orthogonal to u, may be followed by the sentence: “Prima facie, it is not clear that the projection always exists and that it is unique. Nevertheless, as we next show, this is the case.”


    Syllabuses or Syllabi

The book can be used as a textbook for a number of different courses. For a course that focuses on deterministic signals one could use Chapters 1–9 and Chapter 11. A course that covers Stochastic Processes and Detection Theory could be based on Chapter 12, Chapters 19–26, and Chapter 30 with or without discrete-time stochastic processes (Chapters 13 and 31) and with or without complex random variables and processes (Chapters 17 and 24).

For a course on Digital Communications one could use the entire book or, if time does not permit it, discuss only baseband communication. In the latter case one could omit Chapters 7, 9, 16, 17, 18, 24, 27, and Sections 26.8, 28.5, 30.9, 31.4, 32.4, 32.5.

The dependencies between the chapters are depicted on Page xxiv. A simpler chart pertaining only to baseband communication can be found on Page xxv.

    The book’s web page is

    www.afidc.ethz.ch

A Dependency Diagram.

A Dependency Diagram for Baseband Communications.

Acknowledgments for the Second Edition

I received help from numerous people in numerous ways. Some pointed out typos in the first edition, some offered comments on the new material in the second edition, and some suggested topics that ended up as new exercises. My thanks go to all of them: the anonymous readers who reported typos through the book’s web page, Céline Aubel, Annina Bracher, Helmut Bölcskei, Christoph Bunte, Paul Cuff, Samuel Gaehwiler, Johannes Huber, Tobias Koch, Gernot Kubin, Hans-Andrea Loeliger, Mehdi Molkaraie, Stefan Moser, Götz Pfander, Christoph Pfister, Qiuting Huang, Bixio Rimoldi, Igal Sason, Alain-Sol Sznitman, Emre Telatar, and Markos Troulis. Finally, I thank my wife, Danielle Lapidoth-Berger, for her encouragement and willingness to go through this all over again.


Acknowledgments for the First Edition

This book has a long history. Its origins are in a course entitled “Introduction to Digital Communication” that Bob Gallager and I developed at the Massachusetts Institute of Technology (MIT) in the years 1997 (course number 6.917) and 1998 (course number 6.401). Assisting us in these courses were Emre Koksal and Poompat Saengudomlert (Tengo) respectively. The course was first conceived as an advanced undergraduate course, but at MIT it has since evolved into a first-year graduate course leading to the publication of the textbook (Gallager, 2008). At ETH the course is still an advanced undergraduate course, and the lecture notes evolved into the present book. Assisting me at ETH were my former and current Ph.D. students Stefan Moser, Daniel Hösli, Natalia Miliou, Stephan Tinguely, Tobias Koch, Michèle Wigger, and Ligong Wang. I thank them all for their enormous help. Marion Brändle was also a great help.

I also thank Bixio Rimoldi for his comments on an earlier draft of this book, from which he taught at École Polytechnique Fédérale de Lausanne (EPFL), and Thomas Mittelholzer, who used a draft of this book to teach a course at ETH during my sabbatical.

Extremely helpful were discussions with Amir Dembo, Sanjoy Mitter, Alain-Sol Sznitman, and Ofer Zeitouni about some of the more mathematical aspects of this book. Discussions with Ezio Biglieri, Holger Boche, Stephen Boyd, Young-Han Kim, and Sergio Verdú are also gratefully acknowledged.

Special thanks are due to Bob Gallager and Dave Forney with whom I had endless discussions about the material in this book both while at MIT and afterwards at ETH. Their ideas have greatly influenced my thinking about how this course should be taught.

I thank Helmut Bölcskei, Andi Loeliger, and Nikolai Nefedov for having tolerated my endless ramblings regarding Digital Communications during our daily lunches. Jim Massey was a huge help in patiently answering my questions regarding English usage. I should have asked him much more!

A number of dear colleagues read parts of this manuscript. Their comments were extremely useful. These include Helmut Bölcskei, Moritz Borgmann, Samuel Braendle, Shraga Bross, Giuseppe Durisi, Yariv Ephraim, Minnie Ho, Young-Han Kim, Yiannis Kontoyiannis, Nick Laneman, Venya Morgenshtern, Prakash Narayan, Igal Sason, Brooke Shrader, Aslan Tchamkerten, Sergio Verdú, Pascal Vontobel, and Ofer Zeitouni. I am especially indebted to Emre Telatar for his enormous help in all aspects of this project.



I would like to express my sincere gratitude to the Rockefeller Foundation at whose Study and Conference Center in Bellagio, Italy, this all began.

Finally, I thank my wife, Danielle, for her encouragement, her tireless editing, and for making it possible for me to complete this project.

Chapter 1

    Some Essential Notation

Reading a whole chapter about notation can be boring. We have thus chosen to collect here only the essentials and to introduce the rest when it is first used. The “List of Symbols” on Page 869 is more comprehensive.

We denote the set of complex numbers by C, the set of real numbers by R, the set of integers by Z, and the set of natural numbers (positive integers) by N. Thus,

    N = {n ∈ Z : n ≥ 1}.

The above equation is not meant to belabor the point. We use it to introduce the notation

    {x ∈ A : statement}

for the set consisting of all those elements of the set A for which “statement” holds. In treating real numbers, we use the notation (a, b), [a, b), [a, b], (a, b] to denote open, half open on the right, closed, and half open on the left intervals of the real line. Thus, for example,

    [a, b) = {x ∈ R : a ≤ x < b}.
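As an illustration (ours, not the book’s), the set-builder notation and interval membership translate directly into code; the set A and the sample points below are arbitrary example choices.

    # {n ∈ Z : n >= 1} restricted to a small range of integers A
    A = range(-3, 8)
    subset = {x for x in A if x >= 1}
    print(sorted(subset))                  # [1, 2, 3, 4, 5, 6, 7]

    def in_half_open(x, a, b):
        return a <= x < b                  # membership in [a, b)

    print(in_half_open(2.0, 2.0, 5.0))     # True: a is included
    print(in_half_open(5.0, 2.0, 5.0))     # False: b is excluded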

A statement followed by a comma and a condition indicates that the statement holds whenever the condition is satisfied. For example,

|an − a| < ε,  n ≥ n0

means that |an − a| < ε whenever n ≥ n0.

We use I{statement} to denote the indicator of the statement. It is equal to 1, if the statement is true, and it is equal to 0, if the statement is false. Thus

I{statement} = { 1 if statement is true,
                 0 if statement is false.
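The indicator notation maps cleanly onto code. Here is a minimal sketch (our example, not the book’s), using the fact that a boolean is either true or false:

    def indicator(statement: bool) -> int:
        # I{statement}: 1 if the statement is true, 0 if it is false
        return 1 if statement else 0

    print(indicator(3 > 2))   # 1, since the statement is true
    print(indicator(0 >= 1))  # 0, since the statement is false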

In dealing with complex numbers we use i to denote the purely imaginary complex number whose imaginary part is one:

i = √−1.



We use z∗ to denote the complex conjugate of z, we use Re(z) to denote the real part of z, we use Im(z) to denote the imaginary part of z, and we use |z| to denote the absolute value (or “modulus”, or “complex magnitude”) of z. Thus, if z = a + ib, where a, b ∈ R, then z∗ = a − ib, Re(z) = a, Im(z) = b, and |z| = √(a² + b²).
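These operations can be checked with any language’s complex arithmetic. In the sketch below (ours), 1j plays the role of i, and z = 3 + 4i is an arbitrary example:

    z = 3 + 4j
    print(z.conjugate())   # (3-4j), i.e., z* = a - ib
    print(z.real, z.imag)  # 3.0 4.0, i.e., Re(z) and Im(z)
    print(abs(z))          # 5.0, i.e., |z| = sqrt(a^2 + b^2)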

The notation used to define functions is extremely important and is, alas, sometimes confusing to students, so please pay attention. A function or a mapping associates with each element in its domain a unique element in its codomain. If a function has a name, the name is often written in bold as in u.¹ Alternatively, we sometimes denote a function u by u(·). The notation

u : A → B

indicates that u is a function of domain A and codomain B. The rule specifying for each element of the domain the element in the codomain to which it is mapped is often written to the right or underneath. Thus, for example,

u : R → (−5, ∞), t ↦ t²

indicates that the domain of the function u is the reals, that its codomain is the set of real numbers that exceed −5, and that u associates with t the nonnegative number t². We write u(t) for the result of applying the mapping u to t. The range of a mapping u : A → B is the set of all elements of the codomain B to which at least one element in the domain is mapped by u:

range of (u : A → B) = {u(x) : x ∈ A}.  (1.1)

The range of a mapping is a subset of its codomain. In the above example, the range of the mapping is the set of nonnegative reals [0, ∞). A mapping u : A → B is said to be onto (or surjective) if its range is equal to its codomain. Thus, u : A → B is onto if, and only if, for every y ∈ B there corresponds some x ∈ A (not necessarily unique) such that u(x) = y. If the range of g(·) is a subset of the domain of h(·), then the composition of g(·) and h(·) is the mapping x ↦ h(g(x)), which is denoted by h ◦ g. A function u : A → B is said to be one-to-one (or injective) if different elements of A are mapped to different elements of B, i.e., if x1 ≠ x2 implies that u(x1) ≠ u(x2) (whenever x1, x2 ∈ A).

Sometimes we do not specify the domain and codomain of a function if they are clear from the context. Thus, we might write u : t ↦ v(t) cos(2πfct) without making explicit what the domain and codomain of u are. In fact, if there is no need to give a function a name, then we will not. For example, we might write t ↦ v(t) cos(2πfct) to designate the unnamed function that maps t to v(t) cos(2πfct). (Here v(·) is some other function, which was presumably defined before.)

If the domain of a function u is R and if the codomain is R, then we sometimes say that u is a real-valued signal or a real signal, especially if the argument of u stands for time. Similarly we shall sometimes refer to a function u : R → C as a complex-valued signal or a complex signal. If we refer to u as a signal, then the question whether it is complex-valued or real-valued should be clear from the context, or else immaterial to the claim.

¹But some special functions such as the self-similarity function Rgg, the autocovariance function KXX, and the power spectral density SXX, which will be introduced in later chapters, are not in boldface.
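As a sketch of this notation in code (our example, not the book’s), the mapping t ↦ t², the composition h ◦ g, and the distinction between the function u and the number u(t) look as follows:

    def u(t: float) -> float:
        return t ** 2              # codomain (-5, inf); range is [0, inf)

    def compose(h, g):
        return lambda x: h(g(x))   # x ↦ h(g(x)), written h ◦ g

    g = lambda x: x + 1
    h = lambda x: 2 * x
    print(u(3.0))                  # 9.0 — u(t) is a number, not a function
    print(compose(h, g)(5))        # 12 = h(g(5))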

We caution the reader that, while u and u(·) denote functions, u(t) denotes the result of applying u to t. If u is a real-valued signal then u(t) is a real number!

Given two signals u and v we define their superposition or sum as the signal t ↦ u(t) + v(t). We denote this signal by u + v. Also, if α ∈ C and u is any signal, then we define the amplification of u by α as the signal t ↦ αu(t). We denote this signal by αu. Thus,

αu + βv

is the signal t ↦ αu(t) + βv(t).

We refer to the function that maps every element in its domain to zero as the all-zero function and we denote it by 0. The all-zero signal 0 maps every t ∈ R to zero. If x : R → C is a signal that maps every t ∈ R to x(t), then its reflection or mirror image is denoted by ~x and is the signal that is defined by

    ~x : t ↦ x(−t).
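Viewing signals as mappings suggests a natural computer representation. The following Python sketch (ours; the helper names superpose, amplify, and reflect are our own) models a signal as a plain function and implements the three operations just defined:

```python
import math

# A signal is modeled as a function from R (a float t) to C (a complex value).
def superpose(u, v):
    """The sum u + v: t -> u(t) + v(t)."""
    return lambda t: u(t) + v(t)

def amplify(alpha, u):
    """The amplification of u by alpha: t -> alpha * u(t)."""
    return lambda t: alpha * u(t)

def reflect(x):
    """The mirror image ~x: t -> x(-t)."""
    return lambda t: x(-t)

u = lambda t: math.cos(2 * math.pi * t)
v = lambda t: math.sin(2 * math.pi * t)
w = superpose(amplify(2.0, u), v)   # the signal t -> 2u(t) + v(t)
assert abs(w(0.25) - (2 * u(0.25) + v(0.25))) < 1e-12
assert reflect(v)(0.1) == v(-0.1)
```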

    Dirac’s Delta (which will hardly be mentioned in this book) is not a function.

A probability space is defined as a triple (Ω, F, P), where the set Ω is the set of experiment outcomes, the elements of the set F are subsets of Ω and are called events, and where P : F → [0, 1] assigns probabilities to the various events. It is assumed that F forms a σ-algebra, i.e., that Ω ∈ F; that if a set is in F then so is its complement (with respect to Ω); and that every finite or countable union of elements of F is also an element of F. A random variable (RV) X is a mapping from Ω to R that satisfies the technical condition that

    {ω ∈ Ω : X(ω) ≤ ξ} ∈ F , ξ ∈ R. (1.2)

This condition guarantees that it is always meaningful to evaluate the probability that the value of X is smaller than or equal to ξ.
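For a finite set of experiment outcomes one can take F to be the collection of all subsets of Ω, in which case condition (1.2) holds automatically. The sketch below (our illustration, with a fair die as the example) makes the triple (Ω, F, P) and the condition concrete:

```python
from fractions import Fraction
from itertools import chain, combinations

# A fair die: Omega is finite, F is the set of all subsets of Omega, P counts.
omega = {1, 2, 3, 4, 5, 6}
events = [set(s) for s in chain.from_iterable(
    combinations(sorted(omega), r) for r in range(len(omega) + 1))]
P = lambda A: Fraction(len(A), len(omega))

X = lambda w: float(w)          # a random variable: a map from Omega to R
# Condition (1.2): {w : X(w) <= xi} is an event; with F = 2^Omega it always is,
# and Pr[X <= xi] is therefore well defined:
assert {w for w in omega if X(w) <= 3.5} in events
cdf = lambda xi: P({w for w in omega if X(w) <= xi})
assert cdf(3.5) == Fraction(1, 2)
```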

Chapter 2

    Signals, Integrals, and Sets of Measure Zero

    2.1 Introduction

The purpose of this chapter is not to develop the Lebesgue theory of integration. Mastering this theory is not essential to understanding Digital Communications. But some concepts from this theory are needed in order to state the main results of Digital Communications in a mathematically rigorous way. In this chapter we introduce these required concepts and provide references to the mathematical literature that develops them.

The less mathematically-inclined may gloss over most of this chapter. Readers who interpret the integrals in this book as Riemann integrals; who interpret “measurable” as “satisfying a minor mathematical restriction”; who interpret “a set of Lebesgue measure zero” as “a set that is so small that integrals of functions are insensitive to the values the integrand takes in this set”; and who swap orders of summations, expectations and integrations fearlessly will not miss any engineering insights.

But all readers should pay attention to the way the integral of complex-valued signals is defined (Section 2.3); to the basic inequality (2.13); and to the notation introduced in (2.6).

    2.2 Integrals

Recall that a real-valued signal u is a function u : R → R. The integral of u is denoted by

    ∫_{−∞}^{∞} u(t) dt.    (2.1)

For (2.1) to be meaningful some technical conditions must be met. (You may recall from your calculus studies, for example, that not every function is Riemann integrable.) In this book all integrals will be understood to be Lebesgue integrals, but nothing essential will be lost on readers who interpret them as Riemann integrals. For the Lebesgue integral to be defined the integrand u must be a Lebesgue measurable function. Again, do not worry if you have not studied the Lebesgue integral or the notion of measurable functions. We point this out merely to cover ourselves when we state various theorems. Also, for the integral in (2.1) to be defined we insist that

    ∫_{−∞}^{∞} |u(t)| dt

be finite.

2.3 Integrating Complex-Valued Signals


Before summarizing the key properties of the integral of complex signals we remind the reader that if u and v are complex signals and if α, β are complex numbers, then the complex signal αu + βv is defined as the complex signal t ↦ αu(t) + βv(t). The intuition for the following proposition comes from thinking about the integrals as Riemann integrals, which can be approximated by finite sums, and then invoking the analogous results about finite sums.

Proposition 2.3.1 (Properties of Complex Integrals). Let the complex signals u, v be in L1, and let α, β be arbitrary complex numbers.

(i) Integration is linear in the sense that αu + βv ∈ L1 and

    ∫_{−∞}^{∞} (αu(t) + βv(t)) dt = α ∫_{−∞}^{∞} u(t) dt + β ∫_{−∞}^{∞} v(t) dt.    (2.7)

(ii) Integration commutes with complex conjugation:

    ∫_{−∞}^{∞} u∗(t) dt = (∫_{−∞}^{∞} u(t) dt)∗.    (2.8)

(iii) Integration commutes with the operation of taking the real part:

    Re(∫_{−∞}^{∞} u(t) dt) = ∫_{−∞}^{∞} Re(u(t)) dt.    (2.9)

(iv) Integration commutes with the operation of taking the imaginary part:

    Im(∫_{−∞}^{∞} u(t) dt) = ∫_{−∞}^{∞} Im(u(t)) dt.    (2.10)

Proof. For a proof of (i) see, for example, (Rudin, 1987, Theorem 1.32). The rest of the claims follow easily from the definition of the integral of a complex-valued signal (2.3).
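A quick numerical check of (2.8)–(2.10) can be reassuring. The sketch below (ours) approximates the integrals by Riemann sums on a fine grid for one integrable complex signal; the grid and the signal are arbitrary choices:

```python
import numpy as np

t = np.linspace(-10.0, 10.0, 200_001)
dt = t[1] - t[0]
u = np.exp(-t**2) * np.exp(1j * t)     # an integrable complex signal

I = np.sum(u) * dt                     # Riemann-sum approximation of its integral
assert abs(np.sum(np.conj(u)) * dt - np.conj(I)) < 1e-12   # (2.8)
assert abs(np.sum(u.real) * dt - I.real) < 1e-12           # (2.9)
assert abs(np.sum(u.imag) * dt - I.imag) < 1e-12           # (2.10)
```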

    2.4 An Inequality for Integrals

Probably the most important inequality for complex numbers is the Triangle Inequality for Complex Numbers

    |w + z| ≤ |w| + |z|,  w, z ∈ C.    (2.11)

This inequality extends by induction to finite sums:

    |∑_{j=1}^{n} z_j| ≤ ∑_{j=1}^{n} |z_j|,  z₁, …, z_n ∈ C.    (2.12)

    The extension to integrals is the most important inequality for integrals:


Proposition 2.4.1. For every complex-valued or real-valued signal u in L1,

    |∫_{−∞}^{∞} u(t) dt| ≤ ∫_{−∞}^{∞} |u(t)| dt.    (2.13)

Proof. See, for example, (Rudin, 1987, Theorem 1.33).

Note that in (2.13) we should interpret |·| as the absolute-value function if u is a real signal, and as the modulus function if u is a complex signal.

    Another simple but useful inequality is

    ‖u + v‖₁ ≤ ‖u‖₁ + ‖v‖₁,  u, v ∈ L1,    (2.14)

    which can be proved using the calculation

    ‖u + v‖₁ = ∫_{−∞}^{∞} |u(t) + v(t)| dt
             ≤ ∫_{−∞}^{∞} (|u(t)| + |v(t)|) dt
             = ∫_{−∞}^{∞} |u(t)| dt + ∫_{−∞}^{∞} |v(t)| dt
             = ‖u‖₁ + ‖v‖₁,

where the inequality follows by applying the Triangle Inequality for Complex Numbers (2.11) with the substitution of u(t) for w and v(t) for z.
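The inequalities (2.13) and (2.14) can likewise be checked numerically. In the sketch below (ours), the signals and the tolerance are arbitrary choices:

```python
import numpy as np

t = np.linspace(-10.0, 10.0, 200_001)
dt = t[1] - t[0]
u = np.exp(-t**2) * np.exp(2j * np.pi * t)   # integrable complex signals
v = np.exp(-np.abs(t)) * (1 + 1j)

assert abs(np.sum(u) * dt) <= np.sum(np.abs(u)) * dt + 1e-12   # (2.13)

norm1 = lambda x: np.sum(np.abs(x)) * dt                       # Riemann-sum ||.||_1
assert norm1(u + v) <= norm1(u) + norm1(v) + 1e-12             # (2.14)
```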

    2.5 Sets of Lebesgue Measure Zero

It is one of life’s minor grievances that the integral of a nonnegative function can be zero even if the function is not identically zero. For example, t ↦ I{t = 17} is a nonnegative function whose integral is zero and which is nonetheless not identically zero (it maps 17 to one). In this section we shall derive a necessary and sufficient condition for the integral of a nonnegative function to be zero. This condition will allow us later to state conditions under which various integral inequalities hold with equality. It will give mathematical meaning to the physical intuition that if the waveform describing some physical phenomenon (such as voltage over a resistor) is nonnegative and integrates to zero then “for all practical purposes” the waveform is zero.

We shall define sets of Lebesgue measure zero and then show that a nonnegative function u : R → [0, ∞) integrates to zero if, and only if, the set {t ∈ R : u(t) > 0} is of Lebesgue measure zero. Thus, whether or not a nonnegative function integrates to zero depends on the set over which it takes on positive values and not on the actual values. We shall then introduce the notation u ≡ v to indicate that the set {t ∈ R : u(t) ≠ v(t)} is of Lebesgue measure zero.

It should be noted that since the integral is unaltered when the integrand is changed at a finite (or countable) number of points, it follows that any nonnegative function that is zero except at a countable number of points integrates to zero. The reverse, however, is not true. One can find nonnegative functions that integrate to zero and that are nonzero on an uncountable set of points.

The less mathematically inclined readers may skip the mathematical definition of sets of measure zero and just think of a subset of the real line as being of Lebesgue measure zero if it is so “small” that the integral of any function is unaltered when the values it takes in the subset are altered. Such readers should then think of the statement u ≡ v as indicating that u − v is just the result of altering the all-zero signal 0 on a set of Lebesgue measure zero and that, consequently,

    ∫_{−∞}^{∞} |u(t) − v(t)| dt = 0.

Definition 2.5.1 (Sets of Lebesgue Measure Zero). We say that a subset N of the real line R is a set of Lebesgue measure zero (or a Lebesgue null set) if for every ε > 0 we can find a sequence of intervals [a₁, b₁], [a₂, b₂], … such that the total length of the intervals is smaller than or equal to ε

    ∑_{j=1}^{∞} (b_j − a_j) ≤ ε    (2.15a)

and such that the union of the intervals covers N

    N ⊆ [a₁, b₁] ∪ [a₂, b₂] ∪ ⋯.    (2.15b)

As an example, note that the set {1} is of Lebesgue measure zero. Indeed, it is covered by the single interval [1 − ε/2, 1 + ε/2] whose length is ε. Similarly, any finite set is of Lebesgue measure zero. Indeed, the set {α₁, …, α_n} can be covered by n intervals of total length not exceeding ε as follows:

    {α₁, …, α_n} ⊂ [α₁ − ε/(2n), α₁ + ε/(2n)] ∪ ⋯ ∪ [α_n − ε/(2n), α_n + ε/(2n)].

This argument can also be extended to show that any countable set is of Lebesgue measure zero. Indeed, the countable set {α₁, α₂, …} can be covered as

    {α₁, α₂, …} ⊆ ⋃_{j=1}^{∞} [α_j − 2^{−j−1}ε, α_j + 2^{−j−1}ε],

where we note that the length of the interval [α_j − 2^{−j−1}ε, α_j + 2^{−j−1}ε] is 2^{−j}ε, which when summed over j yields ε.
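The covering construction is concrete enough to program. The sketch below (ours) builds the intervals for the countable set α_j = 1/j, truncating the enumeration for the demonstration, and confirms that every point is covered while the total length stays below ε:

```python
eps = 0.01
J = 20                                        # truncate the enumeration for the demo
alphas = [1.0 / j for j in range(1, J + 1)]   # a countable set: 1, 1/2, 1/3, ...

# The interval around alpha_j has length 2^-j * eps, as in the text.
intervals = [(x - 2.0**(-j - 1) * eps, x + 2.0**(-j - 1) * eps)
             for j, x in enumerate(alphas, start=1)]

total_length = sum(b - a for (a, b) in intervals)   # sum_j 2^-j * eps < eps
assert total_length <= eps
assert all(any(a <= x <= b for (a, b) in intervals) for x in alphas)
```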

With a similar argument one can show that the union of a countable number of sets of Lebesgue measure zero is of Lebesgue measure zero.

The above examples notwithstanding, it should be emphasized that there exist sets of Lebesgue measure zero that are not countable.¹ Thus, the concept of a set of Lebesgue measure zero is different from the concept of a countable set.

¹For example, the Cantor set is of Lebesgue measure zero and uncountable; see (Rudin, 1976, Section 11.11, Remark (f), p. 309).


Loosely speaking, we say that two signals are indistinguishable if they agree except possibly on a set of Lebesgue measure zero. We warn the reader, however, that this terminology is not standard.

Definition 2.5.2 (Indistinguishable Functions). We say that the Lebesgue measurable functions u, v from R to C (or to R) are indistinguishable and write

    u ≡ v

if the set {t ∈ R : u(t) ≠ v(t)} is of Lebesgue measure zero.

Note that u ≡ v if, and only if, the signal u − v is indistinguishable from the all-zero signal 0:

    (u ≡ v) ⇐⇒ (u − v ≡ 0).    (2.16)

    The main result of this section is the following:

    Proposition 2.5.3.

(i) A nonnegative Lebesgue measurable signal integrates to zero if, and only if, it is indistinguishable from the all-zero signal 0.

(ii) If u, v are Lebesgue measurable functions from R to C (or to R), then

    (∫_{−∞}^{∞} |u(t) − v(t)| dt = 0) ⇐⇒ (u ≡ v)    (2.17)

and

    (∫_{−∞}^{∞} |u(t) − v(t)|² dt = 0) ⇐⇒ (u ≡ v).    (2.18)

(iii) If u and v are integrable and indistinguishable, then their integrals are equal:

    (u ≡ v) =⇒ (∫_{−∞}^{∞} u(t) dt = ∫_{−∞}^{∞} v(t) dt),  u, v ∈ L1.    (2.19)

Proof. The proof of (i) is not very difficult, but it requires more familiarity with Measure Theory than we are willing to assume. The interested reader is thus referred to (Rudin, 1987, Theorem 1.39).

The equivalence in (2.17) follows by applying Part (i) to the nonnegative function t ↦ |u(t) − v(t)|. Similarly, (2.18) follows by applying Part (i) to the nonnegative function t ↦ |u(t) − v(t)|² and by noting that the set of t’s for which |u(t) − v(t)|² ≠ 0 is the same as the set of t’s for which u(t) ≠ v(t).

Part (iii) follows from (2.17) by noting that

    |∫_{−∞}^{∞} u(t) dt − ∫_{−∞}^{∞} v(t) dt| = |∫_{−∞}^{∞} (u(t) − v(t)) dt|
                                              ≤ ∫_{−∞}^{∞} |u(t) − v(t)| dt,

where the first equality follows by the linearity of integration, and where the subsequent inequality follows from Proposition 2.4.1.


    2.6 Swapping Integration, Summation, and Expectation

In numerous places in this text we shall swap the order of integration as in

    ∫_{−∞}^{∞} (∫_{−∞}^{∞} u(α, β) dα) dβ = ∫_{−∞}^{∞} (∫_{−∞}^{∞} u(α, β) dβ) dα    (2.20)

or the order of summation as in

    ∑_{ν=1}^{∞} (∑_{η=1}^{∞} a_{ν,η}) = ∑_{η=1}^{∞} (∑_{ν=1}^{∞} a_{ν,η})    (2.21)

or the order of summation and integration as in

    ∫_{−∞}^{∞} (∑_{ν=1}^{∞} a_ν u_ν(t)) dt = ∑_{ν=1}^{∞} (a_ν ∫_{−∞}^{∞} u_ν(t) dt)    (2.22)

or the order of integration and expectation as in

    E[∫_{−∞}^{∞} X u(t) dt] = ∫_{−∞}^{∞} E[X u(t)] dt = E[X] ∫_{−∞}^{∞} u(t) dt.

These changes of order are usually justified using Fubini’s Theorem, which states that these changes of order are permissible provided that a very technical measurability condition is satisfied and that, in addition, either the integrand is nonnegative or in some order (and hence in all orders) the integral/summation/expectation of the absolute value of the integrand is finite.

For example, to justify (2.20) it suffices to verify that the function u : R² → R in (2.20) is Lebesgue measurable and that, in addition, it is either nonnegative or

    ∫_{−∞}^{∞} (∫_{−∞}^{∞} |u(α, β)| dα) dβ

is finite.

Similarly, to justify the swap in (2.22) it suffices that the signals u_ν be Lebesgue measurable and that either the terms a_ν u_ν(t) all be nonnegative or

    ∑_{ν=1}^{∞} |a_ν| (∫_{−∞}^{∞} |u_ν(t)| dt)

be finite.

2.8 Exercises

Exercise 2.3 (Integrating an Exponential). Show that

    ∫_{0}^{∞} e^{−zt} dt = 1/z,  Re(z) > 0.

Exercise 2.4 (The Sinc Is not Integrable). Show that the mapping t ↦ sin(πt)/(πt) is not integrable irrespective of how we define it at zero.

Hint: Lower-bound the integral of the absolute value of this function from n to n + 1 by the integral from n + 1/4 to n + 3/4. Lower-bound the integrand in this region, and use the fact that the harmonic sum ∑_{k≥1} 1/k diverges.

Exercise 2.5 (Triangle Inequality for Complex Numbers). Prove the Triangle Inequality for complex numbers (2.11). Under what conditions does it hold with equality?

Exercise 2.6 (When Are Complex Numbers Equal?). Prove that if the complex numbers w and z are such that Re(βz) = Re(βw) for all β ∈ C, then w = z.

Exercise 2.7 (Bounding Complex Exponentials). Show that

    |e^{iθ} − 1| ≤ |θ|,  θ ∈ R.

Exercise 2.8 (An Integral Inequality). Show that if u, v, and w are integrable signals, then

    ∫_{−∞}^{∞} |u(t) − w(t)| dt ≤ ∫_{−∞}^{∞} |u(t) − v(t)| dt + ∫_{−∞}^{∞} |v(t) − w(t)| dt.

Exercise 2.9 (An Integral to Note). Given some f ∈ R, compute the integral

    ∫_{−∞}^{∞} I{t = 17} e^{−i2πft} dt.

Exercise 2.10 (Subsets of Sets of Lebesgue Measure Zero). Show that a subset of a set of Lebesgue measure zero must also be of Lebesgue measure zero.

Exercise 2.11 (The Union of Sets of Lebesgue Measure Zero). Show that the union of two sets that are each of Lebesgue measure zero is also of Lebesgue measure zero.

Exercise 2.12 (Nonuniqueness of the Probability Density Function). We say that the random variable X is of density f_X(·) if f_X(·) is a (Lebesgue measurable) nonnegative function such that

    Pr[X ≤ x] = ∫_{−∞}^{x} f_X(ξ) dξ,  x ∈ R.

Show that if X is of density f_X(·) and if g(·) is a nonnegative function that is indistinguishable from f_X(·), then X is also of density g(·). (The reverse is also true: if X is of density g₁(·) and also of density g₂(·), then g₁(·) and g₂(·) must be indistinguishable.)

Exercise 2.13 (Indistinguishability). Let ψ : R² → R satisfy ψ(α, β) ≥ 0 for all α, β ∈ R, with equality only if α = β. Let u and v be Lebesgue measurable signals. Show that

    (∫_{−∞}^{∞} ψ(u(t), v(t)) dt = 0) =⇒ (v ≡ u).


Exercise 2.14 (Indistinguishable Signals). Show that if the Lebesgue measurable signals g and h are indistinguishable, then the set of epochs t ∈ R where the sums ∑_{j=−∞}^{∞} g(t + j) and ∑_{j=−∞}^{∞} h(t + j) are different (in the sense that they both converge but to different limits or that one converges but the other does not) is of Lebesgue measure zero.

Exercise 2.15 (Continuous Nonnegative Functions). A subset of R containing a nonempty open interval cannot be of Lebesgue measure zero. Use this fact to show that if a continuous function g : R → R is nonnegative except perhaps on a set of Lebesgue measure zero, then the exception set is empty and the function is nonnegative.

Exercise 2.16 (Order of Summation Sometimes Matters). For every ν, η ∈ N define

    a_{ν,η} = 2 − 2^{−ν}   if ν = η,
            = −2 + 2^{−ν}  if ν = η + 1,
            = 0            otherwise.

Show that (2.21) is not satisfied. See (Royden and Fitzpatrick, 2010, Section 20.1, Exercise 5).
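A numerical illustration (ours) of this exercise: truncating the outer sums while carrying the inner sums far enough to pick up all nonzero terms shows the two orders heading to different limits, 3/2 and −1/2, so the absolute-summability hypothesis of Fubini’s Theorem must fail here:

```python
def a(nu, eta):
    # The array of Exercise 2.16.
    if nu == eta:
        return 2.0 - 2.0**(-nu)
    if nu == eta + 1:
        return -2.0 + 2.0**(-nu)
    return 0.0

N, M = 30, 40   # outer truncation N; inner sums carried to M > N + 1 (all nonzero terms)
eta_then_nu = sum(sum(a(nu, eta) for eta in range(1, M + 1)) for nu in range(1, N + 1))
nu_then_eta = sum(sum(a(nu, eta) for nu in range(1, M + 1)) for eta in range(1, N + 1))
print(eta_then_nu, nu_then_eta)   # ~ 1.5 and ~ -0.5: the two orders disagree
```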

Exercise 2.17 (Using Fubini’s Theorem). Using the relation

    1/x = ∫_{0}^{∞} e^{−xt} dt,  x > 0,

and Fubini’s Theorem, show that

    lim_{α→∞} ∫_{0}^{α} (sin x)/x dx = π/2.

See (Rudin, 1987, Chapter 8, Exercise 12).

Hint: See also Exercise 2.3.

Chapter 3

    The Inner Product

    3.1 The Inner Product

The inner product is central to Digital Communications, so it is best to introduce it early. The motivation will have to wait.

Recall that u : A → B indicates that u (sometimes denoted u(·)) is a function (or mapping) that maps each element in its domain A to an element in its codomain B. If both the domain and the codomain of u are the set of real numbers R, then we sometimes refer to u as being a real signal, especially if the argument of u(·) stands for time. Similarly, if u : R → C where C denotes the set of complex numbers and the argument of u(·) stands for time, then we sometimes refer to u as a complex signal.

The inner product between two real functions u : R → R and v : R → R is denoted 〈u, v〉 and is defined as

    〈u, v〉 ≜ ∫_{−∞}^{∞} u(t) v(t) dt,    (3.1)

whenever the integral is defined.¹ (In Section 3.2 we shall study conditions under which the integral is defined, i.e., conditions on the functions u and v that guarantee that the product function t ↦ u(t) v(t) is an integrable function.)

The signals that arise in our study of Digital Communications often represent electric fields or voltages over resistors. The energy required to generate them is thus proportional to the integral of their squared magnitude. This motivates us to define the energy of a Lebesgue measurable real-valued function u : R → R as

    ∫_{−∞}^{∞} u²(t) dt.

(If this integral is not finite, then we say that u is of infinite energy.) We say that u : R → R is of finite energy if it is Lebesgue measurable and if

    ∫_{−∞}^{∞} u²(t) dt

is finite.


The class of all finite-energy real-valued functions u : R → R is denoted by L2. Since the energy of u : R → R is nonnegative, we can discuss its nonnegative square root, which we denote² by ‖u‖₂:

    ‖u‖₂ ≜ √(∫_{−∞}^{∞} u²(t) dt).    (3.2)

(Throughout this book we denote by √ξ the nonnegative square root of ξ for every ξ ≥ 0.) We can now express the energy in u using the inner product as

    ‖u‖₂² = ∫_{−∞}^{∞} u²(t) dt = 〈u, u〉.    (3.3)

In writing ‖u‖₂² above we used different fonts for the subscript and the superscript. The subscript is just a graphical character which is part of the notation ‖·‖₂. We could have replaced it with ★ and designated the energy by ‖u‖★² without any change in mathematical meaning.³ The superscript, however, indicates that the quantity ‖u‖₂ is being squared.

For complex-valued functions u : R → C and v : R → C we define the inner product 〈u, v〉 as

    〈u, v〉 ≜ ∫_{−∞}^{∞} u(t) v∗(t) dt,    (3.4)

whenever the integral is defined. Here v∗(t) denotes the complex conjugate of v(t). The above integral in (3.4) is a complex integral, but that should not worry you: it can also be written as

    〈u, v〉 = ∫_{−∞}^{∞} Re(u(t) v∗(t)) dt + i ∫_{−∞}^{∞} Im(u(t) v∗(t)) dt,    (3.5)

where i = √−1 and where Re(·) and Im(·) denote the functions that map a complex number to its real and imaginary parts: Re(a + ib) = a and Im(a + ib) = b whenever a, b ∈ R. Each of the two integrals appearing in (3.5) is the integral of a real signal. See Section 2.3.

Note that (3.1) and (3.4) are in agreement in the sense that if u and v happen to take on only real values (i.e., satisfy that u(t), v(t) ∈ R for every t ∈ R), then viewing them as real functions and thus using (3.1) would yield the same inner product as viewing them as (degenerate) complex functions and using (3.4). Note also that for complex functions u, v : R → C the inner product 〈u, v〉 is in general not the same as 〈v, u〉. One is the complex conjugate of the other.
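The definitions (3.2) and (3.4) translate directly into a numerical sketch (ours; grids and signals are arbitrary choices) that also exhibits the conjugate symmetry just mentioned:

```python
import numpy as np

t = np.linspace(-10.0, 10.0, 200_001)
dt = t[1] - t[0]
u = np.exp(-t**2) * np.exp(1j * 2 * np.pi * t)   # finite-energy complex signals
v = np.exp(-t**2) * (1 + 0j)

inner = lambda x, y: np.sum(x * np.conj(y)) * dt # Riemann-sum version of (3.4)

# <u, v> and <v, u> are complex conjugates of each other:
assert abs(inner(u, v) - np.conj(inner(v, u))) < 1e-12

# The energy <u, u> = ||u||_2^2 is real and nonnegative:
energy = inner(u, u)
assert abs(energy.imag) < 1e-12 and energy.real >= 0.0
```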

²The subscript 2 is here to distinguish ‖u‖₂ from ‖u‖₁, where the latter was defined in (2.6) as ‖u‖₁ = ∫_{−∞}^{∞} |u(t)| dt.

³We prefer ‖·‖₂ to ‖·‖★ because it reminds us that in the definition (3.2) the integrand is raised to the second power. This should be contrasted with the symbol ‖·‖₁ where the magnitude of the integrand is raised to the first power (and where no square root is taken of the result); see (2.6).


Some of the properties of the inner product between complex-valued functions u, v : R → C are given below.

    〈u, v〉 = 〈v, u〉∗    (3.6)
    〈αu, v〉 = α〈u, v〉,  α ∈ C    (3.7)
    〈u, αv〉 = α∗〈u, v〉,  α ∈ C    (3.8)
    〈u₁ + u₂, v〉 = 〈u₁, v〉 + 〈u₂, v〉    (3.9)
    〈u, v₁ + v₂〉 = 〈u, v₁〉 + 〈u, v₂〉.    (3.10)

The above equalities hold whenever the inner products appearing on the right-hand side (RHS) are defined. The reader is encouraged to produce a similar list of properties for the inner product between real-valued functions u, v : R → R.

The energy in a Lebesgue measurable complex-valued function u : R → C is defined as

    ∫_{−∞}^{∞} |u(t)|² dt,

where |·| denotes absolute value, so |a + ib| = √(a² + b²) whenever a, b ∈ R. This definition of energy might seem a bit contrived because there is no such thing as complex voltage, so prima facie it seems meaningless to define the energy of a complex signal. But this is not the case. Complex signals are used to represent real passband signals, and the representation is such that the energy in the real passband signal is proportional to the integral of the squared modulus of the complex-valued signal representing it; see Section 7.6 ahead.

Definition 3.1.1 (Energy-Limited Signal). We say that u : R → C is energy-limited or of finite energy if u is Lebesgue measurable and

    ∫_{−∞}^{∞} |u(t)|² dt

is finite.


    3.2 When Is the Inner Product Defined?

As noted in Section 2.2, in this book we shall only discuss the integral of integrable functions, where a function u : R → R is integrable if it is Lebesgue measurable and if ∫_{−∞}^{∞} |u(t)| dt < ∞. (We shall sometimes make an exception for functions that take on only nonnegative values. If u : R → [0, ∞) is Lebesgue measurable and if ∫ u(t) dt is not finite, then we shall say that ∫ u(t) dt = +∞.)

Similarly, as in Section 2.3, in integrating complex signals u : R → C we limit ourselves to signals that are integrable in the sense that both t ↦ Re(u(t)) and t ↦ Im(u(t)) are Lebesgue measurable real-valued signals and ∫_{−∞}^{∞} |u(t)| dt < ∞.


(see Proposition 2.4.1) to conclude from (3.15) that

    |〈u, v〉| = |∫_{−∞}^{∞} u(t) v∗(t) dt|
             ≤ ∫_{−∞}^{∞} |u(t)| |v(t)| dt
             ≤ ½ ∫_{−∞}^{∞} |u(t)|² dt + ½ ∫_{−∞}^{∞} |v(t)|² dt
             = ½ (‖u‖₂² + ‖v‖₂²).    (3.16)

    This inequality will be improved in Theorem 3.3.1, which introduces the Cauchy-Schwarz Inequality.

We finally mention here, without proof, a third case where the inner product between the Lebesgue measurable signals u, v is defined. The result here is that if for some numbers 1 < p, q < ∞ satisfying 1/p + 1/q = 1 the pth power of |u| and the qth power of |v| are both integrable, then the inner product 〈u, v〉 is defined; see Theorem 3.3.2 (Hölder’s Inequality) ahead.

3.3 The Cauchy-Schwarz Inequality

Theorem 3.3.1 (Cauchy-Schwarz Inequality). If u and v are in L2, then their inner product is defined and

    |〈u, v〉| ≤ ‖u‖₂ ‖v‖₂.    (3.17)

In fact, the Cauchy-Schwarz Inequality holds with equality if, and only if, either v(t) is zero for all t outside a set of Lebesgue measure zero or for some constant α we have u(t) = αv(t) for all t outside a set of Lebesgue measure zero.

There are a number of different proofs of this important inequality. We shall focus here on one that is based on (3.16) because it demonstrates a general technique for improving inequalities. The idea is that once one obtains a certain inequality (in our case (3.16)) one can try to improve it by taking advantage of one’s understanding of how the quantity in question is affected by various transformations. This technique is beautifully illustrated in (Steele, 2004).

Proof. The quantity in question is |〈u, v〉|. We shall take advantage of our understanding of how this quantity behaves when we replace u with its scaled version αu and when we replace v with its scaled version βv. Here α, β ∈ C are arbitrary. The quantity in question transforms as

    |〈αu, βv〉| = |α| |β| |〈u, v〉|.    (3.18)

We now use (3.16) to upper-bound the left-hand side (LHS) of the above by substituting αu and βv for u and v in (3.16) to obtain

    |α| |β| |〈u, v〉| = |〈αu, βv〉| ≤ ½ |α|² ‖u‖₂² + ½ |β|² ‖v‖₂²,  α, β ∈ C.    (3.19)

If both ‖u‖₂ and ‖v‖₂ are positive, then (3.17) follows from (3.19) by choosing α = 1/‖u‖₂ and β = 1/‖v‖₂. To conclude the proof it thus remains to show that (3.17) also holds when either ‖u‖₂ or ‖v‖₂ is zero, so the RHS of (3.17) is zero. That is, we need to show that if either ‖u‖₂ or ‖v‖₂ is zero, then 〈u, v〉 must also be zero. To show this, suppose first that ‖u‖₂ is zero. By substituting α = 1 in (3.19) we obtain in this case that

    |β| |〈u, v〉| ≤ ½ |β|² ‖v‖₂²,

which, upon dividing by |β|, yields

    |〈u, v〉| ≤ ½ |β| ‖v‖₂²,  β ≠ 0.

Upon letting |β| tend to zero from above this demonstrates that 〈u, v〉 must be zero, as we set out to prove. (As an alternative proof of this case one notes that ‖u‖₂ = 0 implies, by Proposition 2.5.3, that the set {t ∈ R : u(t) ≠ 0} is of Lebesgue measure zero. Consequently, since every zero of t ↦ u(t) is also a zero of t ↦ u(t) v∗(t), it follows that {t ∈ R : u(t) v∗(t) ≠ 0} is included in {t ∈ R : u(t) ≠ 0}, and must therefore also be of Lebesgue measure zero (Exercise 2.10). Consequently, by Proposition 2.5.3, ∫_{−∞}^{∞} |u(t) v∗(t)| dt must be zero, which, by Proposition 2.4.1, implies that |〈u, v〉| must be zero.)

The case where ‖v‖₂ = 0 is very similar: by substituting β = 1 in (3.19) we obtain that (in this case)

    |〈u, v〉| ≤ ½ |α| ‖u‖₂²,  α ≠ 0,


    and the result follows upon letting |α| tend to zero from above.
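The statement of the theorem, and the role of scaling in the proof, can be probed numerically. In the sketch below (ours), the equality check uses a signal w that is a scalar multiple of v:

```python
import numpy as np

t = np.linspace(-10.0, 10.0, 200_001)
dt = t[1] - t[0]
u = np.exp(-t**2) * (1 + 1j * t)
v = np.cos(t) * np.exp(-np.abs(t))

inner = lambda x, y: np.sum(x * np.conj(y)) * dt          # <x, y> as in (3.4)
norm2 = lambda x: np.sqrt(np.sum(np.abs(x)**2) * dt)      # ||x||_2 as in (3.2)

assert abs(inner(u, v)) <= norm2(u) * norm2(v) + 1e-12    # the inequality (3.17)

w = (2 - 3j) * v                                          # w is a scaled version of v ...
assert abs(abs(inner(w, v)) - norm2(w) * norm2(v)) < 1e-9 # ... so (3.17) holds with equality
```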

While we shall not use the following inequality in this book, it is sufficiently important that we mention it in passing.

Theorem 3.3.2 (Hölder’s Inequality). If u : R → C and v : R → C are Lebesgue measurable functions satisfying

    ∫_{−∞}^{∞} |u(t)|^p dt < ∞  and  ∫_{−∞}^{∞} |v(t)|^q dt < ∞

for some 1 < p, q < ∞ with 1/p + 1/q = 1, then the inner product 〈u, v〉 is defined and

    |〈u, v〉| ≤ (∫_{−∞}^{∞} |u(t)|^p dt)^{1/p} (∫_{−∞}^{∞} |v(t)|^q dt)^{1/q}.

3.4 Applications

Another important mathematical consequence of the Cauchy-Schwarz Inequality is the continuity of the inner product. To state the result we use the notation a_n → a to indicate that the sequence a₁, a₂, … converges to a, i.e., that lim_{n→∞} a_n = a.

Proposition 3.4.2 (Continuity of the Inner Product). Let u and v be in L2. If the sequence u₁, u₂, … of elements of L2 satisfies

    ‖u_n − u‖₂ → 0,

and if the sequence v₁, v₂, … of elements of L2 satisfies

    ‖v_n − v‖₂ → 0,

then

    〈u_n, v_n〉 → 〈u, v〉.

Proof.

    |〈u_n, v_n〉 − 〈u, v〉|
      = |〈u_n − u, v〉 + 〈u_n − u, v_n − v〉 + 〈u, v_n − v〉|
      ≤ |〈u_n − u, v〉| + |〈u_n − u, v_n − v〉| + |〈u, v_n − v〉|
      ≤ ‖u_n − u‖₂ ‖v‖₂ + ‖u_n − u‖₂ ‖v_n − v‖₂ + ‖u‖₂ ‖v_n − v‖₂
      → 0,

where the first equality follows from the basic properties of the inner product (3.6)–(3.10); the subsequent inequality by the Triangle Inequality for Complex Numbers (2.12); the subsequent inequality from the Cauchy-Schwarz Inequality; and where the final limit follows from the proposition’s hypotheses.

Another useful consequence of the Cauchy-Schwarz Inequality is that if a signal is energy-limited and is zero outside an interval, then it is also integrable.

Proposition 3.4.3 (Finite-Energy Functions over Finite Intervals Are Integrable). If for some real numbers a and b satisfying a ≤ b we have

    ∫_{a}^{b} |x(ξ)|² dξ < ∞,

then

    ∫_{a}^{b} |x(ξ)| dξ ≤ √(b − a) √(∫_{a}^{b} |x(ξ)|² dξ) < ∞.


Proof.

    ∫_{a}^{b} |x(ξ)| dξ = ∫_{−∞}^{∞} I{a ≤ ξ ≤ b} |x(ξ)| dξ
                        = ∫_{−∞}^{∞} I{a ≤ ξ ≤ b} · (I{a ≤ ξ ≤ b} |x(ξ)|) dξ
                        ≤ √(b − a) √(∫_{a}^{b} |x(ξ)|² dξ),

where the inequality is just an application of the Cauchy-Schwarz Inequality to the function ξ ↦ I{a ≤ ξ ≤ b} |x(ξ)| and the indicator function ξ ↦ I{a ≤ ξ ≤ b}.

Note that, in general, an energy-limited signal need not be integrable. For example, the real signal

    t ↦ { 0    if t ≤ 1,
          1/t  otherwise,    (3.22)

is of finite energy but is not integrable.
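Numerically (a sketch of ours, using a logarithmically spaced grid and a simple trapezoidal rule), the energy of the signal in (3.22) approaches 1 as the upper integration limit T grows, while ∫₁^T (1/t) dt = ln T grows without bound:

```python
import numpy as np

def trapezoid(y, x):
    # Trapezoidal rule on a (possibly nonuniform) grid.
    return float(np.sum((y[1:] + y[:-1]) * (x[1:] - x[:-1]) / 2.0))

for T in (1e1, 1e4, 1e8):
    t = np.logspace(0.0, np.log10(T), 200_001)   # grid on [1, T]; the signal is 0 below 1
    print(f"T = {T:8.0e}   energy ~ {trapezoid(t**-2.0, t):.4f}"
          f"   integral of |u| ~ {trapezoid(1.0 / t, t):.2f}")
```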

The Cauchy-Schwarz Inequality demonstrates that if both u and v are of finite energy, then their inner product 〈u, v〉 is well-defined, i.e., the integrand in (3.4) is integrable. It can also be used in slightly more sophisticated ways. For example, it can be used to treat cases where one of the functions, say u, is not of finite energy but where the second function decays to zero sufficiently quickly to compensate for that. For example:

Proposition 3.4.4. If the Lebesgue measurable functions x : R → C and y : R → C satisfy

    ∫_{−∞}^{∞} |x(t)|²/(t² + 1) dt < ∞

and

    ∫_{−∞}^{∞} (t² + 1) |y(t)|² dt < ∞,

then the inner product 〈x, y〉 is defined.


    3.5 The Cauchy-Schwarz Inequality for Random Variables

There is also a version of the Cauchy-Schwarz Inequality for random variables. It is very similar to Theorem 3.3.1 but with time integrals replaced by expectations. We denote the expectation of the random variable X by E[X] and remind the reader that the variance Var[X] of the random variable X is defined as

    Var[X] = E[(X − E[X])²].    (3.23)

Theorem 3.5.1 (Cauchy-Schwarz Inequality for Random Variables). Let the random variables U and V be of finite variance. Then

    |E[UV]| ≤ √(E[U²] E[V²]).

