UNIVERSITY OF CAMBRIDGE
NEAREST NEIGHBOUR DECODING
FOR FADING CHANNELS
presented by
A. Taufiq Asyhari
Trinity Hall
Cambridge University Engineering Department
January 2012
This dissertation is submitted
for the degree of Doctor of Philosophy
To Mom, Dad and Ana
Declaration
I hereby declare that this dissertation is the result of my own work and includes
nothing which is the outcome of work done in collaboration except where specifi-
cally indicated in the text. This dissertation is not substantially the same as any
that I have submitted for a degree or diploma or other qualification at any other
university.
I also declare that the length of this dissertation is less than 71,000 words and
that the number of figures is less than 150. Approval to extend the word limit
set by the Department of Engineering Degree Committee has been obtained.
Signature: . . . . . . . . . . . . . . . . . . . . . . . . . . .
A. Taufiq Asyhari
Cambridge, January 2012
Nearest Neighbour Decoding for Fading Channels
By: A. Taufiq Asyhari
Abstract
This dissertation addresses the effects of imperfect channel state information
(CSI) on the reliability of nearest neighbour decoding in fading channels.
In the first part, we investigate the reliability of nearest neighbour decoding
in block-fading channels, which are relevant for delay-constrained applications.
The block-fading channel is non-ergodic and the reliability under perfect CSI
is measured by the information outage probability. Using mismatched-decoding
techniques, we develop a framework to study nearest neighbour decoding with
imperfect CSI and propose the generalised outage probability as a new tool to
measure the reliability, which generalises the notion of information outage prob-
ability. Assuming a non-adaptive transmission scheme and imperfect CSI at the
receiver (CSIR), we first evaluate the generalised outage probability at high
signal-to-noise ratio (SNR). We characterise the reliability as a
function of the system parameters and the quality of channel estimation. We
then consider two adaptive schemes: incremental-redundancy automatic-repeat
request (IR-ARQ) (based on receiver feedback) and power adaptation (based on
imperfect CSI at the transmitter (CSIT)). For IR-ARQ, the reliability is shown to
be a function of the system parameters and the qualities of feedback and channel
estimation. For power adaptation with imperfect CSIT, the reliability is shown
to be a function of the system parameters and the CSI quality at both terminals.
In the second part, we investigate the reliability of nearest neighbour de-
coding in ergodic fading channels. The ergodic channel is relevant for delay-
unconstrained transmission where reliable communication is possible at rates
below the channel capacity. In order to obtain accurate CSI, we propose a
pilot-aided channel-estimation scheme. We first consider point-to-point channels
and demonstrate that our scheme achieves the high-SNR logarithmic growth of
the capacity of multiple-input single-output channels, and achieves the best
known lower bound on the high-SNR logarithmic growth of the capacity
of multiple-input multiple-output channels. We then consider fading multiple-
access channels and propose a joint-transmission scheme. We characterise the
high-SNR performance of the joint-transmission scheme and compare it with
time-division multiple-access schemes. We then show the potential of the joint-
transmission scheme for uplink cellular communications.
The results in this dissertation are relevant for the design of codes under
imperfect CSI, channel estimators, feedback signalling and power adaptation.
Acknowledgement
This dissertation is the ultimate destination of my PhD journey over the last 3.5
years. Alhamdulillah, all praises are due to Alloh SWT, the Lord of the universe
for giving me the ability to reach this final destination.
I would like to express my sincere gratitude to my supervisor Dr. Albert
Guillen i Fabregas for giving me the opportunity to pursue a PhD at one of the
best universities in the world. Albert has been very helpful and supportive in
many circumstances. He gave me freedom and flexibility to explore my research
topics and encouraged me to work independently. Our working relationship for
the past years has shaped and strengthened my interest in information theory.
He is surely one of my role models as a teacher and researcher. I hope that
we can continue to work together in the future.
I am deeply thankful to Dr. Tobias Koch. I have benefited a lot from many
discussions with him. Tobi has been very critical with mathematics in our papers
and this has influenced my way of doing research. Tobi also has many valuable
suggestions on improving my writing and presentation skills. Some parts of this
dissertation have been improved following his feedback. I am grateful for
all of these and hope that we can still work together in the future.
I would like to extend my gratitude to Dr. Jossy Sayir for his care and
concern, and for proofreading some chapters of this dissertation. His feedback
has improved the presentation of this dissertation.
I would like to thank all members of our research group. Thanks to Dr.
Alfonso Martinez for his insightful comments on some parts of my research.
Thanks to Li Peng for being my research buddy in the last 3.5 years. Having
started our PhDs at the same time, we have shared many academic and non-academic
discussions. I would also like to thank Dr. Adria Tauste, Jing Guo, Dr. Alex
Alvarado and Jonathan Scarlett. I have learnt many new things from you guys.
I spent a month doing research at NCTU, Hsinchu, Taiwan. I would like
to thank Prof. Stefan Moser for hosting me there and for every help he provided.
I have learnt from Stefan's idealism in doing research and I am grateful for that.
I would also like to thank those who have helped me in Taiwan: Hsuan-Yin,
Hui-Ting, Yu-Hsing, Sameer, Mas Moro and Mas Agus.
This PhD study would not have been possible without the generous support of
the Yousef Jameel Scholarship. I would like to express my sincere gratitude to
Mr. Jameel and everyone in the Yousef Jameel Foundation who selected me as a
scholar in Cambridge.
I would like to thank the staff at Cambridge University, in particular Kathy
White, Rachel Fogg and Janet Milne for providing administrative support and
Roger Wareham for providing computing support.
Being in a foreign country, I always had concerns about practising my religion.
Thanks to my wonderful friends in the Islamic Society: Khairul, Syed, Irufan,
Tariq, Sheikh, Ubaid, Mohammed and many more who have always been
supportive. I hope that we can keep in touch.
Being far away from my country, I could have been very lonely, but that did
not happen. Thanks to Mas Dono and family for having been very supportive since
the first time I arrived in Cambridge. Thanks to my compatriots: Astari, Yuke,
Tracey, Anin, Mbak Ina, Kevin, Antony and many others who have made our
Indonesian community in Cambridge so lively. Thanks to my friend Amika who
always cracked jokes during our online chats whenever I felt bored with my re-
search.
I would like to gratefully acknowledge my “home” friends: Reyhan, Zuhdi,
Marta, Bang Dian, Barra, Bowo and Digdaya for all their help during my ap-
plication to Cambridge.
Finally, I would like to thank my family for their unconditional support and
love. I am very grateful to my father Bapak Muh Kodri, my mother Ibu Siti
Mariyam and my sister Yulia Dwi for everything they have given me in my life. I
am deeply thankful to my lovely wife, Ana, who has accompanied me in every
part of this journey. I understand that it has been very hard to maintain a long-
distance relationship between Cambridge and Kuala Lumpur. I am so proud of
you, honey, for being able to do this together. I am sorry for the time that I have
missed and I promise to make up for that. I am also thankful to my father-in-law
Bapak Muhaimin, S.H., M.Kn. and my mother-in-law Ibu Dra. Siti Asiatun for
their constant support and motivation to both me and my wife.
Contents
Contents xi
Acronyms xv
List of Figures xvii
List of Tables xix
1 Preliminaries 1
1.1 Background and Motivation . . . . . . . . . . . . . . . . . . . . . 1
1.2 Dissertation Outline . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
I Non-Ergodic Block-Fading Channels 9
2 The Block-Fading Channel 11
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 The Block-Fading Channel with Perfect CSI . . . . . . . . . . . . 14
2.3.1 Mutual Information, Capacity and Outage Probability . . 14
2.3.2 Nearest Neighbour Decoding and Noise Distribution . . . . 17
2.4 The Block-Fading Channel with Imperfect CSI . . . . . . . . . . . 18
2.4.1 Generalised Gallager’s Bound, GMI and Generalised Outage 19
2.4.2 Section 2.4.1 Proofs . . . . . . . . . . . . . . . . . . . . . . 22
2.4.2.1 Concavity of EQ0 (s, ρ, Hb) . . . . . . . . . . . . . 22
2.4.2.2 Non-Negativity of Igmi(H) . . . . . . . . . . . . . 23
2.4.2.3 GMI Upper Bound . . . . . . . . . . . . . . . . . 25
2.5 Outage Bounds, Diversity and System Design . . . . . . . . . . . 25
3 MIMO Block-Fading Channels with Imperfect CSIR 29
3.1 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.2 Outage Diversity in Block-Fading Channels . . . . . . . . . . . . . 34
3.3 Generalised Outage Diversity . . . . . . . . . . . . . . . . . . . . 36
3.4 Random Coding Achievability . . . . . . . . . . . . . . . . . . . . 39
3.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.5.1 Insights for System Design . . . . . . . . . . . . . . . . . . 44
3.5.2 The DMT for Gaussian Codebooks . . . . . . . . . . . . . 47
3.5.3 Optical Wireless Scintillation Distributions . . . . . . . . . 48
3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4 IR-ARQ in MIMO Block-Fading Channels with Imperfect Feedback and CSIR 51
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.2 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.2.1 Channel Model . . . . . . . . . . . . . . . . . . . . . . . . 53
4.2.2 IR-ARQ Scheme . . . . . . . . . . . . . . . . . . . . . . . 55
4.2.3 Feedback Channel and ARQ . . . . . . . . . . . . . . . . 56
4.3 Performance Metrics with Imperfect CSIR . . . . . . . . . . . . . 57
4.3.1 Error Probability with Perfect Feedback . . . . . . . . . . 57
4.3.1.1 Threshold Decoder Ψ(·) . . . . . . . . . . . . . . 58
4.3.1.2 Decoding Error and Communication Outage . . . 62
4.3.2 Error Probability with Imperfect Feedback . . . . . . . . . 65
4.4 ARQ Outage Diversity . . . . . . . . . . . . . . . . . . . . . . . . 66
4.4.1 Uniform Power Allocation . . . . . . . . . . . . . . . . . . 68
4.4.2 Power-Controlled ARQ . . . . . . . . . . . . . . . . . . . . 69
4.4.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5 Mismatched CSI Outage SNR-Exponents of MIMO Block-Fading Channels 77
5.1 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.3 Outage SNR-Exponents . . . . . . . . . . . . . . . . . . . . . . . 83
5.3.1 Full-CSIT Power Allocation . . . . . . . . . . . . . . . . . 83
5.3.2 Causal-CSIT Power Allocation . . . . . . . . . . . . . . . . 84
5.3.3 Predictive-CSIT Power Allocation . . . . . . . . . . . . . . 85
5.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.4.1 Pilot-Assisted Channel Estimation . . . . . . . . . . . . . 86
5.4.2 Mean-Feedback CSIT Model . . . . . . . . . . . . . . . . . 89
5.4.3 Comments on Achievable Rates . . . . . . . . . . . . . . . 90
5.4.4 Comments on Continuous Input Distributions . . . . . . . 93
5.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
II Stationary Ergodic Fading Channels 97
6 Stationary Fading Channels 99
6.1 MIMO Gaussian Flat-Fading Channels . . . . . . . . . . . . . . . 99
6.2 Capacity and The Pre-Log . . . . . . . . . . . . . . . . . . . . . . 100
6.2.1 Coherent Channels . . . . . . . . . . . . . . . . . . . . . . 101
6.2.2 Noncoherent Channels . . . . . . . . . . . . . . . . . . . . 103
7 Pilot-Aided Channel Estimation for Stationary Fading Channels 105
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.2 System Model and Transmission Scheme . . . . . . . . . . . . . . 107
7.3 The Pre-Log . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7.4 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.4.1 Proof of Theorem 7.1 . . . . . . . . . . . . . . . . . . . . . 115
7.4.1.1 Linear Interpolator . . . . . . . . . . . . . . . . . 115
7.4.1.2 Achievable Rates and Pre-Logs . . . . . . . . . . 123
7.4.2 A Note on Input Distribution . . . . . . . . . . . . . . . . 132
7.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
8 Pilot-Aided Channel Estimation for Fading Multiple-Access Channels 137
8.1 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.2 Transmission Scheme . . . . . . . . . . . . . . . . . . . . . . . . . 139
8.3 The MAC Pre-Log . . . . . . . . . . . . . . . . . . . . . . . . . . 142
8.4 Joint Transmission Versus TDMA . . . . . . . . . . . . . . . . . . 144
8.4.1 Receiver Employs Fewer Antennas Than Transmitters . . . 146
8.4.2 Receiver Employs More Antennas Than Transmitters . . . 146
8.4.3 A Case in Between . . . . . . . . . . . . . . . . . . . . . . 147
8.5 Proof of Theorem 8.1 . . . . . . . . . . . . . . . . . . . . . . . . . 149
8.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
9 Summary and Future Research 157
9.1 Main Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . 157
9.1.1 Part I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
9.1.2 Part II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
9.2 Areas for Future Research . . . . . . . . . . . . . . . . . . . . . . 160
Appendix A 163
A.1 Proof of Lemma 3.1 (Discrete Inputs) . . . . . . . . . . . . . . . . 163
A.2 Proof of Theorem 3.1 (Discrete Inputs) . . . . . . . . . . . . . . . 164
A.2.1 SISO Case . . . . . . . . . . . . . . . . . . . . . . . . . . 165
A.2.2 MIMO Case . . . . . . . . . . . . . . . . . . . . . . . . . . 180
A.3 Proof of Theorem 3.1 (Gaussian Inputs) . . . . . . . . . . . . . . 184
A.3.1 GMI Lower Bound . . . . . . . . . . . . . . . . . . . . . . 185
A.3.2 GMI Upper Bound . . . . . . . . . . . . . . . . . . . . . . 192
A.4 Proof of Theorem 3.2 . . . . . . . . . . . . . . . . . . . . . . . . . 197
A.5 Proof of Inequality (A.220) . . . . . . . . . . . . . . . . . . . . . . 202
A.6 Proof of Proposition 3.1 . . . . . . . . . . . . . . . . . . . . . . . 205
A.7 Proof of Theorem 3.3 . . . . . . . . . . . . . . . . . . . . . . . . . 212
Appendix B 219
B.1 Proof of Lemma 4.2 . . . . . . . . . . . . . . . . . . . . . . . . . . 219
B.2 Proof of Theorem 4.1 . . . . . . . . . . . . . . . . . . . . . . . . . 220
B.2.1 Converse . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
B.2.2 Achievability . . . . . . . . . . . . . . . . . . . . . . . . . 223
B.3 Proof of Proposition 4.1 . . . . . . . . . . . . . . . . . . . . . . . 224
B.3.0.1 GMI Upper Bound . . . . . . . . . . . . . . . . . 225
B.3.0.2 GMI Lower Bound . . . . . . . . . . . . . . . . . 227
B.4 Proof of Proposition 4.2 . . . . . . . . . . . . . . . . . . . . . . . 228
B.4.1 GMI Upper Bound . . . . . . . . . . . . . . . . . . . . . . 231
B.4.2 GMI Lower Bound . . . . . . . . . . . . . . . . . . . . . . 234
Appendix C 241
C.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
C.2 Power Allocation and Asymptotic Analysis . . . . . . . . . . . . . 242
C.2.1 Power Allocation . . . . . . . . . . . . . . . . . . . . . . . 242
C.2.2 Asymptotic Analysis . . . . . . . . . . . . . . . . . . . . . 244
C.2.3 GMI Upper and Lower Bounds . . . . . . . . . . . . . . . 245
C.3 Full-CSIT Power Allocation . . . . . . . . . . . . . . . . . . . . . 247
C.3.1 GMI Upper Bound . . . . . . . . . . . . . . . . . . . . . . 248
C.3.2 GMI Lower Bound . . . . . . . . . . . . . . . . . . . . . . 251
C.4 Causal-CSIT Power Allocation . . . . . . . . . . . . . . . . . . . . 253
C.4.1 GMI upper bound . . . . . . . . . . . . . . . . . . . . . . 253
C.4.2 GMI Lower Bound . . . . . . . . . . . . . . . . . . . . . . 256
C.5 Predictive-CSIT Power Allocation . . . . . . . . . . . . . . . . . . 260
C.5.1 GMI Upper Bound . . . . . . . . . . . . . . . . . . . . . . 260
C.5.2 GMI Lower Bound . . . . . . . . . . . . . . . . . . . . . . 264
C.6 LMMSE Channel Estimation . . . . . . . . . . . . . . . . . . . . 265
References 273
Acronyms
Herein we list the main acronyms used throughout the dissertation. The meaning
of each acronym is stated when it first appears in the text.
ARQ Automatic-Repeat Request
AWGN Additive-White Gaussian Noise
BPSK Binary-Phase Shift Keying
CSI Channel State Information
CSIR Channel State Information at the Receiver
CSIT Channel State Information at the Transmitter
GMI Generalised Mutual Information
i.i.d. independent and identically distributed
IR-ARQ Incremental-Redundancy Automatic-Repeat Request
LMMSE Linear Minimum Mean-Squared Error
MAC Multiple-Access Channel
MIMO Multiple-Input Multiple-Output
MISO Multiple-Input Single-Output
MMSE Minimum Mean-Squared Error
ML Maximum-Likelihood
OFDM Orthogonal Frequency Division Multiplexing
pdf probability density function
psd power spectral density
QAM Quadrature-Amplitude Modulation
SISO Single-Input Single-Output
SNR Signal-to-Noise Ratio
TDMA Time-Division Multiple-Access
TDD Time-Division Duplex
List of Figures
2.1 A diagram for a MIMO block-fading channel. . . . . . . . . . . . 13
3.1 A MIMO block-fading channel with imperfect CSIR. . . . . . . . 31
3.2 Random coding SNR-exponent lower bound for discrete signal
codebooks as a function of target rate R (in bits per channel use),
B = 4, nt = 2, nr = 2, τ = 0, M = 4 and de = 0.5. . . . . . . . . . 43
3.3 Generalised outage SNR-exponent for discrete-input block-fading
channel, B = 4, τ = 0 (Rayleigh, Rician and Nakagami-q fading),
nt = 2 and nr = 2. . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.4 Generalised outage probability for Gaussian-input MIMO Rayleigh
block-fading channel with B = 2, R = 2, nt = 2 and nr = 1. . . . 46
3.5 Generalised outage probability for BPSK-input MIMO Rayleigh
block-fading channel with B = 2, R = 1, nt = 2 and nr = 1. . . . 47
4.1 System model for IR-ARQ transmission with binary feedback. . . 53
4.2 Density of the accumulated GMI at round ℓ = 2 for Gaussian-input
transmission over a SISO Rayleigh fading channel with B = 1,
Pℓ = 16 (unit power), ℓ = 1, 2. . . . . . . . . . . . . . . . . . . . 64
4.3 Simulation results of ARQ outage probability for Gaussian-input
transmission over a MIMO Rayleigh block-fading channel with
parameters: B = 2, L = 2, nt = 2, nr = 1, R = 2 bits per
channel use and the BSC feedback with parameter p0 = 0.5. . . . 72
4.4 ARQ outage diversity for 4-QAM inputs in a MIMO Rayleigh
block-fading channel with parameters: B = 2, L = 3, nt = 2
and nr = 1. UP (OP) indicates results with uniform (optimal)
power allocation. The data rate R is in bits per channel use. The
CSIR-error diversity de is such that (4.71) is satisfied. . . . . . . 73
5.1 System model for MIMO block-fading channels with imperfect CSI
at both terminals. . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2 Interplay among the CSIT- and CSIR-error diversities and the
outage SNR-exponent with full-CSIT power allocation. . . . . . . 83
5.3 Comparison of the densities of the GMI and the lower bound
(5.51) with fading realisation H = 1, transmission power P = 1
(unit power) and CSIR-error variance σe^2 = 0.1. . . . . . . . . . . 92
6.1 A diagram for communication over a stationary MIMO fading
channel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
7.1 Structure of pilot and data transmission for nt = 2, L = 7 and
T = 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
8.1 The two-user MAC system model. . . . . . . . . . . . . . . . . . . 138
8.2 Structure of joint-transmission scheme, nt,1 = 2, nt,2 = 1, L = 7
and T = 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
8.3 Structure of TDMA scheme, nt,1 = 2, nt,2 = 1, L = 4 and T = 2. . 142
8.4 Pre-log regions for a fading MAC with nr = 2 and nt,1 = nt,2 = 1
for different values of L∗. Depicted are the pre-log region for the
joint-transmission scheme as given in Theorem 8.1 (dashed line),
the pre-log region of the TDMA scheme as given in Remark 8.2
(solid line), and the pre-log region of the coherent TDMA scheme
(8.21) (dotted line). . . . . . . . . . . . . . . . . . . . . . . . . . . 147
List of Tables
3.1 Pdfs for Different Fading Distributions . . . . . . . . . . . . . . . . 32
8.1 Typical values of L∗ for various environments with fc ranging from
800 MHz to 5 GHz. The values of στ are taken from [1] for indoor
and urban environments and from [2] for hilly area environments. 148
C.1 Definition of magnitude-squared and phase variables. . . . . . . 241
C.2 Definition of normalised magnitude-squared variables. . . . . . . 242
Chapter 1
Preliminaries
1.1 Background and Motivation
Wireless communications are characterised by multipath propagation of signals
over a free-space medium. The presence of scattering objects in the surrounding
environment causes the transmitted signals to arrive at the receiver with different
delays, which results in random attenuation of the original transmitted signals.
This phenomenon is referred to as fading [3], one of the challenges for reliable
transmission over wireless channels. The severity of fading depends on various
factors such as the geography and the topography of the scattering environment,
the mobile velocity, the carrier frequency and the transmitted signal bandwidth.
Reliable communication over fading channels has traditionally been studied
in information theory under the assumption of perfect knowledge of the fading.
Such perfect knowledge of the fading is commonly referred to as perfect channel
state information (CSI); the corresponding channel where perfect CSI is available
is commonly referred to as the coherent channel. This assumption has facilitated
many developments in modern communication technologies including the design
of good practical coding schemes, the design of adaptive transmission and many
more. Among the well-known results is the discovery of nearest neighbour de-
coding as a reliable decoding scheme. The nearest neighbour decoder is a simple
decoder that selects the codeword that is closest (in a Euclidean-distance sense)
to the channel output. For coherent channels with additive Gaussian noise, this
decoder is the maximum-likelihood decoder and is therefore optimal in the sense
that it minimises the error probability (see [4] and references therein). Due to
this optimality and simplicity, the nearest neighbour decoder has become a per-
formance benchmark for practical decoders and has inspired the development of
many decoders such as the ones that are based on the widely-used Viterbi al-
gorithm [5] (see for example [6–8]) and the ones that are based on generalised
minimum-distance decoding [9] (see for example [10–12]).
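The nearest neighbour rule can be made concrete in a few lines. The following Python sketch (the codebook size, block length, SNR and fixed fading realisation are illustrative choices of the editor, not taken from the dissertation) decodes by minimising the Euclidean distance between the channel output and each faded, scaled codeword; with perfect CSI and Gaussian noise this coincides with the ML rule.

```python
import numpy as np

rng = np.random.default_rng(0)
M, n = 16, 8                      # codebook size and block length (illustrative)
codebook = (rng.normal(size=(M, n)) + 1j * rng.normal(size=(M, n))) / np.sqrt(2)

h = 0.8 + 0.6j                    # fixed fading realisation (|h| = 1) for reproducibility
snr = 100.0
m = 3                             # index of the transmitted codeword

# Channel: y = sqrt(snr) * h * x_m + z, circularly-symmetric unit-variance noise
z = (rng.normal(size=n) + 1j * rng.normal(size=n)) / np.sqrt(2)
y = np.sqrt(snr) * h * codebook[m] + z

# Nearest neighbour decoding: choose the codeword whose faded and scaled copy
# is closest to the channel output in Euclidean distance.  With perfect CSI
# (h_hat = h) and Gaussian noise this coincides with the ML rule.
h_hat = h
dist = np.linalg.norm(y - np.sqrt(snr) * h_hat * codebook, axis=1)
m_hat = int(np.argmin(dist))
```

Replacing `h_hat` with a noisy estimate turns the same rule into the mismatched decoder studied throughout the dissertation.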
Note that the optimality of nearest neighbour decoding can only be guaranteed if
the decoder has noiseless access to every realisation of the fading. Such noiseless
access is typically substantiated by assuming that the fading variation is suffi-
ciently slow such that accurate fading estimates can be obtained by transmitting
known training symbols before data transmission. In practical scenarios, this
assumption is too optimistic and may lead to an unexpected behaviour of the
actual system. Due to hardware limitation and time-varying characteristics of
the channel, most channel estimators are not able to guarantee perfect fading
estimation and always incur channel estimation errors. The erroneous or noisy
fading estimates make nearest neighbour decoding inherently suboptimal. We re-
fer to such erroneous or noisy fading estimates as imperfect or mismatched CSI.
The noncoherent channel refers to a channel where perfect CSI is not available,
but the transmitter and receiver may estimate the fading using some channel
estimators.1.1
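As a minimal illustration of how imperfect CSI arises, consider a hypothetical single-pilot least-squares estimator (the pilot SNR and fading value below are illustrative, not from the dissertation): the estimate equals the true coefficient plus a noise term whose variance decays as 1/SNR, so for any finite SNR the CSI remains imperfect.

```python
import numpy as np

rng = np.random.default_rng(1)
snr = 10.0                 # pilot SNR (illustrative)
h = 0.6 - 0.8j             # true fading coefficient, fixed for illustration

# Pilot observation: y_p = sqrt(snr) * h + z with unit-variance Gaussian noise
z = (rng.normal() + 1j * rng.normal()) / np.sqrt(2)
y_p = np.sqrt(snr) * h + z

# Least-squares estimate from the single pilot symbol
h_hat = y_p / np.sqrt(snr)

# The estimate is the true coefficient plus scaled noise; the error variance
# decays as 1/snr but never vanishes at finite SNR.
err = h_hat - h            # equals z / sqrt(snr)
```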
In this dissertation, we study nearest neighbour decoding that operates based
on erroneous fading estimates as a result of imperfect fading estimation. In par-
ticular, for a given level of accuracy of the fading estimation, we investigate the
reliability of the decoder using the framework of mismatched decoding [13]. We
quantify the performance loss incurred by using noisy fading estimates and iden-
tify the system parameters that contribute to the loss. Note that the reliability
measure of communication systems largely depends on the nature of the signal,
the targeted application and the fading dynamics. In particular, for applications
where large delays are tolerable, the transmission of a codeword spans a
large number of fading realisations. In this case, the channel is considered er-
godic and long interleaved codes of rate not exceeding the channel capacity can
be used [14, 15]. For such a setup, the reliability measure is concerned with the
design of codes achieving the largest rate with a vanishing error probability, i.e.,
the largest reliable information rate corresponding to the channel capacity. On
the other hand, for applications with stringent delay constraints, long interleavers
cannot be assumed, and the transmission of a codeword only extends over a finite
number of fading realisations. In this case, the channel is considered non-ergodic.
The block-fading channel [14–16] is a useful channel model for such non-ergodic
channels undergoing a slowly-varying fading process. The important feature of
1.1 Throughout the dissertation, the term noisy/erroneous fading estimates is used interchangeably with the term imperfect/mismatched CSI; the term noiseless fading estimates is used interchangeably with the term perfect/matched CSI.
this model is that the channel remains constant within a block (which consists of
several coded symbols) and varies from block to block according to a certain
probability distribution. The block-fading channel is not information stable [17]
and therefore it has zero capacity. This means that for a given positive data rate,
the error probability cannot be made any arbitrarily small. For such a setup,
the reliability measure is concerned with code design that minimises the error
probability.
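The block-fading model and its outage event can be simulated directly. The sketch below (a SISO Rayleigh example with Gaussian inputs and perfect CSI; all parameter values are illustrative choices of the editor) estimates by Monte Carlo the probability that the mutual information accumulated over the B blocks of a codeword falls below the target rate R.

```python
import numpy as np

rng = np.random.default_rng(2)
B, R, snr, trials = 4, 2.0, 10.0, 200_000   # blocks, rate (bits/use), SNR, runs

# One Rayleigh coefficient per block: constant within a block, i.i.d. across
# blocks and across Monte Carlo trials
h = (rng.normal(size=(trials, B)) + 1j * rng.normal(size=(trials, B))) / np.sqrt(2)

# Mutual information accumulated over the B blocks of one codeword
# (Gaussian inputs, perfect CSI)
i_acc = np.mean(np.log2(1.0 + snr * np.abs(h) ** 2), axis=1)

# Information outage: the accumulated mutual information is below the rate R
p_out = float(np.mean(i_acc < R))
```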
1.2 Dissertation Outline
The dissertation is structured into two main parts as follows.
Part I
Part I studies nearest neighbour decoding for non-ergodic block-fading channels
and includes the following four chapters.
Chapter 2: The Block-Fading Channel
This chapter serves as an introduction to Part I. We first describe the
multiple-input multiple-output (MIMO) block-fading channel. We then review
some information-theoretic material for the block-fading channel with perfect
CSI, which includes the notions of mutual information, capacity and outage prob-
ability. We continue by introducing our approach to studying the block-fading
channel with imperfect CSI from a mismatched-decoding perspective. In this sec-
tion, we revisit the generalised mutual information (GMI) as an achievable rate
with a fixed decoding rule, investigate some important properties of the GMI,
introduce the generalised outage probability—the probability that the GMI is
less than the data rate—as a reliability measure, show the generalised outage
probability as the fundamental limit for independent and identically distributed
(i.i.d.) codebooks, and show the relationship between the outage probability and
the generalised outage probability. At the end of the chapter, we provide some
perspectives on the system design based on the outage probability and the gen-
eralised outage probability.
Chapter 3: MIMO Block-Fading Channels with Imperfect CSIR
This chapter studies nearest neighbour decoding in block-fading channels with
imperfect CSI at the receiver (CSIR) and no CSI at the transmitter (CSIT). We
use the mismatched-decoding approach introduced in Chapter 2 to study the
reliability of the nearest neighbour decoder. We analyse the generalised outage
probability of nearest neighbour decoding in the high signal-to-noise ratio (SNR)
regime for random codes with Gaussian and discrete signal constellations. In
particular, we characterise the diversity or the SNR-exponent, which is defined
as the high-SNR slope of the error probability curve on a logarithmic-logarithmic
scale. This characterisation, which comprises both converse and achievability of
random coding, provides meaningful and realistic channel estimation and code
design criteria when perfect fading estimation is not possible.
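In the simplest setting the SNR-exponent can be read off a closed form: for a single-block SISO Rayleigh channel with perfect CSI, the outage probability is 1 − exp(−(2^R − 1)/SNR), whose log-log slope tends to −1, i.e. the exponent equals 1. The short sketch below (illustrative values, editor's addition) checks this numerically.

```python
import numpy as np

R = 1.0                                     # target rate in bits per channel use
snr = np.array([1e2, 1e3, 1e4, 1e5])

# SISO Rayleigh fading, one block, perfect CSI: |h|^2 is exponential with unit
# mean, so the outage probability has the closed form below.
p_out = 1.0 - np.exp(-(2.0 ** R - 1.0) / snr)

# The SNR-exponent is the magnitude of the slope of log(p_out) against
# log(snr); estimated between the two largest SNR points it is close to 1.
d = -(np.log(p_out[-1]) - np.log(p_out[-2])) / (np.log(snr[-1]) - np.log(snr[-2]))
```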
Chapter 4: IR-ARQ in MIMO Block-Fading Channels with Imperfect
Feedback and CSIR
Incremental-redundancy automatic-repeat request (IR-ARQ) can be used to im-
prove the reliability of the system with the help of feedback from the receiver and
retransmission of incorrectly decoded messages. This chapter studies the error
and outage performance of IR-ARQ in block-fading channels. In particular, we
focus on the high-SNR system performance. We derive the ARQ outage diversity
accounting for imperfect feedback and CSIR. We also demonstrate how power
control can be used to further improve the performance of IR-ARQ.
Chapter 5: Mismatched CSI Outage SNR-Exponents of MIMO Block-
Fading Channels
Another way to improve the reliability of the system is by allowing the transmitter
to have access to the CSI. This chapter studies the outage performance of the
MIMO block-fading channel where neither the transmitter nor the receiver knows
the actual CSI, but both have access to a noisy version. In particular, we study the
interplay between the variances of the CSI noise at the transmitter and receiver
in the generalised outage diversity (outage SNR-exponent). We demonstrate
that obtaining a reliable channel estimate at the receiver is more important than
obtaining a reliable channel estimate at the transmitter in terms of outage SNR-
exponent. We provide some connections between mismatched CSI outage SNR-
exponents and the design of a good channel estimation scheme.
Part II
Part II studies nearest neighbour decoding for stationary ergodic fading channels
and includes the following three chapters.
Chapter 6: Stationary Fading Channels
This chapter serves as an opening to the second part of the dissertation. We
revisit existing results on the interplay between channel capacity and CSI in
stationary MIMO Gaussian flat-fading channels. We distinguish the notions of
coherent fading channels and noncoherent fading channels. We then review the
capacity behaviour of coherent and noncoherent fading channels. In particular,
we focus on the high-SNR regime and address the capacity pre-log, defined as
the limiting ratio of the capacity to the logarithm of the SNR as the SNR tends
to infinity.
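The coherent case gives a quick numerical check of the pre-log: with i.i.d. Rayleigh fading and equal power allocation (which has the same pre-log as the capacity at high SNR), the ergodic rate grows like min(nt, nr) log2(SNR), so its ratio to log2(SNR) approaches min(nt, nr). The following Monte Carlo sketch (antenna numbers and SNR are illustrative, editor's addition) estimates this ratio.

```python
import numpy as np

rng = np.random.default_rng(3)
nt, nr, trials = 2, 3, 2000        # antennas and Monte Carlo runs (illustrative)

def avg_rate(snr):
    """Ergodic rate with equal power allocation over i.i.d. Rayleigh fading."""
    H = (rng.normal(size=(trials, nr, nt))
         + 1j * rng.normal(size=(trials, nr, nt))) / np.sqrt(2)
    G = H @ H.conj().transpose(0, 2, 1)          # stacked nr x nr Gram matrices
    # log2 det(I + (snr/nt) * H H^H), averaged over the fading realisations
    _, logdet = np.linalg.slogdet(np.eye(nr) + (snr / nt) * G)
    return float(np.mean(logdet) / np.log(2.0))

snr = 1e6
prelog_est = avg_rate(snr) / np.log2(snr)        # approaches min(nt, nr) = 2
```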
Chapter 7: Pilot-Aided Channel Estimation for Stationary Fading
Channels
We study the reliable information rates of noncoherent, stationary, Gaussian,
MIMO flat-fading channels that are achievable with nearest neighbour decoding
and imperfect fading estimation. To obtain accurate fading estimates from a
time-varying stationary fading process, we introduce pilots (also known as training
sequences), which are emitted at regular intervals by the transmitter. We
analyse the behaviour of the achievable information rates in the limit as the SNR
tends to infinity. We demonstrate that nearest neighbour decoding with pilot-
aided channel estimation achieves the capacity pre-log of noncoherent multiple-
input single-output (MISO) flat-fading channels, and it achieves the best known
lower bound on the capacity pre-log of noncoherent MIMO flat-fading channels.
Chapter 8: Pilot-Aided Channel Estimation for Fading Multiple-Access
Channels
This chapter extends the use of nearest neighbour decoding with pilot-aided
channel estimation presented in Chapter 7 to the fading multiple-access chan-
nel (MAC). We first introduce a two-user MIMO fading MAC model and a
joint-transmission scheme that jointly transmits codewords from both users and
separately estimates the channel from both users at the receiver. The reliable
information rate region that is achievable with nearest neighbour decoding and
pilot-aided channel estimation is analysed and the corresponding pre-log region,
defined as the limiting ratio of the rate region to the logarithm of the SNR as the
SNR tends to infinity, is determined. We compare the joint-transmission scheme
with time-division multiple-access (TDMA) and derive sufficient conditions under
which the joint-transmission scheme is better than TDMA and under which TDMA is better
than the joint-transmission scheme.
Chapter 9: Summary and Future Research
This chapter provides concluding remarks and identifies possible areas for future
research.
Note on Published and Submitted Works^1.2
The materials in Part I have appeared in the following papers:
• A. T. Asyhari and A. Guillen i Fabregas, “Nearest neighbor decoding in
MIMO block-fading channels with imperfect CSIR”, IEEE Transactions on
Information Theory, vol. 58, no.3, pp. 1483–1517, March 2012.
• A. T. Asyhari and A. Guillen i Fabregas, “Mismatched CSI outage expo-
nents of MIMO block-fading channels”, to be submitted to IEEE Transac-
tions on Information Theory.
• A. T. Asyhari and A. Guillen i Fabregas, “MIMO ARQ block-fading chan-
nels with imperfect feedback and CSIR”, submitted to IEEE Transactions
on Wireless Communications, June 2010.
• A. T. Asyhari and A. Guillen i Fabregas, “Mismatched CSI outage expo-
nents of block-fading channels”, in Proceedings of the IEEE International
Symposium on Information Theory, Saint Petersburg, Russia, July–August
2011.
• A. T. Asyhari and A. Guillen i Fabregas, “MIMO block-fading channels
with mismatched CSIR”, in Proceedings of the International Symposium
on Information Theory and its Applications, Taichung, Taiwan, October
2010.
• A. T. Asyhari and A. Guillen i Fabregas, “Coding for the MIMO ARQ
block-fading channel with imperfect feedback and CSIR”, in Proceedings of
the IEEE Information Theory Workshop, Dublin, Ireland, August–Septem-
ber 2010.
• A. T. Asyhari and A. Guillen i Fabregas, “Nearest neighbour decoding in
block-fading channels with imperfect CSIR”, in Proceedings of the IEEE
Information Theory Workshop, Taormina, Italy, October 2009.
^1.2 For on-going (to be submitted) works, the titles of the articles have not been finalised and are subject to change.
The materials in Part II have appeared in the following papers:
• A. T. Asyhari, T. Koch and A. Guillen i Fabregas, “Nearest neighbour
decoding with pilot-aided channel estimation achieves the pre-log”, to be
submitted to IEEE Transactions on Information Theory.
• A. T. Asyhari, T. Koch and A. Guillen i Fabregas, “Nearest neighbour
decoding with pilot-assisted channel estimation for fading multiple-access
channels”, in Proceedings of the 49th Annual Allerton Conference on Con-
trol, Communication and Computing, Monticello, IL, September 2011.
• A. T. Asyhari, T. Koch and A. Guillen i Fabregas, “Nearest neighbour
decoding and pilot-aided channel estimation in stationary Gaussian flat-
fading channels”, in Proceedings of the IEEE International Symposium on
Information Theory, Saint Petersburg, Russia, July–August 2011.
1.3 Notation
In this section, we introduce notations that are generally used throughout the
dissertation. Notations which are specific to each chapter are introduced in the
chapter.
Sets or events are generally denoted by calligraphic fonts, e.g., X, and the
superscript c is used to denote their complement, e.g., X^c. Exceptions to this set
notation include the use of ℝ for the set of real numbers, ℂ for the set of complex
numbers, ℤ for the set of integers, ℤ⁺ for the set of positive integers and ℤ₀⁺ for
the set of non-negative integers.
Unless otherwise stated, we denote random scalars by uppercase letters, e.g.,
X , and their realisations by lowercase letters, e.g., x. Random vectors are denoted
by boldfaced uppercase letters, e.g., X, and their realisations are denoted by
boldfaced lowercase letters, e.g., x. Boldfaced letters 0 and 1 denote vectors with
all entries 0 and 1, respectively. To denote random matrices, we use uppercase
letters of a blackboard bold font, e.g., X. Deterministic matrices are denoted by
uppercase sans serif letters, e.g., X. The matrix In represents an identity matrix
of size n× n. Sans serif letters 0 and 1 denote matrices with all entries 0 and 1,
respectively.
We use 𝔼[·] to denote the expectation with respect to the random variables in
its argument.
For any random scalar X over an alphabet X, PX(·) denotes the probability
mass function [18] if X is discrete and the probability density function [19] if X
is continuous. Similar notations are also used for random vectors and matrices.
The univariate complex-Gaussian distribution with mean µ and variance σ²
is denoted by N(µ, σ²). The n-variate complex-Gaussian distribution with mean
µ and covariance matrix Σ is denoted by N_n(µ, Σ).
The operator |·| represents two things: it may mean the absolute value of a
scalar or the cardinality of a set. The notation ‖·‖ denotes the Euclidean norm
of a vector,

‖x‖ = √( ∑_k |x(k)|² )   (1.1)

where x(k) is the k-th element of x.
We use det(·) and tr(·) to denote the determinant and the trace of a matrix.
Other matrix operators include (·)T and (·)†, which denote the transpose and the
conjugate (Hermitian) transpose of a matrix. The notation ‖·‖_F denotes the
Frobenius norm of a matrix,

‖X‖_F = √( tr(X†X) ).   (1.2)
The indicator function of an event E is given by 1E. It is 1 if the event E
occurs and 0 otherwise.
The floor (ceiling) function ⌊x⌋ (⌈x⌉) denotes the largest (smallest) integer
smaller (greater) than or equal to x, while [x]+ = max(0, x).
We shall denote by ı the square root of negative one, i.e., ı = √−1.
Following the definition in [20], the exponential equality f(x) ≐ x^d indicates
that

lim_{x→∞} log f(x) / log x = d.

The exponential inequalities ≤̇ and ≥̇ are similarly defined.
The symbols ⪰, ⪯, ≻ and ≺ describe component-wise inequalities ≥, ≤, >
and <, respectively.
We use log(·) and log_p(·) to denote the natural and the base-p logarithm
functions, respectively.
Limit, limit superior and limit inferior are denoted by lim, lim sup and lim inf,
respectively.
Part I
Non-Ergodic Block-Fading
Channels
Chapter 2
The Block-Fading Channel
2.1 Introduction
The block-fading channel is a relevant model to study the transmission of delay-
limited applications over a slowly-varying fading channel. Within a block-fading
period, the channel gain remains constant, varying from block to block accord-
ing to some underlying distribution. Under slowly-varying fading, modern
communication systems utilising frequency-hopping schemes, such as the Global
System for Mobile Communications (GSM), and multi-carrier modulation, such
as Orthogonal Frequency-Division Multiplexing (OFDM), are practical examples
that are reasonably modelled by the block-fading channel.
Reliable communication over block-fading channels has traditionally been
studied under the assumption of perfect channel state information at the re-
ceiver (CSIR) [14], [16, 21–24]. Since only a finite number of fading realisations
are spanned in a single codeword, the block-fading channel is not information
stable [17]. The input-output mutual information between the transmitted and the
received codewords varies randomly and the channel is considered to be
non-ergodic. For most fading distributions, the Shannon capacity is strictly
zero [23]. Based on error-exponent considerations, Malkamaki and Leib [22]
showed that the outage probability, i.e., the probability that the mutual
information is smaller than the target transmission rate [14, 16], is the natural
fundamental limit of the channel. This means that in the limit as the codeword
length tends to infinity, communication with arbitrarily small error probability
is not possible, as the error probability cannot be made smaller than the outage
probability.
In order to reduce the outage probability, the transmitter can adapt its trans-
mission scheme based on the channel condition. This requires the availability of
the channel state information at the transmitter (CSIT). Assuming perfect CSIT,
some adaptive transmission schemes, which reduce the outage probability by
some significant margins, have been proposed in the literature (see, e.g., [25–32]
and references therein). For example, based on feedback from the receiver, a
scheme that uses automatic-repeat request (ARQ) protocols improves the out-
age performance by adapting the transmission rate [25, 26, 29, 32]. When the
channel condition is good, the message can be transmitted at a high data rate;
when the channel condition degrades, the message can be transmitted at lower
rates. Another example is power adaptation based on the knowledge of the fad-
ing [25,27,28,30,31]. The idea of power adaptation is that in a very bad channel
realisation, power can be saved and used when channel conditions improve.
Channel state information (CSI) therefore plays a critical role in the block-
fading channel. As perfect CSI is difficult to guarantee in practice, designing
a communication system based on perfect CSI may lead to an unexpected be-
haviour of the actual system. In this chapter, we provide an overview on the
interplay of the CSI and the reliability of data transmission in the block-fading
channel. We first introduce the system model in Section 2.2. We then review
some existing results on the block-fading channel with perfect CSI in Section
2.3. We continue by proposing a study of the block-fading channel with imper-
fect CSI using the framework of mismatched decoding in Section 2.4. Using the
results in Sections 2.3 and 2.4, we discuss some guidelines for system design for
the block-fading channel in Section 2.5. We also give some perspectives on the
remaining chapters of Part I at the end of Section 2.5.
2.2 System Model
We consider a multiple-input multiple-output (MIMO) block-fading channel with
nt transmit antennas and nr receive antennas. The channel output at block b is
an nr × J-dimensional random matrix

Y_b = √(P_b/n_t) H_b X_b + Z_b,   b = 1, . . . , B   (2.1)

where
• B and J denote the number of fading blocks and the channel block length
(number of channel uses in one block), respectively;
• X_b ∈ ℂ^{nt×J} denotes the channel input matrix at block b;
• H b denotes the nr × nt-dimensional random fading matrix at block b;
[Figure 2.1: A diagram for a MIMO block-fading channel. A message m is
encoded into X_1, . . . , X_B, each scaled by √(P_b/n_t), multiplied by the fading
H_b and corrupted by the noise Z_b to produce Y_1, . . . , Y_B; the decoder, aided
by the CSIR, outputs m̂, while the CSIT is available at the encoder.]
• Zb denotes the nr × J-dimensional additive noise matrix at block b;
• and Pb denotes the transmission power allocated at block b.
We assume that the nr × J entries of Zb, b = 1, . . . , B are independent and
identically distributed (i.i.d.) Gaussian random scalars. The random matrices
H_b, b = 1, . . . , B (whose values belong to ℂ^{nr×nt}) are drawn i.i.d. according to
a certain probability distribution. We assume that the average fading gain is
normalised, i.e., 𝔼[‖H_b‖²_F] = n_t n_r. We finally assume that for all b = 1, . . . , B,
the fading H b and the noise Zb are independent and that their joint law does not
depend on Xb.
A codeword X of length N = BJ is defined by a concatenation of B channel
input matrices, i.e., X ≜ [X_1, . . . , X_B] ∈ X^{nt×BJ}, where X ⊆ ℂ denotes the
signal constellation. We assume that the signal constellation is normalised in
energy, i.e., 𝔼[|X|²] = 1. The collection of all possible transmitted codewords
forms a codebook, which is part of the encoder. We shall consider codebooks
whose entries are generated i.i.d. from a probability distribution P_X(x) over
X^{nt}. A message m ∈ M, where M is the set of all possible messages, is mapped
into a codeword X(m) at rate

R = (1/(BJ)) log₂ |M|.   (2.2)
We shall assume throughout that the messages are equiprobable. Upon receiving
the corrupted codeword, the decoder outputs a message estimate m̂ using a
pre-defined decoding rule. We say that a rate R is achievable if there exists a
combination of encoder and decoder such that the error probability Pr{m̂ ≠ m}
vanishes as the block length J tends to infinity.
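As a minimal numerical sketch of the model above, one codeword transmission through (2.1) can be simulated as follows; all parameter values (B, J, the antenna counts, the SNR and the QPSK constellation) are our own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (our own choices): B fading blocks of length J,
# nt transmit and nr receive antennas, static power allocation P_b = SNR.
B, J, nt, nr = 4, 16, 2, 2
P = np.full(B, 10.0)

# QPSK inputs, normalised so that E[|X|^2] = 1 as assumed in the text.
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
X = rng.choice(qpsk, size=(B, nt, J))

# i.i.d. Rayleigh fading with E[||H_b||_F^2] = nt * nr, unit-variance noise.
H = (rng.standard_normal((B, nr, nt)) + 1j * rng.standard_normal((B, nr, nt))) / np.sqrt(2)
Z = (rng.standard_normal((B, nr, J)) + 1j * rng.standard_normal((B, nr, J))) / np.sqrt(2)

# Channel law (2.1): Y_b = sqrt(P_b / nt) H_b X_b + Z_b, b = 1, ..., B.
Y = np.sqrt(P[:, None, None] / nt) * (H @ X) + Z
```

Here `H @ X` performs the per-block matrix products H_b X_b, so `Y` stacks the B received matrices Y_b of size nr × J.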
Power allocation Pb, b = 1, . . . , B can be static or dynamic. Static allocation
refers to power allocation that does not change over time whereas dynamic allo-
cation refers to power allocation that possibly changes over time. An example of
static allocation is uniform power allocation that allocates power equally across
fading blocks independent of the fading realisations. Dynamic allocation is typ-
ically related to power adaptation based on the CSIT. This can be illustrated
as follows. Let fb(H) be the available CSIT at block b, which is a function of
the actual fading H. The power allocation algorithm, which minimises or
maximises a certain objective, allocates P_b such that P_b = P_b(f_b(H)). A
constraint is normally imposed on the transmission power, e.g.,

𝔼[ (1/B) ∑_{b=1}^{B} P_b(f_b(H)) ] ≤ SNR.   (2.3)
Here SNR is the average-power constraint. Dynamic power allocation is a possible
technique for adaptive transmission.
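The constraint (2.3) can be illustrated with a small simulation. The clipped channel-inversion rule below is our own toy example of a dynamic allocation P_b(f_b(H)), taking f_b(H) to be the fading gain itself; it is not a scheme proposed in the text:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy comparison of static and dynamic power allocation under the
# average-power constraint (2.3). The clipped channel-inversion rule is
# our own illustrative choice of P_b(f_b(H)), with f_b(H) = |h_b|^2.
B, SNR, trials = 4, 10.0, 100_000

def dynamic_power(gains, snr):
    """Allocate more power to weaker blocks (clipped inversion), then
    rescale so that E[(1/B) sum_b P_b] equals the constraint snr."""
    p = np.minimum(1.0 / gains, 4.0 * snr)
    return p * (snr / p.mean())

g = rng.exponential(size=(trials, B))      # fading gains |h_b|^2 (Rayleigh)
P_static = np.full((trials, B), SNR)       # uniform (static) allocation
P_dynamic = dynamic_power(g, SNR)

# Both schemes meet the average-power constraint (2.3) with equality.
avg_static, avg_dynamic = P_static.mean(), P_dynamic.mean()
```

Uniform allocation satisfies (2.3) trivially; the dynamic rule trades power across fading states while keeping the same long-run average.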
2.3 The Block-Fading Channel with Perfect CSI
In this section, we revisit some information-theoretic measures such as mutual
information, capacity and outage probability, which are fundamental for block-
fading channels with perfect CSI. We then discuss the interplay of nearest neigh-
bour decoding and noise distribution in block-fading channels.
To simplify the presentation, we shall assume that the power allocated at
block b, Pb, is static over time and known to the receiver. Hence, the channel
transition probability can be characterised by the conditional probability density
PY|X ,H (Y|X,H).
2.3.1 Mutual Information, Capacity and Outage Probability
It has been shown in [22, 33] that the average error probability for the ensemble
of random codes of rate R with a fixed channel realisation ℍ = H, input
distribution P_X(x) over X^{nt} and the maximum-likelihood decoder with perfect
CSIR is upper-bounded by

P_{e,ave}(H) ≤ 2^{−N E_r(R,H)}   (2.4)
where

E_r(R, H) = sup_{0≤ρ≤1} { (1/B) ∑_{b=1}^{B} E_0(ρ, H_b) − ρR }   (2.5)

is the error exponent for channel realisation H and

E_0(ρ, H_b) = −log₂ 𝔼[ ( 𝔼[ ( P_{Y|X,H}(Y|X′, ℍ_b) / P_{Y|X,H}(Y|X, ℍ_b) )^{1/(1+ρ)} | X, Y, ℍ_b ] )^ρ | ℍ_b = H_b ]   (2.6)
is the Gallager function for a given fading realisation Hb [33]. Here ρ is used as
the optimising variable in (2.5). Note that the inner expectation is taken over
X ′, while the outer expectation is taken over X,Y for a fixed fading realisation
H_b. Here P_{Y|X,H}(y|x, H_b) characterises the channel transition probability for
one channel use conditioned on H_b being known. Remark that the upper bound in
(2.4) corresponds to the average over all i.i.d. codebooks. The ensemble-average
in (2.4) implies that there exists a code in the ensemble whose average error
probability is bounded as P_{e,ave}(H) ≤ 2^{−N E_r(R,H)} [33]. Then, the average
error probability for that code averaged over all fading states is

P_{e,ave} ≤ 𝔼[ 2^{−N E_r(R,ℍ)} ].   (2.7)
Basic error exponent results show that Er(R,H) is positive only when
R ≤ I(H) − ε, where ε > 0 is a small number and I(H) is the input-output
mutual information

I(H) = (1/B) ∑_{b=1}^{B} I_awgn( √(P_b/n_t) H_b ),   (2.8)

where

I_awgn( √(P_b/n_t) H_b ) = 𝔼[ log₂ ( P_{Y|X,H}(Y|X, ℍ_b) / 𝔼[ P_{Y|X,H}(Y|X′, ℍ_b) | Y, ℍ_b ] ) | ℍ_b = H_b ].   (2.9)
Otherwise, Er(R,H) is zero.
Using Arimoto's converse [34] it is possible to show that the mutual information
with the optimal input distribution over X^{nt}, I⋆(H), is the largest rate that can
be reliably transmitted.^2.1 There is no rate larger than I⋆(H) having a vanishing
error probability. This converse is strong in the sense that for rates larger than
I⋆(H), the error probability tends to one for sufficiently large block length. This
^2.1 Herein the optimal input distribution refers to an input distribution over X^{nt} that maximises I(H). The maximised mutual information I⋆(H) is commonly referred to as the capacity for a given alphabet X and channel H.
converse is also the “dual” to the Gallager theorem for rates above I⋆(H), i.e., for
a fixed channel realisation H the average error probability of any coding scheme
constructed from the alphabet X is lower-bounded by [34]
P_{e,ave}(H) ≥ 1 − 2^{−N E′_r(R,H)}   (2.10)

where

E′_r(R,H) = sup_{−1≤ρ≤0} inf_{P_X(x)} { (1/B) ∑_{b=1}^{B} E_0(ρ, H_b) − ρR }   (2.11)

is of the same form as (2.5) except for the range of ρ in the supremum and for
the infimum over the probability distribution P_X(x) over X^{nt}. With the channel
realisation being random, averaging over all fading coefficients, the average error
probability becomes

P_{e,ave} ≥ 1 − 𝔼[ 2^{−N E′_r(R,ℍ)} ].   (2.12)
It has been shown in [34] that E′_r(R,H) is positive whenever R > I⋆(H) and zero
otherwise. Let P⋆_X(x) be the input distribution over X^{nt} that achieves I⋆(H).
Suppose that P⋆_X(x) is used to evaluate the Gallager error exponent (2.5). Then,
it follows from [22, 33, 34] that I(H) is equal to I⋆(H), and for sufficiently large
N, (2.7) and (2.12) converge and we obtain that [22]

P_{e,ave} ≅ inf_{ε>0} Pr{ I(ℍ) − ε < R }   (2.13)
         = Pr{ I⋆(ℍ) < R } ≜ P_out(R),   (2.14)
which is the information outage probability [16]. The above results show the
convergence of the random coding achievability and converse to the outage prob-
ability as the block length increases to infinity. These results also imply that the
outage probability is the natural fundamental limit for block-fading channels.
One has to note that the convergence in (2.14) holds when the capacity-achieving
distribution P⋆_X(x) is used to construct codebooks. For a fixed input
distribution P_X(x), the probability in (2.13) only characterises a random-coding
achievability bound on the average error probability; it does not imply anything
about the converse bound for any given code. For continuous alphabets, it is
well known that Gaussian inputs achieve the capacity of additive white Gaussian
noise (AWGN) channels. On the other hand, for a discrete alphabet X, the
capacity-achieving distribution for AWGN channels depends on the operating
SNR. At high SNR, the equiprobable distribution over X^{nt} is optimal.
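For Gaussian inputs, the outage probability can be estimated by Monte Carlo simulation using the standard per-block Gaussian-input mutual information log₂ det(I + (SNR/n_t) H_b H_b†); the parameters below are illustrative choices of our own:

```python
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo estimate of Pout(R) = Pr{ I(H) < R } for Gaussian inputs,
# with per-block mutual information log2 det(I + (SNR/nt) H_b H_b^dagger).
# All parameters (rate, SNRs, dimensions, trial count) are illustrative.
def outage_prob(R, snr, B=2, nt=2, nr=2, trials=100_000):
    H = (rng.standard_normal((trials, B, nr, nt))
         + 1j * rng.standard_normal((trials, B, nr, nt))) / np.sqrt(2)
    G = H @ H.conj().transpose(0, 1, 3, 2)            # H_b H_b^dagger
    I = np.log2(np.linalg.det(np.eye(nr) + (snr / nt) * G).real).mean(axis=1)
    return float(np.mean(I < R))

p_low = outage_prob(R=4.0, snr=10.0)
p_high = outage_prob(R=4.0, snr=100.0)
# The outage probability falls as the SNR grows; the high-SNR slope of
# this decay (in log-log scale) is the outage diversity discussed later.
```

The averaging over the axis of B blocks implements I(H) = (1/B)∑_b I_awgn(·) from (2.8) for the Gaussian-input case.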
2.3.2 Nearest Neighbour Decoding and Noise Distribution
Gallager's bound (2.4) and Arimoto's bound (2.10) are derived based on the
channel transition density conditioned on the fading being known perfectly, i.e.,
P_{Y|X,H}(Y|X, H). Given the channel output Y = [Y_1, . . . , Y_B], the fading
H = [H_1, . . . , H_B] and the codebook, the decoded message m̂ is obtained by
maximising the likelihood metric

m̂ = argmax_{m∈M} P_{Y|X,H}(Y|X(m), H).   (2.15)
If the messages m = 1, . . . , |M| are equiprobable, then this decision rule is optimal
in the sense that it minimises the error probability.
The likelihood metric P_{Y|X,H}(Y|X(m), H) depends on the distribution of the
additive noise Z = [Z_1, . . . , Z_B]. Note that using the assumptions in Section 2.2,
we can express the likelihood metric as

P_{Y|X,H}(Y|X(m), H) = ∏_{b=1}^{B} ∏_{ν=1}^{J} P_{Y|X,H}(y_{b,ν} | x_{b,ν}(m), H_b).   (2.16)
For AWGN, we have that

P_{Y|X,H}(y_{b,ν} | x_{b,ν}(m), H_b) = (1/π^{n_r}) exp( −‖ y_{b,ν} − √(P_b/n_t) H_b x_{b,ν}(m) ‖² ),   (2.17)

which by taking −log P_{Y|X,H}(Y|X(m), H) yields a distance metric and results in
(2.15) being nearest neighbour decoding:

m̂ = argmin_{m∈M} ∑_{b=1}^{B} ∑_{ν=1}^{J} ‖ y_{b,ν} − √(P_b/n_t) H_b x_{b,ν}(m) ‖².   (2.18)

It follows that for AWGN, nearest neighbour decoding is optimal if H is known
perfectly.
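The rule (2.18) can be implemented directly. The following sketch decodes one codeword from a toy random codebook with B = 1 fading block; every size and parameter here is an illustrative choice of our own:

```python
import numpy as np

rng = np.random.default_rng(3)

# Sketch of the nearest neighbour rule (2.18) for a toy random codebook
# with B = 1 fading block; all sizes are illustrative choices.
nt, nr, J, num_msgs, snr = 2, 2, 8, 16, 100.0

qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
codebook = rng.choice(qpsk, size=(num_msgs, nt, J))   # X(1), ..., X(|M|)

def nn_decode(Y, H, codebook, snr, nt):
    """Return argmin_m || Y - sqrt(snr/nt) H X(m) ||_F^2, as in (2.18)."""
    diffs = Y[None] - np.sqrt(snr / nt) * (H @ codebook)
    return int(np.argmin(np.sum(np.abs(diffs) ** 2, axis=(1, 2))))

m = 5                                                 # transmitted message
H = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)
Z = (rng.standard_normal((nr, J)) + 1j * rng.standard_normal((nr, J))) / np.sqrt(2)
Y = np.sqrt(snr / nt) * (H @ codebook[m]) + Z
m_hat = nn_decode(Y, H, codebook, snr, nt)
# At this SNR the minimum-distance decision recovers m with high probability.
```

The broadcasted product `H @ codebook` evaluates H X(m) for all |M| candidate codewords at once, so the decoder is a single `argmin` over the squared Frobenius distances.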
If the additive noise is not Gaussian, then nearest neighbour decoding may
no longer be optimal. Optimal decoding is derived from maximising the metric
PY|X ,H (Y|X,H), which has a similar form to the density of the noise distribution.
However, Lapidoth [4] has shown for Gaussian inputs that even though nearest
neighbour decoding is suboptimal for additive non-Gaussian noise, it achieves
the same level of reliability as that achieved for AWGN. It follows that nearest
neighbour decoding provides a robust design approach for any noise distribution
in the channel. However, one should note that since AWGN is the worst noise
that minimises the mutual information for Gaussian inputs, a drawback of using
nearest neighbour decoding is that any noise in the channel will appear to be as
harsh as AWGN with equal power.
2.4 The Block-Fading Channel with Imperfect CSI
In this section, we introduce a mismatched-decoding approach to analyse block-
fading channels with imperfect CSI. We shall assume that the receiver has no
perfect knowledge of ℍ_b but rather has access to the noisy estimate

Ĥ_b = ℍ_b + ℰ_b,   b = 1, . . . , B   (2.19)

where ℰ_b is the CSIR noise matrix (or the estimation error matrix) at block b.
The receiver then employs a decoding rule based on a positive metric
Q_{Y|X,Ĥ}(Y|X, Ĥ). We further assume that this decoding metric is a product of
the individual symbol metrics, i.e.,

Q_{Y|X,Ĥ}(Y|X, Ĥ) = ∏_{b=1}^{B} ∏_{ν=1}^{J} Q_{Y|X,Ĥ}(y_{b,ν} | x_{b,ν}, Ĥ_b)   (2.20)

where each individual metric is positive and bounded by one, i.e.,

0 < Q_{Y|X,Ĥ}(y_{b,ν} | x_{b,ν}, Ĥ_b) < 1.   (2.21)
Note that the metric from the channel transition probability P_{Y|X,H}(Y|X, H)
has the same properties as (2.20) and (2.21). Similarly to Section 2.3, we assume
static power allocation.
In the following, we study an upper bound on the error probability for a
maximum-metric decoder based on the metric Q_{Y|X,Ĥ}(Y|X, Ĥ), which is an
extension of (2.15), i.e., the output message m̂ is selected as

m̂ = argmax_{m∈M} Q_{Y|X,Ĥ}(Y|X(m), Ĥ).   (2.22)

We evaluate the upper bound on the error probability in the limit of large block
length and specifically introduce the notions of generalised mutual information
(GMI) [13, 35] and generalised outage probability, which provide basic tools for
the analysis in Chapters 3, 4 and 5.
2.4.1 Generalised Gallager’s Bound, GMI and Generalised Outage
We say that mismatched decoding occurs if decoding based on Q_{Y|X,Ĥ}(Y|X, Ĥ)
does not match the decoding based on P_{Y|X,H}(Y|X, H) [13, 35–37]. This problem
is encountered in a wide range of communication systems, where the only way to
obtain CSIR is via a channel estimator, producing accurate yet imperfect
estimates of the channel coefficients. Note that if Q_{Y|X,Ĥ}(Y|X, Ĥ) has the same
form as P_{Y|X,H}(Y|X, H) but with H replaced by Ĥ, then Q_{Y|X,Ĥ}(Y|X, Ĥ) is a
distance metric with a noisy channel estimate and (2.22) yields a nearest
neighbour decoder. In this case, the nearest neighbour decoder treats the channel
estimate Ĥ_b as if it were the true channel.
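This mismatched nearest neighbour decoder is easy to simulate. The sketch below compares decoding with the true H against decoding with the noisy estimate Ĥ = H + E; the sizes, SNR and estimation-noise variance are our own choices:

```python
import numpy as np

rng = np.random.default_rng(4)

# Sketch of mismatched nearest neighbour decoding: the rule (2.18) is
# applied with the noisy estimate Hhat = H + E of (2.19) in place of H.
# Sizes, SNR and the estimation-noise variance are our own choices.
nt, nr, J, num_msgs, snr, sigma_e2, trials = 2, 2, 8, 16, 30.0, 0.3, 2000

qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def nn_decode(Y, H_used, codebook):
    d = Y[None] - np.sqrt(snr / nt) * (H_used @ codebook)
    return int(np.argmin(np.sum(np.abs(d) ** 2, axis=(1, 2))))

errors_matched = errors_mismatched = 0
for _ in range(trials):
    codebook = rng.choice(qpsk, size=(num_msgs, nt, J))
    m = int(rng.integers(num_msgs))
    H, Z = crandn(nr, nt), crandn(nr, J)
    E = np.sqrt(sigma_e2) * crandn(nr, nt)            # CSIR noise of (2.19)
    Y = np.sqrt(snr / nt) * (H @ codebook[m]) + Z
    errors_matched += nn_decode(Y, H, codebook) != m
    errors_mismatched += nn_decode(Y, H + E, codebook) != m
# Treating the noisy estimate as the true channel degrades the decoder,
# so the mismatched error count is statistically larger.
```

Running this with a smaller `sigma_e2` shrinks the gap between the two error counts, consistent with the CSIR-quality analysis of the following chapters.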
Following the same steps outlined in Section 2.3.1, we can upper-bound the
error probability of the ensemble of random codes as [35, 37, 38]

P_{e,ave}(H) ≤ 2^{−N E^Q_r(R,Ĥ)}   (2.23)

where now the mismatched-decoding error exponent is

E^Q_r(R, Ĥ) = sup_{s>0, 0≤ρ≤1} { (1/B) ∑_{b=1}^{B} E^Q_0(s, ρ, Ĥ_b) − ρR }   (2.24)

and
E^Q_0(s, ρ, Ĥ_b) = −log₂ 𝔼[ ( 𝔼[ ( Q_{Y|X,Ĥ}(Y|X′, Ĥ_b) / Q_{Y|X,Ĥ}(Y|X, Ĥ_b) )^s | X, Y, ℍ_b, ℰ_b ] )^ρ | ℍ_b = H_b, ℰ_b = E_b ]   (2.25)

is the generalised Gallager function for a given fading realisation H_b and
estimation error E_b [35]. Here s and ρ are used as the optimising variables
in (2.24).
Remark 2.1. The use of the decoder metric Q_{Y|X,Ĥ}(Y|X, Ĥ) here is not
restricted to the distance metric with a noisy channel estimate. The random
coding upper bound holds for any bounded positive decoding metric
satisfying (2.21).
Proposition 2.1 (Concavity of E^Q_0(s, ρ, Ĥ_b)). For a fixed input distribution,
the generalised Gallager function E^Q_0(s, ρ, Ĥ_b) is a concave function of s for
s > 0 and of ρ for 0 ≤ ρ ≤ 1.
Proof. See Section 2.4.2.1.
Since E^Q_0(s, ρ, Ĥ_b) is concave in ρ for 0 ≤ ρ ≤ 1, the maximum slope of
E^Q_0(s, ρ, Ĥ_b) with respect to ρ occurs at ρ = 0. The maximisation over s results
in a maximum slope equal to the GMI [13, 35], given by

I^gmi(Ĥ) = sup_{s>0} (1/B) ∑_{b=1}^{B} I^gmi_b(SNR, H_b, Ĥ_b, s)   (2.26)

where

I^gmi_b(SNR, H_b, Ĥ_b, s) = 𝔼[ log₂ ( Q^s_{Y|X,Ĥ}(Y|X, Ĥ_b) / 𝔼[ Q^s_{Y|X,Ĥ}(Y|X′, Ĥ_b) | Y, ℍ_b, ℰ_b ] ) | ℍ_b = H_b, ℰ_b = E_b ].   (2.27)

Note that if Ĥ = H, then I^gmi(Ĥ) is equal to I(H). The GMI I^gmi(Ĥ) has the
following properties.
Proposition 2.2 (Non-Negativity of the GMI). For a decoding metric that can
be expressed as (2.20) and satisfies (2.21), I^gmi(Ĥ) is always non-negative, i.e.,

I^gmi(Ĥ) ≥ 0.   (2.28)
Proof. See Section 2.4.2.2.
Proposition 2.3 (GMI Upper Bound). The GMI is upper-bounded as

I^gmi(Ĥ) ≤ (1/B) ∑_{b=1}^{B} sup_{s_b>0} I^gmi_b(SNR, H_b, Ĥ_b, s_b).   (2.29)
Proof. See Section 2.4.2.3.
The maximum-slope analysis of the concave function E^Q_0(s, ρ, Ĥ_b) shows that
the exponent E^Q_r(R, Ĥ) is positive only when R ≤ I^gmi(Ĥ) − ε, and zero
otherwise, proving the achievability of I^gmi(Ĥ). Then, following the same
argument as that in [33], there exists a code in the ensemble such that the
average error probability, averaged over all fading and its corresponding estimate
states, is bounded as

P_{e,ave} ≤ 𝔼[ 2^{−N E^Q_r(R,Ĥ)} ],   (2.30)

which, for large J (noting that N = BJ) becomes

P_{e,ave} ≤ inf_{ε>0} Pr{ I^gmi(Ĥ) − ε < R }   (2.31)
         = Pr{ I^gmi(Ĥ) < R } ≜ P_gout(R),   (2.32)

the generalised outage probability.
The above analysis shows the achievability of P_gout(R): for large J one may find
codes whose error probability approaches P_gout(R). Unfortunately, there are no
generally tight converse results for mismatched decoding [13], which implies that
one might be able to find codes whose error probability for large J is lower than
P_gout(R). However, as shown in [4, 36, 39], a converse exists for i.i.d. codebooks,
i.e., no rate larger than the GMI can be transmitted with vanishing error
probability.
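For a scalar channel with Gaussian inputs and the nearest neighbour metric q(y|x, ĥ) = exp(−|y − √SNR ĥ x|²), the GMI of (2.26)-(2.27) can be estimated by Monte Carlo; for a Gaussian X′ the inner expectation has a standard closed form. All parameter values below are our own:

```python
import numpy as np

rng = np.random.default_rng(5)

# Monte Carlo sketch of the GMI (2.26)-(2.27) for a scalar (nt = nr = B = 1)
# channel with Gaussian inputs and the nearest neighbour metric
# q(y|x, hhat) = exp(-|y - sqrt(snr) hhat x|^2); parameters are our own.
snr, sigma_e2, n = 10.0, 0.1, 200_000

def crandn(shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

h = complex(crandn(1)[0])                             # one fading realisation
hhat = h + np.sqrt(sigma_e2) * complex(crandn(1)[0])  # noisy estimate, (2.19)
x, z = crandn(n), crandn(n)
y = np.sqrt(snr) * h * x + z

def gmi_bits(s):
    a = np.sqrt(snr) * hhat
    log_num = -s * np.abs(y - a * x) ** 2             # log q^s(Y|X, Hhat)
    # Closed form of E_{X'}[q^s(Y|X', Hhat)] for X' ~ CN(0, 1):
    c = 1.0 + s * abs(a) ** 2
    log_den = -np.log(c) - s * np.abs(y) ** 2 / c
    return float(np.mean(log_num - log_den) / np.log(2.0))

i_gmi = max(gmi_bits(s) for s in np.linspace(0.05, 2.0, 40))  # sup over s (grid)
# With a perfect estimate (hhat = h) the supremum is attained near s = 1
# and recovers the matched Gaussian-input rate; a noisy hhat lowers it.
```

The grid search over s stands in for the supremum in (2.26); for this metric the mismatch enters only through ĥ, so repeating the experiment over many (h, ĥ) draws and thresholding against R gives a Monte Carlo estimate of P_gout(R).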
Proposition 2.4 (Generalised Outage Converse). For i.i.d. codebooks with
sufficiently large block length, we have that

P_out(R) ≤ P_gout(R) ≤ P_{e,ave}.   (2.33)
Proof. The inequality P_{e,ave} ≥ P_gout(R) follows from the GMI converse in [39]
for i.i.d. codebooks. Furthermore, due to the data-processing inequality for error
exponents, E^Q_r(R, Ĥ) ≤ E_r(R, H) [35, 37], we obtain that
I^gmi(Ĥ) ≤ I(H) ≤ I⋆(H) [13], and hence P_gout(R) ≥ P_out(R).
From the above proposition, we say that the generalised outage probability is the
fundamental limit for i.i.d. codebooks. If one were to allow non-i.i.d. codebook
constructions, then an error probability smaller than the generalised outage
probability could potentially be achieved. From the works in [13, 36, 38], it can
be shown that, conditioned on the fading and the estimation error, one can
obtain achievable rates (e.g., rates below the LM bound for the mismatched
capacity [13]) that are larger than the GMI, which implies that the error
probability can be smaller than the generalised outage probability. The drawback
of using the achievability results in [13, 36, 38] is that the LM bound can be very
difficult to compute because of the optimisation over the cost function.
2.4.2 Section 2.4.1 Proofs
2.4.2.1 Concavity of E^Q_0(s, ρ, Ĥ_b)
Fix the input distribution P_X(x). We first assume 0 < g_0 < g_1, 0 < δ < 1 and
ρ = δg_0 + (1 − δ)g_1. Define

f(s, x, y) ≜ 𝔼[ ( Q_{Y|X,Ĥ}(Y|X′, Ĥ_b) / Q_{Y|X,Ĥ}(Y|X, Ĥ_b) )^s | X = x, Y = y, ℍ_b, ℰ_b ]   (2.34)

so that

E^Q_0(s, ρ, Ĥ_b) = −log₂ 𝔼[ f(s, X, Y)^ρ | ℍ_b = H_b, ℰ_b = E_b ].   (2.35)

Then, we have that

𝔼[ f(s, X, Y)^ρ | ℍ_b = H_b, ℰ_b = E_b ] = 𝔼[ f(s, X, Y)^{δg_0} f(s, X, Y)^{(1−δ)g_1} | ℍ_b = H_b, ℰ_b = E_b ].   (2.36)

Using Hölder's inequality [40], we have that

𝔼[ f(s, X, Y)^{δg_0} f(s, X, Y)^{(1−δ)g_1} | ℍ_b = H_b, ℰ_b = E_b ]
≤ ( 𝔼[ f(s, X, Y)^{g_0} | ℍ_b = H_b, ℰ_b = E_b ] )^δ × ( 𝔼[ f(s, X, Y)^{g_1} | ℍ_b = H_b, ℰ_b = E_b ] )^{1−δ}.   (2.37)

Taking the logarithm, a monotonically increasing function, on both sides yields

−E^Q_0(s, ρ, Ĥ_b) ≤ −δ E^Q_0(s, g_0, Ĥ_b) − (1 − δ) E^Q_0(s, g_1, Ĥ_b),   (2.38)

which shows the concavity of E^Q_0(s, ρ, Ĥ_b) in ρ for ρ ≥ 0.
Now, let s = δg_0 + (1 − δ)g_1. Then,

f(s, x, y) = 𝔼[ ( Q_{Y|X,Ĥ}(Y|X′, Ĥ_b) / Q_{Y|X,Ĥ}(Y|X, Ĥ_b) )^{δg_0+(1−δ)g_1} | X = x, Y = y, ℍ_b, ℰ_b ]   (2.39)
≤ ( 𝔼[ ( Q_{Y|X,Ĥ}(Y|X′, Ĥ_b) / Q_{Y|X,Ĥ}(Y|X, Ĥ_b) )^{g_0} | X = x, Y = y, ℍ_b, ℰ_b ] )^δ
  × ( 𝔼[ ( Q_{Y|X,Ĥ}(Y|X′, Ĥ_b) / Q_{Y|X,Ĥ}(Y|X, Ĥ_b) )^{g_1} | X = x, Y = y, ℍ_b, ℰ_b ] )^{1−δ}   (2.40)
= f(g_0, x, y)^δ × f(g_1, x, y)^{1−δ}   (2.41)

where the inequality follows from Hölder's inequality [40]. Evaluating (2.35), we
have that

𝔼[ f(s, X, Y)^ρ | ℍ_b = H_b, ℰ_b = E_b ]
≤ 𝔼[ f(g_0, X, Y)^{ρδ} × f(g_1, X, Y)^{ρ(1−δ)} | ℍ_b = H_b, ℰ_b = E_b ]   (2.42)
≤ ( 𝔼[ f(g_0, X, Y)^ρ | ℍ_b = H_b, ℰ_b = E_b ] )^δ × ( 𝔼[ f(g_1, X, Y)^ρ | ℍ_b = H_b, ℰ_b = E_b ] )^{1−δ}.   (2.43)

Taking the logarithm on both sides gives

−E^Q_0(s, ρ, Ĥ_b) ≤ −δ E^Q_0(g_0, ρ, Ĥ_b) − (1 − δ) E^Q_0(g_1, ρ, Ĥ_b),   (2.44)

which proves the concavity of E^Q_0(s, ρ, Ĥ_b) in s for s ≥ 0.
2.4.2.2 Non-Negativity of I^gmi(Ĥ)

Let s⋆ be the value of s achieving the supremum on the right-hand side (RHS)
of (2.26). Then, by substituting a specific value of s on the RHS of (2.26),
namely the limit s ↓ 0, we have that

I^gmi(Ĥ) = (1/B) ∑_{b=1}^{B} I^gmi_b(SNR, H_b, Ĥ_b, s⋆)   (2.45)
        ≥ lim_{s↓0} (1/B) ∑_{b=1}^{B} I^gmi_b(SNR, H_b, Ĥ_b, s)   (2.46)
because s⋆ always maximises the RHS of (2.26). Let s = 1/s′. Note from (2.27)
and (2.46) that

lim_{s↓0} I^gmi_b(SNR, H_b, Ĥ_b, s)
= lim_{s′↑∞} { (1/s′) 𝔼[ log₂ Q_{Y|X,Ĥ}(Y|X, Ĥ_b) | ℍ_b = H_b, ℰ_b = E_b ]
  − 𝔼[ log₂ 𝔼[ Q^{1/s′}_{Y|X,Ĥ}(Y|X′, Ĥ_b) | Y, ℍ_b, ℰ_b ] | ℍ_b = H_b, ℰ_b = E_b ] }   (2.47)
= lim_{s′↑∞} −𝔼[ log₂ 𝔼[ Q^{1/s′}_{Y|X,Ĥ}(Y|X′, Ĥ_b) | Y, ℍ_b, ℰ_b ] | ℍ_b = H_b, ℰ_b = E_b ].   (2.48)
Since the function Q^{1/s′}_{Y|X,Ĥ}(y|x′, Ĥ_b) can be bounded using (2.21) as

0 < Q^{1/s′}_{Y|X,Ĥ}(y|x′, Ĥ_b) < 1   (2.49)

for s = 1/s′ > 0, we have that

𝔼[ Q^{1/s′}_{Y|X,Ĥ}(Y|X′, Ĥ_b) | Y, ℍ_b, ℰ_b ] < 1   (2.50)

and

−log₂ 𝔼[ Q^{1/s′}_{Y|X,Ĥ}(Y|X′, Ĥ_b) | Y, ℍ_b, ℰ_b ] > 0.   (2.51)
It follows from (2.48) that

lim_{s′↑∞} −𝔼[ log₂ 𝔼[ Q^{1/s′}_{Y|X,Ĥ}(Y|X′, Ĥ_b) | Y, ℍ_b, ℰ_b ] | ℍ_b = H_b, ℰ_b = E_b ]
≥ 𝔼[ lim_{s′↑∞} −log₂ 𝔼[ Q^{1/s′}_{Y|X,Ĥ}(Y|X′, Ĥ_b) | Y, ℍ_b, ℰ_b ] | ℍ_b = H_b, ℰ_b = E_b ]   (2.52)
= 𝔼[ −log₂ 𝔼[ lim_{s′↑∞} Q^{1/s′}_{Y|X,Ĥ}(Y|X′, Ĥ_b) | Y, ℍ_b, ℰ_b ] | ℍ_b = H_b, ℰ_b = E_b ]   (2.53)
= 0   (2.54)

where inequality (2.52) is obtained by applying Fatou's lemma [41], and equality
(2.54) is obtained by applying the dominated convergence theorem [19]. This
proves Proposition 2.2.
2.4.2.3 GMI Upper Bound

We have that

sup_{s>0} (1/B) ∑_{b=1}^{B} I^gmi_b(SNR, H_b, Ĥ_b, s) ≤ (1/B) ∑_{b=1}^{B} sup_{s_b>0} I^gmi_b(SNR, H_b, Ĥ_b, s_b).   (2.55)

On the left-hand side (LHS), the supremum over a single s is taken over all B
blocks, and thus the optimising s does not necessarily maximise
I^gmi_b(SNR, H_b, Ĥ_b, s) for each b. In the upper bound, by contrast, the
supremum is taken separately for each b.
2.5 Outage Bounds, Diversity and System Design
In this section, we describe how the outage probability and the generalised
outage probability can be used as guidelines for system design. We start by
defining important reliability measures at high SNR.
Definition 2.1 (Code Diversity). A code is said to have diversity d⋆ if it holds
that

d⋆ = lim_{SNR→∞} −log P_{e,ave}(SNR) / log SNR   (2.56)

where P_{e,ave}(SNR) is the average error probability of that code.
Diversity is commonly referred to as the SNR-exponent. In this case, the code
diversity measures the high-SNR slope of the error probability curve (plotted in
log-log scale).
Definition 2.2 (Multiplexing Gain). A coding scheme is said to have
multiplexing gain r_g if it holds that

r_g = lim_{SNR→∞} R(SNR) / log SNR   (2.57)

where R(SNR) is the coding rate.
The multiplexing gain was introduced in [20] and characterises the high-SNR
linear gain of the coding rate with respect to the logarithm of the SNR.
Under the perfect-CSIR assumption, we have shown that the outage probability serves as the fundamental limit of block-fading channels. In the following, we explain one particular aspect of code design for a fixed block length that utilises the outage result.
Definition 2.3 (Outage Diversity). The outage diversity $d_{\mathrm{csir}}$ is defined as the high-SNR slope of the outage probability curve, as a function of the SNR, on a log-log scale plot when the receiver has access to perfect CSIR:
$d_{\mathrm{csir}} \triangleq \lim_{\mathrm{SNR}\to\infty} \frac{-\log P_{\mathrm{out}}(R)}{\log \mathrm{SNR}}.$ (2.58)
Remark that the above $P_{\mathrm{out}}(R)$ is defined as the probability that the mutual information $I^\star(\mathbf{H})$, maximised over all probability distributions on the alphabet $\mathcal{X}^{n_t}$, is less than the data rate $R$.
Suppose that dcsir is a finite real-valued quantity, which is the case for many
fading distributions. Then, we have the following lemma on a lower bound to
the error probability at high SNR [20].
Lemma 2.1 (Converse Outage Bound). For any coding scheme with any fixed length, the average error probability at high SNR is lower-bounded as
$P_{e,\mathrm{ave}}(\mathrm{SNR}) \,\dot\ge\, \mathrm{SNR}^{-d_{\mathrm{csir}}}.$ (2.59)
Proof. This lemma has been proven using Fano's inequality in [20]. Here we provide an alternative proof based on Arimoto's converse. Recall that $E'_r(R,H)$ is positive if and only if $R > I^\star(H)$ and zero otherwise. When the error exponent $E'_r(R,H) = 0$, we have $1 - 2^{-N E'_r(R,H)} = 0$. Therefore, the error probability can be lower-bounded as
$P_{e,\mathrm{ave}} \ge \mathbb{E}\big[ 1 - 2^{-N E'_r(R,\mathbf{H})} \big]$ (2.60)
$= \int_{I^\star(H) < R} \left( 1 - 2^{-N E'_r(R,H)} \right) P_{\mathbf{H}}(H)\, dH$ (2.61)
$= \int_{I^\star(H) < R} P_{\mathbf{H}}(H)\, dH - \int_{I^\star(H) < R} 2^{-N E'_r(R,H)} P_{\mathbf{H}}(H)\, dH$ (2.62)
$= P_{\mathrm{out}}(R) - \int_{I^\star(H) < R} 2^{-N E'_r(R,H)} P_{\mathbf{H}}(H)\, dH$ (2.63)
$\doteq P_{\mathrm{out}}(R).$ (2.64)
Note that for the set $I^\star(H) < R$, we have that
$\int_{I^\star(H) < R} 2^{-N E'_r(R,H)} P_{\mathbf{H}}(H)\, dH < P_{\mathrm{out}}(R)$ (2.65)
with strict inequality due to $2^{-N E'_r(R,H)} < 1$ and the non-zero probability measure of the set $I^\star(H) < R$. Since the RHS of Arimoto's converse is bounded below by zero, as the SNR tends to infinity, the integral of $2^{-N E'_r(R,H)}$ decays with the SNR at a rate faster than or equal to the decay rate of $P_{\mathrm{out}}(R)$.
Remark 2.2. The proof of Lemma 2.1 in [20] works for finite-length codes when the multiplexing gain $r_g$ is non-zero. For fixed-rate transmission (zero multiplexing gain), it no longer works unless we assume that the block length grows with $\log \mathrm{SNR}$. Herein we make no assumption on the multiplexing gain $r_g$ or the block length $J$ to prove Lemma 2.1. Hence, our proof based on Arimoto's converse is more general than the proof based on Fano's inequality in [20].
Lemma 2.1 provides a guideline for designing good codes. The lower bound implies that the code diversity of any coding scheme with a fixed block length $J$ can never be larger than the outage diversity $d_{\mathrm{csir}}$. This result can be used as a benchmark for code design, i.e., a good code should have a diversity that approaches or equals the outage diversity. In fact, this result has been used to construct practical codes from discrete alphabets. Following [20], it has been shown in [23] that for single-input single-output (SISO) Rayleigh block-fading channels with channel inputs drawn from a discrete alphabet $\mathcal{X}$ with alphabet size $|\mathcal{X}| = 2^M$, the outage diversity $d_{\mathrm{csir}}$ is given by the Singleton bound [42]
$d_{\mathrm{SB}}(R) \triangleq 1 + \left\lfloor B \left( 1 - \frac{R}{M} \right) \right\rfloor.$ (2.66)
This Singleton bound turns out to be one of the design criteria in [21–23]: optimal codes for the block-fading channel should be maximum-distance separable (MDS) on a blockwise basis, i.e., they should achieve the Singleton bound on the blockwise Hamming distance of the code with equality. Families of blockwise MDS codes based on convolutional codes [22, 23], turbo codes [43], low-density parity-check (LDPC) codes [44] and Reed–Solomon codes [21] have been proposed in the literature. Remark that $d_{\mathrm{SB}}(R)$ reveals a fundamental trade-off between the SNR-exponent (diversity) and the target rate.
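The diversity-rate trade-off in (2.66) can be tabulated directly; the following Python helper (our own sketch, not code from the thesis) evaluates the SISO Singleton bound for a few target rates.

```python
import math

def singleton_bound_siso(R, B, M):
    """SISO Singleton bound of eq. (2.66): d_SB(R) = 1 + floor(B(1 - R/M))."""
    return 1 + math.floor(B * (1 - R / M))

# B = 4 fading blocks, 16-point constellation (M = 4): diversity vs rate
for R in (1.0, 2.0, 3.0, 4.0):
    print(R, singleton_bound_siso(R, B=4, M=4))  # 4, 3, 2, 1
```

As the rate grows towards $M$ bits per channel use, the achievable diversity drops in unit steps, illustrating the trade-off discussed above.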
We next illustrate how we can use the result in Lemma 2.1 for system design
when perfect CSIR cannot be assumed. We first define a relevant outage measure
for imperfect CSIR at high SNR.
Definition 2.4 (Generalised Outage Diversity). The generalised outage diversity $d_{\mathrm{icsir}}$ is defined as the high-SNR slope of the generalised outage probability curve on a log-log scale plot when the receiver only knows the noisy CSIR:
$d_{\mathrm{icsir}} \triangleq \lim_{\mathrm{SNR}\to\infty} \frac{-\log P_{\mathrm{gout}}(R)}{\log \mathrm{SNR}}.$ (2.67)
From the data-processing inequality (Proposition 2.4), we have $P_{\mathrm{gout}}(R) \ge P_{\mathrm{out}}(R)$, which implies that $d_{\mathrm{icsir}} \le d_{\mathrm{csir}}$. Note that the lower bound in Lemma 2.1 holds for coding schemes with mismatched decoding as well, since the maximum-likelihood decoder (for perfect CSIR) gives the smallest error probability. An interesting question is whether it is possible to design schemes that yield $d_{\mathrm{icsir}} = d_{\mathrm{csir}}$. If we can achieve $d_{\mathrm{icsir}} = d_{\mathrm{csir}}$, the random coding arguments in Section 2.4 imply that we can find a code with the highest possible diversity.
By assuming that the decoder is supplied with noisy fading estimates, we implicitly consider schemes that separately perform channel estimation and message decoding. It is thus intuitive that the accuracy of channel estimation affects the reliability of message decoding. Using nearest neighbour decoding, we prove rigorously in Chapter 3 that $d_{\mathrm{icsir}}$ can indeed be expressed as a function of both the accuracy of the channel estimation and the perfect-CSIR diversity $d_{\mathrm{csir}}$. We also demonstrate how random coding schemes can achieve the perfect-CSIR diversity for a given block length $J$.
If the channel condition is known prior to transmission, the transmitter can employ adaptive schemes to improve the generalised outage diversity. In Chapter 4, we study how to improve the generalised outage diversity $d_{\mathrm{icsir}}$ using an ARQ scheme. ARQ improves the outage probability by retransmitting any erroneously decoded codewords. ARQ with perfect CSIR and feedback has been studied in the literature (see for example [25, 26, 29]); we investigate such a scheme with imperfect CSIR and feedback. In Chapter 5, we characterise the improvement of the generalised outage diversity $d_{\mathrm{icsir}}$ when power adaptation based on noisy CSIT is used. Several works have addressed power adaptation in block-fading channels; see for example [27, 30, 31] for works with perfect CSI at both terminals, and [45, 46] for works with imperfect CSIT (assuming perfect CSIR). In this dissertation, we study imperfect CSIT and CSIR in a unified framework.
Chapter 3
MIMO Block-Fading Channels
with Imperfect CSIR
We have shown in Chapter 2 that for perfect CSIR, nearest neighbour decoding yields the minimal error probability. However, perfect CSIR is difficult to obtain in practice because of hardware limitations and the time-varying characteristics of the channel.
In this chapter, we study the performance of nearest neighbour decoding for
transmission over MIMO block-fading channels when the receiver is unable to
obtain the perfect CSIR. We apply the mismatched-decoding approach intro-
duced in Section 2.4 to evaluate the reliability of nearest neighbour decoding.
More specifically, we utilise the generalised mutual information (GMI) and the
generalised outage probability. In summary, the GMI is an achievable rate when
a fixed decoding rule—which is not necessarily matched to the channel—is em-
ployed [13, 36]. In our case, it characterises the maximum communication rate
under fixed fading and fading estimate realisations, below which the average er-
ror probability is guaranteed to vanish as the codeword length tends to infinity.
Due to the time-varying fading and its corresponding estimate, the GMI is ran-
dom. The generalised outage probability, the probability that the GMI is less
than the target rate, serves as an achievable error probability performance for
MIMO block-fading channels. By achievability, we mean that random codes are
able to achieve this performance but it does not mean that there are no codes
that perform better than the generalised outage probability. However, as shown
in [4, 36, 39], i.i.d. codebooks have a GMI converse, i.e., no rate larger than
the GMI can be transmitted with vanishing error probability for i.i.d. code-
books. Thus, for i.i.d. codebooks, the GMI is the largest achievable rate and the
generalised outage probability becomes the fundamental limit for block-fading
channels with mismatched CSIR.
We are particularly interested in the reliability performance of nearest neigh-
bour decoding in the high-SNR regime. More specifically, we are interested in
the generalised outage diversity
$d_{\mathrm{icsir}} \triangleq \lim_{\mathrm{SNR}\to\infty} \frac{-\log P_{\mathrm{gout}}(R)}{\log \mathrm{SNR}}$ (3.1)
defined in (2.67). This reliability measure is a natural extension of the outage diversity
$d_{\mathrm{csir}} \triangleq \lim_{\mathrm{SNR}\to\infty} \frac{-\log P_{\mathrm{out}}(R)}{\log \mathrm{SNR}}$ (3.2)
defined in (2.58). Here dcsir characterises the perfect-CSIR diversity, which in
turn provides a performance benchmark for practical codes, i.e., the highest
diversity that a code can achieve is the outage diversity (see Section 2.5). In this
chapter, we shall take a closer look at the relationship between the imperfect-
CSIR generalised outage diversity dicsir and the perfect-CSIR outage diversity
dcsir. Our main results serve as practical guidelines for system design that take
into account the CSIR estimation error.
The rest of the chapter is outlined as follows. Based on the setup in Sec-
tion 2.2, Section 3.1 introduces specific models for the fading distribution, the
codebook generation and the imperfect CSIR. Using Gaussian and discrete-
constellation codebooks, Section 3.2 revisits some existing results on the charac-
terisation of the perfect-CSIR outage diversity. Section 3.3 establishes our main
theorem on the generalised outage diversity. Section 3.4 discusses our results on
the random coding achievability. Section 3.5 provides important remarks, shows
numerical evidence and discusses the generality of the results with respect to the
fading distribution. Finally, Section 3.6 summarises the important points of the
chapter.
3.1 System Model
We recall the channel model for a MIMO block-fading channel with nt transmit
antennas, nr receive antennas and B fading blocks in Section 2.2. Herein we
assume that the CSIT is not available; thus, the uniform power allocation over all
fading blocks and transmit antennas is used in (3.3), i.e., Pb = SNR, b = 1, . . . , B.
Figure 3.1: A MIMO block-fading channel with imperfect CSIR.
The channel output at block $b$ is an $n_r \times J$-dimensional random matrix
$\mathbf{Y}_b = \sqrt{\frac{\mathrm{SNR}}{n_t}} \mathbf{H}_b \mathbf{X}_b + \mathbf{Z}_b, \quad b = 1, \ldots, B$ (3.3)
where
• $B$ and $J$ denote the number of fading blocks and the channel block length, respectively;
• $\mathbf{X}_b$ denotes the $n_t \times J$-dimensional channel input matrix at block $b$;
• $\mathbf{H}_b$ denotes the $n_r \times n_t$-dimensional random fading matrix at block $b$;
• and $\mathbf{Z}_b$ denotes the $n_r \times J$-dimensional additive noise matrix at block $b$. We assume that the $n_r \times J$ entries of $\mathbf{Z}_b$, $b = 1, \ldots, B$, are i.i.d. complex-Gaussian random variables with zero mean and unit variance.
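A minimal simulation of one block of (3.3) may clarify the model; the sketch below assumes Python with numpy and, purely for illustration, Rayleigh fading and Gaussian inputs. All helper names are our own.

```python
import numpy as np

rng = np.random.default_rng(0)

def crandn(*shape):
    """Unit-variance circularly-symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def block_output(X, H, snr, nt):
    """One block of eq. (3.3): Y_b = sqrt(SNR/nt) H_b X_b + Z_b."""
    Z = crandn(H.shape[0], X.shape[1])  # unit-variance additive noise
    return np.sqrt(snr / nt) * H @ X + Z

nt, nr, J = 2, 2, 8
H = crandn(nr, nt)   # one Rayleigh fading realisation, constant over the block
X = crandn(nt, J)    # Gaussian channel inputs
Y = block_output(X, H, snr=100.0, nt=nt)
print(Y.shape)  # (2, 8)
```

The fading matrix stays fixed over the $J$ channel uses of a block, while the noise is drawn independently per channel use, mirroring the block-fading assumption.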
The entries of $\mathbf{H}_b$ are i.i.d. random variables drawn according to a certain probability distribution. We use the general fading model of [47, 48]. Let $H_{b,r,t}$ be the channel coefficient for fading block $b$, receive antenna $r$ and transmit antenna $t$. Then, the probability density function (pdf) of $H_{b,r,t}$ is given by
$p_{H_{b,r,t}}(h) = w_0 |h|^{\tau} e^{-w_1 |h - w_2|^{\varphi}}$ (3.4)
where $w_0 > 0$, $\tau \in \mathbb{R}$, $w_1 > 0$, $w_2 \in \mathbb{C}$ and $\varphi \ge 1$ are constants (finite and SNR-independent). This model subsumes a number of widely used fading distributions such as Rayleigh, Rician, Nakagami-m ($m \ge 1$), Nakagami-q ($0 < q \le 1$)
Table 3.1: Pdf for different fading distributions

Fading type | pdf | $w_0$ | $\tau$ | $w_1$ | $w_2$ | $\varphi$
Rayleigh | $\frac{1}{\pi\Omega} e^{-|h|^2/\Omega}$ | $\frac{1}{\pi\Omega}$ | $0$ | $\frac{1}{\Omega}$ | $0$ | $2$
Rician | $\frac{1}{\pi\Omega} e^{-|h-\mu|^2/\Omega}$ | $\frac{1}{\pi\Omega}$ | $0$ | $\frac{1}{\Omega}$ | $\mu$ | $2$
Nakagami-m | $\frac{m^m |h|^{2m-2}}{\pi \Omega^m \Gamma(m)} e^{-m|h|^2/\Omega}$ | $\frac{m^m}{\pi \Omega^m \Gamma(m)}$ | $2m-2$ | $\frac{m}{\Omega}$ | $0$ | $2$
Weibull | $\frac{\eta \Omega^{-\eta}}{2\pi} |h|^{\eta-2} e^{-(|h|/\Omega)^{\eta}}$ | $\frac{\eta \Omega^{-\eta}}{2\pi}$ | $\eta-2$ | $\frac{1}{\Omega^{\eta}}$ | $0$ | $\eta$
Nakagami-q | $\frac{1+q^2}{2\pi q \Omega} I_0\!\left(\frac{(1-q^4)|h|^2}{4q^2\Omega}\right) e^{-\frac{(1+q^2)^2}{4q^2\Omega}|h|^2}$ | see footnote 3.1 | | | | 
and Weibull (η ≥ 2) as tabulated in Table 3.1. For Rayleigh and Rician fading
channels, the above pdf represents the pdf of the complex-Gaussian random vari-
able with independent real and imaginary parts. For Nakagami-m, Weibull and
Nakagami-q fading channels, the above pdf is derived assuming that the magni-
tude and phase are independently distributed, and that the phase is uniformly
distributed over $[0, 2\pi)$. Furthermore, we assume that the average fading gain is normalised, i.e., $\mathbb{E}[\|\mathbf{H}_b\|_F^2] = n_t n_r$, $b = 1, \ldots, B$.
We finally assume that for all b = 1, . . . , B, the fading H b and the noise Zb
are independent and that their joint law does not depend on Xb.
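The general model (3.4) can be checked numerically against the specialised rows of Table 3.1; the hedged Python sketch below verifies the Rayleigh case (unit fading gain, $\Omega = 1$). The helper names are ours.

```python
import math

def general_fading_pdf(h, w0, tau, w1, w2, phi):
    """General pdf of eq. (3.4): w0 * |h|^tau * exp(-w1 * |h - w2|^phi)."""
    return w0 * abs(h) ** tau * math.exp(-w1 * abs(h - w2) ** phi)

def rayleigh_pdf(h, omega=1.0):
    """Rayleigh row of Table 3.1: (1/(pi*omega)) * exp(-|h|^2/omega)."""
    return math.exp(-abs(h) ** 2 / omega) / (math.pi * omega)

# With w0 = 1/(pi*Omega), tau = 0, w1 = 1/Omega, w2 = 0, phi = 2 and Omega = 1,
# the general model reduces to the Rayleigh pdf.
h = 0.3 + 0.4j
lhs = general_fading_pdf(h, w0=1 / math.pi, tau=0, w1=1.0, w2=0.0, phi=2)
print(abs(lhs - rayleigh_pdf(h)) < 1e-12)  # True
```

The Rician, Nakagami-m and Weibull rows can be checked the same way by substituting the corresponding parameter columns of Table 3.1.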
At the transmitter end, we consider coding schemes of fixed rate $R$ and length $N = BJ$, whose codewords are defined as $\mathbf{X} \triangleq [\mathbf{X}_1, \cdots, \mathbf{X}_B] \in \mathcal{X}^{n_t \times BJ}$, where $\mathcal{X}$ denotes the signal constellation; herein we focus on the Gaussian constellation and discrete constellations of size $|\mathcal{X}| = 2^M$. By fixed rate, we mean that the coding rate is a positive constant, independent of the SNR; thus the multiplexing gain (2.57) tends to zero. A non-zero multiplexing gain is only possible with an input constellation that has a continuous probability distribution (such as a Gaussian input) or an input constellation that has a discrete probability distribution but with alphabet size $|\mathcal{X}|$ increasing with the SNR. Since many practical systems employ coding schemes with a fixed code rate and a finite alphabet size, the assumption of zero multiplexing gain is highly relevant in practice. Furthermore, codewords are assumed to satisfy the average input power constraint $\frac{1}{BJ} \mathbb{E}[\|\mathbf{X}\|_F^2] \le n_t$.
At the receiver side, when perfect CSIR is assumed, nearest neighbour de-
coding is optimal in minimising the word error probability. However, practical
systems employ channel estimators that yield accurate yet imperfect channel
3.1 We have the bounds for the modified Bessel function of the first kind $I_0(\cdot)$ as [47, 48]
$1 \le I_0(x) = \sum_{i=0}^{\infty} \frac{(x/2)^{2i}}{i!\, i!} \le \left( \sum_{i=0}^{\infty} \frac{(x/2)^i}{i!} \right)^2 = e^x, \quad x \ge 0.$
Using these bounds, we recover (3.4).
estimates. We model the channel estimate as
$\hat{\mathbf{H}}_b = \mathbf{H}_b + \mathbf{E}_b, \quad b = 1, \ldots, B$ (3.5)
where $\hat{\mathbf{H}}_b$ and $\mathbf{E}_b$ are the $n_r \times n_t$-dimensional noisy channel estimate and channel estimation error matrices, respectively. In particular, the entries of $\mathbf{E}_b$ are assumed to be independent of the entries of $\mathbf{H}_b$ and to have an i.i.d. complex-Gaussian distribution with zero mean and variance
$\sigma_e^2 = \mathrm{SNR}^{-d_e}, \quad d_e > 0.$ (3.6)
Thus, we have assumed a family of channel estimation schemes for which the CSIR noise variance is a decreasing function of the SNR. This model is widely used in pilot-based channel estimation, for which the error variance is proportional to the reciprocal of the pilot SNR [49, 50]. We generalise this reciprocal of the pilot SNR with the parameter $d_e$, which denotes the channel estimation error diversity.
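A small sketch of the estimation model (3.5)-(3.6), assuming Python with numpy; the fixed fading matrix and helper name are our own illustration. The empirical error variance should track $\mathrm{SNR}^{-d_e}$.

```python
import numpy as np

rng = np.random.default_rng(1)

def noisy_csir(H, snr, d_e):
    """Eqs. (3.5)-(3.6): channel estimate = true fading + complex-Gaussian
    error whose variance SNR^{-d_e} shrinks as the SNR grows."""
    sigma = np.sqrt(snr ** -d_e)
    E = sigma * (rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape)) / np.sqrt(2)
    return H + E

H = np.ones((2, 2), dtype=complex)  # a fixed fading realisation, for illustration
for snr in (1e1, 1e3, 1e6):
    err = np.mean(np.abs(noisy_csir(H, snr, d_e=1.0) - H) ** 2)
    print(f"SNR={snr:.0e}  empirical error variance ~ {err:.2e}")
```

With $d_e = 1$ the printed error variance drops by roughly the same factor as the SNR grows, matching the pilot-based interpretation in the text.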
A nearest neighbour decoder is used to infer the transmitted message. Due
to its optimality under perfect CSIR and its simplicity, this decoder is widely
used in practice even when perfect CSIR cannot be guaranteed. With imperfect
CSIR, the decoder treats the imperfect channel estimate as if it were perfect.
Let $\hat{H}_b$ be the realisation of $\hat{\mathbf{H}}_b$. Assuming a memoryless channel, the decoder performs decoding by calculating the metric
$Q_{Y|X,\hat{H}}(y_{b,\nu}|x, \hat{H}_b) = \frac{1}{\pi^{n_r}} e^{-\left\| y_{b,\nu} - \sqrt{\frac{\mathrm{SNR}}{n_t}} \hat{H}_b x \right\|^2}$ (3.7)
for each channel use, i.e., for $b = 1, \ldots, B$, $\nu = 1, \ldots, J$. The decision is made at the end of $BJ$ channel uses for a single codeword. The message output corresponds to the codeword that maximises the product of the metric (3.7) over the $BJ$ channel uses.
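Since the metric (3.7) is a Gaussian density, maximising its product over the $BJ$ channel uses is equivalent to minimising a sum of squared Euclidean distances computed with the channel estimate. The toy decoder below (our own sketch in Python/numpy, with identity channels and a hand-built four-codeword BPSK codebook purely for illustration) makes this explicit.

```python
import numpy as np

def nn_decode(Y_blocks, H_hat_blocks, codebook, snr, nt):
    """Nearest neighbour decoder built on the metric (3.7): maximising the
    product of Gaussian metrics over BJ channel uses equals minimising the
    summed Euclidean distances, with the estimate used as if it were exact."""
    costs = []
    for X_blocks in codebook:
        cost = sum(np.linalg.norm(Y - np.sqrt(snr / nt) * Hh @ X) ** 2
                   for Y, Hh, X in zip(Y_blocks, H_hat_blocks, X_blocks))
        costs.append(cost)
    return int(np.argmin(costs))

# Toy check: B = 2 blocks, identity channels, 4 distinct BPSK codewords
rng = np.random.default_rng(2)
nt, nr, J, B, snr = 2, 2, 4, 2, 1000.0
codebook = [[np.array([[1 - 2 * (m & 1)], [1 - 2 * ((m >> 1) & 1)]]) * np.ones((nt, J))
             for _ in range(B)] for m in range(4)]
H = [np.eye(nr) for _ in range(B)]
m_tx = 1
Y = [np.sqrt(snr / nt) * H[b] @ codebook[m_tx][b]
     + (rng.standard_normal((nr, J)) + 1j * rng.standard_normal((nr, J))) / np.sqrt(2)
     for b in range(B)]
print(nn_decode(Y, H, codebook, snr, nt))  # 1
```

At this SNR the distance gap between codewords dwarfs the noise, so the transmitted index is recovered; with imperfect CSIR one would simply pass noisy estimates as `H_hat_blocks`.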
Note that $h_{b,r,t}$, $\hat{h}_{b,r,t}$ and $e_{b,r,t}$ are the elements of $H_b$, $\hat{H}_b$ and $E_b$ at row $r$, $r = 1, \ldots, n_r$, and column $t$, $t = 1, \ldots, n_t$, respectively (where $H_b$, $\hat{H}_b$ and $E_b$ are the realisations of $\mathbf{H}_b$, $\hat{\mathbf{H}}_b$ and $\mathbf{E}_b$, respectively). We define
$\alpha_{b,r,t} \triangleq -\frac{\log |h_{b,r,t}|^2}{\log \mathrm{SNR}}, \quad \hat{\alpha}_{b,r,t} \triangleq -\frac{\log |\hat{h}_{b,r,t}|^2}{\log \mathrm{SNR}} \quad \text{and} \quad \theta_{b,r,t} \triangleq -\frac{\log |e_{b,r,t}|^2}{\log \mathrm{SNR}}.$
Then, $A_b$, $\hat{A}_b$ and $\Theta_b$ are $n_r \times n_t$ matrices whose elements at row $r$ and column $t$ are given by $\alpha_{b,r,t}$, $\hat{\alpha}_{b,r,t}$ and $\theta_{b,r,t}$, respectively, for all $r = 1, \ldots, n_r$ and $t = 1, \ldots, n_t$. We use this change of random variables to analyse the communication performance in the high-SNR regime.
3.2 Outage Diversity in Block-Fading Channels
We have defined the outage diversity $d_{\mathrm{csir}}$ in Section 2.5, i.e.,
$d_{\mathrm{csir}} = \lim_{\mathrm{SNR}\to\infty} \frac{-\log P_{\mathrm{out}}(R)}{\log \mathrm{SNR}}$ (3.8)
where
$P_{\mathrm{out}}(R) = \Pr\{ I^\star(\mathbf{H}) < R \}$ (3.9)
and where $I^\star(\mathbf{H})$ is the maximised mutual information, the maximisation being over all input distributions $P_X(x)$ over $\mathcal{X}^{n_t}$. We have also shown in Section 2.5 that for a given input alphabet $\mathcal{X}$, $d_{\mathrm{csir}}$ is the largest diversity that a code may have.
As mentioned in Section 3.1, we focus on Gaussian and discrete signal constellations. The motivation for using the Gaussian constellation is that for power-limited channels with additive Gaussian noise and perfect CSIR, the input distribution that maximises the mutual information (maximised over all possible $P_X(x)$ on $\mathbb{C}^{n_t}$) is Gaussian. However, the application of the Gaussian constellation to code design is limited in practice. The main reason is that Gaussian distributions have unbounded support, which is undesirable from a practical perspective since the transmission power cannot be peak-limited and we operate with an infinite alphabet size. This motivates the study of systems with discrete alphabets, for which the alphabet size is fixed and each constellation point has finite energy.
Note that the evaluation of $I^\star(\mathbf{H})$ requires an input distribution $P_X(x)$ over $\mathcal{X}^{n_t}$ that maximises the mutual information. It has been shown in the literature that the optimal input distribution depends on the operating SNR. At high SNR, the $n_t$-variate complex-Gaussian distribution with zero mean and identity covariance matrix $I_{n_t}$ is optimal in terms of diversity for continuous alphabets [20]. On the other hand, for an $n_t$-variate discrete alphabet $\mathcal{X}^{n_t}$, the optimal input distribution at high SNR is given by independent and equiprobable elements $X_1, \ldots, X_{n_t}$ (where $X_t$ is the $t$-th element of $X$) on $\mathcal{X}$, since equiprobable elements maximise the entropy [51]. At low SNR, for both Gaussian and discrete signal constellations, the optimal input distribution is characterised by $n_t$ elements that are fully correlated to better combat the noise [51, 52].
In this dissertation, since we are particularly interested in the high-SNR regime, we shall only consider input distributions that achieve the optimal diversity $d_{\mathrm{csir}}$ for a given alphabet, i.e., the zero-mean, identity-covariance $n_t$-variate complex-Gaussian distribution and independent, equiprobable elements $X_1, \ldots, X_{n_t}$ on $\mathcal{X}$. With these input distributions, it suffices to consider $I(\mathbf{H})$ instead of $I^\star(\mathbf{H})$ to find the outage diversity (3.8).
Recall the mutual information for the block-fading channel (2.8):
$I(\mathbf{H}) = \frac{1}{B} \sum_{b=1}^{B} I_{\mathrm{awgn}}\!\left( \sqrt{\frac{\mathrm{SNR}}{n_t}} \mathbf{H}_b \right)$ (3.10)
where $I_{\mathrm{awgn}}(\Psi)$ is the mutual information of an AWGN MIMO channel for a given channel matrix $\Psi$. In particular, the mutual information is given by
$I_{\mathrm{awgn}}\!\left( \sqrt{\frac{\mathrm{SNR}}{n_t}} H_b \right) = \log_2 \det\!\left( I_{n_r} + \frac{\mathrm{SNR}}{n_t} H_b H_b^\dagger \right)$ (3.11)
for Gaussian inputs and
$I_{\mathrm{awgn}}\!\left( \sqrt{\frac{\mathrm{SNR}}{n_t}} H_b \right) = M n_t - \mathbb{E}\!\left[ \log_2 \sum_{x' \in \mathcal{X}^{n_t}} e^{-\left\| \sqrt{\frac{\mathrm{SNR}}{n_t}} H_b (\mathbf{X} - x') + \mathbf{Z} \right\|^2 + \|\mathbf{Z}\|^2} \right]$ (3.12)
for discrete inputs, where $M \triangleq \log_2 |\mathcal{X}|$. To the best of our knowledge, there is
no closed form expression for the expectation term in the mutual information
for discrete inputs. However, this expectation can be computed efficiently using
Gauss-Hermite quadratures [53] for systems of small size; for larger sizes the
above expectation needs to be evaluated using Monte Carlo methods. It follows
that for both Gaussian and discrete signal constellations, the outage diversity
satisfies the dot equality
PrI(H) < R .= SNR
−dcsir. (3.13)
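The Monte Carlo evaluation mentioned above can be sketched as follows for a single fading block with BPSK inputs; this is our own illustrative implementation of (3.12), assuming Python with numpy, not code from the dissertation.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(3)

def mi_discrete_mc(H, snr, alphabet, n_samples=2000):
    """Monte Carlo estimate of eq. (3.12) for one fading block:
    I = M*nt - E[ log2 sum_{x'} exp(-||a H (X - x') + Z||^2 + ||Z||^2) ]."""
    nr, nt = H.shape
    M = np.log2(len(alphabet))
    inputs = [np.array(v, dtype=complex) for v in product(alphabet, repeat=nt)]
    a = np.sqrt(snr / nt)
    acc = 0.0
    for _ in range(n_samples):
        x = inputs[rng.integers(len(inputs))]  # equiprobable input vector
        z = (rng.standard_normal(nr) + 1j * rng.standard_normal(nr)) / np.sqrt(2)
        terms = [np.exp(-np.linalg.norm(a * H @ (x - xp) + z) ** 2
                        + np.linalg.norm(z) ** 2) for xp in inputs]
        acc += np.log2(sum(terms))
    return M * nt - acc / n_samples

# BPSK (M = 1) on a 2x2 block with H = I: the estimate approaches M*nt = 2
# bits per channel use at high SNR
est = mi_discrete_mc(np.eye(2), snr=1000.0, alphabet=(-1.0, 1.0))
print(est)
```

For small constellations the sum over $x'$ is exact and only the expectation over $\mathbf{X}$ and $\mathbf{Z}$ is sampled, matching the remark that Gauss-Hermite quadrature or Monte Carlo is needed for this term.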
We revisit existing results on the optimal SNR-exponents for both Gaussian
and discrete inputs with perfect CSIR.
Lemma 3.1. Consider transmission over a MIMO block-fading channel at fixed rate $R$, with fading model parameters $w_0$, $w_1$, $w_2$, $\tau$ and $\varphi$ as described in (3.4) and perfect CSIR, using a Gaussian constellation or a discrete signal constellation of size $2^M$. Then,
$P_{\mathrm{out}}(R) \doteq \Pr\{ I(\mathbf{H}) < R \} \doteq \mathrm{SNR}^{-d_{\mathrm{csir}}}$ (3.14)
where
$d_{\mathrm{csir}} = \begin{cases} \left(1 + \frac{\tau}{2}\right) B n_t n_r & \text{for Gaussian inputs} \\ \left(1 + \frac{\tau}{2}\right) d_{\mathrm{SB}}(R) & \text{for discrete inputs,} \end{cases}$ (3.15)
and
$d_{\mathrm{SB}}(R) = n_r \left( 1 + \left\lfloor B \left( n_t - \frac{R}{M} \right) \right\rfloor \right)$ (3.16)
is the Singleton bound for MIMO channels. The exponent $d_{\mathrm{csir}}$ is achieved by random codes for both Gaussian and equiprobable discrete inputs.
Proof. For Gaussian inputs, the proof is outlined in [47] by setting zero multi-
plexing gain. The proof for discrete inputs extending the results of [32] to the
general fading model in (3.4) is outlined in Appendix A.1.
The results in Lemma 3.1 show the interplay among the system and channel parameters in determining the optimal SNR-exponents. For any positive target rate, Gaussian inputs achieve the maximum diversity. On the other hand, the diversity achieved by discrete inputs trades off against the target rate according to the Singleton bound. Note that
$\lim_{M\to\infty} \left\lfloor B \left( n_t - \frac{R}{M} \right) \right\rfloor = B n_t - 1$ (3.17)
which implies that sufficiently large constellations can always achieve the maximum diversity. The diversity characterisation for discrete inputs provides a benchmark for the error performance of practical codes. A good practical code must have an SNR-exponent $d^\star$ in (2.56) that achieves the scaled Singleton bound $\left(1 + \frac{\tau}{2}\right) d_{\mathrm{SB}}(R)$.
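The exponents in Lemma 3.1 are straightforward to evaluate numerically; the sketch below (our own Python helpers, not code from the thesis) compares the Gaussian-input diversity with the scaled MIMO Singleton bound for one parameter set.

```python
import math

def d_sb_mimo(R, B, M, nt, nr):
    """MIMO Singleton bound, eq. (3.16)."""
    return nr * (1 + math.floor(B * (nt - R / M)))

def d_csir(R, B, M, nt, nr, tau, gaussian):
    """Perfect-CSIR outage diversity, eq. (3.15)."""
    base = B * nt * nr if gaussian else d_sb_mimo(R, B, M, nt, nr)
    return (1 + tau / 2) * base

# Rayleigh fading (tau = 0), B = 4 blocks, 2x2 MIMO, 16-point constellation
print(d_csir(2.0, B=4, M=4, nt=2, nr=2, tau=0, gaussian=True))   # 16.0
print(d_csir(2.0, B=4, M=4, nt=2, nr=2, tau=0, gaussian=False))  # 14.0
```

The gap between the two numbers is the diversity price paid for the finite alphabet at this rate; letting $M$ grow closes the gap, consistent with (3.17).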
3.3 Generalised Outage Diversity
In this section, we describe the behaviour of the generalised outage probability
for codebooks that are generated from i.i.d. Gaussian and discrete inputs. In
particular, we study the high-SNR regime characterised by the generalised outage
diversity, which is defined in (2.67):
$d_{\mathrm{icsir}} = \lim_{\mathrm{SNR}\to\infty} \frac{-\log P_{\mathrm{gout}}(R)}{\log \mathrm{SNR}}$ (3.18)
where
$P_{\mathrm{gout}}(R) = \Pr\{ I^{\mathrm{gmi}}(\hat{\mathbf{H}}) < R \}$ (3.19)
and $\hat{\mathbf{H}} = \mathbf{H} + \mathbf{E}$. For given $\mathbf{H} = H$ and $\mathbf{E} = E$, the GMI is given by
$I^{\mathrm{gmi}}(\hat{H}) = \sup_{s>0} \frac{1}{B} \sum_{b=1}^{B} I^{\mathrm{gmi}}_b(\mathrm{SNR}, H_b, \hat{H}_b, s)$ (3.20)
where
$I^{\mathrm{gmi}}_b(\mathrm{SNR}, H_b, \hat{H}_b, s) = \mathbb{E}\!\left[ \log_2 \frac{ Q^s_{Y|X,\hat{H}}(\mathbf{Y}|\mathbf{X}, \hat{\mathbf{H}}_b) }{ \mathbb{E}\!\left[ Q^s_{Y|X,\hat{H}}(\mathbf{Y}|\mathbf{X}', \hat{\mathbf{H}}_b) \,\middle|\, \mathbf{Y}, \hat{\mathbf{H}}_b, \mathbf{E}_b \right] } \,\middle|\, \hat{\mathbf{H}}_b = \hat{H}_b, \mathbf{E}_b = E_b \right].$ (3.21)
Note that if $\hat{H} = H$, then $I^{\mathrm{gmi}}(\hat{H})$ is equal to $I(H)$. The evaluation of (3.21) gives
$I^{\mathrm{gmi}}_b(\mathrm{SNR}, H_b, \hat{H}_b, s) = \log_2 \det \Sigma_y - \frac{s}{\log 2} \left( n_r + \frac{\mathrm{SNR}}{n_t} \| H_b - \hat{H}_b \|_F^2 \right) + \frac{ \mathbb{E}\!\left[ s \mathbf{Y}^\dagger \Sigma_y^{-1} \mathbf{Y} \,\middle|\, \hat{\mathbf{H}}_b = \hat{H}_b, \mathbf{E}_b = E_b \right] }{\log 2}$ (3.22)
for Gaussian inputs, where
$\Sigma_y \triangleq I_{n_r} + s \hat{H}_b \hat{H}_b^\dagger \frac{\mathrm{SNR}}{n_t},$ (3.23)
and
$I^{\mathrm{gmi}}_b(\mathrm{SNR}, H_b, \hat{H}_b, s) = M n_t - \mathbb{E}\!\left[ \log_2 \sum_{x' \in \mathcal{X}^{n_t}} e^{ -s \left\| \sqrt{\frac{\mathrm{SNR}}{n_t}} (H_b \mathbf{X} - \hat{H}_b x') + \mathbf{Z} \right\|^2 + s \left\| \sqrt{\frac{\mathrm{SNR}}{n_t}} (H_b - \hat{H}_b) \mathbf{X} + \mathbf{Z} \right\|^2 } \right]$ (3.24)
for discrete inputs. It follows from Proposition 2.4 that the generalised outage diversity is always upper-bounded by the outage diversity, i.e., $d_{\mathrm{icsir}} \le d_{\mathrm{csir}}$.
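For Gaussian inputs, (3.22) can be evaluated numerically once the conditional expectation is written in closed form; the sketch below is our own (it uses the simplification $\mathbb{E}[s\mathbf{Y}^\dagger \Sigma_y^{-1} \mathbf{Y}] = s\,\mathrm{tr}(\Sigma_y^{-1} C_y)$ with $C_y = I_{n_r} + \frac{\mathrm{SNR}}{n_t} H_b H_b^\dagger$, which holds because $\mathbf{Y}$ is zero-mean Gaussian given the fading realisations) and approximates the supremum over $s$ in (3.20) by a grid search.

```python
import numpy as np

def gmi_block_gaussian(H, H_hat, snr, nt, s):
    """Per-block GMI of eq. (3.22) for Gaussian inputs; the conditional
    expectation is evaluated in closed form as s * tr(S^{-1} C_y) with
    C_y = I + (SNR/nt) H H^H (our simplification)."""
    nr = H.shape[0]
    S = np.eye(nr) + s * (snr / nt) * (H_hat @ H_hat.conj().T)
    C_y = np.eye(nr) + (snr / nt) * (H @ H.conj().T)
    ln2 = np.log(2.0)
    t1 = np.log2(np.linalg.det(S).real)
    t2 = -(s / ln2) * (nr + (snr / nt) * np.linalg.norm(H - H_hat) ** 2)
    t3 = (s / ln2) * np.trace(np.linalg.solve(S, C_y)).real
    return t1 + t2 + t3

def gmi_gaussian(H_blocks, Hh_blocks, snr, nt, s_grid=None):
    """Eq. (3.20): supremum over a single s > 0 shared by all blocks,
    approximated by a grid search."""
    s_grid = np.linspace(0.01, 2.0, 400) if s_grid is None else s_grid
    return max(np.mean([gmi_block_gaussian(H, Hh, snr, nt, s)
                        for H, Hh in zip(H_blocks, Hh_blocks)])
               for s in s_grid)

# Sanity check: with perfect CSIR the supremum (attained near s = 1) reduces
# to the mutual information log2 det(I + (SNR/nt) H H^H)
rng = np.random.default_rng(4)
H = [(rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))) / np.sqrt(2)
     for _ in range(2)]
gmi = gmi_gaussian(H, H, snr=10.0, nt=2)
mi = np.mean([np.log2(np.linalg.det(np.eye(2) + 5.0 * Hb @ Hb.conj().T).real)
              for Hb in H])
print(abs(gmi - mi) < 1e-2)  # True
```

Passing noisy estimates as `Hh_blocks` then yields a GMI strictly below the mutual information, the gap driving the generalised outage events analysed next.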
In the following, we provide a precise fundamental relationship between dicsir
and dcsir. The converse results, which use the large block length results on the
generalised outage probability in Proposition 2.4, are summarised in the following
theorem.
Theorem 3.1 (Converse). Consider the MIMO block-fading channel, imperfect CSIR and fading model described by (3.3), (3.5) and (3.4), respectively. Then, at high SNR, the generalised outage probability using nearest neighbour decoding based on (3.7) behaves as
$P_{\mathrm{gout}}(R) = \Pr\{ I^{\mathrm{gmi}}(\hat{\mathbf{H}}) < R \} \doteq \mathrm{SNR}^{-d_{\mathrm{icsir}}}$ (3.25)
where
$d_{\mathrm{icsir}} = \min(1, d_e) \times d_{\mathrm{csir}}$ (3.26)
is the generalised outage SNR-exponent or the generalised outage diversity. This relationship holds for code constructions based on both i.i.d. Gaussian and discrete inputs.
Proof. We use bounding techniques to prove the result. The lower bound is
derived by evaluating the GMI structure for each type of input distribution. The
upper bound uses the upper bound in Proposition 2.3. See Appendices A.2 (for
discrete inputs) and A.3 (for Gaussian inputs).
Remark 3.1. Standard proof methodologies to derive an upper bound on the SNR-exponent for MIMO channels with perfect CSIR employ a genie-aided receiver [32], which eliminates the interference among the $n_t$ transmit antennas. It follows that the mutual information of $n_t$ parallel single-input multiple-output (SIMO) channels, each with $n_r$ receive antennas, serves as an upper bound to the mutual information of the $n_r \times n_t$ MIMO channel. However, this approach may not work for mismatched CSIR. In general, mismatched decoding introduces additional interference during the decoding process, and this interference may not be reduced by a genie-aided receiver that decomposes the MIMO channel into parallel SIMO channels.
Theorem 3.1 gives an upper bound on the decay rate of the average error probability as a function of the SNR for sufficiently long codes. The results also provide a precise fundamental relationship between the perfect- and imperfect-CSIR SNR-exponents. Suppose that a noisy channel estimator produces a Gaussian random estimation error with variance $\sigma_e^2 = \mathrm{SNR}^{-d_e}$. Then, Theorem 3.1 shows that the imperfect-CSIR SNR-exponent is a linear function of the perfect-CSIR SNR-exponent with a linear scaling factor of $\min(1, d_e)$.
The intuition on the scaling factor min(1, de) is as follows. Channel estimation
errors introduce supplementary outage events, adding to those due to deep fades
[20, 23]. Therefore, the generalised outage set contains the perfect-CSIR outage
set, and a generalised outage occurs when there is a deep fade, or when the
channel estimation error is high.
The above analysis also shows that the phases of the fading and of the chan-
nel estimation error play no role in determining the SNR-exponents for both
Gaussian and discrete signal constellations. However, as shown in the proof (Ap-
pendix A.2), it seems that the phases affect high-SNR outage events for discrete
signal constellations; the exact effect depends on the configuration of the specific
signal constellation.
3.4 Random Coding Achievability
For large block length, as shown in (2.32) and (2.33), it suffices to study Pgout(R)
to characterise the error probability for i.i.d. codebooks. However, practical
wireless communications typically operate with a fixed and finite block length.
Herein, we present the results of achievable random coding SNR-exponents for a
given block length J . In this context, we use the generalised Gallager exponents
[35, 37] and evaluate a lower bound to the SNR-exponents for any fixed length.
We first state the SNR-exponent achieved by random codes with Gaussian
constellations for any τ in the fading model (3.4). We will then provide a tighter
block length threshold for τ = 0.
Theorem 3.2 (Achievability - Gaussian Inputs). Consider the MIMO block-fading channel (3.3) with fading distribution (3.4) and data rate growing with the logarithm of the SNR at multiplexing gain $r_g \ge 0$ as defined in (2.57). In the presence of mismatched CSIR (3.5), there exists a Gaussian random code whose average error probability is upper-bounded as
$P_{e,\mathrm{ave}}(\mathrm{SNR}) \,\dot\le\, \mathrm{SNR}^{-d^{\ell}_{\mathrm{G}}(r_g)}$ (3.27)
where
$d^{\ell}_{\mathrm{G}}(r_g) = \inf_{\mathcal{A}^c_{\mathrm{G}} \cap \{\mathbf{A} \succeq \mathbf{0},\, \Theta \succeq d_e \times \mathbf{1}\}} \left\{ \left(1 + \frac{\tau}{2}\right) \sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \alpha_{b,r,t} + \sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} (\theta_{b,r,t} - d_e) + J \left( \sum_{b=1}^{B} \left[ \min(1, \theta_{\min}) - \alpha_{b,\min} \right]^+ - B r_g \right) \right\}$ (3.28)
and $\mathcal{A}^c_{\mathrm{G}}$ is the complement of the set
$\mathcal{A}_{\mathrm{G}} \triangleq \left\{ \mathbf{A}, \Theta \in \mathbb{R}^{B n_r \times n_t} : \sum_{b=1}^{B} \left[ \min(1, \theta_{\min}) - \alpha_{b,\min} \right]^+ \le B r_g \right\},$ (3.29)
where
$\theta_{\min} \triangleq \min\{ \theta_{1,1,1}, \ldots, \theta_{b,r,t}, \ldots, \theta_{B,n_r,n_t} \},$ (3.30)
$\alpha_{b,\min} \triangleq \min\{ \alpha_{b,1,1}, \ldots, \alpha_{b,r,t}, \ldots, \alpha_{b,n_r,n_t} \}.$ (3.31)
The function $d^{\ell}_{\mathrm{G}}(r_g)$ can be computed explicitly for any block length $J \ge 1$.
Proof. See Appendix A.4.
Corollary 3.1. Following Theorem 3.2, the lower bound on the SNR-exponent for Gaussian inputs with multiplexing gain $r_g \downarrow 0$ is given by
$d^{\ell}_{\mathrm{G}} = \lim_{r_g \downarrow 0} d^{\ell}_{\mathrm{G}}(r_g) = \begin{cases} d_{\mathrm{icsir}} & \text{for } J \ge \left\lceil \left(1 + \frac{\tau}{2}\right) n_t n_r \right\rceil \\ B J \min(1, d_e) & \text{otherwise.} \end{cases}$ (3.32)
Proof. The proof is obtained by solving the infimum in Theorem 3.2 for a given length $J$. The optimiser of $\Theta$ for $d^{\ell}_{\mathrm{G}}(r_g)$ is given by $\Theta^\ast = d_e \times \mathbf{1}$, since an increase in $\theta_{\min}$ increases $\alpha_{b,\min}$ in the constraint set, and since by definition $\theta_{b,r,t} \ge \theta_{\min}$. To find the $\alpha^\ast_{b,r,t}$ that achieve the infimum in (3.28), let $\alpha_{b,r,t} \in [0, \min(1, d_e)]$, and let $\alpha^\ast_{b,\min}$ be the value of $\alpha_{b,\min}$ that is tight in $\mathcal{A}^c_{\mathrm{G}}$. Since $\alpha_{b,r,t} \ge \alpha_{b,\min}$, the infimum solution for the remaining $\alpha_{b,r,t}$ is given by $\alpha^\ast_{b,r,t} = \alpha^\ast_{b,\min}$. Using these, it is straightforward to show (3.32).
For $\tau = 0$, we have the following proposition that provides a tighter achievability bound than Theorem 3.2.
Proposition 3.1. Consider the MIMO block-fading channel (3.3) with fading model (3.4) for $\tau = 0$, imperfect CSIR (3.5), data rate growing with the logarithm of the SNR at multiplexing gain $r_g \ge 0$ as defined in (2.57), and $n^\ast = \min(n_t, n_r)$. Let $\lambda_b$ be a row vector consisting of the non-zero eigenvalues of $H_b H_b^\dagger$, with $0 < \lambda_{b,1} \le \cdots \le \lambda_{b,n^\ast}$, and let $\lambda \triangleq [\lambda_1, \ldots, \lambda_b, \ldots, \lambda_B]$. Define the variables $\upsilon_{b,i} \triangleq -\frac{\log \lambda_{b,i}}{\log \mathrm{SNR}}$. There exists a Gaussian random code whose average error probability is upper-bounded as
$P_{e,\mathrm{ave}}(\mathrm{SNR}) \,\dot\le\, \mathrm{SNR}^{-d^{\ell}_{\mathrm{G}}(r_g)}$ (3.33)
where
$d^{\ell}_{\mathrm{G}}(r_g) = \inf_{\mathcal{A}^c_{\mathrm{G}} \cap \{\upsilon \succeq \mathbf{0},\, \Theta \succeq d_e \times \mathbf{1}\}} \left\{ \sum_{b=1}^{B} \sum_{i=1}^{n^\ast} (2i - 1 + |n_t - n_r|) \upsilon_{b,i} + \sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} (\theta_{b,r,t} - d_e) + J \left( \sum_{b=1}^{B} \sum_{i=1}^{n^\ast} \left[ \min(1, \theta_{\min}) - \upsilon_{b,i} \right]^+ - B r_g \right) \right\}$ (3.34)
and $\mathcal{A}^c_{\mathrm{G}}$ is the complement of the set
$\mathcal{A}_{\mathrm{G}} \triangleq \left\{ \upsilon \in \mathbb{R}^{B n^\ast}, \Theta \in \mathbb{R}^{B n_r \times n_t} : \sum_{b=1}^{B} \sum_{i=1}^{n^\ast} \left[ \min(1, \theta_{\min}) - \upsilon_{b,i} \right]^+ \le B r_g \right\},$ (3.35)
where
$\theta_{\min} \triangleq \min\{ \theta_{1,1,1}, \ldots, \theta_{b,r,t}, \ldots, \theta_{B,n_r,n_t} \}.$ (3.36)
The function $d^{\ell}_{\mathrm{G}}(r_g)$ can be computed explicitly for any given block length $J \ge 1$.
Proof. See Appendix A.6.
The optimiser of $\Theta$ in (3.34) is given by $\Theta^\ast = d_e \times \mathbf{1}$ because an increase in $\theta_{\min}$ increases $\upsilon_{b,i}$ in the objective function and the constraint set. To find the $\upsilon^\ast_{b,i}$ that achieve the infimum in (3.34), let $\upsilon_{b,i} \in [0, \min(1, d_e)]$. Following [20], for $J \ge n_t + n_r - 1$, the infimum always occurs with $\sum_{b=1}^{B} \sum_{i=1}^{n^\ast} \left( \min(1, d_e) - \upsilon^\ast_{b,i} \right) = B r_g$. For $r_g \downarrow 0$ (fixed coding rate), a block length $J \ge n_t + n_r - 1$ leads to
$d^{\ell}_{\mathrm{G}} = \lim_{r_g \downarrow 0} d^{\ell}_{\mathrm{G}}(r_g) = \min(1, d_e) B n_t n_r.$ (3.37)
Using similar steps to those in [20], if $J < n_t + n_r - 1$, then $d^{\ell}_{\mathrm{G}}$ does not meet $d_{\mathrm{icsir}}$, and the random coding achievable SNR-exponent is given by solving the infimum of $d^{\ell}_{\mathrm{G}}(r_g)$ in (3.34) as $r_g \downarrow 0$, which is strictly smaller than $d_{\mathrm{icsir}}$.
Therefore, for random codes with Gaussian constellations, the lower bound on the SNR-exponent is $d_{\mathrm{icsir}}$ as long as the block length satisfies
$J \ge \begin{cases} n_t + n_r - 1 & \text{for } \tau = 0 \\ \left\lceil \left(1 + \frac{\tau}{2}\right) n_t n_r \right\rceil & \text{otherwise.} \end{cases}$ (3.38)
The achievability of random codes constructed over discrete alphabets of size $|\mathcal{X}| = 2^M$ for a given block length $J$ is summarised as follows.
Theorem 3.3 (Achievability - Discrete Inputs). Let the block length $J$ grow as $\lim_{\mathrm{SNR}\to\infty} \frac{J(\mathrm{SNR})}{\log \mathrm{SNR}} = \omega$, $\omega \ge 0$. Then, there exists a random code constructed over a discrete-input alphabet of size $|\mathcal{X}| = 2^M$ such that the average error probability with mismatched CSIR (3.5) is upper-bounded by
$P_{e,\mathrm{ave}}(\mathrm{SNR}) \,\dot\le\, \mathrm{SNR}^{-d^{\ell}_{\mathcal{X}}(R)}$ (3.39)
where
$d^{\ell}_{\mathcal{X}}(R) = \omega B M \log 2 \left( n_t - \frac{R}{M} \right)$ (3.40)
for $0 \le \omega < \frac{\min(1, d_e) \left(1 + \frac{\tau}{2}\right) n_r}{M \log 2}$,
$d^{\ell}_{\mathcal{X}}(R) = \omega M \log 2 \left( 1 + \left\lfloor \frac{BR}{M} \right\rfloor - \frac{BR}{M} \right) + \min(1, d_e) \left(1 + \frac{\tau}{2}\right) n_r \left( \left\lceil B \left( n_t - \frac{R}{M} \right) \right\rceil - 1 \right)$ (3.41)
for $\frac{\min(1, d_e) \left(1 + \frac{\tau}{2}\right) n_r}{M \log 2} \le \omega < \frac{1}{M \log 2} \times \frac{\min(1, d_e) \left(1 + \frac{\tau}{2}\right) n_r}{1 + \left\lfloor \frac{BR}{M} \right\rfloor - \frac{BR}{M}}$, and
$d^{\ell}_{\mathcal{X}}(R) = \min(1, d_e) \left(1 + \frac{\tau}{2}\right) n_r \left\lceil B \left( n_t - \frac{R}{M} \right) \right\rceil$ (3.42)
for $\omega \ge \frac{1}{M \log 2} \times \frac{\min(1, d_e) \left(1 + \frac{\tau}{2}\right) n_r}{1 + \left\lfloor \frac{BR}{M} \right\rfloor - \frac{BR}{M}}$.
Proof. See Appendix A.7.
The assumption that the block length J grows with the SNR as J(SNR) = ω log SNR, ω ≥ 0, allows a more precise characterisation of what can be achieved using random codes with a discrete constellation of size 2^M. The results show the interplay among the growth rate ω, the cardinality of the constellation 2^M and the achievable SNR-exponent with random coding.
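The three-regime exponent of Theorem 3.3 can be evaluated numerically; below is a minimal sketch of (3.40)-(3.42), with the parameters of Figure 3.2 as defaults (the function name and defaults are ours; natural logarithms are used, so log 2 = ln 2):

```python
import math

def snr_exponent_lower_bound(R, omega, B=4, nt=2, nr=2, tau=0.0, M=4, de=0.5):
    """Piecewise random-coding exponent d^l_X(R) of (3.40)-(3.42);
    omega is the block-length growth rate J(SNR)/log(SNR)."""
    dmin = min(1.0, de) * (1.0 + tau / 2.0) * nr
    frac = 1.0 + math.floor(B * R / M) - B * R / M  # 1 + floor(BR/M) - BR/M
    w1 = dmin / (M * math.log(2))
    w2 = dmin / (M * math.log(2) * frac)
    if omega < w1:                                   # regime (3.40)
        return omega * B * M * math.log(2) * (nt - R / M)
    if omega < w2:                                   # regime (3.41)
        return (omega * M * math.log(2) * frac
                + dmin * (math.ceil(B * (nt - R / M)) - 1))
    return dmin * math.ceil(B * (nt - R / M))        # regime (3.42)
```

One can check that the three pieces are continuous at the regime boundaries and that the exponent is non-decreasing in ω, as illustrated in Figure 3.2.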
Remark 3.2. By letting ω → ∞, we have that

dℓX(R) = min(1, de) (1 + τ/2) nr ⌈B(nt − R/M)⌉    (3.43)
       ≤ min(1, de) (1 + τ/2) nr (1 + ⌊B(nt − R/M)⌋)    (3.44)
       = dicsir.    (3.45)
This implies that dℓX(R) is equal to dicsir only at the continuous points of the Singleton bound.

Figure 3.2: Random coding SNR-exponent lower bound for discrete signal codebooks as a function of the target rate R (in bits per channel use), B = 4, nt = 2, nr = 2, τ = 0, M = 4 and de = 0.5.

Hence, random codes based on discrete constellations achieve dicsir for all possible R only at the continuous points of the Singleton bound, and only when the block length grows as log SNR with a very large growth rate. Herein we have also shown analytically that the block length is required to grow with log SNR at a certain growth rate ω > 0 to obtain a non-zero random coding SNR-exponent.
This is illustrated in Figure 3.2. To achieve the perfect-CSIR outage diversity at
the continuous points of the Singleton bound, the growth rate, ω, should approach
infinity, and the channel estimator should provide reliable estimates (de ≥ 1).
Figure 3.2 also explains the case of a finite-valued ω. If ω is fixed and satisfies ωM log 2 < min(1, de)(1 + τ/2) nr, then dℓX(R) is strictly smaller than dicsir for any positive rate R. On the other hand, if ω is fixed and satisfies ωM log 2 ≥ min(1, de)(1 + τ/2) nr, then dℓX(R) is equal to dicsir for some values of R (as indicated by the dashed line in Figure 3.2); a larger ω implies a larger range of values of R for which dℓX(R) is equal to dicsir. As ω tends to infinity, random coding achieves dicsir for all R except at the discontinuity points of the Singleton bound (as shown by the solid line in Figure 3.2).
3.5 Discussion
3.5.1 Insights for System Design
The following observations can be obtained from Theorems 3.1, 3.2, 3.3 and
Proposition 3.1.
1. The optimal SNR-exponent for any coding scheme can be obtained when
de ≥ 1. The converse on the SNR-exponent is strong since Pgout(R) has the
same exponential decay in SNR as Pout(R). We need both a good channel
estimation and a good code design to achieve the optimal SNR-exponent.
2. The term min(1, de) appears naturally from the data-processing inequality.
It highlights the importance of having channel estimators that can achieve
the CSIR-error diversity de ≥ 1. We are able to achieve the perfect-CSIR
SNR-exponent provided that de ≥ 1. If de < 1, the resulting SNR-exponent
scales linearly with de, approaching zero as de ↓ 0. Figure 3.3 illustrates this effect in a discrete-input block-fading channel with B = 4, τ = 0, nt = 2 and nr = 2. In a block-fading setup, this result provides a more precise characterisation of the accuracy of channel estimation at high SNR than [54].
3. The role of the channel estimation error diversity de is governed by the
channel estimation model. With a maximum-likelihood (ML) estimator,
it can be shown that de is proportional to the pilot power [49]. Larger
pilot power implies larger de. Hence, the price for obtaining high outage
diversity is in the pilot power which does not contain any information data.
The bounding condition min(1, de) implies that the perfect-CSIR outage
diversity can be achieved with de = 1. Note that although having larger de
for de > 1 shows no diversity improvement, it still leads to a better outage
performance. As de tends to infinity, the outage performance converges to
that with perfect CSIR.
4. The outage diversity in Theorem 3.1 is valid for the general fading model
described by (3.4). This fading model is used extensively in analysing the
performance of radio-frequency (RF) wireless communications.
5. For a given de ≥ 1, Gaussian random codes with finite block length can achieve the perfect-CSIR outage diversity (1 + τ/2) B nt nr as long as the block length is larger than a threshold. On the other hand, discrete-alphabet random codes with finite block length cannot achieve the perfect-CSIR outage diversity (1 + τ/2) dSB(R). In order for these random codes to achieve (1 + τ/2) dSB(R) almost everywhere, the block length needs to grow as ω log SNR.

Figure 3.3: Generalised outage SNR-exponent for discrete-input block-fading channel, B = 4, τ = 0 (Rayleigh, Rician and Nakagami-q fading), nt = 2 and nr = 2.
Figures 3.4 and 3.5 provide simulation results of the generalised outage proba-
bility Pgout(R) for Gaussian and binary phase-shift keying (BPSK) inputs, res-
pectively, over a MIMO Rayleigh block-fading channel with nt = 2 and nr = 1.
The following parameters are specified: B = 2 and R = 2 bits per channel use for Gaussian inputs, and B = 2 and R = 1 bit per channel use for BPSK inputs. The curves were generated as follows. Monte Carlo simulation was used to count the outage events. Firstly, the entries of Hb and Eb, b = 1, . . . , B, were independently generated from zero-mean complex-Gaussian distributions with variance one and σ²e = SNR^{−de}, respectively. The values of de = 0.5, 1 and 1.25 were used for comparison with the perfect-CSIR outage probability. Secondly, for a fixed s > 0 and given the channel Hb and the channel estimate Ĥb = Hb + Eb, the GMIs Igmi_b(SNR, Hb, Ĥb, s), b = 1, . . . , B, were computed for Gaussian inputs (3.22) and BPSK inputs (3.24).

Figure 3.4: Generalised outage probability for Gaussian-input MIMO Rayleigh block-fading channel with B = 2, R = 2, nt = 2 and nr = 1.

Note that for Gaussian inputs, the generalised
outage diversity may not be derived directly from (3.22), particularly due to the term E[s Y†Σy^{−1} Y | Hb = Hb, Eb = Eb]. However, this term can be evaluated numerically using the singular value decomposition [86] as in Appendix A.3.2. On the other hand, for BPSK inputs, we compute the expectation in (3.24) using Gauss-Hermite quadratures [53]. Thirdly, for fixed Hb and Ĥb, the supremum over s on the RHS of (3.20) was solved using a standard convex optimisation algorithm since the function

(1/B) ∑_{b=1}^{B} Igmi_b(SNR, Hb, Ĥb, s)    (3.46)

is concave in s for s > 0 (see footnote 3.2). Finally, an outage event was declared whenever Igmi(Ĥ) was less than R. Then, Pgout(R) was given by the ratio of the number of outage events to the total number of transmissions. From the figures, we observe that dcsir is equal to 4 and 3 for Gaussian and BPSK inputs, respectively.
As predicted by Theorem 3.1, the slope becomes steeper with increasing de,
eventually becoming parallel to the perfect-CSIR outage curve for de ≥ 1. For de > 1, the slope does not increase as de increases. However, we still observe an improvement in outage gain; the curves for de = 1.25 achieve the same Pgout(R) as those for de = 1 at a lower SNR.

Figure 3.5: Generalised outage probability for BPSK-input MIMO Rayleigh block-fading channel with B = 2, R = 1, nt = 2 and nr = 1.

Footnote 3.2: The concavity of Igmi_b(SNR, Hb, Ĥb, s) and (1/B)∑_{b=1}^{B} Igmi_b(SNR, Hb, Ĥb, s) in s, s > 0, can be shown using the same technique used to prove the concavity of E^Q_0(s, ρ, Ĥb) in Section 2.4.2.1.
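The simulation loop described above can be sketched in a few lines. For brevity, this sketch estimates the perfect-CSIR outage probability, replacing the GMI with the mutual information log2 det(I + (SNR/nt) Hb Hb†), which corresponds to the "Perfect CSIR" curves; all names, the seed and the trial count are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def outage_probability(snr_db, B=2, nt=2, nr=1, R=2.0, trials=20000):
    """Monte Carlo estimate of the perfect-CSIR outage probability
    Pr{(1/B) sum_b log2 det(I + (SNR/nt) H_b H_b^H) < R} over a
    Rayleigh block-fading channel."""
    snr = 10.0 ** (snr_db / 10.0)
    outages = 0
    for _ in range(trials):
        rate = 0.0
        for _ in range(B):
            # i.i.d. CN(0,1) entries for each fading block
            H = (rng.standard_normal((nr, nt))
                 + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)
            G = np.eye(nr) + (snr / nt) * (H @ H.conj().T)
            rate += np.log2(np.real(np.linalg.det(G)))
        if rate / B < R:
            outages += 1
    return outages / trials
```

Replacing the mutual information by the GMI (3.22) or (3.24), with the supremum over s solved numerically, recovers the mismatched-CSIR curves of Figures 3.4 and 3.5.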
3.5.2 The DMT for Gaussian Codebooks
In this work, we focus on fixed-rate transmission such that the multiplexing gain
rg tends to zero. The analysis can be extended to any positive multiplexing gain
(rg > 0), which is relevant for continuous inputs such as Gaussian inputs or
discrete inputs with alphabet size increasing with the SNR.
The analysis of a general positive multiplexing gain is implicitly covered in
Theorem 3.2 and Proposition 3.1. In the limit as the block length tends to
infinity, both results provide lower bounds to the optimal diversity-multiplexing
trade-off (DMT). Note that from Theorem 3.2, one may obtain dℓG(rg) for general
fading parameter τ and for J → ∞ as
dℓG(rg) = (1 + τ/2) B nt nr (min(1, de) − rg),  0 ≤ rg ≤ min(1, de).    (3.47)
Note that this lower bound can be loose since the maximum multiplexing gain for a positive diversity is min(1, de), as compared to min(nt, nr) for the case of perfect CSIR [20]. From Proposition 3.1, one may obtain dℓG(rg) for fading parameter τ = 0 and for J → ∞ as a trade-off with the multiplexing gain. Indeed, as shown in Appendix A.6, the lower bound of the optimal DMT curve dℓG(rg) for J → ∞ is given by the piecewise-linear function connecting the points (rg, dℓG(rg)), where

rg = 0, min(1, de), 2 min(1, de), . . . , min(nt, nr) min(1, de),    (3.48)

dℓG(rg) = min(1, de) · B (nt − rg/min(1, de)) · (nr − rg/min(1, de)).    (3.49)

Note that we have dℓG,max = min(1, de) B nt nr and rg,max = min(nt, nr) min(1, de). This lower bound is tight for B = 1 and de ≥ 1, in which case it coincides with the perfect-CSIR DMT [20].
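The piecewise-linear lower bound (3.48)-(3.49) can be evaluated by linear interpolation between its corner points; a sketch (the function name is ours):

```python
import numpy as np

def dmt_lower_bound(rg, B, nt, nr, de):
    """Piecewise-linear DMT lower bound of (3.48)-(3.49): interpolate
    between the corner points rg = k*min(1,de), k = 0,...,min(nt,nr),
    where d(rg) = min(1,de) * B * (nt - k) * (nr - k)."""
    m = min(1.0, de)
    ks = np.arange(min(nt, nr) + 1)
    return float(np.interp(rg, ks * m, m * B * (nt - ks) * (nr - ks)))
```

For B = 1 and de ≥ 1 this reproduces the perfect-CSIR DMT corner points (k, (nt − k)(nr − k)).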
There are several reasons why the above bounds may not be tight for mis-
matched CSIR. The first one is that we only evaluate the above bounds based on
Gallager’s lower bound to the error exponent (2.24) which for large J yields an
upper bound to the generalised outage probability as shown in Appendices A.4
and A.6. Hence, these bounds are not an exact characterisation of the generalised
outage probability. The second one is that for the bound in Theorem 3.2, the
trade-off is derived by using the joint pdf of the entries of Hb and Eb. Note that
this leads to a further lower bound as shown in Appendix A.6. A tighter bound
is obtained by considering the analysis using the joint pdf of the eigenvalues.
However, this last approach has some technical difficulties, particularly for τ ≠ 0, as shown in Appendix A.6.
From Appendix A.3.2, we have an upper bound to the optimal DMT as

dGicsir(rg) ≤ (1 + τ/2) B nt nr min(1 − rg/min(nt, nr), de),  0 ≤ rg ≤ min(nt, nr).    (3.50)

Note that the upper bound is trivial for any de ≤ 1 − rg/min(nt, nr) because it is then identical to the result with rg ↓ 0, which always upper-bounds the optimal DMT. The upper bound (3.50) and the lower bound (3.47) are tight only for rg ↓ 0.
3.5.3 Optical Wireless Scintillation Distributions
In optical wireless scintillation channels, we mainly deal with the received signal
intensities, and not complex input symbols or complex fading realisations; thus
the use of real amplitude modulation such as pulse-position modulation (PPM)
is common [55–57]. This means that the channel phase is not being considered
in the detection, and only the real part of the complex-Gaussian noise affects the decision. However, the mutual information and the GMI expressions in (3.10) and (3.20) are valid for real-valued signals and real-valued fading responses as well. Notice that in our converse and achievability results, we have used the joint probability of the normalised fading gain A and the normalised estimation error. The parameter that distinguishes the resulting SNR-exponents for different fading conditions is the channel parameter τ in the form of 1 + τ/2. This form comes out naturally from the pdf after defining

αb,r,t ≜ −log |hb,r,t|² / log SNR.

Thus, as long as, after performing the change of random variables, we can express the pdf of the normalised fading gain for each channel matrix entry as

PAb,r,t(α) ≐ exp(−(1 + τ/2) α log SNR),  α ≥ 0,    (3.51)

then our main results are valid for those fading distributions as well. Consequently, the results are valid for fading distributions used in optical wireless scintillation channels, such as the lognormal-Rice distribution, for which (1 + τ/2) = 1/2, and the gamma-gamma distribution, for which (1 + τ/2) = (1/2) min(a, b), where a, b are the parameters of the individual gamma distributions [55–57].
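The exponential behaviour (3.51) is easy to check empirically for Rayleigh fading (τ = 0, so 1 + τ/2 = 1), for which |h|² is unit-mean exponential; a sketch (names and the seed are ours):

```python
import numpy as np

rng = np.random.default_rng(1)

def near_zero_exponent(alpha, snr, n=200000):
    """Estimate -log Pr{|h|^2 < SNR^(-alpha)} / log SNR for Rayleigh
    fading, where |h|^2 ~ Exp(1); by (3.51) with (1 + tau/2) = 1 this
    should approach alpha at high SNR."""
    g = rng.exponential(1.0, n)          # samples of |h|^2
    p = np.mean(g < snr ** (-alpha))     # empirical deep-fade probability
    return -np.log(p) / np.log(snr)
```

At SNR = 40 dB the estimated exponent is already close to α, consistent with the SNR-exponent analysis.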
3.6 Conclusion
We have examined the outage behaviour of nearest neighbour decoding in MIMO
block-fading channels with imperfect CSIR. By treating the problem as a mismatched-decoding problem, we apply the GMI and the generalised outage probability, the probability that the GMI is less than the data rate. Due to the data-processing inequality for error exponents and mismatched decoders, the generalised outage probability is larger than the outage probability of the perfect-CSIR case.
We have analysed the generalised outage probability in the high-SNR regime
and we have derived the diversity (SNR-exponent) for both Gaussian and discrete
inputs. We have shown that in both cases, the SNR-exponents are given by the
perfect-CSIR SNR-exponents scaled by the minimum of the channel estimation
error diversity—which measures the exponential decay of the channel estimation
error with respect to the SNR—and one. Note that the perfect-CSIR outage
diversity is the highest diversity that a code may have. Therefore, in order to
achieve the highest possible diversity, the channel estimator should be designed so as to make the estimation error diversity equal to or larger than one.
Furthermore, the optimal SNR-exponent for Gaussian inputs can be achieved
using Gaussian random codes with finite block length as long as the block length
is greater than a threshold; this threshold depends on the fading distribution and
the number of antennas. On the other hand, for discrete-constellation random
codes, the optimal SNR-exponent can be achieved using a block length that grows very fast with log SNR. All these results are applicable to many fading distributions.
Our main result suggests that the nearest neighbour decoder can be reliable—
in the sense that it achieves the perfect-CSIR diversity—if the receiver can pro-
vide sufficiently accurate channel estimates. The analysis finds applications in
pilot-based channel estimation for which achieving the perfect-CSIR diversity is
possible by allocating the same amount of power to both data and pilot. Note
that even though increasing the pilot power beyond the data power does not
enhance the diversity, it generally improves the generalised outage probability
and closes the gap with the perfect-CSIR outage probability.
Given that the channel estimate is sufficiently accurate, the design of good
codes with imperfect CSIR has the same performance benchmark (in terms of the
diversity) as that with perfect CSIR. Our random coding achievability analysis
suggests that such good codes do exist. If the channel estimate is not accurate
enough, then our imperfect-CSIR generalised outage diversity may not be the
highest diversity. One can possibly achieve a better performance by using a
more complicated coding scheme (such as the ones with non-i.i.d. inputs) and
decoding scheme (such as the ones that decode without having to perform explicit
channel estimation).
Chapter 4
IR-ARQ in MIMO Block-Fading
Channels with Imperfect
Feedback and CSIR
In Chapter 2, we have shown that the outage diversity (with perfect CSIR) is the
highest diversity that any good code can achieve. In Chapter 3, with uniform
power allocation, we have derived the relationship between the outage diversity
and the generalised outage diversity (with imperfect CSIR). For most fading
distributions both the outage and the generalised outage diversities are finite,
which implies that an arbitrarily small error probability is not attainable.
In this chapter, we consider incremental-redundancy automatic repeat-request
(IR-ARQ) to improve the generalised outage diversity. We provide a brief intro-
duction to IR-ARQ in Section 4.1. We describe the system model in Section 4.2.
We then explain some useful performance metrics for IR-ARQ in Section 4.3.
We state our main results and discuss the findings in Section 4.4. Finally, we
summarise the important findings of the chapter in Section 4.5.
4.1 Introduction
ARQ is a widely-known technique that makes it possible to increase the reliability of transmission over inherently unreliable channels, such as slowly-varying wireless channels [58]. Traditionally, ARQ reduces decoding errors through a retransmission process. When erroneous packets are detected, the receiver requests retransmission of the same packet until the maximum number of rounds for a given message
is reached. An advanced form of ARQ is hybrid ARQ, where channel coding is
used to protect the packet from the impairments of the channel and provide
error-correction capability.
IR-ARQ is a hybrid-ARQ scheme that employs rate-compatible codes where
observations over different transmission rounds can be combined [59–61]. The
coding rate decreases with the number of transmission rounds to adapt to poor
channel conditions. References [60–62] have shown that IR-ARQ provides significant gains over conventional hybrid-ARQ schemes that only use channel coding without combining the observations.
The code diversity performance of IR-ARQ over block-fading channels has
been studied in the literature. A notable result is the optimal rate-diversity-delay trade-off, first studied in [26] for Gaussian inputs under quasi-static fading channels and later in [29] for both Gaussian and discrete inputs under block-fading channels. In these works, it has been shown that the optimal diversity is an
increasing function of the maximum allowed ARQ delay, L. Thus, the reliability
improvement offered by ARQ comes at the expense of increased transmission delay, which inherently reduces the throughput. The throughput loss,
however, is negligible at sufficiently high SNR. This demonstrates that IR-ARQ
is a good technique for reliability improvement at high SNR without sacrificing
throughput.
Common assumptions of the above works are perfect CSIR and perfect ARQ
feedback. Unfortunately, these assumptions are difficult to guarantee in practice.
A number of works have also addressed the effect of imperfect CSIR and
feedback in ARQ channels. For example, references [63–65] studied the effect
of imperfect feedback whereas references [66, 67] investigated the role of the
imperfect-CSIR accuracy on the performance of ARQ schemes. Most of these
works consider separate coding for error correction and detection. A message is first encoded using an error-detecting encoder, and the resulting packet is subsequently encoded using an error-correcting encoder [63, 64, 68]. Thus, in general, the
performance of the system depends closely on the type of error-correcting and
error-detecting codes.
In this chapter, we consider both imperfect CSIR and feedback and study
precisely their impact on the diversity performance of IR-ARQ coding systems.
In particular, we consider noisy CSIR and a simple binary symmetric channel
(BSC) model for the feedback link. Inspired by [58], we do not use separate coding
for error correction and detection. Instead, we use random coding schemes at
the transmitter and a threshold decoder at the receiver capable of detecting an
error. More specifically, we analyse the performance of random coding schemes
constructed over Gaussian and discrete signal constellations. The ARQ decoder
Figure 4.1: System model for IR-ARQ transmission with binary feedback. The encoder maps the message m to Xℓ(m), which is scaled by √(Pℓ/nt) and transmitted over the channel; the IR-ARQ decoder processes the accumulated observations Y1,ℓ to produce m̂ and returns the ACK bit Fr(ℓ) through the feedback channel, which the transmitter receives as Ft(ℓ).
consists of distance-metric decoders that behave as a threshold decoder for rounds up to (L − 1) and as a nearest neighbour decoder at the L-th round.
We first study the corresponding error probability, which is characterised
using the GMI [35, 37] introduced in Chapter 2. We then derive the optimal
SNR-exponent for i.i.d. codebooks in the limit of large code length. In particu-
lar, assuming a general power allocation strategy, we show that the feedback-link
reliability must improve with the transmit SNR for the code to be able to exploit
the diversity offered by ARQ. We also identify the two extremes of imperfect feed-
back: the one that guarantees perfect-feedback diversity and the one that shows
the inability of ARQ to improve the diversity performance. We then consider
uniform power allocation and optimal long-term power control at the transmit-
ter. We derive the conditions for which power control may provide additional
diversity gains with respect to uniform power allocation.
4.2 System Model
The overview of the system model is illustrated in Figure 4.1. In the following,
we shall explain each entity in the system model.
4.2.1 Channel Model
We consider an IR-ARQ coding scheme over a MIMO block-fading channel with
nt transmit antennas, nr receive antennas, L rounds and B fading blocks per
round. The output of the channel at ARQ round ℓ is a Bnr × J-dimensional
random matrix
Yℓ = √(Pℓ/nt) Hℓ Xℓ + Zℓ,  ℓ = 1, . . . , L    (4.1)
where Zℓ is the Bnr × J-dimensional noise matrix and Xℓ ∈ X^{Bnt×J} is the transmitted signal matrix; J denotes the channel block length, X denotes the signal constellation and Pℓ denotes the allocated power at round ℓ satisfying the average-power constraint

E[ (1/L) ∑_{ℓ=1}^{L} Pℓ ] ≤ P.    (4.2)
We assume that the entries of Zℓ, ℓ = 1, . . . , L are i.i.d. complex-Gaussian
random variables with zero mean and unit variance; P indicates the average
SNR at each receive antenna. The fading matrix Hℓ is a Bnr × Bnt random
block-diagonal matrix defined by
Hℓ ≜ diag(Hℓ,1, . . . , Hℓ,B)    (4.3)
where Hℓ,b, b ∈ {1, . . . , B}, ℓ ∈ {1, . . . , L}, is the fading matrix for round ℓ and block b, and takes values in ℂ^{nr×nt}. We further assume that the fading process
H ℓ follows the short-term static model [26] for which H ℓ,b are i.i.d. from block to
block and from round to round. The distribution of H ℓ,b follows from the fading
model in Section 3.1. We write the channel outputs accumulated up to round ℓ
as
Y1,ℓ = H1,ℓ X1,ℓ + Z1,ℓ    (4.4)

where

Y1,ℓ = [Y1^T, . . . , Yℓ^T]^T,  X1,ℓ = [X1^T, . . . , Xℓ^T]^T,    (4.5)

Z1,ℓ = [Z1^T, . . . , Zℓ^T]^T,  H1,ℓ = diag(√(P1/nt) H1, . . . , √(Pℓ/nt) Hℓ).    (4.6)
We further assume that the imperfect channel estimates have the same form as (3.5):

Ĥℓ = Hℓ + Eℓ,  ℓ = 1, . . . , L    (4.7)

where Ĥℓ ≜ diag(Ĥℓ,1, . . . , Ĥℓ,B) and Eℓ ≜ diag(Eℓ,1, . . . , Eℓ,B) (with Ĥℓ,b and Eℓ,b taking values in ℂ^{nr×nt}) are the noisy channel estimate and the channel estimation error, respectively. The entries of Eℓ,b, b = 1, . . . , B, are independent of the entries of Hℓ,b and are i.i.d. complex-Gaussian random variables with zero mean and variance

σ²e = P^{−de},  de > 0    (4.8)
where de is the channel estimation error diversity. As explained in Chapter 3,
this model is widely used in pilot-based channel estimation. Finally, we write
Ĥ1,ℓ and E1,ℓ similarly to H1,ℓ in (4.6).
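The stacked matrix H1,ℓ in (4.6) is simply a block-diagonal arrangement of the power-scaled per-round fading matrices; a sketch of its construction (names are ours):

```python
import numpy as np

def stacked_channel(H_list, P_list, nt):
    """Assemble H_(1,l) = diag(sqrt(P_1/nt) H_1, ..., sqrt(P_l/nt) H_l)
    as in (4.6), where H_k is the (B*nr x B*nt) fading matrix of round k."""
    blocks = [np.sqrt(P / nt) * np.asarray(H, dtype=complex)
              for H, P in zip(H_list, P_list)]
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols), dtype=complex)
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out
```

The same routine applies to Ĥ1,ℓ and E1,ℓ by passing the estimate or error blocks instead.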
4.2.2 IR-ARQ Scheme
At the transmitter end, a mother code of rate R/L is constructed by concatenating several coded blocks. A mother codeword used to encode the message m, m ∈ {1, . . . , |M|}, is defined by

X1,L(m) ≜ [X1^T(m), . . . , XL^T(m)]^T    (4.9)
which is made of L matrices Xℓ(m) ∈ X^{Bnt×J}, ℓ = 1, . . . , L. Here M is the set of all possible (equiprobable) messages such that |M| = 2^{BJR}, where R is the coding rate for each Xℓ(m). Note that Xℓ(m) may be different for different ℓ. At round ℓ, the transmitter sends the matrix Xℓ, which consists of B coded blocks. Each coded symbol is drawn i.i.d. from an input signal set X; here we focus on Gaussian and discrete inputs. For Gaussian inputs, we assume a fixed data rate, i.e., R > 0, independent of the SNR, whereas for discrete inputs, the data rate is limited by the number of transmit antennas and the cardinality of the signal set |X| = 2^M, i.e., R ∈ (0, Mnt). The constellations are assumed to have unit average energy.
At the receiver end, for each transmission round ℓ = 1, . . . , L, the IR-ARQ decoder considers all coded blocks up to the current round. At rounds ℓ < L, conditioned on a fixed channel H1,ℓ = H1,ℓ and its corresponding estimate Ĥ1,ℓ = Ĥ1,ℓ, the receiver employs a threshold decoder based on the set

Tδ(ℓ) = { (X1,ℓ(m), Y1,ℓ, Ĥ1,ℓ) : Q^s_{Y|X,Ĥ}(Y1,ℓ | X1,ℓ(m), Ĥ1,ℓ) / E[ Q^s_{Y|X,Ĥ}(Y1,ℓ | X′1,ℓ, Ĥ1,ℓ) ] ≥ |M|/δ }    (4.10)
where s and δ are some positive numbers, and where the expectation is taken over
the probability distribution PX(X′1,ℓ). Herein QY|X,Ĥ(·) is the Euclidean-distance metric given by
QY|X,Ĥ(Y1,ℓ | X1,ℓ, Ĥ1,ℓ) ≜ ∏_{ℓ′=1}^{ℓ} ∏_{b=1}^{B} ∏_{ν=1}^{J} (1/π^{nr}) exp(−‖ yℓ′,b,ν − √(Pℓ′/nt) Ĥℓ′,b xℓ′,b,ν ‖²).    (4.11)
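Up to the constant per-symbol normalisation factor, the logarithm of the metric (4.11) is a negative squared Euclidean distance, which is why the maximum-metric decoder is a nearest neighbour decoder; a single-round sketch (names and shapes are ours):

```python
import numpy as np

def log_metric(Y, X, Hhat, P, nt):
    """Logarithm of the Euclidean-distance metric (4.11) for one round,
    dropping the constant normalisation: -||Y - sqrt(P/nt) Hhat X||_F^2."""
    diff = Y - np.sqrt(P / nt) * (Hhat @ X)
    return float(-np.sum(np.abs(diff) ** 2))
```

The decoder compares this quantity across candidate codewords X(m), larger being better.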
The threshold decoder Ψ(·) outputs:
• Ψ(Y1,ℓ) = m, m ∈ {1, . . . , |M|}, if X1,ℓ(m) is the unique sequence whose normalised metric is greater than the threshold |M|/δ in Tδ(ℓ). A positive acknowledgement (ACK) is then generated.
• Ψ(Y1,ℓ) = 0 if no unique sequence is found, and an error is declared. A
negative ACK is then generated.
At round ℓ = L, error detection is not required. A maximum-metric decoder,
which is identical to the nearest neighbour decoder in Chapter 3, is used to output
the message m̂ such that

m̂ = arg max_{m=1,...,|M|} QY|X,Ĥ(Y1,L | X1,L(m), Ĥ1,L).    (4.12)
A detailed analysis on the performance of the decoder is given in Section 4.3.1.
At round ℓ, ℓ < L, the transmitter receives through a feedback channel a
positive ACK or negative ACK. If the positive ACK is received, the transmit-
ter understands that the message has been successfully delivered and starts the
transmission of the next message. Instead, if the negative ACK is received, the
transmitter sends the next ARQ round corresponding to the current message,
Xℓ+1(m). This process continues until a positive ACK is received or until the
maximum round L has been reached.
Synchronous transmission is assumed. Each transmission round is numbered
and the sequence is known perfectly at both terminals.
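The threshold rule of the decoder Ψ(·) can be sketched as follows, reading the threshold in (4.10) as |M|/δ (names are ours; the normalised metrics are assumed precomputed):

```python
def threshold_decode(norm_metrics, M_size, delta):
    """Threshold decoder sketch: norm_metrics[m-1] is the normalised
    metric of message m.  Output m if exactly one message exceeds the
    threshold |M|/delta; otherwise output 0 (error detected -> NACK)."""
    thr = M_size / delta
    above = [m for m, v in enumerate(norm_metrics, start=1) if v >= thr]
    return above[0] if len(above) == 1 else 0
```

An output of 0 triggers a negative ACK and hence, for ℓ < L, another IR-ARQ round.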
4.2.3 Feedback Channel and ARQ
We shall denote the ACK generated by the receiver at round ℓ as Fr(ℓ). Similarly,
the ACK received by the transmitter at round ℓ is denoted by Ft(ℓ). The numbers
0 and 1 denote negative and positive ACKs, respectively. For example, Fr(ℓ) = 1
(Fr(ℓ) = 0) is the positive (negative) ACK generated by the receiver at round
ℓ and Ft(ℓ) = 1 (Ft(ℓ) = 0) is the positive (negative) ACK received by the
transmitter at round ℓ.
We model the feedback channel as a BSC with crossover probability pfb. The
motivation is that the ACK sent by the receiver may be interpreted incorrectly
by the transmitter due to unreliable medium. We assume that the crossover
probability is such that

pfb = min{ p0, p0/P^{dfb} }    (4.13)

where 0 ≤ p0 ≤ 1/2 and dfb > 0 is the feedback diversity, which is defined as

dfb ≜ lim_{P→∞} −log pfb / log P.    (4.14)
This models a feedback channel whose quality increases with the forward link
SNR. The perfect feedback assumption [26, 29, 32] is a special case of this BSC
feedback with pfb = 0.
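Reading (4.13) as pfb = min{p0, p0/P^dfb}, the feedback link is straightforward to simulate; a sketch (names and the seed are ours):

```python
import random

random.seed(2)

def feedback_crossover(P, p0=0.5, dfb=1.0):
    """Crossover probability of the BSC feedback link, reading (4.13) as
    p_fb = min{p0, p0 / P**dfb}; p_fb decays as P^(-dfb) at high SNR."""
    return min(p0, p0 / P ** dfb)

def received_ack(ack_bit, P, p0=0.5, dfb=1.0):
    """Pass a one-bit ACK (1) / NACK (0) through the BSC: the transmitter
    sees the flipped bit with probability p_fb."""
    flip = random.random() < feedback_crossover(P, p0, dfb)
    return ack_bit ^ int(flip)
```

As P grows, the crossover probability vanishes polynomially and the model approaches the perfect-feedback case.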
Upon receiving a positive ACK, the transmitter stops the current message
transmission and starts the next message transmission. Otherwise, the transmit-
ter continues the current message transmission in the next round. The transmit-
ter keeps track of the number of rounds that have already elapsed and terminates
the current transmission once the maximum delay limit L is reached.
Once a message is successfully decoded at a particular round, the receiver
issues a positive ACK. The receiver will disregard any extra transmission rounds
of the current message as a result of wrong negative ACKs at the transmitter;
for each extra transmission round, the receiver will continue to issue a positive ACK. If decoding is not successful (Fr(ℓ) = 0) but there is no retransmission of
the current message (since Ft(ℓ) = 1), the receiver declares an error; no ACK is
further issued for the current message as decoding will start for the next message.
4.3 Performance Metrics with Imperfect CSIR
In the following, we develop ARQ performance metrics accounting for imperfect CSIR.
4.3.1 Error Probability with Perfect Feedback
To analyse the error probability of the underlying ARQ scheme, we first define
the following three decoder events.
• The joint event of error detection up to round ℓ:

Dℓ ≜ {Ψ(Y1,1) = 0, . . . , Ψ(Y1,ℓ) = 0}.    (4.15)
• Assuming that the message m is transmitted, the undetected error event at round ℓ (conditioned on Dℓ−1 having occurred) consists of
– the event of valid decoding

Vℓ ≜ ⋃_{m̂≠0} {Ψ(Y1,ℓ) = m̂},    (4.16)

– the event of decoding error

Eℓ ≜ ⋃_{m̂≠m} {Ψ(Y1,ℓ) = m̂}.    (4.17)
Assuming perfect feedback, the error probability at round ℓ, Pe(ℓ), ℓ = 1, . . . , L − 1, can be written as

Pe(ℓ) = Pr{V1, E1} + ∑_{ℓ′=2}^{ℓ−1} Pr{Dℓ′−1, Vℓ′, Eℓ′} + Pr{Dℓ−1, Eℓ}.    (4.18)

The accumulated error probability at round L is given by

Pe(L) = Pr{V1, E1} + ∑_{ℓ′=2}^{L−1} Pr{Dℓ′−1, Vℓ′, Eℓ′} + Pr{DL−1, EL}.    (4.19)
Note that for any ℓ = 1, . . . , L − 1, if an undetected error occurs at round ℓ, it will also be counted as an error at rounds ℓ + 1 up to L. This emphasises that ARQ cannot correct undetected errors.
In order to characterise the above error probability, we first study the thresh-
old decoder Ψ(·) for a block-fading channel and its properties in terms of detected
and undetected error probability in Section 4.3.1.1. We then apply the results to
IR-ARQ in Section 4.3.1.2.
4.3.1.1 Threshold Decoder Ψ(·)
In this section, we consider a block-fading channel with B fading blocks and
characterise the error behaviour of the threshold decoder Ψ(·) based on the set
(4.10), i.e.,
Tδ = { (X(m), Y, Ĥ) : Q^s_{Y|X,Ĥ}(Y | X(m), Ĥ) / E[ Q^s_{Y|X,Ĥ}(Y | X′, Ĥ) ] ≥ |M|/δ }    (4.20)
(4.20)
for some s, δ > 0. Herein we have specifically omitted the argument ℓ in Tδ and
the subscript 1,ℓ in the arguments of the metric as we do not incorporate ARQ.
We first note that the set Tδ is a modified version of the set implicitly used in the Csiszár–Körner upper bound to the error probability [69, Lemma 12.9]. There are two differences with [69, Lemma 12.9]. The first one is the introduction of the parameter s > 0 (allowing a more general bound than using s = 1 in [69, Lemma 12.9]). The second one is the normalising term

E[ Q^s_{Y|X,Ĥ}(Y | X′, Ĥ) ]    (4.21)

that lifts the restriction [69]

E[ QY|X,Ĥ(Y | X′, Ĥ) ] = 1,  ∀Y ∈ ℂ^{Bnr×J}.    (4.22)

Thus, using the same steps as those used to derive [69, Lemma 12.9], we have the following lemma on the upper bound to the error probability of the threshold decoder.
Lemma 4.1. The error probability of the ensemble of random codes using the threshold decoder—which produces the output m if X(m) is the unique sequence whose normalised metric is greater than the threshold in Tδ (4.20), and otherwise declares an error—can be upper-bounded as

Pe,ave(Ĥ) ≤ δ + Pr{ (1/(BJ)) log2 ( Q^s_{Y|X,Ĥ}(Y | X, Ĥ) / E[ Q^s_{Y|X,Ĥ}(Y | X′, Ĥ) | Y, Ĥ ] ) < R − (1/(BJ)) log2 δ | H = H, E = E }    (4.23)
for any s, δ > 0.
Lemma 4.1 implies the following.
Corollary 4.1 (Detected Error Probability). Let

δ′ = −(1/(BJ)) log2 δ,  0 < δ < 1.    (4.24)

For a given alphabet X and its corresponding distribution, there exists a random code whose detected error probability can be bounded as

Pr{D | H = H, E = E} ≤ 2^{−BJδ′} + Pr{ (1/(BJ)) log2 ( Q^s_{Y|X,Ĥ}(Y | X, Ĥ) / E[ Q^s_{Y|X,Ĥ}(Y | X′, Ĥ) | Y, Ĥ ] ) < R + δ′ | H = H, E = E }    (4.25)
for any s > 0.
The upper bound in Lemma 4.1 holds for the undetected error probability
as well. However, we want the undetected error probability to be as small as
possible. The following lemma—which we prove in Appendix B.1—may provide
a tighter bound.
Lemma 4.2 (Undetected Error Probability). Recall that δ′ is given in (4.24).
For a given alphabet X and its corresponding distribution, there exists a random
code whose undetected error probability can be bounded as
Pr{V, E | H = H, E = E} ≤ 2^{−BJδ′}    (4.26)
for any δ′ > 0.
We shall note that the maximum-metric decoder based on QY|X,Ĥ(Y | X, Ĥ), cf. (2.22), yields a smaller error probability than the above threshold decoder.
This is so because whenever the threshold decoder produces a message output, this output is identical to the output of the maximum-metric decoder (2.22). However, the threshold decoder has an advantage over the maximum-metric decoder: it allows for error detection, i.e., it can declare an error when there is no unique codeword above the threshold in $T_\delta$.
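To make the threshold rule concrete, the following Python sketch (a toy illustration with assumed parameters $J$, $M$, $P$ and an assumed threshold value, not the ensemble construction used in the proof) draws a small i.i.d. Gaussian codebook, computes the normalised nearest neighbour log-metric under a noisy channel estimate, and outputs a message only when exactly one codeword exceeds the threshold; otherwise a detected error is declared.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy parameters (illustrative only): block length J, M codewords, SNR P.
J, M, P = 64, 8, 4.0
h = 1.0 + 0.5j                                   # true channel
h_hat = h + 0.1                                  # noisy CSIR estimate

# i.i.d. complex-Gaussian codebook with per-symbol power P.
codebook = np.sqrt(P / 2) * (rng.standard_normal((M, J))
                             + 1j * rng.standard_normal((M, J)))
m_true = 3
z = (rng.standard_normal(J) + 1j * rng.standard_normal(J)) / np.sqrt(2)
y = h * codebook[m_true] + z

# Normalised nearest neighbour log-metric: (1/J) sum_j log exp(-|y_j - h_hat x_j|^2).
metrics = -np.mean(np.abs(y - h_hat * codebook) ** 2, axis=1)

threshold = -3.0                                 # illustrative threshold choice
above = np.flatnonzero(metrics > threshold)
m_out = int(above[0]) if above.size == 1 else None   # None = detected error
```

With a well-separated codebook the correct codeword's normalised metric concentrates near $-(1 + |h - \hat{h}|^2 P)$ while incorrect ones concentrate much lower, which is what makes a fixed threshold between the two levels work.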
Note that with an i.i.d. codebook, we can write the expectation over X′ as
\[
\mathbb{E}\Bigl[Q^{s}_{Y|X,\hat{H}}\bigl(\mathbf{Y}\,\big|\,\mathbf{X}',\hat{\mathbf{H}}\bigr)\Bigr] = \prod_{b=1}^{B}\prod_{\nu=1}^{J}\mathbb{E}\Bigl[Q^{s}_{Y|X,\hat{H}}\bigl(y_{b,\nu}\,\big|\,X'_{b,\nu},\hat{H}_b\bigr)\Bigr]. \tag{4.27}
\]
For a fixed fading $\mathbf{H}_b = H_b \in \mathbb{C}^{n_r\times n_t}$ and its corresponding estimation error $\mathbf{E}_b = E_b \in \mathbb{C}^{n_r\times n_t}$, $b = 1,\dots,B$, in the limit as $J \to \infty$, we have from the law of large numbers [19] that
\[
\lim_{J\to\infty}\frac{1}{J}\log_2\frac{Q^{s}_{Y|X,\hat{H}}\bigl(\mathbf{Y}\,\big|\,\mathbf{X},\hat{\mathbf{H}}\bigr)}{\mathbb{E}\Bigl[Q^{s}_{Y|X,\hat{H}}\bigl(\mathbf{Y}\,\big|\,\mathbf{X}',\hat{\mathbf{H}}\bigr)\Bigr]} = \sum_{b=1}^{B}\mathbb{E}\left[\log_2\frac{Q^{s}_{Y|X,\hat{H}}\bigl(Y\,\big|\,X,\hat{H}_b\bigr)}{\mathbb{E}\Bigl[Q^{s}_{Y|X,\hat{H}}\bigl(Y\,\big|\,X',\hat{H}_b\bigr)\,\Big|\,Y,\hat{H}_b\Bigr]}\;\middle|\;\mathbf{H}_b = H_b,\, \mathbf{E}_b = E_b\right]. \tag{4.28}
\]
Fix δ′ > 0. Following from Corollary 4.1 and (4.28), as the block length J
tends to infinity, the detected error probability—conditioned on both fading and
its corresponding estimation error—is given by
\[
\Pr\{\mathcal{D} \mid \mathbf{H} = H, \mathbf{E} = E\} \le \mathbb{1}\bigl\{I^{\mathrm{gmi}}(H, s) \le R + \delta'\bigr\}, \tag{4.29}
\]
where
\[
I^{\mathrm{gmi}}(H, s) = \frac{1}{B}\sum_{b=1}^{B} I^{\mathrm{gmi}}_{b}(P, H_b, \hat{H}_b, s) \tag{4.30}
\]
and
\[
I^{\mathrm{gmi}}_{b}(P, H_b, \hat{H}_b, s) = \mathbb{E}\left[\log_2\frac{Q^{s}_{Y|X,\hat{H}}\bigl(Y\,\big|\,X,\hat{H}_b\bigr)}{\mathbb{E}\Bigl[Q^{s}_{Y|X,\hat{H}}\bigl(Y\,\big|\,X',\hat{H}_b\bigr)\,\Big|\,Y,\hat{H}_b\Bigr]}\;\middle|\;\mathbf{H}_b = H_b,\, \mathbf{E}_b = E_b\right]. \tag{4.31}
\]
Note that for a given $\mathbf{H} = H$ and $\mathbf{E} = E$, the bound (4.29) is valid for any $s > 0$. Thus, by taking the expectation over the ensemble of $\mathbf{H}, \mathbf{E}$, we can tighten the bound (4.29) by optimising over $s > 0$, where the optimising $s$ depends on the realisations of $H$ and $E$, i.e.,
\begin{align}
\Pr\{\mathcal{D}\} &\triangleq \mathbb{E}\bigl[\Pr\{\mathcal{D} \mid \mathbf{H} = H, \mathbf{E} = E\}\bigr] \tag{4.32}\\
&\le \Pr\Bigl\{\sup_{s>0} I^{\mathrm{gmi}}(\mathbf{H}, s) \le R + \delta'\Bigr\} \tag{4.33}\\
&= \Pr\bigl\{I^{\mathrm{gmi}}(\mathbf{H}) \le R + \delta'\bigr\} \tag{4.34}
\end{align}
which is similar to the generalised outage probability for a given rate R+δ′ (2.32)
but with < being replaced by ≤. On the other hand, following from Lemma 4.2
\[
\Pr\{\mathcal{V}, \mathcal{E} \mid \mathbf{H} = H, \mathbf{E} = E\} \le 2^{-BJ\delta'}, \tag{4.35}
\]
we observe that for any δ′ > 0, the undetected error probability vanishes as J
tends to infinity. Remark that the upper bound (4.35) is independent from both
H and E.
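For intuition, the GMI quantities above can be estimated by Monte Carlo sampling. The sketch below is an illustration for a SISO channel with Gaussian inputs (all numerical values are assumptions for the example, not taken from this chapter); it uses the closed-form Gaussian integral for the inner expectation over $X'$ and maximises over a grid of $s$ values. With perfect CSIR and $s = 1$ the GMI reduces to $\log_2(1 + |h|^2 P)$, which the estimate should reproduce.

```python
import numpy as np

rng = np.random.default_rng(1)

def gmi_siso_gaussian(h, h_hat, P, s_grid, n=200_000):
    """Monte Carlo GMI (bits/channel use) of nearest neighbour decoding with
    channel estimate h_hat, Gaussian inputs X ~ CN(0, P), maximised over s_grid."""
    x = np.sqrt(P / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    z = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    y = h * x + z
    best = -np.inf
    for s in s_grid:
        a = 1.0 + s * abs(h_hat) ** 2 * P
        # Inner expectation over X' ~ CN(0, P) in closed form (Gaussian integral):
        # E[exp(-s|y - h_hat X'|^2) | y] = exp(-s|y|^2 / a) / a
        val = np.mean(-s * np.abs(y - h_hat * x) ** 2
                      + s * np.abs(y) ** 2 / a + np.log(a)) / np.log(2.0)
        best = max(best, val)
    return best

P = 10.0
i_perfect = gmi_siso_gaussian(1.0, 1.0, P, s_grid=[1.0])   # matched case: ~ log2(1 + P)
i_mismatch = gmi_siso_gaussian(1.0, 1.3, P, np.linspace(0.05, 2.0, 40))
```

Since mismatched decoding cannot beat matched decoding, `i_mismatch` should not exceed `i_perfect` beyond Monte Carlo noise, illustrating that the GMI lower-bounds the mutual information.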
The converse for i.i.d. codebooks shows that conditioned on fading and its
corresponding estimation error, the error probability with the maximum-metric
decoder (4.12) for sufficiently large J is lower-bounded by [4, 39]
\[
\Pr\{\mathcal{E} \mid \mathbf{H} = H, \mathbf{E} = E\} \ge 1 - \exp\Bigl(-e^{-BJ\bigl(I^{\mathrm{gmi}}(H)+\varepsilon-R\bigr)} + e^{-BJ\bigl(I^{\mathrm{gmi}}(H)+\varepsilon\bigr)}\Bigr) \tag{4.36}
\]
where ε > 0 is an arbitrarily small number. In the limit of large J , averaging the
RHS of the inequality over all fading and channel estimation error realisations
leads to the generalised outage probability at a given rate R (2.32). Recall that
the threshold decoder Ψ(·) cannot be better than the maximum-metric decoder.
Hence, the lower bound (4.36) holds for the threshold decoder Ψ(·) as well.

From Lemma 4.1 and the result (4.29), we have that for large block length, whenever $I^{\mathrm{gmi}}(\mathbf{H}) > R + \delta'$, the message will be correctly decoded with high probability. For i.i.d. codebooks, whenever $I^{\mathrm{gmi}}(\mathbf{H}) < R$, a decoding error occurs with high probability, which follows from (4.36). If a decoding error occurs, then it can be detected with arbitrarily high probability, which follows from the result (4.35) by letting $J$ tend to infinity.
4.3.1.2 Decoding Error and Communication Outage
We can view IR-ARQ decoding at round ℓ, ℓ = 1, . . . , L as decoding for a block-
fading channel with total ℓB fading blocks. Due to the concatenation of the
matrices X1, . . . ,Xℓ in X1,ℓ, the effective coding rate is given by R/ℓ. Note that
following from Lemma 4.2, we have that the probability of undetected error
accumulated up to round ℓ < L can be upper-bounded as
\[
\Pr\{\mathcal{V}_1, \mathcal{E}_1\} + \sum_{\ell'=2}^{\ell}\Pr\{\mathcal{D}_{\ell'-1}, \mathcal{V}_{\ell'}, \mathcal{E}_{\ell'}\} \le \ell\, 2^{-BJ\delta'}. \tag{4.37}
\]
By fixing δ′ > 0 and letting J → ∞, the contribution of (4.37) to (4.18) can
be made arbitrarily small. The dominating contribution is given by the detected
error probability
\begin{align}
\Pr\{\mathcal{D}_{\ell-1}, \mathcal{E}_\ell\} - \Pr\{\mathcal{D}_{\ell-1}, \mathcal{V}_\ell, \mathcal{E}_\ell\} &= \Pr\{\mathcal{D}_{\ell-1}, \mathcal{V}^{c}_\ell, \mathcal{E}_\ell\} \tag{4.38}\\
&= \Pr\{\mathcal{D}_\ell\}. \tag{4.39}
\end{align}
Note that by the definition (4.15), Dℓ characterises the joint detected error event
up to round ℓ. Following Corollary 4.1, in the limit as J → ∞, we have the
bound
\[
\Pr\{\mathcal{D}_\ell\} \le \Pr\left\{(\mathbf{H}_{1,\ell}, \mathbf{E}_{1,\ell}) \in \bigcap_{\ell'=1}^{\ell}\mathcal{Q}_{1,\ell'}(R+\delta')\right\} \tag{4.40}
\]
where
\[
\mathcal{Q}_{1,\ell}(R) \triangleq \Bigl\{(H_{1,\ell}, E_{1,\ell}) \in \mathbb{C}^{n_r\times \ell B n_t}\times\mathbb{C}^{n_r\times \ell B n_t} : I^{\mathrm{gmi}}_{1,\ell}(H_{1,\ell}) \le R\Bigr\}. \tag{4.41}
\]
For future reference, we shall define a similar set to Q1,ℓ(R) but with ≤ being
replaced by <, i.e.,
\[
\mathcal{O}_{1,\ell}(R) \triangleq \Bigl\{(H_{1,\ell}, E_{1,\ell}) \in \mathbb{C}^{n_r\times \ell B n_t}\times\mathbb{C}^{n_r\times \ell B n_t} : I^{\mathrm{gmi}}_{1,\ell}(H_{1,\ell}) < R\Bigr\}. \tag{4.42}
\]
In the above, we have defined $I^{\mathrm{gmi}}_{1,\ell}(H_{1,\ell})$ as the accumulated GMI, related to the GMI with $\ell B$ fading blocks $I^{\mathrm{gmi}}(H_{1,\ell})$ as
\[
I^{\mathrm{gmi}}_{1,\ell}(H_{1,\ell}) \triangleq \ell\, I^{\mathrm{gmi}}(H_{1,\ell}) \tag{4.43}
\]
where
\[
I^{\mathrm{gmi}}(H_{1,\ell}) = \sup_{s>0}\frac{1}{\ell B}\sum_{l=1}^{\ell}\sum_{b=1}^{B} I^{\mathrm{gmi}}_{l,b}(P_l, H_{l,b}, \hat{H}_{l,b}, s) \tag{4.44}
\]
and where $I^{\mathrm{gmi}}_{l,b}(P_l, H_{l,b}, \hat{H}_{l,b}, s)$ is given on the RHS of (4.31). Clearly, this accumulated GMI replaces the role of the accumulated mutual information [32]
\[
I_{1,\ell}(H_{1,\ell}) \triangleq \frac{1}{B}\sum_{l=1}^{\ell}\sum_{b=1}^{B} I_{\mathrm{awgn}}\left(\sqrt{\frac{P_l}{n_t}}\, H_{l,b}\right) \tag{4.45}
\]
(where $I_{\mathrm{awgn}}(\cdot)$ is defined in Section 2.3.1) to describe communication outages for IR-ARQ.
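The non-additivity stems from the single parameter $s$ shared by all blocks in (4.44): a badly estimated new block can pull the common optimiser away from the value that is good for the earlier blocks. The following Monte Carlo sketch (SISO Gaussian inputs with assumed, illustrative channel values — not the exact configuration of Figure 4.2) evaluates the accumulated GMI for ℓ = 1 and ℓ = 2 so the two rounds can be compared; in this particular configuration the second round's accumulated GMI comes out lower than the first round's.

```python
import numpy as np

rng = np.random.default_rng(2)

def block_gmi(h, h_hat, P, s, n=100_000):
    """Monte Carlo per-block GMI at a fixed s (bits/channel use),
    Gaussian inputs, nearest neighbour metric with estimate h_hat."""
    x = np.sqrt(P / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    z = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    y = h * x + z
    a = 1.0 + s * abs(h_hat) ** 2 * P
    # E_{X'}[exp(-s|y - h_hat X'|^2) | y] = exp(-s|y|^2 / a) / a
    return np.mean(-s * np.abs(y - h_hat * x) ** 2
                   + s * np.abs(y) ** 2 / a + np.log(a)) / np.log(2.0)

def accumulated_gmi(pairs, P, s_grid):
    """l * I^gmi(H_{1,l}) with one common s across all (h, h_hat) blocks,
    following (4.43)-(4.44)."""
    return len(pairs) * max(np.mean([block_gmi(h, hh, P, s) for h, hh in pairs])
                            for s in s_grid)

P, s_grid = 16.0, np.linspace(0.05, 2.0, 40)
blocks_r1 = [(1.0 - 1.0j, 1.0 - 1.0j)]        # round 1: well-estimated block
blocks_r2 = blocks_r1 + [(0.1, 0.6)]          # round 2 adds a badly estimated block
acc1 = accumulated_gmi(blocks_r1, P, s_grid)  # I^gmi_{1,1}
acc2 = accumulated_gmi(blocks_r2, P, s_grid)  # I^gmi_{1,2}
```

Because the second block's GMI is strongly negative at the $s$ that suits the first block, the common-$s$ optimisation sacrifices rate: combining the extra observations here lowers the accumulated GMI instead of raising it.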
At round L, the maximum-metric decoder (4.12), which is better than the
threshold decoder Ψ(·), is used to produce the output message. Thus, at round
L, the error characterisation in Section 2.4.1 applies. Following the same steps
as for $\ell < L$, we can see that the upper bound to $P_e(L)$ in (4.19) is dominated by $\Pr\{\mathcal{D}_{L-1}, \mathcal{E}_L\}$. Following the steps in Section 2.4.1 and the derivation in Section 4.3.1.1, we can show as $J \to \infty$ that
\[
P_e(L) \le \Pr\left\{(\mathbf{H}_{1,L}, \mathbf{E}_{1,L}) \in \bigcap_{\ell'=1}^{L}\mathcal{Q}_{1,\ell'}(R+\delta')\right\}. \tag{4.46}
\]
Using Proposition 2.4 (see also the discussion in Section 4.3.1.1), we can show
that with i.i.d. codebooks and perfect error detection, the error probability at
round ℓ for sufficiently large block length can be lower-bounded as
\[
\Pr\{\mathcal{E}_\ell, \mathcal{D}_{\ell-1}\} \ge \Pr\left\{(\mathbf{H}_{1,\ell}, \mathbf{E}_{1,\ell}) \in \bigcap_{\ell'=1}^{\ell}\mathcal{O}_{1,\ell'}(R)\right\}. \tag{4.47}
\]
Notice that the upper bound (4.46) tends to the lower bound (4.47) (at round
$L$) with $\delta' \downarrow 0$. For future reference, we shall define the RHS of (4.47) in the following.

[Figure 4.2: Density of the accumulated GMI at round $\ell = 2$ for Gaussian-input transmission over a SISO Rayleigh fading channel with $B = 1$, $P_\ell = 16$ (unit power), $\ell = 1, 2$.]
Definition 4.1 (Generalised Joint-Outage). The generalised joint-outage probability at ARQ round $\ell$, $\ell = 1,\dots,L$, is defined as the probability that the accumulated GMIs from rounds 1 to $\ell$ are all less than $R$, i.e.,
\[
P^{\ell}_{\mathrm{gout}}(R) \triangleq \Pr\left\{(\mathbf{H}_{1,\ell}, \mathbf{E}_{1,\ell}) \in \bigcap_{\ell'=1}^{\ell}\mathcal{O}_{1,\ell'}(R)\right\}. \tag{4.48}
\]
One of the main reasons for employing IR-ARQ in practical systems is the feature of concatenating all noisy received coded blocks $\mathbf{Y}_{1,\ell}$. From an information-theoretic perspective, under perfect CSIR (where $I^{\mathrm{gmi}}_{1,\ell}(H_{1,\ell})$ and $I_{1,\ell}(H_{1,\ell})$ are identical), this allows for information accumulation (4.45). However, this may no longer be true if the CSIR is imperfect. As we can see from (4.43) and (4.44), increasing the number of coded blocks does not necessarily improve the accumulated GMI due to the optimisation over $s > 0$. In contrast to the accumulated mutual information, the accumulated GMI at round $\ell$ is not the sum of the GMIs at rounds $\ell' = 1,\dots,\ell$. To illustrate this, we consider Gaussian-input transmission using a two-round IR-ARQ scheme over a SISO quasi-static Rayleigh-fading channel and evaluate the accumulated GMI at round $\ell = 2$ conditioned on the accumulated GMI at round $\ell = 1$, $I^{\mathrm{gmi}}_{1,1}(h_{1,1})$. Let $h_{1,1} = \hat{h}_{1,1} = 1 - i$. With $P_\ell = 16$, $\ell = 1, 2$, this gives $I^{\mathrm{gmi}}_{1,1}(h_{1,1}) = 5.044$ bits per channel use. We plot the
density of the random variable $I^{\mathrm{gmi}}_{1,2}(\mathbf{H}_{1,2})$ in Figure 4.2.
As we can see from Figure 4.2, for $d_e = 0.5, 1, 2$, there is a non-zero probability that the accumulated GMI at round $\ell = 2$ is less than the accumulated
GMI at round ℓ = 1. This probability decreases with an increase in de. The
figure also suggests that if the decoder is mismatched, increasing the number
of fading blocks does not necessarily translate into improving the accumulated
maximum achievable rate. This phenomenon does not occur if the CSIR is perfect, i.e., the accumulated mutual information at round $\ell = 2$ always improves, as a consequence of the non-negativity of mutual information [18].
4.3.2 Error Probability with Imperfect Feedback
With imperfect feedback, there are additional error events which come from feedback errors, i.e., when the receiver detects an error and hence sends a negative ACK ($F_r(\ell) = 0$), but the transmitter receives a positive ACK ($F_t(\ell) = 1$). We first define the joint event in which the transmitter receives negative ACKs up to round $\ell$:
\[
\mathcal{A}_\ell \triangleq \{F_t(1) = 0, \dots, F_t(\ell) = 0\}. \tag{4.49}
\]
Using the feedback model in Section 4.2, we understand that the value of Ft(ℓ)
is known after the value of Fr(ℓ) is known.
With imperfect feedback, the error probability in (4.19) becomes
\begin{align}
P_e(L) = {} & \Pr\{\mathcal{V}_1, \mathcal{E}_1\} + \sum_{\ell'=2}^{L-1}\Pr\{\mathcal{A}_{\ell'-1}, \mathcal{D}_{\ell'-1}, \mathcal{V}_{\ell'}, \mathcal{E}_{\ell'}\} + \Pr\{\mathcal{A}_{L-1}, \mathcal{D}_{L-1}, \mathcal{E}_L\} \notag\\
& + \Pr\{\mathcal{D}_1, F_t(1) = 1\} + \sum_{\ell'=2}^{L-1}\Pr\{\mathcal{A}_{\ell'-1}, \mathcal{D}_{\ell'}, F_t(\ell') = 1\}. \tag{4.50}
\end{align}
The first three terms
\[
\Pr\{\mathcal{V}_1, \mathcal{E}_1\} + \sum_{\ell'=2}^{L-1}\Pr\{\mathcal{A}_{\ell'-1}, \mathcal{D}_{\ell'-1}, \mathcal{V}_{\ell'}, \mathcal{E}_{\ell'}\} + \Pr\{\mathcal{A}_{L-1}, \mathcal{D}_{L-1}, \mathcal{E}_L\} \tag{4.51}
\]
characterise decoding errors and undetected errors. The last two terms
\[
\Pr\{\mathcal{D}_1, F_t(1) = 1\} + \sum_{\ell'=2}^{L-1}\Pr\{\mathcal{A}_{\ell'-1}, \mathcal{D}_{\ell'}, F_t(\ell') = 1\} \tag{4.52}
\]
are due to feedback errors. Note that only feedback errors up to round L− 1 are
considered as the imperfect feedback at round L is immaterial since there is no
further retransmission. We shall analyse the severity of the imperfect feedback
and CSIR and investigate the behaviour of Pe(L) at high SNR in the following
section.
4.4 ARQ Outage Diversity
In this section, we present our results on the high-SNR performance of the IR-
ARQ scheme. Note that the reliability of IR-ARQ is characterised by errors
accumulated at round L, i.e., Pe(L) in (4.50). Following the large block length
analysis in Section 4.3, using random coding schemes, the behaviour of Pe(L) can
be captured by the ARQ outage probability $P^{\mathrm{arq}}_{\mathrm{gout}}(R)$, which is defined as follows.
Definition 4.2 (ARQ Outage Probability). The ARQ outage probability $P^{\mathrm{arq}}_{\mathrm{gout}}(R)$ is the probability that the accumulated GMI at rounds $\ell = 1,\dots,L$ is less than the data rate $R$, but no further retransmission is performed, i.e.,
\begin{align}
P^{\mathrm{arq}}_{\mathrm{gout}}(R) \triangleq {} & \Pr\bigl\{(\mathbf{H}_{1,1}, \mathbf{E}_{1,1}) \in \mathcal{O}_{1,1}(R),\, F_t(1) = 1\bigr\} \notag\\
& + \sum_{\ell=2}^{L-1}\Pr\left\{(\mathbf{H}_{1,\ell}, \mathbf{E}_{1,\ell}) \in \bigcap_{\ell'=1}^{\ell}\mathcal{O}_{1,\ell'}(R),\, \mathcal{A}_{\ell-1},\, F_t(\ell) = 1\right\} \notag\\
& + \Pr\left\{(\mathbf{H}_{1,L}, \mathbf{E}_{1,L}) \in \bigcap_{\ell'=1}^{L}\mathcal{O}_{1,\ell'}(R),\, \mathcal{A}_{L-1}\right\}. \tag{4.53}
\end{align}
In the following, we define two important quantities that capture the SNR-exponents of $P^{\ell}_{\mathrm{gout}}(R)$ in (4.48) and $P^{\mathrm{arq}}_{\mathrm{gout}}(R)$, and will be used to state our main theorem.
Definition 4.3 (Generalised Outage Diversity at Round ℓ). The generalised out-
age diversity at ARQ round ℓ is defined as the high-SNR slope of the generalised
joint-outage probability at round ℓ (4.48) curve plotted in log-log scale, i.e.,
\[
d_{\mathrm{icsir}}(\ell) \triangleq \lim_{P\to\infty}\frac{-\log P^{\ell}_{\mathrm{gout}}(R)}{\log P}. \tag{4.54}
\]
Note that the perfect-CSIR outage diversity at round $\ell$, $d_{\mathrm{csir}}(\ell)$, can be obtained from $d_{\mathrm{icsir}}(\ell)$ by letting $d_e \uparrow \infty$.
Definition 4.4 (ARQ Outage Diversity). The ARQ outage diversity is defined
as the high-SNR slope of the ARQ outage probability curve plotted in log-log scale,
i.e.,
\[
d_{\mathrm{arq}} \triangleq \lim_{P\to\infty}\frac{-\log P^{\mathrm{arq}}_{\mathrm{gout}}(R)}{\log P}. \tag{4.55}
\]
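Operationally, SNR-exponents of this kind are estimated as the slope of the outage curve in log-log scale. As a minimal sketch (using the textbook SISO Rayleigh outage probability under perfect CSIR as a stand-in for the generalised outage probability of this chapter), one can read the slope off two high-SNR points:

```python
import numpy as np

# Closed-form SISO Rayleigh outage with perfect CSIR:
# P_out(R) = Pr[log2(1 + |h|^2 P) < R] = 1 - exp(-(2^R - 1) / P).
R = 2.0
P = np.array([1e3, 1e4])                      # two high-SNR points (30 and 40 dB)
p_out = 1.0 - np.exp(-(2.0 ** R - 1.0) / P)

# Diversity = negative slope of log P_out versus log P, cf. (4.54)-(4.55).
d_hat = -(np.log(p_out[1]) - np.log(p_out[0])) / (np.log(P[1]) - np.log(P[0]))
# d_hat is close to 1, the classical SISO Rayleigh outage diversity.
```

The same two-point slope estimate applies to simulated curves such as those in Figure 4.3, provided both points lie in the high-SNR regime where the limit in (4.54) has effectively been reached.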
Our main result on darq is summarised in the following theorem.
Theorem 4.1. Consider an IR-ARQ coding scheme over the MIMO block-fading
channel (4.1) with imperfect CSIR (4.7) and imperfect feedback (4.13). For any
possible power allocation P1, . . . , PL satisfying the power constraint (4.2) and any
input constellations, the ARQ outage diversity is given by
\[
d_{\mathrm{arq}} = \min\bigl(d_{\mathrm{icsir}}(L),\ d_{\mathrm{fb}} + d_{\mathrm{icsir}}(1)\bigr) \tag{4.56}
\]
where $d_{\mathrm{icsir}}(\ell)$, $\ell = 1,\dots,L$, is the generalised outage diversity at round $\ell$ (4.54).
The achievability of darq is shown using random coding.
Proof. See Appendix B.2.
Theorem 4.1 shows the interplay of the ARQ outage diversity $d_{\mathrm{arq}}$, the feedback diversity $d_{\mathrm{fb}}$, the generalised outage diversity $d_{\mathrm{icsir}}(\ell)$ and the ARQ delay limit $L$. Imperfect feedback introduces ACK errors at the transmitter and limits the communication reliability by preventing compensation of outages occurring at any round less than $L$. As observed in Appendix B.2, the outage events occurring at rounds $\ell < L$ contribute to the ARQ outage probability because the crossover probability of the BSC feedback cannot be made arbitrarily small. The slowest-decaying exponent corresponding to these contributing terms is given by $d_{\mathrm{icsir}}(1) + d_{\mathrm{fb}}$. At round $L$, the imperfect feedback is immaterial since no ACK signal is issued and no further retransmission takes place. At this round, the exponent that contributes to the ARQ diversity is given by $d_{\mathrm{icsir}}(L)$. It follows that the ARQ diversity is given by the minimum of $d_{\mathrm{icsir}}(1) + d_{\mathrm{fb}}$ and $d_{\mathrm{icsir}}(L)$.
Remark 4.1 (Imperfect CSIR, Perfect Feedback). At high SNR, perfect feedback is the special case $d_{\mathrm{fb}} \uparrow \infty$. In this case, the ARQ outage diversity becomes
\[
d_{\mathrm{arq}} = d_{\mathrm{icsir}}(L). \tag{4.57}
\]
If the imperfection is due to CSIR only, then the gain due to ARQ is still achievable, i.e., the diversity improves by a factor of $L$. In the perfect feedback case, the outage events occurring at rounds $\ell < L$ are compensated by the retransmission process; thus, those outage events do not count towards the ARQ outage probability, and only the outage events occurring at the maximum round $L$ are not compensated for [25, 26, 29]. It follows from the discussion in Section 4.3 that in this case, the ARQ outage probability is given by the generalised joint-outage probability at round $L$, $P^{\mathrm{arq}}_{\mathrm{gout}}(R) = P^{L}_{\mathrm{gout}}(R)$.
With sufficiently reliable feedback, we can achieve the perfect-feedback diversity $d_{\mathrm{arq}}$ in Remark 4.1. In particular, it follows from Theorem 4.1 that the condition for $d_{\mathrm{fb}}$ to achieve $d_{\mathrm{arq}}$ in (4.57) is given by
\[
d_{\mathrm{fb}} \ge d_{\mathrm{icsir}}(L) - d_{\mathrm{icsir}}(1). \tag{4.58}
\]
Depending on how power is allocated, as L increases, the minimum dfb required
to achieve the perfect-feedback diversity is usually higher. In practice, however,
the feedback diversity dfb depends on the feedback signalling and on the available
resources such as bandwidth and power.
As the feedback diversity vanishes, i.e., $d_{\mathrm{fb}} \downarrow 0$, the diversity improvement with ARQ is limited: the ARQ diversity tends to the single-round diversity $d_{\mathrm{icsir}}(1)$.
In the following, we shall discuss how dicsir(ℓ) evolves with the round index ℓ,
ℓ = 1, . . . , L when uniform power allocation or optimal power control is employed.
This allows for an exact characterisation of Theorem 4.1 when a specific power
allocation scheme is used at the transmitter. For ease of reference, the superscripts $u$ and $p$ will be used to refer to results with uniform power allocation and power control, respectively.
4.4.1 Uniform Power Allocation
Uniform power allocation refers to equal power allocation across rounds, i.e.,
Pℓ = P , ℓ = 1, . . . , L, which clearly satisfies the constraint (4.2). The following
proposition quantifies $d^{u}_{\mathrm{icsir}}(\ell)$, the generalised outage diversity at round $\ell$ with uniform power allocation.
Proposition 4.1 (Uniform Power Allocation). Consider an IR-ARQ coding
scheme over the MIMO block-fading channel (4.1) with imperfect CSIR (4.7)
and imperfect feedback (4.13). With uniform power allocation, the generalised
outage diversity at round ℓ is given by
\[
d^{u}_{\mathrm{icsir}}(\ell) = \min(1, d_e)\times d^{u}_{\mathrm{csir}}(\ell), \qquad \ell = 1,\dots,L \tag{4.59}
\]
where
\[
d^{u}_{\mathrm{csir}}(\ell) = \begin{cases} \bigl(1+\frac{\tau}{2}\bigr)\,\ell B n_t n_r, & \text{for Gaussian constellations}\\[2pt] \bigl(1+\frac{\tau}{2}\bigr)\, d_{\mathrm{SB}}(\ell, R), & \text{for discrete constellations of size } 2^M \end{cases} \tag{4.60}
\]
is the perfect-CSIR outage diversity at round $\ell$ [25, 26, 29], and where
\[
d_{\mathrm{SB}}(\ell, R) \triangleq n_r\left(1 + \left\lfloor \ell B\left(n_t - \frac{R}{\ell M}\right)\right\rfloor\right). \tag{4.61}
\]
Proof. See Appendix B.3.
The term $\min(1, d_e)$ in (4.59) is identical to the finding in Theorem 3.1. In particular, we observe that the relationship between the imperfect- and perfect-CSIR SNR-exponents in Theorem 3.1 continues to hold in ARQ block-fading channels with perfect feedback or with feedback satisfying (4.58), i.e.,
\[
d_{\mathrm{fb}} \ge d^{u}_{\mathrm{icsir}}(L) - d^{u}_{\mathrm{icsir}}(1). \tag{4.62}
\]
The perfect-CSIR ARQ outage diversity can be achieved with $d_e \ge 1$. Remark that with $d_e, d_{\mathrm{fb}} \uparrow \infty$, we recover some results in [29, 32].
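The expressions of Proposition 4.1 are directly computable. The sketch below transcribes (4.59)-(4.61), including the $(1 + \tau/2)$ factor as reconstructed in (4.60); the parameter values in the example — $B = 2$, $n_t = 2$, $n_r = 1$, 4-QAM ($M = 2$), $R = 2$, $\tau = 0$, $d_e = 1$ — are assumptions chosen to mirror the setting of Figure 4.4.

```python
import math

def d_sb(l, R, B, nt, nr, M):
    """Singleton-bound diversity d_SB(l, R) of (4.61)."""
    return nr * (1 + math.floor(l * B * (nt - R / (l * M))))

def d_uicsir(l, de, tau, B, nt, nr, R=None, M=None, gaussian=True):
    """Generalised outage diversity at round l under uniform power
    allocation, following (4.59)-(4.60)."""
    d_csir = (1 + tau / 2) * ((l * B * nt * nr) if gaussian
                              else d_sb(l, R, B, nt, nr, M))
    return min(1, de) * d_csir

# Illustration with assumed parameters (mirroring Figure 4.4a's setting):
vals = [d_uicsir(l, de=1, tau=0.0, B=2, nt=2, nr=1, R=2.0, M=2, gaussian=False)
        for l in (1, 2, 3)]
```

In this example each extra ARQ round adds $B n_r n_t - n_r R/M = 3$ to the Singleton-bound diversity, so the sequence grows linearly with the round index, matching the factor-of-$\ell$ improvement discussed above.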
4.4.2 Power-Controlled ARQ
We now consider a long-term power control satisfying (4.2) that minimises the
ARQ outage probability. Improving the ARQ diversity using power control under
perfect feedback and CSIR has been considered in [26] for Gaussian inputs and
in [25, 32] for discrete inputs. The result in this chapter will generalise these
preceding results when imperfect feedback and CSIR are taken into account.
The transmitter can adapt its transmission power based on the received feed-
back. Power is spent for the current message whenever negative ACK is received
by the transmitter. We thus can write (4.2) as
\[
\frac{P_1}{L} + \frac{1}{L}\sum_{\ell=2}^{L}\Pr\{F_t(\ell-1) = 0\}\, P_\ell \le P. \tag{4.63}
\]
We summarise the improvement made by optimal power control to the generalised outage diversity at round $\ell$ in the following proposition.
Proposition 4.2. Consider an IR-ARQ coding scheme over the MIMO block-
fading channel (4.1) with imperfect CSIR (4.7) and imperfect feedback (4.13).
Using optimal power control satisfying (4.63), the generalised outage diversity at
ARQ round ℓ (4.54) is found recursively as
\[
d^{p}_{\mathrm{icsir}}(\ell) = \Bigl(1+\frac{\tau}{2}\Bigr) B n_t n_r \sum_{\ell'=1}^{\ell}\min\Bigl(1 + \min\bigl(d^{p}_{\mathrm{icsir}}(\ell'-1),\, d(\ell'-1)\bigr),\, d_e\Bigr) \tag{4.64}
\]
for Gaussian constellations and
\begin{align}
d^{p}_{\mathrm{icsir}}(\ell) = {} & \Bigl(1+\frac{\tau}{2}\Bigr)\min\Bigl(1 + \min\bigl(d^{p}_{\mathrm{icsir}}(\ell-1),\, d(\ell-1)\bigr),\, d_e\Bigr)\, d_{\mathrm{SB}}(R) \notag\\
& + \Bigl(1+\frac{\tau}{2}\Bigr) B n_t n_r \sum_{\ell'=1}^{\ell-1}\min\Bigl(1 + \min\bigl(d^{p}_{\mathrm{icsir}}(\ell'-1),\, d(\ell'-1)\bigr),\, d_e\Bigr) \tag{4.65}
\end{align}
for discrete constellations of size $2^M$, where
\begin{align}
d(\ell) &\triangleq \min_{l=1,\dots,\ell}\bigl\{l\, d_{\mathrm{fb}} + d^{p}_{\mathrm{icsir}}(\ell-l)\bigr\} \tag{4.66}\\
d(0) &\triangleq 0 \tag{4.67}\\
d^{p}_{\mathrm{icsir}}(0) &\triangleq 0 \tag{4.68}\\
d_{\mathrm{SB}}(R) &\triangleq d_{\mathrm{SB}}(1, R). \tag{4.69}
\end{align}
Proof. See Appendix B.4.
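The recursion of Proposition 4.2 is easy to evaluate numerically. The sketch below transcribes the Gaussian-constellation case (4.64) together with (4.66)-(4.68) and the combination of Theorem 4.1; the sanity-check parameters (perfect CSIR and feedback, SISO quasi-static channel, $\tau = 0$) are assumptions for illustration, under which the recursion reproduces the classical power-controlled ARQ exponents.

```python
def arq_exponents(L, de, dfb, tau, B, nt, nr):
    """Recursion (4.64) with (4.66)-(4.68) for Gaussian constellations,
    plus the ARQ outage diversity of Theorem 4.1 under power control."""
    dpi = {0: 0.0}   # d^p_icsir(l), cf. (4.68)
    d = {0: 0.0}     # d(l), cf. (4.66)-(4.67)
    for l in range(1, L + 1):
        dpi[l] = (1 + tau / 2) * B * nt * nr * sum(
            min(1 + min(dpi[k - 1], d[k - 1]), de) for k in range(1, l + 1))
        d[l] = min(k * dfb + dpi[l - k] for k in range(1, l + 1))
    darq = min(dpi[L], dfb + dpi[1])   # Theorem 4.1
    return dpi, darq

# Sanity check under perfect CSIR and feedback (de, dfb -> infinity) for a
# SISO quasi-static channel (B = nt = nr = 1, tau = 0): the recursion gives
# the classical power-controlled ARQ exponents 1, 3, 7 at rounds 1, 2, 3.
dpi, darq = arq_exponents(3, de=float("inf"), dfb=float("inf"), tau=0.0,
                          B=1, nt=1, nr=1)
```

Lowering `de` or `dfb` in this sketch caps the per-round terms and the feedback path, respectively, which is exactly how imperfect CSIR and feedback erode the recursion's growth.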
Note that Proposition 4.2 only shows the improvement of $d_{\mathrm{icsir}}(\ell)$ obtained by optimal power control with respect to uniform power allocation. We have from Propositions 4.1 and 4.2 that $d^{u}_{\mathrm{icsir}}(1) = d^{p}_{\mathrm{icsir}}(1)$. Further note that even though we have $d^{p}_{\mathrm{icsir}}(\ell) \ge d^{u}_{\mathrm{icsir}}(\ell)$, Theorem 4.1 suggests that the ARQ outage diversity is strongly affected by the feedback diversity. Having a low feedback diversity implies that the ARQ outage diversity is limited by the generalised outage diversity at the first round plus some extra diversity coming from the feedback. As the feedback diversity tends to zero, the ARQ scheme with power control does not improve the diversity. There are two factors contributing to this phenomenon. Firstly, the power used in the first ARQ round is always limited by the constraint (4.63); hence, the power transmitted in the first ARQ round cannot grow with a higher SNR order than the constraint allows. Secondly, with a low feedback diversity, the decoding of ACK signals is not highly reliable, which limits the overall system reliability.
From Theorem 4.1 and Propositions 4.1 and 4.2, we observe that optimal
power control is superior to uniform power allocation if the feedback diversity dfb
is sufficiently high. The following corollary identifies the condition when optimal
power control yields some diversity gains.
Corollary 4.2. Optimal power control can improve ARQ outage diversity with
respect to uniform power allocation if
\[
d_{\mathrm{fb}} > d^{u}_{\mathrm{icsir}}(L) - d^{u}_{\mathrm{icsir}}(1). \tag{4.70}
\]
Otherwise, optimal power control is as good as uniform power allocation.
The following corollary establishes the criteria for achieving the perfect-feedback and perfect-CSIR ARQ diversity even if the feedback and CSIR are imperfect.
Corollary 4.3. To achieve the perfect-CSIR diversity, the CSIR-error diversity
de has to satisfy
\[
d_e \ge 1 + \min\bigl(d^{p}_{\mathrm{icsir}}(L-1),\, d(L-1)\bigr). \tag{4.71}
\]
To achieve the perfect-feedback diversity, the feedback diversity $d_{\mathrm{fb}}$ has to be such that $d(\ell)$ in (4.66), $\ell = 1,\dots,L-1$, satisfies the following conditions:
\[
d(\ell) \ge d^{p}_{\mathrm{icsir}}(\ell) \tag{4.72}
\]
and
\[
d_{\mathrm{fb}} \ge d^{p*}_{\mathrm{icsir}}(L) - d^{p*}_{\mathrm{icsir}}(1) \tag{4.73}
\]
where $d^{p*}_{\mathrm{icsir}}(\ell)$ is $d^{p}_{\mathrm{icsir}}(\ell)$ with infinite $d_{\mathrm{fb}}$.
The criterion $d_e \ge 1$ in Chapter 3 is not sufficient to achieve the perfect-CSIR diversity when optimal power control is used. We require a much larger $d_e$ to fully exploit the diversity offered by ARQ. This is because power control improves the diversity recursively, and the CSIR quality has to improve as well at every recursion step to adapt to the power level. Hence, the minimum requirement on $d_e$ to achieve the perfect-CSIR diversity is associated with the power spent at the last round $L$. The high $d_e$ requirement can be reduced if one can design a scheme in which the CSIR-error diversity $d_e$ is adapted as a function of the round index $\ell$, i.e., $d_e(\ell)$, with $d_e(\ell)$ increasing in $\ell$. In particular, if $d_e(\ell) \ge 1 + \min\bigl(d^{p}_{\mathrm{icsir}}(\ell-1), d(\ell-1)\bigr)$, then the perfect-CSIR diversity can be achieved. For example, in pilot-aided channel estimation, an adaptive $d_e(\ell)$ achieving the perfect-CSIR diversity can be obtained by allowing the same power for
both pilot and data symbols in each round.$^{4.1}$

$^{4.1}$Note that, as explained in Chapter 3, the pilot power is inversely proportional to $P^{-d_e(\ell)}$.

[Figure 4.3: Simulation results of ARQ outage probability for Gaussian-input transmission over a MIMO Rayleigh block-fading channel with parameters: $B = 2$, $L = 2$, $n_t = 2$, $n_r = 1$, $R = 2$ bits per channel use and BSC feedback with parameter $p_0 = 0.5$.]

Note, however, that having the same power on both data and pilot is not a necessary condition, as we may have
a fixed de > 0 to obtain the perfect-CSIR diversity.
The two conditions of the feedback diversity for each round can be explained
as follows. The first condition (4.72) is to ensure that the exponent of the power
allocated at round ℓ is identical to the exponent of the perfect-feedback power
allocation at round ℓ. The second condition (4.73) is to ensure that when the
feedback diversity is added to the generalised outage diversity at the first round,
the result is greater than the perfect-feedback diversity at round L.
4.4.3 Discussion
We illustrate the preceding theoretical results using numerical examples as fol-
lows.
Figure 4.3 shows $P^{\mathrm{arq}}_{\mathrm{gout}}(R)$ for Gaussian-input transmission with uniform power allocation. In particular, we evaluate the outage curves for different feedback diversities. For $d_{\mathrm{fb}} = 0.25$, the diversity improvement is marginal with respect to the no-ARQ case; the ARQ diversity is given by the sum of $d_{\mathrm{fb}}$ and $d^{u}_{\mathrm{icsir}}(1)$. We also show that with $d_{\mathrm{fb}} = 4$, which satisfies (4.62), the perfect-feedback ARQ diversity can be achieved.
[Figure 4.4: ARQ outage diversity for 4-QAM inputs in a MIMO Rayleigh block-fading channel with parameters: $B = 2$, $L = 3$, $n_t = 2$ and $n_r = 1$. Panel (a) shows uniform power allocation and panel (b) optimal power allocation; UP (OP) indicates results with uniform (optimal) power allocation. The data rate $R$ is in bits per channel use. The CSIR-error diversity $d_e$ is such that (4.71) is satisfied.]
Figure 4.4 illustrates the ARQ diversity for 4-quadrature amplitude modulation (QAM) inputs. Figure 4.4a shows the diversity results with uniform power allocation. With $d_{\mathrm{fb}} \downarrow 0$, the system is unable to utilise the benefits of ARQ in improving diversity. Although the diversity performance improves as $d_{\mathrm{fb}}$ increases, a high feedback diversity is required to fully utilise the ARQ scheme, especially when $L$ is large. In this figure, the perfect-feedback diversity can be achieved with $d_{\mathrm{fb}} = 8$,
which satisfies (4.62). Figure 4.4b illustrates the effect of optimal power control on the ARQ outage diversity. Uniform power allocation with perfect feedback serves as the boundary above which power control improves the diversity. With a low $d_{\mathrm{fb}}$ (e.g., $d_{\mathrm{fb}} = 0$), we are unable to obtain any diversity improvement with power control. As $d_{\mathrm{fb}}$ increases, we observe some diversity gains, starting from the high data-rate region. For a sufficiently high $d_{\mathrm{fb}}$, the diversity curve tends to coincide with the perfect-feedback ARQ diversity at high data rates (see, e.g., $R \ge 2$ for $d_{\mathrm{fb}} = 50$, and $R \ge 1$ for $d_{\mathrm{fb}} = 100$). The figure confirms that power-controlled IR-ARQ promises a significant diversity improvement but requires a highly reliable feedback channel.
From the above results, we learn that practical system design has to account for all possible imperfections in the system and the channel. We illustrate in the following that failure to account for those imperfections may result in a very inefficient communication system.
• Failure to account for imperfect feedback
One purpose of IR-ARQ is to improve the reliability of the system. In
particular, we expect that the high-SNR slope of the error probability will
increase as a function of the ARQ delay limit L. However, if feedback
errors, which induce a low feedback diversity, are not accounted for, then
we may only obtain a marginal improvement of the diversity, which is given
by the first-round diversity plus the feedback diversity. This can be much
smaller than the expected ARQ diversity that improves as a function of L.
• Assuming perfect feedback, failure to account for imperfect CSIR
If the CSIR is imperfect but the transmitter treats it as if it were perfect, then, assuming perfect feedback, the power allocation at round $\ell$ (which is optimal for perfect CSIR) satisfies [32]
\[
P_\ell \doteq P^{1 + d^{p}_{\mathrm{csir}}(\ell-1)}. \tag{4.74}
\]
As proved in Appendix B.4 (with $d_{\mathrm{fb}} \uparrow \infty$), we have that with imperfect CSIR
\[
\Pr\{F_t(\ell-1) = 0\} \doteq P^{-d^{p}_{\mathrm{icsir}}(\ell-1)}. \tag{4.75}
\]
Thus, the average (4.63) becomes
\[
\frac{P_1}{L} + \frac{1}{L}\sum_{\ell=2}^{L}\Pr\{F_t(\ell-1) = 0\}\, P_\ell \;\dot{\ge}\; P^{1} \tag{4.76}
\]
where the last inequality holds because $d^{p}_{\mathrm{icsir}}(\ell-1) \le d^{p}_{\mathrm{csir}}(\ell-1)$ by Proposition 2.4. This implies that by ignoring that the CSIR is imperfect, we may violate the power constraint (4.63).
As also pointed out in Appendix B.4, failure to account for the CSIR-error diversity $d_e$ may lead to a lower achievable SNR-exponent. Consider a discrete alphabet of size $2^M$ and suppose that $d_e = 1$ and $d_{\mathrm{SB}}(R) \ge 1$. Following the steps used in Appendix B.4, with the power allocation (4.74), we have that
\[
d^{p}_{\mathrm{icsir}}(L) = L\, d_{\mathrm{SB}}(R), \tag{4.77}
\]
which is strictly smaller than the ARQ diversity with uniform power allocation, $d^{u}_{\mathrm{icsir}}(L) = d_{\mathrm{SB}}(L, R)$. So, in this case, employing the power adaptation algorithm of [32] (which is optimal in terms of SNR-exponent for perfect CSIR) may yield a lower reliability than employing uniform power allocation. This shows that the optimal power control algorithm under perfect CSIR can be highly suboptimal in practical systems.
As a final note, we recall that under perfect CSIR, the accumulated mutual information [26, 29, 32] always improves with the number of fading blocks, implying that under perfect CSIR, IR-ARQ (full observation-combining) with optimal power control is superior to other hybrid ARQ schemes (partial or no observation-combining) with or without optimal power control. As we pointed out in Section 4.3, under imperfect CSIR, the accumulated GMI does not necessarily improve and can decrease with the number of fading blocks. This suggests that in practical scenarios, IR-ARQ may not necessarily be superior to schemes employing partial observation-combining (such as maximal-ratio combining) or no observation-combining (such as ALOHA). With our decoding model, for example, the general rule for improving reliability is that at round $\ell$, $\ell = 1,\dots,L$, we should selectively combine fading blocks which yield a unique message output $m \in \mathcal{M}$ from the threshold decoder Ψ(·) in Section 4.2.2. Selective observation-combining yields a better reliability when the combined coded blocks improve the accumulated GMI from round to round.
4.5 Conclusion
We have analysed the performance of IR-ARQ over MIMO block-fading channels
with imperfect CSIR and BSC feedback. Specifically, we have characterised the
diversity penalty caused by imperfect CSIR and feedback. Our results suggest
that the feedback SNR must improve with the forward SNR in order for IR-
ARQ to be able to exploit the available diversity; otherwise the IR-ARQ scheme
is unable to improve the diversity. We have derived the conditions for which
the perfect-feedback and perfect-CSIR ARQ diversity may be exploited. We
have learnt that in order to achieve the perfect-feedback diversity, the required
feedback transmission must provide an additional diversity which is increasing
with the maximum number of ARQ rounds. We have identified how power control
can be used to further improve the system performance. We have highlighted the
importance of practical system design to account for all possible imperfections in
the channel. Failure to account for these imperfections may result in overspending
transmitted power and a degradation of the reliability of transmission. At the
end of the chapter, we have pointed out that selective observation-combining in ARQ may be competitive with IR-ARQ in practical systems where the CSIR and feedback are imperfect.
Chapter 5
Mismatched CSI Outage SNR-Exponents of MIMO Block-Fading Channels
In Chapter 2, we have discussed that the error probability in block-fading channels cannot be made arbitrarily smaller than the information-outage probability for sufficiently large block length.
the outage performance is the availability of CSI. In Chapter 3 we have studied
the effect of imperfect CSI at the receiver (CSIR) in block-fading channels. In
this chapter, we extend the analysis in Chapter 3 to include imperfect CSI at
the transmitter (CSIT). We propose a unified framework to study mismatched
CSI at both terminals. In particular, we study the GMI and the generalised outage probability—both introduced in Chapter 2—of nearest neighbour decoding when power adaptation is employed at the transmitter.
This chapter is outlined as follows. Section 5.1 specifically introduces our
system model. Section 5.2 provides some relevant background for studying imperfect CSI in the block-fading channel. Section 5.3 presents our main results on
outage SNR-exponent. Section 5.4 provides discussion and further analysis on
our findings. Section 5.5 summarises the important points of the chapter.
5.1 System Model
The system model is depicted in Figure 5.1. We consider a MIMO block-fading
channel with nt transmit antennas, nr receive antennas and B fading blocks per
codeword. The output of the channel for block b is an nr×J-dimensional random
[Figure 5.1: System model for MIMO block-fading channels with imperfect CSI at both terminals. The message $m$ is encoded into blocks $\mathbf{X}_1,\dots,\mathbf{X}_B$, scaled by $\sqrt{P_b/n_t}$, passed through the fading $\mathbf{H}_b$ with additive noise $\mathbf{Z}_b$, and decoded from $\mathbf{Y}_1,\dots,\mathbf{Y}_B$; the decoder observes the CSIR $\hat{\mathbf{H}}_b$, $b = 1,\dots,B$, and the encoder observes the CSIT $\bar{H}(n(b))$, $b = 1,\dots,B$.]
matrix
\[
\mathbf{Y}_b = \mathbf{H}_b \mathbf{P}_b^{\frac{1}{2}} \mathbf{X}_b + \mathbf{Z}_b, \qquad b = 1,\dots,B \tag{5.1}
\]
where $\mathbf{Z}_b$ is the $n_r\times J$-dimensional random noise matrix and $\mathbf{X}_b \in \mathcal{X}^{n_t\times J}$ is the transmitted signal matrix; $J$ and $\mathcal{X}$ denote the channel block length and the signal constellation, respectively. We assume that the entries of $\mathbf{Z}_b$ are i.i.d. complex-Gaussian random variables with zero mean and unit variance. The $n_r\times n_t$ random matrix $\mathbf{H}_b$ denotes the fading for block $b$ and is assumed to be i.i.d. from block to block. Furthermore, the entries of $\mathbf{H}_b$ are i.i.d. zero-mean unit-variance complex-Gaussian random variables; the magnitude of each entry of $\mathbf{H}_b$ is then Rayleigh distributed.
A codeword representing a message $m \in \{1,\dots,2^{BJR}\}$ to be transmitted is denoted by $\mathbf{X}(m) = [\mathbf{X}_1(m),\dots,\mathbf{X}_B(m)]$, where $R$ is the coding rate. The entries of $\mathbf{X}$ are i.i.d., constructed from a probability distribution over $\mathcal{X}^{n_t}$, where $\mathcal{X} \subseteq \mathbb{C}$ is the input alphabet. Herein we focus on Gaussian inputs and discrete inputs. We further assume that the coding rate $R$ is a fixed positive constant; hence the multiplexing gain (2.57) tends to zero. Finally, we assume that a codeword is normalised such that
\[
\frac{1}{BJ}\mathbb{E}\bigl[\lVert\mathbf{X}\rVert_F^2\bigr] = n_t. \tag{5.2}
\]
We study a case where imperfect CSI is the actual CSI plus AWGN. This
model of noisy CSI comes from exploiting channel reciprocity [70, 71] for which
the channel realisation is identical at both ends but the channel estimation noises
are independent, i.e.,
CSIT:  Ĥ_b = H_b + Ê_b,   (5.3)
CSIR:  H̄_b = H_b + Ē_b   (5.4)

where Ê_b and Ē_b are the CSIT and the CSIR noise random matrices, respectively.
Note that Ê_b is independent of Ē_b. The entries of Ê_b and Ē_b are assumed
to be independent of H_b and i.i.d. complex-Gaussian random variables with
zero mean and variances σ̂_e² = P^(−d̂_e) and σ̄_e² = P^(−d̄_e), respectively, where P is
the average SNR. For a fixed fading matrix realisation, the fading estimates at
both terminals are independent. The imperfect CSIR model is widely used in
pilot-based channel estimation at the receiver, for which the error variance is
proportional to the reciprocal of the pilot SNR [49, 50]. The same estimation
technique can also be performed at the transmitter, i.e., by transmitting pilot
symbols on the reverse link of a time-division duplex (TDD) system. We further
introduce the parameters d̂_e > 0 and d̄_e > 0, denoting the CSIT-error and the
CSIR-error diversities, respectively.
The power matrix P_b ∈ ℂ^(n_t×n_t) is a diagonal matrix. In particular, we use a
scaled identity power matrix as in [45, 46]

P_b(Ĥ^(n(b))) = ( P_b(Ĥ^(n(b))) / n_t ) I_(n_t)   (5.5)

where P_b(·) on the right-hand side denotes the scalar power allocated to block b
and n(b) is the number of fading blocks used for power adaptation. Note that
the power allocation at block b in (5.5) is performed after knowing the noisy CSIT
vector

Ĥ^(n(b)) = [ Ĥ_1, . . . , Ĥ_(n(b)) ].   (5.6)
Depending on n(b) we have the following cases.
• Full-CSIT power allocation if n(b) = B for all b = 1, . . . , B. Imperfect
fading estimates for the whole B blocks in a codeword are available at the
transmitter prior to transmission. This setup is practically relevant for
multi-carrier orthogonal transmission such as OFDM, where the channel is
estimated in the time domain and data transmission occurs in the frequency
domain.
• Causal-CSIT power allocation if n(b) = b− τd with a fixed delay τd > 0 for
any b = 1, . . . , B. This corresponds to CSIT being limited only to the past
imperfect fading estimates due to the delay τd. Causal CSIT is motivated
by block-fading channels with time-domain transmission for which only
past fading estimates may be available at the transmitter.
• Predictive-CSIT power allocation if n(b) = b+τf with a fixed τf ≥ 0 for any
b = 1, . . . , B. Here τf is a prediction parameter indicating the number of
predicted fading blocks. This corresponds to CSIT including past, current
and a number of predicted future fading estimates. This setup is relevant
for instantaneous parallel transmission such as in OFDM where (possibly)
not all fading blocks are used for power allocation. More specifically, for
each fading block b = 1, . . . , B, only (n(b) = b + τf) fading matrices are
used for allocating power at block b. This setup is essential to capture the
case in between causal and full CSIT cases.
For the above power allocation schemes, the corresponding long-term average
power constraint is given by
E[ (1/B) Σ_{b=1}^{B} tr( P_b(Ĥ^(n(b))) ) ] ≤ P.   (5.7)
The power matrix structure in (5.5) implies that for block b, we have equal power
distribution across all transmit antennas.
Nearest neighbour decoding is used to infer the transmitted message. With
imperfect CSIR, the decoder treats the imperfect channel estimate as if it were
perfect. It first computes the following metric for a given Y = [Y_1, . . . , Y_B], CSIR
H̄ = [H̄_1, . . . , H̄_B] and power matrix P = [P_1(Ĥ^(n(1))), . . . , P_B(Ĥ^(n(B)))]

Q(Y, H̄, P, X(m)) ∝ exp( −Σ_{b=1}^{B} ‖ Y_b − H̄_b P_b^(1/2) X_b(m) ‖²_F )   (5.8)

and then outputs

m̂ = arg max_{m ∈ {1, . . . , 2^(BJR)}} Q(Y, H̄, P, X(m)).   (5.9)
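As a concrete illustration of the decoding rule (5.8)-(5.9), the following Python sketch simulates a toy SISO instance (n_t = n_r = 1) with a random BPSK codebook and uniform power allocation. All numerical values (B = 4, J = 8, P = 100, CSIR-error diversity 1) and variable names are illustrative assumptions, not part of the analysis.

```python
import math, random

random.seed(0)

def cgauss(var=1.0):
    """Zero-mean circularly-symmetric complex Gaussian with variance var."""
    s = math.sqrt(var / 2.0)
    return complex(random.gauss(0.0, s), random.gauss(0.0, s))

# Toy SISO instance: B = 4 fading blocks of length J = 8, 16 random BPSK codewords
B, J, num_cw = 4, 8, 16
codebook = [[[random.choice([-1.0, 1.0]) for _ in range(J)] for _ in range(B)]
            for _ in range(num_cw)]

P, d_e_r = 100.0, 1.0                          # SNR and CSIR-error diversity (assumed)
H = [cgauss() for _ in range(B)]               # fading, i.i.d. across blocks
H_bar = [h + cgauss(P ** -d_e_r) for h in H]   # imperfect CSIR, model (5.4)

m_true = 3                                     # transmitted message
Y = [[math.sqrt(P) * H[b] * codebook[m_true][b][j] + cgauss()
      for j in range(J)] for b in range(B)]

# Nearest neighbour rule: maximising (5.8) is minimising the squared distance,
# with the noisy estimate H_bar treated as if it were perfect
def neg_log_metric(m):
    return sum(abs(Y[b][j] - math.sqrt(P) * H_bar[b] * codebook[m][b][j]) ** 2
               for b in range(B) for j in range(J))

m_hat = min(range(num_cw), key=neg_log_metric)
```

At this SNR the estimation error variance is P^(−1) = 0.01, so the mismatched decoder recovers the transmitted message with high probability.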
5.2 Preliminaries
By incorporating power adaptation based on noisy CSIT, we have from the
analysis in Chapter 2 that in the large block length regime, the average error
probability—averaged over the ensemble of random codes with input distribution
PX(x) over Xnt—can be upper-bounded using the generalised outage probability
P_gout(R) defined in (2.32), i.e.,

P_e ≤ Pr{ I^gmi(H, H̄, P) < R }   (5.10)
    ≜ P_gout(R)   (5.11)

where

I^gmi(H, H̄, P) = sup_{s>0} (1/B) Σ_{b=1}^{B} I_b^gmi(P_b, H_b, H̄_b, s)   (5.12)

is the generalised mutual information (GMI) for fading H, receiver estimate H̄
and power matrix P, and where

I_b^gmi(P_b, H_b, H̄_b, s)
  = E[ log_2 ( Q^s(Y, H̄_b, P_b, X) / E[ Q^s(Y, H̄_b, P_b, X′) | Y, H_b, H̄_b, P_b ] ) | H_b, H̄_b, P_b ].   (5.13)
This shows the achievability of the generalised outage probability for block-fading
channels with imperfect CSI. The ensemble average implies that we can find codes
that achieve an error probability as small as P_gout(R). This does not mean
that no code can achieve a smaller error probability than P_gout(R). However,
for i.i.d. codebooks, it has been shown in [72], based on the results in [4, 36, 39],
that P_gout(R) is the smallest error probability for block-fading channels with
mismatched CSIR. Therefore, for i.i.d. codebooks with sufficiently large block
length, it suffices to study P_gout(R) in order to characterise the error events.
We are interested in characterising the behaviour of P_gout(R) at high SNR.
One important figure of merit is the generalised outage diversity defined in (2.67)

d ≜ lim_{P→∞} −log P_gout(R) / log P.   (5.14)

Throughout this chapter, the terms generalised outage diversity and outage
SNR-exponent are used interchangeably; both refer to (5.14).
In Chapter 3, we have shown that with uniform power allocation, the imperfect-CSIR
outage SNR-exponent d^u_icsir is a function of the perfect-CSIR outage SNR-exponent
d^u_csir and the CSIR-error diversity d̄_e, i.e.,

d^u_icsir = min(1, d̄_e) × d^u_csir.   (5.15)

Here the superscript u denotes uniform power allocation. From [29, 32], we
have that

d^u_csir = B n_t n_r,   for Gaussian inputs
d^u_csir = d_SB(R) ≜ n_r ( 1 + ⌊ B (n_t − R/M) ⌋ ),   for discrete inputs   (5.16)

where d_SB(R) is the Singleton bound and M ≜ log_2 |X|. This result
implies that if the variance of the CSIR error is less than or equal to the inverse
of the SNR, the perfect-CSIR diversity is achievable. Otherwise, the imperfect-CSIR
diversity is smaller than the perfect-CSIR diversity.
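The expressions in (5.15)-(5.16) are straightforward to evaluate. The following Python helper is an illustrative sketch (the function and argument names are ours, not from the text):

```python
import math

def d_u_csir(B, nt, nr, R=None, M=None, gaussian=True):
    """Perfect-CSIR outage SNR-exponent (5.16) under uniform power allocation."""
    if gaussian:
        return B * nt * nr
    # Singleton bound d_SB(R) for a discrete constellation of size 2^M at rate R
    return nr * (1 + math.floor(B * (nt - R / M)))

def d_u_icsir(d_csir, d_e_r):
    """Imperfect-CSIR exponent (5.15): min(1, CSIR-error diversity) times (5.16)."""
    return min(1.0, d_e_r) * d_csir
```

For example, B = 4, n_t = n_r = 2 gives d^u_csir = 16 for Gaussian inputs, while for 16-QAM (M = 4) at R = 2 bits per channel use the Singleton bound gives d_SB(R) = 2(1 + ⌊4(2 − 1/2)⌋) = 14.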
If CSIT is available, then the transmitter can adapt its transmission power to
minimise the generalised outage probability. The idea is that in a very bad chan-
nel realisation, power can be saved and used when channel conditions improve.
References [27,73] showed that if perfect CSI is available at both terminals, then
zero outage is possible, implying that the delay-limited capacity [74] is positive.
References [45,46] extended the results to perfect CSIR and imperfect full CSIT
setup. In this case, the SNR-exponent is given by

d^f_icsit = d^u_csir (1 + d^u_csir d̂_e)   (5.17)
where the superscript f denotes full-CSIT power control. Assuming perfect CSIR,
reference [75] considered cases where imperfect causal or predictive CSIT is
available. In those cases, the SNR-exponent is given as a function of the CSIT-error
diversity d̂_e and the CSIT delay τ_d or the CSIT prediction parameter τ_f.
In practical scenarios, both CSIR and CSIT will be imperfect. It is there-
fore of practical interest to study mismatched CSI at both ends under a unified
framework. In this work, we find the SNR-exponents with imperfect CSI at both
ends using nearest neighbour decoding and power allocation. In particular, the
power allocation algorithm is given by the solution to the following optimisation
problem
minimise   P_gout(R)
subject to E[ (1/B) Σ_{b=1}^{B} tr( P_b(Ĥ^(n(b))) ) ] ≤ P,
           diag( P_b(Ĥ^(n(b))) ) ⪰ 0,   b = 1, . . . , B.   (5.18)
Solving the above optimisation problem can be difficult in general. Given our
CSIT model, the minimum-outage power allocation is difficult to find since
P_gout(R) depends on both the actual channel and the channel estimate. Nevertheless,
we will see that despite this difficulty, studying the behaviour of the optimal
[Figure: three regions in the (d̂_e, d̄_e) plane, separated by the line
d̄_e = 1 + d^u_csir d̂_e: mismatched CSIR dominates (d^f_icsi = d̄_e d^u_csir),
power control is not effective (d^f_icsi = d^u_csir), and mismatched CSIT
dominates (d^f_icsi = (1 + d^u_csir d̂_e) d^u_csir).]

Figure 5.2: Interplay among the CSIT- and CSIR-error diversities and the outage
SNR-exponent with full-CSIT power allocation.
solution at high SNR is possible. We will use the technique in [46] to derive
the asymptotic power allocation that results in no loss in terms of outage SNR-
exponent.
5.3 Outage SNR-Exponents
The solution to the power allocation in (5.18) depends on whether the CSIT is
full, causal or predictive. Therefore, we will separately study the SNR-exponent
for each type of CSIT.
5.3.1 Full-CSIT Power Allocation
Theorem 5.1 (Full CSIT). For full CSIT (where n(b) = B in (5.6)), the outage
SNR-exponent d^f_icsi of MIMO block-fading channels with n_t transmit antennas,
n_r receive antennas, B fading blocks, CSIT-error diversity d̂_e and CSIR-error
diversity d̄_e for Gaussian and discrete constellations is

d^f_icsi = d^u_csir d̄_e,                  if d̄_e ≤ 1 + d^u_csir d̂_e
d^f_icsi = d^u_csir (1 + d^u_csir d̂_e),   if d̄_e > 1 + d^u_csir d̂_e   (5.19)
where ducsir is given in (5.16).
Proof. See Appendix C.3.
The results in Theorem 5.1 highlight the trade-off between the resources spent
on estimating the channel at the two terminals and the effectiveness of power
control given a noisy CSIR. We illustrate this in Figure 5.2. Power control is
effective whenever the CSIR noise variance is much smaller than the CSIT noise
variance, i.e., d̄_e > 1 + d^u_csir d̂_e. For example, with Gaussian inputs d̄_e must
exceed d̂_e approximately by a factor of B n_t n_r. The condition d̄_e > 1 highlights
the potential improvement made by power control over uniform power allocation.
Outage events are dominated by the mismatched CSIR if the CSIR noise is
strong, i.e., for d̄_e ≤ 1 + d^u_csir d̂_e. Otherwise, outage events are dominated by
the mismatched CSIT.
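The case distinction of Theorem 5.1 can be captured in a few lines of Python. The sketch below simply evaluates (5.19); the argument names (d_e_r for the CSIR- and d_e_t for the CSIT-error diversity) are our own:

```python
def d_f_icsi(d_csir, d_e_r, d_e_t):
    """Full-CSIT outage SNR-exponent of Theorem 5.1, eq. (5.19).

    d_csir: perfect-CSIR exponent (5.16); d_e_r: CSIR-error diversity;
    d_e_t: CSIT-error diversity.
    """
    if d_e_r <= 1 + d_csir * d_e_t:
        return d_csir * d_e_r                  # mismatched CSIR dominates
    return d_csir * (1 + d_csir * d_e_t)       # mismatched CSIT dominates
```

Letting the CSIR-error diversity grow recovers the perfect-CSIR result (5.17), and setting the CSIT-error diversity to zero recovers the uniform-power result (5.15).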
Remark 5.1. CSIR has a higher impact on the generalised outage diversity than
CSIT. With high quality CSIR, poor CSIT has the same effect as having no power
control. On the other hand, with high quality CSIT, poor CSIR results in diversity
approaching zero.
The result in Theorem 5.1 is consistent with previous results. In particular,
we recover the mismatched-CSIT perfect-CSIR outage SNR-exponent in [45, 46]
by letting d̄_e ↑ ∞ (perfect CSIR), and the no-CSIT mismatched-CSIR outage
SNR-exponent in Chapter 3 by letting d̂_e ↓ 0.
5.3.2 Causal-CSIT Power Allocation
Theorem 5.2 (Causal CSIT). Consider a MIMO block-fading channel with n_t
transmit antennas, n_r receive antennas, B fading blocks, CSIT-error diversity
d̂_e and CSIR-error diversity d̄_e. For causal CSIT with delay τ_d > 0 (where
n(b) = b − τ_d in (5.6)), the outage SNR-exponent d^c_icsi for Gaussian inputs is
given by

d^c_icsi = n_t n_r Σ_{b=1}^{B} υ_b   (5.20)

where

υ_b ≜ min(d̄_e, 1),   for b = 1, . . . , τ_d
υ_b ≜ min( d̄_e, 1 + n_t n_r Σ_{b′=1}^{b−τ_d} min(υ_{b′}, d̂_e) ),   for b = τ_d + 1, . . . , B.   (5.21)
On the other hand, the outage SNR-exponent d^c_icsi for discrete inputs is given by

d^c_icsi = n_t n_r Σ_{b=1}^{b̄} ϑ_b + n_r (d‡ − b̄ n_t) ϑ_{b̄+1}   (5.22)

where

d‡ ≜ B n_t − ⌈ BR/M ⌉ + 1   (5.23)
b̄ ≜ max_{b : b n_t ≤ d‡} b   (5.24)

ϑ_b ≜ min(d̄_e, 1),   for b = 1, . . . , min(τ_d, b̄ + 1)
ϑ_b ≜ min( d̄_e, 1 + n_t n_r Σ_{b′=1}^{b−τ_d} min(ϑ_{b′}, d̂_e) ),   for b = min(τ_d, b̄ + 1) + 1, . . . , b̄ + 1.   (5.25)
Proof. See Appendix C.4.
There are two cases for which causal-CSIT power allocation cannot increase
the diversity. The first case is when the CSIR estimation is too unreliable, i.e.,
d̄_e ≤ 1. The second case is when the delay in obtaining CSIT is long. For
instance, power control cannot improve the diversity if the delay in obtaining
CSIT is greater than B (Gaussian inputs) or b̄ (discrete inputs). Thus, the
CSIT-delay requirement for discrete inputs is stricter than that for Gaussian
inputs. This second case is identical to the result in [75]. Indeed, the result
in [75] is an instance of Theorem 5.2 with infinite d̄_e.
Note that with perfect CSI at both transmitter and receiver (d̂_e ↑ ∞ and
d̄_e ↑ ∞), the outage diversity with causal CSIT is always finite. This means
that at high SNR, the slope of the generalised outage probability with respect
to log P is finite, and zero outage is not possible at finite SNR. It thus follows
that the delay-limited capacity [74] is zero.
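The recursion (5.21) is easily evaluated numerically. The sketch below (Gaussian inputs only; function and variable names are ours) shows how the exponent builds up once the delay τ_d has elapsed:

```python
def d_c_icsi_gaussian(B, nt, nr, tau_d, d_e_r, d_e_t):
    """Causal-CSIT outage SNR-exponent (5.20)-(5.21) for Gaussian inputs.

    d_e_r: CSIR-error diversity; d_e_t: CSIT-error diversity; tau_d: CSIT delay.
    """
    v = []                                   # the sequence upsilon_b of (5.21)
    for b in range(1, B + 1):
        if b <= tau_d:
            v.append(min(d_e_r, 1.0))        # no CSIT available for these blocks
        else:
            v.append(min(d_e_r,
                         1.0 + nt * nr * sum(min(u, d_e_t) for u in v[:b - tau_d])))
    return nt * nr * sum(v)
```

With B = 4, n_t = n_r = 1, τ_d = 1, accurate CSIR and d̂_e = 1, the exponents υ_b grow as 1, 2, 3, 4; a delay τ_d ≥ B or an unreliable CSIR (d̄_e ≤ 1) collapses the result to the uniform-allocation exponent (5.15).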
5.3.3 Predictive-CSIT Power Allocation
Theorem 5.3 (Predictive CSIT). Consider a MIMO block-fading channel with
n_t transmit antennas, n_r receive antennas, B fading blocks, CSIT-error diversity
d̂_e and CSIR-error diversity d̄_e. For predictive CSIT (where n(b) = b + τ_f in
(5.6), τ_f ≥ 0), the outage SNR-exponent d^p_icsi for Gaussian inputs is given by

d^p_icsi = n_t n_r Σ_{b=1}^{B} min( d̄_e, 1 + n_t n_r min(B, b + τ_f) d̂_e ).   (5.26)

On the other hand, the outage SNR-exponent d^p_icsi for discrete inputs is given by

d^p_icsi = n_t n_r Σ_{b=1}^{b̄} min(η_b, d̄_e) + n_r (d‡ − b̄ n_t) min(η_{b̄+1}, d̄_e)   (5.27)

where d‡ and b̄ are defined in (5.23) and (5.24), respectively, and

η_b ≜ 1 + n_t n_r (b + τ_f) d̂_e,   if b + τ_f ≤ b̄
η_b ≜ 1 + n_r d‡ d̂_e,              if b + τ_f > b̄.   (5.28)
Proof. See Appendix C.5.
In Theorem 5.3, we observe how predictive-CSIT power control improves
the outage diversity via a recursion in the power adaptation. Note that with

d̄_e ≥ 1 + d^u_csir d̂_e   (5.29)

we essentially obtain the same diversity as in the noiseless CSIR case, which is
given in [75]. If the prediction parameter τ_f satisfies

τ_f ≥ B − 1,   for Gaussian inputs
τ_f ≥ b̄,       for discrete inputs   (5.30)

then the diversity obtained with predictive CSIT is the same as that with full
CSIT.
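Similarly, (5.26) for Gaussian inputs is a one-line sum; the sketch below (names ours) also illustrates that τ_f = B − 1 reproduces the full-CSIT exponent of Theorem 5.1:

```python
def d_p_icsi_gaussian(B, nt, nr, tau_f, d_e_r, d_e_t):
    """Predictive-CSIT outage SNR-exponent (5.26) for Gaussian inputs."""
    return nt * nr * sum(
        min(d_e_r, 1.0 + nt * nr * min(B, b + tau_f) * d_e_t)
        for b in range(1, B + 1))

# With tau_f = B - 1, every block uses n(b) = B estimates, so the sum equals
# min(d_u * d_e_r, d_u * (1 + d_u * d_e_t)) with d_u = B * nt * nr, i.e. (5.19).
```

For instance, with B = 4, n_t = n_r = 1, d̂_e = 1 and accurate CSIR, τ_f = 3 gives 20 (the full-CSIT value), whereas τ_f = 0 gives only 14.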
For d̄_e ≤ 1, we note that power control with full, causal or predictive CSIT
cannot improve the outage diversity with respect to that achieved with uniform
power allocation. This corresponds to the case where the CSIR is too unreliable.
5.4 Discussion
5.4.1 Pilot-Assisted Channel Estimation
The CSI models in (5.3) and (5.4) are an abstraction of pilot-based channel
estimation, for which two-way pilot transmissions are used to estimate the fading
coefficients at both terminals. In particular, the estimation takes advantage
of the slow-fading process and the channel reciprocity in TDD systems. The
channel remains constant for block b and thus the two-way pilot transmissions
for estimating H_b occur within block b. For orthogonal pilot design [76, 77],
where orthogonal vectors are used to estimate the n_r · n_t entries of the fading
matrix H_b for b = 1, . . . , B, these transmissions require (n_t + n_r) channel uses
and are completed prior to transmitting the data for block b. Since the transmitter
has access to noisy fading coefficients only up to block b, for block-fading channels
with time-domain data transmission only causal-CSIT power allocation with delay
τ_d > 0 or predictive CSIT with τ_f = 0 is realistic. On the other hand, for block-fading
channels with time-domain channel estimation and frequency-domain data
transmission (such as multi-carrier transmission and OFDM), the full-CSIT
assumption is of practical relevance.
Suppose that orthogonal pilots [76, 77] are employed and that for each training
phase only one antenna is active at a time. The training requires n_t time instants
to transmit pilot symbols from the transmitter and n_r time instants to transmit
pilot symbols from the receiver. Assume further that the pilot power at the
transmitter is P^(d̄_e) and at the receiver is P^(d̂_e). The received pilot symbols
at both ends, when transmit antenna t and receive antenna r are active, are
then given by

Receiver:     Ȳ_{b,r,t} = √(P^(d̄_e)) H_{b,r,t} + Z̄_{b,r,t},   (5.31)
Transmitter:  Ŷ_{b,r,t} = √(P^(d̂_e)) H_{b,r,t} + Ẑ_{b,r,t}   (5.32)

where Z̄_{b,r,t} and Ẑ_{b,r,t} are zero-mean unit-variance complex-Gaussian noises at
the receiver and at the transmitter, respectively; Z̄_{b,r,t} and Ẑ_{b,r,t} are independent.
Dividing (5.31) by √(P^(d̄_e)) and dividing (5.32) by √(P^(d̂_e)) leads to the models
in (5.4) and (5.3), respectively. This pilot-based estimation corresponds to
maximum-likelihood (ML) channel estimation [49]. Note that in the limit J → ∞,
the pilot fraction (n_t + n_r)/J vanishes and hence the pilot insertion does not
affect the SNR-exponents in Theorems 5.1, 5.2 and 5.3.
Another common channel estimation scheme is linear minimum mean-squared
error (LMMSE) estimation. Consider the estimation at the receiver and let the
CSIR be

H̄_{b,r,t} = a_{b,r,t} Ȳ_{b,r,t}   (5.33)

where a_{b,r,t} is chosen such that

σ̄_e² = E[ |Ē_{b,r,t}|² ] = E[ |H_{b,r,t} − H̄_{b,r,t}|² ]   (5.34)

is minimised. It is not difficult to show that the a_{b,r,t} that minimises the above
expectation is given by

a*_{b,r,t} = √(P^(d̄_e)) / (P^(d̄_e) + 1).   (5.35)

Furthermore, from the orthogonality principle [78], we have

H_{b,r,t} = H̄_{b,r,t} + Ē_{b,r,t}   (5.36)

with H̄_{b,r,t} and Ē_{b,r,t} being independent of each other. Moreover, the variances
of H̄_{b,r,t} and Ē_{b,r,t} are given by

σ̄_h² = 1 − 1/(P^(d̄_e) + 1),   (5.37)
σ̄_e² = 1 − σ̄_h² = 1/(P^(d̄_e) + 1).   (5.38)

Note that for d̄_e > 0 we have the dot equality

σ̄_e² = 1/(P^(d̄_e) + 1) ≐ P^(−d̄_e).   (5.39)
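The LMMSE coefficient (5.35) and error variance (5.38) can be verified by a short Monte-Carlo experiment. A minimal pure-Python sketch with illustrative parameters (P = 100, d̄_e = 1):

```python
import math, random

random.seed(1)

P, d_e_r = 100.0, 1.0                         # SNR and CSIR pilot exponent (assumed)
pilot_snr = P ** d_e_r
a_star = math.sqrt(pilot_snr) / (pilot_snr + 1.0)   # LMMSE coefficient (5.35)

trials, err2 = 100000, 0.0
for _ in range(trials):
    h = complex(random.gauss(0, math.sqrt(0.5)), random.gauss(0, math.sqrt(0.5)))
    z = complex(random.gauss(0, math.sqrt(0.5)), random.gauss(0, math.sqrt(0.5)))
    y = math.sqrt(pilot_snr) * h + z          # received pilot, cf. (5.31)
    h_bar = a_star * y                        # LMMSE estimate (5.33)
    err2 += abs(h - h_bar) ** 2

sigma2_e_mc = err2 / trials                   # empirical error variance
sigma2_e_th = 1.0 / (pilot_snr + 1.0)         # closed form (5.38)
```

The empirical variance matches 1/(P^(d̄_e) + 1) closely, consistent with the dot equality (5.39).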
The same reasoning applies to the estimation at the transmitter, where we have

H_{b,r,t} = Ĥ_{b,r,t} + Ê_{b,r,t}   (5.40)

with Ĥ_{b,r,t} and Ê_{b,r,t} being independent of each other. Furthermore, the
variances of Ĥ_{b,r,t} and Ê_{b,r,t} are given by

σ̂_h² = 1 − 1/(P^(d̂_e) + 1),   (5.41)
σ̂_e² = 1 − σ̂_h² = 1/(P^(d̂_e) + 1) ≐ P^(−d̂_e)   (5.42)

where the dot equality is valid for d̂_e > 0. Note that for a given H_{b,r,t} = h_{b,r,t},
Ȳ_{b,r,t} and Ŷ_{b,r,t} are independent; it then follows that for a fixed fading realisation,
the channel estimates at both ends are independent.
The above LMMSE estimation is equivalent to the ML estimation in the
sense that the CSIR-error and the CSIT-error diversities are given by d̄_e and
d̂_e, respectively. The difference lies in which random variables are independent
of which. Consider the estimation at the receiver. For ML estimation, we see
from (5.4) that H_{b,r,t} and Ē_{b,r,t} are independent. On the other hand, for
LMMSE estimation in (5.36), the orthogonality principle [78] implies that
H̄_{b,r,t} and Ē_{b,r,t} are independent. Furthermore, the variances of Ē_{b,r,t} for ML
and LMMSE estimation are not exactly equal but only asymptotically equal, i.e.,
they have the same dot equality. Despite this difference, we prove in Appendix
C.6 that using LMMSE estimation we still obtain the same SNR-exponent as
that obtained using ML estimation. Thus, it is not surprising that Theorem 5.1
generalises the results in [45] for transmission rates with zero multiplexing
gain and the result in [46], as pointed out in Section 5.3. The only relevant
parameters are the CSIR-error and CSIT-error diversities d̄_e and d̂_e.
5.4.2 Mean-Feedback CSIT Model
The results in Theorems 5.1, 5.2 and 5.3 correspond to a case where the transmit-
ter establishes its own independent channel estimator. This is a typical model
for a two-way training for which both transmitter and receiver transmit pilot
symbols. From the received pilot symbols, each communicating party employs
its channel estimator to obtain accurate fading estimates.
One might also consider a CSIT model for which the transmitter obtains a
noisy version of the CSIR via a dedicated feedback channel. This is commonly
referred to as the mean-feedback model [79]. For this model, we have

CSIR:  H̄_{b,r,t} = H_{b,r,t} + Ē_{b,r,t},   (5.43)
CSIT:  Ĥ_{b,r,t} = H̄_{b,r,t} + Ê_{b,r,t}.   (5.44)

The CSIT equation can be written as

Ĥ_{b,r,t} = H̄_{b,r,t} + Ê_{b,r,t} = H_{b,r,t} + Ē_{b,r,t} + Ê_{b,r,t}.   (5.45)

We can see from the last equation that the effective CSIT noise with respect to
the actual channel H_{b,r,t} is given by

Ē_{b,r,t} + Ê_{b,r,t}   (5.46)

which has zero mean and variance

P^(−d̄_e) + P^(−d̂_e).   (5.47)

The effective CSIT-error diversity is then obtained from the exponent of
P^(−d̄_e) + P^(−d̂_e), which is given by min(d̄_e, d̂_e). Thus, for the mean-feedback
model, the outage SNR-exponents can be obtained by setting d̂_e = min(d̄_e, d̂_e)
in Theorems 5.1, 5.2 and 5.3.
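The claim that the SNR-exponent of P^(−d̄_e) + P^(−d̂_e) equals min(d̄_e, d̂_e) is immediate to check numerically; the small helper below (names ours) evaluates the exponent at a large but finite SNR:

```python
import math

def effective_exponent(d1, d2, P=1e8):
    """SNR-exponent of P^-d1 + P^-d2, which approaches min(d1, d2) as P grows."""
    return -math.log(P ** -d1 + P ** -d2) / math.log(P)
```

At P = 10^8, the computed exponent agrees with min(d1, d2) to within a fraction of a percent whenever the two exponents are distinct.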
5.4.3 Comments on Achievable Rates
The technique used to derive our main results is based on the GMI, which is
an achievable coding rate when a fixed decoding rule—which is not necessarily
matched to the channel—is employed [13]. Transmission rates below the GMI
have a vanishing error probability as the block length tends to infinity. Further-
more, the GMI is the largest reliable transmission rate when the encoder is forced
to use i.i.d. inputs [4, 13, 36, 39]. Therefore, the outage SNR-exponents derived
in Theorems 5.1, 5.2 and 5.3 are the optimal SNR-exponents when using i.i.d.
codebooks (Gaussian or discrete) and a nearest neighbour decoder.
Note, however, that one may obtain a larger achievable rate if the assumption of
i.i.d. codebooks is lifted. Indeed, i.i.d. inputs may not be optimal, and
it has been shown in [13, 36, 38] that using inputs with a well-chosen cost
constraint yields a lower bound to the mismatched capacity (the LM bound)
that is larger than the GMI. The main difficulty in using the LM bound is the
optimisation over all possible cost functions, which in general cannot be solved
analytically. For a given codebook, a larger achievable rate is also possible by
optimising over all possible decoders.
There are several works that study a similar problem to ours, but use different
characterisations of the reliable transmission rate. In particular, we refer to the
works in [80–82] for comparison. For simplicity and for the sake of comparison,
we consider a SISO quasi-static channel (B = 1). References [80–82] assumed
Gaussian inputs and LMMSE channel estimation at the receiver; no assumption
on the decoder structure is made. Thus, from (5.1) and (5.36) we can write the
input-output relationship as

Y = √P H̄ x + √P Ē x + Z   (5.48)

where Y and Z are the random received vector and the noise vector, respectively,
taking values in ℂ^J; H̄ and Ē are the scalar fading estimate and fading
estimation error, respectively; x is the channel input vector; P is the transmission
power. Note that since every realisation of H̄ is known at the receiver,
the argument in [80–82] is that one can treat the term √P Ē x as an additional
noise term. Furthermore, it was argued in [80, 82] that by modelling the signal-dependent
noise

Z′ = √P Ē x + Z   (5.49)

as a zero-mean Gaussian noise with i.i.d. entries independent of x and each
having variance

1 + P |Ē|²,   (5.50)

one can obtain a lower bound to the instantaneous mutual information as [82]

I(H̄, Ē, P) = log_2( 1 + P|H̄|² / (1 + P|Ē|²) ).   (5.51)
Note that the above expression leads to an outage SNR-exponent that is obtained
by solving

Pr{ I(H̄, Ē, P) < R } = Pr{ log_2( 1 + P|H̄|² / (1 + P|Ē|²) ) < R }.   (5.52)

Interestingly, following the steps used in Appendix A.3 for B = 1, the GMI can
be lower-bounded as

I^gmi(H, H̄, P) ≥ log_2( 1 + P|H̄|² / (1 + P|H − H̄|²) ) − 1.   (5.53)
In the high-SNR regime, the difference between (5.51) and (5.53) does not affect
the outage SNR-exponents. Thus, it is not surprising that for no CSIT and
transmitter pilot power P^(d̄_e), our results are identical to the results in [80, 82].
Even though evaluating the bound (5.51) seems much easier than evaluating
the GMI, there are several drawbacks to the lower-bounding technique in (5.51).
• To the best of our knowledge, there is no explicit proof of the achievability
of I(H̄, Ē, P). The lower-bound reasoning comes from the work by Zheng
and Tse [83], where the LMMSE channel-estimate model is used at the
receiver to derive a lower bound to the blockwise-ergodic capacity. In [83],
the block length J is in general a finite quantity, and the capacity expression
is obtained via coding over infinitely many blocks. It then follows that H̄
and Ē have uncorrelated statistics over infinitely many blocks. The lower-bound
proof then follows from [84, Sec. III], [85, App. I]. The lower bound
on the blockwise-ergodic capacity is obtained after averaging over all states
of the fading and its corresponding estimate.

It is not yet clear whether the technique in [83] can be applied directly to
non-ergodic fading channels. As opposed to coding over infinitely many
blocks, in a quasi-static channel coding is performed over only one block and
[Figure: densities of I^gmi(H, H̄, P), I(H) and I(H̄, Ē, P) over information
rates (bits per channel use), showing regions where I(H̄, Ē, P) > I(H) and
I(H̄, Ē, P) < I(H).]

Figure 5.3: Comparison of the densities of the GMI and the lower bound (5.51)
with fading realisation H = 1, transmission power P = 1 (unit power) and
CSIR-error variance σ̄_e² = 0.1.
the block length J is taken to infinity to recover the information outage
probability [20, 23]. Note that during a single block, both the fading and the
fading estimate are constant; it follows that the estimation error is also constant
for that block. Hence, (5.51) may not be an achievable lower bound to the
instantaneous mutual information for the block of interest since both H̄
and Ē are constant within a single block.
The above explanation means that there is no guarantee that transmitting
a codeword at rate R = I(H̄, Ē, P) − ε for any ε > 0 yields a vanishing error
probability as the block length J tends to infinity. This is in contrast to
I^gmi(H, H̄, P), for which achievability has been proven in [35, 37, 38].
• For some combinations of H̄ and Ē, we may find that I(H̄, Ē, P) is larger
than the perfect-CSIR mutual information I(H)

I(H) = log_2( 1 + P|H|² ).   (5.54)

We illustrate this in Figure 5.3, where we fix H = 1 and P = 1, and we use
the LMMSE estimation model such that

H = H̄ + Ē.   (5.55)
For fixed H = 1 and P = 1, the probability that I(H̄, Ē, P) is greater
than I(H) is non-zero, which implies that the lower bound (5.51) may
violate the data-processing inequality. This result indirectly disproves the
achievability of I(H̄, Ē, P). Furthermore, this behaviour is in contrast to
I^gmi(H, H̄, P), which is always smaller than I(H), as shown in [35, 37, 38].
• It is not clear whether modelling Z ′ in (5.49) as a signal-independent Gaus-
sian noise would still result in the correct exponent for discrete inputs.
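The data-processing violation illustrated by Figure 5.3 can be reproduced with a short Monte-Carlo experiment. A minimal sketch (variable names ours), using H = 1, P = 1 and error variance 0.1 as in the figure; for this symmetric error model, the lower bound (5.51) exceeds the perfect-CSIR rate for roughly half of the error realisations:

```python
import math, random

random.seed(2)

H, P, var_e = 1.0, 1.0, 0.1       # fixed fading, unit power, CSIR-error variance

I_perfect = math.log2(1 + P * abs(H) ** 2)         # perfect-CSIR rate I(H), (5.54)

trials, exceed = 100000, 0
for _ in range(trials):
    e = complex(random.gauss(0, math.sqrt(var_e / 2)),
                random.gauss(0, math.sqrt(var_e / 2)))
    h_bar = H - e                                  # estimate so that H = h_bar + e
    I_lb = math.log2(1 + P * abs(h_bar) ** 2 / (1 + P * abs(e) ** 2))   # (5.51)
    exceed += I_lb > I_perfect

frac = exceed / trials   # strictly positive: (5.51) can exceed I(H)
```

Indeed, I_lb > I_perfect here reduces to Re(Ē) < 0, an event of probability 1/2, so the estimated fraction concentrates around 0.5.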
5.4.4 Comments on Continuous Input Distributions
We have considered Gaussian inputs as a possible continuous input distribution
for our SNR-exponent analysis. The main reason is that Gaussian input is the
optimal input distribution for the channel in (5.1) when perfect CSI is available
at the receiver. However, when only noisy CSIR is known, Gaussian inputs may
no longer be optimal.
Using expression (5.13), we can show that the SNR-exponent for Gaussian
inputs is a lower bound to the SNR-exponent for some other continuous
distributions satisfying certain conditions. We first assume that the input vector
is i.i.d. over all transmit antennas and all channel uses and is such that
E[|X|²] = 1.
The expression in (5.13) (in natural-base logarithm) can be decomposed into two
terms as follows

I_b^gmi(P_b, H_b, H̄_b, s) = E[ log Q^s(Y, H̄_b, P_b, X) | H_b, H̄_b, P_b ]
  − E[ log E[ Q^s(Y, H̄_b, P_b, X′) | Y, H_b, H̄_b, P_b ] | H_b, H̄_b, P_b ].   (5.56)
Evaluating the first term yields

E[ log Q^s(Y, H̄_b, P_b, X) | H_b, H̄_b, P_b ]
  = −s E[ ‖ (H_b − H̄_b) P_b^(1/2) X + Z ‖² | H_b, H̄_b, P_b ]   (5.57)
  = −s ( n_r + E[ ‖ Ē_b P_b^(1/2) X ‖² | H_b, H̄_b, P_b ] )   (5.58)
  ≥ −s ( n_r + E[ ‖ Ē_b P_b^(1/2) ‖²_F ‖X‖² | H_b, H̄_b, P_b ] )   (5.59)
  = −s ( n_r + n_t ‖ Ē_b P_b^(1/2) ‖²_F )   (5.60)
where the inequality is due to the Frobenius-norm property ‖Ax‖² ≤ ‖A‖²_F ‖x‖²
[86, Sec. 5.6]. The first expectation in the second term can be evaluated as follows

E[ Q^s(Y, H̄_b, P_b, X′) | Y, H_b, H̄_b, P_b ] = ∫_{x′} P_X(x′) e^( −s ‖Y − H̄_b P_b^(1/2) x′‖² ) dx′.   (5.61)
Then, if the input pdf can be bounded as

P_X(x) ≤ (G/π^(n_t)) e^(−‖x‖²),   x ∈ ℂ^(n_t)   (5.62)

for some constant G > 0 independent of the SNR, then the above expectation
can be bounded as

∫_{x′} P_X(x′) e^( −s ‖Y − H̄_b P_b^(1/2) x′‖² ) dx′
  ≤ G ∫_{x′} (1/π^(n_t)) e^(−‖x′‖²) e^( −s ‖Y − H̄_b P_b^(1/2) x′‖² ) dx′   (5.63)
  = ( G / det( I_(n_r) + s H̄_b P_b H̄_b† ) ) exp( −s Y† ( I_(n_r) + s H̄_b P_b H̄_b† )^(−1) Y ).   (5.64)
It follows that with s > 0, I_b^gmi(P_b, H_b, H̄_b, s) can be lower-bounded as

I_b^gmi(P_b, H_b, H̄_b, s)
  ≥ log det( I_(n_r) + s H̄_b P_b H̄_b† ) − log G − s ( n_r + n_t ‖Ē_b P_b^(1/2)‖²_F )
    + E[ s Y† ( I_(n_r) + s H̄_b P_b H̄_b† )^(−1) Y ]   (5.65)
  ≥ log det( I_(n_r) + s H̄_b P_b H̄_b† ) − log G − s ( n_r + n_t ‖Ē_b P_b^(1/2)‖²_F )   (5.66)

where the last inequality holds because for any y ∈ ℂ^(n_r), the term

y† ( I_(n_r) + s H̄_b P_b H̄_b† )^(−1) y   (5.67)

is always non-negative, as shown in Appendix A.3. Furthermore, instead of taking
the supremum over s > 0 of the sum of the right-hand side of (5.66) over all B
blocks, using s = s̄ with

s̄ = B / ( B n_r + n_t Σ_{b=1}^{B} ‖Ē_b P_b^(1/2)‖²_F ),   (5.68)
yields the GMI lower bound

I^gmi(H, H̄, P) ≥ (1/B) Σ_{b=1}^{B} log det( I_(n_r) + B H̄_b P_b H̄_b† / ( B n_r + n_t Σ_{b′=1}^{B} ‖Ē_{b′} P_{b′}^(1/2)‖²_F ) ) − 1 − log G.   (5.69)
Comparing the GMI lower bound in Appendix A.3 with the above lower bound
when the transmit power matrix P_b is given by (5.5), it is clear that both yield
the same large-SNR set characterising the GMI lower bound, since log G is
an SNR-independent constant. Since the SNR-exponents for Gaussian inputs
derived using the GMI upper and lower bounds are identical, it follows that for
any input distribution satisfying the condition in (5.62), the SNR-exponent is
lower-bounded by the SNR-exponent for Gaussian inputs. It is not yet clear
whether this lower bound is tight, because solving the GMI upper bound for
input distributions satisfying (5.62) remains a challenge. This also implies that
there may be other continuous input distributions with larger SNR-exponents
than Gaussian inputs.
5.5 Conclusion
We have studied the effects of imperfect CSI on the performance of data
transmission over MIMO block-fading channels. In particular, we derived the
outage SNR-exponent as a function of the CSIR and the CSIT noise variances,
σ̄_e² = P^(−d̄_e) and σ̂_e² = P^(−d̂_e), where P is the average data transmission power.
We showed that noisy CSIR has more detrimental effects on the SNR-exponent
than noisy CSIT. The results shed new light on the design of pilot-assisted
channel estimation in block-fading channels. If pilot symbols from both ends
are sent with power P (d̄_e = d̂_e = 1), then the CSIT estimation and the power
adaptation are not useful in terms of improving the SNR-exponent. On a positive
note, if the pilot signalling can be done at a power level sufficiently higher than
P, then one can reap significant benefits from power adaptation across blocks.
Furthermore, since noisy CSIR has more detrimental effects than noisy CSIT,
obtaining a reliable CSIR is more important than obtaining a reliable CSIT.
For full CSIT, this can be achieved by having pilot symbols from the transmitter
with a larger power exponent than pilot symbols from the receiver. More
specifically, a reliable CSIR can be obtained if d̄_e > 1 + d^u_csir d̂_e, where d^u_csir
is the perfect-CSIR outage SNR-exponent with uniform power allocation. One
common way
of guaranteeing this reliable CSIR is to allocate the same power to the pilot
symbols as to the data symbols at the transmitter [80–82]. Note that even
though identical pilot and data power at the transmitter justifies a perfect-CSIR
analysis of the outage SNR-exponent, the generalised outage probability can still
improve if the pilot power is larger than the data power, as shown in Chapter 3.
For causal and predictive CSIT, the outage SNR-exponent depends not only on
the CSIT-error diversity d̂_e and the CSIR-error diversity d̄_e, but also on the
CSIT delay τ_d or the CSIT prediction parameter τ_f.

The outage SNR-exponents derived in this chapter are the optimal SNR-exponents
when using i.i.d. codebooks (Gaussian or discrete) and a nearest
neighbour decoder. In order to obtain a potentially better SNR-exponent, one
should consider non-i.i.d. codebooks or a different decoding strategy.
Part II
Stationary Ergodic Fading
Channels
Chapter 6
Stationary Fading Channels
Recall that for delay-unconstrained transmission, a codeword spans over a large
number of fading realisations and the channel is assumed to be stationary and
ergodic. For most fading distributions, the channel capacity is positive. With a
good coding scheme, one can transmit reliably, i.e., with vanishing error proba-
bility, at rates below the channel capacity.
In this chapter, we revisit existing results concerning the interplay between
capacity and channel state information (CSI) in stationary Gaussian fading chan-
nels. Similarly to Part I, we refer to the knowledge of the fading as the CSI.
This chapter is structured as follows. Section 6.1 introduces a model for
multiple-input multiple-output (MIMO) stationary Gaussian flat-fading chan-
nels. Section 6.2 reviews the capacity of fading channels for both coherent and
noncoherent settings. We focus on the high signal-to-noise ratio (SNR) regime
and study the capacity pre-log, defined as the limiting ratio of the capacity to
the logarithm of the SNR as the SNR tends to infinity.
6.1 MIMO Gaussian Flat-Fading Channels
We consider a discrete-time MIMO flat-fading channel with nt transmit antennas and nr receive antennas. The channel output at time k, k ∈ ℤ, is an nr-dimensional random vector

    Yk = √(SNR/nt) Hk xk + Zk.    (6.1)

Here xk ∈ ℂ^nt denotes the channel input vector at time k, Hk denotes the nr × nt-dimensional random fading matrix at time k and Zk denotes the nr-variate random additive noise vector at time k.
[Figure: m → ENC → ×√(SNR/nt) → ×Hk → +Zk → Yk → DEC → m̂]
Figure 6.1: A diagram for communication over a stationary MIMO fading channel.
We shall assume throughout that the noise process {Zk, k ∈ ℤ} is a sequence of independent and identically distributed (i.i.d.) complex-Gaussian random vectors with zero mean and identity covariance matrix. The average SNR for each receive antenna is thus SNR. The fading process {Hk, k ∈ ℤ} is stationary, ergodic and complex-Gaussian. We assume that the nr · nt processes {Hk(r, t), k ∈ ℤ}, r = 1, . . . , nr, t = 1, . . . , nt are independent and have the same law, with each process having zero mean, unit variance and power spectral density fH(λ), −1/2 ≤ λ ≤ 1/2. Thus, fH(·) is a non-negative function satisfying

    E[Hk+m(r, t) H*k(r, t)] = ∫_{−1/2}^{1/2} e^{i2πmλ} fH(λ) dλ    (6.2)

where H*k(r, t) denotes the complex conjugate of Hk(r, t). We finally assume that the fading process {Hk, k ∈ ℤ} and the noise process {Zk, k ∈ ℤ} are independent and that their joint law does not depend on {xk, k ∈ ℤ}.
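As a quick numerical illustration of the model (6.1), the following Python sketch (not from the dissertation; it assumes memoryless Rayleigh fading, a special case of the stationary Gaussian fading above, and all variable names are this example's own) draws a single channel use:

```python
import numpy as np

rng = np.random.default_rng(0)

def cgauss(shape):
    """Zero-mean, unit-variance circularly symmetric complex-Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def channel_output(x, H, snr, nt):
    """One use of the channel (6.1): Y_k = sqrt(SNR/nt) * H_k x_k + Z_k."""
    z = cgauss(H.shape[0])            # noise Z_k with identity covariance
    return np.sqrt(snr / nt) * H @ x + z

nt, nr, snr = 2, 3, 100.0
H = cgauss((nr, nt))                  # one fading realisation H_k
x = cgauss(nt)                        # one input vector x_k
y = channel_output(x, H, snr, nt)     # nr-dimensional output Y_k
```

Since each entry of H has unit variance and E[‖x‖²] = nt, the average received SNR per antenna is indeed SNR, as stated above.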
6.2 Capacity and The Pre-Log
The transmission of a message m, m ∈ M = {1, . . . , |M|}, over the channel (6.1) is illustrated in Figure 6.1. The encoder (ENC) first maps the message m into a codeword which is selected from the codebook C. Each codeword forms a sequence of n channel inputs, e.g., x1, . . . , xn, which is transmitted over the channel. We say that the channel inputs satisfy an average-power constraint if

    (1/n) ∑_{k=1}^{n} E[‖Xk‖²] ≤ nt.    (6.3)
On the other hand, we say that the channel inputs satisfy a peak-power constraint if it holds with probability one that

    ‖Xk‖² ≤ nt,  k ∈ ℤ.    (6.4)

Upon receiving a sequence of n channel outputs y1, . . . , yn, the decoder (DEC) decides on an output message m̂, m̂ ∈ M = {1, . . . , |M|}, based on a certain decision rule. We say that a rate R (in nats per channel use),

    R ≜ (log |M|)/n,    (6.5)

is achievable if there exists a combination of encoder and decoder such that the error probability Pr{m̂ ≠ m} tends to zero as the codeword length n tends to infinity.
Capacity is defined as the supremum of all achievable rates R maximised over
all possible encoders and decoders. We denote Cav(SNR) as the capacity under
the average-power constraint (6.3) and Cp(SNR) as the capacity under the peak-
power constraint (6.4). We shall remove the subscripts av and p if the analysis
applies to both constraints.
We shall focus on the high-SNR behaviour of the capacity. One important
figure of merit is the capacity pre-log, defined in the following.
Definition 6.1 (Capacity Pre-Log). The capacity pre-log ΠC is defined as the limiting ratio of the capacity to the logarithm of the SNR:

    ΠC ≜ lim sup_{SNR→∞} C(SNR)/log SNR.    (6.6)
In the literature, the capacity pre-log is often referred to as the spatial
multiplexing gain [20]. For high SNR, the capacity can be approximated as
C(SNR) ≈ ΠC · log SNR. Hence, the capacity pre-log measures the growth rate
of the capacity in the high-SNR regime.
6.2.1 Coherent Channels
We refer to the coherent fading channel as a channel where the receiver has
access to the fading realisations. Since the fading realisations are available at
the receiver, we can treat the fading as part of the channel output. The channel
capacity under an average-power constraint is in this case given by (see, e.g., [87])

    Cav(SNR) = lim_{n→∞} (1/n) sup I(X1, . . . , Xn; Y1, . . . , Yn, H1, . . . , Hn)    (6.7)

where I(· ; ·) is the mutual information, and where the maximisation is over the joint distribution of X1, . . . , Xn satisfying

    (1/n) ∑_{k=1}^{n} E[‖Xk‖²] ≤ nt.    (6.8)
The capacity under a peak-power constraint is obtained by replacing the con-
straint (6.8) with (6.4).
The capacity of multi-antenna coherent flat-fading channels has been studied
in the literature (see, e.g., [88–90]). Under the average-power constraint (6.8),
the capacity (6.7) (in nats per channel use) can be expressed as an expectation
over the random fading matrix [88–90]
    Cav(SNR) = E[log det(Inr + (SNR/nt) H H†)].    (6.9)

This capacity can be achieved using nearest neighbour decoding—which in this case is maximum-likelihood decoding—and a Gaussian codebook whose entries are drawn i.i.d. from an nt-variate complex-Gaussian distribution Nnt(0, Int).
The following proposition provides the pre-log for Cav(SNR) in (6.9), which
was derived in [89, 90].
Proposition 6.1 (Coherent MIMO Pre-Log). Assume that the fading satisfies
    0 < E[‖H‖²F] < ∞.    (6.10)

Then, the capacity pre-log of coherent Gaussian fading channels under the average-power constraint (6.8) is given by

    ΠCav = min(nt, nr).    (6.11)
Thus, for coherent fading channels, the capacity increases as the logarithm
of the SNR with the growth rate given by the minimum number of transmit
and receive antennas. Therefore, at high SNR, higher information rates can be
supported by increasing the number of antennas.
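The growth rate in Proposition 6.1 can be checked numerically. The sketch below (illustrative only; i.i.d. Rayleigh entries and a Monte Carlo average in place of the exact expectation) estimates (6.9) at two SNR values and reads off the slope with respect to log SNR:

```python
import numpy as np

rng = np.random.default_rng(1)

def coherent_capacity(snr, nt, nr, trials=2000):
    """Monte Carlo estimate of (6.9), E[log det(I + (SNR/nt) H H^dagger)], in nats."""
    total = 0.0
    for _ in range(trials):
        H = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)
        total += np.log(np.linalg.det(np.eye(nr) + (snr / nt) * (H @ H.conj().T)).real)
    return total / trials

nt, nr = 2, 3
c_lo, c_hi = coherent_capacity(1e2, nt, nr), coherent_capacity(1e4, nt, nr)
slope = (c_hi - c_lo) / (np.log(1e4) - np.log(1e2))
print(round(slope, 1))  # close to min(nt, nr) = 2
```

The finite-SNR slope already matches the pre-log closely because the O(1) terms of the capacity cancel in the difference.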
Notice that Proposition 6.1 characterises the pre-log under the average-power
constraint (6.3). Since the peak-power constraint (6.4) also implies the average-
power constraint (6.3), it follows that Cp(SNR) ≤ Cav(SNR) and hence ΠCp ≤ ΠCav.
6.2.2 Noncoherent Channels
We refer to the noncoherent fading channel as a channel where neither the transmitter nor the receiver has access to the fading realisations, though both may be aware of the fading statistics. For this channel, the capacity expression in (6.7) changes to

    Cav(SNR) = lim_{n→∞} (1/n) sup I(X1, . . . , Xn; Y1, . . . , Yn).    (6.12)
The capacity Cp(SNR) is defined accordingly. From the chain rule for mutual
information [18, Th. 2.5.2]
I(X1, . . . ,Xn;Y1, . . . ,Yn) = I(X1, . . . ,Xn;Y1, . . . ,Yn,H1, . . . ,Hn)
− I(X1, . . . ,Xn;H1, . . . ,Hn|Y1, . . . ,Yn) (6.13)
and the non-negativity of mutual information [18, Th. 2.6.3]
I(X1, . . . ,Xn;H1, . . . ,Hn|Y1, . . . ,Yn) ≥ 0, (6.14)
we can see that the capacity of the noncoherent fading channel (6.12) is upper-
bounded by the capacity of the coherent fading channel (6.7).
Lapidoth [85] derived the capacity pre-log of noncoherent single-input single-
output (SISO) fading channels under a peak-power constraint. The analysis
in [85] was later extended by Koch and Lapidoth [91] to multiple-input single-
output (MISO) fading channels. For the fading process defined in Section 6.1,
it follows that the capacity pre-log of MISO fading channels (nt ≥ 1, nr = 1) is
equal to the capacity pre-log of the SISO fading channel (nt = 1, nr = 1). The
results in [85, 91] are summarised in the following proposition.
Proposition 6.2 (Noncoherent MISO Pre-Log). Consider a MISO fading channel (nt ≥ 1, nr = 1) under the peak-power constraint (6.4). If the fading processes {Hk(1, t), k ∈ ℤ}, t = 1, . . . , nt are independent and have the same law, with each process having zero mean, unit variance and power spectral density fH(λ), −1/2 ≤ λ ≤ 1/2, then

    ΠCp = µ({λ : fH(λ) = 0})    (6.15)
where µ(·) denotes the Lebesgue measure on the interval [−1/2, 1/2].
To the best of our knowledge, the capacity pre-log of MIMO fading channels
is unknown. The best lower bound known so far is due to Etkin and Tse [1],
which characterises the pre-log under an average-power constraint. The lower
bound in [1] is given in the following proposition.
Proposition 6.3 (Noncoherent MIMO Pre-Log). Consider a MIMO fading channel under the average-power constraint (6.3). If the nr · nt fading processes {Hk(r, t), k ∈ ℤ}, t = 1, . . . , nt, r = 1, . . . , nr are independent and have the same law, with each process having zero mean, unit variance and power spectral density fH(λ), −1/2 ≤ λ ≤ 1/2, then

    ΠCav ≥ min(nt, nr) (1 − min(nt, nr) µ({λ : fH(λ) > 0})).    (6.16)
Observe that (6.16) specialises to (6.15) for nr = 1. It should be noted that the
capacity pre-log for MISO and SISO fading channels was derived under a peak-
power constraint on the channel inputs, whereas the lower bound on the capacity
pre-log for MIMO fading channels was derived under an average-power constraint.
Clearly, the capacity pre-log corresponding to a peak-power constraint can never
be larger than that corresponding to an average-power constraint. It is believed
that the two pre-logs are in fact identical (see the conclusion in [85]).
Unlike the capacity pre-log of the coherent fading channel, the capacity pre-
log of the noncoherent channel depends on the time-correlation of the fading,
indicated by the power spectral density fH(·), cf. (6.2). As discussed in [85],
the function fH(·) characterises the predictability of the fading process. As the
Lebesgue measure µ({λ : fH(λ) = 0}) increases, the capacity pre-log of the non-
coherent fading channel approaches that of the coherent channel.
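For a psd supported on [−λD, λD] (with λD < 1/2), Propositions 6.2 and 6.3 reduce to simple closed forms. The helper functions below are this example's own names, not the dissertation's:

```python
def miso_prelog(lambda_d):
    # Proposition 6.2: mu({lambda : f_H(lambda) = 0}) = 1 - 2*lambda_d
    return 1.0 - 2.0 * lambda_d

def mimo_prelog_lower_bound(nt, nr, lambda_d):
    # Proposition 6.3: min(nt,nr) * (1 - min(nt,nr) * mu({lambda : f_H(lambda) > 0}))
    m = min(nt, nr)
    return m * (1.0 - m * 2.0 * lambda_d)

# (6.16) specialises to (6.15) for nr = 1:
assert mimo_prelog_lower_bound(1, 1, 0.1) == miso_prelog(0.1)
print(round(miso_prelog(0.1), 3), mimo_prelog_lower_bound(4, 4, 1 / 16))  # 0.8 2.0
```

Less predictable fading (larger λD) lowers both pre-logs, and the MIMO bound shrinks quadratically in the number of antennas.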
The capacity pre-log characterises the largest information rates achievable at
high SNR. The above results, however, do not explain how to construct practical
schemes that achieve the capacity pre-log, particularly in the noncoherent chan-
nel, where CSI is not available. In Chapter 7, we propose a simple scheme using
nearest neighbour decoding and pilot-aided channel estimation that achieves the
MISO capacity pre-log (Proposition 6.2) and achieves the lower bound of the
MIMO capacity pre-log (Proposition 6.3). In Chapter 8, we extend the analysis
of the scheme proposed in Chapter 7 to the fading multiple-access channel.
Chapter 7
Pilot-Aided Channel Estimation
for Stationary Fading Channels
7.1 Introduction
The capacity of coherent multiple-input multiple-output (MIMO) channels in-
creases with the signal-to-noise ratio (SNR) as min(nt, nr) log SNR, where nt and
nr are the number of transmit and receive antennas, respectively [88, 89]. In
Chapter 6, we refer to the growth factor min(nt, nr) as the capacity pre-log.
This capacity growth can be achieved using a nearest neighbour decoder which
selects the codeword that is closest (in a Euclidean-distance sense) to the chan-
nel output. In fact, for coherent fading channels with additive Gaussian noise,
this decoder is the maximum-likelihood decoder and is therefore optimal in the
sense that it minimises the error probability (see [4] and references therein). The
coherent channel model assumes that there is a genie that provides the fading
coefficients to the decoder; this assumption is difficult to achieve in practice. In
this work, we replace the role of the genie by a scheme that estimates the fading
via pilot symbols. This can be viewed as a particular coding strategy over a
noncoherent fading channel, i.e., a channel where neither communication end has access to the fading coefficients, though both may be aware of the fading statistics. Note that with imperfect fading estimates, the nearest neighbour decoder that treats the fading estimate as if it were perfect is not necessarily optimal. Nevertheless, we show that, in some cases, nearest neighbour decoding combined with pilot-aided channel estimation achieves the capacity pre-log of noncoherent fading channels.
The capacity of noncoherent fading channels has been studied in a number of
works. Building upon [77], Hassibi and Hochwald [76] studied the capacity of the
block-fading channel and used pilot symbols (also known as training symbols)
to obtain reasonably accurate fading estimates. Lozano and Jindal [92] provided
tools for a unified treatment of pilot-based channel estimation in both block
and stationary bandlimited fading channels. In these works, lower bounds on
the channel capacity were obtained. Lapidoth [85] studied a single-input single-
output (SISO) fading channel for more general stationary fading processes and
showed that, depending on the predictability of the fading process, the capacity growth in the SNR can be, inter alia, logarithmic or double-logarithmic. The
extension of [85] to multiple-input single-output (MISO) fading channels can be
found in [91]. A lower bound on the capacity of stationary MIMO fading channels
was derived by Etkin and Tse in [1].
Lapidoth and Shamai [54] and Weingarten et al. [39] studied noncoherent
stationary fading channels from a mismatched-decoding perspective. In partic-
ular, they studied achievable rates with Gaussian codebooks and nearest neigh-
bour decoding. In both works, it is assumed that there is a genie that provides
imperfect estimates of the fading coefficients.
In this chapter, we add the estimation of the fading coefficients to the anal-
ysis. In particular, we study a communication system where the transmitter
emits pilot symbols at regular intervals, and where the receiver performs channel estimation and data detection separately. Based on the channel outputs corre-
sponding to pilot transmissions, the channel estimator produces estimates for the
remaining time instants using a linear minimum mean-square error (LMMSE) in-
terpolator. Using these estimates, the data detector employs a nearest neighbour
decoder to decide what the transmitted message was. We study the achievable
rates of this communication scheme at high SNR. In particular, we study the
pre-log for fading processes of bandlimited power spectral densities. (The pre-
log is defined as the limiting ratio of the achievable rate to the logarithm of the
SNR as the SNR tends to infinity.)
For SISO fading channels, using some simplifying arguments, Lozano [93] and
Jindal and Lozano [92] showed that this scheme achieves the capacity pre-log. In
this chapter, we prove this result without any simplifying assumptions and ex-
tend it to MIMO fading channels. The rest of the chapter is organised as follows.
Section 7.2 describes the channel model and introduces our transmission scheme
along with the nearest neighbour decoder and pilots for channel estimation. Sec-
tion 7.3 defines the pre-log and presents our main results. Section 7.4 provides
the proof of our main results. Section 7.5 summarises the important points of
the chapter.
7.2 System Model and Transmission Scheme
We consider a discrete-time MIMO flat-fading channel with nt transmit antennas and nr receive antennas, whose channel output at time instant k ∈ ℤ is the complex-valued nr-dimensional random vector given by

    Yk = √(SNR/nt) Hk xk + Zk.    (7.1)

Here xk ∈ ℂ^nt denotes the time-k channel input vector, Hk denotes the nr × nt-dimensional random fading matrix at time k, and Zk denotes the nr-variate random additive noise vector at time k.
The noise process {Zk, k ∈ ℤ} is a sequence of i.i.d. complex-Gaussian random vectors of zero mean and covariance matrix Inr. SNR denotes the average SNR for each receive antenna.

The fading process {Hk, k ∈ ℤ} is stationary, ergodic and complex-Gaussian. We assume that the nr · nt processes {Hk(r, t), k ∈ ℤ}, r = 1, . . . , nr, t = 1, . . . , nt are independent and have the same law, with each process having zero mean, unit variance and power spectral density (psd) fH(λ), −1/2 ≤ λ ≤ 1/2. Thus, fH(·) is a non-negative function satisfying

    E[Hk+m(r, t) H*k(r, t)] = ∫_{−1/2}^{1/2} e^{i2πmλ} fH(λ) dλ    (7.2)

where H*k(r, t) denotes the complex conjugate of Hk(r, t). We further assume that the psd fH(·) has bandwidth λD < 1/2, i.e., fH(λ) = 0 for |λ| > λD and fH(λ) > 0 otherwise.

We finally assume that the fading process {Hk, k ∈ ℤ} and the noise process {Zk, k ∈ ℤ} are independent and that their joint law does not depend on {xk, k ∈ ℤ}.
The transmission involves both codewords and pilots. The former convey the
message to be transmitted, and the latter are used to facilitate the estimation of
the fading coefficients at the receiver. We denote a codeword conveying a message
m, m ∈ M (where M = {1, . . . , e^{nR}} is the set of possible messages), at rate R
by the length-n sequence of input vectors x1(m), . . . , xn(m). The codeword is
selected from the codebook C, which is drawn i.i.d. from an nt-variate complex-
Gaussian distribution with zero mean and identity covariance matrix such that
[Figure: time axis for antennas t = 1 and t = 2 showing pilot, data and no-transmission instants; guard periods of length L(T − 1) precede and follow the (n/(L − nt) + 1)·nt pilot and n data instants.]
Figure 7.1: Structure of pilot and data transmission for nt = 2, L = 7 and T = 2.
    (1/n) ∑_{k=1}^{n} E[‖Xk(m)‖²] = nt,  m ∈ M.    (7.3)
To estimate the fading matrix, we transmit orthogonal pilot vectors. The pilot vector pt ∈ ℂ^nt used to estimate the fading coefficients corresponding to the t-th transmit antenna is given by pt(t) = 1 and pt(t′) = 0 for t′ ≠ t. For example, the first pilot vector is p1 = (1, 0, . . . , 0)^T. To estimate the whole fading matrix, we thus need to send the nt pilot vectors p1, . . . , pnt.
The transmission scheme is as follows. Every L time instants (for some L ∈ ℕ), we transmit the nt pilot vectors p1, . . . , pnt. Each codeword is then split up into blocks of L − nt data vectors, which are transmitted after the nt pilot vectors. The process of transmitting L − nt data vectors and nt pilot vectors continues until all n data vectors have been transmitted. Herein we assume that n is an integer multiple of L − nt.^{7.1} Prior to transmitting the first data block, and after transmitting the last data block, we introduce a guard period of L(T − 1) time instants (for some T ∈ ℕ), where we transmit the nt pilot vectors p1, . . . , pnt every L time instants, but we do not transmit data vectors in between. The guard period ensures that, at every time instant, we can employ a channel estimator that bases its estimation on the channel outputs corresponding to the T past and the T future pilot transmissions. This facilitates the analysis and does not incur any loss in terms of achievable rates. The above transmission scheme is illustrated in Figure 7.1. The channel estimator is described in the following.
Note that the total block-length of the above transmission scheme (comprising
data vectors, pilot vectors and guard period) is given by
    n′ = np + n + ng    (7.4)
^{7.1} If n is not an integer multiple of L − nt, then the last L − nt instants are not fully used by data vectors and therefore contain time instants where we do not transmit anything. The thereby incurred loss in information rate vanishes as n tends to infinity.
where np denotes the number of channel uses reserved for pilot vectors, and where
ng denotes the number of channel uses during the silent guard period, i.e.,
    np = (n/(L − nt) + 1 + 2(T − 1)) nt,    (7.5)
    ng = 2(L − nt)(T − 1).    (7.6)
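For concreteness, the accounting in (7.4)–(7.6) can be scripted. The helper below is illustrative (parameters taken from Figure 7.1, with a hypothetical n = 500):

```python
def block_lengths(n, nt, L, T):
    """Pilot uses (7.5), guard uses (7.6) and total block length (7.4)."""
    assert n % (L - nt) == 0, "n is assumed to be a multiple of L - nt"
    n_p = (n // (L - nt) + 1 + 2 * (T - 1)) * nt
    n_g = 2 * (L - nt) * (T - 1)
    return n_p, n_g, n_p + n + n_g

n_p, n_g, n_total = block_lengths(n=500, nt=2, L=7, T=2)
print(n_p, n_g, n_total)  # 206 10 716
```

As n grows, the guard period becomes negligible and the pilot fraction n_p/n′ tends to nt/L (here 2/7), cf. (7.31) below.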
We now turn to the decoder. Let D denote the set of time indices where data vectors of a codeword are transmitted, and let P denote the set of time indices where pilots are transmitted. The decoder consists of two parts: a channel estimator and a data detector. The channel estimator considers the channel output vectors {Yk′, k′ ∈ P} corresponding to the T past and T future pilot transmissions and estimates Hk(r, t) using a linear interpolator, i.e., the estimate Ĥ_k^{(T)}(r, t) of the fading coefficient Hk(r, t) is given by

    Ĥ_k^{(T)}(r, t) = ∑_{k′=k−TL, k′∈P}^{k+TL} a_{k′}(r, t) Y_{k′}(r),  k ∈ D    (7.7)

where the coefficients a_{k′}(r, t) are chosen in order to minimise the mean-squared error.
Note that, since each pilot vector is transmitted from only one antenna, the fading coefficients corresponding to all transmit and receive antennas (r, t) can be observed. Further note that, since the fading processes {Hk(r, t), k ∈ ℤ}, r = 1, . . . , nr, t = 1, . . . , nt are independent, estimating Hk(r, t) based only on {Yk(r), k ∈ ℤ} rather than on {Yk, k ∈ ℤ} incurs no loss in optimality.
Since the time-lags between {Hk, k ∈ D} and the observations {Yk′, k′ ∈ P} depend on k, it follows that the interpolation error

    E_k^{(T)}(r, t) = Hk(r, t) − Ĥ_k^{(T)}(r, t)    (7.8)

is not stationary but cyclo-stationary with period L. It can be shown that, irrespective of r, the variance of the interpolation error

    ε²_{ℓ,T}(r, t) = E[|Hk(r, t) − Ĥ_k^{(T)}(r, t)|²]    (7.9)
tends to the following expression as T tends to infinity [94]:

    ε²_ℓ(t) ≜ lim_{T→∞} ε²_{ℓ,T}(r, t)    (7.10)
            = 1 − ∫_{−1/2}^{1/2} SNR |f_{H_L,ℓ−t+1}(λ)|² / (SNR f_{H_L,0}(λ) + nt) dλ    (7.11)

where ℓ ≜ k mod L denotes the remainder of k/L. Here f_{H_L,ℓ}(·) is given by

    f_{H_L,ℓ}(λ) = (1/L) ∑_{ν=0}^{L−1} f̃H((λ − ν)/L) e^{i2πℓ(λ−ν)/L},  ℓ = 0, . . . , L − 1    (7.12)

and f̃H(·) is the periodic continuation (with period 1) of fH(·), i.e., it coincides with fH(λ) for −1/2 ≤ λ ≤ 1/2. If

    L ≤ 1/(2λD)    (7.13)

then |f_{H_L,ℓ}(·)| becomes

    |f_{H_L,ℓ}(λ)| = f_{H_L,0}(λ) = (1/L) fH(λ/L),  −1/2 ≤ λ ≤ 1/2.    (7.14)
In this case, irrespective of ℓ and t, the variance of the interpolation error is given by

    ε²_ℓ(t) = ε² = 1 − ∫_{−1/2}^{1/2} SNR (fH(λ))² / (SNR fH(λ) + L nt) dλ    (7.15)

which vanishes as SNR tends to infinity. Recall that λD denotes the bandwidth of fH(·). Thus, (7.13) implies that no aliasing occurs as we undersample the fading process by a factor of L. While the variance in (7.11) may depend on the transmit antenna index t, t = 1, . . . , nt, the variance in (7.15) is independent of the transmit antenna index. See Section 7.4.1 for a more detailed discussion.
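To see the 1/SNR decay concretely, consider a rectangular psd fH(λ) = 1/(2λD) on [−λD, λD] (this specific psd, and the parameter values, are assumptions of this sketch, not of the chapter). The integrand in (7.15) is then constant on the support and the integral has a closed form:

```python
def eps2_rect(snr, L, nt, lambda_d):
    # With f = 1/(2*lambda_d) on the support, (7.15) reduces to
    # eps^2 = 1 - SNR*f/(SNR*f + L*nt) = L*nt / (SNR/(2*lambda_d) + L*nt).
    return L * nt / (snr / (2 * lambda_d) + L * nt)

snr, L, nt, lambda_d = 1e4, 4, 2, 0.1
print(eps2_rect(snr, L, nt, lambda_d))  # about 1.6e-4
print(2 * lambda_d * L * nt / snr)      # high-SNR approximation, also about 1.6e-4
```

The exact value and the approximation 2λD L nt / SNR agree to within a relative error of order 1/SNR, so the estimation error variance indeed vanishes as the reciprocal of the SNR.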
The channel estimator feeds the sequence of fading estimates {Ĥ_k^{(T)}, k ∈ D} (where Ĥ_k^{(T)} is the matrix with entries Ĥ_k^{(T)}(r, t)) to the data detector. We shall denote its realisation by {Ĥ_k^{(T)}, k ∈ D}. Based on the channel outputs {yk, k ∈ D} and fading estimates {Ĥ_k^{(T)}, k ∈ D}, the data detector uses a nearest neighbour decoder to guess which message was transmitted. Thus, the decoder decides on the message m̂ that satisfies

    m̂ = arg min_{m∈M} D(m)    (7.16)
where

    D(m) ≜ ∑_{k∈D} ‖yk − √(SNR/nt) Ĥ_k^{(T)} xk(m)‖².    (7.17)
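The decision rule (7.16)–(7.17) amounts to an exhaustive search over the codebook. A minimal sketch (array shapes and variable names are this example's own; for simplicity the decoder is fed perfect estimates Ĥ = H):

```python
import numpy as np

def nn_decode(y, H_hat, codebook, snr, nt):
    """Nearest neighbour decoding (7.16)-(7.17).
    y: (n, nr) outputs, H_hat: (n, nr, nt) fading estimates, codebook: (M, n, nt)."""
    best_m, best_metric = 0, np.inf
    for m, x in enumerate(codebook):
        # D(m) = sum_k || y_k - sqrt(SNR/nt) * Hhat_k x_k(m) ||^2
        pred = np.sqrt(snr / nt) * np.einsum('krt,kt->kr', H_hat, x)
        metric = np.sum(np.abs(y - pred) ** 2)
        if metric < best_metric:
            best_m, best_metric = m, metric
    return best_m

rng = np.random.default_rng(2)
n, nt, nr, M, snr = 8, 2, 2, 4, 1e3
cgauss = lambda s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
codebook, H, Z = cgauss((M, n, nt)), cgauss((n, nr, nt)), cgauss((n, nr))
y = np.sqrt(snr / nt) * np.einsum('krt,kt->kr', H, codebook[3]) + Z
print(nn_decode(y, H, codebook, snr, nt))  # 3 (perfect estimates, high SNR)
```

With imperfect estimates the same metric is used unchanged; the decoder simply treats Ĥ as if it were the true fading, which is exactly the mismatch analysed in this chapter.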
7.3 The Pre-Log
We say that a rate

    R(SNR) ≜ (log |M|)/n    (7.18)

is achievable if the error probability tends to zero as the codeword length n tends to infinity. In this work, we study the maximum rate R∗(SNR) that is achievable with nearest neighbour decoding and pilot-aided channel estimation. We focus on the achievable rates at high SNR. In particular, we are interested in the maximum achievable pre-log, defined as

    ΠR∗ ≜ lim sup_{SNR→∞} R∗(SNR)/log SNR.    (7.19)
We recall some key points on the capacity pre-log of the noncoherent fading channel from Chapter 6. Proposition 6.2 summarises the capacity pre-log of the SISO fading channel under a peak-power constraint derived by Lapidoth [85] as

    ΠCp = µ({λ : fH(λ) = 0})    (7.20)

where µ(·) denotes the Lebesgue measure on the interval [−1/2, 1/2]. Koch and Lapidoth [91] later extended this result to MISO fading channels and showed that if the fading processes {Hk(t), k ∈ ℤ}, t = 1, . . . , nt are independent and have the same law, then the capacity pre-log of MISO fading channels is equal to the capacity pre-log of the SISO fading channel with fading process {Hk(1), k ∈ ℤ}. Using (7.20), the capacity pre-log of MISO fading channels with bandlimited psd of bandwidth λD can be evaluated as

    ΠCp = 1 − 2λD.    (7.21)
To the best of our knowledge, a complete characterisation of the capacity pre-log of MIMO fading channels is not available. Proposition 6.3 provides the best lower bound known so far, due to Etkin and Tse [1]: for independent fading processes {Hk(r, t), k ∈ ℤ}, t = 1, . . . , nt, r = 1, . . . , nr that have the same law, the capacity pre-log of the MIMO fading channel under an average-power constraint can be lower-bounded as

    ΠCav ≥ min(nt, nr) (1 − min(nt, nr) µ({λ : fH(λ) > 0})).    (7.22)

For a psd that is bandlimited to λD, this becomes

    ΠCav ≥ min(nt, nr) (1 − min(nt, nr) 2λD).    (7.23)
Note that since R∗(SNR) ≤ C(SNR), it follows that ΠR∗ ≤ ΠC.^{7.2}
In this work, we show that a communication scheme that employs nearest
neighbour decoding and pilot-aided channel estimation achieves the following
pre-log.
Theorem 7.1. Consider the above Gaussian MIMO flat-fading channel with nt
transmit antennas and nr receive antennas. Then, the transmission and decoding
scheme described in Section 7.2 achieves
    ΠR∗ ≥ min(nt, nr) (1 − min(nt, nr)/L∗)    (7.24)

where L∗ = ⌊1/(2λD)⌋ is the largest integer satisfying L∗ ≤ 1/(2λD).
Proof. See Section 7.4.1.
Remark 7.1. We derive Theorem 7.1 for i.i.d. Gaussian codebooks satisfying
the average-power constraint (7.3). Nevertheless, it can be shown that Theorem
7.1 continues to hold when the channel inputs satisfy a peak-power constraint.
More specifically, we show in Section 7.4.2 that a sufficient condition on the input distribution with power constraint E[‖X‖²] ≤ nt for achieving the pre-log is that its probability density function (pdf) PX(x) satisfies

    PX(x) ≤ (K/π^{nt}) e^{−‖x‖²},  x ∈ ℂ^{nt}    (7.25)

for some K satisfying

    lim_{SNR→∞} (log K)/(log SNR) = 0.    (7.26)
The condition (7.25) is satisfied, for example, by truncated Gaussian inputs, for
^{7.2} The channel capacity is the supremum of all achievable rates maximised over all possible encoders and decoders.
which the nt elements of X are independent and identically distributed and

    PX(x) = (1/(K π^{nt})) e^{−‖x‖²},  x ∈ {x ∈ ℂ^{nt} : |x(t)| ≤ 1, 1 ≤ t ≤ nt},    (7.27)

    K = (∫_{|x|≤1} (1/π) e^{−|x|²} dx)^{nt}    (7.28)
      = (1 − e^{−1})^{nt}.    (7.29)
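The constant in (7.28)–(7.29) follows from a polar-coordinate integration in each complex dimension: ∫_{|x|≤1} (1/π) e^{−|x|²} dx = ∫_0^1 2r e^{−r²} dr = 1 − e^{−1}. A quick numerical check (midpoint rule; illustrative only):

```python
import numpy as np

# Mass that a unit-variance complex Gaussian places on the unit disc,
# per complex dimension: int_0^1 2 r exp(-r^2) dr = 1 - exp(-1).
r = np.linspace(0.0, 1.0, 200001)
mid = (r[:-1] + r[1:]) / 2
disc_mass = np.sum(2.0 * mid * np.exp(-mid**2)) * (r[1] - r[0])
print(round(disc_mass, 6), round(1.0 - np.exp(-1.0), 6))  # 0.632121 0.632121

nt = 3
K = disc_mass**nt  # (7.28)-(7.29): K = (1 - e^{-1})^{nt}
```

Since K here is a constant independent of the SNR, the condition (7.26) is trivially satisfied.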
If 1/(2λD) is an integer, then (7.24) becomes

    ΠR∗ ≥ min(nt, nr) (1 − min(nt, nr) 2λD).    (7.30)
Thus, in this case nearest neighbour decoding together with pilot-aided channel
estimation achieves the capacity pre-log of MISO fading channels (7.21) as well
as the lower bound on the capacity pre-log of MIMO fading channels (7.23).
Suppose that both the transmitter and the receiver use the same number of antennas, namely n′t = n′r ≜ min(nt, nr). Then, as the codeword length tends to infinity, we have from (7.4), (7.5) and (7.6) that the fraction of time consumed for the transmission of pilots is given by

    lim_{n→∞} np/n′ = lim_{n→∞} (n/(L − n′t) + 1 + 2(T − 1)) n′t / [(n/(L − n′t) + 1 + 2(T − 1)) n′t + n + 2(L − n′t)(T − 1)] = n′t/L.    (7.31)
Consequently, from the achievable pre-log (7.24), namely

    ΠR∗ ≥ n′t (1 − n′t/L),  L ≤ 1/(2λD),    (7.32)
we observe that the loss compared to the capacity pre-log of the coherent fading
channel nt′ = min(nt, nr) is given by the fraction of time used for the transmission
of pilots. From this we infer that the nearest neighbour decoder in combination
with the channel estimator described in Section 7.2 is optimal at high SNR in the
sense that it achieves the capacity pre-log of the coherent fading channel. This
further implies that the achievable pre-log in Theorem 7.1 is the best pre-log that
can be achieved by any scheme employing nt′ pilot vectors.
To achieve the pre-log in Theorem 7.1, we assume that L ≤ 1/(2λD), in which case the variance of the interpolation error (7.15), namely

    ε² = 1 − ∫_{−1/2}^{1/2} SNR (fH(λ))² / (SNR fH(λ) + L nt) dλ ≈ 2λD L nt / SNR,    (7.33)
vanishes as the inverse of the SNR. The achievable pre-log is then maximised by choosing the largest L satisfying L ≤ 1/(2λD). Note that, as a criterion for “perfect side information” in nearest neighbour decoding over fading channels, Lapidoth and Shamai [54] suggested that the variance of the fading estimation error should be negligible compared to the reciprocal of the SNR. Using the linear interpolator (7.7), we obtain an estimation error whose variance decays as the reciprocal of the SNR provided that L ≤ 1/(2λD). Thus, the condition L ≤ 1/(2λD) can be viewed as a sufficient condition for obtaining “nearly perfect side information” in the sense that the variance of the interpolation error is of the same order as the reciprocal of the SNR.
Of course, one could increase L beyond 1/(2λD). Indeed, by increasing L, we could reduce the rate loss due to the transmission of pilots, as indicated in (7.32), at the cost of a larger fading estimation error, which may reduce the reliability of the nearest neighbour decoder. To understand this trade-off better, we shall analyse the contribution of the nearest neighbour decoder to the pre-log when L > 1/(2λD). Note that for L > 1/(2λD), the variance of the interpolation error follows from (7.11) as
    ε²_ℓ(t) = 1 − ∫_{−1/2}^{1/2} SNR |f_{H_L,ℓ−t+1}(λ)|² / (SNR f_{H_L,0}(λ) + nt) dλ    (7.34)
            = ∫_{−1/2}^{1/2} nt f_{H_L,0}(λ) / (SNR f_{H_L,0}(λ) + nt) dλ
              + ∫_{−1/2}^{1/2} SNR ((f_{H_L,0}(λ))² − |f_{H_L,ℓ−t+1}(λ)|²) / (SNR f_{H_L,0}(λ) + nt) dλ.    (7.35)
The former integral

    ∫_{−1/2}^{1/2} nt f_{H_L,0}(λ) / (SNR f_{H_L,0}(λ) + nt) dλ ≈ nt/SNR    (7.36)

vanishes as the SNR tends to infinity. Furthermore, we prove in Section 7.4.1 that, as the SNR tends to infinity, the latter integral

    ∫_{−1/2}^{1/2} SNR ((f_{H_L,0}(λ))² − |f_{H_L,ℓ−t+1}(λ)|²) / (SNR f_{H_L,0}(λ) + nt) dλ    (7.37)

is bounded away from zero. This implies that the interpolation error (7.35) does not vanish as the SNR tends to infinity, and the decoder therefore cannot achieve a positive pre-log. It thus follows that the condition L ≤ 1/(2λD) is necessary in order to achieve a positive pre-log.
Comparing (7.24) and (7.23) with the capacity pre-log min(nt, nr) for coherent
fading channels [88, 89], we observe that, for a fading process of bandwidth λD, the penalty for not knowing the fading coefficients is roughly (min(nt, nr))² 2λD. Consequently, the lower bound (7.24) does not grow linearly with min(nt, nr); it is a quadratic function of min(nt, nr) that achieves its maximum at

    min(nt, nr) = L∗/2.    (7.38)

This gives rise to the lower bound

    ΠR∗ ≥ L∗/4    (7.39)

which cannot be larger than 1/(8λD). The same holds for the lower bound (7.23).
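The quadratic trade-off behind (7.38) and (7.39) is easy to verify numerically. The sketch below (L∗ = 8, i.e. λD = 1/16, chosen purely for illustration) maximises (7.24) over the number of antennas:

```python
def achievable_prelog(m, L_star):
    # (7.24) with equal antenna numbers m = min(nt, nr): m * (1 - m/L_star)
    return m * (1.0 - m / L_star)

L_star = 8  # = floor(1/(2*lambda_d)) for lambda_d = 1/16
best_m = max(range(1, L_star + 1), key=lambda m: achievable_prelog(m, L_star))
print(best_m, achievable_prelog(best_m, L_star))  # 4 2.0, i.e. L*/2 and L*/4
```

Adding antennas beyond L∗/2 reduces the achievable pre-log, since the pilot overhead per coherence period grows faster than the multiplexing gain.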
7.4 Proofs
7.4.1 Proof of Theorem 7.1
Theorem 7.1 is proven as follows. We first characterise the estimation error from
the linear interpolator (7.7). We then compute the rates achievable with the
communication scheme described in Section 7.2. Finally, we analyse the pre-log
corresponding to these rates.
7.4.1.1 Linear Interpolator
By (7.7), the estimate of Hk(r, t) is given by

    Ĥ_k^{(T)}(r, t) = ∑_{k′=k−TL, k′∈P}^{k+TL} a_{k′}(r, t) Y_{k′}(r),  k ∈ D.    (7.40)

We further denote the interpolation error by E_k^{(T)}(r, t) = Hk(r, t) − Ĥ_k^{(T)}(r, t). We have the following lemma.

Lemma 7.1. For any k ∈ ℤ, let k = jL + ℓ, ℓ = 0, . . . , L − 1. Without loss of generality, assume that ℓ = 0, . . . , nt − 1 for k ∈ P and ℓ = nt, . . . , L − 1 for k ∈ D. Then, the linear interpolator (7.40) has the following properties.

1. For each t = 1, . . . , nt and r = 1, . . . , nr, the estimate Ĥ_k^{(T)}(r, t) and the corresponding estimation error E_k^{(T)}(r, t) are independent zero-mean complex-Gaussian random variables.
2. (a) For a given transmit antenna t and ℓ ∈ {nt, . . . , L − 1}, the nr processes {(Ĥ_{jL+ℓ}^{(T)}(r, t), E_{jL+ℓ}^{(T)}(r, t)), j ∈ ℤ}, r = 1, . . . , nr are independent and have the same law.

   (b) For a given receive antenna r and ℓ ∈ {nt, . . . , L − 1}, the nt processes {(Ĥ_{jL+ℓ}^{(T)}(r, t), E_{jL+ℓ}^{(T)}(r, t)), j ∈ ℤ}, t = 1, . . . , nt are independent but have different laws.

3. For each ℓ = nt, . . . , L − 1, the joint process {(H_{jL+ℓ}, Ĥ_{jL+ℓ}^{(T)}, X_{jL+ℓ}, Z_{jL+ℓ}), j ∈ ℤ} is stationary and ergodic.

4. For ℓ = nt, . . . , L − 1, it holds that

    E[Z_ℓ† Ĥ_ℓ^{(T)} X_ℓ] = E[X_ℓ† (Ĥ_ℓ^{(T)})† Z_ℓ] = 0.    (7.41)
5. Irrespective of j and r, the variance of the interpolation error E(T )jL+ℓ(r, t),
ℓ = nt, . . . , L− 1 tends to
ǫ2ℓ(t) = 1−∫ 1/2
−1/2
SNR |fHL,ℓ−t+1(λ)|2SNRfHL,0(λ) + nt
dλ (7.42)
as T tends to infinity, where
fHL,ℓ(λ) =1
L
L−1∑
ν=0
fH
(
λ− ν
L
)
ei2πℓλ−νL
λ, −1
2≤ λ ≤ 1
2(7.43)
and fH(·) is the periodic function of period [−1/2, 1/2) that coincides with
fH(λ) for −1/2 ≤ λ ≤ 1/2. This implies the following results.
(a) For $L \le \frac{1}{2\lambda_D}$, irrespective of $\ell$ and $t$, (7.42) becomes

$$\epsilon^2_\ell(t) = 1 - \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\,(f_H(\lambda))^2}{\mathrm{SNR}\,f_H(\lambda) + Ln_t}\,d\lambda \tag{7.44}$$

which vanishes as SNR tends to infinity:

$$\liminf_{\mathrm{SNR}\to\infty} \epsilon^2_\ell(t) = 0. \tag{7.45}$$

(b) For $L > \frac{1}{2\lambda_D}$, we have for all $\ell = n_t,\dots,L-1$, $t = 1,\dots,n_t$

$$\liminf_{\mathrm{SNR}\to\infty} \epsilon^2_\ell(t) > 0. \tag{7.46}$$
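The dichotomy between cases (a) and (b) can be checked numerically. The sketch below is illustrative only (it is not part of the thesis): it assumes a rectangular psd $f_H(\lambda) = \frac{1}{2\lambda_D}$ for $|\lambda|\le\lambda_D$, builds the folded spectrum of (7.43), and evaluates (7.42) by discretising the integral. With $\lambda_D = 0.25$ (so $\frac{1}{2\lambda_D} = 2$), the error variance decays with SNR for $L = 2$ but saturates for $L = 4$.

```python
import numpy as np

LAM_D = 0.25                                 # fading bandwidth; 1/(2*LAM_D) = 2

def f_bar(lam):
    """Periodic continuation of the rectangular psd f_H (unit variance)."""
    lam = np.mod(lam + 0.5, 1.0) - 0.5       # wrap into [-1/2, 1/2)
    return np.where(np.abs(lam) <= LAM_D, 1.0 / (2 * LAM_D), 0.0)

def f_L(lam, L, ell):
    """Folded spectrum f_{H_{L,ell}} of (7.43)."""
    nu = np.arange(L)[:, None]
    return np.sum(f_bar((lam - nu) / L)
                  * np.exp(2j * np.pi * ell * (lam - nu) / L), axis=0) / L

def eps2(L, ell, t, snr, n_t=1, N=20001):
    """Interpolation-error variance (7.42) by numerical integration."""
    lam = np.linspace(-0.5, 0.5, N)
    num = snr * np.abs(f_L(lam, L, ell - t + 1)) ** 2
    den = snr * f_L(lam, L, 0).real + n_t
    return 1.0 - np.mean(num / den)          # integral over an interval of length 1

# L = 2 = 1/(2*LAM_D): the error variance vanishes with SNR (case (a)).
# L = 4 > 1/(2*LAM_D): the error variance stays bounded away from zero (case (b)).
for snr in (1e2, 1e4, 1e6):
    print(snr, eps2(2, 1, 1, snr), eps2(4, 1, 1, snr))
```

For this rectangular psd, (7.44) evaluates in closed form to $1/(\mathrm{SNR}+1)$ when $L = 2$, which the numerical integral reproduces.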
Proof. 1. By the orthogonality principle [78], we have that $\hat{H}^{(T)}_k(r,t)$ and $E^{(T)}_k(r,t)$ are uncorrelated. Noting that the pilot symbol is unity, we can write (7.40) as

$$\hat{H}^{(T)}_k(r,t) = \sum_{\substack{k'=k-TL\\ k'\in\mathcal{P}}}^{k+TL} a_{k'}(r,t)\left(\sqrt{\frac{\mathrm{SNR}}{n_t}}\,H_{k'}(r,t) + Z_{k'}(r)\right), \qquad k\in\mathcal{D}. \tag{7.47}$$

Since the processes $\{H_k(r,t),\ k\in\mathbb{Z}\}$ and $\{Z_k(r),\ k\in\mathbb{Z}\}$ are zero-mean complex-Gaussian processes, we have from (7.47) and the orthogonality principle [78] that $\hat{H}^{(T)}_k(r,t)$ and $E^{(T)}_k(r,t)$ are zero-mean independent complex-Gaussian random variables.
2. Let $k = jL+\ell$ and $\ell = k \bmod L$. Without loss of generality, assume that for $k\in\mathcal{D}$, we have $\ell = n_t,\dots,L-1$, and for $k\in\mathcal{P}$, we have $\ell = 0,\dots,n_t-1$. Since the pilot vectors are transmitted sequentially from $p_1$ to $p_{n_t}$, we have for $k\in\mathcal{P}$ that

$$p_t = p_{\ell+1}, \qquad \ell = 0,\dots,n_t-1 \tag{7.48}$$

namely, the $(\ell+1)$-th pilot vector, $\ell = 0,\dots,n_t-1$, is used to estimate the fading coefficients from transmit antenna $t$. To estimate $H_k(r,t)$, there is no loss in optimality in considering only the outputs $Y_{k'}(r)$ for $k'\in\mathcal{P}$, $k'\in[k-TL, k+TL]$ satisfying

$$k' \bmod L = t-1. \tag{7.49}$$

Indeed, the channel outputs $Y_{k'}(r)$, $k' \bmod L \neq t-1$ correspond to $H_{k'}(r,t')$, $t'\neq t$, which are independent of $H_k(r,t)$. It follows from [94] that for the estimation at $k = jL+\ell$, the optimal coefficients $a_{k'}(r,t)$ (which minimise the mean-squared error) depend only on $L$ and $\ell$. The fading estimate (7.40) and its corresponding estimation error can then be expressed as

$$\begin{aligned}
\hat{H}^{(T)}_{jL+\ell}(r,t) &= \sum_{\tau=-T}^{T-1} a_{-\tau L,\ell}(r,t)\,Y_{(j-\tau)L+t-1}(r) \tag{7.50} \\
&= \sum_{\tau=-T}^{T-1} a_{-\tau L,\ell}(r,t)\left(\sqrt{\frac{\mathrm{SNR}}{n_t}}\,H_{(j-\tau)L+t-1}(r,t) + Z_{(j-\tau)L+t-1}(r)\right), \tag{7.51}
\end{aligned}$$

$$E^{(T)}_{jL+\ell}(r,t) = H_{jL+\ell}(r,t) - \hat{H}^{(T)}_{jL+\ell}(r,t). \tag{7.52}$$
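For concreteness, the coefficients $a_{-\tau L,\ell}(r,t)$ solve the standard LMMSE (Wiener) normal equations. The sketch below is an illustration under stated assumptions, not the thesis's derivation: it takes $t = 1$, unit pilot symbols, and a flat psd of bandwidth $\lambda_D$, for which the fading autocorrelation is $r_H(m) = \mathrm{sinc}(2\lambda_D m)$; it then computes the coefficients of (7.50) and the resulting interpolation-error variance.

```python
import numpy as np

def mmse_interp_coeffs(ell, L, T, snr, n_t, lam_D):
    """LMMSE coefficients a_{-tau L, ell} for estimating H_{jL+ell}(r, t=1)
    from the 2T pilot observations Y_{(j-tau)L}(r), tau = -T, ..., T-1,
    cf. (7.50).  Assumes a flat psd of bandwidth lam_D, so that the
    autocorrelation is r_H(m) = sinc(2*lam_D*m)."""
    taus = np.arange(-T, T)                  # pilot block offsets
    pos = -taus * L                          # pilot times relative to jL
    r_H = lambda m: np.sinc(2 * lam_D * m)   # fading autocorrelation
    g = snr / n_t                            # pilot observation gain
    # pilot observation: Y = sqrt(SNR/n_t) H + Z  (unit pilot symbol)
    Ryy = g * r_H(pos[:, None] - pos[None, :]) + np.eye(2 * T)
    ryh = np.sqrt(g) * r_H(ell - pos)        # cross-correlation with H_{jL+ell}
    a = np.linalg.solve(Ryy, ryh)            # normal equations
    mse = 1.0 - ryh @ a                      # interpolation-error variance
    return a, mse
```

As expected of an LMMSE estimator, the error variance decreases monotonically as the SNR grows, and lies between 0 and 1.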
We note that the $n_r\cdot n_t$ processes $\{H_k(r,t),\ k\in\mathbb{Z}\}$ are independent of each other and have the same law. We have the following results.

(a) We observe in (7.51) that for a given $t$, the time differences between the index of interest ($jL+\ell$) and the positions of the pilots ($(j-\tau)L+t-1$, $\tau = -T,\dots,T-1$) are the same for all $r = 1,\dots,n_r$. It thus follows from [94] that for a given $t$, the optimal coefficients $a_{-\tau L,\ell}(r,t)$ are identical for all $r = 1,\dots,n_r$. This implies that for a given $t$ and $\ell$, the $n_r$ processes $\{(\hat{H}^{(T)}_{jL+\ell}(r,t), E^{(T)}_{jL+\ell}(r,t)),\ j\in\mathbb{Z}\}$, $r = 1,\dots,n_r$ corresponding to the channel estimation at the $n_r$ receive antennas are independent and have the same law.

(b) We also observe in (7.51) that for a given $r$, the time differences between the index of interest ($jL+\ell$) and the positions of the pilots ($(j-\tau)L+t-1$, $\tau = -T,\dots,T-1$) are different for $t = 1,\dots,n_t$. It thus follows from [94] that for a given $r$, the optimal coefficients $a_{-\tau L,\ell}(r,t)$ are generally different for $t = 1,\dots,n_t$. This implies that for a given $r$ and $\ell$, the $n_t$ processes $\{(\hat{H}^{(T)}_{jL+\ell}(r,t), E^{(T)}_{jL+\ell}(r,t)),\ j\in\mathbb{Z}\}$, $t = 1,\dots,n_t$ are independent but have different laws.
3. We apply existing results on stationary processes, in particular on weak mixing and ergodicity. (The definitions of these notions can be found in [95].) Since $\{H_k,\ k\in\mathbb{Z}\}$ is an ergodic Gaussian process, it is also weakly mixing [96]. Since $\{Z_k,\ k\in\mathbb{Z}\}$ is an i.i.d. Gaussian process and independent of $\{H_k,\ k\in\mathbb{Z}\}$, it follows from [97, Prop. 1.6] that the joint process $\{(H_k, Z_k),\ k\in\mathbb{Z}\}$ is ergodic.

To understand the behaviour of the process $\{(H_{jL+\ell}, \hat{H}^{(T)}_{jL+\ell}, Z_{jL+\ell}),\ j\in\mathbb{Z}\}$, $\ell\in\{n_t,\dots,L-1\}$, we first consider another random matrix $\check{H}_k$, whose entry at row $r$ and column $t$ is defined similarly to (7.47) but with the restriction $k'\in\mathcal{P}$ removed and with the same set of coefficients $a_{k'}(r,t)$, $k' = k-TL,\dots,k+TL$ for all $k\in\mathbb{Z}$, i.e.,

$$\check{H}_k(r,t) = \sum_{k'=k-TL}^{k+TL} a_{k'}(r,t)\left(\sqrt{\frac{\mathrm{SNR}}{n_t}}\,H_{k'}(r,t) + Z_{k'}(r)\right). \tag{7.53}$$

We can see that the joint process $\{(H_k, \check{H}_k, Z_k),\ k\in\mathbb{Z}\}$ is Gaussian. For any process $\{U_k,\ k\in\mathbb{Z}\}$, denote by $U^{k^*}_k$, $k^* > k$ the sequence $U_k, U_{k+1},\dots,U_{k^*}$. Then, we can express $(H_k, \check{H}_k, Z_k)$ as the output of a time-invariant multivariate function of $\{(H_k, Z_k),\ k\in\mathbb{Z}\}$, i.e.,

$$(H_k, \check{H}_k, Z_k) = \phi\left(H^{k+TL}_{k-TL}, Z^{k+TL}_{k-TL}\right). \tag{7.54}$$

Since $\{(H_k, Z_k),\ k\in\mathbb{Z}\}$ is ergodic, and since $\{(H_k, \check{H}_k, Z_k),\ k\in\mathbb{Z}\}$ is Gaussian, it follows from [96] that the joint process $\{(H_k, \check{H}_k, Z_k),\ k\in\mathbb{Z}\}$ is weakly mixing. Furthermore, as a consequence of the vector process

$$\left(H^{(j+1)L-1}_{jL}, \check{H}^{(j+1)L-1}_{jL}, Z^{(j+1)L-1}_{jL}\right),\ j\in\mathbb{Z}$$

being weakly mixing [96], the undersampled process

$$(H_{jL+\ell}, \check{H}_{jL+\ell}, Z_{jL+\ell}),\ j\in\mathbb{Z}$$

for each $\ell = 0,\dots,L-1$ is weakly mixing, which implies ergodicity [96].

From the proof in Part 2), we note for $\hat{H}^{(T)}_k(r,t)$ in (7.47) with $k = jL+\ell$ that the optimal coefficients $a_{k'}(r,t)$ do not depend on $j$, and can thus be expressed as $a_{-\tau L,\ell}$. For each $\ell = n_t,\dots,L-1$, we can then view $\{(H_{jL+\ell}, \hat{H}^{(T)}_{jL+\ell}, Z_{jL+\ell}),\ j\in\mathbb{Z}\}$ as a special case of $\{(H_{jL+\ell}, \check{H}_{jL+\ell}, Z_{jL+\ell}),\ j\in\mathbb{Z}\}$, where the values of $a_{k'}(r,t)$ in (7.53) are given by

$$a_{k'}(r,t) = \begin{cases} a_{-\tau L,\ell}(r,t), & \text{if } k' = (j-\tau)L+t-1,\ \tau\in\{-T,\dots,T-1\} \\ 0, & \text{otherwise.} \end{cases} \tag{7.55}$$

It thus follows that the undersampled process $\{(H_{jL+\ell}, \hat{H}^{(T)}_{jL+\ell}, Z_{jL+\ell}),\ j\in\mathbb{Z}\}$ for each $\ell = n_t,\dots,L-1$ is ergodic.

By applying [87, Lemma 2], since $\{X_{jL+\ell},\ j\in\mathbb{Z}\}$ is i.i.d. and independent of $\{(H_k, Z_k),\ k\in\mathbb{Z}\}$, we have that the joint process

$$(H_{jL+\ell}, \hat{H}^{(T)}_{jL+\ell}, X_{jL+\ell}, Z_{jL+\ell}),\ j\in\mathbb{Z}$$

is ergodic.
4. Note that $\ell = n_t,\dots,L-1$ corresponds to $k\in\mathcal{D}$. We then have that

$$\mathbb{E}\left[Z^\dagger_\ell \hat{H}^{(T)}_\ell X_\ell\right] = \mathbb{E}\left[\mathbb{E}\left[Z^\dagger_\ell \hat{H}^{(T)}_\ell X_\ell \,\middle|\, \hat{H}^{(T)}_\ell\right]\right]. \tag{7.56}$$

The process $\{\hat{H}^{(T)}_k,\ k\in\mathcal{D}\}$ is a function of $\{(H_k, Z_k),\ k\in\mathcal{P}\}$. Since $\{Z_k,\ k\in\mathcal{D}\}$ has zero mean and is independent of $\{(H_k, Z_k),\ k\in\mathcal{P}\}$ and $\{X_k,\ k\in\mathcal{D}\}$, it follows for any realisation $\hat{H}^{(T)}_\ell \in \mathbb{C}^{n_r\times n_t}$, $\ell = n_t,\dots,L-1$ that the inner expectation on the RHS of (7.56) is zero, which implies that the outer expectation is also zero. The same reasoning can be used to prove $\mathbb{E}[X^\dagger_\ell \hat{H}^{\dagger(T)}_\ell Z_\ell] = 0$.
5. As $T$ tends to infinity, we have from (7.52) that, irrespective of $j$ and $r$,

$$\begin{aligned}
\epsilon^2_\ell(t) &= \lim_{T\to\infty} \mathbb{E}\left[\left|E^{(T)}_{jL+\ell}(r,t)\right|^2\right] \tag{7.57} \\
&= \lim_{T\to\infty} \mathbb{E}\left[\left|H_{jL+\ell}(r,t) - \hat{H}^{(T)}_{jL+\ell}(r,t)\right|^2\right] \tag{7.58} \\
&= \lim_{T\to\infty} \mathbb{E}\left[\left(H_{jL+\ell}(r,t) - \hat{H}^{(T)}_{jL+\ell}(r,t)\right) H^*_{jL+\ell}(r,t)\right] \tag{7.59} \\
&= 1 - \lim_{T\to\infty} \mathbb{E}\left[\hat{H}^{(T)}_{jL+\ell}(r,t)\, H^*_{jL+\ell}(r,t)\right] \tag{7.60} \\
&= 1 - \sum_{\tau=-\infty}^{\infty} a_{-\tau L,\ell}(r,t)\sqrt{\mathrm{SNR}/n_t}\; \mathbb{E}\left[H_{(j-\tau)L+t-1}(r,t)\, H^*_{jL+\ell}(r,t)\right]. \tag{7.61}
\end{aligned}$$

Herein (7.59) follows from the orthogonality principle [78], and (7.61) follows from (7.51). Then, following the derivation in [94], we obtain

$$\begin{aligned}
\epsilon^2_\ell(t) &= 1 - \sum_{\tau=-\infty}^{\infty} a_{-\tau L,\ell}(r,t)\sqrt{\mathrm{SNR}/n_t}\; \mathbb{E}\left[H_{(j-\tau)L+t-1}(r,t)\, H^*_{jL+\ell}(r,t)\right] \tag{7.62} \\
&= 1 - \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\,|f_{H_{L,\ell-t+1}}(\lambda)|^2}{\mathrm{SNR}\,f_{H_{L,0}}(\lambda) + n_t}\,d\lambda \tag{7.63}
\end{aligned}$$

where

$$f_{H_{L,\ell}}(\lambda) = \frac{1}{L}\sum_{\nu=0}^{L-1} \bar{f}_H\!\left(\frac{\lambda-\nu}{L}\right) e^{i2\pi\ell\frac{\lambda-\nu}{L}}, \qquad -\frac{1}{2}\le\lambda\le\frac{1}{2} \tag{7.64}$$

and $\bar{f}_H(\cdot)$ is the periodic function of period $[-1/2, 1/2)$ that coincides with $f_H(\lambda)$ for $-1/2\le\lambda\le 1/2$.
The inverse of twice the bandwidth, $\frac{1}{2\lambda_D}$, determines the behaviour of $\epsilon^2_\ell(t)$ as the SNR tends to infinity.

(a) If $L \le \frac{1}{2\lambda_D}$, then it follows that

$$\bar{f}_H\!\left(\frac{\lambda-\nu}{L}\right) = 0, \qquad \nu = 1,\dots,L-1, \quad -\frac{1}{2}\le\lambda\le\frac{1}{2}, \tag{7.65}$$

in which case

$$f_{H_{L,\ell}}(\lambda) = \frac{1}{L}\sum_{\nu=0}^{L-1} \bar{f}_H\!\left(\frac{\lambda-\nu}{L}\right) e^{i2\pi\ell\frac{\lambda-\nu}{L}} = \frac{1}{L}\, f_H\!\left(\frac{\lambda}{L}\right) e^{i2\pi\frac{\ell}{L}\lambda}, \qquad -\frac{1}{2}\le\lambda\le\frac{1}{2}. \tag{7.66}$$

Combining (7.66) with (7.65), we obtain for the variance of the interpolation error

$$\begin{aligned}
\epsilon^2_\ell(t) &= 1 - \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\,|f_{H_{L,\ell-t+1}}(\lambda)|^2}{\mathrm{SNR}\,f_{H_{L,0}}(\lambda) + n_t}\,d\lambda \tag{7.67} \\
&= 1 - \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\,(f_H(\lambda))^2}{\mathrm{SNR}\,f_H(\lambda) + Ln_t}\,d\lambda, \qquad L\le\frac{1}{2\lambda_D} \tag{7.68}
\end{aligned}$$

irrespective of $\ell$ and $t$. Since $\hat{H}^{(T)}_{jL+\ell}(r,t)$ and $E^{(T)}_{jL+\ell}(r,t)$ are independent, it follows from (7.68) and (7.52) that

$$\lim_{T\to\infty} \mathbb{E}\left[\left|\hat{H}^{(T)}_{jL+\ell}(r,t)\right|^2\right] = \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\,(f_H(\lambda))^2}{\mathrm{SNR}\,f_H(\lambda) + Ln_t}\,d\lambda. \tag{7.69}$$
(b) We next analyse the interpolation error for $L > \frac{1}{2\lambda_D}$. To this end, we express $L$ as

$$L = \frac{1}{2\lambda_D} + \varepsilon \tag{7.70}$$

for some $\varepsilon > 0$. The variance of the interpolation error (7.42) can be decomposed into two integrals:

$$\epsilon^2_\ell(t) = \int_{-1/2}^{1/2} \frac{n_t\,f_{H_{L,0}}(\lambda)}{\mathrm{SNR}\,f_{H_{L,0}}(\lambda) + n_t}\,d\lambda + \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\left((f_{H_{L,0}}(\lambda))^2 - |f_{H_{L,\ell-t+1}}(\lambda)|^2\right)}{\mathrm{SNR}\,f_{H_{L,0}}(\lambda) + n_t}\,d\lambda. \tag{7.71}$$

Let $\ell' \triangleq \ell - t + 1$. We have that

$$\begin{aligned}
(f_{H_{L,0}}(\lambda))^2 - |f_{H_{L,\ell'}}(\lambda)|^2 &= \frac{1}{L^2}\sum_{\nu=0}^{L-1}\sum_{\substack{\nu'=0\\ \nu'\neq\nu}}^{L-1} \bar{f}_H\!\left(\frac{\lambda-\nu}{L}\right)\bar{f}_H\!\left(\frac{\lambda-\nu'}{L}\right)\left(1 - e^{\imath 2\pi\ell'\frac{\lambda-\nu}{L}}\, e^{-\imath 2\pi\ell'\frac{\lambda-\nu'}{L}}\right) \tag{7.72} \\
&= \frac{2}{L^2}\sum_{\nu=0}^{L-1}\sum_{\nu'>\nu}^{L-1} \bar{f}_H\!\left(\frac{\lambda-\nu}{L}\right)\bar{f}_H\!\left(\frac{\lambda-\nu'}{L}\right)\left(1 - \cos\left(2\pi\ell'\,\frac{\nu'-\nu}{L}\right)\right). \tag{7.73}
\end{aligned}$$
Since the summands are non-negative, we obtain the following lower bound by considering only the summand corresponding to $\nu = 0$ and $\nu' = 1$:

$$(f_{H_{L,0}}(\lambda))^2 - |f_{H_{L,\ell'}}(\lambda)|^2 \ge \frac{2}{L^2}\,\bar{f}_H\!\left(\frac{\lambda}{L}\right)\bar{f}_H\!\left(\frac{\lambda-1}{L}\right)\left(1 - \cos\left(\frac{2\pi\ell'}{L}\right)\right). \tag{7.74}$$

We note that for every $k\in\mathcal{D}$, we have $\ell = k \bmod L \ge n_t$. It therefore follows that $1 \le \ell' \le L-1$, which implies

$$1 - \cos\left(\frac{2\pi\ell'}{L}\right) > 0. \tag{7.75}$$

By the definition of $\bar{f}_H(\cdot)$, it follows that for $L = \frac{1}{2\lambda_D}+\varepsilon$, $\bar{f}_H\!\left(\frac{\lambda}{L}\right)$ and $\bar{f}_H\!\left(\frac{\lambda-1}{L}\right)$ overlap on an interval $\mathcal{L} \subseteq [-1/2, 1/2]$. Indeed, it can be shown that for $L = \frac{1}{2\lambda_D}+\varepsilon$, the Lebesgue measure of $\mathcal{L}$ on the interval $-1/2\le\lambda\le 1/2$ is given by

$$\mu(\mathcal{L}) = \min(1, 2\lambda_D\varepsilon). \tag{7.76}$$
The second integral in (7.71) can then be lower-bounded as

$$\int_{-1/2}^{1/2} \frac{\mathrm{SNR}\left((f_{H_{L,0}}(\lambda))^2 - |f_{H_{L,\ell'}}(\lambda)|^2\right)}{\mathrm{SNR}\,f_{H_{L,0}}(\lambda) + n_t}\,d\lambda \ge \frac{2\left(1-\cos\left(\frac{2\pi\ell'}{L}\right)\right)}{L^2} \int_{\mathcal{L}} \frac{\mathrm{SNR}\,\bar{f}_H\!\left(\frac{\lambda}{L}\right)\bar{f}_H\!\left(\frac{\lambda-1}{L}\right)}{\mathrm{SNR}\,f_{H_{L,0}}(\lambda) + n_t}\,d\lambda. \tag{7.77}$$

Computing the limit as the SNR tends to infinity and applying Fatou's lemma [41] yield

$$\begin{aligned}
&\liminf_{\mathrm{SNR}\to\infty}\; \frac{2\left(1-\cos\left(\frac{2\pi\ell'}{L}\right)\right)}{L^2} \int_{\mathcal{L}} \frac{\mathrm{SNR}\,\bar{f}_H\!\left(\frac{\lambda}{L}\right)\bar{f}_H\!\left(\frac{\lambda-1}{L}\right)}{\mathrm{SNR}\,f_{H_{L,0}}(\lambda) + n_t}\,d\lambda \\
&\qquad\ge \frac{2\left(1-\cos\left(\frac{2\pi\ell'}{L}\right)\right)}{L^2} \int_{\mathcal{L}} \liminf_{\mathrm{SNR}\to\infty} \frac{\mathrm{SNR}\,\bar{f}_H\!\left(\frac{\lambda}{L}\right)\bar{f}_H\!\left(\frac{\lambda-1}{L}\right)}{\mathrm{SNR}\,f_{H_{L,0}}(\lambda) + n_t}\,d\lambda \tag{7.78} \\
&\qquad= \frac{2\left(1-\cos\left(\frac{2\pi\ell'}{L}\right)\right)}{L^2} \int_{\mathcal{L}} \frac{\bar{f}_H\!\left(\frac{\lambda}{L}\right)\bar{f}_H\!\left(\frac{\lambda-1}{L}\right)}{f_{H_{L,0}}(\lambda)}\,d\lambda. \tag{7.79}
\end{aligned}$$

Since $\mathcal{L}$ is of positive Lebesgue measure, and since the integrand on the RHS of (7.79) is strictly positive, it follows from [98] that

$$\int_{\mathcal{L}} \frac{\bar{f}_H\!\left(\frac{\lambda}{L}\right)\bar{f}_H\!\left(\frac{\lambda-1}{L}\right)}{f_{H_{L,0}}(\lambda)}\,d\lambda > 0. \tag{7.80}$$

Together with (7.75), this makes the second integral in (7.71) bounded away from zero as the SNR tends to infinity, which implies that the variance of the interpolation error $\epsilon^2_\ell(t)$ is also bounded away from zero whenever $L > \frac{1}{2\lambda_D}$.
7.4.1.2 Achievable Rates and Pre-Logs

We first note that it suffices to consider the case where $n_t = n_r$. If $n_t > n_r$, we employ only $n_r$ transmit antennas, and if $n_r > n_t$, we ignore $n_r - n_t$ antennas at the receiver. In both cases this yields a lower bound on the maximum achievable rate.
To prove Theorem 7.1, we analyse the generalised mutual information (GMI) [36] for the channel and communication scheme in Section 7.2. The GMI, denoted by $I^{\mathrm{gmi}}_T(\mathrm{SNR})$, specifies the highest information rate for which the probability of error, averaged over the ensemble of i.i.d. Gaussian codebooks, tends to zero as the codeword length $n$ tends to infinity (see [4, 39, 54] and references therein). The GMI for stationary Gaussian channels employing nearest neighbour decoding has been evaluated in [39, 54], where explicit assumptions on the fading-estimate process are specified. However, since the fading estimate produced by the linear interpolator (7.7) has different statistics from those in [39, 54], the results on the GMI presented in [39, 54] do not directly extend to our case. Thus, for the sake of completeness, we re-derive $I^{\mathrm{gmi}}_T(\mathrm{SNR})$ using the fading estimate specified in Lemma 7.1.
1. We compute a lower bound on $I^{\mathrm{gmi}}_T(\mathrm{SNR})$ for a fixed window size $T$.

2. We analyse the behaviour of this lower bound as $T$ tends to infinity.

3. We evaluate the limiting ratio of this lower bound to $\log\mathrm{SNR}$ as SNR tends to infinity.
$I^{\mathrm{gmi}}_T(\mathrm{SNR})$ for a fixed $T$

Note that the linear interpolator (7.40) is used to estimate the fading at times $k\in\mathcal{D}$. At times $k\in\mathcal{P}$, no estimation is required. However, for the sake of completeness, we obtain the estimate at time $k\in\mathcal{P}$ using

$$\hat{H}^{(T)}_k(r,t) = \sqrt{\frac{n_t}{\mathrm{SNR}}}\,Y_k(r). \tag{7.81}$$
In order to evaluate the GMI for a fixed $T$, we need the following lemma.

Lemma 7.2. Consider the channel and the transmission model in Section 7.2. Let

$$\bar{n} = n + \left(\frac{n}{L-n_t} + 1\right) n_t - 1. \tag{7.82}$$

Without loss of generality, consider a codeword for which the first data vector is transmitted at time $k = n_t$. Define $F(\mathrm{SNR})$ as

$$F(\mathrm{SNR}) \triangleq n_r + \frac{\mathrm{SNR}}{(L-n_t)n_t} \sum_{\ell=n_t}^{L-1} \mathbb{E}\left[\left\|E^{(T)}_\ell\right\|^2_{\mathrm{F}}\right] \tag{7.83}$$

and a typical set

$$\mathcal{T}_\delta \triangleq \left\{ \left(x_k, y_k, \hat{H}^{(T)}_k\right),\ k = 0,\dots,\bar{n} : \left| \frac{1}{n}\sum_{k\in\mathcal{D}} \left\| y_k - \sqrt{\mathrm{SNR}/n_t}\;\hat{H}^{(T)}_k x_k \right\|^2 - F(\mathrm{SNR}) \right| < \delta \right\} \tag{7.84}$$

for some $\delta > 0$. For any process $\{U_k,\ k\in\mathbb{Z}\}$, denote by $U^{k^*}_k$, $k^* > k$ the sequence $U_k, U_{k+1},\dots,U_{k^*}$. Then, it holds that

$$\lim_{n\to\infty} \Pr\left(\left(X^{\bar{n}}_0, Y^{\bar{n}}_0, \hat{H}^{(T),\bar{n}}_0\right) \in \mathcal{T}_\delta\right) = 1, \qquad \forall\,\delta > 0. \tag{7.85}$$
Proof. For $k\in\mathcal{D}$, the channel input vector $x_k$ corresponds to a data vector. Then, as the codeword length $n$ tends to infinity, we have the following limit:

$$\begin{aligned}
&\lim_{n\to\infty} \frac{1}{n}\sum_{k\in\mathcal{D}} \left\| \sqrt{\mathrm{SNR}/n_t}\left(H_k - \hat{H}^{(T)}_k\right)x_k + z_k \right\|^2 \\
&\quad= \frac{1}{L-n_t}\sum_{\ell=n_t}^{L-1} \lim_{n\to\infty} \frac{L-n_t}{n} \sum_{j=0}^{\frac{n}{L-n_t}-1} \left\| \sqrt{\mathrm{SNR}/n_t}\left(H_{jL+\ell} - \hat{H}^{(T)}_{jL+\ell}\right)x_{jL+\ell} + z_{jL+\ell} \right\|^2 \tag{7.86} \\
&\quad= \frac{1}{L-n_t}\sum_{\ell=n_t}^{L-1} \mathbb{E}\left[\left\| \sqrt{\mathrm{SNR}/n_t}\left(H_\ell - \hat{H}^{(T)}_\ell\right)X_\ell + Z_\ell \right\|^2\right], \quad \text{almost surely} \tag{7.87} \\
&\quad= \frac{1}{L-n_t}\sum_{\ell=n_t}^{L-1} \left( n_r + \frac{\mathrm{SNR}}{n_t}\,\mathbb{E}\left[\left\| E^{(T)}_\ell X_\ell \right\|^2_{\mathrm{F}}\right] \right) \tag{7.88} \\
&\quad= n_r + \frac{\mathrm{SNR}}{(L-n_t)n_t}\sum_{\ell=n_t}^{L-1} \mathbb{E}\left[\left\| E^{(T)}_\ell \right\|^2_{\mathrm{F}}\right]. \tag{7.89}
\end{aligned}$$

Herein equality (7.87) follows from the ergodicity condition in Part 3) of Lemma 7.1. Equality (7.88) follows from Part 4) of Lemma 7.1. Equality (7.89) follows since $X_\ell$ is independent of $E^{(T)}_\ell$ (as $\{E^{(T)}_k,\ k\in\mathcal{D}\}$ is a function of $\{(H_k, Z_k),\ k\in\mathcal{P}\}$) and is $\mathcal{N}_{n_t}(0, I_{n_t})$-distributed. This completes the proof of the lemma.
Let $P_e$ be the ensemble-average error probability and $P_e(m)$ be the ensemble-average error probability corresponding to message $m$. Due to the symmetry of the codebook construction, it suffices to consider the error behaviour conditioned on the event that message $m = 1$ is transmitted.

Let $\mathcal{E}(m')$ denote the event that $D(m') \le D(1)$. The probability of error is given by

$$P_e(1) = \Pr\left(\bigcup_{m'\neq 1} \mathcal{E}(m')\right). \tag{7.90}$$
Using the typical set $\mathcal{T}_\delta$ and its complement $\mathcal{T}^c_\delta$, the ensemble-average error probability can be upper-bounded as

$$\begin{aligned}
P_e(1) &= \Pr\left(\bigcup_{m'\neq 1} \mathcal{E}(m') \,\middle|\, \left(X^{\bar{n}}_0(1), Y^{\bar{n}}_0, \hat{H}^{(T),\bar{n}}_0\right) \in \mathcal{T}_\delta\right) \Pr\left(\left(X^{\bar{n}}_0(1), Y^{\bar{n}}_0, \hat{H}^{(T),\bar{n}}_0\right) \in \mathcal{T}_\delta\right) \\
&\quad + \Pr\left(\bigcup_{m'\neq 1} \mathcal{E}(m') \,\middle|\, \left(X^{\bar{n}}_0(1), Y^{\bar{n}}_0, \hat{H}^{(T),\bar{n}}_0\right) \in \mathcal{T}^c_\delta\right) \Pr\left(\left(X^{\bar{n}}_0(1), Y^{\bar{n}}_0, \hat{H}^{(T),\bar{n}}_0\right) \in \mathcal{T}^c_\delta\right) \tag{7.91} \\
&\le \Pr\left(\bigcup_{m'\neq 1} \mathcal{E}(m') \,\middle|\, \left(X^{\bar{n}}_0(1), Y^{\bar{n}}_0, \hat{H}^{(T),\bar{n}}_0\right) \in \mathcal{T}_\delta\right) + \Pr\left(\left(X^{\bar{n}}_0(1), Y^{\bar{n}}_0, \hat{H}^{(T),\bar{n}}_0\right) \in \mathcal{T}^c_\delta\right) \tag{7.92} \\
&\le e^{nR}\cdot\Pr\left(\frac{1}{n}\,D(m') < F(\mathrm{SNR}) + \delta \,\middle|\, \left(X^{\bar{n}}_0(1), Y^{\bar{n}}_0, \hat{H}^{(T),\bar{n}}_0\right) \in \mathcal{T}_\delta\right) + \Pr\left(\left(X^{\bar{n}}_0(1), Y^{\bar{n}}_0, \hat{H}^{(T),\bar{n}}_0\right) \in \mathcal{T}^c_\delta\right), \quad m'\neq 1 \tag{7.93}
\end{aligned}$$

where in the last inequality we have used the union bound and that, for $(X^{\bar{n}}_0(1), Y^{\bar{n}}_0, \hat{H}^{(T),\bar{n}}_0) \in \mathcal{T}_\delta$,

$$\frac{1}{n}\,D(1) < F(\mathrm{SNR}) + \delta. \tag{7.94}$$
It follows from Lemma 7.2 that the probability $\Pr\bigl((X^{\bar{n}}_0(1), Y^{\bar{n}}_0, \hat{H}^{(T),\bar{n}}_0) \in \mathcal{T}^c_\delta\bigr)$ can be made arbitrarily small by letting the codeword length $n$ tend to infinity.

The GMI characterises the rate of exponential decay of the expression

$$\Pr\left(\frac{1}{n}\,D(m') < F(\mathrm{SNR}) + \delta \,\middle|\, \left(X^{\bar{n}}_0(1), Y^{\bar{n}}_0, \hat{H}^{(T),\bar{n}}_0\right) \in \mathcal{T}_\delta\right), \qquad m'\neq 1 \tag{7.95}$$

to zero as $\delta\downarrow 0$ [39]. The computation of the GMI requires the conditional log moment-generating function of the metric $D(m')$ associated with the wrong message $m'\neq 1$, conditioned on the channel outputs and on the fading estimates, i.e.,

$$\kappa_n(\theta, \mathrm{SNR}) = \log \mathbb{E}\left[ \exp\left(\frac{\theta}{n}\sum_{k\in\mathcal{D}} D_k(m')\right) \,\middle|\, \left\{\left(y_k, \hat{H}^{(T)}_k\right),\ k\in\mathcal{D}\right\} \right] \tag{7.96}$$

where

$$D_k(m') \triangleq \left\| y_k - \sqrt{\mathrm{SNR}/n_t}\;\hat{H}^{(T)}_k x_k(m') \right\|^2. \tag{7.97}$$
Following along the lines of [39, 54], we can express the conditional log moment-generating function in (7.96) as the sum of conditional log moment-generating functions for the individual vector metrics $D_k(m')$, $k\in\mathcal{D}$, i.e.,

$$\log \mathbb{E}\left[ \exp\left(\frac{\theta}{n}\sum_{k\in\mathcal{D}} D_k(m')\right) \,\middle|\, \left\{\left(y_k, \hat{H}^{(T)}_k\right),\ k\in\mathcal{D}\right\} \right] = \sum_{k\in\mathcal{D}} \log \mathbb{E}\left[ \exp\left(\frac{\theta}{n}\,D_k(m')\right) \,\middle|\, y_k, \hat{H}^{(T)}_k \right]. \tag{7.98}$$
The expectation on the RHS of (7.98) can be evaluated as

$$\begin{aligned}
\mathbb{E}\left[ \exp\left(\frac{\theta}{n}\,D_k(m')\right) \,\middle|\, y_k, \hat{H}^{(T)}_k \right]
&= \int_{x_k} \frac{1}{\pi^{n_t}} \exp\left( -\|x_k\|^2 + \frac{\theta}{n}\left\| y_k - \sqrt{\mathrm{SNR}/n_t}\;\hat{H}^{(T)}_k x_k \right\|^2 \right) dx_k \tag{7.99} \\
&= \frac{1}{\det\left(I_{n_r} - \frac{\theta}{n}\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_k\hat{H}^{\dagger(T)}_k\right)} \exp\left( \frac{\theta}{n}\, y^\dagger_k \left(I_{n_r} - \frac{\theta}{n}\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_k\hat{H}^{\dagger(T)}_k\right)^{-1} y_k \right) \tag{7.100}
\end{aligned}$$

where the integral can be evaluated in the same way as in [39, App. A]. This yields

$$\kappa_n(\theta, \mathrm{SNR}) = \sum_{k\in\mathcal{D}} \left( \frac{\theta}{n}\, y^\dagger_k \left(I_{n_r} - \frac{\theta}{n}\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_k\hat{H}^{\dagger(T)}_k\right)^{-1} y_k - \log\det\left(I_{n_r} - \frac{\theta}{n}\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_k\hat{H}^{\dagger(T)}_k\right) \right). \tag{7.101}$$
We then have that for all $\theta < 0$

$$\begin{aligned}
\lim_{n\to\infty} \frac{1}{n}\,\kappa_n(n\theta, \mathrm{SNR})
&= \lim_{n\to\infty} \frac{1}{n}\sum_{k\in\mathcal{D}} \theta\, y^\dagger_k \left(I_{n_r} - \theta\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_k\hat{H}^{\dagger(T)}_k\right)^{-1} y_k - \lim_{n\to\infty} \frac{1}{n}\sum_{k\in\mathcal{D}} \log\det\left(I_{n_r} - \theta\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_k\hat{H}^{\dagger(T)}_k\right) \tag{7.102} \\
&= \frac{1}{L-n_t}\sum_{\ell=n_t}^{L-1} \lim_{n\to\infty} \frac{L-n_t}{n} \sum_{j=0}^{\frac{n}{L-n_t}-1} \theta\, y^\dagger_{jL+\ell} \left(I_{n_r} - \theta\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_{jL+\ell}\hat{H}^{\dagger(T)}_{jL+\ell}\right)^{-1} y_{jL+\ell} \\
&\quad - \frac{1}{L-n_t}\sum_{\ell=n_t}^{L-1} \lim_{n\to\infty} \frac{L-n_t}{n} \sum_{j=0}^{\frac{n}{L-n_t}-1} \log\det\left(I_{n_r} - \theta\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_{jL+\ell}\hat{H}^{\dagger(T)}_{jL+\ell}\right) \tag{7.103} \\
&= \frac{1}{L-n_t}\sum_{\ell=n_t}^{L-1} \mathbb{E}\left[ \theta\, Y^\dagger_\ell \left(I_{n_r} - \theta\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_\ell\hat{H}^{\dagger(T)}_\ell\right)^{-1} Y_\ell \right] - \frac{1}{L-n_t}\sum_{\ell=n_t}^{L-1} \mathbb{E}\left[ \log\det\left(I_{n_r} - \theta\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_\ell\hat{H}^{\dagger(T)}_\ell\right) \right], \quad \text{almost surely} \tag{7.104}
\end{aligned}$$

where the convergence in (7.104) is due to the ergodicity of $\{(Y_{jL+\ell}, \hat{H}^{(T)}_{jL+\ell}),\ j\in\mathbb{Z}\}$, $\ell = n_t,\dots,L-1$, which follows from Part 3) of Lemma 7.1.
Following the same steps as in [39, 54], we can show that for all $\delta' > 0$, the ensemble-average error probability can be bounded as

$$P_e(1) \le \exp(nR)\exp\left(-n\left(I^{\mathrm{gmi}}_T(\mathrm{SNR}) - \delta'\right)\right) + \varepsilon(\delta', n) \tag{7.105}$$

for some function $\varepsilon(\delta', n)$ such that

$$\lim_{n\to\infty} \varepsilon(\delta', n) = 0. \tag{7.106}$$

Here $I^{\mathrm{gmi}}_T(\mathrm{SNR})$ denotes the GMI as a function of SNR for a fixed $T$, given by

$$I^{\mathrm{gmi}}_T(\mathrm{SNR}) = \frac{L-n_t}{L}\left( \sup_{\theta<0}\left(\theta F(\mathrm{SNR}) - \kappa(\theta, \mathrm{SNR})\right) \right) \tag{7.107}$$
where $\kappa(\theta, \mathrm{SNR})$ is defined by the RHS of (7.104):

$$\kappa(\theta, \mathrm{SNR}) \triangleq \frac{1}{L-n_t}\sum_{\ell=n_t}^{L-1} \mathbb{E}\left[ \theta\, Y^\dagger_\ell \left(I_{n_r} - \theta\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_\ell\hat{H}^{\dagger(T)}_\ell\right)^{-1} Y_\ell \right] - \frac{1}{L-n_t}\sum_{\ell=n_t}^{L-1} \mathbb{E}\left[ \log\det\left(I_{n_r} - \theta\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_\ell\hat{H}^{\dagger(T)}_\ell\right) \right]. \tag{7.108}$$
Herein the pre-factor $(L-n_t)/L$ comes from the fraction of time used for data transmission. The bound (7.105) implies that for rates below $I^{\mathrm{gmi}}_T(\mathrm{SNR})$, the communication scheme described in Section 7.2 has vanishing error probability as $n$ tends to infinity. Combining (7.83) and (7.104) with (7.107) yields

$$I^{\mathrm{gmi}}_T(\mathrm{SNR}) = \sup_{\theta<0}\; \frac{1}{L}\sum_{\ell=n_t}^{L-1} \left\{ \theta\left( n_r + \frac{\mathrm{SNR}}{n_t}\,\mathbb{E}\left[\left\|E^{(T)}_\ell\right\|^2_{\mathrm{F}}\right] \right) + \mathbb{E}\left[ \log\det\left(I_{n_r} - \theta\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_\ell\hat{H}^{\dagger(T)}_\ell\right) \right] - \mathbb{E}\left[ \theta\, Y^\dagger_\ell \left(I_{n_r} - \theta\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_\ell\hat{H}^{\dagger(T)}_\ell\right)^{-1} Y_\ell \right] \right\}. \tag{7.109}$$
Following the steps used in Appendix A.3, it can be shown that for $\theta < 0$

$$-\mathbb{E}\left[ \theta\, Y^\dagger_\ell \left(I_{n_r} - \theta\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_\ell\hat{H}^{\dagger(T)}_\ell\right)^{-1} Y_\ell \right] \ge 0. \tag{7.110}$$

As observed in Appendix A.3, a good lower bound on $I^{\mathrm{gmi}}_T(\mathrm{SNR})$ at high SNR follows by choosing

$$\theta = \frac{-1}{n_r + (\mathrm{SNR}/n_t)\,n_t n_r\,\epsilon^2_{*,T}} \tag{7.111}$$

where

$$\epsilon^2_{*,T} = \max_{\substack{r=1,\dots,n_r,\ t=1,\dots,n_t,\\ \ell=n_t,\dots,L-1}} \epsilon^2_{\ell,T}(r,t). \tag{7.112}$$

Hence, substituting the choice of $\theta$ in (7.111) and applying (7.110) to the RHS of (7.109) yield

$$I^{\mathrm{gmi}}_T(\mathrm{SNR}) \ge \frac{1}{L}\sum_{\ell=n_t}^{L-1} \left\{ \mathbb{E}\left[ \log\det\left(I_{n_r} + \frac{\mathrm{SNR}}{n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2_{*,T}}\,\hat{H}^{(T)}_\ell\hat{H}^{\dagger(T)}_\ell\right) \right] - 1 \right\}. \tag{7.113}$$
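To get a feel for the bound (7.113), one can evaluate it by Monte Carlo. The sketch below is illustrative only (not from the thesis): it assumes the limiting $T\to\infty$ statistics, i.e., the entries of $\hat{H}^{(T)}_\ell$ are i.i.d. $\mathcal{CN}(0, 1-\epsilon^2_*)$, so the expectation no longer depends on $\ell$ and the sum collapses to $(L-n_t)$ identical terms.

```python
import numpy as np

rng = np.random.default_rng(0)

def gmi_lb(snr, L, n_t, n_r, eps2_star, trials=4000):
    """Monte Carlo evaluation of the GMI lower bound (7.113), assuming
    i.i.d. CN(0, 1 - eps2_star) fading-estimate entries (T -> infinity)."""
    c = snr / (n_t * n_r + n_t * n_r * snr * eps2_star)
    acc = 0.0
    for _ in range(trials):
        # fading estimate with per-entry variance 1 - eps2_star
        H = np.sqrt((1.0 - eps2_star) / 2.0) * (
            rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t)))
        _, logdet = np.linalg.slogdet(np.eye(n_r) + c * (H @ H.conj().T))
        acc += logdet
    # (L - n_t) identical summands in (7.113), each E[log det(...)] - 1
    return (L - n_t) / L * (acc / trials - 1.0)

print(gmi_lb(100.0, 4, 2, 2, 0.01))   # nats per channel use
```

Since the effective gain $c$ grows with SNR, the bound increases with SNR, but the residual estimation error $\epsilon^2_*$ eventually limits $c$ to $1/(n_t n_r \epsilon^2_*)$.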
Limit $T\to\infty$

We next analyse the RHS of (7.113) in the limit as $T$ tends to infinity. To this end, we note that, for $L \le \frac{1}{2\lambda_D}$, the interpolation error tends to (7.68), namely

$$\epsilon^2_\ell(t) = 1 - \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\,(f_H(\lambda))^2}{\mathrm{SNR}\,f_H(\lambda) + Ln_t}\,d\lambda \tag{7.114}$$

irrespective of $\ell$ and $t$. We shall therefore denote the variance of the interpolation error $\epsilon^2_\ell(t)$ by $\epsilon^2$. Note that for a fixed $T$, the entries of

$$\frac{1}{\sqrt{n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2_{*,T}}}\,\hat{H}^{(T)}_\ell \tag{7.115}$$

are independent of each other but not i.i.d., which follows from Part 2) of Lemma 7.1. However, as $T$ tends to infinity, their distributions become identical due to (7.68) and (7.69), and hence they converge in distribution:

$$\frac{1}{\sqrt{n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2_{*,T}}}\,\hat{H}^{(T)}_\ell \;\xrightarrow{\;d\;}\; \frac{1}{\sqrt{n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2}}\,\hat{H} \tag{7.116}$$

where the entries of $\hat{H}$ are i.i.d. complex-Gaussian random variables with zero mean and variance $(1-\epsilon^2)$.

Note that

$$\log\det\left(I_{n_r} + \frac{\mathrm{SNR}}{n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2_{*,T}}\,\hat{H}^{(T)}_\ell\hat{H}^{\dagger(T)}_\ell\right) \ge 0 \tag{7.117}$$

is a continuous function of the entries of the matrix

$$\frac{1}{n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2_{*,T}}\,\hat{H}^{(T)}_\ell\hat{H}^{\dagger(T)}_\ell. \tag{7.118}$$

It follows from the Portmanteau lemma [99] that for $T\to\infty$, the RHS of (7.113) can be lower-bounded by

$$\lim_{T\to\infty} \frac{1}{L}\sum_{\ell=n_t}^{L-1} \left\{ \mathbb{E}\left[ \log\det\left(I_{n_r} + \frac{\mathrm{SNR}}{n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2_{*,T}}\,\hat{H}^{(T)}_\ell\hat{H}^{\dagger(T)}_\ell\right) \right] - 1 \right\} \ge \frac{L-n_t}{L}\left\{ \mathbb{E}\left[ \log\det\left(I_{n_r} + \frac{\mathrm{SNR}}{n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2}\,\hat{H}\hat{H}^\dagger\right) \right] - 1 \right\}. \tag{7.119}$$
Applying the lower bound $\log\det(I + A) \ge \log\det A$, we further have that

$$\frac{L-n_t}{L}\left\{ \mathbb{E}\left[ \log\det\left(I_{n_r} + \frac{\mathrm{SNR}}{n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2}\,\hat{H}\hat{H}^\dagger\right) \right] - 1 \right\} \ge \frac{L-n_t}{L}\left( \mathbb{E}\left[ \log\det\left(\frac{\mathrm{SNR}}{n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2}\,\hat{H}\hat{H}^\dagger\right) \right] - 1 \right). \tag{7.120}$$

Combining (7.120) with (7.119) and (7.113) yields

$$\begin{aligned}
I^{\mathrm{gmi}}(\mathrm{SNR}) &= \lim_{T\to\infty} I^{\mathrm{gmi}}_T(\mathrm{SNR}) \tag{7.121} \\
&\ge \frac{L-n_t}{L}\left( n_t\log\mathrm{SNR} - n_t\log\left(n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2\right) + \mathbb{E}\left[\log\det \hat{H}\hat{H}^\dagger\right] - 1 \right). \tag{7.122}
\end{aligned}$$
Limit SNR $\to\infty$

In the following, we compute a lower bound on the pre-log by computing the limiting ratio of the RHS of (7.122) to $\log\mathrm{SNR}$ as SNR tends to infinity. To this end, we first consider

$$\begin{aligned}
\mathrm{SNR}\,\epsilon^2 &= \mathrm{SNR}\left( 1 - \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\,(f_H(\lambda))^2}{\mathrm{SNR}\,f_H(\lambda) + Ln_t}\,d\lambda \right) \tag{7.123} \\
&= \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\,f_H(\lambda)\,Ln_t}{\mathrm{SNR}\,f_H(\lambda) + Ln_t}\,d\lambda. \tag{7.124}
\end{aligned}$$

Since the integrand is bounded by

$$0 \le \frac{\mathrm{SNR}\,f_H(\lambda)\,Ln_t}{\mathrm{SNR}\,f_H(\lambda) + Ln_t} \le Ln_t \tag{7.125}$$

it follows that $0 \le \mathrm{SNR}\,\epsilon^2 \le Ln_t$, which implies that

$$\lim_{\mathrm{SNR}\to\infty} \frac{\log\left(n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2\right)}{\log\mathrm{SNR}} = 0. \tag{7.126}$$

We next consider the term $\mathbb{E}\left[\log\det \hat{H}\hat{H}^\dagger\right] - 1$. Note that by [90, Lemma A.2]

$$\mathbb{E}\left[\log\det \hat{H}\hat{H}^\dagger\right] - 1 = n_t\log(1-\epsilon^2) + \sum_{b=0}^{n_r-1} \psi(n_t - b) - 1 \tag{7.127}$$
where $\psi(\cdot)$ is Euler's digamma function [53]. Furthermore, since

$$0 \le \frac{\mathrm{SNR}\,(f_H(\lambda))^2}{\mathrm{SNR}\,f_H(\lambda) + Ln_t} \le f_H(\lambda) \tag{7.128}$$

we have by the dominated convergence theorem [19] that

$$\lim_{\mathrm{SNR}\to\infty} \epsilon^2 = \lim_{\mathrm{SNR}\to\infty} \left( 1 - \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\,(f_H(\lambda))^2}{\mathrm{SNR}\,f_H(\lambda) + Ln_t}\,d\lambda \right) = 0 \tag{7.129}$$

so $\log(1-\epsilon^2)$ vanishes as the SNR tends to infinity. Combining (7.129) with (7.127) yields

$$\lim_{\mathrm{SNR}\to\infty} \frac{\mathbb{E}\left[\log\det \hat{H}\hat{H}^\dagger\right] - 1}{\log\mathrm{SNR}} = 0. \tag{7.130}$$

It thus follows from (7.126) and (7.130) that we obtain the lower bound

$$\begin{aligned}
\Pi_{R^*} &\ge n_t\left(1 - \frac{n_t}{L}\right) \tag{7.131} \\
&= \min(n_t, n_r)\left(1 - \frac{\min(n_t, n_r)}{L}\right), \qquad L \le \frac{1}{2\lambda_D} \tag{7.132}
\end{aligned}$$

where we have used that $n_t = n_r = \min(n_t, n_r)$. Note that the condition $L \le \frac{1}{2\lambda_D}$ is necessary since otherwise (7.114) would not hold. This proves Theorem 7.1.
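The pre-log (7.132) makes the pilot-overhead trade-off explicit: the coherent pre-log $\min(n_t, n_r)$ is scaled by the fraction of time left for data. A small helper (illustrative only) evaluates it:

```python
def prelog(n_t, n_r, L):
    """Achievable pre-log (7.132), valid for L <= 1/(2*lambda_D)."""
    m = min(n_t, n_r)            # only min(n_t, n_r) antennas are used
    return m * (1.0 - m / L)

# Larger L (fewer pilots per frame) increases the pre-log, but L may not
# exceed 1/(2*lambda_D) without the estimation error saturating.
print(prelog(2, 2, 4))   # -> 1.0
print(prelog(1, 1, 2))   # -> 0.5
```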
7.4.2 A Note on Input Distribution

The pre-log in Theorem 7.1 is derived using codebooks whose entries are drawn i.i.d. from $\mathcal{N}_{n_t}(0, I_{n_t})$. However, Gaussian inputs are not necessary to achieve the pre-log (7.24). In fact, (7.24) can be achieved by any i.i.d. inputs whose density satisfies (7.25) and (7.26). To show this, we consider (7.107) and evaluate upper bounds on $F(\mathrm{SNR})$ and $\kappa(\theta, \mathrm{SNR})$ for an arbitrary continuous input distribution with the average-power constraint $\mathbb{E}\left[\|X\|^2\right] \le n_t$ and density satisfying

$$P_X(x) \le \frac{K}{\pi^{n_t}}\, e^{-\|x\|^2}, \qquad x\in\mathbb{C}^{n_t}. \tag{7.133}$$

Remark that under the constraint (7.133), it is not possible to have $\mathbb{E}\left[\|X\|^2\right] = 0$.

In order for Lemma 7.2 to hold, $F(\mathrm{SNR})$ should be re-defined as

$$F(\mathrm{SNR}) \triangleq n_r + \frac{\mathrm{SNR}}{(L-n_t)n_t} \sum_{\ell=n_t}^{L-1} \mathbb{E}\left[\left\| E^{(T)}_\ell X_\ell \right\|^2_{\mathrm{F}}\right]. \tag{7.134}$$
Using the upper bound on the Frobenius norm of the product of two matrices, $\|AB\|^2_{\mathrm{F}} \le \|A\|^2_{\mathrm{F}}\cdot\|B\|^2_{\mathrm{F}}$ [86, Sec. 5.6], and the independence of $E^{(T)}_\ell$ and $X_\ell$, we can bound $F(\mathrm{SNR})$ by

$$F(\mathrm{SNR}) \le n_r + \frac{\mathrm{SNR}}{(L-n_t)n_t} \sum_{\ell=n_t}^{L-1} \mathbb{E}\left[\left\|E^{(T)}_\ell\right\|^2_{\mathrm{F}}\right] \cdot \mathbb{E}\left[\left\|X_\ell\right\|^2\right]. \tag{7.135}$$
To evaluate an upper bound on $\kappa(\theta, \mathrm{SNR})$, we upper-bound $\kappa_n(\theta, \mathrm{SNR})$, defined in (7.96), using the following bound:

$$\begin{aligned}
\mathbb{E}\left[ \exp\left(\frac{\theta}{n}\,D_k(m')\right) \,\middle|\, y_k, \hat{H}^{(T)}_k \right]
&= \int_{x_k} P_X(x_k) \exp\left( \frac{\theta}{n}\left\| y_k - \sqrt{\mathrm{SNR}/n_t}\;\hat{H}^{(T)}_k x_k \right\|^2 \right) dx_k \tag{7.136} \\
&\le \int_{x_k} \frac{K}{\pi^{n_t}} \exp\left( -\|x_k\|^2 + \frac{\theta}{n}\left\| y_k - \sqrt{\mathrm{SNR}/n_t}\;\hat{H}^{(T)}_k x_k \right\|^2 \right) dx_k \tag{7.137} \\
&= \frac{K}{\det\left(I_{n_r} - \frac{\theta}{n}\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_k\hat{H}^{\dagger(T)}_k\right)} \exp\left( \frac{\theta}{n}\, y^\dagger_k \left(I_{n_r} - \frac{\theta}{n}\frac{\mathrm{SNR}}{n_t}\hat{H}^{(T)}_k\hat{H}^{\dagger(T)}_k\right)^{-1} y_k \right). \tag{7.138}
\end{aligned}$$
By following the steps used in Section 7.4.1.2, and by choosing

$$\theta = \frac{-1}{n_r + (\mathrm{SNR}/n_t)\,n_t n_r\,\epsilon^2_{*,T}\,\mathbb{E}\left[\|X\|^2\right]} \tag{7.139}$$

where $\epsilon^2_{*,T}$ is given in (7.112), we obtain from (7.135) and (7.138)

$$I^{\mathrm{gmi}}_T(\mathrm{SNR}) \ge \frac{1}{L}\sum_{\ell=n_t}^{L-1} \mathbb{E}\left[ \log\det\left(I_{n_r} + \frac{\mathrm{SNR}}{n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2_{*,T}\,\mathbb{E}\left[\|X\|^2\right]}\,\hat{H}^{(T)}_\ell\hat{H}^{\dagger(T)}_\ell\right) \right] - \frac{L-n_t}{L}\left(1 + \log K\right). \tag{7.140}$$
Taking the limit of $T$ to infinity and repeating the steps used in Section 7.4.1.2 yield

$$\begin{aligned}
I^{\mathrm{gmi}}(\mathrm{SNR}) &= \lim_{T\to\infty} I^{\mathrm{gmi}}_T(\mathrm{SNR}) \tag{7.141} \\
&\ge \frac{L-n_t}{L}\left( \mathbb{E}\left[ \log\det\left(\frac{\mathrm{SNR}}{n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2\,\mathbb{E}\left[\|X\|^2\right]}\,\hat{H}\hat{H}^\dagger\right) \right] - 1 - \log K \right) \tag{7.142} \\
&= \frac{L-n_t}{L}\left( n_t\log\mathrm{SNR} - n_t\log\left(n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2\,\mathbb{E}\left[\|X\|^2\right]\right) + \mathbb{E}\left[\log\det \hat{H}\hat{H}^\dagger\right] - 1 - \log K \right). \tag{7.143}
\end{aligned}$$

We conclude by evaluating the limiting ratio of the RHS of (7.143) to $\log\mathrm{SNR}$ as SNR tends to infinity. To this end, we can see from (7.125) and the average-power constraint $\mathbb{E}\left[\|X\|^2\right] \le n_t$ that

$$\lim_{\mathrm{SNR}\to\infty} \frac{\log\left(n_t n_r + n_t n_r\,\mathrm{SNR}\,\epsilon^2\,\mathbb{E}\left[\|X\|^2\right]\right)}{\log\mathrm{SNR}} = 0. \tag{7.144}$$

It thus follows from (7.144) and (7.130) that the pre-log (7.24) can be achieved if

$$\lim_{\mathrm{SNR}\to\infty} \frac{\log K}{\log\mathrm{SNR}} = 0. \tag{7.145}$$
7.5 Conclusion
We have studied the information rate pre-log of noncoherent bandlimited MIMO
fading channels achievable with nearest neighbour decoding and pilot-aided chan-
nel estimation. We have shown that the achievable pre-log is given by the ca-
pacity pre-log of the coherent fading channel times the fraction of time used for
the transmission of data. Hence, the loss with respect to the coherent case is
solely due to the transmission of orthogonal pilots used to obtain accurate fad-
ing estimates. If the inverse of twice the bandwidth of the fading process is an
integer, then for MISO channels, the above scheme is optimal in the sense that it
achieves the capacity pre-log of the noncoherent fading channel derived by Koch
and Lapidoth [91]. For noncoherent MIMO channels, the above scheme achieves the best lower bound known to date on the capacity pre-log, obtained by Etkin and Tse [1].
The pre-log derived here assumes that L (the smallest time interval for which
the same pilot is being transmitted) is limited by the inverse of twice the band-
width of the fading psd so that we can achieve a decaying variance of the fading
estimation error as the inverse of the SNR. This facilitates reliable fading estima-
tion and enables the nearest neighbour decoder to achieve the capacity pre-log of
the coherent fading channel. In order to improve the pre-log, one should reduce
the time spent for the transmission of pilots, yet still maintain the accuracy of
fading estimates. Note that the fraction of time used for the transmission of
pilots is directly proportional to the number of transmit antennas and inversely
proportional to L. Hence, to reduce the fraction of time for the transmission of
pilots, one could increase L beyond the inverse of twice the bandwidth of the
fading psd. However, we show that the pre-log cannot be improved by this technique: the variance of the fading estimation error is then bounded away from zero, which makes the fading estimates unreliable and causes the nearest neighbour decoder to perform poorly.
Chapter 8
Pilot-Aided Channel Estimation
for Fading Multiple-Access
Channels
In Chapter 7, we have studied the pre-log of point-to-point MIMO fading channels
achievable with nearest neighbour decoding and pilot-aided channel estimation.
It was demonstrated that the pre-log coincides with the capacity pre-log for
MISO fading channels, derived by Koch and Lapidoth [91], and that the scheme
achieves the best lower bound known to date on the capacity pre-log of MIMO fading channels, derived by Etkin and Tse [1].
In this chapter, we extend the analysis in Chapter 7 to the fading multiple-access channel (MAC). We propose a joint-transmission scheme in which the messages of all users are transmitted simultaneously and decoded jointly.^{8.1} We are interested in the rate region achievable with nearest neighbour decoding and pilot-aided channel estimation. In particular, we study the pre-log region, defined as the limiting ratio of the achievable-rate region to the logarithm of the SNR as the SNR tends to infinity.
This chapter is organised as follows. We first introduce the MIMO fading
MAC model in Section 8.1 and describe the transmission scheme in Section 8.2.
We present our main results on the MAC pre-log in Section 8.3. We next com-
pare the joint-transmission scheme with time-division multiple-access (TDMA) in
Section 8.4. We give the proof of our main results in Section 8.5. We summarise
the important points of the chapter in Section 8.6.
^{8.1} By joint transmission, we mean that codewords from both users are simultaneously transmitted at the same time instants. It is assumed that there exists a central controller that synchronises the transmission from both users.
Figure 8.1: The two-user MAC system model. (Two transmitters, $s = 1$ and $s = 2$, send messages $m_1$ and $m_2$ to a common receiver, which decodes $(m_1, m_2)$.)
8.1 System Model
We consider a two-user MIMO fading MAC, where two terminals wish to communicate with a third one, and where the channels between the terminals are MIMO fading channels. The first user has $n_{t,1}$ antennas, the second user has $n_{t,2}$ antennas and the receiver has $n_r$ antennas. The channel model is depicted in Figure 8.1. The channel output at time instant $k\in\mathbb{Z}$ is a complex-valued $n_r$-dimensional random vector

$$Y_k = \sqrt{\mathrm{SNR}}\,H_{1,k}x_{1,k} + \sqrt{\mathrm{SNR}}\,H_{2,k}x_{2,k} + Z_k. \tag{8.1}$$

Here $x_{s,k}\in\mathbb{C}^{n_{t,s}}$ denotes the time-$k$ channel input vector corresponding to user $s$, $s = 1,2$; $H_{s,k}$ denotes the $n_r\times n_{t,s}$-dimensional fading matrix at time $k$ corresponding to user $s$, $s = 1,2$; SNR denotes the average SNR for each transmit antenna; and $Z_k$ denotes the $n_r$-variate additive noise vector at time $k$.

The noise process $\{Z_k,\ k\in\mathbb{Z}\}$ is a sequence of i.i.d. complex-Gaussian random vectors with zero mean and covariance matrix $I_{n_r}$.

The fading processes $\{H_{s,k},\ k\in\mathbb{Z}\}$, $s = 1,2$ are stationary, ergodic and complex-Gaussian. We assume that the $(n_{t,1}\cdot n_r + n_{t,2}\cdot n_r)$ processes $\{H_{s,k}(r,t),\ k\in\mathbb{Z}\}$, $s = 1,2$, $r = 1,\dots,n_r$, $t = 1,\dots,n_{t,s}$ are independent and have the same law, with each process having zero mean, unit variance and power spectral density (psd) $f_H(\lambda)$, $-\frac{1}{2}\le\lambda\le\frac{1}{2}$. Thus, $f_H(\cdot)$ is a nonnegative function satisfying

$$\mathbb{E}\left[H_{s,k+m}(r,t)\,H^*_{s,k}(r,t)\right] = \int_{-1/2}^{1/2} e^{\imath 2\pi m\lambda}\,f_H(\lambda)\,d\lambda \tag{8.2}$$

where $H^*_{s,k}(r,t)$ denotes the complex conjugate of $H_{s,k}(r,t)$. We further assume that the psd $f_H(\cdot)$ has bandwidth $\lambda_D\in(0, 1/2]$, i.e., $f_H(\lambda) = 0$ for $|\lambda| > \lambda_D$ and $f_H(\lambda) > 0$ otherwise.
We finally assume that the fading processes $\{H_{s,k},\ k\in\mathbb{Z}\}$, $s = 1,2$ and the noise process $\{Z_k,\ k\in\mathbb{Z}\}$ are independent and that their joint law does not depend on $\{x_{s,k},\ k\in\mathbb{Z}\}$, $s = 1,2$. We consider a noncoherent channel model, where the transmitters and the receiver are aware of the statistics of $\{H_{s,k},\ k\in\mathbb{Z}\}$, $s = 1,2$ but not of their realisations.
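The channel (8.1) is straightforward to simulate. The sketch below is illustrative (not from the thesis): it generates bandlimited stationary fading by shaping white Gaussian noise in the frequency domain, assuming a flat psd of bandwidth $\lambda_D$, and forms the received vectors $Y_k$.

```python
import numpy as np

rng = np.random.default_rng(0)

def bandlimited_fading(n, lam_D, shape):
    """Stationary complex-Gaussian fading with an (approximately) flat psd of
    bandwidth lam_D, normalised to unit variance; one length-n process per
    entry of `shape`."""
    W = rng.standard_normal((*shape, n)) + 1j * rng.standard_normal((*shape, n))
    mask = np.abs(np.fft.fftfreq(n)) <= lam_D       # keep |freq| <= lam_D
    H = np.fft.ifft(np.fft.fft(W, axis=-1) * mask, axis=-1)
    return H / np.sqrt(np.mean(np.abs(H) ** 2))     # unit-variance normalisation

n, n_r, nt1, nt2, snr = 256, 2, 2, 1, 10.0
H1 = bandlimited_fading(n, 0.1, (n_r, nt1))         # user 1: n_r x n_{t,1} x n
H2 = bandlimited_fading(n, 0.1, (n_r, nt2))         # user 2: n_r x n_{t,2} x n
x1 = (rng.standard_normal((nt1, n)) + 1j * rng.standard_normal((nt1, n))) / np.sqrt(2)
x2 = (rng.standard_normal((nt2, n)) + 1j * rng.standard_normal((nt2, n))) / np.sqrt(2)
Z = (rng.standard_normal((n_r, n)) + 1j * rng.standard_normal((n_r, n))) / np.sqrt(2)
# Channel output (8.1), applied per time instant k
Y = (np.sqrt(snr) * np.einsum('rtk,tk->rk', H1, x1)
     + np.sqrt(snr) * np.einsum('rtk,tk->rk', H2, x2) + Z)
```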
8.2 Transmission Scheme

Both users transmit codewords and pilot symbols over the channel (8.1). To transmit the message $m_s\in\{1,\dots,e^{nR_s}\}$, $s = 1,2$, each user's encoder selects a codeword of length $n$ from a codebook $\mathcal{C}_s$, where the codebooks $\mathcal{C}_s$, $s = 1,2$ are drawn i.i.d. from an $n_{t,s}$-variate, zero-mean, complex-Gaussian distribution with covariance matrix $I_{n_{t,s}}$.^{8.2} To facilitate channel estimation at the receiver, orthogonal pilot vectors are used. The pilot vector $p_{s,t}\in\mathbb{C}^{n_{t,s}}$, $s = 1,2$, $t = 1,\dots,n_{t,s}$ used to estimate the fading coefficients from transmit antenna $t$ of user $s$ is given by $p_{s,t}(t) = 1$ and $p_{s,t}(t') = 0$ for $t'\neq t$. For example, the first pilot vector of user $s$ is given by $(1, 0,\dots,0)^{\mathsf{T}}$, where $(\cdot)^{\mathsf{T}}$ denotes the transpose. To estimate the fading matrices $H_{1,k}$ and $H_{2,k}$, each training period requires the $(n_{t,1}+n_{t,2})$ pilot vectors $p_{1,1},\dots,p_{1,n_{t,1}}, p_{2,1},\dots,p_{2,n_{t,2}}$.

Assuming synchronous transmissions from both users, the transmission scheme extends the point-to-point setup in Chapter 7 to the two-user MAC setup as illustrated in Figure 8.2. Every $L$ time instants (for some $L \ge n_{t,1}+n_{t,2}$, $L\in\mathbb{N}$), user 1 first transmits the $n_{t,1}$ pilot vectors $p_{1,1},\dots,p_{1,n_{t,1}}$. Once the transmission of the $n_{t,1}$ pilot vectors is finished, user 2 transmits its $n_{t,2}$ pilot vectors $p_{2,1},\dots,p_{2,n_{t,2}}$. The codewords of both users are then split up into blocks of $(L-n_{t,1}-n_{t,2})$ data vectors, which are transmitted simultaneously after the $(n_{t,1}+n_{t,2})$ pilot vectors. The process of transmitting $(L-n_{t,1}-n_{t,2})$ data vectors and $(n_{t,1}+n_{t,2})$ pilot vectors continues until all $n$ data symbols have been transmitted. Herein we assume that $n$ is an integer multiple of $(L-n_{t,1}-n_{t,2})$.^{8.3} Prior to transmitting the first data block, and after transmitting the last data block, a guard period of $L(T-1)$ time instants (for some $T\in\mathbb{N}$) is introduced for the purpose of channel estimation, during which we transmit the $(n_{t,1}+n_{t,2})$ pilot vectors every $L$ time instants but do not transmit data vectors in between.

^{8.2} With this assumption, the channel inputs satisfy an average-power constraint. Using the truncated Gaussian distribution satisfying the conditions in Remark 7.1, one can also impose a peak-power constraint.

^{8.3} As in the point-to-point setup, this assumption is not critical (cf. footnote 7.1).
Figure 8.2: Structure of the joint-transmission scheme, $n_{t,1} = 2$, $n_{t,2} = 1$, $L = 7$ and $T = 2$. (Each period of $L$ time instants carries the pilot vectors of user 1, then those of user 2, followed by data vectors; guard periods of length $L(T-1)$, containing pilots but no data, precede and follow the data transmission.)
Here we can see that codewords from both users are jointly transmitted at the
same time instants whereas pilots from both users are separately transmitted
at different time instants. Note that the total block-length of this transmission
scheme (comprising data vectors, pilot vectors and guard period) is given by
n′ = np + n + ng    (8.3)

where np and ng are now given by

np = ( n/(L − nt,1 − nt,2) + 1 + 2(T − 1) ) (nt,1 + nt,2),    (8.4)

ng = 2 (L − nt,1 − nt,2) (T − 1).    (8.5)
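The block-length accounting in (8.3)–(8.5) is easy to verify numerically. The following sketch (with illustrative parameter values matching Figure 8.2, not prescribed by the text) computes n′, np and ng:

```python
# Sketch: total block-length accounting of the joint-transmission scheme,
# following (8.3)-(8.5). Parameter values are illustrative assumptions.
def block_lengths(n, L, nt1, nt2, T):
    """Return (n_prime, n_p, n_g) for n data symbols, period L, pilot
    counts nt1/nt2 per user and estimator window parameter T."""
    d = L - nt1 - nt2                     # data vectors per period
    assert n % d == 0, "n must be an integer multiple of L - nt,1 - nt,2"
    n_p = (n // d + 1 + 2 * (T - 1)) * (nt1 + nt2)   # pilot symbols, (8.4)
    n_g = 2 * d * (T - 1)                            # guard period, (8.5)
    return n_p + n + n_g, n_p, n_g                   # n' of (8.3)

# Figure 8.2 uses nt,1 = 2, nt,2 = 1, L = 7 and T = 2 (so d = 4):
print(block_lengths(n=400, L=7, nt1=2, nt2=1, T=2))   # (717, 309, 8)
```

For T = 1 the guard period vanishes (ng = 0), so the overhead reduces to the per-period pilots alone.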
Once the transmission is completed, the decoder guesses which message has
been transmitted. The decoder consists of two parts: a channel estimator and a
data detector. The channel estimator observes the channel output Yk′, k′ ∈ P cor-
responding to the past and future T pilot transmissions and estimates Hs,k(r, t)
using a linear interpolator, i.e., the estimate Ĥ(T)s,k(r, t) of the fading coefficient Hs,k(r, t) is given by

Ĥ(T)s,k(r, t) = ∑_{k′=k−TL, k′∈P}^{k+TL} a_{s,k′}(r, t) Y_{k′}(r),  k ∈ D    (8.6)
where the coefficients as,k′(r, t) are chosen in order to minimise the mean-squared
error. Here P denotes the set of time indices where pilot symbols are transmitted,
and D denotes the set of time indices where data vectors of a codeword are
transmitted.
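The coefficients a_{s,k′}(r, t) are thus the usual Wiener/LMMSE solution of a set of normal equations. The sketch below illustrates this for a scalar fading process; the band-limited sinc autocorrelation, pilot positions and SNR value are illustrative assumptions, not taken from the thesis:

```python
import numpy as np

# Sketch of the linear MMSE interpolator behind (8.6): estimate the fading
# at data time k as a linear combination of noisy pilot observations, with
# coefficients solving the Wiener normal equations. The autocorrelation
# r(m) = sinc(2*lam_D*m) (rectangular psd of bandwidth lam_D) is an
# illustrative modelling assumption.
def lmmse_interpolator(pilot_times, k, lam_D, snr):
    r = lambda m: np.sinc(2 * lam_D * np.asarray(m, dtype=float))
    P = np.asarray(pilot_times, dtype=float)
    C_yy = r(P[:, None] - P[None, :]) + np.eye(len(P)) / snr  # observation covariance
    c_hy = r(k - P)                      # fading/observation cross-covariance
    a = np.linalg.solve(C_yy, c_hy)      # interpolation coefficients a_{s,k'}
    mse = 1.0 - c_hy @ a                 # resulting interpolation-error variance
    return a, mse

a, mse = lmmse_interpolator(pilot_times=[0, 7, 14, 21], k=10, lam_D=0.01, snr=100.0)
print(len(a), 0.0 <= mse < 1.0)
```

Adding pilot observations can only reduce the resulting mean-squared error, which is the intuition behind letting the observation window T grow.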
Note that, since the pilot symbols are transmitted only from one user and
one antenna at a time, the fading coefficients corresponding to all transmit and
receive antennas from both users can be observed. Further note that, since the fading processes Hs,k(r, t), k ∈ ℤ, s = 1, 2, r = 1, . . . , nr, t = 1, . . . , nt,s, are independent, estimating Hs,k(r, t) based only on Yk(r), k ∈ ℤ, rather than on Yk, k ∈ ℤ, incurs no loss in optimality.
We denote the interpolation error in estimating Hs,k(r, t) using the interpolator (8.6) as

E(T)s,k(r, t) = Hs,k(r, t) − Ĥ(T)s,k(r, t).    (8.7)

The error E(T)s,k(r, t) has zero mean and variance less than unity. A detailed analysis of the variance of E(T)s,k(r, t) follows closely from the analysis of the variance of the interpolation error in Chapter 7 (see Sections 7.2 and 7.4.1).^{8.4}
From the received codeword yk, k ∈ D, and the channel-estimate matrices Ĥ(T)s,k, k ∈ D, s = 1, 2 (which are composed of the entries ĥ(T)s,k(r, t), k ∈ D, where ĥ(T)s,k(r, t) denotes the realisation of Ĥ(T)s,k(r, t)), the decoder chooses the pair of messages (m̂1, m̂2) that minimises the distance metric

(m̂1, m̂2) = arg min_{(m1,m2)} D(m1, m2)    (8.8)

where

D(m1, m2) ≜ ∑_{k∈D} ‖ yk − √SNR Ĥ(T)1,k x1,k(m1) − √SNR Ĥ(T)2,k x2,k(m2) ‖².    (8.9)
In the following, we will refer to the above scheme as the joint-transmission
scheme.
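As an illustration, the rule (8.8)–(8.9) amounts to an exhaustive search over message pairs. The sketch below uses a real-valued, block-constant fading model with small illustrative codebooks — all assumptions made for the example, not the setup of the thesis:

```python
import numpy as np

# Sketch of the joint nearest neighbour decoder (8.8)-(8.9): choose the
# message pair whose scaled codewords lie closest, in squared Euclidean
# distance, to the received vectors given the channel estimates.
rng = np.random.default_rng(0)
nr, nt1, nt2, n_data, M1, M2, snr = 2, 2, 1, 6, 4, 4, 1e4

X1 = rng.standard_normal((M1, n_data, nt1))   # i.i.d. Gaussian codebook, user 1
X2 = rng.standard_normal((M2, n_data, nt2))   # i.i.d. Gaussian codebook, user 2
H1 = rng.standard_normal((nr, nt1))           # block-constant fading, user 1
H2 = rng.standard_normal((nr, nt2))           # block-constant fading, user 2
m1, m2 = 1, 3                                  # transmitted message pair
Y = (np.sqrt(snr) * (X1[m1] @ H1.T + X2[m2] @ H2.T)
     + rng.standard_normal((n_data, nr)))      # channel output with unit noise

def D(i, j, H1_hat, H2_hat):                   # the metric D(m1, m2) of (8.9)
    diff = Y - np.sqrt(snr) * (X1[i] @ H1_hat.T + X2[j] @ H2_hat.T)
    return float(np.sum(diff ** 2))

# With accurate channel estimates, the transmitted pair minimises the metric:
metrics = np.array([[D(i, j, H1, H2) for j in range(M2)] for i in range(M1)])
print(np.unravel_index(np.argmin(metrics), metrics.shape))   # (1, 3)
```

With imperfect estimates Ĥ(T)s,k in place of H1, H2, the same search is performed; the chapter's analysis quantifies the rates at which this mismatched rule still decodes reliably.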
We shall compare the joint-transmission scheme with a TDMA scheme, where
each user transmits its message using the transmission scheme illustrated in Fig-
ure 8.3. In particular, during the first βn′ channel uses (for some 0 ≤ β ≤ 1), user
1 transmits its codeword according to the transmission scheme given in Chapter
7 (see also Figure 8.3), while user 2 is silent. (Here n′ is given in (8.3).) Then,
during the next (1− β)n′ channel uses, user 2 transmits its codeword according
to the same transmission scheme, while user 1 is silent. In both cases, the re-
ceiver guesses the corresponding message ms, s = 1, 2 using a nearest neighbour
decoder and pilot-aided channel estimation.
8.4 One could view the fading estimation from both users as the fading estimation for MIMO channels with (nt,1 + nt,2) transmit antennas and nr receive antennas. The resulting transmit antenna index depends on the user index and the transmit antenna index for each user.
[Figure 8.3: Structure of TDMA scheme, nt,1 = 2, nt,2 = 1, L = 4 and T = 2. User 1's transmission occupies the first βn′ channel uses and user 2's the remaining (1 − β)n′, each with pilot, data and no-transmission periods and guard periods of length L(T − 1).]
8.3 The MAC Pre-Log
Let R∗1(SNR), R∗2(SNR) and R∗1+2(SNR) be the maximum achievable rate for user 1, the maximum achievable rate for user 2 and the maximum achievable sum-rate, respectively. The achievable-rate region is given by the closure of the convex hull of the set [18]

R = { (R1(SNR), R2(SNR)) : R1(SNR) < R∗1(SNR), R2(SNR) < R∗2(SNR), R1(SNR) + R2(SNR) < R∗1+2(SNR) }.    (8.10)
We are interested in the pre-logs of R1(SNR) and R2(SNR), defined as the limiting ratios of R1(SNR) and R2(SNR) to the logarithm of the SNR as the SNR tends to infinity. Thus, the pre-log region is given by the closure of the convex hull of the set

ΠR = { (ΠR1, ΠR2) : ΠR1 < ΠR∗1, ΠR2 < ΠR∗2, ΠR1 + ΠR2 < ΠR∗1+2 }    (8.11)

where

ΠR∗1 ≜ lim sup_{SNR→∞} R∗1(SNR)/log SNR,    (8.12)

ΠR∗2 ≜ lim sup_{SNR→∞} R∗2(SNR)/log SNR,    (8.13)

ΠR∗1+2 ≜ lim sup_{SNR→∞} R∗1+2(SNR)/log SNR.    (8.14)
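To illustrate the definitions (8.12)–(8.14), the sketch below numerically approximates the pre-log of a coherent-style rate function R(SNR) = m log(1 + SNR/nt); the rate model and parameter values are illustrative assumptions:

```python
import math

# Sketch: the pre-log is the limiting ratio of the rate to log SNR.
# For R(SNR) = m*log(1 + SNR/nt), the ratio approaches m at high SNR.
def prelog_estimate(rate, snr_db=300.0):
    snr = 10 ** (snr_db / 10)
    return rate(snr) / math.log(snr)

m, nt = 2, 2
estimate = prelog_estimate(lambda snr: m * math.log(1 + snr / nt))
print(round(estimate, 2))   # close to m = 2; the gap shrinks as snr_db grows
```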
The capacity pre-logs ΠC1, ΠC2 and ΠC1+2 are defined in the same way but with R∗1(SNR), R∗2(SNR) and R∗1+2(SNR) replaced by the respective capacities C1(SNR), C2(SNR) and C1+2(SNR).
In the following theorem, we present our result on the pre-log region of the
two-user MIMO fading MAC achievable with the joint-transmission scheme.
Theorem 8.1. Consider the MIMO fading MAC model (8.1). Then, the pre-log
region achievable with the joint-transmission scheme is the closure of the convex
hull of the set
{ (ΠR1, ΠR2) : ΠR1 < min(nr, nt,1) ( 1 − (nt,1 + nt,2)/L∗ ),
ΠR2 < min(nr, nt,2) ( 1 − (nt,1 + nt,2)/L∗ ),
ΠR1 + ΠR2 < min(nr, nt,1 + nt,2) ( 1 − (nt,1 + nt,2)/L∗ ) }    (8.15)

where L∗ = ⌊1/(2λD)⌋ is the largest integer satisfying L∗ ≤ 1/(2λD).
Proof. See Section 8.5.
Remark 8.1. The pre-log region given in Theorem 8.1 is the largest region
achievable with any transmission scheme that uses (nt,1 + nt,2)/L∗ of the time
for transmitting pilot symbols. Indeed, even if the channel estimator were able to estimate the fading coefficients perfectly, and even if we could decode
the data symbols using a maximum-likelihood decoder, the capacity pre-log region
(without pilot transmission) would be given by the closure of the convex hull of
the set [18,88,89]
{ (ΠR1, ΠR2) : ΠR1 < min(nr, nt,1), ΠR2 < min(nr, nt,2), ΠR1 + ΠR2 < min(nr, nt,1 + nt,2) }    (8.16)
which, after multiplying by 1 − (nt,1 + nt,2)/L∗ in order to account for the pilot
symbols, becomes (8.15). Thus, in order to improve upon (8.15), one would need
to design a transmission scheme that employs less than (nt,1 + nt,2)/L∗ pilot
symbols per channel use.
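For concreteness, the bounds defining the region (8.15) can be tabulated directly; in this sketch the antenna numbers and the value of L∗ are illustrative choices:

```python
# Sketch: the three bounds defining the achievable pre-log region (8.15)
# of Theorem 8.1. Antenna numbers and L* below are illustrative.
def prelog_region_bounds(nr, nt1, nt2, L_star):
    data_fraction = 1 - (nt1 + nt2) / L_star     # fraction of time left for data
    return (min(nr, nt1) * data_fraction,        # bound on Pi_R1
            min(nr, nt2) * data_fraction,        # bound on Pi_R2
            min(nr, nt1 + nt2) * data_fraction)  # bound on Pi_R1 + Pi_R2

# nr = 2, nt,1 = nt,2 = 1 (the setting of Figure 8.4) with L* = 10:
print(prelog_region_bounds(2, 1, 1, 10))   # (0.8, 0.8, 1.6)
```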
Remark 8.2 (TDMA Pre-Log). Consider the MIMO fading MAC model (8.1).
Then, the pre-log region achievable with the TDMA scheme employing nearest
neighbour decoding and pilot-aided channel estimation is the closure of the convex
hull of the set
{ (ΠR1, ΠR2) : ΠR1 < β min(nr, nt,1) ( 1 − nt,1/L∗ ),
ΠR2 < (1 − β) min(nr, nt,2) ( 1 − nt,2/L∗ ), 0 ≤ β ≤ 1 }    (8.17)

where L∗ = ⌊1/(2λD)⌋ is the largest integer satisfying L∗ ≤ 1/(2λD). This follows directly
from the pre-log of the point-to-point MIMO fading channel (Theorem 7.1) where
the number of transmit antennas from users 1 and 2 is given by nt,1 and nt,2,
respectively.
Note that the sum of the pre-logs ΠR1+ΠR2 is upper-bounded by the capacity
pre-log of the point-to-point MIMO fading channel with (nt,1 + nt,2) transmit
antennas and nr receive antennas, since the point-to-point MIMO channel allows
for cooperation between the transmitting terminals. While the capacity pre-log
of point-to-point MIMO fading channels remains an open problem, the capacity
pre-log of point-to-point MISO fading channels under a peak-power constraint is
known, cf. (7.21). It thus follows from (7.21) that, for nr = nt,1 = nt,2 = 1, we
have
ΠR1 +ΠR2 < 1− 2λD (8.18)
which together with the single-user constraints [85]
ΠR1 < ΠC1 = 1− 2λD (8.19)
ΠR2 < ΠC2 = 1− 2λD (8.20)
implies that TDMA achieves the capacity pre-log region of the SISO fading MAC
under a peak-power constraint. The next section provides a more detailed com-
parison between the joint-transmission scheme and TDMA.
8.4 Joint Transmission Versus TDMA
In this section, we discuss how the joint-transmission scheme performs compared to TDMA. To this end, we compare the sum-rate pre-log ΠR∗1+2 of the joint-transmission scheme (Theorem 8.1) with the sum-rate pre-log of the TDMA
scheme employing nearest neighbour decoding and pilot-aided channel estimation
(Remark 8.2) as well as with the sum-rate pre-log of TDMA when the receiver
has knowledge of the realisations of the fading processes Hs,k, k ∈ ℤ, s = 1, 2. In the latter case, the sum-rate pre-log is given by

ΠR∗1+2 = β min(nr, nt,1) + (1 − β) min(nr, nt,2).    (8.21)
The following corollary presents a sufficient condition on L∗ under which the
sum-rate pre-log of the joint-transmission scheme is strictly larger than the sum-
rate pre-log of the coherent TDMA scheme (8.21), as well as a sufficient condition
on L∗ under which it is strictly smaller than the sum-rate pre-log of the TDMA
scheme given in Remark 8.2. Since (8.21) is an upper bound on the sum-rate
pre-log of any TDMA scheme over the MIMO fading MAC (8.1), and since the
sum-rate pre-log given in Remark 8.2 is a lower bound on the sum-rate pre-log
of the best TDMA scheme, it follows that the sufficient conditions presented in
Corollary 8.1 hold also for the best TDMA scheme.
Corollary 8.1. Consider the MIMO fading MAC model (8.1). The joint-trans-
mission scheme achieves a larger sum-rate pre-log than any TDMA scheme if
L∗ > min(nr, nt,1 + nt,2)(nt,1 + nt,2) / ( min(nr, nt,1 + nt,2) − min(nr, max(nt,1, nt,2)) )    (8.22)

where we define a/0 ≜ ∞ for every a > 0. Conversely, the best TDMA scheme achieves a larger sum-rate pre-log than the joint-transmission scheme if

L∗ < min(nr, nt,1 + nt,2)(nt,1 + nt,2) / ( min(nr, nt,1 + nt,2) − min(nr, nt,1, nt,2) ) − min(nt,1 nr, nt,1², nt,2 nr, nt,2²) / ( min(nr, nt,1 + nt,2) − min(nr, nt,1, nt,2) ).    (8.23)
Recall that L∗ is inversely proportional to the bandwidth of the power spectral
density fH(·), which in turn is inversely proportional to the coherence time of
the fading channel. We thus see from Corollary 8.1 that the joint-transmission
scheme tends to be superior to TDMA when the coherence time of the channel
is large. In contrast, TDMA is superior to the joint-transmission scheme when
the coherence time of the channel is small.
Intuitively, this can be explained by observing that, compared to TDMA,
the joint-transmission scheme uses the multiple antennas at the transmitters
and at the receiver more efficiently, but requires more pilot symbols to estimate
the fading coefficients. Thus, when the coherence time is large, the number of
pilot symbols required to estimate the fading is small, so the gain in capacity
by using the antennas more efficiently dominates the loss incurred by requiring
more pilot symbols. On the other hand, when the coherence time is small, the
number of pilot symbols required to estimate the fading is large and the loss in
capacity incurred by requiring more pilot symbols dominates the gain by using
the antennas more efficiently.
We next evaluate (8.22) and (8.23) for some particular values of nr, nt,1, and
nt,2.
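The two conditions of Corollary 8.1 can also be checked mechanically. The helper below (a hypothetical sketch using the convention a/0 ≜ ∞) reproduces the special cases derived in the following subsections:

```python
import math

# Sketch: right-hand sides of (8.22) and (8.23). Above the first threshold
# the joint-transmission scheme beats any TDMA scheme; below the second,
# the best TDMA scheme beats the joint-transmission scheme.
def corollary_8_1_thresholds(nr, nt1, nt2):
    s = min(nr, nt1 + nt2)
    d1 = s - min(nr, max(nt1, nt2))    # denominator of (8.22)
    d2 = s - min(nr, nt1, nt2)         # denominator of (8.23)
    joint_wins_if_L_above = math.inf if d1 == 0 else s * (nt1 + nt2) / d1
    tdma_wins_if_L_below = (math.inf if d2 == 0 else
                            (s * (nt1 + nt2)
                             - min(nt1 * nr, nt1**2, nt2 * nr, nt2**2)) / d2)
    return joint_wins_if_L_above, tdma_wins_if_L_below

# nr >= nt,1 + nt,2 with nt,1 = nt,2 = nt recovers (8.24)-(8.25), i.e. 4nt and 3nt:
print(corollary_8_1_thresholds(nr=8, nt1=3, nt2=3))   # (12.0, 9.0)
```

With nr ≤ min(nt,1, nt,2) both right-hand sides evaluate to ∞, matching the conclusion of Section 8.4.1.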
8.4.1 Receiver Employs Fewer Antennas Than Transmitters
Suppose that the number of receive antennas is smaller than the number of
transmit antennas, i.e., nr ≤ min(nt,1, nt,2). Then, the RHSs of (8.22) and (8.23)
become ∞, so every finite L∗ satisfies (8.23). Thus, if the number of receive
antennas is smaller than the number of transmit antennas, then, irrespective of
L∗, TDMA is superior to the joint-transmission scheme.
8.4.2 Receiver Employs More Antennas Than Transmitters
Suppose that the receiver employs more antennas than the transmitters, i.e.,
nr ≥ nt,1 + nt,2, and suppose that nt,1 = nt,2 = nt. Then, (8.22) and (8.23)
become
L∗ > 4nt (8.24)
and
L∗ < 3nt. (8.25)
Thus, if L∗ is greater than 4nt, then the joint-transmission scheme is superior to
TDMA. In contrast, if L∗ is smaller than 3nt, then TDMA is superior. This is
illustrated in Figure 8.4 for the case where nr = 2 and nt,1 = nt,2 = 1. Note that
if L∗ is between 3nt and 4nt, then the joint-transmission scheme is superior to
the TDMA scheme presented in Remark 8.2, but it may be inferior to the best
TDMA scheme.
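These crossovers are easy to see numerically for the setting of Figure 8.4; the following sketch evaluates the sum-rate pre-logs for illustrative values of L∗:

```python
# Sketch: sum-rate pre-logs for nr = 2, nt,1 = nt,2 = 1 (Figure 8.4 setting)
# as a function of L*: joint transmission (Theorem 8.1), noncoherent TDMA
# (Remark 8.2, any beta) and coherent TDMA (8.21, any beta).
def sum_prelogs(L):
    joint = min(2, 1 + 1) * (1 - 2 / L)   # Theorem 8.1 sum-rate bound
    tdma = 1 - 1 / L                      # Remark 8.2 with nt = 1
    coherent_tdma = 1.0                   # (8.21) with nr = 2, nt = 1
    return joint, tdma, coherent_tdma

for L in (2, 5, 10):                      # illustrative L* values
    print(L, sum_prelogs(L))
```

For L∗ = 2 (< 3nt) noncoherent TDMA wins, while for L∗ = 5 (> 4nt) joint transmission even beats coherent TDMA, as Corollary 8.1 predicts.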
[Figure 8.4: Pre-log regions for a fading MAC with nr = 2 and nt,1 = nt,2 = 1 for different values of L∗: (a) L∗ < 3, (b) L∗ > 4. Depicted are the pre-log region of the joint-transmission scheme as given in Theorem 8.1 (dashed line), the pre-log region of the TDMA scheme as given in Remark 8.2 (solid line), and the pre-log region of the coherent TDMA scheme (8.21) (dotted line); the axis intercepts lie at 1 − 1/L∗ and 1 − 2/L∗.]
8.4.3 A Case in Between
Suppose that nr ≤ nt,1 + nt,2 and nt,2 < nr ≤ nt,1. Then, (8.22) becomes

L∗ > ∞    (8.26)

and (8.23) becomes

L∗ < nt,2 + nr nt,1 / (nr − nt,2).    (8.27)

Thus, in this case the joint-transmission scheme is always inferior to the coherent TDMA scheme (8.21), but it can be superior to the TDMA scheme in Remark 8.2.
Typical Values of L∗
We briefly discuss what values of L∗ may occur in practical scenarios. To this end, we first recall that L∗ is the largest integer satisfying L∗ ≤ 1/(2λD), where λD is the bandwidth of the power spectral density fH(·), which in turn can be associated with the Doppler spread of the channel as

λD = fm/Wc.    (8.28)

Here fm is the maximum Doppler shift, given by

fm = (v/c) fc    (8.29)
Environment | Delay spread στ | Mobile speed v | λD ≈ 5στ(v/c)fc | L∗
Indoor | 10 – 100 ns | 5 km/h | 2·10^−7 – 10^−5 | 5·10^4 – 2.5·10^6
Urban | 1 – 2 µs | 5 km/h | 2·10^−5 – 2·10^−4 | 2.5·10^3 – 2.5·10^4
Urban | 1 – 2 µs | 75 km/h | 2·10^−4 – 0.004 | 125 – 2.5·10^3
Hilly area | 3 – 10 µs | 200 km/h | 0.002 – 0.05 | 10 – 250

Table 8.1: Typical values of L∗ for various environments with fc ranging from 800 MHz to 5 GHz. The values of στ are taken from [1] for indoor and urban environments and from [2] for hilly area environments.
where v is the mobile speed, c = 3·10^8 m/s is the speed of light and fc is the carrier frequency; and Wc is the coherence bandwidth of the channel, approximated as [1, 100]
Wc ≈ 1/(5στ)    (8.30)
where στ is the delay spread. Following the order of magnitude computations of
Etkin and Tse [1], we determine typical values of λD for indoor, urban, and hilly
area environments and for carrier frequencies ranging from 800 MHz to 5 GHz
and tabulate the results in Table 8.1.
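The entries of Table 8.1 follow from (8.28)–(8.30) by direct substitution, as in the sketch below (the helper and its parameter choices are illustrative):

```python
# Sketch: lambda_D and L* from (8.28)-(8.30),
# lambda_D = fm/Wc ~= 5 * sigma_tau * (v/c) * fc, L* = floor(1/(2*lambda_D)).
def doppler_spread_L_star(sigma_tau, v_kmh, fc_hz, c=3e8):
    lam_D = 5 * sigma_tau * (v_kmh / 3.6 / c) * fc_hz
    return lam_D, int(1 / (2 * lam_D))

# Urban environment, v = 75 km/h, sigma_tau = 2 us, fc = 800 MHz:
lam_D, L_star = doppler_spread_L_star(2e-6, 75.0, 800e6)
print(lam_D, L_star)   # lambda_D ~= 5.6e-4, L* ~= 900, within Table 8.1's range
```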
For indoor environments and mobile speeds of 5 km/h, we thus have that L∗ is typically greater than 5·10^4. For urban environments, L∗ is typically greater than 2.5·10^3 for mobile speeds of 5 km/h and greater than 125 for mobile speeds of 75 km/h. For hilly area environments and mobile speeds of 200 km/h, L∗ ranges typically from 10 to 250. Thus, for most practical scenarios, L∗ is typically large.
It therefore follows that, if nr ≥ nt,1+nt,2, (8.22) is satisfied unless nt,1+nt,2 is very
large. For example, if the receiver employs more antennas than the transmitters,
and if nt,1 = nt,2 = nt, then L∗ > 4nt is satisfied even for urban environments and
mobile speeds of 75 km/h, as long as nt < 30. Only for hilly area environments
and mobile speeds of 200 km/h, this condition may not be satisfied for a practical
number of transmit antennas. Thus, if the number of antennas at the receiver
is sufficiently large, then the joint-transmission scheme is superior to TDMA in
most practical scenarios. On the other hand, if nr ≤ min(nt,1, nt,2), then TDMA
is always superior to the joint-transmission scheme, irrespective of how large L∗
is. This suggests that one should use more antennas at the receiver than at the
transmitters.
8.5 Proof of Theorem 8.1
In contrast to the proof of Theorem 7.1, for the fading MAC it is not sufficient to consider only the case nt,1 = nt,2 = nr, since the two transmitting terminals do not cooperate. For the proof of Theorem 8.1, we therefore consider a general setup of nt,1, nt,2 and nr.
In order to analyse the achievable pre-log region for the fading MAC, we first provide the following extension of Lemma 7.2 to the fading MAC.
Corollary 8.2. Let E(T)s,k, s = 1, 2, be the estimation-error matrix in estimating Hs,k, i.e.,

E(T)s,k = Hs,k − Ĥ(T)s,k    (8.31)

with the entries E(T)s,k(r, t), r = 1, . . . , nr, t = 1, . . . , nt,s. Define

F(SNR) ≜ nr + ( SNR/(L − nt,1 − nt,2) ) ∑_{ℓ=nt,1+nt,2}^{L−1} E[ ‖E(T)1,ℓ‖²_F + ‖E(T)2,ℓ‖²_F ],    (8.32)

n̄ ≜ n + ( n/(L − nt,1 − nt,2) + 1 ) (nt,1 + nt,2) − 1,    (8.33)

Tδ ≜ { (xs,k, yk, Ĥ(T)s,k), k = 0, . . . , n̄, s = 1, 2 :
| (1/n) ∑_{k∈D} ‖ yk − √SNR Ĥ(T)1,k x1,k − √SNR Ĥ(T)2,k x2,k ‖² − F(SNR) | < δ }    (8.34)

for some δ > 0. It holds that

lim_{n→∞} Pr{ (X^n̄_{s,0}, Y^n̄_0, Ĥ^{(T),n̄}_{s,0}, s = 1, 2) ∈ Tδ } = 1,  ∀δ > 0.    (8.35)

Proof. The proof follows from the proof of Lemma 7.2 by treating the channel as a MIMO channel with channel matrix (H1,k, H2,k), channel-estimate matrix (Ĥ(T)1,k, Ĥ(T)2,k), and by considering transmission of codewords that are drawn i.i.d. from N_{nt,1+nt,2}(0, I_{nt,1+nt,2}).
Let Pe and Pe(m1, m2) be the ensemble-average error probability and the ensemble-average error probability corresponding to messages m1 and m2 being transmitted, respectively. Since the codebook construction is symmetric, it suffices to study the conditional probability of error, conditioned on the event that the
messages (m1, m2) = (1, 1) were transmitted. Let E(m′1, m′2) denote the event that D(m′1, m′2) ≤ D(1, 1). The ensemble-average error probability can be upper-bounded as

Pe(1, 1) = Pr{ ∪_{(m′1,m′2)≠(1,1)} E(m′1, m′2) }    (8.36)

≤ Pr{ ∪_{m′1≠1} E(m′1, 1) } + Pr{ ∪_{m′2≠1} E(1, m′2) } + Pr{ ∪_{m′1≠1} ∪_{m′2≠1} E(m′1, m′2) }.    (8.37)
With i.i.d. codebooks, we then have three maximum achievable rates, Igmi1,T(SNR), Igmi2,T(SNR) and Igmi1+2,T(SNR), corresponding to the error events (m′1 ≠ 1, m′2 = 1), (m′1 = 1, m′2 ≠ 1) and (m′1 ≠ 1, m′2 ≠ 1), respectively.
Igmi1,T(SNR) – Error Event (m′1 ≠ 1, m′2 = 1)
Following the same steps used to derive (7.93), we can upper-bound the ensemble-average error probability for the error event E(m′1, 1), m′1 ≠ 1, using Tδ and its complement Tδ^c as

Pr{ ∪_{m′1≠1} E(m′1, 1) }
≤ Pr{ ∪_{m′1≠1} E(m′1, 1) | (X^n̄_{s,0}(1), Y^n̄_0, Ĥ^{(T),n̄}_{s,0}, s = 1, 2) ∈ Tδ } + Pr{ (X^n̄_{s,0}(1), Y^n̄_0, Ĥ^{(T),n̄}_{s,0}, s = 1, 2) ∈ Tδ^c }    (8.38)

≤ e^{nR1} · Pr{ (1/n) D(m′1, 1) < F(SNR) + δ | (X^n̄_{s,0}(1), Y^n̄_0, Ĥ^{(T),n̄}_{s,0}, s = 1, 2) ∈ Tδ } + Pr{ (X^n̄_{s,0}(1), Y^n̄_0, Ĥ^{(T),n̄}_{s,0}, s = 1, 2) ∈ Tδ^c },  m′1 ≠ 1    (8.39)

where in the last inequality we have used the union bound and that

(1/n) D(1, 1) < F(SNR) + δ  for  (X^n̄_{s,0}(1), Y^n̄_0, Ĥ^{(T),n̄}_{s,0}, s = 1, 2) ∈ Tδ.    (8.40)
Note that, due to Corollary 8.2,

Pr{ (X^n̄_{s,0}(1), Y^n̄_0, Ĥ^{(T),n̄}_{s,0}, s = 1, 2) ∈ Tδ^c }    (8.41)

can be made arbitrarily small by letting the codeword length n tend to infinity.
The computation of the GMI corresponding to the event E(m′1, 1), m′1 ≠ 1, requires the log moment-generating function of the metric D(m′1, 1) associated with an incorrect message m′1 ≠ 1, conditioned on the channel outputs, on the message from user 2, m2 = 1, and on the fading estimates, i.e.,

κ1,n(θ, SNR) = log E[ exp( (θ/n) ∑_{k∈D} Dk(m′1, 1) ) | (yk, x2,k(1), Ĥ(T)1,k, Ĥ(T)2,k), k ∈ D ]    (8.42)

where

Dk(m′1, 1) = ‖ yk − √SNR Ĥ(T)1,k x1,k(m′1) − √SNR Ĥ(T)2,k x2,k(1) ‖².    (8.43)
Following the steps used in Section 7.4.1, it is not difficult to show that

lim_{n→∞} (1/n) κ1,n(nθ, SNR) = ( 1/(L − nt,1 − nt,2) ) ∑_{ℓ=nt,1+nt,2}^{L−1} ( g1,ℓ − E[ log det( Inr − θ SNR Ĥ(T)1,ℓ Ĥ(T)†1,ℓ ) ] ),  almost surely    (8.44)

where

g1,ℓ ≜ E[ θ ( Yℓ − √SNR Ĥ(T)2,ℓ X2,ℓ )† ( Inr − θ SNR Ĥ(T)1,ℓ Ĥ(T)†1,ℓ )^{−1} ( Yℓ − √SNR Ĥ(T)2,ℓ X2,ℓ ) ].    (8.45)
Furthermore, following the derivation in [39, 54], we can bound the ensemble-average error probability of the event E(m′1, 1), m′1 ≠ 1, for any δ′ > 0 as

Pr{ ∪_{m′1≠1} E(m′1, 1) } ≤ exp(nR1) exp( −n ( Igmi1,T(SNR) − δ′ ) ) + ε1(δ′, n)    (8.46)
for some function ε1(δ′, n) satisfying

lim_{n→∞} ε1(δ′, n) = 0.    (8.47)

Here Igmi1,T(SNR) is the GMI corresponding to the event E(m′1, 1), m′1 ≠ 1, for a fixed T:

Igmi1,T(SNR) = ( (L − nt,1 − nt,2)/L ) sup_{θ<0} ( θ F(SNR) − κ1(θ, SNR) )    (8.48)
where κ1(θ, SNR) is given by the RHS of (8.44), i.e.,

κ1(θ, SNR) = ( 1/(L − nt,1 − nt,2) ) ∑_{ℓ=nt,1+nt,2}^{L−1} ( g1,ℓ − E[ log det( Inr − θ SNR Ĥ(T)1,ℓ Ĥ(T)†1,ℓ ) ] ).    (8.49)
By noting that g1,ℓ ≤ 0 for θ ≤ 0 (which can be shown using the technique developed in Appendix A.3), combining (8.32) and (8.49) with (8.48), and substituting the choice^{8.5}

θ = −1 / ( nr + nr (nt,1 + nt,2) SNR ε²∗,T )    (8.50)

where

ε²∗,T = max_{s=1,2; r=1,...,nr; t=1,...,nt,s; ℓ=nt,1+nt,2,...,L−1} E[ |E(T)s,ℓ(r, t)|² ],    (8.51)

we obtain the GMI lower bound

Igmi1,T(SNR) ≥ (1/L) ∑_{ℓ=nt,1+nt,2}^{L−1} ( log det( Inr + SNR Ĥ(T)1,ℓ Ĥ(T)†1,ℓ / ( nr + nr (nt,1 + nt,2) SNR ε²∗,T ) ) − 1 ).    (8.52)
We continue by analysing the RHS of (8.52) in the limit as the observation window T of the channel estimator tends to infinity. To this end, following the derivation in Section 7.4.1, we note that, for L ≤ 1/(2λD), the variance of the interpolation error E[|E(T)s,ℓ(r, t)|²] tends to (7.15) (with SNR in (7.15) replaced by nt SNR):

lim_{T→∞} E[ |E(T)s,ℓ(r, t)|² ] = ε² = 1 − ∫_{−1/2}^{1/2} SNR (fH(λ))² / ( SNR fH(λ) + L ) dλ    (8.53)

8.5 As pointed out in Section 7.4.1, this choice of θ yields a good lower bound at high SNR.
irrespective of s, ℓ, r and t. Hence, irrespective of ℓ, the estimate Ĥ(T)1,ℓ tends to H̄ in distribution as T tends to infinity, so

Ĥ(T)1,ℓ Ĥ(T)†1,ℓ / ( nr + nr (nt,1 + nt,2) SNR ε²∗,T )  →d  H̄ H̄† / ( nr + nr (nt,1 + nt,2) SNR ε² )    (8.54)

where the entries of H̄ are i.i.d., circularly-symmetric, complex-Gaussian random variables with zero mean and variance 1 − ε². Since the function A ↦ det(I + A) is continuous and bounded from below, we obtain from Portmanteau's lemma [99] that

Igmi1(SNR) = lim_{T→∞} Igmi1,T(SNR)    (8.55)

≥ ( (L − nt,1 − nt,2)/L ) ( E[ log det( Inr + SNR H̄ H̄† / ( nr + nr (nt,1 + nt,2) SNR ε² ) ) ] − 1 )    (8.56)

≥ ( (L − nt,1 − nt,2)/L ) min(nr, nt,1) ( log SNR − log( nr + nr (nt,1 + nt,2) SNR ε² ) ) + ( (L − nt,1 − nt,2)/L ) Ψ    (8.57)

where

Ψ ≜ E[ log det( H̄ H̄† ) ] − 1 if nr ≤ nt,1, and Ψ ≜ E[ log det( H̄† H̄ ) ] − 1 if nr > nt,1.    (8.58)

Here the last inequality follows by lower-bounding log det(I + A) ≥ log det A.
Following the pre-log evaluation in Section 7.4.1, we obtain a lower bound on the maximum achievable pre-log:

ΠR∗1 ≥ lim_{SNR→∞} Igmi1(SNR)/log SNR ≥ min(nr, nt,1) ( 1 − (nt,1 + nt,2)/L ),  L ≤ 1/(2λD).    (8.59)
The condition L ≤ 1/(2λD) is necessary since otherwise (7.15) would not hold.
This yields one boundary of the pre-log region presented in Theorem 8.1.
Igmi2(SNR) – Error Event (m′1 = 1, m′2 ≠ 1)

This follows from the proof for the error event (m′1 ≠ 1, m′2 = 1) by swapping user 1 and user 2. We thus have

ΠR∗2 ≥ min(nr, nt,2) ( 1 − (nt,1 + nt,2)/L ),  L ≤ 1/(2λD)    (8.60)
yielding the second boundary of the pre-log region presented in Theorem 8.1.
Igmi1+2(SNR) – Error Event (m′1 ≠ 1, m′2 ≠ 1)

The computation of the GMI corresponding to the event E(m′1, m′2), (m′1 ≠ 1, m′2 ≠ 1), requires the log moment-generating function of the metric D(m′1, m′2) associated with incorrect messages m′1 ≠ 1 and m′2 ≠ 1, conditioned on the channel outputs and the fading estimates, i.e.,

κ1+2,n(θ, SNR) = log E[ exp( (θ/n) ∑_{k∈D} Dk(m′1, m′2) ) | (yk, Ĥ(T)1,k, Ĥ(T)2,k), k ∈ D ]    (8.61)

where

Dk(m′1, m′2) = ‖ yk − √SNR Ĥ(T)1,k x1,k(m′1) − √SNR Ĥ(T)2,k x2,k(m′2) ‖².    (8.62)
As a consequence of the ergodicity condition in Part 3) of Lemma 7.1, we can show for all θ < 0 that

lim_{n→∞} (1/n) κ1+2,n(nθ, SNR)
= ( 1/(L − nt,1 − nt,2) ) ∑_{ℓ=nt,1+nt,2}^{L−1} E[ θ Yℓ† ( Inr − θ SNR ( Ĥ(T)1,ℓ Ĥ(T)†1,ℓ + Ĥ(T)2,ℓ Ĥ(T)†2,ℓ ) )^{−1} Yℓ ]
− ( 1/(L − nt,1 − nt,2) ) ∑_{ℓ=nt,1+nt,2}^{L−1} E[ log det( Inr − θ SNR ( Ĥ(T)1,ℓ Ĥ(T)†1,ℓ + Ĥ(T)2,ℓ Ĥ(T)†2,ℓ ) ) ],
almost surely.    (8.63)
As above, the GMI corresponding to the event E(m′1, m′2), (m′1 ≠ 1, m′2 ≠ 1), can be evaluated as [39, 54]

Igmi1+2,T(SNR) = ( (L − nt,1 − nt,2)/L ) sup_{θ<0} ( θ F(SNR) − κ1+2(θ, SNR) )    (8.64)
where F(SNR) is given in (8.32), and where κ1+2(θ, SNR) is given by the RHS of (8.63), i.e.,

κ1+2(θ, SNR)
= ( 1/(L − nt,1 − nt,2) ) ∑_{ℓ=nt,1+nt,2}^{L−1} E[ θ Yℓ† ( Inr − θ SNR ( Ĥ(T)1,ℓ Ĥ(T)†1,ℓ + Ĥ(T)2,ℓ Ĥ(T)†2,ℓ ) )^{−1} Yℓ ]
− ( 1/(L − nt,1 − nt,2) ) ∑_{ℓ=nt,1+nt,2}^{L−1} E[ log det( Inr − θ SNR ( Ĥ(T)1,ℓ Ĥ(T)†1,ℓ + Ĥ(T)2,ℓ Ĥ(T)†2,ℓ ) ) ].    (8.65)
The sum-rate GMI Igmi1+2,T(SNR) can then be viewed as the GMI of an nr × (nt,1 + nt,2)-dimensional MIMO channel with channel matrix (H1,k, H2,k). Noting that the channel estimator produces the channel-estimate matrix (Ĥ(T)1,k, Ĥ(T)2,k), it thus follows from Section 7.4.1 that the pre-log

ΠR∗1+2 ≜ lim sup_{SNR→∞} R∗1+2(SNR)/log SNR    (8.66)

is lower-bounded by

ΠR∗1+2 ≥ min(nr, nt,1 + nt,2) ( 1 − (nt,1 + nt,2)/L )    (8.67)

for L ≤ 1/(2λD). This yields the third boundary of the pre-log region presented in Theorem 8.1.

Combining (8.59), (8.60) and (8.67), and noting that the boundary is maximised when L is the largest integer satisfying L ≤ 1/(2λD), proves Theorem 8.1.
8.6 Conclusion
We have considered a two-user MIMO fading MAC and proposed a joint-transmission scheme using nearest neighbour decoding and pilot-aided channel estimation. We have analysed the achievable rate region and have derived the corresponding pre-log region. We have shown that the achievable pre-log region is the largest pre-log region for any scheme that transmits as many pilot symbols per training period as the total number of transmit antennas of both users.
We have compared the joint-transmission scheme with TDMA and have de-
rived sufficient conditions when the joint-transmission scheme is better than
TDMA and when TDMA is better than the joint-transmission scheme. We have
shown that TDMA is optimal for the SISO fading MAC under a peak-power constraint in the sense that it achieves the capacity pre-log region of the SISO fading MAC. The joint-transmission scheme is typically better for channels with a large coherence time when the receiver employs more antennas than the sum of transmit antennas. A large coherence time occurs when the mobiles move at small to moderate speeds (up to 200 km/h) and the delay spread is not very large, which is typical for indoor and urban environments. Moreover, employing many receive antennas is feasible in uplink transmission: the number of transmit antennas at each user is limited by the size of the device, whereas the base station has more space to employ additional antennas. This suggests the potential of the joint-transmission scheme for uplink transmission in indoor and urban environments. In other environments, one may consider TDMA for multiple-access transmission.
Chapter 9
Summary and Future Research
In this chapter, we summarise the main contributions of this dissertation and
identify several potential areas for future research.
9.1 Main Contributions
This dissertation has focused on nearest neighbour decoding for fading channels
when the receiver does not have access to perfect CSI.
9.1.1 Part I
Part I studied nearest neighbour decoding in block-fading channels. The block-
fading channel is non-ergodic and its fundamental limit (assuming perfect CSIR)
is given by the outage probability, the probability that the channel is unable to
support the data rate. This fundamental limit can be achieved using nearest
neighbour decoding (which is optimal for perfect CSIR) and a good code design.
In practice, the CSIR is imperfect. Using mismatched-decoding approaches,
we have formulated a technique to study the imperfectness of CSIR and its
effect on nearest neighbour decoding. We have introduced the generalised outage
probability as a novel efficient tool for evaluating the reliability of transmission
over a block-fading channel. Moreover, using error exponents for mismatched
decoders, we have proved the achievability of the generalised outage probability.
Using the GMI converse, we have further shown that the generalised outage
probability is the fundamental limit for i.i.d. codebooks.
Studying the outage and the generalised outage diversities, which characterise
the high-SNR behaviour of the outage and the generalised outage probabilities,
respectively, reveals important system design criteria. We have shown that for
both Gaussian and discrete inputs, the generalised outage diversity is given by the
perfect-CSIR outage diversity times the minimum of the channel estimation error
diversity—which measures the decay of the channel estimation error variance as
a function of the SNR—and one. Therefore, in order to achieve the highest
possible diversity, the channel estimator should be designed so as to make the
estimation error diversity equal to or larger than one. We have further determined
a threshold on the required block length of random codes in order to achieve
the generalised outage diversity. The results obtained apply to a general fading model subsuming the Rayleigh, Rician, Nakagami-m, Nakagami-q and Weibull distributions, as well as optical wireless channels with lognormal-Rice and gamma-gamma scintillations.
To improve the generalised outage diversity, we have considered IR-ARQ, as
an adaptive transmission scheme based on binary feedback. Considering both
uniform and optimal power allocation, we have expressed the resulting ARQ
diversity as a function of the quality of feedback, the quality of channel estima-
tion and the maximum number of ARQ rounds. We have derived the condition
when IR-ARQ can provide a significant gain with respect to non-adaptive trans-
mission. We have also determined the condition when power-controlled ARQ is
superior to uniform-power ARQ. In order to utilise the full benefits offered by
power-controlled ARQ, we have shown that the quality of channel estimation has
to improve with the number of rounds. Our results highlight the importance of
accounting for imperfections in the channel for system design and provide guide-
lines on designing a good channel estimator and feedback signalling for IR-ARQ
schemes.
Power can be adapted if CSIT is available. Depending on the number of im-
perfect CSIT blocks prior to transmission, we consider full-, causal- and predictive-
CSIT power allocations. For full CSIT, we have characterised the generalised
outage diversity and have expressed it as a function of the qualities of CSIR
and CSIT and the perfect-CSIR outage diversity with uniform power alloca-
tion. For causal and predictive CSIT, the generalised outage diversity does not
only depend on the qualities of CSIR and CSIT but also on the CSIT delay
or the CSIT prediction parameter. Our results suggest that imperfect CSIR
has more detrimental effects on the generalised outage diversity than imperfect
CSIT. Hence, obtaining reliable CSIR is more important than obtaining reliable CSIT. The diversity characterisation allows us to determine the condition under which power adaptation is not beneficial with respect to uniform power allocation.
The results shed new light on the design of pilot-assisted channel estimation in
block-fading channels.
9.1.2 Part II
Part II studied nearest neighbour decoding in stationary ergodic noncoherent
fading channels. Reliable transmission over these channels is possible at rates
below the channel capacity. Due to the absence of CSI, the capacity of the
noncoherent fading channel is smaller than the capacity of the coherent fading
channel.
We have proposed a scheme for point-to-point MIMO fading channels that
estimates the fading with regular training via orthogonal pilots and feeds the
fading estimates to the nearest neighbour decoder. Assuming a bandlimited psd
of the fading process, we studied a set of information rates achievable with this
scheme in the limit as the SNR tends to infinity. Our results reveal that in
order to obtain reliable fading estimates, the portion of time required for pilot
transmission cannot be less than the number of transmit antennas times twice
the bandwidth of the fading psd. Using reliable estimates of the fading, the
nearest neighbour decoder can achieve a positive pre-log, which is given by the
capacity pre-log of the coherent fading channel times the fraction of time used
for the transmission of data. Hence, the loss with respect to the coherent case is
solely due to the transmission of orthogonal pilots used to obtain accurate fading
estimates. Furthermore, if the inverse of twice the bandwidth of the fading psd is an integer, then for MISO channels, our scheme achieves the capacity pre-log of the noncoherent fading channel derived by Koch and Lapidoth [91]. For noncoherent MIMO channels, our scheme achieves the best lower bound known so far on the capacity pre-log, obtained by Etkin and Tse [1].
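The achievable pre-log described above is easy to evaluate. In the following sketch, W denotes the bandwidth of the fading psd and the coherent capacity pre-log is taken as min(nt, nr); both readings are taken from the discussion above, and the function name is illustrative:

```python
def achievable_prelog(nt, nr, w):
    """Pre-log of the pilot-assisted scheme: the coherent pre-log
    min(nt, nr) scaled by the fraction of time left for data after
    spending a fraction of at least 2*w per transmit antenna on
    orthogonal pilots (illustrative reading of the result in the text)."""
    pilot_fraction = 2.0 * w * nt   # cannot be less than nt times 2W
    if pilot_fraction >= 1.0:
        return 0.0                  # no time left for data transmission
    return min(nt, nr) * (1.0 - pilot_fraction)

# 2x2 MIMO with a slowly varying fading process (small W)
print(achievable_prelog(2, 2, 0.05))
```

The loss relative to the coherent pre-log is visibly the pilot overhead 2*w*nt, vanishing as the fading bandwidth shrinks.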
We have extended our analysis in the point-to-point MIMO channel to the
two-user fading MAC and have proposed a joint-transmission scheme for both
transmitting terminals. We have analysed the rate region that is achievable
with nearest neighbour decoding and pilot-aided channel estimation and have
determined the corresponding pre-log region. We have compared the joint-
transmission scheme with TDMA. We have shown that the joint-transmission
scheme is typically better than TDMA if the number of receive antennas is
larger than the sum of transmit antennas and if the bandwidth of the fading
psd is small. This shows the potential of the joint-transmission scheme in uplink
transmission where the base station can employ more antennas than the sum of
all antennas from mobile devices. If the number of receive antennas is smaller
than the sum of all transmit antennas, then TDMA may be superior to joint transmission. Indeed, for the SISO fading MAC, TDMA is optimal in the sense that it achieves the capacity pre-log region.
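The SISO statement can be illustrated numerically. Assuming (for illustration only) that a single user on the SISO noncoherent channel achieves pre-log 1 - 2*w with pilot-aided transmission, TDMA time-sharing traces out pre-log pairs whose sum is constant:

```python
def tdma_prelog_pair(gamma, w):
    """Pre-log pair achieved by TDMA on a two-user SISO fading MAC:
    user 1 transmits a fraction gamma of the time, user 2 the rest.
    Assumes (illustrative) a single-user SISO pre-log of 1 - 2*w."""
    single_user = 1.0 - 2.0 * w
    return gamma * single_user, (1.0 - gamma) * single_user

# sweeping gamma traces the boundary p1 + p2 = 1 - 2*w
pairs = [tdma_prelog_pair(g, w=0.1) for g in (0.0, 0.5, 1.0)]
print(pairs)
```

Every time-sharing point lies on the same line p1 + p2 = 1 - 2*w, which is why TDMA fills a triangular pre-log region in the SISO case.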
9.2 Areas for Future Research
In Part I, we have shown that the generalised outage probability is the fundamen-
tal limit for i.i.d. codebooks. Lifting the restriction of i.i.d. codebooks may yield
a better error performance, as shown in [13, 36, 38]. Finding the fundamental
limits for general codes involves the discovery of a general tight converse bound,
which is still an open problem and can be considered for further research.
Our achievability results in Part I have been derived using random coding.
Random coding is of limited practical interest; it only implies that, given reliable fading estimation, there exists a good code achieving the optimal code diversity. Using our criteria of reliable fading estimation, one can investigate the
performance of existing structured codes under imperfect CSIR and compare the
high-SNR slope of the error probability with the outage diversity.
We have highlighted in Part I that practical system design for block-fading
channels has to account for all possible imperfections in the system and the
channel. For example, power control with imperfect ARQ feedback and imperfect CSI depends on how noisy the feedback and the CSI are. In this dissertation, we have not derived the power control algorithm that achieves the optimal diversity; this is a potential problem for future research.
So far, we have only considered independent block-fading channels. This model may be too simple to capture practical scenarios, and a more general correlated block-fading model can be considered for future research. This is highly relevant for predictive-CSIT power allocation, where past fading realisations can be used to predict future values. Based on the past fading and on the number of predicted blocks, the transmitter allocates the power for the current codeword transmission. This research area will involve tools from stochastic prediction theory.
In Part II, we have considered orthogonal pilots to estimate the fading and
observed that the main loss in the pre-log is due to the portion of time required
to transmit pilots. We note that orthogonal pilots require the number of pilot
vectors to be equal to the number of transmit antennas. Thus, in order to
reduce the portion of time for pilot transmission, one may consider designing
non-orthogonal pilots. It is prima facie not clear whether this will still yield a
fading estimation error whose variance vanishes with increasing SNR, which is
critical to achieve a positive pre-log.
In Part II, we have only dealt with bandlimited fading processes and the rate pre-log. If the fading process is not bandlimited, then the rate pre-log is zero and may not be a useful metric. References [85, 101] have shown that for non-bandlimited fading processes, the capacity may grow, inter alia, only double-logarithmically with the SNR, and the fading number has been introduced as a performance metric to characterise the dominating term in the high-SNR regime. Future research may consider developing a scheme for non-bandlimited fading channels using the fading number as the performance metric.
In both Parts I and II, we have mostly resorted to asymptotic analyses in the limit of large block length; we have only considered finite block length in the random-coding achievability. A broader area for future research may encompass theoretical analysis at finite block length. In our opinion, this is the most exciting research area to be studied. The inherent difficulty of finite block-length analysis is that many convergence results in the literature, such as the law of large numbers, the ergodic theorem, and large-deviation techniques, cannot be applied. One therefore needs to resort to new tools such as information density methods [102]. Recent progress in this area can be found in [103–105]. Results on noncoherent fading channels are still limited, and many topics, including mismatched decoding in the finite block-length regime, can be further explored.
Appendix A
A.1 Proof of Lemma 3.1 (Discrete Inputs)
We first bound the pdf of the fading (3.4) as1.1
$$w_0|h|^{\tau}e^{-w_1(|h|+|w_2|)^{\varphi}} \le w_0|h|^{\tau}e^{-w_1|h-w_2|^{\varphi}} \le w_0|h|^{\tau}e^{-w_1\left||h|-|w_2|\right|^{\varphi}}. \tag{A.1}$$
Let $h = |h|e^{\imath\phi_h}$ and $\alpha = -\frac{\log|h|^2}{\log\mathrm{SNR}}$. The lower bound for the joint pdf of $A_{b,r,t}$ and $\Phi^H_{b,r,t}$ is given by
$$P_{A_{b,r,t},\Phi^H_{b,r,t}}(\alpha,\phi) \ge \frac{w_0}{2}\log\mathrm{SNR}\cdot\mathrm{SNR}^{-\left(1+\frac{\tau}{2}\right)\alpha}\cdot e^{-w_1\left(\mathrm{SNR}^{-\frac{\alpha}{2}}+|w_2|\right)^{\varphi}}. \tag{A.2}$$
For α < 0, we can see from the exponential term that the above lower bound
decays exponentially with the SNR; for α ≥ 0, the exponential term converges
to a constant as SNR ↑ ∞. As for the joint pdf upper bound, everything remains
unchanged except for the exponential term. We can write the upper bound for the exponential term as follows:
$$e^{-w_1\left||h|-|w_2|\right|^{\varphi}} = e^{-w_1\left|\mathrm{SNR}^{-\frac{\alpha}{2}}-|w_2|\right|^{\varphi}}. \tag{A.3}$$
If α < 0, then the above term decays exponentially with the SNR. On the other
hand, if α ≥ 0, then the above term converges to a constant as SNR ↑ ∞. We
therefore have that both upper and lower bounds behave similarly for high SNR.
Let OX be the asymptotic perfect-CSIR outage set for the discrete constellation
X, which has been characterised in [32]. We then have the outage probability for
1.1 For any $\varphi \ge 1$, applying the triangle and reverse-triangle inequalities yields $|h-w_2|^{\varphi} \le (|h|+|w_2|)^{\varphi}$ and $|h-w_2| \ge \left||h|-|w_2|\right| \ge 0$.
$n_r \times n_t$ MIMO channels with $B$ fading blocks
$$P^{\mathcal{X}}_{\mathrm{out}}(R) \doteq \mathrm{SNR}^{-d^{\mathcal{X}}_{\mathrm{csir}}} \tag{A.4}$$
$$\doteq \int_{\mathcal{O}_{\mathcal{X}}\cap\{\mathbf{A}\succeq\mathbf{0}\}} \mathrm{SNR}^{-\left(1+\frac{\tau}{2}\right)\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{b,r,t}}\, d\mathbf{A}\, d\mathbf{\Phi}^H. \tag{A.5}$$
Applying Varadhan's lemma [106] yields
$$d^{\mathcal{X}}_{\mathrm{csir}} = \inf_{\mathcal{O}_{\mathcal{X}}\cap\{\mathbf{A}\succeq\mathbf{0}\}} \left(1+\frac{\tau}{2}\right)\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{b,r,t} \tag{A.6}$$
which is exactly the Rayleigh-fading result [32] multiplied by $\left(1+\frac{\tau}{2}\right)$.
A.2 Proof of Theorem 3.1 (Discrete Inputs)
We first state the following lemma, which applies to both i.i.d. Gaussian and
discrete inputs.
Lemma A.1. Consider the MIMO block-fading channel (3.3) with mismatched
CSIR (3.5). Denote the high-SNR generalised outage set by O, which is expressed
in terms of the normalised fading matrix A, fading phase matrix ΦH, normalised
error matrix Θ and error phase matrix ΦE. Then, the generalised outage proba-
bility satisfies
$$P_{\mathrm{gout}}(R) \doteq \mathrm{SNR}^{-d_{\mathrm{icsir}}} \tag{A.7}$$
$$\doteq \int_{\mathcal{O}} P_{\mathbf{A},\mathbf{\Phi}^H}(\mathbf{A},\mathbf{\Phi}^H)\, P_{\mathbf{\Theta}}(\mathbf{\Theta})\, P_{\mathbf{\Phi}^E}(\mathbf{\Phi}^E)\, d\mathbf{A}\, d\mathbf{\Theta}\, d\mathbf{\Phi}^H\, d\mathbf{\Phi}^E \tag{A.8}$$
$$\doteq \int_{\mathcal{O}} P_{\mathbf{A},\mathbf{\Phi}^H}(\mathbf{A},\mathbf{\Phi}^H)\, P_{\mathbf{\Theta}}(\mathbf{\Theta})\, d\mathbf{A}\, d\mathbf{\Theta}\, d\mathbf{\Phi}^H\, d\mathbf{\Phi}^E. \tag{A.9}$$
For the fading model (3.4), $d_{\mathrm{icsir}}$ is given by the solution of the following infimum
$$d_{\mathrm{icsir}} = \inf_{\mathcal{O}\cap\{\mathbf{A}\succeq\mathbf{0},\,\mathbf{\Theta}\succeq d_e\mathbf{1}\}}\left\{\left(1+\frac{\tau}{2}\right)\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{b,r,t} + \sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}(\theta_{b,r,t}-d_e)\right\}. \tag{A.10}$$
Proof. The joint probability of $\mathbf{A}$, $\mathbf{\Theta}$, $\mathbf{\Phi}^H$ and $\mathbf{\Phi}^E$ over $\mathcal{O}$ can be written as (A.9) because the random matrices $\mathbf{\Theta}$ and $\mathbf{\Phi}^E$ are independent. Since each entry of $\mathbf{\Phi}^E$ is uniformly distributed over $[0, 2\pi)$, the density $P_{\mathbf{\Phi}^E}(\mathbf{\Phi}^E)$ does not affect the dot equality. The lemma is then obtained by applying Varadhan's lemma [106] to (A.9). The condition $\mathbf{A}\succeq\mathbf{0}$ is the same as that for perfect CSIR in Appendix A.1. On the other hand, the condition $\mathbf{\Theta}\succeq d_e\mathbf{1}$ is derived as follows. Consider the entry of $\mathbf{\Theta}$ at block $b$, receive antenna $r$ and transmit antenna $t$, $\Theta_{b,r,t}$. The pdf of $\Theta_{b,r,t}$ is given by
$$P_{\Theta_{b,r,t}}(\theta) = \log\mathrm{SNR}\cdot\mathrm{SNR}^{d_e-\theta}\cdot\exp\left(-\mathrm{SNR}^{d_e-\theta}\right). \tag{A.11}$$
We can see that the interval on which the pdf (A.11) does not decay exponentially with the SNR is $\theta\ge d_e$. The condition $\mathbf{\Theta}\succeq d_e\mathbf{1}$ follows by considering all entries of $\mathbf{\Theta}$.
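As a sanity check, the density (A.11) integrates to one for any SNR > 1 and any d_e: the substitution u = SNR^(d_e - theta) turns it into the integral of e^(-u) over (0, infinity). A small numerical sketch (variable names are illustrative):

```python
import numpy as np

def theta_pdf(theta, snr, d_e):
    """Pdf of the normalised error exponent Theta_{b,r,t}, eq. (A.11)."""
    u = snr ** (d_e - theta)        # substitution u = SNR^(d_e - theta)
    return np.log(snr) * u * np.exp(-u)

# integrate over a wide theta-grid; the mass concentrates near theta = d_e
theta = np.linspace(-10.0, 40.0, 400_001)
for snr, d_e in [(10.0, 1.0), (100.0, 0.5)]:
    pdf = theta_pdf(theta, snr, d_e)
    mass = float(np.sum(0.5 * (pdf[1:] + pdf[:-1]) * np.diff(theta)))
    assert abs(mass - 1.0) < 1e-3
print("pdf normalisation verified")
```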
We start the proof of the discrete-input SNR-exponent with the proof for
SISO channels. The proof for MIMO channels follows as an extension of the
proof for SISO channels.
A.2.1 SISO Case
GMI Lower Bound
For the SISO channel, (3.24) becomes
$$I^{\mathrm{gmi}}_b(\mathrm{SNR}, h_b, \hat h_b, s) = M - \frac{1}{2^M}\sum_{x\in\mathcal{X}}\mathbb{E}\left[\log_2\sum_{x'\in\mathcal{X}} e^{-s\left|\sqrt{\mathrm{SNR}}(h_b x-\hat h_b x')+Z\right|^2+s\left|\sqrt{\mathrm{SNR}}(h_b-\hat h_b)x+Z\right|^2}\right] \tag{A.12}$$
and the GMI is given by
$$I^{\mathrm{gmi}}(\mathbf{h}) = \sup_{s>0}\frac{1}{B}\sum_{b=1}^{B} I^{\mathrm{gmi}}_b(\mathrm{SNR}, h_b, \hat h_b, s). \tag{A.13}$$
For a given $s>0$ and any noise realisation $z\in\mathbb{C}$, we can bound the summation of the term inside the expectation in (A.12) as
$$0 \le \sum_{b=1}^{B}\log_2\sum_{x'\in\mathcal{X}} e^{-s\left|\sqrt{\mathrm{SNR}}(h_b x-\hat h_b x')+z\right|^2+s\left|\sqrt{\mathrm{SNR}}(h_b-\hat h_b)x+z\right|^2} \tag{A.14}$$
$$\le \sum_{b=1}^{B}\log_2\left(|\mathcal{X}|\, e^{s\left|\sqrt{\mathrm{SNR}}(h_b-\hat h_b)x+z\right|^2}\right) \tag{A.15}$$
$$= \sum_{b=1}^{B}\frac{\log|\mathcal{X}|+s\left|z-\sqrt{\mathrm{SNR}}\,e_b x\right|^2}{\log 2}. \tag{A.16}$$
Taking the expectation over $Z$ yields
$$\mathbb{E}\left[\sum_{b=1}^{B}\frac{\log|\mathcal{X}|+s\left|Z-\sqrt{\mathrm{SNR}}\,e_b x\right|^2}{\log 2}\right] = \frac{B\log|\mathcal{X}|+s\left(B+\mathrm{SNR}\sum_{b=1}^{B}|e_b|^2|x|^2\right)}{\log 2}. \tag{A.17}$$
Note that $|\mathcal{X}|$ and $|x|^2$, $\forall x\in\mathcal{X}$, are assumed to be finite and independent of the SNR. Thus, to make sure that the RHS of (A.17) is finite,
$$\frac{B\log|\mathcal{X}|+s\left(B+\mathrm{SNR}\sum_{b=1}^{B}|e_b|^2|x|^2\right)}{\log 2} < \infty, \tag{A.18}$$
we can pick $s$ in the set $\mathcal{S}\subseteq\mathbb{R}$,
$$\mathcal{S} \triangleq \left\{s\in\mathbb{R} : 0 < s \le \frac{1}{B+\mathrm{SNR}\sum_{b=1}^{B}|e_b|^2}\right\}. \tag{A.19}$$
Hence, for any $s\in\mathcal{S}$, we can apply the dominated convergence theorem [19], for which
$$\lim_{\mathrm{SNR}\to\infty}\mathbb{E}\left[\log_2\sum_{x'\in\mathcal{X}} e^{-s\left|\sqrt{\mathrm{SNR}}(h_b x-\hat h_b x')+Z\right|^2+s\left|\sqrt{\mathrm{SNR}}(h_b-\hat h_b)x+Z\right|^2}\right] = \mathbb{E}\left[\lim_{\mathrm{SNR}\to\infty}\log_2\sum_{x'\in\mathcal{X}} e^{-s\left|\sqrt{\mathrm{SNR}}(h_b x-\hat h_b x')+Z\right|^2+s\left|\sqrt{\mathrm{SNR}}(h_b-\hat h_b)x+Z\right|^2}\right]. \tag{A.20}$$
Replacing the supremum over s > 0 in (A.13) with the supremum over s ∈ S
results in a lower bound to the GMI. Substituting a specific value of $s\in\mathcal{S}$ further lower-bounds the GMI. As we will show later, the choice
$$\bar s = \frac{1}{B+\mathrm{SNR}^{(1+\varepsilon)}\sum_{b=1}^{B}|e_b|^2}, \qquad \varepsilon > 0 \tag{A.21}$$
yields a tight GMI lower bound at high SNR. At high SNR, the range $\varepsilon>0$ allows for $\bar s\downarrow 0$ ($\varepsilon\uparrow\infty$) and for $\bar s\to\frac{1}{B+\mathrm{SNR}\sum_{b=1}^{B}|e_b|^2}$ ($\varepsilon\downarrow 0$).
Using the transformation of variables $\alpha_b = -\frac{\log|h_b|^2}{\log\mathrm{SNR}}$ and $\theta_b = -\frac{\log|e_b|^2}{\log\mathrm{SNR}}$, the exponential term in (A.12) with $s=\bar s$ becomes
$$e^{-\bar s\left|\sqrt{\mathrm{SNR}}(h_b x-\hat h_b x')+z\right|^2+\bar s\left|\sqrt{\mathrm{SNR}}(h_b-\hat h_b)x+z\right|^2} = e^{-\bar s\left|\mathrm{SNR}^{\frac{1-\alpha_b}{2}}e^{\imath\phi_{h_b}}(x-x')+z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x'\right|^2+\bar s\left|z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x\right|^2} \tag{A.22}$$
where $\phi_{h_b}$ and $\phi_{e_b}$ are the angles of $h_b$ and $e_b$, respectively. We partition $b\in\{1,\dots,B\}$ into four disjoint subsets as follows:
• $b\in\mathcal{B}_1$ if $\alpha_b>1$ and $\alpha_b<\theta_b$;
• $b\in\mathcal{B}_2$ if $\alpha_b<1$ and $\alpha_b<\theta_b$;
• $b\in\mathcal{B}_3$ if $\alpha_b>1$ and $\alpha_b>\theta_b$;
• $b\in\mathcal{B}_4$ if $\alpha_b<1$ and $\alpha_b>\theta_b$.
Note that $\bar s$ in (A.21) satisfies
$$\bar s \doteq \min\left(\mathrm{SNR}^0, \mathrm{SNR}^{\theta_{\min}-1-\varepsilon}\right). \tag{A.23}$$
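The dot equality (A.23) can be checked numerically. Since theta_b is defined by |e_b|^2 = SNR^(-theta_b), the SNR-exponent of the choice in (A.21) should approach min(0, theta_min - 1 - eps); a small sketch (parameter values are illustrative):

```python
import numpy as np

def s_bar(snr, theta, eps):
    """Suboptimal s from (A.21), modelling |e_b|^2 = SNR^(-theta_b)."""
    theta = np.asarray(theta, dtype=float)
    e_sq = snr ** (-theta)
    return 1.0 / (theta.size + snr ** (1.0 + eps) * e_sq.sum())

# SNR-exponent of s_bar approaches min(0, theta_min - 1 - eps) = -0.5
theta, eps = [0.6, 1.5, 2.0], 0.1           # theta_min = 0.6
for snr in (1e3, 1e6, 1e9):
    exponent = float(np.log(s_bar(snr, theta, eps)) / np.log(snr))
    print(round(exponent, 3))
```

With theta_min > 1 + eps the exponent instead tends to zero, matching the first argument of the minimum in (A.23).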
In order to obtain a good lower bound, we shall select $\varepsilon$ such that
$$0 < \varepsilon < \theta_{\min}-\alpha_{\max} \tag{A.24}$$
where
$$\alpha_{\max} = \max\left\{\alpha_b \,\middle|\, \alpha_b < \min(1,\theta_{\min}),\ b=1,\dots,B\right\}. \tag{A.25}$$
Then, the convergence of (A.22) as the SNR tends to infinity can be explained as follows.
1. For $b\in\mathcal{B}_1$, we have that $\alpha_b>1$ and $\alpha_b<\theta_b$. Under this condition and after exchanging the limit and the expectation in (A.20), it can be observed that (A.22) tends to one for any $s>0$. This implies that $I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,\bar s)\to 0$, $\forall b\in\mathcal{B}_1$.
2. For $b\in\mathcal{B}_2$, we have that $\alpha_b<1$ and $\alpha_b<\theta_b$. The dominating term in the exponent of (A.22) is $-\bar s\cdot\mathrm{SNR}^{1-\alpha_b}$. Thus, for $b\in\mathcal{B}_2$ and $\alpha_b<\theta_{\min}$, exchanging the limit and the expectation in (A.20) yields the convergence of (A.22) to zero as the SNR tends to infinity. We then have that $I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,\bar s)\to M$. On the other hand, for $b\in\mathcal{B}_2$ and $\alpha_b\ge\theta_{\min}$, we observe the convergence of (A.22) to one as the SNR tends to infinity. This implies that $I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,\bar s)\to 0$.
3. For $b\in\mathcal{B}_3$ and $\theta_b<1$, we have the dot equality
$$-\bar s\left|\mathrm{SNR}^{\frac{1-\alpha_b}{2}}e^{\imath\phi_{h_b}}(x-x')+z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x'\right|^2+\bar s\left|z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x\right|^2 \doteq -\bar s\left(|x'|^2-|x|^2\right)\mathrm{SNR}^{1-\theta_b} \tag{A.26}$$
for $|x|\neq|x'|$ and
$$-\bar s\left|\mathrm{SNR}^{\frac{1-\alpha_b}{2}}e^{\imath\phi_{h_b}}(x-x')+z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x'\right|^2+\bar s\left|z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x\right|^2 \doteq \bar s|z|\mathrm{SNR}^{\frac{1-\theta_b}{2}}\cos(\phi_z-\phi_{e_b})|x|\cdot\left(\cos\phi_{x'}-\cos\phi_x\right) + \bar s|z|\mathrm{SNR}^{\frac{1-\theta_b}{2}}\sin(\phi_z-\phi_{e_b})|x|\cdot\left(\sin\phi_{x'}-\sin\phi_x\right) \tag{A.27}$$
for $|x|=|x'|$, $x\neq x'$, where $\phi_z$ is the angle of $z$. In this case, we cannot use the dominated convergence theorem [19] in (A.20) since there is a dependency on $z$. Instead, since the logarithm is a concave function of its argument, we first apply Jensen's inequality [18, Th. 2.6.2] to the
expectation in (A.12):
$$\mathbb{E}\left[\log_2\sum_{x'\in\mathcal{X}} e^{-s\left|\sqrt{\mathrm{SNR}}(h_b x-\hat h_b x')+Z\right|^2+s\left|\sqrt{\mathrm{SNR}}(h_b-\hat h_b)x+Z\right|^2}\right] \le \log_2\mathbb{E}\left[\sum_{x'\in\mathcal{X}} e^{-s\left|\sqrt{\mathrm{SNR}}(h_b x-\hat h_b x')+Z\right|^2+s\left|\sqrt{\mathrm{SNR}}(h_b-\hat h_b)x+Z\right|^2}\right] \tag{A.28}$$
$$= \log_2\sum_{x'\in\mathcal{X}}\mathbb{E}\left[e^{-s\left|\sqrt{\mathrm{SNR}}(h_b x-\hat h_b x')+Z\right|^2+s\left|\sqrt{\mathrm{SNR}}(h_b-\hat h_b)x+Z\right|^2}\right]. \tag{A.29}$$
For a given $z\in\mathbb{C}$, we have the bounds
$$0 \le e^{-s\left|\sqrt{\mathrm{SNR}}(h_b x-\hat h_b x')+z\right|^2+s\left|\sqrt{\mathrm{SNR}}(h_b-\hat h_b)x+z\right|^2} \le e^{s\left|z-\sqrt{\mathrm{SNR}}\,e_b x\right|^2}. \tag{A.30}$$
Averaging over $Z$ yields
$$\mathbb{E}\left[e^{s\left|Z-\sqrt{\mathrm{SNR}}\,e_b x\right|^2}\right] = \frac{1}{1-s}\, e^{\left(\frac{s^2}{1-s}+s\right)\mathrm{SNR}|e_b|^2|x|^2} \tag{A.31}$$
where we have assumed $s<1$ so that the above expectation can be evaluated. Furthermore, using $s=\bar s$ in (A.21), the RHS of (A.31) can be guaranteed to be finite. Thus, with $s=\bar s$, we can apply the dominated convergence theorem [19] as
$$\lim_{\mathrm{SNR}\to\infty}\mathbb{E}\left[e^{-\bar s\left|\sqrt{\mathrm{SNR}}(h_b x-\hat h_b x')+Z\right|^2+\bar s\left|\sqrt{\mathrm{SNR}}(h_b-\hat h_b)x+Z\right|^2}\right] = \mathbb{E}\left[\lim_{\mathrm{SNR}\to\infty} e^{-\bar s\left|\sqrt{\mathrm{SNR}}(h_b x-\hat h_b x')+Z\right|^2+\bar s\left|\sqrt{\mathrm{SNR}}(h_b-\hat h_b)x+Z\right|^2}\right]. \tag{A.32}$$
For $|x|\neq|x'|$, using the relationship in (A.32) and $\varepsilon$ in (A.24), we observe that (A.26) tends to zero as the SNR tends to infinity and hence (A.22) tends to one. To evaluate (A.27), we first upper-bound
$$\bar s|z|\mathrm{SNR}^{\frac{1-\theta_b}{2}}\cos(\phi_z-\phi_{e_b})|x|\cdot\left(\cos\phi_{x'}-\cos\phi_x\right) + \bar s|z|\mathrm{SNR}^{\frac{1-\theta_b}{2}}\sin(\phi_z-\phi_{e_b})|x|\cdot\left(\sin\phi_{x'}-\sin\phi_x\right) \le 4\bar s|z|\mathrm{SNR}^{\frac{1-\theta_b}{2}}|x|. \tag{A.33}$$
Let $W=|Z|$. Then, $W$ has the Rayleigh pdf
$$P_W(w) = 2we^{-w^2}, \qquad w\ge 0. \tag{A.34}$$
Using the result in (A.29) and the upper bound (A.33) for $|x|=|x'|$, $x\neq x'$, we have that
$$\mathbb{E}\left[e^{4\bar s|Z|\mathrm{SNR}^{\frac{1-\theta_b}{2}}|x|}\right] = \mathbb{E}\left[e^{4\bar sW\mathrm{SNR}^{\frac{1-\theta_b}{2}}|x|}\right] \tag{A.35}$$
$$= \int_0^{\infty} e^{4\bar sw\mathrm{SNR}^{\frac{1-\theta_b}{2}}|x|}\cdot 2we^{-w^2}\, dw \tag{A.36}$$
$$= 1 + 2\sqrt{\pi}\,\bar s\,\mathrm{SNR}^{\frac{1-\theta_b}{2}}|x|\, e^{4\bar s^2\mathrm{SNR}^{1-\theta_b}|x|^2}\cdot\left(1+\operatorname{erf}\left(2\bar s\,\mathrm{SNR}^{\frac{1-\theta_b}{2}}|x|\right)\right) \tag{A.37}$$
$$\le 1 + 4\sqrt{\pi}\,\bar s\,\mathrm{SNR}^{\frac{1-\theta_b}{2}}|x|\, e^{4\bar s^2\mathrm{SNR}^{1-\theta_b}|x|^2} \tag{A.38}$$
where $\operatorname{erf}(\cdot)$ is the error function [53]. Inequality (A.38) is due to the bound $\operatorname{erf}(a)\le 1$. Note that for $\theta_b<1$, we have
$$\bar s\cdot\mathrm{SNR}^{\frac{1-\theta_b}{2}} \doteq \mathrm{SNR}^{\theta_{\min}-1-\varepsilon}\cdot\mathrm{SNR}^{\frac{1-\theta_b}{2}} \tag{A.39}$$
$$\le \mathrm{SNR}^{\theta_{\min}-1-\varepsilon}\cdot\mathrm{SNR}^{1-\theta_b} \tag{A.40}$$
$$\doteq \mathrm{SNR}^{\theta_{\min}-\theta_b-\varepsilon}. \tag{A.41}$$
As $\theta_{\min}-\varepsilon$ is always less than $\theta_b$, the last dot equality implies that as the SNR tends to infinity, the upper bound in (A.38) tends to one. This provides an upper bound to the expectation over $Z$ in (A.29) when $|x|=|x'|$, $x\neq x'$, $\alpha_b>1$ and $\theta_b<1$. Complementing the result with the one for $|x|\neq|x'|$, $\alpha_b>1$ and $\theta_b<1$, we have that $I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,\bar s)\to 0$ when $\theta_b<1$.
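Equality (A.37) is the standard Rayleigh moment-generating-function identity: for W with pdf (A.34) and c >= 0, E[e^{cW}] = 1 + (c/2)·sqrt(pi)·e^{c^2/4}·(1 + erf(c/2)), applied here with c = 4·s_bar·SNR^{(1-theta_b)/2}·|x|. A quick numerical verification of the identity:

```python
import math

def rayleigh_mgf_numeric(c, upper=12.0, n=200_000):
    """Midpoint-rule evaluation of E[e^{cW}] = integral of e^{cw} 2w e^{-w^2} dw."""
    dw = upper / n
    return sum(math.exp(c * w) * 2.0 * w * math.exp(-w * w) * dw
               for w in ((i + 0.5) * dw for i in range(n)))

def rayleigh_mgf_closed(c):
    """Closed form used in (A.37): 1 + (c/2) sqrt(pi) e^{c^2/4} (1 + erf(c/2))."""
    return 1.0 + 0.5 * c * math.sqrt(math.pi) * math.exp(0.25 * c * c) \
           * (1.0 + math.erf(0.5 * c))

for c in (0.0, 0.5, 2.0):
    assert abs(rayleigh_mgf_numeric(c) - rayleigh_mgf_closed(c)) < 1e-4
print("Rayleigh MGF identity verified")
```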
For $b\in\mathcal{B}_3$ and $\theta_b>1$, it can be observed that (A.22) tends to one as the SNR tends to infinity for any $s>0$. This implies that $I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,\bar s)\to 0$.
4. For $b\in\mathcal{B}_4$, we always have $\theta_b<1$. For $|x|\neq|x'|$, we have the dot equality
$$-\bar s\left|\mathrm{SNR}^{\frac{1-\alpha_b}{2}}e^{\imath\phi_{h_b}}(x-x')+z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x'\right|^2+\bar s\left|z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x\right|^2 \doteq -\bar s\left(|x'|^2-|x|^2\right)\mathrm{SNR}^{1-\theta_b}. \tag{A.42}$$
On the other hand, for $|x|=|x'|$, $x\neq x'$, we have the dot equality
$$-\bar s\left|\mathrm{SNR}^{\frac{1-\alpha_b}{2}}e^{\imath\phi_{h_b}}(x-x')+z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x'\right|^2+\bar s\left|z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x\right|^2 \doteq -\bar s\cdot\mathrm{SNR}^{1-\frac{\alpha_b+\theta_b}{2}}|x|^2\cdot\left(\cos(\phi_{eh_b})-\cos(\phi_{eh_b}+\phi_{x'x})\right). \tag{A.43}$$
Then, using $\varepsilon$ in (A.24) and exchanging the limit and the expectation in (A.20), it can be observed for both $|x|\neq|x'|$ and $|x|=|x'|$ ($x\neq x'$) that (A.22) tends to one as the SNR tends to infinity. Thus, we have that $I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,\bar s)\to 0$, $\forall b\in\mathcal{B}_4$.
From the above analysis, the generalised outage probability can be upper-bounded as
$$P_{\mathrm{gout}}(R) \doteq \mathrm{SNR}^{-d^{\mathcal{X}}_{\mathrm{icsir}}} \tag{A.44}$$
$$\mathrel{\dot\le} \Pr\left\{\frac{1}{B}\sum_{b=1}^{B} M\cdot\mathbb{1}\left\{\alpha_b\le 1-\epsilon \cap \alpha_b\le\theta_{\min}-\delta\right\} < R\right\} \tag{A.45}$$
$$\doteq \int_{\mathcal{O}^{\epsilon,\delta}_{\mathcal{X}}} P_{\mathbf{A},\mathbf{\Phi}^H}(\boldsymbol\alpha,\boldsymbol\phi^h)\, P_{\mathbf{\Theta}}(\boldsymbol\theta)\, P_{\mathbf{\Phi}^E}(\boldsymbol\phi^e)\, d\boldsymbol\alpha\, d\boldsymbol\theta\, d\boldsymbol\phi^h\, d\boldsymbol\phi^e \tag{A.46}$$
where we have defined
$$\mathcal{O}^{\epsilon,\delta}_{\mathcal{X}} \triangleq \left\{\boldsymbol\alpha,\boldsymbol\theta\in\mathbb{R}^B : \sum_{b=1}^{B}\mathbb{1}\left\{\alpha_b\le 1-\epsilon \cap \alpha_b\le\theta_{\min}-\delta\right\} < \frac{BR}{M}\right\} \tag{A.47}$$
for any $\epsilon,\delta>0$. Applying Lemma A.1 yields
$$d^{\mathcal{X}}_{\mathrm{icsir}} \ge \inf_{\mathcal{O}^{\epsilon,\delta}_{\mathcal{X}}\cap\{\boldsymbol\alpha\succeq\mathbf{0},\,\boldsymbol\theta\succeq d_e\mathbf{1}\}}\left\{\left(1+\frac{\tau}{2}\right)\sum_{b=1}^{B}\alpha_b+\sum_{b=1}^{B}(\theta_b-d_e)\right\}. \tag{A.48}$$
Following the steps used in [23], we can show that the values of $\boldsymbol\alpha$ and $\boldsymbol\theta$ achieving the infimum are given by
$$\theta^*_b = d_e, \qquad b=1,\dots,B \tag{A.49}$$
$$\alpha^*_b = \min(1-\epsilon,\,\theta^*_b-\delta), \qquad b=1,\dots,B-b^* \tag{A.50}$$
$$\alpha^*_b = 0, \qquad b=B-b^*+1,\dots,B \tag{A.51}$$
where $b^*\in\{0,\dots,B-1\}$ is the unique integer satisfying $\frac{b^*}{B}<\frac{R}{M}\le\frac{b^*+1}{B}$. As this is valid for any $\epsilon>0$ and $\delta>0$, the lower bound for $d^{\mathcal{X}}_{\mathrm{icsir}}$ is tight if we let $\epsilon,\delta\downarrow 0$. This yields
$$d^{\mathcal{X}}_{\mathrm{icsir}} \ge \min(1,d_e)\times\left(1+\frac{\tau}{2}\right)d_{\mathrm{SB}}(R) \tag{A.52}$$
where $d_{\mathrm{SB}}(R)$ is the Singleton bound
$$d_{\mathrm{SB}}(R) = 1+\left\lfloor B\left(1-\frac{R}{M}\right)\right\rfloor. \tag{A.53}$$
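The exponent in (A.52)-(A.53) is easy to evaluate numerically; a small sketch (the 16-QAM example parameters below are illustrative):

```python
import math

def singleton_bound(B, R, M):
    """Singleton bound on the block diversity, eq. (A.53)."""
    return 1 + math.floor(B * (1.0 - R / M))

def snr_exponent_lower(B, R, M, d_e, tau):
    """Lower bound on the SNR-exponent with imperfect CSIR, eq. (A.52):
    min(1, d_e) * (1 + tau/2) * d_SB(R)."""
    return min(1.0, d_e) * (1.0 + tau / 2.0) * singleton_bound(B, R, M)

# B = 4 blocks, 16-QAM (M = 4 bits/symbol), rate R = 2 bits/channel use
print(singleton_bound(4, 2.0, 4.0))                        # -> 3
print(snr_exponent_lower(4, 2.0, 4.0, d_e=1.5, tau=0.0))   # -> 3.0
```

With reliable channel estimation (d_e >= 1) the perfect-CSIR Singleton-bound diversity is recovered; a poor estimate (d_e < 1) scales the exponent down by d_e.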
GMI Upper Bound
For each $x,x'\in\mathcal{X}$, we define
$$f_{x,x'}(s_b,\mathrm{SNR},h_b,e_b,z) \triangleq e^{-s_b\left|\sqrt{\mathrm{SNR}}\,h_b(x-x')+z-\sqrt{\mathrm{SNR}}\,e_b x'\right|^2+s_b\left|z-\sqrt{\mathrm{SNR}}\,e_b x\right|^2} \tag{A.54}$$
and for each $x\in\mathcal{X}$, we define
$$f_x(s_b,\mathrm{SNR},h_b,e_b,z) \triangleq \log_2\sum_{x'\in\mathcal{X}} f_{x,x'}(s_b,\mathrm{SNR},h_b,e_b,z). \tag{A.55}$$
Then, by Proposition 2.3, the GMI can be upper-bounded as
$$I^{\mathrm{gmi}}(\mathbf{h}) \le \frac{1}{B}\sum_{b=1}^{B}\sup_{s_b>0} I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,s_b) \tag{A.56}$$
where
$$I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,s_b) = M-\frac{1}{2^M}\sum_{x\in\mathcal{X}}\mathbb{E}\left[f_x(s_b,\mathrm{SNR},h_b,e_b,Z)\right] \tag{A.57}$$
$$= M-\mathbb{E}\left[\frac{1}{2^M}\sum_{x\in\mathcal{X}} f_x(s_b,\mathrm{SNR},h_b,e_b,Z)\right]. \tag{A.58}$$
In order to evaluate the GMI upper bound, we first partition $\mathcal{X}$ into subsets of equi-energy signal points. Suppose that the constellation $\mathcal{X}$ has $n$ energy levels. Denote by $\mathcal{X}_{n'}$, $n'=1,\dots,n$, the subset of $\mathcal{X}$ corresponding to the $n'$-th energy level. Then, we can partition $\mathcal{X}$ into $n$ disjoint subsets $\mathcal{X}_{n'}$, $n'=1,\dots,n$, such that
$$\mathcal{X} = \mathcal{X}_1\cup\ldots\cup\mathcal{X}_n. \tag{A.59}$$
Note that for each $n'$, $n'=1,\dots,n$, all signal points in $\mathcal{X}_{n'}$ have the same energy. We shall use this partition in the following high-SNR analysis, which is based on Proposition 2.3 and Fatou's lemma [41]. To this end, we use the change of variables from $|h_b|^2$ and $|e_b|^2$ to $\alpha_b$ and $\theta_b$ so that we can write
$$e^{-s_b\left|\sqrt{\mathrm{SNR}}\,h_b(x-x')+z-\sqrt{\mathrm{SNR}}\,e_b x'\right|^2+s_b\left|z-\sqrt{\mathrm{SNR}}\,e_b x\right|^2} = e^{-s_b\left|\mathrm{SNR}^{\frac{1-\alpha_b}{2}}e^{\imath\phi_{h_b}}(x-x')+z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x'\right|^2+s_b\left|z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x\right|^2}. \tag{A.60}$$
We expand the exponential term and consider the following cases.
1. Case 1: $\alpha_b>1$. Regardless of the value of $\theta_b$, the supremum of $I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,s_b)$ over $s_b>0$ in (A.56) tends to zero, as it is upper-bounded by the perfect-CSIR mutual information for block $b$ [23].
2. Case 2: $\alpha_b<1$ and $\alpha_b<\theta_b$. The supremum on the RHS of (A.56) is equivalent to the following infimum
$$\inf_{s_b>0}\frac{1}{2^M}\sum_{x\in\mathcal{X}}\mathbb{E}\left[f_x(s_b,\mathrm{SNR},h_b,e_b,Z)\right], \tag{A.61}$$
which can be lower-bounded as
$$\inf_{s_b>0}\frac{1}{2^M}\sum_{x\in\mathcal{X}}\mathbb{E}\left[f_x(s_b,\mathrm{SNR},h_b,e_b,Z)\right] \ge \frac{1}{2^M}\sum_{x\in\mathcal{X}}\mathbb{E}\left[\inf_{s_b>0} f_x(s_b,\mathrm{SNR},h_b,e_b,Z)\right] \tag{A.62}$$
by exchanging the infimum over $s_b$ with both the summation and the expectation. Let $\tilde s_b$ be the value of $s_b$ that gives the infimum on the RHS of (A.62). The choice of $\tilde s_b$ depends on the behaviour of the term
$$e^{-s_b\left|\sqrt{\mathrm{SNR}}\,h_b(x-x')+z-\sqrt{\mathrm{SNR}}\,e_b x'\right|^2+s_b\left|z-\sqrt{\mathrm{SNR}}\,e_b x\right|^2}. \tag{A.63}$$
It follows that if
$$\left|\sqrt{\mathrm{SNR}}\,h_b(x-x')+z-\sqrt{\mathrm{SNR}}\,e_b x'\right|^2 > \left|z-\sqrt{\mathrm{SNR}}\,e_b x\right|^2 \tag{A.64}$$
the infimum is approached as $\tilde s_b\uparrow\infty$; otherwise, it is approached as $\tilde s_b\downarrow 0$. Note that since we have the dot equality
$$-s_b\left|\mathrm{SNR}^{\frac{1-\alpha_b}{2}}e^{\imath\phi_{h_b}}(x-x')+z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x'\right|^2+s_b\left|z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x\right|^2 \doteq -s_b\,\mathrm{SNR}^{1-\alpha_b} \tag{A.65}$$
for $x\neq x'$, $\alpha_b<1$ and $\alpha_b<\theta_b$, it follows that in this case $\tilde s_b\uparrow\infty$.
Since $f_x(s_b,\mathrm{SNR},h_b,e_b,z)\ge 0$, we can apply Fatou's lemma [41] to the RHS of (A.62) as follows
$$\lim_{\mathrm{SNR}\to\infty}\frac{1}{2^M}\sum_{x\in\mathcal{X}}\mathbb{E}\left[f_x(\tilde s_b,\mathrm{SNR},h_b,e_b,Z)\right] \ge \frac{1}{2^M}\sum_{x\in\mathcal{X}}\mathbb{E}\left[\lim_{\mathrm{SNR}\to\infty} f_x(\tilde s_b,\mathrm{SNR},h_b,e_b,Z)\right]. \tag{A.66}$$
This gives a further lower bound to the RHS of (A.62) and yields an upper bound to $I^{\mathrm{gmi}}(\mathbf{h})$.
Using (A.65) and the limit in (A.66), we can show that (A.60) tends to zero for $x\neq x'$ as the SNR tends to infinity, and that (A.60) is equal to one for $x=x'$. Thus, the supremum of $I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,s_b)$ over $s_b>0$ in (A.56) is upper-bounded by $M$ for $\alpha_b<1$ and $\alpha_b<\theta_b$.
3. Case 3: $\alpha_b<1$ and $\alpha_b>\theta_b$. The supremum in (A.56) is equivalent to the following infimum
$$\inf_{s_b>0}\mathbb{E}\left[\frac{1}{2^M}\sum_{x\in\mathcal{X}} f_x(s_b,\mathrm{SNR},h_b,e_b,Z)\right] \tag{A.67}$$
which can be lower-bounded as
$$\inf_{s_b>0}\mathbb{E}\left[\frac{1}{2^M}\sum_{x\in\mathcal{X}} f_x(s_b,\mathrm{SNR},h_b,e_b,Z)\right] \ge \mathbb{E}\left[\inf_{s_b>0}\frac{1}{2^M}\sum_{x\in\mathcal{X}} f_x(s_b,\mathrm{SNR},h_b,e_b,Z)\right] \tag{A.68}$$
by exchanging the infimum over $s_b$ and the expectation. The terms in the exponent of (A.60) can be shown to satisfy the dot equality
$$-s_b\left|\mathrm{SNR}^{\frac{1-\alpha_b}{2}}e^{\imath\phi_{h_b}}(x-x')+z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x'\right|^2+s_b\left|z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x\right|^2 \doteq -s_b\left(|x'|^2-|x|^2\right)\mathrm{SNR}^{1-\theta_b} \tag{A.69}$$
for $|x|\neq|x'|$. On the other hand, for $|x|=|x'|$, $x\neq x'$, we have that
$$-s_b\left|\mathrm{SNR}^{\frac{1-\alpha_b}{2}}e^{\imath\phi_{h_b}}(x-x')+z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x'\right|^2+s_b\left|z-\mathrm{SNR}^{\frac{1-\theta_b}{2}}e^{\imath\phi_{e_b}}x\right|^2 \doteq -s_b\cdot\mathrm{SNR}^{1-\frac{\alpha_b+\theta_b}{2}}|x|^2\cdot\left(\cos(\phi_{eh_b})-\cos(\phi_{eh_b}+\phi_{x'x})\right) \tag{A.70}$$
with probability one since the constellation $\mathcal{X}$ is discrete and both $\Phi^H_b$ and $\Phi^E_b$ are uniformly distributed over $[0,2\pi)$. Hence, for each $x\in\mathcal{X}$ and $x'\neq x$, we have at high SNR
$$f_{x,x'}(s_b,\mathrm{SNR},h_b,e_b,z) = e^{-s_b\left(|x'|^2-|x|^2\right)\mathrm{SNR}^{1-\theta_b}} \tag{A.71}$$
for $|x|\neq|x'|$, and
$$f_{x,x'}(s_b,\mathrm{SNR},h_b,e_b,z) = e^{-s_b\cdot\mathrm{SNR}^{1-\frac{\alpha_b+\theta_b}{2}}|x|^2\cdot\left(\cos(\phi_{eh_b})-\cos(\phi_{eh_b}+\phi_{x'x})\right)} \tag{A.72}$$
for $|x|=|x'|$, $x\neq x'$. It follows that
$$f_x(s_b,\mathrm{SNR},h_b,e_b,z) = \log_2\sum_{x'\in\mathcal{X}} f_{x,x'}(s_b,\mathrm{SNR},h_b,e_b,z). \tag{A.73}$$
Using the second-order derivative of the log-sum-exp function $f_x(s_b,\mathrm{SNR},h_b,e_b,z)$, it can be shown that the function
$$\sum_{x\in\mathcal{X}} f_x(s_b,\mathrm{SNR},h_b,e_b,z) \tag{A.74}$$
is convex in $s_b$ for $s_b>0$. To check whether the extreme point, which gives the global minimum of (A.74), exists for $s_b>0$, we can simply evaluate the derivative of (A.74) at $s_b=0$ [72]:
$$\left.\frac{\partial}{\partial s_b}\sum_{x\in\mathcal{X}} f_x(s_b,\mathrm{SNR},h_b,e_b,z)\right|_{s_b=0} = \frac{-\mathrm{SNR}^{1-\frac{\alpha_b+\theta_b}{2}}}{|\mathcal{X}|\log 2}\sum_{x\in\mathcal{X}}\ \sum_{\substack{x'\neq x\\ |x'|=|x|\\ x'\in\mathcal{X}}}\left(|x|^2\cos(\phi_{eh_b})-|x|^2\cos(\phi_{eh_b}+\phi_{x'x})\right). \tag{A.75}$$
We next apply the partitioning in (A.59). Consider a pair of signal points $(x_1,x_2)$ such that $x_1,x_2\in\mathcal{X}_{n'}$, $|\mathcal{X}_{n'}|\ge 2$, for $n'\in\{1,\dots,n\}$. The contribution of the pair $(x_1,x_2)$ to the summations in (A.75) is given by
$$|x_1|^2\cos(\phi_{eh_b})-|x_1|^2\cos(\phi_{eh_b}+\phi_{x_2x_1})+|x_2|^2\cos(\phi_{eh_b})-|x_2|^2\cos(\phi_{eh_b}+\phi_{x_1x_2})$$
$$= |x_1|^2\left(2\cos(\phi_{eh_b})-\cos(\phi_{eh_b}+\phi_{x_2x_1})-\cos(\phi_{eh_b}-\phi_{x_2x_1})\right) \tag{A.78}$$
$$= \cos(\phi_{eh_b})|x_1|^2\left(2-2\cos(\phi_{x_2x_1})\right) \tag{A.79}$$
$$= \cos(\phi_{eh_b})|x_1|^2\left(1-\cos(\phi_{x_2x_1})+1-\cos(\phi_{x_1x_2})\right) \tag{A.80}$$
$$= \cos(\phi_{eh_b})|x_1|^2\left(1-\cos(\phi_{x_2x_1})\right)+\cos(\phi_{eh_b})|x_2|^2\left(1-\cos(\phi_{x_1x_2})\right) \tag{A.81}$$
where we have used $\phi_{x_1x_2}=-\phi_{x_2x_1}$ by definition, the equality $|x_1|^2=|x_2|^2$, and the trigonometric identities
$$\cos(a+b)+\cos(a-b) = 2\cos(a)\cos(b) \tag{A.82}$$
and $\cos(-a)=\cos(a)$. Define
$$\mathcal{Q} \triangleq \left\{n' : |\mathcal{X}_{n'}|\ge 2,\ n'=1,\dots,n\right\}. \tag{A.83}$$
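The algebra in (A.78)-(A.79) rests on the identity (A.82); a quick numerical check of the pair contribution (variable names are illustrative):

```python
import math
import random

def pair_contribution(phi_eh, phi_x2x1, mag_sq):
    """Contribution of an equal-energy pair (x1, x2) before (A.78)."""
    return (mag_sq * math.cos(phi_eh)
            - mag_sq * math.cos(phi_eh + phi_x2x1)
            + mag_sq * math.cos(phi_eh)
            - mag_sq * math.cos(phi_eh - phi_x2x1))   # phi_x1x2 = -phi_x2x1

def pair_contribution_closed(phi_eh, phi_x2x1, mag_sq):
    """Closed form (A.79): cos(phi_eh) |x1|^2 (2 - 2 cos(phi_x2x1))."""
    return math.cos(phi_eh) * mag_sq * (2.0 - 2.0 * math.cos(phi_x2x1))

random.seed(0)
for _ in range(5):
    a = random.uniform(0.0, 2.0 * math.pi)
    b = random.uniform(0.0, 2.0 * math.pi)
    assert abs(pair_contribution(a, b, 2.0) - pair_contribution_closed(a, b, 2.0)) < 1e-12
print("identity verified")
```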
Using the result in (A.81), we can rewrite the summations in (A.75) as
$$\sum_{x\in\mathcal{X}}\ \sum_{\substack{x'\neq x\\ |x'|=|x|\\ x'\in\mathcal{X}}}\left(|x|^2\cos(\phi_{eh_b})-|x|^2\cos(\phi_{eh_b}+\phi_{x'x})\right) = \cos(\phi_{eh_b})\sum_{n'\in\mathcal{Q}}\sum_{x\in\mathcal{X}_{n'}}\sum_{x'\neq x,\,x'\in\mathcal{X}_{n'}}|x|^2\left(1-\cos(\phi_{x'x})\right) \tag{A.84}$$
where we have incorporated all $\mathcal{X}_{n'}$, $n'=1,\dots,n$, that satisfy $|\mathcal{X}_{n'}|\ge 2$.
Let $\tilde s_b$ be the value of $s_b$ that gives the infimum on the RHS of (A.68); note that this $\tilde s_b$ is different from the one in Case 2. We see from (A.84) that the condition $1-\cos(\phi_{x'x})\ge 0$ always holds. Then, if $\cos(\phi_{eh_b})\le 0$, the derivative in (A.75) is always non-negative, which implies that the infimum on the RHS of (A.68) is approached as $\tilde s_b\downarrow 0$. By
using $\tilde s_b\downarrow 0$ and applying Fatou's lemma [41] to the RHS of (A.68),
$$\lim_{\mathrm{SNR}\to\infty}\mathbb{E}\left[\frac{1}{2^M}\sum_{x\in\mathcal{X}} f_x(\tilde s_b,\mathrm{SNR},h_b,e_b,Z)\right] \ge \mathbb{E}\left[\lim_{\mathrm{SNR}\to\infty}\frac{1}{2^M}\sum_{x\in\mathcal{X}} f_x(\tilde s_b,\mathrm{SNR},h_b,e_b,Z)\right], \tag{A.85}$$
we have that the upper bound for the supremum of $I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,s_b)$ over $s_b>0$ in (A.56) tends to zero as the SNR tends to infinity.
On the other hand, if $\cos(\phi_{eh_b})>0$, then the derivative in (A.75) is always non-positive. Thus, there may exist a positive number $\tilde s_b$ in the interval $s_b>0$ that leads to the infimum on the RHS of (A.68). This also implies that the upper bound for the supremum of $I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,s_b)$ over $s_b>0$ in (A.56) lies in $[0,M]$.
We can then derive a loose upper bound as follows. Writing $\phi_{eh_b}=\phi_{e_b}-\phi_{h_b}$ explicitly, we first define
$$\Xi_b \triangleq \left\{\phi_{h_b},\phi_{e_b}\in[0,2\pi) : \cos\left(\phi_{e_b}-\phi_{h_b}\right)>0\right\}. \tag{A.86}$$
A loose upper bound is then obtained by considering that when $\Xi_b$ occurs, the upper bound for the supremum of $I^{\mathrm{gmi}}_b(\mathrm{SNR},h_b,\hat h_b,s_b)$ over $s_b>0$ in (A.56) is given by $M$.
From the above cases, we can show that the generalised outage probability is lower-bounded as
$$P_{\mathrm{gout}}(R) \doteq \mathrm{SNR}^{-d^{\mathcal{X}}_{\mathrm{icsir}}} \tag{A.87}$$
$$\mathrel{\dot\ge} \Pr\left\{\frac{1}{B}\sum_{b=1}^{B} M\cdot\mathbb{1}\left\{\mathcal{E}_{\epsilon,\delta}(\alpha_b,\theta_b,\Xi_b)\right\} < R\right\} \tag{A.88}$$
$$\doteq \int_{\mathcal{O}^{\epsilon,\delta}_{\mathcal{X}}} P_{\mathbf{A},\mathbf{\Phi}^H}(\boldsymbol\alpha,\boldsymbol\phi^h)\, P_{\mathbf{\Theta}}(\boldsymbol\theta)\, P_{\mathbf{\Phi}^E}(\boldsymbol\phi^e)\, d\boldsymbol\alpha\, d\boldsymbol\theta\, d\boldsymbol\phi^h\, d\boldsymbol\phi^e \tag{A.89}$$
where we have defined
$$\mathcal{E}_{\epsilon,\delta}(\alpha_b,\theta_b,\Xi_b) \triangleq \left\{\alpha_b\le 1+\epsilon \cap \alpha_b\le\theta_b+\delta\right\}\cup\left\{\alpha_b\le 1+\epsilon \cap \alpha_b>\theta_b+\delta \cap \Xi_b\right\} \tag{A.90}$$
for $\epsilon,\delta>0$, and
$$\mathcal{O}^{\epsilon,\delta}_{\mathcal{X}} \triangleq \left\{\boldsymbol\alpha,\boldsymbol\theta\in\mathbb{R}^B : \sum_{b=1}^{B}\mathbb{1}\left\{\mathcal{E}_{\epsilon,\delta}(\alpha_b,\theta_b,\Xi_b)\right\} < \frac{BR}{M}\right\}. \tag{A.91}$$
Applying Lemma A.1 yields
$$d^{\mathcal{X}}_{\mathrm{icsir}} \le \inf_{\mathcal{O}^{\epsilon,\delta}_{\mathcal{X}}\cap\{\boldsymbol\alpha\succeq\mathbf{0},\,\boldsymbol\theta\succeq d_e\mathbf{1}\}}\left\{\left(1+\frac{\tau}{2}\right)\sum_{b=1}^{B}\alpha_b+\sum_{b=1}^{B}(\theta_b-d_e)\right\}. \tag{A.92}$$
Similarly to the GMI lower bound, it is not difficult to show that the values of $\theta_b$, $b=1,\dots,B$, achieving the infimum are given by $d_e$. To find the values of $\alpha_b$, $b=1,\dots,B$, that solve the infimum, we need to check whether there exist $\phi_{h_b}$ and $\phi_{e_b}$ that do not belong to $\Xi_b$. Note that the condition
$$\frac{\pi}{2} \le \phi_{e_b}-\phi_{h_b} \le \frac{3\pi}{2} \tag{A.93}$$
implies that $\cos(\phi_{e_b}-\phi_{h_b})\le 0$. Thus, from (A.86) and (A.93), we can always find $(\phi_{h_b},\phi_{e_b})\notin\Xi_b$. It then follows from [23] that the values of $\alpha_b$, $b=1,\dots,B$, achieving the infimum are given by
$$\alpha^*_b = \min(1+\epsilon,\,\theta^*_b+\delta), \qquad b=1,\dots,B-b^* \tag{A.94}$$
$$\alpha^*_b = 0, \qquad b=B-b^*+1,\dots,B \tag{A.95}$$
where $b^*\in\{0,\dots,B-1\}$ is the unique integer satisfying $\frac{b^*}{B}<\frac{R}{M}\le\frac{b^*+1}{B}$. Substituting the values of $\alpha_b$ and $\theta_b$, $b=1,\dots,B$, achieving the infimum into (A.92), we obtain the upper bound on the SNR-exponent
$$d^{\mathcal{X}}_{\mathrm{icsir}} \le \min(1,d_e)\times\left(1+\frac{\tau}{2}\right)d_{\mathrm{SB}}(R) \tag{A.96}$$
where we have let $\epsilon,\delta\downarrow 0$ to make the upper bound tight.
A.2.2 MIMO Case
Recall (3.24):
$$I^{\mathrm{gmi}}_b(\mathrm{SNR},\mathbf{H}_b,\hat{\mathbf{H}}_b,s) = Mn_t-\mathbb{E}\left[\log_2\sum_{\mathbf{x}'\in\mathcal{X}^{n_t}} e^{-s\left\|\sqrt{\frac{\mathrm{SNR}}{n_t}}(\mathbf{H}_b\mathbf{X}-\hat{\mathbf{H}}_b\mathbf{x}')+\mathbf{Z}\right\|^2+s\left\|\sqrt{\frac{\mathrm{SNR}}{n_t}}(\mathbf{H}_b-\hat{\mathbf{H}}_b)\mathbf{X}+\mathbf{Z}\right\|^2}\right] \tag{A.97}$$
where the expectation is over $(\mathbf{X},\mathbf{Z})$. The GMI is given by
$$I^{\mathrm{gmi}}(\mathbf{H}) = \sup_{s>0}\frac{1}{B}\sum_{b=1}^{B} I^{\mathrm{gmi}}_b(\mathrm{SNR},\mathbf{H}_b,\hat{\mathbf{H}}_b,s). \tag{A.98}$$
Mimicking the analysis for the SISO case, we obtain the GMI lower and upper bounds as follows.
GMI Lower Bound
Using the suboptimal $\bar s\in\mathcal{S}$ to apply the dominated convergence theorem [19], we have that
$$-\bar s\left\|\sqrt{\frac{\mathrm{SNR}}{n_t}}\mathbf{H}_b(\mathbf{x}-\mathbf{x}')+\mathbf{z}-\sqrt{\frac{\mathrm{SNR}}{n_t}}\mathbf{E}_b\mathbf{x}'\right\|^2+\bar s\left\|\mathbf{z}-\sqrt{\frac{\mathrm{SNR}}{n_t}}\mathbf{E}_b\mathbf{x}\right\|^2$$
$$\doteq -\bar s\sum_{r=1}^{n_r}\left|\sum_{t=1}^{n_t}\mathrm{SNR}^{\frac{1-\alpha_{b,r,t}}{2}}e^{\imath\phi_{h_{b,r,t}}}(x_t-x'_t)+z_r-\sum_{t=1}^{n_t}\mathrm{SNR}^{\frac{1-\theta_{b,r,t}}{2}}e^{\imath\phi_{e_{b,r,t}}}x'_t\right|^2 + \bar s\sum_{r=1}^{n_r}\left|z_r-\sum_{t=1}^{n_t}\mathrm{SNR}^{\frac{1-\theta_{b,r,t}}{2}}e^{\imath\phi_{e_{b,r,t}}}x_t\right|^2 \tag{A.99}$$
where now $|e_b|^2$ in (A.21) is replaced by $\|\mathbf{E}_b\|_F^2$, and where $\phi_{h_{b,r,t}}$ and $\phi_{e_{b,r,t}}$ are the angles of $h_{b,r,t}$ and $e_{b,r,t}$, respectively. Similarly to what is done in [32], define
the following sets $\mathcal{S}^{(\epsilon,\delta)}_{b,r}$, $\mathcal{S}^{(\epsilon,\delta)}_b$ and $\kappa_b$ for $\epsilon,\delta>0$ as
$$\mathcal{S}^{(\epsilon,\delta)}_{b,r} \triangleq \left\{t : \alpha_{b,r,t}\le 1-\epsilon \cap \alpha_{b,r,t}\le\theta_{\min}-\delta,\ t=1,\dots,n_t\right\}, \tag{A.100}$$
$$\mathcal{S}^{(\epsilon,\delta)}_b \triangleq \bigcup_{r=1}^{n_r}\mathcal{S}^{(\epsilon,\delta)}_{b,r}, \tag{A.101}$$
$$\kappa_b \triangleq \left|\mathcal{S}^{(\epsilon,\delta)}_b\right| \tag{A.102}$$
where now $\theta_{\min} \triangleq \min\{\theta_{1,1,1},\dots,\theta_{b,r,t},\dots,\theta_{B,n_r,n_t}\}$. Note that $\bar s$ satisfies
$$\bar s \doteq \mathrm{SNR}^{\min(0,\,\theta_{\min}-1-\varepsilon)} \tag{A.103}$$
where $\varepsilon$ is chosen such that
$$0 < \varepsilon < \theta_{\min}-\alpha_{\max}, \tag{A.104}$$
and where
$$\alpha_{\max} = \max\left\{\alpha_{b,r,t}\,\middle|\,\alpha_{b,r,t}<\min(1,\theta_{\min}),\ b=1,\dots,B,\ r=1,\dots,n_r,\ t=1,\dots,n_t\right\}. \tag{A.105}$$
For $r=1,\dots,n_r$ and $x_t\neq x'_t$, if there exists $\alpha_{b,r,t}$ satisfying the constraint set $\mathcal{S}^{(\epsilon,\delta)}_b$, then with $s=\bar s$, the exponential function inside the expectation on the RHS of (A.97) tends to zero as the SNR tends to infinity; otherwise, the exponential function converges to one. Therefore, we can write the following dot equality for high SNR
$$-\bar s\left|\sum_{t=1}^{n_t}\mathrm{SNR}^{\frac{1-\alpha_{b,r,t}}{2}}e^{\imath\phi_{h_{b,r,t}}}(x_t-x'_t)+z_r-\sum_{t=1}^{n_t}\mathrm{SNR}^{\frac{1-\theta_{b,r,t}}{2}}e^{\imath\phi_{e_{b,r,t}}}x'_t\right|^2 + \bar s\left|z_r-\sum_{t=1}^{n_t}\mathrm{SNR}^{\frac{1-\theta_{b,r,t}}{2}}e^{\imath\phi_{e_{b,r,t}}}x_t\right|^2$$
$$\doteq -\bar s\left|\sum_{\substack{t\in\mathcal{S}^{(\epsilon,\delta)}_{b,r}\\ x_t\neq x'_t}}\mathrm{SNR}^{\frac{1-\alpha_{b,r,t}}{2}}e^{\imath\phi_{h_{b,r,t}}}(x_t-x'_t)+z_r-\sum_{t=1}^{n_t}\mathrm{SNR}^{\frac{1-\theta_{b,r,t}}{2}}e^{\imath\phi_{e_{b,r,t}}}x'_t\right|^2 + \bar s\left|z_r-\sum_{t=1}^{n_t}\mathrm{SNR}^{\frac{1-\theta_{b,r,t}}{2}}e^{\imath\phi_{e_{b,r,t}}}x_t\right|^2. \tag{A.106}$$
Let $s^*$ be the value of $s>0$ that solves the supremum on the RHS of (A.98). Using the suboptimal $s=\bar s$ given in (A.103), we have the following upper bound for the expectation over $\mathbf{Z}$ at high SNR:
$$\lim_{\mathrm{SNR}\to\infty}\mathbb{E}\left[\log_2\sum_{\mathbf{x}'\in\mathcal{X}^{n_t}} e^{-s^*\left\|\sqrt{\frac{\mathrm{SNR}}{n_t}}\mathbf{H}_b(\mathbf{x}-\mathbf{x}')+\mathbf{Z}-\sqrt{\frac{\mathrm{SNR}}{n_t}}\mathbf{E}_b\mathbf{x}'\right\|^2+s^*\left\|\mathbf{Z}-\sqrt{\frac{\mathrm{SNR}}{n_t}}\mathbf{E}_b\mathbf{x}\right\|^2}\right] \le \mathbb{E}\left[\log_2\sum_{\mathbf{x}'\in\mathcal{X}^{n_t}}\mathbb{1}\left\{x'_t = x_t,\ \forall t\in\mathcal{S}^{(\epsilon,\delta)}_b\right\}\right] \tag{A.107}$$
$$= M(n_t-\kappa_b) \tag{A.108}$$
for all $\mathbf{x}\in\mathcal{X}^{n_t}$. Thus,
$$\lim_{\mathrm{SNR}\to\infty} I^{\mathrm{gmi}}_b(\mathrm{SNR},\mathbf{H}_b,\hat{\mathbf{H}}_b,s^*) \ge M\kappa_b \tag{A.109}$$
and $P_{\mathrm{gout}}(R)$ is upper-bounded as
$$P_{\mathrm{gout}}(R) \mathrel{\dot\le} \Pr\left\{\frac{1}{B}\sum_{b=1}^{B}M\kappa_b < R\right\}. \tag{A.110}$$
Define
$$\mathcal{O}_{\mathcal{X}} \triangleq \left\{\mathbf{A},\mathbf{\Theta}\in\mathbb{R}^{Bn_r\times n_t} : \sum_{b=1}^{B}\kappa_b < \frac{BR}{M}\right\}. \tag{A.111}$$
Then, applying Lemma A.1 yields the lower bound for the SNR-exponent
$$d^{\mathcal{X}}_{\mathrm{icsir}} \ge \inf_{\mathcal{O}_{\mathcal{X}}\cap\{\mathbf{A}\succeq\mathbf{0},\,\mathbf{\Theta}\succeq d_e\mathbf{1}\}}\left\{\left(1+\frac{\tau}{2}\right)\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{b,r,t}+\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}(\theta_{b,r,t}-d_e)\right\}. \tag{A.112}$$
We can observe from (A.100) that the solution of $\theta_{b,r,t}$ for all $b=1,\dots,B$, $r=1,\dots,n_r$ and $t=1,\dots,n_t$ to the above infimum is given by $d_e$. Following the analysis in [32], it can be proved that the solution of the above infimum yields
$$d^{\mathcal{X}}_{\mathrm{icsir}} \ge \min(1,d_e)\cdot\left(1+\frac{\tau}{2}\right)n_r\left(1+\left\lfloor B\left(n_t-\frac{R}{M}\right)\right\rfloor\right). \tag{A.113}$$
GMI Upper Bound
Similarly to the SISO analysis, the GMI upper bound is evaluated using Proposition 2.3 and Fatou's lemma [41]. The only difference with the GMI lower bound is in the definition of the sets
$$\mathcal{S}^{(\epsilon,\delta)}_{b,r} \triangleq \left\{t : \left\{\alpha_{b,r,t}\le 1+\epsilon \cap \alpha_{b,r,t}\le\theta_{b,r,t}+\delta\right\}\cup\left\{\alpha_{b,r,t}\le 1+\epsilon \cap \alpha_{b,r,t}>\theta_{b,r,t}+\delta \cap \Xi_{b,r,t}\right\},\ t=1,\dots,n_t\right\}, \tag{A.114}$$
$$\mathcal{S}^{(\epsilon,\delta)}_b \triangleq \bigcup_{r=1}^{n_r}\mathcal{S}^{(\epsilon,\delta)}_{b,r}, \tag{A.115}$$
$$\kappa_b \triangleq \left|\mathcal{S}^{(\epsilon,\delta)}_b\right| \tag{A.116}$$
where
$$\Xi_{b,r,t} \triangleq \left\{\phi_{h_{b,r,t}},\phi_{e_{b,r,t}}\in[0,2\pi) : \cos\left(\phi_{e_{b,r,t}}-\phi_{h_{b,r,t}}\right)>0\right\}. \tag{A.117}$$
Following the same steps used in the SISO analysis, we can lower-bound the expectation over $\mathbf{Z}$ as follows:
$$\lim_{\mathrm{SNR}\to\infty}\mathbb{E}\left[\log_2\sum_{\mathbf{x}'\in\mathcal{X}^{n_t}} e^{-s^*\left\|\sqrt{\frac{\mathrm{SNR}}{n_t}}\mathbf{H}_b(\mathbf{x}-\mathbf{x}')+\mathbf{Z}-\sqrt{\frac{\mathrm{SNR}}{n_t}}\mathbf{E}_b\mathbf{x}'\right\|^2+s^*\left\|\mathbf{Z}-\sqrt{\frac{\mathrm{SNR}}{n_t}}\mathbf{E}_b\mathbf{x}\right\|^2}\right] \ge \mathbb{E}\left[\log_2\sum_{\mathbf{x}'\in\mathcal{X}^{n_t}}\mathbb{1}\left\{x'_t = x_t,\ \forall t\in\mathcal{S}^{(\epsilon,\delta)}_b\right\}\right] \tag{A.118}$$
$$= M(n_t-\kappa_b). \tag{A.119}$$
The generalised outage probability can then be lower-bounded as
$$P_{\mathrm{gout}}(R) \mathrel{\dot\ge} \Pr\left\{\frac{1}{B}\sum_{b=1}^{B}M\kappa_b < R\right\}. \tag{A.120}$$
Define
$$\mathcal{O}_{\mathcal{X}} \triangleq \left\{\mathbf{A},\mathbf{\Theta}\in\mathbb{R}^{Bn_r\times n_t} : \sum_{b=1}^{B}\kappa_b < \frac{BR}{M}\right\}. \tag{A.121}$$
Using $\mathcal{O}_{\mathcal{X}}$ to apply the result in Lemma A.1 and following the technique used for the GMI lower bound, the SNR-exponent can be shown to be upper-bounded as
$$d^{\mathcal{X}}_{\mathrm{icsir}} \le \min(1,d_e)\cdot\left(1+\frac{\tau}{2}\right)n_r\left(1+\left\lfloor B\left(n_t-\frac{R}{M}\right)\right\rfloor\right). \tag{A.122}$$
This is because the infimum solution for $\theta_{b,r,t}$ in (A.114) is the same as that for $\theta_{\min}$ in (A.100) (given by $d_e$), and because we can always find $\phi_{h_{b,r,t}}$ and $\phi_{e_{b,r,t}}$ that do not belong to $\Xi_{b,r,t}$. This completes the proof for discrete inputs.
A.3 Proof of Theorem 3.1 (Gaussian Inputs)
Recall the GMI (3.20) for i.i.d. Gaussian inputs (in nats per channel use):
$$I^{\mathrm{gmi}}(\mathbf{H}) = \sup_{s>0}\frac{1}{B}\sum_{b=1}^{B} I^{\mathrm{gmi}}_b(\mathrm{SNR},\mathbf{H}_b,\hat{\mathbf{H}}_b,s) \tag{A.123}$$
where
$$I^{\mathrm{gmi}}_b(\mathrm{SNR},\mathbf{H}_b,\hat{\mathbf{H}}_b,s) = \log\det\left(\mathbf{I}_{n_r}+s\hat{\mathbf{H}}_b\hat{\mathbf{H}}_b^{\dagger}\frac{\mathrm{SNR}}{n_t}\right) - s\left(n_r+\frac{\mathrm{SNR}}{n_t}\|\mathbf{H}_b-\hat{\mathbf{H}}_b\|_F^2\right) + \mathbb{E}\left[s\mathbf{Y}^{\dagger}\mathbf{\Sigma}_y^{-1}\mathbf{Y}\,\middle|\,\mathbf{H}_b=\mathsf{H}_b,\mathbf{E}_b=\mathsf{E}_b\right] \tag{A.124}$$
and where
$$\mathbf{\Sigma}_y \triangleq \mathbf{I}_{n_r}+s\hat{\mathbf{H}}_b\hat{\mathbf{H}}_b^{\dagger}\frac{\mathrm{SNR}}{n_t}. \tag{A.125}$$
In the following, we derive lower and upper bounds to (A.123) to prove Theorem 3.1.
A.3.1 GMI Lower Bound
We first note from [72, App. D] that $\mathbb{E}[s\mathbf{Y}^{\dagger}\mathbf{\Sigma}_y^{-1}\mathbf{Y}\mid\mathbf{H}_b=\mathsf{H}_b,\mathbf{E}_b=\mathsf{E}_b]$ is non-negative. Then, we have that
$$I^{\mathrm{gmi}}_b(\mathrm{SNR},\mathsf{H}_b,\mathsf{E}_b,s) \ge \log\det\left(\mathbf{I}_{n_r}+s\hat{\mathsf{H}}_b\hat{\mathsf{H}}_b^{\dagger}\frac{\mathrm{SNR}}{n_t}\right) - s\left(n_r+\frac{\mathrm{SNR}}{n_t}\|\mathsf{E}_b\|_F^2\right). \tag{A.126}$$
Without loss of generality, assume that $n_t\ge n_r$.1.2 Let $\lambda_{b,i}$, $i=1,\dots,n_r$, be the $i$-th eigenvalue of $\hat{\mathsf{H}}_b\hat{\mathsf{H}}_b^{\dagger}$. Then, the RHS of (A.126) can be expressed in terms of the eigenvalues:
$$I^{\mathrm{gmi}}_b(\mathrm{SNR},\boldsymbol\lambda_b,\mathsf{E}_b,s) \ge \log\prod_{i=1}^{n_r}\left(1+s\lambda_{b,i}\frac{\mathrm{SNR}}{n_t}\right) - s\left(n_r+\frac{\mathrm{SNR}}{n_t}\|\mathsf{E}_b\|_F^2\right). \tag{A.127}$$
1.2 If $n_t<n_r$, then it suffices to replace $\left(\mathbf{I}_{n_r}+s\hat{\mathsf{H}}_b\hat{\mathsf{H}}_b^{\dagger}\frac{\mathrm{SNR}}{n_t}\right)$ with $\left(\mathbf{I}_{n_t}+s\hat{\mathsf{H}}_b^{\dagger}\hat{\mathsf{H}}_b\frac{\mathrm{SNR}}{n_t}\right)$.
We can further lower-bound (A.127) using

\prod_{i=1}^{n_r} \Bigl( 1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} \Bigr) = \Bigl( 1 + s \hat{\lambda}_{b,1} \frac{SNR}{n_t} \Bigr) \cdots \Bigl( 1 + s \hat{\lambda}_{b,n_r} \frac{SNR}{n_t} \Bigr)   (A.128)
\geq 1 + s \hat{\lambda}_{b,1} \frac{SNR}{n_t} + s \hat{\lambda}_{b,2} \frac{SNR}{n_t} + \cdots + s \hat{\lambda}_{b,n_r} \frac{SNR}{n_t}   (A.129)
= 1 + s \frac{SNR}{n_t} \sum_{i=1}^{n_r} \hat{\lambda}_{b,i}   (A.130)
= 1 + s \frac{SNR}{n_t} \| \hat{H}_b \|_F^2   (A.131)
= 1 + s \frac{SNR}{n_t} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} | \hat{h}_{b,r,t} |^2   (A.132)

where the inequality follows since \hat{H}_b \hat{H}_b^{\dagger} is a positive semidefinite matrix, whose eigenvalues are always zero or positive. It holds that [86]

\| \hat{H}_b \|_F^2 = \sum_{i=1}^{n_r} \hat{\lambda}_{b,i} = \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} | \hat{h}_{b,r,t} |^2.   (A.133)
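The chain (A.128)–(A.133) relies only on two facts: the eigenvalues of the positive semidefinite matrix \hat{H}_b \hat{H}_b^{\dagger} are non-negative, and they sum to the squared Frobenius norm. A quick numerical sanity check of both facts (a sketch with an arbitrary example matrix, not part of the proof):

```python
import numpy as np

# Arbitrary example channel-estimate matrix (n_r = 2, n_t = 3).
H_hat = np.array([[1.0 + 0.5j, -0.3 + 0.2j, 0.7 - 1.1j],
                  [0.2 - 0.4j,  0.9 + 0.1j, -0.6 + 0.3j]])
G = H_hat @ H_hat.conj().T          # n_r x n_r, positive semidefinite
lam = np.linalg.eigvalsh(G)         # eigenvalues of H_hat H_hat^dagger

a = 0.8                             # plays the role of s * SNR / n_t
lhs = np.prod(1 + a * lam)          # product in (A.128)
rhs = 1 + a * lam.sum()             # sum bound in (A.130)

assert np.all(lam >= -1e-12)                                    # PSD: eigenvalues >= 0
assert np.isclose(lam.sum(), np.linalg.norm(H_hat, 'fro')**2)   # (A.133)
assert lhs >= rhs                                               # (A.128) >= (A.130)
```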
We then have a lower bound to the GMI as

I^{gmi}(\hat{H}) = \sup_{s>0} \frac{1}{B} \sum_{b=1}^{B} I_b^{gmi}(SNR, H_b, \hat{H}_b, s)   (A.134)
\geq \sup_{s>0} \frac{1}{B} \sum_{b=1}^{B} \Biggl[ \log\Bigl( 1 + s \| \hat{H}_b \|_F^2 \frac{SNR}{n_t} \Bigr) - s \Bigl( n_r + \frac{SNR}{n_t} \| E_b \|_F^2 \Bigr) \Biggr].   (A.135)

The optimiser for s is difficult to evaluate in closed form due to the sum over b involving the logarithm function. A suboptimal s can be obtained as follows. For any s > 0, we can lower-bound

\sum_{b=1}^{B} \Biggl[ \log\Bigl( 1 + s \| \hat{H}_b \|_F^2 \frac{SNR}{n_t} \Bigr) - s \Bigl( n_r + \frac{SNR}{n_t} \| E_b \|_F^2 \Bigr) \Biggr] \geq \sum_{b=1}^{B} \log\Bigl( s \| \hat{H}_b \|_F^2 \frac{SNR}{n_t} \Bigr) - s \Biggl( B n_r + \frac{SNR}{n_t} \sum_{b=1}^{B} \| E_b \|_F^2 \Biggr).   (A.136)

We then take the first-order derivative of the RHS of (A.136) with respect to s and equate it to zero. From this step, we obtain a suboptimal s with respect to (A.135), which is given by

\bar{s} = \frac{B}{B n_r + \frac{SNR}{n_t} \sum_{b=1}^{B} \| E_b \|_F^2}.   (A.137)
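The derivative of the RHS of (A.136) with respect to s is B/s minus the bracketed constant, which vanishes exactly at (A.137). A brute-force numerical sketch (with arbitrary example values) confirming that this \bar{s} maximises the concave lower bound:

```python
import numpy as np

# Example values (a sketch): B blocks, with c standing for (SNR/n_t) * sum_b ||E_b||_F^2
# and g_b standing for ||H_hat_b||_F^2 * SNR / n_t.
B, n_r, c = 4, 2, 3.5
g = np.array([1.2, 0.7, 2.3, 0.9])

def rhs(s):
    # RHS of (A.136): sum_b log(s * g_b) - s * (B * n_r + c)
    return np.sum(np.log(s * g)) - s * (B * n_r + c)

s_bar = B / (B * n_r + c)                              # closed form (A.137)
grid = np.linspace(1e-4, 2.0, 20001)
s_num = grid[int(np.argmax([rhs(s) for s in grid]))]   # brute-force maximiser
assert abs(s_num - s_bar) < 1e-3                       # (A.137) is the maximiser
```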
Replacing s in (A.135) with \bar{s} and removing the supremum yield

I^{gmi}(\hat{H}) \geq \frac{1}{B} \log\Biggl( \prod_{b=1}^{B} e^{-1} \Biggl( 1 + \frac{B \| \hat{H}_b \|_F^2}{B n_r + \frac{SNR}{n_t} \sum_{b'=1}^{B} \| E_{b'} \|_F^2} \, \frac{SNR}{n_t} \Biggr) \Biggr).   (A.138)

Note that from \hat{h}_{b,r,t} = h_{b,r,t} + e_{b,r,t}, we have

| \hat{h}_{b,r,t} |^2 = | h_{b,r,t} |^2 + | e_{b,r,t} |^2 + 2 | h_{b,r,t} | | e_{b,r,t} | \cos( \phi^{h}_{b,r,t} - \phi^{e}_{b,r,t} ).   (A.139)

Let \alpha_{b,r,t} = -\frac{\log |h_{b,r,t}|^2}{\log SNR}, \hat{\alpha}_{b,r,t} = -\frac{\log |\hat{h}_{b,r,t}|^2}{\log SNR} and \theta_{b,r,t} = -\frac{\log |e_{b,r,t}|^2}{\log SNR}. Then, for any real number \varsigma > 0, we have for \alpha_{b,r,t} \neq \theta_{b,r,t} that

SNR^{\varsigma} | \hat{h}_{b,r,t} |^2 = SNR^{\varsigma - \hat{\alpha}_{b,r,t}} \doteq SNR^{\varsigma - \min(\alpha_{b,r,t}, \theta_{b,r,t})}.   (A.140)

On the other hand, for \alpha_{b,r,t} = \theta_{b,r,t}, we have the following four cases.

• If h_{b,r,t} = e_{b,r,t} and h_{b,r,t} \neq 0, we have that

SNR^{\varsigma} | \hat{h}_{b,r,t} |^2 = SNR^{\varsigma - \hat{\alpha}_{b,r,t}} = 4 SNR^{\varsigma} | h_{b,r,t} |^2   (A.141)
\doteq SNR^{\varsigma - \alpha_{b,r,t}} \doteq SNR^{\varsigma - \theta_{b,r,t}}.   (A.142)

• If h_{b,r,t} = e^*_{b,r,t} and \cos(\phi^h_{b,r,t}) \neq 0, where e^*_{b,r,t} denotes the complex conjugate of e_{b,r,t}, we have that

SNR^{\varsigma} | \hat{h}_{b,r,t} |^2 = SNR^{\varsigma - \hat{\alpha}_{b,r,t}} = 4 SNR^{\varsigma} | h_{b,r,t} |^2 \cos^2(\phi^h_{b,r,t})   (A.143)
\doteq SNR^{\varsigma - \alpha_{b,r,t}} \doteq SNR^{\varsigma - \theta_{b,r,t}}.   (A.144)

• If -h_{b,r,t} = e^*_{b,r,t} and \sin(\phi^h_{b,r,t}) \neq 0, we have that

SNR^{\varsigma} | \hat{h}_{b,r,t} |^2 = SNR^{\varsigma - \hat{\alpha}_{b,r,t}} = 4 SNR^{\varsigma} | h_{b,r,t} |^2 \sin^2(\phi^h_{b,r,t})   (A.145)
\doteq SNR^{\varsigma - \alpha_{b,r,t}} \doteq SNR^{\varsigma - \theta_{b,r,t}}.   (A.146)

• If h_{b,r,t} = -e_{b,r,t}, we have that

SNR^{\varsigma} | \hat{h}_{b,r,t} |^2 = SNR^{\varsigma - \hat{\alpha}_{b,r,t}} = 0.   (A.147)

Note that the condition h_{b,r,t} = -e_{b,r,t} also covers the condition h_{b,r,t} = e_{b,r,t} = 0, the condition h_{b,r,t} = e^*_{b,r,t} with \cos(\phi^h_{b,r,t}) = 0, and the condition -h_{b,r,t} = e^*_{b,r,t} with \sin(\phi^h_{b,r,t}) = 0.
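The identity (A.139) and the cancellation cases (A.141)–(A.147) can be checked directly on scalar examples (a sketch; the numbers below are arbitrary):

```python
import cmath, math

h = 0.6 * cmath.exp(1j * 1.1)            # example true-channel entry, phase 1.1 rad

# Identity (A.139): |h + e|^2 = |h|^2 + |e|^2 + 2|h||e|cos(phi_h - phi_e)
e = 0.3 * cmath.exp(1j * 2.4)
lhs = abs(h + e) ** 2
rhs = abs(h)**2 + abs(e)**2 + 2*abs(h)*abs(e)*math.cos(cmath.phase(h) - cmath.phase(e))
assert math.isclose(lhs, rhs)

# Cancellation cases used when alpha = theta:
assert abs(h + (-h)) == 0                                         # h = -e: (A.147)
assert math.isclose(abs(h + h.conjugate())**2,
                    4 * abs(h)**2 * math.cos(cmath.phase(h))**2)  # h = e*: (A.143)
assert math.isclose(abs(h - h.conjugate())**2,
                    4 * abs(h)**2 * math.sin(cmath.phase(h))**2)  # -h = e*: (A.145)
```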
From the preceding evaluation, we have that

e^{-1} \Biggl( 1 + \frac{B \| \hat{H}_b \|_F^2}{B n_r + \frac{SNR}{n_t} \sum_{b'=1}^{B} \| E_{b'} \|_F^2} \, \frac{SNR}{n_t} \Biggr) = e^{-1} \Biggl( 1 + \frac{B \, \frac{SNR}{n_t} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} SNR^{-\hat{\alpha}_{b,r,t}}}{B n_r + \frac{SNR}{n_t} \sum_{b'=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} SNR^{-\theta_{b',r,t}}} \Biggr)   (A.148)
\doteq \max\Biggl( SNR^0, \frac{SNR^{1 - \hat{\alpha}_{b,min}}}{\max( SNR^0, SNR^{1 - \theta_{min}} )} \Biggr)   (A.149)
\doteq \max\Bigl( SNR^0, \min( SNR^0, SNR^{\theta_{min} - 1} ) \times SNR^{1 - \hat{\alpha}_{b,min}} \Bigr)   (A.150)

where

\theta_{b,min} \triangleq \min\{ \theta_{b,1,1}, \ldots, \theta_{b,r,t}, \ldots, \theta_{b,n_r,n_t} \},   (A.151)
\theta_{min} \triangleq \min\{ \theta_{1,min}, \ldots, \theta_{b,min}, \ldots, \theta_{B,min} \},   (A.152)
\hat{\alpha}_{b,min} \triangleq \min\{ \hat{\alpha}_{b,1,1}, \ldots, \hat{\alpha}_{b,r,t}, \ldots, \hat{\alpha}_{b,n_r,n_t} \}.   (A.153)

Define \alpha_{b,min}, (r,t)_{\alpha_{b,min}} and (r,t)_{\theta_{b,min}} as

\alpha_{b,min} \triangleq \min\{ \alpha_{b,1,1}, \ldots, \alpha_{b,r,t}, \ldots, \alpha_{b,n_r,n_t} \},   (A.154)
(r,t)_{\alpha_{b,min}} \triangleq \arg\min_{r=1,\ldots,n_r,\ t=1,\ldots,n_t} \alpha_{b,r,t},   (A.155)
(r,t)_{\theta_{b,min}} \triangleq \arg\min_{r=1,\ldots,n_r,\ t=1,\ldots,n_t} \theta_{b,r,t}.   (A.156)
We have the following cases.
1. Case 1: (r,t)_{\alpha_{b,min}} \neq (r,t)_{\theta_{b,min}}. This refers to the case where the indices (r,t) at which the minimum occurs are different for \alpha_{b,r,t} and \theta_{b,r,t}. Clearly, we have that

\frac{SNR}{n_t} \| \hat{H}_b \|_F^2 \doteq \frac{SNR}{n_t} SNR^{-\hat{\alpha}_{b,min}} \doteq SNR^{1 - \min(\alpha_{b,min}, \theta_{b,min})}.   (A.157)

It follows that

e^{-1} \Biggl( 1 + \frac{B \| \hat{H}_b \|_F^2}{B n_r + \frac{SNR}{n_t} \sum_{b'=1}^{B} \| E_{b'} \|_F^2} \, \frac{SNR}{n_t} \Biggr) \doteq \max\Bigl( SNR^0, \min( SNR^0, SNR^{\theta_{min} - 1} ) \, SNR^{1 - \hat{\alpha}_{b,min}} \Bigr)   (A.158)
\doteq \max\Bigl( SNR^0, \min( SNR^0, SNR^{\theta_{min} - 1} ) \times SNR^{1 - \min(\alpha_{b,min}, \theta_{b,min})} \Bigr)   (A.159)
\doteq \max\Bigl( SNR^0, SNR^{\min(1, \theta_{min}) - \min(\alpha_{b,min}, \theta_{b,min})} \Bigr).   (A.160)

2. Case 2: (r,t)_{\alpha_{b,min}} = (r,t)_{\theta_{b,min}}. This refers to the case where the indices (r,t) at which the minimum occurs are the same for both \alpha_{b,r,t} and \theta_{b,r,t}.

• Case 2.1: \alpha_{b,min} < \theta_{b,min}. We have that

\frac{SNR}{n_t} \| \hat{H}_b \|_F^2 \doteq \frac{SNR}{n_t} SNR^{-\alpha_{b,min}} \doteq SNR^{1 - \alpha_{b,min}}.   (A.161)

It follows that

e^{-1} \Biggl( 1 + \frac{B \| \hat{H}_b \|_F^2}{B n_r + \frac{SNR}{n_t} \sum_{b'=1}^{B} \| E_{b'} \|_F^2} \, \frac{SNR}{n_t} \Biggr) \doteq \max\Bigl( SNR^0, SNR^{\min(1, \theta_{min}) - \alpha_{b,min}} \Bigr).   (A.162)
• Case 2.2: \alpha_{b,min} > \theta_{b,min}. We have that

\frac{SNR}{n_t} \| \hat{H}_b \|_F^2 \doteq \frac{SNR}{n_t} SNR^{-\hat{\alpha}_{b,min}} \doteq SNR^{1 - \theta_{b,min}}.   (A.163)

If \theta_{min} < 1, the dot equality can be evaluated as follows:

e^{-1} \Biggl( 1 + \frac{B \| \hat{H}_b \|_F^2}{B n_r + \frac{SNR}{n_t} \sum_{b'=1}^{B} \| E_{b'} \|_F^2} \, \frac{SNR}{n_t} \Biggr) \doteq \max\Bigl( SNR^0, SNR^{\theta_{min} - \theta_{b,min}} \Bigr)   (A.164)
\doteq SNR^0   (A.165)

where the last dot equality follows from the condition \theta_{min} \leq \theta_{b,min}. For \theta_{min} \geq 1, we have that

e^{-1} \Biggl( 1 + \frac{B \| \hat{H}_b \|_F^2}{B n_r + \frac{SNR}{n_t} \sum_{b'=1}^{B} \| E_{b'} \|_F^2} \, \frac{SNR}{n_t} \Biggr) \doteq \max\Bigl( SNR^0, SNR^{1 - \theta_{b,min}} \Bigr)   (A.166)
\doteq SNR^0   (A.167)

where the last dot equality is due to \theta_{b,min} \geq \theta_{min}.

• Case 2.3: \alpha_{b,min} = \theta_{b,min}. From (A.142), (A.144) and (A.146), if h_{b,min} = e_{b,min} with h_{b,min} \neq 0, or h_{b,min} = e^*_{b,min} with \cos(\phi^h_{b,min}) \neq 0, or -h_{b,min} = e^*_{b,min} with \sin(\phi^h_{b,min}) \neq 0, then we observe the same convergence results as in Case 2.2. Otherwise, we have from (A.147) that SNR^{-\hat{\alpha}_{b,min}} = 0 and hence

e^{-1} \Biggl( 1 + \frac{B \| \hat{H}_b \|_F^2}{B n_r + \frac{SNR}{n_t} \sum_{b'=1}^{B} \| E_{b'} \|_F^2} \, \frac{SNR}{n_t} \Biggr) \doteq SNR^0.   (A.168)

Note that the results in Cases 2.2 and 2.3 are identical. Summarising the above cases, we have that

e^{-1} \Biggl( 1 + \frac{B \| \hat{H}_b \|_F^2}{B n_r + \frac{SNR}{n_t} \sum_{b'=1}^{B} \| E_{b'} \|_F^2} \, \frac{SNR}{n_t} \Biggr) \doteq SNR^{[\min(1, \theta_{min}) - \alpha_{b,min}]^+}.   (A.169)
Recall that with multiplexing gain r_g (cf. (2.57)), the data rate R(SNR) satisfies the dot equality e^{R(SNR)} \doteq SNR^{r_g}. Then, from (A.138) and (A.169), we can bound P_{gout}(R) as follows:

P_{gout}(R) = \Pr\bigl\{ I^{gmi}(\hat{\mathsf{H}}) < R(SNR) \bigr\}   (A.170)
\doteq SNR^{-d^{G}_{icsir}}   (A.171)
\mathrel{\dot\leq} \Pr\Biggl\{ \frac{1}{B} \sum_{b=1}^{B} [\min(1, \theta_{min}) - \alpha_{b,min}]^+ < r_g \Biggr\}   (A.172)
\doteq \int_{\mathcal{O}_G} P_{A,\Phi^H}(A, \Phi^H) \, P_{\Theta}(\Theta) \, P_{\Phi^E}(\Phi^E) \, dA \, d\Theta \, d\Phi^H \, d\Phi^E   (A.173)

where we have defined

\mathcal{O}_G \triangleq \Biggl\{ A, \Theta \in \mathcal{B}^{n_r\times n_t} : \sum_{b=1}^{B} [\min(1, \theta_{min}) - \alpha_{b,min}]^+ < B r_g \Biggr\}.   (A.174)

Applying Lemma A.1 yields

d^{G}_{icsir} \geq \inf_{\mathcal{O}_G \cap \{A \succeq 0, \Theta \preceq d_e \mathbf{1}\}} \Biggl\{ \Bigl( 1 + \frac{\tau}{2} \Bigr) \sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \alpha_{b,r,t} + \sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} ( \theta_{b,r,t} - d_e ) \Biggr\}.   (A.175)

Since increasing \theta_{min} increases both the infimum objective and the LHS of the constraint, the optimiser of \theta_{min} is \theta^*_{min} = d_e. Since \theta_{b,r,t} \geq \theta_{min}, the optimisers of \theta_{b,r,t} are \theta^*_{b,r,t} = d_e for all b, r, t. On the other hand, the infimum solution for A is given by the intersection of the region defined by \sum_{b=1}^{B} \alpha_{b,min} > B(\min(1, d_e) - r_g) and the region defined by \alpha_{b,min} \in [0, \min(1, d_e)]. Since \alpha_{b,r,t} \geq \alpha_{b,min} for all r = 1,\ldots,n_r and t = 1,\ldots,n_t, the solution to the above infimum is given by

d^{G}_{icsir} \geq \Bigl( 1 + \frac{\tau}{2} \Bigr) B n_t n_r \times ( \min(1, d_e) - r_g )   (A.176)

for r_g \in [0, \min(1, d_e)] and zero otherwise. For a fixed coding rate independent of the SNR (r_g \downarrow 0), we have that

d^{G}_{icsir} \geq \min(1, d_e) \times \Bigl( 1 + \frac{\tau}{2} \Bigr) B n_t n_r.   (A.177)
A.3.2 GMI Upper Bound
The expectation \mathbb{E}[ s Y^{\dagger} \hat{\Sigma}_y^{-1} Y \mid \hat{\mathsf{H}}_b = \hat{H}_b, \mathsf{E}_b = E_b ] can be evaluated as

\mathbb{E}\Bigl[ s Y^{\dagger} \hat{\Sigma}_y^{-1} Y \Bigm| \hat{\mathsf{H}}_b = \hat{H}_b, \mathsf{E}_b = E_b \Bigr] = \int_{x,y} s y^{\dagger} \hat{\Sigma}_y^{-1} y \, P_X(x) \, P_{Y|X,H}(y|x, H_b) \, dx \, dy   (A.178)
= \int_{x,y} s y^{\dagger} \hat{\Sigma}_y^{-1} y \cdot \frac{1}{\pi^{n_r}} e^{-\| y - \sqrt{SNR/n_t}\, H_b x \|^2} \cdot \frac{1}{\pi^{n_t}} e^{-\| x \|^2} \, dx \, dy   (A.179)
= \int_{y} s y^{\dagger} \hat{\Sigma}_y^{-1} y \Biggl[ \int_{x} \frac{1}{\pi^{n_r}} e^{-\| y - \sqrt{SNR/n_t}\, H_b x \|^2} \cdot \frac{1}{\pi^{n_t}} e^{-\| x \|^2} \, dx \Biggr] dy   (A.180)
= \int_{y} \frac{s y^{\dagger} \hat{\Sigma}_y^{-1} y}{\pi^{n_r} \det\bigl( \frac{SNR}{n_t} H_b H_b^{\dagger} + I_{n_r} \bigr)} e^{-y^{\dagger} \Sigma_y^{-1} y} \, dy   (A.181)

where

\Sigma_y = I_{n_r} + \frac{SNR}{n_t} H_b H_b^{\dagger}   (A.182)

is a positive semi-definite matrix. Let \tilde{y} = Q^{\dagger} y, where Q is a unitary matrix diagonalising \Sigma_y^{-1}. Then, \tilde{Y} is a Gaussian random vector with zero mean and covariance matrix Q^{\dagger} \Sigma_y Q. We have that^{1.3}

y^{\dagger} \Sigma_y^{-1} y = \tilde{y}^{\dagger} Q^{\dagger} \Sigma_y^{-1} Q \tilde{y} = \tilde{y}^{\dagger} \Delta \tilde{y} = \sum_{i=1}^{n_r} \frac{| \tilde{y}_i |^2}{1 + \frac{SNR}{n_t} \lambda_{b,i}}   (A.183)

where \lambda_{b,i} is the i-th eigenvalue of H_b H_b^{\dagger}, and \Delta is a diagonal matrix with diagonal elements (1 + \frac{SNR}{n_t} \lambda_{b,i})^{-1}, i = 1,\ldots,n_r. Since \hat{\Sigma}_y^{-1} is a Hermitian matrix, we can apply the eigen-decomposition [86] such that

\hat{\Sigma}_y^{-1} = \hat{Q} \hat{\Delta} \hat{Q}^{\dagger} \iff \hat{\Delta} = \hat{Q}^{\dagger} \hat{\Sigma}_y^{-1} \hat{Q}   (A.184)

where \hat{Q} is another unitary matrix and \hat{\Delta} is another diagonal matrix obtained by diagonalising \hat{\Sigma}_y^{-1}. Let \hat{\lambda}_{b,i} be the i-th eigenvalue of \hat{H}_b \hat{H}_b^{\dagger}; then the diagonal entries of \hat{\Delta} are given by (1 + s \frac{SNR}{n_t} \hat{\lambda}_{b,i})^{-1} for all i = 1,\ldots,n_r. Applying this to y^{\dagger} \hat{\Sigma}_y^{-1} y, we have that

y^{\dagger} \hat{\Sigma}_y^{-1} y = \tilde{y}^{\dagger} Q^{\dagger} \hat{\Sigma}_y^{-1} Q \tilde{y} = \tilde{y}^{\dagger} Q^{\dagger} \hat{Q} \hat{\Delta} \hat{Q}^{\dagger} Q \tilde{y} = \tilde{y}^{\dagger} V \hat{\Delta} V^{\dagger} \tilde{y}   (A.185)

where V = Q^{\dagger} \hat{Q} is also a unitary matrix. Then, we have that

^{1.3} Without loss of generality, herein we assume n_t \geq n_r.
\mathbb{E}\Bigl[ s Y^{\dagger} \hat{\Sigma}_y^{-1} Y \Bigm| \hat{\mathsf{H}}_b = \hat{H}_b, \mathsf{E}_b = E_b \Bigr] = \frac{s}{\pi^{n_r} \det\bigl( \frac{SNR}{n_t} H_b H_b^{\dagger} + I_{n_r} \bigr)} \int_{y} y^{\dagger} \hat{\Sigma}_y^{-1} y \, e^{-y^{\dagger} \Sigma_y^{-1} y} \, dy   (A.186)
= \frac{s}{\pi^{n_r} \det\bigl( \frac{SNR}{n_t} H_b H_b^{\dagger} + I_{n_r} \bigr)} \cdot \int_{\tilde{y}} \bigl( \tilde{y}^{\dagger} V \hat{\Delta} V^{\dagger} \tilde{y} \bigr) \, e^{-\sum_{i=1}^{n_r} \frac{|\tilde{y}_i|^2}{1 + \frac{SNR}{n_t} \lambda_{b,i}}} \, d\tilde{y}.   (A.187)

Let v_{i,j} and \hat{\sigma}_{i,j} be the entries of V and of V \hat{\Delta} V^{\dagger} = Q^{\dagger} \hat{\Sigma}_y^{-1} Q at row i and column j, respectively. Then, the integral in (A.187) can be evaluated as

\int_{\tilde{y}} \bigl( \tilde{y}^{\dagger} V \hat{\Delta} V^{\dagger} \tilde{y} \bigr) \, e^{-\sum_{i=1}^{n_r} \frac{|\tilde{y}_i|^2}{1 + \frac{SNR}{n_t} \lambda_{b,i}}} \, d\tilde{y} = \int_{\tilde{y}} \Biggl( \sum_{i=1}^{n_r} \hat{\sigma}_{i,i} |\tilde{y}_i|^2 + 2 \sum_{i=1}^{n_r} \sum_{j>i}^{n_r} \Re( \hat{\sigma}_{i,j} \tilde{y}_i^* \tilde{y}_j ) \Biggr) e^{-\sum_{i=1}^{n_r} \frac{|\tilde{y}_i|^2}{1 + \frac{SNR}{n_t} \lambda_{b,i}}} \, d\tilde{y}   (A.188)
= \sum_{i=1}^{n_r} \int_{\tilde{y}} \hat{\sigma}_{i,i} |\tilde{y}_i|^2 e^{-\sum_{i=1}^{n_r} \frac{|\tilde{y}_i|^2}{1 + \frac{SNR}{n_t} \lambda_{b,i}}} \, d\tilde{y}   (A.189)
= \pi^{n_r} \times \det\Bigl( \frac{SNR}{n_t} H_b H_b^{\dagger} + I_{n_r} \Bigr) \times \sum_{i=1}^{n_r} \hat{\sigma}_{i,i} \Bigl( 1 + \frac{SNR}{n_t} \lambda_{b,i} \Bigr).   (A.190)

Here \tilde{y}_i^* denotes the complex conjugate of \tilde{y}_i and \Re\{\cdot\} denotes the real part of a complex number. We have from (A.188) that

\hat{\sigma}_{i,i} = \sum_{j=1}^{n_r} \frac{| v_{i,j} |^2}{1 + s \frac{SNR}{n_t} \hat{\lambda}_{b,j}} \leq \sum_{j=1}^{n_r} | v_{i,j} |^2 = 1   (A.191)

where the inequality is because \hat{\lambda}_{b,j} is non-negative; the last equality is because, for the unitary matrix V, the sum of |v_{i,j}|^2 over j = 1,\ldots,n_r is equal to one. Finally, we have

\mathbb{E}\Bigl[ s Y^{\dagger} \hat{\Sigma}_y^{-1} Y \Bigm| \hat{\mathsf{H}}_b = \hat{H}_b, \mathsf{E}_b = E_b \Bigr] = s \sum_{i=1}^{n_r} \hat{\sigma}_{i,i} \Bigl( 1 + \frac{SNR}{n_t} \lambda_{b,i} \Bigr).   (A.192)
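The manipulations above rest on two linear-algebra facts: expressing y^{\dagger} \hat{\Sigma}_y^{-1} y in the \tilde{y}-coordinates through the unitary matrix V = Q^{\dagger} \hat{Q} as in (A.185), and the bound \hat{\sigma}_{i,i} \leq 1 of (A.191) when the diagonal entries of \hat{\Delta} lie in (0, 1]. A small numerical sketch, with arbitrary positive-definite stand-ins for \Sigma_y and \hat{\Sigma}_y (the identity is added so that the inverses have eigenvalues in (0, 1], mirroring (1 + s\hat{\lambda} SNR/n_t)^{-1}):

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_pd(n):
    # Random Hermitian matrix with eigenvalues >= 1, so its inverse has
    # eigenvalues in (0, 1], like the diagonal entries of Delta-hat.
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return A @ A.conj().T + np.eye(n)

n = 3
Sigma, Sigma_hat = rand_pd(n), rand_pd(n)          # stand-ins for Sigma_y, Sigma_y-hat
y = rng.normal(size=n) + 1j * rng.normal(size=n)

w, Q = np.linalg.eigh(np.linalg.inv(Sigma))        # Sigma^{-1} = Q diag(w) Q^dagger
wh, Qh = np.linalg.eigh(np.linalg.inv(Sigma_hat))  # Sigma_hat^{-1} = Qh diag(wh) Qh^dagger
V = Q.conj().T @ Qh                                # V = Q^dagger Q-hat, unitary
y_t = Q.conj().T @ y                               # tilde-y = Q^dagger y

lhs = y.conj() @ np.linalg.inv(Sigma_hat) @ y                    # y^dagger Sigma_hat^{-1} y
rhs = y_t.conj() @ (V @ np.diag(wh) @ V.conj().T) @ y_t          # as in (A.185)
assert np.isclose(lhs, rhs)

sigma_diag = np.real(np.diag(V @ np.diag(wh) @ V.conj().T))
assert np.all(sigma_diag <= 1 + 1e-12)             # sigma-hat_{i,i} <= 1, cf. (A.191)
```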
Let s^* be the optimising s that gives the supremum on the RHS of (A.123). Since s, \hat{\sigma}_{i,i} and \lambda_{b,i} are all non-negative, we can upper-bound I_b^{gmi}(SNR, H_b, \hat{H}_b, s^*) using Proposition 2.3 and (A.191) as follows:

I_b^{gmi}(SNR, H_b, \hat{H}_b, s^*) \leq \sup_{s_b>0} \Biggl\{ \log\det\Bigl( s_b \frac{SNR}{n_t} \hat{H}_b \hat{H}_b^{\dagger} + I_{n_r} \Bigr) - s_b \Bigl( n_r + \frac{SNR}{n_t} \| E_b \|_F^2 \Bigr) + s_b \sum_{i=1}^{n_r} \Bigl( 1 + \frac{SNR}{n_t} \lambda_{b,i} \Bigr) \Biggr\}   (A.193)
= \sup_{s_b>0} \Biggl\{ \log \prod_{i=1}^{n_r} \Bigl( 1 + s_b \frac{SNR}{n_t} \hat{\lambda}_{b,i} \Bigr) - s_b \Bigl( n_r + \frac{SNR}{n_t} \| E_b \|_F^2 \Bigr) + s_b \Bigl( n_r + \frac{SNR}{n_t} \| H_b \|_F^2 \Bigr) \Biggr\}   (A.194)
\leq \sup_{s_b>0} \Biggl\{ n_r \log\Bigl( 1 + s_b \frac{SNR}{n_t} \| \hat{H}_b \|_F^2 \Bigr) + s_b \Bigl( \frac{SNR}{n_t} \| H_b \|_F^2 - \frac{SNR}{n_t} \| E_b \|_F^2 \Bigr) \Biggr\}   (A.195)

where the last inequality is because \sum_i \hat{\lambda}_{b,i} = \| \hat{H}_b \|_F^2, and thus each \hat{\lambda}_{b,i} is upper-bounded by \| \hat{H}_b \|_F^2. If SNR \| H_b \|_F^2 is greater than or equal to SNR \| E_b \|_F^2, then the supremum on the RHS of (A.195) is achieved with s_b \uparrow \infty because the RHS of (A.195) is a strictly increasing function of s_b. However, using the data-processing inequality (Proposition 2.4), we can always bound I_b^{gmi}(SNR, H_b, \hat{H}_b, s^*) with the perfect-CSIR bound

I_b^{gmi}(SNR, H_b, \hat{H}_b, s^*) \leq \log\det\Bigl( \frac{SNR}{n_t} H_b H_b^{\dagger} + I_{n_r} \Bigr)   (A.196)
\leq n_r \log\Bigl( 1 + \frac{SNR}{n_t} \| H_b \|_F^2 \Bigr).   (A.197)
On the other hand, if SNR \| H_b \|_F^2 is less than SNR \| E_b \|_F^2, the supremum is achieved with s_b^* given by

s_b^* = \Biggl[ \frac{n_r}{\frac{SNR}{n_t} \| E_b \|_F^2 - \frac{SNR}{n_t} \| H_b \|_F^2} - \frac{1}{\frac{SNR}{n_t} \| \hat{H}_b \|_F^2} \Biggr]^+.   (A.198)

The above s_b^* \geq 0 is obtained as the zero of the first-order derivative of

n_r \log\Bigl( 1 + s_b \frac{SNR}{n_t} \| \hat{H}_b \|_F^2 \Bigr) + s_b \Bigl( \frac{SNR}{n_t} \| H_b \|_F^2 - \frac{SNR}{n_t} \| E_b \|_F^2 \Bigr)   (A.199)

with respect to s_b.

We continue the analysis by using the change of random variables as used in the GMI lower bound (Appendix A.3.1). The condition SNR \| H_b \|_F^2 \geq SNR \| E_b \|_F^2 for the perfect-CSIR bound implies that, at high SNR, we have the following dot inequality:

SNR^{1 - \alpha_{b,min}} \mathrel{\dot\geq} SNR^{1 - \theta_{b,min}}.   (A.200)

We then have the following asymptotic characterisations.

1. Case 1: \alpha_{b,min} \leq \theta_{b,min}. From the perfect-CSIR bound (A.197), we have that

1 + \frac{SNR}{n_t} \| H_b \|_F^2 \doteq \max\bigl( SNR^0, SNR^{1 - \alpha_{b,min}} \bigr).   (A.201)

2. Case 2: \alpha_{b,min} > \theta_{b,min}. If \frac{1}{\frac{SNR}{n_t} \| \hat{H}_b \|_F^2} is greater than or equal to \frac{n_r}{\frac{SNR}{n_t} \| E_b \|_F^2 - \frac{SNR}{n_t} \| H_b \|_F^2}, we have s_b^* \downarrow 0. From the RHS of (A.195), this yields

\exp\Bigl( s_b^* \frac{SNR}{n_r n_t} \| H_b \|_F^2 - s_b^* \frac{SNR}{n_r n_t} \| E_b \|_F^2 \Bigr) \times \Bigl( 1 + s_b^* \frac{SNR}{n_t} \| \hat{H}_b \|_F^2 \Bigr) \doteq SNR^0.   (A.202)

Otherwise, we have

s_b^* = \frac{n_r}{\frac{SNR}{n_t} \| E_b \|_F^2 - \frac{SNR}{n_t} \| H_b \|_F^2} - \frac{1}{\frac{SNR}{n_t} \| \hat{H}_b \|_F^2}   (A.203)

and this also yields

\exp\Bigl( s_b^* \frac{SNR}{n_r n_t} \| H_b \|_F^2 - s_b^* \frac{SNR}{n_r n_t} \| E_b \|_F^2 \Bigr) \times \Bigl( 1 + s_b^* \frac{SNR}{n_t} \| \hat{H}_b \|_F^2 \Bigr) \doteq SNR^0.   (A.204)
Recall the multiplexing gain r_g and rate R(SNR) relationship e^{R(SNR)} \doteq SNR^{r_g} (see Appendix A.3.1). From the above cases, we have the following bound on P_{gout}(R):

P_{gout}(R) \doteq SNR^{-d^{G}_{icsir}}   (A.205)
\mathrel{\dot\geq} \Pr\Biggl\{ \frac{1}{B} \sum_{b=1}^{B} n_r [1 - \alpha_{b,min}]^+ \cdot \mathbf{1}\{ \alpha_{b,min} \leq \theta_{b,min} \} < r_g \Biggr\}   (A.206)
\doteq \int_{\mathcal{O}_G} P_{A,\Phi^H}(A, \Phi^H) \, P_{\Theta}(\Theta) \, P_{\Phi^E}(\Phi^E) \, dA \, d\Theta \, d\Phi^H \, d\Phi^E   (A.207)

where we have defined

\mathcal{O}_G \triangleq \Biggl\{ A, \Theta \in \mathcal{B}^{n_r\times n_t} : \sum_{b=1}^{B} [1 - \alpha_{b,min}]^+ \cdot \mathbf{1}\{ \alpha_{b,min} \leq \theta_{b,min} \} < \frac{B r_g}{n_r} \Biggr\}.   (A.208)

Thus, applying Lemma A.1 to find the SNR-exponent and following the same steps used for the GMI lower bound, it is not difficult to prove that

d^{G}_{icsir} \leq \Bigl( 1 + \frac{\tau}{2} \Bigr) B n_t n_r \min\Bigl( 1 - \frac{r_g}{n_r}, d_e \Bigr).   (A.209)

For a fixed rate independent of the SNR (r_g \downarrow 0), we obtain

d^{G}_{icsir} \leq \min(1, d_e) \Bigl( 1 + \frac{\tau}{2} \Bigr) B n_t n_r.   (A.210)

This proves Theorem 3.1 for Gaussian inputs.
A.4 Proof of Theorem 3.2
Recall that from (2.25), the generalised Gallager function for MIMO channels can be written as (in natural-base log)

E_0^Q(s, \rho, \hat{H}_b) = -\log \mathbb{E}\Biggl[ \Biggl( \int_{x'} P_X(x') \Biggl( \frac{Q_{Y|X,\hat{H}}(Y|x', \hat{\mathsf{H}}_b)}{Q_{Y|X,\hat{H}}(Y|X, \hat{\mathsf{H}}_b)} \Biggr)^s dx' \Biggr)^{\rho} \Biggm| \hat{\mathsf{H}}_b = \hat{H}_b, \mathsf{E}_b = E_b \Biggr].   (A.211)

Evaluating the inner expectation over X' for Y = y, X = x, \hat{\mathsf{H}}_b = \hat{H}_b and \mathsf{E}_b = E_b, we have that

\int_{x'} P_X(x') \Biggl( \frac{Q_{Y|X,\hat{H}}(y|x', \hat{H}_b)}{Q_{Y|X,\hat{H}}(y|x, \hat{H}_b)} \Biggr)^s dx' = \frac{e^{s \| y - \sqrt{SNR/n_t}\, \hat{H}_b x \|^2} \cdot e^{-s y^{\dagger} \hat{\Sigma}_y^{-1} y}}{\det\bigl( s \hat{H}_b \hat{H}_b^{\dagger} \frac{SNR}{n_t} + I_{n_r} \bigr)}   (A.212)

where

\hat{\Sigma}_y = s \hat{H}_b \hat{H}_b^{\dagger} \frac{SNR}{n_t} + I_{n_r}.   (A.213)
Then, the expectation over (X, Y) is given as

\mathbb{E}\Biggl[ \Biggl( \int_{x'} P_X(x') \Biggl( \frac{Q_{Y|X,\hat{H}}(Y|x', \hat{\mathsf{H}}_b)}{Q_{Y|X,\hat{H}}(Y|X, \hat{\mathsf{H}}_b)} \Biggr)^s dx' \Biggr)^{\rho} \Biggm| \hat{\mathsf{H}}_b = \hat{H}_b, \mathsf{E}_b = E_b \Biggr] = \frac{\mathbb{E}\Bigl[ e^{\rho s \| Y - \sqrt{SNR/n_t}\, \hat{H}_b X \|^2} \times e^{-\rho s Y^{\dagger} \hat{\Sigma}_y^{-1} Y} \Bigm| \hat{\mathsf{H}}_b = \hat{H}_b, \mathsf{E}_b = E_b \Bigr]}{\det\bigl( s \hat{H}_b \hat{H}_b^{\dagger} \frac{SNR}{n_t} + I_{n_r} \bigr)^{\rho}}.   (A.214)

We can evaluate the expectation as follows. For a function f(x, y), the expectation over (X, Y) is given by

\mathbb{E}[f(X, Y)] = \int_{x,y} f(x, y) \, P_X(x) \, P_{Y|X}(y|x) \, dy \, dx.   (A.215)
We first apply the integration over y:

\int_{y} \Bigl( e^{\rho s \| y - \sqrt{SNR/n_t}\, \hat{H}_b x \|^2} \cdot e^{-\rho s y^{\dagger} \hat{\Sigma}_y^{-1} y} \Bigr) \cdot \frac{1}{\pi^{n_r}} e^{-\| y - \sqrt{SNR/n_t}\, H_b x \|^2} \, dy.   (A.216)

Using \tilde{y} = \hat{Q}^{\dagger} y, we have that

y^{\dagger} \hat{\Sigma}_y^{-1} y = \tilde{y}^{\dagger} \hat{Q}^{\dagger} \hat{\Sigma}_y^{-1} \hat{Q} \tilde{y} = \tilde{y}^{\dagger} \hat{\Delta} \tilde{y} = \sum_{i=1}^{n_r} \frac{| \tilde{y}_i |^2}{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t}}   (A.217)

where \hat{Q} is a unitary matrix identical to that defined in Appendix A.3, and where, without loss of generality, we have assumed n_t \geq n_r. Note that

\Bigl\| y - \sqrt{\tfrac{SNR}{n_t}} H_b x \Bigr\|^2 = \Bigl\| \hat{Q} \tilde{y} - \sqrt{\tfrac{SNR}{n_t}} H_b x \Bigr\|^2 = \Bigl\| \tilde{y} - \sqrt{\tfrac{SNR}{n_t}} \hat{Q}^{\dagger} H_b x \Bigr\|^2   (A.218)

because multiplication by a unitary matrix does not affect the Euclidean norm of a vector. Therefore, we have that
\int_{y} e^{\rho s \| y - \sqrt{SNR/n_t}\, \hat{H}_b x \|^2} \cdot e^{-\rho s y^{\dagger} \hat{\Sigma}_y^{-1} y} \cdot \frac{1}{\pi^{n_r}} e^{-\| y - \sqrt{SNR/n_t}\, H_b x \|^2} \, dy
= \int_{\tilde{y}} e^{\rho s \| \tilde{y} - \sqrt{SNR/n_t}\, \hat{Q}^{\dagger} \hat{H}_b x \|^2} \cdot e^{-\rho s \tilde{y}^{\dagger} \hat{\Delta} \tilde{y}} \cdot \frac{1}{\pi^{n_r}} e^{-\| \tilde{y} - \sqrt{SNR/n_t}\, \hat{Q}^{\dagger} H_b x \|^2} \, d\tilde{y}   (A.219)
\leq \frac{1}{\bigl( 1 - \frac{\rho s}{1-\rho s} \frac{SNR}{n_t} \| E_b \|_F^2 \bigr)^{n_t}} \cdot \prod_{i=1}^{n_r} \Biggl( \frac{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t}}{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \rho s)} \Biggr)   (A.220)
= \Biggl( \frac{1 - \rho s}{1 - \rho s - \rho s \frac{SNR}{n_t} \| E_b \|_F^2} \Biggr)^{n_t} \cdot \prod_{i=1}^{n_r} \Biggl( \frac{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t}}{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \rho s)} \Biggr)   (A.221)

where inequality (A.220), which also incorporates the expectation over X, is proved in Appendix A.5. Note that the result in (A.220) requires \rho s < 1 and

s \leq \frac{u_1}{u_2 + \frac{SNR}{n_t} \| E_b \|_F^2}   (A.222)

where u_1 and u_2 are some positive constants, so that the integral can be evaluated. We then have that
\mathbb{E}\Biggl[ \Biggl( \int_{x'} P_X(x') \Biggl( \frac{Q_{Y|X,\hat{H}}(Y|x', \hat{\mathsf{H}}_b)}{Q_{Y|X,\hat{H}}(Y|X, \hat{\mathsf{H}}_b)} \Biggr)^s dx' \Biggr)^{\rho} \Biggm| \hat{\mathsf{H}}_b = \hat{H}_b, \mathsf{E}_b = E_b \Biggr]
\leq \frac{1}{\det\bigl( s \hat{H}_b \hat{H}_b^{\dagger} \frac{SNR}{n_t} + I_{n_r} \bigr)^{\rho}} \cdot \Biggl( \frac{1 - \rho s}{1 - \rho s - \rho s \frac{SNR}{n_t} \| E_b \|_F^2} \Biggr)^{n_t} \cdot \prod_{i=1}^{n_r} \Biggl( \frac{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t}}{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \rho s)} \Biggr)   (A.223)

and from (A.211)

E_0^Q(s, \rho, \hat{H}_b) \geq \sum_{i=1}^{n_r} (\rho - 1) \log\Bigl( 1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} \Bigr) - n_t \log(1 - \rho s) + n_t \log\Bigl( 1 - \rho s - \rho s \frac{SNR}{n_t} \| E_b \|_F^2 \Bigr) + \sum_{i=1}^{n_r} \log\Bigl( 1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \rho s) \Bigr).   (A.224)
Note that the random coding error exponent is given by

E_r^Q(R, \hat{H}) = \sup_{s>0,\ 0\leq\rho\leq 1} \frac{1}{B} \sum_{b=1}^{B} E_0^Q(s, \rho, \hat{H}_b) - \rho R.   (A.225)

A lower bound to E_r^Q(R, \hat{H}) can be obtained by replacing E_0^Q(s, \rho, \hat{H}_b) with the RHS of (A.224), and is given by

E_r^Q(R, \hat{H}) \geq \sup_{s>0,\ 0\leq\rho\leq 1} \frac{1}{B} \sum_{b=1}^{B} \Biggl[ \sum_{i=1}^{n_r} (\rho - 1) \log\Bigl( 1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} \Bigr) - n_t \log(1 - \rho s) + n_t \log\Bigl( 1 - \rho s - \rho s \frac{SNR}{n_t} \| E_b \|_F^2 \Bigr) + \sum_{i=1}^{n_r} \log\Bigl( 1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \rho s) \Bigr) \Biggr] - \rho R.   (A.227)
Note that from (A.227), we require \rho s < 1 and \rho s + \rho s \frac{SNR}{n_t} \| E_b \|_F^2 < 1 for all b = 1,\ldots,B so that the logarithm functions are defined. The following choices of \bar{\rho} = 1 and

\bar{s} = \frac{1}{n_r \Bigl( 1 + \frac{SNR}{n_t} \sum_{b=1}^{B} \| E_b \|_F^2 \Bigr)}   (A.228)

satisfy (A.222) and ensure that the logarithm functions in (A.227) are always defined. Since \bar{\rho}\bar{s} and \bar{\rho}\bar{s} + \bar{\rho}\bar{s} \frac{SNR}{n_t} \| E_b \|_F^2 are always bounded by some real-valued constants in the interval [0, 1], we have that for SNR \geq 0

-n_t \log(1 - \bar{\rho}\bar{s}) + n_t \log\Bigl( 1 - \bar{\rho}\bar{s} - \bar{\rho}\bar{s} \frac{SNR}{n_t} \| E_b \|_F^2 \Bigr) \geq n_t \log\Bigl( 1 - \frac{1}{n_r} \Bigr) \triangleq u_3.   (A.229)
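Bound (A.229) can be spot-checked numerically: with \bar{\rho} = 1 and \bar{s} from (A.228), \bar{\rho}\bar{s}(1 + \frac{SNR}{n_t}\|E_b\|_F^2) never exceeds 1/n_r, so the second logarithm stays above \log(1 - 1/n_r) while the first term is non-negative. A sketch with arbitrary example values:

```python
import math

# Example values (a sketch; requires n_r > 1 so that u_3 = n_t log(1 - 1/n_r) is finite).
n_t, n_r, B = 2, 2, 3
err = [0.4, 1.3, 0.7]                 # stand-ins for (SNR/n_t) * ||E_b||_F^2, b = 1..B
rho_s = 1.0 / (n_r * (1 + sum(err)))  # rho-bar * s-bar with rho-bar = 1, cf. (A.228)

u3 = n_t * math.log(1 - 1 / n_r)      # the constant defined in (A.229)
for c in err:
    lhs = -n_t * math.log(1 - rho_s) + n_t * math.log(1 - rho_s - rho_s * c)
    assert lhs >= u3                  # bound (A.229) holds for every block b
```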
Note that choosing specific values of \rho and s further lower-bounds (A.227). Since E_r^Q(R, \hat{H}) can also be lower-bounded by 0 (i.e., \rho = 0), substituting \bar{\rho} and \bar{s} for \rho and s yields the following lower bounds:

E_r^Q(R, \hat{H}) \geq \Biggl[ \frac{1}{B} \sum_{b=1}^{B} \Biggl( u_3 + \sum_{i=1}^{n_r} (\bar{\rho} - 1) \log\Bigl( 1 + \bar{s} \hat{\lambda}_{b,i} \frac{SNR}{n_t} \Bigr) + \sum_{i=1}^{n_r} \log\Bigl( 1 + \bar{s} \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \bar{\rho}\bar{s}) \Bigr) \Biggr) - \bar{\rho} R \Biggr]^+   (A.230)
\geq \Biggl[ \frac{1}{B} \Biggl( \sum_{b=1}^{B} \log\Bigl( e^{u_3} \cdot \Bigl( 1 + \bar{s} \| \hat{H}_b \|_F^2 \frac{SNR}{n_t} (1 - \bar{s}) \Bigr) \Bigr) - B R \Biggr) \Biggr]^+   (A.231)
\triangleq \bar{E}_r^Q(R, \hat{H}).   (A.232)
Note that inequality (A.231) is due to the lower-bounding technique in (A.131). Following the high-SNR analysis in Appendix A.3.1 (Cases 1 and 2), we obtain the dot equality

e^{u_3} \cdot \Bigl( 1 + \bar{s} \| \hat{H}_b \|_F^2 \frac{SNR}{n_t} (1 - \bar{s}) \Bigr) \doteq SNR^{[\min(1, \theta_{min}) - \alpha_{b,min}]^+}.   (A.233)

Recall the rate and multiplexing gain relationship e^{R(SNR)} \doteq SNR^{r_g} (cf. (2.57)). It follows from \bar{E}_r^Q(R, \hat{H}) (the RHS of (A.231)), (A.233) and this dot equality that, at high SNR, if the following event

\mathcal{A}_G = \Biggl\{ A, \Theta \in \mathcal{B}^{n_r\times n_t} : \sum_{b=1}^{B} [\min(1, \theta_{min}) - \alpha_{b,min}]^+ \leq B r_g \Biggr\}   (A.234)

occurs, then \bar{E}_r^Q(R, \hat{H}) = 0, and if the complementary event

\mathcal{A}_G^c = \Biggl\{ A, \Theta \in \mathcal{B}^{n_r\times n_t} : \sum_{b=1}^{B} [\min(1, \theta_{min}) - \alpha_{b,min}]^+ > B r_g \Biggr\}   (A.235)

occurs, then \bar{E}_r^Q(R, \hat{H}) > 0. Therefore, we can upper-bound the average error probability of Gaussian random codes as follows:
P_{e,ave} \leq \mathbb{E}\Bigl[ e^{-BJ \bar{E}_r^Q(R, \hat{\mathsf{H}})} \Bigr]   (A.236)
\mathrel{\dot\leq} \int_{\mathcal{A}_G \cap \{A \succeq 0,\, \Theta \preceq d_e \mathbf{1}\}} SNR^{-(1+\frac{\tau}{2}) \sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \alpha_{b,r,t}} \times SNR^{\sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} (d_e - \theta_{b,r,t})} \, dA \, d\Theta \, d\Phi^H \, d\Phi^E
+ \int_{\mathcal{A}_G^c \cap \{A \succeq 0,\, \Theta \preceq d_e \mathbf{1}\}} SNR^{-(1+\frac{\tau}{2}) \sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \alpha_{b,r,t}} \cdot SNR^{\sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} (d_e - \theta_{b,r,t})} \times SNR^{-J( \sum_{b=1}^{B} [\min(1, \theta_{min}) - \alpha_{b,min}]^+ - B r_g )} \, dA \, d\Theta \, d\Phi^H \, d\Phi^E   (A.237)
\doteq K_1 SNR^{-d_1(r_g)} + K_2 SNR^{-d_2(r_g)}   (A.238)
\doteq SNR^{-d_2(r_g)}   (A.239)

where K_1, K_2 \doteq SNR^0, and where

d_1(r_g) = \Bigl( 1 + \frac{\tau}{2} \Bigr) B n_t n_r \times ( \min(1, d_e) - r_g )   (A.240)

is a lower bound to the generalised outage SNR-exponent achieved with infinite block length (notice that \mathcal{O}_G in Appendix A.3 is similar to \mathcal{A}_G except that the inequality < becomes \leq in \mathcal{A}_G) and
d_2(r_g) = \inf_{\mathcal{A}_G^c \cap \{A \succeq 0,\, \Theta \preceq d_e \mathbf{1}\}} \Biggl\{ \Bigl( 1 + \frac{\tau}{2} \Bigr) \sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \alpha_{b,r,t} + \sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} ( \theta_{b,r,t} - d_e ) + J \Biggl( \sum_{b=1}^{B} [\min(1, \theta_{min}) - \alpha_{b,min}]^+ - B r_g \Biggr) \Biggr\}.   (A.241)

Since we need

J \Biggl( \sum_{b=1}^{B} [\min(1, \theta_{min}) - \alpha_{b,min}]^+ - B r_g \Biggr) > 0   (A.242)

in the set \mathcal{A}_G^c for d_2(r_g), it is straightforward to deduce that d_2(r_g) \leq d_1(r_g), which follows from [20, Lemma 6]. Therefore, the SNR-exponent lower bound for a given block length J is given by d_G^{\ell}(r_g) = d_2(r_g) in Theorem 3.2.
A.5 Proof of Inequality (A.220)
Basically, we want to evaluate the following expectation over X:

\mathbb{E}\Biggl[ \int_{y} e^{\rho s \| y - \sqrt{SNR/n_t}\, \hat{H}_b X \|^2} \cdot e^{-\rho s y^{\dagger} \hat{\Sigma}_y^{-1} y} \cdot \frac{1}{\pi^{n_r}} e^{-\| y - \sqrt{SNR/n_t}\, H_b X \|^2} \, dy \Biggm| \hat{\mathsf{H}}_b = \hat{H}_b, \mathsf{E}_b = E_b \Biggr]
= \mathbb{E}\Biggl[ \int_{\tilde{y}} e^{\rho s \| \tilde{y} - \sqrt{SNR/n_t}\, \hat{Q}^{\dagger} \hat{H}_b X \|^2} \cdot e^{-\rho s \tilde{y}^{\dagger} \hat{\Delta} \tilde{y}} \cdot \frac{1}{\pi^{n_r}} e^{-\| \tilde{y} - \sqrt{SNR/n_t}\, \hat{Q}^{\dagger} H_b X \|^2} \, d\tilde{y} \Biggm| \hat{\mathsf{H}}_b = \hat{H}_b, \mathsf{E}_b = E_b \Biggr].   (A.243)

To simplify the presentation, we let g = \sqrt{\frac{SNR}{n_t}} \hat{Q}^{\dagger} \hat{H}_b x and c = \sqrt{\frac{SNR}{n_t}} \hat{Q}^{\dagger} H_b x, with g, c \in \mathbb{C}^{n_r}. Then, expanding the argument of the exponential terms for X = x,
we have that

\rho s \Bigl\| \tilde{y} - \sqrt{\tfrac{SNR}{n_t}} \hat{Q}^{\dagger} \hat{H}_b x \Bigr\|^2 - \rho s \tilde{y}^{\dagger} \hat{\Delta} \tilde{y} - \Bigl\| \tilde{y} - \sqrt{\tfrac{SNR}{n_t}} \hat{Q}^{\dagger} H_b x \Bigr\|^2 = \rho s \sum_{i=1}^{n_r} \Biggl( | \tilde{y}_i - g_i |^2 - \frac{| \tilde{y}_i |^2}{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t}} \Biggr) - \sum_{i=1}^{n_r} | \tilde{y}_i - c_i |^2.   (A.244)

By basic integration, we can easily obtain that

\int_{\tilde{y}_i} \frac{1}{\pi} e^{-| \tilde{y}_i - c_i |^2 - \rho s \frac{| \tilde{y}_i |^2}{1 + s \hat{\lambda}_{b,i} SNR/n_t} + \rho s | \tilde{y}_i - g_i |^2} \, d\tilde{y}_i = \frac{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t}}{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \rho s)} \times e^{-| c_i |^2 + \rho s | g_i |^2 + | \rho s g_i - c_i |^2 \cdot \frac{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t}}{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \rho s)}}.   (A.245)
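The scalar Gaussian integral (A.245) follows by completing the square; it can also be verified by brute force. The sketch below compares the closed form against a direct two-dimensional Riemann sum, for arbitrary example parameters:

```python
import numpy as np

# Example parameters (a sketch; a stands for s * lambda_hat * SNR / n_t).
a, rho_s = 0.5, 0.3
c, g = 0.7 + 0.2j, -0.4 + 0.5j

# Closed form on the RHS of (A.245).
ratio = (1 + a) / (1 + a * (1 - rho_s))
closed = ratio * np.exp(-abs(c)**2 + rho_s * abs(g)**2
                        + abs(rho_s * g - c)**2 * ratio)

# Brute-force integration over the complex plane (y = u + i v).
u = np.linspace(-8, 8, 1201)
U, Vv = np.meshgrid(u, u)
y = U + 1j * Vv
f = (1 / np.pi) * np.exp(-abs(y - c)**2 - rho_s * abs(y)**2 / (1 + a)
                         + rho_s * abs(y - g)**2)
du = u[1] - u[0]
numeric = f.sum() * du * du

assert abs(numeric - closed) / closed < 1e-3
```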
Evaluating the integral over all \tilde{y}_i, i = 1,\ldots,n_r, yields

\Biggl( \prod_{i=1}^{n_r} \frac{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t}}{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \rho s)} \Biggr) \times \exp\Biggl\{ -\sum_{i=1}^{n_r} | c_i |^2 + \rho s \sum_{i=1}^{n_r} | g_i |^2 + \sum_{i=1}^{n_r} \Biggl( | \rho s g_i - c_i |^2 \cdot \frac{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t}}{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \rho s)} \Biggr) \Biggr\}.   (A.246)

Note that

\sum_{i=1}^{n_r} | c_i |^2 = \Bigl\| \sqrt{\tfrac{SNR}{n_t}} \hat{Q}^{\dagger} H_b x \Bigr\|^2 = \frac{SNR}{n_t} \| H_b x \|^2   (A.247)
\sum_{i=1}^{n_r} | g_i |^2 = \Bigl\| \sqrt{\tfrac{SNR}{n_t}} \hat{Q}^{\dagger} \hat{H}_b x \Bigr\|^2 = \frac{SNR}{n_t} \| \hat{H}_b x \|^2   (A.248)

because \hat{Q} is a unitary matrix and hence does not change the Euclidean norm of a vector. This removes the difficulty of obtaining the exact expression for \hat{Q}. On the other hand, the last term in the exponential function in (A.246) is difficult to evaluate because the summation involves the variable \hat{\lambda}_{b,i}. Herein we impose an additional condition so that the last term in (A.246) can be evaluated. Suppose that we restrict \rho s < 1 with strict inequality for 0 \leq \rho \leq 1 and s > 0. Then, we have the bounds for \hat{\lambda}_{b,i} \geq 0 and SNR \geq 0:

1 \leq \frac{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t}}{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \rho s)} \leq \frac{1}{1 - \rho s}.   (A.249)
Hence, we can upper-bound the last term in (A.246) as follows:

\sum_{i=1}^{n_r} | \rho s g_i - c_i |^2 \Biggl( \frac{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t}}{1 + s \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \rho s)} \Biggr) \leq \frac{1}{1 - \rho s} \sum_{i=1}^{n_r} | \rho s g_i - c_i |^2   (A.250)
= \frac{1}{1 - \rho s} \Bigl\| \sqrt{\tfrac{SNR}{n_t}} \hat{Q}^{\dagger} ( \rho s \hat{H}_b x - H_b x ) \Bigr\|^2   (A.251)
= \frac{1}{1 - \rho s} \frac{SNR}{n_t} \bigl\| \rho s \hat{H}_b x - H_b x \bigr\|^2   (A.252)
where the last equality holds because the unitary matrix \hat{Q}^{\dagger} does not affect the Euclidean norm of a vector; this removes the dependency on \hat{Q}^{\dagger}. By combining (A.247)–(A.252), we upper-bound the expectation over X of the exponential function in (A.246) by the following:

\mathbb{E}\Biggl[ \exp\Biggl\{ -\frac{SNR}{n_t} \Bigl( \| H_b X \|^2 - \rho s \| \hat{H}_b X \|^2 - \frac{1}{1 - \rho s} \| ( \rho s \hat{H}_b - H_b ) X \|^2 \Bigr) \Biggr\} \Biggr]
= \int_{x} \exp\Biggl\{ -\frac{SNR}{n_t} \Bigl( \| H_b x \|^2 - \rho s \| \hat{H}_b x \|^2 - \frac{1}{1 - \rho s} \| ( \rho s \hat{H}_b - H_b ) x \|^2 \Bigr) \Biggr\} \times \frac{1}{\pi^{n_t}} e^{-\| x \|^2} \, dx   (A.253)
= \frac{1}{\pi^{n_t}} \int_{x} \exp\Biggl( \frac{\rho s}{1 - \rho s} \frac{SNR}{n_t} \| E_b x \|^2 - \| x \|^2 \Biggr) dx   (A.254)
\leq \frac{1}{\pi^{n_t}} \int_{x} \exp\Biggl( \frac{\rho s}{1 - \rho s} \frac{SNR}{n_t} \| E_b \|_F^2 \| x \|^2 - \| x \|^2 \Biggr) dx   (A.255)
= \frac{1}{\bigl( 1 - \frac{\rho s}{1 - \rho s} \frac{SNR}{n_t} \| E_b \|_F^2 \bigr)^{n_t}}   (A.256)

where the last inequality is due to \| E_b x \|^2 \leq \| E_b \|_F^2 \| x \|^2 [86, Sec. 5.6]. Note that the integrand in (A.255) is integrable if \frac{\rho s}{1 - \rho s} \frac{SNR}{n_t} \| E_b \|_F^2 < 1. At high SNR, there exist positive constants u_1 and u_2 for which s \leq \frac{u_1}{u_2 + \frac{SNR}{n_t} \| E_b \|_F^2} guarantees that (A.255) is integrable.
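The key algebraic step behind (A.254) is that, with \hat{H}_b = H_b + E_b, the combination \| H_b x \|^2 - \rho s \| \hat{H}_b x \|^2 - (1 - \rho s)^{-1} \| ( \rho s \hat{H}_b - H_b ) x \|^2 collapses to -\frac{\rho s}{1 - \rho s} \| E_b x \|^2. A short numerical sketch of this identity (example matrices only):

```python
import numpy as np

rng = np.random.default_rng(1)
n_r, n_t, rho_s = 2, 3, 0.4                        # example sizes; rho_s = rho * s < 1

H = rng.normal(size=(n_r, n_t)) + 1j * rng.normal(size=(n_r, n_t))
E = rng.normal(size=(n_r, n_t)) + 1j * rng.normal(size=(n_r, n_t))
H_hat = H + E                                      # channel estimate: H_hat = H + E
x = rng.normal(size=n_t) + 1j * rng.normal(size=n_t)

sq = lambda v: np.linalg.norm(v) ** 2
lhs = sq(H @ x) - rho_s * sq(H_hat @ x) - sq((rho_s * H_hat - H) @ x) / (1 - rho_s)
rhs = -rho_s / (1 - rho_s) * sq(E @ x)             # exponent appearing in (A.254)
assert np.isclose(lhs, rhs)
```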
A.6 Proof of Proposition 3.1
From (A.230) and (A.231) with \bar{\rho} = 1, we can rewrite the lower bounds on the mismatched decoding error exponent as follows:

E_r^Q(R, \hat{H}) \geq \Biggl[ \frac{1}{B} \sum_{b=1}^{B} \Biggl( u_3 + \sum_{i=1}^{n_r} \log\Bigl( 1 + \bar{s} \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \bar{s}) \Bigr) \Biggr) - R \Biggr]^+   (A.257)
\geq \Biggl[ \frac{1}{B} \sum_{b=1}^{B} \log\Bigl( e^{u_3} \cdot \Bigl( 1 + \bar{s} \| \hat{H}_b \|_F^2 \frac{SNR}{n_t} (1 - \bar{s}) \Bigr) \Bigr) - R \Biggr]^+   (A.258)

where u_3 < 0 and

\bar{s} = \frac{1}{n_r \Bigl( 1 + \frac{SNR}{n_t} \sum_{b=1}^{B} \| E_b \|_F^2 \Bigr)}.   (A.259)
We have used the last inequality above to derive the block length threshold for
Gaussian inputs in Theorem 3.2. The results are general for the fading model
(3.4). However, the last inequality implies a looser achievability bound and the
resulting block length threshold may not be tight.
A tighter bound can be obtained by using inequality (A.257). This requires the joint density function of the random vector \hat{\Lambda}_b and the entries of \mathsf{E}_b. Note that conditioned on \mathsf{E}_b = E_b, \hat{\mathsf{H}}_b has the same distribution and covariance as \mathsf{H}_b but with the mean shifted by E_b. From (3.4), the conditional distribution of each channel-estimate entry \hat{\mathsf{H}}_{b,r,t} is given by

P_{\hat{H}_{b,r,t}|E_{b,r,t}}(\hat{h}|e) = w_0 | \hat{h} - e |^{\tau} e^{-w_1 | \hat{h} - e - w_2 |^{\varphi}}.   (A.260)

The characterisation of the above pdf is difficult when \tau \neq 0. At high SNR, the near-zero behaviour determines the dominating term in the pdf [20, 47]. Note that for \tau \neq 0, the near-zero behaviour of the pdf is determined by the values of \hat{h}, e in | \hat{h} - e |^{\tau}. This behaviour depends not only on | \hat{h} |^{\tau} but also on | e |^{\tau} and the angles of \hat{h} and e. These interplaying variables make the near-zero behaviour of the pdf intractable. On the other hand, when \tau = 0, the variable e only affects the exponential term, which in many cases tends to decay exponentially
or converges to a constant for high SNR (see also [47, 48]).
Consider \tau = 0 and assume n_t \geq n_r. We perform a change of random variables from the matrix entries in \hat{H}_b to its eigenvalues \hat{\lambda}_{b,i} for all i = 1,\ldots,n_r. Since the entries of the channel matrix are assumed to be i.i.d., the pdf of \hat{\mathsf{H}}_b for a given E_b is given by

P_{\hat{H}_b|E_b}\bigl( \hat{H}_b \bigm| E_b \bigr) = \prod_{r=1}^{n_r} \prod_{t=1}^{n_t} w_0 e^{-w_1 | \hat{h}_{b,r,t} - e_{b,r,t} - w_2 |^{\varphi}}.   (A.261)

Using the singular value decomposition \hat{H}_b = UDS and the eigen-decomposition \hat{H}_b \hat{H}_b^{\dagger} = U \Sigma U^{\dagger} [86], random-matrix results [107, 108] provide the joint distribution of the ordered eigenvalues in the following form [47, 48]:

P_{\hat{\Lambda}_b|E_b}\bigl( \hat{\lambda}_b \bigm| E_b \bigr) = C_{n,m} \prod_{i=1}^{n_r} \hat{\lambda}_{b,i}^{n_t - n_r} \cdot \prod_{i<j} \bigl( \hat{\lambda}_{b,i} - \hat{\lambda}_{b,j} \bigr)^2 \cdot \int_{\mathcal{V}_{n_r,n_r}} \int_{\mathcal{V}_{n_r,n_t}} P_{\hat{H}_b|E_b}(UDS|E_b) \, dS \, dU   (A.262)

where C_{n,m} is the normalising constant, and \mathcal{V}_{n_r,n_r} and \mathcal{V}_{n_r,n_t} are the complex Stiefel manifolds [107, 108]. Remark that \Sigma = diag[\hat{\lambda}_{b,1}, \ldots, \hat{\lambda}_{b,n_r}] and D = diag[\hat{\lambda}_{b,1}^{1/2}, \ldots, \hat{\lambda}_{b,n_r}^{1/2}] with \hat{\lambda}_{b,1} \leq \cdots \leq \hat{\lambda}_{b,n_r}.
Let \upsilon_{b,i} = -\frac{\log \hat{\lambda}_{b,i}}{\log SNR}. Using this change of variable, (A.261) and (A.262), we can write the above pdf for \tau = 0 as done in [48]:

P_{\Upsilon_b|E_b}( \upsilon_b | E_b ) = C_{n,m} \cdot (\log SNR)^{n_r} \cdot \prod_{i=1}^{n_r} SNR^{-(n_t - n_r + 1)\upsilon_{b,i}} \cdot \prod_{i<j} \bigl( SNR^{-\upsilon_{b,i}} - SNR^{-\upsilon_{b,j}} \bigr)^2 \cdot \Biggl( \int_{\mathcal{V}_{n_r,n_r}} \int_{\mathcal{V}_{n_r,n_t}} w_0^{n_r n_t} e^{-w_1 ( \| \hat{H}_b - E_b - W_2 \|_{\varphi} )^{\varphi}} \, dS \, dU \Biggr)   (A.263)

where W_2 is an n_r \times n_t matrix with all elements equal to w_2 and \| \cdot \|_{\varphi} is the \varphi-norm [86]. As we deal with the achievability bound, it suffices to find a tight upper bound for the pdf. Note that since \mathbb{C}^{n_r\times n_t} is a finite-dimensional complex space, all norms on \mathbb{C}^{n_r\times n_t} are equivalent [86].^{1.4} Thus, we can find a positive

^{1.4} The equivalence of norms can be explained as follows. Given a finite-dimensional space \mathbb{C}^{m\times n} and a matrix X \in \mathbb{C}^{m\times n}, there exist positive real numbers P and S independent of X such that P \| X \|_{p'} \leq \| X \|_{p} \leq S \| X \|_{p'} [86].
number u_4 > 0 such that the term in the exponent can be lower-bounded as

\| \hat{H}_b - E_b - W_2 \|_{\varphi} \geq u_4 \| \hat{H}_b - E_b - W_2 \|_{F}.   (A.264)

Applying the reverse triangle inequality for the matrix norm, we have that

\| \hat{H}_b - E_b - W_2 \|_{F} \geq \Bigl| \| \hat{H}_b \|_F - \| E_b + W_2 \|_F \Bigr|   (A.265)
= \Biggl| \sqrt{ \sum_{i=1}^{n_r} \hat{\lambda}_{b,i} } - \sqrt{ \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} | e_{b,r,t} + w_2 |^2 } \Biggr|   (A.266)
= \Biggl| \sqrt{ \sum_{i=1}^{n_r} SNR^{-\upsilon_{b,i}} } - \sqrt{ \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \Bigl| SNR^{-\frac{\theta_{b,r,t}}{2}} e^{\imath \phi^e_{b,r,t}} + w_2 \Bigr|^2 } \Biggr|.   (A.267)
Since \varphi \geq 1 by the definition in (3.4), we can lower-bound the exponent in (A.263) using

\bigl( \| \hat{H}_b - E_b - W_2 \|_{\varphi} \bigr)^{\varphi} \geq u_4^{\varphi} \bigl( \| \hat{H}_b - E_b - W_2 \|_{F} \bigr)^{\varphi}   (A.268)
\geq u_4^{\varphi} \Bigl| \| \hat{H}_b \|_F - \| E_b + W_2 \|_F \Bigr|^{\varphi}   (A.269)

which follows from the monotonicity of the function f(u) = u^{\varphi} over the interval u > 0. Remark that the conditional pdf in (A.262) is conditioned on E_b. We can write the joint density function of \hat{\Lambda}_b, \mathsf{E}_b as follows:

P_{\hat{\Lambda}_b,E_b}\bigl( \hat{\lambda}_b, E_b \bigr) = P_{\hat{\Lambda}_b|E_b}\bigl( \hat{\lambda}_b \bigm| E_b \bigr) P_{E_b}(E_b).   (A.270)
The density P_{\hat{\Lambda}_b|E_b}(\hat{\lambda}_b|E_b) P_{E_b}(E_b) can be further expanded as

P_{\hat{\Lambda}_b|E_b}\bigl( \hat{\lambda}_b \bigm| E_b \bigr) P_{E_b}(E_b) = P_{\hat{\Lambda}_b|\{E_{b,r,t}\}}\bigl( \hat{\lambda}_b \bigm| \{ e_{b,r,t} \} \bigr) \prod_{r=1}^{n_r} \prod_{t=1}^{n_t} P_{E_{b,r,t}}(e_{b,r,t})   (A.271)

where \{ e_{b,r,t} \} denotes the collection of e_{b,r,t} for all r, t. Equality (A.271) holds since the matrix E_b can be completely expressed in terms of its entries e_{b,r,t}, r = 1,\ldots,n_r, t = 1,\ldots,n_t. Note that the entries of the random matrix \mathsf{E}_b are i.i.d. random variables and, for each entry, the phase \Phi^E_{b,r,t} is independent of the magnitude |E_{b,r,t}| and uniformly distributed over [0, 2\pi). Hence, applying
the transformation of the variables \hat{\lambda}_{b,i} and | e_{b,r,t} |^2 to \upsilon_{b,i} = -\frac{\log \hat{\lambda}_{b,i}}{\log SNR} and \theta_{b,r,t} = -\frac{\log | e_{b,r,t} |^2}{\log SNR}, we have the joint pdf of \Upsilon_b, \Theta_{b,r,t} and \Phi^E_{b,r,t}, r = 1,\ldots,n_r, t = 1,\ldots,n_t, as follows:

P_{\Upsilon_b, \{\Theta_{b,r,t}\}, \{\Phi^E_{b,r,t}\}}\bigl( \upsilon_b, \{\theta_{b,r,t}\}, \{\phi^e_{b,r,t}\} \bigr) = P_{\Upsilon_b | \{\Theta_{b,r,t}\}, \{\Phi^E_{b,r,t}\}}\bigl( \upsilon_b \bigm| \{\theta_{b,r,t}\}, \{\phi^e_{b,r,t}\} \bigr) \cdot \prod_{r=1}^{n_r} \prod_{t=1}^{n_t} P_{\Theta_{b,r,t}}(\theta_{b,r,t}) P_{\Phi^E_{b,r,t}}(\phi^e_{b,r,t}).   (A.272)
We continue the analysis from (A.271) and (A.272). Note that the term

P_{\Upsilon_b | \{\Theta_{b,r,t}\}, \{\Phi^E_{b,r,t}\}}( \upsilon_b | \{\theta_{b,r,t}\}, \{\phi^e_{b,r,t}\} )   (A.273)

can be further upper-bounded using (A.269). Using this bound, we then group the exponential terms as follows:

\exp\Biggl\{ -w_1 u_4^{\varphi} \Biggl| \sqrt{ \sum_{i=1}^{n_r} SNR^{-\upsilon_{b,i}} } - \sqrt{ \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \Bigl| SNR^{-\frac{\theta_{b,r,t}}{2}} e^{\imath \phi^e_{b,r,t}} + w_2 \Bigr|^2 } \Biggr|^{\varphi} - \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} SNR^{(d_e - \theta_{b,r,t})} \Biggr\}.   (A.274)
As the SNR increases, the behaviour of this exponential term is dominated by the smallest values of \upsilon_{b,i}, i = 1,\ldots,n_r, and \theta_{b,r,t}, r = 1,\ldots,n_r, t = 1,\ldots,n_t. Since the eigenvalues \hat{\lambda}_{b,1},\ldots,\hat{\lambda}_{b,n_r} are ordered in non-decreasing order, the dominating terms are indicated by \upsilon_{b,n_r} and \theta_{b,min}. We have the following observations.

1. For the variables inside the modulus operator | \cdot |^{\varphi}, if \upsilon_{b,n_r} \geq 0 and \theta_{b,min} \geq 0, the terms inside the modulus converge to some constant w_2' as the SNR increases, and the convergence of the exponential term is determined by

-w_1 u_4^{\varphi} | w_2' |^{\varphi} - \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} SNR^{(d_e - \theta_{b,r,t})} \doteq -SNR^0 - SNR^{d_e - \theta_{b,min}}.   (A.275)

If \theta_{b,min} < d_e, then SNR^{(d_e - \theta_{b,min})} dominates and this makes the overall pdf upper bound decay exponentially with the SNR. If \theta_{b,min} \geq d_e, then the constant w_1 u_4^{\varphi} | w_2' |^{\varphi} dominates, and eventually the exponential function converges to an SNR-independent constant which can be neglected in the pdf upper bound for the asymptotic analysis.
2. If either \upsilon_{b,n_r} < 0 or \theta_{b,min} < 0, then the exponential convergence can be explained in the following cases.

• If \upsilon_{b,n_r} < \theta_{b,min}, then the following dominates the exponent:

-w_1 u_4^{\varphi} \Biggl( \sqrt{ \sum_{i=1}^{n_r} SNR^{-\upsilon_{b,i}} } \Biggr)^{\varphi} - \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} SNR^{(d_e - \theta_{b,r,t})} \doteq -SNR^{-\frac{\varphi}{2}\upsilon_{b,n_r}} - SNR^{d_e - \theta_{b,min}}   (A.276)
\doteq -SNR^{\max( -\frac{\varphi}{2}\upsilon_{b,n_r},\ d_e - \theta_{b,min} )}.   (A.277)

Since \upsilon_{b,n_r} < 0, it can be seen that the exponential function always makes the pdf upper bound decay exponentially with the SNR.

• If \upsilon_{b,n_r} > \theta_{b,min}, the dominating exponent is given by

-w_1 u_4^{\varphi} \Biggl( \sqrt{ \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \Bigl| SNR^{-\frac{\theta_{b,r,t}}{2}} e^{\imath \phi^e_{b,r,t}} + w_2 \Bigr|^2 } \Biggr)^{\varphi} - \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} SNR^{(d_e - \theta_{b,r,t})} \doteq -SNR^{-\frac{\varphi}{2}\theta_{b,min}} - SNR^{d_e - \theta_{b,min}}   (A.278)
\doteq -SNR^{\max( -\frac{\varphi}{2}\theta_{b,min},\ d_e - \theta_{b,min} )}.   (A.279)

Since \theta_{b,min} is less than zero, it can be seen that the exponential function always makes the pdf upper bound decay exponentially with the SNR.
3. Note that we have \upsilon_{b,1} \geq \cdots \geq \upsilon_{b,n_r} and, for any r = 1,\ldots,n_r, t = 1,\ldots,n_t, \theta_{b,r,t} \geq \theta_{b,min}.

Hence, from the above observations, we require that \upsilon_b \succeq 0 and \Theta_b \preceq d_e \mathbf{1}, b = 1,\ldots,B, so that the pdf upper bound does not decay exponentially to zero as the SNR tends to infinity.
We continue the analysis by evaluating the lower bound for E_r^Q(R, \hat{H}) in (A.257):

E_r^Q(R, \hat{H}) \geq \Biggl[ \frac{1}{B} \sum_{b=1}^{B} \Biggl( u_3 + \sum_{i=1}^{n_r} \log\Bigl( 1 + \bar{s} \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \bar{s}) \Bigr) \Biggr) - R \Biggr]^+   (A.280)
= \Biggl[ \frac{1}{B} \log \prod_{b=1}^{B} \prod_{i=1}^{n_r} e^{\frac{u_3}{n_r}} \Bigl( 1 + \bar{s} \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \bar{s}) \Bigr) - R \Biggr]^+   (A.281)
\triangleq E_r'^Q(R, \hat{H})   (A.282)
where

\bar{s} = \frac{1}{n_r \Bigl( 1 + \frac{SNR}{n_t} \sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} | e_{b,r,t} |^2 \Bigr)}.   (A.283)

Using the change of variables from \hat{\lambda}_{b,i} and | e_{b,r,t} |^2 to \upsilon_{b,i} and \theta_{b,r,t}, we can show the following dot equality:

e^{\frac{u_3}{n_r}} \Bigl( 1 + \bar{s} \hat{\lambda}_{b,i} \frac{SNR}{n_t} (1 - \bar{s}) \Bigr) \doteq SNR^{[\min(1, \theta_{min}) - \upsilon_{b,i}]^+}   (A.284)

where

\theta_{min} \triangleq \min\{ \theta_{1,1,1}, \ldots, \theta_{b,r,t}, \ldots, \theta_{B,n_r,n_t} \}.   (A.285)
It follows from E_r'^Q(R, \hat{H}) (the RHS of (A.281)), (A.284) and the rate and multiplexing gain relationship e^{R(SNR)} \doteq SNR^{r_g} (cf. (2.57)) that, at high SNR, if the following event

\mathcal{A}_G = \Biggl\{ \upsilon \in \mathcal{B}^{n_r}, \Theta \in \mathcal{B}^{n_r\times n_t} : \sum_{b=1}^{B} \sum_{i=1}^{n_r} [\min(1, \theta_{min}) - \upsilon_{b,i}]^+ \leq B r_g \Biggr\}   (A.286)

occurs, then E_r'^Q(R, \hat{H}_b) = 0, and otherwise if

\mathcal{A}_G^c = \Biggl\{ \upsilon \in \mathcal{B}^{n_r}, \Theta \in \mathcal{B}^{n_r\times n_t} : \sum_{b=1}^{B} \sum_{i=1}^{n_r} [\min(1, \theta_{min}) - \upsilon_{b,i}]^+ > B r_g \Biggr\}   (A.287)

occurs, then E_r'^Q(R, \hat{H}_b) > 0. Therefore, for the fading model (3.4) with \tau = 0,
we can upper-bound the average error probability of Gaussian random codes as follows:

P_{e,ave} \leq \mathbb{E}\Bigl[ e^{-BJ E_r'^Q(R, \hat{\mathsf{H}})} \Bigr]   (A.288)
\mathrel{\dot\leq} \int_{\mathcal{A}_G \cap \{\upsilon \succeq 0,\, \Theta \preceq d_e \mathbf{1}\}} SNR^{-\sum_{b=1}^{B} \sum_{i=1}^{n_r} (2i-1+n_t-n_r)\upsilon_{b,i}} \times SNR^{\sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} (d_e - \theta_{b,r,t})} \, d\upsilon \, d\Theta \, d\Phi^E
+ \int_{\mathcal{A}_G^c \cap \{\upsilon \succeq 0,\, \Theta \preceq d_e \mathbf{1}\}} SNR^{-\sum_{b=1}^{B} \sum_{i=1}^{n_r} (2i-1+n_t-n_r)\upsilon_{b,i}} \times SNR^{\sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} (d_e - \theta_{b,r,t})} \times SNR^{-J( \sum_{b=1}^{B} \sum_{i=1}^{n_r} [\min(1, \theta_{min}) - \upsilon_{b,i}]^+ - B r_g )} \, d\upsilon \, d\Theta \, d\Phi^E   (A.289)
\doteq G_1 SNR^{-d_1(r_g)} + G_2 SNR^{-d_2(r_g)}   (A.290)
\doteq SNR^{-d_2(r_g)}   (A.291)
whereG1, G2.= SNR
0, and d1(rg) is the generalised outage SNR-exponent achieved
with infinite block length. Note that to find the solution of d1(rg), we follow the
same approach of finding the optimal DMT in [20]. The lower-bound of the op-
timal DMT curve d1(rg) is given by the piecewise-linear function connecting the
points (rg, d1(rg)), where
rg = 0,min(1, de), 2min(1, de), . . . , nrmin(1, de), (A.292)
d1(rg) = min(1, de) · B(
nt −rg
min(1, de)
)
·(
nr −rg
min(1, de)
)
. (A.293)
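As a numerical illustration of the piecewise-linear characterisation (A.292)-(A.293), the following sketch evaluates $d_1(r_g)$ by interpolating between the corner points; the system parameters used here are hypothetical and chosen only for illustration.

```python
def d1(rg, B, nt, nr, de):
    """Piecewise-linear d1(rg) of (A.293), interpolated between the
    corner points rg = 0, min(1,de), ..., nr*min(1,de) of (A.292)."""
    m = min(1.0, de)
    corners = [k * m for k in range(nr + 1)]
    vals = [m * B * (nt - r / m) * (nr - r / m) for r in corners]
    # linear interpolation on the segment containing rg
    for r0, v0, r1, v1 in zip(corners, vals, corners[1:], vals[1:]):
        if r0 <= rg <= r1:
            return v0 + (v1 - v0) * (rg - r0) / (r1 - r0)
    raise ValueError("rg outside [0, nr*min(1, de)]")

# Hypothetical system: B = 4 fading blocks, nt = nr = 2, CSIR-error diversity de = 0.5
assert d1(0.0, 4, 2, 2, 0.5) == 0.5 * 4 * 2 * 2   # d1,max = min(1,de)*B*nt*nr
assert d1(1.0, 4, 2, 2, 0.5) == 0.0               # vanishes at rg,max = min(1,de)*nr
```

The assertions mirror the remark that $d_{1,\max} = \min(1,d_e)Bn_tn_r$ and $r_{g,\max} = \min(1,d_e)n_r$.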
Note that we have $d_{1,\max} = \min(1,d_e)Bn_tn_r$ and $r_{g,\max} = \min(1,d_e)n_r$. On the other hand, $d_2(r_g)$ is given by
$$d_2(r_g) = \inf_{\mathcal{A}_G^c\cap\{\boldsymbol{\upsilon}\succeq\mathbf{0},\,\Theta\succeq d_e\times\mathbf{1}\}} \left\{\sum_{b=1}^{B}\sum_{i=1}^{n_r}(2i-1+n_t-n_r)\upsilon_{b,i} + \sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}(\theta_{b,r,t}-d_e) + J\left(\sum_{b=1}^{B}\sum_{i=1}^{n_r}[\min(1,\theta_{\min})-\upsilon_{b,i}]_+ - Br_g\right)\right\}. \tag{A.294}$$
Since we need
$$J\left(\sum_{b=1}^{B}\sum_{i=1}^{n_r}[\min(1,\theta_{\min})-\upsilon_{b,i}]_+ - Br_g\right) > 0 \tag{A.295}$$
for $d_2(r_g)$, it is straightforward to deduce that $d_2(r_g) \le d_1(r_g)$, which follows from [20, Lemma 6]. Thus, $d_2(r_g)$ leads to $d_G^{\ell}(r_g)$ in the proposition. Note that if $n_t < n_r$, we only need to replace $\left(I_{n_r} + \frac{s\,\mathrm{SNR}}{n_t}H_bH_b^\dagger\right)$ with $\left(I_{n_t} + \frac{s\,\mathrm{SNR}}{n_t}H_b^\dagger H_b\right)$ in the analysis.
A.7 Proof of Theorem 3.3
We use the generalised Gallager upper bound to derive the achievability by isolating the channel block length and the random coding exponent. Recall $E_Q^0(s,\rho,\hat{H}_b)$ in (2.25), written here in a different form:
$$E_Q^0(s,\rho,\hat{H}_b) = -\log_2\mathbb{E}\!\left[\left(\sum_{x'\in\mathcal{X}^{n_t}} P_X(x')\left(\frac{Q_{Y|X,\hat{H}}(Y|x',\hat{H}_b)}{Q_{Y|X,\hat{H}}(Y|X,\hat{H}_b)}\right)^{s}\right)^{\rho}\,\middle|\,\hat{\mathsf{H}}_b = \hat{H}_b,\,\mathsf{E}_b = E_b\right]. \tag{A.296}$$
For a given $Y = y$, $X = x$, $\hat{\mathsf{H}}_b = \hat{H}_b$ and $\mathsf{E}_b = E_b$, inserting the decoding metric (3.7) and evaluating the expectation over $X'$, we have that
$$\sum_{x'\in\mathcal{X}^{n_t}} P_X(x')\left(\frac{Q_{Y|X,\hat{H}}(y|x',\hat{H}_b)}{Q_{Y|X,\hat{H}}(y|x,\hat{H}_b)}\right)^{s} = 2^{-Mn_t}\sum_{x'\in\mathcal{X}^{n_t}}\left(e^{-\left\|\sqrt{\frac{\mathrm{SNR}}{n_t}}H_b(x-x')+z-\sqrt{\frac{\mathrm{SNR}}{n_t}}E_bx'\right\|^2 + \left\|z-\sqrt{\frac{\mathrm{SNR}}{n_t}}E_bx\right\|^2}\right)^{s}. \tag{A.297}$$
Substituting (A.297) into the RHS of (A.296), we obtain
$$-\log_2\mathbb{E}\!\left[\left(\sum_{x'\in\mathcal{X}^{n_t}} P_X(x')\left(\frac{Q_{Y|X,\hat{H}}(Y|x',\hat{H}_b)}{Q_{Y|X,\hat{H}}(Y|X,\hat{H}_b)}\right)^{s}\right)^{\rho}\,\middle|\,\hat{\mathsf{H}}_b = \hat{H}_b,\,\mathsf{E}_b = E_b\right]$$
$$= (1+\rho)Mn_t - \log_2\sum_{x\in\mathcal{X}^{n_t}}\mathbb{E}\!\left[\left(\sum_{x'\in\mathcal{X}^{n_t}} e^{-s\left\|\sqrt{\frac{\mathrm{SNR}}{n_t}}H_b(x-x')+Z-\sqrt{\frac{\mathrm{SNR}}{n_t}}E_bx'\right\|^2 + s\left\|Z-\sqrt{\frac{\mathrm{SNR}}{n_t}}E_bx\right\|^2}\right)^{\rho}\right]. \tag{A.298}$$
Note that
$$1 \le \left(\sum_{x'\in\mathcal{X}^{n_t}} e^{-s\left\|\sqrt{\frac{\mathrm{SNR}}{n_t}}H_b(x-x')+z-\sqrt{\frac{\mathrm{SNR}}{n_t}}E_bx'\right\|^2 + s\left\|z-\sqrt{\frac{\mathrm{SNR}}{n_t}}E_bx\right\|^2}\right)^{\rho} \tag{A.299}$$
$$\le |\mathcal{X}^{n_t}|^{\rho}\, e^{\rho s\left\|z-\sqrt{\frac{\mathrm{SNR}}{n_t}}E_bx\right\|^2}. \tag{A.300}$$
Taking the expectation over $Z$, we have
$$\mathbb{E}\!\left[|\mathcal{X}^{n_t}|^{\rho}\, e^{\rho s\left\|Z-\sqrt{\frac{\mathrm{SNR}}{n_t}}E_bx\right\|^2}\right] = \frac{|\mathcal{X}^{n_t}|^{\rho}}{(1-\rho s)^{n_r}}\, e^{\left(\frac{\rho^2s^2}{1-\rho s}+\rho s\right)\frac{\mathrm{SNR}}{n_t}\|E_bx\|^2} \tag{A.301}$$
$$\le \frac{|\mathcal{X}^{n_t}|^{\rho}}{(1-\rho s)^{n_r}}\, e^{\left(\frac{\rho^2s^2}{1-\rho s}+\rho s\right)\frac{\mathrm{SNR}}{n_t}\|E_b\|_F^2\|x\|^2} \tag{A.302}$$
where we have assumed $\rho s < 1$ so that the expectation can be evaluated, and where we have used the Frobenius norm property $\|E_bx\|^2 \le \|E_b\|_F^2\|x\|^2$ [86, Sec. 5.6] in the last inequality. Since the signal energy $\|x\|^2$, $x\in\mathcal{X}^{n_t}$, is finite, the condition
$$\frac{|\mathcal{X}^{n_t}|^{\rho}}{(1-\rho s)^{n_r}}\, e^{\left(\frac{\rho^2s^2}{1-\rho s}+\rho s\right)\frac{\mathrm{SNR}}{n_t}\|E_b\|_F^2\|x\|^2} < \infty \tag{A.303}$$
can be satisfied by choosing the optimal solution of $s$ over
$$\mathcal{S} = \left\{s\in\mathbb{R} : 0 < s \le \frac{1}{B + \mathrm{SNR}\sum_{b=1}^{B}\|E_b\|_F^2}\right\}. \tag{A.304}$$
The choice of $s\in\mathcal{S}$ leads to a lower bound on the mismatched-decoding error exponent in (2.24). As (A.303) can be satisfied with $s\in\mathcal{S}$, the dominated convergence theorem [19] can be applied here. Let $s^*$ be the value of $s$ that solves the supremum on the RHS of (2.24). Then, using a similar argument to the one used in the generalised outage evaluation (Appendix A.2.2), we can conclude the following for the expectation over $Z$:
$$\lim_{\mathrm{SNR}\to\infty}\mathbb{E}\!\left[\left(\sum_{x'\in\mathcal{X}^{n_t}} e^{-s^*\left\|\sqrt{\frac{\mathrm{SNR}}{n_t}}H_b(x-x')+Z-\sqrt{\frac{\mathrm{SNR}}{n_t}}E_bx'\right\|^2 + s^*\left\|Z-\sqrt{\frac{\mathrm{SNR}}{n_t}}E_bx\right\|^2}\right)^{\rho}\right] \le \mathbb{E}\!\left[\left(\sum_{x'\in\mathcal{X}^{n_t}} \mathbb{1}\!\left\{x'_t = x_t,\ \forall t\in\mathcal{S}_b^{(\epsilon,\delta)}\right\}\right)^{\rho}\right] \tag{A.305}$$
$$= 2^{\rho M(n_t-\kappa_b)} \tag{A.306}$$
where $\mathcal{S}_b^{(\epsilon,\delta)}$ and $\kappa_b$ have the same definitions as those in Appendix A.2.2. Consequently, at high SNR we have
$$E_Q^0(s^*,\rho,\hat{H}_b) \ge \rho M\kappa_b. \tag{A.307}$$
The random coding error exponent $E_Q^r(R,\hat{H})$ can then be bounded as follows:
$$E_Q^r(R,\hat{H}) = \sup_{s>0,\,0\le\rho\le1}\frac{1}{B}\sum_{b=1}^{B}E_Q^0(s,\rho,\hat{H}_b) - \rho R \tag{A.308}$$
$$\ge \sup_{0\le\rho\le1}\rho M\left(\frac{1}{B}\sum_{b=1}^{B}\kappa_b - \frac{R}{M}\right). \tag{A.309}$$
Define $\zeta$ and two mutually exclusive sets as follows:
$$\zeta \triangleq \sum_{b=1}^{B}\kappa_b - \frac{BR}{M}, \tag{A.310}$$
$$\mathcal{A}_{\mathcal{X}} \triangleq \left\{A,\Theta\in\mathbb{R}^{Bn_r\times n_t} : \sum_{b=1}^{B}\kappa_b > \frac{BR}{M}\right\}, \tag{A.311}$$
$$\mathcal{A}_{\mathcal{X}}^c \triangleq \left\{A,\Theta\in\mathbb{R}^{Bn_r\times n_t} : \sum_{b=1}^{B}\kappa_b \le \frac{BR}{M}\right\}. \tag{A.312}$$
Note that the value of $\rho$ that solves the supremum on the RHS of (A.309) is given by $\rho^* = 1$ if $\zeta > 0$ and $\rho^* = 0$ if $\zeta \le 0$. Then, we can upper-bound the average error probability of discrete-input random codes as follows:
$$P_{e,\mathrm{ave}} \le \mathbb{E}\!\left[2^{-BJE_Q^r(R,\hat{H})}\right] \tag{A.313}$$
$$\dot{\le} \int_{\mathcal{A}_{\mathcal{X}}\cap\{A\succeq\mathbf{0},\,\Theta\succeq d_e\times\mathbf{1}\}} \mathrm{SNR}^{-(1+\frac{\tau}{2})\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{b,r,t}} \times \mathrm{SNR}^{-\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}(\theta_{b,r,t}-d_e)} \times 2^{-JM\zeta}\, dA\,d\Theta\,d\Phi^H\,d\Phi^E$$
$$\quad + \int_{\mathcal{A}_{\mathcal{X}}^c\cap\{A\succeq\mathbf{0},\,\Theta\succeq d_e\times\mathbf{1}\}} \mathrm{SNR}^{-(1+\frac{\tau}{2})\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{b,r,t}} \times \mathrm{SNR}^{-\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}(\theta_{b,r,t}-d_e)}\, dA\,d\Theta\,d\Phi^H\,d\Phi^E \tag{A.314}$$
$$\doteq C_1\mathrm{SNR}^{-d_1} + C_2\mathrm{SNR}^{-d_2} \tag{A.315}$$
$$\doteq \mathrm{SNR}^{-\min(d_1,d_2)} \tag{A.316}$$
where $C_1, C_2 \doteq \mathrm{SNR}^0$. It is straightforward to see that
$$d_2 = \min(1,d_e)\times\left(1+\frac{\tau}{2}\right)n_r\left\lceil B\left(n_t-\frac{R}{M}\right)\right\rceil \tag{A.317}$$
is equivalent to $d_{\mathrm{icsir}}$ in Theorem 3.1 up to the discontinuity points of the Singleton bound. This is exactly the same as (A.113) when replacing $<$ in the outage set with $\le$. On the other hand, following the same steps used in Appendix A.2.2, we arrive at the following result for $d_1$:
$$d_1 = \inf_{\mathcal{A}_{\mathcal{X}}\cap\{A\succeq\mathbf{0},\,\Theta\succeq d_e\times\mathbf{1}\}}\left\{\frac{JM\zeta\log2}{\log\mathrm{SNR}} + \left(1+\frac{\tau}{2}\right)\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{b,r,t} + \sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}(\theta_{b,r,t}-d_e)\right\}. \tag{A.318}$$
If both $M$ and $J$ do not grow with $\log\mathrm{SNR}$, it is clear that $d_1 = 0$ as the SNR tends to infinity. Assume instead that $M$ is fixed and $J(\mathrm{SNR}) = \omega\log\mathrm{SNR}$, $\omega \ge 0$. Then, we can write $d_1$ as
$$d_1 = \inf_{\mathcal{A}_{\mathcal{X}}\cap\{A\succeq\mathbf{0},\,\Theta\succeq d_e\times\mathbf{1}\}}\left\{\omega M\zeta\log2 + \left(1+\frac{\tau}{2}\right)\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{b,r,t} + \sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}(\theta_{b,r,t}-d_e)\right\}. \tag{A.319}$$
By letting $\epsilon, \delta \downarrow 0$ to achieve a tight SNR-exponent lower bound, the optimiser of $\Theta$ is given by $\Theta^* = d_e\times\mathbf{1}$, and that of $A$ is obtained by evaluating the intersection of $\{A\succeq\mathbf{0}\}$ and $\mathcal{A}_{\mathcal{X}}$. This yields the following solution for $d_1$:
$$d_1 = \inf_{1+\lfloor\frac{BR}{M}\rfloor\,\le\,K\,\le\,Bn_t} d_1(K) \tag{A.320}$$
where
$$d_1(K) = \omega M\log2\left(K-\frac{BR}{M}\right) + \min(1,d_e)\times\left(1+\frac{\tau}{2}\right)n_r(Bn_t-K). \tag{A.321}$$
Note that the derivative of $d_1(K)$ with respect to $K$ is given by
$$\frac{\partial d_1(K)}{\partial K} = \omega M\log2 - \min(1,d_e)\left(1+\frac{\tau}{2}\right)n_r. \tag{A.322}$$
It follows that the value of $K$ solving the infimum (A.320) is given by
$$K^* = Bn_t \tag{A.323}$$
if $\omega M\log2 < \min(1,d_e)\left(1+\frac{\tau}{2}\right)n_r$, and
$$K^* = 1+\left\lfloor\frac{BR}{M}\right\rfloor \tag{A.324}$$
if $\omega M\log2 \ge \min(1,d_e)\left(1+\frac{\tau}{2}\right)n_r$.
We are interested in the interval of $\omega$ for which $d_1 \ge d_2$, as this is where the SNR-exponent of discrete-input random codes is tight with $d_{\mathrm{icsir}}$ up to the discontinuity points of the Singleton bound. From (A.317), (A.321), (A.323) and (A.324), we deduce that $d_1 \ge d_2$ is only possible with $\omega M\log2 \ge \min(1,d_e)(1+\frac{\tau}{2})n_r$. This implies that $K^* = 1+\lfloor\frac{BR}{M}\rfloor$ and $d_1 = d_1(K^*)$. By comparing $d_1(K^*)$ and $d_2$, we obtain the following threshold on $\omega$ for which $d_1(K^*) \ge d_2$:
$$\omega \ge \frac{1}{M\log2}\cdot\frac{\min(1,d_e)\cdot\left(1+\frac{\tau}{2}\right)n_r}{1+\left\lfloor\frac{BR}{M}\right\rfloor-\frac{BR}{M}}. \tag{A.325}$$
Furthermore, using $K^*$ in (A.323) and (A.324), a complete characterisation of the achievable SNR-exponent with discrete-input random codes can be obtained; it is given in equations (3.40), (3.41) and (3.42).
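The endpoint selection (A.323)-(A.324) and the threshold (A.325) are easy to check numerically. The sketch below uses hypothetical parameter values chosen only for illustration; at the threshold on $\omega$, the exponent $d_1(K^*)$ meets $d_2$ of (A.317).

```python
import math

def d1_of_K(K, omega, M, B, R, nt, nr, de, tau):
    """d1(K) of (A.321)."""
    return (omega * M * math.log(2) * (K - B * R / M)
            + min(1.0, de) * (1 + tau / 2) * nr * (B * nt - K))

def K_star(omega, M, B, R, nt, nr, de, tau):
    """Optimiser of (A.320): d1(K) is linear in K (cf. (A.322)), so the
    infimum sits at an endpoint, reproducing (A.323)-(A.324)."""
    slope = omega * M * math.log(2) - min(1.0, de) * (1 + tau / 2) * nr
    return B * nt if slope < 0 else 1 + math.floor(B * R / M)

# Hypothetical parameters
M, B, R, nt, nr, de, tau = 2, 2, 1.5, 2, 2, 0.8, 0.0
d2 = min(1.0, de) * (1 + tau / 2) * nr * math.ceil(B * (nt - R / M))      # (A.317)
omega_thr = (min(1.0, de) * (1 + tau / 2) * nr
             / (M * math.log(2) * (1 + math.floor(B * R / M) - B * R / M)))  # (A.325)
K = K_star(omega_thr, M, B, R, nt, nr, de, tau)
assert K == 1 + math.floor(B * R / M)
assert math.isclose(d1_of_K(K, omega_thr, M, B, R, nt, nr, de, tau), d2)
```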
Appendix B
B.1 Proof of Lemma 4.2
We first note that, due to the symmetry of the codebook construction, it suffices to consider that the message $m = 1$ is transmitted. Recall that, for a given $\mathsf{Y}$, the decoder outputs the message $m$ if $\mathsf{X}(m)$ is the unique matrix whose normalised metric is greater than the threshold in $\mathcal{T}_\delta$; otherwise, it declares an error. The undetected error event is characterised by unique decoding of $\Psi(\mathsf{Y})$ when the decoded message is not the transmitted one. Let $\mathsf{X}(j)$ be the codeword matrix corresponding to the $j$-th message, $j\in\{1,\ldots,|\mathcal{M}|\}$. We can write the undetected error event $\{\mathcal{V},\mathcal{E}\}$ as
$$\left\{\mathcal{V},\mathcal{E}\,\middle|\,\left(\mathsf{X}(1),\mathsf{Y},\hat{\mathsf{H}}\right)\in\mathcal{T}_\delta^c\right\} \subseteq \bigcup_{j\ne1}\left\{\left(\mathsf{X}(j),\mathsf{Y},\hat{\mathsf{H}}\right)\in\mathcal{T}_\delta\,\middle|\,\left(\mathsf{X}(1),\mathsf{Y},\hat{\mathsf{H}}\right)\in\mathcal{T}_\delta^c\right\}. \tag{B.1}$$
We evaluate the following probability for $j\ne1$, conditioned on a fixed fading realisation $\mathsf{H} = H$ and its corresponding estimation error $\mathsf{E} = E$ (such that $\hat{H} = H + E$):
$$\Pr\left\{\left(\mathsf{X}(j),\mathsf{Y},\hat{\mathsf{H}}\right)\in\mathcal{T}_\delta\,\middle|\,\left(\mathsf{X}(1),\mathsf{Y},\hat{\mathsf{H}}\right)\in\mathcal{T}_\delta^c\right\} = \mathbb{E}\!\left[\Pr\left\{\left(\mathsf{X}(j),\mathsf{Y},\hat{\mathsf{H}}\right)\in\mathcal{T}_\delta\,\middle|\,\mathsf{Y}\right\}\,\middle|\,\left(\mathsf{X}(1),\mathsf{Y},\hat{\mathsf{H}}\right)\in\mathcal{T}_\delta^c\right] \tag{B.2}$$
where the equality holds since, conditioned on $\mathsf{Y} = Y$ for any $Y\in\mathbb{C}^{n_r\times J}$ (and for a fixed $\hat{H} = H + E$), the following metric for message $j$
$$\frac{Q^s_{\mathsf{Y}|\mathsf{X},\hat{\mathsf{H}}}(Y|\mathsf{X}(j),\hat{H})}{\mathbb{E}\!\left[Q^s_{\mathsf{Y}|\mathsf{X},\hat{\mathsf{H}}}(Y|\mathsf{X}',\hat{H})\right]} \tag{B.3}$$
is independent of whether or not the event $(\mathsf{X}(1),\mathsf{Y},\hat{\mathsf{H}})\in\mathcal{T}_\delta^c$ has occurred (the codeword for message $j\ne1$ is generated independently of the codeword for message 1). We next evaluate $\Pr\{(\mathsf{X}(j),\mathsf{Y},\hat{\mathsf{H}})\in\mathcal{T}_\delta\,|\,\mathsf{Y}\}$ for a specific channel output $\mathsf{Y} = Y\in\mathbb{C}^{n_r\times J}$:
$$\Pr\left\{\left(\mathsf{X}(j),\mathsf{Y},\hat{\mathsf{H}}\right)\in\mathcal{T}_\delta\,\middle|\,\mathsf{Y} = Y\right\} = \Pr\left\{\frac{Q^s_{\mathsf{Y}|\mathsf{X},\hat{\mathsf{H}}}(Y|\mathsf{X}(j),\hat{H})}{\mathbb{E}\!\left[Q^s_{\mathsf{Y}|\mathsf{X},\hat{\mathsf{H}}}(Y|\mathsf{X}',\hat{H})\right]} \ge \frac{|\mathcal{M}|}{\delta}\right\} \tag{B.4}$$
$$\le \frac{\mathbb{E}\!\left[Q^s_{\mathsf{Y}|\mathsf{X},\hat{\mathsf{H}}}(Y|\mathsf{X}(j),\hat{H})\right]}{\frac{|\mathcal{M}|}{\delta}\,\mathbb{E}\!\left[Q^s_{\mathsf{Y}|\mathsf{X},\hat{\mathsf{H}}}(Y|\mathsf{X}',\hat{H})\right]} \tag{B.5}$$
$$= \frac{\delta}{|\mathcal{M}|}. \tag{B.6}$$
Inequality (B.5) follows from Markov's inequality. Equality (B.6) follows since, for a given transmitted message $m = 1$, $\mathsf{Y} = Y$, $\mathsf{H} = H$ and $\mathsf{E} = E$, random coding with i.i.d. codebooks implies that the expectation $\mathbb{E}[Q^s_{\mathsf{Y}|\mathsf{X},\hat{\mathsf{H}}}(Y|\mathsf{X}(j),\hat{H})]$ does not depend on the message index $j$ for $j\ne1$. Combining (B.6) with (B.2) and applying the union bound to the probability of the event (B.1) yields
$$\Pr\{\mathcal{V},\mathcal{E}\,|\,\mathsf{H} = H,\,\mathsf{E} = E\} < \delta. \tag{B.7}$$
The proof of Lemma 4.2 is completed by letting $\delta' = -\frac{\log_2\delta}{BJ}$.
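The Markov-inequality step behind (B.4)-(B.6) states that any non-negative random variable $Z$ satisfies $\Pr\{Z \ge \frac{|\mathcal{M}|}{\delta}\mathbb{E}[Z]\} \le \frac{\delta}{|\mathcal{M}|}$. A quick Monte Carlo check, using an Exp(1) surrogate for the normalised metric (an illustrative assumption, not the actual metric distribution):

```python
import random

random.seed(0)
num_messages, delta = 64, 0.5      # |M| and delta, illustrative values only
n = 100_000
z = [random.expovariate(1.0) for _ in range(n)]   # non-negative surrogate, E[Z] = 1
mean_z = sum(z) / n
p_emp = sum(v >= (num_messages / delta) * mean_z for v in z) / n
assert p_emp <= delta / num_messages              # Markov bound of (B.5)-(B.6)
```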
B.2 Proof of Theorem 4.1
Using random coding schemes, we characterise Pe(L) in (4.50). The converse and
achievability bounds are given in the following.
B.2.1 Converse
To characterise the converse, we shall assume i.i.d. codebooks and perfect error detection, so that $P_e(L)$ in (4.50) becomes
$$P_e(L) = \Pr\{\mathcal{D}_1, F_t(1)=1\} + \Pr\{\mathcal{A}_{L-1},\mathcal{D}_{L-1},\mathcal{E}_L\} + \sum_{\ell'=2}^{L-1}\Pr\{\mathcal{A}_{\ell'-1},\mathcal{D}_{\ell'},F_t(\ell')=1\}. \tag{B.8}$$
As $J\to\infty$, we can lower-bound the first term as
$$\Pr\{\mathcal{D}_1,F_t(1)=1\} = \Pr\{\mathcal{D}_1\}\Pr\{F_t(1)=1\,|\,F_r(1)=0\} \tag{B.9}$$
$$\ge \Pr\left\{(\mathsf{H}_{1,1},\mathsf{E}_{1,1})\in\mathcal{O}_{1,1}(R)\right\}p_{\mathrm{fb}} \tag{B.10}$$
$$\doteq P^{-d_{\mathrm{icsir}}(1)-d_{\mathrm{fb}}}. \tag{B.11}$$
Here (B.10) follows from the converse for i.i.d. codebooks (Proposition 2.4) and (B.11) follows from the definition of $d_{\mathrm{icsir}}(1)$ in (4.54).

We next consider $\Pr\{\mathcal{A}_{\ell-1},\mathcal{D}_\ell,F_t(\ell)=1\}$. For rounds $\ell < L$, we have
$$\Pr\{\mathcal{A}_{\ell-1},\mathcal{D}_\ell,F_t(\ell)=1\} = \Pr\{F_t(\ell)=1\,|\,\mathcal{A}_{\ell-1},\mathcal{D}_\ell\}\Pr\{\mathcal{A}_{\ell-1},\mathcal{D}_\ell\} \tag{B.12}$$
$$= p_{\mathrm{fb}}\Pr\{\mathcal{A}_{\ell-1},\mathcal{D}_\ell\} \tag{B.13}$$
$$= p_{\mathrm{fb}}\Pr\{\mathcal{D}_\ell\,|\,\mathcal{A}_{\ell-1},\mathcal{D}_{\ell-1}\}\Pr\{\mathcal{A}_{\ell-1}\,|\,\mathcal{A}_{\ell-2},\mathcal{D}_{\ell-1}\}\Pr\{\mathcal{A}_{\ell-2},\mathcal{D}_{\ell-1}\} \tag{B.14}$$
$$= p_{\mathrm{fb}}(1-p_{\mathrm{fb}})\Pr\{\mathcal{D}_\ell\,|\,\mathcal{A}_{\ell-1},\mathcal{D}_{\ell-1}\}\Pr\{\mathcal{A}_{\ell-2},\mathcal{D}_{\ell-1}\} \tag{B.15}$$
$$= p_{\mathrm{fb}}(1-p_{\mathrm{fb}})^{\ell-1}\Pr\{\mathcal{D}_1\}\prod_{\ell'=2}^{\ell}\Pr\{\mathcal{D}_{\ell'}\,|\,\mathcal{A}_{\ell'-1},\mathcal{D}_{\ell'-1}\}. \tag{B.16}$$
Note that
$$\Pr\{\mathcal{D}_1\}\prod_{\ell'=2}^{\ell}\Pr\{\mathcal{D}_{\ell'}\,|\,\mathcal{A}_{\ell'-1},\mathcal{D}_{\ell'-1}\} \tag{B.17}$$
with imperfect feedback is identical to
$$\Pr\{\mathcal{D}_1\}\prod_{\ell'=2}^{\ell}\Pr\{\mathcal{D}_{\ell'}\,|\,\mathcal{D}_{\ell'-1}\} \tag{B.18}$$
with perfect feedback, since we properly condition the event $\mathcal{D}_{\ell'}$ on $\mathcal{A}_{\ell'-1}$ and $\mathcal{D}_{\ell'-1}$; i.e., the event $\mathcal{D}_{\ell'}$ in (B.17) is only considered if a detected error occurs at all previous rounds and negative ACKs are obtained at the transmitter. The effect of imperfect feedback is captured by $p_{\mathrm{fb}}(1-p_{\mathrm{fb}})^{\ell-1}$. For round $\ell = L$, we have
$$\Pr\{\mathcal{A}_{L-1},\mathcal{D}_{L-1},\mathcal{E}_L\} = \Pr\{\mathcal{E}_L\,|\,\mathcal{A}_{L-1},\mathcal{D}_{L-1}\}\Pr\{\mathcal{A}_{L-1},\mathcal{D}_{L-1}\} \tag{B.19}$$
$$= (1-p_{\mathrm{fb}})^{L-1}\Pr\{\mathcal{E}_L\,|\,\mathcal{A}_{L-1},\mathcal{D}_{L-1}\}\Pr\{\mathcal{D}_1\}\prod_{\ell'=2}^{L-1}\Pr\{\mathcal{D}_{\ell'}\,|\,\mathcal{A}_{\ell'-1},\mathcal{D}_{\ell'-1}\}. \tag{B.20}$$
Therefore, using the GMI converse for i.i.d. codebooks (4.47), we have as $J\to\infty$ that
$$p_{\mathrm{fb}}(1-p_{\mathrm{fb}})^{\ell-1}\Pr\{\mathcal{D}_1\}\prod_{\ell'=2}^{\ell}\Pr\{\mathcal{D}_{\ell'}\,|\,\mathcal{A}_{\ell'-1},\mathcal{D}_{\ell'-1}\} \ge p_{\mathrm{fb}}(1-p_{\mathrm{fb}})^{\ell-1}\Pr\left\{(\mathsf{H}_{1,\ell},\mathsf{E}_{1,\ell})\in\bigcap_{\ell'=1}^{\ell}\mathcal{O}_{1,\ell'}(R)\right\} \tag{B.21}$$
$$\doteq P^{-d_{\mathrm{icsir}}(\ell)-d_{\mathrm{fb}}} \tag{B.22}$$
and
$$(1-p_{\mathrm{fb}})^{L-1}\Pr\{\mathcal{E}_L\,|\,\mathcal{A}_{L-1},\mathcal{D}_{L-1}\}\Pr\{\mathcal{D}_1\}\prod_{\ell'=2}^{L-1}\Pr\{\mathcal{D}_{\ell'}\,|\,\mathcal{A}_{\ell'-1},\mathcal{D}_{\ell'-1}\} \ge (1-p_{\mathrm{fb}})^{L-1}\Pr\left\{(\mathsf{H}_{1,L},\mathsf{E}_{1,L})\in\bigcap_{\ell'=1}^{L}\mathcal{O}_{1,\ell'}(R)\right\} \tag{B.23}$$
$$\doteq P^{-d_{\mathrm{icsir}}(L)}. \tag{B.24}$$
Combining (B.11), (B.22) and (B.24) with (B.8) yields
$$P_e(L)\ \dot{\ge}\ P^{-d_{\mathrm{icsir}}(L)} + \sum_{\ell=1}^{L-1}P^{-d_{\mathrm{icsir}}(\ell)-d_{\mathrm{fb}}}. \tag{B.25}$$
B.2.2 Achievability
We now prove that the same SNR-exponent as on the RHS of (B.25) can be achieved using random coding schemes. Recall $P_e(L)$ in (4.50):
$$P_e(L) = \Pr\{\mathcal{V}_1,\mathcal{E}_1\} + \sum_{\ell'=2}^{L-1}\Pr\{\mathcal{A}_{\ell'-1},\mathcal{D}_{\ell'-1},\mathcal{V}_{\ell'},\mathcal{E}_{\ell'}\} + \Pr\{\mathcal{A}_{L-1},\mathcal{D}_{L-1},\mathcal{E}_L\} + \Pr\{\mathcal{D}_1,F_t(1)=1\} + \sum_{\ell'=2}^{L-1}\Pr\{\mathcal{A}_{\ell'-1},\mathcal{D}_{\ell'},F_t(\ell')=1\}. \tag{B.26}$$
Applying Lemma 4.2, the first two terms, corresponding to undetected errors, can be upper-bounded as
$$\Pr\{\mathcal{V}_1,\mathcal{E}_1\} + \sum_{\ell'=2}^{L-1}\Pr\{\mathcal{A}_{\ell'-1},\mathcal{D}_{\ell'-1},\mathcal{V}_{\ell'},\mathcal{E}_{\ell'}\} \le (L-1)2^{-BJ\delta'} \tag{B.27}$$
which vanishes as $J\to\infty$ for a fixed $\delta' > 0$. Following the analysis in Appendix B.2.1, the last three terms can be written as
$$\Pr\{\mathcal{A}_{L-1},\mathcal{D}_{L-1},\mathcal{E}_L\} = (1-p_{\mathrm{fb}})^{L-1}\Pr\{\mathcal{E}_L\,|\,\mathcal{A}_{L-1},\mathcal{D}_{L-1}\}\cdot\Pr\{\mathcal{D}_1\}\prod_{\ell'=2}^{L-1}\Pr\{\mathcal{D}_{\ell'}\,|\,\mathcal{A}_{\ell'-1},\mathcal{D}_{\ell'-1}\}, \tag{B.28}$$
$$\Pr\{\mathcal{D}_1,F_t(1)=1\} = \Pr\{\mathcal{D}_1\}p_{\mathrm{fb}}, \tag{B.29}$$
$$\Pr\{\mathcal{A}_{\ell-1},\mathcal{D}_{\ell},F_t(\ell)=1\} = p_{\mathrm{fb}}(1-p_{\mathrm{fb}})^{\ell-1}\Pr\{\mathcal{D}_1\}\prod_{\ell'=2}^{\ell}\Pr\{\mathcal{D}_{\ell'}\,|\,\mathcal{A}_{\ell'-1},\mathcal{D}_{\ell'-1}\}. \tag{B.30}$$
As argued in Appendix B.2.1, the probability
$$\Pr\{\mathcal{D}_1\}\prod_{\ell'=2}^{\ell}\Pr\{\mathcal{D}_{\ell'}\,|\,\mathcal{A}_{\ell'-1},\mathcal{D}_{\ell'-1}\} \tag{B.31}$$
with imperfect feedback is identical to the probability
$$\Pr\{\mathcal{D}_\ell\} = \Pr\{\mathcal{D}_1\}\prod_{\ell'=2}^{\ell}\Pr\{\mathcal{D}_{\ell'}\,|\,\mathcal{D}_{\ell'-1}\} \tag{B.32}$$
with perfect feedback. Thus, using random coding schemes with J → ∞, apply-
ing (4.40) yields upper bounds
PrD1pfb ≤ Pr
H1,1,E1,1 ∈ Q1,1(R + δ′)
pfb, (B.33)
pfb(1− pfb)ℓ−1Pr D1
ℓ∏
ℓ′=2
Pr Dℓ′ |Aℓ′−1,Dℓ′−1
≤ pfb(1− pfb)ℓ−1 Pr
H1,ℓ,E1,ℓ ∈
ℓ⋂
ℓ′=1
Q1,ℓ′(R + δ′)
(B.34)
and applying (4.46) yields an upper bound
(1− pfb)L−1 Pr EL|AL−1,DL−1Pr D1
L−1∏
ℓ′=1
Pr Dℓ′|Aℓ′−1,Dℓ′−1
≤ (1− pfb)L−1 Pr
H1,L,E1,L ∈
L⋂
ℓ′=1
Q1,ℓ′(R + δ′)
. (B.35)
By having δ′ ↓ 0 and following from the definition of the sets Q1,ℓ(R) and O1,ℓ(R),
we can see that (B.33), (B.34) and (B.35) tend to be similar with (B.10), (B.21)
and (B.23), respectively. This implies that the converse and the achievability are
tight for a sufficiently small δ′. It follows that the ARQ diversity (4.55) is given
by the slowest decaying exponent of the RHS of (B.25), i.e.,
darq = min
dicsir(1) + dfb, dicsir(2) + dfb, . . . , dicsir(L− 1) + dfb, dicsir(L)
(B.36)
= min(
dicsir(L), dicsir(1) + dfb
)
(B.37)
where the last equality is because dicsir(ℓ) is a non-decreasing function of ℓ. This
completes the proof.
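The reduction from (B.36) to (B.37) hinges only on $d_{\mathrm{icsir}}(\ell)$ being non-decreasing in $\ell$. A small sketch with hypothetical exponent values:

```python
def arq_diversity(d_icsir, d_fb):
    """d_arq of (B.36): minimum over the per-round decay exponents.
    d_icsir lists [d_icsir(1), ..., d_icsir(L)]."""
    candidates = [d + d_fb for d in d_icsir[:-1]] + [d_icsir[-1]]
    return min(candidates)

# With a non-decreasing sequence, (B.37) reduces the minimum to two terms:
d = [2.0, 3.5, 5.0, 6.0]   # hypothetical d_icsir(1..4)
assert arq_diversity(d, 1.0) == min(d[-1], d[0] + 1.0)
```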
B.3 Proof of Proposition 4.1
We evaluate $d^u_{\mathrm{icsir}}(\ell)$, the generalised outage diversity at round $\ell$ with uniform power allocation, using the same change of random variables as in Appendix A, i.e., $\alpha_{\ell',b,r,t} \triangleq -\log|h_{\ell',b,r,t}|^2/\log P$ and $\theta_{\ell',b,r,t} \triangleq -\log|e_{\ell',b,r,t}|^2/\log P$. We denote by $A_{1,\ell},\Theta_{1,\ell}\in\mathbb{R}^{\ell Bn_r\times n_t}$ the matrices with entries $\alpha_{\ell',b,r,t}$ and $\theta_{\ell',b,r,t}$, respectively. It follows from Lemma A.1 that
$$d^u_{\mathrm{icsir}}(\ell) = \inf_{A_{1,\ell},\Theta_{1,\ell}\in\bigcap_{\ell'=1}^{\ell}\mathcal{O}_{1,\ell'}(R),\ A_{1,\ell}\succeq\mathbf{0},\ \Theta_{1,\ell}\succeq d_e\times\mathbf{1}}\left\{\left(1+\frac{\tau}{2}\right)\sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{\ell',b,r,t} + \sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}(\theta_{\ell',b,r,t}-d_e)\right\}. \tag{B.38}$$
For both Gaussian and discrete alphabets, an exact characterisation of $\mathcal{O}_{1,\ell}(R)$ is difficult to obtain. We shall use the bounding techniques developed in Appendix A to characterise $d^u_{\mathrm{icsir}}(\ell)$.

We infer from Appendix A that it suffices to solve the SNR-exponent for discrete inputs with alphabet size $|\mathcal{X}| = 2^M$. The proof for Gaussian inputs with constant $R$, independent of the SNR (so that the multiplexing gain tends to zero), follows along the same lines as the proof for discrete inputs with a sufficiently large alphabet size such that $M \ge BR$. Thus, for the remaining part of this appendix, we shall focus on $\mathcal{O}_{1,\ell}(R)$ for discrete inputs.
B.3.0.1 GMI Upper Bound
Using the GMI upper bound in Appendix A.2, we have an upper bound $\bar{d}^u_{\mathrm{icsir}}(\ell) \ge d^u_{\mathrm{icsir}}(\ell)$, where
$$\bar{d}^u_{\mathrm{icsir}}(\ell) = \inf_{A_{1,\ell},\Theta_{1,\ell}\in\bigcap_{\ell'=1}^{\ell}\bar{\mathcal{O}}_{1,\ell'}(R),\ A_{1,\ell}\succeq\mathbf{0},\ \Theta_{1,\ell}\succeq d_e\times\mathbf{1}}\left\{\left(1+\frac{\tau}{2}\right)\sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{\ell',b,r,t} + \sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}(\theta_{\ell',b,r,t}-d_e)\right\} \tag{B.39}$$
and $\bar{\mathcal{O}}_{1,\ell'}(R)$ is defined like (4.42) but with the accumulated GMI at round $\ell'$ replaced by its corresponding upper bound obtained using Proposition 2.3. Following the analysis in Appendix A.2, we have for discrete inputs with alphabet size $|\mathcal{X}| = 2^M$ that
$$\bar{\mathcal{O}}_{1,\ell}(R) = \left\{A,\Theta\in\mathbb{R}^{\ell Bn_r\times n_t} : \sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\bar{\kappa}_{\ell',b} < \frac{BR}{M}\right\},\quad \ell = 1,\ldots,L \tag{B.40}$$
where
$$\bar{\kappa}_{\ell',b} \triangleq \left|\bar{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b}\right|, \tag{B.41}$$
$$\bar{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b} \triangleq \bigcup_{r=1}^{n_r}\bar{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b,r}, \tag{B.42}$$
$$\bar{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b,r} \triangleq \left\{t : \left\{\alpha_{\ell',b,r,t}\le1+\epsilon \,\cap\, \alpha_{\ell',b,r,t}\le\theta_{\ell',b,r,t}+\epsilon'\right\} \cup \left\{\alpha_{\ell',b,r,t}\le1+\epsilon \,\cap\, \alpha_{\ell',b,r,t}>\theta_{\ell',b,r,t}+\epsilon' \,\cap\, \Xi_{\ell',b,r,t}\right\},\ t=1,\ldots,n_t\right\} \tag{B.43}$$
for any $\epsilon,\epsilon' > 0$, and where
$$\Xi_{\ell',b,r,t} \triangleq \left\{\phi^h_{\ell',b,r,t},\phi^e_{\ell',b,r,t}\in[0,2\pi) : \cos\left(\phi^e_{\ell',b,r,t}-\phi^h_{\ell',b,r,t}\right) > 0\right\}. \tag{B.44}$$
As argued in Appendix A, the values of $\theta_{\ell',b,r,t}$, $\ell'=1,\ldots,\ell$, for all $b,r,t$ achieving the infimum (B.39) are given by $d_e$. Substituting $\theta_{\ell',b,r,t}=d_e$ in (B.43) and following the analysis in [32], it can be shown that the infimum (B.39) is achieved with
$$\alpha_{\ell',b,r,t} = \min(1+\epsilon,\, d_e+\epsilon'),\quad\text{for all } b,r,t \text{ and } \ell'=1,\ldots,\ell-1, \tag{B.45}$$
and
$$\alpha_{\ell,b,r,t} = \begin{cases}\min(1+\epsilon,\, d_e+\epsilon'), & (b-1)n_t+t > \frac{BR}{M}\\ 0, & \text{otherwise}\end{cases} \tag{B.46}$$
for all $r$. Thus, by letting $\epsilon,\epsilon'\downarrow0$, we have that
$$\bar{d}^u_{\mathrm{icsir}}(\ell) = \min(1,d_e)\times\left(1+\frac{\tau}{2}\right)n_r\left(1+\left\lfloor \ell B\left(n_t-\frac{R}{\ell M}\right)\right\rfloor\right) \tag{B.47}$$
$$= \min(1,d_e)\times d_{\mathrm{ucsir}}(\ell). \tag{B.48}$$
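The closed form (B.47) is easy to tabulate. The sketch below evaluates it for hypothetical parameters and checks that accumulating ARQ rounds can only increase the diversity:

```python
import math

def d_uicsir(ell, B, nt, nr, R, M, de, tau):
    """Generalised outage diversity (B.47) at round ell, uniform power."""
    return (min(1.0, de) * (1 + tau / 2) * nr
            * (1 + math.floor(ell * B * (nt - R / (ell * M)))))

# Hypothetical parameters; the exponent is non-decreasing in the round index
vals = [d_uicsir(l, B=2, nt=2, nr=2, R=2.0, M=2, de=0.6, tau=0.0) for l in (1, 2, 3)]
assert vals == sorted(vals)
```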
B.3.0.2 GMI Lower Bound
Using the GMI lower bound in Appendix A.2, we have a lower bound $\underline{d}^u_{\mathrm{icsir}}(\ell) \le d^u_{\mathrm{icsir}}(\ell)$, where
$$\underline{d}^u_{\mathrm{icsir}}(\ell) = \inf_{A_{1,\ell},\Theta_{1,\ell}\in\bigcap_{\ell'=1}^{\ell}\underline{\mathcal{O}}_{1,\ell'}(R),\ A_{1,\ell}\succeq\mathbf{0},\ \Theta_{1,\ell}\succeq d_e\times\mathbf{1}}\left\{\left(1+\frac{\tau}{2}\right)\sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{\ell',b,r,t} + \sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}(\theta_{\ell',b,r,t}-d_e)\right\} \tag{B.49}$$
and $\underline{\mathcal{O}}_{1,\ell'}(R)$ is defined like (4.42) but with the accumulated GMI at round $\ell'$ replaced by its corresponding lower bound. We have for discrete inputs with alphabet size $|\mathcal{X}| = 2^M$ that
$$\underline{\mathcal{O}}_{1,\ell}(R) = \left\{A,\Theta\in\mathbb{R}^{\ell Bn_r\times n_t} : \sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\underline{\kappa}_{\ell',b} < \frac{BR}{M}\right\},\quad \ell = 1,\ldots,L \tag{B.50}$$
where
$$\underline{\kappa}_{\ell',b} \triangleq \left|\underline{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b}\right|, \tag{B.51}$$
$$\underline{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b} \triangleq \bigcup_{r=1}^{n_r}\underline{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b,r}, \tag{B.52}$$
$$\underline{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b,r} \triangleq \left\{t : \alpha_{\ell',b,r,t}\le1-\epsilon \,\cap\, \alpha_{\ell',b,r,t}\le\theta_{\ell',\min}-\epsilon',\ t=1,\ldots,n_t\right\}, \tag{B.53}$$
$$\theta_{\ell',\min} \triangleq \min\{\theta_{1,1,1,1},\ldots,\theta_{\ell',B,n_r,n_t}\} \tag{B.54}$$
for any $\epsilon,\epsilon' > 0$.
Using the same arguments as in Appendix A, the solutions of $\theta_{\ell',\min}$ and $\theta_{\ell',b,r,t}$ achieving the infimum (B.49) are all given by $d_e$. Thus, following the analysis in [32], the values of $\alpha_{\ell',b,r,t}$ achieving the infimum (B.49) are given by
$$\alpha_{\ell',b,r,t} = \min(1-\epsilon,\, d_e-\epsilon'),\quad\text{for all } b,r,t \text{ and } \ell'=1,\ldots,\ell-1 \tag{B.55}$$
and
$$\alpha_{\ell,b,r,t} = \begin{cases}\min(1-\epsilon,\, d_e-\epsilon'), & (b-1)n_t+t > \frac{BR}{M}\\ 0, & \text{otherwise}\end{cases} \tag{B.56}$$
for all $r$. Thus, by letting $\epsilon,\epsilon'\downarrow0$, we have that
$$\underline{d}^u_{\mathrm{icsir}}(\ell) = \min(1,d_e)\times\left(1+\frac{\tau}{2}\right)n_r\left(1+\left\lfloor \ell B\left(n_t-\frac{R}{\ell M}\right)\right\rfloor\right) \tag{B.57}$$
$$= \min(1,d_e)\times d_{\mathrm{ucsir}}(\ell) \tag{B.58}$$
which is identical to the upper bound (B.48). It follows that
$$\underline{d}^u_{\mathrm{icsir}}(\ell) = d^u_{\mathrm{icsir}}(\ell) = \bar{d}^u_{\mathrm{icsir}}(\ell) \tag{B.59}$$
which completes the proof.
B.4 Proof of Proposition 4.2
We evaluate $d^p_{\mathrm{icsir}}(\ell)$, the generalised outage diversity at round $\ell$ with power control, using the same change of random variables as in Appendix B.3. Recall the power constraint in (4.63):
$$\frac{P_1}{L} + \frac{1}{L}\sum_{\ell=2}^{L}\Pr\{F_t(\ell-1)=0\}P_\ell \le P. \tag{B.60}$$
We can see that
$$\Pr\{F_t(\ell-1)=0\} = \Pr\{F_t(1)=0,\ldots,F_t(\ell-1)=0\} \tag{B.61}$$
since a negative ACK at round $\ell$ (at the transmitter) is only possible if the ACKs at all previous rounds (at the transmitter) are also negative.
To derive $\Pr\{F_t(\ell)=0\}$, we first consider $\Pr\{F_r(\ell)=0\}$ and $\Pr\{F_r(\ell)=1\}$. Note that the event $F_r(\ell)=0$ occurs if a decoding error occurs at round $\ell$. Thus, we can write
$$\Pr\{F_r(\ell)=0\} = \Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_\ell,\mathcal{A}_{\ell-1}\} \doteq P^{-d^p_{\mathrm{icsir}}(\ell)} \tag{B.62}$$
where the dot equality follows from the argument in (B.22) and by noting that we have power control. The event $F_r(\ell)=1$ occurs if
• the correct message is obtained at round $\ell$, or
• the correct message was obtained at a previous round $\ell'=1,\ldots,\ell-1$, but retransmission still occurred as a result of $F_t(\ell')=0$.
We thus have
$$\Pr\{F_r(\ell)=1\} = \Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell-1},\mathcal{E}^c_\ell,\mathcal{A}_{\ell-1}\} + \Pr\{\mathcal{E}^c_1,\mathcal{A}_{\ell-1}\} + \sum_{\ell'=2}^{\ell-1}\Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell'-1},\mathcal{E}^c_{\ell'},\mathcal{A}_{\ell-1}\}. \tag{B.63}$$
Under imperfect feedback and CSIR, we have no prior guarantee that power control always reduces the probability of decoding error from round to round. Thus, to facilitate the analysis, we define the set $\mathcal{L}$ as
$$\mathcal{L} = \left\{\ell : \Pr\{\mathcal{E}^c_\ell\,|\,\mathcal{E}_1,\ldots,\mathcal{E}_{\ell-1},\mathcal{A}_{\ell-1}\}=0,\ \ell=1,\ldots,L\right\}, \tag{B.64}$$
namely the set of round indices for which correct decoding cannot be obtained. The first term in (B.63) can be evaluated as
$$\Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell-1},\mathcal{E}^c_\ell,\mathcal{A}_{\ell-1}\} = \Pr\{\mathcal{E}^c_\ell\,|\,\mathcal{E}_1,\ldots,\mathcal{E}_{\ell-1},\mathcal{A}_{\ell-1}\}\Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell-1},\mathcal{A}_{\ell-1}\} \tag{B.65}$$
$$= \left(1-\Pr\{\mathcal{E}_\ell\,|\,\mathcal{E}_1,\ldots,\mathcal{E}_{\ell-1},\mathcal{A}_{\ell-1}\}\right)\Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell-1},\mathcal{A}_{\ell-1}\}. \tag{B.66}$$
For $\ell\in\mathcal{L}$, this term vanishes. For $\ell\in\mathcal{L}^c$, it can be evaluated as
$$\left(1-\Pr\{\mathcal{E}_\ell\,|\,\mathcal{E}_1,\ldots,\mathcal{E}_{\ell-1},\mathcal{A}_{\ell-1}\}\right)\Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell-1},\mathcal{A}_{\ell-1}\} = \Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell-1},\mathcal{A}_{\ell-1}\} - \Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_\ell,\mathcal{A}_{\ell-1}\} \tag{B.67}$$
$$= (1-p_{\mathrm{fb}})\Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell-1},\mathcal{A}_{\ell-2}\} - \Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_\ell,\mathcal{A}_{\ell-1}\} \tag{B.68}$$
$$\doteq P^{-d^p_{\mathrm{icsir}}(\ell-1)} - P^{-d^p_{\mathrm{icsir}}(\ell)} \tag{B.69}$$
$$\doteq P^{-d^p_{\mathrm{icsir}}(\ell-1)}. \tag{B.70}$$
Here the first dot equality follows from the same argument as (B.22), and the second follows since $\ell\in\mathcal{L}^c$. The second term in (B.63) can be evaluated as
$$\Pr\{\mathcal{E}^c_1,\mathcal{A}_{\ell-1}\} = \Pr\{\mathcal{A}_{\ell-1}\,|\,\mathcal{E}^c_1,\mathcal{A}_1\}\Pr\{\mathcal{A}_1\,|\,\mathcal{E}^c_1\}\Pr\{\mathcal{E}^c_1\} \tag{B.71}$$
$$= p_{\mathrm{fb}}^{\ell-1}\left(1-\Pr\{\mathcal{E}_1\}\right) \tag{B.72}$$
$$\doteq P^{-(\ell-1)d_{\mathrm{fb}}} - P^{-(\ell-1)d_{\mathrm{fb}}-d^p_{\mathrm{icsir}}(1)} \tag{B.73}$$
$$\doteq P^{-(\ell-1)d_{\mathrm{fb}}}. \tag{B.74}$$
Here the first dot equality follows from the same argument as (B.11), and the second holds because, for the coding rate assumed in Section 4.2, the optimal power control yields $d^p_{\mathrm{icsir}}(1) \ge d^u_{\mathrm{icsir}}(1) > 0$. For the third term, we have that
$$\Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell'-1},\mathcal{E}^c_{\ell'},\mathcal{A}_{\ell-1}\} = \Pr\{\mathcal{A}_{\ell-1}\,|\,\mathcal{E}_1,\ldots,\mathcal{E}_{\ell'-1},\mathcal{E}^c_{\ell'},\mathcal{A}_{\ell'}\}\Pr\{\mathcal{A}_{\ell'}\,|\,\mathcal{E}_1,\ldots,\mathcal{E}_{\ell'-1},\mathcal{E}^c_{\ell'},\mathcal{A}_{\ell'-1}\}\cdot\Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell'-1},\mathcal{E}^c_{\ell'},\mathcal{A}_{\ell'-1}\} \tag{B.75}$$
$$= p_{\mathrm{fb}}^{\ell-\ell'}\cdot\Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell'-1},\mathcal{E}^c_{\ell'},\mathcal{A}_{\ell'-1}\}. \tag{B.76}$$
Similarly to (B.66), if $\ell'\in\mathcal{L}$, then $\Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell'-1},\mathcal{E}^c_{\ell'},\mathcal{A}_{\ell'-1}\}=0$; if $\ell'\in\mathcal{L}^c$, we have that
$$\Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell'-1},\mathcal{E}^c_{\ell'},\mathcal{A}_{\ell'-1}\} \doteq P^{-d^p_{\mathrm{icsir}}(\ell'-1)} \tag{B.77}$$
which implies that
$$\Pr\{\mathcal{E}_1,\ldots,\mathcal{E}_{\ell'-1},\mathcal{E}^c_{\ell'},\mathcal{A}_{\ell-1}\} \doteq P^{-(\ell-\ell')d_{\mathrm{fb}}-d^p_{\mathrm{icsir}}(\ell'-1)}. \tag{B.78}$$
We then evaluate $\Pr\{F_t(\ell)=0\}$ as follows:
$$\Pr\{F_t(\ell)=0\} = \Pr\{F_t(\ell)=0\,|\,F_r(\ell)=0\}\Pr\{F_r(\ell)=0\} + \Pr\{F_t(\ell)=0\,|\,F_r(\ell)=1\}\Pr\{F_r(\ell)=1\} \tag{B.79}$$
$$= (1-p_{\mathrm{fb}})\Pr\{F_r(\ell)=0\} + p_{\mathrm{fb}}\Pr\{F_r(\ell)=1\} \tag{B.80}$$
$$\doteq P^{-\min(d^p_{\mathrm{icsir}}(\ell),\, d(\ell))} \tag{B.81}$$
where
$$d(\ell) = \min\left(\ell d_{\mathrm{fb}},\ \min_{\substack{\ell'=1,\ldots,\ell-1\\ \ell'\in\mathcal{L}^c}}\left\{\ell' d_{\mathrm{fb}} + d^p_{\mathrm{icsir}}(\ell-\ell')\right\}\right) \tag{B.82}$$
and where $d^p_{\mathrm{icsir}}(0)\triangleq0$ and $d(0)\triangleq0$ by definition. As we shall see later, $\mathcal{L}$ is an empty set since $d^p_{\mathrm{icsir}}(\ell)$ is an increasing function of $\ell$.

In the following, we characterise $d^p_{\mathrm{icsir}}(\ell)$ using the bounding techniques developed in Appendix A. As pointed out in Appendix B.3, it suffices to consider discrete inputs only.
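The feedback-limited exponent $d(\ell)$ of (B.82) can be computed directly once the sequence $d^p_{\mathrm{icsir}}(\cdot)$ is known. In the sketch below, $\mathcal{L}$ is taken to be empty (all rounds decodable), consistent with the remark above, and the exponent values are hypothetical:

```python
def d_ell(ell, d_fb, dp):
    """d(ell) of (B.82) with the set L empty; dp[k] = d^p_icsir(k), dp[0] = 0."""
    if ell == 0:
        return 0.0
    return min([ell * d_fb] + [lp * d_fb + dp[ell - lp] for lp in range(1, ell)])

dp = [0.0, 2.0, 5.0, 9.0]            # hypothetical increasing d^p_icsir(0..3)
assert d_ell(1, 1.5, dp) == 1.5      # only the all-feedback-error term survives
assert d_ell(3, 1.5, dp) == min(3 * 1.5, 1.5 + dp[2], 3.0 + dp[1])
```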
B.4.1 GMI Upper Bound
To evaluate an upper bound to $d^p_{\mathrm{icsir}}(\ell)$, we shall apply the GMI upper bound in Proposition 2.3. Let
$$\bar{I}^{\mathrm{gmi}}_{\ell',b}\left(P_{\ell'},H_{\ell',b},\hat{H}_{\ell',b}\right) \ge \sup_{s>0} I^{\mathrm{gmi}}_{\ell',b}\left(P_{\ell'},H_{\ell',b},\hat{H}_{\ell',b},s\right) \tag{B.83}$$
be the resulting upper bound obtained using the techniques in Appendix A.2, and let
$$\bar{I}^{\mathrm{gmi}}_{1,\ell}(H_{1,\ell}) \triangleq \frac{1}{B}\sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\bar{I}^{\mathrm{gmi}}_{\ell',b}\left(P_{\ell'},H_{\ell',b},\hat{H}_{\ell',b}\right). \tag{B.84}$$
We then have an upper bound $\bar{d}^p_{\mathrm{icsir}}(\ell) \ge d^p_{\mathrm{icsir}}(\ell)$ satisfying
$$P^{-\bar{d}^p_{\mathrm{icsir}}(\ell)} \doteq \Pr\left\{(\mathsf{H}_{1,\ell},\mathsf{E}_{1,\ell})\in\bigcap_{\ell'=1}^{\ell}\bar{\mathcal{O}}_{1,\ell'}(R)\right\} \tag{B.85}$$
where $\bar{\mathcal{O}}_{1,\ell'}(R)$ is defined like $\mathcal{O}_{1,\ell'}(R)$ in (4.42) but with $I^{\mathrm{gmi}}_{1,\ell'}(H_{1,\ell'})$ replaced by $\bar{I}^{\mathrm{gmi}}_{1,\ell'}(H_{1,\ell'})$. We can see from Appendix A.2 that $\bar{I}^{\mathrm{gmi}}_{\ell',b}(P_{\ell'},H_{\ell',b},\hat{H}_{\ell',b})$ is a non-decreasing function of the transmit power $P_{\ell'}$ in the high-SNR regime. Note that the upper bound (B.84) holds for any possible allocation of transmit powers $P_1,\ldots,P_\ell$.
Let $P^\star_1,\ldots,P^\star_\ell$ be the optimal power allocation that minimises $P^\ell_{\mathrm{gout}}(R)$ in (4.48), and let $\bar{I}^{\mathrm{gmi}\star}_{1,\ell}(H_{1,\ell})$ be the corresponding accumulated GMI with the optimal power allocation $P^\star_1,\ldots,P^\star_\ell$. The constraint (B.60) implies that $P^\star_\ell$, $\ell=1,\ldots,L$, must satisfy
$$P^\star_\ell \le \frac{PL}{\Pr\{F_t(\ell-1)=0\}} \doteq P^{1+\min(d^p_{\mathrm{icsir}}(\ell-1),\, d(\ell-1))}. \tag{B.86}$$
Consider the upper bound $\bar{I}^{\mathrm{gmi}}_{\ell',b}(P_{\ell'},H_{\ell',b},\hat{H}_{\ell',b})$. Since it is a non-decreasing function of the transmit power $P_{\ell'}$, it follows that using
$$P_{\ell'} \doteq P^{1+\min(\bar{d}^p_{\mathrm{icsir}}(\ell'-1),\, \bar{d}'(\ell'-1))} \tag{B.87}$$
$$\dot{\ge}\ P^{1+\min(d^p_{\mathrm{icsir}}(\ell'-1),\, d(\ell'-1))},\quad \ell'=1,\ldots,\ell \tag{B.88}$$
for $\bar{I}^{\mathrm{gmi}}_{1,\ell}(H_{1,\ell})$, we have at high SNR that
$$\bar{I}^{\mathrm{gmi}}_{1,\ell}(H_{1,\ell}) \ge \bar{I}^{\mathrm{gmi}\star}_{1,\ell}(H_{1,\ell}). \tag{B.89}$$
Here $\bar{d}'(\ell)$ is defined like $d(\ell)$ in (B.82) but with $d^p_{\mathrm{icsir}}(\ell-\ell')$ replaced by $\bar{d}^p_{\mathrm{icsir}}(\ell-\ell')$.
Using $P_1,\ldots,P_\ell$ satisfying (B.87), the upper bound (B.85) is given by
$$\bar{d}^p_{\mathrm{icsir}}(\ell) = \inf_{A_{1,\ell},\Theta_{1,\ell}\in\bigcap_{\ell'=1}^{\ell}\bar{\mathcal{O}}_{1,\ell'}(R),\ A_{1,\ell}\succeq\mathbf{0},\ \Theta_{1,\ell}\succeq d_e\times\mathbf{1}}\left\{\left(1+\frac{\tau}{2}\right)\sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{\ell',b,r,t} + \sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}(\theta_{\ell',b,r,t}-d_e)\right\}. \tag{B.90}$$
Let
$$\bar{a}_\ell \triangleq \min\left(\bar{d}^p_{\mathrm{icsir}}(\ell-1),\,\bar{d}'(\ell-1)\right). \tag{B.91}$$
We then have for discrete inputs with alphabet size $|\mathcal{X}| = 2^M$ that
$$\bar{\mathcal{O}}_{1,\ell}(R) = \left\{A,\Theta\in\mathbb{R}^{\ell Bn_r\times n_t} : \sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\bar{\kappa}_{\ell',b} < \frac{BR}{M}\right\} \tag{B.92}$$
where
$$\bar{\kappa}_{\ell',b} \triangleq \left|\bar{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b}\right|, \tag{B.93}$$
$$\bar{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b} \triangleq \bigcup_{r=1}^{n_r}\bar{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b,r}, \tag{B.94}$$
$$\bar{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b,r} \triangleq \left\{t : \left\{\alpha_{\ell',b,r,t}\le1+\bar{a}_{\ell'}+\epsilon \,\cap\, \alpha_{\ell',b,r,t}\le\theta_{\ell',b,r,t}+\epsilon'\right\} \cup \left\{\alpha_{\ell',b,r,t}\le1+\bar{a}_{\ell'}+\epsilon \,\cap\, \alpha_{\ell',b,r,t}>\theta_{\ell',b,r,t}+\epsilon' \,\cap\, \Xi_{\ell',b,r,t}\right\},\ t=1,\ldots,n_t\right\} \tag{B.95}$$
for some $\epsilon,\epsilon' > 0$, and where $\Xi_{\ell',b,r,t}$ is defined as in (B.44). Similarly to the uniform power allocation, the solution for $\theta_{\ell',b,r,t}$, $\ell'=1,\ldots,\ell$, $b=1,\ldots,B$, $r=1,\ldots,n_r$, $t=1,\ldots,n_t$, achieving the infimum (B.90) is given by $d_e$. The values of $\alpha_{\ell',b,r,t}$ achieving the infimum (B.90) are given by
$$\alpha_{\ell',b,r,t} = \min\left(1+\bar{a}_{\ell'}+\epsilon,\, d_e+\epsilon'\right) \tag{B.96}$$
for all $\ell'=1,\ldots,\ell-1$ and all $b,r,t$, and
$$\alpha_{\ell,b,r,t} = \begin{cases}\min\left(1+\bar{a}_\ell+\epsilon,\, d_e+\epsilon'\right), & (b-1)n_t+t > \frac{BR}{M}\\ 0, & \text{otherwise}\end{cases} \tag{B.97}$$
for all $r$. Thus, it follows from $\bar{a}_\ell$ in (B.91), by letting $\epsilon,\epsilon'\downarrow0$, that
$$\bar{d}^p_{\mathrm{icsir}}(\ell) = \left(1+\frac{\tau}{2}\right)\min\left(1+\min\left(\bar{d}^p_{\mathrm{icsir}}(\ell-1),\,\bar{d}'(\ell-1)\right),\, d_e\right)d_{\mathrm{SB}}(R) + \left(1+\frac{\tau}{2}\right)Bn_tn_r\sum_{\ell'=1}^{\ell-1}\min\left(1+\min\left(\bar{d}^p_{\mathrm{icsir}}(\ell'-1),\,\bar{d}'(\ell'-1)\right),\, d_e\right). \tag{B.98}$$
We can see from (B.98) that $\bar{d}^p_{\mathrm{icsir}}(1) = d^u_{\mathrm{icsir}}(1)$.
B.4.2 GMI Lower Bound
Let
$$\underline{I}^{\mathrm{gmi}}_{1,\ell}(H_{1,\ell}) \le \frac{1}{B}\sum_{\ell'=1}^{\ell}\sum_{b=1}^{B} I^{\mathrm{gmi}}_{\ell',b}(P_{\ell'},H_{\ell',b},E_{\ell',b},s) \tag{B.99}$$
be the resulting lower bound to $I^{\mathrm{gmi}}_{1,\ell}(H_{1,\ell})$ obtained using the techniques in Appendix A.2. We shall denote by $\underline{d}^p_{\mathrm{icsir}}(\ell)$ the lower bound to $d^p_{\mathrm{icsir}}(\ell)$ satisfying
$$P^{-\underline{d}^p_{\mathrm{icsir}}(\ell)} \doteq \Pr\left\{(\mathsf{H}_{1,\ell},\mathsf{E}_{1,\ell})\in\bigcap_{\ell'=1}^{\ell}\underline{\mathcal{O}}_{1,\ell'}(R)\right\} \tag{B.100}$$
where $\underline{\mathcal{O}}_{1,\ell'}(R)$ is defined like $\mathcal{O}_{1,\ell'}(R)$ in (4.42) but with $I^{\mathrm{gmi}}_{1,\ell'}(H_{1,\ell'})$ replaced by $\underline{I}^{\mathrm{gmi}}_{1,\ell'}(H_{1,\ell'})$.

Note that the power allocation (B.86) violates the constraint (B.60). However, a suboptimal power allocation satisfying the same dot equality as (B.86) and the constraint (B.60) can be constructed, i.e.,
$$P_\ell = \frac{P}{\Pr\{F_t(\ell-1)=0\}} \doteq P^{1+\min(d^p_{\mathrm{icsir}}(\ell-1),\, d(\ell-1))}. \tag{B.101}$$
Since we deal with the lower bound $\underline{d}^p_{\mathrm{icsir}}(\ell)$, we shall consider another suboptimal power allocation of a similar form to (B.101), i.e.,
$$P_\ell \doteq P^{1+\min(\underline{d}^p_{\mathrm{icsir}}(\ell-1),\, d''(\ell-1))} \tag{B.102}$$
where $d''(\ell)$ is defined like (B.82) but with $d^p_{\mathrm{icsir}}(\ell-\ell')$, $\ell'=1,\ldots,\ell-1$, replaced by $\underline{d}^p_{\mathrm{icsir}}(\ell-\ell')$. We have $\underline{d}^p_{\mathrm{icsir}}(0)\triangleq0$ and $d''(0)\triangleq0$ by definition. We observe that both $\underline{d}^p_{\mathrm{icsir}}(\ell-\ell')$ and $d''(\ell)$ are non-decreasing functions of $\ell$. We shall see in the following that this allocation does not necessarily yield the same SNR-exponent as that obtained with the GMI upper bound.
Using $P_1,\ldots,P_\ell$ satisfying (B.102), a lower bound to $d^p_{\mathrm{icsir}}(\ell)$ is given by
$$\underline{d}^p_{\mathrm{icsir}}(\ell) = \inf_{A_{1,\ell},\Theta_{1,\ell}\in\bigcap_{\ell'=1}^{\ell}\underline{\mathcal{O}}_{1,\ell'}(R),\ A_{1,\ell}\succeq\mathbf{0},\ \Theta_{1,\ell}\succeq d_e\times\mathbf{1}}\left\{\left(1+\frac{\tau}{2}\right)\sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}\alpha_{\ell',b,r,t} + \sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t}(\theta_{\ell',b,r,t}-d_e)\right\}. \tag{B.103}$$
Let
$$\underline{a}_\ell = \min\left(\underline{d}^p_{\mathrm{icsir}}(\ell-1),\, d''(\ell-1)\right), \tag{B.104}$$
$$g_\ell(\ell') = \min_{l=1,\ldots,\ell}\{\theta_{l,\min}-\underline{a}_l\} + \underline{a}_{\ell'}, \tag{B.105}$$
$$\theta_{l,\min} = \min\{\theta_{l,1,1,1},\ldots,\theta_{l,B,n_r,n_t}\}. \tag{B.106}$$
It follows from (B.102) that $P_\ell \doteq P^{1+\underline{a}_\ell}$. Following the derivation of the GMI lower bound in Appendix A, we have for discrete inputs with alphabet size $2^M$ that
$$\underline{\mathcal{O}}_{1,\ell}(R) = \left\{A,\Theta\in\mathbb{R}^{\ell Bn_r\times n_t} : \sum_{\ell'=1}^{\ell}\sum_{b=1}^{B}\underline{\kappa}_{\ell',b} < \frac{BR}{M}\right\}. \tag{B.107}$$
Here, for a given $\ell = 1,\ldots,L$, we define the following for each $\ell'=1,\ldots,\ell$:
$$\underline{\kappa}_{\ell',b} \triangleq \left|\underline{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b}\right|, \tag{B.108}$$
$$\underline{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b} \triangleq \bigcup_{r=1}^{n_r}\underline{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b,r}, \tag{B.109}$$
$$\underline{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b,r} \triangleq \left\{t : \alpha_{\ell',b,r,t}\le1+\underline{a}_{\ell'}-\epsilon \,\cap\, \alpha_{\ell',b,r,t}\le g_\ell(\ell')-\epsilon',\ t=1,\ldots,n_t\right\} \tag{B.110}$$
for some $\epsilon,\epsilon' > 0$.

Following the same argument as in Appendix A, the infimum solutions for $\theta_{\ell',b,r,t}$ for all $\ell'=1,\ldots,\ell$ and all $b,r,t$ are given by $d_e$. This makes (B.105) become
$$g_\ell(\ell') = \min_{l=1,\ldots,\ell}\{d_e-\underline{a}_l\} + \underline{a}_{\ell'}. \tag{B.111}$$
We next evaluate the values of $\alpha_{\ell',b,r,t}$ achieving the infimum (B.103). Consider the constraint $\underline{\mathcal{O}}_{1,\ell}(R)$. The values of $\alpha_{\ell',b,r,t}$, $\ell'\le\ell$, that make the constraint in $\underline{\mathcal{O}}_{1,\ell}(R)$ tight are as follows.
• For $\ell'=\ell$ and all $r=1,\ldots,n_r$,
$$\alpha_{\ell,b,r,t} > \min(1+\underline{a}_\ell-\epsilon,\, g_\ell(\ell)-\epsilon'),\quad (b-1)n_t+t > \frac{BR}{M} \tag{B.112}$$
and
$$0 \le \alpha_{\ell,b,r,t} \le \min(1+\underline{a}_\ell-\epsilon,\, g_\ell(\ell)-\epsilon') \tag{B.113}$$
otherwise.
• For $\ell'<\ell$,
$$\alpha_{\ell',b,r,t} > \min(1+\underline{a}_{\ell'}-\epsilon,\, g_\ell(\ell')-\epsilon') \tag{B.114}$$
for all $b,r,t$.
Note that from the monotonicity of $\underline{a}_\ell$, we have
$$\min(1+\underline{a}_\ell-\epsilon,\, g_\ell(\ell)-\epsilon') = \min(1+\underline{a}_\ell-\epsilon,\, d_e-\epsilon'). \tag{B.115}$$
It follows that for $\ell'=\ell$, the infimum solutions for $\alpha_{\ell,b,r,t}$ are exactly determined by (B.112) and (B.113) and are given by
$$\alpha_{\ell,b,r,t} = \begin{cases}\min(1+\underline{a}_\ell-\epsilon,\, d_e-\epsilon'), & (b-1)n_t+t > \frac{BR}{M}\\ 0, & \text{otherwise.}\end{cases} \tag{B.116}$$
For $\ell'<\ell$, the infimum solutions for $\alpha_{\ell',b,r,t}$ are determined not only by (B.114) but also by the intersection $\bigcap_{\ell'=1}^{\ell}\underline{\mathcal{O}}_{1,\ell'}(R)$.

Consider $\ell'=\ell-1$. In order to satisfy the constraint $\underline{\mathcal{O}}_{1,\ell-1}(R)\cap\underline{\mathcal{O}}_{1,\ell}(R)$, we need, for all $r=1,\ldots,n_r$,
$$\alpha_{\ell-1,b,r,t} > \min(1+\underline{a}_{\ell-1}-\epsilon,\, g_{\ell-1}(\ell-1)-\epsilon') = \min(1+\underline{a}_{\ell-1}-\epsilon,\, d_e-\epsilon'),\quad (b-1)n_t+t > \frac{BR}{M}, \tag{B.117}$$
$$\alpha_{\ell-1,b,r,t} > \min(1+\underline{a}_{\ell-1}-\epsilon,\, g_\ell(\ell-1)-\epsilon'),\quad (b-1)n_t+t \le \frac{BR}{M}, \tag{B.118}$$
$$\alpha_{\ell'',b,r,t} > \min(1+\underline{a}_{\ell''}-\epsilon,\, g_{\ell-1}(\ell'')-\epsilon'),\quad \ell''\le\ell-2,\ \text{and all } b,t. \tag{B.119}$$
Thus, for $\ell'=\ell-1$, the infimum solutions for $\alpha_{\ell',b,r,t}$ that meet both $\underline{\mathcal{O}}_{1,\ell-1}(R)$ and $\underline{\mathcal{O}}_{1,\ell}(R)$ are given by
$$\alpha_{\ell-1,b,r,t} = \begin{cases}\min(1+\underline{a}_{\ell-1}-\epsilon,\, d_e-\epsilon'), & (b-1)n_t+t > \frac{BR}{M}\\ \min(1+\underline{a}_{\ell-1}-\epsilon,\, g_\ell(\ell-1)-\epsilon'), & \text{otherwise}\end{cases} \tag{B.120}$$
for all $r$. For $\ell'<\ell-1$, we can follow the same procedure by considering the extra constraints in the intersection. It is not difficult to show that the values of $\alpha_{\ell',b,r,t}$ solving the infimum are given by (B.116) for $\ell'=\ell$ and
$$\alpha_{\ell',b,r,t} = \begin{cases}\min(1+\underline{a}_{\ell'}-\epsilon,\, d_e-\epsilon'), & (b-1)n_t+t > \frac{BR}{M}\\ \min(1+\underline{a}_{\ell'}-\epsilon,\, g_{\ell'+1}(\ell')-\epsilon'), & \text{otherwise}\end{cases} \tag{B.121}$$
for $\ell'<\ell$ and all $r$. Notice that from (B.111), we have
$$g_{\ell'+1}(\ell') = \min_{l=1,\ldots,\ell'+1}\{d_e-\underline{a}_l\} + \underline{a}_{\ell'} = d_e - \underline{a}_{\ell'+1} + \underline{a}_{\ell'} \tag{B.122}$$
which follows from the monotonicity of $\underline{a}_\ell$.
Inserting all values of $\alpha_{\ell',b,r,t}$ achieving the infimum (B.103), and letting $\epsilon,\epsilon'\downarrow0$, we have the lower bound
$$\underline{d}^p_{\mathrm{icsir}}(\ell) = \left(1+\frac{\tau}{2}\right)d_{\mathrm{SB}}(R)\sum_{\ell'=1}^{\ell}\min(1+\underline{a}_{\ell'},\, d_e) + \left(1+\frac{\tau}{2}\right)\left(Bn_tn_r-d_{\mathrm{SB}}(R)\right)\sum_{\ell'=1}^{\ell-1}\min(1+\underline{a}_{\ell'},\, g_{\ell'+1}(\ell')). \tag{B.123}$$
Since $\bar{a}_1 = \underline{a}_1 = 0$, we can see that (B.98) and (B.123) are equal if $d_e$ is sufficiently large such that, for $\ell'=1,\ldots,L-1$,
$$1+\underline{a}_{\ell'} \le g_{\ell'+1}(\ell') \tag{B.124}$$
which implies
$$d_e \ge 1+\underline{a}_{\ell'+1}. \tag{B.125}$$
So the bounds are not equal if there exists $\ell'<\ell$ such that
$$d_e < 1+\underline{a}_{\ell'+1}. \tag{B.126}$$
As $1+\underline{a}_{\ell'+1}$ is exactly the exponent of $P_{\ell'+1}$, the bounds are not tight owing to the suboptimal power allocation (B.102).

Since $\underline{a}_\ell$ is a monotonically increasing function of $\ell$, we let
$$\ell_{\min} = \min\{\ell : d_e < 1+\underline{a}_\ell,\ \ell=1,\ldots,L\}. \tag{B.127}$$
The suboptimality of the power adaptation occurs because the power exponent is larger than the CSIR-error diversity $d_e$. Hence, in order to prevent the condition (B.126), we allocate $P_{\ell'}$ for $\ell'\ge\ell_{\min}$ such that
$$P_{\ell'} \doteq P^{d_e}, \tag{B.128}$$
which yields
$$P_\ell \doteq P^{\min(1+\underline{a}_\ell,\, d_e)},\quad \ell=1,\ldots,L. \tag{B.129}$$
Replacing the power allocation (B.102) with (B.129) changes $\underline{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b,r}$ in (B.110) to
$$\underline{\mathcal{S}}^{(\epsilon,\epsilon')}_{\ell',b,r} \triangleq \left\{t : \alpha_{\ell',b,r,t}\le1+\min(\underline{a}_{\ell'},\, d_e-1)-\epsilon \,\cap\, \alpha_{\ell',b,r,t}\le g_\ell(\ell')-\epsilon',\ t=1,\ldots,n_t\right\} \tag{B.130}$$
and $g_\ell(\ell')$ in (B.111) to
$$g_\ell(\ell') = \min_{l=1,\ldots,\ell}\{d_e-\min(\underline{a}_l,\, d_e-1)\} + \min(\underline{a}_{\ell'},\, d_e-1). \tag{B.131}$$
Noting these changes and following the same steps used to prove the SNR-exponent with power allocation (B.102), it is not difficult to show that with power allocation (B.129)
$$\underline{d}^p_{\mathrm{icsir}}(\ell) = \left(1+\frac{\tau}{2}\right)\min(1+\underline{a}_\ell,\, d_e)\, d_{\mathrm{SB}}(R) + \left(1+\frac{\tau}{2}\right)Bn_tn_r\sum_{\ell'=1}^{\ell-1}\min(1+\underline{a}_{\ell'},\, d_e) \tag{B.132}$$
$$= \left(1+\frac{\tau}{2}\right)\min\left(1+\min\left(\underline{d}^p_{\mathrm{icsir}}(\ell-1),\, d''(\ell-1)\right),\, d_e\right)d_{\mathrm{SB}}(R) + \left(1+\frac{\tau}{2}\right)Bn_tn_r\sum_{\ell'=1}^{\ell-1}\min\left(1+\min\left(\underline{d}^p_{\mathrm{icsir}}(\ell'-1),\, d''(\ell'-1)\right),\, d_e\right). \tag{B.133}$$
Here the second equality follows from (B.104). This lower bound can be solved by recursion from $\underline{d}^p_{\mathrm{icsir}}(1)$ and coincides with the upper bound (B.98), since $\underline{d}^p_{\mathrm{icsir}}(0) = d''(0) = 0$ (the same as $\bar{d}^p_{\mathrm{icsir}}(0) = \bar{d}'(0) = 0$ in the upper bound). This completes the proof of the proposition.
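The recursion (B.133) can be unrolled numerically. In the sketch below, $d_{\mathrm{SB}}(R)$ is treated as a given Singleton-bound diversity value, $d''(\cdot)$ is recomputed from (B.82) at each step (with $\mathcal{L}$ empty), and all numbers are illustrative rather than taken from the dissertation.

```python
def d_picsir(L, B, nt, nr, de, tau, d_SB, d_fb):
    """Unroll the recursion (B.133) for d^p_icsir(1..L).  d''(.) is (B.82)
    evaluated with the already-computed exponents; dp[0] = 0 by definition."""
    def d2(m, dp):                       # d''(m) per (B.82), set L empty
        if m == 0:
            return 0.0
        return min([m * d_fb] + [k * d_fb + dp[m - k] for k in range(1, m)])

    dp = [0.0]
    for ell in range(1, L + 1):
        head = (1 + tau / 2) * min(1 + min(dp[ell - 1], d2(ell - 1, dp)), de) * d_SB
        tail = sum(min(1 + min(dp[lp - 1], d2(lp - 1, dp)), de)
                   for lp in range(1, ell))
        dp.append(head + (1 + tau / 2) * B * nt * nr * tail)
    return dp[1:]

# Illustrative run (hypothetical numbers): the exponent grows with the round,
# and round 1 reduces to the uniform-power value min(1, de)*(1 + tau/2)*d_SB
vals = d_picsir(L=3, B=2, nt=2, nr=2, de=2.0, tau=0.0, d_SB=4.0, d_fb=1.0)
assert vals == sorted(vals)
assert vals[0] == min(1.0, 2.0) * 4.0
```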
Appendix C
Magnitude-Squared Notation | Phase Notation
Matrix | Entry $(r,t)$ | Matrix | Entry $(r,t)$
$\Gamma_b$ | $\gamma_{b,r,t} \triangleq |h_{b,r,t}|^2$ | $\Phi^H_b$ | $\phi^h_{b,r,t} \triangleq \angle h_{b,r,t}$
$\Xi_b$ | $\xi_{b,r,t} \triangleq |e_{b,r,t}|^2$ | $\Phi^E_b$ | $\phi^e_{b,r,t} \triangleq \angle e_{b,r,t}$
$\hat{\Gamma}_b$ | $\hat{\gamma}_{b,r,t} \triangleq |\hat{h}_{b,r,t}|^2$ | $\Phi^{\hat{H}}_b$ | $\phi^{\hat{h}}_{b,r,t} \triangleq \angle\hat{h}_{b,r,t}$
$\tilde{\Gamma}_b$ | $\tilde{\gamma}_{b,r,t} \triangleq |\tilde{h}_{b,r,t}|^2$ | $\Phi^{\tilde{H}}_b$ | $\phi^{\tilde{h}}_{b,r,t} \triangleq \angle\tilde{h}_{b,r,t}$

Table C.1: Definition of magnitude-squared and phase variables.
Throughout this appendix, to simplify the presentation, we change the usual pdf notation of Chapter 1: for any continuous random variable $W$, we rewrite the pdf $P_W(w)$ as $p(w)$.
C.1 Preliminaries
Let $\mathsf{H}_{b,r,t}$, $\hat{\mathsf{H}}_{b,r,t}$ and $\mathsf{E}_{b,r,t}$ be the entries of $\mathsf{H}_b$, $\hat{\mathsf{H}}_b$ and $\mathsf{E}_b$ at row $r$ and column $t$, and let
$$\tilde{\mathsf{H}}_{b,r,t} \triangleq \frac{1}{\sigma_e}\hat{\mathsf{H}}_{b,r,t}. \tag{C.1}$$
It follows from (5.3) that, conditioned on $\mathsf{H}_{b,r,t} = h_{b,r,t}$, $\tilde{\mathsf{H}}_{b,r,t}$ is complex-Gaussian distributed with mean $\frac{1}{\sigma_e}h_{b,r,t}$ and unit variance.

We define the magnitude-squared and phase variables in Table C.1. Note that the random variables $\Gamma_{b,r,t}$ and $\Xi_{b,r,t}$ have the exponential pdfs
$$p(\gamma_{b,r,t}) = e^{-\gamma_{b,r,t}}, \tag{C.2}$$
$$p(\xi_{b,r,t}) = P^{d_e}e^{-P^{d_e}\xi_{b,r,t}}. \tag{C.3}$$
| Matrix | Magnitude-Squared Entry $(r,t)$ | Matrix | Normalised Entry $(r,t)$ |
|---|---|---|---|
| $\Gamma_b$ | $\gamma_{b,r,t} \triangleq |h_{b,r,t}|^2$ | $A_b$ | $\alpha_{b,r,t} \triangleq -\frac{\log \gamma_{b,r,t}}{\log P}$ |
| $\Xi_b$ | $\xi_{b,r,t} \triangleq |e_{b,r,t}|^2$ | $\Theta_b$ | $\theta_{b,r,t} \triangleq -\frac{\log \xi_{b,r,t}}{\log P}$ |
| $\hat\Gamma_b$ | $\hat\gamma_{b,r,t} \triangleq |\hat h_{b,r,t}|^2$ | $\hat A_b$ | $\hat\alpha_{b,r,t} \triangleq -\frac{\log \hat\gamma_{b,r,t}}{\log P}$ |
| $\tilde\Gamma_b$ | $\tilde\gamma_{b,r,t} \triangleq |\tilde h_{b,r,t}|^2$ | $\tilde A_b$ | $\tilde\alpha_{b,r,t} \triangleq -\frac{\log \tilde\gamma_{b,r,t}}{\log P}$ |

Table C.2: Definition of normalised magnitude-squared variables.
Conditioned on $H_{b,r,t} = h_{b,r,t}$, $\tilde\Gamma_{b,r,t}$ has the non-central chi-square pdf
$$p(\tilde\gamma_{b,r,t}|\nu) = e^{-\tilde\gamma_{b,r,t} - \nu}\, I_0\big(2\sqrt{\tilde\gamma_{b,r,t}\,\nu}\big) \qquad (C.4)$$
where $\nu = \frac{1}{\sigma_e^2}|h_{b,r,t}|^2 = \frac{1}{\sigma_e^2}\gamma_{b,r,t}$ is the non-centrality parameter and $I_0(\cdot)$ is the zeroth-order modified Bessel function of the first kind.
For the high-SNR analysis, we define the transformed variables in Table C.2. It follows from (C.2)–(C.4) that we have the following pdfs
$$p(\alpha_{b,r,t}) = \log(P)\, P^{-\alpha_{b,r,t}}\, e^{-P^{-\alpha_{b,r,t}}}, \qquad (C.5)$$
$$p(\theta_{b,r,t}) = \log(P)\, P^{d_e-\theta_{b,r,t}}\, e^{-P^{d_e-\theta_{b,r,t}}}, \qquad (C.6)$$
$$p(\tilde\alpha_{b,r,t}|\alpha_{b,r,t}) = \log(P)\, P^{-\tilde\alpha_{b,r,t}}\, e^{-P^{-\tilde\alpha_{b,r,t}} - P^{d_e-\alpha_{b,r,t}}}\, I_0\Big(2 P^{\frac{d_e-\alpha_{b,r,t}-\tilde\alpha_{b,r,t}}{2}}\Big). \qquad (C.7)$$
C.2 Power Allocation and Asymptotic Analysis
C.2.1 Power Allocation
We consider power allocation with a scaled identity matrix (5.5)
$$\mathsf{P}_b\big(\hat{\boldsymbol H}^{(n(b))}\big) = \frac{P_b\big(\hat{\boldsymbol H}^{(n(b))}\big)}{n_t}\, \boldsymbol I_{n_t}, \qquad b = 1, \ldots, B. \qquad (C.8)$$
One can show that power allocation with the constraint $\mathbb{E}\big[P_b\big(\hat{\boldsymbol H}^{(n(b))}\big)\big] \le BP$ for all $b = 1, \ldots, B$ results in an upper bound to the outage SNR-exponent; note that this violates the constraint (5.18). On the other hand, one can consider a suboptimal power allocation such that $\mathbb{E}\big[P_b\big(\hat{\boldsymbol H}^{(n(b))}\big)\big] \le P$ to obtain a lower bound to the outage SNR-exponent. Let
$$\hat\Gamma^{(n(b))} \triangleq \big[\hat\Gamma_1, \ldots, \hat\Gamma_{n(b)}\big], \qquad (C.9)$$
$$\hat A^{(n(b))} \triangleq \big[\hat A_1, \ldots, \hat A_{n(b)}\big], \qquad (C.10)$$
$$\Phi^{\hat H^{(n(b))}} \triangleq \big[\Phi^{\hat H}_1, \ldots, \Phi^{\hat H}_{n(b)}\big]. \qquad (C.11)$$
Then, the optimal power allocation satisfies
$$\int_{\substack{\hat\Gamma^{(n(b))} \in \mathbb{R}_+^{n(b)\cdot n_r\cdot n_t},\\ \Phi^{\hat H^{(n(b))}} \in [0,2\pi)^{n(b)\cdot n_r\cdot n_t}}} P_b\big(\hat{\boldsymbol H}^{(n(b))}\big)\, p\big(\hat\Gamma^{(n(b))}\big)\, p\big(\Phi^{\hat H^{(n(b))}}\big)\, d\hat\Gamma^{(n(b))}\, d\Phi^{\hat H^{(n(b))}} \;\dot\le\; P. \qquad (C.12)$$
Let $P_b\big(\hat{\boldsymbol H}^{(n(b))}\big) \doteq P^{p_b}$. Using the transformation in Table C.2, the above constraint becomes
$$\int_{\substack{\hat A^{(n(b))} \in \mathbb{R}_+^{n(b)\cdot n_r\cdot n_t},\\ \Phi^{\hat H^{(n(b))}} \in [0,2\pi)^{n(b)\cdot n_r\cdot n_t}}} P^{\,p_b - \sum_{b'=1}^{n(b)} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \hat\alpha_{b',r,t}}\, d\hat A^{(n(b))}\, d\Phi^{\hat H^{(n(b))}} \;\dot\le\; P. \qquad (C.13)$$
Herein we have neglected the terms irrelevant to the SNR-exponent, namely the phases, as $p\big(\Phi^{\hat H^{(n(b))}}\big)$ is uniform over $[0,2\pi)^{n(b)\cdot n_r\cdot n_t}$, and the interval $\hat\alpha_{b',r,t} < 0$, as its probability decays exponentially with the SNR. Applying Varadhan's lemma [106] to (C.13) yields
$$\sup_{\substack{\hat A^{(n(b))} \in \mathbb{R}_+^{n(b)\cdot n_r\cdot n_t},\\ \Phi^{\hat H^{(n(b))}} \in [0,2\pi)^{n(b)\cdot n_r\cdot n_t}}} \left\{ p_b\big(\hat A^{(n(b))}, \Phi^{\hat H^{(n(b))}}\big) - \sum_{b'=1}^{n(b)} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \hat\alpha_{b',r,t} \right\} \le 1. \qquad (C.14)$$
The optimal power exponent $p_b$ minimises $P_{gout}(R)$. In the following, we shall consider $p_b$ that depends on the magnitudes but not on the phases, i.e., $p_b\big(\hat A^{(n(b))}\big)$. We shall observe later in Appendices C.2.3 and C.3–C.5 that this allocation does not incur any loss in terms of the SNR-exponent.
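The dot-inequalities and (C.14) rest on the Laplace-type principle that an integral of $P^{f(x)}$ is dominated by $\sup_x f(x)$ at high SNR. The following is a minimal numerical sanity check of that principle only (it is not the actual outage integral); the function name and parameters are ours.

```python
import math

# Laplace-principle sanity check behind the dot-equalities: for large P,
#   \int_0^A P^{a x} dx  ≐  P^{sup_{0<=x<=A} a x},
# so the measured log-log slope approaches max(0, a*A).
def snr_exponent(a, A, P, n=20000):
    dx = A / n
    s = sum(P ** (a * (i * dx)) for i in range(n + 1)) * dx  # Riemann sum
    return math.log(s) / math.log(P)

exp_neg = snr_exponent(-2.0, 1.0, 1e8)   # sup of -2x on [0,1] is 0
exp_pos = snr_exponent(+2.0, 1.0, 1e8)   # sup of +2x on [0,1] is 2
```

At finite $P$ the slope carries a $O(\log\log P/\log P)$ correction, which is why the thesis works with dot-equalities rather than exact equalities.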
C.2.2 Asymptotic Analysis
Let $\mathcal{O}_{\mathcal{X}}$ be the large-SNR outage set for an input alphabet $\mathcal{X}$, which can be expressed in terms of $\Gamma_b$, $\tilde\Gamma_b$, $\Phi^H_b$, $\Theta_b$ and $\Phi^E_b$, $b = 1, \ldots, B$. Then, it follows that
$$P_{gout}(R) = \Pr\Big\{ I^{gmi}\big(\boldsymbol H, \hat{\boldsymbol H}, \mathcal{P}\big) < R \Big\} \qquad (C.15)$$
$$= \int_{\mathcal{O}_{\mathcal{X}}} \prod_{b=1}^{B} p\big(\tilde\Gamma_b, \boldsymbol H_b, \boldsymbol E_b\big)\, d\tilde\Gamma_b\, d\boldsymbol H_b\, d\boldsymbol E_b \qquad (C.16)$$
$$= \int_{\mathcal{O}_{\mathcal{X}}} \prod_{b=1}^{B} p\big(\tilde\Gamma_b \big| \Gamma_b\big)\, p(\Gamma_b)\, p\big(\Phi^H_b\big)\, p\big(\Xi_b\big)\, p\big(\Phi^E_b\big)\, d\tilde\Gamma_b\, d\Gamma_b\, d\Xi_b\, d\Phi^H_b\, d\Phi^E_b. \qquad (C.17)$$
By changing the variables from $\tilde\Gamma_b$, $\Gamma_b$ and $\Xi_b$ to $\tilde A_b$, $A_b$ and $\Theta_b$, we have that
$$P_{gout}(R) \doteq \int_{\mathcal{O}_{\mathcal{X}}} \prod_{b=1}^{B} \prod_{r=1}^{n_r} \prod_{t=1}^{n_t} p(\tilde\alpha_{b,r,t}|\alpha_{b,r,t})\, p(\alpha_{b,r,t})\, p(\theta_{b,r,t})\, d\tilde\alpha_{b,r,t}\, d\alpha_{b,r,t}\, d\theta_{b,r,t}\, d\phi^h_{b,r,t}\, d\phi^e_{b,r,t} \qquad (C.18)$$
where the pdfs have been expressed in terms of the entries of the matrices. Herein the pdfs of the phases do not appear because $\Phi^H_{b,r,t}$ and $\Phi^E_{b,r,t}$ are uniformly distributed over $[0,2\pi)$ and hence do not affect the dot equality.
Now, assume that we have perfect CSIR ($d_e \uparrow \infty$). Following the same derivation as in [46], we have that
$$P_{gout}(R) \doteq \int_{\mathcal{O}_{\mathcal{X}}} \prod_{b=1}^{B}\prod_{r=1}^{n_r}\prod_{t=1}^{n_t} p(\tilde\alpha_{b,r,t}|\alpha_{b,r,t})\, p(\alpha_{b,r,t})\, d\tilde\alpha_{b,r,t}\, d\alpha_{b,r,t} \qquad (C.19)$$
$$\doteq \int_{\mathcal{O}_{\mathcal{X}}} \prod_{(b,r,t):\, -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t} - d_e < 0} \big( P^{-\alpha_{b,r,t}}\, d\alpha_{b,r,t} \big) \cdot \prod_{(b,r,t):\, \tilde\alpha_{b,r,t} \ge 0,\ \alpha_{b,r,t} \ge d_e} \big( P^{-(\tilde\alpha_{b,r,t} + \alpha_{b,r,t})}\, d\tilde\alpha_{b,r,t}\, d\alpha_{b,r,t} \big). \qquad (C.20)$$
We compare (C.18) and (C.20), and observe that the extra term in (C.18) is due to $p(\theta_{b,r,t})$. Thus, evaluating (C.18) by using the joint pdf $\prod_{b,r,t} p(\tilde\alpha_{b,r,t}|\alpha_{b,r,t})\, p(\alpha_{b,r,t})$ in (C.20) and the pdf $p(\theta_{b,r,t})$ in (C.6) yields
$$P_{gout}(R) \doteq P^{-d_{icsi}} \qquad (C.21)$$
$$\doteq \int_{\mathcal{O}_{\mathcal{X}}} \prod_{b=1}^{B}\prod_{r=1}^{n_r}\prod_{t=1}^{n_t} p(\tilde\alpha_{b,r,t}|\alpha_{b,r,t})\, p(\alpha_{b,r,t})\, p(\theta_{b,r,t})\, d\tilde\alpha_{b,r,t}\, d\alpha_{b,r,t}\, d\theta_{b,r,t}\, d\phi^h_{b,r,t}\, d\phi^e_{b,r,t} \qquad (C.22)$$
$$\doteq \int_{\mathcal{O}_{\mathcal{X}}} \prod_{(b,r,t):\, -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t} - d_e < 0} \Big( \log P \cdot e^{-P^{-(\theta_{b,r,t}-d_e)}} \cdot P^{-\alpha_{b,r,t} - (\theta_{b,r,t}-d_e)}\, d\alpha_{b,r,t}\, d\theta_{b,r,t}\, d\phi^h_{b,r,t}\, d\phi^e_{b,r,t} \Big) \times \prod_{(b,r,t):\, \tilde\alpha_{b,r,t} \ge 0,\ \alpha_{b,r,t} \ge d_e} \Big( \log P \cdot e^{-P^{-(\theta_{b,r,t}-d_e)}} \cdot P^{-\tilde\alpha_{b,r,t} - \alpha_{b,r,t} - (\theta_{b,r,t}-d_e)}\, d\tilde\alpha_{b,r,t}\, d\alpha_{b,r,t}\, d\theta_{b,r,t}\, d\phi^h_{b,r,t}\, d\phi^e_{b,r,t} \Big) \qquad (C.23)$$
$$\doteq \int_{\mathcal{O}_{\mathcal{X}}} \prod_{\substack{(b,r,t):\, -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t} - d_e < 0,\\ \theta_{b,r,t} \ge d_e}} \Big( P^{-\alpha_{b,r,t} - (\theta_{b,r,t}-d_e)}\, d\alpha_{b,r,t}\, d\theta_{b,r,t}\, d\phi^h_{b,r,t}\, d\phi^e_{b,r,t} \Big) \times \prod_{\substack{(b,r,t):\, \tilde\alpha_{b,r,t} \ge 0,\ \alpha_{b,r,t} \ge d_e,\\ \theta_{b,r,t} \ge d_e}} \Big( P^{-\tilde\alpha_{b,r,t} - \alpha_{b,r,t} - (\theta_{b,r,t}-d_e)}\, d\tilde\alpha_{b,r,t}\, d\alpha_{b,r,t}\, d\theta_{b,r,t}\, d\phi^h_{b,r,t}\, d\phi^e_{b,r,t} \Big) \qquad (C.24)$$
where the last dot equality follows from the proof of Lemma A.1. Applying
Varadhan’s lemma [106] to (C.24) yields
$$d_{icsi} = \inf_{\tilde A, A, \Theta \in \mathcal{O}_{\mathcal{X}}} \left\{ \sum_{\substack{(b,r,t):\, -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t}-d_e < 0,\\ \theta_{b,r,t} \ge d_e}} \Big[ \alpha_{b,r,t} + \big(\theta_{b,r,t} - d_e\big) \Big] + \sum_{\substack{(b,r,t):\, \tilde\alpha_{b,r,t} \ge 0,\ \alpha_{b,r,t} \ge d_e,\\ \theta_{b,r,t} \ge d_e}} \Big[ \tilde\alpha_{b,r,t} + \alpha_{b,r,t} + \big(\theta_{b,r,t} - d_e\big) \Big] \right\}. \qquad (C.25)$$
C.2.3 GMI Upper and Lower Bounds
We use the bounding techniques developed in Appendix A to prove the results. A GMI upper bound is obtained from Proposition 2.3. Let
$$\bar I^{gmi}_b\big(P_b, \boldsymbol H_b, \hat{\boldsymbol H}_b\big) \ge \sup_{s > 0}\, I^{gmi}_b\big(P_b, \boldsymbol H_b, \hat{\boldsymbol H}_b, s\big) \qquad (C.26)$$
be the upper bound obtained by following the derivation in Appendices A.2 and A.3. Denote the resulting GMI upper bound as
$$\bar I^{gmi}\big(\boldsymbol H, \hat{\boldsymbol H}, \mathcal{P}\big) \triangleq \frac{1}{B}\sum_{b=1}^{B} \bar I^{gmi}_b\big(P_b, \boldsymbol H_b, \hat{\boldsymbol H}_b\big). \qquad (C.27)$$
We can see from Appendices A.2 and A.3 that $\bar I^{gmi}_b\big(P_b, \boldsymbol H_b, \hat{\boldsymbol H}_b\big)$ is a non-decreasing function of the transmit power coefficient $P_b\big(\hat{\boldsymbol H}^{(n(b))}\big)$ at high SNR. It follows that using the maximum power exponent in (C.14), i.e.,
$$p_b\big(\hat A^{(n(b))}\big) = 1 + \sum_{b'=1}^{n(b)} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \hat\alpha_{b',r,t}, \qquad (C.28)$$
for the GMI upper bound yields an upper bound to the optimal outage SNR-exponent for i.i.d. inputs. In terms of $\tilde A^{(n(b))}$, the power exponent can be expressed as
$$p_b\big(\hat A^{(n(b))}\big) = 1 + n(b)\, n_t n_r\, d_e + \sum_{b'=1}^{n(b)} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \tilde\alpha_{b',r,t}. \qquad (C.29)$$
We further denote the equivalent outage set obtained with the GMI upper bound as
$$\bar{\mathcal{O}}_{\mathcal{X}} \triangleq \Big\{ \boldsymbol H, \boldsymbol E, \mathcal{P} : \bar I^{gmi}\big(\boldsymbol H, \hat{\boldsymbol H}, \mathcal{P}\big) < R \Big\}. \qquad (C.30)$$
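The step from (C.28) to (C.29) uses the relation between the estimate exponents and the scaled-estimate exponents implied by (C.1) together with $\sigma_e^2 \doteq P^{-d_e}$ (which is implicit in (C.3)), namely $\hat\alpha_{b',r,t} = \tilde\alpha_{b',r,t} + d_e$:

```latex
% From \tilde\gamma = \hat\gamma/\sigma_e^2 and \sigma_e^2 \doteq P^{-d_e}:
%   \tilde\alpha = -\frac{\log\tilde\gamma}{\log P}
%                = -\frac{\log\hat\gamma}{\log P} - d_e = \hat\alpha - d_e,
% so that
p_b\big(\hat A^{(n(b))}\big)
  = 1 + \sum_{b'=1}^{n(b)}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t} \hat\alpha_{b',r,t}
  = 1 + n(b)\, n_t n_r\, d_e
      + \sum_{b'=1}^{n(b)}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t} \tilde\alpha_{b',r,t}.
```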
A GMI lower bound is obtained by substituting a suboptimal $s$ into $I^{gmi}_b\big(P_b, \boldsymbol H_b, \hat{\boldsymbol H}_b, s\big)$, i.e.,
$$s = \frac{B}{B n_r + \sum_{b=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \frac{P_b\big(\hat{\boldsymbol H}^{(n(b))}\big)}{n_t}\, |e_{b,r,t}|^2}. \qquad (C.31)$$
Let
$$\underline I^{gmi}_b\big(P_b, \boldsymbol H_b, \hat{\boldsymbol H}_b, s\big) \le I^{gmi}_b\big(P_b, \boldsymbol H_b, \hat{\boldsymbol H}_b, s\big) \qquad (C.32)$$
be the lower bound derived using the techniques in Appendices A.2 and A.3, and let
$$\underline I^{gmi}\big(\boldsymbol H, \hat{\boldsymbol H}, \mathcal{P}\big) \triangleq \frac{1}{B} \sum_{b=1}^{B} \underline I^{gmi}_b\big(P_b, \boldsymbol H_b, \hat{\boldsymbol H}_b, s\big) \qquad (C.33)$$
be the resulting GMI lower bound. Then, we denote the equivalent outage set with the GMI lower bound as
$$\underline{\mathcal{O}}_{\mathcal{X}} \triangleq \Big\{ \boldsymbol H, \boldsymbol E, \mathcal{P} : \underline I^{gmi}\big(\boldsymbol H, \hat{\boldsymbol H}, \mathcal{P}\big) < R \Big\}. \qquad (C.34)$$
Note that this choice of $s$ gives the largest outage SNR-exponent in Chapter 3 with uniform power allocation. With power control, however, it is not yet clear what effects allocating different powers to different fading blocks may have. Nevertheless, we will first use the allocation (C.29) to evaluate a lower bound to the outage SNR-exponent. We will show later that this allocation may not give tight upper and lower bounds.
C.3 Full-CSIT Power Allocation
For full CSIT, we have $n(b) = B$, $b = 1, \ldots, B$. We shall show in the following that upper and lower bounds to the optimal outage SNR-exponent derived using $p_b\big(\hat A^{(n(b))}\big)$ (i.e., (C.29) with $n(b) = B$),
$$p_b\big(\hat A^{(n(b))}\big) = 1 + B n_r n_t d_e + \sum_{b'=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \tilde\alpha_{b',r,t}, \qquad (C.35)$$
are tight. In the following, the superscript $f$ will be used to indicate full CSIT.
We infer from Appendix A that it suffices to consider solving the outage SNR-exponent for discrete inputs with alphabet size $|\mathcal{X}| = 2^M$. The proof for Gaussian inputs with constant $R$ independent of the SNR (such that the multiplexing gain tends to zero) follows along the same lines as the proof for discrete inputs with a sufficiently large alphabet size such that $M \ge BR$. Thus, for the remaining part of this appendix, we shall focus on $\mathcal{O}_{\mathcal{X}}$ for discrete inputs.
C.3.1 GMI Upper Bound
Substituting $\mathcal{O}_{\mathcal{X}}$ in (C.25) with $\bar{\mathcal{O}}_{\mathcal{X}}$ in (C.30) yields an upper bound $\bar d^f_{icsi} \ge d^f_{icsi}$:
$$\bar d^f_{icsi} = \inf_{\tilde A, A, \Theta \in \bar{\mathcal{O}}_{\mathcal{X}}} \left\{ \sum_{\substack{(b,r,t):\, -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t}-d_e < 0,\\ \theta_{b,r,t} \ge d_e}} \Big[ \alpha_{b,r,t} + \big(\theta_{b,r,t} - d_e\big) \Big] + \sum_{\substack{(b,r,t):\, \tilde\alpha_{b,r,t} \ge 0,\ \alpha_{b,r,t} \ge d_e,\\ \theta_{b,r,t} \ge d_e}} \Big[ \tilde\alpha_{b,r,t} + \alpha_{b,r,t} + \big(\theta_{b,r,t} - d_e\big) \Big] \right\}. \qquad (C.36)$$
Using $p_b\big(\hat A^{(n(b))}\big)$ in (C.35) and following the derivation in Appendix A.2, we can express the equivalent outage set with GMI upper bound (C.30) for discrete constellations of size $|\mathcal{X}| = 2^M$ as follows:
$$\bar{\mathcal{O}}_{\mathcal{X}} = \left\{ \tilde A, A, \Theta \in \mathbb{R}^{B\cdot n_r\cdot n_t} : \sum_{b=1}^{B} \bar\kappa_b < \frac{BR}{M} \right\}. \qquad (C.37)$$
Here we have defined the following:
$$\bar\kappa_b \triangleq \big| \bar{\mathcal{S}}^{(\epsilon,\delta)}_b \big|, \qquad (C.38)$$
$$\bar{\mathcal{S}}^{(\epsilon,\delta)}_b \triangleq \bigcup_{r=1}^{n_r} \bar{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}, \qquad (C.39)$$
$$\bar{\mathcal{S}}^{(\epsilon,\delta)}_{b,r} \triangleq \Big\{ t : \big\{ \alpha_{b,r,t} \le p_b\big(\hat A^{(n(b))}\big) + \epsilon \,\cap\, \alpha_{b,r,t} \le \bar\theta_{b,r,t} + \delta \big\} \cup \big\{ \alpha_{b,r,t} \le p_b\big(\hat A^{(n(b))}\big) + \epsilon \,\cap\, \alpha_{b,r,t} > \bar\theta_{b,r,t} + \delta \,\cap\, \mathcal{Q}_{b,r,t} \big\},\ t = 1, \ldots, n_t \Big\}, \qquad (C.40)$$
$$\mathcal{Q}_{b,r,t} \triangleq \Big\{ \phi^h_{b,r,t}, \phi^e_{b,r,t} \in [0,2\pi) : \cos\big(\phi^e_{b,r,t} - \phi^h_{b,r,t}\big) > 0 \Big\} \qquad (C.41)$$
for any $\epsilon, \delta > 0$. Note that increasing $\bar\theta_{b,r,t}$ increases both the objective function (C.36) and the threshold for $\alpha_{b,r,t}$ in (C.40). Hence, the infimum solutions for $\bar\theta_{b,r,t}$, $b = 1,\ldots,B$, $r = 1,\ldots,n_r$, $t = 1,\ldots,n_t$, are given by $\bar d_e$.
From (C.36), (C.37) and $\bar\theta_{b,r,t} = \bar d_e$, $b = 1,\ldots,B$, $r = 1,\ldots,n_r$, $t = 1,\ldots,n_t$, assume without loss of generality that, for $r = 1,\ldots,n_r$,
$$\alpha_{b,r,t} \ge \min\Big(p_b\big(\hat A^{(n(b))}\big) + \epsilon,\ \bar d_e + \delta\Big), \qquad (b-1)n_t + t > \frac{BR}{M} \qquad (C.42)$$
if $\mathcal{Q}_{b,r,t}$ does not occur, and
$$\alpha_{b,r,t} \ge p_b\big(\hat A^{(n(b))}\big) + \epsilon, \qquad (b-1)n_t + t > \frac{BR}{M} \qquad (C.43)$$
if $\mathcal{Q}_{b,r,t}$ occurs. Since the argument on the RHS of (C.36) is increasing with $\alpha_{b,r,t}$, and since with
$$\frac{\pi}{2} \le \phi^e_{b,r,t} - \phi^h_{b,r,t} \le \frac{3\pi}{2} \qquad (C.44)$$
the event $\mathcal{Q}_{b,r,t}$ does not occur, it follows that the infimum is achieved with
$$\alpha_{b,r,t} = \begin{cases} \min\Big(p_b\big(\hat A^{(n(b))}\big) + \epsilon,\ \bar d_e + \delta\Big), & \text{if } (b-1)n_t + t > \frac{BR}{M} \\ 0, & \text{otherwise.} \end{cases} \qquad (C.45)$$
Depending on the values of $\bar d_e$ and $p_b\big(\hat A^{(n(b))}\big)$, we have the following cases.
• Case 1: $\bar d_e + \delta \le p_b\big(\hat A^{(n(b))}\big) + \epsilon$ and $\bar d_e + \delta < d_e$
The values of $\alpha_{b,r,t}$ for $(b-1)n_t + t > \frac{BR}{M}$ that achieve the infimum (C.36) are given by $\bar d_e + \delta$, and they belong to $\{(b,r,t) : -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t} - d_e < 0\}$. On the other hand, for $(b-1)n_t + t \le \frac{BR}{M}$, the infimum (C.36) is achieved with $\alpha_{b,r,t} = 0$, which also belongs to $\{(b,r,t) : -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t} - d_e < 0\}$. Thus, the solutions for $\tilde\alpha_{b,r,t}$ that achieve the infimum (C.36) are given by
$$\tilde\alpha_{b,r,t} = \begin{cases} \bar d_e + \delta - d_e, & \text{for } (b-1)n_t + t > \frac{BR}{M} \\ -d_e, & \text{otherwise.} \end{cases} \qquad (C.46)$$
This yields
$$\bar d^f_{icsi} = \big(\bar d_e + \delta\big)\, d_{SB}(R). \qquad (C.47)$$
Here we have that $p_b\big(\hat A^{(n(b))}\big) = 1 + B n_r n_t d_e + \sum_{b,r,t} \tilde\alpha_{b,r,t} = 1 + d_{SB}(R)\,\bar d_e \ge \bar d_e + \delta - \epsilon$, which satisfies the condition for Case 1 for sufficiently small $\delta - \epsilon$.
• Case 2: $\bar d_e + \delta \le p_b\big(\hat A^{(n(b))}\big) + \epsilon$ and $\bar d_e + \delta \ge d_e$
The values of $\alpha_{b,r,t}$ for $(b-1)n_t + t > \frac{BR}{M}$ that achieve the infimum (C.36) are given by $\bar d_e + \delta$, and they belong to $\{(b,r,t) : \tilde\alpha_{b,r,t} \ge 0,\ \alpha_{b,r,t} \ge d_e\}$. On the other hand, for $(b-1)n_t + t \le \frac{BR}{M}$, the infimum (C.36) is achieved with $\alpha_{b,r,t} = 0$, which belongs to $\{(b,r,t) : -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t} - d_e < 0\}$. Thus, the solutions for $\tilde\alpha_{b,r,t}$ that achieve the infimum (C.36) are given by
$$\tilde\alpha_{b,r,t} = \begin{cases} 0, & \text{for } (b-1)n_t + t > \frac{BR}{M} \\ -d_e, & \text{otherwise.} \end{cases} \qquad (C.48)$$
This yields
$$\bar d^f_{icsi} = \big(\bar d_e + \delta\big)\, d_{SB}(R). \qquad (C.49)$$
Here we have that $p_b\big(\hat A^{(n(b))}\big) = 1 + B n_r n_t d_e + \sum_{b,r,t} \tilde\alpha_{b,r,t} = 1 + d_{SB}(R)\, d_e$. Hence, the condition $\bar d_e + \delta \le p_b\big(\hat A^{(n(b))}\big) + \epsilon$ is satisfied if $\bar d_e + \delta \le 1 + d_e\, d_{SB}(R) + \epsilon$ for some sufficiently small $\delta, \epsilon > 0$.
• Case 3: $\bar d_e + \delta > p_b\big(\hat A^{(n(b))}\big) + \epsilon$
We first note that for $(b-1)n_t + t \le \frac{BR}{M}$, the values of $\alpha_{b,r,t}$ achieving the infimum are given by zero. This implies that the values of $\tilde\alpha_{b,r,t}$, $(b-1)n_t + t \le \frac{BR}{M}$, achieving the infimum are given by $-d_e$.
For $(b',r',t')$ such that $(b'-1)n_t + t' > \frac{BR}{M}$, if $(b',r',t')$ belongs to $\{(b,r,t) : -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t} - d_e < 0\}$, we have that
$$p_b\big(\hat A^{(n(b))}\big) = 1 + (B n_r n_t - 1)\, d_e + \alpha_{b',r',t'} + \sum_{(b,r,t)\ne(b',r',t')} \tilde\alpha_{b,r,t} \ge \alpha_{b',r',t'}. \qquad (C.50)$$
This implies that the constraint in (C.37) can never be met. As such, $(b',r',t')$ for $(b'-1)n_t + t' > \frac{BR}{M}$ must belong to $\{(b,r,t) : \tilde\alpha_{b,r,t} \ge 0,\ \alpha_{b,r,t} \ge d_e\}$. In that case, since $p_b\big(\hat A^{(n(b))}\big)$ increases with $\tilde\alpha_{b,r,t}$, the values of $\tilde\alpha_{b,r,t}$ that solve the infimum in (C.36) are given by
$$\tilde\alpha_{b,r,t} = \begin{cases} 0, & \text{for } (b-1)n_t + t > \frac{BR}{M} \\ -d_e, & \text{otherwise} \end{cases} \qquad (C.51)$$
and result in
$$p_b\big(\hat A^{(n(b))}\big) = 1 + d_{SB}(R)\, d_e. \qquad (C.52)$$
It follows that
$$\bar d^f_{icsi} = d_{SB}(R)\, \big(1 + d_{SB}(R)\, d_e + \epsilon\big). \qquad (C.53)$$
From Cases 1 to 3, by letting $\epsilon, \delta \downarrow 0$, we have the upper bound
$$\bar d^f_{icsi} = \begin{cases} \bar d_e\, d_{SB}(R), & \text{if } \bar d_e < 1 + d_{SB}(R)\, d_e \\ d_{SB}(R)\big(1 + d_{SB}(R)\, d_e\big), & \text{if } \bar d_e \ge 1 + d_{SB}(R)\, d_e. \end{cases} \qquad (C.54)$$
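The two-regime formula (C.54) is easy to evaluate numerically. The sketch below is illustrative only; the argument names `d_sb`, `d_e` and `d_e_bar` (for $d_{SB}(R)$, $d_e$ and $\bar d_e$) are our own labels.

```python
# Sketch of the upper bound (C.54); illustrative naming, not thesis code.
def d_ficsi_upper(d_sb, d_e, d_e_bar):
    """Evaluate (C.54) for given d_SB(R), d_e and bar{d}_e."""
    threshold = 1.0 + d_sb * d_e
    if d_e_bar < threshold:
        # Estimation-error-limited regime: exponent grows linearly in bar{d}_e.
        return d_e_bar * d_sb
    # Power-allocation-limited regime: saturates at d_SB(R)(1 + d_SB(R) d_e).
    return d_sb * threshold
```

The `min`-like switch between the two regimes is exactly the Case 1/2 versus Case 3 distinction above.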
C.3.2 GMI Lower Bound
Replacing $\mathcal{O}_{\mathcal{X}}$ in (C.25) with $\underline{\mathcal{O}}_{\mathcal{X}}$ in (C.34) yields a lower bound $\underline d^f_{icsi} \le d^f_{icsi}$:
$$\underline d^f_{icsi} = \inf_{\tilde A, A, \Theta \in \underline{\mathcal{O}}_{\mathcal{X}}} \left\{ \sum_{\substack{(b,r,t):\, -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t}-d_e < 0,\\ \theta_{b,r,t} \ge d_e}} \Big[ \alpha_{b,r,t} + \big(\theta_{b,r,t} - d_e\big) \Big] + \sum_{\substack{(b,r,t):\, \tilde\alpha_{b,r,t} \ge 0,\ \alpha_{b,r,t} \ge d_e,\\ \theta_{b,r,t} \ge d_e}} \Big[ \tilde\alpha_{b,r,t} + \alpha_{b,r,t} + \big(\theta_{b,r,t} - d_e\big) \Big] \right\}. \qquad (C.55)$$
In the following, we solve $\underline d^f_{icsi}$ using the same power exponent $p_b\big(\hat A^{(n(b))}\big)$ used to derive the upper bound (cf. (C.35)),
$$p_b\big(\hat A^{(n(b))}\big) = 1 + B n_r n_t d_e + \sum_{b'=1}^{B}\sum_{r=1}^{n_r}\sum_{t=1}^{n_t} \tilde\alpha_{b',r,t}, \qquad b = 1,\ldots,B, \qquad (C.56)$$
and show that this exponent yields $\underline d^f_{icsi} = \bar d^f_{icsi}$. Following the derivation in Appendix A.2, we first express the equivalent outage set with GMI lower bound (C.34) for discrete constellations of size $|\mathcal{X}| = 2^M$ as follows:
$$\underline{\mathcal{O}}_{\mathcal{X}} = \left\{ \tilde A, A, \Theta \in \mathbb{R}^{B\cdot n_r\cdot n_t} : \sum_{b=1}^{B} \underline\kappa_b < \frac{BR}{M} \right\}. \qquad (C.57)$$
Here we have defined the following:
$$\underline\kappa_b \triangleq \big|\underline{\mathcal{S}}^{(\epsilon,\delta)}_b\big|, \qquad (C.58)$$
$$\underline{\mathcal{S}}^{(\epsilon,\delta)}_b \triangleq \bigcup_{r=1}^{n_r} \underline{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}, \qquad (C.59)$$
$$\underline{\mathcal{S}}^{(\epsilon,\delta)}_{b,r} \triangleq \Big\{ t : \alpha_{b,r,t} < p_b\big(\hat A^{(n(b))}\big) - \epsilon \,\cap\, \alpha_{b,r,t} < \bar\theta_{\min} - \delta,\ t = 1,\ldots,n_t \Big\}, \qquad (C.60)$$
$$\bar\theta_{\min} \triangleq \min\big\{\bar\theta_{1,1,1}, \ldots, \bar\theta_{B,n_r,n_t}\big\} \qquad (C.61)$$
for any $\epsilon, \delta > 0$.
We observe from (C.55) and (C.60) that increasing $\bar\theta_{\min}$ increases both the objective function (C.55) and the threshold for $\alpha_{b,r,t}$ in (C.60). Thus, the value of $\bar\theta_{\min}$ that solves the infimum (C.55) is given by $\bar d_e$. Since for any $b, r, t$ we have $\bar\theta_{b,r,t} \ge \bar\theta_{\min}$, the values of $\bar\theta_{b,r,t}$ that solve the infimum are also given by $\bar d_e$, as they do not appear in $\underline{\mathcal{O}}_{\mathcal{X}}$.
We next compare $\underline{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}$ in (C.60) with $\bar{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}$ in (C.40). There are two main differences between $\underline{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}$ and $\bar{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}$. Firstly, in the set $\underline{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}$, we have $\bar\theta_{\min}$ as the threshold for $\alpha_{b,r,t}$ instead of $\bar\theta_{b,r,t}$ in $\bar{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}$. However, since the value of $\bar\theta_{\min}$ that solves the infimum (C.55) is also given by $\bar d_e$ (the same as the value of $\bar\theta_{b,r,t}$ that solves the infimum (C.36)), this difference does not change the resulting infima in (C.36) and (C.55). Secondly, we have an extra term in the set $\bar{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}$ that depends on the phases $\phi^h_{b,r,t}$ and $\phi^e_{b,r,t}$:
$$\big\{ \alpha_{b,r,t} \le p_b\big(\hat A^{(n(b))}\big) + \epsilon \,\cap\, \alpha_{b,r,t} > \bar\theta_{b,r,t} + \delta \,\cap\, \mathcal{Q}_{b,r,t} \big\} \qquad (C.62)$$
where
$$\mathcal{Q}_{b,r,t} = \Big\{ \phi^h_{b,r,t}, \phi^e_{b,r,t} \in [0,2\pi) : \cos\big(\phi^e_{b,r,t} - \phi^h_{b,r,t}\big) > 0 \Big\}. \qquad (C.63)$$
The infimum solution in (C.36) is obtained when the event $\mathcal{Q}_{b,r,t}$ does not occur. It follows that, since the infimum solutions for both $\bar\theta_{\min}$ and $\bar\theta_{b,r,t}$ are identical and the set (C.62) is not active in solving the infimum (C.36), the result for the infimum (C.55) has a similar form to that for (C.36), i.e.,
$$\underline d^f_{icsi} = \begin{cases} \big(\bar d_e - \delta\big)\, d_{SB}(R), & \text{if } \bar d_e - \epsilon < 1 + d_{SB}(R)\, d_e - \delta \\ \big(1 + d_{SB}(R)\, d_e - \epsilon\big)\, d_{SB}(R), & \text{if } \bar d_e - \epsilon \ge 1 + d_{SB}(R)\, d_e - \delta. \end{cases} \qquad (C.64)$$
By letting $\epsilon, \delta \downarrow 0$, combining (C.54) with (C.64) completes the proof.
C.4 Causal-CSIT Power Allocation
For causal CSIT, we have $n(b) = b - \tau_d$. The exponent of the optimal power allocation must satisfy (C.14). We shall first use the maximum power exponent satisfying the constraint, i.e.,
$$p_b\big(\hat A^{(n(b))}\big) = 1 + \sum_{b'=1}^{b-\tau_d} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \big(\tilde\alpha_{b',r,t} + d_e\big), \qquad (C.65)$$
for both the GMI upper and lower bounds. For the GMI upper bound, this gives an upper bound to the optimal outage SNR-exponent, as argued in Appendix C.2.3. For the GMI lower bound, however, this may not yield a tight result for the optimal outage SNR-exponent. Nevertheless, it provides guidance on the structure of the power exponent that yields a tight lower bound.
Similarly to Appendix C.3, we note that it suffices to consider solving the outage SNR-exponent for discrete inputs with alphabet size $|\mathcal{X}| = 2^M$. In the following, the superscript $c$ will be used to indicate causal CSIT.
C.4.1 GMI Upper Bound
An upper bound to the outage SNR-exponent with causal CSIT, $\bar d^c_{icsi}$, has an expression equivalent to the one with full CSIT (C.36), i.e.,
$$\bar d^c_{icsi} = \inf_{\tilde A, A, \Theta \in \bar{\mathcal{O}}_{\mathcal{X}}} \left\{ \sum_{\substack{(b,r,t):\, -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t}-d_e < 0,\\ \theta_{b,r,t} \ge d_e}} \Big[ \alpha_{b,r,t} + \big(\theta_{b,r,t} - d_e\big) \Big] + \sum_{\substack{(b,r,t):\, \tilde\alpha_{b,r,t} \ge 0,\ \alpha_{b,r,t} \ge d_e,\\ \theta_{b,r,t} \ge d_e}} \Big[ \tilde\alpha_{b,r,t} + \alpha_{b,r,t} + \big(\theta_{b,r,t} - d_e\big) \Big] \right\}. \qquad (C.66)$$
Here $\bar{\mathcal{O}}_{\mathcal{X}}$ has a similar form to that in the full-CSIT case except for $p_b\big(\hat A^{(n(b))}\big)$, which is now given in (C.65). We can write $\bar{\mathcal{O}}_{\mathcal{X}}$ as
$$\bar{\mathcal{O}}_{\mathcal{X}} = \left\{ \tilde A, A, \Theta \in \mathbb{R}^{B\cdot n_r\cdot n_t} : \sum_{b=1}^{B} \bar\kappa_b < \frac{BR}{M} \right\} \qquad (C.67)$$
where we have defined the following:
$$\bar\kappa_b \triangleq \big|\bar{\mathcal{S}}^{(\epsilon,\delta)}_b\big|, \qquad (C.68)$$
$$\bar{\mathcal{S}}^{(\epsilon,\delta)}_b \triangleq \bigcup_{r=1}^{n_r} \bar{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}, \qquad (C.69)$$
$$\bar{\mathcal{S}}^{(\epsilon,\delta)}_{b,r} \triangleq \Big\{ t : \big\{\alpha_{b,r,t} \le p_b\big(\hat A^{(n(b))}\big)+\epsilon \,\cap\, \alpha_{b,r,t} \le \theta_{b,r,t}+\delta\big\} \cup \big\{\alpha_{b,r,t} \le p_b\big(\hat A^{(n(b))}\big)+\epsilon \,\cap\, \alpha_{b,r,t} > \theta_{b,r,t}+\delta \,\cap\, \mathcal{Q}_{b,r,t}\big\},\ t=1,\ldots,n_t \Big\}, \qquad (C.70)$$
$$\mathcal{Q}_{b,r,t} \triangleq \Big\{ \phi^h_{b,r,t}, \phi^e_{b,r,t} \in [0,2\pi) : \cos\big(\phi^e_{b,r,t}-\phi^h_{b,r,t}\big) > 0 \Big\} \qquad (C.71)$$
for any $\epsilon, \delta > 0$.
We first define
$$d^\ddagger \triangleq B n_t - \left\lceil \frac{BR}{M} \right\rceil + 1. \qquad (C.72)$$
Following the same argument as used in Appendix C.3.1, the infimum solutions for $\theta_{b,r,t}$, for all $b, r, t$ in (C.66), are given by $d_e$. For each $r = 1,\ldots,n_r$, assume without loss of generality the following conditions, which make the constraint in $\bar{\mathcal{O}}_{\mathcal{X}}$ tight:
$$\alpha_{b,r,t} > \min\Big(p_b\big(\hat A^{(n(b))}\big)+\epsilon,\ d_e+\delta\Big), \quad \big(\phi^h_{b,r,t},\phi^e_{b,r,t}\big) \notin \mathcal{Q}_{b,r,t}, \quad (b-1)n_t+t \le d^\ddagger, \qquad (C.73)$$
$$\alpha_{b,r,t} \le \min\Big(p_b\big(\hat A^{(n(b))}\big)+\epsilon,\ d_e+\delta\Big), \quad (b-1)n_t+t > d^\ddagger. \qquad (C.74)$$
Then, the infimum (C.66) is achieved with $\alpha_{b,r,t}$ equal to
$$\vartheta_{b,r,t} = \begin{cases} \min\Big(p_b\big(\hat A^{(n(b))}\big)+\epsilon,\ d_e+\delta\Big), & \text{for } (b-1)n_t+t \le d^\ddagger \\ 0, & \text{for } (b-1)n_t+t > d^\ddagger. \end{cases} \qquad (C.75)$$
Note that for $b = 1, \ldots, \tau_d$, we have $p_b\big(\hat A^{(n(b))}\big) = 1$.
The exponent $p_b\big(\hat A^{(n(b))}\big)$ sets a threshold for $\alpha_{b,r,t}$ in (C.70) (a deep-fading threshold). Since increasing $\tilde\alpha_{b',r,t}$, $b' = 1, \ldots, b-\tau_d$, increases both $p_b\big(\hat A^{(n(b))}\big)$ and the objective function in (C.66), it follows that the solutions for $\tilde\alpha_{b,r,t}$ that attain the infimum (C.66) are given by
$$\tilde\alpha_{b,r,t} = \begin{cases} \vartheta_{b,r,t} - d_e, & \text{if } \vartheta_{b,r,t} < d_e \\ 0, & \text{if } \vartheta_{b,r,t} \ge d_e \end{cases} \qquad (C.76)$$
which can also be written as
$$\tilde\alpha_{b,r,t} = \min\big(\vartheta_{b,r,t} - d_e,\ 0\big). \qquad (C.77)$$
Using this $\tilde\alpha_{b,r,t}$, we have that for $b = \tau_d+1, \ldots, B$,
$$p_b\big(\hat A^{(n(b))}\big) = 1 + \sum_{b'=1}^{b-\tau_d} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \min\big(\vartheta_{b',r,t},\ d_e\big). \qquad (C.78)$$
Let
$$\bar b = \max_{b\,:\, b n_t \le d^\ddagger} b. \qquad (C.79)$$
It follows from (C.75), (C.76) and (C.78) that, by letting $\epsilon, \delta \downarrow 0$, the infimum (C.66) is given by
$$\bar d^c_{icsi} = n_t n_r \sum_{b=1}^{\bar b} \vartheta_b + n_r\big(d^\ddagger - \bar b n_t\big)\, \vartheta_{\bar b+1} \qquad (C.80)$$
where for $b = 1, \ldots, \min(\tau_d, \bar b+1)$,
$$\vartheta_b = \min(d_e, 1) \qquad (C.81)$$
and for $b = \min(\tau_d, \bar b+1)+1, \ldots, \bar b+1$,
$$\vartheta_b = \min\left(d_e,\ 1 + n_r n_t \sum_{b'=1}^{b-\tau_d} \min\big(\vartheta_{b'},\ d_e\big)\right). \qquad (C.82)$$
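The recursion (C.80)–(C.82) can be sketched numerically as below. This is an illustrative sketch (assuming $\tau_d \ge 1$); the function and argument names are ours.

```python
import math

# Sketch of the recursion (C.80)-(C.82) for the causal-CSIT outage
# SNR-exponent upper bound; illustrative parameters, assumes tau_d >= 1.
def d_cicsi(B, n_t, n_r, R, M, d_e, tau_d):
    d_dag = B * n_t - math.ceil(B * R / M) + 1                   # (C.72)
    b_bar = max(b for b in range(0, B + 1) if b * n_t <= d_dag)  # (C.79)
    theta = [0.0]  # 1-indexed: theta[b] holds vartheta_b
    for b in range(1, b_bar + 2):
        if b <= min(tau_d, b_bar + 1):
            theta.append(min(d_e, 1.0))                          # (C.81)
        else:
            acc = sum(min(theta[bp], d_e) for bp in range(1, b - tau_d + 1))
            theta.append(min(d_e, 1.0 + n_r * n_t * acc))        # (C.82)
    # (C.80): full blocks plus the partial block bar{b}+1
    return (n_t * n_r * sum(theta[1:b_bar + 1])
            + n_r * (d_dag - b_bar * n_t) * theta[b_bar + 1])
```

When $d_e$ is small the cap $\min(d_e, \cdot)$ binds everywhere and the exponent is simply $d_e$ times the number of bad dimensions; when $d_e$ is large the recursion accumulates power-control gains block by block.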
C.4.2 GMI Lower Bound
A lower bound to the outage SNR-exponent with causal CSIT, $\underline d^c_{icsi}$, has an expression equivalent to the one with full CSIT (C.55), i.e.,
$$\underline d^c_{icsi} = \inf_{\tilde A, A, \Theta \in \underline{\mathcal{O}}_{\mathcal{X}}} \left\{ \sum_{\substack{(b,r,t):\, -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t}-d_e < 0,\\ \theta_{b,r,t} \ge d_e}} \Big[ \alpha_{b,r,t} + \big(\theta_{b,r,t} - d_e\big) \Big] + \sum_{\substack{(b,r,t):\, \tilde\alpha_{b,r,t} \ge 0,\ \alpha_{b,r,t} \ge d_e,\\ \theta_{b,r,t} \ge d_e}} \Big[ \tilde\alpha_{b,r,t} + \alpha_{b,r,t} + \big(\theta_{b,r,t} - d_e\big) \Big] \right\} \qquad (C.83)$$
where now $\underline{\mathcal{O}}_{\mathcal{X}}$ is characterised by $p_b\big(\hat A^{(n(b))}\big)$ satisfying the constraint (C.14). Note that $p_b\big(\hat A^{(n(b))}\big)$ in (C.65) may no longer give a tight lower bound. We will show later how to improve $p_b\big(\hat A^{(n(b))}\big)$ so as to obtain a tight lower bound.
Following the derivation in Appendix A.2, we obtain the equivalent outage set with GMI lower bound (C.34) for discrete inputs as follows:
$$\underline{\mathcal{O}}_{\mathcal{X}} = \left\{ \tilde A, A, \Theta \in \mathbb{R}^{B\cdot n_r\cdot n_t} : \sum_{b=1}^{B} \underline\kappa_b < \frac{BR}{M} \right\}. \qquad (C.84)$$
Here we have defined the following:
$$\underline\kappa_b \triangleq \big|\underline{\mathcal{S}}^{(\epsilon,\delta)}_b\big|, \qquad (C.85)$$
$$\underline{\mathcal{S}}^{(\epsilon,\delta)}_b \triangleq \bigcup_{r=1}^{n_r} \underline{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}, \qquad (C.86)$$
$$\underline{\mathcal{S}}^{(\epsilon,\delta)}_{b,r} \triangleq \Big\{ t : \alpha_{b,r,t} < p_b\big(\hat A^{(n(b))}\big) - \epsilon \,\cap\, \alpha_{b,r,t} < g_b - \delta,\ t = 1,\ldots,n_t \Big\} \qquad (C.87)$$
where $g_b$ satisfies
$$P^{g_b} \doteq \frac{P_b\big(\hat{\boldsymbol H}^{(n(b))}\big)}{\sum_{b'=1}^{B} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \frac{P_{b'}\big(\hat{\boldsymbol H}^{(n(b'))}\big)}{n_t} \big|e_{b',r,t}\big|^2} \qquad (C.88)$$
and is given by
$$g_b = \min_{b'=1,\ldots,B} \big\{ \theta_{b',\min} - p_{b'} \big\} + p_b. \qquad (C.89)$$
Here $\theta_{b,\min} \triangleq \min\{\theta_{b,1,1},\ldots,\theta_{b,n_r,n_t}\}$. Since increasing $\theta_{b,\min}$ may increase the threshold $g_b$ in $\underline{\mathcal{O}}_{\mathcal{X}}$, the values of $\theta_{b,\min}$ achieving the infimum (C.83) are given by $d_e$. All other $\theta_{b,r,t}$ are also given by $d_e$. Since $p_b\big(\hat A^{(n(b))}\big)$ is monotonically non-decreasing with $b$, it follows that for $b = 1,\ldots,\tau_d$,
$$g_b = d_e - p_B\big(\hat A^{(n(B))}\big) + 1, \qquad (C.90)$$
and for $b = \tau_d+1,\ldots,B$,
$$g_b = d_e - p_B\big(\hat A^{(n(B))}\big) + p_b\big(\hat A^{(n(b))}\big). \qquad (C.91)$$
For $r = 1,\ldots,n_r$, assume without loss of generality the following conditions, which make the constraint in $\underline{\mathcal{O}}_{\mathcal{X}}$ tight:
$$\alpha_{b,r,t} > \min\Big(p_b\big(\hat A^{(n(b))}\big)-\epsilon,\ g_b-\delta\Big), \quad (b-1)n_t+t \le d^\ddagger, \qquad (C.92)$$
$$\alpha_{b,r,t} \le \min\Big(p_b\big(\hat A^{(n(b))}\big)-\epsilon,\ g_b-\delta\Big), \quad (b-1)n_t+t > d^\ddagger, \qquad (C.93)$$
where $d^\ddagger$ is defined in (C.72). By letting $\epsilon, \delta \downarrow 0$, the infimum (C.83) is achieved with $\alpha_{b,r,t}$ equal to
$$\vartheta_{b,r,t} = \begin{cases} \min\Big(p_b\big(\hat A^{(n(b))}\big),\ g_b\Big), & \text{for } (b-1)n_t+t \le d^\ddagger \\ 0, & \text{for } (b-1)n_t+t > d^\ddagger. \end{cases} \qquad (C.94)$$
Since $g_b$ in (C.89) is less than or equal to the threshold $\theta_{b,r,t}$ in (C.70), a lower bound to the outage SNR-exponent with this $p_b\big(\hat A^{(n(b))}\big)$ is generally less than the upper bound. The loss is mainly due to transmitting with a power exponent larger than the CSIR-error diversity $d_e$. Hence, in this case, the exponent (C.65), which is optimal in the perfect-CSIR case, may no longer be optimal in the mismatched-CSIR case.
However, as observed for the GMI upper bound (Appendix C.4.1), if $d_e$ is less than $p_b\big(\hat A^{(n(b))}\big)$ in (C.65), using a power exponent beyond $d_e$ does not provide additional gains, as $d_e$ always limits the performance. In the following, we show that limiting the power exponent by the CSIR-error diversity $d_e$ yields the same outage SNR-exponent as that in Appendix C.4.1.
Consider a new power exponent $p'_b\big(\hat A^{(n(b))}\big)$, obtained by imposing a peak limit $d_e$ on $p_b\big(\hat A^{(n(b))}\big)$, i.e.,
$$p'_b\big(\hat A^{(n(b))}\big) = \min\Big(d_e,\ p_b\big(\hat A^{(n(b))}\big)\Big). \qquad (C.95)$$
With power exponent $p'_b\big(\hat A^{(n(b))}\big)$, $g_b$ in (C.89) becomes
$$g'_b = \min_{b'=1,\ldots,B} \big\{ \theta_{b',\min} - p'_{b'} \big\} + p'_b. \qquad (C.96)$$
With $\theta_{b,\min} = d_e$, which leads to the infimum (C.83), we have that
$$g'_b = d_e - p'_B\big(\hat A^{(n(B))}\big) + p'_b\big(\hat A^{(n(b))}\big). \qquad (C.97)$$
Following from (C.94), the values of $\alpha_{b,r,t}$ achieving the infimum (C.83) become
$$\vartheta'_{b,r,t} = \begin{cases} \min\Big(p'_b\big(\hat A^{(n(b))}\big),\ g'_b\Big), & \text{for } (b-1)n_t+t \le d^\ddagger \\ 0, & \text{for } (b-1)n_t+t > d^\ddagger. \end{cases} \qquad (C.98)$$
For $(b-1)n_t+t \le d^\ddagger$, we can evaluate $\vartheta'_{b,r,t}$ as follows:
$$\vartheta'_{b,r,t} = \min\big(p'_b,\ g'_b\big) \qquad (C.99)$$
$$= \min\big(d_e,\ p_b,\ d_e - \min(d_e, p_B) + \min(d_e, p_b)\big) \qquad (C.100)$$
where we have omitted the arguments $\big(\hat A^{(n(b))}\big)$ and $\big(\hat A^{(n(B))}\big)$ for ease of notation.
Since $p_b\big(\hat A^{(n(b))}\big)$ is non-decreasing with $b$, evaluating (C.100) yields
$$\vartheta'_{b,r,t} = \begin{cases} d_e, & \text{if } d_e \le p_b\big(\hat A^{(n(b))}\big) \\ p_b\big(\hat A^{(n(b))}\big), & \text{if } p_b\big(\hat A^{(n(b))}\big) < d_e \le p_B\big(\hat A^{(n(B))}\big) \\ p_b\big(\hat A^{(n(b))}\big), & \text{if } d_e > p_B\big(\hat A^{(n(B))}\big) \end{cases} \qquad (C.101)$$
which can simply be written as
$$\vartheta'_{b,r,t} = \min\Big(d_e,\ p_b\big(\hat A^{(n(b))}\big)\Big). \qquad (C.102)$$
Note that for $b = 1,\ldots,\tau_d$, we have $p_b\big(\hat A^{(n(b))}\big) = 1$. Thus, for $b = 1,\ldots,\tau_d$, the values of $\vartheta'_{b,r,t}$ are given by $\min(d_e, 1)$.
We next consider $p_b\big(\hat A^{(n(b))}\big)$, $b = \tau_d+1,\ldots,B$. The power exponent $p_b\big(\hat A^{(n(b))}\big)$ sets a threshold for $\alpha_{b,r,t}$ in (C.87). Since increasing $\tilde\alpha_{b',r,t}$, $b' = 1,\ldots,b-\tau_d$, increases both $p_b\big(\hat A^{(n(b))}\big)$ and the objective function in (C.83), it follows that the solutions for $\tilde\alpha_{b',r,t}$ attaining the infimum (C.83) are given by
$$\tilde\alpha_{b',r,t} = \begin{cases} \vartheta'_{b',r,t} - d_e, & \text{if } \vartheta'_{b',r,t} < d_e \\ 0, & \text{if } \vartheta'_{b',r,t} \ge d_e \end{cases} \qquad (C.103)$$
which can also be written as
$$\tilde\alpha_{b',r,t} = \min\big(\vartheta'_{b',r,t} - d_e,\ 0\big). \qquad (C.104)$$
Using this $\tilde\alpha_{b',r,t}$, we have that for $b = \tau_d+1,\ldots,B$,
$$p_b\big(\hat A^{(n(b))}\big) = 1 + \sum_{b'=1}^{b-\tau_d} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \min\big(\vartheta'_{b',r,t},\ d_e\big). \qquad (C.105)$$
Recall that $d^\ddagger$ and $\bar b$ are defined in (C.72) and (C.79), respectively. It follows from (C.102), (C.103) and (C.105) that the infimum (C.83) is given by
$$\underline d^c_{icsi} = n_t n_r \sum_{b=1}^{\bar b} \vartheta'_b + n_r\big(d^\ddagger - \bar b n_t\big)\,\vartheta'_{\bar b+1} \qquad (C.106)$$
where for $b = 1,\ldots,\min(\tau_d, \bar b+1)$,
$$\vartheta'_b = \min(d_e, 1) \qquad (C.107)$$
and for $b = \min(\tau_d, \bar b+1)+1,\ldots,\bar b+1$,
$$\vartheta'_b = \min\left(d_e,\ 1 + n_r n_t \sum_{b'=1}^{b-\tau_d} \min\big(\vartheta'_{b'},\ d_e\big)\right). \qquad (C.108)$$
We can see that this lower bound coincides with the upper bound (C.80), which
completes the proof.
C.5 Predictive-CSIT Power Allocation
For predictive CSIT, we have $n(b) = b + \tau_f$. The exponent of the optimal power allocation must satisfy (C.14):
$$\sup_{\hat A^{(n(b))} \in \mathbb{R}_+^{n(b)\cdot n_r\cdot n_t}} \left\{ p_b\big(\hat A^{(n(b))}\big) - \sum_{b'=1}^{\min(B,\, b+\tau_f)} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \big(\tilde\alpha_{b',r,t} + d_e\big) \right\} \le 1. \qquad (C.109)$$
Here we have incorporated $\min(B, b+\tau_f)$ so as to count only the fading estimates for the current codeword, as fading matrices beyond the current codeword do not affect the current transmission.
Similarly to Appendix C.3, we note that it suffices to consider solving the
outage SNR-exponent for discrete inputs with alphabet size |X| = 2M . In the
following, the superscript p is used to indicate predictive CSIT.
C.5.1 GMI Upper Bound
Replacing $\mathcal{O}_{\mathcal{X}}$ in (C.25) with $\bar{\mathcal{O}}_{\mathcal{X}}$ yields an upper bound $\bar d^p_{icsi} \ge d^p_{icsi}$:
$$\bar d^p_{icsi} = \inf_{\tilde A, A, \Theta \in \bar{\mathcal{O}}_{\mathcal{X}}} \left\{ \sum_{\substack{(b,r,t):\, -d_e \le \tilde\alpha_{b,r,t} = \alpha_{b,r,t}-d_e < 0,\\ \theta_{b,r,t} \ge d_e}} \Big[ \alpha_{b,r,t} + \big(\theta_{b,r,t} - d_e\big) \Big] + \sum_{\substack{(b,r,t):\, \tilde\alpha_{b,r,t} \ge 0,\ \alpha_{b,r,t} \ge d_e,\\ \theta_{b,r,t} \ge d_e}} \Big[ \tilde\alpha_{b,r,t} + \alpha_{b,r,t} + \big(\theta_{b,r,t} - d_e\big) \Big] \right\}. \qquad (C.110)$$
Here we use the maximum power exponent satisfying the constraint, i.e.,
$$p_b\big(\hat A^{(n(b))}\big) = 1 + \sum_{b'=1}^{\min(B,\, b+\tau_f)} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \big(\tilde\alpha_{b',r,t} + d_e\big), \qquad (C.111)$$
as this gives an upper bound to the optimal outage SNR-exponent, as argued in Appendix C.2.3.
Similarly to the full-CSIT case, the equivalent outage set with GMI upper bound (C.30) for discrete inputs is given by
$$\bar{\mathcal{O}}_{\mathcal{X}} = \left\{ \tilde A, A, \Theta \in \mathbb{R}^{B\cdot n_r\cdot n_t} : \sum_{b=1}^{B} \bar\kappa_b < \frac{BR}{M} \right\} \qquad (C.112)$$
where $\bar\kappa_b = \big|\bar{\mathcal{S}}^{(\epsilon,\delta)}_b\big|$ and $\bar{\mathcal{S}}^{(\epsilon,\delta)}_b = \bigcup_{r=1}^{n_r} \bar{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}$ are as defined in (C.38) and (C.39), but with $p_b\big(\hat A^{(n(b))}\big)$ given in (C.111).
Following the same argument as in Appendix C.3, the infimum solutions for $\bar\theta_{b,r,t}$ are given by $\bar d_e$. As $p_b\big(\hat A^{(n(b))}\big)$ is non-decreasing with $b$, without loss of generality, for each $r = 1,\ldots,n_r$, assume the following conditions:
$$\alpha_{b,r,t} > \min\Big(p_b\big(\hat A^{(n(b))}\big)+\epsilon,\ \bar d_e+\delta\Big), \quad \big(\phi^h_{b,r,t},\phi^e_{b,r,t}\big) \notin \mathcal{Q}_{b,r,t}, \quad (b-1)n_t+t \le d^\ddagger, \qquad (C.113)$$
$$\alpha_{b,r,t} \le \min\Big(p_b\big(\hat A^{(n(b))}\big)+\epsilon,\ \bar d_e+\delta\Big), \quad (b-1)n_t+t > d^\ddagger, \qquad (C.114)$$
which satisfy $\bar{\mathcal{O}}_{\mathcal{X}}$ with a tight inequality in the constraint. Then, the infimum (C.110) is achieved with
$$\alpha_{b,r,t} = \begin{cases} \min\Big(p_b\big(\hat A^{(n(b))}\big)+\epsilon,\ \bar d_e+\delta\Big), & \text{for } (b-1)n_t+t \le d^\ddagger \\ 0, & \text{for } (b-1)n_t+t > d^\ddagger. \end{cases} \qquad (C.115)$$
We let $\epsilon, \delta \downarrow 0$. For any $\bar d_e > 0$, let
$$b^* \triangleq \max_{b\,:\,p_b(\hat A^{(n(b))}) < \bar d_e} b \qquad (C.116)$$
if $b \in \{1,\ldots,B\}$ such that $p_b\big(\hat A^{(n(b))}\big) < \bar d_e$ exists, and $b^* \triangleq 0$ otherwise. In the following, we solve for $p_b\big(\hat A^{(n(b))}\big)$ when the infimum (C.110) is achieved. For the following two cases, we note that $p_b\big(\hat A^{(n(b))}\big)$ is non-decreasing with $b$, and that $d^\ddagger$ and $\bar b$ are defined in (C.72) and (C.79), respectively.
1. Case $\bar d_e < d_e$
For $b \ge \bar b+1$ and $t$ such that $(b-1)n_t+t > d^\ddagger$, the values of $\alpha_{b,r,t}$ achieving the infimum are zero. It follows that $\tilde\alpha_{b,r,t} = -d_e$. In the following, we solve for $\alpha_{b,r,t}$ and $p_b\big(\hat A^{(n(b))}\big)$ for $b \le \bar b+1$ and $t$ such that $(b-1)n_t+t \le d^\ddagger$.
• $b^* < \bar b$
For $b \in [b^*+1,\ \bar b+1]$ and $t$ such that $(b-1)n_t+t \le d^\ddagger$, the infimum (C.110) is achieved with
$$\alpha_{b,r,t} = \bar d_e, \qquad (C.117)$$
$$\tilde\alpha_{b,r,t} = \bar d_e - d_e \qquad (C.118)$$
for all $r$. For $b \le b^*$, the infimum (C.110) is achieved with
$$\alpha_{b,r,t} = p_b\big(\hat A^{(n(b))}\big), \qquad (C.119)$$
$$\tilde\alpha_{b,r,t} = 0. \qquad (C.120)$$
With the above values of $\alpha_{b,r,t}$, we have
$$p_b\big(\hat A^{(n(b))}\big) = 1 + \min(b^*,\ b+\tau_f)\, n_t n_r d_e + \big[\min\big(d^\ddagger,\ (b+\tau_f)n_t\big) - b^* n_t\big]^+ n_r \bar d_e. \qquad (C.121)$$
• $b^* > \bar b$
For $b \le \bar b+1$ and $t$ such that $(b-1)n_t+t \le d^\ddagger$, the infimum (C.110) is achieved with
$$\alpha_{b,r,t} = p_b\big(\hat A^{(n(b))}\big), \qquad (C.122)$$
$$\tilde\alpha_{b,r,t} = 0 \qquad (C.123)$$
for all $r$. Thus, we have that
$$p_b\big(\hat A^{(n(b))}\big) = 1 + \min\big(d^\ddagger,\ (b+\tau_f)n_t\big)\, n_r d_e. \qquad (C.124)$$
• $b^* = \bar b$
For $b \le b^*$, the infimum (C.110) is achieved with
$$\alpha_{b,r,t} = p_b\big(\hat A^{(n(b))}\big), \qquad (C.125)$$
$$\tilde\alpha_{b,r,t} = 0 \qquad (C.126)$$
for all $r, t$. For $b \ge b^*+1$ and $t$ such that $(b-1)n_t+t \le d^\ddagger$, the infimum (C.110) is achieved with
$$\alpha_{b,r,t} = \bar d_e, \qquad (C.127)$$
$$\tilde\alpha_{b,r,t} = \bar d_e - d_e \qquad (C.128)$$
for all $r$. Thus, we have that
$$p_b\big(\hat A^{(n(b))}\big) = 1 + \min(b^*,\ b+\tau_f)\, n_t n_r d_e \qquad (C.129)$$
for $b+\tau_f \le b^*$, and
$$p_b\big(\hat A^{(n(b))}\big) = 1 + \min(b^*,\ b+\tau_f)\, n_t n_r d_e + \big(d^\ddagger - b^* n_t\big)\, n_r \bar d_e \qquad (C.130)$$
otherwise.
Since in this case $\bar d_e < d_e$, we have that $p_b\big(\hat A^{(n(b))}\big) < \bar d_e$ is only possible for $b^* = 0$. This implies that for all $b = 1,\ldots,B$, we always have $p_b\big(\hat A^{(n(b))}\big) \ge \bar d_e$. Thus, it follows from (C.115), by letting $\epsilon, \delta \downarrow 0$, that
$$\alpha_{b,r,t} = \begin{cases} \bar d_e, & \text{for } (b-1)n_t+t \le d^\ddagger \\ 0, & \text{for } (b-1)n_t+t > d^\ddagger. \end{cases} \qquad (C.131)$$
Note that in this case, the sum of the $\tilde\alpha_{b,r,t}$ terms contributing to the infimum (C.110) is zero.
2. Case $\bar d_e \ge d_e$
In this case, when $\alpha_{b,r,t} = 0$, we have $\tilde\alpha_{b,r,t} = -d_e$; when $\alpha_{b,r,t} = p_b\big(\hat A^{(n(b))}\big)$ or $\alpha_{b,r,t} = \bar d_e$ (with $\bar d_e \ge d_e$), we have $\tilde\alpha_{b,r,t} = 0$ to achieve the infimum (C.110). Thus, in this case, the infimum solutions for $\tilde\alpha_{b,r,t}$ are given by
$$\tilde\alpha_{b,r,t} = \begin{cases} 0, & \text{for } (b-1)n_t+t \le d^\ddagger \\ -d_e, & \text{for } (b-1)n_t+t > d^\ddagger. \end{cases} \qquad (C.132)$$
The values of $p_b\big(\hat A^{(n(b))}\big)$ when the $\alpha_{b,r,t}$ attain the infimum (C.110) are then given by
$$p_b\big(\hat A^{(n(b))}\big) = \eta_b \triangleq \begin{cases} 1 + n_t n_r (b+\tau_f)\, d_e, & b+\tau_f \le \bar b \\ 1 + n_r d^\ddagger d_e, & b+\tau_f > \bar b. \end{cases} \qquad (C.133)$$
It follows from (C.115) (by letting $\epsilon, \delta \downarrow 0$) and (C.133) that
$$\alpha_{b,r,t} = \begin{cases} \min\big(\eta_b,\ \bar d_e\big), & \text{for } (b-1)n_t+t \le d^\ddagger \\ 0, & \text{for } (b-1)n_t+t > d^\ddagger. \end{cases} \qquad (C.134)$$
From (C.132) and (C.110), we observe that the sum of the $\tilde\alpha_{b,r,t}$ terms contributing to the infimum (C.110) is also zero.
Combining the above cases yields the infimum (C.110):
$$\bar d^p_{icsi} = n_t n_r \sum_{b=1}^{\bar b} \min\big(\eta_b,\ \bar d_e\big) + n_r\big(d^\ddagger - \bar b n_t\big)\, \min\big(\eta_{\bar b+1},\ \bar d_e\big). \qquad (C.135)$$
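The closed form (C.133)–(C.135) can be sketched numerically as below. This is illustrative only: the names are ours, and `d_e_bar` stands for the second error-diversity parameter $\bar d_e$ appearing in the $\min(\eta_b, \cdot)$ terms.

```python
import math

# Sketch of (C.133) and (C.135) for the predictive-CSIT outage SNR-exponent
# upper bound; illustrative parameters and naming.
def d_picsi(B, n_t, n_r, R, M, d_e, d_e_bar, tau_f):
    d_dag = B * n_t - math.ceil(B * R / M) + 1                    # (C.72)
    b_bar = max(b for b in range(0, B + 1) if b * n_t <= d_dag)   # (C.79)

    def eta(b):                                                   # (C.133)
        if b + tau_f <= b_bar:
            return 1.0 + n_t * n_r * (b + tau_f) * d_e
        return 1.0 + n_r * d_dag * d_e

    head = sum(min(eta(b), d_e_bar) for b in range(1, b_bar + 1))
    # (C.135): full blocks plus the partial block bar{b}+1
    return (n_t * n_r * head
            + n_r * (d_dag - b_bar * n_t) * min(eta(b_bar + 1), d_e_bar))
```

Compared with the causal case, the prediction horizon $\tau_f$ enters through $\eta_b$, so earlier blocks already benefit from the observed future estimates.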
C.5.2 GMI Lower Bound
Following the same reasoning as in the causal-CSIT case (Appendix C.4), the power exponent used to prove a tight lower bound to the outage SNR-exponent is peak-limited by $d_e$, i.e.,
$$p'_b\big(\hat A^{(n(b))}\big) = \min\left\{ d_e,\ 1 + \sum_{b'=1}^{\min(B,\, b+\tau_f)} \sum_{r=1}^{n_r} \sum_{t=1}^{n_t} \min\big(\alpha_{b',r,t},\ d_e\big) \right\}. \qquad (C.136)$$
For discrete inputs, (C.113) implies that the extra term in $\bar{\mathcal{S}}^{(\epsilon,\delta)}_{b,r}$, i.e.,
$$\big\{ \alpha_{b,r,t} \le p_b\big(\hat A^{(n(b))}\big)+\epsilon \,\cap\, \alpha_{b,r,t} > \bar\theta_{b,r,t}+\delta \,\cap\, \mathcal{Q}_{b,r,t} \big\}, \qquad (C.137)$$
for the GMI upper bound does not affect the solution of the infimum. Hence, by ignoring this extra term and letting $\epsilon, \delta \downarrow 0$, it is not difficult to show that $\bar{\mathcal{O}}_{\mathcal{X}}$ derived using $p_b\big(\hat A^{(n(b))}\big)$ in (C.111) and $\underline{\mathcal{O}}_{\mathcal{X}}$ derived using $p'_b\big(\hat A^{(n(b))}\big)$ in (C.136) tend to be identical. With that, it can be shown that the resulting lower bound to the outage SNR-exponent coincides with the upper bound (C.135). The proof for the GMI lower bound is not reproduced here for the sake of compactness.
C.6 LMMSE Channel Estimation
Recall that for LMMSE estimation
$$\text{CSIT:} \quad \bar H_{b,r,t} = H_{b,r,t} + \bar E_{b,r,t}, \qquad (C.138)$$
$$\text{CSIR:} \quad \hat H_{b,r,t} = H_{b,r,t} + E_{b,r,t}. \qquad (C.139)$$
We have the following pdfs
$$p(h_{b,r,t}) = \frac{1}{\pi}\, e^{-|h_{b,r,t}|^2}, \qquad (C.140)$$
$$p(h_{b,r,t}|e_{b,r,t}) = \frac{1}{\pi(1-\sigma_e^2)} \exp\left(-\frac{|h_{b,r,t} - e_{b,r,t}|^2}{1-\sigma_e^2}\right), \qquad (C.141)$$
$$p\big(h_{b,r,t}\big|\hat h_{b,r,t}\big) = \frac{1}{\pi \sigma_e^2 (1-\sigma_e^2)} \exp\left(-\frac{\big|h_{b,r,t} - (1-\sigma_e^2)\,\hat h_{b,r,t}\big|^2}{\sigma_e^2(1-\sigma_e^2)}\right). \qquad (C.142)$$
Define
$$\tilde H_{b,r,t} \triangleq \frac{\hat H_{b,r,t}}{\sqrt{\sigma_e^2(1-\sigma_e^2)}}. \qquad (C.143)$$
Conditioned on $H_{b,r,t} = h_{b,r,t}$, $\tilde H_{b,r,t}$ is a complex-Gaussian random scalar with mean $h_{b,r,t}\sqrt{1-\sigma_e^2}/\sigma_e$ and unit variance. Then, applying the variable transformation in Table C.2, the outage SNR-exponent can be evaluated using
$$P_{gout}(R) \doteq \int_{\mathcal{O}_{\mathcal{X}}} \prod_{b,r,t} p(\tilde\gamma_{b,r,t}|\gamma_{b,r,t})\, p\big(\gamma_{b,r,t}, \phi^h_{b,r,t}\big|\xi_{b,r,t}, \phi^e_{b,r,t}\big)\, p(\xi_{b,r,t})\, p\big(\phi^e_{b,r,t}\big)\, d\Gamma\, d\tilde\Gamma\, d\Xi\, d\Phi^H\, d\Phi^E \qquad (C.144)$$
$$\doteq \int_{\mathcal{O}_{\mathcal{X}}} \prod_{b,r,t} p(\tilde\alpha_{b,r,t}|\alpha_{b,r,t})\, p\big(\alpha_{b,r,t}, \phi^h_{b,r,t}\big|\theta_{b,r,t}, \phi^e_{b,r,t}\big)\, p(\theta_{b,r,t})\, p\big(\phi^e_{b,r,t}\big)\, dA\, d\tilde A\, d\Theta\, d\Phi^H\, d\Phi^E. \qquad (C.145)$$
Here $p(\theta_{b,r,t})$ is given in (C.6) and $p\big(\phi^e_{b,r,t}\big) = (2\pi)^{-1}$. Following from Appendix A.1, we have the bounds
$$p\big(\alpha_{b,r,t}, \phi^h_{b,r,t}\big|\theta_{b,r,t}, \phi^e_{b,r,t}\big) \ge \frac{1}{2\pi(1-\sigma_e^2)}\, \log P \cdot P^{-\alpha_{b,r,t}} \cdot \exp\left(-\frac{\Big(P^{-\frac{\alpha_{b,r,t}}{2}} + P^{-\frac{\theta_{b,r,t}}{2}}\Big)^2}{1-\sigma_e^2}\right), \qquad (C.146)$$
$$p\big(\alpha_{b,r,t}, \phi^h_{b,r,t}\big|\theta_{b,r,t}, \phi^e_{b,r,t}\big) \le \frac{1}{2\pi(1-\sigma_e^2)}\, \log P \cdot P^{-\alpha_{b,r,t}} \cdot \exp\left(-\frac{\Big|P^{-\frac{\alpha_{b,r,t}}{2}} - P^{-\frac{\theta_{b,r,t}}{2}}\Big|^2}{1-\sigma_e^2}\right). \qquad (C.147)$$
We have $p(\tilde\gamma_{b,r,t}|\gamma_{b,r,t})$ as a non-central chi-square pdf with two degrees of freedom,
$$p(\tilde\gamma_{b,r,t}|\gamma_{b,r,t}) = C_0 \cdot \exp\big(-\tilde\gamma_{b,r,t} - \Lambda^h_{b,r,t}\big) \cdot I_0\Big(2\sqrt{\Lambda^h_{b,r,t}\, \tilde\gamma_{b,r,t}}\Big) \qquad (C.148)$$
where $C_0$ is the normalising constant and $\Lambda^h_{b,r,t}$ is the non-centrality parameter
$$\Lambda^h_{b,r,t} = \frac{1-\sigma_e^2}{\sigma_e^2}\, |h_{b,r,t}|^2 \doteq P^{d_e - \alpha_{b,r,t}}. \qquad (C.149)$$
Let $\underline p\big(\alpha_{b,r,t}, \phi^h_{b,r,t}\big|\theta_{b,r,t}, \phi^e_{b,r,t}\big)$ be the RHS of (C.146). We then have a lower bound to $P_{gout}(R)$ as
$$P_{gout}(R) \,\dot\ge\, \int_{\mathcal{O}_{\mathcal{X}}} \prod_{b,r,t} p(\tilde\alpha_{b,r,t}|\alpha_{b,r,t})\, \underline p\big(\alpha_{b,r,t}, \phi^h_{b,r,t}\big|\theta_{b,r,t}, \phi^e_{b,r,t}\big)\, p(\theta_{b,r,t})\, dA\, d\tilde A\, d\Theta\, d\Phi^H\, d\Phi^E \qquad (C.150)$$
$$\doteq \int_{\mathcal{O}_{\mathcal{X}}} \prod_{b,r,t} e^{-P^{-\tilde\alpha_{b,r,t}} - P^{-(\alpha_{b,r,t}-d_e)}}\, e^{-\big(P^{-\alpha_{b,r,t}/2} + P^{-\theta_{b,r,t}/2}\big)^2}\, e^{-P^{-(\theta_{b,r,t}-d_e)}} \times P^{-\tilde\alpha_{b,r,t} - \alpha_{b,r,t} - (\theta_{b,r,t}-d_e)} \cdot I_0\Big(P^{\frac{d_e - \alpha_{b,r,t} - \tilde\alpha_{b,r,t}}{2}}\Big)\, dA\, d\tilde A\, d\Theta\, d\Phi^H\, d\Phi^E. \qquad (C.151)$$
The high-SNR behaviour of the joint pdf depends on $I_0\Big(P^{\frac{d_e-\alpha_{b,r,t}-\tilde\alpha_{b,r,t}}{2}}\Big)$. For each $b, r, t$, we have the following cases.
• Case 1: $d_e - \alpha_{b,r,t} - \tilde\alpha_{b,r,t} > 0$
From [45, 46], we have that
$$I_0\Big(P^{\frac{d_e-\alpha_{b,r,t}-\tilde\alpha_{b,r,t}}{2}}\Big) \doteq P^{-\frac{d_e-\alpha_{b,r,t}-\tilde\alpha_{b,r,t}}{4}}\, e^{P^{\frac{d_e-\alpha_{b,r,t}-\tilde\alpha_{b,r,t}}{2}}}. \qquad (C.152)$$
Grouping the exponential terms in the integrand of (C.151) yields
$$\exp\Big( -P^{-\tilde\alpha_{b,r,t}} - P^{-(\alpha_{b,r,t}-d_e)} + P^{\frac{d_e-\alpha_{b,r,t}-\tilde\alpha_{b,r,t}}{2}} - \big(P^{-\frac{\alpha_{b,r,t}}{2}} + P^{-\frac{\theta_{b,r,t}}{2}}\big)^2 - P^{-(\theta_{b,r,t}-d_e)} \Big). \qquad (C.153)$$
Note that
$$\max\big(-\tilde\alpha_{b,r,t},\, -(\alpha_{b,r,t}-d_e)\big) \ge \frac{d_e - \alpha_{b,r,t} - \tilde\alpha_{b,r,t}}{2} \qquad (C.154)$$
with equality occurring if $\tilde\alpha_{b,r,t} = \alpha_{b,r,t} - d_e$.
⋄ Case 1.1: αb,r,t 6= αb,r,t − de
We have the dot equality
− P−αb,r,t − P−(αb,r,t−de) + Pde−αb,r,t−αb,r,t
2 −(
P−αb,r,t2 + P− θb,r,t
2
)2
− P−(θb,r,t−de) .= −Pmax(−αb,r,t,−(αb,r,t−de),−(θb,r,t−de)). (C.155)
Since
max (−αb,r,t,−(αb,r,t − de)) >de − αb,r,t − αb,r,t
2> 0, (C.156)
we have that the joint pdf decays exponentially with the SNR.
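Dot equalities such as (C.155), and later (C.164) and (C.168), rely on the term with the largest SNR exponent dominating a finite sum of powers of P. A minimal numerical illustration, with arbitrary exponent values:

```python
import math

# exponents of the P^{b_i} terms being grouped; any finite set with a
# unique maximum behaves the same way
b = [-0.3, 0.4, -1.2]

for P in (1e4, 1e8, 1e12):
    s = sum(P ** bi for bi in b)
    # normalised log of the sum approaches max(b) = 0.4 as P grows
    print(round(math.log(s) / math.log(P), 3))  # prints 0.4 for each P
```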
⋄ Case 1.2: \alpha_{b,r,t} = \hat{\alpha}_{b,r,t} - d_e

The exponential behaviour depends on the following. If
\[
-(\theta_{b,r,t} - d_e) > -(\hat{\alpha}_{b,r,t} - d_e), \tag{C.157}
\]
then the joint pdf decays exponentially with the SNR for \theta_{b,r,t} < d_e and converges to a constant for \theta_{b,r,t} \ge d_e. However, since d_e - \alpha_{b,r,t} - \hat{\alpha}_{b,r,t} > 0 and \alpha_{b,r,t} = \hat{\alpha}_{b,r,t} - d_e, we know that \hat{\alpha}_{b,r,t} < d_e. This implies that \theta_{b,r,t} - d_e < 0 and the joint pdf decays exponentially with the SNR.
If
\[
-(\theta_{b,r,t} - d_e) \le -(\hat{\alpha}_{b,r,t} - d_e), \tag{C.158}
\]
we have that
\[
-P^{-\alpha_{b,r,t}} - P^{-(\hat{\alpha}_{b,r,t}-d_e)} + P^{\frac{d_e - \alpha_{b,r,t} - \hat{\alpha}_{b,r,t}}{2}} - \big(P^{-\hat{\alpha}_{b,r,t}/2} + P^{-\theta_{b,r,t}/2}\big)^2 - P^{-(\theta_{b,r,t}-d_e)} \doteq -P^{\max\big(-\alpha_{b,r,t},\, -(\hat{\alpha}_{b,r,t}-d_e)\big)} + P^{\frac{d_e - \alpha_{b,r,t} - \hat{\alpha}_{b,r,t}}{2}}, \tag{C.159}
\]
which is undetermined. But, as argued in [45, 46], we can replace p(\alpha_{b,r,t} \,|\, \hat{\alpha}_{b,r,t}) with the delta function \delta_f(\alpha_{b,r,t} - \hat{\alpha}_{b,r,t} + d_e). Let
\[
\mathcal{A}^{(1)}_{b,r,t} = \big\{\alpha_{b,r,t}, \hat{\alpha}_{b,r,t}, \theta_{b,r,t} : d_e - \alpha_{b,r,t} - \hat{\alpha}_{b,r,t} > 0,\ \alpha_{b,r,t} = \hat{\alpha}_{b,r,t} - d_e\big\} \tag{C.160}
\]
and
\[
c_{b,r,t} = \prod_{(b',r',t') \ne (b,r,t)} p(\alpha_{b',r',t'} \,|\, \hat{\alpha}_{b',r',t'})\, p\big(\hat{\alpha}_{b',r',t'}, \phi_{h_{b',r',t'}} \,\big|\, \theta_{b',r',t'}, \phi_{e_{b',r',t'}}\big)\, p(\theta_{b',r',t'})\; d\alpha_{b',r',t'}\, d\hat{\alpha}_{b',r',t'}\, d\theta_{b',r',t'}\, d\phi_{h_{b',r',t'}}\, d\phi_{e_{b',r',t'}}. \tag{C.161}
\]
Then, we have that
\[
\int_{\mathcal{O}_{\mathcal{X}} \cap \mathcal{A}^{(1)}_{b,r,t}} c_{b,r,t}\, p(\alpha_{b,r,t} \,|\, \hat{\alpha}_{b,r,t})\, p\big(\hat{\alpha}_{b,r,t}, \phi_{h_{b,r,t}} \,\big|\, \theta_{b,r,t}, \phi_{e_{b,r,t}}\big)\, p(\theta_{b,r,t})\; d\alpha_{b,r,t}\, d\hat{\alpha}_{b,r,t}\, d\theta_{b,r,t}\, d\phi_{h_{b,r,t}}\, d\phi_{e_{b,r,t}}
\]
\[
\doteq \int_{\mathcal{O}_{\mathcal{X}} \cap \{\alpha_{b,r,t} = \hat{\alpha}_{b,r,t} - d_e,\ \theta_{b,r,t} - d_e \ge \hat{\alpha}_{b,r,t} - d_e\}} c_{b,r,t}\, p\big(\hat{\alpha}_{b,r,t}, \phi_{h_{b,r,t}} \,\big|\, \theta_{b,r,t}, \phi_{e_{b,r,t}}\big)\, p(\theta_{b,r,t})\; d\hat{\alpha}_{b,r,t}\, d\theta_{b,r,t}\, d\phi_{h_{b,r,t}}\, d\phi_{e_{b,r,t}} \tag{C.162}
\]
\[
\doteq \int_{\mathcal{O}_{\mathcal{X}} \cap \{\alpha_{b,r,t} = \hat{\alpha}_{b,r,t} - d_e,\ \theta_{b,r,t} - d_e \ge \hat{\alpha}_{b,r,t} - d_e\}} c_{b,r,t}\, e^{-\big(P^{-\hat{\alpha}_{b,r,t}/2} + P^{-\theta_{b,r,t}/2}\big)^2}\, e^{-P^{-(\theta_{b,r,t}-d_e)}}\, P^{-\hat{\alpha}_{b,r,t} - (\theta_{b,r,t}-d_e)}\; d\hat{\alpha}_{b,r,t}\, d\theta_{b,r,t}\, d\phi_{h_{b,r,t}}\, d\phi_{e_{b,r,t}}. \tag{C.163}
\]
We have the following dot equality for the exponential terms in the integrand of (C.163):
\[
-\big(P^{-\hat{\alpha}_{b,r,t}/2} + P^{-\theta_{b,r,t}/2}\big)^2 - P^{-(\theta_{b,r,t}-d_e)} \doteq -P^{\max\big(-\hat{\alpha}_{b,r,t},\, -(\theta_{b,r,t}-d_e)\big)}. \tag{C.164}
\]
If -\hat{\alpha}_{b,r,t} \ge -(\theta_{b,r,t} - d_e), then we need the condition \hat{\alpha}_{b,r,t} \ge 0 to make the exponential terms converge to a positive constant, since otherwise the RHS of (C.163) decays exponentially with the SNR; this implies that \theta_{b,r,t} \ge d_e. On the other hand, if -\hat{\alpha}_{b,r,t} \le -(\theta_{b,r,t} - d_e), then we need the condition \theta_{b,r,t} \ge d_e to make the exponential terms converge to a positive constant, since otherwise the RHS of (C.163) decays exponentially with the SNR; this implies that \hat{\alpha}_{b,r,t} \ge 0. From these conditions, we can express the RHS of (C.163) as follows:
\[
\int_{\mathcal{O}_{\mathcal{X}} \cap \{\alpha_{b,r,t} = \hat{\alpha}_{b,r,t} - d_e,\ \theta_{b,r,t} - d_e \ge \hat{\alpha}_{b,r,t} - d_e\}} c_{b,r,t}\, e^{-\big(P^{-\hat{\alpha}_{b,r,t}/2} + P^{-\theta_{b,r,t}/2}\big)^2}\, e^{-P^{-(\theta_{b,r,t}-d_e)}}\, P^{-\hat{\alpha}_{b,r,t} - (\theta_{b,r,t}-d_e)}\; d\hat{\alpha}_{b,r,t}\, d\theta_{b,r,t}\, d\phi_{h_{b,r,t}}\, d\phi_{e_{b,r,t}}
\]
\[
\doteq \int_{\mathcal{O}_{\mathcal{X}} \cap \{-d_e \le \alpha_{b,r,t} = \hat{\alpha}_{b,r,t} - d_e < 0,\ \theta_{b,r,t} \ge d_e\}} c_{b,r,t}\, P^{-\hat{\alpha}_{b,r,t} - (\theta_{b,r,t}-d_e)}\; d\hat{\alpha}_{b,r,t}\, d\theta_{b,r,t}\, d\phi_{h_{b,r,t}}\, d\phi_{e_{b,r,t}}. \tag{C.165}
\]
• Case 2: d_e - \alpha_{b,r,t} - \hat{\alpha}_{b,r,t} \le 0

From [45, 46], we have that
\[
I_0\!\Big(P^{\frac{d_e - \alpha_{b,r,t} - \hat{\alpha}_{b,r,t}}{2}}\Big) \doteq P^0. \tag{C.166}
\]
Grouping the exponential terms in the integrand of (C.151) yields
\[
\exp\!\Big(-P^{-\alpha_{b,r,t}} - P^{-(\hat{\alpha}_{b,r,t}-d_e)} - \big(P^{-\hat{\alpha}_{b,r,t}/2} + P^{-\theta_{b,r,t}/2}\big)^2 - P^{-(\theta_{b,r,t}-d_e)}\Big). \tag{C.167}
\]
We have the dot equality
\[
-P^{-\alpha_{b,r,t}} - P^{-(\hat{\alpha}_{b,r,t}-d_e)} - \big(P^{-\hat{\alpha}_{b,r,t}/2} + P^{-\theta_{b,r,t}/2}\big)^2 - P^{-(\theta_{b,r,t}-d_e)} \doteq -P^{\max\big(-\alpha_{b,r,t},\, -(\hat{\alpha}_{b,r,t}-d_e),\, -(\theta_{b,r,t}-d_e)\big)}. \tag{C.168}
\]
Let
\[
\mathcal{A}^{(2)}_{b,r,t} = \big\{\alpha_{b,r,t}, \hat{\alpha}_{b,r,t}, \theta_{b,r,t} : d_e - \alpha_{b,r,t} - \hat{\alpha}_{b,r,t} \le 0\big\}. \tag{C.169}
\]
We then have that
\[
\int_{\mathcal{O}_{\mathcal{X}} \cap \mathcal{A}^{(2)}_{b,r,t}} c_{b,r,t}\, p(\alpha_{b,r,t} \,|\, \hat{\alpha}_{b,r,t})\, p\big(\hat{\alpha}_{b,r,t}, \phi_{h_{b,r,t}} \,\big|\, \theta_{b,r,t}, \phi_{e_{b,r,t}}\big)\, p(\theta_{b,r,t})\; d\alpha_{b,r,t}\, d\hat{\alpha}_{b,r,t}\, d\theta_{b,r,t}\, d\phi_{h_{b,r,t}}\, d\phi_{e_{b,r,t}}
\]
\[
\doteq \int_{\mathcal{O}_{\mathcal{X}} \cap \{\alpha_{b,r,t} \ge 0,\ \hat{\alpha}_{b,r,t} \ge d_e,\ \theta_{b,r,t} \ge d_e\}} c_{b,r,t}\, P^{-\alpha_{b,r,t}}\, P^{-\hat{\alpha}_{b,r,t}}\, P^{-(\theta_{b,r,t}-d_e)}\; d\alpha_{b,r,t}\, d\hat{\alpha}_{b,r,t}\, d\theta_{b,r,t}. \tag{C.170}
\]
The conditions \alpha_{b,r,t} \ge 0, \hat{\alpha}_{b,r,t} \ge d_e and \theta_{b,r,t} \ge d_e are necessary since otherwise the joint pdf decays exponentially with the SNR.
Combining (C.165) and (C.170) with (C.151) gives a lower bound to P_{\mathrm{out}}^{g}(R). An upper bound can be obtained by applying (C.147), i.e., replacing p\big(\hat{\alpha}_{b,r,t}, \phi_{h_{b,r,t}} \,\big|\, \theta_{b,r,t}, \phi_{e_{b,r,t}}\big) with the RHS of (C.147) in Cases 1 and 2 above. Following the same derivation as in the above cases, it is not difficult to show that the upper bound (C.147) yields the same dot equalities as (C.165) and (C.170). Thus, P_{\mathrm{out}}^{g}(R) satisfies the dot equality
\[
P_{\mathrm{out}}^{g}(R) \doteq P^{-d_{\mathrm{icsi}}} \tag{C.171}
\]
\[
\doteq \int_{\mathcal{O}_{\mathcal{X}}} \prod_{\substack{(b,r,t):\ -d_e \le \alpha_{b,r,t} = \hat{\alpha}_{b,r,t} - d_e < 0, \\ \theta_{b,r,t} \ge d_e}} \Big(P^{-\hat{\alpha}_{b,r,t} - (\theta_{b,r,t}-d_e)}\; d\hat{\alpha}_{b,r,t}\, d\theta_{b,r,t}\, d\phi_{h_{b,r,t}}\, d\phi_{e_{b,r,t}}\Big) \times \prod_{\substack{(b,r,t):\ \alpha_{b,r,t} \ge 0,\ \hat{\alpha}_{b,r,t} \ge d_e, \\ \theta_{b,r,t} \ge d_e}} \Big(P^{-\alpha_{b,r,t} - \hat{\alpha}_{b,r,t} - (\theta_{b,r,t}-d_e)}\; d\alpha_{b,r,t}\, d\hat{\alpha}_{b,r,t}\, d\theta_{b,r,t}\, d\phi_{h_{b,r,t}}\, d\phi_{e_{b,r,t}}\Big). \tag{C.172}
\]
Applying Varadhan's lemma [106] to the last dot equality yields
\[
d_{\mathrm{icsi}} = \inf_{(A,\, \hat{A},\, \Theta) \in \mathcal{O}_{\mathcal{X}}} \Bigg\{ \sum_{(b,r,t):\ -d_e \le \alpha_{b,r,t} = \hat{\alpha}_{b,r,t} - d_e < 0,\ \theta_{b,r,t} \ge d_e} \Big(\hat{\alpha}_{b,r,t} + \big(\theta_{b,r,t} - d_e\big)\Big) + \sum_{(b,r,t):\ \alpha_{b,r,t} \ge 0,\ \hat{\alpha}_{b,r,t} \ge d_e,\ \theta_{b,r,t} \ge d_e} \Big(\alpha_{b,r,t} + \hat{\alpha}_{b,r,t} + \big(\theta_{b,r,t} - d_e\big)\Big) \Bigg\} \tag{C.173}
\]
which is identical to (C.25).
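The same infimum-over-the-outage-set recipe can be illustrated on a toy scalar Rayleigh channel with perfect CSI, where the outage probability is known to satisfy \(P_{\mathrm{out}} \doteq P^{-1}\). A short Python sketch, with an illustrative target rate, reads off this SNR exponent numerically:

```python
import math

# scalar Rayleigh toy example: |h|^2 ~ Exp(1), outage iff log2(1 + P|h|^2) < R;
# the exponent recipe predicts P_out behaving as P^{-d} with d = inf{a >= 1} = 1
R = 1.0  # illustrative target rate
for log10P in (4, 8, 12):
    P = 10.0 ** log10P
    thr = (2.0 ** R - 1.0) / P       # outage iff |h|^2 < thr
    p_out = -math.expm1(-thr)        # exact outage probability 1 - exp(-thr)
    print(round(-math.log10(p_out) / log10P, 3))  # -> 1.0 (the SNR exponent)
```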
To conclude the proof, we note that the power exponent b(A^{(n(b))}) derived using LMMSE estimation [46] satisfies the same constraint as that derived using ML estimation (cf. (C.14)). Thus, the difference between LMMSE and ML estimation is immaterial for the large-SNR outage set. It follows from Appendices C.3–C.5 that the outage SNR-exponent characterisations in Theorems 5.1, 5.2 and 5.3 are valid for LMMSE estimation as well.
References
[1] R. H. Etkin and D. N. C. Tse, “Degrees of freedom in some underspread
MIMO fading channels,” IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1576–
1608, Apr. 2006.
[2] S. R. Saunders and A. Aragon Zavala, Antennas and Propagation for Wireless Communication Systems, 2nd ed. Chichester, UK: Wiley, 2007.
[3] J. G. Proakis and M. Salehi, Digital Communications, 5th ed. New York:
McGraw-Hill, 2008.
[4] A. Lapidoth, “Nearest neighbor decoding for additive non-Gaussian noise
channels,” IEEE Trans. Inf. Theory, vol. 42, no. 5, pp. 1520–1529, Sep.
1996.
[5] A. J. Viterbi, “Error bounds for convolutional codes and an asymptotically
optimum decoding algorithm,” IEEE Trans. Inf. Theory, vol. IT-13, no. 2,
pp. 260–269, Apr. 1967.
[6] T. May, H. Rohling, and V. Engels, “Performance analysis of Viterbi decoding for 64-DAPSK and 64-QAM modulated OFDM signals,” IEEE Trans. Commun., vol. 46, no. 2, pp. 182–190, Feb. 1998.
[7] E. Akay and E. Ayanoglu, “Achieving full frequency and space diversity in
wireless systems via BICM, OFDM, STBC, and Viterbi decoding,” IEEE
Trans. Commun., vol. 54, no. 12, pp. 2164–2172, Dec. 2006.
[8] J. Jin and C.-Y. Tsui, “Low-power limited-search parallel state Viterbi
decoder implementation based on scarce state transition,” IEEE Trans.
VLSI Syst., vol. 15, no. 10, pp. 1172–1176, Oct. 2007.
[9] G. D. Forney, Jr., “Generalized minimum distance decoding,” IEEE Trans.
Inf. Theory, vol. IT-12, no. 2, pp. 125–131, Apr. 1966.
[10] G. D. Forney, Jr. and A. Vardy, “Generalized minimum-distance decoding
of Euclidean-space codes and lattices,” IEEE Trans. Inf. Theory, vol. 42,
no. 6, pp. 1992–2026, Nov. 1996.
[11] R. Kotter, “Fast generalized minimum-distance decoding of algebraic-geometry and Reed-Solomon codes,” IEEE Trans. Inf. Theory, vol. 42, no. 3, pp. 721–737, May 1996.
[12] A. Clark and D. P. Taylor, “Lattice codes and generalized minimum-distance decoding for OFDM systems,” IEEE Trans. Commun., vol. 55, no. 3, pp. 417–426, Mar. 2007.
[13] N. Merhav, G. Kaplan, A. Lapidoth, and S. Shamai (Shitz), “On information rates for mismatched decoders,” IEEE Trans. Inf. Theory, vol. 40, no. 6, pp. 1953–1967, Nov. 1994.
[14] E. Biglieri, J. Proakis, and S. Shamai (Shitz), “Fading channels:
Information-theoretic and communications aspects,” IEEE Trans. Inf.
Theory, vol. 44, no. 6, pp. 2619–2692, Oct. 1998.
[15] D. Tse and P. Viswanath, Fundamentals of Wireless Communication.
Cambridge University Press, 2005.
[16] L. H. Ozarow, S. Shamai, and A. D. Wyner, “Information theoretic considerations for cellular mobile radio,” IEEE Trans. Veh. Technol., vol. 43, no. 2, pp. 359–378, May 1994.
[17] S. Verdu and T. S. Han, “A general formula for channel capacity,” IEEE
Trans. Inf. Theory, vol. 40, no. 4, pp. 1147–1157, Jul. 1994.
[18] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed.
Hoboken, NJ: Wiley, 2006.
[19] R. Durrett, Probability: Theory and Examples, 4th ed. Cambridge University Press, 2010.
[20] L. Zheng and D. N. C. Tse, “Diversity and multiplexing: A fundamental
tradeoff in multiple-antenna channels,” IEEE Trans. Inf. Theory, vol. 49,
no. 5, pp. 1073–1096, May 2003.
[21] R. Knopp and P. A. Humblet, “On coding for block fading channels,” IEEE
Trans. Inf. Theory, vol. 46, no. 1, pp. 189–205, Jan. 2000.
[22] E. Malkamaki and H. Leib, “Coded diversity on block-fading channels,”
IEEE Trans. Inf. Theory, vol. 45, no. 2, pp. 771–781, Mar. 1999.
[23] A. Guillen i Fabregas and G. Caire, “Coded modulation in the block-fading channel: Coding theorems and code construction,” IEEE Trans. Inf. Theory, vol. 52, no. 1, pp. 91–114, Jan. 2006.
[24] K. D. Nguyen, A. Guillen i Fabregas, and L. K. Rasmussen, “A tight lower
bound to the outage probability of discrete-input block-fading channels,”
IEEE Trans. Inf. Theory, vol. 53, no. 11, pp. 4314–4322, Nov. 2007.
[25] K. D. Nguyen, “Adaptive transmission for block-fading channels,” Ph.D.
dissertation, University of South Australia, 2009.
[26] H. E. Gamal, G. Caire, and M. O. Damen, “The MIMO ARQ channel:
Diversity-multiplexing-delay tradeoff,” IEEE Trans. Inf. Theory, vol. 52,
no. 8, pp. 3601–3621, Aug. 2006.
[27] G. Caire, G. Taricco, and E. Biglieri, “Optimum power control over fading
channels,” IEEE Trans. Inf. Theory, vol. 45, no. 5, pp. 1468–1489, Jul.
1999.
[28] A. Lozano, A. M. Tulino, and S. Verdu, “Optimum power allocation for parallel Gaussian channels with arbitrary input distributions,” IEEE Trans. Inf. Theory, vol. 52, no. 7, pp. 3033–3051, Jul. 2006.
[29] A. Chuang, A. Guillen i Fabregas, L. K. Rasmussen, and I. B. Collings,
“Optimal throughput-diversity-delay tradeoff in MIMO ARQ block-fading
channels,” IEEE Trans. Inf. Theory, vol. 54, no. 9, pp. 3968–3986, Sep.
2008.
[30] K. D. Nguyen, A. Guillen i Fabregas, and L. K. Rasmussen, “Power alloca-
tion for block-fading channels with arbitrary input constellations,” IEEE
Trans. Wireless Commun., vol. 8, no. 5, pp. 2514–2523, May 2009.
[31] ——, “Outage exponents of block-fading channels with power allocation,”
IEEE Trans. Inf. Theory, vol. 56, no. 5, pp. 2373–2381, May 2010.
[32] K. D. Nguyen, L. K. Rasmussen, A. Guillen i Fabregas, and N. Letzepis,
“MIMO ARQ with multibit feedback: Outage analysis,” IEEE Trans. Inf.
Theory, vol. 58, no. 2, pp. 765–779, Feb. 2012.
[33] R. G. Gallager, Information Theory and Reliable Communication. New
York: Wiley, 1968.
[34] S. Arimoto, “On the converse to the coding theorem for discrete memoryless
channels,” IEEE Trans. Inf. Theory, vol. 19, no. 3, pp. 357–359, May 1973.
[35] G. Kaplan and S. Shamai (Shitz), “Information rates and error exponents
of compound channels with application to antipodal signaling in a fading
environment,” AEU Archiv fur Elektronik und Ubertragungstechnik, vol. 47,
no. 4, pp. 228–239, 1993.
[36] A. Ganti, A. Lapidoth, and I. E. Telatar, “Mismatched decoding revisited:
General alphabets, channels with memory, and the wide-band limit,” IEEE
Trans. Inf. Theory, vol. 46, no. 7, pp. 2315–2328, Nov. 2000.
[37] A. Guillen i Fabregas, A. Martinez, and G. Caire, “Bit-interleaved coded
modulation,” Foundations and Trends in Commun. and Inf. Theory, vol. 5,
no. 1-2, pp. 1–153, 2008.
[38] S. Shamai (Shitz) and I. Sason, “Variations on the Gallager bounds, connections, and applications,” IEEE Trans. Inf. Theory, vol. 48, no. 12, pp. 3029–3051, Dec. 2002.
[39] H. Weingarten, Y. Steinberg, and S. Shamai (Shitz), “Gaussian codes and
weighted nearest neighbor decoding in fading multiple-antenna channels,”
IEEE Trans. Inf. Theory, vol. 50, no. 8, pp. 1665–1686, Aug. 2004.
[40] G. H. Hardy, J. E. Littlewood, and G. Polya, Inequalities, 2nd ed. Cambridge University Press, 1952.
[41] H. L. Royden, Real Analysis, 2nd ed. New York: Macmillan, 1968.
[42] R. C. Singleton, “Maximum distance q-nary codes,” IEEE Trans. Inf. Theory, vol. 10, no. 2, pp. 116–118, Apr. 1964.
[43] J. J. Boutros, E. C. Strinati, and A. Guillen i Fabregas, “Turbo code design
for block fading channels,” in Proc. 42nd Annual Allerton Conference on
Communication, Control and Computing, Monticello, IL, Sep.–Oct. 2004.
[44] J. J. Boutros, A. Guillen i Fabregas, E. Biglieri, and G. Zemor, “Low-density parity-check codes for nonergodic block-fading channels,” IEEE Trans. Inf. Theory, vol. 56, no. 9, pp. 4286–4300, Sep. 2010.
[45] T. T. Kim and G. Caire, “Diversity gains of power control with noisy
CSIT in MIMO channels,” IEEE Trans. Inf. Theory, vol. 55, no. 4, pp.
1618–1626, Apr. 2009.
[46] T. T. Kim, K. D. Nguyen, and A. Guillen i Fabregas, “Coded modulation
with mismatched CSIT over MIMO block-fading channels,” IEEE Trans.
Inf. Theory, vol. 56, no. 11, pp. 5631–5640, Nov. 2010.
[47] L. Zhao, W. Mo, Y. Ma, and Z. Wang, “Diversity and multiplexing tradeoff
in general fading channels,” IEEE Trans. Inf. Theory, vol. 53, no. 4, pp.
1549–1557, Apr. 2007.
[48] ——, “Diversity and multiplexing tradeoff in general fading channels,” in
Proc. Conference on Information Sciences and Systems (CISS), Princeton,
NJ, Mar. 2006.
[49] G. Taricco and E. Biglieri, “Space-time decoding with imperfect channel
estimation,” IEEE Trans. Wireless Commun., vol. 4, no. 4, pp. 1874–1888,
Jul. 2005.
[50] E. Biglieri, Coding for Wireless Channels. New York: Springer
Science+Business Media, Inc., 2005.
[51] M. R. D. Rodrigues, F. Perez-Cruz, and S. Verdu, “Multiple-input multiple-output Gaussian channels: Optimal covariance for non-Gaussian inputs,” in Proc. IEEE Inf. Theory Workshop, Porto, Portugal, May 2008.
[52] F. Perez-Cruz, M. R. D. Rodrigues, and S. Verdu, “MIMO Gaussian channels with arbitrary inputs: Optimal precoding and power allocation,” IEEE Trans. Inf. Theory, vol. 56, no. 3, pp. 1070–1084, Mar. 2010.
[53] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions
With Formulas, Graphs, and Mathematical Tables. New York: Dover,
1965.
[54] A. Lapidoth and S. Shamai (Shitz), “Fading channels: How perfect need
“perfect side information” be?” IEEE Trans. Inf. Theory, vol. 48, no. 5,
pp. 1118–1134, May 2002.
[55] N. Letzepis and A. Guillen i Fabregas, “Outage probability of the Gaussian
MIMO free space optical channel with PPM,” IEEE Trans. Commun.,
vol. 57, no. 12, pp. 3682–3690, Dec. 2009.
[56] ——, “Outage probability of the free space optical channel with doubly
stochastic scintillation,” IEEE Trans. Commun., vol. 57, no. 10, pp. 2899–
2902, Oct. 2009.
[57] N. Letzepis, K. D. Nguyen, A. Guillen i Fabregas, and W. G. Cowley, “Outage analysis of the hybrid free-space optical and radio-frequency channel,” IEEE J. Sel. Areas Commun., vol. 27, no. 9, pp. 1709–1719, Dec. 2009.
[58] G. Caire and D. Tuninetti, “The throughput of hybrid-ARQ protocols for
the Gaussian collision channel,” IEEE Trans. Inf. Theory, vol. 47, no. 5,
pp. 1971–1988, Jul. 2001.
[59] D. M. Mandelbaum, “An adaptive-feedback coding scheme using incremental redundancy,” IEEE Trans. Inf. Theory, vol. 20, no. 3, pp. 388–389, May 1974.
[60] J. J. Metzner, “Improvements in block-retransmission schemes,” IEEE
Trans. Commun., vol. COM-27, no. 2, pp. 524–532, Feb. 1979.
[61] D. Chase, “Code combining—A maximum-likelihood decoding approach for combining an arbitrary number of noisy packets,” IEEE Trans. Commun., vol. COM-33, no. 5, pp. 385–393, May 1985.
[62] D. J. Costello, Jr., J. Hagenauer, H. Imai, and S. B. Wicker, “Applications
of error-control coding,” IEEE Trans. Inf. Theory, vol. 44, no. 6, pp. 2531–
2560, Oct. 1998.
[63] E. Malkamaki and H. Leib, “Performance of truncated type-II hybrid ARQ
schemes with noisy feedback over block fading channels,” IEEE Trans.
Commun., vol. 48, no. 9, pp. 1477–1487, Sep. 2000.
[64] P. Wu and N. Jindal, “Coding versus ARQ in fading channels: How reliable
should the PHY be?” IEEE Trans. Commun., vol. 59, no. 12, pp. 3363–
3374, Dec. 2011.
[65] R. Cam and C. Leung, “Throughput analysis of some ARQ protocols in
the presence of feedback errors,” IEEE Trans. Commun., vol. 45, no. 1, pp.
35–44, Jan. 1997.
[66] L. Cao and P.-Y. Kam, “On the performance of packet ARQ schemes in
Rayleigh fading: The role of receiver channel state information and its
accuracy,” IEEE Trans. Veh. Technol., vol. 60, no. 2, pp. 704–709, Feb.
2011.
[67] H. A. Ngo and L. Hanzo, “Impact of imperfect channel state information
on RS coding aided hybrid-ARQ in Rayleigh fading channels,” in IEEE
Int. Conf. Commun., Cape Town, South Africa, May 2010.
[68] H. Zheng, A. Lozano, and M. Haleem, “Multiple ARQ processes for MIMO
systems,” EURASIP J. Appl. Signal Process., vol. 2004, no. 5, pp. 772–782,
2004.
[69] I. Csiszar and J. Korner, Information Theory: Coding Theorems for Discrete Memoryless Systems, 2nd ed. Cambridge University Press, 2011.
[70] R. Knopp and G. Caire, “Power control and beamforming for systems with
multiple transmit and receive antennas,” IEEE Trans. Wireless Commun.,
vol. 1, no. 4, pp. 638–648, Oct. 2002.
[71] M. Guillaud, D. T. M. Slock, and R. Knopp, “A practical method for wireless channel reciprocity exploitation through relative calibration,” in Proc.
Eighth Int. Symp. Signal Process. and Its Applicat., Sydney, Australia, Aug.
2005.
[72] A. T. Asyhari and A. Guillen i Fabregas, “Nearest neighbor decoding in MIMO block-fading channels with imperfect CSIR,” IEEE Trans. Inf. Theory, vol. 58, no. 3, pp. 1483–1517, Mar. 2012.
[73] E. Biglieri, G. Caire, and G. Taricco, “Limiting performance of block-fading
channels with multiple antennas,” IEEE Trans. Inf. Theory, vol. 47, no. 4,
pp. 1273–1289, May 2001.
[74] S. V. Hanly and D. N. C. Tse, “Multiaccess fading channels–Part II: Delay-limited capacities,” IEEE Trans. Inf. Theory, vol. 44, no. 7, pp. 2816–2831, Nov. 1998.
[75] K. D. Nguyen, N. Letzepis, A. Guillen i Fabregas, and L. K. Rasmussen,
“Causal/predictive imperfect channel state information in block-fading
channels,” submitted to IEEE Trans. Inf. Theory, Jul. 2010.
[76] B. Hassibi and B. M. Hochwald, “How much training is needed in multiple-antenna wireless links?” IEEE Trans. Inf. Theory, vol. 49, no. 4, pp. 951–963, Apr. 2003.
[77] T. L. Marzetta, “BLAST training: Estimating channel characteristics for
high-capacity space-time wireless,” in Proc. 37th Annual Allerton Conf. on
Communication, Control, and Computing, Monticello, IL, Sep. 1999.
[78] H. V. Poor, An Introduction to Signal Detection and Estimation, 2nd ed.
New York: Springer-Verlag (A Dowden & Culver book), 1994.
[79] E. Visotsky and U. Madhow, “Space-time transmit precoding with imperfect feedback,” IEEE Trans. Inf. Theory, vol. 47, no. 6, pp. 2632–2639, Sep. 2001.
[80] V. Aggarwal and A. Sabharwal, “Bits about the channel: Multiround protocols for two-way fading channels,” IEEE Trans. Inf. Theory, vol. 57, no. 6, pp. 3352–3370, Jun. 2011.
[81] ——, “Power-controlled feedback and training for two-way MIMO channels,” IEEE Trans. Inf. Theory, vol. 56, no. 7, pp. 3310–3331, Jul. 2010.
[82] X. J. Zhang, Y. Gong, and K. B. Letaief, “Power control and channel
training for MIMO channels: A DMT perspective,” IEEE Trans. Wireless
Commun., vol. 10, no. 7, pp. 2080–2089, Jul. 2011.
[83] L. Zheng and D. N. C. Tse, “Communication on the Grassmann manifold:
A geometric approach to the noncoherent multiple-antenna channel,” IEEE
Trans. Inf. Theory, vol. 48, no. 2, pp. 359–383, Feb. 2002.
[84] M. Medard, “The effect upon channel capacity in wireless communications of perfect and imperfect knowledge of the channel,” IEEE Trans. Inf. Theory, vol. 46, no. 3, pp. 933–946, May 2000.
[85] A. Lapidoth, “On the asymptotic capacity of stationary Gaussian fading
channels,” IEEE Trans. Inf. Theory, vol. 51, no. 2, pp. 437–446, Feb. 2005.
[86] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge University
Press, 1985.
[87] Y.-H. Kim, “A coding theorem for a class of stationary channels with feedback,” IEEE Trans. Inf. Theory, vol. 54, no. 4, pp. 1488–1499, Apr. 2008.
[88] G. J. Foschini, “Layered space-time architecture for wireless communication in a fading environment when using multi-element antennas,” Bell Labs Tech. J., vol. 1, no. 2, pp. 41–59, 1996.
[89] E. Telatar, “Capacity of multi-antenna Gaussian channels,” European
Trans. Telecommun., vol. 10, no. 6, pp. 585–595, Nov.–Dec. 1999.
[90] A. Grant, “Rayleigh fading multi-antenna channels,” EURASIP J. Appl.
Signal Process., vol. 2002, no. 3, pp. 316–329, Mar. 2002.
[91] T. Koch and A. Lapidoth, “The fading number and degrees of freedom in
non-coherent MIMO fading channels: A peace pipe,” in Proc. IEEE Int.
Symp. Inf. Theory, Adelaide, Australia, Sep. 2005.
[92] N. Jindal and A. Lozano, “A unified treatment of optimum pilot overhead
in multipath fading channels,” IEEE Trans. Commun., vol. 58, no. 10, pp.
2939–2948, Oct. 2010.
[93] A. Lozano, “Interplay of spectral efficiency, power and Doppler spectrum for reference-signal-assisted wireless communication,” IEEE Trans. Wireless Commun., vol. 7, no. 12, pp. 5020–5029, Dec. 2008.
[94] S. Ohno and G. B. Giannakis, “Average-rate optimal PSAM transmissions over time-selective fading channels,” IEEE Trans. Wireless Commun., vol. 1, no. 4, pp. 712–720, Oct. 2002.
[95] K. Petersen, Ergodic Theory, ser. Cambridge studies in advanced mathematics 2. Cambridge University Press, 1983.
[96] V. Sethuraman and B. Hajek, “Capacity per unit energy of fading channels
with a peak constraint,” IEEE Trans. Inf. Theory, vol. 51, no. 9, pp. 3102–
3120, Sep. 2005.
[97] J. R. Brown, Ergodic Theory and Topological Dynamics. New York: Academic Press, 1976.
[98] A. J. Weir, Lebesgue Integration and Measure. Cambridge University
Press, 1973.
[99] A. W. van der Vaart and J. A. Wellner, Weak Convergence and Empirical
Processes: With Applications to Statistics. New York: Springer-Verlag,
1996.
[100] T. S. Rappaport, Wireless Communications: Principles and Practice,
2nd ed. Upper Saddle River, NJ: Prentice Hall PTR, 2002.
[101] A. Lapidoth and S. M. Moser, “Capacity bounds via duality with applications to multiple-antenna systems on flat-fading channels,” IEEE Trans. Inf. Theory, vol. 49, no. 10, pp. 2426–2467, Oct. 2003.
[102] T. S. Han, Information-Spectrum Methods in Information Theory. Berlin,
Germany: Springer-Verlag, 2003.
[103] Y. Polyanskiy, H. V. Poor, and S. Verdu, “Channel coding rate in the
finite blocklength regime,” IEEE Trans. Inf. Theory, vol. 56, no. 5, pp.
2307–2359, May 2010.
[104] Y. Polyanskiy, “Channel coding: Non-asymptotic fundamental limits,”
Ph.D. dissertation, Princeton University, 2010.
[105] Y. Polyanskiy and S. Verdu, “Scalar coherent fading channel: Dispersion
analysis,” in Proc. IEEE Int. Symp. Inf. Theory, Saint Petersburg, Russia,
Jul.–Aug. 2011.
[106] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications,
2nd ed. New York: Springer-Verlag, 1998.
[107] R. J. Muirhead, Aspects of Multivariate Statistical Theory. New York:
Wiley, 1982.
[108] A. Edelman, “Eigenvalues and condition numbers of random matrices,”
Ph.D. dissertation, MIT, 1989.