Achievable Rates of FDD Massive MIMO Systems with Spatial...

Date post: 22-Sep-2020
Transcript
Page 1: Achievable Rates of FDD Massive MIMO Systems with Spatial …network.ee.tsinghua.edu.cn/niulab/wp-content/uploads/2015/02/AR_CR.pdf · prohibitively large for the FDD system. Therefore,


Achievable Rates of FDD Massive MIMO Systems with Spatial Channel Correlation

Zhiyuan Jiang, Andreas F. Molisch, Fellow, IEEE, Giuseppe Caire, Fellow, IEEE, and Zhisheng Niu, Fellow, IEEE

Abstract: It is well known that the performance of frequency-division-duplex (FDD) massive MIMO systems with i.i.d. channels is disappointing compared with that of time-division-duplex (TDD) systems, due to the prohibitively large overhead for acquiring channel state information at the transmitter (CSIT). In this paper, we investigate the achievable rates of FDD massive MIMO systems with spatially correlated channels, considering the CSIT acquisition dimensionality loss, the imperfection of CSIT and the regularized-zero-forcing linear precoder. The achievable rates are optimized by judiciously designing the downlink channel training sequences and user CSIT feedback codebooks, exploiting the multiuser spatial channel correlation. We compare our achievable rates with TDD massive MIMO systems, i.i.d. FDD systems, and the joint spatial division and multiplexing (JSDM) scheme, by deriving the deterministic equivalents of the achievable rates, based on the one-ring model and the Laplacian model. It is shown that, based on the proposed eigenspace channel estimation schemes, the rate gap between FDD systems and TDD systems is significantly narrowed, and the TDD rate is even approached with a moderate number of base station antennas. Compared to the JSDM scheme, our proposal achieves dimensionality-reduction channel estimation without channel pre-projection, and higher throughput for a moderate number of antennas and moderate to large channel coherence block length, though at higher computational complexity.

Index Terms: Massive MIMO systems, Frequency-division-duplex, Spatial channel correlation, Training sequences design, Feedback codebook design.

I. INTRODUCTION

Scaling up multiple-input-multiple-output (MIMO) systems, thus exploiting the spatial degree-of-freedom (DoF), plays a pivotal role in boosting the capacity of next generation wireless communication systems. In cellular systems, it is desirable to deploy a large number of antennas at base stations (BSs) [1], resulting in what is referred to as the massive MIMO system. Such designs have several advantages, including significant improvements of spectral efficiency and radiated energy efficiency [2], immunity to small-scale channel fading due to the channel hardening effect, simplification of the media-access-control (MAC) layer design, etc.

In striving to reap the dramatic throughput gain of massive MIMO systems, it has been found that such capacity improvements rely heavily on the availability of channel state information at the transmitter (CSIT). Without CSIT, e.g., when the user channels are identically distributed and are i.i.d. (independent identically distributed) in time and frequency, the total DoF reduces to one [3].¹ In practice, a pilot-assisted CSIT acquisition approach is widely adopted, where the BS first broadcasts downlink channel training sequences, and then listens to the channel feedback from the users. This is the case for the frequency-division-duplex (FDD) system or the uncalibrated time-division-duplex (TDD) system.² For the calibrated TDD system, the channel reciprocity is exploited to allow the BS to obtain the CSIT through uplink channel training [5]. Assuming the channel coefficients are i.i.d. for different users and BS antennas, the CSIT acquisition overhead, which leads to a dimensionality loss of the time-frequency resource, scales with the number of BS antennas for FDD systems, and with the number of users for TDD systems. As we scale up the number of BS antennas, the overhead will become prohibitively large for the FDD system. Therefore, it is commonly considered that the TDD mode is the better, if not the only, choice for massive MIMO systems. Nonetheless, since currently deployed cellular systems are dominantly FDD, and many frequency bands are assigned explicitly for use in FDD, it is of great interest to design schemes that realize the massive MIMO gains with an FDD mode.

Z. Jiang and Z. Niu are with Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China. Emails: [email protected]; [email protected].

A. F. Molisch and G. Caire are with the Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089-2565, USA. Emails: [email protected]; [email protected].

This work is sponsored in part by the National Basic Research Program of China (2012CB316001), and the Nature Science Foundation of China (61201191, 61321061, 61461136004), Hitachi Ltd. and Intel Research under the 5G Program.

Given the fact that the dimensionality loss due to CSIT acquisition overhead is devastating with closed-loop channel estimation in FDD and uncalibrated TDD systems, and that the system performance without CSIT is unacceptably poor, it is natural to pose the question whether there exists other information that can be estimated at a much lower cost, while accomplishing the same task as the CSIT. To this end, it is found that the second order channel statistics, specifically the channel correlation matrices (CCMs) of the channel coefficients, are of tremendous help [6]–[10]. Compared with the instantaneous CSIT realizations, the CCMs, which are determined by user locations and large-scale fading, vary at a much slower time scale, e.g., seconds to tens of seconds in cellular systems. Therefore, their estimation cost is drastically lower than that of instantaneous CSIT. In the meantime, recent work shows the CCMs can be leveraged, in many ways, to facilitate FDD massive MIMO transmission. While the optimal transmission scheme with the aid of CCMs is still unclear, significant rate gain can be expected [7].

¹ In such condition it has been shown that even when the CSIT is known within a mean-square error that does not decrease with SNR, the DoF collapses to one [4].

² In practice, TDD reciprocity is quite difficult to obtain, since it requires reciprocity calibration of the transmit and receive radio frequency chains. In fact, the only current system that uses MU-MIMO, which is 802.11ac, uses explicit polling of the users through downlink pilots, and explicit quantized closed-loop feedback from the users, even though it is a TDD system.

A large body of work has been done studying TDD massive MIMO systems. The seminal work in [1] first proposes to deploy BS antennas with a number much larger than the number of users, eliminating the impact of small-scale channel fading and uncorrelated noise due to the channel hardening effect, while only the inter-cell interference remains due to pilot contamination. In [2], the authors show that in addition to the spectral efficiency improvement, the massive MIMO system increases the radiated energy efficiency by a factor of $M$, where $M$ is the number of BS antennas, or $\sqrt{M}$ in the presence of imperfect channel estimation. Recent work in [11] further shows the pilot-contamination problem is not inherent. Several other issues are also studied extensively, such as downlink precoding, detection, hardware impairment, etc. [12]–[14].

For FDD massive MIMO systems, the research can be categorized into three directions: compressive-sensing-based, temporal-correlation-based and spatial-correlation-based. In [15] and references therein, the authors exploit the sparsity in the massive MIMO channel matrix due to the limited number of scatterers around the BS, using a compressive sensing approach. Moreover, the time correlation of the channels is leveraged to reduce the CSIT overhead, e.g., [16]–[18] and references therein, where trellis-code-based quantization codebooks are leveraged to decrease the CSIT estimation overhead in [16], [17] and a memory-based channel training sequence design is presented in [18]. The other direction is exploiting the spatial correlation of channel coefficients, pioneered by the work in [7] and extended in [8]–[10], [19], which propose the joint spatial division and multiplexing (JSDM) scheme. Based on the JSDM scheme, the users are divided into groups based on their CCMs, and a two-stage precoding is performed, namely the pre-beamforming and the beamforming, which utilize the CCMs to counteract the inter-group interference (IGI) and the instantaneous CSIT to manage the interference inside each group, respectively.

The current work endeavors to optimize the achievable rates of FDD massive MIMO systems. Specifically, we propose eigenspace channel estimation methods to improve the system achievable rates, for the case of spatially correlated channels. The main contributions of this paper include:

• The low-rank covariance matrices of the channels are exploited in order to design efficient channel training and feedback schemes, which enable dimensionality-reduced channel estimation, e.g., it may suffice to train the downlink broadcast channel (BC) with fewer pilots than the number of BS antennas. In fact, the proposed channel training and feedback schemes can be seen as an alternative to the pre-projection and effective channel approach in JSDM. We derive deterministic equivalents of the achievable rates for our schemes with a regularized-zero-forcing (RZF) precoder, considering distinct CCMs of different users, the dimensionality loss due to channel training and feedback processes, and the imperfection of channel estimations. The proposed approach requires minimal modifications of the widely adopted pilot-assisted scheme, thus making it desirable to implement in practice.

• The optimal channel training sequences with distinct CCMs for different users are studied for the first time. We propose a heuristic iterative algorithm to find the optimized training sequences, within the heuristics of the algorithm, based on maximizing the mutual information between the channel coefficients and the received channel training signals. The training sequences found by the algorithm are shown to improve the system achievable rates substantially, compared with the training sequences optimized for the i.i.d. case.

• The Karhunen-Loeve (KL) transform followed by entropy coded scalar quantization (SQ) with reverse water-filling bit-loading for the feedback codebook design (KLSQ) is proposed. We compare its performance with two vector quantization (VQ) methods designed for the spatially correlated channel case. It is shown that the KLSQ is a simple way to approach the optimal VQ performance for correlated Gaussian channel vectors. The simplicity is due to the fact that it is only SQ followed by Huffman entropy coding. Therefore, it is of very low complexity for real-time implementation, which justifies and motivates its use.

• Comprehensive numerical results are given to evaluate the performance. We consider the one-ring channel model and the Laplacian angular spectrum channel model, and compare our achievable sum rate with the TDD system and the i.i.d. FDD system under various system parameters. Significant rate gains are obtained by our proposed channel estimation scheme in spatially correlated channels. Furthermore, in comparison with the JSDM scheme, it is shown that the achievable sum rate with our proposal is better in most scenarios, except when the channel coherence block length is very small and the users are well separated in the angular domain.

The remainder of the paper is organized as follows. In Section II, the system model is characterized. In Section III, we specify the proposed eigenspace channel training and feedback schemes, and derive the achievable rates. In Section IV, we derive the deterministic equivalents of the achievable rates. Section V gives the simulation results, including the comparison with TDD and i.i.d. FDD systems, and the JSDM scheme, under various system parameters. Finally, in Section VI, we conclude our work.

Notations: Throughout the paper, we use boldface uppercase letters, boldface lowercase letters and lowercase letters to designate matrices, column vectors and scalars, respectively. $\mathbf{X}^{\dagger}$ denotes the complex conjugate transpose of matrix $\mathbf{X}$. $\mathbf{X}(:, i)$ denotes the $i$-th column of $\mathbf{X}$. $x_i$ denotes the $i$-th element of vector $\mathbf{x}$. $\mathrm{diag}[x_1, x_2, \ldots, x_n]$ denotes a diagonal matrix with $x_1, x_2, \ldots, x_n$ on its diagonal. $\det(\mathbf{X})$ and $\mathrm{tr}(\mathbf{X})$ denote the determinant and the trace of matrix $\mathbf{X}$, respectively. $\mathbb{E}(\cdot)$ denotes the expectation operation, and $\mathbf{I}_N$ denotes the $N$-dimensional identity matrix. $\mathcal{CN}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ denotes the circularly symmetric complex Gaussian distribution with mean $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$. The logarithm $\log(x)$ denotes the binary logarithm. We use $\mathrm{Cov}(\cdot)$ to denote the covariance matrix of a random vector.

II. SYSTEM MODEL

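We consider a downlink BC, where an $M$-antenna BS serves $N$ single-antenna users. The received signal of the $n$-th user is expressed as

$$y_n = \mathbf{h}_n^{\dagger} \mathbf{W} \mathbf{s} + n_n, \tag{1}$$

where $\mathbf{s} \in \mathbb{C}^{N}$ contains the data symbols transmitted to the users, $\mathbf{x} = \mathbf{W}\mathbf{s}$ denotes the precoded downlink signals, $\mathbf{W} \in \mathbb{C}^{M \times N}$ denotes the precoding matrix, $\mathbf{h}_n$ is the channel vector of user $n$, and $\mathbf{y} \in \mathbb{C}^{N}$ collects the received signals of the users. The downlink total transmit power constraint is

$$\mathrm{tr}\left\{ \mathbb{E}\left[ \mathbf{W}\mathbf{s}\mathbf{s}^{\dagger}\mathbf{W}^{\dagger} \right] \right\} \le P, \tag{2}$$

and $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_N)$ is the Gaussian distributed uncorrelated noise.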

A. Spatially Correlated Channel Matrix

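Define the compound downlink channel matrix $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_N]^{\dagger}$, where $\mathbf{h}_n \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_n)$. The CCM of user $n$ is

$$\mathbf{R}_n = \mathbb{E}\left[ \mathbf{h}_n \mathbf{h}_n^{\dagger} \right], \tag{3}$$

and, by the Karhunen-Loeve representation,

$$\mathbf{h}_n = \mathbf{R}_n^{\frac{1}{2}} \mathbf{z}_n, \tag{4}$$

where $\mathbf{z}_n \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_M)$. It is assumed that the channel vectors of users are mutually independent, since users are usually well separated. Denote the singular-value decomposition (SVD) of the CCM as $\mathbf{R}_n = \mathbf{U}_n \boldsymbol{\Sigma}_n \mathbf{U}_n^{\dagger}$, wherein $\mathbf{U}_n$ is an $M \times M$ unitary matrix and $\boldsymbol{\Sigma}_n = \mathrm{diag}[\lambda_1^{(n)}, \lambda_2^{(n)}, \ldots, \lambda_M^{(n)}]$.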
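As an illustration of (3) and (4), the following minimal sketch draws correlated channel vectors as $\mathbf{h}_n = \mathbf{R}_n^{1/2}\mathbf{z}_n$ and checks that their sample covariance is close to the CCM. The exponential-correlation matrix, the sample count, and the function name `sample_correlated_channel` are assumptions chosen for illustration; they are not the one-ring or Laplacian models used in the paper.

```python
import numpy as np

def sample_correlated_channel(R, rng, num_samples=1):
    """Draw columns h = R^{1/2} z with z ~ CN(0, I_M), as in (4)."""
    M = R.shape[0]
    # Hermitian eigendecomposition R = U diag(lam) U^H; clip tiny negative values.
    lam, U = np.linalg.eigh(R)
    lam = np.clip(lam, 0.0, None)
    R_sqrt = (U * np.sqrt(lam)) @ U.conj().T
    # Circularly symmetric complex Gaussian entries with unit variance.
    z = (rng.standard_normal((M, num_samples))
         + 1j * rng.standard_normal((M, num_samples))) / np.sqrt(2)
    return R_sqrt @ z

# Hypothetical toy CCM (assumption): exponential correlation with M = 4 antennas.
M, rho = 4, 0.9
R = np.array([[rho ** abs(i - j) for j in range(M)] for i in range(M)],
             dtype=complex)

rng = np.random.default_rng(0)
H = sample_correlated_channel(R, rng, num_samples=200_000)
R_emp = H @ H.conj().T / H.shape[1]  # sample estimate of E[h h^H] in (3)
```

With this many samples, `R_emp` should agree with `R` to within a few hundredths per entry.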

B. Dominant Eigenspace Representation of CCM

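Let us define the order-$r_n$ dominant eigenspace representation of $\mathbf{R}_n$ ($r_n$-DER) as

$$\mathbf{R}_n^{(r_n)} = \mathbf{U}_n^{(r_n)} \boldsymbol{\Sigma}_n^{(r_n)} \left( \mathbf{U}_n^{(r_n)} \right)^{\dagger}, \tag{5}$$

where $\boldsymbol{\Sigma}_n^{(r_n)} \in \mathbb{C}^{r_n \times r_n}$ contains the $r_n$ dominant singular values with $1 \le r_n \le M$, and $\mathbf{U}_n^{(r_n)} \in \mathbb{C}^{M \times r_n}$ denotes the corresponding $r_n$ eigenvectors of $\mathbf{R}_n$. The order-$r_n$ channel vector approximation ($r_n$-CVA) is

$$\mathbf{h}_n^{(r_n)} = \mathbf{U}_n^{(r_n)} \left( \boldsymbol{\Sigma}_n^{(r_n)} \right)^{\frac{1}{2}} \mathbf{z}_n^{(r_n)}, \tag{6}$$

where $\mathbf{z}_n^{(r_n)} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{r_n})$. Let

$$\mathbf{h}_n = \mathbf{h}_n^{(r_n)} + \mathbf{e}_n^{(r_n)}, \tag{7}$$

where $\mathbf{e}_n^{(r_n)}$ denotes the error introduced by only considering the dominant $r_n$ singular values, which, therefore, can be represented as

$$\mathbf{e}_n^{(r_n)} = \bar{\mathbf{U}}_n^{(r_n)} \left( \bar{\boldsymbol{\Sigma}}_n^{(r_n)} \right)^{\frac{1}{2}} \bar{\mathbf{z}}_n^{(r_n)}, \tag{8}$$

where $\bar{\mathbf{U}}_n^{(r_n)} \in \mathbb{C}^{M \times (M - r_n)}$ denotes the remaining $M - r_n$ eigenvectors of $\mathbf{R}_n$, and $\bar{\boldsymbol{\Sigma}}_n^{(r_n)} \in \mathbb{C}^{(M - r_n) \times (M - r_n)}$ contains the remaining $M - r_n$ non-dominant singular values. The approximation, namely the $r_n$-DER, which only accounts for the dominant $r_n$ singular values of $\mathbf{R}_n$, is leveraged to improve the CSIT feedback efficiency, which is discussed in detail in Section III-B.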
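The decomposition in (5)–(8) can be sketched numerically as follows. The sketch builds the rank-$r_n$ DER of a toy CCM and evaluates the mean energy of the discarded part, which equals the sum of the $M - r_n$ non-dominant eigenvalues. The toy exponential-correlation CCM and the names `dominant_eigenspace`, `truncation_mse` and `captured` are assumptions made for illustration only.

```python
import numpy as np

def dominant_eigenspace(R, r):
    """Return (U_r, lam_r, R_r): the r dominant eigenpairs of R and the DER in (5)."""
    lam, U = np.linalg.eigh(R)          # eigenvalues in ascending order
    order = np.argsort(lam)[::-1]       # reorder to descending
    lam, U = lam[order], U[:, order]
    U_r, lam_r = U[:, :r], lam[:r]
    R_r = (U_r * lam_r) @ U_r.conj().T  # U_r diag(lam_r) U_r^H
    return U_r, lam_r, R_r

# Hypothetical toy CCM (assumption): exponential correlation, M = 8.
M, rho, r = 8, 0.95, 3
R = np.array([[rho ** abs(i - j) for j in range(M)] for i in range(M)],
             dtype=complex)

U_r, lam_r, R_r = dominant_eigenspace(R, r)
# E||e||^2 for the error term in (8): trace of the discarded part of R.
truncation_mse = np.trace(R - R_r).real
# Fraction of the total channel energy captured by the r dominant modes.
captured = lam_r.sum() / np.trace(R).real
```

For strongly correlated channels, `captured` is close to one even for small `r`, which is the property the feedback design relies on.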

Assumptions: Throughout the paper, we assume the BS has perfect knowledge of the CCMs of all users, i.e., $\mathbf{R}_n, \forall n$, and the users know their respective CCMs. In practice, the BS can obtain the downlink CCMs by direct transformation from the uplink CCMs without using any training symbols [20]. In addition, we assume the receive and transmit antennas are uncorrelated [21], and we only consider the transmit correlation since we assume single-antenna users [22].

We adopt the block fading channel model, where the channel is constant for $T$ channel uses measured on the time-frequency plane, i.e., in complex dimensions, and evolves independently to another block. The channel coherence block length $T$ is a dimensionality, which is given by the product of channel coherence time and channel coherence bandwidth in an orthogonal frequency-division multiplexing (OFDM) system, on the time-frequency plane. In Long-Term Evolution (LTE) systems, a resource block is a tile of 14 OFDM symbols in time multiplied by 12 subcarriers in frequency, for a total of $T = 168$ complex symbols [23], over which the channel is constant (within the time and frequency selectivity for which the system is designed).

III. FDD MASSIVE MIMO ACHIEVABLE RATES

In this section, we will specify the rate-achieving transmission scheme proposed in this work. The structure of the transmission strategy is identical with the widely adopted pilot-assisted FDD MU-MIMO system, which encompasses three steps:

• Downlink channel training.
• Uplink CSIT feedback.
• Data transmission.

The rate improvement stems from optimizing the channel training sequences and the CSIT feedback codebooks under the spatially correlated channels, thus requiring minimum modifications to current systems. In what follows, we will investigate the aforementioned steps in order, namely the channel training sequences and the feedback codebooks, and derive the achievable rates on account of the dimensionality loss and imperfection of channel estimations with the RZF linear precoder.

A. Optimized Downlink Training with Per-User CCM

The signal model of the channel training phase is expressed as

$$\mathbf{Y}_{\tau} = \mathbf{H}\mathbf{X}_{\tau} + \mathbf{N}_{\tau}, \qquad \mathrm{tr}\left[ \mathbf{X}_{\tau}\mathbf{X}_{\tau}^{\dagger} \right] \le \tau P, \tag{9}$$

where $\mathbf{X}_{\tau}$ is an $M \times \tau$ training signal matrix, containing the training sequences, and is known to the BS and the users, $\tau$ is the training length, and $\mathbf{Y}_{\tau} = [\mathbf{y}_{\tau,1}, \mathbf{y}_{\tau,2}, \ldots, \mathbf{y}_{\tau,N}]^{\dagger}$ is the corresponding channel output observed by the users, disturbed by Gaussian noise $\mathbf{N}_{\tau}$ with i.i.d. unit-variance entries. The $n$-th user observes

$$\mathbf{y}_{\tau,n}^{\dagger} = \mathbf{h}_n^{\dagger}\mathbf{X}_{\tau} + \mathbf{n}_{\tau,n}^{\dagger}, \tag{10}$$

and applies the minimum-mean-square-error (MMSE) estimation [24, Section 19.5]

$$\hat{\mathbf{h}}_n = \mathbf{R}_n\mathbf{X}_{\tau} \left( \mathbf{X}_{\tau}^{\dagger}\mathbf{R}_n\mathbf{X}_{\tau} + \mathbf{I}_{\tau} \right)^{-1} \mathbf{y}_{\tau,n}. \tag{11}$$

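Applying the MMSE decomposition, the user channel $\mathbf{h}_n$ and the covariance matrix of the channel estimation error due to imperfect channel training are expressed as [25]

$$\mathbf{h}_n = \hat{\mathbf{h}}_n + \mathbf{e}_n,$$

$$\begin{aligned}
\mathbf{C}_{e_n} &= \mathrm{Cov}(\mathbf{h}_n) - \mathrm{Cov}(\hat{\mathbf{h}}_n) \\
&= \mathbf{R}_n - \mathbf{R}_n\mathbf{X}_{\tau} \left( \mathbf{X}_{\tau}^{\dagger}\mathbf{R}_n\mathbf{X}_{\tau} + \mathbf{I}_{\tau} \right)^{-1} \mathbf{X}_{\tau}^{\dagger}\mathbf{R}_n \\
&= \left( \mathbf{R}_n^{-1} + \mathbf{X}_{\tau}\mathbf{X}_{\tau}^{\dagger} \right)^{-1},
\end{aligned} \tag{12}$$

respectively. The last equation in (12) follows from the matrix inversion lemma.³ The total mean-square error (MSE) is

$$\mathrm{MSE} = \sum_{n=1}^{N} \mathrm{tr}\left[ \mathbf{C}_{e_n} \right]. \tag{13}$$

Notice that by assumption $\mathbf{R}_n$ is the CCM, thus it may be rank-deficient and not invertible. Nonetheless, let $\tilde{\mathbf{R}}_n = \mathbf{R}_n + \varepsilon^{*}\mathbf{I}_M$ such that $\varepsilon^{*}$ is small but $\tilde{\mathbf{R}}_n$ is invertible. Then (12) holds true if we substitute $\tilde{\mathbf{R}}_n$ for $\mathbf{R}_n$. Then we can let $\varepsilon^{*} \to 0$ due to the continuity of the function involved.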
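The two expressions for the error covariance in (12) can be checked numerically. The sketch below evaluates the MMSE filter of (11) and both forms of $\mathbf{C}_{e_n}$ for a toy full-rank CCM and a random training matrix scaled to meet the power constraint in (9) with equality. The CCM, the training matrix, the values of `M`, `tau`, `P`, and the function name `mmse_error_cov` are all assumptions for illustration.

```python
import numpy as np

def mmse_error_cov(R, X):
    """Return (C_e, G): error covariance from the second line of (12) and the
    MMSE filter G = R X (X^H R X + I)^{-1} of (11)."""
    tau = X.shape[1]
    G = R @ X @ np.linalg.inv(X.conj().T @ R @ X + np.eye(tau))
    C_e = R - G @ X.conj().T @ R
    return C_e, G

rng = np.random.default_rng(1)
M, tau, P = 4, 2, 10.0

# Hypothetical full-rank toy CCM (assumption): exponential correlation.
R = np.array([[0.7 ** abs(i - j) for j in range(M)] for i in range(M)],
             dtype=complex)

# Random training matrix (assumption), scaled so that tr[X X^H] = tau * P.
X = rng.standard_normal((M, tau)) + 1j * rng.standard_normal((M, tau))
X *= np.sqrt(tau * P / np.trace(X @ X.conj().T).real)

C_e, G = mmse_error_cov(R, X)
# Third line of (12), valid here because this toy R is invertible.
C_e_lemma = np.linalg.inv(np.linalg.inv(R) + X @ X.conj().T)
mse = np.trace(C_e).real  # single-user contribution to (13)
```

Because `tau` is smaller than `M` here, training cannot resolve every channel dimension, so `mse` stays strictly between zero and `tr(R)`.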

In [26], the optimal training sequences for the case where users have identical CCMs are given, in the sense of minimizing the MSE or maximizing the mutual information between the channel coefficients and received signals conditioned on the transmitted training signals. However, to the best of our knowledge, the optimal training sequences under per-user CCMs are still unknown, because multiple users share the same downlink training sequences, and thus the training sequences can no longer match one specific CCM, as in the case where user CCMs are identical. In what follows, we develop an iterative algorithm to find the optimized training sequences, within the heuristics of the algorithm, in terms of maximizing the conditional mutual information (CMI) between the channel vector and the received signal. The optimization problem, given the training length $\tau$ and total transmit power $P$, is expressed as⁴

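$$\begin{aligned}
\text{maximize:} \quad & \sum_{n=1}^{N} \log\det\left( \mathbf{I} + \mathbf{X}_{\tau}^{\dagger}\mathbf{R}_n\mathbf{X}_{\tau} \right) \\
\text{s.t.} \quad & \mathrm{tr}\left[ \mathbf{X}_{\tau}\mathbf{X}_{\tau}^{\dagger} \right] \le \tau P,
\end{aligned} \tag{14}$$

and we have the following theorem.

Theorem 1: The training sequences that maximize the CMI satisfy the following condition

$$\sum_{n=1}^{N} \left[ \mathbf{R}_n\mathbf{X}_{\mathrm{opt}} \left( \mathbf{I}_{\tau} + \mathbf{X}_{\mathrm{opt}}^{\dagger}\mathbf{R}_n\mathbf{X}_{\mathrm{opt}} \right)^{-1} \right] = \lambda\mathbf{X}_{\mathrm{opt}}, \tag{15}$$

where $\lambda \ge 0$ is a constant chosen to satisfy the power constraint.

Proof: The proof is straightforward by deriving the Karush-Kuhn-Tucker (KKT) conditions of the Lagrangian dual problem [27] of (14).

³ The matrix inversion lemma states $(\mathbf{A} + \mathbf{U}\mathbf{C}\mathbf{V})^{-1} = \mathbf{A}^{-1} - \mathbf{A}^{-1}\mathbf{U}\left( \mathbf{C}^{-1} + \mathbf{V}\mathbf{A}^{-1}\mathbf{U} \right)^{-1}\mathbf{V}\mathbf{A}^{-1}$, where $\mathbf{A}$, $\mathbf{U}$, $\mathbf{C}$, $\mathbf{V}$ are all matrices with correct sizes.

⁴ Notice that (14) is based on the long-term statistics, i.e., CCMs, instead of directly on instantaneous CSIT. We stress that it is impossible to base the optimization of the training sequences on any knowledge of instantaneous CSIT, which varies in time, due to causality.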

Remark 1: Unfortunately, in general, the problem in (14) is not a convex problem. Consider the special case where $N = 1$ and $\mathbf{R}_1$ is rank-deficient; then any $\mathbf{X}_{\tau}$ satisfying

$$\mathbf{X}_{\tau} = [\mathbf{x}_0, \mathbf{x}_0, \ldots, \mathbf{x}_0], \tag{16}$$

where $\mathbf{x}_0$ is a singular vector of $\mathbf{R}_1$ corresponding to the singular value 0, is a solution of (15) when $\lambda = 0$. Therefore, there are multiple sequences that satisfy the KKT condition in (15), and clearly none of those satisfying (16) is the optimal solution, since plugging (16) into (14) yields an objective of zero. To obtain an improved performance, we develop a heuristic iterative algorithm which is based on the condition in (15) to find the optimized training sequences. Based on the simulation results, the algorithm performs fairly well and converges fast.
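The degenerate stationary point described in Remark 1 can be reproduced with a small numerical example. The sketch below uses a hypothetical rank-one CCM with $M = 3$ (an assumption, chosen so that the null space is easy to write down), builds a training matrix of the form (16) from a null-space vector, and compares it with a training matrix aligned with the range of $\mathbf{R}_1$. The function names `cmi_objective` and `kkt_lhs` are illustrative.

```python
import numpy as np

def cmi_objective(X, R_list):
    """Objective of (14): sum over users of log2 det(I + X^H R_n X)."""
    tau = X.shape[1]
    total = 0.0
    for R in R_list:
        _, logabsdet = np.linalg.slogdet(np.eye(tau) + X.conj().T @ R @ X)
        total += logabsdet / np.log(2)
    return total

def kkt_lhs(X, R_list):
    """Left-hand side of the stationarity condition (15)."""
    tau = X.shape[1]
    return sum(R @ X @ np.linalg.inv(np.eye(tau) + X.conj().T @ R @ X)
               for R in R_list)

# Hypothetical rank-one CCM R_1 = u u^H (assumption for illustration).
u = np.array([[1.0], [1.0], [0.0]], dtype=complex) / np.sqrt(2)
R1 = u @ u.conj().T
x0 = np.array([[1.0], [-1.0], [0.0]], dtype=complex) / np.sqrt(2)  # R1 @ x0 = 0

tau, P = 2, 1.0
X_null = np.sqrt(P) * np.hstack([x0] * tau)                   # form (16)
X_good = np.sqrt(tau * P) * np.hstack([u, np.zeros_like(u)])  # aligned with R1

obj_null = cmi_objective(X_null, [R1])
obj_good = cmi_objective(X_good, [R1])
```

Both matrices meet the power constraint with equality. `X_null` satisfies (15) with $\lambda = 0$ but carries no information about the channel, whereas `X_good` gives a strictly positive objective.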

Remark 2: Observing the condition in (15), one can immediately infer that when $N = 1$, the optimal training sequences developed in [26] based on identical CCMs, which contain the singular vectors of the CCM with optimal power allocation given by the water-filling solution, satisfy (15), i.e., the one with identical CCMs is a special case of our problem.

Remark 3: The reason that we set the objective to be maximizing the CMI, rather than minimizing the total MSE, is that the algorithm based on minimizing the MSE does not converge. This non-convergent behavior is the result of the ill-conditioned matrices involved in computing the KKT conditions in the MSE problem. Consider the derivative of the MSE

$$\frac{\partial\,\mathrm{MSE}}{\partial \mathbf{X}_{\tau}} = -2 \sum_{n=1}^{N} \left( \mathbf{R}_n^{-1} + \mathbf{X}_{\tau}\mathbf{X}_{\tau}^{\dagger} \right)^{-2} \mathbf{X}_{\tau}. \tag{17}$$

The matrix $\left( \mathbf{R}_n^{-1} + \mathbf{X}_{\tau}\mathbf{X}_{\tau}^{\dagger} \right)$ is often ill-conditioned when $\mathbf{R}_n$ is rank-deficient, whereas in the CMI problem, the matrices involved are all well-conditioned. Moreover, based on [28], the MMSE and the mutual information are closely related, and the numerical results show that the obtained training sequences have very good MSE performance.

The iterative algorithm, which aims to find the optimum training sequences based on the first-order KKT condition in (15), is specified as follows.

• Step 1) Initialization:

$$i = 1, \qquad \mathbf{X}_{\tau}^{(1)} = \mathbf{X}_{\tau}^{(0)}; \tag{18}$$

• Step 2) Iteration:

$$i \ge 2, \qquad \mathbf{X}_{\tau}^{(i)} = \sum_{n=1}^{N} \mathbf{R}_n\mathbf{X}_{\tau}^{(i-1)} \left( \mathbf{I}_{\tau} + \left( \mathbf{X}_{\tau}^{(i-1)} \right)^{\dagger}\mathbf{R}_n\mathbf{X}_{\tau}^{(i-1)} \right)^{-1}. \tag{19}$$

Then apply the power normalization, where

$$\mathbf{X}_{\tau}^{(i)} \leftarrow \sqrt{\frac{\tau P}{\mathrm{tr}\left[ \mathbf{X}_{\tau}^{(i)} \left( \mathbf{X}_{\tau}^{(i)} \right)^{\dagger} \right]}}\; \mathbf{X}_{\tau}^{(i)}. \tag{20}$$

If $\| \mathbf{X}_{\tau}^{(i)} - \mathbf{X}_{\tau}^{(i-1)} \|_2 < \epsilon$, where $\| \cdot \|_2$ denotes the spectral norm, the algorithm terminates and the output is $\mathbf{X}_{\tau}^{(i)}$. Otherwise, increment $i$ and repeat Step 2.
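The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two toy CCMs (exponential correlation with different phase progressions), the initialization with the first $\tau$ columns of the identity matrix, the iteration cap, and all function names are hypothetical. The normalization scales the iterate so that $\mathrm{tr}[\mathbf{X}_{\tau}\mathbf{X}_{\tau}^{\dagger}] = \tau P$ holds exactly, and convergence is not guaranteed in general, so the loop is capped.

```python
import numpy as np

def cmi_objective(X, R_list):
    """Objective of (14): sum over users of log2 det(I + X^H R_n X)."""
    tau = X.shape[1]
    return sum(np.linalg.slogdet(np.eye(tau) + X.conj().T @ R @ X)[1]
               for R in R_list) / np.log(2)

def normalize_power(X, tau, P):
    """Scale X so that tr[X X^H] = tau * P (power normalization step)."""
    return X * np.sqrt(tau * P / np.trace(X @ X.conj().T).real)

def optimize_training(R_list, X0, P, eps=1e-6, max_iter=1000):
    """KKT-based fixed-point iteration built on (19) and the normalization step."""
    tau = X0.shape[1]
    X = normalize_power(X0, tau, P)
    for _ in range(max_iter):  # cap added here as a safeguard (assumption)
        X_new = sum(R @ X @ np.linalg.inv(np.eye(tau) + X.conj().T @ R @ X)
                    for R in R_list)
        X_new = normalize_power(X_new, tau, P)
        converged = np.linalg.norm(X_new - X, 2) < eps  # spectral norm
        X = X_new
        if converged:
            break
    return X

def toy_ccm(M, rho, theta):
    """Hypothetical Hermitian PSD CCM: rho^|i-j| * exp(1j * theta * (i - j))."""
    d = np.arange(M)[:, None] - np.arange(M)[None, :]
    return rho ** np.abs(d) * np.exp(1j * theta * d)

M, tau, P = 8, 3, 10.0
R_list = [toy_ccm(M, 0.9, 0.0), toy_ccm(M, 0.9, 1.0)]  # two users, distinct CCMs
X0 = np.eye(M, tau, dtype=complex)                     # nonzero initial point
X_opt = optimize_training(R_list, X0, P)
rate = cmi_objective(X_opt, R_list)
```

Comparing `rate` with `cmi_objective(normalize_power(X0, tau, P), R_list)` gives a quick indication of how much the iteration gains over the initial point for a given pair of CCMs.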

Remark 4: Such a KKT-based iteration is a canonical method to get convergent schemes that yield a local optimum of the objective function, which aims at solving a non-convex problem. KKT-based iterations have been proposed in different contexts, e.g., [29].

Remark 5: Notice that $\mathbf{X}_{\tau}^{(0)} \ne \mathbf{0}$, otherwise the algorithm would be stuck at zero. In our simulations, letting $\mathbf{X}_{\tau}^{(0)}$ have orthogonal rows works well. Also notice that in the algorithm, we normalize the power of the training signals to be equal to the power constraint, since the optimal solution clearly satisfies the power constraint with equality.

B. Uplink CSIT Feedback

After the users estimate their respective channel coefficients based on received channel training signals, they feed back their estimates using predefined codebooks. In this subsection, efficient channel feedback codebooks are designed with spatially correlated channels. We first propose the entropy coded scalar quantization after KL transform, which is a simple way to universally approach the optimal VQ performance. Then we compare its performance with two VQ approaches, which are shown to be near-optimal with spatially correlated channels and also serve as two implementation options.

1) Entropy Coded Scalar Quantization: We consider a scalar quantization method (component by component) of the KL-transformed channel vector, denoted by $\hat{\mathbf{h}}_n^{\mathrm{KL}}$. Specifically, denote

$$\hat{\mathbf{h}}_n^{\mathrm{KL}} = \mathbf{U}_n^{\dagger}\hat{\mathbf{h}}_n = \boldsymbol{\Sigma}_n^{\frac{1}{2}}\mathbf{z}_n - \mathbf{U}_n^{\dagger}\mathbf{e}_n \tag{21}$$

as the KL transform of the channel vector of user $n$ after channel training, where $\hat{\mathbf{h}}_n$ is the MMSE channel estimate after channel training and $\mathbf{U}_n$ is the left singular matrix of $\mathbf{R}_n$. Putting aside the channel training error $\mathbf{e}_n$ for ease of exposition⁵, this yields $M$ mutually independent Gaussian variables with non-identical variances. The reverse water-filling (RWF) approach [30] can be implemented to achieve the rate-distortion function (in terms of MSE distortion) in this scenario, i.e., we allocate the quantization bits according to the following conditions

$$\sum_{i=1}^{M} \min\left[ \nu_n, \lambda_i^{(n)} \right] = D, \qquad R_i = \log\left( \frac{\lambda_i^{(n)}}{\nu_n} \right), \qquad \sum_{i=1}^{M} R_i = B_n, \tag{22}$$

where $D$ is the total MSE distortion, $R_i$ denotes the number of bits allocated to the $i$-th component of $\hat{\mathbf{h}}_n^{\mathrm{KL}}$ (components whose variance does not exceed the water level receive no bits), $B_n$ is the total number of feedback bits for user $n$, and $0 \le \nu_n \le \max\{\lambda_i^{(n)}, \forall i\}$ denotes the water level. The MSE distortion for the $i$-th component is the minimum of $\{\nu_n, \lambda_i^{(n)}\}$, i.e.,

$$D_i^{(n)} = \min\left[ \nu_n, \lambda_i^{(n)} \right]. \tag{23}$$

⁵ Normally the channel training error is small, therefore we ignore it when designing feedback codebooks.

After the BS recovers the KL-transformed channel vector from the user feedback, it can reconstruct the channel vector by left multiplying $U_n$. By this scheme, we obtain the relationship between the channel estimate at the BS side, denoted by $\hat{h}_n$, and the real channel,

$$h_n = \hat{h}_n + \underbrace{e_n + U_n \hat{e}_n}_{\varepsilon_n}, \tag{24}$$

$$\mathrm{Cov}(\varepsilon_n) = \underbrace{C_{e_n}}_{M_1} + \underbrace{U_n D_n U_n^{\dagger}}_{M_2}, \tag{25}$$

$$\hat{R}_n \triangleq \mathrm{Cov}(\hat{h}_n) = R_n - \mathrm{Cov}(\varepsilon_n), \tag{26}$$

where $C_{e_n}$ is defined in (12), $D_n \triangleq \mathrm{diag}\left[D_1^{(n)}, D_2^{(n)}, \ldots, D_M^{(n)}\right]$, and $\hat{e}_n$ is the feedback quantization error. In the error covariance matrix in (25), $M_1$ and $M_2$ represent the channel estimation error covariance due to imperfect channel training and the CSI quantization error covariance, respectively.
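The bit allocation in (22) and (23) can be computed numerically. The following is a minimal sketch rather than the authors' implementation: it assumes base-2 logarithms (so that rates are in bits), sets the rate of any component below the water level to zero, and finds the water level by bisection. The function name `reverse_water_filling` and its arguments are illustrative.

```python
import numpy as np

def reverse_water_filling(eigvals, total_bits, iters=200):
    """Return (nu, rates, distortions) for the allocation in (22)-(23).

    Assumptions (not from the paper): log base 2, rates clipped at zero,
    water level found by bisection on [0, max eigenvalue].
    """
    lam = np.asarray(eigvals, dtype=float)
    lo, hi = 0.0, lam.max()
    for _ in range(iters):
        nu = 0.5 * (lo + hi)
        # max(lam, nu) / nu equals lam / nu above the water level and 1 below it
        bits = np.sum(np.log2(np.maximum(lam, nu) / nu))
        if bits > total_bits:
            lo = nu   # too many bits spent: raise the water level
        else:
            hi = nu   # budget not exceeded: try a lower water level
    nu = hi           # hi always satisfies the bit budget
    rates = np.log2(np.maximum(lam, nu) / nu)
    distortions = np.minimum(nu, lam)
    return nu, rates, distortions
```

For example, with eigenvalues (4, 1, 0.25) and a budget of 4 bits, the water level is 0.5, the rates are (3, 1, 0) bits, and the total distortion is 1.25; the weakest component receives no bits and contributes its full variance to the distortion.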

Remark 6: There are several approaches to mimic such behavior using a scalar quantizer, e.g., apply uniform quantization levels and encode the quantization points with a Huffman code for each of the components with $\lambda_i^{(n)} > \nu_n$, based on the fact that the component is Gaussian distributed with variance $\lambda_i^{(n)}$. The advantage of this quantizer is that it does not involve any VQ, and thus can be implemented very efficiently in parallel. Notice also that when $U_n$ is a slice of a Discrete Fourier Transform (DFT) matrix (as in large linear antenna arrays), the KL-transform can be well implemented by a Fast Fourier Transform (FFT); therefore the overall quantization can be made extremely computationally efficient. The quantization error performance and a comparison with VQ approaches are shown in Section V-D.

Remark 7: It should be noted that although the SQ-RWF quantizer quantizes the instantaneous channel vector, the parameters describing the SQ-RWF quantizer, including the KL-transformation, the bit allocation information (the number of allocated bits per selected channel entry), etc., are solely determined by the second-order statistics, i.e., the channel correlation matrices, which are slowly varying. Therefore it is not a significant overhead to exchange the description of the SQ-RWF quantizer, since it needs to be communicated only at the rate at which the statistics change significantly.

2) VQ: Isotropical and Skewed Random Codebooks: In the literature, extensive work has been done regarding the VQ feedback codebook design with spatial CCMs. It is well understood that in the asymptotic regime where the number of feedback bits $B$ goes to infinity, the quantization MSE scales down with $B$ as $\mathrm{MSE} \sim 2^{-\frac{B}{M-1}}$, regardless of whether the channel distribution is i.i.d. or correlated [31], [32]. However, when the number of feedback bits $B$ is limited, which is the case for FDD massive MIMO systems due to scarce channel estimation resources, the exact analysis of the quantization MSE performance is unavailable. In [33], a "skewed codebook" (i.e., a codebook based on skewing an isotropical codebook) that matches the eigenspace of the CCM is shown to be close to optimal by simulation results. The authors of [31] try to derive closed-form expressions for the SNR loss for general skewed codebooks, but the expressions are too complicated to find the optimal skew matrix in closed form. Notwithstanding the difficulty in deriving the optimal codebook in closed form, the Lloyd algorithm can be implemented to find the optimal codebook, albeit with high computational complexity [34].

Observing that the CCMs of the users are usually rank-deficient, in the sense that a number of singular values of the CCMs are extremely small (see numerical results in Section V for eigenvalue distributions in popular channel models), it is advantageous for the users to compress their feedback overhead by only feeding back along the order-$r_n$ dominant eigenspace of the channel, i.e., an $r_n$-CVA in (6). It will be shown later that this scheme performs better than feeding back all the channel space when $B$ is finite. Specifically, we consider two kinds of feedback schemes, both of which concentrate the feedback bits in the dominant eigenspace of the channels: one of them leverages an isotropical random vector to quantize the dominant eigenspace, while the other explores the benefit of a skewed codebook design.

a) Isotropical Quantization in Dominant Eigenspace: First, the n-th user decorrelates the channel vector leveraging the $r_n$-DER of the CCM,

$$z_n^{(r_n)} = \left(\Sigma_n^{(r_n)}\right)^{-1/2} \left(U_n^{(r_n)}\right)^{\dagger} h_n. \tag{27}$$

Notice that assuming the $r_n$-CVA is accurate and the channel training is perfect, namely $h_n = h_n^{(r_n)}$, then $z_n^{(r_n)}$ has $r_n$ independent Gaussian distributed unit-variance entries. Based on this observation, we then use a predefined isotropical codebook to quantize $z_n^{(r_n)}$. After the feedback, the BS obtains a quantized version of the channel estimate, after multiplying by the channel correlation eigenvectors,

$$\hat{h}_n = U_n^{(r_n)} \left(\Sigma_n^{(r_n)}\right)^{1/2} \hat{z}_n^{(r_n)}, \tag{28}$$

where $\hat{z}_n^{(r_n)}$ denotes the quantized version of $z_n^{(r_n)}$ at the BS side, with quantization error $\hat{e}_n$ satisfying

$$z_n^{(r_n)} = \hat{z}_n^{(r_n)} + \hat{e}_n. \tag{29}$$

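The quantization error $\hat{e}_n$ can be computed based on [32], where random vector quantization (RVQ) is assumed, by which the codebook is obtained by generating $2^{B_n}$ quantization vectors independently and uniformly distributed on the unit sphere in $\mathbb{C}^{r_n}$. The quantization error $\hat{e}_n$ is i.i.d. and independent of $\hat{z}_n^{(r_n)}$. It follows that

$$\mathrm{Cov}(\hat{e}_n) = \frac{2^{-\frac{B_n}{r_n-1}}}{r_n} \beta I_{r_n}, \tag{30}$$

where

$$\beta = \mathrm{tr}\left[z_n^{(r_n)} \left(z_n^{(r_n)}\right)^{\dagger}\right] = \mathrm{tr}\left[I_{r_n} - \left(\Sigma_n^{(r_n)}\right)^{-1/2} \left(U_n^{(r_n)}\right)^{\dagger} C_{e_n} U_n^{(r_n)} \left(\Sigma_n^{(r_n)}\right)^{-1/2}\right]. \tag{31}$$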

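Combining (12), (27), (28), (30) and the $r_n$-CVA in (7), we obtain the relationship between the channel estimate at the BS side and the real channel, i.e.,

$$h_n = \hat{h}_n + \underbrace{U_n^{(r_n)} \left(U_n^{(r_n)}\right)^{\dagger} e_n + U_n^{(r_n)} \left(\Sigma_n^{(r_n)}\right)^{1/2} \hat{e}_n + e_n^{(r_n)}}_{\varepsilon_n}, \tag{32}$$

$$\mathrm{Cov}(\varepsilon_n) = \underbrace{U_n^{(r_n)} \left(U_n^{(r_n)}\right)^{\dagger} C_{e_n} U_n^{(r_n)} \left(U_n^{(r_n)}\right)^{\dagger}}_{M_1} + \underbrace{\frac{2^{-\frac{B_n}{r_n-1}}}{r_n} \beta R_n^{(r_n)}}_{M_2} + \underbrace{U_n \Sigma_n U_n^{\dagger}}_{M_3}, \tag{33}$$

$$\hat{R}_n \triangleq \mathrm{Cov}(\hat{h}_n) = R_n - \mathrm{Cov}(\varepsilon_n), \tag{34}$$

where $C_{e_n}$ is defined in (12), and $U_n$, $\Sigma_n$ are defined in (8). In the error covariance matrix in (33), $M_1$, $M_2$ and $M_3$ represent the channel estimation error due to imperfect channel training, the CSI quantization error, and the error from only feeding back the order-$r_n$ dominant eigenspace of the channel vectors, respectively.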

b) Skewed Codebook Quantization in Dominant Eigenspace: Although we concentrate our feedback bits in the dominant eigenspace based on the isotropical dominant codebook design in the preceding subsection, there is still imbalance among the singular values of the CCMs, rendering the isotropical RVQ codebook described above suboptimal. To this end, we adopt a skewed codebook

$$\mathcal{C}_{\mathrm{sk}} = \left\{ \frac{A_n^{1/2} f_i}{\left\| A_n^{1/2} f_i \right\|},\ i = 1, \ldots, 2^{B} \right\}, \tag{35}$$

where $f_i \in \mathbb{C}^{r_n}$ is isotropically distributed on the unit sphere, and $A_n = U_n^{(r_n)} \left(\Sigma_n^{(r_n)}\right)^{1/2}$. It is clear that by design we only feed back the dominant $r_n$ eigenmodes of the channel, i.e., $h_n^{(r_n)}$, neglecting the remaining eigenmodes. The skewing matrix is designed to match the dominant eigenspace of the channel, such that the correlation matrix of the codebook is identical to the $r_n$-DER. By adopting this codebook design, the total quantization error, which is defined as

$$\mathrm{MSE}_q = \mathrm{tr}\left[\mathbb{E}\left(\hat{e}_n^{\dagger} \hat{e}_n\right)\right], \tag{36}$$

can be upper bounded based on the following theorem.


Theorem 2: Given a channel vector $h_n$, the quantization error based on the skewed codebook defined in (35) is upper bounded as

$$\mathrm{MSE}_q \le \sum_{i=1}^{r_n} \frac{\left(\lambda_i^{(n)}\right)^2}{\lambda_1^{(n)}} 2^{-\frac{B_n}{r_n-1}} + \mathrm{tr}\left[U_n \Sigma_n U_n^{\dagger}\right]. \tag{37}$$

Proof: The proof is based upon the distribution results developed for the i.i.d. channels in [32]. The detailed proof is in Appendix A.
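To make the codebook in (35) concrete, the following sketch draws a random skewed codebook and picks a codeword for a given vector. It is only an illustration under stated assumptions: the skewing matrix is taken to be $U_n^{(r_n)} (\Sigma_n^{(r_n)})^{1/2}$, the isotropic vectors are obtained by normalizing complex Gaussian draws, and the selection rule (largest inner-product magnitude) is a common choice that the text does not specify. The names `skewed_codebook` and `nearest_codeword` are hypothetical.

```python
import numpy as np

def skewed_codebook(U_dom, sigma_dom, bits, rng):
    """Draw 2**bits unit-norm codewords A f / ||A f|| in the spirit of (35).

    U_dom: (M, r) dominant eigenvectors; sigma_dom: (r,) dominant eigenvalues.
    Assumption: the skewing matrix is U_dom @ diag(sqrt(sigma_dom)).
    """
    r = U_dom.shape[1]
    size = (r, 2 ** bits)
    # Normalized complex Gaussian vectors are isotropic on the unit sphere
    f = rng.standard_normal(size) + 1j * rng.standard_normal(size)
    f /= np.linalg.norm(f, axis=0)
    skewed = (U_dom * np.sqrt(sigma_dom)) @ f   # scales column k of U_dom by sqrt(sigma_k)
    return skewed / np.linalg.norm(skewed, axis=0)

def nearest_codeword(h, codebook):
    """Index of the codeword with the largest |c^H h| (an assumed selection rule)."""
    return int(np.argmax(np.abs(codebook.conj().T @ h)))
```

Every codeword lies in the span of the dominant eigenvectors, which reflects the design choice of spending all feedback bits on the dominant eigenspace.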

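Remark 8: It is clear that the first and second terms in (37) represent the error resulting from quantizing the channel and from neglecting the subdominant eigenmodes of the channel, respectively. Also notice that the quantization MSE of the skewed codebook is smaller than that of the isotropical codebook,

$$\mathrm{MSE}_{q,\mathrm{iid}} = \mathrm{tr}\left[2^{-\frac{B_n}{r_n-1}} R_n^{(r_n)} + U_n \Sigma_n U_n^{\dagger}\right] \tag{38}$$

$$= \sum_{i=1}^{r_n} \lambda_i^{(n)} 2^{-\frac{B_n}{r_n-1}} + \mathrm{tr}\left[U_n \Sigma_n U_n^{\dagger}\right] \tag{39}$$

$$\ge \sum_{i=1}^{r_n} \frac{\left(\lambda_i^{(n)}\right)^2}{\lambda_1^{(n)}} 2^{-\frac{B_n}{r_n-1}} + \mathrm{tr}\left[U_n \Sigma_n U_n^{\dagger}\right] = \mathrm{MSE}_{q,\mathrm{sk}}, \tag{40}$$

where (38) stems from (33) with $\beta = r_n$ for fair comparison, since we assume that the channel vector to be quantized in the derivation of Theorem 2 has unit-variance entries. The equality holds if and only if $\lambda_1^{(n)} = \lambda_2^{(n)} = \ldots = \lambda_{r_n}^{(n)}$.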

Remark 9: Notice that the quantization error in Theorem 2 does not scale to zero with $B$. The explanation is that when $B$ is large, it is better to quantize all the channel eigenmodes instead of neglecting the sub-dominant modes, i.e., $r_n = M$. Thus the quantization error with the optimal $r_n$, which minimizes the quantization error, scales to zero as $B$ goes to infinity. Meanwhile, the bound in Theorem 2 is tighter than the one with $r_n$ fixed to be $M$ when $B$ is finite. The numerical results in Section V agree with our analysis.

Remark 10: Notice that the dominant rank $r_n$, i.e., the order of the CVA we choose to approximate the correlated channels, plays an important role in the feedback scheme. The larger $r_n$ is, the more accurately the CVA approximates the correlated channels, whereas the feedback quantization error also grows due to the increased quantization dimension. Therefore, there exists a tradeoff in terms of the dominant rank $r_n$. The optimal $r_n$ can be determined by a simple one-dimensional search over $1, \ldots, M$, performed by the n-th user.
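The one-dimensional search in Remark 10 can be sketched by evaluating the bound of Theorem 2 for each candidate rank. This is an illustrative sketch with two assumptions that are not stated in the text: the trace term in (37) is taken to be the sum of the neglected (non-dominant) eigenvalues, and the search starts at $r_n = 2$ because the exponent $B_n/(r_n-1)$ is undefined for $r_n = 1$. The function names are hypothetical.

```python
import numpy as np

def theorem2_bound(eigvals, r, bits):
    """Evaluate the right-hand side of (37) for dominant rank r >= 2.

    Assumption: the trace term equals the sum of the neglected eigenvalues.
    """
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    dominant, neglected = lam[:r], lam[r:]
    quant = np.sum(dominant ** 2) / dominant[0] * 2.0 ** (-bits / (r - 1))
    return quant + np.sum(neglected)

def best_dominant_rank(eigvals, bits):
    """Return the rank in 2..M that minimizes the bound in (37)."""
    ranks = range(2, len(eigvals) + 1)
    bounds = [theorem2_bound(eigvals, r, bits) for r in ranks]
    return ranks[int(np.argmin(bounds))]
```

With geometrically decaying eigenvalues (8, 4, ..., 0.0625), a very small budget favors a small rank, while a very large budget pushes the minimizer to $r_n = M$, in line with Remark 9.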

C. Data Transmission

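For fair comparisons, and in line with the work in [7] and [35], we consider the RZF linear precoder. The precoder treats the channel estimates as the real channel coefficients. The corresponding achievable rates, accounting for the imperfect channel estimates, are computed in the following section. Define

$$K_{\mathrm{rzf}} = \left(\hat{H}^{\dagger} \hat{H} + M \alpha I_M\right)^{-1}. \tag{41}$$

The RZF precoding matrix is expressed as

$$W_{\mathrm{rzf}} = \zeta K_{\mathrm{rzf}} \hat{H}^{\dagger}, \tag{42}$$

where $\hat{H} = \left[\hat{h}_1, \hat{h}_2, \ldots, \hat{h}_N\right]^{\dagger}$, $\zeta$ is a normalization scalar to fulfill the power constraint in (2), and $\alpha$ is the regularization factor [36]. Based on (2), we obtain

$$\zeta^2 = \frac{N}{\mathrm{tr}\left[\hat{H} K_{\mathrm{rzf}}^2 \hat{H}^{\dagger}\right]}, \tag{43}$$

where equal power allocation is assumed, i.e., $\left[\mathbb{E}[s s^{\dagger}]\right]_{i,i} = \frac{P}{N}$. The signal-to-interference-and-noise ratio (SINR) of user n is

$$\gamma_{n,\mathrm{rzf}} = \frac{\left|\hat{h}_n^{\dagger} K_{\mathrm{rzf}} \hat{h}_n\right|^2}{\frac{N}{P \zeta^2} + \left|\varepsilon_n^{\dagger} K_{\mathrm{rzf}} \hat{h}_n\right|^2 + h_n^{\dagger} K_{\mathrm{rzf}} \hat{H}_{[n]}^{\dagger} \hat{H}_{[n]} K_{\mathrm{rzf}} h_n}, \tag{44}$$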

where $\hat{H}_{[n]} = \left[\hat{h}_1, \ldots, \hat{h}_{n-1}, \hat{h}_{n+1}, \ldots, \hat{h}_N\right]^{\dagger}$. The training dimensionality loss is the length of the training sequence $\tau$. Assuming the feedback bits are transmitted over the uplink MIMO multiple-access channel (MIMO-MAC), and based on [25], the total number of feedback channel uses is computed as

$$\delta = \frac{\sum_{n=1}^{N} B_n}{C_{\mathrm{MIMO\text{-}MAC}}}. \tag{45}$$

For ease of exposition, we assume $B_n = B$, $\forall n$, and

$$C_{\mathrm{MIMO\text{-}MAC}} = \kappa \min[M, N] \log(M\, \mathrm{SNR}_{\mathrm{ul}}), \tag{46}$$

where $\kappa \in (0, 1)$ is a scalar representing the diversity-multiplexing tradeoff in the MIMO-MAC as defined in [25], and $\mathrm{SNR}_{\mathrm{ul}}$ is the uplink SNR. The achievable sum rate considering imperfect channel training and feedback is expressed as the solution of the following optimization problem

$$\text{maximize: } \left(1 - \frac{\tau + \delta}{T}\right) \sum_{n=1}^{N} \log\left(1 + \gamma_{n,\mathrm{rzf}}\right) \quad \text{s.t. } 1 \le \tau + \delta \le T,\ \tau \ge 1,\ \delta \ge 1, \tag{47}$$

where the optimization is over the training and feedback lengths. The fundamental tradeoff is that larger training and feedback lengths provide a more accurate channel estimate but result in a larger dimensionality loss. Since our focus is on the performance of the downlink BC achievable rates with spatially correlated channels, we use an exhaustive search to find the optimal training and feedback lengths. The analysis of the optimal training and feedback lengths for i.i.d. channels can be found in [35] and [37].
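As an illustration of (41)-(44), the sketch below builds the RZF precoder from a matrix of channel estimates and evaluates the SINR expression for one user. It is a minimal sketch under assumptions: channels are stored as columns of an $M \times N$ array, the noise variance is normalized to one (consistent with the $N/(P\zeta^2)$ term), and the normalization in (43) is checked only through the identity $\mathrm{tr}(W W^{\dagger}) = N$ that it implies. The function names are illustrative.

```python
import numpy as np

def rzf_precoder(h_hat, alpha):
    """Return (W, K, zeta2) following (41)-(43).

    h_hat: (M, N) array whose n-th column is the channel estimate of user n.
    """
    M, N = h_hat.shape
    H = h_hat.conj().T                                    # N x M, rows are h_hat_n^H
    K = np.linalg.inv(H.conj().T @ H + M * alpha * np.eye(M))
    zeta2 = N / np.trace(H @ K @ K @ H.conj().T).real
    W = np.sqrt(zeta2) * K @ H.conj().T                   # M x N precoding matrix
    return W, K, zeta2

def sinr_rzf(h, h_hat, K, zeta2, P, n):
    """Evaluate (44) for user n, given true channels h and estimates h_hat."""
    N = h.shape[1]
    eps = h[:, n] - h_hat[:, n]                           # total estimation error
    signal = abs(np.vdot(h_hat[:, n], K @ h_hat[:, n])) ** 2
    error = abs(np.vdot(eps, K @ h_hat[:, n])) ** 2
    H_others = np.delete(h_hat, n, axis=1).conj().T       # estimates of the other users
    interference = np.linalg.norm(H_others @ K @ h[:, n]) ** 2
    return signal / (N / (P * zeta2) + error + interference)
```

The objective in (47) can then be evaluated for each candidate pair of training and feedback lengths by plugging the resulting SINRs into the pre-log-weighted sum rate.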

IV. PERFORMANCE ANALYSIS

In this section, we provide expressions for the downlink achievable sum rate under the per-user CCMs, leveraging the deterministic equivalent techniques provided in [35], with necessary modifications.

Following the approach in [35], when $M$ goes to infinity, the SINR of user n, i.e., $\gamma_{n,\mathrm{rzf}}$, satisfies

$$\gamma_{n,\mathrm{rzf}} - \gamma_{n,\mathrm{rzf}}^{\circ} \xrightarrow{M \to \infty} 0 \quad \text{with probability 1}, \tag{48}$$

where $\gamma_{n,\mathrm{rzf}}^{\circ}$ is a deterministic quantity that can be computed as

$$\gamma_{n,\mathrm{rzf}}^{\circ} = \frac{\dfrac{(\hat{e}_n^{\circ})^2}{(1 + \hat{e}_n^{\circ})^2}}{\dfrac{\varphi^{\circ}}{P} + \hat{E}_n^{\circ} + I_n^{\circ}}, \tag{49}$$

where the parameters involved are specified in (50)-(62). The derivation is mostly based upon [35], with generalizations to uncorrelated channel estimation error matrices. The details are omitted for brevity.

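$$\varphi^{\circ} = \frac{1}{M} \sum_{n=1}^{N} \frac{\hat{e}_n^{\circ\prime}}{(1 + \hat{e}_n^{\circ})^2}, \tag{50}$$

$$\hat{e}_n^{\circ} = \frac{1}{M} \mathrm{tr}\left[\hat{R}_n T\right], \tag{51}$$

$$T = \left(\frac{1}{M} \sum_{j=1}^{N} \frac{\hat{R}_j}{1 + \hat{e}_j^{\circ}} + \alpha I_M\right)^{-1}, \tag{52}$$

$$\hat{e}^{\circ\prime} = \left[\hat{e}_1^{\circ\prime}, \hat{e}_2^{\circ\prime}, \ldots, \hat{e}_N^{\circ\prime}\right]^{T} = (I_N - J)^{-1} v, \tag{53}$$

$$[J]_{i,j} = \frac{1}{M} \frac{\frac{1}{M} \mathrm{tr}\left[\hat{R}_i T \hat{R}_j T\right]}{(1 + \hat{e}_j^{\circ})^2}, \tag{54}$$

$$v = \frac{1}{M} \left[\mathrm{tr}\left(\hat{R}_1 T^2\right), \mathrm{tr}\left(\hat{R}_2 T^2\right), \ldots, \mathrm{tr}\left(\hat{R}_N T^2\right)\right]^{T}, \tag{55}$$

$$\hat{E}_n^{\circ} = \frac{d_{n,n}^{\circ}}{M (1 + \hat{e}_n^{\circ})^2}, \tag{56}$$

$$d_n^{\circ} = \left[d_{n,1}^{\circ}, d_{n,2}^{\circ}, \ldots, d_{n,N}^{\circ}\right]^{T} = (I_N - J)^{-1} b_n, \tag{57}$$

$$b_n = \frac{1}{M} \left[\mathrm{tr}\left(\hat{R}_1 T (R_n - \hat{R}_n) T\right), \ldots, \mathrm{tr}\left(\hat{R}_N T (R_n - \hat{R}_n) T\right)\right]^{T}, \tag{58}$$

$$I_n^{\circ} = \frac{u_n}{(1 + \hat{e}_n^{\circ})^2} + \sum_{j \ne n}^{N} \frac{d_{n,j}^{\circ}}{M (1 + \hat{e}_j^{\circ})^2}, \tag{59}$$

$$u_n = \frac{1}{M} \sum_{j \ne n}^{N} \frac{f_{n,j}^{\circ}}{(1 + \hat{e}_j^{\circ})^2}, \tag{60}$$

$$f_n^{\circ} = \left[f_{n,1}^{\circ}, \ldots, f_{n,N}^{\circ}\right]^{T} = (I_N - J)^{-1} c_n, \tag{61}$$

$$c_n = \frac{1}{M} \left[\mathrm{tr}\left(\hat{R}_1 T \hat{R}_n T\right), \ldots, \mathrm{tr}\left(\hat{R}_N T \hat{R}_n T\right)\right]^{T}. \tag{62}$$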
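Equations (51) and (52) define $\hat{e}_n^{\circ}$ and $T$ implicitly in terms of each other. A common way to evaluate such coupled equations is a fixed-point iteration, sketched below. The iteration scheme, the zero starting point, the stopping tolerance and the function name are assumptions made for illustration; the text does not state how these quantities are computed.

```python
import numpy as np

def solve_fixed_point(R_hats, alpha, tol=1e-10, max_iter=10000):
    """Iterate (51)-(52) until the values e_n stop changing.

    R_hats: list of N (M x M) covariance matrices of the channel estimates.
    Returns (e, T) with e[n] = tr(R_hats[n] @ T) / M up to the tolerance.
    """
    N = len(R_hats)
    M = R_hats[0].shape[0]

    def T_of(e):
        # Equation (52) evaluated at the current values of e
        S = sum(R / (1.0 + ej) for R, ej in zip(R_hats, e)) / M
        return np.linalg.inv(S + alpha * np.eye(M))

    e = np.zeros(N)
    for _ in range(max_iter):
        T = T_of(e)
        e_new = np.array([np.trace(R @ T).real / M for R in R_hats])  # equation (51)
        converged = np.max(np.abs(e_new - e)) < tol
        e = e_new
        if converged:
            break
    return e, T_of(e)
```

Once $\hat{e}_n^{\circ}$ and $T$ are available, the remaining quantities in (50) and (53)-(62) follow from traces and one $N \times N$ linear solve with $(I_N - J)$.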

V. NUMERICAL RESULTS

In our simulations, we evaluate the FDD massive MIMO achievable rates with various spatially correlated channel models, and compare them with the TDD system, the FDD system with i.i.d. channels, and the JSDM scheme. The azimuth angles of users, i.e., $\theta_n$, are uniformly distributed in $[-60^\circ, 60^\circ]$, unless stated otherwise. We set the tolerance in the iterative algorithm in Section III-A, i.e., $\epsilon$, to be $10^{-6}$. The regularization factor in the RZF precoder is $\alpha = 0.01$.

[Figure 1 plot: CDF (vertical axis) versus singular value (horizontal axis). Curves: D = 0.5λ, ∆ = 10°, OR; D = λ, ∆ = 10°, OR; D = 0.5λ, ∆ = 10°, Lap.; D = λ, ∆ = 10°, Lap.; D = λ, ∆ = 20°, Lap.; D = λ, ∆ = 20°, OR.]

Fig. 1. The CDF of the singular values of user CCMs for various parameters. M = 50, θ = π/2.

[Figure 2 plot: R_rzf (bit/s/Hz, vertical axis) versus M (horizontal axis). Curves: FDD, Cor., AR; FDD, iid., AR; TDD, Cor., AR. Annotations: ∆ = 20°, D = λ, Lap.; ∆ = 10°, D = 0.5λ, OR; Uplink SNR = 20 dB.]

Fig. 2. Achievable sum rates (AR) in massive MIMO systems with i.i.d. channels and per-user correlated channels (Cor.), in TDD mode and FDD mode, respectively. The downlink and uplink SNR are set to 20 dB and 10 dB, respectively, unless labeled otherwise. The channel coherence block length is T = 200. The number of users in the cell is N = 8. The per-user channel correlation matrices are calculated according to (63) and (64).

A. CCMs: One-Ring Model and Laplacian Model

First, we evaluate the singular value distribution of CCMs. In Fig. 1, the cumulative distribution function (CDF) of the singular values of the user CCMs is shown. We adopt two models to calculate the CCM of a uniform linear antenna array. The first one is the one-ring model (OR) [38], based on which

$$[R]_{i,j} = \frac{1}{2\Delta} \int_{-\Delta+\theta}^{\Delta+\theta} e^{-j 2\pi D (i-j) \sin(\alpha)}\, d\alpha, \tag{63}$$

where $\Delta$ denotes the angular spread, $\theta$ denotes the mean user azimuth angle seen from the BS and $D$ is the antenna spacing. Alternatively, the Laplacian angular spectrum model (Lap.) is also considered [24, Section 7.4.2], where

$$[R]_{i,j} = \frac{1}{\sqrt{2}\Delta} \int_{\theta-\pi}^{\theta+\pi} e^{-\frac{\sqrt{2}}{\Delta} |\alpha-\theta| - j 2\pi D (i-j) \sin(\alpha)}\, d\alpha. \tag{64}$$
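The one-ring CCM in (63) can be evaluated numerically, which also makes the rank deficiency discussed around Fig. 1 easy to inspect. The sketch below is illustrative only: it assumes angles in radians and antenna spacing $D$ expressed in wavelengths, approximates the integral with a midpoint rule, and exploits the fact that the entry depends only on $i - j$. The function name and the `num` parameter are not from the paper.

```python
import numpy as np

def one_ring_ccm(M, theta, delta, D, num=4000):
    """Approximate the one-ring CCM of (63) for an M-antenna uniform linear array.

    theta, delta: mean azimuth angle and angular spread in radians.
    D: antenna spacing in wavelengths. num: number of midpoint samples.
    """
    # Midpoint samples of the arrival angle over [theta - delta, theta + delta]
    alphas = theta - delta + (np.arange(num) + 0.5) * (2 * delta / num)
    lags = np.arange(M)
    # r[k] approximates (1 / (2 delta)) * integral of exp(-j 2 pi D k sin(alpha))
    r = np.exp(-2j * np.pi * D * lags[:, None] * np.sin(alphas)).mean(axis=1)
    k = lags[:, None] - lags[None, :]                     # k = i - j
    # Entries with negative lag are the complex conjugates of the positive-lag ones
    return np.where(k >= 0, r[np.abs(k)], np.conj(r[np.abs(k)]))
```

Calling `np.linalg.eigvalsh(one_ring_ccm(20, 0.0, np.deg2rad(10), 0.5))` returns the eigenvalues of such a matrix; for a narrow angular spread only a few of them are non-negligible.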

From Fig. 1, it is observed that the singular values of the CCMs are generally distributed with large deviations under various parameters, i.e., some singular values are large while others are effectively close to zero; we therefore define the effective rank (ER) of the CCM. Generally, the smaller the antenna spacing or the angular spread, the smaller the ER. Note that the ER calculated by the Laplacian model is usually larger than that by the one-ring model, because the one-ring model restricts the direction-of-arrival (DoA) to a finite support. Also note that the number of BS antennas is relevant: it is shown in [7] that the ratio of the ER to M approaches a constant as M goes to infinity. In the following simulations, we evaluate the FDD massive MIMO achievable rates under different parameters and both models depicted in Fig. 1.

B. Comparisons between the proposed scheme in correlated FDD systems with TDD and i.i.d. FDD systems

In the presence of spatially correlated channels, the achievable rates under the proposed scheme are shown in Fig. 2, in comparison with i.i.d. FDD systems and correlated TDD systems. The achievable rates of FDD systems with correlated channels are obtained using the training sequences obtained by the iterative algorithm in Section III-A, and the SQ-RWF feedback codebook design in Section III-B. First, it is noteworthy that in FDD systems, in general, the achievable sum rate is not monotonically increasing with the number of BS antennas, as it is in the TDD system, because when the number of BS antennas grows large in FDD mode, the channel estimation dimensionality loss becomes non-negligible. Therefore, there is a large rate gap between the i.i.d. FDD system and the TDD system, rendering the FDD mode unfavorable for massive MIMO transmission.

Nevertheless, when the channel is spatially correlated, the FDD system achievable sum rate under per-user CCMs is significantly larger than that in i.i.d. channels, especially when the number of BS antennas is large, thanks to the judiciously designed dominant channel estimation schemes. The rate gap between the TDD mode and the FDD mode is narrowed significantly, especially when M is moderate, which suggests that it is promising to exploit the large-system gain even in FDD mode. Note that, while the throughput of TDD systems generally increases with M, the throughput of FDD systems would eventually go down if we further increase M, even with our proposed schemes.

It is shown that the achievable rates of FDD systems are even larger than those of TDD systems (with uplink SNR = 10 dB) under some parameters shown in Fig. 2. This phenomenon is explained by the fact that the uplink SNR is set 10 dB lower than the downlink SNR in the corresponding simulation results, which is typical for a cellular system due to the smaller transmit power of user terminals, rendering the TDD system performance inferior due to the imperfect uplink channel training. Observe that when M becomes larger, the TDD system sum rate grows without bound, eventually surpassing that of the FDD system. Moreover, when the uplink SNR is set to be the same as the downlink SNR (see the corresponding curves), the TDD system performs better, as expected.

[Figure 3 plot: R_rzf (bit/s/Hz, vertical axis) versus M (horizontal axis). Curves: ∆ = 5°, T = 100, JSDM; ∆ = 5°, T = 100, AR; ∆ = 10°, T = 100, JSDM; ∆ = 10°, T = 100, AR; ∆ = 10°, T = 200, JSDM; ∆ = 10°, T = 200, AR.]

Fig. 3. The achievable sum rates (AR) obtained by the eigenspace channel estimation, compared to the JSDM scheme. The downlink and uplink SNR are both 10 dB. The number of simultaneous users is N = 8.

C. Comparisons with JSDM

In Fig. 3, we compare the achievable sum rates obtained by the proposed eigenspace channel estimation to the JSDM scheme [7], [8], which was the first to exploit the spatial correlation to benefit the FDD massive MIMO system. Note that the uplink CSIT feedback is not treated in the previous JSDM papers [7], [9]. To make a fair comparison, we assume that the JSDM scheme uses an isotropical VQ feedback codebook, since it is unknown whether the JSDM scheme can also benefit from a better-designed codebook for correlated channels after the pre-projection of channel vectors. To gain more insight and better understand the simulation results, it is important to first illustrate the merits and demerits of the JSDM scheme compared to our scheme.

The JSDM scheme has the advantage of better suppressing the channel estimation overhead. Specifically, by grouping the users based on their respective CCMs and performing the pre-beamforming, the equivalent number of BS antennas in each virtual sector, i.e., $b_g$ in [7], can be optimized to strike a good balance between the power gain, which scales with $b_g$, and the channel estimation overhead. In an extreme case, $b_g$ can be made as small as the number of users in each virtual sector; thus, the overall channel estimation overhead scales with the number of users in each virtual sector, which drastically decreases the dimensionality loss. However, on the downside, while the JSDM scheme adopts a divide-and-multiplex approach, the division is imperfect, in the sense that the JSDM scheme suffers from inherent residual IGI, especially when the CCMs of the users in each group are different, so that the pre-beamforming cannot counteract the IGI completely. Notice that in our framework, the proposed dominant channel estimations incorporate all the user CCMs into the scheme design, which significantly mitigates the IGI. Moreover, it is noteworthy that the computational complexity of the JSDM scheme is smaller than that of our proposed scheme, since our scheme deals with a higher-dimensional channel matrix⁶.

Specifically, we follow the parameters used in the simulation in [8, Section IV-C]. The fixed angular quantization method is adopted to divide users into G = 8 user-groups, where each group performs the per-group processing. The quantization points are

$$\theta \in \{-57.5^\circ, -41.5^\circ, -23^\circ, -7.5^\circ, 7.5^\circ, 23.5^\circ, 41.5^\circ, 57.5^\circ\}, \tag{65}$$

and the angular spreads for the quantization matrices are identical to the user angular spread, which is specified in Fig. 3. To keep the IGI under control, similarly to [8], we further divide the user-groups into 2 patterns, where only the users in the same pattern are scheduled simultaneously⁷. The ER in each virtual sector, i.e., $r^{\star}$ in [7], is chosen while neglecting extremely small singular values of the CCM, and $b_g$ is chosen to optimize the sum rate by exhaustive search. The training sequences of each virtual sector are unitary sequences as in [7, Section VI].

The deterministic equivalents for the JSDM scheme are computed based on [7, Appendix A], with generalizations to distinct CCMs within each user-group. The details are again omitted for brevity.

It is observed from Fig. 3 that the JSDM scheme achieves a better sum rate when the channel coherence block length is small, e.g., T = 100, and the number of BS antennas M is large. Qualitatively, this is as expected, since a small channel coherence block length and a large M both put more weight on the need to suppress the channel estimation overhead, and, based on [7], a large M also means that the singular vectors of the CCMs can be well approximated by the columns of a DFT matrix, which ensures orthogonality as long as the DoA intervals of different users are disjoint. On the other hand, our proposed eigenspace channel estimation shows an evidently better achievable rate when the channel coherence block length is larger, which alleviates the urgency to suppress the channel estimation overhead, or when the angular spread of users is larger, which causes larger residual IGI in the JSDM scheme. Notice that a large angular spread also decreases the achievable rates of our scheme, due to the increased channel estimation dimensionality loss; however, our scheme turns out to be more resilient in this regard. Although there are several parameters in the JSDM scheme that may be properly tuned to achieve a better rate than the eigenspace channel estimation scheme, the eigenspace channel estimation scheme still has the advantage of low complexity, and the optimization of the JSDM scheme is beyond the scope of this paper.

D. Performance Gain Leveraging Dominant Eigenspace Channel Estimation

Furthermore, we demonstrate how much gain we can get from leveraging the training and feedback schemes designed for the multi-user CCMs, by comparing with the unitary training and feedback schemes used in the i.i.d. channel case. The rate gain is depicted in Fig. 4, showing that leveraging dominant eigenspace channel estimation can indeed improve the sum rate of FDD massive MIMO systems with spatially correlated channels. The detailed performance analyses of the proposed channel training and feedback schemes are shown in Fig. 5 and Fig. 6, respectively.

⁶ Possible operations on the channel matrix include inversion and SVD, depending on the precoding algorithm.

⁷ For fair comparison, we set the number of users in the achievable sum rate of the proposed scheme to be half of the total users in the JSDM scheme, since there are 2 patterns.

[Figure 4 plot: R_rzf (bit/s/Hz, vertical axis) versus M (horizontal axis). Curves: ∆ = 10°, D = 0.5λ, OR, Eig; ∆ = 10°, D = 0.5λ, OR, iid; ∆ = 20°, D = λ, Lap., Eig; ∆ = 20°, D = λ, Lap., iid.]

Fig. 4. Achievable sum rates (AR) in massive MIMO systems with eigenspace training and feedback schemes (Eig), compared with unitary channel estimation schemes commonly used for i.i.d. channels. The downlink and uplink SNR are set to 20 dB and 10 dB, respectively. The channel coherence block length is T = 200. The number of users in the cell is N = 8.

[Figure 5 plot: normalized MSE (vertical axis) versus τ (horizontal axis). Curves: unitary training, downlink SNR = 10 dB; iterative algorithm, downlink SNR = 10 dB; unitary training, downlink SNR = 20 dB; iterative algorithm, downlink SNR = 20 dB.]

Fig. 5. The MSE caused by only the channel training process, normalized by the number of users, versus the number of training symbols, for the optimized training signals given by the iterative algorithm compared with random orthogonal training sequences. N = 8, M = 20. The per-user channel correlation matrices are calculated according to the one-ring model, with D = 0.5λ and ∆ = 10°.

For the training process, the MSE performance of the iterative algorithm we developed in Section III-A, which finds the optimized training sequences with per-user CCMs, is shown in Fig. 5. For comparison purposes, the simulation also considers the unitary training sequences, which are shown to be optimal with i.i.d. channels [39]. We assume, for the unitary training sequences⁸,

$$\begin{cases} X_\tau X_\tau^{\dagger} = \dfrac{\tau P}{M} I_M & \text{if } \tau \ge M, \\ X_\tau^{\dagger} X_\tau = P I_\tau & \text{if } \tau < M. \end{cases} \tag{66}$$

When the number of training symbols is small, the MSE achieved by the iterative algorithm is much lower than that of the unitary training sequences, because, in the presence of channel correlation, the training sequences obtained by our algorithm can find the eigenspace that needs to be estimated more accurately and concentrate the power in that subspace. Note that when the downlink SNR is large and the number of training symbols is large enough⁹ to train all the subspaces, the unitary training sequences are asymptotically optimal. Such observations are further evaluated by setting the downlink SNR to 10 dB, which shows a certain MSE gap between the optimized training sequences and the unitary training sequences, even when τ = M.

⁸ Notice that the unitary training sequences for the case τ < M are not well defined in [39], since it does not suffice to have τ < M pilots in i.i.d. channels. Here we assume $X_\tau$ has orthogonal rows when τ < M.

[Figure 6 plot: normalized MSE (vertical axis, logarithmic) versus normalized number of feedback bits (horizontal axis). Curves: VQ, isotropical codebook; VQ, skewed codebook; SQ-RWF, with SL; SQ-RWF, w/o SL.]

Fig. 6. The quantization MSE, normalized by the number of users, with and without the shaping loss (SL). N = 8, M = 20. The per-user CCMs are calculated according to the one-ring model, with D = 0.5λ and ∆ = 10°.

[Figure 7 plot: number of feedback bits (vertical axis) versus sorted channel entry index (horizontal axis). Curves: round-off, MSE = 10^-2; fractional bits, MSE = 10^-2; round-off, MSE = 10^-1; fractional bits, MSE = 10^-1.]

Fig. 7. Number of feedback bits for each channel entry in the SQ-RWF scheme with SL, versus the channel entry index sorted by variance. The mean DoA θ = π/6, angular spread ∆ = 10°, antenna spacing D = 0.5λ, and the CCM is calculated based on the one-ring model. The number of BS antennas M = 64.

Fig. 6 shows the quantization MSE performance of SQ-RWF and VQ with isotropical and skewed codebooks. It is observed that SQ-RWF achieves better MSE performance, even with the shaping loss (SL)¹⁰, when the number of feedback bits is large. Notice that in general VQ is more efficient than SQ, especially when the vector is correlated. However, after the KL-transform, the channel vector is decorrelated into independent Gaussian variables with non-identical variances, in which case the RWF is the optimal bit allocation in terms of MSE distortion.

Moreover, to compare the SQ-RWF scheme with conventional VQ approaches, according to the well-known results on VQ methods in the literature [17], to achieve the perfect-CSIT DoF, the total number of feedback bits for an M-dimensional channel vector is approximately

$$B = \frac{M-1}{3} \rho_{\mathrm{dB}}, \tag{67}$$

where $\rho_{\mathrm{dB}}$ is the downlink SNR (in dB). Given the parameters in Fig. 7 and $\rho_{\mathrm{dB}} = 20$ (for fair comparison with MSE = $10^{-2}$, since it is found that the MSE should scale inversely with the SNR to achieve the perfect-CSIT DoF), the number of feedback bits in (67) is 420, whereas the total number of feedback bits with SQ-RWF is 164 by simulation. However, it should be noted that such a comparison is not quite fair, since the feedback codebook design in [17] is not optimized for spatially correlated channels.

Fig. 7 gives a concrete example to specify the bit allo-cation of the SQ-RWF scheme. Note that we perform theproposed quantization method on the user-channel after theKL-transform, i.e., hKL

n in (21). The entries of hKLn are sorted

based on the variations in descending order and the indicesare shown as the x-axis. To achieve a target quantizationMSE, it is shown that the round-off SQ-RWF, which roundsoff the number of feedback bits given by the RWF approachto its nearest integer, uses a slightly larger number of bitsthan the SQ-RWF scheme which allows fractional numberof feedback bits for each channel entry. It is shown that theSQ-RWF scheme “throws away” a number of channel entriesdue to their small variations, i.e., allocating zero bits to them,while concentrates its feedback bits to a few dominant channelentries. In addition, the SQ-RWF scheme uses a larger numberof feedback bits to quantize the dominant entries, as well as

9In this case we have τ ≥M , then there are enough channel observationsto recover the channel coefficients perfectly when the downlink SNR goes toinfinity.

10The shaping loss is defined as the loss due to the fact that scalarquantization operates on a hypercube, while optimal vector quantizationoperates on a hypersphere. It is equal to 1

2log(πe/6) bits per real dimension,

and corresponds to the difference of the differential entropies of a Gaussianand a uniform distribution with the same (unit) variance. A full thoroughtreatment of entropy-coded scalar quantization in an information theoreticsense can be found in [40].

Page 12: Achievable Rates of FDD Massive MIMO Systems with Spatial …network.ee.tsinghua.edu.cn/niulab/wp-content/uploads/2015/02/AR_CR.pdf · prohibitively large for the FDD system. Therefore,

12

4 5 6 7 8 9 100

0.05

0.1

0.15

0.2

0.25

Dominant Rank,

Nor

mal

ized

MS

E

#FB: 42, SK#FB: 61, SK#FB: 79, SK#FB: 42, iid#FB: 61, iid#FB: 79, iid

rn

Fig. 8. The quantization MSE, normalized by the number of users, versusthe dominant rank we choose to feedback the CSIT, with various number offeedback bits. N = 8, M = 20. The per-user channel correlation matrices arecalculated according to the one-ring model, with D = 0.5λ and ∆ = 10◦.

[Figure 9 plot. x-axis: M (8 to 98); y-axis: $R_{\mathrm{rzf}}$ in bit/s/Hz (0 to 70); curves: T = 100, 200, 400, 600.]

Fig. 9. Performance of FDD massive MIMO systems with CCMs and various channel coherence block length. The dotted and solid lines represent the system with i.i.d. channels and spatially correlated channels, respectively. The downlink and uplink SNR are set to 20 dB and 10 dB, respectively. The number of users in the cell is N = 8. The per-user channel correlation matrices are calculated according to the one-ring model, with D = 0.5λ and ∆ = 10°.

quantizing more channel entries, when a better quantization accuracy is required.
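The bit-allocation behavior described above can be illustrated with a minimal sketch of textbook reverse water-filling (see, e.g., [30]), followed by rounding to integers. The variances, the target MSE, the function name `reverse_water_filling`, and the per-entry rate $\log_2(\sigma_i^2/D_i)$ (which assumes independent complex Gaussian entries) are illustrative assumptions and not necessarily the exact expressions used in this paper.

```python
import numpy as np

def reverse_water_filling(variances, target_mse, iters=200):
    """Textbook reverse water-filling over independent Gaussian entries.

    Finds the water level theta with sum_i min(theta, var_i) = target_mse,
    then assigns log2(var_i / D_i) bits to entry i, where D_i = min(theta, var_i).
    Assumes strictly positive variances and complex Gaussian entries.
    """
    v = np.asarray(variances, dtype=float)
    if not 0.0 < target_mse <= v.sum():
        raise ValueError("target_mse must lie in (0, sum of variances]")
    lo, hi = 0.0, float(v.max())
    for _ in range(iters):  # bisection on the water level
        theta = 0.5 * (lo + hi)
        if np.minimum(theta, v).sum() > target_mse:
            hi = theta
        else:
            lo = theta
    theta = 0.5 * (lo + hi)
    distortion = np.minimum(theta, v)   # per-entry distortion D_i
    bits = np.log2(v / distortion)      # exactly zero where var_i <= theta
    return theta, bits

# Hypothetical entry variances, sorted in descending order.
variances = np.array([4.0, 2.0, 1.0, 0.5, 0.1, 0.05])
theta, bits = reverse_water_filling(variances, target_mse=0.6)
rounded_bits = np.rint(bits).astype(int)  # round-off variant: nearest integer
print(theta, bits, rounded_bits)
```

In this toy setting the two weakest entries fall below the water level and receive zero bits, while the remaining bits are concentrated on the dominant entries. Rounding changes the achieved MSE, so a fair comparison at a fixed target MSE would require re-tuning the water level after rounding.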

The impact of the dominant rank $r_n$ chosen in the VQ feedback process on the quantization MSE is shown in Fig. 8 for different numbers of feedback bits. The numbers of feedback bits (42, 61, 79) shown in the figure correspond approximately to 2, 3, and 4 bits per channel entry. The figure shows the tradeoff between the quantization accuracy of the effective channel and the estimation error resulting from the neglected eigenspace of the CCM. It is observed that there exists an optimal $r_n$ that minimizes the total feedback error. The optimal $r_n$ increases with the number of feedback bits, because with more feedback bits we can afford to estimate a higher-dimensional eigenspace, which improves the accuracy of the CSIT feedback.
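The rank tradeoff can be reproduced qualitatively with a toy model. The sketch below adds a quantization term of the form appearing in (74) to the eigenvalue mass outside the dominant eigenspace, and searches for the rank that minimizes the sum. The geometric eigenvalue profile, the function names, and the reading of the trace term in (68) as the sum of neglected eigenvalues are assumptions for illustration; the curves in Fig. 8 are computed from the one-ring model and will differ.

```python
import numpy as np

def feedback_mse_bound(eigvals, rank, num_bits):
    """Toy total-error model: quantization term plus neglected eigenvalue mass."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    if not 2 <= rank <= lam.size:
        raise ValueError("rank must lie in [2, number of eigenvalues]")
    # Quantization term: sum of squared dominant eigenvalues over the largest
    # eigenvalue, scaled by 2^(-B / (r - 1)).
    quant = np.sum(lam[:rank] ** 2) / lam[0] * 2.0 ** (-num_bits / (rank - 1))
    # Truncation term: energy in the eigen-directions that are not fed back.
    trunc = np.sum(lam[rank:])
    return quant + trunc

def best_rank(eigvals, num_bits):
    """Rank in [2, M] that minimizes the toy bound."""
    ranks = range(2, len(eigvals) + 1)
    return min(ranks, key=lambda r: feedback_mse_bound(eigvals, r, num_bits))

# Hypothetical CCM eigenvalues for M = 20 antennas (geometric decay).
eigvals = 0.6 ** np.arange(20)
best = {b: best_rank(eigvals, b) for b in (42, 61, 79)}
print(best)
```

Under these assumptions the minimizing rank does not decrease as the feedback budget grows, which matches the trend described above.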

The performance of the skewed feedback codebook is also shown in the figure. The gain in terms of MSE is fairly small when the optimal dominant rank is chosen, because the error mainly stems from neglecting the non-dominant eigenspace. Note that when $r_n$ is large, the performance gain of the skewed codebook is more evident, since the MSE in this regime is dominated by the channel quantization. It is worth mentioning that the absolute values of the quantization error are fairly small compared with the training error. We find that the channel estimation error mainly comes from the downlink channel training process, which is analog in nature, rather than from the digital CSIT feedback process, because the MSE scales down only linearly with the number of training symbols (13), but exponentially with the number of feedback bits (30).

In Fig. 9, the impact of the channel coherence block length on the sum rate improvements is shown. A significant rate improvement, up to two-fold, is expected for all channel coherence block lengths. The results suggest that under spatially correlated channels, which are especially common with mm-wave channels [9], along with a well-designed transmission strategy, namely the training and feedback schemes, the FDD system is capable of realizing a significant massive MIMO gain.

VI. CONCLUSIONS

By computing the achievable rates of FDD massive MIMO systems with an RZF precoder, accounting for the dimensionality loss of downlink channel training and uplink CSIT feedback and the corresponding channel estimation error, we showed that spatial channel correlation at the BS side is beneficial to the FDD massive MIMO system. The benefit is especially prominent if the channels are strongly correlated, namely when the CCMs are effectively rank-deficient. In particular, we proposed an iterative algorithm to find the optimized channel training sequences in the presence of multiuser spatial correlation, and a feedback codebook design based on a KL-transform followed by SQ with RWF bit-loading, which is extremely computationally efficient and thus easy to implement in practice while achieving near-optimal performance. Our proposed approach, which achieves dimensionality-reduction channel estimation, can be seen as an alternative to the pre-projection and effective channel approach in the JSDM scheme. Moreover, it is noteworthy that while achieving a significant performance gain, our approach requires only minimal modifications of the widely used training-based transmission scheme, and thus it is easy to implement.

Numerical results show significant rate improvements when leveraging our proposed eigenspace channel estimation approaches under spatially correlated channels, in comparison with i.i.d. FDD massive MIMO systems. In fact, when the channel correlation is strong and the number of BS antennas is not very large, the achievable sum rate of FDD massive MIMO systems can even outperform that of TDD systems. Comparisons with the JSDM scheme reveal that both schemes have advantages under different channel conditions, such as coherence time and angular spread. In particular, our proposed schemes display better performance when the channel coherence block length is large, or the angular spread of the users is large, while


requiring a higher computational complexity due to operating on a higher-dimensional matrix.

These results suggest that in FDD massive MIMO systems, increasing the spatial channel correlation, e.g., by decreasing the antenna spacing or by more line-of-sight transmission, can be beneficial. Although this differs from the favorable propagation conditions in TDD systems, which prefer i.i.d. channels to maximize the total DoF, the FDD system benefits significantly from correlation, which reduces the dimensionality loss as far as channel estimation is concerned. The tradeoff between the DoF of the downlink BC and the spatial correlation in FDD massive MIMO is an interesting problem for future work.

APPENDIX A
PROOF OF THEOREM 2

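Proof: The quantization MSE of the skewed codebook is expressed as

$$
\mathrm{MSE}_q = \mathbb{E}_{z_n}\left\{ \mathbb{E}_{\mathcal{C}_{\mathrm{sk}}}\left[ z_n^{\dagger}\Lambda_n z_n - \max_i \left[ \frac{f_i^{\dagger}\Lambda_n z_n z_n^{\dagger}\Lambda_n f_i}{f_i^{\dagger}\Lambda_n f_i} \right] \right] \right\} + \operatorname{tr}\left[ U_n \Sigma_n U_n^{\dagger} \right], \tag{68}
$$

where $h_n = Q_n \Lambda_n^{1/2} z_n$, $Q_n$ is an orthogonal matrix, and $z_n \sim \mathcal{CN}(0, I_{r_n})$. Define the first term in (68) as $\Delta_1$. We obtain

\begin{align}
\Delta_1 &= \mathbb{E}_{z_n} \int_0^{z_n^{\dagger}\Lambda_n z_n} \left[ \Pr\left( \frac{f_i^{\dagger}\Lambda_n z_n z_n^{\dagger}\Lambda_n f_i}{f_i^{\dagger}\Lambda_n f_i} \le x \,\middle|\, f_i^{\dagger} f_i = 1 \right) \right]^N dx \tag{69} \\
&\le \mathbb{E}_{z_n} \int_0^{z_n^{\dagger}\Lambda_n z_n} \left[ \Pr\left( \frac{f_i^{\dagger}\Lambda_n z_n z_n^{\dagger}\Lambda_n f_i}{\lambda_1^{(n)}} \le x \,\middle|\, f_i^{\dagger} f_i = 1 \right) \right]^N dx \tag{70} \\
&= \mathbb{E}_{z_n} \int_0^{z_n^{\dagger}\Lambda_n z_n} \left[ \Pr\left( \left| \frac{f_i^{\dagger}\Lambda_n z_n}{\sqrt{z_n^{\dagger}\Lambda_n^2 z_n}} \right|^2 \le \frac{\lambda_1^{(n)} x}{z_n^{\dagger}\Lambda_n^2 z_n} \,\middle|\, f_i^{\dagger} f_i = 1 \right) \right]^N dx \tag{71} \\
&\approx \mathbb{E}_{z_n} \int_0^{z_n^{\dagger}\Lambda_n^2 z_n / \lambda_1^{(n)}} \left[ \Pr\left( \left| \frac{f_i^{\dagger}\Lambda_n z_n}{\sqrt{z_n^{\dagger}\Lambda_n^2 z_n}} \right|^2 \le \frac{\lambda_1^{(n)} x}{z_n^{\dagger}\Lambda_n^2 z_n} \,\middle|\, f_i^{\dagger} f_i = 1 \right) \right]^N dx \tag{72} \\
&= \mathbb{E}_{z_n}\left[ \frac{z_n^{\dagger}\Lambda_n^2 z_n}{\lambda_1^{(n)}} \right] \int_0^1 \left[ 1 - (1-x)^{r_n-1} \right]^N dx \tag{73} \\
&\le \frac{\sum_{i=1}^{r_n} \left( \lambda_i^{(n)} \right)^2}{\lambda_1^{(n)}} \, 2^{-\frac{B_n}{r_n-1}}, \tag{74}
\end{align}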

where the equality (69) follows from integrating (68) by parts, and the approximation in (72) follows from [31, Appendix J], which shows that the dominant term of the integral in (71) is (72); (73) and (74) then follow from [32, Corollary 1].
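The last step can be checked numerically with a short sketch. It evaluates the integral in (73) in closed form through Gamma functions and by a midpoint rule, and compares it with the factor $2^{-B_n/(r_n-1)}$ in (74). Reading the exponent $N$ in (69)–(73) as a codebook size of $2^{B_n}$ is an assumption made here to connect (73) and (74); the function names and the values $r_n = 4$, $B_n = 6$ are hypothetical.

```python
import math
import numpy as np

def rvq_integral_closed_form(rank, num_codewords):
    """Closed form of int_0^1 [1 - (1 - x)^(rank - 1)]^N dx.

    With a = 1 / (rank - 1), the integral equals
    Gamma(1 + a) * Gamma(N + 1) / Gamma(N + 1 + a).
    """
    a = 1.0 / (rank - 1)
    log_val = (math.lgamma(1.0 + a) + math.lgamma(num_codewords + 1.0)
               - math.lgamma(num_codewords + 1.0 + a))
    return math.exp(log_val)

def rvq_integral_numeric(rank, num_codewords, n=200_000):
    """Midpoint-rule evaluation of the same integral on [0, 1]."""
    x = (np.arange(n) + 0.5) / n
    return float(np.mean((1.0 - (1.0 - x) ** (rank - 1)) ** num_codewords))

rank, num_bits = 4, 6             # hypothetical r_n and B_n
num_codewords = 2 ** num_bits     # assumed codebook size N = 2^B_n
closed = rvq_integral_closed_form(rank, num_codewords)
numeric = rvq_integral_numeric(rank, num_codewords)
bound = 2.0 ** (-num_bits / (rank - 1))
print(closed, numeric, bound)
```

For these values the closed form and the numerical integral agree, and both lie below $2^{-B_n/(r_n-1)}$, consistent with (74) acting as an upper bound.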

REFERENCES

[1] T. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, pp. 3590–3600, Nov 2010.

[2] H. Q. Ngo, E. Larsson, and T. Marzetta, “Energy and spectral efficiency of very large multiuser MIMO systems,” IEEE Trans. Commun., vol. 61, pp. 1436–1449, Apr 2013.

[3] G. Caire and S. Shamai, “On the achievable throughput of a multiantenna Gaussian broadcast channel,” IEEE Trans. Inform. Theory, vol. 49, pp. 1691–1706, Jul 2003.

[4] A. G. Davoodi and S. A. Jafar, “Aligned image sets under channel uncertainty: Settling a conjecture by Lapidoth, Shamai and Wigger on the collapse of degrees of freedom under finite precision CSIT,” arXiv preprint arXiv:1403.1541.

[5] G. Smith, “A direct derivation of a single-antenna reciprocity relation for the time domain,” IEEE Trans. Antennas Propag., vol. 52, pp. 1568–1577, Jun 2004.

[6] B. Clerckx, G. Kim, and S. Kim, “Correlated fading in broadcast MIMO channels: Curse or blessing?,” in Proc. IEEE Global Communications Conference (GLOBECOM’ 08), pp. 1–5, 2008.

[7] A. Adhikary, J. Nam, J.-Y. Ahn, and G. Caire, “Joint spatial division and multiplexing: The large-scale array regime,” IEEE Trans. Inform. Theory, vol. 59, pp. 6441–6463, Oct 2013.

[8] J. Nam, A. Adhikary, J.-Y. Ahn, and G. Caire, “Joint spatial division and multiplexing: Opportunistic beamforming, user grouping and simplified downlink scheduling,” IEEE J. Sel. Top. Signal Process., vol. 8, pp. 876–890, Oct 2014.

[9] A. Adhikary, E. Al Safadi, M. Samimi, R. Wang, G. Caire, T. Rappaport, and A. Molisch, “Joint spatial division and multiplexing for mm-wave channels,” IEEE J. Sel. Areas Commun., vol. 32, pp. 1239–1255, Jun 2014.

[10] J. Nam, “Fundamental limits in correlated fading MIMO broadcast channels: Benefits of transmit correlation diversity,” arXiv preprint arXiv:1401.7114, 2014.

[11] R. Muller, L. Cottatellucci, and M. Vehkapera, “Blind pilot decontamination,” IEEE J. Sel. Top. Signal Process., vol. 8, pp. 773–786, Oct 2014.

[12] C. Studer and E. Larsson, “PAR-aware large-scale multi-user MIMO-OFDM downlink,” IEEE J. Sel. Areas Commun., vol. 31, pp. 303–313, Feb 2013.

[13] E. Bjornson, J. Hoydis, M. Kountouris, and M. Debbah, “Massive MIMO systems with non-ideal hardware: Energy efficiency, estimation, and capacity limits,” IEEE Trans. Inform. Theory, Sep 2014.

[14] C. Shepard, H. Yu, and L. Zhong, “ArgosV2: A flexible many-antenna research platform,” in Proc. Annual International Conference on Mobile Computing & Networking, pp. 163–166, 2013.

[15] X. Rao and V. Lau, “Distributed compressive CSIT estimation and feedback for FDD multi-user massive MIMO systems,” IEEE Trans. Signal Processing, vol. 62, pp. 3261–3271, Jun 2014.

[16] J. Choi, D. J. Love, and T. Kim, “Trellis-extended codebooks and successive phase adjustment: A path from LTE-Advanced to FDD massive MIMO systems,” arXiv preprint arXiv:1402.6794, 2014.

[17] J. Choi, Z. Chance, D. Love, and U. Madhow, “Noncoherent trellis coded quantization: A practical limited feedback technique for massive MIMO systems,” IEEE Trans. Commun., vol. 61, pp. 5016–5029, Dec 2013.

[18] J. Choi, D. Love, and P. Bidigare, “Downlink training techniques for FDD massive MIMO systems: Open-loop and closed-loop training with memory,” IEEE J. Sel. Top. Signal Process., vol. 8, pp. 802–814, Oct 2014.

[19] Z. Jiang, A. F. Molisch, G. Caire, and Z. Niu, “On the achievable rates of FDD massive MIMO systems with spatial channel correlation,” in Proc. IEEE International Conference on Communications in China (ICCC’ 2014), 2014.

[20] K. Hugl, J. Laurila, and E. Bonek, “Transformation based downlink beamforming techniques for frequency division duplex systems,” in Proc. Interim Symposium on Antennas and Propagation, pp. 1529–1532, 2000.

[21] W. Weichselberger, M. Herdin, H. Ozcelik, and E. Bonek, “A stochastic MIMO channel model with joint correlation of both link ends,” IEEE Trans. Wireless Commun., vol. 5, pp. 90–100, Jan 2006.

[22] S. Jafar, S. Vishwanath, and A. Goldsmith, “Channel capacity and beamforming for multiple transmit and receive antennas with covariance feedback,” in Proc. IEEE International Conference on Communications (ICC’ 2001), vol. 7, pp. 2266–2270, 2001.

[23] J. Li, X. Wu, and R. Laroia, OFDMA Mobile Broadband Communications: A Systems Approach. Cambridge University Press, 2013.


[24] A. F. Molisch, Wireless Communications. 2nd edition, IEEE Press–Wiley, 2011.

[25] G. Caire, N. Jindal, M. Kobayashi, and N. Ravindran, “Multiuser MIMO achievable rates with downlink training and channel state feedback,” IEEE Trans. Inform. Theory, vol. 56, pp. 2845–2866, Jun 2010.

[26] J. H. Kotecha and A. Sayeed, “Transmit signal design for optimal estimation of correlated MIMO channels,” IEEE Trans. Signal Processing, vol. 52, pp. 546–557, Feb 2004.

[27] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2009.

[28] D. Guo, S. Shamai, and S. Verdu, “Mutual information and minimum mean-square error in Gaussian channels,” IEEE Trans. Inform. Theory, vol. 51, pp. 1261–1282, Apr 2005.

[29] S. Serbetli and A. Yener, “Transceiver optimization for multiuser MIMO systems,” IEEE Trans. Signal Processing, vol. 52, pp. 214–226, Jan 2004.

[30] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 2012.

[31] V. Raghavan and V. Veeravalli, “Ensemble properties of RVQ-based limited-feedback beamforming codebooks,” IEEE Trans. Inform. Theory, vol. 59, pp. 8224–8249, Dec 2013.

[32] C. K. Au-Yeung and D. Love, “On the performance of random vector quantization limited feedback beamforming in a MISO system,” IEEE Trans. Wireless Commun., vol. 6, pp. 458–462, Feb 2007.

[33] P. Xia and G. Giannakis, “Design and analysis of transmit-beamforming based on limited-rate feedback,” IEEE Trans. Signal Processing, vol. 54, pp. 1853–1863, May 2006.

[34] S. Lloyd, “Least squares quantization in PCM,” IEEE Trans. Inform. Theory, vol. 28, pp. 129–137, Mar 1982.

[35] S. Wagner, R. Couillet, M. Debbah, and D. T. M. Slock, “Large system analysis of linear precoding in correlated MISO broadcast channels under limited feedback,” IEEE Trans. Inform. Theory, vol. 58, pp. 4509–4537, Jul 2012.

[36] C. Peel, B. Hochwald, and A. Swindlehurst, “A vector-perturbation technique for near-capacity multiantenna multiuser communication-part I: Channel inversion and regularization,” IEEE Trans. Commun., vol. 53, pp. 195–202, Jan 2005.

[37] M. Kobayashi, N. Jindal, and G. Caire, “Training and feedback optimization for multiuser MIMO downlink,” IEEE Trans. Commun., vol. 59, pp. 2228–2240, Aug 2011.

[38] D.-S. Shiu, G. Foschini, M. Gans, and J. Kahn, “Fading correlation and its effect on the capacity of multielement antenna systems,” IEEE Trans. Commun., vol. 48, pp. 502–513, Mar 2000.

[39] B. Hassibi and B. M. Hochwald, “How much training is needed in multiple-antenna wireless links?,” IEEE Trans. Inform. Theory, vol. 49, pp. 951–963, Apr 2003.

[40] J. Ziv, “On universal quantization,” IEEE Trans. Inform. Theory, vol. 31, pp. 344–347, May 1985.

Zhiyuan Jiang (S’12) was born at Beipiao, Liaoning Province, China in 1987. He received his B.E. degree in Electronic Engineering from Tsinghua University, Beijing, China in 2010. Since September 2010, he has been a graduate student in Niulab of Electronic Engineering Department of Tsinghua University, where he is currently pursuing his doctoral studies. His main research interests include multiuser MIMO systems, green wireless networks, queuing theory and Lyapunov optimization.

Andreas F. Molisch (S’89–M’95–SM’00–F’05) received the Dipl. Ing., Ph.D., and habilitation degrees from the Technical University of Vienna, Vienna, Austria, in 1990, 1994, and 1999, respectively. He subsequently was with AT&T (Bell) Laboratories Research (USA); Lund University, Lund, Sweden, and Mitsubishi Electric Research Labs (USA). He is now a Professor of Electrical Engineering and Director of the Communication Sciences Institute at the University of Southern California, Los Angeles. His current research interests are the measurement and modeling of mobile radio channels, ultra-wideband communications and localization, cooperative communications, multiple-input–multiple-output systems, wireless systems for healthcare, and novel cellular architectures. He has authored, coauthored, or edited four books (among them the textbook Wireless Communications, Wiley-IEEE Press), 16 book chapters, some 170 journal papers, and numerous conference contributions, as well as more than 70 patents and 60 standards contributions. Dr. Molisch has been an Editor of a number of journals and special issues, General Chair, Technical Program Committee Chair, or Symposium Chair of multiple international conferences, as well as Chairman of various international standardization groups. He is a Fellow of the IEEE, Fellow of the AAAS, Fellow of the IET, an IEEE Distinguished Lecturer, and a member of the Austrian Academy of Sciences. He has received numerous awards, among them the Donald Fink Prize of the IEEE, and the Eric Sumner Award of the IEEE.

Giuseppe Caire (S’92–M’94–SM’03–F’05) was born in Torino, Italy, in 1965. He received the B.Sc. in Electrical Engineering from Politecnico di Torino (Italy), in 1990, the M.Sc. in Electrical Engineering from Princeton University in 1992 and the Ph.D. from Politecnico di Torino in 1994. He was a recipient of the AEI G.Someda Scholarship in 1991 and has been post-doctoral research fellow with the European Space Agency (ESTEC, Noordwijk, The Netherlands) in 1994-1995. He has been Assistant Professor in Telecommunications at the Politecnico di Torino, Associate Professor at the University of Parma, Italy, Professor with the Department of Mobile Communications at the Eurecom Institute, Sophia-Antipolis, France, and he is currently a professor of Electrical Engineering with the Viterbi School of Engineering, University of Southern California, Los Angeles and an Alexander von Humboldt Professor with the Electrical Engineering and Computer Science Department of the Technical University of Berlin, Germany.

He served as Associate Editor for the IEEE Transactions on Communications in 1998-2001 and as Associate Editor for the IEEE Transactions on Information Theory in 2001-2003. He received the Jack Neubauer Best System Paper Award from the IEEE Vehicular Technology Society in 2003, and the IEEE Communications Society & Information Theory Society Joint Paper Award in 2004 and in 2011. Giuseppe Caire has been a Fellow of IEEE since 2005. He has served in the Board of Governors of the IEEE Information Theory Society from 2004 to 2007, and as officer from 2008 to 2013. He was President of the IEEE Information Theory Society in 2011. His main research interests are in the field of communications theory, information theory, channel and source coding with particular focus on wireless communications.


Zhisheng Niu (M’98–SM’99–F’12) graduated from Beijing Jiaotong University, China, in 1985, and got his M.E. and D.E. degrees from Toyohashi University of Technology, Japan, in 1989 and 1992, respectively. During 1992-94, he worked for Fujitsu Laboratories Ltd., Japan, and in 1994 joined Tsinghua University, Beijing, China, where he is now a professor at the Department of Electronic Engineering and deputy dean of the School of Information Science and Technology. He is also a guest chair professor of Shandong University, China. His major research interests include queueing theory, traffic engineering, mobile Internet, radio resource management of wireless networks, and green communication and networks.

Dr. Niu has been an active volunteer for various academic societies, including Director for Conference Publications (2010-11) and Director for Asia-Pacific Board (2008-09) of IEEE Communication Society, Membership Development Coordinator (2009-10) of IEEE Region 10, Councilor of IEICE-Japan (2009-11), and council member of Chinese Institute of Electronics (2006-11). He is now a distinguished lecturer (2012-15) and Chair of Emerging Technology Committee (2014-15) of IEEE Communication Society, a distinguished lecturer (2014-16) of IEEE Vehicular Technologies Society, a member of the Fellow Nomination Committee of IEICE Communication Society (2013-14), standing committee member of Chinese Institute of Communications (CIC, 2012-16), and associate editor-in-chief of IEEE/CIC joint publication China Communications.

Dr. Niu received the Outstanding Young Researcher Award from Natural Science Foundation of China in 2009 and the Best Paper Award from IEEE Communication Society Asia-Pacific Board in 2013. He also co-received the Best Paper Awards from the 13th, 15th and 19th Asia-Pacific Conference on Communication (APCC) in 2007, 2009, and 2013, respectively, International Conference on Wireless Communications and Signal Processing (WCSP’13), and the Best Student Paper Award from the 25th International Teletraffic Congress (ITC25). He is now the Chief Scientist of the National Basic Research Program (the so-called “973 Project”) of China on “Fundamental Research on the Energy and Resource Optimized Hyper-Cellular Mobile Communication System” (2012-2016), which is the first national project on green communications in China. He is a fellow of both IEEE and IEICE.

