6368 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 61, NO. 11, NOVEMBER 2015

Distributed Multilevel Diversity Coding

Zhiqing Xiao, Jun Chen, Member, IEEE, Yunzhou Li, Member, IEEE, and Jing Wang, Member, IEEE

Abstract— In distributed multilevel diversity coding, $K$ correlated sources (each with $K$ components) are encoded in a distributed manner such that, given the outputs from any $\alpha$ encoders, the decoder can reconstruct the first $\alpha$ components of each of the corresponding $\alpha$ sources. For this problem, the optimality of a multilayer Slepian–Wolf coding scheme based on binning and superposition is established when $K \le 3$. The same conclusion is shown to hold for general $K$ under a certain symmetry condition, which generalizes a celebrated result by Yeung and Zhang.

Index Terms— Data compression, diversity coding, entropy inequality, Lagrange multiplier, linear programming, rate region, Slepian-Wolf, superposition.

I. INTRODUCTION

CONSIDER the scenario where $K$ correlated sources $U_1, U_2, \cdots, U_K$ are compressed by $K$ sensors in a distributed manner and then forwarded to a fusion center for joint reconstruction. This is exactly the classical distributed data compression problem, for which Slepian-Wolf coding [1], [2] is known to be rate-optimal. However, to attain this best compression efficiency, encoding at each sensor is performed under the assumption that all the other sensors are functioning properly; as a consequence, inactivity of one or more sensors typically leads to a complete decoding failure at the fusion center. Alternatively, each sensor can compress its observation using conventional point-to-point data compression methods without capitalizing on the correlation among different sources so that the maximum system robustness can be achieved. In view of these two extreme cases, a natural question arises whether there exists a tradeoff between compression efficiency and system robustness in distributed data compression.

Manuscript received April 30, 2015; revised September 2, 2015; accepted September 6, 2015. Date of publication September 9, 2015; date of current version October 16, 2015. Z. Xiao, Y. Li, and J. Wang were supported by the National Basic Research Program of China under Grant 2013CB329002, the National High Technology Research and Development Program of China under Grant 2014AA01A703, the National Science and Technology Major Project under Grant 2014ZX03003002, the Program for NCET in University under Grant NCET-13-0321, the International Science and Technology Cooperation Program under Grant 2014DFT10320, and the Tsinghua Research Funding under Grant 2015Z02-3. Z. Xiao was supported by the Tsinghua Scholarship for Overseas Graduate Studies under Grant 2014057. J. Chen was supported by the Natural Science and Engineering Research Council of Canada under a Discovery Grant.

Z. Xiao is with the Department of Electronic Engineering, Tsinghua University, Beijing 100084, China (e-mail: [email protected]).

J. Chen is with the Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON L8S 4K1, Canada (e-mail: [email protected]).

Y. Li and J. Wang are with the Research Institute of Information Technology, Tsinghua University, Beijing 100084, China (e-mail: [email protected]; [email protected]).

Communicated by T. Liu, Associate Editor for Shannon Theory.

Digital Object Identifier 10.1109/TIT.2015.2477506

One approach to realize this tradeoff is as follows. Specifically, we decompose each $U_k$ into $K$ components $U_{k,1}, U_{k,2}, \cdots, U_{k,K}$, ordered according to their importance, and encode them in such a way that, given the outputs from any $\alpha$ sensors, the fusion center can reconstruct the first $\alpha$ components of each of the corresponding $\alpha$ sources. The aforedescribed two extreme cases correspond to $(U_{k,1}, \cdots, U_{k,K-1}, U_{k,K}) = (0, \cdots, 0, U_k)$ and $(U_{k,1}, U_{k,2}, \cdots, U_{k,K}) = (U_k, 0, \cdots, 0)$, respectively. One can realize a flexible tradeoff between compression efficiency and system robustness by adjusting the amount of information allocated to different components. We shall refer to this problem as distributed multilevel diversity coding (D-MLDC) since it reduces to the well-known (symmetrical) multilevel diversity coding (MLDC) problem when $U_{1,\alpha} = U_{2,\alpha} = \cdots = U_{K,\alpha}$ almost surely for all $\alpha$.
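To illustrate the reconstruction requirement, take $K = 2$: from the output of sensor $k$ alone, the fusion center must recover $U_{k,1}$, while from both outputs it must recover $(U_{1,1}, U_{1,2}, U_{2,1}, U_{2,2})$; the two extreme allocations above thus correspond to placing everything in the most protected component (maximum robustness) or in the least protected one (maximum compression efficiency).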

The concept of MLDC was introduced by Roche [3] and more formally by Yeung [4], though research on diversity coding can be traced back to Singleton's work on maximum distance separable codes [5]. The symmetric version of this problem has received particular attention [6], and arguably the culminating achievement of this line of research is the complete characterization of the admissible rate region of symmetrical MLDC by Yeung and Zhang [7]. Some recent developments related to MLDC can be found in [8]–[11].

The goal of the present paper is to characterize the performance limits of D-MLDC, which, we hope, may provide some useful insights into the tradeoff between compression efficiency and system robustness in distributed data compression. More fundamentally, we aim to examine the principle of superposition [4] in the context of D-MLDC. Although superposition (or more generally, layering) is a common way to construct sophisticated schemes based on simple building blocks and often yields the best known achievability results, establishing the optimality of such constructions is rarely straightforward, especially when encoding is performed in a distributed manner. In fact, even for the centralized encoding setup studied in [7], the proof of the optimality of superposition is already highly non-trivial. This difficulty can be partly attributed to the fact that it is often a technically formidable task to extract layers from a generic scheme using information inequalities in a converse argument, even in cases where the use of layered constructions may appear rather natural.

From this perspective, our work can be viewed as an initial step towards a better understanding of layered schemes for distributed compression of correlated sources. We shall propose a multilayer Slepian-Wolf coding scheme based on binning and superposition, and establish its optimality for D-MLDC when $K \le 3$. This scheme is also shown to be optimal for general $K$ under a certain symmetry condition, which generalizes the aforementioned result by Yeung and Zhang on symmetrical MLDC [7]. The main technical difficulty encountered in our proof is that it appears to be infeasible to characterize the admissible rate region of D-MLDC by deriving inner and outer bounds separately and then making a direct comparison based on their explicit expressions. To circumvent this difficulty, we follow the approach in [7], where the analysis of the inner bound and that of the outer bound are conceptually intertwined. Specifically, we analyze certain linear programs associated with the achievable rate region of the proposed scheme and leverage the induced Lagrange multipliers to establish the entropy inequalities that are needed for a matching converse. Since the problem considered here is more general than that in [7], the relevant linear programs and entropy inequalities are inevitably more sophisticated. It is worth mentioning that, in a broad sense, the strategy of determining an information-theoretic limit by connecting achievability and converse results to a common optimization problem (not necessarily linear) via duality has found applications far beyond MLDC (see [12]).

The rest of this paper is organized as follows. We state the basic definitions and the main results in Section II. Section III contains a high-level description of our general approach. The detailed proofs can be found in Sections IV and V. We conclude the paper in Section VI.

Notation: Random objects $(X_v, v \in V)$ and $(X_{v,v'}, v \in V, v' \in V')$ are sometimes abbreviated as $X_V$ and $X_{V,V'}$, respectively. For two integers $z_1, z_2 \in \mathbb{Z}$, we define $[z_1 : z_2] \triangleq \{z \in \mathbb{Z} : z_1 \le z \le z_2\}$. The $\tau$-th smallest element of a finite set $V \subseteq \mathbb{Z}$ is denoted by $\langle V\rangle_\tau$; moreover, let $\langle V\rangle_T \triangleq \{\langle V\rangle_\tau : \tau \in T\}$ for any $T \subseteq [1 : |V|]$, where $|V|$ is the cardinality of $V$. We often do not distinguish between a singleton and its element.
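For instance, for $V = \{2, 5, 9\}$ we have $\langle V\rangle_2 = 5$ and $\langle V\rangle_{\{1,3\}} = \{2, 9\}$.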

II. PROBLEM FORMULATION AND MAIN RESULTS

A. System Model

Let $U_{k,[1:K]}$, $k \in [1:K]$, be $K$ vector sources. We assume¹ that $U_{[1:K],\alpha}$, $\alpha \in [1:K]$, are mutually independent whereas, for each $\alpha$, the components of $U_{[1:K],\alpha}$ (i.e., $U_{k,\alpha}$, $k \in [1:K]$) can be arbitrarily correlated. Let $\{U_{[1:K],[1:K]}(t)\}_{t=1}^{\infty}$ be i.i.d. copies of $U_{[1:K],[1:K]}$.

An $(n, (M_k, k \in [1:K]))$ D-MLDC system consists of:

• $K$ encoders, where encoder $\mathrm{Enc}_k$ ($k \in [1:K]$) maps the source sequence $U^n_{k,[1:K]}$ to a symbol $S_k$ in $[1:M_k]$, i.e.,
$$\mathrm{Enc}_k : \prod_{\alpha=1}^{K} \mathcal{U}^n_{k,\alpha} \to [1:M_k], \qquad U^n_{k,[1:K]} \mapsto S_k,$$

• $2^K - 1$ decoders, where decoder $\mathrm{Dec}_V$ ($\emptyset \ne V \subseteq [1:K]$) produces a reconstruction of $U^n_{V,[1:|V|]}$, denoted by $\hat{U}^n_{V,[1:|V|]}$, based on $S_V$, i.e.,
$$\mathrm{Dec}_V : \prod_{k \in V} [1:M_k] \to \prod_{k \in V} \prod_{\alpha=1}^{|V|} \mathcal{U}^n_{k,\alpha}, \qquad S_V \mapsto \hat{U}^n_{V,[1:|V|]}.$$

¹This assumption can be relaxed to a certain extent and can be modified in various ways. In this paper we do not seek to present our results in their most general forms since the resulting statements and expressions may become rather unwieldy.

Fig. 1. System diagram for D-MLDC with K = 3.

A D-MLDC system with $K = 3$ is illustrated in Fig. 1.
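Concretely, for $K = 3$ the system has $2^3 - 1 = 7$ decoders: $\mathrm{Dec}_{\{1\}}$, $\mathrm{Dec}_{\{2\}}$, and $\mathrm{Dec}_{\{3\}}$ reconstruct the first component of their respective source; $\mathrm{Dec}_{\{1,2\}}$, $\mathrm{Dec}_{\{1,3\}}$, and $\mathrm{Dec}_{\{2,3\}}$ reconstruct the first two components of each of their two sources; and $\mathrm{Dec}_{\{1,2,3\}}$ reconstructs all three components of all three sources.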

B. Admissible Rate Region

A rate tuple $(R_k, k \in [1:K])$ is said to be admissible if, for any $\epsilon > 0$, there exists an $(n, (M_k, k \in [1:K]))$ D-MLDC system such that

(1) (Rate Constraints)
$$\frac{1}{n}\log M_k \le R_k + \epsilon, \quad k \in [1:K]; \tag{1}$$

(2) (Reconstruction Constraints)
$$\Pr\big\{\hat{U}^n_{V,[1:|V|]} \ne U^n_{V,[1:|V|]}\big\} \le \epsilon, \quad \emptyset \ne V \subseteq [1:K]. \tag{2}$$

The admissible rate region $\mathcal{R}^*_K$ is defined as the set of all admissible rate tuples.

C. Multilayer Slepian-Wolf Coding

We shall propose a D-MLDC scheme, which can be viewed as a natural extension of that in [7] to the distributed encoding setup. This scheme, termed multilayer Slepian-Wolf coding, includes two steps: intralayer coding and interlayer coding.

• Intralayer Coding: For each $\alpha \in [1:K]$, encoder $k$ ($k \in [1:K]$) compresses $U^n_{k,\alpha}$ using the conventional binning scheme² of rate $r_{k,\alpha}$; correct reconstruction of $U^n_{k,\alpha}$, $k \in V$, based on the corresponding bin indices is ensured (with high probability) for all $V \subseteq [1:K]$ with $|V| = \alpha$ if $(r_{k,\alpha}, k \in [1:K]) \in \mathcal{R}_{K,\alpha}$. Here
$$\mathcal{R}_{K,\alpha} \triangleq \Big\{(r_{k,\alpha}, k \in [1:K]) : \sum_{k \in V} r_{k,\alpha} \ge H(U_{V,\alpha}|U_{V',\alpha}), \; V \in \mathcal{V}_{K,\alpha},\; V' \in \mathcal{V}'_{K,\alpha}[V]\Big\}$$
with
$$\mathcal{V}_{K,\alpha} \triangleq \{V \subseteq [1:K] : 1 \le |V| \le \alpha\},$$
$$\mathcal{V}'_{K,\alpha}[V] \triangleq \{V' \subseteq [1:K] \setminus V : |V'| + |V| = \alpha\}, \quad V \in \mathcal{V}_{K,\alpha}.$$

²Here one can in fact use universal Slepian-Wolf coding so that encoding and decoding can be performed without the knowledge of the source distribution [13].

• Interlayer Coding: In this step, encoder $k$ ($k \in [1:K]$) generates its output by combining the bin indices associated with $U^n_{k,\alpha}$, $\alpha \in [1:K]$, via superposition. Note that the resulting rate region
$$\mathcal{R}_K \triangleq \Big\{\sum_{\alpha=1}^{K}(r_{k,\alpha}, k \in [1:K]) : (r_{k,\alpha}, k \in [1:K]) \in \mathcal{R}_{K,\alpha},\; \alpha \in [1:K]\Big\}$$
is an inner bound of $\mathcal{R}^*_K$, i.e.,
$$\mathcal{R}_K \subseteq \mathcal{R}^*_K. \tag{3}$$
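To make the constraint structure of $\mathcal{R}_{K,\alpha}$ concrete, the following Python sketch enumerates its defining inequalities; here cond_entropy is a hypothetical oracle (not part of the paper) standing in for the conditional entropies $H(U_{V,\alpha}|U_{V',\alpha})$ computed from the source distribution.

```python
from itertools import combinations

def rate_constraints(K, alpha, cond_entropy):
    """Enumerate the constraints defining R_{K,alpha}: for every V with
    1 <= |V| <= alpha and every V' disjoint from V with |V| + |V'| = alpha,
    yield (V, bound) meaning sum_{k in V} r_{k,alpha} >= bound."""
    ground = range(1, K + 1)
    for size in range(1, alpha + 1):
        for V in combinations(ground, size):
            rest = [k for k in ground if k not in V]
            for Vp in combinations(rest, alpha - size):
                yield set(V), cond_entropy(set(V), set(Vp))

# Toy usage with K = 3, alpha = 2 and a made-up entropy oracle:
H = lambda V, Vp: float(len(V))  # pretend each source contributes one bit
for V, bound in rate_constraints(3, 2, H):
    print(f"sum of r_k over k in {V} >= {bound}")
```

For $K = 3$ and $\alpha = 2$ this produces nine constraints: six with $|V| = |V'| = 1$ and three with $|V| = 2$, $V' = \emptyset$.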

D. Main Results

Our first main result shows that $\mathcal{R}_K$ coincides with $\mathcal{R}^*_K$ when $K \le 3$.

Theorem 1: $\mathcal{R}^*_K = \mathcal{R}_K$ for $K \le 3$.

To state our second main result, we need the following definition.

Definition 1 (Symmetrical Source): We say that the distribution of $U_{[1:K],[1:K]}$ is symmetrical entropy-wise if $H(U_{V,\alpha}) = H(U_{V',\alpha})$ for all $\alpha \in [1:K]$ and $V, V' \subseteq [1:K]$ with $|V| = |V'|$.

It is worth noting that the symmetrical MLDC problem studied in [7] corresponds to the special case where $H(U_{V,\alpha}) = H(U_{V',\alpha})$ for all $\alpha \in [1:K]$ and $\emptyset \ne V, V' \subseteq [1:K]$.

Theorem 2: If the distribution of $U_{[1:K],[1:K]}$ is symmetrical entropy-wise, then $\mathcal{R}^*_K = \mathcal{R}_K$.

III. OUTLINE OF A GENERAL APPROACH

In this section we attempt to give an outline of our general approach, which is in principle not restricted to the cases covered by Theorems 1 and 2. On a conceptual level, this approach originated in [7] (and was made more evident in [10]). It consists of three major steps:

1) characterize the supporting hyperplanes of $\mathcal{R}_K$ (more precisely, the supporting hyperplanes of $\mathcal{R}_{K,\alpha}$, $\alpha \in [1:K]$) via the analysis of the corresponding linear programs;

2) establish a class of entropy inequalities based on the Lagrange multipliers induced by the aforementioned linear programs;

3) derive a tight outer bound on $\mathcal{R}^*_K$ by leveraging these entropy inequalities.

A. Linear Program

Each supporting hyperplane of $\mathcal{R}_{K,\alpha}$ is associated with a linear program
$$\mathrm{LP}^{\mathbf{w}}_{K,\alpha}: \;\; \min \sum_{k=1}^{K} w_k r_{k,\alpha} \;\; \text{over } r_{k,\alpha},\, k \in [1:K],$$
$$\text{s.t. } \sum_{k \in V} r_{k,\alpha} \ge H(U_{V,\alpha}|U_{V',\alpha}), \quad V \in \mathcal{V}_{K,\alpha},\, V' \in \mathcal{V}'_{K,\alpha}[V],$$
where $\mathbf{w} \triangleq (w_k, k \in [1:K]) \in \mathbb{R}^K_+$. It often suffices to consider the case where the weights $w_k$, $k \in [1:K]$, are ordered. For this reason, we define
$$\mathcal{W}_K \triangleq \big\{\mathbf{w} : w_1 \ge w_2 \ge \cdots \ge w_K \ge w_{K+1} \triangleq 0\big\}.$$
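Since each $\mathrm{LP}^{\mathbf{w}}_{K,\alpha}$ is an ordinary finite linear program, small instances can be solved numerically. The sketch below is our illustration (not from the paper): it feeds the constraints to scipy.optimize.linprog and reads candidate dual values off the HiGHS backend; cond_entropy is again a hypothetical entropy oracle.

```python
from itertools import combinations
from scipy.optimize import linprog

def solve_lp(K, alpha, w, cond_entropy):
    """Solve LP^w_{K,alpha}: minimize sum_k w_k r_k subject to
    sum_{k in V} r_k >= H(U_{V,alpha} | U_{V',alpha})."""
    pairs, A_ub, b_ub = [], [], []
    ground = range(1, K + 1)
    for size in range(1, alpha + 1):
        for V in combinations(ground, size):
            rest = [k for k in ground if k not in V]
            for Vp in combinations(rest, alpha - size):
                pairs.append((set(V), set(Vp)))
                # encode sum_{k in V} r_k >= H as -sum_{k in V} r_k <= -H
                A_ub.append([-1.0 if k in V else 0.0 for k in ground])
                b_ub.append(-cond_entropy(set(V), set(Vp)))
    # The singleton constraints already force the rates to be non-negative,
    # so the variables can be left free.
    res = linprog(c=list(w), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * K, method="highs")
    duals = [-m for m in res.ineqlin.marginals]  # one value per (V, V')
    return res.fun, res.x, list(zip(pairs, duals))
```

The returned optimal value corresponds to $f^{\mathbf{w}}_\alpha$, and the dual vector gives a candidate optimal Lagrange multiplier in the sense of Definition 2 below (cf. the duality remark in footnote 4).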

Moreover, to facilitate subsequent analysis, we introduce the following partition³ of $\mathcal{W}_K$ for each $\alpha \in [2:K]$:
$$\mathcal{W}^{(0)}_{K,\alpha} \triangleq \Big\{\mathbf{w} \in \mathcal{W}_K : w_1 \le \frac{1}{\alpha-1}\sum_{k=2}^{K} w_k\Big\},$$
$$\mathcal{W}^{(l)}_{K,\alpha} \triangleq \Big\{\mathbf{w} \in \mathcal{W}_K : w_l > \frac{1}{\alpha-l}\sum_{k=l+1}^{K} w_k \text{ and } w_{l+1} \le \frac{1}{\alpha-(l+1)}\sum_{k=l+2}^{K} w_k\Big\}, \quad l \in [1:\alpha-2],$$
$$\mathcal{W}^{(\alpha-1)}_{K,\alpha} \triangleq \Big\{\mathbf{w} \in \mathcal{W}_K : w_{\alpha-1} > \sum_{k=\alpha}^{K} w_k\Big\}.$$
We set $\mathcal{W}^{(0)}_{K,1} \triangleq \mathcal{W}_K$.
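As a small sanity check (our sketch, not from the paper), the partition index can be computed directly: by the monotonicity property recorded in footnote 3, one may scan $l$ from $\alpha - 1$ downward and stop at the first threshold that is strictly violated.

```python
def partition_index(w, alpha):
    """Return the unique l in [0 : alpha-1] with w in W^{(l)}_{K,alpha},
    assuming w[0] >= w[1] >= ... >= w[K-1] >= 0 (i.e., w in W_K).
    Note: w is 0-indexed, so w[l-1] plays the role of w_l."""
    for l in range(alpha - 1, 0, -1):
        # test w_l > (1/(alpha-l)) * sum_{k=l+1}^{K} w_k
        if w[l - 1] * (alpha - l) > sum(w[l:]):
            return l
    return 0
```

For example, with $\mathbf{w} = (5, 1, 1, 1)$ and $\alpha = 2$, the test $w_1 > w_2 + w_3 + w_4$ holds, so the function returns $1$, i.e., $\mathbf{w} \in \mathcal{W}^{(1)}_{4,2}$.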

Definition 2 (Optimal Lagrange Multiplier): We say⁴ that $(c_{V|V',\alpha}, V \in \mathcal{V}_{K,\alpha}, V' \in \mathcal{V}'_{K,\alpha}[V])$ is an optimal Lagrange multiplier of $\mathrm{LP}^{\mathbf{w}}_{K,\alpha}$ with $\mathbf{w} \in \mathbb{R}^K_+$ if
$$\sum_{\substack{V \in \mathcal{V}_{K,\alpha}\\ V' \in \mathcal{V}'_{K,\alpha}[V]}} c_{V|V',\alpha}H(U_{V,\alpha}|U_{V',\alpha}) = f^{\mathbf{w}}_\alpha, \tag{4}$$
$$\sum_{\substack{V \in \mathcal{V}_{K,\alpha}: k \in V\\ V' \in \mathcal{V}'_{K,\alpha}[V]}} c_{V|V',\alpha} = w_k, \quad k \in [1:K], \tag{5}$$
$$c_{V|V',\alpha} \ge 0, \quad V \in \mathcal{V}_{K,\alpha},\, V' \in \mathcal{V}'_{K,\alpha}[V],$$
where $f^{\mathbf{w}}_\alpha$ denotes the optimal value of $\mathrm{LP}^{\mathbf{w}}_{K,\alpha}$.

It is in general not easy to find an optimal solution $(r^{\mathrm{opt}}_{k,\alpha}, k \in [1:K])$ and an optimal Lagrange multiplier $(c_{V|V',\alpha}, V \in \mathcal{V}_{K,\alpha}, V' \in \mathcal{V}'_{K,\alpha}[V])$ for a given $\mathrm{LP}^{\mathbf{w}}_{K,\alpha}$ (see Section IV for a detailed analysis of $\mathrm{LP}^{\mathbf{w}}_{3,2}$). However, the task becomes relatively straightforward when $\alpha = 1$ or $\alpha = K$, as shown by the following two lemmas (which can be proved via direct verification).

Lemma 1: For linear program $\mathrm{LP}^{\mathbf{w}}_{K,1}$ with $\mathbf{w} \in \mathbb{R}^K_+$, $(r^{\mathrm{opt}}_{k,1}, k \in [1:K])$ is an optimal solution and $(c_{V|\emptyset,1}, V \in \mathcal{V}_{K,1})$ is the unique optimal Lagrange multiplier, where
$$r^{\mathrm{opt}}_{k,1} \triangleq H(U_{k,1}), \quad k \in [1:K],$$
$$c_{\{k\}|\emptyset,1} \triangleq w_k, \quad k \in [1:K].$$

³For $\mathbf{w} \in \mathcal{W}_K$, we have $w_l > \frac{1}{\alpha-l}\sum_{k=l+1}^{K} w_k \Rightarrow w_{l'} > \frac{1}{\alpha-l'}\sum_{k=l'+1}^{K} w_k$ for $l' \in [1:l]$, and $w_l \le \frac{1}{\alpha-l}\sum_{k=l+1}^{K} w_k \Rightarrow w_{l'} \le \frac{1}{\alpha-l'}\sum_{k=l'+1}^{K} w_k$ for $l' \in [l:\alpha-1]$. Therefore, $\mathcal{W}^{(0)}_{K,\alpha}, \cdots, \mathcal{W}^{(\alpha-1)}_{K,\alpha}$ indeed form a partition of $\mathcal{W}_K$.

⁴It can be shown that $(c_{V|V',\alpha}, V \in \mathcal{V}_{K,\alpha}, V' \in \mathcal{V}'_{K,\alpha}[V])$ is an optimal Lagrange multiplier of $\mathrm{LP}^{\mathbf{w}}_{K,\alpha}$ if and only if it is an optimal solution to the (asymmetric) dual problem of $\mathrm{LP}^{\mathbf{w}}_{K,\alpha}$.

Lemma 2: For linear program $\mathrm{LP}^{\mathbf{w}}_{K,K}$ with $\mathbf{w} \in \mathcal{W}_K$, $(r^{\mathrm{opt}}_{k,K}, k \in [1:K])$ is an optimal solution and $(c_{V|[1:K]\setminus V,K}, V \in \mathcal{V}_{K,K})$ is an optimal Lagrange multiplier, where
$$r^{\mathrm{opt}}_{k,K} \triangleq H(U_{k,K}|U_{[k+1:K],K}), \quad k \in [1:K],$$
$$c_{[1:k]|[k+1:K],K} \triangleq w_k - w_{k+1}, \quad k \in [1:K],$$
$$c_{V|[1:K]\setminus V,K} \triangleq 0, \quad \text{otherwise}.$$
The general case $\mathbf{w} \in \mathbb{R}^K_+$ can be reduced to the case $\mathbf{w} \in \mathcal{W}_K$ via suitable relabelling.
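As a direct instance of Lemma 2, take $K = 3$ and $\mathbf{w} \in \mathcal{W}_3$ (recall $w_4 \triangleq 0$): the optimal solution is
$$r^{\mathrm{opt}}_{1,3} = H(U_{1,3}|U_{2,3},U_{3,3}), \quad r^{\mathrm{opt}}_{2,3} = H(U_{2,3}|U_{3,3}), \quad r^{\mathrm{opt}}_{3,3} = H(U_{3,3}),$$
with optimal Lagrange multiplier $c_{\{1\}|\{2,3\},3} = w_1 - w_2$, $c_{\{1,2\}|\{3\},3} = w_2 - w_3$, and $c_{\{1,2,3\}|\emptyset,3} = w_3$. One can check (5) directly (e.g., the multipliers of the sets containing $2$ sum to $(w_2 - w_3) + w_3 = w_2$), and expanding the conditional entropies by the chain rule confirms (4).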

B. Entropy Inequality

In this step we aim to establish a class of entropy inequalities needed for a matching converse by exploiting the properties of optimal Lagrange multipliers of $\mathrm{LP}^{\mathbf{w}}_{K,\alpha}$, $\alpha \in [1:K]$. More precisely, we shall identify suitable conditions under which there exist optimal Lagrange multipliers $(c_{V|V',\alpha}, V \in \mathcal{V}_{K,\alpha}, V' \in \mathcal{V}'_{K,\alpha}[V])$, $\alpha \in [1:K]$, such that
$$\sum_{\substack{V \in \mathcal{V}_{K,\alpha'}\\ V' \in \mathcal{V}'_{K,\alpha'}[V]}} c_{V|V',\alpha'}H(X_V|X_{V'}) \ge \sum_{\substack{V \in \mathcal{V}_{K,\alpha}\\ V' \in \mathcal{V}'_{K,\alpha}[V]}} c_{V|V',\alpha}H(X_V|X_{V'}) \tag{6}$$
for all $X_{[1:K]}$ and $\alpha \ge \alpha'$.

The following lemma indicates that (6) always holds when $\alpha' = 1$.

Lemma 3: We have
$$\sum_{V \in \mathcal{V}_{K,1}} c_{V|\emptyset,1}H(X_V) \ge \sum_{\substack{V \in \mathcal{V}_{K,\alpha}\\ V' \in \mathcal{V}'_{K,\alpha}[V]}} c_{V|V',\alpha}H(X_V|X_{V'})$$
for all $X_{[1:K]}$, where $(c_{V|\emptyset,1}, V \in \mathcal{V}_{K,1})$ and $(c_{V|V',\alpha}, V \in \mathcal{V}_{K,\alpha}, V' \in \mathcal{V}'_{K,\alpha}[V])$ are optimal Lagrange multipliers of $\mathrm{LP}^{\mathbf{w}}_{K,1}$ and $\mathrm{LP}^{\mathbf{w}}_{K,\alpha}$, respectively.

Proof: According to (5),
$$c_{\{k\}|\emptyset,1} = w_k = \sum_{\substack{V \in \mathcal{V}_{K,\alpha}: k \in V\\ V' \in \mathcal{V}'_{K,\alpha}[V]}} c_{V|V',\alpha}, \quad k \in [1:K]. \tag{7}$$
It can be verified that
$$\begin{aligned}
\sum_{V \in \mathcal{V}_{K,1}} c_{V|\emptyset,1}H(X_V) &= \sum_{k=1}^{K} c_{\{k\}|\emptyset,1}H(X_k) = \sum_{k=1}^{K} \sum_{\substack{V \in \mathcal{V}_{K,\alpha}: k \in V\\ V' \in \mathcal{V}'_{K,\alpha}[V]}} c_{V|V',\alpha}H(X_k) \qquad (8)\\
&= \sum_{\substack{V \in \mathcal{V}_{K,\alpha}\\ V' \in \mathcal{V}'_{K,\alpha}[V]}} c_{V|V',\alpha}\sum_{k \in V} H(X_k) \ge \sum_{\substack{V \in \mathcal{V}_{K,\alpha}\\ V' \in \mathcal{V}'_{K,\alpha}[V]}} c_{V|V',\alpha}H(X_V) \ge \sum_{\substack{V \in \mathcal{V}_{K,\alpha}\\ V' \in \mathcal{V}'_{K,\alpha}[V]}} c_{V|V',\alpha}H(X_V|X_{V'}),
\end{aligned}$$
where (8) is due to (7). This completes the proof of Lemma 3. ∎

C. Outer Bound

As shown by the following lemma, the existence of entropy inequalities (6) implies that $\mathcal{R}_K$ is an outer bound of $\mathcal{R}^*_K$.

Lemma 4: If, given any $\mathbf{w} \in \mathbb{R}^K_+$, there exist optimal Lagrange multipliers $(c_{V|V',\alpha}, V \in \mathcal{V}_{K,\alpha}, V' \in \mathcal{V}'_{K,\alpha}[V])$, $\alpha \in [1:K]$, such that (6) holds, then
$$\mathcal{R}^*_K \subseteq \mathcal{R}_K. \tag{9}$$

Proof: Let $(R_k, k \in [1:K])$ be an arbitrary admissible rate tuple. It suffices to show that
$$\sum_{k=1}^{K} w_kR_k \ge \sum_{\alpha=1}^{K} f^{\mathbf{w}}_\alpha, \tag{10}$$
from which (9) follows immediately. We shall prove via induction that, for any D-MLDC system satisfying (1) and (2),
$$\sum_{k=1}^{K} w_k(R_k + \epsilon) \ge \sum_{\alpha=1}^{\beta} f^{\mathbf{w}}_\alpha + \frac{1}{n}\sum_{\substack{V \in \mathcal{V}_{K,\beta}\\ V' \in \mathcal{V}'_{K,\beta}[V]}} c_{V|V',\beta}H\big(S_V|U^n_{[1:K],[1:\beta]}, S_{V'}\big) - \beta\delta_\epsilon\sum_{k=1}^{K} w_k, \quad \beta \in [1:K], \tag{11}$$
where $\delta_\epsilon$ tends to zero as $\epsilon \to 0$. One can deduce (10) from (11) by setting $\beta = K$ and sending $\epsilon \to 0$.

It can be verified that
$$\sum_{k=1}^{K} w_k(R_k + \epsilon) \ge \frac{1}{n}\sum_{k=1}^{K} w_k\log M_k \ge \frac{1}{n}\sum_{k=1}^{K} w_kH(S_k) = \frac{1}{n}\sum_{V \in \mathcal{V}_{K,1}} c_{V|\emptyset,1}H(S_V), \tag{12}$$
where (12) is due to the fact (see Lemma 1) that the optimal Lagrange multiplier $(c_{V|\emptyset,1}, V \in \mathcal{V}_{K,1})$ is uniquely given by $c_{\{k\}|\emptyset,1} \triangleq w_k$, $k \in [1:K]$. Note that
$$\begin{aligned}
H(S_V) &\ge H\big(U^n_{V,1}\big) + H\big(S_V|U^n_{V,1}\big) - H\big(U^n_{V,1}|S_V\big)\\
&\ge H\big(U^n_{V,1}\big) + H\big(S_V|U^n_{V,1}\big) - n\delta_\epsilon \qquad (13)\\
&\ge nH(U_{V,1}) + H\big(S_V|U^n_{[1:K],1}\big) - n\delta_\epsilon, \quad V \in \mathcal{V}_{K,1}, \qquad (14)
\end{aligned}$$
where (13) follows by (2) and Fano's inequality. Substituting (14) into (12) and invoking (4) proves (11) for $\beta = 1$.

Now assume that (11) holds for $\beta = B - 1$. In view of (6), we have
$$\sum_{\substack{V \in \mathcal{V}_{K,B-1}\\ V' \in \mathcal{V}'_{K,B-1}[V]}} c_{V|V',B-1}H\big(S_V|U^n_{[1:K],[1:B-1]}, S_{V'}\big) \ge \sum_{\substack{V \in \mathcal{V}_{K,B}\\ V' \in \mathcal{V}'_{K,B}[V]}} c_{V|V',B}H\big(S_V|U^n_{[1:K],[1:B-1]}, S_{V'}\big). \tag{15}$$
It can be verified that
$$\begin{aligned}
H\big(S_V|U^n_{[1:K],[1:B-1]}, S_{V'}\big) &= H\big(U^n_{V,B}, S_V|U^n_{[1:K],[1:B-1]}, S_{V'}\big) - H\big(U^n_{V,B}|U^n_{[1:K],[1:B-1]}, S_V, S_{V'}\big)\\
&\ge H\big(U^n_{V,B}, S_V|U^n_{[1:K],[1:B-1]}, S_{V'}\big) - n\delta_\epsilon, \quad V \in \mathcal{V}_{K,B},\, V' \in \mathcal{V}'_{K,B}[V], \qquad (16)
\end{aligned}$$
where (16) follows by (2) and Fano's inequality. Moreover,
$$\begin{aligned}
H\big(U^n_{V,B}, S_V|U^n_{[1:K],[1:B-1]}, S_{V'}\big) &\ge H\big(U^n_{V,B}|U^n_{[1:K],[1:B-1]}, S_{V'}\big) + H\big(S_V|U^n_{[1:K],[1:B]}, S_{V'}\big)\\
&\ge H\big(U^n_{V,B}|U^n_{V',B}\big) + H\big(S_V|U^n_{[1:K],[1:B]}, S_{V'}\big) \qquad (17)\\
&= nH(U_{V,B}|U_{V',B}) + H\big(S_V|U^n_{[1:K],[1:B]}, S_{V'}\big), \qquad (18)
\end{aligned}$$
where (17) is due to the fact that $U^n_{V,B} \leftrightarrow U^n_{V',B} \leftrightarrow \big(U^n_{[1:K],[1:B-1]}, S_{V'}\big)$ form a Markov chain. Continuing from (15),
$$\begin{aligned}
&\sum_{\substack{V \in \mathcal{V}_{K,B-1}\\ V' \in \mathcal{V}'_{K,B-1}[V]}} c_{V|V',B-1}H\big(S_V|U^n_{[1:K],[1:B-1]}, S_{V'}\big)\\
&\ge n\sum_{\substack{V \in \mathcal{V}_{K,B}\\ V' \in \mathcal{V}'_{K,B}[V]}} c_{V|V',B}H(U_{V,B}|U_{V',B}) + \sum_{\substack{V \in \mathcal{V}_{K,B}\\ V' \in \mathcal{V}'_{K,B}[V]}} c_{V|V',B}H\big(S_V|U^n_{[1:K],[1:B]}, S_{V'}\big) - n\delta_\epsilon\sum_{\substack{V \in \mathcal{V}_{K,B}\\ V' \in \mathcal{V}'_{K,B}[V]}} c_{V|V',B} \qquad (19)\\
&\ge n\sum_{\substack{V \in \mathcal{V}_{K,B}\\ V' \in \mathcal{V}'_{K,B}[V]}} c_{V|V',B}H(U_{V,B}|U_{V',B}) + \sum_{\substack{V \in \mathcal{V}_{K,B}\\ V' \in \mathcal{V}'_{K,B}[V]}} c_{V|V',B}H\big(S_V|U^n_{[1:K],[1:B]}, S_{V'}\big) - n\delta_\epsilon\sum_{k=1}^{K}\sum_{\substack{V \in \mathcal{V}_{K,B}: k \in V\\ V' \in \mathcal{V}'_{K,B}[V]}} c_{V|V',B}\\
&= nf^{\mathbf{w}}_B + \sum_{\substack{V \in \mathcal{V}_{K,B}\\ V' \in \mathcal{V}'_{K,B}[V]}} c_{V|V',B}H\big(S_V|U^n_{[1:K],[1:B]}, S_{V'}\big) - n\delta_\epsilon\sum_{k=1}^{K} w_k, \qquad (20)
\end{aligned}$$
where (19) is due to (16) and (18), and (20) is due to (4) and (5). Combining (20) and the induction hypothesis proves (11) for $\beta = B$. ∎

IV. PROOF OF THEOREM 1

Theorem 1 is trivially true when $K = 1$. The case $K = 2$ is a simple consequence of Lemma 3 and Lemma 4. Therefore, only the case $K = 3$ remains to be proved.

To this end, we shall give a detailed analysis of $\mathrm{LP}^{\mathbf{w}}_{3,2}$. First consider the following related linear program
$$\widetilde{\mathrm{LP}}{}^{\mathbf{w}}_{3,2}: \;\; \min \sum_{k=1}^{3} w_kr_{k,2} \;\; \text{over } r_{k,2},\, k \in [1:3],$$
$$\text{s.t. } r_{k,2} \ge \psi_{\{k\}}, \quad k \in [1:3],$$
$$r_{i,2} + r_{j,2} \ge \psi_{\{i\}} + \psi_{\{i,j\}} + \psi_{\{j\}}, \quad i,j \in [1:3],\, i \ne j,$$
where $\psi_V$, $V \in \mathcal{V}_{3,2}$, are non-negative real numbers. We say that $(\tilde{c}_{V,2}, V \in \mathcal{V}_{3,2})$ is an optimal Lagrange multiplier of $\widetilde{\mathrm{LP}}{}^{\mathbf{w}}_{3,2}$ if
$$\sum_{k=1}^{3} \tilde{c}_{\{k\},2}\psi_{\{k\}} + \sum_{i,j \in [1:3],\, i<j} \tilde{c}_{\{i,j\},2}\big(\psi_{\{i\}} + \psi_{\{i,j\}} + \psi_{\{j\}}\big) = \tilde{f}^{\mathbf{w}}_\psi,$$
$$\sum_{V \in \mathcal{V}_{3,2}: k \in V} \tilde{c}_{V,2} = w_k, \quad k \in [1:3],$$
$$\tilde{c}_{V,2} \ge 0, \quad V \in \mathcal{V}_{3,2},$$
where $\tilde{f}^{\mathbf{w}}_\psi$ denotes the optimal value of $\widetilde{\mathrm{LP}}{}^{\mathbf{w}}_{3,2}$. One can solve $\widetilde{\mathrm{LP}}{}^{\mathbf{w}}_{3,2}$ with $\mathbf{w} \in \mathcal{W}_3$ by considering 5 different cases (see Table I).

Now set
$$\psi_{\{k\}} \triangleq H(U_{k,2}|U_{\nu_k,2}), \quad k \in [1:3],$$
$$\psi_{\{i,j\}} \triangleq \max\big\{H(U_{i,2},U_{j,2}) - \psi_{\{i\}} - \psi_{\{j\}},\, 0\big\}, \quad i,j \in [1:3],\, i \ne j,$$
where $\nu_k$ is a maximizer of $\max_{\nu \in [1:3]\setminus\{k\}} H(U_{k,2}|U_{\nu,2})$.

Moreover, define
$$\widetilde{\mathcal{R}}_{3,2} \triangleq \big\{(r_{k,2}, k \in [1:3]) : r_{k,2} \ge \psi_{\{k\}},\, k \in [1:3];\; r_{i,2} + r_{j,2} \ge \psi_{\{i\}} + \psi_{\{i,j\}} + \psi_{\{j\}},\, i,j \in [1:3],\, i \ne j\big\}.$$
One can prove via direct verification that $\widetilde{\mathcal{R}}_{3,2}$ coincides with $\mathcal{R}_{3,2}$.

Lemma 5: $\widetilde{\mathcal{R}}_{3,2} = \mathcal{R}_{3,2}$.

See Fig. 2 for illustrations of $\mathcal{R}_{3,2}$ (i.e., $\widetilde{\mathcal{R}}_{3,2}$), where the optimal solutions in Table I are highlighted.

TABLE I. Linear program $\widetilde{\mathrm{LP}}{}^{\mathbf{w}}_{3,2}$.

Fig. 2. Illustrations of $\mathcal{R}_{3,2}$ (i.e., $\widetilde{\mathcal{R}}_{3,2}$) and optimal solutions in Table I.

Lemma 6: Let $i, j, k$ be three distinct integers in $[1:3]$.

(1)
$$\psi_{\{i\}} + \psi_{\{i,j\}} + \psi_{\{j\}} = \begin{cases} H(U_{i,2},U_{j,2}), & H(U_{i,2},U_{j,2}) \ge \psi_{\{i\}} + \psi_{\{j\}},\\ H(U_{i,2}|U_{k,2}) + H(U_{j,2}|U_{k,2}), & H(U_{i,2},U_{j,2}) < \psi_{\{i\}} + \psi_{\{j\}}. \end{cases}$$

(2) If $\psi_{\{i,j\}} > \psi_{\{i,k\}} + \psi_{\{j,k\}}$, then
$$\psi_{\{i\}} + \psi_{\{i,j\}} + \psi_{\{j\}} = H(U_{i,2},U_{j,2}).$$

(3) If $\psi_{\{i\}} + \psi_{\{i,j\}} + \psi_{\{j\}} = H(U_{i,2}|U_{k,2}) + H(U_{j,2}|U_{k,2})$, then
$$\psi_{\{i\}} + \psi_{\{i,k\}} + \psi_{\{k\}} = H(U_{i,2},U_{k,2}), \qquad \psi_{\{j\}} + \psi_{\{j,k\}} + \psi_{\{k\}} = H(U_{j,2},U_{k,2}).$$

Proof: See Appendix A.

According to Lemma 6, we have the following four cases for $\psi_{\{i\}} + \psi_{\{i,j\}} + \psi_{\{j\}}$ ($i,j \in [1:3]$, $i \ne j$):

(Case A)
$$\psi_{\{1\}} + \psi_{\{1,2\}} + \psi_{\{2\}} = H(U_{1,2},U_{2,2}),$$
$$\psi_{\{1\}} + \psi_{\{1,3\}} + \psi_{\{3\}} = H(U_{1,2},U_{3,2}),$$
$$\psi_{\{2\}} + \psi_{\{2,3\}} + \psi_{\{3\}} = H(U_{2,2},U_{3,2});$$

(Case B)
$$\psi_{\{1\}} + \psi_{\{1,2\}} + \psi_{\{2\}} = H(U_{1,2},U_{2,2}),$$
$$\psi_{\{1\}} + \psi_{\{1,3\}} + \psi_{\{3\}} = H(U_{1,2},U_{3,2}),$$
$$\psi_{\{2\}} + \psi_{\{2,3\}} + \psi_{\{3\}} = H(U_{2,2}|U_{1,2}) + H(U_{3,2}|U_{1,2}) > H(U_{2,2},U_{3,2});$$

(Case C)
$$\psi_{\{1\}} + \psi_{\{1,2\}} + \psi_{\{2\}} = H(U_{1,2},U_{2,2}),$$
$$\psi_{\{1\}} + \psi_{\{1,3\}} + \psi_{\{3\}} = H(U_{1,2}|U_{2,2}) + H(U_{3,2}|U_{2,2}) > H(U_{1,2},U_{3,2}),$$
$$\psi_{\{2\}} + \psi_{\{2,3\}} + \psi_{\{3\}} = H(U_{2,2},U_{3,2});$$

(Case D)
$$\psi_{\{1\}} + \psi_{\{1,2\}} + \psi_{\{2\}} = H(U_{1,2}|U_{3,2}) + H(U_{2,2}|U_{3,2}) > H(U_{1,2},U_{2,2}),$$
$$\psi_{\{1\}} + \psi_{\{1,3\}} + \psi_{\{3\}} = H(U_{1,2},U_{3,2}),$$
$$\psi_{\{2\}} + \psi_{\{2,3\}} + \psi_{\{3\}} = H(U_{2,2},U_{3,2}).$$

TABLE II. Optimal Lagrange multipliers of $\mathrm{LP}^{\mathbf{w}}_{3,2}$ for all possible cases.

Now one can readily solve $\mathrm{LP}^{\mathbf{w}}_{3,2}$ with $\mathbf{w} \in \mathcal{W}_3$ by considering all possible combinations of these four cases and those in Table I (i.e., Cases 1–5). For example, consider the scenario where Case 2 and Case C are simultaneously satisfied (henceforth called Case 2C). It can be verified that
$$\psi_{\{1\}} = H(U_{1,2}|U_{2,2}), \quad \psi_{\{3\}} = H(U_{3,2}|U_{2,2}), \quad \psi_{\{1,3\}} = 0,$$
$$\psi_{\{1\}} + \psi_{\{1,2\}} + \psi_{\{2\}} = H(U_{1,2},U_{2,2}).$$
For $(\tilde{r}^{\mathrm{opt}}_{k,2}, k \in [1:3])$ in Table I, we have
$$\tilde{r}^{\mathrm{opt}}_{1,2} = \psi_{\{1\}} + \psi_{\{1,3\}} = H(U_{1,2}|U_{2,2}),$$
$$\tilde{r}^{\mathrm{opt}}_{2,2} = \psi_{\{2\}} + \psi_{\{1,2\}} - \psi_{\{1,3\}} = \big(\psi_{\{1\}} + \psi_{\{1,2\}} + \psi_{\{2\}}\big) - \big(\psi_{\{1\}} + \psi_{\{1,3\}}\big) = H(U_{2,2}),$$
$$\tilde{r}^{\mathrm{opt}}_{3,2} = \psi_{\{3\}} = H(U_{3,2}|U_{2,2}).$$
In view of Lemma 5, $(r^{\mathrm{opt}}_{k,2}, k \in [1:3])$ with $r^{\mathrm{opt}}_{k,2} \triangleq \tilde{r}^{\mathrm{opt}}_{k,2}$, $k \in [1:3]$, is an optimal solution of $\mathrm{LP}^{\mathbf{w}}_{3,2}$. Therefore, the optimal value of $\mathrm{LP}^{\mathbf{w}}_{3,2}$ is given by
$$f^{\mathbf{w}}_2 = w_1H(U_{1,2}|U_{2,2}) + w_2H(U_{2,2}) + w_3H(U_{3,2}|U_{2,2}).$$
Moreover, $(c_{V|V',2}, V \in \mathcal{V}_{3,2}, V' \in \mathcal{V}'_{3,2}[V])$ with
$$c_{\{1\}|\{2\},2} \triangleq w_1 - w_2, \quad c_{\{3\}|\{2\},2} \triangleq w_3, \quad c_{\{1,2\}|\emptyset,2} \triangleq w_2, \quad c_{V|V',2} \triangleq 0 \;\text{otherwise},$$
is an optimal Lagrange multiplier of $\mathrm{LP}^{\mathbf{w}}_{3,2}$.

One can obtain the following lemma by analyzing the other combinations in the same manner. It is worth mentioning that not all combinations are possible. Specifically, Cases 2D, 3C, and 4B violate Lemma 6(2), so such combinations are void.

Lemma 7: For linear program $\mathrm{LP}^{\mathbf{w}}_{3,2}$ with $\mathbf{w} \in \mathcal{W}_3$, $(c_{V|V',2}, V \in \mathcal{V}_{3,2}, V' \in \mathcal{V}'_{3,2}[V])$ in Table II⁵ is an optimal Lagrange multiplier. The general case $\mathbf{w} \in \mathbb{R}^3_+$ can be reduced to the case $\mathbf{w} \in \mathcal{W}_3$ via suitable relabelling.

The next result shows that (6) holds when $K = 3$, which, together with (3) and Lemma 4, completes the proof of Theorem 1.

Lemma 8: We have
$$\sum_{V \in \mathcal{V}_{3,1}} c_{V,1}H(X_V) \ge \sum_{\substack{V \in \mathcal{V}_{3,2}\\ V' \in \mathcal{V}'_{3,2}[V]}} c_{V|V',2}H(X_V|X_{V'}) \tag{21}$$
$$\ge \sum_{V \in \mathcal{V}_{3,3}} c_{V,3}H\big(X_V|X_{[1:3]\setminus V}\big) \tag{22}$$
for all $X_{[1:3]}$, where $(c_{V,1}, V \in \mathcal{V}_{3,1})$, $(c_{V|V',2}, V \in \mathcal{V}_{3,2}, V' \in \mathcal{V}'_{3,2}[V])$, and $(c_{V,3}, V \in \mathcal{V}_{3,3})$ are the optimal Lagrange multipliers in Lemma 1, Lemma 7, and Lemma 2, respectively.

⁵We set $c_{\{k\}|\{k'\},2} = 0$ for $k' \ne \nu_k$.

Proof: Note that (21) follows from Lemma 3. The proof of (22) is relegated to Appendix B.

V. PROOF OF THEOREM 2

The proof of Theorem 2 also largely follows the general approach outlined in Section III. However, due to the symmetry assumption, some simplifications are possible.

A. Linear Program

When the distribution of $U_{[1:K],[1:K]}$ is symmetrical entropy-wise, $H(U_{V,\alpha}|U_{V',\alpha})$ depends on $V \in \mathcal{V}_{K,\alpha}$ and $V' \in \mathcal{V}'_{K,\alpha}[V]$ only through $|V|$; for this reason, we shall denote it as $H_{|V|,\alpha}$ and rewrite $\mathrm{LP}^{\mathbf{w}}_{K,\alpha}$ in the following simpler form:
$$\mathrm{LP}^{\mathbf{w}}_{K,\alpha}: \;\; \min \sum_{k=1}^{K} w_kr_{k,\alpha} \;\; \text{over } r_{k,\alpha},\, k \in [1:K], \qquad \text{s.t. } \sum_{k \in V} r_{k,\alpha} \ge H_{|V|,\alpha}, \; V \in \mathcal{V}_{K,\alpha}.$$

Definition 3: We say that $(c_{V,\alpha}, V \in \mathcal{V}_{K,\alpha})$ is an optimal Lagrange multiplier of $\mathrm{LP}^{\mathbf{w}}_{K,\alpha}$ with $\mathbf{w} \in \mathbb{R}^K_+$ if
$$\sum_{V \in \mathcal{V}_{K,\alpha}} c_{V,\alpha}H_{|V|,\alpha} = f^{\mathbf{w}}_\alpha, \tag{23}$$
$$\sum_{V \in \mathcal{V}_{K,\alpha}: k \in V} c_{V,\alpha} = w_k, \quad k \in [1:K], \tag{24}$$
$$c_{V,\alpha} \ge 0, \quad V \in \mathcal{V}_{K,\alpha}, \tag{25}$$
where $f^{\mathbf{w}}_\alpha$ denotes the optimal value of $\mathrm{LP}^{\mathbf{w}}_{K,\alpha}$.

For $l \in [0:\alpha-1]$, define
$$r^{(l)}_{k,\alpha} \triangleq \begin{cases} H_{k,\alpha} - H_{k-1,\alpha}, & k \in [1:l],\\[2pt] \dfrac{H_{\alpha,\alpha} - H_{l,\alpha}}{\alpha - l}, & k \in [l+1:K], \end{cases}$$
where $H_{0,\alpha} \triangleq 0$.

Lemma 9: $(r^{(l)}_{k,\alpha}, k \in [1:K]) \in \mathcal{R}_{K,\alpha}$.

Proof: See Appendix C.
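As a small illustration (ours, not from the paper), $(r^{(l)}_{k,\alpha}, k \in [1:K])$ can be computed directly from the symmetric entropies $H_{m,\alpha}$:

```python
def layered_rates(H, alpha, l, K):
    """Compute (r^{(l)}_{k,alpha}, k in [1:K]) from a hypothetical lookup
    H[m] = H_{m,alpha} with H[0] = 0: the first l encoders take successive
    entropy increments, and the remaining ones share the rest equally."""
    tail = (H[alpha] - H[l]) / (alpha - l)
    return [H[k] - H[k - 1] if k <= l else tail for k in range(1, K + 1)]
```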

For $\alpha \in [1:K]$ and $l \in [0:\alpha-1]$, define
$$\Omega^{(l)}_{K,\alpha} \triangleq \{V \subseteq [1:K] : |V| = \alpha,\, [1:l] \subseteq V\}.$$
Recall that $\mathcal{W}^{(0)}_{K,\alpha}, \cdots, \mathcal{W}^{(\alpha-1)}_{K,\alpha}$ form a partition of $\mathcal{W}_K$. For $\mathbf{w} \in \mathcal{W}_K$ and $\alpha \in [1:K]$, let $l^{\mathbf{w}}_\alpha$ denote the unique integer in $[0:\alpha-1]$ such that $\mathbf{w} \in \mathcal{W}^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}$; it is easy to verify that
$$0 = l^{\mathbf{w}}_1 \le l^{\mathbf{w}}_2 \le \cdots \le l^{\mathbf{w}}_{K-1} \le l^{\mathbf{w}}_K.$$

Moreover, for $\mathbf{w} \in \mathcal{W}_K$ and $\alpha \in [1:K]$, define
$$\lambda^{\mathbf{w}}_\alpha \triangleq \frac{1}{\alpha - l^{\mathbf{w}}_\alpha}\sum_{k=l^{\mathbf{w}}_\alpha+1}^{K} w_k$$
and
$$\mathcal{C}^{\mathbf{w}}_{K,\alpha} \triangleq \Big\{(c_{V,\alpha} : V \in \mathcal{V}_{K,\alpha}) :$$
$$c_{[1:k],\alpha} = w_k - w_{k+1}, \quad k \in [1:l^{\mathbf{w}}_\alpha - 1], \tag{26}$$
$$c_{[1:l^{\mathbf{w}}_\alpha],\alpha} = w_{l^{\mathbf{w}}_\alpha} - \lambda^{\mathbf{w}}_\alpha, \tag{27}$$
$$c_{V,\alpha} \ge 0, \quad V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}, \tag{28}$$
$$c_{V,\alpha} = 0, \quad \text{otherwise}, \tag{29}$$
$$\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha} = w_k, \quad k \in [l^{\mathbf{w}}_\alpha + 1 : K]\Big\}. \tag{30}$$
Note that (26) and (27) are void when $l^{\mathbf{w}}_\alpha = 0$. The definition of $\mathcal{C}^{\mathbf{w}}_{K,\alpha}$ can be extended to the case $\mathbf{w} \in \mathbb{R}^K_+$ through suitable relabelling.

K+ through suitablerelabelling.

Lemma 10: For any $\mathbf{w} \in \mathcal{W}_K$, $(c_{V,\alpha-1}, V \in \mathcal{V}_{K,\alpha-1}) \in \mathcal{C}^{\mathbf{w}}_{K,\alpha-1}$, and $(c_{V,\alpha}, V \in \mathcal{V}_{K,\alpha}) \in \mathcal{C}^{\mathbf{w}}_{K,\alpha}$,
$$\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha} = \lambda^{\mathbf{w}}_\alpha, \tag{31}$$
$$\sum_{V \in \mathcal{V}_{K,\alpha}: k \in V} c_{V,\alpha} = w_k, \quad k \in [1:K], \tag{32}$$
$$c_{V,\alpha} \ge 0, \quad V \in \mathcal{V}_{K,\alpha}, \tag{33}$$
$$c_{[1:l^{\mathbf{w}}_{\alpha-1}],\alpha} - c_{[1:l^{\mathbf{w}}_{\alpha-1}],\alpha-1} = \theta^{\mathbf{w}}_\alpha \ge 0, \tag{34}$$
$$\sum_{V \in \mathcal{V}_{K,\alpha}} c_{V,\alpha} = \begin{cases} \dfrac{1}{\alpha}\displaystyle\sum_{k=1}^{K} w_k, & l^{\mathbf{w}}_\alpha = 0,\\[4pt] w_1, & l^{\mathbf{w}}_\alpha > 0, \end{cases} \tag{35}$$
where
$$\theta^{\mathbf{w}}_\alpha \triangleq \Bigg(\lambda^{\mathbf{w}}_\alpha - \sum_{k=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} (\alpha - 1 - k)\,c_{[1:k],\alpha}\Bigg)\frac{1}{\alpha - 1 - l^{\mathbf{w}}_{\alpha-1}}. \tag{36}$$

Proof: See Appendix D.

The main result of Section V-A is as follows.

Lemma 11: For linear program $\mathrm{LP}^{\mathbf{w}}_{K,\alpha}$ with $\mathbf{w} \in \mathcal{W}_K$, $(r^{(l^{\mathbf{w}}_\alpha)}_{k,\alpha}, k \in [1:K])$ is an optimal solution, and every $(c_{V,\alpha}, V \in \mathcal{V}_{K,\alpha}) \in \mathcal{C}^{\mathbf{w}}_{K,\alpha}$ is an optimal Lagrange multiplier.

Proof: In view of Lemma 9, we have $(r^{(l^{\mathbf{w}}_\alpha)}_{k,\alpha}, k \in [1:K]) \in \mathcal{R}_{K,\alpha}$. Consider an arbitrary $(c_{V,\alpha}, V \in \mathcal{V}_{K,\alpha}) \in \mathcal{C}^{\mathbf{w}}_{K,\alpha}$. It follows from Lemma 10 that $(c_{V,\alpha}, V \in \mathcal{V}_{K,\alpha})$ satisfies (24) and (25). Note that
$$\begin{aligned}
\sum_{k=1}^{K} w_kr^{(l^{\mathbf{w}}_\alpha)}_{k,\alpha} &= \sum_{k=1}^{l^{\mathbf{w}}_\alpha} w_k\big(H_{k,\alpha} - H_{k-1,\alpha}\big) + \sum_{k=l^{\mathbf{w}}_\alpha+1}^{K} w_k\frac{H_{\alpha,\alpha} - H_{l^{\mathbf{w}}_\alpha,\alpha}}{\alpha - l^{\mathbf{w}}_\alpha}\\
&= \sum_{k=1}^{l^{\mathbf{w}}_\alpha-1} (w_k - w_{k+1})H_{k,\alpha} + \big(w_{l^{\mathbf{w}}_\alpha} - \lambda^{\mathbf{w}}_\alpha\big)H_{l^{\mathbf{w}}_\alpha,\alpha} + \lambda^{\mathbf{w}}_\alpha H_{\alpha,\alpha}\\
&= \sum_{k=1}^{l^{\mathbf{w}}_\alpha-1} c_{[1:k],\alpha}H_{k,\alpha} + c_{[1:l^{\mathbf{w}}_\alpha],\alpha}H_{l^{\mathbf{w}}_\alpha,\alpha} + \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}H_{\alpha,\alpha} \qquad (37)\\
&= \sum_{V \in \mathcal{V}_{K,\alpha}} c_{V,\alpha}H_{|V|,\alpha},
\end{aligned}$$
where (37) is due to (31). On the other hand, for any $(r_{k,\alpha} : k \in [1:K]) \in \mathcal{R}_{K,\alpha}$,
$$\sum_{k=1}^{K} w_kr_{k,\alpha} = \sum_{k=1}^{K}\sum_{V \in \mathcal{V}_{K,\alpha}: k \in V} c_{V,\alpha}r_{k,\alpha} = \sum_{V \in \mathcal{V}_{K,\alpha}} c_{V,\alpha}\sum_{k \in V} r_{k,\alpha} \ge \sum_{V \in \mathcal{V}_{K,\alpha}} c_{V,\alpha}H_{|V|,\alpha}.$$
Therefore, $(r^{(l^{\mathbf{w}}_\alpha)}_{k,\alpha}, k \in [1:K])$ is an optimal solution. This also shows that $(c_{V,\alpha}, V \in \mathcal{V}_{K,\alpha})$ satisfies (23), and thus is indeed an optimal Lagrange multiplier. ∎

B. Entropy Inequality

Define the indicator function
$$\mathbb{I}(\text{event}) \triangleq \begin{cases} 1, & \text{event is true},\\ 0, & \text{event is false}. \end{cases}$$

Lemma 12: For $\mathbf{w} \in \mathcal{W}_K$ with $\lambda^{\mathbf{w}}_\alpha > 0$ and $(c_{V,\alpha}, V \in \mathcal{V}_{K,\alpha}) \in \mathcal{C}^{\mathbf{w}}_{K,\alpha}$, define $(c_{V',\alpha-1}, V' \in \mathcal{V}_{K,\alpha-1})$ as follows:
$$c_{[1:k],\alpha-1} \triangleq w_k - w_{k+1}, \quad k \in [1:l^{\mathbf{w}}_{\alpha-1} - 1],$$
$$c_{[1:l^{\mathbf{w}}_{\alpha-1}],\alpha-1} \triangleq w_{l^{\mathbf{w}}_{\alpha-1}} - \lambda^{\mathbf{w}}_{\alpha-1},$$
$$c_{V',\alpha-1} \triangleq \frac{\theta^{\mathbf{w}}_\alpha}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: V' \subseteq V} c_{V,\alpha} + \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\sum_{k=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:k],\alpha}\sum_{\tau=k+1}^{\alpha} \mathbb{I}\big\{V' = \langle V\rangle_{[1:\alpha]\setminus\{\tau\}}\big\}, \quad V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1},$$
$$c_{V',\alpha-1} \triangleq 0, \quad \text{otherwise},$$
where $\theta^{\mathbf{w}}_\alpha$ is given by (36). The following statements are true.

(1) $(c_{V',\alpha-1}, V' \in \mathcal{V}_{K,\alpha-1}) \in \mathcal{C}^{\mathbf{w}}_{K,\alpha-1}$.

(2) We have
$$\sum_{V' \in \mathcal{V}_{K,\alpha-1}} c_{V',\alpha-1}H(X_{V'}) \ge \sum_{V \in \mathcal{V}_{K,\alpha}} c_{V,\alpha}H(X_V)$$
for all $X_{[1:K]}$.

Proof: See Appendix E.

Lemma 13: Given any $\mathbf{w} \in \mathbb{R}^K_+$, there exist $(c_{V,\alpha}, V \in \mathcal{V}_{K,\alpha}) \in \mathcal{C}^{\mathbf{w}}_{K,\alpha}$, $\alpha \in [1:K]$, such that
$$\sum_{V' \in \mathcal{V}_{K,\alpha'}} c_{V',\alpha'}H(X_{V'}) \ge \sum_{V \in \mathcal{V}_{K,\alpha}} c_{V,\alpha}H(X_V) \tag{38}$$
for all $X_{[1:K]}$ and $\alpha \ge \alpha'$.

Proof: By symmetry, it suffices to consider $\mathbf{w} \in \mathcal{W}_K$. We shall first assume $w_K > 0$, which implies $\lambda^{\mathbf{w}}_\alpha > 0$, $\alpha \in [1:K]$. Define $(c_{V,K}, V \in \mathcal{V}_{K,K})$ with
$$c_{[1:k],K} \triangleq w_k - w_{k+1}, \quad k \in [1:K], \qquad c_{V,K} \triangleq 0, \quad \text{otherwise}.$$
It is easy to verify that $(c_{V,K}, V \in \mathcal{V}_{K,K}) \in \mathcal{C}^{\mathbf{w}}_{K,K}$. One can successively construct the desired $(c_{V,\alpha}, V \in \mathcal{V}_{K,\alpha})$ from $\alpha = K - 1$ down to $\alpha = 1$ by invoking Lemma 12.

Now consider the case $w_1 \ge \cdots \ge w_{K-1} > w_K = 0$. The preceding argument implies the existence of $(c'_{V,\alpha}, V \in \mathcal{V}_{K-1,\alpha}) \in \mathcal{C}^{\mathbf{w}'}_{K-1,\alpha}$, $\alpha \in [1:K-1]$, such that
$$\sum_{V' \in \mathcal{V}_{K-1,\alpha'}} c'_{V',\alpha'}H(X_{V'}) \ge \sum_{V \in \mathcal{V}_{K-1,\alpha}} c'_{V,\alpha}H(X_V)$$
for all $X_{[1:K-1]}$ and $\alpha \ge \alpha'$, where $\mathbf{w}' \triangleq (w_1, \cdots, w_{K-1})$. Define $(c_{V,\alpha}, V \in \mathcal{V}_{K,\alpha})$, $\alpha \in [1:K-1]$, with
$$c_{V,\alpha} \triangleq c'_{V,\alpha}, \quad K \notin V, \qquad c_{V,\alpha} \triangleq 0, \quad \text{otherwise},$$
and $(c_{V,K}, V \in \mathcal{V}_{K,K})$ with
$$c_{V,K} \triangleq c'_{V,K-1}, \quad K \notin V, \qquad c_{V,K} \triangleq 0, \quad \text{otherwise}.$$
It is easy to verify that such $(c_{V,\alpha}, V \in \mathcal{V}_{K,\alpha})$, $\alpha \in [1:K]$, have the desired properties. The general case where $w_1 \ge \cdots \ge w_{K'-1} > w_{K'} = \cdots = w_K = 0$ for some $K' \le K$ can be handled via induction.⁶

C. Outer Bound

The following result, together with (3) and Lemma 13, completes the proof of Theorem 2.

Lemma 14: If, given any $\mathbf{w} \in \mathbb{R}^K_+$, there exist $(c_{V,\alpha}, V \in \mathcal{V}_{K,\alpha}) \in \mathcal{C}^{\mathbf{w}}_{K,\alpha}$, $\alpha \in [1:K]$, such that (38) holds, then
$$\mathcal{R}^*_K \subseteq \mathcal{R}_K$$
when the distribution of $U_{[1:K],[1:K]}$ is symmetrical entropy-wise.

Proof: Let $(R_k : k \in [1:K])$ be an arbitrary admissible rate tuple. It suffices to show that
$$\sum_{k=1}^{K} w_kR_k \ge \sum_{\alpha=1}^{K} f^{\mathbf{w}}_\alpha. \tag{39}$$
Without loss of generality, we assume $\mathbf{w} \in \mathcal{W}_K$. We shall prove via induction that, when the distribution of $U_{[1:K],[1:K]}$ is symmetrical entropy-wise, for any D-MLDC system satisfying (1) and (2),
$$\sum_{k=1}^{K} w_k(R_k + \epsilon) \ge \sum_{\alpha=1}^{\beta} f^{\mathbf{w}}_\alpha + \frac{1}{n}\sum_{k=1}^{l^{\mathbf{w}}_\beta} c_{[1:k],\beta}H\big(S_{[1:k]}|U^n_{[1:K],[1:\beta]}, U^n_{[k+1:\beta],[\beta:K]}\big) + \frac{1}{n}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\beta)}_{K,\beta}} c_{V,\beta}H\big(S_V|U^n_{[1:K],[1:\beta]}\big) - \beta\delta_\epsilon\sum_{k=1}^{K} w_k, \quad \beta \in [1:K], \tag{40}$$
where $\delta_\epsilon$ tends to zero as $\epsilon \to 0$. One can deduce (39) from (40) by setting $\beta = K$ and sending $\epsilon \to 0$.

⁶If $w_1 = \cdots = w_K = 0$, then $c_{V,\alpha} = 0$ for all $V \in \mathcal{V}_{K,\alpha}$ and $\alpha \in [1:K]$.

The proof of (40) for $\beta = 1$ is the same as that of (11). Now assume that (40) holds for $\beta = B - 1$. In view of (38), we have
$$\sum_{V' \in \Omega^{(l^{\mathbf{w}}_{B-1})}_{K,B-1}} c_{V',B-1}H\big(S_{V'}|U^n_{[1:K],[1:B-1]}\big) \ge \sum_{k=1}^{l^{\mathbf{w}}_B} \big(c_{[1:k],B} - c_{[1:k],B-1}\mathbb{I}\{k \le l^{\mathbf{w}}_{B-1}\}\big)H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}\big) + \sum_{V \in \Omega^{(l^{\mathbf{w}}_B)}_{K,B}} c_{V,B}H\big(S_V|U^n_{[1:K],[1:B-1]}\big), \tag{41}$$
where $c_{[1:k],B} - c_{[1:k],B-1}\mathbb{I}\{k \le l^{\mathbf{w}}_{B-1}\} \ge 0$, $k \in [1:l^{\mathbf{w}}_B]$, according to (33), (34), and the fact that $c_{[1:k],B-1} = c_{[1:k],B}$ when $k \in [1:l^{\mathbf{w}}_{B-1} - 1]$. Moreover,
$$\begin{aligned}
&\sum_{k=1}^{l^{\mathbf{w}}_B} \big(c_{[1:k],B} - c_{[1:k],B-1}\mathbb{I}\{k \le l^{\mathbf{w}}_{B-1}\}\big)H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}\big)\\
&\ge \sum_{k=1}^{l^{\mathbf{w}}_B} \big(c_{[1:k],B} - c_{[1:k],B-1}\mathbb{I}\{k \le l^{\mathbf{w}}_{B-1}\}\big)H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}\big)\\
&= \sum_{k=1}^{l^{\mathbf{w}}_B} c_{[1:k],B}H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}\big) - \sum_{k=1}^{l^{\mathbf{w}}_{B-1}} c_{[1:k],B-1}H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}\big)\\
&\ge \sum_{k=1}^{l^{\mathbf{w}}_B} c_{[1:k],B}H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}\big) - \sum_{k=1}^{l^{\mathbf{w}}_{B-1}} c_{[1:k],B-1}H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B-1],[B-1:K]}\big). \qquad (42)
\end{aligned}$$
Combining (41) and (42) gives
$$\begin{aligned}
&\sum_{k=1}^{l^{\mathbf{w}}_{B-1}} c_{[1:k],B-1}H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B-1],[B-1:K]}\big) + \sum_{V' \in \Omega^{(l^{\mathbf{w}}_{B-1})}_{K,B-1}} c_{V',B-1}H\big(S_{V'}|U^n_{[1:K],[1:B-1]}\big)\\
&\ge \sum_{k=1}^{l^{\mathbf{w}}_B} c_{[1:k],B}H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}\big) + \sum_{V \in \Omega^{(l^{\mathbf{w}}_B)}_{K,B}} c_{V,B}H\big(S_V|U^n_{[1:K],[1:B-1]}\big). \qquad (43)
\end{aligned}$$
Note that
$$\begin{aligned}
&H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}\big)\\
&= H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}, S_{[k+1:B]}\big)\\
&= H\big(S_{[1:B]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}, S_{[k+1:B]}\big)\\
&= H\big(U^n_{[1:B],[1:B]}, S_{[1:B]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}, S_{[k+1:B]}\big) - H\big(U^n_{[1:B],[1:B]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}, S_{[1:B]}\big)\\
&\ge H\big(U^n_{[1:B],[1:B]}, S_{[1:B]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}, S_{[k+1:B]}\big) - n\delta_\epsilon \qquad (44)\\
&= H\big(U^n_{[1:B],[1:B]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}, S_{[k+1:B]}\big) + H\big(S_{[1:B]}|U^n_{[1:K],[1:B-1]}, U^n_{[1:B],[1:B]}, U^n_{[k+1:B],[B:K]}, S_{[k+1:B]}\big) - n\delta_\epsilon\\
&= H\big(U^n_{[1:B],[1:B]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B],[B:K]}\big) + H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}, U^n_{[1:B],[1:B]}, U^n_{[k+1:B],[B:K]}\big) - n\delta_\epsilon\\
&\ge nH_{k,B} + H\big(S_{[1:k]}|U^n_{[1:K],[1:B]}, U^n_{[k+1:B],[B:K]}\big) - n\delta_\epsilon, \quad k \in [1:l^{\mathbf{w}}_B], \qquad (45)
\end{aligned}$$
where (44) follows by (2) and Fano's inequality. Similarly, we have
$$\begin{aligned}
H\big(S_V|U^n_{[1:K],[1:B-1]}\big) &= H\big(U^n_{V,[1:B]}, S_V|U^n_{[1:K],[1:B-1]}\big) - H\big(U^n_{V,[1:B]}|U^n_{[1:K],[1:B-1]}, S_V\big)\\
&\ge H\big(U^n_{V,[1:B]}, S_V|U^n_{[1:K],[1:B-1]}\big) - n\delta_\epsilon\\
&= H\big(U^n_{V,[1:B]}|U^n_{[1:K],[1:B-1]}\big) + H\big(S_V|U^n_{[1:K],[1:B-1]}, U^n_{V,[1:B]}\big) - n\delta_\epsilon\\
&\ge nH_{|V|,B} + H\big(S_V|U^n_{[1:K],[1:B]}\big) - n\delta_\epsilon, \quad V \in \Omega^{(l^{\mathbf{w}}_B)}_{K,B}. \qquad (46)
\end{aligned}$$
Continuing from (43),
$$\begin{aligned}
&\sum_{k=1}^{l^{\mathbf{w}}_{B-1}} c_{[1:k],B-1}H\big(S_{[1:k]}|U^n_{[1:K],[1:B-1]}, U^n_{[k+1:B-1],[B-1:K]}\big) + \sum_{V' \in \Omega^{(l^{\mathbf{w}}_{B-1})}_{K,B-1}} c_{V',B-1}H\big(S_{V'}|U^n_{[1:K],[1:B-1]}\big)\\
&\ge n\sum_{V \in \mathcal{V}_{K,B}} c_{V,B}H_{|V|,B} + \sum_{k=1}^{l^{\mathbf{w}}_B} c_{[1:k],B}H\big(S_{[1:k]}|U^n_{[1:K],[1:B]}, U^n_{[k+1:B],[B:K]}\big) + \sum_{V \in \Omega^{(l^{\mathbf{w}}_B)}_{K,B}} c_{V,B}H\big(S_V|U^n_{[1:K],[1:B]}\big) - n\delta_\epsilon\sum_{V \in \mathcal{V}_{K,B}} c_{V,B} \qquad (47)\\
&\ge nf^{\mathbf{w}}_B + \sum_{k=1}^{l^{\mathbf{w}}_B} c_{[1:k],B}H\big(S_{[1:k]}|U^n_{[1:K],[1:B]}, U^n_{[k+1:B],[B:K]}\big) + \sum_{V \in \Omega^{(l^{\mathbf{w}}_B)}_{K,B}} c_{V,B}H\big(S_V|U^n_{[1:K],[1:B]}\big) - n\delta_\epsilon\sum_{k=1}^{K} w_k, \qquad (48)
\end{aligned}$$
where (47) is due to (45) and (46), and (48) is due to (23) and (35) as well as Lemma 11. Combining (48) and the induction hypothesis proves (40) for $\beta = B$. ∎

VI. CONCLUSION

We have characterized the admissible rate region of D-MLDC for the case $K \le 3$ and the case where the source distribution is symmetrical entropy-wise. In view of the intimate connection between MLDC and its lossy counterpart known as multiple description coding [14], it is expected that the results in the present work may shed new light on the robust distributed source coding problem (which is the lossy counterpart of D-MLDC) studied in [15].

APPENDIX A
PROOF OF LEMMA 6

Proof of Part (1) of Lemma 6: It follows by the definition of $\psi_{\{i,j\}}$ that
$$\psi_{\{i\}} + \psi_{\{i,j\}} + \psi_{\{j\}} = H(U_{i,2},U_{j,2})$$
when $H(U_{i,2},U_{j,2}) \ge \psi_{\{i\}} + \psi_{\{j\}}$. When $H(U_{i,2},U_{j,2}) < \psi_{\{i\}} + \psi_{\{j\}}$, we must have
$$\psi_{\{i\}} = H(U_{i,2}|U_{k,2}), \quad \psi_{\{j\}} = H(U_{j,2}|U_{k,2}), \quad \psi_{\{i,j\}} = 0,$$
and consequently
$$\psi_{\{i\}} + \psi_{\{i,j\}} + \psi_{\{j\}} = H(U_{i,2}|U_{k,2}) + H(U_{j,2}|U_{k,2}). \qquad \blacksquare$$

Proof of Part (2) of Lemma 6: Note that $\psi_{\{i,j\}} > \psi_{\{i,k\}} + \psi_{\{j,k\}}$ implies $\psi_{\{i,j\}} > 0$. It then follows by the definition of $\psi_{\{i,j\}}$ that
$$\psi_{\{i\}} + \psi_{\{i,j\}} + \psi_{\{j\}} = H(U_{i,2},U_{j,2}). \qquad \blacksquare$$

Proof of Part (3) of Lemma 6: In view of the fact that $\psi_{\{i\}} \ge H(U_{i,2}|U_{k,2})$, $\psi_{\{j\}} \ge H(U_{j,2}|U_{k,2})$, and $\psi_{\{i,j\}} \ge 0$, we must have
$$\psi_{\{i\}} = H(U_{i,2}|U_{k,2}), \tag{49}$$
$$\psi_{\{j\}} = H(U_{j,2}|U_{k,2}), \quad \psi_{\{i,j\}} = 0$$
when $\psi_{\{i\}} + \psi_{\{i,j\}} + \psi_{\{j\}} = H(U_{i,2}|U_{k,2}) + H(U_{j,2}|U_{k,2})$. Note that
$$H(U_{i,2},U_{k,2}) = H(U_{i,2}|U_{k,2}) + H(U_{k,2}) = \psi_{\{i\}} + H(U_{k,2}) \tag{50}$$
$$\ge \psi_{\{i\}} + \psi_{\{k\}},$$
where (50) is due to (49). It follows by symmetry that $H(U_{j,2},U_{k,2}) \ge \psi_{\{j\}} + \psi_{\{k\}}$. Invoking the definition of $\psi_{\{i,k\}}$ and $\psi_{\{j,k\}}$ completes the proof of Lemma 6. ∎

APPENDIX B
PROOF OF (22) IN LEMMA 8

It suffices to consider $\mathbf{w} \in \mathcal{W}_3$.

Case 1A ($\mathbf{w} \in \mathcal{W}^{(0)}_{3,2}$):
$$\begin{aligned}
&\tfrac{1}{2}(w_1+w_2-w_3)H(X_1,X_2) + \tfrac{1}{2}(w_1-w_2+w_3)H(X_1,X_3) + \tfrac{1}{2}(-w_1+w_2+w_3)H(X_2,X_3)\\
&= (w_1-w_2)\big(H(X_1,X_2)+H(X_1,X_3)-H(X_1,X_2,X_3)\big) + (w_2-w_3)H(X_1,X_2)\\
&\quad + \tfrac{1}{2}(-w_1+w_2+w_3)\big(H(X_1,X_2)+H(X_1,X_3)+H(X_2,X_3)\big) + (w_1-w_2)H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1) + (w_2-w_3)H(X_1,X_2)\\
&\quad + \tfrac{1}{2}(-w_1+w_2+w_3)\big(H(X_1,X_2)+H(X_1,X_3)+H(X_2,X_3)\big) + (w_1-w_2)H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1) + (w_2-w_3)H(X_1,X_2) + w_3H(X_1,X_2,X_3) \qquad (51)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2|X_3) + w_3H(X_1,X_2,X_3),
\end{aligned}$$
where (51) is due to Han's inequality [16].

Case 1B ($\mathbf{w} \in \mathcal{W}^{(0)}_{3,2}$ and $\nu_2 = \nu_3 = 1$):
$$\begin{aligned}
&\tfrac{1}{2}(-w_1+w_2+w_3)\big(H(X_2|X_1)+H(X_3|X_1)\big) + \tfrac{1}{2}(w_1+w_2-w_3)H(X_1,X_2) + \tfrac{1}{2}(w_1-w_2+w_3)H(X_1,X_3)\\
&\ge \tfrac{1}{2}(-w_1+w_2+w_3)\big(H(X_2|X_1,X_3)+H(X_3|X_1,X_2)\big) + \tfrac{1}{2}(w_1+w_2-w_3)H(X_1,X_2) + \tfrac{1}{2}(w_1-w_2+w_3)H(X_1,X_3)\\
&= (w_1-w_2)\big(H(X_1,X_2)+H(X_1,X_3)-H(X_1,X_2,X_3)\big) + (w_2-w_3)H(X_1,X_2)\\
&\quad + \tfrac{1}{2}(-w_1+w_2+w_3)\big(H(X_3|X_1,X_2)+H(X_1,X_2)\big) + \tfrac{1}{2}(-w_1+w_2+w_3)\big(H(X_2|X_1,X_3)+H(X_1,X_3)\big) + (w_1-w_2)H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1) + (w_2-w_3)H(X_1,X_2) + w_3H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2|X_3) + w_3H(X_1,X_2,X_3).
\end{aligned}$$

Case 1C ($\nu_1 = \nu_3 = 2$):
$$\begin{aligned}
&\tfrac{1}{2}(w_1-w_2+w_3)\big(H(X_1|X_2)+H(X_3|X_2)\big) + \tfrac{1}{2}(w_1+w_2-w_3)H(X_1,X_2) + \tfrac{1}{2}(-w_1+w_2+w_3)H(X_2,X_3)\\
&\ge \tfrac{1}{2}(w_1-w_2+w_3)\big(H(X_1|X_2,X_3)+H(X_3|X_1,X_2)\big) + \tfrac{1}{2}(w_1+w_2-w_3)H(X_1,X_2) + \tfrac{1}{2}(-w_1+w_2+w_3)H(X_2,X_3)\\
&= (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2)\\
&\quad + \tfrac{1}{2}(-w_1+w_2+w_3)\big(H(X_1|X_2,X_3)+H(X_2,X_3)\big) + \tfrac{1}{2}(w_1-w_2+w_3)\big(H(X_3|X_1,X_2)+H(X_1,X_2)\big)\\
&= (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2) + w_3H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2|X_3) + w_3H(X_1,X_2,X_3).
\end{aligned}$$

Case 1D ($\nu_1 = \nu_2 = 3$):
$$\begin{aligned}
&\tfrac{1}{2}(w_1+w_2-w_3)\big(H(X_1|X_3)+H(X_2|X_3)\big) + \tfrac{1}{2}(w_1-w_2+w_3)H(X_1,X_3) + \tfrac{1}{2}(-w_1+w_2+w_3)H(X_2,X_3)\\
&\ge \tfrac{1}{2}(w_1+w_2-w_3)\big(H(X_1|X_2,X_3)+H(X_2|X_3)\big) + \tfrac{1}{2}(w_1-w_2+w_3)H(X_1,X_3) + \tfrac{1}{2}(-w_1+w_2+w_3)H(X_2,X_3)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2|X_3)\\
&\quad + \tfrac{1}{2}(-w_1+w_2+w_3)\big(H(X_1|X_2,X_3)+H(X_2,X_3)\big) + \tfrac{1}{2}(w_1-w_2+w_3)\big(H(X_2|X_1,X_3)+H(X_1,X_3)\big)\\
&= (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2|X_3) + w_3H(X_1,X_2,X_3).
\end{aligned}$$

Cases 2A and 2B ($\mathbf{w} \in \mathcal{W}^{(0)}_{3,2}$):
$$\begin{aligned}
&(-w_1+w_2+w_3)H(X_3|X_{\nu_3}) + w_2H(X_1,X_2) + (w_1-w_2)H(X_1,X_3)\\
&\ge (-w_1+w_2+w_3)H(X_3|X_1,X_2) + w_2H(X_1,X_2) + (w_1-w_2)H(X_1,X_3)\\
&= (w_1-w_2)\big(H(X_1,X_2)+H(X_1,X_3)-H(X_1,X_2,X_3)\big) + (w_2-w_3)H(X_1,X_2)\\
&\quad + (-w_1+w_2+w_3)\big(H(X_3|X_1,X_2)+H(X_1,X_2)\big) + (w_1-w_2)H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1) + (w_2-w_3)H(X_1,X_2) + w_3H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2|X_3) + w_3H(X_1,X_2,X_3).
\end{aligned}$$

Cases 2C and 5B:
$$\begin{aligned}
&(w_1-w_2)H(X_1|X_{\nu_1}) + w_3H(X_3|X_{\nu_3}) + w_2H(X_1,X_2)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + w_3H(X_3|X_1,X_2) + w_2H(X_1,X_2)\\
&= (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2) + w_3H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2|X_3) + w_3H(X_1,X_2,X_3).
\end{aligned}$$

Cases 3A and 3B ($\mathbf{w} \in \mathcal{W}^{(0)}_{3,2}$):
$$\begin{aligned}
&(-w_1+w_2+w_3)H(X_2|X_{\nu_2}) + (w_1-w_3)H(X_1,X_2) + w_3H(X_1,X_3)\\
&\ge (-w_1+w_2+w_3)H(X_2|X_1,X_3) + (w_1-w_3)H(X_1,X_2) + w_3H(X_1,X_3)\\
&= (w_1-w_2)\big(H(X_1,X_2)+H(X_1,X_3)-H(X_1,X_2,X_3)\big) + (w_2-w_3)H(X_1,X_2)\\
&\quad + (w_1-w_2)H(X_1,X_2,X_3) + (-w_1+w_2+w_3)\big(H(X_2|X_1,X_3)+H(X_1,X_3)\big)\\
&\ge (w_1-w_2)H(X_1) + (w_2-w_3)H(X_1,X_2) + w_3H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2|X_3) + w_3H(X_1,X_2,X_3).
\end{aligned}$$

Cases 3D and 5D ($\nu_1 = \nu_2 = 3$):
$$\begin{aligned}
&(w_1-w_3)H(X_1|X_3) + w_2H(X_2|X_3) + w_3H(X_1,X_3)\\
&\ge (w_1-w_3)H(X_1|X_3) + w_2H(X_2|X_1,X_3) + w_3H(X_1,X_3)\\
&= (w_1-w_2)H(X_1|X_3) + (w_2-w_3)H(X_1,X_2|X_3) + w_3H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2|X_3) + w_3H(X_1,X_2,X_3).
\end{aligned}$$

Cases 4A and 4C:
$$\begin{aligned}
&(w_1-w_2+w_3)H(X_1|X_{\nu_1}) + (w_2-w_3)H(X_1,X_2) + w_3H(X_2,X_3)\\
&\ge (w_1-w_2+w_3)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2) + w_3H(X_2,X_3)\\
&= (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2) + w_3H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2|X_3) + w_3H(X_1,X_2,X_3).
\end{aligned}$$

Case 4D ($\nu_1 = \nu_2 = 3$):
$$\begin{aligned}
&w_1H(X_1|X_3) + (w_2-w_3)H(X_2|X_3) + w_3H(X_2,X_3)\\
&\ge (w_1-w_2+w_3)H(X_1|X_2,X_3) + (w_2-w_3)\big(H(X_1|X_3)+H(X_2|X_3)\big) + w_3H(X_2,X_3)\\
&= (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)\big(H(X_1|X_3)+H(X_2|X_3)\big) + w_3H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2|X_3) + w_3H(X_1,X_2,X_3).
\end{aligned}$$

Cases 5A and 5C:
$$\begin{aligned}
&(w_1-w_2-w_3)H(X_1|X_{\nu_1}) + w_2H(X_1,X_2) + w_3H(X_1,X_3)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2) + w_3\big(H(X_1,X_2)+H(X_1,X_3)-H(X_1)\big)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2) + w_3H(X_1,X_2,X_3)\\
&\ge (w_1-w_2)H(X_1|X_2,X_3) + (w_2-w_3)H(X_1,X_2|X_3) + w_3H(X_1,X_2,X_3).
\end{aligned}$$

APPENDIX C
PROOF OF LEMMA 9

The following result is needed for the proof of Lemma 9.

Lemma 15: Assume that $H(X_V) = H(X_{V'})$ for all $V, V' \subseteq [1:K]$ with $|V| = |V'|$. We have
$$\frac{H\big(X_{[1:i_1]}|X_{[i_1+1:j]}\big)}{i_1} \le \frac{H\big(X_{[1:i_2]}|X_{[i_2+1:j]}\big)}{i_2}$$
for any $i_1, i_2, j \in [1:K]$ such that $i_1 \le i_2 \le j$.

Proof: It suffices to consider the case $i_1 < i_2$. Note that
$$\begin{aligned}
i_2H\big(X_{[1:i_2]}|X_{[i_2+1:j]}\big) &= \sum_{k=1}^{i_2} H\big(X_{[1:i_2]}|X_{[i_2+1:j]}\big)\\
&= \sum_{k=1}^{i_2} H\big(X_k|X_{[i_2+1:j]}\big) + \sum_{k=1}^{i_2} H\big(X_{[1:i_2]\setminus\{k\}}|X_{\{k\}\cup[i_2+1:j]}\big)\\
&\ge H\big(X_{[1:i_2]}|X_{[i_2+1:j]}\big) + i_2H\big(X_{[1:i_2-1]}|X_{[i_2:j]}\big).
\end{aligned}$$
Therefore,
$$\frac{H\big(X_{[1:i_2-1]}|X_{[i_2:j]}\big)}{i_2-1} \le \frac{H\big(X_{[1:i_2]}|X_{[i_2+1:j]}\big)}{i_2}.$$
One can readily complete the proof via induction. ∎

Now we are ready to prove Lemma 9.

Proof of Lemma 9: Consider an arbitrary $V \in \mathcal{V}_{K,\alpha}$. Let $V_1 \triangleq V \cap [1:l]$ and $V_2 \triangleq V \setminus V_1$. It suffices to show that
$$\sum_{k \in V_1} H\big(U_{k,\alpha}|U_{[k+1:\alpha],\alpha}\big) + |V_2|\frac{H\big(U_{[l+1:\alpha],\alpha}\big)}{\alpha - l} \ge H\big(U_{[1:|V|],\alpha}|U_{[|V|+1:\alpha],\alpha}\big). \tag{52}$$
First consider the case $V_i \ne \emptyset$, $i = 1, 2$. Note that
$$\sum_{k \in V_1} H\big(U_{k,\alpha}|U_{[k+1:\alpha],\alpha}\big) = \sum_{\tau=1}^{|V_1|} H\big(U_{\langle V_1\rangle_\tau,\alpha}|U_{[\langle V_1\rangle_\tau+1:\alpha],\alpha}\big) \ge \sum_{\tau=1}^{|V_1|} H\big(U_{\tau,\alpha}|U_{[\tau+1:\alpha],\alpha}\big) \tag{53}$$
$$= H\big(U_{[1:|V_1|],\alpha}|U_{[|V_1|+1:\alpha],\alpha}\big), \tag{54}$$
where (53) is due to the fact that $\langle V_1\rangle_\tau \ge \tau$ for $\tau \in [1:|V_1|]$ and that the source distribution is symmetrical entropy-wise. Moreover, we have
$$|V_2|\frac{H\big(U_{[l+1:\alpha],\alpha}\big)}{\alpha - l} \ge |V_2|\frac{H\big(U_{[1:\alpha-l],\alpha}|U_{[\alpha-l+1:\alpha-|V_1|],\alpha}\big)}{\alpha - l} \ge H\big(U_{[1:|V_2|],\alpha}|U_{[|V_2|+1:\alpha-|V_1|],\alpha}\big) \tag{55}$$
$$= H\big(U_{[|V_1|+1:|V|],\alpha}|U_{[|V|+1:\alpha],\alpha}\big), \tag{56}$$
where (55) is due to Lemma 15 and the fact that $\alpha - l \ge |V_2|$. Combining (54) and (56) proves (52) for the case $V_i \ne \emptyset$, $i = 1, 2$.

Note that (52) degenerates to (54) when $V_2 = \emptyset$ and degenerates to (56) when $V_1 = \emptyset$. This completes the proof of Lemma 9. ∎

APPENDIX D
PROOF OF LEMMA 10

Proof of (31): Note that
$$\sum_{k=l^{\mathbf{w}}_\alpha+1}^{K} w_k = \sum_{k=l^{\mathbf{w}}_\alpha+1}^{K}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha} = \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\sum_{k \in V\setminus[1:l^{\mathbf{w}}_\alpha]} 1 = \big(\alpha - l^{\mathbf{w}}_\alpha\big)\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha},$$
from which the desired result follows immediately. ∎

Proof of (32): Consider the following two cases.

(Case 1) $k \in [1:l^{\mathbf{w}}_\alpha]$:
$$\sum_{V \in \mathcal{V}_{K,\alpha}: k \in V} c_{V,\alpha} = \sum_{i=k}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha} + \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha} = w_k - \lambda^{\mathbf{w}}_\alpha + \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha} = w_k, \tag{57}$$
where (57) is due to (31).

(Case 2) $k \in [l^{\mathbf{w}}_\alpha+1:K]$:
$$\sum_{V \in \mathcal{V}_{K,\alpha}: k \in V} c_{V,\alpha} = \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha} = w_k. \qquad \blacksquare$$

Proof of (33): It suffices to verify that $w_{l^{\mathbf{w}}_\alpha} - \lambda^{\mathbf{w}}_\alpha \ge 0$ when $l^{\mathbf{w}}_\alpha \ge 1$. Indeed, this is a simple consequence of the fact that $\mathbf{w} \in \mathcal{W}^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}$. ∎

Proof of (34): Consider the following three cases.

(Case 1) $l^{\mathbf{w}}_{\alpha-1} \le l^{\mathbf{w}}_\alpha - 2$: Note that
$$\sum_{k=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} (\alpha-1-k)\,c_{[1:k],\alpha} = \big(\alpha-1-l^{\mathbf{w}}_{\alpha-1}\big)w_{l^{\mathbf{w}}_{\alpha-1}+1} - \sum_{k=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} w_k - \big(\alpha-1-l^{\mathbf{w}}_\alpha\big)\lambda^{\mathbf{w}}_\alpha \tag{58}$$
and
$$\big(\alpha-l^{\mathbf{w}}_\alpha\big)\lambda^{\mathbf{w}}_\alpha + \sum_{k=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} w_k = \sum_{k=l^{\mathbf{w}}_\alpha+1}^{K} w_k + \sum_{k=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} w_k = \sum_{k=l^{\mathbf{w}}_{\alpha-1}+1}^{K} w_k = \big(\alpha-1-l^{\mathbf{w}}_{\alpha-1}\big)\lambda^{\mathbf{w}}_{\alpha-1}. \tag{59}$$
We have
$$\begin{aligned}
\theta^{\mathbf{w}}_\alpha &= \Bigg(\lambda^{\mathbf{w}}_\alpha - \sum_{k=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} (\alpha-1-k)\,c_{[1:k],\alpha}\Bigg)\frac{1}{\alpha-1-l^{\mathbf{w}}_{\alpha-1}}\\
&= \Bigg(\lambda^{\mathbf{w}}_\alpha - \big(\alpha-1-l^{\mathbf{w}}_{\alpha-1}\big)w_{l^{\mathbf{w}}_{\alpha-1}+1} + \sum_{k=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} w_k + \big(\alpha-1-l^{\mathbf{w}}_\alpha\big)\lambda^{\mathbf{w}}_\alpha\Bigg)\frac{1}{\alpha-1-l^{\mathbf{w}}_{\alpha-1}} \qquad (60)\\
&= \Bigg(\big(\alpha-l^{\mathbf{w}}_\alpha\big)\lambda^{\mathbf{w}}_\alpha + \sum_{k=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} w_k - \big(\alpha-1-l^{\mathbf{w}}_{\alpha-1}\big)w_{l^{\mathbf{w}}_{\alpha-1}+1}\Bigg)\frac{1}{\alpha-1-l^{\mathbf{w}}_{\alpha-1}}\\
&= \Big(\big(\alpha-1-l^{\mathbf{w}}_{\alpha-1}\big)\lambda^{\mathbf{w}}_{\alpha-1} - \big(\alpha-1-l^{\mathbf{w}}_{\alpha-1}\big)w_{l^{\mathbf{w}}_{\alpha-1}+1}\Big)\frac{1}{\alpha-1-l^{\mathbf{w}}_{\alpha-1}} \qquad (61)\\
&= \lambda^{\mathbf{w}}_{\alpha-1} - w_{l^{\mathbf{w}}_{\alpha-1}+1} \qquad (62)\\
&= c_{[1:l^{\mathbf{w}}_{\alpha-1}],\alpha} - c_{[1:l^{\mathbf{w}}_{\alpha-1}],\alpha-1},
\end{aligned}$$
where (60) and (61) are due to (58) and (59), respectively. It can be verified that
$$\big(\alpha-1-l^{\mathbf{w}}_{\alpha-1}\big)\lambda^{\mathbf{w}}_{\alpha-1} = \sum_{k=l^{\mathbf{w}}_{\alpha-1}+1}^{K} w_k = \sum_{k=l^{\mathbf{w}}_{\alpha-1}+2}^{K} w_k + w_{l^{\mathbf{w}}_{\alpha-1}+1} \ge \big(\alpha-2-l^{\mathbf{w}}_{\alpha-1}\big)w_{l^{\mathbf{w}}_{\alpha-1}+1} + w_{l^{\mathbf{w}}_{\alpha-1}+1} \tag{63}$$
$$= \big(\alpha-1-l^{\mathbf{w}}_{\alpha-1}\big)w_{l^{\mathbf{w}}_{\alpha-1}+1}, \tag{64}$$
where (63) follows from the fact that $\mathbf{w} \in \mathcal{W}^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1}$. Combining (62) and (64) proves $\theta^{\mathbf{w}}_\alpha \ge 0$.

(Case 2) $l^{\mathbf{w}}_{\alpha-1} = l^{\mathbf{w}}_\alpha - 1$: Note that
$$\big(\alpha-l^{\mathbf{w}}_\alpha\big)\lambda^{\mathbf{w}}_\alpha = \sum_{k=l^{\mathbf{w}}_\alpha+1}^{K} w_k = \big(\alpha-1-l^{\mathbf{w}}_{\alpha-1}\big)\lambda^{\mathbf{w}}_{\alpha-1} - w_{l^{\mathbf{w}}_\alpha} = \big(\alpha-l^{\mathbf{w}}_\alpha\big)\lambda^{\mathbf{w}}_{\alpha-1} - w_{l^{\mathbf{w}}_\alpha}. \tag{65}$$
We have
$$\begin{aligned}
\theta^{\mathbf{w}}_\alpha &= \Big(\lambda^{\mathbf{w}}_\alpha - \big(\alpha-1-l^{\mathbf{w}}_\alpha\big)c_{[1:l^{\mathbf{w}}_\alpha],\alpha}\Big)\frac{1}{\alpha-1-l^{\mathbf{w}}_{\alpha-1}} = \Big(\lambda^{\mathbf{w}}_\alpha - \big(\alpha-1-l^{\mathbf{w}}_\alpha\big)\big(w_{l^{\mathbf{w}}_\alpha} - \lambda^{\mathbf{w}}_\alpha\big)\Big)\frac{1}{\alpha-l^{\mathbf{w}}_\alpha}\\
&= \lambda^{\mathbf{w}}_{\alpha-1} - w_{l^{\mathbf{w}}_\alpha} \qquad (66)\\
&= c_{[1:l^{\mathbf{w}}_{\alpha-1}],\alpha} - c_{[1:l^{\mathbf{w}}_{\alpha-1}],\alpha-1},
\end{aligned}$$
where (66) is due to (65). The fact that $\theta^{\mathbf{w}}_\alpha \ge 0$ follows by (64) and (66).

(Case 3) $l^{\mathbf{w}}_{\alpha-1} = l^{\mathbf{w}}_\alpha$: Note that
$$\theta^{\mathbf{w}}_\alpha = \frac{1}{\alpha-1-l^{\mathbf{w}}_{\alpha-1}}\lambda^{\mathbf{w}}_\alpha \ge 0.$$
Moreover, we have
$$\frac{1}{\alpha-1-l^{\mathbf{w}}_{\alpha-1}}\lambda^{\mathbf{w}}_\alpha = \frac{1}{\big(\alpha-1-l^{\mathbf{w}}_\alpha\big)\big(\alpha-l^{\mathbf{w}}_\alpha\big)}\sum_{k=l^{\mathbf{w}}_\alpha+1}^{K} w_k = \Bigg(\frac{1}{\alpha-1-l^{\mathbf{w}}_\alpha} - \frac{1}{\alpha-l^{\mathbf{w}}_\alpha}\Bigg)\sum_{k=l^{\mathbf{w}}_\alpha+1}^{K} w_k = \lambda^{\mathbf{w}}_{\alpha-1} - \lambda^{\mathbf{w}}_\alpha = c_{[1:l^{\mathbf{w}}_{\alpha-1}],\alpha} - c_{[1:l^{\mathbf{w}}_{\alpha-1}],\alpha-1}. \qquad \blacksquare$$

Proof of (35): Consider the following two cases.

(Case 1) $l^{\mathbf{w}}_\alpha = 0$:
$$\sum_{V \in \mathcal{V}_{K,\alpha}} c_{V,\alpha} = \sum_{V \in \Omega^{(0)}_{K,\alpha}} c_{V,\alpha} = \frac{1}{\alpha}\sum_{k=1}^{K} w_k, \tag{67}$$
where (67) is due to (31).

(Case 2) $l^{\mathbf{w}}_\alpha > 0$:
$$\sum_{V \in \mathcal{V}_{K,\alpha}} c_{V,\alpha} = \sum_{k=1}^{l^{\mathbf{w}}_\alpha-1} c_{[1:k],\alpha} + c_{[1:l^{\mathbf{w}}_\alpha],\alpha} + \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha} = \sum_{k=1}^{l^{\mathbf{w}}_\alpha-1} (w_k - w_{k+1}) + \big(w_{l^{\mathbf{w}}_\alpha} - \lambda^{\mathbf{w}}_\alpha\big) + \lambda^{\mathbf{w}}_\alpha \tag{68}$$
$$= w_1,$$
where (68) is due to (31). ∎

APPENDIX E
PROOF OF LEMMA 12

Proof of Part (1) of Lemma 12: Note that (26), (27), and (29) obviously hold. Moreover, (28) is implied by (34). Therefore, it suffices to verify (30).

Consider an arbitrary integer $k \in [l^{\mathbf{w}}_{\alpha-1}+1:K]$. We have
$$\sum_{V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1}: k \in V'}\;\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: V' \subseteq V} \frac{c_{V,\alpha}}{\alpha-1-l^{\mathbf{w}}_{\alpha-1}} = \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} \frac{\big|\big\{V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1} : k \in V' \subseteq V\big\}\big|}{\alpha-1-l^{\mathbf{w}}_{\alpha-1}}\,c_{V,\alpha} = \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha}. \tag{69}$$
Note that
$$\sum_{V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1}: k \in V'} \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}\sum_{\tau=i+1}^{\alpha} \mathbb{I}\big\{V' = \langle V\rangle_{[1:\alpha]\setminus\{\tau\}}\big\} = \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}\sum_{\tau=i+1}^{\alpha}\;\sum_{V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1}: k \in V'} \mathbb{I}\big\{V' = \langle V\rangle_{[1:\alpha]\setminus\{\tau\}}\big\}. \tag{70}$$
Moreover,
$$\sum_{\tau=i+1}^{\alpha}\;\sum_{V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1}: k \in V'} \mathbb{I}\big\{V' = \langle V\rangle_{[1:\alpha]\setminus\{\tau\}}\big\} = \sum_{\tau=i+1}^{\alpha} \mathbb{I}\big\{k \in \langle V\rangle_{[1:\alpha]\setminus\{\tau\}}\big\} = (\alpha-1-i)\,\mathbb{I}\{k \in V\} + \mathbb{I}\big\{k \in \langle V\rangle_{[1:i]}\big\}, \quad V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha},\; i \in [l^{\mathbf{w}}_{\alpha-1}+1:l^{\mathbf{w}}_\alpha]. \tag{71}$$
Therefore,
$$\begin{aligned}
&\sum_{V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1}: k \in V'} \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}\sum_{\tau=i+1}^{\alpha} \mathbb{I}\big\{V' = \langle V\rangle_{[1:\alpha]\setminus\{\tau\}}\big\}\\
&= \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}\Big((\alpha-1-i)\,\mathbb{I}\{k \in V\} + \mathbb{I}\big\{k \in \langle V\rangle_{[1:i]}\big\}\Big) \qquad (72)\\
&= \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} (\alpha-1-i)\,c_{[1:i],\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha} + \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}\mathbb{I}\big\{k \in \langle V\rangle_{[1:i]}\big\}, \qquad (73)
\end{aligned}$$
where (72) is obtained by substituting (71) into (70). Combining (69) and (73) gives
$$\sum_{V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1}: k \in V'} c_{V',\alpha-1} = \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha} + \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}\mathbb{I}\big\{k \in \langle V\rangle_{[1:i]}\big\}. \tag{74}$$
Now consider the following two cases.

(Case 1) $k \in [l^{\mathbf{w}}_{\alpha-1}+1:l^{\mathbf{w}}_\alpha]$: We have
$$\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha} = \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha} = \lambda^{\mathbf{w}}_\alpha \tag{75}$$
and
$$\mathbb{I}\big\{k \in \langle V\rangle_{[1:i]}\big\} = \mathbb{I}\{k \in [1:i]\}, \quad V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha},\; i \in [l^{\mathbf{w}}_{\alpha-1}+1:l^{\mathbf{w}}_\alpha], \tag{76}$$
where (75) is due to (31). Continuing from (74),
$$\begin{aligned}
\sum_{V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1}: k \in V'} c_{V',\alpha-1} &= \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha} + \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}\mathbb{I}\{k \in [1:i]\} \qquad (77)\\
&= \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha} + \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{i=k}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha} = \lambda^{\mathbf{w}}_\alpha + \sum_{i=k}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha} \qquad (78)\\
&= w_k,
\end{aligned}$$
where (77) and (78) are due to (76) and (75), respectively.

(Case 2) $k \in [l^{\mathbf{w}}_\alpha+1:K]$: We have
$$\sum_{V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1}: k \in V'} c_{V',\alpha-1} = \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: k \in V} c_{V,\alpha} \tag{79}$$
$$= w_k,$$
where (79) follows by (74) and the fact that
$$\mathbb{I}\big\{k \in \langle V\rangle_{[1:i]}\big\} = 0, \quad V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha},\; i \in [l^{\mathbf{w}}_{\alpha-1}+1:l^{\mathbf{w}}_\alpha].$$
This completes the verification of (30). ∎

Proof of Part (2) of Lemma 12: Note that
$$\begin{aligned}
\sum_{\tau=i+1}^{|V|} H\big(X_{\langle V\rangle_{[1:|V|]\setminus\{\tau\}}}\big) &= \sum_{\tau=i+1}^{|V|} H\big(X_{\langle V\rangle_{[1:|V|]\setminus\{\tau\}}}|X_{\langle V\rangle_{[1:i]}}\big) + \sum_{\tau=i+1}^{|V|} H\big(X_{\langle V\rangle_{[1:i]}}\big)\\
&= \sum_{\tau=i+1}^{|V|} H\big(X_{\langle V\rangle_{[1:|V|]\setminus\{\tau\}}}|X_{\langle V\rangle_{[1:i]}}\big) + (|V|-i)H\big(X_{\langle V\rangle_{[1:i]}}\big)\\
&\ge (|V|-1-i)H\big(X_V|X_{\langle V\rangle_{[1:i]}}\big) + (|V|-i)H\big(X_{\langle V\rangle_{[1:i]}}\big) \qquad (80)\\
&= (|V|-1-i)H(X_V) + H\big(X_{\langle V\rangle_{[1:i]}}\big), \quad i \in [0:|V|-1], \qquad (81)
\end{aligned}$$
where (80) is due to Han's inequality [16]. We have
$$\begin{aligned}
\sum_{V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1}} \frac{\theta^{\mathbf{w}}_\alpha}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}: V' \subseteq V} c_{V,\alpha}H(X_{V'}) &= \frac{\theta^{\mathbf{w}}_\alpha}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\sum_{V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1}: V' \subseteq V} H(X_{V'})\\
&= \frac{\theta^{\mathbf{w}}_\alpha}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\sum_{\tau=l^{\mathbf{w}}_{\alpha-1}+1}^{\alpha} H\big(X_{\langle V\rangle_{[1:\alpha]\setminus\{\tau\}}}\big)\\
&\ge \frac{\theta^{\mathbf{w}}_\alpha}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\Big(\big(\alpha-1-l^{\mathbf{w}}_{\alpha-1}\big)H(X_V) + H\big(X_{[1:l^{\mathbf{w}}_{\alpha-1}]}\big)\Big) \qquad (82)\\
&= \sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}H(X_V) - \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} (\alpha-i-1)\,c_{[1:i],\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}H(X_V)\\
&\quad + \Big(c_{[1:l^{\mathbf{w}}_{\alpha-1}],\alpha} - c_{[1:l^{\mathbf{w}}_{\alpha-1}],\alpha-1}\Big)H\big(X_{[1:l^{\mathbf{w}}_{\alpha-1}]}\big), \qquad (83)
\end{aligned}$$
where (82) follows by (81), and (83) is due to (31) and (34). Moreover,
$$\begin{aligned}
&\sum_{V' \in \Omega^{(l^{\mathbf{w}}_{\alpha-1})}_{K,\alpha-1}} \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}\sum_{\tau=i+1}^{\alpha} \mathbb{I}\big\{V' = \langle V\rangle_{[1:\alpha]\setminus\{\tau\}}\big\}H(X_{V'})\\
&= \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}\sum_{\tau=i+1}^{\alpha} H\big(X_{\langle V\rangle_{[1:\alpha]\setminus\{\tau\}}}\big)\\
&\ge \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}\Big((\alpha-1-i)H(X_V) + H\big(X_{[1:i]}\big)\Big) \qquad (84)\\
&= \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} (\alpha-1-i)\,c_{[1:i],\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}H(X_V) + \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}H\big(X_{[1:i]}\big)\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}\\
&= \frac{1}{\lambda^{\mathbf{w}}_\alpha}\sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} (\alpha-1-i)\,c_{[1:i],\alpha}\sum_{V \in \Omega^{(l^{\mathbf{w}}_\alpha)}_{K,\alpha}} c_{V,\alpha}H(X_V) + \sum_{i=l^{\mathbf{w}}_{\alpha-1}+1}^{l^{\mathbf{w}}_\alpha} c_{[1:i],\alpha}H\big(X_{[1:i]}\big), \qquad (85)
\end{aligned}$$
where (84) and (85) are due to (81) and (31), respectively. Now one can readily complete the proof by combining (82) and (85) and invoking the fact that $c_{[1:i],\alpha-1} = c_{[1:i],\alpha}$ when $i \in [1:l^{\mathbf{w}}_{\alpha-1}-1]$. ∎

REFERENCES

[1] D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Trans. Inf. Theory, vol. 19, no. 4, pp. 471–480, Jul. 1973.

[2] T. M. Cover, "A proof of the data compression theorem of Slepian and Wolf for ergodic sources (Corresp.)," IEEE Trans. Inf. Theory, vol. 21, no. 2, pp. 226–228, Mar. 1975.

[3] J. R. Roche, "Distributed information storage," Ph.D. dissertation, Dept. Statist., Stanford Univ., Stanford, CA, USA, 1992.

[4] R. W. Yeung, "Multilevel diversity coding with distortion," IEEE Trans. Inf. Theory, vol. 41, no. 2, pp. 412–422, Mar. 1995.

[5] R. C. Singleton, "Maximum distance q-nary codes," IEEE Trans. Inf. Theory, vol. 10, no. 2, pp. 116–118, Apr. 1964.

[6] J. R. Roche, R. W. Yeung, and K. P. Hau, "Symmetrical multilevel diversity coding," IEEE Trans. Inf. Theory, vol. 43, no. 3, pp. 1059–1064, May 1997.

[7] R. W. Yeung and Z. Zhang, "On symmetrical multilevel diversity coding," IEEE Trans. Inf. Theory, vol. 45, no. 2, pp. 609–621, Mar. 1999.

[8] S. Mohajer, C. Tian, and S. N. Diggavi, "Asymmetric multilevel diversity coding and asymmetric Gaussian multiple descriptions," IEEE Trans. Inf. Theory, vol. 56, no. 9, pp. 4367–4387, Sep. 2010.

[9] A. Balasubramanian, H. D. Ly, S. Li, T. Liu, and S. L. Miller, "Secure symmetrical multilevel diversity coding," IEEE Trans. Inf. Theory, vol. 59, no. 6, pp. 3572–3581, Jun. 2013.

[10] J. Jiang, N. Marukala, and T. Liu, "Symmetrical multilevel diversity coding and subset entropy inequalities," IEEE Trans. Inf. Theory, vol. 60, no. 1, pp. 84–103, Jan. 2014.

[11] C. Tian and T. Liu, "Multilevel diversity coding with regeneration," 2015. [Online]. Available: http://arxiv.org/abs/1503.00013

[12] L. Song, J. Chen, and C. Tian, "Broadcasting correlated vector Gaussians," IEEE Trans. Inf. Theory, vol. 61, no. 5, pp. 2465–2477, May 2015.

[13] I. Csiszár, "Linear codes for sources and source networks: Error exponents, universal coding," IEEE Trans. Inf. Theory, vol. 28, no. 4, pp. 585–592, Jul. 1982.

[14] C. Tian, S. Mohajer, and S. N. Diggavi, "Approximating the Gaussian multiple description rate region under symmetric distortion constraints," IEEE Trans. Inf. Theory, vol. 55, no. 8, pp. 3869–3891, Aug. 2009.

[15] J. Chen and T. Berger, "Robust distributed source coding," IEEE Trans. Inf. Theory, vol. 54, no. 8, pp. 3385–3398, Aug. 2008.

[16] T. S. Han, "Nonnegative entropy measures of multivariate symmetric correlations," Inf. Control, vol. 36, no. 2, pp. 133–156, Feb. 1978.


Zhiqing Xiao received the B.S. degree from the Beijing University of Posts and Telecommunications, China, in 2011. He is currently a Ph.D. candidate in the Department of Electronic Engineering, Tsinghua University. His research interest is network information theory.

Jun Chen (S'03–M'06) received the B.E. degree with honors in communication engineering from Shanghai Jiao Tong University, Shanghai, China, in 2001 and the M.S. and Ph.D. degrees in electrical and computer engineering from Cornell University, Ithaca, NY, in 2004 and 2006, respectively.

He was a Postdoctoral Research Associate in the Coordinated Science Laboratory at the University of Illinois at Urbana-Champaign, Urbana, IL, from September 2005 to July 2006, and a Postdoctoral Fellow at the IBM Thomas J. Watson Research Center, Yorktown Heights, NY, from July 2006 to August 2007. Since September 2007 he has been with the Department of Electrical and Computer Engineering at McMaster University, Hamilton, ON, Canada, where he is currently an Associate Professor. His research interests include information theory, wireless communications, and signal processing.

He received several awards for his research, including the Josef Raviv Memorial Postdoctoral Fellowship in 2006, the Early Researcher Award from the Province of Ontario in 2010, and the IBM Faculty Award in 2010. He is currently serving as an Associate Editor for Shannon Theory for the IEEE TRANSACTIONS ON INFORMATION THEORY.

Yunzhou Li (M'06) received the Ph.D. degree from Tsinghua University in 2004. He is currently a professorship researcher at Tsinghua University. His research interests include wireless and mobile communications, such as WLAN and cellular systems, as well as coding theory.

Jing Wang (M'99) received the B.S. and M.S. degrees in electronic engineering from Tsinghua University, Beijing, China, in 1983 and 1986, respectively. He has been on the faculty at Tsinghua University since 1986. He is currently a professor at the School of Information Science and Technology, Tsinghua University. He serves as the Vice Director of the Tsinghua National Lab for Information Science and Technology. His research interests are in the area of wireless communications, including transmission and networking technologies of 5G. He has published more than 150 conference and journal papers.

