Basic Concepts of Information Theory
Entropy for Two-dimensional Discrete Finite Probability Schemes. Conditional Entropy.
Communication Network. Noise Characteristics of a Communication Channel.
Entropy. Basic Properties
• Continuity: if the probabilities of the occurrence of events are slightly changed, the entropy is slightly changed accordingly.
• Symmetry: the entropy is invariant under any permutation of its arguments:

H(p_1, ..., p_i, ..., p_j, ..., p_n) = H(p_1, ..., p_j, ..., p_i, ..., p_n)

• Extremal property: when all the events are equally likely, the average uncertainty has the largest value:

\max H(p_1, ..., p_n) = H\left(\frac{1}{n}, \frac{1}{n}, ..., \frac{1}{n}\right)
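As a concrete illustration, here is a minimal Python sketch (not part of the original slides; the function name and the base-2 logarithm are our assumptions) that computes the entropy of a finite scheme and checks the extremal property:

```python
import math

def entropy(probs, base=2):
    """Average uncertainty H(p_1, ..., p_n) = -sum_i p_i log p_i.
    Zero probabilities contribute nothing (0 log 0 := 0)."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Extremal property: among all schemes with n outcomes, the uniform one
# (1/n, ..., 1/n) has the largest entropy, log2(n).
print(entropy([0.5, 0.25, 0.15, 0.10]))   # ~1.74 bits, less than 2
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits = log2(4)
```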
Entropy. Basic Properties
• Additivity. Let H(p_1, p_2, ..., p_n) be the entropy associated with a complete set of events E_1, E_2, ..., E_n, and let the event E_n be divided into m disjoint subsets F_1, ..., F_m:

E_n = \bigcup_{k=1}^{m} F_k;   p_n = P(E_n) = \sum_{k=1}^{m} q_k,   where q_k = P(F_k),

so that

\frac{q_1}{p_n} + \frac{q_2}{p_n} + ... + \frac{q_m}{p_n} = 1.

• Thus H_{n+m-1} = H_n + p_n H_m, where

H_n = H(p_1, ..., p_n);
H_{n+m-1} = H(p_1, ..., p_{n-1}, q_1, q_2, ..., q_m);
H_m = H\left(\frac{q_1}{p_n}, \frac{q_2}{p_n}, ..., \frac{q_m}{p_n}\right).
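A numerical spot-check of the additivity property, with made-up numbers (splitting p_3 = 0.4 into q_1 = 0.1 and q_2 = 0.3):

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

p = [0.2, 0.4, 0.4]   # original scheme; the last event has p_n = 0.4
q = [0.1, 0.3]        # E_n split into two disjoint subsets
p_n = p[-1]

lhs = entropy(p[:-1] + q)  # H(p_1, ..., p_{n-1}, q_1, ..., q_m)
rhs = entropy(p) + p_n * entropy([qk / p_n for qk in q])
print(abs(lhs - rhs) < 1e-9)  # True -- the two sides agree
```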
Entropy. Basic Properties
• In general:
• H(p_1, p_2, ..., p_n) is continuous in each p_i for all 0 \le p_i \le 1;
• H(..., p_i, p_{i+1}, ...) = H(..., p_{i+1}, p_i, ...) for i = 1, 2, ..., n-1 (symmetry);
• H(p_1, ..., p_{n-1}, q_1, q_2, ..., q_m) = H(p_1, ..., p_n) + p_n H\left(\frac{q_1}{p_n}, \frac{q_2}{p_n}, ..., \frac{q_m}{p_n}\right), where p_n = \sum_{k=1}^{m} q_k (additivity).
ENTROPY FOR TWO-DIMENSIONAL DISCRETE FINITE PROBABILITY SCHEMES
Entropy for Two-dimensional Discrete Finite Probability Schemes
• The two-dimensional probability scheme provides the simplest mathematical model for a communication system with a transmitter and a receiver.
• Consider two finite discrete sample spaces Ω1 (the transmitter space) and Ω2 (the receiver space), and their product space Ω.
Entropy for Two-dimensional Discrete Finite Probability Schemes
• In Ω1 and Ω2 we select complete sets of events:

E = \{E_1, E_2, ..., E_n\};   F = \{F_1, F_2, ..., F_m\}

• Each event E_k ∈ Ω1 may occur in conjunction with any event F_j ∈ Ω2. Thus for the product space Ω = Ω1 × Ω2 we obtain the following complete set of events:

\begin{matrix}
E_1 F_1 & E_1 F_2 & \cdots & E_1 F_m \\
E_2 F_1 & E_2 F_2 & \cdots & E_2 F_m \\
\cdots & \cdots & \cdots & \cdots \\
E_n F_1 & E_n F_2 & \cdots & E_n F_m
\end{matrix}
Entropy for Two-dimensional Discrete Finite Probability Schemes
• We may consider the following three complete sets of probability schemes:

\{P(E_k)\};   \{P(F_j)\};   \{P(E_k F_j)\}

• Each of them is, by assumption, a finite complete probability scheme like

E = \{E_1, E_2, ..., E_n\};   \bigcup_{i=1}^{n} E_i = U;
P = \{p_1, p_2, ..., p_n\};   \sum_{i=1}^{n} p_i = 1.
Entropy for Two-dimensional Discrete Finite Probability Schemes
• The joint probability matrix for the random variables X and Y associated with the spaces Ω1 and Ω2:

[P(X, Y)] =
\begin{pmatrix}
p_{11} & p_{12} & \cdots & p_{1m} \\
p_{21} & p_{22} & \cdots & p_{2m} \\
\cdots & \cdots & \cdots & \cdots \\
p_{n1} & p_{n2} & \cdots & p_{nm}
\end{pmatrix}

• Respectively, the marginal probabilities are

P(x_k) = \sum_{j=1}^{m} p(x_k, y_j);   P(y_j) = \sum_{k=1}^{n} p(x_k, y_j).
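For concreteness, a minimal Python sketch of these two marginalizations; the 2×3 joint matrix here is a made-up example, not from the slides:

```python
# Joint probability matrix p(x_k, y_j): rows are source symbols x_k,
# columns are received symbols y_j; all entries sum to 1.
P_XY = [[0.30, 0.10, 0.10],
        [0.05, 0.25, 0.20]]

P_X = [sum(row) for row in P_XY]        # P(x_k) = sum over j
P_Y = [sum(col) for col in zip(*P_XY)]  # P(y_j) = sum over k
print(P_X)  # [0.5, 0.5]
print(P_Y)  # ~[0.35, 0.35, 0.3] (up to float rounding)
```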
Entropy for Two-dimensional Discrete Finite Probability Schemes
• Each complete probability scheme has a corresponding entropy:

Complete Probability Scheme | Entropy
\{P(E_k)\} | H(X) = -\sum_{k=1}^{n} \left(\sum_{j=1}^{m} p_{kj}\right) \log \left(\sum_{j=1}^{m} p_{kj}\right)
\{P(F_j)\} | H(Y) = -\sum_{j=1}^{m} \left(\sum_{k=1}^{n} p_{kj}\right) \log \left(\sum_{k=1}^{n} p_{kj}\right)
\{P(E_k F_j)\} | H(X, Y) = -\sum_{k=1}^{n} \sum_{j=1}^{m} p_{kj} \log p_{kj}
Entropy for Two-dimensional Discrete Finite Probability Schemes
• If all marginal probabilities p(x_k) and p(y_j) are known, then the marginal entropies can be expressed according to the entropy definition:

H(X) = -\sum_{k=1}^{n} p(x_k) \log p(x_k);   H(Y) = -\sum_{j=1}^{m} p(y_j) \log p(y_j).
Conditional Entropies
• Let now an event F_j occur not independently, but in conjunction with E_1, E_2, ..., E_n:

F_j = \bigcup_{k=1}^{n} E_k F_j

• The conditional probability of x_k given y_j is then

p(x_k | y_j) = P(X = x_k | Y = y_j) = \frac{P(X = x_k, Y = y_j)}{P(Y = y_j)} = \frac{p_{kj}}{p(y_j)}.
Conditional Entropies
• Consider the following complete probability scheme:

\{E_1 | F_j, E_2 | F_j, ..., E_n | F_j\};
P(E_k | F_j) = \frac{p_{kj}}{p(y_j)};   \sum_{k=1}^{n} \frac{p_{kj}}{p(y_j)} = 1.

• Hence

H(X | y_j) = -\sum_{k=1}^{n} p(x_k | y_j) \log p(x_k | y_j) = -\sum_{k=1}^{n} \frac{p_{kj}}{p(y_j)} \log \frac{p_{kj}}{p(y_j)}.
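A sketch of this conditional scheme for a single received symbol, using the same hypothetical joint matrix as before (the index j and variable names are illustrative):

```python
import math

P_XY = [[0.30, 0.10, 0.10],
        [0.05, 0.25, 0.20]]
P_Y = [sum(col) for col in zip(*P_XY)]

j = 0  # condition on the first received symbol, y_1
# p(x_k | y_j) = p_kj / p(y_j); together these form a complete scheme.
cond = [row[j] / P_Y[j] for row in P_XY]
H_X_given_yj = -sum(p * math.log2(p) for p in cond if p > 0)
print(sum(cond))      # ~1.0 -- the conditional probabilities sum to one
print(H_X_given_yj)   # entropy of X given that y_1 was received
```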
Conditional Entropies
• Taking this conditional entropy over all admissible y_j, we obtain a measure of the average conditional entropy of the system:

H(X | Y) = \sum_{j=1}^{m} p(y_j) H(X | y_j) = -\sum_{j=1}^{m} \sum_{k=1}^{n} p(y_j) p(x_k | y_j) \log p(x_k | y_j)
         = -\sum_{j=1}^{m} \sum_{k=1}^{n} p(x_k, y_j) \log p(x_k | y_j).

• Respectively,

H(Y | X) = \sum_{k=1}^{n} p(x_k) H(Y | x_k) = -\sum_{k=1}^{n} \sum_{j=1}^{m} p(x_k) p(y_j | x_k) \log p(y_j | x_k)
         = -\sum_{k=1}^{n} \sum_{j=1}^{m} p(x_k, y_j) \log p(y_j | x_k).
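A direct transcription of these two formulas into Python, again on the hypothetical joint matrix used earlier:

```python
import math

P_XY = [[0.30, 0.10, 0.10],   # p(x_k, y_j): rows index x, columns index y
        [0.05, 0.25, 0.20]]
P_X = [sum(row) for row in P_XY]
P_Y = [sum(col) for col in zip(*P_XY)]

# H(X|Y) = -sum_{j,k} p(x_k, y_j) log p(x_k | y_j),  p(x_k|y_j) = p_kj / p(y_j)
H_X_given_Y = -sum(p * math.log2(p / P_Y[j])
                   for row in P_XY for j, p in enumerate(row) if p > 0)
# H(Y|X) = -sum_{k,j} p(x_k, y_j) log p(y_j | x_k),  p(y_j|x_k) = p_kj / p(x_k)
H_Y_given_X = -sum(p * math.log2(p / P_X[k])
                   for k, row in enumerate(P_XY) for p in row if p > 0)
print(H_X_given_Y, H_Y_given_X)
```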
Conditional Entropies
• Since

p(x_k, y_j) = p(x_k) p(y_j | x_k) = p(y_j) p(x_k | y_j),

• the conditional entropies can finally be written as

H(X | Y) = -\sum_{j=1}^{m} \sum_{k=1}^{n} p(x_k, y_j) \log p(x_k | y_j);
H(Y | X) = -\sum_{k=1}^{n} \sum_{j=1}^{m} p(x_k, y_j) \log p(y_j | x_k).
Five Entropies Pertaining to Joint Distribution
• Thus we have considered:
• Two conditional entropies H(X|Y), H(Y|X)
• Two marginal entropies H(X), H(Y)
• The joint entropy H(X,Y)
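These five quantities are tied together by the chain rule H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y), a standard identity not stated explicitly on the slides; a quick numerical check on the same hypothetical joint matrix:

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

P_XY = [[0.30, 0.10, 0.10],
        [0.05, 0.25, 0.20]]
P_X = [sum(row) for row in P_XY]
P_Y = [sum(col) for col in zip(*P_XY)]

H_XY = entropy([p for row in P_XY for p in row])
# Conditional entropies computed directly from their definitions:
H_XgY = -sum(p * math.log2(p / P_Y[j])
             for row in P_XY for j, p in enumerate(row) if p > 0)
H_YgX = -sum(p * math.log2(p / P_X[k])
             for k, row in enumerate(P_XY) for p in row if p > 0)
print(abs(H_XY - (entropy(P_X) + H_YgX)) < 1e-9)  # True
print(abs(H_XY - (entropy(P_Y) + H_XgY)) < 1e-9)  # True
```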
COMMUNICATION NETWORK. NOISE CHARACTERISTICS OF A CHANNEL
Communication Network
• Consider a source of communication with a given alphabet. The source is linked to the receiver via a channel.
• The system may be described by a joint probability matrix: by giving the probability of the joint occurrence of two symbols, one at the input and another at the output.
Communication Network
• x_i is a symbol that was sent; y_j is a symbol that was received.
• The joint probability matrix:

[P(X, Y)] =
\begin{pmatrix}
P(x_1, y_1) & P(x_1, y_2) & \cdots & P(x_1, y_m) \\
P(x_2, y_1) & P(x_2, y_2) & \cdots & P(x_2, y_m) \\
\cdots & \cdots & \cdots & \cdots \\
P(x_n, y_1) & P(x_n, y_2) & \cdots & P(x_n, y_m)
\end{pmatrix}
Communication Network: Probability Schemes
• The following five probability schemes are of interest in a product space of the random variables X and Y:
• [P{X,Y}] – joint probability matrix
• [P{X}] – marginal probability matrix of X
• [P{Y}] – marginal probability matrix of Y
• [P{X|Y}] – conditional probability matrix of X|Y
• [P{Y|X}] – conditional probability matrix of Y|X
Communication Network: Entropies
• The five entropies corresponding to the five probability schemes above can be interpreted as follows:
• H(X,Y) – average information per pair of transmitted and received characters (the entropy of the system as a whole);
• H(X) – average information per character of the source (the entropy of the source);
• H(Y) – average information per character at the destination (the entropy at the receiver);
• H(Y|X) – a specific character x_k is transmitted, and one of the permissible y_j may be received (a measure of information about the receiver when it is known what was transmitted);
• H(X|Y) – a specific character y_j is received; it may be the result of the transmission of one of the x_k, each with a given probability (a measure of information about the source when it is known what was received).
Communication Network: Entropies’ Meaning
• H(X) and H(Y) give indications of the probabilistic nature of the transmitter and the receiver, respectively.
• H(X,Y) describes the probabilistic nature of the communication channel as a whole.
• H(Y|X) gives an indication of the noise (errors) in the channel.
• H(X|Y) gives a measure of equivocation (how well one can recover the input content from the output).
Communication Network: Derivation of the Noise Characteristics
• In general, the joint probability matrix is not given for the communication system.
• It is customary to specify the noise characteristics of a channel and the source alphabet probabilities.
• From these data the joint and the output probability matrices can be derived.
Communication Network: Derivation of the Noise Characteristics
• Let us suppose that we have derived the joint probability matrix:

[P(X, Y)] =
\begin{pmatrix}
p(x_1) p(y_1 | x_1) & p(x_1) p(y_2 | x_1) & \cdots & p(x_1) p(y_m | x_1) \\
p(x_2) p(y_1 | x_2) & p(x_2) p(y_2 | x_2) & \cdots & p(x_2) p(y_m | x_2) \\
\cdots & \cdots & \cdots & \cdots \\
p(x_n) p(y_1 | x_n) & p(x_n) p(y_2 | x_n) & \cdots & p(x_n) p(y_m | x_n)
\end{pmatrix}
Communication Network: Derivation of the Noise Characteristics
• In other words:

[P{X,Y}] = [P{X}] [P{Y|X}]

• where [P{X}] is here the diagonal matrix

[P{X}] =
\begin{pmatrix}
p(x_1) & 0 & 0 & \cdots & 0 \\
0 & p(x_2) & 0 & \cdots & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & \cdots & p(x_{n-1}) & 0 \\
0 & 0 & \cdots & 0 & p(x_n)
\end{pmatrix}
Communication Network: Derivation of the Noise Characteristics
• If [P{X}] is not diagonal, but a row matrix (n-dimensional vector), then

[P{Y}] = [P{X}] [P{Y|X}]

• where [P{Y}] is also a row matrix (m-dimensional vector) designating the probabilities of the output alphabet.
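A sketch of both derivations for a hypothetical binary source and channel (all numbers are made up for illustration):

```python
# Source probabilities (row matrix) and noise matrix p(y_j | x_i), rows summing to 1.
P_X   = [0.6, 0.4]
P_YgX = [[0.9, 0.1],
         [0.2, 0.8]]

# Joint matrix [P{X,Y}] = diag(P{X}) [P{Y|X}]: p(x_i, y_j) = p(x_i) p(y_j | x_i)
P_XY = [[P_X[i] * P_YgX[i][j] for j in range(2)] for i in range(2)]

# Output probabilities [P{Y}] = [P{X}] [P{Y|X}] (row vector times matrix)
P_Y = [sum(P_X[i] * P_YgX[i][j] for i in range(2)) for j in range(2)]

print(P_XY)  # [[0.54, 0.06], [0.08, 0.32]] (up to float rounding)
print(P_Y)   # [0.62, 0.38]
```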
Communication Network: Derivation of the Noise Characteristics
• Two discrete channels are of particular interest to us:
• the discrete noise-free channel (an ideal channel);
• the discrete channel with independent input and output (errors occur in the channel, so noise is present).
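To make the contrast concrete, a small sketch (binary alphabet, uniform source; the helper name is ours) computing the equivocation H(X|Y) for the two cases:

```python
import math

def equivocation(P_X, P_YgX):
    """H(X|Y) from source probabilities and a channel matrix p(y_j | x_i)."""
    n, m = len(P_X), len(P_YgX[0])
    P_XY = [[P_X[i] * P_YgX[i][j] for j in range(m)] for i in range(n)]
    P_Y = [sum(P_XY[i][j] for i in range(n)) for j in range(m)]
    return -sum(P_XY[i][j] * math.log2(P_XY[i][j] / P_Y[j])
                for i in range(n) for j in range(m) if P_XY[i][j] > 0)

P_X = [0.5, 0.5]
noise_free  = [[1.0, 0.0], [0.0, 1.0]]  # ideal channel: output determines input
independent = [[0.5, 0.5], [0.5, 0.5]]  # output independent of input
print(equivocation(P_X, noise_free))    # 0.0 -- no uncertainty about what was sent
print(equivocation(P_X, independent))   # 1.0 -- equals H(X); nothing is learned
```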