8/9/2019 Lect Slides 1
EE 708: Information Theory and Coding
Sibi Raj B Pillai
Department of Electrical Engineering
EE 708 Information Theory, IIT Bombay, 02 January 2012
Mindmap
Information Theory
- Network Info. Theory: Multiple Access/Broadcast, Relay/Interference, Multiple Descriptions, Joint Source-Channel Coding
- Channel Coding: Communication Capacity, Error Correction Codes, Error Exponents, Coding with Side Info
- Data Compression: Information Measures, Lossless Compression, Universal Compression, Rate Distortion Theory, Distributed Compression
- Vistas Beyond: Telecom Evolution, Signal Processing, Mathematical Finance, Ergodic Theory, Random Matrix Theory, Non-Shannon Measures
Sending Packets
Transmitter Channel Receiver
ARQ
If each packet takes T seconds, how many packets can we send in nT seconds (n >> 1)?
Sending Packets
Transmitter Channel Receiver
ARQ
Each packet is lost with probability p.
If each packet takes T seconds, how many packets can we send in nT seconds (n >> 1)?
Average Packet Time
Let T be the time spent for correctly sending a packet.
T is a random variable due to errors in the channel. The average time required for correctly sending a packet is
\[
\begin{aligned}
E[T] &= (1-p)T + p(1-p)(2T) + p^2(1-p)(3T) + \cdots \\
     &= (1-p)T \sum_{i \geq 1} i\, p^{i-1} \\
     &= (1-p)T \, \frac{1}{(1-p)^2} \\
     &= \frac{T}{1-p}. 
\end{aligned} \tag{1}
\]
From this, the number of correct packets N ≈ n(1-p).¹

¹ It is important to notice that this is an approximation; the actual computation is on the next page.
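The formula E[T] = T/(1-p) can be checked by simulation. A minimal Monte Carlo sketch of the ARQ scheme, where T = 1 second and p = 0.3 are illustrative values not taken from the slides:

```python
import random

# Monte Carlo check of E[T_pkt] = T / (1 - p) for ARQ:
# a packet is retransmitted until it gets through, each
# attempt taking T seconds and failing with probability p.
random.seed(0)
T, p = 1.0, 0.3
trials = 200_000

total = 0.0
for _ in range(trials):
    attempts = 1
    while random.random() < p:   # packet lost, retransmit
        attempts += 1
    total += attempts * T

avg = total / trials
print(avg, T / (1 - p))  # the two should agree closely (about 1.43)
```

The per-packet attempt count is geometric with success probability 1-p, which is exactly where the mean T/(1-p) comes from.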
Successful Packets
The probability that k packets were lost is
\[
\binom{n}{k} p^k (1-p)^{n-k}.
\]
Thus the average number of lost packets N_f is
\[
\begin{aligned}
N_f &= \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k} \\
    &= (1-p)^n \sum_{k=0}^{n} k \binom{n}{k} \alpha^k, \quad \text{where } \alpha = \frac{p}{1-p} \\
    &= (1-p)^n \sum_{k=0}^{n} \binom{n}{k}\, \alpha \frac{d}{d\alpha} \alpha^k \\
    &= (1-p)^n \, \alpha \frac{d}{d\alpha} (1+\alpha)^n \\
    &= n\alpha(1-p) = np.
\end{aligned}
\]
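The derivative trick above can be verified numerically by summing the binomial mean directly; n = 50 and p = 0.1 below are illustrative values, not from the slides:

```python
from math import comb

# Numerical check that sum_k k * C(n,k) p^k (1-p)^(n-k) = n*p,
# i.e., the average number of lost packets is N_f = np.
n, p = 50, 0.1
Nf = sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
print(Nf, n * p)  # both equal 5 up to floating point
```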
Scratching the CDs
Have you noticed the scratches on a CD?
Scratches most often cause loss of data.
How does the music still get played from a scratched CD?
Ans: Reed-Solomon Codes / Discrete Fourier Transform
Defective Memory
Suppose you develop a technique to make a dirt-cheap high-density memory disk, on which you wish to mass-distribute some content.
Unfortunately, on every disk there are some number of faulty (stuck) memory cells whose locations are known to you on testing.
How efficiently can you pack data so that no information is lost at the intended receivers?
Ans: Coding with Transmitter Side Information
Distributed Source Coding
Two sensors at different locations observe versions of the same phenomenon, say the temperature in a plant. They both wish to communicate their data to a central station.
Note that though the observed information is related, each sensor has no idea (other than statistical properties) of the observations at the other one. Do they have to individually send all the information they collect?
Even if you let those sensors sit together, collect, and process the data, the sum of the information sent individually is the same as in the case when they are not talking to each other.
Ans: Slepian-Wolf Coding
CDMA vs OFDMA
From 2G to 3G marked the shift from TDMA to CDMA, largely due to the efforts of Qualcomm and others.
From 3G to 4G, CDMA could not sustain the steam.
In particular, technologies like OFDMA (as the word says, a way of FDM) took over.
How do we know a new physical-layer technology has a future, say in wireless communication?
Ans: Capacity region of fading MAC/BC (not fully covered in EE 708)
Taking it too Far
A huge hadron collider is looking for subatomic particles. The sensor produces a binary output at integral multiples of T seconds. In each interval, the probability of observing the particle is p, independent of all other observations. The data is to be stored inside the collider until the experiment lasts, say 10^10 seconds. Of the onboard memory available, you wish to minimize the average memory required per observation; take p = 0.0001.
Design a scheme which will get you there.
Many codes can do this; in particular, Arithmetic coding is very popular.
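How far below one bit per observation can such a scheme go? The binary entropy of the source gives the Shannon limit; a quick computation for the p = 0.0001 stated above:

```python
from math import log2

# Entropy of a Bernoulli(p) observation: the Shannon limit on
# the average number of bits needed per stored observation.
p = 0.0001
H = -p * log2(p) - (1 - p) * log2(1 - p)
print(H)  # about 0.00147 bits/observation, far below 1 bit
```

So a good code (e.g. an arithmetic coder) needs under 0.0015 bits per observation on average, a compression of nearly 700x over storing raw bits.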
Probability Essentials
Introductory Treatments and Reference
- William Feller, An Introduction to Probability, Wiley Student Edition (Indian), 1970.
- P. Billingsley, Probability and Measure, Wiley, 1995.
- Fristedt and Gray, A Modern Approach to Probability Theory, Birkhauser, 1998.
Details Relevant to Shannon Theory
- Robert Gray, Probability, Random Processes and Ergodic Properties, available online at http://ee.stanford.edu/gray/arp.html
Note: We will use the minimal required set.
Bayes Rule
In its simplest form,
\[
P(A, B) = P(A)\, P(B \mid A).
\]
But what are P, A and B? Try this example.
Question) Consider a blood-testing equipment which detects the presence of a disease in 99% of the cases, if presented with infected samples. Thus 1% of the infected escape undetected. On the other hand, the test gives false results for 2% of healthy patients too. Suppose, on average, 1 out of 1000 people is infected. If the machine gives a positive test, what is the chance of the blood sample being actually infected?
Let us now make these precise.
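The blood-test question above is a direct application of Bayes' rule; a numeric sketch using the numbers from the question:

```python
# Bayes rule applied to the blood-test question:
# P(infected | positive)
#   = P(pos|inf) P(inf) / [P(pos|inf) P(inf) + P(pos|healthy) P(healthy)]
p_inf = 1 / 1000          # prior: 1 in 1000 infected
p_pos_inf = 0.99          # detection rate on infected samples
p_pos_healthy = 0.02      # false-positive rate on healthy samples

p_pos = p_pos_inf * p_inf + p_pos_healthy * (1 - p_inf)
posterior = p_pos_inf * p_inf / p_pos
print(posterior)  # about 0.047
```

Despite the 99% detection rate, the posterior is under 5%, because healthy false positives vastly outnumber true positives at this prior.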
Probability Definition
Probability Space (Ω, F, P)
- We assume a set of possible experimental outcomes, say Ω.
- To obtain a level playing ground, we generate a field F containing subsets (events) of Ω.
- In order to be stable (meaningful) with respect to operations like limits, we insist that our field F is closed with respect to countable unions, and we call it a σ-field.
- Our final party is the measure (loosely, a non-negative function) P:
  Unit Norm:
  \[
  P(\Omega) = 1. \tag{2}
  \]
  Countable Additivity: For disjoint sets A_i ∈ F, i = 1, 2, ...,
  \[
  P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} P(A_i). \tag{3}
  \]
Random Variables
A random variable X is a measurable mapping from a probability space
\[
X : (\Omega, \mathcal{F}, P) \to (\mathcal{X}, \mathcal{B}),
\]
where \(\mathcal{X}\) is a set of observables, and \(\mathcal{B}\) a σ-field of observed events.
- The RHS imbibes a probability measure from the LHS, and there is no need to explicitly mention it.
- If X is discrete and finite (the case for a large part of our lectures), then the associated σ-fields can be taken as the power set (set of all subsets).
- More generally, there are other σ-fields as well, but we stick to the sensible ones with respect to which X is measurable. The ones we use are mostly clear from the context.
- From now, we assume that X takes values in R.
- We have to be more careful when X is continuous valued.
Expectation
For a discrete random variable X, the quantity
\[
E[X] = \sum_{x} x\, P(X = x)
\]
is known as the expectation of X. Notice that E(·) is a linear operation.
Similarly, when X is continuous, admitting a density f_X(x),
\[
E[X] = \int x\, f_X(x)\, dx.
\]
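The discrete formula is easy to exercise directly; a minimal example (a fair six-sided die, chosen here for illustration):

```python
# E[X] = sum_x x * P(X = x) for a fair six-sided die,
# a direct instance of the discrete expectation formula.
pmf = {x: 1 / 6 for x in range(1, 7)}
EX = sum(x * px for x, px in pmf.items())
print(EX)  # ~3.5
```

Linearity then gives, e.g., the mean of the sum of two dice as 3.5 + 3.5 = 7 without any further computation.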
Weak Law of Large Numbers (WLLN)
The weak law is about weak convergence of random variables.
Let {X_i}, i ∈ N, be a sequence of iid R-valued random variables. Define
\[
S_n = \frac{1}{n} \sum_{i=1}^{n} X_i.
\]
Then S_n converges in probability to E[X].
In other words, ∀ ε > 0,
\[
P(|S_n - E[X]| > \epsilon) \to 0.
\]
The Strong LLN says that if E[|X_i|] < ∞, then ∀ ε > 0,
\[
P\Big(\lim_{n \to \infty} |S_n - E[X]| > \epsilon\Big) = 0.
\]
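The WLLN statement can be seen empirically: the deviation probability P(|S_n - E[X]| > ε) shrinks as n grows. A simulation sketch using iid Uniform(0,1) samples (so E[X] = 1/2); the sample sizes, ε = 0.05, and run count are illustrative choices:

```python
import random

# Empirical illustration of the WLLN: estimate
# P(|S_n - 0.5| > eps) for iid Uniform(0,1) samples
# at increasing n, and watch it decay toward 0.
random.seed(1)
eps, runs = 0.05, 2000

def deviation_rate(n):
    bad = 0
    for _ in range(runs):
        Sn = sum(random.random() for _ in range(n)) / n
        if abs(Sn - 0.5) > eps:
            bad += 1
    return bad / runs

rates = [deviation_rate(n) for n in (10, 100, 1000)]
print(rates)  # strictly smaller at each larger n
```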