
Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Transcript
Page 1: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Today:

Entropy


Information Theory

Page 2: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Information Theory

Page 3: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Claude Shannon, Ph.D. (1916-2001)

Page 4: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

I(X;Y) = H(X) − H(X|Y)

Page 5: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

I(X;Y) = H(X) − H(X|Y)

H(X) = − Σ_i p(x_i) log2 p(x_i)

Page 6: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Entropy

H(X) = − Σ_i p(x_i) log2 p(x_i)

Page 7: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Entropy

A measure of the disorder in a system

Page 8: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Entropy

The (average) number of yes/no questions needed to completely specify the state of a system.

Page 9: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

The (average) number of yes/no questions needed to completely specify the state of a system.

Page 10: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

What if there were two coins?

Page 11: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

What if there were two coins?

Page 12: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

What if there were two coins?

Page 13: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

What if there were two coins?

Page 14: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

2 states. 1 question.

4 states. 2 questions.

8 states. 3 questions.

16 states. 4 questions.

number of states = 2^(number of yes-no questions)

Page 15: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

number of states = 2^(number of yes-no questions)

log2(number of states) = number of yes-no questions
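As a concrete illustration of this relation (not part of the original slides), here is a minimal Python sketch, with the hypothetical helper questions_needed, that counts how many halving yes/no questions are needed to single out one of n equally likely states.

```python
import math

def questions_needed(n_states: int) -> int:
    """Count the yes/no questions needed to pin down one of n equally likely
    states, where each question splits the remaining candidates in half."""
    questions = 0
    remaining = n_states
    while remaining > 1:
        remaining = math.ceil(remaining / 2)  # each answer halves the candidates
        questions += 1
    return questions

for n in (2, 4, 8, 16):
    print(f"{n} states: {questions_needed(n)} questions, log2({n}) = {math.log2(n):.0f}")
```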

Page 16: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

H = log2(n)

H is entropy, the number of yes-no questions required to specify the state of the system

n is the number of states of the system, assumed (for now) to be equally likely

Page 17: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

H = log2(n)

Page 18: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Consider Dice

Page 19: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

The Six Sided Die

H = log2(6) = 2.585 bits

Page 20: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

The Four Sided Die

H = log2(4) = 2.000 bits

Page 21: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

The Twenty Sided Die

H = log2(20) = 4.322 bits

Page 22: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

What about all three dice?

H = log2(4 × 6 × 20)

Page 23: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

What about all three dice?

H = log2(4)+log2(6)+log2(20)

Page 24: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

What about all three dice?

H = 8.907 bits

Page 25: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

What about all three dice?

Entropy, from independent elements of a system, adds
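A quick numerical check of the dice slides above, as an illustrative Python sketch (added here, not from the slides): it reproduces H = log2(n) for each die and shows that the entropy of all three independent dice equals the sum of the individual entropies.

```python
import math

def entropy_equally_likely(n_states: int) -> float:
    """H = log2(n) bits for n equally likely states."""
    return math.log2(n_states)

h4, h6, h20 = (entropy_equally_likely(n) for n in (4, 6, 20))
print(f"d4: {h4:.3f}  d6: {h6:.3f}  d20: {h20:.3f}")        # 2.000, 2.585, 4.322 bits
print(f"all three dice: {math.log2(4 * 6 * 20):.3f} bits")  # 8.907 bits
print(f"sum of the three: {h4 + h6 + h20:.3f} bits")        # the same 8.907 bits
```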

Page 26: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

H = log2(n)

Let's rewrite this a bit... Trivial Fact 1: log2(x) = −log2(1/x)

Page 27: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

H = −log2(1/n)

Trivial Fact 1: log2(x) = −log2(1/x)

Trivial Fact 2: if there are n equally likely possibilities, p = 1/n

Page 28: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

H = −log2(p)

Trivial Fact 2: if there are n equally likely possibilities, p = 1/n

Page 29: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

H = −log2(p)

Page 30: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

H = −log2(p)

What if the n states are not equally probable? Maybe we should use the expected value of the entropies: a weighted average, weighted by probability.

Page 31: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

H = − Σ_i p_i log2(p_i)   (sum over i = 1 … n)

Let's do a simple example: n = 2. How does H change as we vary p1 and p2?

Page 32: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

H = − Σ_i p_i log2(p_i)

n = 2

p1 + p2 = 1
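The n = 2 case can be tabulated directly. Here is a brief Python sketch (illustrative only; the helper name entropy is an arbitrary choice) that sweeps p1 from 0 to 1 with p2 = 1 − p1 and shows H peaking at 1 bit when the two outcomes are equally likely and falling to 0 when one outcome is certain.

```python
import math

def entropy(probs):
    """H = -sum p_i log2(p_i), in bits, with 0 log 0 taken as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Sweep p1 with p2 = 1 - p1: H is largest (1 bit) at p1 = p2 = 1/2.
for p1 in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p1 = {p1:.2f}   H = {entropy([p1, 1 - p1]):.3f} bits")
```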

Page 33: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

How about n = 3?

H = − Σ_i p_i log2(p_i)

n = 3

p1 + p2 + p3 = 1

Page 34: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

The bottom line intuitions for Entropy:

• Entropy is a statistic for describing a probability distribution.

• Probability distributions that are flat, broad, spread out, etc. have HIGH entropy.

• Probability distributions that are peaked, sharp, narrow, compact, etc. have LOW entropy.

• Entropy adds for independent elements of a system, thus entropy grows with the dimensionality of the probability distribution.

• Entropy is zero IFF the system is in a definite state, i.e. p = 1 somewhere and 0 everywhere else.

Page 35: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Pop Quiz:

[figure: four panels, numbered 1-4]

Page 36: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Entropy

The (average) number of yes/no questions needed to completely specify the state of a system.

Page 37: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

At 11:16 am (Pacific) on June 29th of the year 2001, there were approximately 816,119 words in the English language.

H(English) = log2(816,119) ≈ 19.6 bits

Twenty Questions: 2^20 = 1,048,576

What's a winning 20 Questions strategy?
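A short check of the numbers quoted on this slide (the 816,119 word count is taken from the slide itself), as an illustrative Python snippet:

```python
import math

n_words = 816_119                                      # word count quoted above
print(f"H(English) ≈ {math.log2(n_words):.1f} bits")   # ≈ 19.6 bits
print(f"2**20 = {2 ** 20:,}")                          # 1,048,576 > 816,119
```

Since 2^20 exceeds the number of words, 20 well-chosen questions suffice, and the winning strategy (echoing the earlier slides) is to ask questions whose answers split the remaining candidates roughly in half.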

Page 38: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.


Page 39: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

I(X;Y) = H(X) − H(X|Y)

So, what is information?

It’s a change in what you don’t know.

It’s a change in the entropy.

Page 40: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Information as a measure of correlation

[diagram: x and y]

Page 41: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Information as a measure of correlation

[diagram: x and y]

Page 42: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

[Figure: two bar plots of probability (0 to 1) over {heads, tails}. Left, P(Y): both bars at 1/2, so H(Y) = 1. Right, P(Y|x=heads): still both at 1/2, so H(Y|x=heads) = 1.]

I(X;Y) = H(Y) − H(Y|X) = 0 bits

Page 43: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Information as a measure of correlation

[diagram: x and y]

Page 44: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Information as a measure of correlation

[diagram: x and y]

Page 45: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

[Figure: two bar plots of probability (0 to 1) over {heads, tails}. Left, P(Y): both bars at 1/2, so H(Y) = 1. Right, P(Y|x=heads): nearly all of the mass on one outcome, so H(Y|x=heads) ≈ 0.]

I(X;Y) = H(Y) − H(Y|X) ≈ 1 bit
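To connect the two coin figures to the formula, here is an illustrative Python sketch that computes I(X;Y) = H(Y) − H(Y|X) from a joint probability table. The independent table matches the 0-bit case two slides back; the correlated table uses an assumed 0.499/0.001 split, chosen so that H(Y|x) is nearly 0, to approximate the picture above.

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X;Y) = H(Y) - H(Y|X) for a joint distribution given as {(x, y): p}."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    h_y = entropy(p_y.values())
    # H(Y|X) = sum over x of p(x) * H(Y | X = x)
    h_y_given_x = sum(
        px * entropy([p / px for (xx, _), p in joint.items() if xx == x])
        for x, px in p_x.items())
    return h_y - h_y_given_x

# Independent fair coins: knowing X says nothing about Y.
independent = {("H", "H"): 0.25, ("H", "T"): 0.25, ("T", "H"): 0.25, ("T", "T"): 0.25}
# Y nearly always copies X (the 0.499/0.001 split is an illustrative assumption).
correlated = {("H", "H"): 0.499, ("H", "T"): 0.001, ("T", "H"): 0.001, ("T", "T"): 0.499}

print(f"independent coins: I = {mutual_information(independent):.3f} bits")  # 0.000
print(f"correlated coins:  I = {mutual_information(correlated):.3f} bits")   # ≈ 0.979
```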

Page 46: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Information Theory in Neuroscience

[diagram: x and y]

Page 47: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

The Critical Observation:

Information is Mutual

I(X;Y) = I(Y;X)

H(Y)-H(Y|X) = H(X)-H(X|Y)

Page 48: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

The Critical Observation:

What a spike tells the brain about the stimulus is the same as what our stimulus choice tells us about the likelihood of a spike.

I(Stimulus;Spike) = I(Spike;Stimulus)


Page 49: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

The Critical Observation:

What our stimulus choice tells us about the likelihood of a spike.

[diagram: stimulus → response]

This, we can measure....

Page 50: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

How to use Information Theory:

1. Show your system stimuli. Measure neural responses.
2. Estimate P( neural response | stimulus presented ).
3. From that, estimate P( neural response ).
4. Compute H(neural response) and H(neural response | stimulus presented).
5. Calculate I(response ; stimulus) (a sketch of this pipeline follows below).
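A minimal sketch of this recipe in Python, assuming a simple plug-in (histogram) estimator; the helper names (entropy_from_counts, information_estimate) and the toy stimulus/response pairs are invented for illustration, not taken from the slides or the paper.

```python
import math
from collections import Counter

def entropy_from_counts(counts):
    """Plug-in entropy estimate, in bits, from a Counter of observations."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)

def information_estimate(trials):
    """Plug-in estimate of I(response; stimulus) = H(response) - H(response | stimulus)
    from a list of (stimulus, response) pairs."""
    h_response = entropy_from_counts(Counter(r for _, r in trials))
    by_stimulus = {}
    for s, r in trials:
        by_stimulus.setdefault(s, Counter())[r] += 1
    n = len(trials)
    h_conditional = sum((sum(c.values()) / n) * entropy_from_counts(c)
                        for c in by_stimulus.values())
    return h_response - h_conditional

# Toy data (hypothetical): the response is a spike count recorded on each trial.
trials = [("grating", 3), ("grating", 3), ("grating", 2), ("grating", 3),
          ("blank", 0), ("blank", 0), ("blank", 1), ("blank", 0)]
print(f"I(response; stimulus) ≈ {information_estimate(trials):.3f} bits")
```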

Page 51: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

How to screw it up:

• Choose stimuli which are not representative.
• Measure the “wrong” aspect of the response.
• Don’t take enough data to estimate P( ) well.
• Use a crappy method of computing H( ).
• Calculate I( ) and report it without comparing it to anything (see the shuffle-control sketch below)...
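On the last point, one common sanity check (a suggestion here, not something prescribed by the slides) is to recompute the estimate after shuffling the stimulus labels: this destroys the stimulus-response pairing, so the result should be near zero bits, and anything well above that hints at estimation bias or too little data. The sketch below reuses the hypothetical information_estimate and trials from the previous page.

```python
import random

def shuffled_control(trials, n_shuffles=100):
    """Average information estimate after repeatedly shuffling the stimulus labels;
    a rough gauge of the bias of the plug-in estimator on this amount of data."""
    stimuli = [s for s, _ in trials]
    responses = [r for _, r in trials]
    estimates = []
    for _ in range(n_shuffles):
        random.shuffle(stimuli)
        estimates.append(information_estimate(list(zip(stimuli, responses))))
    return sum(estimates) / n_shuffles

print(f"shuffle control ≈ {shuffled_control(trials):.3f} bits (should be near 0)")
```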

Page 52: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Here’s an example of Information Theory applied appropriately

Temporal Coding of Visual Information in the Thalamus. Pamela Reinagel and R. Clay Reid, J. Neurosci. 20(14):5392-5400 (2000).

Page 53: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

LGN responses are very reliable.


Is there information in the temporal pattern of spikes?

Page 54: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Patterns of Spikes in the LGN

[diagram: x and y]

…….0……..…….1……..

spikes

Page 55: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Patterns of Spikes in the LGN

[diagram: x and y]

…….00……..…….10…….. …….01……..…….11……..

spikes

Page 56: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Patterns of Spikes in the LGN

[diagram: x and y]

…….000……..…….101…….. …….011……..…….100……..

spikes

Page 57: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Patterns of Spikes in the LGN

[diagram: x and y]

…….000100……..…….101101…….. …….011110……..…….010001……..

spikes

Page 58: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

P( spike pattern)

Page 59: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

P( spike pattern | stimulus )
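As a rough illustration of how these word probabilities can be tallied (the binning, word length, and spike train below are invented for illustration and are not Reinagel and Reid's actual parameters), a short Python sketch:

```python
import math
from collections import Counter

def pattern_counts(spike_train, word_length):
    """Slide a window over a binned spike train (a list of 0/1 bins) and count
    each binary word of `word_length` consecutive bins."""
    words = ("".join(str(b) for b in spike_train[i:i + word_length])
             for i in range(len(spike_train) - word_length + 1))
    return Counter(words)

def entropy_from_counts(counts):
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Illustrative binned spike train (1 = at least one spike in that bin).
train = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0]
counts = pattern_counts(train, word_length=3)
print(counts)                                      # empirical P(spike pattern), unnormalized
print(f"H(spike words) ≈ {entropy_from_counts(counts):.3f} bits")
```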

Page 60: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

There is some extra Information in Temporal Patterns of spikes.

Page 61: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Claude Shannon, Ph.D. (1916-2001)

Page 62: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.

Prof. Tom Cover, EE376A & B

Page 63: Today: Entropy Information Theory. Claude Shannon Ph.D. 1916-2001.


