
1. Basics

Probability theory deals with the study of random phenomena, which under repeated experiments yield different outcomes that have certain underlying patterns about them. The notion of an experiment assumes a set of repeatable conditions that allow any number of identical repetitions. When an experiment is performed under these conditions, certain elementary events $\xi_i$ occur in different but completely uncertain ways. We can assign a nonnegative number $P(\xi_i)$ as the probability of the event $\xi_i$ in various ways.

PROBABILITY THEORY

PILLAI

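The totality of all $\xi_i$, known a priori, constitutes a set $\Omega$, the set of all experimental outcomes.

$$\Omega = \{\xi_1, \xi_2, \ldots, \xi_k, \ldots\} \tag{1-7}$$

$\Omega$ has subsets $A, B, C, \ldots$. Recall that if $A$ is a subset of $\Omega$, then $\xi \in A$ implies $\xi \in \Omega$. From $A$ and $B$, we can generate other related subsets $A \cup B$, $A \cap B$, $\overline{A}$, $\overline{B}$, etc.

$$A \cup B = \{\xi \mid \xi \in A \text{ or } \xi \in B\}$$

$$A \cap B = \{\xi \mid \xi \in A \text{ and } \xi \in B\}$$

and

$$\overline{A} = \{\xi \mid \xi \notin A\}. \tag{1-8}$$

[Fig. 1.1: Venn diagrams of $A \cup B$, $A \cap B$ and $\overline{A}$]

• If $A \cap B = \emptyset$, the empty set, then A and B are said to be mutually exclusive (M.E.).

• A partition of $\Omega$ is a collection of mutually exclusive subsets of $\Omega$ such that their union is $\Omega$.

$$A_i \cap A_j = \emptyset \ (i \neq j), \quad \text{and} \quad \bigcup_{i=1} A_i = \Omega. \tag{1-9}$$

[Fig. 1.2: two mutually exclusive sets $A$, $B$ with $A \cap B = \emptyset$, and a partition $A_1, A_2, \ldots, A_n$ of $\Omega$]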

De-Morgan’s Laws:

$$\overline{A \cup B} = \overline{A} \cap \overline{B}; \qquad \overline{A \cap B} = \overline{A} \cup \overline{B} \tag{1-10}$$

[Fig. 1.3: Venn diagrams illustrating $\overline{A \cup B} = \overline{A} \cap \overline{B}$]

• Often it is meaningful to talk about at least some of the subsets of $\Omega$ as events, for which we must have a mechanism to compute their probabilities.

Example 1.1: Consider the experiment where two coins are simultaneously tossed. The various elementary events are

$$\xi_1 = (H,H), \quad \xi_2 = (H,T), \quad \xi_3 = (T,H), \quad \xi_4 = (T,T)$$

and

$$\Omega = \{\xi_1, \xi_2, \xi_3, \xi_4\}.$$

The subset $A = \{\xi_1, \xi_2, \xi_3\}$ is the same as “Head has occurred at least once” and qualifies as an event.

Suppose two subsets A and B are both events, then consider

“Does an outcome belong to A or B?” ($A \cup B$)

“Does an outcome belong to A and B?” ($A \cap B$)

“Does an outcome fall outside A?” ($\overline{A}$)

Thus the sets $A \cup B$, $A \cap B$, $\overline{A}$, $\overline{B}$, etc., also qualify as events.
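The set operations and De-Morgan's laws above can be checked on the two-coin sample space of Example 1.1. The following is only a small sketch: the event `B` ("first coin shows a tail") and the helper `complement` are illustrative assumptions, not part of the original notes.

```python
# Sample space of Example 1.1: two coins tossed together.
omega = {("H", "H"), ("H", "T"), ("T", "H"), ("T", "T")}

# A is the event from the text; B is a hypothetical second event.
A = {xi for xi in omega if "H" in xi}      # head has occurred at least once
B = {xi for xi in omega if xi[0] == "T"}   # first coin shows a tail (assumed)

def complement(event):
    """Outcomes of omega that fall outside the given event."""
    return omega - event

union = A | B            # outcomes in A or B
intersection = A & B     # outcomes in A and B

# De-Morgan's laws (1-10), checked by direct set comparison.
de_morgan_1 = complement(A | B) == complement(A) & complement(B)
de_morgan_2 = complement(A & B) == complement(A) | complement(B)

print(sorted(intersection), de_morgan_1, de_morgan_2)
```

Because the sample space is finite, every derived set is again a subset of `omega`, which is the sense in which unions, intersections and complements of events are again events.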

Axioms of Probability

For any event A, we assign a number P(A), called the probability of the event A. This number satisfies the following three conditions that act as the axioms of probability.

(i) $P(A) \geq 0$ (Probability is a nonnegative number)

(ii) $P(\Omega) = 1$ (Probability of the whole set is unity)

(iii) If $A \cap B = \emptyset$, then $P(A \cup B) = P(A) + P(B)$. (1-13)

(Note that (iii) states that if A and B are mutually exclusive (M.E.) events, the probability of their union is the sum of their probabilities.)

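The following conclusions follow from these axioms:

a. Since $A \cup \overline{A} = \Omega$, we have using (ii)

$$P(A \cup \overline{A}) = P(\Omega) = 1.$$

But $A \cap \overline{A} = \emptyset$, and using (iii),

$$P(A \cup \overline{A}) = P(A) + P(\overline{A}) = 1 \quad \text{or} \quad P(\overline{A}) = 1 - P(A). \tag{1-14}$$

b. Similarly, for any A, $A \cap \emptyset = \emptyset$. Hence it follows that

$$P(A \cup \emptyset) = P(A) + P(\emptyset).$$

But $A \cup \emptyset = A$, and thus

$$P(\emptyset) = 0. \tag{1-15}$$

c. Suppose A and B are not mutually exclusive (M.E.). How does one compute $P(A \cup B)$?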

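To compute the above probability, we should re-express $A \cup B$ in terms of M.E. sets so that we can make use of the probability axioms. From Fig.1.4 we have

$$A \cup B = A \cup \overline{A}B, \tag{1-16}$$

where $A$ and $\overline{A}B$ are clearly M.E. events.

Thus using axiom (1-13-iii)

$$P(A \cup B) = P(A \cup \overline{A}B) = P(A) + P(\overline{A}B). \tag{1-17}$$

To compute $P(\overline{A}B)$, we can express B as

$$B = B \cap \Omega = B \cap (A \cup \overline{A}) = (B \cap A) \cup (B \cap \overline{A}) = BA \cup B\overline{A}. \tag{1-18}$$

Thus

$$P(B) = P(BA) + P(B\overline{A}), \tag{1-19}$$

since $BA = AB$ and $B\overline{A} = \overline{A}B$ are M.E. events.

[Fig. 1.4: Venn diagram of $A \cup B$ split into $A$ and $\overline{A}B$]

From (1-19),

$$P(\overline{A}B) = P(B) - P(AB) \tag{1-20}$$

and using (1-20) in (1-17)

$$P(A \cup B) = P(A) + P(B) - P(AB). \tag{1-21}$$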
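The identities (1-20) and (1-21) can be verified by exact counting on a small equiprobable sample space. The sketch below assumes a hypothetical two-dice experiment with events `A` and `B` chosen only for illustration; neither appears in the original notes.

```python
from fractions import Fraction
from itertools import product

# Hypothetical experiment: two fair dice, all 36 outcomes equally likely.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == 6}       # first die shows 6 (assumed event)
B = {w for w in omega if sum(w) >= 10}    # total is at least 10 (assumed event)
not_A = omega - A

# (1-20): P(A-bar B) = P(B) - P(AB)
lhs_120 = prob(not_A & B)
rhs_120 = prob(B) - prob(A & B)

# (1-21): P(A u B) = P(A) + P(B) - P(AB)
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)

print(lhs, rhs)
```

Using `Fraction` keeps the comparison exact, so equality of the two sides is not blurred by floating-point rounding.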

Conditional Probability and Independence

In N independent trials, suppose $N_A$, $N_B$, $N_{AB}$ denote the number of times events A, B and AB occur respectively. According to the frequency interpretation of probability, for large N

$$P(A) \approx \frac{N_A}{N}, \quad P(B) \approx \frac{N_B}{N}, \quad P(AB) \approx \frac{N_{AB}}{N}. \tag{1-33}$$

Among the $N_A$ occurrences of A, only $N_{AB}$ of them are also found among the $N_B$ occurrences of B. Thus the ratio

$$\frac{N_{AB}}{N_B} = \frac{N_{AB}/N}{N_B/N} = \frac{P(AB)}{P(B)} \tag{1-34}$$

is a measure of “the event A given that B has already occurred”. We denote this conditional probability by

P(A|B) = Probability of “the event A given that B has occurred”.

We define

$$P(A|B) = \frac{P(AB)}{P(B)}, \tag{1-35}$$

provided $P(B) \neq 0$. As we show below, the above definition satisfies all probability axioms discussed earlier.

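We have

(i)

$$P(A|B) = \frac{P(AB)}{P(B)} \geq 0, \quad \text{since } P(AB) \geq 0 \text{ and } P(B) > 0, \tag{1-36}$$

(ii)

$$P(\Omega|B) = \frac{P(\Omega B)}{P(B)} = \frac{P(B)}{P(B)} = 1, \tag{1-37}$$

since $\Omega B = B$.

(iii) Suppose $A \cap C = \emptyset$. Then

$$P(A \cup C|B) = \frac{P((A \cup C) \cap B)}{P(B)} = \frac{P(AB \cup CB)}{P(B)}. \tag{1-38}$$

But $AB \cap CB = \emptyset$, hence $P(AB \cup CB) = P(AB) + P(CB)$ and

$$P(A \cup C|B) = \frac{P(AB)}{P(B)} + \frac{P(CB)}{P(B)} = P(A|B) + P(C|B), \tag{1-39}$$

satisfying all probability axioms in (1-13). Thus (1-35) defines a legitimate probability measure.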

Properties of Conditional Probability:

a. If $B \subset A$, then $AB = B$, and

$$P(A|B) = \frac{P(AB)}{P(B)} = \frac{P(B)}{P(B)} = 1, \tag{1-40}$$

since if $B \subset A$, then occurrence of B implies automatic occurrence of the event A. As an example, let A = {outcome is even} and B = {outcome is 2} in a dice tossing experiment. Then $B \subset A$, and $P(A|B) = 1$.

b. If $A \subset B$, then $AB = A$, and

$$P(A|B) = \frac{P(AB)}{P(B)} = \frac{P(A)}{P(B)} > P(A). \tag{1-41}$$

Independence: A and B are said to be independent events, if

$$P(AB) = P(A) \cdot P(B). \tag{1-45}$$

Notice that the above definition is a probabilistic statement, not a set theoretic notion such as mutually exclusiveness.

Suppose A and B are independent, then

$$P(A|B) = \frac{P(AB)}{P(B)} = \frac{P(A)P(B)}{P(B)} = P(A). \tag{1-46}$$

Thus if A and B are independent, the event that B has occurred does not shed any more light into the event A. It makes no difference to A whether B has occurred or not. An example will clarify the situation:

Example 1.2: A box contains 6 white and 4 black balls. Remove two balls at random without replacement. What is the probability that the first one is white and the second one is black?

Let $W_1$ = “first ball removed is white”

$B_2$ = “second ball removed is black”

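We need $P(W_1 \cap B_2)$. We have $W_1 \cap B_2 = W_1B_2 = B_2W_1$. Using the conditional probability rule,

$$P(W_1B_2) = P(B_2W_1) = P(B_2|W_1)P(W_1). \tag{1-47}$$

But

$$P(W_1) = \frac{6}{6+4} = \frac{6}{10} = \frac{3}{5},$$

and

$$P(B_2|W_1) = \frac{4}{5+4} = \frac{4}{9},$$

and hence

$$P(W_1B_2) = \frac{3}{5} \cdot \frac{4}{9} = \frac{12}{45} = \frac{4}{15} \approx 0.27.$$

Are the events $W_1$ and $B_2$ independent? Our common sense says No. To verify this we need to compute $P(B_2)$. Of course the fate of the second ball very much depends on that of the first ball. The first ball has two options: $W_1$ = “first ball is white” or $B_1$ = “first ball is black”. Note that $W_1 \cap B_1 = \emptyset$, and $W_1 \cup B_1 = \Omega$. Hence $W_1$ together with $B_1$ form a partition. Thus (see (1-42)-(1-44))

$$P(B_2) = P(B_2|W_1)P(W_1) + P(B_2|B_1)P(B_1) = \frac{4}{5+4} \cdot \frac{3}{5} + \frac{3}{6+3} \cdot \frac{4}{10} = \frac{4}{9} \cdot \frac{3}{5} + \frac{1}{3} \cdot \frac{2}{5} = \frac{4+2}{15} = \frac{2}{5},$$

and

$$P(B_2)P(W_1) = \frac{2}{5} \cdot \frac{3}{5} = \frac{6}{25} \neq P(B_2W_1) = \frac{4}{15}.$$

As expected, the events $W_1$ and $B_2$ are dependent.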
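The numbers in Example 1.2 can be cross-checked by brute-force enumeration of all ordered draws of two distinct balls. This is only a sketch: the helper `prob` and the labelling of the balls are assumptions introduced for illustration.

```python
from fractions import Fraction
from itertools import permutations

# Box of Example 1.2: 6 white and 4 black balls, labelled by position 0..9.
balls = ["W"] * 6 + ["B"] * 4

# All ordered pairs of distinct balls (drawing without replacement).
draws = list(permutations(range(len(balls)), 2))

def prob(predicate):
    """Fraction of equally likely ordered draws that satisfy the predicate."""
    hits = sum(1 for i, j in draws if predicate(balls[i], balls[j]))
    return Fraction(hits, len(draws))

p_w1 = prob(lambda first, second: first == "W")
p_b2 = prob(lambda first, second: second == "B")
p_w1b2 = prob(lambda first, second: first == "W" and second == "B")

# Independence would require P(W1 B2) == P(W1) P(B2).
independent = p_w1b2 == p_w1 * p_b2

print(p_w1, p_b2, p_w1b2, independent)
```

The enumeration counts 24 favourable pairs out of 90, which is the 4/15 obtained from the conditional probability rule, while the product of the marginals is 6/25.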

From (1-35),

$$P(AB) = P(A|B)P(B). \tag{1-48}$$

Similarly, from (1-35)

$$P(B|A) = \frac{P(BA)}{P(A)} = \frac{P(AB)}{P(A)},$$

or

$$P(AB) = P(B|A)P(A). \tag{1-49}$$

From (1-48)-(1-49), we get

$$P(A|B)P(B) = P(B|A)P(A)$$

or

$$P(A|B) = \frac{P(B|A)}{P(B)} \cdot P(A). \tag{1-50}$$

Equation (1-50) is known as Bayes’ theorem.

Although simple enough, Bayes’ theorem has an interesting interpretation: P(A) represents the a-priori probability of the event A. Suppose B has occurred, and assume that A and B are not independent. How can this new information be used to update our knowledge about A? Bayes’ rule in (1-50) takes into account the new information (“B has occurred”) and gives out the a-posteriori probability of A given B.

We can also view the event B as new knowledge obtained from a fresh experiment. We know something about A as P(A). The new information is available in terms of B. The new information should be used to improve our knowledge/understanding of A. Bayes’ theorem gives the exact mechanism for incorporating such new information.

A more general version of Bayes’ theorem involves partition of $\Omega$. From (1-50)

$$P(A_i|B) = \frac{P(B|A_i)P(A_i)}{P(B)} = \frac{P(B|A_i)P(A_i)}{\sum_{j=1}^{n} P(B|A_j)P(A_j)}, \tag{1-51}$$

where we have made use of (1-44). In (1-51), $A_i$, $i = 1 \to n$, represent a set of mutually exclusive events with associated a-priori probabilities $P(A_i)$, $i = 1 \to n$. With the new information “B has occurred”, the information about $A_i$ can be updated by the n conditional probabilities $P(B|A_i)$, $i = 1 \to n$, using (1-47).

Example 1.3: Two boxes $B_1$ and $B_2$ contain 100 and 200 light bulbs respectively. The first box ($B_1$) has 15 defective bulbs and the second 5. Suppose a box is selected at random and one bulb is picked out.

(a) What is the probability that it is defective?

Solution: Note that box $B_1$ has 85 good and 15 defective bulbs. Similarly box $B_2$ has 195 good and 5 defective bulbs. Let D = “Defective bulb is picked out”.

Then

$$P(D|B_1) = \frac{15}{100} = 0.15, \quad P(D|B_2) = \frac{5}{200} = 0.025.$$

Since a box is selected at random, they are equally likely.

$$P(B_1) = P(B_2) = \frac{1}{2}.$$

Thus $B_1$ and $B_2$ form a partition as in (1-43), and using (1-44) we obtain

$$P(D) = P(D|B_1)P(B_1) + P(D|B_2)P(B_2) = 0.15 \times \frac{1}{2} + 0.025 \times \frac{1}{2} = 0.0875.$$

Thus, there is about 9% probability that a bulb picked at random is defective.

(b) Suppose we test the bulb and it is found to be defective. What is the probability that it came from box 1, i.e., $P(B_1|D)$?

$$P(B_1|D) = \frac{P(D|B_1)P(B_1)}{P(D)} = \frac{0.15 \times 1/2}{0.0875} = 0.8571. \tag{1-52}$$

Notice that initially $P(B_1) = 0.5$; then we picked out a box at random and tested a bulb that turned out to be defective. Can this information shed some light about the fact that we might have picked up box 1?

From (1-52), $P(B_1|D) = 0.857 > 0.5$, and indeed it is more likely at this point that we must have chosen box 1 in favor of box 2. (Recall that the fraction of defective bulbs in box 1 is six times that in box 2.)
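The two steps of Example 1.3, the total probability in part (a) and the Bayes update in part (b), can be reproduced in a few lines. This is a minimal sketch; the dictionary names `prior`, `likelihood` and `posterior` are illustrative choices, not notation from the notes.

```python
from fractions import Fraction

# A-priori probabilities of picking each box (selected at random).
prior = {"B1": Fraction(1, 2), "B2": Fraction(1, 2)}

# P(D | box): fraction of defective bulbs in each box.
likelihood = {"B1": Fraction(15, 100), "B2": Fraction(5, 200)}

# Part (a): total probability of a defective bulb.
p_defective = sum(likelihood[box] * prior[box] for box in prior)

# Part (b): Bayes' theorem for each box of the partition, as in (1-51).
posterior = {box: likelihood[box] * prior[box] / p_defective for box in prior}

print(float(p_defective), float(posterior["B1"]))
```

The posterior probabilities over the partition sum to one, which is a quick sanity check on any such update.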

Random Variables

Let X be a function that maps every $\xi \in \Omega$ to a unique point $x \in R$, the set of real numbers. Since the outcome $\xi$ is not certain, so is the value $X(\xi) = x$. Thus if B is some subset of R, we may want to determine the probability of “$X(\xi) \in B$”. To determine this probability, we can look at the set $A = X^{-1}(B) \subset \Omega$ that contains all $\xi \in \Omega$ that map into B under the function X.

[Fig. 3.1: the mapping $X(\xi)$ from $A \subset \Omega$ to $B \subset R$]

Obviously, if the set $A = X^{-1}(B)$ also belongs to the associated field F, then it is an event and the probability of A is well defined; in that case we can say

$$\text{Probability of the event } \{X(\xi) \in B\} = P(X^{-1}(B)). \tag{3-1}$$

However, $X^{-1}(B)$ may not always belong to F for all B, thus creating difficulties. The notion of random variable (r.v) makes sure that the inverse mapping always results in an event so that we are able to determine the probability for any $B \subset R$.

Random Variable (r.v): A finite single valued function $X(\cdot)$ that maps the set of all experimental outcomes $\Omega$ into the set of real numbers R is said to be a r.v, if the set $\{\xi \mid X(\xi) \leq x\}$ is an event ($\in F$) for every x in R.

Denote

$$P\{\xi \mid X(\xi) \leq x\} = F_X(x) \geq 0. \tag{3-4}$$

This is the Probability Distribution Function (PDF) associated with the r.v X.

Distribution Function: Note that a distribution function g(x) is nondecreasing, right-continuous and satisfies

$$g(+\infty) = 1, \quad g(-\infty) = 0, \tag{3-5}$$

i.e., if g(x) is a distribution function, then

(i) $g(+\infty) = 1$, $g(-\infty) = 0$,

(ii) if $x_1 < x_2$, then $g(x_1) \leq g(x_2)$,

and

(iii) $g(x^+) = g(x)$, for all x. (3-6)

We need to show that $F_X(x)$ defined in (3-4) satisfies all properties in (3-6). In fact, for any r.v X,

(i)

$$F_X(+\infty) = P\{\xi \mid X(\xi) \leq +\infty\} = P(\Omega) = 1 \tag{3-7}$$

and

$$F_X(-\infty) = P\{\xi \mid X(\xi) \leq -\infty\} = P(\emptyset) = 0. \tag{3-8}$$

(ii) If $x_1 < x_2$, then the subset $(-\infty, x_1) \subset (-\infty, x_2)$. Consequently the event $\{\xi \mid X(\xi) \leq x_1\} \subset \{\xi \mid X(\xi) \leq x_2\}$, since $X(\xi) \leq x_1$ implies $X(\xi) \leq x_2$. As a result

$$F_X(x_1) = P(X(\xi) \leq x_1) \leq P(X(\xi) \leq x_2) = F_X(x_2), \tag{3-9}$$

implying that the probability distribution function is nonnegative and monotone nondecreasing.

Additional Properties of a PDF

(iv) If $F_X(x_0) = 0$ for some $x_0$, then

$$F_X(x) = 0, \quad x \leq x_0. \tag{3-15}$$

This follows, since $F_X(x_0) = P(X(\xi) \leq x_0) = 0$ implies $\{X(\xi) \leq x_0\}$ is the null set, and for any $x \leq x_0$, $\{X(\xi) \leq x\}$ will be a subset of the null set.

(v)

$$P\{X(\xi) > x\} = 1 - F_X(x). \tag{3-16}$$

We have $\{X(\xi) \leq x\} \cup \{X(\xi) > x\} = \Omega$, and since the two events are mutually exclusive, (3-16) follows.

(vi)

$$P\{x_1 < X(\xi) \leq x_2\} = F_X(x_2) - F_X(x_1), \quad x_2 > x_1. \tag{3-17}$$

The events $\{X(\xi) \leq x_1\}$ and $\{x_1 < X(\xi) \leq x_2\}$ are mutually exclusive and their union represents the event $\{X(\xi) \leq x_2\}$.

(vii)

$$P(X(\xi) = x) = F_X(x) - F_X(x^-). \tag{3-18}$$

Let $x_1 = x - \epsilon$, $\epsilon > 0$, and $x_2 = x$. From (3-17)

$$\lim_{\epsilon \to 0} P\{x - \epsilon < X(\xi) \leq x\} = F_X(x) - \lim_{\epsilon \to 0} F_X(x - \epsilon), \tag{3-19}$$

or

$$P\{X(\xi) = x\} = F_X(x) - F_X(x^-). \tag{3-20}$$

According to (3-14), $F_X(x_0^+)$, the limit of $F_X(x)$ as $x \to x_0$ from the right, always exists and equals $F_X(x_0)$. However the left limit value $F_X(x_0^-)$ need not equal $F_X(x_0)$. Thus $F_X(x)$ need not be continuous from the left. At a discontinuity point of the distribution, the left and right limits are different, and from (3-20)

$$P\{X(\xi) = x_0\} = F_X(x_0) - F_X(x_0^-) > 0. \tag{3-21}$$

Thus the only discontinuities of a distribution function $F_X(x)$ are of the jump type, and occur at points $x_0$ where (3-21) is satisfied. These points can always be enumerated as a sequence, and moreover they are at most countable in number.

Example 3.1: X is a r.v such that $X(\xi) = c$, $\xi \in \Omega$. Find $F_X(x)$.

Solution: For $x < c$, $\{X(\xi) \leq x\} = \emptyset$, so that $F_X(x) = 0$, and for $x \geq c$, $\{X(\xi) \leq x\} = \Omega$, so that $F_X(x) = 1$. (Fig.3.2)

[Fig. 3.2: $F_X(x)$, a unit step at $x = c$]

Example 3.2: Toss a coin. $\Omega = \{H, T\}$. Suppose the r.v X is such that $X(T) = 0$, $X(H) = 1$. Find $F_X(x)$.

Solution: For $x < 0$, $\{X(\xi) \leq x\} = \emptyset$, so that $F_X(x) = 0$.

For $0 \leq x < 1$, $\{X(\xi) \leq x\} = \{T\}$, so that $F_X(x) = P\{T\} = 1 - p = q$.

For $x \geq 1$, $\{X(\xi) \leq x\} = \{H, T\} = \Omega$, so that $F_X(x) = 1$. (Fig. 3.3)

[Fig. 3.3: $F_X(x)$, a step of height q at $x = 0$ rising to 1 at $x = 1$]

• X is said to be a continuous-type r.v if its distribution function $F_X(x)$ is continuous. In that case $F_X(x^-) = F_X(x)$ for all x, and from (3-21) we get $P\{X = x\} = 0$.

• If $F_X(x)$ is constant except for a finite number of jump discontinuities (piece-wise constant; step-type), then X is said to be a discrete-type r.v. If $x_i$ is such a discontinuity point, then from (3-21)

$$p_i = P\{X = x_i\} = F_X(x_i) - F_X(x_i^-). \tag{3-22}$$

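From Fig.3.2, at a point of discontinuity we get

$$P\{X = c\} = F_X(c) - F_X(c^-) = 1 - 0 = 1,$$

and from Fig.3.3,

$$P\{X = 0\} = F_X(0) - F_X(0^-) = q - 0 = q.$$

Example 3.3: A fair coin is tossed twice, and let the r.v X represent the number of heads. Find $F_X(x)$.

Solution: In this case $\Omega = \{HH, HT, TH, TT\}$, and

$$X(HH) = 2, \quad X(HT) = 1, \quad X(TH) = 1, \quad X(TT) = 0.$$

For $x < 0$, $\{X(\xi) \leq x\} = \emptyset$, so that $F_X(x) = 0$.

For $0 \leq x < 1$, $\{X(\xi) \leq x\} = \{TT\}$, so that $F_X(x) = P\{TT\} = P(T)P(T) = \frac{1}{4}$.

For $1 \leq x < 2$, $\{X(\xi) \leq x\} = \{TT, HT, TH\}$, so that $F_X(x) = P\{TT, HT, TH\} = \frac{3}{4}$.

For $x \geq 2$, $\{X(\xi) \leq x\} = \Omega$, so that $F_X(x) = 1$. (Fig. 3.4)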
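The staircase distribution function of Example 3.3 can be built directly from the definition $F_X(x) = P\{X(\xi) \leq x\}$. The sketch below assumes equally likely outcomes (fair coin); the helper `jump`, which approximates the left limit with a small offset `eps`, is an illustrative device that works here only because the jumps sit at integer points.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin; outcomes are equally likely.
omega = list(product("HT", repeat=2))

def X(xi):
    """Random variable of Example 3.3: number of heads in the outcome."""
    return xi.count("H")

def F_X(x):
    """Distribution function P{X <= x}, computed by counting outcomes."""
    return Fraction(sum(1 for xi in omega if X(xi) <= x), len(omega))

def jump(x, eps=Fraction(1, 1000)):
    """F_X(x) - F_X(x - eps): stands in for F_X(x) - F_X(x^-) in (3-22)."""
    return F_X(x) - F_X(x - eps)

for x in (-1, 0, Fraction(1, 2), 1, 2):
    print(x, F_X(x))
print("P{X = 1} =", jump(1))
```

The jump at each discontinuity recovers the probability mass at that point, as (3-22) states.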

From Fig.3.4,

$$P\{X = 1\} = F_X(1) - F_X(1^-) = 3/4 - 1/4 = 1/2.$$

[Fig. 3.4: $F_X(x)$ with steps of height 1/4, 3/4 and 1 at $x = 0, 1, 2$]

Probability density function (p.d.f)

The derivative of the distribution function $F_X(x)$ is called the probability density function $f_X(x)$ of the r.v X. Thus

$$f_X(x) = \frac{dF_X(x)}{dx}. \tag{3-23}$$

Since

$$\frac{dF_X(x)}{dx} = \lim_{\Delta x \to 0} \frac{F_X(x + \Delta x) - F_X(x)}{\Delta x} \geq 0, \tag{3-24}$$

from the monotone-nondecreasing nature of $F_X(x)$, it follows that $f_X(x) \geq 0$ for all x. $f_X(x)$ will be a continuous function, if X is a continuous type r.v. However, if X is a discrete type r.v as in (3-22), then its p.d.f has the general form (Fig. 3.5)

$$f_X(x) = \sum_i p_i \delta(x - x_i), \tag{3-25}$$

where $x_i$ represent the jump-discontinuity points in $F_X(x)$. As Fig. 3.5 shows, $f_X(x)$ represents a collection of positive discrete masses, and it is known as the probability mass function (p.m.f) in the discrete case. From (3-23), we also obtain by integration

$$F_X(x) = \int_{-\infty}^{x} f_X(u)\,du. \tag{3-26}$$

Since $F_X(+\infty) = 1$, (3-26) yields

$$\int_{-\infty}^{+\infty} f_X(x)\,dx = 1, \tag{3-27}$$

which justifies its name as the density function. Further, from (3-26), we also get (Fig. 3.6b)

$$P\{x_1 < X(\xi) \leq x_2\} = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} f_X(x)\,dx. \tag{3-28}$$

Thus the area under $f_X(x)$ in the interval $(x_1, x_2)$ represents the probability in (3-28).

[Fig. 3.5: discrete masses $p_i$ at the points $x_i$] [Fig. 3.6: (a) $F_X(x)$ and (b) $f_X(x)$, with the interval $(x_1, x_2)$ marked]

Often, r.vs are referred by their specific density functions - both in the continuous and discrete cases - and in what follows we shall list a number of them in each category.

Continuous-type random variables

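1. Normal (Gaussian): X is said to be normal or Gaussian r.v, if

$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}. \tag{3-29}$$

This is a bell shaped curve, symmetric around the parameter $\mu$, and its distribution function is given by

$$F_X(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(y-\mu)^2/2\sigma^2}\,dy = G\!\left(\frac{x-\mu}{\sigma}\right), \tag{3-30}$$

where $G(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\,dy$ is often tabulated. Since $f_X(x)$ depends on two parameters $\mu$ and $\sigma^2$, the notation $X \sim N(\mu, \sigma^2)$ will be used to represent (3-29).

[Fig. 3.7: the Gaussian p.d.f $f_X(x)$, centered at $\mu$]

2. Uniform: $X \sim U(a,b)$, $a < b$, if (Fig. 3.8)

$$f_X(x) = \begin{cases} \dfrac{1}{b-a}, & a \leq x \leq b, \\ 0, & \text{otherwise.} \end{cases} \tag{3-31}$$

3. Exponential: $X \sim \varepsilon(\lambda)$ if (Fig. 3.9)

$$f_X(x) = \begin{cases} \dfrac{1}{\lambda}\, e^{-x/\lambda}, & x \geq 0, \\ 0, & \text{otherwise.} \end{cases} \tag{3-32}$$

9. Cauchy: $X \sim C(\alpha, \mu)$, if (Fig. 3.14)

$$f_X(x) = \frac{\alpha/\pi}{\alpha^2 + (x-\mu)^2}, \quad -\infty < x < +\infty. \tag{3-39}$$

10. Laplace: (Fig. 3.15)

$$f_X(x) = \frac{1}{2\lambda}\, e^{-|x|/\lambda}, \quad -\infty < x < +\infty. \tag{3-40}$$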

Discrete-type random variables

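1. Bernoulli: X takes the values (0,1), and

$$P(X = 0) = q, \quad P(X = 1) = p. \tag{3-43}$$

2. Binomial: $X \sim B(n,p)$, if (Fig. 3.17)

$$P(X = k) = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, 2, \ldots, n. \tag{3-44}$$

3. Poisson: $X \sim P(\lambda)$, if (Fig. 3.18)

$$P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots, \infty. \tag{3-45}$$

4. Hypergeometric:

$$P(X = k) = \frac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{N}{n}}, \quad \max(0, m+n-N) \leq k \leq \min(m, n). \tag{3-46}$$

5. Geometric: $X \sim g(p)$ if

$$P(X = k) = p q^k, \quad k = 0, 1, 2, \ldots, \infty, \quad q = 1 - p. \tag{3-47}$$

6. Negative Binomial: $X \sim NB(r, p)$ if

$$P(X = k) = \binom{k-1}{r-1} p^r q^{k-r}, \quad k = r, r+1, \ldots. \tag{3-48}$$

7. Discrete-Uniform:

$$P(X = k) = \frac{1}{N}, \quad k = 1, 2, \ldots, N. \tag{3-49}$$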
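As a quick sanity check on the discrete distributions listed above, each probability mass function should sum to one over its support. The sketch below codes (3-44)-(3-46) directly; the parameter values and the truncation of the Poisson sum at 60 terms are arbitrary assumptions made for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p), as in (3-44)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ P(lam), as in (3-45)."""
    return exp(-lam) * lam**k / factorial(k)

def hypergeometric_pmf(k, N, m, n):
    """P(X = k) as in (3-46): k marked items in n draws from N with m marked."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

# Illustrative parameter choices (not from the notes).
binomial_total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
poisson_total = sum(poisson_pmf(k, 2.5) for k in range(60))  # truncated tail
k_lo, k_hi = max(0, 7 + 5 - 20), min(7, 5)
hyper_total = sum(hypergeometric_pmf(k, 20, 7, 5) for k in range(k_lo, k_hi + 1))

# Mean of the binomial, to compare with n*p = 3.
binomial_mean = sum(k * binomial_pmf(k, 10, 0.3) for k in range(11))

print(binomial_total, poisson_total, hyper_total, binomial_mean)
```

The Poisson support is infinite, so its sum is only approximately one after truncation; for the parameter used here the neglected tail is far below floating-point resolution.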

Conditional Probability Density Function

For any two events A and B, we have defined the conditional probability of A given B as

$$P(A|B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) \neq 0. \tag{4-9}$$

Noting that the probability distribution function $F_X(x)$ is given by

$$F_X(x) = P\{X(\xi) \leq x\}, \tag{4-10}$$

we may define the conditional distribution of the r.v X given the event B as

$$F_X(x|B) = P\{X(\xi) \leq x \mid B\} = \frac{P\{(X(\xi) \leq x) \cap B\}}{P(B)}. \tag{4-11}$$

Thus the definition of the conditional distribution depends on conditional probability, and since it obeys all probability axioms, it follows that the conditional distribution has the same properties as any distribution function. In particular

$$F_X(+\infty|B) = \frac{P\{(X(\xi) \leq +\infty) \cap B\}}{P(B)} = \frac{P(B)}{P(B)} = 1,$$

$$F_X(-\infty|B) = \frac{P\{(X(\xi) \leq -\infty) \cap B\}}{P(B)} = \frac{P(\emptyset)}{P(B)} = 0. \tag{4-12}$$

Further

$$P(x_1 < X(\xi) \leq x_2 \mid B) = \frac{P\{(x_1 < X(\xi) \leq x_2) \cap B\}}{P(B)} = F_X(x_2|B) - F_X(x_1|B), \tag{4-13}$$

since for $x_2 \geq x_1$,

$$(X(\xi) \leq x_2) = (X(\xi) \leq x_1) \cup (x_1 < X(\xi) \leq x_2). \tag{4-14}$$

The conditional density function is the derivative of the conditional distribution function. Thus

$$f_X(x|B) = \frac{dF_X(x|B)}{dx}, \tag{4-15}$$

and proceeding as in (3-26) we obtain

$$F_X(x|B) = \int_{-\infty}^{x} f_X(u|B)\,du. \tag{4-16}$$

Using (4-16), we can also rewrite (4-13) as

$$P(x_1 < X(\xi) \leq x_2 \mid B) = \int_{x_1}^{x_2} f_X(x|B)\,dx. \tag{4-17}$$

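Example 4.4: Refer to example 3.2. Toss a coin and X(T)=0, X(H)=1. Suppose $B = \{H\}$. Determine $F_X(x|B)$.

Solution: From Example 3.2, $F_X(x)$ has the form shown in Fig. 4.3(a). We need $F_X(x|B)$ for all x.

For $x < 0$, $\{X(\xi) \leq x\} = \emptyset$, so that $\{(X(\xi) \leq x) \cap B\} = \emptyset$, and $F_X(x|B) = 0$.

For $0 \leq x < 1$, $\{X(\xi) \leq x\} = \{T\}$, so that $\{(X(\xi) \leq x) \cap B\} = \{T\} \cap \{H\} = \emptyset$ and $F_X(x|B) = 0$.

For $x \geq 1$, $\{X(\xi) \leq x\} = \Omega$, and $\{(X(\xi) \leq x) \cap B\} = \Omega \cap B = B$ and

$$F_X(x|B) = \frac{P(B)}{P(B)} = 1$$

(see Fig. 4.3(b)).

[Fig. 4.3: (a) $F_X(x)$, with steps at $x = 0$ (height q) and $x = 1$; (b) $F_X(x|B)$, a unit step at $x = 1$]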

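Example 4.5: Given $F_X(x)$, suppose $B = \{X(\xi) \leq a\}$. Find $f_X(x|B)$.

Solution: We will first determine $F_X(x|B)$. From (4-11) and B as given above, we have

$$F_X(x|B) = \frac{P\{(X \leq x) \cap (X \leq a)\}}{P(X \leq a)}. \tag{4-18}$$

For $x < a$, $(X \leq x) \cap (X \leq a) = (X \leq x)$ so that

$$F_X(x|B) = \frac{P(X \leq x)}{P(X \leq a)} = \frac{F_X(x)}{F_X(a)}. \tag{4-19}$$

For $x \geq a$, $(X \leq x) \cap (X \leq a) = (X \leq a)$ so that $F_X(x|B) = 1$. Thus

$$F_X(x|B) = \begin{cases} \dfrac{F_X(x)}{F_X(a)}, & x < a, \\ 1, & x \geq a, \end{cases} \tag{4-20}$$

and hence

$$f_X(x|B) = \frac{d}{dx} F_X(x|B) = \begin{cases} \dfrac{f_X(x)}{F_X(a)}, & x < a, \\ 0, & \text{otherwise.} \end{cases} \tag{4-21}$$

[Fig. 4.4: (a) $F_X(x)$ and $F_X(x|B)$; (b) $f_X(x)$ and $f_X(x|B)$, with the point $a$ marked]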

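Example 4.6: Let B represent the event $\{a < X(\xi) \leq b\}$ with $b > a$. For a given $F_X(x)$, determine $F_X(x|B)$ and $f_X(x|B)$.

Solution:

$$F_X(x|B) = P\{X(\xi) \leq x \mid B\} = \frac{P\{(X(\xi) \leq x) \cap (a < X(\xi) \leq b)\}}{P(a < X(\xi) \leq b)} = \frac{P\{(X(\xi) \leq x) \cap (a < X(\xi) \leq b)\}}{F_X(b) - F_X(a)}. \tag{4-22}$$

For $x < a$, we have $\{(X(\xi) \leq x) \cap (a < X(\xi) \leq b)\} = \emptyset$, and hence

$$F_X(x|B) = 0. \tag{4-23}$$

For $a \leq x < b$, we have $\{(X(\xi) \leq x) \cap (a < X(\xi) \leq b)\} = \{a < X(\xi) \leq x\}$ and hence

$$F_X(x|B) = \frac{P(a < X(\xi) \leq x)}{F_X(b) - F_X(a)} = \frac{F_X(x) - F_X(a)}{F_X(b) - F_X(a)}. \tag{4-24}$$

For $x \geq b$, we have $\{(X(\xi) \leq x) \cap (a < X(\xi) \leq b)\} = \{a < X(\xi) \leq b\}$ so that

$$F_X(x|B) = 1. \tag{4-25}$$

Using (4-23)-(4-25), we get (see Fig. 4.5)

$$f_X(x|B) = \begin{cases} \dfrac{f_X(x)}{F_X(b) - F_X(a)}, & a < x \leq b, \\ 0, & \text{otherwise.} \end{cases} \tag{4-26}$$

[Fig. 4.5: $f_X(x)$ and $f_X(x|B)$ over the interval $(a, b)$]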
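The conditional distribution and density of Example 4.6 can be tried out numerically. The sketch below assumes, purely for illustration, that X is exponential as in (3-32) with $\lambda = 2$ and that $B = \{1 < X \leq 3\}$; the midpoint-rule helper `integrate` is likewise an assumption, not part of the notes.

```python
from math import exp

LAM = 2.0  # hypothetical exponential parameter, as in (3-32)

def F_X(x):
    """Exponential distribution function."""
    return 1.0 - exp(-x / LAM) if x >= 0 else 0.0

def f_X(x):
    """Exponential density function."""
    return exp(-x / LAM) / LAM if x >= 0 else 0.0

def F_cond(x, a, b):
    """Conditional distribution F_X(x | a < X <= b), from (4-23)-(4-25)."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (F_X(x) - F_X(a)) / (F_X(b) - F_X(a))

def f_cond(x, a, b):
    """Conditional density f_X(x | a < X <= b), from (4-26)."""
    return f_X(x) / (F_X(b) - F_X(a)) if a < x <= b else 0.0

def integrate(f, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of f over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

a, b = 1.0, 3.0
area = integrate(lambda x: f_cond(x, a, b), a, b)          # should be close to 1
partial = integrate(lambda x: f_cond(x, a, b), a, 2.0)     # compare with F_cond(2)

print(area, partial, F_cond(2.0, a, b))
```

The conditional density is the original density restricted to $(a, b]$ and rescaled so that it integrates to one, which is what the first printed value confirms.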

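We can use the conditional p.d.f together with the Bayes’ theorem to update our a-priori knowledge about the probability of events in presence of new observations. Ideally, any new information should be used to update our knowledge. As we see in the next example, conditional p.d.f together with Bayes’ theorem allow systematic updating. For any two events A and B, Bayes’ theorem gives

$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}. \tag{4-27}$$

Let $B = \{x_1 < X(\xi) \leq x_2\}$ so that (4-27) becomes (see (4-13) and (4-17))

$$P\{A \mid (x_1 < X(\xi) \leq x_2)\} = \frac{P((x_1 < X(\xi) \leq x_2) \mid A)\,P(A)}{P(x_1 < X(\xi) \leq x_2)} = \frac{F_X(x_2|A) - F_X(x_1|A)}{F_X(x_2) - F_X(x_1)}\,P(A) = \frac{\int_{x_1}^{x_2} f_X(x|A)\,dx}{\int_{x_1}^{x_2} f_X(x)\,dx}\,P(A). \tag{4-28}$$

Further, let $x_1 = x$, $x_2 = x + \epsilon$, $\epsilon > 0$, so that in the limit as $\epsilon \to 0$,

$$\lim_{\epsilon \to 0} P\{A \mid (x < X(\xi) \leq x + \epsilon)\} = P(A \mid X(\xi) = x) = \frac{f_X(x|A)}{f_X(x)}\,P(A), \tag{4-29}$$

or

$$f_{X|A}(x|A) = \frac{P(A \mid X = x)\, f_X(x)}{P(A)}. \tag{4-30}$$

From (4-30), we also get

$$P(A) \underbrace{\int_{-\infty}^{+\infty} f_X(x|A)\,dx}_{1} = \int_{-\infty}^{+\infty} P(A \mid X = x)\, f_X(x)\,dx, \tag{4-31}$$

or

$$P(A) = \int_{-\infty}^{+\infty} P(A \mid X = x)\, f_X(x)\,dx, \tag{4-32}$$

and using this in (4-30), we get the desired result

$$f_{X|A}(x|A) = \frac{P(A \mid X = x)\, f_X(x)}{\int_{-\infty}^{+\infty} P(A \mid X = x)\, f_X(x)\,dx}. \tag{4-33}$$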

To illustrate the usefulness of this formulation, let us reexamine the coin tossing problem.

Example 4.7: Let $p = P(H)$ represent the probability of obtaining a head in a toss. For a given coin, a-priori p can possess any value in the interval (0,1). In the absence of any additional information, we may assume the a-priori p.d.f $f_P(p)$ to be a uniform distribution in that interval (Fig. 4.6). Now suppose we actually perform an experiment of tossing the coin n times, and k heads are observed. This is new information. How can we update $f_P(p)$?

[Fig. 4.6: the uniform a-priori p.d.f $f_P(p)$ on (0, 1)]

Solution: Let A = “k heads in n specific tosses”. Since these tosses result in a specific sequence,

$$P(A \mid P = p) = p^k q^{n-k}, \tag{4-34}$$

and using (4-32) we get

$$P(A) = \int_0^1 P(A \mid P = p)\, f_P(p)\,dp = \int_0^1 p^k (1-p)^{n-k}\,dp = \frac{(n-k)!\,k!}{(n+1)!}. \tag{4-35}$$

The a-posteriori p.d.f $f_{P|A}(p|A)$ represents the updated information given the event A, and from (4-30)

$$f_{P|A}(p|A) = \frac{P(A \mid P = p)\, f_P(p)}{P(A)} = \frac{(n+1)!}{(n-k)!\,k!}\, p^k q^{n-k}, \quad 0 < p < 1 \ \sim \beta(n, k). \tag{4-36}$$

[Fig. 4.7: the a-posteriori p.d.f $f_{P|A}(p|A)$ on (0, 1)]

Notice that the a-posteriori p.d.f of p in (4-36) is not a uniform distribution, but a beta distribution. We can use this a-posteriori p.d.f to make further predictions. For example, in the light of the above experiment, what can we say about the probability of a head occurring in the next (n+1)th toss?

Let B = “head occurring in the (n+1)th toss, given that k heads have occurred in n previous tosses”. Clearly $P(B \mid P = p) = p$, and from (4-32)

$$P(B) = \int_0^1 P(B \mid P = p)\, f_P(p|A)\,dp. \tag{4-37}$$

Notice that unlike (4-32), we have used the a-posteriori p.d.f in (4-37) to reflect our knowledge about the experiment already performed. Using (4-36) in (4-37), we get

$$P(B) = \int_0^1 p \cdot \frac{(n+1)!}{(n-k)!\,k!}\, p^k q^{n-k}\,dp = \frac{k+1}{n+2}. \tag{4-38}$$

Thus, if n = 10, and k = 6, then

$$P(B) = \frac{7}{12} = 0.58,$$

which is more realistic compared to p = 0.5.
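The closed forms (4-35) and (4-38) can be checked with exact arithmetic. The sketch below relies on the identity $\int_0^1 p^k (1-p)^{n-k}\,dp = k!\,(n-k)!/(n+1)!$ stated in (4-35); the function names and the midpoint-rule integrator are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def evidence(n, k):
    """P(A) in (4-35): integral of p^k (1-p)^(n-k) over (0, 1)."""
    return Fraction(factorial(n - k) * factorial(k), factorial(n + 1))

def posterior_pdf(p, n, k):
    """A-posteriori density (4-36) evaluated at p."""
    return float(1 / evidence(n, k)) * p**k * (1 - p) ** (n - k)

def predictive_head(n, k):
    """P(B) in (4-37): the integral of p^(k+1) (1-p)^(n-k) divided by P(A)."""
    return evidence(n + 1, k + 1) / evidence(n, k)

def integrate(f, lo, hi, steps=10000):
    """Midpoint-rule approximation of the integral of f over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

n, k = 10, 6
p_next_head = predictive_head(n, k)                      # exact value of (4-38)
area = integrate(lambda p: posterior_pdf(p, n, k), 0.0, 1.0)
numeric_next = integrate(lambda p: p * posterior_pdf(p, n, k), 0.0, 1.0)

print(p_next_head, area, numeric_next)
```

The exact ratio reduces to $(k+1)/(n+2)$ for any n and k, and the numerical integral of the posterior confirms that (4-36) is a properly normalized density.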

To summarize, if the probability of an event X is unknown, one should make a noncommittal judgement about its a-priori probability density function $f_X(x)$. Usually the uniform distribution is a reasonable assumption in the absence of any other information. Then experimental results (A) are obtained, and our knowledge about X must be updated reflecting this new information. Bayes’ rule helps to obtain the a-posteriori p.d.f of X given A, $f_{X|A}(x|A)$. From that point on, this a-posteriori p.d.f should be used to make further predictions and calculations.

Functions of a Random Variable

Let X be a r.v defined on the model $(\Omega, F, P)$, and suppose g(x) is a function of the variable x. Define

$$Y = g(X). \tag{5-1}$$

Is Y necessarily a r.v? If so what is its probability distribution function $F_Y(y)$ and its probability density function $f_Y(y)$?

In general, for a set B,

$$P(Y \in B) = P(X \in g^{-1}(B)). \tag{5-2}$$

In particular

$$F_Y(y) = P(Y(\xi) \leq y) = P(g(X(\xi)) \leq y) = P(X(\xi) \in g^{-1}(-\infty, y]). \tag{5-3}$$

Thus the distribution function as well as the density function of Y can be determined in terms of that of X. To obtain the distribution function of Y, we must determine the Borel set on the x-axis such that $X(\xi) \in g^{-1}(-\infty, y]$ for every given y, and the probability of that set. At this point, we shall consider some of the following functions to illustrate the technical details.

$Y = g(X)$: $aX + b$, $X^2$, $|X|$, $\sqrt{X}$, $|X|\,U(x)$, $e^X$, $\log X$, $1/X$, $\sin X$.

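Example 5.1:

$$Y = aX + b. \tag{5-4}$$

Solution: Suppose $a > 0$.

$$F_Y(y) = P(Y(\xi) \leq y) = P(aX(\xi) + b \leq y) = P\!\left(X(\xi) \leq \frac{y-b}{a}\right) = F_X\!\left(\frac{y-b}{a}\right) \tag{5-5}$$

and

$$f_Y(y) = \frac{1}{a}\, f_X\!\left(\frac{y-b}{a}\right). \tag{5-6}$$

On the other hand if $a < 0$, then

$$F_Y(y) = P(Y(\xi) \leq y) = P(aX(\xi) + b \leq y) = P\!\left(X(\xi) > \frac{y-b}{a}\right) = 1 - F_X\!\left(\frac{y-b}{a}\right), \tag{5-7}$$

and hence

$$f_Y(y) = -\frac{1}{a}\, f_X\!\left(\frac{y-b}{a}\right). \tag{5-8}$$

From (5-6) and (5-8), we obtain (for all $a \neq 0$)

$$f_Y(y) = \frac{1}{|a|}\, f_X\!\left(\frac{y-b}{a}\right). \tag{5-9}$$

Example 5.2:

$$Y = X^2. \tag{5-10}$$

$$F_Y(y) = P(Y(\xi) \leq y) = P(X^2(\xi) \leq y). \tag{5-11}$$

If $y < 0$, then the event $\{X^2(\xi) \leq y\} = \emptyset$, and hence

$$F_Y(y) = 0, \quad y < 0. \tag{5-12}$$

For $y > 0$, from Fig. 5.1, the event $\{Y(\xi) \leq y\} = \{X^2(\xi) \leq y\}$ is equivalent to $\{x_1 < X(\xi) \leq x_2\}$.

[Fig. 5.1: the parabola $Y = X^2$ with the level y and the two crossing points $x_1$, $x_2$]

Hence

$$F_Y(y) = P(x_1 < X(\xi) \leq x_2) = F_X(x_2) - F_X(x_1) = F_X(\sqrt{y}) - F_X(-\sqrt{y}), \quad y > 0. \tag{5-13}$$

By direct differentiation, we get

$$f_Y(y) = \begin{cases} \dfrac{1}{2\sqrt{y}} \left( f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right), & y > 0, \\ 0, & \text{otherwise.} \end{cases} \tag{5-14}$$

If $f_X(x)$ represents an even function, then (5-14) reduces to

$$f_Y(y) = \frac{1}{\sqrt{y}}\, f_X(\sqrt{y})\, U(y). \tag{5-15}$$

In particular if $X \sim N(0,1)$, so that

$$f_X(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \tag{5-16}$$

and substituting this into (5-14) or (5-15), we obtain the p.d.f of $Y = X^2$ to be

$$f_Y(y) = \frac{1}{\sqrt{2\pi y}}\, e^{-y/2}\, U(y). \tag{5-17}$$

On comparing this with (3-36), we notice that (5-17) represents a Chi-square r.v with n = 1, since $\Gamma(1/2) = \sqrt{\pi}$. Thus, if X is a Gaussian r.v with $\mu = 0$ and unit variance as in (5-16), then $Y = X^2$ represents a Chi-square r.v with one degree of freedom (n = 1).

Example 5.3: Let

$$Y = g(X) = \begin{cases} X - c, & X > c, \\ 0, & -c < X \leq c, \\ X + c, & X \leq -c. \end{cases}$$

In this case

$$P(Y = 0) = P(-c < X(\xi) \leq c) = F_X(c) - F_X(-c). \tag{5-18}$$

For $y > 0$, we have $x > c$, and $Y(\xi) = X(\xi) - c$ so that

$$F_Y(y) = P(Y(\xi) \leq y) = P(X(\xi) - c \leq y) = P(X(\xi) \leq y + c) = F_X(y + c), \quad y > 0. \tag{5-19}$$

Similarly if $y < 0$, we have $x < -c$, and $Y(\xi) = X(\xi) + c$ so that

$$F_Y(y) = P(Y(\xi) \leq y) = P(X(\xi) + c \leq y) = P(X(\xi) \leq y - c) = F_X(y - c), \quad y < 0. \tag{5-20}$$

Thus

$$f_Y(y) = \begin{cases} f_X(y + c), & y > 0, \\ [F_X(c) - F_X(-c)]\,\delta(y), & y = 0, \\ f_X(y - c), & y < 0. \end{cases} \tag{5-21}$$

[Fig. 5.2: (a) $g(X)$ with a dead zone on $(-c, c)$; (b) $F_X(x)$; (c) $F_Y(y)$]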

Note: As a general approach, given $Y = g(X)$, first sketch the graph $y = g(x)$, and determine the range space of y. Suppose $a < y < b$ is the range space of $y = g(x)$. Then clearly for $y < a$, $F_Y(y) = 0$, and for $y > b$, $F_Y(y) = 1$, so that $F_Y(y)$ can be nonzero only in $a < y < b$. Next, determine whether there are discontinuities in the range space of y. If so evaluate $P(Y(\xi) = y_i)$ at these discontinuities. In the continuous region of y, use the basic approach

$$F_Y(y) = P(g(X(\xi)) \leq y)$$

and determine appropriate events in terms of the r.v X for every y. Finally, we must have $F_Y(y)$ for $-\infty < y < +\infty$, and obtain

$$f_Y(y) = \frac{dF_Y(y)}{dy} \quad \text{in } a < y < b.$$

However, if $Y = g(X)$ is a continuous function, it is easy to establish a direct procedure to obtain $f_Y(y)$:

$$f_Y(y) = \sum_i \frac{1}{|dy/dx|_{x_i}}\, f_X(x_i) = \sum_i \frac{1}{|g'(x_i)|}\, f_X(x_i). \tag{5-30}$$

The summation index i in (5-30) depends on y, and for every y the equation $y = g(x_i)$ must be solved to obtain the total number of solutions at every y, and the actual solutions $x_1, x_2, \ldots$ all in terms of y.

For example, if $Y = X^2$, then for all $y > 0$, $x_1 = -\sqrt{y}$ and $x_2 = +\sqrt{y}$ represent the two solutions for each y. Notice that the solutions $x_i$ are all in terms of y so that the right side of (5-30) is only a function of y. Referring back to the example $Y = X^2$ (Example 5.2), here for each $y > 0$ there are two solutions given by $x_1 = -\sqrt{y}$ and $x_2 = +\sqrt{y}$ ($f_Y(y) = 0$ for $y < 0$). Moreover

$$\frac{dy}{dx} = 2x \quad \text{so that} \quad \left|\frac{dy}{dx}\right|_{x = x_i} = 2\sqrt{y}$$

and using (5-30) we get

$$f_Y(y) = \begin{cases} \dfrac{1}{2\sqrt{y}} \left( f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right), & y > 0, \\ 0, & \text{otherwise,} \end{cases} \tag{5-31}$$

which agrees with (5-14).

[Fig. 5.5: the parabola $Y = X^2$ with the level y and the two solutions $x_1$, $x_2$]
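The root-summation formula (5-30) translates almost literally into code. The sketch below applies it to $Y = X^2$ with $X \sim N(0,1)$ and compares the result with the closed form (5-17); the function names are illustrative, and the roots and derivative are supplied by hand rather than solved for automatically.

```python
from math import exp, pi, sqrt

def f_X(x):
    """Standard normal density, as in (5-16)."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def f_Y_transform(y):
    """Density of Y = X^2 via (5-30): sum f_X(x_i)/|g'(x_i)| over the roots."""
    if y <= 0:
        return 0.0
    roots = (-sqrt(y), sqrt(y))          # solutions of y = x^2
    return sum(f_X(x) / abs(2 * x) for x in roots)   # g'(x) = 2x

def f_Y_closed(y):
    """Chi-square density with one degree of freedom, as in (5-17)."""
    return exp(-y / 2) / sqrt(2 * pi * y) if y > 0 else 0.0

for y in (0.1, 1.0, 2.5):
    print(y, f_Y_transform(y), f_Y_closed(y))
```

For other functions g, only the list of roots and the derivative change; the summation itself stays the same.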

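Example 5.5:

$$Y = \frac{1}{X}. \tag{5-32}$$

Find $f_Y(y)$.

Solution: Here for every y, $x_1 = 1/y$ is the only solution, and

$$\frac{dy}{dx} = -\frac{1}{x^2} \quad \text{so that} \quad \left|\frac{dy}{dx}\right|_{x = x_1} = \frac{1}{1/y^2} = y^2,$$

and substituting this into (5-30), we obtain

$$f_Y(y) = \frac{1}{y^2}\, f_X\!\left(\frac{1}{y}\right). \tag{5-33}$$

In particular, suppose X is a Cauchy r.v as in (3-39) with parameter $\alpha$ so that

$$f_X(x) = \frac{\alpha/\pi}{\alpha^2 + x^2}, \quad -\infty < x < +\infty. \tag{5-34}$$

In that case from (5-33), $Y = 1/X$ has the p.d.f

$$f_Y(y) = \frac{1}{y^2} \cdot \frac{\alpha/\pi}{\alpha^2 + (1/y)^2} = \frac{(1/\alpha)/\pi}{(1/\alpha)^2 + y^2}, \quad -\infty < y < +\infty. \tag{5-35}$$

But (5-35) represents the p.d.f of a Cauchy r.v with parameter $1/\alpha$. Thus if $X \sim C(\alpha)$, then $1/X \sim C(1/\alpha)$.

Example 5.6: Suppose $f_X(x) = 2x/\pi^2$, $0 < x < \pi$, and $Y = \sin X$. Determine $f_Y(y)$.

Solution: Since X has zero probability of falling outside the interval $(0, \pi)$, $y = \sin x$ has zero probability of falling outside the interval (0,1). Clearly $f_Y(y) = 0$ outside this interval. For any $0 < y < 1$, from Fig.5.6(b), the equation $y = \sin x$ has an infinite number of solutions $\ldots, x_1, x_2, x_3, \ldots$, where $x_1 = \sin^{-1} y$ is the principal solution. Moreover, using the symmetry we also get $x_2 = \pi - x_1$ etc. Further,

$$\left|\frac{dy}{dx}\right| = |\cos x| = \sqrt{1 - \sin^2 x} = \sqrt{1 - y^2}$$

so that

$$\left|\frac{dy}{dx}\right|_{x = x_i} = \sqrt{1 - y^2}.$$

Using this in (5-30), we obtain for $0 < y < 1$,

$$f_Y(y) = \sum_{\substack{i = -\infty \\ i \neq 0}}^{+\infty} \frac{1}{\sqrt{1 - y^2}}\, f_X(x_i). \tag{5-36}$$

But from Fig. 5.6(a), in this case $f_X(x_{-1}) = f_X(x_3) = f_X(x_4) = \cdots = 0$ (except for $f_X(x_1)$ and $f_X(x_2)$ the rest are all zeros).

[Fig. 5.6: (a) $f_X(x)$ on $(0, \pi)$; (b) $y = \sin x$ with the solutions $x_{-1}, x_1, x_2, x_3$]

Thus (Fig. 5.7)

$$f_Y(y) = \frac{1}{\sqrt{1 - y^2}} \left( f_X(x_1) + f_X(x_2) \right) = \frac{1}{\sqrt{1 - y^2}} \left( \frac{2x_1}{\pi^2} + \frac{2x_2}{\pi^2} \right) = \frac{2(x_1 + \pi - x_1)}{\pi^2 \sqrt{1 - y^2}} = \begin{cases} \dfrac{2}{\pi\sqrt{1 - y^2}}, & 0 < y < 1, \\ 0, & \text{otherwise.} \end{cases} \tag{5-37}$$

[Fig. 5.7: $f_Y(y)$ on (0, 1), starting at $2/\pi$ and growing without bound as $y \to 1$]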
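The result (5-37) of Example 5.6 can be cross-checked through the distribution function instead of the root formula. The sketch below computes $F_Y(y) = P(\sin X \leq y)$ from the two intervals $X \leq x_1$ and $X \geq \pi - x_1$, then differentiates numerically; the helper names and the step size `h` are assumptions made for illustration.

```python
from math import asin, pi, sqrt

def F_X(x):
    """Distribution function of f_X(x) = 2x/pi^2 on (0, pi)."""
    x = min(max(x, 0.0), pi)
    return x * x / pi**2

def F_Y(y):
    """P(sin X <= y) = P(X <= x1) + P(X >= pi - x1), with x1 = asin(y)."""
    if y <= 0:
        return 0.0
    if y >= 1:
        return 1.0
    x1 = asin(y)
    return F_X(x1) + (1.0 - F_X(pi - x1))

def f_Y(y):
    """Closed-form density from (5-37)."""
    return 2.0 / (pi * sqrt(1.0 - y * y)) if 0 < y < 1 else 0.0

def numeric_derivative(F, y, h=1e-6):
    """Central-difference approximation of dF/dy."""
    return (F(y + h) - F(y - h)) / (2 * h)

for y in (0.2, 0.5, 0.8):
    print(y, numeric_derivative(F_Y, y), f_Y(y))
```

Working through the algebra, $F_Y(y)$ simplifies to $2\sin^{-1}(y)/\pi$, whose derivative is exactly the density in (5-37).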

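Mean or the Expected Value of a r.v X is defined as

$$\eta_X = \overline{X} = E(X) = \int_{-\infty}^{+\infty} x\, f_X(x)\,dx. \tag{6-2}$$

If X is a discrete-type r.v, then using (3-25) we get

$$\eta_X = \overline{X} = E(X) = \int x \sum_i p_i\, \delta(x - x_i)\,dx = \sum_i x_i p_i \underbrace{\int \delta(x - x_i)\,dx}_{1} = \sum_i x_i p_i = \sum_i x_i P(X = x_i). \tag{6-3}$$

Mean represents the average (mean) value of the r.v in a very large number of trials. For example if $X \sim U(a,b)$, then using (3-31),

$$E(X) = \int_a^b \frac{x}{b-a}\,dx = \frac{1}{b-a} \left. \frac{x^2}{2} \right|_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2} \tag{6-4}$$

is the midpoint of the interval (a,b).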

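On the other hand if X is exponential with parameter $\lambda$ as in (3-32), then

$$E(X) = \int_0^\infty \frac{x}{\lambda}\, e^{-x/\lambda}\,dx = \lambda \int_0^\infty y\, e^{-y}\,dy = \lambda, \tag{6-5}$$

implying that the parameter $\lambda$ in (3-32) represents the mean value of the exponential r.v.

Similarly if X is Poisson with parameter $\lambda$ as in (3-45), using (6-3), we get

$$E(X) = \sum_{k=0}^{\infty} k P(X = k) = \sum_{k=0}^{\infty} k\, e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} k \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^k}{(k-1)!} = \lambda e^{-\lambda} \sum_{i=0}^{\infty} \frac{\lambda^i}{i!} = \lambda e^{-\lambda} e^{\lambda} = \lambda. \tag{6-6}$$

Thus the parameter $\lambda$ in (3-45) also represents the mean of the Poisson r.v.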

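In a similar manner, if X is binomial as in (3-44), then its mean is given by

$$E(X) = \sum_{k=0}^{n} k P(X = k) = \sum_{k=0}^{n} k \binom{n}{k} p^k q^{n-k} = \sum_{k=1}^{n} k \frac{n!}{(n-k)!\,k!}\, p^k q^{n-k} = \sum_{k=1}^{n} \frac{n!}{(n-k)!\,(k-1)!}\, p^k q^{n-k} = np \sum_{i=0}^{n-1} \frac{(n-1)!}{(n-i-1)!\,i!}\, p^i q^{n-i-1} = np\,(p+q)^{n-1} = np. \tag{6-7}$$

Thus np represents the mean of the binomial r.v in (3-44).

For the normal r.v in (3-29),

$$E(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} x\, e^{-(x-\mu)^2/2\sigma^2}\,dx = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} (y + \mu)\, e^{-y^2/2\sigma^2}\,dy = \frac{1}{\sqrt{2\pi\sigma^2}} \underbrace{\int_{-\infty}^{+\infty} y\, e^{-y^2/2\sigma^2}\,dy}_{0} + \mu \underbrace{\frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} e^{-y^2/2\sigma^2}\,dy}_{1} = \mu. \tag{6-8}$$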

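Thus the first parameter in $X \sim N(\mu, \sigma^2)$ is in fact the mean of the Gaussian r.v X. Given $X \sim f_X(x)$, suppose $Y = g(X)$ defines a new r.v with p.d.f $f_Y(y)$. Then from the previous discussion, the new r.v Y has a mean $\mu_Y$ given by (see (6-2))

$$\mu_Y = E(Y) = \int_{-\infty}^{+\infty} y\, f_Y(y)\,dy. \tag{6-9}$$

From (6-9), it appears that to determine $E(Y)$, we need to determine $f_Y(y)$. However this is not the case if only $E(Y)$ is the quantity of interest. It turns out that

$$E(Y) = E(g(X)) = \int_{-\infty}^{+\infty} y\, f_Y(y)\,dy = \int_{-\infty}^{+\infty} g(x)\, f_X(x)\,dx. \tag{6-13}$$

In the discrete case, (6-13) reduces to

$$E(Y) = \sum_i g(x_i)\, P(X = x_i). \tag{6-14}$$

We can use (6-14) to determine the mean of $Y = X^2$, where X is a Poisson r.v. Using (3-45)

$$E(X^2) = \sum_{k=0}^{\infty} k^2 P(X = k) = \sum_{k=0}^{\infty} k^2 e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} k^2 \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} k \frac{\lambda^k}{(k-1)!} = e^{-\lambda} \sum_{i=0}^{\infty} (i+1) \frac{\lambda^{i+1}}{i!}$$

$$= \lambda e^{-\lambda} \left( \sum_{i=0}^{\infty} i \frac{\lambda^i}{i!} + \sum_{i=0}^{\infty} \frac{\lambda^i}{i!} \right) = \lambda e^{-\lambda} \left( \sum_{i=1}^{\infty} \frac{\lambda^i}{(i-1)!} + e^{\lambda} \right) = \lambda e^{-\lambda} \left( \lambda \sum_{m=0}^{\infty} \frac{\lambda^m}{m!} + e^{\lambda} \right) = \lambda e^{-\lambda} \left( \lambda e^{\lambda} + e^{\lambda} \right) = \lambda^2 + \lambda. \tag{6-15}$$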

Mean alone will not be able to truly represent the p.d.f of any r.v. To illustrate this, consider the following scenario: Consider two Gaussian r.vs $X_1 \sim N(0,1)$ and $X_2 \sim N(0,10)$. Both of them have the same mean $\mu = 0$. However, as Fig. 6.1 shows, their p.d.fs are quite different. One is more concentrated around the mean, whereas the other one $(X_2)$ has a wider spread. Clearly, we need at least an additional parameter to measure this spread around the mean!

[Fig. 6.1: (a) $f_{X_1}(x_1)$ with $\sigma^2 = 1$; (b) $f_{X_2}(x_2)$ with $\sigma^2 = 10$]

For a r.v X with mean $\mu$, $X - \mu$ represents the deviation of the r.v from its mean. Since this deviation can be either positive or negative, consider the quantity $(X - \mu)^2$, and its average value $E[(X - \mu)^2]$ represents the average mean square deviation of X around its mean. Define

$$\sigma_X^2 = E[(X - \mu)^2] \geq 0. \tag{6-16}$$

With $g(X) = (X - \mu)^2$ and using (6-13) we get

$$\sigma_X^2 = \int_{-\infty}^{+\infty} (x - \mu)^2 f_X(x)\,dx \geq 0. \tag{6-17}$$

$\sigma_X^2$ is known as the variance of the r.v X, and its square root $\sigma_X = \sqrt{E(X - \mu)^2}$ is known as the standard deviation of X. Note that the standard deviation represents the root mean square spread of the r.v X around its mean $\mu$.

Expanding (6-17) and using the linearity of the integrals, we get

$$Var(X) = \sigma_X^2 = \int_{-\infty}^{+\infty} (x^2 - 2x\mu + \mu^2) f_X(x)\,dx = \int_{-\infty}^{+\infty} x^2 f_X(x)\,dx - 2\mu \int_{-\infty}^{+\infty} x f_X(x)\,dx + \mu^2 = E(X^2) - \mu^2 = E(X^2) - [E(X)]^2 = \overline{X^2} - \overline{X}^2. \tag{6-18}$$

Alternatively, we can use (6-18) to compute $\sigma_X^2$.

Thus, for example, returning back to the Poisson r.v in (3-45), using (6-6) and (6-15), we get

$$\sigma_X^2 = \overline{X^2} - \overline{X}^2 = (\lambda^2 + \lambda) - \lambda^2 = \lambda. \tag{6-19}$$

Thus for a Poisson r.v, mean and variance are both equal to its parameter $\lambda$.

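To determine the variance of the normal r.v $N(\mu, \sigma^2)$, we can use (6-16). Thus from (3-29)

$$Var(X) = E[(X - \mu)^2] = \int_{-\infty}^{+\infty} (x - \mu)^2 \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\mu)^2/2\sigma^2}\, dx. \tag{6-20}$$

To simplify (6-20), we can make use of the identity

$$\int_{-\infty}^{+\infty} f_X(x)\, dx = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\mu)^2/2\sigma^2}\, dx = 1$$

for a normal p.d.f. This gives

$$\int_{-\infty}^{+\infty} e^{-(x-\mu)^2/2\sigma^2}\, dx = \sqrt{2\pi}\,\sigma. \tag{6-21}$$

Differentiating both sides of (6-21) with respect to $\sigma$, we get

$$\int_{-\infty}^{+\infty} \frac{(x-\mu)^2}{\sigma^3} e^{-(x-\mu)^2/2\sigma^2}\, dx = \sqrt{2\pi}$$

or

$$\int_{-\infty}^{+\infty} (x-\mu)^2 \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\mu)^2/2\sigma^2}\, dx = \sigma^2, \tag{6-22}$$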

which represents the $Var(X)$ in (6-20). Thus for a normal r.v as in (3-29)

$$Var(X) = \sigma^2 \tag{6-23}$$

and the second parameter in $N(\mu, \sigma^2)$ in fact represents the variance of the Gaussian r.v. As Fig. 6.1 shows, the larger the $\sigma$, the larger the spread of the p.d.f around its mean. Thus as the variance of a r.v tends to zero, it will begin to concentrate more and more around the mean, ultimately behaving like a constant.

Moments: As remarked earlier, in general

$$m_n = \overline{X^n} = E(X^n), \quad n \geq 1, \tag{6-24}$$

are known as the moments of the r.v X.
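As a rough numerical illustration of (6-18), (6-23) and (6-24), the moments $m_1$ and $m_2$ of a Gaussian p.d.f can be approximated by the trapezoidal rule and combined into the variance. The parameter values, integration range and step count below are assumptions chosen only for this sketch.

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian p.d.f. N(mu, sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def moment(n, mu, sigma, steps=20000, width=12.0):
    """Trapezoidal approximation of m_n = E(X^n) over mu +/- width * sigma."""
    a, b = mu - width * sigma, mu + width * sigma
    h = (b - a) / steps
    total = 0.0
    for i in range(steps + 1):
        x = a + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x ** n * normal_pdf(x, mu, sigma)
    return total * h

mu, sigma = 1.5, 2.0            # assumed parameters
m1 = moment(1, mu, sigma)       # first moment (mean)
m2 = moment(2, mu, sigma)       # second moment
variance = m2 - m1 ** 2         # (6-18)
print(m1, variance)
```

The computed variance should be close to $\sigma^2 = 4$, in line with (6-23).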

7. Two Random Variables

In many experiments, the observations are expressible not as a single quantity, but as a family of quantities. For example, to record the height and weight of each person in a community, or the number of people and the total income in a family, we need two numbers.

Let X and Y denote two random variables (r.v) based on a probability model $(\Omega, F, P)$. Then

$$P(x_1 < X(\xi) \leq x_2) = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} f_X(x)\, dx,$$

and

$$P(y_1 < Y(\xi) \leq y_2) = F_Y(y_2) - F_Y(y_1) = \int_{y_1}^{y_2} f_Y(y)\, dy.$$

What about the probability that the pair of r.vs (X,Y) belongs to an arbitrary region D? In other words, how does one estimate, for example, $P[(x_1 < X(\xi) \leq x_2) \cap (y_1 < Y(\xi) \leq y_2)]$? Towards this, we define the joint probability distribution function of X and Y to be

$$F_{XY}(x, y) = P[(X(\xi) \leq x) \cap (Y(\xi) \leq y)] = P(X \leq x, Y \leq y) \geq 0, \tag{7-1}$$

where x and y are arbitrary real numbers.

Properties

(i)

$$F_{XY}(-\infty, y) = F_{XY}(x, -\infty) = 0, \quad F_{XY}(+\infty, +\infty) = 1. \tag{7-2}$$

Since $(X(\xi) \leq -\infty, Y(\xi) \leq y) \subset (X(\xi) \leq -\infty)$, we get

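$$F_{XY}(-\infty, y) \leq P(X(\xi) \leq -\infty) = 0.$$

Similarly, since $(X(\xi) \leq +\infty, Y(\xi) \leq +\infty) = \Omega$, we get

$$F_{XY}(+\infty, +\infty) = P(\Omega) = 1.$$

(ii)

$$P(x_1 < X(\xi) \leq x_2, Y(\xi) \leq y) = F_{XY}(x_2, y) - F_{XY}(x_1, y). \tag{7-3}$$

$$P(X(\xi) \leq x, y_1 < Y(\xi) \leq y_2) = F_{XY}(x, y_2) - F_{XY}(x, y_1). \tag{7-4}$$

To prove (7-3), we note that for $x_2 > x_1$,

$$(X(\xi) \leq x_2, Y(\xi) \leq y) = (X(\xi) \leq x_1, Y(\xi) \leq y) \cup (x_1 < X(\xi) \leq x_2, Y(\xi) \leq y)$$

and the mutually exclusive property of the events on the right side gives

$$P(X(\xi) \leq x_2, Y(\xi) \leq y) = P(X(\xi) \leq x_1, Y(\xi) \leq y) + P(x_1 < X(\xi) \leq x_2, Y(\xi) \leq y)$$

which proves (7-3). Similarly (7-4) follows.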

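(iii)

$$P(x_1 < X(\xi) \leq x_2, y_1 < Y(\xi) \leq y_2) = F_{XY}(x_2, y_2) - F_{XY}(x_2, y_1) - F_{XY}(x_1, y_2) + F_{XY}(x_1, y_1). \tag{7-5}$$

This is the probability that (X,Y) belongs to the rectangle $R_0$ in Fig. 7.1. To prove (7-5), we can make use of the following identity involving mutually exclusive events on the right side.

$$(x_1 < X(\xi) \leq x_2, Y(\xi) \leq y_2) = (x_1 < X(\xi) \leq x_2, Y(\xi) \leq y_1) \cup (x_1 < X(\xi) \leq x_2, y_1 < Y(\xi) \leq y_2).$$

Fig. 7.1: the rectangle $R_0$ in the XY plane, bounded by $x_1$, $x_2$ and $y_1$, $y_2$.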

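This gives

$$P(x_1 < X(\xi) \leq x_2, Y(\xi) \leq y_2) = P(x_1 < X(\xi) \leq x_2, Y(\xi) \leq y_1) + P(x_1 < X(\xi) \leq x_2, y_1 < Y(\xi) \leq y_2)$$

and the desired result in (7-5) follows by making use of (7-3) with $y = y_2$ and $y = y_1$ respectively.

Joint probability density function (Joint p.d.f)

By definition, the joint p.d.f of X and Y is given by

$$f_{XY}(x, y) = \frac{\partial^2 F_{XY}(x, y)}{\partial x\, \partial y} \tag{7-6}$$

and hence we obtain the useful formula

$$F_{XY}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{XY}(u, v)\, dv\, du. \tag{7-7}$$

Using (7-2), we also get

$$\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f_{XY}(x, y)\, dx\, dy = 1. \tag{7-8}$$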
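The rectangle formula (7-5) translates directly into code. The sketch below uses, purely as an assumed example, the joint distribution function of two independent unit-rate exponential r.vs, for which the rectangle probability can also be written down directly as a product of two one-dimensional interval probabilities. The function names are hypothetical.

```python
import math

def joint_cdf(x, y):
    """F_XY(x, y) for two independent unit-rate exponential r.vs (illustrative choice)."""
    fx = 1.0 - math.exp(-x) if x > 0 else 0.0
    fy = 1.0 - math.exp(-y) if y > 0 else 0.0
    return fx * fy

def rectangle_probability(F, x1, x2, y1, y2):
    """P(x1 < X <= x2, y1 < Y <= y2) computed from (7-5)."""
    return F(x2, y2) - F(x2, y1) - F(x1, y2) + F(x1, y1)

p_rect = rectangle_probability(joint_cdf, 0.5, 1.5, 0.2, 2.0)

# Direct value for this particular F_XY: product of the two interval probabilities.
p_direct = (math.exp(-0.5) - math.exp(-1.5)) * (math.exp(-0.2) - math.exp(-2.0))
print(p_rect, p_direct)
```

Any valid joint distribution function can be passed as `F`; the product form used for `p_direct` is specific to this assumed example.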

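To find the probability that (X,Y) belongs to an arbitrary region D, we can make use of (7-5) and (7-7). From (7-5) and (7-7)

$$\begin{aligned}
P(x < X(\xi) \leq x + \Delta x, y < Y(\xi) \leq y + \Delta y) &= F_{XY}(x + \Delta x, y + \Delta y) - F_{XY}(x, y + \Delta y) - F_{XY}(x + \Delta x, y) + F_{XY}(x, y) \\
&= \int_{x}^{x + \Delta x} \int_{y}^{y + \Delta y} f_{XY}(u, v)\, dv\, du = f_{XY}(x, y)\, \Delta x\, \Delta y.
\end{aligned} \tag{7-9}$$

Thus the probability that (X,Y) belongs to a differential rectangle $\Delta x\, \Delta y$ equals $f_{XY}(x, y) \cdot \Delta x\, \Delta y$, and repeating this procedure over the union of non-overlapping differential rectangles in D, we get the useful result

$$P((X, Y) \in D) = \iint_{(x, y) \in D} f_{XY}(x, y)\, dx\, dy. \tag{7-10}$$

Fig. 7.2: a region D in the XY plane containing a differential rectangle of sides $\Delta x$ and $\Delta y$.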

(iv) Marginal Statistics

In the context of several r.vs, the statistics of each individual one are called marginal statistics. Thus $F_X(x)$ is the marginal probability distribution function of X, and $f_X(x)$ is the marginal p.d.f of X. It is interesting to note that all marginals can be obtained from the joint p.d.f. In fact

$$F_X(x) = F_{XY}(x, +\infty), \quad F_Y(y) = F_{XY}(+\infty, y). \tag{7-11}$$

Also

$$f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\, dy, \quad f_Y(y) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\, dx. \tag{7-12}$$

To prove (7-11), we can make use of the identity

$$(X \leq x) = (X \leq x) \cap (Y \leq +\infty)$$

so that

$$F_X(x) = P(X \leq x) = P(X \leq x, Y \leq +\infty) = F_{XY}(x, +\infty).$$

To prove (7-12), we can make use of (7-7) and (7-11), which gives

$$F_X(x) = F_{XY}(x, +\infty) = \int_{-\infty}^{x} \int_{-\infty}^{+\infty} f_{XY}(u, y)\, dy\, du \tag{7-13}$$

and taking derivative with respect to x in (7-13), we get

$$f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\, dy. \tag{7-14}$$

At this point, it is useful to know the formula for differentiation under integrals. Let

$$H(x) = \int_{a(x)}^{b(x)} h(x, y)\, dy. \tag{7-15}$$

Then its derivative with respect to x is given by

$$\frac{dH(x)}{dx} = \frac{db(x)}{dx} h(x, b) - \frac{da(x)}{dx} h(x, a) + \int_{a(x)}^{b(x)} \frac{\partial h(x, y)}{\partial x}\, dy. \tag{7-16}$$

Obvious use of (7-16) in (7-13) gives (7-14).

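If X and Y are discrete r.vs, then $p_{ij} = P(X = x_i, Y = y_j)$ represents their joint p.d.f, and their respective marginal p.d.fs are given by

$$P(X = x_i) = \sum_j P(X = x_i, Y = y_j) = \sum_j p_{ij} \tag{7-17}$$

and

$$P(Y = y_j) = \sum_i P(X = x_i, Y = y_j) = \sum_i p_{ij}. \tag{7-18}$$

Assuming that $P(X = x_i, Y = y_j)$ is written out in the form of a rectangular array, to obtain $P(X = x_i)$ from (7-17), one needs to add up all entries in the i-th row.

$$\begin{array}{cccccc}
p_{11} & p_{12} & \cdots & p_{1j} & \cdots & p_{1n} \\
p_{21} & p_{22} & \cdots & p_{2j} & \cdots & p_{2n} \\
\vdots & \vdots & & \vdots & & \vdots \\
p_{i1} & p_{i2} & \cdots & p_{ij} & \cdots & p_{in} \\
\vdots & \vdots & & \vdots & & \vdots \\
p_{m1} & p_{m2} & \cdots & p_{mj} & \cdots & p_{mn}
\end{array}$$

Fig. 7.3: the joint probabilities $p_{ij}$ as a rectangular array, with the row sums $\sum_j p_{ij}$ and the column sums $\sum_i p_{ij}$ written in the margins.

It used to be a practice for insurance companies routinely to scribble out these sum values in the left and top margins, thus suggesting the name marginal densities! (Fig 7.3).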
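The row and column sums in (7-17) and (7-18) can be sketched in a few lines. The joint table `p` below is a made-up example (its entries are assumptions chosen only so that they sum to one).

```python
# Hypothetical joint table: p[i][j] = P(X = x_i, Y = y_j); all entries sum to 1.
p = [
    [0.10, 0.05, 0.15],
    [0.20, 0.10, 0.40],
]

# (7-17): P(X = x_i) is the sum of the i-th row.
row_sums = [sum(row) for row in p]

# (7-18): P(Y = y_j) is the sum of the j-th column.
col_sums = [sum(col) for col in zip(*p)]

print(row_sums, col_sums)
```

Each list of marginal probabilities sums to one, as it must for a valid joint table.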

From (7-11) and (7-12), the joint P.D.F and/or the joint p.d.f represent complete information about the r.vs, and their marginal p.d.fs can be evaluated from the joint p.d.f. However, given marginals, (most often) it will not be possible to compute the joint p.d.f. Consider the following example:

Example 7.1: Given

$$f_{XY}(x, y) = \begin{cases} \text{constant}, & 0 < x < y < 1, \\ 0, & \text{otherwise}. \end{cases} \tag{7-19}$$

Obtain the marginal p.d.fs $f_X(x)$ and $f_Y(y)$.

Solution: It is given that the joint p.d.f $f_{XY}(x, y)$ is a constant in the shaded region in Fig. 7.4. We can use (7-8) to determine that constant c. From (7-8)

$$\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f_{XY}(x, y)\, dx\, dy = \int_{y=0}^{1} \left( \int_{x=0}^{y} c\, dx \right) dy = \int_{y=0}^{1} c\, y\, dy = \left. \frac{c y^2}{2} \right|_0^1 = \frac{c}{2} = 1. \tag{7-20}$$

Fig. 7.4: the shaded triangular region $0 < x < y < 1$ in the XY plane.

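Thus c = 2. Moreover from (7-14)

$$f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\, dy = \int_{y=x}^{1} 2\, dy = 2(1 - x), \quad 0 < x < 1, \tag{7-21}$$

and similarly

$$f_Y(y) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\, dx = \int_{x=0}^{y} 2\, dx = 2y, \quad 0 < y < 1. \tag{7-22}$$

Clearly, in this case given $f_X(x)$ and $f_Y(y)$ as in (7-21)-(7-22), it will not be possible to obtain the original joint p.d.f in (7-19).

Example 7.2: X and Y are said to be jointly normal (Gaussian) distributed, if their joint p.d.f has the following form:

$$f_{XY}(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left( \frac{(x - \mu_X)^2}{\sigma_X^2} - \frac{2\rho (x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} \right) \right\},$$
$$-\infty < x < +\infty, \quad -\infty < y < +\infty, \quad |\rho| < 1. \tag{7-23}$$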

By direct integration, using (7-14) and completing the square in (7-23), it can be shown that

$$f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\, dy = \frac{1}{\sqrt{2\pi \sigma_X^2}} e^{-(x - \mu_X)^2 / 2\sigma_X^2} \sim N(\mu_X, \sigma_X^2), \tag{7-24}$$

and similarly

$$f_Y(y) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\, dx = \frac{1}{\sqrt{2\pi \sigma_Y^2}} e^{-(y - \mu_Y)^2 / 2\sigma_Y^2} \sim N(\mu_Y, \sigma_Y^2). \tag{7-25}$$

Following the above notation, we will denote (7-23) as $N(\mu_X, \mu_Y, \sigma_X^2, \sigma_Y^2, \rho)$. Once again, knowing the marginals in (7-24) and (7-25) alone doesn't tell us everything about the joint p.d.f in (7-23).

As we show below, the only situation where the marginal p.d.fs can be used to recover the joint p.d.f is when the random variables are statistically independent.

Independence of r.vs

Definition: The random variables X and Y are said to be statistically independent if the events $\{X(\xi) \in A\}$ and $\{Y(\xi) \in B\}$ are independent events for any two Borel sets A and B in x and y axes respectively. Applying the above definition to the events $\{X(\xi) \leq x\}$ and $\{Y(\xi) \leq y\}$, we conclude that, if the r.vs X and Y are independent, then

$$P((X(\xi) \leq x) \cap (Y(\xi) \leq y)) = P(X(\xi) \leq x)\, P(Y(\xi) \leq y) \tag{7-26}$$

i.e.,

$$F_{XY}(x, y) = F_X(x)\, F_Y(y) \tag{7-27}$$

or equivalently, if X and Y are independent, then we must have

$$f_{XY}(x, y) = f_X(x)\, f_Y(y). \tag{7-28}$$

If X and Y are discrete-type r.vs then their independence implies

$$P(X = x_i, Y = y_j) = P(X = x_i)\, P(Y = y_j) \quad \text{for all } i, j. \tag{7-29}$$

Equations (7-26)-(7-29) give us the procedure to test for independence. Given $f_{XY}(x, y)$, obtain the marginal p.d.fs $f_X(x)$ and $f_Y(y)$ and examine whether (7-28) or (7-29) is valid. If so, the r.vs are independent, otherwise they are dependent. Returning to Example 7.1, from (7-19)-(7-22), we observe by direct verification that $f_{XY}(x, y) \neq f_X(x)\, f_Y(y)$. Hence X and Y are dependent r.vs in that case. It is easy to see that the same holds in Example 7.2, unless $\rho = 0$. In other words, two jointly Gaussian r.vs as in (7-23) are independent if and only if the fifth parameter $\rho = 0$.

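Example 7.3: Given

$$f_{XY}(x, y) = \begin{cases} x y^2 e^{-y}, & 0 < y < \infty,\ 0 < x < 1, \\ 0, & \text{otherwise}. \end{cases} \tag{7-30}$$

Determine whether X and Y are independent.

Solution:

$$\begin{aligned}
f_X(x) &= \int_0^{\infty} f_{XY}(x, y)\, dy = x \int_0^{\infty} y^2 e^{-y}\, dy \\
&= x \left( \left. -y^2 e^{-y} \right|_0^{\infty} + 2 \int_0^{\infty} y e^{-y}\, dy \right) = 2x, \quad 0 < x < 1.
\end{aligned} \tag{7-31}$$

Similarly

$$f_Y(y) = \int_0^1 f_{XY}(x, y)\, dx = \frac{y^2}{2} e^{-y}, \quad 0 < y < \infty. \tag{7-32}$$

In this case

$$f_{XY}(x, y) = f_X(x)\, f_Y(y),$$

and hence X and Y are independent random variables.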
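The test for independence in (7-28) can be illustrated by comparing the joint p.d.f with the product of the marginals at a set of sample points. The sketch below does this for Example 7.1 and Example 7.3, using the marginals derived in the text. The grid of sample points and the helper names are assumptions; a zero gap on a finite grid is only a consistency check, not a proof of independence.

```python
import math

def example_7_1():
    """Joint p.d.f (7-19) with c = 2 and the marginals (7-21), (7-22)."""
    joint = lambda x, y: 2.0 if 0 < x < y < 1 else 0.0
    fx = lambda x: 2.0 * (1 - x) if 0 < x < 1 else 0.0
    fy = lambda y: 2.0 * y if 0 < y < 1 else 0.0
    return joint, fx, fy

def example_7_3():
    """Joint p.d.f (7-30) and the marginals (7-31), (7-32)."""
    joint = lambda x, y: x * y * y * math.exp(-y) if (0 < x < 1 and y > 0) else 0.0
    fx = lambda x: 2.0 * x if 0 < x < 1 else 0.0
    fy = lambda y: 0.5 * y * y * math.exp(-y) if y > 0 else 0.0
    return joint, fx, fy

def max_factorization_gap(joint, fx, fy, grid):
    """Largest |f_XY(x, y) - f_X(x) f_Y(y)| over the grid points, cf. (7-28)."""
    return max(abs(joint(x, y) - fx(x) * fy(y)) for x in grid for y in grid)

grid = [0.05 + 0.1 * i for i in range(10)]   # assumed sample points in (0, 1)
gap_7_1 = max_factorization_gap(*example_7_1(), grid)
gap_7_3 = max_factorization_gap(*example_7_3(), grid)
print(gap_7_1, gap_7_3)
```

The gap for Example 7.1 is clearly nonzero (the r.vs are dependent), while for Example 7.3 it vanishes up to rounding.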

8. One Function of Two Random Variables

Given two random variables X and Y and a function g(x,y), we form a new random variable Z as

$$Z = g(X, Y). \tag{8-1}$$

Given the joint p.d.f $f_{XY}(x, y)$, how does one obtain $f_Z(z)$, the p.d.f of Z? Problems of this type are of interest from a practical standpoint. For example, a receiver output signal usually consists of the desired signal buried in noise, and the above formulation in that case reduces to Z = X + Y.

It is important to know the statistics of the incoming signal for proper receiver design. In this context, we shall analyze problems of the following type:

$$Z = g(X, Y): \quad X + Y, \quad X - Y, \quad XY, \quad X/Y, \quad \max(X, Y), \quad \min(X, Y), \quad \sqrt{X^2 + Y^2}, \quad \tan^{-1}(X/Y). \tag{8-2}$$

Referring back to (8-1), to start with

$$F_Z(z) = P(Z(\xi) \leq z) = P(g(X, Y) \leq z) = P[(X, Y) \in D_z] = \iint_{(x, y) \in D_z} f_{XY}(x, y)\, dx\, dy, \tag{8-3}$$

where $D_z$ in the XY plane represents the region such that $g(x, y) \leq z$ is satisfied. Note that $D_z$ need not be simply connected (Fig. 8.1). From (8-3), to determine $F_Z(z)$ it is enough to find the region $D_z$ for every z, and then evaluate the integral there.

We shall illustrate this method through various examples.

Fig. 8.1: a region $D_z$ in the XY plane made up of disjoint pieces.

Example 8.1: Z = X + Y. Find $f_Z(z)$.

Solution:

$$F_Z(z) = P(X + Y \leq z) = \int_{y=-\infty}^{+\infty} \int_{x=-\infty}^{z-y} f_{XY}(x, y)\, dx\, dy, \tag{8-4}$$

since the region $D_z$ of the xy plane where $x + y \leq z$ is the shaded area in Fig. 8.2 to the left of the line $x + y = z$. Integrating over the horizontal strip along the x-axis first (inner integral) followed by sliding that strip along the y-axis from $-\infty$ to $+\infty$ (outer integral) we cover the entire shaded area.

Fig. 8.2: the shaded region to the left of the line $x = z - y$ in the xy plane.

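We can find $f_Z(z)$ by differentiating $F_Z(z)$ directly. In this context, it is useful to recall the differentiation rule in (7-15) - (7-16) due to Leibniz. Suppose

$$H(z) = \int_{a(z)}^{b(z)} h(x, z)\, dx. \tag{8-5}$$

Then

$$\frac{dH(z)}{dz} = \frac{db(z)}{dz} h(b(z), z) - \frac{da(z)}{dz} h(a(z), z) + \int_{a(z)}^{b(z)} \frac{\partial h(x, z)}{\partial z}\, dx. \tag{8-6}$$

Using (8-6) in (8-4) we get

$$\begin{aligned}
f_Z(z) &= \int_{-\infty}^{+\infty} \left( \frac{\partial}{\partial z} \int_{-\infty}^{z-y} f_{XY}(x, y)\, dx \right) dy = \int_{-\infty}^{+\infty} \left( f_{XY}(z - y, y) - 0 + \int_{-\infty}^{z-y} \frac{\partial f_{XY}(x, y)}{\partial z}\, dx \right) dy \\
&= \int_{-\infty}^{+\infty} f_{XY}(z - y, y)\, dy.
\end{aligned} \tag{8-7}$$

Alternatively, the integration in (8-4) can be carried out first along the y-axis followed by the x-axis as in Fig. 8.3.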

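In that case

$$F_Z(z) = \int_{x=-\infty}^{+\infty} \int_{y=-\infty}^{z-x} f_{XY}(x, y)\, dy\, dx, \tag{8-8}$$

and differentiation of (8-8) gives

$$f_Z(z) = \frac{dF_Z(z)}{dz} = \int_{x=-\infty}^{+\infty} \left( \frac{\partial}{\partial z} \int_{y=-\infty}^{z-x} f_{XY}(x, y)\, dy \right) dx = \int_{x=-\infty}^{+\infty} f_{XY}(x, z - x)\, dx. \tag{8-9}$$

If X and Y are independent, then

$$f_{XY}(x, y) = f_X(x)\, f_Y(y) \tag{8-10}$$

and inserting (8-10) into (8-7) and (8-9), we get

$$f_Z(z) = \int_{y=-\infty}^{+\infty} f_X(z - y)\, f_Y(y)\, dy = \int_{x=-\infty}^{+\infty} f_X(x)\, f_Y(z - x)\, dx. \tag{8-11}$$

Fig. 8.3: the region $x + y \leq z$ covered by vertical strips, bounded by the line $y = z - x$.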

The above integral is the standard convolution of the functions $f_X(z)$ and $f_Y(z)$ expressed two different ways. We thus reach the following conclusion: If two r.vs are independent, then the density of their sum equals the convolution of their density functions.

As a special case, suppose that $f_X(x) = 0$ for $x < 0$ and $f_Y(y) = 0$ for $y < 0$, then we can make use of Fig. 8.4 to determine the new limits for $D_z$.

Fig. 8.4: the triangular region bounded by the axes and the line $x = z - y$, with intercepts $(z, 0)$ and $(0, z)$.

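In that case

$$F_Z(z) = \int_{y=0}^{z} \int_{x=0}^{z-y} f_{XY}(x, y)\, dx\, dy$$

or

$$f_Z(z) = \int_{y=0}^{z} \left( \frac{\partial}{\partial z} \int_{x=0}^{z-y} f_{XY}(x, y)\, dx \right) dy = \begin{cases} \displaystyle\int_0^z f_{XY}(z - y, y)\, dy, & z > 0, \\ 0, & z \leq 0. \end{cases} \tag{8-12}$$

On the other hand, by considering vertical strips first in Fig. 8.4, we get

$$F_Z(z) = \int_{x=0}^{z} \int_{y=0}^{z-x} f_{XY}(x, y)\, dy\, dx$$

or

$$f_Z(z) = \int_{x=0}^{z} f_{XY}(x, z - x)\, dx = \begin{cases} \displaystyle\int_{x=0}^{z} f_X(x)\, f_Y(z - x)\, dx, & z > 0, \\ 0, & z \leq 0, \end{cases} \tag{8-13}$$

if X and Y are independent random variables.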

Example 8.2: Suppose X and Y are independent exponential r.vs with common parameter $\lambda$, and let Z = X + Y. Determine $f_Z(z)$.

Solution: We have

$$f_X(x) = \lambda e^{-\lambda x} U(x), \quad f_Y(y) = \lambda e^{-\lambda y} U(y), \tag{8-14}$$

and we can make use of (8-13) to obtain the p.d.f of Z = X + Y.

$$f_Z(z) = \int_0^z \lambda^2 e^{-\lambda x} e^{-\lambda (z - x)}\, dx = \lambda^2 e^{-\lambda z} \int_0^z dx = z \lambda^2 e^{-\lambda z} U(z). \tag{8-15}$$

As the next example shows, care should be taken in using the convolution formula for r.vs with finite range.

Example 8.3: X and Y are independent uniform r.vs in the common interval (0,1). Determine $f_Z(z)$, where Z = X + Y.

Solution: Clearly, $Z = X + Y \Rightarrow 0 < z < 2$ here, and as Fig. 8.5 shows there are two cases of z for which the shaded areas are quite different in shape and they should be considered separately.

Using (8-16) - (8-17), we obtain

$$f_Z(z) = \frac{dF_Z(z)}{dz} = \begin{cases} z, & 0 \leq z < 1, \\ 2 - z, & 1 \leq z < 2. \end{cases} \tag{8-18}$$

By direct convolution of $f_X(x)$ and $f_Y(y)$, we obtain the same result as above. In fact, for $0 \leq z < 1$ (Fig. 8.6(a))

$$f_Z(z) = \int f_X(z - x)\, f_Y(x)\, dx = \int_0^z 1\, dx = z, \tag{8-19}$$

and for $1 \leq z < 2$ (Fig. 8.6(b))

$$f_Z(z) = \int_{z-1}^{1} 1\, dx = 2 - z. \tag{8-20}$$

Fig 8.6 (c) shows $f_Z(z)$, which agrees with the convolution of two rectangular waveforms as well.

Fig. 8.6: (a) $0 \leq z < 1$ and (b) $1 \leq z < 2$, each showing $f_Y(x)$, $f_X(z - x)$ and their product $f_X(z - x)\, f_Y(x)$; (c) the triangular p.d.f $f_Z(z)$ on (0, 2) with its peak at z = 1.
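The convolution formula (8-13) can be evaluated numerically and compared with the closed forms (8-15) and (8-18). The following sketch uses a simple midpoint rule; the value of `lam`, the step count and the evaluation points are assumptions made for illustration only.

```python
import math

def convolve_at(fx, fy, z, steps=2000):
    """Midpoint-rule approximation of (8-13): integral of fx(x) fy(z - x) over 0 <= x <= z."""
    if z <= 0:
        return 0.0
    h = z / steps
    return h * sum(fx((i + 0.5) * h) * fy(z - (i + 0.5) * h) for i in range(steps))

lam = 2.0  # assumed exponential parameter

def exp_pdf(t):
    """Exponential p.d.f. as in (8-14)."""
    return lam * math.exp(-lam * t) if t >= 0 else 0.0

def unif_pdf(t):
    """Uniform p.d.f. on (0, 1)."""
    return 1.0 if 0 < t < 1 else 0.0

z_values = [0.25, 0.75, 1.25, 1.75]
exp_sum = [convolve_at(exp_pdf, exp_pdf, z) for z in z_values]
exp_exact = [z * lam ** 2 * math.exp(-lam * z) for z in z_values]   # (8-15)
unif_sum = [convolve_at(unif_pdf, unif_pdf, z) for z in z_values]
unif_exact = [z if z < 1 else 2 - z for z in z_values]             # (8-18)
print(exp_sum, exp_exact)
print(unif_sum, unif_exact)
```

For the uniform case the finite range is handled automatically because `unif_pdf` returns zero outside (0, 1), which reproduces the two cases in (8-19) and (8-20) up to the discretization error of the midpoint rule.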

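Example 8.3: Let $Z = X - Y$. Determine its p.d.f $f_Z(z)$.

Solution: From (8-3) and Fig. 8.7

$$F_Z(z) = P(X - Y \leq z) = \int_{y=-\infty}^{+\infty} \int_{x=-\infty}^{z+y} f_{XY}(x, y)\, dx\, dy$$

and hence

$$f_Z(z) = \frac{dF_Z(z)}{dz} = \int_{y=-\infty}^{+\infty} \left( \frac{\partial}{\partial z} \int_{x=-\infty}^{z+y} f_{XY}(x, y)\, dx \right) dy = \int_{-\infty}^{+\infty} f_{XY}(z + y, y)\, dy. \tag{8-21}$$

If X and Y are independent, then the above formula reduces to

$$f_Z(z) = \int_{-\infty}^{+\infty} f_X(z + y)\, f_Y(y)\, dy = f_X(z) \otimes f_Y(-z), \tag{8-22}$$

which represents the convolution of $f_X(z)$ with $f_Y(-z)$.

Fig. 8.7: the region $x - y \leq z$ in the xy plane, bounded by the line $x = z + y$.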

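As a special case, suppose

$$f_X(x) = 0, \ x < 0, \quad \text{and} \quad f_Y(y) = 0, \ y < 0.$$

In this case, Z can be negative as well as positive, and that gives rise to two situations that should be analyzed separately, since the regions of integration for $z \geq 0$ and $z < 0$ are quite different. For $z \geq 0$, from Fig. 8.8 (a)

$$F_Z(z) = \int_{y=0}^{+\infty} \int_{x=0}^{z+y} f_{XY}(x, y)\, dx\, dy$$

and for $z < 0$, from Fig 8.8 (b)

$$F_Z(z) = \int_{y=-z}^{+\infty} \int_{x=0}^{z+y} f_{XY}(x, y)\, dx\, dy.$$

After differentiation, this gives

$$f_Z(z) = \begin{cases} \displaystyle\int_0^{+\infty} f_{XY}(z + y, y)\, dy, & z \geq 0, \\ \displaystyle\int_{-z}^{+\infty} f_{XY}(z + y, y)\, dy, & z < 0. \end{cases} \tag{8-23}$$

Fig. 8.8: the region of integration bounded by the line $x = z + y$ in the first quadrant, for (a) $z \geq 0$ and (b) $z < 0$.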

9. Two Functions of Two Random Variables

In the spirit of the previous lecture, let us look at an immediate generalization: Suppose X and Y are two random variables with joint p.d.f $f_{XY}(x, y)$. Given two functions $g(x, y)$ and $h(x, y)$, define the new random variables

$$Z = g(X, Y) \tag{9-1}$$

$$W = h(X, Y). \tag{9-2}$$

How does one determine their joint p.d.f $f_{ZW}(z, w)$? Obviously with $f_{ZW}(z, w)$ in hand, the marginal p.d.fs $f_Z(z)$ and $f_W(w)$ can be easily determined.

The procedure is the same as that in (8-3). In fact for given z and w,

$$\begin{aligned}
F_{ZW}(z, w) &= P(Z(\xi) \leq z, W(\xi) \leq w) = P(g(X, Y) \leq z, h(X, Y) \leq w) \\
&= P((X, Y) \in D_{z,w}) = \iint_{(x, y) \in D_{z,w}} f_{XY}(x, y)\, dx\, dy,
\end{aligned} \tag{9-3}$$

where $D_{z,w}$ is the region in the xy plane such that the inequalities $g(x, y) \leq z$ and $h(x, y) \leq w$ are simultaneously satisfied.

We illustrate this technique in the next example.

Fig. 9.1: the region $D_{z,w}$ in the xy plane.

Example 9.1: Suppose X and Y are independent uniformly distributed random variables in the interval $(0, \theta)$. Define $Z = \min(X, Y)$, $W = \max(X, Y)$. Determine $f_{ZW}(z, w)$.

Solution: Obviously both w and z vary in the interval $(0, \theta)$. Thus

$$F_{ZW}(z, w) = 0, \quad \text{if } z < 0 \text{ or } w < 0. \tag{9-4}$$

$$F_{ZW}(z, w) = P(Z \leq z, W \leq w) = P(\min(X, Y) \leq z, \max(X, Y) \leq w). \tag{9-5}$$

We must consider two cases: $w \geq z$ and $w < z$, since they give rise to different regions for $D_{z,w}$ (see Figs. 9.2 (a)-(b)).

Fig. 9.2: the region $D_{z,w}$ in the XY plane, with the points $(z, z)$ and $(w, w)$ marked, for (a) $w \geq z$ and (b) $w < z$.

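For $w \geq z$, from Fig. 9.2 (a), the region $D_{z,w}$ is represented by the doubly shaded area. Thus

$$F_{ZW}(z, w) = F_{XY}(z, w) + F_{XY}(w, z) - F_{XY}(z, z), \quad w \geq z, \tag{9-6}$$

and for $w < z$, from Fig. 9.2 (b), we obtain

$$F_{ZW}(z, w) = F_{XY}(w, w), \quad w < z. \tag{9-7}$$

With

$$F_{XY}(x, y) = F_X(x)\, F_Y(y) = \frac{x}{\theta} \cdot \frac{y}{\theta} = \frac{xy}{\theta^2}, \tag{9-8}$$

we obtain

$$F_{ZW}(z, w) = \begin{cases} (2w - z) z / \theta^2, & 0 < z < w < \theta, \\ w^2 / \theta^2, & 0 < w < z < \theta. \end{cases} \tag{9-9}$$

Thus

$$f_{ZW}(z, w) = \begin{cases} 2 / \theta^2, & 0 < z < w < \theta, \\ 0, & \text{otherwise}. \end{cases} \tag{9-10}$$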
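The joint distribution function in (9-9) can be compared against a simple Monte Carlo estimate. This is only a sanity check under assumed values: `theta`, the seed, the sample size and the test points are all arbitrary choices for this sketch.

```python
import random

def joint_cdf_min_max(z, w, theta):
    """F_ZW(z, w) from (9-9), valid for 0 < z, w < theta."""
    if w >= z:
        return (2 * w - z) * z / theta ** 2
    return w ** 2 / theta ** 2

rng = random.Random(0)          # fixed seed so the run is repeatable
theta, n = 2.0, 200_000         # assumed interval length and sample size
samples = [(rng.uniform(0, theta), rng.uniform(0, theta)) for _ in range(n)]

def empirical_cdf(z, w):
    """Fraction of samples with min(X, Y) <= z and max(X, Y) <= w."""
    hits = sum(1 for x, y in samples if min(x, y) <= z and max(x, y) <= w)
    return hits / n

points = [(0.5, 1.5), (1.0, 1.2), (1.5, 0.8)]   # two with w >= z, one with w < z
errors = [abs(empirical_cdf(z, w) - joint_cdf_min_max(z, w, theta)) for z, w in points]
print(errors)
```

The deviations should be of the order of $1/\sqrt{n}$, consistent with (9-9) in both cases $w \geq z$ and $w < z$.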

From (9-10), we also obtain

$$f_Z(z) = \int_z^{\theta} f_{ZW}(z, w)\, dw = \frac{2}{\theta} \left( 1 - \frac{z}{\theta} \right), \quad 0 < z < \theta, \tag{9-11}$$

and

$$f_W(w) = \int_0^w f_{ZW}(z, w)\, dz = \frac{2w}{\theta^2}, \quad 0 < w < \theta. \tag{9-12}$$

If $g(x, y)$ and $h(x, y)$ are continuous and differentiable functions, then as in the case of one random variable (see (5-30)) it is possible to develop a formula to obtain the joint p.d.f $f_{ZW}(z, w)$ directly. Towards this, consider the equations

$$g(x, y) = z, \quad h(x, y) = w. \tag{9-13}$$

For a given point (z,w), equation (9-13) can have many solutions. Let us say

$$(x_1, y_1), \ (x_2, y_2), \ \ldots, \ (x_n, y_n)$$

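represent these multiple solutions such that (see Fig. 9.3)

$$g(x_i, y_i) = z, \quad h(x_i, y_i) = w. \tag{9-14}$$

Fig. 9.3: (a) the point $(z, w)$ in the zw plane; (b) the corresponding solutions $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ in the xy plane.

Then we can write

$$f_{ZW}(z, w) = \sum_i |J(z, w)|\, f_{XY}(x_i, y_i) = \sum_i \frac{1}{|J(x_i, y_i)|} f_{XY}(x_i, y_i), \tag{9-15}$$

where

$$|J(z, w)| = \frac{1}{|J(x_i, y_i)|}, \tag{9-16}$$

and $J(x_i, y_i)$ represents the Jacobian of the original transformation in (9-13) given by

$$J(x_i, y_i) = \det \left. \begin{pmatrix} \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y} \\[2mm] \dfrac{\partial h}{\partial x} & \dfrac{\partial h}{\partial y} \end{pmatrix} \right|_{x = x_i,\, y = y_i}. \tag{9-17}$$

Here $J(z, w)$ is the Jacobian of the inverse transformation $x = g_1(z, w)$, $y = h_1(z, w)$:

$$J(z, w) = \det \begin{pmatrix} \dfrac{\partial g_1}{\partial z} & \dfrac{\partial g_1}{\partial w} \\[2mm] \dfrac{\partial h_1}{\partial z} & \dfrac{\partial h_1}{\partial w} \end{pmatrix}.$$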

Next we shall illustrate the usefulness of the formula in (9-15) through various examples:

Example 9.2: Suppose X and Y are zero mean independent Gaussian r.vs with common variance $\sigma^2$. Define $Z = \sqrt{X^2 + Y^2}$, $W = \tan^{-1}(Y/X)$, where $|w| \leq \pi/2$. Obtain $f_{ZW}(z, w)$.

Solution: Here

$$f_{XY}(x, y) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/2\sigma^2}. \tag{9-18}$$

Since

$$z = g(x, y) = \sqrt{x^2 + y^2}; \quad w = h(x, y) = \tan^{-1}(y/x), \quad |w| \leq \pi/2, \tag{9-19}$$

if $(x_1, y_1)$ is a solution pair so is $(-x_1, -y_1)$. From (9-19)

$$\frac{y}{x} = \tan w, \quad \text{or} \quad y = x \tan w. \tag{9-20}$$

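Substituting this into z, we get

$$z = \sqrt{x^2 + y^2} = x \sqrt{1 + \tan^2 w} = x \sec w, \quad \text{or} \quad x = z \cos w, \tag{9-21}$$

and

$$y = x \tan w = z \sin w. \tag{9-22}$$

Thus there are two solution sets

$$x_1 = z \cos w, \ y_1 = z \sin w, \quad x_2 = -z \cos w, \ y_2 = -z \sin w. \tag{9-23}$$

We can use (9-21) - (9-23) to obtain $J(z, w)$. From its definition as the Jacobian of the inverse transformation,

$$J(z, w) = \det \begin{pmatrix} \dfrac{\partial x}{\partial z} & \dfrac{\partial x}{\partial w} \\[2mm] \dfrac{\partial y}{\partial z} & \dfrac{\partial y}{\partial w} \end{pmatrix} = \det \begin{pmatrix} \cos w & -z \sin w \\ \sin w & z \cos w \end{pmatrix} = z, \tag{9-24}$$

so that

$$|J(z, w)| = z. \tag{9-25}$$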

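We can also compute $J(x, y)$ using (9-17). From (9-17),

$$J(x, y) = \det \begin{pmatrix} \dfrac{x}{\sqrt{x^2 + y^2}} & \dfrac{y}{\sqrt{x^2 + y^2}} \\[2mm] \dfrac{-y}{x^2 + y^2} & \dfrac{x}{x^2 + y^2} \end{pmatrix} = \frac{1}{\sqrt{x^2 + y^2}} = \frac{1}{z}. \tag{9-26}$$

Notice that $|J(z, w)| = 1/|J(x_i, y_i)|$, agreeing with (9-16). Substituting (9-23) and (9-25) or (9-26) into (9-15), we get

$$f_{ZW}(z, w) = z \left( f_{XY}(x_1, y_1) + f_{XY}(x_2, y_2) \right) = \frac{z}{\pi\sigma^2} e^{-z^2/2\sigma^2}, \quad 0 < z < \infty, \ |w| < \frac{\pi}{2}. \tag{9-27}$$

Thus

$$f_Z(z) = \int_{-\pi/2}^{\pi/2} f_{ZW}(z, w)\, dw = \frac{z}{\sigma^2} e^{-z^2/2\sigma^2}, \quad 0 < z < \infty, \tag{9-28}$$

which represents a Rayleigh r.v with parameter $\sigma^2$, and

$$f_W(w) = \int_0^{\infty} f_{ZW}(z, w)\, dz = \frac{1}{\pi}, \quad |w| < \frac{\pi}{2}, \tag{9-29}$$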

which represents a uniform r.v in the interval $(-\pi/2, \pi/2)$. Moreover by direct computation

$$f_{ZW}(z, w) = f_Z(z) \cdot f_W(w) \tag{9-30}$$

implying that Z and W are independent. We summarize these results in the following statement: If X and Y are zero mean independent Gaussian random variables with common variance, then $\sqrt{X^2 + Y^2}$ has a Rayleigh distribution and $\tan^{-1}(Y/X)$ has a uniform distribution. Moreover these two derived r.vs are statistically independent. Alternatively, with X and Y as independent zero mean r.vs as in (9-18), X + jY represents a complex Gaussian r.v. But

$$X + jY = Z e^{jW}, \tag{9-31}$$

where Z and W are as in (9-19), except that for (9-31) to hold good on the entire complex plane we must have $-\pi < W < \pi$, and hence it follows that the magnitude and phase of a complex Gaussian r.v are independent with Rayleigh and uniform distributions $\sim U(-\pi, \pi)$ respectively. The statistical independence of these derived r.vs is an interesting observation.
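A quick simulation can illustrate the conclusions of Example 9.2. The sketch below draws zero mean Gaussian pairs, forms $Z = \sqrt{X^2 + Y^2}$ and $W = \tan^{-1}(Y/X)$, and compares a few sample statistics with values implied by (9-28)-(9-30): the Rayleigh mean $\sigma\sqrt{\pi/2}$, $P(W \leq 0) = 1/2$, and the product rule for one joint event. The value of `sigma`, the seed and the sample size are assumptions, and agreement on one event is only a spot check of independence.

```python
import math
import random

rng = random.Random(1)          # fixed seed so the run is repeatable
sigma, n = 1.5, 200_000         # assumed common standard deviation and sample size

z_vals, w_vals = [], []
while len(z_vals) < n:
    x, y = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
    if x == 0.0:
        continue                # avoid division by zero (probability zero event)
    z_vals.append(math.hypot(x, y))
    w_vals.append(math.atan(y / x))      # principal value in (-pi/2, pi/2)

mean_z = sum(z_vals) / n
rayleigh_mean = sigma * math.sqrt(math.pi / 2)

p_w = sum(1 for w in w_vals if w <= 0) / n                  # should be near 1/2
p_z = sum(1 for z in z_vals if z <= sigma) / n              # near 1 - exp(-1/2)
p_joint = sum(1 for z, w in zip(z_vals, w_vals) if z <= sigma and w <= 0) / n
print(mean_z, rayleigh_mean, p_joint, p_z * p_w)
```

The joint frequency `p_joint` should be close to the product `p_z * p_w`, as (9-30) predicts.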

Example 9.3: Let X and Y be independent exponential random variables with common parameter $\lambda$. Define U = X + Y, V = X - Y. Find the joint and marginal p.d.f of U and V.

Solution: It is given that

$$f_{XY}(x, y) = \frac{1}{\lambda^2} e^{-(x + y)/\lambda}, \quad x > 0, \ y > 0. \tag{9-32}$$

Now since u = x + y, v = x - y, always $|v| < u$, and there is only one solution given by

$$x = \frac{u + v}{2}, \quad y = \frac{u - v}{2}. \tag{9-33}$$

Moreover the Jacobian of the transformation is given by

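$$J(x, y) = \det \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = -2$$

and hence

$$f_{UV}(u, v) = \frac{1}{2\lambda^2} e^{-u/\lambda}, \quad 0 < |v| < u < \infty, \tag{9-34}$$

represents the joint p.d.f of U and V. This gives

$$f_U(u) = \int_{-u}^{u} f_{UV}(u, v)\, dv = \frac{1}{2\lambda^2} \int_{-u}^{u} e^{-u/\lambda}\, dv = \frac{u}{\lambda^2} e^{-u/\lambda}, \quad 0 < u < \infty, \tag{9-35}$$

and

$$f_V(v) = \int_{|v|}^{\infty} f_{UV}(u, v)\, du = \frac{1}{2\lambda^2} \int_{|v|}^{\infty} e^{-u/\lambda}\, du = \frac{1}{2\lambda} e^{-|v|/\lambda}, \quad -\infty < v < \infty. \tag{9-36}$$

Notice that in this case the r.vs U and V are not independent.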
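The marginals (9-35) and (9-36) can be reproduced by integrating the joint p.d.f (9-34) numerically, and the lack of independence shows up as a mismatch between the joint p.d.f and the product of the marginals. The value of `lam`, the test point `(u0, v0)`, the truncation of the infinite range and the step count are assumptions made for this sketch.

```python
import math

lam = 1.5  # assumed parameter value

def f_uv(u, v):
    """Joint p.d.f (9-34)."""
    return math.exp(-u / lam) / (2 * lam ** 2) if abs(v) < u else 0.0

def f_u(u):
    """Marginal p.d.f (9-35)."""
    return u * math.exp(-u / lam) / lam ** 2 if u > 0 else 0.0

def f_v(v):
    """Marginal p.d.f (9-36)."""
    return math.exp(-abs(v) / lam) / (2 * lam)

def integrate(func, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of func over (a, b)."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

u0, v0 = 2.0, -0.7
marg_u = integrate(lambda v: f_uv(u0, v), -u0, u0)
# The upper limit replaces infinity; the neglected tail is of order exp(-40).
marg_v = integrate(lambda u: f_uv(u, v0), abs(v0), abs(v0) + 40 * lam)
factorization_gap = abs(f_uv(u0, v0) - f_u(u0) * f_v(v0))
print(marg_u, f_u(u0), marg_v, f_v(v0), factorization_gap)
```

A nonzero `factorization_gap` at even a single point is enough to show that (7-28) fails, so U and V are dependent.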

10. Joint Moments

Following section 6, in this section we shall introduce various parameters to compactly represent the information contained in the joint p.d.f of two r.vs. Given two r.vs X and Y and a function $g(x, y)$, define the r.v

$$Z = g(X, Y). \tag{10-1}$$

Using (6-2), we can define the mean of Z to be

$$\mu_Z = E(Z) = \int_{-\infty}^{+\infty} z\, f_Z(z)\, dz. \tag{10-2}$$

However, the situation here is similar to that in (6-13), and it is possible to express the mean of $Z = g(X, Y)$ in terms of $f_{XY}(x, y)$ without computing $f_Z(z)$. To see this, recall from (5-26) and (7-10) that

$$P(z < Z \leq z + \Delta z) = f_Z(z)\, \Delta z = P(z < g(X, Y) \leq z + \Delta z) = \sum\sum_{(x, y) \in D_{\Delta z}} f_{XY}(x, y)\, \Delta x\, \Delta y, \tag{10-3}$$

where $D_{\Delta z}$ is the region in the xy plane satisfying the above inequality. From (10-3), we get

$$z\, f_Z(z)\, \Delta z = \sum\sum_{(x, y) \in D_{\Delta z}} g(x, y)\, f_{XY}(x, y)\, \Delta x\, \Delta y. \tag{10-4}$$

As $\Delta z$ covers the entire z axis, the corresponding regions $D_{\Delta z}$ are nonoverlapping, and they cover the entire xy plane.

By integrating (10-4), we obtain the useful formula

$$E(Z) = \int_{-\infty}^{+\infty} z\, f_Z(z)\, dz = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x, y)\, f_{XY}(x, y)\, dx\, dy \tag{10-5}$$

or

$$E[g(X, Y)] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x, y)\, f_{XY}(x, y)\, dx\, dy. \tag{10-6}$$

If X and Y are discrete-type r.vs, then

$$E[g(X, Y)] = \sum_i \sum_j g(x_i, y_j)\, P(X = x_i, Y = y_j). \tag{10-7}$$

Since expectation is a linear operator, we also get

$$E\left( \sum_k a_k\, g_k(X, Y) \right) = \sum_k a_k\, E[g_k(X, Y)]. \tag{10-8}$$

If X and Y are independent r.vs, it is easy to see that $Z = g(X)$ and $W = h(Y)$ are always independent of each other. In that case using (10-6), we get the interesting result

$$\begin{aligned}
E[g(X) h(Y)] &= \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x)\, h(y)\, f_X(x)\, f_Y(y)\, dx\, dy \\
&= \int_{-\infty}^{+\infty} g(x)\, f_X(x)\, dx \int_{-\infty}^{+\infty} h(y)\, f_Y(y)\, dy = E[g(X)]\, E[h(Y)].
\end{aligned} \tag{10-9}$$

However (10-9) is in general not true (if X and Y are not independent).

In the case of one random variable (see section 6), we defined the parameters mean and variance to represent its average behavior. How does one parametrically represent similar cross-behavior between two random variables? Towards this, we can generalize the variance definition given in (6-16) as shown below:

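Covariance: Given any two r.vs X and Y, define

$$Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]. \tag{10-10}$$

By expanding and simplifying the right side of (10-10), we also get

$$Cov(X, Y) = E(XY) - \mu_X \mu_Y = E(XY) - E(X) E(Y) = \overline{XY} - \overline{X}\ \overline{Y}. \tag{10-11}$$

It is easy to see that

$$|Cov(X, Y)| \leq \sqrt{Var(X)\, Var(Y)}. \tag{10-12}$$

To see (10-12), let $U = aX + Y$, so that

$$Var(U) = E\left[ \{ a(X - \mu_X) + (Y - \mu_Y) \}^2 \right] = a^2\, Var(X) + 2a\, Cov(X, Y) + Var(Y) \geq 0. \tag{10-13}$$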

The right side of (10-13) represents a quadratic in the variable a that has no distinct real roots (Fig. 10.1). Thus the roots are imaginary (or double) and hence the discriminant

$$[Cov(X, Y)]^2 - Var(X)\, Var(Y)$$

must be non-positive, and that gives (10-12). Using (10-12), we may define the normalized parameter

$$\rho_{XY} = \frac{Cov(X, Y)}{\sqrt{Var(X)\, Var(Y)}} = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}, \quad -1 \leq \rho_{XY} \leq 1, \tag{10-14}$$

or

$$Cov(X, Y) = \rho_{XY}\, \sigma_X \sigma_Y \tag{10-15}$$

and it represents the correlation coefficient between X and Y.

Fig. 10.1: $Var(U)$ as a function of a, a parabola that never goes below the a-axis.

Uncorrelated r.vs: If $\rho_{XY} = 0$, then X and Y are said to be uncorrelated r.vs. From (10-11), if X and Y are uncorrelated, then

$$E(XY) = E(X)\, E(Y). \tag{10-16}$$

Orthogonality: X and Y are said to be orthogonal if

$$E(XY) = 0. \tag{10-17}$$

From (10-16) - (10-17), if either X or Y has zero mean, then orthogonality implies uncorrelatedness also and vice-versa. Suppose X and Y are independent r.vs. Then from (10-9) with $g(X) = X$, $h(Y) = Y$, we get

$$E(XY) = E(X)\, E(Y),$$

and together with (10-16), we conclude that the random variables are uncorrelated, thus justifying the original definition in (10-10). Thus independence implies uncorrelatedness.

Naturally, if two random variables are statistically independent, then there cannot be any correlation between them $(\rho_{XY} = 0)$. However, the converse is in general not true. As the next example shows, random variables can be uncorrelated without being independent.

Example 10.1: Let $X \sim U(0, 1)$, $Y \sim U(0, 1)$. Suppose X and Y are independent. Define Z = X + Y, W = X - Y. Show that Z and W are dependent, but uncorrelated r.vs.

Solution: $z = x + y$, $w = x - y$ gives the only solution set to be

$$x = \frac{z + w}{2}, \quad y = \frac{z - w}{2}.$$

Moreover $0 < z < 2$, $-1 < w < 1$, $z + w \leq 2$, $z - w \leq 2$, $z > |w|$ and $|J(z, w)| = 1/2$.

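Thus (see the shaded region in Fig. 10.2)

$$f_{ZW}(z, w) = \begin{cases} 1/2, & 0 < z < 2, \ -1 < w < 1, \ z + w \leq 2, \ z - w \leq 2, \ |w| < z, \\ 0, & \text{otherwise}, \end{cases} \tag{10-18}$$

and hence

$$f_Z(z) = \int f_{ZW}(z, w)\, dw = \begin{cases} \displaystyle\int_{-z}^{z} \frac{1}{2}\, dw = z, & 0 < z < 1, \\ \displaystyle\int_{z-2}^{2-z} \frac{1}{2}\, dw = 2 - z, & 1 < z < 2, \end{cases}$$

or by direct computation ( Z = X + Y )

Fig. 10.2: the shaded diamond-shaped region in the zw plane with $0 < z < 2$ and $-1 < w < 1$.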

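$$f_Z(z) = f_X(z) \otimes f_Y(z) = \begin{cases} z, & 0 < z < 1, \\ 2 - z, & 1 < z < 2, \\ 0, & \text{otherwise}, \end{cases} \tag{10-19}$$

and

$$f_W(w) = \int f_{ZW}(z, w)\, dz = \int_{|w|}^{2 - |w|} \frac{1}{2}\, dz = \begin{cases} 1 - |w|, & -1 < w < 1, \\ 0, & \text{otherwise}. \end{cases} \tag{10-20}$$

Clearly $f_{ZW}(z, w) \neq f_Z(z)\, f_W(w)$. Thus Z and W are not independent. However

$$E(ZW) = E[(X + Y)(X - Y)] = E(X^2) - E(Y^2) = 0, \tag{10-21}$$

and

$$E(W) = E(X - Y) = 0,$$

and hence

$$Cov(Z, W) = E(ZW) - E(Z)\, E(W) = 0 \tag{10-22}$$

implying that Z and W are uncorrelated random variables.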
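Example 10.1 lends itself to a short simulation. The sketch below estimates $Cov(Z, W)$ from samples and also exposes the dependence: by (10-18) the events $\{Z > 1.5\}$ and $\{|W| > 0.5\}$ cannot occur together, although each has positive probability on its own. The seed, sample size and the two thresholds are assumptions chosen for illustration.

```python
import random

rng = random.Random(2)          # fixed seed so the run is repeatable
n = 200_000                     # assumed sample size

pairs = []
for _ in range(n):
    x, y = rng.random(), rng.random()     # independent U(0, 1) samples
    pairs.append((x + y, x - y))          # (Z, W)

mean_z = sum(z for z, _ in pairs) / n
mean_w = sum(w for _, w in pairs) / n
cov_zw = sum((z - mean_z) * (w - mean_w) for z, w in pairs) / n   # sample version of (10-10)

p_z = sum(1 for z, _ in pairs if z > 1.5) / n
p_w = sum(1 for _, w in pairs if abs(w) > 0.5) / n
p_both = sum(1 for z, w in pairs if z > 1.5 and abs(w) > 0.5) / n
print(cov_zw, p_z, p_w, p_both)
```

The sample covariance is close to zero, yet `p_both` is exactly zero while `p_z * p_w` is not, so the product rule (7-26) fails and Z and W are dependent.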

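Example 10.2: Let $Z = aX + bY$. Determine the variance of Z in terms of $\sigma_X$, $\sigma_Y$ and $\rho_{XY}$.

Solution:

$$\mu_Z = E(Z) = E(aX + bY) = a\mu_X + b\mu_Y$$

and using (10-15)

$$\begin{aligned}
\sigma_Z^2 = Var(Z) &= E[(Z - \mu_Z)^2] = E\left[ \left( a(X - \mu_X) + b(Y - \mu_Y) \right)^2 \right] \\
&= a^2 E(X - \mu_X)^2 + 2ab\, E\left( (X - \mu_X)(Y - \mu_Y) \right) + b^2 E(Y - \mu_Y)^2 \\
&= a^2 \sigma_X^2 + 2ab\, \rho_{XY}\, \sigma_X \sigma_Y + b^2 \sigma_Y^2.
\end{aligned} \tag{10-23}$$

In particular if X and Y are independent, then $\rho_{XY} = 0$, and (10-23) reduces to

$$\sigma_Z^2 = a^2 \sigma_X^2 + b^2 \sigma_Y^2. \tag{10-24}$$

Thus the variance of the sum of independent r.vs is the sum of their variances $(a = b = 1)$.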
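Formula (10-23) can be checked on a small discrete example using (10-7) for all the expectations. The joint distribution `joint` and the coefficients `a`, `b` below are made-up values for illustration, not taken from the lecture.

```python
import math

# Hypothetical joint distribution: (x, y) -> P(X = x, Y = y); probabilities sum to 1.
joint = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.3, (2, 1): 0.3}

def expect(g):
    """E[g(X, Y)] from (10-7)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov_xy = expect(lambda x, y: (x - mu_x) * (y - mu_y))       # (10-10)
rho = cov_xy / math.sqrt(var_x * var_y)                     # (10-14)

a, b = 2.0, -3.0                                            # assumed coefficients
mu_z = a * mu_x + b * mu_y
var_z_direct = expect(lambda x, y: (a * x + b * y - mu_z) ** 2)
var_z_formula = (a ** 2 * var_x
                 + 2 * a * b * rho * math.sqrt(var_x * var_y)
                 + b ** 2 * var_y)                          # (10-23)
print(rho, var_z_direct, var_z_formula)
```

The two variance values agree up to rounding, and `rho` stays within $[-1, 1]$ as required by (10-14).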

Moments:

$$E[X^k Y^m] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x^k y^m f_{XY}(x, y)\, dx\, dy, \tag{10-25}$$

represents the joint moment of order (k,m) for X and Y.

Following the one random variable case, we can define the joint characteristic function between two random variables which will turn out to be useful for moment calculations.

Date post:	29-Dec-2015
Category:	Documents
Upload:	dominick-pearson