Date post: | 29-Dec-2015 |
1. Basics
Probability theory deals with the study of random
phenomena, which under repeated experiments yield
different outcomes that have certain underlying patterns
about them. The notion of an experiment assumes a set of
repeatable conditions that allow any number of identical
repetitions. When an experiment is performed under these
conditions, certain elementary events $\xi_i$ occur in different
but completely uncertain ways. We can assign a nonnegative
number $P(\xi_i)$ as the probability of the event $\xi_i$ in various
ways.
PROBABILITY THEORY
PILLAI
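The totality of all $\xi_i$, known a priori, constitutes a set $\Omega$, the set of all experimental outcomes.
$$\Omega = \{\xi_1, \xi_2, \ldots, \xi_k, \ldots\} \tag{1-7}$$
$\Omega$ has subsets $A, B, C, \ldots$. Recall that if $A$ is a subset of $\Omega$, then $\xi \in A$ implies $\xi \in \Omega$. From $A$ and $B$, we can generate other related subsets $A \cup B$, $A \cap B$, $\bar{A}$, $\bar{B}$, etc.
$$A \cup B = \{\xi \mid \xi \in A \ \text{or}\ \xi \in B\}, \qquad A \cap B = \{\xi \mid \xi \in A \ \text{and}\ \xi \in B\}$$
and
$$\bar{A} = \{\xi \mid \xi \notin A\}. \tag{1-8}$$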
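[Fig. 1.1: Venn diagrams of $A \cup B$, $A \cap B$ and $\bar{A}$.]
• If $A \cap B = \phi$, the empty set, then A and B are said to be mutually exclusive (M.E).
• A partition of $\Omega$ is a collection of mutually exclusive subsets $A_i$ of $\Omega$ such that their union is $\Omega$:
$$A_i \cap A_j = \phi, \quad \text{and} \quad \bigcup_{i=1} A_i = \Omega. \tag{1-9}$$
[Fig. 1.2: mutually exclusive sets A and B ($A \cap B = \phi$), and a partition $A_1, A_2, \ldots, A_n$ of $\Omega$.]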
De-Morgan's Laws:
$$\overline{A \cup B} = \bar{A} \cap \bar{B}; \qquad \overline{A \cap B} = \bar{A} \cup \bar{B} \tag{1-10}$$
[Fig. 1.3: Venn diagrams illustrating De-Morgan's laws.]
• Often it is meaningful to talk about at least some of the subsets of $\Omega$ as events, for which we must have a mechanism to compute their probabilities.
Example 1.1: Consider the experiment where two coins are simultaneously tossed. The various elementary events are
$$\xi_1 = (H,H), \quad \xi_2 = (H,T), \quad \xi_3 = (T,H), \quad \xi_4 = (T,T)$$
and
$$\Omega = \{\xi_1, \xi_2, \xi_3, \xi_4\}.$$
The subset $A = \{\xi_1, \xi_2, \xi_3\}$ is the same as "Head has occurred at least once" and qualifies as an event.
Suppose two subsets A and B are both events, then consider
"Does an outcome belong to A or B?" (the set $A \cup B$)
"Does an outcome belong to A and B?" (the set $A \cap B$)
"Does an outcome fall outside A?" (the set $\bar{A}$)
Thus the sets $A \cup B$, $A \cap B$, $\bar{A}$, $\bar{B}$, etc., also qualify as events.
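The set operations above can be tried out directly on the two-coin sample space of Example 1.1. The following is a minimal sketch using Python sets; the second event `B` ("both coins agree") and the helper `complement` are illustrative assumptions, not part of the original notes.

```python
from itertools import product

# Sample space of Example 1.1: two coins tossed simultaneously.
omega = set(product("HT", repeat=2))      # (H,H), (H,T), (T,H), (T,T)

# Events are subsets of omega.
A = {w for w in omega if "H" in w}        # "Head has occurred at least once"
B = {w for w in omega if w[0] == w[1]}    # hypothetical event: both coins agree

def complement(event):
    """Complement of an event relative to the sample space omega."""
    return omega - event

union = A | B
intersection = A & B

# De-Morgan's laws (1-10), checked on these two events.
de_morgan_1 = complement(A | B) == (complement(A) & complement(B))
de_morgan_2 = complement(A & B) == (complement(A) | complement(B))
```

Because events are plain sets here, any union, intersection or complement of events is again a subset of `omega`, which mirrors the statement that these sets also qualify as events.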
Axioms of Probability
For any event A, we assign a number P(A), called the probability of the event A. This number satisfies the following three conditions that act as the axioms of probability.
$$\begin{aligned}
&\text{(i)} \quad P(A) \ge 0 \quad \text{(Probability is a nonnegative number)} \\
&\text{(ii)} \quad P(\Omega) = 1 \quad \text{(Probability of the whole set is unity)} \\
&\text{(iii)} \quad \text{If } A \cap B = \phi, \text{ then } P(A \cup B) = P(A) + P(B).
\end{aligned} \tag{1-13}$$
(Note that (iii) states that if A and B are mutually exclusive (M.E.) events, the probability of their union is the sum of their probabilities.)
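The following conclusions follow from these axioms:
a. Since $A \cup \bar{A} = \Omega$, we have using (ii)
$$P(A \cup \bar{A}) = P(\Omega) = 1.$$
But $A \cap \bar{A} = \phi$, and using (iii),
$$P(A \cup \bar{A}) = P(A) + P(\bar{A}) = 1 \quad \text{or} \quad P(\bar{A}) = 1 - P(A). \tag{1-14}$$
b. Similarly, for any A, $A \cap \{\phi\} = \{\phi\}$.
Hence it follows that $P(A \cup \{\phi\}) = P(A) + P(\phi)$.
But $A \cup \{\phi\} = A$, and thus
$$P\{\phi\} = 0. \tag{1-15}$$
c. Suppose A and B are not mutually exclusive (M.E.)?
How does one compute $P(A \cup B) = \ ?$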
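To compute the above probability, we should re-express $A \cup B$ in terms of M.E. sets so that we can make use of the probability axioms. From Fig.1.4 we have
$$A \cup B = A \cup \bar{A}B, \tag{1-16}$$
where A and $\bar{A}B$ are clearly M.E. events.
Thus using axiom (1-13-iii)
$$P(A \cup B) = P(A \cup \bar{A}B) = P(A) + P(\bar{A}B). \tag{1-17}$$
To compute $P(\bar{A}B)$, we can express B as
$$B = B \cap \Omega = B \cap (A \cup \bar{A}) = (B \cap A) \cup (B \cap \bar{A}) = BA \cup B\bar{A}. \tag{1-18}$$
Thus
$$P(B) = P(BA) + P(B\bar{A}), \tag{1-19}$$
since $BA = AB$ and $B\bar{A} = \bar{A}B$ are M.E. events.
[Fig. 1.4: Venn diagram of $A \cup B$ split into A and $\bar{A}B$.]
From (1-19),
$$P(\bar{A}B) = P(B) - P(AB) \tag{1-20}$$
and using (1-20) in (1-17)
$$P(A \cup B) = P(A) + P(B) - P(AB). \tag{1-21}$$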
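As a quick numerical check of (1-14), (1-20) and (1-21), the sketch below uses a fair die with equally likely outcomes. The die and the particular events `A` and `B` are assumptions chosen only for illustration.

```python
from fractions import Fraction

# Assumed model: a fair die, all six outcomes equally likely.
omega = set(range(1, 7))

def prob(event):
    """Probability of an event under the equally-likely assumption."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}          # hypothetical event: outcome is even
B = {4, 5, 6}          # hypothetical event: outcome is at least 4

lhs = prob(A | B)                               # P(A u B) computed directly
rhs = prob(A) + prob(B) - prob(A & B)           # right side of (1-21)
p_not_a = prob(omega - A)                       # P(A-bar), compare with (1-14)
p_abar_b = prob((omega - A) & B)                # P(A-bar B), compare with (1-20)
```

Exact fractions are used so that the two sides can be compared with `==` rather than with a floating-point tolerance.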
Conditional Probability and Independence
In N independent trials, suppose $N_A$, $N_B$, $N_{AB}$ denote the number of times events A, B and AB occur respectively. According to the frequency interpretation of probability, for large N
$$P(A) \approx \frac{N_A}{N}, \quad P(B) \approx \frac{N_B}{N}, \quad P(AB) \approx \frac{N_{AB}}{N}. \tag{1-33}$$
Among the $N_A$ occurrences of A, only $N_{AB}$ of them are also found among the $N_B$ occurrences of B. Thus the ratio
$$\frac{N_{AB}}{N_B} = \frac{N_{AB}/N}{N_B/N} = \frac{P(AB)}{P(B)} \tag{1-34}$$
is a measure of "the event A given that B has already occurred". We denote this conditional probability by
P(A|B) = Probability of "the event A given that B has occurred".
We define
$$P(A|B) = \frac{P(AB)}{P(B)}, \tag{1-35}$$
provided $P(B) \neq 0$. As we show below, the above definition satisfies all probability axioms discussed earlier.
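We have
(i)
$$P(A|B) = \frac{P(AB)}{P(B)} \ge 0, \quad \text{since } P(AB) \ge 0 \text{ and } P(B) > 0, \tag{1-36}$$
(ii)
$$P(\Omega|B) = \frac{P(\Omega B)}{P(B)} = \frac{P(B)}{P(B)} = 1, \tag{1-37}$$
since $\Omega B = B$.
(iii) Suppose $A \cap C = \phi$. Then
$$P(A \cup C \mid B) = \frac{P((A \cup C) \cap B)}{P(B)} = \frac{P(AB \cup CB)}{P(B)}. \tag{1-38}$$
But $AB \cap CB = \phi$, hence $P(AB \cup CB) = P(AB) + P(CB)$, and
$$P(A \cup C \mid B) = \frac{P(AB)}{P(B)} + \frac{P(CB)}{P(B)} = P(A|B) + P(C|B), \tag{1-39}$$
satisfying all probability axioms in (1-13). Thus (1-35) defines a legitimate probability measure.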
Properties of Conditional Probability:
a. If $B \subset A$, then $AB = B$, and
$$P(A|B) = \frac{P(AB)}{P(B)} = \frac{P(B)}{P(B)} = 1 \tag{1-40}$$
since if $B \subset A$, then occurrence of B implies automatic occurrence of the event A. As an example, let A = {outcome is even}, B = {outcome is 2} in a dice tossing experiment. Then $B \subset A$, and $P(A|B) = 1$.
b. If $A \subset B$, then $AB = A$, and
$$P(A|B) = \frac{P(AB)}{P(B)} = \frac{P(A)}{P(B)} \ge P(A), \tag{1-41}$$
with strict inequality whenever $P(A) > 0$ and $P(B) < 1$.
Independence: A and B are said to be independent events, if
$$P(AB) = P(A) \cdot P(B). \tag{1-45}$$
Notice that the above definition is a probabilistic statement, not a set theoretic notion such as mutual exclusiveness.
Suppose A and B are independent, then
$$P(A|B) = \frac{P(AB)}{P(B)} = \frac{P(A)P(B)}{P(B)} = P(A). \tag{1-46}$$
Thus if A and B are independent, the event that B has occurred does not shed any more light on the event A. It makes no difference to A whether B has occurred or not. An example will clarify the situation:
Example 1.2: A box contains 6 white and 4 black balls. Remove two balls at random without replacement. What is the probability that the first one is white and the second one is black?
Let $W_1$ = "first ball removed is white"
$B_2$ = "second ball removed is black"
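We need $P(W_1 \cap B_2) = \ ?$ We have $W_1 \cap B_2 = W_1B_2 = B_2W_1$. Using the conditional probability rule,
$$P(W_1B_2) = P(B_2W_1) = P(B_2|W_1)P(W_1). \tag{1-47}$$
But
$$P(W_1) = \frac{6}{6+4} = \frac{6}{10} = \frac{3}{5},$$
and
$$P(B_2|W_1) = \frac{4}{5+4} = \frac{4}{9},$$
and hence
$$P(W_1B_2) = \frac{3}{5} \cdot \frac{4}{9} = \frac{4}{15} \approx 0.267.$$
Are the events $W_1$ and $B_2$ independent? Our common sense says No. To verify this we need to compute $P(B_2)$. Of course the fate of the second ball very much depends on that of the first ball. The first ball has two options: $W_1$ = "first ball is white" or $B_1$ = "first ball is black". Note that $W_1 \cap B_1 = \phi$, and $W_1 \cup B_1 = \Omega$. Hence $W_1$ together with $B_1$ form a partition. Thus (see (1-42)-(1-44))
$$P(B_2) = P(B_2|W_1)P(W_1) + P(B_2|B_1)P(B_1) = \frac{4}{5+4} \cdot \frac{3}{5} + \frac{3}{6+3} \cdot \frac{4}{10} = \frac{4}{9} \cdot \frac{3}{5} + \frac{1}{3} \cdot \frac{2}{5} = \frac{4+2}{15} = \frac{2}{5},$$
and
$$P(B_2)P(W_1) = \frac{2}{5} \cdot \frac{3}{5} = \frac{6}{25} \neq P(B_2W_1) = \frac{4}{15}.$$
As expected, the events $W_1$ and $B_2$ are dependent.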
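The numbers in Example 1.2 can be confirmed by brute-force enumeration of all ordered draws of two distinct balls. This is only a sketch; the helper `prob` and the encoding of balls as the characters `"W"` and `"B"` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations

# 6 white and 4 black balls; draw two without replacement.
balls = ["W"] * 6 + ["B"] * 4
draws = list(permutations(range(len(balls)), 2))   # 90 equally likely ordered pairs

def prob(pred):
    """Probability that pred(first_colour, second_colour) holds."""
    hits = sum(1 for i, j in draws if pred(balls[i], balls[j]))
    return Fraction(hits, len(draws))

p_w1 = prob(lambda first, second: first == "W")                       # P(W1)
p_b2 = prob(lambda first, second: second == "B")                      # P(B2)
p_w1_b2 = prob(lambda first, second: first == "W" and second == "B")  # P(W1 B2)

independent = (p_w1_b2 == p_w1 * p_b2)
```

The enumeration gives $P(W_1) = 3/5$, $P(B_2) = 2/5$ and $P(W_1B_2) = 4/15$, so the product rule (1-45) fails and the two events are dependent.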
From (1-35),
$$P(AB) = P(A|B)P(B). \tag{1-48}$$
Similarly, from (1-35)
$$P(B|A) = \frac{P(BA)}{P(A)} = \frac{P(AB)}{P(A)},$$
or
$$P(AB) = P(B|A)P(A). \tag{1-49}$$
From (1-48)-(1-49), we get
$$P(A|B)P(B) = P(B|A)P(A)$$
or
$$P(A|B) = \frac{P(B|A)}{P(B)} \cdot P(A). \tag{1-50}$$
Equation (1-50) is known as Bayes' theorem.
Although simple enough, Bayes' theorem has an interesting interpretation: P(A) represents the a-priori probability of the event A. Suppose B has occurred, and assume that A and B are not independent. How can this new information be used to update our knowledge about A? Bayes' rule in (1-50) takes into account the new information ("B has occurred") and gives out the a-posteriori probability of A given B.
We can also view the event B as new knowledge obtained from a fresh experiment. We know something about A as P(A). The new information is available in terms of B. The new information should be used to improve our knowledge/understanding of A. Bayes' theorem gives the exact mechanism for incorporating such new information.
A more general version of Bayes' theorem involves
partition of $\Omega$. From (1-50)
$$P(A_i|B) = \frac{P(B|A_i)P(A_i)}{P(B)} = \frac{P(B|A_i)P(A_i)}{\sum_{j=1}^{n} P(B|A_j)P(A_j)}, \tag{1-51}$$
where we have made use of (1-44). In (1-51), $A_i$, $i = 1 \to n$,
represent a set of mutually exclusive events with
associated a-priori probabilities $P(A_i)$, $i = 1 \to n$. With the
new information "B has occurred", the information about
$A_i$ can be updated by the n conditional probabilities
$P(B|A_i)$, $i = 1 \to n$, using (1-47).
Example 1.3: Two boxes $B_1$ and $B_2$ contain 100 and 200
light bulbs respectively. The first box ($B_1$) has 15 defective
bulbs and the second 5. Suppose a box is selected at random and one bulb is picked out.
(a) What is the probability that it is defective?
Solution: Note that box $B_1$ has 85 good and 15 defective
bulbs. Similarly box $B_2$ has 195 good and 5 defective
bulbs. Let D = "Defective bulb is picked out".
Then
$$P(D|B_1) = \frac{15}{100} = 0.15, \qquad P(D|B_2) = \frac{5}{200} = 0.025.$$
Since a box is selected at random, they are equally likely:
$$P(B_1) = P(B_2) = \frac{1}{2}.$$
Thus $B_1$ and $B_2$ form a partition as in (1-43), and using
(1-44) we obtain
$$P(D) = P(D|B_1)P(B_1) + P(D|B_2)P(B_2) = 0.15 \times \frac{1}{2} + 0.025 \times \frac{1}{2} = 0.0875.$$
Thus, there is about 9% probability that a bulb picked at random is defective.
(b) Suppose we test the bulb and it is found to be defective.
What is the probability that it came from box 1, i.e., $P(B_1|D) = \ ?$
$$P(B_1|D) = \frac{P(D|B_1)P(B_1)}{P(D)} = \frac{0.15 \times 1/2}{0.0875} = 0.8571. \tag{1-52}$$
Notice that initially $P(B_1) = 0.5$; then we picked out a box
at random and tested a bulb that turned out to be defective.
Can this information shed some light on the fact that we
might have picked up box 1?
From (1-52), $P(B_1|D) = 0.857 > 0.5$, and indeed it is more
likely at this point that we must have chosen box 1 in favor
of box 2. (Recall that the defective rate of box 1 is six times
that of box 2.)
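The update in Example 1.3 is a direct application of (1-51), and can be sketched in a few lines. The function name `posterior` and the dictionary layout are assumptions for illustration; the priors and likelihoods are the ones stated in the example.

```python
from fractions import Fraction

# A-priori probabilities of the two boxes and the defect rates from Example 1.3.
prior = {"B1": Fraction(1, 2), "B2": Fraction(1, 2)}
likelihood = {"B1": Fraction(15, 100), "B2": Fraction(5, 200)}   # P(D | box)

def posterior(prior, likelihood):
    """Return P(D) and the a-posteriori probabilities P(box | D), as in (1-51)."""
    evidence = sum(likelihood[h] * prior[h] for h in prior)      # total probability
    return evidence, {h: likelihood[h] * prior[h] / evidence for h in prior}

p_d, post = posterior(prior, likelihood)
```

With exact fractions, $P(D) = 7/80 = 0.0875$ and $P(B_1|D) = 6/7 \approx 0.857$, matching parts (a) and (b).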
Random Variables
Let X be a function that maps every $\xi \in \Omega$ to a unique
point $x \in R$, the set of real numbers. Since the outcome $\xi$
is not certain, so is the value $X(\xi) = x$. Thus if B is some
subset of R, we may want to determine the probability of
"$X(\xi) \in B$". To determine this probability, we can look at
the set $A = X^{-1}(B) \subset \Omega$ that contains all $\xi \in \Omega$ that map
into B under the function X.
[Fig. 3.1: the mapping $X(\xi)$ from the set $A \subset \Omega$ onto the set B on the real line R.]
Obviously, if the set $A = X^{-1}(B)$ also belongs to the associated field F, then it is an event and the probability of A is well defined; in that case we can say
$$\text{Probability of the event "}X(\xi) \in B\text{"} = P(X^{-1}(B)). \tag{3-1}$$
However, $X^{-1}(B)$ may not always belong to F for all B, thus creating difficulties. The notion of random variable (r.v) makes sure that the inverse mapping always results in an event so that we are able to determine the probability for any $B \subset R$.
Random Variable (r.v): A finite single valued function $X(\cdot)$ that maps the set of all experimental outcomes $\Omega$ into the set of real numbers R is said to be a r.v, if the set $\{\xi \mid X(\xi) \le x\}$ is an event ($\in F$) for every x in R.
Denote
$$P\{\xi \mid X(\xi) \le x\} = F_X(x) \ge 0. \tag{3-4}$$
This is the Probability Distribution Function (PDF) associated with the r.v X.
Distribution Function: Note that a distribution function g(x) is nondecreasing, right-continuous and satisfies
$$g(+\infty) = 1, \quad g(-\infty) = 0, \tag{3-5}$$
i.e., if g(x) is a distribution function, then
(i) $g(+\infty) = 1$, $g(-\infty) = 0$,
(ii) if $x_1 < x_2$, then $g(x_1) \le g(x_2)$,
and
(iii) $g(x^+) = g(x)$, for all x. (3-6)
We need to show that $F_X(x)$ defined in (3-4) satisfies all properties in (3-6). In fact, for any r.v X,
(i)
$$F_X(+\infty) = P\{\xi \mid X(\xi) \le +\infty\} = P(\Omega) = 1 \tag{3-7}$$
and
$$F_X(-\infty) = P\{\xi \mid X(\xi) \le -\infty\} = P(\phi) = 0. \tag{3-8}$$
(ii) If $x_1 < x_2$, then the subset $(-\infty, x_1) \subset (-\infty, x_2)$. Consequently the event $\{\xi \mid X(\xi) \le x_1\} \subset \{\xi \mid X(\xi) \le x_2\}$, since $X(\xi) \le x_1$ implies $X(\xi) \le x_2$. As a result
$$F_X(x_1) = P(X(\xi) \le x_1) \le P(X(\xi) \le x_2) = F_X(x_2), \tag{3-9}$$
implying that the probability distribution function is nonnegative and monotone nondecreasing.
Additional Properties of a PDF
(iv) If $F_X(x_0) = 0$ for some $x_0$, then
$$F_X(x) = 0, \quad x \le x_0. \tag{3-15}$$
This follows, since $F_X(x_0) = P(X(\xi) \le x_0) = 0$ implies $\{X(\xi) \le x_0\}$ is the null set, and for any $x \le x_0$, $\{X(\xi) \le x\}$ will be a subset of the null set.
(v)
$$P\{X(\xi) > x\} = 1 - F_X(x). \tag{3-16}$$
We have $\{X(\xi) \le x\} \cup \{X(\xi) > x\} = \Omega$, and since the two events are mutually exclusive, (3-16) follows.
(vi)
$$P\{x_1 < X(\xi) \le x_2\} = F_X(x_2) - F_X(x_1), \quad x_2 > x_1. \tag{3-17}$$
The events $\{X(\xi) \le x_1\}$ and $\{x_1 < X(\xi) \le x_2\}$ are mutually exclusive and their union represents the event $\{X(\xi) \le x_2\}$.
(vii)
$$P(X(\xi) = x) = F_X(x) - F_X(x^-). \tag{3-18}$$
Let $x_1 = x - \varepsilon$, $\varepsilon > 0$, and $x_2 = x$. From (3-17)
$$\lim_{\varepsilon \to 0} P\{x - \varepsilon < X(\xi) \le x\} = F_X(x) - \lim_{\varepsilon \to 0} F_X(x - \varepsilon), \tag{3-19}$$
or
$$P\{X(\xi) = x\} = F_X(x) - F_X(x^-). \tag{3-20}$$
According to (3-14), $F_X(x_0^+)$, the limit of $F_X(x)$ as $x \to x_0$ from the right, always exists and equals $F_X(x_0)$. However the left limit value $F_X(x_0^-)$ need not equal $F_X(x_0)$. Thus $F_X(x)$ need not be continuous from the left. At a discontinuity point of the distribution, the left and right limits are different, and from (3-20)
$$P\{X(\xi) = x_0\} = F_X(x_0) - F_X(x_0^-) > 0. \tag{3-21}$$
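Thus the only discontinuities of a distribution function $F_X(x)$ are of the jump type, and occur at points $x_0$ where (3-21) is satisfied. These points can always be enumerated as a sequence, and moreover they are at most countable in number.
Example 3.1: X is a r.v such that $X(\xi) = c$, $\xi \in \Omega$. Find $F_X(x)$.
Solution: For $x < c$, $\{X(\xi) \le x\} = \{\phi\}$, so that $F_X(x) = 0$, and for $x \ge c$, $\{X(\xi) \le x\} = \Omega$, so that $F_X(x) = 1$. (Fig.3.2)
[Fig. 3.2: $F_X(x)$ is a unit step at $x = c$.]
Example 3.2: Toss a coin. $\Omega = \{H, T\}$. Suppose the r.v X is such that $X(T) = 0$, $X(H) = 1$. Find $F_X(x)$.
Solution: For $x < 0$, $\{X(\xi) \le x\} = \{\phi\}$, so that $F_X(x) = 0$.
$$\begin{aligned}
&0 \le x < 1, \quad \{X(\xi) \le x\} = \{T\}, \ \text{so that } F_X(x) = P\{T\} = 1 - p, \\
&x \ge 1, \quad \{X(\xi) \le x\} = \{H, T\} = \Omega, \ \text{so that } F_X(x) = 1. \quad \text{(Fig. 3.3)}
\end{aligned}$$
[Fig. 3.3: $F_X(x)$ jumps to $q = 1 - p$ at $x = 0$ and to 1 at $x = 1$.]
• X is said to be a continuous-type r.v if its distribution function $F_X(x)$ is continuous. In that case $F_X(x^-) = F_X(x)$ for all x, and from (3-21) we get $P\{X = x\} = 0$.
• If $F_X(x)$ is constant except for a finite number of jump discontinuities (piece-wise constant; step-type), then X is said to be a discrete-type r.v. If $x_i$ is such a discontinuity point, then from (3-21)
$$p_i = P\{X = x_i\} = F_X(x_i) - F_X(x_i^-). \tag{3-22}$$
From Fig.3.2, at a point of discontinuity we get
$$P\{X = c\} = F_X(c) - F_X(c^-) = 1 - 0 = 1,$$
and from Fig.3.3,
$$P\{X = 0\} = F_X(0) - F_X(0^-) = q - 0 = q.$$
Example 3.3: A fair coin is tossed twice, and let the r.v X represent the number of heads. Find $F_X(x)$.
Solution: In this case $\Omega = \{HH, HT, TH, TT\}$, and
$$X(HH) = 2, \quad X(HT) = 1, \quad X(TH) = 1, \quad X(TT) = 0.$$
$$\begin{aligned}
&x < 0, \quad \{X(\xi) \le x\} = \phi \ \Rightarrow\ F_X(x) = 0, \\
&0 \le x < 1, \quad \{X(\xi) \le x\} = \{TT\} \ \Rightarrow\ F_X(x) = P\{TT\} = P(T)P(T) = \tfrac{1}{4}, \\
&1 \le x < 2, \quad \{X(\xi) \le x\} = \{TT, HT, TH\} \ \Rightarrow\ F_X(x) = P\{TT, HT, TH\} = \tfrac{3}{4}, \\
&x \ge 2, \quad \{X(\xi) \le x\} = \Omega \ \Rightarrow\ F_X(x) = 1. \quad \text{(Fig. 3.4)}
\end{aligned}$$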
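The staircase distribution of Example 3.3 can be generated by enumerating the sample space. The sketch below is illustrative only; the helper names `X`, `cdf` and `left_limit` are assumptions, and `left_limit` relies on the fact that the jumps here sit at integers, so stepping back by a small `eps` lands strictly between jumps.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin; all four outcomes equally likely.
outcomes = list(product("HT", repeat=2))

def X(w):
    """The r.v of Example 3.3: number of heads in the outcome w."""
    return w.count("H")

def cdf(x):
    """F_X(x) = P{X <= x}, computed by counting outcomes."""
    return Fraction(sum(1 for w in outcomes if X(w) <= x), len(outcomes))

def left_limit(x, eps=Fraction(1, 1000)):
    """F_X(x^-), valid here because all jumps are at integer points."""
    return cdf(x - eps)

# Jump sizes p_i = F_X(x_i) - F_X(x_i^-), as in (3-22).
jumps = {k: cdf(k) - left_limit(k) for k in (0, 1, 2)}
```

The jump sizes recover the probabilities 1/4, 1/2 and 1/4 of obtaining zero, one and two heads.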
[Fig. 3.4: staircase $F_X(x)$ with levels 1/4, 3/4 and 1, jumping at $x = 0, 1, 2$.]
From Fig.3.4,
$$P\{X = 1\} = F_X(1) - F_X(1^-) = 3/4 - 1/4 = 1/2.$$
Probability density function (p.d.f)
The derivative of the distribution function $F_X(x)$ is called the probability density function $f_X(x)$ of the r.v X. Thus
$$f_X(x) = \frac{dF_X(x)}{dx}. \tag{3-23}$$
Since
$$\frac{dF_X(x)}{dx} = \lim_{\Delta x \to 0} \frac{F_X(x + \Delta x) - F_X(x)}{\Delta x} \ge 0, \tag{3-24}$$
from the monotone-nondecreasing nature of $F_X(x)$, it follows that $f_X(x) \ge 0$ for all x. $f_X(x)$ will be a continuous function, if X is a continuous type r.v. However, if X is a discrete type r.v as in (3-22), then its p.d.f has the general form (Fig. 3.5)
$$f_X(x) = \sum_i p_i \, \delta(x - x_i), \tag{3-25}$$
where $x_i$ represent the jump-discontinuity points in $F_X(x)$. As Fig. 3.5 shows, $f_X(x)$ represents a collection of positive discrete masses, and it is known as the probability mass function (p.m.f) in the discrete case.
[Fig. 3.5: $f_X(x)$ as impulses of mass $p_i$ located at the points $x_i$.]
From (3-23), we also obtain by integration
$$F_X(x) = \int_{-\infty}^{x} f_X(u)\,du. \tag{3-26}$$
Since $F_X(+\infty) = 1$, (3-26) yields
$$\int_{-\infty}^{+\infty} f_X(x)\,dx = 1, \tag{3-27}$$
which justifies its name as the density function. Further, from (3-26), we also get (Fig. 3.6b)
$$P\{x_1 < X(\xi) \le x_2\} = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} f_X(x)\,dx. \tag{3-28}$$
Thus the area under $f_X(x)$ in the interval $(x_1, x_2)$ represents the probability in (3-28).
[Fig. 3.6: (a) $F_X(x)$ with the points $x_1$, $x_2$ marked; (b) $f_X(x)$ with the area between $x_1$ and $x_2$.]
Often, r.vs are referred to by their specific density functions - both in the continuous and discrete cases - and in what follows we shall list a number of them in each category.
Continuous-type random variables
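1. Normal (Gaussian): X is said to be normal or Gaussian r.v, if
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}. \tag{3-29}$$
This is a bell shaped curve (Fig. 3.7), symmetric around the parameter $\mu$, and its distribution function is given by
$$F_X(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(y-\mu)^2/2\sigma^2}\,dy = G\!\left(\frac{x-\mu}{\sigma}\right), \tag{3-30}$$
where $G(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\,dy$ is often tabulated. Since $f_X(x)$ depends on two parameters $\mu$ and $\sigma^2$, the notation $X \sim N(\mu, \sigma^2)$ will be used to represent (3-29).
2. Uniform: $X \sim U(a, b)$, $a < b$, if (Fig. 3.8)
$$f_X(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b, \\ 0, & \text{otherwise.} \end{cases} \tag{3-31}$$
3. Exponential: $X \sim \varepsilon(\lambda)$ if (Fig. 3.9)
$$f_X(x) = \begin{cases} \dfrac{1}{\lambda}\, e^{-x/\lambda}, & x \ge 0, \\ 0, & \text{otherwise.} \end{cases} \tag{3-32}$$
9. Cauchy: $X \sim C(\alpha, \mu)$, if (Fig. 3.14)
$$f_X(x) = \frac{\alpha/\pi}{\alpha^2 + (x-\mu)^2}, \quad -\infty < x < +\infty. \tag{3-39}$$
10. Laplace: (Fig. 3.15)
$$f_X(x) = \frac{1}{2\lambda}\, e^{-|x|/\lambda}, \quad -\infty < x < +\infty. \tag{3-40}$$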
Discrete-type random variables
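1. Bernoulli: X takes the values (0,1), and
$$P(X = 0) = q, \quad P(X = 1) = p. \tag{3-43}$$
2. Binomial: $X \sim B(n, p)$, if (Fig. 3.17)
$$P(X = k) = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, 2, \ldots, n. \tag{3-44}$$
3. Poisson: $X \sim P(\lambda)$, if (Fig. 3.18)
$$P(X = k) = e^{-\lambda}\, \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots, \infty. \tag{3-45}$$
4. Hypergeometric:
$$P(X = k) = \frac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{N}{n}}, \quad \max(0, m+n-N) \le k \le \min(m, n). \tag{3-46}$$
5. Geometric: $X \sim g(p)$ if
$$P(X = k) = p q^k, \quad k = 0, 1, 2, \ldots, \infty, \quad q = 1 - p. \tag{3-47}$$
6. Negative Binomial: $X \sim NB(r, p)$, if
$$P(X = k) = \binom{k-1}{r-1} p^r q^{k-r}, \quad k = r, r+1, \ldots. \tag{3-48}$$
7. Discrete-Uniform:
$$P(X = k) = \frac{1}{N}, \quad k = 1, 2, \ldots, N. \tag{3-49}$$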
Conditional Probability Density Function
For any two events A and B, we have defined the conditional probability of A given B as
$$P(A|B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) \neq 0. \tag{4-9}$$
Noting that the probability distribution function $F_X(x)$ is given by
$$F_X(x) = P\{X(\xi) \le x\}, \tag{4-10}$$
we may define the conditional distribution of the r.v X given the event B as
$$F_X(x|B) = P\{X(\xi) \le x \mid B\} = \frac{P\{(X(\xi) \le x) \cap B\}}{P(B)}. \tag{4-11}$$
Thus the definition of the conditional distribution depends on conditional probability, and since it obeys all probability axioms, it follows that the conditional distribution has the same properties as any distribution function. In particular
$$\begin{aligned}
F_X(+\infty|B) &= \frac{P\{(X(\xi) \le +\infty) \cap B\}}{P(B)} = \frac{P(B)}{P(B)} = 1, \\
F_X(-\infty|B) &= \frac{P\{(X(\xi) \le -\infty) \cap B\}}{P(B)} = \frac{P(\phi)}{P(B)} = 0.
\end{aligned} \tag{4-12}$$
Further
$$P(x_1 < X(\xi) \le x_2 \mid B) = \frac{P\{(x_1 < X(\xi) \le x_2) \cap B\}}{P(B)} = F_X(x_2|B) - F_X(x_1|B), \tag{4-13}$$
since for $x_2 \ge x_1$,
$$(X(\xi) \le x_2) = (X(\xi) \le x_1) \cup (x_1 < X(\xi) \le x_2). \tag{4-14}$$
The conditional density function is the derivative of the conditional distribution function. Thus
$$f_X(x|B) = \frac{dF_X(x|B)}{dx}, \tag{4-15}$$
and proceeding as in (3-26) we obtain
$$F_X(x|B) = \int_{-\infty}^{x} f_X(u|B)\,du. \tag{4-16}$$
Using (4-16), we can also rewrite (4-13) as
$$P(x_1 < X(\xi) \le x_2 \mid B) = \int_{x_1}^{x_2} f_X(x|B)\,dx. \tag{4-17}$$
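Example 4.4: Refer to example 3.2. Toss a coin and X(T)=0, X(H)=1. Suppose $B = \{H\}$. Determine $F_X(x|B)$.
Solution: From Example 3.2, $F_X(x)$ has the form shown in Fig. 4.3(a). We need $F_X(x|B)$ for all x.
For $x < 0$, $\{X(\xi) \le x\} = \phi$, so that $\{(X(\xi) \le x) \cap B\} = \phi$, and $F_X(x|B) = 0$.
For $0 \le x < 1$, $\{X(\xi) \le x\} = \{T\}$, so that $\{(X(\xi) \le x) \cap B\} = \{T\} \cap \{H\} = \phi$ and $F_X(x|B) = 0$.
For $x \ge 1$, $\{X(\xi) \le x\} = \Omega$, and $\{(X(\xi) \le x) \cap B\} = \Omega \cap B = B$ and
$$F_X(x|B) = \frac{P(B)}{P(B)} = 1$$
(see Fig. 4.3(b)).
[Fig. 4.3: (a) $F_X(x)$, with jumps to q at $x = 0$ and to 1 at $x = 1$; (b) $F_X(x|B)$, a unit step at $x = 1$.]
Example 4.5: Given $F_X(x)$, suppose $B = \{X(\xi) \le a\}$. Find $f_X(x|B)$.
Solution: We will first determine $F_X(x|B)$. From (4-11) and B as given above, we have
$$F_X(x|B) = \frac{P\{(X \le x) \cap (X \le a)\}}{P(X \le a)}. \tag{4-18}$$
For $x < a$, $(X \le x) \cap (X \le a) = (X \le x)$ so that
$$F_X(x|B) = \frac{P(X \le x)}{P(X \le a)} = \frac{F_X(x)}{F_X(a)}. \tag{4-19}$$
For $x \ge a$, $(X \le x) \cap (X \le a) = (X \le a)$ so that $F_X(x|B) = 1$. Thus
$$F_X(x|B) = \begin{cases} \dfrac{F_X(x)}{F_X(a)}, & x < a, \\ 1, & x \ge a, \end{cases} \tag{4-20}$$
and hence
$$f_X(x|B) = \frac{d}{dx} F_X(x|B) = \begin{cases} \dfrac{f_X(x)}{F_X(a)}, & x < a, \\ 0, & \text{otherwise.} \end{cases} \tag{4-21}$$
[Fig. 4.4: (a) $F_X(x|B)$ compared with $F_X(x)$; (b) $f_X(x|B)$ compared with $f_X(x)$; both conditional curves end at $x = a$.]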
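Example 4.6: Let B represent the event $\{a < X(\xi) \le b\}$ with $b > a$. For a given $F_X(x)$, determine $F_X(x|B)$ and $f_X(x|B)$.
Solution:
$$F_X(x|B) = P\{X(\xi) \le x \mid B\} = \frac{P\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\}}{P(a < X(\xi) \le b)} = \frac{P\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\}}{F_X(b) - F_X(a)}. \tag{4-22}$$
For $x < a$, we have $\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\} = \phi$, and hence
$$F_X(x|B) = 0. \tag{4-23}$$
For $a \le x < b$, we have $\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\} = \{a < X(\xi) \le x\}$ and hence
$$F_X(x|B) = \frac{P(a < X(\xi) \le x)}{F_X(b) - F_X(a)} = \frac{F_X(x) - F_X(a)}{F_X(b) - F_X(a)}. \tag{4-24}$$
For $x \ge b$, we have $\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\} = \{a < X(\xi) \le b\}$ so that
$$F_X(x|B) = 1. \tag{4-25}$$
Using (4-23)-(4-25), we get (see Fig. 4.5)
$$f_X(x|B) = \begin{cases} \dfrac{f_X(x)}{F_X(b) - F_X(a)}, & a < x \le b, \\ 0, & \text{otherwise.} \end{cases} \tag{4-26}$$
[Fig. 4.5: $f_X(x|B)$ compared with $f_X(x)$ on the interval $(a, b)$.]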
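To make (4-24) and (4-26) concrete, the sketch below conditions an exponential r.v (as in (3-32)) on the event $\{a < X \le b\}$ and checks numerically that the conditional density integrates to one. The parameter value `lam = 2.0`, the interval `(1, 3)` and the midpoint-rule helper are assumptions chosen purely for illustration.

```python
import math

lam = 2.0          # assumed exponential parameter (the mean), as in (3-32)

def F(x):
    """Exponential distribution function."""
    return 1.0 - math.exp(-x / lam) if x >= 0 else 0.0

def f(x):
    """Exponential density (3-32)."""
    return math.exp(-x / lam) / lam if x >= 0 else 0.0

def cond_cdf(x, a, b):
    """F_X(x | a < X <= b), following (4-23)-(4-25)."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (F(x) - F(a)) / (F(b) - F(a))

def cond_pdf(x, a, b):
    """f_X(x | a < X <= b), following (4-26)."""
    return f(x) / (F(b) - F(a)) if a < x <= b else 0.0

def midpoint_integral(g, lo, hi, n=10_000):
    """Simple midpoint-rule approximation of the integral of g over (lo, hi)."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

a, b = 1.0, 3.0
total_mass = midpoint_integral(lambda x: cond_pdf(x, a, b), a, b)
```

Inside $(a, b)$ the conditional density is just the original density rescaled by $1/(F_X(b) - F_X(a))$, which is why the total mass returns to one.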
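We can use the conditional p.d.f together with the Bayes' theorem to update our a-priori knowledge about the probability of events in the presence of new observations. Ideally, any new information should be used to update our knowledge. As we see in the next example, conditional p.d.f together with Bayes' theorem allow systematic updating. For any two events A and B, Bayes' theorem gives
$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}. \tag{4-27}$$
Let $B = \{x_1 < X(\xi) \le x_2\}$ so that (4-27) becomes (see (4-13) and (4-17))
$$\begin{aligned}
P\{A \mid (x_1 < X(\xi) \le x_2)\} &= \frac{P((x_1 < X(\xi) \le x_2) \mid A)\,P(A)}{P(x_1 < X(\xi) \le x_2)} \\
&= \frac{F_X(x_2|A) - F_X(x_1|A)}{F_X(x_2) - F_X(x_1)}\, P(A) = \frac{\int_{x_1}^{x_2} f_X(x|A)\,dx}{\int_{x_1}^{x_2} f_X(x)\,dx}\, P(A).
\end{aligned} \tag{4-28}$$
Further, let $x_1 = x$, $x_2 = x + \varepsilon$, $\varepsilon > 0$, so that in the limit as $\varepsilon \to 0$,
$$\lim_{\varepsilon \to 0} P\{A \mid (x < X(\xi) \le x + \varepsilon)\} = P(A \mid X(\xi) = x) = \frac{f_X(x|A)}{f_X(x)}\, P(A), \tag{4-29}$$
or
$$f_{X|A}(x|A) = \frac{P(A \mid X = x)\, f_X(x)}{P(A)}. \tag{4-30}$$
From (4-30), we also get
$$P(A) \underbrace{\int_{-\infty}^{+\infty} f_X(x|A)\,dx}_{1} = \int_{-\infty}^{+\infty} P(A \mid X = x)\, f_X(x)\,dx, \tag{4-31}$$
or
$$P(A) = \int_{-\infty}^{+\infty} P(A \mid X = x)\, f_X(x)\,dx \tag{4-32}$$
and using this in (4-30), we get the desired result
$$f_{X|A}(x|A) = \frac{P(A \mid X = x)\, f_X(x)}{\int_{-\infty}^{+\infty} P(A \mid X = x)\, f_X(x)\,dx}. \tag{4-33}$$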
To illustrate the usefulness of this formulation, let us reexamine the coin tossing problem.
Example 4.7: Let $p = P(H)$ represent the probability of obtaining a head in a toss. For a given coin, a-priori p can possess any value in the interval (0,1). In the absence of any additional information, we may assume the a-priori p.d.f $f_P(p)$ to be a uniform distribution in that interval (Fig. 4.6). Now suppose we actually perform an experiment of tossing the coin n times, and k heads are observed. This is new information. How can we update $f_P(p)$?
[Fig. 4.6: the uniform a-priori p.d.f $f_P(p)$ on the interval (0,1).]
Solution: Let A = "k heads in n specific tosses". Since these tosses result in a specific sequence,
$$P(A \mid P = p) = p^k q^{n-k}, \tag{4-34}$$
and using (4-32) we get
$$P(A) = \int_0^1 P(A \mid P = p)\, f_P(p)\,dp = \int_0^1 p^k (1-p)^{n-k}\,dp = \frac{(n-k)!\,k!}{(n+1)!}. \tag{4-35}$$
The a-posteriori p.d.f $f_{P|A}(p|A)$ represents the updated information given the event A, and from (4-30)
$$f_{P|A}(p|A) = \frac{P(A \mid P = p)\, f_P(p)}{P(A)} = \frac{(n+1)!}{(n-k)!\,k!}\, p^k q^{n-k}, \quad 0 < p < 1 \ \sim \beta(n, k). \tag{4-36}$$
Notice that the a-posteriori p.d.f of p in (4-36) is not a uniform distribution, but a beta distribution (Fig. 4.7). We can use this a-posteriori p.d.f to make further predictions. For example, in the light of the above experiment, what can we say about the probability of a head occurring in the next (n+1)th toss?
[Fig. 4.7: the a-posteriori p.d.f $f_{P|A}(p|A)$ on the interval (0,1).]
Let B = "head occurring in the (n+1)th toss, given that k heads have occurred in n previous tosses". Clearly $P(B \mid P = p) = p$, and from (4-32)
$$P(B) = \int_0^1 P(B \mid P = p)\, f_P(p|A)\,dp. \tag{4-37}$$
Notice that unlike (4-32), we have used the a-posteriori p.d.f in (4-37) to reflect our knowledge about the experiment already performed. Using (4-36) in (4-37), we get
$$P(B) = \int_0^1 p \cdot \frac{(n+1)!}{(n-k)!\,k!}\, p^k q^{n-k}\,dp = \frac{k+1}{n+2}. \tag{4-38}$$
Thus, if n = 10, and k = 6, then
$$P(B) = \frac{7}{12} = 0.58,$$
which is more realistic compared to p = 0.5.
To summarize, if the probability of an event X is unknown, one should make a noncommittal judgement about its a-priori probability density function $f_X(x)$. Usually the uniform distribution is a reasonable assumption in the absence of any other information. Then experimental results (A) are obtained, and our knowledge about X must be updated reflecting this new information. Bayes' rule helps to obtain the a-posteriori p.d.f of X given A, $f_{X|A}(x|A)$. From that point on, this a-posteriori p.d.f should be used to make further predictions and calculations.
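The closed forms (4-35) and (4-38) can be checked with exact arithmetic. The sketch below relies on the standard identity $\int_0^1 p^a (1-p)^b\,dp = a!\,b!/(a+b+1)!$ for nonnegative integers a and b; the function names `beta_integral` and `prob_next_head` are assumptions introduced only for this illustration.

```python
from fractions import Fraction
from math import factorial

def beta_integral(a, b):
    """Exact integral of p**a * (1-p)**b over (0,1), for nonnegative integers a, b."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))

def prob_next_head(n, k):
    """P(B) in (4-37): head on toss n+1 after observing k heads in n tosses,
    starting from a uniform a-priori p.d.f on (0,1)."""
    evidence = beta_integral(k, n - k)              # P(A), as in (4-35)
    return beta_integral(k + 1, n - k) / evidence   # integral of p * posterior (4-36)

p_b = prob_next_head(10, 6)
```

For n = 10 and k = 6 this gives 7/12, and in general the ratio simplifies to $(k+1)/(n+2)$ as stated in (4-38).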
Functions of a Random Variable
Let X be a r.v defined on the model $(\Omega, F, P)$, and suppose g(x) is a function of the variable x. Define
$$Y = g(X). \tag{5-1}$$
Is Y necessarily a r.v? If so what is its probability distribution function $F_Y(y)$, and its probability density function $f_Y(y)$?
In general, for a set B,
$$P(Y \in B) = P(X \in g^{-1}(B)). \tag{5-2}$$
In particular
$$F_Y(y) = P(Y(\xi) \le y) = P(g(X(\xi)) \le y) = P\big(X(\xi) \in g^{-1}(-\infty, y]\big). \tag{5-3}$$
Thus the distribution function as well as the density function of Y can be determined in terms of that of X. To obtain the distribution function of Y, we must determine the Borel set on the x-axis such that $X(\xi) \in g^{-1}(-\infty, y]$ for every given y, and the probability of that set. At this point, we shall consider some of the following functions $Y = g(X)$ to illustrate the technical details:
$$aX + b, \quad X^2, \quad |X|, \quad \sqrt{X}, \quad |X|\,U(x), \quad e^X, \quad \log X, \quad \frac{1}{X}, \quad \sin X.$$
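Example 5.1:
$$Y = aX + b \tag{5-4}$$
Solution: Suppose $a > 0$.
$$F_Y(y) = P(Y(\xi) \le y) = P(aX(\xi) + b \le y) = P\!\left(X(\xi) \le \frac{y-b}{a}\right) = F_X\!\left(\frac{y-b}{a}\right) \tag{5-5}$$
and
$$f_Y(y) = \frac{1}{a}\, f_X\!\left(\frac{y-b}{a}\right). \tag{5-6}$$
On the other hand if $a < 0$, then
$$F_Y(y) = P(Y(\xi) \le y) = P(aX(\xi) + b \le y) = P\!\left(X(\xi) > \frac{y-b}{a}\right) = 1 - F_X\!\left(\frac{y-b}{a}\right), \tag{5-7}$$
and hence
$$f_Y(y) = -\frac{1}{a}\, f_X\!\left(\frac{y-b}{a}\right). \tag{5-8}$$
From (5-6) and (5-8), we obtain (for all $a \neq 0$)
$$f_Y(y) = \frac{1}{|a|}\, f_X\!\left(\frac{y-b}{a}\right). \tag{5-9}$$
Example 5.2:
$$Y = X^2. \tag{5-10}$$
$$F_Y(y) = P(Y(\xi) \le y) = P(X^2(\xi) \le y). \tag{5-11}$$
If $y < 0$, then the event $\{X^2(\xi) \le y\} = \phi$, and hence
$$F_Y(y) = 0, \quad y < 0. \tag{5-12}$$
For $y > 0$, from Fig. 5.1, the event $\{Y(\xi) \le y\} = \{X^2(\xi) \le y\}$ is equivalent to $\{x_1 < X(\xi) \le x_2\}$.
[Fig. 5.1: the parabola $Y = X^2$; a level y cuts it at $x_1 = -\sqrt{y}$ and $x_2 = +\sqrt{y}$.]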
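Hence
$$F_Y(y) = P(x_1 < X(\xi) \le x_2) = F_X(x_2) - F_X(x_1) = F_X(\sqrt{y}) - F_X(-\sqrt{y}), \quad y > 0. \tag{5-13}$$
By direct differentiation, we get
$$f_Y(y) = \begin{cases} \dfrac{1}{2\sqrt{y}} \left( f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right), & y > 0, \\ 0, & \text{otherwise.} \end{cases} \tag{5-14}$$
If $f_X(x)$ represents an even function, then (5-14) reduces to
$$f_Y(y) = \frac{1}{\sqrt{y}}\, f_X(\sqrt{y})\, U(y). \tag{5-15}$$
In particular if $X \sim N(0, 1)$, so that
$$f_X(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \tag{5-16}$$
and substituting this into (5-14) or (5-15), we obtain the p.d.f of $Y = X^2$ to be
$$f_Y(y) = \frac{1}{\sqrt{2\pi y}}\, e^{-y/2}\, U(y). \tag{5-17}$$
On comparing this with (3-36), we notice that (5-17) represents a Chi-square r.v with n = 1, since $\Gamma(1/2) = \sqrt{\pi}$. Thus, if X is a Gaussian r.v with $\mu = 0$ (and unit variance, as in (5-16)), then $Y = X^2$ represents a Chi-square r.v with one degree of freedom (n = 1).
Example 5.3: Let
$$Y = g(X) = \begin{cases} X - c, & X > c, \\ 0, & -c < X \le c, \\ X + c, & X \le -c. \end{cases}$$
In this case
$$P(Y = 0) = P(-c < X(\xi) \le c) = F_X(c) - F_X(-c). \tag{5-18}$$
For $y > 0$, we have $x > c$, and $Y(\xi) = X(\xi) - c$ so that
$$F_Y(y) = P(Y(\xi) \le y) = P(X(\xi) - c \le y) = P(X(\xi) \le y + c) = F_X(y + c), \quad y > 0. \tag{5-19}$$
Similarly $y < 0$, if $x < -c$, and $Y(\xi) = X(\xi) + c$ so that
$$F_Y(y) = P(Y(\xi) \le y) = P(X(\xi) + c \le y) = P(X(\xi) \le y - c) = F_X(y - c), \quad y < 0. \tag{5-20}$$
Thus
$$f_Y(y) = \begin{cases} f_X(y + c), & y > 0, \\ [F_X(c) - F_X(-c)]\, \delta(y), & y = 0, \\ f_X(y - c), & y < 0. \end{cases} \tag{5-21}$$
[Fig. 5.2: (a) the function g(X), equal to zero between $-c$ and $c$; (b) $F_X(x)$; (c) $F_Y(y)$.]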
Note: As a general approach, given $Y = g(X)$, first sketch the graph $y = g(x)$, and determine the range space of y. Suppose $a < y < b$ is the range space of $y = g(x)$. Then clearly for $y < a$, $F_Y(y) = 0$, and for $y > b$, $F_Y(y) = 1$, so that $F_Y(y)$ can be nonzero only in $a < y < b$. Next, determine whether there are discontinuities in the range space of y. If so evaluate $P(Y(\xi) = y_i)$ at these discontinuities. In the continuous region of y, use the basic approach
$$F_Y(y) = P(g(X(\xi)) \le y)$$
and determine appropriate events in terms of the r.v X for every y. Finally, we must have $F_Y(y)$ for $-\infty < y < +\infty$, and obtain
$$f_Y(y) = \frac{dF_Y(y)}{dy} \quad \text{in } a < y < b.$$
However, if $Y = g(X)$ is a continuous function, it is easy to establish a direct procedure to obtain $f_Y(y)$:
$$f_Y(y) = \sum_i \frac{1}{|dy/dx|_{x_i}}\, f_X(x_i) = \sum_i \frac{1}{|g'(x_i)|}\, f_X(x_i). \tag{5-30}$$
The summation index i in (5-30) depends on y, and for every y the equation $y = g(x_i)$ must be solved to obtain the total number of solutions at every y, and the actual solutions $x_1, x_2, \ldots$ all in terms of y.
For example, if $Y = X^2$, then for all $y > 0$, $x_1 = -\sqrt{y}$ and $x_2 = +\sqrt{y}$ represent the two solutions for each y. Notice that the solutions $x_i$ are all in terms of y so that the right side of (5-30) is only a function of y. Referring back to the example $Y = X^2$ (Example 5.2) here for each $y > 0$, there are two solutions given by $x_1 = -\sqrt{y}$ and $x_2 = +\sqrt{y}$ ($f_Y(y) = 0$ for $y < 0$). Moreover
$$\frac{dy}{dx} = 2x \quad \text{so that} \quad \left| \frac{dy}{dx} \right|_{x = x_i} = 2\sqrt{y}$$
and using (5-30) we get
$$f_Y(y) = \begin{cases} \dfrac{1}{2\sqrt{y}} \left( f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right), & y > 0, \\ 0, & \text{otherwise,} \end{cases} \tag{5-31}$$
which agrees with (5-14).
[Fig. 5.5: the parabola $Y = X^2$ with the two solutions $x_1$ and $x_2$ for a given y.]
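The root-sum formula (5-30) for $Y = X^2$ can be compared numerically with the closed form (5-17) and with a finite-difference derivative of (5-13). The sketch below assumes $X \sim N(0,1)$; the function names and the test points are illustrative choices, and `math.erf` is used to express the standard normal distribution function.

```python
import math

def f_x(x):
    """Standard normal density (5-16)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def F_x(x):
    """Standard normal distribution function, written via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def f_y_root_sum(y):
    """f_Y(y) for Y = X**2 from (5-30): sum over the roots of y = x**2."""
    if y <= 0:
        return 0.0
    roots = (-math.sqrt(y), math.sqrt(y))
    return sum(f_x(r) / abs(2.0 * r) for r in roots)   # |dy/dx| = |2x|

def f_y_closed_form(y):
    """Chi-square density with one degree of freedom, as in (5-17)."""
    return math.exp(-y / 2.0) / math.sqrt(2.0 * math.pi * y) if y > 0 else 0.0

def F_y(y):
    """Distribution function of Y from (5-13)."""
    return F_x(math.sqrt(y)) - F_x(-math.sqrt(y)) if y > 0 else 0.0

def numeric_derivative(g, y, h=1e-5):
    """Central-difference approximation of g'(y)."""
    return (g(y + h) - g(y - h)) / (2.0 * h)

test_points = (0.1, 1.0, 2.5)
```

At each test point the three routes (root sum, closed form, and differentiating $F_Y$) should agree to within numerical error.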
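Example 5.5: $Y = \dfrac{1}{X}$. Find $f_Y(y)$. (5-32)
Solution: Here for every y, $x_1 = 1/y$ is the only solution, and
$$\frac{dy}{dx} = -\frac{1}{x^2} \quad \text{so that} \quad \left| \frac{dy}{dx} \right|_{x = x_1} = \frac{1}{1/y^2} = y^2,$$
and substituting this into (5-30), we obtain
$$f_Y(y) = \frac{1}{y^2}\, f_X\!\left(\frac{1}{y}\right). \tag{5-33}$$
In particular, suppose X is a Cauchy r.v as in (3-39) with parameter $\alpha$ so that
$$f_X(x) = \frac{\alpha/\pi}{\alpha^2 + x^2}, \quad -\infty < x < +\infty. \tag{5-34}$$
In that case from (5-33), $Y = 1/X$ has the p.d.f
$$f_Y(y) = \frac{1}{y^2} \cdot \frac{\alpha/\pi}{\alpha^2 + (1/y)^2} = \frac{(1/\alpha)/\pi}{(1/\alpha)^2 + y^2}, \quad -\infty < y < +\infty. \tag{5-35}$$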
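But (5-35) represents the p.d.f of a Cauchy r.v with parameter $1/\alpha$. Thus if $X \sim C(\alpha)$, then $1/X \sim C(1/\alpha)$.
Example 5.6: Suppose $f_X(x) = 2x/\pi^2$, $0 < x < \pi$, and $Y = \sin X$. Determine $f_Y(y)$.
Solution: Since X has zero probability of falling outside the interval $(0, \pi)$, $y = \sin x$ has zero probability of falling outside the interval $(0, 1)$. Clearly $f_Y(y) = 0$ outside this interval. For any $0 < y < 1$, from Fig.5.6(b), the equation $y = \sin x$ has an infinite number of solutions $\ldots, x_1, x_2, x_3, \ldots$, where $x_1 = \sin^{-1} y$ is the principal solution. Moreover, using the symmetry we also get $x_2 = \pi - x_1$ etc. Further,
$$\frac{dy}{dx} = \cos x = \sqrt{1 - \sin^2 x} = \sqrt{1 - y^2}$$
so that
$$\left| \frac{dy}{dx} \right|_{x = x_i} = \sqrt{1 - y^2}.$$
Using this in (5-30), we obtain for $0 < y < 1$,
$$f_Y(y) = \sum_{\substack{i = -\infty \\ i \neq 0}}^{+\infty} \frac{1}{\sqrt{1 - y^2}}\, f_X(x_i). \tag{5-36}$$
But from Fig. 5.6(a), in this case $f_X(x_{-1}) = f_X(x_3) = f_X(x_4) = \cdots = 0$ (except for $f_X(x_1)$ and $f_X(x_2)$ the rest are all zeros).
[Fig. 5.6: (a) $f_X(x)$ on $(0, \pi)$; (b) $y = \sin x$ with the solutions $x_{-1}, x_1, x_2, x_3$ marked.]
Thus (Fig. 5.7)
$$f_Y(y) = \frac{1}{\sqrt{1 - y^2}} \left( f_X(x_1) + f_X(x_2) \right) = \frac{1}{\sqrt{1 - y^2}} \left( \frac{2x_1}{\pi^2} + \frac{2x_2}{\pi^2} \right) = \frac{2(x_1 + \pi - x_1)}{\pi^2 \sqrt{1 - y^2}} = \begin{cases} \dfrac{2}{\pi \sqrt{1 - y^2}}, & 0 < y < 1, \\ 0, & \text{otherwise.} \end{cases} \tag{5-37}$$
[Fig. 5.7: $f_Y(y)$ on $(0, 1)$, starting at $2/\pi$ and growing without bound as $y \to 1$.]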
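Example 5.6 can be cross-checked by computing $F_Y(y)$ directly from the event $\{\sin X \le y\} = \{X \le x_1\} \cup \{X \ge \pi - x_1\}$ and differentiating numerically. This is a sketch under the example's assumptions ($f_X(x) = 2x/\pi^2$ on $(0, \pi)$, hence $F_X(x) = x^2/\pi^2$ there); the helper names are illustrative.

```python
import math

def F_x(x):
    """Distribution function of X for f_X(x) = 2x / pi**2 on (0, pi)."""
    if x <= 0:
        return 0.0
    if x >= math.pi:
        return 1.0
    return x * x / math.pi ** 2

def F_y(y):
    """P(sin X <= y) = P(X <= x1) + P(X >= pi - x1), with x1 = asin(y)."""
    if y <= 0:
        return 0.0
    if y >= 1:
        return 1.0
    x1 = math.asin(y)                         # principal solution
    return F_x(x1) + 1.0 - F_x(math.pi - x1)

def f_y(y):
    """Closed-form density (5-37)."""
    return 2.0 / (math.pi * math.sqrt(1.0 - y * y)) if 0 < y < 1 else 0.0

def numeric_derivative(g, y, h=1e-6):
    """Central-difference approximation of g'(y)."""
    return (g(y + h) - g(y - h)) / (2.0 * h)

check_points = (0.3, 0.7)
```

Working the algebra through, $F_Y(y) = 2\sin^{-1}(y)/\pi$ on $(0, 1)$, whose derivative is exactly (5-37).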
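Mean or the Expected Value of a r.v X is defined as
$$\eta_X = \overline{X} = E(X) = \int_{-\infty}^{+\infty} x\, f_X(x)\,dx. \tag{6-2}$$
If X is a discrete-type r.v, then using (3-25) we get
$$\eta_X = \overline{X} = E(X) = \int x \sum_i p_i\, \delta(x - x_i)\,dx = \sum_i x_i p_i \underbrace{\int \delta(x - x_i)\,dx}_{1} = \sum_i x_i p_i = \sum_i x_i P(X = x_i). \tag{6-3}$$
Mean represents the average (mean) value of the r.v in a very large number of trials. For example if $X \sim U(a, b)$, then using (3-31),
$$E(X) = \int_a^b \frac{x}{b-a}\,dx = \frac{1}{b-a} \left. \frac{x^2}{2} \right|_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2} \tag{6-4}$$
is the midpoint of the interval (a,b).
On the other hand if X is exponential with parameter $\lambda$ as in (3-32), then
$$E(X) = \int_0^{\infty} \frac{x}{\lambda}\, e^{-x/\lambda}\,dx = \lambda \int_0^{\infty} y\, e^{-y}\,dy = \lambda, \tag{6-5}$$
implying that the parameter $\lambda$ in (3-32) represents the mean value of the exponential r.v.
Similarly if X is Poisson with parameter $\lambda$ as in (3-45), using (6-3), we get
$$E(X) = \sum_{k=0}^{\infty} k P(X = k) = \sum_{k=0}^{\infty} k\, e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^k}{(k-1)!} = \lambda e^{-\lambda} \sum_{i=0}^{\infty} \frac{\lambda^i}{i!} = \lambda e^{-\lambda} e^{\lambda} = \lambda. \tag{6-6}$$
Thus the parameter $\lambda$ in (3-45) also represents the mean of the Poisson r.v.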
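In a similar manner, if X is binomial as in (3-44), then its mean is given by
$$\begin{aligned}
E(X) &= \sum_{k=0}^{n} k P(X = k) = \sum_{k=0}^{n} k \binom{n}{k} p^k q^{n-k} = \sum_{k=1}^{n} k\, \frac{n!}{(n-k)!\,k!}\, p^k q^{n-k} \\
&= \sum_{k=1}^{n} \frac{n!}{(n-k)!\,(k-1)!}\, p^k q^{n-k} = np \sum_{i=0}^{n-1} \frac{(n-1)!}{(n-i-1)!\,i!}\, p^i q^{n-i-1} = np(p+q)^{n-1} = np.
\end{aligned} \tag{6-7}$$
Thus np represents the mean of the binomial r.v in (3-44).
For the normal r.v in (3-29),
$$\begin{aligned}
E(X) &= \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} x\, e^{-(x-\mu)^2/2\sigma^2}\,dx = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} (y + \mu)\, e^{-y^2/2\sigma^2}\,dy \\
&= \frac{1}{\sqrt{2\pi\sigma^2}} \underbrace{\int_{-\infty}^{+\infty} y\, e^{-y^2/2\sigma^2}\,dy}_{0} + \mu \cdot \underbrace{\frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} e^{-y^2/2\sigma^2}\,dy}_{1} = \mu.
\end{aligned} \tag{6-8}$$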
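The binomial result (6-7) can be confirmed with exact rational arithmetic by summing $k\,P(X = k)$ over the p.m.f (3-44). The values `n = 12` and `p = 1/3` below are arbitrary assumptions for illustration, and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """List of P(X = k), k = 0..n, for X ~ B(n, p) as in (3-44)."""
    q = 1 - p
    return [comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)]

def mean(pmf):
    """Discrete mean (6-3): sum of k * P(X = k)."""
    return sum(k * pk for k, pk in enumerate(pmf))

n, p = 12, Fraction(1, 3)      # assumed example parameters
pmf = binomial_pmf(n, p)
mu = mean(pmf)
```

Because `p` is a `Fraction`, every term is exact, so the computed mean equals `n * p` with no rounding.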
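Thus the first parameter in $X \sim N(\mu, \sigma^2)$ is in fact the mean of the Gaussian r.v X. Given $X \sim f_X(x)$, suppose $Y = g(X)$ defines a new r.v with p.d.f $f_Y(y)$. Then from the previous discussion, the new r.v Y has a mean $\mu_Y$ given by (see (6-2))
$$\mu_Y = E(Y) = \int_{-\infty}^{+\infty} y\, f_Y(y)\,dy. \tag{6-9}$$
From (6-9), it appears that to determine $E(Y)$, we need to determine $f_Y(y)$. However this is not the case if only $E(Y)$ is the quantity of interest. It turns out that,
$$E(Y) = E(g(X)) = \int_{-\infty}^{+\infty} y\, f_Y(y)\,dy = \int_{-\infty}^{+\infty} g(x)\, f_X(x)\,dx. \tag{6-13}$$
In the discrete case, (6-13) reduces to
$$E(Y) = \sum_i g(x_i) P(X = x_i). \tag{6-14}$$
We can use (6-14) to determine the mean of $Y = X^2$, where X is a Poisson r.v. Using (3-45)
$$\begin{aligned}
E(X^2) &= \sum_{k=0}^{\infty} k^2 P(X = k) = \sum_{k=0}^{\infty} k^2 e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} k\, \frac{\lambda^k}{(k-1)!} = e^{-\lambda} \sum_{i=0}^{\infty} (i+1)\, \frac{\lambda^{i+1}}{i!} \\
&= \lambda e^{-\lambda} \left( \sum_{i=0}^{\infty} i\, \frac{\lambda^i}{i!} + \sum_{i=0}^{\infty} \frac{\lambda^i}{i!} \right) = \lambda e^{-\lambda} \left( \sum_{i=1}^{\infty} \frac{\lambda^i}{(i-1)!} + e^{\lambda} \right) \\
&= \lambda e^{-\lambda} \left( \lambda \sum_{m=0}^{\infty} \frac{\lambda^m}{m!} + e^{\lambda} \right) = \lambda e^{-\lambda} \left( \lambda e^{\lambda} + e^{\lambda} \right) = \lambda^2 + \lambda.
\end{aligned} \tag{6-15}$$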
Mean alone will not be able to truly represent the p.d.f of any r.v. To illustrate this, consider the following scenario: Consider two Gaussian r.vs $X_1 \sim N(0, 1)$ and $X_2 \sim N(0, 10)$. Both of them have the same mean $\mu = 0$. However, as Fig. 6.1 shows, their p.d.fs are quite different. One is more concentrated around the mean, whereas the other one $(X_2)$ has a wider spread. Clearly, we need at least an additional parameter to measure this spread around the mean!
[Fig. 6.1: (a) $f_{X_1}(x_1)$ with $\sigma^2 = 1$; (b) $f_{X_2}(x_2)$ with $\sigma^2 = 10$.]
For a r.v X with mean $\mu$, $X - \mu$ represents the deviation of the r.v from its mean. Since this deviation can be either positive or negative, consider the quantity $(X - \mu)^2$, and its average value $E[(X - \mu)^2]$ represents the average mean square deviation of X around its mean. Define
$$\sigma_X^2 = E[(X - \mu)^2] > 0. \tag{6-16}$$
With $g(X) = (X - \mu)^2$ and using (6-13) we get
$$\sigma_X^2 = \int_{-\infty}^{+\infty} (x - \mu)^2 f_X(x)\,dx > 0. \tag{6-17}$$
$\sigma_X^2$ is known as the variance of the r.v X, and its square root $\sigma_X = \sqrt{E(X - \mu)^2}$ is known as the standard deviation of X. Note that the standard deviation represents the root mean square spread of the r.v X around its mean $\mu$.
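Expanding (6-17) and using the linearity of the integrals, we get
$$\begin{aligned}
Var(X) = \sigma_X^2 &= \int_{-\infty}^{+\infty} (x^2 - 2x\mu + \mu^2) f_X(x)\,dx \\
&= \int_{-\infty}^{+\infty} x^2 f_X(x)\,dx - 2\mu \int_{-\infty}^{+\infty} x f_X(x)\,dx + \mu^2 \\
&= E(X^2) - \mu^2 = E(X^2) - [E(X)]^2 = \overline{X^2} - \overline{X}^2.
\end{aligned} \tag{6-18}$$
Alternatively, we can use (6-18) to compute $\sigma_X^2$.
Thus, for example, returning back to the Poisson r.v in (3-45), using (6-6) and (6-15), we get
$$\sigma_X^2 = \overline{X^2} - \overline{X}^2 = (\lambda^2 + \lambda) - \lambda^2 = \lambda. \tag{6-19}$$
Thus for a Poisson r.v, mean and variance are both equal to its parameter $\lambda$.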
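The Poisson results (6-6), (6-15) and (6-19) can be checked numerically by summing the series for the first two moments. The sketch below is illustrative: the value `lam = 3.5`, the truncation at 200 terms, and the function name `poisson_moments` are all assumptions. The p.m.f is updated recursively via $P(X = k) = P(X = k-1)\,\lambda/k$ to avoid large factorials.

```python
import math

def poisson_moments(lam, terms=200):
    """Approximate E(X) and E(X**2) for X ~ P(lam) by truncating the series."""
    pk = math.exp(-lam)            # P(X = 0)
    m1 = 0.0                       # running sum for E(X)
    m2 = 0.0                       # running sum for E(X**2)
    for k in range(1, terms):
        pk *= lam / k              # P(X = k) from P(X = k - 1)
        m1 += k * pk
        m2 += k * k * pk
    return m1, m2

lam = 3.5                          # assumed example parameter
m1, m2 = poisson_moments(lam)
variance = m2 - m1 ** 2            # variance via (6-18)
```

For moderate `lam` the neglected tail is far below floating-point precision, so the sums reproduce $\lambda$, $\lambda^2 + \lambda$ and $\lambda$ for the mean, second moment and variance.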
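To determine the variance of the normal r.v $N(\mu, \sigma^2)$, we can use (6-16). Thus from (3-29)
$$Var(X) = E[(X - \mu)^2] = \int_{-\infty}^{+\infty} (x - \mu)^2\, \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}\,dx. \tag{6-20}$$
To simplify (6-20), we can make use of the identity
$$\int_{-\infty}^{+\infty} f_X(x)\,dx = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}\,dx = 1$$
for a normal p.d.f. This gives
$$\int_{-\infty}^{+\infty} e^{-(x-\mu)^2/2\sigma^2}\,dx = \sqrt{2\pi}\,\sigma. \tag{6-21}$$
Differentiating both sides of (6-21) with respect to $\sigma$, we get
$$\int_{-\infty}^{+\infty} \frac{(x - \mu)^2}{\sigma^3}\, e^{-(x-\mu)^2/2\sigma^2}\,dx = \sqrt{2\pi}$$
or
$$\int_{-\infty}^{+\infty} (x - \mu)^2\, \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}\,dx = \sigma^2, \tag{6-22}$$
which represents the $Var(X)$ in (6-20). Thus for a normal r.v as in (3-29)
$$Var(X) = \sigma^2 \tag{6-23}$$
and the second parameter in $N(\mu, \sigma^2)$ in fact represents the variance of the Gaussian r.v. As Fig. 6.1 shows, the larger the $\sigma$, the larger the spread of the p.d.f around its mean. Thus as the variance of a r.v tends to zero, it will begin to concentrate more and more around the mean, ultimately behaving like a constant.
Moments: As remarked earlier, in general
$$m_n = \overline{X^n} = E(X^n), \quad n \ge 1 \tag{6-24}$$
are known as the moments of the r.v X.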
Two Random Variables
In many experiments, the observations are expressible not
as a single quantity, but as a family of quantities. For
example to record the height and weight of each person in
a community or the number of people and the total income
in a family, we need two numbers.
Let X and Y denote two random variables (r.v) based on a
probability model $(\Omega, F, P)$. Then
$$P(x_1 < X(\xi) \le x_2) = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} f_X(x)\,dx,$$
and
$$P(y_1 < Y(\xi) \le y_2) = F_Y(y_2) - F_Y(y_1) = \int_{y_1}^{y_2} f_Y(y)\,dy.$$
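What about the probability that the pair of r.vs (X,Y) belongs to an arbitrary region D? In other words, how does one estimate, for example,
$$P[(x_1 < X(\xi) \le x_2) \cap (y_1 < Y(\xi) \le y_2)] = \,?$$
Towards this, we define the joint probability distribution function of X and Y to be
$$F_{XY}(x,y) = P[(X(\xi) \le x) \cap (Y(\xi) \le y)] = P(X \le x, Y \le y) \ge 0, \tag{7-1}$$
where x and y are arbitrary real numbers.
Properties
(i)
$$F_{XY}(-\infty, y) = F_{XY}(x, -\infty) = 0, \quad F_{XY}(+\infty, +\infty) = 1. \tag{7-2}$$
Since $(X(\xi) \le -\infty,\ Y(\xi) \le y) \subset (X(\xi) \le -\infty)$, we get
$$F_{XY}(-\infty, y) \le P(X(\xi) \le -\infty) = 0.$$
Similarly, since $(X(\xi) \le +\infty,\ Y(\xi) \le +\infty) = \Omega$, we get
$$F_{XY}(+\infty, +\infty) = P(\Omega) = 1.$$
(ii)
$$P(x_1 < X(\xi) \le x_2,\ Y(\xi) \le y) = F_{XY}(x_2, y) - F_{XY}(x_1, y). \tag{7-3}$$
$$P(X(\xi) \le x,\ y_1 < Y(\xi) \le y_2) = F_{XY}(x, y_2) - F_{XY}(x, y_1). \tag{7-4}$$
To prove (7-3), we note that for $x_2 > x_1$,
$$(X(\xi) \le x_2,\ Y(\xi) \le y) = (X(\xi) \le x_1,\ Y(\xi) \le y) \cup (x_1 < X(\xi) \le x_2,\ Y(\xi) \le y)$$
and the mutually exclusive property of the events on the right side gives
$$P(X(\xi) \le x_2,\ Y(\xi) \le y) = P(X(\xi) \le x_1,\ Y(\xi) \le y) + P(x_1 < X(\xi) \le x_2,\ Y(\xi) \le y)$$
which proves (7-3). Similarly (7-4) follows.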
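(iii)
$$P(x_1 < X(\xi) \le x_2,\ y_1 < Y(\xi) \le y_2) = F_{XY}(x_2, y_2) - F_{XY}(x_2, y_1) - F_{XY}(x_1, y_2) + F_{XY}(x_1, y_1). \tag{7-5}$$
This is the probability that (X,Y) belongs to the rectangle $R_0$ in Fig. 7.1. To prove (7-5), we can make use of the following identity involving mutually exclusive events on the right side:
$$(x_1 < X(\xi) \le x_2,\ Y(\xi) \le y_2) = (x_1 < X(\xi) \le x_2,\ Y(\xi) \le y_1) \cup (x_1 < X(\xi) \le x_2,\ y_1 < Y(\xi) \le y_2).$$
[Fig. 7.1: the rectangle $R_0$ with corners at $x_1, x_2$ on the X axis and $y_1, y_2$ on the Y axis.]
This gives
$$P(x_1 < X(\xi) \le x_2,\ Y(\xi) \le y_2) = P(x_1 < X(\xi) \le x_2,\ Y(\xi) \le y_1) + P(x_1 < X(\xi) \le x_2,\ y_1 < Y(\xi) \le y_2)$$
and the desired result in (7-5) follows by making use of (7-3) with $y = y_2$ and $y = y_1$ respectively.
Joint probability density function (Joint p.d.f)
By definition, the joint p.d.f of X and Y is given by
$$f_{XY}(x,y) = \frac{\partial^2 F_{XY}(x,y)}{\partial x\, \partial y} \tag{7-6}$$
and hence we obtain the useful formula
$$F_{XY}(x,y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{XY}(u,v)\,dv\,du. \tag{7-7}$$
Using (7-2), we also get
$$\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f_{XY}(x,y)\,dx\,dy = 1. \tag{7-8}$$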
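To find the probability that (X,Y) belongs to an arbitrary region D, we can make use of (7-5) and (7-7). From (7-5) and (7-7)
$$\begin{aligned}
P(x < X(\xi) \le x + \Delta x,\ y < Y(\xi) \le y + \Delta y) &= F_{XY}(x + \Delta x, y + \Delta y) - F_{XY}(x, y + \Delta y) - F_{XY}(x + \Delta x, y) + F_{XY}(x, y) \\
&= \int_{x}^{x + \Delta x} \int_{y}^{y + \Delta y} f_{XY}(u,v)\,dv\,du \approx f_{XY}(x,y)\,\Delta x\,\Delta y.
\end{aligned} \tag{7-9}$$
Thus the probability that (X,Y) belongs to a differential rectangle $\Delta x\,\Delta y$ equals $f_{XY}(x,y)\,\Delta x\,\Delta y$, and repeating this procedure over the union of non-overlapping differential rectangles in D, we get the useful result
$$P((X,Y) \in D) = \iint_{(x,y) \in D} f_{XY}(x,y)\,dx\,dy. \tag{7-10}$$
[Fig. 7.2: a region D in the XY plane containing a differential rectangle of sides $\Delta x$ and $\Delta y$.]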
(iv) Marginal Statistics
In the context of several r.vs, the statistics of each individual one are called marginal statistics. Thus $F_X(x)$ is the marginal probability distribution function of X, and $f_X(x)$ is the marginal p.d.f of X. It is interesting to note that all marginals can be obtained from the joint p.d.f. In fact
$$F_X(x) = F_{XY}(x, +\infty), \quad F_Y(y) = F_{XY}(+\infty, y). \tag{7-11}$$
Also
$$f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x,y)\,dy, \quad f_Y(y) = \int_{-\infty}^{+\infty} f_{XY}(x,y)\,dx. \tag{7-12}$$
To prove (7-11), we can make use of the identity
$$(X \le x) = (X \le x) \cap (Y \le +\infty)$$
so that
$$F_X(x) = P(X \le x) = P(X \le x,\ Y \le +\infty) = F_{XY}(x, +\infty).$$
To prove (7-12), we can make use of (7-7) and (7-11), which gives
$$F_X(x) = F_{XY}(x, +\infty) = \int_{-\infty}^{x} \left( \int_{-\infty}^{+\infty} f_{XY}(u,y)\,dy \right) du \tag{7-13}$$
and taking the derivative with respect to x in (7-13), we get
$$f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x,y)\,dy. \tag{7-14}$$
At this point, it is useful to know the formula for differentiation under integrals. Let
$$H(x) = \int_{a(x)}^{b(x)} h(x,y)\,dy. \tag{7-15}$$
Then its derivative with respect to x is given by
$$\frac{dH(x)}{dx} = \frac{db(x)}{dx}\, h(x, b) - \frac{da(x)}{dx}\, h(x, a) + \int_{a(x)}^{b(x)} \frac{\partial h(x,y)}{\partial x}\,dy. \tag{7-16}$$
Obvious use of (7-16) in (7-13) gives (7-14).
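If X and Y are discrete r.vs, then $p_{ij} = P(X = x_i, Y = y_j)$ represents their joint p.d.f, and their respective marginal p.d.fs are given by
$$P(X = x_i) = \sum_j P(X = x_i, Y = y_j) = \sum_j p_{ij} \tag{7-17}$$
and
$$P(Y = y_j) = \sum_i P(X = x_i, Y = y_j) = \sum_i p_{ij}. \tag{7-18}$$
Assuming that $P(X = x_i, Y = y_j)$ is written out in the form of a rectangular array, to obtain $P(X = x_i)$ from (7-17), one needs to add up all entries in the i-th row.
$$\begin{array}{cccccc|c}
p_{11} & p_{12} & \cdots & p_{1j} & \cdots & p_{1n} & \\
p_{21} & p_{22} & \cdots & p_{2j} & \cdots & p_{2n} & \\
\vdots & \vdots & & \vdots & & \vdots & \\
p_{i1} & p_{i2} & \cdots & p_{ij} & \cdots & p_{in} & \sum_j p_{ij} \\
\vdots & \vdots & & \vdots & & \vdots & \\
p_{m1} & p_{m2} & \cdots & p_{mj} & \cdots & p_{mn} & \\
\hline
 & & & \sum_i p_{ij} & & &
\end{array}$$
Fig. 7.3
It used to be routine practice for insurance companies to scribble out these sum values in the left and top margins, thus suggesting the name marginal densities! (Fig. 7.3).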
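The row and column sums in (7-17) and (7-18) can be sketched in a few lines. The array `p` below is a hypothetical joint p.m.f. whose entries are made up purely for illustration.

```python
# Hypothetical joint p.m.f.: p[i][j] = P(X = x_i, Y = y_j). Entries are illustrative only.
p = [
    [0.10, 0.20, 0.10],
    [0.05, 0.25, 0.30],
]

p_x = [sum(row) for row in p]          # (7-17): add up each row
p_y = [sum(col) for col in zip(*p)]    # (7-18): add up each column

print(p_x)
print(p_y)
```

Each list of marginal probabilities should itself sum to one, as must the full array.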
From (7-11) and (7-12), the joint probability distribution function (P.D.F) and/or the joint p.d.f represent complete information about the r.vs, and their marginal p.d.fs can be evaluated from the joint p.d.f. However, given the marginals, it will (most often) not be possible to compute the joint p.d.f. Consider the following example:
Example 7.1: Given
$$f_{XY}(x,y) = \begin{cases} \text{constant}, & 0 < x < y < 1, \\ 0, & \text{otherwise}. \end{cases} \tag{7-19}$$
Obtain the marginal p.d.fs $f_X(x)$ and $f_Y(y)$.
Solution: It is given that the joint p.d.f $f_{XY}(x,y)$ is a constant in the shaded region in Fig. 7.4. We can use (7-8) to determine that constant c. From (7-8)
$$\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f_{XY}(x,y)\,dx\,dy = \int_{y=0}^{1} \left( \int_{x=0}^{y} c\,dx \right) dy = \int_{y=0}^{1} c\,y\,dy = \left. \frac{c\,y^2}{2} \right|_0^1 = \frac{c}{2} = 1. \tag{7-20}$$
[Fig. 7.4: the shaded triangular region $0 < x < y < 1$ in the XY plane.]
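Thus c = 2. Moreover from (7-14)
$$f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x,y)\,dy = \int_{y=x}^{1} 2\,dy = 2(1 - x), \quad 0 < x < 1, \tag{7-21}$$
and similarly
$$f_Y(y) = \int_{-\infty}^{+\infty} f_{XY}(x,y)\,dx = \int_{x=0}^{y} 2\,dx = 2y, \quad 0 < y < 1. \tag{7-22}$$
Clearly, in this case given $f_X(x)$ and $f_Y(y)$ as in (7-21)-(7-22), it will not be possible to obtain the original joint p.d.f in (7-19).
Example 7.2: X and Y are said to be jointly normal (Gaussian) distributed, if their joint p.d.f has the following form:
$$f_{XY}(x,y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \exp\left\{ \frac{-1}{2(1 - \rho^2)} \left[ \frac{(x - \mu_X)^2}{\sigma_X^2} - \frac{2\rho (x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} \right] \right\},$$
$$-\infty < x < +\infty, \quad -\infty < y < +\infty, \quad |\rho| < 1. \tag{7-23}$$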
By direct integration, using (7-14) and completing the square in (7-23), it can be shown that
$$f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x,y)\,dy = \frac{1}{\sqrt{2\pi \sigma_X^2}}\, e^{-(x - \mu_X)^2 / 2\sigma_X^2} \sim N(\mu_X, \sigma_X^2), \tag{7-24}$$
and similarly
$$f_Y(y) = \int_{-\infty}^{+\infty} f_{XY}(x,y)\,dx = \frac{1}{\sqrt{2\pi \sigma_Y^2}}\, e^{-(y - \mu_Y)^2 / 2\sigma_Y^2} \sim N(\mu_Y, \sigma_Y^2). \tag{7-25}$$
Following the above notation, we will denote (7-23) as $N(\mu_X, \mu_Y, \sigma_X^2, \sigma_Y^2, \rho)$. Once again, knowing the marginals in (7-24) and (7-25) alone doesn't tell us everything about the joint p.d.f in (7-23).
As we show below, the only situation where the marginal p.d.fs can be used to recover the joint p.d.f is when the random variables are statistically independent.
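Returning to Example 7.1, the normalization (7-20) and the marginals (7-21)-(7-22) can be checked numerically. The sketch below uses a simple midpoint rule; the helper name `integrate`, the grid sizes and the test point 0.3 are assumptions made for this illustration only.

```python
def f_xy(x, y):
    """Joint p.d.f. of Example 7.1 with c = 2."""
    return 2.0 if 0.0 < x < y < 1.0 else 0.0

def integrate(g, a, b, n=2000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f_x(x):
    """Marginal of X via (7-14): integrate the joint p.d.f. over y."""
    return integrate(lambda y: f_xy(x, y), 0.0, 1.0)

def f_y(y):
    """Marginal of Y: integrate the joint p.d.f. over x."""
    return integrate(lambda x: f_xy(x, y), 0.0, 1.0)

total = integrate(f_x, 0.0, 1.0, n=200)   # should be close to 1, as in (7-8)
print(total, f_x(0.3), f_y(0.3))          # compare with 2(1 - 0.3) and 2(0.3)
```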
Independence of r.vs
Definition: The random variables X and Y are said to be statistically independent if the events $\{X(\xi) \in A\}$ and $\{Y(\xi) \in B\}$ are independent events for any two Borel sets A and B in the x and y axes respectively. Applying the above definition to the events $\{X(\xi) \le x\}$ and $\{Y(\xi) \le y\}$, we conclude that, if the r.vs X and Y are independent, then
$$P((X(\xi) \le x) \cap (Y(\xi) \le y)) = P(X(\xi) \le x)\, P(Y(\xi) \le y) \tag{7-26}$$
i.e.,
$$F_{XY}(x,y) = F_X(x)\, F_Y(y) \tag{7-27}$$
or equivalently, if X and Y are independent, then we must have
$$f_{XY}(x,y) = f_X(x)\, f_Y(y). \tag{7-28}$$
If X and Y are discrete-type r.vs then their independence implies
$$P(X = x_i, Y = y_j) = P(X = x_i)\, P(Y = y_j) \quad \text{for all } i, j. \tag{7-29}$$
Equations (7-26)-(7-29) give us the procedure to test for independence. Given $f_{XY}(x,y)$, obtain the marginal p.d.fs $f_X(x)$ and $f_Y(y)$ and examine whether (7-28) or (7-29) is valid. If so, the r.vs are independent, otherwise they are dependent. Returning to Example 7.1, from (7-19)-(7-22), we observe by direct verification that $f_{XY}(x,y) \ne f_X(x)\, f_Y(y)$. Hence X and Y are dependent r.vs in that case. It is easy to see that the same holds in Example 7.2, unless $\rho = 0$. In other words, two jointly Gaussian r.vs as in (7-23) are independent if and only if the fifth parameter $\rho = 0$.
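Example 7.3: Given
$$f_{XY}(x,y) = \begin{cases} x\,y^2 e^{-y}, & 0 < y < \infty,\ 0 < x < 1, \\ 0, & \text{otherwise}. \end{cases} \tag{7-30}$$
Determine whether X and Y are independent.
Solution:
$$f_X(x) = \int_{0}^{\infty} f_{XY}(x,y)\,dy = x \int_{0}^{\infty} y^2 e^{-y}\,dy = x \left( \left. -y^2 e^{-y} \right|_0^{\infty} + 2 \int_{0}^{\infty} y\,e^{-y}\,dy \right) = 2x, \quad 0 < x < 1. \tag{7-31}$$
Similarly
$$f_Y(y) = \int_{0}^{1} f_{XY}(x,y)\,dx = \frac{y^2}{2}\, e^{-y}, \quad 0 < y < \infty. \tag{7-32}$$
In this case
$$f_{XY}(x,y) = f_X(x)\, f_Y(y),$$
and hence X and Y are independent random variables.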
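The factorization test (7-28) for Example 7.3 can be sketched as a pointwise comparison on a grid. The grid spacing and extent below are arbitrary choices made for illustration; agreement on a grid is a consistency check, not a proof.

```python
import math

def f_xy(x, y):
    """Joint p.d.f. (7-30)."""
    return x * y * y * math.exp(-y) if (0.0 < x < 1.0 and y > 0.0) else 0.0

def f_x(x):
    """Marginal (7-31)."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0

def f_y(y):
    """Marginal (7-32)."""
    return 0.5 * y * y * math.exp(-y) if y > 0.0 else 0.0

# Largest gap between the joint p.d.f. and the product of marginals on the grid.
grid = [(0.1 * i, 0.5 * j) for i in range(1, 10) for j in range(1, 20)]
max_gap = max(abs(f_xy(x, y) - f_x(x) * f_y(y)) for x, y in grid)
print(max_gap)   # expected to be at rounding-error level
```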
8. One Function of Two Random Variables
Given two random variables X and Y and a function g(x,y), we form a new random variable Z as
$$Z = g(X,Y). \tag{8-1}$$
Given the joint p.d.f $f_{XY}(x,y)$, how does one obtain $f_Z(z)$, the p.d.f of Z? Problems of this type are of interest from a practical standpoint. For example, a receiver output signal usually consists of the desired signal buried in noise, and the above formulation in that case reduces to Z = X + Y.
It is important to know the statistics of the incoming signal for proper receiver design. In this context, we shall analyze problems of the following type, where $Z = g(X,Y)$ is one of
$$X + Y, \quad X - Y, \quad XY, \quad X/Y, \quad \max(X,Y), \quad \min(X,Y), \quad \sqrt{X^2 + Y^2}, \quad \tan^{-1}(X/Y). \tag{8-2}$$
Referring back to (8-1), to start with
$$F_Z(z) = P(Z(\xi) \le z) = P(g(X,Y) \le z) = P[(X,Y) \in D_z] = \iint_{(x,y) \in D_z} f_{XY}(x,y)\,dx\,dy, \tag{8-3}$$
where $D_z$ in the XY plane represents the region such that $g(x,y) \le z$ is satisfied. Note that $D_z$ need not be simply connected (Fig. 8.1). From (8-3), to determine $F_Z(z)$ it is enough to find the region $D_z$ for every z, and then evaluate the integral there.
We shall illustrate this method through various examples.
[Fig. 8.1: a region $D_z$ in the XY plane made up of disjoint pieces.]
Example 8.1: Z = X + Y. Find $f_Z(z)$.
Solution:
$$F_Z(z) = P(X + Y \le z) = \int_{y=-\infty}^{+\infty} \int_{x=-\infty}^{z-y} f_{XY}(x,y)\,dx\,dy, \tag{8-4}$$
since the region $D_z$ of the xy plane where $x + y \le z$ is the shaded area in Fig. 8.2 to the left of the line $x + y = z$. Integrating over the horizontal strip along the x-axis first (inner integral) followed by sliding that strip along the y-axis from $-\infty$ to $+\infty$ (outer integral), we cover the entire shaded area.
[Fig. 8.2: the shaded region to the left of the line $x = z - y$ in the xy plane.]
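We can find $f_Z(z)$ by differentiating $F_Z(z)$ directly. In this context, it is useful to recall the differentiation rule in (7-15) - (7-16) due to Leibnitz. Suppose
$$H(z) = \int_{a(z)}^{b(z)} h(x,z)\,dx. \tag{8-5}$$
Then
$$\frac{dH(z)}{dz} = \frac{db(z)}{dz}\, h(b(z), z) - \frac{da(z)}{dz}\, h(a(z), z) + \int_{a(z)}^{b(z)} \frac{\partial h(x,z)}{\partial z}\,dx. \tag{8-6}$$
Using (8-6) in (8-4) we get
$$\begin{aligned}
f_Z(z) &= \int_{-\infty}^{+\infty} \left( \frac{\partial}{\partial z} \int_{-\infty}^{z-y} f_{XY}(x,y)\,dx \right) dy = \int_{-\infty}^{+\infty} \left( f_{XY}(z - y, y) - 0 + \int_{-\infty}^{z-y} \frac{\partial f_{XY}(x,y)}{\partial z}\,dx \right) dy \\
&= \int_{-\infty}^{+\infty} f_{XY}(z - y, y)\,dy.
\end{aligned} \tag{8-7}$$
Alternatively, the integration in (8-4) can be carried out first along the y-axis followed by the x-axis as in Fig. 8.3.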
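In that case
$$F_Z(z) = \int_{x=-\infty}^{+\infty} \int_{y=-\infty}^{z-x} f_{XY}(x,y)\,dy\,dx, \tag{8-8}$$
and differentiation of (8-8) gives
$$f_Z(z) = \frac{dF_Z(z)}{dz} = \int_{x=-\infty}^{+\infty} \left( \frac{\partial}{\partial z} \int_{y=-\infty}^{z-x} f_{XY}(x,y)\,dy \right) dx = \int_{x=-\infty}^{+\infty} f_{XY}(x, z - x)\,dx. \tag{8-9}$$
[Fig. 8.3: the same shaded region, now swept by vertical strips below the line $y = z - x$.]
If X and Y are independent, then
$$f_{XY}(x,y) = f_X(x)\, f_Y(y) \tag{8-10}$$
and inserting (8-10) into (8-7) and (8-9), we get
$$f_Z(z) = \int_{y=-\infty}^{+\infty} f_X(z - y)\, f_Y(y)\,dy = \int_{x=-\infty}^{+\infty} f_X(x)\, f_Y(z - x)\,dx. \tag{8-11}$$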
The above integral is the standard convolution of the functions $f_X(z)$ and $f_Y(z)$ expressed in two different ways. We thus reach the following conclusion: If two r.vs are independent, then the density of their sum equals the convolution of their density functions.
As a special case, suppose that $f_X(x) = 0$ for $x < 0$ and $f_Y(y) = 0$ for $y < 0$. Then we can make use of Fig. 8.4 to determine the new limits for $D_z$.
[Fig. 8.4: the triangular region in the first quadrant bounded by the line $x = z - y$, with intercepts $(z, 0)$ and $(0, z)$.]
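In that case
$$F_Z(z) = \int_{y=0}^{z} \int_{x=0}^{z-y} f_{XY}(x,y)\,dx\,dy$$
or
$$f_Z(z) = \int_{y=0}^{z} \left( \frac{\partial}{\partial z} \int_{x=0}^{z-y} f_{XY}(x,y)\,dx \right) dy = \begin{cases} \displaystyle\int_{0}^{z} f_{XY}(z - y, y)\,dy, & z > 0, \\ 0, & z \le 0. \end{cases} \tag{8-12}$$
On the other hand, by considering vertical strips first in Fig. 8.4, we get
$$F_Z(z) = \int_{x=0}^{z} \int_{y=0}^{z-x} f_{XY}(x,y)\,dy\,dx$$
or
$$f_Z(z) = \int_{x=0}^{z} f_{XY}(x, z - x)\,dx = \begin{cases} \displaystyle\int_{0}^{z} f_X(x)\, f_Y(z - x)\,dx, & z > 0, \\ 0, & z \le 0, \end{cases} \tag{8-13}$$
if X and Y are independent random variables.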
Example 8.2: Suppose X and Y are independent exponential r.vs with common parameter $\lambda$, and let Z = X + Y. Determine $f_Z(z)$.
Solution: We have
$$f_X(x) = \lambda e^{-\lambda x}\, U(x), \quad f_Y(y) = \lambda e^{-\lambda y}\, U(y), \tag{8-14}$$
and we can make use of (8-13) to obtain the p.d.f of Z = X + Y:
$$f_Z(z) = \int_{0}^{z} \lambda^2 e^{-\lambda x} e^{-\lambda (z - x)}\,dx = \lambda^2 e^{-\lambda z} \int_{0}^{z} dx = z\,\lambda^2 e^{-\lambda z}\, U(z). \tag{8-15}$$
As the next example shows, care should be taken in using the convolution formula for r.vs with finite range.
Example 8.3: X and Y are independent uniform r.vs in the common interval (0,1). Determine $f_Z(z)$, where Z = X + Y.
Solution: Clearly, $Z = X + Y \Rightarrow 0 < z < 2$ here, and as Fig. 8.5 shows there are two cases of z for which the shaded areas are quite different in shape and they should be considered separately.
Using (8-16) - (8-17), we obtain
$$f_Z(z) = \frac{dF_Z(z)}{dz} = \begin{cases} z, & 0 \le z < 1, \\ 2 - z, & 1 \le z < 2. \end{cases} \tag{8-18}$$
By direct convolution of $f_X(x)$ and $f_Y(y)$, we obtain the same result as above. In fact, for $0 \le z < 1$ (Fig. 8.6(a))
$$f_Z(z) = \int f_X(z - x)\, f_Y(x)\,dx = \int_{0}^{z} 1\,dx = z, \tag{8-19}$$
and for $1 \le z < 2$ (Fig. 8.6(b))
$$f_Z(z) = \int_{z-1}^{1} 1\,dx = 2 - z. \tag{8-20}$$
Fig. 8.6(c) shows $f_Z(z)$, which agrees with the convolution of two rectangular waveforms as well.
[Fig. 8.6: (a) $0 \le z < 1$ and (b) $1 \le z < 2$ show $f_Y(x)$, $f_X(z - x)$ and their product $f_X(z - x)\, f_Y(x)$ plotted against x; (c) shows the triangular $f_Z(z)$ on $0 < z < 2$.]
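The convolution formula (8-11) can be evaluated numerically to cross-check (8-15) and (8-18). The sketch below is a plain midpoint-rule approximation; the helper name `convolve_at`, the value $\lambda = 2$ and the evaluation point $z = 1.5$ are assumptions chosen for illustration.

```python
import math

def convolve_at(f_x, f_y, z, lo, hi, n=4000):
    """Midpoint-rule approximation of (8-11): integral of f_x(x) * f_y(z - x) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += f_x(x) * f_y(z - x)
    return total * h

lam = 2.0

def expo(t):
    """Exponential p.d.f. (8-14)."""
    return lam * math.exp(-lam * t) if t >= 0 else 0.0

def unif(t):
    """Uniform p.d.f. on (0, 1)."""
    return 1.0 if 0.0 < t < 1.0 else 0.0

z = 1.5
exp_sum = convolve_at(expo, expo, z, 0.0, z)      # compare with (8-15)
uni_sum = convolve_at(unif, unif, z, 0.0, 1.0)    # compare with (8-18): 2 - z
print(exp_sum, z * lam ** 2 * math.exp(-lam * z))
print(uni_sum, 2 - z)
```

For the uniform case the integration limits are fixed at (0, 1) and the indicator inside `unif` takes care of the finite range, which is the point of care the text warns about.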
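Example 8.3: Let $Z = X - Y$. Determine its p.d.f $f_Z(z)$.
Solution: From (8-3) and Fig. 8.7
$$F_Z(z) = P(X - Y \le z) = \int_{y=-\infty}^{+\infty} \int_{x=-\infty}^{z+y} f_{XY}(x,y)\,dx\,dy$$
and hence
$$f_Z(z) = \frac{dF_Z(z)}{dz} = \int_{y=-\infty}^{+\infty} \left( \frac{\partial}{\partial z} \int_{x=-\infty}^{z+y} f_{XY}(x,y)\,dx \right) dy = \int_{-\infty}^{+\infty} f_{XY}(y + z, y)\,dy. \tag{8-21}$$
If X and Y are independent, then the above formula reduces to
$$f_Z(z) = \int_{-\infty}^{+\infty} f_X(z + y)\, f_Y(y)\,dy = f_X(z) \otimes f_Y(-z), \tag{8-22}$$
which represents the convolution of $f_X(z)$ with $f_Y(-z)$, the latter being the p.d.f of $-Y$.
[Fig. 8.7: the region $x - y \le z$ in the xy plane, bounded by the line $x = z + y$.]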
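As a special case, suppose
$$f_X(x) = 0, \ x < 0, \quad \text{and} \quad f_Y(y) = 0, \ y < 0.$$
In this case, Z can be negative as well as positive, and that gives rise to two situations that should be analyzed separately, since the regions of integration for $z \ge 0$ and $z < 0$ are quite different. For $z \ge 0$, from Fig. 8.8 (a)
$$F_Z(z) = \int_{y=0}^{+\infty} \int_{x=0}^{z+y} f_{XY}(x,y)\,dx\,dy$$
and for $z < 0$, from Fig. 8.8 (b)
$$F_Z(z) = \int_{y=-z}^{+\infty} \int_{x=0}^{z+y} f_{XY}(x,y)\,dx\,dy.$$
After differentiation, this gives
$$f_Z(z) = \begin{cases} \displaystyle\int_{0}^{+\infty} f_{XY}(z + y, y)\,dy, & z \ge 0, \\[2ex] \displaystyle\int_{-z}^{+\infty} f_{XY}(z + y, y)\,dy, & z < 0. \end{cases} \tag{8-23}$$
[Fig. 8.8: the region of integration in the first quadrant bounded by the line $x = z + y$, for (a) $z \ge 0$ and (b) $z < 0$.]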
9. Two Functions of Two Random Variables
In the spirit of the previous lecture, let us look at an immediate generalization: Suppose X and Y are two random variables with joint p.d.f $f_{XY}(x,y)$. Given two functions $g(x,y)$ and $h(x,y)$, define the new random variables
$$Z = g(X,Y) \tag{9-1}$$
$$W = h(X,Y). \tag{9-2}$$
How does one determine their joint p.d.f $f_{ZW}(z,w)$? Obviously with $f_{ZW}(z,w)$ in hand, the marginal p.d.fs $f_Z(z)$ and $f_W(w)$ can be easily determined.
The procedure is the same as that in (8-3). In fact for given z and w,
$$\begin{aligned}
F_{ZW}(z,w) &= P(Z(\xi) \le z,\ W(\xi) \le w) = P(g(X,Y) \le z,\ h(X,Y) \le w) \\
&= P((X,Y) \in D_{z,w}) = \iint_{(x,y) \in D_{z,w}} f_{XY}(x,y)\,dx\,dy,
\end{aligned} \tag{9-3}$$
where $D_{z,w}$ is the region in the xy plane such that the inequalities $g(x,y) \le z$ and $h(x,y) \le w$ are simultaneously satisfied.
We illustrate this technique in the next example.
[Fig. 9.1: a region $D_{z,w}$ in the xy plane.]
Example 9.1: Suppose X and Y are independent uniformly distributed random variables in the interval $(0, \theta)$. Define $Z = \min(X,Y)$, $W = \max(X,Y)$. Determine $f_{ZW}(z,w)$.
Solution: Obviously both w and z vary in the interval $(0, \theta)$. Thus
$$F_{ZW}(z,w) = 0, \quad \text{if } z < 0 \text{ or } w < 0. \tag{9-4}$$
$$F_{ZW}(z,w) = P(Z \le z,\ W \le w) = P(\min(X,Y) \le z,\ \max(X,Y) \le w). \tag{9-5}$$
We must consider two cases: $w \ge z$ and $w < z$, since they give rise to different regions for $D_{z,w}$ (see Figs. 9.2 (a)-(b)).
[Fig. 9.2: the region $D_{z,w}$ in the XY plane, with the points $(z,z)$ and $(w,w)$ marked, for (a) $w \ge z$ and (b) $w < z$.]
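For $w \ge z$, from Fig. 9.2 (a), the region $D_{z,w}$ is represented by the doubly shaded area. Thus
$$F_{ZW}(z,w) = F_{XY}(z,w) + F_{XY}(w,z) - F_{XY}(z,z), \quad w \ge z, \tag{9-6}$$
and for $w < z$, from Fig. 9.2 (b), we obtain
$$F_{ZW}(z,w) = F_{XY}(w,w), \quad w < z. \tag{9-7}$$
With
$$F_{XY}(x,y) = F_X(x)\, F_Y(y) = \frac{x}{\theta} \cdot \frac{y}{\theta} = \frac{xy}{\theta^2}, \tag{9-8}$$
we obtain
$$F_{ZW}(z,w) = \begin{cases} (2w - z)\,z / \theta^2, & 0 < z < w < \theta, \\ w^2 / \theta^2, & 0 < w < z < \theta. \end{cases} \tag{9-9}$$
Thus
$$f_{ZW}(z,w) = \begin{cases} 2 / \theta^2, & 0 < z < w < \theta, \\ 0, & \text{otherwise}. \end{cases} \tag{9-10}$$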
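The step from (9-9) to (9-10) can be sketched numerically by taking a second mixed difference of the joint distribution function, in the spirit of (7-9). The value $\theta = 2$, the step `h` and the two test points are assumptions made for this illustration.

```python
theta = 2.0

def F_xy(x, y):
    """Joint c.d.f. (9-8) of two independent U(0, theta) r.vs, with arguments clipped to [0, theta]."""
    cx = min(max(x, 0.0), theta)
    cy = min(max(y, 0.0), theta)
    return cx * cy / theta ** 2

def F_zw(z, w):
    """Joint c.d.f. of Z = min(X, Y), W = max(X, Y) from (9-6) and (9-7)."""
    if w >= z:
        return F_xy(z, w) + F_xy(w, z) - F_xy(z, z)
    return F_xy(w, w)

def density(z, w, h=1e-3):
    """Second mixed difference of F_zw divided by h^2, approximating f_ZW(z, w)."""
    return (F_zw(z + h, w + h) - F_zw(z, w + h) - F_zw(z + h, w) + F_zw(z, w)) / h ** 2

inside = density(0.5, 1.2)     # a point with 0 < z < w < theta
outside = density(1.2, 0.5)    # a point with w < z
print(inside, 2 / theta ** 2)
print(outside)
```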
From (9-10), we also obtain
$$f_Z(z) = \int_{z}^{\theta} f_{ZW}(z,w)\,dw = \frac{2}{\theta} \left( 1 - \frac{z}{\theta} \right), \quad 0 < z < \theta, \tag{9-11}$$
and
$$f_W(w) = \int_{0}^{w} f_{ZW}(z,w)\,dz = \frac{2w}{\theta^2}, \quad 0 < w < \theta. \tag{9-12}$$
If $g(x,y)$ and $h(x,y)$ are continuous and differentiable functions, then as in the case of one random variable (see (5-30)) it is possible to develop a formula to obtain the joint p.d.f $f_{ZW}(z,w)$ directly. Towards this, consider the equations
$$g(x,y) = z, \quad h(x,y) = w. \tag{9-13}$$
For a given point (z,w), equation (9-13) can have many solutions. Let us say
$$(x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n)$$
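represent these multiple solutions such that (see Fig. 9.3)
$$g(x_i, y_i) = z, \quad h(x_i, y_i) = w. \tag{9-14}$$
[Fig. 9.3: (a) a differential rectangle at $(z,w)$ with sides $\Delta z$ and $\Delta w$ in the zw plane; (b) the corresponding regions $\Delta_1, \Delta_2, \ldots, \Delta_i, \ldots, \Delta_n$ around the solutions $(x_1,y_1), (x_2,y_2), \ldots, (x_i,y_i), \ldots, (x_n,y_n)$ in the xy plane.]
Then we can write
$$f_{ZW}(z,w) = \sum_i |J(z,w)|\, f_{XY}(x_i, y_i) = \sum_i \frac{1}{|J(x_i, y_i)|}\, f_{XY}(x_i, y_i), \tag{9-15}$$
where
$$|J(z,w)| = \frac{1}{|J(x_i, y_i)|} \tag{9-16}$$
and $J(x_i, y_i)$ represents the Jacobian of the original transformation in (9-13) given by
$$J(x_i, y_i) = \det \left. \begin{pmatrix} \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y} \\[2ex] \dfrac{\partial h}{\partial x} & \dfrac{\partial h}{\partial y} \end{pmatrix} \right|_{x = x_i,\ y = y_i}. \tag{9-17}$$
Here $J(z,w)$ is the Jacobian of the inverse transformation $x = g_1(z,w)$, $y = h_1(z,w)$:
$$J(z,w) = \det \begin{pmatrix} \dfrac{\partial g_1}{\partial z} & \dfrac{\partial g_1}{\partial w} \\[2ex] \dfrac{\partial h_1}{\partial z} & \dfrac{\partial h_1}{\partial w} \end{pmatrix}.$$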
Next we shall illustrate the usefulness of the formula in (9-15) through various examples:
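Example 9.2: Suppose X and Y are zero mean independent Gaussian r.vs with common variance $\sigma^2$. Define $Z = \sqrt{X^2 + Y^2}$, $W = \tan^{-1}(Y/X)$, where $|w| \le \pi/2$. Obtain $f_{ZW}(z,w)$.
Solution: Here
$$f_{XY}(x,y) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2 + y^2)/2\sigma^2}. \tag{9-18}$$
Since
$$z = g(x,y) = \sqrt{x^2 + y^2}; \quad w = h(x,y) = \tan^{-1}(y/x), \quad |w| \le \pi/2, \tag{9-19}$$
if $(x_1, y_1)$ is a solution pair so is $(-x_1, -y_1)$. From (9-19)
$$\frac{y}{x} = \tan w, \quad \text{or} \quad y = x \tan w. \tag{9-20}$$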
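Substituting this into z, we get
$$z = \sqrt{x^2 + y^2} = x \sqrt{1 + \tan^2 w} = x \sec w, \quad \text{or} \quad x = z \cos w, \tag{9-21}$$
and
$$y = x \tan w = z \sin w. \tag{9-22}$$
Thus there are two solution sets
$$x_1 = z \cos w, \quad y_1 = z \sin w, \quad x_2 = -z \cos w, \quad y_2 = -z \sin w. \tag{9-23}$$
We can use (9-21) - (9-23) to obtain $J(z,w)$. As the Jacobian of the inverse transformation,
$$J(z,w) = \det \begin{pmatrix} \dfrac{\partial x}{\partial z} & \dfrac{\partial x}{\partial w} \\[2ex] \dfrac{\partial y}{\partial z} & \dfrac{\partial y}{\partial w} \end{pmatrix} = \det \begin{pmatrix} \cos w & -z \sin w \\ \sin w & z \cos w \end{pmatrix} = z, \tag{9-24}$$
so that
$$|J(z,w)| = z. \tag{9-25}$$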
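We can also compute $J(x,y)$ using (9-17). From (9-17),
$$J(x,y) = \det \begin{pmatrix} \dfrac{x}{\sqrt{x^2 + y^2}} & \dfrac{y}{\sqrt{x^2 + y^2}} \\[2ex] \dfrac{-y}{x^2 + y^2} & \dfrac{x}{x^2 + y^2} \end{pmatrix} = \frac{1}{\sqrt{x^2 + y^2}} = \frac{1}{z}. \tag{9-26}$$
Notice that $|J(z,w)| = 1 / |J(x_i, y_i)|$, agreeing with (9-16). Substituting (9-23) and (9-25) or (9-26) into (9-15), we get
$$f_{ZW}(z,w) = z \left( f_{XY}(x_1, y_1) + f_{XY}(x_2, y_2) \right) = \frac{z}{\pi\sigma^2}\, e^{-z^2/2\sigma^2}, \quad 0 < z < \infty, \quad |w| < \frac{\pi}{2}. \tag{9-27}$$
Thus
$$f_Z(z) = \int_{-\pi/2}^{\pi/2} f_{ZW}(z,w)\,dw = \frac{z}{\sigma^2}\, e^{-z^2/2\sigma^2}, \quad 0 < z < \infty, \tag{9-28}$$
which represents a Rayleigh r.v with parameter $\sigma^2$, and
$$f_W(w) = \int_{0}^{\infty} f_{ZW}(z,w)\,dz = \frac{1}{\pi}, \quad |w| < \frac{\pi}{2}, \tag{9-29}$$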
which represents a uniform r.v in the interval $(-\pi/2, \pi/2)$. Moreover by direct computation
$$f_{ZW}(z,w) = f_Z(z)\, f_W(w) \tag{9-30}$$
implying that Z and W are independent. We summarize these results in the following statement: If X and Y are zero mean independent Gaussian random variables with common variance, then $\sqrt{X^2 + Y^2}$ has a Rayleigh distribution and $\tan^{-1}(Y/X)$ has a uniform distribution. Moreover these two derived r.vs are statistically independent. Alternatively, with X and Y as independent zero mean r.vs as in (9-18), X + jY represents a complex Gaussian r.v. But
$$X + jY = Z e^{jW}, \tag{9-31}$$
where Z and W are as in (9-19), except that for (9-31) to hold good on the entire complex plane we must have $-\pi < W < \pi$, and hence it follows that the magnitude and phase of a complex Gaussian r.v are independent with Rayleigh and uniform $U(-\pi, \pi)$ distributions respectively. The statistical independence of these derived r.vs is an interesting observation.
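The conclusions of Example 9.2 can be illustrated with a small seeded simulation. The sketch below draws Gaussian pairs, forms Z and W, and compares sample statistics with the values implied by (9-28) and (9-29); the sample size, seed and $\sigma = 1$ are arbitrary choices, and the Rayleigh mean $\sigma\sqrt{\pi/2}$ is a standard fact not stated in the text. A near-zero sample covariance is only consistent with independence; it does not establish it.

```python
import math
import random

random.seed(0)                # fixed seed so the run is repeatable
sigma, n = 1.0, 20000

zs, ws = [], []
for _ in range(n):
    x, y = random.gauss(0.0, sigma), random.gauss(0.0, sigma)
    zs.append(math.hypot(x, y))    # Z = sqrt(X^2 + Y^2)
    ws.append(math.atan(y / x))    # W = arctan(Y/X), principal value in (-pi/2, pi/2)

mean_z = sum(zs) / n
mean_w = sum(ws) / n
cov_zw = sum(z * w for z, w in zip(zs, ws)) / n - mean_z * mean_w

print(mean_z, sigma * math.sqrt(math.pi / 2))   # Rayleigh mean
print(mean_w)                                   # uniform on (-pi/2, pi/2) has mean 0
print(cov_zw)                                   # should be near 0
```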
Example 9.3: Let X and Y be independent exponential random variables with common parameter $\lambda$. Define U = X + Y, V = X - Y. Find the joint and marginal p.d.f of U and V.
Solution: It is given that
$$f_{XY}(x,y) = \frac{1}{\lambda^2}\, e^{-(x + y)/\lambda}, \quad x > 0, \quad y > 0. \tag{9-32}$$
Now since u = x + y, v = x - y, always $|v| < u$, and there is only one solution given by
$$x = \frac{u + v}{2}, \quad y = \frac{u - v}{2}. \tag{9-33}$$
Moreover the Jacobian of the transformation is given by
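$$J(x,y) = \det \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = -2$$
and hence
$$f_{UV}(u,v) = \frac{1}{2\lambda^2}\, e^{-u/\lambda}, \quad 0 < |v| < u < \infty, \tag{9-34}$$
represents the joint p.d.f of U and V. This gives
$$f_U(u) = \int_{-u}^{u} f_{UV}(u,v)\,dv = \frac{1}{2\lambda^2} \int_{-u}^{u} e^{-u/\lambda}\,dv = \frac{u}{\lambda^2}\, e^{-u/\lambda}, \quad 0 < u < \infty, \tag{9-35}$$
and
$$f_V(v) = \int_{|v|}^{\infty} f_{UV}(u,v)\,du = \frac{1}{2\lambda^2} \int_{|v|}^{\infty} e^{-u/\lambda}\,du = \frac{1}{2\lambda}\, e^{-|v|/\lambda}, \quad -\infty < v < \infty. \tag{9-36}$$
Notice that in this case the r.vs U and V are not independent.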
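The marginals (9-35) and (9-36), and the closing remark that U and V are not independent, can be checked at a sample point. The sketch below integrates (9-34) with a midpoint rule; the value $\lambda = 1.5$, the test point $(u_0, v_0)$ and the truncation of the infinite range at $u = 60$ are assumptions made for illustration.

```python
import math

lam = 1.5

def f_uv(u, v):
    """Joint p.d.f. (9-34) of U = X + Y, V = X - Y."""
    return math.exp(-u / lam) / (2 * lam ** 2) if abs(v) < u else 0.0

def integrate(g, a, b, n=20000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

u0, v0 = 2.0, -0.7
f_u = integrate(lambda v: f_uv(u0, v), -u0, u0)          # compare with (9-35)
f_v = integrate(lambda u: f_uv(u, v0), abs(v0), 60.0)    # compare with (9-36); tail cut at u = 60

print(f_u, u0 / lam ** 2 * math.exp(-u0 / lam))
print(f_v, math.exp(-abs(v0) / lam) / (2 * lam))
print(f_uv(u0, v0), f_u * f_v)    # these differ, so (7-28) fails at this point
```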
10. Joint Moments
Following section 6, in this section we shall introduce various parameters to compactly represent the information contained in the joint p.d.f of two r.vs. Given two r.vs X and Y and a function $g(x,y)$, define the r.v
$$Z = g(X,Y). \tag{10-1}$$
Using (6-2), we can define the mean of Z to be
$$\mu_Z = E(Z) = \int_{-\infty}^{+\infty} z\, f_Z(z)\,dz. \tag{10-2}$$
However, the situation here is similar to that in (6-13), and it is possible to express the mean of $Z = g(X,Y)$ in terms of $f_{XY}(x,y)$ without computing $f_Z(z)$. To see this, recall from (5-26) and (7-10) that
$$P(z < Z \le z + \Delta z) = f_Z(z)\,\Delta z = P(z < g(X,Y) \le z + \Delta z) = \sum\sum_{(x,y) \in D_{\Delta z}} f_{XY}(x,y)\,\Delta x\,\Delta y, \tag{10-3}$$
where $D_{\Delta z}$ is the region in the xy plane satisfying the above inequality. From (10-3), we get
$$z\, f_Z(z)\,\Delta z = \sum\sum_{(x,y) \in D_{\Delta z}} g(x,y)\, f_{XY}(x,y)\,\Delta x\,\Delta y. \tag{10-4}$$
As $\Delta z$ covers the entire z axis, the corresponding regions $D_{\Delta z}$ are nonoverlapping, and they cover the entire xy plane.
By integrating (10-4), we obtain the useful formula
$$E(Z) = \int_{-\infty}^{+\infty} z\, f_Z(z)\,dz = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x,y)\, f_{XY}(x,y)\,dx\,dy \tag{10-5}$$
or
$$E[g(X,Y)] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x,y)\, f_{XY}(x,y)\,dx\,dy. \tag{10-6}$$
If X and Y are discrete-type r.vs, then
$$E[g(X,Y)] = \sum_i \sum_j g(x_i, y_j)\, P(X = x_i, Y = y_j). \tag{10-7}$$
Since expectation is a linear operator, we also get
$$E\left( \sum_k a_k\, g_k(X,Y) \right) = \sum_k a_k\, E[g_k(X,Y)]. \tag{10-8}$$
If X and Y are independent r.vs, it is easy to see that $Z = g(X)$ and $W = h(Y)$ are always independent of each other. In that case using (10-6), we get the interesting result
$$\begin{aligned}
E[g(X)\,h(Y)] &= \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x)\,h(y)\, f_X(x)\, f_Y(y)\,dx\,dy \\
&= \int_{-\infty}^{+\infty} g(x)\, f_X(x)\,dx \int_{-\infty}^{+\infty} h(y)\, f_Y(y)\,dy = E[g(X)]\, E[h(Y)].
\end{aligned} \tag{10-9}$$
However (10-9) is in general not true (if X and Y are not independent).
In the case of one random variable (see (10-6)), we defined the parameters mean and variance to represent its average behavior. How does one parametrically represent similar cross-behavior between two random variables? Towards this, we can generalize the variance definition given in (6-16) as shown below:
Covariance: Given any two r.vs X and Y, define
$$Cov(X,Y) = E[(X - \mu_X)(Y - \mu_Y)]. \tag{10-10}$$
By expanding and simplifying the right side of (10-10), we also get
$$Cov(X,Y) = E(XY) - \mu_X \mu_Y = E(XY) - E(X)\,E(Y) = \overline{XY} - \overline{X}\;\overline{Y}. \tag{10-11}$$
It is easy to see that
$$|Cov(X,Y)| \le \sqrt{Var(X)\, Var(Y)}. \tag{10-12}$$
To see (10-12), let $U = aX + Y$, so that
$$Var(U) = E\left[ \{ a(X - \mu_X) + (Y - \mu_Y) \}^2 \right] = a^2\, Var(X) + 2a\, Cov(X,Y) + Var(Y) \ge 0. \tag{10-13}$$
The right side of (10-13) represents a quadratic in the variable a that has no distinct real roots (Fig. 10.1). Thus the roots are imaginary (or double) and hence the discriminant
$$[Cov(X,Y)]^2 - Var(X)\, Var(Y)$$
must be non-positive, and that gives (10-12). Using (10-12), we may define the normalized parameter
$$\rho_{XY} = \frac{Cov(X,Y)}{\sqrt{Var(X)\, Var(Y)}} = \frac{Cov(X,Y)}{\sigma_X \sigma_Y}, \quad -1 \le \rho_{XY} \le 1, \tag{10-14}$$
or
$$Cov(X,Y) = \rho_{XY}\, \sigma_X \sigma_Y \tag{10-15}$$
and it represents the correlation coefficient between X and Y.
[Fig. 10.1: $Var(U)$ plotted against a, a parabola that does not cross below the a axis.]
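The definitions (10-10), (10-11) and (10-14) can be exercised on a small discrete example using the expectation formula (10-7). The joint p.m.f. below is hypothetical, with values made up for illustration, and the helper name `expect` is likewise an assumption.

```python
import math

# Hypothetical joint p.m.f. on a small grid: p[i][j] = P(X = xs[i], Y = ys[j]).
xs = [0.0, 1.0, 2.0]
ys = [-1.0, 1.0]
p = [
    [0.20, 0.05],
    [0.15, 0.20],
    [0.05, 0.35],
]

def expect(g):
    """E[g(X, Y)] computed from (10-7)."""
    return sum(g(x, y) * p[i][j] for i, x in enumerate(xs) for j, y in enumerate(ys))

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
cov = expect(lambda x, y: (x - mu_x) * (y - mu_y))      # definition (10-10)
cov_alt = expect(lambda x, y: x * y) - mu_x * mu_y      # equivalent form (10-11)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
rho = cov / math.sqrt(var_x * var_y)                    # correlation coefficient (10-14)

print(cov, cov_alt, rho)
```

The two covariance values should agree, and `rho` should fall in $[-1, 1]$ as (10-12) requires.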
Uncorrelated r.vs: If $\rho_{XY} = 0$, then X and Y are said to be uncorrelated r.vs. From (10-11), if X and Y are uncorrelated, then
$$E(XY) = E(X)\,E(Y). \tag{10-16}$$
Orthogonality: X and Y are said to be orthogonal if
$$E(XY) = 0. \tag{10-17}$$
From (10-16) - (10-17), if either X or Y has zero mean, then orthogonality implies uncorrelatedness also and vice-versa. Suppose X and Y are independent r.vs. Then from (10-9) with $g(X) = X$, $h(Y) = Y$, we get
$$E(XY) = E(X)\,E(Y),$$
and together with (10-16), we conclude that the random variables are uncorrelated, thus justifying the original definition in (10-10). Thus independence implies uncorrelatedness.
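Naturally, if two random variables are statistically independent, then there cannot be any correlation between them $(\rho_{XY} = 0)$. However, the converse is in general not true. As the next example shows, random variables can be uncorrelated without being independent.
Example 10.1: Let $X \sim U(0,1)$, $Y \sim U(0,1)$. Suppose X and Y are independent. Define Z = X + Y, W = X - Y. Show that Z and W are dependent, but uncorrelated r.vs.
Solution: $z = x + y$, $w = x - y$ gives the only solution set to be
$$x = \frac{z + w}{2}, \quad y = \frac{z - w}{2}.$$
Moreover $0 < z < 2$, $-1 < w < 1$, $z + w \le 2$, $z - w \le 2$, $z > |w|$ and $|J(z,w)| = 1/2$.
Thus (see the shaded region in Fig. 10.2)
$$f_{ZW}(z,w) = \begin{cases} 1/2, & 0 < z < 2,\ -1 < w < 1,\ z + w \le 2,\ z - w \le 2,\ |w| < z, \\ 0, & \text{otherwise}, \end{cases} \tag{10-18}$$
[Fig. 10.2: the shaded diamond-shaped region in the zw plane with $0 < z < 2$ and $-1 < w < 1$.]
and hence
$$f_Z(z) = \int f_{ZW}(z,w)\,dw = \begin{cases} \displaystyle\int_{-z}^{z} \frac{1}{2}\,dw = z, & 0 < z < 1, \\[2ex] \displaystyle\int_{z-2}^{2-z} \frac{1}{2}\,dw = 2 - z, & 1 < z < 2, \end{cases}$$
or by direct computation (Z = X + Y)
$$f_Z(z) = f_X(z) \otimes f_Y(z) = \begin{cases} z, & 0 < z < 1, \\ 2 - z, & 1 < z < 2, \\ 0, & \text{otherwise}, \end{cases} \tag{10-19}$$
and
$$f_W(w) = \int f_{ZW}(z,w)\,dz = \int_{|w|}^{2 - |w|} \frac{1}{2}\,dz = \begin{cases} 1 - |w|, & -1 < w < 1, \\ 0, & \text{otherwise}. \end{cases} \tag{10-20}$$
Clearly $f_{ZW}(z,w) \ne f_Z(z)\, f_W(w)$. Thus Z and W are not independent. However
$$E(ZW) = E[(X + Y)(X - Y)] = E(X^2) - E(Y^2) = 0, \tag{10-21}$$
and
$$E(W) = E(X - Y) = 0,$$
and hence
$$Cov(Z,W) = E(ZW) - E(Z)\,E(W) = 0 \tag{10-22}$$
implying that Z and W are uncorrelated random variables.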
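Both halves of Example 10.1 can be sketched numerically: the covariance in (10-22) is approximated by a midpoint rule over the unit square in (x, y), and dependence is shown by a single point where (10-18) differs from the product of (10-19) and (10-20). The grid size and the test point $(z, w) = (0.5, 0.8)$ are arbitrary choices for this illustration.

```python
def f_zw(z, w):
    """Joint p.d.f. (10-18)."""
    return 0.5 if (abs(w) < z and z + w <= 2 and z - w <= 2) else 0.0

def f_z(z):
    """Marginal (10-19)."""
    if 0 < z < 1:
        return z
    return 2 - z if 1 <= z < 2 else 0.0

def f_w(w):
    """Marginal (10-20)."""
    return 1 - abs(w) if abs(w) < 1 else 0.0

# E(ZW), E(Z), E(W) by a midpoint rule over the unit square, using f_XY = 1 there.
n = 400
pts = [(i + 0.5) / n for i in range(n)]
e_zw = sum((x + y) * (x - y) for x in pts for y in pts) / n ** 2
e_z = sum(x + y for x in pts for y in pts) / n ** 2
e_w = sum(x - y for x in pts for y in pts) / n ** 2
cov_zw = e_zw - e_z * e_w

# One point where the joint p.d.f. and the product of the marginals disagree.
gap = f_zw(0.5, 0.8) - f_z(0.5) * f_w(0.8)

print(cov_zw)   # close to 0: uncorrelated
print(gap)      # nonzero: not independent
```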
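Example 10.2: Let $Z = aX + bY$. Determine the variance of Z in terms of $\sigma_X$, $\sigma_Y$ and $\rho_{XY}$.
Solution:
$$\mu_Z = E(Z) = E(aX + bY) = a\mu_X + b\mu_Y$$
and using (10-15)
$$\begin{aligned}
\sigma_Z^2 = Var(Z) &= E[(Z - \mu_Z)^2] = E\left[ \left( a(X - \mu_X) + b(Y - \mu_Y) \right)^2 \right] \\
&= a^2 E(X - \mu_X)^2 + 2ab\, E[(X - \mu_X)(Y - \mu_Y)] + b^2 E(Y - \mu_Y)^2 \\
&= a^2 \sigma_X^2 + 2ab\, \rho_{XY}\, \sigma_X \sigma_Y + b^2 \sigma_Y^2.
\end{aligned} \tag{10-23}$$
In particular if X and Y are independent, then $\rho_{XY} = 0$, and (10-23) reduces to
$$\sigma_Z^2 = a^2 \sigma_X^2 + b^2 \sigma_Y^2. \tag{10-24}$$
Thus the variance of the sum of independent r.vs is the sum of their variances $(a = b = 1)$.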
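Formula (10-23) can be compared against a direct computation of $Var(aX + bY)$. The sketch below uses a hypothetical joint p.m.f. and arbitrary coefficients a and b, all made up for illustration; expectations are taken with (10-7).

```python
import math

# Hypothetical joint p.m.f. and coefficients; every number here is illustrative only.
xs, ys = [0.0, 1.0, 2.0], [-1.0, 1.0]
p = [[0.20, 0.05], [0.15, 0.20], [0.05, 0.35]]
a, b = 2.0, -3.0

def expect(g):
    """E[g(X, Y)] computed from (10-7)."""
    return sum(g(x, y) * p[i][j] for i, x in enumerate(xs) for j, y in enumerate(ys))

mu_x, mu_y = expect(lambda x, y: x), expect(lambda x, y: y)
sig_x = math.sqrt(expect(lambda x, y: (x - mu_x) ** 2))
sig_y = math.sqrt(expect(lambda x, y: (y - mu_y) ** 2))
rho = expect(lambda x, y: (x - mu_x) * (y - mu_y)) / (sig_x * sig_y)

mu_z = a * mu_x + b * mu_y
var_direct = expect(lambda x, y: (a * x + b * y - mu_z) ** 2)    # E[(Z - mu_Z)^2]
var_formula = (a ** 2 * sig_x ** 2
               + 2 * a * b * rho * sig_x * sig_y
               + b ** 2 * sig_y ** 2)                            # right side of (10-23)

print(var_direct, var_formula)
```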
Moments:
$$E[X^k Y^m] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x^k y^m f_{XY}(x,y)\,dx\,dy, \tag{10-25}$$
represents the joint moment of order (k,m) for X and Y.
Following the one random variable case, we can define the joint characteristic function between two random variables which will turn out to be useful for moment calculations.