Spring 2007 Geog 210C: Phaedon C. Kyriakidis
Bivariate Distributions
Definition: class of multivariate probability distributions describing
joint variation of outcomes of two random variables (discrete or
continuous), e.g., temperature and precipitation, household size and
number of cars owned
Example: survey results of 100 households reporting # of people per
household (household size) and # of cars owned per household:
                 # of cars owned (y)
size (x)      0      1      2      3   tot.
   2         10      8      3      2     23
   3          7     10      6      3     26
   4          4      5     12      6     27
   5          1      2      6     15     24
 tot.        22     25     27     26    100
Bivariate PMF: probability mass function fX,Y (x, y) that two RVs
X and Y attain simultaneously two respective outcomes x and y:
fX,Y (x, y) = Prob{X = x, Y = y} = number of samples occurring
jointly in classes x and y / total number of samples
                 # of cars owned (y)
size (x)      0      1      2      3   fX(x)
   2       0.10   0.08   0.03   0.02   0.23
   3       0.07   0.10   0.06   0.03   0.26
   4       0.04   0.05   0.12   0.06   0.27
   5       0.01   0.02   0.06   0.15   0.24
 fY(y)     0.22   0.25   0.27   0.26   1.00
note that $0 \le f_{X,Y}(x, y) \le 1$, and $\sum_x \sum_y f_{X,Y}(x, y) = 1$
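The relative-frequency definition of the bivariate PMF can be checked numerically. The following is a minimal Python sketch, assuming the counts are typed in by hand from the survey table; the names `counts` and `pmf` are illustrative and do not come from the lecture notes.

```python
# Household survey counts: rows are household sizes x = 2..5,
# columns are cars owned y = 0..3 (copied from the table).
sizes = [2, 3, 4, 5]
cars = [0, 1, 2, 3]
counts = [
    [10, 8, 3, 2],
    [7, 10, 6, 3],
    [4, 5, 12, 6],
    [1, 2, 6, 15],
]

total = sum(sum(row) for row in counts)  # total number of households

# Bivariate PMF: relative frequency of each (x, y) class.
pmf = [[c / total for c in row] for row in counts]

# Every entry lies in [0, 1] and all entries sum to 1 (up to round-off).
all_in_unit_interval = all(0.0 <= p <= 1.0 for row in pmf for p in row)
total_mass = sum(sum(row) for row in pmf)

print(total)      # 100
print(pmf[1][1])  # f(3, 1) = 0.1
```

Dividing by the total is what turns the count table into the PMF table; the two checks at the end correspond to the two properties noted above.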
Slide 1
Discrete Bivariate Distribution Example
Stem plot representation:
[Figure: "Bivariate pmf example", stem plot of the bivariate PMF; horizontal axes: household size and number of cars; vertical axis: probability, 0 to 0.2]
Slide 2
Lecture Notes Bivariate & Multivariate Distributions total # of slides = 22
Discrete Marginal Distributions
Marginal (univariate) distributions:
• univariate PMFs of the two RVs X and Y
• obtained by summing the PMF table over the other variable
(along each row for fX, down each column for fY):
$f_X(x) = \sum_y f_{X,Y}(x, y)$ and $f_Y(y) = \sum_x f_{X,Y}(x, y)$
$f_X(x) = \text{Prob}\{X = x,\ Y \in [-\infty, +\infty]\}$
marginal PMF = bivariate PMF with one RV unconstrained
Examples:
• probability of a randomly selected household with 3 members:
fX(3) = 0.07 + 0.10 + 0.06 + 0.03 = 0.26
• probability of a randomly selected household with no cars:
fY (0) = 0.10 + 0.07 + 0.04 + 0.01 = 0.22
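The row and column sums can be reproduced in a few lines. This is a minimal Python sketch, assuming the PMF values are entered by hand from the table; `f_x` and `f_y` are illustrative names for the two marginal PMFs.

```python
# Bivariate PMF table: rows are household sizes x = 2..5,
# columns are cars owned y = 0..3 (values copied from the table).
sizes = [2, 3, 4, 5]
cars = [0, 1, 2, 3]
pmf = [
    [0.10, 0.08, 0.03, 0.02],
    [0.07, 0.10, 0.06, 0.03],
    [0.04, 0.05, 0.12, 0.06],
    [0.01, 0.02, 0.06, 0.15],
]

# f_X(x): sum each row over y; f_Y(y): sum each column over x.
f_x = {x: sum(row) for x, row in zip(sizes, pmf)}
f_y = {y: sum(col) for y, col in zip(cars, zip(*pmf))}

print(round(f_x[3], 2))  # 0.26
print(round(f_y[0], 2))  # 0.22
```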
Slide 3
Discrete Conditional Distributions
Conditional distribution:
• PMF of one RV, say Y , given an outcome, X = x of another
RV, say X:
fY |X(y|x) = Prob{Y = y|X = x}
• obtained by dividing (standardizing) the PMF fY,X(y, x) by the
marginal PMF fX(x) of the conditioning event:
$f_{Y|X}(y|x) = \frac{f_{Y,X}(y, x)}{f_X(x)} = \frac{f_{X,Y}(x, y)}{f_X(x)}$
division by fX(x) standardizes (to unit sum) the PMF entries corresponding
to a particular row (column):
$\sum_y f_{Y|X}(y|x) = 1$
Examples:
• conditional PMF of car ownership, given 3-member households:
$f_{Y|X}(0|3) = \frac{0.07}{0.26}$, $f_{Y|X}(1|3) = \frac{0.10}{0.26}$, $f_{Y|X}(3|3) = \frac{0.03}{0.26}$
• conditional PMF of household size, given 3-car ownership:
$f_{X|Y}(2|3) = \frac{0.02}{0.26}$, $f_{X|Y}(3|3) = \frac{0.03}{0.26}$, $f_{X|Y}(4|3) = \frac{0.06}{0.26}$
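The two conditional PMFs in the examples can be computed by dividing a row or a column of the table by its sum. The following is a minimal Python sketch under that reading; the function names `cond_y_given_x` and `cond_x_given_y` are hypothetical helpers, not names from the notes.

```python
# Bivariate PMF keyed by (household size x, cars owned y).
sizes = [2, 3, 4, 5]
cars = [0, 1, 2, 3]
table = [
    [0.10, 0.08, 0.03, 0.02],
    [0.07, 0.10, 0.06, 0.03],
    [0.04, 0.05, 0.12, 0.06],
    [0.01, 0.02, 0.06, 0.15],
]
pmf = {(x, y): table[i][j] for i, x in enumerate(sizes) for j, y in enumerate(cars)}

def cond_y_given_x(x):
    """Conditional PMF of Y given X = x: one table row divided by f_X(x)."""
    f_x = sum(pmf[(x, y)] for y in cars)
    return {y: pmf[(x, y)] / f_x for y in cars}

def cond_x_given_y(y):
    """Conditional PMF of X given Y = y: one table column divided by f_Y(y)."""
    f_y = sum(pmf[(x, y)] for x in sizes)
    return {x: pmf[(x, y)] / f_y for x in sizes}

cars_given_3_members = cond_y_given_x(3)  # 0.07/0.26, 0.10/0.26, 0.06/0.26, 0.03/0.26
size_given_3_cars = cond_x_given_y(3)     # 0.02/0.26, 0.03/0.26, 0.06/0.26, 0.15/0.26
```

Each returned dictionary sums to 1, which is the unit-sum standardization described above.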
Slide 4
Covariance and Correlation
$\sigma_{XY} = \text{Cov}\{X, Y\} = E\{[X - E\{X\}][Y - E\{Y\}]\} = E\{XY\} - E\{X\}E\{Y\}$
Covariance calculation:
$\sigma_{XY} = \sum_{x=2}^{5} \sum_{y=0}^{3} x\,y\,f_{X,Y}(x, y) - \sum_{x=2}^{5} x\,f_X(x) \sum_{y=0}^{3} y\,f_Y(y)$
$= 6.16 - 3.52 \times 1.57 = 0.6336$
Correlation coefficient: standardized (unit-free) version of covariance:
$\rho_{XY} = \frac{\sigma_{XY}}{\sqrt{\sigma_X}\,\sqrt{\sigma_Y}} \in [-1, +1]$
σX = variance of RV X; σY = variance of RV Y
Correlation coefficient calculation:
$\rho_{XY} = \frac{\sigma_{XY}}{\sqrt{\sum_{x=2}^{5} x^2 f_X(x) - E\{X\}^2}\ \sqrt{\sum_{y=0}^{3} y^2 f_Y(y) - E\{Y\}^2}}$
$= \frac{0.6336}{1.0907 \times 1.0978} = 0.5292$
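The covariance and correlation values can be verified directly from the PMF table. This is a minimal Python sketch; as in the notes, `var_x` and `var_y` denote variances (written σX and σY on the slide), and all variable names are illustrative.

```python
import math

# Bivariate PMF keyed by (household size x, cars owned y).
sizes = [2, 3, 4, 5]
cars = [0, 1, 2, 3]
table = [
    [0.10, 0.08, 0.03, 0.02],
    [0.07, 0.10, 0.06, 0.03],
    [0.04, 0.05, 0.12, 0.06],
    [0.01, 0.02, 0.06, 0.15],
]
pmf = {(x, y): table[i][j] for i, x in enumerate(sizes) for j, y in enumerate(cars)}

# First- and second-order moments as probability-weighted sums.
e_x = sum(x * p for (x, y), p in pmf.items())       # 3.52
e_y = sum(y * p for (x, y), p in pmf.items())       # 1.57
e_xy = sum(x * y * p for (x, y), p in pmf.items())  # 6.16
var_x = sum(x * x * p for (x, y), p in pmf.items()) - e_x ** 2
var_y = sum(y * y * p for (x, y), p in pmf.items()) - e_y ** 2

cov = e_xy - e_x * e_y
rho = cov / (math.sqrt(var_x) * math.sqrt(var_y))

print(round(cov, 4))  # 0.6336
print(round(rho, 4))  # 0.5292
```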
Slide 5
Correlation versus Dependence
Independence:
fX,Y (x, y) = fX(x)fY (y) ∀x, y
bivariate PMF = product of two marginal PMFs,
for all possible combinations of x and y
Example:
fX,Y (3, 1) = 0.10 ≠ 0.26 × 0.25 = fX(3)fY (1)
Important note:
• lack of correlation, i.e., σXY = ρXY = 0, does not entail
independence, because possibly fX,Y (x, y) ≠ fX(x)fY (y) for
some combinations of x and y
• independence, however, entails lack of correlation:
fX,Y (x, y) = fX(x)fY (y), ∀x, y ⇒ σXY = ρXY = 0
Note: Covariance σXY and correlation ρXY are just moments, i.e.,
summary statistics, of the bivariate PMF fX,Y (x, y)
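Both directions of the important note can be illustrated numerically. The sketch below is a minimal Python example under stated assumptions: the `toy` PMF (X uniform on {-1, 0, 1} with Y = X²) is a standard textbook illustration that does not appear in the notes, and the helper names are hypothetical.

```python
def marginals(pmf):
    """Return the marginal PMFs f_X and f_Y of a bivariate PMF dict."""
    f_x, f_y = {}, {}
    for (x, y), p in pmf.items():
        f_x[x] = f_x.get(x, 0.0) + p
        f_y[y] = f_y.get(y, 0.0) + p
    return f_x, f_y

def covariance(pmf):
    """Cov{X, Y} = E{XY} - E{X}E{Y}."""
    e_x = sum(x * p for (x, y), p in pmf.items())
    e_y = sum(y * p for (x, y), p in pmf.items())
    e_xy = sum(x * y * p for (x, y), p in pmf.items())
    return e_xy - e_x * e_y

def is_independent(pmf, tol=1e-9):
    """True if f_XY(x, y) = f_X(x) f_Y(y) for every combination of x and y."""
    f_x, f_y = marginals(pmf)
    return all(abs(pmf.get((x, y), 0.0) - f_x[x] * f_y[y]) <= tol
               for x in f_x for y in f_y)

# Household survey PMF from the table.
table = [
    [0.10, 0.08, 0.03, 0.02],
    [0.07, 0.10, 0.06, 0.03],
    [0.04, 0.05, 0.12, 0.06],
    [0.01, 0.02, 0.06, 0.15],
]
survey = {(x, y): table[i][j]
          for i, x in enumerate([2, 3, 4, 5]) for j, y in enumerate([0, 1, 2, 3])}

# Product of the survey marginals: independent by construction.
f_x, f_y = marginals(survey)
product = {(x, y): f_x[x] * f_y[y] for x in f_x for y in f_y}

# Illustrative PMF (not from the notes): uncorrelated but dependent.
toy = {(-1, 1): 1 / 3, (0, 0): 1 / 3, (1, 1): 1 / 3}

print(is_independent(survey))   # False
print(is_independent(product))  # True
print(is_independent(toy))      # False
```

The `product` PMF is independent and has zero covariance, while `toy` has zero covariance yet fails the factorization test, which is the asymmetry stated in the note.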
Slide 6
Joint Variability of Two Continuous Attributes
Consider two random variables Z and Y , and their sample scatter
plot, i.e., N realized pairs {(zn, yn), n = 1, . . . , N}
[Figure: "Sample scatterplot" of Variable Z versus Variable Y; axis ranges 90 to 110 and 190 to 210]
Joint and marginal PDFs:
$(Z, Y) \sim f_{ZY}(z, y) = \text{Prob}\{Z \in z \pm dz,\ Y \in y \pm dy\}$ (bivariate PDF)
$Z \sim f_Z(z) = \text{Prob}\{Z \in z \pm dz\}$
$Y \sim f_Y(y) = \text{Prob}\{Y \in y \pm dy\}$
Bivariate density histogram = estimate of bivariate PDF
Slide 7
Sample Bivariate Density Histogram
Discretization step: mesh of rectangular bins overlaid on sample
scatter plot
[Figure: "Discretizing sample variability", the sample scatterplot with a mesh of rectangular bins overlaid; axes: Variable Y and Variable Z]
Normalized sample frequencies = # of pairs falling in each bin,
divided by: (i) total # of pairs, and (ii) bin area
[Figure: "Sample bivariate density histogram"; axes: Variable Y and Variable Z; density scale from 0.005 to 0.045]
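The normalization rule for the density histogram (count per bin, divided by the total number of pairs and by the bin area) can be sketched in plain Python. The sample below is synthetic and generated only for illustration; it is not the data behind the scatterplot, and the helper names, bin count, and distribution parameters are all assumptions.

```python
import bisect
import math
import random

def make_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced bin edges from lo to hi."""
    edges = [lo + (hi - lo) * k / n_bins for k in range(n_bins + 1)]
    edges[-1] = hi  # guard against floating-point round-off
    return edges

def bin_index(value, edges):
    """Index of the bin containing value, or None if it lies outside."""
    if value < edges[0] or value > edges[-1]:
        return None
    return min(bisect.bisect_right(edges, value) - 1, len(edges) - 2)

def density_histogram_2d(pairs, z_edges, y_edges):
    """Count per bin divided by (total number of pairs * bin area)."""
    n = len(pairs)
    dens = [[0.0] * (len(y_edges) - 1) for _ in range(len(z_edges) - 1)]
    for z, y in pairs:
        i, j = bin_index(z, z_edges), bin_index(y, y_edges)
        if i is not None and j is not None:
            area = (z_edges[i + 1] - z_edges[i]) * (y_edges[j + 1] - y_edges[j])
            dens[i][j] += 1.0 / (n * area)
    return dens

# Synthetic correlated sample (an assumption, not the slide's data).
random.seed(0)
pairs = []
for _ in range(500):
    g1, g2 = random.gauss(0, 1), random.gauss(0, 1)
    y = 100 + 3 * g1
    z = 200 + 3 * (0.7 * g1 + math.sqrt(1 - 0.7 ** 2) * g2)
    pairs.append((z, y))

n_bins = 10
z_edges = make_edges(min(p[0] for p in pairs), max(p[0] for p in pairs), n_bins)
y_edges = make_edges(min(p[1] for p in pairs), max(p[1] for p in pairs), n_bins)
dens = density_histogram_2d(pairs, z_edges, y_edges)

# Density times bin area, summed over all bins, recovers total probability 1.
mass = sum(
    dens[i][j] * (z_edges[i + 1] - z_edges[i]) * (y_edges[j + 1] - y_edges[j])
    for i in range(n_bins) for j in range(n_bins)
)
```

Dividing by the bin area is what makes the histogram an estimate of a density rather than of a probability: the bin heights integrate, rather than sum, to one.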
Slide 8
Bivariate Distributions
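Bivariate CDF: $F_{ZY}(z, y) = \text{Prob}\{Z \le z,\ Y \le y\} \in [0, 1]$
Bivariate PDF: $f_{ZY}(z, y) = \frac{\partial^2 F_{ZY}(z, y)}{\partial z\,\partial y}$, with $\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f_{ZY}(z, y)\,dz\,dy = 1$
Marginal distributions:
Z: $F_Z(z) = \text{Prob}\{Z \le z\} = \text{Prob}\{Z \le z,\ Y \in [-\infty, +\infty]\} = F_{ZY}(z, +\infty)$
$f_Z(z) = \int_{-\infty}^{+\infty} f_{ZY}(z, y)\,dy$
Y: $F_Y(y) = \text{Prob}\{Y \le y\} = F_{ZY}(+\infty, y)$
Marginal moments:
$\mu_Z = E\{Z\} = \int_{-\infty}^{+\infty} z\,f_Z(z)\,dz$
$\sigma_Z = E\{[Z - \mu_Z]^2\} = \int_{-\infty}^{+\infty} (z - \mu_Z)^2\,f_Z(z)\,dz$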
Slide 9
Statistics of Indicators (1)
n = 10 joint realizations of two continuous RVs Z and Y :
z-values 2 1 5 1 6 3 9 4 6 8
y-values 3 0 6 0 5 4 8 5 6 7
Indicator random variables:
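$I_Z(z_0) = \begin{cases} 0 & \text{if } Z > z_0 \\ 1 & \text{if } Z \le z_0 \end{cases}$
$I_Y(y_0) = \begin{cases} 0 & \text{if } Y > y_0 \\ 1 & \text{if } Y \le y_0 \end{cases}$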
Example (z0 = 5, y0 = 5):
z-indicators 1 1 1 1 0 1 0 1 0 0
y-indicators 1 1 0 1 1 1 0 1 0 0
Univariate indicator moment (mean):
$E\{I_Z(z_0)\} = \frac{1}{N} \sum_{n=1}^{N} i_n(z_0) = F_Z(z_0) \Rightarrow F_Z(z_0 = 5) = \frac{6}{10}$
average of indicators in(z0) =
proportion of values no greater than z0 = CDF FZ(z0)
Slide 10
Statistics of Indicators (2)
n = 10 joint realizations of two RVs Z and Y :
z-values 2 1 5 1 6 3 9 4 6 8
y-values 3 0 6 0 5 4 8 5 6 7
Indicator values (example for z0 = 5, y0 = 5):
z-indicators 1 1 1 1 0 1 0 1 0 0
y-indicators 1 1 0 1 1 1 0 1 0 0
Bivariate indicator moment (mean):
$E\{I_Z(z_0) \cdot I_Y(y_0)\} = \frac{1}{N} \sum_{n=1}^{N} i_n(z_0) \cdot i_n(y_0) = F_{ZY}(z_0, y_0)$
Example: $F_{ZY}(z_0 = 5, y_0 = 5) = \frac{5}{10}$
average of product of indicators in(z0) · in(y0) =
proportion of values jointly no greater than z0 and y0
= bivariate CDF FZY (z0, y0)
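The indicator coding and the two indicator means can be reproduced from the ten data pairs. This is a minimal Python sketch; the function name `indicator` and the variable names are illustrative choices.

```python
# The 10 joint realizations of Z and Y listed on the slide.
z = [2, 1, 5, 1, 6, 3, 9, 4, 6, 8]
y = [3, 0, 6, 0, 5, 4, 8, 5, 6, 7]

def indicator(values, threshold):
    """1 where the value is no greater than the threshold, 0 otherwise."""
    return [1 if v <= threshold else 0 for v in values]

i_z = indicator(z, 5)
i_y = indicator(y, 5)
n = len(z)

# Mean of the indicators = sample univariate CDF at the threshold.
F_z = sum(i_z) / n
F_y = sum(i_y) / n

# Mean of the indicator products = sample bivariate CDF at (z0, y0).
F_zy = sum(a * b for a, b in zip(i_z, i_y)) / n

print(F_z, F_y, F_zy)  # 0.6 0.6 0.5
```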
Slide 11
Bivariate Moments
Covariance σZY , a joint moment of order 2:
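$\text{Cov}\{Z, Y\} = E\{[Z - \mu_Z][Y - \mu_Y]\} = E\{ZY\} - \mu_Z \mu_Y$
$= \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} (z - \mu_Z)(y - \mu_Y)\,f_{ZY}(z, y)\,dz\,dy$
Correlation: $\rho_{ZY} = \frac{\text{Cov}\{Z, Y\}}{\sqrt{\text{Var}\{Z\}\,\text{Var}\{Y\}}} \in [-1, +1]$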
• correlation = measure of linear dependence;
proportional to slope of regression line: Z = a + bY
• |ρZY | = 1⇔ Z linear combination of Y
• independence entails lack of correlation
FZY (z, y) = FZ(z) · FY (y), ∀z, y ⇒ ρZY = 0
• lack of correlation does not entail independence:
ρZY = 0 does not entail FZY (z, y) = FZ(z) · FY (y)
Moment of inertia:
$\gamma_{ZY} = \frac{1}{2} E\{[Z - Y]^2\} = \frac{1}{2} \int \int (z - y)^2\,f_{ZY}(z, y)\,dz\,dy$
Points to remember: (i) moments summarize distributions and
condense scatter plots, (ii) two RVs with same CDFs/PDFs = two
RVs with same moments (reverse not always true)
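The moment of inertia defined above can be estimated from paired data and related to the other moments. The sketch below is a minimal Python example that reuses the ten (z, y) pairs from the indicator slides purely as an illustrative sample. The expansion it checks, γ = ½(σZ + σY) − σZY + ½(µZ − µY)² with σZ and σY denoting variances, is a standard algebraic identity that is not stated in the notes.

```python
# Ten (z, y) pairs reused from the indicator example, for illustration only.
z = [2, 1, 5, 1, 6, 3, 9, 4, 6, 8]
y = [3, 0, 6, 0, 5, 4, 8, 5, 6, 7]
n = len(z)

def mean(values):
    return sum(values) / len(values)

# Sample moments with 1/N weights (each pair has probability 1/N).
mu_z, mu_y = mean(z), mean(y)
var_z = mean([v * v for v in z]) - mu_z ** 2
var_y = mean([v * v for v in y]) - mu_y ** 2
cov_zy = mean([a * b for a, b in zip(z, y)]) - mu_z * mu_y

# Moment of inertia: half the mean squared difference between paired values.
gamma = 0.5 * mean([(a - b) ** 2 for a, b in zip(z, y)])

# Standard expansion of E{(Z - Y)^2} in terms of means, variances, covariance.
gamma_from_moments = 0.5 * (var_z + var_y) - cov_zy + 0.5 * (mu_z - mu_y) ** 2

print(round(gamma, 4))               # 0.45
print(round(gamma_from_moments, 4))  # 0.45
```

When the two variables share the same mean and variance, the expansion reduces to variance minus covariance, so a large covariance corresponds to a small moment of inertia.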
Slide 12
Conditional Probability (1)
Conditional PDF:
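$f_{Z|Y}(z|y) = \frac{\text{Prob}\{Z = z,\ Y = y\}}{\text{Prob}\{Y = y\}} = \frac{f_{ZY}(z, y)}{f_Y(y)}$
Conditional CDF:
$F_{Z|Y}(z|y) = \text{Prob}\{Z \le z\,|\,Y = y\} = \frac{\text{Prob}\{Z \le z,\ Y = y\}}{\text{Prob}\{Y = y\}}$
More generally (for two events A and B):
$\text{Prob}\{A|B\} = \frac{\text{Prob}\{A \text{ and } B\}}{\text{Prob}\{B\}}$
conditional probability = joint probability / prior probability of conditioning event
Bayes’ relation:
$\text{Prob}\{A|B\} = \frac{\text{Prob}\{A, B\}}{\text{Prob}\{B\}} = \frac{\text{Prob}\{B|A\}}{\text{Prob}\{B\}} \cdot \text{Prob}\{A\}$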
Slide 13
Conditional Probability (2)
11 realizations of the rain-occurrence indicator: in(zε) = 1 if zn > zε, 0 otherwise
i-values 1 1 1 1 1 0 0 1 1 0 0
n-th day 1 2 3 4 5 6 7 8 9 10 11
Requisite: Probability of rain today, given rain yesterday = ?
Conditional probability:
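$p_{1|1} = \text{Prob}\{I_n(z_\varepsilon) = 1\,|\,I_{n-1}(z_\varepsilon) = 1\} = \frac{\text{Prob}\{I_n(z_\varepsilon) = 1,\ I_{n-1}(z_\varepsilon) = 1\}}{\text{Prob}\{I_{n-1}(z_\varepsilon) = 1\}} = \frac{E\{I_n(z_\varepsilon) \cdot I_{n-1}(z_\varepsilon)\}}{E\{I_{n-1}(z_\varepsilon)\}} = \frac{p_{1,1}}{p_1}$
Application: on the “eroded” sequence {in(zε), n = 2, . . . , 11},
for which both today’s and yesterday’s events are available:
• $p_{1,1} = E\{I_n(z_\varepsilon) \cdot I_{n-1}(z_\varepsilon)\} = \frac{5}{10}$
• $p_1 = E\{I_{n-1}(z_\varepsilon)\} = E\{I_n(z_\varepsilon)\} = \frac{6}{10}$
• $p_{1|1} = \frac{5}{6} = 0.83$
Note that p1|1 = 0.83 > 0.6 = proportion of rainy days,
and p1,1 = 0.5 > 0.36 = p1 · p1 (under independence)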
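The rain example can be reproduced from the indicator sequence. The following is a minimal Python sketch that follows the slide in estimating p1 as the mean of the eroded sequence (days 2 to 11), which relies on E{In−1} = E{In}; the variable names are illustrative. Averaging yesterday's indicators (days 1 to 10) instead would give 7/10, so the choice of estimator matters for such a short record.

```python
# Rain-occurrence indicators for days 1..11, as listed on the slide.
i = [1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0]

yesterday = i[:-1]  # days 1..10
today = i[1:]       # days 2..11 (the "eroded" sequence)

# Joint probability of rain on two consecutive days.
p11 = sum(a * b for a, b in zip(yesterday, today)) / len(today)

# Marginal probability of rain, estimated on the eroded sequence as on the slide.
p1 = sum(today) / len(today)

# Conditional probability of rain today given rain yesterday.
p1_given_1 = p11 / p1

print(p11, p1, round(p1_given_1, 2))  # 0.5 0.6 0.83
```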
Slide 14
Conditional Moments
Conditional mean:
$\mu_{Z|y} = E\{Z\,|\,Y = y\} = \int_{-\infty}^{+\infty} z \cdot f_{Z|Y}(z|y)\,dz$
non-linear function of y-data, in general
Conditional variance:
$\sigma_{Z|y} = \text{Var}\{Z\,|\,Y = y\} = \int_{-\infty}^{+\infty} [z - E\{Z\,|\,Y = y\}]^2 \cdot f_{Z|Y}(z|y)\,dz$
non-linear function of y-data, in general
For two independent RVs:
fZ|Y (z|y) = fZ(z)⇔ fZ,Y (z, y) = fZ(z) · fY (y)
E{Z|Y = y} = µZ , and V ar{Z|Y = y} = σZ for any y
under independence, if Y = y, the best estimate of Z is µZ
(best in the least squared error sense)
Slide 15
Standard Normal Bivariate Distribution
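$(Z, Y) \sim N\{0, 1;\ 0, 1;\ \rho_{ZY}\}$, with $\rho_{ZY}$ the one single parameter
Bivariate PDF:
$f_{ZY}(z, y) = \frac{1}{2\pi\sqrt{1 - \rho_{ZY}^2}} \exp\left[-\frac{1}{2(1 - \rho_{ZY}^2)} \left(z^2 - 2\rho_{ZY}\,z\,y + y^2\right)\right]$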
Marginal distributions:
$Z \sim N\{\mu_Z = 0,\ s_Z = 1\}$
$Y \sim N\{\mu_Y = 0,\ s_Y = 1\}$
Zero correlation entails independence:
$\rho_{ZY} = 0 \Rightarrow f_{ZY}(z, y) = f_Z(z) f_Y(y)$
only for bivariate standard Normal RVs
Slide 16
Bivariate Normal Distribution
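$(Z, Y) \sim N\{\mu_Z, \sigma_Z;\ \mu_Y, \sigma_Y;\ \rho_{ZY}\} \Rightarrow \left(\frac{Z - \mu_Z}{\sqrt{\sigma_Z}}, \frac{Y - \mu_Y}{\sqrt{\sigma_Y}}\right) \sim N\{0, 1;\ 0, 1;\ \rho_{ZY}\}$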
Marginal distributions:
$Z \sim N\{\mu_Z, \sigma_Z\}$
$Y \sim N\{\mu_Y, \sigma_Y\}$
Conditional distributions (Gaussian with two parameters):
• conditional mean: $E\{Z\,|\,Y = y\} = \mu_Z + \rho_{ZY} \frac{\sqrt{\sigma_Z}}{\sqrt{\sigma_Y}} (y - \mu_Y)$
• conditional variance: $\text{Var}\{Z\,|\,Y = y\} = \sigma_Z (1 - \rho_{ZY}^2)$
conditional mean = linear function of y-data,
conditional variance = independent of y-data (homoscedastic)
Zero correlation entails independence:
$\rho_{ZY} = 0 \Rightarrow f_{ZY}(z, y) = f_Z(z) f_Y(y)$, only for bivariate Normal RVs
under independence: cond. mean = µZ , and cond. variance = σZ
under perfect correlation (ρZY = +1): cond. mean = $\mu_Z + \frac{\sqrt{\sigma_Z}}{\sqrt{\sigma_Y}} (y - \mu_Y)$, and
cond. variance = 0
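The conditional mean and variance formulas, together with the two limiting cases, can be sketched as a small function. This is a minimal Python example that follows the notes' convention of writing σZ and σY for variances (hence the square roots in the slope); the function name and the numerical inputs are hypothetical.

```python
import math

def conditional_moments(mu_z, var_z, mu_y, var_y, rho, y):
    """Mean and variance of Z given Y = y for a bivariate Normal pair.

    var_z and var_y are variances (sigma_Z and sigma_Y in the notes).
    """
    slope = rho * math.sqrt(var_z) / math.sqrt(var_y)
    cond_mean = mu_z + slope * (y - mu_y)
    cond_var = var_z * (1.0 - rho ** 2)
    return cond_mean, cond_var

# Hypothetical parameter values, chosen only for illustration.
mean_half, var_half = conditional_moments(10.0, 4.0, 5.0, 9.0, 0.5, 8.0)

# Zero correlation: the y-datum carries no information about Z.
mean_indep, var_indep = conditional_moments(10.0, 4.0, 5.0, 9.0, 0.0, 8.0)

# Perfect positive correlation: Z is determined exactly by y.
mean_perfect, var_perfect = conditional_moments(10.0, 4.0, 5.0, 9.0, 1.0, 8.0)
```

The conditional variance never depends on `y`, which is the homoscedasticity property noted above; only the conditional mean shifts with the datum.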
Slide 17
Multivariate Distributions (1)
K random variables: {Z1, . . . , ZK} or a random vector z
Joint CDF:
FZ1,...,ZK(z1, . . . , zK) = Prob {Z1 ≤ z1, . . . , ZK ≤ zK}
Joint PDF:
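$f_{Z_1,\ldots,Z_K}(z_1, \ldots, z_K) = \frac{\partial^K F_{Z_1,\ldots,Z_K}(z_1, \ldots, z_K)}{\partial z_1 \cdots \partial z_K}$
K-th order partial derivative of joint CDF
Marginal univariate CDF:
$F_{Z_k}(z_k) = \text{Prob}\{Z_k \le z_k;\ Z_i \le +\infty,\ \forall\, i \ne k\}$
Marginal bivariate CDF:
$F_{Z_k Z_{k'}}(z_k, z_{k'}) = \text{Prob}\{Z_k \le z_k,\ Z_{k'} \le z_{k'};\ Z_i \le +\infty,\ \forall\, i \ne k, k'\}$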
Slide 18
Multivariate Distributions (2)
K random variables: {Z1, . . . , ZK}
Conditional univariate CDF:
$F_{Z_k | Z_i,\, \forall i \ne k}(z_k\,|\,Z_i = z_i,\ \forall\, i \ne k) = \text{Prob}\{Z_k \le z_k\,|\,Z_i = z_i,\ \forall\, i \ne k\}$
Conditional mean:
$E\{Z_k\,|\,Z_i = z_i,\ \forall\, i \ne k\} = \varphi(z_i,\ \forall\, i \ne k)$
non-linear function ϕ(·), in general, of K − 1 conditioning values {zi, ∀ i ≠ k}
Independence:
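• $\text{Prob}\{Z_k \le z_k\,|\,Z_i = z_i,\ \forall\, i \ne k\} = \text{Prob}\{Z_k \le z_k\},\ \forall k$
• $f_{Z_1,\ldots,Z_K}(z_1, \ldots, z_K) = \prod_{k=1}^{K} f_{Z_k}(z_k)$
• $f_{Z_k Z_{k'}}(z_k, z_{k'}) = f_{Z_k}(z_k) \cdot f_{Z_{k'}}(z_{k'}),\ \forall k, k' \Rightarrow \rho_{kk'} = 0$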
pairwise independence entails lack of correlation
higher-order (e.g., 3 by 3) independence entails pairwise independence
(reverse not always true)
Slide 19
Multivariate Gaussian Distribution
(K × 1) random vector z = [Z1, . . . , ZK ]T
Univariate and bivariate moments:
• (K × 1) mean vector: µ = [µ1, . . . , µK ]T
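• (K × K) covariance matrix: $\Sigma = E\{[\mathbf{z} - \boldsymbol{\mu}][\mathbf{z} - \boldsymbol{\mu}]^T\}$
Multivariate PDF:
$f_{\mathbf{Z}}(\mathbf{z}) = \frac{1}{(2\pi)^{K/2} |\Sigma|^{1/2}} \exp\left[-\frac{1}{2} [\mathbf{z} - \boldsymbol{\mu}]^T \Sigma^{-1} [\mathbf{z} - \boldsymbol{\mu}]\right]$
$f_{\mathbf{Z}}(\mathbf{z}) = f_{Z_1,\ldots,Z_K}(z_1, \ldots, z_K) = c\,\exp\left[-\frac{1}{2} Q(z_1, \ldots, z_K)\right]$,
|Σ| = determinant of Σ, c = constant, Q = 2nd-order polynomial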
Properties:
• all (K − k′)-variate marginal distributions are Gaussian
∀k′ = 1, . . . , K − 1
• all (K − k′)-variate conditional distributions are Gaussian
• any univariate cond. distribution has (cond.) expectation which
is a linear function of z-values, and (cond.) variance independent
of z-values
• any linear transformation of multivariate Gaussian RVs yields
multivariate Gaussian RVs
Slide 20
Bivariate Gaussian Distribution
Standardized versus non-standardized variables:
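$(Z, Y) \sim N\{\mu_Z, \sigma_Z;\ \mu_Y, \sigma_Y;\ \rho_{ZY}\} \Rightarrow \left(\frac{Z - \mu_Z}{\sqrt{\sigma_Z}}, \frac{Y - \mu_Y}{\sqrt{\sigma_Y}}\right) \sim N\{0, 1;\ 0, 1;\ \rho_{ZY}\}$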
Bivariate Gaussian PDF (non-standardized):
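$f_{ZY}(z, y) = \frac{1}{2\pi\sqrt{\sigma_Z \sigma_Y (1 - \rho_{ZY}^2)}} \times \exp\left[-\frac{1}{2(1 - \rho_{ZY}^2)} \left[\left(\frac{z - \mu_Z}{\sqrt{\sigma_Z}}\right)^2 - 2\rho_{ZY} \left(\frac{z - \mu_Z}{\sqrt{\sigma_Z}}\right) \left(\frac{y - \mu_Y}{\sqrt{\sigma_Y}}\right) + \left(\frac{y - \mu_Y}{\sqrt{\sigma_Y}}\right)^2\right]\right]$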
a scaled and shifted version
of the standardized bivariate Gaussian PDF
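The non-standardized bivariate Gaussian PDF can be evaluated directly and cross-checked against the general multivariate form with K = 2. The sketch below is a minimal Python example under the notes' convention that σZ and σY are variances; the 2 × 2 determinant and inverse are written out by hand, and the function names and test point are hypothetical.

```python
import math

def bivariate_normal_pdf(z, y, mu_z, var_z, mu_y, var_y, rho):
    """Bivariate Gaussian PDF written with standardized residuals."""
    u = (z - mu_z) / math.sqrt(var_z)
    v = (y - mu_y) / math.sqrt(var_y)
    q = (u * u - 2.0 * rho * u * v + v * v) / (1.0 - rho ** 2)
    norm = 2.0 * math.pi * math.sqrt(var_z * var_y * (1.0 - rho ** 2))
    return math.exp(-0.5 * q) / norm

def gaussian_pdf_matrix_form(z, y, mu_z, var_z, mu_y, var_y, rho):
    """Same density from the multivariate formula with K = 2.

    Sigma = [[var_z, c], [c, var_y]] with c = rho * sqrt(var_z * var_y);
    its inverse is [[var_y, -c], [-c, var_z]] / det(Sigma).
    """
    c = rho * math.sqrt(var_z * var_y)
    det = var_z * var_y - c * c
    dz, dy = z - mu_z, y - mu_y
    q = (var_y * dz * dz - 2.0 * c * dz * dy + var_z * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

# Hypothetical evaluation point and parameters, for illustration only.
args = (0.5, 2.0, 0.0, 1.0, 1.0, 4.0, 0.25)
p_direct = bivariate_normal_pdf(*args)
p_matrix = gaussian_pdf_matrix_form(*args)

# Standardized case at the origin with zero correlation: 1 / (2 * pi).
p_origin = bivariate_normal_pdf(0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0)
```

The agreement of the two functions reflects that the quadratic form Q in the multivariate PDF reduces, for K = 2, to the bracketed expression in the bivariate formula.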
Slide 21
Bivariate Gaussian Distribution Examples
Standardized (zero mean, unit variance):
[Figure: standardized bivariate normal pdf with r=0.25; axes: X-variable and Y-variable, each from -4 to 4]
[Figure: standardized bivariate normal pdf with r=0.75; axes: X-variable and Y-variable, each from -4 to 4]
Non-Standardized:
[Figure: bivariate normal pdf with mx=0, my=1, sx=1, sy=2, r=0.25; axes: X-variable, Y-variable, pdf]
[Figure: bivariate normal pdf with mx=0, my=1, sx=1, sy=2, r=0.75; axes: X-variable, Y-variable, pdf]
Slide 22