A gentle introduction to the Gaussian distribution
Review
• Random variable
• Coin flip experiment
• X is a random variable: X = 0 for tails, X = 1 for heads
Review
• Probability mass function (discrete)
[Plot: probability mass function P(x) for the coin flip, with mass at x = 0 and x = 1]
P(x) >= 0
Example: Coin flip experiment
Any other constraints?
Hint: What is the sum?
Review
• Probability density function (continuous)
[Plot: a probability density f(x) over x]
f(x) >= 0
Examples?
Unlike the discrete case, the density function does not represent a probability but its rate of change, sometimes called the "likelihood"
Review
• Probability density function (continuous)
[Plot: a density f(x), with a small interval [x0, x0+dx] highlighted]
f(x) >= 0
P( x0 < x < x0+dx ) ≈ f(x0)·dx for small dx
But P( x = x0 ) = 0
and f integrates to 1: ∫ f(x) dx = 1
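To make these properties concrete, here is a small Python sketch (my addition, not part of the original slides) that checks them numerically for a standard Gaussian density using scipy:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

# Density is non-negative everywhere (spot-check on a grid)
xs = np.linspace(-10, 10, 1001)
assert np.all(norm.pdf(xs) >= 0)

# The density integrates to 1 over the real line
total, _ = quad(norm.pdf, -np.inf, np.inf)
print(f"integral of f: {total:.6f}")  # ~1.000000

# P(x0 < x < x0+dx) ≈ f(x0)·dx for small dx
x0, dx = 0.5, 1e-4
exact = norm.cdf(x0 + dx) - norm.cdf(x0)
approx = norm.pdf(x0) * dx
print(f"exact: {exact:.8f}, f(x0)·dx: {approx:.8f}")
```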
The Gaussian Distribution
Courtesy: http://research.microsoft.com/~cmbishop/PRML/index.htm
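The density formula on this slide is an image in the original and did not survive extraction; the standard univariate Gaussian density it refers to is

N(x | μ, σ²) = 1/√(2πσ²) · exp( −(x − μ)² / (2σ²) )

with mean μ and variance σ².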
A 2D Gaussian
Central Limit Theorem
•The distribution of the sum of N i.i.d. random variables becomes increasingly Gaussian as N grows.
•Example: N uniform [0,1] random variables.
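A quick simulation of this uniform example (my addition, not from the slides) shows the effect: the mean of N uniform [0,1] variables concentrates around 1/2 with variance 1/(12N), increasingly well described by a Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 100_000

for n in (1, 2, 10):
    # Mean of n i.i.d. uniform [0,1] variables, repeated over many trials
    means = rng.uniform(0.0, 1.0, size=(trials, n)).mean(axis=1)
    # Theory: mean 1/2, variance 1/(12n)
    print(f"N={n:2d}: sample mean={means.mean():.4f}, "
          f"sample var={means.var():.5f}, theory var={1/(12*n):.5f}")
```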
Central Limit Theorem (Coin flip)
• Flip coin N times
• Each flip i has an associated random variable Xi (Xi = 1 if heads, 0 otherwise)
• Number of heads
• NH is a random variable
– Sum of N i.i.d. random variables
NH = X1 + X2 + … + XN
Central Limit Theorem (Coin flip)
• Probability mass function of NH
– P(Head) = 0.5 (fair coin)
[Plots: PMF of NH for N = 5, N = 10, N = 40]
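These panels follow the binomial PMF; a minimal sketch reproducing their numbers with scipy (my addition, not from the slides):

```python
import numpy as np
from scipy.stats import binom

# PMF of the number of heads NH in N flips of a fair coin (p = 0.5)
for n in (5, 10, 40):
    k = np.arange(n + 1)
    pmf = binom.pmf(k, n, 0.5)
    print(f"N={n}: mode at k={k[np.argmax(pmf)]}, P(mode)={pmf.max():.3f}")
```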
Geometry of the Multivariate Gaussian
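The equations on this slide are images in the original; the density in question is the multivariate Gaussian

N(x | μ, Σ) = (2π)^(−D/2) · |Σ|^(−1/2) · exp( −(1/2) (x − μ)ᵀ Σ⁻¹ (x − μ) )

where D is the dimensionality, μ the mean vector, and Σ the covariance matrix. The quadratic form Δ² = (x − μ)ᵀ Σ⁻¹ (x − μ) is the Mahalanobis distance; its level sets are ellipsoids whose axes align with the eigenvectors of Σ, which is the geometry the slide title refers to.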
Moments of the Multivariate Gaussian (1)
• Substituting z = x − μ, the term linear in z integrates to zero thanks to the anti-symmetry of z, giving E[x] = μ
Moments of the Multivariate Gaussian (2)
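The derivations on these two slides are images in the original; the standard results they establish (PRML, Section 2.3) are, in summary:

E[x] = μ
E[x xᵀ] = μ μᵀ + Σ
cov[x] = E[ (x − E[x])(x − E[x])ᵀ ] = Σ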
Maximum likelihood
• Fit a probability density model p(x | θ) to the data
– Estimate θ
• Given independent identically distributed (i.i.d.) data X = (x1, x2, …, xN)
– Likelihood: p(X | θ)
– Log likelihood: ln p(X | θ)
• Maximum likelihood: Maximize ln p(X | θ) w.r.t. θ
p(X | θ) = p(x1 | θ) · p(x2 | θ) · … · p(xN | θ)

ln p(X | θ) = ∑_{i=1..N} ln p(xi | θ)
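As a concrete illustration (my addition, assuming a univariate Gaussian model with θ = (μ, σ)), the i.i.d. log likelihood is just a sum of per-point log densities:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=1.5, size=500)  # i.i.d. samples

def log_likelihood(data, mu, sigma):
    # ln p(X | θ) = sum_i ln p(x_i | θ)
    return np.sum(norm.logpdf(data, loc=mu, scale=sigma))

# The log likelihood is higher near the true parameters
print(log_likelihood(data, 2.0, 1.5))  # near the truth
print(log_likelihood(data, 0.0, 1.5))  # worse fit, lower value
```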
Maximum Likelihood for the Gaussian (1)
• Given i.i.d. data X = (x1, …, xN), the log likelihood function is given by
ln p(X | μ, Σ) = −(ND/2) ln(2π) − (N/2) ln|Σ| − (1/2) ∑_{n=1..N} (xn − μ)ᵀ Σ⁻¹ (xn − μ)
• Sufficient statistics: ∑_n xn and ∑_n xn xnᵀ
Maximum Likelihood for the Gaussian (2)
• Set the derivative of the log likelihood function with respect to μ to zero,
∂/∂μ ln p(X | μ, Σ) = ∑_{n=1..N} Σ⁻¹ (xn − μ) = 0
• and solve to obtain
μ_ML = (1/N) ∑_{n=1..N} xn
• Similarly, for the covariance,
Σ_ML = (1/N) ∑_{n=1..N} (xn − μ_ML)(xn − μ_ML)ᵀ
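A small numpy check (my addition, on synthetic data with illustrative parameters) confirms these closed-form estimates:

```python
import numpy as np

rng = np.random.default_rng(2)
true_mu = np.array([1.0, -2.0])
true_cov = np.array([[2.0, 0.6], [0.6, 1.0]])
X = rng.multivariate_normal(true_mu, true_cov, size=5000)  # shape (N, D)

N = X.shape[0]
mu_ml = X.mean(axis=0)            # (1/N) ∑ x_n
diff = X - mu_ml
cov_ml = (diff.T @ diff) / N      # (1/N) ∑ (x_n − μ_ML)(x_n − μ_ML)ᵀ

print("μ_ML :", mu_ml)    # close to true_mu
print("Σ_ML :", cov_ml)   # close to true_cov (biased: divides by N, not N−1)
```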
Mixtures of Gaussians (1)
• Old Faithful data set
[Plots: Old Faithful data fit with a single Gaussian and with a mixture of two Gaussians]
Mixtures of Gaussians (2)
• Combine simple models into a complex model:
p(x) = ∑_{k=1..K} πk · N(x | μk, Σk)
where N(x | μk, Σk) is the k-th component and πk its mixing coefficient, with 0 ≤ πk ≤ 1 and ∑_k πk = 1 (the slide's example uses K = 3)
Mixtures of Gaussians (3)
Mixtures of Gaussians (4)
• Determining parameters μ, Σ, and π using maximum log likelihood
ln p(X | π, μ, Σ) = ∑_{n=1..N} ln [ ∑_{k=1..K} πk · N(xn | μk, Σk) ]
– Log of a sum; no closed-form maximum.
• Solution: use standard, iterative, numeric optimization methods or the expectation maximization algorithm (Chapter 9).
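Since there is no closed form, mixtures are fit iteratively in practice. A minimal sketch (my addition, using scikit-learn's GaussianMixture, which runs EM internally, on synthetic stand-in data rather than the actual Old Faithful set):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic two-cluster data, a stand-in for the Old Faithful example
rng = np.random.default_rng(3)
X = np.vstack([
    rng.normal([2.0, 55.0], [0.3, 6.0], size=(200, 2)),
    rng.normal([4.3, 80.0], [0.4, 7.0], size=(300, 2)),
])

# EM alternates computing responsibilities (E-step) and parameter updates (M-step)
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
print("mixing coefficients π:", gmm.weights_)
print("means μ:\n", gmm.means_)
print("mean log likelihood per sample:", gmm.score(X))
```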
Thank you!