7/25/2019 Kirk20130318 Book Club StochProcesses
Stochastic Processes for Physicists: Understanding Noisy Systems
Chapter 1: A review of probability theory
Paul Kirk, Division of Molecular Biosciences, Imperial College London
19/03/2013
1.1 Random variables and mutually exclusive events
Random variables
Suppose we do not know the precise value of a variable, but may have an idea of the relative likelihood that it will have one of a number of possible values.

Let us call the unknown quantity X.

This quantity is referred to as a random variable.
Probability
Consider a 6-sided die. Let X be the value we get when we roll the die.

Describe the likelihood that X will have each of the values 1, . . . , 6 by a number between 0 and 1: the probability.

If Prob(X = 3) = 1, then we will always get a 3.

If Prob(X = 3) = 2/3, then we expect to get a 3 about two-thirds of the time.
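This frequency interpretation is easy to check by simulation. The following is a minimal sketch (not from the slides) using only Python's standard library, with a fixed seed so the run is reproducible:

```python
import random
from collections import Counter

random.seed(0)  # fixed seed for reproducibility

# Roll a fair 6-sided die many times; each face has Prob(X = k) = 1/6.
rolls = [random.randint(1, 6) for _ in range(100_000)]
counts = Counter(rolls)

# Empirical relative frequencies should approach 1/6 ≈ 0.1667.
freqs = {face: counts[face] / len(rolls) for face in range(1, 7)}
```

With 100,000 rolls, each empirical frequency lands within about a percent of 1/6.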
Paul Kirk 1 of 22
1.1 Random variables and mutually exclusive events
Mutually exclusive events
The various values of X are an example of mutually exclusive events.

X can take precisely one of the values between 1 and 6.

Mutually exclusive probabilities sum

Prob(X = 3 or X = 4) = Prob(X = 3) + Prob(X = 4).
1.1 Random variables and mutually exclusive events
Note: in mathematics texts it is customary to denote the unknown quantity using a capital letter, say X, and a variable that specifies one of the possible values that X may have as the equivalent lower-case letter, x. We will use this convention in this chapter, but in the following chapters we will use a lower-case letter for both the unknown quantity and the values it can take, since it causes no confusion.

So, rather than writing Prob(X = 3) or Prob(X = x), we will (in later chapters) tend to write Prob(3) or Prob(x).

Warning: may cause confusion.
1.1 Random variables and mutually exclusive events
Continuous random variables
For continuous random variables, the probability for X to be within a range is found by integrating the probability density function:

Prob(a < X ≤ b) = ∫_{a}^{b} P(x) dx.
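As a concrete check of this integral, here is a minimal sketch (not from the slides) that integrates an exponential density P(x) = exp(−x) numerically with the trapezoidal rule and compares against the closed form e^(−a) − e^(−b):

```python
import math

# Density of an exponential random variable with rate 1: P(x) = exp(-x), x >= 0.
def P(x):
    return math.exp(-x)

def prob_between(a, b, n=100_000):
    """Prob(a < X <= b) via the trapezoidal rule applied to the density."""
    h = (b - a) / n
    s = 0.5 * (P(a) + P(b)) + sum(P(a + i * h) for i in range(1, n))
    return s * h

p = prob_between(0.5, 2.0)
exact = math.exp(-0.5) - math.exp(-2.0)
```

The numeric and exact answers agree to many decimal places for this smooth density.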
1.1 Random variables and mutually exclusive events
Expectation
The expectation of an arbitrary function, f(X), with respect to the probability density function P(X) is

⟨f(X)⟩_{P(X)} = ∫ P(x) f(x) dx.

The mean or expected value of X is ⟨X⟩.
1.1 Random variables and mutually exclusive events
Variance
The variance of X is the expectation of the squared difference from the mean:

V[X] = ∫ P(x) (x − ⟨X⟩)² dx
     = ∫ P(x) (x² + ⟨X⟩² − 2x⟨X⟩) dx
     = ∫ P(x) x² dx + ⟨X⟩² ∫ P(x) dx − 2⟨X⟩ ∫ P(x) x dx
     = ⟨X²⟩ + ⟨X⟩² − 2⟨X⟩²
     = ⟨X²⟩ − ⟨X⟩².
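The identity V[X] = ⟨X²⟩ − ⟨X⟩² can be verified directly for the fair die; a minimal discrete sketch (not from the slides), where the integrals become weighted sums:

```python
# Fair die: P(x) = 1/6 for x in 1..6.
xs = range(1, 7)
p = 1 / 6

mean = sum(p * x for x in xs)                      # ⟨X⟩ = 3.5
mean_sq = sum(p * x * x for x in xs)               # ⟨X²⟩ = 91/6
var_direct = sum(p * (x - mean) ** 2 for x in xs)  # Σ P(x)(x − ⟨X⟩)²
var_identity = mean_sq - mean ** 2                 # ⟨X²⟩ − ⟨X⟩²
```

Both routes give V[X] = 35/12 ≈ 2.917, as the derivation promises.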
1.2 Independence
Independence
Independent probabilities multiply

For independent variables, P_{X,Y}(x, y) = P_X(x) P_Y(y).

For independent variables, ⟨XY⟩ = ⟨X⟩⟨Y⟩.
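Both facts can be checked exhaustively for two independent fair dice; a minimal sketch (not from the slides), building the joint distribution as the product of the marginals:

```python
from itertools import product

# Two independent fair dice: joint P(x, y) = P_X(x) * P_Y(y) = 1/36.
px = {x: 1 / 6 for x in range(1, 7)}
py = {y: 1 / 6 for y in range(1, 7)}
joint = {(x, y): px[x] * py[y] for x, y in product(px, py)}

exy = sum(p * x * y for (x, y), p in joint.items())  # ⟨XY⟩
ex = sum(p * x for x, p in px.items())               # ⟨X⟩
ey = sum(p * y for y, p in py.items())               # ⟨Y⟩
```

Here ⟨XY⟩ = ⟨X⟩⟨Y⟩ = 3.5 × 3.5 = 12.25, exactly as independence requires.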
1.3 Dependent random variables
Dependence
If X and Y are dependent, then P_{X,Y}(x, y) does not factor as the product of P_X(x) and P_Y(y).

If we know P_{X,Y}(x, y) and want to know P_X(x), then it is obtained by integrating out (or marginalising) the other variable:

P_X(x) = ∫ P_{X,Y}(x, y) dy.
1.3 Dependent random variables
Conditional probability densities
The probability density for X given that we know that Y = y is written P(X = x | Y = y) or P(x|y), and is referred to as the conditional probability density for X given Y:

P_{X|Y}(X = x | Y = y) = P_{X,Y}(X = x, Y = y) / P_Y(Y = y).

Explanation: To see how to calculate this conditional probability, we note first that P(x, y) with y = a gives the relative probability for different values of x given that Y = a. To obtain the conditional probability density for X given that Y = a, all we have to do is divide P(x, a) by its integral over all values of x.
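The discrete analogues of marginalising and conditioning are just sums and a division by the marginal. A minimal sketch (not from the slides), using a deliberately dependent pair where Y records the parity of a die roll X:

```python
# A dependent pair: X is a fair die, Y = X mod 2 (Y is determined by X).
joint = {}
for x in range(1, 7):
    y = x % 2
    joint[(x, y)] = 1 / 6

# Marginalise ("integrate out") one variable by summing the joint over it.
pX = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in range(1, 7)}
pY = {y: sum(p for (x, yy), p in joint.items() if yy == y) for y in (0, 1)}

# Conditional density of X given Y = 1: joint divided by the marginal P_Y(1).
pX_given_1 = {x: joint.get((x, 1), 0.0) / pY[1] for x in range(1, 7)}
```

Conditioning on Y = 1 (an odd roll) puts probability 1/3 on each of 1, 3, 5 and zero on the even faces, and the conditional correctly renormalises to 1.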
1.4 Correlations and correlation coefficients
The covariance of X and Y is:

cov(X, Y) = ⟨(X − ⟨X⟩)(Y − ⟨Y⟩)⟩ = ⟨XY⟩ − ⟨X⟩⟨Y⟩.

Idea:

1. How can we define what it means for a value x to be bigger than usual? Well, we can see if x > ⟨X⟩, i.e. if x − ⟨X⟩ > 0.
2. Similarly, we can say that a value x is smaller than usual if x < ⟨X⟩, i.e. if x − ⟨X⟩ < 0.

The correlation is just a normalised version of the covariance, which takes values in the range −1 to 1:

C_{XY} = (⟨XY⟩ − ⟨X⟩⟨Y⟩) / √(V[X] V[Y]).
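The normalisation is easy to see in an extreme case. A minimal sketch (not from the slides): for a fair die X and Y = 7 − X, every above-average x pairs with a below-average y, so the correlation should come out exactly −1:

```python
import math

# Fair die X and Y = 7 - X: perfectly anti-correlated.
xs = range(1, 7)
p = 1 / 6
y_of = {x: 7 - x for x in xs}

ex = sum(p * x for x in xs)
ey = sum(p * y_of[x] for x in xs)
exy = sum(p * x * y_of[x] for x in xs)
vx = sum(p * (x - ex) ** 2 for x in xs)
vy = sum(p * (y_of[x] - ey) ** 2 for x in xs)

cov = exy - ex * ey                 # ⟨XY⟩ − ⟨X⟩⟨Y⟩ = −35/12
corr = cov / math.sqrt(vx * vy)     # normalised: exactly −1
```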
1.5 Adding random variables together

When we have two continuous random variables, X and Y, with probability densities P_X and P_Y, it is often useful to be able to calculate the probability density of the random variable whose value is the sum of them: Z = X + Y. It turns out that the probability density for Z is given by

P_Z(z) = ∫ P_{X,Y}(z − s, s) ds = ∫ P_{X,Y}(s, z − s) ds.

If X and Y are independent, this becomes the convolution:

P_Z(z) = ∫ P_X(z − s) P_Y(s) ds = (P_X ∗ P_Y)(z).
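The discrete version of the convolution, P_Z(z) = Σ_s P_X(z − s) P_Y(s), can be evaluated directly for the sum of two fair dice; a minimal sketch (not from the slides):

```python
from collections import defaultdict

# Sum of two independent fair dice, via the discrete convolution
# P_Z(z) = Σ_s P_X(z - s) P_Y(s).
pX = {x: 1 / 6 for x in range(1, 7)}
pY = dict(pX)

pZ = defaultdict(float)
for s, py in pY.items():
    for z in range(2, 13):
        if (z - s) in pX:
            pZ[z] += pX[z - s] * py
```

This reproduces the familiar triangular distribution: P_Z(7) = 6/36 is the peak, P_Z(2) = P_Z(12) = 1/36 the tails.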
1.5 Adding random variables together
If X₁ and X₂ are random variables and X = X₁ + X₂, then

⟨X⟩ = ⟨X₁⟩ + ⟨X₂⟩,

and if X₁ and X₂ are independent, then

V[X] = V[X₁] + V[X₂].

Mysterious (?) assertion

Averaging the results of a number of independent measurements produces a more accurate result. This is because the variances of the different measurements add together. Does this make sense?
1.5 Adding random variables together
Explanation

Assume all measurements have expectation μ and variance σ².

By the independence assumption, the variance of the average is:

V[(1/N) Σ_{n=1}^{N} X_n] = Σ_{n=1}^{N} V[X_n / N].

Moreover,

V[X_n / N] = E[X_n²]/N² − (E[X_n]/N)² = (E[X_n²] − (E[X_n])²)/N² = V[X_n]/N².

So,

V[(1/N) Σ_{n=1}^{N} X_n] = Σ_{n=1}^{N} V[X_n]/N² = (1/N²) Σ_{n=1}^{N} σ² = σ²/N.
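The σ²/N scaling can be checked by Monte Carlo; a minimal sketch (not from the slides), averaging N uniform "measurements" (σ² = 1/12) many times and looking at the spread of the averages, with a fixed seed for reproducibility:

```python
import random
import statistics

random.seed(42)  # reproducible

# Each "measurement" is uniform on [0, 1], so σ² = 1/12.
# The variance of the average of N measurements should be σ²/N.
N = 10
trials = 20_000
averages = [statistics.fmean(random.random() for _ in range(N))
            for _ in range(trials)]

var_of_avg = statistics.pvariance(averages)
expected = (1 / 12) / N  # σ²/N ≈ 0.00833
```

With 20,000 trials the empirical variance of the average sits very close to σ²/N, resolving the "mysterious" assertion: the variances add, but the 1/N² prefactor from averaging wins.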
1.6 Transformations of a random variable

Key assertion: If Y = g(X), then:

⟨f(Y)⟩ = ∫_{x=a}^{x=b} P_X(x) f(g(x)) dx = ∫_{y=g(a)}^{y=g(b)} P_Y(y) f(y) dy.

Given this assumption, everything else falls out automatically:

⟨f(Y)⟩ = ∫_{x=a}^{x=b} P_X(x) f(g(x)) dx = ∫_{y=g(a)}^{y=g(b)} P_X(g⁻¹(y)) f(y) (dx/dy) dy
       = ∫_{y=g(a)}^{y=g(b)} [P_X(g⁻¹(y)) / g′(g⁻¹(y))] f(y) dy.

General result (for invertible g):

P_Y(y) = P_X(g⁻¹(y)) / |g′(g⁻¹(y))|.
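The change-of-variables formula can be sanity-checked numerically; a minimal sketch (not from the slides) with X uniform on (0, 1) and g(x) = x², so P_Y(y) = 1/(2√y), computing ⟨Y⟩ both in x-space and in y-space:

```python
import math

# X ~ Uniform(0, 1), Y = g(X) = X².  For this invertible g on (0, 1),
# P_Y(y) = P_X(g⁻¹(y)) / |g′(g⁻¹(y))| = 1 / (2·√y).
def P_Y(y):
    return 1.0 / (2.0 * math.sqrt(y))

def integrate(f, a, b, n=200_000):
    # Midpoint rule (avoids the integrable singularity of P_Y at y = 0).
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# ⟨Y⟩ in x-space: ∫₀¹ P_X(x)·g(x) dx = ∫₀¹ x² dx = 1/3
mean_x_space = integrate(lambda x: x * x, 0.0, 1.0)
# ⟨Y⟩ in y-space: ∫₀¹ P_Y(y)·y dy — should give the same 1/3
mean_y_space = integrate(lambda y: P_Y(y) * y, 0.0, 1.0)
```

Both integrals return 1/3, confirming that the transformed density carries the same expectations.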
1.7 The distribution function
The probability distribution function, which we will call D(x), of a random variable X is defined as the probability that X is less than or equal to x. Thus

D(x) = Prob(X ≤ x) = ∫_{−∞}^{x} P(z) dz.

In addition, the fundamental theorem of calculus tells us that

P(x) = (d/dx) D(x).
1.8 The characteristic function
The characteristic function is defined as the Fourier transform of the probability density:

φ(s) = ∫ P(x) exp(isx) dx.

The inverse transform gives:

P(x) = (1/2π) ∫ φ(s) exp(−isx) ds.

The Fourier transform of the convolution of two functions, P(x) and Q(x), is the product of their Fourier transforms, P̃(s) and Q̃(s).

For discrete random variables, the characteristic function is a sum. In general (for both discrete and continuous r.v.s), we have:

φ(s) = ⟨exp(isX)⟩_{P(X)}.
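The convolution theorem says that for independent X and Y, φ_Z(s) = φ_X(s)·φ_Y(s) when Z = X + Y. A minimal discrete sketch (not from the slides) checking this for two dice at an arbitrary s:

```python
import cmath

# Discrete characteristic function: φ(s) = Σ_x P(x)·exp(i·s·x).
def char_fn(dist, s):
    return sum(p * cmath.exp(1j * s * x) for x, p in dist.items())

pX = {x: 1 / 6 for x in range(1, 7)}
pY = dict(pX)

# Distribution of Z = X + Y by direct enumeration (independence assumed).
pZ = {}
for x, px in pX.items():
    for y, py in pY.items():
        pZ[x + y] = pZ.get(x + y, 0.0) + px * py

# Convolution theorem: φ_Z(s) = φ_X(s)·φ_Y(s) for every s.
s = 0.7
lhs = char_fn(pZ, s)
rhs = char_fn(pX, s) * char_fn(pY, s)
```

Note also that φ(0) = ⟨1⟩ = 1 for any distribution, which the code confirms.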
1.9 Moments and cumulants
Moment generating function (departure from the book)

The moment generating function is defined as:

M(t) = ⟨exp(tX)⟩,

where X is a random variable, and the expectation is with respect to some density P(X), so that

M(t) = ∫ exp(tx) P(x) dx
     = ∫ (1 + tx + (1/2!) t²x² + …) P(x) dx
     = 1 + t m₁ + (1/2!) t² m₂ + … + (1/r!) tʳ mᵣ + …,

where mᵣ = ⟨Xʳ⟩ is the r-th (raw) moment.
1.9 Moments and cumulants
Moment generating function (continued)

M(t) = 1 + t m₁ + (1/2!) t² m₂ + … + (1/r!) tʳ mᵣ + …

It follows from the above expansion that:

M(0) = 1
M′(0) = m₁
M″(0) = m₂
...
M⁽ʳ⁾(0) = mᵣ
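For a distribution with finitely many values, M(t) is an exact finite sum, so the derivative relations can be checked with finite differences. A minimal sketch (not from the slides) for the fair die, where m₁ = 3.5 and m₂ = 91/6:

```python
import math

# M(t) = ⟨exp(tX)⟩ for a fair die, evaluated exactly as a finite sum.
def M(t):
    return sum(math.exp(t * x) / 6 for x in range(1, 7))

h = 1e-4
# Central finite differences approximate M'(0) and M''(0).
m1 = (M(h) - M(-h)) / (2 * h)             # should be m₁ = ⟨X⟩ = 3.5
m2 = (M(h) - 2 * M(0) + M(-h)) / h ** 2   # should be m₂ = ⟨X²⟩ = 91/6
```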
1.9 Moments and cumulants
Cumulant generating function (continued departure from the book)

The log of the moment generating function is called the cumulant generating function, R(t) = ln(M(t)).

By the chain rule of differentiation, we can write down the derivatives of R(t) in terms of the derivatives of M(t), e.g.

R′(t) = M′(t)/M(t)
R″(t) = (M″(t) M(t) − (M′(t))²) / (M(t))²

Note that R(0) = ln(M(0)) = 0, R′(0) = M′(0) = m₁ = μ, R″(0) = M″(0) − (M′(0))² = m₂ − m₁² = σ², . . . These are the cumulants.
1.9 Moments and cumulants
The moments can be calculated from the derivatives of the characteristic function, evaluated at s = 0. We can see this by expanding the characteristic function as a Taylor series:

φ(s) = Σ_{n=0}^{∞} φ⁽ⁿ⁾(0) sⁿ/n!,

where φ⁽ⁿ⁾(s) is the n-th derivative of φ(s). But we also have:

φ(s) = ⟨e^{isX}⟩ = ⟨Σ_{n=0}^{∞} (isX)ⁿ/n!⟩ = Σ_{n=0}^{∞} iⁿ ⟨Xⁿ⟩ sⁿ/n!.

Equating the two expressions, we get: ⟨Xⁿ⟩ = φ⁽ⁿ⁾(0)/iⁿ.
1.9 Moments and cumulants
Cumulants

The n-th order cumulant of X is (up to a factor of iⁿ) the n-th derivative of the log of the characteristic function, evaluated at s = 0.

For independent random variables, X and Y, if Z = X + Y then the n-th cumulant of Z is the sum of the n-th cumulants of X and Y.

The Gaussian distribution is also the only absolutely continuous distribution all of whose cumulants beyond the first two (i.e. other than the mean and variance) are zero.
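Additivity of cumulants is easy to verify for the first two (the mean and the variance). A minimal sketch (not from the slides), summing a fair die and an independent Bernoulli(0.3) variable:

```python
# First and second cumulants (mean and variance) are additive
# for independent random variables.
def mean(dist):
    return sum(p * x for x, p in dist.items())

def var(dist):
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist.items())

pX = {x: 1 / 6 for x in range(1, 7)}  # fair die
pY = {0: 0.7, 1: 0.3}                 # Bernoulli(0.3)

# Distribution of Z = X + Y, assuming independence.
pZ = {}
for x, px in pX.items():
    for y, py in pY.items():
        pZ[x + y] = pZ.get(x + y, 0.0) + px * py
```

Both the mean and the variance of Z come out as the sums of the corresponding cumulants of X and Y, with no cross-term.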
1.10 The multivariate Gaussian
Let x = [x₁, . . . , x_N]; then the general form of the Gaussian pdf is:

P(x) = (1/√((2π)^N det(Σ))) exp(−(1/2) (x − μ)ᵀ Σ⁻¹ (x − μ)),

where μ is the mean vector and Σ is the covariance matrix.

All higher moments of a Gaussian can be written in terms of the means and covariances. Defining ΔX ≡ X − ⟨X⟩, for a 1-dimensional Gaussian we have:

⟨ΔX²ⁿ⟩ = (2n − 1)! (V[X])ⁿ / (2ⁿ⁻¹ (n − 1)!)
⟨ΔX²ⁿ⁻¹⟩ = 0.
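The coefficient (2n − 1)!/(2ⁿ⁻¹(n − 1)!) is the double factorial (2n − 1)!! = 1, 3, 15, 105, …, which is the usual statement of Wick's theorem for Gaussian moments. A minimal sketch (not from the slides) checking this identity:

```python
import math

# Coefficient of (V[X])^n in ⟨ΔX^{2n}⟩ for a Gaussian:
# (2n-1)! / (2^{n-1}·(n-1)!), which should equal (2n-1)!!.
def gaussian_coeff(n):
    return math.factorial(2 * n - 1) // (2 ** (n - 1) * math.factorial(n - 1))

def double_factorial(k):
    # k·(k-2)·(k-4)·… down to 1
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

coeffs = [gaussian_coeff(n) for n in range(1, 6)]
```

So ⟨ΔX²⟩ = V[X], ⟨ΔX⁴⟩ = 3 V[X]², ⟨ΔX⁶⟩ = 15 V[X]³, and so on.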