
Kirk20130318 Book Club StochProcesses


Transcript

Stochastic Processes for Physicists: Understanding Noisy Systems

Chapter 1: A review of probability theory

Paul Kirk, Division of Molecular Biosciences, Imperial College London

    19/03/2013


    1.1 Random variables and mutually exclusive events

    Random variables

Suppose we do not know the precise value of a variable, but may have an idea of the relative likelihood that it will have one of a number of possible values.

Let us call the unknown quantity X.

This quantity is referred to as a random variable.

    Probability

A 6-sided die. Let X be the value we get when we roll the die.

Describe the likelihood that X will have each of the values 1, . . . , 6 by a number between 0 and 1: the probability.

If Prob(X = 3) = 1, then we will always get a 3.

If Prob(X = 3) = 2/3, then we expect to get a 3 about two-thirds of the time.


    Mutually exclusive events

The various values of X are an example of mutually exclusive events.

X can take precisely one of the values between 1 and 6.

Mutually exclusive probabilities sum

Prob(X = 3 or X = 4) = Prob(X = 3) + Prob(X = 4).
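A quick numerical sketch (an addition for the book club, not from the slides): simulating rolls of a fair die with NumPy shows the additivity of mutually exclusive events. The seed and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=100_000)  # fair 6-sided die: values 1..6

p3 = np.mean(rolls == 3)                  # estimate of Prob(X = 3)
p4 = np.mean(rolls == 4)                  # estimate of Prob(X = 4)
p3_or_4 = np.mean((rolls == 3) | (rolls == 4))

# Mutually exclusive events: Prob(X = 3 or X = 4) = Prob(X = 3) + Prob(X = 4)
print(p3, p4, p3 + p4, p3_or_4)           # ≈ 1/6, 1/6, 1/3, 1/3
```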


Note: in mathematics texts it is customary to denote the unknown quantity using a capital letter, say X, and a variable that specifies one of the possible values that X may have as the equivalent lower-case letter, x. We will use this convention in this chapter, but in the following chapters we will use a lower-case letter for both the unknown quantity and the values it can take, since it causes no confusion.

So, rather than writing Prob(X = 3) or Prob(X = x), we will (in later chapters) tend to write Prob(3) or Prob(x).

Warning: may cause confusion.


    Continuous random variables

For continuous random variables, the probability for X to be within a range is found by integrating the probability density function:

$$\mathrm{Prob}(a < X \leq b) = \int_a^b P(x)\,dx.$$
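To make this concrete, here is a minimal sketch (not from the slides) using a standard normal density as an arbitrary example: quadrature of the pdf over (a, b] agrees with the fraction of samples landing there.

```python
import numpy as np
from scipy import integrate, stats

a, b = 0.5, 2.0
pdf = stats.norm(0, 1).pdf               # example density P(x)

# Prob(a < X <= b) by integrating the density ...
prob_quad, _ = integrate.quad(pdf, a, b)

# ... and by the fraction of samples falling in (a, b]
rng = np.random.default_rng(1)
x = rng.standard_normal(1_000_000)
prob_mc = np.mean((x > a) & (x <= b))

print(prob_quad, prob_mc)                # both ≈ 0.2858
```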


    Expectation

The expectation of an arbitrary function, f(X), with respect to the probability density function P(X) is

$$\langle f(X) \rangle_{P(X)} = \int P(x)\,f(x)\,dx.$$

The mean or expected value of X is $\langle X \rangle$.
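As an illustration (my addition, not in the slides), the expectation is just a density-weighted integral; with f = cos and a standard normal P(x), both arbitrary choices, the quadrature reproduces the known value $e^{-1/2}$.

```python
import numpy as np
from scipy import integrate, stats

f = np.cos                               # an arbitrary function f
pdf = stats.norm(0, 1).pdf               # an arbitrary density P(x)

# <f(X)> = ∫ P(x) f(x) dx
expectation, _ = integrate.quad(lambda x: pdf(x) * f(x), -np.inf, np.inf)
print(expectation, np.exp(-0.5))         # E[cos X] = exp(-1/2) for X ~ N(0, 1)
```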


    Variance

The variance of X is the expectation of the squared difference from the mean:

$$
\begin{aligned}
V[X] &= \int P(x)\,(x - \langle X \rangle)^2\,dx \\
&= \int P(x)\,(x^2 + \langle X \rangle^2 - 2x\langle X \rangle)\,dx \\
&= \int P(x)\,x^2\,dx + \langle X \rangle^2 \int P(x)\,dx - 2\langle X \rangle \int P(x)\,x\,dx \\
&= \langle X^2 \rangle + \langle X \rangle^2 - 2\langle X \rangle^2 \\
&= \langle X^2 \rangle - \langle X \rangle^2.
\end{aligned}
$$
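A sampling check of this identity (added, not from the slides); the exponential distribution is an arbitrary choice whose true variance is $3^2 = 9$.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=3.0, size=1_000_000)   # any distribution will do

lhs = np.mean((x - x.mean()) ** 2)               # <(X - <X>)^2>
rhs = np.mean(x ** 2) - x.mean() ** 2            # <X^2> - <X>^2
print(lhs, rhs)                                  # both ≈ 9.0, the true variance
```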


    1.2 Independence

    Independence

    Independent probabilities multiply

For independent variables, $P_{X,Y}(x, y) = P_X(x)\,P_Y(y)$.

For independent variables, $\langle XY \rangle = \langle X \rangle \langle Y \rangle$.
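A sketch of the factorisation property (my addition), with independent draws from two arbitrary distributions:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
x = rng.normal(1.0, 2.0, size=n)         # <X> = 1
y = rng.uniform(0.0, 4.0, size=n)        # <Y> = 2, drawn independently of X

print(np.mean(x * y), np.mean(x) * np.mean(y))   # <XY> ≈ <X><Y> ≈ 2.0
```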


    1.3 Dependent random variables

    Dependence

If X and Y are dependent, then $P_{X,Y}(x, y)$ does not factor as the product of $P_X(x)$ and $P_Y(y)$.

If we know $P_{X,Y}(x, y)$ and want to know $P_X(x)$, then it is obtained by integrating out (or marginalising) the other variable:

$$P_X(x) = \int P_{X,Y}(x, y)\,dy.$$


    Conditional probability densities

The probability density for X given that we know that Y = y is written P(X = x | Y = y) or P(x | y) and is referred to as the conditional probability density for X given Y:

$$P_{X|Y}(X = x \mid Y = y) = \frac{P_{X,Y}(X = x, Y = y)}{P_Y(Y = y)}.$$

Explanation: To see how to calculate this conditional probability, we note first that P(x, y) with y = a gives the relative probability for different values of x given that Y = a. To obtain the conditional probability density for X given that Y = a, all we have to do is divide P(x, a) by its integral over all values of x.
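The following sketch (my addition, not from the slides) discretises an arbitrary correlated bivariate Gaussian on a grid and carries out both operations of this section: marginalising over y, and conditioning on Y = a by normalising the slice P(x, a).

```python
import numpy as np
from scipy import stats

# Discretise a correlated bivariate Gaussian (illustrative choice) on a grid
xs = np.linspace(-4, 4, 401)
ys = np.linspace(-4, 4, 401)
dx, dy = xs[1] - xs[0], ys[1] - ys[0]
X, Y = np.meshgrid(xs, ys, indexing="ij")
joint = stats.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]]).pdf(
    np.dstack([X, Y]))

# Marginalise: P_X(x) = ∫ P_{X,Y}(x, y) dy
marginal_x = joint.sum(axis=1) * dy

# Condition on Y = a: take the slice P(x, a) and divide by its integral over x
a_idx = np.searchsorted(ys, 1.0)
slice_xa = joint[:, a_idx]
conditional = slice_xa / (slice_xa.sum() * dx)

print(np.trapz(marginal_x, xs))          # ≈ 1: the marginal is normalised
print(np.trapz(conditional, xs))         # ≈ 1: so is the conditional
```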


    1.4 Correlations and correlation coefficients

The covariance of X and Y is:

$$\mathrm{cov}(X, Y) = \langle (X - \langle X \rangle)(Y - \langle Y \rangle) \rangle = \langle XY \rangle - \langle X \rangle \langle Y \rangle.$$

Idea:

1. How can we define what it means for a value x to be bigger than usual? Well, we can see if $x > \langle X \rangle$, i.e. if $x - \langle X \rangle > 0$.

2. Similarly, we can say that a value x is smaller than usual if $x < \langle X \rangle$, i.e. if $x - \langle X \rangle < 0$.

3. The covariance averages the product $(x - \langle X \rangle)(y - \langle Y \rangle)$, so it is positive when X and Y tend to be bigger (or smaller) than usual together, and negative when one tends to be bigger while the other is smaller.

The correlation is just a normalised version of the covariance, which takes values in the range $-1$ to $1$:

$$C_{XY} = \frac{\langle XY \rangle - \langle X \rangle \langle Y \rangle}{\sqrt{V[X]\,V[Y]}}.$$
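A numerical sketch (added, not from the slides): Y is built to depend on X, and the estimate from the covariance formula matches NumPy's own correlation coefficient.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000
x = rng.standard_normal(n)
y = 0.7 * x + 0.3 * rng.standard_normal(n)       # Y depends on X by construction

cov = np.mean(x * y) - np.mean(x) * np.mean(y)   # <XY> - <X><Y>
corr = cov / np.sqrt(x.var() * y.var())          # normalised to [-1, 1]
print(cov, corr)                                 # ≈ 0.70, ≈ 0.92
print(np.corrcoef(x, y)[0, 1])                   # NumPy's estimate agrees
```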



    1.5 Adding random variables together

When we have two continuous random variables, X and Y, with probability densities $P_X$ and $P_Y$, it is often useful to be able to calculate the probability density of the random variable whose value is the sum of them: Z = X + Y. It turns out that the probability density for Z is given by

$$P_Z(z) = \int P_{X,Y}(z - s, s)\,ds = \int P_{X,Y}(s, z - s)\,ds.$$

If X and Y are independent, this becomes the convolution:

$$P_Z(z) = \int P_X(z - s)\,P_Y(s)\,ds = P_X * P_Y.$$
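A sketch of the convolution result (my addition): summing normal and uniform samples, the histogram of Z matches the numerically convolved densities. The grid, bin counts, and distributions are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 1_000_000
x = rng.normal(0.0, 1.0, size=n)
y = rng.uniform(-1.0, 1.0, size=n)               # independent of X
z = x + y

# Convolve the two densities on a symmetric grid: P_Z = P_X * P_Y
grid = np.linspace(-8, 8, 1601)
ds = grid[1] - grid[0]
pz = np.convolve(stats.norm(0, 1).pdf(grid),
                 stats.uniform(-1, 2).pdf(grid),  # uniform on [-1, 1]
                 mode="same") * ds                # "same" keeps z aligned to grid

# Compare with a histogram of the sampled sums
hist, edges = np.histogram(z, bins=100, range=(-4, 4), density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - np.interp(centres, grid, pz))))  # small discrepancy
```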


If $X_1$ and $X_2$ are random variables and $X = X_1 + X_2$, then

$$\langle X \rangle = \langle X_1 \rangle + \langle X_2 \rangle,$$

and if $X_1$ and $X_2$ are independent, then

$$V[X] = V[X_1] + V[X_2].$$

Mysterious (?) assertion

Averaging the results of a number of independent measurements produces a more accurate result. This is because the variances of the different measurements add together. Does this make sense?


    Explanation

Assume all measurements have expectation $\mu$ and variance $\sigma^2$. By the independence assumption, the variance of the average is:

$$V\!\left[\frac{1}{N}\sum_{n=1}^{N} X_n\right] = \sum_{n=1}^{N} V\!\left[\frac{X_n}{N}\right].$$

Moreover,

$$V\!\left[\frac{X_n}{N}\right] = E\!\left[\frac{X_n^2}{N^2}\right] - E\!\left[\frac{X_n}{N}\right]^2 = \frac{E[X_n^2] - (E[X_n])^2}{N^2} = \frac{V[X_n]}{N^2}.$$

So,

$$V\!\left[\frac{1}{N}\sum_{n=1}^{N} X_n\right] = \sum_{n=1}^{N} \frac{V[X_n]}{N^2} = \frac{1}{N^2}\sum_{n=1}^{N} \sigma^2 = \frac{\sigma^2}{N}.$$
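The $\sigma^2/N$ result is easy to see empirically; a minimal sketch (added, not in the slides), with arbitrary $\sigma^2$, N, and number of repetitions:

```python
import numpy as np

rng = np.random.default_rng(6)
sigma2, N, trials = 4.0, 25, 200_000

# Each row holds N independent measurements with variance sigma2
samples = rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))
averages = samples.mean(axis=1)

print(averages.var(), sigma2 / N)   # both ≈ 0.16: averaging shrinks the variance
```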


    1.6 Transformations of a random variable

Key assertion: If Y = g(X), then:

$$\langle f(Y) \rangle = \int_{x=a}^{x=b} P_X(x)\,f(g(x))\,dx = \int_{y=g(a)}^{y=g(b)} P_Y(y)\,f(y)\,dy.$$

Given this assumption, everything else falls out automatically:

$$\langle f(Y) \rangle = \int_{x=a}^{x=b} P_X(x)\,f(g(x))\,dx = \int_{y=g(a)}^{y=g(b)} P_X(g^{-1}(y))\,f(y)\,\frac{dx}{dy}\,dy = \int_{y=g(a)}^{y=g(b)} \frac{P_X(g^{-1}(y))}{g'(g^{-1}(y))}\,f(y)\,dy.$$

General result (for invertible g):

$$P_Y(y) = \frac{P_X(g^{-1}(y))}{|g'(g^{-1}(y))|}.$$
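A check of the change-of-variables formula (my addition): with $g(x) = e^x$ and a standard normal X, both arbitrary choices, the formula gives $P_Y(y) = P_X(\ln y)/y$ (the lognormal density), which matches a histogram of $e^X$ samples.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.standard_normal(1_000_000)
y = np.exp(x)                            # Y = g(X), with g invertible

# P_Y(y) = P_X(g^{-1}(y)) / |g'(g^{-1}(y))|: here g^{-1}(y) = ln y, g'(x) = e^x
ys = np.linspace(0.05, 5.0, 200)
py = stats.norm(0, 1).pdf(np.log(ys)) / ys

hist, edges = np.histogram(y, bins=200, range=(0.05, 5.0), density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - np.interp(centres, ys, py))))    # small discrepancy
```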


    1.7 The distribution function

The probability distribution function, which we will call D(x), of a random variable X is defined as the probability that X is less than or equal to x. Thus

$$D(x) = \mathrm{Prob}(X \leq x) = \int_{-\infty}^{x} P(z)\,dz.$$

In addition, the fundamental theorem of calculus tells us that

$$P(x) = \frac{d}{dx} D(x).$$
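Both relations are easy to verify numerically; a sketch (added, not in the slides) using SciPy's normal distribution as the example, with an arbitrary evaluation point and finite-difference step:

```python
import numpy as np
from scipy import integrate, stats

dist = stats.norm(0, 1)                  # example density
x0, h = 0.7, 1e-5

# D(x0) = ∫_{-inf}^{x0} P(z) dz, compared with the closed-form CDF
D, _ = integrate.quad(dist.pdf, -np.inf, x0)
print(D, dist.cdf(x0))

# P(x0) = dD/dx at x0, checked with a central finite difference
print((dist.cdf(x0 + h) - dist.cdf(x0 - h)) / (2 * h), dist.pdf(x0))
```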


    1.8 The characteristic function

The characteristic function is defined as the Fourier transform of the probability density:

$$\phi(s) = \int P(x)\,\exp(isx)\,dx.$$

The inverse transform gives:

$$P(x) = \frac{1}{2\pi} \int \phi(s)\,\exp(-isx)\,ds.$$

The Fourier transform of the convolution of two functions, P(x) and Q(x), is the product of their Fourier transforms, $\tilde{P}(s)$ and $\tilde{Q}(s)$.

For discrete random variables, the characteristic function is a sum. In general (for both discrete and continuous r.v.s), we have:

$$\phi(s) = \langle \exp(isX) \rangle_{P(X)}.$$
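A numerical sketch (my addition): computing $\phi(s)$ by quadrature for a standard normal and comparing with the known closed form $\exp(-s^2/2)$.

```python
import numpy as np
from scipy import integrate, stats

pdf = stats.norm(0, 1).pdf               # example density

def phi(s):
    """phi(s) = ∫ P(x) exp(isx) dx, split into real and imaginary parts."""
    re, _ = integrate.quad(lambda x: pdf(x) * np.cos(s * x), -np.inf, np.inf)
    im, _ = integrate.quad(lambda x: pdf(x) * np.sin(s * x), -np.inf, np.inf)
    return re + 1j * im

for s in (0.0, 0.5, 1.0):
    print(phi(s), np.exp(-s**2 / 2))     # for N(0, 1), phi(s) = exp(-s^2/2)
```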


    1.9 Moments and cumulants

Moment generating function (departure from the book)

The moment generating function is defined as:

$$M(t) = \langle \exp(tX) \rangle,$$

where X is a random variable, and the expectation is with respect to some density P(X), so that

$$M(t) = \int \exp(tx)\,P(x)\,dx = \int \left(1 + tx + \frac{1}{2!}t^2x^2 + \ldots\right) P(x)\,dx = 1 + t m_1 + \frac{1}{2!}t^2 m_2 + \ldots + \frac{1}{r!}t^r m_r + \ldots,$$

where $m_r = \langle X^r \rangle$ is the r-th (raw) moment.


Moment generating function (continued)

$$M(t) = 1 + t m_1 + \frac{1}{2!}t^2 m_2 + \ldots + \frac{1}{r!}t^r m_r + \ldots$$

It follows from the above expansion that:

$$M(0) = 1,\quad M'(0) = m_1,\quad M''(0) = m_2,\quad \ldots,\quad M^{(r)}(0) = m_r.$$
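A symbolic check (added, not from the slides), assuming SymPy is available, with an exponential density of rate 2 as the arbitrary example: the derivatives of M at 0 reproduce the raw moments.

```python
import sympy as sp

t, x = sp.symbols("t x", real=True)
lam = sp.Integer(2)                      # exponential rate: arbitrary choice
pdf = lam * sp.exp(-lam * x)             # P(x) on [0, oo)

# M(t) = ∫ exp(tx) P(x) dx; converges for t < lam, which we assume here
M = sp.simplify(sp.integrate(sp.exp(t * x) * pdf, (x, 0, sp.oo), conds="none"))

# M^{(r)}(0) should equal m_r = <X^r>
for r in range(1, 4):
    deriv = sp.diff(M, t, r).subs(t, 0)
    m_r = sp.integrate(x**r * pdf, (x, 0, sp.oo))
    print(r, sp.simplify(deriv), sp.simplify(m_r))   # 1/2, 1/2, 3/4
```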


Cumulant generating function (continued departure from book)

The log of the moment generating function is called the cumulant generating function, $R(t) = \ln(M(t))$.

By the chain rule of differentiation, we can write down the derivatives of R(t) in terms of the derivatives of M(t), e.g.

$$R'(t) = \frac{M'(t)}{M(t)},\qquad R''(t) = \frac{M''(t)\,M(t) - (M'(t))^2}{(M(t))^2}.$$

Note that $R(0) = 0$, $R'(0) = M'(0) = m_1 = \mu$, $R''(0) = M''(0) - (M'(0))^2 = m_2 - m_1^2 = \sigma^2$, . . . These are the cumulants.
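A symbolic sketch (my addition), using the Gaussian MGF $M(t) = \exp(\mu t + \sigma^2 t^2/2)$ as the example: the derivatives of $R = \ln M$ at 0 give exactly $\mu$ and $\sigma^2$, and the third cumulant vanishes.

```python
import sympy as sp

t = sp.symbols("t", real=True)
mu, sigma = sp.symbols("mu sigma", positive=True)

M = sp.exp(mu * t + sigma**2 * t**2 / 2)     # Gaussian moment generating function
R = sp.log(M)                                # cumulant generating function

print(sp.simplify(R.subs(t, 0)))                       # 0
print(sp.simplify(sp.diff(R, t).subs(t, 0)))           # mu: first cumulant
print(sp.simplify(sp.diff(R, t, 2).subs(t, 0)))        # sigma**2: second cumulant
print(sp.simplify(sp.diff(R, t, 3).subs(t, 0)))        # 0: higher cumulants vanish
```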


The moments can be calculated from the derivatives of the characteristic function, evaluated at s = 0. We can see this by expanding the characteristic function as a Taylor series:

$$\phi(s) = \sum_{n=0}^{\infty} \phi^{(n)}(0)\,\frac{s^n}{n!},$$

where $\phi^{(n)}(s)$ is the n-th derivative of $\phi(s)$. But we also have:

$$\phi(s) = \langle e^{isX} \rangle = \sum_{n=0}^{\infty} \frac{\langle (isX)^n \rangle}{n!} = \sum_{n=0}^{\infty} i^n \langle X^n \rangle\,\frac{s^n}{n!}.$$

Equating the two expressions, we get: $\langle X^n \rangle = \phi^{(n)}(0)/i^n$.


    Cumulants

The n-th order cumulant of X is (up to a factor of $i^n$) the n-th derivative of the log of the characteristic function, evaluated at s = 0.

For independent random variables, X and Y, if Z = X + Y then the n-th cumulant of Z is the sum of the n-th cumulants of X and Y.

The Gaussian distribution is also the only absolutely continuous distribution all of whose cumulants beyond the first two (i.e. other than the mean and variance) are zero.
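A numerical check of cumulant additivity (added, not from the slides): the first three cumulants are the mean, the variance, and the third central moment, so they can be estimated directly from samples. The distributions below are arbitrary; the exponential is skewed, so its third cumulant is non-zero.

```python
import numpy as np

def first_three_cumulants(samples):
    """Mean, variance, and third central moment of a sample."""
    d = samples - samples.mean()
    return samples.mean(), np.mean(d**2), np.mean(d**3)

rng = np.random.default_rng(8)
n = 2_000_000
x = rng.exponential(1.0, n)              # skewed: non-zero third cumulant
y = rng.uniform(-1.0, 1.0, n)            # independent of X
z = x + y

for kx, ky, kz in zip(first_three_cumulants(x),
                      first_three_cumulants(y),
                      first_three_cumulants(z)):
    print(kx + ky, kz)                   # each cumulant of Z ≈ sum for X and Y
```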


    1.10 The multivariate Gaussian


Let $\mathbf{x} = [x_1, \ldots, x_N]^{\mathsf{T}}$; then the general form of the Gaussian pdf is:

$$P(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^N \det(\Sigma)}} \exp\!\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^{\mathsf{T}} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right),$$

where $\boldsymbol{\mu}$ is the mean vector and $\Sigma$ is the covariance matrix.

All higher moments of a Gaussian can be written in terms of the means and covariances. Defining $\Delta X \equiv X - \langle X \rangle$, for a 1-dimensional Gaussian we have:

$$\langle \Delta X^{2n} \rangle = \frac{(2n-1)!\,(V[X])^n}{2^{n-1}(n-1)!},\qquad \langle \Delta X^{2n-1} \rangle = 0.$$
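A sampling check of the 1-dimensional moment formula (my addition), with an arbitrary variance; for n = 2 it reduces to the familiar $\langle \Delta X^4 \rangle = 3\sigma^4$.

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(9)
sigma2 = 2.5
dx = rng.normal(0.0, np.sqrt(sigma2), size=4_000_000)   # ΔX = X - <X> directly

for n in (1, 2, 3):
    predicted = factorial(2*n - 1) * sigma2**n / (2**(n - 1) * factorial(n - 1))
    print(np.mean(dx**(2*n)), predicted)                 # even moments match
    print(np.mean(dx**(2*n - 1)))                        # odd moments ≈ 0
```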
