UAI MCMC Tutorial

Transcript
Slide 1

Inference on Relational Models Using Markov Chain Monte Carlo

    Brian Milch

    Massachusetts Institute of Technology

    UAI Tutorial

    July 19, 2007

Slide 2

Example 1: Bibliographies

Two citation strings referring to the same book:

    S. Russel and P. Norvig (1995). Artificial Intelligence: A Modern Approach. Upper Saddle River, NJ: Prentice Hall.

    Russell, Stuart and Norvig, Peter. Artificial Intelligence. Prentice-Hall, 1995.

Underlying entities: authors Stuart Russell and Peter Norvig; paper "Artificial Intelligence: A Modern Approach"

Slide 3

Example 2: Aircraft Tracking

[Figure: radar blips observed at times t = 1, t = 2, t = 3, with positions such as (1.9, 6.1, 2.2), (0.6, 5.9, 3.2), (1.8, 7.4, 2.3), (0.9, 5.8, 3.1), (1.9, 9.0, 2.1), (0.7, 5.1, 3.2).]

Slide 4

Inference on Relational Structures

[Figure: six competing hypotheses about the authors and papers underlying a set of citations (e.g., authors "Russell" and "Roberts" vs. "Russell" and "Norvig"; papers "AI: A Modern...", "Advance...", "Hamlet", "Tempest", and Seuss titles), with probabilities 1.2 x 10^-12, 2.3 x 10^-12, 4.5 x 10^-14, 6.7 x 10^-16, 8.9 x 10^-16, 5.0 x 10^-20.]

Slide 5

Markov Chain Monte Carlo (MCMC)

Markov chain s1, s2, ... over worlds where evidence E is true

Approximate P(Q | E) as fraction of s1, s2, ... that satisfy query Q

[Figure: the set of worlds satisfying E, with the query region Q inside it.]

Slide 6

Outline

    Probabilistic models for relational structures
        Modeling the number of objects
        Three mistakes that are easy to make

    Markov chain Monte Carlo (MCMC)
        Gibbs sampling
        Metropolis-Hastings
        MCMC over events

    Case studies
        Citation matching
        Multi-target tracking

Slide 7

Simple Example: Clustering

[Figure: bird wingspan measurements (cm) along an axis from 10 to 100, forming three clusters around θ = 22, θ = 49, and θ = 80.]

Slide 8

Simple Bayesian Mixture Model

Number of latent objects is known to be k

For each latent object i, have parameter θi ~ Uniform[0, 100]

For each data point j, have object selector Cj ~ Uniform({1, ..., k}) and observable value Xj ~ N(θ_Cj, 25)
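A minimal forward-sampling sketch of this generative process, assuming Python with NumPy (the values k = 3 and n = 100 are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_mixture(k=3, n=100):
        """Forward-sample the simple Bayesian mixture model above."""
        theta = rng.uniform(0, 100, size=k)   # theta_i ~ Uniform[0, 100]
        c = rng.integers(0, k, size=n)        # C_j ~ Uniform({1, ..., k}), 0-based here
        x = rng.normal(theta[c], 5.0)         # X_j ~ N(theta_{C_j}, 25), i.e. sigma = 5
        return theta, c, x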

Slide 9

BN for Mixture Model

[Figure: Bayesian network with parameter nodes θ1, θ2, ..., θk, selector nodes C1, C2, C3, ..., Cn, and observation nodes X1, X2, X3, ..., Xn; each Xj has its selector Cj and all the θ's as parents.]

Slide 10

Context-Specific Dependencies

[Figure: the same network with selector values filled in (e.g., C1 = 2, C2 = 1, C3 = 2); given its selector's value, each Xj depends on only the single parameter θ_Cj.]

Slide 11

Extensions to Mixture Model

Random number of latent objects k, with distribution p(k) such as:

    Uniform({1, ..., 100})
    Geometric(0.1)
    Poisson(10)    (k unbounded!)

Random distribution π for selecting objects:

    π | k ~ Dirichlet(α1, ..., αk)    (Dirichlet: distribution over probability vectors)
    Still symmetric: each αi = α/k
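A sketch of sampling from this extended prior, again assuming NumPy (the choice α = 1 is illustrative, and k may come out 0, i.e., no latent objects):

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_extended_prior(alpha=1.0):
        """Sample k and the selector distribution pi under the extended prior."""
        k = rng.poisson(10)                            # k ~ Poisson(10); k may be 0
        if k == 0:
            return k, np.array([])
        pi = rng.dirichlet(np.full(k, alpha / k))      # pi | k ~ Dirichlet(alpha/k, ..., alpha/k)
        return k, pi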

Slide 12

    Existence versus Observation

A latent object can exist even if no observations correspond to it

    Bird species may not be observed yet

    Aircraft may fly over without yielding any blips

    Two questions:

    How many objects correspond to observations?

    How many objects are there in total?

Observed 3 species, each 100 times: probably no more

    Observed 200 species, each 1 or 2 times: probably more exist

Slide 13

    Expecting Additional Objects

P(ever observe a new species | seen r so far) is bounded by P(k > r)

So as the number of species observed → ∞, the probability of ever seeing more → 0

What if we don't want this?

[Figure: r observed species; will we observe more later?]

Slide 14

    Dirichlet Process Mixtures

    Set k = g, letT be infinite-dimensionalprobabilityvector with stick-breaking prior

    Another view: Define prior directly on partitions ofdata points, allowing unbounded number of blocks

    Drawback: Cant ask about number ofunobservedlatent objects (always infinite)

    T1 T2 T3 T4 T5

    [Ferguson 1983; Sethuraman 1994]

    [tutorials: Jordan 2005; Sudderth 2006]

Slide 15

Outline (section marker: Three mistakes that are easy to make)

Slide 16

Mistake 1: Ignoring Interchangeability

Which birds are in species S1? Latent object indices are interchangeable:

    Posterior on selector variable C_B1 is uniform

    Posterior on θ_S1 has a peak for each cluster of birds

What we really care about is the partition of observations

A partition with r blocks corresponds to k!/(k-r)! instantiations of the Cj variables

[Figure: birds B1, B3, B2, B5, B4 clustered as the partition {{1, 3}, {2}, {4, 5}}, which corresponds to instantiations (1, 2, 1, 3, 3), (1, 2, 1, 4, 4), (1, 4, 1, 3, 3), (2, 1, 2, 3, 3), ...]

Slide 17

Ignoring Interchangeability, Cont'd

Say k = 4. What's the prior probability that B1, B3 are in one species, and B2 in another?

Multiplying probabilities for C_B1, C_B2, C_B3 gives (1/4) x (1/4) x (1/4)

Not enough! The partition {{B1, B3}, {B2}} corresponds to 12 instantiations of the C's:

    (S1, S2, S1), (S1, S3, S1), (S1, S4, S1), (S2, S1, S2), (S2, S3, S2), (S2, S4, S2),
    (S3, S1, S3), (S3, S2, S3), (S3, S4, S3), (S4, S1, S4), (S4, S2, S4), (S4, S3, S4)

In general, a partition with r blocks corresponds to kPr = k!/(k-r)! instantiations

Slide 18

Mistake 2: Underestimating the Bayesian Ockham's Razor Effect

Say k = 4. Are B1 and B2 in the same species?

Maximum-likelihood estimation would yield one species with θ = 50 and another with θ = 52

But a Bayesian model trades off likelihood against the prior probability of getting those θ values

[Figure: wingspan axis (cm) from 10 to 100, with observations X_B1 = 50 and X_B2 = 52.]

Slide 19

Bayesian Ockham's Razor

Data: X_B1 = 50, X_B2 = 52, with prior density p(θ) = 0.01 (Uniform[0, 100]). With k = 4, the prior probability that C_B1 = C_B2 is 1/4, and that C_B1 ≠ C_B2 is 3/4.

H1: Partition is {{B1, B2}}

    p(H1, data) = (1/4) ∫_0^100 0.01 · N(50; θ, 5^2) · N(52; θ, 5^2) dθ ≈ 1.3 x 10^-4

H2: Partition is {{B1}, {B2}}

    p(H2, data) = (3/4) [∫_0^100 0.01 · N(50; θ1, 5^2) dθ1] [∫_0^100 0.01 · N(52; θ2, 5^2) dθ2] ≈ 7.5 x 10^-5

Don't use more latent objects than necessary to explain your data

    [MacKay 1992]
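These two marginal likelihoods can be checked directly; a sketch assuming SciPy for the densities and the quadrature:

    import numpy as np
    from scipy import integrate, stats

    prior = 0.01                                        # density of Uniform[0, 100]
    lik = lambda x, t: stats.norm.pdf(x, loc=t, scale=5.0)

    # H1: one shared theta -- integrate the joint likelihood over theta
    h1 = 0.25 * integrate.quad(lambda t: prior * lik(50, t) * lik(52, t), 0, 100)[0]

    # H2: two independent thetas -- the double integral factors into two single ones
    h2 = 0.75 * np.prod([integrate.quad(lambda t: prior * lik(x, t), 0, 100)[0]
                         for x in (50, 52)])

    print(h1, h2)                                       # approx 1.3e-4 and 7.5e-5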

Slide 20

Mistake 3: Comparing Densities Across Dimensions

Wingspan (cm): X_B1 = 50, X_B2 = 52

H1: Partition is {{B1, B2}}, θ = 51

    p(H1, data) = (1/4) · 0.01 · N(50; 51, 5^2) · N(52; 51, 5^2) ≈ 1.5 x 10^-5

H2: Partition is {{B1}, {B2}}, θ_B1 = 50, θ_B2 = 52

    p(H2, data) = (3/4) · 0.01^2 · N(50; 50, 5^2) · N(52; 52, 5^2) ≈ 4.8 x 10^-7

H1 wins by a greater margin

Slide 21

What If We Change the Units?

Wingspan (m): X_B1 = 0.50, X_B2 = 0.52; the prior is now Uniform(0, 1), whose density is 1!

H1: Partition is {{B1, B2}}, θ = 0.51

    p(H1, data) = (1/4) · 1 · N(0.50; 0.51, 0.05^2) · N(0.52; 0.51, 0.05^2) ≈ 15

H2: Partition is {{B1}, {B2}}, θ_B1 = 0.50, θ_B2 = 0.52

    p(H2, data) = (3/4) · 1^2 · N(0.50; 0.50, 0.05^2) · N(0.52; 0.52, 0.05^2) ≈ 48

Now H2 wins by a landslide
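The same point-estimate comparison in code, run in both unit systems (a sketch assuming SciPy):

    from scipy import stats

    def point_densities(x1, x2, prior_density, sigma):
        """Joint density at the point estimates for H1 (shared theta) and H2 (separate thetas)."""
        mid = (x1 + x2) / 2
        h1 = 0.25 * prior_density * stats.norm.pdf(x1, mid, sigma) * stats.norm.pdf(x2, mid, sigma)
        h2 = 0.75 * prior_density**2 * stats.norm.pdf(x1, x1, sigma) * stats.norm.pdf(x2, x2, sigma)
        return h1, h2

    print(point_densities(50, 52, 0.01, 5.0))      # cm: approx (1.5e-5, 4.8e-7), H1 ahead
    print(point_densities(0.50, 0.52, 1.0, 0.05))  # m:  approx (15, 48), now H2 ahead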

Slide 22

Lesson: Comparing Densities Across Dimensions

Densities don't behave like probabilities (e.g., they can be greater than 1)

Heights of density peaks in spaces of different dimension are not comparable

Work-arounds:

    Find the most likely partition first, then the most likely parameters given that partition

    Find the region in parameter space where most of the posterior probability mass lies

Slide 23

Outline (section marker: Markov chain Monte Carlo, Gibbs sampling)

Slide 24

Why Not Exact Inference?

The number of possible partitions is superexponential in n

Variable elimination?

    Summing out θi couples all the Cj's

    Summing out Cj couples all the θi's

[Figure: the mixture-model BN: θ1, ..., θk; C1, ..., Cn; X1, ..., Xn.]

Slide 25

Markov Chain Monte Carlo (MCMC)

Start in an arbitrary state (possible world) s1 satisfying evidence E

Sample s2, s3, ... according to a transition kernel T(si, si+1), yielding a Markov chain

Approximate p(Q | E) by the fraction of s1, s2, ..., sL that are in Q

[Figure: the chain of samples wandering inside E, some falling in the query region Q.]
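The estimator itself is just a running fraction; a sketch in Python, where next_state stands in for whatever transition kernel is used and in_query for the test against Q (both hypothetical placeholders):

    def mcmc_estimate(s, next_state, in_query, L=100000):
        """Estimate p(Q | E) as the fraction of visited states satisfying Q."""
        hits = 0
        for _ in range(L):
            s = next_state(s)          # one application of the transition kernel T
            hits += bool(in_query(s))
        return hits / L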

Slide 26

Why a Markov Chain?

Why use a Markov chain rather than sampling independently?

    It is a stochastic local search for high-probability s

    Once we find such an s, we explore around it

Slide 27

Convergence

A stationary distribution π is one such that:

    π(s') = Σ_s π(s) T(s, s')

If the chain is ergodic (can get to anywhere from anywhere*), then:

    It has a unique stationary distribution π

    The fraction of s1, s2, ..., sL in Q converges to π(Q) as L → ∞

We'll design T so that π(s) = p(s | E)

* and it's aperiodic
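For a toy two-state chain, stationarity can be checked numerically, assuming NumPy: the stationary π is a left eigenvector of T with eigenvalue 1.

    import numpy as np

    T = np.array([[0.9, 0.1],
                  [0.3, 0.7]])                   # row-stochastic transition kernel

    evals, evecs = np.linalg.eig(T.T)            # left eigenvectors of T
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1))])
    pi /= pi.sum()

    print(pi)                                    # [0.75 0.25]
    print(np.allclose(pi @ T, pi))               # True: pi(s') = sum_s pi(s) T(s, s')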

Slide 28

    Gibbs Sampling

    Order non-evidence variables V1,V2,...,Vm

    Given state s, sample from T as follows:

    Let sd = s For i = 1 to m

    Sample vid from p(Vi | sd-i)

    Let sd = (sd-i, Vi= vid)

    Return sd

    Theorem: stationary distribution is p(s | E)

    [Geman & Geman 1984]

    Conditional for Vigiven

    other vars in sd

Slide 29

Gibbs on a Bayesian Network

The conditional for V depends only on the factors that contain V

So condition on V's Markov blanket mb(V): parents, children, and co-parents:

    p(v | s_-V) ∝ p(v | s[Pa(V)]) ∏_{Y ∈ ch(V)} p(s[Y] | v, s[Pa(Y) \ V])

Slide 30

Gibbs on Bayesian Mixture Model

Given the current state s:

    Resample each θi given the prior and {Xj : Cj = i in s}    (a context-specific Markov blanket)

    Resample each Cj given Xj and θ_1:k

[Figure: the mixture-model BN again.]

[Neal 2000]
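A sketch of one such Gibbs sweep for the wingspan mixture, assuming NumPy and SciPy. With the Uniform[0, 100] prior and N(θ, 25) likelihood, the conditional for each θi is a Normal truncated to [0, 100]:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    def gibbs_sweep(theta, c, x, k, sigma=5.0):
        """One Gibbs sweep over theta_1..theta_k and C_1..C_n for the mixture model."""
        # Resample each theta_i given the points currently assigned to it.
        for i in range(k):
            xi = x[c == i]
            if len(xi) == 0:
                theta[i] = rng.uniform(0, 100)           # no data: sample from the prior
            else:
                m, s = xi.mean(), sigma / np.sqrt(len(xi))
                a, b = (0 - m) / s, (100 - m) / s        # posterior: Normal truncated to [0, 100]
                theta[i] = stats.truncnorm.rvs(a, b, loc=m, scale=s, random_state=rng)
        # Resample each C_j given x_j and all the thetas (uniform prior over components).
        for j in range(len(x)):
            logp = stats.norm.logpdf(x[j], theta, sigma)
            p = np.exp(logp - logp.max())
            c[j] = rng.choice(k, p=p / p.sum())
        return theta, c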

Slide 31

    Sampling Given Markov Blanket

    If V is discrete, just iterate over values, normalize,

    sample from discrete distrib. If V is continuous:

    Simple if child distributions are conjugate to Vs prior:posterior has same form as priorwith different

    parameters

    In general, even sampling from p(v | s-V) can be hard

    w

    )c (

    )])([ a,|][()])[ a(|()|(VY

    VVYsvYspVsvpsvp

    [See BUGS software: http://www.mrc-bsu.cam.ac.uk/bugs]

Slide 32

    Convergence Can Be Slow

    Cjs wont change untilQ2 is in right area Q2does unguidedrandom walkas long as no observations

    are associated with it Especially bad in high dimensions

    should be two clusters

    Q1 = 20 Q2= 90

    species 2 is far away

    Wingspan (cm)

    10 20 30 40 50 60 70 80 90 100

Slide 33

Outline (section marker: Metropolis-Hastings)

Slide 34

Metropolis-Hastings

Define T(si, si+1) as follows:

    Sample s' from a proposal distribution q(s' | si)

    Compute the acceptance probability

        α = min{1, [p(s' | E) q(si | s')] / [p(si | E) q(s' | si)]}

    (relative posterior probabilities, times backward / forward proposal probabilities)

    With probability α, let si+1 = s'; else let si+1 = si

Can show that p(s | E) is the stationary distribution for T

    [Metropolis et al. 1953; Hastings 1970]
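A generic M-H transition matching this definition, as a sketch; log_p, propose, and log_q are placeholders for the model at hand, and everything is done in log space for numerical stability:

    import math, random

    def mh_step(s, log_p, propose, log_q):
        """One Metropolis-Hastings transition.
        log_p(s): log p(s | E) up to a constant; propose(s): sample s' ~ q(. | s);
        log_q(a, b): log q(a | b)."""
        s_new = propose(s)
        log_alpha = (log_p(s_new) + log_q(s, s_new)) - (log_p(s) + log_q(s_new, s))
        if math.log(random.random()) < min(0.0, log_alpha):
            return s_new               # accept
        return s                       # reject: stay at s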

Slide 35

Metropolis-Hastings

Benefits:

    The proposal distribution can propose big steps involving several variables

    Only need to compute the ratio p(s' | E) / p(s | E), ignoring normalization factors

    Don't need to sample from conditional distributions

Limitations:

    Proposals must be reversible, else q(s | s') = 0

    Need to be able to compute q(s | s') / q(s' | s)

Slide 36

    Split-Merge Proposals

    Choose two observations i, j

    If Ci= Cj= c, then splitcluster c

    Get unused latent object cd

    For each observation m such that Cm = c, change Cm tocd with probability 0.5

    Propose new values forQc, Qcd

    Else merge clusters ciand cj For each m such that Cm = cj, set Cm = ci Propose new value forQc

    [Jain & Neal 2004]
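A sketch of just the split move in Python (hypothetical representation: c is the array of selector values, theta the parameter array). Here the new θ values are proposed from the prior, which is the simplest valid choice; the next slide's example resamples them to fit the split-off points instead. Acceptance still uses the M-H ratio from two slides back:

    import numpy as np

    rng = np.random.default_rng(0)

    def propose_split(c, theta, i, j):
        """Split the cluster shared by observations i and j (assumes c[i] == c[j])."""
        old = c[i]
        new = len(theta)                          # index of a fresh, unused latent object
        c2, theta2 = c.copy(), np.append(theta, 0.0)
        members = np.flatnonzero(c == old)
        moved = members[rng.random(len(members)) < 0.5]
        c2[moved] = new                           # each member moves with probability 0.5
        theta2[old] = rng.uniform(0, 100)         # propose fresh parameters for both clusters
        theta2[new] = rng.uniform(0, 100)
        return c2, theta2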

Slide 37

Split-Merge Example

[Figure: wingspan data (cm) with θ1 = 20 and θ2 = 90 before the move, and θ2 = 27 after.]

Split two birds off from species 1

Resample θ2 to match these two birds

The move is likely to be accepted

Slide 38

Mixtures of Kernels

If T1, ..., Tm all have stationary distribution π, then so does the mixture:

    T(s, s') = Σ_{i=1}^m wi Ti(s, s')    (wi ≥ 0, Σ wi = 1)

Example: a mixture of split-merge and Gibbs moves

Point: faster convergence
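In code, a kernel mixture just picks which move to attempt at each step (a sketch; gibbs_sweep_move and split_merge_move are stand-ins, and the weights are illustrative):

    import random

    def mixed_kernel(s, kernels, weights):
        """Apply one of several transition kernels, chosen with fixed probabilities.
        Each kernel must leave the same stationary distribution invariant."""
        (kernel,) = random.choices(kernels, weights=weights, k=1)
        return kernel(s)

    # e.g. mixed_kernel(s, [gibbs_sweep_move, split_merge_move], [0.7, 0.3])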

Slide 39

Outline (section marker: MCMC over events)

Slide 40

MCMC States in Split-Merge

These are not complete instantiations!

    No parameters for unobserved species

States are partial instantiations of random variables, e.g.:

    k = 12, C_B1 = S2, C_B2 = S8, θ_S2 = 31, θ_S8 = 84

Each state corresponds to an event: the set of outcomes satisfying its description

Slide 41

MCMC over Events

Run a Markov chain over events ω, with stationary distribution proportional to p(ω)

Theorem: the fraction of visited events in Q converges to p(Q | E) if:

    Each ω is either a subset of Q or disjoint from Q

    The events form a partition of E

[Figure: E partitioned into events, with Q consisting of some of them exactly.]

    [Milch & Russell 2006]

Slide 42

Computing Probabilities of Events

The engine needs to compute the ratio p(ω') / p(ω) efficiently (without summations)

Use instantiations that include all active parents of the variables they instantiate

Then the probability is a product of CPDs:

    p(ω) = ∏_{X ∈ vars(ω)} p(ω[X] | ω[Pa_ω(X)])
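A sketch of that product in log space, under an assumed representation where an event is a dict from variable to value, parents(X) returns the active parents of X, and cpd evaluates p(value | parent values); all three are hypothetical:

    import math

    def log_prob_event(event, parents, cpd):
        """log p(omega) = sum over instantiated variables X of log p(omega[X] | omega[Pa(X)]).
        Assumes the event instantiates the active parents of every variable it contains."""
        total = 0.0
        for X, value in event.items():
            pa = {P: event[P] for P in parents(X)}   # active parents, all present in the event
            total += math.log(cpd(X, value, pa))
        return total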

Slide 43

States That Are Even More Abstract

A typical partial instantiation:

    k = 12, C_B1 = S2, C_B2 = S8, θ_S2 = 31, θ_S8 = 84

specifies particular species numbers, even though species are interchangeable

Let states be abstract partial instantiations:

    ∃ distinct x, y [k = 12, C_B1 = x, C_B2 = y, θ_x = 31, θ_y = 84]

See [Milch & Russell 2006] for conditions under which we can compute the probabilities of such events

Slide 44

Outline (section marker: Case studies)

Slide 45

Representative Applications

Tracking cars with cameras [Pasula et al. 1999]

Segmentation in computer vision [Tu & Zhu 2002]

Citation matching [Pasula et al. 2003]

Multi-target tracking with radar [Oh et al. 2004]

Slide 46

Citation Matching Model

    #Researcher ~ NumResearchersPrior();
    Name(r) ~ NamePrior();

    #Paper ~ NumPapersPrior();
    FirstAuthor(p) ~ Uniform({Researcher r});
    Title(p) ~ TitlePrior();

    PubCited(c) ~ Uniform({Paper p});
    Text(c) ~ NoisyCitationGrammar
        (Name(FirstAuthor(PubCited(c))), Title(PubCited(c)));

    [Pasula et al. 2003; Milch & Russell 2006]

Slide 47

Citation Matching

An elaboration of the generative model shown earlier

Parameter estimation:

    Priors for names, titles, and citation formats learned offline from labeled data

    String corruption parameters learned with Monte Carlo EM

Inference:

    MCMC with split-merge proposals

    Guided by "canopies" of similar citations

    Accuracy stabilizes after ~20 minutes

    [Pasula et al., NIPS 2002]

Slide 48

Citation Matching Results

Four data sets of ~300-500 citations, referring to ~150-300 papers

[Bar chart: error (fraction of clusters not recovered correctly) on the four data sets (Reinforce, Face, Reason, Constraint) for three methods: phrase matching [Lawrence et al. 1999], the generative model + MCMC [Pasula et al. 2002], and a conditional random field [Wellner et al. 2004].]

Slide 49

Cross-Citation Disambiguation

    Wauchope, K. Eucalyptus: Integrating Natural Language Input with a Graphical User Interface. NRL Report NRL/FR/5510-94-9711 (1994).

Is "Eucalyptus" part of the title, or is the author named K. Eucalyptus Wauchope?

    Kenneth Wauchope (1994). Eucalyptus: Integrating natural language input with a graphical user interface. NRL Report NRL/FR/5510-94-9711, Naval Research Laboratory, Washington, DC, 39pp.

The second citation makes it clear how to parse the first one

Slide 50

Preliminary Experiments: Information Extraction

P(citation text | title, author names) modeled with a simple HMM

For each paper: recover the title, author surnames, and given names

Fraction whose attributes are recovered perfectly in the last MCMC state:

    among papers with one citation: 36.1%

    among papers with multiple citations: 62.6%

Can use inferred knowledge for disambiguation

Slide 51

Multi-Object Tracking

[Figure: object tracks over time, including a false detection and an unobserved object.]

Slide 52

    State Estimation for Aircraft

    #Aircraft ~ NumAircraftPrior();

    State(a, t)

    if t = 0 then ~ InitState()else ~ StateTransition(State(a, Pred(t)));

    #Blip(Source = a, Time = t)~ NumDetectionsCPD(State(a, t));

    #Blip(Time = t)~ NumFalseAlarmsPrior();

    ApparentPos(r)if (Source(r) = null) then ~ FalseAlarmDistrib()else ~ ObsCPD(State(Source(r), Time(r)));

Slide 53

    Aircraft Entering and Exiting

    #Aircraft(EntryTime = t) ~ NumAircraftPrior();

    Exits(a, t)if InFlight(a, t) then ~ Bernoulli(0.1);

    InFlight(a, t)

    if t < EntryTime(a) then = falseelseif t = EntryTime(a) then = trueelse = (InFlight(a, Pred(t)) & !Exits(a, Pred(t)));

    State(a, t)if t = EntryTime(a) then ~ InitState()elseif InFlight(a, t) then

    ~ StateTransition(State(a, Pred(t)));#Blip(Source = a, Time = t)

    if InFlight(a, t) then~ NumDetectionsCPD(State(a, t));

    plus last two statements from previous slide

Slide 54

MCMC for Aircraft Tracking

Uses the generative model from the previous slide (although not with BLOG syntax)

[Figure: examples of Metropolis-Hastings proposals over track associations; figures by Songhwai Oh]

[Oh et al., CDC 2004]

Slide 55

Aircraft Tracking Results

[Figure: estimation error and running time plots; figures by Songhwai Oh]

MCMC has the smallest error, and hardly degrades at all as tracks get dense

MCMC is nearly as fast as the greedy algorithm, and much faster than MHT

[Oh et al., CDC 2004]

Slide 56

    Toward General-Purpose Inference

    Currently, each new application requires new code

    for:

    Proposing moves

    Representing MCMC states

    Computing acceptance probabilities

    Goal:

    User specifies model and proposal distribution

    General-purpose code does the rest

Slide 57

General MCMC Engine

[Diagram: the model, written in a declarative language, defines p(s); a custom proposal distribution (a Java class) proposes an MCMC state s' given sn and computes the ratio q(sn | s') / q(s' | sn); the general-purpose engine (Java code) computes the acceptance probability based on the model and sets sn+1. MCMC states are partial worlds.]

Handles arbitrary proposals efficiently using context-specific structure

[Milch & Russell 2006]

Slide 58

Summary

Models for relational structures go beyond standard probabilistic inference settings

MCMC provides a feasible path for inference

Open problems:

    More general inference

    Adaptive MCMC

    Integrating discriminative methods

Slide 59

    References

    Blei, D. M. and Jordan, M. I. (2005) Variational inference for Dirichlet process mixtures. J. Bayesian

    Analysis 1(1):121-144.

    Casella, G. and Robert, C. P. (1996) Rao-Blackwellisation of sampling schemes . Biometrika 83(1):81-

    94.

    Ferguson T. S. (1983) Bayesian density estimation by mixtures of normal distributions. In Rizvi, M. H.

    et al., eds. Recent Advances in Statistics: Papers in Honor of Herman Chernoff on His Sixtieth Birthday.

    Academic Press, New York, pages 287-302.

    Geman, S. and Geman, D. (1984) Stochastic relaxation, Gibbs distributions and the Bayesian

    restoration of images. IEEE Trans. on Pattern Analysis and Machine Intelligence 6:721-741.

    Gilks, W. R., Thomas, A. and Spiegelhalter, D. J. (1994) A language and program for complex Bayesian

    modelling. The Statistician 43(1):169-177.

    Gilks, W. R., Richardson, S., and Spiegelhalter, D. J., eds. (1996) Markov Chain Monte Carlo in Practice.

    Chapman and Hall.

    Green, P. J. (1995) Reversible jump Markov chain Monte Carlo computation and Bayesian model

    determination. Biometrika 82(4):711-732.

Slide 60

    References

Hastings, W. K. (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97-109.

Jain, S. and Neal, R. M. (2004) A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model. J. Computational and Graphical Statistics 13(1):158-182.

Jordan, M. I. (2005) Dirichlet processes, Chinese restaurant processes, and all that. Tutorial at the NIPS Conference, available at http://www.cs.berkeley.edu/~jordan/nips-tutorial05.ps

MacKay, D. J. C. (1992) Bayesian interpolation. Neural Computation 4(3):414-447.

MacEachern, S. N. (1994) Estimating normal means with a conjugate style Dirichlet process prior. Communications in Statistics: Simulation and Computation 23:727-741.

Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. and Teller, E. (1953) Equations of state calculations by fast computing machines. J. Chemical Physics 21:1087-1092.

Milch, B., Marthi, B., Russell, S., Sontag, D., Ong, D. L., and Kolobov, A. (2005) BLOG: Probabilistic models with unknown objects. In Proc. 19th Int'l Joint Conf. on AI, pages 1352-1359.

Milch, B. and Russell, S. (2006) General-purpose MCMC inference over relational structures. In Proc. 22nd Conf. on Uncertainty in AI, pages 349-358.

Slide 61

    References

Neal, R. M. (2000) Markov chain sampling methods for Dirichlet process mixture models. J. Computational and Graphical Statistics 9:249-265.

Oh, S., Russell, S. and Sastry, S. (2004) Markov chain Monte Carlo data association for general multi-target tracking problems. In Proc. 43rd IEEE Conf. on Decision and Control, pages 734-742.

Pasula, H., Russell, S. J., Ostland, M., and Ritov, Y. (1999) Tracking many objects with many sensors. In Proc. 16th Int'l Joint Conf. on AI, pages 1160-1171.

Pasula, H., Marthi, B., Milch, B., Russell, S., and Shpitser, I. (2003) Identity uncertainty and citation matching. In Advances in Neural Information Processing Systems 15, MIT Press, pages 1401-1408.

Richardson, S. and Green, P. J. (1997) On Bayesian analysis of mixtures with an unknown number of components. J. Royal Statistical Society B 59:731-792.

Sethuraman, J. (1994) A constructive definition of Dirichlet priors. Statistica Sinica 4:639-650.

Sudderth, E. (2006) Graphical models for visual object recognition and tracking. Ph.D. thesis, Dept. of EECS, Massachusetts Institute of Technology, Cambridge, MA.

Tu, Z. and Zhu, S.-C. (2002) Image segmentation by data-driven Markov chain Monte Carlo. IEEE Trans. Pattern Analysis and Machine Intelligence 24(5):657-673.

