
Bayesian Inference in Machine Learning

ACM Compute 2013



    Tutorial Contents

    I. Introduction to Machine Learning

    II. Bayesian Inference

    1. Maximum Likelihood Estimation

    2. Bayesian Approach

    3. Estimation of Posterior Distribution

    III. Machine Learning using Bayesian Inference

    1. Clustering

    2. Latent Variables

    3. Expectation Maximization Algorithm

    4. Variational Methods


    Introduction to Machine Learning

    Machine Learning is a subject in Artificial Intelligence, at the intersection of Computer Science and Statistics, dealing with:

    - Finding non-obvious patterns in data

    - Learning complex relationships among variables

    - Predicting the outcomes when some variables undergo changes


    Common Tasks in Machine Learning

    Clustering: Identifying natural groups (clusters) in the data and assigning each data point to a particular cluster

    Classification: Assigning new observations to a set of predefined categories (classes) by learning from a training data set

    Prediction: Estimating the likelihood of purchase, probability of failure, odds ratio of a win, etc.

    Pattern Mining: Extracting non-obvious patterns and rules having significant support and confidence from data

    Dimensionality Reduction: Reducing the number of variables by finding a subset (features) which contains most of the relevant information (useful for visualization)

    Forecasting: Predicting future values in a time series


    Machine Learning Algorithms

    K-Means Algorithm, Decision Trees, Support Vector Machines, Neural Networks, Logistic Regression, Bayesian Networks, Apriori Algorithm, Principal Component Analysis, Gibbs Sampling


    Machine Learning Algorithms

    Decision Trees for Classification: Divide and Conquer Approach

    [Figure: scatter plot of labelled data points partitioned by the thresholds x = x1, x = x2, x = x3 and y = y1, y = y2]


    Machine Learning Algorithms

    Decision Trees for Classification: Divide and Conquer Approach

    [Figure: the corresponding decision tree: root split on X < X1 / X > X1, second-level splits on Y < Y1 / Y > Y1 and Y < Y2 / Y > Y2, further splits on X < X2 / X > X2 and X < X3 / X > X3, with leaves labelled Red and Blue]


    Classification of Machine Learning Algorithms

    Based on Data Type

    1. Supervised

    When there are labeled data sets for training

    Classification, Regression

    2. Unsupervised

    No labeled data sets for training

    Clustering

    Based on Model Type

    1. Parametric

    Finite number of parameters describing models

    Regression Coefficients

    2. Non-Parametric

    Infinite number of parameters describing models

    Make no assumption on the distribution of data

    Rank Ordering


    Tools for Machine Learning


    Open Source

    1. R

    2. Mahout (for Big Data)

    3. Weka

    4. SciKit (Python)

    Licensed

    1. SAS

    2. SPSS

    3. KXEN


    Maximum Likelihood Estimation

    Workhorse of Machine Learning

    Objective is to estimate the model parameters that maximize the probability of the observed data given a model

    MLE Steps:

    - Data: i.i.d. observations x_1, x_2, \ldots, x_N

    - Choose the model parameters \theta that maximize the Likelihood Function L(\theta) = \prod_{n=1}^{N} p(x_n \mid \theta)

    \theta_{ML} = \arg\max_{\theta} \log L(\theta) = \arg\max_{\theta} \frac{1}{N} \sum_{n=1}^{N} \log p(x_n \mid \theta)


    MLE Example using R

    Synthetic Data Set:

    Linear Model: y = \beta_0 + \beta_1 x with \beta_0 = 10.0, \beta_1 = 2.0

    Noise: \epsilon \sim N(0, 25.0)
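
    The tutorial's R code for this example is not part of the transcript; the following is a minimal sketch, assuming the model above (beta0 = 10.0, beta1 = 2.0, noise variance 25.0) and using optim() to maximize the log likelihood. The sample size and variable names are assumptions.

        # Synthetic data for the linear model y = 10.0 + 2.0 * x + noise, noise ~ N(0, 25)
        set.seed(1)
        n <- 100
        x <- runif(n, 0, 10)
        y <- 10.0 + 2.0 * x + rnorm(n, mean = 0, sd = 5)

        # Negative log likelihood; sigma is parameterised on the log scale to keep it positive
        negloglik <- function(par) {
          b0 <- par[1]; b1 <- par[2]; sigma <- exp(par[3])
          -sum(dnorm(y, mean = b0 + b1 * x, sd = sigma, log = TRUE))
        }

        fit <- optim(c(0, 0, 0), negloglik)
        c(beta0 = fit$par[1], beta1 = fit$par[2], sigma2 = exp(fit$par[3])^2)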


    MLE Example using R

    MLE Solution (case when \sigma is known): (\hat\beta_0, \hat\beta_1) = (9.8342, 1.9653)

    MLE Solution (case when \sigma is unknown): (\hat\beta_0, \hat\beta_1, \hat\sigma^2) = (6.7631, 1.9993, 43.61)


    Maximum Likelihood Estimation

    Limitations:

    - Point estimates for model parameters

    - Need regularization to avoid model overfit

    - Difficult to incorporate domain knowledge

    - Online/sequential learning: no provision for incremental update


    Bayesian Methods

    Bayesian Methods provide a framework to reason coherently about the world in the presence of uncertainty

    - Based on a theorem by Rev. Thomas Bayes (1701-1761)

    - Independently discovered and popularized by Laplace (1749-1827)

    The core approach in Bayesian Methods is to

    - Start with a Belief (Hypothesis) about a problem and a Prior Degree of Belief

    - Update the Degree of Belief (Hypothesis) as more Evidence (Data) gathers

    Different from the Frequentist Approach, where Probability is a measure of the proportion of outcomes

    Analogous to how human beings learn about the world


    Bayesian Approach

    In the Bayesian approach, probability is considered as a Degree of Belief

    Starting point: a Hypothesis representing existing knowledge or belief

    Update the Hypothesis in the event of observing new data, using Bayes' Theorem

    Bayes' Theorem:

    P(Hypothesis \mid Data) = \frac{P(Data \mid Hypothesis)\, P(Hypothesis)}{P(Data)}

    Posterior \propto Likelihood \times Prior
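
    As a toy numerical illustration (the numbers below are hypothetical and not part of the tutorial), Bayes' Theorem can be applied directly in R:

        # Toy illustration of Bayes' theorem; all numbers are made up for illustration
        prior      <- 0.30                                     # P(Hypothesis)
        likelihood <- 0.80                                     # P(Data | Hypothesis)
        evidence   <- likelihood * prior + 0.10 * (1 - prior)  # P(Data) by total probability,
                                                               # assuming P(Data | not Hypothesis) = 0.10
        posterior  <- likelihood * prior / evidence            # P(Hypothesis | Data)
        posterior                                              # approximately 0.774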


    Bayesian Approach

    Prediction for a new data point is made by using the updated posterior distribution:

    p(x_{new} \mid Data) = \int p(x_{new} \mid \theta)\, p(\theta \mid Data)\, d\theta


    Bayesian Approach

    Advantages:

    - Distribution of parameters capturing uncertainty in parameter estimation

    - Marginalization over the distribution of parameters to avoid overfitting, no need for regularization

    - Framework to incorporate Domain Knowledge through the Prior distribution

    - Framework to address Model uncertainties

    - Natural framework for online/sequential learning

    - Modeling can start with very little data


    Challenges in implementation

    - Estimation of the Posterior Distribution is not trivial, since P(Data) often cannot be computed

    - Choosing the right prior

    - Computationally more intensive


    Choosing the Right Prior

    General Guidelines in choosing a Prior:

    1. Justify assumptions and evaluate their plausibility in view of what is known

    2. Explore the sensitivity of the results of the analysis to the assumptions on the Prior

    Different Types of Priors:

    1. Conjugate Priors

    2. Non-informative Priors

    3. Jeffreys' Rule

    4. Subjective Priors


    Conjugate Priors

    Data: i.i.d. observations x_1, x_2, \ldots, x_n

    Model: x_i \mid \mu \sim N(\mu, \sigma^2), with \sigma^2 known

    Prior: \mu \sim N(\mu_0, \tau_0^2)

    Posterior:

    p(\mu \mid x) \propto p(x \mid \mu)\, p(\mu) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right) \cdot \frac{1}{\sqrt{2\pi\tau_0^2}} \exp\!\left(-\frac{(\mu - \mu_0)^2}{2\tau_0^2}\right)


    Conjugate Priors

    After rearranging the terms in the exponent:

    p(\mu \mid x) \propto \exp\!\left(-\frac{1}{2}\left(\frac{n}{\sigma^2} + \frac{1}{\tau_0^2}\right)\mu^2 + \left(\frac{n\bar{x}}{\sigma^2} + \frac{\mu_0}{\tau_0^2}\right)\mu\right)


    Conjugate Priors

    Posterior: \mu \mid x \sim N(\mu_n, \tau_n^2), where

    \mu_n = \frac{\mu_0/\tau_0^2 + n\bar{x}/\sigma^2}{1/\tau_0^2 + n/\sigma^2}

    \frac{1}{\tau_n^2} = \frac{1}{\tau_0^2} + \frac{n}{\sigma^2}


    Conjugate Priors: Posterior Mean

    \mu_n = \frac{\mu_0/\tau_0^2 + n\bar{x}/\sigma^2}{1/\tau_0^2 + n/\sigma^2}

    - Weighted average of the sample mean and the prior mean

    - The component having less uncertainty has more weight

    - As n increases, the weight of the prior mean decreases


    Conjugate Priors: Posterior Precision

    \frac{1}{\tau_n^2} = \frac{1}{\tau_0^2} + \frac{n}{\sigma^2}

    - Sum of the precision of the sample mean and the prior precision

    - Always greater than the prior precision, even with poor quality data

    - For large n, dominated by the precision of the sample mean
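
    These two updates translate into a short R function; the prior values mu0 = 0, tau0 = 10 and the known sigma = 2 in the usage example are illustrative choices, not values from the tutorial.

        # Posterior of a Normal mean under a conjugate Normal prior (known data variance)
        posterior_normal <- function(x, mu0, tau0, sigma) {
          n         <- length(x)
          precision <- 1 / tau0^2 + n / sigma^2                            # posterior precision
          mu_n      <- (mu0 / tau0^2 + n * mean(x) / sigma^2) / precision  # precision-weighted mean
          c(mean = mu_n, sd = sqrt(1 / precision))
        }

        set.seed(2)
        x <- rnorm(20, mean = 5, sd = 2)
        posterior_normal(x, mu0 = 0, tau0 = 10, sigma = 2)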


    Non-informative Priors

    A prior which contains no information about \theta

    Used when no or minimal prior information is available

    Often such priors are Improper Priors (infinite mass)

    Priors being improper is not an issue per se, as long as the posteriors are well-defined densities

    Often the Uniform distribution is used as a non-informative prior


    Jeffreys Prior

    Jeffreys (1961) suggested the following prior, which is commonly used:

    p(\theta) \propto \sqrt{\det I(\theta)}

    where I(\theta) is the Fisher Information Matrix

    I(\theta)_{ij} = -E\!\left[\frac{\partial^2 \log p(x \mid \theta)}{\partial \theta_i\, \partial \theta_j}\right]


    Computation of Posterior Distribution

    Maximum A Posteriori Estimation

    Monte Carlo Simulations

    Variational Methods


    Maximum A Posteriori Estimation (MAP)

    Find the parameter values which maximize the Posterior distribution:

    \theta_{MAP} = \arg\max_{\theta} p(\theta \mid D) = \arg\max_{\theta} p(D \mid \theta)\, p(\theta)

    Maximizing the numerator of Bayes' formula


    Maximum A Posteriori Estimation (MAP)

    Example: Linear Regression

    Model: y = \beta^{T}x + \epsilon, \quad \epsilon \sim N(0, \sigma^2), \quad \beta = (\beta_1, \ldots, \beta_M), \; x = (x_1, \ldots, x_M)

    The variables x are treated as deterministic (exogenous)

    The model can be rewritten as

    p(y \mid x, \beta, \sigma^2) = N(y;\, \beta^{T}x,\, \sigma^2)


    Maximum A Posteriori Estimation (MAP)

    Example: Linear Regression

    Likelihood function:

    p(y \mid X, \beta, \sigma^2) = \frac{1}{(2\pi)^{N/2} \det(\sigma^2 I)^{1/2}} \exp\!\left(-\frac{1}{2\sigma^2}(y - X\beta)^{T}(y - X\beta)\right)

    Prior distribution for \beta: N(\beta;\, \mu_0, \Sigma_0)

    p(\beta) \propto \exp\!\left(-\frac{1}{2}(\beta - \mu_0)^{T}\Sigma_0^{-1}(\beta - \mu_0)\right)


    Maximum A Posteriori Estimation (MAP)

    Example: Linear Regression

    Posterior distribution:

    p(\beta \mid y, X, \sigma^2) \propto \exp\!\left(-\frac{1}{2\sigma^2}(y - X\beta)^{T}(y - X\beta)\right) \exp\!\left(-\frac{1}{2}(\beta - \mu_0)^{T}\Sigma_0^{-1}(\beta - \mu_0)\right)

    Rearranging terms,

    p(\beta \mid y, X, \sigma^2) \propto \exp\!\left(-\frac{1}{2}(\beta - \mu_N)^{T}\Sigma_N^{-1}(\beta - \mu_N)\right)


    Maximum A Posteriori Estimation (MAP)

    Example: Linear Regression

    MAP Estimation of \beta: set \frac{\partial}{\partial \beta} \log p(\beta \mid y, X, \sigma^2) = 0

    \frac{1}{\sigma^2} X^{T}(y - X\beta) - \Sigma_0^{-1}(\beta - \mu_0) = 0


    Maximum A Posteriori Estimation (MAP)

    Example: Linear Regression

    MAP Estimation of \beta (Bayesian Shrinkage):

    \beta_{MAP} = \left(\frac{X^{T}X}{\sigma^2} + \Sigma_0^{-1}\right)^{-1} \left(\frac{X^{T}y}{\sigma^2} + \Sigma_0^{-1}\mu_0\right)
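
    A minimal R sketch of this MAP estimate for the special case of a zero-mean isotropic prior \Sigma_0 = \tau^2 I, where it reduces to ridge-style shrinkage; the values of sigma2 and tau2 below are assumptions for illustration.

        # MAP estimate of regression coefficients under a zero-mean Gaussian prior on beta;
        # equivalent to ridge regression with penalty lambda = sigma2 / tau2
        map_regression <- function(X, y, sigma2, tau2) {
          lambda <- sigma2 / tau2
          solve(t(X) %*% X + lambda * diag(ncol(X)), t(X) %*% y)
        }

        set.seed(3)
        X <- cbind(1, runif(40, 0, 10))                  # design matrix with an intercept column
        y <- X %*% c(10, 2) + rnorm(40, sd = 5)
        map_regression(X, y, sigma2 = 25, tau2 = 100)    # shrunk towards 0 relative to the MLE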


    Maximum A Posteriori Estimation (MAP)

    Example: Prediction of Global Warming by Greenhouse Gases

    Data:

    - Atmospheric CO2 data collected by the Mauna Loa Observatory (1959-2012)

    - Temperature Anomaly data collected by NASA and NOAA (1880-2012)

    The task is to predict the likely Temperature Rise when CO2 levels reach 500 ppm from the current 400 ppm


    Maximum A Posteriori Estimation (MAP)

    Example: Prediction of Global Warming by Greenhouse Gases

    Hypothesis:

    Temperature Anomaly is modeled as a linear function of CO2 Concentration

    Domain Knowledge:

    - Doubling the CO2 concentration would result in an increase in temperature of 1-4 °C

    - During the period 1960-2000, CO2 concentration increased by 52 ppm and temperature increased by 0.44 °C


    Maximum A Posteriori Estimation (MAP)

    Example: Prediction of Global Warming by Greenhouse Gases

    From the data for the period 1960-1980:

    \hat\beta_0 = 2.3793\;(0.9924), \quad \hat\beta_1 = 0.0073\;(0.0030)

    - Use this as Prior information

    - Use the data for the period 1981-2007 as Training data

    - Predict the Temperature Anomaly for the period 2008-2012
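
    A sketch of how this workflow could be set up in R; the file name, column names and the choice to reuse the prior-period noise variance are all assumptions, since the tutorial's code and data files are not included in the transcript.

        # d is assumed to be a data frame with columns year, co2, temp_anom
        d <- read.csv("co2_temp.csv")                               # hypothetical file name

        prior_fit <- lm(temp_anom ~ co2, data = subset(d, year >= 1960 & year <= 1980))
        mu0    <- coef(prior_fit)                                   # prior mean from the 1960-1980 fit
        Sigma0 <- vcov(prior_fit)                                   # prior covariance from the same fit
        sigma2 <- summary(prior_fit)$sigma^2                        # assumed noise variance

        train <- subset(d, year >= 1981 & year <= 2007)             # training period
        X <- cbind(1, train$co2); y <- train$temp_anom

        # MAP estimate: precision-weighted combination of the prior and the likelihood
        A <- t(X) %*% X / sigma2 + solve(Sigma0)
        b <- t(X) %*% y / sigma2 + solve(Sigma0) %*% mu0
        beta_map <- solve(A, b)

        c(1, 500) %*% beta_map                                      # predicted anomaly at 500 ppm CO2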


    Maximum A Posteriori Estimation (MAP)

    Prior distribution:

    p(\beta) = N(\beta;\, \mu_0, \Sigma_0) \propto \exp\!\left(-\frac{1}{2}(\beta - \mu_0)^{T}\Sigma_0^{-1}(\beta - \mu_0)\right)

    with \mu_0 = (3.27,\; 0.01) and a diagonal prior covariance \Sigma_0


    Maximum A Posteriori Estimation (MAP)

    [Figure: Prior distribution]


    Maximum A Posteriori Estimation (MAP)

    [Figure: Prediction using Prior vs Prediction using MLE]


    Maximum A Posteriori Estimation (MAP)

    [Figure: Prediction using MAP]


    Monte-Carlo Simulations

    Often the Posterior distribution is analytically not tractable:

    - No expressions for the Mean, Variance, etc.

    - No closed form for the Marginal Distributions

    Solution:

    - Draw i.i.d. samples from the Posterior

    - Using the samples, compute the Mean, Variance, Confidence Intervals, etc.

    - Markov Chain Monte-Carlo (MCMC) to generate the samples


    Monte-Carlo Simulations

    Markov Chain Monte-Carlo Simulations (MCMC)

    Let \theta_1, \ldots, \theta_M be the parameters for which the posterior distribution needs to be computed

    Consider the case where the parameters are discrete, with K states for each \theta_i

    Set up a Markov Process over the K^M states with a Transition Probability T(\theta^{(t+1)} \mid \theta^{(t)}) such that the Steady State corresponds to the Posterior


    Monte-Carlo Simulations

    Markov Chain Monte-Carlo Simulations

    1. Metropolis-Hastings Algorithm

    i. Let \theta^{(t)} be the state of the system at time t

    ii. Generate a candidate state \theta^{*} for time t+1 by drawing from a Proposal Distribution q(\theta^{*} \mid \theta^{(t)})

    iii. Accept the proposal move with probability

    A = \min\!\left(1,\; \frac{p(\theta^{*} \mid D)\, q(\theta^{(t)} \mid \theta^{*})}{p(\theta^{(t)} \mid D)\, q(\theta^{*} \mid \theta^{(t)})}\right), \quad \text{in which case } \theta^{(t+1)} = \theta^{*}

    iv. If the proposal is rejected, \theta^{(t+1)} = \theta^{(t)}

    v. Continue till the distribution converges to a steady state


    Monte-Carlo Simulations

    Markov Chain Monte-Carlo Simulations

    1. Metropolis-Hastings Algorithm

    i. Hill-climbing type of algorithm

    ii. Very generic, can be used for most Posterior distributions

    iii. Need to choose the proposal distribution carefully to avoid

    - a large number of rejections

    - slow convergence to the steady state
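
    A minimal R implementation of the Metropolis-Hastings steps above, using a symmetric random-walk proposal (so the proposal densities cancel in the acceptance ratio) and a standard Normal target as a stand-in for the log posterior; all of these choices are illustrative.

        metropolis <- function(log_post, n_iter = 10000, init = 0, prop_sd = 1) {
          theta <- numeric(n_iter)
          theta[1] <- init
          for (t in 2:n_iter) {
            cand <- rnorm(1, mean = theta[t - 1], sd = prop_sd)    # ii. draw candidate from proposal
            log_accept <- log_post(cand) - log_post(theta[t - 1])  # iii. acceptance ratio (symmetric q)
            if (log(runif(1)) < log_accept) {
              theta[t] <- cand                                     # accept the move
            } else {
              theta[t] <- theta[t - 1]                             # iv. reject: keep the old state
            }
          }
          theta
        }

        samples <- metropolis(function(th) dnorm(th, 0, 1, log = TRUE))
        c(mean(samples), sd(samples))                              # should be close to 0 and 1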


    Monte-Carlo Simulations

    Markov Chain Monte-Carlo Simulations

    2. Gibbs Sampling

    i. Start with an initial state \theta^{(0)} = (\theta_1^{(0)}, \ldots, \theta_M^{(0)})

    ii. At each step, update the components one by one by drawing from a distribution conditional on the most recent values of the rest of the components:

    \theta_1^{(t+1)} \sim p(\theta_1 \mid \theta_2^{(t)}, \ldots, \theta_M^{(t)})

    \theta_2^{(t+1)} \sim p(\theta_2 \mid \theta_1^{(t+1)}, \theta_3^{(t)}, \ldots, \theta_M^{(t)})

    \ldots

    \theta_M^{(t+1)} \sim p(\theta_M \mid \theta_1^{(t+1)}, \ldots, \theta_{M-1}^{(t+1)})

    iii. After M such steps, all the components of the parameter vector will have been updated

    iv. Continue till the distribution converges to a steady state


    Monte-Carlo Simulations

    Markov Chain Monte-Carlo Simulations

    2. Gibbs Sampling

    i. Very efficient algorithm since there are no rejections

    ii. Commonly used for practical applications

    iii. Conditional distributions should be known


    Monte-Carlo Simulations

    Markov Chain Monte-Carlo Simulations

    2. Gibbs Sampling

    Example: Posterior ~ Bivariate Normal

    p(\theta;\, \mu, \Sigma) = \frac{1}{2\pi\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(\theta - \mu)^{T}\Sigma^{-1}(\theta - \mu)\right)

    \theta = (\theta_1, \theta_2), \quad \mu = (\mu_1, \mu_2), \quad \Sigma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}

    Conditional density:

    p(\theta_1 \mid \theta_2) = N\!\left(\theta_1;\; \mu_1 + \rho\frac{\sigma_1}{\sigma_2}(\theta_2 - \mu_2),\; \sigma_1^2(1 - \rho^2)\right)


    Monte-Carlo Simulations

    Markov Chain Monte-Carlo Simulations

    2. Gibbs Sampling

    Example: Posterior ~ Bivariate Normal

    Conditional density:

    p(\theta_2 \mid \theta_1) = N\!\left(\theta_2;\; \mu_2 + \rho\frac{\sigma_2}{\sigma_1}(\theta_1 - \mu_1),\; \sigma_2^2(1 - \rho^2)\right)
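
    A short R version of this Gibbs sampler for the bivariate Normal example, with zero means, unit variances and correlation rho = 0.8 chosen purely for illustration:

        gibbs_bvn <- function(n_iter = 5000, rho = 0.8) {
          theta <- matrix(0, nrow = n_iter, ncol = 2)
          for (t in 2:n_iter) {
            # draw theta1 given the most recent theta2, then theta2 given the new theta1
            theta[t, 1] <- rnorm(1, mean = rho * theta[t - 1, 2], sd = sqrt(1 - rho^2))
            theta[t, 2] <- rnorm(1, mean = rho * theta[t, 1],     sd = sqrt(1 - rho^2))
          }
          theta
        }

        draws <- gibbs_bvn()
        colMeans(draws)        # close to (0, 0)
        cor(draws)[1, 2]       # close to 0.8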

    Machine Learning Using Bayesian Inference


    Machine Learning using Bayesian Inference

    Example: Clustering

    Consider a set of N points x_1, \ldots, x_N in D dimensions

    Goal is to partition the data set into K clusters such that distances between points within a cluster are smaller compared to distances between points in different clusters

    Let \mu_k, k = 1, \ldots, K, be D-dimensional vectors representing the centers of each cluster

    Let r_{nk} be the indicator function:

    r_{nk} = 1 if point n is in cluster k, and 0 otherwise


    Clustering

    Define an Objective Function (Distortion Function) to represent the sum of squared distances of each data point to the center of the cluster it belongs to:

    J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk}\, \lVert x_n - \mu_k \rVert^2

    Task is to find r_{nk} and \mu_k which minimize J


    Clustering

    Iterative procedure for finding r_{nk} and \mu_k:

    - Start with initial values for \mu_k

    - Minimize J w.r.t. r_{nk}, keeping \mu_k fixed

    - Minimize J w.r.t. \mu_k, keeping r_{nk} fixed

    K-Means algorithm
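
    A bare-bones R sketch of these two alternating minimisation steps (in practice one would simply call the built-in kmeans()); the two-cluster synthetic data are illustrative.

        kmeans_simple <- function(X, K, n_iter = 50) {
          mu <- X[sample(nrow(X), K), , drop = FALSE]              # initial cluster centres
          for (i in 1:n_iter) {
            # Step 1: assign each point to its nearest centre (minimise J w.r.t. r_nk)
            d2 <- sapply(1:K, function(k)
              rowSums((X - matrix(mu[k, ], nrow(X), ncol(X), byrow = TRUE))^2))
            z <- max.col(-d2)
            # Step 2: move each centre to the mean of its points (minimise J w.r.t. mu_k)
            for (k in 1:K)
              if (any(z == k)) mu[k, ] <- colMeans(X[z == k, , drop = FALSE])
          }
          list(centers = mu, cluster = z)
        }

        set.seed(4)
        X <- rbind(matrix(rnorm(100, 0), ncol = 2), matrix(rnorm(100, 4), ncol = 2))
        kmeans_simple(X, K = 2)$centers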


    Gaussian Mixture Model

    p(x) = \sum_{k=1}^{K} \pi_k\, N(x \mid \mu_k, \Sigma_k)

    where the mixing coefficients satisfy \sum_{k=1}^{K} \pi_k = 1, \quad 0 \le \pi_k \le 1


    Gaussian Mixture Model

    Maximum Likelihood Estimate: Computational Issues

    Presence of singularities:

    Let \Sigma_k = \sigma_k^2 I, and suppose one of the data points coincides with a component mean, x_n = \mu_j. This component contributes a term

    N(x_n \mid \mu_j, \sigma_j^2 I) = \frac{1}{(2\pi)^{1/2}\,\sigma_j}

    (in one dimension). During minimization this term becomes singular as \sigma_j \to 0.

    Why does this issue not arise for a single Gaussian distribution?


    Gaussian Mixture Model

    Maximum Likelihood Estimate: Computational Issues

    Exercise:

    Why does the singularity issue not arise for a single Gaussian distribution?


    Expectation Maximization (EM)


    Expectation Maximization (EM) Algorithm

    Dempster et al., 1977

    Elegant method for finding the MLE for models with latent variables

    Taking the derivative of \ln p(X \mid \pi, \mu, \Sigma) w.r.t. \mu_k and equating to zero gives

    \mu_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk})\, x_n, \qquad N_k = \sum_{n=1}^{N} \gamma(z_{nk})


    Expectation Maximization (EM) Algorithm

    Taking the derivative of \ln p(X \mid \pi, \mu, \Sigma) w.r.t. \Sigma_k and equating to zero gives

    \Sigma_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk})\,(x_n - \mu_k)(x_n - \mu_k)^{T}

    This does not yield a closed-form solution, since the responsibilities \gamma(z_{nk}) depend on \pi, \mu and \Sigma


    Expectation Maximization (EM) Algorithm

    Iterative Solution:

    1. Initialize the parameters \pi, \mu and \Sigma, and evaluate the initial value of the log likelihood

    2. E Step: Evaluate the responsibilities using the current parameter values

    \gamma(z_{nk}) = \frac{\pi_k\, N(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j\, N(x_n \mid \mu_j, \Sigma_j)}


    EM Algorithm in a Bayesian Setting

    Suppose we know the values of the latent variables Z in addition to the observed data X

    Consider the problem of maximizing the likelihood of the complete data set \{X, Z\}:

    p(X, Z \mid \pi, \mu, \Sigma) = \prod_{n=1}^{N} \prod_{k=1}^{K} \left[\pi_k\, N(x_n \mid \mu_k, \Sigma_k)\right]^{z_{nk}}


    EM Algorithm in a Bayesian Setting

    Log likelihood of the complete data set \{X, Z\}:

    \ln p(X, Z \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \sum_{k=1}^{K} z_{nk}\left[\ln \pi_k + \ln N(x_n \mid \mu_k, \Sigma_k)\right]

    The logarithm directly acts on the Normal distribution, giving a simpler equation for the MLE

    - For maximization w.r.t. \mu_k and \Sigma_k, the expression is a sum of K independent terms

    - For maximization w.r.t. \pi_k, there is a coupling since \sum_k \pi_k = 1; using a Lagrange multiplier,

    \pi_k = \frac{1}{N} \sum_{n=1}^{N} z_{nk}


    EM Algorithm in a Bayesian Setting

    The log likelihood of the complete data set \{X, Z\} is easy to maximize

    However, usually Z is unknown. Take the expectation of the complete-data log likelihood using the Posterior distribution of Z:

    E_{Z}\!\left[\ln p(X, Z \mid \pi, \mu, \Sigma)\right] = \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma(z_{nk})\left[\ln \pi_k + \ln N(x_n \mid \mu_k, \Sigma_k)\right]


    EM Algorithm in a Bayesian Setting

    1. Initialize the parameters \pi, \mu and \Sigma, and evaluate the initial value of the log likelihood

    2. E Step: use these values to compute the responsibilities \gamma(z_{nk})

    3. M Step: keeping the responsibilities fixed, maximize E_{Z}[\ln p(X, Z \mid \pi, \mu, \Sigma)] w.r.t. \pi, \mu and \Sigma:

    \mu_k^{new} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk})\, x_n

    \Sigma_k^{new} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk})\,(x_n - \mu_k^{new})(x_n - \mu_k^{new})^{T}

    \pi_k^{new} = \frac{N_k}{N}, \qquad N_k = \sum_{n=1}^{N} \gamma(z_{nk})
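
    The E and M steps above translate directly into a short R routine for a one-dimensional mixture of K Gaussians; the initialisation strategy and the two-component synthetic data are illustrative choices.

        em_gmm <- function(x, K, n_iter = 100) {
          n    <- length(x)
          pi_k <- rep(1 / K, K)                 # mixing coefficients
          mu   <- sample(x, K)                  # component means (random initialisation)
          s2   <- rep(var(x), K)                # component variances
          for (it in 1:n_iter) {
            # E step: responsibilities gamma(z_nk)
            dens  <- sapply(1:K, function(k) pi_k[k] * dnorm(x, mu[k], sqrt(s2[k])))
            gamma <- dens / rowSums(dens)
            # M step: re-estimate pi, mu and sigma^2 from the weighted data
            Nk   <- colSums(gamma)
            pi_k <- Nk / n
            mu   <- colSums(gamma * x) / Nk
            s2   <- colSums(gamma * (outer(x, mu, "-"))^2) / Nk
          }
          list(pi = pi_k, mu = mu, sigma2 = s2)
        }

        set.seed(5)
        x <- c(rnorm(200, 0, 1), rnorm(200, 5, 1))
        em_gmm(x, K = 2)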


    General form of EM Algorithm in Bayesian Setting


    General form of EM Algorithm in Bayesian Setting

    For any choice of q(Z), the following decomposition holds:

    \ln p(X \mid \theta) = \mathcal{L}(q, \theta) + \mathrm{KL}(q \,\|\, p)

    \mathcal{L}(q, \theta) = \sum_{Z} q(Z) \ln \frac{p(X, Z \mid \theta)}{q(Z)}, \qquad \mathrm{KL}(q \,\|\, p) = -\sum_{Z} q(Z) \ln \frac{p(Z \mid X, \theta)}{q(Z)}

    \mathrm{KL}(q \,\|\, p) is the Kullback-Leibler Divergence between q(Z) and the posterior p(Z \mid X, \theta)


    General form of EM Algorithm in Bayesian Setting

    [Figure: graphical interpretation of the EM Algorithm]


    References

    1. Pattern Recognition and Machine Learning, Christopher M. Bishop, Springer

    2. Dynamic Linear Models with R, Giovanni Petris, Springer

    Email: [email protected]

    Twitter: @hari_koduvely
