Classification for High Dimensional Problems Using Bayesian Neural Networks and Dirichlet Diffusion...

Date post: 20-Dec-2015
Transcript
Page 1: Classification for High Dimensional Problems Using Bayesian Neural Networks and Dirichlet Diffusion Trees Radford M. Neal and Jianguo Zhang the winners.

Classification for High Dimensional Problems Using Bayesian Neural Networks and Dirichlet Diffusion Trees

Radford M. Neal and Jianguo Zhang, the winners of the NIPS 2003 feature selection challenge

University of Toronto

Page 2:

The results

• Combination of Bayesian neural networks and classification based on Bayesian clustering with a Dirichlet diffusion tree model.
• A Dirichlet diffusion tree method is used for Arcene.
• Bayesian neural networks (as in BayesNN-large) are used for Gisette, Dexter, and Dorothea.
• For Madelon, the class probabilities from a Bayesian neural network and from a Dirichlet diffusion tree method are averaged, then thresholded to produce predictions.

Page 3:

Their General Approach

Use simple techniques to reduce the computational difficulty of the problem, then apply more sophisticated Bayesian methods.
– The simple techniques: PCA and feature selection by significance tests.
– Bayesian neural networks.
– Automatic Relevance Determination.

Page 4:

(I) First level feature reduction

Page 5:

Feature selection using significance tests (first level)

An initial feature subset was found by simple univariate significance tests (correlation coefficient, symmetrical uncertainty).

Assumption: Relevant variables will be at least somewhat relevant on their own.

For all tests, a p-value was found by comparing to the distribution found when permuting the class labels.
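The permutation idea above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the absolute correlation coefficient as the test statistic, and the function name `permutation_p_values`, the number of permutations, the 0.05 cutoff, and the synthetic data are all made up for the example.

```python
import numpy as np

def permutation_p_values(X, y, n_perm=200, seed=0):
    """Permutation p-value of |correlation| between each feature and the labels."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    # Column norms; constant features get norm 1 so their score is 0, not NaN.
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0] = 1.0
    Xn = Xc / norms

    def abs_corr(labels):
        lc = labels - labels.mean()
        return np.abs(Xn.T @ lc) / np.linalg.norm(lc)

    observed = abs_corr(y.astype(float))
    exceed = np.zeros(X.shape[1])
    for _ in range(n_perm):
        # Permuting the labels simulates the "no relationship" distribution.
        exceed += abs_corr(rng.permutation(y).astype(float)) >= observed
    # Add-one correction keeps every p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)

# Synthetic data (an assumption): feature 0 is informative, the rest are noise.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=100)
X = rng.normal(size=(100, 20))
X[:, 0] += 2.0 * y
p = permutation_p_values(X, y)
selected = np.flatnonzero(p < 0.05)   # hypothetical cutoff for the first-level subset
```

Because each feature is tested on its own, a feature that only matters in combination with others can be missed, which is exactly the assumption stated on the slide.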

Page 6:

Dimensionality reduction with PCA (an alternative to feature selection)

There are probably better dimensionality reduction methods than PCA, but that's what we used. One reason is that it's feasible even when the number of features p is huge, provided the number of cases n is not too large: the time required is of order min(pn², np²).

PCA was done using all the data (training, validation, and test).
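One way to get the pn² cost when p is much larger than n is to eigendecompose the n × n Gram matrix instead of the p × p covariance matrix. The sketch below illustrates that trick under assumed names (`pca_via_gram`, `k`) and random stand-in data; it is not taken from the authors' software. In the setting described above, `X` would be the training, validation, and test inputs stacked together.

```python
import numpy as np

def pca_via_gram(X, k):
    """Project the rows of X (n x p, with p >> n) onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    gram = Xc @ Xc.T                          # n x n matrix, costs O(p n^2)
    eigvals, eigvecs = np.linalg.eigh(gram)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # With gram = U S^2 U^T, the principal component scores are U * S.
    return eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5000))   # n = 30 cases, p = 5000 features (made up)
Z = pca_via_gram(X, k=5)          # 30 x 5 matrix of reduced inputs
```

The scores match those from a singular value decomposition of the centered data up to the sign of each component.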

Page 7:

(II) Building the learning model & second level feature selection

Page 8:

Bayesian Neural Networks

Page 9:

Conventional neural network learning

Page 10:

Bayesian Neural Network Learning

Based on the statistical interpretation of conventional neural network learning

Page 11:

Bayesian Neural Network Learning

Bayesian predictions are found by integration rather than maximization. For a test case x, y is predicted:

A conventional neural network considers only the parameters with maximum posterior probability.

A Bayesian neural network considers all possible parameters in the parameter space.

Can be implemented by Gaussian approximation or MCMC.
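In standard notation (the symbols θ for the parameters and D for the training data are assumptions, not taken from the slide), prediction by integration reads p(y | x, D) = ∫ p(y | x, θ) p(θ | D) dθ, and with MCMC it is approximated by averaging p(y | x, θ) over posterior samples of θ. The sketch below shows only that averaging step. The small network, the function names, and the random stand-in "samples" are hypothetical, and no actual sampler is run here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(theta, x):
    """Class-1 probability from a one-hidden-layer network with parameters theta."""
    W1, b1, w2, b2 = theta
    hidden = np.tanh(x @ W1 + b1)
    return sigmoid(hidden @ w2 + b2)

def bayesian_predict(posterior_samples, x):
    """Approximate the predictive integral by averaging over posterior samples."""
    return np.mean([predict_proba(theta, x) for theta in posterior_samples], axis=0)

# Stand-in "posterior samples": random draws used only to exercise the code.
rng = np.random.default_rng(0)
samples = [(rng.normal(size=(3, 4)), rng.normal(size=4),
            rng.normal(size=4), rng.normal())
           for _ in range(50)]
x_test = rng.normal(size=(5, 3))

p_bayes = bayesian_predict(samples, x_test)    # averages over all parameter sets
p_single = predict_proba(samples[0], x_test)   # what one parameter set alone gives
```

A conventional network corresponds to calling `predict_proba` once with a single best parameter set, while the Bayesian prediction averages many such calls.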

Page 12:

ARD Prior

Remember the weight decay?

How? (by optimizing the decay parameters)
– Associate the weights from each input with a decay parameter.
– There are theories for optimizing the decays.

Result: If an input feature x is irrelevant, its relevance hyper-parameter β = 1/a (the inverse of its decay parameter a) will tend to be small, forcing the weights from that input to be near zero.
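The effect of a per-input decay can be seen in a much simpler model than a neural network. The sketch below assumes a linear model with unit noise variance, where the most probable weights under a Gaussian prior with one decay per input have a closed form. The decays are fixed by hand purely for illustration, whereas in ARD they are hyperparameters inferred from the data; the function name `map_weights` and the synthetic data are hypothetical.

```python
import numpy as np

def map_weights(X, y, decay):
    """Minimize ||y - X w||^2 + sum_j decay[j] * w[j]^2 (one decay per input)."""
    return np.linalg.solve(X.T @ X + np.diag(decay), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=200)   # only input 0 is relevant

decay_equal = np.array([1.0, 1.0, 1.0])
decay_ard = np.array([1.0, 1e6, 1e6])            # large decay on inputs 1 and 2

w_equal = map_weights(X, y, decay_equal)
w_ard = map_weights(X, y, decay_ard)
relevance = 1.0 / decay_ard                      # beta = 1/a, as on the slide
```

With the large decays, the weights on inputs 1 and 2 are pushed to essentially zero while the weight on input 0 stays close to 2, which is the behaviour the slide describes for irrelevant inputs.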

Page 13:

Some Strong Points of This Algorithm

Bayesian learning integrates over the posterior distribution for the network parameters, rather than picking a single “optimal” set of parameters. This further helps to avoid overfitting.

ARD can be used to adjust the relevance of input features.

We can use the prior to incorporate external knowledge.

Page 14:

Dirichlet Diffusion Trees

A Bayesian hierarchical clustering method

Page 15:

The methods

BayesNN-small: features selected using significance tests.

BayesNN-large: principal components.

BayesNN-DFT-combo: the class probabilities from a Bayesian neural network and from a Dirichlet diffusion tree method are averaged, then thresholded to produce predictions.
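The combination step for BayesNN-DFT-combo is simple enough to sketch directly. The threshold of 0.5, the ±1 label coding, the function name, and the probabilities below are assumptions made for the example, not values from the slides.

```python
import numpy as np

def combine_and_threshold(p_nn, p_ddt, threshold=0.5):
    """Average two sets of class-1 probabilities, then threshold to get labels."""
    p_avg = 0.5 * (np.asarray(p_nn) + np.asarray(p_ddt))
    labels = np.where(p_avg > threshold, 1, -1)
    return p_avg, labels

p_nn = [0.9, 0.4, 0.2, 0.6]    # made-up neural network probabilities
p_ddt = [0.7, 0.7, 0.1, 0.3]   # made-up diffusion tree probabilities
p_avg, labels = combine_and_threshold(p_nn, p_ddt)
# p_avg is approximately [0.8, 0.55, 0.15, 0.45]; labels are [1, 1, -1, -1]
```

In the second and fourth cases the two models disagree, and the averaged probability decides the label.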

Page 16:

About the datasets

Page 17:

The results

• http://www.nipsfsc.ecs.soton.ac.uk/

Page 18:

Thanks.

Any questions?

