Page 1: Learning with Hypergraphs: Discovery of Higher-Order Interaction Patterns from High-Dimensional Data

Moscow State University, Faculty of Computational Mathematics and Cybernetics, Feb. 22, 2007, Moscow, Russia

Byoung-Tak Zhang

Biointelligence Laboratory
School of Computer Science and Engineering
Brain Science, Cognitive Science, Bioinformatics Programs
Seoul National University
Seoul 151-742, Korea

[email protected]
http://bi.snu.ac.kr/

Page 2: Probabilistic Graphical Models (PGMs)

© 2007, SNU Biointelligence Lab, http://bi.snu.ac.kr/

• Represent the joint probability distribution on some random variables in graphical form.
  • Undirected PGMs
  • Directed PGMs
• Generative: The probability distribution for some variables given values of other variables can be obtained (probabilistic inference).

$$P(A, B, C, D, E) = P(A)\, P(B \mid A)\, P(C \mid A, B)\, P(D \mid A, B, C)\, P(E \mid A, B, C, D)$$
$$= P(A)\, P(B)\, P(C \mid A, B)\, P(D \mid B)\, P(E \mid C)$$

[Figure: directed graph over the nodes A, B, C, D, E corresponding to the second factorization.]

• C and D are independent given B.

• C asserts dependency between A and B.

• B and E are independent given C.
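The factorization on this slide can be checked numerically. The following is a minimal sketch, assuming binary variables and made-up conditional probability tables (none of the numbers come from the talk). It builds the joint distribution from the five local factors, confirms that it sums to one, and evaluates both sides of the statement "C and D are independent given B".

```python
import itertools

# Hypothetical local distributions for binary variables (illustrative numbers only).
p_a = {1: 0.3, 0: 0.7}                                                  # P(A)
p_b = {1: 0.6, 0: 0.4}                                                  # P(B)
p_c1_given_ab = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9}    # P(C=1 | A, B)
p_d1_given_b = {0: 0.2, 1: 0.7}                                         # P(D=1 | B)
p_e1_given_c = {0: 0.3, 1: 0.8}                                         # P(E=1 | C)


def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def joint(a, b, c, d, e):
    """P(A) P(B) P(C|A,B) P(D|B) P(E|C), the factorization on the slide."""
    return (p_a[a] * p_b[b] * bern(p_c1_given_ab[(a, b)], c)
            * bern(p_d1_given_b[b], d) * bern(p_e1_given_c[c], e))


def marginal(**fixed):
    """Sum the joint over all assignments consistent with the fixed values."""
    total = 0.0
    for values in itertools.product([0, 1], repeat=5):
        assignment = dict(zip("abcde", values))
        if all(assignment[name] == v for name, v in fixed.items()):
            total += joint(*values)
    return total


total = marginal()
# Conditional independence check: P(C, D | B) versus P(C | B) P(D | B) at B = 1.
lhs = marginal(b=1, c=1, d=1) / marginal(b=1)
rhs = (marginal(b=1, c=1) / marginal(b=1)) * (marginal(b=1, d=1) / marginal(b=1))
```

Brute-force enumeration is only practical for a handful of variables, but it makes the independence statement easy to verify by hand.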

Page 3: Kinds of Graphical Models

Graphical Models
• Undirected
  • Boltzmann Machines
  • Markov Random Fields
• Directed
  • Bayesian Networks
  • Latent Variable Models
  • Hidden Markov Models
  • Generative Topographic Mapping
  • Non-negative Matrix Factorization

Page 4: Bayesian Networks

• BN = (S, P) consists of a network structure S and a set of local probability distributions P:

$$p(\mathbf{x}) = \prod_{i=1}^{n} p(x_i \mid \mathbf{pa}_i)$$

• Structure can be found by relying on the prior knowledge of causal relationships.

[Figure: BN for detecting credit card fraud]

Page 5: From Bayes Nets to High-Order PGMs

[Figure: three graphs over the nodes F, J, G, S, A, one per model below.]

(1) Naïve Bayes

$$P(F \mid J, G, S, A) = \frac{P(J, G, S, A \mid F)\, P(F)}{P(J, G, S, A)}$$
$$P(J, G, S, A \mid F) = P(J \mid F)\, P(G \mid F)\, P(S \mid F)\, P(A \mid F) = \prod_{x \in \{J, G, S, A\}} P(x \mid F)$$

(2) Bayesian Net

$$P(F, J, G, S, A) = P(G \mid F)\, P(J \mid F)\, P(J \mid A)\, P(J \mid S) = \prod_{x \in \{F, J, G, S, A\}} P(x \mid pa(x))$$

(3) High-Order PGM

$$P(F, J, G, S, A) = P(J, G \mid F)\, P(J, S \mid F)\, P(J, A \mid F)\, P(G, S \mid F)\, P(G, A \mid F)\, P(S, A \mid F) = \prod_{he(x, y)} P(he(x, y) \mid F)$$
$$he(x, y) \in \{(x, y) \mid x, y \in \{J, G, S, A\} \text{ and } x \neq y\}$$

Page 6: The Hypernetworks

Page 7: Hypergraphs

• A hypergraph is an (undirected) graph G whose edges connect a non-null number of vertices, i.e. G = (V, E), where V = {v1, v2, …, vn}, E = {E1, E2, …, En}, and Ei = {vi1, vi2, …, vim}.
• An m-hypergraph consists of a set V of vertices and a subset E of V[m], i.e. G = (V, V[m]), where V[m] is a set of subsets of V whose elements have precisely m members.
• A hypergraph G is said to be k-uniform if every edge Ei in E has cardinality k.
• A hypergraph G is k-regular if every vertex has degree k.
• Rem.: An ordinary graph is a 2-uniform hypergraph.

Page 8: An Example Hypergraph

G = (V, E)
V = {v1, v2, v3, …, v7}
E = {E1, E2, E3, E4, E5}

E1 = {v1, v3, v4}
E2 = {v1, v4}
E3 = {v2, v3, v6}
E4 = {v3, v4, v6, v7}
E5 = {v4, v5, v7}

[Figure: the vertices v1, …, v7 with the hyperedges E1, …, E5 drawn as regions around their member vertices.]
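The example hypergraph maps directly onto sets. The sketch below is one possible representation (the helper names `degree`, `is_k_uniform` and `is_k_regular` are illustrative, not from the talk); it encodes E1 to E5 as given on the slide and checks the uniformity and regularity definitions from the previous slide.

```python
# Example hypergraph from the slide: vertices v1..v7 and hyperedges E1..E5.
V = {"v1", "v2", "v3", "v4", "v5", "v6", "v7"}
E = {
    "E1": {"v1", "v3", "v4"},
    "E2": {"v1", "v4"},
    "E3": {"v2", "v3", "v6"},
    "E4": {"v3", "v4", "v6", "v7"},
    "E5": {"v4", "v5", "v7"},
}


def degree(vertex, edges):
    """Number of hyperedges that contain the vertex."""
    return sum(1 for members in edges.values() if vertex in members)


def is_k_uniform(edges, k):
    """True if every hyperedge has cardinality k."""
    return all(len(members) == k for members in edges.values())


def is_k_regular(vertices, edges, k):
    """True if every vertex has degree k."""
    return all(degree(v, edges) == k for v in vertices)


cardinalities = {name: len(members) for name, members in E.items()}
degrees = {v: degree(v, E) for v in sorted(V)}

# An ordinary graph (a triangle) is a 2-uniform hypergraph.
triangle = {"a": {"p", "q"}, "b": {"q", "r"}, "c": {"p", "r"}}
```

The example hypergraph is neither uniform (its edges have 2, 3 or 4 members) nor regular (v4 lies in four hyperedges, v2 in one), whereas the triangle is both 2-uniform and 2-regular.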

Page 9: Hypernetworks

• A hypernetwork is a hypergraph of weighted edges. It is defined as a triple H = (V, E, W), where V = {v1, v2, …, vn}, E = {E1, E2, …, En}, and W = {w1, w2, …, wn}.
• An m-hypernetwork consists of a set V of vertices and a subset E of V[m], i.e. H = (V, V[m], W), where V[m] is a set of subsets of V whose elements have precisely m members and W is the set of weights associated with the hyperedges.
• A hypernetwork H is said to be k-uniform if every edge Ei in E has cardinality k.
• A hypernetwork H is k-regular if every vertex has degree k.
• Rem.: An ordinary graph is a 2-uniform hypergraph with wi = 1.

[Zhang, DNA-2006]

Page 10: A Hypernetwork

[Figure: a hypernetwork over the vertices x1, x2, …, x15.]

Page 11: Learning with Hypernetworks

Page 12: The Hypernetwork Model of Learning

The hypernetwork is defined as

$$H = (X, S, W)$$
$$X = (x_1, x_2, \ldots, x_I)$$
$$S = \bigcup_i S_i, \quad S_i \subseteq X, \quad k = |S_i|$$
$$W = (W^{(2)}, W^{(3)}, \ldots, W^{(K)})$$

Training set:

$$D = \{\mathbf{x}^{(n)}\}_{n=1}^{N}$$

The energy of the hypernetwork:

$$E(\mathbf{x}^{(n)}; W) = -\frac{1}{2} \sum_{i_1, i_2} w^{(2)}_{i_1 i_2} x^{(n)}_{i_1} x^{(n)}_{i_2} - \frac{1}{6} \sum_{i_1, i_2, i_3} w^{(3)}_{i_1 i_2 i_3} x^{(n)}_{i_1} x^{(n)}_{i_2} x^{(n)}_{i_3} - \cdots$$

The probability distribution:

$$P(\mathbf{x}^{(n)} \mid W) = \frac{1}{Z(W)} \exp\left[-E(\mathbf{x}^{(n)}; W)\right]$$
$$= \frac{1}{Z(W)} \exp\left[\frac{1}{2} \sum_{i_1, i_2} w^{(2)}_{i_1 i_2} x^{(n)}_{i_1} x^{(n)}_{i_2} + \frac{1}{6} \sum_{i_1, i_2, i_3} w^{(3)}_{i_1 i_2 i_3} x^{(n)}_{i_1} x^{(n)}_{i_2} x^{(n)}_{i_3} + \cdots\right]$$
$$= \frac{1}{Z(W)} \exp\left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k}\right],$$

where the partition function is

$$Z(W) = \sum_{\mathbf{x}^{(m)}} \exp\left[\sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(m)}_{i_1} x^{(m)}_{i_2} \cdots x^{(m)}_{i_k}\right].$$

[Zhang, 2006]
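For a handful of binary variables the energy and the probability distribution can be evaluated by brute force. The sketch below is an illustration under two assumptions that are not stated on the slide: each hyperedge is stored once as an unordered index tuple (which plays the role of the 1/c(k) factor when the weights are symmetric and c(k) = k!), and the weight values are made up.

```python
import itertools
import math


def energy(x, weights):
    """Energy of a binary state x.

    `weights` maps an index tuple (one hyperedge) to its weight. Each
    unordered hyperedge appears once; this is an assumption of the sketch.
    """
    return -sum(w * math.prod(x[i] for i in idx) for idx, w in weights.items())


def distribution(weights, n):
    """P(x | W) = exp(-E(x; W)) / Z(W) over all 2**n binary states."""
    states = list(itertools.product([0, 1], repeat=n))
    unnormalized = [math.exp(-energy(x, weights)) for x in states]
    z = sum(unnormalized)  # partition function Z(W)
    return {x: u / z for x, u in zip(states, unnormalized)}


# Hypothetical weights: one order-2 hyperedge and one order-3 hyperedge.
weights = {(0, 1): 1.0, (0, 1, 2): 0.5}
probs = distribution(weights, 3)
```

States that activate positively weighted hyperedges have lower energy and therefore higher probability; here (1, 1, 1) is the most probable state.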

Page 13: Deriving the Learning Rule

The likelihood of the training set factorizes over the examples:

$$P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W) = \prod_{n=1}^{N} P(\mathbf{x}^{(n)} \mid W)$$

The log-likelihood is therefore

$$\ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W) = \ln \prod_{n=1}^{N} P(\mathbf{x}^{(n)} \mid W^{(2)}, W^{(3)}, \ldots, W^{(K)})$$
$$= \sum_{n=1}^{N} \left\{ \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} - \ln Z(W) \right\}$$

The learning rule is obtained from its derivative with respect to a weight:

$$\frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W)$$

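Page 14: Derivation of the Learning Rule

$$\frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} \mid W)$$
$$= \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \sum_{n=1}^{N} \left\{ \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} - \ln Z(W) \right\}$$
$$= \sum_{n=1}^{N} \left\{ \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} - \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \ln Z(W) \right\}$$
$$= \sum_{n=1}^{N} \left\{ x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_s} - \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{P(\mathbf{x} \mid W)} \right\}$$
$$= N \left\{ \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{Data} - \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{P(\mathbf{x} \mid W)} \right\}$$

where

$$\langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{Data} = \frac{1}{N} \sum_{n=1}^{N} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_s}$$
$$\langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{P(\mathbf{x} \mid W)} = \sum_{\mathbf{x}} x_{i_1} x_{i_2} \cdots x_{i_s}\, P(\mathbf{x} \mid W)$$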
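The final line of the derivation, a data average minus a model average, can be checked against a numerical derivative of the log-likelihood. The sketch below does this by exhaustive enumeration for three binary variables. The data, the weights and the convention of listing each unordered hyperedge once (standing in for the 1/c(k) factor) are assumptions made for illustration only.

```python
import itertools
import math


def score(x, weights):
    """Negative energy: sum over hyperedges of w times the product of their x_i."""
    return sum(w * math.prod(x[i] for i in idx) for idx, w in weights.items())


def log_likelihood(data, weights, n):
    """ln P({x^(n)} | W) = sum_n [score(x^(n)) - ln Z(W)]."""
    states = itertools.product([0, 1], repeat=n)
    log_z = math.log(sum(math.exp(score(x, weights)) for x in states))
    return sum(score(x, weights) - log_z for x in data)


def gradient(data, weights, n, idx):
    """N * (<x_i1...x_is>_Data - <x_i1...x_is>_P(x|W)) for the hyperedge idx."""
    states = list(itertools.product([0, 1], repeat=n))
    unnormalized = [math.exp(score(x, weights)) for x in states]
    z = sum(unnormalized)
    model_avg = sum(math.prod(x[i] for i in idx) * u / z
                    for x, u in zip(states, unnormalized))
    data_avg = sum(math.prod(x[i] for i in idx) for x in data) / len(data)
    return len(data) * (data_avg - model_avg)


data = [(1, 1, 0), (1, 0, 1), (1, 1, 1), (0, 0, 0)]   # hypothetical training set
weights = {(0, 1): 0.3, (0, 1, 2): -0.2}               # hypothetical weights
grad_pair = gradient(data, weights, 3, (0, 1))
```

A gradient-ascent step would then move each weight in the direction of its gradient, raising the weight of a hyperedge that fires more often in the data than under the current model.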

Page 15

[Figure: building a hypernetwork over the vertices x1, …, x15 from 4 examples with class label y, in three rounds (Round 1, Round 2, Round 3).]

Training examples:
• Example 1: x1 = x4 = x10 = x12 = 1, all other xi = 0; y = 1
• Example 2: x2 = x3 = x9 = x14 = 1, all other xi = 0; y = 0
• Example 3: x3 = x6 = x8 = x13 = 1, all other xi = 0; y = 1
• Example 4: x8 = x11 = x15 = 1, all other xi = 0; y = 1

Hyperedges shown in the figure:
• From example 1: (x1, x4, x10; y=1), (x1, x4, x12; y=1), (x4, x10, x12; y=1)
• From example 2: (x2, x3, x9; y=0), (x2, x3, x14; y=0), (x3, x9, x14; y=0)
• From example 3: (x3, x6, x8; y=1), (x3, x6, x13; y=1), (x6, x8, x13; y=1)
• From example 4: (x8, x11, x15; y=0)
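The hyperedges in this figure are order-3 subsets of the variables that are active in an example, tagged with the example's label. The following sketch shows one way such sampling could be done; the function name `sample_hyperedges`, the restriction to active variables and the fixed random seed are assumptions of this illustration, not details given in the talk.

```python
import random


def sample_hyperedges(example, label, order, count, rng):
    """Draw `count` random hyperedges of `order` active variables from one example.

    `example` maps variable names to 0/1. Only variables with value 1 are
    candidates, which mirrors the figure (an assumption of this sketch).
    """
    active = sorted(name for name, value in example.items() if value == 1)
    return [(tuple(sorted(rng.sample(active, order))), label) for _ in range(count)]


# Example 1 from the slide: x1, x4, x10 and x12 are 1, everything else is 0, y = 1.
example_1 = {f"x{i}": 0 for i in range(1, 16)}
for name in ("x1", "x4", "x10", "x12"):
    example_1[name] = 1

rng = random.Random(0)  # fixed seed so the sketch is repeatable
edges = sample_hyperedges(example_1, label=1, order=3, count=3, rng=rng)
```

Every sampled hyperedge is a 3-subset of {x1, x4, x10, x12} with label 1, the same form as the hyperedges listed for example 1.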

Page 16: Molecular Self-Assembly of Hypernetworks

[Figure, two panels. "Hypernetwork Representation": a hypernetwork over the vertices X1, …, X8, with hyperedges of the form (xi, xj, y). "Molecular Encoding": a library of labeled hyperedge molecules such as (x1, Class), (x2, Class), (x1, x2, Class), (x1, x3, Class), (x1, x4, Class), (x2, x3, Class), (x1, x2, x4, Class), (x2, x3, x4, Class), …, (x1, xn, Class), each present in multiple copies.]

Page 17: Encoding a Hypernetwork with DNA

a) Collection of (labeled) hyperedges

z1 : (x1=0, x2=1, x3=0, y=1)
z2 : (x1=0, x2=0, x3=1, x4=0, x5=0, y=0)
z3 : (x2=1, x4=1, y=1)
z4 : (x2=1, x3=0, x4=1, y=0)

b) Library of DNA molecules corresponding to (a)

z1 : AAAACCAATTGGAAGGCCATGCGG
z2 : AAAACCAATTCCAAGGGGCCTTCCCCAACCATGCCC
z3 : AATTGGCCTTGGATGCGG
z4 : AATTGGAAGGCCCCTTGGATGCCC

where

x1 : AAAA
x2 : AATT
x3 : AAGG
x4 : CCTT
x5 : CCAA
y : ATGC
0 : CC
1 : GG
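The encoding is a plain concatenation of a 4-base code per variable and a 2-base code per value. A short sketch, using the codebook that can be read off the sequences z1 to z4 (the function names `encode` and `decode` are illustrative):

```python
# Codebook inferred from the slide's sequences: 4 bases per variable, 2 per value.
VARIABLE_CODES = {"x1": "AAAA", "x2": "AATT", "x3": "AAGG",
                  "x4": "CCTT", "x5": "CCAA", "y": "ATGC"}
VALUE_CODES = {0: "CC", 1: "GG"}


def encode(hyperedge):
    """Concatenate variable and value codes for a list of (name, value) pairs."""
    return "".join(VARIABLE_CODES[name] + VALUE_CODES[value] for name, value in hyperedge)


def decode(strand):
    """Inverse of encode: split into 6-base blocks and look each block up."""
    names = {code: name for name, code in VARIABLE_CODES.items()}
    values = {code: value for value, code in VALUE_CODES.items()}
    return [(names[strand[i:i + 4]], values[strand[i + 4:i + 6]])
            for i in range(0, len(strand), 6)]


z1 = [("x1", 0), ("x2", 1), ("x3", 0), ("y", 1)]
z2 = [("x1", 0), ("x2", 0), ("x3", 1), ("x4", 0), ("x5", 0), ("y", 0)]
z3 = [("x2", 1), ("x4", 1), ("y", 1)]
z4 = [("x2", 1), ("x3", 0), ("x4", 1), ("y", 0)]
library = [encode(z) for z in (z1, z2, z3, z4)]
```

Because every block has a fixed length of six bases, a strand can be decoded back into its (variable, value) pairs without separators.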

Page 18: DNA Molecular Computing

[Figure: DNA molecular computing, with the labels Self-assembly, Self-replication, Molecular recognition, Nanostructure, Polymer, and a Heat / Cool / Repeat cycle.]

Page 19: Learning the Hypernetwork (by Molecular Evolution)

[Figure: learning cycle.]
• Library of combinatorial molecules + Example
• Hybridize
• Select the library elements matching the example
• Amplify the matched library elements by PCR
• Next generation

[Zhang, DNA11]

Page 20: Molecular Information Processing

[Video: MP4.avi]

Page 21: The Theory of Bayesian Evolution

• Evolution as a Bayesian inference process
• Evolutionary computation (EC) is viewed as an iterative process of generating the individuals of ever higher posterior probabilities from the priors and the observed data.

[Figure: the distribution over individuals Ai evolving from the prior P0(Ai) at generation 0 to Pg(Ai | D) at generation g.]

[Zhang, CEC-99]

Page 22: Evolutionary Learning Algorithm for Hypernetwork Classifiers

1. Let the hypernetwork H represent the current distribution P(X, Y).
2. Get a training example (x, y).
3. Classify x using H as follows:
   3.1 Extract all molecules matching x into M.
   3.2 From M separate the molecules into classes:
       Extract the molecules with label Y=0 into M0.
       Extract the molecules with label Y=1 into M1.
   3.3 Compute y* = argmax_{Y ∈ {0,1}} |M_Y| / |M|.
4. Update H:
   If y* = y, then H_n ← H_{n-1} + {c(u, v)} for u = x and v = y for (u, v) ∈ H_{n-1}.
   If y* ≠ y, then H_n ← H_{n-1} − {c(u, v)} for u = x and v ≠ y for (u, v) ∈ H_{n-1}.
5. Go to step 2 if not terminated.
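One way to read steps 3 and 4 in software terms is to keep the library as a multiset of labeled hyperedges, where the copy count stands in for the number of molecules. The sketch below follows that reading. The data layout, the amount `delta` of copies added or removed, and the tie-breaking behaviour are assumptions of this illustration rather than details from the talk.

```python
from collections import Counter


def matches(edge, x):
    """A hyperedge matches x if all of its (variable, value) pairs agree with x."""
    return all(x[var] == val for var, val in edge)


def classify(library, x):
    """Step 3: vote with the copy counts of all matching molecules."""
    votes = Counter()
    for (edge, label), copies in library.items():
        if matches(edge, x):
            votes[label] += copies
    if not votes:
        return None  # no molecule matches x
    return max(votes, key=votes.get)  # ties go to the label seen first


def update(library, x, y, delta=1):
    """Step 4: amplify correct matches or remove copies of wrong ones."""
    y_star = classify(library, x)
    for (edge, label), copies in list(library.items()):
        if not matches(edge, x):
            continue
        if y_star == y and label == y:
            library[(edge, label)] = copies + delta
        elif y_star != y and label != y:
            library[(edge, label)] = max(0, copies - delta)
    return y_star


# Hypothetical library: (hyperedge, label) -> number of copies.
edge_a = (("x1", 1), ("x2", 0))
edge_b = (("x1", 1),)
edge_c = (("x3", 1),)
library = Counter({(edge_a, 1): 3, (edge_b, 0): 2, (edge_c, 0): 5})

x = {"x1": 1, "x2": 0, "x3": 0}
first_prediction = update(library, x, y=1)    # correct: edge_a is amplified
second_prediction = update(library, x, y=0)   # wrong: edge_a loses a copy
```

In the molecular setting the same two operations would correspond to amplifying or removing the matched molecules.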

Page 23: Learning with Hypergraphs: Application Results

Page 24: Biological Applications

• DNA-Based Molecular Diagnosis
• MicroRNA-Based Diagnosis
• Aptamer-Based Diagnosis

Page 25: DNA-Based Diagnosis

• Gene expression data: 120 samples from 60 leukemia patients [Cheok et al., Nature Genetics, 2003]
• Training Hypernets with 6-fold validation
• Diagnosis class: ALL/AML

Page 26: Learning Curve

[Figure: Fitness evolution of the population of hyperedges]

Page 27: Order Effects on Learning

[Figure: Fitness curves for runs with fixed-cardinality hyperedges (card = 1, 4, 7, 10)]

Page 28: Aptamer-Based Cardiovascular Disease Diagnosis

Page 29: Training Data

▷ Disease: Cardiovascular Disease (CVD)
▷ Classes: 4 classes [Normal / 1st / 2nd / 3rd Stages]
▷ The number of samples: 135 samples [N: 40 / 1st: 38 / 2nd: 19 / 3rd: 18]
▷ Preprocessing: 3K Aptamer Array → Convert to Real-value → 3K Real-value Data → Feature Selection Using Gain Ratio → 150 Real-value Data → Binarization Using MDL → 150 Boolean Data
▷ Simulation parameter values
  1) Order: 2 ~ 70
  2) Sampling rate: 50
  3) Each case is repeated 10 times and averaged
▷ Classification: Majority voting with the sum of library element weights
▷ Training / test size: Training 108 (80%) / Test 27 (20%)

Page 30: Learning & Classification by Hypernetworks

[Figure: learning and classification pipeline. The source data are binarized into a data set and split into training data and test data. Each example is a Boolean vector with a class label, e.g. (X0=1, X1=1, X2=0, X3=0, X4=1, X5=1, X6=1, X7=0, …, X149=1; C=1). Library elements are sampled from the training data, e.g. (X0=1, X1=1, X2=0, X3=0; C=1), (X0=1, X4=1, X6=1, X7=0; C=1), (X18=1, X35=0, X68=1, X82=0; C=1), (X6=0, X7=0, X8=0, X9=1; C=0), each with initial weight W=1000. In the learning loop (evolution stage) the weights are updated and the learning rate is adjusted, giving updated weights such as W'=1, 45, 4000, 12, 8530, 500, 1300, 4, 14. The library is then tested on the test data. Two small plots, whose axes are unlabeled in the transcript, span 0 to 1000 on the horizontal axis.]

Weight Update Rule (Learning): Error Correction
In case all index-value pairs match:
• If the class is correct, w = w * 1.0001
• Else w = w * 0.95
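The weight update rule and the weighted majority vote can be written in a few lines. This is a minimal sketch: the library layout (a list of dictionaries with `edge`, `cls` and `weight` keys) and the function names are assumptions for illustration, while the constants 1.0001 and 0.95 and the initial weight 1000 are taken from the slide.

```python
def matches(edge, x):
    """True if every index-value pair of the library element agrees with x."""
    return all(x[i] == v for i, v in edge.items())


def update_weight(weight, edge, edge_class, x, true_class,
                  reward=1.0001, penalty=0.95):
    """Error-correction rule: only fully matching elements are updated."""
    if not matches(edge, x):
        return weight
    return weight * (reward if edge_class == true_class else penalty)


def classify(library, x):
    """Majority vote using the summed weights of the matching elements."""
    totals = {}
    for element in library:
        if matches(element["edge"], x):
            totals[element["cls"]] = totals.get(element["cls"], 0.0) + element["weight"]
    return max(totals, key=totals.get) if totals else None


# Hypothetical library with the initial weight W = 1000.
library = [
    {"edge": {0: 1, 1: 1, 2: 0, 3: 0}, "cls": 1, "weight": 1000.0},
    {"edge": {4: 1, 6: 1}, "cls": 0, "weight": 1000.0},
    {"edge": {5: 0}, "cls": 1, "weight": 1000.0},
]
x = {0: 1, 1: 1, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 0}
true_class = 1

for element in library:
    element["weight"] = update_weight(element["weight"], element["edge"],
                                      element["cls"], x, true_class)
prediction = classify(library, x)
```

The asymmetry between the two factors means that a single wrong match costs an element about as much weight as several hundred correct matches gain.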

Page 31: Simulation Result (1/3)

▷ Training & test errors as learning goes on (order k=12)

[Figure: accuracy (75 to 100) versus epoch (0 to 500), with one curve for training and one for test.]

Page 32: Simulation Result (2/3)

▷ Accuracy on test data as learning goes on (order k=12)

[Figure: accuracy (64 to 84) versus epoch (0 to 200), with one curve per order: 2, 4, 8, 12, 16, 20, 30, 40, 50, 60, 70.]

Page 33: Simulation Result (3/3)

▷ The effect of learning

[Figure: accuracy (64 to 84) versus order (0 to 80), comparing "Learning" with "Sampling only".]

Page 34: Mining Cancer-Related MicroRNA Modules from miRNA Expression Profiles

Page 35: Gene Regulation by microRNAs

• MicroRNAs (miRNAs) are endogenous RNAs of about 22 nt that can play important regulatory roles in animals, plants and viruses.
  • Post-transcriptional gene regulation
  • Binding target genes for degradation or translational repression
• Recently, miRNAs have been reported to be related to cancer development and progression.

Page 36: Dataset

• The miRNA expression microarray data
  • The expression profiles of human miRNAs across 11 tumor types: bladder, breast, colon, kidney, lung, pancreas, prostate, uterus, melanoma, mesothelioma and ovary tissue (Lu et al., 2005).
  • This dataset consists of an expression matrix of 151 miRNAs (rows) and 89 samples (columns).

Tissue type     Cancer   Normal
Bladder         1        6
Breast          3        6
Colon           4        7
Kidney          3        4
Lung            2        5
Pancreas        1        8
Prostate        6        6
Uterus          1        10
Melanoma        0        3
Mesothelioma    0        8
Ovary           0        5
All tissues     21       68

Page 37: Representing a Hypernetwork from miRNA Expression Data

[Figure: from data items to a library of classification rules.]

Data items: 151 miRNAs × 89 samples
• Sample 1: X1=1, X2=0, X3=1, X4=1, X5=0, X6=1, …, X151=0; class cancer
• Sample 2: X1=0, X2=0, X3=0, X4=1, X5=0, X6=0, …, X151=1; class normal
• …
• Sample 89: X1=1, X2=0, X3=0, X4=1, X5=0, X6=1, …, X151=1; class cancer

Library (normal or cancer classification rules): for each of the samples 1, 2, …, 89, a set of hyperedges, each pairing two miRNA variables with a class label (cancer or normal).

A hypernetwork H = (X, E, W) of DNA molecules

Page 38: Performance

Leave-one-out cross-validation

Algorithms                               Correct classification rate
Bayesian Network                         79.77 %
Naïve Bayes                              83.15 %
ID3                                      88.76 %
Hypernetworks                            90.00 %
Sequential Minimal Optimization (SMO)    91.01 %
Multi-layer perceptron (MLP)             92.13 %

Page 39: Accuracy vs. Order for Test Data (sampling only)

[Figure: classification ratio (0.2 to 1) versus order (20 to 140).]

Page 40: Learning Curves for Training Data

[Figure: classification ratio (0.8 to 1) versus epoch (0 to 60), with one curve per order: 2, 3, 4, 5, 6, 7.]

Page 41: miRNA Data Mining

miRNA modules related to cancer

Weight         miRNA a          Value   miRNA b        Value
7919.249184    hsa-miR-215      1       hsa-miR-7      1
6787.927872    hsa-miR-194      1       hsa-miR-30d    0
6787.927872    hsa-miR-214      1       hsa-miR-30e    0
6084.600896    hsa-miR-21       1       hsa-miR-321    1
5656.60656     hsa-miR-142-3p   1       hsa-miR-34b    0
5656.60656     hsa-miR-142-3p   1       hsa-miR-96     0
5656.60656     hsa-miR-126      1       hsa-miR-30c    0
5324.025784    hsa-miR-26b      1       hsa-miR-29b    1
5324.025784    hsa-let-7f       1       hsa-miR-9*     1
5324.025784    hsa-miR-224      1       hsa-miR-301    0

miRNAs related to cancer

miRNA            Weight
hsa-miR-155      295972.7
hsa-miR-105      283034.8
hsa-miR-223      280371.4
hsa-miR-21       277609.9
hsa-let-7c       270764.7
hsa-miR-142-3p   266700.1
hsa-miR-29b      263159
hsa-miR-224      260877.3
hsa-miR-183      260877.3
hsa-miR-184      260116.7
hsa-let-7a       256313.8

Page 42: Non-Biological Applications

• Digit Recognition
• Face Classification
• Text Classification
• Movie Title Prediction

Page 43: Digit Recognition: Dataset

• Original data
  • Handwritten digits (0 ~ 9)
  • Training data: 2,630 (263 examples for each class)
  • Test data: 1,130 (113 examples for each class)
• Preprocessing
  • Each example is an 8x8 binary matrix.
  • Each pixel is 0 or 1.

Page 44: Pattern Classification

[Figure, two panels. "Layered" Hypernetwork: an input layer with the variables x1, x2, x3, …, xn; a hidden layer of hyperedge units such as x1, x2, x1x2, x1x3, …, x1…xn; and an output layer with the unit Class, connected to the hidden units through the weights w1, w2, …, wi, wj, …, wm. Probabilistic Library (DNA Representation): a library of labeled hyperedge molecules such as (x1, Class), (x2, Class), (x1, x2, Class), (x1, x3, Class), (x1, x4, Class), (x2, x3, Class), (x1, x2, x4, Class), (x2, x3, x4, Class), …, (x1, xn, Class), each present in multiple copies.]

[Equations on the slide: the number of hidden units m, written as a sum of binomial coefficients over the orders k = 1, …, n, and the weight set W, in which each weight is the number (#) of copies of the corresponding library element.]

Page 45: Simulation Results – without Error Correction

|Train set| = 3760, |Test set| = 1797.

Page 46: Performance Comparison

Methods                            Accuracy
MLP with 37 hidden nodes           0.941
MLP with no hidden nodes           0.901
SVM with polynomial kernel         0.926
SVM with RBF kernel                0.934
Decision Tree                      0.859
Naïve Bayes                        0.885
kNN (k=1)                          0.936
kNN (k=3)                          0.951
Hypernet with learning (k = 10)    0.923
Hypernet with sampling (k = 33)    0.949

Page 47: Error Correction Algorithm

1. Initialize the library as before.
2. maxChangeCnt := librarySize.
3. For i := 0 to iteration_limit
   1. trainCorrectCnt := 0.
   2. Run classification for all training patterns. For each correctly classified pattern, increase trainCorrectCnt.
   3. For each library element
      1. Initialize fitness value to 0.
      2. For each misclassified training pattern, if the library element is matched to that example
         1. If classified correctly, then the fitness of the library element gains 2 points.
         2. Else it loses 1 point.
   4. changeCnt := max{ librarySize * (1.5 * (trainSetSize - trainCorrectCnt) / trainSetSize + 0.01), maxChangeCnt * 0.9 }.
   5. maxChangeCnt := changeCnt.
   6. Delete the changeCnt library elements of lowest fitness and resample library elements whose classes are those of the deleted ones.
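Steps 3.3, 3.4 and 3.6 can be sketched as three small functions. The sketch reproduces the formulas exactly as written on the slide, including the `max`. The data layout (a library element as an `(edge, class)` pair with `edge` a dictionary of index-value pairs) and the reading of "classified correctly" as "the element's class equals the pattern's true class" are assumptions of this illustration.

```python
def fitness(element, misclassified):
    """Step 3.3: +2 for each matched misclassified pattern whose true class
    equals the element's class, -1 for each matched pattern where it differs."""
    edge, cls = element
    points = 0
    for x, true_class in misclassified:
        if all(x[i] == v for i, v in edge.items()):
            points += 2 if cls == true_class else -1
    return points


def change_count(library_size, train_set_size, train_correct_cnt, max_change_cnt):
    """Step 3.4, written exactly as on the slide."""
    error_rate = (train_set_size - train_correct_cnt) / train_set_size
    return max(library_size * (1.5 * error_rate + 0.01), max_change_cnt * 0.9)


def elements_to_replace(library, misclassified, count):
    """Step 3.6 (selection only): the `count` elements of lowest fitness."""
    ranked = sorted(library, key=lambda element: fitness(element, misclassified))
    return ranked[:count]


# Hypothetical library and misclassified training patterns.
library = [({0: 1}, 1), ({1: 1}, 1), ({0: 0}, 0)]
misclassified = [({0: 1, 1: 0}, 1), ({0: 1, 1: 1}, 0), ({0: 0, 1: 1}, 0)]

scores = [fitness(element, misclassified) for element in library]
cnt = change_count(library_size=1000, train_set_size=100,
                   train_correct_cnt=90, max_change_cnt=1000)
worst = elements_to_replace(library, misclassified, 1)
```

Resampling the deleted elements would reuse whatever procedure built the initial library, restricted to examples of the same class as each deleted element.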

Page 48: Simulation Results – with Error Correction

iterationLimit = 37, librarySize = 382,300,

[Figure, two panels. Train: classification ratio (0.9 to 1) versus iteration (0 to 35), with one curve per order: 6, 10, 14, 18, 22, 27. Test: classification ratio (0.87 to 0.93) versus iteration (0 to 35), with one curve per order: 6, 10, 14, 18, 22, 26.]

Page 49: Performance Comparison

Algorithms                      Correct classification rate
Random Forest (f=10, t=50)      94.10 %
KNN (k=4)                       93.49 %
Hypernetwork (Order=26)         92.99 %
AdaBoost (Weak Learner: J48)    91.93 %
SVM (Gaussian Kernel, SMO)      91.37 %
MLP                             90.53 %
Naïve Bayes                     87.26 %
J48                             84.86 %

Page 50: Learning with Hypergraphs: Discovery of Higher-Order Interaction Patterns from High-Dimensional Data Moscow State University, Faculty of Computational.

Face Classification ExperimentsFace Classification Experiments

Page 51: Learning with Hypergraphs: Discovery of Higher-Order Interaction Patterns from High-Dimensional Data Moscow State University, Faculty of Computational.

© 2007, SNU Biointelligence Lab, http://bi.snu.ac.kr/

51

Face Data SetFace Data Set

Yale dataset 15 people 11 images

per personTotal 165

images

Page 52: Learning with Hypergraphs: Discovery of Higher-Order Interaction Patterns from High-Dimensional Data Moscow State University, Faculty of Computational.

© 2007, SNU Biointelligence Lab, http://bi.snu.ac.kr/

52

Training Images of a PersonTraining Images of a Person

10 for training

The remaining 1 for test

Page 53: Learning with Hypergraphs: Discovery of Higher-Order Interaction Patterns from High-Dimensional Data Moscow State University, Faculty of Computational.

© 2007, SNU Biointelligence Lab, http://bi.snu.ac.kr/

53

Bitmaps for Training Data Bitmaps for Training Data (Dimensionality = 480)(Dimensionality = 480)

Page 54: Learning with Hypergraphs: Discovery of Higher-Order Interaction Patterns from High-Dimensional Data Moscow State University, Faculty of Computational.

© 2007, SNU Biointelligence Lab, http://bi.snu.ac.kr/

54

Classification Rate by Leave-One-Out


Classification Rate (Dimensionality = 64 by PCA)


Text Classification Experiments


Text Classification

[Figure: text classification pipeline]

1. Documents
2. Bag-of-words representation
3. Term vectors (term counts per document, e.g. 1 0 0 0 2 0 0 1)
4. Binary term-document matrix (documents d1, d2, d3, ..., dn; example terms: baseball, specs, graphics, hockey, unix, space)
5. DNA-encoded kernel functions (hyperedges over binary term variables with a class label, e.g. x1=0 x2=1 x3=1 y=1)
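The pipeline above (documents, bag-of-words counts, binary term vectors, labelled hyperedges) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the toy documents and labels are invented, the function names are hypothetical, and drawing a hyperedge by sampling a random subset of variables is an assumed reading of how patterns such as "x1=0 x2=1 x3=1 y=1" arise. The DNA encoding itself is not modelled.

```python
import random
from collections import Counter
from typing import List, Tuple

# Toy documents with class labels; invented for illustration only.
docs = [
    ("hockey baseball hockey", 1),
    ("unix graphics specs", 0),
    ("space unix space", 0),
]
# Example terms taken from the slide.
vocab = ["baseball", "specs", "graphics", "hockey", "unix", "space"]

# A hyperedge: ((variable index, value), ...) plus a class label.
Hyperedge = Tuple[Tuple[Tuple[int, int], ...], int]


def term_vector(text: str) -> List[int]:
    """Bag-of-words counts over the fixed vocabulary."""
    counts = Counter(text.split())
    return [counts[t] for t in vocab]


def binarize(vec: List[int]) -> List[int]:
    """1 if the term occurs at least once, else 0."""
    return [1 if c > 0 else 0 for c in vec]


def sample_hyperedge(x: List[int], y: int, order: int,
                     rng: random.Random) -> Hyperedge:
    """Pick `order` variables at random and record their values with the label."""
    idx = sorted(rng.sample(range(len(x)), order))
    return tuple((i, x[i]) for i in idx), y


rng = random.Random(0)
for text, label in docs:
    counts = term_vector(text)
    bits = binarize(counts)
    edge = sample_hyperedge(bits, label, order=3, rng=rng)
    print(counts, bits, edge)
```

For the first toy document the count vector is `[1, 0, 0, 2, 0, 0]` and the binary vector is `[1, 0, 0, 1, 0, 0]`; the sampled hyperedge depends on the random seed.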


Text Classification

Data from Reuters-21578 (‘ACQ’ and ‘EARN’)

Learning curves: averaged over 10 runs


Performance Comparison

‘ACQ’ data (4,724 documents)

‘EARN’ data (7,888 documents)

Higher-dimensional kernel functions can improve the performance further.


Learning from Movie Captions Experiments


Learning Hypernets from Movie Captions

Order: sequential, range 2~3

Corpus: Friends, Prison Break, 24


Learning Hypernets from Movie Captions

Classification

Query generation
- I intend to marry her:
  I ? to marry her
  I intend ? marry her
  I intend to ? her
  I intend to marry ?

Matching
- I ? to marry her
  order 2: I intend, I am, intend to, ...
  order 3: I intend to, intend to marry, ...

Count the number of max-perfect-matching hyperedges
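The query generation and matching steps above can be sketched as follows. This is a simplified illustration, not the authors' procedure: hyperedges are taken to be sequential word tuples of order 2 and 3, completion picks the word backed by the most matching hyperedges, and classification uses a plain match count as a stand-in for the "max-perfect-matching" criterion, whose exact definition the slides do not give. All function names and the two toy corpora "A" and "B" are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

WILDCARD = "?"


def ngrams(words: List[str], order: int) -> List[Tuple[str, ...]]:
    """All sequential word tuples of the given order."""
    return [tuple(words[i:i + order]) for i in range(len(words) - order + 1)]


def build_hypernet(sentences: List[str], orders=(2, 3)) -> Counter:
    """Count sequential hyperedges of each order over a corpus."""
    net: Counter = Counter()
    for s in sentences:
        words = s.lower().split()
        for k in orders:
            net.update(ngrams(words, k))
    return net


def make_queries(sentence: str) -> List[List[str]]:
    """Replace each word in turn with the wildcard."""
    words = sentence.lower().split()
    return [words[:i] + [WILDCARD] + words[i + 1:] for i in range(len(words))]


def complete(query: List[str], net: Counter) -> str:
    """Fill the wildcard with the word supported by the most matching hyperedges."""
    pos = query.index(WILDCARD)
    votes: Counter = Counter()
    for edge, count in net.items():
        k = len(edge)
        # Try every window of length k that covers the wildcard position.
        for start in range(max(0, pos - k + 1), min(pos, len(query) - k) + 1):
            window = query[start:start + k]
            if all(w == WILDCARD or w == e for w, e in zip(window, edge)):
                votes[edge[pos - start]] += count
    return votes.most_common(1)[0][0] if votes else WILDCARD


def classify(sentence: str, nets: Dict[str, Counter], orders=(2, 3)) -> str:
    """Return the corpus whose hypernet matches the most tuples of the sentence."""
    words = sentence.lower().split()
    scores = {name: sum(net[g] for k in orders for g in ngrams(words, k))
              for name, net in nets.items()}
    return max(scores, key=scores.get)


# Toy corpora, invented for illustration.
nets = {
    "A": build_hypernet(["I intend to marry her", "I intend to go"]),
    "B": build_hypernet(["I am here", "who are you"]),
}
query = make_queries("I intend to marry her")[1]
print(" ".join(query))                          # i ? to marry her
print(complete(query, nets["A"]))               # intend
print(classify("I intend to marry her", nets))  # A
```

Unlike the slide's example, `make_queries` also masks the first word, which matches the "? are you" style of query used in the examples table.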


Learning Hypernets from Movie Captions

Completion & Classification Examples

who are you (Corpus: Friends, 24, Prison Break)

Query        Completion      Classification
? are you    what are you    Friends
who ? you    who are you     Friends
who are ?    who are you     Friends

you need to wear it (Corpus: 24, Prison Break, House)

Query                 Completion             Classification
? need to wear it     i need to wear it      24
you ? to wear it      you want to wear it    24
you need ? wear it    you need to wear it    24
you need to ? it      you need to do it      House
you need to wear ?    you need to wear a     24


Conclusion

Hypernetworks are a graphical model that employs higher-order nodes explicitly, allowing a more natural representation for learning higher-order graphical models.

We introduce an evolutionary learning algorithm that uses the high information density and massive parallelism of molecular computing to address the combinatorial explosion problem.

Applied to pattern recognition (and completion) problems in IT and BT, the method achieved performance competitive with conventional ML classifiers.

Why does this work?

It exploits the huge population size available in DNA computing to build an ensemble machine, i.e. a hypernetwork, of simple random hyperedges.

It is a new kind of evolutionary algorithm in which very simple “molecular” operators are applied to a “huge” population of individuals in a “massively parallel” way.

Another potential application of hypernetworks is solving biological problems where the data are given as “wet” DNA or RNA molecules.
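The idea of an ensemble of simple random hyperedges can be illustrated in software. The sketch below is a rough approximation under stated assumptions, not the molecular algorithm described in the talk: the hyperedge representation, the function names `build_library` and `predict`, and the toy XOR data are all invented for illustration, and no evolutionary update of the library is performed.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple

# A hyperedge: ((variable index, value), ...) plus a class label (assumed form).
Hyperedge = Tuple[Tuple[Tuple[int, int], ...], int]


def build_library(X: Sequence[Sequence[int]], y: Sequence[int], order: int,
                  copies: int, rng: random.Random) -> List[Hyperedge]:
    """Sample `copies` random hyperedges of the given order from each example."""
    library: List[Hyperedge] = []
    for x, label in zip(X, y):
        for _ in range(copies):
            idx = sorted(rng.sample(range(len(x)), order))
            library.append((tuple((i, x[i]) for i in idx), label))
    return library


def predict(library: List[Hyperedge], x: Sequence[int], default: int = 0) -> int:
    """Majority vote over the labels of all hyperedges that match x."""
    votes = Counter(label for pattern, label in library
                    if all(x[i] == v for i, v in pattern))
    return votes.most_common(1)[0][0] if votes else default


# Toy data (invented): the label is the XOR of the first two bits.
X = [[0, 0, 1], [0, 1, 0], [1, 0, 1], [1, 1, 0]]
y = [0, 1, 1, 0]
library = build_library(X, y, order=2, copies=20, rng=random.Random(0))
print([predict(library, x) for x in X])
```

Each hyperedge is a weak, partial pattern; the classifier's strength comes only from the number of matching patterns that vote, which is the sense in which a large population matters.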


Acknowledgements

Simulation Experiments
- Joo-Kyoung Kim, Sun Kim, Soo-Jin Kim, Jung-Woo Ha, Chan-Hoon Park, Ha-Young Jang

Collaborating Labs
- Biointelligence Laboratory, Seoul National University
- RNomics Lab, Seoul National University
- DigitalGenomics, Inc.
- GenoProt, Inc.

Supported by
- National Research Lab Program of Min. of Sci. & Tech. (2002-2007)
- Next Generation Tech. Program of Min. of Ind. & Comm. (2000-2010)

More Information at
- http://bi.snu.ac.kr/MEC/
- http://cbit.snu.ac.kr/

