Learning with Hypergraphs: Discovery of Higher-Order Interaction Patterns from High-Dimensional Data

Moscow State University, Faculty of Computational Mathematics and Cybernetics, Feb. 22, 2007, Moscow, Russia
Byoung-Tak Zhang
Biointelligence Laboratory, School of Computer Science and Engineering
Brain Science, Cognitive Science, Bioinformatics Programs, Seoul National University
Seoul 151-742, Korea
[email protected], http://bi.snu.ac.kr/
© 2007, SNU Biointelligence Lab, http://bi.snu.ac.kr/
Probabilistic Graphical Models (PGMs)

Represent the joint probability distribution over a set of random variables in graphical form. Two families: undirected PGMs and directed PGMs.
Generative: the probability distribution of some variables given values of other variables can be obtained by probabilistic inference.
By the chain rule:

P(A, B, C, D, E) = P(A) P(B|A) P(C|A, B) P(D|A, B, C) P(E|A, B, C, D)

Using the independencies encoded in the graph:

P(A, B, C, D, E) = P(A) P(B) P(C|A, B) P(D|B) P(E|C)

[Figure: directed graph with edges A → C, B → C, B → D, C → E]

• C and D are independent given B.
• C asserts dependency between A and B.
• B and E are independent given C.
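The factored joint can be checked numerically; a minimal Python sketch, with made-up CPT values (all probabilities below are illustrative assumptions, not from the slides):

```python
import itertools

# P(C=1|A,B), P(D=1|B), P(E=1|C) as lookup tables; A and B are root nodes.
p_A, p_B = 0.3, 0.6
p_C = {(a, b): 0.9 if a and b else 0.2 for a in (0, 1) for b in (0, 1)}
p_D = {b: 0.5 if b else 0.1 for b in (0, 1)}
p_E = {c: 0.8 if c else 0.3 for c in (0, 1)}

def bern(p1, v):
    """Probability of value v under a Bernoulli with P(1) = p1."""
    return p1 if v == 1 else 1.0 - p1

def joint(a, b, c, d, e):
    # P(A,B,C,D,E) = P(A) P(B) P(C|A,B) P(D|B) P(E|C)
    return (bern(p_A, a) * bern(p_B, b) * bern(p_C[(a, b)], c)
            * bern(p_D[b], d) * bern(p_E[c], e))

# Since each local distribution is normalized, the joint sums to 1
# over all 2^5 assignments.
total = sum(joint(*bits) for bits in itertools.product((0, 1), repeat=5))
```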
Kinds of Graphical Models

Graphical Models
- Undirected: Boltzmann Machines, Markov Random Fields
- Directed: Bayesian Networks, Latent Variable Models, Hidden Markov Models, Generative Topographic Mapping, Non-negative Matrix Factorization
Bayesian Networks

BN = (S, P) consists of a network structure S and a set of local probability distributions P:

p(\mathbf{x}) = \prod_{i=1}^{n} p(x_i \mid \mathbf{pa}_i)

• Structure can be found by relying on prior knowledge of causal relationships.

<BN for detecting credit card fraud>
From Bayes Nets to High-Order PGMs

(1) Naïve Bayes:

P(F \mid J, G, S, A) \propto P(J, G, S, A \mid F)\, P(F)

P(J, G, S, A \mid F) = P(J|F)\, P(G|F)\, P(S|F)\, P(A|F) = \prod_{x \in \{J, G, S, A\}} P(x \mid F)

(2) Bayesian Net:

P(F, J, G, S, A) = P(G|F)\, P(J|F, A, S) \cdots = \prod_{x \in \{F, J, G, S, A\}} P(x \mid pa(x))

(3) High-Order PGM:

P(F, J, G, S, A) = P(J, G|F)\, P(J, S|F)\, P(J, A|F)\, P(G, S|F)\, P(G, A|F)\, P(S, A|F) = \prod_{(x, y)} P(he(x, y) \mid F)

where he(x, y) \in \{(x, y) \mid x, y \in \{J, G, S, A\} \text{ and } x \neq y\}.
The Hypernetworks
Hypergraphs

A hypergraph is an (undirected) graph G whose edges connect a non-null number of vertices, i.e. G = (V, E), where V = {v1, v2, ..., vn}, E = {E1, E2, ..., En}, and Ei = {vi1, vi2, ..., vim}.

An m-hypergraph consists of a set V of vertices and a subset E of V[m], i.e. G = (V, V[m]), where V[m] is a set of subsets of V whose elements have precisely m members.

A hypergraph G is said to be k-uniform if every edge Ei in E has cardinality k.

A hypergraph G is k-regular if every vertex has degree k.

Rem.: An ordinary graph is a 2-uniform hypergraph.
An Example Hypergraph

G = (V, E)
V = {v1, v2, v3, ..., v7}
E = {E1, E2, E3, E4, E5}

E1 = {v1, v3, v4}
E2 = {v1, v4}
E3 = {v2, v3, v6}
E4 = {v3, v4, v6, v7}
E5 = {v4, v5, v7}

[Figure: the hypergraph drawn over vertices v1-v7]
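The example hypergraph is small enough to manipulate directly; a minimal Python sketch (the set-based representation is an illustrative choice, not prescribed by the slides):

```python
# The example hypergraph G = (V, E) as plain Python sets.
V = {f"v{i}" for i in range(1, 8)}
E = {
    "E1": {"v1", "v3", "v4"},
    "E2": {"v1", "v4"},
    "E3": {"v2", "v3", "v6"},
    "E4": {"v3", "v4", "v6", "v7"},
    "E5": {"v4", "v5", "v7"},
}

def degree(v):
    """Number of hyperedges containing vertex v."""
    return sum(v in edge for edge in E.values())

def is_k_uniform(k):
    """True iff every hyperedge has cardinality k."""
    return all(len(edge) == k for edge in E.values())
```

For instance, v4 lies in E1, E2, E4, and E5, so its degree is 4; this hypergraph is not k-uniform for any k because edge cardinalities range from 2 to 4.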
Hypernetworks

A hypernetwork is a hypergraph of weighted edges. It is defined as a triple H = (V, E, W), where V = {v1, v2, ..., vn}, E = {E1, E2, ..., En}, and W = {w1, w2, ..., wn}.

An m-hypernetwork consists of a set V of vertices and a subset E of V[m], i.e. H = (V, V[m], W), where V[m] is a set of subsets of V whose elements have precisely m members and W is the set of weights associated with the hyperedges.

A hypernetwork H is said to be k-uniform if every edge Ei in E has cardinality k.

A hypernetwork H is k-regular if every vertex has degree k.

Rem.: An ordinary graph is a 2-uniform hypergraph with w_i = 1.

[Zhang, DNA-2006]
A Hypernetwork

[Figure: a hypernetwork over vertices x1-x15]
Learning with Hypernetworks
The Hypernetwork Model of Learning

The hypernetwork is defined as

H = (X, S, W)

where X = (x_1, x_2, \ldots, x_n), S = S^{(2)} \cup S^{(3)} \cup \cdots \cup S^{(K)} is the set of hyperedges (S^{(k)} those of cardinality k), and W = \{W^{(2)}, W^{(3)}, \ldots, W^{(K)}\} are the weights.

Training set: D = \{\mathbf{x}^{(n)}\}_{n=1}^{N}.

The energy of the hypernetwork:

E(\mathbf{x}^{(n)}; W) = -\left( \frac{1}{2} \sum_{i_1, i_2} w^{(2)}_{i_1 i_2} x^{(n)}_{i_1} x^{(n)}_{i_2} + \frac{1}{6} \sum_{i_1, i_2, i_3} w^{(3)}_{i_1 i_2 i_3} x^{(n)}_{i_1} x^{(n)}_{i_2} x^{(n)}_{i_3} + \cdots \right)

The probability distribution:

P(\mathbf{x}^{(n)} \mid W) = \frac{1}{Z(W)} \exp\left[ -E(\mathbf{x}^{(n)}; W) \right] = \frac{1}{Z(W)} \exp\left( \sum_{k=2}^{K} c(k) \sum_{i_1, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} \right)

where c(k) is the order-dependent coefficient (c(2) = 1/2, c(3) = 1/6) and the partition function is

Z(W) = \sum_{\mathbf{x}} \exp\left( \sum_{k=2}^{K} c(k) \sum_{i_1, \ldots, i_k} w^{(k)}_{i_1 \ldots i_k} x_{i_1} \cdots x_{i_k} \right)

[Zhang, 2006]
Deriving the Learning Rule

\ln P(\{\mathbf{x}^{(n)}\} \mid W) = \ln \prod_{n=1}^{N} P(\mathbf{x}^{(n)} \mid W^{(2)}, W^{(3)}, \ldots, W^{(K)})

= \sum_{n=1}^{N} \left[ \sum_{k=2}^{K} c(k) \sum_{i_1, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} \cdots x^{(n)}_{i_k} - \ln Z(W) \right]

The learning rule follows from the gradient

\frac{\partial}{\partial w^{(k)}_{i_1 i_2 \ldots i_k}} \ln P(\{\mathbf{x}^{(n)}\} \mid W)
Derivation of the Learning Rule

\frac{\partial}{\partial w^{(k)}_{i_1 \ldots i_k}} \ln P(\{\mathbf{x}^{(n)}\} \mid W)
= \frac{\partial}{\partial w^{(k)}_{i_1 \ldots i_k}} \sum_{n=1}^{N} \left[ \sum_{k'=2}^{K} c(k') \sum_{i_1, \ldots, i_{k'}} w^{(k')}_{i_1 \ldots i_{k'}} x^{(n)}_{i_1} \cdots x^{(n)}_{i_{k'}} - \ln Z(W) \right]
= \sum_{n=1}^{N} \left[ x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} - \sum_{\mathbf{x}} x_{i_1} x_{i_2} \cdots x_{i_k} P(\mathbf{x} \mid W) \right]
= N \left( \langle x_{i_1} x_{i_2} \cdots x_{i_k} \rangle_{Data} - \langle x_{i_1} x_{i_2} \cdots x_{i_k} \rangle_{P(\mathbf{x}|W)} \right)

(the coefficients c(k) absorb the combinatorial multiplicity of the symmetric sums), where

\langle x_{i_1} \cdots x_{i_k} \rangle_{Data} = \frac{1}{N} \sum_{n=1}^{N} x^{(n)}_{i_1} \cdots x^{(n)}_{i_k}

\langle x_{i_1} \cdots x_{i_k} \rangle_{P(\mathbf{x}|W)} = \sum_{\mathbf{x}} x_{i_1} \cdots x_{i_k} P(\mathbf{x} \mid W)
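A toy numerical check of this data-minus-model gradient, computing the model expectation by brute-force enumeration (the variables, data, and zero weights are made up for illustration; enumeration is feasible only for tiny n):

```python
import itertools
import math

def energy(x, weights):
    """weights maps an index tuple (i1,..,ik) -> w; E = -sum w * prod x_i."""
    return -sum(w * math.prod(x[i] for i in idx) for idx, w in weights.items())

def model_expectation(idx, weights, n_vars):
    """<prod_{i in idx} x_i> under P(x|W) = exp(-E)/Z, by enumeration."""
    states = list(itertools.product((0, 1), repeat=n_vars))
    ps = [math.exp(-energy(x, weights)) for x in states]
    Z = sum(ps)
    return sum(p * math.prod(x[i] for i in idx) for x, p in zip(states, ps)) / Z

def gradient(idx, weights, data):
    # sum_n x_{i1}..x_{ik}  -  N * <x_{i1}..x_{ik}>_{P(x|W)}
    data_term = sum(math.prod(x[i] for i in idx) for x in data)
    return data_term - len(data) * model_expectation(idx, weights, len(data[0]))

data = [(1, 1, 0), (1, 1, 1), (0, 1, 0)]      # three 3-bit examples
weights = {(0, 1): 0.0, (1, 2): 0.0}          # all-zero weights: P(x|W) is uniform
g = gradient((0, 1), weights, data)           # data term 2, model term 3 * 0.25
```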
Example: sampling hyperedges over three rounds (Rounds 1-3).

Four 15-bit training examples (active variables listed; all others 0):
- Example 1: x1=1, x4=1, x10=1, x12=1; y=1
- Example 2: x2=1, x3=1, x9=1, x14=1; y=0
- Example 3: x3=1, x6=1, x8=1, x13=1; y=1
- Example 4: x8=1, x11=1, x15=1; y=1

Cardinality-3 hyperedges sampled from each example and labeled with its class:
- From example 1: {x1, x4, x10 | y=1}, {x1, x4, x12 | y=1}, {x4, x10, x12 | y=1}
- From example 2: {x2, x3, x9 | y=0}, {x2, x3, x14 | y=0}, {x3, x9, x14 | y=0}
- From example 3: {x3, x6, x8 | y=1}, {x3, x6, x13 | y=1}, {x3, x8, x13 | y=1}
- From example 4: {x8, x11, x15 | y=1}
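The sampling step illustrated here can be sketched in Python (uniform sampling of k distinct active variables is an illustrative assumption):

```python
import random

def sample_hyperedges(example, label, k=3, n_samples=3, rng=random):
    """example: dict var_name -> {0,1}. Returns a list of (frozenset, label)
    pairs, each hyperedge drawn from the example's active variables."""
    active = [v for v, val in example.items() if val == 1]
    edges = []
    for _ in range(n_samples):
        picked = rng.sample(active, k)   # k distinct active variables
        edges.append((frozenset(picked), label))
    return edges

# Example 1 from the slide: x1, x4, x10, x12 active, class y = 1.
example1 = {f"x{i}": int(i in (1, 4, 10, 12)) for i in range(1, 16)}
edges = sample_hyperedges(example1, label=1)
```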
Molecular Self-Assembly of Hypernetworks

[Figure: hypernetwork representation over variables X1-X8, with hyperedges of the form (xi, xj, y), and its molecular encoding: a library of duplicated labeled elements such as (x1 x3 → Class), (x1 x2 x4 → Class), (x2 x3 → Class), (x1 x4 → Class), (x2 x3 x4 → Class), where copy numbers encode weights]
Encoding a Hypernetwork with DNA

a) Collection of (labeled) hyperedges:

z1 : (x1=0, x2=1, x3=0, y=1)
z2 : (x1=0, x2=0, x3=1, x4=0, x5=0, y=0)
z3 : (x2=1, x4=1, y=1)
z4 : (x2=1, x3=0, x4=1, y=0)

b) Library of DNA molecules corresponding to (a):

z1 : AAAACCAATTGGAAGGCCATGCGG
z2 : AAAACCAATTCCAAGGGGCCTTCCCCAACCATGCCC
z3 : AATTGGCCTTGGATGCGG
z4 : AATTGGAAGGCCCCTTGGATGCCC

where x1 = AAAA, x2 = AATT, x3 = AAGG, x4 = CCTT, x5 = CCAA, y = ATGC, and the values 0 and 1 are encoded by CC and GG.
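Aligning (a) with (b) gives the codeword scheme directly: each variable maps to a 4-mer, each binary value to a 2-mer. A minimal encoder sketch in Python:

```python
# Codewords read off from the slide: variable 4-mers and value 2-mers.
VAR = {"x1": "AAAA", "x2": "AATT", "x3": "AAGG",
       "x4": "CCTT", "x5": "CCAA", "y": "ATGC"}
VAL = {0: "CC", 1: "GG"}

def encode(hyperedge):
    """hyperedge: ordered list of (variable, value) pairs -> DNA string."""
    return "".join(VAR[var] + VAL[val] for var, val in hyperedge)

z1 = encode([("x1", 0), ("x2", 1), ("x3", 0), ("y", 1)])
z3 = encode([("x2", 1), ("x4", 1), ("y", 1)])
```

Both outputs reproduce the library strings in (b), which is how the codeword table was recovered.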
DNA Molecular Computing

- Self-assembly (heat / cool cycles, repeated)
- Polymer formation
- Self-replication
- Molecular recognition
- Nanostructures
Learning the Hypernetwork (by Molecular Evolution)

Library of combinatorial molecules + example:
1. Hybridize the library with the example.
2. Select the library elements matching the example.
3. Amplify the matched library elements by PCR.
4. Next generation.

[Zhang, DNA11]
Molecular Information Processing

[Video: MP4.avi]
The Theory of Bayesian Evolution

P_0(A_i) → P_1(A_i | D) → ... → P_g(A_i | D)
(generation 0 → generation g)

Evolution as a Bayesian inference process: evolutionary computation (EC) is viewed as an iterative process of generating individuals of ever higher posterior probability from the priors and the observed data.

[Zhang, CEC-99]
Evolutionary Learning Algorithm for Hypernetwork Classifiers

1. Let the hypernetwork H represent the current distribution P(X, Y).
2. Get a training example (x, y).
3. Classify x using H as follows:
   3.1 Extract all molecules matching x into M.
   3.2 From M, separate the molecules into classes:
       extract the molecules with label Y=0 into M0,
       extract the molecules with label Y=1 into M1.
   3.3 Compute y* = argmax_{Y∈{0,1}} |M_Y| / |M|.
4. Update H:
   If y* = y, then H_n ← H_{n-1} + {c(u, v)} for u ⊂ x and v = y, for (u, v) ∈ H_{n-1}.
   If y* ≠ y, then H_n ← H_{n-1} − {c(u, v)} for u ⊂ x and v ≠ y, for (u, v) ∈ H_{n-1}.
5. Go to step 2 if not terminated.
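The classification step (3.1-3.3) can be sketched in Python; representing molecules as (dict, label) pairs, with duplicates standing in for molecule counts, is an illustrative choice:

```python
def matches(hyperedge, x):
    """hyperedge: dict var -> value; x: dict var -> value (full assignment)."""
    return all(x[v] == val for v, val in hyperedge.items())

def classify(library, x):
    """library: list of (hyperedge, y) molecules; duplicates encode counts.
    Returns y* = argmax_Y |M_Y| / |M|, or None when nothing matches."""
    M = [(h, y) for h, y in library if matches(h, x)]      # step 3.1
    if not M:
        return None
    counts = {0: 0, 1: 0}                                   # step 3.2
    for _, y in M:
        counts[y] += 1
    return max(counts, key=counts.get)                      # step 3.3

library = [({"x1": 1, "x2": 0}, 1), ({"x1": 1, "x3": 1}, 1), ({"x2": 1}, 0)]
y_star = classify(library, {"x1": 1, "x2": 0, "x3": 1})
```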
Learning with Hypergraphs: Application Results
Biological Applications

DNA-Based Molecular Diagnosis / MicroRNA-Based Diagnosis / Aptamer-Based Diagnosis
DNA-Based Diagnosis

Gene expression data: 120 samples from 60 leukemia patients [Cheok et al., Nature Genetics, 2003].
Training hypernetworks with 6-fold validation.
Class: ALL/AML.
Learning Curve

Fitness evolution of the population of hyperedges.
Order Effects on Learning

Fitness curves for runs with fixed-cardinality hyperedges (card = 1, 4, 7, 10).
Aptamer-Based Cardiovascular Disease Diagnosis
Training Data

▷ Disease: Cardiovascular Disease (CVD)
▷ Classes: 4 classes [Normal / 1st / 2nd / 3rd stage]
▷ Number of samples: 135 [N: 40 / 1st: 38 / 2nd: 19 / 3rd: 18]
▷ Preprocessing: 3K aptamer array → converted to real values (3K real-valued data) → feature selection using gain ratio (150 real-valued features) → binarization using MDL (150 Boolean features)
▷ Simulation parameters: 1) order: 2-70; 2) sampling rate: 50; 3) each setting repeated 10 times and averaged
▷ Classification: majority voting with the sum of library element weights
▷ Training / test size: training 108 (80%) / test 27 (20%)
Learning & Classification by Hypernetworks

Pipeline: source data → binarization → training data / test data; a library is sampled from the training data; library weights are updated in a learning loop (evolution stage, adjusting the learning rate); the weighted library is then used on the test data.

[Figure: binarized training and test examples over X0-X149 with class C; sampled library elements, each starting at weight W = 1000 and ending with updated weights W' after learning; accuracy curves over 1,000 epochs]

Weight update rule (learning): error correction. When all index-values of a library element match the example:
If the class is correct, w ← w × 1.0001,
else w ← w × 0.95.
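The multiplicative update rule stated above, as a short sketch:

```python
def update_weight(w, matched, correct):
    """Error-correction update: applied only on a full index-value match.
    Correct class: small reward (x1.0001); wrong class: penalty (x0.95)."""
    if not matched:
        return w
    return w * 1.0001 if correct else w * 0.95

w = 1000.0
w = update_weight(w, matched=True, correct=False)   # wrong class -> 950.0
```

The asymmetry (tiny reward, large penalty) keeps frequently wrong library elements decaying quickly while barely inflating the rest.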
Simulation Result (1/3)

▷ Training & test accuracy as learning goes on (order k=12)

[Figure: accuracy (75-100%) vs. epoch (0-500) for training and test sets]
Simulation Result (2/3)

▷ Accuracy on test data as learning goes on

[Figure: accuracy (64-84%) vs. epoch (0-200), one curve per order (2, 4, 8, 12, 16, 20, 30, 40, 50, 60, 70)]
Simulation Result (3/3)

▷ The effect of learning

[Figure: accuracy (64-84%) vs. order (0-80), comparing learning against sampling only]
Mining Cancer-Related MicroRNA Modules from miRNA Expression Profiles
Gene Regulation by microRNAs

MicroRNAs (miRNAs) are endogenous RNAs of about 22 nt that can play important regulatory roles in animals, plants, and viruses.
Post-transcriptional gene regulation: binding target genes for degradation or translational repression.
Recently, miRNAs have been reported to be related to cancer development and progression.
Dataset

The miRNA expression microarray data: expression profiles of miRNA in human across 11 tumor types (bladder, breast, colon, kidney, lung, pancreas, prostate, uterus, melanoma, mesothelioma, and ovary tissue) (Lu et al., 2005).
This dataset consists of an expression matrix of 151 miRNAs (rows) and 89 samples (columns).
Tissue type    Cancer  Normal
Bladder        1       6
Breast         3       6
Colon          4       7
Kidney         3       4
Lung           2       5
Pancreas       1       8
Prostate       6       6
Uterus         1       10
Melanoma       0       3
Mesothelioma   0       8
Ovary          0       5
All tissues    21      68
Representing a Hypernetwork from miRNA Expression Data

Data items: 89 samples, each a binary vector over 151 miRNAs with a class label, e.g.
sample 1: x1=1, x2=0, x3=1, x4=1, x5=0, x6=1, ..., x151=0, class = cancer
sample 2: x1=0, x2=0, x3=0, x4=1, x5=0, x6=0, ..., x151=1, class = normal
...
sample 89: x1=1, x2=0, x3=0, x4=1, x5=0, x6=1, ..., x151=1, class = cancer

Library (normal or cancer classification rules): pairs of miRNA variables sampled from each data item and labeled with its class.

A hypernetwork H = (X, E, W) of DNA molecules.
Performance

Leave-one-out cross-validation

Algorithms                              Correct classification rate
Bayesian Network                        79.77 %
Naïve Bayes                             83.15 %
ID3                                     88.76 %
Hypernetworks                           90.00 %
Sequential Minimal Optimization (SMO)   91.01 %
Multi-layer perceptron (MLP)            92.13 %
Accuracy vs. Order for Test Data (sampling only)

[Figure: classification ratio (0.2-1.0) vs. order (20-140)]
Learning Curves for Training Data

[Figure: classification ratio (0.8-1.0) vs. epoch (0-60), one curve per order (2-7)]
miRNA Data Mining

miRNA modules related to cancer:

Weight        miRNA module (a, b)
7919.249184   hsa-miR-215 (1), hsa-miR-7 (1)
6787.927872   hsa-miR-194 (1), hsa-miR-30d (0)
6787.927872   hsa-miR-214 (1), hsa-miR-30e (0)
6084.600896   hsa-miR-21 (1), hsa-miR-321 (1)
5656.60656    hsa-miR-142-3p (1), hsa-miR-34b (0)
5656.60656    hsa-miR-142-3p (1), hsa-miR-96 (0)
5656.60656    hsa-miR-126 (1), hsa-miR-30c (0)
5324.025784   hsa-miR-26b (1), hsa-miR-29b (1)
5324.025784   hsa-let-7f (1), hsa-miR-9* (1)
5324.025784   hsa-miR-224 (1), hsa-miR-301 (0)

miRNAs related to cancer:

miRNAs           Weight
hsa-miR-155      295972.7
hsa-miR-105      283034.8
hsa-miR-223      280371.4
hsa-miR-21       277609.9
hsa-let-7c       270764.7
hsa-miR-142-3p   266700.1
hsa-miR-29b      263159
hsa-miR-224      260877.3
hsa-miR-183      260877.3
hsa-miR-184      260116.7
hsa-let-7a       256313.8
Non-Biological Applications

Digit Recognition / Face Classification / Text Classification / Movie Title Prediction
Digit Recognition: Dataset

Original data: handwritten digits (0-9). Training data: 2,630 (263 examples per class). Test data: 1,130 (113 examples per class).
Preprocessing: each example is an 8x8 binary matrix; each pixel is 0 or 1.
Pattern Classification

A "layered" hypernetwork: an input layer x1, ..., xn; a hidden layer of weighted hyperedges (x1x2, x1x3, ..., x1...xn with weights w1, w2, ..., wm); and an output layer producing the Class. The weight of a hyperedge is realized by its number of copies in a probabilistic library of DNA molecules, i.e. the library holds duplicated elements such as (x1 → Class), (x2 → Class), (x1 x2 → Class), (x1 x3 → Class), ..., (x1 ... xn → Class).

[Figure: the layered hypernetwork and the corresponding probabilistic library (DNA representation)]
Simulation Results – without Error Correction

|Train set| = 3,760, |Test set| = 1,797.
Performance Comparison
Methods Accuracy
MLP with 37 hidden nodes 0.941
MLP with no hidden nodes 0.901
SVM with polynomial kernel 0.926
SVM with RBF kernel 0.934
Decision Tree 0.859
Naïve Bayes 0.885
kNN (k=1) 0.936
kNN (k=3) 0.951
Hypernet with learning (k = 10) 0.923
Hypernet with sampling (k = 33) 0.949
Error Correction Algorithm

1. Initialize the library as before.
2. maxChangeCnt := librarySize.
3. For i := 0 to iteration_limit:
   1. trainCorrectCnt := 0.
   2. Run classification for all training patterns. For each correctly classified pattern, increase trainCorrectCnt.
   3. For each library element:
      1. Initialize its fitness value to 0.
      2. For each misclassified training pattern, if the library element matches that pattern:
         1. If it classified the pattern correctly, its fitness gains 2 points.
         2. Else it loses 1 point.
   4. changeCnt := max{ librarySize * (1.5 * (trainSetSize - trainCorrectCnt) / trainSetSize + 0.01), maxChangeCnt * 0.9 }.
   5. maxChangeCnt := changeCnt.
   6. Delete the changeCnt library elements of lowest fitness and resample library elements whose classes are those of the deleted ones.
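Step 4's replacement count can be sketched directly from the pseudocode (the names follow the slide; the example numbers are made up):

```python
def change_count(library_size, train_set_size, train_correct_cnt, max_change_cnt):
    """How many lowest-fitness library elements to replace this iteration:
    an error-proportional term, floored by 0.9x the previous iteration's cap."""
    error_term = library_size * (1.5 * (train_set_size - train_correct_cnt)
                                 / train_set_size + 0.01)
    return max(error_term, max_change_cnt * 0.9)

# Example: 1000 elements, 180 of 200 training patterns correct, previous cap 400.
# error_term = 1000 * (1.5 * 20/200 + 0.01) = 160; max(160, 360) = 360.
cnt = change_count(1000, 200, 180, 400)
```

The max with 0.9 * maxChangeCnt makes the replacement count decay gradually instead of collapsing as soon as training accuracy improves.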
Simulation Results – with Error Correction

iterationLimit = 37, librarySize = 382,300.
[Figures: classification ratio vs. iteration (0-35), one curve per order (6, 10, 14, 18, 22, 26/27): training data (0.90-1.00) and test data (0.87-0.93)]
Performance Comparison

Algorithms                      Correct classification rate
Random Forest (f=10, t=50)      94.10 %
kNN (k=4)                       93.49 %
Hypernetwork (order=26)         92.99 %
AdaBoost (weak learner: J48)    91.93 %
SVM (Gaussian kernel, SMO)      91.37 %
MLP                             90.53 %
Naïve Bayes                     87.26 %
J48                             84.86 %
Face Classification Experiments
Face Data Set

Yale dataset: 15 people, 11 images per person, 165 images in total.
Training Images of a Person

10 images for training; the remaining 1 for test.
Bitmaps for Training Data (Dimensionality = 480)
Classification Rate by Leave-One-Out
Classification Rate (Dimensionality = 64 by PCA)
Text Classification Experiments
Text Classification

1. Documents
2. Bag-of-words representation
3. Term vectors (e.g. 1 0 0 0 2 0 0 1)
4. Binary term-document matrix over terms (baseball, specs, graphics, hockey, unix, space) and documents d1, d2, d3, ..., dn
5. DNA-encoded kernel functions: labeled hyperedges over term variables, e.g. (x1=0, x2=1, x3=1, y=1), (x1=0, x2=0, x3=1, y=0), (x2=1, x3=0, y=0), ...

[Figure: the document-to-DNA encoding pipeline]
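Steps 2-4 of the pipeline can be sketched in a few lines (the example documents and terms are made up):

```python
# Documents -> bag of words -> binary term vectors -> term-document matrix.
docs = ["the unix box renders graphics",
        "hockey and baseball scores",
        "unix space probe"]
vocab = sorted({w for d in docs for w in d.split()})

def binary_vector(doc):
    """1 if the term occurs in the document, 0 otherwise (presence, not counts)."""
    words = set(doc.split())
    return [int(t in words) for t in vocab]

matrix = [binary_vector(d) for d in docs]   # one row per document
```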
Text Classification

Data from Reuters-21578 ('ACQ' and 'EARN').
Learning curves: average over 10 runs.
Performance Comparison

'ACQ' data (4,724 documents); 'EARN' data (7,888 documents).

Higher-dimensional kernel functions can improve the performance further.
Learning from Movie Captions Experiments
Learning Hypernets from Movie Captions

Order: sequential, range 2-3.
Corpus: Friends, Prison Break, 24.
Learning Hypernets from Movie Captions

[Figure-only slides]
Classification: query generation
- "I intend to marry her" : I ? to marry her / I intend ? marry her / I intend to ? her / I intend to marry ?
Matching
- "I ? to marry her" → order 2: I intend, I am, intend to, ...; order 3: I intend to, intend to marry, ...
Count the number of max-perfect-matching hyperedges.
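The query-generation and matching scheme can be sketched as follows (scoring candidates by summed n-gram counts is an illustrative simplification of max-perfect-matching counting; the toy corpus is made up):

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_library(corpus_sentences):
    """Hyperedges are word n-grams of order 2-3; counts act as weights."""
    lib = Counter()
    for s in corpus_sentences:
        toks = s.split()
        for n in (2, 3):
            lib.update(ngrams(toks, n))
    return lib

def complete(query_tokens, blank, library):
    """Fill the '?' at position `blank` with the word best supported by
    n-grams matching the query everywhere except the blank."""
    scores = Counter()
    for gram, cnt in library.items():
        n = len(gram)
        for start in range(max(0, blank - n + 1), blank + 1):
            if start + n > len(query_tokens):
                continue
            window = query_tokens[start:start + n]
            if all(g == w for g, w in zip(gram, window) if w != "?"):
                scores[gram[blank - start]] += cnt
    return scores.most_common(1)[0][0] if scores else None

lib = build_library(["i intend to marry her", "i am here", "you intend to go"])
word = complete(["i", "?", "to", "marry", "her"], 1, lib)   # -> "intend"
```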
Learning Hypernets from Movie Captions

Completion & Classification Examples

Query: who are you (Corpus: Friends, 24, Prison Break)
  ? are you   → what are you   → Friends
  who ? you   → who are you    → Friends
  who are ?   → who are you    → Friends

Query: you need to wear it (Corpus: 24, Prison Break, House)
  ? need to wear it    → i need to wear it     → 24
  you ? to wear it     → you want to wear it   → 24
  you need ? wear it   → you need to wear it   → 24
  you need to ? it     → you need to do it     → House
  you need to wear ?   → you need to wear a    → 24
Conclusion

Hypernetworks are a graphical model that employs higher-order nodes explicitly, allowing a more natural representation for learning higher-order graphical models.

We introduce an evolutionary learning algorithm that makes use of the high information density and massive parallelism of molecular computing to address the combinatorial explosion problem.

Applied to pattern recognition (and completion) problems in IT and BT; obtained performance competitive with conventional ML classifiers.

Why does this work?
- It exploits the huge population size available in DNA computing to build an ensemble machine, i.e. a hypernetwork, of simple random hyperedges.
- It is a new kind of evolutionary algorithm in which very simple "molecular" operators are applied to a "huge" population of individuals in a "massively parallel" way.

Another potential of hypernetworks is their application to biological problems where the data are given as "wet" DNA or RNA molecules.
Acknowledgements

Simulation experiments: Joo-Kyoung Kim, Sun Kim, Soo-Jin Kim, Jung-Woo Ha, Chan-Hoon Park, Ha-Young Jang
Collaborating labs:
- Biointelligence Laboratory, Seoul National University
- RNomics Lab, Seoul National University
- DigitalGenomics, Inc.
- GenoProt, Inc.
Supported by:
- National Research Lab Program of Min. of Sci. & Tech. (2002-2007)
- Next Generation Tech. Program of Min. of Ind. & Comm. (2000-2010)
More information at:
- http://bi.snu.ac.kr/MEC/
- http://cbit.snu.ac.kr/