Gene-Disease Associations Based on Networks

Presenter: Li Jinjin (李金金)

Posted on 17-Jan-2016

Contents

1. Background

2. Heterogeneous network

3. Methods

Background

Correctly identifying gene-disease associations has long been a goal in biology.

Identifying these associations has contributed to better medical care and to a better understanding of gene functions and interactions.

Clinical diseases are characterized by distinct phenotypes. To identify disease genes, the relationship between genes and phenotypes must therefore be taken into account.

Background

[Figure: diagram linking Gene, Phenotype, and Disease, with the labels "Association" and "Problems"]

Construction of the heterogeneous network

Gene network based on HPRD

[Figure: gene network over the nodes g1 to g7, represented by the adjacency matrix $A_G$]

Construction of the heterogeneous network

Phenotype network using MimMiner

[Figure: phenotype network over the nodes p1, p2, p4, and p5, represented by the adjacency matrix $A_P$]

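Construction of the heterogeneous network

Gene-Phenotype network based on OMIM

[Figure: bipartite network linking the phenotype nodes p1, p2, p4, and p5 to the gene nodes g1 to g7, represented by the matrix $B$]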

Construction of the heterogeneous network

$A_P$ ($m \times m$), $A_G$ ($n \times n$), $B$ ($n \times m$)

$$A = \begin{bmatrix} A_G & B \\ B^T & A_P \end{bmatrix}$$
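As a minimal sketch of how the three matrices fit together, the snippet below assembles the block matrix from small toy inputs. The sizes (4 genes, 3 phenotypes) and all entries are made up for illustration and do not come from the databases named in the slides. Placing genes on the rows of B is an assumption.

```python
import numpy as np

# Toy inputs (hypothetical values, for illustration only).
A_G = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)   # n x n gene-gene network
A_P = np.array([[0.0, 0.6, 0.1],
                [0.6, 0.0, 0.3],
                [0.1, 0.3, 0.0]])             # m x m phenotype similarity
B = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 0],
              [0, 1, 1]], dtype=float)        # n x m gene-phenotype links

# Heterogeneous adjacency matrix: gene nodes first, then phenotype nodes.
A = np.block([[A_G, B],
              [B.T, A_P]])

print(A.shape)              # (7, 7)
print(np.allclose(A, A.T))  # True
```

Because the off-diagonal blocks are B and its transpose, the combined matrix stays symmetric whenever the gene and phenotype networks are symmetric.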

Methods

Katz

RWRH

Prince

GeneWalker

CIPHER

CATAPULT

Methods

Katz has been successfully applied to link prediction in social networks.

Methods

CATAPULT is a supervised learning method. Features are derived from hybrid walks through the heterogeneous network.

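Katz

[Figure: example network over the nodes g1 to g6]

$$A = \begin{bmatrix}
0 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 1 & 0 \\
1 & 1 & 1 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0
\end{bmatrix}$$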

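Katz

[Figure: example paths between nodes of the network g1 to g6]

$$A^2 = \begin{bmatrix}
3 & 1 & 2 & 2 & 2 & 1 \\
1 & 2 & 1 & 1 & 1 & 1 \\
2 & 1 & 3 & 2 & 2 & 1 \\
2 & 1 & 2 & 5 & 2 & 1 \\
2 & 1 & 2 & 2 & 3 & 1 \\
1 & 1 & 1 & 1 & 1 & 2
\end{bmatrix}$$

$A^3, A^4, A^5, \dots$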
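Entry (i, j) of the l-th power of A counts the walks of length l from node i to node j, which is why the powers of A appear here. The check below recomputes the squared matrix with NumPy, assuming the rows and columns of the slide's matrix follow the order g1 to g6.

```python
import numpy as np

# Adjacency matrix of the six-node example network (order g1..g6 assumed).
A = np.array([[0, 0, 1, 1, 1, 0],
              [0, 0, 0, 1, 0, 1],
              [1, 0, 0, 1, 1, 0],
              [1, 1, 1, 0, 1, 1],
              [1, 0, 1, 1, 0, 0],
              [0, 1, 0, 1, 0, 0]])

A2 = A @ A                          # walks of length 2
A3 = np.linalg.matrix_power(A, 3)   # walks of length 3

# The diagonal of A^2 is each node's degree: every edge gives one
# out-and-back walk of length 2.
print(np.diag(A2))   # [3 2 3 5 3 2]

# g1 and g2 are not adjacent, but they share one neighbour (g4).
print(A[0, 1], A2[0, 1])   # 0 1
```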

Katz

How to get the similarity matrix?

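Katz measure:

$$S_{ij} = \sum_{l=1}^{k} \beta_l \, (A^l)_{ij}, \qquad \beta_l \ge 0$$

With $\beta_l = \beta^l$ and $k \to \infty$:

$$S^{katz} = \sum_{l \ge 1} \beta^l A^l = (I - \beta A)^{-1} - I, \qquad \beta < \frac{1}{\|A\|_2}$$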

Small values of k (k=3 or k=4) are known to yield competitive performance in the task of recommending similar nodes.
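The Katz measure can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the weights are taken to be powers of a single damping factor (called `beta` here), the example matrix is the six-node network from the earlier slide, and `beta` is set to half of the reciprocal of the spectral norm of A so that the infinite series converges. The function names are made up for this sketch.

```python
import numpy as np

def katz_truncated(A, beta, k):
    """Sum of beta^l * A^l for l = 1..k (truncated Katz similarity)."""
    S = np.zeros(A.shape)
    power = np.eye(A.shape[0])
    for l in range(1, k + 1):
        power = power @ A          # power now holds A^l
        S += beta ** l * power
    return S

def katz_closed_form(A, beta):
    """Limit of the series for beta < 1 / ||A||_2: (I - beta*A)^-1 - I."""
    identity = np.eye(A.shape[0])
    return np.linalg.inv(identity - beta * A) - identity

A = np.array([[0, 0, 1, 1, 1, 0],
              [0, 0, 0, 1, 0, 1],
              [1, 0, 0, 1, 1, 0],
              [1, 1, 1, 0, 1, 1],
              [1, 0, 1, 1, 0, 0],
              [0, 1, 0, 1, 0, 0]], dtype=float)

beta = 0.5 / np.linalg.norm(A, 2)   # spectral norm; keeps the series convergent
S3 = katz_truncated(A, beta, k=3)   # small k, as suggested on the slide

# With many terms the truncated sum approaches the closed form.
print(np.allclose(katz_truncated(A, beta, 60), katz_closed_form(A, beta)))  # True
```

Even with k = 3, non-adjacent nodes such as g1 and g2 receive a positive score, because longer walks between them contribute to the sum.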

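Katz on the heterogeneous network

Adjacency matrix of the heterogeneous network:

$$A = \begin{bmatrix} A_G & B \\ B^T & A_P \end{bmatrix}$$

$A_G$: the gene-gene network

$B$: the bipartite network between genes and phenotypes

$A_{P_{Hs}}$: the similarity matrix of human diseases

$A_{P_S}$: the similarity matrix of phenotypes of other species

$$B = \begin{bmatrix} B_{Hs} & B_S \end{bmatrix}, \qquad A_P = \begin{bmatrix} A_{P_{Hs}} & 0 \\ 0 & A_{P_S} \end{bmatrix}$$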

Katz on the heterogeneous network

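Katz similarity measure specialized to $A$:

$$(S^{Katz}(A))_{ij} = \sum_{l=1}^{k} \beta^l (A^l)_{ij}$$

For $k = 3$, the similarities between gene nodes and human disease nodes can be denoted by $S^{Katz}_{Hs}(A)$:

$$S^{Katz}_{Hs}(A) = \beta B_{Hs} + \beta^2 \left( A_G B_{Hs} + B_{Hs} A_{P_{Hs}} \right) + \beta^3 \left( B B^T B_{Hs} + A_G^2 B_{Hs} + A_G B_{Hs} A_{P_{Hs}} + B_{Hs} A_{P_{Hs}}^2 \right)$$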
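The block expression for k = 3 can be checked numerically against the gene-by-human-disease block of the full matrix powers. The sketch below uses random toy matrices; the sizes, the value of `beta`, and the helper `random_symmetric` are assumptions made only for this check.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m_hs, m_s = 5, 3, 2   # genes, human diseases, other-species phenotypes (toy sizes)

def random_symmetric(size):
    """Random symmetric 0/1 matrix with a zero diagonal (hypothetical helper)."""
    upper = np.triu(rng.integers(0, 2, size=(size, size)), 1).astype(float)
    return upper + upper.T

A_G = random_symmetric(n)
A_PHs = random_symmetric(m_hs)
A_PS = random_symmetric(m_s)
B_Hs = rng.integers(0, 2, size=(n, m_hs)).astype(float)
B_S = rng.integers(0, 2, size=(n, m_s)).astype(float)

B = np.hstack([B_Hs, B_S])
A_P = np.block([[A_PHs, np.zeros((m_hs, m_s))],
                [np.zeros((m_s, m_hs)), A_PS]])
A = np.block([[A_G, B],
              [B.T, A_P]])

beta = 0.1

# Direct route: truncated Katz sum on the full matrix, then take the
# gene rows and human-disease columns.
S_full = sum(beta ** l * np.linalg.matrix_power(A, l) for l in range(1, 4))
S_hs_direct = S_full[:n, n:n + m_hs]

# Block route: the expanded expression for k = 3.
S_hs_blocks = (beta * B_Hs
               + beta ** 2 * (A_G @ B_Hs + B_Hs @ A_PHs)
               + beta ** 3 * (B @ B.T @ B_Hs + A_G @ A_G @ B_Hs
                              + A_G @ B_Hs @ A_PHs + B_Hs @ A_PHs @ A_PHs))

print(np.allclose(S_hs_direct, S_hs_blocks))   # True
```

The block route never forms the full matrix powers, which is the practical appeal of the expanded expression when the gene network is large.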

CATAPULT

How to train a biased SVM?

$T$: the number of bootstraps

$A$: the set of positive examples (not the adjacency matrix)

$U$: the set of unlabeled gene-phenotype pairs

$n_+$: the number of examples in $A$

Step 1: Draw a bootstrap sample $U_t \subseteq U$ of size $n_+$.

Step 2: Train a linear classifier θ using the positive training examples $A$ and $U_t$ as negative examples.
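Steps 1 and 2 can be sketched as follows. This is a rough illustration, not the CATAPULT implementation: the classifier is a plain subgradient-descent fit of a class-weighted hinge loss standing in for a biased SVM solver, the penalties `c_pos` and `c_neg`, the learning rate, and the synthetic two-dimensional features are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def train_biased_linear_svm(X_pos, X_neg, c_pos=10.0, c_neg=1.0,
                            epochs=200, lr=0.01):
    """Fit a linear classifier with a class-weighted hinge loss.

    Errors on positives cost more than errors on the sampled "negatives",
    because the latter are really unlabeled and may hide positives.
    """
    X = np.vstack([X_pos, X_neg])
    X = np.hstack([X, np.ones((X.shape[0], 1))])   # constant column for the bias
    y = np.concatenate([np.ones(len(X_pos)), -np.ones(len(X_neg))])
    c = np.where(y > 0, c_pos, c_neg)
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        active = y * (X @ theta) < 1               # examples inside the margin
        grad = theta - (c[active] * y[active]) @ X[active] / len(y)
        theta -= lr * grad
    return theta

def score(theta, X):
    """Linear score theta^T [x, 1] for each row of X."""
    return X @ theta[:-1] + theta[-1]

def bagging_biased_svm(X_pos, X_unlabeled, n_bootstraps=5, seed=0):
    rng = np.random.default_rng(seed)
    n_plus = len(X_pos)
    models = []
    for _ in range(n_bootstraps):
        # Step 1: bootstrap sample of size n_plus from the unlabeled pairs.
        idx = rng.choice(len(X_unlabeled), size=n_plus, replace=True)
        # Step 2: train with the sample treated as negative examples.
        theta = train_biased_linear_svm(X_pos, X_unlabeled[idx])
        models.append((idx, theta))
    return models

# Synthetic data: the unlabeled pool hides a few positives among negatives.
data_rng = np.random.default_rng(1)
X_pos = data_rng.normal(loc=2.0, size=(20, 2))
X_unl = np.vstack([data_rng.normal(loc=-2.0, size=(180, 2)),
                   data_rng.normal(loc=2.0, size=(20, 2))])

models = bagging_biased_svm(X_pos, X_unl, n_bootstraps=5)
```

Each entry of `models` keeps the sampled indices next to the classifier, so that a later step can tell which unlabeled pairs were left out of a given bootstrap.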

CATAPULT

How to train a biased SVM?

Step 2: Training classifier

CATAPULT

How to train a biased SVM?

Step 3: For any $x \in U \setminus U_t$, update: