Enzyme Annotation using Conditional Ranking Algorithms

Date post: 27-Jan-2015
Upload: michiel-stock
Description:
Presentation for the Benelearn conference on applying conditional ranking algorithms to predict the function of enzymes from their structure.
Transcript
Page 1: Enzyme Annotation using Conditional Ranking Algorithms

Enzyme Annotation using Conditional Ranking Algorithms

Michiel Stock

Faculty of Bioscience Engineering
Ghent University

6th of June 2014

KERMIT

Michiel Stock (KERMIT) Conditional Ranking of Enzymes 6th of June 2014 1 / 14

Page 2: Enzyme Annotation using Conditional Ranking Algorithms

Outline

1 From Structure to Function

2 Ranking Enzymes

3 Learning to Rank

4 Results

5 Conclusion

Page 3: Enzyme Annotation using Conditional Ranking Algorithms

From Structure to Function

What bioinformatics is (often) about

Bioinformatics for proteins

Using biological knowledge and statistical models to map information from a low level (e.g. protein structure) to a higher level (e.g. molecular function).

Sequence → Structure → Function

Page 4: Enzyme Annotation using Conditional Ranking Algorithms

From Structure to Function

The data set

Data:

two data sets of ca. 1600 enzymes with 21 different functions

five different similarity measures of the active site

active site of an enzyme:

Page 5: Enzyme Annotation using Conditional Ranking Algorithms

From Structure to Function

The enzyme commission number

Page 6: Enzyme Annotation using Conditional Ranking Algorithms

Ranking Enzymes

Quantifying enzyme function similarity

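[Figure: graph of the enzymes EC 2.7.7.12, EC 2.7.7.34, EC 2.7.1.12, EC 4.2.3.90 and EC 4.6.1.11 together with an unannotated query EC ?.?.?.?; each edge between two annotated enzymes is labelled with their catalytic similarity, with values ranging from 0 to 3]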
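The similarity values on this slide are consistent with counting how many leading EC digits two enzymes share. The sketch below works under that assumption; the function name `catalytic_similarity` is illustrative and does not come from the slides.

```python
def catalytic_similarity(ec_a: str, ec_b: str) -> int:
    """Count the leading EC digits shared by two enzymes (0 to 4).

    Assumption: similarity is the length of the common EC prefix.
    """
    shared = 0
    for digit_a, digit_b in zip(ec_a.split("."), ec_b.split(".")):
        if digit_a != digit_b:
            break  # stop at the first level where the two numbers differ
        shared += 1
    return shared


# EC numbers taken from the slide.
print(catalytic_similarity("2.7.7.12", "2.7.7.34"))  # 3
print(catalytic_similarity("2.7.7.12", "2.7.1.12"))  # 2
print(catalytic_similarity("4.2.3.90", "4.6.1.11"))  # 1
print(catalytic_similarity("2.7.7.12", "4.2.3.90"))  # 0
```

Comparing an enzyme with itself gives 4, which matches the label range 0 to 4 used later in the talk.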

Page 7: Enzyme Annotation using Conditional Ranking Algorithms

Ranking Enzymes

Conditional ranking of enzymes

Ranking enzymes

For an unannotated enzyme (the query), rank the annotated enzymes so that those at the top of the ranking have a function similar to that of the query.

Minimize the ranking error: the number of switches needed to obtain a perfect ranking.

Example: suppose one has an enzyme with unknown function: EC ?.?.?.?

1 EC 2.7.7.12

2 EC 2.7.7.12

3 EC 2.7.7.34

4 EC 2.7.1.12

5 EC 2.7.7.34

6 EC 4.2.3.90

7 EC 1.14.11

8 EC 4.6.1.11

⇒ EC 2.7.7.12
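As a rough illustration of the ranking error, the sketch below counts the pairs that appear in the wrong order, which equals the number of adjacent switches needed to sort the list. It assumes the query's true function is EC 2.7.7.12 and scores each ranked enzyme by the number of leading EC digits it shares with the query; both the labels and the function name `ranking_error` are assumptions made for this example.

```python
from itertools import combinations


def ranking_error(relevance_in_ranked_order):
    """Count pairs where a less similar enzyme is ranked above a more similar one."""
    return sum(
        1
        for upper, lower in combinations(relevance_in_ranked_order, 2)
        if upper < lower
    )


# Assumed similarity of each ranked enzyme to a query whose true function is
# EC 2.7.7.12, in the order of the ranking on the slide (positions 1 to 8).
relevance = [4, 4, 3, 2, 3, 0, 0, 0]

print(ranking_error(relevance))  # 1: positions 4 and 5 are switched
```

A perfect ranking, with the labels in non-increasing order, has error 0.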

Page 8: Enzyme Annotation using Conditional Ranking Algorithms

Learning to Rank

Learning the catalytic similarity

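pair of enzymes: $e = (v, v')$

label $y_e \in \{0, 1, 2, 3, 4\}$: the catalytic similarity

five different structural similarities: $K_\phi(v, v')$

Catalytic similarity labels for pairs of enzymes (rows and columns are the enzymes A to G; no entries are given for F and G):

       A   B   C   D   E   F   G
   A   4   4   0   0   0
   B   4   4   0   0   0
   C   0   0   4   2   1
   D   0   0   2   4   3
   E   0   0   1   3   4
   F
   G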

Page 9: Enzyme Annotation using Conditional Ranking Algorithms

Learning to Rank

Pairwise features with the Kronecker product

[Figure: object kernel → pairwise kernel → learning algorithm (SVM, RLS, …)]

The Kronecker kernel is defined as:

$$K_\Phi\big((v, v'), (\bar{v}, \bar{v}')\big) = K_\Phi(e, \bar{e}) = K_\phi(v, \bar{v})\, K_\phi(v', \bar{v}')$$
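In matrix form, the kernel between two enzyme pairs is the product of the object kernel value of their first members and that of their second members, so the full pairwise kernel matrix is the Kronecker product of the object kernel matrix with itself. A minimal NumPy sketch follows; the 3 × 3 object kernel holds made-up values and the helper `pair_index` is hypothetical.

```python
import numpy as np

# Hypothetical object kernel for three enzymes (symmetric, made-up values).
K_obj = np.array([
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
])
n = K_obj.shape[0]

# Pairwise kernel over all n * n ordered pairs of enzymes.
K_pair = np.kron(K_obj, K_obj)


def pair_index(i: int, j: int) -> int:
    """Row or column of the pair (i, j) in the Kronecker product matrix."""
    return i * n + j


# Kernel between the pairs e = (0, 1) and e_bar = (2, 1):
# K_obj[0, 2] * K_obj[1, 1] = 0.1 * 1.0
value = K_pair[pair_index(0, 1), pair_index(2, 1)]
print(K_pair.shape)  # (9, 9)
print(np.isclose(value, K_obj[0, 2] * K_obj[1, 1]))  # True
```

The pairwise matrix grows as n² × n², which is why practical solvers avoid forming it explicitly.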

Page 10: Enzyme Annotation using Conditional Ranking Algorithms

Learning to Rank

Basic pairwise models

Use training data $T = \{(e, y_e)\}$ to fit a model:

$$h(e) = \sum_{\bar{e} \in T} a_{\bar{e}}\, K_\Phi(e, \bar{e}).$$

The function $h \in \mathcal{H}$ can be fitted using the following optimisation problem:

$$A(T) = \arg\min_{h \in \mathcal{H}} L(h, T) + \lambda \|h\|_{\mathcal{H}}^2.$$

For conditional ranking we choose an approximation of the rank loss.

This problem has time complexity $O(n^3)$, with $n$ the number of enzymes.
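To give a feel for where a cubic cost can come from, the sketch below fits a Kronecker-kernel model with a plain squared-error loss, not the rank-loss approximation used in the talk. It assumes that labels are available for every pair of the n training enzymes and solves the regularised system through one eigendecomposition of the n × n object kernel. The function names and the random data are illustrative only.

```python
import numpy as np


def fit_kronecker_rls(K, Y, lam):
    """Solve (K ⊗ K + lam * I) vec(A) = vec(Y) without forming K ⊗ K.

    K: (n, n) symmetric object kernel, Y: (n, n) pair labels, lam > 0.
    """
    eigvals, V = np.linalg.eigh(K)  # the O(n^3) step
    Y_rot = V.T @ Y @ V             # labels expressed in the eigenbasis
    A_rot = Y_rot / (np.outer(eigvals, eigvals) + lam)
    return V @ A_rot @ V.T


def predict_training_pairs(K, A):
    """Model output h(e) for every training pair e = (v, v')."""
    return K @ A @ K


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))  # made-up features for five enzymes
K = X @ X.T                  # linear object kernel
Y = rng.integers(0, 5, size=(5, 5)).astype(float)  # made-up labels in 0..4
lam = 1.0

A = fit_kronecker_rls(K, Y, lam)

# Check against the naive solve on the explicit 25 x 25 pairwise kernel.
A_naive = np.linalg.solve(np.kron(K, K) + lam * np.eye(25), Y.ravel())
print(np.allclose(A.ravel(), A_naive))  # True
```

The naive solve works on an n² × n² system, whereas the eigendecomposition route only ever handles n × n matrices.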

Page 11: Enzyme Annotation using Conditional Ranking Algorithms

Results

Qualitative improvement in the enzyme similarities

Example for CavBase structural similarity:

[Figure: three panels: Ground truth, Supervised, Unsupervised]

Lighter color = higher similarity

Page 12: Enzyme Annotation using Conditional Ranking Algorithms

Results

Improvement of the ROC curves

ROC curves for the five different structural similarity measures: unsupervised and supervised

[Figure: ROC curve for the different enzyme similarity measurements of data set I. x-axis: false positive rate; y-axis: average true positive rate. Curves: CB, FP, LPCS, MCS and SW, each supervised (sup.) and unsupervised (unsup.); the gap between the two groups is annotated "Improvement".]

Increase of AUC from ca. 0.7 to more than 0.8!
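For reference, the AUC of a single ranking can be read as the fraction of (relevant, non-relevant) pairs that the scores order correctly. The sketch below assumes binary relevance labels for one query; how the talk binarises the catalytic similarity and averages over queries is not stated on the slide, and the scores are made up.

```python
def auc(scores, relevant):
    """Fraction of (relevant, non-relevant) pairs ranked correctly; ties count half."""
    positives = [s for s, r in zip(scores, relevant) if r]
    negatives = [s for s, r in zip(scores, relevant) if not r]
    wins = sum(
        (p > q) + 0.5 * (p == q)
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up scores for four annotated enzymes; 1 marks a relevant enzyme.
print(auc([0.9, 0.8, 0.7, 0.4], [1, 0, 1, 0]))  # 0.75
```

An AUC of 0.5 corresponds to a random ordering and 1.0 to a perfect one.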

Page 13: Enzyme Annotation using Conditional Ranking Algorithms

Conclusion

General conclusions

1 enzyme function prediction can nicely be cast in a conditional ranking framework

2 supervised ranking is a clear improvement upon the baseline

3 efficient enough for many bioinformatics applications

4 can be generalised to many other settings

Page 14: Enzyme Annotation using Conditional Ranking Algorithms

Conclusion

Acknowledgements

Ghent University

Bernard De Baets
Willem Waegeman

University of Turku

Tapio Pahikkala
Antti Airola

University of Marburg

Thomas Fober
Eyke Hüllermeier

Want to know more?

[1] T. Pahikkala, A. Airola, M. Stock, B. De Baets, and W. Waegeman. Efficient regularized least-squares algorithms for conditional ranking on relational data. Machine Learning, 93(2-3):321–356, 2013.

[2] M. Stock, T. Fober, E. Hüllermeier, S. Glinca, G. Klebe, T. Pahikkala, A. Airola, B. De Baets, and W. Waegeman. Identification of functionally related enzymes by learning-to-rank methods. IEEE Transactions on Computational Biology and Bioinformatics, accepted for publication, 2014.
