Graph kernels for chemoinformatics. A critical discussion
Matthias Rupp
Berlin Institute of Technology, Germany
6th German Conference on Chemoinformatics, Goslar, Germany, November 7–9, 2010
Outline
- Introduction: Kernel-based learning
- Graph kernels: Idea, taxonomy
- Applications: Virtual screening, pKa estimation
- Discussion: Assessment
Matthias Rupp: Graph kernels in chemoinformatics 2
Machine learning: introduction
- Algorithmic search for patterns in data
- Inference from known samples to new ones
Application examples:
- Ligand-based virtual screening
- Quantitative structure-property relationships
- Toxicity mode of action
Method examples:
- Linear regression
- Principal component analysis
- Artificial neural networks
[Figure: plot of a function f(x) for x from -10 to 10]
Machine learning: kernel-based learning
Idea:
- Transform samples into higher-dimensional space
- Implicitly do inference there
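[Figure: input space (x from -2π to 2π on the real line) mapped by x ↦ (x, sin x) into a two-dimensional feature space]
$\langle x, x' \rangle = \sum_{i=1}^{d} x_i x'_i$ (inner product)
$k(x, x') = \langle \phi(x), \phi(x') \rangle$ (kernel function)
Example: $\phi(x) = (x, \sin x)$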
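The example feature map can be turned into a few lines of code. The sketch below is only an illustration: the function names `phi`, `inner`, `k` and `k_closed_form` are chosen here and do not come from the slides. It evaluates the kernel once through the explicit map φ(x) = (x, sin x) and once through the equivalent closed form, which never constructs the feature vector.

```python
import math


def phi(x):
    """Feature map from the slide: phi(x) = (x, sin x)."""
    return (x, math.sin(x))


def inner(u, v):
    """Standard inner product <u, v> = sum_i u_i * v_i."""
    return sum(ui * vi for ui, vi in zip(u, v))


def k(x, y):
    """Kernel k(x, x') = <phi(x), phi(x')>, computed via the explicit map."""
    return inner(phi(x), phi(y))


def k_closed_form(x, y):
    """The same kernel written directly in input space."""
    return x * y + math.sin(x) * math.sin(y)


if __name__ == "__main__":
    x, y = 0.5, 2.0
    # Both routes give the same number; only the second avoids building phi.
    print(k(x, y), k_closed_form(x, y))
```

For this two-dimensional map the saving is negligible. The point is that a learning algorithm which only needs inner products can work with `k_closed_form` alone.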
Graph kernels: idea
Define kernels directly on graphs!
$k(G, G') = \langle \phi(G), \phi(G') \rangle$ (kernel function)
- Combine graph theory and machine learning
- Complete graph kernels are computationally hard (at least as hard as deciding graph isomorphism)
[Figure panels: small molecule, molecular graph, protein kinase motif (image from Huan et al.), reduced graph, protein-protein interaction network]
Gärtner et al., COLT 2003, 129.
Graph kernels: taxonomy
- random walks: time $O(n^3)$
- patterns
- sampling
- assignments
Gärtner et al., COLT/Kernel 2003, 129; Kashima et al., 155, in Schölkopf et al. (eds.), Kernel methods in computational biology, MIT Press, 2004.
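To make the random-walk idea concrete, here is a minimal sketch, not the formulation of the cited papers. It counts pairs of walks with identical vertex-label sequences by walking on the direct product graph. It truncates at a maximum walk length and weights longer walks by a decay factor. The graph encoding (a label list plus an edge list), the parameters `max_len` and `decay`, and the two toy molecules are all assumptions made for illustration. The code makes no attempt to match the running time quoted on the slide.

```python
def _adjacency(n, edges):
    """Neighbour sets of an undirected graph with vertices 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def random_walk_kernel(g1, g2, max_len=3, decay=0.5):
    """Weighted count of label-matching walk pairs up to length max_len.

    Each graph is a pair (labels, edges). A vertex of the direct product
    graph is a pair (u, v) of equally labelled vertices; it is joined to
    (u2, v2) if u-u2 is an edge of g1 and v-v2 is an edge of g2.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    adj1 = _adjacency(len(labels1), edges1)
    adj2 = _adjacency(len(labels2), edges2)
    nodes = [(u, v)
             for u in range(len(labels1))
             for v in range(len(labels2))
             if labels1[u] == labels2[v]]
    node_set = set(nodes)
    # counts[p] = number of product-graph walks of the current length ending at p
    counts = {p: 1 for p in nodes}
    total = float(len(nodes))  # walks of length 0
    weight = 1.0
    for _ in range(max_len):
        nxt = {p: 0 for p in nodes}
        for (u, v), c in counts.items():
            for u2 in adj1[u]:
                for v2 in adj2[v]:
                    if (u2, v2) in node_set:
                        nxt[(u2, v2)] += c
        counts = nxt
        weight *= decay
        total += weight * sum(counts.values())
    return total


if __name__ == "__main__":
    # Hydrogen-suppressed toy graphs (illustrative only).
    ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
    dimethyl_ether = (["C", "O", "C"], [(0, 1), (1, 2)])
    print(random_walk_kernel(ethanol, ethanol))
    print(random_walk_kernel(ethanol, dimethyl_ether))
```

Swapping the two arguments gives the same value, since each label-matching walk pair is counted once regardless of argument order.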
Graph kernels: taxonomy
- random walks
- patterns: time $O(n^2 c^{2c})$ for trees, $O(n^3)$ for cyclic patterns
- sampling
- assignments
Mahé & Vert, Mach. Learn. 75(1): 3, 2009; Horváth et al., KDD 2004, 158.
Graph kernels: taxonomy
- random walks
- patterns
- sampling: time $O(n c^{k-1})$, $k \in \{3, 4, 5\}$
- assignments
Shervashidze et al., AISTATS 2009, 488; Kondor et al., ICML 2009, 529.
Graph kernels: taxonomy
- random walks
- patterns
- sampling
- assignments: time $O(n^3)$
Fröhlich et al., QSAR Comb. Sci. 25(4): 317, 2006; Rupp et al., J. Chem. Inf. Model. 47(6): 2280, 2007.
Applications: virtual screening
Target:
- Peroxisome proliferator-activated receptor γ (PPARγ)
- Related to type 2 diabetes and dyslipidemia
Methods:
- Gaussian process regression
- Graph kernel + descriptors
- Cellular reporter gene assay
Results:
- 8 out of 15 compounds active
- One selective PPARγ agonist with novel scaffold (derivative of the natural product truxillic acid), EC50 = 10.03 ± 0.2 µM
Rupp et al., ChemMedChem 5(2): 191, 2010.
Applications: quantitative structure-property relationships
Objective:
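- Estimation of acid dissociation constants pKa in water
- $\mathrm{HA} \rightleftharpoons \mathrm{A}^- + \mathrm{H}^+$; $\mathrm{p}K_a \approx \mathrm{pH} + \log_{10} \frac{c(\mathrm{HA})}{c(\mathrm{A}^-)}$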
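As a brief worked reading of this relation (the concentration ratios below are illustrative values, not data from the study): when acid and conjugate base are present in equal concentration the logarithm vanishes, and a tenfold excess of the protonated form shifts the value by one unit.

```latex
\begin{align*}
c(\mathrm{HA}) = c(\mathrm{A}^-)
  &\;\Rightarrow\; \mathrm{p}K_a \approx \mathrm{pH} + \log_{10} 1 = \mathrm{pH}, \\
c(\mathrm{HA}) = 10\, c(\mathrm{A}^-)
  &\;\Rightarrow\; \mathrm{p}K_a \approx \mathrm{pH} + \log_{10} 10 = \mathrm{pH} + 1 .
\end{align*}
```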
Methods:
- Published data (n = 698)
- Kernel ridge regression
- Only graph kernel
Results:
- Best RMSE = 0.23, median RMSE = 0.85
- Same performance as semi-empirical reference model based on frontier electron theory
Tehan et al., Quant. Struct. Act. Rel. 21(5): 457, 473; Rupp et al., Mol. Inf. 29: 731, 2010.
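Kernel ridge regression needs nothing but kernel values between samples, which is why it can run on a graph kernel alone. The following is a minimal sketch in plain Python, not the setup of the cited study. A toy Gaussian kernel on numbers stands in for a graph kernel, and the training inputs, targets and the `ridge` value are made up for illustration. Fitting solves (K + ridge·I) α = y, and a prediction is a weighted sum of kernel values against the training samples.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def krr_fit(K, y, ridge=1e-3):
    """Return alpha solving (K + ridge * I) alpha = y."""
    n = len(y)
    A = [[K[i][j] + (ridge if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return solve(A, y)


def krr_predict(k_new, alpha):
    """Prediction f(x) = sum_i alpha_i * k(x, x_i)."""
    return sum(a * kv for a, kv in zip(alpha, k_new))


def toy_kernel(x, y):
    """Stand-in kernel on numbers; a graph kernel would take its place."""
    return math.exp(-(x - y) ** 2)


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0]   # hypothetical training inputs
    ys = [1.0, 3.0, 2.0]   # hypothetical target values
    K = [[toy_kernel(a, b) for b in xs] for a in xs]
    alpha = krr_fit(K, ys, ridge=1e-8)
    print(krr_predict([toy_kernel(1.5, xi) for xi in xs], alpha))
```

The coefficient vector `alpha` has one entry per training sample, so both the model size and the prediction cost grow with the size of the training set.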
Discussion: choice of kernel
Problem:
- It’s not clear when to use which graph kernel
Questions to ask:
- Does it consider the position of patterns?
- Does it support domain knowledge, e.g., labels?
- Does it exploit molecular graph properties, e.g., bounded vertex degrees?
- Is it positive definite?
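The last question can be probed empirically on a given data set. The sketch below is an assumption-laden illustration rather than a proof technique. It attempts a Cholesky factorisation of a kernel matrix, which succeeds, up to rounding, exactly when the matrix is symmetric positive definite. The function name and the `jitter` parameter are introduced here. Passing the check on one data set is only a necessary condition: the kernel property must hold for every finite set of graphs. A small `jitter` is useful because a valid kernel may still give a singular matrix, for example when a molecule occurs twice.

```python
import math


def is_positive_definite(K, jitter=0.0, sym_tol=1e-12):
    """Try a Cholesky factorisation of K + jitter * I.

    Returns True if K is symmetric (within sym_tol) and the factorisation
    succeeds, False otherwise.
    """
    n = len(K)
    for i in range(n):
        for j in range(i):
            if abs(K[i][j] - K[j][i]) > sym_tol:
                return False
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][t] * L[j][t] for t in range(j))
            if i == j:
                d = K[i][i] + jitter - s
                if d <= 0.0:
                    return False  # non-positive pivot: not positive definite
                L[i][i] = math.sqrt(d)
            else:
                L[i][j] = (K[i][j] - s) / L[j][j]
    return True


if __name__ == "__main__":
    print(is_positive_definite([[2.0, 1.0], [1.0, 2.0]]))  # eigenvalues 1 and 3
    print(is_positive_definite([[1.0, 2.0], [2.0, 1.0]]))  # eigenvalues 3 and -1
```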
Discussion: assessment
Kernel methods:
+ Principled way of non-linear pattern recognition
– Solution in terms of training samples instead of input dimensions
  Affects computing time, solution size, interpretation
Graph kernels:
+ Principled use of graph theory in kernel learning
+ Defined directly on the graphs
+ Potential in chemoinformatics
– High computational requirements
- Some aid interpretability, some do not
- Recent development, active area of research
Outlook:
- Theoretical and comparative studies needed
- Graph kernels designed for chemoinformatics
Acknowledgments
Prof. Dr. Klaus-Robert Müller, Institute of Technology Berlin, Germany
Prof. Dr. Gisbert Schneider, ETH Zurich, Switzerland
Prof. Dr. Manfred Schubert-Zsilavecz, Dr. Heiko Zettl, Ramona Steri
Dr. Petra Schneider, Dr. Ewgenij Proschak, Markus Hartenfeller
Dr. Timon Schroeter, Katja Hansen
Dr. Igor Tetko, Robert Körner
Literature
- Mathematical review:
  Vishwanathan et al., Graph kernels, J. Mach. Learn. Res. 11: 1201, 2010.
- Chemoinformatics review:
  Rupp & Schneider, Graph kernels for molecular similarity, Mol. Inf. 29(4): 266, 2010.
- Slides:
http://www.mrupp.info