Raheleh Salari, SFU
Potential Drug Target Discovery on PPI Networks
• Pathogens becoming more drug resilient; infectious diseases on the rise.
• Emerging diseases (e.g. avian flu) may result in a global pandemic!
• Rational drug design: the search for “magic bullets” is failing.
• Combinatorial therapies needed – multiple drug targets.
Computational identification of drug targets
• Protein-protein interaction (PPI) networks: nodes are proteins, edges are interactions.
• Goal: Identify protein targets on PPI networks whose “removal” disrupts several “essential” pathways/complexes and their possible “backup” paths on the PPI network.
• Targets should have no human orthologs.
Example: H. pylori chemotaxis pathway
[Figure: associated PPI subnetwork]
PPI networks + pathways
• Strategy: aim to disrupt all the possible communication paths between “endpoint” pairs of essential pathogenic pathways (multicut).
• Weighted node sparsest cut:
– Input: node weights (large for human orthologs; small for essential proteins, surface proteins, easy targets) and the essentiality of each source/sink pair (quantifies how important a pathway is to survival)
– Output: a node cut C that minimizes W(C) / ecc(C)
• W(C) = total weight of nodes on C
• ecc(C) = total essentiality of the pathways disrupted
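The objective above can be made concrete with a small sketch. Assuming the PPI network is stored as an adjacency dictionary and each source/sink pair carries an essentiality score, the hypothetical helper `cut_ratio` evaluates W(C) / ecc(C) for one candidate cut C. It only scores a given cut; it is not the DSC or LP algorithm, and the toy network, weights, and scores are invented for illustration. Counting a pathway as disrupted when one of its endpoints is removed is also an assumption.

```python
from collections import deque

def reachable(adj, start, removed):
    """Nodes reachable from `start` once the nodes in `removed` are deleted."""
    if start in removed:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def cut_ratio(adj, weight, pairs, cut):
    """Return W(C) / ecc(C) for a candidate node cut C.

    `pairs` maps (source, sink) -> essentiality of that pathway.
    A pair counts as disrupted when no source-to-sink path survives.
    """
    cut = set(cut)
    total_weight = sum(weight[v] for v in cut)
    disrupted = 0.0
    for (s, t), essentiality in pairs.items():
        if t not in reachable(adj, s, cut):
            disrupted += essentiality
    # A cut that disrupts nothing has an unbounded ratio.
    return float("inf") if disrupted == 0 else total_weight / disrupted

# Toy network (illustrative only): two pathways sharing the a-b edge.
adj = {
    "s1": ["a"], "t1": ["a"], "a": ["s1", "t1", "b"],
    "s2": ["b"], "t2": ["b"], "b": ["s2", "t2", "a"],
}
weight = {"s1": 10, "t1": 10, "s2": 10, "t2": 10, "a": 1, "b": 4}
pairs = {("s1", "t1"): 2.0, ("s2", "t2"): 1.0}

for candidate in ({"a"}, {"b"}, {"a", "b"}):
    print(sorted(candidate), cut_ratio(adj, weight, pairs, candidate))
```

On this toy input the cheap node `a` alone gives the smallest ratio, because it disconnects the more essential pair at low weight.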
Approximation algorithms
• DSC: # endpoint pairs = O(log n)
– O(log n) approximation by trivial generalization of multi-cut algorithms (check every subset of source/sink pairs)
– O(n^3 log^2 n) [Goldberg & Tarjan 88]
• LP: # source/sink pairs unbounded
– O(n^{1/2}) approximation
– polynomial rounding algorithm [Hajiaghayi & Raecke 07]
• Identical results on H. pylori PPI network, slight differences on E. coli PPI network
Input: E. coli signaling pathways
Pathway | Source | Sink | Target(s) | Method(s)
OmpR Family | PhoR | PhoA | - | -
OmpR Family | TorS | TorA | - | -
NarL Family | NarG | NarI | - | -
NarL Family | NarQ | FrdA | dnaK* | DSC, LP
DNA polymerase | dnaE | holA | - | -
DNA polymerase | dnaE | holD | - | -
ABC Transporter | CysP | CysA | - | -
Bacterial Chemotaxis | Tar | MotA | cheW* | DSC, LP
Bacterial Chemotaxis | DPPA | MotA | mtlD | DSC
Input: E. coli essential complexes
Complex | Source | Sink | Target(s) | Method(s)
RNA Polymerase | infB | rpoN | - | -
RNA Polymerase | hepA | greB | rpoA*+, rpoB*+, rplC*, rpoC*, rpsB*, rpsE* | DSC, LP
ACP | Ffh | aidB | - | -
IscS | hscA | fdhD | lpdA*, lysU, aceF*, aceE, iscS*, rpsE* | DSC, LP
DNA polymerase | sbcB | priA | - | -
Ribosome associated | hlpA | uvrC | - | -
Ribosome associated | cafA | kdsA | - | -
Input: H. pylori signaling pathways
Pathway | Source | Sink | Target(s) | Method(s)
Ribosomal Proteins | rplD | rplP | - | -
Ribosomal Proteins | rplI | rpsF | HP1223 | DSC, LP
Protein Export | SecD | YidC | - | -
Type III Secretion Sys. | FliF | FlhA | - | -
Type IV Secretion Sys. | cag12 | trbI | HP0933 | DSC, LP
Two component Sys. | TrpB | TrpE | HP0452 | DSC, LP
Two component Sys. | AtoS | AtoB | HP0149 | DSC, LP
Flagellar Assembly | FliG | FliN | HP0823 | DSC, LP
Flagellar Assembly | FliD | FlhA | FabE | DSC, LP
Bacterial Chemotaxis | CheW | MotB | - | -
ABC Transporters | OppA | OppF | msrAB | DSC, LP
DNA Polymerase | dnaE | dnaN | HP0241 | DSC, LP
PPI networks only
• Strategy: aim to disrupt as many “potential” pathways as possible (balanced cut).
• Minimum weighted node separator problem: C is a β-balanced separator if C partitions V into V’ and V’’ s.t. min{|V’|, |V’’|} > β·|V|
– Input: node weights (small node weights indicate essentiality, targetability etc.; human orthologs have large weight)
– Output: find C with minimum total weight
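The β-balanced condition can be checked mechanically. The sketch below is one possible reading: after deleting C, the surviving connected components are grouped into two sides, and the cut qualifies if some grouping leaves both sides with more than β·|V| nodes. The function names are hypothetical, and taking |V| to count all nodes (including those in C) is an assumption.

```python
from collections import deque

def components_after_removal(adj, cut):
    """Sizes of the connected components left after deleting the nodes in `cut`."""
    seen = set(cut)
    sizes = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        sizes.append(size)
    return sizes

def is_balanced_separator(adj, cut, beta):
    """True if the components can be grouped into two sides,
    each with more than beta * |V| nodes (|V| counts every node; assumption)."""
    n = len(adj)
    sizes = components_after_removal(adj, cut)
    total = sum(sizes)
    # Subset-sum over component sizes: every size one side could reach.
    attainable = {0}
    for s in sizes:
        attainable |= {a + s for a in attainable}
    return any(min(a, total - a) > beta * n for a in attainable)

# Toy example: a path on 7 nodes, 0 - 1 - ... - 6.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 7] for i in range(7)}
print(is_balanced_separator(path, {3}, 0.3))  # middle node splits 3 / 3
print(is_balanced_separator(path, {1}, 0.3))  # near the end splits 1 / 5
```

Removing the middle node of the path leaves two sides of three nodes each, which passes for β = 0.3; removing a node near the end leaves a side of one node, which does not.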
Approximation algorithms, heuristics
• O(log n) approximation [Leighton & Rao 99] performs poorly in practice.
• O(log^{1/2} n) approximation [Arora & Kale 07] is only slightly better.
• Greedy heuristics targeting nodes with maximum degree (GDeg) or betweenness (GBet) perform relatively poorly.
• A heuristic (HMWS) motivated by several combinatorial observations was devised.
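To make the greedy baseline concrete, here is a minimal sketch of a degree-based heuristic in the spirit of GDeg. The slides do not give its details, so several choices are assumptions: degrees are recomputed in the residual graph after each removal, node weights are ignored, ties go to the first node in dictionary order, and removal stops as soon as the removed set is a β-balanced separator. All function names are hypothetical.

```python
def component_sizes(adj, removed):
    """Sizes of connected components after deleting the nodes in `removed`."""
    seen = set(removed)
    sizes = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        sizes.append(size)
    return sizes

def is_balanced(adj, removed, beta):
    """True if the components can be grouped into two sides, each > beta * |V|."""
    sizes = component_sizes(adj, removed)
    total, sums = sum(sizes), {0}
    for s in sizes:
        sums |= {a + s for a in sums}
    return any(min(a, total - a) > beta * len(adj) for a in sums)

def greedy_degree_cut(adj, beta):
    """Remove the highest-degree remaining node until the cut is beta-balanced.

    Returns the removed nodes in order, or None if no balanced cut is reached.
    """
    removed, removed_set = [], set()
    while not is_balanced(adj, removed_set, beta):
        remaining = [v for v in adj if v not in removed_set]
        if not remaining:
            return None
        # Degree in the residual graph (assumption: recomputed at every step).
        best = max(remaining,
                   key=lambda v: sum(1 for u in adj[v] if u not in removed_set))
        removed.append(best)
        removed_set.add(best)
    return removed

# Toy graph: two triangles {0,1,2} and {4,5,6} joined through node 3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
adj = {v: [] for v in range(7)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

print(greedy_degree_cut(adj, 0.25))
```

On the toy graph a single removal suffices, but the heuristic picks a triangle corner rather than the bridging node 3, which hints at why degree alone can be a weak guide to good separators.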
Comparison of HMWS, GDeg and GBet methods
E. coli pathways disrupted (cut size 28, β=0.15)
1. Ribosome
2. Pyruvate metabolism
3. Butanoate metabolism
4. Citrate cycle (TCA cycle)
5. Glycolysis/Gluconeogenesis
6. Alanine and aspartate metabolism
7. Glycine, serine and threonine metabolism
8. Valine, leucine and isoleucine degradation
9. Pyrimidine metabolism
10. Purine metabolism
11. RNA polymerase
12. Lysine biosynthesis
13. Aminoacyl-tRNA biosynthesis
14. Two Component (NarL family) *
15. Bacterial Chemotaxis *
16. ABC transporters (Iron complex) *
E. coli known drug targets (re)discovered (cut size 28, β=0.15)
Gene Name | Drug
rpoA | Rifabutin
rpoB | Rifampin, Rifaximin
rpsJ | Nitrofurantoin
rpsD | Clomocycline, Demeclocycline, Doxycycline, Lymecycline, Minocycline, Oxytetracycline, Tetracycline, Tigecycline
H. pylori disrupted pathways (cut size 17, β=0.15)
1. Purine metabolism
2. Pyrimidine metabolism
3. RNA polymerase
4. Caprolactam degradation
5. Flagellar assembly
6. Urease complex
7. Ribosomal proteins *
8. Oxidative phosphorylation (F-type ATPase) *
9. Epithelial cell signaling in H. pylori infection *
10. DNA polymerases *
11. Bacterial chemotaxis *
12. Oxidative phosphorylation (F-type ATPase) *
13. Protein export (Sec dependent pathway) *
14. Two-component system – NtrC family *
15. ABC transporters (Iron complex) *
16. Flagellar assembly *
17. Type IV secretion system *
Acknowledgements
• Cenk Sahinalp (SFU, CompBio)
• Fereydoun Hormozdiari (SFU, CompBio)
• Vineet Bafna (UCSD)
• Phuong Dao (SFU, CompBio)
• SFU CTEF: Bioinformatics for combating infectious diseases program
• NSERC, CRC program, MSFHR
HMWS
1. RWB: compute Random Walk Betweenness for all nodes, in O(n^3) time on a sparse graph
2. Split: returns an initial cut s.t. every connected component has < (1 − β)n nodes
3. Merge: partitions the components into two sets, each with > βn nodes
4. Cut: do it all over again