The STRING database

Post on 22-May-2015


14th International Conference on Intelligent Systems for Molecular Biology, Software demo, Fortaleza Conference Center, Fortaleza, Brazil, August 6-10, 2006


Lars Juhl Jensen

EMBL Heidelberg

data integration

functional interactions

179 proteomes

Ensembl

SWISS-PROT

genomic context methods

phylogenetic profiles

Cell

Cellulosomes

Cellulose
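
The phylogenetic-profile idea can be sketched in a few lines: genes that are present and absent in the same set of genomes are candidates for a functional link. The sketch below is only illustrative. The Jaccard index, the function name `profile_similarity`, and the toy genomes are assumptions, not the measure or data STRING uses.

```python
from typing import Dict, Set


def profile_similarity(profile_a: Set[str], profile_b: Set[str]) -> float:
    """Jaccard similarity between two presence profiles.

    Each profile is the set of genomes in which a gene has a homolog.
    """
    union = profile_a | profile_b
    if not union:
        return 0.0
    return len(profile_a & profile_b) / len(union)


# Hypothetical toy data: genomes in which each gene is present.
profiles: Dict[str, Set[str]] = {
    "geneA": {"sp1", "sp2", "sp4"},
    "geneB": {"sp1", "sp2", "sp4"},
    "geneC": {"sp3"},
}

same = profile_similarity(profiles["geneA"], profiles["geneB"])
different = profile_similarity(profiles["geneA"], profiles["geneC"])
print(same, different)
```

Identical profiles score highest and disjoint profiles score lowest; a real analysis would also need to correct for the phylogenetic relatedness of the genomes.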

gene fusion

gene neighborhood

questionable reliability

raw quality scores

gene neighborhood

sum of intergenic distances
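
A minimal sketch of a raw neighborhood score based on the sum of intergenic distances. The slide does not define the sum precisely, so this assumes genes on the same strand of one replicon and adds up the gaps between consecutive genes lying between the two genes of interest. The function name and coordinates are hypothetical.

```python
from typing import List, Tuple

Gene = Tuple[str, int, int]  # (name, start, end); assumed same strand and replicon


def sum_intergenic_distances(genes: List[Gene], first: str, second: str) -> int:
    """Sum the gaps between consecutive genes from `first` to `second`."""
    ordered = sorted(genes, key=lambda g: g[1])
    names = [g[0] for g in ordered]
    i, j = sorted((names.index(first), names.index(second)))
    total = 0
    for left, right in zip(ordered[i:j], ordered[i + 1:j + 1]):
        # Overlapping genes contribute a gap of zero rather than a negative one.
        total += max(0, right[1] - left[2])
    return total


# Hypothetical coordinates for three neighboring genes.
toy_genes = [("a", 100, 400), ("b", 420, 900), ("c", 1000, 1500)]
print(sum_intergenic_distances(toy_genes, "a", "c"))  # 20 + 100 = 120
```

Under this reading, a smaller sum would indicate a tighter gene cluster, which is why such a raw score still has to be calibrated before it can be compared with other evidence types.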

many types of evidence

raw quality scores

not directly comparable

benchmarking

calibrate against KEGG
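
One way to picture the calibration step: rank the predicted pairs by raw score, cut the ranking into bins, and record for each bin the fraction of pairs that share a pathway in the reference set. The sketch below assumes that higher raw scores are better and that the reference is simply a set of protein pairs annotated to the same KEGG pathway. The function name, bin size, and data are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

Pair = FrozenSet[str]


def calibration_curve(
    scored_pairs: List[Tuple[Pair, float]],
    same_pathway: Set[Pair],
    bin_size: int,
) -> List[Tuple[float, float]]:
    """Return (mean raw score, fraction of pairs in the same pathway) per bin."""
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    curve = []
    for start in range(0, len(ranked), bin_size):
        chunk = ranked[start:start + bin_size]
        mean_score = sum(score for _, score in chunk) / len(chunk)
        hits = sum(1 for pair, _ in chunk if pair in same_pathway)
        curve.append((mean_score, hits / len(chunk)))
    return curve


# Hypothetical raw scores and reference pathway membership.
pairs = [
    (frozenset({"a", "b"}), 0.9),
    (frozenset({"a", "c"}), 0.8),
    (frozenset({"b", "d"}), 0.2),
    (frozenset({"c", "d"}), 0.1),
]
reference = {frozenset({"a", "b"}), frozenset({"a", "c"}), frozenset({"c", "d"})}
curve = calibration_curve(pairs, reference, bin_size=2)
print(curve)
```

A smooth function fitted through such points would map any raw score to an estimated probability, which puts scores from different evidence types on a common scale.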

curated knowledge

KEGG: Kyoto Encyclopedia of Genes and Genomes

Reactome

MIPS: Munich Information Center for Protein Sequences

STKE: Signal Transduction Knowledge Environment

primary experimental data

many sources

parsers

co-expression

GEO: Gene Expression Omnibus

SMD: Stanford Microarray Database

physical protein interactions

BIND: Biomolecular Interaction Network Database

MINT: Molecular Interactions Database

GRID: General Repository for Interaction Datasets

DIP: Database of Interacting Proteins

HPRD: Human Protein Reference Database

literature mining

different gene identifiers

synonyms lists

MEDLINE

SGD: Saccharomyces Genome Database

The Interactive Fly

OMIM: Online Mendelian Inheritance in Man

co-mentioning
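
Co-mentioning can be sketched as counting, for every pair of genes, the number of abstracts in which names of both appear, after mapping synonyms to one identifier. The sketch below uses naive whitespace tokenization and a hand-made synonym dictionary. Both are simplifying assumptions, and the example sentences are invented.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Set


def comention_counts(abstracts: Iterable[str], synonyms: Dict[str, Set[str]]) -> Counter:
    """Count abstracts in which two genes are mentioned together."""
    counts: Counter = Counter()
    for text in abstracts:
        # Naive tokenization: lowercase, strip commas and periods, split on spaces.
        tokens = set(text.lower().replace(",", " ").replace(".", " ").split())
        found = sorted(
            gene for gene, names in synonyms.items()
            if any(name.lower() in tokens for name in names)
        )
        counts.update(combinations(found, 2))
    return counts


# Hypothetical synonym list and invented example sentences.
synonym_list = {"HAP1": {"HAP1", "Hap1p"}, "CYC1": {"CYC1"}, "CYC7": {"CYC7"}}
toy_abstracts = [
    "HAP1 controls expression of CYC1 and CYC7.",
    "CYC1 is regulated by Hap1p.",
]
counts = comention_counts(toy_abstracts, synonym_list)
print(counts[("CYC1", "HAP1")])  # 2
```

Raw co-mention counts favor frequently studied genes, so they would need normalization and calibration like any other raw score.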

NLP: Natural Language Processing

Gene and protein names

Cue words for entity recognition

Verbs for relation extraction

[nxgene The GAL4 gene]

[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]

combine all evidence

spread over many species

transfer by orthology

orthologous groups

fuzzy orthology

?

Source species

Target species

Bayesian scoring scheme
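
The slides do not spell out the scoring formula. A common naive-Bayes-style way to combine evidence, assuming each calibrated score is the probability that the link is real and that the evidence types are independent, is to compute the probability that at least one of them is correct. The function name `combine_scores` is hypothetical, and any correction for prior probability is left out.

```python
from math import prod
from typing import Iterable


def combine_scores(scores: Iterable[float]) -> float:
    """Combine independent probabilistic scores: 1 - product of (1 - s_i)."""
    return 1.0 - prod(1.0 - s for s in scores)


# Two independent lines of evidence, each 50% reliable on its own.
print(combine_scores([0.5, 0.5]))  # 0.75
```

Under these assumptions, several weak but independent lines of evidence add up to a stronger combined score than any single one of them.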

Acknowledgments

The STRING team (EMBL)

– Christian von Mering

– Berend Snel

– Martijn Huynen

– Sean Hooper

– Samuel Chaffron

– Julien Lagarde

– Mathilde Foglierini

– Peer Bork

Literature mining project (EML Research)

– Jasmin Saric

– Rossitza Ouzounova

– Isabel Rojas