Post on 29-Jan-2016
BIOINFORMATICS - GENE DATABASES
21 January 2010, Dr. Victor Treviño
vtrevino@itesm.mx
GENE DATABASES (DNA, RNA, PROTEIN)
HUGO (www.genenames.org)
NCBI (http://www.ncbi.nlm.nih.gov)
EBI – EMBL (http://www.ebi.ac.uk/)
EBIMed (http://www.ebi.ac.uk/Rebholz-srv/ebimed/)
SwissProt / UniProt
http://www.ebi.ac.uk/uniprot/
http://www.psc.edu/general/software/packages/swiss/swiss.php
PubGene (http://www.pubgene.org/)
GeneCards (http://www.genecards.org/)
iHOP (http://www.ihop-net.org/UniPub/iHOP/)
Panther (http://www.pantherdb.org/)
"Others"
HUGO – HGNC (www.genenames.org)
Human Genome Organization
"OFFICIAL" Gene Names
NCBI LINKS
NCBI
The "richest" information about genes
NCBI – GENE DATABASE
Summary
Species (or specific)
Function
Sequence
CDS
Chr Location
Domains
Interactions
GeneRIFs: Gene References Into Function
Lots of LINKS to all parts of NCBI and Externals
NCBI – NUCLEOTIDE / PROTEIN
GenBank/GenPept Format
GENE SEQUENCE FORMATS
http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml
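One of the formats BLAST accepts is FASTA: a header line starting with ">" followed by one or more lines of sequence. A minimal sketch of reading that layout in Python is given here. The function name parse_fasta and both demo records are hypothetical, made up for illustration, and the sketch assumes plain-text input with no validation of the residues.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FASTA header (without the leading '>') to its full sequence."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines are concatenated until the next header
            records[header] += line
    return records


# Hypothetical records, invented for this example
example = """>demo_seq1 hypothetical record for illustration
cgagatgcagata
gcagctagagat
>demo_seq2 another hypothetical record
ATGGCC
"""

records = parse_fasta(example.splitlines())
for name, seq in records.items():
    print(name, len(seq))
```

Text that appears before the first header is ignored in this sketch; a stricter reader might raise an error instead.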
BLAST - SEARCHING A GENE FROM SEQUENCE
cgagatgcagatagcagctagagat (at random)
small sequences may identify a gene (dbEST, dbSTS, ePCR)
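To illustrate why a short sequence can be enough to point at one gene, here is a toy sketch that looks for the random sequence above in a tiny in-memory "database". It uses plain exact substring matching, which is not the BLAST algorithm (BLAST uses heuristic local alignment and tolerates mismatches). The function name find_exact_matches and the two toy entries are hypothetical.

```python
from typing import Dict, List, Tuple


def find_exact_matches(query: str, database: Dict[str, str]) -> List[Tuple[str, int]]:
    """Return (entry name, 0-based position) for each entry containing the query."""
    q = query.lower()
    hits: List[Tuple[str, int]] = []
    for name, seq in database.items():
        pos = seq.lower().find(q)  # first exact occurrence, or -1
        if pos != -1:
            hits.append((name, pos))
    return hits


# Toy entries invented for this example; they are not real genes
toy_db = {
    "toy_gene_A": "ttttcgagatgcagatagcagctagagatcccc",
    "toy_gene_B": "atggccattgtaatgggccgctgaaagggtgcccgatag",
}

hits = find_exact_matches("cgagatgcagatagcagctagagat", toy_db)
print(hits)
```

Only the first occurrence per entry is reported, and the reverse strand is not searched; both would matter in a real search.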
NCBI - UNIGENE
Unified Information about A Gene across reported sequences
"set of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location."
VERY IMPORTANT: UniGene ID (Hs.xxxxxx)
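Since UniGene IDs turn up in many other resources, it can help to recognize them in text. A minimal sketch, assuming the ID is an organism prefix (such as Hs for Homo sapiens), a dot, and a cluster number. The pattern, the function name parse_unigene_id and the example numbers are illustrative assumptions, not an official specification.

```python
import re
from typing import Optional, Tuple

# Assumed shape: capitalized 2-3 letter organism prefix, a dot, then digits
UNIGENE_ID = re.compile(r"^([A-Z][a-z]{1,2})\.(\d+)$")


def parse_unigene_id(text: str) -> Optional[Tuple[str, int]]:
    """Split a UniGene-style ID into (organism prefix, cluster number)."""
    match = UNIGENE_ID.match(text.strip())
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# The numbers below are arbitrary examples, not references to specific clusters
print(parse_unigene_id("Hs.12345"))
print(parse_unigene_id("Mm.678"))
print(parse_unigene_id("not an id"))
```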
NCBI – OMIM - OMIA
On-Line Mendelian Inheritance in Man / Animals
Curated "Function" of Genes
Good References
Strong History / Evidence
NCBI – OTHERS…
HomoloGene: conserved functions
dbEST: snapshot of genes expressed in a given tissue
UniSTS: sequence tagged sites, PCR primer pairs, genomic position, genes
SNP: polymorphisms
EBI (http://www.ebi.ac.uk/)
Ensembl - automatic annotation of large eukaryotic genomes (gene IDs)
UniProt (Universal Protein Resource) - the world's most comprehensive catalogue of information on proteins
CiteXplore (good for literature)
EBIMed (tools, semantic mining)
SWISSPROT (http://www.ebi.ac.uk/uniprot/)
UniProt: http://www.uniprot.org/
Union of Swiss-Prot, TrEMBL, and PIR
Curated: Swiss-Prot is manually annotated and reviewed.
Good summaries
References
Sequence features (repeats, disulfide, …, domains)
Examples…
Names and origin · Protein attributes · General annotation (Comments) · Ontologies · Alternative products · Sequence annotation (Features) · Sequences · References · Cross-references · Entry information · Relevant documents
PUBGENE (http://www.pubgene.org/)
Good for gene interactions
References
Association to Gene
Ontologies (GO)
Text mining: REALLY NICE
GENECARDS (http://www.genecards.org/)
Good Summary
Function
Lots of links
IHOP (http://www.ihop-net.org/UniPub/iHOP/)
"Summary" of information for a protein
Linked
Text mining: REALLY NICE
PANTHER (http://www.pantherdb.org/)
Rich Information
Curated
Pathways
Functions
Families
Homologs
BIOINFORMATICS LINKS DIRECTORY
http://bioinformatics.ca/links_directory/
GENE DATABASES - SUMMARY
No SINGLE site contains ALL the information
We have to use several sources
BioGPS
CURATED data is valuable
Be cautious with predicted data
Relations with other genes are more difficult to explore
BIOGPS
http://biogps.gnf.org/
It is a portal of portals
You can add as many portal sites as you want
Easy to configure
Versatile
VERY IMPORTANT!