Date posted: 14-Aug-2015
Category: Science
Uploaded by: raymond91105
WormBase beyond www.wormbase.org
WormBase ParaSite
• New home for parasitic worm genomes in WormBase
UCSC WormBase assembly hub
• View current WormBase data on UCSC genome browser
WormBase ParaSite
Motivation
• Hundreds of parasitic nematode genome sequences available or imminent
• Helminth genomes scattered across a number of resources
• Much of the data is “draft” quality
Introducing WormBase ParaSite (parasite.wormbase.org)
• Consistent, integrated access to hundreds of parasitic nematode draft genomes
• Encompass all parasitic worms (i.e. nematodes and flatworms)
WormBase ParaSite genomes (v2)
Nematodes
• 63 species (70 genomes)
  • Clade I – 7 species (9)
  • Clade III – 22 species (24)
  • Clade IV – 16 species (16)
  • Clade V – 18 species (21)
• Largest and smallest
  • Teladorsagia circumcincta (700 Mb)
  • Parastrongyloides trichosuri (42 Mb)
Platyhelminthes
• 25 species (26 genomes)
  • Cestodes – 12 species
  • Trematodes – 11 species
  • Other – 2 species
• Largest and smallest
  • Spirometra erinaceieuropaei (1250 Mb)
  • Hydatigera taeniaeformis (100 Mb)
Orthologs and paralogs
• Ensembl “Compara” protein-tree pipeline
• 118 genomes
  • 9 additional nematode genomes (free living)
  • 13 comparator genomes
    • Including human, mouse, zebrafish
• ~150,000 protein multiple alignments
• ~1000 CPU days
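The 118-genome total can be cross-checked against the per-group counts quoted on the earlier slides. The short sketch below only re-adds those quoted figures; the variable names are illustrative and not part of any WormBase tool.

```python
# Genome counts as quoted on the slides (WormBase ParaSite v2).
nematode_genomes_by_clade = {"I": 9, "III": 24, "IV": 16, "V": 21}
parasitic_nematode_genomes = sum(nematode_genomes_by_clade.values())

platyhelminth_genomes = 26
free_living_nematode_genomes = 9
comparator_genomes = 13

# Total number of genomes fed into the Compara protein-tree pipeline.
compara_total = (
    parasitic_nematode_genomes
    + platyhelminth_genomes
    + free_living_nematode_genomes
    + comparator_genomes
)

print(parasitic_nematode_genomes, compara_total)
```

The clade counts add up to the 70 nematode genomes stated above, and the four groups together give the 118 genomes used for the ortholog and paralog analysis.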
http://parasite.wormbase.org
ParaSite Downloads
ftp://ftp.wormbase.org/pub/wormbase/parasite
• Consistent file naming and data organisation
• Genome project (NCBI BioProject) disambiguation
• Files for each genome
• Genome fasta(s)
• Protein fasta
• Transcript fasta
• Annotation GFF3
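As a rough illustration of working with the annotation files, the sketch below parses one line of standard GFF3 (nine tab-separated columns, with `key=value` attributes separated by semicolons). It assumes only the generic GFF3 layout; the example record, including its sequence name, source, and IDs, is made up for illustration and is not taken from a ParaSite file.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote


class Feature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    phase: Optional[int]
    attributes: dict


def parse_gff3_line(line: str) -> Optional[Feature]:
    """Parse one GFF3 line; return None for blank lines, comments and directives."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    attributes = {}
    for pair in attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 percent-encodes reserved characters in attribute values.
            attributes[unquote(key)] = unquote(value)
    return Feature(
        seqid, source, ftype, int(start), int(end),
        None if score == "." else float(score),
        strand,
        None if phase == "." else int(phase),
        attributes,
    )


# Hypothetical record, for illustration only.
example = "scaffold1\tWormBase_imported\tgene\t100\t900\t.\t+\t.\tID=gene:G1;Name=G1"
feature = parse_gff3_line(example)
print(feature.type, feature.start, feature.end, feature.attributes["ID"])
```

For whole-genome files a dedicated GFF3 library may be more convenient; this sketch is only meant to show the shape of the data.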
ParaSite Mart
• Table-based data-mining tool
• Like WormMine, but different interface
• Complementary to WormMine
•Less depth for C. elegans, but…
•Comprehensive species set (all nematode genomes)
•Some additional functionality
ParaSite Mart - orthologs
ParaSite Mart – sequence extraction
The UCSC WormBase genome Hub
Background
● Many researchers like the UCSC genome browser
○ Familiar interface
○ Comparative genomics (alignments / conservation)
● Worm data at UCSC is 5 years out of date
UCSC hubs
● A new mechanism for remote hosting of collections of genome browser tracks
● Emerging standard for cross-browser compatibility
● The WormBase hub
○ View up-to-date WormBase data on UCSC!
○ View some data not viewable anywhere else: genomic alignments
Nematode genomic alignments
Progressive Cactus (Nguyen et al, 2014)
• New tool (UCSC) for multiple genome alignments (100s of genomes)
• Creates “virtual” ancestor genomes
• Output = HAL file (HDF5 database)
WormBase cactus alignments
• 29 nematode genomes (more in future)
• Viewable on UCSC browser (“SNAKE” tracks)
UCSC
Production Hub
http://ftp.ebi.ac.uk/pub/databases/wormbase/releases/current-production-release/COMPARATIVE_ANALYSIS/hub/hub.txt
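One way to share the hub is as a link that asks the UCSC browser to attach it. The sketch below builds such a link, assuming the browser's `hgTracks` page accepts a `hubUrl` query parameter as described in the UCSC track hub documentation; the function name `hub_link` is illustrative, and the hub address is the production URL quoted above.

```python
from urllib.parse import urlencode

# UCSC genome browser entry point (assumed to accept a hubUrl parameter).
UCSC_HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"

# Production hub address as given on the slide.
HUB_URL = (
    "http://ftp.ebi.ac.uk/pub/databases/wormbase/releases/"
    "current-production-release/COMPARATIVE_ANALYSIS/hub/hub.txt"
)


def hub_link(hub_url: str) -> str:
    """Return a UCSC browser link that attaches the given track hub."""
    return UCSC_HGTRACKS + "?" + urlencode({"hubUrl": hub_url})


print(hub_link(HUB_URL))
```

The hub can also be attached by hand by pasting the hub.txt address into the browser's track hub page.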
New Dropdowns:
● Nematodes
● core genomes
● WormBase assembly identifiers
Search:
● seq. names
● WBGeneIDs
● gene symbols
Release Tracks:
● transcripts
● current + reference
● pseudogenes
● ncRNAs
● mRNA alignments
● WormBase links
Assembly Tracks:
● repeats
● conservation
● comparative hub
EnsEMBL (and friends)
Development Hub
http://ftp.ebi.ac.uk/pub/databases/wormbase/releases/current-development-release/COMPARATIVE_ANALYSIS/hub/parasite_hub.txt
Production Hub
http://ftp.ebi.ac.uk/pub/databases/wormbase/releases/current-production-release/COMPARATIVE_ANALYSIS/hub/parasite_hub.txt
current: metazoa.ensembl.org
coming soon: parasite.wormbase.org, ensembl.org
configure tracks
… and more
GBrowse
BioDalliance
JBrowse
Summary
WormBase ParaSite
• parasite.wormbase.org
• Poster 952C (Saturday)
UCSC WormBase assembly hub
• ftp.ebi.ac.uk/pub/databases/wormbase/releases/current-production-release/COMPARATIVE_ANALYSIS/hub/hub.txt
• blog.wormbase.org
More information
• [email protected]
• Come and see us for a tutorial!