Ensembl Plants: Visualising, mining and analysing crop genomics data

Date post: 14-Jun-2015
Upload: dan-bolser
Description:
Ensembl Plants is a genome-centric platform for visualisation and analysis of plant genomics data. It hosts assembly, sequence, expression, variation and comparative datasets for a growing number of plant species (currently 26) covering a range of economically important crops, including brassica, tomato, grape, barley, potato, maize and wheat, and taxonomically diverse model organisms. The web-based genome browser visually integrates sequence and assembly information with genes, markers, probes, repeats and other public or user-supplied datasets. It includes a web-based data mining tool that allows specific sets of data to be queried and downloaded for offline analysis. In addition to the browser, all data can be accessed computationally via extensive Perl and REST APIs and is available for FTP download or direct database access.
Transcript
Page 1: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Ensembl Plants: Visualising, mining and analysing crop genomics data

Dan Bolser
Ensembl Plants project leader
EMBL-EBI

http://plants.ensembl.org
#EnsemblGenomes

Page 2: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Overview

Background:

● Ensembl Plants
  ● History
  ● Data

● Recent updates
  ● Wheat
  ● Barley

Visualising, mining and analysing data:

● The Ensembl genome browser

● BioMart

● Tools for processing your own data

Page 3: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Ensembl is developed jointly by the EBI and the Wellcome Trust Sanger Institute

Page 4: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Ensembl Plants uses Ensembl technology

Ensembl:

● A platform for genome browsing, annotation and analysis, developed jointly by the EBI and the Wellcome Trust Sanger Institute.

● Has modules for handling:
  ● Genomic data, Variations, Comparative genomics, Gene prediction, ...

● Multiple points of access to data:
  ● Browser-based application, Perl and REST APIs, direct access (MySQL), BioMart data mining tool, DAS (client and server), FTP.

● Upload your own data and compare it to the reference sequence and annotation.

Ensembl was originally developed for vertebrate genomes and was subsequently extended to non-vertebrate species:
● Ensembl Genomes → Ensembl Plants
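
As an illustration of the REST access route listed above, the following is a minimal sketch of how one might list the species in the plants division. It is not taken from the slides: the server address `https://rest.ensembl.org`, the `info/species` endpoint with its `division` parameter, and the helper names `species_info_url`, `fetch_json` and `species_names` are assumptions for illustration and should be checked against the current REST documentation.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Ensembl REST server; not stated in the slides.
REST_SERVER = "https://rest.ensembl.org"


def species_info_url(division="EnsemblPlants"):
    """Build the URL that lists the species of one Ensembl Genomes division."""
    query = urllib.parse.urlencode({"division": division})
    return f"{REST_SERVER}/info/species?{query}"


def fetch_json(url, timeout=30):
    """Fetch a URL and decode its JSON body (requires network access)."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def species_names(payload):
    """Return the sorted species names found in an info/species response."""
    return sorted(entry["name"] for entry in payload.get("species", []))


# Example use (commented out because it needs a network connection):
# names = species_names(fetch_json(species_info_url()))
# print(len(names), "species in Ensembl Plants")
```

Keeping the URL construction and the response parsing separate from the network call makes each piece easy to check without contacting the server.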

Page 5: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Currently 33 genomes in Ensembl Plants

http://plants.ensembl.org

Page 6: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Dicots in Ensembl Plants (10)

Brassicales

Fabales

Malpighiales

Rosales

Solanales

Vitales

Page 7: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Monocots in Ensembl Plants (12+5)

Poales

Zingiberales

Page 8: Ensembl Plants: Visualising, mining and analysing crop  genomics data

'Others' (5)

Page 9: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Types of data in Ensembl (Ensembl Plants)

● Genomic sequence
● Gene, transcript, and protein annotations
● External references and ontology terms
● Mapped sequences: cDNAs, proteins, probes, BACs, repeats, markers, ...

● Variation data:
  ● sequence variants
  ● structural variants

● Comparative data:
  ● gene trees, orthologues, paralogues
  ● whole genome alignments and synteny

Page 10: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Recent data updates

Page 12: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Wheat data in Ensembl Plants

● The chromosome survey sequence from the International Wheat Genome Sequencing Consortium (IWGSC).
● Version 2.1 of the IWGSC gene models called on the chromosome survey sequence.

● Repeats
  ● Repbase
  ● The Triticeae Repeat Sequence Database (TREP)

● Alignments
  ● RNA-seq from various studies in ENA
  ● ESTs and UniGene clusters
  ● 5x 454 Brenchley et al.
  ● Triticum turgidum cDNA assemblies

Page 13: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Wheat data in Ensembl Plants

● Whole genome alignments
  ● Between wheat(s) and:
    ● Rice
    ● Brachypodium
  ● Within wheat
    ● A vs. B
    ● A vs. D
    ● B vs. D

● Gene trees
  ● Aegilops tauschii
  ● Triticum urartu
  ● and other, more distant relatives

Page 14: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Whole genome alignments (WGA) between wheat, rice and Brachypodium

Page 15: Ensembl Plants: Visualising, mining and analysing crop  genomics data

WGA within wheat A, B and D sub-genomes

Page 16: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Gene trees

Page 17: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Gene trees

Page 18: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Walk through ‘demo’ for Ensembl Plants

Page 20: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Search

Page 25: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Variant Effect Predictor (VEP)

● Predicts functional consequences of known and unknown variants

● For substitutions, insertions, deletions and structural variants

● Web interface (for up to 750 variants), standalone Perl script, Perl API and REST API
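
Because the web interface accepts up to 750 variants at a time, a larger variant list has to be split before submission (or sent through the script or APIs instead). The sketch below is one way to do that and is not part of the original slides. The helper names `format_variant` and `batch_variants` are hypothetical, and the line layout (chromosome, start, end, alleles, strand) is assumed to follow VEP's default whitespace-separated input format, which should be verified against the VEP documentation.

```python
WEB_LIMIT = 750  # per-submission limit of the web interface, as stated on the slide


def format_variant(chrom, start, end, alleles, strand="+"):
    """Format one variant as a line of VEP's default input format (assumed layout)."""
    return f"{chrom} {start} {end} {alleles} {strand}"


def batch_variants(lines, batch_size=WEB_LIMIT):
    """Split variant lines into consecutive batches of at most batch_size lines."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [lines[i:i + batch_size] for i in range(0, len(lines), batch_size)]


# Example: 1600 substitutions become three batches of 750, 750 and 100 lines.
variants = [format_variant("1", 1000 + i, 1000 + i, "A/G") for i in range(1600)]
batches = batch_variants(variants)
batch_sizes = [len(batch) for batch in batches]
```

Each batch can then be written to its own file and pasted or uploaded separately.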

Page 26: Ensembl Plants: Visualising, mining and analysing crop  genomics data

Visualise your own data

Upload data:
● Data saved on server
● 5 MB limit
● Large file formats?

Attach remote files:
● URL-based
● HTTP or FTP
● No size limit

Upload formats:
● BED: genes / features
● Gbrowse: genes / features
● GFF/GTF: genes / features
● PSL: sequence alignments
● WIG: continuous-valued data
● BedGraph: continuous-valued data
● TrackHub: collections of tracks

Attach formats:
● BigBed: genes / features
● BAM: sequence alignments
● BigWig: continuous-valued data
● VCF: variants

User added tracks:
● Can be saved or shared
● Only trivial security, do not use for sensitive data!
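
The choice between uploading and attaching a file, as described on this slide, can be sketched as a small decision helper. This is an illustration only: the function name `choose_route`, the lowercase format keys, and the reading of "5 MB" as 5 × 1024 × 1024 bytes are assumptions, not something the browser exposes.

```python
# Assumption: the slide's "5 MB limit" is interpreted as 5 * 1024 * 1024 bytes.
UPLOAD_LIMIT_BYTES = 5 * 1024 * 1024

# Format lists taken from the slide; the lowercase keys are a convention of this sketch.
UPLOAD_FORMATS = {"bed", "gbrowse", "gff", "gtf", "psl", "wig", "bedgraph", "trackhub"}
ATTACH_FORMATS = {"bigbed", "bam", "bigwig", "vcf"}


def choose_route(file_format, size_bytes):
    """Suggest how a user file could be added to the browser.

    Returns "attach" for URL-based formats, "upload" for uploadable formats
    within the size limit, and "too_large" for uploadable formats above it.
    """
    fmt = file_format.lower()
    if fmt in ATTACH_FORMATS:
        return "attach"
    if fmt in UPLOAD_FORMATS:
        return "upload" if size_bytes <= UPLOAD_LIMIT_BYTES else "too_large"
    raise ValueError(f"unknown format: {file_format}")
```

A "too_large" result suggests converting to one of the attachable formats (for example BED to BigBed, or WIG to BigWig) and hosting the file at an HTTP or FTP URL.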

Page 27: Ensembl Plants: Visualising, mining and analysing crop  genomics data

The barley Gene-ome

Page 33: Ensembl Plants: Visualising, mining and analysing crop  genomics data

● Step 1 – Dataset
  ● Choose your dataset and species

● Step 2 – Filters
  ● Limit your dataset

● Step 3 – Attributes
  ● Specify what information you want to output

● Step 4 – Results
  ● Preview and output your results
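
The same four steps map onto the XML query document that BioMart can also accept programmatically. The sketch below builds such a query with the Python standard library; it is an illustration under stated assumptions, not the slides' own example. The function name `build_query` is hypothetical, and the schema name `plants_mart`, dataset name `athaliana_eg_gene`, filter `chromosome_name` and attribute names are assumed values that should be checked in the BioMart interface before use.

```python
import xml.etree.ElementTree as ET


def build_query(dataset, filters, attributes, virtual_schema="plants_mart"):
    """Build a BioMart XML query string from a dataset, filters and attributes."""
    query = ET.Element("Query", {
        "virtualSchemaName": virtual_schema,  # assumed schema name
        "formatter": "TSV",                   # Step 4: tab-separated results
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    # Step 1: choose the dataset (one dataset corresponds to one species).
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Step 2: filters limit which rows are returned.
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    # Step 3: attributes are the columns to output.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


# Example with assumed names: genes on chromosome 1 of one dataset.
xml_query = build_query(
    "athaliana_eg_gene",
    {"chromosome_name": "1"},
    ["ensembl_gene_id", "description"],
)
```

The resulting string is what would be sent as the `query` parameter to a BioMart service endpoint; the endpoint address is deliberately left out here because it is not given in the slides.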

Blast and BioMart...

Page 35: Ensembl Plants: Visualising, mining and analysing crop  genomics data

[email protected] 10/01/2014

Funding (Ensembl Plants)

• Ensembl Genomes is funded by:

• EMBL

• EU (INFRAVEC, Microme, transPLANT, AllBio)

• BBSRC (PhytoPath, wheat, barley and midge sequencing, UK-US collaboration, RNAcentral)

• Wellcome Trust (PomBase)

• NIH/NIAID (VectorBase)

• NSF (Gramene collaboration)

• Bill and Melinda Gates Foundation (wheat rust)

Page 36: Ensembl Plants: Visualising, mining and analysing crop  genomics data

People (Ensembl Plants)

• James Allen, Irina Armean, Dan Bolser, Mikkel Christensen, Paul Davies, Christoph Grabmueller, Kevin Howe, Malcolm Hinsley, Jay Humphrey, Arnaud Kerhornou, Paul Kersey, Julia Khobdova, Eugene Kulesha, Nick Langridge, Dan Lawson, Mark McDowall, Uma Maheswari, Gareth Maslen, Michael Nuhn, Chuang Kee Ong, Michael Paulini, Helder Pedro, Anton Petrov, Dan Staines, Mary Ann Tuli, Brandon Walts, Gary Williams

• If you have a question that is not answered here, please contact our HelpDesk:
  • [email protected]

