Date posted: 30-Dec-2015
Metagenomics
Prof: Rui [email protected]
Ciencies Mediques Basiques, 1st Floor, Room 1.08
Website of the Course: http://web.udl.es/usuaris/pg193845/Courses/Bioinformatics_2007/
Course: http://10.100.14.36/Student_Server/
Studying an organism
…ACTG…
>Dna
MAACTG…
>DNA Pol
MTC…
Stress
Measure Response
Find signatures for physiological dynamics in genomic data
Diversity of Life on Earth
Described species: ~1.5 million
Predicted to exist: >30 million
Cultivated in the lab: ~thousands
How do we know the genome of the species we cannot cultivate?
How can we know whether the genes expressed in nature follow the same patterns as those in the lab?
Metagenomics
Metagenomics (also Environmental Genomics, Ecogenomics or Community Genomics) is the study of genetic material recovered directly from environmental samples.
Sampling in Metagenomics
Take a sample from the environment
Isolate and amplify DNA/mRNA
Sequence it
Computer assembly
ACT…GTC CTA …ATC … …GGGG
How do we know which genes belong to which genome?
How do we assemble them?
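To make the "computer assembly" step concrete, here is a minimal sketch of greedy overlap assembly: repeatedly merge the two fragments whose suffix and prefix overlap the most. It is a toy under stated assumptions (error-free reads, a single strand, no repeats), not the method used by any particular assembler; the function names and the example reads are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair of contigs until no overlap >= min_len remains."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # whatever is left could not be joined
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads: three overlap into one contig, one stays unassigned.
reads = ["ACTGGTC", "GGTCCTA", "CTAATC", "TTTTGGGG"]
print(greedy_assemble(reads))
```

In a metagenomic sample the reads come from many genomes at once, so a merge only says that two fragments overlap, not which organism they belong to. That is why the two questions above remain hard.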
What normally happens
Coverage is not enough and assembly is fragmentary
Worst-case scenario: some fragments cannot be assigned
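Why coverage is often "not enough" can be illustrated with the standard Lander-Waterman estimate: with N reads of length L from a genome of length G, the mean coverage is c = N·L/G, and the expected fraction of the genome with no read at all is roughly e^(-c). The sketch below applies this to a hypothetical two-species sample; the read counts, read length and genome sizes are made-up illustration values, not data from the course.

```python
import math


def coverage(n_reads: int, read_len: int, genome_len: int) -> float:
    """Mean coverage c = N * L / G."""
    return n_reads * read_len / genome_len


def fraction_uncovered(c: float) -> float:
    """Expected fraction of bases hit by no read (Poisson approximation)."""
    return math.exp(-c)


READ_LEN = 800  # assumed read length in bases (hypothetical)

# Hypothetical community: (reads that happen to come from the genome, genome length)
community = {
    "abundant species (hypothetical)": (95_000, 2_000_000),
    "rare species (hypothetical)": (5_000, 4_000_000),
}

for name, (n_reads, genome_len) in community.items():
    c = coverage(n_reads, READ_LEN, genome_len)
    print(f"{name}: coverage {c:.1f}x, about {fraction_uncovered(c):.1%} uncovered")
```

Under these assumptions the rare species ends up near 1x coverage, so a large share of its genome is never sequenced and its assembly stays fragmentary, even though the abundant species is covered many times over.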
Down Side of Metagenomics
Often fragmentary
Often highly divergent
Rarely any known activity
No chromosomal placement
No organism of origin
Ab initio ORF predictions
Huge data
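As an illustration of what "ab initio ORF prediction" means on an anonymous fragment, the sketch below scans all six reading frames for stretches that start with ATG and end at a stop codon. It is a deliberately naive version, assuming the standard stop codons and ignoring ORFs that run off the end of the fragment (which is common in fragmentary assemblies); real gene finders use statistical models. The function names and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[str, int, str]]:
    """Return (strand, frame, orf) for each complete ATG..stop stretch.

    An ORF is kept only if it has at least min_codons codons, stop included.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical fragment containing one short complete ORF on the forward strand.
print(find_orfs("CCATGAAATTTTAGGG"))
```

Because nothing about the organism of origin is known, every predicted ORF here is a hypothesis based on sequence alone, which is the point of the "ab initio" item in the list above.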
Marine Metagenomics
Microbes account for more than 90% of ocean biomass, mediate all biochemical cycles in the oceans and are responsible for 98% of primary production in the sea.
Metagenomics is a breakthrough sequencing approach to examine the open-space microbial species without the need for isolation and lab cultivation of individual species.
Marine Genome Sequencing Project: Measuring the Genetic Diversity of Ocean Microbes
Sorcerer II: data from this area has already reached 10% of GenBank.
The entire data set will double the number of proteins in GenBank!
Sample Metadata from GOS
Site Metadata:
Location (lat/long, water depth)
Site characterization (finite list of types plus "other")
Site description (free text)
Country
Sampling Metadata:
Sample collection date/time
Sampling depth
Conditions at time of sampling (e.g., stormy, surface temperature)
Sample physical/chemical measurements (T (°C), S (ppt), chl a (mg m⁻³), etc.)
"author"
Experimental Parameters:
Filter size
Insert size
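One way to picture the three metadata groups above is as a nested record. The sketch below is only an illustration: the class and field names are hypothetical, the units for filter size (µm) and insert size (kb) are assumptions not stated on the slide, and the finite list of site types is not given in the source, so the field is left as free text.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SiteMetadata:
    latitude: float       # decimal degrees
    longitude: float      # decimal degrees
    water_depth_m: float
    site_type: str        # one of a finite list plus "other" (list not given in the source)
    description: str      # free text
    country: str

    def __post_init__(self) -> None:
        # Basic sanity check on the coordinates.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude must be between -90 and 90")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude must be between -180 and 180")


@dataclass
class SamplingMetadata:
    collected_at: datetime
    sampling_depth_m: float
    conditions: str                               # e.g. "stormy"
    temperature_c: Optional[float] = None         # T (°C)
    salinity_ppt: Optional[float] = None          # S (ppt)
    chlorophyll_a_mg_m3: Optional[float] = None   # chl a (mg m^-3)
    author: str = ""


@dataclass
class ExperimentalParameters:
    filter_size_um: float   # assumed unit: micrometres
    insert_size_kb: float   # assumed unit: kilobases


@dataclass
class SampleRecord:
    site: SiteMetadata
    sampling: SamplingMetadata
    experiment: ExperimentalParameters
```

Keeping the measurements optional reflects the "etc." on the slide: not every sample will carry every physical or chemical reading.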
Calit2's Direct Access Core Architecture Will Create Next Generation Metagenomics Server
Source: Phil Papadopoulos, SDSC, Calit2
[Architecture diagram. Recoverable labels:]
Traditional User: Request / Response via Web Portal
Web Services
Flat File Server Farm
Database Farm
10 GigE Fabric
Dedicated Compute Farm (1000 CPUs)
TeraGrid: Cyberinfrastructure Backplane (scheduled activities, e.g. all by all comparison) (10000s of CPUs)
Direct Access Lambda Cnxns
Local Environment: Local Cluster, Web (other service)
Data sources:
Sargasso Sea Data
Sorcerer II Expedition (GOS)
JGI Community Sequencing Project
Moore Marine Microbial Project
NASA Goddard Satellite Data
Community Microbial Metagenomics Data
Marine Metagenomics
Who is there?
Drug discovery
Environmental survey
Microbial genetic survey
Microbial genomic survey
Symbiosis
Organism discovery
Marine conservation
Evolution study
Bioenergy discovery
Endosymbiosis
Biogeochemistry mapping
Metabolic pathway discovery