Biological Oceanography Scientific Domain Ed DeLong MIT Department of Biological Engineering...

Date post: 20-Dec-2015
Biological Oceanography Scientific Domain

Ed DeLong, MIT

Department of Biological Engineering

Department of Civil and Environmental Engineering

DataSpace

• Coupling of physical & biological oceanographic processes

• Comparative ecosystem analysis

• Biodiversity, biomass and productivity

• C-N-P cycling and energy flow

• Production, consumption of greenhouse gases: climate

• Measurement, modeling and experiments with microbial communities in the sea

• Education, training and knowledge exchange

BIOLOGICAL OCEANOGRAPHY

Microbial and sampling scales, based on Dickey (1991) and Allen (2000): Ricardo Letelier

Oceanographic sampling approaches in the context of scales

Scope & Scale: Challenges in Biological Oceanography (Genomes to Biomes…)

ADVANCED INSTRUMENTATION

Continuous, autonomous collection of 4D physical, chemical and bio-optical datasets

• 2 eddies

• 1 frontal system

• Sub-mesoscale features?

• Higher Chl a below cyclone

• DCM constant

• Patchy distribution of small particles

• Advection/local production of small particles in the Ze

Further specialization: Marine Metagenomics

• Traditional microbiology and microbial genome sequencing studies rely on cultivated cultures

• Marine metagenomics: DNA sequences of microbial assemblages from the environment

• Metagenomic data is used by scientists across multiple disciplines, e.g.:

  • Biological engineering & biotechnology

  • Genomics and computational biology

  • Ecology and environmental science

  • Climate: relationship between marine microbes & the ocean’s carbon cycle, productivity, greenhouse gases

[Figure: H179_454DNA_vs_Pelagibacter, with panels labeled 25 m, 75 m and 125 m]

2nd-Generation Sequencing Platforms

| | AB3730 | 454 FLX/Titanium | Illumina |
|---|---|---|---|
| Cost per run | ~$50 | <$12K | <$5K |
| Bases read per run | 72 Kbp | 100 Mbp / 500 Mbp | >2 Gbp / >200 Gbp !!! |
| Bases per read | 750 | 250 / 450 | >36 (>100 + paired-end reads) |
| Reads per run | 96 | 400K | 20M |
| $ per Mbp | $694 | $120 | $7 |
| AB3730 work equivalent | - | 100x AB3730/day | 300x AB3730/day |
| Errors | Diverse (cloning bias) | Homopolymeric runs | Diverse (base substitution) |
| Run time | 1 hour | 6.5 hours | 2-14 days* |
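The "$ per Mbp" figures appear to follow from the other rows of the comparison. The sketch below is a reader's check, not part of the original slides. It assumes that yield per run is reads per run times bases per read, and it treats the slide's approximate or bounding values (~$50, <$12K, <$5K, >36 bases) as if they were exact. It uses the shorter of the two read lengths quoted for 454 (250) and the 36-base lower bound for Illumina. All names in the code are hypothetical.

```python
# Reader's check of the "$ per Mbp" row. Inputs are the slide's values,
# treating "~", "<" and ">" entries as exact (an assumption).
platforms = {
    "AB3730":   {"cost_usd": 50,     "reads": 96,         "bases_per_read": 750},
    "454 FLX":  {"cost_usd": 12_000, "reads": 400_000,    "bases_per_read": 250},
    "Illumina": {"cost_usd": 5_000,  "reads": 20_000_000, "bases_per_read": 36},
}


def mbp_per_run(reads, bases_per_read):
    """Yield of one run in megabase pairs (reads x read length)."""
    return reads * bases_per_read / 1e6


def cost_per_mbp(cost_usd, reads, bases_per_read):
    """Cost of one run divided by its yield in Mbp."""
    return cost_usd / mbp_per_run(reads, bases_per_read)


costs = {name: cost_per_mbp(**p) for name, p in platforms.items()}

for name, usd in costs.items():
    print(f"{name}: about ${usd:,.0f} per Mbp")
```

Under these assumptions the yields come out at 72 Kbp, 100 Mbp and 720 Mbp per run. The resulting costs round to the $694, $120 and $7 per Mbp quoted on the slide.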

Biological Oceanography Data Challenges

• Wide variety and heterogeneity of data types

  • Oceanographic cruise data

  • Oceanographic time series data

  • Laboratory & field experiments

  • Remote sensing datasets

  • Data from gliders, AUVs & moorings

  • Genomics, metagenomics, gene expression data

  • Numerical simulations & synthesis products

• Distributed data (multi-institution & researchers)

• Need to balance PI, project & public data accessibility

• Data visualization & analysis needs

• Long term archiving requirements

Why do biological oceanographers need DataSpace?

UH, MIT, OSU, UCSC, WHOI, MBARI

DataSpace partners: MIT-OSU

Oceanographic Science Partners: Ed DeLong (MIT) & Ricardo Letelier (OSU)

Library IT Partners: MacKenzie Smith (MIT) & Terry Reese (OSU)

DeLong and Letelier Co-PIs on three major projects:

Center for Microbial Oceanography: Research and Education (C-MORE)

Microbial Oceanography of Oxygen Minimum Zones (MOOMZ)

Microbial diversity and activity in seasonal hypoxic waters (MI-LOCO)

Existing Data Portal

Currently a distributed approach. Consists of weblinks to individually managed heterogeneous datasets.

http://cmore.soest.hawaii.edu/data.htm

Biological and Chemical Oceanography Data Management Office database

http://osprey.bcodmo.org/index.cfm

BCO-DMO

Where is the data now? (Oceanographic data)

Public Databases: NCBI and CAMERA

National Center for Biotechnology Information (NCBI)

Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis (CAMERA)

http://camera.calit2.net/

http://www.ncbi.nlm.nih.gov/

Where is the data now? (Genomic/metagenomic)

In-house Databases …

Why do biological oceanographers need DataSpace?

• Data access, storage and search are not centralized

• Large heterogeneous datasets

• Complex data management/sharing requirements

• Data shared across multiple institutions & investigators

• Long-term requirements (2017)

• Need for cross-investigator, cross-institution and cross-project search

• Currently lots of data is “lost”, i.e. not usable

Why do biological oceanographers need DataSpace?

How many autonomous surveys, cruises, mooring datasets, hydrocasts and deckboard experiments had chlorophyll concentrations greater than X?

Of those data, how many had light levels and oxygen concentrations corresponding to Y and Z?

Of those data, how many have corresponding microbial community taxonomic composition and gene content data? (retrieve)

What is the relationship between light, chlorophyll, oxygen and microbial community taxonomic composition and gene content, across all datasets? How do taxa and gene content relate to oxygen levels and the balance of production and consumption? To greenhouse gas (GHG) levels?

Are there specific gene proxies that predict oxygen or GHG levels?

Note: centralized data access, search and storage will also drive the way we (scientists) ask our questions and collect and annotate our data. This amounts to a collaboration between scientists, IT, curators and database managers.
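The first three questions form a chain of successively narrower filters over one shared catalog of records. The sketch below shows that pattern only and does not describe how DataSpace works. The `Sample` record, its field names, the thresholds standing in for X, Y and Z, and every value are hypothetical placeholders, and the chlorophyll threshold is assumed to be a lower bound.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sample:
    """Hypothetical catalog record; field names are illustrative only."""
    dataset: str
    kind: str                      # e.g. cruise, mooring, hydrocast
    chlorophyll: Optional[float]   # None when the variable was not measured
    light: Optional[float]
    oxygen: Optional[float]
    has_community_data: bool       # taxonomic composition + gene content


# Placeholder records with made-up values, not real observations.
catalog = [
    Sample("cruise-A", "hydrocast", 0.8, 120.0, 210.0, True),
    Sample("mooring-B", "mooring", 0.3, 40.0, 180.0, False),
    Sample("glider-C", "autonomous survey", 1.2, 15.0, 20.0, True),
    Sample("cruise-D", "deckboard experiment", 0.9, 110.0, 205.0, False),
    Sample("cruise-E", "hydrocast", None, 100.0, 200.0, True),
]


def in_range(value: Optional[float], bounds: Tuple[float, float]) -> bool:
    """True when the value was measured and lies within the closed range."""
    return value is not None and bounds[0] <= value <= bounds[1]


def chlorophyll_above(records: List[Sample], x: float) -> List[Sample]:
    return [r for r in records if r.chlorophyll is not None and r.chlorophyll > x]


def light_and_oxygen_in(records, light_bounds, oxygen_bounds) -> List[Sample]:
    return [r for r in records
            if in_range(r.light, light_bounds) and in_range(r.oxygen, oxygen_bounds)]


def with_community_data(records: List[Sample]) -> List[Sample]:
    return [r for r in records if r.has_community_data]


X, Y, Z = 0.5, (100.0, 150.0), (200.0, 220.0)   # placeholder thresholds
step1 = chlorophyll_above(catalog, X)
step2 = light_and_oxygen_in(step1, Y, Z)
step3 = with_community_data(step2)
print(len(step1), len(step2), [r.dataset for r in step3])
```

Each step takes the previous step's result as input, so the counts answer "how many" and the final list is what the slide marks "(retrieve)". Records with a missing measurement drop out instead of raising an error, which matters when the datasets do not share a common set of variables.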

The DataSpace Project & Biological Oceanography

• Provide infrastructure for digital archiving & preservation at appropriate scales matching scope/complexity of data

• Enable more integrated intra- & inter-project collaborations, analyses, data encoding, documentation, sharing, visualizing, and preservation

• Establish standards & best practices to capture, express, encode and publish the policies related to archived data

• Enable new discoveries by facilitating access, search and storage of large, complex heterogeneous datasets

GENOMES BIOMES

Community genomic and transcriptomic data

Community metabolism

Ecosystem functions

Community composition and interactions
