Plant and Animal Genome Conference XX




CottonGen: Developing an Integrated Genomics, Genetics and Breeding Database for Cotton Research

Jing Yu (1), Sook Jung (1), Stephen Ficklin (1), Chun-Huai Cheng (1), Taein Lee (1), Ping Zheng (1), Don Jones (2), Richard Percy (3), Dorrie Main (1)

(1) Washington State University; (2) Cotton Incorporated; (3) USDA-ARS

A new cotton community database, CottonGen (http://www.cottongen.org), is being developed to further enable cotton genomics, genetics and breeding research. CottonGen is being built using the open-source Tripal database infrastructure developed for the genome databases hosted at Washington State University (www.bioinfo.wsu.edu). CottonGen will initially consolidate the data from CottonDB and the Cotton Marker Database (CMD), which includes sequences, genetic and physical maps, genotypic and phenotypic markers and polymorphisms, QTLs, pathogens, germplasm collections and trait evaluations, pedigrees, and relevant bibliographic citations. CottonGen will then be expanded to include annotated transcriptome, genome sequence, marker-trait-locus and breeding data, as well as enhanced tools for easy querying and visualization of research data. The functionality being developed for CottonGen is illustrated with examples from our existing Tripal databases.


Acknowledged with thanks

Goal: To provide a centralized, user-friendly, web portal for access to integrated cotton genomic, genetic and breeding data and tools, enabling basic, translational and applied research for the cotton community.

Objectives:
• Build an efficient, modular database through implementation of the open-source Tripal database infrastructure.
• Integrate annotated transcriptome and genome sequence data.
• Consolidate CottonDB and CMD in CottonGen.
• Build a toolbox where breeders can access tools to analyze integrated genotype, phenotype, and pedigree data.
• Implement enhanced tools for easy querying and data visualization.

Building an Efficient Database

Chado
• Generic database schema
• Relational database schema
• Sophisticated schema for molecular biology
• Handles complex representations

Content Management System
• Easy to use and update
• Open source
• Popular, powerful, robust and secure
• Modular and extensible
• Users/roles/access control

Drupal modules as web front-end for Chado
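To make the idea of a generic relational schema concrete, here is a minimal sketch in Python using an in-memory SQLite database. It is not the CottonGen implementation: real Chado runs on PostgreSQL and has many more tables and columns. The three tables below are a deliberately reduced, Chado-style subset; the helper `features_of_type` and the record names are hypothetical and exist only for illustration. The point it shows is that one `feature` table can hold genes, markers and other record kinds, each distinguished by a controlled-vocabulary term rather than by its own table.

```python
import sqlite3

# Toy, heavily simplified Chado-style layout (an assumption for illustration;
# the real Chado schema is far larger and is normally hosted on PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE organism (
    organism_id INTEGER PRIMARY KEY,
    genus       TEXT,
    species     TEXT
);
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    name      TEXT UNIQUE
);
CREATE TABLE feature (
    feature_id  INTEGER PRIMARY KEY,
    organism_id INTEGER REFERENCES organism(organism_id),
    uniquename  TEXT,
    type_id     INTEGER REFERENCES cvterm(cvterm_id)
);
""")

# Hypothetical example rows, not real CottonGen data.
conn.execute("INSERT INTO organism VALUES (1, 'Gossypium', 'hirsutum')")
conn.executemany("INSERT INTO cvterm VALUES (?, ?)",
                 [(1, "gene"), (2, "genetic_marker")])
conn.executemany("INSERT INTO feature VALUES (?, ?, ?, ?)",
                 [(1, 1, "example_gene_1", 1),
                  (2, 1, "example_marker_1", 2),
                  (3, 1, "example_marker_2", 2)])


def features_of_type(conn, type_name):
    """Return the unique names of all features typed with the given term."""
    rows = conn.execute(
        "SELECT f.uniquename FROM feature f "
        "JOIN cvterm c ON c.cvterm_id = f.type_id "
        "WHERE c.name = ? ORDER BY f.uniquename",
        (type_name,),
    )
    return [row[0] for row in rows]


markers = features_of_type(conn, "genetic_marker")
genes = features_of_type(conn, "gene")
```

A web front-end can then query such a schema by record type without knowing in advance which kinds of data have been loaded, which is roughly the role the poster assigns to the Drupal modules.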

More Tools (Rosaceae and Citrus)

Map Comparisons in CMAP

Genome Sequences in GBrowse (GDR)

Public site

Private site

Markers

Maps, Markers and FPC data of CottonDB and CMD will be merged in CottonGen
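As a rough illustration of what merging marker records from two sources can involve, the sketch below combines two lists of records and tracks which source each marker came from. Everything in it is an assumption: the poster does not describe the actual merge procedure, the function `merge_markers` is hypothetical, the records are made up, and matching on a case-insensitive, whitespace-trimmed name is just one simple rule chosen for the example.

```python
def merge_markers(*sources):
    """Merge marker records from several (source_name, records) pairs.

    Records are matched on a normalized name (trimmed, upper-cased).
    This matching rule is an assumption made for illustration only.
    """
    merged = {}
    for source_name, records in sources:
        for rec in records:
            key = rec["name"].strip().upper()
            entry = merged.setdefault(
                key,
                {"name": rec["name"].strip(), "sources": [], "aliases": set()},
            )
            entry["sources"].append(source_name)
            entry["aliases"].update(rec.get("aliases", []))
    return merged


# Hypothetical records, not taken from CottonDB or CMD.
cottondb_records = [{"name": "MARKER_A", "aliases": ["mA"]}, {"name": "MARKER_B"}]
cmd_records = [{"name": "marker_a "}, {"name": "MARKER_C"}]

merged = merge_markers(("CottonDB", cottondb_records), ("CMD", cmd_records))
```

Keeping the list of contributing sources on each merged entry makes it possible to trace a consolidated record back to the database it came from.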

CottonDB taxonomy data will be added to CottonGen

Annotated cotton genome sequences will be hosted in CottonGen and will include genes, transcripts, SSRs, SNPs, primers, etc.

Housing both Public and Private Data

The Breeders Toolbox will be enhanced to include more searching and analysis features (GDR example on right)

Example from the Cacao Genome Database

The Cotton Research Community
