
Reminder: Class on Friday, Discussion of Li et al. Proposal/Projects CAMERA feedback?

Date post: 22-Dec-2015
Page 1

Reminder: Class on Friday, Discussion of Li et al.

Proposal/Projects

CAMERA feedback?

Page 2

Eukaryotes

Large

Have organelles

Diploid (mostly)

linear chromosomes

lower % coding

Genes have introns

Page 3

Genomes: How Big?

Organism          Genome Size   # of Genes
H. influenzae     1.8 Mb        1,700
E. coli           4.7 Mb        4,400
Yeast             12 Mb         6,300
Diatom (Thaps)    34 Mb         11,000
Fruit Fly         180 Mb        13,600
Fugu              400 Mb        30,000
Human             3000 Mb       30,000

Page 4

http://www.genomesize.com/ Gregory, 2004 Paleobiology 30:179-202

1 pg of DNA ~= 1 billion base pairs (1000 Mbp).
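Genome size databases often report C-values in picograms, so a small converter helps when comparing them with sizes given in Mb. The sketch below uses the slide's round figure of 1000 Mbp per pg by default. The finer constant of about 978 Mbp per pg is a commonly quoted value that is not from these slides, and the function names are only illustrative.

```python
# Slide approximation: 1 pg of DNA ~= 1000 Mbp.
MBP_PER_PG_SLIDE = 1000.0
# Commonly quoted finer value (assumption, not taken from the slides).
MBP_PER_PG_FINE = 978.0


def c_value_to_mbp(picograms, mbp_per_pg=MBP_PER_PG_SLIDE):
    """Convert a DNA mass in picograms to an approximate size in Mbp."""
    return picograms * mbp_per_pg


def mbp_to_c_value(mbp, mbp_per_pg=MBP_PER_PG_SLIDE):
    """Convert a genome size in Mbp to an approximate DNA mass in picograms."""
    return mbp / mbp_per_pg


if __name__ == "__main__":
    # A 3 pg genome under both conversion factors.
    print(c_value_to_mbp(3.0))
    print(c_value_to_mbp(3.0, MBP_PER_PG_FINE))
```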

Page 5

Eukaryotic genomes are big
What does this mean for sequencing?

Strategies are similar
  Low coverage of large insert library (BACs, fosmids)
  Higher coverage of small insert library

Finishing is harder
  Often additional mapping tools (RE maps, optical maps) employed to map scaffolds to chromosomes
  Genomes released in "versions" (Thaps 3.0)
  Publications often based on draft versions
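To get a feel for what "big" means in sequencing effort, coverage can be estimated as (number of reads x read length) / genome size. The sketch below applies that formula together with the simple Poisson estimate (often attributed to Lander and Waterman) of the fraction of bases no read touches. The 700 bp read length and 8x target are assumptions chosen for illustration, not values from the slides.

```python
import math


def expected_coverage(n_reads, read_len_bp, genome_size_bp):
    """Average depth: total sequenced bases divided by genome size."""
    return n_reads * read_len_bp / genome_size_bp


def reads_needed(coverage, read_len_bp, genome_size_bp):
    """Number of reads required to reach a target average depth."""
    return math.ceil(coverage * genome_size_bp / read_len_bp)


def fraction_uncovered(coverage):
    """Poisson estimate of the fraction of bases covered by zero reads."""
    return math.exp(-coverage)


if __name__ == "__main__":
    read_len = 700  # assumed read length in bp
    target = 8      # assumed target depth
    for name, size_bp in [("E. coli", 4_700_000), ("Human", 3_000_000_000)]:
        n = reads_needed(target, read_len, size_bp)
        print(f"{name}: {n} reads for {target}x")
    print(f"fraction uncovered at {target}x: {fraction_uncovered(target):.5f}")
```

The Poisson estimate assumes reads land uniformly at random, which repeats and cloning bias violate in practice, so it is best read as a lower bound on the gaps to expect.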

Page 6

Where are draft Versions in GenBank?

Model organisms have their own web sites

YeastDB
WormDB
FlyBase

Page 7

Eukaryotic genomes are diploid
What does this mean for sequencing?

Finishing is harder
  Will never get a 100% consensus
  Instead identify "high quality discrepancies"
  What is the sequence in the released genome?
  How to find where the SNPs are?

T. pseudonana 0.75% of nuclear genome polymorphic
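One way to picture a "high quality discrepancy" is an aligned position where two different bases are each supported by several confident base calls. The sketch below is a toy filter under that assumption. The thresholds (Phred quality of at least 30, at least 2 supporting reads per base) and the function name are hypothetical and are not the criteria used by any particular genome project.

```python
from collections import Counter


def high_quality_discrepancy(column, min_qual=30, min_reads=2):
    """Return the bases in conflict at one aligned position.

    column: list of (base, phred_quality) tuples, one per read covering
    the position. Calls below min_qual are ignored. The position counts
    as a discrepancy only if at least two distinct bases each keep
    min_reads or more confident calls; otherwise an empty list is returned.
    """
    counts = Counter(base.upper() for base, qual in column if qual >= min_qual)
    supported = sorted(b for b, n in counts.items() if n >= min_reads)
    return supported if len(supported) >= 2 else []


if __name__ == "__main__":
    # Two alleles, each seen twice with good quality: reported.
    print(high_quality_discrepancy([("A", 40), ("A", 35), ("G", 38), ("G", 33)]))
    # The second allele has only one confident call: likely an error, not reported.
    print(high_quality_discrepancy([("A", 40), ("A", 35), ("G", 10), ("G", 38)]))
```

For scale, 0.75% of a roughly 34 Mb genome is on the order of 255,000 polymorphic positions.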

Page 8

Eukaryotic genomes are arranged in linear chromosomes

Finishing is harder
  Need to use additional maps to decide if contigs should be joined or belong on their own chromosomes

Additional mechanisms of gene duplication available/common

Page 9

Eukaryotic genomes have low % coding

Finishing is harder
  Much of noncoding DNA is made up of "selfish DNA"
  Repeats make assembly problematic
  Thaps: 2% of genome is retrotransposons

Mammalian cells: less than 1% of genomic DNA is coding
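Repeats cause trouble because the same stretch of sequence appears in more than one place, so reads from different loci look alike to an assembler. A minimal way to see this is to count k-mers and list those occurring more than once. This is only an illustration with a hypothetical helper name. Real repeat finders and assemblers are far more elaborate.

```python
from collections import Counter


def repeated_kmers(seq, k):
    """Return a dict of k-mers that occur more than once in seq, with counts."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Toy sequence with one 7-base element inserted at two positions.
    toy = "TTGATTACACCGATTACAGG"
    print(repeated_kmers(toy, 7))
```

Any read shorter than the repeated element cannot say which copy it came from, which is why longer inserts and additional maps are needed to place such regions.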

Page 10

Eukaryotic gene structure

Page 11

Gene finding in eukaryotic genomes
Relies on both signal sequences and coding statistics
  Signals: promoters, start and stop codons, splice sites, poly A sites
  These are all relatively weak signals
  Need to combine with codon statistics

Organism-specific training set is crucial
  Generated from cDNA library sequenced in conjunction with genome project
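The "coding statistics" half of gene finding can be sketched as a codon-usage model: estimate codon frequencies from known coding sequences of the organism, then score a candidate region by how much better those frequencies explain it than a uniform background. The code below is a simplified illustration with hypothetical function names and a made-up training set. Actual gene finders combine such scores with signal models in a single probabilistic framework.

```python
import math
from collections import Counter

ALL_CODONS = [a + b + c for a in "ACGT" for b in "ACGT" for c in "ACGT"]


def codons(seq):
    """Split a sequence into in-frame codons, dropping any trailing partial codon."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def train_codon_model(coding_seqs, pseudocount=1.0):
    """Estimate codon frequencies from training coding sequences (e.g. cDNAs)."""
    counts = Counter()
    for s in coding_seqs:
        counts.update(codons(s))
    total = sum(counts[c] for c in ALL_CODONS) + pseudocount * len(ALL_CODONS)
    return {c: (counts[c] + pseudocount) / total for c in ALL_CODONS}


def coding_score(seq, model):
    """Sum of log-odds per codon: trained model versus a uniform 1/64 background."""
    return sum(math.log(model[c] * 64) for c in codons(seq) if c in model)


if __name__ == "__main__":
    # Tiny made-up training set, standing in for an organism-specific cDNA library.
    training = ["ATGGCTGCTGCTGCTTAA", "ATGGCTGCTGCTTAA"]
    model = train_codon_model(training)
    print(coding_score("GCTGCTGCT", model))  # codons seen in training
    print(coding_score("CGACGACGA", model))  # codons never seen in training
```

Because the frequencies come entirely from the training set, a model trained on one organism can score another organism's genes poorly, which is why the organism-specific training set matters.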

Page 12

Implications for environmental genomics

Need even more sequencing to get adequate coverage

For any given piece of DNA, likely to have fewer genes than if it were prokaryotic in origin

Current state of gene finding and available genomes for comparison mean gene finders likely have very poor performance on DNA of unknown origin
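The "fewer genes per piece of DNA" point can be made concrete with the genome sizes and gene counts from the table earlier in these slides. The sketch below assumes genes are spread evenly along the genome, which is a rough simplification, and uses a 40 kb fragment as an assumed fosmid-sized example.

```python
def expected_genes(fragment_kb, genome_mb, n_genes):
    """Expected gene count on a fragment, assuming uniform gene density."""
    density_per_kb = n_genes / (genome_mb * 1000.0)
    return fragment_kb * density_per_kb


if __name__ == "__main__":
    fragment_kb = 40  # assumed fragment size
    # (genome size in Mb, gene count) as listed in the slides' table.
    for name, (size_mb, genes) in {
        "E. coli": (4.7, 4400),
        "Diatom (Thaps)": (34, 11000),
        "Human": (3000, 30000),
    }.items():
        print(f"{name}: {expected_genes(fragment_kb, size_mb, genes):.1f} genes per {fragment_kb} kb")
```

Under these assumptions a 40 kb E. coli fragment carries dozens of genes, while a human fragment of the same size carries less than one on average.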

