Additional_file_3_Ashrafi_et_al_2012_Pepper_Annotation_Supp_05072012.docx

A Microsoft Word 2007 file with 16 figures comparing the Blast2GO results for the GeneChip (Sanger-EST) and transcriptome assemblies of pepper, as well as a flow chart of the IGA transcriptome assembly procedure.

De novo assembly of the pepper transcriptome (Capsicum annuum): a benchmark for in silico discovery of SNPs, SSRs and candidate genes

AUTHORS:

Hamid Ashrafi, Theresa Hill, Kevin Stoffel, Alexander Kozik, Jiqiang Yao, Sebastian Reyes Chin-Wo and Allen Van Deynze

Supplement Figure 1. Distribution of BLASTX E-values for a) the Sanger-EST unigenes and b) the IGA transcriptome contigs.

Supplement Figure 2. Percent similarity of assembly sequences with sequences in GenBank: a) Sanger-EST unigenes b) IGA transcriptome contigs. Similarity is computed for each query-hit pair as the sum of similarity values for all matching HSPs.
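As a rough illustration of the per-pair aggregation described in this caption, the sketch below pools a similarity value over all HSPs of one query-hit pair. The `Hsp` record, its field names, and the choice to normalize the summed positives by the summed alignment length are assumptions made for illustration; the caption does not state how Blast2GO normalizes the sum.

```python
from dataclasses import dataclass


@dataclass
class Hsp:
    positives: int     # positions scored as similar in this HSP (assumed field)
    align_length: int  # alignment length of this HSP (assumed field)


def pair_similarity(hsps):
    """Percent similarity for one query-hit pair, pooled over all of its HSPs."""
    total_length = sum(h.align_length for h in hsps)
    if total_length == 0:
        return 0.0  # no aligned positions, so no similarity to report
    total_positives = sum(h.positives for h in hsps)
    return 100.0 * total_positives / total_length
```

With two hypothetical HSPs of 80/100 and 45/50 similar positions, the pooled value is 125/150, or about 83.3%.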

Supplement Figure 3. Length vs number of sequences in a) Sanger-EST unigenes b) IGA transcriptome contigs.

Supplement Figure 4. High-scoring segment pairs (HSP) per sequence coverage a) Sanger-EST unigenes b) IGA transcriptome contigs.

Supplement Figure 5. Evidence code distribution (footnote 1) for sequences, depicting the inference behind each annotation. For instance, IEA is inferred from electronic annotation and IDA is inferred from direct assay. a) Sanger-EST unigenes b) IGA transcriptome contigs.

1 Once mapping has been completed, the user can check the distribution of evidence codes in the recovered GO terms and the original database sources of annotations. These charts give an indication of suitable values for B2G annotation parameters. For example, when a good overall level of sequence similarity is obtained for the dataset, the default annotation cutoff value could be raised to improve annotation accuracy. Similarly, if evidence code charts indicate a low representation of experimentally derived GOs, the user might choose to increase the weight given to electronic annotations. After the final annotation step, new charts show the distribution of annotated sequences, the number of GOs per sequence, the number of sequences per GO, and the distribution of annotations per GO level, which jointly provide a general overview of the performance of the annotation procedure.
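The interplay between the annotation cutoff and the evidence-code weights mentioned in this note can be sketched as a simple threshold test. This is a deliberately simplified illustration, not Blast2GO's actual annotation rule: the function name, the weight table, and the default values are all assumptions chosen for the example.

```python
# Hypothetical evidence-code weights, chosen for illustration only.
EC_WEIGHTS = {"IDA": 1.0, "IEA": 0.7}


def keep_annotation(similarity, evidence_code, cutoff=55.0, weights=EC_WEIGHTS):
    """Return True if a candidate GO term passes the (simplified) cutoff.

    The score is the hit similarity (percent) scaled by the weight of the
    evidence code; unknown evidence codes get weight 0 and never pass.
    """
    score = similarity * weights.get(evidence_code, 0.0)
    return score >= cutoff
```

Under these assumed values, a 70% similar hit with an IEA term scores about 49 and is dropped, while the same hit with an IDA term scores 70 and is kept. Raising the IEA weight to 0.8 lifts the first score to about 56, which is the kind of adjustment the note describes for datasets with few experimentally derived GO terms.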

Supplement Figure 6. Evidence code distribution for BLAST hits, depicting the inference behind each annotation. For instance, IEA is inferred from electronic annotation and IDA is inferred from direct assay. a) Sanger-EST unigenes b) IGA transcriptome contigs.

Supplement Figure 7. Number of high-scoring segment pairs (HSPs) per BLAST hit a) Sanger-EST unigenes b) IGA transcriptome contigs.

Supplement Figure 8. Database resources used for the mapping step of Blast2GO a) Sanger-EST unigenes b) IGA transcriptome contigs.

[Figure panels a and b. x-axis: Number of GO terms; y-axis: Number of contigs.]

Supplement Figure 9. Number of GO terms per contig. a) On average (weighted average), 5 GO terms were mapped to 19,966 (64%) contigs of the Sanger-EST assembly. b) On average (weighted average), 5 GO terms were mapped to 37,000 (30%) contigs of the IGA transcriptome assembly.
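A weighted average of this kind can be computed from the histogram behind the figure, weighting each GO-term count by the number of contigs that carry it. The sketch below assumes the histogram is available as a mapping from GO-term count to contig count; the function name and the sample numbers are hypothetical and are not taken from the pepper data.

```python
def weighted_mean_go_terms(histogram):
    """Weighted mean number of GO terms per annotated contig.

    `histogram` maps a GO-term count k to the number of contigs n_k that
    were assigned exactly k GO terms: mean = sum(k * n_k) / sum(n_k).
    """
    total_contigs = sum(histogram.values())
    if total_contigs == 0:
        raise ValueError("histogram contains no contigs")
    return sum(k * n for k, n in histogram.items()) / total_contigs


# Hypothetical histogram: 10 contigs with 1 term, 20 with 5, 10 with 9.
example = {1: 10, 5: 20, 9: 10}
print(weighted_mean_go_terms(example))
```

For the hypothetical histogram the sum of k * n_k is 200 over 40 contigs, giving a weighted mean of 5 GO terms per contig.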

Supplement Figure 10. Number of annotations at each GO level. P stands for biological process, F for molecular function and C for cellular component. a) Sanger-EST unigenes b) IGA transcriptome contigs.

[Figure panel a: Direct GO counts of biological processes. x-axis: Number of sequences; y-axis: Biological processes. Categories: Pigmentation; Nitrogen utilization; Biological adhesion; Locomotion; Cell killing; Viral reproduction; Cell proliferation; Rhythmic process; Carbon utilization; Immune system process; Death; Growth; Cell wall organization or biogenesis; Multi-organism process; Reproduction; Cellular component biogenesis; Signaling; Cellular component organization; Multicellular organismal process; Developmental process; Localization; Biological regulation; Response to stimulus; Metabolic process; Cellular process.]

[Figure panel b: Direct GO counts of molecular functions. x-axis: Number of sequences; y-axis: Biological function. Categories: Amine binding; Enzyme activator activity; Nucleoside-triphosphatase regulator activity; Carboxylic acid binding; Peroxidase activity; Metal cluster binding; Enzyme inhibitor activity; Lipid binding; Carbohydrate binding; Vitamin binding; Isomerase activity; Tetrapyrrole binding; Signal transducer activity; Structural constituent of ribosome; Lyase activity; Sequence-specific DNA binding TF activity; Ligase activity; Cofactor binding; Substrate-specific transporter activity; Transmembrane transporter activity; Oxidoreductase activity; Nucleic acid binding; Protein binding; Hydrolase activity; Ion binding; Nucleotide binding; Transferase activity.]

[Figure panel c: Direct GO counts of cellular components. x-axis: Number of sequences; y-axis: Cellular component. Categories: Receptor complex; Periplasmic space; Apical part of cell; Extracellular space; External encapsulating structure part; Extracellular matrix; Cell surface; Beta-galactosidase complex; Protein histidine kinase complex; Cell fraction; Intrinsic to organelle membrane; Vesicle membrane; Golgi membrane; Serine/threonine phosphatase complex; Network of nuclear outer & ER membranes; Organelle subcompartment; External encapsulating structure; Endomembrane system; Organelle envelope; Envelope; Membrane-bounded vesicle; Organelle lumen; Organelle membrane; Membrane part; Intracellular organelle part; Membrane; Intracellular part; Intracellular.]

[Figure panel a: Direct GO counts of biological processes. x-axis: Number of sequences; y-axis: Biological process. Categories: Sulfur utilization; Nitrogen utilization; Pigmentation; Biological adhesion; Cell killing; Viral reproduction; Locomotion; Cell proliferation; Carbon utilization; Rhythmic process; Immune system process; Cell wall organization or biogenesis; Growth; Death; Multi-organism process; Cellular component biogenesis; Reproduction; Signaling; Cellular component organization; Developmental process; Multicellular organismal process; Localization; Biological regulation; Response to stimulus; Metabolic process; Cellular process.]

Supplement Figure 11. Direct GO count graphs depicting a) biological processes, b) cellular components and c) molecular functions in the Sanger-EST assembly.

[Figure panel c: Direct GO counts of cellular components. x-axis: Number of sequences; y-axis: Cellular component. Categories: Proton-transporting ATP synthase complex; NADH dehydrogenase complex; Respiratory chain complex I; Coated membrane; Cytoplasmic vesicle part; Organelle outer membrane; Membrane coat; Outer membrane; Proton-transporting two-sector ATPase complex; Proteasome complex; Nuclear envelope; Extrinsic to membrane; Photosystem; Respiratory chain; Endoplasmic reticulum membrane; Mitochondrial membrane part; Ubiquitin ligase complex; Organelle inner membrane; Photosynthetic membrane; Thylakoid part; Thylakoid; Cell wall; Plasma membrane part; Ribonucleoprotein complex; Intrinsic to membrane; Plasma membrane; Cytoplasmic part; Cytoplasm; Intracellular organelle.]

Supplement Figure 12. Direct GO count graphs depicting a) biological processes, b) cellular components and c) molecular functions in the IGA transcriptome assembly.

Supplement Figure 13. The relationship between the number of GO terms and the length of sequences. a) Sanger-EST unigenes b) IGA transcriptome contigs.

Supplement Figure 14. Distribution of annotation score vs. number of sequences a) Sanger-EST unigenes b) IGA transcriptome contigs.

Supplement Figure 15. The relationship between length of sequence and annotation a) Sanger-EST unigenes b) IGA transcriptome contigs.

Supplement Figure 16. A flow chart of the steps taken to assemble the pepper IGA reads. A super assembly comprises the combined assembly of Velvet k-mers or CLC workbench iterations (two super assemblies within each square box). The assembly of each super assembly is depicted in a different color to show the Mega assemblies (immediately below each box). The Mega assemblies were combined to make the Meta assembly (navy blue box marked as reference sequence).
