CHAPTER 5: Development of SSR primers from EST database of Jatropha curcas
Marker assisted selection for high oil yielding varieties in Jatropha curcas 62
5.1 INTRODUCTION
The development of DNA-based genetic markers has been the driving force behind
the current revolution in animal and plant genetics (Dodgson et al., 1997). The abundance
and hyper-variability associated with SSRs make them ideal candidates for
development of markers for genetic mapping, fingerprinting, gene tagging, marker-
assisted selection and evolutionary studies (Kantety et al., 2002; Powell et al., 1996;
Rafalski et al., 1993; Tautz, 1989). The standard molecular biology method for
developing SSR markers is the construction of small insert libraries followed by
nucleic acid hybridization-based identification of candidate clones and sequencing
(Liu et al., 1996; Akkaya et al., 1992; Morgante et al., 1993). While improved SSR
enrichment methods reduce marker development costs, they still require some
time-consuming steps (Kumpatla et al., 2004). Computational
approaches provide an attractive alternative to conventional laboratory methods for
rapid and economical development of SSR markers by utilizing freely available
sequences in public databases (Varshney et al., 2002).
Expressed Sequence Tag (EST) databases have received much attention as potentially
valuable resources for the development of molecular markers for population genetics
studies and gene discovery, owing to the increasing number of ESTs being deposited in
databases for various plants. Publicly available EST collections are a largely
unexplored source of expression data (Ewing and Claverie, 2000). Currently there are more
than 2 million ESTs available for major monocotyledonous species and more than 1.5
million ESTs for dicots. The usefulness of EST-SSR markers arises from their close
linkage to potentially important genes, helping to identify candidate genes for
quantitative trait loci (QTL).
Expressed Sequence Tags (ESTs) are short sequence reads, typically within the range
of 100–700 bp, obtained from randomly selected cDNA clones. ESTs are often
generated by single pass sequencing of cDNA clones from one or both ends, usually
covering only a part of the transcript sequence, and are relatively prone to error
(~3%). Despite this limitation, EST sequencing represents a mainstream
methodology for gene surveying. Even now, when whole genome sequences
are available for many organisms, ESTs continue to play an important role in gene
identification, gene expression studies, transcript mapping and description of the
transcriptional activity of a tissue/cell type, and they provide evidence for gene prediction and an
abundant resource of molecular markers for physical mapping (Gruber et al., 2006). In
particular, ESTs provide valuable resources to develop gene-associated SSR markers.
Since EST-SSR markers are derived from expressed genes, they are more conserved
and have a better potential for applications such as identifying conserved genomic
regions among species and genera, comparative genomics, and evolutionary studies
(Thiel et al., 2003; Eujayl et al., 2004; Ukoskit et al., 2012). Also, due to their
existence in transcribed regions of genomic DNA, they can lead to the development of
gene-based maps which may help to identify candidate functional genes and increase
the efficiency of marker-assisted selection (Liang et al., 2009). Moreover, the
development of SSR (microsatellite) markers from genomic libraries is expensive and
inefficient, while development of SSR markers through data mining has become a
fast, efficient, and low-cost option for many plants (Eujayl et al., 2004).
Bioinformatics approaches are increasingly being used for molecular marker
development since the sequences from many genomes are made freely available in the
public databases (Gu et al., 1998; Kantety et al., 2002; Varshney et al., 2002).
Additionally, bioinformatics tools also supplement existing approaches by automating
the task of SSR identification from available DNA sequences. Moreover, recent
studies have observed that the frequency of microsatellites was significantly higher in
ESTs than in genomic DNA in several plant species investigated (Morgante et al.,
2002; Toth et al., 2000). Because of the above advantages, SSR markers have been
developed from ESTs in various crops including A. thaliana (Delseny, 1999),
sugarcane (Ukoskit et al., 2012), Medicago truncatula (Eujayl et al., 2004), barley
(Thiel et al., 2003), rubber (Li et al., 2012), cotton (Han et al., 2006), Capsicum
(Ince et al., 2010), oil palm (Billotte et al., 2001), grasses (Kantety et al., 2002) and
J. curcas (Kumar et al., 2011; Wen et al., 2010). Around 1% to 5% of the ESTs
contain SSRs, hence these SSRs have become the marker class of choice for
molecular mapping and plant breeding studies (Eujayl et al., 2004).
Jatropha curcas L. is promoted as a non-edible biodiesel crop worldwide. Though SSRs
are markers of choice in many plant species, only a very limited number of SSR
markers are publicly available for Jatropha curcas. The two major limiting factors in
the use of molecular markers for quantitative trait locus (QTL) analysis and marker-
assisted selection programs in Jatropha are: 1) the limited number of suitable markers
available in the public sector, and 2) the lack of knowledge of how these markers are
associated with economically important QTLs (traits). The use of the EST database for
marker development will provide a promising tool to enhance molecular and genomic
research in Jatropha.
In this study, we have characterized informative SSR markers from a large collection
of 42,483 ESTs in the EST database of Jatropha curcas, which should provide
a clear picture of the repeat types, number of repeats, frequency and distribution of
EST-SSRs in Jatropha curcas. These EST-SSR markers will enrich the current
resource of molecular markers for the Jatropha community and should be useful for
qualitative and quantitative trait mapping, marker-assisted selection, and genetic
diversity studies in Jatropha as well as related plant species.
5.2 METHODS AND MATERIALS
5.2.1 Dataset
J. curcas sequences used in this study were obtained from NCBI's EST database
(http://www.ncbi.nlm.nih.gov/nucest?term=jatropha%20curcas[organism]). These EST
sequences were collected in FASTA format for the identification of SSRs.
5.2.2 SSR analysis:
Perfect mono-, di-, tri-, tetra-, penta-, and hexa-nucleotide motifs with a repeat number of ≥6
were identified using the software WebSat (http://wsmartins.net/websat/), which
combines an SSR repeat finder with the PCR primer design program Primer3
in a single pipeline (Martins et al., 2009). The EST sequences from the
Jatropha curcas database were downloaded and entered into WebSat. Because
the program can process up to 150,000 characters per submission, multiple FASTA-formatted
sequences were processed for SSR analysis at a time. The output generated by the program
highlights the SSR sequences in yellow (Fig 5.1).
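The 150,000-character limit means that a large EST set has to be submitted in several batches. A minimal sketch of such batching is given below. The function name `batch_fasta` is hypothetical, and it is an assumption here that header lines and newlines count towards the limit; the original workflow does not say how the batches were prepared.

```python
def batch_fasta(records, max_chars=150_000):
    """Group (header, sequence) records into FASTA-formatted batches.

    Each batch is a single string whose total length does not exceed
    max_chars, unless a single record is itself longer than the limit.
    Assumption: header lines and newlines count towards the limit.
    """
    batches, current, size = [], [], 0
    for header, seq in records:
        entry = f">{header}\n{seq}\n"
        # Start a new batch when adding this entry would exceed the limit.
        if current and size + len(entry) > max_chars:
            batches.append("".join(current))
            current, size = [], 0
        current.append(entry)
        size += len(entry)
    if current:
        batches.append("".join(current))
    return batches


if __name__ == "__main__":
    demo = [("est1", "A" * 10), ("est2", "C" * 10), ("est3", "G" * 10)]
    for i, batch in enumerate(batch_fasta(demo, max_chars=40), start=1):
        print(f"batch {i}: {batch.count('>')} sequences, {len(batch)} characters")
```

Each returned string can then be pasted into the submission form as one job.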
Fig 5.1: Identification of SSR markers using WebSat. SSR markers are highlighted in
yellow. The arrow shows the type of SSR.
The analysis of occurrence and frequency of SSRs among EST sequences was carried
out by exporting the WebSat results to Microsoft Excel spreadsheets.
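The perfect-repeat search described in this section can be illustrated with a short regular-expression sketch. This is not WebSat's implementation; the function names and the single `min_repeats` threshold for all motif lengths are assumptions made for illustration. Motifs that are themselves repetitions of a shorter unit (for example "AA" as a dinucleotide) are skipped so that each tract is reported once, under its shortest motif.

```python
import re


def is_primitive(motif):
    """True if the motif is not a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d)
                   for d in range(1, n))


def find_perfect_ssrs(seq, min_repeats=6, max_motif_len=6):
    """Return (start, motif, repeat_count) for perfect SSRs in seq.

    start is a 0-based position; motifs of 1 to max_motif_len bases are
    searched, each requiring at least min_repeats tandem copies.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif_len + 1):
        # One captured motif of length k followed by at least min_repeats-1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "GATC" + "CT" * 8 + "GGC" + "A" * 12 + "CG"
    for start, motif, count in find_perfect_ssrs(example):
        print(f"({motif}){count} at position {start}")
```

Raising `min_repeats` for mononucleotides, or passing a different threshold per motif length, would be a straightforward extension.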
5.2.3 Marker Development
The 3682 SSR-containing sequences were subsequently analyzed for primer design
with WebSat (http://wsmartins.net/websat), which uses the Primer3 software.
Flanking DNA sequences were analyzed for the presence of suitable forward
and reverse primers to assay the SSR loci (Robinson et al., 2004).
The parameters set for primer design were: Primer Size Min: 18 bp, Optimum Primer
Size: 22 bp, Primer Size Max: 27 bp, Primer Tm Min: 57.0°C, Optimum Primer Tm:
60.0°C, Primer Tm Max: 68.0°C, Primer GC% Min: 40.0, Primer GC% Max: 80.0,
Max Tm Difference: 1.0°C and Product Size: 100-400 bp.
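The limits listed above can be written down as a small checker, shown here as a sketch. The names `PRIMER_LIMITS`, `primer_ok` and `pair_ok` are hypothetical, and the melting temperatures are taken as inputs (as reported by the design program) rather than recomputed, since Primer3's thermodynamic Tm calculation is not reproduced here.

```python
# Limits taken from the primer design parameters listed in the text.
PRIMER_LIMITS = {
    "size": (18, 27),       # primer length, bp
    "tm": (57.0, 68.0),     # melting temperature, degrees C
    "gc": (40.0, 80.0),     # GC content, %
    "max_tm_diff": 1.0,     # maximum Tm difference within a pair, degrees C
    "product": (100, 400),  # PCR product size, bp
}


def gc_percent(primer):
    """GC content of a primer as a percentage."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def primer_ok(primer, tm, limits=PRIMER_LIMITS):
    """Check one primer against the size, Tm and GC% limits."""
    lo_len, hi_len = limits["size"]
    lo_tm, hi_tm = limits["tm"]
    lo_gc, hi_gc = limits["gc"]
    return (lo_len <= len(primer) <= hi_len
            and lo_tm <= tm <= hi_tm
            and lo_gc <= gc_percent(primer) <= hi_gc)


def pair_ok(fwd, rev, tm_fwd, tm_rev, product_size, limits=PRIMER_LIMITS):
    """Check a primer pair, including Tm difference and product size."""
    lo_prod, hi_prod = limits["product"]
    return (primer_ok(fwd, tm_fwd, limits)
            and primer_ok(rev, tm_rev, limits)
            and abs(tm_fwd - tm_rev) <= limits["max_tm_diff"]
            and lo_prod <= product_size <= hi_prod)
```

A check of this kind is useful mainly as a sanity test on exported primer lists; the design program itself already enforces the limits.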
If primer design is successful, the designed primer pair is colored green
and the SSR is colored blue. If not, a message reporting the failure of primer design
appears (Fig 5.2).
Fig 5.2: Output of WebSat software. The SSR is highlighted in blue and the primers specific
for that SSR are shown in green.
The output file can be easily visualized in a spreadsheet program by using the option to
import external data from a CSV file (MS Excel), with the following fields for each SSR: the
sequence identification, SSR, product size, forward and reverse primer sequences,
melting temperature, and coordinates of the primers within the sequence (Fig 5.3).
Fig 5.3: WebSat export file in MS Excel format
5.2.4 Compositional analysis of SSR mining results
The analysis of occurrence and frequency of SSRs was carried out by exporting the
WebSat results to Microsoft Excel spreadsheets. Results on repeat types, number of
repeats and frequency were first collected using a combination of sorting and
counting functions. The final results were then tabulated.
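The sorting and counting step can also be scripted. The sketch below tallies repeat types from a CSV export; the column name `SSR` and the `(MOTIF)N` notation (as used in Table 5.2) are assumptions about the export layout, and the function name is hypothetical.

```python
import csv
import io
import re
from collections import Counter

# Matches SSR descriptions such as "(CT)14" or "(T)17".
SSR_PATTERN = re.compile(r"\(([ACGT]+)\)(\d+)")
TYPE_NAMES = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def tally_repeat_types(csv_text, ssr_column="SSR"):
    """Count SSRs per repeat type from CSV text with an SSR column."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        match = SSR_PATTERN.fullmatch(row[ssr_column].strip())
        if match:
            counts[TYPE_NAMES.get(len(match.group(1)), "other")] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "ID,SSR,ProductSize\n"
        "GI|311213413,(T)17,261\n"
        "GI|311196822,(CTT)6,273\n"
        "GI|311196549,(TC)6,373\n"
        "GI|311196429,(CT)14,306\n"
    )
    counts = tally_repeat_types(sample)
    total = sum(counts.values())
    for repeat_type, n in counts.most_common():
        print(f"{repeat_type}: {n} ({100 * n / total:.1f}%)")
```

The same counter can be keyed on the motif itself instead of its length to obtain motif-level frequencies.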
5.2.5 In silico validation of designed EST-SSR primers
For in silico validation of primers, the web software NetPrimer
(http://www.premierbiosoft.com/netprimer/index.html) was used. All primers
were analyzed for melting temperature and for primer secondary
structures, including hairpins, loops, self-dimers, and cross-dimers in primer pairs, to
ensure the availability of the primer for the reaction and to minimize the
formation of primer dimers. A comprehensive analysis report was generated for
individual primers or primer pairs.
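As a rough illustration of what a dimer check looks for, the sketch below measures the longest contiguous stretch over which two primers could base-pair in antiparallel orientation. This is a crude proxy and not NetPrimer's method, which scores structures thermodynamically; the function names are hypothetical. The idea is that a stretch of primer A can pair with primer B exactly when that stretch also occurs in the reverse complement of B.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_run(primer_a, primer_b):
    """Length of the longest contiguous stretch where the two primers can pair.

    Implemented as the longest common substring of primer_a and the
    reverse complement of primer_b (dynamic programming over both strings).
    """
    a = primer_a.upper()
    target = reverse_complement(primer_b)
    best = 0
    prev = [0] * (len(target) + 1)
    for base_a in a:
        cur = [0] * (len(target) + 1)
        for j, base_t in enumerate(target, start=1):
            if base_a == base_t:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def self_dimer_run(primer):
    """Longest stretch over which a primer can pair with a second copy of itself."""
    return longest_complementary_run(primer, primer)
```

Long runs, especially those involving the 3' end, would flag a primer or primer pair for closer inspection with a dedicated tool.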
5.3 RESULTS AND DISCUSSION
5.3.1 SSR analysis:
Simple sequence repeats have proven to be highly abundant and uniformly distributed
in human and other mammalian genomes (Weber et al., 1989). Several studies have
demonstrated the occurrence, distribution, informativeness and Mendelian inheritance
of SSRs in plant genomes (Wang et al., 1994; Anon, 2004). It has also been reported
that SSRs occur as frequently as once in about 6 kb in case of plant genomes (Cardle
et al., 2000). Studies on several plant genomes have also demonstrated that the
frequencies of SSRs were significantly higher in ESTs than in genomic DNA
(Morgante et al., 2002). The knowledge of the occurrence and frequency of different
types of SSRs in different genomes is valuable for an understanding of their
distribution and also in developing SSR markers for genetic analysis and diagnostics.
In this study, we have characterized informative SSR markers from a large collection
of 42,483 ESTs, which provides a clear picture of the occurrence, distribution, and
informativeness of EST-SSRs in Jatropha curcas. In order to assess the
frequency of SSRs in EST sequences of J. curcas, percentages of SSR-containing
ESTs were calculated. A total of 3682 ESTs contained SSRs (8.67% of the total
42,483 ESTs). This is a relatively high abundance of SSRs for Jatropha ESTs
compared to previous reports for maize (1.4%), barley (3.4%), wheat (3.2%),
sorghum (3.6%), rice (4.7%) (Kantety et al., 2002), Medicago truncatula (3.0%)
(Eujayl et al., 2004) and peanut (6.8%) (Liang et al., 2009). Studies on the
abundance of SSRs in monocots revealed that SSRs were present in about 7% to 10%
(Varshney et al., 2002) of the total ESTs. Kumpatla and Mukhopadhyay (2005) analysed
1.5 million ESTs derived from 55 dicotyledonous species and found that 2.6 to 16.8% of ESTs
contained at least one SSR. The observed frequencies of SSR-containing ESTs in
several of the dicotyledonous species are much higher (many species contain more
than 10%) than in monocots (Anon, 2004). The two most likely reasons for these
observations are: (i) the frequency estimates in some species may not represent the
actual values owing to the smaller number of ESTs available in public databases and
(ii) several of the ESTs in species with a high frequency of SSR-ESTs may be
redundant.
The relative abundance of mono-, di-, tri- and tetra-nucleotide repeats was
determined by calculating their frequencies in ESTs containing single SSR stretches.
According to the repeat motif classification criteria, we divided the SSRs into three
groups: perfect, imperfect and compound types (Weber, 1990). Table 5.1 shows these
different classes of SSRs and their frequencies in the EST database of J. curcas. Most
repeats (SSRs ≥20 bp: 673, 20.72%; SSRs 12-20 bp: 1777, 54.72%) were perfect
repeats. Of these, mono- and di-nucleotide repeats were the most abundant motif types.
In the imperfect and compound SSR categories, only mono-, di- and tri-nucleotide
SSR units were present. Most of the repeat motifs in mono-nucleotide SSR units were of
the A/T type. AG/CT, GA/TC and AT/TA repeat motif types were present in
di-nucleotide SSR units, while AAG/AGA/GAA/CTT/TTC/TCT repeat motifs were
found in tri-nucleotide SSR units. Six combinations of SSR units (mono-mono, di-di,
tri-tri, mono-di, mono-tri and di-tri) were found in both the perfect and
imperfect compound SSR categories. The distribution of the different types of EST-SSRs
is shown in Figure 5.4; the numbers in the chart indicate the actual number and percentage of
sequences in each category.
The SSR loci were categorized into two groups based on the length of their SSR
tracts: Class I SSRs, 12 to 20 nucleotides in length, and Class II SSRs, containing perfect
SSRs >20 nucleotides in length (Fig. 5.5). Mononucleotides were the most abundant repeat
motif in Class I (869, 49%), whereas dinucleotides were the most abundant in Class II
(269, 40%). The Class I repeats were composed of 49% mononucleotide, 41% dinucleotide
(722) and 10% trinucleotide repeats, whereas Class II repeats comprised 40% dinucleotide,
35% trinucleotide, 23% mononucleotide, 2% tetra- and hexanucleotide and less than 1%
pentanucleotide repeats (counts as in Table 5.1).
Table 5.1: The total number of EST-SSRs identified in J. curcas

                                SSR tract length (nucleotides)
SSR markers                  <10     10-11     12-20      >20     Total
1) Interrupted
   a) Compound                 -        -         2       269       271
   b) Simple                   -        -         2       104       106
2) Non-interrupted
   a) Imperfect                -        -         1        57        58
   b) Perfect
      Mononucleotide           -      781       869       157      1807
      Dinucleotide             -        -       722       269       991
      Trinucleotide            -        -       186       232       418
      Tetranucleotide          -        -         -         6         6
      Pentanucleotide          -        -         -         2         2
      Hexanucleotide           -        -         -         7         7
3) Overlapped                  2        -        12         2        16
TOTAL                          2      781      1794      1105      3682
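The totals in Table 5.1 and the overall SSR frequency can be cross-checked with a few lines of arithmetic. The sketch below uses the individual cell counts from the table (dashes treated as zero); the variable and function names are hypothetical.

```python
# Cell counts per length class: (<10, 10-11, 12-20, >20 nucleotides).
TABLE_5_1 = {
    "Interrupted, compound":      (0, 0, 2, 269),
    "Interrupted, simple":        (0, 0, 2, 104),
    "Non-interrupted, imperfect": (0, 0, 1, 57),
    "Perfect, mononucleotide":    (0, 781, 869, 157),
    "Perfect, dinucleotide":      (0, 0, 722, 269),
    "Perfect, trinucleotide":     (0, 0, 186, 232),
    "Perfect, tetranucleotide":   (0, 0, 0, 6),
    "Perfect, pentanucleotide":   (0, 0, 0, 2),
    "Perfect, hexanucleotide":    (0, 0, 0, 7),
    "Overlapped":                 (2, 0, 12, 2),
}
TOTAL_ESTS = 42_483


def row_totals(table):
    """Sum of each row across the four length classes."""
    return {name: sum(cells) for name, cells in table.items()}


def column_totals(table):
    """Sum of each length class across all rows."""
    return tuple(sum(cells[i] for cells in table.values()) for i in range(4))


def percent(part, whole):
    """Percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


if __name__ == "__main__":
    columns = column_totals(TABLE_5_1)
    grand_total = sum(columns)
    print("column totals:", columns)
    print("grand total:", grand_total)
    print("SSR-containing ESTs (%):", percent(grand_total, TOTAL_ESTS))
    for name, total in row_totals(TABLE_5_1).items():
        print(f"{name}: {total}")
```

Summing the perfect-repeat rows in this way gives the figure for non-interrupted perfect SSRs that is plotted in Fig 5.4.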
Fig 5.4: Distribution of SSR markers mined from the EST database of Jatropha curcas L.
(Interrupted, compound: 271, 7%; Interrupted, simple: 106, 3%; Non-interrupted,
imperfect: 58, 2%; Non-interrupted, perfect: 3231, 88%)
Fig 5.5 (a): Comparative distribution of different repeat motifs, Class I
Fig 5.5 (b): Comparative distribution of different repeat motifs, Class II
5.3.2 Compositional analysis of SSRs
Computational mining and analysis of SSRs in ESTs of some cereal species revealed
that trimeric repeats are the most abundant class, followed by dinucleotide repeats
(Varshney et al., 2002). As a general trend in dicotyledonous species, dinucleotides are
the most abundant repeats (Anon, 2004). In the present study on J. curcas,
mononucleotide repeats formed the largest group, which may be because several of the
ESTs in GenBank still contain polyA/polyT stretches at their ends owing to a lack of
processing prior to deposition. Thus, during SSR mining the As and Ts at
the ends of ESTs would be identified by WebSat as mononucleotide repeats.
Mononucleotide repeats were followed by dinucleotide and then by
trinucleotide repeats.
Theoretically, the probability of finding mononucleotide repeats in a genome is highest,
followed by dinucleotide repeats, then trinucleotide repeats and then
tetranucleotide repeats. The results observed for ESTs in Jatropha curcas (Table 5.1)
show this trend: mononucleotide repeats formed the largest group
(55.7%) (Fig 5.5) and dinucleotides were the second largest group (30.5%) (Fig 5.6),
followed by trinucleotides (13.2%), tetra- and hexanucleotides (0.21%) and
pentanucleotides (2 SSRs) (Fig 5.7). Similar results were reported by Anon (2004) in
Allium cepa, Hevea brasiliensis, Linum usitatissimum, Phlomis armeniaca, Capsicum
annuum, Gossypium arboreum, Gossypium hirsutum and Medicago truncatula. On
the other hand, in some plant species such as Coffea arabica and Lactuca sativa,
trinucleotide repeats are the most abundant class, whereas in Mentha piperita di- and
tri-nucleotide repeats are observed in equal proportions while mononucleotide
repeats are the predominant class (Anon, 2004).
In this study, a total of 3682 SSRs were identified, i.e., SSRs exist in 8.67% of EST
sequences. Mononucleotide repeats formed the largest group (55.7%),
consisting of 95.1% A/T and 4.9% G/C motifs (Fig 5.6). Dinucleotides were the
second largest group (30.5%), consisting of 42.5% AG/CT, 17.3% AT/TA, 4.3%
AC/TG, 34.8% TC/GA and 1.1% GT/CA motifs (Fig 5.7). These were followed by
trinucleotides (13.2%) (Fig 5.4), tetra- and hexanucleotides (0.21%) and pentanucleotides
(2 SSRs).
The available SSR motif combinations can be grouped into unique classes based on
the property of DNA base complementarity. For mononucleotides, although A, T, C
and G are possible, A and T can be grouped into one class, since an A repeat on one
strand is the same as a T repeat on the opposite strand, and a poly C on one strand is the
same as a poly G on the opposite strand, resulting in two unique classes of
mononucleotides, A/T and C/G (Katti et al., 2001). Similarly, all dinucleotides can be
grouped into four unique classes: (i) AT/TA; (ii) AG/GA/CT/TC; (iii) AC/CA/TG/GT
and (iv) GC/CG. Thus, the number of unique classes possible for mono-, di-, tri- and
tetra-nucleotide repeats is 2, 4, 10 and 33, respectively (Katti et al., 2001; Jurka et al.,
1995). Figure 5.6 shows the frequencies of A/T and C/G repeats. It is clear that A/T
repeats are the predominant mononucleotides as A/T SSRs represent more than 95%
of the total mononucleotide SSRs in J. curcas.
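The grouping of motifs into unique classes can be reproduced by treating two motifs as equivalent when one is a cyclic rotation of the other or of its reverse complement. The sketch below (function names are hypothetical) enumerates all motifs of a given length, discards those that are merely repetitions of a shorter unit, and counts the resulting classes.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Motif as read on the opposite strand (5'->3')."""
    return motif.translate(COMPLEMENT)[::-1]


def is_primitive(motif):
    """True if the motif is not a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d)
                   for d in range(1, n))


def canonical_class(motif):
    """Alphabetically smallest rotation of the motif or its reverse complement."""
    candidates = []
    for strand in (motif, reverse_complement(motif)):
        candidates.extend(strand[i:] + strand[:i] for i in range(len(strand)))
    return min(candidates)


def count_unique_classes(k):
    """Number of unique repeat classes for motifs of length k."""
    motifs = ("".join(bases) for bases in product("ACGT", repeat=k))
    return len({canonical_class(m) for m in motifs if is_primitive(m)})


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(f"motif length {k}: {count_unique_classes(k)} unique classes")
```

With this canonical form, TC, CT, GA and AG all map to the same class label (AG), which is how the AG/GA/CT/TC group is treated as a single class in the frequency analysis.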
Fig 5.6: Frequencies of mononucleotide SSRs in ESTs of Jatropha curcas L. (A/T: 95%; G/C: 5%)
Relative frequencies of the four unique classes of dinucleotide repeats are shown in
Figure 5.7. Among the dinucleotide repeats, the AG/GA/CT/TC group is the predominant
class, followed by AT/TA as the second most abundant
dinucleotide repeat type in J. curcas. The predominance of the AG/GA/CT/TC class
is in concurrence with the results observed by Varshney et al. (2002) in some
cereal species and Anon (2004) in dicot species. In contrast, SaiSug et al. (2013)
recently reported that for the dinucleotide motif sequences, the TC motif was the most
common, followed by CT and AT motifs, whereas the AC motif was the least common
in J. curcas ESTs. However, the second most abundant repeat observed by Varshney et
al. (2002) was the AC repeat (the AC/CA/TG/GT group in the present study),
whereas in Anon (2004) and in the present investigation AT/TA was
the second most frequent repeat after AG/GA/CT/TC.
Fig 5.7: Frequencies of dinucleotide SSRs in ESTs of Jatropha curcas L. (AT/TA: 18%; AG/GA/CT/TC: 77%; AC/CA/TG/GT: 5%; GC/CG: 0%)
An analysis of the frequencies of trinucleotide repeats among the total SSRs observed
indicates the predominance of the AAG/AGA/GAA/CTT/TTC/TCT repeat class, while
AAT/ATT/ATA/TTA/TAA/TAT is the second most frequent repeat class. The
numbers of tetra-, penta- and hexanucleotide repeats observed in our study are low: 6
SSRs, 2 SSRs and 7 SSRs, respectively.
Varshney et al. (2002) observed that the CCG trinucleotide repeat (belonging to the
CCG/CGC/GCC/CGG/GCG/GGC class) is the predominant SSR in cereal
species. However, this repeat is not the predominant class in J. curcas investigated
here, for which large numbers of ESTs are available. Thus, in terms of the abundance
of motif types, our study agrees with that of Ueno et al. (2008, 2009) and other studies
performed in dicotyledonous species (reviewed by Kumpatla and Mukhopadhyay, 2005), in which
AG and AAG were the most abundant di- and trimeric SSRs, respectively. The
extremely low number of SSR motifs containing C and G (0 CGs out of 991 dimeric
SSRs and 3 CCGs out of 428 trimeric SSRs) could be attributed to the composition of
dicot genes being less rich in G+C compared to monocots due to codon usage bias
(Morgante et al., 2002) and to the intrinsic negative correlation between GC content
and slippage rate (Schlotterer et al., 1992).
Fig 5.8: Frequencies of trinucleotide SSRs in ESTs of Jatropha curcas L
One of the important features of SSR markers that makes them ideal candidates for
genetic analysis is their highly polymorphic nature, i.e., a large number of allelic
variants are possible across different genotypes (Akkaya et al., 1992; Powell et al.,
1996). Knowledge of the distribution of SSRs into different repeat length classes is
useful in assessing the abundance of potentially informative markers. It is a general
experience in the molecular genetics community that the utility or informativeness of
SSRs increases with the number of repeats in a given SSR stretch. For example,
di- and tri-nucleotide repeats with 5 or more repeats are very likely to be informative
compared to those with 2-4 repeats. This is the reason for choosing 6 repeats as the minimum
criterion for mining di- and tri-nucleotide repeats using the WebSat program. The
observed repeat numbers fall in
the classes 10-24 for mononucleotide, 6-25 for dinucleotide and 6-13 for trinucleotide
repeats.
Fig 5.8 chart data (Trinucleotide Frequency): AAT/ATT/ATA/TTA/TAA/TAT 19%;
AAC/ACA/CAA/GTT/TTG/TGT 6%; AGT/GTA/TAG/ACT/TCA/CAT 2%;
AGC/GCA/CAG/GCT/CTG/TGC 8%; ACC/CCA/CAC/GGT/GTG/TGG 9%;
AAG/AGA/GAA/CTT/TTC/TCT 34%; ATG/TGA/GAT/CAT/ATC/TCA 7%;
AGG/GGA/GAG/CTC/TCC/CCT 14%; remaining two classes 0% and 1%.
5.3.3 In silico validation of designed EST-SSR primers
Based on the 3682 SSR-containing ESTs identified, a total of 2236 primers were
successfully designed, and all of them were validated in silico with NetPrimer. In this
validation, 93 primers were found to be free of hairpin loops,
self-dimers and cross-dimers. Of these, 4 were interrupted EST-SSR primers
and the remaining 89 were non-interrupted EST-SSR primers (Table 5.2).
Thus, the distribution analysis of SSRs in ESTs of J. curcas clearly indicates
the abundance of mononucleotide SSRs containing 10-24 repeats and of di- and
tri-nucleotide SSRs containing 6-25 repeats and 6-13 repeats, respectively. This
information, coupled with the frequencies of the different types of mono-, di- and
tri-nucleotide motifs detailed above, demonstrates that ESTs are a rich source of SSRs
for marker development for genetic analysis in Jatropha curcas. The results of this
study give an insight into the type, distribution and frequency of EST-SSRs and the development
of EST-SSR markers in J. curcas. These EST-SSR markers would enrich the current
resource of molecular markers for the Jatropha community and will be useful for
qualitative and quantitative trait mapping, marker-assisted selection, and the study of
genetic diversity in Jatropha as well as related plant species.
5.4 REFERENCES
Akkaya, M.S., Bhagwat, A.A. and Cregan, P.B. (1992) Length polymorphisms of
simple sequence repeat DNA in soybean. Genetics 132: 1131-1139
Anon (2004) Computational mining and survey of simple sequence repeats (SSRs).
In: Kumpatla, S.P. (Ed.) Expressed sequence tags (ESTs) of dicotyledonous
plants. https://scholarworks.iupui.edu/handle/1805/333
Billotte, N., Risterucci, A.M., Barcelos, E., Noyer, J.L., Amblard, P. and Baurens, F.C.
(2001) Development, characterisation, and across-taxa utility of oil palm (Elaeis
guineensis Jacq.) microsatellite markers. Genome 44: 413-425.
Cardle, L., Ramsay, L., Milbourne, D., Macaulay, M., Marshall, D. and Waugh, R.
(2000) Computational and experimental characterization of physically clustered
simple sequence repeats in plants. Genetics 156: 847–854
Delseny, M. (1999) Genomics: methods and early results. Ol. Corps Gras Lipides
6: 136-143.
Eujayl, I., Sledge, M., Wang, L., May, G., Chekhovskiy, K., Zwonitzer, J. and Mian,
M. (2004) Medicago truncatula EST-SSRs reveal cross species genetic markers for
Medicago spp. Theoretical and Applied Genetics 108: 414–422
Ewing, R.M. and Claverie, J.M. (2000) EST databases as multi-conditional gene
expression datasets. Pacific Symposium on Biocomputing 5: 427-439
Gu, Z., Hillier, L. and Kwok, P.Y. (1998) Single nucleotide polymorphism hunting in
cyberspace. Human Mutation 12: 221-225
Han, Z., Wang, C., Song, X., Guo, W., Gou, J., Li, C., Chen, X. and Zhang, T. (2006)
Characteristics, development and mapping of Gossypium hirsutum derived EST-SSRs
in allotetraploid cotton. Theoretical and Applied Genetics 112: 430–439
Ince, A., Karaca, M. and Onus, A. (2010) Polymorphic microsatellite markers
transferable across Capsicum species. Plant Molecular Biology Reporter 28: 285–291
Jurka, J. and Pethiyagoda, C. (1995) Simple repetitive DNA sequences from primates:
compilation and analysis. Journal of Molecular Evolution 40: 120-126
Kantety, R.V., La Rota, M., Matthews, D.E. and Sorrells, M.E. (2002) Data mining
for simple sequence repeats in expressed sequence tags from barley, maize, rice,
sorghum and wheat. Plant Molecular Biology 48: 501-510.
Katti, M.V., Ranjekar, P.K. and Gupta, V.S. (2001) Differential distribution of simple
sequence repeats in eukaryotic genome sequences. Molecular Biology and Evolution 18:
1161-1167
Kumar, Y.H., Ranjan, A., Asif, M., Mantri, S., Sawant, S. and Tuli, R. (2011) EST-
derived SSR markers in Jatropha curcas L.: development, characterization,
polymorphism, and transferability across the species/genera. Tree Genetics and
Genomes 7: 207–219
Kumpatla, S.P. and Mukhopadhyay, S. (2005) Mining and survey of simple sequence
repeats in expressed sequence tags of dicotyledonous species. Genome 48(6): 985-998
Kumpatla, S.P., Manley, M.K., Horne, E.C., Gupta, M. and Thompson, S.A. (2004)
An improved enrichment procedure to develop multiple repeat classes of cotton
microsatellite markers. Plant Molecular Biology Reporter 22: 85a-85i
Li, D., Deng, Z., Qin, B., Liu, X. and Men, Z. (2012) De novo assembly and
characterization of bark transcriptome using Illumina sequencing and development of
EST-SSR markers in rubber tree (Hevea brasiliensis Muell. Arg.). BMC Genomics
13: 192
Liang, X., Chen, X., Hong, Y., Liu, H., Zhou, G., Li, S. and Guo, B. (2009) Utility of
EST-derived SSR in cultivated peanut (Arachis hypogaea L.) and Arachis wild
species. BMC Plant Biology 9: 35
Liu, Z.W., Biyashev, R.M. and Maroof, M.A. (1996) Development of simple
sequence repeat DNA markers and their integration into a barley linkage map.
Theoretical and Applied Genetics 93: 869-876
Morgante, M. and Olivieri, A.M. (1993) PCR-amplified microsatellites as markers in
plant genetics. Plant Journal 3: 175-182
Morgante, M., Hanafey, M. and Powell, W. (2002) Microsatellites are preferentially
associated with nonrepetitive DNA in plant genomes. Nature Genetics 30: 194-200
Powell, W., Machray, G.C. and Provan, J. (1996) Polymorphism revealed by simple
sequence repeats. Trends in Plant Science 1: 215-222
Rafalski, J.A. and Tingey, S.V. (1993) Genetic diagnostics in plant breeding: RAPDs,
microsatellites and machines. Trends in Genetics 9: 275-280
Robinson, A.J., Love, C.J., Batley, J., Barker, G. and Edwards, D. (2004) Simple
sequence repeat marker loci discovery using SSR primer. Bioinformatics 20: 1475–
1476
Schlotterer, C. and Tautz, D. (1992) Slippage synthesis of simple sequence DNA.
Nucleic Acids Research 20: 211-215
Sobreira, T.J., Durham, A.M. and Gruber, A. (2006) TRAP: automated classification,
quantification and annotation of tandemly repeated sequences. Bioinformatics 22:
361-362
Tautz, D. (1989) Hypervariability of simple sequences as a general source for
polymorphic DNA markers. Nucleic Acids Research 17: 6443-6471
Thiel, T., Michalek, W., Varshney, R. and Graner, A. (2003) Exploiting EST databases
for the development and characterization of gene-derived SSR-markers in barley
(Hordeum vulgare L.). Theoretical and Applied Genetics 106: 411–422
Ueno, S. and Tsumura, Y. (2008) Development of ten microsatellite markers for
Quercus mongolica var. crispula by database mining. Conservation Genetics 9: 1083-
1085
Ueno, S., Aoki, K. and Tsumura, Y. (2009) Generation of expressed sequence tags
and development of microsatellite markers for Castanopsis sieboldii var. sieboldii
(Fagaceae). Annals of Forest Science 66: 509-509
Ukoskit, K., Thipmongkolcharoen, P. and Chatwachirawong, P. (2012) Novel
expressed sequence tag-simple sequence repeats (EST-SSR) markers characterized by
new bioinformatic criteria reveal high genetic similarity in sugarcane (Saccharum
spp.) breeding lines. African Journal of Biotechnology 11: 1337–1363
Varshney, R.K., Thiel, T., Stein, N., Langridge, P. and Graner, A. (2002) In silico
analysis on frequency and distribution of microsatellites in ESTs of some cereal
species. Cell and Molecular Biology Letters 7: 537-546
Wang, Z., Weber, J.L., Zhong, G. and Tanksley, S.D. (1994) Survey of plant short
tandem DNA repeats. Theoretical and Applied Genetics 88: 1-6
Weber, J.L. (1990) Informativeness of human (dC-dA)n (dG-dT)n polymorphisms.
Genomics 7: 524–530.
Weber, J.L. and May, P.E. (1989) Abundant class of human DNA polymorphisms
which can be typed using the polymerase chain reaction. American Journal of Human
Genetics 44: 388-396.
Wen, M., Wang, H., Xia, Z., Zou, M., Lu, C. and Wang, W. (2010) Development of
EST-SSR and genomic-SSR markers to assess genetic diversity in Jatropha curcas L.
BMC Research Notes 3: 42
Table 5.2: List of SSR Primers designed from EST Database of J. curcas and in silico validated with NetPrimer.

S.No.  GenBank Accession No.  Type of Repeat  Forward Primer (5’-3’)      Reverse Primer (5’-3’)    Product Size

Non-Interrupted EST-SSR Primers
1      GI|311213413           (T)17           TGCTGTTCTACTGGTCTGCTGTT     GCCCGTTTTCCTTCTATCTTA     261
2      GI|311196822           (CTT)6          CCCTGTTTCCCATTAGTCTCTG      CTCTGTTTGCTTCCTTCATTCC    273
3      GI|311196549           (TC)6           CTGTTTCACTGCTCTGTGGACTG     CCACCCTTTTCTTATCCATTA     373
4      GI|311196429           (CT)14          CTCTCTCTCCTTCACCATCACC      GGCATCACACATTCTAAAACGA    306
5      GI|311191464           (AG)19          CACCCAAAACCTCTGCTAAAAC      TCTCTGCCAAAATCATCATCAC    287
6      GI|311196897           (TC)8           TCCTTCCAACTCCTGCTTTTACC     GGACTTTCTCTCTGTTCTCGT     272
7      GI|311196822           (CTT)6          CCCTGTTTCCCATTAGTCTCTG      CTCTGTTTGCTTCCTTCATTCC    273
8      GI|311196429           (CT)14          CTCTCTCTCCTTCACCATCACC      GGCATCACACATTCTAAAACGA    306
9      GI|311191464           (AG)19          CACCCAAAACCTCTGCTAAAAC      TCTCTGCCAAAATCATCATCAC    287
10     GI|311180017           (CT)8           TTACTCCTCCATCTCCCTCCTT      CCTCCGTTTCTTTCCTTCTTTT    196
11     GI|311179575           (A)15           GAGTCTGGGTGCTTTGATTCTT      AGTCTCCTTTTCTTCTCGGGTT    371
12     GI|311175243           (TC)6           CAAAGCAGAGAATGAGCAAGTG      ATGAAGCAAGAACCCTGAGAAG    212
13     GI|311172499           (TAG)7          GAGTCAAAAGGTGGGAAGAAGAT     AGTCAGGAAATAGCAGTCGCA     291
14     GI|311163899           (CT)6           AACCCATTATCCCACTAAACCC      CCATTTCTTGCTCTTACCCATC    296
15     GI|311163894           (AG)6           ACGGGGAAGAGAAGGAATAAAG      CGCACATCAAGAAACAAGAAGT    245
16     GI|311163801           (A)10           ACCACTCCTACTACAACGACTATCTG  TAACACACACCAATCTCCCAAC    388
17     GI|311163154           (A)14           AACAGCAAGCGAACAAGGAG        AGTAGATGGATGAAAGGACCGA    105
18     GI|311163148           (AG)10          AGAGAGAGAAAAGCGGAAGGATA     GAAGAAGACGAACTGGAGGTG     258
19     GI|311162861           (A)11           ATGGATGAATGTGCTTTGTGTC      TTGGAACTCTGAAGGATGGAAT    164
20     GI|311162857           (AG)6           GCAAGGAGGGAGATTTTGTTTT      TGATTTAGCGAGAGAGAGAGGG    202
21     GI|311162761           (T)13           TCTTTCTCTTCTCTCCACTGCC      CTCTTCCCTTTCATTTGGTCTG    180
22     GI|311161267           (TCT)7          GCATCTTTCTCTTTCATCTTCGA     TCTAACACTTGCCCACCACTA     148
23     GI|311161208           (TC)6           CCGTTTCGCTCTTGTCATCTAC      GTTGCCATTGTCGTTATTTCCT    395
24     GI|311158875           (CAA)10         ACAAACAGAAAGAGCGATGGAT      GTGTAAGAGGTGGAAGTGGGAG    319
25     GI|311158725           (T)13           GGCTTTGTGCTCTGGAGATTAG      TCCTTTTACACTGGTGGGTTTC    393
26     GI|311158536           (A)10           TGTTTTCTCGTATTGGGGTTCT      AGTATCATTCCTGCTGGTTGGT    231
27     GI|311158312           (CT)14          CCTCTCGTTTCATTTCTCTCGTC     TCTTACCCACTCGTCCTTCAC     168
28     GI|311158300           (AAG)7          CGAAAACAAAAGCAGATTAGGGC     AAACAAAACGGAAACAGAGTG     331
29     GI|311157097           (TC)8           GGGTCCTTTCACCTCTCTCTCT      GCTCCTCTCCCAATCTGTTCTA    306
30     GI|311157093           (T)20           TCATCTCCATACCCACATACACAC    GAATCAAGAAGAGTCAAAGCA     191
31     GI|311156913           (CAA)10         ACAAACAGAAAGAGCGATGGAT      GTGTAAGAGGTGGAAGTGGGAG    319
32     GI|302370291           (CAA)6          ACCAAGACCAGTTTTACCTCCAT     CCCTCAACCGTCATAGATTTT     380
33     GI|302370206           (CT)10          TGTGTTCGTGTGTTCGTGTCT       ATAGTTGTTGCCTTCATTGCCT    159
34     GI|302370037           (TCT)6          CAGTTCTCACAGTTCTTAGCCGC     GCATACTACCCTCTCGTTCTT     298
35     GI|302369352           (GCA)7          AGTGGGTGGATGTTGAGAGAGTG     GATTTAGAGAATGCTTGTGCC     254
36     GI|302368985           (CT)13          CTCTCTCTCCTTCACCATCACCG     GCATCACACATTCTAAAACGA     304
37     GI|302367921           (TC)6           CAATACGAACGAGAGAGAGCAG      ATTTCCATCAACTTTCACCCAC    233
38     GI|302367309           (CTC)6          ATGGGTAGGAAAGGAAAATGGTC     AGCAGCAACTGGAACAGAATA     283
39     GI|302366932           (AG)14          AGAGAGAGAAAAGCGGAAGGATA     GAAGAAGACGAACTGGAGGTG     260
40     GI|302365597           (GCA)7          AGTGGGTGGATGTTGAGAGAGTG     GATTTAGAGAATGCTTGTGCC     254
41     GI|302362180           (TAG)7          GAGTCAAAAGGTGGGAAGAAGAT     AGTCAGGAAATAGCAGTCGCA     291
42     GI|302361728           (TAG)7          AAAGGAGTCAAAAGGAGGGAAGT     AGTCAGGAAATAGCAGTCGCA     295
43     GI|302361553           (GCA)7          AGTGGGTGGATGTTGAGAGAGTG     GATTTAGAGAATGCTTGTGCC     254
44     GI|302359300           (AAGA)6         AGGGACAGCAGCAGAAGAAG        GACGAAGGAGAGAACAAGCATC    260
45     GI|302358106           (AG)6           ACCAGCCCTCTCTTTCTTTTCT      AATCTCATCCTCTTCCAATCCA    213
46     GI|268525631           (AT)7           ACCATAACAAACCAAAACCCAG      CAACCAGAAGAGGCAGAAGTG     354
47     GI|268524339           (CT)13          CTCTCTCTCCTTCACCATCACCG     GCATCACACATTCTAAAACGA     304
48     GI|268524338           (A)10           CAACAAGGTCAGAGTCAATCAA      GTTCACCGCTCAGTATCAAAGT    152
49     GI|268523716           (TC)16          AACATTGGTTATGGTTGGAAGGT     CAGAGAAGAGGTGAGAAGAAAGAA  268
50     GI|268522764           (CTT)6          CCCTGTTTCCCATTAGTCTCTGC     TCTGTTTGCTTCCTTCATTCC     273
51     GI|268521734           (TCA)6          CCTCATCCTCATCTTCATCCTCC     TGTGTCTTGGTTTACGGGTG      135
52     GI|268519974           (T)16           TTGTTGCTATGCCAGAAGAATG      TTGGAAACGGAAAGGAGTAAGA    237
53     GI|268519409           (T)13           GAACTCCACTCCTCTTTCCCTT      AATCCTCTTTCCTTTGGTCTCC    383
54     GI|268517442           (A)13           ACATTTTGTTCTGACTGGGTTG      GTCCTTCATCTGTTTTGCCTTT    220
55     GI|237680390           (A)13           TTAAGCAGTGGTATCAACGCAG      AGCATCCAGTCGTATCTTCTCC    113
56     GI|237680109           (TG)6           CTGTTTGCTTCTGACCATTTTG      AACCCCTTGTTTTCACTCCAC     272
57     GI|237679904           (G)11           CGAGGGATTGTTTCTTGTTTTC      CCCACAGCCACACCACTA        145
58     GI|311208156           (CGG)6          TCGTCCCTTCTGCTTCTTACTCC     TATTCCTATTCGCGGCATATC     151
59     GI|311207933           (TA)6           TGACAAGAAGGTTACTCAGGCA      GCAAGGACAAAATGATACGACA    369
60     GI|311207233           (CAG)6          ACAGCAGGAGCAGAACCAACA       TGTAAATCACCGATCCAAACC     286
61     GI|315708520           (A)14           GGCTCTCTCTGTCTCATTTCGT      GCCATATCTTCGTCGTCTTCTT    350
62     GI|315708492           (T)14           AATGAGGGAATCTTGGATGAAC      TGAAATCTACAGTTTGCTGGTCTC  261
63     GI|315708310           (T)15           TCTCTCTCTTTCTCTCTCTCCCCT    CCAAAACTACCTCTCTCCTTCA    312
64     GI|315706559           (T)11           ACGGAGTCAATGGAAGGAAGTA      CACGCAACACGACAAACC        267
65     GI|315705878           (TAG)7          GAGTCAAAAGGTGGGAAGAAGAT     AGTCAGGAAATAGCAGTCGCA     291
66     GI|315703518           (AT)18          GGTTCAGATTCATCGTCAGTCAC     TTCTTTTCAGTTCCCAGCAGT     347
67     GI|315703474           (TC)6           CCGTTTCGCTCTTGTCATCTAC      GTTGCCATTGTCGTTATTTCCT    395
68     GI|315702137           (AG)13          AGAGAGAGAAAAGCGGAAGGATA     GAAGAAGACGAACTGGAGGTG     258
69     GI|315702055           (T)12           CAATCAACCTTCCAGTGCCC        CCTTTCTTTTGCCTTCTCATA     137
70     GI|315701842           (TC)6           CAATACGAACGAGAGAGAGCAG      ATTTCCATCAACTTTCACCCAC    233
71     GI|315700965           (TAG)7          GAGTCAAAAGGTGGGAAGAAGAT     AGTCAGGAAATAGCAGTCGCA     291
72     GI|315699560           (T)13           TCTCTCTCTTTCTCTCTCTCCCCT    CCAAAACTACCTCTCTCCTTCA    310
73     GI|315699484           (T)13           GACAGGACGGGACAAGATAAAG      AACCAGATCGTACCCAAGAAAA    308
74     GI|315696795           (AT)8           CTGACCAGACAAAAGCAGAAAC      GGCAAGAAAGAGACCAAGTGAT    117
75     GI|315696699           (CCG)8          CGCTCTTTGCCTTATTATGCTTT     GACAGATAGAACACTCGTGGG     309
76     GI|315696690           (GAA)10         GTAGAAGGAGAAGGGGAAGAGG      TATGCTTGGACAGGGCTTTATT    343
77     GI|315692662           (CT)13          CTCTCTCTCCTTCACCATCACCG     GCATCACACATTCTAAAACGA     304
78     GI|315691873           (G)11           AGAAGAGAAACAGCACCACCAC      TGAAACCATTACACACAGCACA    379
79     GI|315685359           (TA)21          CTACGGCTTTCCTACCTTTTCAT     TCTGCTTACAATCCCAACCTT     212
80     GI|315685300           (TTA)6          GCTTGCTTCTTTGTTCTTTCCTG     TCTCTTGTCTGTTCGTCATCG     389
81     GI|315684775           (AAG)6          GTAAGCAAAGAGAACCCGAAGAA     AATCATCAACGACACAAGCAG     188
82     GI|315684741           (CCG)8          GCTCTTTGCCTTATTATGCTTGG     TGACGGTAGAGGTTGAATGA      251
83     GI|315684564           (T)15           TCTCTCTCTTTCTCTCTCTCCCCT    CCAAAACTACCTCTCTCCTTCA    312
84     GI|315684541           (CT)6           GACTGTGAAAACTGAAAACCCC      GAGGAGAGAAAGCAAAGATGGA    157
85     GI|315681508           (C)12           TGATTTGCCTTGTGAGTATTGC      GGAGGAGATGAGAGAGTGGAGA    195
86     GI|315681228           (T)11           AACATAGCGGGATGGAAATAGA      CACGCAACACGACAAACC        104
87     GI|315676133           (T)11           ACGGAGTCAATGGAAGGAAGTA      CACGCAACACGACAAACC        267
88     GI|315676019           (A)26           TCCCTCTCTATCCAAAATCCAA      TACTTTATCCCTAATCCAGCGG    148
89     GI|315675632           (A)13           CTACTACCCATCAAATCCCACC      CCATTAGCCACAACACCACTTA    222

Interrupted EST-SSR Primers
90     -                      (TTC)9(T)11     CGTTAGATTTCCACTCACCTCCC     TATTTACCGCTTCCGATTCCT     296
91     -                      (T)15(GA)17     CTGTCCATCTCCCTCTCAGTATC     GTGTGTGTGTGTGTTTATTCGC    337
92     -                      (TC)10(TC)11    AGCAACTCTTTTCCTTCCTCCT      TCACTTATCATCACCAGCCATC    190
93     -                      (GA)11(A)11     ATAAAGACAAATGGACAAGGGGG     CAAAGTGAATCTACAGCAGGA     354