SNP haplotypes of the BADH1 gene and their association with aroma in rice (Oryza sativa L.)
Anuradha Singh • Pradeep K. Singh • Rakesh Singh • Awadhesh Pandit •
Ajay K. Mahato • Deepak K. Gupta • Kuldeep Tyagi • Ashok K. Singh •
Nagendra K. Singh • Tilak R. Sharma
Received: 5 July 2009 / Accepted: 2 March 2010 / Published online: 21 March 2010
© Springer Science+Business Media B.V. 2010
Abstract Betaine aldehyde dehydrogenase (BADH)
is a key enzyme involved in the synthesis of glycin-
ebetaine—a powerful osmoprotectant against salt and
drought stress in a large number of species. Rice is not
known to accumulate glycinebetaine but it has two
functional genes coding for the BADH enzyme. A non-
functional allele of the BADH2 gene located on
chromosome 8 is a major factor associated with rice
aroma. However, similar information is not available
regarding the BADH1 gene located on chromosome 4
despite the similar biochemical function of the two
genes. Here we report on the discovery and validation
of SNPs in the BADH1 gene by re-sequencing of
diverse rice varieties differing in aroma and salt
tolerance. There were 17 SNPs in introns with an
average density of one per 171 bp, but only three SNPs
in exons at a density of one per 505 bp. Each of the
three exonic SNPs led to changes in amino acids with
functional significance. Multiplex SNP assays were
used for genotyping of 127 diverse rice varieties and
landraces. In total 15 SNP haplotypes were identified
but only four of these, corresponding to two protein
haplotypes, were common, representing more than
85% of the cultivars. Determination of population
structure using 54 random SNPs classified the varieties
into two groups broadly corresponding to indica and
japonica cultivar groups, aromatic varieties clustering
with the japonica group. There was no association
between salt tolerance and the common BADH1
haplotypes, but aromatic varieties showed specific
association with a BADH1 protein haplotype (PH2)
having lysine144 to asparagine144 and lysine345 to
glutamine345 substitutions. Protein modeling and
ligand docking studies show that these two substitu-
tions lead to reduction in the substrate binding capacity
of the BADH1 enzyme towards gamma-aminobutyr-
aldehyde (GABald), which is a precursor of the major
aroma compound 2-acetyl-1-pyrroline (2-AP). This
association requires further validation in segregating
populations for potential utilization in the rice breeding
programs.
Keywords BADH1 · Oryza sativa · SNP haplotypes · Aroma
Electronic supplementary material The online version of this article (doi:10.1007/s11032-010-9425-1) contains supplementary material, which is available to authorized users.
A. Singh · P. K. Singh · R. Singh · A. Pandit · A. K. Mahato · D. K. Gupta · K. Tyagi · N. K. Singh (✉) · T. R. Sharma
Rice Genome Laboratory, National Research Centre
on Plant Biotechnology, Indian Agricultural Research
Institute, New Delhi 110012, India
e-mail: [email protected]
Present Address: R. Singh
National Bureau of Plant Genetic Resources,
New Delhi 110012, India
A. K. Singh
Division of Genetics, Indian Agricultural Research
Institute, New Delhi 110012, India
Mol Breeding (2010) 26:325–338
DOI 10.1007/s11032-010-9425-1
Introduction
Glycinebetaine (GB) is a compatible organic solute
synthesized in response to salt, drought and temper-
ature stress in a large number of species including
microbes, animals and plants. The enzyme betaine
aldehyde dehydrogenase (BADH, E.C. No. 1.2.1.8) is
involved in the synthesis of GB from its precursor
betaine aldehyde. Many flowering plants, e.g. man-
grove, spinach, amaranth, barley and sorghum, are
proven betaine accumulators and tolerate salt and
drought stress partly through this mechanism, but
other species like tobacco, tomato and rice are
considered non-accumulators of GB (Ishitani et al.
1993; Rathinasabapathi et al. 1993; Shirasawa et al.
2006). Transformation of the Badh gene from bac-
terial and plant sources into betaine-deficient plant
species has resulted in accumulation of GB in their
system and consequent acquisition of tolerance to salt
and drought stress (Liang et al. 1997; Mohanty et al.
2002). BADH synthesis is up-regulated several-fold
in response to salt and drought stress in spinach,
barley and sorghum leaves (Weretilnyk and Hanson
1990; Ishitani et al. 1995; Wood et al. 1996).
Rice (Oryza sativa L.) is thought to be a non-
accumulator of GB but it does express BADH at low
levels (Fitzgerald et al. 2008). This evokes an interest
in the phylogenic evolution of this enzyme in rice and
a search for variation in the BADH gene sequence in
the rice germplasm and its association with important
traits. Rice has two functional genes coding for the
BADH enzyme: BADH1 gene located on chromo-
some 4 and BADH2 gene on chromosome 8. Both the
genes have 15 exons that show high sequence
homology to their orthologs in other species. Rice
BADH1 is induced by salt and water stresses whereas
the BADH2 gene is expressed constitutively at low
levels. Expression of both genes also appears to be
regulated by post-translational processing directed by
paired short direct repeats in response to stress (Niu
et al. 2007). Addition of GB or choline to the culture
media showed increased accumulation of GB in rice
seedlings, which was strongly associated with water
use efficiency and maximum quantum yield of PSII
(Yang et al. 2005).
BADH has also been implicated in the develop-
ment of aroma in basmati and jasmine rice. The
concentration of the most important rice aroma
compound 2-acetyl-1-pyrroline (2-AP) is controlled
by a recessive gene for fragrance (fgr) mapped on
rice chromosome 8. The accumulation of 2-AP in
aromatic rice is explained by the loss of function
mutations in the BADH2 gene (Bradbury et al. 2008;
Chen et al. 2008). At least ten non-functional alleles
of the BADH2 gene have now been identified (Shi
et al. 2008; Sakthivel et al. 2009; Kovach et al. 2009).
It is anticipated that the BADH1 gene could also be
involved in the development of rice aroma in a way
similar to the BADH2 gene because, except for the
stability and regulation of the enzyme, the biochem-
ical function and substrate specificity of the BADH
enzymes coded by the two loci are quite similar
(Bradbury et al. 2008). A minor quantitative trait
locus (QTL) for aroma has been mapped on rice
chromosome 4 that is co-localized with the BADH1
gene, although no studies are available on the allelic
variation of the BADH1 gene in the rice germplasm
and its possible relationship with the aroma (Lorieux
et al. 1996; Amarawathi et al. 2008).
Analysis of single nucleotide polymorphism (SNP)
and small insertion/deletion (InDel), which are the
basis of most differences between alleles, has been
simplified by recent developments in sequencing
technology. Due to the ability of SNPs to generate
numerous markers within a target region and the
availability of high-throughput SNP assays, SNP
genotyping is becoming a valuable tool for gene
mapping, map-based cloning, and marker-assisted
selection in crops (Rafalski 2002; Belo et al. 2008).
In comparison to individual SNPs, haplotype analysis
of a group of linked SNPs is more informative in
determining association with the phenotypes.
The objectives of the present study were to: (1)
identify SNP variation in the BADH1 gene from 16
diverse rice varieties with different levels of salt
tolerance and aroma and (2) develop high-throughput
SNP genotyping assays and their application to a
large set of rice varieties to study possible association
of the BADH1 haplotypes with salt tolerance and
aroma.
Materials and methods
Plant material
Sixteen rice varieties used for the study of polymor-
phism in the BADH1 gene were collected from the
Indian Agricultural Research Institute and the
National Bureau of Plant Genetic Resources
(NBPGR), New Delhi, the Central Soil Salinity
Research Institute, Karnal and GB Pant University
of Agriculture and Technology, Pantnagar (Table 1).
The varieties included four salt-tolerant lines (Pok-
kali, CSR 10, CSR 27, CSR 36), two short-grain
aromatic salt-tolerant varieties (Kalanamak 3119,
Kalanamak 3131), two new plant type lines (Pusa
1266, Pusa 1342), a japonica variety Taipei 309, an
indica variety with pigmented endosperm (Red
Triveni), four modern high-yielding varieties (Jaya,
Ratna, Jyoti, Pusa 44), a drought-tolerant line MI48,
and a basmati variety Pusa 1121 (Table 1). A larger
set of 127 diverse rice varieties and landraces was
used for the validation and genotyping of SNPs for
association analysis (Table S1). One accession each
of O. nivara (No. 283160) and O. rufipogon (No.
381932) were obtained from the NBPGR, New Delhi.
PCR amplification of the BADH1 gene fragments
Genomic DNA was extracted from young seedlings
using the standard CTAB method (Murray and
Thompson 1980). The reference sequence of the
BADH1 gene from Oryza sativa japonica cv. Nippon-
bare was downloaded from the NCBI database (http://
www.ncbi.nlm.nih.gov). It was excised from the
sequence of chromosome 4 BAC clone OSJNBa00614
(IRGSP 2005). Since the BADH1 coding sequence is in
the negative strand of this BAC, the excised fragment
between nucleotide positions 41764 and 36437 was
reverse-complemented for correct orientation of the
gene sequence. The excised sequence represented a
region between 700 bp upstream of the ATG transla-
tion start codon and 200 bp downstream of the TAG
translation stop codon. Forward and reverse primers
were designed from different overlapping segments of
the BADH1 gene using Primer 3 software and re-
checked by BLAST search to ensure that they matched
uniquely with the expected positions in the rice gen-
ome (Table 2). Gradient PCR was done for all the
primer pairs to find the optimum annealing temperature
for amplification of single DNA fragments.
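The excision and reverse-complementing step described above can be sketched in Python. This is an illustration only: the function names and the toy sequence are hypothetical, and treating the BAC coordinates as 1-based and inclusive is an assumption that the text does not state.

```python
# Hypothetical helpers; coordinates are assumed to be 1-based and inclusive.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def excise_minus_strand(bac_seq: str, high: int, low: int) -> str:
    """Cut positions low..high from the plus strand and return the
    fragment in the orientation of a gene lying on the minus strand."""
    fragment = bac_seq[low - 1:high]
    return reverse_complement(fragment)


# Toy sequence, not the real BAC clone
toy_bac = "AAACTACCCGGGCATTT"
gene = excise_minus_strand(toy_bac, high=15, low=4)
print(gene)  # ATGCCCGGGTAG
```

Under the same coordinate assumption, a call with `high=41764` and `low=36437` would return a fragment of 41764 - 36437 + 1 = 5328 bases.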
Sequencing of the PCR products and SNP
discovery
PCR products were sequenced by MegaBACE 4000
automated capillary sequencers (GE Healthcare). The
sequence trace files from each variety were assembled
into contigs using combined Phred/Phrap/Consed
software (Ewing and Green 1988). The sequence
reads generated from PCR products of the BADH1
gene from 16 rice varieties were pooled into a single
assembly to find SNP differences among the varieties.
Polymorphism tags were generated automatically
by Polyphred software integrated with Consed.

Table 1 List of 16 rice varieties of diverse origin, aroma and salt tolerance used for discovery and validation of SNPs in the BADH1 gene on rice chromosome 4

S. no.  Variety         Origin                   Salt tolerance  Aroma
1       CSR 10          M40-431-24-114/Jaya      Tolerant        Absent
2       CSR 27          Nona Bokra/IR5657-33-2   Tolerant        Absent
3       CSR 36          indica                   Tolerant        Absent
4       Jaya            T(N)1/T 141              Sensitive       Absent
5       Jyoti           PTB10/IR8                Sensitive       Absent
6       Kalanamak 3119  Aromatic landrace        Tolerant        Strong
7       Kalanamak 3131  Aromatic landrace        Tolerant        Strong
8       MI 48           Pelita1-1//H4/H501       Sensitive       Absent
9       Pokkali         indica, landrace         Tolerant        Absent
10      Pusa 1121       P 614-1-2/P 614-2-4-3    Sensitive       Strong
11      Pusa 1266       indica, NPT line         Sensitive       Absent
12      Pusa 1342       indica, NPT line         Sensitive       Absent
13      Pusa 44         IARI 5901-2/IR8          Sensitive       Absent
14      Ratna           TKM 6/IR8                Sensitive       Absent
15      Red Triveni     indica                   Sensitive       Absent
16      Taipei 309      japonica                 Sensitive       Absent

High quality SNPs were then identified
manually and screen shots of the SNP trace files for
the two alleles were taken. The SNPs were analyzed
from the transcribed region of the gene only because
primers designed for the promoter region did not
work well. The positions of exons and introns were
identified using gene prediction software FGENESH
(www.softberry.com) and verified manually by
checking against the full length cDNA sequence of
the BADH1 gene. The sequence reads of each variety
were also assembled separately to obtain a high-
quality consensus sequence of the BADH1 gene of the
individual varieties for comparison with the Nip-
ponbare reference sequence using pair-wise nucleo-
tide BLAST.
BADH1 SNP assay design and genotyping
The Sequenom MassARRAY® system uses a matrix-
assisted laser desorption ionization-time of flight
(MALDI-TOF) mass spectrometer for accurate detection
of SNPs in a high-throughput manner
(www.sequenom.com). MassARRAY Assay Design 3.1
software was used for the design of multiplex iPLEX
assays for 20 SNPs of the BADH1 gene in two wells.
One of these wells also included an SNP from exon 7 of
the BADH2 gene for genotyping a loss of function
allele (Bradbury et al. 2005; Amarawathi et al. 2008).
The 30-mer pre-amplification primers and variable
length genotyping primers generated by the Assay
Design 3.1 software were procured and used for the
validation of SNPs according to the Sequenom user
manual. The MassARRAY Typer 3.4 Software was
used for the visualization of SNPs and allele calling.
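As a small sanity check on the primer layout, the sketch below splits a 30-mer pre-amplification primer into a 10-base prefix and a 20-base target-specific part. The prefix `ACGTTGGATG` is taken from the primer sequences listed in Table 3, where every PCRP starts with it; the function name `split_pcrp` is hypothetical and is not part of the Assay Design software.

```python
TAG = "ACGTTGGATG"  # 10-base prefix shared by the PCRP sequences in Table 3


def split_pcrp(primer: str) -> tuple:
    """Split a 30-mer pre-amplification primer into (tag, specific part).

    Hypothetical helper for illustration; raises ValueError if the primer
    does not follow the expected 10 + 20 layout.
    """
    if len(primer) != 30:
        raise ValueError(f"expected a 30-mer, got {len(primer)} bases")
    if not primer.startswith(TAG):
        raise ValueError("primer does not start with the expected tag")
    return primer[:len(TAG)], primer[len(TAG):]


# 2nd-PCRP listed for BADH1_S2 in Table 3
tag, specific = split_pcrp("ACGTTGGATGTTTACGGCACTAAGAGCGTC")
print(tag, specific, len(specific))
```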
BADH1 sequence divergence and association
with rice aroma
The phylogenetic tree of the BADH1 gene sequence
obtained by resequencing of 16 rice varieties and
Nipponbare reference gene sequence (NCBI
gi:2244603) was constructed using MEGA version 4
software with O. rufipogon (NCBI gi:165874486) as an
out group for rooting the tree (Tamura et al. 2007). The
reliability of neighbor-joining phylogeny output was
estimated using bootstrap analysis with 1000 permu-
tations and one input order per replicate. In addition,
analysis of the BADH1 sequence variation among 127
rice varieties was done based on the scores of 15
validated SNPs identified by resequencing of the
BADH1 gene from 16 varieties and Nipponbare using
the Sequenom MassARRAY assays.

Table 2 PCR primers used for amplification of different segments of the BADH1 gene from 16 rice varieties for SNP discovery

Primer ID  Forward primer             Reverse primer           Start position(a)  End position(a)  Product size
Bad1-Pa    TGTTAAAATGACCAGATTACCCCTA  AGCCCGTGATACCTTTTTGA     -684               -158             527
Bad1-1     GCATTTGGTTTGCTCCATC        TTATGTGCACCCCCTTCCTA     159                666              508
Bad1-2     AGGACGCTCTTAGTGCCGTA       AATCTGCAGCCAGTTCACAA     515                1114             600
Bad1-3     GTGCCCACCCTGTCATTAGT       TCAAACATGAACCAACAAAAGC   1035               1635             601
Bad1-4     ACGTCCAATTTCCCTCGTCT       ACAGGGAAGCAAGCTCAGAT     1533               2142             610
Bad1-5     GCTGATGGCTACTTGGAAGG       AGGTTTCTTGCTCCGACAGA     2060               2616             557
Bad1-6     TGATCATGCCCTGAAGAGAA       CAGCCTGCAACCTTCTTCTA     2549               3152             605
Bad1-7     AATTGCAAAGCGATTCTTGG       AAGAACCCCCTTTTGAGGTG     3069               3671             603
Bad1-8     TGCCCGACCACAAGTATGTA       TTGGCCACAGTTTGTGACAT     3555               4278             724
Bad1-9     GGGAGCTAGGACAGTGGTGA       GTGACTTGCTTCACGCTCAA     4169               4373             205
Bad1-IN-5  GGTACTCCGTCCCTTTGCTT       TGAGAAACCCATTGTTCAAAGA   15                 869              755
Bad1-3.1   GCTGATGGCTACTTGGAAGG       AGCACTGCAGACTTGACCAG     2060               2972             913
Bad1-3.2   CCCACGTCAACTATGCTCCT       AGAATTGTGGCACCTTCACA     2775               3545             770
Bad1-3a    CCCAAGGCTGAAATTTTTGT       TGAAATTTCCAATTGGTCTTCTG  967                1661             694
Bad1-3b    TAAATGGAAAGCCCCAAGG        TTGGATGATCACGTACAAAAGG   955                1670             815
Bad1-END   GTCTAGCTGGCGCTGTGATT       CCGTATGGTTCATCTGAGCA     3885               4400             516
(a) With reference to ATG translation start codon

The frequencies
of the two common BADH1 protein haplotypes
(corresponding to four BADH1 SNP haplotypes) were
analyzed in all 127 rice varieties and also separately in
the aromatic and salt-tolerant subgroups of varieties,
and significance of deviation in allele frequencies from
the population means was tested using the chi-squared
test.
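A minimal sketch of such a test is given below, assuming a goodness-of-fit comparison of the haplotype counts in a subgroup against the proportions in the whole population. The exact test layout used in the study is not stated, the counts are hypothetical, and the function names are illustrative. With two haplotype classes there is one degree of freedom, for which the upper-tail probability has the closed form erfc(sqrt(chi2/2)).

```python
import math


def chi_square_gof(observed, expected_props):
    """Chi-squared statistic for observed counts against expected
    proportions (for example, the population-wide frequencies)."""
    total = sum(observed)
    expected = [p * total for p in expected_props]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_1df(chi2: float) -> float:
    """Upper-tail p-value for a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts for illustration only (not data from this study)
population = {"PH1": 60, "PH2": 20}
subgroup = {"PH1": 5, "PH2": 15}

n_pop = sum(population.values())
props = [population[h] / n_pop for h in ("PH1", "PH2")]
chi2 = chi_square_gof([subgroup["PH1"], subgroup["PH2"]], props)
print(round(chi2, 3), p_value_1df(chi2))
```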
Population structure and cluster analysis using
genome-wide SNPs
Sequenom MassARRAY multiplex assays were
designed for 72 SNPs (two wells of iPLEX gold
chemistry), representing conserved single-copy rice
genes (Singh et al. 2007), taking six genes from each
of the twelve rice chromosomes. Two 36-plex assays
were designed and validated by Sequenom Corpora-
tion (San Diego), but only 54 SNPs giving more than
95% success rates were used for the population
structure analysis (Table S2). STRUCTURE 2.3.1
software was used to infer population structure of the
127 rice varieties using a burn-in of 100,000, a run
length of 100,000 and a Bayesian model allowing for
admixture with correlated allele frequencies
(Pritchard et al. 2000). The software was
applied for ten independent runs with an assumption
of ‘independent allele frequencies’ using a value of K
ranging from 1 to 10. The final K value (number of
subpopulations) was selected such that the value of α was less than 0.2 and did not change with subsequent
runs using higher K values. Graphical outputs from
STRUCTURE were produced to visualize the opti-
mum number of clusters. The 54 genome-wide SNP
scores of 127 rice accessions and one accession each
of two wild rice species (O. rufipogon and O. nivara)
were also analyzed using Free Tree software to
construct a phylogenetic tree (Pavlicek et al. 1999).
Dice coefficient was used for generating the distance
matrix and clusters were developed using the neigh-
bour-joining method. Bootstrap analysis was done to
test the reliability of clustering. The phylogenetic tree
of the varieties was constructed using the tree option
with O. nivara at the root of the tree.
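The distance calculation can be illustrated with a short Python sketch. It assumes that each variety is encoded as a set of (locus, allele) pairs and that distance is taken as one minus the Dice similarity, 2|A∩B| / (|A| + |B|). How Free Tree encodes the SNP scores internally is not described here, so the encoding, the function names and the toy genotypes are all assumptions.

```python
def allele_set(genotype: str) -> set:
    """Encode a string of SNP calls as (locus index, allele) pairs.
    'N' marks a missing call and is skipped (an assumed convention)."""
    return {(i, allele) for i, allele in enumerate(genotype) if allele != "N"}


def dice_distance(a: set, b: set) -> float:
    """Return 1 - Dice similarity between two allele sets."""
    if not a and not b:
        return 0.0
    return 1.0 - 2.0 * len(a & b) / (len(a) + len(b))


def distance_matrix(profiles: dict) -> dict:
    """Pairwise Dice distances keyed by (name, name)."""
    sets = {name: allele_set(calls) for name, calls in profiles.items()}
    return {(x, y): dice_distance(sets[x], sets[y]) for x in sets for y in sets}


# Toy genotypes over five SNPs (hypothetical)
toy = {"var_A": "ACGTA", "var_B": "ACGTC", "var_C": "TGCAC"}
matrix = distance_matrix(toy)
print(matrix[("var_A", "var_B")], matrix[("var_A", "var_C")])
```

A matrix of this kind is the input that a neighbour-joining routine would then cluster.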
Protein modeling and ligand docking
Two haplotypes of the BADH1 protein, PH1 and
PH2, were modeled using Accelrys Discovery Studio
Software. Protein sequences of PH1 and PH2 were
searched against the PDB database using the
BLASTP program of NCBI. The crystal structure of
BADH from Staphylococcus aureus (PDB ID: 3ED6)
showed highest similarity (44%) with the PH1 and
PH2 protein haplotypes, hence it was selected as
template to model the BADH1 proteins. The protein
modeling of PH1 and PH2 was done by loop refinement
and energy minimization using CHARMm 27 force
field. A three-dimensional structure model of gamma-
aminobutyraldehyde (GABald) substrate for the BADH
enzyme was created using CHEMSKETCH software.
Docking of GABald was carried out with the modeled
PH1 and PH2 using the Ligandfit Module of the
Discovery Studio Software. BADH1-PH1 and BADH1-
PH2 proteins were treated as receptor molecules and
GABald as the ligand. The program was run on a
machine with CPU speed of 2.8 GHz, Intel Xeon
processors, 12 GB RAM with a modeling time of 1 h
30 min, energy minimization of 30 min and docking
time of 10 min for each binding site.
Results and discussion
Discovery of SNPs in the BADH1 gene of rice
Considering the implication of BADH enzyme in rice
aroma development and its potential role in drought
and salt stress tolerance, we attempted to find SNP
variation in the BADH1 gene by resequencing of this
gene from 16 diverse rice genotypes differing in
aroma and salt tolerance (Table 1). Initially, a single
pair of PCR primers flanking the BADH1 gene was
designed for the amplification of the full-length
BADH1 gene, but this effort failed despite using
different conditions and high-fidelity long-read DNA
polymerases. Therefore, multiple pairs of primers
were designed to amplify smaller overlapping seg-
ments of the BADH1 gene (Table 2), so that full
sequence could be generated by in silico assembly of
the sequence reads. These primers gave good
amplification after optimization of the annealing
temperature using gradient PCR in the range 45–65 °C. On
the basis of gradient PCR results, the optimum
annealing temperature (Ta) for all the primer pairs
was taken as 65 °C, which gave a clean single PCR
product with minimum background. The primers
designed for amplification of promoter region of the
BADH1 gene (BAD1-Pa, Table 2) did not amplify
well even in the gradient PCR experiment except for
two varieties, namely Taipei 309 which is a japonica
type and Pusa 1266 which has a japonica back-
ground. This may be due to high GC content in this
region and, furthermore, there could be significant
sequence difference between the indica and japonica
rice varieties in this region, as the PCR primers were
designed based on the sequence of reference japonica
variety Nipponbare. Among the coding region
primers, BAD1-3 showed amplification only in seven
varieties, and therefore a new set of primers, BAD1-3a,
BAD1-3b, BAD1-3.1 and BAD1-3.2, were designed
(Table 2), which solved the problem and led to
complete assembly of the sequence for this region of
the gene. Similarly, another primer pair, BAD1-END,
was redesigned to amplify the stop codon region of
the BADH1 gene that amplified nicely in all the
varieties. All the primer pairs showed amplification
of the expected product sizes indicated in Table 2.
PCR products were sequenced from both the ends
using the same forward and reverse primers that were
employed for their amplification, generating high
Phred quality sequence data, and several hundred
high-quality sequence reads were obtained from the
32 sequencing primers. The promoter region primer
BAD1-Pa gave amplification only with Taipei 309
and Pusa 1266, and therefore was not considered for
the SNP analysis in this study. For sequence assembly
each sequence read was named in full giving project
name, variety name, primer name and forward or
reverse primer names, which helped in the interpre-
tation of data in the sequence assemblies. The first
kind of sequence assemblies involved assembly of all
the reads coming from a single variety without the
sequence of reference variety Nipponbare. There
were sixteen separate contig assembly projects, one
for each test variety. Twelve of the 16 varieties
produced single assembled contigs for the coding
region while the remaining four varieties (CSR36,
Jyothi, Kalanamak 3131 and Ratna) were assembled
into two to three contigs. Some of these contigs were
just touching each other when compared with the
reference sequence of Nipponbare, but did not merge
into a single contig in the Phrap assembly. The
consensus sequences of the contigs for each variety
already submitted to the NCBI were used for
construction of the phylogenetic tree of the BADH1
gene (Fig. S1).
The second kind of assembly included sequence
reads from all the 16 test varieties and the reference
variety Nipponbare into a single project, and the
entire assembly was integrated with the Polyphred
that helped automatically tag the SNPs with their
quality score. The consensus sequence was based on
the reference sequence and SNPs were identified and
tagged in the CONSED alignment window. Twenty
SNPs were identified in this manner and the high
quality of the SNPs was evident from the fact that all
the flanking sequences were clearly identical except
for the single nucleotide differences. The nucleotide
positions of all the 20 SNPs in the BADH1 gene of
reference variety Nipponbare are shown in Fig. 1.
The BADH1 gene has 15 exons and 14 introns, and
out of the 20 SNPs identified here 17 were present in
the introns and only three were in the exons,
suggesting high sequence conservation in the coding
region of this gene (Fig. 1). The abundance of SNPs
in the introns (one per 171 bp) was almost three times
higher than exons (one per 505 bp). In fact, 50% of
the 20 SNPs were found in just two introns (numbers
2 and 4). The three exonic SNPs were (1) S6_1371 in
exon 4 with a T/A polymorphism resulting in an
asparagine to lysine substitution at amino acid
position 144; (2) S18_3493 in exon 11 with a C/A
polymorphism resulting in a glutamine to lysine
substitution at amino acid position 345; and (3) S19_3500,
also in exon 11, with a T/C polymorphism resulting in an
isoleucine to threonine substitution at amino acid
position 347. The S6_1371 and S18_3493 SNPs are
significant because they result in incorporation of
functionally active lysine residues. Furthermore, they
are base transversions, which are less frequent than
base transition mutations. The 3D crystal structure of
the BADH enzyme has been solved in rice and its
active sites and functional domains have been
identified; these amino acid mutation positions are
near the enzyme active site and substrate recognition
site (Chen et al. 2008). Certain SNPs were specific to
only one of the 16 varieties, e.g. Red Triveni has a
unique allele A for the SNP S2_474 at nucleotide
position 474. Similarly, SNPs S14_2772 and
S19_3500 were unique to the short-grain aromatic
salt-tolerant varieties Kalanamak 3119 and Kalanamak
3131. The SNPs S19_3500 and S18_3493 are both near
the substrate recognition site of the BADH enzyme.
This suggests evolution of these mutations under
positive selection pressure.
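The transversion/transition distinction drawn above for the three exonic SNPs can be checked with a few lines of Python. The classification rule is standard (a change within purines A/G or within pyrimidines C/T is a transition, anything else a transversion); the function name and the dictionary labels are illustrative only.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(allele1: str, allele2: str) -> str:
    """Classify a biallelic SNP as a 'transition' or a 'transversion'."""
    pair = {allele1.upper(), allele2.upper()}
    if len(pair) != 2 or not (pair <= (PURINES | PYRIMIDINES)):
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Alleles of the three exonic SNPs as reported in the text
exonic_snps = {
    "exon 4, T/A": ("T", "A"),
    "exon 11, C/A": ("C", "A"),
    "exon 11, T/C": ("T", "C"),
}
for label, (x, y) in exonic_snps.items():
    print(label, "->", classify_snp(x, y))
# exon 4, T/A -> transversion
# exon 11, C/A -> transversion
# exon 11, T/C -> transition
```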
Fig. 1 Sequence of the BADH1 gene from Oryza sativa japonica cv. 'Nipponbare' showing positions of 20 SNPs (highlighted yellow) in exons (capital letters) and introns (lower case). PCR primers used for amplification of overlapping gene fragments are shown in red font; reverse primers are underlined
Phylogenetic analysis of the BADH1 gene of rice
Earlier studies in plants have shown the divergence
of BADH1 and BADH2 genes from a common
ancestor well before the divergence of cereals, with
each gene having similar enzyme activities but
different regulatory control (Bradbury et al. 2005).
However, no study is available on the BADH1
sequence variation in different rice varieties to
provide an insight into the origin of functional
mutations in the BADH1 gene of rice. We con-
structed a phylogenetic tree on the basis of the high-
quality sequence contigs assembled for the coding
region of the BADH1 gene of the 16 rice varieties
along with the reference sequence of japonica
cultivar Nipponbare for comparison and a partial
genomic sequence of the progenitor species O.
rufipogon for rooting of the phylogenetic tree (Fig. S1).
The tree rooted with the progenitor species O.
rufipogon clearly grouped the varieties according to
their sequence differences with very high bootstrap
values. Two indica type varieties Pusa 44 and Jyothi
were more closely related to the origin than other
varieties. The next most closely related variety was
Pokkali, a salt-tolerant landrace from Kerala, India.
The tree clearly shows the evolution of the Indian
aromatic and japonica group of cultivars from the
progenitor O. rufipogon in a step-wise manner (Fig.
S1). It was interesting to note that the phylogram
based on a single gene was able to group these 16
varieties into clusters, which corresponded closely to
their geo-evolutionary origin. For example, the
japonica type varieties Nipponbare and Taipei 309
and the japonica × indica crossbred varieties
Pusa 1342 and Pusa 1266 were grouped together,
along with Red Triveni. Salt-tolerant varieties
CSR36 and CSR10 formed a separate group while
short-grain aromatic rice varieties Kalanamak 3119
and Kalanamak 3131 were grouped in a separate
cluster. Basmati quality variety Pusa 1121 formed a
separate group with CSR27 which has Nona Bokra
in its pedigree (Table S1), and three high yielding
semi-dwarf varieties, Jaya, Ratna and MI48, were
more closely related to each other. These groupings
are consistent with the known pedigree and origin of
these varieties (Table S1). Interestingly, the six salt-
tolerant rice varieties did not form a single group,
thus highlighting the possible role of different
mechanisms of salt tolerance in these varieties and
possibly indicating the lack of association between
salt tolerance and the BADH1 polymorphism in rice.
Multiplex SNP genotyping and association
of BADH1 haplotypes with rice aroma
Sequenom MassARRAY multiplex assays were
designed for high-throughput analysis of the 20 SNPs
identified during this study, and an additional SNP
linked with 8 bp deletion in exon 7 of the BADH2 gene
(Table 3). However, after optimization and validation
of the assays it was found that only 15 BADH1 SNPs
and one BADH2 SNP could be scored reliably with a
high degree of reproducibility. The remaining five
BADH1 SNPs (S7, S10, S12, S13 and S20) did not
work consistently, most likely due to the overlapping
target sites for the primers in the same PCR reaction.
However, it may not be necessary to assay all the SNPs
because they tend to inherit as haplotype blocks and
therefore 15 reproducible SNPs in the BADH1 gene
can provide full information on its common alleles.
While the individual SNPs were scored successfully in
more than 95% of the samples, all the 15 SNPs were
scored accurately in 92 of the 127 varieties analyzed.
Based on the data of 15 SNPs, 92 varieties could be
grouped into 15 different haplotypes (Fig. 2, Table
S3). Ten of these SNP haplotypes were rare,
represented by one variety each, viz. Taipei 309, Jyothi,
Pusa44, SKR126, CSR10, IR64, Pusa1266, Kasturi,
Pusa1121 and Pant Dhan 4. Another unique SNP
haplotype was present in the two different accessions
of Kalanamak (3119 and 3131). Four haplotypes were
common, representing 86.9% of the varieties with full
15 SNP complements. Analysis based on the three
exonic SNPs showed only five haplotypes, two of
which (PH1 and PH2) were common and the remain-
ing three were rare, including haplotype PH3 in the
aromatic salt-tolerant landrace Kalanamak 3119 and
3131. Varieties CSR10 and Pant Dhan 4 also have
unique protein haplotypes PH4 and PH5, respectively
(Fig. 2, Table S4).
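The grouping of varieties into haplotypes can be sketched as follows, assuming each variety is scored as a string of SNP calls and that varieties with any missing call are set aside, in line with the restriction to fully scored varieties described above. The variety names, the genotype strings, the use of 'N' for a missing call and the `min_size` threshold are hypothetical.

```python
from collections import defaultdict


def group_haplotypes(genotypes: dict) -> dict:
    """Group varieties that share an identical string of SNP calls.
    Varieties with a missing call ('N') are skipped."""
    groups = defaultdict(list)
    for variety, calls in genotypes.items():
        if "N" in calls:
            continue
        groups[calls].append(variety)
    return dict(groups)


def shared_fraction(groups: dict, min_size: int = 2) -> float:
    """Fraction of fully scored varieties that carry a haplotype
    found in at least `min_size` varieties."""
    total = sum(len(members) for members in groups.values())
    shared = sum(len(m) for m in groups.values() if len(m) >= min_size)
    return shared / total


# Toy data over five SNPs (hypothetical)
toy = {
    "var_1": "ACGTA",
    "var_2": "ACGTA",
    "var_3": "ACGTC",
    "var_4": "TCGTA",
    "var_5": "ACNTA",  # incomplete, skipped
    "var_6": "ACGTA",
}
groups = group_haplotypes(toy)
print(len(groups), shared_fraction(groups))
```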
Meaningful trait association analysis in the present
study is possible only with the common alleles
present in eighty varieties, i.e. SH1–SH4 (PH1–PH2).
The association of rare SNPs with the traits con-
cerned can be better analyzed using bi-parental
mapping populations (Rafalski 2002; Belo et al.
2008). The number of varieties with the four common
SNP haplotypes was 38 for SH1, 19 for SH2, 17 for
332 Mol Breeding (2010) 26:325–338
123
Ta
ble
3S
equ
ence
of
the
pre
-am
pli
fica
tio
nP
CR
pri
mer
s(P
CR
P)
and
sin
gle
nu
cleo
tid
eex
ten
sio
np
rim
ers
(UE
P)
des
ign
edfo
rm
ult
iple
xg
eno
typ
ing
of
20
SN
Ps
inth
eB
AD
H1
gen
ean
do
ne
SN
Pin
the
exo
n7
of
the
BA
DH
2g
ene,
des
ign
edfo
rth
eiP
LE
Xas
say
so
fS
equ
eno
mM
assA
RR
AY
syst
em
SN
P_
IDW
ell
2n
d-P
CR
P1
st-P
CR
PU
EP
BA
DH
1_
S2
1A
CG
TT
GG
AT
GT
TT
AC
GG
CA
CT
AA
GA
GC
GT
CA
CG
TT
GG
AT
GT
CT
CC
TA
TG
CT
GC
TA
AC
CT
GT
TC
AG
TC
AC
AC
AG
CAAG
BADH1_S5   1  ACGTTGGATGTTGATTCTGGGAAGCCTCTG  ACGTTGGATGCGGAAAATCTTGTGTCATGC  GCTGGGGACATGGTATG
BADH1_S6   1  ACGTTGGATGTTAGATGGGAAACAACGGGC  ACGTTGGATGACCCCAATGGGTTCTTTGAG  TCTCTCTACCCATGGAAAA
BADH1_S17  1  ACGTTGGATGCACTAGAAGAAGGTTGCAGG  ACGTTGGATGTTGCAACTGAACTTCAGGAG  GGGCAGGTAATGTAAATAG
BADH1_S4   1  ACGTTGGATGAACAGCAAGGGCAAGTCAAC  ACGTTGGATGTGGCTAGTTTAGTCTAGCGG  TAGTGGAGAATCTCATCTGC
BADH1_S7   1  ACGTTGGATGGGACGTGTCAAAGTATTTCG  ACGTTGGATGGCCTTGAAAAATTGATACTG  ATTTCGATACACTACAACATC
BADH2_S1   1  ACGTTGGATGGTTAGGTTGCATTTACTGGG  ACGTTGGATGCCTTAACCATAGGAGAGCTG  CTGGGAGTTATGAAACTGGTAT
BADH1_S9   1  ACGTTGGATGCCAATTGGTCTTCTGTTATC  ACGTTGGATGGTTTTGTTCACACCGGAAGC  ATCAAACATGAACCAACAAAAG
BADH1_S19  1  ACGTTGGATGTGTGGCACCTTCACATCTTG  ACGTTGGATGACTCGGACGACTTAAGAACC  CTTGCTGTTGAGATGAACTTCATT
BADH1_S14  1  ACGTTGGATGCGCATAATTGAATTAGGAGC  ACGTTGGATGGGTATTAGGTATCTTCCACT  CTTAGGAGCATAGTTGACGTGGGAA
BADH1_S8   2  ACGTTGGATGCGAAATACTTTGACACGTCC  ACGTTGGATGGAACAATTGCTTCCGGTGTG  CACGTCCAATTTCCCTC
BADH1_S20  2  ACGTTGGATGCCAGCTAGACCATAGCTGAG  ACGTTGGATGTGCTTTATGTGGAACTGTGG  AGACAAGAACAGGCTTA
BADH1_S16  2  ACGTTGGATGAGTCTGCAGTGCTACTTCTC  ACGTTGGATGCGAGGACATGACTTAAGCTG  TCGTCTACTTTTGCATGTA
BADH1_S15  2  ACGTTGGATGTTCCCACGTCAACTATGCTC  ACGTTGGATGCTACAGAATTGCAGGTAATG  AATTATGCGATGTTGTCAAA
BADH1_S3   2  ACGTTGGATGTTTACGGCACTAAGAGCGTC  ACGTTGGATGTCTCCTATGCTGCTAACCTG  CCTTTAATCAATCCCCGAAAG
BADH1_S11  2  ACGTTGGATGAGCCTAAAGCGCATTTACAC  ACGTTGGATGCATGAGTACAGACATGCACC  ACACAAAGCAATGAAACACGGA
BADH1_S1   2  ACGTTGGATGAGAGCTCTGCTCAACAACTC  ACGTTGGATGAAGGTGCTCAGCTTCTATGC  CTGCTCAACAACTCATCTAGCTC
BADH1_S12  2  ACGTTGGATGGTAAAAGGCTAATCGTTTGG  ACGTTGGATGTTCCAAGTAGCCATCAGCAG  CTAATCGTTTGGATTATTTTCTTA
BADH1_S13  2  ACGTTGGATGGCTCTAAACATGTCCTGGTT  ACGTTGGATGTCCCTGTAAGGATCTAAGGA  agCATATTTTTAGTCATGGGGTACA
BADH1_S18  2  ACGTTGGATGACTCGGACGACTTAAGAACC  ACGTTGGATGTGTGGCACCTTCACATCTTG  tcAGAACCTTTTTCTCATTTCAGTAT
BADH1_S10  2  ACGTTGGATGTGGGAAGATCAACTTCAAAG  ACGTTGGATGCCTGTCAGAGGAAAGACTAT  ccACTTCAAAGTTATTGGATGATCAC
SH3 and 6 for SH4 (Fig. 2). The four common SNP
haplotypes represented two BADH1 protein haplotypes
resulting from two exonic SNPs, S6_1371 and
S18_3493. The S19_3500 was a rare SNP unique to
Kalanamak 3119 and Kalanamak 3131, but S6_1371
and S18_3493 were polymorphic in the four common
SNP haplotypes. The SNP haplotypes SH1 and SH2
have lysine residues at amino acid positions 144 and
345, whereas in the SNP haplotypes SH3 and SH4 the
two lysine residues are substituted by asparagine and
glutamine residues, respectively. The difference
between SNP haplotypes SH1 and SH2 was due to
three intronic SNPs (S1, S9 and S16), whereas the
difference between haplotypes SH3 and SH4 was due
to two intronic SNPs (S2 and S5). Thus, at the protein
level there are only two common alleles: (i) protein
haplotype 1 (PH1), represented by SNP haplotypes
SH1 and SH2, and (ii) protein haplotype 2 (PH2),
represented by SNP haplotypes SH3 and SH4. The
protein haplotype PH1 was present in most high-
yielding indica rice varieties, whereas the protein
haplotype PH2 was common to the aromatic and
japonica group of varieties. The BADH1 protein
modeling and GABald ligand docking study revealed
20 ligand binding sites in the PH1 and 18 ligand
binding sites in the PH2 protein haplotype. Monte
Carlo simulation was used to dock the ligand with the
receptor in 1000 iterations for each binding site. In
the PH1 haplotype, 15 out of the 20 binding sites
were predicted as probable active sites, as the
GABald ligand bound tightly on those sites. In the
PH2 haplotype, only 8 out of 18 binding sites were
predicted as probable active sites for the binding of
GABald, suggesting a drastic reduction in the affinity
of the BADH1-PH2 haplotype for GABald, which is
a precursor of the aroma compound 2-AP (Fig. 3).
Thus PH2 could be a loss of function allele of the
BADH1 gene with implications in rice aroma similar
to the loss of function alleles of the BADH2 gene
(Kovach et al. 2009).
Variety          S-1  S-2  S-3  S-4  S-5  S-6  S-8  S-9  S-11 S-14 S-15 S-16 S-17 S-18 S-19 SNP haplotype  Protein haplotype  Frequency
Jaya             G    C    G    T    T    A    A    C    C    T    T    T    T    A    T    SH1            PH1                38
ADT43            A    C    G    T    T    A    A    G    C    T    T    C    T    A    T    SH2            PH1                19
Basmati 370      G    C    A    A    C    T    G    G    T    T    C    C    C    C    T    SH3            PH2                17
Taraori Basmati  G    A    A    A    T    T    G    G    T    T    C    C    C    C    T    SH4            PH2                6
Kalanamak 3119   G    C    A    T    T    A    A    G    C    A    T    C    C    A    C    SH5            PH3                2
Taipei 309       G    C    A    A    T    T    G    G    T    T    C    C    C    C    T    SH6            PH2                1
Jyothi           G    C    G    T    C    A    G    C    C    T    T    T    T    A    T    SH7            PH1                1
Pusa 44          G    C    G    T    T    A    A    G    C    T    T    T    T    A    T    SH8            PH1                1
SKR 126          G    C    G    T    T    A    A    G    C    T    T    C    T    A    T    SH9            PH1                1
CSR 10           G    C    G    T    T    T    G    G    T    T    T    C    T    A    T    SH10           PH4                1
IR 64            G    C    G    T    T    A    A    C    C    T    T    T    C    A    T    SH11           PH1                1
Pusa 1266        G    C    A    A    C    T    G    C    T    T    C    C    C    C    T    SH12           PH2                1
Kasturi          G    C    A    T    C    T    G    G    T    T    C    C    C    C    T    SH13           PH2                1
Pusa 1121        A    C    G    T    T    A    A    C    C    T    T    C    T    A    T    SH14           PH1                1
Pant Dhan 4      G    C    G    T    T    A    A    C    C    T    T    T    T    C    T    SH15           PH5                1
Fig. 2 Haplotypes of the
BADH1 gene in 92 diverse
rice varieties based on 15
SNPs with no missing data
genotyped using Sequenom
MassARRAY system.
Protein haplotypes are
based on three exonic SNPs
(S6, S18 and S19)
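The grouping of SNP haplotypes into protein haplotypes can be re-derived from the genotype matrix of Fig. 2. The following is a minimal sketch, not the procedure used in the study: the allele strings are transcribed from Fig. 2 in its column order, and the helper names (`exonic_alleles`, `group_by_protein`, `differing_snps`) are illustrative assumptions.

```python
from collections import defaultdict

# SNP columns in the order used in Fig. 2.
SNPS = ["S-1", "S-2", "S-3", "S-4", "S-5", "S-6", "S-8", "S-9",
        "S-11", "S-14", "S-15", "S-16", "S-17", "S-18", "S-19"]
# The three exonic SNPs that define the protein haplotypes.
EXONIC = ["S-6", "S-18", "S-19"]

# (SNP haplotype, alleles in SNPS order, number of varieties), from Fig. 2.
HAPLOTYPES = [
    ("SH1", "GCGTTAACCTTTTAT", 38), ("SH2", "ACGTTAAGCTTCTAT", 19),
    ("SH3", "GCAACTGGTTCCCCT", 17), ("SH4", "GAAATTGGTTCCCCT", 6),
    ("SH5", "GCATTAAGCATCCAC", 2), ("SH6", "GCAATTGGTTCCCCT", 1),
    ("SH7", "GCGTCAGCCTTTTAT", 1), ("SH8", "GCGTTAAGCTTTTAT", 1),
    ("SH9", "GCGTTAAGCTTCTAT", 1), ("SH10", "GCGTTTGGTTTCTAT", 1),
    ("SH11", "GCGTTAACCTTTCAT", 1), ("SH12", "GCAACTGCTTCCCCT", 1),
    ("SH13", "GCATCTGGTTCCCCT", 1), ("SH14", "ACGTTAACCTTCTAT", 1),
    ("SH15", "GCGTTAACCTTTTCT", 1),
]


def exonic_alleles(alleles):
    """Return the alleles at the exonic SNPs as a tuple (S-6, S-18, S-19)."""
    return tuple(alleles[SNPS.index(snp)] for snp in EXONIC)


def group_by_protein(haplotypes):
    """Group SNP haplotypes that share the same exonic alleles."""
    groups = defaultdict(list)
    for name, alleles, _count in haplotypes:
        groups[exonic_alleles(alleles)].append(name)
    return dict(groups)


def differing_snps(first, second):
    """List the SNPs at which two named SNP haplotypes differ."""
    table = {name: alleles for name, alleles, _count in HAPLOTYPES}
    return [snp for snp, a, b in zip(SNPS, table[first], table[second]) if a != b]


groups = group_by_protein(HAPLOTYPES)
total = sum(count for _name, _alleles, count in HAPLOTYPES)
common = sum(count for name, _alleles, count in HAPLOTYPES
             if name in {"SH1", "SH2", "SH3", "SH4"})

for alleles, names in groups.items():
    print(alleles, names)
print("SH1 vs SH2:", differing_snps("SH1", "SH2"))
print("SH3 vs SH4:", differing_snps("SH3", "SH4"))
print(f"common haplotypes: {common}/{total} varieties")
```

Grouping on the exonic alleles alone yields five distinct combinations, matching PH1 to PH5 in Fig. 2, and the pairwise comparison returns the intronic SNPs that separate SH1 from SH2 and SH3 from SH4.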
Fig. 3 3D modeling of the
two common BADH1
protein haplotypes (PH1
and PH2) with ligand
docking (green color)
showing reduced number of
binding sites in the PH2 for
GABald, a precursor of the
rice aroma compound 2-AP
An important observation in the present investiga-
tion was the association of BADH1-PH2 protein
haplotype with the aromatic rice varieties. To com-
plement such analysis it was important to analyze the
127 rice varieties and landraces for their population
structure using the STRUCTURE 2 software (Prit-
chard et al. 2000). We used 54 validated genome-wide
SNP markers with four to six loci per rice chromosome
for genotyping using Sequenom MassARRAY in two
multiplex assays (Table S2). Genome-wide synony-
mous SNPs were identified by resequencing of the
intron-spanning regions of conserved single copy rice
genes (our unpublished data). After fixing the K value
at two, and 1000 bootstrap permutations, the 127 rice
varieties were classified in two population groups
(Fig. 4). The list of varieties in each group with their
BADH1 haplotype score, aroma, salt tolerance and
BADH2-exon 7 SNP score is shown in Table S5. Most
of the traditional aromatic rice varieties were present
in population group 2 along with the japonica type
varieties, broadly agreeing with the phylogenetic
grouping based on the BADH1 gene sequence (Fig.
S1). But the population structure grouping presented
here is based on 54 genome-wide SNPs and therefore
reflected the pedigree and selection strategy used in
the development of these varieties (Fig. 4a). Rooting
of the phylogenetic tree with O. nivara showed that the
indica cultivar group was closer to this wild progen-
itor, but O. rufipogon was closer to the japonica/
aromatic population group (Fig. 4b). In each of the
two population groups there were varieties having
a significant proportion of genes from the other group
due to their crossbreeding pedigrees; those with more
than 10% genes from other groups are marked in Table
S5.
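The 10% admixture rule described above can be sketched as follows. This is an illustration only: the membership values are hypothetical placeholders, not results from Table S5, and the function name `classify` and its `threshold` parameter are assumptions made for the example.

```python
def classify(q_group1, threshold=0.10):
    """Assign a variety to one of two population groups (K = 2).

    q_group1 is the estimated fraction of the genome assigned to
    population group 1; the remainder belongs to group 2. Returns the
    group number and whether the contribution from the other group
    exceeds the threshold.
    """
    q_group2 = 1.0 - q_group1
    group = 1 if q_group1 >= q_group2 else 2
    minor_fraction = min(q_group1, q_group2)
    return group, minor_fraction > threshold


# Hypothetical membership coefficients, for illustration only.
examples = {"variety_A": 0.95, "variety_B": 0.30, "variety_C": 0.02}
for name, q1 in examples.items():
    group, admixed = classify(q1)
    print(name, "group", group, "admixed" if admixed else "not admixed")
```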
The association between the BADH1 haplotypes
and salt tolerance or aroma trait of the rice varieties
was assessed manually using the chi-squared test of
significance. The two common protein haplotypes of
the BADH1 gene were present in both the population
groups. Similarly, salt-tolerant and aromatic varieties
were also present across the population groups, but
there was predominance of aromatic varieties in the
population group 2. Frequencies of varieties with
different salt tolerance and aroma scores against the
two common BADH1 protein haplotypes present in
80 rice varieties are shown in Table 4. The frequen-
cies of two protein haplotypes were analyzed in each
of the four categories of aromatic, non-aromatic, salt-
tolerant and salt-susceptible varieties against the
observed overall distribution of 71.2% for PH1 and
28.8% for PH2 in the whole population. There was a
significant association between aroma score and the
BADH1 protein haplotype PH2, where lysine144 and
lysine345 residues of the common haplotype PH1 are
substituted by asparagine144 and glutamine345,
respectively (χ² = 6.985, P = 0.008 at df = 1). It is
known from several independent genetic studies and
validation through genetic transformation that the
BADH2 gene is a major locus responsible for rice
aroma, where loss of function mutations in the gene
lead to accumulation of gamma-aminobutyraldehyde
(GABald), a precursor of the aroma compound 2-AP
in the rice grains (Bradbury et al. 2008). At least two
earlier studies have shown co-location of a QTL for
aroma and the BADH1 gene on rice chromosome 4
(Lorieux et al. 1996; Amarawathi et al. 2008). The
BADH1 haplotype–aroma association identified in
this study may explain the functional basis of the
aroma QTL on chromosome 4.
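As a consistency check on the reported statistics, the P values in Table 4 can be recomputed from the chi-squared values. The sketch below assumes only the standard relation for one degree of freedom, P = erfc(sqrt(χ²/2)); the function name is illustrative and the code does not attempt to reproduce the authors' manual calculation of χ² itself.

```python
import math


def chi2_p_value_df1(chi2):
    """Upper-tail P value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Chi-squared values and P values as reported in Table 4.
reported = {
    "aromatic": (6.985, 0.008),
    "non-aromatic": (1.225, 0.268),
    "salt-tolerant": (0.448, 0.503),
    "salt-susceptible": (0.050, 0.823),
}

for category, (chi2, p_reported) in reported.items():
    p = chi2_p_value_df1(chi2)
    print(f"{category}: chi2 = {chi2}, P = {p:.3f} (reported {p_reported})")
```

Only the aromatic category falls below the conventional 0.05 threshold, in line with the association reported in the text.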
Due to their similar biochemical function it is
anticipated that loss of function mutations in the
BADH1 gene could also control rice aroma similar to
the BADH2 gene, particularly in salt and water stress
conditions (Bradbury et al. 2008). In this study we
provide evidence that the BADH1 protein haplotype
PH2 (SNP haplotypes SH3 and SH4) is associated
with the aromatic rice varieties. It is important to note
here that the loss of function mutation in the BADH2
gene is a primary requirement for aroma development
due to constitutive expression of the BADH2 gene
(Bradbury et al. 2005). However, just the loss of
function of the BADH2 gene is not enough; it may be
complemented with the BADH1 protein haplotype
PH2 (SNP haplotypes SH3 and SH4) for full aroma
expression. For example, the popular crossbred
basmati variety Pusa Basmati 1 has the badh2-exon
7 deletion mutation but has BADH1 haplotype PH1/
SH1, which could be the reason for its mild aroma,
whereas another popular crossbred basmati variety
Pusa 1121 has a rare allele of the BADH1 gene
(haplotype PH1/SH14) that might lead to a better
aroma development than Pusa Basmati 1. Thus, a
combination of loss of function mutation in the
BADH2 gene and a reduction in the substrate binding
capacity of the BADH1 enzyme to aroma precursor
compound GABald could be important for full aroma
development in rice.
Re-sequencing of the target gene from different
genotypes of a species is one of the most reliable
techniques for SNP discovery. It was applied here
to identify, for the first time, 20 new SNPs in
the BADH1 gene. However, for routine application of
SNPs for allele mining and marker-assisted breeding,
high-throughput methods of SNP genotyping are
required. The Sequenom MassARRAY system allows
handling of a large number of samples using small
to medium numbers of well characterized SNPs in
Fig. 4 a Population structure (K = 2) and b NJ phylogenetic
tree (using Nei’s similarity matrix) of 127 rice varieties based
on 54 synonymous SNPs present in conserved single-copy rice
genes evenly distributed on the 12 rice chromosomes. The
phylogenetic tree was rooted using Oryza nivara as an out
group
multiplex reactions. We designed two multiplex
genotyping assays for the 20 newly discovered SNPs.
Fifteen of these SNPs were validated successfully and
used for the genotyping of a large set of 127 rice
genotypes of diverse origin and agronomic trait
variation. This enabled the discovery of BADH1 haplotypes
showing significant association with the aromatic
rice varieties that may complement the role of
known loss of function alleles of the BADH2 locus
for rice aroma. However, this association needs
further validation in a segregating population.
Sequence submissions
The BADH1 sequences from sixteen rice varieties have
been submitted to the NCBI GenBank with accession
numbers: CSR10 (EU566870), CSR27 (bankit1114275),
CSR36 (bankit1114294), Jaya (EU566862),
Jyoti (bankit1114323), Kalanamak 3119 (EU566863),
Kalanamak 3131 (bankit1114319), MI48 (bankit1114313),
Pokkali (EU566865), Pusa 44 (EU566866),
Pusa 1121 (EU566867), Pusa 1266 (EU566864),
Pusa 1342 (EU566868), Ratna (bankit1114305),
Red Triveni (bankit1114308), Taipei 309 (EU566869).
Acknowledgments This work was supported by the NPTC
project of the Indian Council of Agricultural Research and is
part of the M.Sc. thesis of the senior author.
References
Amarawathi Y, Singh R, Singh AK, Singh VP, Mohapatra T,
Sharma TR et al (2008) Mapping of quantitative trait loci
for basmati quality traits in rice (Oryza sativa L.). Mol
Breed 21:49–65
Belo A, Zheng P, Luck S, Shen B, Meyer DJ, Li B, Tingey S,
Rafalski A (2008) Whole genome scan detects an allelic
variant of fad2 associated with increased oleic acid levels
in maize. Mol Genet Genomics 279:1–10
Bradbury LMT, Fitzgerald TL, Henry RJ, Jin QS, Waters DLE
(2005) The gene for fragrance in rice. Plant Biotechnol J
3:363–370
Bradbury LMT, Gillies SA, Brushett DJ, Waters DLE, Henry
RJ (2008) Inactivation of an aminoaldehyde dehydrogenase
is responsible for fragrance in rice. Plant Mol Biol
68:439–449
Chen S, Yang Y, Shi W, Ji Q, He F, Zhang Z, Cheng Z, Liu X,
Xu M (2008) BADH2, encoding betaine aldehyde dehy-
drogenase, inhibits the biosynthesis of 2-acetyl-1-pyrro-
line, a major component in rice fragrance. Plant Cell 20:
1850–1861
Ewing B, Hillier L, Wendl MC, Green P (1998) Base-calling of
automated sequencer traces using phred. I. Accuracy
assessment. Genome Res 8:175–185
Fitzgerald TL, Waters DLE, Henry RJ (2008) The effect of salt
on betaine aldehyde dehydrogenase transcript levels and 2-
acetyl-1-pyrroline concentration in fragrant and non-fra-
grant rice (Oryza sativa). Plant Sci. doi:10.1016/j.plantsci.
2008.06.005
IRGSP (2005) The map based sequence of the rice genome.
Nature 436:793–800
Ishitani M, Arakawa K, Mizuno K, Kishitani S, Takabe T
(1993) Betaine aldehyde dehydrogenase in the Grami-
neae. Levels in leaves of both betaine accumulating and
non-accumulating cereal plants. Plant Cell Physiol 34:
493–495
Ishitani M, Nakamura T, Han SY, Takabe T (1995) Expression of
the betaine aldehyde dehydrogenase gene in barley in
response to osmotic stress and abscisic acid. Plant Mol Biol
27:307–315
Kovach MJ, Calingacion MN, Fitzgerald MA,
McCouch SR (2009) The origin and evolution of fragrance
in rice (Oryza sativa L.). Proc Natl Acad Sci USA
106:14444–14449
Liang Z, Ma D, Tang L, Hong Y, Luo A, Zhou J, Dai X (1997)
Expression of the spinach betaine aldehyde dehydroge-
nase (BADH) gene in transgenic tobacco plants. Chin J
Biotechnol 13:153–159
Lorieux M, Petrov M, Huang N, Guiderdoni E, Ghesquiere A
(1996) Aroma in rice: genetic analysis of a quantitative
trait. Theor Appl Genet 93:1145–1151
Mohanty A, Kathuria H, Ferjani A, Sakamoto A, Mohanty P,
Murata N et al (2002) Transgenics of an elite indica rice
variety Pusa Basmati 1 harbouring the codA gene are
highly tolerant to salt stress. Theor Appl Genet 106:51–57
Table 4 Frequency distribution of two major protein haplotypes of the BADH1 gene (based on the exonic SNPs) in a diverse set of
80 rice varieties and landraces representing different pedigrees and eco-geographical origins
BADH1 protein haplotype  Aromatic  Non-aromatic  Salt-tolerant  Salt-susceptible  Total (%)
PH1                      5         52            5              52                57 (71.2)
PH2                      8         15            1              22                23 (28.8)
Total                    13        67            6              74                80 (100)
χ²                       6.985     1.225         0.448          0.050
P (df = 1)               0.008     0.268         0.503          0.823
Murray MG, Thompson WF (1980) Rapid isolation of high
molecular weight plant DNA. Nucleic Acids Res 8:4321–
4325
Niu XL, Zheng WJ, Lu BR, Ren GJ, Huang WZ, Wang SH
et al (2007) An unusual post-transcriptional processing in
two betaine aldehyde dehydrogenase loci of cereal crops
directed by short, direct repeats in response to stress
conditions. Plant Physiol 143:1929–1942
Pavlicek A, Hrda S, Flegr J (1999) FreeTree—Freeware pro-
gram for construction of phylogenetic trees on the basis of
distance data and bootstrap/jackknife analysis of the tree
robustness. Application in the RAPD analysis of the genus
Frenkelia. Folia Biol (Praha) 45:97–99
Pritchard JK, Stephens M, Donnelly P (2000) Inference of
population structure using multilocus genotype data.
Genetics 155:945–959
Rafalski A (2002) Applications of single nucleotide polymor-
phisms in crop genetics. Curr Opin Plant Biol 5:94–100
Rathinasabapathi B, Gage DA, Mackill DJ, Hanson AD (1993)
Cultivated and wild rices do not accumulate glycine
betaine due to deficiencies in two biosynthetic steps. Crop
Sci 33:534–538
Sakthivel K, Sundaram RM, Shobha Rani N, Balachandran
SM, Neeraja CN (2009) Genetic and molecular basis of
fragrance in rice. Biotechnol Adv 27:468–473
Shi W, Yang Y, Chen S, Xu M (2008) Discovery of a new
fragrance allele and the development of functional markers
for the breeding of fragrant rice varieties. Mol Breed 22:
185–192
Shirasawa K, Takabe T, Takabe T, Kishitani S (2006) Accu-
mulation of glycinebetaine in rice plants that overexpress
choline monooxygenase from spinach and evaluation of
their tolerance to abiotic stress. Ann Bot 98:565–571
Singh NK, Dalal V, Batra K, Singh BK, Chitra G et al (2007)
Single-copy genes define a conserved order between rice
and wheat for understanding differences caused by dupli-
cation, deletion, and transposition of genes. Funct Integr
Genomics 7:17–35
Tamura K, Dudley J, Nei M, Kumar S (2007) MEGA4:
molecular evolutionary genetics analysis (MEGA) soft-
ware version 4.0. Mol Biol Evol 24:1596–1599
Weretilnyk EA, Hanson AD (1990) Molecular cloning of a
plant betaine-aldehyde dehydrogenase, an enzyme impli-
cated in adaptation to salinity and drought. Proc Natl Acad
Sci USA 87:2745–2749
Wood AJ, Saneoka H, Rhodes D, Joly RJ, Goldsbrough PB
(1996) Betaine aldehyde dehydrogenase in sorghum. Plant
Physiol 110:1301–1308
Yang X, Liang Z, Lu C (2005) Genetic engineering of the bio-
synthesis of glycinebetaine enhances photosynthesis
against high temperature stress in transgenic tobacco plants.
Plant Physiol 138:2299–2309