Draft - University of Toronto T-Space

Draft

In silico genome-wide identification and characterization of

glutathione S-transferase gene family in Vigna radiata (L.) Wilczek

Journal: Genome

Manuscript ID gen-2017-0192.R1

Manuscript Type: Article

Date Submitted by the Author: 25-Nov-2017

Complete List of Authors:
Vaish, Swati; Institute of Bioscience and Technology, Shri Ramswaroop Memorial University, Lucknow-Deva Road, Barabanki, Uttar Pradesh 225003, India
Awasthi, Praveen; National Agri-Food Biotechnology Institute
Tiwari, Siddharth; National Agri-Food Biotechnology Institute
Tiwari, Shailesh Kumar; Indian Institute of Vegetable Research
Gupta, Divya; Institute of Bioscience and Technology, Shri Ramswaroop Memorial University, Lucknow-Deva Road, Barabanki, Uttar Pradesh 225003, India
Basantani, Mahesh; Shri Ramswaroop Memorial University, Institute of Bio-Science and Technology

Is the invited manuscript for consideration in a Special Issue?: N/A

Keyword: Bioinformatics, Phi and tau GSTs, legumeinfo, Plant stress metabolism, Whole-genome sequencing

https://mc06.manuscriptcentral.com/genome-pubs

Genome

Draft

1

In silico genome-wide identification and characterization of glutathione

S-transferase gene family in Vigna radiata (L.) Wilczek

Swati Vaish^a, Praveen Awasthi^b, Siddharth Tiwari^b, Shailesh Kumar Tiwari^c, Divya Gupta^a, Mahesh Kumar Basantani^a,*

^a Institute of Bioscience and Technology, Shri Ramswaroop Memorial University, Lucknow-Deva Road, Barabanki, 225003, Uttar Pradesh, India

^b National Agri-Food Biotechnology Institute (NABI), (Department of Biotechnology, Government of India), Knowledge City, Sector 81, S.A.S. Nagar, Mohali 140306, Punjab, India

^c Division of Crop Improvement, ICAR-Indian Institute of Vegetable Research, Post bag 01, Post Office Jakhini (Shahanshahpur), Varanasi, 221305, Uttar Pradesh, India

Running title: Glutathione S-transferase genes in Vigna radiata

*Corresponding author:

Dr. Mahesh Kumar Basantani,

Institute of Bioscience and Technology,

Shri Ramswaroop Memorial University,

Lucknow-Deva Road,

Barabanki, 225003,

Uttar Pradesh,

India

Tel: +91 9839534061

[email protected]

Page 1 of 35



Abstract

Plant glutathione S-transferases are integral to normal plant metabolism and to biotic and

abiotic stress tolerance. The GST gene family has been characterized in diverse plant species

using molecular biology and bioinformatics approaches. In the current study, in silico

analysis identified 44 GSTs in Vigna radiata. Of the total 44 GSTs identified,

chromosomal locations of 31 GSTs were confirmed. The pI value of GST proteins ranged

from 5.10 to 9.40. The predicted molecular weights ranged from 13.12 to 50 kDa.

Subcellular localization analysis revealed that the GSTs were predominantly localized in

the cytoplasm. The active site amino acids were confirmed to be serine in tau, phi, theta,

zeta and TCHQD; cysteine in lambda, DHAR and omega; and tyrosine in EF1G. The

gene architecture conformed to the two-exon/one-intron and three-exon/two-intron organization in

the tau and phi classes, respectively. MEME analysis identified 10 significantly

conserved motifs with widths of 8 to 50 amino acids. The motifs identified were either

specific to a single GST class or shared by multiple GST classes. The results of

the current study will be of potential importance in the characterization of GST gene

family in V. radiata, an economically important leguminous crop.

Keywords: Bioinformatics, phi and tau GSTs, legumeinfo, plant stress metabolism,

whole-genome sequencing

1. Introduction

Vigna radiata (L.) R. Wilczek, commonly known as mung bean, is a widely cultivated

warm-season legume species, grown extensively in tropical and subtropical regions of the

world. It belongs to the papilionoid subfamily of the Fabaceae family and has a diploid

chromosome number of 2n = 22 (Keatinge et al. 2011). It is grown in about 6 million

hectares of area in the world, primarily in South and Southeast Asia, Africa, South

America and Australia (Schafleitner et al. 2015; Nair et al. 2012). In Asia, it is

grown in India, Pakistan, Bangladesh, Sri Lanka, China, etc. In India, it is grown in the equatorial

and semi-tropical climates (Baloda et al. 2017). India is the world’s largest

producer of V. radiata contributing over 50% of the total world production (Kang et al.

2014). The states of Rajasthan, Maharashtra, Andhra Pradesh, Gujarat and Bihar


contribute the most to countrywide V. radiata production, with Rajasthan and Maharashtra

producing 26% and 20%, respectively (Figure 1) (GoI: Department of Agriculture and

Cooperation, 2014-15).

V. radiata is a rich source of protein, resistant starch and dietary fibres (Chitra et al.

1995; Sandhu et al. 2008), and contains higher levels of folate and iron than most other

legumes (Graham and Vance 2003). Proteins and carbohydrates of V. radiata are easily

digestible and create less flatulence than proteins derived from other legumes. V. radiata

is highly sensitive to saline and desiccated soils, and temperature extremes (very low

or very high) during the flowering and seed/pod development stages result in heavy

productivity losses (Baloda et al. 2017). V. radiata fixes atmospheric nitrogen via root

rhizobial symbiosis and improves soil fertility (Yaqub et al. 2010).

Glutathione S-transferases (GSTs; EC 2.5.1.18) are found in higher plants,

bryophytes, algae, fungi, bacteria, etc. They are predominantly located in the cytosol

(Sheehan et al. 2001). Interestingly, two GSTs, Nt ParA in tobacco and GSTU12 in

Arabidopsis thaliana, are found in the nucleus (Dixon et al. 2009; Takahashi

et al. 1995). Besides, in plants, GSTs are also reported to be present in chloroplasts,

mitochondria, etc. (Lallement et al. 2014). Cytosolic GSTs are the most numerous and

extensively characterized both structurally and functionally. However, there is a lack of

information about organelle-specific plant GSTs. In plants, GSTs have been recognized

for their roles in normal physiology and metabolism, and in biotic and abiotic stress

management, including oxidative stress tolerance and resistance to herbicides, pesticides

and antibiotics (Dixon et al. 2002; Neuefeind et al. 1997).

Plant GSTs are classified into ten different classes: tau (GSTU), phi (GSTF), lambda

(GSTL), theta (GSTT), zeta (GSTZ), dehydroascorbate reductase (DHAR),

tetrachlorohydroquinone dehalogenase (TCHQD), elongation factor 1 gamma (EF1G), hemerythrin and iota,

based on sequence similarity, immunological reactivity, kinetic properties, and structural

conformation (Liu et al. 2013). Fourteen different GST classes have been identified in

eukaryotic photosynthetic organisms on the basis of phylogenetic analysis, out of which

phi and tau are plant-specific and the most abundant plant GSTs (Lallement et al. 2014).

Plant GSTs typically have relative molecular masses of around 50 kDa and are homodimers or

heterodimers composed of two similarly sized (~25 kDa) subunits, with an isoelectric point


in the pH range of 4 to 5. All GSTs consist of two domains: a conserved N-terminal

domain containing the G-site, which binds GSH, and a relatively less conserved C-terminal

domain containing the H-site, with which hydrophobic toxic molecules interact. In plant GSTs, the

α-helix is the predominant secondary structure, followed by random coil and then β-sheet

(Labrou et al. 2015).

The GST gene family has been characterized in several plant species using molecular and

bioinformatics approaches. So far, 55 GSTs in Arabidopsis thaliana (Dixon et al. 2009), 82 in

Oryza sativa (Jain et al. 2010), 81 in Populus (Lan et al. 2009), 42 in maize (McGonigle

et al. 2000), 65 in broccoli (Vijayakumar et al. 2016), 84 in Hordeum vulgare (Rezaei et

al. 2013), 20 in Dracaena cambodiana (Zhu et al. 2016), and 37 in Physcomitrella

patens (a bryophyte; Liu et al. 2013) have been identified.

The draft genome sequence of cultivated mung bean (V. radiata var. radiata VC1973A)

has been published by Kang et al. (2014). The availability of whole-genome sequence

information will tremendously enhance genomics research in V. radiata, and provide an

impetus to V. radiata breeding programmes, thereby laying down the foundation for V.

radiata resequencing efforts.

The V. radiata whole-genome sequence information offers a wide range of opportunities

for identifying and characterising agriculturally relevant gene families, understanding

gene expression regulation in normal plant metabolism and during abiotic and biotic

stresses, identifying molecular markers for targeted marker-assisted

breeding programmes, etc.

Despite the availability of the whole-genome sequence of V. radiata, large-scale in silico

genome-wide identification and characterization of any gene family has not been carried

out in this economically important legume crop. To date, there is no report of GST gene

family identification and characterization in V. radiata. The current study identified 44

GSTs in V. radiata, distributed into 7 classes; they were found to be primarily localized

in the cytoplasm. The canonical N and C-terminal domains of GSTs were found to be

present in V. radiata GSTs with active site residues located in the N-terminus G-site.

These GSTs were further characterized on the basis of molecular weight, pI, protein

length, gene architecture, protein motif identification, active site residue localization etc.


2. Materials and methods

2.1 Searching for GST genes in V. radiata sequence database available at Legume

Information System (LIS)

To conduct the in-silico identification of GSTs present in V. radiata genome, A. thaliana,

Glycine max and O. sativa GST protein sequences were used to query the V. radiata

whole genome database Vr1.0 available at Legume Information System (LIS;

http://legumeinfo.org/). A. thaliana GST protein sequences were retrieved from The

Arabidopsis Information Resource (TAIR); soybean and rice GST protein sequences

were retrieved from NCBI by the locus ID or the accession number published by Liu et

al. (2015) and Jain et al. (2010), respectively (Supplementary tables 1, 2 and 3). A total of

235 GST protein sequences from these three species were used to query the V. radiata

database. In the preliminary analysis, these sequences were FASTA formatted and used

to search the V. radiata sequence database at LIS with default parameters. Furthermore,

GST_N and GST_C domains of these 235 GST protein sequences from the three species

were identified. These two domain sequences were separately used to query the V.

radiata sequence database at LIS. Hidden Markov Model (HMM) searches were also

carried out separately with GST_N and GST_C domain sequences of the three species as

queries. BLASTP searches were performed for the GST_N and GST_C domain

sequences of each GST class separately, using GST protein sequences of A. thaliana, G.

max and O. sativa as queries. The amino acid, genomic, and coding sequences (CDS) of

the identified V. radiata GSTs were downloaded from the LIS. These GSTs were tentatively

named VrGSTs. The protein and gene sequences of the identified V. radiata GSTs

were subjected to pI and molecular weight predictions, subcellular localization, protein

domain characterization, identification of exon-intron organisation, protein motif

identification, phylogenetic analysis, etc., using different online or offline software

programs and applications (Table 1).
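
Because many query sequences are searched against the same database, a single V. radiata gene is likely to be returned more than once. A minimal sketch of how such hits might be merged is given here, assuming the results are exported in BLAST tabular format (outfmt 6). The function name, the E-value cutoff of 1e-5, and all IDs in the usage example are illustrative assumptions; they are not settings or data reported in this study, which used the LIS default parameters.

```python
from typing import Iterable


def unique_hits(blast_lines: Iterable[str], max_evalue: float = 1e-5) -> list[str]:
    """Return subject IDs passing the E-value cutoff, in first-seen order.

    Expects BLAST tabular (outfmt 6) lines: column 2 is the subject ID and
    column 11 is the E-value. The cutoff is an illustrative assumption.
    """
    best: dict[str, float] = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        subject, evalue = fields[1], float(fields[10])
        if evalue <= max_evalue:
            # dicts keep insertion order, so first-seen order is preserved
            best[subject] = min(evalue, best.get(subject, evalue))
    return list(best)


# Hypothetical hits: two queries find the same subject, and one hit is too weak.
example = [
    "query_A\tsubject_1\t55.0\t170\t70\t2\t1\t170\t5\t174\t1e-60\t200",
    "query_B\tsubject_1\t50.0\t165\t80\t1\t3\t168\t7\t171\t1e-48\t170",
    "query_B\tsubject_2\t30.0\t60\t40\t1\t3\t62\t9\t68\t0.5\t25",
]
print(unique_hits(example))
```

Keeping one entry per subject ID means that the number of candidate genes does not grow with the number of query sequences used.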

2.2 Conserved domain analysis and confirmation of GST proteins

The identity and protein domain organization of V. radiata GST proteins were primarily

confirmed by NCBI batch-CD (conserved domain) search


(https://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi) (Marchler-Bauer et al.

2017).

2.3 Subcellular localization

The probable subcellular localization of the identified GSTs was determined by CELLO

online tool v.2.5 (http://cello.life.nctu.edu.tw/) (Yu et al. 2006), TargetP (Emanuelsson et

al. 2000) and WoLF PSORT (Horton et al. 2007).

2.4 Prediction of molecular weights and pIs on ExPASy server

The protein molecular weights and pIs were predicted using ProtParam tool of ExPASy

(http://web.expasy.org/protparam/) (Gasteiger et al. 2005).
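
ProtParam derives the molecular weight from the amino acid composition of a sequence. As a rough illustration of that kind of calculation (a sketch, not the ExPASy implementation), the function below sums average residue masses and adds one water molecule for the free termini. The function name is hypothetical, and the mass table holds standard average residue masses.

```python
# Average masses (Da) of amino acid residues, i.e. the amino acid minus water.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the N- and C-termini


def molecular_weight_kda(sequence: str) -> float:
    """Approximate molecular weight (kDa) of an unmodified protein sequence."""
    seq = sequence.strip().upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    if not seq:
        return 0.0
    return (sum(RESIDUE_MASS[aa] for aa in seq) + WATER) / 1000.0


# Toy dipeptide, used only to show the call; not a sequence from this study.
print(round(molecular_weight_kda("GA"), 5))
```

As a plausibility check on reported values, the 13.12 kDa predicted for the 117-residue VrGSTF2 corresponds to about 112 Da per residue, which is in the range expected from the table above.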

2.5 Conserved motif identification in V. radiata GST proteins

MEME analysis (http://meme-suite.org/) (Bailey et al. 2009) was performed with default

parameters to identify conserved motifs in the identified V. radiata GST protein

sequences.

2.6 Schematic diagram of protein functional domains with active site residues

The protein domain organisation and active site residues were depicted diagrammatically

using Illustrator IBS v. 1.0.2 (http://ibs.biocuckoo.org/index.php) (Liu et al. 2015).

2.7 Protein sequence alignments

The GST protein sequences of V. radiata, A. thaliana, O. sativa and G. max were aligned

using Clustal Omega (Sievers et al. 2011). The protein alignments were rendered in

ESPript 3.0 (Robert and Gouet 2014).

2.8 Gene structure visualization

The exon-intron number and gene architecture of the identified GST genes were obtained

using Gene Structure Display Server 2.0 (GSDS; http://gsds.cbi.pku.edu.cn/) by aligning

the FASTA formatted CDS and genomic DNA sequences (Hu et al. 2015).
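
Once the CDS has been aligned to the genomic sequence, the introns are simply the gaps between consecutive exons. The sketch below illustrates that relationship with 1-based, inclusive exon coordinates; it does not reproduce how GSDS works internally, and the function name and coordinates are hypothetical.

```python
def introns_from_exons(exons: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Return intron intervals lying between consecutive exons.

    Exons are (start, end) pairs, 1-based and inclusive, on the genomic
    sequence. Overlapping exons are rejected as inconsistent input.
    """
    ordered = sorted(exons)
    introns = []
    for (_, end1), (start2, _) in zip(ordered, ordered[1:]):
        if start2 <= end1:
            raise ValueError("exons overlap")
        if start2 - end1 > 1:  # directly adjacent exons leave no intron
            introns.append((end1 + 1, start2 - 1))
    return introns


# Hypothetical three-exon gene: three exons imply two introns.
print(introns_from_exons([(1, 150), (301, 350), (501, 900)]))
```

A gene with n non-adjacent exons therefore has n - 1 introns, which is why the tau and phi architectures are described as two-exon/one-intron and three-exon/two-intron.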

2.9 Phylogenetic analysis


The evolutionary relationship of V. radiata (an angiosperm) GST proteins with

Physcomitrella patens (a bryophyte), Larix kaempferi (a gymnosperm) and Arabidopsis

thaliana (an angiosperm) GST proteins was studied using MEGA 7 (Kumar et al. 2016).

Phylogenetic analysis was performed by Neighbour-Joining method in MEGA 7.

Bootstrap analysis was performed with 1000 replicates.
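
Bootstrap support is obtained by rebuilding the tree from resampled alignments. The sketch below shows only the resampling step, drawing alignment columns with replacement; tree construction is omitted, and the code is an illustration under stated assumptions rather than a description of MEGA 7 internals. The function name and the toy alignment are hypothetical.

```python
import random


def bootstrap_columns(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Return one bootstrap replicate of an alignment.

    Columns are sampled with replacement, and the same column choices are
    applied to every sequence so that each column stays intact.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


# Toy alignment, not data from this study.
toy = {"seq1": "MKV-LS", "seq2": "MKVALS", "seq3": "MRV-LT"}
replicate = bootstrap_columns(toy, random.Random(0))
print(replicate)
```

With 1000 replicates, this step would be repeated 1000 times, and the support for a clade would be the fraction of replicate trees that contain it.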

3. Results

3.1 Chromosome locations, nomenclature, and gene lengths of GST genes identified in V.

radiata

A BLASTP search of the V. radiata genome using GST protein sequences of A. thaliana,

G. max and O. sativa led to the identification of 44 GST gene sequences (Supplementary

table 4). They were distributed in tau (19), phi (7), lambda (3), EF1G (2), zeta (2), theta

(1), TCHQD (2), mPGES2 (2), GST_N_2GST_N (2), omega (2), and DHAR (2). As

reported in other plant species, the tau and phi class GSTs were highest in number in V.

radiata as well. Out of these putative 44 GST genes, the chromosomal locations of 31

were known, the rest being assigned to the scaffolds. Only the 31 genes with confirmed

chromosomal locations were subjected to further bioinformatics analyses and

characterizations (Table 2). In case of tau class GSTs, the chromosomal locations of 12,

out of 19 identified, were known: one tau each was present on chromosomes 1, 3, 6 and

11; and four each were present on chromosomes 7 and 8 (Figures 2A and 2B). The

chromosomal locations of 4 phi GSTs, out of 7 identified, were known: two were found

to be present on chromosome 6, and one each was present on chromosomes 8 and 10. Of

the 3 lambda GSTs, one was found to be present on chromosome 2, and the other two

were assigned to the scaffolds. The 2 EF1Gs were localized to chromosomes 7 and 10.

Out of the 2 zeta GSTs identified, one was present on chromosome 7; the location of the

other zeta GST was not known. DHAR GSTs were localized to chromosomes 3 and 8.

The 2 TCHQDs were localized to chromosomes 5 and 6. The 2 omega GSTs were

localized to chromosomes 7 and 8. The 2 mPGES2 GSTs were localized to chromosomes

1 and 5. The 2 GST_N_2GST_N were localized to chromosomes 5 and 8. The only theta

GST identified was localized to chromosome 7.
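
The per-class counts given in this section can be cross-checked against the reported totals. The short sketch below tabulates the numbers stated in the text (the variable names are arbitrary) and confirms that they add up to 44 identified genes, of which 31 have known chromosomal locations.

```python
# Genes identified per class, as listed in Section 3.1.
identified = {
    "tau": 19, "phi": 7, "lambda": 3, "EF1G": 2, "zeta": 2, "theta": 1,
    "TCHQD": 2, "mPGES2": 2, "GST_N_2GST_N": 2, "omega": 2, "DHAR": 2,
}
# Genes with known chromosomal locations per class, from the same section.
located = {
    "tau": 12, "phi": 4, "lambda": 1, "EF1G": 2, "zeta": 1, "theta": 1,
    "TCHQD": 2, "mPGES2": 2, "GST_N_2GST_N": 2, "omega": 2, "DHAR": 2,
}

# Genes assigned only to scaffolds, omitting classes where there are none.
scaffold_only = {
    cls: identified[cls] - located[cls]
    for cls in identified
    if identified[cls] > located[cls]
}

print(sum(identified.values()), sum(located.values()), scaffold_only)
```

The difference of 13 genes consists of the tau, phi, lambda and zeta members that were assigned to scaffolds.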


The GSTs were designated as VrGSTs according to the proposed nomenclature for GST

genes (Dixon et al. 2002; Dixon and Edwards 2010). The genes from the tau, phi, theta,

zeta, lambda, omega, EF1G, TCHQD, mPGES2, GST_N_2GST_N and DHAR classes

were named as GSTU, GSTF, GSTT, GSTZ, GSTL, GSTO, EF1G, TCHQD, mPGES2,

GST_N_2GST_N and DHAR, respectively, followed by a gene number (Table 2). The

numbering of GST genes within each class was based on their position on each chromosome

from the beginning towards the end of the chromosome (5′→3′), and across

chromosomes from chromosome 1 to chromosome 11 (Dong et al. 2016). The gene

lengths of the identified GSTs ranged from 723 nucleotides (VrGSTU7) to 11101

nucleotides (VrGSTZ1).
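
The numbering rule described above amounts to sorting the genes of a class by chromosome and then by position. A minimal sketch follows, assuming that the digits after the "g" in a locus ID such as Vradi06g02400 increase with position along the chromosome; that ordering is an assumption made for illustration, and the function name is hypothetical. The four phi locus IDs are those reported in this study.

```python
import re


def assign_names(locus_ids: list[str], prefix: str) -> dict[str, str]:
    """Number genes of one class by chromosome, then by position on it."""

    def sort_key(locus_id: str) -> tuple[int, int]:
        match = re.fullmatch(r"Vradi(\d+)g(\d+)", locus_id)
        if match is None:
            raise ValueError(f"unexpected locus ID: {locus_id}")
        # (chromosome number, gene index assumed to follow 5'-to-3' order)
        return int(match.group(1)), int(match.group(2))

    ordered = sorted(locus_ids, key=sort_key)
    return {f"{prefix}{i}": locus for i, locus in enumerate(ordered, start=1)}


phi_loci = ["Vradi10g04530", "Vradi08g10080", "Vradi06g16320", "Vradi06g02400"]
print(assign_names(phi_loci, "VrGSTF"))
```

Under that assumption, the sketch assigns the four phi loci to VrGSTF1 through VrGSTF4 in the same order as the names used in this study.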

3.2 GSTs identified in V. radiata demonstrated domain organization typical of GST

protein family

The protein sequences of the 31 VrGSTs were downloaded from the LIS database. These

sequences ranged in length from 117 (VrGSTF2) to 442 (VrGSTT1) amino acids (Table

2). All the VrGSTUs were from ~200 to ~250 amino acids; VrGSTU10 being the only

exception with the protein length of 320 amino acids. The theta GST (VrGSTT1) was the

longest with 442 amino acids as opposed to VrGSTF1 and VrGSTF2, which were the

smallest with amino acid lengths of 178 and 117, respectively. VrEF1G1 and VrEF1G2 also

had long chain lengths with 391 and 419 amino acids, respectively. VrGST_N_2GST_N1 and

VrGST_N_2GST_N2 were 333 and 358 amino acids long, respectively. VrGSTO1 and

VrGSTO2 had chain lengths of 368 and 352 amino acids, respectively.

The protein domain organization of the VrGSTs was further identified and confirmed by

NCBI batch-CD search (Supplementary table 5). The analysis revealed that all the 31 GSTs

had typical GST class-specific domain organization having a small thioredoxin-like N-

terminal domain that binds to glutathione (GST-N), and a variable GST-C domain that binds

to hydrophobic or electrophilic substrates (Supplementary figure 1). The analysis

demonstrated the presence of G-site, H-site, N-terminal and C-terminal domain interfaces, etc.,

integral to GST protein function. Most of the GSTs showed the presence of

typical N- and C-terminal domains. However, Vradi06g02400_Phi (VrGSTF1) and

Vradi06g16320_Phi (VrGSTF2) showed only the C-terminal and N-terminal domain,

respectively. In case of Vradi03g07940_DHAR (VrDHAR1), GST_C domain was


identified. In Vradi05g06210_TCHQD (VrTCHQD1) and Vradi06g11490_TCHQD

(VrTCHQD2), GstA family and GST_C domains were identified. Distinct GST_N and

GST_C_mPGES2 domains were identified in Vradi01g05130_mPGES2 (VrmPGES2A)

and Vradi05g13980_mPGES2 (VrmPGES2B). Distinct GST_N and GST_C terminal

domains were identified in VrGST_N_2GST_N1 and VrGST_N_2GST_N2.

Vradi07g04760_Omega-like (VrGSTO1), and Vradi08g07540_Omega-like (VrGSTO2)

revealed the presence of the ECM4 domain, indicating that these two GSTs might belong to

the glutathionyl-hydroquinone reductase (GS-HQR) group of GSTs (Supplementary

figure 1 and Figures 3A and 3B).

3.3 pI value and molecular weight predictions of GSTs identified in V. radiata

The predicted molecular weights of the VrGST proteins ranged from 13.12 (VrGSTF2) to

50 kDa (VrGSTT1) (Table 2). The theoretical pI values ranged from 5.10 (VrGSTL1) to

9.40 (VrTCHQD1). The pI value of the VrGSTUs ranged from 5.23 (VrGSTU1) to 8.38

(VrGSTU5). In case of VrGSTFs, VrGSTF2 had a pI value of 9.18 as against pI values

between ~5 and ~6 in the other members. The pI values of all the omega, TCHQD,

mPGES2 and GST_N_2GST_N classes were in the higher ranges between 8 and 9.4.

3.4 Subcellular localization analysis of V. radiata GST proteins revealed that most of the

VrGSTs were cytoplasmic

The subcellular localization of the 31 GSTs was predicted using 3 different prediction

tools. In most of the VrGST proteins, the results were the same with all the 3 tools.

However, for some VrGSTs different cellular locations were predicted by different tools

(Table 3). According to CELLO, all the VrGSTs were predicted to be cytoplasmic,

except VrGSTT1 and VrGSTZ1, which were localized to mitochondria. According to

WoLF PSORT, VrGSTU4, VrGSTU6, VrGSTU9, VrGSTT1 and VrEF1G2 were

predicted to be localized to chloroplast; VrGSTU5 was nuclear; VrGSTF1 was

mitochondrial; and VrGSTF2 was vacuolar. According to TargetP, VrGSTU7,

VrGSTU9, VrGSTU11, VrGSTU12 and VrEF1G2 were predicted to be secretory GSTs.

According to CELLO, VrGSTO1 was predicted to be mitochondrial, and VrGSTO2 as

extracellular. However, according to WoLF PSORT and TargetP, VrGSTO2 was


predicted to be chloroplastic. Both the mPGES2 GSTs (Vradi01g05130_mPGES2 and

Vradi05g13980_mPGES2) were predicted to be either mitochondrial or chloroplastic.

Both the GST_N_2GST_N (Vradi05g22310_GST_N_2GST_N and

Vradi08g05820_GST_N_2GST_N) were predicted to be chloroplastic.

3.5 V. radiata GST genes followed two-exon/one-intron and three-exon/two-intron

architecture in tau GSTs and phi GSTs, respectively

The exon-intron organization in the VrGST genes was determined using the coding

sequences and the corresponding genomic DNA sequences. The gene structure showed

group-specific exon/intron patterns. The number of exons ranged from 2 in VrGSTUs to

17 in VrGSTZ1 (Vradi07g26200) (Figure 4). All tau GSTs of V. radiata possessed two

exons and one intron except VrGSTU6 (Vradi07g05380), VrGSTU9 (Vradi08g08500)

and VrGSTU10 (Vradi08g15620), which contained four exons and three introns each.

Amongst the four identified phi genes, one phi GST, VrGSTF1 (Vradi06g02400),

possessed three exons and two introns, and the remaining three, VrGSTF2 (Vradi06g16320),

VrGSTF3 (Vradi08g10080) and VrGSTF4 (Vradi10g04530), contained two, four and two

exons, respectively. VrGSTT1 (Vradi07g30490) and VrGSTL1 (Vradi03g03170) were

found to contain 13 and 11 exons, respectively. Both the VrEF1G contained 7 exons

each. Both the VrDHARs (Vradi03g07940 and Vradi08g22680) contained 6 exons each.

The VrGST_N_2GST_N1 (Vradi05g22310) and VrGST_N_2GST_N2 (Vradi08g05820)

contained 10 and 12 exons, respectively. Both the VrmPGES2 (Vradi01g05130 and

Vradi05g13980) contained 6 exons each. VrGSTO1 (Vradi07g04760) and VrGSTO2

(Vradi08g07540) contained 6 and 3 exons, respectively.

3.6 VrGST protein active sites are comprised of conserved serine or cysteine residues

The multiple sequence alignment of full-length VrGST protein sequences with GST

protein sequences of A. thaliana, rice and soybean revealed a highly conserved N-terminus

with an active site serine (Ser; S) or cysteine (Cys; C) residue for the activation of GSH

binding and GST catalytic activity. The tau, phi, theta and zeta VrGSTs had active site

Ser residues (Figure 5A, 5B, 5C and 5D), while DHAR and lambda VrGSTs revealed the

presence of active site Cys residues (Figure 5E and 5F). The positions of active site Ser


and Cys were different among different GST classes (Table 4). Tau VrGSTs possessed

Ser at position 10-20; phi VrGSTs had conserved Ser residue at position 60-70; theta and

zeta VrGSTs contained active site Ser at positions 14 and 20, respectively. DHAR and

lambda VrGSTs possessed active site Cys residue at locations 20 and 36, respectively. In

case of EF1G VrGSTs, there were two probable conserved active site tyrosine (Tyr; Y)

residues (Figure 5G and 5H). In case of TCHQD, the potential active site Ser residue was

at around position 30 (Figure 5I). In omega GSTs, the active site Cys was found to be

present at around positions 30-40 within the ACPWA amino acid sequence in case of

Vradi07g04760 (VrGSTO1); however, the active site Cys in Vradi08g07540 (VrGSTO2)

could not be identified around 30-40. Nonetheless, a Cys was found to be present at

around position 120 within A-C-P-W-A amino acid sequence (Figure 5J).
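
Locating a candidate active-site Cys inside a short sequence motif such as ACPWA is a plain substring search. The sketch below reports the 1-based position of the Cys for every occurrence of the motif. The function name and the toy sequence are hypothetical, and the sequence is not a VrGST sequence.

```python
def motif_cys_positions(sequence: str, motif: str = "ACPWA") -> list[int]:
    """Return 1-based positions of the Cys within each occurrence of motif."""
    if "C" not in motif:
        raise ValueError("motif must contain a Cys (C)")
    cys_offset = motif.index("C")
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + cys_offset + 1)  # convert to 1-based
        start = sequence.find(motif, start + 1)
    return positions


# Toy sequence: 30 leading residues, then the motif.
toy_sequence = "M" * 30 + "ACPWA" + "K" * 10
print(motif_cys_positions(toy_sequence))
```

A search of this kind makes it easy to see whether the motif lies near the N-terminus, as in VrGSTO1, or further downstream, as in VrGSTO2.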

3.7 MEME analysis of VrGST proteins revealed the presence of class-specific motifs

MEME analysis was performed to identify conserved motifs in VrGST protein

sequences. Of the motifs identified, a few were class-specific and others were found in

almost all classes (Table 5). Motifs 1, 4 and 6 were present only in VrGSTUs. Motif 4 had

an extremely conserved A-R-F-W sequence. Motifs 8 and 10 were present only in EF1G

VrGSTs. Motif 7 was present in EF1G and mPGES2 VrGSTs. Motif 9 was present in

EF1G and GST_N_2GST_N VrGSTs. Motifs 2 and 5 were present in various GST

classes. Motif 3 was present in all the VrGSTs irrespective of the class.

3.8 VrGST proteins clustered with the GST proteins of other plant species in a class-

specific manner

An extensive phylogenetic analysis of VrGST proteins was carried out by comparing

them with GST protein sequences from plant species as diverse as Physcomitrella patens

(a bryophyte), Larix kaempferi (a gymnosperm) and A. thaliana (an angiosperm). All the

tau VrGSTUs were found to be closely associated with A. thaliana. Similarly, phi

VrGSTFs were found to be closely associated with A. thaliana. mPGES2, omega and

GST_N_2GST_N VrGSTs branched out separately from the rest of the VrGSTs. The

analysis clearly revealed that VrGSTs of a particular class clustered with the GSTs of

their own class. As an exception, Vradi06g16320_phi clustered with zeta GSTs.


4. Discussion

The current study performed the identification and detailed characterization of V. radiata

GSTs using in silico approaches. Similar studies have been carried out in a wide variety

of plant species, and a large number of GST genes have been identified in them. For example, 101

GSTs were identified in the genome of G. max through TBLASTN (Liu et al. 2015), and 61

GST transcripts were reported in Citrus sinensis (Licciardello et al. 2014). In 2013, a

genome-wide analysis of P. patens revealed the presence of 37 GSTs, where GST family

was reported for the first time from a nonvascular representative of early land plants (Liu

et al. 2013). The 37 P. patens GSTs were divided into 10 classes, including two new

classes (hemerythrin and iota). In V. radiata a total of 44 GST genes were identified out

of which tau and phi were highest in number, 19 and 7 respectively. Interestingly, HMM

analysis resulted in the identification of TCHQD, omega, mPGES2 and

GST_N_2GST_N VrGSTs. Only one putative hemerythrin GST was found in V. radiata

when P. patens hemerythrin GSTs were used as query sequences (data not shown).

VrGSTs, like most of the plant GSTs, were primarily localized in the cytoplasm;

however, VrGSTU5, VrGSTF1, VrGSTO1, VrGSTT1 and VrGSTZ1 were predicted to

be localized to mitochondria, which suggests that these VrGSTs might be involved in

functions different from cytoplasmic GSTs. These VrGSTs might be involved in

maintaining GSH:GSSG ratios in mitochondria since high glutathione concentrations

have been observed in mitochondria (Zechmann et al. 2008). GFP-GST fusion studies,

similar to those performed in A. thaliana, may be performed for these VrGSTs to confirm

their subcellular localization (Dixon et al. 2009). mPGES2 and GST_N_2GST_N

VrGSTs were predicted to be chloroplastic. Chloroplast-localized GSTs have been

identified in other plant species. An auxin-inducible chloroplast-localized GST has been

identified from the phreatophyte Prosopis juliflora (George et al. 2010); similarly, a

chloroplast-localized GST from Puccinellia tenuiflora seedling leaves has been associated with resistance to

Na2CO3 stress (Sun et al. 2012). It would be interesting to identify targeting sequences in

these organelle-localized VrGSTs so as to confirm their subcellular localization.

Plant GSTs typically have relative molecular masses of around 50 kDa and are homodimers or

heterodimers (with subunits encoded by different genes of the same class) composed of two similarly


sized (~25 kDa) subunits with an isoelectric point in the pH range of 4 to 5 (Frova

2006). The average molecular mass of the tau and phi VrGSTs was 26.8 kDa, within the

range of earlier reported data. EF1G and theta VrGSTs had highest molecular masses of

46.1 and 50 kDa, respectively. TCHQD, omega, mPGES2, and GST_N_2GST_N

VrGSTs had consistently high pI values ranging from 8 to 9.

A one-intron/two-exon structure is normally found in plant-specific tau GSTs,

and a two-intron/three-exon structure characterizes phi class GSTs in higher plants (Labrou

et al. 2015). VrGSTs generally followed the same gene architecture. With the exception of

VrGSTU6 (Vradi07g05380_Tau), VrGSTU9 (Vradi08g08500_Tau) and VrGSTU10

(Vradi08g15620_Tau), all the tau VrGSTs had one-intron/two-exon structure. Similar

exceptions have been reported in other plant species as well. Liu et al. (2015) reported

three exons in GSTU54, a deviation from the typical one-intron/two-exon structure for tau

GSTs. Zeta-class GST genes possess 10 exons (Basantani and Srivastava 2007). However, in the

current study, 17 exons were identified in VrGSTZ1 (Vradi07g26200).

Plant GST protein structure studies have clearly demonstrated that tau, phi, theta, zeta

GSTs contain Ser active site residue involved in GSH binding and activation, whereas

DHAR and lambda are Cys containing GSTs that facilitate deglutathionylation reaction.

EF1G are predicted to contain Ser or Tyr residue. For active site Ser/Cys/Tyr residue

position identification in VrGSTs, amino acid sequence alignments of VrGST proteins

were performed in Clustal Omega and visualized using ESPript. Tau and phi

VrGST active site serines were located at positions 10 to 20 and 60 to 70, respectively.

The zeta-class GSTs from a range of species contain a characteristic motif,

SSCX(W/H)RVIAL, in the N-terminal region (Board et al. 1997). No such motif was

found in the current analysis. MEME analysis suggested that motifs found in multiple

classes of GSTs may be performing similar functions.

The phylogenetic analysis revealed that tau class VrGSTs were more closely related to A.

thaliana GSTs as compared to L. kaempferi. L. kaempferi tau GSTs branched separately

from VrGSTs and A. thaliana GSTs. It was the same for VrGSTs of other classes. Phi

VrGSTs were also more closely related to A. thaliana.

Using a combined computational strategy, the current study identified 44 VrGSTs in the

V. radiata genome and characterized them based on their sub-cellular localization,


protein domains and active sites, gene structure, motif analysis and phylogeny. The

results of the current study can potentially be used for the cloning and characterization of

VrGSTs and detailed investigation of V. radiata GST gene family. The role of GSTs in V.

radiata growth and metabolism, biotic and abiotic stress tolerance, etc., can be elucidated

in further studies and the candidate genes identified can be used for making stress-

tolerant plants with enhanced productivity.

Acknowledgement

This research did not receive any specific grant from funding agencies in the public,

commercial, or not-for-profit sectors.


References

Baloda, A., Madanpotra, S., and Aiwal, P.K.J. 2017. Transformation of mung bean plants

for abiotic stress tolerance by introducing codA gene, for an osmoprotectant

glycine betaine. Journal of Plant Stress Physiology, 3: 5-11. doi:

http://dx.doi.org/10.19071/jpsp.2017.v3.3148

Bailey, T.L., Bodén, M., Buske, F.A., Frith, M., Grant, C.E., and Clementi, L., et al.

2009. MEME SUITE: tools for motif discovery and searching. Nucleic Acids

Research, 37(Web Server Issue): 202-208. doi:

https://doi.org/10.1093/nar/gkp335

Basantani, M., and Srivastava, A. 2007. Plant glutathione transferases: a decade falls

short. Canadian Journal of Botany, 85(5): 443-456. doi:

https://doi.org/10.1139/B07-033

Board, P.G., Baker, R.T., Chelvanayagam, G., and Jermiin, L.S. 1997. Zeta, a novel class

of glutathione transferases in a range of species from plants to humans.

Biochemical Journal, 328(Pt 3): 929-935. doi: https://doi.org/10.1042/bj3280929

Chitra, U., Vimala, V., Singh, U., and Geervani, P. 1995. Variability in phytic acid

content and protein digestibility of grain legumes. Plant Foods for Human

Nutrition, 47(2): 163-172. doi: https://doi.org/10.1007/BF01089266

Dixon, D.P., Lapthorn, A., and Edwards, R. 2002. Plant glutathione transferases. Genome

Biology, 3: 3004.1. doi: https://doi.org/10.1186/gb-2002-3-3-reviews3004

Dixon, D.P., Hawkins, T., Hussey, P.J., and Edwards, R. 2009. Enzyme activities and

subcellular localization of members of the Arabidopsis glutathione transferase

superfamily. Journal of Experimental Botany, 60(4): 1207-1218. doi:

https://doi.org/10.1093/jxb/ern365

Dixon, D.P., and Edwards, R. 2010. Glutathione transferases. The Arabidopsis Book.

American Society of Plant Biologists.

Dong, Y., Li, C., Zhang, Y., He, Q., Daud, M.K., and Chen, J., et al. 2016. Glutathione S-

transferase gene family in Gossypium raimondii and G. arboreum: Comparative

genomic study and their expression under salt stress. Frontiers in Plant Science, 7:

139. doi: https://doi.org/10.3389/fpls.2016.00139

Emanuelsson, O., Nielsen, H., Brunak, S., and von Heijne, G. 2000. Predicting subcellular

localization of proteins based on their N-terminal amino acid sequence.

Journal of Molecular Biology, 300(4): 1005-1016. doi:

https://doi.org/10.1006/jmbi.2000.3903

Frova, C. 2006. Glutathione transferases in the genomics era: New insights and

perspectives. Biomolecular Engineering, 23(4): 149-169. doi:

https://doi.org/10.1016/j.bioeng.2006.05.020

Gasteiger, E., Hoogland, C., Gattiker, A., Duvaud, S., Wilkins, M.R., and Appel, R.D., et

al. 2005. Protein Identification and Analysis Tools on the ExPASy Server. The

Proteomics Protocols Handbook, pp. 571-607.

George, S., Venkataraman, G., and Parida, A. 2010. A chloroplast-localized and auxin-

induced glutathione S-transferase from phreatophyte Prosopis juliflora confer

drought tolerance on tobacco. Journal of Plant Physiology, 167(4): 311-318. doi:

https://doi.org/10.1016/j.jplph.2009.09.004

GoI: Department of Agriculture and Cooperation (2014-15)

Graham, P.H., and Vance, C.P. 2003. Legumes: importance and constraints to greater

use. Plant Physiology, 131(3): 872-877. doi: http://dx.doi.org/10.1104/pp.017004

Horton, P., Park, K., Obayashi, T., Fujita, N., Harada, H., and Adams-Collier, C.J., et al.

2007. WoLF PSORT: protein localization predictor. Nucleic Acids Research,

35(Web Server Issue): W585-W587. doi: https://doi.org/10.1093/nar/gkm259

Hu, B., Jin, J., Guo, A-Y., Zhang, H., Luo, J., and Gao, G. 2015. GSDS 2.0: an upgraded

gene feature visualization server. Bioinformatics, 31(8): 1296-1297. doi:

https://doi.org/10.1093/bioinformatics/btu817

Jain, M., Ghanashyam, C., and Bhattacharjee, A. 2010. Comprehensive expression

analysis suggests overlapping and specific roles of glutathione S-transferases

during development and stress responses in rice. BMC Genomics, 11: 73.

doi: https://doi.org/10.1186/1471-2164-11-73

Kang, Y.J., Kim, S., Kim, M.Y., Lestari, P., Kim, K.H., and Ha, B-K., et al. 2014. Genome

sequence of mungbean and insights into evolution within Vigna species. Nature

Communications, 5: 5443. doi: https://doi.org/10.1038/ncomms6443

Keatinge, J.D.H., Easdown, W.J., Yang, R.Y., Chadha, M.L., and Shanmugasundaram, S.

2011. Overcoming chronic malnutrition in a future warming world: the key

importance of mungbean and vegetable soybean. Euphytica, 180(1): 129-141. doi:

https://doi.org/10.1007/s10681-011-0401-6

Kumar, S., Stecher, G., and Tamura, K. 2016. MEGA7: Molecular Evolutionary Genetics

Analysis Version 7.0 for Bigger Datasets. Molecular Biology and Evolution,

33(7): 1870-1874. doi: https://doi.org/10.1093/molbev/msw054

Labrou, N.E., Papageorgiou, A.C., Pavli, O., and Flemetakis, E. 2015. Plant GSTome:

structure and functional role in xenome network and plant stress response. Current

Opinion in Biotechnology, 32: 186-194. doi:

http://dx.doi.org/10.1016/j.copbio.2014.12.024

Lallement, P.A., Brouwer, B., Keech, O., Hecker, A., and Rouhier, N. 2014. The still

mysterious roles of cysteine-containing glutathione transferases in plants.

Frontiers in Pharmacology, 5: 192. doi: https://doi.org/10.3389/fphar.2014.00192

Lan, T., Yang, Z.L., Yang, X., Liu, Y.J., Wang, X.R., and Zeng, Q.Y. 2009. Extensive

functional diversification of the Populus glutathione S-transferase supergene

family. Plant Cell, 21(12): 3749-3766. doi:

http://dx.doi.org/10.1105/tpc.109.070219

Licciardello, C., D’Agostino, N., Traini, A., Recupero, G.R., Frusciante, L., and

Chiusano, M.L. 2014. Characterization of the glutathione S-transferase gene

family through ESTs and expression analyses within common and pigmented

cultivars of Citrus sinensis (L.) Osbeck. BMC Plant Biology, 14: 39.

doi: https://doi.org/10.1186/1471-2229-14-39

Liu, H-J., Tang, Z-X., Han, X-M., Yang, Z-L., Zhang, F-M., and Yang, H-L., et al. 2015.

Divergence in enzymatic activities in the soybean GST supergene family provides

new insight into the evolutionary dynamics of whole-genome duplicates.

Molecular Biology and Evolution, 32(11): 2844-2859. doi:

https://doi.org/10.1093/molbev/msv156

Liu, W., Xie, Y., Ma, J., Luo, X., Nie, P., and Zuo, Z., et al. 2015. IBS: an illustrator for

the presentation and visualization of biological sequences. Bioinformatics,

31(20): 3359-3361. doi: https://doi.org/10.1093/bioinformatics/btv362

Liu, Y-J., Han, X-M., Ren, L-L., Yang, H-L., and Zeng, Q-Y. 2013. Functional

divergence of the glutathione S-transferase supergene family in Physcomitrella

patens reveals complex patterns of large gene family evolution in land plants.

Plant Physiology, 161(2): 773-786. doi: http://dx.doi.org/10.1104/pp.112.205815

Marchler-Bauer, A., Bo, Y., Han, L., He, J., Lanczycki, C.J., and Lu, S., et al. 2017.

CDD/SPARCLE: functional classification of proteins via subfamily domain

architectures. Nucleic Acids Research, 45(Database Issue): D200-D203. doi:

https://doi.org/10.1093/nar/gkw1129

McGonigle, B., Keeler, S.J., Lau, S-M.C., Koeppe, M.K., and O’Keefe, D.P. 2000. A

genomics approach to the comprehensive analysis of the glutathione S-transferase

gene family in soybean and maize. Plant Physiology, 124(3): 1105-1120. doi:

http://dx.doi.org/10.1104/pp.124.3.1105

Nair, R.M., Schafleitner, R., Kenyon, L., Srinivasan, R., Easdown, W., and Ebert, A.W.,

et al. 2012. Genetic improvement of mungbean. SABRAO Journal of Breeding

and Genetics, 44(2): 177-190.

Neuefeind, T., Reinemer, P., and Bieseler, B. 1997. Plant glutathione S-transferases and

herbicide detoxification. Biological Chemistry, 378(3-4): 199-205.

Rezaei, M.K., Shobbar, Z.-S., Shahbazi, M., Abedini, R., and Zare, S. 2013. Glutathione

S-transferase (GST) family in barley: Identification of members, enzyme activity,

and gene expression pattern. Journal of Plant Physiology, 170(14): 1277-1284.

doi: http://dx.doi.org/10.1016/j.jplph.2013.04.005

Robert, X., and Gouet, P. 2014. Deciphering key features in protein structures with the

new ENDscript server. Nucleic Acids Research, 42(Web Server Issue): W320-W324.

doi: https://doi.org/10.1093/nar/gku316

Sandhu, K.S., and Lim, S-T. 2008. Digestibility of legume starches as influenced by their

physical and structural properties. Carbohydrate Polymers, 71(2): 245-252. doi:

https://doi.org/10.1016/j.carbpol.2007.05.036

Schafleitner, R., Nair, R.M., Rathore, A., Wang, Y-W., Lin, C-Y., and Chu, S-H., et al.

2015. The AVRDC – The World Vegetable Center mungbean (Vigna radiata)

core and mini core collections. BMC Genomics, 16: 344.

doi: http://dx.doi.org/10.1186/s12864-015-1556-7

Sheehan, D., Meade, G., Foley, V.M., and Dowd, C.A. 2001. Structure, function and

evolution of glutathione transferases: implications for classification of non-

mammalian members of an ancient enzyme superfamily. Biochemical Journal,

360(Pt 1): 1-16. doi: https://doi.org/10.1042/bj3600001

Sievers, F., Wilm, A., Dineen, D.G., Gibson, T.J., Karplus, K., and Li, W., et al. 2011.

Fast, scalable generation of high-quality protein multiple sequence alignments

using Clustal Omega. Molecular Systems Biology, 7: 539.

doi: https://doi.org/10.1038/msb.2011.75

Soranzo, N., Gorla, M.S., Mizzi, L., Toma, G.D., and Frova, C. 2004. Organisation and

structural evolution of the rice glutathione S-transferase gene family. Molecular

Genetics and Genomics, 271(5): 511-521. doi:

https://doi.org/10.1007/s00438-004-1006-8

Sun, G.R., Wu, X.L., Chen, G., Wang, J.B., Cao, W.Z., and Du, Q., et al. 2012. The

function of chloroplast GST of Puccinellia tenuiflora seedling leaves in resistance

to Na2CO3 stress. Advanced Materials Research, 343-344: 712-720.

Takahashi, Y., Hasezawa, S., Kusaba, M., and Nagata, T. 1995. Expression of the auxin-

regulated parA gene in transgenic tobacco and nuclear localization of its gene

product. Planta, 196(1): 111-117. doi: https://doi.org/10.1007/BF00193224

Vijayakumar, H., Thamilarasan, S.K., Shanmugam, A., Natarajan, S.K., Jung, H-J., and

Park, J-I., et al. 2016. Glutathione transferases superfamily: Cold-inducible

expression of distinct GST genes in Brassica oleracea. International Journal of

Molecular Sciences, 17(8): 1211. doi: http://dx.doi.org/10.3390/ijms17081211

Yaqub, M., Mahmood, T., Akhtar, M., Iqbal, M.M., and Ali, S. 2010. Induction of

mungbean [Vigna radiata (L.) Wilczek] as a grain legume in the annual rice-

wheat double cropping system. Pakistan Journal of Botany, 42(5): 3125-3135.

Yu, C-S., Chen, Y-C., Lu, C-H., and Hwang, J-K. 2006. Prediction of protein subcellular

localization. Proteins, 64(3): 643-651. doi: http://dx.doi.org/10.1002/prot.21018

Zechmann, B., Mauch, F., Sticher, L., and Müller, M. 2008. Subcellular

immunocytochemical analysis detects the highest concentrations of glutathione in

mitochondria and not in plastids. Journal of Experimental Botany, 59(14): 4017-4027.

doi: https://doi.org/10.1093/jxb/ern243

Zhu, J-H., Li, H-L., Guo, D., Wang, Y., Dai, H-F., and Mei, W-L., et al. 2016.

Transcriptome-wide identification and expression analysis of glutathione S-

transferase genes involved in flavonoids accumulation in Dracaena cambodiana.

Plant Physiology and Biochemistry, 104: 304-311. doi:

https://doi.org/10.1016/j.plaphy.2016.05.012

Legends

Figure 1. State-wise distribution of mung bean production in India. Rajasthan (26%),

Maharashtra (20%) and Andhra Pradesh (10%) are the largest producers, followed by other

states (GoI: Department of Agriculture and Cooperation 2014-15).

Figure 2. Genomic distribution of VrGSTs on chromosomes. (A) Chromosomal locations

are indicated based on V. radiata genome database on LIS. (B) Summary of the number

of GSTs present on each chromosome. The VrGST genes have been highlighted. Each

color represents one GST class.

Figure 3. Protein domain organization of VrGSTs. (A) Domain organization showing

active site amino acid residues on the representative GST protein from each class. (B)

Diagrammatic representation of the protein domains of VrGSTs, created using the Illustrator for Biological Sequences (IBS) tool.

Figure 4. Gene architecture of VrGSTs. The exon-intron structures of VrGST genes were

determined by comparing the coding sequences and the corresponding genomic DNA

sequences using the Gene Structure Display Server (GSDS). The blue rounded rectangles

indicate exons and the black lines indicate introns.

Figure 5. Protein active sites in VrGSTs. VrGST protein active site residues were

predicted by multiple sequence alignments of V. radiata, A. thaliana, rice and soybean

GST protein sequences. The asterisks indicate the active site serine in tau, phi, theta and

zeta VrGSTs (A to D); the active site cysteine in DHAR and lambda (E and F); the two

probable active site tyrosine residues in EF1G (G and H); and the potential active site

cysteine residue in omega VrGSTs (J).

Figure 6. Phylogenetic analysis of VrGST proteins. Different GST classes and their

branches are shown in different colors.
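
The exon-intron comparison described in the Figure 4 legend can be illustrated with a minimal Python sketch that derives intron intervals as the gaps between consecutive exons on the genomic sequence. This is not the GSDS implementation. The function name introns_from_exons and the exon coordinates are hypothetical and are not taken from any VrGST gene.

```python
from typing import List, Tuple

# An interval is (start, end), 1-based and inclusive, on the genomic sequence.
Interval = Tuple[int, int]


def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Return the gaps between consecutive exons as intron intervals."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # Only report a gap when the exons are not adjacent or overlapping.
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Hypothetical exon coordinates (illustration only, not data from the study).
example_exons = [(1, 150), (401, 520), (901, 1200)]
print(introns_from_exons(example_exons))
```

A gene with a single exon yields an empty list, which corresponds to an intronless gene in a structure diagram.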

Legends to supplementary material

gen-2017-0192.R1Suppla

Supplementary table 1. Details of GST gene family of O. sativa. [Soranzo et al. (2004);

Jain et al. (2010)]

Supplementary table 2. Details of GST gene family of A. thaliana (Source: TAIR)

Supplementary table 3. Details of GST gene family of G. max (Liu et al. 2015)

Supplementary table 4. Details of GSTs identified in V. radiata. 44 GSTs distributed in

tau, phi, theta, zeta, lambda, DHAR, EF1G, TCHQD, omega, mPGES2, and

GST_N_2GST_N classes were identified in V. radiata.

Supplementary table 5. NCBI batch-CD search results for 31 GST proteins to identify

GST family protein domain organization

gen-2017-0192.R1Supplb

Supplementary Figure 1. NCBI batch-CD search results of VrGST proteins. The image

shows the concise results of all the 31 VrGST proteins demonstrating the N- and C-

terminal domains.

Table 1. List of sequence databases, bioinformatics software and applications used in the present study

S. No.  Bioinformatics tools                         Application
1.      Legume Information System (LIS)              pBLAST search; retrieval of GST amino acid sequences of Glycine max through the locus ID/accession no.
2.      The Arabidopsis Information Resource (TAIR)  Retrieval of all GST class amino acid sequences of Arabidopsis
3.      Rice Annotation Project (RAP)                Retrieval of all GST class amino acid sequences of rice through locus IDs
4.      NCBI Database                                Retrieval of some amino acid sequences of soybean
5.      NCBI Batch CD Search                         Identification of conserved GST protein domains
6.      ExPASy ProtParam                             Computation of various chemical and physical parameters of GST proteins
7.      CELLO, TargetP, WoLF PSORT                   Subcellular localization
8.      Clustal Omega                                Protein sequence alignments
9.      MEME Suite                                   Identification of conserved motifs
10.     Illustrator tool                             Diagrammatic representation of GST protein domains
11.     Gene Structure Display Server (GSDS)         Gene structure with well-defined introns and exons
12.     MEGA v. 7                                    Evolutionary analysis and alignment
13.     ESPript 3                                    Formatting of multiple sequence alignments

Table 2. Details of GSTs identified in V. radiata. 31 GSTs distributed in tau, phi, theta, zeta, lambda, DHAR, EF1G, TCHQD, omega, mPGES2, and GST_N_2GST_N classes were further characterized. These GST genes were designated as VrGSTs

S. No.  Gene name        Accession no.               Chromosome no.  Gene length (nucleotides)  Protein length (aa)  pI    Mol wt (kDa)
1       VrGSTU1          Vradi01g12660_Tau           Vr01            1332                       219                  5.23  25.58

13      VrGSTF1          Vradi06g02400_Phi           Vr06            1130                       178                  6.11  20.55

17      VrGSTT1          Vradi07g30490_Theta         Vr07            6007                       442                  9.29  50.03
18      VrGSTZ1          Vradi07g26200_Zeta          Vr07            11101                      395                  5.85  44.67
19      VrGSTL1          Vradi03g03170_Lambda        Vr03            4654                       330                  5.1   37.74
20      VrDHAR1          Vradi03g07940_DHAR          Vr03            5393                       235                  9.27  26.25
21      VrDHAR2          Vradi08g22680_DHAR          Vr08            3247                       213                  5.98  23.42
22      VrEF1G1          Vradi07g27390_EF1G          Vr07            3138                       391                  6.21  44.28
23      VrEF1G2          Vradi10g13450_EF1G          Vr10            2377                       419                  5.57  47.93
24      VrTCHQD1         Vradi05g06210_TCHQD         Vr05            2659                       267                  9.4   31.6
25      VrTCHQD2         Vradi06g11490_TCHQD         Vr06            2670                       267                  9.23  31.71
26      VrGSTO1          Vradi07g04760_Omega-like    Vr07            5427                       368                  6.51  42.1
27      VrGSTO2          Vradi08g07540_Omega-like    Vr08            2189                       352                  9.05  38.58
28      VrmPGES2A        Vradi01g05130_mPGES2        Vr01            3956                       311                  8.93  35.09
29      VrmPGES2B        Vradi05g13980_mPGES2        Vr05            3080                       293                  9.00  33.26
30      VrGST_N_2GST_N1  Vradi05g22310_GST_N_2GST_N  Vr05            3685                       333                  8.4   37.28
31      VrGST_N_2GST_N2  Vradi08g05820_GST_N_2GST_N  Vr08            5930                       358                  9.01  39.82

Table 3. Subcellular localization of GSTs identified in V. radiata. 31 GSTs were analyzed for their cellular location using CELLO, WoLF PSORT, and TargetP subcellular localization prediction tools

S. No.  Gene name        Accession No                CELLO            WoLF PSORT    TargetP
1       VrGSTU1          Vradi01g12660_Tau           Cytoplasm        Cytoplasm

4       VrGSTU4          Vradi07g04910_Tau           Cytoplasm        Chloroplast
5       VrGSTU5          Vradi07g05370_Tau           Cytoplasm        Nucleus       Mitochondria
6       VrGSTU6          Vradi07g05380_Tau           Cytoplasm        Chloroplast
7       VrGSTU7          Vradi07g23980_Tau           Cytoplasm        Cytoplasm     Secretory

9       VrGSTU9          Vradi08g08500_Tau           Cytoplasm        Chloroplast   Secretory

13      VrGSTF1          Vradi06g02400_Phi           Cytoplasm        Mitochondria
14      VrGSTF2          Vradi06g16320_Phi           Cytoplasm        Vacuole
15      VrGSTF3          Vradi08g10080_Phi           Cytoplasm        Cytoplasm
16      VrGSTF4          Vradi10g04530_Phi           Cytoplasm        Cytoplasm
17      VrGSTT1          Vradi07g30490_Theta         Mitochondria     Chloroplast
18      VrGSTZ1          Vradi07g26200_Zeta          Mitochondria     Cytoplasm
19      VrGSTL1          Vradi03g03170_Lambda        Cytoplasm        Cytoplasm
20      VrDHAR1          Vradi03g07940_DHAR          Chloroplast      Chloroplast   Chloroplast
21      VrDHAR2          Vradi08g22680_DHAR          Cytoplasm        Cytoplasm
22      VrEF1G1          Vradi07g27390_EF1G          Cytoplasm        Cytoplasm
23      VrEF1G2          Vradi10g13450_EF1G          Cytoplasm        Chloroplast   Secretory
24      VrTCHQD1         Vradi05g06210_TCHQD         Mitochondria     Cytoplasm
25      VrTCHQD2         Vradi06g11490_TCHQD         Cytoplasm        Cytoplasm
26      VrGSTO1          Vradi07g04760_Omega-like    Mitochondria     Cytoplasm
27      VrGSTO2          Vradi08g07540_Omega-like    Extracellular    Chloroplast   Chloroplast
28      VrmPGES2A        Vradi01g05130_mPGES2        Chloroplast      Mitochondria  Mitochondria
29      VrmPGES2B        Vradi05g13980_mPGES2        Mitochondria     Chloroplast   Mitochondria
30      VrGST_N_2GST_N1  Vradi05g22310_GST_N_2GST_N  Plasma membrane  Chloroplast   Chloroplast
31      VrGST_N_2GST_N2  Vradi08g05820_GST_N_2GST_N  Chloroplast      Chloroplast   Chloroplast
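
The three predictors in Table 3 do not always agree. The manuscript does not state how conflicting predictions were reconciled, so the Python sketch below shows only one possible approach, a simple majority vote, under that assumption. The function name consensus_location is hypothetical; the example calls are rows copied from Table 3 in the order CELLO, WoLF PSORT, TargetP.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_location(calls: Sequence[str]) -> Optional[str]:
    """Return the location named by a strict majority of tools, else None.

    Majority voting is an assumption made for illustration; it is not
    described as the method used in the study.
    """
    counts = Counter(call for call in calls if call)  # ignore empty predictions
    if not counts:
        return None
    location, votes = counts.most_common(1)[0]
    return location if votes * 2 > len(calls) else None


# Rows taken from Table 3 (CELLO, WoLF PSORT, TargetP).
table3_rows = {
    "VrDHAR1": ["Chloroplast", "Chloroplast", "Chloroplast"],
    "VrGSTO2": ["Extracellular", "Chloroplast", "Chloroplast"],
    "VrmPGES2A": ["Chloroplast", "Mitochondria", "Mitochondria"],
    "VrGSTU5": ["Cytoplasm", "Nucleus", "Mitochondria"],
}

for gene, calls in table3_rows.items():
    print(gene, consensus_location(calls))
```

When all three tools disagree, as for VrGSTU5, the function returns None rather than picking a location, which flags the protein for closer inspection.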

Table 4. Predicted amino acid position of active site serine or cysteine residues in VrGSTs

GST Class  Active site amino acid residue  Predicted position with MSA
Tau        Ser                             10-20
Phi        Ser                             60-70
Theta      Ser                             14
Zeta       Ser                             20
Lambda     Cys                             36
DHAR       Cys                             20
EF1G       Tyr                             -
TCHQD      Ser                             -
Omega      Cys                             30-40
