Transcriptome data modeling for targeted plant metabolic engineering
Keiko Yonekura-Sakakibara1, Atsushi Fukushima1 and Kazuki Saito1,2
Available online at www.sciencedirect.com
The massive data generated by omics technologies require the
power of bioinformatics, especially network analysis, for data
mining and doing data-driven biology. Gene coexpression
analysis, a network approach based on comprehensive gene
expression data using microarrays, is becoming a standard tool
for predicting gene function and elucidating the relationship
between metabolic pathways. Differential and comparative
gene coexpression analyses suggest a change in coexpression
relationships and regulators controlling common and/or
specific biological processes. In conjunction with the newly
emerging genome editing technology, network analysis
integrated with other omics data should pave the way for robust
and practical plant metabolic engineering.
Addresses
1 RIKEN Plant Science Center, 1-7-22, Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
2 Graduate School of Pharmaceutical Sciences, Chiba University, 1-8-1, Inohana, Chuo-ku, Chiba 260-8675, Japan
Corresponding author: Saito, Kazuki ([email protected],
Current Opinion in Biotechnology 2013, 24:285–290
This review comes from a themed issue on Plant biotechnology
Edited by Natalia Dudareva and Dean DellaPenna
For a complete overview see the Issue and the Editorial
Available online 4th December 2012
0958-1669/$ – see front matter, © 2012 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.copbio.2012.10.018
Introduction
In most systems approaches used to understand cellular
biological processes, one key assertion is that the biological
systems can be modeled as a network [1–3]. In general, a
network can be described as a graph in which biological
entities such as genes, transcripts, proteins and metabolites
correspond to nodes, and the interactions between nodes
such as coexpression and protein-protein interaction, cor-
respond to edges. Network analysis, in turn, refers to the
use of graph-theoretic models and statistics to provide
topological information about a network.
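The graph description above can be made concrete with a minimal sketch. The node names and edge list below are hypothetical and serve only to illustrate how biological entities map to nodes, how interactions map to edges, and how a simple topological statistic (node degree) is read off the resulting graph.

```python
from collections import defaultdict

# Hypothetical edge list: (node_a, node_b, interaction_type).
edges = [
    ("geneA", "geneB", "coexpression"),
    ("geneB", "geneC", "coexpression"),
    ("proteinA", "proteinB", "protein-protein interaction"),
]


def build_graph(edge_list):
    """Return an undirected adjacency map {node: {neighbor: interaction_type}}."""
    graph = defaultdict(dict)
    for a, b, kind in edge_list:
        graph[a][b] = kind
        graph[b][a] = kind
    return dict(graph)


def degree(graph):
    """Number of neighbors per node, one of the simplest topological statistics."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


graph = build_graph(edges)
degrees = degree(graph)
```

In this toy graph, geneB is connected to two other nodes, whereas every other node has a single neighbor.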
In a broad sense, network analysis can be classified into
three major types, omics data modeling, stoichiometric
modeling and kinetic modeling, although they overlap
to a certain degree (Figure 1). Omics data modeling refers
to the use of statistical methods to identify and infer
complex functional interactions among the components
in biological systems [4–6]. Stoichiometric modeling
entails network analysis based on knowledge of the stoichi-
ometry of a system, which predicts flux distributions of
biological pathways [7]. Kinetic modeling, which requires
reliable information about kinetic parameters, is utilized
for the evaluation of the dynamics of biological systems
such as time-course simulation, steady-state analysis and
metabolic control analysis [8]. Of these three types, omics
data modeling is becoming a standard analytical tool for
understanding whole biological systems and prediction of
gene function. Various networks including gene coexpres-
sion, differential coexpression, metabolite-to-metabolite,
gene-to-metabolite and protein-protein interaction have
been elucidated by means of network analysis using multi-
omics data [6,9–20]. Here, we focus on omics data model-
ing, especially on transcriptomic data, in network analysis
as a means to improve metabolic engineering strategies.
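To illustrate what a time-course simulation in kinetic modeling involves, the following minimal sketch integrates a single hypothetical reaction converting a substrate S into a product P with Michaelis-Menten kinetics. The rate law, the parameter values (vmax, km) and the simple Euler scheme are assumptions chosen for brevity; they are not taken from the models reviewed in [8].

```python
def simulate(s0=10.0, vmax=1.0, km=2.0, dt=0.01, t_end=50.0):
    """Euler integration of dS/dt = -v and dP/dt = +v, with v = vmax * S / (km + S).

    All parameter values are hypothetical and in arbitrary units.
    """
    s, p = s0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        v = vmax * s / (km + s)  # Michaelis-Menten rate at the current substrate level
        s -= v * dt
        p += v * dt
    return s, p


final_s, final_p = simulate()
```

Because the substrate is only converted into product, the sum of S and P stays equal to the initial substrate level, which gives a quick sanity check on the integration.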
Transcriptome data modeling for identifying target genes
Functional identification of target genes is one of the
major objectives of network analyses using omics data.
Gene coexpression analysis based on the so-called ‘guilt-
by-association’ principle [11] is frequently used for this
purpose [21–22]. This approach enables efficient exam-
ination of candidate genes that belong to a multigene
family (e.g. family 1 glycosyltransferases (UGTs), meth-
yltransferases, MYB), and accurate prediction of gene
function that is then corroborated by other experimental
evidence [23–26,27••,28•]. Maeda et al. identified a gene
encoding prephenate aminotransferase (PPA-AT) from
Arabidopsis thaliana and Petunia hybrida by first searching
for an aminotransferase gene coexpressed with those in
the shikimate and phenylpropanoid pathways of Arabi-
dopsis and then looking for homologs in petunia
expressed sequence tag (EST) databases [27••]. Enzymatic
characterization of the recombinant proteins and in vivo
experiments using the RNA interference (RNAi) technique
verified the PPA-AT function. Before this discovery, the
enzymatic activity of PPA-AT had been well characterized, but the
corresponding genes had not been identified in any organism.
Analyses of PPA-AT RNAi lines
additionally revealed unknown post-transcriptional
regulation in the phenylalanine pathway. Moreover,
the use of co-expression analysis allowed Pfalz et al. to
identify genes that were later used to improve pro-
duction of the indole glucosinolate, indol-3-yl-methyl
(I3M), in Nicotiana benthamiana [28•]. The two-step
modification of I3M to 4-methoxy-indol-3-yl-methyl
Figure 1
[Figure labels: network analyses ordered by network size (large to small), with the approaches marked 'top-down', 'bottom-up' and 'middle-out'. Omics data modeling: topological properties, omics multinetwork, functional map. Stoichiometric modeling: genome-scale reconstruction, quantitative predictions. Kinetic modeling: dynamic description, regulatory mechanism (plot of levels against time (h)). Network inferences: Pearson's correlation, partial correlation, mutual information, MIC, gene coordination, etc. Data mining tools: PCA and ICA, graph clustering, network reconstruction, pathway databases, etc. Theoretical approaches: ODEs, optimization, prior knowledge, enzyme databases, etc.]
Schematic diagram of computational approaches of cellular networks.
A network analysis is a statistical method of identification and inference about functional interactions between elements in biological systems.
Mathematical modeling with detailed kinetic parameters (so-called ‘kinetic models’) evaluates the function and the dynamics in targeted biological
pathways. Stoichiometric analysis with genome-scale maps provides predictive flux distributions in cell metabolism. Omics data carry multilevel
network inferences. MIC, maximal information coefficient [64]; PCA, principal component analysis; ICA, independent component analysis; ODEs,
ordinary differential equations.
or 1-methoxy-indol-3-yl-methyl was achieved by
additional overexpression of both cytochrome P450
monooxygenases, CYP81Fs, and two O-methyltransfer-
ases that had been identified by gene coexpression
analysis using CYP81Fs [28•]. Similarly, based on coexpression
with sucrose synthases (SUS5 and SUS6)
involved in the synthesis of the callose lining, Barratt
et al. identified a callose synthase, glucan synthase-like7,
in the sieve plate pores of stems and roots [29].
Thus, various genes encoding enzymes, members of
protein complexes and transcription factors have been
identified by gene coexpression network analysis using
public databases (reviewed in [11,21]; see also
http://atted.jp/top_publication.shtml).
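The 'guilt-by-association' logic used in these studies can be sketched as follows: candidate genes are ranked by how strongly their expression profiles correlate with those of known pathway ('bait') genes. The gene names and expression values below are hypothetical, and the mean Pearson correlation is only one of several possible scoring schemes; it is not presented as the procedure used in the cited works.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_candidates(expression, baits, candidates):
    """Rank candidates by their mean Pearson correlation with the bait genes."""
    scores = {}
    for cand in candidates:
        rs = [pearson(expression[cand], expression[b]) for b in baits]
        scores[cand] = sum(rs) / len(rs)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical expression profiles across five samples.
expression = {
    "bait1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "bait2": [2.0, 4.1, 5.9, 8.2, 10.0],
    "cand_x": [0.9, 2.2, 2.9, 4.1, 5.2],
    "cand_y": [5.0, 3.0, 4.0, 1.0, 2.0],
}
ranking = rank_candidates(expression, ["bait1", "bait2"], ["cand_x", "cand_y"])
```

Here cand_x tracks both baits closely and ranks first, whereas cand_y is negatively correlated with them. In practice, such a ranking only prioritizes candidates; the predicted function still requires experimental confirmation, as in the examples above.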
As another approach based on transcriptome data, inde-
pendent component analysis (ICA), a form of unsupervised
algorithm, has been applied to microarray data analysis for
extraction and characterization of informative features,
clustering and classification of gene expression profiles
[30–33]. ICA of a total of 1877 genes including flavonoid
biosynthetic genes and genes annotated in AraCyc [34] was
performed on 1388 microarray data with ATTED-II [35•]. A hierarchical cluster analysis of genes based on ICA
showed that the genes involved in the biosynthesis of
anthocyanins and flavonols form distinct clusters and the
cluster of anthocyanin biosynthetic genes can be divided
into two subclusters for skeleton biosynthesis and modification
[35•]. Among putative anthocyanin UGTs, UGT79B1, found in the
anthocyanin modification subcluster, was identified as
anthocyanin 3-O-glucoside: 2″-O-xylosyltransferase. In the
anthocyanin skeleton biosynthesis subcluster, UGT84A2 was
discovered to be a sinapic
acid glucosyltransferase that supplies 1-O-sinapoylglucose
as sinapoyl donor for anthocyanin sinapoyltransferase [35•]. Interestingly, the cluster designation changes for some
bifunctional flavonoid glycosyltransferases depending on
the number of independent components (ICs) used for
analysis. Flavonoid 3-O-glucosyltransferase, for example,
which can recognize flavonols and anthocyanins, falls into
the flavonoid cluster based on 8 ICs, but into the anthocyanin
modification cluster based on 10 ICs [35•].
To further infer inter-pathway interactions, gene coex-
pression analysis can be expanded by conducting graph
clustering and differential coexpression analysis using
distinct data sets (e.g. organ-specific data) [36•]. Gene
coexpression analysis was performed on more than
300 tomato microarray data sets, and coexpression modules
were extracted by graph clustering. Graph clustering of
coexpression networks can be helpful for extracting den-
sely-clustered gene modules [37]. Significantly enriched
gene ontology terms in 88% of extracted coexpression
modules suggested the biological relevance of genes in
the modules. Coexpression analysis showed that gene
coexpression varied with the organ datasets used. In
the case of flavonoid biosynthetic genes, a strong positive
correlation between flavanone 3-hydroxylase and 4-cou-
marate:CoA ligase was observed in fruits (r = 0.89), but
not in roots (r = −0.23). In the case of chalcone synthase
(CHS) and chalcone isomerase, a strong negative correlation
in roots (r = −0.72) and a weak positive correlation in fruits
(r = 0.50) were observed. Carotenoid biosynthetic genes
also showed similar results. Flavonoids are highly accu-
mulated in leaves and fruits, but not in roots of tomato,
suggesting that differential coexpression indicates a
change in gene coexpression relationships that may
reflect ‘reprogramming’ of transcriptome networks between
two biological tissues. This approach has been applied
not only to plants but also to animals [36•,38–39].
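One common way to judge whether two correlation coefficients measured in different tissues differ by more than chance is Fisher's z-transformation. The sketch below applies it to the fruit and root correlations quoted above (r = 0.89 and r = −0.23). The sample sizes are hypothetical, because the number of arrays per organ is not given here, and this test is shown as a generic illustration rather than as the procedure used in [36•].

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Compare two independent Pearson correlations via Fisher's z-transformation.

    Returns the z statistic and a two-sided p-value under a normal approximation.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# r values from the text; n = 30 arrays per organ is a hypothetical sample size.
z_stat, p_value = fisher_z_difference(0.89, 30, -0.23, 30)
```

With these assumed sample sizes the difference between the two organs is far larger than expected by chance. With fewer arrays per organ the same pair of coefficients would be less conclusive, which is one reason why differential coexpression depends on well-populated, organ-specific data sets.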
A bioinformatics approach termed ‘gene coordination’
was developed in order to understand the coordinated
response of gene networks to environmental stimuli
[40]. An analysis using >1000 genes encoding enzymes
and transcription factors revealed possible stress-associ-
ated intra-pathway and inter-pathway interactions be-
tween genes from six energy-associated pathways
including the TCA cycle, glycolysis and photosynthesis
[41]. This approach may be useful for predicting gene function
and for a deeper understanding of interactions and crosstalk
between pathways, helping to elucidate whole biological
systems in organisms.
Comparative network approaches transferring insights from model plants into crops
The accumulation of gene expression data from various
plants enables us to conduct gene coexpression network
analysis in a range of plant species. Furthermore, integ-
ration of sequence similarity and gene (co)expression
profiles allows identification of conserved coexpression
clusters among multiple plant species (so-called ‘comparative
coexpression’) [42]. As an example, comparative coexpression
analysis was performed with PlaNet using the CHS
gene as a query [43•]. CHS belongs to the
polyketide synthase family and is the first committed
enzyme in the biosynthesis of flavonoids [44]. Conserved
coexpression clusters containing CHSs and other
flavonoid biosynthetic genes from each plant (Medicago,
barley, soybean, wheat, rice, and poplar) were found to be
the most similar ones across the seven species. In addition,
conserved coexpression clusters containing Arabidopsis
CHS-like genes were found. Arabidopsis has three
additional CHS-like polyketide synthase genes. At-PKS-B, one
of these polyketide synthases, is involved in fatty
acid and phenolics biosynthesis for pollen exine
development. In accordance with this, the conserved
coexpression cluster for At-PKS-B contained several
genes that are required for the biosynthesis of polyamines,
which are components of the sporopollenin surrounding the pollen
grains.
Furthermore, statistical methods based on the number of
orthologs between coexpression modules have been pro-
posed for proper cross-species comparison, although Pla-
Net applies a permutation test as statistical model. For
example, a method referred to as conserved modules
across organisms (COMODO) was developed to deter-
mine the most statistically optimal conserved coexpres-
sion module pairs between organisms [45]. By
COMODO, module ‘seed’ genes are selected from the
gene-to-gene threshold matrices, extended until optim-
ality assessed by a Pearson’s chi-square test is reached,
and finally the conserved module pair consisting of genes
in core and variable parts are shown [45]. In the case of
Escherichia coli and Bacillus subtilis, ca. 80 conserved
module pairs linked through a statistically significant
set of homologous genes were identified. In those con-
served modules, genes in the variable parts accounting for
on average 40% are specific to one organism, suggesting
that these organisms have acquired new members and/or
have rewired the network during evolution. In addition,
the splitting of coexpression modules in one organism
into two modules in the other, and expression divergence
of modules containing duplicated genes, were observed.
This result may suggest that we should also take into
account an evolutionary perspective including both evol-
utionary systems biology (i.e. how biological networks
evolved, see review [5]) and gene expression profiles,
because plants have multigene families encoding
enzymes and transcription factors involved in metab-
olism.
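As a simple illustration of an ortholog-count statistic for cross-species module comparison, the sketch below asks whether the orthologs of one module fall into a module of another species more often than expected by chance, using a hypergeometric tail probability. This is an assumed stand-in for illustration only: it is neither the permutation test of PlaNet nor the chi-square-based optimization of COMODO, and all gene names and the one-to-one ortholog map are hypothetical.

```python
from math import comb


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when drawing n genes from N, of which K belong to the target module."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def module_overlap_pvalue(module_a, module_b, orthologs, background_b):
    """Score the overlap between module_b and the orthologs of module_a.

    orthologs maps a gene of species A to one gene of species B (hypothetical
    one-to-one map); module_b is assumed to be a subset of background_b.
    """
    mapped = {orthologs[g] for g in module_a
              if g in orthologs and orthologs[g] in background_b}
    shared = len(mapped & module_b)
    p = hypergeom_tail(shared, len(module_b), len(mapped), len(background_b))
    return shared, p


# Toy data: 20 genes per species, with modules of four genes each.
background_b = {f"b{i}" for i in range(20)}
orthologs = {f"a{i}": f"b{i}" for i in range(20)}
module_a = {"a0", "a1", "a2", "a3"}
module_b = {"b0", "b1", "b2", "b3"}
shared, p_value = module_overlap_pvalue(module_a, module_b, orthologs, background_b)
```

In this toy case all four orthologs land in the same module, an outcome with probability 1/4845 under random assignment. Multigene families break the one-to-one ortholog assumption made here, which is part of why dedicated methods are needed for plants.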
Genome editing tools for new plant biotechnology
In parallel with progress in network analysis, new plant
biotechnologies including zinc-finger nuclease technol-
ogy [46–47], oligonucleotide-directed mutagenesis [48]
and RNA-dependent DNA methylation [49] have been
developed and applied to model plants and crops [50].
These approaches are especially useful for plants with
long generation and fruition times.
Figure 2
[Figure labels: multi-omics data (transcriptome, epigenome, proteome, interactome, metabolome, etc.); biochemical data; advanced analytical technologies (RNA-Seq, ChIP-Seq, etc.); systems biology; network analyses (omics data modeling, stoichiometric modeling, kinetic modeling); genome editing technologies (TALEN, ZFN, ODM, cisgenesis and intragenesis, RdDM, grafting, reverse breeding, agro-infiltration); custom breeding.]
Schematic representation for custom breeding based on network
analysis.
TALEN, transcription activator-like effector nuclease; ZFN, zinc finger
nuclease; ODM, oligonucleotide directed mutagenesis; RdDM, RNA-
dependent DNA methylation.
Transcription activator-like effector nuclease (TALEN)
is a promising genome editing tool applicable to nearly
any organism [51–52]. Fusion of transcription activator-
like (TAL) effector proteins to the FokI nuclease creates
site-specific DNA nuclease for targeted DNA cleavage
[51]. Disease-resistant rice was developed by introducing
deletions or insertions into TAL effector-binding
element (EBE) in a promoter region of a disease-
susceptibility gene, Os11N3, a member of the SWEET
sucrose-efflux transporter family, using TALEN [53••].
Xanthomonas oryzae pv. oryzae TAL effectors bind the EBE
and modify Os11N3 expression to divert the sugar to
their advantage. The mutations were designed to interfere
with the binding of the X. oryzae pv. oryzae TAL effectors
AvrXa7 and PthXo3, but not to affect the function of
Os11N3.
Furthermore, gene activation and repression using engin-
eered TAL effector proteins were reported [54–55]. TAL
effector protein Hax3 fused with the EAR-repression
domain, SRDX, efficiently repressed the target RD29A
gene [54]. Designer TAL effectors with modified repeat-
variable diresidues of TALE repeat units activated the
expression of target EGL3 and KNAT1 genes in Arabi-
dopsis [55]. A publicly available high-throughput system
using TALEN in human cells [56] may be applicable for
plants.
Conclusions
Network analyses, especially transcriptome data modeling,
have facilitated the functional identification of
unknown genes and the elucidation of metabolic networks.
They also suggest the fine-tuning of regulatory
mechanisms under different conditions. Differential
and comparative network approaches [42] may give
us useful information about key regulators controlling
common biological processes among plant species and
diversified systems in specific plant(s). So far, the omics
data that serve as a basis for network analyses are still
not precise or distinct, and these factors contribute to
the limitations seen in network analyses. It will be
important to carefully filter out inaccurate data before
they are used for network analysis and/or to develop
robust bioinformatics analysis that can tolerate a certain
degree of inaccuracy present in the large data set [57].
Further advances in analytical technologies like RNA-
Seq [58–59], while still containing a high degree of
inaccuracy, should generate more complete and
precise omics data and exploit the power of network
analysis.
Network-based integration of multiple omics data is
another promising strategy [60] because key regulatory
changes leading to phenotypes of interest (e.g. altered
metabolite accumulation) do not necessarily occur at
transcriptional levels. This limitation can be addressed
by integrating multiple omics data.
Discovery of gene functions and inter-pathway inter-
actions based on omics modeling is an important first
step. As more comprehensive and genome-wide data become
available at deep omics levels [59,61], integrated omics
modeling will shed light on plant metabolism at the
system-level and lead to robust and directed metabolic
engineering in plants (Figure 2). Furthermore, stoichio-
metric modeling and kinetic modeling will become
critical future steps toward identification of targets for
rational plant metabolic engineering [62–63].
Acknowledgements
This work was partly supported by a Grant-in-Aid for Scientific Research on Innovative Areas (to K.S.), Scientific Research (C) (to K. Y.-S.) and Young Scientists (B) (to A.F.) from the Ministry of Education, Culture, Sports, Science and Technology of Japan.
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest
1. Bebek G, Koyuturk M, Price ND, Chance MR: Network biology methods integrating biological data for translational science. Brief Bioinform 2012, 13:446-459.
2. Ruffel S, Krouk G, Coruzzi GM: A systems view of responses to nutritional cues in Arabidopsis: toward a paradigm shift for predictive network modeling. Plant Physiol 2010, 152:445-452.
3. Yuan JS, Galbraith DW, Dai SY, Griffin P, Stewart CN Jr: Plant systems biology comes of age. Trends Plant Sci 2008, 13:165-171.
4. Fukushima A, Kusano M, Redestig H, Arita M, Saito K: Integrated omics approaches in plant systems biology. Curr Opin Chem Biol 2009, 13:532-538.
5. Chae L, Lee I, Shin J, Rhee SY: Towards understanding how molecular networks evolve in plants. Curr Opin Plant Biol 2012, 15:177-184.
6. Stitt M, Sulpice R, Keurentjes J: Metabolic networks: how to identify key components in the regulation of metabolism and growth. Plant Physiol 2010, 152:428-444.
7. Schellenberger J, Que R, Fleming RM, Thiele I, Orth JD, Feist AM, Zielinski DC, Bordbar A, Lewis NE, Rahmanian S et al.: Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nat Protoc 2011, 6:1290-1307.
8. Rohwer JM: Kinetic modelling of plant metabolic pathways. J Exp Bot 2012, 63:2275-2292.
9. Vidal M, Cusick ME, Barabasi AL: Interactome networks and human disease. Cell 2011, 144:986-998.
10. Fukushima A, Kusano M, Redestig H, Arita M, Saito K: Metabolomic correlation-network modules in Arabidopsis based on a graph-clustering approach. BMC Syst Biol 2011, 5:1.
11. Saito K, Hirai MY, Yonekura-Sakakibara K: Decoding genes with coexpression networks and metabolomics – ‘majority report by precogs’. Trends Plant Sci 2008, 13:36-43.
12. Kusano M, Fukushima A, Arita M, Jonsson P, Moritz T, Kobayashi M, Hayashi N, Tohge T, Saito K: Unbiased characterization of genotype-dependent metabolic regulations by metabolomic approach in Arabidopsis thaliana. BMC Syst Biol 2007, 1:53.
13. Kusano M, Tohge T, Fukushima A, Kobayashi M, Hayashi N, Otsuki H, Kondou Y, Goto H, Kawashima M, Matsuda F et al.: Metabolomics reveals comprehensive reprogramming involving two independent metabolic responses of Arabidopsis to UV-B light. Plant J 2011, 67:354-369.
14. Matsuda F, Hirai MY, Sasaki E, Akiyama K, Yonekura-Sakakibara K, Provart NJ, Sakurai T, Shimada Y, Saito K: AtMetExpress development: a phytochemical atlas of Arabidopsis development. Plant Physiol 2010, 152:566-578.
15. Gutierrez RA, Lejay LV, Dean A, Chiaromonte F, Shasha DE, Coruzzi GM: Qualitative network models and genome-wide expression data define carbon/nitrogen-responsive molecular machines in Arabidopsis. Genome Biol 2007, 8:R7.
16. Gifford ML, Dean A, Gutierrez RA, Coruzzi GM, Birnbaum KD: Cell-specific nitrogen responses mediate developmental plasticity. Proc Natl Acad Sci USA 2008, 105:803-808.
17. Katari MS, Nowicki SD, Aceituno FF, Nero D, Kelfer J, Thompson LP, Cabello JM, Davidson RS, Goldberg AP, Shasha DE et al.: VirtualPlant: a software platform to support systems biology research. Plant Physiol 2010, 152:500-515.
18. Vidal EA, Tamayo KP, Gutierrez RA: Gene networks for nitrogen sensing, signaling, and response in Arabidopsis thaliana. Wiley Interdiscip Rev Syst Biol Med 2010, 2:683-693.
19. Lee I, Ambaru B, Thakkar P, Marcotte EM, Rhee SY: Rational association of genes with traits using a genome-scale gene network for Arabidopsis thaliana. Nat Biotechnol 2010, 28:149-156.
20. Lee I, Seo YS, Coltrane D, Hwang S, Oh T, Marcotte EM, Ronald PC: Genetic dissection of the biotic stress response using a genome-scale gene network for rice. Proc Natl Acad Sci USA 2011, 108:18548-18553.
21. Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ: Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 2009, 32:1633-1651.
22. Tohge T, Fernie AR: Combining genetic diversity, informatics and metabolomics to facilitate annotation of plant gene function. Nat Protoc 2010, 5:1210-1227.
23. Yonekura-Sakakibara K, Tohge T, Niida R, Saito K: Identification of a flavonol 7-O-rhamnosyltransferase gene determining flavonoid pattern in Arabidopsis by transcriptome coexpression analysis and reverse genetics. J Biol Chem 2007, 282:14932-14941.
24. Hirai MY, Sugiyama K, Sawada Y, Tohge T, Obayashi T, Suzuki A, Araki R, Sakurai N, Suzuki H, Aoki K et al.: Omics-based identification of Arabidopsis Myb transcription factors regulating aliphatic glucosinolate biosynthesis. Proc Natl Acad Sci USA 2007, 104:6478-6483.
25. Yonekura-Sakakibara K, Tohge T, Matsuda F, Nakabayashi R, Takayama H, Niida R, Watanabe-Takahashi A, Inoue E, Saito K: Comprehensive flavonol profiling and transcriptome coexpression analysis leading to decoding gene-metabolite correlations in Arabidopsis. Plant Cell 2008, 20:2160-2176.
26. Okazaki Y, Shimojima M, Sawada Y, Toyooka K, Narisawa T, Mochida K, Tanaka H, Matsuda F, Hirai A, Hirai MY et al.: A chloroplastic UDP-glucose pyrophosphorylase from Arabidopsis is the committed enzyme for the first step of sulfolipid biosynthesis. Plant Cell 2009, 21:892-909.
27.•• Maeda H, Yoo H, Dudareva N: Prephenate aminotransferase directs plant phenylalanine biosynthesis via arogenate. Nat Chem Biol 2011, 7:19-21.
Based on gene coexpression analysis, prephenate aminotransferase (PPA-AT) was identified in Arabidopsis. The petunia homologue was isolated from an EST database by similarity with Arabidopsis PPA-AT. This is the first report of PPA-AT genes and a good example of the application of Arabidopsis research to a horticultural plant.
28.• Pfalz M, Mikkelsen MD, Bednarek P, Olsen CE, Halkier BA, Kroymann J: Metabolic engineering in Nicotiana benthamiana reveals key enzyme functions in Arabidopsis indole glucosinolate modification. Plant Cell 2011, 23:716-729.
Production of indole glucosinolate in N. benthamiana was achieved by introducing seven indole glucosinolate biosynthetic genes. Gene coexpression analysis delimited two O-methyltransferases as candidates involved in indole glucosinolate modification. The functions of the two O-methyltransferases were tested by coexpression with CYP81Fs in indole glucosinolate-producing N. benthamiana.
29. Barratt DH, Kolling K, Graf A, Pike M, Calder G, Findlay K, Zeeman SC, Smith AM: Callose synthase GSL7 is necessary for normal phloem transport and inflorescence growth in Arabidopsis. Plant Physiol 2011, 155:328-341.
30. Kong W, Vanderburg CR, Gunshin H, Rogers JT, Huang X: A review of independent component analysis application to microarray gene expression data. Biotechniques 2008, 45:501-520.
31. Hori G, Inoue M, Nishimura SI, Nakahara H: Blind gene classification—an application of a signal separation method. Genome Inform 2001, 12:255-256.
32. Frigyesi A, Veerla S, Lindgren D, Hoglund M: Independent component analysis reveals new and biologically significant structures in micro array data. BMC Bioinform 2006, 7:290.
33. Kong W, Mou X, Liu Q, Chen Z, Vanderburg CR, Rogers JT, Huang X: Independent component analysis of Alzheimer’s DNA microarray gene expression data. Mol Neurodegener 2009, 4:5.
34. Mueller LA, Zhang P, Rhee SY: AraCyc: a biochemical pathway database for Arabidopsis. Plant Physiol 2003, 132:453-460.
35.• Yonekura-Sakakibara K, Fukushima A, Nakabayashi R, Hanada K, Matsuda F, Sugawara S, Inoue E, Kuromori T, Ito T, Shinozaki K et al.: Two glycosyltransferases involved in anthocyanin modification delineated by transcriptome independent component analysis in Arabidopsis thaliana. Plant J 2012, 69:154-167.
Independent component analysis (ICA) with gene coexpression analysis was used to identify the candidate genes involved in anthocyanin biosynthesis. As a result, anthocyanin xylosyltransferase and sinapic acid glucosyltransferase were identified as glycosyltransferases involved in anthocyanin modification directly and indirectly. It is interesting that ICA provides a different perspective on microarray data analysis depending on the number of ICs.
36.• Fukushima A, Nishizawa T, Hayakumo M, Hikosaka S, Saito K, Goto E, Kusano M: Exploring tomato gene functions based on coexpression modules using graph clustering and differential coexpression approaches. Plant Physiol 2012, 158:1487-1502.
The authors found differential gene coexpression based on tomato microarray data sets and pointed out the possibility that differential coexpression reflects rewiring of the transcriptome networks in distinct organs and suggests key regulatory steps in the pathways.
37. Wang J, Li M, Deng Y, Pan Y: Recent advances in clustering methods for protein interaction networks. BMC Genomics 2010, 11(Suppl. 3):S10.
38. de la Fuente A: From ‘differential expression’ to ‘differential networking’ – identification of dysfunctional regulatory networks in diseases. Trends Genet 2010, 26:326-333.
39. Ideker T, Krogan NJ: Differential network biology. Mol Syst Biol 2012, 8:565.
40. Less H, Angelovici R, Tzin V, Galili G: Coordinated gene networks regulating Arabidopsis plant metabolism in response to various stresses and nutritional cues. Plant Cell 2011, 23:1264-1271.
41. Avin-Wittenberg T, Tzin V, Angelovici R, Less H, Galili G: Deciphering energy-associated gene networks operating in the response of Arabidopsis plants to stress and nutritional cues. Plant J 2012, 70:954-966.
42. Movahedi S, Van Bel M, Heyndrickx KS, Vandepoele K: Comparative co-expression analysis in plant biology. Plant Cell Environ 2012, 35:1787-1789.
43.• Mutwil M, Klie S, Tohge T, Giorgi FM, Wilkins O, Campbell MM, Fernie AR, Usadel B, Nikoloski Z, Persson S: PlaNet: combined sequence and expression comparisons across plant networks derived from seven species. Plant Cell 2011, 23:895-910.
PlaNet is a plant coexpression network browser for seven plant species. Integration of sequence similarity and gene expression profiles allows us to identify conserved coexpression clusters across multiple plant species.
44. Abe I, Morita H: Structure and function of the chalcone synthase superfamily of plant type III polyketide synthases. Nat Prod Rep 2010, 27:809-838.
45. Zarrineh P, Fierro AC, Sanchez-Rodriguez A, De Moor B, Engelen K, Marchal K: COMODO: an adaptive coclustering strategy to identify conserved coexpression modules between organisms. Nucleic Acids Res 2011, 39:e41.
46. Shukla VK, Doyon Y, Miller JC, DeKelver RC, Moehle EA, Worden SE, Mitchell JC, Arnold NL, Gopalan S, Meng X et al.: Precise genome modification in the crop species Zea mays using zinc-finger nucleases. Nature 2009, 459:437-441.
47. Townsend JA, Wright DA, Winfrey RJ, Fu F, Maeder ML, Joung JK, Voytas DF: High-frequency modification of plant genes using engineered zinc-finger nucleases. Nature 2009, 459:442-445.
48. Beetham PR, Kipp PB, Sawycky XL, Arntzen CJ, May GD: A tool for functional plant genomics: chimeric RNA/DNA oligonucleotides cause in vivo gene-specific mutations. Proc Natl Acad Sci USA 1999, 96:8774-8778.
49. Aufsatz W, Mette MF, van der Winden J, Matzke AJ, Matzke M: RNA-directed DNA methylation in Arabidopsis. Proc Natl Acad Sci USA 2002, 99(Suppl. 4):16499-16506.
50. Lusser M, Parisi C, Plan D, Rodriguez-Cerezo E: Deployment of new biotechnologies in plant breeding. Nat Biotechnol 2012, 30:231-239.
51. Bogdanove AJ, Voytas DF: TAL effectors: customizable proteins for DNA targeting. Science 2011, 333:1843-1846.
52. Scholze H, Boch J: TAL effectors are remote controls for gene activation. Curr Opin Microbiol 2011, 14:47-53.
53.•• Li T, Liu B, Spalding MH, Weeks DP, Yang B: High-efficiency TALEN-based gene editing produces disease-resistant rice. Nat Biotechnol 2012, 30:390-392.
TALEN is a promising genome editing technology that is applicable to gene activation and repression of desired genes. This report is the first example of conferring disease resistance on plants using TALEN technology.
54. Mahfouz MM, Li L, Piatek M, Fang X, Mansour H, Bangarusamy DK, Zhu JK: Targeted transcriptional repression using a chimeric TALE-SRDX repressor protein. Plant Mol Biol 2012, 78:311-321.
55. Morbitzer R, Romer P, Boch J, Lahaye T: Regulation of selected genome loci using de novo-engineered transcription activator-like effector (TALE)-type transcription factors. Proc Natl Acad Sci USA 2010, 107:21617-21622.
56. Reyon D, Tsai SQ, Khayter C, Foden JA, Sander JD, Joung JK: FLASH assembly of TALENs for high-throughput genome editing. Nat Biotechnol 2012, 30:460-465.
57. Ideker T, Dutkowski J, Hood L: Boosting signal-to-noise in complex biology: prior knowledge is power. Cell 2011, 144:860-863.
58. Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y: RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res 2008, 18:1509-1517.
59. Mochida K, Shinozaki K: Genomics and bioinformatics resources for crop improvement. Plant Cell Physiol 2010, 51:497-523.
60. Brady SM, Provart NJ: Web-queryable large-scale data sets for hypothesis generation in plant biology. Plant Cell 2009, 21:1034-1051.
61. Yonekura-Sakakibara K, Saito K: Functional genomics for plant natural product biosynthesis. Nat Prod Rep 2009, 26:1466-1487.
62. Heinzle E, Matsuda F, Miyagawa H, Wakasa K, Nishioka T: Estimation of metabolic fluxes, expression levels and metabolite dynamics of a secondary metabolic pathway in potato using label pulse-feeding experiments combined with kinetic network modelling and simulation. Plant J 2007, 50:176-187.
63. Colon AM, Sengupta N, Rhodes D, Dudareva N, Morgan J: A kinetic model describes metabolic response to perturbations and distribution of flux control in the benzenoid network of Petunia hybrida. Plant J 2010, 62:64-76.
64. Reshef DN, Reshef YA, Finucane HK, Grossman SR, McVean G, Turnbaugh PJ, Lander ES, Mitzenmacher M, Sabeti PC: Detecting novel associations in large data sets. Science 2011, 334:1518-1524.