Transcriptome data modeling for targeted plant metabolic engineering

Keiko Yonekura-Sakakibara1, Atsushi Fukushima1 and Kazuki Saito1,2

Available online at www.sciencedirect.com

The massive data generated by omics technologies require the

power of bioinformatics, especially network analysis, for data

mining and doing data-driven biology. Gene coexpression

analysis, a network approach based on comprehensive gene

expression data using microarrays, is becoming a standard tool

for predicting gene function and elucidating the relationship

between metabolic pathways. Differential and comparative

gene coexpression analyses suggest a change in coexpression

relationships and regulators controlling common and/or

specific biological processes. In conjunction with the newly

emerging genome editing technology, network analysis

integrated with other omics data should pave the way for robust

and practical plant metabolic engineering.

Addresses
1 RIKEN Plant Science Center, 1-7-22, Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
2 Graduate School of Pharmaceutical Sciences, Chiba University, 1-8-1, Inohana, Chuo-ku, Chiba 260-8675, Japan

Corresponding author: Saito, Kazuki ([email protected],

[email protected])

Current Opinion in Biotechnology 2013, 24:285–290

This review comes from a themed issue on Plant biotechnology

Edited by Natalia Dudareva and Dean DellaPenna

For a complete overview see the Issue and the Editorial

Available online 4th December 2012

0958-1669/$ – see front matter, © 2012 Elsevier Ltd. All rights reserved.

http://dx.doi.org/10.1016/j.copbio.2012.10.018

Introduction

In most systems approaches used to understand cellular biological
processes, one key assertion is that the biological systems can be modeled
as a network [1–3]. In general, a network can be described as a graph in
which biological entities such as genes, transcripts, proteins and
metabolites correspond to nodes, and the interactions between nodes, such
as coexpression and protein-protein interaction, correspond to edges.
Network analysis, in turn, refers to the use of graph-theoretic models and
statistics to provide topological information about a network.
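The graph description above can be illustrated with a minimal sketch. The gene names and edges below are hypothetical placeholders, not data from any cited study; the code only shows how nodes, edges and two simple topological properties (node degree and connected components) might be represented.

```python
from collections import defaultdict


def build_network(edges):
    """Return an adjacency map {node: set(neighbors)} for an undirected graph."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def degree(adjacency):
    """Number of edges attached to each node."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


def connected_components(adjacency):
    """Group nodes that are reachable from one another (depth-first search)."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical coexpression edges between placeholder genes.
edges = [
    ("geneA", "geneB"),
    ("geneB", "geneC"),
    ("geneA", "geneC"),
    ("geneD", "geneE"),
]
network = build_network(edges)
print(degree(network))
print(connected_components(network))
```

In this toy network the three mutually connected genes form one component and the remaining pair forms another, which is the kind of topological information that network analysis extracts at much larger scale.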

In a broad sense, network analysis can be classified into three major
types, omics data modeling, stoichiometric modeling and kinetic modeling,
although they overlap to a certain degree (Figure 1). Omics data modeling
refers to the use of statistical methods to identify and infer complex
functional interactions among the components in biological systems [4–6].
Stoichiometric modeling entails network analysis based on knowledge of the
stoichiometry of a system, which predicts flux distributions of biological
pathways [7]. Kinetic modeling, which requires reliable information about
kinetic parameters, is utilized for the evaluation of the dynamics of
biological systems such as time-course simulation, steady-state analysis
and metabolic control analysis [8]. Of these three types, omics data
modeling is becoming a standard analytical tool for understanding whole
biological systems and prediction of gene function. Various networks
including gene coexpression, differential coexpression,
metabolite-to-metabolite, gene-to-metabolite and protein-protein
interaction have been elucidated by means of network analysis using
multi-omics data [6,9–20]. Here, we focus on omics data modeling,
especially on transcriptomic data, in network analysis as a means to
improve metabolic engineering strategies.
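For orientation, the stoichiometric and kinetic formulations mentioned above are commonly written in the following textbook form. The symbols are notational assumptions introduced here rather than taken from the cited works: S is the stoichiometric matrix, v the vector of reaction fluxes, x the metabolite concentrations, p the kinetic parameters and c an objective vector.

```latex
\begin{align*}
\frac{d\mathbf{x}}{dt} &= S\,\mathbf{v}(\mathbf{x},\mathbf{p})
  && \text{(kinetic model: time course of metabolite levels)} \\
S\,\mathbf{v} &= \mathbf{0}, \quad
  \mathbf{v}_{\min} \le \mathbf{v} \le \mathbf{v}_{\max}
  && \text{(stoichiometric model: steady-state flux constraints)} \\
\max_{\mathbf{v}} \; & \mathbf{c}^{\mathsf{T}}\mathbf{v}
  && \text{(an optional objective used to select one flux distribution)}
\end{align*}
```

The first line needs rate laws and parameter values, which is why kinetic models tend to cover small networks, whereas the steady-state constraints in the second line need only the reaction stoichiometry.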

Transcriptome data modeling for identifying target genes

Functional identification of target genes is one of the major objectives of
network analyses using omics data. Gene coexpression analysis based on the
so-called ‘guilt-by-association’ principle [11] is frequently used for this
purpose [21–22]. This approach enables efficient examination of candidate
genes that belong to a multigene family (e.g. family 1 glycosyltransferases
(UGTs), methyltransferases, MYB), and accurate prediction of gene function
that is then corroborated by other experimental evidence [23–26,27••,28•].
Maeda et al. identified a gene encoding prephenate aminotransferase
(PPA-AT) from Arabidopsis thaliana and Petunia hybrida by first searching
for an aminotransferase gene coexpressed with those in the shikimate and
phenylpropanoid pathways of Arabidopsis and then looking for homologs in
petunia expressed sequence tag (EST) databases [27••]. Enzymatic
characterization of the recombinant proteins and in vivo experiments using
the RNA interference (RNAi) technique verified the PPA-AT function. Before
this discovery, the enzymatic properties of PPA-AT had been well studied,
but the corresponding genes remained unidentified in any organism. Analyses
of PPA-AT RNAi lines additionally revealed unknown post-transcriptional
regulation in the phenylalanine pathway. Moreover, the use of coexpression
analysis allowed Pfalz et al. to identify genes that were later used to
improve production of the indole glucosinolate, indol-3-yl-methyl (I3M), in
Nicotiana benthamiana [28•]. The two-step modification of I3M to
4-methoxy-indol-3-yl-methyl



Figure 1

[Figure: ‘Network analyses’ divided into three panels along a network-size
axis running from large to small: omics data modeling, stoichiometric
modeling and kinetic modeling, with the labels ‘top-down’, ‘middle-out’ and
‘bottom-up’. Panel annotations: topological properties, omics multinetwork,
functional map; genome-scale reconstruction, quantitative predictions;
dynamic description, regulatory mechanism. Method boxes: network inferences
(Pearson’s correlation, partial correlation, mutual information, MIC, gene
coordination, etc.); data mining tools (PCA and ICA, graph clustering,
network reconstruction, pathway databases, etc.); theoretical approaches
(ODEs, optimization, prior knowledge, enzyme databases, etc.).]

Schematic diagram of computational approaches of cellular networks.
A network analysis is a statistical method of identification and inference
about functional interactions between elements in biological systems.
Mathematical modeling with detailed kinetic parameters (so-called ‘kinetic
models’) evaluates the function and the dynamics in targeted biological
pathways. Stoichiometric analysis with genome-scale maps provides
predictive flux distributions in cell metabolism. Omics data carry
multilevel network inferences. MIC, maximal information coefficient [64];
PCA, principal component analysis; ICA, independent component analysis;
ODEs, ordinary differential equations.

or 1-methoxy-indol-3-yl-methyl was achieved by additional overexpression of
both cytochrome P450 monooxygenases, CYP81Fs, and two O-methyltransferases
that had been identified by gene coexpression analysis using CYP81Fs [28•].
Similarly, based on coexpression with sucrose synthases (SUS5 and SUS6)
involved in the synthesis of the callose lining, Barratt et al. identified
a callose synthase, glucan synthase-like7, in the sieve plate pores of
stems and roots [29].

Thus, various genes encoding enzymes, members of protein complexes and
transcription factors have been identified by gene coexpression network
analysis using public databases (reviewed in [11,21];
http://atted.jp/top_publication.shtml).
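The ‘guilt-by-association’ search described above can be sketched as ranking all genes by the Pearson correlation of their expression profiles with a bait gene of known function. This is a minimal illustration under stated assumptions: the gene names, expression values and the 0.8 cutoff are hypothetical, and published tools such as ATTED-II use their own measures and much larger data sets.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no coexpression signal
    return cov / (sd_x * sd_y)


def rank_coexpressed(expression, bait, min_r=0.8):
    """List (gene, r) pairs with r >= min_r against the bait, best first."""
    bait_profile = expression[bait]
    hits = []
    for gene, profile in expression.items():
        if gene == bait:
            continue
        r = pearson(bait_profile, profile)
        if r >= min_r:  # min_r is an arbitrary, hypothetical cutoff
            hits.append((gene, r))
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical expression values across five samples.
expression = {
    "bait_pathway_gene": [1.0, 2.0, 3.0, 4.0, 5.0],
    "candidate_1": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with the bait
    "candidate_2": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as the bait rises
    "candidate_3": [3.0, 1.0, 4.0, 1.0, 5.0],  # only weakly related
}
hits = rank_coexpressed(expression, "bait_pathway_gene")
print(hits)
```

Candidates that pass the cutoff are hypotheses only; as in the examples above, gene function still has to be confirmed by enzymatic characterization or reverse genetics.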

As another approach based on transcriptome data, independent component
analysis (ICA), an unsupervised algorithm, has been applied to microarray
data for extraction and characterization of informative features, and for
clustering and classification of gene expression profiles [30–33]. ICA of a
total of 1877 genes, including flavonoid biosynthetic genes and genes
annotated in AraCyc [34], was performed on 1388 microarray data sets from
ATTED-II [35•]. A hierarchical cluster analysis of genes based on ICA
showed that the genes involved in the biosynthesis of anthocyanins and
flavonols form distinct clusters, and that the cluster of anthocyanin
biosynthetic genes can be divided into two subclusters, one for skeleton
biosynthesis and one for modification [35•]. Among putative anthocyanin
UGTs, UGT79B1, found in the anthocyanin modification subcluster, was
identified as anthocyanin 3-O-glucoside: 2″-O-xylosyltransferase. In the
anthocyanin skeleton biosynthesis subcluster, UGT84A2 was discovered to be
a sinapic acid glucosyltransferase that supplies 1-O-sinapoylglucose as the
sinapoyl donor for anthocyanin sinapoyltransferase [35•]. Interestingly,
the cluster designation changes for some bifunctional flavonoid
glycosyltransferases depending on the number of independent components
(ICs) used for analysis. Flavonoid 3-O-glucosyltransferase, for example,



which can recognize flavonols and anthocyanins, falls into the flavonoid
cluster based on 8 ICs, but into the anthocyanin modification cluster based
on 10 ICs [35•].

To further infer inter-pathway interactions, gene coexpression analysis can
be expanded by conducting graph clustering and differential coexpression
analysis using distinct data sets (e.g. organ-specific data) [36•]. Gene
coexpression was analyzed using more than 300 tomato microarray data sets,
and coexpression modules were extracted by graph clustering, which can be
helpful for extracting densely clustered gene modules [37]. Gene ontology
terms were significantly enriched in 88% of the extracted coexpression
modules, suggesting that the genes in these modules are biologically
related. The analysis also showed that gene coexpression varied with the
organ data sets used. In the case of flavonoid biosynthetic genes, a strong
positive correlation between flavanone 3-hydroxylase and 4-coumarate:CoA
ligase was observed in fruits (r = 0.89), but not in roots (r = −0.23). In
the case of chalcone synthase (CHS) and chalcone isomerase, a strong
negative correlation was observed in roots (r = −0.72) and a weak positive
correlation in fruits (r = 0.50). Carotenoid biosynthetic genes showed
similar results. Flavonoids accumulate to high levels in leaves and fruits,
but not in roots, of tomato, suggesting that differential coexpression
indicates a change in gene coexpression relationships that may reflect
‘reprogramming’ of transcriptome networks between tissues. This approach
has been applied not only to plants but also to animals [36•,38–39].
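One common way to ask whether two correlation coefficients from different data sets differ is Fisher's z-transformation, sketched below. This is an illustrative assumption and not necessarily the test used in the cited tomato study; the correlation values are those quoted above for fruits and roots, whereas the sample sizes (30 per organ) are hypothetical because they are not given here.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Compare two independent Pearson correlations.

    Returns (z, p), where z is the standardized difference of the
    Fisher-transformed correlations and p is the two-sided p-value under a
    normal approximation. Requires |r| < 1 and more than 3 samples each.
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each data set needs more than 3 samples")
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# r values as quoted in the text; sample sizes are hypothetical.
z, p = fisher_z_difference(r1=0.89, n1=30, r2=-0.23, n2=30)
print(f"z = {z:.2f}, p = {p:.2e}")
```

When many gene pairs are screened in this way, the resulting p-values would also need correction for multiple testing.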

A bioinformatics approach termed ‘gene coordination’ was developed in order
to understand the coordinated response of gene networks to environmental
stimuli [40]. An analysis using >1000 genes encoding enzymes and
transcription factors revealed possible stress-associated intra-pathway and
inter-pathway interactions between genes from six energy-associated
pathways including the TCA cycle, glycolysis and photosynthesis [41]. This
approach may be useful for predicting gene function and for gaining a
deeper understanding of the interactions and crosstalk between pathways,
and thereby for elucidating whole biological systems in organisms.

Comparative network approaches transferring insights from model plants into crops

The accumulation of gene expression data from various

plants enables us to conduct gene coexpression network analysis in a range
of plant species. Furthermore, integration of sequence similarity and gene
(co)expression profiles allows identification of conserved coexpression
clusters among multiple plant species (so-called ‘comparative
coexpression’) [42]. As an example, comparative coexpression analysis was
performed with PlaNet using the CHS gene as a query [43•]. CHS belongs to
the polyketide synthase family and is the first committed enzyme in the
biosynthesis of flavonoids [44]. Conserved coexpression clusters containing
CHSs and other flavonoid biosynthetic genes from each plant (Medicago,
barley, soybean, wheat, rice, and poplar) were found to be the most similar
ones across the seven species. In addition, conserved coexpression clusters
containing Arabidopsis CHS-like genes were found. Arabidopsis has three
additional CHS-like polyketide synthase genes. At-PKS-B, one of these
polyketide synthases, is involved in fatty acid and phenolics biosynthesis
for pollen exine development. In accordance with this, the conserved
coexpression cluster for At-PKS-B contained several genes that are required
for the biosynthesis of polyamine, which is a constituent of the
sporopollenin surrounding the pollen grains.

Furthermore, statistical methods based on the number of orthologs between
coexpression modules have been proposed for proper cross-species
comparison, although PlaNet applies a permutation test as its statistical
model. For example, a method referred to as conserved modules across
organisms (COMODO) was developed to determine the statistically optimal
conserved coexpression module pairs between organisms [45]. In COMODO,
module ‘seed’ genes are selected from the gene-to-gene threshold matrices
and extended until optimality, as assessed by a Pearson’s chi-square test,
is reached; finally, the conserved module pair, consisting of genes in core
and variable parts, is reported [45]. In the case of Escherichia coli and
Bacillus subtilis, ca. 80 conserved module pairs linked through a
statistically significant set of homologous genes were identified. In those
conserved modules, genes in the variable parts, which account for 40% on
average, are specific to one organism, suggesting that these organisms have
acquired new members and/or have rewired the network during evolution. In
addition, the splitting of a coexpression module in one organism into two
modules in the other, and expression divergence of modules containing
duplicated genes, were observed. This result may suggest that we should
also take into account an evolutionary perspective, including both
evolutionary systems biology (i.e. how biological networks evolved; see
review [5]) and gene expression profiles, because plants have multigene
families encoding enzymes and transcription factors involved in metabolism.
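The idea of scoring a pair of modules by the number of orthologs they share can be sketched with a Pearson's chi-square test on a 2 x 2 table. This is a simplified illustration only, not the COMODO procedure itself, which additionally performs seed selection and module extension; the gene identifiers and the one-to-one ortholog list are hypothetical.

```python
import math


def module_overlap_chi2(ortholog_pairs, module_a, module_b):
    """Pearson's chi-square test (1 degree of freedom) for ortholog overlap.

    ortholog_pairs: iterable of (gene_in_species_a, gene_in_species_b).
    Returns (chi2, p). Each pair is classified by whether its first gene is
    in module_a and whether its second gene is in module_b.
    """
    both = only_a = only_b = neither = 0
    for gene_a, gene_b in ortholog_pairs:
        in_a, in_b = gene_a in module_a, gene_b in module_b
        if in_a and in_b:
            both += 1
        elif in_a:
            only_a += 1
        elif in_b:
            only_b += 1
        else:
            neither += 1
    total = both + only_a + only_b + neither
    margins = ((both + only_a) * (only_b + neither)
               * (both + only_b) * (only_a + neither))
    if margins == 0:
        return 0.0, 1.0  # degenerate table: no evidence either way
    chi2 = total * (both * neither - only_a * only_b) ** 2 / margins
    p = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 d.f.
    return chi2, p


# Hypothetical one-to-one orthologs a0-b0 ... a19-b19 between two species.
orthologs = [(f"a{i}", f"b{i}") for i in range(20)]
module_a = {"a0", "a1", "a2", "a3", "a4"}
module_b = {"b0", "b1", "b2", "b3", "b10"}
chi2, p = module_overlap_chi2(orthologs, module_a, module_b)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

With counts as small as these, the chi-square approximation is rough, and an exact test or a permutation test would be a reasonable alternative.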

Genome editing tools for new plant biotechnology

In parallel with progress of network analysis, new plant biotechnologies
including zinc-finger nuclease technology [46–47], oligonucleotide-directed
mutagenesis [48] and RNA-dependent DNA methylation [49] have been developed
and applied to model plants and crops [50]. These approaches are especially
useful for plants with long generation and fruition times.



Figure 2

[Figure: diagram linking the following boxes to ‘Custom breeding’ under the
label ‘Systems biology’: network analyses (omics data modeling,
stoichiometric modeling, kinetic modeling); genome editing technologies
(TALEN, ZFN, ODM, cisgenesis and intragenetics, RdDM, grafting, reverse
breeding, agro-infiltration); multi-omics data (transcriptome, epigenome,
proteome, interactome, metabolome, etc.); biochemical data; advanced
analytical technologies (RNA-Seq, ChIP-Seq, etc.).]

Schematic representation for custom breeding based on network analysis.
TALEN, transcription activator-like effector nuclease; ZFN, zinc finger
nuclease; ODM, oligonucleotide directed mutagenesis; RdDM, RNA-dependent
DNA methylation.

Transcription activator-like effector nuclease (TALEN) is a promising
genome editing tool applicable to nearly any organism [51–52]. Fusion of
transcription activator-like (TAL) effector proteins to the FokI nuclease
creates a site-specific nuclease for targeted DNA cleavage [51].
Disease-resistant rice was developed by using TALEN to introduce deletions
or insertions into the TAL effector-binding element (EBE) in the promoter
region of a disease-susceptibility gene, Os11N3, a member of the SWEET
sucrose-efflux transporter family [53••]. Xanthomonas oryzae pv. oryzae TAL
effectors bind the EBE and modify Os11N3 expression to divert sugar to the
pathogen's advantage. The mutations were designed to interfere with the
binding of the X. oryzae pv. oryzae TAL effectors AvrXa7 and PthXo3, but
not to affect the function of Os11N3.

Furthermore, gene activation and repression using engineered TAL effector
proteins have been reported [54–55]. The TAL effector protein Hax3 fused
with the EAR-repression domain, SRDX, efficiently repressed the target
RD29A genes [54]. Designer TAL effectors with modified repeat-variable
diresidues of the TALE repeat units activated the expression of the target
EGL3 and KNAT1 genes in Arabidopsis [55]. A publicly available
high-throughput system using TALEN in human cells [56] may be applicable to
plants.

Conclusions

Network analyses, especially transcriptome data modeling, have facilitated
the functional identification of unknown genes and the elucidation of
metabolic networks. They also suggest the fine-tuning of regulatory
mechanisms under different conditions. Differential and comparative network
approaches [42] may give us useful information about key regulators
controlling common biological processes among plant species and diversified
systems in specific plant(s). So far, the omics data that serve as a basis
for network analyses are still not precise or distinct, and these factors
contribute to the limitations seen in network analyses. It will be
important to carefully filter out inaccurate data before they are utilized
for network analysis and/or to develop robust bioinformatics analyses that
can tolerate a certain degree of inaccuracy present in large data sets
[57]. Further advances in analytical technologies like RNA-Seq [58–59],
although their data still contain a high degree of inaccuracy, should
generate more complete and precise omics data and exploit the power of
network analysis.

Network-based integration of multiple omics data is

another promising strategy [60] because key regulatory

changes leading to phenotypes of interest (e.g. altered

metabolite accumulation) do not necessarily occur at

transcriptional levels. This limitation can be addressed

by integrating multiple omics data.


Discovery of gene functions and inter-pathway interactions based on omics
modeling is an important first step. As more comprehensive and genome-wide
data become available at deep omics levels [59,61], integrated omics
modeling will shed light on plant metabolism at the system level and lead
to robust and directed metabolic engineering in plants (Figure 2).
Furthermore, stoichiometric modeling and kinetic modeling will become
critical future steps toward identification of targets for rational plant
metabolic engineering [62–63].

Acknowledgements

This work was partly supported by a Grant-in-Aid for Scientific Research on
Innovative Areas (to K.S.), Scientific Research (C) (to K. Y.-S.) and Young
Scientists (B) (to A.F.) from the Ministry of Education, Culture, Sports,
Science and Technology of Japan.

References and recommended reading

Papers of particular interest, published within the period of review, have
been highlighted as:

• of special interest

•• of outstanding interest

1. Bebek G, Koyuturk M, Price ND, Chance MR: Network biology methods integrating biological data for translational science. Brief Bioinform 2012, 13:446-459.

2. Ruffel S, Krouk G, Coruzzi GM: A systems view of responses to nutritional cues in Arabidopsis: toward a paradigm shift for predictive network modeling. Plant Physiol 2010, 152:445-452.


3. Yuan JS, Galbraith DW, Dai SY, Griffin P, Stewart CN Jr: Plant systems biology comes of age. Trends Plant Sci 2008, 13:165-171.

4. Fukushima A, Kusano M, Redestig H, Arita M, Saito K: Integrated omics approaches in plant systems biology. Curr Opin Chem Biol 2009, 13:532-538.

5. Chae L, Lee I, Shin J, Rhee SY: Towards understanding how molecular networks evolve in plants. Curr Opin Plant Biol 2012, 15:177-184.

6. Stitt M, Sulpice R, Keurentjes J: Metabolic networks: how to identify key components in the regulation of metabolism and growth. Plant Physiol 2010, 152:428-444.

7. Schellenberger J, Que R, Fleming RM, Thiele I, Orth JD, Feist AM, Zielinski DC, Bordbar A, Lewis NE, Rahmanian S et al.: Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nat Protoc 2011, 6:1290-1307.

8. Rohwer JM: Kinetic modelling of plant metabolic pathways. J Exp Bot 2012, 63:2275-2292.

9. Vidal M, Cusick ME, Barabasi AL: Interactome networks and human disease. Cell 2011, 144:986-998.

10. Fukushima A, Kusano M, Redestig H, Arita M, Saito K: Metabolomic correlation-network modules in Arabidopsis based on a graph-clustering approach. BMC Syst Biol 2011, 5:1.

11. Saito K, Hirai MY, Yonekura-Sakakibara K: Decoding genes with coexpression networks and metabolomics – ‘majority report by precogs’. Trends Plant Sci 2008, 13:36-43.

12. Kusano M, Fukushima A, Arita M, Jonsson P, Moritz T, Kobayashi M, Hayashi N, Tohge T, Saito K: Unbiased characterization of genotype-dependent metabolic regulations by metabolomic approach in Arabidopsis thaliana. BMC Syst Biol 2007, 1:53.

13. Kusano M, Tohge T, Fukushima A, Kobayashi M, Hayashi N, Otsuki H, Kondou Y, Goto H, Kawashima M, Matsuda F et al.: Metabolomics reveals comprehensive reprogramming involving two independent metabolic responses of Arabidopsis to UV-B light. Plant J 2011, 67:354-369.

14. Matsuda F, Hirai MY, Sasaki E, Akiyama K, Yonekura-Sakakibara K, Provart NJ, Sakurai T, Shimada Y, Saito K: AtMetExpress development: a phytochemical atlas of Arabidopsis development. Plant Physiol 2010, 152:566-578.

15. Gutierrez RA, Lejay LV, Dean A, Chiaromonte F, Shasha DE, Coruzzi GM: Qualitative network models and genome-wide expression data define carbon/nitrogen-responsive molecular machines in Arabidopsis. Genome Biol 2007, 8:R7.

16. Gifford ML, Dean A, Gutierrez RA, Coruzzi GM, Birnbaum KD: Cell-specific nitrogen responses mediate developmental plasticity. Proc Natl Acad Sci USA 2008, 105:803-808.

17. Katari MS, Nowicki SD, Aceituno FF, Nero D, Kelfer J, Thompson LP, Cabello JM, Davidson RS, Goldberg AP, Shasha DE et al.: VirtualPlant: a software platform to support systems biology research. Plant Physiol 2010, 152:500-515.

18. Vidal EA, Tamayo KP, Gutierrez RA: Gene networks for nitrogen sensing, signaling, and response in Arabidopsis thaliana. Wiley Interdiscip Rev Syst Biol Med 2010, 2:683-693.

19. Lee I, Ambaru B, Thakkar P, Marcotte EM, Rhee SY: Rational association of genes with traits using a genome-scale gene network for Arabidopsis thaliana. Nat Biotechnol 2010, 28:149-156.

20. Lee I, Seo YS, Coltrane D, Hwang S, Oh T, Marcotte EM, Ronald PC: Genetic dissection of the biotic stress response using a genome-scale gene network for rice. Proc Natl Acad Sci USA 2011, 108:18548-18553.

21. Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ: Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 2009, 32:1633-1651.


22. Tohge T, Fernie AR: Combining genetic diversity, informatics and metabolomics to facilitate annotation of plant gene function. Nat Protoc 2010, 5:1210-1227.

23. Yonekura-Sakakibara K, Tohge T, Niida R, Saito K: Identification of a flavonol 7-O-rhamnosyltransferase gene determining flavonoid pattern in Arabidopsis by transcriptome coexpression analysis and reverse genetics. J Biol Chem 2007, 282:14932-14941.

24. Hirai MY, Sugiyama K, Sawada Y, Tohge T, Obayashi T, Suzuki A, Araki R, Sakurai N, Suzuki H, Aoki K et al.: Omics-based identification of Arabidopsis Myb transcription factors regulating aliphatic glucosinolate biosynthesis. Proc Natl Acad Sci USA 2007, 104:6478-6483.

25. Yonekura-Sakakibara K, Tohge T, Matsuda F, Nakabayashi R, Takayama H, Niida R, Watanabe-Takahashi A, Inoue E, Saito K: Comprehensive flavonol profiling and transcriptome coexpression analysis leading to decoding gene-metabolite correlations in Arabidopsis. Plant Cell 2008, 20:2160-2176.

26. Okazaki Y, Shimojima M, Sawada Y, Toyooka K, Narisawa T, Mochida K, Tanaka H, Matsuda F, Hirai A, Hirai MY et al.: A chloroplastic UDP-glucose pyrophosphorylase from Arabidopsis is the committed enzyme for the first step of sulfolipid biosynthesis. Plant Cell 2009, 21:892-909.

27.•• Maeda H, Yoo H, Dudareva N: Prephenate aminotransferase directs plant phenylalanine biosynthesis via arogenate. Nat Chem Biol 2011, 7:19-21.

Based on gene coexpression analysis, prephenate aminotransferase (PPA-AT) was identified in Arabidopsis. The petunia homologue was isolated from an EST database by similarity with Arabidopsis PPA-AT. This is the first report of PPA-AT genes and a good example of the application of Arabidopsis research to a horticultural plant.

28.• Pfalz M, Mikkelsen MD, Bednarek P, Olsen CE, Halkier BA, Kroymann J: Metabolic engineering in Nicotiana benthamiana reveals key enzyme functions in Arabidopsis indole glucosinolate modification. Plant Cell 2011, 23:716-729.

Production of indole glucosinolate in N. benthamiana was achieved by introducing seven indole glucosinolate biosynthetic genes. Gene coexpression analysis delimited two O-methyltransferases as candidates involved in indole glucosinolate modification. The functions of the two O-methyltransferases were tested by coexpression with CYP81Fs in indole glucosinolate-producing N. benthamiana.

29. Barratt DH, Kolling K, Graf A, Pike M, Calder G, Findlay K, Zeeman SC, Smith AM: Callose synthase GSL7 is necessary for normal phloem transport and inflorescence growth in Arabidopsis. Plant Physiol 2011, 155:328-341.

30. Kong W, Vanderburg CR, Gunshin H, Rogers JT, Huang X: A review of independent component analysis application to microarray gene expression data. Biotechniques 2008, 45:501-520.

31. Hori G, Inoue M, Nishimura SI, Nakahara H: Blind gene classification—an application of a signal separation method. Genome Inform 2001, 12:255-256.

32. Frigyesi A, Veerla S, Lindgren D, Hoglund M: Independent component analysis reveals new and biologically significant structures in micro array data. BMC Bioinform 2006, 7:290.

33. Kong W, Mou X, Liu Q, Chen Z, Vanderburg CR, Rogers JT, Huang X: Independent component analysis of Alzheimer’s DNA microarray gene expression data. Mol Neurodegener 2009, 4:5.

34. Mueller LA, Zhang P, Rhee SY: AraCyc: a biochemical pathway database for Arabidopsis. Plant Physiol 2003, 132:453-460.

35.• Yonekura-Sakakibara K, Fukushima A, Nakabayashi R, Hanada K, Matsuda F, Sugawara S, Inoue E, Kuromori T, Ito T, Shinozaki K et al.: Two glycosyltransferases involved in anthocyanin modification delineated by transcriptome independent component analysis in Arabidopsis thaliana. Plant J 2012, 69:154-167.

Independent component analysis (ICA) with gene coexpression analysis was used to identify the candidate genes involved in anthocyanin biosynthesis. As a result, anthocyanin xylosyltransferase and sinapic acid glucosyltransferase were identified as glycosyltransferases involved in anthocyanin modification directly and indirectly. It is interesting that ICA provides a different perspective on microarray data analysis depending on the number of ICs.

36.• Fukushima A, Nishizawa T, Hayakumo M, Hikosaka S, Saito K, Goto E, Kusano M: Exploring tomato gene functions based on coexpression modules using graph clustering and differential coexpression approaches. Plant Physiol 2012, 158:1487-1502.

The authors found differential gene coexpression based on tomato microarray data sets and pointed out the possibility that differential coexpression reflects rewiring of the transcriptome networks in distinct organs and suggests key regulatory steps in the pathways.

37. Wang J, Li M, Deng Y, Pan Y: Recent advances in clustering methods for protein interaction networks. BMC Genomics 2010, 11(Suppl. 3):S10.

38. de la Fuente A: From ‘differential expression’ to ‘differential networking’ – identification of dysfunctional regulatory networks in diseases. Trends Genet 2010, 26:326-333.

39. Ideker T, Krogan NJ: Differential network biology. Mol Syst Biol 2012, 8:565.

40. Less H, Angelovici R, Tzin V, Galili G: Coordinated gene networks regulating Arabidopsis plant metabolism in response to various stresses and nutritional cues. Plant Cell 2011, 23:1264-1271.

41. Avin-Wittenberg T, Tzin V, Angelovici R, Less H, Galili G: Deciphering energy-associated gene networks operating in the response of Arabidopsis plants to stress and nutritional cues. Plant J 2012, 70:954-966.

42. Movahedi S, Van Bel M, Heyndrickx KS, Vandepoele K: Comparative co-expression analysis in plant biology. Plant Cell Environ 2012, 35:1787-1789.

43.• Mutwil M, Klie S, Tohge T, Giorgi FM, Wilkins O, Campbell MM, Fernie AR, Usadel B, Nikoloski Z, Persson S: PlaNet: combined sequence and expression comparisons across plant networks derived from seven species. Plant Cell 2011, 23:895-910.

PlaNet is a plant coexpression network browser for seven plant species. Integration of sequence similarity and gene expression profiles allows us to identify conserved coexpression clusters across multiple plant species.

44. Abe I, Morita H: Structure and function of the chalcone synthase superfamily of plant type III polyketide synthases. Nat Prod Rep 2010, 27:809-838.

45. Zarrineh P, Fierro AC, Sanchez-Rodriguez A, De Moor B, Engelen K, Marchal K: COMODO: an adaptive coclustering strategy to identify conserved coexpression modules between organisms. Nucleic Acids Res 2011, 39:e41.

46. Shukla VK, Doyon Y, Miller JC, DeKelver RC, Moehle EA, Worden SE, Mitchell JC, Arnold NL, Gopalan S, Meng X et al.: Precise genome modification in the crop species Zea mays using zinc-finger nucleases. Nature 2009, 459:437-441.

47. Townsend JA, Wright DA, Winfrey RJ, Fu F, Maeder ML, Joung JK, Voytas DF: High-frequency modification of plant genes using engineered zinc-finger nucleases. Nature 2009, 459:442-445.

48. Beetham PR, Kipp PB, Sawycky XL, Arntzen CJ, May GD: A tool for functional plant genomics: chimeric RNA/DNA oligonucleotides cause in vivo gene-specific mutations. Proc Natl Acad Sci USA 1999, 96:8774-8778.

49. Aufsatz W, Mette MF, van der Winden J, Matzke AJ, Matzke M: RNA-directed DNA methylation in Arabidopsis. Proc Natl Acad Sci USA 2002, 99(Suppl. 4):16499-16506.


50. Lusser M, Parisi C, Plan D, Rodriguez-Cerezo E: Deployment of new biotechnologies in plant breeding. Nat Biotechnol 2012, 30:231-239.

51. Bogdanove AJ, Voytas DF: TAL effectors: customizable proteins for DNA targeting. Science 2011, 333:1843-1846.

52. Scholze H, Boch J: TAL effectors are remote controls for gene activation. Curr Opin Microbiol 2011, 14:47-53.

53.•• Li T, Liu B, Spalding MH, Weeks DP, Yang B: High-efficiency TALEN-based gene editing produces disease-resistant rice. Nat Biotechnol 2012, 30:390-392.

TALEN is a promising genome editing technology which is applicable to gene activation and repression of desired genes. This report is the first example of conferring disease resistance to plants using TALEN technology.

54. Mahfouz MM, Li L, Piatek M, Fang X, Mansour H, Bangarusamy DK, Zhu JK: Targeted transcriptional repression using a chimeric TALE-SRDX repressor protein. Plant Mol Biol 2012, 78:311-321.

55. Morbitzer R, Romer P, Boch J, Lahaye T: Regulation of selected genome loci using de novo-engineered transcription activator-like effector (TALE)-type transcription factors. Proc Natl Acad Sci USA 2010, 107:21617-21622.

56. Reyon D, Tsai SQ, Khayter C, Foden JA, Sander JD, Joung JK: FLASH assembly of TALENs for high-throughput genome editing. Nat Biotechnol 2012, 30:460-465.

57. Ideker T, Dutkowski J, Hood L: Boosting signal-to-noise in complex biology: prior knowledge is power. Cell 2011, 144:860-863.

58. Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y: RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res 2008, 18:1509-1517.

59. Mochida K, Shinozaki K: Genomics and bioinformatics resources for crop improvement. Plant Cell Physiol 2010, 51:497-523.

60. Brady SM, Provart NJ: Web-queryable large-scale data sets for hypothesis generation in plant biology. Plant Cell 2009, 21:1034-1051.

61. Yonekura-Sakakibara K, Saito K: Functional genomics for plant natural product biosynthesis. Nat Prod Rep 2009, 26:1466-1487.

62. Heinzle E, Matsuda F, Miyagawa H, Wakasa K, Nishioka T: Estimation of metabolic fluxes, expression levels and metabolite dynamics of a secondary metabolic pathway in potato using label pulse-feeding experiments combined with kinetic network modelling and simulation. Plant J 2007, 50:176-187.

63. Colon AM, Sengupta N, Rhodes D, Dudareva N, Morgan J: A kinetic model describes metabolic response to perturbations and distribution of flux control in the benzenoid network of Petunia hybrida. Plant J 2010, 62:64-76.

64. Reshef DN, Reshef YA, Finucane HK, Grossman SR, McVean G, Turnbaugh PJ, Lander ES, Mitzenmacher M, Sabeti PC: Detecting novel associations in large data sets. Science 2011, 334:1518-1524.

