Examination of the Transcriptional Regulation and Downstream Targets of the Transcription Factor AtMYB61
by
Michael Prouse
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Department of Cell & Systems Biology, University of Toronto
© Copyright by Michael Prouse 2013
Thesis Abstract
The mechanisms by which a transcription factor elicits a given phenotype can be
complex. The aim of the research presented herein was to provide experimental
evidence to characterise the upstream and downstream regulation of the Arabidopsis
thaliana R2R3-MYB transcription factor, AtMYB61. To address these aims, three
separate experiments were undertaken.
First, three direct downstream target genes of AtMYB61 were predicted based on a two-
stage complete transcriptome analysis, using publicly available microarray datasets in
combination with a custom microarray dataset comparing the transcriptomes of WT,
atmyb61 and 35S::MYB61 plants. These candidate target genes encode the following
proteins: a KNOTTED1-like transcription factor, a caffeoyl-CoA 3-O-methyltransferase
and a pectin-methylesterase. AtMYB61 bound the 5′ non-coding regulatory regions of
these target genes, as determined by electrophoretic mobility shift assay.
Second, the preferred DNA-binding sites of recombinant AtMYB61 protein were
assessed with a cyclic amplification and selection of targets (CASTing) assay. Key
interactions between amino acids in the AtMYB61 DNA-binding site and nucleotides in
the preferred DNA targets were predicted by molecular modeling. While recombinant
AtMYB61 was sufficient to drive gene expression from CASTing-identified target DNA
sequences in yeast, it did so in a manner that was not entirely consistent with predicted
DNA-binding affinities determined by a nitrocellulose filter binding assay.
Finally, the molecular components that function upstream to modulate AtMYB61
expression were determined. AtMYB61 expression was found to be de-repressed by sucrose
through a mechanism involving its second intron. An over-represented motif was conserved
within the second intron of Brassicaceae AtMYB61 homologues and this motif
functioned as a binding target for a putative sugar-mediated repressor, as determined
by EMSA. Putative AtMYB61 repressor proteins that bound this motif in the absence of
sucrose were affinity purified and characterised using LC-MS/MS, and the proteins were
identified based on their MS fingerprints.
Acknowledgements
I thank my supervisor, Dr. Malcolm Campbell, for his ongoing mentorship and guidance
over the five years that I have had the pleasure to be in his laboratory. His tremendous
support and optimism have shaped me into the scientist I am today. Our father-son
relationship was something that I will always treasure, and for that I thank him. I would
also like to thank my committee members Dr. Darrell Desveaux and Dr. Keiko Yoshioka
and examiners, Dr. Dinesh Christendat, Dr. Daphne Goring and Dr. Shelley Hepworth,
for keeping my goals in sight and obtainable and for the constructive criticisms that I
needed to receive to reach the next level.
I would also like to thank the members of the Cell Systems Biology program, with whom
I spent countless hours discussing science, projects, and ideas. I would also like to
thank my lab mates, whom I have treated as my family and with whom I have shared some of my
fondest memories – Katharina Braeutigam, Thomas Cannam, Erin Hamanishi,
Katrina Hiiback, Hungwei Hou, Julia Nowak, Joan Ouellette, Sherosha Raj, Julia
Romano, Joseph Skaf, Michael Stokes, Heather Wheeler, and Olivia Wilkins. To my
longtime office mates Michael Stokes and Rohan Patel, I thank you for all the laughs,
pranks and great times that we shared over the years.
I am also grateful to my parents, Doris and Robert Prouse, who have constantly been
there for me throughout my life and have provided me with the guidance, unconditional
love, and support that I needed to succeed. Finally, I would like to thank my wife, Diana
– without you always loving and supporting me, I would never have made it this far.
You are everything to me and I can't wait to start our new family together.
"I'm a great believer in luck, and I find the harder I work, the more I have of it."
—Stephen Leacock
Table of Contents
Thesis Abstract
Acknowledgements
Table of Contents
List of Abbreviations
List of Tables
List of Figures
Chapter 1
1. Introduction
1.1 Transcription Factors
1.2 The Nature of MYB Proteins
1.2.1 The MYB Transcription Factor Superfamily
1.2.2 Animal MYB Proteins
1.2.3 Plant MYB Proteins
1.2.4 Single MYB Repeat Proteins
1.2.5 Expansion and Diversification of the MYB Family
1.3 DNA targets of MYB family members
1.3.1 Animal MYB DNA-Binding Sites
1.3.2 Plant MYB DNA-Binding Sites
1.3.3 The DNA Targets of Single MYB Repeat Proteins
1.4 The Nature of DNA-Binding by MYB Proteins
1.4.1 Relationship Between the MYB DNA-Binding Domain and DNA-Binding Specificity
1.4.2 Involvement of MYB Repeats in DNA Binding
1.4.3 The Nature of DNA Binding By Animal MYB Proteins
1.4.4 The Nature of DNA Binding By Plant MYB Proteins
1.5 Future of Plant MYB-DNA Interaction Studies
1.5.1 Determining the Breadth of MYB DNA Targets in vitro
1.5.2 Emerging Approaches for Plant MYB Target Discovery and Analysis in vivo
1.6 Transcriptional Regulation of MYB proteins
1.6.1 Regulators Affecting MYB Gene Expression in Networks
1.6.2 The Role of Introns on MYB Transcriptional Regulation
1.7 Research Hypotheses and Aims
1.8 Acknowledgements
Chapter 2
2 AtMYB61, an R2R3-MYB Transcription Factor, is a Pleiotropic Regulator of Plant Carbon Acquisition and Resource Allocation
2.1 Abstract
2.2 Introduction
2.3 Materials and Methods
2.3.1 Plant Material, Seed Sterilization and Growth Conditions
2.3.2 RNA Isolation and Quantitative PCR
2.3.3 Secondary Thickened Hypocotyls Stained with Phloroglucinol
2.3.4 Transmission Electron Microscopy
2.3.5 Microarray Analysis
2.3.6 Bioinformatic Analyses to Identify AtMYB61 Targets
2.3.7 Electrophoretic Mobility Shift Assay (EMSA)
2.3.8 Transcriptional Activation Assay
2.3.9 Fibre Quality Analysis
2.4 Results and Discussion
2.4.1 AtMYB61 Modulates the Expression of a Specific Set of Target Genes
2.4.2 AtMYB61 Regulates Genes with Specific Target Motifs in Their Promoters
2.4.3 AtMYB61 Regulates Genes Which Themselves Contribute to AtMYB61-Related Phenotypes
2.5 Conclusion
2.6 Acknowledgements
Chapter 3
3 Interactions between the R2R3-MYB Transcription Factor, AtMYB61, and Target DNA Binding Sites
3.1 Abstract
3.2 Introduction
3.3 Materials and Methods
3.3.1 Ethics Statement
3.3.2 Expression of Recombinant Protein in Bacteria
3.3.3 Antibody Production and Western Blot Analysis
3.3.4 Cyclic Amplification and Selection of Targets (CASTing)
3.3.5 Nitrocellulose Filter-Binding Assay
3.3.6 Electrophoretic Mobility Shift Assay (EMSA)
3.3.7 Molecular Modelling
3.3.8 Transcriptional Activation Assay
3.4 Results and Discussion
3.4.1 AtMYB61 Bound a Discrete Subset of DNA Target Sequences
3.4.2 AtMYB61 Bound to DNA Target Sequences with Varying Degrees of Affinity
3.4.3 The Affinity of AtMYB61 to Specific Target DNA Sequences Was Predicted by Molecular Interactions Determined in silico
3.4.4 The Affinity of AtMYB61 to Specific Target DNA Sequences Did Not Correlate with AtMYB61-Driven Transcriptional Activation with Each of the Target Sequences
3.4.5 CASTing Target Sequences Were Found in the Promoter Regions of Three Putative Direct Downstream Targets of AtMYB61
3.5 Conclusion
3.6 Acknowledgements
3.7 Supplemental Figures and Tables
Chapter 4
4 Novel Regulation of an R2R3-MYB Transcription Factor, AtMYB61, by a Non-Hexokinase Sugar-Signalling Pathway
4.1 Abstract
4.2 Introduction
4.3 Materials and Methods
4.3.1 Plant Material and Culture
4.3.2 Phylogenetic Analysis of AtMYB61 Brassicaceae Homologues
4.3.3 Analysis of Transgenic Plants Containing Promoter::Reporter Fusions
4.3.4 Semi-Quantitative PCR
4.3.5 Quantitative, Real-Time, Reverse Transcriptase Polymerase Chain Reaction (qRT-PCR)
4.3.6 Electrophoretic Mobility Shift Assay (EMSA)
4.3.7 Streptavidin Biotin Pull-Down Assay
4.3.8 Mass Spectrometry
4.4 Results and Discussion
4.4.1 AtMYB61 Expression is Regulated by Sugars
4.4.2 AtMYB61 Acts in a Pathway Independent of the Hexokinase Sugar Signalling Pathway
4.4.3 AtMYB61 Expression is Sugar Derepressed, Involving an Intragenic Sequence within the 5′ Coding Region Containing Two Introns
4.4.4 Affinity Purification Coupled with Mass Spectrometry Uncovers a Suite of Putative AtMYB61 Repressor Proteins that Bind the Conserved Second Intron Motif in a Sucrose-Dependent Manner
4.4.5 A Subset of Putative AtMYB61 Repressor Genes Are Sugar Sensitive
4.4.6 rmx Loss-of-Function Mutant Phenocopies Constitutive AtMYB61 Overexpression
4.5 Conclusion
4.6 Acknowledgements
4.7 Supplemental Figures and Tables
Chapter 5
5 General Conclusions and Future Directions
5.1 General Conclusions
5.2 Future Directions
Molecular Characterisations of Plant Transcription Factors
ChIP-Seq
Characterisations of Putative AtMYB61 Repressors
Appendices
A The Wound-, Pathogen-, and Ultraviolet B-Responsive MYB134 Gene Encodes an R2R3 MYB Transcription Factor that Regulates a Suite of Genes Involved in Proanthocyanidin Synthesis in Poplar
A.1 Abstract
A.2 Introduction
A.3 Materials and Methods
A.3.1 EMSA
A.4 Results and Discussion
A.4.1 MYB134 Binds to Promoter Regions of PA Biosynthetic Genes
A.5 Conclusion
A.6 Acknowledgements
B Study Labels
References
Copyright Acknowledgements
List of Abbreviations
35S Cauliflower Mosaic Virus 35S promoter
61P AtMYB61 promoter
61PN AtMYB61 promoter and 5′ intragenic sequences
2-DG 2-deoxyglucose
3-OMG 3-O-methylglucose
aba abscisic acid loss-of-function mutant
abi abscisic acid insensitive loss-of-function mutant
ABRC Arabidopsis Biological Resource Center
AC-1 AtMYB61 preferred target sequence-ACCTAC
AC elements adenosine and cytosine enriched sequences
ACT ACTIN
AMV avian myeloblastosis virus
AGRIS Arabidopsis Gene Regulatory Information Server
ANR2 ANTHOCYANIDIN REDUCTASE2
AtHXK Arabidopsis thaliana HEXOKINASE
atmyb61 Arabidopsis thaliana MYB61 loss-of-function mutant
BERF1 Barley Ethylene Response Factor1
BEIL1 Barley Ethylene Insensitive Like1
BGRF1 Barley Growth Regulating Factor1
bHTH basic helix-turn-helix
bHLH basic helix-loop-helix
C1 COLORED1
CAST cyclic amplification and selection of targets
CCoAOMT7 caffeoyl-CoA 3-O-methyltransferase
ChIP-chip chromatin immunoprecipitation on chip
ChIP-seq chromatin immunoprecipitation followed by high throughput sequencing
Col-0 wild-type Arabidopsis thaliana Columbia
CPC CAPRICE
DEPC diethylpyrocarbonate
DFR1 DIHYDROFLAVONOL REDUCTASE1
DOF DNA binding with one Finger
EMSA electrophoretic mobility shift assay
FLP FOUR LIPS
gin glucose insensitive loss-of-function mutant
GL1 GLABRA1
GL3 GLABRA3
GR glucocorticoid receptor
GS1b GLUTAMATE SYNTHETASE-1B
GSNO S-nitrosoglutathione
GTFs general transcription factors
GUS β-glucuronidase
hxk hexokinase loss-of-function mutant
IBP indicator binding protein group
IFN-γ human interferon-γ
irx11 irregular xylem11/knat-7 loss-of-function mutant
Kd dissociation constant
KNAT7 KNOTTED1-like transcription factor
LACC Local Animal Care Committee
LC-MS/MS liquid chromatography tandem mass spectrometry
LCR locus control region
MBS MYB binding site
MIAME minimum information about a microarray experiment
MEME Multiple Em for Motif Elicitation
MHL mannoheptulose
MS Murashige Skoog
MSA M phase-specific activator element
MUG methylumbelliferone-glucuronide
NASC Nottingham Arabidopsis Stock Centre
NBS non-binding site of AtMYB61
PA proanthocyanidins
PAL1 PHENYLALANINE AMMONIA-LYASE1
PBF Pyrimidine-box Binding Factor
PCR polymerase chain reaction
PDB Protein Data Bank
PG phenolic glycosides
PME pectin-methylesterase
PHYRE Protein Homology/analogY Recognition Engine
PLACE PLAnt Cis-Element database
qRT-PCR Quantitative, real-time, reverse transcriptase polymerase chain reaction
R MYB repeat
RAmy1a RICE ALPHA-AMYLASE
rmx repressor of myb expression loss-of-function mutant
RMX REPRESSOR OF MYB EXPRESSION
SBEI STARCH-BRANCHING ENZYME I
SELEX systematic evolution of ligands by exponential enrichment
SMH single MYB histone group
SNP sodium nitroprusside
Sus3 sucrose synthase 3
TAIR The Arabidopsis Information Resource
TRANSFAC Transcription Factor Database
TRFL TRF1/2-LIKE genes
UACC University of Toronto Animal Care Committee
UTR untranslated regions
WBS WER-binding site
WER WEREWOLF
WT wild-type
List of Tables
1 Introduction
1.1 DNA binding specificities of members of the MYB superfamily
2 AtMYB61, an R2R3-MYB transcription factor, is a pleiotropic regulator of plant carbon acquisition and resource allocation
2.1 Genes that share transcript abundance profiles with AtMYB61 determined by Pearson correlation coefficient, across the AtGenExpress developmental baseline dataset
2.2 Genes that share transcript abundance profiles with AtMYB61 determined by Pearson correlation coefficient, across the AtMYB61 microarray dataset
2.3 AC elements within the promoters of putative downstream targets
3 Interactions between the R2R3-MYB transcription factor, AtMYB61, and target DNA binding sites
3.1 Alignment of AtMYB61 binding sites
3.2 AtMYB61 consensus sequence was derived from a comparison of 89 sequences recovered from 5 cycles of CASTing
3.3 Dissociation constants (Kd) in mol/L and associated errors of CASTing targets
3.4 Dissociation constants (Kd) in mol/L and associated errors of mutated ACCTAC (AC1 element) sequences
S3.1 Relative binding of CASTing targets and mutated AC1 sequences to AtMYB61
4 Novel regulation of an R2R3-MYB transcription factor, AtMYB61, by a non-hexokinase sugar-signalling pathway
4.1 List of putative repressors of AtMYB61 expression (RMX) that bound AtMYB61 second intron repeat
S4.1 AtMYB61 second intron repeat motif identified within all Arabidopsis thaliana genes
S4.2 AtMYB61 second intron repeat motif identified within all Arabidopsis thaliana intergenic regions
S4.3 AtMYB61 second intron repeat motif identified within all Arabidopsis thaliana introns and corresponding transcript response to sugar
List of Figures
1 Introduction
1.1 Schematic representation of an R2R3-MYB transcription factor
1.2 Phylogenetic relationships and subgroup designations for 87 MYB superfamily members
2 AtMYB61, an R2R3-MYB transcription factor, is a pleiotropic regulator of plant carbon acquisition and resource allocation
2.1 Transcript abundance of a subset of genes in the Arabidopsis thaliana transcriptome is influenced by the presence or absence of AtMYB61 activity
2.2 AtMYB61 binds to the promoters of putative downstream targets, to motifs that are over-represented in these promoters and is sufficient to activate transcription from these motifs
2.3 AtMYB61 binding to the 5′ non-coding sequences of the three putative target genes as determined by EMSA
2.4 AtMYB61 downstream target genes have an impact on secondary wall formation and xylem formation in secondary thickened hypocotyls
3 Interactions between the R2R3-MYB transcription factor, AtMYB61, and target DNA binding sites
3.1 Cyclic amplification and selection of targets (CASTing) recovered a suite of hexamer target sequences that bound to AtMYB61
3.2 Relative binding affinities of AtMYB61 to CASTing targets and to mutated ACCTAC motif determined by nitrocellulose filter-binding assays are confirmed by electrophoretic mobility shift assays (EMSAs)
3.3 Molecular modelling of AtMYB61 with target sequences confirm binding preferences determined by nitrocellulose filter-binding assays and EMSAs
3.4 AtMYB61-mediated activation of promoter activity in Saccharomyces cerevisiae in an AC dependent fashion
3.5 Sequences recovered from the CASTing assay were found in all three promoter regions of predicted direct downstream targets of AtMYB61, namely KNOTTED1-like transcription factor (KNAT7, At1g62990); caffeoyl-CoA 3-O-methyltransferase (CCoAOMT7, At4g26220), and pectin-methylesterase (PME, At2g45220)
S3.1 AtMYB61 antibody generation and validation
4 Novel regulation of an R2R3-MYB transcription factor, AtMYB61, by a non-hexokinase sugar-signalling pathway
4.1 Sugar regulation of AtMYB61 expression in dark-grown wild-type seedlings, 7 days post-germination
4.2 Promoter-reporter and qRT-PCR analysis of AtMYB61 expression in response to sugars
4.3 qRT-PCR analysis of AtMYB61 and HXK-2 expression in wild-type (WT) and glucose insensitive (gin) loss-of-function mutants
4.4 Analysis of AtMYB61 promoter-reporter fusion constructs that contain or do not contain AtMYB61 5′ intragenic sequences in response to sucrose
4.5 Phylogenetic footprinting identifies a conserved repeat motif in the second intron of AtMYB61 Brassicaceae homologues
4.6 EMSA shows AtMYB61 second intron motif bound differentially by proteins in nuclear extracts from seedlings grown in the absence or presence of sucrose in the dark, consistent with the derepression model
4.7 Affinity purification coupled with LC-MS/MS determines putative AtMYB61 repressor proteins that bound the second intron repeat
4.8 qRT-PCR of putative repressors of AtMYB61 expression loss-of-function mutants (rmx) that had AtMYB61 misexpression in seedlings grown in the absence of sucrose in the dark, validating the repressor hypothesis
4.9 Phenotypes of Arabidopsis thaliana wild-type (WT) plants, AtMYB61 loss-of-function mutants (atmyb61), AtMYB61 over-expressor mutants (35S::MYB61) and At2g43970 loss-of-function mutants (rmx3)
S4.1 Sequence alignment of the second intron of Brassicaceae AtMYB61 homologues
S4.2 Sequence alignment of AtMYB61 and AtMYB50 reveals no second intron repeat within AtMYB50 second intron
S4.3 EMSA shows AtMYB61 second intron motif bound differentially by proteins in nuclear extracts from seedlings grown in the absence or presence of sucrose in the dark, consistent with the derepression model
S4.4 Validation of biotinylation of AtMYB61 second intron and second intron repeat
S4.5 Semi-quantitative PCR of AtMYB61 expression in repressors of AtMYB61 expression loss-of-function mutant (rmx) seedlings grown in the absence or presence of sucrose in the dark
S4.6 At2g43970 and At1g09540 share inverse transcript abundance profiles across development
A Appendix. The wound-, pathogen-, and ultraviolet B-responsive MYB134 gene encodes an R2R3 MYB transcription factor that regulates a suite of genes involved in proanthocyanidin synthesis in Poplar
A.1 MYB134 binds to the promoters of putative downstream target genes
Chapter 1
Introduction
This chapter contains the following publication in its entirety:
Prouse M.B., and Campbell M.M. (2012) The interaction between MYB proteins and
their target DNA binding sites. Biochimica Et Biophysica Acta-Gene Regulatory
Mechanisms. 1819: 67-77.
Contributions: MBP, MMC designed research; MBP, MMC analyzed data; MBP, MMC
wrote and edited manuscript.
MBP contributed specifically to each figure and table in this chapter.
Copyright: Sections 1.1 to 1.6 inclusive are copyrighted by Elsevier B.V.
1. Introduction
1.1 Transcription Factors
In eukaryotic organisms, gene expression is subject to complex patterns of spatial and
temporal regulation. The first step of transcriptional regulation of any gene is
orchestrated by the activity of sequence-specific transcription factors, proteins that
function to reconfigure gene expression in response to external and internal cues.
Sequence-specific transcription factors frequently have a modular structure –
comprising a DNA-binding domain together with a transcriptional regulatory domain
(Colladovides et al., 1991). The DNA-binding domains of transcription factors are highly
conserved, while their transcriptional regulatory domains are variable (Schwechheimer
and Bevan, 1998). Sequence-specific transcription factors can act as transcriptional
activators, repressors, or both (Maniatis et al., 1987).
In eukaryotes, transcription factors that promote transcription are termed activator
proteins. Transcriptional activators can promote transcription of protein coding genes in
numerous ways. Activator proteins can bind a cognate target DNA site to directly or
indirectly recruit RNA polymerase II and general transcription factors (GTFs) that in turn
carry out transcription of a gene (Schwechheimer and Bevan, 1998; Lee and Young,
2000). Activator proteins can also affect the rate of transcription of a gene through
interactions with RNA polymerase II and GTFs (Lee and Young, 2000). Finally,
activator proteins can promote the acetylation of histone proteins, making the DNA more
accessible for transcription (Cosma et al., 1999). Transcriptional activators accomplish
these tasks by directly or indirectly recruiting other proteins with this catalytic activity to
the DNA target.
Sequence-specific transcription factors that reduce transcription are transcriptional
repressors. These proteins act in three ways: (i) by binding to a cognate DNA site to
block the binding of general transcription factors or activators; (ii) by blocking
transcription by means of inhibitory interaction with general transcription factors or
activators; or (iii) by altering the higher-order DNA structure in a way to inhibit
transcription (HannaRose and Hansen, 1996). Repressors can reduce the rate of
transcription, or suppress it altogether.
Large families, or superfamilies, of activator and repressor proteins have evolved in
eukaryotes. These are categorised based on the similarities of the DNA-binding
domain, with several such groups composed of one hundred or more members (Pabo
and Sauer, 1992; Yanhui et al., 2006). The MYB superfamily is one of the largest and
most diverse families of sequence-specific transcription factors (Rosinski and Atchley,
1998; Riechmann et al., 2000).
Much is known about the specifics of the interaction between animal MYB proteins and
their cognate DNA binding sites. By contrast, the knowledge of the details of MYB-DNA
interactions in plants is rather incomplete. This introduction will consider the current
state of knowledge with respect to MYB-DNA interactions in animals, and contrast this
with what is known in plants, suggesting means by which the gap in knowledge in plants
can be addressed. Moreover, this introduction will address how MYB proteins are
regulated to elicit their downstream responses.
1.2 The Nature of MYB Proteins
1.2.1 The MYB Transcription Factor Superfamily
The MYB superfamily is found in all major eukaryotic lineages, and is thought to be
more than 1 billion years old (Lipsick, 1996; Rosinski and Atchley, 1998; Kranz et al.,
2000; Wilkins et al., 2009). MYB proteins acquired their name from v-MYB, the
oncogenic component of avian myeloblastosis virus (AMV), where the sequence-
specific MYB domain was initially discovered (Peters et al., 1987). The cellular
counterpart of v-MYB is c-MYB, a MYB protein that plays a critical role in controlling the
proliferation and differentiation of hematopoietic cells (Mucenski et al., 1991). c-MYB
mutations that alter target gene expression drastically reduce the proliferation of
hematopoietic cells (Gewirtz and Calabretta, 1988). In keeping with this, homozygous
c-MYB knock-out lines of mice die before reaching day 15 of the fetal lifecycle due to
the inability to sustain hepatic erythropoiesis (Mucenski et al., 1991).
MYB superfamily members are characterised by a highly conserved DNA-binding
domain, referred to as the MYB domain, which consists of up to four imperfect amino
acid repeats (R1, R2, R3 and R4) of 50-53 amino acids (Fig. 1.1) (Rosinski and Atchley,
1998). Each of the MYB repeats within the MYB domain gives rise to a helix-helix-
turn-helix secondary structure (Fig. 1.1). The MYB domain is predominantly found
within the N-terminus of MYB proteins (Fig. 1.1) (Stracke et al., 2001); however, MYB
domains have recently also been discovered within the C-termini of MYB proteins
(Linger and Price, 2009). Each MYB repeat contains several highly conserved
tryptophan residues that are regularly spaced, forming a hydrophobic core (Fig.
1.1) (Ogata et al., 1994). In contrast to the MYB domain, the C-terminal region of MYB
proteins is characteristically highly variable from one MYB protein to another, and
usually functions as either an activation or repression domain (Jin and Martin, 1999;
Kranz et al., 2000; Stracke et al., 2001; Jia et al., 2004). This gives rise to a wide range
of variability both structurally and functionally within the MYB superfamily.
In animals, the MYB superfamily is relatively small, generally comprising four or five
proteins (Lipsick, 1996; Konig et al., 1998; Rosinski and Atchley, 1998; Wong et al.,
1998). Animal MYB superfamily members regulate gene expression related to cell
division or a discrete subset of cellular differentiation events (Biedenkapp et al., 1988;
Golay et al., 1991; Howe and Watson, 1991). By contrast, the MYB superfamily in
plants has expanded dramatically, with 100-200 MYB family members commonly found
in individual plant species (Dubos et al., 2010). In plants, MYB proteins regulate a vast
array of biochemical, cellular and developmental processes (Martin and PazAres, 1997;
Jin and Martin, 1999; Dubos et al., 2010).
1.2.2 Animal MYB Proteins
As is the case with c-MYB, animal MYB superfamily members typically contain three MYB
repeats (Howe et al., 1990; Luscher and Eisenman, 1990; Ogata et al., 1994), although
there are some notable exceptions, including human SNAPc 190
and TRF1 (Konig et al., 1998; Wong et al., 1998). In all annotated vertebrate genomes,
there are only three MYB proteins with three MYB repeats: A-MYB, B-MYB, and c-MYB
(Lipsick, 1996; Rosinski and Atchley, 1998). A-MYB and B-MYB proteins are R1R2R3-
MYB nuclear transcription factors expressed in hematopoietic cells, epithelial cells, and
fibroblasts (Nomura et al., 1988). A-MYB negatively regulates cellular proliferation
(Golay et al., 1991), while B-MYB positively regulates cell growth and has roles in
differentiation and cancer (Sala and Watson, 1999).
Figure 1.1. Schematic representation of an R2R3-MYB transcription factor. The primary structure, secondary structure and protein-DNA model are indicated for an R2R3-MYB transcription factor. MYB proteins are classified depending on the number of adjacent MYB repeats (R). Each MYB repeat gives rise to a helix-helix-turn-helix secondary structure that is involved in sequence-specific binding. The model of an R2R3-MYB transcription factor binding to the major groove of its target sequence was generated with PyMOL. H, helix; T, turn; W, tryptophan; X, amino acid; red, helix secondary structure; green, turn secondary structure; yellow, DNA target.
1.2.3 Plant MYB Proteins
In comparison to animals, the MYB superfamily is greatly expanded in plants (Stracke et
al., 2001; Jia et al., 2004; Wilkins et al., 2009). For example, of the over 1600
sequence-specific transcription factors identified in the genome of the model
dicotyledonous plant, Arabidopsis thaliana, almost 10% are members of the MYB
transcription factor family (Riechmann et al., 2000; Dubos et al., 2010). In contrast to
animals, Arabidopsis thaliana has 5 three-repeat MYB proteins and 126 two-repeat
(R2R3) MYB proteins (Martin and PazAres, 1997; Arabidopsis Genome, 2000;
Riechmann et al., 2000; Stracke et al., 2001; Yanhui et al., 2006; Dubos et al., 2010),
while the monocotyledonous plant rice (Oryza sativa) has 109 predicted R2R3-MYB
proteins (Yanhui et al., 2006). In addition, single-repeat MYBs have also been identified
in plants and animals in increasing numbers (Baranowskij et al., 1994; Carre and Kay,
1995; Feldbrugge et al., 1997; Konig and Rhodes, 1997; Schaffer et al., 1998; Koering
et al., 2000; Alabadi et al., 2001; Chen et al., 2001; Hwang et al., 2001; Nishikawa et al.,
2001; Lu et al., 2002; Mohrmann et al., 2002; Li and de Lange, 2003; Marian et al.,
2003; Maxwell et al., 2003; Court et al., 2005; Xue, 2005; Fukuzawa et al., 2006; Lira et
al., 2007; Ko et al., 2008; Liao et al., 2008; Pitt et al., 2008; Ehrenkaufer et al., 2009; Ko
et al., 2009; Rawat et al., 2009; Lang and Juan, 2010; Yi et al., 2010; Yu et al., 2010).
Although single-repeat MYB proteins have been identified in both animals and plants,
the majority of single-repeat MYB proteins in plants have not been characterised.
As their name implies, R2R3-MYB proteins have two MYB repeats (Stracke et al.,
2001). R2R3-MYB proteins comprise the largest group of MYB transcription factors in
the MYB superfamily and appear to be specific to plants (Dubos et al., 2010). Plant
R2R3-MYB proteins regulate a myriad of processes, including primary and secondary
metabolism; regulation of cell fate and identity; regulation of plant development; and
responses to biotic and abiotic stresses (Pazares et al., 1987; Martin and PazAres,
1997; Glover et al., 1998; Jin and Martin, 1999; Martin et al., 2002; Patzlaff et al.,
2003a; Patzlaff et al., 2003b; Gomez-Maldonado et al., 2004; Jia et al., 2004; Liang et
al., 2005; Dubos et al., 2010). While analogous processes, such as regulation of cell
fate and identity, can be found in animals, the precise functions associated with R2R3-
MYB proteins appear to be plant specific (Martin and PazAres, 1997; Jin and Martin,
1999; Dubos et al., 2010).
1.2.4 Single MYB Repeat Proteins
Single MYB repeat proteins can be classified into the following two groups: 1) proteins
with the MYB domain at the C-terminus (Initiator Binding Protein (IBP) group), and 2) proteins
with the MYB domain at the N-terminus (Single MYB Histone (SMH) group). The IBP group
of proteins includes RTBP1 from rice, AtTRP1 and AtTBP1 from Arabidopsis thaliana
(Konig et al., 1998; Chen et al., 2001; Hwang et al., 2001), as well as the well-
characterised telomeric DNA-binding proteins TRF1, TRF2, RAP1 and Taz1. SMH
proteins are a novel group of single MYB proteins that have only been identified in
plants. The SMH group of proteins includes PcMYB1 from Petroselinum crispum, AtTRB1,
AtTRB2 and AtTRB3 from Arabidopsis thaliana, and Smh1 from maize. AtTRB1, AtTRB2
and AtTRB3 have been studied in detail; all share a single MYB repeat that is more similar
to R2 than to R1 or R3 (Marian et al., 2003). In Arabidopsis thaliana, the single-repeat MYB
proteins CAPRICE (CPC), TRYPTICHON (TRY), ENHANCER OF TRY AND CPC 1 (ETC1)
and ETC2 have also been identified (Schellmann et al., 2002; Kirik et al., 2004).
1.2.5 Expansion and Diversification of the MYB Family
Two theories of how the MYB superfamily evolved have been constructed based on
parsimony (Lipsick, 1996). The first is formulated on the premise that three-repeat MYB
proteins are closely related to vertebrate c-MYB and other similar three-repeat MYB
proteins in other eukaryotic groups, such as ciliates and slime molds (Braun and
Grotewold, 1999; Yang et al., 2003b). These primitive proteins are predicted to have
existed before the divergence between animals and plants (Yang et al., 2003b). This
theory proposes that R2R3-MYB proteins originated recently from three-repeat MYB
proteins through loss of the R1 repeat (Braun and Grotewold, 1999; Dias et al., 2003).
The second theory postulates that a domain duplication within an ancient R2R3
predecessor led to the gain of R1, suggesting that R2R3 is a precursor of
MYB3R (Jiang et al., 2004a). Common to both theories is a vast expansion of
R2R3-MYB proteins in plants via duplications of entire genes (Lipsick, 1996); however,
the expansion was restricted for the three-repeat MYB proteins in both animals and
plants. Comparisons of DNA-binding specificities and functional roles between MYB
proteins with different repeats could help elucidate the nature of the evolutionary
pathway for MYB proteins.
1.3 DNA targets of MYB family members
1.3.1 Animal MYB DNA-Binding Sites
The DNA target of animal three-repeat MYB transcription factors was first determined
by isolation of chicken genomic DNA fragments bound by v-MYB on filters (Biedenkapp
et al., 1988) and by comparison of putative MYB binding sites within the SV40 enhancer
region (Nakagoshi et al., 1990). Binding-site selection methods with c-MYB protein
subsequently added minor extensions to the c-MYB consensus sequence. The c-MYB
consensus sequence was found to be (T/C)AAC(G/T)G(A/C/T)(A/C/T) and was termed
MYB binding site I (MBSI) (Howe et al., 1990; Weston, 1992). Mutational assays
validated by NMR structural data revealed that the MBSI sequence is bipartite. The
first half-site, (T/C)AAC, makes the majority of its specific contacts with R3, and the second
half-site, (G/T)G(A/C/T)(A/C/T), makes specific contacts with R2 (Tanikawa et al., 1993;
Ogata et al., 1994; Ording et al., 1994). Following identification of the c-MYB DNA-
binding site, mammalian A-MYB and B-MYB were subsequently shown to bind MBSI
(Mizuguchi et al., 1990; Watson et al., 1993; Ma and Calabretta, 1994; Jin and Martin,
1999).
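
The degenerate notation used for MBSI can be read as a simple pattern: each parenthesised group lists the bases tolerated at one position. As an illustration only, the following Python sketch converts this notation into a regular expression and scans both strands of a DNA sequence for matches. The function names (consensus_to_regex, reverse_complement, find_sites) and the example sequences are hypothetical and were chosen for this illustration; they are not drawn from the studies cited above, and a literal pattern match says nothing about binding affinity.

```python
import re

# MBSI consensus as written in the text; each "(x/y)" group is one position.
MBSI = "(T/C)AAC(G/T)G(A/C/T)(A/C/T)"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus):
    """Turn '(T/C)AAC(G/T)G' style notation into '[TC]AAC[GT]G'."""
    def to_class(match):
        return "[" + match.group(1).replace("/", "") + "]"

    pattern = re.sub(r"\(([ACGT](?:/[ACGT])+)\)", to_class, consensus)
    # N stands for any of the four bases.
    return pattern.replace("N", "[ACGT]")


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, consensus):
    """Return sorted (start, strand, site) tuples for matches on either strand.

    'start' is 0-based and always given in forward-strand coordinates;
    'site' is the matched sequence read 5' to 3' on the reported strand.
    """
    # A lookahead lets overlapping sites be reported as well.
    regex = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in regex.finditer(seq)]
    for m in regex.finditer(reverse_complement(seq)):
        site = m.group(1)
        hits.append((len(seq) - m.start() - len(site), "-", site))
    return sorted(hits)


if __name__ == "__main__":
    print(consensus_to_regex(MBSI))
    print(find_sites("GGTAACGGTCAA", MBSI))  # made-up test sequence
```

Scanning both strands matters because the same site can be reported in either orientation: for example, CAGTTA read on the opposite strand is TAACTG, which matches the first six positions of MBSI.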
1.3.2 Plant MYB DNA-Binding Sites
Although R1R2R3-MYB proteins in plants share the same functionality as animal
R1R2R3-MYB family members, their DNA-binding specificities are different (Howe and
Watson, 1991; Weston, 1992; Ito, 2005). All three characterised animal three-repeat
MYB proteins bind to the same sequence MBSI ((T/C)AAC(G/T)G(A/C/T)(A/C/T)) and
have similar functions in cell-cycle control (Biedenkapp et al., 1988; Golay et al., 1991;
Howe and Watson, 1991). In comparison, plant three-repeat MYB proteins, such as
tobacco MYBA1, MYBA2, and MYBB have an important role at the G2/M phase of the
cell-cycle, by regulating transcription of cyclin B and other cell-cycle genes that are
expressed at a similar time in the cell-cycle (Ito et al., 1998). Through a yeast one-
hybrid screen, NtMYBA1, NtMYBA2, and NtMYBB were found to bind to AACGG. This
consensus sequence is known as the M phase-specific activator (MSA) element, and
was identified previously in tobacco.
Relatively few of the possible plant R2R3-MYB DNA targets have been characterised;
but some common elements of plant MYB-DNA interactions have emerged (Fig. 1.2,
Table 1.1). Recognition of plant MYB DNA targets was first determined with studies
conducted on the maize P protein, an R2R3-MYB protein involved in flavonoid
biosynthesis (Grotewold et al., 1994). Through binding-site selection assays and
EMSAs, P was shown to bind to ACC(A/T)ACC(A/C/T). This contrasted with the animal
MYB DNA consensus sequence of ((T/C)AAC(G/T)G(A/C/T)(A/C/T)), but was a
harbinger for the majority of plant MYB proteins, which recognise MBSI
((T/C)AAC(G/T)G(A/C/T)(A/C/T)), MBSII (AGTTAGTTA), and MBSIIG
((C/T)ACC(A/T)A(A/C)C). Nevertheless, it is important to note that not all plant MYB
proteins, especially within the R2R3-MYB family, recognise these motifs (Romero et al.,
1998). Many R2R3-MYB transcription factors recognise AC elements, DNA motifs that
are enriched in adenine and cytosine residues (Grotewold et al., 1994; Sablowski et
al., 1994; Sablowski et al., 1995; Moyano et al., 1996; Sainz et al., 1997; Uimari and
Strommer, 1997; Tamagnone et al., 1998; Jin et al., 2000; Sugimoto et al., 2000; Yang
et al., 2001; Patzlaff et al., 2003a; Patzlaff et al., 2003b; Fukuzawa et al., 2006). Some
R2R3-MYB proteins function as transcriptional activators at these sites (Patzlaff et al.,
2003a; Patzlaff et al., 2003b), while others function as transcriptional repressors
Figure 1.2. Phylogenetic relationships and subgroup designations for 87 MYB superfamily members. The unrooted phylogenetic tree was generated using the amino acid sequences of the MYB proteins in Table 1.1. Whole MYB protein sequences were downloaded from The Arabidopsis Information Resource (TAIR; http://Arabidopsis.org) and from the National Center for Biotechnology Information protein database (NCBI Entrez; http://www.ncbi.nlm.nih.gov/sites/entrez). The phylogenetic analysis included 9 three-repeat MYB proteins (R1R2R3-MYB proteins), 50 two-repeat MYB proteins (R2R3-MYB proteins) and 28 one-repeat MYB proteins (R1-MYB proteins). The full-length amino acid sequences were aligned with Multiple Alignment using Fast Fourier Transform (MAFFT) and the G-INS-i algorithm (Katoh et al., 2005). A neighbour-joining tree was constructed using Molecular Evolutionary Genetics Analysis 4 (MEGA 4) (Tamura et al., 2007) with the Jones-Taylor-Thornton substitution model and a Gamma parameter of 1.0 to account for the uneven rates of substitution across the length of the MYB proteins. Pairwise gap deletion was used, with 1000 bootstrap replicates. DNA-binding sites for MYB proteins were obtained from the literature. MYB proteins are annotated by colour based on DNA sequence recognition. Red, blue, green, orange, purple and grey represent MYB proteins that bind CNGTT(A/G), ACC(A/T)A(A/C), TTAGGG, AAAATATCT, GATA and TATCCA, respectively. Black represents MYB proteins that do not bind to an assigned group. N indicates adenine, guanine, cytosine or thymine. * indicates that the MYB protein DNA-binding specificity differs slightly from the consensus sequence of its group. Refer to Table 1.1 for specific details on DNA sequences bound by the MYB proteins.
Table 1.1. DNA-binding specificities of members of the MYB superfamily. The information in the table represents the current state of knowledge pertaining to the DNA targets of MYB proteins, as determined from the literature. N indicates adenine, guanine, cytosine or thymine. * indicates that the MYB protein DNA-binding specificity differs slightly from the consensus sequence of its group.
Group MYB Protein Binding Site Species MYB REPEAT References
1. CNGTT(A/G) *p85 AACGGT Drosophila melanogaster R1R2R3 Beall et al., 2002
GCAGTTT
At MYB1 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993
At MYB2 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993
AAACCA Hoeren et al., 1998
*AtMYB46 GTT(A/T)GTT(A/G) Arabidopsis thaliana R2R3 Ramirez et al., 2011
At MYB66 or WER CNGTT(A/G)G Arabidopsis thaliana R2R3 Koshino-Kimura et al., 2005; Ryu et al., 2005
AGTAGTTA
At MYB77 AAAAAACGGTTA Arabidopsis thaliana R2R3 Romero et al., 1998
*At MYB98 ANGTTAC Arabidopsis thaliana R2R3 Punwani et al., 2007; Punwani et al., 2008
*At MYBGL1 AAAGTTAGTTA Arabidopsis thaliana R2R3 Oppenheimer et al., 1991; Telfer et al., 1997
DUO1 CGGTTA Arabidopsis thaliana R2R3 Borg et al., 2011
Eh MYB10 CCGTTA Entamoeba histolytica R2R3 Menese et al., 2010
gMYB2 CTGT(A/T)G Giardia lamblia R2R3 Sun et al., 2002
GTTT(G/T)(G/T) Yang et al., 2003
CTGTTG Huang et al., 2008
CTGTAG
CAGTAG
GTGTAG
GmMYB76 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008
ATCCTTTTTTCCGG
GmMYB92 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008
ATCCTTTTTTCCGG
GGTAGGTGAGA
GmMYB177 AAAAAACCGTTATA Glycine max R1 Liao et al., 2008
ATCCTTTTTTCCGG
Hv MYBGa GTTTGTTA Hordeum vulgare R2R3 Gubler et al., 1995
Nt MYB1 CAGTT(A/G) Nicotiana tabacum R2R3 Yang and Klessig, 1996
G(G/T)T(A/T)GGT(A/G)
Nt MYBAS1 GCNGTT(A/G) Nicotiana tabacum R2R3 Yang et al., 2001
NtMYBJS1 AACAACCAC Nicotiana tabacum R2R3 Galis et al., 2006
ACCAACCCC
GAMYB TAACCACC Oryza sativa R2R3 Chen et al., 2006
ATTCAGTTA Oryza sativa R2R3 Aya et al., 2009
*OsMYB5 TGTT Oryza sativa R2R3 Suzuki et al., 1998
MYB.Ph3 A(A/G/T)(A/G/T)C(C/G)GTTA Petunia hybrida R2R3 Solano et al., 1997
AGTTAGTTA
PsMYB26 AAAAAACGGTTA Pisum sativum R2R3 Uimari and Strommer, 1997
AAAAGTTAGGTTA
PiMyb2R1 CNGTTG Phytophthora infestans R2R3 Xiang et al., 2010
v-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Avian myeloblastosis virus R1R2R3 Howe et al., 1990; Weston, 1992
Dd MYB CNGTT(A/G) Dictyostelium discoideum R1R2R3 Stobergrasser et al., 1992
c-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Gallus gallus domesticus R1R2R3 Howe et al., 1990; Weston, 1992
A-MYB AACCGTTA Homo sapiens R1R2R3 Ma and Calabretta, 1994
B-MYB GTCAGTTA Mus musculus R1R2R3 Watson et al., 1993
Nt MYBA1 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001
Nt MYBA2 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001
Nt MYBB T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001
2. ACC(A/T)A(A/C) PcMYB1 AACCTAAC Petroselinum crispum R1 Feldbrugge et al., 1997
At MYB6 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995
At MYB7 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995
AtMYB13 (T/C)ACC(A/T)AAC Arabidopsis thaliana R2R3 Sugimoto et al., 2000
At MYB15 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al., 1998
AtMYB58 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009
ACCAACC
ACCTAAC
AtMYB63 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009
ACCAACC
ACCTAAC
At MYB84 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al., 1998
AtMYB85 ACCTACC Arabidopsis thaliana R2R3 Zhong et al., 2008
ACCTAAC
Am MYB305 (C/T)ACC(A/T)A(A/C)C Antirrhinum majus R2R3 Sablowski et al., 1994; Moyano et al., 1996; Romero et al., 1998
(C/T)AAC(A/T)AAC
Group MYB Protein Binding Site Species MYB REPEAT References
1. CNGTT(A/G) *p85 AACGGT Drosophila melanogaster R1R2R3 Beall et al., 2002
GCAGTTT
At MYB1 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993
At MYB2 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993
AAACCA Hoeren et al., 1998
*AtMYB46 GTT(A/T)GTT(A/G) Arabidopsis thaliana R2R3 Ramirez et al., 2011
At MYB66 or WER CNGTT(A/G)G Arabidopsis thaliana R2R3 Koshino-Kimura et al., 2005; Ryu et al., 2005
AGTAGTTA
At MYB77 AAAAAACGGTTA Arabidopsis thaliana R2R3 Romero et al., 1998
*At MYB98 ANGTTAC Arabidopsis thaliana R2R3 Punwani et al., 2007; Punwani et al., 2008
*At MYBGL1 AAAGTTAGTTA Arabidopsis thaliana R2R3 Oppenheimer et al., 1991; Telfer et al., 1997
DUO1 CGGTTA Arabidopsis thaliana R2R3 Borg et al., 2011
Eh MYB10 CCGTTA Entamoeba histolytica R2R3 Menese et al., 2010
gMYB2 CTGT(A/T)G Giardia lamblia R2R3 Sun et al., 2002
GTTT(G/T)(G/T) Yang et al., 2003
CTGTTG Huang et al., 2008
CTGTAG
CAGTAG
GTGTAG
GmMYB76 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008
ATCCTTTTTTCCGG
GmMYB92 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008
ATCCTTTTTTCCGG
GGTAGGTGAGA
GmMYB177 AAAAAACCGTTATA Glycine max R1 Liao et al., 2008
ATCCTTTTTTCCGG
Hv MYBGa GTTTGTTA Hordeum vulgare R2R3 Gubler et al., 1995
Nt MYB1 CAGTT(A/G) Nicotiana tabacum R2R3 Yang and Klessig, 1996
G(G/T)T(A/T)GGT(A/G)
Nt MYBAS1 GCNGTT(A/G) Nicotiana tabacum R2R3 Yang et al., 2001
NtMYBJS1 AACAACCAC Nicotiana tabacum R2R3 Galis et al., 2006
ACCAACCCC
GAMYB TAACCACC Oryza sativa R2R3 Chen et al., 2006
ATTCAGTTA Oryza sativa R2R3 Aya et al., 2009
*OsMYB5 TGTT Oryza sativa R2R3 Suzuki et al., 1998
MYB.Ph3 A(A/G/T)(A/G/T)C(C/G)GTTA Petunia hybrida R2R3 Solano et al., 1997
AGTTAGTTA
PsMYB26 AAAAAACGGTTA Pisum sativum R2R3 Uimari and Strommer, 1997
AAAAGTTAGGTTA
PiMyb2R1 CNGTTG Phytophthora infestans R2R3 Xiang et al., 2010
v-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Avian myeloblastosis virus R1R2R3 Howe et al., 1990; Weston et al., 1992
Dd MYB CNGTT(A/G) Dictyostelium discoideum R1R2R3 Stobergrasser et al., 1992
c-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Gallus gallus domesticus R1R2R3 Howe et al., 1990; Weston et al., 1992
A-MYB AACCGTTA Homo sapien R1R2R3 Ma and Calabretta, 1994
B-MYB GTCAGTTA Mus musculus R1R2R3 Watson et al., 1993
Nt MYBA1 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001
Nt MYBA2 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001
Nt MYBB T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001
2.ACC(A/T)A(A/C) PcMYB1 AACCTAAC Petroselinum crispum R1 Feldbrugge et al., 1997
At MYB6 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995
At MYB7 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995
AtMYB13 (T/C)ACC(A/T)AAC Arabidopsis thaliana R2R3 Sugimoto et al., 2000
At MYB15 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al, 1998
AtMYB58 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009
ACCAACC
ACCTAAC
AtMYB63 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009
ACCAACC
ACCTAAC
At MYB84 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al, 1998
AtMYB85 ACCTACC Arabidopsis thaliana R2R3 Zhong et al., 2008
ACCTAAC
Am MYB305 (C/T)ACC(A/T)A(A/C)C Antirrhinum majus R2R3 Sablowski et al., 1994; Moyano et al., 1996;Romero et al., 1998
(C/T)AAC(A/T)AAC
Group MYB Protein Binding Site Species MYB REPEAT References
1. CNGTT(A/G) *p85 AACGGT Drosophila melanogaster R1R2R3 Beall et al., 2002
GCAGTTT
At MYB1 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993
At MYB2 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993
AAACCA Hoeren et al., 1998
*AtMYB46 GTT(A/T)GTT(A/G) Arabidopsis thaliana R2R3 Ramirez et al., 2011
At MYB66 or WER CNGTT(A/G)G Arabidopsis thaliana R2R3 Koshino-Kimura et al., 2005; Ryu et al., 2005
AGTAGTTA
At MYB77 AAAAAACGGTTA Arabidopsis thaliana R2R3 Romero et al., 1998
*At MYB98 ANGTTAC Arabidopsis thaliana R2R3 Punwani et al., 2007; Punwani et al., 2008
*At MYBGL1 AAAGTTAGTTA Arabidopsis thaliana R2R3 Oppenheimer et al., 1991; Telfer et al., 1997
DUO1 CGGTTA Arabidopsis thaliana R2R3 Borg et al., 2011
Eh MYB10 CCGTTA Entamoeba histolytica R2R3 Menese et al., 2010
gMYB2 CTGT(A/T)G Giardia lamblia R2R3 Sun et al., 2002
GTTT(G/T)(G/T) Yang et al., 2003
CTGTTG Huang et al., 2008
CTGTAG
CAGTAG
GTGTAG
GmMYB76 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008
ATCCTTTTTTCCGG
GmMYB92 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008
ATCCTTTTTTCCGG
GGTAGGTGAGA
GmMYB177 AAAAAACCGTTATA Glycine max R1 Liao et al., 2008
ATCCTTTTTTCCGG
Hv MYBGa GTTTGTTA Hordeum vulgare R2R3 Gubler et al., 1995
Nt MYB1 CAGTT(A/G) Nicotiana tabacum R2R3 Yang and Klessig, 1996
G(G/T)T(A/T)GGT(A/G)
Nt MYBAS1 GCNGTT(A/G) Nicotiana tabacum R2R3 Yang et al., 2001
NtMYBJS1 AACAACCAC Nicotiana tabacum R2R3 Galis et al., 2006
ACCAACCCC
GAMYB TAACCACC Oryza sativa R2R3 Chen et al., 2006
ATTCAGTTA Oryza sativa R2R3 Aya et al., 2009
*OsMYB5 TGTT Oryza sativa R2R3 Suzuki et al., 1998
MYB.Ph3 A(A/G/T)(A/G/T)C(C/G)GTTA Petunia hybrida R2R3 Solano et al., 1997
AGTTAGTTA
PsMYB26 AAAAAACGGTTA Pisum sativum R2R3 Uimari and Strommer, 1997
AAAAGTTAGGTTA
PiMyb2R1 CNGTTG Phytophthora infestans R2R3 Xiang et al., 2010
v-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Avian myeloblastosis virus R1R2R3 Howe et al., 1990; Weston et al., 1992
Dd MYB CNGTT(A/G) Dictyostelium discoideum R1R2R3 Stobergrasser et al., 1992
c-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Gallus gallus domesticus R1R2R3 Howe et al., 1990; Weston et al., 1992
A-MYB AACCGTTA Homo sapien R1R2R3 Ma and Calabretta, 1994
B-MYB GTCAGTTA Mus musculus R1R2R3 Watson et al., 1993
Nt MYBA1 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001
Nt MYBA2 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001
Nt MYBB T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001
2.ACC(A/T)A(A/C) PcMYB1 AACCTAAC Petroselinum crispum R1 Feldbrugge et al., 1997
At MYB6 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995
At MYB7 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995
AtMYB13 (T/C)ACC(A/T)AAC Arabidopsis thaliana R2R3 Sugimoto et al., 2000
At MYB15 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al, 1998
AtMYB58 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009
ACCAACC
ACCTAAC
AtMYB63 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009
ACCAACC
ACCTAAC
At MYB84 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al, 1998
AtMYB85 ACCTACC Arabidopsis thaliana R2R3 Zhong et al., 2008
ACCTAAC
Am MYB305 (C/T)ACC(A/T)A(A/C)C Antirrhinum majus R2R3 Sablowski et al., 1994; Moyano et al., 1996;Romero et al., 1998
(C/T)AAC(A/T)AAC
Group MYB Protein Binding Site Species MYB REPEAT References
1. CNGTT(A/G) *p85 AACGGT Drosophila melanogaster R1R2R3 Beall et al., 2002
GCAGTTT
At MYB1 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993
At MYB2 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993
AAACCA Hoeren et al., 1998
*AtMYB46 GTT(A/T)GTT(A/G) Arabidopsis thaliana R2R3 Ramirez et al., 2011
At MYB66 or WER CNGTT(A/G)G Arabidopsis thaliana R2R3 Koshino-Kimura et al., 2005; Ryu et al., 2005
AGTAGTTA
At MYB77 AAAAAACGGTTA Arabidopsis thaliana R2R3 Romero et al., 1998
*At MYB98 ANGTTAC Arabidopsis thaliana R2R3 Punwani et al., 2007; Punwani et al., 2008
*At MYBGL1 AAAGTTAGTTA Arabidopsis thaliana R2R3 Oppenheimer et al., 1991; Telfer et al., 1997
DUO1 CGGTTA Arabidopsis thaliana R2R3 Borg et al., 2011
Eh MYB10 CCGTTA Entamoeba histolytica R2R3 Menese et al., 2010
gMYB2 CTGT(A/T)G Giardia lamblia R2R3 Sun et al., 2002
GTTT(G/T)(G/T) Yang et al., 2003
CTGTTG Huang et al., 2008
CTGTAG
CAGTAG
GTGTAG
GmMYB76 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008
ATCCTTTTTTCCGG
GmMYB92 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008
ATCCTTTTTTCCGG
GGTAGGTGAGA
GmMYB177 AAAAAACCGTTATA Glycine max R1 Liao et al., 2008
ATCCTTTTTTCCGG
Hv MYBGa GTTTGTTA Hordeum vulgare R2R3 Gubler et al., 1995
Nt MYB1 CAGTT(A/G) Nicotiana tabacum R2R3 Yang and Klessig, 1996
G(G/T)T(A/T)GGT(A/G)
Nt MYBAS1 GCNGTT(A/G) Nicotiana tabacum R2R3 Yang et al., 2001
NtMYBJS1 AACAACCAC Nicotiana tabacum R2R3 Galis et al., 2006
ACCAACCCC
GAMYB TAACCACC Oryza sativa R2R3 Chen et al., 2006
ATTCAGTTA Oryza sativa R2R3 Aya et al., 2009
*OsMYB5 TGTT Oryza sativa R2R3 Suzuki et al., 1998
MYB.Ph3 A(A/G/T)(A/G/T)C(C/G)GTTA Petunia hybrida R2R3 Solano et al., 1997
AGTTAGTTA
PsMYB26 AAAAAACGGTTA Pisum sativum R2R3 Uimari and Strommer, 1997
AAAAGTTAGGTTA
PiMyb2R1 CNGTTG Phytophthora infestans R2R3 Xiang et al., 2010
v-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Avian myeloblastosis virus R1R2R3 Howe et al., 1990; Weston et al., 1992
Dd MYB CNGTT(A/G) Dictyostelium discoideum R1R2R3 Stobergrasser et al., 1992
c-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Gallus gallus domesticus R1R2R3 Howe et al., 1990; Weston et al., 1992
A-MYB AACCGTTA Homo sapien R1R2R3 Ma and Calabretta, 1994
B-MYB GTCAGTTA Mus musculus R1R2R3 Watson et al., 1993
Nt MYBA1 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001
Nt MYBA2 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001
Nt MYBB T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001
2.ACC(A/T)A(A/C) PcMYB1 AACCTAAC Petroselinum crispum R1 Feldbrugge et al., 1997
At MYB6 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995
At MYB7 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995
AtMYB13 (T/C)ACC(A/T)AAC Arabidopsis thaliana R2R3 Sugimoto et al., 2000
At MYB15 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al, 1998
AtMYB58 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009
ACCAACC
ACCTAAC
AtMYB63 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009
ACCAACC
ACCTAAC
At MYB84 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al, 1998
AtMYB85 ACCTACC Arabidopsis thaliana R2R3 Zhong et al., 2008
ACCTAAC
Am MYB305 (C/T)ACC(A/T)A(A/C)C Antirrhinum majus R2R3 Sablowski et al., 1994; Moyano et al., 1996;Romero et al., 1998
(C/T)AAC(A/T)AAC
Table 1.1 continued.
Group MYB Protein Binding Site Species MYB REPEAT References
Am MYB308 (C/T)ACC(A/T)A(A/C)C Antirrhinum majus R2R3 Tamagnone et al., 1998
Am MYB340 (C/T)ACC(A/T)A(A/C)C Antirrhinum majus R2R3 Moyano et al., 1996
DcMYB1 ACC(A/T)(A/T)CC Daucus carota R2R3 Maeda et al., 2005
Eg MYB1 (C/T)ACC(A/T)A(A/C)C Eucalyptus gunnii R2R3 Legay et al., 2007
Eg MYB2 CACCTACC Eucalyptus gunnii R2R3 Goicoechea et al., 2005
TACCTAAC
NlMYB305 TCACCTAAC Nicotiana langsdorffii R2R3 Liu et al., 2009
GCACCTAAT
NtMYB2 ATCTCACCTACCA Nicotiana tabacum R2R3 Sugimoto et al., 2000
PtMYB1 ACCTACC Pinus taeda R2R3 Patzlaff et al., 2003b
ACCAACC
ACCTAAC
PtMYB4 ACCTACC Pinus taeda R2R3 Patzlaff et al., 2003a
ACCAACC
ACCTAAC
Pt MYB134 ACCTAC Populus tremuloides R2R3 Mellway et al., 2009
*Le MYBI TCTAATCTCATCC Solanum lycopersicum R2R3 Rose et al., 1999
ZmMYB31 ACC(T/A)ACC Zea mays R2R3 Fornale et al., 2010
Zm MYBC1 A(A/C)C(A/T)A(A/C)C Zea mays R2R3 Sainz et al., 1997
GTT(A/T)GTT(A/G)
ZmMYB-IF35 ACC(A/T)ACC(A/C/T) Zea mays R2R3 Heine et al., 2007
Zm P ACC(A/T)ACC(A/C/T) Zea mays R2R3 Grotewold et al., 1994
3. TTAGGG At TBP1 TTTAGGG Arabidopsis thaliana R1 Hwang et al., 2001
At TRP1 TTTAGGG Arabidopsis thaliana R1 Chen et al., 2001
hTRF1 TTTAGGG Homo sapiens R1 Nishikawa et al., 2001; Court et al., 2005
Rap1 TTAGGG Homo sapiens R1 Li et al., 2003
LaTBP1 TTTAGGG Leishmania amazonensis R1 Lira et al., 2007
TGTGTGGG
Ng TRF1 TTTAGGG Nicotiana glutinosa R1 Ko et al., 2008
RTBP1 TTTAGGG Oryza sativa R1 Ko et al., 2009
*Tbf1p TAGGGTTGG Saccharomyces cerevisiae R1 Koering et al., 2000
Smhl TTTAGGG Zea mays R1 Marian et al., 2003
Rap1 TTAGGG Saccharomyces cerevisiae R2R3 Konig and Rhodes, 1997
ACA(C/T)CCCAT(C/T) Lascaris et al., 1999
ACACCC(A/G)(C/T)ACA(C/T)(A/C) Lieb et al., 2001
4. AAAATATCT *CCA1 AA(A/C)AATCT Arabidopsis thaliana R1 Carre et al., 1995
LHY AAAATATCT Arabidopsis thaliana R1 Schaffer et al., 1998
RVE1 AAAATATCT Arabidopsis thaliana R1 Maxwell et al., 2003; Rawat et al., 2009
RVE2 AAAATATCT Arabidopsis thaliana R1 Maxwell et al., 2003; Rawat et al., 2009
TOC1 AAAATATCT Arabidopsis thaliana R1 Alabadi et al., 2001
5. GATA MYBSt1 GGATA Solanum tuberosum R1 Baranowskij et al., 1994
TaMYB80 AGATAC Triticum aestivum R1 Xue et al., 2005
GGAATATNC
Tv MYB1 ANAACGATA Trichomonas vaginalis R2R3 Ong et al., 2006
TAACGA
TATCGT
Tv MYB2 CGATA Trichomonas vaginalis R2R3 Ong et al., 2007
TATCGTC
6. TATCCA Os MYBS1 TATCCA Oryza sativa R1 Lu et al., 2002
Os MYBS2 TATCCA Oryza sativa R1 Lu et al., 2002
Os MYBS3 TATCCA Oryza sativa R1 Lu et al., 2002
7. Miscellaneous Ca Rap1 GGTGT Candida albicans R1 Yu et al., 2010
GGATG
dd MYBE CACCCCAC Dictyostelium discoideum R1 Fukuzawa et al., 2006
Adf-1 (G(C/T)(C/T))x4 Drosophila funebris R1 Lang et al., 2010
Zeste (T/C/G)GAGTG(A/G/C) Drosophila melanogaster R1 Mohrmann et al., 2002
Eh Mybdr CCCCCC Entamoeba histolytica R1 Ehrenkaufer et al., 2009
Gm MYB176 TAGT(A/T)(A/T) Glycine max R1 Yi et al., 2010
Tbf1 ACAGGGTT Schizosaccharomyces pombe R1 Pitt et al., 2008
At MYBCDC5 CTCAGCG Arabidopsis thaliana R2R3 Hirayama and Shinozaki, 1996
(Jin et al., 2000). Compendia of plant MYB DNA-binding sites can be found in
databases such as The Arabidopsis Gene Regulatory Information Server (AGRIS)
(http://arabidopsis.med.ohio-state.edu/) and the Transcription Factor Database
(TRANSFAC) (http://www.gene-regulation.com/pub/databases.html). These databases
contain many of the MYB DNA-binding sites reported in the literature, most of which
have been experimentally validated, and all of which are reported here (Fig. 1.2, Table
1.1). Plant MYB DNA-binding sites were determined on a protein-by-protein basis
(Luscher and Eisenman, 1990; Ramsay et al., 1991; Urao et al., 1993; Baranowskij et
al., 1994; Grotewold et al., 1994; Sablowski et al., 1994; Gubler et al., 1995; Li and
Parish, 1995; Moyano et al., 1996; Yang and Klessig, 1996; Sainz et al., 1997; Solano
et al., 1997; Uimari and Strommer, 1997; Romero et al., 1998; Suzuki et al., 1998;
Tamagnone et al., 1998; Wang and Tobin, 1998; Rose et al., 1999; Chen et al., 2001;
Ito et al., 2001; Yang et al., 2001; Patzlaff et al., 2003a; Patzlaff et al., 2003b; Koshino-
Kimura et al., 2005; Heine et al., 2007; Legay et al., 2007; Punwani et al., 2007; Liao et
al., 2008; Aya et al., 2009; Ko et al., 2009; Liu et al., 2009; Mellway et al., 2009), and
generally reside approximately 500 bp upstream of the transcriptional start site (Fig. 1.2,
Table 1.1).
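The degenerate notation used in Table 1.1, such as CNGTT(A/G), maps directly onto a regular expression, which makes it straightforward to scan a promoter region for candidate MYB-binding sites on both strands. The following Python sketch illustrates one way this could be done. The function names (motif_to_regex, find_sites) and the example sequence are hypothetical and are not taken from AGRIS, TRANSFAC, or any of the cited studies. A pattern match only flags a candidate site; it does not demonstrate binding.

```python
import re

# Single-letter ambiguity codes that appear in the motifs of Table 1.1.
AMBIGUITY = {"N": "[ACGT]"}

# Translation table used to build the reverse complement of a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def motif_to_regex(motif: str) -> str:
    """Translate thesis-style notation, e.g. 'CNGTT(A/G)', into a regex string."""
    # '(A/G)' becomes the character class '[AG]'.
    pattern = re.sub(
        r"\(([ACGT/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    # 'N' becomes '[ACGT]'; every other character is kept as-is.
    return "".join(AMBIGUITY.get(ch, ch) for ch in pattern)


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(motif: str, sequence: str):
    """Return sorted (start, strand, matched_text) tuples for both strands.

    'start' is the 0-based position on the forward strand. A lookahead is
    used so that overlapping matches are also reported. Palindromic sites
    are reported once per strand.
    """
    regex = re.compile("(?=(" + motif_to_regex(motif) + "))")
    seq = sequence.upper()
    hits = [(m.start(), "+", m.group(1)) for m in regex.finditer(seq)]
    rc = reverse_complement(seq)
    for m in regex.finditer(rc):
        # Convert the reverse-strand coordinate back to the forward strand.
        start = len(seq) - (m.start() + len(m.group(1)))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    promoter = "TTTCAGTTATTT"  # made-up sequence for illustration only
    for hit in find_sites("CNGTT(A/G)", promoter):
        print(hit)
```

Reporting the strand alongside the position is a deliberate choice here, since the binding sites in Table 1.1 are listed in a single orientation while a promoter may carry the site on either strand.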
1.3.3 The DNA Targets of Single MYB Repeat Proteins
In contrast to two- and three-repeat MYB proteins, single-repeat MYB proteins bind
predominantly to the telomeric sequence TTAGGG and display similar sequence
identity; however, not all single-repeat MYB proteins bind this sequence and, moreover,
they do not all share the same functional roles (Table 1.1). Single-repeat MYB proteins
are involved in telomere binding and circadian clock regulation (Martin and Paz-Ares,
1997). These functions have been conserved during the evolution of yeast,
animals and plants (Bilaud et al., 1996; Lipsick, 1996).
Both C-terminal and N-terminal single-MYB repeat proteins bind to double-stranded
DNA of telomeric repeats TTTAGGG. AtTRB1, AtTRB2, and AtTRB3 all bind telomeric
DNA containing a minimum of two repeats, (TTTAGGG)2. In this regard,
these single-MYB repeat proteins are divergent from R2R3-MYB and R1R2R3-MYB
proteins both in terms of the primary sequence of the MYB domains, and also,
consistent with their divergent DNA-binding domain, in terms of their cognate DNA
target binding sites. By contrast, some single-repeat MYB proteins seem to bind to
DNA targets that coincide with those of R2R3-MYB proteins (Feldbrugge et al., 1997; Lu et
al., 2002; Liao et al., 2008). In this regard, they function as competitors for the same
DNA targets.
The rice single-MYB repeat proteins OsMYBS1, OsMYBS2, and OsMYBS3 can form
dimers to bind to the sequence TATCCA with different binding affinities, as determined
by EMSAs with competition (Lu et al., 2002). Mutational assays showed that
nucleotides CCA are more important for OsMYBS1 and OsMYBS3 binding than the
TAT nucleotides. In contrast, sequence TAT seems to be more important for OsMYBS2
binding. Moreover, all three of these MYB proteins alter alpha-amylase gene
expression. OsMYBS1 had a higher transactivation ability than OsMYBS2 and
OsMYBS3. OsMYBS3 acted as a transcriptional repressor in both yeast and barley
aleurone cells. These results demonstrate differential binding affinities and
transactivation ability of three MYB proteins for the same target DNA sequence.
By competing with each other for the same sites, single-repeat MYB proteins and
R2R3-MYB proteins provide a means to fine-tune the expression of genes whose
regulatory regions contain those sites (Konig and Rhodes, 1997; Lu et al.,
2002; Liao et al., 2008).
1.4 The Nature of DNA-Binding by MYB Proteins
1.4.1 Relationship Between the MYB DNA-Binding Domain and DNA-
Binding Specificity
The MYB superfamily has been categorised based both on the number of MYB repeats
and on the amino acid sequence of the MYB domain (Stracke et al., 2001; Jia et al.,
2004). In other families of transcription factors, overall sequence conservation is low
and variability in DNA-binding specificity is high (Treisman et al., 1992; Klug and
Schwabe, 1995). Contrary to this, members of the plant R2R3-MYB family share higher
amino acid sequence similarity, especially in their recognition helices, and display
similar DNA-recognition patterns (Romero et al., 1998). These similarities in recognition
specificity are heightened between members of the same phylogenetic group.
R2R3-MYB family members from different species have been previously classified into
different phylogenetic clades (group A, B, and C) based on sequence similarities
(Romero et al., 1998). These clades were then analysed for DNA-binding specificities.
It was shown that members of group A bind MBS type I sequence
(C(A/C/G/T)GTT(A/G)), members of group B bind equally to both MBS type I and type II
(G(G/T)T(A/T)GTT(A/G)), and most members of group C bind MBS type IIG
((C/T)ACC(A/T)A(A/C)C). For example, AtMYB6 and AtMYB7 are both members of
group C and share 90% amino acid sequence identity (Romero et al., 1998). AtMYB6
and AtMYB7 both bind to the MBS type IIG sequence (Li and Parish, 1995)(Fig. 1.2,
Table 1.1).
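The three MBS consensus sequences quoted above can be written as regular expressions and used to label a reported binding site by type. The Python sketch below is only an illustration of that idea: the names MBS_PATTERNS and classify_site are hypothetical, the consensus strings are copied from the text as given, and both strands are checked because the sites are double-stranded. It is not the classification procedure used by Romero et al. (1998).

```python
import re

# Consensus sequences as quoted in the text, rewritten as regular expressions.
MBS_PATTERNS = {
    "I": re.compile(r"C[ACGT]GTT[AG]"),        # C(A/C/G/T)GTT(A/G)
    "II": re.compile(r"G[GT]T[AT]GTT[AG]"),    # G(G/T)T(A/T)GTT(A/G)
    "IIG": re.compile(r"[CT]ACC[AT]A[AC]C"),   # (C/T)ACC(A/T)A(A/C)C
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def classify_site(site: str) -> list[str]:
    """Return the MBS types whose consensus occurs in `site` on either strand."""
    forward = site.upper()
    reverse = forward.translate(_COMPLEMENT)[::-1]
    return [
        name
        for name, regex in MBS_PATTERNS.items()
        if regex.search(forward) or regex.search(reverse)
    ]


if __name__ == "__main__":
    # Example sites taken from Table 1.1 and from the MBS consensus itself.
    for site in ("CAGTTA", "AGTTAGTTA", "CACCTAAC"):
        print(site, classify_site(site))
```

Shorter reported sites, such as the seven-nucleotide ACCTACC, will not match the full eight-nucleotide type IIG consensus under this strict test, so an empty result should be read as "not matched by this pattern" rather than as evidence against membership in a group.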
Well-characterised DNA-binding sites can be extracted from the literature for 87
proteins from the MYB superfamily (Fig. 1.2, Table 1.1). Characterisation of DNA
targets was derived from both in vivo and in vitro protein-DNA-binding assays (see
captions for Fig. 1.2 and Table 1.1). DNA-binding sites for these proteins can be
categorised into seven groups based on DNA-binding specificities. Examination of the
protein-sequence-similarity-derived phylogenetic relationships between the 87 MYB
proteins reveals that, in general, MYB proteins with similar protein sequences bind to
similar DNA sequences (Fig. 1.2, Table 1.1). That is, similar protein structure generally
implies recognition of similar DNA-binding sequences; however, there
are instances in the phylogenetic tree and in other studies where this is not the case (Li
and Parish, 1995)(Fig. 1.2, Table 1.1).
Members of the MYB superfamily with similar structures do not always share similar
DNA-binding sites. Although Romero et al. showed a correlation between MYB
protein structure and DNA-binding specificity, the specificity of some MYB family
members could not be predicted (Romero et al., 1998). For example, Group C MYB family
members generally prefer the type IIG sequence; however, two Group C MYB proteins,
AtMYB2 and AtMYBGL1, bound DNA with different patterns. AtMYB2 bound to type I
sequences (Urao et al., 1993), and AtMYBGL1 bound only to type II sequence (Romero
et al., 1998). These examples show that the DNA-binding sites of MYB proteins can generally
be predicted; however, MYB proteins that are similar in structure and
function can bind to different cognate DNA target sites. These exceptions highlight
the importance of determining DNA-binding sites experimentally for individual MYB proteins.
1.4.2 Involvement of MYB Repeats in DNA Binding
Both the R2 and R3 MYB repeats are necessary for DNA binding by either R1R2R3-MYB
or R2R3-MYB proteins (Ogata et al., 1994). Neither R2 nor R3 alone can bind DNA
specifically (Ogata et al., 1994). This implies that the R2 and R3 repeats bind
cooperatively to their cognate DNA target sequence. The resolved structure of the
c-MYB-DNA complex has shown that the C-terminal recognition helices of R2 and
R3 contact each other directly prior to sequence-specific binding (Ogata et al.,
1994; Ogata et al., 1995; Tahirov et al., 2001; Tahirov et al., 2002). Furthermore, the
phosphate backbone interacts simultaneously with amino acids in both the R2 and
R3 repeats to aid DNA binding.
As R1 is not necessary for the specific recognition of DNA target sequences, both
R1R2R3- and R2R3-MYB proteins bind DNA in a similar manner. By contrast,
single-repeat MYB proteins, which contain only one MYB DNA-binding repeat, bind DNA
differently from R1R2R3- and R2R3-MYB proteins (Hwang et al., 2001). This first
became clear when S. cerevisiae Rap1 was found to contain two MYB repeats in its
MYB DNA-binding domain, whereas its orthologous counterpart, Homo sapiens RAP1,
contained only one MYB repeat (Konig and Rhodes, 1997). It was subsequently found
that S. cerevisiae Rap1 binds DNA as a monomer because it contains two MYB
repeats. In contrast, Homo sapiens RAP1 contains only one MYB repeat and does not
bind DNA directly; instead, it tethers itself to TRF2 to reach its DNA targets.
1.4.3 The Nature of DNA Binding By Animal MYB Proteins
The nature of DNA binding by any MYB protein has been most extensively examined
using c-MYB and its cognate target, MBSI. Mutational studies on c-MYB have shown
that R1 can be deleted without significant loss of DNA-binding ability, and that both R2
and R3 are essential for MYB-DNA recognition and binding (Anton and Frampton, 1988;
Sakura et al., 1989; Frampton et al., 1991). Although R1 is not involved in the direct
recognition of DNA sequence motifs, it does enhance the stability of DNA binding by the
R2R3 repeats without significantly altering the DNA-R2R3 conformation (Tanikawa et
al., 1993; Ogata et al., 1994).
c-MYB-DNA interactions were validated structurally with the resolution of the NMR
solution structures and X-ray crystal structures of c-MYB DNA-binding domain in the
free and DNA-bound states (Ogata et al., 1994; Ogata et al., 1995; Tahirov et al., 2001;
Tahirov et al., 2002). The third (C-terminal) helix of each of R2 and R3 was
subsequently found to act as the recognition helix (Fig. 1.1)(Ogata et al., 1994). In
keeping with this, the recognition helix of R3 interacts with the core of the DNA
consensus sequence ((T/C)AAC), while the recognition helix of R2 interacts less
specifically with nucleotides surrounding the core recognition motif
((G/T)G(A/C/T)(A/C/T)) (Ogata et al., 1995). The binding of R2 and R3 to their consensus
sequence ((T/C)AAC(G/T)G(A/C/T)(A/C/T)) widens the major groove and causes a
bend in the local helical axis (Ogata et al., 1994). Several interhelical interactions occur
between the helices of R2 and R3, stabilising the MYB-DNA interaction. Moreover, R2
and R3 bind the major groove continuously, similar to transcription factor IIIA (TFIIIA)-type
Zn fingers (Pavletich and Pabo, 1991; Fairall et al., 1993; Pavletich and Pabo,
1993). In contrast to TFIIIA-type Zn fingers, the recognition helices of c-MYB R2 and R3
are more closely packed together in the major groove. This type of direct interaction
between the recognition helices of different DNA-binding units is unique among
DNA-binding domain complexes (Ogata et al., 1994).
Not all amino acid residues within the DNA-binding site of transcription factors partake
in DNA recognition and binding. Within the MYB protein family, certain key residues are
critical for these tasks (Ogata et al., 1994; Solano et al., 1997). For example, for c-
MYB, the three key base contacts are governed by residues Lys128 (R2), Lys182 (R3),
and Asn183 (R3), which are found to be fully conserved in all known animal and plant
MYB proteins (Ogata et al., 1994; Ogata et al., 1995).
Each MYB DNA-binding domain contains several conserved regularly spaced
tryptophan residues that participate in a hydrophobic cluster (Anton and Frampton,
1988; Saikumar et al., 1990). Through mutational and structural studies on c-MYB, this
hydrophobic cluster was determined to be essential for both the stability of MYB-protein
interaction and for sequence-specific binding to its consensus sequence
((T/C)AAC(G/T)G(A/C/T)(A/C/T)). Mutational and structural studies on animal c-MYB
have aided in providing critical knowledge on the molecular mechanisms behind MYB-
DNA interactions. Moreover, these studies allow one to generate testable hypotheses
on MYB-DNA interactions in other organisms where orthologous MYB proteins reside.
A cysteine residue located in the DNA recognition helix of R2 has remained completely
conserved in animals, fungi, and plants during the evolution of MYB domains (Heine et
al., 2004). R1R2R3-MYB domains have a single cysteine residue (Cys130) that is
included in the hydrophobic core. Cys130 of c-MYB needs to be reduced to allow for
sequence-specific DNA-binding. When reduced, Cys130 accomplishes this by
structurally stabilising the three helices of the R2-MYB repeat during sequence-specific
DNA-binding (Graesser et al., 1992; Guehmann et al., 1992; Melcher, 2000). In
contrast, most R2R3-MYB domains contain two cysteine residues (Cys49 and Cys53)
with the equivalent position as Cys130 in R1R2R3 MYB (Heine et al., 2004).
c-MYB has been extensively studied with regards to dynamics of DNA binding
(Tanikawa et al., 1993; Ebneth et al., 1994; Ogata et al., 1994; Ogata et al., 1995). The
c-MYB R2R3-domain was shown to bind tightly to the MYB binding site
((T/C)AAC(G/T)G(A/C/T)(A/C/T)) with a binding constant of 1.5 × 10–9 M ± 28% (Tanikawa et
al., 1993; Ebneth et al., 1994). Mutational analyses have shown that specific residues
within the R2R3-MYB repeats of c-MYB bound to specific nucleotides with different
affinities (Tanikawa et al., 1993). High-affinity interactions with the R2R3-MYB
repeats of c-MYB are unevenly distributed across the MYB-binding site,
AACTGAC. The first adenine, the third cytosine, and the fifth guanine are involved in
high-affinity binding with c-MYB: any base substitution at these positions reduces the
binding affinity by more than 500-fold relative to binding to an unmutated MYB-binding
site sequence. In contrast, the second adenine is involved
in lower-affinity binding, with an affinity reduction in the range of 6- to 15-fold when
the base is changed. The seventh cytosine shows an interesting interaction in that
only guanine substitution abolishes binding. Altogether, these affinity data
show that the second and third MYB-repeats cover the AACTGAC region from the
major groove of DNA in an orientation that allows the third MYB-repeat to cover the core
AAC sequence. Moreover, the results show that the third MYB-repeat recognises the
core AAC sequence with high binding affinity, whereas the second repeat recognises
the GAC sequence with lower binding affinity.
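The fold-reductions in affinity quoted above can be placed on an energy scale with the standard thermodynamic relation between a ratio of dissociation constants and the change in binding free energy. The worked values below are a sketch under two assumptions that are not stated in the cited studies: each fold-reduction is read as the ratio of mutant to wild-type dissociation constants, and the temperature is taken as 298 K.

```latex
\[
\Delta\Delta G = RT \ln\!\left(\frac{K_d^{\mathrm{mut}}}{K_d^{\mathrm{wt}}}\right),
\qquad RT \approx 2.48\ \mathrm{kJ\,mol^{-1}} \ \text{at } T = 298\ \mathrm{K}
\]
\[
\begin{aligned}
\text{500-fold:}\quad & \Delta\Delta G \approx 2.48 \times \ln 500 \approx 15.4\ \mathrm{kJ\,mol^{-1}} \\
\text{6-fold:}\quad   & \Delta\Delta G \approx 2.48 \times \ln 6 \approx 4.4\ \mathrm{kJ\,mol^{-1}} \\
\text{15-fold:}\quad  & \Delta\Delta G \approx 2.48 \times \ln 15 \approx 6.7\ \mathrm{kJ\,mol^{-1}}
\end{aligned}
\]
```

On these assumptions, a substitution at the first, third, or fifth position costs more than about 15 kJ/mol of binding energy, whereas a substitution at the second position costs roughly 4 to 7 kJ/mol.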
MYB-DNA kinetic studies also found that mutating the R1 repeat does not affect the
DNA recognition of c-MYB but does affect the stability of the MYB-DNA complex.
Furthermore, the N-terminal acidic activation region upstream of the first MYB repeat
was found to reduce the binding affinity by interfering with R1 binding to DNA. NMR,
X-ray crystallography, and surface plasmon resonance studies have validated these
c-MYB-DNA binding kinetic results (Ogata et al., 1994; Ogata et al., 1995; Oda et al.,
1999). Further studies on c-MYB DNA affinity indicated that when c-MYB binds DNA,
the orientations of R2 and R3 are immobilised by sequence-specific binding and their
conformations are slightly changed. No significant conformational changes occur in R1
during MYB DNA-binding, further emphasising that R1 is not involved in DNA-binding
site recognition (Ogata et al., 1995).
In a comparison between the binding kinetics of the three vertebrate MYB proteins (A-,
B- and c-MYB), both A- and c-MYB bound the MYB recognition site with similar binding
constants and specificity; however, B-MYB formed DNA-protein complexes that were less
stable and dissociated rapidly under competitive conditions, and it showed less tolerance to
DNA-binding site variations (Bergholtz et al., 2001). These studies on animal MYB
proteins have granted insight into the molecular mechanisms behind MYB-DNA
interactions in general because R2R3-MYB proteins bind DNA in a similar fashion
(Ogata et al., 1994; Solano et al., 1997).
The kinetics of single-repeat animal MYB proteins binding to their cognate DNA sequences
have also been examined. For example, the human single-repeat MYB protein TRF1
can bind to the telomeric sequence TTAGGG as a monomer with high affinity (Kd =
3.2 ± 0.5 × 10–9 M) and specificity (Konig et al., 1998). The recorded
DNA binding affinity lies in the range of various homeodomains that also bind
specifically to DNA as monomers (Affolter et al., 1990; Florence et al., 1991; Ades and
Sauer, 1994; Carra and Privalov, 1997). Although the interaction of TRF1 with the
telomeric sequence is specific, both specificity and affinity are significantly increased
when TRF1 binds as a homodimer (Bianchi et al., 1997).
1.4.4 The Nature of DNA Binding By Plant MYB Proteins
To date, some of the specifics of plant MYB interaction with target DNA have been inferred from
model-building based on c-MYB binding to DNA, as no crystal structure has been
generated yet for any plant multi-repeat MYB protein. For example, the R2R3 domain of
PAP1/AtMYB75 was modelled according to the known structural data of c-MYB
(Zimmermann et al., 2004). A conserved amino acid signature found within several
MYB proteins was hypothesised to predict new MYB/bHLH interactions for Arabidopsis
thaliana proteins. Consistent with this hypothesis, analysis of the predicted 3D model of
PAP1/AtMYB75 showed that the amino acids of the conserved motif are surface-
exposed on helices 1 and 2 of the R3 repeat, forming hydrophobic and charged residue
patterns (Zimmermann et al., 2004). These surface-exposed amino acids are thought
to stabilise the protein-protein interactions (Zimmermann et al., 2004). This model was
validated by mutational assays (Zimmermann et al., 2004).
The Petunia MYB.Ph3 structure was also modelled after c-MYB. MYB.Ph3, a plant
R2R3-MYB transcription factor involved in the regulation of the flavonoid biosynthetic
pathway in petunia flowers (Avila et al., 1993; Sablowski et al., 1994; Solano et al.,
1995), shows divergence in binding specificity compared to c-MYB (Solano et al., 1997).
MYB.Ph3 can bind two types of sites: MBSI ((T/C)AAC(G/T)G(A/C/T)(A/C/T)) and
MBSII (AGTTAGTTA) (Solano et al., 1997; Romero et al., 1998). Modeling predicted
that a single residue substitution in the R2 repeat of MYB.Ph3 (Leu71→Glu) would
change its DNA recognition to that of c-MYB, and the reciprocal substitution in c-MYB,
Glu132→Leu, would change c-MYB specificity to that of MYB.Ph3 (Solano et al., 1997).
This model was experimentally validated via mutational assays. Even though it was
previously found that these residues do not directly bind DNA (Ogata et al., 1994), the
MYB.Ph3 Leu71 and c-MYB Glu132 residues interact with residues that do interact with
DNA, enabling them to impact DNA-specificity indirectly (Solano et al., 1997). By
contrast, other studies had found that P and v-MYB DNA-binding domains, which are
conserved among animal and plant MYB domains, are necessary for the high affinity
DNA-binding activity of these proteins to their respective DNA target sites but are not
sufficient for their unique DNA-binding site recognition of P and v-MYB (Williams and
Grotewold, 1997). Furthermore, Williams and Grotewold (1997) found that chimeric
MYB domains have novel DNA-binding specificities. Resolution of these differences will
require crystal or solution structures for plant MYB proteins.
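The degenerate site notation used above, in which alternative bases at one position are written as (T/C), maps directly onto a regular expression. The following is a minimal sketch of that conversion and is not part of any cited study; the function names and the test fragment are hypothetical, and only the forward strand is scanned.

```python
import re

def site_to_regex(site):
    """Convert notation such as (T/C)AAC(G/T)G into a regex such as [TC]AAC[GT]G."""
    # Each parenthesised group of slash-separated bases becomes a character class.
    return re.sub(
        r"\(([ACGT/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        site,
    )

def find_sites(pattern, sequence):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead is used so that overlapping occurrences are all reported.
    return [m.start() for m in re.finditer("(?=(" + pattern + "))", sequence)]

# The two MYB.Ph3 site types as written in the text.
MBSI = site_to_regex("(T/C)AAC(G/T)G(A/C/T)(A/C/T)")
MBSII = site_to_regex("AGTTAGTTA")

# Hypothetical DNA fragment, for illustration only.
fragment = "GGCTAACGGATCCAGTTAGTTAGC"

print(MBSI)
print(find_sites(MBSI, fragment))
print(find_sites(MBSII, fragment))
```

A complete scan would also search the reverse complement of the fragment, since a site may lie on either strand.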
As is the case with c-MYB, both Cys49 and Cys53 are thought to be essential for the
DNA-binding or transcriptional activity of plant MYB proteins, forming an intramolecular
disulfide bond with each other under non-reducing conditions. This disulfide bond has
been hypothesised to impair DNA binding under non-reducing conditions, causing
R2R3-MYB proteins to be functionally active only under reducing conditions. Consistent with
this, the same two cysteines are conserved in the R2-MYB repeat of the R2R3-
MYB protein WEREWOLF (WER) (Koshino-Kimura et al., 2005). WER cannot bind to
its DNA-binding sites within its downstream target promoter regions without the addition
of dithiothreitol (a reducing agent). The dithiothreitol is thought to abolish the disulfide
bond, leading to the sequence specific binding of WER to its downstream targets. Nitric
oxide (NO) was shown to modify the DNA-binding activity of AtMYB2 by a
posttranslational modification of its conserved Cys53 (Serpa et al., 2007). AtMYB2
bound to the core binding site AAACCA in an EMSA; however, the addition of NO
donors, such as SNP (sodium nitroprusside) and GSNO (S-nitrosoglutathione), inhibited
sequence specific binding of AtMYB2. The NO-mediated inhibitory effect was reversed
by DTT, demonstrating that sequence specific DNA-binding of AtMYB2 is inhibited by S-
nitrosylation of Cys53 as a result of NO action. The role of cysteine residues in MYB
proteins illustrates the divergence of DNA-binding mechanisms between animal
and plant MYB proteins. Despite some similarities in DNA binding, given the
divergence of target DNA-binding sites of R2R3-MYB proteins relative to R1R2R3-MYB
proteins, it follows that the residues critical for DNA recognition and binding within the
binding site of many plant MYB proteins differ from those of animal MYB proteins.
These examples illustrate the inherent flexibility of DNA recognition by the MYB
superfamily of transcription factors: a change in a single residue can alter the
DNA recognition by a particular MYB protein.
Despite the vast knowledge of plant MYB transcription factor function at the gross
morphological level, little is known about the dynamics of MYB protein-DNA
interactions. Nevertheless, some general themes regarding plant MYB-DNA binding
kinetics are emerging (Solano et al., 1997; Lu et al., 2002; Liao et al., 2008). Most plant
MYB proteins display considerable inherent flexibility in their ability to recognise target
sites (Fig. 1.2, Table 1.1). For example, Petunia protein MYB.Ph3 bound to both MBSI
and MBSII sites with the same affinity, inducing similar DNA-bending/distortions in both
cases (Solano et al., 1997). Affinities for these two plant MYB binding sites vary among
other plant MYB proteins; however, certain MYB proteins have been shown to only bind
one of these sequences (Urao et al., 1993; Grotewold et al., 1994; Sablowski et al.,
1994; Gubler et al., 1995; Li and Parish, 1995; Moyano et al., 1996). The maize R2R3-
MYB C1 protein bound to its target sequences in the a1 (dihydroflavonol reductase)
promoter (Sainz et al., 1997). As determined by EMSA, the affinity of binding was
reduced by mutations in the C1 DNA-binding domain or in the a1 sequences recognised
and bound by C1. Maize transient assays determined that C1 directly activated the a1
gene. The two C1 binding sites were also bound by the maize P
protein. One of the sites (ACC(A/T)ACC) was bound with higher affinity by P (Kd = 52
± 4 × 10⁻⁹ M) relative to C1 (Kd = 330 ± 50 × 10⁻⁹ M). In contrast, the other site
(AACTACCGG) was bound with similar low affinities by P (Kd = 860 ± 150 × 10⁻⁹ M) and
C1 (Kd = 780 ± 70 × 10⁻⁹ M). These results allow a greater understanding of the
mechanism behind the anthocyanin biosynthetic pathway in maize. In another example,
all three Soy-MYB proteins, GmMYB76, GmMYB92, and GmMYB177, bound to the
MBSI sequence (Liao et al., 2008). GmMYB92 could also bind sequences MRE4
(TCTCACCTACC) and mMRE1 (CCGGAAAAAAGGAT). Unlike GmMYB92,
GmMYB76 and GmMYB177 bound to the mMRE1 sequence with weak affinity.
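To make the dissociation constants quoted above for P and C1 easier to compare, the short calculation below converts them into a fold difference and into an expected fractional occupancy. It is a sketch under stated assumptions: binding is treated as simple 1:1 equilibrium binding with protein in excess, and the 100 nM protein concentration is a hypothetical value chosen for illustration, not one taken from the cited work.

```python
# Dissociation constants (M) as quoted in the text for the two a1 promoter sites.
KD = {
    ("P", "ACC(A/T)ACC"): 52e-9,
    ("C1", "ACC(A/T)ACC"): 330e-9,
    ("P", "AACTACCGG"): 860e-9,
    ("C1", "AACTACCGG"): 780e-9,
}

def fraction_bound(protein_conc, kd):
    """Fraction of sites occupied for simple 1:1 binding with protein in excess."""
    return protein_conc / (kd + protein_conc)

# Fold difference in Kd between C1 and P at the higher-affinity site.
fold = KD[("C1", "ACC(A/T)ACC")] / KD[("P", "ACC(A/T)ACC")]
print(f"C1/P Kd ratio at ACC(A/T)ACC: {fold:.1f}")

# Assumed free protein concentration (hypothetical, for illustration only).
conc = 100e-9
for (protein, site), kd in KD.items():
    occupancy = fraction_bound(conc, kd)
    print(f"{protein:>2} on {site}: {occupancy:.2f} of sites occupied")
```

Under these assumptions the roughly sixfold difference in Kd at the ACC(A/T)ACC site translates into a clear difference in occupancy, whereas the two proteins would occupy the AACTACCGG site to a similar, low extent.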
It is important to note that, while the aforementioned studies have provided profoundly
useful insights into plant R2R3-MYB interactions with DNA sequences, they also
provide a rather incomplete picture of the specific interactions that are possible. Given
the sheer number of plant MYB proteins, the correspondingly large number of
downstream DNA targets for these proteins, and the breadth of processes controlled by
the MYB family members in plants, the complexity of plant MYB-DNA interactions
characterised to date is the tip of the proverbial iceberg. Clearly, there is a need for
more extensive analysis of these important interactions. One might expect considerable
inroads to be made in the future, with the emergence of new technologies to probe
DNA-protein interactions.
1.5 Future of Plant MYB-DNA Interaction Studies
1.5.1 Determining the Breadth of MYB DNA Targets in vitro
Rapid, high-throughput identification of in vitro MYB DNA-binding sequences
will be required to identify all variants of their DNA targets.
Transcription factors, including MYB proteins, are promiscuous in that they can
interact with, and initiate transcription from, multiple target sequences (Solano et al., 1997;
Patzlaff et al., 2003a; Meijsing et al., 2009). Well-established protocols based on
recombinant MYB transcription factor DNA-binding domains have been used to enrich
for target sequences from libraries of random DNA sequences (Howe and Watson,
1991; Weston, 1992; Grotewold et al., 1994; Jackson et al., 2001). These experiments
include cyclic amplification and selection of targets (CASTing) (Wright et al., 1991) and
systematic evolution of ligands by exponential enrichment (SELEX) (Roche et al.,
1992). Both of these procedures have determined numerous MYB in vitro DNA binding
motifs for several MYB transcription factors, and their underlying principles can now be
scaled to accommodate high-throughput approaches. Microarray-based technologies,
such as protein-binding microarrays, have been developed to identify transcription
factor sequence specificities (Seong and Choi, 2003; Mukherjee et al., 2004; Berger et
al., 2006; Kim et al., 2009). Binding sites identified by this technology have correlated
with in vivo transcription factor-bound DNA sequences identified by ChIP experiments
(Mukherjee et al., 2004; Badis et al., 2009; Grove et al., 2009).
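Sequences recovered from selection experiments such as CASTing or SELEX are commonly summarised as a table of base counts per position, from which a degenerate consensus can be written. The sketch below illustrates that summary step only. The input sequences are hypothetical and assumed to be already aligned, and the 25% threshold for reporting a base is an arbitrary choice made for this example.

```python
from collections import Counter

# Hypothetical, pre-aligned sequences of equal length standing in for
# oligonucleotides recovered after several rounds of selection.
selected = [
    "ACCTACC",
    "ACCAACC",
    "ACCTACC",
    "ACCAACC",
    "ACCTAAC",
]

def position_frequency_matrix(seqs):
    """Count each base at each aligned position."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned and of equal length")
    return [Counter(s[i] for s in seqs) for i in range(length)]

def consensus(pfm, min_fraction=0.25):
    """Write each position as one base, or as (A/T)-style alternatives."""
    out = []
    for counts in pfm:
        total = sum(counts.values())
        # Keep every base that reaches the (assumed) reporting threshold.
        bases = sorted(b for b, n in counts.items() if n / total >= min_fraction)
        out.append(bases[0] if len(bases) == 1 else "(" + "/".join(bases) + ")")
    return "".join(out)

pfm = position_frequency_matrix(selected)
print(consensus(pfm))
```

In practice the recovered sequences must first be aligned, and the counts are usually weighted or converted to log-odds scores before a motif is reported.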
Two types of protein-binding microarrays have emerged: double-stranded DNA
microarrays and transcription factor microarrays. Double-stranded DNA microarrays
contain all possible double-stranded 11bp sequences (approximately 4.2 million
sequences) in roughly 240,000 oligonucleotides (Godoy et al., 2011). Recombinant
protein from a transcription factor of interest is flowed over the double-stranded DNA
microarray and washed with increasing concentrations of salts. This technology allows
the accurate quantification of binding affinities to all possible DNA-binding sites
recognised by the transcription factor of interest in just one hybridisation step.
Transcription factor DNA-binding enrichment, based on a protein array, allows for the
capture of multiple transcription factors and then discovery of their binding sites (Linnell
et al., 2004). A library of random oligonucleotides is flowed over captured proteins to
identify the transcription factors' DNA-binding sites. The array is then washed with
increasing salt concentrations to allow for the identification of relative binding affinities.
This specific protein-binding microarray has a slight advantage over the double-
stranded DNA microarray because multiple transcription factors can hybridise onto a
chip, allowing for the identification of binding preferences for transcription factor families
(Gong et al., 2008). Both these techniques are powerful means to identify in vitro DNA-
binding sites of proteins of interest in a time-efficient manner (Linnell et al., 2004;
Gong et al., 2008); however, there are limitations to these experiments. One limitation
is that protein-DNA complexes that have weak affinity for each other will be washed
away with low concentrations of salts, biasing the results. Another limitation is that the
whole structure of the protein is not available to bind to its preferred DNA targets
because a portion of the protein is hybridised to the array. Not knowing the DNA
binding domain of the protein of interest could lead to misleading results.
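The figures quoted above for the double-stranded DNA microarray can be checked with a short calculation, which also shows why far fewer oligonucleotides than sequences are needed: each oligonucleotide carries many overlapping 11-mers. This is a sketch only. It assumes that every single-stranded 11-mer must appear at least once in the synthesised strand; if a site and its reverse complement are counted as one, the requirement is halved.

```python
import math

K = 11                # site length on the array, from the text
N_OLIGOS = 240_000    # approximate number of oligonucleotides, from the text

# All possible 11-nt sequences on one strand.
n_kmers = 4 ** K

# An odd-length sequence can never equal its own reverse complement, so each
# double-stranded 11-mer is counted exactly twice in n_kmers.
n_duplexes = n_kmers // 2

# Minimum number of distinct 11-mers each oligonucleotide must carry
# (assumption: every single-stranded 11-mer appears at least once).
min_windows = math.ceil(n_kmers / N_OLIGOS)

# Shortest variable region that provides that many overlapping windows.
min_variable_length = min_windows + K - 1

print(n_kmers, n_duplexes, min_windows, min_variable_length)
```

The first value, 4,194,304, agrees with the "approximately 4.2 million sequences" stated in the text.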
1.5.2 Emerging Approaches for Plant MYB Target Discovery and Analysis
in vivo
Crucially, in vitro MYB DNA-binding sites might differ from those preferred in vivo
(Barbulescu et al., 2001; Verrijdt et al., 2003). These differences are a result of in vivo
protein-protein interactions and post-translational modifications altering DNA binding
specificity, as well as conformational differences between in vitro recombinant DNA-
binding domains and in vivo native conformations of these domains. The in vivo
availability of transcription factor DNA-binding sites is also controlled by the packaging
of genomic DNA in chromatin. Therefore, alternative in vivo approaches are necessary
to map MYB-DNA binding sites in the genome accurately.
Transient expression assays and yeast one-hybrid assays are now staples for
establishing that a particular MYB binds to a specific DNA target in vivo (Patzlaff et al.,
2003a; Patzlaff et al., 2003b; Xie et al., 2010). These procedures involve the
expression of a transcription factor of interest within organisms, such as plants or
yeasts, to see if it is sufficient to enable the transactivation of an artificial gene
comprising a tandem repeat of its putative DNA-binding site fused to a minimal
promoter, upstream of a reporter gene. These experiments, with the right controls,
can confirm that a specific transcription factor of interest interacts with, and activates
transcription from, its putative DNA-binding site in vivo. For example, the R2R3-MYB transcription
factors AtMYB11, AtMYB12 and AtMYB111 were shown, through transient expression
assays in Arabidopsis thaliana protoplasts, to be functionally similar to the
structurally similar maize P protein (Mehrtens et al., 2005; Stracke et al., 2007).
AtMYB11, AtMYB12, AtMYB111, and P protein had similar target gene specificity,
regulating a myriad of flavonoid biosynthetic genes. Furthermore, all activated target
gene promoters in vivo in the presence of a MYB recognition element. Transient
expression assays and yeast one-hybrid assays are well-established experiments to
validate whether a particular protein activates transcription from a particular motif; however,
chromatin immunoprecipitation (ChIP) followed by either whole-genome tiled microarray
analysis (ChIP-chip) or high-throughput signature sequencing (ChIP-seq) can identify
novel in vivo DNA targets of proteins of interest, yielding more biologically meaningful
results. ChIP-chip and ChIP-seq have proven to be powerful tools with which to identify in
vivo binding sites of sequence-specific transcription factors in the context of chromatin
(Massie and Mills, 2008), which avoids many caveats of the aforementioned techniques
(Solomon et al., 1988).
Recently, ChIP identified in vivo DNA-binding target sites for a select group of MYB
proteins (Wang et al., 2000; Berge et al., 2007; Georlette et al., 2007; Hara et al., 2009;
Morohashi and Grotewold, 2009; Fornale et al., 2010; Xie et al., 2010). In plants, ChIP-
chip identified in vivo binding sites for two Arabidopsis thaliana two-MYB-repeat
proteins, FOUR LIPS (FLP; AtMYB124) and AtMYB88 (Xie et al., 2010). FLP and
MYB88 were shown to directly bind promoters of cell cycle genes, including CDKA:1
(At3g48750), CELL DIVISION CYCLE6a and 6b (CDC6a At2g29680 and 6b
At1g07270), Cyclind4:1 (At5g65420), a cyclin-like gene, CYCLINT:1 (CYCT:1,
At1g35440), CDKD1:3 (At1g8040) and CYCB1:3 (At3g11520). These results were
consistent with a role for FLP/MYB88 in suppressing DNA replication and cell cycle progression
within the stomata. Systematic evolution of ligands by exponential enrichment (SELEX)
and EMSA, along with ChIP-chip, helped to establish that this group bound to the core
consensus sequence (A/T/G)(A/T/G)C(C/G)(C/G). Similarly, Zea mays MYB31 was
shown by SELEX and ChIP to bind to the sequence ACC(T/A)ACC within the promoters of the two
lignin genes ZmCOMT and ZmF5H, resulting in the repression of lignin biosynthetic gene
expression (Fornale et al., 2010). Furthermore, ChIP-chip was performed on the
trichome developmental selectors GLABRA3 (GL3) and GLABRA1 (GL1), encoding
basic helix-loop-helix (bHLH) and MYB transcription factors respectively. ChIP-chip
identified 20 novel in vivo GL3/GL1 direct targets such as SCL8 and MYC1 (involved in
the control of gene expression), SIM (a cyclin-dependent kinase inhibitor), and RBR1 (a
negative regulator of the cell cycle transcription factor E2F) (Morohashi and Grotewold,
2009). Recently, ChIP coupled with high-throughput sequencing was employed to
determine that the R2R3-MYB P1 protein has a broader suite of direct target genes
outside of the already known flavonoid biosynthetic genes (Morohashi et al., 2012).
ChIP-chip and ChIP-seq are difficult procedures to conduct because they require an
antibody that specifically recognises a transcription factor of interest. Although these
procedures are the most effective way of determining true in vivo DNA-binding targets
and sites, they have not been used to study the majority of the MYB superfamily.
Epitope tagging is the process of making the product of a gene of interest
immunoreactive to an already synthesised antibody (Massie and Mills, 2008). This can
be done by inserting a polynucleotide encoding an epitope into a gene of interest and
expressing the gene in an appropriate host. The protein encoded by the gene of interest can
then be detected with an antibody that has already been generated. This method could be
used as an alternative to generating novel antibodies before conducting ChIP-chip.
When an antibody cannot be generated against a protein of interest, this method is the best
way to obtain in vivo DNA-protein binding data for the protein of interest.
Other methods can also identify in vivo MYB DNA-binding sites and downstream
targets. A glucocorticoid receptor (GR)-mediated inducible system has successfully
been used to define direct target genes of several putative transcription factors
(Sablowski and Meyerowitz, 1998; Wagner et al., 1999; Samach et al., 2000). In a GR-
mediated inducible system a fusion protein between a protein of interest and the rat
glucocorticoid receptor hormone binding domain is engineered. This fusion protein is
retained in the cytoplasm in the absence of the synthetic steroid hormone
dexamethasone. Upon addition of dexamethasone, the protein of interest-GR fusion
protein enters the nucleus and binds to the downstream target genes of the protein
of interest. The addition of translational inhibitors such as cycloheximide blocks the
synthesis of secondary regulators, and so prevents indirect downstream effects of the
protein of interest. By assaying genome-wide expression
changes on microarrays, one can determine direct target genes of a protein of interest.
The GR-inducible system was used to show that transcription of the single-repeat MYB gene
CAPRICE (CPC) is regulated directly by WER (an R2R3-MYB transcription factor)
(Ryu et al., 2005). Using EMSAs, two WER-binding sites (WBSs; WBSI and WBSII)
were verified in the CPC promoter. WER-WBSI binding was further validated in vivo
using yeast one-hybrid assays. In another example, the involvement of AtMYB80 in tapetal
and pollen development was examined (Phan et al., 2011). Using the GR system, it
was determined that the expression of 79 genes changed when the function of the R2R3-MYB
transcription factor AtMYB80 was restored in the myb80 mutant following dexamethasone
induction (Phan et al., 2011). Thirty-two of these genes were analyzed using ChIP, and
three were identified as direct targets of AtMYB80. These genes were shown to encode
a glyoxal oxidase (GLOX1), a pectin methylesterase (VANGUARD1), and an A1
aspartic protease (UNDEAD), and the ChIP results corresponded with in vitro binding data. This
procedure is a powerful way of identifying direct targets of transcription factors. When
the GR-inducible system is coupled with in silico processes, such as promoter analyses,
and DNA-binding experiments, such as EMSAs, yeast one-hybrid assays and
protoplast assays, it has proven quite useful in identifying in vivo DNA-binding sites.
The use of both expression data and DNA-binding site experiments is required to
reduce the number of false-positive MYB targets. The forkhead box A (FoxA) homolog PHA-4
regulates organogenesis of the pharynx in Caenorhabditis elegans (Gaudet and Mango,
2002). Expression of PHA-4 targets correlated with the presence of PHA-4 binding sites in their
promoter regions, and the timing of target expression correlated with binding affinity between
PHA-4 and its target sequence (Gaudet et al., 2004). The data suggested that PHA-4
regulates pharyngeal organ development by combining PHA-4 binding affinity and
cooperating factors to regulate gene expression temporally. ChIP-seq data for PHA-4
validated this assessment; 87% of the associated genes were expressed when PHA-4
binding was present, and this expression was reduced to 60% when PHA-4 binding was
not present (Zhong et al., 2010). Thus, using both expression data and
DNA-binding site experiments is a powerful means of validating that the binding of a
factor activates the expression of its putative target genes. The use of both expression
data and MYB-DNA binding site assays in tandem will aid in generating a more
biologically significant MYB network.
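The logic of combining the two kinds of evidence described above can be expressed as simple set operations: a gene is a candidate direct target only if it both responds to the transcription factor and is bound by it. The following sketch uses hypothetical gene identifiers and hypothetical category labels; none are taken from the cited studies.

```python
# Hypothetical gene identifiers, for illustration only.
responsive = {"geneA", "geneB", "geneC", "geneD"}  # expression changes on induction
bound = {"geneB", "geneD", "geneE"}                # binding detected (e.g. ChIP or EMSA)

def classify_targets(responsive, bound):
    """Split candidate genes by the evidence that supports them."""
    return {
        # Bound by the factor and responsive to it: candidate direct targets.
        "direct": responsive & bound,
        # Responsive but with no binding detected: possibly indirect targets.
        "responsive_only": responsive - bound,
        # Bound but unresponsive under the conditions tested.
        "bound_only": bound - responsive,
    }

result = classify_targets(responsive, bound)
for label, genes in result.items():
    print(label, sorted(genes))
```

Genes in the last two categories are not necessarily false positives; binding may be conditional on cooperating factors, and expression changes may fall below the detection threshold of the assay.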
Another problem with the identification of MYB binding sites is that some MYB binding
sites are not proximal to the predicted target genes. In an early ChIP-seq study, there
were a large number of bound sites observed for the human interferon-γ (IFN-γ)
responsive transcription factor STAT1 (Robertson et al., 2007). Before stimulating the
cells with IFN-γ, 10,000 binding sites were identified. Binding sites for STAT1 increased
fourfold after stimulating the cells with IFN-γ. In both conditions, approximately 50% of
the total sites were intragenic and 25% were intergenic. Most binding sites
were not located near STAT1-regulated genes, which suggested that bound sites were
not directly regulating nearby genes. This can be explained by chromatin looping, a
mechanism for transcriptional control that involves bringing regulatory elements into
proximity of target genes (Vakoc et al., 2005). The use of chromosome conformation
capture studies in the future will determine whether interactions between distant regulatory
regions, such as locus control regions (LCRs), and MYB target genes are required for
high-level transcription.
1.6 Transcriptional Regulation of MYB proteins
1.6.1 Regulators Affecting MYB Gene Expression in Networks
MYB expression has been shown to be transcriptionally regulated by a suite of
regulators in animals and plants (Dubos et al., 2010). These regulators of MYB
expression are governed by biotic and abiotic stimuli (Martin and Paz-Ares, 1997). In
Arabidopsis thaliana, MYB transcription factor genes have been identified as direct targets of 87 other
regulators (http://arabidopsis.med.ohio-state.edu/). For instance, AGL15, a MADS
family protein that regulates embryo development, directly binds to 29 different MYB
genes, although DNA binding does not always imply transcriptional regulation (Zheng et
al., 2009). Regulators have mainly been shown to modulate expression from MYB
promoter regions 500 bp upstream of the transcriptional start site; however, other
regions of MYB genes, including introns, can be involved in modulating MYB expression
(Table 1.1, Fig. 1.2).
1.6.2 The Role of Introns in MYB Transcriptional Regulation
Introns impact multiple steps in the expression of genes in plants and animals (Le Hir et
al., 2003; Rose, 2008). In Arabidopsis thaliana, approximately 80% of genes contain
introns (Rose, 2002; Carmel et al., 2007). Introns can contain regulatory sequences
that allow binding of activator and repressor proteins to these sites to modulate
transcription (Rippe et al., 1989; Bruhat et al., 1990; Deyholos and Sieburth, 2000).
Examples of intragenic regulation of gene expression are growing. Prominent among
these examples is the intron-mediated regulation of animal Myb expression.
A direct link between animal Myb intragenic sequences and gene expression has been
reported (Dooley et al., 1996). Human c-Myb was transcriptionally regulated by nuclear
protein complexes that bind to a conserved motif of c-Myb intron 1. A 70 kDa protein
bound to this intragenic motif, and was associated with transcriptionally active leukemia
cells. Furthermore, a 20 kDa repressor protein (with a c-Jun domain) in transcriptionally
silent cells bound to another motif within intron 1, demonstrating complex regulation
of transcription during leukemic cell growth and differentiation.
Intragenic modulation of MYB expression is not limited to animals. In plants, the
Arabidopsis thaliana R2R3-MYB gene GLABRA1 (GL1) regulates trichome
development (Oppenheimer et al., 1991). The first intron of GL1 plays a role in
patterning trichomes (Wang et al., 2004). This intron operates as an enhancer in
trichome cells and a repressor in nontrichome cells, generating a trichome-specific
pattern of MYB gene expression. A motif was identified (CA/CGTTA) in the first intron
of GL1 and the position of the motif was conserved between closely related MYB
proteins AtWER and GaMYB2. This conserved intragenic motif was critical in the
regulation of trichome patterning and it was suggested that this motif might be a binding
site for activators and repressors that regulate transcription of this gene. These studies
suggest that MYB introns can play regulatory roles; however, more research is needed
to elucidate the molecular components involved in such regulation.
Novel regulatory proteins that bind MYB intragenic regions can be determined through
pull-down assays followed by mass spectrometry. For a streptavidin biotin pull-down
assay, the intragenic DNA sequence of interest is biotinylated and bound to streptavidin
beads. Nuclear protein extracts are then passed over the complex and washed to
remove proteins that bind non-specifically. The bound nuclear proteins are eluted from
the complex and subsequently identified using a mass-spectrometry approach (Hewel
et al., 2010). This technique can identify novel proteins that bind to MYB intragenic
regions. Further analyses, including genetic over-expression and loss-of-function
approaches, can be used in a complementary manner to validate the role of such
proteins in the regulation of MYB expression.
1.7 Research Hypotheses and Aims
The past decades have seen remarkable inroads made into our understanding of the
molecular interactions between sequence-specific transcription factors and their DNA
targets in general, and MYB proteins and their binding sites more specifically. This is
particularly true for animal R1R2R3-MYB transcription factors. Insights into the
specificities of R1R2R3-MYB interactions with DNA target sequences are, in turn,
providing greater understanding of the molecular mechanisms that control cellular
processes (Howe and Watson, 1991; Weston, 1992; Grotewold et al., 1994; Solano et
al., 1997; Patzlaff et al., 2003b) (Fig. 1.2, Table 1.1), and are aiding in the development of
diagnostics and therapeutics for when those mechanisms go awry (Vicente et al., 2009;
Stenman et al., 2010). By contrast, comparable understanding of the molecular
mechanisms that proceed from the interaction of plant R2R3-MYB proteins with their
cognate DNA targets is much less complete, with only a handful of MYB-DNA
interactions characterised at the molecular level for any given plant species.
Consequently, understanding of the precise means by which plant R2R3-MYB proteins coordinate gene
expression and are regulated to give rise to plant phenotype is still at a relatively
nascent stage. This said, with the emergence of new techniques that enable the
dissection of protein-DNA interactions more rapidly (Gertz et al., 2005; Vavouri and
Elgar, 2005), and/or with higher resolution (Mardis, 2007; Massie and Mills, 2008),
and/or for a larger number of proteins (Huang, 2003; Seong and Choi, 2003; Godoy et
al., 2011), the characterisation of new MYB-DNA interactions and MYB regulatory
proteins, particularly those that occur in plant species, should proceed apace. Given
this, the coming decade promises to provide great insights into the means by which
members of this remarkable family of proteins convert molecular information into whole
organism responses.
Recently, the R2R3-MYB transcription factor AtMYB61 was shown to be involved in
carbon acquisition and resource allocation (Penfield et al., 2001; Newman et al., 2004;
Liang et al., 2005; Romano et al., 2012). The aim of the research presented in this
thesis is to better characterise the molecular function of an R2R3-MYB family member,
focusing on the interplay between AtMYB61 and its DNA target sequences. In addition
to shedding light on a particular transcription factor, the project should establish a
pipeline for the characterisation of the molecular functions of any plant transcription
factor. The aims are:
(1) To test the hypothesis that AtMYB61 binds to the 5' non-coding regulatory
regions of a distinct set of predicted target genes to modify transcription.
(2) To test the hypothesis that AtMYB61 binds to a preferred DNA sequence to
activate transcription.
(3) To test the hypothesis that AtMYB61 is regulated by an intragenic motif
within the 5' coding region of its second intron in a sucrose-dependent manner.
To address these aims, three separate experiments were undertaken. First,
electrophoretic mobility shift assays (EMSAs) were used to examine the binding of
AtMYB61 to the upstream regulatory regions of predicted downstream targets. Second, a
cyclic amplification and selection of target sequences (CASTing) assay was conducted
on AtMYB61 recombinant protein and compared to yeast activation assays to determine
AtMYB61 preferred DNA-binding sites. Finally, an EMSA and a streptavidin-biotin pull-down
assay followed by mass spectrometry were used to examine whether putative repressors
bind to the conserved motif within the second intron of AtMYB61 in a sugar-dependent
manner.
1.8 Acknowledgements
We are very grateful to Dr. Katharina Bräutigam, Joseph Skaf, and Heather Wheeler for
fruitful discussions and extensive assistance with earlier drafts of this manuscript. This
work was generously supported by a Natural Sciences and Engineering Research
Council of Canada (NSERC) Canadian Graduate Scholarship (CGSD) awarded to MP,
and by funding from the University of Toronto and NSERC to MMC.
Chapter 2
AtMYB61, an R2R3-MYB transcription factor, is a pleiotropic regulator of plant carbon acquisition and resource allocation
This chapter is an extract of material originally contained in the following publication:
Romano, J., Dubos, C., Prouse, M.B., Wilkins, O., Hong, H., Poole, M., Kang, K., Li, E.,
Douglas, C.J., Western, T.L., Mansfield, S.D., and Campbell, M.M. (2012) AtMYB61, an
R2R3-MYB transcription factor, is a pleiotropic regulator of plant carbon acquisition and
resource allocation. New Phytologist. 195: 774-786.
Contributions: MBP, JMR, CD, MMC designed research; MBP, JMR, CD, and HH
performed research; JMR, CD, MBP, MP, and OW analysed data; MBP, JMR, CD, and
MMC wrote manuscript with editorial assistance from MBP, HH, and OW.
MBP contributed specifically to Fig. 2.2, Fig. 2.3, Table 2.3.
Copyright: The material in this chapter is copyrighted by Wiley and Wiley.
2 AtMYB61, an R2R3-MYB Transcription Factor, is a Pleiotropic Regulator of Plant Carbon Acquisition and Resource Allocation
2.1 Abstract
Throughout their lifetimes, plants must coordinate the regulation of various facets of
growth and development. Previous evidence has determined that the Arabidopsis
thaliana R2R3-MYB, AtMYB61, functions as a coordinate regulator of multiple aspects
of plant resource allocation. Using a combination of cell biology and transcriptome
analysis, in conjunction with over-expression and loss-of-function genetics, the role of
AtMYB61 in conditioning resource allocation was explored. Putative downstream
targets of AtMYB61 were predicted and include genes that encode the following
proteins: a KNOTTED1-like transcription factor (KNAT7, At1g62990); a caffeoyl-CoA 3-
O-methyltransferase (CCoAOMT7, At4g26220); and a pectin-methylesterase (PME,
At2g45220). Statistically over-represented motifs were identified in the 5' non-coding
regions of the putative target genes, and these correspond to previously characterised
AC element motifs that function as R2R3-MYB targets. The consensus motif functions
as a bona fide target for AtMYB61 binding as determined by an electrophoretic mobility
shift assay. Binding of AtMYB61 to the gene regulatory sequences of the putative target
genes, which contain multiple copies of these motifs, was confirmed via electrophoretic
mobility shift assays. Altogether, these experiments provide an assessment of the ability of
AtMYB61 to bind to gene regulatory sequences present in the 5' non-coding sequences
of the three putative downstream targets, KNAT7, CCoAOMT7 and a PME,
substantiating its role as a potential regulator of the transcription of these genes.
Together with the analysis of the regulation of AtMYB61 expression, these studies
provide insights into the transcriptional regulatory circuit downstream of AtMYB61.
2.2 Introduction
Plants have evolved mechanisms that enable them to contend with fluctuations in their
capacity to fix carbon (Halford and Paul, 2003; Koch, 2004; Gibson, 2005; Rogers et al.,
2005; Coupe et al., 2006; Rolland et al., 2006; Solfanelli et al., 2006; Shimazaki et al.,
2007; Hanson and Smeekens, 2009). Some of these mechanisms control the aperture
of stomata and regulate the uptake of CO2 for photosynthesis (Hetherington and
Woodward, 2003; Coupe et al., 2006; Shimazaki et al., 2007). Plants have also evolved
mechanisms to appropriately modulate the allocation of carbon to various facets of plant
growth, development and metabolism (Osuna et al., 2007; Smith and Stitt, 2007; Stitt et
al., 2007; Usadel et al., 2008; Gibon et al., 2009; Sulpice et al., 2009; Graf et al., 2010).
Although a body of evidence suggests a link between the pathways that modulate
carbon acquisition through stomata with those involved in resource allocation (Tallman,
2004; Coupe et al., 2006; Shimazaki et al., 2007; Liang et al., 2010; Romano et al.,
2012), little is known about the specific factors involved.
AtMYB61 (At1g09540), which encodes a member of the Arabidopsis thaliana R2R3-
MYB family of transcription factors, is a gene that controls resource acquisition and
allocation (Penfield et al., 2001; Newman et al., 2004; Liang et al., 2005; Romano et al.,
2012). AtMYB61 expression was both sufficient and necessary to bring about
reductions in stomatal aperture with consequent effects on gas exchange (Liang et al.,
2005). Analysis of loss-of-function atmyb61 mutants showed that AtMYB61 was also
necessary for the deposition of seed coat mucilage (Penfield et al., 2001). Other
experiments have revealed that AtMYB61 plays a role in the control of lignification and
photomorphogenesis (Newman et al., 2004; Dubos et al., 2005). In keeping with its role
as a regulator of resource allocation, AtMYB61 was shown to be expressed in sink
tissues, notably xylem, roots and developing seeds (Romano et al., 2012). Loss of
AtMYB61 function decreases xylem formation, induces qualitative changes in xylem cell
structure and decreases lateral root formation; in contrast, over-expression of AtMYB61
has the opposite effect on these traits.
The link between AtMYB61 and its role in the regulation of carbon allocation is not
obvious. We show here that AtMYB61 orchestrates changes in transcriptome activity
that modify plant resource allocation. Together with our previous results (Newman et
al., 2004; Liang et al., 2005; Romano et al., 2012), these new data support the
hypothesis that AtMYB61 binds to the upstream regulatory regions of predicted
downstream targets, modifying transcription to control both resource acquisition through
stomata, as well as resource allocation, largely into non-recoverable carbon sinks,
throughout plant growth and development.
2.3 Materials and Methods
2.3.1 Plant Material, Seed Sterilization and Growth Conditions
All wild-type (WT) and mutant Arabidopsis thaliana seeds were in the Columbia-0
background. Plants over-expressing AtMYB61 under the control of the Cauliflower
Mosaic Virus 35S promoter (35S::MYB61) were as described previously (Newman et
al., 2004; Liang et al., 2005). Similarly, AtMYB61 loss-of-function mutants (atmyb61)
have been described previously (Penfield et al., 2001; Liang et al., 2005). One
independently transformed AtMYB61 over-expressing line (35S::MYB61) and at least
two loss-of-function alleles (atmyb61-1 and atmyb61-2) were used in all experiments,
and results are representative. T-DNA insertional mutant lines corresponding to either
AtMYB61 or putative downstream targets of AtMYB61 were obtained from the
Arabidopsis Biological Resource Center (ABRC) (Alonso et al., 2003). Homozygous T-
DNA lines were obtained by PCR screening using the left border A T-DNA primer and a
right border gene-specific primer. Insertion sites were sequenced for all mutants to
verify insertional mutagenesis (data not shown), and quantitative PCR was conducted to
show that the mutants were loss-of-function (data not shown).
For primary bolt and hypocotyl analyses, seeds were germinated and plants were grown
on soil. Seeds were sown on dampened soil and then cold stratified for 3 d before
placement in a growth chamber at 21°C with a regime of 12 h of light (120 μmol m−2 s−1)
and 12 h of dark. This growth regime is referred to as short-day conditions herein.
To induce secondary growth in the hypocotyl, the primary inflorescence and secondary
inflorescences were continually removed from plants grown under short days for 10 wk
(Chaffey et al., 2002). Hypocotyl sections were fixed, coated or stained for transmission
electron microscopy, scanning electron microscopy and bright- and dark-field
microscopy, respectively.
2.3.2 RNA Isolation and Quantitative PCR
To verify that the insertional mutants described above were loss-of-function
mutants, transcript accumulation corresponding to the disrupted gene
was determined in each mutant. Primary and secondary inflorescences were excised
with a scalpel and immediately frozen in liquid nitrogen. Approximately 1 g (fresh
weight) of ground tissue was used per RNA extraction. TRIzol reagent (Invitrogen) was
used following the manufacturer's recommendations. The RNA pellet was dissolved in
30 μl diethylpyrocarbonate (DEPC)-treated water. RNA quantity and purity were
analysed using a NanoDrop spectrophotometer (Thermo Scientific, Wilmington, DE,
USA), and RNA integrity was assessed by loading 1 μg of RNA onto a 1% agarose 0.5X
TBE (Tris-borate-EDTA) gel. First-strand cDNA was generated using 5 μg total RNA
with oligo dT primer with SuperScript II (Invitrogen). Standard curves, quantitative PCR
and melt curves were conducted with a Bio-Rad Chromo4 Real-Time PCR detector
using SYBR Green fluorescent dye (Bio-Rad). In order to avoid the generation of a
reverse transcription-polymerase chain reaction (RT-PCR) amplicon from genomic
DNA, primers were designed so that the 3′-end of at least one of the primers spanned
an intron splice site. The relative mRNA levels were determined by normalizing the
PCR threshold cycle number of each gene with that of the TUBULIN-4 reference gene
(At1g04820). Primer sequences and quantitative PCR amplification conditions are
available on request.
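The normalization step can be illustrated with a minimal sketch. It assumes the comparative threshold-cycle (2^-ΔCt) calculation with roughly 100% amplification efficiency for both primer pairs; the text does not state the exact formula used, and because standard curves were also run, an efficiency-corrected calculation may have been applied instead. The function name and all Ct values below are hypothetical.

```python
# Sketch of reference-gene normalization by the comparative Ct approach.
# Assumption: both amplicons double every cycle (efficiency ~100%).

def relative_abundance(ct_target: float, ct_reference: float) -> float:
    """Return target transcript abundance relative to the reference gene."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical threshold-cycle values, not data from this study.
ct_values = {
    "WT": {"target": 24.0, "TUB4": 20.0},
    "mutant": {"target": 29.0, "TUB4": 20.0},
}

rel = {
    genotype: relative_abundance(ct["target"], ct["TUB4"])
    for genotype, ct in ct_values.items()
}

# Abundance in the mutant expressed as a fraction of the WT level.
fold_change = rel["mutant"] / rel["WT"]
print(rel, fold_change)
```

With these invented values, a five-cycle delay in the mutant corresponds to a 32-fold reduction in normalized transcript abundance, which is the kind of result expected for a loss-of-function allele.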
2.3.3 Secondary Thickened Hypocotyls Stained with Phloroglucinol
Secondary thickened hypocotyl sections (c. 1 mm) were stained with phloroglucinol as
described previously (Newman et al., 2004). Sections were viewed with an Olympus
SZX16 microscope under both bright and dark field. Images were captured with a
QImaging MicroPublisher 3.3RTV digital camera utilizing QCapture version 2.7
software. Measurements employed to calculate the area of xylem and area of phloem
were calculated using ImageJ 1.38x (Collins, 2007).
2.3.4 Transmission Electron Microscopy
Segments (1 cm) of primary inflorescence stems from stage 6.30 plants, or secondary
thickened hypocotyls, were fixed in 2% glutaraldehyde in 0.1 M Sorensen's phosphate
buffer (pH 7.4) for 72 h at room temperature, and postfixed in 1% osmium tetroxide in
0.1 M phosphate buffer for 1 h in the dark. Samples were then dehydrated through an
ascending graded series of ethanol (30%, 50%, 70%, 80%, 90%, 100%), infiltrated with
Spurr's epoxy resin and polymerized overnight at 65°C. Semi-thin sections (0.5–1 μm)
were cut using glass knives on a Leica EM UC6 ultramicrotome (Leica, Allendale, NJ,
USA), stained with 0.1% toluidine blue and 0.025% methylene blue, and examined by
light microscopy to determine the quality of fixation and orientation of the samples.
Ultrathin sections (60–90 nm) were cut using a diamond knife, stained with 3% uranyl
acetate in 50% methanol, poststained with Reynolds' lead citrate and examined using a
Hitachi H7000 transmission electron microscope (Hitachi, Mississauga, ON, Canada)
operated at 75 kV. Pictures were taken using Kodak 4489 electron microscope film,
and negatives were scanned using an Epson Perfection 1680 scanner (Epson,
Markham, ON, Canada) at 1200 dpi.
2.3.5 Microarray Analysis
Total RNA was extracted using described methods (Newman et al., 2004) from 7-d-old
Arabidopsis thaliana seedlings grown in the dark in liquid MS medium as described
above. Each pool of RNA was derived from hundreds of seedlings. Three biological
replicates were collected for each condition (genotype × sucrose presence/absence) for
RNA extraction. As there were three genotypes (WT, atmyb61, 35S::MYB61) and two
conditions (presence and absence of sucrose), there were 18 RNA samples in total.
The quality of the total RNA was assessed using an Agilent Bioanalyser (Agilent,
Mississauga, ON, Canada) at the Genomic Arabidopsis Resource Network (GARNet)
microarray facility at the Nottingham Arabidopsis Stock Centre. Hybridization to the 18
Affymetrix GeneChip Arabidopsis ATH1 Arrays (Affymetrix, Santa Clara, CA, USA),
scanning of the hybridized arrays and raw data collection were performed at the
GARNet facility at the Nottingham Arabidopsis Stock Centre according to standard
Affymetrix protocols (http://affymetrix.com). The data for the RNA quality control, the
raw data for the triplicated microarray experiments and the detailed description of the
MIAME (minimum information about a microarray experiment)-compliant experimental
conditions are publicly available at:
http://ssbdjc2.nottingham.ac.uk/narrays/experimentpage.pl?experimentid=14.
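The factorial design described in this section (three genotypes, two sucrose conditions, three biological replicates) can be enumerated as a quick check of the sample count. This is only an illustrative sketch; the sample labels are hypothetical and do not correspond to the identifiers used in the deposited dataset.

```python
from itertools import product

# Factors taken from the text; label strings are hypothetical.
genotypes = ["WT", "atmyb61", "35S::MYB61"]
sucrose = ["minus_sucrose", "plus_sucrose"]
replicates = [1, 2, 3]

# One RNA sample (and one ATH1 array) per genotype x condition x replicate.
samples = [
    f"{genotype}|{condition}|rep{rep}"
    for genotype, condition, rep in product(genotypes, sucrose, replicates)
]

print(len(samples))  # 3 x 2 x 3 = 18
```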
2.3.6 Bioinformatic Analyses to Identify AtMYB61 Targets
To identify AtMYB61 targets, a two-stage complete transcriptome analysis was
undertaken. In the first stage, publicly available, complete Arabidopsis thaliana
transcriptome microarray data were used to identify those genes sharing the same
transcript abundance profile as AtMYB61 across multiple stages of development.
Genes were identified whose transcript abundance profiles had a Pearson correlation
coefficient > 0.8 when compared with the transcript abundance of AtMYB61 across the
66 microarrays comprising the AtGenExpress 'Developmental Baseline' dataset
(http://web.uni-frankfurt.de/fb15/botanik/mcb/AFGN/atgenex.htm,
http://bar.utoronto.ca/ntools/cgi-bin/ntools_expression_angler.cgi). The 58 genes
identified in this manner (Table 2.1) should fit into one of two
categories: (1) genes regulated in parallel with AtMYB61, and (2) genes regulated by
AtMYB61. To select genes in the latter category, a second stage of analysis was
undertaken.
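The first-stage filter can be sketched as follows. This is not the Expression Angler implementation; it is a minimal illustration of ranking genes by Pearson correlation with a bait profile and keeping those above the 0.8 cut-off. The gene names and expression values are invented, and real profiles would span all 66 arrays.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Note: a profile with zero variance would make this undefined.
    return cov / sqrt(var_x * var_y)


def coexpressed(bait, profiles, cutoff=0.8):
    """Return {gene: r} for genes whose correlation with the bait exceeds cutoff."""
    hits = {}
    for gene, profile in profiles.items():
        r = pearson(bait, profile)
        if r > cutoff:
            hits[gene] = r
    return hits


# Hypothetical expression values across six arrays.
bait_profile = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
profiles = {
    "gene_a": [2.0, 4.0, 6.0, 8.0, 10.0, 12.5],  # tracks the bait closely
    "gene_b": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],    # anti-correlated
    "gene_c": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],    # weakly correlated
}

hits = coexpressed(bait_profile, profiles)
print(sorted(hits))
```

Only the positively correlated gene passes the filter. Anti-correlated genes are excluded because the cut-off is applied to the signed coefficient rather than to its absolute value.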
The second stage of transcriptome analysis identified genes whose transcript
abundance was influenced by the presence or absence of AtMYB61. A complete
transcriptome microarray dataset was generated using WT, atmyb61 and 35S::MYB61
grown at a time point and under conditions that allow the comparison of the impact of
AtMYB61 on transcriptome activity (seedlings grown in the dark in the absence or
presence of sucrose). In this dataset, genes that are either direct or indirect targets of
AtMYB61 should have reduced transcript abundance in atmyb61 mutants and elevated
expression in 35S::MYB61 over-expressing plants in comparison with WT. Using these
criteria to generate a 'bait' transcript abundance profile for use in the Expression Angler
co-expression tool, 31 genes were identified that had a Pearson correlation coefficient
> 0.8 across the 18 microarrays in the 'AtMYB61 dataset' (Fig. 2.1, Table 2.2). Groups
of genes with transcript abundance profiles that had high Pearson correlation
coefficients relative to AtMYB61 were identified using Expression Angler
(http://bar.utoronto.ca/ntools/cgi-bin/ntools_expression_angler.cgi) (Toufighi et al.,
2005). The calculations of the Pearson correlation coefficients were based on raw
expression values across all 18 GeneChips generated in our study. Both gene lists,
generated from the microarray and Expression Angler analyses, were then compared
using Venn Selector (http://bar.utoronto.ca/ntools/cgi-bin/ntools_venn_selector.cgi).
Three genes were identified in the intersection set.
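The second-stage bait profile and the final intersection can be sketched in a few lines. The numeric bait levels and the array ordering are assumptions made for illustration (the text specifies only that expression should be reduced in atmyb61 and elevated in 35S::MYB61 relative to WT). The two gene sets are small subsets of the AGI codes listed in Tables 2.1 and 2.2, not the complete lists.

```python
# Hypothetical relative levels encoding "low in mutant, high in over-expressor".
genotype_level = {"atmyb61": 0.0, "WT": 1.0, "35S::MYB61": 2.0}

# Assumed array order: six arrays per genotype (2 sucrose conditions x 3 replicates).
array_genotypes = [g for g in ("WT", "atmyb61", "35S::MYB61") for _ in range(6)]
bait = [genotype_level[g] for g in array_genotypes]

# Subsets of the genes in Table 2.1 (developmental dataset) and
# Table 2.2 (AtMYB61 dataset); the full tables contain more entries.
developmental_hits = {"At1g63300", "At1g62990", "At4g26220", "At1g16490", "At2g45220"}
atmyb61_array_hits = {"At1g26270", "At4g26220", "At2g16720", "At1g62990", "At2g45220"}

# Genes passing both filters, equivalent to the Venn intersection set.
shared = sorted(developmental_hits & atmyb61_array_hits)
print(len(bait), shared)
```

A plain set intersection is enough here because both lists are keyed by AGI code, so no identifier mapping is required.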
The 5′ noncoding sequences (1000 bp) for the three genes with Pearson correlation
coefficients > 0.8 in both datasets were obtained by bulk download from The
Arabidopsis Information Resource (TAIR,
http://www.arabidopsis.org/tools/bulk/sequences/index.jsp). Over-represented
sequence motifs in the 5′ noncoding sequences were identified using option 2 of
Promomer (http://bar.utoronto.ca/ntools/cgi-bin/BAR_Promomer.cgi) (Toufighi et al.,
2005) with the following parameters: base pairs in the element = 6, minimum
percentage of genes in which the identified element should occur = 75. Bootstrap
analysis (n = 1000) with a randomized dataset allows the validation of the significance
of an over-represented motif within the sequences being queried.
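The general idea of this motif search can be sketched as below. This is not the Promomer algorithm, whose internals are not described here. It is a simplified illustration that (i) enumerates 6-mers present in at least 75% of the query sequences and (ii) compares the observed count of a motif with counts in composition-preserving shuffles (n = 1000) to obtain a Z-score. The promoter fragments are invented and far shorter than the 1000-bp regions analysed in this study, and the shuffling null model is an assumption.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def count_occurrences(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def kmers_by_coverage(seqs, k=6, min_fraction=0.75):
    """Map each k-mer present in >= min_fraction of seqs to its total count."""
    presence, totals = Counter(), Counter()
    for seq in seqs:
        found = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        totals.update(found)           # add per-sequence counts
        presence.update(found.keys())  # one vote per sequence
    needed = min_fraction * len(seqs)
    return {kmer: totals[kmer] for kmer, n in presence.items() if n >= needed}


def bootstrap_z(seqs, motif, n=1000, seed=0):
    """Z-score of the observed motif count against shuffled sequences."""
    rng = random.Random(seed)
    observed = sum(count_occurrences(s, motif) for s in seqs)
    null = []
    for _ in range(n):
        total = 0
        for s in seqs:
            chars = list(s)
            rng.shuffle(chars)  # preserves base composition
            total += count_occurrences("".join(chars), motif)
        null.append(total)
    spread = pstdev(null)
    if spread == 0:
        return float("inf") if observed > mean(null) else 0.0
    return (observed - mean(null)) / spread


# Hypothetical promoter fragments; the last one lacks the motif.
promoters = [
    "TTGACCTAAGCGTACCTAACGATTGCA",
    "GGCATACCTAATTCGGATCCGTAGCTA",
    "CGTTAGCACCTAAGGTCAATGCCGTAT",
    "ATGCGTACGTTAGCCGATAGGCTTAAC",
]

candidates = kmers_by_coverage(promoters)
z = bootstrap_z(promoters, "ACCTAA")
print(candidates.get("ACCTAA"), z)
```

In this toy set the motif occurs in three of the four fragments, which just meets the 75% threshold, and the shuffled sequences rarely contain it, so the Z-score is strongly positive.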
2.3.7 Electrophoretic Mobility Shift Assay (EMSA)
Recombinant AtMYB61 protein was produced in Escherichia coli using the coding
sequence cloned in frame into the NdeI and BamHI sites of the pET15b vector
(Novagen, EMD Millipore, Mississauga, ON, Canada). Recombinant AtMYB61 protein
was produced, extracted and affinity purified as described previously for pine MYB
proteins (Patzlaff et al., 2003b). EMSA conditions were exactly as described previously
(Patzlaff et al., 2003b; Gomez-Maldonado et al., 2004), but using recombinant AtMYB61
protein instead of pine MYB protein.
2.3.8 Transcriptional Activation Assay
Transcriptional activation assays using yeast were performed as described previously
(Patzlaff et al., 2003b), but substituting the AtMYB61 coding sequence instead of the
pine MYB sequences.
2.3.9 Fibre Quality Analysis
Secondary thickened hypocotyls were subjected to fibre quality
analysis according to published methods (Chaffey et al., 2002). Fibre quality analysis
enables the determination of cell types liberated from secondary xylem following
maceration, by documenting cell lengths, widths and frequencies as suspended cells
pass through a flow chamber.
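As an illustration of how length and width measurements can be related to cell type, the sketch below assigns each cell to the nearest of the reference length : diameter (L : D) ratios quoted in the legend of Fig. 2.4 (10 for vessels, 17.5 for fibres, 20 for cambial cells). Nearest-reference assignment is an assumption made for this example and is not necessarily how the fibre quality analyser or the authors binned cells. The measurements are invented.

```python
# Reference L:D ratios quoted in the Fig. 2.4 legend.
REFERENCE_LD = {"vessel": 10.0, "fibre": 17.5, "cambial cell": 20.0}


def classify_cell(length_um: float, diameter_um: float):
    """Return (cell type, L:D ratio) using the nearest reference ratio."""
    ratio = length_um / diameter_um
    cell_type = min(REFERENCE_LD, key=lambda t: abs(REFERENCE_LD[t] - ratio))
    return cell_type, ratio


# Hypothetical (length, diameter) measurements in micrometres.
measurements = [(300.0, 30.0), (350.0, 20.0), (400.0, 19.0)]
calls = [classify_cell(length, diameter)[0] for length, diameter in measurements]
print(calls)
```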
2.4 Results and Discussion
2.4.1 AtMYB61 Modulates the Expression of a Specific Set of Target Genes
As a transcription factor, AtMYB61 should exert its control over facets of the plant
transpiration stream by modulating the expression of specific target genes (Ptashne and
Gann, 1997). To identify such targets, a two-stage complete transcriptome analysis
was undertaken, using publicly available microarray datasets in combination with a
custom microarray dataset comparing the transcriptomes of wild-type (WT), atmyb61
and 35S::MYB61 plants (see the Materials and Methods section; Fig. 2.1; Tables 2.1,
2.2). Three genes emerged from the sequential filtering of publicly available microarray
data and the AtMYB61-specific microarray data, which were shared in both tiers of data
mining. These genes are strong candidates for direct targets of AtMYB61: At1g62990,
At2g45220 and At4g26220.
The nature of the gene products encoded by the three putative AtMYB61 targets is
consistent with a role in xylem development. At1g62990 encodes the homeobox
protein AtKNAT7. AtKNAT7 is expressed in xylem fibres, and xylem vessels of
AtKNAT7 loss-of-function mutants (irregular xylem11, irx11) have thin, weak cell walls
resulting in collapsed vessels (Brown et al., 2005; Zhong et al., 2008). At2g45220
encodes a pectin methylesterase (AtPME), a class of enzymes with demonstrable roles
Figure 2.1. Transcript abundance of a subset of genes in the Arabidopsis thaliana transcriptome is influenced by the presence or absence of AtMYB61 activity. Clustergram of transcript abundance of genes that share the same transcript abundance profile as AtMYB61 in 7-d-old, dark-grown wild-type (WT), atmyb61 and 35S::MYB61 seedlings. Each row shows transcript abundance data for a given gene, as determined by Affymetrix ATH1 GeneChip microarrays. Three biological replicates were analysed for each genotype. Green indicates low transcript abundance, whereas red indicates high transcript abundance. Genes that share the same transcript abundance profile as AtMYB61, as determined by Expression Angler using a Pearson correlation coefficient > 0.8 as the cut-off, are characterised by low transcript abundance in the atmyb61 mutant and high transcript abundance in the 35S::MYB61 over-expressor.
Table 2.1. Genes that share transcript abundance profiles with AtMYB61, as determined by Pearson correlation coefficient across the AtGenExpress developmental baseline dataset.
Pearson correlation coefficient relative to AtMYB61   Arabidopsis Gene Identifier (AGI)   Gene Product Description
0.911 At1g63300 unknown protein
0.906 At5g58930 unknown protein
0.906 At5g40960 unknown protein
0.898 At1g62990 KNOTTED-LIKE HOMEOBOX 7
0.894 At3g11690 unknown protein
0.894 At1g14380 IQ67 DOMAIN PROTEIN 28
0.892 At2g43060 transcription factor
0.890 At5g60820 C3HC4-type RING finger
0.887 At4g33330 PGSIP3; transferase, transferring glycosyl groups
0.883 At1g07750 cupin family protein
0.880 At4g26220 caffeoyl-CoA 3-O-methyltransferase, putative
0.879 At3g51000 epoxide hydrolase, putative
0.878 At5g54530 unknown protein
0.864 At4g28370 protein binding / zinc ion binding
0.857 At4g14930 acid phosphatase survival protein SurE, putative
0.855 At5g14500 aldose 1-epimerase family protein
0.853 At4g32350 unknown protein
0.844 At3g53520 UDP-GLUCURONIC ACID DECARBOXYLASE 1
0.844 At3g17940 aldose 1-epimerase family protein
0.841 At1g63690 protease-associated domain-containing protein
0.840 At4g37530 peroxidase, putative
0.837 At4g30500 unknown protein
0.836 At1g01900 ATSBT1.1/SBTI1.1; serine-type endopeptidase
0.835 At3g14720 ATMPK19; MAP kinase
0.832 At5g39190 GERMIN-LIKE PROTEIN 2
0.831 At5g66460 (1-4)-beta-mannan endohydrolase, putative
0.830 At5g65710 HAESA-Like 2
0.829 At1g76550 pyrophosphate-dependent 6-phosphofructose-1-kinase
0.828 At1g12550 oxidoreductase family protein
0.828 At1g19190 hydrolase
0.828 At5g59310 LIPID TRANSFER PROTEIN 4
0.825 At5g66660 unknown protein
0.825 At5g27360 SFP2; carbohydrate transmembrane transporter
0.823 At5g23720 PROPYZAMIDE-HYPERSENSITIVE 1
0.821 At3g54200 unknown protein
0.820 At1g49450 transducin family protein / WD-40 repeat family protein
0.819 At3g45130 lanosterol synthase 1
0.819 At1g75390 basic leucine-zipper 44
0.816 At2g24170 endomembrane protein 70, putative
0.815 At1g29050 unknown protein
0.815 At2g43050 ATPMEPCRD; enzyme inhibitor/ pectinesterase
0.815 At2g38710 AMMECR1 family
0.815 At5g15490 UDP-glucose 6-dehydrogenase, putative
0.813 At3g18440 aluminum-activated malate transporter 9
0.813 At1g11070 proline-rich family protein
0.812 At1g72220 zinc finger (C3HC4-type RING finger) family protein
0.810 At3g53400 CONSERVED PEPTIDE UPSTREAM ORF 47
0.810 At2g16990 tetracycline transporter
0.807 At5g66170 unknown protein
0.807 At5g62150 peptidoglycan-binding LysM domain-containing protein
0.806 At1g03920 protein kinase, putative
0.805 At3g13640 ATRLI1; transporter
0.804 At3g51300 RHO-RELATED PROTEIN FROM PLANTS 1
0.801 At1g16490 MYB DOMAIN PROTEIN 58
0.800 At2g38360 PRENYLATED RAB ACCEPTOR 1.B4
0.800 At1g54160 NUCLEAR FACTOR Y, SUBUNIT A5
0.800 At2g45220 pectinesterase family protein
Table 2.2. Genes that share transcript abundance profiles with AtMYB61, as determined by Pearson correlation coefficient across the AtMYB61 microarray dataset.
Pearson correlation coefficient relative to AtMYB61   Arabidopsis Gene Identifier (AGI)   Gene Product Description
0.937 At1g26270 phosphatidylinositol 3- and 4-kinase family protein
0.916 At3g59480 pfkB-type carbohydrate kinase family protein
0.912 At1g11210 unknown protein
0.908 At4g26220 caffeoyl-CoA 3-O-methyltransferase, putative
0.873 At2g16720 MYB DOMAIN PROTEIN 7
0.864 At3g61440 CYSTEINE SYNTHASE C1
0.862 At1g21590 protein kinase family protein
0.859 At2g44840 ETHYLENE-RESPONSIVE BINDING FACTOR 13
0.854 At2g12290 unknown protein
0.853 At1g62990 KNOTTED-LIKE HOMEOBOX 7
0.849 At3g52870 calmodulin-binding family protein
0.843 At5g47230 ETHYLENE RESPONSIVE ELEMENT BINDING FACTOR 5
0.837 At4g17980 NAC domain containing protein 71
0.835 At1g77590 LONG CHAIN ACYL-COA SYNTHETASE 9
0.833 At3g06390 integral membrane family protein
0.833 At3g62730 unknown protein
0.833 At3g48690 ATCXE12/CXE12; carboxylesterase
0.827 At1g29240 unknown protein
0.826 At5g02200 FAR-RED-ELONGATED HYPOCOTYL1-LIKE
0.825 At4g36780 transcription regulator
0.822 At1g77220 unknown protein
0.812 At3g28290 unknown protein
0.812 At3g04870 ZETA-CAROTENE DESATURASE
0.809 At1g72410 COP1-interacting protein-related
0.805 At5g26340 STP13/MSS1; hexose:hydrogen symporter
0.802 At4g05090 inositol monophosphatase family protein
0.801 At4g36930 SPATULA
0.800 At1g63870 disease resistance protein (TIR-NBS-LRR class), putative
0.800 At3g18250 unknown protein
0.800 At1g53570 MAP3KA
0.800 At2g45220 pectinesterase family protein
in reconfiguring plant cell wall chemistry (Pelloux et al., 2007). At4g26220 encodes a
caffeoyl-CoA 3-O-methyltransferase (AtCCoAOMT7), which, based on the extent of
sequence similarity, is probably involved in the genesis of the monolignol precursors
used to build the lignin polymer, as are related homologues (Do et al., 2007).
2.4.2 AtMYB61 Regulates Genes with Specific Target Motifs in Their Promoters
Consistent with the three genes functioning as downstream targets of AtMYB61,
recombinant AtMYB61 protein bound to 300-bp DNA regions residing upstream of the
TATA box for each of the putative target genes (Fig. 2.2). Candidate AtMYB61 binding
sites in the gene regulatory regions of these three genes were identified by algorithm-
based screening for over-represented motifs in the three DNA sequences. The most
over-represented DNA motif in the gene regulatory sequences showed high similarity to
canonical R2R3-MYB binding sites known as AC elements (Fig. 2.2). Three such AC
elements were found in each of the upstream regions of AtPME and AtKNAT7, whereas
four elements were found in the AtCCoAOMT7 upstream noncoding sequences
(Fig. 2.2; Table 2.3). Recombinant AtMYB61 bound to this element, but could not bind
to a mutated version of the element, confirming that this is the likely target of AtMYB61
binding in these genes (Fig. 2.2). Non-AC-element-containing DNA could not be bound
by AtMYB61 (Fig. 2.2), nor could it compete with AC elements for AtMYB61 binding
(Fig. 2.3). The AC element was also an effective competitor for recombinant AtMYB61
bound to the 300-bp upstream regulatory sequences (Fig. 2.2). Moreover, expression
of AtMYB61 in yeast transactivated an artificial target gene comprising a tandem repeat
of the AC element fused to a yeast minimal promoter, upstream of the reporter β-
galactosidase (Fig. 2.2). These binding data are in accordance with the literature
surrounding MYB–DNA interactions (Prouse and Campbell, 2012). Thus, AtMYB61
activity appeared to promote the expression of target genes containing the AC element.
Strikingly, evidence suggests that AtKNAT7 is the target of other transcription factors
and, on that basis, is thought to be a component of a transcriptional network that
regulates xylem differentiation (Zhong et al., 2007; Zhong et al., 2008). AtKNAT7
appears to function as a common target for several transcriptional networks that are
Figure 2.2. AtMYB61 binds to the promoters of putative downstream targets and to motifs that are over-represented in these promoters, and is sufficient to activate transcription from these motifs. (a) Schematic representation of the 5′ noncoding sequences of the three putative AtMYB61 downstream target genes identified as the intersection set of genes found to be co-regulated with AtMYB61 in both the AtGenExpress developmental dataset and an AtMYB61-specific microarray experiment, as determined by Expression Angler. +/− indicate the orientation of canonical R2R3-MYB binding site motifs relative to the sense coding strand, and numbers indicate the position of these motifs relative to the putative transcriptional start (indicated by an arrow). Blue horizontal lines under the sequences correspond to the location of the DNA sequence used as the target in the electrophoretic mobility shift assay (EMSA) conducted in (b). (b) AtMYB61 binding of the 5′ noncoding sequences of the three putative target genes as determined by EMSA. Recombinant AtMYB61 binds to all three 5′ noncoding sequences, as determined by a gel shift of the probe, and can be outcompeted with increasing quantities of unlabelled DNA corresponding to a canonical R2R3-MYB binding site, known as an AC element. (c) Left: over-represented motif in the 5′ noncoding sequences of the three genes outlined above, as determined by the Promomer algorithm (Toufighi et al., 2005) (average = 2.9; Z-score = 13; significance = 0.001). Right: AtMYB61 binding to the AC-rich motif as determined by EMSA. Recombinant AtMYB61 binds to the AC-rich motif (AC: 5′ attgttcttcctggggtgaccgtccACCTAAcgctaaaagccgtcgcgggataagcctgtctg 3′), but not to a mutated version of the putative binding motif (NBS: 5′ attgttcttcctggggtgaccgtgcATGGATcgctaaaagccgtcgcgggataagcctgtctg 3′). (d) AtMYB61-mediated activation of promoter activity in Saccharomyces cerevisiae. AC (5′ gaagacgaggtaccagccACCTAAcccACCTAAcccACCTAAcgctgttctcgagcctcatct 3′) and NBS (5′ gaagacgaggtaccagTCCATGGATcgccATGGATcgccATGGATcctgttctcgagccctcatct 3′) sequences are triplicated within the segment. Left: schematic representation of the effector (top) and reporter (bottom) constructs used in this study (CYC1: minimal yeast promoter). Right: quantitative analysis of β-galactosidase activity in yeast (noninducible medium: glucose, open bars; inducible medium: galactose, closed bars). Error bars represent standard deviation. *Significantly different from control, P < 0.05, t-test.
Table 2.3. AC elements within the promoters of putative downstream targets. Orientation and location of AC elements within the upstream non-coding regions of the putative targets.
AGI   Gene   AC Element   Orientation   Location (bp)
At1g62990 AtKNAT7 ACCTAA Antisense 558
At1g62990 AtKNAT7 ACCTAA Antisense 665
At1g62990 AtKNAT7 ACCTAA Antisense 704
At2g45220 AtPME ACCAAC Antisense 139
At2g45220 AtPME ACCAAT Antisense 143
At2g45220 AtPME ACCAAT Sense 151
At4g26220 AtCCoAOMT7 ACCAAA Antisense 82
At4g26220 AtCCoAOMT7 ACCAAC Sense 128
At4g26220 AtCCoAOMT7 ACCAAA Antisense 165
At4g26220 AtCCoAOMT7 ACCAAA Antisense 235
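A table of this kind can be produced by scanning a promoter sequence for a motif on both strands. The sketch below is a generic illustration, not the procedure used to build Table 2.3: it reports 0-based start positions on the sense strand, whereas the coordinate convention of the Location column is not reproduced here. The test sequence is invented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str):
    """Return (0-based start on the sense strand, orientation) for each hit."""
    seq, motif = seq.upper(), motif.upper()
    antisense_form = reverse_complement(motif)
    k = len(motif)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window == motif:
            hits.append((i, "sense"))
        elif window == antisense_form:
            # The motif lies on the opposite strand at this position.
            hits.append((i, "antisense"))
    return hits


# Hypothetical sequence containing ACCTAA once in each orientation.
toy_promoter = "GGACCTAACGTTTAGGTCC"
hits = find_motif(toy_promoter, "ACCTAA")
print(hits)
```

Scanning for the reverse complement on the sense strand avoids having to reverse the whole promoter and re-map coordinates.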
Figure 2.3. AtMYB61 binding to the 5′ non-coding sequences of the three putative target genes as determined by EMSA. Recombinant AtMYB61 bound to the 5′ non-coding sequences of all three genes (AtKNAT7, AtCCoAOMT7 and AtPME), as determined by a gel shift of the probe (arrows), and could not be outcompeted with increasing quantities of unlabelled DNA corresponding to a random binding site (NBS: 5′ attgttcttcctggggtgaccgtgcATGGATcgctaaaagccgtcgcgggataagcctgtctg 3′).
involved in xylem differentiation, including one that involves AtMYB61. As such,
AtKNAT7 could be viewed as a regulatory module that is co-opted by several gene
regulatory networks.
2.4.3 AtMYB61 Regulates Genes Which Themselves Contribute to AtMYB61-Related Phenotypes
To determine whether the putative AtMYB61 targets contribute to any of the xylem-
related traits in which AtMYB61 is involved (Romano et al., 2012), the phenotypes of the
loss-of-function mutants for the target genes (atknat7/irx11, atpme and atccoaomt7)
were compared with atmyb61 and WT. Loss-of-function mutations in each of the three
target genes generated xylem-related phenotypes that at least partially phenocopied
atmyb61 phenotypes. For example, secondary thickening of xylem vessel cell walls
was reduced in atknat7/irx11 and atpme mutants relative to WT, as in atmyb61 (Fig. 2.4).
As with atmyb61 mutants, the xylem : phloem ratio was reduced relative to WT in
secondary thickened hypocotyls of atknat7/irx11, atpme and atccoaomt7 mutants
(Fig. 2.4). Strikingly, the atknat7/irx11, atpme and atccoaomt7 mutants had far fewer
fibre cells and disproportionately more vessel cells relative to WT (Fig. 2.4). Unlike
atmyb61 mutants, the atknat7/irx11, atpme and atccoaomt7 mutants were able to make
vessels, and fusiform cambial cells were not the predominant cell type. These findings
are in keeping with the hypothesis that AtMYB61 functions upstream of AtKNAT7,
AtPME and AtCCoAOMT7, as AtMYB61 activity promotes the differentiation of both
vessels and fibres, whereas the differentiation of vessels more prominently occurs in the
atknat7/irx11, atpme and atccoaomt7 mutants. This suggests that AtKNAT7, AtPME
and AtCCoAOMT7 are involved in pathways governing fibre differentiation in secondary
hypocotyl development, whereas AtMYB61 sits upstream of both fibre and vessel
differentiation pathways in the development of this anatomical region.
Figure 2.4. AtMYB61 downstream target genes have an impact on secondary wall formation and xylem formation in secondary thickened hypocotyls. (a–e) Transmission electron micrographs (×2000) of cross-sections obtained from primary inflorescence stems of growth stage 6.03 Arabidopsis thaliana plants grown under 12 h light : 12 h dark conditions, for (a) wild-type (WT), (b) atmyb61, (c) atknat7, (d) atpme and (e) atccoaomt7 genotypes. All plants were grown until the inflorescence stems were an equivalent length (26 cm), and cross-sections were made at 0.5 cm from the base of the stem (adjacent to the rosette). Bars, 10 μm. (f–j) Secondary thickened hypocotyls from mature plants after 10 wk of growth with continuous removal of primary and secondary inflorescences under 12 h light : 12 h dark conditions. Sections were stained with phloroglucinol to reveal alterations in the proportion of lignified xylem cells relative to phloem cells. Sections are (f) WT, (g) atmyb61, (h) atknat7, (i) atpme and (j) atccoaomt7 genotypes. Bars, 50 μm. (k) Quantitative assessment of the ratio of xylem area : phloem area obtained from multiple measurements (biological replicates, n > 10) of secondary thickened hypocotyl cross-sections obtained as already described. (l) Fibre quality analysis of secondary thickened hypocotyls from plants after 10 wk of growth with continuous removal of the primary and secondary inflorescences under 12 h light : 12 h dark conditions. Results are shown as the ratio of length to diameter to reflect particular cell types. Length : diameter (L : D) ratios of 10 indicate vessels, of 17.5 indicate fibres and of 20 indicate cambial cells. Error bars represent ± SE. *Significantly different from WT (P < 0.05). Data are from experiments performed in triplicate with 5–20 seedlings per genotype per experiment, depending on the nature of the experiment.
2.5 Conclusion
These findings suggest that AtMYB61 functions as a pleiotropic regulator of plant carbon
acquisition and allocation via a small gene network. Three direct
downstream targets of AtMYB61 were predicted based on comparative transcriptome
analyses between microarrays that examined changes in gene expression that were
modulated by differences in AtMYB61 activity and sugar, and those that examined the
co-expression of AtMYB61 across plant development and in different organs. These
predicted direct downstream targets of AtMYB61 are: a KNOTTED1-like transcription
factor (KNAT7, At1g62990); a caffeoyl-CoA 3-O-methyltransferase (CCoAOMT7,
At4g26220), and a pectin-methylesterase (PME, At2g45220). AtMYB61 bound the
putative downstream targets' promoter regions in an AC-motif-dependent fashion.
Expression of AtMYB61 protein in yeast was sufficient to drive the transactivation of a
reporter gene comprising a tandem repeat of an AC element fused to a yeast minimal
promoter, upstream of the reporter lacZ. Together, these results suggest that
AtMYB61 binds to promoter regions of downstream targets to modulate transcription, and
thereby regulates the allocation of carbon to non-recoverable sinks when conditions are
favourable to do so.
2.6 Acknowledgements
We are most grateful to Astrid Patzlaff, Christine Surman and Joan Ouellette for
excellent technical assistance. This work was generously supported by funding from
the Natural Science and Engineering Research Council of Canada (NSERC) and the
Canada Foundation for Innovation (CFI) to S.D.M., by a Canadian Graduate
Scholarship (CGSD) from NSERC awarded to M.B.P. and O.W., by an NSERC
Discovery Grant and the NSERC Green Crops Network to C.J.D., and by funding from
the University of Toronto, CFI and NSERC to M.M.C. Research infrastructure was
provided by the Centre for Analysis of Genome Evolution and Function at the University
of Toronto.
Chapter 3
Interactions between the R2R3-MYB transcription factor, AtMYB61, and target DNA binding sites
This chapter is the equivalent of the following submitted manuscript in its entirety:
Prouse M.B., and Campbell M.M. (2013) Interactions between the R2R3-MYB
transcription factor, AtMYB61, and target DNA binding sites. PLOS ONE. 8(5): e65132.
Contributions: MBP, MMC designed research; MBP, MMC analyzed data; MBP, MMC
wrote and edited manuscript.
MBP contributed specifically to each figure and table in this chapter.
Copyright: The material in this chapter is copyrighted by PLOS.
3 Interactions between the R2R3-MYB Transcription Factor, AtMYB61, and Target DNA Binding Sites
3.1 Abstract
Despite the prominent roles played by R2R3-MYB transcription factors in the regulation
of plant gene expression, little is known about the details of how these proteins interact
with their DNA targets. For example, while Arabidopsis thaliana R2R3-MYB protein
AtMYB61 is known to alter transcript abundance of a specific set of target genes, little is
known about the specific DNA sequences to which AtMYB61 binds. To address this
gap in knowledge, DNA sequences bound by AtMYB61 were identified using cyclic
amplification and selection of targets (CASTing). The DNA targets identified using this
approach corresponded to AC elements, sequences enriched in adenosine and
cytosine nucleotides. The preferred target sequence that bound with the greatest
affinity to AtMYB61 recombinant protein was ACCTAC, the AC-I element. Mutational
analyses based on the AC-I element showed that ACC nucleotides in the AC-I element
served as the core recognition motif, critical for AtMYB61 binding. Molecular modelling
predicted interactions between AtMYB61 amino acid residues and corresponding
nucleotides in the DNA targets. The affinity between AtMYB61 and specific target DNA
sequences did not correlate with AtMYB61-driven transcriptional activation with each of
the target sequences. CASTing-selected motifs were found in the regulatory regions of
genes previously shown to be regulated by AtMYB61. Taken together, these findings
are consistent with the hypothesis that AtMYB61 regulates transcription from specific
cis-acting AC elements in vivo. The results shed light on the specifics of DNA binding
by an important family of plant-specific transcriptional regulators.
3.2 Introduction
Much of plant growth and development is shaped by sequence-specific transcription
factors, proteins that act in response to external and internal cues to modulate gene
expression. The MYB family is the largest family of plant sequence-specific
transcription factors, with greater than 100 family members in individual plant species
(Martin and PazAres, 1997; Arabidopsis Genome, 2000; Riechmann et al., 2000;
Stracke et al., 2001; Dubos et al., 2010). MYB transcription factors are recognised by
the presence of the MYB domain, which comprises characteristic helix-helix-turn-helix
repeats of approximately 50 amino acids. The MYB domain binds DNA in a sequence-
specific manner and is highly conserved in yeast, vertebrates, and plants (Rosinski and
Atchley, 1998). The MYB domain is normally found near the amino terminus of the
protein, and generally contains one, two, or three copies of the 50-amino-acid MYB repeat.
R2R3-MYB proteins have two such repeats, and comprise the largest sub-family of the
plant and animal MYB family. Moreover, R2R3-MYB proteins are plant specific,
regulating facets of plant growth, development and metabolism (Lipsick, 1996; Martin
and PazAres, 1997; Glover et al., 1998; Jin and Martin, 1999; Stracke et al., 2001;
Martin et al., 2002; Patzlaff et al., 2003a; Gomez-Maldonado et al., 2004; Newman et
al., 2004; Liang et al., 2005).
While members of the R2R3-MYB family are being characterised in increasing
numbers, these investigations largely focus on the involvement of a particular MYB in
the manifestation of a specific plant phenotype. That is, most of these analyses do not
extend to a more detailed examination of MYB function at the molecular level.
Nevertheless, some general themes with respect to R2R3-MYB function at the
molecular level are emerging (Prouse and Campbell, 2012). For example, many R2R3-
MYB transcription factors bind to DNA motifs that are enriched in adenosine (A) and
cytosine (C) residues (Patzlaff et al., 2003b; Gomez-Maldonado et al., 2004), where
guanine (G) residues are either absent or depleted (Hatton et al., 1995; Prouse and
Campbell, 2012). These motifs have been variously referred to as AC elements, H
boxes, or PAL boxes (Lois et al., 1989; Joos and Hahlbrock, 1992; Leyva et al., 1992;
Hauffe et al., 1993; Hatton et al., 1995; Logemann et al., 1995; BellLelong et al., 1997;
Seguin et al., 1997; Lacombe et al., 2000; Lauvergeat et al., 2002). Some R2R3-MYB
proteins function as transcriptional activators at these sites (Patzlaff et al., 2003a;
Patzlaff et al., 2003b), while others function as transcriptional repressors (Jin et al.,
2000). AC elements are relatively short, comprising 5 or 6 nucleotides, where 3
residues form a relatively invariant core (Ogata et al., 1993; Ogata et al., 1994; Ogata et
al., 1995). R2R3-MYB proteins bind to AC elements in a manner that relies on specific
amino acid residues in the R2R3-MYB domain (Ogata et al., 1993; Ogata et al., 1994;
Ogata et al., 1995; Tahirov et al., 2001; Tahirov et al., 2002). To date, the details of
such interactions have been relatively scant, aside from their putative involvement in the
regulation of plant-specific gene expression.
AtMYB61, a member of the Arabidopsis thaliana R2R3-MYB family of transcription
factors, illustrates the involvement of R2R3-MYB family members in the regulation of
plant-specific processes. AtMYB61 is a pleiotropic regulator of three major facets of the
plant transpiration system: xylem cell differentiation; lateral root outgrowth; and,
stomatal aperture (Liang et al., 2005; Romano et al., 2012). AtMYB61 modifies gene
expression in response to diurnal cues so as to appropriately modify the aperture of
stomata (Liang et al., 2005), the pore-like structures on leaf surfaces that enable gas
exchange. Thus, AtMYB61 plays a role in modifying the capacity to take up carbon
dioxide for photosynthesis, while limiting the loss of water from the plant body.
AtMYB61 also alters gene expression in response to sugars, resulting in modification of
plant architecture and cell wall structure (Penfield et al., 2001; Newman et al., 2004;
Dubos et al., 2005). As is the case for most R2R3-MYB transcription factors, the
precise mechanisms that enable AtMYB61 to bring about important changes in plant
function are unknown. Furthermore, although AtMYB61 has been shown to bind to
certain consensus motifs (Romano et al., 2012), the preferred binding of AtMYB61 has
not yet been determined quantitatively.
Given that R2R3-MYB proteins are involved in a rich variety of plant-specific processes
(Dubos et al., 2010), it would be desirable to have a more detailed understanding of
R2R3-MYB and DNA motif interactions. The work described herein focuses on the
interplay between AtMYB61 and its DNA target sequences. Cyclic amplification and
selection of targets (CASTing), which enables identification of a transcription factor's
DNA-binding sites from a pool of random oligonucleotides, was used to identify target
DNA-binding sites for AtMYB61 (Wright et al., 1991). The sequences identified served
as a useful foundation to examine mechanisms responsible for AtMYB61 sequence-
specific binding, and to generate hypotheses about the roles these may play in shaping AtMYB61
function in vivo.
3.3 Materials and Methods
3.3.1 Ethics Statement
Antibody generation was carried out in strict accordance with the Province of Ontario's
Animals for Research Act, and the requirements of the federal Canadian Council on
Animal Care. The protocol was approved at the University of Toronto, which involved
full committee review by the Local Animal Care Committee (LACC), followed by
approval by the University of Toronto Office of Research Ethics, the University
Veterinarian, and finally the University of Toronto Animal Care Committee (UACC)
(Permit Number: 20007080, approved 14/01/08). All efforts were made to minimise
suffering.
3.3.2 Expression of Recombinant Protein in Bacteria
Recombinant AtMYB61 protein was produced in E. coli using the coding sequence
cloned in frame into the NdeI and BamHI sites of the pET15b vector (Novagen).
Recombinant AtMYB61 protein was produced, extracted and affinity purified as
described previously for pine MYB proteins (Patzlaff et al., 2003b).
3.3.3 Antibody Production and Western Blot Analysis
Anti-AtMYB61 polyclonal antibodies were produced against the recombinant fusion
protein in rabbits as described previously (Harlow, 1988). Affinity-purified recombinant
antigen was gel-purified on a 10% SDS-PAGE gel and shipped in phosphate buffered
saline to University of Toronto BioScience Support Laboratories for antibody production.
In brief, 2 rabbits were each injected a total of 4 times with 300 μg of antigen per
injection over a 6 week period. Production bleeds were performed after nitrocellulose
dot blot assays indicated acceptable titre.
For western blot analysis, total soluble protein extracts were separated by SDS-PAGE
and transferred to Bio-Rad Laboratories Nitrocellulose Trans-Blot Transfer Medium
(0.45µm) by electrophoretic transfer (BioRad, Mississauga, ON, Canada).
Chemiluminescent western blot analysis was performed on the filters with Invitrogen's
Western Breeze Chemiluminescent kit as described by the manufacturer (Invitrogen,
Burlington, ON, Canada). Primary antibody was used at a final dilution of
1/20000.
3.3.4 Cyclic Amplification and Selection of Targets (CASTing)
The CASTing assay was completed according to Wright et al. (Wright et al., 1991).
CASTing was completed by incubating 15 μg of double-stranded random
oligonucleotides (27-mers), flanked by two constant priming sequences, with the
AtMYB61 full-length recombinant protein. This complex was added to a Protein G
Dynabead (Invitrogen, Burlington, ON, Canada) plus post-injection AtMYB61 antibody
complex, causing the complex to immunoprecipitate. The immunoprecipitated complex
was then washed 3 times, resuspended in 100 μL PCR buffer, boiled and then PCR
amplified for 30 cycles with 15 pmol of forward and reverse primers. A 10 μL aliquot of the
amplified selected targets was kept for analysis and 90 μL was used to continue with
the next cycle. This cycle was repeated four more times to select for AtMYB61
consensus DNA target sequences. The selected targets were then cloned into
Invitrogen's pCR4 TOPO vector and sequenced (Invitrogen, Burlington, ON, Canada).
3.3.5 Nitrocellulose Filter-Binding Assay
The nitrocellulose filter-binding assay was conducted as described by Hall and Kranz
(Hall and Kranz, 2008). The CASTing targets that were over-represented were ordered
from Invitrogen and PCR amplified (Invitrogen, Burlington, ON, Canada). These PCR
products were purified by Qiagen nucleotide purification according to the manufacturer's
instructions (Qiagen, Toronto, ON, Canada). The purified PCR products were then radioactively
labelled with 32P via primer extension and purified again by Qiagen nucleotide purification
(Qiagen, Toronto, ON, Canada). Counts per minute (CPM) were
measured with a liquid scintillation counter to determine the incorporation of 32P into the
probe. The radioactively labelled probes were combined in a binding reaction with
recombinant AtMYB61 protein and passed through BioRad nitrocellulose filters (0.2µm)
(BioRad, Mississauga, ON, Canada). The relative binding of recombinant AtMYB61
protein to the CASTing motifs and mutated AC-I sequences was recorded. The
dissociation constants (Kd) of the CASTing targets to AtMYB61 were determined with the
GRAFIT program, which linearised the binding data via Scatchard plots to
calculate the ligand concentration at which half of the binding sites of AtMYB61 were occupied.
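The Scatchard linearisation mentioned above can be illustrated with a minimal sketch. It is not the GRAFIT implementation, whose internals are not described here. It assumes simple single-site binding, and the data are synthetic and noise-free, generated from an assumed Kd and Bmax. The function names and concentrations are hypothetical.

```python
# Scatchard sketch: for single-site binding, B = Bmax * F / (Kd + F),
# which rearranges to B / F = (Bmax - B) / Kd, a line with slope -1/Kd.
# All values below are synthetic and hypothetical, not data from this study.

def bound(free, kd, bmax):
    """Bound ligand for a single-site binding isotherm."""
    return bmax * free / (kd + free)

def scatchard_fit(free_concs, bound_concs):
    """Least-squares line through (B, B/F); returns (Kd, Bmax)."""
    xs = list(bound_concs)
    ys = [b / f for b, f in zip(bound_concs, free_concs)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # slope of the Scatchard line is -1/Kd
    bmax = intercept * kd      # intercept of the line is Bmax/Kd
    return kd, bmax

true_kd, true_bmax = 1.0e-8, 1.0   # assumed values for the synthetic data
free = [1e-9, 3e-9, 1e-8, 3e-8, 1e-7, 3e-7]
obs = [bound(f, true_kd, true_bmax) for f in free]
fit_kd, fit_bmax = scatchard_fit(free, obs)
print(f"Kd = {fit_kd:.3e} M, Bmax = {fit_bmax:.3f}")
```

With real filter-binding data the points would scatter about the line, and a direct nonlinear fit of the isotherm is often preferred because the Scatchard transform distorts the error structure.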
3.3.6 Electrophoretic Mobility Shift Assay (EMSA)
Recombinant AtMYB61 protein was produced, extracted and affinity purified as
described previously for pine MYB proteins (Patzlaff et al., 2003b). EMSA conditions
were exactly as described previously (Patzlaff et al., 2003b; Gomez-Maldonado et al.,
2004) but using recombinant AtMYB61 protein in place of pine MYB protein.
3.3.7 Molecular Modelling
The tertiary structure of AtMYB61 was predicted using the tool Protein
Homology/analogY Recognition Engine (PHYRE) (McDonnell et al., 2006;
www.sbg.bio.ic.ac.uk/phyre/html/index.html). PHYRE proposed that the resolved
structure that shared the most homology to AtMYB61 was the animal c-MYB DNA-
binding domain, which was resolved previously with its DNA consensus motif (AACNG)
by heteronuclear multidimensional NMR (Ogata et al., 1994). This solution structure
was used to predict a 3D protein model of AtMYB61 with an E-value of 3.8e-13 and an
estimated precision of 100%. The two protein sequences were 44% alike using amino
acid sequence alignment. The PDB (Protein Data Bank) file recovered from the PHYRE
analysis (PDB ID = c1msfC) was used to superimpose the predicted AtMYB61 structure
with the c-MYB structure using DaliLite (Holm and Park, 2000). The c-MYB protein was
resolved along with its DNA binding sequence allowing one to predict the binding
domain of AtMYB61 using homology. The PDB files for the AC-I and NBS nucleotide
motifs were created from the http://structure.usc.edu/make-na/server.html server. Using
Pymol (Seeliger and de Groot, 2010) the two structures were modelled and
superimposed (DeLano, 2002). Polar interactions were determined using Pymol.
3.3.8 Transcriptional Activation Assay
Transcriptional activation assays using yeast were as described previously (Patzlaff et
al., 2003b), but substituting the AtMYB61 coding sequence in place of pine MYB
sequences. Transcriptional activation assays were conducted with three biologically
independent replicates per condition.
3.4 Results and Discussion
3.4.1 AtMYB61 Bound a Discrete Subset of DNA Target Sequences
To generate an antibody of adequate specificity for the cyclic amplification and selection
of targets (CASTing) assay, antibodies were raised against a non-conserved region in
the AtMYB61 C-terminus (Fig. S3.1). CASTing was initiated with a pool of 63-base-pair
double-stranded oligonucleotides, where each oligonucleotide consisted of a segment
of 27 random nucleotides flanked by designed sequences for PCR priming. A 15 μg
(2.21 × 10^14 DNA molecules) pool of "randomers" was incubated with AtMYB61 full-length
recombinant protein (Fig. 3.1a). Assuming the average protein-binding site is a
hexamer, the 27-bp degenerate core of each double-stranded oligomer contained 21
possible positions. Therefore, in the initial round of CASTing, 21 × 10^14 unique sites
were available for binding.
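The size of the starting pool can be checked with a short calculation. The sketch below assumes an average mass of 650 g/mol per base pair, a common approximation; the text does not state which value was used. Under that assumption, 15 μg of 63-bp duplexes corresponds to approximately 2.21 × 10^14 molecules. The function and constant names are illustrative only.

```python
# Estimate how many double-stranded DNA molecules are in a given mass.
# ASSUMPTION: an average mass of 650 g/mol per base pair (a common
# approximation); the source does not state the value it used.

AVOGADRO = 6.02214076e23       # molecules per mole
G_PER_MOL_PER_BP = 650.0       # assumed average mass of one base pair

def dsdna_molecules(mass_g, length_bp):
    """Number of dsDNA molecules of length_bp in mass_g grams."""
    molar_mass = length_bp * G_PER_MOL_PER_BP   # g/mol for one duplex
    return mass_g / molar_mass * AVOGADRO

n_molecules = dsdna_molecules(15e-6, 63)   # 15 ug of 63-bp oligonucleotides
print(f"{n_molecules:.2e} molecules")
```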
Five CASTing cycles were undertaken to enrich the pool of oligonucleotides in DNA
binding-sites bound by AtMYB61. The enriched oligonucleotides were cloned into
pCR4 TOPO (Invitrogen, Burlington, ON, Canada) and sequenced. Following
enrichment, 89 CASTing-derived oligonucleotides were sequenced. Sequences were
subjected to analysis to discover over-represented motifs using MEME (Multiple EM for
Motif Elicitation) (Bailey et al., 2006) (Table 3.1, Table 3.2, Fig. 3.1b). MEME filtering
criteria identified sequences with a min/max motif width of 6, any number of repetitions
of a single motif distributed among the sequences, and no restrictions on the number of
motifs identified. Following MEME analysis, all CASTing-enriched sequences contained
over-represented motifs characterised by an abundance of adenosine and cytosine
residues. These over-represented motifs had a conserved set of ACC nucleotides
present at the beginning of the motifs, suggesting that these nucleotides may be
essential for recognition and binding (Table 3.1, Table 3.2, Fig. 3.1b). These motifs
correspond to canonical AC elements, also known as H-boxes or PAL-boxes (Table 3.1,
Table 3.2, Fig. 3.1b).
Figure 3.1. Cyclic amplification and selection of targets (CASTing) recovered a suite of hexamer target sequences that bound to AtMYB61. (a) Random 27-bp sequences flanked by two primer sites (63 bp in total) were used in the CASTing assay. (b) Sequence logo of CASTing targets discovered by MEME. The ACC motif was conserved among all target sequences. Two nucleotides upstream and downstream of the over-represented hexamer target sequences were included to determine whether the over-represented motifs could be extended beyond a hexameric sequence.
Table 3.1. Alignment of AtMYB61 binding sites obtained from CASTing Assay
Seven hexamer targets were determined to be over-represented by MEME (Multiple EM for Motif Elicitation).
Group AtMYB61 Site
ACCACC
1 ACCCCAGAGTCCC ACCACC CGACCCCC
2 ACCCAAACACCACGCCCTAG ACCACC C
3 GCTAAACGTTCATTCCCCT ACCACC CC
4 A ACCACC TCAACAAACCCCGGCCGCCC
5 ACCAC ACCACC ACCCACCCCCCCCCCC
6 G ACCACC CTCCAACCTATACCGGCCCC
7 CCAAACTCGACCGTTCCCGC ACCACC C
8 GCACCCC ACCACC ACCATACCTACCCC
9 ACCCGATCAGGCCCTCC ACCACC CCCC
10 CCACACCCCACCCCGAACG ACCACC GC
11 ACCAACGGACTAGCTCCCAC ACCACC C
12 C ACCACC CCACCATACAATCCCTAGGC
13 ACCAC ACCACC ACCCCACCCTAGGACC
14 ACCACC ACTACCCGGACCCGGCCCCCC
15 ACACGAGATAACGACCCG ACCACC CCC
ACCTAC
16 GACACAAGACAC ACCTAC ACCCCCCCC
17 GCAGCCC ACCTAC ACTCCCGCTCCCCC
18 GCACCCCACCACCACCAT ACCTAC CCC
19 ACCCCCCCTAATTG ACCTAC GGCAGGC
20 CAG ACCTAC CCCCGCCCCCAACCCGCC
21 CACCCACCGTCCAACG ACCTAC ACCCC
22 GCGCACCCCACCCCCC ACCTAC GGCCC
ACCACA
23 ACCACA ATGCAGCCGTACTTCGACCCC
24 ACCACA CCACCACCCACCCCCCCCCCC
25 A ACCACA TCAACAAACCCCGGCCGCCC
26 CAACCCCTCCA ACCACA CCTCCCCGCC
27 CC ACCACA CTCTGCATTCTTGACCGCC
ACCATA
28 GGGTAATGTC ACCATA GCCCCCCCCCC
29 GCACCCCACCACC ACCATA CCTACCCC
30 CA ACCATA CACAACGCCCCGACCCCCC
31 CACCACCCC ACCATA CAATCCCTAGGC
32 CAGGCACCCCCAACCCCCC ACCATA CC
ACCAAT
33 AAAGGGTATACACAGGT ACCAAT GGCC
34 AACCTTAGGG ACCAAT CAATAAGGGAC
35 ACCAAT GAAGAGACCCCTAACCATTAC
36 ATGTGTAG ACCAAT GGCATAATCTGCA
37 GTCGAGTCG ACCAAT GCAGCACGCAGC
ACCAAC
38 CAG ACCAAC CTCATACCCCCCCCTGCC
39 CC ACCAAC CCTCCCTCCCAATGCCCGC
40 ACCAAC GGACTAGCTCCCACACCACCC
41 AACATGCTGTGCAACCAA ACCAAC ACC
ACCAAA
42 ACCAAA AGATCAACCCCCCCCCGTACC
43 AACATGCTGTGCA ACCAAA CCAACGCC
44 ACACATAAACAGCA ACCAAA CCAGCCC
45 AACATGCTGTGCA ACCAAA CCAACACC
Table 3.2. AtMYB61 consensus sequence was derived from a comparison of 89 sequences recovered from 5 cycles of CASTing. The composition of each base at each position of the hexameric sequence is provided. -/+ indicate the bases 5' or 3' of the hexameric consensus sequence. The counts for bases 5' or 3' of the hexameric consensus sequence do not add up to 45 in certain cases because primer sites were excluded from the analysis. W corresponds to A/T, H corresponds to A/T/C, – corresponds to a zero value.
        -2   -1   A    C    C    W    H    H    +1   +2
G       3    11   –    –    –    –    –    –    9    7
A       10   8    45   –    –    38   20   14   10   4
T       2    3    –    –    –    7    5    5    2    4
C       20   17   –    45   45   –    20   26   24   27
Total             45   45   45   45   45   45
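The hexamer columns of Table 3.2 can be reproduced from the group sizes in Table 3.1. The sketch below tallies base counts per position and collapses each column to an IUPAC code, which gives ACCWHH for these data. The helper names are illustrative, and only the IUPAC codes needed for these data are included.

```python
# Rebuild the per-position base counts of the hexamer core from the number
# of sequences in each motif group of Table 3.1, then derive an IUPAC
# consensus. The helper names are illustrative only.

from collections import Counter

# Hexamer -> number of CASTing sequences listed for it in Table 3.1
MOTIF_COUNTS = {
    "ACCACC": 15, "ACCTAC": 7, "ACCACA": 5, "ACCATA": 5,
    "ACCAAT": 5, "ACCAAC": 4, "ACCAAA": 4,
}

# IUPAC codes for the base sets that occur in these data
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("AT"): "W", frozenset("ACT"): "H",
}

def position_counts(motif_counts):
    """Return one Counter of base -> count for each hexamer position."""
    width = len(next(iter(motif_counts)))
    columns = [Counter() for _ in range(width)]
    for motif, n in motif_counts.items():
        for i, base in enumerate(motif):
            columns[i][base] += n
    return columns

def consensus(columns):
    """Collapse each column to the IUPAC code for the bases observed."""
    return "".join(IUPAC[frozenset(col)] for col in columns)

columns = position_counts(MOTIF_COUNTS)
for i, col in enumerate(columns, start=1):
    print(i, dict(col))
print(consensus(columns))
```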
AC elements, also known as PAL boxes or H-boxes, play key roles in regulating
transcription for a variety of genes, particularly those encoding enzymes implicated in
phenylpropanoid metabolism (Lois et al., 1989; Joos and Hahlbrock, 1992; Leyva et al.,
1992; Hauffe et al., 1993; Hatton et al., 1995; Logemann et al., 1995; BellLelong et al.,
1997; Seguin et al., 1997; Lacombe et al., 2000; Lauvergeat et al., 2002). R2R3-MYB
proteins are known to bind AC elements and activate transcription from these motifs in
yeast and in planta (Prouse and Campbell, 2012). For example, pine (Pinus taeda)
MYB1 (Patzlaff et al., 2003a) and MYB4 (Patzlaff et al., 2003b) and eucalyptus
(Eucalyptus grandis) MYB2 (Goicoechea et al., 2005), were all able to bind to AC
elements present in the promoters of lignin biosynthetic genes. Similarly, pine (Pinus
taeda) MYB1 and MYB4 bound AC elements present in the gene regulatory sequences
of a pine gene encoding GLUTAMINE SYNTHETASE1b (GS1b) (Gomez-Maldonado et
al., 2004). R2R3-MYB binding to AC elements is predicted to play a role in dictating
xylem-localised expression of the aforementioned genes (Patzlaff et al., 2003a; Patzlaff
et al., 2003b; Gomez-Maldonado et al., 2004; Goicoechea et al., 2005). Given the
xylem-localised expression of AtMYB61 (Romano et al., 2012), it is likely that it
functions in an equivalent manner to drive AC-element-mediated expression in
Arabidopsis thaliana.
3.4.2 AtMYB61 Bound to DNA Target Sequences with Varying Degrees of
Affinity
The relative binding affinities of recombinant AtMYB61 protein to the CASTing-derived
sequences were determined (Table S3.1). Dissociation constants for each CASTing
target were calculated with the GRAFIT software program using Scatchard plots (Table
3.3). The CASTing target that bound with the highest affinity (9.12E-09 M) was ACCTAC
(AC-I) (Table 3.3). Since the AC-I motif was the preferred target of AtMYB61, a
mutational assay was conducted on this motif to examine which nucleotides were
essential for binding (Table 3.4). A guanine was substituted for a single nucleotide
at each position along the motif in turn. A nitrocellulose filter-binding assay was used to
calculate the Kds of the mutated AC-I motifs (Table 3.4). Binding diminished when a
mutation was present in the first three nucleotides of the AC-I motif (Kd>5.00E-06 M);
Table 3.3. Dissociation constants (Kd) in mol/L and associated errors of CASTing targets. Relative binding affinities of the CASTing targets to AtMYB61 were determined by a nitrocellulose filter-binding assay. The relative binding affinities were used to determine the dissociation constants of the CASTing targets with the GRAFIT program, which linearised the binding data via Scatchard plots to calculate the ligand concentration at which half of the binding sites of AtMYB61 are occupied. ACCTAC bound with the greatest affinity to AtMYB61. The non-binding site (NBS) did not bind to recombinant AtMYB61.
Target   Kd          Error
ACCTAC   9.12E-09    3.11E-09
ACCAAT   1.21E-08    3.42E-09
ACCAAA   1.68E-08    4.07E-09
ACCATA   1.83E-08    5.06E-09
ACCAAC   7.37E-08    1.53E-08
ACCACA   8.08E-08    6.93E-09
ACCACC   6.90E-07    2.27E-08
NBS      >5.00E-06
Table 3.4. Dissociation constants (Kd) in mol/L and associated errors of mutated ACCTAC (AC-I element) sequences. A guanine nucleotide was substituted at one position at a time along the AC-I motif. Relative binding affinities of the mutated AC-I elements to AtMYB61 were determined by a nitrocellulose filter-binding assay. The relative binding affinities were used to determine the dissociation constants of the mutated AC-I elements with the GRAFIT program, which linearised the binding data via Scatchard plots to calculate the ligand concentration at which half of the binding sites of AtMYB61 are occupied. Underlined bases correspond to a substituted guanine.
Sequence   Kd          Error
ACCTAC     9.12E-09    3.11E-09
GCCTAC     >5.00E-06
AGCTAC     >5.00E-06
ACGTAC     >5.00E-06
ACCGAC     7.19E-07    2.12E-07
ACCTGC     7.97E-08    1.83E-08
ACCTAG     5.60E-08    5.09E-09
however, when a mutation was present in the last three nucleotides of the AC-I motif,
binding was reduced but not completely abolished (Table 3.4). The relative binding
affinities of recombinant AtMYB61 protein to CASTing targets and mutated motifs were
validated by EMSAs (Fig. 3.2). EMSAs were conducted at a protein concentration of
5 × 10^-8 M because this was a protein concentration at which not all the targets
had reached their binding maximum, as determined by nitrocellulose filter-binding assay (Fig. 3.2,
Table S3.1). This enabled detection of differential binding via EMSAs.
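The choice of protein concentration can be illustrated with the Kd values in Table 3.3. The sketch below assumes simple 1:1 binding with protein in large excess over labelled probe, so that the fraction of probe bound is [P] / (Kd + [P]). This is a simplification introduced here for illustration and is not a model fitted in this study.

```python
# Predicted fraction of probe bound at a fixed protein concentration,
# ASSUMING simple 1:1 binding with protein in excess over labelled probe:
#   fraction bound = [P] / (Kd + [P])
# Kd values (mol/L) are those reported in Table 3.3.

KD = {
    "ACCTAC": 9.12e-9, "ACCAAT": 1.21e-8, "ACCAAA": 1.68e-8,
    "ACCATA": 1.83e-8, "ACCAAC": 7.37e-8, "ACCACA": 8.08e-8,
    "ACCACC": 6.90e-7,
}

def fraction_bound(protein_conc, kd):
    """Fraction of probe in complex for a single-site isotherm."""
    return protein_conc / (kd + protein_conc)

PROTEIN = 5e-8   # mol/L, the concentration used for the EMSAs
fractions = {motif: fraction_bound(PROTEIN, kd) for motif, kd in KD.items()}
for motif, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{motif}  {frac:.2f}")
```

Under these assumptions the predicted occupancy ranges from roughly 0.07 (ACCACC) to roughly 0.85 (ACCTAC), so differences between probes would remain visible at this protein concentration.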
AtMYB61 bound its preferred target AC-I (ACCTAC) with a dissociation constant of 9.12E-09
M (Table 3.3), which is similar to the tight binding of the vertebrate c-MYB R2R3 domain
to the MYB binding site ((T/C)AAC(G/T)G(A/C/T)(A/C/T)) (binding constant = 1.5E-09
M ± 28%) (Tanikawa et al., 1993; Ebneth et al., 1994). Tanikawa et al. found that AACG
nucleotides in the c-MYB binding site were critical for binding (Tanikawa et al., 1993).
The second adenosine, fourth cytosine, and sixth guanine were particularly important in
determining binding specificity. If any of these core nucleotides were mutated, binding
affinity decreased by greater than 500 fold. The third adenosine was not as crucial; if it
was mutated, binding affinity decreased by up to 15 fold. Consistent with
this, AtMYB61 had a set of core recognition nucleotides (ACC) that could not be
mutated without abolishing binding (Fig. 3.2b, Table 3.4). By contrast, mutation of the
latter half of the binding site, at nucleotides TAC, reduced binding but did not
abolish it completely.
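The guanine scan and its effect on affinity can be summarised numerically. The sketch below generates the six single-guanine mutants of ACCTAC and expresses each Kd from Table 3.4 as a fold change relative to the wild-type motif. For the three core-position mutants only a lower bound on Kd is reported, so the corresponding fold changes are lower bounds. The function and variable names are illustrative.

```python
# Guanine-scanning mutants of the AC-I element and the fold change in Kd
# relative to the wild-type motif. Kd values (mol/L) are from Table 3.4;
# 5.00e-6 is the reported lower bound for mutants with no measurable binding.

WILD_TYPE = "ACCTAC"
WT_KD = 9.12e-9
LOWER_BOUND = 5.00e-6

def guanine_scan(motif):
    """Substitute G for each position in turn, one mutant per position."""
    return [motif[:i] + "G" + motif[i + 1:] for i in range(len(motif))]

MUTANT_KD = {
    "GCCTAC": LOWER_BOUND, "AGCTAC": LOWER_BOUND, "ACGTAC": LOWER_BOUND,
    "ACCGAC": 7.19e-7, "ACCTGC": 7.97e-8, "ACCTAG": 5.60e-8,
}

mutants = guanine_scan(WILD_TYPE)
fold_change = {m: MUTANT_KD[m] / WT_KD for m in mutants}
for m in mutants:
    # Mark fold changes that derive from a lower bound on Kd
    prefix = ">" if MUTANT_KD[m] >= LOWER_BOUND else ""
    print(f"{m}  {prefix}{fold_change[m]:.1f}-fold higher Kd than {WILD_TYPE}")
```

On these figures, mutation of any ACC position raises Kd by more than roughly 500 fold, whereas mutation of the TAC positions raises it by roughly 6 to 80 fold.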
3.4.3 The Affinity of AtMYB61 to Specific Target DNA Sequences Was
Predicted by Molecular Interactions Determined in silico
Computational analysis of the 3-dimensional structure of the N-terminal DNA-binding
region of AtMYB61 was conducted in order to validate the role of this domain in
sequence-specific binding. Previously, the structure of the N-terminal DNA-binding
domain of animal c-MYB bound to its DNA consensus motif (AACNG) was solved by
heteronuclear multidimensional NMR (Ogata et al., 1994). The animal c-MYB DNA-binding
region contains a conserved R2R3-MYB domain that exhibits high similarity to plant
R2R3-MYB DNA binding domains. This NMR structure was used as a template to
model the structure of AtMYB61. The AC-I (ACCTAC) and NBS (GAGACC) nucleotide
Figure 3.2. Relative binding affinities of AtMYB61 to CASTing targets and to mutated ACCTAC motifs, determined by nitrocellulose filter-binding assays, were confirmed by electrophoretic mobility shift assays (EMSAs). (a) EMSA of recombinant AtMYB61 protein binding to 6 labelled CASTing target sequences. The protein concentration used was 5 × 10^-8 M, because at this concentration not all targets had reached their binding maximum as determined by nitrocellulose filter-binding assay, allowing differential binding to be observed. (b) EMSA validating relative binding affinities of AtMYB61 to mutated ACCTAC motifs. The protein concentration used was 5 × 10^-8 M. Mutations were made by substituting a single guanine nucleotide at each position along the AC-I element. Black arrow indicates gel shift by the probe. The non-binding site (NBS) is a sequence that does not bind AtMYB61, acting as a negative control. Probes were engineered for the EMSA reaction by inserting the hexamer CASTing sequence or mutated AC-I element sequence into the underlined area.
models were then docked into the predicted binding sites of the AtMYB61 model (Fig.
3.3).
Based on the model of AtMYB61, the predicted molecular interactions between the
binding site of AtMYB61 and its targets supported the in vitro binding data (Fig. 3.3). For
example, there were more hydrogen bonds shared between AtMYB61 DNA-binding
domain and AC-I compared to NBS (Fig. 3.3bcd). Based on the model of AtMYB61
bound to AC-I, several specific intermolecular interactions are predicted to create
binding specificity. These include hydrogen bonds between the following residues:
asparagine-59 (R3 helix) of AtMYB61 with adenosine-1 nitrogen of AC-I; asparagine-
106 (R3 helix) oxygen of AtMYB61 with adenosine-1 hydrogen of AC-I; asparagine-59
(R3 helix) oxygen of AtMYB61 with cytosine-2 hydrogen of AC-I; asparagine-102 (R3
helix) oxygen of AtMYB61 with cytosine-3 hydrogen of AC-I; arginine-56 (R2 helix)
oxygen of AtMYB61 with cytosine-3 hydrogen of AC-I; arginine-54 (R2 helix) hydrogen
of AtMYB61 with thymidine-4 oxygen of AC-I; and, lysine-51 (R2 helix) of AtMYB61 with
adenosine-5 nitrogen of AC-I (Fig. 3.3bc). The leucine-55 (R2 helix) methyl group of
AtMYB61 is predicted to form a non-polar bond with thymidine-4 methyl group of AC-I.
Cytosine-6 remained unbound in the model (Fig. 3.3c). In comparison, the NBS model
had only one hydrogen bond present, involving asparagine-59 (R3 helix) oxygen of
AtMYB61 with adenosine-2 hydrogen of NBS (Fig. 3.3d).
3.4.4 The Affinity of AtMYB61 to Specific Target DNA Sequences Did Not
Correlate with AtMYB61-Driven Transcriptional Activation with
Each of the Target Sequences
Previous studies have shown that AtMYB61 protein is sufficient to drive transcription in
yeast from promoter sequences that contain AC elements (Romano et al., 2012).
Consequently, yeast transcriptional activation assays were used to determine the
relationship between AtMYB61 affinity to specific DNA sequences and its capacity to
drive transcription (Fig. 3.4). Reporter constructs comprised the coding sequence for
β-galactosidase under the control of the yeast minimal CYC1 promoter fused to triple
repeats of a given CASTing target or a mutated AC-I motif (Fig. 3.4). The minimal
Figure 3.3. Molecular modelling of AtMYB61 with target sequences confirms binding preferences determined by nitrocellulose filter-binding assays and EMSAs. (a) Pymol models of the ACCTAC motif docked into the binding site of AtMYB61. Molecular modelling was completed by using the online program PHYRE (Protein Homology/analogY Recognition Engine) to predict a 3D structure of AtMYB61 using homology to the c-MYB DNA binding domain. The PDB (Protein Data Bank) file recovered from the PHYRE analysis was used to superimpose the predicted AtMYB61 structure with the c-MYB structure using DaliLite. Using Pymol, the 3D sequence model (ACCTAC) was docked into the predicted binding sites of AtMYB61. The AC-I element model is displayed in yellow, the loop secondary structure of the inferred AtMYB61 model is displayed in green, and the helix secondary structure of the inferred AtMYB61 model is displayed in red. (b) Model of the AtMYB61 binding site with the first three (ACC) nucleotides in the ACCTAC sequence, indicating that these nucleotides are essential for binding. The AC-I (ACCTAC) nucleotide model was docked into the predicted binding site of AtMYB61. The specific hydrogen bonds between the amino acids of the AtMYB61 binding site and the ACC nucleotides of AC-I were predicted by Pymol and are listed as follows: asparagine-59 (R3 helix) hydrogen to adenosine-1 nitrogen; asparagine-106 (R3 helix) oxygen to adenosine-1 hydrogen; asparagine-59 (R3 helix) oxygen to cytosine-2 hydrogen; asparagine-102 (R3 helix) oxygen to cytosine-3 hydrogen; and arginine-56 (R2 helix) oxygen to cytosine-3 hydrogen. This is consistent with binding data determined by the nitrocellulose filter-binding assay and EMSAs, reiterating that the ACC motif is the core recognition motif of AtMYB61. (c) Model of the AtMYB61 binding site with the TAC nucleotides in the ACCTAC sequence, indicating that these nucleotides are less essential for binding. The AC-I (ACCTAC) nucleotide models were docked into the predicted binding sites of AtMYB61. The molecular interactions between the amino acids of the AtMYB61 binding site and the TAC nucleotides of AC-I were analysed by Pymol and are listed as follows: leucine-55 (R2 helix) methyl group was predicted to form a non-polar interaction with thymidine-4 methyl group; arginine-54 (R2 helix) hydrogen was predicted to form a hydrogen bond with thymidine-4 oxygen; lysine-51 (R2 helix) hydrogen was predicted to form a hydrogen bond with adenosine-5 nitrogen; and cytosine-6 remained unbound in the model. (d) Model of the AtMYB61 binding site with the non-binding site (GAGACC), predicting that this motif is not recognised by AtMYB61. The non-binding site model was docked into the AtMYB61 binding site via Pymol and hydrogen bonding was analysed. Only one hydrogen bond was predicted, between AtMYB61 asparagine-59 (R3 helix) oxygen and the non-binding site adenosine-2 hydrogen. Yellow dashed lines indicate hydrogen bonds predicted by the Pymol program, and blue dashed lines indicate non-polar interactions.
Figure 3.4. AtMYB61-mediated activation of promoter activity in Saccharomyces cerevisiae in an AC-dependent fashion. (a) The sequences of the oligonucleotides cloned into the reporter vector using EcoRI and SalI sites. Each AC element or mutated ACI element is triplicated within the segment. (b) Schematic representations of the Effector (pYES2TRP::AtMYB61) and Reporter (pLacZi::AC) constructs used in this assay (CYC1: minimal yeast promoter). (c) Quantitative analysis of β-galactosidase activity in yeast after induction. The measurements in liquid assay were made from three independent biological replicates. Activation by AtMYB61 protein of artificial genes comprising a minimal CYC1 promoter fused to a tandem AC element or mutated ACI element upstream of the lacZ gene, upon growth of the yeast in galactose (light grey bars), gave rise to β-galactosidase activity that was significantly greater than the controls, as determined by analysis of variance (P < 0.005); the controls included each vector alone, or both together after growth on non-inducing glucose (dark grey bars). Error bars represent standard deviations. * indicates statistically significant, P < 0.005, determined by t-test. Underlined bases correspond to substituted guanines.
CYC1 promoter is unable to support transcription, so reporter expression would be
contingent on the capacity of AtMYB61 to bind to the fused motifs, which would function
as gene regulatory sequences. The expression of AtMYB61 was under the control of
the galactose-inducible GAL1 promoter. As determined by the quantification of
β-galactosidase activity, when AtMYB61 protein was induced by galactose, the protein
was able to activate transcription from the CASTing target sequences but not from the
mutated AC-I elements (Fig. 3.4). The extent of transcriptional activation varied for
each CASTing target (Fig. 3.4c). Notably, CASTing target sequences ACCATA,
ACCAAT, and ACCAAA supported greater amounts of β-galactosidase induction
relative to the AC-I element, which bound with the greatest affinity to AtMYB61 (Fig.
3.4c).
Previously, R2R3-MYB proteins have been shown to bind to AC elements and activate
transcription in yeast and in planta; however, these studies did not correlate binding
affinity with ability to activate transcription (Jin et al., 2000; Patzlaff et al., 2003a;
Patzlaff et al., 2003b; Gomez-Maldonado et al., 2004). Yeast activation assays
determined that the affinity of AtMYB61 to specific target DNA sequences did not
correlate with AtMYB61-driven transcriptional activation with each of the target
sequences. This is consistent with results obtained using the glucocorticoid receptor
(GR), where no correlation between in vitro binding affinities and in vivo transcriptional
activities was observed (Meijsing et al., 2009). GR target sequences, differing by as
little as a single nucleotide, differentially affected GR DNA binding and transcriptional
activity, with no correlation between these parameters. The behaviour of AtMYB61
reported here follows the same pattern. It may be that the conformation
of AtMYB61 changes when it binds to a specific DNA sequence, altering its ability to
activate transcription.
3.4.5 CASTing Target Sequences Were Found in the Promoter Regions
of Three Putative Direct Downstream Targets of AtMYB61
Previous experiments identified three putative direct downstream target genes of
AtMYB61 (Fig. 3.5) (Romano et al., 2012). These gene targets encode the following
Figure 3.5. Sequences recovered from the CASTing assay were found in the promoter regions of all three predicted direct downstream targets of AtMYB61, namely KNOTTED1-like transcription factor (KNAT7, At1g62990), caffeoyl-CoA 3-O-methyltransferase (CCoAOMT7, At4g26220), and pectin-methylesterase (PME, At2g45220). The three putative AtMYB61 direct downstream target genes were identified by Romano et al. (2012) as the intersection of the sets of genes found to be co-regulated with AtMYB61 in both the AtGenExpress developmental dataset and the AtMYB61-specific microarray experiment. The 1000 bp upstream regulatory regions of the three genes were examined. +/- indicate the orientation of CASTing target sequences relative to the sense coding strand, whereas numbers indicate the position of these motifs relative to the putative transcriptional start site (indicated by an arrow). Triangles represent ACCAAA, squares represent ACCAAT, and circles represent ACCATA.
gene products: a KNOTTED1-like transcription factor (KNAT7, At1g62990); a caffeoyl-
CoA 3-O-methyltransferase (CCoAOMT7, At4g26220); and a pectin-methylesterase
(PME, At2g45220). The CASTing targets were identified in the 1000 bp 5′ non-coding
regions of the three putative direct target genes (Fig. 3.5). AtMYB61 bound to the 5′
gene regulatory sequences of all three putative direct target genes in an AC-dependent
manner (Romano et al., 2012). These data support the hypothesis that AtMYB61 binds
to AC elements in a distinct set of target genes to modify gene expression.
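The kind of motif search summarised in Fig. 3.5 can be illustrated with a short script. What follows is a minimal sketch and not the procedure used in this work: it assumes that the upstream region is supplied as a sense-strand string ending immediately 5′ of the putative transcriptional start site, and the function names (reverse_complement, find_motifs) and the toy sequence are hypothetical. It reports each occurrence of the three CASTing targets on either strand, with its position relative to the start site.

```python
# Illustrative sketch only: function names and the toy sequence are hypothetical.
MOTIFS = ("ACCAAA", "ACCAAT", "ACCATA")  # CASTing targets shown in Fig. 3.5
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motifs(upstream, motifs=MOTIFS):
    """Find motif occurrences on both strands of an upstream region.

    `upstream` is assumed to be the sense-strand sequence ending
    immediately 5' of the transcriptional start site. Each hit is
    (motif, strand, position), where position is the sense-strand
    coordinate of the leftmost matched base relative to the start
    site (negative values lie upstream).
    """
    upstream = upstream.upper()
    n = len(upstream)
    hits = []
    for motif in motifs:
        # "+" searches the motif itself; "-" searches its reverse complement.
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = upstream.find(pattern)
            while start != -1:
                hits.append((motif, strand, start - n))
                start = upstream.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[2])


# Toy 18-nt sequence, not a real promoter.
print(find_motifs("GGACCAAAGGTATGGTCC"))
# [('ACCAAA', '+', -16), ('ACCATA', '-', -8)]
```

A real analysis would substitute the 1000 bp upstream sequence of each gene for the toy string; none of the three motifs is palindromic, so a site is never counted on both strands.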
3.5 Conclusion
Despite the size and importance of the plant R2R3-MYB family of transcriptional
regulators, little is known about the molecular functioning of given family members. The
work described herein casts greater light on the interaction between an R2R3-MYB
family member and its cognate DNA targets. The findings support the hypothesis that
AtMYB61 is recruited to target genes via its interactions with a set of unique sequences,
and thereby modifies gene expression. Surprisingly, the affinity of AtMYB61 to specific
target DNA sequences did not correlate with AtMYB61-driven transcriptional activation
with each of the target sequences, suggesting that the conformation of AtMYB61 may
be altered allosterically when bound to specific target sequences. These findings point
to additional complexities in the regulation of plant gene expression, and argue for the
need for greater exploration of the molecular intricacies involved in the interactions
between plant transcription factors and their DNA targets.
3.6 Acknowledgements
We are very grateful to Ms. Joan Ouellette for technical assistance, Mr. Ke Wu for
assistance on the CASTing assay and to Ms. Stephanie Tung, Ms. Kate Lee and Ms.
Trisha Min for assistance with the yeast experiments. This work was generously
supported by a Natural Sciences and Engineering Research Council of Canada
(NSERC) Canada Graduate Scholarship (CGSD) awarded to M.P., and funding from
NSERC to M.M.C.
3.7 Supplemental Figures and Tables
Figure S3.1. AtMYB61 antibody generation and validation. (a) Amino acid sequence similarity of AtMYB61 and its closest family member, AtMYB50. The two proteins have conserved N-terminal amino acid sequences but unique C-terminal domains; the AtMYB61 C-terminal domain (highlighted region) was selected as the antigen for generating AtMYB61 antibodies. (b) A chemiluminescence western blot of full-length AtMYB61 recombinant protein (Lane 1), antibody alone (Lane 2), and AtMYB61 recombinant protein immunoprecipitated with prebleed serum (Lane 3) or with AtMYB61-specific antiserum (Lane 4) validates AtMYB61 antibody specificity. The western blot was done with a 1:20 000 dilution of post-injected serum. It shows greater quantities of AtMYB61 protein eluted off the Magnetic Dynabeads Protein G post-injected antibody complex than off the Magnetic Dynabeads Protein G pre-injected antibody complex, showing that the immunoprecipitation was successful.
Supplemental Table S3.1. Relative binding of CASTing targets and mutated AC1
sequences to AtMYB61
ACCAAC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 16203 20451 26456 111235 223153 310225 325212 335456 460122
Trial 2 15145 18513 23513 92214 242153 315212 321021 324658 458213
Trial 3 13142 19088 20578 114285 231026 288232 304666 307279 446521
Average 14830 19350 23515 105911 232110 304556 316966 322464 454952
Binding 0.0352182 0.045954 0.055845 0.251517 0.551214 0.723257 0.752728 0.765785
ACCACC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 32891 42654 47895 54112 112356 220167 328989 363354 481234
Trial 2 33564 41258 45654 55333 111589 226896 322644 312578 495242
Trial 3 38289 41124 46524 49992 117458 236446 328227 341592 475863
Average 34914 41678 46691 53145 113801 227836 326619 339174 484113
Binding 0.0774535 0.092459 0.103578 0.117897 0.252453 0.505425 0.724564 0.752415
ACCAAA 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 26442 31996 176351 255645 319665 321348 345631 347562 496372
Trial 2 25854 31254 169856 251335 314556 318964 342654 342556 489653
Trial 3 22542 26978 141629 241274 310182 310808 331499 342380 481234
Average 24946 30076 162612 249418 314801 317040 339928 344166 489086.33
Binding 0.0551982 0.066549 0.359815 0.55189 0.696565 0.701519 0.752165 0.761542
ACCAAT 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 20155 46254 211558 266242 312585 357456 354231 362645 487651
Trial 2 18982 45335 208334 263423 311225 349978 344580 350024 480225
Trial 3 19003 41241 196044 266540 305857 297827 292797 300952 479852
Average 19380 44276 205312 265401 309888 335086 330536 337873 482576
Binding 0.0431234 0.098522 0.456848 0.590556 0.689546 0.745615 0.735489 0.751816
ACCACA 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 17245 24288 37524 126997 224568 308521 334568 337851 462254
Trial 2 16670 22853 36293 124789 219895 305452 325586 329987 461235
Trial 3 11815 24571 35957 117110 140819 296508 325909 318383 459978
Average 15243 23904 36591 122965 195093 303493 328687 328740 461155
Binding 0.0354852 0.055647 0.085182 0.286255 0.454165 0.706512 0.765162 0.765285
ACCATA 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 21258 28269 171654 242588 324689 320115 345456 345571 475821
Trial 2 20199 27855 169558 234560 311471 319524 339887 340129 474458
Trial 3 18837 22493 154411 185371 266696 315799 309027 325918 468521
Average 20097 26205 165207 220839 300951 318479 331456 337206 472933
Binding 0.0456419 0.059512 0.375182 0.50152 0.683453 0.723257 0.752728 0.765785
ACCTAC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 42125 51226 251665 288968 332541 331224 361547 358702 489213
Trial 2 40242 52874 248552 287110 325574 312123 358990 348873 486237
Trial 3 37496 45365 232628 250710 288998 323556 288460 312273 480411
Average 39954 49821 244281 275596 315704 322300 336332 339949 485287
Binding 0.0882924 0.110098 0.539824 0.609024 0.697657 0.712234 0.743241 0.751234
GCCTAC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 13586 23558 25334 34135 55101 68951 62548 66352 478921
Trial 2 11440 21040 22114 37526 57241 69524 60177 63874 476621
Trial 3 5666 15369 22859 37855 50639 65307 54646 56274 465312
Average 10230 19989 23435 36505 54326 67927 59123 62166 473618
Binding 0.0232423 0.0454123 0.0532423 0.0829349 0.123423 0.154321 0.134321 0.141234
AGCTAC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 14448 18542 23868 38512 51224 85304 71452 74289 468621
Trial 2 16273 17520 21264 37246 49254 82330 68871 72555 461255
Trial 3 13322 10614 21663 29321 42945 81802 53817 68382 458913
Average 14680 15558 22264 35026 47807 83145 64713 71741 462929
Binding 0.0345132 0.0365768 0.0523423 0.0823432 0.11239 0.195465 0.152134 0.168657
ACGTAC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 27881 31456 44586 49853 78921 69889 81227 83556 498533
Trial 2 26648 35213 41001 44571 69246 71526 82254 84470 491524
Trial 3 22840 22499 32838 35768 78246 62399 71100 90315 489255
Average 25789 29722 39475 43397 75471 67937 78526 86113 493104
Binding 0.0565421 0.0651652 0.0865465 0.0951456 0.165465 0.148949 0.172165 0.188798
ACCGAC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 8014 18520 43558 51201 63599 133563 262147 335864 475561
Trial 2 4861 14332 41002 49664 54248 119211 246610 312247 472608
Trial 3 3577 9157 43689 42203 49600 113920 210525 333435 467823
Average 5484 14003 42749 47689 55815 122231 239760 327182 471997
Binding 0.012591 0.03215 0.098156 0.109498 0.12816 0.280651 0.55051 0.75123
ACCTGC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 17265 34552 36521 92234 233458 290142 330121 388914 466258
Trial 2 16998 32621 37229 88841 225895 289521 322449 311258 462135
Trial 3 9756 34186 31294 94421 207276 256933 308889 269165 459532
Average 14673 33786 35014 91832 222209 278865 320486 323112 462641
Binding 0.034285 0.07895 0.081816 0.214575 0.51922 0.651598 0.74885 0.75499
ACCTAG 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 18554 34858 44578 155441 225898 321580 335412 344521 465852
Trial 2 15587 33654 41876 159872 241014 333148 301512 335215 462344
Trial 3 9779 35282 30872 139987 206279 220897 328389 300488 461247
Average 14639 34597 39108 151766.76 224397 291874.9 321771 326741 463147
Binding 0.034285 0.08102 0.091588 0.355419 0.52551 0.683535 0.75355 0.76519
NBS 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe
Trial 1 19985 37885 34458 43528 51985 68555 55512 71445 465664
Trial 2 17753 37521 32141 41152 47880 64118 52998 66529 463887
Trial 3 17636 34577 34782 42211 52763 67915 53950 74516 460014
Average 18457 36661 33793 42296 50875 66862 54153 70830 463188
Binding 0.043123 0.08565 0.078952 0.098818 0.11886 0.156212 0.12652 0.16548
This table includes nitrocellulose filter binding data determining the relative binding of AtMYB61 to the CASTing targets and to the mutated ACCTAC motifs in triplicate. The 60 bp DNA probes were present in excess amounts. The probe concentration for each sequence was 1065 nM and the total amount of DNA added to each reaction was 124.41 ng. The protein concentrations (M) are labelled in red in the header row of each block; as tabulated, they range from 1.00E-09 M to 5.00E-06 M. The cpm of each sample was measured by a liquid scintillation counter. If AtMYB61 bound to a sequence, the relative binding reached a maximum of ~0.75. If AtMYB61 did not bind to a sequence, the binding did not increase with increasing protein concentration.
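One way to compare the binding curves in Table S3.1 is to estimate the protein concentration at which each probe reaches half of the ~0.75 plateau. The sketch below is illustrative only and is not the analysis performed in this work: the Binding rows for three probes are copied from the table, whereas the function name, the fixed plateau of 0.75 and the use of log-linear interpolation between adjacent concentrations are assumptions. Because the probes were present in excess, the value should be read as a descriptive half-saturation point and not as a dissociation constant.

```python
import math

# Protein concentrations (M) from the header rows of Table S3.1.
CONCENTRATIONS_M = [1e-9, 5e-9, 1e-8, 5e-8, 1e-7, 5e-7, 1e-6, 5e-6]

# "Binding" rows copied from Table S3.1 for three representative probes.
BINDING = {
    "ACCTAC": [0.0882924, 0.110098, 0.539824, 0.609024,
               0.697657, 0.712234, 0.743241, 0.751234],
    "ACCAAC": [0.0352182, 0.045954, 0.055845, 0.251517,
               0.551214, 0.723257, 0.752728, 0.765785],
    "GCCTAC": [0.0232423, 0.0454123, 0.0532423, 0.0829349,
               0.123423, 0.154321, 0.134321, 0.141234],
}


def half_max_concentration(conc, binding, plateau=0.75):
    """Estimate the concentration giving half of `plateau` binding.

    Interpolates linearly in log10(concentration) between the two
    adjacent points that bracket the half-maximal value. Returns None
    if the curve never reaches half of the plateau (hypothetical helper).
    """
    target = plateau / 2
    points = list(zip(conc, binding))
    for (c_lo, b_lo), (c_hi, b_hi) in zip(points, points[1:]):
        if b_lo < target <= b_hi:
            frac = (target - b_lo) / (b_hi - b_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


for probe, values in BINDING.items():
    print(probe, half_max_concentration(CONCENTRATIONS_M, values))
```

With these rows, the half-saturation point for ACCTAC falls between 5.00E-09 M and 1.00E-08 M, that for ACCAAC falls between 5.00E-08 M and 1.00E-07 M, and GCCTAC never reaches half of the plateau.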
Chapter 4
Novel regulation of an R2R3-MYB transcription factor, AtMYB61, by a non-hexokinase sugar-signalling pathway
This chapter comprises the following manuscript-in-preparation in its entirety:
Michael B. Prouse, Christian Dubos, Cécile Vriet, & Malcolm M. Campbell (2013) Novel
Regulation of an R2R3-MYB transcription factor AtMYB61 by a non-hexokinase sugar-
signalling pathway.
Contributions: MBP, CD, MMC designed research; MBP, CD, CV performed
research; MBP, CD, CV, MMC analysed data; MBP, MMC wrote manuscript with
editorial assistance from CD, CV.
MBP contributed specifically to Fig. 4.5, Fig. 4.6, Fig. 4.7, Fig. 4.8, Fig. 4.9, Fig. 4.10,
Fig. S4.1, Fig. S4.2, Fig. S4.3, Fig. S4.4, Table 4.1, and Table S4.1.
4 Novel Regulation of an R2R3-MYB Transcription Factor, AtMYB61, by a Non-Hexokinase Sugar-Signalling Pathway
4.1 Abstract
AtMYB61, a member of the R2R3-MYB family of transcription factors in Arabidopsis
thaliana, alters gene expression, resulting in pleiotropic modifications of carbon
allocation throughout the plant body. Here, we demonstrate that AtMYB61 expression
is modulated by photosynthate through a novel sugar-signalling pathway that does not
appear to directly involve hexokinase. Analysis of promoter-reporter fusion constructs
that contained or did not contain AtMYB61 5′ intragenic sequences determined that
AtMYB61 expression is de-repressed by soluble sugars in a mechanism involving
intragenic sequences. Phylogenetic footprinting identified a repeat motif, termed the
second intron repeat, that was conserved across the second intron of Brassicaceae
AtMYB61 homologues. Nuclear proteins from seedlings grown in the presence or
absence of sugars bound differentially to the second intron repeat, consistent with the
de-repression model. Second intron repeat binding proteins were identified and
characterised using a combination of loss-of-function genetics and transcriptome
analysis. Taken together, these results uncover a novel protein activity that binds a
conserved repeat motif within the AtMYB61 second intron, and suggest that it regulates
sugar-mediated gene expression in other genes that contain this repeat. The elucidation of the
upstream regulation of AtMYB61 thereby uncovers a novel sugar signalling pathway
that makes use of intragenic sequences as regulatory elements, which appears to act in
a pathway independent of hexokinase.
4.2 Introduction
As sessile photoautotrophs, plants must balance their requirements for carbon against
their ability to fix carbon through photosynthesis. Consequently, plants have evolved
mechanisms that enable them to contend with fluctuations in their capacity to fix carbon.
Some of these mechanisms control the aperture of stomata, dynamic pores found on
the surfaces of plant leaves that control water loss from the plant and regulate the
uptake of CO2 for photosynthesis. Such mechanisms modulate carbon acquisition
relative to prevailing environmental conditions, thereby creating variations in the levels
of photosynthate, the sugars derived from photosynthesis. Accordingly, plants have
also evolved mechanisms to perceive and respond to photosynthate (Koch, 1996;
Smeekens, 2000; Rolland et al., 2002; Halford and Paul, 2003). These mechanisms
appropriately modulate the allocation of carbon to various facets of plant growth,
development and metabolism. While a significant body of evidence suggests a link
between the signalling pathways that modulate carbon acquisition and those involved in
resource allocation, rather little is known about the specific factors involved.
We found that a gene encoding a member of the Arabidopsis thaliana R2R3-MYB family
of transcription factors, AtMYB61 (At1g09540), was expressed in guard cells in a
manner consistent with involvement in the control of stomatal aperture (Liang et al.,
2005). Over-expression and loss-of-function mutant analyses revealed that AtMYB61
expression was both sufficient and necessary to bring about reductions in stomatal
aperture with consequent effects on gas exchange (Liang et al., 2005). Taken together,
the data provided evidence that AtMYB61 encodes a transcription factor implicated in
the closure of stomata. Aside from its involvement in the control of stomatal aperture,
we recently found that AtMYB61 is sufficient and necessary to allocate carbon to the
two other major components of the plant transpiration system – the water conducting
xylem cells and the root system (Romano et al., 2012). Thus, it appears that AtMYB61
regulates processes related to the acquisition and allocation of carbon, perhaps
functioning to balance carbon supply with demand. Ideally, such a chemostat would be
informed by the level of carbon itself.
We show here that AtMYB61 expression is modulated by photosynthate. The results
suggest that AtMYB61 integrates signals derived from the perception of sugars, but that
this does not directly involve the hexokinase sugar-signalling pathway. Conversion of
the sugar signals into a transcriptional response is dependent on intragenic sequences
located within one of the two introns of AtMYB61. A novel protein activity that binds to
a repeat motif found within this intron is uncovered, and proposed to regulate sugar-
mediated gene expression in a suite of genes that contain repeats of the same motif,
predominantly in intragenic regions. AtMYB61 thereby uncovers a novel sugar-
signalling pathway that makes use of intragenic non-coding sequences as cis-acting
elements, and a novel repressor protein in the regulatory pathway.
4.3 Materials and Methods
4.3.1 Plant Material and Culture
Wild-type (WT) Arabidopsis thaliana seeds (Col-0) were obtained from the Nottingham
Arabidopsis Stock Centre (NASC). The AtMYB61 promoter/5′ coding sequence
(containing introns)::uidA (61PN::GUS) fusion, AtMYB61 promoter (not containing
introns)::GFP (61P::GFP) fusion and AtMYB61 promoter/5′ coding sequence
(containing introns)::GFP (61PN::GFP) fusion were constructed and stably transformed
into Arabidopsis thaliana plants as described previously (Newman et al., 2004; Liang et
al., 2005). Arabidopsis thaliana seeds were sterilised and grown according to standard
protocols (Newman et al., 2004), except where indicated below. Growth stages were
assigned based on published standards (Boyes et al., 2001).
Plants over-expressing AtMYB61 under the control of the Cauliflower Mosaic Virus 35S
promoter (35S::MYB61) were as described previously (Newman et al., 2004; Liang et
al., 2005). Similarly, AtMYB61 loss-of-function mutants (atmyb61) and glucose
insensitive mutants (gin) have been described previously (Zhou et al., 1998; Arenas-
Huertero et al., 2000; Penfield et al., 2001; Moore et al., 2003; Liang et al., 2005).
The 35S::MYB61 line and the loss-of-function alleles atmyb61-1, gin1-3, gin6-1 and
gin2-1 were used in all
experiments, and results are representative. T-DNA insertional mutant lines
corresponding to either AtMYB61 or putative downstream targets of AtMYB61 were
obtained from the Arabidopsis Biological Resource Center (ABRC) (Alonso et al., 2003).
Homozygous T-DNA lines were obtained by PCR screening using the left border T-DNA
primer and a right border gene-specific primer
(http://signal.salk.edu/tdnaprimers.2.html). Insertion sites were sequenced for all
mutants to verify insertional mutagenesis (data not shown), and quantitative PCR was
conducted to show that the mutants were loss-of-function.
For primary bolt and hypocotyl analyses, seeds were germinated and plants were grown
on soil. Seeds were sown on dampened soil and then cold stratified for 3 d before
placement in a growth chamber at 21°C with a regime of 12 h of light (120 μmol m−2 s−1)
and 12 h of dark. This growth regime is referred to as short-day conditions herein.
4.3.2 Phylogenetic Analysis of AtMYB61 Brassicaceae Homologues
Sequences of Brassicaceae AtMYB61 homologues were obtained from the online
Phytozome tool (http://phytozome.net). Intragenic and flanking exonic sequences of
the Brassicaceae AtMYB61 homologues
(Arabidopsis thaliana gene At1g09540; Arabidopsis lyrata gene 919710; Capsella
rubella gene Carubv10009497m.g; Brassica rapa gene Bra020016; and Thellungiella
halophila gene Thhalv10008000m.g) were aligned using the online ClustalW2 tool
(http://ebi.ac.uk/Tools/msa/clustalw2/). The resulting alignments
were submitted to Berkeley's online WebLogo tool
(http://weblogo.berkeley.edu/logo.cgi) to obtain sequence logos.
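Sequence logos summarise an alignment by scaling each column according to its information content. The sketch below illustrates that calculation for a DNA alignment; it is not the WebLogo implementation, it omits any small-sample correction that logo tools may apply, and the function name and the four-sequence toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(alignment):
    """Return the information content (bits) of each alignment column.

    For DNA the maximum is log2(4) = 2 bits, reached when a column is
    perfectly conserved. Gaps and ambiguous symbols are ignored, and no
    small-sample correction is applied (hypothetical helper).
    """
    alignment = [seq.upper() for seq in alignment]
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    info = []
    for column in zip(*alignment):
        counts = Counter(base for base in column if base in "ACGT")
        total = sum(counts.values())
        if total == 0:
            info.append(0.0)  # column contains only gaps or ambiguous symbols
            continue
        # Shannon entropy of the observed base frequencies in this column.
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        info.append(2.0 - entropy)
    return info


# Toy alignment (not real AtMYB61 homologue sequences).
toy = ["ACGT", "ACGA", "ACCA", "ACTA"]
print(column_information(toy))
```

In the toy alignment the first two columns are invariant and score the full 2 bits, whereas the third column (G, G, C, T) scores 0.5 bits; conserved positions of a repeat motif would stand out in the same way.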
4.3.3 Analysis of Transgenic Plants Containing Promoter::Reporter
Fusions
For the analysis of GUS expression, seedlings were germinated in the dark in liquid
Murashige Skoog (MS) medium as described previously (Newman et al., 2004). Liquid
MS medium contained no carbon source, and sugars were supplemented to a final
concentration of 30mM with either sucrose, glucose, fructose, maltose, turanose,
palatinose, or raffinose. 2-deoxyglucose and 3-O-methylglucose were added to the MS
medium, in the dark, to a final concentration of 30mM, after the seeds had germinated.
In the experiment with mannoheptulose (MHL), the MS medium was supplemented with
30mM sucrose and 100 mM MHL. Seedlings were grown for 7 days in the dark prior to
analysis by confocal scanning laser microscopy. These seedlings were mounted in an
aqueous solution of 10 µg/ml propidium iodide on a microscope slide and examined
using a Zeiss LSM 510 confocal laser scanning microscope according to published
protocols (Matsumoto, 2002). Histochemical localisation and quantitative fluorometric
(methylumbelliferone-glucuronide, MUG) assay of GUS activity was conducted as
described previously (Gallagher, 1992). Quantitative GUS analyses used protein
extracts obtained from 37 seedlings that had been frozen in liquid nitrogen and then
ground to a fine powder. Three biological replicates were obtained per condition
examined, and each protein extract was measured in duplicate.
4.3.4 Semi-Quantitative PCR
Semi-quantitative PCR was conducted on loss-of-function mutants of sugar-sensitive
putative repressors of AtMYB61 expression (rmx), WT Col, atmyb61-1, and 35S::MYB61.
RNA was extracted from 400 mg of frozen tissue from 7 day-old dark-grown seedlings,
grown in the presence or absence of 30 mM sucrose, using the
RNeasy Plant Mini Kit (Qiagen, Toronto, ON, Canada) according to the manufacturer's
instructions. cDNA synthesis was primed with oligo(dT) using SuperScript II
Reverse Transcriptase (Invitrogen, Burlington, ON, Canada) following the
manufacturer's instructions. The AtMYB61 gene was amplified using primers
F61Bam (5′-GGATCCATGGGGAGACATTCTTTGCTGTTAC-3′) and R61Eco (5′-
GAATTCTAAAGGGACTGACCAAAAGAGAC-3′). Semi-quantitative PCR was
performed on first strand cDNA using the MJ Research PTC Thermal Cycler (Bio-Rad,
Mississauga, ON, Canada). Semi-quantitative PCR conditions were: 90°C for 2 min,
and then 35 cycles of the following: 90°C for 40 sec, 65°C for 1 min, 72°C for 2 min, and
then 72°C for 5 min. The data were normalized to an actin control gene (ACT-11) that
was amplified using primers ACT1 (5′-GCC-AAAGCAGTGATCTCTTTGCTC-3′) and
ACT2 (5′-GTGTTGGAC-TCTGGAGATGGTGTG-3′), using the above reaction
conditions with either 25 or 35 amplification cycles. Results were analysed for
AtMYB61 misexpression to validate these nuclear proteins, which bound to the second
intron and second intron repeat, as putative AtMYB61 repressors.
4.3.5 Quantitative, Real-Time, Reverse Transcriptase Polymerase Chain
Reaction (qRT-PCR)
qRT-PCR for sugar regulation of AtMYB61 was conducted by extracting RNA from
400mg of seedlings that had been grown in the dark for 7 days in liquid MS medium
supplemented with sugars as described above. RNA was extracted from frozen tissue
using the RNeasy Plant Mini Kit (Qiagen, Toronto, ON, Canada) according to
manufacturer‘s instructions. The extracted RNA was subjected to DNase digestion
(DNA-free Kit, Ambion, Burlington, ON, Canada), precipitation (GlycoBlue, Ambion,
Burlington, ON, Canada), and purification (RNeasy Plant Mini Kit, Qiagen, Toronto, ON,
Canada). cDNA Synthesis was accomplished using the RETROscript Kit (Ambion,
Burlington, ON, Canada) according to manufacturer‘s instructions. Three sets of
polymerase chain reaction (PCR) primers were designed using Primer Express 2.0
(Applied Biosystems). These primers spanned an intron/exon boundary in order to
circumvent the amplification of genomic DNA. The first set of primers generated an
amplicon corresponding to AtMYB61 (At1g09540U: 5′-TGG AAA CAG ATG GTC ACA
GAT TG-3′; At1g09540L: 5′-ATG CTT GAG TTC CAT AGA TTC TTG ATC-3′), the
second to HEXOKINASE-2 (AtHXK2) (AtHXK2U: 5′-ACA AAT GCA GCC TAT GTC
GAA CGT G-3′; AtHXK2L: 5′-TGT TCG GGG TCC TTA TGA TGA ATG G-3′) and the
third set amplified the internal qPCR control, AtTUBULIN4 (At5g44340U: 5′-AAC GCT
GAC GAG TGT ATG GTT TT-3′; At5g44340L: 5′-CCA AAG GTA GGA TTA GCG AGC
TT-3′). qRT-PCR was performed on first strand cDNA using the QuantiTect SYBR
Green PCR Kit (Qiagen, Toronto, ON, Canada). qRT-PCR conditions were: 50°C for 2
min, 95°C for 10 min, and then 40 cycles of 95°C for 15 s and 60°C for 1 min.
Quantification was performed using the ABI PRISM 7700 Sequence Detection System
(Applied Biosystems).
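The text does not state how relative transcript abundance was derived from the qRT-PCR data. One widely used approach, shown here purely as an assumption, is the comparative Ct (2^-ΔΔCt) method with AtTUBULIN4 as the reference gene. The function name and all Ct values in the sketch are hypothetical, and the calculation assumes that both amplicons amplify with close to 100% efficiency.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript by the comparative Ct method.

    Each Ct is first normalised to the reference gene in the same sample
    (delta Ct); the treated sample is then compared with the control
    (delta-delta Ct). Assumes roughly 100% amplification efficiency.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)


# Hypothetical Ct values: AtMYB61 versus AtTUBULIN4, with and without sucrose.
fold = relative_expression(
    ct_target_treated=24.0, ct_ref_treated=20.0,   # + sucrose
    ct_target_control=26.0, ct_ref_control=20.0,   # no sugar
)
print(fold)  # 4.0
```

In this made-up example the target crosses threshold two cycles earlier in the treated sample at equal reference Ct, which corresponds to a four-fold increase.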
qRT-PCR for loss-of-function mutants of sugar-sensitive putative repressors of AtMYB61
expression (rmx) was conducted on RNA extracted from 7 day-old dark-grown
rmx seedlings, grown in the presence or absence of 30 mM sucrose. RNA was
extracted from frozen tissue using the RNeasy Plant Mini Kit (Qiagen, Toronto, ON,
Canada) according to the manufacturer's instructions.
cDNA was synthesised from the RNA using SuperScript II Reverse Transcriptase
(Invitrogen, Burlington, ON, Canada) initiated using an oligo(dT) primer, following the
manufacturer's instructions. qRT-PCR was performed on first strand cDNA using the
iCycler iQ real-time PCR detection system (Bio-Rad, Mississauga, ON, Canada). The
AtMYB61 amplicon was generated using primers F61Bam (5′-
GGATCCATGGGGAGACATTCTTTGCTGTTAC-3′) and R61Eco (5′-
GAATTCTAAAGGGACTGACCAAAAGAGAC-3′). qRT-PCR conditions were: 90°C for 2
min, and then 35 cycles of the following: 90°C for 40 sec, 65°C for 1 min, 72°C for 2
min, and then 72°C for 5 min. The data were normalized to an actin control gene (ACT-
11) that was amplified using primers ACT1 (5′-GCC-AAAGCAGTGATCTCTTTGCTC-3′)
and ACT2 (5′-GTGTTGGAC-TCTGGAGATGGTGTG-3′), using the above reaction
conditions with either 25 or 35 amplification cycles.
4.3.6 Electrophoretic Mobility Shift Assay (EMSA)
Nuclear extracts were purified from 7 day-old dark-grown wild-type Columbia
Arabidopsis thaliana seedlings grown in the absence or presence of sucrose according
to Saleh et al. (2008). The 90 bp intron-two repeat was amplified and
purified using a Qiagen nucleotide removal column, according to the manufacturer's
instructions (Qiagen, Toronto, ON, Canada). The purified PCR products were then
radioactively labelled with 32P via incorporation of a radiolabelled nucleotide following
primer extension (Sablowski et al., 1994; Hatton et al., 1995). The labelled
oligonucleotide was then subjected to a final purification using the Qiagen nucleotide
purification kit, according to the manufacturer's instructions (Qiagen, Toronto, ON,
Canada). Incorporation of 32P into the probe was measured with a liquid scintillation
counter. Affinity of binding was assessed using
competition assays with the unlabelled AtMYB61 intron-two repeat sequence. EMSA
conditions were exactly as described previously (Patzlaff et al., 2003b), except that
nuclear extracts purified from 7 day-old dark-grown wild-type Columbia Arabidopsis
thaliana seedlings, grown in the absence or presence of sucrose, were used in place of
pine MYB protein.
4.3.7 Streptavidin Biotin Pull-Down Assay
For the streptavidin-biotin pull-down assay, the second intron and the second intron
repeat were biotinylated and immobilised on M280 Streptavidin Dynabeads (Invitrogen,
California, USA). Nuclear extracts from 7 day-old, dark-grown wild-type Columbia
Arabidopsis thaliana seedlings grown in either the absence or presence of sucrose,
according to Saleh et al. (Saleh et al., 2008), were exposed to biotinylated complexes
that were confirmed by the Chemiluminescent Biofisher Biotin Detection Kit (Nepean,
Ontario, Canada). 0.1mg/ml of Poly-R478 was used in each reaction to reduce non-
specific binding. The proteins that bound the biotinylated complexes were subjected to
mass spectrometry.
4.3.8 Mass Spectrometry
Liquid chromatography tandem mass spectrometry (LC-MS/MS) with the Orbital Mass
Spectrometer was conducted on peptides purified from the streptavidin biotin pull-down
assay as previously described (Hewel et al., 2010). The confidence of each protein
identification was calculated with the StatQuest program (Kislinger et al., 2003). Proteins
were identified by searching the Arabidopsis thaliana subset of the UniProt database.
All spectra were also searched separately against a human/mouse database; no
significant identifications were obtained, supporting the specificity of the identifications
in the Arabidopsis thaliana samples.
4.4 Results and Discussion
4.4.1 AtMYB61 Expression is Regulated by Sugars
Arabidopsis thaliana seedlings that have been germinated and grown in the dark are
etiolated, and rely completely on seed reserves or exogenous sugar as a source of
carbon (Roldan et al., 1999). Such seedlings serve as a useful model to examine the
effects of sugars on plant cells. AtMYB61 transcript abundance in dark-grown
Arabidopsis thaliana seedlings increased when the seedlings were grown in the
presence of the metabolisable sugars sucrose, glucose or fructose as revealed by
quantitative, real-time, reverse-transcriptase, polymerase chain reaction (qRT-PCR)
(Fig. 4.1). This effect was not osmotic as the presence of sorbitol (a non-metabolisable
sugar alcohol) did not induce an increase in transcript abundance.
To investigate how AtMYB61 expression is shaped by sugars, qualitative and
quantitative changes in the activity of the β-glucuronidase (GUS) reporter gene driven
by a translational fusion with the AtMYB61 promoter and 5′ intragenic sequences
(61PN::GUS) were examined (Fig. 4.2ab). Metabolisable sugars (sucrose, glucose,
Figure 4.1. Sugar regulation of AtMYB61 expression in dark-grown wild-type seedlings, 7 days post-germination. qRT-PCR analysis of AtMYB61 expression in response to sugars was conducted on wild-type seedlings grown in the dark for 7 days. Sucrose, glucose and fructose all induced AtMYB61 expression. Sorbitol acted as an osmotic control and did not induce AtMYB61 expression. * indicates significantly different from the no sugar control, p<0.05, t-test.
Figure 4.2. Promoter-reporter and qRT-PCR analysis of AtMYB61 expression in response to sugars. (a) 61PN::GUS expression within the hypocotyl xylem of 7 day-old dark-grown seedlings in response to sugars. In response to metabolisable sugars (sucrose, glucose, fructose and maltose), AtMYB61 gene regulatory sequences were sufficient to drive GUS expression in the hypocotyls of 7 day-old dark-grown seedlings. A mannitol control confirmed that this effect was not due to osmotic regulation. Turanose and palatinose controls showed that the effect was not due to sucrose sensing alone. A raffinose control showed that this effect was not due to a sucrose translocation effect. 3-O-methylglucose (3-OMG) and 2-deoxyglucose (2-DG) controls showed that this effect was not due to the detection of hexose sugars involving the hexokinase (HXK) pathway. (b) Quantitative analysis of 61PN::GUS expression in response to the same sugars and controls presented in (a). Bars in (a) represent 100 µm. * in (b) indicates significantly different from the no sugar control (P < 0.05).
fructose or maltose) significantly increased expression, whereas an equivalent change
in osmotic conditions using sorbitol did not (Fig. 4.2ab). The disaccharides turanose
and palatinose, which can interact with extracellular sucrose sensors (Loreti et al., 2000;
Sinha et al., 2002), failed to increase expression. Similarly, raffinose, which is
translocated with sucrose in the phloem, but not hydrolysed (Haritatos et al., 2000), did
not increase expression (Fig. 4.2ab).
Sugars have been shown to modify the expression of a few members of the R2R3-MYB
family, AtMYB61 being one of them. Previously, it was shown
that AtMYB61 was diurnally regulated, to account for light-to-dark transitions in stomatal
aperture (Liang et al., 2005). Furthermore, AtMYB61 expression was shown to be
modulated by two amino acids implicated in nitrogen partitioning and signalling,
glutamate and glycine (Dubos et al., 2005). It is striking that AtMYB61 activity is up-
regulated by the most significant product of photosynthesis, sucrose, and that it is
down-regulated by two amino acids that are significant by-products of photorespiration,
glutamate and glycine. It may be that AtMYB61 is poised to respond to the abundance
of different carbon skeletons in plants, and thereby modulate carbon acquisition via
stomata and carbon allocation in sink tissues.
4.4.2 AtMYB61 Acts in a Pathway Independent of the Hexokinase Sugar
Signalling Pathway
Hexokinase (HXK) is important as a sugar sensor in plants (Jang et al., 1997;
Smeekens, 2000; Xiao et al., 2000; Rolland et al., 2002; Halford and Paul, 2003; Moore
et al., 2003; Gibson, 2005). Experiments using 3-O-methylglucose (3-OMG), which is
transported into plant cells but not metabolised by HXK, and 2-deoxyglucose (2-DG)
and mannose, which are phosphorylated by HXK but not metabolised further, can be
used to examine the involvement of HXK in sugar signalling (Jang et al., 1997; Pego et
al., 2000). GUS expression driven by the AtMYB61 promoter was not increased in
dark-grown plants provided with 3-O-methylglucose (3-OMG) or 2-deoxyglucose (2-DG)
(Fig. 4.2ab), showing that the sugar-sensing pathway did not simply entail detection of
hexose sugars, nor involve direct signalling via HXK (Jang et al., 1997; Gibson, 2000;
Smeekens, 2000). AtMYB61 promoter-mediated expression was increased by sucrose
even in the presence of the specific HXK inhibitor mannoheptulose (MHL) (Jang et al.,
1997; Chiou and Bush, 1998; Smeekens, 2000) (Fig. 4.2ab). The ability of sucrose to
increase AtMYB61 expression in the presence of MHL supports the hypothesis that
signalling directly by HXK is unlikely to be involved in AtMYB61 expression. AtMYB61
expression was not simply a response to the presence of carbon-based metabolites, as
acetate, pyruvate, succinate and trehalose, which are implicated in carbon metabolite
signalling (Graham et al., 1994), did not induce expression (data not shown).
The relationship between AtMYB61 expression and HXK sugar signalling was also
examined using the Arabidopsis thaliana loss-of-function mutants involved in the
hexokinase sugar signalling pathway: glucose insensitive2 (gin2) (Moore et al., 2003),
glucose insensitive1 (gin1) (Zhou et al., 1998), and glucose insensitive6 (gin6) (Arenas-
Huertero et al., 2000) (Fig. 4.3ab). As determined by qRT-PCR, AtMYB61 transcript
abundance increased in response to sucrose and glucose in wild-type plants (Fig. 4.3a).
This was also observed in gin2, gin1, and gin6 mutants. Moreover, the largest increase
in AtMYB61 transcript abundance was observed when sucrose was added to the
medium. In contrast, transcript abundance of HXK2, which is regulated through the
HXK signalling pathway, increased dramatically in response to glucose, and this
increase was significantly less in the gin2 mutant (Fig. 4.3b). Together, these results
suggest that AtMYB61 transcript abundance is not modulated via the HXK1 signalling
pathway.
Transcript abundance data support the hypothesis that, under most circumstances,
AtMYB61 is likely to function independently of HXK. That is, AtMYB61 and
AtHXK1/GIN2 (At4g29130) have transcript abundance profiles that are slightly
negatively correlated in the AtGenExpress developmental dataset (RAGE=-0.236). This
indicates that the two genes are likely to have inverse transcript abundance relative to
each other, in those instances when their expression is coincident at all. Thus, the HXK
pathway and a distinct "AtMYB61 pathway" are likely to operate non-redundantly, and
the pathway that is deployed is likely to be contingent on the developmental context. A
novel sugar-signalling pathway that does not involve hexokinase has been predicted
(Chiou and Bush, 1998; Tiessen et al., 2003; Dekkers et al., 2004), but the components
Figure 4.3. qRT-PCR analysis of AtMYB61 and HXK-2 expression in wild-type (WT) and glucose insensitive (gin) loss-of-function mutants. (a) qRT-PCR of AtMYB61 expression in response to sugars (glucose and sucrose) in WT and glucose insensitive mutants (gin1, gin6 and gin2) reveals that AtMYB61 acts in a sugar signalling pathway independent of HXK. (b) qRT-PCR of HXK-2 expression in response to sugars (glucose and sucrose) in wild-type and gin1, gin6 and gin2 mutants confirms that HXK-2 acts in the HXK sugar signalling pathway.
of this signalling pathway have yet to be elucidated. It may be that AtMYB61 is a
component of this pathway. One might be able to capitalise on this information to
uncover additional components of the uncharacterised AtMYB61-related sugar-
signalling pathway.
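The weak negative correlation reported above between the AtMYB61 and AtHXK1/GIN2 transcript abundance profiles can be illustrated with a minimal Python sketch. It assumes that the reported value is a Pearson-type correlation coefficient, which is not stated explicitly in the text. The function name pearson_r and the profile values are hypothetical and are not taken from the AtGenExpress dataset.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations from the mean (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (spread_x * spread_y)


# Hypothetical transcript abundance values across six tissues
# (illustrative only, not measured data).
myb61_profile = [120.0, 340.0, 95.0, 410.0, 60.0, 220.0]
hxk1_profile = [310.0, 180.0, 290.0, 260.0, 330.0, 150.0]

r = pearson_r(myb61_profile, hxk1_profile)
print(f"r = {r:.3f}")
```

A coefficient close to zero indicates little linear relationship between two profiles, whereas values approaching -1 indicate strongly inverse profiles.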
4.4.3 AtMYB61 Expression is Sugar Derepressed, Involving an Intragenic
Sequence within the 5′ Coding Region Containing Two Introns
AtMYB61 gene regulatory sequences comprising the 5′ intragenic region (61PN::GFP)
were sufficient to drive the expression of GFP in the xylem of seedlings grown in the
presence of sucrose but not in the absence of sucrose (Fig. 4.4ab). In contrast,
AtMYB61 gene regulatory sequences without the 5′ intragenic region (61P::GFP)
constitutively expressed GFP in the seedlings grown in the presence and absence of
sucrose. The most parsimonious hypothesis for this finding is that AtMYB61 expression
is de-repressed by soluble sugars in a mechanism involving intragenic sequences.
Sequence comparison of Brassicaceae AtMYB61 homologues (Arabidopsis thaliana,
Arabidopsis lyrata, Capsella rubella, Brassica rapa, and Thellungiella halophila)
uncovered a highly conserved motif (CTCTGTTTT) in intron-two, repeated 4 times (Fig.
4.5; Fig. S4.1). The repeats within the second introns of AtMYB61 homologues occur 4
times - 3 times in the sense direction and once in the antisense direction. Scanning the
Arabidopsis thaliana genome for this repeat, with an occurrence cutoff of 3 times within
500bp, identified 83 genes and 15 intergenic regions (Table S4.1, S4.2). Of the 98
instances, 45 of these occurrences were in introns (Table S4.3). That is, when this
motif is repeated 3 or more times within a 500bp region of the Arabidopsis thaliana
genome, 45.9% of these occurrences are within introns (Table S4.3). Notably, introns
comprise only 15.6% of the Arabidopsis thaliana genome (Kaul et al., 2000). Of the 45
occurrences of this repeat within Arabidopsis thaliana introns, 21 are within sugar-
responsive genes (Table S4.3). Neither Arabidopsis thaliana splicing prediction tools
(http://cbs.dtu.dk/services/NetPGene; Hebsgaard et al., 1996) nor miRNA and siRNA
prediction tools (http://www.athamap.de/; Steffens et al., 2004, 2005; Galuschka et al.,
2007; Bulow et al., 2009; Bulow et al., 2010) suggested that the motif is likely to
be a splice site or a miRNA or siRNA binding site.
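The motif scan described above (three or more copies of CTCTGTTTT, on either strand, within 500bp) can be sketched in Python as follows. This is a minimal illustration rather than the procedure actually used for the genome-wide scan, which is not described in detail here. The function names and the toy sequence are hypothetical; only the motif, the cutoff of 3 occurrences, the 500bp window and the counts of 45, 98 and 15.6% are taken from the text.

```python
MOTIF = "CTCTGTTTT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif_hits(seq, motif=MOTIF):
    """Return sorted (position, strand) pairs for the motif on both strands."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


def has_cluster(hits, min_hits=3, window=500, motif_len=len(MOTIF)):
    """True if at least min_hits motif copies fit entirely within one window."""
    positions = [position for position, _ in hits]
    for i in range(len(positions) - min_hits + 1):
        span = positions[i + min_hits - 1] + motif_len - positions[i]
        if span <= window:
            return True
    return False


# Hypothetical toy sequence (not the AtMYB61 intron): three sense copies
# and one antisense copy of the motif separated by arbitrary spacers.
toy = ("ACGT" + MOTIF + "GGAGGAGGAGGAGGA" + MOTIF + "TCG"
       + reverse_complement(MOTIF) + "CC" + MOTIF)
hits = find_motif_hits(toy)
print(hits, has_cluster(hits))

# Arithmetic from the text: 45 of 98 clustered occurrences lie in introns,
# compared with introns making up 15.6% of the genome.
intron_share = 100 * 45 / 98
fold_over_genome = intron_share / 15.6
print(f"{intron_share:.1f}% in introns, {fold_over_genome:.1f}-fold over genome share")
```

On these numbers, clustered repeats occur in introns approximately 2.9 times more often than expected from the intron share of the genome alone; no statistical test is implied by this sketch.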
Figure 4.4. Analysis of AtMYB61 promoter-reporter fusion constructs that contain or do not contain AtMYB61 5′ intragenic sequences in response to sucrose. (a) Schematic representation of the constructs used to drive the expression of GUS (uidA) and GFP (GFP). 61P corresponds to the promoter of AtMYB61, and 61PN to the promoter of AtMYB61 plus the portion of the coding sequence that encodes the N-terminus of the protein, which includes the two introns (E: exon; I: intron; NosT: nopaline synthase terminator sequence). (b) Expression of 61P and 61PN constructs within developing xylem of 7 day-old dark-grown seedlings in response to 30 mM sucrose supports the sugar derepression model.
Figure 4.5. Phylogenetic footprinting identifies a conserved repeat motif in the second intron of AtMYB61 Brassicaceae homologues. Sequence logo of AtMYB61 second intron (green highlight) flanked by exon 2 and exon 3 (red highlight). An over-represented conserved motif is present within AtMYB61 second intron that repeats itself three times in a sense direction and once in an antisense direction. Brassicaceae AtMYB61 homologues include: Arabidopsis thaliana gene At1g09540; Arabidopsis lyrata gene 919710; Capsella rubella gene Carubv10009497m.g; Brassica rapa gene Bra020016; and Thellungiella halophila gene Thhalv10008000m.g.
To further assess the putative functional role of the conserved over-represented motif
within the AtMYB61 second intron, AtMYB61 was aligned with AtMYB50 (At1g57560), its
most closely related R2R3-MYB family member (Fig. S4.2)(Stracke et al., 2001). Direct
nucleotide sequence comparison between the two genes shows that, although AtMYB50
contains two introns, and although the introns of both genes share significant sequence
similarity, neither AtMYB50 intron contains the AtMYB61 second intron repeat.
Notably, AtMYB50 is not sugar induced (data not shown). Taken together with the data
above, the findings support the hypothesis that the repeat sequences found in the
second AtMYB61 intron might function as a gene regulatory sequence to mediate
sugar-responsive gene regulation. Moreover, if they do function in this manner, they
might serve as binding sites for a repressor that binds to the sequences in the absence
of sugar and is released when sugar is present.
To determine if the repeats in the second intron of AtMYB61 could function as targets
for binding by a hypothetical sugar-mediated repressor, EMSAs were undertaken.
EMSAs used radioactively labeled second-intron repeats, and nuclear extracts from
plants that were grown in either the presence or absence of sucrose in the dark. The
second intron repeat motif was bound to a greater extent by proteins in nuclear
extracts obtained from seedlings grown in the absence of sucrose in the dark, relative to
those from seedlings grown in the presence of sucrose in the dark (Fig. 4.6). This
interaction was specific as determined by a competition assay using either unlabelled
second intron repeat or poly(dI-dC) as a competitor (Fig. S4.3). Taken together, these
findings are consistent with a nuclear-localised repressor protein binding to the second
intron repeat in seedlings grown in the absence of sugar.
Recently, intragenic regulatory elements have been identified that can function as either
repressors, enhancers or promoters of gene transcription (Dooley et al., 1996; Busch et
al., 1999; Deyholos and Sieburth, 2000; Kapranov et al., 2001; Fiume et al., 2004;
Wang et al., 2004; Fu et al., 2005; Osnato et al., 2010). In barley, a tandem duplication
of 305bp in intron IV is responsible for the dominant Hooded phenotype, which leads to
ectopic overexpression of Knox3 at the distal end of the lemma and the
development of an extra flower in place of an awn present in wild-type spikelets (Muller
et al., 1995). In transgenic Nicotiana tabacum lines, the 305bp element can drive
Figure 4.6. EMSA shows the AtMYB61 second intron motif bound differentially by proteins in nuclear extracts from seedlings grown in the absence or presence of sucrose in the dark, consistent with the derepression model. EMSA of nuclear extracts from wild-type Columbia seedlings grown in the dark for 7 days with or without 30 mM sucrose on the second intron repeat shows differential binding. Competition with unlabelled (cold) AtMYB61 second intron repeat probe shows that this interaction is specific. Arrows indicate gel shifts of the probe.
reporter gene expression within the flower base, in contrast to the Knox3 promoter,
whose activity is restricted to the SAM (Santi et al., 2003). The 305bp intron element
acts as a floral-specific regulatory element. A one-hybrid screen identified three
proteins that bound the 305bp intron element (Osnato et al., 2010). The proteins were
Barley Ethylene Response Factor1 (BERF1), Barley Ethylene Insensitive Like1 (BEIL1),
and Barley Growth Regulating Factor1 (BGRF1). Both BERF1 and BEIL1 are ethylene
signalling proteins that act at the interface between ethylene sensing and gene
regulation. In rice protoplasts, BEIL1 activated a reporter gene driven by the 305-bp
intron element. In contrast, BERF1 counteracted this activation, acting as a repressor
at this site. All in all, BEIL1 and BERF1 mediate fine-tuning of Knox expression by
ethylene through binding to the 305-bp intron element, providing cross-talk between the
KNOX and ethylene pathways.
The gene encoding the Zea mays starch-branching enzyme I (SBEI) also has an intron-
derived transcriptional regulatory sequence. Importantly, this functions in sugar-
mediated gene regulation. In transient gene expression analysis, inclusion of the SBEI
first intron increased transcript abundance 14-fold relative to gene constructs that did
not contain this intron (Kim and Guiltinan, 1999). Two cis elements were found within a
60bp region that bound nuclear proteins prepared from maize kernels in a sucrose-
dependent manner. In another study, the first intron of cotton sucrose synthase 3
(Sus3), a regulator of cotton fiber development, was analysed (Ruan et al., 2009). The
first intron of Sus3 is a negative regulator of gene expression and represses expression
of its transcripts in pollen. A Pyrimidine-box (CCTTTTG) was identified in the first intron
of Sus3. This motif was also present in the promoter of the RICE ALPHA-AMYLASE
(RAmy1a) gene (Morita et al., 1998) and the barley alpha amylase (Amy2/32b) gene
(Mena et al., 2002). In barley, the DOF (DNA binding with one Finger) transcription factor
PBF (Pyrimidine-box Binding Factor) is induced by gibberellin to recognise this
Pyrimidine-box motif. This suggests that the motif within the Sus3 first intron is
recognised by hormone-inducible transcription factors that then regulate gene expression.
Expression of animal c-MYB has also been shown to be regulated by intragenic
sequences (Dooley et al., 1996). The first intron of human and mouse c-Myb share
74% sequence conservation. These sites contain conserved GC-rich motifs, serving as
binding sites to a 70kDa activator protein and a 20kDa repressor protein (with a c-Jun
domain). Both c-MYB intron binding proteins regulate cell cycle progression (Dooley et
al., 1996). In concert with this, c-MYB has previously been shown to regulate the
proliferation and differentiation of hematopoietic cells (Mucenski et al., 1991), strongly
suggesting that the intragenic regulation of this gene is crucial for its proper function.
Studies on the intragenic regulation of c-MYB paved the way for plant MYB studies.
In plants, the Arabidopsis thaliana R2R3-MYB gene GLABRA1 (GL1) is a central
regulator of trichome development (Oppenheimer et al., 1991). Trichome formation by
GL1 and GL1-like MYB genes (AtWER and GaMYB2) was regulated by their 5′
intragenic regions (Wang et al., 2004). Both AtWER and GaMYB2 contain a conserved
MYB binding site (CA/CGTTA) within their first intron that is suggested to be a binding
site for R2R3-MYB repressors and activators that act upstream of GL1 (Wang et al.,
2004). The findings presented here suggest that AtMYB61 may also be regulated
through its intron sequences. Determination of the proteins that may bring about such
regulation would enable a more stringent test of this hypothesis.
4.4.4 Affinity Purification Coupled with Mass Spectrometry Uncovers a
Suite of Putative AtMYB61 Repressor Proteins that Bind the
Conserved Second Intron Motif in a Sucrose-Dependent Manner
Affinity purification was used to identify putative repressor proteins that bound to the
second AtMYB61 intron. The second AtMYB61 intron and the second intron repeat
were both end-labelled with biotin. Biotinylation was confirmed using the
Chemiluminescent Biofisher Biotin Detection Kit (Nepean, Ontario, Canada; Fig. S4.4).
Nuclear proteins were affinity purified from 7 day-old dark grown wild-type Columbia
seedlings that had been grown either in the absence or presence of sucrose.
Streptavidin beads were used to immobilise the biotinylated second intron and second
intron repeat. The streptavidin-biotin complexes were then used to affinity purify
proteins that bound to the intron sequences generally, and the second intron repeat
specifically (Fig. 4.7). Affinity purified proteins were then characterised using liquid
chromatography coupled with tandem mass spectrometry (LC-MS/MS), and the proteins
identified based on their MS fingerprints (Table 4.1)(Kislinger et al., 2003; Hewel et al.,
Figure 4.7. Affinity purification coupled with LC-MS/MS determines putative AtMYB61 repressor proteins that bound AtMYB61 second intron repeat. Nuclear proteins were purified from 7 day-old dark grown wild-type Columbia seedlings grown in the absence or presence of sucrose. The nuclear proteins were exposed to AtMYB61 second intron or second intron repeat sequences. The silver stained gel of proteins eluted from the streptavidin-biotin pull-down assay displays certain proteins binding with greater affinity in the no sucrose condition compared to the 30mM sucrose condition consistent with the derepression model.
Table 4.1. List of putative repressors of AtMYB61 expression (RMX) that bound AtMYB61 second intron repeat
List of putative RMX proteins that bound the AtMYB61 second intron repeat, with corresponding Arabidopsis thaliana gene identifications (AGIs), mutant labels, SALK lines and protein annotations. AtMYB61 transcript abundance was misexpressed in a subset of rmx loss-of-function mutants in response to sucrose, as determined by qRT-PCR and semi-quantitative RT-PCR. The confidence of each protein identification was calculated with the StatQuest program (Kislinger et al., 2003), and each protein identified had a confidence level of greater than 50 percent (Hewel et al., 2010).
AGI        Mutant Label  AtMYB61 Misexpression to Sucrose  SALK Line     Protein Annotation
At4g04940  -             No                                SALK_112391C  Putative WD-repeat membrane protein
At1g06840  -             No                                SALK_134409C  Leucine-rich transmembrane kinase
At4g16830  -             No                                SALK_143514C  Putative nuclear antigen homolog protein
At3g45810  rmx1          Yes                               SALK_050658   Respiratory burst oxidase-like protein
At5g35700  rmx2          Yes                               SALK_082219C  Fimbrin FIMBRIN-LIKE PROTEIN 2
At2g43970  rmx3          Yes                               SALK_046986   La and winged repressor domain protein
At1g10170  -             No                                SALK_129409C  Homologue of human repressor NF-X1
At3g52100  -             No                                SALK_047892C  PHD finger family protein
At3g22980  -             No                                SALK_150941C  Elongation factor EF-2
At5g11700  -             No                                SALK_147133   Glycine rich protein on chromosome 5
At2g24650  rmx4          Yes                               SALK_109533C  DNA binding / transcription factor
At1g50680  rmx5          Yes                               SALK_047550C  RAV-like DNA-binding protein
At5g22760  -             No                                SALK_125978   PHD finger family protein
At1g07650  -             No                                SALK_009225C  Leucine-rich transmembrane protein kinase
At4g02430  rmx6          Yes                               SALK_032344C  Putative SR1 Protein
At4g24710  -             No                                SALK_031449C  Putative nucleotide binding protein
At5g55670  -             No                                SALK_036546C  RNA recognition motif-containing protein
At1g34460  -             No                                SALK_100844C  B1 cyclin cyclin-dependent protein kinase
2010). Consistent with a role in binding the second intron repeat of AtMYB61, the
proteins identified were all nuclear proteins and were mainly nucleic acid binding
proteins or proteins involved in DNA-binding complexes. None of the identified AtMYB61
second intron binding proteins has been biochemically characterised to date (Table
4.1).
4.4.5 A Subset of Putative AtMYB61 Repressor Genes Are Sugar
Sensitive
In order to determine whether the putative repressor proteins played a role in the
regulation of AtMYB61 expression, a genetic loss-of-function approach was taken.
Loss-of-function Arabidopsis thaliana mutants with T-DNA insertions in exons
corresponding to affinity purified proteins were ordered from the Arabidopsis Biological
Resource Centre and verified. These were then tested as putative repressors of
AtMYB61 expression (rmx) mutants. Sucrose-dependent AtMYB61 expression was
examined in putative rmx mutants using semi-quantitative PCR and quantitative real-
time PCR (Fig. S4.5; Fig. 4.8). In rmx mutants, it is hypothesised that AtMYB61
transcript abundance should be elevated, specifically in seedlings that had been grown
in the dark in the absence of sucrose. Of the 18 proteins for which putative rmx mutants
could be obtained, six proteins had rmx mutants that showed the predicted transcript
abundance profile, with elevated AtMYB61 transcripts in seedlings that had been grown
in the absence of sucrose (Fig. S4.5; Fig. 4.8; Table 4.1). While these six proteins have
yet to be characterised biochemically, four have been annotated as putative DNA-
binding proteins (Table 4.1). Notably, in certain rmx backgrounds (rmx1, rmx3, rmx4,
and rmx5), AtMYB61 transcript abundance is higher in the absence of sucrose
compared to the presence of sucrose (Fig. 4.8). Moreover, in the presence of sucrose,
AtMYB61 transcript abundance is, in general, lower in rmx backgrounds compared to
wild-type (Fig. 4.8).
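Relative transcript abundance from qRT-PCR data of this kind is commonly expressed as a fold change relative to a reference gene and a calibrator sample. The following minimal Python sketch illustrates the widely used 2^-ΔΔCt calculation. It is an assumption that this method applies here, since the quantification method is not stated in this section; the sketch also assumes 100% amplification efficiency. The function name and the Ct values are hypothetical and are not measured data.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Fold change of a target transcript by the 2^-ddCt method.

    ct_target / ct_reference: Ct values in the sample of interest
    (for example AtMYB61 and a reference gene in a mutant).
    ct_*_calibrator: the same two Ct values in the calibrator sample
    (for example wild-type grown under the same conditions).
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    # Assumes the template doubles every cycle (100% efficiency).
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values (illustrative only): the target crosses the
# threshold two cycles earlier in the mutant than in the calibrator,
# while the reference gene is unchanged.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(f"fold change relative to calibrator: {fold:.2f}")
```

Under this convention, a value above 1 in a mutant grown without sucrose would correspond to the elevated AtMYB61 transcript abundance predicted for an rmx mutant.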
Figure 4.8. qRT-PCR of putative repressor of AtMYB61 expression (rmx) loss-of-function mutants that showed AtMYB61 misexpression in seedlings grown in the absence of sucrose in the dark, consistent with the repressor hypothesis. RNA was purified and cDNA prepared from rmx mutants, and analysis of AtMYB61 gene expression, via AtMYB61 specific primers, validated a subset of rmx mutants that had higher AtMYB61 expression when grown in the absence of sucrose. These 6 positive rmx mutants were identified from a screen of 18 putative rmx mutants recovered from the streptavidin-biotin pull-down assay. Wild-type Columbia, loss-of-function atmyb61 mutants, and 35S::MYB61 overexpressors acted as controls for AtMYB61 expression in the quantitative PCR assay. ACTIN-11 was used as the reference gene for the qRT-PCR.
4.4.6 rmx Loss-of-Function Mutant Phenocopies Constitutive AtMYB61
Overexpression
To determine the phenotypic effect of rmx mutants, plants were grown for 8 weeks.
Mutants were grown simultaneously with AtMYB61 overexpressors (35S::MYB61), loss-
of-function mutants (atmyb61) and wild-type plants (Fig. 4.9). If the rmx mutants were
impaired in making protein that repressed AtMYB61 expression, then they should have
features of plants that constitutively overexpress AtMYB61. Constitutive AtMYB61
overexpressors developed quickly, bolted and flowered earlier, and senesced earlier
than wild-type plants (Romano et al., 2012). In contrast, atmyb61 plants developed
more slowly, bolted and flowered later, and senesced later relative to wild-type plants.
Of the rmx mutants, one (rmx3) phenocopied AtMYB61 overexpressors (Fig. 4.9). This
mutant bolted and flowered early, and senesced earlier than wild-type plants. The gene
corresponding to this mutant (At2g43970) is annotated as a La domain-containing
protein that functions in nucleic acid binding. This protein has a winged helix repressor
DNA-binding domain; however, no studies have biochemically characterised this protein's
activity to date. Both molecular and phenotypic characterisations reported here suggest
that this protein represses AtMYB61 transcription.
Transcript abundance data across development support the hypothesis that At2g43970
is a negative regulator of AtMYB61 (Fig. S4.6; http://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi;
Schmid et al., 2005). Consistent with being a repressor, At2g43970
transcript abundance is inversely correlated with that of AtMYB61. That is, when the
transcript of At2g43970 is abundant within a tissue at a developmental time point,
AtMYB61 transcript abundance is lower, and vice versa. Combined, the data presented
here support the hypothesis that At2g43970 might be a direct repressor of AtMYB61,
functioning to regulate the expression of this gene in a sugar-dependent manner.
Figure 4.9. Phenotypes of Arabidopsis thaliana wild-type (WT) plants, AtMYB61 loss-of-function mutants (atmyb61), AtMYB61 overexpressors (35S::MYB61) and At2g43970 loss-of-function mutants (rmx3). (a) Plants grown on soil for 21 d at WT growth stage 1.12. (b) Plants grown on soil for 28 d at WT growth stage 5.90. (c) Graph displaying leaf senescence of plants grown for 8 weeks at WT growth stage 8.00 (1, fully yellow leaves; 5, fully green leaves). The leaf senescence assay was conducted on the basis of published standards (Romano et al., 2012). *Significantly different from WT (P < 0.05). Experiments were conducted on >10 plants per genotype per experiment. Plants were grown in individual pots, and were randomized in flats to discourage position-dependent effects. All rosette leaves were harvested. Measurement line represents 1 cm. Growth stages were assigned on the basis of published standards (Boyes et al., 2001).
4.5 Conclusion
The data presented herein provide evidence that AtMYB61, an R2R3-MYB
transcription factor, functions at the interface of sugar perception and sugar response.
Although in plants, gene specific transcriptional regulation is generally effected by the
binding of regulatory proteins to 5′ non-coding regions (Schwechheimer and Bevan,
1998; Lee and Young, 2000), the findings presented in this study support the hypothesis
that AtMYB61 makes use of intragenic, non-coding sequences as cis-acting binding
sites for a sugar mediated repressor protein to regulate its gene expression in a sugar
dependent manner. AtMYB61 was regulated by metabolisable sugars, particularly
sucrose, in a sugar-signalling pathway that does not appear to directly involve
hexokinase (Jang et al., 1997; Gibson, 2000; Smeekens, 2000). AtMYB61 expression
was de-repressed by sucrose in a mechanism involving intragenic sequences, as
determined using promoter-reporter fusion constructs, hinting at a sugar-mediated
repression mechanism (Rolland et al., 2006). An over-represented motif was conserved
within the second intron of Brassicaceae AtMYB61 homologues and this motif
functioned as a binding target for a putative sugar-mediated repressor, as determined
by EMSA. Putative repressor proteins (RMX) that bound the AtMYB61 second intron motif
in seedlings grown in the absence of sucrose were affinity purified and characterised
using LC-MS/MS, and the proteins identified based on their MS fingerprints. These
proteins were all nuclear proteins, were mainly DNA-binding proteins or proteins
involved in DNA-binding complexes, and have not been characterised to date. In rmx
mutants, it was hypothesised that AtMYB61 transcript abundance should be elevated in
seedlings that have been grown in the dark in the absence of sucrose. Six rmx mutants
showed this predicted transcript profile. Only one rmx mutant, corresponding to the gene
At2g43970, could phenocopy transgenic plants overexpressing AtMYB61 (Romano et
al., 2012); this result supports the hypothesis that this gene encodes a repressor protein
that modulates AtMYB61 gene expression in vivo. The At2g43970 gene encodes a La
domain-containing protein that contains a winged helix repressor DNA-binding domain
and has not been characterised in Arabidopsis thaliana to date (Schwartz et al., 1999).
Moreover, AtMYB61 and At2g43970 had inversely correlated transcript abundance across
development, supporting the hypothesis that At2g43970 encodes a protein that
represses AtMYB61. Taken together, these findings uncover a novel protein activity that
binds a conserved repeat motif within the AtMYB61 second intron and that is suggested to
regulate sugar-mediated gene expression of AtMYB61 and other genes that contain this repeat,
acting independently of the HXK sugar signalling pathway.
4.6 Acknowledgements
The authors are grateful to Christine Surman (University of Oxford) for technical
assistance; John Baker (University of Oxford) for assistance with photography; Ho-
Young Koo for assistance with nuclear protein extractions; Hilda Doan for
assistance with plant phenotype analyses; and the Nottingham Arabidopsis Stock Centre
(NASC) and Arabidopsis Biological Resource Center for provision of seeds. This
research was generously supported by a Natural Sciences and Engineering Research
Council of Canada (NSERC) Canadian Graduate Scholarship (CGSD) awarded to
M.B.P., and by competitive grant funding from the UK Biotechnology and Biological
Sciences Research Council (BBSRC), the Canada Foundation for Innovation, and the
Natural Sciences and Engineering Research Council of Canada (NSERC) to M.M.C.
4.7 Supplemental Figures and Tables
Figure S4.1. Sequence alignment of the second intron of Brassicaceae AtMYB61 homologues. Sequence comparison of Brassicaceae AtMYB61 homologues (Arabidopsis thaliana gene At1g09540; Arabidopsis lyrata gene 919710; Capsella rubella gene Carubv10009497m.g; Brassica rapa gene Bra020016; and Thellungiella halophila gene Thhalv10008000m.g) uncovers a highly conserved motif (16-21 million years ago) in intron-2. Yellow boxes indicate second intron repeat in sense direction. Purple boxes indicate second intron repeat in antisense direction. * indicates positions which have a single, fully conserved residue. : indicates conservation between groups of strongly similar properties. . indicates conservation between groups of weakly similar properties.
Figure S4.2. Sequence alignment of AtMYB61 and AtMYB50 reveals no second intron repeat within the second intron of AtMYB50. Sequence alignment was conducted on intron 2 of AtMYB61 and AtMYB50. AtMYB50 is the R2R3-MYB family member most closely related to AtMYB61. AtMYB50 is not sugar responsive and does not contain the second intron repeat. Yellow boxes indicate the second intron motif in the sense direction (5' - CTCTGTTTT - 3'). Purple boxes indicate the second intron motif in the antisense direction (5' - AAAACAGAG - 3').
Figure S4.3. EMSA shows that the AtMYB61 second intron motif is bound differentially by proteins in nuclear extracts from seedlings grown in the absence or presence of sucrose in the dark, consistent with the derepression model. EMSA of the second intron repeat with nuclear proteins from wild-type Columbia seedlings grown for 7 days in the dark with or without 30 mM sucrose shows differential binding. The nonspecific competitor poly(dIdC) did not outcompete this specific interaction. Arrows indicate gel shifts of the probe.
Figure S4.4. Validation of biotinylation of the AtMYB61 second intron and second intron repeat. The biotinylation of the second intron and the second intron repeat of AtMYB61 was confirmed using the Chemiluminescent Biofisher Detection Biotin Kit. Signal was detected only for the biotinylated AtMYB61 intron 2 and second intron repeat.
Figure S4.5. Semi-quantitative PCR of AtMYB61 expression in repressors of AtMYB61 expression (rmx) loss-of-function mutant seedlings grown in the absence or presence of sucrose in the dark. Semi-quantitative PCR validated a subset of rmx mutants that had higher AtMYB61 expression when grown in the absence of sucrose in the dark. These 6 positive rmx mutants were identified from a screen of 18 putative rmx mutants. Wild-type (WT) plants, loss-of-function atmyb61 mutants, and AtMYB61 overexpressor lines (35S::MYB61) provided controls for AtMYB61 expression in response to sucrose. ACTIN-11 (ACT-11) was used as a reference gene and loading control for the assay. Twenty-five PCR cycles were used in this experiment.
Figure S4.6. At2g43970 and At1g09540 have inverse transcript abundance profiles across development. The eFP Browser shows that (a) At2g43970 and (b) At1g09540 (AtMYB61) have inverse transcript abundance profiles across development in different organs. Together with other data presented in this study, this suggests that At2g43970 is a repressor of AtMYB61.
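One way to put a number on "inverse" profiles is the Pearson correlation between the transcript abundances of two genes across the same samples; a strongly negative coefficient would be consistent with the relationship described in this caption. The sketch below is illustrative only: the abundance values are invented placeholders rather than eFP Browser data, and the function name is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Placeholder abundances for the same five samples (NOT real data).
putative_repressor = [10.0, 8.0, 6.0, 4.0, 2.0]
target_gene = [1.0, 2.0, 3.0, 4.0, 5.0]

r = pearson(putative_repressor, target_gene)
print(f"r = {r:.2f}")  # perfectly inverse placeholder profiles give r = -1.00
```

A negative coefficient alone does not establish repression; it only summarises the co-variation that the figure shows.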
Table S4.1. AtMYB61 second intron repeat motif identified within all Arabidopsis thaliana genes. Note that a cutoff of at least 3 motifs occurring at least 500 bp apart was set; under this cutoff, 83 genes contain this repeat. Highlighted regions indicate unique genes.
AGI # of hits Position (start end) Orientation (matched sequence)
AT5G46240.1 8 246 238 AAAACAGAG
AT5G46240.1 8 370 378 CTCTGTTTT
AT5G46240.1 8 384 392 CTCTGTTTT
AT5G46240.1 8 436 444 CTCTGTTTT
AT5G46240.1 8 472 480 CTCTGTTTT
AT5G46240.1 8 1130 1122 AAAACAGAG
AT5G46240.1 8 1845 1837 AAAACAGAG
AT5G46240.1 8 2594 2586 AAAACAGAG
AT1G67070.1 7 564 572 CTCTGTTTT
AT1G67070.1 7 576 584 CTCTGTTTT
AT1G67070.1 7 766 774 CTCTGTTTT
AT1G67070.1 7 779 787 CTCTGTTTT
AT1G67070.1 7 805 813 CTCTGTTTT
AT1G67070.1 7 951 959 CTCTGTTTT
AT1G67070.1 7 964 972 CTCTGTTTT
AT3G60130.1 6 1707 1715 CTCTGTTTT
AT3G60130.1 6 1724 1732 CTCTGTTTT
AT3G60130.1 6 1751 1759 CTCTGTTTT
AT3G60130.1 6 1778 1786 CTCTGTTTT
AT3G60130.1 6 1805 1813 CTCTGTTTT
AT3G60130.1 6 2597 2605 CTCTGTTTT
AT5G38970.1 6 75 67 AAAACAGAG
AT5G38970.1 6 95 87 AAAACAGAG
AT5G38970.1 6 107 99 AAAACAGAG
AT5G38970.1 6 721 729 CTCTGTTTT
AT5G38970.1 6 745 753 CTCTGTTTT
AT5G38970.1 6 849 857 CTCTGTTTT
AT5G45340.1 6 279 271 AAAACAGAG
AT5G45340.1 6 296 288 AAAACAGAG
AT5G45340.1 6 740 732 AAAACAGAG
AT5G45340.1 6 776 768 AAAACAGAG
AT5G45340.1 6 781 789 CTCTGTTTT
AT5G45340.1 6 804 812 CTCTGTTTT
AT5G53660.1 6 680 672 AAAACAGAG
AT5G53660.1 6 815 823 CTCTGTTTT
AT5G53660.1 6 842 850 CTCTGTTTT
AT5G53660.1 6 855 863 CTCTGTTTT
AT5G53660.1 6 900 892 AAAACAGAG
AT5G53660.1 6 1003 995 AAAACAGAG
AT1G30320.1 5 203 211 CTCTGTTTT
AT1G30320.1 5 726 734 CTCTGTTTT
AT1G30320.1 5 737 745 CTCTGTTTT
AT1G30320.1 5 834 842 CTCTGTTTT
AT1G30320.1 5 877 885 CTCTGTTTT
AT2G21560.1 5 56 64 CTCTGTTTT
AT2G21560.1 5 591 583 AAAACAGAG
AT2G21560.1 5 635 627 AAAACAGAG
AT2G21560.1 5 645 637 AAAACAGAG
AT2G21560.1 5 708 700 AAAACAGAG
AT4G02780.1 5 35 27 AAAACAGAG
AT4G02780.1 5 346 354 CTCTGTTTT
AT4G02780.1 5 476 484 CTCTGTTTT
AT4G02780.1 5 566 574 CTCTGTTTT
AT4G02780.1 5 718 726 CTCTGTTTT
AT5G37600.1 5 368 376 CTCTGTTTT
AT5G37600.1 5 556 564 CTCTGTTTT
AT5G37600.1 5 580 588 CTCTGTTTT
AT5G37600.1 5 592 600 CTCTGTTTT
AT5G37600.1 5 1688 1680 AAAACAGAG
AT1G09540.1 4 575 567 AAAACAGAG
AT1G09540.1 4 604 612 CTCTGTTTT
AT1G09540.1 4 621 629 CTCTGTTTT
AT1G09540.1 4 661 669 CTCTGTTTT
AT1G17830.1 4 2338 2346 CTCTGTTTT
AT1G17830.1 4 2351 2359 CTCTGTTTT
AT1G17830.1 4 2381 2389 CTCTGTTTT
AT1G17830.1 4 2394 2402 CTCTGTTTT
AT1G32700.1 4 198 206 CTCTGTTTT
AT1G32700.1 4 225 233 CTCTGTTTT
AT1G32700.1 4 373 381 CTCTGTTTT
AT1G32700.1 4 390 398 CTCTGTTTT
AT1G51950.1 4 451 459 CTCTGTTTT
AT1G51950.1 4 469 461 AAAACAGAG
AT1G51950.1 4 662 670 CTCTGTTTT
AT1G51950.1 4 833 841 CTCTGTTTT
AT1G61800.1 4 668 676 CTCTGTTTT
AT1G61800.1 4 729 737 CTCTGTTTT
AT1G61800.1 4 757 765 CTCTGTTTT
AT1G61800.1 4 787 795 CTCTGTTTT
AT1G69530.1 4 764 772 CTCTGTTTT
AT1G69530.1 4 787 779 AAAACAGAG
AT1G69530.1 4 826 834 CTCTGTTTT
AT1G69530.1 4 849 841 AAAACAGAG
AT1G70550.1 4 639 647 CTCTGTTTT
AT1G70550.1 4 650 658 CTCTGTTTT
AT1G70550.1 4 661 669 CTCTGTTTT
AT1G70550.1 4 710 718 CTCTGTTTT
AT1G72150.1 4 2 10 CTCTGTTTT
AT1G72150.1 4 476 468 AAAACAGAG
AT1G72150.1 4 491 483 AAAACAGAG
AT1G72150.1 4 521 513 AAAACAGAG
AT2G01540.1 4 499 507 CTCTGTTTT
AT2G01540.1 4 540 548 CTCTGTTTT
AT2G01540.1 4 580 588 CTCTGTTTT
AT2G01540.1 4 619 627 CTCTGTTTT
AT2G37440.1 4 244 252 CTCTGTTTT
AT2G37440.1 4 502 510 CTCTGTTTT
AT2G37440.1 4 660 668 CTCTGTTTT
AT2G37440.1 4 703 711 CTCTGTTTT
AT2G38120.1 4 354 362 CTCTGTTTT
AT2G38120.1 4 411 419 CTCTGTTTT
AT2G38120.1 4 425 433 CTCTGTTTT
AT2G38120.1 4 466 474 CTCTGTTTT
AT2G40320.1 4 515 507 AAAACAGAG
AT2G40320.1 4 707 715 CTCTGTTTT
AT2G40320.1 4 733 741 CTCTGTTTT
AT2G40320.1 4 759 767 CTCTGTTTT
AT3G03650.1 4 479 471 AAAACAGAG
AT3G03650.1 4 518 510 AAAACAGAG
AT3G03650.1 4 535 527 AAAACAGAG
AT3G03650.1 4 557 549 AAAACAGAG
AT3G16520.1 4 298 306 CTCTGTTTT
AT3G16520.1 4 691 683 AAAACAGAG
AT3G16520.1 4 715 723 CTCTGTTTT
AT3G16520.1 4 844 852 CTCTGTTTT
AT3G28180.1 4 932 940 CTCTGTTTT
AT3G28180.1 4 943 951 CTCTGTTTT
AT3G28180.1 4 975 983 CTCTGTTTT
AT3G28180.1 4 986 994 CTCTGTTTT
AT4G00430.1 4 463 471 CTCTGTTTT
AT4G00430.1 4 480 488 CTCTGTTTT
AT4G00430.1 4 505 513 CTCTGTTTT
AT4G00430.1 4 539 547 CTCTGTTTT
AT4G13710.1 4 488 496 CTCTGTTTT
AT4G13710.1 4 533 525 AAAACAGAG
AT4G13710.1 4 733 741 CTCTGTTTT
AT4G13710.1 4 778 770 AAAACAGAG
AT4G19230.1 4 746 754 CTCTGTTTT
AT4G19230.1 4 813 805 AAAACAGAG
AT4G19230.1 4 829 837 CTCTGTTTT
AT4G19230.1 4 1300 1292 AAAACAGAG
AT4G34990.1 4 361 369 CTCTGTTTT
AT4G34990.1 4 398 406 CTCTGTTTT
AT4G34990.1 4 412 420 CTCTGTTTT
AT4G34990.1 4 608 600 AAAACAGAG
AT5G02170.1 4 208 216 CTCTGTTTT
AT5G02170.1 4 362 370 CTCTGTTTT
AT5G02170.1 4 600 608 CTCTGTTTT
AT5G02170.1 4 1578 1586 CTCTGTTTT
AT5G40030.1 4 734 742 CTCTGTTTT
AT5G40030.1 4 846 854 CTCTGTTTT
AT5G40030.1 4 934 942 CTCTGTTTT
AT5G40030.1 4 1170 1178 CTCTGTTTT
AT5G61570.1 4 687 695 CTCTGTTTT
AT5G61570.1 4 699 707 CTCTGTTTT
AT5G61570.1 4 932 940 CTCTGTTTT
AT5G61570.1 4 1284 1276 AAAACAGAG
AT5G63850.1 4 839 847 CTCTGTTTT
AT5G63850.1 4 869 877 CTCTGTTTT
AT5G63850.1 4 890 898 CTCTGTTTT
AT5G63850.1 4 910 918 CTCTGTTTT
AT1G01590.1 3 1752 1760 CTCTGTTTT
AT1G01590.1 3 2134 2142 CTCTGTTTT
AT1G01590.1 3 2164 2172 CTCTGTTTT
AT1G04610.1 3 927 919 AAAACAGAG
AT1G04610.1 3 946 954 CTCTGTTTT
AT1G04610.1 3 965 957 AAAACAGAG
AT1G07340.1 3 632 640 CTCTGTTTT
AT1G07340.1 3 653 661 CTCTGTTTT
AT1G07340.1 3 675 683 CTCTGTTTT
AT1G10220.1 3 354 346 AAAACAGAG
AT1G10220.1 3 628 620 AAAACAGAG
AT1G10220.1 3 742 750 CTCTGTTTT
AT1G10750.1 3 676 684 CTCTGTTTT
AT1G10750.1 3 716 724 CTCTGTTTT
AT1G10750.1 3 745 753 CTCTGTTTT
AT1G16380.1 3 2527 2535 CTCTGTTTT
AT1G16380.1 3 2591 2599 CTCTGTTTT
AT1G16380.1 3 2757 2749 AAAACAGAG
AT1G19050.1 3 277 285 CTCTGTTTT
AT1G19050.1 3 303 311 CTCTGTTTT
AT1G19050.1 3 334 342 CTCTGTTTT
AT1G26770.1 3 756 748 AAAACAGAG
AT1G26770.1 3 778 786 CTCTGTTTT
AT1G26770.1 3 801 793 AAAACAGAG
AT1G64355.1 3 458 466 CTCTGTTTT
AT1G64355.1 3 497 505 CTCTGTTTT
AT1G64355.1 3 529 537 CTCTGTTTT
AT1G65150.1 3 1653 1661 CTCTGTTTT
AT1G65150.1 3 1756 1764 CTCTGTTTT
AT1G65150.1 3 1817 1825 CTCTGTTTT
AT1G65920.1 3 178 186 CTCTGTTTT
AT1G65920.1 3 205 213 CTCTGTTTT
AT1G65920.1 3 215 223 CTCTGTTTT
AT1G76360.1 3 180 172 AAAACAGAG
AT1G76360.1 3 191 183 AAAACAGAG
AT1G76360.1 3 515 523 CTCTGTTTT
AT1G77330.1 3 239 247 CTCTGTTTT
AT1G77330.1 3 349 341 AAAACAGAG
AT1G77330.1 3 593 585 AAAACAGAG
AT1G78440.1 3 223 231 CTCTGTTTT
AT1G78440.1 3 444 452 CTCTGTTTT
AT1G78440.1 3 463 471 CTCTGTTTT
AT2G03730.1 3 145 153 CTCTGTTTT
AT2G03730.1 3 397 405 CTCTGTTTT
AT2G03730.1 3 506 514 CTCTGTTTT
AT2G13840.1 3 552 560 CTCTGTTTT
AT2G13840.1 3 619 627 CTCTGTTTT
AT2G13840.1 3 687 695 CTCTGTTTT
AT2G23320.1 3 806 814 CTCTGTTTT
AT2G23320.1 3 834 826 AAAACAGAG
AT2G23320.1 3 1049 1041 AAAACAGAG
AT2G25460.1 3 41 33 AAAACAGAG
AT2G25460.1 3 67 59 AAAACAGAG
AT2G25460.1 3 91 83 AAAACAGAG
AT2G33230.1 3 747 739 AAAACAGAG
AT2G33230.1 3 785 777 AAAACAGAG
AT2G33230.1 3 801 809 CTCTGTTTT
AT2G38090.1 3 216 224 CTCTGTTTT
AT2G38090.1 3 520 528 CTCTGTTTT
AT2G38090.1 3 643 635 AAAACAGAG
AT2G39210.1 3 670 662 AAAACAGAG
AT2G39210.1 3 681 673 AAAACAGAG
AT2G39210.1 3 691 683 AAAACAGAG
AT3G03780.1 3 115 123 CTCTGTTTT
AT3G03780.1 3 146 154 CTCTGTTTT
AT3G03780.1 3 195 203 CTCTGTTTT
AT3G24600.1 3 2056 2064 CTCTGTTTT
AT3G24600.1 3 2068 2076 CTCTGTTTT
AT3G24600.1 3 2311 2319 CTCTGTTTT
AT3G46110.1 3 310 318 CTCTGTTTT
AT3G46110.1 3 325 333 CTCTGTTTT
AT3G46110.1 3 519 527 CTCTGTTTT
AT3G48360.1 3 554 562 CTCTGTTTT
AT3G48360.1 3 627 635 CTCTGTTTT
AT3G48360.1 3 976 984 CTCTGTTTT
AT3G61230.1 3 429 437 CTCTGTTTT
AT3G61230.1 3 443 451 CTCTGTTTT
AT3G61230.1 3 567 575 CTCTGTTTT
AT3G61750.1 3 106 114 CTCTGTTTT
AT3G61750.1 3 272 280 CTCTGTTTT
AT3G61750.1 3 300 292 AAAACAGAG
AT4G03210.1 3 281 289 CTCTGTTTT
AT4G03210.1 3 299 307 CTCTGTTTT
AT4G03210.1 3 313 321 CTCTGTTTT
AT4G09460.1 3 362 354 AAAACAGAG
AT4G09460.1 3 381 373 AAAACAGAG
AT4G09460.1 3 558 566 CTCTGTTTT
AT4G12080.1 3 741 749 CTCTGTTTT
AT4G12080.1 3 753 761 CTCTGTTTT
AT4G12080.1 3 912 920 CTCTGTTTT
AT4G22880.1 3 65 73 CTCTGTTTT
AT4G22880.1 3 105 113 CTCTGTTTT
AT4G22880.1 3 117 125 CTCTGTTTT
AT4G25420.1 3 423 415 AAAACAGAG
AT4G25420.1 3 618 610 AAAACAGAG
AT4G25420.1 3 681 689 CTCTGTTTT
AT4G28025.1 3 106 98 AAAACAGAG
AT4G28025.1 3 270 278 CTCTGTTTT
AT4G28025.1 3 419 427 CTCTGTTTT
AT4G35300.1 3 235 243 CTCTGTTTT
AT4G35300.1 3 332 340 CTCTGTTTT
AT4G35300.1 3 427 435 CTCTGTTTT
AT5G09220.1 3 1108 1116 CTCTGTTTT
AT5G09220.1 3 1122 1130 CTCTGTTTT
AT5G09220.1 3 1163 1171 CTCTGTTTT
AT5G09460.1 3 137 145 CTCTGTTTT
AT5G09460.1 3 198 206 CTCTGTTTT
AT5G09460.1 3 475 483 CTCTGTTTT
AT5G09461.1 3 137 145 CTCTGTTTT
AT5G09461.1 3 198 206 CTCTGTTTT
AT5G09461.1 3 475 483 CTCTGTTTT
AT5G09462.1 3 137 145 CTCTGTTTT
AT5G09462.1 3 198 206 CTCTGTTTT
AT5G09462.1 3 475 483 CTCTGTTTT
AT5G09463.1 3 137 145 CTCTGTTTT
AT5G09463.1 3 198 206 CTCTGTTTT
AT5G09463.1 3 475 483 CTCTGTTTT
AT5G12050.1 3 240 232 AAAACAGAG
AT5G12050.1 3 430 422 AAAACAGAG
AT5G12050.1 3 458 450 AAAACAGAG
AT5G14370.1 3 109 101 AAAACAGAG
AT5G14370.1 3 150 142 AAAACAGAG
AT5G14370.1 3 331 323 AAAACAGAG
AT5G26230.1 3 453 461 CTCTGTTTT
AT5G26230.1 3 632 624 AAAACAGAG
AT5G26230.1 3 651 643 AAAACAGAG
AT5G39785.1 3 387 379 AAAACAGAG
AT5G39785.1 3 399 407 CTCTGTTTT
AT5G39785.1 3 435 427 AAAACAGAG
AT5G39850.1 3 134 142 CTCTGTTTT
AT5G39850.1 3 269 277 CTCTGTTTT
AT5G39850.1 3 309 317 CTCTGTTTT
AT5G40460.1 3 139 147 CTCTGTTTT
AT5G40460.1 3 153 161 CTCTGTTTT
AT5G40460.1 3 175 183 CTCTGTTTT
AT5G41380.1 3 485 477 AAAACAGAG
AT5G41380.1 3 731 739 CTCTGTTTT
AT5G41380.1 3 748 756 CTCTGTTTT
AT5G49340.1 3 512 520 CTCTGTTTT
AT5G49340.1 3 682 674 AAAACAGAG
AT5G49340.1 3 707 699 AAAACAGAG
AT5G51670.1 3 525 517 AAAACAGAG
AT5G51670.1 3 536 528 AAAACAGAG
AT5G51670.1 3 549 541 AAAACAGAG
AT5G57350.1 3 2683 2691 CTCTGTTTT
AT5G57350.1 3 2695 2703 CTCTGTTTT
AT5G57350.1 3 2720 2728 CTCTGTTTT
AT5G58000.1 3 257 249 AAAACAGAG
AT5G58000.1 3 492 484 AAAACAGAG
AT5G58000.1 3 705 713 CTCTGTTTT
AT5G62140.1 3 289 281 AAAACAGAG
AT5G62140.1 3 472 480 CTCTGTTTT
AT5G62140.1 3 514 522 CTCTGTTTT
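A table of this kind could be produced by scanning each sequence for the nine-nucleotide repeat on both strands. The following is a minimal sketch under stated assumptions, not the search procedure used in this thesis: the function names are hypothetical, coordinates are 1-based, antisense hits are reported with the larger coordinate first to mirror the rows above, and only the hit-count part of the cutoff is checked (the spacing criterion is not implemented).

```python
MOTIF = "CTCTGTTTT"  # sense-direction repeat, as given in Figure S4.2


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(seq, motif=MOTIF):
    """Return (start, end, matched_sequence) for every hit on either strand."""
    seq = seq.upper()
    antisense = reverse_complement(motif)  # AAAACAGAG for the default motif
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i + 1, i + len(motif), motif))
        elif window == antisense:
            # Larger coordinate first, as in the antisense rows of the table.
            hits.append((i + len(motif), i + 1, antisense))
    return hits


def meets_hit_cutoff(hits, min_hits=3):
    """Count-only check; the spacing rule in the caption is not modelled."""
    return len(hits) >= min_hits


# Toy sequence (hypothetical) with one sense and one antisense occurrence.
toy = "GG" + "CTCTGTTTT" + "ACGT" + "AAAACAGAG" + "TT"
hits = scan(toy)
print(hits)  # [(3, 11, 'CTCTGTTTT'), (24, 16, 'AAAACAGAG')]
print(meets_hit_cutoff(hits))  # False: only two hits
```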
Table S4.2. AtMYB61 second intron repeat motif identified within all Arabidopsis thaliana intergenic regions. Note that a cutoff of at least 3 motifs occurring at least 500 bp apart was set; under this cutoff, 15 intergenic regions contain this repeat. Highlighted regions indicate unique intergenic regions.
AGI of intergenic region # of hits Position (start end) Orientation (matched sequence)
AT5G57480-AT5G57490 6 37 29 AAAACAGAG
AT5G57480-AT5G57490 6 51 43 AAAACAGAG
AT5G57480-AT5G57490 6 70 62 AAAACAGAG
AT5G57480-AT5G57490 6 81 73 AAAACAGAG
AT5G57480-AT5G57490 6 101 93 AAAACAGAG
AT5G57480-AT5G57490 6 121 113 AAAACAGAG
AT4G16880-AT4G16890 4 1840 1848 CTCTGTTTT
AT4G16880-AT4G16890 4 2114 2106 AAAACAGAG
AT4G16880-AT4G16890 4 2284 2276 AAAACAGAG
AT4G16880-AT4G16890 4 3170 3162 AAAACAGAG
AT5G16970-AT5G16980 4 350 342 AAAACAGAG
AT5G16970-AT5G16980 4 369 361 AAAACAGAG
AT5G16970-AT5G16980 4 743 735 AAAACAGAG
AT5G16970-AT5G16980 4 755 747 AAAACAGAG
AT1G71680-AT1G71690 3 60 52 AAAACAGAG
AT1G71680-AT1G71690 3 79 71 AAAACAGAG
AT1G71680-AT1G71690 3 92 84 AAAACAGAG
AT1G71950-AT1G71960 3 674 682 CTCTGTTTT
AT1G71950-AT1G71960 3 705 713 CTCTGTTTT
AT1G71950-AT1G71960 3 717 725 CTCTGTTTT
AT2G05360-AT2G05370 3 63 55 AAAACAGAG
AT2G05360-AT2G05370 3 291 283 AAAACAGAG
AT2G05360-AT2G05370 3 304 296 AAAACAGAG
AT3G16120-AT3G16130 3 583 591 CTCTGTTTT
AT3G16120-AT3G16130 3 666 658 AAAACAGAG
AT3G16120-AT3G16130 3 682 674 AAAACAGAG
AT3G27610-AT3G27620 3 813 805 AAAACAGAG
AT3G27610-AT3G27620 3 835 827 AAAACAGAG
AT3G27610-AT3G27620 3 845 837 AAAACAGAG
AT4G23200-AT4G23210 3 604 596 AAAACAGAG
AT4G23200-AT4G23210 3 620 612 AAAACAGAG
AT4G23200-AT4G23210 3 648 640 AAAACAGAG
AT4G34400-AT4G34410 3 447 439 AAAACAGAG
AT4G34400-AT4G34410 3 454 462 CTCTGTTTT
AT4G34400-AT4G34410 3 713 705 AAAACAGAG
AT4G37030-AT4G37040 3 202 210 CTCTGTTTT
AT4G37030-AT4G37040 3 243 251 CTCTGTTTT
AT4G37030-AT4G37040 3 259 267 CTCTGTTTT
AT5G12950-AT5G12960 3 543 551 CTCTGTTTT
AT5G12950-AT5G12960 3 564 572 CTCTGTTTT
AT5G12950-AT5G12960 3 586 594 CTCTGTTTT
AT5G29015-AT5G29020 3 1008 1016 CTCTGTTTT
AT5G29015-AT5G29020 3 1300 1308 CTCTGTTTT
AT5G29015-AT5G29020 3 1328 1336 CTCTGTTTT
AT5G57520-AT5G57530 3 3396 3404 CTCTGTTTT
AT5G57520-AT5G57530 3 3471 3479 CTCTGTTTT
AT5G57520-AT5G57530 3 3496 3504 CTCTGTTTT
AT5G57535-AT5G57540 3 169 177 CTCTGTTTT
AT5G57535-AT5G57540 3 181 189 CTCTGTTTT
AT5G57535-AT5G57540 3 202 210 CTCTGTTTT
Table S4.3. AtMYB61 second intron repeat motif identified within all Arabidopsis thaliana introns, and the corresponding transcript response to sugar. Note that a cutoff of at least 3 motifs occurring at least 500 bp apart was set; under this cutoff, 45 introns contain this repeat. 45.9% of the AtMYB61 second intron repeat motif occurrences are within introns. Of the 45 introns with these repeats, 21 are within sugar-responsive genes (46.7%). Highlighted regions indicate unique genes. The -n suffix on the AGI indicates which intron contains the motif. Sugar-responsive genes were identified from microarray data reported by Romano et al. (2012).
AGI # of hits Position (start end) Orientation (matched sequence) Sugar Responsive
AT1G67070.1-1 7 15 23 CTCTGTTTT NO
AT1G67070.1-1 7 27 35 CTCTGTTTT
AT1G67070.1-1 7 217 225 CTCTGTTTT
AT1G67070.1-1 7 230 238 CTCTGTTTT
AT1G67070.1-1 7 256 264 CTCTGTTTT
AT1G67070.1-1 7 402 410 CTCTGTTTT
AT1G67070.1-1 7 415 423 CTCTGTTTT
AT3G60130.1-7 5 168 176 CTCTGTTTT YES
AT3G60130.1-7 5 185 193 CTCTGTTTT
AT3G60130.1-7 5 212 220 CTCTGTTTT
AT3G60130.1-7 5 239 247 CTCTGTTTT
AT3G60130.1-7 5 266 274 CTCTGTTTT
AT1G09540.1-2 4 58 50 AAAACAGAG YES
AT1G09540.1-2 4 87 95 CTCTGTTTT
AT1G09540.1-2 4 104 112 CTCTGTTTT
AT1G09540.1-2 4 144 152 CTCTGTTTT
AT1G30320.1-2 4 76 84 CTCTGTTTT NO
AT1G30320.1-2 4 87 95 CTCTGTTTT
AT1G30320.1-2 4 184 192 CTCTGTTTT
AT1G30320.1-2 4 227 235 CTCTGTTTT
AT1G32700.1-1 4 40 48 CTCTGTTTT YES
AT1G32700.1-1 4 67 75 CTCTGTTTT
AT1G32700.1-1 4 215 223 CTCTGTTTT
AT1G32700.1-1 4 232 240 CTCTGTTTT
AT1G61800.1-1 4 89 97 CTCTGTTTT YES
AT1G61800.1-1 4 150 158 CTCTGTTTT
AT1G61800.1-1 4 178 186 CTCTGTTTT
AT1G61800.1-1 4 208 216 CTCTGTTTT
AT1G69530.1-2 4 25 33 CTCTGTTTT YES
AT1G69530.1-2 4 48 40 AAAACAGAG
AT1G69530.1-2 4 87 95 CTCTGTTTT
AT1G69530.1-2 4 110 102 AAAACAGAG
AT1G70550.1-1 4 85 93 CTCTGTTTT NO
AT1G70550.1-1 4 96 104 CTCTGTTTT
AT1G70550.1-1 4 107 115 CTCTGTTTT
AT1G70550.1-1 4 156 164 CTCTGTTTT
AT2G01540.1-1 4 125 133 CTCTGTTTT YES
AT2G01540.1-1 4 166 174 CTCTGTTTT
AT2G01540.1-1 4 206 214 CTCTGTTTT
AT2G01540.1-1 4 245 253 CTCTGTTTT
AT2G37440.1-1 4 137 145 CTCTGTTTT NO
AT2G37440.1-1 4 395 403 CTCTGTTTT
AT2G37440.1-1 4 553 561 CTCTGTTTT
AT2G37440.1-1 4 596 604 CTCTGTTTT
AT2G38120.1-1 4 82 90 CTCTGTTTT YES
AT2G38120.1-1 4 139 147 CTCTGTTTT
AT2G38120.1-1 4 153 161 CTCTGTTTT
AT2G38120.1-1 4 194 202 CTCTGTTTT
AT3G28180.1-1 4 15 23 CTCTGTTTT YES
AT3G28180.1-1 4 26 34 CTCTGTTTT
AT3G28180.1-1 4 58 66 CTCTGTTTT
AT3G28180.1-1 4 69 77 CTCTGTTTT
AT4G00430.1-1 4 36 44 CTCTGTTTT YES
AT4G00430.1-1 4 53 61 CTCTGTTTT
AT4G00430.1-1 4 78 86 CTCTGTTTT
AT4G00430.1-1 4 112 120 CTCTGTTTT
AT4G02780.1-1 4 202 210 CTCTGTTTT NO
AT4G02780.1-1 4 332 340 CTCTGTTTT
AT4G02780.1-1 4 422 430 CTCTGTTTT
AT4G02780.1-1 4 574 582 CTCTGTTTT
AT5G37600.1-1 4 109 117 CTCTGTTTT YES
AT5G37600.1-1 4 297 305 CTCTGTTTT
AT5G37600.1-1 4 321 329 CTCTGTTTT
AT5G37600.1-1 4 333 341 CTCTGTTTT
AT5G40030.1-1 4 84 92 CTCTGTTTT NO
AT5G40030.1-1 4 196 204 CTCTGTTTT
AT5G40030.1-1 4 284 292 CTCTGTTTT
AT5G40030.1-1 4 520 528 CTCTGTTTT
AT5G45340.1-2 4 33 25 AAAACAGAG YES
AT5G45340.1-2 4 69 61 AAAACAGAG
AT5G45340.1-2 4 74 82 CTCTGTTTT
AT5G45340.1-2 4 97 105 CTCTGTTTT
AT5G46240.1-1 4 22 30 CTCTGTTTT NO
AT5G46240.1-1 4 36 44 CTCTGTTTT
AT5G46240.1-1 4 88 96 CTCTGTTTT
AT5G46240.1-1 4 124 132 CTCTGTTTT
AT5G61570.1-1 4 7 15 CTCTGTTTT NO
AT5G61570.1-1 4 19 27 CTCTGTTTT
AT5G61570.1-1 4 252 260 CTCTGTTTT
AT5G61570.1-1 4 604 596 AAAACAGAG
AT5G63850.1-3 4 17 25 CTCTGTTTT NO
AT5G63850.1-3 4 47 55 CTCTGTTTT
AT5G63850.1-3 4 68 76 CTCTGTTTT
AT5G63850.1-3 4 88 96 CTCTGTTTT
AT1G04610.1-1 3 77 69 AAAACAGAG NO
AT1G04610.1-1 3 96 104 CTCTGTTTT
AT1G04610.1-1 3 115 107 AAAACAGAG
AT1G07340.1-2 3 46 54 CTCTGTTTT NO
AT1G07340.1-2 3 67 75 CTCTGTTTT
AT1G07340.1-2 3 89 97 CTCTGTTTT
AT1G10750.1-1 3 149 157 CTCTGTTTT NO
AT1G10750.1-1 3 189 197 CTCTGTTTT
AT1G10750.1-1 3 218 226 CTCTGTTTT
AT1G19050.1-1 3 29 37 CTCTGTTTT YES
AT1G19050.1-1 3 55 63 CTCTGTTTT
AT1G19050.1-1 3 86 94 CTCTGTTTT
AT1G26770.1-3 3 37 29 AAAACAGAG YES
AT1G26770.1-3 3 59 67 CTCTGTTTT
AT1G26770.1-3 3 82 74 AAAACAGAG
AT1G64355.1-1 3 8 16 CTCTGTTTT NO
AT1G64355.1-1 3 47 55 CTCTGTTTT
AT1G64355.1-1 3 79 87 CTCTGTTTT
AT1G65920.1-1 3 34 42 CTCTGTTTT NO
AT1G65920.1-1 3 61 69 CTCTGTTTT
AT1G65920.1-1 3 71 79 CTCTGTTTT
AT2G13840.1-1 3 198 206 CTCTGTTTT NO
AT2G13840.1-1 3 265 273 CTCTGTTTT
AT2G13840.1-1 3 333 341 CTCTGTTTT
AT2G33230.1-1 3 48 40 AAAACAGAG NO
AT2G33230.1-1 3 86 78 AAAACAGAG
AT2G33230.1-1 3 102 110 CTCTGTTTT
AT2G39210.1-1 3 67 59 AAAACAGAG NO
AT2G39210.1-1 3 78 70 AAAACAGAG
AT2G39210.1-1 3 88 80 AAAACAGAG
AT2G40320.1-2 3 19 27 CTCTGTTTT NO
AT2G40320.1-2 3 45 53 CTCTGTTTT
AT2G40320.1-2 3 71 79 CTCTGTTTT
AT3G03780.1-1 3 60 68 CTCTGTTTT NO
AT3G03780.1-1 3 91 99 CTCTGTTTT
AT3G03780.1-1 3 140 148 CTCTGTTTT
AT4G03210.1-1 3 39 47 CTCTGTTTT YES
AT4G03210.1-1 3 57 65 CTCTGTTTT
AT4G03210.1-1 3 71 79 CTCTGTTTT
AT4G09460.1-1 3 46 38 AAAACAGAG YES
AT4G09460.1-1 3 65 57 AAAACAGAG
AT4G09460.1-1 3 242 250 CTCTGTTTT
AT4G12080.1-1 3 19 27 CTCTGTTTT YES
AT4G12080.1-1 3 31 39 CTCTGTTTT
AT4G12080.1-1 3 190 198 CTCTGTTTT
AT4G19230.1-2 3 9 17 CTCTGTTTT YES
AT4G19230.1-2 3 76 68 AAAACAGAG
AT4G19230.1-2 3 92 100 CTCTGTTTT
AT4G22880.1-1 3 39 47 CTCTGTTTT NO
AT4G22880.1-1 3 79 87 CTCTGTTTT
AT4G22880.1-1 3 91 99 CTCTGTTTT
AT4G34990.1-1 3 20 28 CTCTGTTTT NO
AT4G34990.1-1 3 57 65 CTCTGTTTT
AT4G34990.1-1 3 71 79 CTCTGTTTT
AT4G35300.4-1 3 295 303 CTCTGTTTT NO
AT4G35300.4-1 3 392 400 CTCTGTTTT
AT4G35300.4-1 3 487 495 CTCTGTTTT
AT5G09220.1-3 3 11 19 CTCTGTTTT YES
AT5G09220.1-3 3 25 33 CTCTGTTTT
AT5G09220.1-3 3 66 74 CTCTGTTTT
AT5G38970.1-2 3 28 36 CTCTGTTTT NO
AT5G38970.1-2 3 52 60 CTCTGTTTT
AT5G38970.1-2 3 156 164 CTCTGTTTT
AT5G39850.1-1 3 75 83 CTCTGTTTT YES
AT5G39850.1-1 3 210 218 CTCTGTTTT
AT5G39850.1-1 3 250 258 CTCTGTTTT
AT5G51670.1-1 3 41 33 AAAACAGAG NO
AT5G51670.1-1 3 52 44 AAAACAGAG
AT5G51670.1-1 3 65 57 AAAACAGAG
AT5G53660.1-2 3 32 40 CTCTGTTTT YES
AT5G53660.1-2 3 59 67 CTCTGTTTT
AT5G53660.1-2 3 72 80 CTCTGTTTT
AT5G57350.1-5 3 14 22 CTCTGTTTT YES
AT5G57350.1-5 3 26 34 CTCTGTTTT
AT5G57350.1-5 3 51 59 CTCTGTTTT
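The proportion quoted in the caption of Table S4.3 (21 of 45 introns, 46.7%) can be recomputed from rows in the layout used above, where only the first row of each intron carries the YES/NO flag. The sketch below is a hedged illustration: the function name is hypothetical and only five rows are copied from the table, so the printed tally covers that excerpt rather than the full table.

```python
def sugar_responsive_tally(rows):
    """Return (number of sugar-responsive introns, number of introns)."""
    flags = {}
    for row in rows:
        fields = row.split()
        # Only the first row of each intron ends with a YES/NO flag.
        if fields[-1] in ("YES", "NO"):
            flags[fields[0]] = fields[-1] == "YES"
    return sum(flags.values()), len(flags)


# Excerpt of Table S4.3 (four introns, five rows).
rows = [
    "AT1G67070.1-1 7 15 23 CTCTGTTTT NO",
    "AT1G67070.1-1 7 27 35 CTCTGTTTT",
    "AT3G60130.1-7 5 168 176 CTCTGTTTT YES",
    "AT1G09540.1-2 4 58 50 AAAACAGAG YES",
    "AT1G30320.1-2 4 76 84 CTCTGTTTT NO",
]

responsive, total = sugar_responsive_tally(rows)
print(f"{responsive}/{total} introns in the excerpt")  # 2/4 introns in the excerpt

# The figure quoted in the caption: 21 of 45 introns.
print(f"{100 * 21 / 45:.1f}%")  # 46.7%
```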
5 General Conclusions and Future Directions
5.1 General Conclusions
This thesis investigated the upstream and downstream regulation of the Arabidopsis
thaliana R2R3-MYB transcription factor, AtMYB61. It addressed three major aims. The
first aim related to the identification of direct downstream targets of AtMYB61. The
second aim related to the determination of DNA targets preferentially bound by
AtMYB61. The third aim dealt with the examination of upstream regulatory mechanisms
that impact the transcription of AtMYB61. The scientific objectives and the major
findings that arose by addressing these aims are as follows:
(1) To determine the direct downstream targets of AtMYB61
Three putative downstream target genes of AtMYB61 were identified. These targets
were predicted on the basis of a comparative transcriptome analysis, which identified
genes whose transcript abundance was modulated by differences in AtMYB61 activity
and compared them with genes whose transcript abundance profiles paralleled that of
AtMYB61 across development and in different organs.
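The logic of this two-stage comparison can be sketched as an intersection of two gene sets. The example below is a minimal illustration and not the analysis pipeline used in this thesis: the gene names, abundance values, correlation coefficients, fold-change criterion and both thresholds are assumptions chosen only to make the sketch runnable.

```python
def candidate_targets(abundance, coexpression_r, fold=1.5, min_r=0.8):
    """Return genes that pass both stages, sorted by name.

    abundance:      gene -> (wild_type, loss_of_function, overexpressor)
    coexpression_r: gene -> correlation with the regulator across samples
    """
    # Stage 1 (assumed criterion): abundance rises in the overexpressor and
    # falls in the loss-of-function line, each by at least `fold`.
    stage1 = {
        gene
        for gene, (wt, lof, ox) in abundance.items()
        if ox >= fold * wt and lof * fold <= wt
    }
    # Stage 2: the profile parallels the regulator across development/organs.
    stage2 = {gene for gene, r in coexpression_r.items() if r >= min_r}
    return sorted(stage1 & stage2)


# Hypothetical genes and values, for illustration only.
abundance = {
    "geneA": (10.0, 4.0, 20.0),  # responds to regulator activity
    "geneB": (10.0, 9.0, 11.0),  # essentially unchanged
    "geneC": (10.0, 5.0, 18.0),  # responds to regulator activity
}
coexpression_r = {"geneA": 0.92, "geneB": 0.95, "geneC": 0.30}

print(candidate_targets(abundance, coexpression_r))  # ['geneA']
```

Requiring both criteria narrows the candidate list: geneB is co-expressed but unresponsive, and geneC is responsive but not co-expressed, so only geneA remains.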
The three putative AtMYB61 targets identified through this comparison are predicted to
encode the following proteins: a KNOTTED1-like transcription factor (KNAT7,
At1g62990); a caffeoyl-CoA 3-O-methyltransferase (CCoAOMT7, At4g26220); and a
pectin-methylesterase (PME, At2g45220). Statistically over-represented motifs were
identified in the 5' non-coding regions of the three putative target genes. These motifs
corresponded to previously characterised AC-element motifs that function as
R2R3-MYB targets in other systems (Grotewold et al., 1994; Sablowski et al., 1994; Sablowski
et al., 1995; Moyano et al., 1996; Sainz et al., 1997; Uimari and Strommer, 1997;
Tamagnone et al., 1998; Jin et al., 2000; Sugimoto et al., 2000; Yang et al., 2001;
Patzlaff et al., 2003a; Patzlaff et al., 2003b; Fukuzawa et al., 2006).
The consensus motif identified in the gene regulatory regions of the three putative
AtMYB61 target genes functions as a bona fide target for AtMYB61 binding, as
determined by EMSA using purified recombinant AtMYB61 protein. Moreover, the 5'
non-coding regulatory regions of each of the putative target genes were also bound
by AtMYB61, as determined by EMSA. AtMYB61 expression in yeast was sufficient to
drive transcription of a synthetic reporter gene comprising a tandem AC-element fused
to a yeast minimal promoter, upstream of the reporter gene lacZ. Together, these
findings support the hypothesis that AtMYB61 binds to, and regulates the expression of,
a small subset of genes, which in turn shape multiple facets of plant growth and
metabolism.
(2) To identify and characterise the DNA binding motifs to which AtMYB61
preferentially binds
The DNA sequences bound by a gene regulatory protein can be affinity-purified
using the CASTing system. This system was used to identify DNA recognition sites to
which recombinant AtMYB61 protein preferentially binds in vitro. The binding kinetics of
AtMYB61 to the CASTing-selected DNA target sequences were determined using a
nitrocellulose filter-binding assay. These experiments confirmed that a core ACC
nucleotide motif was essential for binding by AtMYB61. The nature of the interactions
between amino acids in the AtMYB61 DNA-binding site and nucleotides in the
preferential DNA targets was explored using molecular modeling in silico. These
models predict key interactions that likely shape the affinity of protein binding to the
cognate DNA sequence. Notably, while recombinant AtMYB61 was sufficient to drive gene
expression from CASTing-identified target DNA sequences in yeast, it did so in a
manner that was not entirely consistent with predicted affinities. Together, these
findings illustrate the binding specificity of an R2R3-MYB protein, and underscore the
fact that such specificity may play out in a complex manner in a biological system.
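Filter-binding data of this kind are often summarised by an apparent dissociation constant. The sketch below assumes a simple one-site equilibrium model, fraction bound = [P] / (Kd + [P]), and fits Kd by a coarse log-spaced grid search. The model, the nM units, the function names and the synthetic data points are all assumptions for illustration; they are not taken from the assay reported in this thesis.

```python
def fraction_bound(protein_nM, kd_nM):
    """One-site equilibrium binding: fraction of DNA probe bound."""
    return protein_nM / (kd_nM + protein_nM)


def fit_kd(protein_nM, observed, lo=0.1, hi=10000.0, steps=2000):
    """Least-squares Kd estimate from a log-spaced grid between lo and hi."""
    best_kd, best_sse = None, float("inf")
    for k in range(steps + 1):
        kd = lo * (hi / lo) ** (k / steps)
        sse = sum(
            (obs - fraction_bound(p, kd)) ** 2
            for p, obs in zip(protein_nM, observed)
        )
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Synthetic, noise-free titration generated with Kd = 100 nM (NOT real data).
concentrations = [10.0, 50.0, 100.0, 500.0, 1000.0]
observed = [fraction_bound(p, 100.0) for p in concentrations]

kd_estimate = fit_kd(concentrations, observed)
print(f"estimated Kd is about {kd_estimate:.0f} nM")
```

A grid search is used here only to keep the sketch dependency-free; a nonlinear least-squares routine would normally be preferred for real data, together with an estimate of uncertainty.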
(3) To determine the molecular components that function upstream to modulate
AtMYB61 expression.
AtMYB61 was regulated by photosynthate in a sugar-signalling pathway that appears to
act independently of the hexokinase sugar signalling pathway. Analysis of AtMYB61
promoter-reporter fusion constructs with or without AtMYB61 5' intragenic sequences
suggested that AtMYB61 expression is de-repressed by sucrose in a mechanism
involving intragenic sequences. An over-represented conserved motif was identified
within the second intron of Brassicaceae AtMYB61 homologues. The second intron
repeats of AtMYB61 could function as binding targets for a putative sugar-mediated
repressor, as determined by EMSA. Putative repressor proteins that bound this motif in
the absence of sucrose were identified by affinity purification coupled with mass
spectrometry, and characterised using a combination of loss-of-function genetics and
transcriptome analysis. Together, these findings support the hypothesis of a novel
protein activity that binds a conserved repeat motif within the second intron of AtMYB61
to regulate sugar-mediated expression of AtMYB61.
5.2 Future Directions
Molecular Characterisations of Plant Transcription Factors
Despite the vast knowledge of plant transcription factor function at the gross
morphological level, little is known about the mechanistic basis for transcription factor
activity. In addition to shedding light on a particular transcription factor, AtMYB61, this
thesis has established a pipeline for characterising the function of any
transcription factor at the molecular level. This pipeline is essential because it gives
insight into the mechanisms that drive phenotypes. The identification of more
DNA-binding sites of regulatory proteins should lead to more accurate in silico motif
prediction programs for novel DNA-binding proteins. These insights are not only
important from a basic science perspective, but can also be fruitful in terms of
developing schemes for the modification of important transcription factors, like
AtMYB61, for specific end purposes, such as the directed modification of plant
architecture or metabolic engineering.
ChIP-Seq
In addition to the in vitro and in silico characterisation of AtMYB61 and its target
sequences demonstrated in this thesis, it is critical that an in vivo characterisation
also be conducted to determine how AtMYB61 influences the mechanisms that shape
phenotype. Recently, chromatin immunoprecipitation (ChIP) followed by
high-throughput signature sequencing (ChIP-seq) has proven to be an incredibly powerful
means by which to identify in vivo DNA-binding sites of sequence-specific transcription
factors (Massie and Mills, 2008). ChIP-seq could be used to identify in vivo DNA
targets of AtMYB61 in the Arabidopsis thaliana genome. To this end, a viable antibody
has been generated against the variable region of AtMYB61 (refer to Chapter 3 of this
thesis). DNA sequences can be pulled down, sequenced, and analysed to determine
their location in the Arabidopsis thaliana genome. Targets can be validated by
analysing their transcript abundance in atmyb61 loss-of-function and AtMYB61
overexpressor mutants. The in vivo direct downstream targets of AtMYB61 can then be
compared to the in vitro and in silico targets determined in this thesis to assess the
accuracy of these methods.
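The comparison proposed above amounts to measuring the overlap between a set of in vivo targets and a set of predicted targets. The following is a minimal sketch of such a comparison; the function name, the gene names and both sets are hypothetical placeholders, since no ChIP-seq data are reported in this thesis.

```python
def compare_target_sets(in_vivo, predicted):
    """Summarise agreement between in vivo and predicted target gene sets."""
    in_vivo, predicted = set(in_vivo), set(predicted)
    shared = in_vivo & predicted
    union = in_vivo | predicted
    return {
        "shared": sorted(shared),
        "in_vivo_only": sorted(in_vivo - predicted),
        "predicted_only": sorted(predicted - in_vivo),
        # Jaccard index: 1.0 means identical sets, 0.0 means no overlap.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical gene sets, for illustration only.
chip_seq_targets = {"geneA", "geneB", "geneD"}
predicted_targets = {"geneA", "geneB", "geneC"}

summary = compare_target_sets(chip_seq_targets, predicted_targets)
print(summary["shared"])   # ['geneA', 'geneB']
print(summary["jaccard"])  # 0.5
```

Genes found only in vivo may point to binding that depends on cofactors or chromatin context, whereas genes found only by prediction may reflect sites that are not occupied in the tissues sampled; both lists are therefore worth reporting alongside the overlap.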
Characterisations of Putative AtMYB61 Repressors
The identification in this thesis of putative repressors that bound the second intron
of AtMYB61 revealed candidate molecular components that function upstream to
modulate AtMYB61 expression; however, these repressor proteins remain to be
characterised biochemically. To determine whether the putative AtMYB61 repressor
proteins can repress gene activity, they should be expressed in Arabidopsis thaliana
protoplasts to test whether they repress a synthetic reporter gene comprising tandem
AtMYB61 second intron repeats fused to a Cauliflower Mosaic Virus 35S promoter,
upstream of the GUS reporter gene uidA. In addition to this biochemical
characterisation, the in vivo direct downstream targets of these proteins should be
identified by ChIP-seq. To determine where the AtMYB61 repressors are expressed
in tissues throughout development, promoter-reporter fusion constructs
should be transformed into plants and analysed.
Binding of the putative AtMYB61 repressors to the AtMYB61 second intron repeat
also remains to be validated. The cDNAs of the putative AtMYB61 repressors should be
cloned into the pET-15b protein expression vector and expressed. To determine whether
the proteins bind to the AtMYB61 second intron repeat in vitro, an electrophoretic
mobility shift assay (EMSA) should be conducted with recombinant putative AtMYB61
repressor proteins and labelled AtMYB61 second intron repeat. In addition, to determine
whether the putative AtMYB61 repressor proteins bind to the AtMYB61 second intron
repeat in vivo, an EMSA should be conducted with labelled AtMYB61 second intron
repeat and nuclear proteins purified from putative AtMYB61 repressor loss-of-function
(rmx) mutants. It is hypothesised that, in the rmx loss-of-function background, binding
in the EMSA would be reduced compared with the same assay conducted with wild-type
nuclear proteins.
Despite the size and importance of the plant R2R3-MYB family of transcription factors,
little is known about the molecular functioning of individual family members. AtMYB61,
a member of the R2R3-MYB family in Arabidopsis thaliana, regulates pleiotropic
modifications of carbon acquisition and allocation throughout the plant body. As is the
case for most R2R3-MYB transcription factors, the precise mechanisms that enable
AtMYB61 to bring about important changes in plant function were unknown before the
onset of this thesis. The work described in this thesis casts light on the downstream
and upstream mechanisms of AtMYB61. The findings presented in this thesis point to
additional complexities in the regulation of plant gene expression, and argue for the
need for greater exploration of the molecular intricacies involved in how a given plant
transcription factor elicits a phenotype.
Appendices
The wound-, pathogen-, and ultraviolet B-responsive MYB134 gene encodes an R2R3 MYB transcription factor that
regulates a suite of genes involved in proanthocyanidin synthesis in Poplar
This chapter is an extract of material originally contained in the following publication:
Mellway, R.D., Tran, L.T., Prouse, M.B., Campbell, M.M., and Constabel, C.P. (2009)
The Wound-, Pathogen-, and Ultraviolet B-Responsive MYB134 Gene Encodes an
R2R3 MYB Transcription Factor That Regulates Proanthocyanidin Synthesis in Poplar.
Plant Physiology. 150: 924-941.
Contributions: MBP, RDM, MMC, CPC designed research; MBP, RDM, LTT
performed research; RDM, LTT, MBP, MMC, CPC analysed data; MBP, RDM, MMC,
CPC wrote manuscript with editorial assistance from MBP, RDM, LTT, MMC, CPC.
MBP contributed specifically to each figure and table in this chapter.
Copyright: The material in this chapter is copyrighted by The American Society of
Plant Biologists and is cited as:
A The Wound-, Pathogen-, and Ultraviolet B-Responsive MYB134 Gene Encodes an R2R3 MYB Transcription Factor that Regulates a Suite of Genes Involved in Proanthocyanidin Synthesis in Poplar
A.1 Abstract
In poplar (Populus spp.), the major defense phenolics produced in leaves are flavonoid-
derived proanthocyanidins (PAs). Transcriptional activation of PA biosynthetic genes
leading to PA accumulation in leaves occurs following herbivore damage and
mechanical wounding. A poplar R2R3-MYB transcription factor gene, MYB134, exhibits
close sequence similarity to the Arabidopsis thaliana PA regulator TRANSPARENT
TESTA2 and is coinduced with PA biosynthetic genes following mechanical wounding
and exposure to elevated ultraviolet B light. Overexpression of MYB134 in poplar
results in transcriptional activation of the full PA biosynthetic pathway and a significant
plant-wide increase in PA levels. Here, we demonstrate through electrophoretic mobility
shift assays (EMSA) that recombinant MYB134 protein is able to bind to promoter
regions of early and late PA pathway genes: PHENYLALANINE AMMONIA-LYASE1
(PAL1), DIHYDROFLAVONOL REDUCTASE1 (DFR1) and ANTHOCYANIDIN
REDUCTASE2 (ANR2). Sequences enriched with adenosine and cytosine nucleotides,
termed AC elements, were over-represented in the 5′ non-coding regions of putative
target genes. The consensus motif functions as a bona fide target for MYB134 as
determined by EMSA. Our data provide insight into the regulatory mechanisms
controlling PA metabolism in poplar, and the identification of a regulator of stress-
responsive PA biosynthesis constitutes a valuable tool for manipulating PA metabolism
in poplar and investigating the biological functions of PAs in resistance to biotic and
abiotic stresses.
A.2 Introduction
Plant secondary metabolites play important ecological roles and in many plants
constitute a critical component of defenses against biotic and abiotic stress. Many
secondary metabolic pathways are responsive to environmental conditions and can be
rapidly activated by stresses such as pathogen infection, elevated light, and herbivory.
The phenylpropanoid pathway in particular leads to the synthesis of a large and diverse
class of plant secondary metabolites, many of which are stress induced (Dixon and
Paiva, 1995). Synthesis of phenylpropanoids and other secondary metabolites
following stress is typically mediated by the transcriptional activation of suites of
biosynthetic genes coordinately regulated by transcription factor proteins (Weisshaar
and Jenkins, 1998; Davies and Schwinn, 2003). The possibility of identifying
transcription factors that control entire pathways is motivating many studies in plant
stress biology, since such regulators would be valuable for the metabolic engineering of
plants for both plant and human health (Dixon, 2005; Sharma and Dixon, 2005; Yu and
McGonigle, 2005).
Populus species (cottonwoods, poplars, and aspens, hereafter referred to collectively as
poplar) are often ecological foundation species and include the most widely distributed
trees in the Northern Hemisphere. The phenolic metabolites produced by poplar are
thought to be important determinants of community structure and ecosystem dynamics
(Lindroth and Hwang, 1996; Schweitzer et al., 2004; Bailey et al., 2005; LeRoy et al.,
2006; Whitham et al., 2006). Poplar leaves typically accumulate several classes of
phenolic metabolites, including the salicylate-derived phenolic glycosides (PGs),
flavonoids such as flavonol glycosides, anthocyanins, and proanthocyanidins (PAs; or
condensed tannins), and numerous small phenolic acids and their esters (Pearl and
Darling, 1971; Klimczak et al., 1972; Palo, 1984; Lindroth and Hwang, 1996). PGs and
PAs are generally the most abundant foliar phenolic metabolites in poplar and together
can constitute more than 30% of leaf dry weight (Pearl and Darling, 1971; Klimczak et
al., 1972; Palo, 1984; Lindroth and Hwang, 1996). PAs are also constitutively produced
in poplar leaves, but their biosynthesis is often up-regulated by stresses such as insect
herbivory, mechanical wounding, and pathogen infection (Peters and Constabel, 2002;
Stevens and Lindroth, 2005; Miranda et al., 2007). PA accumulation following
wounding and herbivory occurs both locally at the site of damage and systemically in
distal leaves (Peters and Constabel, 2002). The strong systemic activation of the PA
biosynthetic pathway in poplar following insect herbivory suggests that these
compounds function in herbivore defense. However, experimental evidence indicates
that poplar leaf PAs may not be strong, broad-spectrum antiherbivore compounds
(Hemming and Lindroth, 1995; Ayres et al., 1997). In addition to biotic stresses, nutrient
limitation and high light levels have also been found to result in greater PA
concentrations in poplar (Hemming and Lindroth, 1999; Osier and Lindroth, 2001),
hinting at broader biological roles.
Transcriptional regulation of flavonoid and PA biosynthetic genes involves combinatorial
interactions between several classes of transcription factor proteins (Mol et al., 1998;
Nesi et al., 2001; Winkel-Shirley, 2001). These include members of the R2R3-MYB
domain, basic helix-loop-helix (bHLH) domain, and WD-repeat (WDR) families (Lepiniec
et al., 2006). In Arabidopsis thaliana seed testa, PA biosynthesis is regulated by a
MYB-bHLH-WDR ternary complex composed of the TT2, TT8, and TTG1 proteins (Nesi
et al., 2000; Nesi et al., 2001). The MYB factor (TT2) confers target gene specificity to
the complex, activating the late PA biosynthetic genes, including DFR, BAN, TT12, and
AHA10 (Debeaujon et al., 2003; Baudry et al., 2004; Sharma and
Dixon, 2005). The DNA sequences bound by TT2 have not been elucidated, although
the closely related maize (Zea mays) COLORLESS1 (C1) protein, a regulator of
anthocyanin metabolism, has been shown to bind to both AC-rich motifs known as AC
elements and the animal c-MYB consensus sequence (CNGTTR) present in the
regulatory regions of numerous phenylpropanoid genes (Howe and Watson, 1991;
Weston, 1992; Sainz et al., 1997; Hernandez et al., 2004).
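The c-MYB consensus cited above is written with IUPAC degenerate codes, in which N stands for any base and R for a purine (A or G). As an illustrative sketch that is not part of the original analysis, the following Python snippet enumerates the concrete hexamers covered by CNGTTR. The names IUPAC and expand are hypothetical helpers introduced here for illustration only.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes needed for the CNGTTR consensus.
# (Hypothetical helper table; extend it if other degenerate codes are needed.)
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "N": "ACGT"}


def expand(consensus: str) -> list[str]:
    """List every concrete DNA sequence that matches a degenerate consensus."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in consensus))]


c_myb_sites = expand("CNGTTR")
print(len(c_myb_sites), c_myb_sites)
```

One degenerate position with four options and one with two options give eight concrete sequences, which illustrates how permissive such a consensus is when it is used to search promoter sequence.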
The R2R3-MYBs constitute large gene families in plants, with 126 members in
Arabidopsis thaliana (Stracke et al., 2001) and 192 in poplar (Wilkins et al., 2009).
Although many remain functionally uncharacterised, numerous R2R3-MYB proteins are
implicated in the regulation of plant-specific developmental and physiological processes,
including the regulation of phenylpropanoid metabolism (Stracke et al., 2001). R2R3-
MYB proteins are characterised by two imperfectly repeated N-terminal MYB domains
each forming DNA-binding helix-helix-turn-helix structures. Outside of the R2R3 MYB
domain, the proteins are highly divergent except for short conserved amino acid
sequence motifs. These motifs, together with sequence homology within the MYB
domains, form the basis for their classification into different subgroups (Stracke et al.,
2001; Jiang et al., 2004b).
We previously showed that the stress induction of PAs in poplar leaves follows the
transcriptional activation of PA biosynthetic genes (Peters and Constabel, 2002;
Miranda et al., 2007) and therefore hypothesised that a TT2-like R2R3 MYB protein
regulates this process. MYB134 was also previously identified as a candidate PA
regulator that is consistently coregulated with PA biosynthetic genes (Mellway et al.,
2009). Constitutive expression of MYB134 in transgenic poplar resulted in a specific
activation of PA pathway genes, leading to a dramatic increase in PA concentrations,
suggesting that this gene is indeed a poplar PA regulator. Here, we show that
recombinant MYB134 protein binds to promoter regions of both early and late PA
pathway genes containing predicted MYB binding sites. These findings provide insight
into the regulatory mechanisms mediating stress-induced PA biosynthesis, and the PA-
modified poplar trees produced here represent a valuable tool for investigating the
functions of carbon-based allelochemicals in poplar.
A.3 Materials and Methods
A.3.1 EMSA
Recombinant MYB134 protein was produced in Escherichia coli using the coding
sequence cloned in-frame into the NdeI and BamHI sites of the pET15b vector
(Novagen). Recombinant MYB134 protein was produced, extracted, and affinity purified
as described previously for pine (Pinus spp.) MYB proteins (Patzlaff et al., 2003b).
EMSA conditions were exactly as described previously (Patzlaff et al., 2003b; Gomez-
Maldonado et al., 2004) except that recombinant MYB134 protein was used in place of
pine MYB protein.
A.4 Results and Discussion
A.4.1 MYB134 Binds to Promoter Regions of PA Biosynthetic Genes
PHENYLALANINE AMMONIA-LYASE1 (PAL1), DIHYDROFLAVONOL REDUCTASE1
(DFR1) and ANTHOCYANIDIN REDUCTASE2 (ANR2) were all upregulated by
constitutive expression of MYB134, suggesting that they were all direct targets (Mellway
et al., 2009). The encoded proteins act in the PA biosynthetic pathway: PAL1 catalyses the
conversion of phenylalanine to cinnamic acid; DFR1 catalyses the reduction of dihydroflavonols to
leucoanthocyanidins; and ANR2 catalyses the conversion of anthocyanidins to epicatechins (Xie and
Dixon, 2005). These target genes represent general phenylpropanoid/early PA
metabolism (PAL1), late flavonoid metabolism (DFR1), and the PA-specific branch of
flavonoid metabolism (ANR2). Candidate MYB134-binding sites in the regulatory
regions of these genes were identified by visual examination of the upstream genomic
sequence and comparison with characterised phenylpropanoid promoters as well as
by a search of the PLACE plant cis-element database
(http://www.dna.affrc.go.jp/PLACE/signalscan.html) using SIGNAL SCAN (Prestridge,
1991; Higo et al., 1998). The promoter regions of the target genes were found to
contain motifs similar to the adenosine- and cytosine-rich AC elements found in the
regulatory regions of biosynthetic genes of different branches of phenylpropanoid
metabolism, including both flavonoid and lignin biosynthesis (Fig. A.1a)(Hatton et al.,
1995; Rogers and Campbell, 2004; Hartmann et al., 2005). AC elements are bound by
the maize C1 protein, the most closely related MYB protein to MYB134 for which DNA-
binding sites have been defined, as well as several MYB proteins involved in the
regulation of lignin metabolism (Hatton et al., 1995; Patzlaff et al., 2003b; Rogers and
Campbell, 2004). The 180-bp ANR2 promoter region analysed also contains a motif
matching the CNGTTR consensus sequence bound by the vertebrate c-MYB (Fig.
A.1a)(Howe and Watson, 1991; Weston, 1992). Inspection of these representative
promoter sequences also revealed the presence of bHLH protein consensus-binding
sites (CANNTG) in close proximity to the putative MYB-binding sites (Fig. A.1a). The
upstream region of poplar PAL1 contains two overlapping AC element sequences
identical to the high-affinity P-binding site (ACCTACCAACC) identified in the maize A1
Figure A.1. MYB134 binds to the promoters of putative downstream target genes. (a) Schematic representation of 1,000 bp of 5′ noncoding sequences for three putative MYB134 downstream target genes. + and − indicate the orientations of AC element-like motifs relative to the sense coding strand; numbers indicate the positions of these motifs relative to the putative transcriptional start. Arrows above each line indicate bHLH consensus sites (CANNTG), while arrows below each line indicate c-MYB consensus sites (CNGTTR). Light gray horizontal lines under the sequences correspond to the location of the DNA sequence used as the binding target in the EMSA shown in (b). (b) MYB134 binding to 5′ noncoding sequences of the three putative target genes as determined by EMSA. Recombinant MYB134 bound to all three 5′ noncoding sequences, as determined by a gel shift of the probe (arrows), which could be outcompeted with increasing quantities of unlabeled DNA corresponding to a canonical R2R3 MYB-binding site, known as an AC element motif (AC; 5′-ATTGTTCTTCCTGGGGTGACCGTCCACCTACGCTAAAAGCCGTCGCGGGATAAGCCTGTCTG-3′). (c) MYB134 binding to the AC-rich canonical R2R3 MYB-binding site motif as determined by EMSA. Binding of recombinant MYB134 to radiolabeled AC can be outcompeted by cold competitor AC (left) but not by the nonspecific competitor poly(dIdC).
(encoding dihydroflavonol reductase) promoter sequence that is bound by maize C1
and the maize P protein, an R2R3-MYB protein that regulates the biosynthesis of 3-deoxy
flavonoids and phlobaphenes (Fig. A.1a) (Sainz et al., 1997). Within the 180-bp regions
analysed, poplar DFR1 and ANR2 both contain motifs that are quite similar to the AC
elements defined by Hatton et al. (1995) in the tobacco PAL2 promoter (GCCTACC
and ACCTACA, respectively) (Fig. A.1a). EMSA experiments showed that the
recombinant MYB134 protein specifically bound the 180-bp upstream regulatory
sequences (Fig. A.1b). Two shifted bands were observed for the PAL1 and ANR2 180-
bp probes, while only one was seen with the DFR1 probe (Fig. A.1b). It is possible that
the MYB134 protein binds both of the overlapping AC elements in the PAL1 promoter
and both the AC element-like sequence and the c-MYB-binding site in the ANR2
promoter. A sequence containing a canonical AC element was an effective competitor
and eliminated MYB134 binding (Fig. A.1b), and recombinant MYB134 also bound to
this element in a specific manner (Fig. A.1c). Thus, MYB134 appears to bind to the
gene regulatory regions of putative target genes in an AC motif-dependent fashion. Our
work indicates that high sequence similarity to TT2 can be used to link MYB gene
function to PA pathway regulation.
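As a minimal sketch, assuming only exact string matching and not the PLACE/SIGNAL SCAN search used in this work, the Python snippet below locates the AC element core ACCTAC on either strand of the AC competitor sequence printed in the legend of Figure A.1. The function names reverse_complement and find_sites are hypothetical and introduced here for illustration. Overlapping occurrences are retained, which matters for promoters such as PAL1 where AC elements overlap.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq: str, motif: str):
    """Yield (start, strand) for every occurrence of motif, overlaps included.

    start is 0-based on the forward strand. Hits on the '-' strand are found
    by searching the forward sequence for the motif's reverse complement.
    A palindromic motif is therefore reported once per strand.
    """
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        pos = seq.find(pattern)
        while pos != -1:
            yield pos, strand
            pos = seq.find(pattern, pos + 1)  # step by 1 to keep overlapping hits


# AC competitor sequence as printed in the Figure A.1 legend.
AC_PROBE = "ATTGTTCTTCCTGGGGTGACCGTCCACCTACGCTAAAAGCCGTCGCGGGATAAGCCTGTCTG"

hits = list(find_sites(AC_PROBE, "ACCTAC"))
for start, strand in hits:
    print(f"ACCTAC at position {start} on the {strand} strand")
```

The same function could be applied to any promoter fragment of interest; a match only indicates a candidate site, and binding still has to be tested experimentally, for example by EMSA as described above.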
In silico analysis has shown that the promoter regions of the poplar flavonoid and PA
biosynthetic genes contain cis elements matching the consensus sequences recognised
by phenylpropanoid regulatory R2R3-MYB proteins (Tsai et al., 2006). MYB134 was
shown to bind to promoter fragments containing motifs similar to the AC elements found
in a wide variety of phenylpropanoid biosynthetic gene promoters (Fig. A.1b). MYB134
was also shown to bind to a DNA sequence containing a canonical AC element
(ACCTAC; Fig. A.1c). These results suggest that such motifs are bound by MYB134 in
vivo, although these results do not rule out the involvement of other putative MYB
binding sites, such as the animal c-MYB recognition site found in the ANR2 promoter.
AC element-like motifs are present within the 2-kb 5′ noncoding sequence of most
poplar flavonoid genes (Tsai et al., 2006). Given that AC elements are widely
distributed in the regulatory regions not just of PA biosynthetic genes but of genes
involved in other branches of flavonoid and phenylpropanoid metabolism, interactions
with cofactors such as bHLH domain proteins that require the presence of additional
binding sites likely contribute to the specific activation of different branch pathways
(Hartmann et al., 2005). Consistent with specific bHLH cofactor binding sites
contributing to MYB134 target gene specificity, putative bHLH-binding sites are present
in all poplar PA pathway genes (R.D. Mellway and C.P. Constabel, unpublished data).
In activating the full suite of early and late flavonoid as well as PA biosynthetic genes,
MYB134 differs from Arabidopsis thaliana TT2, which regulates a more limited set of
late PA structural genes (Nesi et al., 2001; Sharma and Dixon, 2005). A wider target
gene set for MYB134, in conjunction with the natural constitutive PA production in a
wider range of poplar tissues, may account for the different effects of TT2
overexpression in Arabidopsis thaliana compared with MYB134 overexpression in
poplar. Unlike poplar, Arabidopsis thaliana produces PAs only in the seed testa, and
ectopic expression of TT2 does not result in plant-wide PA accumulation (Nesi et al.,
2001). A more detailed elucidation of how the pathway is regulated will require
functional characterisation of the members of both MYB gene families as well as
identification and analysis of the additional interacting proteins such as the bHLH and
WDR proteins.
A.5 Conclusion
The extensive genomics resources combined with the complexity and biological
importance of phenylpropanoid metabolism in poplar make it a useful system for
investigating this pathway. In this report, we describe work identifying a gene encoding
an R2R3-MYB transcription factor, PtMYB134, which appears to play an important role
in controlling PA biosynthesis. PtMYB134 was shown to bind to the 5′ non-coding
regulatory regions of both early and late PA biosynthetic genes: PAL1, DFR1 and
ANR2. AC elements were identified within the target promoter regions, and this
consensus motif functions as a bona fide target for MYB134 binding as determined by
an electrophoretic mobility shift assay. Identifying transcriptional regulators of
biosynthetic pathway genes is an important goal for metabolic engineering of secondary
metabolism in plants, and the identification of a putative regulator of PA metabolism in
poplar may permit new experimental approaches for evaluating the biological functions
of PAs.
A.6 Acknowledgements
This work was generously supported by a Natural Sciences and Engineering Research
Council of Canada (NSERC) Canada Graduate Scholarship (CGS D) awarded to MP,
and by funding from the University of Toronto and NSERC to MMC.
B Study Labels
Study Label   SALK Line       Mutant Label
A             SALK_112391C
C             SALK_134409C
E             SALK_143514C
G             SALK_050658     rmx1
I             SALK_082219C    rmx2
J             SALK_046986     rmx3
K             SALK_129409C
L             SALK_047892C
M             SALK_150941C
N             SALK_147133
O             SALK_109533C    rmx4
P             SALK_047550C    rmx5
Q             SALK_125978
R             SALK_009225C
S             SALK_032344C    rmx6
T             SALK_031449C
U             SALK_036546C
V             SALK_100844C
References
Ades, S.E., and Sauer, R.T. (1994). Differential DNA-binding specificity of the engrailed homeodomain: the role of residue 50. Biochemistry 33, 9187-9194.
Affolter, M., Percival-Smith, A., Muller, M., Leupin, W., and Gehring, W.J. (1990). DNA binding properties of the purified Antennapedia homeodomain. Proc. Natl. Acad. Sci. U. S. A. 87, 4093-4097.
Alabadi, D., Oyama, T., Yanovsky, M.J., Harmon, F.G., Mas, P., and Kay, S.A. (2001). Reciprocal regulation between TOC1 and LHY/CCA1 within the Arabidopsis circadian clock. Science 293, 880-883.
Alonso, J.M., Stepanova, A.N., Leisse, T.J., Kim, C.J., Chen, H.M., Shinn, P., Stevenson, D.K., Zimmerman, J., Barajas, P., Cheuk, R., Gadrinab, C., Heller, C., Jeske, A., Koesema, E., Meyers, C.C., Parker, H., Prednis, L., Ansari, Y., Choy, N., Deen, H., Geralt, M., Hazari, N., Hom, E., Karnes, M., Mulholland, C., Ndubaku, R., Schmidt, I., Guzman, P., Aguilar-Henonin, L., Schmid, M., Weigel, D., Carter, D.E., Marchand, T., Risseeuw, E., Brogden, D., Zeko, A., Crosby, W.L., Berry, C.C., and Ecker, J.R. (2003). Genome-wide Insertional mutagenesis of Arabidopsis thaliana. Science 301, 653-657.
Anton, I.A., and Frampton, J. (1988). Tryptophans in myb proteins. Nature 336, 719-719.
Arabidopsis Genome Initiative (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796-815.
Arenas-Huertero, F., Arroyo, A., Zhou, L., Sheen, J., and Leon, P. (2000). Analysis of Arabidopsis glucose insensitive mutants, gin5 and gin6, reveals a central role of the plant hormone ABA in the regulation of plant vegetative development by sugar. Genes Dev. 14, 2085-2096.
Avila, J., Nieto, C., Canas, L., Benito, M.J., and Paz-Ares, J. (1993). Petunia hybrida genes related to the maize regulatory C1 gene and to animal myb proto-oncogenes. Plant J. 3, 553-562.
Aya, K., Ueguchi-Tanaka, M., Kondo, M., Hamada, K., Yano, K., Nishimura, M., and Matsuoka, M. (2009). Gibberellin Modulates Anther Development in Rice via the Transcriptional Regulation of GAMYB. Plant Cell 21, 1453-1472.
Ayres, M.P., Clausen, T.P., MacLean, S.F., Redman, A.M., and Reichardt, P.B. (1997). Diversity of structure and antiherbivore activity in condensed tannins. Ecology 78, 1696-1712.
Badis, G., Berger, M.F., Philippakis, A.A., Talukder, S., Gehrke, A.R., Jaeger, S.A., Chan, E.T., Metzler, G., Vedenko, A., Chen, X.Y., Kuznetsov, H., Wang, C.F., Coburn, D., Newburger, D.E., Morris, Q., Hughes, T.R., and Bulyk, M.L. (2009). Diversity and Complexity in DNA Recognition by Transcription Factors. Science 324, 1720-1723.
Bailey, J.K., Deckert, R., Schweitzer, J.A., Rehill, B.J., Lindroth, R.L., Gehring, C., and Whitham, T.G. (2005). Host plant genetics affect hidden ecological players: links among Populus, condensed tannins, and fungal endophyte infection. Canadian Journal of Botany-Revue Canadienne De Botanique 83, 356-361.
Bailey, T.L., Williams, N., Misleh, C., and Li, W.W. (2006). MEME: discovering and analyzing DNA and protein sequence motifs. Nucleic Acids Res. 34, W369-W373.
Baranowskij, N., Frohberg, C., Prat, S., and Willmitzer, L. (1994). A novel DNA binding protein with homology to Myb oncoproteins containing only one repeat can function as a transcriptional activator. Embo J. 13, 5383-5392.
Barbulescu, K., Geserick, C., Schuttke, I., Schleuning, W.D., and Haendler, B. (2001). New androgen response elements in the murine Pem promoter mediate selective transactivation. Mol. Endocrinol. 15, 1803-1816.
Baudry, A., Heim, M.A., Dubreucq, B., Caboche, M., Weisshaar, B., and Lepiniec, L. (2004). TT2, TT8, and TTG1 synergistically specify the expression of BANYULS and proanthocyanidin biosynthesis in Arabidopsis thaliana. Plant J. 39, 366-380.
Beall, E.L., Manak, J.R., Zhou, S., Bell, M., Lipsick, J.S., and Botchan, M.R. (2002). Role for a Drosophila Myb-containing protein complex in site-specific DNA replication. Nature 420, 833-837.
Bell-Lelong, D.A., Cusumano, J.C., Meyer, K., and Chapple, C. (1997). Cinnamate-4-hydroxylase expression in Arabidopsis - Regulation in response to development and the environment. Plant Physiol. 113, 729-738.
Berge, T., Matre, V., Brendeford, E.M., Saether, T., Luscher, B., and Gabrielsen, O.S. (2007). Revisiting a selection of target genes for the hematopoietic transcription factor c-Myb using chromatin immunoprecipitation and c-Myb knockdown. Blood Cells Mol. Dis. 39, 278-286.
Berger, M.F., Philippakis, A.A., Qureshi, A.M., He, F.X.S., Estep, P.W., and Bulyk, M.L. (2006). Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities. Nat. Biotechnol. 24, 1429-1435.
Bergholtz, S., Andersen, T.O., Andersson, K.B., Borrebaek, J., Luscher, B., and Gabrielsen, O.S. (2001). The highly conserved DNA-binding domains of A-, B- and c-Myb differ with respect to DNA-binding, phosphorylation and redox properties. Nucleic Acids Res. 29, 3546-3556.
Bianchi, A., Smith, S., Chong, L., Elias, P., and de Lange, T. (1997). TRF1 is a dimer and bends telomeric DNA. Embo J. 16, 1785-1794.
Biedenkapp, H., Borgmeyer, U., Sippel, A.E., and Klempnauer, K.H. (1988). Viral myb oncogene encodes a sequence-specific DNA-binding activity. Nature 335, 835-837.
Bilaud, T., Koering, C.E., Binet-Brasselet, E., Ancelin, K., Pollice, A., Gasser, S.M., and Gilson, E. (1996). The telobox, a Myb-related telomeric DNA binding motif found in proteins from yeast, plants and human. Nucleic Acids Res. 24, 1294-1303.
Borg, M., Brownfield, L., Khatab, H., Sidorova, A., Lingaya, M., and Twell, D. (2011). The R2R3 MYB Transcription Factor DUO1 Activates a Male Germline-Specific Regulon Essential for Sperm Cell Differentiation in Arabidopsis. Plant Cell 23, 534-549.
Boyes, D.C., Zayed, A.M., Ascenzi, R., McCaskill, A.J., Hoffman, N.E., Davis, K.R., and Gorlach, J. (2001). Growth stage-based phenotypic analysis of Arabidopsis: A model for high throughput functional genomics in plants. Plant Cell 13, 1499-1510.
Braun, E.L., and Grotewold, E. (1999). Newly discovered plant c-myb-like genes rewrite the evolution of the plant myb gene family. Plant Physiol. 121, 21-24.
Brown, D.M., Zeef, L.A.H., Ellis, J., Goodacre, R., and Turner, S.R. (2005). Identification of novel genes in Arabidopsis involved in secondary cell wall formation using expression profiling and reverse genetics. Plant Cell 17, 2281-2295.
Bruhat, A., Tourmente, S., Chapel, S., Sobrier, M.L., Couderc, J.L., and Dastugue, B. (1990). Regulatory elements in the 1st-intron contribute to transcriptional regulation of the beta-3 tubulin gene by 20-hydroxyecdysone in Drosophila kc-cells. Nucleic Acids Res. 18, 2861-2867.
Bulow, L., Brill, Y., and Hehl, R. (2010). AthaMap-assisted transcription factor target gene identification in Arabidopsis thaliana. Database-the Journal of Biological Databases and Curation.
Bulow, L., Engelmann, S., Schindler, M., and Hehl, R. (2009). AthaMap, integrating transcriptional and post-transcriptional data. Nucleic Acids Res. 37, D983-D986.
Busch, M.A., Bomblies, K., and Weigel, D. (1999). Activation of a floral homeotic gene in Arabidopsis. Science 285, 585-587.
Carmel, L., Wolf, Y.I., Rogozin, I.B., and Koonin, E.V. (2007). Three distinct modes of intron dynamics in the evolution of eukaryotes. Genome Res. 17, 1034-1044.
Carra, J.H., and Privalov, P.L. (1997). Energetics of folding and DNA binding of the MAT alpha 2 homeodomain. Biochemistry 36, 526-535.
Carre, I.A., and Kay, S.A. (1995). Multiple DNA-protein complexes at a circadian-regulated promoter element. Plant Cell 7, 2039-2051.
Chaffey, N., Cholewa, E., Regan, S., and Sundberg, B. (2002). Secondary xylem development in Arabidopsis: a model for wood formation. Physiologia Plantarum 114, 594-600.
Chen, C.M., Wang, C.T., and Ho, C.H. (2001). A plant gene encoding a Myb-like protein that binds telomeric GGTTTAG repeats in vitro. J. Biol. Chem. 276, 16511-16519.
Chen, P.W., Chiang, C.M., Tseng, T.H., and Yu, S.M. (2006). Interaction between rice MYBGA and the gibberellin response element controls tissue-specific sugar sensitivity of alpha-amylase genes. Plant Cell 18, 2326-2340.
Chiou, T.J., and Bush, D.R. (1998). Sucrose is a signal molecule in assimilate partitioning. Proc. Natl. Acad. Sci. U. S. A. 95, 4784-4788.
Collado-Vides, J., Magasanik, B., and Gralla, J.D. (1991). Control site location and transcriptional regulation in Escherichia coli. Microbiol. Rev. 55, 371-394.
Collins, T.J. (2007). ImageJ for microscopy. Biotechniques 43, 25-+.
Cosma, M.P., Tanaka, T.U., and Nasmyth, K. (1999). Ordered recruitment of transcription and chromatin remodeling factors to a cell cycle- and developmentally regulated promoter. Cell 97, 299-311.
Coupe, S.A., Palmer, B.G., Lake, J.A., Overy, S.A., Oxborough, K., Woodward, F.I., Gray, J.E., and Quick, W.P. (2006). Systemic signalling of environmental cues in Arabidopsis leaves. J. Exp. Bot. 57, 329-341.
Court, R., Chapman, L., Fairall, L., and Rhodes, D. (2005). How the human telomeric proteins TRF1 and TRF2 recognize telomeric DNA: a view from high-resolution crystal structures. EMBO Rep. 6, 39-45.
Davies, K.M., and Schwinn, K.E. (2003). Transcriptional regulation of secondary metabolism. Functional Plant Biology 30, 913-925.
Debeaujon, I., Nesi, N., Perez, P., Devic, M., Grandjean, O., Caboche, M., and Lepiniec, L. (2003). Proanthocyanidin-accumulating cells in Arabidopsis testa: Regulation of differentiation and role in seed development. Plant Cell 15, 2514-2531.
Dekkers, B.J.W., Schuurmans, J., and Smeekens, S.C.M. (2004). Glucose delays seed germination in Arabidopsis thaliana. Planta 218, 579-588.
DeLano, W.L. (2002). The PyMOL Molecular Graphics System. DeLano Scientific. http://www.pymol.org.
Deyholos, M.K., and Sieburth, L.E. (2000). Separable whorl-specific expression and negative regulation by enhancer elements within the AGAMOUS second intron. Plant Cell 12, 1799-1810.
Dias, A.P., Braun, E.L., McMullen, M.D., and Grotewold, E. (2003). Recently duplicated maize R2R3 Myb genes provide evidence for distinct mechanisms of evolutionary divergence after duplication. Plant Physiol. 131, 610-620.
Dill, A., and Sun, T.P. (2001). Synergistic derepression of gibberellin signaling by removing RGA and GAI function in Arabidopsis thaliana. Genetics 159, 777-785.
Dixon, R.A. (2005). Engineering of plant natural product pathways. Curr. Opin. Plant Biol. 8, 329-336.
Dixon, R.A., and Paiva, N.L. (1995). Stress-induced phenylpropanoid metabolism. Plant Cell 7, 1085-1097.
Do, C.-T., Pollet, B., Thevenin, J., Sibout, R., Denoue, D., Barriere, Y., Lapierre, C., and Jouanin, L. (2007). Both caffeoyl Coenzyme A 3-O-methyltransferase 1 and caffeic acid O-methyltransferase 1 are involved in redundant functions for lignin, flavonoids and sinapoyl malate biosynthesis in Arabidopsis. Planta 226, 1117-1129.
Dooley, S., Seib, T., Welter, C., and Blin, N. (1996). c-myb Intron I protein binding and association with transcriptional activity in leukemic cells. Leuk. Res. 20, 429-439.
Dubos, C., Willment, J., Huggins, D., Grant, G.H., and Campbell, M.M. (2005). Kanamycin reveals the role played by glutamate receptors in shaping plant resource allocation. Plant J. 43, 348-355.
Dubos, C., Stracke, R., Grotewold, E., Weisshaar, B., Martin, C., and Lepiniec, L. (2010). MYB transcription factors in Arabidopsis. Trends Plant Sci. 15, 573-581.
Ebneth, A., Schweers, O., Thole, H., Fagin, U., Urbanke, C., Maass, G., and Wolfes, H. (1994). Biophysical characterization of the c-Myb DNA-binding domain. Biochemistry 33, 14586-14593.
Ehrenkaufer, G.M., Hackney, J.A., and Singh, U. (2009). A developmentally regulated Myb domain protein regulates expression of a subset of stage-specific genes in Entamoeba histolytica. Cell Microbiol. 11, 898-910.
Fairall, L., Schwabe, J.W.R., Chapman, L., Finch, J.T., and Rhodes, D. (1993). The crystal structure of a two zinc-finger peptide reveals an extension to the rules for zinc-finger/DNA recognition. Nature 366, 483-487.
Feldbrugge, M., Sprenger, M., Hahlbrock, K., and Weisshaar, B. (1997). PcMYB1, a novel plant protein containing a DNA-binding domain with one MYB repeat, interacts in vivo with a light-regulatory promoter unit. Plant J. 11, 1079-1093.
Fiume, E., Christou, P., Giani, S., and Breviario, D. (2004). Introns are key regulatory elements of rice tubulin expression. Planta 218, 693-703.
Florence, B., Handrow, R., and Laughon, A. (1991). DNA-binding specificity of the fushi tarazu homeodomain. Mol. Cell. Biol. 11, 3613-3623.
Fornale, S., Shi, X.H., Chai, C.L., Encina, A., Irar, S., Capellades, M., Fuguet, E., Torres, J.L., Rovira, P., Puigdomenech, P., Rigau, J., Grotewold, E., Gray, J., and Caparros-Ruiz, D. (2010). ZmMYB31 directly represses maize lignin genes and redirects the phenylpropanoid metabolic flux. Plant J. 64, 633-644.
Frampton, J., Gibson, T.J., Ness, S.A., Doderlein, G., and Graf, T. (1991). Proposed structure for the DNA-binding domain of the Myb oncoprotein based on model building and mutational analysis. Protein Eng. 4, 891-901.
Fu, D.L., Szucs, P., Yan, L.L., Helguera, M., Skinner, J.S., von Zitzewitz, J., Hayes, P.M., and Dubcovsky, J. (2005). Large deletions within the first intron in VRN-1 are associated with spring growth habit in barley and wheat. Mol. Genet. Genomics 273, 54-65.
Fukuzawa, M., Zhukovskaya, N.V., Yamada, Y., Araki, T., and Williams, J.G. (2006). Regulation of Dictyostelium prestalk-specific gene expression by a SHAQKY family MYB transcription factor. Development 133, 1715-1724.
Galis, I., Simek, P., Narisawa, T., Sasaki, M., Horiguchi, T., Fukuda, H., and Matsuoka, K. (2006). A novel R2R3 MYB transcription factor NtMYBJS1 is a methyl jasmonate-dependent regulator of phenylpropanoid-conjugate biosynthesis in tobacco. Plant J. 46, 573-592.
Gallagher, S.R. (1992). GUS protocols : using the GUS gene as a reporter of gene expression. (San Diego ; London: Academic Press).
Galuschka, C., Schindler, M., Bulow, L., and Hehl, R. (2007). AthaMap web tools for the analysis and identification of co-regulated genes. Nucleic Acids Res. 35, D857-D862.
Gaudet, J., and Mango, S.E. (2002). Regulation of organogenesis by the Caenorhabditis elegans FoxA protein PHA-4. Science 295, 821-825.
Gaudet, J., Muttumu, S., Horner, M., and Mango, S.E. (2004). Whole-genome analysis of temporal gene expression during foregut development. PLoS Biol. 2, 1828-1842.
Georlette, D., Ahn, S., MacAlpine, D.M., Cheung, E., Lewis, P.W., Beall, E.L., Bell, S.P., Speed, T., Manak, J.R., and Botchan, M.R. (2007). Genomic profiling and expression studies reveal both positive and negative activities for the Drosophila Myb-MuvB/dREAM complex in proliferating cells. Genes Dev. 21, 2880-2896.
Gertz, J., Riles, L., Turnbaugh, P., Ho, S.W., and Cohen, B.A. (2005). Discovery, validation, and genetic dissection of transcription factor binding sites by comparative and functional genomics. Genome Res. 15, 1145-1152.
Gewirtz, A.M., and Calabretta, B. (1988). A c-myb antisense oligodeoxynucleotide inhibits normal human hematopoiesis in vitro. Science 242, 1303-1306.
Gibon, Y., Pyl, E.-T., Sulpice, R., Lunn, J.E., Hoehne, M., Guenther, M., and Stitt, M. (2009). Adjustment of growth, starch turnover, protein content and central metabolism to a decrease of the carbon supply when Arabidopsis is grown in very short photoperiods. Plant Cell and Environment 32, 859-874.
Gibson, S.I. (2000). Plant sugar-response pathways. Part of a complex regulatory web. Plant Physiol. 124, 1532-1539.
Gibson, S.I. (2005). Control of plant development and gene expression by sugar signaling. Curr. Opin. Plant Biol. 8, 93-102.
Glover, B.J., Perez-Rodriguez, M., and Martin, C. (1998). Development of several epidermal cell types can be specified by the same MYB-related plant transcription factor. Development 125, 3497-3508.
Godoy, M., Franco-Zorrilla, J.M., Pérez-Pérez, J., Oliveros, J.C., Lorenzo, Ó., and Solano, R. (2011). Improved protein-binding microarrays for the identification of DNA-binding specificities of transcription factors. The Plant Journal 66, 700-711.
Goicoechea, M., Lacombe, E., Legay, S., Mihaljevic, S., Rech, P., Jauneau, A., Lapierre, C., Pollet, B., Verhaegen, D., Chaubet-Gigot, N., and Grima-Pettenati, J. (2005). EgMYB2, a new transcriptional activator from Eucalyptus xylem, regulates secondary cell wall formation and lignin biosynthesis. Plant J. 43, 553-567.
Golay, J., Capucci, A., Arsura, M., Castellano, M., Rizzo, V., and Introna, M. (1991). Expression of c-myb and B-myb, but not A-myb, correlates with proliferation in human hematopoietic cells. Blood 77, 149-158.
Gomez-Maldonado, J., Avila, C., de la Torre, F., Canas, R., Canovas, F.M., and Campbell, M.M. (2004). Functional interactions between a glutamine synthetase promoter and MYB proteins. Plant J. 39, 513-526.
Gong, W., He, K., Covington, M., Dinesh-Kumar, S.P., Snyder, M., Harmer, S.L., Zhu, Y.X., and Deng, X.W. (2008). The development of protein microarrays and their applications in DNA-protein and protein-protein interaction analyses of Arabidopsis transcription factors. Mol. Plant. 1, 27-41.
Graesser, F.A., Lamontagne, K., Whittaker, L., Stohr, S., and Lipsick, J.S. (1992). A highly conserved cysteine in the v-Myb DNA-binding domain is essential for transformation and transcriptional trans-activation. Oncogene 7, 1005-1009.
Graf, A., Schlereth, A., Stitt, M., and Smith, A.M. (2010). Circadian control of carbohydrate availability for growth in Arabidopsis plants at night. Proc. Natl. Acad. Sci. U. S. A. 107, 9458-9463.
Graham, I.A., Denby, K.J., and Leaver, C.J. (1994). Carbon catabolite repression regulates glyoxylate cycle gene expression in cucumber. Plant Cell 6, 761-772.
Grotewold, E., Drummond, B.J., Bowen, B., and Peterson, T. (1994). The myb-homologous P gene controls phlobaphene pigmentation in maize floral organs by directly activating a flavonoid biosynthetic gene subset. Cell 76, 543-553.
Grove, C.A., De Masi, F., Barrasa, M.I., Newburger, D.E., Alkema, M.J., Bulyk, M.L., and Walhout, A.J.M. (2009). A Multiparameter Network Reveals Extensive Divergence between C. elegans bHLH Transcription Factors. Cell 138, 314-327.
Gubler, F., Kalla, R., Roberts, J.K., and Jacobsen, J.V. (1995). Gibberellin-regulated expression of a myb gene in barley aleurone cells: evidence for Myb transactivation of a high-pI alpha-amylase gene promoter. Plant Cell 7, 1879-1891.
Guehmann, S., Vorbrueggen, G., Kalkbrenner, F., and Moelling, K. (1992). Reduction of a conserved Cys is essential for Myb DNA-binding. Nucleic Acids Res. 20, 2279-2286.
Halford, N.G., and Paul, M.J. (2003). Carbon metabolite sensing and signalling. Plant Biotechnology Journal 1, 381-398.
Hall, K.B., and Kranz, J.K. (2008). Nitrocellulose Filter Binding for Determination of Dissociation Constants. In RNA Protein Interaction Protocols (Humana Press), pp. 105-114.
HannaRose, W., and Hansen, U. (1996). Active repression mechanisms of eukaryotic transcription repressors. Trends Genet. 12, 229-234.
Hanson, J., and Smeekens, S. (2009). Sugar perception and signaling - an update. Curr. Opin. Plant Biol. 12, 562-567.
Hara, Y., Onishi, Y., Oishi, K., Miyazaki, K., Fukamizu, A., and Ishida, N. (2009). Molecular characterization of Mybbp1a as a co-repressor on the Period2 promoter. Nucleic Acids Res. 37, 1115-1126.
Haritatos, E., Medville, R., and Turgeon, R. (2000). Minor vein structure and sugar transport in Arabidopsis thaliana. Planta 211, 105-111.
Harlow, E., and Lane, D. (1988). Antibodies: A Laboratory Manual. (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press).
Hartmann, U., Sagasser, M., Mehrtens, F., Stracke, R., and Weisshaar, B. (2005). Differential combinatorial interactions of cis-acting elements recognized by R2R3-MYB, BZIP, and BHLH factors control light-responsive and tissue-specific activation of phenylpropanoid biosynthesis genes. Plant Mol.Biol. 57, 155-171.
Hatton, D., Sablowski, R., Yung, M.H., Smith, C., Schuch, W., and Bevan, M. (1995). Two classes of cis sequences contribute to tissue-specific expression of a PAL2 promoter in transgenic tobacco. Plant J. 7, 859-876.
Hauffe, K.D., Lee, S.P., Subramaniam, R., and Douglas, C.J. (1993). Combinatorial interactions between positive and negative cis-acting elements control spatial patterns of 4CL-1 expression in transgenic tobacco. Plant J. 4, 235-253.
Hebsgaard, S.M., Korning, P.G., Tolstrup, N., Engelbrecht, J., Rouze, P., and Brunak, S. (1996). Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information. Nucleic Acids Res. 24, 3439-3452.
Heine, G.F., Hernandez, J.M., and Grotewold, E. (2004). Two cysteines in plant R2R3 MYB domains participate in REDOX-dependent DNA binding. J. Biol. Chem. 279, 37878-37885.
Heine, G.F., Malik, V., Dias, A.P., and Grotewold, E. (2007). Expression and molecular characterization of ZmMYB-IF35 and related R2R3-MYB transcription factors. Mol. Biotechnol. 37, 155-164.
Hemming, J.D.C., and Lindroth, R.L. (1995). Intraspecific variation in aspen phytochemistry - effects on performance of gypsy moths and forest tent caterpillars. Oecologia 103, 79-88.
Hemming, J.D.C., and Lindroth, R.L. (1999). Effects of light and nutrient availability on aspen: Growth, phytochemistry, and insect performance. Journal of Chemical Ecology 25, 1687-1714.
Hernandez, J.M., Heine, G.F., Irani, N.G., Feller, A., Kim, M.G., Matulnik, T., Chandler, V.L., and Grotewold, E. (2004). Different mechanisms participate in the R-dependent activity of the R2R3 MYB transcription factor C1. J. Biol. Chem. 279, 48205-48213.
Hetherington, A.M., and Woodward, F.I. (2003). The role of stomata in sensing and driving environmental change. Nature 424, 901-908.
Hewel, J.A., Liu, J.A., Onishi, K., Fong, V., Chandran, S., Olsen, J.B., Pogoutse, O., Schutkowski, M., Wenschuh, H., Winkler, D.F.H., Eckler, L., Zandstra, P.W., and Emili, A. (2010). Synthetic Peptide Arrays for Pathway-Level Protein Monitoring by Liquid Chromatography-Tandem Mass Spectrometry. Mol. Cell. Proteomics 9, 2460-2473.
Higo, K., Ugawa, Y., Iwamoto, M., and Higo, H. (1998). PLACE: a database of plant cis-acting regulatory DNA elements. Nucleic Acids Res. 26, 358-359.
Hirayama, T., and Shinozaki, K. (1996). A cdc5(+) homolog of a higher plant, Arabidopsis thaliana. Proc. Natl. Acad. Sci. U. S. A. 93, 13371-13376.
Hoeren, F.U., Dolferus, R., Wu, Y.R., Peacock, W.J., and Dennis, E.S. (1998). Evidence for a role for AtMYB2 in the induction of the Arabidopsis alcohol dehydrogenase gene (ADH1) by low oxygen. Genetics 149, 479-490.
Holm, L., and Park, J. (2000). DaliLite workbench for protein structure comparison. Bioinformatics 16, 566-567.
Howe, K.M., and Watson, R.J. (1991). Nucleotide preferences in sequence-specific recognition of DNA by c-myb protein. Nucleic Acids Res. 19, 3913-3919.
Howe, K.M., Reakes, C.F.L., and Watson, R.J. (1990). Characterization of the sequence-specific interaction of mouse c-myb protein with DNA. Embo J. 9, 161-169.
Huang, R.P. (2003). Protein arrays, an excellent tool in biomedical research. Front. Biosci. 8, D559-D576.
Huang, Y.C., Su, L.H., Lee, G.A., Chiu, P.W., Cho, C.C., Wu, J.Y., and Sun, C.H. (2008). Regulation of Cyst Wall Protein Promoters by Myb2 in Giardia lamblia. J. Biol. Chem. 283, 31021-31029.
Hwang, M.G., Chung, I.K., Kang, B.G., and Cho, M.H. (2001). Sequence-specific binding property of Arabidopsis thaliana telomeric DNA binding protein 1 (AtTBP1). FEBS Lett. 503, 35-40.
Ito, M. (2005). Conservation and diversification of three-repeat Myb transcription factors in plants. J. Plant Res. 118, 61-69.
Ito, M., Iwase, M., Kodama, H., Lavisse, P., Komamine, A., Nishihama, R., Machida, Y., and Watanabe, A. (1998). A novel cis-acting element in promoters of plant B-type cyclin genes activates M phase-specific transcription. Plant Cell 10, 331-341.
Ito, M., Araki, S., Matsunaga, S., Itoh, T., Nishihama, R., Machida, Y., Doonan, J.H., and Watanabe, A. (2001). G2/M-phase-specific transcription during the plant cell cycle is mediated by c-Myb-like transcription factors. Plant Cell 13, 1891-1905.
Jackson, J., Ramsay, G., Sharkov, N.V., Lium, E., and Katzen, A.L. (2001). The role of transcriptional activation in the function of the Drosophila myb gene. Blood Cells Mol. Dis. 27, 446-455.
Jang, J.C., Leon, P., Zhou, L., and Sheen, J. (1997). Hexokinase as a sugar sensor in higher plants. Plant Cell 9, 5-19.
Jia, L., Clegg, M.T., and Jiang, T. (2004). Evolutionary dynamics of the DNA-binding domains in putative R2R3-MYB genes identified from rice subspecies indica and japonica genomes. Plant Physiol. 134, 575-585.
Jiang, C.H., Gu, J.Y., Chopra, S., Gu, X., and Peterson, T. (2004a). Ordered origin of the typical two- and three-repeat Myb genes. Gene 326, 13-22.
Jiang, C.Z., Gu, X., and Peterson, T. (2004b). Identification of conserved gene structures and carboxy-terminal motifs in the Myb gene family of Arabidopsis and Oryza sativa L. ssp indica. Genome Biol. 5, 11.
Jin, H.L., and Martin, C. (1999). Multifunctionality and diversity within the plant MYB-gene family. Plant Mol.Biol. 41, 577-585.
Jin, H.L., Cominelli, E., Bailey, P., Parr, A., Mehrtens, F., Jones, J., Tonelli, C., Weisshaar, B., and Martin, C. (2000). Transcriptional repression by AtMYB4 controls production of UV-protecting sunscreens in Arabidopsis. Embo J. 19, 6150-6161.
Joos, H.J., and Hahlbrock, K. (1992). Phenylalanine ammonia-lyase in potato (Solanum tuberosum L.) - genomic complexity, structural comparison of 2 selected genes and modes of expression. Eur. J. Biochem. 204, 621-629.
Kapranov, P., Routt, S.M., Bankaitis, V.A., de Bruijn, F.J., and Szczyglowski, K. (2001). Nodule-specific regulation of phosphatidylinositol transfer protein expression in Lotus japonicus. Plant Cell 13, 1369-1382.
Katoh, K., Kuma, K., Toh, H., and Miyata, T. (2005). MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511-518.
Kaul, S., Koo, H.L., Jenkins, J., Rizzo, M., Rooney, T., Tallon, L.J., Feldblyum, T., Nierman, W., Benito, M.I., Lin, X.Y., Town, C.D., Venter, J.C., Fraser, C.M., Tabata, S., Nakamura, Y., Kaneko, T., Sato, S., Asamizu, E., Kato, T., Kotani, H., Sasamoto, S., Ecker, J.R., Theologis, A., Federspiel, N.A., Palm, C.J., Osborne, B.I., Shinn, P., Conway, A.B., Vysotskaia, V.S., Dewar, K., Conn, L., Lenz, C.A., Kim, C.J., Hansen, N.F., Liu, S.X., Buehler, E., Altafi, H., Sakano, H., Dunn, P., Lam, B., Pham, P.K., Chao, Q., Nguyen, M., Yu, G.X., Chen, H.M., Southwick, A., Lee, J.M., Miranda, M., Toriumi, M.J., Davis, R.W., Wambutt, R., Murphy, G., Dusterhoft, A., Stiekema, W., Pohl, T., Entian, K.D., Terryn, N., Volckaert, G., Salanoubat, M., Choisne, N., Rieger, M., Ansorge, W., Unseld, M., Fartmann, B., Valle, G., Artiguenave, F., Weissenbach, J., Quetier, F., Wilson, R.K., de la Bastide, M., Sekhon, M., Huang, E., Spiegel, L., Gnoj, L., Pepin, K., Murray, J., Johnson, D., Habermann, K., Dedhia, N., Parnell, L., Preston, R., Hillier, L., Chen, E., Marra, M., Martienssen, R., McCombie, W.R., Mayer, K., White, O., Bevan, M., Lemcke, K., Creasy, T.H., Bielke, C., Haas, B., Haase, D., Maiti, R., Rudd, S., Peterson, J., Schoof, H., Frishman, D., Morgenstern, B., Zaccaria, P., Ermolaeva, M., Pertea, M., Quackenbush, J., Volfovsky, N., Wu, D.Y., Lowe, T.M., Salzberg, S.L., Mewes, H.W., Rounsley, S., Bush, D., Subramaniam, S.,
Levin, I., Norris, S., Schmidt, R., Acarkan, A., Bancroft, I., Brennicke, A., Eisen, J.A., Bureau, T., Legault, B.A., Le, Q.H., Agrawal, N., Yu, Z., Copenhaver, G.P., Luo, S., Pikaard, C.S., Preuss, D., Paulsen, I.T., Sussman, M., Britt, A.B., Selinger, D.A., Pandey, R., Mount, D.W., Chandler, V.L., Jorgensen, R.A., Pikaard, C., Juergens, G., Meyerowitz, E.M., Dangl, J., Jones, J.D.G., Chen, M., Chory, J., and Somerville, M.C. (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796-815.
Kim, K.N., and Guiltinan, M.J. (1999). Identification of cis-acting elements important for expression of the starch-branching enzyme I gene in maize endosperm. Plant Physiol. 121, 225-236.
Kim, M.J., Lee, T.H., Pahk, Y.M., Kim, Y.H., Park, H.M., Choi, Y.D., Nahm, B.H., and Kim, Y.K. (2009). Quadruple 9-mer-based protein binding microarray with DsRed fusion protein. BMC Mol. Biol. 10, 11.
Kirik, V., Simon, M., Huelskamp, M., and Schiefelbein, J. (2004). The ENHANCER OF TRY AND CPC1 gene acts redundantly with TRIPTYCHON and CAPRICE in trichome and root hair cell patterning in Arabidopsis. Dev. Biol. 268, 506-513.
Kislinger, T., Rahman, K., Radulovic, D., Cox, B., Rossant, J., and Emili, A. (2003). PRISM, a generic large scale proteomic investigation strategy for mammals. Mol. Cell. Proteomics 2, 96-106.
Klimczak, M., Kahl, W., and Grodzins.Z. (1972). Studies on phenolic acids, derivatives of cinnamic acid, in plants .1. phenolic acids in poplar (populus). Dissertationes Pharmaceuticae Et Pharmacologicae 24, 181-&.
Klug, A., and Schwabe, J.W.R. (1995). Protein motifs 5. Zinc fingers. Faseb J. 9, 597-604.
Ko, S., Yu, E.Y., Shin, J., Yoo, H.H., Tanaka, T., Kim, W.T., Cho, H.S., Lee, W., and Chung, I.K. (2009). Solution Structure of the DNA Binding Domain of Rice Telomere Binding Protein RTBP1. Biochemistry 48, 827-838.
Ko, S., Jun, S.H., Bae, H., Byun, J.S., Han, W., Park, H., Yang, S.W., Park, S.Y., Jeon, Y.H., Cheong, C., Kim, W.T., Lee, W., and Cho, H.S. (2008). Structure of the DNA-binding domain of NgTRF1 reveals unique features of plant telomere-binding proteins. Nucleic Acids Res. 36, 2739-2755.
Koch, K. (2004). Sucrose metabolism: regulatory mechanisms and pivotal roles in sugar sensing and plant development. Curr. Opin. Plant Biol. 7, 235-246.
Koch, K.E. (1996). Carbohydrate-modulated gene expression in plants. Annual Review of Plant Physiology and Plant Molecular Biology 47, 509-540.
Koering, C.E., Fourel, G., Binet-Brasselet, E., Laroche, T., Klein, F., and Gilson, E. (2000). Identification of high affinity Tbf1p-binding sites within the budding yeast genome. Nucleic Acids Res. 28, 2519-2526.
Konig, P., and Rhodes, D. (1997). Recognition of telomeric DNA. Trends Biochem.Sci. 22, 43-47.
Konig, P., Fairall, L., and Rhodes, D. (1998). Sequence-specific DNA recognition by the Myb-like domain of the human telomere binding protein TRF1: a model for the protein-DNA complex. Nucleic Acids Res. 26, 1731-1740.
Koshino-Kimura, Y., Wada, T., Tachibana, T., Tsugeki, R., Ishiguro, S., and Okada, K. (2005). Regulation of CAPRICE transcription by MYB proteins for root epidermis differentiation in Arabidopsis. Plant Cell Physiol. 46, 817-826.
Kranz, H., Scholz, K., and Weisshaar, B. (2000). c-MYB oncogene-like genes encoding three MYB repeats occur in all major plant lineages. Plant J. 21, 231-235.
Lacombe, E., Van Doorsselaere, J., Boerjan, W., Boudet, A.M., and Grima-Pettenati, J. (2000). Characterization of cis-elements required for vascular expression of the Cinnamoyl CoA Reductase gene and for protein-DNA complex formation. Plant J. 23, 663-676.
Lang, M., and Juan, E. (2010). Binding site number variation and high-affinity binding consensus of Myb-SANT-like transcription factor Adf-1 in Drosophilidae. Nucleic Acids Res. 38, 6404-6417.
Lascaris, R.F., Mager, W.H., and Planta, R.J. (1999). DNA-binding requirements of the yeast protein Rap1p as selected in silico from ribosomal protein gene promoter sequences. Bioinformatics 15, 267-277.
Lauvergeat, V., Rech, P., Jauneau, A., Guez, C., Coutos-Thevenot, P., and Grima-Pettenati, J. (2002). The vascular expression pattern directed by the Eucalyptus gunnii cinnamyl alcohol dehydrogenase EgCAD2 promoter is conserved among woody and herbaceous plant species. Plant Mol.Biol. 50, 497-509.
Le Hir, H., Nott, A., and Moore, M.J. (2003). How introns influence and enhance eukaryotic gene expression. Trends Biochem.Sci. 28, 215-220.
Lee, T.I., and Young, R.A. (2000). Transcription of eukaryotic protein-coding genes. Annual Review of Genetics 34, 77-137.
Legay, S., Lacombe, E., Goicoechea, M., Briere, C., Seguin, A., Mackay, J., and Grima-Pettenati, J. (2007). Molecular characterization of EgMYB1, a putative transcriptional repressor of the lignin biosynthetic pathway. Plant Sci. 173, 542-549.
Lepiniec, L., Debeaujon, I., Routaboul, J.-M., Baudry, A., Pourcel, L., Nesi, N., and Caboche, M. (2006). Genetics and biochemistry of seed flavonoids. In Annual Review of Plant Biology, pp. 405-430.
LeRoy, C.J., Whitham, T.G., Keim, P., and Marks, J.C. (2006). Plant genes link forests and streams. Ecology 87, 255-261.
Leyva, A., Liang, X.W., Pintortoro, J.A., Dixon, R.A., and Lamb, C.J. (1992). Cis-element combinations determine phenylalanine ammonia-lyase gene tissue-specific expression patterns. Plant Cell 4, 263-271.
Li, B.B., and de Lange, T. (2003). Rap1 affects the length and heterogeneity of human telomeres. Mol. Biol. Cell 14, 5060-5068.
Li, S.F., and Parish, R.W. (1995). Isolation of two novel myb-like genes from Arabidopsis and studies on the DNA-binding properties of their products. Plant J. 8, 963-972.
Liang, Y.-K., Xie, X., Lindsay, S.E., Wang, Y.B., Masle, J., Williamson, L., Leyser, O., and Hetherington, A.M. (2010). Cell wall composition contributes to the control of transpiration efficiency in Arabidopsis thaliana. Plant J. 64, 679-686.
Liang, Y.K., Dubos, C., Dodd, I.C., Holroyd, G.H., Hetherington, A.M., and Campbell, M.M. (2005). AtMYB61, an R2R3-MYB transcription factor controlling stomatal aperture in Arabidopsis thaliana. Curr. Biol. 15, 1201-1206.
Liao, Y., Zou, H.F., Wang, H.W., Zhang, W.K., Ma, B., Zhang, J.S., and Chen, S.Y. (2008). Soybean GmMYB76, GmMYB92, and GmMYB177 genes confer stress tolerance in transgenic Arabidopsis plants. Cell Res. 18, 1047-1060.
Lieb, J.D., Liu, X.L., Botstein, D., and Brown, P.O. (2001). Promoter-specific binding of Rap1 revealed by genome-wide maps of protein-DNA association. Nature Genet. 28, 327-334.
Lindroth, R.L., and Hwang, S.Y. (1996). Diversity, redundancy, and multiplicity in chemical defense systems of aspen. In Phytochemical Diversity and Redundancy in Ecological Interactions, J.T. Romeo, J.A. Saunders, and P. Barbosa, eds, pp. 25-56.
Linger, B.R., and Price, C.M. (2009). Conservation of telomere protein complexes: shuffling through evolution. Crit. Rev. Biochem. Mol. Biol. 44, 434-446.
Linnell, J., Mott, R., Field, S., Kwiatkowski, D.P., Ragoussis, J., and Udalova, I.A. (2004). Quantitative high-throughput analysis of transcription factor binding specificities. Nucleic Acids Res. 32, 7.
Lipsick, J.S. (1996). One billion years of Myb. Oncogene 13, 223-235.
Lira, C.B.B., Neto, J.L.D., Khater, L., Cagliari, T.C., Peroni, L.A., dos Reis, J.R.R., Ramos, C.H.I., and Cano, M.I.N. (2007). LaTBP1: A Leishmania amazonensis DNA-binding protein that associates in vivo with telomeres and GT-rich DNA using a Myb-like domain. Arch. Biochem. Biophys. 465, 399-409.
Liu, G.Y., Ren, G., Guirgis, A., and Thornburg, R.W. (2009). The MYB305 Transcription Factor Regulates Expression of Nectarin Genes in the Ornamental Tobacco Floral Nectary. Plant Cell 21, 2672-2687.
Logemann, E., Parniske, M., and Hahlbrock, K. (1995). Modes of expression and common structural features of the complete phenylalanine ammonia-lyase gene family in parsley. Proc. Natl. Acad. Sci. U. S. A. 92, 5905-5909.
Lois, R., Dietrich, A., Hahlbrock, K., and Schulz, W. (1989). A phenylalanine ammonia-lyase gene from parsley - structure, regulation and identification of elicitor and light responsive cis-acting elements. Embo J. 8, 1641-1648.
Loreti, E., Alpi, A., and Perata, P. (2000). Glucose and disaccharide-sensing mechanisms modulate the expression of alpha-amylase in barley embryos. Plant Physiol. 123, 939-948.
Lu, C.A., Ho, T.H.D., Ho, S.L., and Yu, S.M. (2002). Three novel MYB proteins with one DNA binding repeat mediate sugar and hormone regulation of alpha-amylase gene expression. Plant Cell 14, 1963-1980.
Luscher, B., and Eisenman, R.N. (1990). New light on Myc and Myb. Part I. Myc. Genes Dev. 4, 2025-2035.
Ma, X.P., and Calabretta, B. (1994). DNA binding and transactivation activity of A-myb, a c-myb-related gene. Cancer Res. 54, 6512-6516.
Maeda, K., Kimura, S., Demura, T., Takeda, J., and Ozeki, Y. (2005). DcMYB1 acts as a transcriptional activator of the carrot phenylalanine ammonia-lyase gene (DcPAL1) in response to elicitor treatment, UV-B irradiation and the dilution effect. Plant Mol.Biol. 59, 739-752.
Maniatis, T., Goodbourn, S., and Fischer, J.A. (1987). Regulation of inducible and tissue-specific gene expression. Science 236, 1237-1245.
Mardis, E.R. (2007). ChIP-seq: welcome to the new frontier. Nat. Methods 4, 613-614.
Marian, C.O., Bordoli, S.J., Goltz, M., Santarella, R.A., Jackson, L.P., Danilevskaya, O., Beckstette, M., Meeley, R., and Bass, H.W. (2003). The maize Single myb histone 1 gene, Smh1, belongs to a novel gene family and encodes a protein that binds telomere DNA repeats in vitro. Plant Physiol. 133, 1336-1350.
Martin, C., and PazAres, J. (1997). MYB transcription factors in plants. Trends Genet. 13, 67-73.
Martin, C., Bhatt, K., Baumann, K., Jin, H., Zachgo, S., Roberts, K., Schwarz-Sommer, Z., Glover, B., and Perez-Rodriguez, M. (2002). The mechanics of cell fate determination in petals. Philos. Trans. R. Soc. Lond. Ser. B-Biol. Sci. 357, 809-813.
Massie, C.E., and Mills, I.G. (2008). ChIPping away at gene regulation. EMBO Rep. 9, 337-343.
Matsumoto, B. (2002). Cell biological applications of confocal microscopy. (San Diego ; London: Academic Press).
Maxwell, B.B., Andersson, C.R., Poole, D.S., Kay, S.A., and Chory, J. (2003). HY5, Circadian Clock-Associated 1, and a cis-element, DET1 dark response element, mediate DET1 regulation of chlorophyll a/b-binding protein 2 expression. Plant Physiol. 133, 1565-1577.
McDonnell, A.V., Jiang, T., Keating, A.E., and Berger, B. (2006). Paircoil2: improved prediction of coiled coils from sequence. Bioinformatics 22, 356-358.
Mehrtens, F., Kranz, H., Bednarek, P., and Weisshaar, B. (2005). The Arabidopsis transcription factor MYB12 is a flavonol-specific regulator of phenylpropanoid biosynthesis. Plant Physiol. 138, 1083-1096.
Meijsing, S.H., Pufall, M.A., So, A.Y., Bates, D.L., Chen, L., and Yamamoto, K.R. (2009). DNA Binding Site Sequence Directs Glucocorticoid Receptor Structure and Activity. Science 324, 407-410.
Melcher, K. (2000). A modular set of prokaryotic and eukaryotic expression vectors. Anal. Biochem. 277, 109-120.
Mellway, R.D., Tran, L.T., Prouse, M.B., Campbell, M.M., and Constabel, C.P. (2009). The Wound-, Pathogen-, and Ultraviolet B-Responsive MYB134 Gene Encodes an R2R3 MYB Transcription Factor That Regulates Proanthocyanidin Synthesis in Poplar. Plant Physiol. 150, 924-941.
Mena, M., Cejudo, F.J., Isabel-Lamoneda, I., and Carbonero, P. (2002). A role for the DOF transcription factor BPBF in the regulation of gibberellin-responsive genes in barley aleurone. Plant Physiol. 130, 111-119.
Meneses, E., Cardenas, H., Zarate, S., Brieba, L.G., Orozco, E., Lopez-Camarillo, C., and Azuara-Liceaga, E. (2010). The R2R3 Myb protein family in Entamoeba histolytica. Gene 455, 32-42.
Miranda, M., Ralph, S.G., Mellway, R., White, R., Heath, M.C., Bohlmann, J., and Constabel, C.P. (2007). The transcriptional response of hybrid poplar (Populus trichocarpa x P. deltoides) to infection by Melampsora medusae leaf rust involves induction of flavonoid pathway genes leading to the accumulation of proanthocyanidins. Molecular Plant-Microbe Interactions 20, 816-831.
Mizuguchi, G., Nakagoshi, H., Nagase, T., Nomura, N., Date, T., Ueno, Y., and Ishii, S. (1990). DNA binding activity and transcriptional activator function of the human B-myb protein compared with c-MYB. J. Biol. Chem. 265, 9280-9284.
Mohrmann, L., Kal, A.J., and Verrijzer, C.P. (2002). Characterization of the extended Myb-like DNA-binding domain of trithorax group protein zeste. J. Biol. Chem. 277, 47385-47392.
Mol, J., Grotewold, E., and Koes, R. (1998). How genes paint flowers and seeds. Trends Plant Sci. 3, 212-217.
Moore, B., Zhou, L., Rolland, F., Hall, Q., Cheng, W.H., Liu, Y.X., Hwang, I., Jones, T., and Sheen, J. (2003). Role of the Arabidopsis glucose sensor HXK1 in nutrient, light, and hormonal signaling. Science 300, 332-336.
Morita, A., Umemura, T., Kuroyanagi, M., Futsuhara, Y., Perata, P., and Yamaguchi, J. (1998). Functional dissection of a sugar-repressed alpha-amylase gene (RAmylA) promoter in rice embryos. FEBS Lett. 423, 81-85.
Morohashi, K., and Grotewold, E. (2009). A Systems Approach Reveals Regulatory Circuitry for Arabidopsis Trichome Initiation by the GL3 and GL1 Selectors. PLoS Genet. 5, 17.
Morohashi, K., Casas, M.I., Falcone Ferreyra, L., Mejia-Guerra, M.K., Pourcel, L., Yilmaz, A., Feller, A., Carvalho, B., Emiliani, J., Rodriguez, E., Pellegrinet, S., McMullen, M., Casati, P., and Grotewold, E. (2012). A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24, 2745-2764.
Moyano, E., MartinezGarcia, J.F., and Martin, C. (1996). Apparent redundancy in Myb gene function provides gearing for the control of flavonoid biosynthesis in Antirrhinum flowers. Plant Cell 8, 1519-1532.
Mucenski, M.L., McLain, K., Kier, A.B., Swerdlow, S.H., Schreiner, C.M., Miller, T.A., Pietryga, D.W., Scott, W.J., and Potter, S.S. (1991). A functional c-myb gene is required for normal murine fetal hepatic hematopoiesis. Cell 65, 677-689.
Mukherjee, S., Berger, M.F., Jona, G., Wang, X.S., Muzzey, D., Snyder, M., Young, R.A., and Bulyk, M.L. (2004). Rapid analysis of the DNA-binding specificities of transcription factors with DNA microarrays. Nature Genet. 36, 1331-1339.
Muller, K.J., Romano, N., Gerstner, O., Garciamaroto, F., Pozzi, C., Salamini, F., and Rohde, W. (1995). The barley Hooded mutation caused by a duplication in a homeobox gene intron. Nature 374, 727-730.
Nakagoshi, H., Nagase, T., Kaneiishii, C., Ueno, Y., and Ishii, S. (1990). Binding of the c-myb proto-oncogene product to the simian virus 40 enhancer stimulates transcription. J. Biol. Chem. 265, 3479-3483.
Nesi, N., Jond, C., Debeaujon, I., Caboche, M., and Lepiniec, L. (2001). The Arabidopsis TT2 gene encodes an R2R3 MYB domain protein that acts as a key determinant for proanthocyanidin accumulation in developing seed. Plant Cell 13, 2099-2114.
Nesi, N., Debeaujon, I., Jond, C., Pelletier, G., Caboche, M., and Lepiniec, L. (2000). The TT8 gene encodes a basic helix-loop-helix domain protein required for expression of DFR and BAN genes in Arabidopsis siliques. Plant Cell 12, 1863-1878.
Newman, L.J., Perazza, D.E., Juda, L., and Campbell, M.M. (2004). Involvement of the R2R3-MYB, AtMYB61, in the ectopic lignification and dark-photomorphogenic components of the det3 mutant phenotype. Plant J. 37, 239-250.
Nishikawa, T., Okamura, H., Nagadoi, A., Konig, P., Rhodes, D., and Nishimura, Y. (2001). Solution structure of a telomeric DNA complex of human TRF1. Structure 9, 1237-1251.
Nomura, N., Takahashi, M., Matsui, M., Ishii, S., Date, T., Sasamoto, S., and Ishizaki, R. (1988). Isolation of human cDNA clones of myb-related genes, A-myb and B-myb. Nucleic Acids Res. 16, 11075-11089.
Oda, M., Furukawa, K., Sarai, A., and Nakamura, H. (1999). Kinetic analysis of DNA binding by the c-Myb DNA-binding domain using surface plasmon resonance. FEBS Lett. 454, 288-292.
Ogata, K., Kanai, H., Inoue, T., Sekikawa, A., Sasaki, M., Nagadoi, A., Sarai, A., Ishii, S., and Nishimura, Y. (1993). Solution structures of Myb DNA-binding domain and its complex with DNA. Nucleic acids symposium series, 201-202.
Ogata, K., Morikawa, S., Nakamura, H., Sekikawa, A., Inoue, T., Kanai, H., Sarai, A., Ishii, S., and Nishimura, Y. (1994). Solution structure of a specific DNA complex of the Myb DNA-binding domain with cooperative recognition helices. Cell 79, 639-648.
Ogata, K., Morikawa, S., Nakamura, H., Hojo, H., Yoshimura, S., Zhang, R.H., Aimoto, S., Ametani, Y., Hirata, Z., Sarai, A., Ishii, S., and Nishimura, Y. (1995). Comparison of the free and DNA-complexed forms of the DNA-binding domain from c-Myb. Nat. Struct. Biol. 2, 309-320.
Ong, S.J., Hsu, H.M., Liu, H.W., Chu, C.H., and Tai, J.H. (2006). Multifarious transcriptional regulation of adhesion protein gene ap65-1 by a novel Myb1 protein in the protozoan parasite Trichomonas vaginalis. Eukaryot. Cell 5, 391-399.
Ong, S.J., Hsu, H.M., Liu, H.W., Chu, C.H., and Tai, J.H. (2007). Activation of multifarious transcription of an adhesion protein ap65-1 gene by a novel Myb2 protein in the protozoan parasite Trichomonas vaginalis. J. Biol. Chem. 282, 6716-6725.
Oppenheimer, D.G., Herman, P.L., Sivakumaran, S., Esch, J., and Marks, M.D. (1991). A myb gene required for leaf trichome differentiation in Arabidopsis is expressed in stipules. Cell 67, 483-493.
Ording, E., Kvavik, W., Bostad, A., and Gabrielsen, O.S. (1994). Two functionally distinct half sites in the DNA-recognition sequence of the Myb oncoprotein. Eur. J. Biochem. 222, 113-120.
Osier, T.L., and Lindroth, R.L. (2001). Effects of genotype, nutrient availability, and defoliation on aspen phytochemistry and insect performance. Journal of Chemical Ecology 27, 1289-1313.
Osnato, M., Stile, M.R., Wang, Y.M., Meynard, D., Curiale, S., Guiderdoni, E., Liu, Y.X., Horner, D.S., Ouwerkerk, P.B.F., Pozzi, C., Muller, K.J., Salamini, F., and Rossini, L. (2010). Cross Talk between the KNOX and Ethylene Pathways Is Mediated by Intron-Binding Transcription Factors in Barley. Plant Physiol. 154, 1616-1632.
Osuna, D., Usadel, B., Morcuende, R., Gibon, Y., Blaesing, O.E., Hoehne, M., Guenther, M., Kamlage, B., Trethewey, R., Scheible, W.-R., and Stitt, M. (2007). Temporal responses of transcripts, enzyme activities and metabolites after adding sucrose to carbon-deprived Arabidopsis seedlings. Plant J. 49, 463-491.
Pabo, C.O., and Sauer, R.T. (1992). Transcription factors: structural families and principles of DNA recognition. Annu. Rev. Biochem. 61, 1053-1095.
Palo, R.T. (1984). Distribution of birch (Betula spp), willow (Salix spp), and poplar (Populus spp) secondary metabolites and their potential role as chemical defense against herbivores. Journal of Chemical Ecology 10, 499-520.
Patzlaff, A., Newman, L.J., Dubos, C., Whetten, R., Smith, C., McInnis, S., Bevan, M.W., Sederoff, R.R., and Campbell, M.M. (2003a). Characterisation of PtMYB1, an R2R3-MYB from pine xylem. Plant Mol. Biol. 53, 597-608.
Patzlaff, A., McInnis, S., Courtenay, A., Surman, C., Newman, L.J., Smith, C., Bevan, M.W., Mansfield, S., Whetten, R.W., Sederoff, R.R., and Campbell, M.M. (2003b). Characterisation of a pine MYB that regulates lignification. Plant J. 36, 743-754.
Pavletich, N.P., and Pabo, C.O. (1991). Zinc finger-DNA recognition: crystal structure of a Zif268-DNA complex at 2.1 Å. Science 252, 809-817.
Pavletich, N.P., and Pabo, C.O. (1993). Crystal structure of a five-finger GLI-DNA complex: new perspectives on zinc fingers. Science 261, 1701-1707.
Pazares, J., Ghosal, D., Wienand, U., Peterson, P.A., and Saedler, H. (1987). The regulatory c1 locus of Zea mays encodes a protein with homology to myb proto-oncogene products and with structural similarities to transcriptional activators. Embo J. 6, 3553-3558.
Pearl, I.A., and Darling, S.F. (1971). Studies on leaves of family Salicaceae. 16. Phenolic extractives of leaves of Populus balsamifera and of P. trichocarpa. Phytochemistry 10, 2844-&.
Pego, J.V., Kortstee, A.J., Huijser, G., and Smeekens, S.G.M. (2000). Photosynthesis, sugars and the regulation of gene expression. J. Exp. Bot. 51, 407-416.
Pelloux, J., Rusterucci, C., and Mellerowicz, E.J. (2007). New insights into pectin methylesterase structure and function. Trends Plant Sci. 12, 267-277.
Penfield, S., Meissner, R.C., Shoue, D.A., Carpita, N.C., and Bevan, M.W. (2001). MYB61 is required for mucilage deposition and extrusion in the Arabidopsis seed coat. Plant Cell 13, 2777-2791.
Peters, C.W.B., Sippel, A.E., Vingron, M., and Klempnauer, K.H. (1987). Drosophila and vertebrate myb proteins share two conserved regions, one of which functions as a DNA-binding domain. Embo J. 6, 3085-3090.
Peters, D.J., and Constabel, C.P. (2002). Molecular analysis of herbivore-induced condensed tannin synthesis: cloning and expression of dihydroflavonol reductase from trembling aspen (Populus tremuloides). Plant J. 32, 701-712.
Phan, H.A., Iacuone, S., Li, S.F., and Parish, R.W. (2011). The MYB80 Transcription Factor Is Required for Pollen Development and the Regulation of Tapetal Programmed Cell Death in Arabidopsis thaliana. Plant Cell 23, 2209-2224.
Pitt, C.W., Valente, L.P., Rhodes, D., and Simonsson, T. (2008). Identification and characterization of an essential telomeric repeat binding factor in fission yeast. J. Biol. Chem. 283, 2693-2701.
Prestridge, D.S. (1991). Signal scan - a computer-program that scans DNA-sequences for eukaryotic transcriptional elements. Computer Applications in the Biosciences 7, 203-206.
Prouse, M.B., and Campbell, M.M. (2012). The interaction between MYB proteins and their target DNA binding sites. Biochimica Et Biophysica Acta-Gene Regulatory Mechanisms 1819, 67-77.
Ptashne, M., and Gann, A. (1997). Transcriptional activation by recruitment. Nature 386, 569-577.
Punwani, J.A., Rabiger, D.S., and Drews, G.N. (2007). MYB98 positively regulates a battery of synergid-expressed genes encoding filiform apparatus-localized proteins. Plant Cell 19, 2557-2568.
Punwani, J.A., Rabiger, D.S., Lloyd, A., and Drews, G.N. (2008). The MYB98 subcircuit of the synergid gene regulatory network includes genes directly and indirectly regulated by MYB98. Plant J. 55, 406-414.
Ramirez, V., Agorio, A., Coego, A., Garcia-Andrade, J., Hernandez, M.J., Balaguer, B., Ouwerkerk, P.B.F., Zarra, I., and Vera, P. (2011). MYB46 Modulates Disease Susceptibility to Botrytis cinerea in Arabidopsis. Plant Physiol. 155, 1920-1935.
Ramsay, R.G., Ishii, S., and Gonda, T.J. (1991). Increase in specific DNA binding by carboxyl truncation suggests a mechanism for activation of Myb. Oncogene 6, 1875-1879.
Rawat, R., Schwartz, J., Jones, M.A., Sairanen, I., Cheng, Y.F., Andersson, C.R., Zhao, Y.D., Ljung, K., and Harmer, S.L. (2009). REVEILLE1, a Myb-like transcription factor, integrates the circadian clock and auxin pathways. Proc. Natl. Acad. Sci. U. S. A. 106, 16883-16888.
Riechmann, J.L., Heard, J., Martin, G., Reuber, L., Jiang, C.Z., Keddie, J., Adam, L., Pineda, O., Ratcliffe, O.J., Samaha, R.R., Creelman, R., Pilgrim, M., Broun, P., Zhang, J.Z., Ghandehari, D., Sherman, B.K., and Yu, C.L. (2000). Arabidopsis transcription factors: Genome-wide comparative analysis among eukaryotes. Science 290, 2105-2110.
Rippe, R.A., Lorenzen, S.I., Brenner, D.A., and Breindl, M. (1989). Regulatory elements in the 5'-flanking region and the 1st intron contribute to transcriptional control of the mouse alpha-1 type-i collagen gene. Mol. Cell. Biol. 9, 2224-2227.
Robertson, G., Hirst, M., Bainbridge, M., Bilenky, M., Zhao, Y.J., Zeng, T., Euskirchen, G., Bernier, B., Varhol, R., Delaney, A., Thiessen, N., Griffith, O.L., He, A., Marra, M., Snyder, M., and Jones, S. (2007). Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat. Methods 4, 651-657.
Roche, P.J., Hoare, S.A., and Parker, M.G. (1992). A consensus DNA-binding site for the androgen receptor. Mol. Endocrinol. 6, 2229-2235.
Rogers, L.A., and Campbell, M.M. (2004). The genetic control of lignin deposition during plant growth and development. New Phytol. 164, 17-30.
Rogers, L.A., Dubos, C., Cullis, I.F., Surman, C., Poole, M., Willment, J., Mansfield, S.D., and Campbell, M.M. (2005). Light, the circadian clock, and sugar perception in the control of lignin biosynthesis. J. Exp. Bot. 56, 1651-1663.
Rogg, L.E., and Bartel, B. (2001). Auxin signaling: Derepression through regulated proteolysis. Developmental Cell 1, 595-604.
Roldan, M., Gomez-Mena, C., Ruiz-Garcia, L., Salinas, J., and Martinez-Zapater, J.M. (1999). Sucrose availability on the aerial part of the plant promotes morphogenesis and flowering of Arabidopsis in the dark. Plant J. 20, 581-590.
Rolland, F., Moore, B., and Sheen, J. (2002). Sugar sensing and signaling in plants. Plant Cell 14, S185-S205.
Rolland, F., Baena-Gonzalez, E., and Sheen, J. (2006). Sugar sensing and signaling in plants: Conserved and novel mechanisms. In Annual Review of Plant Biology, pp. 675-709.
Romano, J.M., Dubos, C., Prouse, M.B., Wilkins, O., Hong, H., Poole, M., Kang, K.Y., Li, E.Y., Douglas, C.J., Western, T.L., Mansfield, S.D., and Campbell, M.M. (2012). AtMYB61, an R2R3-MYB transcription factor, functions as a pleiotropic regulator via a small gene network. New Phytol. 195, 774-786.
Romero, I., Fuertes, A., Benito, M.J., Malpica, J.M., Leyva, A., and Paz-Ares, J. (1998). More than 80 R2R3-MYB regulatory genes in the genome of Arabidopsis thaliana. Plant J. 14, 273-284.
Rose, A., Meier, I., and Wienand, U. (1999). The tomato I-box binding factor LeMYBI is a member of a novel class of Myb-like proteins. Plant J. 20, 641-652.
Rose, A.B. (2002). Requirements for intron-mediated enhancement of gene expression in Arabidopsis. RNA-Publ. RNA Soc. 8, 1444-1453.
Rose, A.B. (2008). Intron-Mediated Regulation of Gene Expression. Curr. Top. Microbiol. Immunol. 326, 277-290.
Rosinski, J.A., and Atchley, W.R. (1998). Molecular evolution of the Myb family of transcription factors: Evidence for polyphyletic origin. J. Mol. Evol. 46, 74-83.
Ruan, M.B., Liao, W.B., Zhang, X.C., Yu, X.L., and Peng, M. (2009). Analysis of the cotton sucrose synthase 3 (Sus3) promoter and first intron in transgenic Arabidopsis. Plant Sci. 176, 342-351.
Rushton, D.L., Tripathi, P., Rabara, R.C., Lin, J., Ringler, P., Boken, A.K., Langum, T.J., Smidt, L., Boomsma, D.D., Emme, N.J., Chen, X., Finer, J.J., Shen, Q.J., and Rushton, P.J. (2012). WRKY transcription factors: key components in abscisic acid signalling. Plant Biotechnology Journal 10, 2-11.
Ryu, K.H., Kang, Y.H., Park, Y.H., Hwang, D., Schiefelbein, J., and Lee, M.M. (2005). The WEREWOLF MYB protein directly regulates CAPRICE transcription during cell fate specification in the Arabidopsis root epidermis. Development 132, 4765-4775.
Sablowski, R.W.M., and Meyerowitz, E.M. (1998). A homolog of NO APICAL MERISTEM is an immediate target of the floral homeotic genes APETALA3/PISTILLATA. Cell 92, 93-103.
Sablowski, R.W.M., Baulcombe, D.C., and Bevan, M. (1995). Expression of a flower-specific Myb protein in leaf cells using a viral vector causes ectopic activation of a target promoter. Proc. Natl. Acad. Sci. U. S. A. 92, 6901-6905.
Sablowski, R.W.M., Moyano, E., Culianezmacia, F.A., Schuch, W., Martin, C., and Bevan, M. (1994). A flower-specific Myb protein activates transcription of phenylpropanoid biosynthetic genes. Embo J. 13, 128-137.
Saikumar, P., Murali, R., and Reddy, E.P. (1990). Role of tryptophan repeats and flanking amino acids in Myb-DNA interactions. Proc. Natl. Acad. Sci. U. S. A. 87, 8452-8456.
Sainz, M.B., Grotewold, E., and Chandler, V.L. (1997). Evidence for direct activation of an anthocyanin promoter by the maize C1 protein and comparison of DNA binding by related Myb domain proteins. Plant Cell 9, 611-625.
Sakura, H., Chie, K.I., Nagase, T., Nakagoshi, H., Gonda, T.J., and Ishii, S. (1989). Delineation of three functional domains of the transcriptional activator encoded by the c-myb protooncogene. Proc. Natl. Acad. Sci. U. S. A. 86, 5758-5762.
Sala, A., and Watson, R. (1999). B-Myb protein in cellular proliferation, transcription control, and cancer: Latest developments. J. Cell. Physiol. 179, 245-250.
Saleh, A., Alvarez-Venegas, R., and Avramova, Z. (2008). An efficient chromatin immunoprecipitation (ChIP) protocol for studying histone modifications in Arabidopsis plants. Nat. Protoc. 3, 1018-1025.
Samach, A., Onouchi, H., Gold, S.E., Ditta, G.S., Schwarz-Sommer, Z., Yanofsky, M.F., and Coupland, G. (2000). Distinct roles of CONSTANS target genes in reproductive development of Arabidopsis. Science 288, 1613-1616.
Santi, L., Wang, Y.M., Stile, M.R., Berendzen, K., Wanke, D., Roig, C., Pozzi, C., Muller, K., Muller, J., Rohde, W., and Salamini, F. (2003). The GA octodinucleotide repeat binding factor BBR participates in the transcriptional regulation of the homeobox gene Bkn3. Plant J. 34, 813-826.
Schaffer, R., Ramsay, N., Samach, A., Corden, S., Putterill, J., Carre, I.A., and Coupland, G. (1998). The late elongated hypocotyl mutation of Arabidopsis disrupts circadian rhythms and the photoperiodic control of flowering. Cell 93, 1219-1229.
Schellmann, S., Schnittger, A., Kirik, V., Wada, T., Okada, K., Beermann, A., Thumfahrt, J., Jurgens, G., and Hulskamp, M. (2002). TRIPTYCHON and CAPRICE mediate lateral inhibition during trichome and root hair patterning in Arabidopsis. Embo J. 21, 5036-5046.
Schmid, M., Davison, T.S., Henz, S.R., Pape, U.J., Demar, M., Vingron, M., Scholkopf, B., Weigel, D., and Lohmann, J.U. (2005). A gene expression map of Arabidopsis thaliana development. Nature Genet. 37, 501-506.
Schwartz, T., Rould, M.A., Lowenhaupt, K., Herbert, A., and Rich, A. (1999). Crystal structure of the Z alpha domain of the human editing enzyme ADAR1 bound to left-handed Z-DNA. Science 284, 1841-1845.
Schwechheimer, C., and Bevan, M. (1998). The regulation of transcription factor activity in plants. Trends Plant Sci. 3, 378-383.
Schweitzer, J.A., Bailey, J.K., Rehill, B.J., Martinsen, G.D., Hart, S.C., Lindroth, R.L., Keim, P., and Whitham, T.G. (2004). Genetically based trait in a dominant tree affects ecosystem processes. Ecology Letters 7, 127-134.
Seeliger, D., and de Groot, B.L. (2010). Ligand docking and binding site analysis with PyMOL and Autodock/Vina. J. Comput.-Aided Mol. Des. 24, 417-422.
Seguin, A., Laible, G., Leyva, A., Dixon, R.A., and Lamb, C.J. (1997). Characterization of a gene encoding a DNA-binding protein that interacts in vitro with vascular specific cis elements of the phenylalanine ammonia-lyase promoter. Plant Mol. Biol. 35, 281-291.
Seong, S.Y., and Choi, C.Y. (2003). Current status of protein chip development in terms of fabrication and application. Proteomics 3, 2176-2189.
Serpa, V., Vernal, J., Lamattina, L., Grotewold, E., Cassia, R., and Terenzi, H. (2007). Inhibition of AtMYB2 DNA-binding by nitric oxide involves cysteine S-nitrosylation. Biochem. Biophys. Res. Commun. 361, 1048-1053.
Sharma, S.B., and Dixon, R.A. (2005). Metabolic engineering of proanthocyanidins by ectopic expression of transcription factors in Arabidopsis thaliana. Plant J. 44, 62-75.
Shimazaki, K.-i., Doi, M., Assmann, S.M., and Kinoshita, T. (2007). Light regulation of stomatal movement. In Annual Review of Plant Biology, pp. 219-247.
Sinha, A.K., Hofmann, M.G., Romer, U., Kockenberger, W., Elling, L., and Roitsch, T. (2002). Metabolizable and non-metabolizable sugars activate different signal transduction pathways in tomato. Plant Physiol. 128, 1480-1489.
Smeekens, S. (2000). Sugar-induced signal transduction in plants. Annual Review of Plant Physiology and Plant Molecular Biology 51, 49-81.
Smith, A.M., and Stitt, M. (2007). Coordination of carbon supply and plant growth. Plant Cell and Environment 30, 1126-1149.
Solano, R., Nieto, C., and Pazares, J. (1995). MYB.Ph3 transcription factor from Petunia hybrida induces similar DNA-bending/distortions on its two types of binding site. Plant J. 8, 673-682.
Solano, R., Fuertes, A., SanchezPulido, L., Valencia, A., and PazAres, J. (1997). A single residue substitution causes a switch from the dual DNA binding specificity of plant transcription factor MYB.Ph3 to the animal c-MYB specificity. J. Biol. Chem. 272, 2889-2895.
Solfanelli, C., Poggi, A., Loreti, E., Alpi, A., and Perata, P. (2006). Sucrose-specific induction of the anthocyanin biosynthetic pathway in Arabidopsis. Plant Physiol. 140, 637-646.
Solomon, M.J., Larsen, P.L., and Varshavsky, A. (1988). Mapping protein-DNA interactions in vivo with formaldehyde: evidence that histone H4 is retained on a highly transcribed gene. Cell 53, 937-947.
Steffens, N.O., Galuschka, C., Schindler, M., Bulow, L., and Hehl, R. (2004). AthaMap: an online resource for in silico transcription factor binding sites in the Arabidopsis thaliana genome. Nucleic Acids Res. 32, D368-D372.
Steffens, N.O., Galuschka, C., Schindler, M., Bulow, L., and Hehl, R. (2005). AthaMap web tools for database-assisted identification of combinatorial cis-regulatory elements and the display of highly conserved transcription factor binding sites in Arabidopsis thaliana. Nucleic Acids Res. 33, W397-W402.
Stenman, G., Andersson, M.K., and Andren, Y. (2010). New tricks from an old oncogene: Gene fusion and copy number alterations of MYB in human cancer. Cell Cycle 9, 2986-2995.
Stevens, M.T., and Lindroth, R.L. (2005). Induced resistance in the indeterminate growth of aspen (Populus tremuloides). Oecologia 145, 298-306.
Stitt, M., Gibon, Y., Lunn, J.E., and Piques, M. (2007). Multilevel genomics analysis of carbon signalling during low carbon availability: coordinating the supply and utilisation of carbon in a fluctuating environment. Functional Plant Biology 34, 526-549.
Stobergrasser, U., Brydolf, B., Bin, X., Grasser, F., Firtel, R.A., and Lipsick, J.S. (1992). The Myb DNA-binding domain is highly conserved in Dictyostelium discoideum. Oncogene 7, 589-596.
Stracke, R., Werber, M., and Weisshaar, B. (2001). The R2R3-MYB gene family in Arabidopsis thaliana. Curr. Opin. Plant Biol. 4, 447-456.
Stracke, R., Ishihara, H., Barsch, G.H.A., Mehrtens, F., Niehaus, K., and Weisshaar, B. (2007). Differential regulation of closely related R2R3-MYB transcription factors controls flavonol accumulation in different parts of the Arabidopsis thaliana seedling. Plant J. 50, 660-677.
Sugimoto, K., Takeda, S., and Hirochika, H. (2000). MYB-related transcription factor NtMYB2 induced by wounding and elicitors is a regulator of the tobacco retrotransposon Tto1 and defense-related genes. Plant Cell 12, 2511-2527.
Sulpice, R., Pyl, E.-T., Ishihara, H., Trenkamp, S., Steinfath, M., Witucka-Wall, H., Gibon, Y., Usadel, B., Poree, F., Piques, M.C., Von Korff, M., Steinhauser, M.C., Keurentjes, J.J.B., Guenther, M., Hoehne, M., Selbig, J., Fernie, A.R., Altmann, T., and Stitt, M. (2009). Starch as a major integrator in the regulation of plant growth. Proc. Natl. Acad. Sci. U. S. A. 106, 10348-10353.
Sun, C.H., Palm, D., McArthur, A.G., Svard, S.G., and Gillin, F.D. (2002). A novel Myb-related protein involved in transcriptional activation of encystation genes in Giardia lamblia. Mol. Microbiol. 46, 971-984.
Suzuki, A., Wu, C.Y., Washida, H., and Takaiwa, F. (1998). Rice MYB protein OSMYB5 specifically binds to the AACA motif conserved among promoters of genes for storage protein glutelin. Plant Cell Physiol. 39, 555-559.
Tahirov, T.H., Sasaki, M., Inoue-Bungo, T., Fujikawa, A., Sato, K., Kumasaka, T., Yamamoto, M., and Ogata, K. (2001). Crystals of ternary protein-DNA complexes composed of DNA-binding domains of c-Myb or v-Myb, C/EBP alpha or C/EBP beta and tom-1A promoter fragment. Acta Crystallogr. Sect. D-Biol. Crystallogr. 57, 1655-1658.
Tahirov, T.H., Sato, K., Ichikawa-Iwata, E., Sasaki, M., Inoue-Bungo, T., Shiina, M., Kimura, K., Takata, S., Fujikawa, A., Morii, H., Kumasaka, T., Yamamoto, M., Ishii, S., and Ogata, K. (2002). Mechanism of c-Myb-C/EBP beta cooperation from separated sites on a promoter. Cell 108, 57-70.
Tallman, G. (2004). Are diurnal patterns of stomatal movement the result of alternating metabolism of endogenous guard cell ABA and accumulation of ABA delivered to the apoplast around guard cells by transpiration? J. Exp. Bot. 55, 1963-1976.
Tamagnone, L., Merida, A., Parr, A., Mackay, S., Culianez-Macia, F.A., Roberts, K., and Martin, C. (1998). The AmMYB308 and AmMYB330 transcription factors from Antirrhinum regulate phenylpropanoid and lignin biosynthesis in transgenic tobacco. Plant Cell 10, 135-154.
Tamura, K., Dudley, J., Nei, M., and Kumar, S. (2007). MEGA4: Molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24, 1596-1599.
Tanikawa, J., Yasukawa, T., Enari, M., Ogata, K., Nishimura, Y., Ishii, S., and Sarai, A. (1993). Recognition of specific DNA sequences by the c-myb protooncogene product: role of three repeat units in the DNA-binding domain. Proc. Natl. Acad. Sci. U. S. A. 90, 9320-9324.
Telfer, A., Bollman, K.M., and Poethig, R.S. (1997). Phase change and the regulation of trichome distribution in Arabidopsis thaliana. Development 124, 645-654.
Tiessen, A., Prescha, K., Branscheid, A., Palacios, N., McKibbin, R., Halford, N.G., and Geigenberger, P. (2003). Evidence that SNF1-related kinase and hexokinase are involved in separate sugar-signalling pathways modulating post-translational redox activation of ADP-glucose pyrophosphorylase in potato tubers. Plant J. 35, 490-500.
Toufighi, K., Brady, S.M., Austin, R., Ly, E., and Provart, N.J. (2005). The Botany Array Resource: e-Northerns, Expression Angling, and Promoter analyses. Plant J. 43, 153-163.
Treisman, R., Marais, R., and Wynne, J. (1992). Spatial flexibility in ternary complexes between SRF and its accessory proteins. Embo J. 11, 4631-4640.
Tsai, C.-J., Harding, S.A., Tschaplinski, T.J., Lindroth, R.L., and Yuan, Y. (2006). Genome-wide analysis of the structural genes regulating defense phenylpropanoid metabolism in Populus. New Phytol. 172, 47-62.
Uimari, A., and Strommer, J. (1997). Myb26: a MYB-like protein of pea flowers with affinity for promoters of phenylpropanoid genes. Plant J. 12, 1273-1284.
Urao, T., Yamaguchishinozaki, K., Urao, S., and Shinozaki, K. (1993). An Arabidopsis myb homolog is induced by dehydration stress and its gene product binds to the conserved MYB recognition sequence. Plant Cell 5, 1529-1539.
Usadel, B., Blaesing, O.E., Gibon, Y., Retzlaff, K., Hoehne, M., Guenther, M., and Stitt, M. (2008). Global transcript levels respond to small changes of the carbon status during progressive exhaustion of carbohydrates in Arabidopsis rosettes. Plant Physiol. 146, 1834-1861.
Vakoc, C.R., Letting, D.L., Gheldof, N., Sawado, T., Bender, M.A., Groudine, M., Weiss, M.J., Dekker, J., and Blobel, G.A. (2005). Proximity among distant regulatory elements at the beta-globin locus requires GATA-1 and FOG-1. Mol. Cell 17, 453-462.
Vavouri, T., and Elgar, G. (2005). Prediction of cis-regulatory elements using binding site matrices - the successes, the failures and the reasons for both. Curr. Opin. Genet. Dev. 15, 395-402.
Verrijdt, G., Haelens, A., and Claessens, F. (2003). Selective DNA recognition by the androgen receptor as a mechanism for hormone-specific regulation of gene expression. Mol. Genet. Metab. 78, 175-185.
Vicente, C., Conchillo, A., Pauwels, D., Vazquez, I., Garcia-Orti, L., Calasanz, M.J., Lahortiga, I., Cools, J., and Odero, M.D. (2009). MYB Overexpression Is Directly Involved in Acute Myeloid Leukemia Pathogenesis and Could Constitute a New Therapeutic Target for Patients with Aberrant Expression of This Gene. Blood 114, 948-948.
Wagner, D., Sablowski, R.W.M., and Meyerowitz, E.M. (1999). Transcriptional activation of APETALA1 by LEAFY. Science 285, 582-584.
Wang, Q.F., Lauring, J., and Schlissel, M.S. (2000). c-Myb binds to a sequence in the proximal region of the RAG-2 promoter and is essential for promoter activity in T-lineage cells. Mol. Cell. Biol. 20, 9203-9211.
Wang, S., Wang, J.W., Yu, N., Li, C.H., Luo, B., Gou, J.Y., Wang, L.J., and Chen, X.Y. (2004). Control of plant trichome development by a cotton fiber MYB gene. Plant Cell 16, 2323-2334.
Wang, Z.Y., and Tobin, E.M. (1998). Constitutive expression of the CIRCADIAN CLOCK ASSOCIATED 1 (CCA1) gene disrupts circadian rhythms and suppresses its own expression. Cell 93, 1207-1217.
Watson, R.J., Robinson, C., and Lam, E.W.F. (1993). Transcription regulation by murine B-myb is distinct from that by c-myb. Nucleic Acids Res. 21, 267-272.
Weisshaar, B., and Jenkins, G.I. (1998). Phenylpropanoid biosynthesis and its regulation. Curr. Opin. Plant Biol. 1, 251-257.
Weston, K. (1992). Extension of the DNA binding consensus of the chicken c-Myb and v-Myb proteins. Nucleic Acids Res. 20, 3042-3049.
Whitham, T.G., Bailey, J.K., Schweitzer, J.A., Shuster, S.M., Bangert, R.K., LeRoy, C.J., Lonsdorf, E.V., Allan, G.J., DiFazio, S.P., Potts, B.M., Fischer, D.G., Gehring, C.A., Lindroth, R.L., Marks, J.C., Hart, S.C., Wimp, G.M., and Wooley, S.C. (2006). A framework for community and ecosystem genetics: from genes to ecosystems. Nature Reviews Genetics 7, 510-523.
Wilkins, O., Nahal, H., Foong, J., Provart, N.J., and Campbell, M.M. (2009). Expansion and Diversification of the Populus R2R3-MYB Family of Transcription Factors. Plant Physiol. 149, 981-993.
Williams, C.E., and Grotewold, E. (1997). Differences between plant and animal myb domains are fundamental for DNA binding activity, and chimeric Myb domains have novel DNA binding specificities. J. Biol. Chem. 272, 563-571.
Winkel-Shirley, B. (2001). Flavonoid biosynthesis. A colorful model for genetics, biochemistry, cell biology, and biotechnology. Plant Physiol. 126, 485-493.
Wong, M.W., Henry, R.W., Ma, B.C., Kobayashi, R., Klages, N., Matthias, P., Strubin, M., and Hernandez, N. (1998). The large subunit of basal transcription factor SNAP(C) is a Myb domain protein that interacts with Oct-1. Mol. Cell. Biol. 18, 368-377.
Wright, W.E., Binder, M., and Funk, W. (1991). Cyclic amplification and selection of targets (CASTing) for the myogenin consensus binding site. Mol. Cell. Biol. 11, 4104-4110.
Xiang, Q.J., and Judelson, H.S. (2010). Myb transcription factors in the oomycete Phytophthora with novel diversified DNA-binding domains and developmental stage-specific expression. Gene 453, 1-8.
Xiao, W.Y., Sheen, J., and Jang, J.C. (2000). The role of hexokinase in plant sugar signal transduction and growth and development. Plant Mol. Biol. 44, 451-461.
Xie, D.Y., and Dixon, R.A. (2005). Proanthocyanidin biosynthesis - still more questions than answers? Phytochemistry 66, 2127-2144.
Xie, Z.D., Lee, E., Lucas, J.R., Morohashi, K., Li, D.M., Murray, J.A.H., Sack, F.D., and Grotewold, E. (2010). Regulation of Cell Proliferation in the Stomatal Lineage by the Arabidopsis MYB FOUR LIPS via Direct Targeting of Core Cell Cycle Genes. Plant Cell 22, 2306-2321.
Xue, G.P. (2005). A CELD-fusion method for rapid determination of the DNA-binding sequence specificity of novel plant DNA-binding proteins. Plant J. 41, 638-649.
Yang, H., Chung, H.J., Yong, T., Lee, B.H., and Park, S. (2003a). Identification of an encystation-specific transcription factor, Myb protein in Giardia lamblia. Mol. Biochem. Parasitol. 128, 167-174.
Yang, S.C., Sweetman, J.P., Amirsadeghi, S., Barghchi, M., Huttly, A.K., Chung, W.I., and Twell, D. (2001). Novel anther-specific myb genes from tobacco as putative regulators of phenylalanine ammonia-lyase expression. Plant Physiol. 126, 1738-1753.
Yang, T., Perasso, R., and Baroin-Tourancheau, A. (2003b). Myb genes in ciliates: A common origin with the myb protooncogene? Protist 154, 229-238.
Yang, Y.O., and Klessig, D.F. (1996). Isolation and characterization of a tobacco mosaic virus-inducible myb oncogene homolog from tobacco. Proc. Natl. Acad. Sci. U. S. A. 93, 14972-14977.
Yanhui, C., Xiaoyuan, Y., Kun, H., Meihua, L., Jigang, L., Zhaofeng, G., Zhiqiang, L., Yunfei, Z., Xiaoxiao, W., Xiaoming, Q., Yunping, S., Li, Z., Xiaohui, D., Jingchu, L., Xing-Wang, D., Zhangliang, C., Hongya, G., and Li-Jia, Q. (2006). The MYB transcription factor superfamily of Arabidopsis: expression analysis and phylogenetic comparison with the rice MYB family. Plant Mol. Biol. 60, 107-124.
Yi, J.X., Derynck, M.R., Li, X.Y., Telmer, P., Marsolais, F., and Dhaubhadel, S. (2010). A single-repeat MYB transcription factor, GmMYB176, regulates CHS8 gene expression and affects isoflavonoid biosynthesis in soybean. Plant J. 62, 1019-1034.
Yu, E.Y., Yen, W.F., Steinberg-Neifach, O., and Lue, N.F. (2010). Rap1 in Candida albicans: an Unusual Structural Organization and a Critical Function in Suppressing Telomere Recombination. Mol. Cell. Biol. 30, 1254-1268.
Yu, O., and McGonigle, B. (2005). Metabolic engineering of isoflavone biosynthesis. In Advances in Agronomy, Volume 86, D.L. Sparks, ed, pp. 147-190.
Zheng, Y.M., Ren, N., Wang, H., Stromberg, A.J., and Perry, S.E. (2009). Global Identification of Targets of the Arabidopsis MADS Domain Protein AGAMOUS-Like15. Plant Cell 21, 2563-2577.
Zhong, M., Niu, W., Lu, Z.J., Sarov, M., Murray, J.I., Janette, J., Raha, D., Sheaffer, K.L., Lam, H.Y.K., Preston, E., Slightham, C., Hillier, L.W., Brock, T., Agarwal, A., Auerbach, R., Hyman, A.A., Gerstein, M., Mango, S.E., Kim, S.K., Waterston, R.H., Reinke, V., and Snyder, M. (2010). Genome-Wide Identification of Binding Sites Defines Distinct Functions for Caenorhabditis elegans PHA-4/FOXA in Development and Environmental Response. PLoS Genet. 6, 13.
Zhong, R., Richardson, E.A., and Ye, Z.-H. (2007). The MYB46 transcription factor is a direct target of SND1 and regulates secondary wall biosynthesis in Arabidopsis. Plant Cell 19, 2776-2792.
Zhong, R., Lee, C., Zhou, J., McCarthy, R.L., and Ye, Z.-H. (2008). A Battery of Transcription Factors Involved in the Regulation of Secondary Cell Wall Biosynthesis in Arabidopsis. Plant Cell 20, 2763-2782.
Zhou, J.L., Lee, C.H., Zhong, R.Q., and Ye, Z.H. (2009). MYB58 and MYB63 Are Transcriptional Activators of the Lignin Biosynthetic Pathway during Secondary Cell Wall Formation in Arabidopsis. Plant Cell 21, 248-266.
Zhou, L., Jang, J.C., Jones, T.L., and Sheen, J. (1998). Glucose and ethylene signal transduction crosstalk revealed by an Arabidopsis glucose-insensitive mutant. Proc. Natl. Acad. Sci. U. S. A. 95, 10294-10299.
Zhu, Z., An, F., Feng, Y., Li, P., Xue, L., Mu, A., Jiang, Z., Kim, J.-M., To, T.K., Li, W., Zhang, X., Yu, Q., Dong, Z., Chen, W.-Q., Seki, M., Zhou, J.-M., and Guo, H. (2011). Derepression of ethylene-stabilized transcription factors (EIN3/EIL1) mediates jasmonate and ethylene signaling synergy in Arabidopsis. Proc. Natl. Acad. Sci. U. S. A. 108, 12539-12544.
Zimmermann, I.M., Heim, M.A., Weisshaar, B., and Uhrig, J.F. (2004). Comprehensive identification of Arabidopsis thaliana MYB transcription factors interacting with R/B-like BHLH proteins. Plant J. 40, 22-34.
Copyright Acknowledgements
Statement of Publications
The research presented in this thesis has appeared or has been submitted as a series
of original publications in refereed journals.
Chapter 1
Prouse, M.B., and Campbell, M.M. (2012) The interaction between MYB proteins and
their target DNA binding sites. Biochimica Et Biophysica Acta-Gene Regulatory
Mechanisms. 1819: 67-77.
Chapter 2
Romano, J.M., Dubos, C., Prouse, M.B., Wilkins, O., Hong, H., Poole, M., Kang, K.Y., Li, E.Y.,
Douglas, C.J., Western, T.L., Mansfield, S.D., and Campbell, M.M. (2012) AtMYB61, an
R2R3-MYB transcription factor, is a pleiotropic regulator of plant carbon acquisition and
resource allocation. New Phytologist. 195: 774-786.
Chapter 3
Prouse, M.B., and Campbell, M.M. (2013) Interactions between the R2R3-MYB
transcription factor, AtMYB61, and target DNA binding sites. PLOS ONE. 8(5): e65132.
Appendix
Mellway, R.D., Tran, L.T., Prouse, M.B., Campbell, M.M., and Constabel, C.P. (2009)
The Wound-, Pathogen-, and Ultraviolet B-Responsive MYB134 Gene Encodes an
R2R3 MYB Transcription Factor That Regulates Proanthocyanidin Synthesis in Poplar.
Plant Physiology. 150: 924-941.