Biochemistry 2/e - Garrett & Grisham
Copyright © 1999 by Harcourt Brace & Company
Chapter 31
Transcription and Regulation of Gene Expression
to accompany
Biochemistry, 2/e
by
Reginald Garrett and Charles Grisham
All rights reserved. Requests for permission to make copies of any part of the work should be mailed to: Permissions Department, Harcourt Brace & Company, 6277 Sea Harbor Drive, Orlando, Florida 32887-6777
Outline
• 31.1 Transcription in Prokaryotes
• 31.2 Transcription in Eukaryotes
• 31.3 Regulation of Transcription in Prokaryotes
• 31.4 Transcription Regulation in Eukaryotes
• 31.5 Structural Motifs in DNA-Binding Proteins
• 31.6 Post-Transcriptional Processing of mRNA
The Postulate of Jacob and Monod
• Before it had been characterized in a molecular sense, messenger RNA was postulated to exist by F. Jacob and J. Monod.
• Their four properties:
– base composition that reflects DNA
– heterogeneous with respect to mass
– able to associate with ribosomes
– high rate of turnover
Other Forms of RNA
rRNA and tRNA only appreciated later
• All three forms participate in protein synthesis
• All made by DNA-dependent RNA polymerases
• This process is called transcription
• Not all genes encode proteins! Some encode rRNAs or tRNAs
• Transcription is tightly regulated. Only 0.01% of genes in a typical eukaryotic cell are undergoing transcription at any given moment
• How many proteins is that???
Transcription in Prokaryotes
Only a single RNA polymerase
• In E. coli, RNA polymerase is a 465 kD complex, with 2 α, 1 β, 1 β', 1 σ
• β' binds DNA
• β binds NTPs and interacts with σ
• σ recognizes promoter sequences on DNA
• α subunits appear to be essential for assembly and for activation of enzyme by regulatory proteins
Stages of Transcription
See Figure 31.2
• binding of RNA polymerase holoenzyme at promoter sites
• initiation of polymerization
• chain elongation
• chain termination
Binding of polymerase to Template DNA
• Polymerase binds nonspecifically to DNA with low affinity and migrates, looking for promoter
• Sigma subunit recognizes promoter sequence
• RNA polymerase holoenzyme and promoter form "closed promoter complex" (DNA not unwound) - Kd = 10^-6 to 10^-9 M
• Polymerase unwinds about 12 base pairs to form "open promoter complex" - Kd = 10^-14 M
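To get a feel for what these dissociation constants mean energetically, the sketch below converts a Kd into a standard free energy of dissociation using ΔG° = -RT ln(Kd). The temperature of 37 °C and the function name are assumptions made for illustration; they are not taken from the slides.

```python
import math

R = 8.314        # gas constant, J/(mol*K)
T_ASSUMED = 310.15  # assumed temperature (37 C), not stated in the slides

def dissociation_free_energy(kd_molar, temperature=T_ASSUMED):
    """Standard free energy of dissociation in kJ/mol: dG = -RT ln(Kd)."""
    return -R * temperature * math.log(kd_molar) / 1000.0

# Closed complex (Kd = 1e-6 to 1e-9 M) versus open complex (Kd = 1e-14 M)
for label, kd in [("closed, weak", 1e-6), ("closed, tight", 1e-9), ("open", 1e-14)]:
    print(f"{label:14s} Kd = {kd:.0e} M  dG = {dissociation_free_energy(kd):.1f} kJ/mol")
```

The smaller the Kd, the more positive the free energy of dissociation, so the open complex is held far more tightly than the closed one.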
Properties of Promoters
See Figure 31.3
• Promoters typically consist of 40 bp region on the 5'-side of the transcription start site
• Two consensus sequence elements:
• The "-35 region", with consensus TTGACA - sigma subunit appears to bind here
• The Pribnow box near -10, with consensus TATAAT - this region is ideal for unwinding - why?
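The two consensus elements above can be searched for in a sequence with a few lines of code. This is only a sketch: the allowed spacer of 16 to 19 bp between the two hexamers and the one-mismatch tolerance are assumptions chosen for illustration, and the function names are hypothetical.

```python
MINUS_35 = "TTGACA"  # -35 region consensus from the slide
MINUS_10 = "TATAAT"  # Pribnow box consensus from the slide

def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_promoter_candidates(seq, max_mismatch=1, spacer_range=(16, 19)):
    """Return (start of -35 hexamer, start of -10 hexamer) index pairs.

    spacer_range is an assumed spacing between the hexamers, not a value
    given in the slides.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            j = i + 6 + spacer
            box = seq[j:j + 6]
            if len(box) == 6 and mismatches(box, MINUS_10) <= max_mismatch:
                hits.append((i, j))
    return hits

# Toy sequence built for the example, not a real promoter
toy = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GGGG"
print(find_promoter_candidates(toy))
```

Note that the Pribnow box is all A:T pairs, which is a hint toward the "why?" in the last bullet.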
Initiation of Polymerization
• RNA polymerase has two binding sites for NTPs
• Initiation site prefers to bind ATP and GTP (most RNAs begin with a purine at 5'-end)
• Elongation site binds the second incoming NTP
• 3'-OH of first attacks alpha-P of second to form a new phosphoester bond (eliminating PPi)
• When 6-10 unit oligonucleotide has been made, sigma subunit dissociates, completing "initiation"
• Note rifamycin and rifampicin and their different modes of action (Fig. 31.4 and related text)
Chain Elongation
Core polymerase - no sigma
• Polymerase is accurate - only about 1 error in 10,000 bases
• Even this error rate is OK, since many transcripts are made from each gene
• Elongation rate is 20-50 bases per second - slower in G/C-rich regions (why??) and faster elsewhere
• Topoisomerases precede and follow polymerase to relieve supercoiling
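The numbers above lend themselves to a quick back-of-the-envelope calculation. The sketch below uses the slide's error rate (1 in 10,000) and elongation rates (20-50 bases per second); the 1,000-nucleotide transcript length is a hypothetical value chosen only for the example.

```python
def transcription_time_seconds(length_nt, rate_nt_per_s):
    """Time needed to elongate a transcript of the given length."""
    return length_nt / rate_nt_per_s

def expected_errors(length_nt, error_rate=1e-4):
    """Mean number of misincorporated bases per transcript."""
    return length_nt * error_rate

def fraction_error_free(length_nt, error_rate=1e-4):
    """Probability that a transcript contains no errors at all."""
    return (1.0 - error_rate) ** length_nt

length = 1000  # hypothetical transcript length in nucleotides
print("time at 50 nt/s:", transcription_time_seconds(length, 50), "s")
print("time at 20 nt/s:", transcription_time_seconds(length, 20), "s")
print("expected errors:", expected_errors(length))
print("error-free fraction:", round(fraction_error_free(length), 3))
```

Under these assumptions most transcripts of a gene carry no error, which is why the error rate is tolerable when many copies are made.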
Chain Termination
Two mechanisms
• Rho - the termination factor protein
– rho is an ATP-dependent helicase
– it moves along RNA transcript, finds the "bubble", unwinds it and releases RNA chain
• Specific sequences - termination sites in DNA
– inverted repeat, rich in G:C, which forms a stem-loop in RNA transcript
– 6-8 As in DNA coding for Us in transcript
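The sequence-dependent mechanism can be illustrated with a naive search for a G:C-rich inverted repeat followed by a run of Us in the transcript. This is a sketch under stated assumptions: the stem length (6), loop length (3 to 8), and minimum G+C fraction (0.6) are illustrative parameters that do not come from the slides, and only perfect Watson-Crick stems are considered.

```python
def reverse_complement_rna(s):
    """Reverse complement of an RNA string (A, C, G, U only)."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(s))

def find_intrinsic_terminators(rna, stem_len=6, loop_range=(3, 8),
                               min_gc=0.6, min_u=6):
    """Return (stem start index, loop length) for each candidate terminator.

    All numeric defaults except min_u (6 Us, per the slide) are assumptions.
    """
    rna = rna.upper()
    hits = []
    last_start = len(rna) - (2 * stem_len + loop_range[0] + min_u)
    for i in range(last_start + 1):
        left = rna[i:i + stem_len]
        gc = (left.count("G") + left.count("C")) / stem_len
        if gc < min_gc:
            continue
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem_len + loop
            right = rna[j:j + stem_len]
            tail = rna[j + stem_len:j + stem_len + min_u]
            if right == reverse_complement_rna(left) and tail == "U" * min_u:
                hits.append((i, loop))
    return hits

# Toy transcript built for the example: stem, 4-nt loop, stem, U run
toy = "CC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUUUU" + "A"
print(find_intrinsic_terminators(toy))
```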
Transcription in Eukaryotes
• RNA polymerases I, II and III transcribe rRNA, mRNA and tRNA genes, respectively
• Pol III transcribes a few other RNAs as well
• All 3 are big, multimeric proteins (500-700 kD)
• All have 2 large subunits with sequences similar to β and β' in E. coli RNA polymerase, so catalytic site may be conserved
• Pol II is most sensitive to α-amanitin, an octapeptide from Amanita phalloides (the "death cap" mushroom)
Transcription Factors
More on this later, but a short note now
• The three polymerases (I, II and III) interact with their promoters via so-called transcription factors
• Transcription factors recognize and initiate transcription at specific promoter sequences
• Some transcription factors (TFIIIA and TFIIIC for RNA polymerase III) bind to specific recognition sequences within the coding region
RNA Polymerase II
Most interesting because it carries out the regulated synthesis of mRNA
• Yeast Pol II consists of 10 different peptides (RPB1 - RPB10)
• RPB1 and RPB2 are homologous to E. coli RNA polymerase β' and β, respectively
• RPB1 has DNA-binding site; RPB2 binds NTP
• RPB1 has C-terminal domain (CTD) of PTSPSYS repeats
• 5 of these 7 residues have -OH, so this is a hydrophilic and phosphorylatable site
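The "5 of 7" count can be checked directly from the heptad sequence. A minimal sketch, where the set of hydroxyl-bearing residues (Ser, Thr, Tyr) is standard amino acid chemistry and the function name is made up for the example:

```python
HYDROXYL_RESIDUES = set("STY")  # Ser, Thr and Tyr carry side-chain -OH groups

def hydroxyl_positions(heptad="PTSPSYS"):
    """1-based positions in the heptad whose side chains carry an -OH."""
    return [i + 1 for i, aa in enumerate(heptad) if aa in HYDROXYL_RESIDUES]

positions = hydroxyl_positions()
print(positions, "->", len(positions), "of 7 residues can be phosphorylated")
```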
More RNA Polymerase II
• CTD is essential and this domain may project away from the globular portion of the enzyme (up to 50 nm!)
• Only RNA Pol II whose CTD is NOT phosphorylated can initiate transcription
• TATA box (TATAAA) is a consensus promoter
• 7 general transcription factors are required
• See TFIID bound to TATA (Fig. 31.11)
Transcription Regulation in Prokaryotes
• Genes for enzymes for pathways are grouped in clusters on the chromosome - called operons
• This allows coordinated expression
• A regulatory sequence adjacent to such a unit determines whether it is transcribed - this is the ‘operator’
• Regulatory proteins work with operators to control transcription of the genes
Induction and Repression
• Increased synthesis of enzymes in response to a metabolite is ‘induction’
• Decreased synthesis in response to a metabolite is ‘repression’
• Some substrate analogs induce enzyme synthesis even though the enzymes can’t metabolize them - these are ‘gratuitous inducers’ - such as IPTG
The lac Operon
• lacI mutants express the genes needed for lactose metabolism constitutively, whether or not inducer is present
• The structural genes of the lac operon are controlled by negative regulation
• lacI gene product is the lac repressor
• The lac operator is a palindromic DNA sequence
• lac repressor - N-terminal domain binds DNA; C-terminal domain binds inducer and forms the tetramer
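"Palindromic" here means the sequence reads the same 5' to 3' on both strands, that is, it equals its own reverse complement. The sketch below tests for this and also scores partial symmetry, since natural operators may be only approximately symmetric. The test sequences are toy examples, not the actual lac operator sequence.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]

def is_palindromic(dna):
    """True if the sequence equals its own reverse complement."""
    return dna.upper() == reverse_complement(dna)

def dyad_symmetry_score(dna):
    """Fraction of positions that match the reverse complement (1.0 = perfect)."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    return sum(a == b for a, b in zip(dna, rc)) / len(dna)

print(is_palindromic("GAATTC"))         # a perfectly symmetric toy site
print(dyad_symmetry_score("GAATTCAT"))  # an imperfect one
```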
Catabolite Activator Protein
Positive Control of the lac Operon
• Some promoters require an accessory protein to speed transcription
• Catabolite Activator Protein or CAP is one such protein
• CAP is a dimer of 22.5 kD peptides
• N-term binds cAMP; C-term binds DNA
• Binding of CAP-(cAMP)2 to DNA assists formation of closed promoter complex
The trp Operon
• Encodes a leader sequence and 5 proteins that synthesize tryptophan
• Trp repressor controls the operon
• Trp repressor binding excludes RNA polymerase from the promoter
• Trp repressor also regulates trpR and aroH operons and is itself encoded by the trpR operon. This is autogenous regulation (autoregulation).
Transcription Regulation in Eukaryotes
• More complicated than prokaryotes
• Chromatin limits access of regulatory proteins to promoters
• Factors must reorganize the chromatin
• In addition to promoters, eukaryotic genes have ‘enhancers’, also known as upstream activation sequences
• DNA looping permits multiple proteins to bind to multiple DNA sequences
Structural Motifs in DNA-Binding Regulatory Proteins
• Crucial feature must be atomic contacts between protein residues and bases and sugar-phosphate backbone of DNA
• Most contacts are in the major groove of DNA
• 80% of regulatory proteins can be assigned to one of three classes: helix-turn-helix (HTH), zinc finger (Zn-finger) and leucine zipper (bZIP)
• In addition to DNA-binding domains, these proteins usually possess other domains that interact with other proteins
Alpha Helices and DNA
A perfect fit!
• A recurring feature of DNA-binding proteins is the presence of α-helical segments that fit directly into the major groove of B-form DNA
• Diameter of helix is 1.2 nm
• Major groove of DNA is about 1.2 nm wide and 0.6 to 0.8 nm deep
• Proteins can recognize specific sites in DNA
The Helix-Turn-Helix Motif
First identified in 3 prokaryotic proteins
• two phage λ repressor proteins (Cro and cI) and the E. coli catabolite activator protein (CAP)
• All these bind as dimers to dyad-symmetric sites on DNA (see Figure 31.33)
• All contain two alpha helices separated by a loop with a beta turn
• The C-terminal helix fits in major groove of DNA; N-terminal helix stabilizes by hydrophobic interactions with C-terminal helix
Helix-Turn-Helix II
See Figures 31.34 and 31.35
• Residues 1-7 of the motif are the first helix (but called "helix 2")
• Residue 9 is the turn maker - a Gly, of course
• Residues 12-20 are the second helix (called "helix 3")
• Recognition of DNA sequence involves the sides of base pairs that face the major groove (see discussion on pages 1050-1052)
The Zn-Finger Motif
First discovered in TFIIIA from Xenopus laevis, the African clawed toad
• Now known to exist in nearly all organisms
• Two main classes: C2H2 and Cx
• C2H2 domains consist of Cys-x2-Cys and His-x3-His domains separated by at least 7-8 amino acids
• Cx domains consist of 4, 5 or 6 Cys residues separated by various numbers of other residues
• See Figure 31.37 and Table 31.7
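The C2H2 spacing rule described above translates naturally into a regular expression. This is a rough sketch only: the lower bound of 7 residues between the Cys pair and the His pair follows the slide, while the upper bound of 20 is an assumption added to keep matches local, and the test peptide is invented for the example.

```python
import re

# Cys-x2-Cys, then 7 to 20 residues (upper bound assumed), then His-x3-His
C2H2_PATTERN = re.compile(r"C.{2}C.{7,20}?H.{3}H")

def find_c2h2_candidates(protein):
    """Return (start index, matched segment) for each candidate C2H2 finger."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(protein.upper())]

# Invented peptide for illustration, not a real protein sequence
toy_peptide = "MKCPECGKSFSRSDELTRHIRTHTG"
for start, segment in find_c2h2_candidates(toy_peptide):
    print(start, segment)
```

A match only shows that the Cys and His ligands are spaced plausibly; it says nothing about whether the segment actually folds around a zinc ion.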
More Zn-Fingers
Their secondary and tertiary structures
• C2H2-type Zn fingers form a folded beta strand and an alpha helix that fits into the DNA major groove
• Cx-type Zn fingers consist of two mini-domains of four Cys ligands to Zn followed by an alpha helix: the first helix is the DNA recognition helix, the second helix packs against the first
The Leucine Zipper Motif
First found in C/EBP, a DNA-binding protein in rat liver nuclei
• Now found in nearly all organisms
• Characteristic features: a 28-residue sequence with Leu every 7th position and a "basic region"
• (What do you know by now about 7-residue repeats?)
• This suggests amphipathic alpha helix and a coiled-coil dimer
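The "Leu every 7th position" rule is easy to test for in a sequence. The sketch below looks for four leucines spaced seven residues apart, which corresponds to the 28-residue stretch mentioned above; the function name and the toy peptide are assumptions made for illustration.

```python
def leucine_heptad_starts(protein, repeats=4):
    """Indices i where residues i, i+7, i+14, ... are all Leu.

    repeats=4 corresponds to the 28-residue zipper described in the slide.
    """
    protein = protein.upper()
    span = 7 * (repeats - 1) + 1  # distance covered from first to last Leu
    return [
        i
        for i in range(len(protein) - span + 1)
        if all(protein[i + 7 * k] == "L" for k in range(repeats))
    ]

# Toy peptide: a Leu at every 7th position after a 3-residue lead-in
toy_peptide = "AAA" + ("L" + "A" * 6) * 4
print(leucine_heptad_starts(toy_peptide))
```

With about 3.6 residues per turn, residues seven apart fall on the same face of an alpha helix, which is the reasoning behind the amphipathic helix in the last bullet.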
The Structure of the Zipper and its DNA complex
• Leucine zipper proteins (aka bZIP proteins) dimerize, either as homo- or hetero-dimers
• The basic region is the DNA-recognition site
• Basic region is often modelled as a pair of helices that can wrap around the major groove
• Homodimers recognize dyad-symmetric DNA
• Heterodimers recognize non-symmetric DNA
• Fos and Jun are classic bZIPs
Post-transcriptional Processing of mRNA in Eukaryotes
• Translation closely follows transcription in prokaryotes
• In eukaryotes, these processes are separated - transcription in nucleus, translation in cytoplasm
• On the way from nucleus to cytoplasm, the mRNA is converted from "primary transcript" to "mature mRNA"
Eukaryotic Genes are Split
• Introns intervene between exons
• Examples: the actin gene has a 309-bp intron that separates the codons for the first three amino acids from those for the other 350 or so
• But chicken pro-alpha-2 collagen gene is 40-kbp long, with 51 exons of only 5 kbp total.
• The exons range in size from 45 to 249 bases
• Mechanism by which introns are excised and exons are spliced together is complex and must be precise
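The collagen figures above make the point numerically. A short calculation using only the numbers on the slide (40 kbp gene, 51 exons, 5 kbp of exon sequence); the function names are made up for the example:

```python
def intron_fraction(gene_bp, exon_bp):
    """Fraction of the gene that is removed as intron sequence."""
    return (gene_bp - exon_bp) / gene_bp

def mean_exon_length(exon_bp, n_exons):
    """Average exon length in base pairs."""
    return exon_bp / n_exons

gene_bp, exon_bp, n_exons = 40_000, 5_000, 51  # chicken pro-alpha-2 collagen
print(f"intron fraction: {intron_fraction(gene_bp, exon_bp):.1%}")
print(f"mean exon length: {mean_exon_length(exon_bp, n_exons):.0f} bp")
```

The mean exon length falls inside the 45 to 249 base range quoted above, and most of the gene never reaches the mature mRNA.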
Capping and Methylation
• Primary transcripts (aka pre-mRNAs or heterogeneous nuclear RNA) are usually first "capped" by a guanylyl group
• The reaction is catalyzed by guanylyl transferase
• Capping G residue is methylated at 7-position
• Additional methylations occur at 2'-O positions of next two residues and at 6-amino of the first adenine
3'-Polyadenylylation
• Termination of transcription occurs only after RNA polymerase has transcribed past a consensus AAUAAA sequence - the poly(A)+ addition site
• 10-30 nucleotides past this site, a string of 100 to 200 adenine residues is added to the mRNA transcript - the poly(A)+ tail
• poly(A) polymerase adds these A residues
• Function not known for sure, but poly(A) tail may govern stability of the mRNA
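Locating the AAUAAA signal and the region where the tail would be added can be sketched as follows. The assumption here, labeled in the code as well, is that the 10-30 nucleotide distance is counted from the end of the hexamer; the slides do not specify the reference point. The function name is hypothetical.

```python
import re

def polyadenylation_windows(rna, signal="AAUAAA", window=(10, 30)):
    """Return (signal start, window start, window end) index triples.

    Assumption: the 10-30 nt offset is measured from the 3' end of the signal.
    Window end is exclusive and is clipped to the transcript length.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input too
    results = []
    for m in re.finditer(f"(?={signal})", rna):  # lookahead finds overlaps
        signal_end = m.start() + len(signal)
        lo = signal_end + window[0]
        hi = min(signal_end + window[1], len(rna))
        if lo <= len(rna):
            results.append((m.start(), lo, hi))
    return results

# Toy transcript built for the example
toy = "GG" + "AAUAAA" + "C" * 40
print(polyadenylation_windows(toy))
```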
Splicing of Pre-mRNA
Capped, polyadenylated RNA, in the form of an RNP complex, is the substrate for splicing
• In "splicing", the introns are excised and the exons are sewn together to form mature mRNA
• Splicing occurs only in the nucleus
• The 5'-end of an intron in higher eukaryotes is always GU and the 3'-end is always AG
• All introns have a "branch site" 18 to 40 nucleotides upstream from 3'-splice site
• Branch site is essential to splicing
The Branch site and Lariat
• Branch site is usually YNYRAY, where Y = pyrimidine, R = purine and N is anything
• The "lariat", a covalently closed loop of RNA, is formed by attachment of the 5'-P of the intron's invariant 5'-G to the 2'-OH at the branch A site
• The exons then join, excising the lariat.
• The lariat is unstable; the 2'-5' phosphodiester is quickly cleaved and intron is degraded in the nucleus.
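The sequence rules for an intron (GU at the 5'-end, AG at the 3'-end, and a YNYRAY branch site 18 to 40 nucleotides upstream of the 3'-splice site) can be combined into one simple check. In this sketch the distance is counted from the branch A to the 3' end of the intron, which is an assumed convention, and the test intron is invented for the example.

```python
import re

# YNYRAY: Y = C or U, R = A or G, N = any base; lookahead allows overlaps
BRANCH_SITE = re.compile(r"(?=[CU][ACGU][CU][AG]A[CU])")

def check_intron(intron, min_dist=18, max_dist=40):
    """Check the GU...AG rule and list plausible branch-point A positions.

    Assumed convention: distance = number of nucleotides from the branch A
    (inclusive) to the 3' end of the intron.
    """
    intron = intron.upper()
    branch_a = []
    for m in BRANCH_SITE.finditer(intron):
        a_index = m.start() + 4  # the A is the fifth base of YNYRAY
        distance = len(intron) - a_index
        if min_dist <= distance <= max_dist:
            branch_a.append(a_index)
    return {
        "starts_with_GU": intron.startswith("GU"),
        "ends_with_AG": intron.endswith("AG"),
        "branch_A_indices": branch_a,
    }

# Toy intron built for the example
toy = "GU" + "G" * 10 + "CACGAC" + "G" * 20 + "AG"
print(check_intron(toy))
```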
The Importance of snRNP
• Small nuclear ribonucleoprotein particles - snRNPs, pronounced "snurps" - are involved in splicing
• A snRNP consists of a small RNA (100-200 bases long) and about 10 different proteins
• Some of the 10 proteins are general, some are specific. Properties described on page 1063
• snRNPs and pre-mRNA form the spliceosome
• Spliceosome is the size of ribosomes, and its assembly requires ATP
Assembly of the Spliceosome
See Figure 31.53
• snRNPs U1 and U5 bind at the 5'- and 3'- splice sites, and U2 snRNP binds at the branch site
• Interaction between the snRNPs brings 5'- and 3'- splice sites together so lariat can form and exon ligation can occur
• The transesterification reactions that join the exons may in fact be catalyzed by "ribozymes"