Initiation and termination of transcription; post-transcriptional modification of RNA
Mitesh Shrestha
Transcription: overview
• In prokaryotes transcription and translation are coupled. Proteins are synthesized directly from the primary transcript as it is made.
• In eukaryotes transcription and translation are separated. Transcription occurs in the nucleus, and translation occurs in the cytoplasm on ribosomes.
Transcription: RNA Polymerase
• DNA-dependent – requires a DNA template, ribonucleoside 5´ triphosphates, and Mg2+
• Synthesizes RNA in 5´ to 3´ direction
• E. coli RNA polymerase consists of 5 subunits
• Eukaryotes have up to five RNA polymerases (I, II, and III in all eukaryotes; IV and V in plants)
– RNA polymerase II is responsible for transcription of protein-coding genes and some snRNA molecules
– RNA polymerase II has 12 subunits
– Requires accessory proteins (transcription factors)
– Does not require a primer
The Process of Gene Expression For non-viral proteins…
Information stored in the nucleotide sequences of genes is translated into the amino acid sequences of proteins through unstable intermediaries called messenger (m)RNAs.
Synthesis of viral proteins… in infected bacteria involved an unstable RNA molecule synthesized from the viral DNA.
1-Single strand of DNA (template strand)
2-Ribonucleoside triphosphates (NTPs)
3-No pre-existing primers (de novo)
RNA synthesis
template strand=transcribed
Protein
Nucleophilic attack: the 3’ hydroxyl group of the RNA strand attacks the nucleotidyl (α) phosphorus of the incoming nucleoside triphosphate, adding one nucleotide (a nucleoside monophosphate) to the chain and releasing pyrophosphate.
“RNA polymerase” “Transcriptional factors”
The Transcription Bubble
Prokaryotes:
--RNA polymerase binds specific nucleotide sequences (promoter regions) plus transcriptional factors
--Single RNA polymerase
--DNA unwinding (AT regions)
The Transcription Bubble
Eukaryotes:
--several RNA polymerases
--no direct recognition binding
--transcriptional factors
General Features of RNA Synthesis
• Similar to DNA Synthesis except – The precursors are ribonucleoside triphosphates. – Only one strand of DNA is used as a template.
– RNA chains can be initiated de novo (no primer required).
• The RNA molecule will be complementary to the DNA template (antisense) strand and identical (except that uridine replaces thymidine) to the DNA non-template (sense) strand.
• RNA synthesis is catalyzed by RNA polymerases and proceeds in the 5’→3’ direction.
• In eukaryotes, genes are present in the nucleus, whereas polypeptides are synthesized in the cytoplasm.
• Messenger RNA molecules function as unstable intermediaries that carry genetic information from DNA to the ribosomes, where proteins are synthesized.
• RNA synthesis, catalyzed by RNA polymerases, is similar to DNA synthesis in many respects.
• RNA synthesis occurs within a localized region of strand separation (the transcription bubble), and only one strand of DNA functions as a template for RNA synthesis.
Eukaryotic: ARS (Autonomously Replicating Sequences) AT-rich region 11 bp
Prokaryotic: OriC (245 bp) AT-rich region (replication bubble)
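The base-pairing rule from the general features of RNA synthesis (the transcript is complementary to the template strand and matches the non-template strand with U in place of T) can be sketched in a few lines. This is only an illustration: the function names and the 12-nt sequence are hypothetical, not taken from the slides.

```python
# Pairing from a DNA template base to the RNA base it specifies.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5: str) -> str:
    """RNA (written 5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())

def coding_to_rna(coding_5to3: str) -> str:
    """The transcript equals the non-template strand with U in place of T."""
    return coding_5to3.upper().replace("T", "U")

# Hypothetical example; the two strands are antiparallel and complementary.
template = "TACGGCATTACG"  # template (antisense) strand, read 3'->5'
coding = "ATGCCGTAATGC"    # non-template (sense) strand, read 5'->3'

rna = transcribe(template)
print(rna)
print(rna == coding_to_rna(coding))
```

Both routes give the same RNA, which is why the non-template strand is also called the sense strand.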
Stages of Transcription
• Promoter Recognition
• Chain Initiation
• Chain Elongation
• Chain Termination
Transcription: promoter recognition
• Transcription factors bind to promoter sequences and recruit RNA polymerase.
• DNA is bound first in a closed complex. Then, RNA polymerase denatures a 12–15 bp segment of the DNA (open complex).
• The site where the first base is incorporated into the transcript is numbered “+1” and is called the transcription start site.
• Transcription factors that are required at every promoter site for RNA polymerase interaction are called basal transcription factors.
Promoter recognition: promoter sequences
• Promoter sequences vary considerably.
• RNA polymerase binds to different promoters with different strengths; binding strength relates to the level of gene expression
• There are some common consensus sequences for promoters:
– Example: E. coli –35 sequence (found 35 bases 5´ to the start of transcription)
– Example: E. coli –10 sequence, the TATA-like Pribnow box (found 10 bases 5´ to the start of transcription)
Properties of Promoters
• Promoters typically consist of a ~40 bp region on the 5'-side of the transcription start site
• Two consensus sequence elements:
• The "-35 region", with consensus TTGACA - sigma subunit appears to bind here
• The Pribnow box near -10, with consensus TATAAT - this region is ideal for unwinding.
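As a rough sketch of how the two consensus elements could be used to look for candidate promoters, the code below scans a DNA sequence for a near-match to TTGACA followed by a near-match to TATAAT. The spacer range of 16 to 18 bp and the tolerance of one mismatch per element are assumptions added for the example, not values from the slides, and the test sequence is made up.

```python
MINUS_35 = "TTGACA"  # -35 consensus from the slide
MINUS_10 = "TATAAT"  # Pribnow box consensus from the slide

def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_promoters(seq, max_mismatch=1, spacer=(16, 18)):
    """Return (index of -35 element, index of -10 element) pairs.

    spacer and max_mismatch are illustrative assumptions.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        for gap in range(spacer[0], spacer[1] + 1):
            j = i + 6 + gap
            box = seq[j:j + 6]
            if len(box) == 6 and mismatches(box, MINUS_10) <= max_mismatch:
                hits.append((i, j))
    return hits

# Hypothetical sequence: consensus elements separated by a 17-bp spacer.
dna = "GG" + "TTGACA" + "GCGCGCGCGCGCGCGCG" + "TATAAT" + "GCGCA"
hits = find_promoters(dna)
print(hits)
```

Real promoters vary considerably around the consensus, so a scan like this only nominates candidates.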
Promoter recognition: enhancers
• Eukaryotic genes may also have enhancers.
• Enhancers can be located at great distances from the gene they regulate, either 5´ or 3´ of the transcription start, in introns or even on the noncoding strand.
• One of the most common ways to identify promoters and enhancers is to use a reporter gene.
Promoter recognition: other players
• Many proteins can regulate gene expression by modulating the strength of interaction between the promoter and RNA polymerase.
• Some proteins can activate transcription (upregulate gene expression).
• Some proteins can inhibit transcription by blocking polymerase activity.
• Some proteins can act both as repressors and activators of transcription.
Transcription: chain initiation
• Chain initiation:
• RNA polymerase locally denatures the DNA.
• The first base of the new RNA strand is placed complementary to the +1 site.
• RNA polymerase does not require a primer.
• The first 8 or 9 bases of the transcript are linked. Transcription factors are released, and the polymerase leaves the promoter region.
Transcription in Prokaryotes
Transcription
---The first step in gene expression
---Transfers the genetic information stored in DNA (genes) into messenger (m)RNA molecules
---The mRNAs carry the information to the sites of protein synthesis in the cytoplasm.
Stages of Transcription
--DNA dependent RNA polymerase
--5’ to 3’ direction
--Walk (literally) on the DNA
--Upstream and downstream regions
E. coli RNA Polymerase
• Tetrameric core: α2ββ’
• Holoenzyme: α2ββ’σ
• (480,000 Daltons; bp~650 Daltons)
• Functions of the subunits:
α: assembly of the tetrameric core
β: ribonucleoside triphosphate binding site
β’: DNA template binding region
σ (sigma factor): initiation of transcription (*)
(*) in vivo. In vitro: RNA polymerase works just fine on both DNA strands
Initiation of RNA Chains
Binding of RNA polymerase holoenzyme to a promoter region in DNA
Localized unwinding of the two strands of DNA by RNA polymerase to provide a single-stranded template
Formation of phosphodiester bonds between the first few ribonucleotides in the nascent RNA chain
A Typical E. coli Promoter
..,-2,-1,+1,+2,..
Numbering of a Transcription Unit
• The transcript initiation site is +1 (A/T).
• Bases preceding the initiation site are given minus (–) prefixes and are referred to as upstream sequences.
• Bases following the initiation site are given plus (+) prefixes and are referred to as downstream sequences.
• Consensus sequences: highly conserved
• Recognition sequences: sigma factor (σ)
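The numbering convention has no position 0: the scale runs ..., -2, -1, +1, +2, ... A minimal sketch of converting a 0-based string index into that convention follows; the function name and the example index are hypothetical.

```python
def position_label(index: int, tss_index: int) -> str:
    """Label a 0-based index relative to the transcription start site.

    The start site itself is +1; the base just upstream is -1 (no 0).
    """
    offset = index - tss_index
    return f"+{offset + 1}" if offset >= 0 else str(offset)

# Hypothetical example: the start site sits at index 40 of a sequence.
tss = 40
labels = [position_label(i, tss) for i in range(38, 43)]
print(labels)
```

With this convention the -35 and -10 elements of an E. coli promoter start near indices `tss - 35` and `tss - 10`.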
Transcription: chain elongation
• Chain elongation:
• RNA polymerase moves along the transcribed or template DNA strand.
• The new RNA molecule (primary transcript) forms a short RNA-DNA hybrid molecule with the DNA template.
Elongation
--Sigma factor needs to be released
--Re- and un-winding activities
--RNA polymerase walks (literally) on the DNA; the growing RNA chain is extended 5’ to 3’
--RNA polymerase binds both the DNA template and the growing RNA chain
Elongation phase of transcription
• Requires the release of RNA polymerase from the initiation complex
• Highly processive
• Dissociation of factors needed specifically at initiation.
– Bacterial σ dissociates from the holoenzyme
– Eukaryotic TFIID and TFIIA appear to stay behind at the promoter after polymerase and other factors leave the initiation complex
Proteins implicated in elongation
• P-TEFb
– Positive transcription elongation factor b
– Cyclin-dependent kinase
– Phosphorylates CTD of large subunit, Pol II
• E. coli GreA and GreB, eukaryotic TFIIS
– may overcome pausing by the polymerase
– induce cleavage of the new transcript, followed by release of the 3’ terminal RNA fragment.
• E. coli NusG, yeast Spt5, human DSIF
– Regulated elongation (negative and positive), direct contact with polymerase and nascent transcript
• ELL: increase elongation rate of RNA Pol II
• CSB: Cockayne syndrome B protein, increases elongation rate
Model for RNA Polymerase II Phosphorylation
Eukaryotic RNA polymerase II
[Figure: Pol IIa (CTD of the large subunit unphosphorylated) is converted to Pol IIo (CTD phosphorylated) by a kinase + ATP; a phosphatase converts Pol IIo back to Pol IIa.]
Model: Phosphorylation of Pol IIa to make Pol IIo is needed to release the polymerase from the initiation complex and allow it to start elongation.
CTD has 26-50 tandem repeats of the heptapeptide YSPTSPS
The shift from initiation to elongation can be a regulated event.
• Release from pausing can be the mechanism for induction of expression.
– In Drosophila, the RNA polymerase can pause after synthesizing ~ 25 nucleotides of RNA in many genes.
– Under elevated temperature conditions, the heat shock factor stimulates elongation by release from pausing.
– Other possible examples: mammalian c-myc, HIV LTR
• This is in addition to regulation at initiation.
Phosphorylated form of RNA PolII is at sites of elongation after heat shock
Immunofluorescence Detection of Pol II on Drosophila Polytene Chromosomes.
Green: dephosphorylated; Red: hyperphosphorylated; Yellow: mixed
Transcription: chain termination
• Most is known about bacterial chain termination
• Termination is signaled by a sequence that can form a hairpin loop.
• The polymerase and the new RNA molecule are released upon formation of the loop.
Termination Signals in E. coli
• Rho-dependent terminators: require a protein factor (ρ)
• Rho-independent terminators: do not require ρ
Termination of transcription in E. coli: Rho-independent site
[Figure: RNA stem-loop at a rho-independent terminator. G+C rich region in stem; run of U's 3' to stem-loop.]
Rho-independent terminators do not require ρ (intrinsic termination)
RNA transcription stops
--when the newly synthesized RNA molecule forms a G-C-rich hairpin loop followed by a run of Us (transcribed from a run of As in the template strand)
--The hairpin creates a mechanical stress
--This pulls the poly-U transcript out of the active site of the RNA polymerase
--A-U base pairs interact very weakly
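The two features of an intrinsic terminator (a G-C-rich stem that can base-pair into a hairpin, followed by a run of U's) can be searched for with a simple scan. This is a toy sketch, not a published terminator-prediction method: the stem length, loop range, GC cutoff, U-run length, and test sequence are all assumptions chosen for the example, and only perfect Watson-Crick stems are detected.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def revcomp(rna: str) -> str:
    """Reverse complement of an RNA string."""
    return "".join(PAIR[b] for b in reversed(rna))

def find_intrinsic_terminators(rna, stem=6, loop=(3, 8), min_gc=0.6, u_run=4):
    """Return (stem start, index just past the hairpin) for each candidate.

    All thresholds are illustrative assumptions.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - stem + 1):
        left = rna[i:i + stem]
        gc = (left.count("G") + left.count("C")) / stem
        if gc < min_gc:
            continue
        for loop_len in range(loop[0], loop[1] + 1):
            j = i + stem + loop_len
            if rna[j:j + stem] != revcomp(left):
                continue  # right arm cannot pair with left arm
            tail = rna[j + stem:j + stem + u_run + 2]
            if "U" * u_run in tail:
                hits.append((i, j + stem))
    return hits

# Hypothetical transcript: GC-rich stem, 4-nt loop, then six U's.
with_u = "CC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU" + "AC"
without_u = "CC" + "GCCGCC" + "UUCG" + "GGCGGC" + "ACGACG" + "AC"
hits = find_intrinsic_terminators(with_u)
print(hits)
print(find_intrinsic_terminators(without_u))
```

The second sequence forms the same hairpin but lacks the U run, so it is not reported.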
Termination of transcription in E. coli: Rho-dependent site
5' ...AUCGCUACCUCAUAUCCGCACCUCCUCAAACGCUACCUCGACCAGAAAGGCGUCUCUU
Termination occurs at one of these 3 nucleotides.
• Little sequence specificity: rich in C, poor in G.
• Requires action of rho (ρ) in vitro and in vivo.
• Many (most?) genes in E. coli have rho-dependent terminators.
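The "rich in C, poor in G" description can be checked directly on the sequence shown on this slide. The sketch below computes base fractions; the cutoffs used in `c_rich_g_poor` are arbitrary illustrative thresholds, not values from the slides.

```python
from collections import Counter

# Rho-dependent site sequence as printed on the slide.
RHO_SITE = "AUCGCUACCUCAUAUCCGCACCUCCUCAAACGCUACCUCGACCAGAAAGGCGUCUCUU"

def base_fractions(rna: str) -> dict:
    """Fraction of each base in an RNA string."""
    counts = Counter(rna.upper())
    total = len(rna)
    return {base: counts[base] / total for base in "ACGU"}

def c_rich_g_poor(rna: str, min_c: float = 0.30, max_g: float = 0.20) -> bool:
    """True if C is abundant and G is scarce (hypothetical cutoffs)."""
    f = base_fractions(rna)
    return f["C"] >= min_c and f["G"] <= max_g

fractions = base_fractions(RHO_SITE)
print({base: round(value, 2) for base, value in fractions.items()})
print(c_rich_g_poor(RHO_SITE))
```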
Termination Signals in E. coli
• Rho-dependent terminators (non-intrinsic): require a protein factor (ρ) and a rut site
• Rho binds rut sites, specific RNA sequences that are rich in C and poor in G
• Rut sites are not hairpins or other secondary structures
© John Wiley & Sons, Inc.
Rho utilization (rut)
Rho factor, or ρ
• Rho is a hexamer, subunit size is 46 kDa
• Is an RNA-dependent ATPase
• Is an essential gene in E. coli
• Rho binds to protein-free RNA and moves along it (tracks)
• Upon reaching a paused RNA polymerase, it causes the polymerase to dissociate and unwinds the RNA-DNA duplex, using ATP hydrolysis. This terminates transcription.
Model for action of rho factor
• The ρ hexamer binds to protein-free RNA and moves along it.
• RNA polymerase transcribes along the template, and ρ moves along the RNA.
• RNA polymerase pauses at the ρ-dependent terminator site (a structure in the RNA causes pausing), and ρ catches up.
• ρ unwinds the RNA-DNA hybrid and transcription terminates.
mRNA Structure in Bacteria : Coupling Transcription Termination and Translation
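[Figure: the genes in the lac operon (lacZ, lacY, lacA) are transcribed into one polycistronic mRNA. Each coding region starts with AUG and ends with UAA, and the three regions are translated into β-galactosidase, lactose permease, and β-galactoside transacetylase.]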
Translation can occur simultaneously with transcription in bacteria
[Figure: while the lac genes are still being transcribed, a ribosome translates the lacZ region of the mRNA (AUG to UAA) into a nascent β-galactosidase polypeptide.]
Coupled Transcription and Translation in E. coli
Polarity
• Polar mutations occur in a gene early in an operon, but affect expression of both that gene and genes that follow in the operon.
• Usually affect translation at the beginning of an operon, and exert a negative effect on the transcription of genes later in the operon.
– Usually are nonsense (translation termination) mutations in a 5’ gene that cause termination of transcription of subsequent genes in the operon.
• Rho mutants can suppress polarity.
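To make the distinction concrete, the sketch below applies two point mutations to a short open reading frame and finds where translation would stop. The sequence and helper names are hypothetical; only the three stop codons are encoded, so the code identifies nonsense mutations but does not translate.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def codons(orf: str) -> list:
    """Split an ORF into complete triplets, starting at position 0."""
    return [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]

def first_stop(orf: str):
    """Index of the first stop codon, or None if there is none."""
    for n, codon in enumerate(codons(orf)):
        if codon in STOP_CODONS:
            return n
    return None

def point_mutant(orf: str, pos: int, base: str) -> str:
    """Replace the base at 0-based position pos."""
    return orf[:pos] + base + orf[pos + 1:]

wild_type = "AUGCAGGCUUGGUAA"               # AUG CAG GCU UGG UAA
nonsense = point_mutant(wild_type, 3, "U")  # CAG -> UAG (stop)
missense = point_mutant(wild_type, 4, "U")  # CAG -> CUG (Gln -> Leu)

print(first_stop(wild_type), first_stop(nonsense), first_stop(missense))
```

Only the nonsense mutant releases ribosomes early, which is the situation in which a polar effect on downstream genes is expected.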
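Diagram of polar effects (lacZ, lacY, lacA; txn = transcription, tln = translation)
• wt: txn and tln of all three genes; β-galactosidase, permease, and Ac'ase are made
• missense mutation (x in lacZ): txn and tln continue; no β-galactosidase activity; permease and Ac'ase are made
• nonsense mutation (x in lacZ): tln stops at the mutation and txn stops at a ρ-dependent terminator; no β-galactosidase protein (truncated protein gets degraded); no permease, no Ac'ase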
Model for involvement of rho in polar effects of nonsense mutations
[Figure: a ρ-dependent site within a transcription unit, next to a structure in the RNA that causes pausing, shown for the wild-type and for a nonsense mutation.]
• Wild-type: ribosomes prevent ρ from catching up with RNA polymerase. Transcription and translation continue past the ρ-dependent termination site.
• Nonsense mutation: ribosomes dissociate at the nonsense codon, leaving protein-free RNA on which ρ can catch up with the paused polymerase.
Eukaryotic mRNA structure
Transcription and RNA Processing in Eukaryotes
Five different enzymes catalyze transcription in eukaryotes, and the resulting RNA transcripts undergo three important modifications, including the excision of noncoding sequences called introns.
The nucleotide sequences of some RNA transcripts are modified post-transcriptionally by RNA editing.
Modifications to Eukaryotic pre-mRNAs
• A 7-methyl guanosine cap is added to the 5’ end of the primary transcript by a 5’-5’ triphosphate linkage. (stability and protection)
• A poly(A) tail (a 20-200 nucleotide polyadenosine tract, As) is added to the 3’ end of the transcript. The 3’ end is generated by cleavage rather than by termination. (stability and protection)
• When present, intron sequences are spliced out of the transcript. (stability)
Eukaryotes Have Five RNA Polymerases
• RNA polymerase II: located in the nucleus; products include pre-mRNA and miRNA
• Pre-mRNA ≈ heterogeneous nuclear RNA (hnRNA)
A Typical RNA Polymerase II Promoter (mRNA)
Promoter: short sequence of conserved elements (seq. of DNA) located upstream from the transcript starting point.
--~200 bp (DNA linear)
--~10 kbp (DNA bending)
Initiation by RNA Polymerase II
Transcriptional factors (proteins)
--Help/modulate/assist
Basal transcriptional factors
--bind close to the transcript starting point
Other factors (enhancers and silencers)
-TFIID
--TATA-binding protein (TBP)
-TFIIA
-TFIIB
-TFIIF (together with the RNA Pol-II)
--enzymatic activity (DNA-unwinding)
-TFIIE
--binds downstream regions
-TFIIH (helicase activity)
-TFIIJ
--binds downstream regions
Helicase activity:
it separates two annealed nucleic acid strands
RNA Pol-II DNA-unwinding activity: RNA Polymerase bends and wraps around DNA. TFIIF alters (nonspecific) DNA binding by RNA polymerase II, resulting in substantial DNA unwinding but not DNA strand separation.
When RNA polymerase II binds to the complex, it initiates transcription.
Phosphorylation of the CTD is required for elongation to begin.
CTD: carboxy-terminal domain
-TFIIH (helicase activity and kinase activity)
• All eukaryotic RNA polymerases have ∼12 subunits and are aggregates of >500 kD. (nucleotide pair~0.660 kD)
• Some subunits are common to all three RNA polymerases.
• The largest subunit in RNA polymerase II has a CTD (carboxy-terminal domain) consisting of multiple repeats of a heptamer.
Figure 24.2
-Typical RNA polymerase isolated from yeast (S. cerevisiae) ( and subunits)
-Largest subunit: CTD (carboxy-terminal domain), which consists of multiple repeats of 7 amino acids; unique and important for regulation (tyrosine (Tyr, Y), serine (Ser, S) and threonine (Thr, T) residues)
-Some subunits are common to all three polymerases.
RNA Polymerase I Has a Bipartite Promoter
• The RNA polymerase I promoter consists of:
--a core promoter
--an upstream promoter element (UPE), also called the upstream control element (UCE)
RNA Pol I transcribes rRNA genes.
Core promoter: -45 to +20 seq.,
G-C-rich and A-T-rich (Inr-initiator) regions,
Binding factors - protein complexes formed by
TFIs and TBP-(TATA binding protein)
RNA Polymerase III Uses Both Downstream and Upstream Promoters
• RNA polymerase III has three types of promoters.
Figure 24.7
-RNA Pol III transcribes tRNA
-Core promoters (boxes)
-Transcriptional Factors(TF) III:
general and specifics
*proximal sequence element
RNA Chain elongation
--Model
The 7-Methyl Guanosine (7-MG) Cap
Energy
Histones:? FACT (facilitates chromatin transcription)
The 3’ Poly(A) Tail
RNA Chain termination
Termination signal: specific DNA seq.
--1000 to 2000 nucleotides
--AAUAAA seq.
--GU-rich seq.
--poly(A) polymerase
Endonuclease
Pol-II vs Pol I and III
-Terminator proteins (Rho-indep. Terminator)
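The 3’ end formation step for Pol II transcripts (endonuclease cleavage downstream of AAUAAA, then addition of a poly(A) tail) can be sketched as a string operation. The cleavage distance of 20 nt after the signal is an assumption made for the example, not a value from the slides; the tail length default follows the upper end of the 20-200 nt range quoted earlier, and the test transcript is invented.

```python
def cleave_and_polyadenylate(pre_mrna, signal="AAUAAA", downstream=20, tail_len=200):
    """Cut the transcript downstream of the signal and append a poly(A) tail.

    downstream is a hypothetical cleavage distance; returns None when the
    signal is absent or the transcript ends before the cleavage point.
    """
    pos = pre_mrna.find(signal)
    if pos == -1:
        return None
    cut = pos + len(signal) + downstream
    if cut > len(pre_mrna):
        return None
    return pre_mrna[:cut] + "A" * tail_len

# Hypothetical transcript: signal, 20 nt, then a GU-rich region that is removed.
pre = "GGCC" + "AAUAAA" + "C" * 20 + "GUGUGU" + "C" * 10
processed = cleave_and_polyadenylate(pre, tail_len=25)
print(processed)
```

Everything 3’ of the cut, including the GU-rich sequence, is discarded, which matches the statement that the 3’ end is generated by cleavage rather than by termination.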
Termination of transcription in eukaryotes : Pol I
• Termination by RNA polymerase I requires a binding site for a protein, Reb1p, that causes pausing.
Model for Pol I termination.
[Figure labels: Reb1p; RNA polymerase I; U-rich region]
If the Reb1p binding site in the DNA is replaced with the binding site for E. coli Lac repressor, Lac repressor protein will induce termination in an in vitro transcription reaction.
Termination of transcription in eukaryotes : Pol II and Pol III
• RNA polymerase III terminates in a run of 4-5 T’s on the nontemplate strand, surrounded by G+C-rich DNA.
• No clear evidence for a discrete terminator of transcription by RNA polymerase II.
• The 3’ end of the mRNA is made by cleavage and polyadenylation.
Transcription: mRNA synthesis/processing
• Prokaryotes: mRNA transcribed directly from DNA template and used immediately in protein synthesis
• Eukaryotes: primary transcript must be processed to produce the mRNA
– Noncoding sequences (introns) are removed
– Coding sequences (exons) spliced together
– 5´-methylguanosine cap added
– 3´-polyadenosine tail added
Transcription: mRNA synthesis/processing
• Removal of introns and splicing of exons can occur several ways
– For introns within a nuclear transcript, a spliceosome is required.
• Spliceosomes contain protein and small nuclear RNA (snRNA)
• Specificity of splicing comes from the snRNA, some of which contain sequences complementary to the splice junctions between introns and exons
– Alternative splicing can produce different forms of a protein from the same gene
– Mutations at the splice sites can cause disease
• Thalassemia
• Breast cancer (BRCA 1)
Transcription: mRNA synthesis/processing
• RNA splicing occurs inside the nucleus on particles called spliceosomes.
• Spliceosomes are composed of proteins and small RNA molecules (100–200 nucleotides; snRNA).
• Both proteins and RNA are required, but some evidence suggests that the RNA can catalyze the splicing reaction.
• Self-splicing in Tetrahymena: the RNA catalyzes its own splicing
• Catalytic RNA: ribozymes
RNA Editing
• Usually the genetic information is not altered in the mRNA intermediary.
• Sometimes RNA editing changes the information content of genes by
– Inserting or deleting uridine monophosphate residues.
– Changing the structures of individual bases
Editing of Apolipoprotein-B mRNA
(Amino groups)
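Editing that changes the structure of a base can be illustrated with C-to-U conversion, in which removal of the amino group from cytosine yields uracil. As background not spelled out on the slide, apolipoprotein-B editing is the standard example: a CAA (Gln) codon becomes a UAA stop codon. The fragment below is hypothetical and is not the real apoB sequence.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def deaminate_c_to_u(rna: str, pos: int) -> str:
    """Convert the C at 0-based position pos to U (models loss of the amino group)."""
    if rna[pos] != "C":
        raise ValueError("editing site must be a C")
    return rna[:pos] + "U" + rna[pos + 1:]

def codons(rna: str) -> list:
    """Split an RNA string into complete triplets from position 0."""
    return [rna[i:i + 3] for i in range(0, len(rna) - len(rna) % 3, 3)]

unedited = "AUACAAUUU"                  # AUA CAA UUU (hypothetical fragment)
edited = deaminate_c_to_u(unedited, 3)  # CAA -> UAA

print(codons(unedited), codons(edited))
print(codons(edited)[1] in STOP_CODONS)
```

A single edited base is enough to produce a shorter protein from the same gene.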
• Three to five different RNA polymerases are present in eukaryotes, and each polymerase transcribes a distinct set of genes.
• Eukaryotic gene transcripts usually undergo three major modifications:
• the addition of 7-methyl guanosine caps to 5’ termini,
• the addition of poly(A) tails to 3’ ends, and
• the excision of noncoding intron sequences.
• The information content of some eukaryotic transcripts is altered by RNA editing, which changes the nucleotide sequences of transcripts prior to their translation.
Interrupted Genes in Eukaryotes: Exons and Introns
Most eukaryotic genes contain noncoding sequences called introns that interrupt the coding sequences, or exons.
The introns are excised from the RNA transcripts prior to their transport to the cytoplasm.
Hybridization: annealing
R-Loop Evidence of an Intron in the Mouse β-Globin Gene
mRNA
Pre-mRNA
Missing in actions
Introns
• Introns (or intervening sequences) are noncoding sequences located between coding sequences.
• Introns are removed from the pre-mRNA and are not present in the mRNA.
• Introns are variable in size and may be very large (50 bp to 3000 bp).
• Exons (both coding and noncoding sequences) are composed of the sequences that remain in the mature mRNA after splicing.
• The biological significance of introns is still open to debate.
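The relationship between pre-mRNA, introns, and mature mRNA can be sketched as removing known intervals from a string and joining what remains. The coordinates and sequence are hypothetical; the intron is written in lowercase only to make it visible.

```python
def splice(pre_mrna: str, introns: list) -> str:
    """Join the exons that remain after removing introns.

    introns is a list of (start, end) half-open, non-overlapping intervals.
    """
    exons, cursor = [], 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[cursor:start])
        cursor = end
    exons.append(pre_mrna[cursor:])
    return "".join(exons)

exon1, intron, exon2 = "AUGGCC", "guaaguacuaacuuuucag", "GAAUAA"
pre_mrna = exon1 + intron + exon2
mrna = splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))])
print(mrna)
```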
Removal of Intron Sequences by RNA Splicing
The noncoding introns are excised from gene transcripts by several different mechanisms.
Found in eukaryotes; not in prokaryotes (except a few prokaryotic viruses and others)
Excision of Intron Sequences
Conserved seq. for mRNA (99% of introns): Exon-GT…AG-exon, where GT…AG is the intron
Ribonucleoproteins: Spliceosomes (1981)
Splicing
• Removal of introns must be very precise.
• Conserved sequences for removal of the introns of nuclear mRNA genes are minimal.
– Dinucleotide sequences at the 5’ and 3’ ends of introns: Exon-GU…AG-exon
– TACTAAC box (branch site with A) about 30 nucleotides upstream from the 3’ splice site.
Spliceosomes: snRNA plus ~40 proteins
1%: GC…AG and AT…AC
Nuclear splicing involves trans-esterification
GU…UACUAAC….AG
“Branch site”
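The minimal conserved features of a nuclear pre-mRNA intron (GU at the 5’ end, AG at the 3’ end, and a UACUAAC branch-site box about 30 nucleotides upstream of the 3’ splice site) can be checked with a few string tests. The function name, the returned keys, and the 46-nt intron are hypothetical and exist only for illustration.

```python
def check_intron(intron: str, branch_box: str = "UACUAAC") -> dict:
    """Report the consensus features of an intron sequence.

    Accepts DNA or RNA letters; distance is measured from the first base
    of the branch-site box to the 3' end of the intron.
    """
    intron = intron.upper().replace("T", "U")
    box_start = intron.rfind(branch_box)
    return {
        "starts_with_GU": intron.startswith("GU"),
        "ends_with_AG": intron.endswith("AG"),
        "box_to_3prime_end": None if box_start == -1 else len(intron) - box_start,
    }

# Hypothetical intron: GU..., branch-site box, 21 U's, ...AG
intron = "GUAUGU" + "C" * 10 + "UACUAAC" + "U" * 21 + "AG"
report = check_intron(intron)
print(report)
```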
tRNA
• A splicing endonuclease makes two cuts at the ends of the intron.
• A splicing ligase joins the two ends of the tRNA to produce the mature tRNA.
• Specificity resides in the three-dimensional (secondary) structure of the tRNA precursor, not in the nucleotide sequence.
rRNA (Autocatalytic Splicing)
G-3’-OH: absolute requirement “Co-factor”
• Noncoding intron sequences are excised from RNA transcripts in the nucleus prior to the transport to the cytoplasm.
• Introns in tRNA precursors are removed by the concerted action of a splicing endonuclease and ligase, whereas introns in some rRNA precursors are spliced out autocatalytically—with no catalytic protein involved.
• The introns in nuclear pre-mRNAs are excised on complex ribonucleoprotein structures called spliceosomes.
• The intron excision process must be precise, with accuracy to the nucleotide level, to ensure that codons in exons distal to introns are read correctly during translation.
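Why nucleotide-level accuracy matters can be shown by splicing the same pre-mRNA twice, once at the correct 3’ boundary and once a single nucleotide too far. The sequence and coordinates below are hypothetical.

```python
def codons(rna: str) -> list:
    """Split an RNA string into complete triplets from position 0."""
    return [rna[i:i + 3] for i in range(0, len(rna) - len(rna) % 3, 3)]

def splice_at(pre_mrna: str, start: int, end: int) -> str:
    """Remove the half-open interval [start, end) and join the flanks."""
    return pre_mrna[:start] + pre_mrna[end:]

exon1, intron, exon2 = "AUGGCU", "GUAAGUUACUAACUUAG", "GAUCCAUGA"
pre_mrna = exon1 + intron + exon2
start, end = len(exon1), len(exon1) + len(intron)

correct = splice_at(pre_mrna, start, end)
shifted = splice_at(pre_mrna, start, end + 1)  # removes one exon nucleotide too

print(codons(correct))
print(codons(shifted))
```

The codons upstream of the junction are unchanged, but every codon after it is read in a different frame in the mis-spliced product.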