Chapter 11
Transcription
The biochemistry and molecular biology department of CMU
The synthesis of RNA molecules using DNA strands as the templates so that the genetic information can be transferred from DNA to RNA.
Transcription
• Both processes use DNA as the template.
• Phosphodiester bonds are formed in both cases.
• Both synthesis directions are from 5´ to 3´.
Similarity between replication and transcription
item         replication       transcription
template     double strands    single strand
substrate    dNTP              NTP
primer       yes               no
enzyme       DNA polymerase    RNA polymerase
product      dsDNA             ssRNA
base pair    A-T, G-C          A-U, T-A, G-C
Differences between replication and transcription
Section 1
Template and Enzymes
• The whole genome of DNA needs to be replicated, but only a small portion of the genome is transcribed, in response to developmental requirements, physiological needs, and environmental changes.
• DNA regions that can be transcribed into RNA are called structural genes.
§1.1 Template
The template strand is the strand from which the RNA is actually transcribed. It is also termed the antisense strand.
The coding strand is the strand whose base sequence specifies the amino acid sequence of the encoded protein. Therefore, it is also called the sense strand.
5' G C A G T A C A T G T C 3'   coding strand
3' C G T C A T G T A C A G 5'   template strand
transcription
5' G C A G U A C A U G U C 3'   RNA
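The relationship between the two strands and the RNA can be sketched in a few lines of Python. This is a minimal illustration, not a bioinformatics tool; the function names `transcribe` and `coding_to_rna` are hypothetical, and the sequences are the ones from the example above.

```python
# Pairing rules used during transcription: DNA template base -> RNA base.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') made from a template strand written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5.upper())


def coding_to_rna(coding_5_to_3: str) -> str:
    """The RNA carries the coding-strand sequence with U in place of T."""
    return coding_5_to_3.upper().replace("T", "U")


template = "CGTCATGTACAG"  # template strand, 3'->5'
coding = "GCAGTACATGTC"    # coding strand, 5'->3'

rna = transcribe(template)
print(rna)                           # GCAGUACAUGUC
print(rna == coding_to_rna(coding))  # True
```

Both routes give the same RNA, which is why the coding strand is the one whose sequence is usually quoted for a gene.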
• For a given gene, only the template strand is used for transcription; the coding strand is not.
• Either strand of the DNA can serve as the template, depending on the gene.
• The transcription direction on different strands is opposite.
• This feature is referred to as asymmetric transcription.
Asymmetric transcription
Organization of coding information in the adenovirus genome
§1.2 RNA Polymerase
• The enzyme responsible for the RNA synthesis is DNA-dependent RNA polymerase.
– The prokaryotic RNA polymerase is a multiple-subunit protein of ~480kD.
– Eukaryotic systems have three kinds of RNA polymerases, each of which is a multiple-subunit protein and responsible for transcription of different RNAs.
[Figure labels: core enzyme, holoenzyme]
Holoenzyme
The holoenzyme of RNA-pol in E. coli consists of five subunits: α2ββ'σ.
subunit   MW       function
α         36512    Determine the DNA to be transcribed
β         150618   Catalyze polymerization
β'        155613   Bind & open DNA template
σ         70263    Recognize the promoter for synthesis initiation
RNA-pol of E. coli
• Rifampicin, a therapeutic drug for tuberculosis treatment, binds specifically to the β subunit of RNA-pol and inhibits RNA synthesis.
• RNA-pol of other prokaryotic systems is similar to that of E. coli in structure and functions.
RNA-pol                   I           II       III
products                  45S rRNA    hnRNA    5S rRNA, tRNA, snRNA
sensitivity to amanitin   no          high     moderate
RNA-pol of eukaryotes
Amanitin is a specific inhibitor of RNA-pol.
• Each transcribable region is called an operon.
• One operon includes several structural genes and upstream regulatory sequences (or regulatory regions).
• The promoter is the DNA sequence that RNA-pol can bind. It is the key point for the transcription control.
§1.3 Recognition of Origins
[Figure: double-stranded DNA (5'→3' over 3'→5') showing the regulatory sequences, including the promoter bound by RNA-pol, upstream of the structural gene]
Promoter
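[Figure: prokaryotic promoter on double-stranded DNA; position scale -50 -40 -30 -20 -10 1 10, with transcription starting at 1]
-35 region:                5' T T G A C A 3'
                           3' A A C T G T 5'
-10 region (Pribnow box):  5' T A T A A T 3'
                           3' A T A T T A 5'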
Prokaryotic promoter
Consensus sequence (frequency of each base in 45 samples)
-35 region   T   T   G   A   C   A
frequency    38  36  29  40  25  30
-10 region   T   A   T   A   A   T
frequency    37  37  28  41  29  44
• The -35 region of TTGACA sequence is the recognition site and the binding site of RNA-pol.
• The -10 region of TATAAT is the region at which a stable complex of DNA and RNA-pol is formed.
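A rough way to see how the two consensus hexamers define a candidate promoter is to scan a sequence for both boxes. The sketch below is illustrative only: the function names, the one-mismatch tolerance, and the 16 to 18 bp spacer between the two boxes are assumptions chosen for the example, not values taken from these slides.

```python
MINUS_35 = "TTGACA"  # -35 region consensus
MINUS_10 = "TATAAT"  # -10 region consensus (Pribnow box)


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq: str, max_mismatch: int = 1, spacer=(16, 18)):
    """Return (start_of_-35_box, start_of_-10_box) index pairs (0-based).

    `max_mismatch` and `spacer` are assumed, illustrative parameters.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        if hamming(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        for gap in range(spacer[0], spacer[1] + 1):
            j = i + 6 + gap  # candidate start of the -10 box
            if j + 6 <= len(seq) and hamming(seq[j:j + 6], MINUS_10) <= max_mismatch:
                hits.append((i, j))
    return hits


# Synthetic test sequence: both boxes separated by a 17-nt spacer.
demo = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GGGG"
print(find_promoters(demo))  # [(2, 25)]
```

Real promoters deviate from the consensus to varying degrees, as the frequency table shows, so a practical search would weight each position rather than count mismatches equally.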
Section 2
Transcription Process
General concepts
• Three phases: initiation, elongation, and termination.
• The prokaryotic RNA-pol can bind to the DNA template directly in the transcription process.
• The eukaryotic RNA-pol requires co-factors to bind to the DNA template together in the transcription process.
§2.1 Transcription of Prokaryotes
• Initiation phase: RNA-pol recognizes the promoter and starts the transcription.
• Elongation phase: the RNA strand is continuously growing.
• Termination phase: the RNA-pol stops synthesis and the nascent RNA is separated from the DNA template.
a. Initiation
• RNA-pol recognizes the TTGACA region, and slides to the TATAAT region, then opens the DNA duplex.
• The unwound region is about 17 ± 1 bp.
• The first nucleotide of the RNA transcript is always a purine triphosphate; GTP is more frequent than ATP.
• The pppGpN-OH structure remains on the RNA transcript until the RNA synthesis is completed.
• The three molecules form a transcription initiation complex.
RNA-pol (α2ββ'σ) - DNA - pppGpN-OH 3'
• No primer is needed for RNA synthesis.
• The σ subunit falls off from the RNA-pol once the first 3',5'-phosphodiester bond is formed.
• The core enzyme moves along the DNA template to enter the elongation phase.
b. Elongation
• The release of the σ subunit causes a conformational change of the core enzyme. The core enzyme then slides along the DNA template in the template's 3'→5' direction, so the RNA grows 5'→3'.
• Free NTPs are added sequentially to the 3'-OH of the nascent RNA strand.
• RNA-pol, a DNA segment of ~40 nt, and the nascent RNA form a complex called the transcription bubble.
• The 3' segment of the nascent RNA hybridizes with the DNA template, and its 5' end extends out of the transcription bubble as synthesis proceeds.
Transcription bubble
RNA-pol of E. coli
Simultaneous transcription and translation
c. Termination
• The RNA-pol stops moving on the DNA template. The RNA transcript falls off from the transcription complex.
• The termination occurs in either a ρ-dependent or a ρ-independent manner.
The termination function of ρ factor
The ρ factor, a hexamer, is an ATPase and a helicase.
ρ-independent termination
• The termination signal is a stretch of 30-40 nucleotides on the RNA transcript, consisting of a GC-rich region followed by a series of U residues.
• Because of this sequence, the nascent RNA transcript folds into a characteristic stem-loop structure that terminates the transcription.
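The signal described above (a self-complementary stretch followed by a run of U) can be searched for with a short Python sketch. The function names, the fixed stem length, the allowed loop sizes, and the minimum U run are all assumptions made for illustration; this version also demands a perfectly paired stem, which is stricter than real terminators.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (A-U and G-C pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]


def find_terminators(rna: str, stem: int = 6, loop=(3, 8), min_u: int = 4):
    """Return (stem_start, u_run_start) pairs, 0-based.

    A hit is a left arm of `stem` nt, a loop of loop[0]..loop[1] nt, a right
    arm that is the reverse complement of the left arm, and then at least
    `min_u` U residues. All four parameters are illustrative assumptions.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    for i in range(len(rna) - 2 * stem):
        left = rna[i:i + stem]
        for gap in range(loop[0], loop[1] + 1):
            j = i + stem + gap  # start of the right arm
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + min_u]
            if right == reverse_complement(left) and tail == "U" * min_u:
                hits.append((i, j + stem))
    return hits


# Synthetic example: stem GGCUGG / CCAGCC, loop AAAA, then UUUU.
demo = "AC" + "GGCUGG" + "AAAA" + "CCAGCC" + "UUUU" + "G"
print(find_terminators(demo))  # [(2, 18)]
```

Allowing mismatches or G-U wobble pairs in the stem would bring the search closer to real ρ-independent terminators, at the cost of more false positives.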
[Figure: ρ-independent terminator (rplL protein); the RNA transcript ends in UUUU...]
DNA 5' TTGCAGCCTGACAAATCAGGCTGATGGCTGGTGACTTTTTAGGCACCAGCCTTTTT... 3'
DNA 5' TTGCAGCCTGACAAATCAGGCTGATGGCTGGTGACTTTTTAGTCACCAGCCTTTTT... 3'
• The stem-loop structure alters the conformation of RNA-pol, causing the RNA-pol to pause.
• The RNA-RNA pairing within the stem-loop and the re-annealing of the DNA-DNA duplex then compete with the DNA-RNA hybrid, reducing its stability and causing the transcription complex to dissociate.
• Among all the base pairings, the most unstable one is rU:dA.
Stem-loop disruption
§2.2 Transcription of Eukaryotes
• Transcription initiation needs promoter and upstream regulatory regions.
• The cis-acting elements are the specific sequences on the DNA template that regulate the transcription of one or more genes.
a. Initiation
[Figure: cis-acting elements of a eukaryotic gene: enhancer, GC box (GCGC), CAAT box, and TATA box (Hogness box) upstream of the transcription start; the structural gene consists of exon, intron, exon]
Cis-acting element
TATA box
• RNA-pol does not bind the promoter directly.
• RNA-pol II associates with six transcription factors: TFII A, TFII B, TFII D, TFII E, TFII F, and TFII H.
• The trans-acting factors are the proteins that recognize and bind, directly or indirectly, the cis-acting elements and regulate their activity.
Transcription factors
TF for eukaryotic transcription
• TBP of TFII D binds TATA
• TFII A and TFII B bind TFII D
• TFII F-RNA-pol complex binds TFII B
• TFII F and TFII E open the dsDNA (helicase and ATPase)
• TFII H: completion of PIC
Pre-initiation complex (PIC)
[Figure: pre-initiation complex on the DNA: TBP and TAF at the TATA box, together with TF II A, TF II B, TF II F, RNA pol II, TF II E, and TF II H]
• TF II H has protein kinase activity and phosphorylates the CTD of RNA-pol. (CTD is the C-terminal domain of RNA-pol.)
• Only the phosphorylated RNA-pol (p-RNA-pol) can move downstream, starting the elongation phase.
• Most of the TFs fall off from PIC during the elongation phase.
Phosphorylation of RNA-pol
• The elongation is similar to that of prokaryotes.
• The transcription and translation do not take place simultaneously since they are separated by nuclear membrane.
b. Elongation
[Figure: RNA-pol at successive positions along the DNA, passing a nucleosome; arrow shows the moving direction]
• The termination sequence is AATAAA followed by GT repeats.
• The termination is closely related to the post-transcriptional modification.
c. Termination
Section 3
Post-Transcriptional Modification
• The nascent RNA, also known as primary transcript, needs to be modified to become functional tRNAs, rRNAs, and mRNAs.
• The modification is critical to eukaryotic systems.
• Primary transcripts of mRNA are called heterogeneous nuclear RNA (hnRNA).
• hnRNA is many times larger than the mature mRNA.
• Modification includes
– Capping at the 5'-end
– Tailing at the 3'-end
– mRNA splicing
– RNA editing
§3.1 Modification of hnRNA
[Figure: chemical structure of a mature mRNA: a methylated (CH3) guanosine joined through a 5'-5' triphosphate bridge to the 5' end, and a poly(A) tail (AAAAA-OH) at the 3' end]
a. Capping at the 5'-end
m7GpppGp----
Capping steps:
ppp5'NpNp
→ removing a phosphate group (Pi released) → pp5'NpNp
→ forming the 5'-5' triphosphate group (GTP added, PPi released) → G5'ppp5'NpNp
→ methylating at G7 → m7GpppNpNp
→ methylating at C2' of the first and second nucleotides after G → m7Gpppm2'Npm2'Np
• The 5'-cap structure is found on hnRNA too. The capping process occurs in nuclei.
• The cap structure of mRNA will be recognized by the cap-binding protein required for translation.
• The capping occurs prior to the splicing.
b. Poly-A tailing at the 3'-end
• There is no poly(dT) sequence on the DNA template. The tailing process does not depend on the template.
• The tailing process occurs prior to the splicing.
• The tailing process takes place in the nuclei.
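Because the tail is added without a template, tailing can be pictured as a simple string operation on the transcript. The sketch below is a toy model: the function name `add_poly_a`, the cleavage distance `downstream`, and the tail length `tail_len` are illustrative assumptions, and the AAUAAA signal is the RNA form of the AATAAA sequence mentioned under termination.

```python
def add_poly_a(transcript: str, signal: str = "AAUAAA",
               downstream: int = 20, tail_len: int = 200) -> str:
    """Cleave `downstream` nt past the signal and append a poly(A) tail.

    `downstream` and `tail_len` are assumed values for illustration only.
    If the signal is absent, the transcript is returned unchanged.
    """
    pos = transcript.find(signal)
    if pos == -1:
        return transcript
    cut = min(len(transcript), pos + len(signal) + downstream)
    # The A residues are appended without reading any template sequence.
    return transcript[:cut] + "A" * tail_len


pre_mrna = "GGG" + "AAUAAA" + "C" * 30
print(add_poly_a(pre_mrna, downstream=5, tail_len=10))
# GGGAAUAAACCCCCAAAAAAAAAA
```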
The matured mRNAs are much shorter than the DNA templates.
[Figure labels: DNA, mRNA]
c. mRNA splicing
A~G: non-coding regions; 1~7: coding regions
[Figure: gene map, 7 700 bp; coding regions L 1 2 3 4 5 6 7, non-coding regions A B C D E F G]
The structural genes are composed of coding and non-coding regions that alternate with each other.
Split gene
Exon and intron
Exons are the coding sequences that appear on split genes and primary transcripts, and are retained in the mature mRNA.
Introns are the non-coding sequences that are transcribed into primary mRNAs, and are removed in the later splicing process.
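The exon and intron definitions above amount to cutting marked intervals out of the primary transcript and joining what is left. A minimal Python sketch follows; the function name `splice`, the coordinate convention (0-based, half-open), and the toy sequence are assumptions for illustration.

```python
def splice(pre_mrna: str, introns) -> str:
    """Remove introns given as (start, end) pairs, 0-based and half-open.

    The pieces that remain (the exons) are joined in their original order.
    """
    exons, cursor = [], 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[cursor:start])  # exon before this intron
        cursor = end                          # skip over the intron
    exons.append(pre_mrna[cursor:])           # last exon
    return "".join(exons)


# Toy primary transcript: exon - intron - exon - intron - exon.
pre_mrna = "AUGGCC" + "GUAAGUCCAG" + "UUUGGA" + "GUACAG" + "UAA"
print(splice(pre_mrna, [(6, 16), (22, 28)]))  # AUGGCCUUUGGAUAA
```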
mRNA splicing
Splicing mechanism
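[Figure: two successive transesterifications; the figure also labels a lariat]
Pre-mRNA: 5' exon ...U pA ...(intron)... G pU... 3' exon, plus pG-OH
first transesterification
→ 5' exon ...U-OH  +  pGpA ...(intron)... G pU... 3' exon
second transesterification
→ 5' ...U pU... 3' (joined exons)  +  pGpA ...(intron)... G-OH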
• Editing takes place at the transcript (RNA) level.
• One gene can be responsible for more than one protein.
• Significance: after post-transcriptional modification, one gene sequence can serve multiple purposes (differentiation).
d. mRNA editing
Different pathways of apo B
Human apo B gene → hnRNA (14 500 bases)
liver: apo B100 (500 kD)
intestine: CAA to UAA at position 6666 → apo B48 (240 kD)
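The effect of the CAA to UAA change can be illustrated with a toy transcript: converting one C to U creates a stop codon, so the same RNA yields a shorter product. The sequence below is hypothetical and far shorter than the real apo B transcript, and the function names are assumptions for this sketch.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def edit_c_to_u(mrna: str, position: int) -> str:
    """Convert the C at 1-based `position` to U (C-to-U editing)."""
    i = position - 1
    if mrna[i] != "C":
        raise ValueError(f"expected C at position {position}, found {mrna[i]}")
    return mrna[:i] + "U" + mrna[i + 1:]


def translated_codons(mrna: str):
    """Return the codons read from position 1 up to the first stop codon."""
    codons = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


toy = "AUGGCUCAAGGUUAA"       # hypothetical: AUG GCU CAA GGU UAA
edited = edit_c_to_u(toy, 7)  # CAA -> UAA at the third codon

print(translated_codons(toy))     # ['AUG', 'GCU', 'CAA', 'GGU']
print(translated_codons(edited))  # ['AUG', 'GCU']
```

The unedited transcript is read to its normal stop codon, while the edited one stops early, mirroring the apo B100 and apo B48 outcome.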
§3.2 Modification of tRNA
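tRNA precursor
[Figure: tRNA processing, in four steps]
Precursor transcription: RNA-pol III transcribes the DNA (sequence labels: TGGCNNAGTGC, GGTTCGANNCC)
Cleavage: RNAase P, endonuclease; ligase
Addition of CCA-OH: tRNA nucleotidyl transferase (figure labels: ATP, ADP)
Base modification: sites (1) to (4) marked on the tRNA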
1. Methylation A→mA, G→mG
2. Reduction U→DHU
3. Isomerization (base transposition) U→ψ
4. Deamination A→I
§3.3 Modification of rRNA
• 45S transcript in nucleus is the precursor of 3 kinds of rRNAs.
• The matured rRNA will be assembled with ribosomal proteins to form ribosomes that are exported to cytosolic space.
[Figure: rRNA transcription yields the 45S-rRNA precursor, which is processed ("splicing" in the figure) into 18S-rRNA, 5.8S-rRNA, and 28S-rRNA]
• The rRNA precursor of Tetrahymena has self-splicing activity (1982).
• The catalytic RNA is called a ribozyme.
• Self-splicing often occurs in group I and group II introns.
§3.4 Ribozyme
• Both the catalytic domain and the substrate are located on the same molecule, and form a hammerhead structure.
• At least 13 nucleotides are conserved.
Hammer-head
• Supplements the central dogma
• Redefines enzymology
• Provides new insights into the origin of life
• Is useful in designing artificial ribozymes as therapeutic agents
Significance of ribozyme
Artificial ribozyme
• Thick lines: artificial ribozyme
• Thin lines: natural ribozyme
• X: consensus sequence
• Arrow: cleavage point