Transcription
Central Dogma
Genes
• Sequence of DNA that is transcribed.
• Encode proteins, tRNAs, rRNAs, etc.
• “Housekeeping” genes encode proteins or RNAs that are essential for normal cellular activity.
• Simplest bacterial genomes contain 500 to 600 genes.
• Multicellular eukaryotes contain between 15,000 and 50,000 genes.
Types of RNAs
• tRNA, rRNA, and mRNA
• rRNA and tRNA are very abundant relative to mRNA.
• But mRNA is transcribed at a high rate relative to its abundance.
• Abundance is a reflection of the relative stability of the different forms of RNA.
RNA Content of E. coli Cells
Type    Steady-state level    Synthetic capacity    Stability
rRNA    83%                   58%                   High
tRNA    14%                   10%                   High
mRNA    3%                    32%                   Very low
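The point of the table, that abundance tracks stability rather than synthesis, can be checked with a quick ratio. The following is a minimal sketch using only the percentages above; the name `turnover_index` is an illustrative label, not a standard quantity.

```python
# Percentages copied from the E. coli table above.
rna = {
    "rRNA": {"steady_state": 83, "synthetic_capacity": 58},
    "tRNA": {"steady_state": 14, "synthetic_capacity": 10},
    "mRNA": {"steady_state": 3, "synthetic_capacity": 32},
}

def turnover_index(entry):
    """Share of synthesis divided by share of steady-state RNA (illustrative ratio)."""
    return entry["synthetic_capacity"] / entry["steady_state"]

ratios = {name: turnover_index(entry) for name, entry in rna.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

A ratio well above 1 (mRNA, roughly 10.7) means the RNA is made much faster than it accumulates, so it must be degraded quickly. Ratios below 1 (rRNA and tRNA, roughly 0.7) indicate stable species.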
Phases of Transcription
• Initiation: binding of RNA polymerase to the promoter, unwinding of DNA, formation of primer.
• Elongation: RNA polymerase catalyzes the processive elongation of the RNA chain while unwinding and rewinding the DNA strands.
• Termination: end of transcription and disassembly of the transcription complex.
E. coli RNA Polymerase
• RNA polymerase core enzyme is a multimeric protein: α2ββ’
• The β’ subunit is involved in DNA binding
• The β subunit contains the polymerase active site
• The α subunit acts as a scaffold on which the other subunits assemble.
• Also requires σ-factor for initiation – forms the holo-enzyme complex
[Figure: site of DNA binding and RNA polymerization]
σ-factor
• The σ-factor is required for binding of the RNA polymerase to the promoter
• Association of the RNA polymerase core complex with the σ-factor forms the holo-RNA polymerase complex
• Without the σ-factor, the core complex binds to DNA non-specifically.
• With the σ-factor, the holo-enzyme binds specifically and with high affinity to the promoter region
• The σ-factor also decreases the affinity of the RNA polymerase for non-promoter regions
• Different σ-factors for specific classes of genes
General Gene Structure
• Promoter – sequences recognized by RNA polymerase as start site for transcription.
• Transcribed region – template from which mRNA is synthesized
• Terminator – sequences signaling the release of the RNA polymerase from the gene.
5’ Promoter | Transcribed region | Terminator 3’
Gene Promoters
• Site where RNA polymerase binds and initiates transcription.
• Genes that are regulated similarly contain common DNA sequences (consensus sequences) within their promoters
Important Consensus Sequences
• Pribnow box – position –10 from transcriptional start
• –35 region – position –35 from transcriptional start.
• Sites where the σ-factor binds.
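Recognition of the two consensus elements can be sketched as a simple sequence scan. The slides do not give the sequences themselves; the hexamers below (TTGACA at –35, TATAAT at –10) and the 16 to 18 bp spacer are the usual textbook values for σ70 promoters and are assumptions here, as are the function names and the one-mismatch tolerance.

```python
MINUS_35 = "TTGACA"   # assumed sigma-70 -35 consensus (not given in the slides)
MINUS_10 = "TATAAT"   # assumed Pribnow box consensus (not given in the slides)

def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_promoters(dna, max_mismatch=1, spacers=(16, 17, 18)):
    """Return (start of -35 hexamer, start of -10 hexamer) pairs that look promoter-like."""
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 5):
        if mismatches(dna[i:i + 6], MINUS_35) > max_mismatch:
            continue
        for gap in spacers:
            j = i + 6 + gap  # the -10 hexamer starts after the -35 hexamer plus spacer
            if j + 6 <= len(dna) and mismatches(dna[j:j + 6], MINUS_10) <= max_mismatch:
                hits.append((i, j))
    return hits

# Toy sequence: both hexamers separated by a 17 bp spacer.
example = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GCGCGCA"
print(find_promoters(example))
```

Real promoters rarely match the consensus exactly, which is why the sketch tolerates a mismatch; promoter strength in the cell depends on much more than this string comparison.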
Other σ-Factors
• Standard genes – σ70
• Nitrogen regulated genes – σ54
• Heat shock regulated genes – σ32
How does RNA polymerase find the promoter?
• RNA polymerase does not dissociate from the DNA strand and reassemble at the promoter (2nd order reaction – too slow)
• RNA polymerase holo-enzyme binds to DNA and scans for promoter sequences (scanning occurs in only one dimension, 100 times faster than diffusion limit)
• During scanning the enzyme is bound non-specifically to DNA.
• Can quickly scan 2000 base pairs
Transcriptional Initiation
• Rate limiting step of transcription.
• Requires unwinding of DNA and synthesis of primer.
• Conformational change occurs after DNA binding of RNA polymerase holo-enzyme.
• First RNA polymerase binds to DNA (closed complex), then a conformational change in the polymerase (open complex) causes formation of the transcription bubble (strand separation).
Initiation of Polymerization
• RNA polymerase has two binding sites for NTPs
• Initiation site prefers to bind ATP and GTP (most RNAs begin with a purine at 5'-end)
• Elongation site binds the second incoming NTP
• 3'-OH of first attacks alpha-P of second to form a new phosphoester bond (eliminating PPi)
• When a 6-10 unit oligonucleotide has been made, the sigma subunit dissociates, completing "initiation"
• NusA protein binds to the core complex after dissociation of the σ-factor to convert RNA polymerase to its elongation form.
Transcriptional Initiation
Closed complex
Open complex
Primer formation
Dissociation of σ-factor
Chain Elongation
• Core polymerase - no sigma
• Polymerase is accurate - only about 1 error in 10,000 bases
• Even this error rate is OK, since many transcripts are made from each gene
• Elongation rate is 20-50 bases per second - slower in G/C-rich regions (why??) and faster elsewhere
• Topoisomerases precede and follow polymerase to relieve supercoiling
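The rate and error figures above translate into simple arithmetic. The sketch below assumes a hypothetical 1,000-nucleotide gene and treats errors as independent events; neither assumption comes from the slides.

```python
ERROR_RATE = 1 / 10_000      # errors per base, from the slide
RATE_RANGE = (20, 50)        # bases per second, from the slide

def transcription_time(length_nt, rate_nt_per_s):
    """Seconds needed to elongate a transcript of the given length."""
    return length_nt / rate_nt_per_s

def expected_errors(length_nt, error_rate=ERROR_RATE):
    """Mean number of misincorporated bases per transcript."""
    return length_nt * error_rate

def fraction_error_free(length_nt, error_rate=ERROR_RATE):
    """Probability that a transcript has no errors, assuming independent errors."""
    return (1 - error_rate) ** length_nt

gene_length = 1_000  # hypothetical gene length in nucleotides
slow, fast = (transcription_time(gene_length, r) for r in RATE_RANGE)
print(f"time: {fast:.0f}-{slow:.0f} s")
print(f"expected errors: {expected_errors(gene_length):.2f}")
print(f"error-free fraction: {fraction_error_free(gene_length):.3f}")
```

Under these assumptions about nine in ten transcripts of such a gene carry no error at all, which is consistent with the slide's remark that the error rate is tolerable when many transcripts are made from each gene.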
Transcriptional Termination
• Process by which RNA polymerase complex disassembles from 3’ end of gene.
• Two Mechanisms – Pausing and “rho-mediated” termination
Pausing induces termination
• RNA polymerase can stall at “pause sites”
• Pause sites are GC rich (difficult to unwind)
• Can decrease transcription rates by a factor of 10 to 100.
• Hairpin formation in RNA can exaggerate pausing
• Hairpin structures in transcribed RNA can destabilize the DNA:RNA hybrid in the active site
• NusA protein increases pausing when hairpins form.
• The 3’ end of the transcript tends to be AU rich, so the DNA:RNA hybrid is easy to disrupt during pausing. This leads to disassembly of the RNA polymerase complex.
Rho Dependent Termination
• Rho is an ATP-dependent helicase
• It moves along the RNA transcript, finds the "bubble", unwinds it and releases the RNA chain
Eukaryotic Transcription
• Similar to what occurs in prokaryotes, but requires more accessory proteins in RNA polymerase complex.
• Multiple RNA polymerases
Eukaryotic RNA Polymerases
Type                            Location       Products
RNA polymerase I                Nucleolus      rRNA
RNA polymerase II               Nucleoplasm    mRNA
RNA polymerase III              Nucleoplasm    rRNA, tRNA, others
Mitochondrial RNA polymerase    Mitochondria   Mitochondrial gene transcripts
Chloroplast RNA polymerase      Chloroplast    Chloroplast gene transcripts
Eukaryotic RNA Polymerases
• RNA polymerase I, II, and III
• All 3 are big, multimeric proteins (500-700 kD)
• All have 2 large subunits with sequences similar to β and β' in E. coli RNA polymerase, so catalytic site may be conserved
Eukaryotic Gene Promoters
• Contain AT rich consensus sequence located –19 to –27 bp from transcription start (TATA box)
• Site where RNA polymerase II binds
RNA Polymerase II
• Most interesting because it regulates synthesis of mRNA
• Yeast Pol II consists of 10 different peptides (RPB1 - RPB10)
• RPB1 and RPB2 are homologous to E. coli RNA polymerase β' and β, respectively
• RPB1 has DNA-binding site; RPB2 binds NTP
• RPB1 has a C-terminal domain (CTD) of repeated PTSPSYS units
• 5 of these 7 residues have -OH, so this is a hydrophilic and phosphorylatable site
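The "5 of 7" count for the CTD repeat can be verified directly. This is a small sketch; the set of hydroxyl-bearing residues (Ser, Thr, Tyr) is standard amino acid chemistry, and the function name is illustrative.

```python
HEPTAD = "PTSPSYS"                      # repeat unit as written on the slide
HYDROXYL_RESIDUES = {"S", "T", "Y"}     # Ser, Thr and Tyr side chains carry an -OH

def hydroxyl_positions(peptide):
    """Return 1-based positions of residues whose side chain has a hydroxyl group."""
    return [i for i, aa in enumerate(peptide, start=1) if aa in HYDROXYL_RESIDUES]

positions = hydroxyl_positions(HEPTAD)
print(positions, len(positions))
```

Only the two prolines lack a hydroxyl group, so every other position in the repeat is a potential phosphorylation site.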
More RNA Polymerase II
• CTD is essential and this domain may project away from the globular portion of the enzyme (up to 50 nm!)
• Only RNA Pol II whose CTD is NOT phosphorylated can initiate transcription
• TATA box (TATAAA) is a consensus promoter
• 7 general transcription factors are required
Transcription Factors
• Polymerase I, II, and III do not bind specifically to promoters
• They must interact with their promoters via so-called transcription factors
• Transcription factors recognize and initiate transcription at specific promoter sequences
Transcription Factors
• TFIIA, TFIIB – components of RNA polymerase II holo-enzyme complex
• TFIID – Initiation factor, contains TATA binding protein (TBP) subunit. TATA box recognition.
• TFIIF – (RAP30/74) decreases affinity for non-promoter DNA
Eukaryotic Transcription
• Once the initiation complex assembles, the process is similar to that in bacteria (closed complex to open complex transition, primer formation)
• Once the elongation phase begins, most transcription factors dissociate from DNA and RNA polymerase II (but TFIIF may remain bound).
• TFIIS – Elongation factor that binds at the elongation phase. May also play a role analogous to that of NusA protein in termination.
Transcriptional Regulation and RNA Processing
Gene Expression
• Constitutive – Genes expressed in all cells (Housekeeping genes)
• Induced – Genes whose expression is regulated by environmental, developmental, or metabolic signals.
Regulation of Gene Expression
[Figure: levels at which gene expression is regulated. Labels: mRNA with 5’ cap and poly(A) tail, RNA processing, RNA degradation, post-translational modification, protein degradation, active enzyme]
Transcriptional Regulation
• Regulation occurring at the initiation of transcription.
• Involves regulatory sequences present within the promoter region of a gene (cis-elements)
• Involves soluble protein factors (trans-acting factors) that promote (activators) or inhibit (repressors) binding of the RNA polymerase to the promoter
Cis-elements
• Typically found in 5’ untranscribed region of the gene (promoter region).
• Can be specific sites for binding of activators or repressors.
• Position and orientation of cis element relative to transcriptional start site is usually fixed.
Enhancers
• Enhancers are a class of cis-elements that can be located either upstream or downstream of the promoter region (often a long distance away).
• Enhancers can also be present within the transcribed region of the gene.
• Enhancers can be inverted and still function
5’-ATGCATGC-3’ = 5’-CGTACGTA-3’
Two Classes of Trans-Acting Factors
• Activators and repressors- Bind to cis-elements.
• Co-activators and co-repressors – bind to proteins associated with cis-elements. Promote or inhibit assembly of transcriptional initiation complex
Structural Motifs in DNA-Binding Regulatory Proteins
• Crucial feature must be atomic contacts between protein residues and bases and sugar-phosphate backbone of DNA
• Most contacts are in the major groove of DNA
• 80% of regulatory proteins can be assigned to one of three classes: helix-turn-helix (HTH), zinc finger (Zn-finger) and leucine zipper (bZIP)
• In addition to DNA-binding domains, these proteins usually possess other domains that interact with other proteins
The Helix-Turn-Helix Motif
• Contains two alpha helices separated by a loop with a beta turn
• The C-terminal helix fits in major groove of DNA; N-terminal helix stabilizes by hydrophobic interactions with C-terminal helix
The Zn-Finger Motif
Zn fingers form a folded beta strand and an alpha helix that fits into the DNA major groove.
The Leucine Zipper Motif
• Forms amphipathic alpha helix and a coiled-coil dimer
• Leucine zipper proteins dimerize, either as homo- or hetero-dimers
• The basic region is the DNA-recognition site
• Basic region is often modeled as a pair of helices that can wrap around the major groove
Binding of some trans-factors is regulated by allosteric modification
Transcription Regulation in Prokaryotes
• Genes for enzymes for pathways are grouped in clusters on the chromosome - called operons
• This allows coordinated expression
• A regulatory sequence adjacent to such a unit determines whether it is transcribed - this is the ‘operator’
• Regulatory proteins work with operators to control transcription of the genes
Induction and Repression
• Increased synthesis of genes in response to a metabolite is ‘induction’
• Decreased synthesis in response to a metabolite is ‘repression’
lac operon
• Lac operon – encodes 3 proteins involved in galactoside uptake and catabolism.
• Permease – imports galactosides (lactose)
• β-galactosidase – cleaves lactose to glucose and galactose.
• β-galactoside transacetylase – acetylates β-galactosides
• Expression of lac operon is negatively regulated by the lacI protein
The lacI protein
• The structural genes of the lac operon are controlled by negative regulation
• lacI gene product is the lac repressor
• When the lacI protein binds to the lac operator it prevents transcription
• lac repressor – 2 domains: DNA binding on N-term.; C-term. binds inducer, forms tetramer.
Inhibition of repression of lac operon by inducer binding to lacI
• Binding of inducer to lacI causes an allosteric change that prevents binding to the operator
• Inducer is allolactose which is formed when excess lactose is present.
Catabolite Repression of lac Operon (Positive regulation)
• When excess glucose is present, the lac operon is repressed even in the presence of lactose.
• In the absence of glucose, the lac operon is induced.
• Absence of glucose results in increased synthesis of cAMP
• cAMP binds to cAMP regulatory protein (CRP) (AKA CAP).
• When activated by cAMP, CRP binds to lac promoter and stimulates transcription.
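The two controls on the lac operon (negative regulation by lacI, positive regulation by CRP-cAMP) combine into a small truth table. The sketch below is a qualitative model only; the function name and the labels "off", "low" and "high" are illustrative choices, not terms from the slides.

```python
def lac_operon_state(lactose, glucose):
    """Qualitative lac operon expression from the two signals described above."""
    repressor_bound = not lactose   # no allolactose, so lacI stays on the operator
    crp_active = not glucose        # low glucose raises cAMP, so CRP-cAMP binds the promoter
    if repressor_bound:
        return "off"
    # Repressor released: CRP-cAMP decides between strong and weak transcription.
    return "high" if crp_active else "low"

for lactose in (False, True):
    for glucose in (False, True):
        state = lac_operon_state(lactose, glucose)
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {state}")
```

Only the combination of lactose present and glucose absent gives full induction. With both sugars present the repressor is released but, without CRP-cAMP, transcription stays low, which is the catabolite repression described above.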
Post-transcriptional Modification of RNA
• tRNA Processing
• rRNA Processing
• Eukaryotic mRNA Processing
tRNA Processing
• tRNA is first transcribed by RNA polymerase III and is then processed
• tRNAs are further processed by chemical modification of bases
rRNA Processing
• Multiple rRNAs are originally transcribed as a single transcript.
• In eukaryotes this involves RNA polymerase I
• 5 endonucleases are involved in the processing
Processing of Eukaryotic mRNA
5’ Capping
• Primary transcripts (aka pre-mRNAs or heterogeneous nuclear RNA) are usually first "capped" by a guanylyl group
• The reaction is catalyzed by guanylyl transferase
• Capping G residue is methylated at 7-position
• Additional methylations occur at 2'-O positions of next two residues and at 6-amino of the first adenine
• Modification required to increase mRNA stability
3'-Polyadenylylation
• Termination of transcription occurs only after RNA polymerase has transcribed past a consensus AAUAAA sequence - the poly(A)+ addition site
• 10-30 nucleotides past this site, a string of 100 to 200 adenine residues is added to the mRNA transcript - the poly(A)+ tail
• poly(A) polymerase adds these A residues
• poly(A) tail may govern stability of the mRNA
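The positional rules above (AAUAAA signal, tail added 10-30 nucleotides downstream, 100 to 200 A residues) can be sketched as a string operation. This is a simplified model: it treats tail addition as cutting the transcript at a fixed offset, and the default offset and tail length are arbitrary picks inside the ranges on the slide. The function name is hypothetical.

```python
POLY_A_SIGNAL = "AAUAAA"

def polyadenylate(pre_mrna, offset=20, tail_length=150):
    """Cut `offset` nt past the first AAUAAA and append a poly(A) tail.

    offset (10-30) and tail_length (100-200) follow the ranges on the slide;
    the defaults are arbitrary values inside those ranges.
    """
    if not 10 <= offset <= 30 or not 100 <= tail_length <= 200:
        raise ValueError("offset or tail_length outside the ranges given on the slide")
    site = pre_mrna.find(POLY_A_SIGNAL)
    if site == -1:
        return None  # no signal, so no tail in this simplified model
    cut = site + len(POLY_A_SIGNAL) + offset
    if cut > len(pre_mrna):
        return None  # transcript ends before the addition point
    return pre_mrna[:cut] + "A" * tail_length

# Toy transcript: short leader, the signal, then 40 nt of downstream sequence.
transcript = "GGCUACG" + POLY_A_SIGNAL + "C" * 40
mature = polyadenylate(transcript)
print(len(transcript), len(mature))
```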
Splicing of Pre-mRNA
• Pre-mRNA must be capped and polyadenylated before splicing
• In "splicing", the introns are excised and the exons are sewn together to form mature mRNA
• Splicing occurs only in the nucleus
• The 5'-end of an intron in higher eukaryotes is always GU and the 3'-end is always AG
• All introns have a "branch site" 18 to 40 nucleotides upstream from 3'-splice site
Splicing of Pre-mRNA
• Lariat structure forms by reaction of the 5’ splice site G with the 2’-OH of the A in the branch site.
• Exons are then joined and the lariat is excised.
• Splicing complex includes snRNAs that are involved in identification of splice junctions.
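The net result of splicing, removing a GU...AG intron and joining the flanking exons, can be sketched as follows. The sequences are made up for illustration, the function name is hypothetical, and the lariat chemistry and snRNA recognition are not modeled.

```python
def splice(pre_mrna, intron_start, intron_end):
    """Remove pre_mrna[intron_start:intron_end] if it obeys the GU...AG rule."""
    intron = pre_mrna[intron_start:intron_end]
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron must begin with GU and end with AG")
    # Join the upstream exon directly to the downstream exon.
    return pre_mrna[:intron_start] + pre_mrna[intron_end:]

exon1, exon2 = "AUGGCC", "UUCUAA"            # hypothetical exons
intron = "GU" + "CCUACUAAC" + "U" * 20 + "AG"  # hypothetical intron
pre_mrna = exon1 + intron + exon2
mature = splice(pre_mrna, len(exon1), len(exon1) + len(intron))
print(mature)
```

Here the intron boundaries are supplied by the caller; in the cell, finding those boundaries is the job of the snRNAs in the splicing complex.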