Date post: | 19-Dec-2015 |
Introduction to Molecular Biology
Cells, genome, gene and DNA
• Almost all cells of a living organism contain an identical set of codes describing the genes and their regulation
• This code is encoded as one or more strands of DNA
• Cells from the different parts of an organism have the same DNA
  – Distinction: The portion of the DNA that is transcribed and translated into protein
• Genome: entire complement of DNA molecules of each organism
• Overall function of genome: Control the generation of molecules (mostly proteins) that will
  – Regulate the metabolism of a cell and its response to the environment, and
  – Provide structural integrity.
Chromosomes
• Humans have 46 chromosomes
  – 22 pairs of autosomal chromosomes
    • Numbered largest (1) to smallest (22)
  – Two sex chromosomes
    • XX for women, XY for men
  – One chromosome of each pair comes from each parent
• The collection of chromosomes is called a genome
Chromosome
A chromosome is a very long, continuous piece of DNA, which contains many genes, regulatory elements and other intervening nucleotide sequences.
Chrom.  Genes  Bases
1       2968   245,203,898
2       2288   243,315,028
3       2032   199,411,731
4       1297   191,610,523
5       1643   180,967,295
6       1963   170,740,541
7       1443   158,431,299
8       1127   145,908,738
9       1299   134,505,819
10      1440   135,480,874
11      2093   134,978,784
12      1652   133,464,434
13       748   114,151,656
14      1098   105,311,216
15      1122   100,114,055
16      1098    89,995,999
17      1576    81,691,216
18       766    77,753,510
19      1454    63,790,860
20       927    63,644,868
21       303    46,976,537
22       288    49,476,972
X       1184   152,634,166
Y        231    50,961,097
http://www.tqnyc.org/NYC040844/Mitosis.htm
Genomes
• the term genome refers to the complete complement of DNA for a given species
• the human genome consists of 46 chromosomes.
• every cell (except sex cells and mature red blood cells) contains the complete genome of an organism
DNA
• ~3.2 billion base pairs in every cell make up the human genome
• genes form only 1.5% of the human genome
• a gene is a segment of DNA that encodes the construction plan for a protein
• in humans there are only about 30,000 genes
DNA - Deoxyribonucleic acid
• Deoxyribonucleic acid (DNA) forms a double-stranded helix.
• A sugar-phosphate backbone forms the outer shell of the helix
• The two strands of DNA run in opposite directions.
• Bases face towards each other and form hydrogen bonds
• DNA carries the genetic instructions (genes)
Free bases: Cytosine - C, Guanine - G, Adenine - A, Thymine - T
complementary base pairs
DNA
• can be thought of as the “blueprint” for an organism
• composed of small molecules called nucleotides
  – four different nucleotides distinguished by the four bases: adenine (A), cytosine (C), guanine (G) and thymine (T)
• is a polymer: large molecule consisting of similar units (nucleotides in this case)
• DNA is digital information
• a single strand of DNA can be thought of as a string composed of the four letters: A, C, G, T
ctgctggaccgggtgctaggaccctgactgcccggggccgggggtgcggggcccgctgag…
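Because a strand is just a string over four letters, simple questions about it become ordinary string operations. The following is a minimal Python sketch, not part of the original slides: it uses only the visible part of the truncated example strand, and the names `base_counts` and `gc_fraction` are illustrative.

```python
from collections import Counter

# Visible prefix of the example strand; the remainder is elided in the slide.
strand = "ctgctggaccgggtgctaggaccctgactgcccggggccgggggtgcggggcccgctgag"


def base_counts(seq):
    """Count how often each of the four letters occurs in a strand."""
    seq = seq.upper()
    unknown = set(seq) - set("ACGT")
    if unknown:
        raise ValueError(f"not a DNA string, found: {sorted(unknown)}")
    return Counter(seq)


def gc_fraction(seq):
    """Fraction of positions that are G or C."""
    counts = base_counts(seq)
    return (counts["G"] + counts["C"]) / len(seq)


print(len(strand), dict(base_counts(strand)), round(gc_fraction(strand), 3))
```

The validation step is a design choice: rejecting letters outside A, C, G, T early avoids silently miscounting a malformed strand.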
Watson-Crick Base Pairs
• A bonds to T
• C bonds to G
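The two pairing rules are enough to derive the partner base for every position of a strand. A small Python sketch of that lookup (the names `PAIR` and `complement` are illustrative, not from the slides):

```python
# Watson-Crick pairing: A bonds to T, C bonds to G (and vice versa).
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the base that pairs with each position of the strand."""
    return "".join(PAIR[base] for base in strand.upper())


print(complement("ATGC"))  # TACG
```

Applying `complement` twice returns the original string, which mirrors the symmetry of the pairing rules.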
The Double Helix
DNA molecules usually consist of two strands arranged in the famous double helix
The Central Dogma
Structure of DNA
• Made up of 4 different building blocks (so-called nucleotide bases), each an almost planar nitrogenous organic compound
  – Adenine (A)
  – Thymine (T)
  – Guanine (G)
  – Cytosine (C)
  – Base pairs (A -- T, C -- G)
DNA - Deoxyribonucleic acid
A nucleotide is an organic molecule built of three components:
1. one of five bases (A, G, C and T in DNA; U replaces T in RNA)
2. a pentose sugar (deoxyribose in DNA or ribose in RNA)
3. a phosphate group.
Nucleoside = Nucleobase + Pentose
Nucleotide = Nucleobase + Pentose + Phosphate Group
Free base     Nucleoside   Nucleotide
Adenine (A)   Adenosine    Adenosine monophosphate (AMP)
Guanine (G)   Guanosine    Guanosine monophosphate (GMP)
Cytosine (C)  Cytidine     Cytidine monophosphate (CMP)
Thymine (T)   Thymidine    Thymidine monophosphate (TMP)
DNA - Deoxyribonucleic acid
[Figure: structures of the nucleotides TMP, AMP, GMP and CMP, each made of a phosphate, a sugar (carbons numbered 1´ to 5´) and a base]
Structure of DNA -- 2
• Bases (A, T, C, G) are attached to a sugar-phosphate backbone to form one of 2 strands of a DNA molecule.
  – Phosphate (PO₄³⁻)
  – Deoxyribose
• Two strands are bonded together by the base pairs (A – T, C – G).
• Results in mirror image or complementary strands, each is twisted (or helical), and when bonded they form a double helix.
• Direction of each strand (5’ meaning beginning or 3’ meaning end of the strand)
  – 5’ and 3’ refer to the numbered carbon atoms of the sugar molecule in the DNA backbone to which the neighbouring phosphate groups are attached.
  – complementary strands are oriented in opposite direction to each other.
Structure of DNA -- 3
DNA
• Two strands form bonds at complementary bases
  – A with T
  – C with G
• Strands are asymmetric
  – 5’ -> 3’
  – Reversed with respect to one another.
• This stable conformation is the “double helix”
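Because the two strands are reversed with respect to one another, the partner of a strand written 5’ -> 3’ is its reverse complement when it is also written 5’ -> 3’. A minimal Python sketch (the function names are illustrative):

```python
# Map each base to its pairing partner (A with T, C with G).
PAIRING = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand_5_to_3):
    """Partner strand, also written 5' -> 3'.

    Complementing gives the partner read 3' -> 5'; reversing the string
    flips it back into the conventional 5' -> 3' orientation.
    """
    return strand_5_to_3.upper().translate(PAIRING)[::-1]


def is_self_complementary(strand_5_to_3):
    """True if both strands read the same in the 5' -> 3' direction."""
    strand = strand_5_to_3.upper()
    return strand == reverse_complement(strand)


print(reverse_complement("ATGC"))       # GCAT
print(is_self_complementary("GAATTC"))  # True
```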
[Figure: the two antiparallel strands of DNA, one running 5’ to 3’ and the other 3’ to 5’]
RNA – Ribonucleic acid
In RNA the base Thymine (T) is replaced by Uracil (U). The other difference from DNA is that the sugar (pentose) is ribose instead of deoxyribose. Ribose has an additional hydroxyl group.
Bases: Cytosine - C, Guanine - G, Adenine - A, Uracil - U
[Figure: structure of uracil]
RNA transmits genetic information from DNA (via transcription) into proteins (by translation). RNA is almost exclusively found in the single-stranded form.
DNA is a polymer of 2’-deoxyribonucleotides
2’-deoxyribose sugars
Phosphodiester linkages
Directional chain (5’ to 3’)
4 Bases
purines: adenine & guanine
pyrimidines: cytosine & thymine
[Figure: chemical structure of the DNA tetranucleotide GCTAp, from the 5’ end to the 3’ end, with sugar carbons numbered 1’ to 5’]
RNA is a polymer of ribonucleotides
ribose sugars
Phosphodiester linkages
Directional chain (5’ to 3’)
4 Bases
purines: adenine & guanine
pyrimidines: cytosine & uracil
[Figure: chemical structure of the RNA tetranucleotide GCUAp, from the 5’ end to the 3’ end, with sugar carbons numbered 1’ to 5’ and a hydroxyl (OH) group on each 2’ carbon]
The phosphate groups of DNA and RNA are negatively charged
A phosphodiester group has a pKa of about 1, and so will always be ionized and negatively charged under physiological conditions (pH ~7).
Nucleic acids require counterions such as Mg2+, polyamines, histones or other proteins to balance this charge.
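The statement that a group with a pKa of about 1 is essentially always ionized at pH ~7 follows from the Henderson-Hasselbalch relation, under which the deprotonated fraction is 1 / (1 + 10^(pKa − pH)). A small Python sketch of that arithmetic (the function name is illustrative, and it treats the phosphodiester as a simple monoprotic acid):

```python
def ionized_fraction(pka, ph):
    """Fraction of an acid in its deprotonated (charged) form.

    Henderson-Hasselbalch: [A-] / ([HA] + [A-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Phosphodiester group (pKa ~1) at physiological pH (~7):
# 10**(1 - 7) = 1e-6, so only about one group in a million is protonated.
print(ionized_fraction(1.0, 7.0))
```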
[Figure: sugar-phosphate backbone from the 5’ end to the 3’ end, with a counterion (M+) next to each negatively charged phosphate group]
Duplication of DNA
• Occurs through the coordinated action of many molecules, including
  – DNA polymerases (synthesizing new DNA),
  – DNA helicases and gyrases (unwinding the molecule and relieving the resulting supercoiling), and
  – DNA ligases (concatenating segments together)
Transcription of DNA to RNA
• Why transcription:
  – (For genome) to direct or effect changes in the cytoplasm of the cell
  – Need to generate new proteins to populate the cytosol (heterogeneous intracellular soup of the cytoplasm)
• Note: DNA is in the nucleus, while proteins are needed in the cytoplasm, where many of the cell’s functions are performed.
• Coding region of the DNA is copied to a more transient molecule called RNA
  – Gene is a single segment of the coding region that is transcribed into RNA
  – Generation of RNA from DNA (in the nucleus) is done through a process called transcription
Genes to Proteins
• Ribonucleic acid (RNA)
  – Single stranded
  – Ribose sugar, rather than deoxy-ribose
  – Uracil (U) instead of Thymine (T)
  – RNA can move out of the nucleus to the cytoplasm
• In Eukaryotes, the primary transcript (RNA) is “edited” before it moves to the cytoplasm.
Genes to Proteins
• The RNA that moves from the nucleus to the cytoplasm is called messenger RNA (mRNA)
• Translation
  – Carried out by the Ribosome
  – Makes a chain of amino-acids from mRNA
  – 3 bases (codon) -> 1 amino-acid
  – Starts with Methionine (AUG)
  – Ends with Stop codon (UGA, UAA, UAG)
• Protein is often modified after translation!
  – Initial methionine loss
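The translation rules listed above (read three bases at a time, start at AUG, stop at UGA, UAA or UAG) can be sketched in a few lines of Python. This is an illustration only: it assumes the standard genetic code, written in the conventional U, C, A, G codon order with `*` marking stop codons, starts at the first AUG it finds, and ignores post-translational modification. The names `CODON_TABLE` and `translate` are not from the slides.

```python
from itertools import product

# Standard genetic code, one letter per codon, codons ordered UUU, UUC, UUA, ...
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG up to the first in-frame stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


# AUG GCC AAA UGA -> Met, Ala, Lys, stop
print(translate("GGAUGGCCAAAUGA"))  # MAK
```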
Transcription
• RNA (Ribonucleic acid)
  – Similar to DNA (except for a chemical modification of the sugar backbone)
  – Instead of T contains U (Uracil) which binds with A.
  – Is not double stranded but single stranded
  – RNA molecules tend to fold back on themselves to make helical twisted and rigid segments.
• RNA is synthesized
  – By unwinding the DNA double helix separating the 2 strands.
  – Using one of the strands as a template along which to build the RNA molecule
  – Accomplished by Enzyme RNA polymerase (binds to promoter and copies or transcribes the gene in its full length)
  – Resulting molecule is called Pre-mRNA
  – Single stranded pre-mRNA is then processed.
  – Splicing (mediated by spliceosome consisting of RNA and proteins) removes the introns.
  – Ends modified (Capping modifies 5’ end and Polyadenylation adds adenines at the 3’ end) to enhance stability
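The template-copying step can be written down as a string operation. In this Python sketch (function names are illustrative), the RNA is built complementary to the template strand with U in place of T; equivalently, it has the same sequence as the other (coding) strand with T replaced by U. Promoter recognition, capping, polyadenylation and splicing are deliberately left out.

```python
# DNA template base -> RNA base that pairs with it (A pairs with U in RNA).
TEMPLATE_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe_from_template(template_5_to_3):
    """RNA (5' -> 3') built along a template strand given 5' -> 3'."""
    paired = template_5_to_3.upper().translate(TEMPLATE_TO_RNA)
    return paired[::-1]  # the RNA runs antiparallel to its template


def transcribe_from_coding(coding_5_to_3):
    """Same RNA, read off the non-template (coding) strand."""
    return coding_5_to_3.upper().replace("T", "U")


print(transcribe_from_coding("ATGGCC"))    # AUGGCC
print(transcribe_from_template("GGCCAT"))  # AUGGCC
```

Both calls describe the same double-stranded stretch: `GGCCAT` is the reverse complement of `ATGGCC`, so the two routes give the same transcript.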
mRNA, ORFs, etc.
• Each cell has 20 to 30 pg of RNA (1% of the cell mass)
• The RNA that codes for proteins is called messenger RNA (mRNA)
• The part of DNA that provides that code is called Open Reading Frame (ORF)
• When read in the standard 5’ to 3’ direction, the portion of DNA before the ORF is considered upstream and the portion following the ORF is considered downstream.
• Promoter regions: DNA sequence upstream of an ORF
  – Specifically determine which gene to transcribe
  – Transcription factors: proteins that contain a part that binds to specific promoter regions, thus activating or deactivating transcription of the downstream ORF
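A rough way to make "ORF", "upstream" and "downstream" concrete is to scan a DNA string for a start codon followed by an in-frame stop codon. The Python sketch below is a simplification under stated assumptions: it looks only at the given strand (not the reverse strand), uses ATG as the only start codon, and the `min_codons` cut-off is an arbitrary illustrative parameter. The example sequence is made up.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of ORFs on the given strand.

    An ORF here runs from an ATG to the first in-frame stop codon;
    `end` is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


dna = "TTATGGCCAAATGACC"  # made-up example: TT | ATG GCC AAA TGA | CC
for start, end in find_orfs(dna):
    print("upstream:", dna[:start], "ORF:", dna[start:end], "downstream:", dna[end:])
```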
Coding and non-coding RNA
• Not all RNA codes for proteins
  – 4% of total RNA is made of coding RNA
  – Of the non-coding RNA
    • Ribosomal RNA (rRNA) and transfer RNA (tRNA) are used in the various protein translational apparatus
    • Small nuclear RNA (snRNA) – found in eukaryotes, is part of the splicing apparatus
    • Small nucleolar RNA (snoRNA) involved in methylation of rRNA
    • Small cytoplasmic RNA (scRNA) plays a role in the expression of specific genes
Prokaryotic and Eukaryotic cells
• Eukaryotes: Organisms whose cells contain compartments or organelles within the cell, such as mitochondria and nucleus
  – Animals, plants
• Prokaryotes: Whose cells do not have these organelles (e.g. bacteria)
  – Most prokaryotes have a smaller genome, typically contained in a single circular DNA molecule.
  – Additional genetic information may be contained in smaller satellite pieces of DNA called plasmids
More on transcription
• Most eukaryotic genes have exons (portions that will be put in the mRNA) and introns (that are normally spliced out)
  – Some introns may have a promoter-like control of the transcription process
  – If an intron is not spliced out then an alternative splicing product is created.
  – Various tissue types can flexibly alter their gene products through alternative splicing
• Post-splicing (in Eukaryotes)
  – The generated mRNA is exported (through nuclear pore complexes) to the cytoplasm
  – In the cytoplasm, the ribosomal complex (containing hundreds of proteins and special function RNA molecules) acts to generate the protein on the basis of the mRNA code.
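Splicing can be pictured as keeping the exon intervals of the pre-mRNA and dropping everything between them. The Python sketch below is illustrative only: the sequence and the exon coordinates are made up, and real splice-site recognition by the spliceosome is not modelled.

```python
def splice(pre_mrna, exons):
    """Join the exon intervals (start, end), end exclusive, in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical pre-mRNA: exon 1, one intron, exon 2.
EXON_1 = "AUGGCC"
INTRON = "GUAAGUUUUCAG"
EXON_2 = "AAAUGA"
PRE_MRNA = EXON_1 + INTRON + EXON_2

# Normal splicing removes the intron.
normal = splice(PRE_MRNA, [(0, 6), (18, 24)])
print(normal)  # AUGGCCAAAUGA

# If the intron is not spliced out, a different (alternative) product results.
retained = splice(PRE_MRNA, [(0, len(PRE_MRNA))])
print(retained == PRE_MRNA)  # True
```

Representing exons as a list of intervals makes alternative splicing a matter of passing a different list for the same pre-mRNA.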
Translation
• Process of generating a protein or polypeptide from an mRNA molecule is known as translation.
• Protein: a polymer or chain of amino acids, whose sequence is determined by the mRNA template
  – 3 nucleotides (one codon) code for each of the 20 naturally occurring amino acids
  – 4³ = 64; thus several trinucleotide sequences (codons) correspond to a single amino acid.
  – There is no nucleotide between codons, and a few codons represent start and stop.
  – Notable exceptions: code of naturally occurring selenocysteine is identical to that for a stop codon, except for a particular nucleotide sequence further downstream.
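The counting argument (4³ = 64 codons for 20 amino acids) can be checked by enumerating the codons. The Python sketch below assumes the standard genetic code in the conventional U, C, A, G order, with `*` marking stop codons; it does not model exceptions such as selenocysteine.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Enumerate all trinucleotides: UUU, UUC, UUA, UUG, UCU, ...
codons = ["".join(c) for c in product(BASES, repeat=3)]
table = dict(zip(codons, AMINO_ACIDS))

# How many codons map to each amino acid (or to stop).
degeneracy = Counter(table.values())

print(len(codons))                  # 64
print(degeneracy["*"])              # 3 stop codons
print(len(degeneracy) - 1)          # 20 amino acids
print(degeneracy["L"], degeneracy["M"])  # 6 1
```

With 61 sense codons shared among 20 amino acids, most amino acids necessarily have more than one codon; methionine and tryptophan are the two with exactly one.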
Translocation of proteins
• A newly formed protein needs to be translocated to the right place to perform its function (such as structural protein in the cytoskeleton, as a cell membrane receptor, as a hormone that is to be secreted by the cell, etc.)
• Signal peptide (header): part of the polypeptide that is one of the determinants of its location and handling
Transcriptional programs
• Initiation of the transcription process can be caused by external events or by a programmed event within the cell.
• External events
  – Piezoelectric forces generated in bones through walking can gradually stimulate osteoblastic and osteoclastic transcriptional activity to cause bone remodelling; Heat shock
  – Appearance or disappearance of new micro or macronutrients around the cell; binding of distantly secreted hormones
• Internally programmed sequences of transcriptional expression (e.g. clock and per genes)
• Pathological internal derangements of the cell
  – Self-repair or damage detection programs can trigger apoptosis (self-destruction) under conditions such as irreparable DNA damage
Biological function of proteins
• Enzyme catalysis: DNA polymerases, lactate dehydrogenase, trypsin
• Transport: hemoglobin, membrane transporters, serum albumin
• Storage: ovalbumin, egg-white protein, ferritin
• Motion: myosin, actin, tubulin, flagellar proteins
• Structural and mechanical support: collagen, elastin, keratin, viral coat proteins
• Defense: antibodies, complement factors, blood clotting factors, protease inhibitors
• Signal transduction: receptors, ion channels, rhodopsin, G proteins, signalling cascade proteins
• Control of growth, differentiation and metabolism: repressor proteins, growth factors, cytokines, bone morphogenetic proteins, peptide hormones, cell adhesion proteins
• Toxins: snake venoms, cholera toxin
Gene expression studies
• Allow you to understand how a gene is regulated in a tissue or a cell type.
• Most useful way of studying gene expression is by measuring the levels of mRNA produced from a particular gene in a particular tissue.
• Application: to understand certain biological processes it is useful to study the differences in gene expression which occur during such processes. E.g.
  – It is of interest to know which genes are induced or repressed, say in the liver, after a particular drug is taken.
  – Or which genes are expressed in a tumor but not in the surrounding normal tissue.
• Some techniques for analyzing mRNA level of a single gene or to quantify gene expression
  – Northern blots
  – Quantitative reverse transcriptase PCR (QT-RT-PCR)
  – DNA microarrays
  – Proteomics (analysis of the protein synthesis that results from gene expression)
DNA microarrays
• Consist of thousands of DNA probes corresponding to different genes arranged as an array.
• Each probe (sometimes consisting of a short sequence of synthetic DNA) is complementary to a different mRNA (or cDNA)
• mRNA isolated from a tissue or cell type is converted to fluorescently labeled mRNA or cDNA and is used to hybridize the array.
• Each expressed gene in the sample will bind to its complementary probe on the array and generate a fluorescent signal.
• A DNA microarray can interrogate the level of transcription of several thousand different genes from one sample in one experiment. (One DNA microarray experiment reveals the mRNA levels of 1000s of genes from one tissue or cell type at one time point)
• Particularly useful when studying the effect of environmental factors on gene expression.
• A fingernail-size chip can interrogate 10,000 different transcripts. The chip has 30-40 different probes per transcript; half of them are designed to perfectly match 20-nucleotide stretches of the gene and the other half contain a mismatch as a control to test for specificity of the hybridization signal.
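The perfect-match / mismatch design can be illustrated with a short Python sketch. Several details here are assumptions not stated in the slides: the target is given as a 20-letter DNA (cDNA) string, the perfect-match probe is taken to be its reverse complement, and the mismatch probe differs at a single central position, where the base is swapped for its pairing partner. The function name and example target are made up.

```python
PAIRING = str.maketrans("ACGT", "TGCA")
PROBE_LENGTH = 20  # the 20-nucleotide stretch mentioned above


def probe_pair(target):
    """Return (perfect_match, mismatch) probes for one 20-nt target stretch."""
    target = target.upper()
    if len(target) != PROBE_LENGTH:
        raise ValueError(f"expected {PROBE_LENGTH} nucleotides, got {len(target)}")
    # Perfect match: reverse complement, so it hybridizes to the target.
    perfect = target.translate(PAIRING)[::-1]
    # Mismatch control: assumed here to differ only at the central position.
    mid = PROBE_LENGTH // 2
    mismatch = perfect[:mid] + perfect[mid].translate(PAIRING) + perfect[mid + 1:]
    return perfect, mismatch


pm, mm = probe_pair("ATGGCCAAATGACCTTGGAA")
print(pm)
print(mm)
```

Comparing the signal at the two probes of a pair is what lets the mismatch act as a specificity control: a signal that is equally strong at both suggests non-specific binding.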
SNPs (single nucleotide polymorphisms)
• Genetic basis for organismal diversity is due in large part to differences in sequences, also known as polymorphisms of each gene.
• Most of these polymorphisms differ from one another by one nucleotide and are known as SNPs.
• Due to the small portion of the genome coding for proteins and the redundancy in the mRNA code, only some SNPs will result in differently constructed proteins.
• It is believed that genomic markers such as SNPs spaced every 1000 bases will be sufficient to unambiguously resolve the span of genome associated with a phenotypic difference to a single gene.
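Two of the statements above lend themselves to a quick calculation: locating single-nucleotide differences between two versions of a sequence, and estimating how many markers a spacing of one per 1000 bases implies for a genome of ~3.2 billion base pairs (the figure given earlier in the slides). A minimal Python sketch, with made-up example sequences and illustrative names; it assumes the two sequences are already aligned and of equal length.

```python
def snp_positions(reference, variant):
    """Positions (0-based) where two aligned, equal-length sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), variant.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


print(snp_positions("ACGTAC", "ACGAAC"))  # [3]

# One marker every 1000 bases across ~3.2 billion base pairs:
GENOME_SIZE = 3_200_000_000
MARKER_SPACING = 1000
print(GENOME_SIZE // MARKER_SPACING)  # 3200000
```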
Gene clustering dogma
• Genes that appear to be expressed in similar patterns are assumed to be mechanistically related.
• I.e., if we can find genes whose expression patterns approximate one another we can possibly conclude that they have functions that are related.
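One common way to decide whether two expression patterns "approximate one another" is to compute a correlation between their expression profiles. The Python sketch below uses Pearson correlation as one possible similarity measure; the slides do not prescribe a particular measure, and the gene names, expression values and the 0.9 threshold are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def co_expressed(profiles, threshold=0.9):
    """Pairs of genes whose profiles correlate at or above the threshold."""
    names = sorted(profiles)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if pearson(profiles[a], profiles[b]) >= threshold
    ]


# Hypothetical expression levels measured under four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],   # falls as geneA rises
}
print(co_expressed(profiles))  # [('geneA', 'geneB')]
```

A high correlation only suggests a possible functional relationship, in line with the hedged wording of the dogma above; it is a starting point for follow-up, not proof.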