Human Genetics
The Human Genome
Genome
The genome of an organism is the complete set of genes specifying how its phenotype will develop (under a certain set of environmental conditions).
Diploid organisms (like us) contain two genomes, one inherited from our mother, the other from our father.
The total DNA of an organism.
Nuclear genome refers to the total DNA in the nucleus, which is distinguished from organellar genomes of the mitochondria and chloroplast.
Genome size variation
Comparison of genome organization
Organism          Genome size (bp, haploid)   # of genes   Chromosomes (n)
Human             3,000,000,000               35,000       23 linear
D. melanogaster   140,000,000                 13,600       4 linear
C. elegans        97,000,000                  19,000       6 linear
A. thaliana       125,000,000                 25,500       5 linear
S. cerevisiae     13,000,000                  5,800        16 linear
E. coli           4,700,000                   4,000        1 circular
Human mtDNA       17,000                      37           1 circular
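One way to read the comparison table is to ask how much DNA there is per gene in each genome. The sketch below does that arithmetic with the figures from the table; the dictionary layout and the function name bp_per_gene are illustrative choices, not part of the original material.

```python
# Values copied from the comparison table: (genome size in bp, number of genes).
GENOMES = {
    "Human": (3_000_000_000, 35_000),
    "D. melanogaster": (140_000_000, 13_600),
    "C. elegans": (97_000_000, 19_000),
    "A. thaliana": (125_000_000, 25_500),
    "S. cerevisiae": (13_000_000, 5_800),
    "E. coli": (4_700_000, 4_000),
    "Human mtDNA": (17_000, 37),
}


def bp_per_gene(genome_size_bp: int, gene_count: int) -> float:
    """Average number of base pairs of genome per gene (hypothetical helper)."""
    return genome_size_bp / gene_count


# Print one line per organism, e.g. to compare gene density across genomes.
for organism, (size_bp, genes) in GENOMES.items():
    print(f"{organism:<16} {bp_per_gene(size_bp, genes):>10,.0f} bp per gene")
```

With these figures the human genome averages roughly 86,000 bp per gene, against roughly 1,200 bp per gene in E. coli, which gives a sense of how much of a large eukaryotic genome lies outside coding sequence.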
Eukaryotic Genomes are Variable in Size
Marbled lungfish 139,000,000,000
Salamander 50,000,000,000
Homo sapiens 3,000,000,000
Pufferfish 400,000,000
Fruit Fly 165,000,000
Arabidopsis 100,000,000
Baker’s yeast 12,067,280
Why the big differences?
Do Marbled Lungfish differ from Pufferfish?
Are Lilies all that much different than Arabidopsis?
These differences exist because:
Genomes have duplicated (chromosome doubling).
Individual genes have duplicated.
DNA exists that has no coding function.
Gene structure
I. Gene definition
II. Genome organization (eukaryotic)
  1. Genes and their noncoding regulatory sequences
  2. “Nonfunctional” DNA
  3. Duplicated genes
  4. Repetitive DNA
III. Mobile DNA
IV. Gene Regulation
Some Terms
Gene duplication: a duplicate of a gene may acquire mutations and emerge as a new gene.
Noncoding DNA: a sequence of DNA contained in eukaryotic genomes that does not encode protein and often consists of repetitive sequences.
Expression: a gene is expressed when its DNA is transcribed into RNA and, for protein-coding genes, the RNA is translated into protein. This process is called gene expression; its control is called gene regulation.
Nomenclature on DNA quantity
bp = one base pair within a double-stranded DNA
kb = 1,000 base pairs of double-stranded DNA
Mb = 1 million base pairs of double-stranded DNA
n = number of chromosomes in a haploid genome
2n = number of chromosomes in a diploid genome
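As a small illustration of these units, the sketch below converts a raw base-pair count into bp, kb (kilobases) or Mb (megabases) and derives the diploid chromosome number from n. The function names format_bp and diploid_number are hypothetical helpers introduced here for illustration.

```python
def format_bp(n_bp: int) -> str:
    """Express a length in base pairs as bp, kb (1,000 bp) or Mb (1,000,000 bp)."""
    if n_bp >= 1_000_000:
        return f"{n_bp / 1_000_000:g} Mb"
    if n_bp >= 1_000:
        return f"{n_bp / 1_000:g} kb"
    return f"{n_bp} bp"


def diploid_number(n: int) -> int:
    """2n: chromosome count of a diploid genome, given the haploid count n."""
    return 2 * n


print(format_bp(4_700_000))   # E. coli genome size
print(format_bp(17_000))      # human mtDNA
print(diploid_number(23))     # human diploid chromosome number
```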
Definition(s) of a Gene
1. A hereditary unit that is composed of a sequence of DNA and occupies a specific position or locus.
2. Broadly, any genetic determinant of a specific functional gene product.
3. Molecular definition:
Entire nucleic acid sequence necessary for the synthesis of a functional polypeptide (protein chain) or functional RNA
Genes and Their Products
The majority of genes are expressed as the proteins they encode. The process occurs in two steps:
Transcription = DNA -> RNA
Translation = RNA -> protein
This is the “Central Dogma” of Biology: DNA makes RNA makes protein.
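The two steps can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code, a DNA sequence given as the coding (sense) strand, and translation that starts at the first AUG; the function names and the example sequence are made up for the illustration.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """DNA -> RNA: the mRNA matches the coding strand with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """RNA -> protein: read codons from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCTGGTAA"          # made-up example sequence
mrna = transcribe(dna)        # "AUGGCCUGGUAA"
print(mrna, translate(mrna))  # protein in one-letter code: "MAW"
```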
The Central Dogma of Molecular Biology
WHY?
The DNA can retain integrity
The RNA step allows amplification
Multiple steps allow multiple points of control
DNA -> (Transcription) -> RNA -> (Translation) -> Protein
Most Genes Encode Proteins
Original Concept of the Gene: One gene = one enzyme
This concept does not hold for those proteins that consist of two or more different subunits.
Revised Concept: One gene = one messenger RNA = one polypeptide.
RNA Genes
Some RNAs (tRNA, rRNA, snRNA, mtRNA) are not translated into protein.
However, the DNA sequences that encode them are still referred to as genes, because these RNAs are specific functional gene products.
Other DNA sequences regulate the transcription of other genes and can act like genes in some ways.
Genes are interspersed along DNA molecules, being separated by DNA sequence of unknown function (intergenic regions)
Coding region
Nucleotides (open reading frame) encoding the amino acid sequence of a protein
The molecular definition of gene includes more than just the coding region.
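To make the idea of an open reading frame concrete, the sketch below scans one DNA strand in all three reading frames for stretches that begin with ATG and end at the first in-frame stop codon. It is a simplified illustration: it ignores the opposite strand and introns, and the function name find_orfs and the min_codons threshold are assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return (start, end) positions of ORFs on the given strand.

    Positions are 0-based, end is exclusive and includes the stop codon.
    min_codons counts the codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # close it at the stop codon
    return orfs


sequence = "CCATGGCCTGGTAAGG"   # made-up example
for start, end in find_orfs(sequence):
    print(start, end, sequence[start:end])
```

In this example the only ORF is ATGGCCTGGTAA, found in the third reading frame; the flanking bases are outside the coding region.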
Noncoding regions
Regulatory regions
  RNA polymerase binding site
  Transcription factor binding sites
Introns
Polyadenylation [poly(A)] sites
“Nonfunctional” DNA
Higher eukaryotes have a lot of noncoding DNA
Some has no known structural or regulatory function (no genes)
[Figure label: 80 kb]
Duplicated genes
Encode closely related (homologous) proteins
Clustered together in genome
Formed by duplication of an ancestral gene followed by mutation
Five functional genes and two pseudogenes
Mobile DNA
Moves within genomes
Makes up most of the moderately repeated DNA sequences found throughout higher eukaryotic genomes
  L1 LINE is ~5% of human DNA (~50,000 copies)
  Alu is ~5% of human DNA (>500,000 copies)
Some encode enzymes that enable movement
Transposition
Movement of mobile DNA
Involves copying of mobile DNA element and insertion into new site in genome
Why?
Molecular parasite: “selfish DNA”
Probably have significant effect on evolution by facilitating gene duplication, which provides the fuel for evolution, and exon shuffling
RNA or DNA intermediate
Transposon moves using DNA intermediate
Retrotransposon moves using RNA intermediate
LTR (long terminal repeat)
Flank viral retrotransposons and retroviruses
Contain regulatory sequences
  Transcription start site and poly(A) site
LINES and SINES
Nonviral retrotransposons
  RNA intermediate
  Lack LTR
LINES (long interspersed elements)
  ~6000 to 7000 base pairs
  L1 LINE (~5% of human DNA)
  Encode enzymes that catalyze movement
SINES (short interspersed elements)
  ~300 base pairs
  Alu (~5% of human DNA)
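The percentages and copy numbers quoted in this section for L1 and Alu can be cross-checked with simple arithmetic: the share of the genome an element occupies, divided by its copy number, gives the average length per copy. The sketch below assumes a 3,000,000,000 bp human genome and treats the approximate copy numbers (~50,000 and >500,000) as exact; the function name mean_copy_length is a hypothetical helper.

```python
HUMAN_GENOME_BP = 3_000_000_000  # haploid human genome size used in this section


def mean_copy_length(genome_percent: float, copies: int,
                     genome_bp: int = HUMAN_GENOME_BP) -> float:
    """Average length in bp of one copy of a repeated element."""
    return genome_bp * genome_percent / 100 / copies


l1_mean = mean_copy_length(5, 50_000)     # L1 LINE: ~5% of DNA, ~50,000 copies
alu_mean = mean_copy_length(5, 500_000)   # Alu: ~5% of DNA, >500,000 copies

print(f"L1 average copy length:  {l1_mean:,.0f} bp")
print(f"Alu average copy length: {alu_mean:,.0f} bp")
```

Under these assumptions the Alu average comes out at about 300 bp, matching the quoted SINE length. The L1 average comes out at about 3,000 bp, below the 6,000 to 7,000 bp full-length figure, which would be consistent with many genomic copies being shorter than full length.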
Human Disease and Mobile DNA
Movement (transposition) of LINES and SINES can cause mutations and genetic disease by insertion into essential genes
  Hemophilia (blood clotting factor VIII gene)
  Muscular dystrophy (DMD)
  Colon cancer (APC)
RNA Transcription
The process of releasing information contained in a DNA sequence, because DNA itself is used only for storage and transmission.
The sequence of bases in the DNA template is copied into an RNA sequence, which is either used directly or translated into a polypeptide.
Noncoding DNA can be Part of Transcribed Genes
Regulatory regions (Promoters)
Introns
Poly A+ addition sites
5’ untranslated regions
3’ untranslated regions
Basic Gene Structure
[Figure: promoter elements. Prokaryotes like E. coli: -35 and -10 elements. Humans and other eukaryotes: CAAT, TATA and GC elements.]
Bacterial Gene
Human Genes
Most have introns
Produce monocistronic mRNA: only one encoded protein
Large (1,000 to >1,000,000 base pairs)
Gene Transcription and Regulation
A Puzzle about Cells
Each Cell has a complete copy of all the DNA. And yet, cells are different.
This is the theoretical basis of organism cloning.
So cells are only using some of the DNA to make RNA to make proteins at any time.
How does the cell know which DNA to choose to transcribe?
External environment sends signals that are recognized, and transcription is turned on or off in response to the signals.
Transcription
Transcription is the synthesis of RNA from a DNA template.
Main types of RNA, each with a different role in the cell:
mRNA = Messenger RNA
tRNA = Transfer RNA
rRNA = Ribosomal RNA
mtRNA = Mitochondrial RNA
snRNA = Small nuclear RNA
rRNA and tRNA are Cogs in the Machinery
rRNA is a structural part of the ribosome
tRNA helps the protein machinery to read the mRNA
Neither of these types of RNA carries protein-coding information
Messenger RNA
Messenger RNA carries the information in the DNA to the protein translation machinery (ribosomes)
Serves as the template for protein synthesis
Which mRNAs are transcribed in a cell decides the fate of that cell, since they dictate which information in the DNA is read by the protein translation machinery
RNA molecules
Synthesized by RNA polymerases using DNA as a template.
Polymer of ribonucleotides, where each consists of a phosphate group (PO4), ribose sugar, and a base (adenine, guanine, cytosine, or uracil).
Following synthesis of an RNA strand, it remains single-stranded.
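Template-directed synthesis can be illustrated by base pairing alone: each template base specifies its complement in the RNA, with uracil pairing opposite adenine. The sketch below is a simplification that ignores the polymerase itself; the function names and the example sequence are made up for the illustration.

```python
# Template DNA base -> complementary RNA base (A pairs with U in RNA).
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe_template(template_3_to_5: str) -> str:
    """RNA (written 5'->3') made from a template strand written 3'->5'."""
    return template_3_to_5.upper().translate(DNA_TO_RNA)


def transcribe_template_5_to_3(template_5_to_3: str) -> str:
    """Same, for a template written in the usual 5'->3' direction."""
    return transcribe_template(template_5_to_3[::-1])


print(transcribe_template("TACCGGACC"))         # template given 3'->5'
print(transcribe_template_5_to_3("CCAGGCCAT"))  # same template, given 5'->3'
```

Both calls describe the same template strand and give the same RNA, AUGGCCUGG, which is a single strand complementary to the template.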
Gene Regulation can occur at any of these steps
Initiation - highly regulated step
Elongation - the rate at which the mRNA is made can control how quickly its product is made
Termination - premature termination can mean that the whole mRNA never gets made and neither does what it codes for: like receiving only part of the instructions on how to put together your “easy to assemble” bookcase/desk/whatever
Steps of RNA Transcription
Initiation, Elongation, Termination
All RNA transcription is performed by enzymes called RNA polymerases.
RNA transcription starts at a Promoter sequence (analogous to ORI for DNA replication).
Transcription of mRNA in Humans
Steps involved are the same as in prokaryotes: initiation, elongation, termination
Mediated by RNA polymerase II: Very complex enzyme with many subunits
Human Transcription
More control is needed over how the more complex genetic material is read, to create the greater variety of cell types in a multicellular organism
RNA has to be transcribed in the nucleus and then transported to the protein translation machinery in the cytoplasm before it can be read.
[Figure labels: DNA, Nucleus, Protein]
Human genes
Most have introns
Produce monocistronic mRNA: only one encoded protein
Large genes
Initiation
Initiation occurs at promoters, as in prokaryotes. Eukaryotic promoters are less well characterized but share some well-conserved elements, including the TATA box and CAAT box (both are rich in A=T pairs).
In addition to the promoters, there are regions in the DNA called enhancers, to which transcription factors bind to regulate which DNA is read and encoded in mRNA.
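Conserved promoter elements such as the TATA box and CAAT box can be located in a sequence with a simple pattern search. The sketch below is only an illustration: the patterns TATA[AT]A and CCAAT are simplified stand-ins for the real consensus sequences, and the function name and the example upstream sequence are made up.

```python
import re

# Simplified, illustrative patterns; real promoter elements are more variable.
MOTIFS = {
    "TATA box": "TATA[AT]A",
    "CAAT box": "CCAAT",
}


def find_promoter_motifs(upstream: str) -> dict[str, list[int]]:
    """Return the 0-based start positions of each motif in the sequence."""
    upstream = upstream.upper()
    return {
        name: [match.start() for match in re.finditer(pattern, upstream)]
        for name, pattern in MOTIFS.items()
    }


upstream_region = "GGCCAATCTGCGCGCTATAAAAGGCG"   # made-up example sequence
print(find_promoter_motifs(upstream_region))
```

In this example the CAAT-like element is found at position 2 and the TATA-like element at position 15, so the CAAT element lies upstream of the TATA element.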
[Figure: enhancer, promoter and gene, with transcription factors (TF) and RNA polymerase (Pol) bound; mRNA is produced from the gene]
Transcription Factor Function
TF = Transcription Factor
Transcription Factors
Although transcription is performed by RNA Polymerase, it needs other proteins to produce the transcript.
These proteins are either associated directly with RNA Polymerase or help it bind to the DNA sequences upstream of the site where transcription is initiated.
These associated proteins are called transcription factors.
RNA transcription begins by the assembly of the RNA polymerase on a promoter region.
Orientation of promoter elements specifies the direction of transcription
[Figure: promoter elements. Prokaryote: -35 and -10 elements. Eukaryote: CAAT, TATA and GC elements.]
Transfer of Information
Gene
exon | intron | exon | intron | exon
mRNA
Exon - portion of the gene that contains DNA sequences that will be translated into protein.
Intron - portion of the gene that will be cut out before translation
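The exon and intron definitions above amount to a simple operation on the transcript: keep the exon segments, in order, and discard what lies between them. The sketch below assumes exon coordinates are already known and given as 0-based, end-exclusive positions; the function name splice and the example transcript are made up for the illustration.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments in order, discarding the introns between them.

    Exon coordinates are 0-based with an exclusive end.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Made-up transcript: exon | intron | exon
exon1 = "AUGGCC"
intron = "GUAAGUUUUCAG"   # illustrative intron, starting with GU and ending with AG
exon2 = "UGGUAA"
pre_mrna = exon1 + intron + exon2

mature_mrna = splice(pre_mrna, [(0, 6), (18, 24)])
print(mature_mrna)        # exons joined, intron removed
```

Here the 24-base primary transcript is reduced to the 12-base message AUGGCCUGGUAA, which contains only exon sequence.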
Transfer of Information
Reading the Genes in the Genome
[Figure: Signal recognizing -> Transcribing -> Processing (mRNA with AAA tail) -> Translating -> Protein]