Chapter 21
Genomes and Their Evolution
Overview: Reading the Leaves from the Tree of Life
• Complete genome sequences exist for a human, chimpanzee, E. coli,
brewer’s yeast, nematode, fruit fly, house mouse, and rhesus macaque,
among others
– Comparisons of genomes among organisms provide information
about the evolutionary history of genes and taxonomic groups
• Genomics is the study of whole sets of genes and their interactions
– The enormous amount of data generated in our studies of
genomics has spawned a
new field called bioinformatics
• Bioinformatics is the
application of
computational methods to
the storage and analysis
of biological data
Fig. 21-1
Concept 21.1: New approaches have accelerated the pace of genome sequencing
• The most ambitious mapping project to date has
been the sequencing of the human genome
– Officially begun as the Human Genome
Project in 1990, the sequencing was largely
completed by 2003
• The project had three stages:
– Genetic (or linkage) mapping
– Physical mapping
– DNA sequencing
Three-Stage Approach to Genome Sequencing
• The initial stage in sequencing the human genome was to construct a type
of genetic map called a linkage map
– A linkage map maps the location of several thousand genetic markers
on each chromosome
• The order of the markers and the relative distances between them
on this map are based on recombination frequencies
• The genetic marker may be a gene or
any other identifiable DNA sequence
(RFLPs, STRs)
– By 1992, researchers had
compiled a human linkage map
with ~5,000 markers
• This map allowed scientists to locate
other markers, including genes, by
testing for genetic linkage to the
known markers
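The use of recombination frequencies to order markers can be sketched in a few lines of Python. This is a minimal illustration only: the marker names and offspring counts are hypothetical, and it assumes the usual convention that 1% recombination corresponds to 1 map unit.

```python
def recombination_frequency(recombinants: int, total_offspring: int) -> float:
    """Return the recombination frequency as a percentage (1% = 1 map unit)."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return 100.0 * recombinants / total_offspring


def infer_order(distances: dict) -> tuple:
    """Order three markers: the pair that is farthest apart flanks the third."""
    outer = max(distances, key=distances.get)
    markers = {m for pair in distances for m in pair}
    (middle,) = markers - set(outer)
    return (outer[0], middle, outer[1])


# Hypothetical crosses: marker pair -> (recombinant offspring, total offspring)
crosses = {
    ("A", "B"): (90, 1000),
    ("B", "C"): (50, 1000),
    ("A", "C"): (135, 1000),
}

distances = {pair: recombination_frequency(*counts) for pair, counts in crosses.items()}
for (m1, m2), d in distances.items():
    print(f"{m1}-{m2}: {d:.1f} map units")
print("Inferred order:", "-".join(infer_order(distances)))
```

In this made-up data set the A-C distance is slightly less than the sum of A-B and B-C, which is what double crossovers tend to produce in real crosses.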
Fig. 21-2-4: Three-stage approach to genome sequencing: cytogenetic map (chromosome bands; genes located by FISH); (1) linkage mapping (genetic markers); (2) physical mapping (overlapping fragments); (3) DNA sequencing
• The next stage was the physical mapping of the human genome
– A physical map expresses the distance between genetic markers, usually as the number of base pairs along the DNA
• It is constructed by cutting a DNA molecule (using restriction enzymes) into many short fragments and arranging them in order by identifying overlaps
• These fragments are prepared by DNA cloning using one of the following cloning vectors:
– A yeast artificial chromosome (YAC) that can carry inserted fragments made up of 1 million base pairs
– A bacterial artificial chromosome (BAC) that typically carries 100,000-300,000 bp
• After these long, cloned fragments are put in order, each fragment is cut into smaller pieces
– These smaller pieces are then cloned in plasmids or phages, ordered, and finally sequenced
• The final stage of genome sequencing is to determine the
complete nucleotide sequence
– This was accomplished by sequencing machines, using
the dideoxy chain termination method
– The sequencing of all 3.2 billion base pairs of the haploid
human genome still presented a
formidable challenge, even with
automation
• In the 1980s, a productive lab
could usually sequence ~1,000
bp/day
• By 2000, each research center
was sequencing
1,000 bp/second
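The scale of the challenge is easier to see with a little arithmetic. The sketch below uses only the figures quoted above and assumes a single pass over the genome with no repeated coverage, which real projects would not get away with.

```python
GENOME_BP = 3_200_000_000        # haploid human genome, from the text
RATE_1980S_BP_PER_DAY = 1_000    # one productive lab in the 1980s
RATE_2000_BP_PER_SEC = 1_000     # one research center by 2000
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_sequence(genome_bp: int, bp_per_day: float) -> float:
    """Days needed to read genome_bp once at a constant rate."""
    return genome_bp / bp_per_day


days_1980s = days_to_sequence(GENOME_BP, RATE_1980S_BP_PER_DAY)
days_2000 = days_to_sequence(GENOME_BP, RATE_2000_BP_PER_SEC * SECONDS_PER_DAY)

print(f"1980s lab:   {days_1980s / 365.25:,.0f} years")
print(f"2000 center: {days_2000:.0f} days")
```

At the 1980s rate a single lab would need thousands of years, while a single center at the 2000 rate would need a little over a month for one pass.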
Whole-Genome Shotgun Approach to Genome Sequencing
• An alternative approach to whole-genome sequencing was devised
by J. Craig Venter in 1992
– This whole-genome shotgun approach essentially skips the
linkage mapping and physical mapping stages
• It instead begins directly with the sequencing of random DNA
fragments
• Powerful computer programs
then assemble the resulting
fragments of overlapping
short sequences into a
single continuous sequence
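The assembly step can be illustrated with a toy greedy assembler that repeatedly merges the two fragments sharing the longest overlap. This is a sketch under simplifying assumptions (error-free reads, one strand, no repeated sequences); the function names and reads are hypothetical, and real assembly software is far more sophisticated.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Merge reads pairwise by best overlap until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps: the remaining reads stay as separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical reads covering one short stretch of DNA, in random order
fragments = ["AGACGT", "ATGGCCTT", "CCTTAGAC"]
print(greedy_assemble(fragments))
```

Because the merge decision relies only on overlaps, two copies of a duplicated region look identical to this kind of procedure and can be collapsed into one.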
Fig. 21-3-3: Whole-genome shotgun approach: (1) Cut the DNA into overlapping fragments short enough for sequencing. (2) Clone the fragments in plasmid or phage vectors. (3) Sequence each fragment. (4) Order the sequences into one overall sequence with computer software.
• Both the three-stage process and the whole-genome
shotgun approach were used for the Human Genome
Project and for genome sequencing of other organisms
– At first many scientists were skeptical about the whole-
genome shotgun approach, but it is now widely used
as the sequencing method of choice
• This approach can miss some duplicated
sequences, however, underestimating the size of
the genome and missing genes in those regions
– A hybrid of the two approaches may thus be the most
useful in the long run
Concept 21.2: Scientists use bioinformatics to analyze genomes and their functions
• The Human Genome Project established databases and
refined analytical software programs to make data available on
the Internet
– This has accelerated progress in DNA sequence analysis
by making bioinformatics resources available to
researchers worldwide
• Bioinformatics resources are provided by a number of sources:
– National Library of Medicine and the National Institutes of
Health (NIH) created the National Center for Biotechnology
Information (NCBI)
– European Molecular Biology Laboratory
– DNA Data Bank of Japan
• The NCBI database of sequences is called GenBank
– GenBank includes the sequences of 76 million fragments of genomic DNA (~80 billion base pairs)
– This database is constantly updated, doubling the amount of data it contains approximately every 18 months
– Any sequence in the database can be retrieved and analyzed using software from the NCBI website
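A doubling time of about 18 months implies exponential growth. The sketch below simply evaluates 2^(t/18) for a time t in months; it assumes the doubling time stays constant, which is an idealization rather than a claim about the database.

```python
def growth_factor(months: float, doubling_time_months: float = 18.0) -> float:
    """Factor by which the data grow after the given number of months."""
    return 2.0 ** (months / doubling_time_months)


for years in (1.5, 3, 5, 10):
    print(f"after {years:>4} years: x{growth_factor(years * 12):.1f}")
```

Under this assumption the amount of data grows roughly 100-fold in ten years.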
• Available software allows online visitors to search GenBank for matches to:
– A specific DNA sequence
– A predicted protein sequence
– Common stretches of amino acids in a protein
• It also provides 3-D views of all protein structures that have been determined
Fig. 21-4
Identifying Protein-Coding Genes Within DNA Sequences
• One challenge in bioinformatics is recognizing unknown protein-coding genes and determining their function
– The usual approach uses software to scan stored sequences for transcriptional and translational start and stop signals, RNA-splicing sites, and other tell-tale signs of a protein-coding gene
– The software also looks for sequences called expressed sequence tags (ESTs) that correspond to sequences present in known mRNAs
• Comparison of sequences of “new” genes with those of known genes in other species may help identify new genes
– If a newly-identified sequence partially matches the sequence of a gene or protein whose function is well-known, this may suggest a similar function in both
– If the sequence is unlike anything ever seen before, protein function can be deduced through a combination of biochemical and functional studies
• The biochemical approach aims to determine the 3-D structure of the protein, including potential binding sites for other molecules
• Functional studies usually involve blocking or disabling the gene to determine its effects on phenotype
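The idea of scanning a sequence for translational start and stop signals can be sketched as a simple open reading frame (ORF) search. This is an illustrative toy, not how any particular gene-finding program works: it checks only the forward strand, ignores introns and splice sites, and the function name and test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3):
    """Return (start, end, frame) for each ATG...stop stretch on the forward strand.

    start is the index of the A in ATG; end is one past the stop codon.
    min_codons counts codons from ATG up to, but not including, the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # look for the next ORF
    return orfs


sequence = "CCATGAAACCCGGGTAACC"   # hypothetical sequence
for start, end, frame in find_orfs(sequence):
    print(f"frame {frame}: {sequence[start:end]}")
```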
Understanding Genes and Their Products at the Systems Level
• The success in sequencing genomes and studying
entire sets of genes has encouraged scientists to
attempt similar systematic studies of all the
proteins encoded by the genome
– Proteomics is the systematic study of all
proteins encoded by a genome
• These studies are necessary to understand
how cells and organisms function since
proteins, not genes, carry out most of the
activities of the cell
How Systems Are Studied: An Example
• One basic application of the systems biology approach is to define gene
circuits and protein interaction networks
– This process begins with 1000s of predicted RNA transcripts
– Molecular techniques are then used to test interactions between the
whole or partial protein products of these transcripts
– Finally, statistical tests are used to select
interactions for which the data is strongest
– The vast number of protein-protein
interactions generated during these
experiments can be integrated into graphic
models using powerful computers,
mathematical tools, and newly developed
software
• Thus, the systems biology approach is possible
because of advances in bioinformatics
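The last two steps, keeping only well-supported interactions and integrating them into a network, can be sketched as follows. The protein names, scores, and threshold are hypothetical, and a single confidence score per pair stands in for the statistical tests mentioned above.

```python
from collections import defaultdict


def build_network(scored_pairs, min_score: float):
    """Keep interactions scoring at least min_score and store them as an undirected graph."""
    network = defaultdict(set)
    for a, b, score in scored_pairs:
        if score >= min_score:
            network[a].add(b)
            network[b].add(a)
    return dict(network)


def hubs(network, min_partners: int):
    """Proteins with at least min_partners interaction partners."""
    return sorted(p for p, partners in network.items() if len(partners) >= min_partners)


# Hypothetical results: (protein, protein, confidence score)
tested = [
    ("P1", "P2", 0.95),
    ("P1", "P3", 0.90),
    ("P2", "P3", 0.40),
    ("P1", "P4", 0.85),
    ("P4", "P5", 0.30),
]

network = build_network(tested, min_score=0.8)
print(network)
print("Hubs:", hubs(network, min_partners=3))
```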
Fig. 21-5 (figure label: Proteins)
Application of Systems Biology to Medicine
• A systems biology approach has several medical applications:
– The Cancer Genome Atlas project aims to understand how changes in biological systems lead to cancer
• 3 types of cancer are being analyzed by comparing gene sequences and patterns of gene expression in cancer cells with those of normal cells
• A set of ~2,000 genes in cancer cells is being sequenced at several different times during the progression of the disease to monitor changes due to mutations and rearrangements
– Silicon and glass “chips” have also been developed to hold a microarray of most of the known human genes
• Such chips are being used to analyze gene expression patterns in patients suffering from various cancers and other diseases
• The eventual aim of these chips is to tailor treatments to the unique genetic makeup of the patients, as well as their specific disease
– In the future, all people may carry a catalog of their DNA sequences along with their medical records with regions highlighted that predispose them to specific diseases
Fig. 21-6
Concept 21.3: Genomes vary in size, number of genes, and gene density
• By summer 2007, genomes had been sequenced for 500 bacteria, 45
archaea, and 65 eukaryotes including vertebrates, invertebrates, and plants
– Genomes of most bacteria and archaea range from 1 to 6 million base
pairs (Mb); genomes of eukaryotes are usually larger
• Most plants and animals have genomes greater than 100 Mb;
humans have 3,200 Mb
• Within each domain, however, there is no
systematic relationship between genome
size and phenotype
– Ex) The genome of F. assyriaca (a
flowering plant of the lily family)
contains 120,000 Mb, ~40X the size
of the human genome
Table 21-1
Number of Genes
• A similar difference also holds true for the number of genes among the 3 domains
– Bacteria and Archaea have fewer genes, in general, than eukaryotes
• Free-living bacteria and archaea have 1,500 to 7,500 genes
• Unicellular fungi have about 5,000 genes, and multicellular eukaryotes have up to at least 40,000 genes
– Within eukaryotes, the number of genes in a species is often lower than expected from simply considering the size of its genome
• Ex) The genome of C. elegans is 100 Mb and carries 20,000 genes, while the Drosophila genome is almost 2X as large (180 Mb) but has only ~2/3 the number (13,700) of genes
• Polypeptide diversity in eukaryotes is mainly due to extensive alternative RNA splicing and post-translational modifications
Table 21-1
Gene Density and Noncoding DNA
• Gene density (number of genes per given DNA length) can
also be compared in different species
– Eukaryotes generally have larger genomes but fewer
genes in a given number of base pairs
• Humans and other mammals have the lowest gene
density
– Multicellular eukaryotes have many introns within
genes and noncoding DNA between genes
• In bacterial genomes, however, most of the DNA
codes for proteins, tRNA, or rRNA, with only a small
amount of other DNA containing regulatory
sequences, such as promoters
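Gene density is simply the number of genes divided by genome size. The short sketch below applies that definition to the two genomes whose size and gene count are both quoted above; the dictionary layout and function name are just for illustration.

```python
# (genome size in Mb, number of genes), as quoted in these notes
genomes = {
    "C. elegans": (100, 20_000),
    "Drosophila": (180, 13_700),
}


def gene_density(size_mb: float, genes: int) -> float:
    """Genes per million base pairs (genes/Mb)."""
    return genes / size_mb


for species, (size_mb, genes) in genomes.items():
    print(f"{species}: {gene_density(size_mb, genes):.0f} genes/Mb")
```

The larger Drosophila genome comes out at well under half the gene density of C. elegans.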
Concept 21.4: Multicellular eukaryotes have much noncoding DNA and many multigene families
• The bulk of most eukaryotic genomes consists of noncoding DNA sequences, often described in the past as “junk DNA”
– Much evidence indicates that noncoding DNA plays important roles in the cell
• Ex) Genomes of humans, rats, and mice show high sequence conservation for ~500 noncoding regions, a higher level of conservation than seen for protein-coding regions in these species
• This strongly suggests that these noncoding regions have important functions
• Sequencing of the human genome reveals that 98.5% does not code
for proteins, rRNAs, or tRNAs
– About 24% of the human genome consists of introns and
gene-related regulatory sequences
• The rest of the genome is located between functional genes
(intergenic DNA) and includes:
– Pseudogenes: former genes that have accumulated mutations
and are nonfunctional
– Repetitive DNA: consists of
sequences that are present in
multiple copies in the genome
• About three-fourths of repetitive
DNA is made up of transposable
elements and sequences related
to them
Fig. 21-7: Types of DNA sequences in the human genome
Exons (regions of genes coding for protein or giving rise to rRNA or tRNA): 1.5%
Introns and regulatory sequences: 24%
Unique noncoding DNA: 15%
Repetitive DNA that includes transposable elements and related sequences: 44% (L1 sequences 17%; Alu elements 10%)
Repetitive DNA unrelated to transposable elements: 15% (simple sequence DNA 3%; large-segment duplications 5–6%)
Transposable Elements and Related Sequences
• Both prokaryotes and eukaryotes have stretches of DNA that can move from one location to another within the genome
– These stretches are known as transposable (genetic) elements
– During the process of transposition, a transposable element moves from one site in a cell’s DNA to a different target site by a special type of recombination
• Transposable elements are sometimes called “jumping genes”
– This phrase is misleading because these elements never completely detach from the cell’s DNA
• Instead, the original and new DNA sites are brought together by DNA bending
• The first evidence for wandering DNA segments came
from geneticist Barbara McClintock’s breeding
experiments with Indian corn
– McClintock identified changes in the color of corn
kernels that could only be explained by the
movement of some genetic elements from other
genome locations into
the genes for kernel
color
– This movement would
disrupt these genes
and thus change the
kernel’s color
Fig. 21-8
Movement of Transposons and Retrotransposons
• Eukaryotic transposable elements are of two types:
– Transposons: move within a genome by means of a DNA intermediate
• Transposons can move by a “cut-and-paste” mechanism that removes the element from the original site
• They can also move by a “copy-and-paste” method that leaves a copy behind in its original location
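Fig. 21-9: Movement of eukaryotic transposable elements. (a) Transposon movement (“copy-and-paste” mechanism): the transposon in the DNA of the genome is copied, and the mobile transposon inserts as a new copy at another site. (b) Retrotransposon movement: the retrotransposon is transcribed into RNA, reverse transcriptase produces DNA from the RNA, and a new copy of the retrotransposon inserts at another site.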
• Retrotransposons: move by means of an RNA intermediate
– Retrotransposons always leave a copy at the original site during transposition because they are initially transcribed into an RNA intermediate
• To insert at another site, the RNA intermediate is first converted back to DNA by reverse transcriptase
– This enzyme is encoded in the retrotransposon itself
• Another cellular enzyme then catalyzes the insertion of this reverse-transcribed DNA at a new site
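The difference between the two modes of movement can be made concrete with a toy model in which a chromosome is a list of labeled segments. This is only a bookkeeping sketch with hypothetical names; it says nothing about the enzymes involved.

```python
def cut_and_paste(genome, element, target_index):
    """Move the element: it disappears from its original site.

    target_index refers to positions in the genome after the element is removed.
    """
    moved = list(genome)
    moved.remove(element)
    moved.insert(target_index, element)
    return moved


def copy_and_paste(genome, element, target_index):
    """Insert a new copy of the element: the original stays where it was."""
    copied = list(genome)
    copied.insert(target_index, element)
    return copied


chromosome = ["geneA", "TE", "geneB", "geneC"]   # TE = transposable element

print(cut_and_paste(chromosome, "TE", 2))
print(copy_and_paste(chromosome, "TE", 3))
```

Retrotransposons always behave like `copy_and_paste` here, so each transposition event increases the number of copies in the genome.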
Sequences Related to Transposable Elements
• Multiple copies of transposable elements and related sequences are scattered throughout the eukaryotic genome
– A single unit is usually 100s-1000s of base pairs long
– These dispersed “copies” are similar but are usually not identical to one another
• In primates, a large portion of transposable element–related DNA consists of a family of similar sequences called Alu elements, which make up about 10% of the human genome
– Many Alu elements are transcribed into RNA molecules; however, their function is unknown
• An even larger percentage (17%) of the human genome is made up of a type of retrotransposon called LINE-1 (L1)
– L1 sequences have a low rate of transposition and may help regulate gene expression
Other Repetitive DNA, Including Simple Sequence DNA
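• About 15% of the human genome consists of repetitive DNA unrelated to transposable elements; about a third of this (5–6% of the genome) consists of duplications of long sequences
of DNA from one location to another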
– This repetitive DNA probably arises due to mistakes during DNA
replication or recombination
• In contrast, simple sequence DNA contains many copies of tandemly
repeated short sequences
– Ex) GTTACGTTACGTTACGTTAC
• Simple sequence DNA is common in centromeres and telomeres,
where it probably plays structural roles in the chromosome
• These repeated units may contain anywhere between 2-500
nucleotides
– A series of repeating units of 2 to 5 nucleotides is called a short
tandem repeat (STR)
• The repeat number for STRs can vary among sites (within a
genome) or individuals
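Counting the repeat number at an STR site is a simple string operation. The sketch below counts back-to-back copies of a repeat unit using the GTTAC example from above; the function name and the two "individual" sequences are hypothetical.

```python
import re


def longest_tandem_run(dna: str, unit: str) -> int:
    """Largest number of back-to-back copies of unit found anywhere in dna."""
    runs = re.findall(f"(?:{re.escape(unit)})+", dna)
    return max((len(run) // len(unit) for run in runs), default=0)


print(longest_tandem_run("GTTACGTTACGTTACGTTAC", "GTTAC"))

# Hypothetical STR site in two individuals with different repeat numbers
individual_1 = "AAGATAGATAGATACC"
individual_2 = "AAGATAGATAGATAGATAGATACC"
print(longest_tandem_run(individual_1, "GATA"), longest_tandem_run(individual_2, "GATA"))
```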
Genes and Multigene Families
• Many eukaryotic genes are present in one copy per haploid set of chromosomes
– In the human genome and the genomes of many other animals and plants, these solitary genes make up less than ½ of the total transcribed DNA
• The rest of the transcribed DNA occurs in multigene families, collections of identical or very similar genes
– Some multigene families consist of identical DNA sequences, usually clustered tandemly, such as those that code for RNA products
• Ex) A family of identical DNA sequences contains the genes for the 3 largest rRNA molecules
– These rRNA molecules are transcribed from a single transcription unit that is repeated tandemly 100s-1000s of times in one or several clusters in the genomes of multicellular eukaryotes
– The many copies of this rRNA transcription unit help cells to quickly make the millions of ribosomes needed for active protein synthesis
Fig. 21-10a: (a) Part of the ribosomal RNA gene family (labels: DNA, transcription unit, nontranscribed spacer, RNA transcripts, and 18S, 5.8S, and 28S rRNA)
• The classic examples of multigene families of nonidentical genes are two
related families of genes that encode globins
– α-globins and β-globins are polypeptides of hemoglobin and are coded
by genes on different human chromosomes (16 & 11)
– The different forms of each globin subunit are expressed at different
times in development
• This allows hemoglobin to function effectively in the changing
environment of a developing animal
– Ex) Embryonic and fetal forms of
human hemoglobin have a higher
affinity for oxygen than adult
forms
– Several pseudogenes (green) are also
found in the globin gene family
• Pseudogenes are nonfunctional
versions of the functional genes
Fig. 21-10b: (b) The human α-globin and β-globin gene families. Hemoglobin is shown with its α-globin and β-globin subunits and heme; the α-globin gene family is on chromosome 16 and the β-globin gene family is on chromosome 11; genes are labeled by when they are expressed (embryo, fetus, fetus and adult, adult)
Concept 21.5: Duplication, rearrangement, and mutation of DNA contribute to genome evolution
• The basis of change at the genomic level is
mutation, which underlies much of genome
evolution
– The earliest forms of life likely had a minimal
number of genes, including only those
necessary for survival and reproduction
• The size of genomes has increased over
evolutionary time, with the extra genetic
material providing raw material for gene
diversification
Duplication of Entire Chromosome Sets
• Accidents in meiosis can lead to one or more extra sets of
chromosomes, a condition known as polyploidy
– Although polyploidy is most often lethal, it can facilitate
the evolution of genes in rare cases
• One set of genes can provide their normal functions
in the organism
• The genes of the extra sets can then eventually
diverge by accumulating mutations, creating genes
with novel functions
– The outcome of these accumulations of mutations may
be the branching off of a new species
Alterations of Chromosome Structure
• Humans have 23 pairs of chromosomes, while chimpanzees have 24 pairs
– Following the divergence of humans and chimpanzees from a common ancestor, two ancestral chromosomes fused in the human line
• By comparing the chromosomal organization of many different species, we can make inferences about the evolutionary processes that shape chromosomes and may drive speciation
– In one study, researchers compared the DNA sequence of each human chromosome with the whole genome sequence of the mouse
• Large blocks of genes are found in both human and mouse chromosomes, indicating that the genes in each block stayed together during evolution
– These same comparative analyses were performed between chromosomes of humans and 6 other mammalian species
• This allowed researchers to reconstruct the evolutionary history of chromosomal rearrangements among these species
– They found many duplications and inversions of large portions of chromosomes that likely arose from mistakes during meiotic recombination
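Finding blocks of genes that stayed together can be sketched as a comparison of gene order between two species. The example below is a simplified illustration with hypothetical gene names; it only detects blocks kept in the same orientation, so an inverted block would be missed.

```python
def conserved_blocks(order_a, order_b, min_len: int = 2):
    """Return runs of genes that are adjacent and in the same order in both lists."""
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    blocks, current = [], []

    def close_block():
        # Save the current run if it is long enough, then start afresh.
        if len(current) >= min_len:
            blocks.append(list(current))
        current.clear()

    for gene in order_a:
        extends = (
            bool(current)
            and gene in pos_b
            and pos_b[gene] == pos_b[current[-1]] + 1
        )
        if not extends:
            close_block()
        if gene in pos_b:
            current.append(gene)
    close_block()
    return blocks


# Hypothetical gene orders on one chromosome of species A and species B
species_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
species_b = ["g4", "g5", "g6", "g1", "g2", "g3"]
print(conserved_blocks(species_a, species_b))
```

Here a single rearrangement has swapped two blocks, but the genes within each block are still neighbors in both species.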
Fig. 21-11: Blocks of DNA sequence on human chromosome 16 and blocks of similar sequences in four mouse chromosomes (7, 8, 16, and 17)
• The rate of duplications and inversions seems to have accelerated about 100 million years ago
– This is roughly 35 million years before large dinosaurs went extinct (about 65 million years ago) and mammals diversified rapidly
• The association is likely not accidental, since chromosomal rearrangements are thought to contribute to the generation of new species
– This is because chromosomal rearrangements can lead to 2 populations that cannot successfully mate, one of the steps towards speciation
• Analysis of chromosomal breakage points associated with rearrangements showed that these spots are not randomly distributed
– Specific sites known as “hot spots” have been used over and over again
• Some of these “hot spots” correspond to locations that are associated with congenital diseases
Evolution of Genes with Related Functions: The Human Globin Genes
• Duplication events can lead to the evolution of genes with related functions
– The genes encoding the various globin proteins (α and β) evolved from
one common ancestral globin gene, which duplicated and diverged
about 450–500 million years ago
• After the duplication events, differences between the genes in the
globin family arose from the accumulation of mutations
Fig. 21-13: Evolutionary time line: ancestral globin gene; duplication of ancestral gene; mutation in both copies; transposition to different chromosomes; further duplications and mutations; α-globin gene family on chromosome 16 and β-globin gene family on chromosome 11
• Subsequent duplications of these genes and random mutations
gave rise to the present globin genes, which code for oxygen-
binding proteins
– The similarity in the amino acid sequences of the various
globin proteins supports this model of gene duplication and
mutation
Table 21-2
Evolution of Genes with Novel Functions
• The copies of some duplicated genes have diverged so much in evolution
that the functions of their encoded proteins are now very different
– Ex) the lysozyme gene was duplicated and evolved into the α-
lactalbumin gene in mammals
• Lysozyme is an enzyme that helps protect animals against bacterial
infection
• α-lactalbumin is a nonenzymatic protein that plays a role in milk
production in mammals
– Although their functions are quite different, these 2 proteins are very
similar in their amino acid sequences and 3-D structure
• Both genes are found in mammals, while only the gene for
lysozyme is found in birds
– This finding suggests that the lysozyme gene underwent
duplication after the divergence of avian and mammalian
lineages
Rearrangements of Parts of Genes: Exon Duplication and Exon Shuffling
• Rearrangement of existing DNA sequences within genes has also contributed to genome evolution
– Ex) The duplication or repositioning of exons has contributed to genome evolution
• Errors during meiosis can result in an exon being duplicated on one chromosome and deleted from the homologous chromosome
• Alternatively, occasional mixing and matching of different exons within a gene or even between 2 nonallelic genes due to errors in recombination can also occur
– This process is called exon shuffling and can lead to new proteins with novel functions
• Ex) The TPA protein that helps control blood clotting is thought to have arisen by several instances of exon shuffling and duplication
Fig. 21-14: Portions of ancestral genes: epidermal growth factor gene with multiple EGF exons (green), fibronectin gene with multiple “finger” exons (orange), and plasminogen gene with a “kringle” exon (blue). Exon shuffling and exon duplication gave rise to the TPA gene as it exists today.
How Transposable Elements Contribute to Genome Evolution
• Transposable elements also play an important role in shaping a genome over evolutionary time
– These elements can contribute to genome evolution in several ways:
• They can promote recombination, disrupt cellular genes or control elements, and carry entire genes or individual exons to new locations
– Multiple copies of similar transposable elements may facilitate recombination between different chromosomes by providing homologous regions for crossing over
• Though most such alterations are lethal, an occasional recombination event may be advantageous to an organism
– Movement of transposable elements can also have direct consequences
• Insertion of transposable elements within a protein-coding sequence may block protein production
• Insertion of transposable elements within a regulatory sequence may increase or decrease protein production
• Transposable elements may carry a gene or groups of genes to a new location
• Transposable elements may also create new sites for alternative splicing in an RNA transcript
– In all cases, changes are usually detrimental but may on occasion prove advantageous to an organism
Concept 21.6: Comparing genome sequences provides clues to evolution and development
• Genome sequencing has advanced rapidly in
the last 20 years
• Comparative studies of genomes:
– Advance our understanding of the evolutionary
history of life
– Help explain how the evolution of development
leads to morphological diversity
Comparing Genomes
• Genome comparisons of closely related species help us understand
recent evolutionary events
– Alternatively, genome comparisons of distantly related species
help us understand ancient evolutionary events
• In either case, learning about characteristics that are shared
or divergent between groups increases our understanding of
evolution
• The relationships
among species can
be represented by a
tree-shaped diagram
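Fig. 21-15: Evolutionary trees. One tree shows Bacteria, Eukarya, and Archaea descending from the most recent common ancestor of all living things (time axis: 4 to 0 billions of years ago); a second tree shows chimpanzee, human, and mouse (time axis: 70 to 0 millions of years ago)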
Comparing Distantly Related Species
• Highly conserved genes are genes that have changed very little
over time
– These inform us about relationships among species that
diverged from each other a long time ago
• Ex) Comparisons of complete genome sequences of
bacteria, archaea, and eukaryotes indicate that these
groups diverged from each other between 2 and 4
billion years ago
– Highly conserved genes can also be studied in one model
organism, and the results applied to other organisms
• Ex) Several genes in yeast are so similar to certain
human disease genes that researchers have deduced
their functions by studying these genes in yeast
Comparing Closely Related Species
• The genomes of closely-related species are likely to show similar
organization due to their relatively recent divergence
– Genetic differences between closely related species can therefore be
more easily correlated with phenotypic differences
• For example, genetic comparison of several mammals with
nonmammals helps identify what it takes to make a mammal
– Human and chimpanzee genomes differ by 1.2% at single base pairs
and by a further 2.7% because of insertions and deletions
• Several genes are evolving faster in humans than chimpanzees
– These include genes involved in defense against malaria and
tuberculosis, regulation of brain size, and genes that code for
transcription factors
• One particular gene called FOXP2 that encodes a
transcription factor shows evidence of rapid change in the
human lineage
– Evidence suggests that this gene functions in
vocalization in vertebrates
• Mutations in this gene can produce severe speech
and language impairment in humans, as well as
disruption of normal vocalizations in other animals
– There are only 2 amino acid differences between
human and chimpanzee FOXP2 proteins
• These differences may explain why humans but not
chimpanzees communicate by speech
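Figures such as "differ by 1.2% at single base pairs" or "2 amino acid differences" come from position-by-position comparison of aligned sequences. The sketch below shows that comparison on two short made-up sequences; they are not real human, chimpanzee, or FOXP2 data, and the sequences are assumed to be already aligned with no insertions or deletions.

```python
def differences(seq_a: str, seq_b: str):
    """List (position, residue_a, residue_b) wherever two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which the two sequences differ."""
    return 100.0 * len(differences(seq_a, seq_b)) / len(seq_a)


# Hypothetical aligned DNA sequences from two species
species_1 = "ACGTACGTAC"
species_2 = "ACGTTCGTAC"

print(differences(species_1, species_2))
print(f"{percent_difference(species_1, species_2):.1f}% different")
```

The same functions work on aligned amino acid sequences, since they only compare characters.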
• The strongest evidence for the function of the FOXP2 gene came from a “knock-out” experiment in which researchers disrupted this gene in mice and analyzed the resulting phenotype
– Mice produce ultrasonic squeaks, referred to as whistles, to communicate stress
• Researchers applied genetic engineering techniques to produce mice in which one or both copies of the FOXP2 gene were disrupted
• These mutant newborn pups were then separated from their mothers to induce vocalizations
– Homozygous mutant mice had malformed brains and failed to emit normal ultrasonic vocalizations
– Heterozygous mice also showed significant problems with vocalization
Fig. 21-16: FOXP2 knock-out experiment in mice
EXPERIMENT: Mice compared: wild type (two normal copies of FOXP2), heterozygote (one copy of FOXP2 disrupted), and homozygote (both copies of FOXP2 disrupted)
Experiment 1: Researchers cut thin sections of brain and stained them with reagents, allowing visualization of brain anatomy in a UV fluorescence microscope.
Experiment 2: Researchers separated each newborn pup from its mother and recorded the number of ultrasonic whistles produced by the pup.
RESULTS: Experiment 1: brain sections from wild type, heterozygote, and homozygote. Experiment 2: bar graph of number of whistles (axis 0 to 400) for wild type, heterozygote, and homozygote (no whistles).
Comparing Genomes Within a Species
• As a species, humans have only been around about 200,000 years and have low within-species genetic variation
– Variation within humans is due to single nucleotide polymorphisms, inversions, deletions, and duplications
– These variations are useful for studying human evolution and human health
• Ex) They can serve as markers for identifying genes that cause diseases or affect our health
Comparing Developmental Processes
• Evolutionary developmental biology, or evo-devo,
is the study of the evolution of developmental
processes in multicellular organisms
– Its aim is to understand how these processes
have evolved and how changes in them can
modify existing organismal features
• Genomic information has shown that minor
differences in gene sequence or regulation
can result in major differences in form
Widespread Conservation of Developmental Genes Among Animals
• Recall: Homeotic genes specify the identity of body segments in fruit flies and other animals
– Molecular analysis of the homeotic genes in Drosophila has shown that they all include a sequence called a homeobox
• This homeobox specifies an amino acid sequence called a homeodomain in the encoded protein
• An identical or very similar nucleotide sequence has been discovered in the homeotic genes of both vertebrates and invertebrates
– These resemblances even extend to the organization of these genes, as evidenced by the fact that vertebrate genes homologous to the homeotic genes of fruit flies have kept the same chromosomal arrangement
Fig. 21-17: Homeotic genes in a fruit fly and a mouse (labels: adult fruit fly, fruit fly embryo (10 hours), fly chromosome, mouse chromosomes, mouse embryo (12 days), adult mouse)
• From these similarities, we can deduce that the
homeobox DNA sequence evolved very early
in the history of life
– It must also have been sufficiently valuable
to organisms to have been conserved in
animals and plants virtually unchanged for
100s of millions of years
• Homeotic genes in animals were named Hox genes (short
for homeobox-containing genes) because homeotic genes
were the 1st genes found to have a homeobox
– Other homeobox-containing genes were later found
that do not act as homeotic genes
• These genes do not directly control the identity of
body parts
• Most of these genes, however, are associated with
development
– Ex) Homeoboxes are present in the egg-polarity
gene bicoid, as well as in several segmentation
genes and in a master regulatory gene for eye
development
• Researchers have discovered that the homeobox-encoded homeodomain is
the part of a protein that binds to DNA when the protein functions as a
transcriptional regulator
– The shape of the homeodomain allows it to bind to any DNA segment,
so by itself, it cannot select a specific sequence
– Instead, other domains in the protein determine which genes the
protein regulates
• Interactions of these other domains with transcription factors help
the protein recognize specific enhancers in DNA
– Proteins with homeodomains probably regulate development by
coordinating the transcription of many developmental genes, switching
them on or off
• In the embryos of many animal species, different combinations of
homeobox genes are active in different parts of the embryo
– This selective expression of regulatory genes, which varies in
space and time, is critical to pattern formation
• Many other genes involved in development are also highly conserved from
species to species
– These include many genes encoding components of signaling
pathways
– These similarities among developmental genes in different animal
species raise questions as to how the same genes can be involved in
the development of very diverse animals
• Current studies suggest that small changes in the regulatory
sequences of particular genes cause changes in gene expression
patterns, leading to major changes in body form
– Ex) Differing patterns of expression of Hox genes along the
body axis explain the variation in number of leg-bearing
segments in insects and crustaceans
Fig. 21-18 (labels: thorax, genital segments, abdomen)
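A toy sketch can make the idea of "same genes, different expression pattern" concrete. It assumes a deliberately simplified, hypothetical rule: a segment bears legs unless a leg-repressing regulator is expressed in it. The segment names, the two species, and the function `leg_bearing_segments` are all invented for illustration and do not model any particular Hox gene.

```python
def leg_bearing_segments(segments, repressor_domain):
    """Return the segments that bear legs under the toy rule:
    a segment grows legs unless the hypothetical repressor is expressed in it."""
    return [s for s in segments if s not in repressor_domain]


# Invented body plan: six trunk segments, numbered front to back.
segments = ["T1", "T2", "T3", "T4", "T5", "T6"]

# Two hypothetical species with the same gene but different expression domains.
species_a = {"T4", "T5", "T6"}  # repressor expressed from T4 backward
species_b = {"T6"}              # repressor expressed only in T6

print(len(leg_bearing_segments(segments, species_a)))  # 3
print(len(leg_bearing_segments(segments, species_b)))  # 5
```

Nothing about the gene itself differs between the two toy species. Only the region where it is expressed changes, and that alone changes the number of leg-bearing segments.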
• Recent research also suggests that the same Hox gene
product may have subtly different effects in different
species
– These effects include turning on new genes or turning
on the same genes at higher or lower levels
• In still other cases, similar genes direct different
developmental processes in different organisms
– These distinct developmental processes result in
diverse body forms
• Ex) Several Hox genes are expressed in the
embryonic and larval stages of the sea urchin, even
though sea urchins are nonsegmented animals
Comparison of Animal and Plant Development
• The last common ancestor of animals and plants was probably
a single-celled eukaryote
– This indicates that the processes of development must
have evolved independently in these two lineages
• Animals require morphogenetic movements of cells and
tissues
• Morphogenesis in plants relies primarily on differing
planes of cell division and on selective cell enlargement
– Despite these differences, there are still similarities in the
molecular mechanisms of development
• In both plants and animals, development relies on a
cascade of transcriptional regulators that turn genes on
or off
• The genes that direct these processes, however, differ
considerably between plants and animals
– While many master regulatory switches in animals are
homeobox-containing Hox genes, those in plants belong to
a completely different family of genes called MADS-box
genes
• Although homeobox-containing genes can be found in
plants and MADS-box genes in animals, they do not
perform the same major roles in development that they
do in the other group
– This supports the idea that developmental programs
evolved separately in plants and animals
You should now be able to:
1. Explain how linkage mapping, physical mapping, and DNA sequencing each contributed to the Human Genome Project
2. Define and compare the fields of proteomics and genomics
3. Describe the surprising findings of the Human Genome Project with respect to the size of the human genome
4. Distinguish between transposons and retrotransposons
5. Explain how polyploidy may facilitate gene
evolution
6. Describe in general terms the events that may
have led to evolution of the globin superfamily
7. Explain the significance of the rapid evolution
of the FOXP2 gene in the human lineage
8. Provide evidence that suggests that the
homeobox DNA sequence evolved very early
in the history of life