Class V
Manipulating and characterizing nucleic acids – II (Determining the sequence of the human genome)
1. Polymerase chain reaction – PCR
2. Cloning of DNA fragments
3. Sequencing of cloned DNA
4. Using cloned DNA fragments to study gene expression
The human genome project aimed to determine the DNA sequence of the whole human genome, and also the sequence of all the genes in the human genome.
National Institutes of Health
National Library of Medicine
National Center for Biotechnology Information (NCBI)
http://www.ncbi.nlm.nih.gov/mapview/
http://www.ncbi.nlm.nih.gov/projects/mapview/map_search.cgi?taxid=9606&query=
http://www.ncbi.nlm.nih.gov/projects/mapview/maps.cgi?taxid=9606&chr=1
[Figure: NCBI Map Viewer, human chromosome 1. Labeled features: “contig” and “gene”. Position labels: 22900K = 22,900,000th base pair; 23M = 23,000,000th base pair.]
STRATEGY (human genome project):
Obtain tissue
Obtain Genomic DNA of this tissue
Determine the DNA sequence of the Genomic DNA
The human genome project:
- We are ready to “sequence” the whole genome. But how can we “sequence” DNA if we do not know the sequence to start with?
(Remember, we need to make an oligonucleotide that is complementary to a small part of the DNA before we can do DNA sequencing.)
Solution: We can use a plasmid sequence!
To be able to use the plasmid DNA sequence where our oligonucleotide will hybridize (in red circles), we have to insert the DNA that we want to sequence into the plasmid.
“Sequencing” after cloning a DNA fragment into a plasmid
5’-GAATTCATGGATACGAACGAGCATTAGAATTC-3’
3’-CTTAAGTACCTATGCTTGCTCGTAATCTTAAG-5’
[Figure: nucleotides A, C, T, G and labeled A*]
If our DNA molecule contains restriction endonuclease (R.E.) sites at both ends, we can cut it out and paste it into a plasmid.
5’-AGAGGAATTCATGGATACGAACGAGCATTAGAATTCTCTACCTT-3’
3’-TCTCCTTAAGTACCTATGCTTGCTCGTAATCTTAAGAGATGGAA-5’
After ligating our DNA molecule into a plasmid, we will have “plasmid DNA sequences” at the 5’ and 3’ ends of our DNA.
5’-AGAGGAATTCATGGATACGAACGAGCATTAGAATTCTCTACCTT-3’
                                    3’-AGATGGAA-5’
And since we know the plasmid DNA sequence, we can hybridize our oligonucleotide to these known sequences.
and perform our DNA sequencing analysis
5’-AGAGGAATTCATGGATACGAACGAGCATTAGAATTCTCTACCTT-3’
                                    3’-AGATGGAA-5’
This way we will be able to determine the sequence of the DNA molecule inserted into the plasmid, regardless of what that sequence is.
5’-AGAGGAATTCATGGATACGAACGAGCATTAGAATTCTCTACCTT-3’
                                    3’-AGATGGAA-5’
5’-AGAGGAATTCTTACGCGCTTCAACAATTCAGAATTCTCTACCTT-3’
                                    3’-AGATGGAA-5’
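The idea can be sketched in a few lines of Python: the same oligonucleotide, complementary to the known plasmid sequence, gives a read through whatever insert lies next to it. This is only a toy model of the logic (it returns the whole complementary strand rather than simulating chain termination), and the function names are made up for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def sequencing_read(template, primer):
    """Return the strand synthesized from `primer` along `template`.

    Both arguments are written 5'->3'. The read is also 5'->3' and
    starts with the first nucleotide added after the primer.
    """
    site = reverse_complement(primer)      # where the primer hybridizes
    start = template.find(site)
    if start == -1:
        raise ValueError("primer does not hybridize to this template")
    # The polymerase copies everything upstream of the primer site.
    return reverse_complement(template[:start])


# The oligonucleotide 3'-AGATGGAA-5' from the slides, written 5'->3'.
PRIMER = "AAGGTAGA"

# Two clones: same plasmid flanks, different inserts.
clone_1 = "AGAGGAATTCATGGATACGAACGAGCATTAGAATTCTCTACCTT"
clone_2 = "AGAGGAATTCTTACGCGCTTCAACAATTCAGAATTCTCTACCTT"

for clone in (clone_1, clone_2):
    read = sequencing_read(clone, PRIMER)
    print("read:             ", read)
    print("top strand equiv.:", reverse_complement(read))
```

The read is the complement of the top strand, so taking its reverse complement recovers the insert as it is written on the slide.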
So we need to insert the DNA we want to sequence (our genomic DNA) into a plasmid using restriction endonuclease digestion and ligation*.
Let's choose an R.E.: EcoRI
Then let's digest our genomic DNA (all of it) with EcoRI
...and insert these DNA pieces into a plasmid that was also digested with EcoRI
*(For the “actual” human genome project, the genomic DNA was not digested but fragmented by force.)
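A minimal sketch of what digestion with EcoRI does to a DNA sequence, written in Python. EcoRI recognizes GAATTC and cuts between the G and the first A; only the top strand is modeled here, and the function name `digest` is made up for this illustration.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut `seq` (top strand, 5'->3') at every occurrence of `site`.

    cut_offset=1 places the cut after the first base of the site
    (G^AATTC for EcoRI). The sticky ends on the bottom strand are
    not modeled.
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# The example molecule used earlier: a stretch of DNA flanked by two EcoRI sites.
dna = "AGAGGAATTCATGGATACGAACGAGCATTAGAATTCTCTACCTT"
for fragment in digest(dna):
    print(fragment)
```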
STRATEGY (human genome project):
[Figure: genomic DNA (1 2 3 4 5 ...) is cut with EcoRI into fragments (1, 2, 3, 4, 5), which are inserted into plasmids and sequenced (√). 1, 2, 3 etc. are the inserts: “contigs”.]
QUESTION: We generated many DNA fragments; how shall we pick one to sequence?
4^6 = 4,096, so an EcoRI site is expected once every 4,096 base pairs: 3,000,000,000 × (1/4,096) ≈ 732,421 fragments
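The arithmetic behind this estimate can be checked with a short sketch. It assumes, as a simplification, that the four bases are equally frequent and independent of each other, and it takes the genome size as 3,000,000,000 bp (the round number implied by the fragment count); the variable names are illustrative only.

```python
# Rough estimate of how many fragments a complete EcoRI digest produces.
# Assumptions (simplifications, not measured values): all four bases are
# equally frequent and independent, and the genome is 3,000,000,000 bp long.

SITE = "GAATTC"              # EcoRI recognition site (6 bp)
GENOME_SIZE = 3_000_000_000  # assumed genome size in base pairs

# A specific 6-bp sequence is expected once every 4^6 base pairs.
bp_per_site = 4 ** len(SITE)

expected_fragments = GENOME_SIZE / bp_per_site

print(f"one site every {bp_per_site:,} bp on average")
print(f"about {expected_fragments:,.0f} fragments")
```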
ANSWER:
Let's put the genomic DNA fragments into plasmids...
...and make sure each fragment goes into a different plasmid:
# of fragments (100) <<< # of plasmids (100,000)
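Why a large excess of plasmids helps can be illustrated with a simple probability sketch. It assumes, as a deliberate simplification of real ligation, that each fragment ends up in a plasmid chosen independently and uniformly at random; the function name is hypothetical.

```python
def prob_all_in_different_plasmids(n_fragments, n_plasmids):
    """Probability that no plasmid receives more than one fragment,
    assuming every fragment picks a plasmid uniformly at random."""
    p = 1.0
    for already_used in range(n_fragments):
        # The next fragment must land in a plasmid that is still empty.
        p *= (n_plasmids - already_used) / n_plasmids
    return p


# Numbers from the slide: 100 fragments, 100,000 plasmids.
print(prob_all_in_different_plasmids(100, 100_000))

# For comparison: as many fragments as plasmids.
print(prob_all_in_different_plasmids(100, 100))
```

Under this assumption the first case gives a probability of roughly 0.95, while with equal numbers it is practically zero.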
Then, let's transfer the “recombinant” plasmids into host cells (bacteria or yeast)...
...and make sure each plasmid goes into a different cell:
# of cells >>> # of recombinant plasmids
This is a “Genomic DNA Library”.
Now let's grow each bacterium in a separate place.
[Figure: plate containing nutrients for the bacteria]
If we keep this plate at 37°C for 12 hours, each bacterium will grow into a colony, and each colony contains “clones”, or copies, of that bacterium.
[Figure: bacterial colony; clones]
Since each colony contains only one type of bacterium with only one type of plasmid, we have in this way made many identical copies of the plasmid, i.e. “cloned the plasmid”.
After the bacteria divide and grow, we can disrupt them and purify the plasmid.
In this way we have pure plasmids that we can use for sequencing.
5’-ATGTCGGCTACTGCCTAGCAGGCGC…..
...and if we determine the sequence of all the DNA cloned in our library, we will have the DNA sequence of the human genome.
However, we have a problem:
We used this strategy and determined the DNA sequence of all our genomic DNA fragments. But this way we do not know in which order these DNA fragments are found in the genome, i.e. there is no identifier on a sequence that tells us that one fragment comes after another.
[Figure: sequenced fragments in unknown order: 1, 7, 5, 3, 2, 8, ...? 1? 4? 3? 2?]
We can know the order of DNA fragments on the chromosome if we sequence “overlapping” DNA fragments
[Figure: genomic DNA with three EcoRI sites, shown as overlapping fragments]
We can generate overlapping DNA fragments by “partial digestion”
Complete digestion: 37°C, 1 hour
Partial digestion: 37°C, 5 minutes
[Figure: genomic DNA with three EcoRI sites]
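To see why partial digestion yields overlapping fragments, the sketch below compares the two cases on a made-up sequence with three EcoRI sites. It assumes that in a partial digest any subset of the sites may be cut in a given molecule, so every stretch between two boundaries can occur as a fragment; the sequence and the function names are hypothetical.

```python
from itertools import combinations


def boundaries(seq, site="GAATTC", cut_offset=1):
    """Positions where a fragment can start or end: both ends of the
    molecule plus every EcoRI cut position (G^AATTC)."""
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    return [0] + cuts + [len(seq)]


def complete_digest(seq):
    """Every site is cut: only stretches between neighboring boundaries."""
    b = boundaries(seq)
    return [seq[i:j] for i, j in zip(b, b[1:])]


def partial_digest_products(seq):
    """Some sites stay uncut: any stretch between two boundaries can occur."""
    b = boundaries(seq)
    return [seq[i:j] for i, j in combinations(b, 2)]


# Made-up molecule with three EcoRI sites.
toy = "AAAA" + "GAATTC" + "CCCC" + "GAATTC" + "TTTT" + "GAATTC" + "GGGG"

print("complete:", complete_digest(toy))
print("partial: ", partial_digest_products(toy))
```

In the partial digest, products such as `toy[0:15]` and `toy[5:25]` share the stretch `toy[5:15]`; the complete digest never produces two fragments that share sequence.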
This way, when we clone and sequence our “overlapping” fragments, some will have regions where the DNA sequence is identical to that of another fragment. Using this information we can find the order of the fragments as they exist in our genomic DNA.
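The ordering step can be sketched as follows. This is a deliberately simple greedy approach on made-up sequences, not the procedure actually used for the genome project: it repeatedly joins the two fragments whose end and start share the longest identical stretch. All names and sequences are hypothetical.

```python
def overlap(left, right, min_len=4):
    """Length of the longest end of `left` that is identical to the
    start of `right` (0 if shorter than `min_len`)."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def order_and_merge(fragments, min_len=4):
    """Greedily join the pair with the longest overlap until none is left."""
    frags = list(fragments)
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining fragments overlap
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for n, f in enumerate(frags) if n not in (best_i, best_j)]
        frags.append(merged)
    return frags


# A made-up "genome" and three overlapping fragments cut from it.
genome = "ATGTCGGCTACTGCCTAGCAGGCGCTTAACGGATCC"
f1, f2, f3 = genome[0:16], genome[10:28], genome[22:36]

# The fragments are given in scrambled order, as they would come from a library.
print(order_and_merge([f3, f1, f2]))
```

Without overlaps (for example after complete digestion) `overlap` returns 0 for every pair and the fragments cannot be ordered.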
Once we have determined the sequences of all the overlapping genomic DNA fragments, we will use software that can show us where the overlapping sequences are. This software is called BLAST (Basic Local Alignment Search Tool) and is provided by NCBI.
http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastHome
Towards the bottom of the page, there is the option of clicking on “aligning two or more sequences”. This is what we will use.
We will then paste into the top window the genomic DNA sequence obtained from one fragment (#1) and into the bottom window that of another fragment (#2), and click on “BLAST” (bottom of page).
The next page that opens will show the sequences that are identical between genomic DNA fragments #1 and #2. The result on the left shows that the DNA sequence from the 84th to the 116th nucleotide of fragment #1 (Query*) is identical to the sequence from the 1st to the 33rd nucleotide of fragment #2 (Sbjct* = subject).
*names given to the sequences by BLAST
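BLAST itself is a far more sophisticated program; the following is only a stand-in that shows how the coordinates in such a result are read. It uses Python's standard `difflib` module to find the longest identical stretch between two made-up fragments, built so that the result matches the example numbers, and reports 1-based positions the way the BLAST output does. The function name and both sequences are hypothetical.

```python
from difflib import SequenceMatcher


def longest_identical_region(query, subject):
    """Return 1-based (query_start, query_end, subject_start, subject_end)
    of the longest stretch that is identical in both sequences."""
    # autojunk=False: do not let difflib ignore frequently occurring letters.
    matcher = SequenceMatcher(None, query, subject, autojunk=False)
    m = matcher.find_longest_match(0, len(query), 0, len(subject))
    return (m.a + 1, m.a + m.size, m.b + 1, m.b + m.size)


# Made-up fragments: #1 ends with the same 33 nucleotides that #2 starts with.
shared = "GATTACAGGCTTCAGGTCCATGACGTTAGCCAT"
fragment_1 = "ACT" * 27 + "AC" + shared   # 83 + 33 = 116 nucleotides
fragment_2 = shared + "GGTT" * 10

q_start, q_end, s_start, s_end = longest_identical_region(fragment_1, fragment_2)
print(f"Query  {q_start}-{q_end}")
print(f"Sbjct  {s_start}-{s_end}")
```

Because the identical region lies at the end of fragment #1 and at the start of fragment #2, fragment #2 follows fragment #1 in the genomic DNA.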
We have thus determined the sequence of our genomic DNA
But how were the locations of the individual genes determined?
Required reading:
Lodish chapter: 5.2
Brown chapter: 10.1.1 (chain termination sequencing)
Suggested reading:
Brown chapter: 10.2.2 (sequencing the genome)
Brown chapter: 8.1 – 3 (library preparation)
http://jgi.doe.gov/education/how/how_1.html
http://www.ornl.gov/sci/techresources/Human_Genome/home.shtml
1. For the “human genome project”, why do we use a plasmid to sequence a fragment of the human DNA?
2. How can we make sure that each DNA fragment is cloned into a different plasmid?
3. Why do we care that there is only one fragment in a given plasmid?
4. What do we mean by a “genomic DNA library”?
5. What do we mean by “cloning genomic DNA into a plasmid”?
6. What is “pyrosequencing” (Brown ch. 10.1)? What is its advantage over chain-termination sequencing?
7. How does the “shotgun approach” work? (Brown ch. 10.2)