Current and Future tools for DNA Profiling
Daniele Podini [email protected]
Forensic Science?
Outline
• PCR • Forensic Markers • STR analysis • Profile Frequency • Troubleshooting • Analytical Process • Future Technologies
THEN THERE WAS PCR
Polymerase Chain Reaction (PCR)
• Polymerase Chain Reaction = molecular Xeroxing
• “Amplify” the desired DNA fragment(s)
• Increased sensitivity • 1988 FBI starts DNA section
Dr. Kary Mullis Eccentric Genius
1985
http://www.youtube.com/watch?v=L51UvB5za7c http://www.karymullis.com/pcr.shtml
Limitations of DNA Testing (Past)
[Figure: forensic DNA markers plotted by speed of analysis (technology; slow to fast) against power of discrimination (genetics; low to high). Markers shown: ABO blood groups, DQα, D1S80, PolyMarker, RFLP single-locus probes, RFLP multi-locus probes, multiplex STRs.]
Figure 1.1, J.M. Butler (2005) Forensic DNA Typing, 2nd Edition © 2005 Elsevier Academic Press
Historical Perspective on DNA Typing
[Figure: timeline with axis marks at 1985, 1990, 1994, 1996, 1998, 2000, 2002 and 2004. Technology eras shown: RFLP; DQA1 & PM (dot blot); multiplex STRs.]
Milestones labeled on the timeline:
• 1985: PCR developed
• First STRs developed
• 1992: Capillary electrophoresis of STRs first described
• FSS Quadruplex
• UK National Database launched (April 10, 1995)
• First commercial fluorescent STR multiplexes
• CODIS loci defined
• PowerPlex® 16 (16 loci in single amp)
• Identifiler 5-dye kit and ABI 3100
• 2004: Y-STRs
Gill et al. (1985) Forensic application of DNA 'fingerprints'. Nature 318:577-9
Advantages for STR Markers
• Small product sizes are generally compatible with degraded DNA, and PCR enables recovery of information from small amounts of material
• Multiplex amplification with fluorescence detection enables high power of discrimination in a single test
• Commercially available in an easy-to-use kit format
• Uniform set of core STR loci provides capability for national and international sharing of criminal DNA profiles
Short Tandem Repeat (STR) Markers
TCCCAAGCTCTTCCTCTTCCCTAGATCAATACAGACAGAAGACAGGTGGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATATCATTGAAAGACAAAACAGAGATGGATGATAGATACATGCTTACAGATGCACAC
= 12 GATA repeats (“12” is all that is reported)
Target region (short tandem repeat)
7 repeats 8 repeats 9 repeats
10 repeats 11 repeats 12 repeats
13 repeats
The number of consecutive repeat units can vary between people
An accordion-like DNA sequence that occurs between genes
The FBI has selected 13 core STR loci that must be run in all DNA tests in order to provide a common currency with DNA profiles
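As a minimal sketch of how an allele designation such as "12" follows from the sequence, the Python below finds the longest uninterrupted run of a repeat motif. The toy allele (short flanks around 12 GATA units) and the function name `longest_repeat_run` are illustrative assumptions; this is not the slide's sequence and not the method used by any genotyping software, which sizes fragments rather than reading sequence.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the number of motif copies in the longest uninterrupted run."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Toy allele (assumption): hypothetical flanking bases around 12 GATA units.
toy_allele = "CAGGTG" + "GATA" * 12 + "TCATTG"
allele_call = longest_repeat_run(toy_allele, "GATA")
print(allele_call)
```

Only the repeat count is reported as the allele, which is why two people can be compared locus by locus with a handful of small integers.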
Short Tandem Repeats
[Figure: locus D5S818 in an individual with alleles of 11 and 12 repeats]
The ABI 310 Genetic Analyzer
ABI 310 Genetic Analyzer: Capillary Electrophoresis
[Figure: fragments passing the detector window; D5S818 peaks at alleles 11 and 12, shown against an allelic ladder spanning 7 to 15 repeats]
[Figure: D8S1179 with peaks labeled 11 and 9]
[Figure: scanned gel image and capillary electropherogram]
The polymerase chain reaction (PCR) is used to amplify STR regions and label the amplicons with fluorescent dyes using locus-specific primers
[Figure: Locus 1 with alleles of 8 and 10 repeats; Locus 2 with alleles of 8 and 9 repeats]
Multiplex PCR (Parallel Sample Processing)
• Compatible primers are the key to successful multiplex PCR
• STR kits are commercially available
• 15 or more STR loci can be simultaneously amplified
Advantages of Multiplex PCR
– Increases information obtained per unit time (increases power of discrimination)
– Reduces labor to obtain results
– Reduces template required (smaller sample consumed)
Challenges to Multiplexing
– Primer design to find compatible primers (no program exists)
– Reaction optimization is highly empirical, often taking months
Statistical estimates: the product rule
Heterozygous genotype at one locus: 2pq = 2 x 0.222 x 0.222 ≈ 0.1 (1 in 10)
Genotype frequencies are multiplied across loci:
1 in 10 x 1 in 111 x 1 in 20 = 1 in 22,200
1 in 100 x 1 in 14 x 1 in 81 = 1 in 113,400
1 in 116 x 1 in 17 x 1 in 16 = 1 in 31,552
Overall profile frequency shown on the slide: 1 in 79,531,528,960,000,000 (1 in 80 quadrillion)
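The arithmetic behind the product rule can be sketched in a few lines of Python. The sketch assumes the loci are inherited independently and that genotype frequencies follow Hardy-Weinberg proportions; the function names are illustrative, and the "1 in N" values are the per-locus figures shown on the slide.

```python
from math import prod


def heterozygote_frequency(p: float, q: float) -> float:
    """Expected frequency of a heterozygous genotype: 2pq."""
    return 2 * p * q


def homozygote_frequency(p: float) -> float:
    """Expected frequency of a homozygous genotype: p squared."""
    return p * p


# Single-locus example from the slide: two alleles, each with frequency 0.222.
locus_frequency = heterozygote_frequency(0.222, 0.222)

# Per-locus "1 in N" values for the three groups of loci shown on the slide.
rows = [[10, 111, 20], [100, 14, 81], [116, 17, 16]]
row_products = [prod(row) for row in rows]

print(round(locus_frequency, 3))
print(row_products)
```

Multiplying "1 in N" values is the same as multiplying the underlying frequencies, which is why each additional independent locus shrinks the profile frequency so quickly.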
CODIS Short Tandem Repeats
DNA Profile Frequencies
• The likelihood of two unrelated individuals matching at all 13 loci is less than 1 in 100 trillion (on the order of 1 in 10^12 to 10^15)
• With 6 billion people on earth you wouldn’t expect to find two matching profiles at 13 loci
• Only identical twins have the same profile
DNA Databasing
• DNA databases allow for the ability to search criminal DNA profiles
• 10/13/1998: FBI launched US database
– Combined DNA Index System (CODIS)
• Revolutionized ability to link crime scene evidence to perpetrators
• Databases effective: >60% of violent criminals are rearrested within 3 yrs. for a similar offense
As of February 2011: 9,404,747 offender profiles and 361,176 forensic profiles; 138,700 hits assisting in more than 133,400 investigations
Biological “Artifacts” of STR Markers
• Stutter Products • Non-template nucleotide addition • Microvariants • Tri-allelic patterns • Null alleles • Mutations
Stutter Products
• Peaks that show up primarily one repeat less than the true allele as a result of strand slippage during DNA synthesis
• Stutter is less pronounced with larger repeat unit sizes (amount of stutter: dinucleotides > tri- > tetra- > penta-)
Types of STR Repeat Units
• Dinucleotide: (CA)(CA)(CA)(CA)
• Trinucleotide: (GCC)(GCC)(GCC)
• Tetranucleotide: (AATG)(AATG)(AATG)
• Pentanucleotide: (AGAAA)(AGAAA)
• Hexanucleotide: (AGTACA)(AGTACA)
Requires size-based DNA separation to resolve different alleles from one another
Short tandem repeat (STR) = microsatellite = simple sequence repeat (SSR)
High stutter: YCAII (dinucleotide), ~45%
Low stutter: DYS448 (hexanucleotide), <2%
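STR Alleles with Stutter Products
[Figure: electropherogram of D21S11, D18S51 and D8S1179; x-axis DNA size (bp), y-axis relative fluorescence units. Each allele peak is preceded by a stutter product; stutter peaks are labeled 6.3%, 6.2% and 5.4% of the allele.]
Figure 6.1, J.M. Butler (2005) Forensic DNA Typing, 2nd Edition © 2005 Elsevier Academic Press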
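A stutter percentage like those in the figure is simply the stutter peak height expressed relative to its parent allele peak. The sketch below is a minimal illustration; the peak heights and the 15% filter threshold are assumptions chosen for the example, not values from the slides or from any kit's validated stutter filter.

```python
def stutter_percent(stutter_height: float, allele_height: float) -> float:
    """Stutter peak height as a percentage of the parent allele peak height."""
    if allele_height <= 0:
        raise ValueError("allele_height must be positive")
    return 100.0 * stutter_height / allele_height


def is_probable_stutter(stutter_height: float, allele_height: float,
                        threshold_percent: float = 15.0) -> bool:
    """Treat a peak one repeat shorter as stutter if it is below the threshold.

    The 15% default is an assumed placeholder, not a validated laboratory value.
    """
    return stutter_percent(stutter_height, allele_height) <= threshold_percent


# Hypothetical peak heights in RFU.
print(stutter_percent(63, 1000))
print(is_probable_stutter(63, 1000))
print(is_probable_stutter(450, 1000))
```

A peak well above the expected stutter range in the minus-one-repeat position is more likely a true allele, for example from a second contributor in a mixture.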
Slipped Strand Mispairing Model
1. Taq DNA polymerase has extended through 4 repeat units.
2. Taq has fallen off, allowing the two strands to breathe apart.
3. When the two strands re-anneal, the template (bottom) strand has looped out and the extending strand aligns out-of-register by one repeat unit.
4. The newly completed strand contains only 7 repeat units, while the template strand has the original 8 repeat units.
Non-Template Addition
• Taq polymerase will often add an extra nucleotide to the end of a PCR product, most often an “A” (termed “adenylation”)
• Dependent on 5’-end of the reverse primer; a “G” can be put at the end of a primer to promote non-template addition
• Can be enhanced with an extension soak at the end of the PCR cycle (e.g., 15-45 min @ 60 or 72 °C) to give the polymerase more time
• Excess amounts of DNA template in the PCR reaction can result in incomplete adenylation (not enough polymerase to go around)
Best if there is NOT a mixture of “+/- A” peaks (desirable to have full adenylation to avoid split peaks)
[Figure: incomplete adenylation at D8S1179; each allele appears as a split peak with -A and +A forms]
Impact of DNA Amount into PCR
• Too much DNA – Off-scale peaks – Split peaks (+/-A) – Locus-to-locus imbalance
• Too little DNA – Heterozygote peak imbalance – Allele drop-out – Locus-to-locus imbalance
[Figure: D3S1358 electropherograms at 10 ng template (overloaded; split -A/+A peaks) and 2 ng template (suggested level); x-axis DNA size (bp), y-axis relative fluorescence (RFUs)]
[Figure: electropherograms at 100 pg template and 5 pg template; x-axis DNA size (bp)]
Stochastic effect when amplifying low levels of DNA produces allele dropout
Reason that DNA Quantitation is Important Prior to Multiplex Amplification
Generally 0.5 – 2.0 ng DNA template is best for STR kits
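The role of quantitation can be sketched as a simple check of the planned template amount against the 0.5 to 2.0 ng window quoted above, plus the volume of extract needed to hit a target input. The function names, the 1.0 ng default target and the example concentration are assumptions for illustration only.

```python
def classify_template(nanograms: float, low: float = 0.5, high: float = 2.0) -> str:
    """Classify a template amount against the 0.5 to 2.0 ng window from the slide."""
    if nanograms < low:
        return "too little"
    if nanograms > high:
        return "too much"
    return "in range"


def volume_for_target(conc_ng_per_ul: float, target_ng: float = 1.0) -> float:
    """Microliters of extract needed to add target_ng of DNA to the PCR."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ng / conc_ng_per_ul


# Template amounts from the slide's figures, in nanograms (5 pg = 0.005 ng).
for amount in (10, 2, 0.1, 0.005):
    print(amount, classify_template(amount))

# Hypothetical extract quantified at 0.25 ng/uL.
print(volume_for_target(0.25))
```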
Three-Peak Patterns
• “Type 1”: sum of heights of two of the peaks is equal to the third. Most common in D18S51.
• “Type 2”: balanced peak heights. Most common in TPOX and D21S11.
[Figure: example three-peak patterns at D18S51, TPOX and D21S11]
Clayton et al. (2004) A genetic basis for anomalous band patterns encountered during DNA STR profiling. J Forensic Sci. 49(6):1207-1214
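The two pattern types can be told apart from peak heights alone. The following is a rough sketch of that logic, not the procedure from Clayton et al.; the 20% tolerance and the example peak heights are assumptions.

```python
def classify_three_peak(heights, tolerance: float = 0.2) -> str:
    """Classify a tri-allelic pattern from three peak heights.

    Type 2: all three peaks are of similar height.
    Type 1: the two smaller peaks sum to roughly the height of the largest.
    The tolerance is an assumed placeholder, not a validated value.
    """
    if len(heights) != 3 or min(heights) <= 0:
        raise ValueError("expected three positive peak heights")
    small, middle, large = sorted(heights)
    if small / large >= 1 - tolerance:
        return "Type 2"
    if abs((small + middle) - large) / large <= tolerance:
        return "Type 1"
    return "unclassified"


# Hypothetical peak heights in RFU.
print(classify_three_peak([500, 500, 1000]))
print(classify_three_peak([900, 1000, 950]))
print(classify_three_peak([100, 200, 1000]))
```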
Impact of DNA Sequence Variation in the PCR Primer Binding Site
[Figure: a heterozygous sample with alleles 6 and 8 typed under three conditions; * marks the position of the mutation]
• No mutation: heterozygous alleles are well balanced
• Mutation in middle of primer binding site: imbalance in allele peak heights
• Mutation at 3’-end of primer binding site: allele 6 amplicon has “dropped out” (allele dropout)
Butler, J.M. (2005) Forensic DNA Typing, 2nd Edition, Figure 6.9, ©Elsevier Academic Press
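The three outcomes in the figure can be expressed as a peak height ratio check. This is a sketch under assumed thresholds (a 50 RFU detection threshold and a 0.6 balance ratio), with hypothetical peak heights; laboratories set such thresholds through their own validation.

```python
def peak_height_ratio(height_a: float, height_b: float) -> float:
    """Ratio of the smaller to the larger peak (1.0 means perfectly balanced)."""
    larger = max(height_a, height_b)
    if larger <= 0:
        raise ValueError("at least one peak must be above zero")
    return min(height_a, height_b) / larger


def describe_heterozygote(height_a: float, height_b: float,
                          balance_threshold: float = 0.6,
                          detection_threshold: float = 50.0) -> str:
    """Label a two-allele locus; both thresholds are assumed placeholders."""
    if min(height_a, height_b) < detection_threshold:
        return "dropout"
    if peak_height_ratio(height_a, height_b) < balance_threshold:
        return "imbalanced"
    return "balanced"


# Hypothetical heights (RFU) for alleles 6 and 8.
print(describe_heterozygote(950, 1000))   # no mutation
print(describe_heterozygote(300, 1000))   # mutation in middle of primer site
print(describe_heterozygote(0, 1000))     # mutation at 3' end of primer site
```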
Position of Forensic STR Markers on Human Chromosomes
13 CODIS Core STR Loci (core STR loci for the United States, 1997): CSF1PO, D5S818, D21S11, TH01, TPOX, D13S317, D7S820, D16S539, D18S51, D8S1179, D3S1358, FGA, VWA
Sex-typing: AMEL (shown on both X and Y)
Position of Forensic STR Markers on Human Chromosomes
10 SGM Plus Loci (core STR loci for Europe): D21S11, TH01, D16S539, D18S51, D8S1179, D3S1358, FGA, VWA, D2S1338, D19S433
Additional loci shown: SE33 (Germany), D10S1248, D2S441, D22S1045
Years marked on the slide: 1995, 1999, 2005
Sex-typing: AMEL (shown on both X and Y)
European loci overlap with 8 U.S. loci
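The overlap of 8 loci can be checked directly with a set intersection. The locus lists are the ones named on the two chromosome-map slides; the variable names are illustrative.

```python
# 13 CODIS core loci (United States) as listed on the slide.
codis_core = {
    "CSF1PO", "D5S818", "D21S11", "TH01", "TPOX", "D13S317", "D7S820",
    "D16S539", "D18S51", "D8S1179", "D3S1358", "FGA", "VWA",
}

# 10 SGM Plus loci (Europe) as listed on the slide.
sgm_plus = {
    "D21S11", "TH01", "D16S539", "D18S51", "D8S1179", "D3S1358",
    "FGA", "VWA", "D2S1338", "D19S433",
}

shared = sorted(codis_core & sgm_plus)          # loci usable for cross-border comparison
europe_only = sorted(sgm_plus - codis_core)     # typed in SGM Plus but not CODIS core

print(len(shared), shared)
print(europe_only)
```

Only the shared loci contribute to a comparison between a U.S. and a European profile, so the achievable match statistic is lower than for either full set.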
STRBase
http://www.cstl.nist.gov/biotech/strbase/
The Analytical Process
• Extraction • Quantitation • Amplification (PCR – STRs) • Analysis (Capillary Electrophoresis)
Sources of Biological Evidence
• Blood • Semen • Saliva • Urine • Hair • Teeth • Bone • Tissue
[Figure: blood stain]
Only a very small amount of blood is needed to obtain a DNA profile
Extraction 1. Cell Lysis 2. Prevent DNA degradation 3. DNA purification
Cells obtained from a vaginal swab: perpetrator’s sperm mixed with victim’s epithelial cells
1. Remove a portion of the mixed stain
2. Add SDS, EDTA and proteinase K (cell lysis buffer); incubate at 37 °C
3. Centrifuge and REMOVE supernatant (“Female Fraction”), leaving the sperm pellet
4. Add SDS, EDTA and proteinase K + DTT to the sperm pellet; DTT lyses sperm heads, giving the “Male Fraction”
Figure 3.2, J.M. Butler (2005) Forensic DNA Typing, 2nd Edition © 2005 Elsevier Academic Press
Sperm Fraction after Differential Extraction
[Figure: sperm obtained after differential digestion prior to DNA testing]
Automation of Differential Extraction
Quantitation
• Slot Blot • Real time PCR
Amplification (PCR of STRs)
Several Kits
• Applied Biosystems: Profiler Plus, Cofiler, Identifiler Plus, Minifiler
• Promega: Powerplex 16, Powerplex ESX systems
Capillary Electrophoresis
Applied Biosystems has monopoly
AFTER EXTRACTION
Mixed Samples
• ~99% of violent crimes are committed by men
• DNA mixtures of male suspect and female victim can pose an analytical challenge, especially when the female contribution is much greater than the male (preferential amplification)
• Test for markers found only on the Y-chromosome. Only male DNA is amplified!
Y-STRs
• Paternal relatives all share the same Y-STR haplotype
• 10% of Central Asian males share the same Y-STR haplotype, thought to belong to Genghis Khan
• Less statistical significance of inclusion compared to 13 autosomal STRs
• Can increase number of Y-STR loci to increase statistical power
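Because a Y-STR haplotype is inherited as a block, comparison is locus-by-locus equality rather than a product of independent genotype frequencies. A minimal sketch, with hypothetical allele values and an illustrative function name:

```python
def mismatched_loci(sample_a: dict, sample_b: dict) -> list:
    """Return the sorted loci typed in both samples where the alleles differ."""
    shared = sample_a.keys() & sample_b.keys()
    return sorted(locus for locus in shared if sample_a[locus] != sample_b[locus])


# Hypothetical haplotypes (locus -> repeat count); values are made up.
evidence = {"DYS19": 14, "DYS390": 24, "DYS448": 19}
paternal_relative = {"DYS19": 14, "DYS390": 24, "DYS448": 19}
unrelated_male = {"DYS19": 15, "DYS390": 24, "DYS448": 20}

print(mismatched_loci(evidence, paternal_relative))
print(mismatched_loci(evidence, unrelated_male))
```

An empty list means the suspect cannot be excluded, but neither can any of his paternal relatives, which is why a Y-STR inclusion carries less statistical weight than an autosomal match.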
Historical Investigation DNA Study
[Figure: Jefferson family tree. Thomas Jefferson II is the common ancestor of Field Jefferson and Peter Jefferson; Peter Jefferson is the father of President Thomas Jefferson (“?”). The Jefferson Y haplotype is established from the Jefferson male line. Eston Hemings line: same Y haplotype. Thomas Woodson line: different Y haplotype.]
Figure 9.10, J.M. Butler (2005) Forensic DNA Typing, 2nd Edition © 2005 Elsevier Academic Press
Modern Use of Y-STR Testing
Is this man really Saddam Hussein? (Captured December 13, 2003)
Uday and Qusay Hussein (killed July 22, 2003)
Matching Y-STR haplotype used to confirm identity (along with allele sharing from autosomal STRs)
Butler, J.M. (2005) Forensic DNA Typing, 2nd Edition, Box 23.1, p. 534
Mitochondrial DNA
[Figure: pedigree of 18 numbered individuals, each labeled with an mtDNA haplotype group]
MtDNA Haplotype Groups:
A: 1
B: 2, 3, 6, 8, 11, 13, 15, 16
C: 4, 9, 10
D: 5
E: 7
F: 12
G: 14, 17, 18
Figure 10.2, J.M. Butler (2005) Forensic DNA Typing, 2nd Edition © 2005 Elsevier Academic Press
Where are we headed Future Challenges
• Portable all in one devices
• Obtaining more information from a DNA sample
Portable all in one devices
• Device that can be operated by non lab trained personnel • Patrol officer • Crime Scene • Border/Customs
[Figure: integrated microfluidic device with Plexiglas interface. Fabrication: 3 etch steps, 20 drilled holes, 1 glass-glass bonding, 1 glass-PDMS bonding. Functional domains: extract (DNA purification, SPE inlet), amplify (DNA amplification), injection, separate (separation/detection). Timing note: +5 min for fluidic pumping, etc.]
*Easley et al. A fully integrated microfluidic genetic analysis system with sample-in-answer-out capability. PNAS. 103 (51), 2006.
DEVELOPMENT OF A SNP ASSAY PANEL FOR ANCESTRAL ORIGIN INFERENCE AND
INDIVIDUAL SOMATIC TRAITS
FY2009 – NIJ FORENSIC DNA R&D JANUARY 1ST 2010
NIJ grantee meeting - Chicago February 22nd 2011
Dr. Daniele Podini, Assistant Professor of Forensic Molecular Biology and Biological Sciences
Research focus:
Investigational Tool
• Ancestry and somatic traits inference
• Use technology currently available in forensic DNA laboratories
• Could be implemented in a kit form to be used on casework as needed
Ancestry and Phenotype Inference
Investigational tool that will NOT directly identify a single suspected individual but rather will:
• Help prioritize suspect processing
• Corroborate witness testimony
• Help determine the relevance of a piece of evidence to a crime
The Project:
• Sample Collection
• Identify DNA markers for ancestry and somatic traits inference
• Develop and optimize assays to genotype the selected markers
• Perform statistical analyses with the data set to determine the optimal panel(s) of markers recommended for use in the crime laboratory
• Develop final panels for casework application
• Disseminate results
Sample Collection
To date ~ 250 – Target: up to 400
Types of Genetic Variations
• Indels: small insertions/deletions
  CTT------GATC
  CTTACGGATC
• Small variable repeats (microsatellites)
  ACGACGACGACGACGACG (6 copies)
  ACGACGACGACGACGACGACG (7 copies)
• Single Nucleotide Polymorphisms (SNPs): single base pair changes
  GTCATTCGATT
  GTCAGTCGATT
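As a small illustration of what "single base pair change" means, the sketch below compares the two SNP example sequences from the slide position by position. The function name is illustrative, and the comparison assumes the two sequences are already aligned and of equal length (so it does not handle indels).

```python
def snp_positions(reference: str, variant: str) -> list:
    """List (index, reference_base, variant_base) for each mismatch.

    Assumes pre-aligned sequences of equal length; positions are 0-based.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, ref_base, var_base)
        for i, (ref_base, var_base) in enumerate(zip(reference, variant))
        if ref_base != var_base
    ]


differences = snp_positions("GTCATTCGATT", "GTCAGTCGATT")
print(differences)
```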
http://www.science.marshall.edu/murraye/341/snps/Human%20Genetics%20MTHFR%20SNP%20Page.html
Ancestry Informative Single Nucleotide Polymorphisms
AISNPs
Single Nucleotide Polymorphisms (SNPs) that collectively give a high probability of an individual’s ancestry being from one part of the world or being derived from two or more areas of the world.
K.Kidd’s classification
Phenotype Informative Single Nucleotide Polymorphisms
PISNPs
SNPs that provide a high probability that the individual has particular phenotypes, such as a particular skin color, hair color, eye color, etc.
K.Kidd’s classification
Ancestry and Phenotype Inference
SNP selection
– GWAS
– Anthropology
– Human Pigmentation Studies
  • Skin • Eye • Hair
– Other
R. A. Sturm et al., American Journal of Human Genetics 82, 424 (2008).
GWAS
• Genome-wide association study: examination of variation across a genome to identify genetic associations with observable traits and/or specific populations
• Can assay 1 million SNPs at one time (Affymetrix 6.0 array)
• Lots of studies in the last 5 yrs
[Figure: bar chart, Number of GWAS in NHGRI Database from 2005 through December 2010; x-axis years 2005 to 2010, y-axis 0 to 350]
SNPs with disease associations
[Figure: Published Genome-Wide Associations through 12/2009, 658 published GWA at p<5x10^-8. NHGRI GWA Catalog, www.genome.gov/GWAStudies]
[Figure: Published Genome-Wide Associations through 6/2010, 904 published GWA at p<5x10^-8 for 165 traits]
Example: Duffy null allele
Duffy blood group identifies phenotypes associated with two proteins that appear on the outside of red blood cells as a receptor
http://science.uwe.ac.uk/StaffPages/na/duffy_4.gif http://www.fi.edu/learn/heart/blood/images/red-blood-cells.jpg
Duffy blood group
• Plays an important role in susceptibility to malaria infection (P. vivax).
• The Fy(a-b-) phenotype (rs2814778), i.e., the receptor not being expressed, represents an adaptation to living in malaria-endemic regions.
• This is a predominant feature in African populations, especially those from West Africa.
Pigmentation
Genes: SLC24A5, ASIP, MC1R, SLC45A2, OCA2, HERC2
G. Tully, Forensic Science International: Genetics 1, 105 (2007).
http://voiland.org/blog/wp-content/uploads/2008/02/blue.jpg
http://www.research.uky.edu/odyssey/summer08/tan.html
Skin Pigmentation – MC1R
• In modern humans:
– Less skin pigmentation = high/varying polymorphism at MC1R
– More skin pigmentation = low polymorphism at MC1R
• Indicates that selective pressure over time favored MC1R mutations for less skin pigmentation, while the need for more skin pigmentation limited variation in Africa
• Proposed hypotheses regarding the advantages/disadvantages of pigmentation
Effects of UV Radiation on Skin
Image from Jablonski 2004
SNPs
• An initial battery of 105 SNPs has been selected for screening:
– 46 Ancestry Informative SNPs
– 59 Phenotype Informative SNPs
Modified from http://snpsfinder.lanl.gov/
Single Base Extension
CTCAATGTGTAAGTTTTATTCACCTGCTAAGAAATTATTTTTTCAAAGCTAGCCTCAATATTTATTTTAAATGAGTGAACTTCAAGGCCTGAAAGAATAAACTAATACTTTACGAAATATTTTTGAAGTATAAAGAATATATTCAACATCTTTCCATGTCTCCAGATTTTAATATATGCCTTATTTTACTTTAAAAATTTTCAAATGTTTCTTTTATACACAATATGTTTCTTAGTCTGAATAACCTTTTCCTCTGCAGTATTTTTGAGCAGTGGCTCCRAAGGCACCGTCCTCTTCAAGAAGTTTATCCAGAAGCCAATGCACCCATTGGACATAACCGGGAATCCTACATGGTTCCTTTTATACCACTGTACAGAAATGGTGATTTCTTTATTTCATCCAAAGATCTGGGCTATGACTATAGCTATCTACAAGATTCAGGTAAAGTTTACTTTCTTTCAGAGGAATTGCTGAATCTAGTGTTACCAATTTATTTTGAGATAACACAAAACTTTATGCTTCGACAATGTTATTCCTGAACACTTTAAATCCTGAAAGTGCATTATAATCCTTAATTTAT
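In single base extension, a primer anneals immediately next to the SNP and is extended by one labeled ddNTP, so the incorporated base reports the allele. The sketch below locates the IUPAC ambiguity code (R = A or G) in a short excerpt of the sequence above and takes the bases just upstream as a primer. It is a toy illustration: the 8-base primer length is an assumption chosen to fit the excerpt (real primers are longer and are designed around melting temperature and multiplex compatibility), and the function names are hypothetical.

```python
# IUPAC two-base ambiguity codes.
IUPAC = {
    "R": ("A", "G"), "Y": ("C", "T"), "S": ("C", "G"),
    "W": ("A", "T"), "K": ("G", "T"), "M": ("A", "C"),
}


def find_snp(sequence: str) -> int:
    """Return the 0-based index of the first ambiguity code in the sequence."""
    for index, base in enumerate(sequence):
        if base in IUPAC:
            return index
    raise ValueError("no ambiguity code found")


def sbe_primer(sequence: str, snp_index: int, length: int) -> str:
    """Bases immediately upstream of the SNP; the 3' end abuts the SNP site."""
    if snp_index < length:
        raise ValueError("not enough upstream sequence for this primer length")
    return sequence[snp_index - length:snp_index]


# Excerpt around the R site in the sequence shown on the slide.
excerpt = "GTGGCTCCRAAGGCACCG"
snp_index = find_snp(excerpt)
primer = sbe_primer(excerpt, snp_index, length=8)   # toy length (assumption)
alleles = IUPAC[excerpt[snp_index]]                 # ddNTP added for each allele

print(snp_index, primer, alleles)
```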
Single Base Extension
13-Plex
Single Base Extension Summary
Advantages
• Target small amplicons
• Easily multiplexed
• Increased sensitivity
• Equipment already present in Forensic DNA Labs
Disadvantages
• Assay design is tedious
Final SNP panel selection
Phenotype: eye color, HERC2 SNP rs12913832
174 Subjects
[Figure: pie charts of eye color by genotype]
• Homozygous G (69, ~40%): clear colored eyes. Categories shown: Dark Blue, Blue/Green, Dark Green, Hazel, Light Green, Grey, Light Blue
• Homozygous A (56, ~32%): brown eyes. Categories shown: Black/Very Dark Brown, Dark Brown, Light Brown
• Heterozygous G/A (49, ~28%): present both phenotypes. Categories shown: Dark Brown, Light Brown, Hazel, Dark Green, Blue/Green, Light Green, Grey
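The percentages follow directly from the genotype counts, and the same counts give the allele frequencies in this sample. A brief sketch using the counts reported above (variable names are illustrative):

```python
# Genotype counts at rs12913832 reported for the 174 subjects.
genotype_counts = {"G/G": 69, "A/A": 56, "A/G": 49}

total_subjects = sum(genotype_counts.values())
genotype_percent = {
    genotype: round(100 * count / total_subjects)
    for genotype, count in genotype_counts.items()
}

# Each homozygote carries two copies of its allele, each heterozygote one.
g_allele_frequency = (2 * genotype_counts["G/G"] + genotype_counts["A/G"]) / (2 * total_subjects)
a_allele_frequency = 1 - g_allele_frequency

print(total_subjects, genotype_percent)
print(round(g_allele_frequency, 3), round(a_allele_frequency, 3))
```

These are sample frequencies for the collected subjects only; they are not population estimates.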
Final SNP panel selection
Ancestry: Duffy, rs2814778, FY (A-B-)
174 Subjects
[Figure: pie charts of ethnicity by genotype]
• C/C homozygotes: 91% were African American or African. Categories shown: African, African American, Asian, European
• C/T heterozygotes: 83% were African American. Categories shown: African American, Asian, European, Other
• T/T homozygotes: 4% were African American. Categories shown: African American, Asian, European, Hispanic, Other
More work
• Complete optimization of SBE assay
• Complete sample collection and analysis
• Continue data analysis exploring different statistical approaches
• Complete final SNP panel selection
• Final panel development
• Optimization on forensic samples
Next Generation Sequencing Technologies
Emulsion PCR
• Fragments, with adaptors, are PCR amplified within a water drop in oil.
• One primer is attached to the surface of a bead.
• Used by 454, Polonator and SOLiD.
Bridge PCR
• DNA fragments are flanked with adaptors.
• A flat surface is coated with two types of primers, corresponding to the adaptors.
• Amplification proceeds in cycles, with one end of each bridge tethered to the surface.
• Used by Solexa.
[Figure: ABI SOLiD]
NGS methods
• Sequencing millions of bases • Low cost per base • Entire genome can be done in a few days • Challenges in data management • Future applications
• At birth sequence entire genome • Personalized medicine • Non human genetics • Forensic Sciences
THANK YOU
QUESTIONS?
Acknowledgments: Katherine Butler, MS; Michelle Peck, BS; Jessica Hart, BS; Dr Moses Schanfield; Dr Pete Vallone (NIST); Dr Mike Coble (NIST); Dr James Landers (UVA)
Contact info: [email protected], Tel. 202-242-5766