
Post on 26-Mar-2015


Statistical Weights of DNA Profiles

Forensic Bioinformatics (www.bioforensics.com)

Dan E. Krane, Wright State University, Dayton, OH

DNA statistics

• Coincidental 10-locus DNA profile matches are very rare

• Several factors can make statistics less impressive

– Mixtures

– Incomplete information

– Relatives

– Database searches

DNA profile

Comparing electropherograms

Evidence sample Suspect #1’s reference

EXCLUDE

Comparing electropherograms

Evidence sample Suspect #2’s reference

CANNOT EXCLUDE

What weight should be given to DNA evidence?

Statistics do not lie.

But, you have to pay close attention to the questions they are addressing.


What is the chance that a randomly chosen, unrelated individual from a given population would have the same DNA profile observed in a sample?

Single source statistics:

Random Match Probability (RMP) or “Random Man Not Excluded” (RMNE)

Single source samples

Formulae for RMNE:

At a locus:

– Heterozygotes: $2pq$

– Homozygotes: $p^2$

Multiply across all loci

Statistical estimates: the product rule

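[Figure: each locus in the profile is labeled $2pq$ (heterozygous) or $p^2$ (homozygous), and the per-locus terms are multiplied together. Worked example for one heterozygous locus: $0.1454 \times 0.1097 \times 2$]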
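The per-locus formulae above can be sketched in a few lines of Python. This is only an illustration, assuming the simple $2pq$ / $p^2$ expectations shown on the slides with no further corrections; the function name `genotype_frequency` is hypothetical, and the allele frequencies 0.1454 and 0.1097 are the ones in the worked example.

```python
def genotype_frequency(p, q=None):
    """Expected genotype frequency at a single locus.

    Heterozygote (alleles with frequencies p and q): 2pq.
    Homozygote (q omitted, one allele with frequency p): p squared.
    """
    if q is None:
        return p * p
    return 2 * p * q


# Heterozygous locus from the worked example on the slide.
het = genotype_frequency(0.1454, 0.1097)
print(f"heterozygote frequency: {het:.3f}")  # prints 0.032
```

Leaving out the second argument treats the locus as homozygous, so `genotype_frequency(0.5)` gives 0.25.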

Statistical estimate: Single source sample

3.2% × 6.0% × 4.6% × 1.2% × 9.8% × 9.5% × 6.3% × 2.2% × 1.0% × 2.9% × 5.1% × 29.9% × 4.0% × 1.1% × 6.6%

Statistical estimate: Single source sample

1 in 608,961,665,956,361,000,000

1 in 608 quintillion (“less than one in one billion”)

$0.1454 \times 0.1097 \times 2 \approx 0.032$
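The product rule itself is a single multiplication across loci. The sketch below uses the fifteen per-locus percentages shown on the single source slide; because those are rounded to one decimal place, the result (roughly 1 in 5.7 × 10^20) lands near, but not exactly on, the slide's 1 in 608 quintillion. The variable names are illustrative.

```python
import math

# Per-locus genotype frequencies as displayed on the slide (rounded percentages).
locus_percentages = [
    3.2, 6.0, 4.6, 1.2,
    9.8, 9.5, 6.3, 2.2, 1.0,
    2.9, 5.1, 29.9, 4.0,
    1.1, 6.6,
]

# Product rule: multiply the per-locus frequencies together.
rmp = math.prod(pct / 100 for pct in locus_percentages)

# Express the result in the "1 in N" form used on the slides.
one_in = 1 / rmp
print(f"1 in {one_in:.3g}")
```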


Mixture statistics:

Combined Probability of Inclusion (CPI) or Likelihood Ratios (LR)

Mixed DNA samples

Put two people’s names into a mixture.

How many names can you take out?


How many contributors to a mixture?

Maximum # of alleles observed
in a 3-person mixture            # of occurrences    Percent of cases
2                                0                   0.00
3                                78                  0.00
4                                4,967,034           3.39
5                                93,037,010          63.49
6                                48,532,037          33.12

There are 146,536,159 possible different 3-person mixtures of the 959 individuals in the FBI database (Paoletti et al., November 2005 JFS).

How many contributors to a mixture if analysts can discard a locus?

Maximum # of alleles observed
in a 3-person mixture            # of occurrences    Percent of cases
3                                3,398               0.00
4                                7,274,823           4.96
5                                112,469,398         76.75
6                                26,788,540          18.28

How many contributors to a mixture?

Maximum # of alleles observed
in a 4-person mixture            # of occurrences    Percent of cases
4                                13,480              0.02
5                                8,596,320           15.03
6                                35,068,040          61.30
7                                12,637,101          22.09
8                                896,435             1.57

There are 57,211,376 possible different 4-way mixtures of the 194 individuals in the FBI Caucasian database (Paoletti et al., November 2005 JFS). (35,022,142,001 4-person mixtures with 959 individuals.)
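The mixture counts quoted on these slides are binomial coefficients: the number of ways to choose 3 (or 4) distinct individuals from the database. A short check, with illustrative variable names, also confirms that the occurrence counts in the 3-person table add up to that total.

```python
from math import comb

# Number of distinct k-person mixtures that can be formed from n profiles.
three_person_all = comb(959, 3)        # 959 individuals, 3-person mixtures
four_person_caucasian = comb(194, 4)   # 194 individuals, 4-person mixtures
four_person_all = comb(959, 4)         # 959 individuals, 4-person mixtures

# Occurrence counts from the 3-person table, keyed by maximum allele count.
three_person_counts = {2: 0, 3: 78, 4: 4_967_034, 5: 93_037_010, 6: 48_532_037}

# Share of 3-person mixtures showing at most 4 alleles at every locus,
# i.e. those that could be mistaken for a 2-person mixture.
pct_max_four = 100 * three_person_counts[4] / three_person_all

print(three_person_all, four_person_caucasian, four_person_all)
print(f"{pct_max_four:.2f}% of 3-person mixtures show a maximum of 4 alleles")
```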


CPI Stats

• Probability that a random, unrelated person could be included as a possible contributor to a mixed profile

• For a mixed profile with the alleles 14, 16, 17 and 18, contributors could have any of 10 genotypes:

14, 14   14, 16   14, 17   14, 18   16, 16   16, 17   16, 18   17, 17   17, 18   18, 18

Probability works out as:

$\mathrm{CPI} = (p_{14} + p_{16} + p_{17} + p_{18})^2$

$(0.102 + 0.202 + 0.263 + 0.222)^2 \approx 0.621$
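To see why the squared sum works, the sketch below enumerates the ten genotypes and adds up their expected frequencies ($p^2$ for homozygotes, $2pq$ for heterozygotes); the total equals the square of the summed allele frequencies. The names are illustrative. With the frequencies as displayed, both routes give about 0.6225; the slide's 0.621 may reflect allele frequencies before rounding.

```python
from itertools import combinations_with_replacement

# Allele frequencies for the four alleles seen in the mixed profile.
allele_freqs = {14: 0.102, 16: 0.202, 17: 0.263, 18: 0.222}

# Every genotype a possible contributor could have: unordered pairs of alleles.
genotypes = list(combinations_with_replacement(sorted(allele_freqs), 2))


def genotype_frequency(a, b):
    """p^2 for a homozygote, 2pq for a heterozygote."""
    p, q = allele_freqs[a], allele_freqs[b]
    return p * p if a == b else 2 * p * q


# Route 1: add up the frequencies of all included genotypes.
inclusion_by_genotypes = sum(genotype_frequency(a, b) for a, b in genotypes)

# Route 2: the shortcut from the slide, the squared sum of allele frequencies.
inclusion_by_formula = sum(allele_freqs.values()) ** 2

print(len(genotypes), round(inclusion_by_genotypes, 4), round(inclusion_by_formula, 4))
```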

Combined Probability of Inclusion

62.1% × 91.5% × 23.5% × 19.2% × 40.7% × 47.6% × 99.0% × 54.4% × 61.2% × 8.4% × 91.6% × 63.7% × 8.8% × 82.9% × 31.1%

CPI Stats

1 in 1.3 million


Mixtures with drop out

The testing lab’s conclusions

Ignoring loci with “missing” alleles

• Labs often claim that this is a “conservative” statistic

• Ignores potentially exculpatory information

• “It fails to acknowledge that choosing the omitted loci is suspect-centric and therefore prejudicial against the suspect.”

– Gill, et al. “DNA commission of the International Society of Forensic Genetics: Recommendations on the interpretation of mixtures.” FSI. 2006.

Likelihood approaches for mixtures where allelic drop out may have occurred

• Determining the rate of allelic drop-out is problematic

• Determining the rate of allelic drop-in is problematic

• Considering more than two possible contributors is computationally intensive

• Considering mixtures of different racial groups can be computationally intensive

• Contributions from different kinds of close relatives require special considerations

How many names can you take out if you can use blanks?


The more blanks the harder it is to eliminate anyone’s name as possibly being in the mix.
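The names analogy can be played out in code. The sketch below is a toy model, not a forensic method: letters stand in for alleles, a name "can be taken out" when every distinct letter it needs is present in the mixture, and each blank forgives one missing letter (a stand-in for allelic drop-out). The two mixed names, the candidate list, and the function name are all hypothetical.

```python
def can_be_taken_out(name, mixture_letters, blanks=0):
    """True if the name cannot be excluded from the mixture.

    Letters are treated like alleles: only their presence matters, not how
    often they occur. Each blank excuses one letter missing from the mixture.
    """
    missing = set(name.lower()) - mixture_letters
    return len(missing) <= blanks


# Hypothetical mixture made from two names.
mixture_letters = set("dan") | set("kim")

# Hypothetical pool of names to test against the mixture.
candidates = ["dan", "kim", "nina", "don", "bob", "lee"]

included = {
    blanks: [n for n in candidates if can_be_taken_out(n, mixture_letters, blanks)]
    for blanks in (0, 1, 2)
}

for blanks, names in included.items():
    print(f"{blanks} blank(s): {len(names)} of {len(candidates)} names included")
```

With no blanks, three of the six names are included even though only two went in; with two blanks, none of the six can be eliminated.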


The alternative suspect pool

Which allele frequency database should be used?

• Random match probabilities are typically generated for each of three major racial groups

• Literally hundreds of alternative allele frequency databases are available

• The racial background of a suspect is not relevant.

What is the relevant population?

A process of elimination

• Consider that a suspect matches an evidence sample

• If he is not the source of the DNA then it must be someone else’s. Whose might it be?

• Could the actual source be: Caucasian, Afro-Caribbean, or Indo-Pakistan?

• If it cannot be and there is no one else in the alternative suspect pool then the suspect must be the source.

A suspect pool

D matches.

It means something if we find that A, B and C are all unlikely to also match.

[Figure: a suspect pool of four individuals labeled A, B, C and D]

Database searches


Consider cold hits

UK’s National DNA Database (NDNAD)

Maintained by the Home Office

Contains 6,929,946 arrested individuals as of 31 March, 2012

Assisted in 409,715 investigations (2,595 murders)

In which case is the DNA evidence most damning?

• Probable Cause Case

– Suspect is first identified by non-DNA evidence

– DNA evidence is used to corroborate traditional police investigation

– RMNE = 1 in 10 million

• Cold Hit Case

– Suspect is first identified by search of DNA database

– Traditional police work is no longer the focus

– RMNE = 1 in 10 million

– DMP = 0.693 in 1
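The slides do not spell out how the DMP figure is obtained. One reading that reproduces it, offered here as an assumption, is a database match probability equal to the number of profiles searched multiplied by the RMNE, using the NDNAD size quoted earlier. The sketch also computes the chance of at least one coincidental match among that many unrelated profiles, which is a related but different quantity; all variable names are illustrative.

```python
database_size = 6_929_946      # NDNAD profiles as of 31 March 2012 (from the slide)
rmne = 1 / 10_000_000          # RMNE of 1 in 10 million (from the slide)

# Assumed definition: database match probability = N * p.
dmp = database_size * rmne

# Related quantity: probability that at least one of N unrelated profiles
# matches by coincidence, 1 - (1 - p)^N.
p_at_least_one = 1 - (1 - rmne) ** database_size

print(f"DMP = {dmp:.3f} in 1")
print(f"P(at least one coincidental match) = {p_at_least_one:.3f}")
```

Under either calculation, the same 1 in 10 million RMNE carries far less weight once nearly seven million profiles have been searched.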


Familial searches

• Database search yields a close but imperfect DNA match

• Can suggest a relative is the true perpetrator

• UK performs them relatively rarely – a total of 29 were carried out in 2011-12

• Reluctance to perform them in US since 1992 NRC report

Is the true DNA match a relative or a random individual?

• Given a closely matching profile, who is more likely to match, a relative or a randomly chosen, unrelated individual?

• Use a likelihood ratio

$LR = \dfrac{P(E \mid \text{relative})}{P(E \mid \text{random})}$

Is the true DNA match a relative or a random individual?

• This question is ultimately governed by two considerations:

– What is the size of the alternative suspect pool?

– What is an acceptable rate of false positives?

$LR = \dfrac{P(E \mid \text{relative})}{P(E \mid \text{random})}$
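As a rough illustration of the likelihood ratio, the sketch below simplifies heavily: it takes the evidence E to be an exact genotype match at a single locus, takes the relative to be a full sibling, and uses the standard sibling match probabilities, $(1 + p + q + 2pq)/4$ for a heterozygote and $(1 + p)^2/4$ for a homozygote. None of these choices comes from the slides, and a real familial search works with partial matches across many loci. The allele frequencies are reused from the earlier single source example purely for illustration, and the function names are hypothetical.

```python
def p_match_unrelated(p, q=None):
    """P(E | random): chance an unrelated person has the same genotype."""
    return p * p if q is None else 2 * p * q


def p_match_full_sibling(p, q=None):
    """P(E | relative) for a full sibling (assumed relationship).

    Homozygote:   (1 + p)^2 / 4
    Heterozygote: (1 + p + q + 2pq) / 4
    """
    if q is None:
        return (1 + p) ** 2 / 4
    return (1 + p + q + 2 * p * q) / 4


# Heterozygous locus with the allele frequencies used earlier in the deck.
p, q = 0.1454, 0.1097
lr = p_match_full_sibling(p, q) / p_match_unrelated(p, q)
print(f"LR (full sibling vs. unrelated) at this locus: {lr:.1f}")
```

An LR above 1 means the match is better explained by a sibling than by an unrelated person; how large it must be before acting on it depends on the two considerations listed above, the size of the alternative suspect pool and the tolerated false positive rate.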


Additional (free) resources

Forensic Bioinformatics (www.bioforensics.com)

GenoStat® (http://www.bioforensics.com/genostat/index.html)

Eight 50-minute YouTube videos (http://www.bioforensics.com/video/index.html)