taming uncertainty in forensic dna evidence · dna samples. journal of forensic sciences, 2001....

13
Cybergenetics © 2007-2011 1 Taming Uncertainty in Forensic DNA Evidence ENFSI Meeting ENFSI Meeting April, 2011 April, 2011 Brussels, Belgium Brussels, Belgium Mark W Mark W Perlin Perlin, PhD, MD, PhD , PhD, MD, PhD Cybergenetics, Pittsburgh, PA Cybergenetics, Pittsburgh, PA Cybergenetics © 2003-2011 Cybergenetics © 2003-2011 Uncertainty people (and the law) want people (and the law) want certainty certainty scientific data are scientific data are uncertain uncertain probability probability describes uncertainty describes uncertainty information information is is a change in probability a change in probability DNA identification is an information science DNA identification is an information science goal is goal is maximal objective information maximal objective information Genotype ? Prior probability Prior probability 10, 10 10, 11 10, 12 10, 13 11, 11 11, 12 12, 13 13, 13 unknown person population

Upload: others

Post on 22-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 1

Taming Uncertainty inForensic DNA Evidence

ENFSI MeetingENFSI MeetingApril, 2011April, 2011

Brussels, BelgiumBrussels, Belgium

Mark W Mark W PerlinPerlin, PhD, MD, PhD, PhD, MD, PhDCybergenetics, Pittsburgh, PACybergenetics, Pittsburgh, PA

Cybergenetics © 2003-2011Cybergenetics © 2003-2011

Uncertainty

•• people (and the law) want people (and the law) want certaintycertainty•• scientific data are scientific data are uncertainuncertain

•• probabilityprobability describes uncertainty describes uncertainty•• informationinformation is is a change in probabilitya change in probability

•• DNA identification is an information science DNA identification is an information science•• goal is goal is maximal objective informationmaximal objective information

Genotype

?

Prior probabilityPrior probability

10, 1010, 1110, 1210, 1311, 1111, 1212, 1313, 13 …

unknown personpopulation

Page 2: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 2

Genotype

?

Likelihood functionLikelihood function

10, 1010, 1110, 1210, 13biological father

child

mother

Genotype

?10, 1010, 1110, 1210, 13

Posterior probabilityPosterior probability

Objective Bayesian inference posterior = prior x likelihood

biological father

child

mother

Match

?10, 1010, 1110, 1210, 13 Pr(matching genotype)

10, 11

biological father

child

mother

putative father

50%

Page 3: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 3

Match

?10, 1010, 1110, 1210, 13

10, 1010, 1110, 1210, 1311, 1111, 1212, 1313, 13 …

Pr(matching genotype)

Pr(cooincidental genotype)

10, 11

50%

5%

biological father

child

mother

putative father

population

InformationPr(matching genotype)

Pr(cooincidental genotype)LR =

Likelihood ratioLikelihood ratio

log(LR) is a standard measure of information

50%5%

= 10=

log10(10) = 1 information unit

MW Perlin. Explaining the likelihood ratio in DNA mixture interpretation. Proceedingsof Promega's 21st International Symposium on Human Identification, 2010.

Kinship

?10, 1010, 1110, 1210, 13

10, 1010, 1110, 1210, 1311, 1111, 1212, 1313, 13 …

Pr(matching genotype)

Pr(cooincidental genotype)

10, 11

missing person

son

spouse

daughter

father

biological remains

90%

5%population

Page 4: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 4

Phenotype

+

evidence

reality

PhenotypeLab

+

10 11 12

evidence STR data

reality observation

Phenotype

evidence STR data genotype

10 11 12

10, 10 @ 30%10, 11 @ 50%10, 12 @ 20%

Lab Infer

+

reality observation model

Page 5: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 5

Phenotype

?10, 1010, 1110, 1210, 13

10, 1010, 1110, 1210, 1311, 1111, 1212, 1313, 13 …

Pr(matching genotype)

Pr(cooincidental genotype)

10, 11

50%

5%

evidence

suspect

population

10 11 12

Computer Interpretation

• quantitative computer interpretation• statistical search of probability model• preserve all identification information• objectively infer genotype, then match

• any number of mixture contributors• stutter, imbalance, degraded DNA• calculate uncertainty of every peak

Quantitative Data

Page 6: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 6

Quantitative Interpretation

Perlin MW, Szabady B. Linear mixture analysis: a mathematical approach to resolving mixedDNA samples. Journal of Forensic Sciences, 2001.

Calculate Peak Uncertainty

MW Perlin, A Sinelnikov. An information gap in DNA evidence interpretation. PLoS ONE, 2009.

Infer Accurate Genotype

MW Perlin, MM Legler, CE Spencer, JL Smith, WP Allan, JL Belrose, BW Duceman.Validating TrueAllele DNA mixture interpretation. Journal of Forensic Sciences, 2011.

Page 7: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 7

Qualitative Thresholds

Over threshold,peaks are treatedas allele events.

Under threshold,alleles do not exist.

list ofincluded alleles

Qualitative Method VariationNational Institute of Standards and Technology

Two Contributor Mixture Data, Known Victim

31 thousand (4)

213 trillion (14)

Fingernail: 7% MixtureCommonwealth v. Foley

Score Method13 thousand inclusion

23 million use victim189 billion quantitative

• probability modeling preserves information• peak threshold discards information

The DNA Investigator™ Newsletter, 2009Same Data, More Information – Murder, Match and DNA

Page 8: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 8

More Data, More Information

• low template mixture• three DNA contributors• triplicate amplification• post-PCR enhancement

vWA locus data

• no match score found• computer interpretation• joint likelihood function

Information

Gen

oype

pro

babi

lility

(vW

A)

Assume theta = 1%, and three contributors. A match between the suspect and the evidenceis 3,620,000 times more probable than coincidence.

6x

More Data, More Information

33

4455

66

Perlin MW, Greenhalgh M. Scientific combination of DNA evidence: a handgun mixture ineight parts. Twentieth International Symposium on the Forensic Sciences of the Australian

and New Zealand Forensic Science Society, Sydney, Australia. 2010.

Page 9: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 9

Data: Locus D18

33

44

55

66

11

22

44

88

Preserve vs. Discard

MW Perlin, A Sinelnikov. An information gap in DNA evidence interpretation. PLoS ONE, 2009.

Page 10: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 10

MW Perlin, MM Legler, CE Spencer, JL Smith, WP Allan, JL Belrose, BW Duceman.Validating TrueAllele DNA mixture interpretation. Journal of Forensic Sciences, 2011.

7.03 7.03

6.24 6.2413.2613.26

Quantitative InterpretationQuantitative Interpretation

Inclusion methodInclusion method

Preserve vs. Discard

Validated Reproducibility

0

5

10

15

20

25

2A 2C 2H 2D 2B 2F 2G 2E

log(L

R)

Rep 1

Rep 2

0.1750.175

MW Perlin, MM Legler, CE Spencer, JL Smith, WP Allan, JL Belrose, BW Duceman.Validating TrueAllele DNA mixture interpretation. Journal of Forensic Sciences, 2011.

Preserve vs. Discard

• quantitative interpretation preserves information - every time• peak threshold discards information - 70% of the time

Perlin MW, Duceman BW. Profiles in productivity: greater yield at lower cost with computerDNA interpretation. Twentieth International Symposium on the Forensic Sciences of the

Australian and New Zealand Forensic Science Society, Sydney, Australia. 2010.

Page 11: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 11

Data Peak Height is a RandomVariable with Mean and Variance

Accurate vs. DispersedGenotype Probability

Missed Allele Error > 100%

Threshold = 200 rfu

0.00

0.25

0.50

0.75

1.00

1.25

1.50

1.75

50:50 30:70 10:90

DNA mixture ratio

Fals

e n

eg

ati

ve r

ate

(all

ele

s/

locu

s)

1000 pg

500 pg

250 pg

125 pg

Perlin MW. Reliable interpretation of stochastic DNA evidence.Canadian Society of Forensic Sciences 57th Annual Meeting; Toronto, ON. 2010.

Page 12: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 12

Investigative DNA Database10, 1010, 1110, 1210, 13

evidence genotypes suspect genotypes

LR

• genotype probability representation (not alleles)• fully preserves DNA identification information• enables LR calculation with every match

10, 1010, 1110, 1210, 13

• connect crimes to criminals• disaster victim identification (WTC)• find missing people• automatic familial search• combat terrorism through DNA

MW Perlin, JB Kadane, RW Cotton. Match likelihood ratio for uncertain genotypes.Law, Probability and Risk, 2009.

Societal Consequences• much informative DNA evidence goes unused• lost in "interim" qualitative interpretation methods, thresholds applied for analyst convenience• DNA databases (e.g., CODIS) lose information, using inclusion-based "alleles", instead of ISFG's preferred LR-based "genotype" probability

Inaccurate, uninformative DNA interpretation:• DNA analysts lose genotypes • prosecutors lose criminal cases• databases lose investigative leads • innocent law abiding citizens lose lives

The DNA Investigator™ Newsletter, 2011DNA Intelligence and Forensic Failure – What you don't know can kill you

International Consensus1. DNA data is continuous, and has random variation2. Thresholds do not work for low template DNA3. Mathematical models can account for random variation

4. The 21st century might be a good time to move awayfrom potentially biased human review of low level (oralmost any) DNA data to some sort of objectivecomputer interpretation that can infer genotypes up toprobability, without ever looking at suspects, that givessome (possibly uninformative) objective answer.

M Perlin; P Gill, J Buckleton, B Budowle, A van Daal. Low template DNA controversy.Twentieth International Symposium on the Forensic Sciences of the Australian and New

Zealand Forensic Science Society, Sydney, Australia. 2010.

Page 13: Taming Uncertainty in Forensic DNA Evidence · DNA samples. Journal of Forensic Sciences, 2001. Calculate Peak Uncertainty MW Perlin, A Sinelnikov. An information gap in DNA evidence

Cybergenetics © 2007-2011 13

Strengthening Forensic Science

http://www.cybgen.com/[email protected]

•• objective, reliable, consistent interpretation objective, reliable, consistent interpretation•• preserve all available information, preserve all available information, from the scene of crime tofrom the scene of crime to courtcourt

•• probabilistic genotypesprobabilistic genotypes•• modern statistics and computation modern statistics and computation•• solid mathematical foundation solid mathematical foundation

•• platinum standard for preserving DNA information platinum standard for preserving DNA information•• move beyond less informative interim methods move beyond less informative interim methods

Learning More

www.cybgen.com/information

• Newsletters gentle introduction to ideas• Courses for scientists and lawyers• Presentations handouts, movies, transcripts• Publications abstracts, manuscripts

The science of quantitative DNA mixture interpretation