dna mixture statistics cybergenetics © 2003-2013 2013 spring institute commonwealth's...

78
DNA Mixture Statistics Cybergenetics © 2003-2013 Cybergenetics © 2003-2013 2013 Spring Institute 2013 Spring Institute Commonwealth's Attorney's Services Council Commonwealth's Attorney's Services Council Richmond, Virginia Richmond, Virginia March, 2013 March, 2013 Mark W Perlin, PhD, MD, PhD Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Cybergenetics, Pittsburgh, PA

Upload: vivian-parks

Post on 30-Dec-2015

213 views

Category:

Documents


0 download

TRANSCRIPT

DNA Mixture Statistics

Cybergenetics © 2003-2013Cybergenetics © 2003-2013

2013 Spring Institute2013 Spring InstituteCommonwealth's Attorney's Services CouncilCommonwealth's Attorney's Services Council

Richmond, VirginiaRichmond, VirginiaMarch, 2013March, 2013

Mark W Perlin, PhD, MD, PhDMark W Perlin, PhD, MD, PhDCybergenetics, Pittsburgh, PACybergenetics, Pittsburgh, PA

Child molestation case

• June, 2011: Northern Virginia• daughter's birthday slumber party• 10 year old girls sleeping in basement

• object sexual penetration• aggravated sexual battery (2 counts)

Prosecutor: CDCA Nicole Wittmann

CHILD&

CHILD

CHILD&

CHILDCHILD

VICTIM

Television

Cabinet

Table

Table

L-Shaped Couch

Bathroom

Bed

room

Lau

ndry

Bed

room

StairsBookcase

VICTIM

VICTIM

Storage

Door to OutsideCloset

underpants

pajama pants

DNA mixture statistics

Human review (using thresholds)• underpants original = 10 million modified = 1 million• pajama pants original = 2 million modified = 4

Computer interpretation requested

Prosecutor question

What is the truematch information of the evidenceto the suspect?

Biology

1 trillion cells

Nucleuscellcell

nucleusnucleus

DNAcellcell

nucleusnucleus

chromosomeschromosomes

Locuscellcell

nucleusnucleus

chromosomeschromosomes

locus

Allelecellcell

nucleusnucleus

chromosomeschromosomes

locus

Short Tandem Repeat (STR)

alleles

cellcell

nucleusnucleus

chromosomeschromosomes

locus

Short Tandem Repeat (STR)

genotype7, 8

alleles

Genotype

IdentificationEvidence

item

Identification

10 12

LabEvidence item

Evidencedata

IdentificationEvidence

itemEvidence

dataEvidence genotype

10 12

10, 12

Lab Infer

IdentificationEvidence genotype

Known genotype

10 12

10, 12

10, 12

Lab Infer

Compare

Evidence item

Evidencedata

Identification

Known genotype

10 12

10, 12

10, 12

Lab Infer

Compare

Probability(identification)= Prob(suspect matches evidence) = 100%

Evidencedata

Evidence item

Evidence genotype

CoincidenceBiological population

CoincidenceBiological population

Allele frequency

data

10

Lab

11 12 13 14 15 16 17

CoincidenceBiological population

Allele frequency

data

Population genotype

10

10, 12 @ 5%

Lab Infer

11 12 13 14 15 16 17

Genotype product rule, combines allelesProb(10, 12) = 2 x p10 x p12

Prob(10, 10) = p10 x p10

CoincidenceBiological population

Allele frequency

data

Population genotype

Known genotype

10

10, 12 @ 5%

10, 12

Lab Infer

Compare11 12 13 14 15 16 17

CoincidenceBiological population

Allele frequency

data

Population genotype

Known genotype

10

10, 12 @ 5%

10, 12

Lab Infer

Compare11 12 13 14 15 16 17

Probability(coincidence) = Prob(coincidental match)= 5%

Identification information

before

data

(population)

after(evidence)

Evidence changes our beliefEvidence changes our belief

Prob(identification)

Prob(coincidence)

At the suspect's genotype,identification vs. coincidence?

Match statistic

before

data

(population)

after(evidence)

At the suspect's genotype,identification vs. coincidence?

Prob(evidence matches suspect)

Prob(coincidental match)

Perlin MW. Explaining the likelihood ratio in DNA mixture interpretation. Promega's Twenty First International Symposium on Human Identification, 2010; San Antonio, TX.

Match statistic

Prob(evidence matches suspect)

Prob(coincidental match)before

data

(population)

after(evidence)

20

=100%

5%

=

At the suspect's genotype,identification vs. coincidence?

Bayes theorem

Calculate probability

Belief in hypothesisafter having seen datais proportional to how well hypothesis explains the datatimes our initial belief.

All hypotheses must be considered. Need computers to do this properly.

Hypothesis: Defendant contributed to DNA evidence

Rev Bayes, 1763

Computers, 1985

Mixture interpretation variesNational Institute of Standards and Technology

Two Contributor Mixture Data, Known Victim

31 thousand (4)

213 trillion (14)

DNA mixture

+

Evidence item

Uncertainty

10 11 12

+

LabEvidence item

Evidencedata

Uncertainty

10 11 12

10, 12 @ 50%11, 12 @ 30%12, 12 @ 20%

+

Evidence genotype

Lab InferEvidence item

Evidencedata

Uncertainty

Known genotype

10 11 12

10, 12 @ 50%11, 12 @ 30%12, 12 @ 20%

10, 12

+

Compare

Evidence genotype

Lab InferEvidence item

Evidencedata

Uncertainty

10 11 12

10, 12 @ 50%11, 12 @ 30%12, 12 @ 20%

10, 12

+

Compare

Probability(identification) = Prob(suspect matches evidence) = 50%

Evidence genotype

Lab InferEvidence item

Evidencedata

Known genotype

Identification information

before

data

(population)

after(evidence)

Prob(identification)

Prob(coincidence)

At the suspect's genotype,identification vs. coincidence?

Less weight of evidence,Less weight of evidence,less change in our beliefless change in our belief

Numeratordecreases

Denominatorunchanged

Match statistic

before

data

(population)

after(evidence)

10

=50%

5%

=

At the suspect's genotype,identification vs. coincidence?

Prob(evidence matches suspect)

Prob(coincidental match)

TrueAllele operator

• Replicate computer runs for each item• 2 or 3 unknown mixture contributors• Victim genotype was considered

STR evidence data .fsa genetic analyzer files

Evidence genotypes probability distributions

DNA mixture dataQuantitative peak heights at a locus

peak size

peakheight

TrueAllele® Casework

ViewStationUser Client

DatabaseServer

Interpret/MatchExpansion

Visual User InterfaceVUIer™ Software

Parallel Processing Computers

Mixture weightSeparate mixture data into two contributor components

25% 75%

Genotype inferenceThorough: consider every possible genotype solutionObjective: does not know the comparison genotype

Explain thepeak pattern

Betterexplanationhas ahigher likelihood

Victim's allele pair

Another person's Another person's allele pairallele pair

Genotype inference

Explain thepeak pattern

Worseexplanationhas alower likelihood

Victim's allele pair

Another person's Another person's allele pairallele pair

Genotype separationmajor contributor

Genotype concordance

TrueAllele reportGenotype probability distributions

Evidence genotype Suspect genotype

Population genotype

Likelihood ratio (LR)DNA match statistic

Probability(evidence match)

Probability(coincidental match)

30x

3%

98%

DNA match statistic

Match statistic at 15 loci

TrueAllele DNA match

Black 36.6 quintillionCaucasian 20.7 quadrillionHispanic 212 quadrillion

Black 319 thousandCaucasian 3.86 thousandHispanic 32.9 thousand

LR match to Defendant

Underpants Pajama pants

Powers of Ten

-21 -18 -15 -12 -9 -6 -3 0 +3 +6 +9 +12 +15 +18 +21

logarithmic scale

thou

sand

milli

on

billio

n

trillio

n

quad

rillion

quint

illion

1 000 … 000

number of zeros

Trial preparation

• discuss case report• direct examination• curriculum vitae• PowerPoint slides• background reading• answer questions

Computer Interpretation of Quantitative DNA Evidence

Commonwealth v DefendantCommonwealth v DefendantApril, 2012April, 2012

Arlington, VirginiaArlington, Virginia

Mark W Perlin, PhD, MD, PhDMark W Perlin, PhD, MD, PhDCybergenetics, Pittsburgh, PACybergenetics, Pittsburgh, PA

Cybergenetics © 2003-2012Cybergenetics © 2003-2012

DNA genotype

8, 91 2 3 4 5 6 7 8

ACGT

1 2 3 4 5 6 7 8

A genetic locus has two DNA sentences,one from each parent.

9

locus

Many alleles allow formany many allele pairs. A person's genotype is relatively unique.

motherallele

fatherallele

repeated word

An allele is the numberof repeated words.

A genotype at a locusis a pair of alleles.

DNA evidence interpretationEvidence

itemEvidence

data

Lab Infer

10 11 12

Evidence genotype

Known genotype

10, 12 @ 50%11, 12 @ 30%12, 12 @ 20%

10, 12

Compare

Computers can use all the dataQuantitative peak heights at locus Penta E

peak size

peakheight

How the computer thinksConsider every possible genotype solution

Explain thepeak pattern

Betterexplanationhas ahigher likelihood

Victim's allele pair

Another person's Another person's allele pairallele pair

Evidence genotypeObjective genotype determined

solely from the DNA data. Never sees a suspect.

1%

98%

DNA match information

Probability(evidence match)

Probability(coincidental match)

How much more does the suspect match the evidencethan a random person?

30x

3%

98%

Match information at 15 loci

Is the suspect in the evidence?

A match between the underpants and Defendant is:

36.6 quintillion times more probable than a coincidental match to an unrelated Black person

20.7 quadrillion times more probable than a coincidental match to an unrelated Caucasian person

212 quadrillion times more probable than a coincidental match to an unrelated Hispanic person

Is the suspect in the evidence?

A match between the pajama pants and Defendant is:

319 thousand times more probable than a coincidental match to an unrelated Black person

3.86 thousand times more probable than a coincidental match to an unrelated Caucasian person

32.9 thousand times more probable than a coincidental match to an unrelated Hispanic person

Outcome

Guilty• object sexual penetration• two counts of aggravated sexual battery

Sentence• 22 years imprisonment

Court of Appeals• DNA chain of custody• appeal denied

TrueAllele mixture validation:Virginia case study

Mark W Perlin, PhD, MD, PhD Mark W Perlin, PhD, MD, PhD Kiersten Dormer, MS and Jennifer Hornyak, MSKiersten Dormer, MS and Jennifer Hornyak, MS

Cybergenetics, Pittsburgh, PACybergenetics, Pittsburgh, PA

Lisa Schiermeier-Wood, MS and Susan Greenspoon, PhDLisa Schiermeier-Wood, MS and Susan Greenspoon, PhDDepartment of Forensic Science, Richmond, VA Department of Forensic Science, Richmond, VA

Establish the reliability of TrueAllele mixture interpretation

Case composition

• 72 criminal cases• 92 evidence items • 111 genotype comparisons

Criminal offense• 18 homicide• 12 robbery • 6 sexual assault• 20 weapon

DNA mixture distribution

Data summary – “alleles”

Threshold

Over threshold, peaks are labeled as allele events

All-or-none allele peaks,each given equal status

Allele Pair7, 77, 107, 127, 14

10, 1010%10, 12

10, 1412, 1212, 1414, 14

CPI informationN

othi

ng r

epor

ted

256.70

2.26 CPI

Combined probability of inclusion

SWGDAM 2010 guidelines

Threshold

Under threshold, alleles less used

Allele Pair7, 77, 107, 127, 14

10, 100%10, 12

10, 1412, 1212, 1414, 14

Higher threshold for human review

Modified CPI information

25

56

2.12 6.70

1.75

2.26

Not

hing

rep

orte

d

CPImCPI

SWGDAM 2010 guidelines

3.2.2. If a stochastic threshold based on peak height is not used in the evaluation of DNA typing results, the laboratory must establish alternative criteria (e.g., quantitation values or use of a probabilistic genotype approach) for addressing potential stochastic amplification. The criteria must be supported by empirical data and internal validation and must be documented in the standard operating procedures.

Use TrueAllele® Casework for DNA mixture statistics

Validated genotyping method

Perlin MW, Sinelnikov A. An information gap in DNA evidence interpretation. PLoS ONE. 2009;4(12):e8327.

Perlin MW, Legler MM, Spencer CE, Smith JL, Allan WP, Belrose JL, Duceman BW. Validating TrueAllele® DNA mixture interpretation. Journal of Forensic Sciences. 2011;56(6):1430-47.

Perlin MW, Belrose JL, Duceman BW. New York State TrueAllele® Casework validation study. Journal of Forensic Sciences. 2013;58(6):in press.

TrueAllele reinterpretation

Virginia reevaluates DNA evidence in 375 casesJuly 16, 2011

“Mixture cases are their own little nightmare,” says William Vosburgh, director of the D.C. police’s crime

lab. “It gets really tricky in a hurry.”

“If you show 10 colleagues a mixture, you will probably end up with 10 different answers”

Dr. Peter Gill, Human Identification E-Symposium, 2005

Mixture weightSeparate mixture data into two contributor components

25% 75%

Genotype inferenceThorough: consider every possible genotype solutionObjective: does not know the comparison genotype

Explain thepeak pattern

Betterexplanationhas ahigher likelihood

Victim's allele pair

Another person's Another person's allele pairallele pair

Allele Pair7, 77, 107, 127, 14

10, 1098%10, 12

10, 1412, 1212, 1414, 14

TrueAllele sensitivity

9

25

56

2.12 6.70 10.93

5.52

1.75

2.26

Not

hing

rep

orte

d

CPImCPI

TrueAllele

TrueAllele specificityTrue exclusions, without false inclusions

– 19.69

TrueAllele reproducibility

log(LR1)

log(

LR2)

Concordance in two independent computer runs

standard deviation(within-group)

0.305

Validation results

A reliable method • sensitive • specific • reproducible

TrueAllele® Casework DNA mixture interpretation is:

TrueAllele computer genotyping is more effective than human review

TrueAllele Virginia outcomes144 cases analyzed

72 case reports – 10 trials

City Court Charge Sentence

Richmond Federal Weapon 50 years

Alexandria Federal Bank robbery 90 years

Quantico Military Rape 3 years

Chesapeake State Robbery 26 years

Arlington State Molestation 22 years

Richmond State Homicide 35 years

Fairfax State Abduction 33 years

Norfolk State Homicide 8 years

Charlottesville State Homicide 15 years

Hampton State Home invasion 5 years

TrueAllele in Virginia

• Department of Forensic Science has their own TrueAllele system• Training, validation, approvals• Services centralized in Richmond• DFS will provide DNA mixture statistics and court testimony

TrueAllele in the United States

Casework systemInterpretation services

More information

[email protected]

http://www.cybgen.com/information• Courses• Newsletters• Newsroom• Presentations• Publications