effect of additional loci on likelihood ratio values for ... · effect of additional loci on...

Post on 15-Aug-2020

4 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Effect of Additional Loci on Likelihood Ratio Values for Complex Kinship Analysis

Kristen Lewis O’Connor, Ph.D.

National Institute of Standards and Technology

21st International Symposium on Human Identification San Antonio, TexasOctober 12, 2010

Final version of this presentation available at: http://www.cstl.nist.gov/strbase/NISTpub.htm

Questions to Be Addressed

• How does kinship analysis relate to forensic DNA typing?

• Is there value in examining additional loci?

• What has NIST accomplished with kinship analysis?

• Where can one learn more about these topics?

What is our forensic core competency?

Laying the foundation for a discussion of kinship analysis

Forensic Core Competency

Standard STR Typing

13 core loci

Standard STR Typing for

Database Search13 core loci

1-to-1

1-to-many

Typ

e o

f Se

arch

Random Match Probabilitydetermines the rarity of the profile

Direct match

High certainty

Evidence profile8,14 - 10,13 - …

Profile 1: 9,13 - 10,11 - ..

Profile 2: 8,11 - 10,12 - ..

Profile 3: 10,11 - 11,12 - ..

Profile 4: 15,16 - 13,13 - ..

Profile 5: 9,10 - 10,10 - ..

Profile 6: 8,14 - 10,13 - ..

Profile 7: 9,12 - 10,11 - ..

Profile 8: 7,15 - 10,12 - ..

Profile 9: 9,13 - 11,11 - ..

Reference Profiles (K1 to Kn)

Expanding the Forensic Core Competency

Standard STR Typing

13 core loci

Paternity Testing

~15 loci

Standard STR Typing for

Database Search13 core loci

1-to-1

1-to-many

Typ

e o

f Se

arch

12,15 8,13

12,13

AF M

C

Paternity Index determines the strength of the compared relationships

Level of CertaintyHigh

IndirectDirectType of Match

Low

Expanding the Forensic Core Competency

Standard STR Typing

ComplexKinship Analysis

13 core loci Additional loci may be beneficial

IndirectDirect

Paternity Testing

~15 loci

Level of CertaintyLow

Standard STR Typing for

Database Search

High

Type of Match

13 core loci

1-to-1

1-to-many

Typ

e o

f Se

arch

What is kinship analysis?

What is kinship analysis?

Evaluation of relatedness between individuals

Applications

Parentage testing (civil or criminal)

Disaster victim identification

Missing persons identification

Familial searching

Immigration

AF M

C

?

Immigration TestingU.S. Department of Homeland Security

U.S. anchor

Anchor may sponsor up to 15 relatives (spouse, parents, siblings, children)

79% of refugee claims were fraudulent based on DNA testing or failure to appear for DNA testing (U.S. Dept. of State)

DHS is looking to require DNA to support relationship claims

Why can kinship analysis be complex?

Direct Matching

Evidence profile8,14 - 10,13 - …

Suspect profile8,14 - 10,13 - ...

Exact match between compared genotypes

Standard STR Typing

Direct MatchingExact match between compared genotypes

Kinship Analysis? Kinship profile 1

8,14 - 10,13 - …Kinship profile 28,14 - 10,13 - ...

Direct MatchingExact match between compared genotypes

Kinship Analysis

Relationship 0 alleles 1 allele 2 alleles

Identical twin 0 0 1

Probability of Sharing Alleles from a Common Ancestor

(12,15) (10,13)

12,13 12,13

Twins!

2 alleles shared every locus

Indirect Matching

Relationship 0 alleles 1 allele 2 alleles

Parent-child 0 1 0

Probability of Sharing Alleles from a Common Ancestor

12,15

12,13

F

C

M

(10,13)

Parent-Offspring

1 allele shared at every locus

Relationship 0 alleles 1 allele 2 alleles

Full siblings 1/4 1/2 1/4

Probability of Sharing Alleles from a Common Ancestor

(12,15) (10,13)

12,13 10,15

Indirect MatchingFull Siblings

0 alleles shared at a locus

12,15 10,13

12,13

What information is required for kinship analysis?

• Alleged relationship• Genotypes of specific markers• Method to assess the relationship

Probability of genotypes if “10” is the true father of “21”Probability of genotypes if an unrelated man is the father of “21”

Paternity Index =

Paternity trio

(Likelihood Ratio)

What information is required for kinship analysis?

• Alleged relationship• Genotypes of specific markers• Method to assess the relationship

Paternity trio

Pedigrees are not always this simple

MaleFemaleDivorceNo data

Complex Pedigree

Why can kinship analysis be complex?

Relationship 0 alleles 1 allele 2 alleles

Parent-child 0 1 0

Full siblings 1/4 1/2 1/4

Half siblings 1/2 1/2 0

Uncle-nephew 1/2 1/2 0

Grandparent-grandchild 1/2 1/2 0

First cousins 3/4 1/4 0

Half siblings, uncle-nephew, and grandparent-grandchild are genetically identical

Probability of Sharing Alleles from a Common Ancestor

For more distant familial relationships, allele sharing decreases uncertainty increases

Leve

l of

Ce

rtai

nty

High

Low

What materials are used for kinship analysis at NIST?

What markers are being studied for kinship analysis?

• 46 autosomal loci

• 17 Y-chromosomal loci

• 15 X-STRs (AFDIL collaboration)

• Mitochondrial control region

NIST 26plex Assay(non-commercial)

D1GATA113

D1S1627

D1S1677

D2S1776

D3S3053

D3S4529

D4S2364

D4S2408

D5S2500

D6S474

D6S1017

D8S1115

D9S1122

D9S2157

D10S1435

D11S4463

D12ATA63

D14S1434

D17S974

D17S1301

D2S441

D10S1248

D22S1045

D18S853

D20S482

D20S1082

U.S.

TPOX

CSF1PO

D5S818

D7S820

D13S317

FGA

vWA

D3S1358

D8S1179

D18S51

D21S11

TH01

D16S539

D2S1338

D19S433

Penta D

Penta E

13

CO

DIS

loci

7 ESS loci

5 loci adopted to expand to 12 ESS loci

46 unique STR loci have been characterized at NIST

See poster #40 for details on additional loci

Europe

FGA

vWA

D3S1358

D8S1179

D18S51

D21S11

TH01

D16S539

D2S1338

D19S433

D12S391

D1S1656

D2S441

D10S1248

D22S1045

SE33

Autosomal STR Markers

European Standard Set = ESS

NIST Sample Set

• NIST U.S. population samples– 254 African American, 261 Caucasian, 139 Hispanic

• U.S. father/son samples– 178 African American, 198 Caucasian, 190 Hispanic,

198 Asian

• Extended family samples– 6 sets of 3–4 generations– 165 total samples

www.cstl.nist.gov/strbase/

NIST Data Analysis Capabilities

Kinship Software– DNA-VIEW™ v. 29.23 (Charles Brenner)

– KIn CALc v. 4.0 (CA DOJ, Steven Myers)

– GeneMarker® HID v. 1.90 (SoftGenetics)

– FSS DNA Lineage (Forensic Science Service)

– LISA (Future Technologies Inc.)

Population Genetics Software– Arlequin v. 3.5

What methods are we using to assess kinship?

How is kinship assessed?

MF

21

Likelihood Ratio (LR)

Evaluate genotypes to give weight (strength) to compared relationships

By the definition of a LR:LR > 1 supports the numerator (alleged relationship)LR < 1 supports the denominator (unrelated)

Larger LR values provide more support for the alleged relationship

Probability of genotypes if 1,2 are full siblingsProbability of genotypes if 1,2 are unrelated

LR =

How is kinship assessed?

Goal: How well does a set of loci perform for kinship analysis?

Method: Evaluate “expected” range of LRs for different relationship questions

Need: Graphical method to display the LRs

Solution: Distribution of data points (LR values)

What is a likelihood ratio distribution?

Variables:

Allele frequency

Number of loci

Kinship probabilities (account for shared alleles from common ancestor)

Likelihood ratio

High LR

(more weight for relationship)

Moderate support for

relationship

Low LR

(less weight for relationship)

Computer simulations can

generate many genotype

combinations and relatedness

scenarios (pedigrees)

Calculate LR values for each

pedigree

Generate LR distributions

What is a likelihood ratio distribution?

Variables:

Allele frequency

Number of loci

Kinship probabilities (account for shared alleles from common ancestor)

Likelihood ratio

LR threshold = 1

Unrelated persons

Relatedpersons

LR < 1 LR > 1

Separate distributions- High probability of shared alleles from common ancestor (e.g., parent-offspring)- Discriminating loci genotyped

Overlap of likelihood ratio distributions

Variables:

Allele frequency

Number of loci

Kinship probabilities (account for shared alleles from common ancestor)

Overlapping distributions- Low probability of shared alleles from common ancestor (e.g., first cousin)- Less discriminating loci genotyped

LR threshold = 1

RelatedUnrelated

False positives

False negatives

uncertainty

What kinship questions have we asked with our dataset?

NIST 26plex Assay(non-commercial)

D1GATA113

D1S1627

D1S1677

D2S1776

D3S3053

D3S4529

D4S2364

D4S2408

D5S2500

D6S474

D6S1017

D8S1115

D9S1122

D9S2157

D10S1435

D11S4463

D12ATA63

D14S1434

D17S974

D17S1301

D2S441

D10S1248

D22S1045

D18S853

D20S482

D20S1082

U.S.

TPOX

CSF1PO

D5S818

D7S820

D13S317

FGA

vWA

D3S1358

D8S1179

D18S51

D21S11

TH01

D16S539

D2S1338

D19S433

Penta D

Penta E

13

CO

DIS

loci

7 ESS loci

5 loci adopted to expand to 12 ESS loci

46 unique STR loci have been characterized at NIST

See poster #40 for details on additional loci

Europe

FGA

vWA

D3S1358

D8S1179

D18S51

D21S11

TH01

D16S539

D2S1338

D19S433

D12S391

D1S1656

D2S441

D10S1248

D22S1045

SE33

European Standard Set = ESS

6.3 megabases apart on chromosome 12

Can D12S391 be used with vWAfor kinship analysis?

vWA chr12:6,093,104 – 6,093,253

D12S391 chr12:12,449,874 – 12,450,226

6.3 megabases apart on chromosome 12UCSC Genome Browser, Feb. 2009 assembly

Are vWA and D12S391 independent?

O’Connor KL, et al., Linkage disequilibrium analysis of D12S391 and vWA in U.S. population and paternity samples, Forensic Sci. Int. Genet. (in press)

No!

No!

Budowle B, et al., Population genetic analyses of the NGM STR loci, Int. J. Legal Med. (in press)

Should vWA and D12S391 be multiplied for profile probability calculations in kinship analysis?

How do 13 loci perform for kinship analysis?

How do 13 loci perform for kinship analysis?

0

0.1

0.2

0.3

0.4

0.5

-10 -8 -6 -4 -2 0 2 4 6 8 10 12 14

Parent-Offspring(1000 simulations)

RelatedUnrelated

0.668 < -10

median LR=10,000median LR=2.7E-13

0

0.1

0.2

0.3

0.4

0.5

-10 -8 -6 -4 -2 0 2 4 6 8 10 12 14

median LR=0.14 median LR=6.9

0

0.1

0.2

0.3

0.4

0.5

-10 -8 -6 -4 -2 0 2 4 6 8 10 12 14

median LR=1.5E-3 median LR=2400

The degree of overlap corresponds with possible values for false positive or false negative results.

Parent-offspring comparisons:No overlap between unrelated and related LR distributions

Full sibling comparisons:False positive rate = 0.027False negative rate = 0.033

Half sibling comparisons:False positive rate = 0.155False negative rate = 0.168

Pro

po

rtio

n o

f si

mu

lati

on

s u

sin

g U

.S. C

auca

sian

alle

le f

req

uen

cies

LR threshold = 1

Full Siblings(5000 simulations)

Half Siblings(1000 simulations)

log10(LR)

Do additional loci improve the discrimination of true relatives

vs. unrelated persons?

0

0.1

0.2

0.3

0.4

0.5

-13 -11 -9 -7 -5 -3 -1 1 3 5 7 9 11 13 15 17

0

0.1

0.2

0.3

0.4

0.5

-13 -11 -9 -7 -5 -3 -1 1 3 5 7 9 11 13 15 17

0

0.1

0.2

0.3

0.4

0.5

-13 -11 -9 -7 -5 -3 -1 1 3 5 7 9 11 13 15 17

How do 20 loci perform for kinship analysis?

median LR=680,000median LR=2.4E-5

Pro

po

rtio

n o

f si

mu

lati

on

s u

sin

g U

.S. C

auca

sian

alle

le f

req

uen

cies

LR threshold = 1

median LR=0median LR=9 million

median LR=0.036 median LR=32.5

RelatedUnrelated

0.991 < -13

log10(LR)

Parent-Offspring(1000 simulations)

Full Siblings(5000 simulations)

Half Siblings(1000 simulations)

Parent-offspring comparisons:No overlap between unrelated and related LR distributions

Full sibling comparisons:False positive rate = 0.006False negative rate = 0.008

Half sibling comparisons:False positive rate = 0.075False negative rate = 0.104

Additional loci improve separation of LR distributions for parent-offspring and full siblings.

0

0.1

0.2

0.3

0.4

-17 -13 -9 -5 -1 3 7 11 15 19

0

0.1

0.2

0.3

0.4

-17 -13 -9 -5 -1 3 7 11 15 19

0

0.1

0.2

0.3

0.4

-17 -13 -9 -5 -1 3 7 11 15 19

How do 40 loci perform for kinship analysis?P

rop

ort

ion

of

sim

ula

tio

ns

usi

ng

U.S

. Cau

casi

an a

llele

fre

qu

enci

es

median LR=0.005 median LR=300

LR threshold = 1

median LR=1.4E-8 median LR=3.1 billion

All LRs = 0

median LR=140 billion

RelatedUnrelated

log10(LR)

Half Siblings(1000 simulations)

Parent-Offspring(1000 simulations)

Full Siblings(5000 simulations)

Parent-offspring comparisons:No overlap between unrelated and related LR distributions

Full sibling comparisons:False positive rate = 0.0006False negative rate = 0.0018

Half sibling comparisons:False positive rate = 0.051False negative rate = 0.066

Additional loci further improve separation of LR distributions for parent-offspring and full siblings.

0

0.1

0.2

0.3

0.4

-4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21

Parent-offspring1000 simulations

Half siblings1000 simulations

Do additional loci improve kinship determination?P

rop

ort

ion

of

sim

ula

tio

ns

usi

ng

U.S

. Cau

casi

an a

llele

fre

qu

en

cies

Full siblings5000 simulations

(additional simulations performed for smoother curves)0

0.05

0.1

0.15

0.2

0.25

-4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21

0

0.1

0.2

0.3

0.4

-4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21

log10(LR)

13 CODIS

Review previous data

Parent-offspring1000 simulations

Pro

po

rtio

n o

f si

mu

lati

on

s u

sin

g U

.S. C

auca

sian

alle

le f

req

ue

nci

es

0

0.1

0.2

0.3

0.4

-4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21

0

0.05

0.1

0.15

0.2

0.25

-4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21

0

0.1

0.2

0.3

0.4

-4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21

log10(LR)

20 STRs

Half siblings1000 simulations

Full siblings5000 simulations

(additional simulations performed for smoother curves)

13 CODIS

Do additional loci improve kinship determination?Review previous data

Parent-offspring1000 simulations

Pro

po

rtio

n o

f si

mu

lati

on

s u

sin

g U

.S. C

auca

sian

alle

le f

req

ue

nci

es

0

0.1

0.2

0.3

0.4

-4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21

0

0.1

0.2

0.3

0.4

-4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21

log10(LR)

20 STRs 40 STRs

0

0.05

0.1

0.15

0.2

0.25

-4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21

Half siblings1000 simulations

Full siblings5000 simulations

(additional simulations performed for smoother curves)

13 CODIS

Do additional loci improve kinship determination?Review previous data

How can uncertainty in kinship determination be reduced?

How can uncertainty in kinship determination be reduced?

Improve the measurement technique

• Add more family references

• Add more loci

– Autosomal STRs* improve identification of parent-offspring and full siblings

– Lineage markers or SNP arrays may improve identification of more distant relatives

Nothnagel M, Schmidtke J, Krawczak M. (2010) Potentials and limits of pairwise kinship analysis using autosomal short tandem repeat loci, Int. J. Legal Med. 124(3):205-15.

* More chances for mutation

Know your limits… simulate… validate!

How does kinship analysis relate to questions that concern you?

Expanding the Forensic Core Competency

Standard STR Typing

ComplexKinship Analysis

Familial Searching

13 core loci Additional loci may be beneficial

Additional loci may be beneficial

IndirectDirect

Paternity Testing

~15 loci

Level of CertaintyLow

Standard STR Typing for

Database Search

High

Type of Match

13 core loci

1-to-1

1-to-many

Typ

e o

f Se

arch

Expanding the Forensic Core Competency to Familial Searching

• One-to-many search many false positives

• Allele sharing method– Partial match between evidence profile and database profile– Miss true relatives

• Especially full siblings (1/4 probability of sharing 0 alleles at a locus)

– Introduce many false positives due to chance allele sharing

• Likelihood ratio approach– Kinship probabilities plus allele frequencies account for allele sharing

due to familial relationship

• Reduce uncertainty with additional loci (autosomal STRs, Y-STRs)

What is NIST doing to improve kinship analysis?

What is NIST doing to improve kinship analysis?

Allele frequencies for U.S. population samples

Evaluation of new loci

Concordance testing of new multiplexes

Developed a new website to support kinship analysis

See Poster #40 tomorrow

Kinship Resource Page on STRBasewww.cstl.nist.gov/strbase/kinship.htm

NIST Standard Reference Family DataAid validation of algorithms, software, and loci selection for kinship analysis – Use genotypes with known inheritance

– Compare LRs from algebraic and software calculations

– Test algorithms for mutation, rare alleles, null alleles, incest

– Evaluate use of additional loci to detect relationships

See Poster #35 today

AcknowledgmentsApplied Genetics

Kinship Page on STRBasehttp://www.cstl.nist.gov/strbase/kinship.htm

kristen.oconnor@nist.gov

John Butler

Applied GeneticsGroup Leader

FundingNRC – Postdoctoral fellowship support for Kristen O’ConnorFBI – Application of DNA Typing as a Biometric ToolNIJ – Forensic DNA Standards, Research, and Training

Kristen Lewis O’Connor

Peter Vallone

DNA BiometricsProject Leader

Erica Butts

Becky Hill

Workshops & Textbooks

Concordance & LT-DNA

Rapid PCR & Biometrics

Kinship Analysis DNA Extraction Efficiency

top related