mstruct: structure under mutations

Post on 27-Jan-2016


mStruct: Structure under mutations

mStruct: Inference of population structure in the presence of genetic admixing and allele mutations

Suyash Shringarpure and Eric Xing, Carnegie Mellon University


Significance


Genetic Population Structure

• Structure (Pritchard et al., 2000)

Genetic structure of Human Populations (Rosenberg et al. 2002)

[Figure: ancestral proportion per individual, grouped by region: Africa, Europe, Mid-East, Cent./S. Asia, East Asia, Oceania]

Generative model - Structure

[Figure: generative model diagram. Labels: example proportion vectors (0.3, 0.7) and (0.8, 0.2); α (for the dataset); all the alleles observed at this locus]
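The diagram itself did not survive extraction, so here is a minimal sketch of the admixture generative process as described in Pritchard et al. (2000): ancestry proportions drawn from a Dirichlet with parameter α, then, per locus, a population drawn from those proportions and an allele drawn from that population's allele frequencies. The function names, the haploid simplification, and the `allele_freqs` layout are assumptions made for illustration, not the Structure software's API.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one vector from Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_individual(alpha, allele_freqs, rng):
    """Sample one haploid genotype under an admixture model.

    alpha        : Dirichlet parameters, one per ancestral population
    allele_freqs : allele_freqs[k][l] maps allele -> frequency for
                   population k at locus l (hypothetical layout)
    """
    theta = sample_dirichlet(alpha, rng)  # ancestry proportions
    n_loci = len(allele_freqs[0])
    genotype = []
    for locus in range(n_loci):
        # Choose the ancestral population this locus is inherited from.
        k = rng.choices(range(len(theta)), weights=theta)[0]
        freqs = allele_freqs[k][locus]
        alleles = list(freqs)
        # Draw the allele from that population's frequencies at this locus.
        allele = rng.choices(alleles, weights=[freqs[a] for a in alleles])[0]
        genotype.append(allele)
    return theta, genotype

if __name__ == "__main__":
    rng = random.Random(0)
    freqs = [[{9: 0.8, 10: 0.2}], [{2: 0.3, 9: 0.7}]]  # two populations, one locus
    print(sample_individual([1.0, 1.0], freqs, rng))
```

Note that in this sketch the allele is drawn as a plain categorical variable: nothing tells the model that allele 9 is closer to allele 10 than to allele 2.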

Modeling allele similarity

• Microsatellite
– Repeats of a small DNA unit

Allele - 2

Allele - 9

Allele - 10

• Allele 9 is much more similar to allele 10 than allele 2.
• Allele 10 might be a mutation of allele 9.
• Mathematically encode the idea in the model
• mStruct – Structure under mutations

Hypothesis

• Individual genomes in modern populations are a result of
– Admixture of ancestral populations.
– Mutations from ancestral alleles.

• Ancestral populations have fewer alleles
– (Mostly) True for microsatellites

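Generative model - mStruct

[Figure: generative model diagram. Labels: example proportion vectors (0.3, 0.7) and (0.8, 0.2); α (for the dataset); all the alleles observed at this locus; mutation parameters δ1 and δ2]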
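As a rough illustration of how the mutation step changes the generative process, the sketch below draws one observed allele in three stages: pick an ancestral population, pick one of that population's ancestral alleles, then mutate it to an observed allele. It assumes δ1 and δ2 are one mutation parameter per ancestral population and that the mutation distribution is normalized over the alleles observed at the locus; both are readings of the slide labels, and all names are hypothetical.

```python
import random

def mutation_probs(ancestor, observed_alleles, delta):
    """P(b | a) proportional to delta ** |b - a|, normalized over observed alleles."""
    weights = [delta ** abs(b - ancestor) for b in observed_alleles]
    total = sum(weights)
    return [w / total for w in weights]

def sample_locus(theta, ancestral, deltas, observed_alleles, rng):
    """Sample one observed allele at one locus.

    theta            : ancestry proportions of the individual
    ancestral        : ancestral[k] maps ancestral allele -> frequency
                       in population k (hypothetical layout)
    deltas           : deltas[k] is the mutation parameter of population k
                       (assumed per-population)
    observed_alleles : all alleles observed at this locus
    """
    # 1. Admixture: which ancestral population contributes this allele?
    k = rng.choices(range(len(theta)), weights=theta)[0]
    # 2. Which ancestral allele of that population is inherited?
    candidates = list(ancestral[k])
    a = rng.choices(candidates, weights=[ancestral[k][x] for x in candidates])[0]
    # 3. Mutation: the observed allele is a (possibly mutated) copy of a.
    probs = mutation_probs(a, observed_alleles, deltas[k])
    b = rng.choices(observed_alleles, weights=probs)[0]
    return k, a, b

if __name__ == "__main__":
    rng = random.Random(1)
    ancestral = [{9: 1.0}, {2: 0.8, 10: 0.2}]
    print(sample_locus([0.3, 0.7], ancestral, [0.5, 0.2], [2, 9, 10], rng))
```

Setting every δ close to zero makes step 3 return the ancestral allele almost always, which is the sense in which the model can fall back to Structure-like behaviour.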

Mutation models

• How to derive descendant alleles from ancestral alleles?

• Distribution based on the single step model

• $P(b \mid a) \propto \delta^{|b-a|}$, $\delta < 1$
• Computationally “easy”
• δ is NOT a conventional mutation rate.
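A small worked example may help make the distribution concrete. It reuses the alleles 2, 9 and 10 from the earlier slide, takes allele 9 as the ancestral allele, and picks δ = 0.5 purely for illustration. Normalizing over the alleles observed at the locus is an assumption here; the slides only give the proportionality.

```latex
P(b \mid a) = \frac{\delta^{|b-a|}}{\sum_{b'} \delta^{|b'-a|}},
\qquad a = 9,\ \delta = 0.5,\ b' \in \{2, 9, 10\}

\sum_{b'} \delta^{|b'-9|} = 0.5^{7} + 0.5^{0} + 0.5^{1} = 0.0078125 + 1 + 0.5 = 1.5078125

P(9 \mid 9) = \frac{1}{1.5078125} \approx 0.663, \quad
P(10 \mid 9) = \frac{0.5}{1.5078125} \approx 0.332, \quad
P(2 \mid 9) = \frac{0.0078125}{1.5078125} \approx 0.005
```

Allele 10 is one repeat away from allele 9 and gets about a third of the probability mass, while allele 2, seven repeats away, gets almost none.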

Finding ancestral alleles

• Fit mixtures of mutation distributions

• Try using 1, 2, 3, ... ancestral alleles

• Use information theory to decide how many ancestral alleles are appropriate
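The three steps above can be sketched as follows. This is not the authors' procedure: the slides do not say which information-theoretic criterion is used, so the sketch swaps in BIC, keeps δ fixed, searches candidate ancestral alleles exhaustively among the observed alleles, and fits only the mixture weights by EM. All function names and the example histogram are hypothetical.

```python
import itertools
import math

def mutation_probs(ancestor, alleles, delta):
    """P(b | a) proportional to delta ** |b - a|, normalized over observed alleles."""
    weights = [delta ** abs(b - ancestor) for b in alleles]
    total = sum(weights)
    return [w / total for w in weights]

def fit_weights(counts, components, n_iter=200):
    """EM for the mixture weights; the component distributions stay fixed."""
    k = len(components)
    n = sum(counts)
    pi = [1.0 / k] * k
    for _ in range(n_iter):
        new_pi = [0.0] * k
        for j, c in enumerate(counts):
            mix = sum(pi[i] * components[i][j] for i in range(k))
            for i in range(k):
                # Expected number of copies of allele j explained by component i.
                new_pi[i] += c * pi[i] * components[i][j] / mix
        pi = [w / n for w in new_pi]
    loglik = sum(
        c * math.log(sum(pi[i] * components[i][j] for i in range(k)))
        for j, c in enumerate(counts) if c > 0
    )
    return pi, loglik

def choose_ancestral_alleles(hist, delta=0.3, max_k=3):
    """Return (bic, ancestral_alleles, weights) with the lowest BIC.

    hist maps allele (repeat count) -> number of observations.
    """
    alleles = sorted(hist)
    counts = [hist[a] for a in alleles]
    n = sum(counts)
    best = None
    for k in range(1, min(max_k, len(alleles)) + 1):
        for ancestors in itertools.combinations(alleles, k):
            comps = [mutation_probs(a, alleles, delta) for a in ancestors]
            pi, loglik = fit_weights(counts, comps)
            n_params = 2 * k - 1  # k ancestor positions + (k - 1) free weights
            bic = -2.0 * loglik + n_params * math.log(n)
            if best is None or bic < best[0]:
                best = (bic, ancestors, pi)
    return best

if __name__ == "__main__":
    hist = {2: 40, 3: 8, 8: 10, 9: 50, 10: 12}  # made-up allele histogram
    print(choose_ancestral_alleles(hist))
```

On the made-up histogram the two clusters of alleles (around 2 and around 9) are each explained by one ancestral allele, and a third ancestral allele does not improve the fit enough to pay for its extra parameters.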

Histogram of observed alleles

Comparing population structure maps


Phylogenetic Trees from the Structural Maps

[Figure: phylogenetic trees derived from the structural maps; panels: mStruct, Structure]

HGDP SNP results

Implications of Inconsistency

• Simplistic mutation model
• SNP mutations harder to discover from data
• The model reduces to Structure
• Fundamental difference
– Different markers treated differently

• Structure’s treatment of alleles is almost categorical

Contour of Empirical Mutation

Conclusion

• Generative model for population structure
• Modeling mutations from ancestral alleles
• Gives mutational information apart from population structure.
• (in press) Genetics
• Online version up now.

Graphical model representations

[Figure: graphical models; panels: Structure, mStruct]
