
Page 1: 02-715 Advanced Topics in Computational Genomics · sssykim/teaching/f11/slides/Lecture10.pdf · Analysis of HapMap Data for Natural Selection (Sabeti et al., 2007) • Look for evidence of recent selective sweep

Natural Selection

02-715 Advanced Topics in Computational Genomics

Page 2:

Time Scales for the Signatures of Selection

Page 3:

Selective Sweep

Page 4:

Long Haplotypes

• LCT allele for lactase persistence (high frequency, ~77%, in European populations, but long haplotypes)

Page 5:

Difficulties in Detecting Natural Selection

• Confounding effects of demography
– Population bottlenecks and expansions can leave signatures that look like positive selection
• Ascertainment bias for SNPs
– Regions where many sequences were used for ascertainment may appear to have more segregating alleles at low frequencies, with more haplotypes.
• Recombination rate
– The signature of selection is stronger in regions with low recombination rates

Page 6:

Analysis of HapMap Data for Natural Selection (Sabeti et al., 2007)

• Look for evidence of recent selective sweeps
– Long haplotypes
– Control for recombination rates by comparing the long haplotypes to other alleles at the same locus
– EHH, iHS tests

Page 7:

EHH Test

• Extended haplotype homozygosity (EHH): EHH at distance x from the core region is the probability that two randomly chosen chromosomes carrying a tested core haplotype are homozygous at all SNPs over the entire interval from the core region to distance x.
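The definition above can be turned into a small calculation. The following is a minimal sketch, not the implementation used by Sabeti et al.: it assumes phased haplotypes given as equal-length strings of 0/1 alleles, a single-SNP core, and distance measured in SNP indices rather than base pairs. The function name `ehh` and the toy data are hypothetical.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, core_allele, distance):
    """EHH of the core allele over SNPs from the core out to `distance` SNPs away.

    Assumption: `distance` is counted in SNP indices; a negative value
    extends to the left of the core.
    """
    lo, hi = sorted((core_index, core_index + distance))
    if lo < 0 or hi >= len(haplotypes[0]):
        raise IndexError("interval runs off the end of the haplotypes")

    # Keep only the chromosomes that carry the tested core allele.
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    if len(carriers) < 2:
        raise ValueError("need at least two chromosomes carrying the core allele")

    # Two chromosomes are 'homozygous' over the interval if their
    # haplotypes are identical at every SNP in it.
    counts = Counter(tuple(h[lo:hi + 1]) for h in carriers)
    identical_pairs = sum(comb(c, 2) for c in counts.values())
    return identical_pairs / comb(len(carriers), 2)


# Toy data: five chromosomes, four SNPs, core SNP at index 0.
haps = ["0110", "0110", "0111", "0100", "1110"]
ehh_at_core = ehh(haps, 0, "0", 0)
ehh_two_snps_out = ehh(haps, 0, "0", 2)
ehh_three_snps_out = ehh(haps, 0, "0", 3)
```

In the toy data, EHH starts at 1 at the core and decays as the interval grows, because more chromosome pairs differ somewhere in the longer interval.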

Page 8:

Haplotype Bifurcation Diagram for Computing EHH

Page 9:

iHS Test

• iHS (integrated haplotype score):
– iHH: integrated EHH
– iHHA: iHH for ancestral allele
– iHHD: iHH for derived allele
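The quantities listed above can be combined as follows. This is a hedged sketch that assumes the commonly used definition from Voight et al. (2006): the unstandardized score is ln(iHHA / iHHD), and scores are then standardized against other SNPs (in practice, SNPs with a similar derived allele frequency). The trapezoidal integration, the function names, and the toy EHH curves are illustrative assumptions, not taken from the slides.

```python
from math import log
from statistics import mean, pstdev


def integrate_ehh(positions, ehh_values):
    """iHH: trapezoidal area under an EHH curve sampled at `positions`."""
    area = 0.0
    for i in range(1, len(positions)):
        width = positions[i] - positions[i - 1]
        area += width * (ehh_values[i] + ehh_values[i - 1]) / 2
    return area


def unstandardized_ihs(ihh_ancestral, ihh_derived):
    """Assumed form ln(iHHA / iHHD); negative when the derived allele
    sits on the longer haplotypes."""
    return log(ihh_ancestral / ihh_derived)


def standardize(scores):
    """Z-score a list of unstandardized scores (assumed to come from
    SNPs in the same derived-allele-frequency bin)."""
    mu, sigma = mean(scores), pstdev(scores)
    return [(s - mu) / sigma for s in scores]


# Toy EHH curves at four positions (arbitrary distance units).
positions = [0, 1, 2, 3]
ihh_a = integrate_ehh(positions, [1.0, 0.5, 0.2, 0.0])  # EHH decays quickly
ihh_d = integrate_ehh(positions, [1.0, 0.9, 0.8, 0.6])  # EHH decays slowly
score = unstandardized_ihs(ihh_a, ihh_d)
```

Here the derived allele keeps high EHH over a longer distance, so iHHD exceeds iHHA and the unstandardized score is negative.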

Page 10:

iHS Test

Page 11:

iHS: More Examples

Page 12:

Analysis of HapMap Data for Natural Selection

• Determining targets of selection among the candidate regions
– Target alleles are likely to be derived alleles
– Target alleles are likely to be highly differentiated between populations
– Target alleles are likely to have biological effects, e.g., non-synonymous

Page 13:

HapMap: Candidates for Natural Selection

Page 14:

Global Distribution of Positively Selected Allele SLC24A5 A111T

Page 15:

EHH, iHS, and Ascertainment Bias

• EHH and iHS are haplotype-based methods
– Less sensitive to ascertainment bias.
– Good power for recent selective sweeps, but low power for older sweeps.

Page 16:

Composite Likelihood Test (Nielsen et al., 2005)

• Likelihood models for null and alternative hypotheses
• Incorporates a scheme for correcting the ascertainment bias

Page 17:

Composite Likelihood Test 1

• p = {p_1, …, p_{n−1}}: probabilities of the derived allele frequency classes (1 to n−1 derived copies) in a sample of size n
• Likelihood model under neutral evolution
• Likelihood model under selective sweep
• Test statistic
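The slide's formulas are not reproduced in this transcript, so the following is only a rough sketch of how a composite likelihood ratio of this kind is usually built. It assumes that the composite likelihood is the product over SNPs of the probability of each SNP's derived allele count, that the neutral (null) spectrum is estimated from all SNPs in the data, and that the alternative spectrum is estimated from the SNPs in the window under test. All function names and the toy counts are hypothetical.

```python
from math import log


def spectrum(derived_counts, n):
    """Estimate p_1..p_{n-1} as the proportion of SNPs in each class.

    Index j of the returned list holds the estimate for j derived copies.
    """
    tallies = [0] * (n + 1)
    for j in derived_counts:
        if not 0 < j < n:
            raise ValueError("SNPs must be segregating: 0 < count < n")
        tallies[j] += 1
    return [t / len(derived_counts) for t in tallies]


def composite_log_likelihood(derived_counts, p):
    """Sum of log p_j over SNPs, treating SNPs as if independent."""
    return sum(log(p[j]) for j in derived_counts)


def clr_statistic(window_counts, genome_counts, n):
    """2 * (log CL under window spectrum - log CL under genome spectrum).

    Assumption: the window SNPs are a subset of the genome SNPs, so no
    class seen in the window has zero probability under the null.
    """
    p_null = spectrum(genome_counts, n)
    p_alt = spectrum(window_counts, n)
    return 2 * (composite_log_likelihood(window_counts, p_alt)
                - composite_log_likelihood(window_counts, p_null))


# Toy data: n = 4 chromosomes; the window is enriched for high-frequency
# derived alleles relative to the genome-wide spectrum.
genome = [1] * 6 + [2] * 3 + [3] * 3
window = [3, 3, 3, 1]
stat = clr_statistic(window, genome, 4)
```

Because the window spectrum is the maximum-likelihood fit to the window's own SNPs, the statistic is never negative; larger values mean the window departs more strongly from the genome-wide spectrum.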

Page 18:

Composite Likelihood Test 2

• Incorporates the spatial distribution of allele frequencies due to recombination
• Assumption: each ancestral lineage in the genealogy has an i.i.d. probability of escaping a selective sweep through recombination onto the selected background.

Page 19:

Ancestral Recombination Graph with Selective Sweep

Page 20:

Composite Likelihood Test 2

• The probability of escaping through recombination
– d: distance between a given locus and the selected variant
– α: a parameter that is a function of the recombination rate, the effective population size, and the selection coefficient of the selected mutation (e.g., α = r ln(2N)/s)
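To make the roles of d and α concrete, here is a minimal sketch. The expression for α is the example given on the slide; the exponential form 1 − e^{−αd} for the escape probability is an assumption on my part, since the slide's own formula is not reproduced in this transcript. The parameter values are arbitrary illustrations.

```python
from math import exp, log


def alpha(r, N, s):
    """alpha = r * ln(2N) / s, the example parameterization from the slide.

    r: recombination rate per base pair per generation
    N: effective population size
    s: selection coefficient of the selected mutation
    """
    return r * log(2 * N) / s


def escape_probability(d, a):
    """ASSUMED form 1 - exp(-alpha * d) for a locus at distance d."""
    return 1.0 - exp(-a * d)


# Illustrative values only: r = 1e-8 per bp, N = 10,000, s = 0.01.
a = alpha(1e-8, 10_000, 0.01)
p_near = escape_probability(10_000, a)    # locus 10 kb from the selected site
p_far = escape_probability(100_000, a)    # locus 100 kb from the selected site
```

Under this form, loci close to the selected variant rarely escape the sweep, and stronger selection (larger s, hence smaller α) widens the affected region.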

Page 21:

Composite Likelihood Test 2

• The probability that k (0<k<n) out of n gene copies escaped the sweep:
• The probability of observing B mutant alleles after a sweep
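If each lineage escapes independently with the same probability, the number of escaping lineages follows a binomial distribution. The sketch below shows only that consequence of the i.i.d. assumption; the slide's exact expression, including how the boundary cases are treated, is not reproduced in this transcript, and the function name and example values are hypothetical.

```python
from math import comb


def prob_k_escape(k, n, p_escape):
    """Binomial probability that exactly k of n lineages escape the sweep,
    assuming independent escapes with common probability p_escape."""
    return comb(n, k) * p_escape ** k * (1 - p_escape) ** (n - k)


# Illustration: n = 20 gene copies, escape probability 0.1.
n, p_escape = 20, 0.1
distribution = [prob_k_escape(k, n, p_escape) for k in range(n + 1)]
most_likely_k = max(range(n + 1), key=lambda k: distribution[k])
```

The probability of observing a given number of mutant alleles after the sweep would then be obtained by combining this distribution with the pre-sweep frequency spectrum; that step is not sketched here.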

Page 22:

Simulation Study

• Distribution of test statistics under the null hypothesis

[Figure panels: Test 1, Test 2]

Page 23:

Correcting for Ascertainment Bias

• Likelihood for allele frequencies after conditioning on ascertainment (i.e., unobserved true allele frequencies)

Page 24:

Correcting for Ascertainment Bias (Nielsen et al., 2004)

• Illustration through simulation study (20 genes, 10,000 SNPs, 5 genes for ascertainment)
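One way to see what conditioning on ascertainment does is the sketch below. It reads "genes" as sampled gene copies (n = 20 typed, d = 5 used for discovery) and makes two assumptions that the transcript does not state: the discovery copies are a random subset of the typed sample, and a SNP is ascertained only if it is polymorphic in that subset. The function names and the toy spectrum are hypothetical.

```python
from math import comb


def prob_ascertained(j, n, d):
    """P(SNP is polymorphic in a random subset of d copies | j derived of n).

    ASSUMPTION: the d discovery copies are a subset of the n typed copies.
    """
    monomorphic = comb(j, d) + comb(n - j, d)  # all derived or all ancestral
    return 1 - monomorphic / comb(n, d)


def observed_spectrum(true_p, n, d):
    """Spectrum seen among ascertained SNPs: p_j * P(asc | j), renormalized."""
    w = {j: p * prob_ascertained(j, n, d) for j, p in true_p.items()}
    total = sum(w.values())
    return {j: v / total for j, v in w.items()}


def corrected_spectrum(observed_p, n, d):
    """Undo the bias: divide by P(asc | j) and renormalize."""
    w = {j: q / prob_ascertained(j, n, d) for j, q in observed_p.items()}
    total = sum(w.values())
    return {j: v / total for j, v in w.items()}


# Toy example with n = 20 and d = 5, using a spectrum proportional to 1/j.
n, d = 20, 5
norm = sum(1 / j for j in range(1, n))
true_p = {j: (1 / j) / norm for j in range(1, n)}
biased = observed_spectrum(true_p, n, d)
recovered = corrected_spectrum(biased, n, d)
```

In this toy setup a singleton is discovered only a quarter of the time, so low-frequency classes are under-represented among ascertained SNPs until the correction is applied.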

Page 25:

HapMap Data Analysis

• HapMap chromosome 2
• Test 1: requires a choice of window size
• Test 2: no need to fix the window size

Page 26:

Ascertainment Bias from HapMap Analysis

Page 27:

Neandertals and Modern Humans

Page 28:

Selective Sweeps in Modern Human Genomes Compared to Neandertals