Supplemental Methods
Herein we define a hierarchical Bayesian model to estimate genotypes $g$, allele frequencies
$p$, and a genetic diversity parameter $\theta$ from low-coverage DNA sequence data with possible
sequence errors. We describe the model for bi-allelic SNPs, but it is easily modified
for multi-allelic loci. Let $x_{ij}$ denote the number of sequences of the arbitrarily defined
reference allele for locus (SNP) $i$ and individual $j$, and let $\epsilon_i$ denote the probability of a
sequence error for locus $i$. Let $g_{ij} \in \{0, 1, 2\}$ denote the genotype for locus $i$ and individual
$j$; $g_{ij} = 1$ is the heterozygous genotype, and $g_{ij} = 0$ and $g_{ij} = 2$ are the homozygous
genotypes. We assume that the conditional probability of the data for locus $i$ and
individual $j$ given $g_{ij}$ is binomial,
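\begin{equation}
P(x_{ij} \mid g_{ij}) = \frac{n_{ij}!}{x_{ij}!\,(n_{ij} - x_{ij})!}
\begin{cases}
(1 - \epsilon_i)^{x_{ij}} \, \epsilon_i^{\,n_{ij} - x_{ij}} & \text{if } g_{ij} = 0 \\
0.5^{\,n_{ij}} & \text{if } g_{ij} = 1 \\
\epsilon_i^{\,x_{ij}} \, (1 - \epsilon_i)^{n_{ij} - x_{ij}} & \text{if } g_{ij} = 2
\end{cases}
\tag{A1}
\end{equation}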
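Equation A1 translates directly into a few lines of code. The following Python sketch is illustrative only (the authors' software is written in C++). The function name `genotype_likelihood` and the single scalar error rate `eps` are assumptions made for the example. The leading factor is taken to be the standard binomial coefficient, and the genotype cases follow the ordering printed in Eq. A1.

```python
from math import comb


def genotype_likelihood(x, n, g, eps):
    """Return P(x | g) as in Eq. A1.

    x   -- number of reads carrying the reference allele
    n   -- total number of reads (sequence coverage)
    g   -- genotype code in {0, 1, 2}, ordered as printed in Eq. A1
    eps -- per-read sequence error probability (assumed scalar here)
    """
    if not 0 <= x <= n:
        raise ValueError("x must satisfy 0 <= x <= n")
    if g == 0:
        p_ref = 1.0 - eps   # a reference read is seen unless an error occurs
    elif g == 1:
        p_ref = 0.5         # heterozygote: either allele is equally likely
    elif g == 2:
        p_ref = eps         # a reference read is seen only through an error
    else:
        raise ValueError("g must be 0, 1, or 2")
    return comb(n, x) * p_ref ** x * (1.0 - p_ref) ** (n - x)


if __name__ == "__main__":
    # Hypothetical data: 3 of 4 reads carry the reference allele, 1% error.
    for g in (0, 1, 2):
        print(g, genotype_likelihood(3, 4, g, 0.01))
```

With only a handful of reads, the heterozygous and homozygous likelihoods can be of similar magnitude, which is why the model pools information across individuals through the allele frequency prior rather than calling genotypes from reads alone.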
where $n_{ij}$ is the number of sequences (i.e., sequence coverage) for locus $i$ and individual $j$.
The full likelihood of the data is $\prod_i \prod_j P(x_{ij} \mid g_{ij}) \, P(g_{ij} \mid p_i)$, where $p_i$ is the population
frequency of the reference allele for locus $i$ and $P(g_{ij} \mid p_i) \sim \mathrm{binomial}(p_i, n = 2)$. By taking
the product across loci and individuals we assume Hardy-Weinberg and linkage equilibrium
within a population.
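Written out, the binomial genotype prior gives the familiar Hardy-Weinberg proportions. The expansion below is a restatement, assuming that $g_{ij}$ counts copies of the allele whose frequency is $p_i$:

```latex
\[
P(g_{ij} \mid p_i) = \binom{2}{g_{ij}} \, p_i^{\,g_{ij}} (1 - p_i)^{2 - g_{ij}}
= \begin{cases}
(1 - p_i)^2 & \text{if } g_{ij} = 0 \\
2 p_i (1 - p_i) & \text{if } g_{ij} = 1 \\
p_i^2 & \text{if } g_{ij} = 2
\end{cases}
\]
```

For example, with $p_i = 0.3$ the three genotype probabilities are 0.49, 0.42, and 0.09.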
We place a hierarchical prior on $p$ that is conditional on a genetic diversity
parameter $\theta$, specifically, $P(p_i \mid \theta) \sim \mathrm{Beta}(\theta, \theta)$. When $\theta$ is large (e.g., greater than one),
many loci are expected to have intermediate allele frequencies. Conversely, as $\theta$ approaches
zero, most loci are expected to have a single common allele and one or more rare alleles.
Under certain conditions $\theta = 4N_e\mu$ (Wright, 1931). We place an uninformative hyperprior
on $\theta$, specifically we assume $\theta \sim U(a, b)$ where $a \approx 0$ and $b$ is large (we use $b = 10000$). We
developed software that uses an MCMC algorithm to sample from
$P(g, p, \theta \mid x) \propto P(x \mid g) \, P(g \mid p) \, P(p \mid \theta) \, P(\theta)$. The software is written in C++, uses the GNU
Scientific Library (Galassi et al., 2009), and is available through DRYAD (doi pending).
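One way to sample from a posterior of this form is to combine Gibbs updates for the genotypes and allele frequencies with a Metropolis step for θ. The Python sketch below is an assumption about how such a sampler could look, not a description of the authors' C++ software. The function names, the fixed scalar error rate `eps`, the proposal width `step`, and the update scheme are all choices made for illustration. It codes g as printed in Eq. A1 and takes P(g | p) as Binomial(2, p), so `p` in the sketch is the frequency of the allele that g counts.

```python
import math
import random


def read_weight(x, n, g, eps):
    """P(x | g) without the binomial coefficient, which cancels across g."""
    p_ref = (1.0 - eps, 0.5, eps)[g]  # case ordering as printed in Eq. A1
    return p_ref ** x * (1.0 - p_ref) ** (n - x)


def hwe_prior(g, p):
    """P(g | p) for g ~ Binomial(2, p)."""
    return ((1.0 - p) ** 2, 2.0 * p * (1.0 - p), p ** 2)[g]


def log_beta_prior(p_all, theta):
    """log of prod_i Beta(p_i | theta, theta)."""
    norm = math.lgamma(2.0 * theta) - 2.0 * math.lgamma(theta)
    return sum(norm + (theta - 1.0) * (math.log(p) + math.log(1.0 - p))
               for p in p_all)


def run_sampler(x, n, eps, n_iter=200, a=1e-3, b=1e4, step=0.5, seed=1):
    """x[i][j], n[i][j]: reference-read count and coverage, locus i, individual j."""
    rng = random.Random(seed)
    n_loci, n_ind = len(x), len(x[0])
    p = [0.5] * n_loci
    g = [[1] * n_ind for _ in range(n_loci)]
    theta = 1.0
    trace = []
    for _ in range(n_iter):
        for i in range(n_loci):
            # Gibbs update: draw each genotype from its full conditional.
            for j in range(n_ind):
                w = [read_weight(x[i][j], n[i][j], k, eps) * hwe_prior(k, p[i])
                     for k in (0, 1, 2)]
                g[i][j] = rng.choices((0, 1, 2), weights=w)[0]
            # Conjugate update: Beta prior combined with binomial genotype counts.
            s = sum(g[i])
            draw = rng.betavariate(theta + s, theta + 2 * n_ind - s)
            p[i] = min(max(draw, 1e-12), 1.0 - 1e-12)  # keep logs finite
        # Metropolis update for theta under the U(a, b) hyperprior.
        prop = theta + rng.gauss(0.0, step)
        if a < prop < b:
            log_ratio = log_beta_prior(p, prop) - log_beta_prior(p, theta)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                theta = prop
        trace.append(theta)
    return g, p, trace


if __name__ == "__main__":
    # Hypothetical toy data: 2 loci, 3 individuals.
    x = [[4, 3, 5], [0, 2, 4]]
    n = [[4, 3, 5], [3, 4, 4]]
    g, p, trace = run_sampler(x, n, eps=0.01)
    print(g, p, trace[-1])
```

Real data sets with tens of thousands of loci would call for a compiled implementation and tuning of the proposal width, which is presumably part of the motivation for writing the production software in C++.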
Supplemental Tables and Figures
Supplementary Table S1: Population sample information (JH = Jackson Hole Lycaeides; N♂ = male sample size; N♀ = female sample size).
Locality                  Taxon       ID   N♂  N♀  Lat. (°N)  Long. (°W)  Elevation (m)
King’s Hill, MT           L. idas     KHL   8   0  46.8407    110.6990    2239
Garnet Peak, MT           L. idas     GNP  10  10  45.4323    111.2245    1910
Bunsen Peak, WY           L. idas     BNP  10  10  44.9337    110.7212    2260
Trout Lake, WY            L. idas     TRL  15   3  44.9019    110.1291    2124
Hayden Valley, WY         L. idas     HNV  13  25  44.6823    110.4945    2344
Mt. Randolf, WY           JH          MRF  23  12  43.8547    110.3918    2221
Upper Slide Lake, WY      JH          USL  17  12  43.5829    110.3328    2246
Teton Science School, WY  JH          TSS  17  17  43.6974    110.6102    2180
Blacktail Butte, WY       JH          BTB  17  21  43.6382    110.6820    2220
Bull Creek, WY            JH          BCR  20  17  43.3007    110.5530    2195
Victor, ID                L. melissa  VIC  10  15  43.6590    111.1114    1850
Lander, WY                L. melissa  LAN  12  12  42.6533    108.3551    1787
Sinclair, WY              L. melissa  SIN  12  13  41.8517    107.0917    1961
Supplementary Table S2: Proportion of phenotypic variance explained by population.
Trait       Prop. Variance
F           0.897
H           0.492
U           0.665
W           0.100
E           0.028
F/W         0.706
F/H         0.729
[F+H]/E     0.495
H/U         0.293
Num. Ast.   0.123
Num. Med.   0.000
Prop. Ast.  0.141
Supplementary Table S3: The number of genetic regions with posterior inclusion probabilities greater than or equal to the 99.9th empirical quantile for each pair of traits. The expected number of shared genetic regions if posterior inclusion probabilities for pairs of traits are independent is less than one (approximately 1/20). Genitalic measurements F, H, U, W, E, F/W, F/H, [F+H]/E, and H/U are depicted in Figure 2, and oviposition traits are the number or proportion of eggs laid on Medicago or Astragalus. These results are for the naive analysis that includes all Lycaeides populations (LN analysis).
             F   H   U   W   E  F/W  F/H  [F+H]/E  H/U  Num. Ast.  Num. Med.  Prop. Ast.
F           51   3  12   2   0    7   11        3    1          0          0           1
H            3  52   6   2   0    5    1        4    1          0          0           0
U           12   6  52   2   0    7    6        4    6          1          0           0
W            2   2   2  52   2   19    2        2    2          0          0           0
E            0   0   0   2  53    0    0       18    0          0          0           1
F/W          7   5   7  19   0   52    4        3    2          0          0           0
F/H         11   1   6   2   0    4   52        3    3          0          0           0
[F+H]/E      3   4   4   2  18    3    3       52    1          0          0           0
H/U          1   1   6   2   0    2    3        1   54          0          0           0
Num. Ast.    0   0   1   0   0    0    0        0    0         53          1           1
Num. Med.    0   0   0   0   0    0    0        0    0          1         52           1
Prop. Ast.   1   0   0   0   1    0    0        0    0          1          1          52
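The figure of approximately 1/20 quoted in the caption can be reproduced with a rough calculation. This is a sketch under two assumptions: about 52 regions per trait meet the 99.9th-quantile threshold (as the diagonal of Table S3 suggests), and exceedances are independent between traits. The total number of genetic regions, written here as $N$, is not stated in this section and cancels out:

```latex
\[
E[\text{shared regions}] = N q^2 = (N q)\, q \approx 52 \times 0.001 = 0.052 \approx \tfrac{1}{20},
\qquad q = 1 - 0.999 = 0.001 .
\]
```

Against this expectation, off-diagonal counts such as 19 (W and F/W) or 18 (E and [F+H]/E) indicate substantial sharing, although traits defined as ratios of one another are not independent by construction.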
Supplementary Table S4: The number of genetic regions with posterior inclusion probabilities greater than or equal to the 99.9th empirical quantile for each pair of traits. The expected number of shared genetic regions if posterior inclusion probabilities for pairs of traits are independent is less than one (approximately 1/20). Genitalic measurements F, H, U, W, E, F/W, F/H, [F+H]/E, and H/U are depicted in Figure 2, and oviposition traits are the number or proportion of eggs laid on Medicago or Astragalus. These results are for the naive analysis that includes only admixed Lycaeides populations (AN analysis).
             F   H   U   W   E  F/W  F/H  [F+H]/E  H/U  Num. Ast.  Num. Med.  Prop. Ast.
F           57   3   5   2   2    2    3        3    3          0          3           1
H            3  53   2   0   0    0    4        1    1          0          0           0
U            5   2  54   3   4    2    1        3    5          0          0           3
W            2   0   3  52   0   18    0        0    1          0          1           1
E            2   0   4   0  54    1    2       17    1          1          1           0
F/W          2   0   2  18   1   52    1        1    1          0          0           0
F/H          3   4   1   0   2    1   53        3    3          0          2           1
[F+H]/E      3   1   3   0  17    1    3       55    3          1          1           0
H/U          3   1   5   1   1    1    3        3   55          0          0           2
Num. Ast.    0   0   0   0   1    0    0        1    0         53          1           1
Num. Med.    3   0   0   1   1    0    2        1    0          1         54           4
Prop. Ast.   1   0   3   1   0    0    1        0    2          1          4          52
Supplementary Figure S1: Histograms summarize the variation for each morphological trait (diagonal) and scatter-plots depict the covariance between pairs of characters (off-diagonal; light gray = L. idas, gray = Jackson Hole Lycaeides, black = L. melissa). We denote individuals from each conspecific population with a different symbol. We report Pearson’s product-moment correlation in the lower-triangle plots.
Supplementary Figure S2: Histograms summarize the variation for each oviposition preference trait (diagonal) and scatter-plots depict the covariance between pairs of characters (off-diagonal; light gray = L. idas, gray = Jackson Hole Lycaeides, black = L. melissa). We denote individuals from each conspecific population with a different symbol. We report Pearson’s product-moment correlation in the lower-triangle plots.
[Figure S3: twelve histogram panels (SIN, LAN, VIC, BTB, TSS, USL, MRF, HNV, TRL, BNP, GNP, KHL); x-axis: Allele frequency (0.0 to 1.0); y-axis: Number of loci (0 to 40000).]
Supplementary Figure S3: Histograms depict the reference allele frequency distribution for all loci and each population. We define population abbreviations in Table S1.
[Figure S4: nine histogram panels (A to I); x-axis: Effect size (0.0 to 0.5); y-axis: Number of loci.]
Supplementary Figure S4: Histograms depict estimated effect sizes for SNPs with posterior inclusion probabilities greater than 0.01. (A) F, all Lycaeides, naive (LN) analysis; (B) F, admixed populations, naive (AN) analysis; (C) F, all Lycaeides, population-mean adjusted (LR) analysis; (D) H, all Lycaeides, naive (LN) analysis; (E) H, admixed populations, naive (AN) analysis; (F) H, all Lycaeides, population-mean adjusted (LR) analysis; (G) proportion of eggs on Astragalus, all Lycaeides, naive (LN) analysis; (H) proportion of eggs on Astragalus, admixed populations, naive (AN) analysis; (I) proportion of eggs on Astragalus, all Lycaeides, population-mean adjusted (LR) analysis.
Supplementary Figure S5: Plots depict genetic region posterior inclusion probabilities for forearm length (A-B), humerulus length (C), and the proportion of eggs laid on Astragalus (D). We use different symbols to designate different analyses: all Lycaeides, naive analysis (LN analysis; small, closed circle); admixed populations, naive analysis (AN analysis; +); all Lycaeides, population-mean adjusted analysis (LR analysis; ×). The order of genetic regions is arbitrary, but consistent among plots. The scale of the y-axis differs among plots. We present posterior inclusion probabilities for forearm length in two panes, because the scale differs considerably among the different analyses.