Autosomal STR Variation in Five Austronesian Populations
E. M. SHEPARD,1 R. A. CHOW,1 EPIFANIA SUAFO’A,2 DAVID ADDISON,3 A. M. PEREZ-MIRANDA,1 R. L. GARCIA-BERTRAND,4 AND R. J. HERRERA1
Abstract Human population characteristics at the genetic level are integral to both forensic biology and population genetics. This study evaluates biparental microsatellite markers in five Austronesian-speaking groups to characterize their intra- and interpopulation differences. Genetic diversity was analyzed using 15 short tandem repeat (STR) loci from 338 unrelated individuals from 5 Pacific islands populations, including the aboriginal Ami and Atayal groups from Taiwan, Bali and Java in Indonesia, and the Polynesian islands of Samoa. Allele frequencies from the STR profiles were determined and compared to other geographically targeted worldwide populations procured from recent literature. Hierarchical AMOVA analysis revealed a large number of loci that exhibit significant correspondence to linguistic partitioning among groups of populations. A pronounced divide exists between Samoa and the East (Formosa) and Southeast Asian (Bali and Java) islands. This is clearly illustrated in the topology of the neighbor-joining tree. Phylogenetic analyses also indicate clear distinctions between the Ami and Atayal and between Java and Bali, which belie the respective geographic proximities of the populations in each set. This differentiation is supported by the higher interpopulation variance components of the Austronesian populations compared to other Asian non-Austronesian groups. Our phylogenetic data indicate that, despite their linguistic commonalities, these five groups are genetically distinct. This degree of genetic differentiation justifies the creation of population-specific databases for human identification.
1Department of Biological Sciences, Florida International University, University Park, OE 304, Miami, FL 33199.
2National Park Service, Pago Pago, American Samoa, 96799.
3American Samoa Power Authority, Pago Pago, American Samoa, 96799.
4Department of Biological Sciences, Colorado College, Colorado Springs, CO 80903.
Human Biology, December 2005, v. 77, no. 6, pp. 825–851. Copyright © 2005 Wayne State University Press, Detroit, Michigan 48201-1309
KEY WORDS: ISLAND SOUTHEAST ASIA, TAIWAN ABORIGINES, POLYNESIA, TAIWAN, BALI, JAVA, SAMOA, AUSTRONESIAN-SPEAKING GROUPS, AMI GROUP, ATAYAL GROUP, SHORT TANDEM REPEATS, MICROSATELLITE MARKERS, D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, VWA, TPOX, D18S51, D5S818, FGA, D2S1338, D19S433, PHYLOGENY, FORENSIC BIOLOGY, GENETIC DIVERSITY.
Genetic diversity, characterized at both the intra- and the interpopulation level, forms the basis of forensic biology and population genetics, respectively. Yet these two disciplines are not always fully integrated in studies using human samples. A well-established marker system such as autosomal short tandem repeats (STRs) represents hypervariable regions that can provide the fine resolution needed to determine relationships among closely related populations in recent evolutionary history (Bowcock et al. 1994; Jorde et al. 1995, 1997; Bosch et al. 2000; Lum et al. 2002; Rowold and Herrera 2003; Perez-Miranda et al. 2005; Shepard and Herrera 2005) and the discrimination power essential for robust individual probabilities of inclusion (Leibelt et al. 2003; Collins et al. 2004). Also, STRs are used in these studies because of their numerous and relatively even distribution throughout the genome, high levels of polymorphism, a large number of possible alleles per locus, and short amplicon lengths, which facilitate DNA amplification, separation, and detection (Butler 2001; Butler et al. 2003).
The populations composing the Austronesian language family have been the subject of numerous studies from overlapping anthropological disciplines, namely, linguistics, archeology, and molecular biology. Studies in these fields have provided evidence on the complexity of human migration patterns during the Austronesian diaspora (Bellwood 2001; Underhill 2004). The current range of Austronesian-speaking people extends from Taiwan (Formosa) to the north, Easter Island (west of Chile, South America) in the east, New Zealand to the south, and as far as Madagascar (off East Africa) to the west. The distances between these locations cover approximately two-thirds the circumference of the planet. Consequently, because of their wide geographic distribution, the Austronesians are an interesting group from the perspective of population genetics. In addition, because of their relatively recent expansions into the Indian and Pacific Oceans, beginning about 6,000 years b.p. and ending as late as 800 years b.p., Austronesian populations provide an ideal test group to study a major dispersal process from prehistoric time. In terms of forensics, this area is undercharacterized by accepted STR marker sets.
The goal of this study is to investigate the allelic profiles of 15 biparental STR loci common to forensic studies (D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, VWA, TPOX, D18S51, D5S818, FGA, D2S1338, and D19S433) in five geographically targeted Austronesian populations from the Pacific Ocean. These include two aboriginal Taiwanese populations (the Ami and Atayal), two Indonesian populations from Bali and Java, and a Polynesian population from Samoa. The ultimate aim of our study is to assess the degree of genetic heterogeneity among these five Austronesian populations and to ascertain how they relate phylogenetically to regional and worldwide groups previously studied with the same set of markers. In addition, these well-characterized databases will be of forensic value.
Upon examination of these 15 highly polymorphic loci, we find that when groups of populations are compared, the overall tests of correlation of genetic partitioning with linguistic and geographic differences are statistically significant. Also, most loci exhibit significant correlation at the level of groups of populations along geographic and linguistic lines. Phylogenetic analyses display some thought-provoking results, including an extreme differentiation between the two aboriginal populations from Taiwan, the Ami and Atayal, despite their overlapping geographic range. Similarly, we detect a clear distinction between the Indonesian populations of Bali and Java, separated by mere miles within the Indo-Malaysian archipelago. Of particular interest is the segregation of Samoa from the other four Austronesian groups into a different clade altogether. Based on this evidence, we conclude that these five populations do not share an obvious genetic link, despite their common language affiliation, the implications of which are discussed further within a framework of autosomal STR analysis.
Materials and Methods
Populations, Sample Collection, and DNA Isolation. The five Austronesian populations from the Pacific Ocean investigated in this study include two aboriginal groups from Formosa (Taiwan) (the Ami and Atayal), two populations from islands of the Indonesian chain (Bali and Java), and a fifth population from the Samoan islands in Polynesia (Figure 1). The Samoan samples were collected from both Western and American Samoa as a representative group. Data from 12 worldwide populations (Table 1) were obtained from the literature and used for comparison. Populations were chosen from the literature to be representative of different ethnic groups and biogeographic areas. Individuals were identified by biogeographic information traced back at least two generations. Each collection was arranged through the leaders of each region and supervised by the same. Sample collections were performed according to the ethical guidelines outlined by the Institutional Review Board of Florida International University. All samples were collected as whole blood in Vacutainer tubes containing EDTA. DNA was extracted using the standard phenol-chloroform method (Antunez de Mayolo et al. 2002).
PCR Amplification and Detection of STRs. The samples were amplified by PCR using the commercial AmpFlSTR Identifiler kit (Applied Biosystems, Foster City, California) at the following loci: D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA, and amelogenin. Amplifications were performed in a GeneAmp PCR System 9600 thermocycler (Applied Biosystems) using the following cycling parameters: 11 min denaturation at 95°C; 28 cycles of 1 min denaturation at 94°C, 1 min primer annealing at 59°C, and 1 min primer extension at 72°C; and a final soak for 60 min at 60°C. A portion of each amplified sample was mixed with formamide and GS500 LIZ as an internal size standard, as recommended by the manufacturer (Applied Biosystems), and then separated using an ABI Prism 3100 Genetic Analyzer (Applied Biosystems). GeneScan 3.7 was used to determine the fragment sizes, and Genotyper 3.7 NT software was used to designate alleles by comparison with the allelic ladder provided by the manufacturer.
Figure 1. Locations of the populations used in this study. Language affiliation classifications were obtained from http://www.ethnologue.com. Geographic coordinates for each population were generated according to the geopolitical Mercator projection (Watkins et al. 2003).
Statistical and Phylogenetic Analyses. Allele frequencies of the 15 STR loci were calculated using the gene counting method (Li 1976). The Arlequin software package, version 2.000 (Levene 1949; Guo and Thompson 1992; Schneider et al. 2000), was used to assess Hardy-Weinberg equilibrium expectations using Fisher's exact test with the modified Markov chain Monte Carlo method as well as to determine Nei's gene diversity index (GD) (Nei 1987). Hardy-Weinberg equilibrium was evaluated at α = 0.05 and also using the Bonferroni adjustment for the number of loci tested (0.05/15 = 0.0033) as a correction for type I errors.
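The gene counting step and the heterozygosity and gene diversity measures named above can be illustrated with a minimal Python sketch. It is not the Arlequin implementation: the function names and the toy genotypes are hypothetical, and the gene diversity estimator is assumed to be the common unbiased form n/(n - 1) × (1 - Σp²), with n the number of gene copies sampled.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Gene counting: tally both alleles of every individual at one locus."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())  # equals 2 * number of individuals
    return {allele: count / total for allele, count in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles (Ho)."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def gene_diversity(genotypes):
    """Assumed unbiased estimator: n/(n-1) * (1 - sum p_i^2), n = gene copies."""
    freqs = allele_frequencies(genotypes)
    n = 2 * len(genotypes)
    return n / (n - 1) * (1 - sum(p * p for p in freqs.values()))


# Hypothetical genotypes (repeat numbers) for five individuals at one locus.
toy_locus = [(8, 9), (9, 9), (8, 11), (9, 11), (8, 8)]
freqs = allele_frequencies(toy_locus)
ho = observed_heterozygosity(toy_locus)
gd = gene_diversity(toy_locus)
```

With these toy genotypes, alleles 8 and 9 each account for 4 of the 10 gene copies and allele 11 for the remaining 2; three of the five individuals are heterozygous.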
Table 1. Description of and Reference Information for the Studied Populations

Population         n    Description                                                   Reference
African Americans  258  General population of United States                           Butler et al. (2003)
Ami                79   Aboriginal tribe, east-central Taiwan                         Present study
Angola             110  General population of Cabinda, Angola                         Beleza et al. (2004)
Atayal             25   Aboriginal tribe, north Taiwan                                Present study
Bali               79   General population of Bali                                    Present study
Belgium            222  General population, majority from Flanders region of Belgium  Decorte et al. (2004)
Japan              526  General population of Japan                                   Hashiyada et al. (2003)
Java               60   General population of Java                                    Present study
Malaysian Malay    210  Malay ethnicity from Malaysia                                 Seah et al. (2003)
Malaysian Chinese  219  Chinese ethnicity from Malaysia                               Seah et al. (2003)
Mozambique         142  General population of Maputo, Mozambique                      Alves et al. (2004)
North Poland       145  General population of northern Poland                         Szczerkowska et al. (2004)
Samoa              95   General population of American Samoa and Samoa                Present study
Taiwan             597  General population of Taiwan                                  Wang et al. (2003)
U.S. Caucasians    302  General population of United States                           Butler et al. (2003)
U.S. Hispanics     140  General population of United States                           Butler et al. (2003)
Venezuela          255  General population of Caracas, Venezuela                      Chiurillo et al. (2003)

Forensically useful parameters were also examined for all five populations studied, including power of discrimination (PD) and polymorphic information content (PIC), using the PowerStats program, version 1.2 (Tereba 1999; Jones 1972; Brenner and Morris 1990). To determine phylogenetic relationships, the 5 populations studied along with 12 other geographically targeted worldwide reference populations were included in a neighbor-joining tree using Phylip 3.52c software (Felsenstein 2002) based on FST distances (Reynolds et al. 1983). Bootstrap consensus scores (1,000 replications) were generated by the SeqBoot and GenDist options of the Phylip software, and the ConSense program determined the best-fitting dendrogram.
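For readers unfamiliar with the forensic parameters mentioned in the statistical analyses, the following Python sketch shows one common way to compute PD and PIC for a single locus and to combine PD across independent loci. It is an illustration under stated assumptions, not the PowerStats code: PD is taken as one minus the sum of squared observed genotype frequencies, PIC as 1 - Σp_i² - Σ_{i<j} 2p_i²p_j², and all function names and toy values are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import prod


def power_of_discrimination(genotypes):
    """Assumed form: PD = 1 - sum of squared observed genotype frequencies."""
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    n = len(genotypes)
    return 1 - sum((c / n) ** 2 for c in counts.values())


def polymorphic_information_content(allele_freqs):
    """Assumed form: PIC = 1 - sum p_i^2 - sum_{i<j} 2 p_i^2 p_j^2."""
    p = list(allele_freqs)
    homozygosity = sum(x * x for x in p)
    correction = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1 - homozygosity - correction


def combined_power_of_discrimination(pd_values):
    """Independent loci: 1 - product of per-locus match probabilities."""
    return 1 - prod(1 - pd for pd in pd_values)


# Hypothetical single-locus data: five individuals, three alleles.
toy_genotypes = [(8, 9), (9, 9), (8, 11), (9, 11), (8, 8)]
toy_pd = power_of_discrimination(toy_genotypes)
toy_pic = polymorphic_information_content([0.4, 0.4, 0.2])
toy_combined = combined_power_of_discrimination([0.9, 0.8])
```

Because the per-locus match probabilities multiply, a panel of 15 loci with PD values near 0.9 each yields a combined PD that is numerically very close to 1.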
Multidimensional scaling (MDS) analyses, also based on FST distances, were performed using the Statistical Package for the Social Sciences (SPSS) software program (SPSS Inc. 2001). G tests were carried out to determine differences in overall genetic variability between populations using Carmody's program (Carmody 1990).
Inter- and intrapopulation genetic variance component values (GST and Hs, respectively) were ascertained for nine Pacific Ocean populations, composed of six Austronesian (Ami, Atayal, Bali, Java, Malaysian Malay, and Samoa) and three non-Austronesian (Japan, Malaysian Chinese, and Taiwan) groups for each locus, according to the DISPAN software program (Ota 1993). Genetic structuring was analyzed for the same nine populations according to both biogeographic lines and linguistic subfamily affiliations through hierarchical analysis of molecular variance (AMOVA) (Excoffier et al. 1992). Linguistic correlations were assessed based on the following subfamily partitioning: two non-Austronesian groups [Japanese (Japan) and Sino-Chinese (Taiwan and Malaysian Chinese)] and three Austronesian groups [Formosan (Ami, Atayal), Western Malayo-Polynesian (Bali, Java, Malaysian Malay), and Eastern Malayo-Polynesian (Samoa)]. Geographic correlations were tested on the basis of the following regional groups: northeast Asia (Japan), East Asia (Ami, Atayal, Taiwan, and Malaysian Chinese), Southeast Asia (Bali, Java, Malaysian Malay), and Polynesia (Samoa). Note that the Malaysian Chinese population, while inhabiting Malaysia, was grouped with the East Asian populations because of its Han Chinese ancestry.
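The relationship between the intrapopulation component (Hs) and the interpopulation component (GST) can be sketched as follows. This is a simplified illustration that assumes Nei's basic definition GST = (HT - HS)/HT, with equal weighting of populations and no sample-size correction; DISPAN may apply corrections not shown here. The function names and the two-population allele frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """1 - sum of squared allele frequencies for one locus."""
    return 1 - sum(p * p for p in freqs.values())


def variance_components(populations):
    """Return (HS, HT, GST) for one locus.

    populations: list of dicts mapping allele -> frequency, one per population.
    HS is the mean within-population heterozygosity; HT is the heterozygosity
    of the pooled (mean) allele frequencies; GST = (HT - HS) / HT.
    """
    k = len(populations)
    hs = sum(expected_heterozygosity(f) for f in populations) / k
    alleles = set().union(*populations)
    mean_freqs = {a: sum(f.get(a, 0.0) for f in populations) / k for a in alleles}
    ht = expected_heterozygosity(mean_freqs)
    return hs, ht, (ht - hs) / ht


# Hypothetical allele frequencies at one locus in two populations.
pop_a = {"10": 0.5, "11": 0.5}
pop_b = {"10": 0.9, "11": 0.1}
hs, ht, gst = variance_components([pop_a, pop_b])
```

In this toy case HS = 0.34 and HT = 0.42, so about 19% of the total diversity is attributable to differences between the two populations.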
Results
STR Diversity Within Populations. The allele-frequency distributions of the Ami, Atayal, Bali, Java, and Samoa populations are listed in Tables 2 through 6. In addition, parameters of importance to population genetics are summarized in Table 7 for each group under study. This table lists the loci in each population that do not meet Hardy-Weinberg equilibrium expectations at p < 0.05 (4 loci out of 75 possible tests). However, after applying the Bonferroni adjustment (see explanation in Materials and Methods section), this is reduced to a single departure from Hardy-Weinberg equilibrium (1 locus out of 75 possible tests).
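The effect of the Bonferroni adjustment on the four nominal departures can be checked with a short Python sketch. The P values are those reported for the corresponding loci in Tables 2, 4, and 6; the variable and function names are hypothetical.

```python
ALPHA = 0.05
N_LOCI = 15  # number of loci tested per population

# Hardy-Weinberg P values below 0.05, as reported in Tables 2, 4, and 6.
nominal_departures = {
    ("Ami", "VWA"): 0.04676,
    ("Bali", "VWA"): 0.00004,
    ("Samoa", "D19S433"): 0.01297,
    ("Samoa", "FGA"): 0.01979,
}


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni adjustment."""
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_LOCI)
persisting = sorted(
    key for key, p in nominal_departures.items() if p < threshold
)
```

Only the Bali VWA locus falls below the adjusted threshold of about 0.0033, which matches the single persisting departure described above.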
Although a detailed dissection of allele frequencies is not the purpose of this study, some interesting observations warrant attention. For instance, in the Ami population (Table 2), allele 10 of TPOX, commonly encountered in other Asian databases, is notably absent. Conversely, in the Ami, allele 24.2 of FGA (0.0316) is present at a frequency up to 10 times higher than in the published Asian databases included in this study and is absent altogether from the Atayal, Bali, Java, and Samoa. Observed heterozygosity (Ho) for the Ami ranges from 0.6076 in TH01 to 0.8987 in D13S317 and D2S1338. In the Atayal the smallest and largest allele sizes in each locus are often not present (Table 3). This population also lacks four common midsize alleles; most notably absent are alleles 9.3 and 10 of TH01 and alleles 15 and 24.2 of VWA. Observed heterozygosity for the Atayal ranges from 0.5200 in TH01 to 1.0000 in FGA. The Indonesian population from Bali (Table 4) contains a microvariant allele (allele 23.2 of D21S11) not encountered in any of the Asian databases used in our study. There was one deviation from Hardy-Weinberg equilibrium expectations for Bali in the VWA locus (p < 0.05) that persists even after application of the Bonferroni adjustment for type I errors. Departures have been reported in other regional populations, specifically the Malay group from Malaysia for this particular locus (Seah et al. 2003). Observed heterozygosity of the Bali population oscillates from 0.6076 in VWA to 0.8608 in D2S1338 and D18S51. In the other Indonesian group from Java (Table 5), allele 13 of TPOX (0.0083) is rarely detected in other Asian groups. In this Javanese population the observed heterozygosity ranges from 0.6333 in CSF1PO to 0.9500 in D21S11. Within the population from Samoa (Table 6), locus D13S317 contains a rare allele with 16 repeats, one of the largest in this locus reported thus far among the Asian data sets. Ho for Samoa ranges from 0.6105 in TPOX to 0.9158 in D16S539.
STR Diversity Among Populations. To investigate genetic affiliations among the five Pacific Ocean populations and their relationships to other worldwide populations, we generated a neighbor-joining tree using FST distances. Figure 2 displays a neighbor-joining phylogram based on all 17 populations. There are three major clusters in the dendrogram (bootstrap value of 50%). One consists of those of European/Hispanic descent; another primarily includes African groups or groups of African ancestry, and a third clade represents a cluster made up of Asians and Pacific Ocean populations. Overall, the topology of this tree is robust (only four nodes exhibit bootstrap values under 50% incidence).
The Atayal and Ami segregate into the same group as the Japanese and Chinese from both Malaysia and Taiwan (91%) and then separate from these three populations with a confidence value of 29%. The Ami bifurcate from the Atayal with a bootstrap value of 40%. It is interesting to note that, although there is a genetic relationship between these two Taiwanese aboriginal groups, the extensive branch length of the Atayal, with respect to the Ami, is indicative of their genetic uniqueness. Both Bali and Java bifurcate from the Malaysian Malays (bootstrap value = 71%). However, the genetic distance between Bali and Java is not as pronounced as in the case with the Ami and Atayal. Samoa segregates in an isolated and intermediate position between the African and European/Hispanic groups with a bootstrap value of 100%.
MDS analysis was performed to examine the genetic relationships among the 17 populations based on FST distances (Figure 3). Its topology is consistent with that of the grouping in the neighbor-joining dendrogram. As in the neighbor-joining tree, there are three main clusters: (1) Europeans, (2) Africans and groups of African descent, and (3) Asians and Pacific Islanders. The Ami, Atayal, Bali, and Java populations all cluster with the Asian groups on the left side of the plot. The Samoan population lies on the y axis of the crosshairs closer to the African groups than to the European/Hispanic groups. The isolated placement of the Samoans segregating away from all clusters and the extreme outlier position of the Atayal indicate considerable genetic differentiation with respect to the other populations and mirror the long branch distance observed in the neighbor-joining tree.
The nine Asian and Pacific Ocean populations were split into two groups: Austronesian and non-Austronesian language affiliation. They were then analyzed to determine the allocation of genetic variance at the interpopulation (GST) and intrapopulation (Hs) levels (Table 8). The STR markers in the Austronesian groups displayed lower levels of intrapopulation variance than the non-Austronesian groups for 11 of the 15 loci and when calculated across all loci. GST values ranged from 2 to 10 times higher in the Austronesians than in the non-Austronesians per locus and more than 8 times higher when assessed across all loci. When pairwise G tests were performed on our 5 populations and on the 12 geographically targeted worldwide groups taken from the literature, we observed significant genetic differentiation (p < 0.05) between all populations.
Table 2. STR Allele Frequencies for the Ami (Taiwanese Aborigine), n = 79

Loci (column order): D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA

Allele  Frequencies (nonblank cells of the row, left to right in column order)
6       0.1076
7       0.2342 0.0506
8       0.1519 0.0570 0.3165 0.4747
9       0.0570 0.5253 0.0886 0.3101 0.1139 0.0443
9.3     0.0063
10      0.1076 0.2215 0.2722 0.0696 0.1582 0.0949 0.3544
11      0.0759 0.3608 0.2215 0.3101 0.2215 0.3924 0.2089
12      0.1519 0.1709 0.4114 0.1203 0.2215 0.0190 0.0443 0.2278
13      0.2595 0.0253 0.0823 0.0063 0.1266 0.2848 0.0253 0.1139
13.2    0.0190
14      0.2342 0.0127 0.0127 0.0063 0.0190 0.1519 0.1456 0.1772
14.2    0.1456
15      0.1139 0.3291 0.0063 0.0696 0.0570 0.2658
15.2    0.2975
16      0.0380 0.3291 0.0886 0.0063 0.1266 0.0823 0.0063
16.2    0.0253
17      0.0190 0.3291 0.0127 0.2532 0.1709
18      0.0063 0.0633 0.2342 0.0759
19      0.1582 0.1582 0.0886 0.1076
20      0.0443 0.0190 0.0253 0.0696
21      0.0063 0.0063 0.0316 0.1582
22      0.0823 0.0127 0.2405
23      0.2025 0.1709
24      0.2658 0.1013
24.2    0.0316
25      0.0759 0.0886
26      0.0253
28      0.0316
29      0.1392
30      0.1962
31      0.2215
31.2    0.0886
32      0.0380
32.2    0.1772
33      0.0127
33.2    0.0633
34.2    0.0316

Summary statistics (one value per locus, in column order)
Ho       0.8481 0.7848 0.7595 0.6835 0.6835 0.6076 0.8987 0.7848 0.8987 0.7722 0.7975 0.6582 0.8228 0.7722 0.8734
He       0.8279 0.8516 0.7705 0.7052 0.7077 0.6706 0.7611 0.7852 0.8421 0.7852 0.8204 0.6112 0.8493 0.7688 0.8571
P value  0.33519 0.73645 0.14166 0.92577 0.09143 0.47733 0.21328 0.56048 0.25976 0.59075 0.04676 0.90535 0.06765 0.98115 0.27838
GD       0.8279 0.8516 0.7705 0.7052 0.6793 0.6536 0.7611 0.7852 0.8421 0.7852 0.8204 0.8204 0.8493 0.7662 0.8571
PD       0.9354 0.9553 0.9002 0.8566 0.8060 0.8358 0.8678 0.9101 0.9351 0.9146 0.9319 0.7470 0.9438 0.9059 0.9508
PIC      0.7996 0.8279 0.7297 0.6477 0.6045 0.6060 0.7171 0.7464 0.8175 0.7474 0.7895 0.5278 0.8266 0.7249 0.8349

Ho, observed heterozygosity. He, expected heterozygosity. P value: Hardy-Weinberg equilibrium, Fisher's exact test. GD, gene diversity index. PD, power of discrimination. PIC, polymorphic information content. Statistics calculated using PowerStats, v. 1.2 (Promega).
Table 3. STR Allele Frequencies for the Atayal (Taiwan Aborigines), n = 25

Loci (column order): D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA

Allele  Frequencies (nonblank cells of the row, left to right in column order)
6       0.2000
7       0.6400 0.0400
8       0.1000 0.0200 0.0600 0.3000
9       0.0200 0.1400 0.0400 0.5200 0.0200
10      0.3400 0.3600 0.2000 0.3600 0.1000 0.1000 0.3800
11      0.0800 0.3800 0.3400 0.4400 0.1600 0.5800 0.4200
12      0.1400 0.1200 0.3200 0.1000 0.1800 0.0400 0.1200
13      0.1000 0.0200 0.1200 0.0200 0.2400 0.0200 0.0400
13.2    0.1800
14      0.2400 0.0200 0.0200 0.1200 0.1600 0.4400
14.2    0.1000
15      0.0400 0.3400 0.1800 0.2400
15.2    0.1600
16      0.0600 0.4600 0.0800 0.0800 0.2000
16.2    0.0200
17      0.1600 0.2000 0.2200 0.0600
18      0.0400 0.0600 0.4200
19      0.1800 0.1200 0.0600
20      0.0600 0.1600
21      0.0800
22      0.0400 0.1800
23      0.1600 0.1400
24      0.2000 0.0800
25      0.0200 0.2400
26      0.0400
27      0.0200
28      0.0200
29      0.2400
30      0.2800
31      0.1400
31.2    0.2000
32      0.0400
32.2    0.0800

Summary statistics (one value per locus, in column order)
Ho       0.8000 0.7200 0.6800 0.6400 0.7600 0.5200 0.7200 0.7200 0.8000 0.8400 0.7200 0.6800 0.6400 0.8800 1.0000
He       0.8041 0.8343 0.7608 0.7869 0.6457 0.5412 0.6767 0.6743 0.8637 0.8441 0.7584 0.5747 0.7959 0.6751 0.8637
P value  0.55754 0.08959 0.73341 0.32034 0.11418 0.87490 0.31314 0.93405 0.66446 0.47120 0.90218 0.41380 0.63722 0.23101 0.26712
GD       0.8016 0.8122 0.7151 0.7420 0.6457 0.5412 0.6751 0.6743 0.8637 0.8441 0.7437 0.5747 0.7176 0.6751 0.8637
PD       0.9024 0.8992 0.8640 0.8672 0.6848 0.7296 0.7808 0.8448 0.9376 0.9184 0.8832 0.7008 0.8608 0.7200 0.9056
PIC      0.7569 0.7659 0.6492 0.6784 0.5648 0.4796 0.6015 0.6207 0.8278 0.8037 0.6896 0.4938 0.6575 0.5993 0.8283

Ho, observed heterozygosity. He, expected heterozygosity. P value: Hardy-Weinberg equilibrium, Fisher's exact test. GD, gene diversity index. PD, power of discrimination. PIC, polymorphic information content. Statistics calculated using PowerStats, v. 1.2 (Promega).
Table 4. STR Allele Frequencies for Bali (Indonesia), n = 79

Loci (column order): D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA

Allele  Frequencies (nonblank cells of the row, left to right in column order)
6       0.0570
7       0.3165
8       0.2278 0.1392 0.3101 0.0063 0.6456
9       0.0633 0.0506 0.2911 0.1329 0.0949 0.1329 0.0063
9.3     0.1013
10      0.1582 0.1835 0.2405 0.0949 0.1646 0.1266 0.0063 0.3671
11      0.0886 0.3418 0.2848 0.2342 0.3418 0.2089 0.2342
12      0.1582 0.1456 0.3797 0.1456 0.2468 0.0823 0.0063 0.0443 0.3228
12.2    0.0127
13      0.1772 0.0380 0.0443 0.0127 0.1582 0.2025 0.1013 0.0696
13.2    0.0823
14      0.1772 0.0190 0.0253 0.2342 0.2215 0.1962
14.2    0.0570
15      0.1392 0.3734 0.1582 0.0063 0.2975
15.2    0.1519
16      0.0823 0.2911 0.1456 0.2278
16.2    0.0190
17      0.0190 0.2722 0.0886 0.3228 0.0443
18      0.0380 0.0633 0.1456 0.0127 0.0063
19      0.0063 0.2848 0.0633 0.0190 0.0633
20      0.1835 0.0696 0.0380 0.1392
21      0.0253 0.0253 0.0127 0.0380
21.2    0.0506
22      0.1203 0.0063 0.2342
22.2    0.0253
23      0.1392 0.2089
23.2    0.0063 0.0127
24      0.0570 0.1139
25      0.0380 0.0443
25.2    0.0063
26      0.0316
27      0.0190
27.2    0.0063
28.2    0.0443
29.2    0.2595
30.2    0.2152
31      0.0063
31.2    0.0759
32      0.0316
32.2    0.0633
33      0.1835
33.2    0.0253
34      0.0823
35      0.0063

Summary statistics (one value per locus, in column order)
Ho       0.8481 0.8228 0.7215 0.6709 0.6329 0.7595 0.7215 0.7215 0.8608 0.7975 0.6076 0.5823 0.8608 0.7089 0.8354
He       0.8592 0.8450 0.8104 0.7452 0.7045 0.7781 0.7951 0.7730 0.8395 0.8478 0.8357 0.5252 0.8099 0.7519 0.8624
P value  0.59438 0.23265 0.59467 0.29176 0.15917 0.85325 0.09204 0.57035 0.95736 0.30441 0.00004 0.69530 0.76840 0.10703 0.56864
GD       0.8582 0.8387 0.7713 0.7154 0.7043 0.7781 0.7879 0.7730 0.8395 0.8441 0.7999 0.5252 0.8099 0.7058 0.8624
PD       0.9527 0.9415 0.9095 0.8627 0.8537 0.9133 0.9104 0.9088 0.9486 0.9463 0.9213 0.7162 0.9236 0.8342 0.9572
PIC      0.8348 0.8135 0.7312 0.6586 0.6419 0.7395 0.7495 0.7338 0.8150 0.8187 0.7669 0.4692 0.7788 0.6442 0.8419

Ho, observed heterozygosity. He, expected heterozygosity. P value: Hardy-Weinberg equilibrium, Fisher's exact test. GD, gene diversity index. PD, power of discrimination. PIC, polymorphic information content. Statistics calculated using PowerStats, v. 1.2 (Promega).
Table 5. STR Allele Frequencies for Java (Indonesia), n = 60

Loci (column order): D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA

Allele  Frequencies (nonblank cells of the row, left to right in column order)
6       0.1250 0.0083
7       0.0083 0.2667 0.0083
8       0.2417 0.0083 0.1167 0.2917 0.5083
9       0.0500 0.0167 0.2833 0.1083 0.1417 0.1333 0.0083
9.3     0.0583
10      0.0250 0.1917 0.1750 0.1500 0.2167 0.1583 0.0583 0.4167
11      0.1083 0.3250 0.4000 0.2417 0.3167 0.2667 0.2000
12      0.1583 0.1583 0.3250 0.0917 0.2667 0.0583 0.0167 0.0417 0.2083
13      0.1250 0.0250 0.0750 0.0250 0.0333 0.1167 0.2833 0.0083 0.0083 0.1083 0.1417
13.2    0.0583
14      0.2417 0.0583 0.0167 0.1917 0.2917 0.2500 0.0083
14.2    0.0417
15      0.1833 0.2750 0.1250 0.0417 0.2333
15.2    0.2250
16      0.1333 0.3583 0.0333 0.0083 0.1083 0.1833 0.0083
16.2    0.0083
17      0.0250 0.2167 0.1000 0.2167 0.0833
18      0.0667 0.0583 0.1833 0.0083 0.0250
19      0.2333 0.1250 0.0333 0.0583
20      0.1000 0.0167 0.0250 0.0750
21      0.0167 0.0083 0.0083 0.1667
21.2    0.0250
22      0.0417 0.0250 0.2417
22.2    0.0417
23      0.2333 0.1583
23.2    0.0083
24      0.1250 0.0750
25      0.0417 0.0250
25.2    0.0333
26      0.0167 0.0333
26.2    0.0083
27      0.0083 0.0167
28      0.0667 0.0083
29      0.1917
30      0.2583
30.2    0.0417
31      0.1333
31.2    0.0917
32      0.0500
32.2    0.0833
33      0.0167
33.2    0.0167
34      0.0250
34.2    0.0167

Summary statistics (one value per locus, in column order)
Ho       0.8333 0.9500 0.8167 0.6333 0.6833 0.8167 0.6833 0.6667 0.9000 0.7000 0.6500 0.6833 0.8333 0.7000 0.8000
He       0.8462 0.8573 0.7814 0.8024 0.7695 0.8020 0.8116 0.7763 0.8541 0.8149 0.8619 0.6564 0.8541 0.7468 0.8707
P value  0.81497 0.26635 0.89755 0.30447 0.48089 0.23996 0.17922 0.17478 0.4728 0.30142 0.06316 0.26398 0.91123 0.50657 0.10098
GD       0.8462 0.8573 0.7814 0.7106 0.7468 0.8001 0.7947 0.7763 0.8541 0.8148 0.8120 0.6396 0.8307 0.7287 0.8703
PD       0.9417 0.9367 0.9056 0.8639 0.8889 0.9139 0.9211 0.9061 0.9461 0.9350 0.9283 0.8039 0.9383 0.8761 0.9544
PIC      0.8190 0.8338 0.7395 0.6517 0.6977 0.7633 0.7562 0.7332 0.8299 0.7818 0.7782 0.5785 0.8011 0.6797 0.8497

Ho, observed heterozygosity. He, expected heterozygosity. P value: Hardy-Weinberg equilibrium, Fisher's exact test. GD, gene diversity index. PD, power of discrimination. PIC, polymorphic information content. Statistics calculated using PowerStats, v. 1.2 (Promega).
Table 6. STR Allele Frequencies for Samoa, n = 95

Loci (column order): D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA

Allele  Frequencies (nonblank cells of the row, left to right in column order)
6       0.1263
7       0.4684
8       0.1000 0.1368 0.0632 0.3730
9       0.1105 0.0737 0.1895 0.2000 0.2368
9.3     0.1526
10      0.2105 0.1947 0.1789 0.0421 0.0947 0.1684 0.0368 0.2211
11      0.0474 0.1947 0.3947 0.3368 0.2526 0.3526 0.0053 0.1053
11.2    0.0316
12      0.0158 0.2150 0.3474 0.2105 0.1579 0.0211 0.2737
13      0.2940 0.0842 0.0684 0.0526 0.1053 0.2421 0.0105 0.3105
13.2    0.1000
14      0.2316 0.0947 0.0105 0.0105 0.0263 0.1158 0.2053 0.2263 0.0895 0.0684
14.2    0.1579
15      0.1000 0.0053 0.3368 0.0158 0.0211 0.2158 0.2263 0.0211
15.2    0.2211
16      0.0737 0.3780 0.0105 0.0053 0.1000 0.1000
17      0.0211 0.2053 0.0789 0.2947 0.3010
18      0.0053 0.0526 0.1158 0.1158 0.0579
19      0.0158 0.2263 0.0421 0.1263 0.0632
20      0.0158 0.0053 0.0579 0.0105
21      0.0842 0.0053 0.0158
22      0.2526 0.0053 0.0368
23      0.1211 0.2632
24      0.0789 0.3368
25      0.0211 0.0158 0.1316
26      0.1053
27      0.0053 0.0263
28      0.2579 0.0053
29      0.2632 0.0053
30      0.1368
31      0.0842
31.2    0.0632
32      0.0053
32.2    0.1526
33.2    0.0158
34.2    0.0158

Summary statistics (one value per locus, in column order)
Ho       0.8000 0.7684 0.8526 0.7368 0.6842 0.6842 0.7579 0.9158 0.8211 0.7474 0.8632 0.6105 0.8211 0.7474 0.8211
He       0.8224 0.8539 0.8437 0.6903 0.7014 0.8017 0.7937 0.8248 0.8411 0.8179 0.7987 0.6822 0.8226 0.7793 0.7991
P value  0.90599 0.26757 0.53662 0.28805 0.52546 0.32775 0.79872 0.28200 0.57929 0.01297 0.60521 0.23593 0.66662 0.37434 0.01979
GD       0.8010 0.8148 0.8437 0.6903 0.7014 0.7192 0.7937 0.8248 0.8410 0.8179 0.7943 0.6821 0.8226 0.7709 0.7866
PD       0.9272 0.9334 0.9458 0.8319 0.8552 0.8829 0.9288 0.9259 0.9454 0.9303 0.9130 0.8397 0.9372 0.8999 0.8893
PIC      0.7680 0.7849 0.8189 0.6284 0.6415 0.6837 0.7615 0.7928 0.8171 0.7871 0.7587 0.6133 0.7956 0.7252 0.7532

Ho, observed heterozygosity. He, expected heterozygosity. P value: Hardy-Weinberg equilibrium, Fisher's exact test. GD, gene diversity index. PD, power of discrimination. PIC, polymorphic information content. Statistics calculated using PowerStats, v. 1.2 (Promega).
Table 7. Statistical Population Genetic Parameters of Five Populations

Population  Total    Combined Power of  Average         Loci with Highest Power        Loci with Lowest Power  Departures from
            Alleles  Discrimination     Heterozygosity  of Discrimination              of Discrimination       Hardy-Weinberg Equilibrium
Ami         111      0.999999999999999  0.7710          D21S11, D18S51, FGA            TPOX                    VWA
Atayal      89       0.999999999999663  0.7269          D2S1338                        D3S1358, TPOX           None
Bali        118      0.999999999999999  0.7746          D21S11, D2S1338, D19S433, FGA  TPOX                    VWA^a
Java        129      0.999999999999999  0.7915          D8S1179, D2S1338, FGA          TPOX                    None
Samoa       117      0.999999999999999  0.7799          D7S820, D2S1338                CSF1PO, TPOX            D19S433, FGA

a. Persistence of departure from Hardy-Weinberg equilibrium after Bonferroni-like adjustment for number of loci tested (0.05/15 = 0.0033).
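The Bonferroni-like threshold in footnote a and the combined power of discrimination can be checked with a short Python sketch. The PD and P values below are copied from the per-locus summary rows that precede Table 7 (one population, 15 loci, in column order). The product rule for combining loci assumes the loci are independent, and the variable names are illustrative only.

```python
from math import prod

# Per-locus power of discrimination (PD row of the preceding summary rows).
pd_values = [0.9272, 0.9334, 0.9458, 0.8319, 0.8552, 0.8829, 0.9288, 0.9259,
             0.9454, 0.9303, 0.9130, 0.8397, 0.9372, 0.8999, 0.8893]
# Hardy-Weinberg exact-test P values (P value row of the same summary rows).
p_values = [0.90599, 0.26757, 0.53662, 0.28805, 0.52546, 0.32775, 0.79872,
            0.28200, 0.57929, 0.01297, 0.60521, 0.23593, 0.66662, 0.37434,
            0.01979]

alpha = 0.05
bonferroni_alpha = alpha / len(p_values)  # 0.05 / 15

# Probability that two unrelated individuals match at every locus, assuming
# independent loci; the combined PD is 1 minus this product.
match_probability = prod(1.0 - pd for pd in pd_values)

# Column positions (0-based) that depart from equilibrium at each threshold.
nominal = [i for i, p in enumerate(p_values) if p < alpha]
persistent = [i for i, p in enumerate(p_values) if p < bonferroni_alpha]

print(f"adjusted threshold: {bonferroni_alpha:.4f}")
print(f"combined match probability: {match_probability:.3e}")
print("nominal departures:", nominal, "persistent:", persistent)
```

The sketch reports the match probability rather than the combined PD itself, because subtracting a value this small from 1 loses most of its precision in double-precision arithmetic.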
Figure 2. Neighbor-joining phylogenetic analysis of 17 worldwide populations based on FST distances from STR allele frequencies. The GenDist option of the Phylip software created branch distances, and the corresponding bootstrap values (based on 1,000 replications) were transferred to the corresponding nodes of the neighbor-joining tree.
Partitioning of Populations Based on Geography and Language. The distribution of genetic variance was assessed along geographic and linguistic partitioning among the Asian and Pacific Ocean populations using AMOVA. The six Austronesian populations, five from this study and one previously studied (Malaysian Malay), and three non-Austronesian reference populations from Asia (Japan, Taiwan general population, and Malaysian Chinese) were included in this analysis. Table 9 indicates the loci that exhibit statistically significant correlations and their corresponding variance values. Except for marker D19S433, no locus shows a significant correlation (p > 0.05) between genetic diversity and linguistic or geographic partitions when populations within groups are compared. In contrast, among groups of populations the following five loci show significant correlation (p < 0.05) of genetic diversity with both linguistics and geography: D8S1179, D7S820, TH01, D16S539, and D2S1338. Five additional loci exhibit significant correlation (p < 0.05) among groups of populations along linguistic lines only, for a total of 10 of the 15 loci. A single additional locus shows significant correlation (p < 0.05) between genetic differences and geographic partitioning among groups of populations, for a total of 6 of the 15 loci. The overall AMOVA among groups of populations along linguistic (p < 0.00001) and geographic (p < 0.01) lines generated significant correspondence to genetic structure. Yet the overall AMOVA among populations within groups gave no significant correlation along either linguistic or geographic lines.

Figure 3. Multidimensional scaling analyses of 17 worldwide populations based on FST distances from STR allele frequencies. AFA, African American; AMI, Ami; ATA, Atayal; BAL, Bali; BEL, Belgium; CAB, Cabinda, Angola; JAP, Japan; JAV, Java; MCH, Malaysian Chinese; MML, Malaysian Malay; MOZ, Mozambique; NPO, North Poland; SAM, Samoa; TAI, Taiwan; USC, US Caucasian; USH, US Hispanic; VEN, Venezuela.
Discussion
These results provide novel databases for 15 autosomal STR loci in four Austronesian populations (Ami, Atayal, Bali, and Samoa). For a fifth Austronesian group (Java), new data were added to a preexisting database (Othman et al. 2004) in the form of loci D2S1338 and D19S433. An inspection of the results reveals a marked lack of commonly encountered microvariant alleles among the Ami, Atayal, and Samoan groups at the highly polymorphic loci D21S11 and FGA. We also detected specific alleles in common among the published Asian and Pacific Ocean population databases used in this study as well as in our Balinese and Javanese groups. These include alleles 28.2, 29.2, and 30.2 at the D21S11 locus and alleles 21.2, 22.2, 23.2, and 25.2 at the FGA locus. Overall, D2S1338 and FGA are the most discriminating loci across all populations, and TPOX is the least discriminating locus; D2S1338 and FGA have the highest number of alleles, and TPOX has the lowest number of alleles within each population. As expected, in most cases the most polymorphic loci (highest Ho value) are the most discriminating markers (highest PD value) for each population.

Table 8. Components of Genetic Variance for Nine Austronesian and Asian Non-Austronesian Populations

          Intrapopulation Hs                 Interpopulation GST
Locus     Austronesians  Non-Austronesians  Austronesians  Non-Austronesians
D8S1179   0.823938       0.841298           0.026409       0.001787
D21S11    0.830586       0.803820           0.023381       0.001264
D7S820    0.772001       0.765179           0.018428       0.002064
CSF1PO    0.706132       0.736505           0.010057       0.002924
D3S1358   0.700171       0.719401           0.011240       0.003285
TH01      0.704140       0.692647           0.064438       0.008190
D13S317   0.764493       0.798947           0.036658       0.001299
D16S539   0.765375       0.779884           0.037288       0.007956
D2S1338   0.845132       0.866158           0.026127       0.004573
D19S433   0.817751       0.797587           0.015838       0.007098
VWA       0.790375       0.794881           0.026988       0.002716
TPOX      0.602920       0.613072           0.051971       0.009099
D18S51    0.806176       0.862127           0.033837       0.001645
D5S818    0.731385       0.792780           0.034100       0.002064
FGA       0.845817       0.863177           0.033127       0.001464
All loci  0.767093       0.781831           0.029760       0.003691
Although four of the five populations under study (Ami, Atayal, Bali, and Java) clustered within the Asian/Pacific Ocean clade of the neighbor-joining tree (see Figure 2), it is most notable that the Polynesian group from Samoa segregates within the African groups. It is likely that genetic drift resulting from multiple bottleneck events, founder effects, and/or isolation has contributed to a genetic makeup for this Samoan population that does not reflect its ancestry. This rather unexpected result underscores the need to examine the genetic profiles of individual Pacific islands and not to consider them interchangeable for forensic analysis.
A second observation is the larger branch lengths of the Atayal and Samoan populations. These indicate a large degree of genetic differentiation in these two groups, possibly because of migratory bottlenecks, founder effects, many generations of relative isolation, and/or genetic drift. The relative positions of these two populations in the MDS analysis (see Figure 3), most notably that of the Atayal, corroborate this notion. As mentioned previously, it is interesting that the Atayal and Samoans are two of the three groups that lack microvariant alleles.
Table 9. Significant AMOVA Values for Nine Austronesian and Asian Non-Austronesian Populations^a

Linguistic Partitioning                Geographic Partitioning
Among Groups    Among Populations      Among Groups    Among Populations
of Populations  Within Groups          of Populations  Within Groups
D8S1179 (0.37)  D19S433 (3.17)         D8S1179 (0.38)  D19S433 (2.88)
D21S11 (0.89)                          D7S820 (1.01)
D7S820 (1.21)                          CSF1PO (0.99)
TH01 (2.12)                            TH01 (2.61)
D16S539 (1.70)                         D16S539 (1.64)
D2S1338 (1.10)                         D2S1338 (0.98)
VWA (0.50)
TPOX (1.69)
D18S51 (1.05)
D5S818 (1.39)

a. Numbers in parentheses refer to percentage of variance, considered significant when p < 0.05.

Another important observation is that the three Indo-Malaysian island groups in this study (Bali, Java, and Malaysian Malay) segregate in a different branch of the clade, distant from the two Taiwanese aboriginal groups, the Ami and Atayal. From these results it appears that the Austronesian language affiliation of these five groups is not reflected in their present genetic relationship. Analysis of variance components (see Table 8) revealed two related points: (1) The level of interpopulation differentiation among our Austronesian populations is higher than in the Asian non-Austronesian groups, and (2) conversely, the intrapopulation variance is lower in the Austronesians than in the Asian non-Austronesians for most loci. It would be expected that during geographic and cultural isolation, forces such as founder effects, inbreeding, and limited gene flow would reduce within-population variance while acting to augment interpopulation differences.
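The two variance components compared here can be made concrete with a small Python sketch of Nei's gene diversity partition, in which GST = (HT - HS) / HT. This is an unweighted version without sample-size corrections, so it is an assumption that it matches how the Table 8 values were computed. The function names and the two toy populations are hypothetical; only the final ratio uses figures from the "All loci" row of Table 8.

```python
def gene_diversity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for one set of allele frequencies."""
    return 1.0 - sum(f * f for f in freqs.values())


def nei_gst(populations):
    """Return (HS, HT, GST) for a list of allele -> frequency dicts.

    HS is the mean within-population gene diversity; HT is the gene
    diversity of the pooled (unweighted mean) allele frequencies.
    """
    k = len(populations)
    hs = sum(gene_diversity(p) for p in populations) / k
    alleles = set().union(*populations)
    mean_freqs = {a: sum(p.get(a, 0.0) for p in populations) / k for a in alleles}
    ht = gene_diversity(mean_freqs)
    return hs, ht, (ht - hs) / ht


# Two hypothetical populations at a two-allele locus.
pops = [{"8": 0.5, "9": 0.5}, {"8": 0.9, "9": 0.1}]
hs, ht, gst = nei_gst(pops)
print(f"HS={hs:.4f} HT={ht:.4f} GST={gst:.4f}")

# Ratio of the "All loci" GST values in Table 8 (Austronesian / non-Austronesian).
ratio = 0.029760 / 0.003691
print(f"GST ratio: {ratio:.2f}")
```

Lower HS together with higher GST is the pattern that isolation and drift would be expected to produce, which is the reading the paragraph above gives to Table 8.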
An examination of the AMOVA results (see Table 9) for nine Austronesian and Asian non-Austronesian populations indicates that most of the loci exhibit significant genetic partitioning along linguistic and geographic lines among groups of populations. On the other hand, only one locus, D19S433, showed significant correlations with both language and geography among populations within groups. It is likely that high levels of polymorphism and differences in allele frequencies particular to this locus among the nine populations provide the fine resolution necessary to detect genetic variability at the among-populations within-groups level.
Similarly, overall statistically significant correlation along linguistic and geographic partitioning was detected only among groups of populations. It is expected that a greater number of loci will generate significant correlations with linguistic and geographic partitioning among groups of populations rather than among populations within groups because genetic differences are generally greater at the among-populations level. It is worth mentioning that more loci display significant correspondence between genetics and linguistics than between genetics and geography at the among-groups-of-populations level (10 versus 6 loci; see Table 9). These results indicate that division based on language groups is in better agreement with the genetic structure of these nine Austronesian and Asian non-Austronesian populations.
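The locus counts behind this comparison can be tallied directly from the among-groups columns of Table 9. The following Python sketch only restates the table entries as sets; the variable names are illustrative.

```python
# Loci significant among groups of populations, as listed in Table 9.
linguistic = {"D8S1179", "D21S11", "D7S820", "TH01", "D16S539",
              "D2S1338", "VWA", "TPOX", "D18S51", "D5S818"}
geographic = {"D8S1179", "D7S820", "CSF1PO", "TH01", "D16S539", "D2S1338"}

both = linguistic & geographic            # significant under both partitions
language_only = linguistic - geographic   # significant along linguistic lines only
geography_only = geographic - linguistic  # significant along geographic lines only

print(len(linguistic), "linguistic;", len(geographic), "geographic")
print("both:", sorted(both))
print("language only:", sorted(language_only))
print("geography only:", sorted(geography_only))
```

The five shared loci, five language-only loci, and single geography-only locus agree with the counts given in the Results.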
Focusing on a smaller geographic scale, the aboriginal populations of Formosa provide a unique opportunity to dissect the effects of social and cultural relationships on the genetic makeup of neighboring populations. The Ami and Atayal groups represent two of the nine extant indigenous tribes of Taiwan. The Ami inhabit the narrow eastern seacoast plains of the island and represent the largest tribal group, approximately 130,000 in number. The Atayal are the second largest tribe (about 90,000) and reside adjacent to the Ami in the mountainous terrain of northern Taiwan. Historical accounts cite continuous waves of migrants from the Asian mainland who displaced the indigenous tribes and forced them into the less accessible areas of the island, hence leading to their current distribution (Knapp 1980).
A number of previous studies using mitochondrial DNA (Lum et al. 1994; Redd et al. 1995; Melton et al. 1995, 1998) found that the Ami and Atayal aboriginal groups have a large amount of mtDNA sequence homology, suggesting a common ancestral source in central or southern China. However, the investigators also reported evidence of a prolonged isolation from mainland Chinese and other Asian influences in the recent past. Studies based on autosomal (Sewerin et al. 2002; Chow et al. 2005) and mtDNA (Horai et al. 1995) markers have also demonstrated genetic uniqueness among the indigenous groups, implying varying temporal and/or spatial sources for the initial colonization of Formosa. One paternal lineage study found that the Ami stand out from the other aboriginal groups because of their closer genetic association with both South China and the Philippines. This is illustrated by the Ami’s high frequency of haplogroup L in contrast to the other four aboriginal groups, which lack this haplogroup altogether.
The distinctness of aboriginal groups from each other is especially evident in the extremely homogeneous Atayal, whose Y chromosomes are almost entirely of a single haplogroup (haplogroup H) (Capelli et al. 2001). This theme is echoed in older studies using classical markers (Cavalli-Sforza et al. 1994). In the present study both the Ami and the Atayal separate from the other populations from Asia and the Pacific Ocean in both the dendrogram and the MDS plot. In turn, the extreme branch distance of the Ami and, in particular, the Atayal is consistent with the intertribal group differentiation described in the literature. Although the ranges of these two tribes share a border on the northeast of the island, it is likely that regional geographic, cultural, and linguistic barriers played a role in the genetic differentiation of these two Taiwanese aboriginal groups. These findings support previous genetic studies with regard to differences between tribes.
Similar to the Taiwanese tribal groups, the populations of the islands of Bali and Java lie in close proximity within the Indo-Malaysian archipelago, separated by mere miles. Compared with a recent study of the general Javanese population (Othman et al. 2004), which reported a battery of autosomal STRs that coincide with 13 of the loci reported in the present work, our data show no marked differences in allele frequencies. Both Bali and Java belong to the same Western Malayo-Polynesian subgroup of the Austronesian language family and, not surprisingly, segregate together within our neighbor-joining tree and MDS analysis. Also, as could be anticipated, the Malaysian Malay population clusters together in the same subclade with the Bali and the Java groups. Contrary to expectations based on geography alone, however, Bali and Java are clearly more distinct from each other than even the population of Han Chinese is from the Japanese. In the MDS analysis the Javanese group plots closer to the Han Chinese from Malaysia and Taiwan than to its nearest neighbors from Bali. This may be a reflection of admixture of the Javanese population with an influx of Muslim, Indian, and mainland Chinese migrants in the recent past. In historical times Java has been subject to waves of Buddhist, Hindu, and Muslim migrations, generating cultural diversity and possibly genetic heterogeneity. In addition, Java is more than 23 times larger in area than Bali, and it is possible that this sheer difference in area allows a larger effective population size. On the other hand, Bali has remained relatively isolated both culturally and genetically.
Samoa, the most distant population from the Asian mainland examined in the present study, segregates as an outlier in both the dendrogram and MDS analyses. These islands lie near the boundary dividing Polynesia from both Melanesia to the west and Micronesia to the northwest and therefore represent a population at the crossroads of the Pacific Ocean. Our phylogenetic analyses of this Polynesian group indicate no apparent genetic relationship with the other Austronesian populations. In fact, the Samoans segregate at an intermediate position within the African cluster in the neighbor-joining tree and plot closer to the African groups in the MDS analysis. The relatively long branch length in the phylogram and the isolation in the MDS analysis are indicative of the genetic uniqueness of this particular group compared to the other populations examined in this study.
In a previous work using five Y-chromosome STRs, similar results were noted in a Western Samoan population and were attributed to unique allele distributions compared to other Asian and Pacific Ocean groups (Parra et al. 1999). Parra and colleagues (1999) suggested founder effects stemming from the initial Austronesian settlement of Oceania as a likely cause. It is possible that the pronounced genetic distinctiveness of the Samoans, which accounts for their segregation within the African cluster, is the result of extreme genetic drift.
It is well documented that early Polynesians had reached as far as Samoa by means of the northern islands of Tonga by about 3,000 years b.p. Here, the Polynesian language evolved into two different subgroups: the Tongic and Nuclear Polynesian subgroups. The Nuclear Polynesian subgroup contains the Eastern Polynesian and Samoan Outlier languages. Archeological and linguistic data suggest that further colonization of the eastern islands of Polynesia occurred via Samoa between 2,000 and 800 years before present (Bellwood 1978; Kirch 1997). This recent 1,000-year layover in Samoa may have led to the severe genetic distinctness that is observed in both the dendrogram and the MDS plot and paralleled by the documented linguistic subdifferentiation.
Our results also could be interpreted as indicating a considerable genetic contribution to the Samoan gene pool from another source, such as neighboring Melanesia. In this scenario, depending on the proportion of a Papuan genetic component, the Samoans would be expected to appear genetically distinct from the other groups examined in this study. This possibility is supported by the findings of a previous study in which biparental STRs displayed a pattern consistent with an initial Austronesian expansion into Remote Oceania from Southeast Asia followed by a significant amount of gene flow from Near Oceania (Lum et al. 2002). However, although our phylogenetic data on the Samoan population may be thought-provoking, additional relevant Pacific Ocean populations, including the Papuans, need to be studied in order to examine these issues.
Acknowledgments We gratefully acknowledge Laisel Martinez, Diane J. Rowold, and Maria Christina Terreros for their constructive criticism of the manuscript.
Received 13 May 2005; revision received 5 September 2005.
Literature Cited
Alves, C., L. Gusmao, A. Damasceno et al. 2004. Contribution for an African autosomal STR database (AmpFlSTR Identifiler and Powerplex 16 system) and a report on genotypic variations. Forensic Sci. Int. 139:201–205.
Antunez de Mayolo, G., A. Antunez de Mayolo, P. Antunez de Mayolo et al. 2002. Phylogenetics of worldwide human populations as determined by polymorphic Alu insertions. Electrophoresis 23:3346–3356.
Beleza, S., C. Alves, F. Reis et al. 2004. 17 STR (AmpFlSTR Identifiler and Powerplex 16 system) from Cabinda (Angola). Forensic Sci. Int. 141:193–196.
Bellwood, P. 1978. The Polynesians. London: Thames and Hudson.
Bellwood, P. 2001. Early agriculturalist population diasporas? Farming, languages and genes. Annu. Rev. Anthropol. 30:181–207.
Bosch, E., F. Calafell, A. Perez-Lezaun et al. 2000. Genetic structure of northwest Africa revealed by STR analysis. Eur. J. Hum. Genet. 8:360–366.
Bowcock, A., A. Ruiz-Linares, J. Tomfohrde et al. 1994. High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368:455–457.
Brenner, C., and J. Morris. 1990. Paternity index calculations in single locus hypervariable DNA probes: Validation and other studies. In Proceedings for the International Symposium on Human Identification 1989. Madison, WI: Promega, 21–53.
Butler, J. M. 2001. Forensic DNA Typing: Biology and Technology Behind STR Markers. London: Academic Press.
Butler, J. M., R. Schoske, P. M. Vallone et al. 2003. Allele frequencies for 15 autosomal STR loci on U.S. Caucasian, African American, and Hispanic populations. J. Forensic Sci. 48(4):908–911.
Capelli, C., J. F. Wilson, M. Richards et al. 2001. A predominantly indigenous paternal heritage for the Austronesian-speaking peoples of insular Southeast Asia and Oceania. Am. J. Hum. Genet. 68:432–443.
Carmody, G. 1990. G-Test. Ottawa, Canada: Carleton University.
Cavalli-Sforza, L. L., P. Menozzi, and A. Piazza. 1994. The History and Geography of Human Genes. Princeton, NJ: Princeton University Press.
Chiurillo, M. A., A. Morales, A. M. Mendes et al. 2003. Genetic profiling of a central Venezuelan population using 15 STR markers that may be of forensic importance. Forensic Sci. Int. 136:99–101.
Chow, R. A., J. L. Caeiro, S. J. Chen et al. 2005. Genetic characterization of four Austronesian-speaking populations. J. Hum. Genet. (in press).
Collins, P. J., L. K. Hennessy, C. S. Leibelt et al. 2004. Developmental validation of a single-tube amplification of the 13 CODIS STR loci, D2S1338, D19S433 and amelogenin: The AmpFlSTR Identifiler PCR Amplification kit. J. Forensic Sci. 49(6):1265–1277.
Decorte, R., M. Engelen, L. Larno et al. 2004. Belgian population data for 15 STR loci (AmpFlSTR SGM Plus and AmpFlSTR Profiler PCR amplification kit). Forensic Sci. Int. 139:211–213.
Excoffier, L., P. E. Smouse, and J. M. Quattro. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA restriction data. Genetics 131:479–491.
Felsenstein, J. 2002. Phylogeny Inference Package (PHYLIP), Version 3.6a3. Distributed by author. Seattle: Department of Genetics, University of Washington.
Guo, S., and E. Thompson. 1992. Performing the exact test of Hardy-Weinberg proportion for multiple alleles. Biometrics 48:361–372.
Hashiyada, M., Y. Itakura, T. Nagashima et al. 2003. Polymorphism of 17 STRs by multiplex analysis in Japanese population. Forensic Sci. Int. 133:250–253.
Horai, S., K. Hayasaka, R. Kondo et al. 1995. Recent African origin of modern humans revealed by complete sequences of hominoid mitochondrial DNAs. Proc. Natl. Acad. Sci. USA 92:532–536.
Jones, D. A. 1972. Blood samples: Probability of discrimination. J. Forensic Sci. Soc. 12:355–359.
Jorde, L. B., M. J. Bamshad, W. S. Watkins et al. 1995. Origins and affinities of modern humans: A comparison of mitochondrial and nuclear genetic data. Am. J. Hum. Genet. 57:523–538.
Jorde, L. B., A. R. Rogers, M. Bamshad et al. 1997. Microsatellite diversity and the demographic history of modern humans. Proc. Natl. Acad. Sci. USA 94:3100–3103.
Kirch, P. V. 1997. The Lapita Peoples: Ancestors of the Oceanic World. Cambridge, MA: Blackwell.
Knapp, R. G. 1980. China’s Final Frontier: Studies in the Historical Geography of Taiwan. Taipei: SMC Publishing.
Leibelt, C., B. Budowle, P. Collins et al. 2003. Identification of a D8S1179 primer binding site mutation and the validation of a primer designed to recover null alleles. Forensic Sci. Int. 133(3):220–227.
Levene, H. 1949. On a matching problem arising in genetics. Ann. Math. Stat. 20:91–94.
Li, C. C. 1976. First Course in Population Genetics. Pacific Grove, CA: Boxwood Press.
Lum, J. K., L. B. Jorde, and W. Schiefenhovel. 2002. Affinities among Melanesians, Micronesians, and Polynesians: A neutral biparental genetic perspective. Hum. Biol. 74(3):413–430.
Lum, J. K., O. Rickards, C. Ching et al. 1994. Polynesian mitochondrial DNAs reveal three deep maternal lineage clusters. Hum. Biol. 66:567–590.
Melton, T., S. Clifford, J. J. Martinson et al. 1998. Genetic evidence for the Proto-Austronesian homeland in Asia: mtDNA and nuclear DNA variation in Taiwanese aboriginal tribes. Am. J. Hum. Genet. 63:1807–1823.
Melton, T., R. Peterson, A. J. Redd et al. 1995. Polynesian genetic affinities with Southeast Asian populations as identified by mtDNA analysis. Am. J. Hum. Genet. 57:403–414.
Nei, M. 1987. Molecular Evolutionary Genetics. New York: Columbia University Press.
Ota, T. 1993. DISPAN: Genetic Distance and Phylogenetic Analysis. University Park, PA: Institute of Molecular Evolutionary Genetics, Pennsylvania State University.
Othman, M. I., L. H. Seah, S. Panneerchelvan et al. 2004. Allele frequencies for the PowerPlex 16 STR loci in Javanese population from Malaysia. J. Forensic Sci. 149(1):190–191.
Parra, E., M. D. Shriver, A. Soemantri et al. 1999. Analysis of five Y-specific microsatellite loci in the Asian and Pacific populations. Am. J. Phys. Anthropol. 110(1):1–16.
Perez-Miranda, A. M., M. A. Alfonso-Sanchez, A. Kalantar et al. 2005. Allelic frequencies of 13 STR loci in autochthonous Basques from the province of Vizcaya (Spain). Forensic Sci. Int. 152(2–3):259–262.
Redd, A. J., N. Takezaki, S. T. Sherry et al. 1995. Evolutionary history of the COII/tRNALys intergenic 9 base pair deletion in human mitochondrial DNAs from the Pacific. Mol. Biol. Evol. 12:604–615.
Reynolds, J., B. S. Weir, and C. C. Cockerham. 1983. Estimation of the coancestry coefficient: Basis for a short term genetic distance. Genetics 105:767–779.
Rowold, D. J., and R. J. Herrera. 2003. Inferring recent human phylogenies using forensic STR technology. Forensic Sci. Int. 133:260–265.
Schneider, S., J.-M. Kueffer, D. Roessli et al. 2000. Arlequin v. 2000: A Software for Population Genetics Data Analysis. Geneva: Genetics and Biometry Laboratory, University of Geneva.
Seah, L. H., N. H. Jeevan, M. I. Othman et al. 2003. STR data for the AmpFlSTR Identifiler loci in three ethnic groups (Malay, Chinese, Indian) of the Malaysian population. Forensic Sci. Int. 138:134–137.
Sewerin, B., F. J. Cuza, M. N. Szmulewicz et al. 2002. On the genetic uniqueness of the Ami aborigines of Formosa. Am. J. Phys. Anthropol. 119:240–248.
Shepard, E. M., and R. J. Herrera. 2005. Iranian STR variation at the fringes of biogeographical demarcation. Forensic Sci. Int. (in press).
SPSS Inc. 2001. SPSS for Windows, Release 11.0.1. Chicago: SPSS Inc.
Szczerkowska, Z., E. Kapinska, J. Wysocka et al. 2004. Northern Polish population data and forensic usefulness of 15 autosomal STR loci. Forensic Sci. Int. 144:69–71.
Tereba, A. 1999. Tools for analysis of population statistics. In Profiles in DNA, I. MacIver, ed. Madison, WI: Promega Corporation, v. 2, pp. 14–16.
Underhill, P. A. 2004. A synopsis of extant Y chromosome diversity in East Asia and Oceania. In The Peopling of East Asia: Putting Together Archeology, Linguistics, and Genetics, L. Sagart, R. Blench, and A. Sanchez-Mazas, eds. London: Routledge Curzon, ch. 17, pp. 301–319.
Wang, C.-W., D.-P. Chen, C.-Y. Chen et al. 2003. STR data for the AmpFlSTR SGM Plus and Profiler loci from Taiwan. Forensic Sci. Int. 138:119–122.
Watkins, W. S., A. R. Rogers, C. T. Ostler et al. 2003. Genetic variation among world populations: Inferences from 100 Alu insertion polymorphisms. Genome Res. 13:1607–1618.