

ORIGINAL ARTICLE

High-throughput sequencing for community analysis: the promise of DNA barcoding to uncover diversity, relatedness, abundances and interactions in spider communities

Susan R. Kennedy 1 & Stefan Prost 2,3 & Isaac Overcast 4,5 & Andrew J. Rominger 6 & Rosemary G. Gillespie 7 & Henrik Krehenwinkel 8

Received: 28 November 2019 / Accepted: 29 January 2020 / Published online: 10 February 2020
© The Author(s) 2020

Abstract

Large-scale studies on community ecology are highly desirable but often difficult to accomplish due to the considerable investment of time, labor, and money required to characterize richness, abundance, relatedness, and interactions. Nonetheless, such large-scale perspectives are necessary for understanding the composition, dynamics, and resilience of biological communities. Small invertebrates play a central role in ecosystems, occupying critical positions in the food web and performing a broad variety of ecological functions. However, it has been particularly difficult to adequately characterize communities of these animals because of their exceptionally high diversity and abundance. Spiders in particular fulfill key roles as both predator and prey in terrestrial food webs and are hence an important focus of ecological studies. In recent years, large-scale community analyses have benefitted tremendously from advances in DNA barcoding technology. High-throughput sequencing (HTS), particularly DNA metabarcoding, enables community-wide analyses of diversity and interactions at unprecedented scales and at a fraction of the cost that was previously possible. Here, we review the current state of the application of these technologies to the analysis of spider communities. We discuss amplicon-based DNA barcoding and metabarcoding for the analysis of community diversity and molecular gut content analysis for assessing predator-prey relationships. We also highlight applications of third generation sequencing technology for long-read and portable DNA barcoding. We then address the development of theoretical frameworks for community-level studies, and finally highlight critical gaps and future directions for DNA analysis of spider communities.

Keywords Metabarcoding . Portable sequencing . Third generation sequencing . Gut content analysis . Community assembly

This article is part of the Special Issue “Crossroads in Spider Research - evolutionary, ecological and economic significance”

Communicated by Matthias Pechmann

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s00427-020-00652-x) contains supplementary material, which is available to authorized users.

* Henrik Krehenwinkel

1 Biodiversity and Biocomplexity Unit, Okinawa Institute of Science and Technology, Onna, Okinawa, Japan

2 LOEWE-Centre for Translational Biodiversity Genomics, Senckenberg Museum, Frankfurt, Germany

3 National Zoological Garden, South African National Biodiversity Institute, Pretoria, South Africa

4 Graduate Center of the City University New York, New York, NY, USA

5 Ecole Normale Supérieure, Paris, France

6 Santa Fe Institute, Santa Fe, NM, USA

7 Environmental Sciences Policy and Management, University of California Berkeley, Berkeley, CA, USA

8 Department of Biogeography, Trier University, Trier, Germany

Development Genes and Evolution (2020) 230:185–201
https://doi.org/10.1007/s00427-020-00652-x


Introduction

Ecological communities are defined by both the organisms that persist within habitats, and the interactions that shape the assembly and diversity patterns of these organisms. Historically, characterizations of abundance, richness, relatedness, and interactions across entire communities have been limited to taxa that are readily identifiable or have been done on a sufficiently small scale that the laborious process of quantifying all community members and their interactions has been feasible (Gruner 2004; Krushelnycky et al. 2007). Predator-prey interactions have largely been based on observation (Binford 2001; Hiruki et al. 1999), detailed morphological examination of gut contents (Grey et al. 2002; Lafferty and Page 1997), or the analysis of stable isotope data (Wise et al. 2006). The recent advent of molecular metabarcoding approaches is just starting to revolutionize our ability to characterize biological communities (Cristescu 2014). In particular, data on small invertebrates, which make up the foundation of food webs and play central roles in ecosystem function, can be obtained on larger scales and in greater detail than ever before. We use spiders, some of the most phylogenetically and ecologically diverse predators on Earth (Foelix 2011), to illustrate the potential of such approaches for understanding community assembly.

In the last two decades, DNA barcoding, the sequencing of short species-specific amplicons, has considerably simplified community analyses (Hebert et al. 2003). DNA barcodes can provide information on genetic variation within and between species, rapidly assign taxonomic status across divergent lineages (Hebert and Gregory 2005), and identify the prey composition of predators’ gut contents (Agustí et al. 2003; Greenstone and Shufran 2003). However, traditional Sanger sequencing-based DNA barcoding protocols can be prohibitively expensive and laborious when large community samples have to be processed. The emergence of high-throughput sequencing (HTS) technologies has been a significant step forward in recent years, greatly reducing both the cost and the labor required for biodiversity studies (Bohmann et al. 2014; Taberlet et al. 2012). These technologies enable simultaneous processing of DNA barcodes for thousands of specimens (Shokralla et al. 2015; Srivathsan et al. 2019; Meier et al. 2016) with considerably improved phylogenetic resolution (Krehenwinkel et al. 2019a). Metabarcoding makes it possible to characterize the species composition of whole communities (Cristescu 2014; Yu et al. 2012) and identify the makeup of predators’ diets in unprecedented detail (Piñol et al. 2014; Verschut et al. 2019). Recent developments even enable mobile DNA barcoding under remote field conditions (Menegon et al. 2017; Pomerantz et al. 2018).

Here, we provide an overview of available HTS-based methods, focusing specifically on the use of genetic and genomic data for characterizing community structure and function in spiders. Within this context, we first discuss DNA barcoding for taxonomic and phylogenetic assignments, metabarcoding for community analysis, recent developments in long-read sequencing technology, and portable field barcoding solutions. We then review the application of DNA barcoding for gut content analysis to assess predator-prey associations. We additionally discuss the development of theoretical models to apply to DNA-based community analyses. Finally, we address promising avenues of future research.

DNA barcoding and metabarcoding for community analysis

DNA barcoding in spiders: An overview

As major predators of invertebrates, spiders are a central element of terrestrial food webs and perform key roles in community function and assembly (Nyffeler and Birkhofer 2017). They provide important ecosystem services, such as pest control (Riechert and Lockley 1984; Thomson and Hoffmann 2010), and at the same time make up much of the diet of higher order predators such as birds (Nyffeler et al. 2018). Most habitats harbor diverse communities of spiders with complex ecological interrelationships (Kennedy et al. 2019; Raso et al. 2014). Consequently, the diversity of spiders and their manifold interactions with other species must be understood in order to characterize community assembly in terrestrial ecosystems. Spider communities are usually composed of ecologically and morphologically distinct taxa, stemming from deeply divergent evolutionary lineages (Cardoso et al. 2011). The identification of different groups often requires specialized taxonomic expertise, a skillset which is rapidly disappearing (Agnarsson and Kuntner 2007).

The task of characterizing entire spider communities has been greatly simplified by DNA barcoding (Barrett and Hebert 2005; Čandek and Kuntner 2015; Crespo et al. 2018; Fig. 1). DNA barcoding of spiders is usually based on the 650-bp “barcode region” of the mitochondrial COI gene, which provides good taxonomic resolution in this group (Čandek and Kuntner 2015). As part of the mitochondrial genome, COI is maternally inherited and not affected by recombination. Mitochondria occur in most tissues in high copy numbers and are thus easily accessible for PCR amplification, even in degraded samples. The gene also evolves relatively quickly, making it suitable to distinguish even recently diverged species and recover intraspecific variation (Hajibabaei et al. 2007).

DNA barcodes garnered much enthusiasm after their initial establishment, with some authors even suggesting that traditional taxonomic methods be entirely replaced with a DNA sequence divergence-based taxonomy (Meierotto et al. 2019; Tautz et al. 2003). However, a divergent barcode sequence recovered from an unknown specimen is not enough to indicate the species status (Moritz and Cicero 2004; Obertegger et al. 2018). Instead, DNA barcodes can serve as a valuable complement to traditional taxonomy by facilitating the identification of divergent lineages, including cryptic species (Wang et al. 2018). This has proven to be useful in some spiders with ambiguous morphological differentiation, which show sufficiently deep genetic divergence to be considered separate species (Leavitt et al. 2015; Starrett and Hedin 2007; Crespo et al. 2018).

Fig. 1 Summary of applications of Illumina amplicon sequencing for DNA barcoding and metabarcoding of spiders. A) In individual DNA barcoding, DNA from each specimen is extracted, then the desired fragment is amplified in a PCR and tagged with a unique combination of index barcodes before all samples are pooled and sequenced. B) In bulk metabarcoding, DNA extraction is performed on pools of multiple specimens. This greatly reduces the number of PCRs and index combinations needed per specimen. C) For molecular gut content analysis, DNA is extracted from individual predator specimens, and PCR primers are chosen to amplify prey taxa while minimizing amplification of the predator itself.

Multiple primers with various levels of taxonomic specificity are available for DNA barcoding of spiders (Blagoev et al. 2016; Krehenwinkel et al. 2018; Supplementary Table 1). The resulting barcode sequences can be compared to reference databases to assign specimens to species (Barrett and Hebert 2005; Robinson et al. 2009). Alternatively, the presence of a so-called DNA barcode gap between interspecific and intraspecific genetic divergences in the COI gene (Hebert et al. 2004a, b) can be used to identify putative species from the DNA barcode data. There is no universal rule for genetic distances to warrant “species” status; instead, the barcode gap must be evaluated on a lineage-specific basis. This approach was demonstrated to work well in orb-weaver and wolf spiders (Čandek and Kuntner 2015). Automated approaches aiding in the discovery of barcode gaps and resulting species are available (Puillandre et al. 2012). Yet another approach is the grouping of barcodes from a community into clusters of similarity, the so-called operational taxonomic units (OTUs) (Edgar 2013). These clusters, usually based on a maximum sequence divergence of 3%, are then treated as biological entities. Even though OTU clusters do not necessarily correspond to actual species, this approach can be very useful when reference libraries are incomplete (Dopheide et al. 2019) and large numbers of sequences need to be processed. Clustering approaches can also be phylogenetically informed, resulting in more accurate approximations of real species (Fujita et al. 2012; Zhang et al. 2013).
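The barcode gap idea can be illustrated with a minimal Python sketch. It assumes pre-aligned sequences of equal length, uses uncorrected p-distances, and runs on a made-up 20-bp toy alignment; the function names and species labels are hypothetical and are not taken from any of the cited tools.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def barcode_gap(records):
    """records: list of (species_label, aligned_sequence) pairs.
    Returns (largest intraspecific distance, smallest interspecific distance)."""
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra), min(inter)

# Toy, made-up alignment (hypothetical data, not from the article)
records = [
    ("sp_A", "ATTAATTTTATTTCCACAAT"),
    ("sp_A", "ATTAATTTTATTTCCACAAC"),
    ("sp_B", "ATTGATCTTCTTTCTACTAT"),
    ("sp_B", "ATTGATCTTCTTTCTACTAC"),
]
max_intra, min_inter = barcode_gap(records)
# A gap exists when every between-species distance exceeds every within-species distance
has_gap = min_inter > max_intra
```

In practice, the threshold separating the two distributions would have to be assessed for each lineage, as noted above, rather than fixed in advance.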

Problems with DNA barcoding, and approaches to mitigate problems

A significant obstacle to DNA barcoding is the incompleteness of the barcode reference libraries. Because many species have not yet been added to these libraries, often specimens can only be identified to a relatively coarse taxonomic level such as order or family. Even though substantial contributions to reference databases have been made in recent years (Astrin et al. 2016; Blagoev et al. 2016), given the sheer taxonomic diversity of spiders, a large proportion of species is still not represented. There are two major bottlenecks in the generation of barcode reference libraries. First, the identification of species is time-consuming and requires taxonomic expertise, which may not be available (Agnarsson and Kuntner 2007). Misidentifications or sequencing of contaminant DNA can then lead to erroneously assigned barcode sequences in the database. Second, many species are rare or difficult to collect, and thus represented by little more than type material. Museum collections are therefore an indispensable resource for DNA barcoding. In spiders, this is particularly feasible because the standard storage medium for spiders, ethanol, is an effective DNA preservative. Only slight modifications of DNA extraction and PCR protocols are needed to recover the reduced and fragmented DNA from historical specimens (Krehenwinkel and Pekár 2015; Miller et al. 2013). Several primer combinations are available to target short, so-called mini-barcodes, which are suitable for amplification of older specimens (Supplementary Table 1). In spiders, barcode analysis of historical specimens has provided valuable insights into the taxonomic assignment of species (Cotoras et al. 2017) and historical changes in genetic variation (Krehenwinkel and Tautz 2013).

Another challenge for DNA barcoding is that short mitochondrial amplicons such as COI can yield biased biodiversity assessments when used in isolation (Krehenwinkel et al. 2018). Mitochondrial divergence patterns do not necessarily parallel species divergence but are influenced by numerous different factors. For example, male-biased gene flow can lead to highly divergent mitochondrial genomes in the absence of nuclear differentiation (Krehenwinkel et al. 2016). Conversely, introgression can result in complete homogenization of the mitochondrial gene pools, despite divergent nuclear genomes (Irwin et al. 2009). Infections with endosymbiotic bacteria can mimic various demographic scenarios of over- and under-differentiation of mitochondrial genomes compared to the nuclear background (Hurst and Jiggins 2005). Moreover, nuclear mitochondrial pseudogenes (NUMTs) can be recovered as barcode sequences, leading to incorrect taxonomic assignments and biased phylogenetic inferences (Bensasson et al. 2001). To avoid these pitfalls, it is often recommended to use multiple loci for DNA barcoding (Dupuis et al. 2012). Information from the unlinked loci in the nuclear genome is particularly important and can aid with DNA barcode-assisted taxonomic discoveries, e.g., for testing hypotheses on cryptic species (Satler et al. 2013). Although many popular nuclear markers evolve much more slowly than COI, they still show comparable patterns of genetic divergence when intraspecific and interspecific divergence rates are compared (Supplementary Fig. 1). Multilocus data can also increase the phylogenetic resolution of DNA barcoding, which is very limited when analyses are based on a single mitochondrial amplicon (Krehenwinkel et al. 2018).

High throughput sequencing-based DNA barcoding

DNA barcoding is traditionally based on Sanger sequencing, requiring separate sequencing reactions for every sample. With a total cost of $5–10 per sequence, this method can be prohibitively expensive for community-level studies. A cost-efficient alternative is high-throughput amplicon sequencing (Kozich et al. 2013). Illumina technology, for example the MiSeq, with its maximum read length of 2 × 300 bp, is highly suitable for DNA barcode generation (Shokralla et al. 2015). Due to limitations in read length, HTS-based DNA barcoding usually relies on shorter amplicons than the complete 650 bp barcode (Leray et al. 2013). Alternatively, the complete barcode can be recovered by sequencing multiple overlapping amplicons.
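The read-length constraint can be made explicit with a short worked calculation. The Python sketch below assumes that a read pair can only be merged if the two reads overlap by some minimum number of bases; the default of 20 bp and the 400-bp test amplicon are illustrative assumptions, not values from the article, and quality trimming of read ends is ignored.

```python
def max_mergeable_amplicon(read_len, min_overlap):
    """Longest amplicon that a read pair can span while still overlapping."""
    return 2 * read_len - min_overlap

def can_merge(amplicon_len, read_len=300, min_overlap=20):
    """True if paired reads of read_len can be merged across the amplicon.
    min_overlap=20 is an assumed, illustrative value."""
    return amplicon_len <= max_mergeable_amplicon(read_len, min_overlap)

# 2 x 300 bp reads cover at most 600 bp, so the full 650-bp barcode cannot be
# merged from a single read pair, whereas a shorter (hypothetical 400-bp) amplicon can.
full_barcode_ok = can_merge(650)
short_amplicon_ok = can_merge(400)
```

Under these assumptions, the full barcode would have to be split into at least two overlapping amplicons, in line with the alternative described above.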

Illumina amplicon sequencing is distinguished by a very simple library preparation process. Most commonly, a two-step PCR is used, in which the target sequence is amplified in the first round of PCR (Fig. 2). Dual indexes for unique sample tagging and the necessary adapters for sequencing are then incorporated in the second round of PCR (Lange et al. 2014). This approach accommodates thousands of samples in a single sequencing run. Multiplex PCRs targeting multiple unlinked loci can additionally reduce the necessary number of PCRs (Krehenwinkel et al. 2018; Macías-Hernández et al. 2018), and inline barcodes attached to the first-round PCR primers allow for a further increase in sample number (Sternes et al. 2017). Alternatively, fusion primers including sample tags and sequencing adapters can be used. This allows library preparation to be accomplished in a single PCR (Kozich et al. 2013; Fadrosh et al. 2014) but limits the flexibility to target multiple amplicons. Further reductions in processing cost can be achieved by limiting the number of DNA extractions. This can be done by pooling specimens of divergent lineages, then performing bulk extractions (de Kerdrel et al. 2020). PCR and library preparation are then performed on the pooled extract, and the resultant sequences are assigned back to their specimens using a reference database. One other option is to omit DNA extraction entirely and instead use direct PCR (Wong et al. 2014). Here, specimens are dropped directly into the PCR buffer, and the traces of DNA they release are sufficient for barcode amplification. This method has been tested and established in different insect groups (Thongjued et al. 2019; Yeo et al. 2018) but has yet to be optimized for spiders.
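How the levels of indexing multiply the number of samples that can share a run can be sketched in a few lines of Python. The index names and counts below (12 forward indexes, 8 reverse indexes, 4 inline barcodes) are hypothetical and chosen only for illustration.

```python
from itertools import product

def sample_tags(forward_indexes, reverse_indexes, inline_barcodes=("",)):
    """All unique (forward index, reverse index, inline barcode) combinations.
    With the default single empty inline barcode, this is plain dual indexing."""
    return list(product(forward_indexes, reverse_indexes, inline_barcodes))

# Hypothetical index sets; names are placeholders, not real index sequences
forward = [f"F{i:02d}" for i in range(1, 13)]   # 12 forward indexes
reverse = [f"R{i:02d}" for i in range(1, 9)]    # 8 reverse indexes
inline = ["in1", "in2", "in3", "in4"]           # 4 inline barcodes

dual_only = sample_tags(forward, reverse)             # 12 * 8 combinations
with_inline = sample_tags(forward, reverse, inline)   # 12 * 8 * 4 combinations
```

Each additional indexing level multiplies, rather than adds to, the number of distinguishable samples, which is why a small primer set can tag thousands of specimens.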

By limiting the number of DNA extractions, using multiplex PCRs, and using multiple levels of sample indexing, barcodes can now be generated at a cost of $0.2–1 each and with a considerable reduction of workload (de Kerdrel et al. 2020; Meier et al. 2016; Srivathsan et al. 2019). This enables researchers to generate barcode sequences for thousands of specimens, thereby allowing estimates of both abundance and taxonomic richness within a community (see below). Recent reductions in cost have even led to the suggestion of a reverse DNA barcoding workflow, in which all specimens in a collection are barcoded and only divergent lineages are selected for further morphological analysis (Wang et al. 2018).

Using multiplex PCR approaches, sequences for multiple independent loci can be generated in parallel, greatly improving the phylogenetic resolution of the generated data. Knowing the evolutionary relationships among taxa in a community is critical for understanding the processes underlying community assembly (Barker 2002). Currently, phylogenetic analyses often rely on information from hundreds or thousands of loci, for example, inferred from whole transcriptome sequencing (Foley et al. 2019) or the targeted enrichment of ultra-conserved elements (Kulkarni et al. 2020). While such data offer unprecedented phylogenetic resolution, their generation is expensive and laborious. These methods are thus not feasible for phylogenetic analyses at the community level, where information for thousands of specimens has to be generated in parallel. This makes multiplexed amplicon sequencing an attractive alternative to phylogenomic approaches for community phylogenetic analyses.

Fig. 2 Dual indexing strategies for Illumina sequencing. A) Library preparation can be accomplished in two separate PCRs. In the first PCR, the DNA barcode specific primers contain added tails (in brown). Second PCR primers then bind to those tails and incorporate unique barcode identifiers as well as sequencing adapters to each sample. A unique barcode combination is used for each sample. B) Throughput can be increased with the use of inline barcodes (light green and indigo) attached to the 5′-end of the first-round PCR primer. They multiply the number of unique barcode combinations available. Here, we show inline barcodes only on the forward primer, but they can also be incorporated into the reverse primer. C) Fusion primers can be used so that only one round of PCR is necessary. The desired fragment is amplified and indexed simultaneously.

Community metabarcoding

Reductions in cost and processing effort have made DNA barcoding suitable even for the analysis of large community samples. However, processing all specimens individually still amounts to a considerable workload and cost. Metabarcoding offers a simple alternative to single-specimen DNA barcoding and is therefore quickly increasing in popularity (Gibson et al. 2014; Yu et al. 2012). In metabarcoding, bulk samples are extracted and DNA barcode sequences generated for the pooled community. Based on the sequence similarity, the recovered barcodes are then clustered into OTUs. Community diversity is estimated based on the number of recovered OTU sequences. In order to achieve a comprehensive taxon recovery in metabarcoding experiments, the use of more than one amplicon is advisable (Krehenwinkel et al. 2018; Zhang et al. 2018). Recent developments in clustering algorithms (Edgar 2018) also allow the inference of haplotypic information from bulk metabarcoding data. This way, even the intraspecific genetic variation can be estimated within whole biological communities (Elbrecht et al. 2018).
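The clustering step can be sketched as a simple greedy, abundance-ordered procedure. This is a deliberately simplified illustration and not the algorithm of any cited software: it assumes equal-length, pre-aligned sequences, uses uncorrected p-distances, and applies the commonly used 3% threshold. The helper `mutate` and the toy read counts are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def greedy_otus(seq_counts, threshold=0.03):
    """Assign each unique sequence (most abundant first) to the first OTU
    centroid within `threshold`; otherwise it founds a new OTU.
    Returns a dict mapping centroid sequence -> total read count."""
    otus = {}
    for seq, n in sorted(seq_counts.items(), key=lambda kv: -kv[1]):
        for centroid in otus:
            if p_distance(seq, centroid) <= threshold:
                otus[centroid] += n
                break
        else:
            otus[seq] = n  # no centroid close enough: start a new OTU
    return otus

def mutate(seq, positions):
    """Toy helper: swap the base at each given position (A<->C, G<->T)."""
    swap = {"A": "C", "C": "A", "G": "T", "T": "G"}
    return "".join(swap[b] if i in positions else b for i, b in enumerate(seq))

base = "ACGT" * 25                                   # 100-bp toy sequence
close_variant = mutate(base, {3, 40})                # 2% divergent
distant_variant = mutate(base, set(range(0, 100, 10)))  # 10% divergent
reads = {base: 50, close_variant: 5, distant_variant: 20}
otus = greedy_otus(reads)
```

In this toy example the 2%-divergent variant is absorbed into the most abundant sequence, while the 10%-divergent one founds its own OTU, so richness is estimated as two OTUs.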

Due to its speed, accuracy, and cost efficiency, metabarcoding is now often the method of choice for arthropod community analysis. A major drawback of this approach, however, is that it only yields a list of OTU sequences, which cannot be linked back to individual specimens because the DNA is bulk-extracted from mixed samples. As sequences cannot be assigned back to specimens, individual sequences from multilocus barcoding cannot be linked together, limiting the phylogenetic resolution of this approach unless the specimens can be linked to an extensive reference library (de Kerdrel et al. 2020). Metabarcoding can also lead to inflated diversity estimates, as spurious sequences coamplify with the targeted specimen’s DNA barcodes. These include NUMTs, non-target species such as parasitic fungi or nematodes associated with the specimen, and chimeras resulting from linking of incomplete PCR products of different taxa (Elbrecht et al. 2017). Chimeras can be removed efficiently with appropriate software solutions (Edgar 2013) and non-target taxa by comparing the resulting data against a reference database. However, the removal of NUMTs is more challenging and has not been fully resolved, especially when NUMTs retain an intact reading frame.
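One simple screen for NUMTs, and its limitation, can be sketched as a reading-frame check. The Python sketch below flags a sequence as a NUMT candidate only if all three forward frames contain a stop codon under the invertebrate mitochondrial genetic code (NCBI translation table 5, in which TAA and TAG are stops and TGA encodes tryptophan). The function names and toy sequences are hypothetical, and, as noted above, NUMTs with an intact reading frame would pass such a filter undetected.

```python
# Stop codons in the invertebrate mitochondrial code (NCBI translation table 5)
STOP_CODONS = {"TAA", "TAG"}

def has_stop(seq, frame):
    """True if reading `seq` in the given frame (0, 1, or 2) hits a stop codon."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return any(codon in STOP_CODONS for codon in codons)

def open_frames(seq):
    """Forward frames without any stop codon."""
    seq = seq.upper()
    return [frame for frame in range(3) if not has_stop(seq, frame)]

def is_numt_candidate(seq):
    """Flag sequences that are interrupted by stop codons in every forward frame."""
    return not open_frames(seq)

# Toy sequences (made up): TGA is not a stop in this code, so the first stays open
intact = "ATGGCTTGAGGA"
interrupted = "TAAATAGCTAAG"
```

A real pipeline would typically combine such a check with the known frame of the amplicon and additional criteria such as length variation or read abundance.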

Another issue with metabarcoding is that many current protocols are performed destructively, i.e., specimens are crushed in order to maximize the amount of DNA released for extraction. This inevitably leads to the loss of morphological information. This issue, however, can be circumvented by subsampling tissue from specimens before extraction. In spiders, it is common to extract DNA from one or more legs while leaving the rest of the specimen intact (Gillespie et al. 2018; Krehenwinkel et al. 2018). In addition, several nondestructive protocols have been developed to isolate DNA from specimens without compromising their morphological integrity, e.g., via a brief soak in lysis buffer (Andersen and Mills 2012; Porco et al. 2010). DNA may even be extracted directly from the collection medium (e.g., ethanol), because specimens leave trace amounts of DNA in the medium (Hajibabaei et al. 2012; Martins et al. 2019). However, the DNA recovered from nondestructive extraction of community samples may be biased toward those taxonomic groups that release DNA more readily than others, e.g., soft-bodied animals (Carew et al. 2018; Marquina et al. 2019).

Quantitative metabarcoding and PCR-free metagenomics

While metabarcoding has been able to provide a highly accurate overview of a community’s species composition, it has not been possible to obtain accurate measures of abundance using this approach. This is because varying PCR efficiencies between different taxa inevitably lead to biased recovery of species abundances, sometimes by several orders of magnitude (Elbrecht and Leese 2015). Besides simple primer-template mismatches (Piñol et al. 2015), the GC content of the template and even the polymerase type can bias taxon recovery (Nichols et al. 2018). The effects of these biases are condition dependent and difficult to quantify. Even the effect of primer-template mismatches does not simply accumulate with mismatch number. Instead, position and type of mismatch can have widely different effects on amplification efficiency (Kwok et al. 1990). Nevertheless, the accurate quantification of relative abundances of taxa is critical for many biodiversity analyses, and much effort is therefore being made to optimize methods for quantitative metabarcoding (Krehenwinkel et al. 2017a; Piñol et al. 2019; Saitoh et al. 2016).

To overcome the amplification biases of metabarcoding, PCR-free approaches have been suggested (Jones et al. 2015). The simplest is shotgun sequencing of bulk samples, after which the generated reads are processed and compared against a reference database, an approach called metagenomics. However, this may not work well in spiders because very few spider genomes (nine, representing seven families) are currently available (Supplementary Table 2), making it difficult to identify the majority of sequences. A refinement of this method is “genome skimming,” in which mitochondrial sequences are filtered from the recovered reads after shotgun sequencing (Papadopoulou et al. 2015). The filtered reads can then be assembled into longer contigs, sometimes even spanning the whole mitochondrial genome, allowing community analysis with considerable phylogenetic support (Crampton-Platt et al. 2016). However, although mitochondria are abundant in cells, mitochondrial DNA sequences usually do not exceed 1% of the read population of genomic libraries (Zhou et al. 2013). Genome skimming thus requires a very high sequencing coverage. Capture assays are another option: DNA barcode probes are used to capture barcode sequences from a community sample, allowing sequencing without PCR amplification (Shokralla et al. 2016). However, hybridization bias due to probe-target dissimilarities may also result in skewed abundance estimates; furthermore, a capture approach adds considerable cost and workload.
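The sequencing burden implied by the roughly 1% mitochondrial read share can be illustrated with a back-of-the-envelope calculation. The sketch below is only an illustration: the function name, the 15-kb mitogenome length, the 150-bp read length, and the `taxon_fraction` parameter are assumptions chosen for the example and are not taken from the studies cited above.

```python
def total_reads_needed(target_coverage, mito_length_bp, read_length_bp,
                       mito_fraction=0.01, taxon_fraction=1.0):
    """Estimate the shotgun reads needed to reach a target mitogenome coverage.

    mito_fraction:  share of all reads that are mitochondrial (the text gives
                    about 1% as a typical upper bound).
    taxon_fraction: share of the sample's DNA contributed by the focal taxon
                    (hypothetical parameter for bulk community samples).
    """
    if not (0 < mito_fraction <= 1 and 0 < taxon_fraction <= 1):
        raise ValueError("fractions must lie in (0, 1]")
    # Reads that must map to the mitogenome to reach the target coverage.
    mito_reads = target_coverage * mito_length_bp / read_length_bp
    # Scale up by the share of reads that are mitochondrial and on-taxon.
    return mito_reads / (mito_fraction * taxon_fraction)


# Illustrative values only: 20x coverage, 15-kb mitogenome, 150-bp reads.
single_specimen = total_reads_needed(20, 15_000, 150)
rare_taxon = total_reads_needed(20, 15_000, 150, taxon_fraction=0.01)
print(f"single specimen: ~{single_specimen:,.0f} reads")
print(f"taxon at 1% of a bulk sample: ~{rare_taxon:,.0f} reads")
```

Under these assumed values, a taxon making up 1% of a bulk sample needs about one hundred times more reads than a single specimen, which is the practical meaning of "very high sequencing coverage" for rare community members.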

PCR-based metabarcoding is therefore still the most cost-effective and simple method for community analysis of spiders. Although PCR amplification bias can lead to skewed taxon abundances, this bias can be considerably mitigated using optimized protocols. Amplification with degenerate primers, or with primers binding in conserved DNA stretches, greatly improves taxon recovery while decreasing amplification bias (Krehenwinkel et al. 2017a). Furthermore, the response of individual taxa during a PCR is predictable, i.e., the relative abundance of a taxon in a community is linearly correlated with the recovered read abundance (Fig. 3). Only the slope of this correlation differs among taxa. If the slopes are known, then correction factors can be applied to estimate the relative abundance of taxa in a community (Thomas et al. 2016). The downside is that correction factors must be developed individually for different taxa.
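The idea of taxon-specific correction factors can be sketched in a few lines. This is a simplified illustration, not the procedure of Thomas et al. (2016): it assumes each taxon's read share is proportional to its true share multiplied by a known slope, and the taxon names, read counts, and slopes are made up for the example.

```python
def correct_abundances(read_counts, slopes):
    """Rescale read shares by taxon-specific slopes and renormalize.

    read_counts: dict mapping taxon -> number of reads recovered.
    slopes:      dict mapping taxon -> slope of the read-vs-abundance
                 relationship (hypothetical values from mock communities).
    """
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError("no reads to correct")
    # Divide each read share by its slope to undo the amplification bias.
    adjusted = {
        taxon: (count / total_reads) / slopes[taxon]
        for taxon, count in read_counts.items()
    }
    # Renormalize so the corrected shares sum to one.
    norm = sum(adjusted.values())
    return {taxon: value / norm for taxon, value in adjusted.items()}


# Hypothetical example: taxon A amplifies twice as well as B, C half as well.
reads = {"A": 600, "B": 300, "C": 100}
slopes = {"A": 2.0, "B": 1.0, "C": 0.5}
for taxon, share in correct_abundances(reads, slopes).items():
    print(taxon, round(share, 3))
```

A taxon without a measured slope raises a `KeyError` here, mirroring the practical limitation that correction factors have to be established separately for each taxon.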

Environmental DNA metabarcoding of spiders

A popular application of metabarcoding is the analysis of environmental DNA (eDNA). Every organism leaves traces of DNA in the environment, for example, from feces, skin fragments, or saliva. These traces can be enriched, amplified, and sequenced, allowing characterization of whole communities without needing to collect the organisms. Much work on eDNA has focused on aquatic ecosystems, using DNA extracts from filtered water (Valentini et al. 2016). However, terrestrial organisms can also be detected using eDNA, for example from soil DNA extractions. Arthropods were recently shown to leave eDNA traces on wildflowers (Thomsen and Sigsgaard 2019). Thus, by washing eDNA off of plants, it may be possible to reconstruct associated arthropod communities.

Third generation sequencing-based barcoding and metabarcoding

DNA barcoding and metabarcoding applications are currently limited by the relatively short read length of second generation HTS applications, which cannot recover the whole 650-bp COI barcode region as a single sequence (Piper et al. 2019). A solution to this limitation is provided by the third generation sequencing technologies. Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) offer sequencing platforms that achieve read lengths superior to any previous sequencing technology, with reads of close to 2 megabases for ONT’s MinION platform (Payne et al. 2018). Both technologies are well suited for amplicon sequencing, and dual indexes can be easily incorporated during PCR, allowing processing of thousands of DNA barcodes in a single sequencing run. The PacBio Sequel (Hebert et al. 2018) and ONT’s MinION (Srivathsan et al. 2019) were recently suggested as cost-efficient alternatives to Sanger sequencing for the generation of DNA barcodes. ONT and PacBio platforms have also recently been used to sequence near complete nuclear ribosomal DNA clusters (Krehenwinkel et al. 2019a; Tedersoo et al. 2018). The advantage of rDNA barcoding is that conserved gene regions of the rDNA cluster can be used to design universal primers (anchored in the highly conserved 18S and 28S rDNA), which can resolve very old divergences (Hillis and Dixon 1991), while at the same time fast-evolving internal transcribed spacers (ITS) can be used to resolve relationships of closely related taxa (Schoch et al. 2012). Our work on spiders suggests that such long rDNA amplicons are a well-suited complement to COI-based DNA barcoding (Supplementary Fig. 2; Krehenwinkel et al. 2019a). Long rDNA amplicons also offer very good phylogenetic support and thus may constitute a cost-effective alternative to multiplexed Illumina amplicon sequencing for community-wide phylogenetic analysis.

Fig. 3 Association of the relative abundance of spider species from seven different families in mock communities of 46 different arthropod species, with the relative read count recovered for the species after sequencing. The plots show the association for A) nuclear 18S rDNA, B) nuclear 28S rDNA and C) mitochondrial COI. The abundance of the species in the respective communities is generally well correlated to the recovered read abundances. However, depending on marker and species, the slope of the association is very variable, such that accurate abundance estimates from read data would require careful calculation. Based on data from Krehenwinkel et al. 2017a

The third generation sequencing platforms offer an unprecedented read length, but their major downside is a high raw read error rate. At about 5–30% (Tedersoo et al. 2018; Wick et al. 2018), ONT and PacBio sequencers’ raw read error rate is much higher than that of Illumina (error rate: 0.1–1%; Manley et al. 2016) and Sanger sequencing (error rate: 0.001–1%; Noguchi et al. 2006). However, highly accurate consensus sequences can be generated from ONT and PacBio data even at low coverage (Krehenwinkel et al. 2019a; Pomerantz et al. 2018). Recent advances show promise for further minimizing error. PacBio HiFi sequencing mode produces highly accurate reads using their circular consensus sequencing (CCS) technology, reducing raw read error to < 1% (Wenger et al. 2019). Similarly, rolling circle amplification can be used in metabarcoding applications to mitigate error rates of the nanopore-based sequencing platforms (Calus et al. 2018).
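Why consensus calling tolerates a high raw read error rate can be shown with a toy majority vote. The sketch below assumes the reads are already aligned and of equal length, which real long-read pipelines achieve with dedicated alignment and polishing tools; the function name and the sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return the per-position majority base of equal-length aligned reads."""
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    # zip(*reads) yields one column of bases per alignment position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_reads)
    )


# Five noisy copies of a short hypothetical barcode fragment; three of them
# carry a single error each, at different positions.
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTTC", "TCGTAC"]
print(majority_consensus(reads))
```

Because random errors rarely hit the same position in most reads, the majority base recovers the underlying sequence even though more than half of the individual reads here contain an error.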

The possibility of long-read metabarcoding was also recently explored (Callahan et al. 2019; Krehenwinkel et al. 2019a; Tedersoo and Anslan 2019). Barcode sequences of several thousand base pairs for a whole community would greatly improve the phylogenetic resolution of metabarcoding and allow community-level phylogenetic analysis. However, the high raw read error rate poses a significant obstacle for accurate community characterization, as it is hard to distinguish whether a rare sequence variant belongs to a separate species in the community or is simply caused by sequencing error. Nonetheless, advances such as CCS and rolling circle amplification may soon solve this problem. One other issue is that community compositions of rDNA metabarcoding studies can be highly skewed, likely due to favorable PCR amplification of shorter rDNA fragments (Krehenwinkel et al. 2019a). Although the length of the nuclear rDNA region is relatively stable within spiders, biases can occur when additional taxa are included (Krehenwinkel et al. 2019a).

Mobile DNA barcoding by third generation sequencing

A particular strength of ONT’s MinION is its portability. With the size of a USB stick, the device can be run outside of conventional laboratories (e.g., Menegon et al. 2017; Pomerantz et al. 2018). Using a mobile laboratory of miniaturized equipment, all steps from DNA extraction to PCR, library preparation and sequencing can be performed in the field using the MinION (Pomerantz et al. 2018; reviewed in Krehenwinkel et al. 2019b). While field-based DNA barcoding is an exciting perspective, it is unlikely to become the method of choice for community barcoding. Researchers usually have access to molecular laboratories that allow for more standardized and higher throughput sample processing than field-based assays. Yet, a minimalistic and mobile DNA barcoding system can be of great advantage when field sites are remote or hard to access, or when time is of the essence for swift generation of biodiversity information (reviewed in Krehenwinkel et al. 2019b). Examples include monitoring of disease outbreaks (Quick et al. 2016; Walter et al. 2017) or documenting the immediate effects of ecological disasters such as forest fires or pipeline spillages. Another advantage of mobile barcoding is that it allows for in situ species monitoring without having to remove organisms from their habitat or send samples internationally (Pomerantz et al. 2018). This is especially relevant for endangered species. In the case of spiders, non-lethal sampling protocols could be applied for site-based monitoring without directly affecting the population (Longhorn et al. 2007; Petersen et al. 2007).

Trophic niche analysis by DNA barcoding

DNA barcoding for gut content analysis: toward community-level food webs

Surprisingly little is known about the dietary ecology of spiders. While most spiders have long been understood as generalist predators, recent work also highlights many examples of dietary specialists, like termite feeders (Petráková et al. 2015), araneophages (Benavides et al. 2017; Wood et al. 2012) and even herbivorous species (Meehan et al. 2009; Nyffeler et al. 2016). The compilation of dietary information for spiders from observational data is very time-consuming and often inaccurate. The sheer diversity of spiders additionally complicates the task. More detailed dietary information is available chiefly for species being considered as potential biocontrol agents (Schmidt et al. 2014; Roubinet et al. 2017).

Molecular gut content analysis has simplified the task of characterizing spider prey communities and associated strengths of predator-prey interactions, thereby allowing a more accurate reconstruction of the often-cryptic arthropod food web (Sint et al. 2019). In the simplest case, spiders can be tested for consumption of specific prey taxa by subjecting a spider’s gut content to PCR assays using prey-specific primers (Schmidt et al. 2014; Whitney et al. 2018). This can be useful for determining whether a spider could serve as a biocontrol agent against a particular pest species. However, the limitation of this method is that the expected prey taxa must be known a priori, and specific PCR assays must be developed for every prey taxon. Multiplex PCR approaches (King et al. 2011; Roubinet et al. 2017) can broaden taxonomic coverage of prey detection but are still limited in their taxonomic breadth.

A more complete prey spectrum can be recovered via metabarcoding of the gut contents (Deagle et al. 2009; Leray et al. 2013). In principle, the same DNA barcoding approaches used for community characterization can be applied to gut content analysis, i.e., by treating the prey DNA inside the gut as a “community.” However, there are some additional considerations for tailoring these approaches to the specific conditions affecting DNA extracted from the gut. PCR inhibitors may coprecipitate with the DNA, requiring additional purification of DNA from gut extractions. Also, prey DNA from the predator’s guts is often degraded and present at much lower concentrations than the DNA of a single specimen in a bulk community extract. Hence, short PCR amplicons are usually targeted to achieve a complete prey spectrum (Zeale et al. 2011; Kamenova et al. 2018). Gut content metabarcoding has become increasingly popular and has recently provided numerous novel insights into the trophic ecology of spiders. Examples include trophic niche differentiation within an adaptive radiation (Fig. 4; Kennedy et al. 2019), the effect of grazers on prey communities (Schmidt et al. 2018), ontogenetic shifts in diet (Verschut et al. 2019), and a stable diet despite differences in available prey communities along an elevational gradient (Eitzinger et al. 2019). Improved resolution is often achieved by combining HTS-based gut content screening with stable isotope analysis (Hambäck et al. 2016; Kennedy et al. 2019).

Enrichment of prey DNA

Dissecting the gut of a spider for prey recovery is time-consuming and laborious. A highly simplified approach was thus suggested by Piñol et al. (2014). In their study, the authors used DNA extractions from whole spider bodies and amplified DNA barcodes using universal primers. Predator barcodes, which co-amplified during PCR, were removed from the analysis, and the prey spectrum was reconstructed from the remaining sequences. As universal primers are needed in order to recover a full prey spectrum, a serious problem of this approach is that both predator and prey will be amplified. Consequently, the overabundant predator DNA can completely outcompete the prey DNA during PCR. Recent work suggests the use of predator-specific blocking primers, but due to the close relatedness of prey and predator, this approach has had only limited success in spiders (Michálek et al. 2017; Toju and Baba 2018). Another option is to enrich prey DNA from spider extracts. Prey DNA is quickly degraded in the spider’s digestive tract. By separating intact high molecular weight DNA from degraded DNA fragments, prey DNA can be significantly enriched (Fig. 5; Krehenwinkel et al. 2017b). However, this method will only work well if the predator DNA is not degraded; therefore, it is not suitable for old or poorly preserved specimens. An additional enrichment of prey DNA can be achieved by extracting DNA from the spider’s opisthosoma only, which contains the majority of the animal’s digestive tract (Krehenwinkel et al. 2017b; Macías-Hernández et al. 2018).

Fig. 4 Order-level prey compositions for four sympatric Tetragnatha species from the Hawaiian island of Maui, as recovered by molecular gut content analysis based on a short COI amplicon. The different lifestyles of the web builders (T. acuta and T. stelarobusta) and free-hunting species (T. quasimodo and T. waikamoi) are reflected in divergent prey spectra. However, prey community differences also become apparent within web builders and free hunters. This effect may be due to interspecific differences in microhabitat and prey capture strategy. The green T. waikamoi resides and hunts on green leaves, while the brown T. quasimodo occurs on tree bark or dead leaves. The two web builders are distinguished by different web mesh widths, selectively catching different arthropod groups. Based on data from Kennedy et al. 2019

To further optimize prey recovery and reduce the necessary sequencing depth, lineage-specific PCR was recently suggested (Fig. 5; Krehenwinkel et al. 2019c). Single mismatches at the 3′-end of a PCR primer can lead to a massive drop in amplification efficiency (Kwok et al. 1990). If primers are designed in very conserved regions but end at a lineage-diagnostic SNP, which distinguishes spiders from their insect prey, spiders will be mostly blocked from amplification. At the same time, the primer still amplifies a wide variety of arthropod prey. Blocking spiders from amplification also enables detection of prey for very long time periods, possibly up to a month after feeding (Krehenwinkel et al. 2019c). A downside of this approach is that spider-spider predation cannot be detected. While prey enrichment can now be routinely performed from spider gut content, further standardizations of the protocol may be necessary to integrate the resulting data into previously generated prey community data.
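The logic of a primer that ends on a lineage-diagnostic SNP can be illustrated with a crude in silico check. The sketch below is a deliberately simplified rule (amplify only if the 3′-terminal base matches); as noted earlier in the text, real mismatch effects depend on position and type. The sequences, function names, and the binary rule are all hypothetical.

```python
def three_prime_mismatches(primer, site, window=1):
    """Count mismatches in the last `window` bases at the primer's 3' end.

    primer and site are given in the same 5'->3' orientation and must have
    equal length (the site is the primer-binding region of the template).
    """
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have equal length")
    return sum(p != s for p, s in zip(primer[-window:], site[-window:]))


def predicted_to_amplify(primer, site):
    """Crude rule: amplification is expected only if the 3' terminal base matches."""
    return three_prime_mismatches(primer, site) == 0


# Hypothetical binding sites: the spider differs from the primer at the last base.
primer = "ACGTTGCA"
sites = {"insect prey": "ACGTTGCA", "spider predator": "ACGTTGCG"}
for lineage, site in sites.items():
    print(lineage, predicted_to_amplify(primer, site))
```

In practice, candidate primers would be screened against alignments of many predator and prey sequences, and the predicted blocking would still need to be confirmed experimentally.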

Non-lethal monitoring of spider prey communities

All methods mentioned above rely on DNA extraction from spiders or their body parts. Corse et al. (2019) suggest the use of DNA extracts from spider webs as an alternative source of prey DNA, without harming the spider. This method has several drawbacks. First, spider webs can collect airborne DNA from the environment, in addition to “bycatch” of insects that the spiders do not eat. This can lead to false positives if DNA from webs is used as a proxy for the spider’s diet. Also, many spiders rebuild their webs on a daily basis. Web DNA then only allows detection of the daily prey catch (as well as the bycatch described above), in contrast to several weeks recovered by gut content analysis. Many spiders do not spin capture webs but are active hunters, additionally limiting the broad applicability of this method. Another alternative was suggested by Sint et al. (2015), who used DNA extracted from spider feces as source of prey DNA. This is a promising approach but may be logistically challenging, as spiders must be kept in captivity until feces can be collected. Moreover, recent work in carabid beetles has shown that feces recover a less diverse, and therefore biased, prey spectrum compared to gut content extractions (Kamenova et al. 2018).

Pitfalls of HTS-based gut content analysis and how to avoid them

HTS-based gut content analysis has been shown to yield reliable and comprehensive prey spectra, allowing exploration of food web structure in whole communities of spiders. However, the method still has several issues. PCR-based amplification of prey DNA is very sensitive to contamination. Theoretically, a few molecules of insect DNA are sufficient to be amplified. This DNA can also derive from external sources, e.g., from insects caught or stored together with the spider specimen. This contamination can be minimized by bleach treatment of the collected specimens before gut content analysis (Greenstone et al. 2012). Another source of contamination is parasitoid larvae, which may be located inside the spider but are not actual prey. Additional organisms inside the spider’s tissue, such as nematodes, fungi, and bacteria, can also be coamplified along with the prey. Prey data must therefore be carefully analyzed to identify such potentially confounding factors.

Fig. 5 Enrichment of prey DNA from extractions of spiders. A) The recovered relative amount of prey DNA of Hololena adnexa in relation to spider DNA increases significantly when DNA extractions are performed from the opisthosoma rather than the prosoma. The yield can be further increased by using a bead protocol to enrich the low molecular weight DNA from the DNA extract. This works because prey DNA rapidly degrades in the spider’s digestive tract. B) Enrichment of prey DNA in 12 spider species from seven families by using different lineage-specific primers and in comparison to a commonly used COI primer pair. Based on spider-specific 3′-primer mismatches, the amplification of spiders can be considerably reduced, enriching the prey DNA during PCR. Based on data from Krehenwinkel et al. 2017b, 2019c

Amplification bias can also strongly affect taxon recovery in HTS-based gut content analysis. This problem can be mitigated using multiplex PCR assays targeting several loci (Krehenwinkel et al. 2019c), which enables a good qualitative recovery of prey. Accurate quantitative assessments of prey communities, however, are not possible. This is partly due to biased amplification of different taxa, and partly to the timing of prey consumption. DNA of the recently ingested prey will outweigh that of earlier meals, which will already be mostly degraded. However, by completely omitting quantitative information, the prey spectrum is artificially biased toward very rare taxa; thus, the number of reads obtained for different prey taxa should be taken into account even if abundance per se cannot be reliably inferred (Deagle et al. 2019).
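The contrast between purely qualitative and read-informed summaries of a diet can be made concrete with two simple statistics: the frequency of occurrence of each prey taxon across spiders, and its mean relative read abundance. The sketch below is a minimal illustration with hypothetical function names and made-up read counts; it is not the analysis pipeline of any cited study.

```python
def summarize_diet(samples):
    """Return (frequency of occurrence, mean relative read abundance) per prey taxon.

    samples: list of dicts, one per spider, mapping prey taxon -> read count.
    Spiders without any prey reads are ignored.
    """
    samples = [s for s in samples if sum(s.values()) > 0]
    if not samples:
        raise ValueError("no samples with prey reads")
    taxa = sorted({taxon for s in samples for taxon in s})
    n = len(samples)
    occurrence, read_share = {}, {}
    for taxon in taxa:
        # Presence/absence: every detection counts the same, however weak.
        occurrence[taxon] = sum(1 for s in samples if s.get(taxon, 0) > 0) / n
        # Read-informed: average of the within-sample read proportions.
        read_share[taxon] = sum(s.get(taxon, 0) / sum(s.values()) for s in samples) / n
    return occurrence, read_share


# Two hypothetical spiders with made-up prey read counts.
spiders = [
    {"Diptera": 90, "Collembola": 10},
    {"Diptera": 50, "Hemiptera": 50},
]
occurrence, read_share = summarize_diet(spiders)
for taxon in occurrence:
    print(taxon, occurrence[taxon], round(read_share[taxon], 3))
```

In this toy data set, Collembola and Hemiptera have the same frequency of occurrence although their read shares differ fivefold, which shows how presence/absence data alone give weakly detected taxa the same weight as dominant ones.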

Another shortcoming of current HTS-based protocols is that they fail to detect cannibalism. The short amplified barcode sequences for gut content analysis are usually shared within a species, making it impossible to distinguish the DNA of cannibalized prey from that of the predator. Recent work (Michálek et al. 2017) suggests using intraspecific haplotypic variation to identify cannibalism events, but this method only works if the spider population has very high haplotypic variation. Hence, in most spider species, the number of cannibalism events would be considerably underestimated.

Secondary predation is yet another issue, particularly for protocols with a long detection half-life of prey DNA. For example, gut DNA extracts of a spider that has fed on a ladybird (Coccinellidae) may also contain DNA of the ladybird’s own aphid prey. However, secondary prey DNA is expected to be more degraded and less abundant than the DNA of an actual prey item, so a careful analysis of prey community data and the development of sequence coverage cutoffs may mitigate this issue. Such sequence coverage cutoffs will have to be derived experimentally in the future to enable accurate assignments of real prey taxa in gut content studies.
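How a sequence coverage cutoff would operate on a single gut sample can be sketched as a simple filter. The 1% threshold, the function name, and the read counts below are placeholders for illustration; as stated above, suitable cutoffs still have to be derived experimentally.

```python
def apply_read_cutoff(prey_reads, min_fraction=0.01):
    """Keep only prey taxa whose share of the sample's prey reads reaches the cutoff.

    prey_reads:   dict mapping prey taxon -> read count for one spider.
    min_fraction: hypothetical cutoff; not an experimentally validated value.
    """
    total = sum(prey_reads.values())
    if total == 0:
        return {}
    return {
        taxon: count
        for taxon, count in prey_reads.items()
        if count / total >= min_fraction
    }


# Made-up counts: the aphid signal could stem from the ladybird's own prey.
gut_sample = {"Coccinellidae": 950, "Diptera": 45, "Aphididae": 5}
print(apply_read_cutoff(gut_sample))
```

A fixed proportional cutoff also discards genuinely rare prey, so any threshold trades false positives from secondary predation against false negatives for small or long-digested meals.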

Future outlook: practical and theoretical developments in the field

Development of laboratory and field protocols

Recent technological developments have greatly contributed to the field of DNA barcoding. Whole communities can now be routinely characterized for manageable cost and effort. However, the field is still in its infancy, and further developments are warranted. One important focus is the completion of DNA barcode reference libraries. Only with complete reference databases can DNA barcoding be used to its full potential. Museum collections are a promising source of specimens from which to generate barcode data for the world’s spider biota. Also, future species descriptions should be coupled with the deposition of a DNA barcode sequence. Considering the limitations of single-locus DNA barcoding, new unlinked barcode loci should additionally be developed. These should include information from the nuclear genome and be variable enough to distinguish species but also include conserved sequences which allow the design of universal primers. A set of multiple, unlinked DNA barcoding loci would greatly facilitate taxonomic discoveries in spiders and may aid in community phylogenetic analysis.

A focus of future research should also be on the optimization of quantitative taxon recovery from metabarcoding. Alternatively, further developments in amplification-free approaches may lead to a drop in cost, making these methods applicable to whole-community samples. Further optimizations should be performed on long-read protocols so that they can be used for accurate metabarcoding analyses. This would considerably improve the phylogenetic resolution of metabarcoding data. With future simplifications to the protocol, portable barcoding could develop into a routine methodology for the exploration of remote ecosystems around the world. Gut content sequencing is currently revolutionizing our understanding of cryptic prey-predator interactions in arthropod communities. With further experimental developments, for example, into the avoidance of false positives and the enrichment of prey DNA, the methodology will enable an in-depth understanding of the arthropod food web structure, which is also critical for understanding the food web relationships at higher trophic levels.

Linking theoretical biology and DNA barcoding

High-throughput sequencing-based DNA barcoding and metabarcoding have provided scientists with community-level datasets of unprecedented completeness and resolution. Nevertheless, the theoretical tools available for analyzing these data are still somewhat limited. Recent efforts, however, show great promise for improving the power and accuracy of DNA barcode data for the analysis of community species richness, abundance, phylogenetics, and interactions. Here we provide an overview of the current developments and future perspectives of integrating DNA barcoding data into theory.

Theoreticians are facing new opportunities for making inferences about past processes that have contributed to structuring communities using community-scale sequence data. Events at different timescales are recorded in different aspects of these data, with abundance distributions reflecting short, ecological timescales, population genetic variation reflecting medium timescales, and phylogenetic diversity reflecting long timescales. For example, if abundances can be estimated from bulk-sampled sequence data using metabarcoding, then a variety of methods can be applied to differentiate neutral from non-neutral processes (Harpole and Tilman 2006; Tilman 2004), estimate assembly model parameters (Haegeman and Etienne 2017), or infer equilibrium state of the community using mechanistic theory (Jabot and Chave 2011) or tools from statistical mechanics (Harte and Newman 2014; Rominger and Merow 2017).

The distribution of genetic variation within a community provides another axis of information which is complementary to the abundance distribution (Vellend 2005; Vellend et al. 2014). Recently, Overcast et al. (2019) described a mechanistic model of community assembly that can generate linked patterns of abundance and genetic diversity under an assumption of joint ecological (Hubbell 2011) and evolutionary (Kimura 1983) neutrality to estimate community abundance structure using only intraspecific genetic variation. This method can serve as an alternative or a complement to estimates of abundance distributions from metabarcode data, which are confounded by PCR amplification bias as described above. As a proof of concept for this method, Overcast et al. (2019) analyzed the densely sampled abundances and community-scale population genetic data (COI sequences) from a community of spiders on La Réunion (Emerson et al. 2017) and demonstrated that the abundance structure of the community could be accurately estimated using only the intraspecific genetic variation.

Analysis of community phylogenies provides a deep-time lens on community structure which can be used to estimate speciation and extinction rates (Manceau et al. 2015) and make inferences about diversification processes (Emerson and Gillespie 2008; Morlon 2014; Pearse et al. 2014). Recent methods have also been developed to simultaneously model trait evolution and species diversification (Weber et al. 2017) to investigate the importance of competition in shaping evolutionary radiations (Aristide and Morlon 2019), and the joint contribution of competition and environmental filtering in structuring ecological communities (Ruffley et al. 2019). Community-scale trait data can also be analyzed along with metabarcoding data in a hierarchical modeling framework to further account for feedbacks among processes happening at disparate timescales (Overcast et al. n.d.). Such theoretical developments enable increasingly reliable and detailed inferences on past processes in shaping present-day patterns, yielding many exciting new perspectives on community assembly.

Acknowledgments IO acknowledges the support of the Mina Rees Dissertation Fellowship in the Sciences. We also acknowledge the German Center for Integrative Biodiversity Research (sDiv program) for fostering discussions that gave rise to some of this work.

Funding information Open Access funding provided by Projekt DEAL. SK was funded by a kickstart grant from the Okinawa Institute of Science and Technology to Evan Economo. Part of this work was funded by the National Science Foundation’s Dimensions of Biodiversity program (DEB 1241253) to RGG and a Deutsche Forschungsgemeinschaft postdoctoral fellowship to HK.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Agnarsson I, Kuntner M (2007) Taxonomy in a changing world: seeking solutions for a science in crisis. Syst Biol 56:531–539

Agustí N, Shayler SP, Harwood JD, Vaughan IP, Sunderland KD, Symondson WOC (2003) Collembola as alternative prey sustaining spiders in arable ecosystems: prey detection within predators using molecular markers. Mol Ecol 12:3467–3475

Andersen JC, Mills NJ (2012) DNA extraction from museum specimens of parasitic hymenoptera. PLoS One. https://doi.org/10.1371/journal.pone.0045549

Aristide L, Morlon H (2019) Understanding the effect of competition during evolutionary radiations: an integrated model of phenotypic and species diversification. Ecol Lett. https://doi.org/10.1111/ele.13385

Astrin JJ, Höfer H, Spelda J, Holstein J, Bayer S, Hendrich L, Huber BA, Kielhorn KH, Krammer HJ, Lemke M, Monje JC (2016) Towards a DNA barcode reference database for spiders and harvestmen of Germany. PLoS One. https://doi.org/10.1371/journal.pone.0162624

Barker GM (2002) Phylogenetic diversity: a quantitative framework for measurement of priority and achievement in biodiversity conservation. Biol J Linn Soc 76:165–194

Barrett RD, Hebert PD (2005) Identifying spiders through DNA barcodes. Can J Zool 83:481–491

Benavides LR, Giribet G, Hormiga G (2017) Molecular phylogenetic analysis of “pirate spiders” (Araneae, Mimetidae) with the description of a new African genus and the first report of maternal care in the family. Cladistics 33:375–405

Bensasson D, Zhang DX, Hartl DL, Hewitt GM (2001) Mitochondrial pseudogenes: Evolution's misplaced witnesses. Trends Ecol Evol 16:314–321

Binford GJ (2001) Differences in venom composition between orb-weaving and wandering Hawaiian Tetragnatha (Araneae). Biol J Linn Soc 74:581–595

Blagoev GA, deWaard JR, Ratnasingham S, deWaard SL, Lu L, Robertson J, Telfer AC, Hebert PD (2016) Untangling taxonomy: a DNA barcode reference library for Canadian spiders. Mol Ecol Resour 16:325–341

Bohmann K, Evans A, Gilbert MTP, Carvalho GR, Creer S, Knapp M, Douglas WY, De Bruyn M (2014) Environmental DNA for wildlife biology and biodiversity monitoring. Trends Ecol Evol 29:358–367

Callahan BJ, Wong J, Heiner C, Oh S, Theriot CM, Gulati AS, McGill SK, Dougherty MK (2019) High-throughput amplicon sequencing of the full-length 16S rRNA gene with single-nucleotide resolution. Nucleic Acids Res. https://doi.org/10.1093/nar/gkz569

Calus ST, Ijaz UZ, Pinto AJ (2018) NanoAmpli-Seq: a workflow for amplicon sequencing for mixed microbial communities on the nanopore sequencing platform. GigaScience. https://doi.org/10.1093/gigascience/giy140

Čandek K, Kuntner M (2015) DNA barcoding gap: reliable species identification over morphological and geographical scales. Mol Ecol Resour 15:268–277

Cardoso P, Pekár S, Jocqué R, Coddington JA (2011) Global patterns of guild composition and functional diversity of spiders. PLoS One. https://doi.org/10.1371/journal.pone.0021710

Carew ME, Coleman RA, Hoffmann AA (2018) Can non-destructive DNA extraction of bulk invertebrate samples be used for metabarcoding? PeerJ. https://doi.org/10.7717/peerj.4980

Corse E, Tougard C, Archambaud-Suard G, Agnèse JF, Messu Mandeng FD, Bilong Bilong CF, Duneau D, Zinger L, Chappaz R, Xu CC, Meglécz E (2019) One-locus-several-primers: a strategy to improve the taxonomic and haplotypic coverage in diet metabarcoding studies. Ecol Evol 9:4603–4620

Cotoras D, Murray G, Kapp J, Gillespie R, Griswold C, Simison W, Green R, Shapiro B (2017) Ancient DNA resolves the history of Tetragnatha (Araneae, Tetragnathidae) spiders on Rapa Nui. Genes. https://doi.org/10.3390/genes8120403

Crampton-Platt A, Douglas WY, Zhou X, Vogler AP (2016) Mitochondrial metagenomics: letting the genes out of the bottle. GigaScience 5:1–11. https://doi.org/10.1186/s13742-016-0120-y

Crespo LC, Domènech M, Enguídanos A, Malumbres-Olarte J, Cardoso P, Moya-Laraño J, Frías-López C, Macías-Hernández N, De Mas E, Mazzuca P, Mora E (2018) A DNA barcode-assisted annotated checklist of the spider (Arachnida, Araneae) communities associated to white oak woodlands in Spanish national parks. Biodivers Data J 6:1–273

Cristescu ME (2014) From barcoding single individuals tometabarcoding biological communities: towards an integrative ap-proach to the study of global biodiversity. Trends Ecol Evol 29:566–571

de Kerdrel G, Andersen JC, Kennedy SR, Gillespie R, Krehenwinkel H(2020) Rapid and cost-effective generation of single specimenmultilocus barcoding data from whole arthropod communities bymultiple levels of multiplexing. Sci Rep UK in press

Deagle BE, Kirkwood R, Jarman SN (2009) Analysis of Australian furseal diet by pyrosequencing prey DNA in faeces. Mol Ecol 18:2022–2038

Deagle BE, Thomas AC, McInnes JC, Clarke LJ, Vesterinen EJ, ClareEL, Kartzinel TR, Eveson JP (2019) Counting with DNA inmetabarcoding studies: how should we convert sequence reads todietary data? Mol Ecol 28:391–406

DopheideA, Tooman LK, Grosser S, Agabiti B, Rhode B,Xie D, StevensMI, Nelson N, Buckley TR, Drummond AJ, Newcomb RD (2019)Estimating the biodiversity of terrestrial invertebrates on a forestedisland using DNA barcodes and metabarcoding data. Ecol Appl.https://doi.org/10.1002/eap.1877

Dupuis JR, Roe AD, Sperling FA (2012)Multi-locus species delimitationin closely related animals and fungi: one marker is not enough. MolEcol 21:4422–4436

Edgar RC (2013) UPARSE: highly accurate OTU sequences frommicro-bial amplicon reads. Nat Methods 10:996–998

Edgar RC (2018) Updating the 97% identity threshold for 16S ribosomalRNA OTUs. Bioinformatics 34:2371–2375

Eitzinger B, Abrego N, Gravel D, Huotari T, Vesterinen EJ, Roslin T(2019) Assessing changes in arthropod predator–prey interactionsthrough DNA-based gut content analysis—variable environment,stable diet. Mol Ecol 28:266–280

Elbrecht V, Leese F (2015) Can DNA-based ecosystem assessmentsquantify species abundance? Testing primer bias and biomass—sequence relationships with an innovative metabarcoding protocol.PloS One. https://doi.org/10.1371/journal.pone.0130324

Elbrecht V, Vamos EE, Meissner K, Aroviita J, Leese F (2017) Assessingstrengths and weaknesses of DNA metabarcoding-based

macroinvertebrate identification for routine stream monitoring.Methods Ecol Evol 8:1265–1275

Elbrecht V, Vamos EE, Steinke D, Leese F (2018) Estimating intraspecificgenetic diversity from community DNA metabarcoding data. PeerJ.https://doi.org/10.7717/peerj.4644

Emerson BC, Gillespie RG (2008) Phylogenetic analysis of communityassembly and structure over space and time. Trends Ecol Evol 23:619–630

Emerson BC, Casquet J, López H, Cardoso P, Borges PA, Mollaret N,Oromí P, Strasberg D, Thébaud C (2017) A combined field surveyandmolecular identification protocol for comparing forest arthropodbiodiversity across spatial scales. Mol Ecol Resour 17:694–707

Fadrosh DW,Ma B, Gajer P, Sengamalay N, Ott S, Brotman RM, Ravel J(2014) An improved dual-indexing approach for multiplexed 16SrRNA gene sequencing on the Illumina MiSeq platform.Microbiome. https://doi.org/10.1186/2049-2618-2-6

Foelix R (2011) Biology of spiders. Oxford University Press USA, NewYork

Foley S, Lüddecke T, Cheng DQ, Krehenwinkel H, Künzel S, LonghornSJ, Wendt I, von Wirth V, Tänzler R, Vences M, Piel WH (2019)Tarantula phylogenomics: a robust phylogeny of deep theraphosidclades inferred from transcriptome data sheds light on the pricklyissue of urticating setae evolution. Mol Phylogenet Evol. https://doi.org/10.1016/j.ympev.2019.106573

Fujita MK, Leaché AD, Burbrink FT, McGuire JA, Moritz C (2012)Coalescent-based species delimitation in an integrative taxonomy.Trends Ecol Evol 27:480–488

Gibson J, Shokralla S, Porter TM, King I, van Konynenburg S, JanzenDH, Hallwachs W, Hajibabaei M (2014) Simultaneous assessmentof the macrobiome and microbiome in a bulk sample of tropicalarthropods through DNA metasystematics. P Natl Acad Sci USA111:8007–8012

Gillespie RG, Benjamin SP, Brewer MS, Rivera MAJ, Roderick GK(2018) Repeated diversification of ecomorphs in Hawaiian stickspiders. Curr Biol 28:941–947

Greenstone MH, Shufran KA (2003) Spider predation: species-specificidentification of gut contents by polymerase chain reaction. JArachnol 31:131–135

Greenstone MH, Weber DC, Coudron TA, Payton ME, Hu JS (2012)Removing external DNA contamination from arthropod predatorsdestined for molecular gut-content analysis. Mol Ecol Resour 12:464–469

Grey J, Thackeray SJ, Jones RI, Shine A (2002) Ferox trout (Salmotrutta) as ‘Russian dolls’: complementary gut content and stableisotope analyses of the loch ness foodweb. Freshw Biol 47:1235–1243

Gruner DS (2004) Attenuation of top-down and bottom-up forces in acomplex terrestrial community. Ecology 85:3010–3022

Haegeman B, Etienne RS (2017) A general sampling formula for com-munity structure data. Methods Ecol Evol 8:1506–1519

Hajibabaei M, Singer GA, Hebert PD, Hickey DA (2007) DNAbarcoding: how it complements taxonomy, molecular phylogeneticsand population genetics. Trends Genet 23:167–172

Hajibabaei M, Spall JL, Shokralla S, van Konynenburg S (2012)Assessing biodiversity of a freshwater benthic macroinvertebratecommunity through non-destructive environmental barcoding ofDNA from preservative ethanol. BMC Ecol 12:1–10. https://doi.org/10.1186/1472-6785-12-28

Hambäck PA, Weingartner E, Dalén L, Wirta H, Roslin T (2016) Spatialsubsidies in spider diets vary with shoreline structure: complemen-tary evidence from molecular diet analysis and stable isotopes. EcolEvol 6:8431–8439

HarpoleWS, Tilman D (2006) Non-neutral patterns of species abundancein grassland communities. Ecol Lett 9:15–23

Harte J, Newman EA (2014) Maximum information entropy: A founda-tion for ecological theory. Trends Ecol Evol 29:384–389

Dev Genes Evol (2020) 230:185–201 197

Page 14: High-throughput sequencing for community analysis: the ......ORIGINAL ARTICLE High-throughput sequencing for community analysis: the promise of DNA barcoding to uncover diversity,

Hebert PD, Gregory TR (2005) The promise of DNA barcoding fortaxonomy. Syst Biol 54:852–859

Hebert PD, Cywinska A, Ball SL, deWaard JR (2003) Biological identi-fications through DNA barcodes. P Roy Soc Lond B Bio 270:313–321

Hebert PD, Stoeckle MY, Zemlak TS, Francis CM (2004a) Identificationof birds throughDNA barcodes. PLoS Biol. https://doi.org/10.1371/journal.pbio.0020312

Hebert PD, Penton EH, Burns JM, Janzen DH, HallwachsW (2004b) Tenspecies in one: DNA barcoding reveals cryptic species in the neo-tropical skipper butterfly Astraptes fulgerator. P Natl Acad Sci USA101:14812–14817

Hebert PD, Braukmann TW, Prosser SW, Ratnasingham S, deWaard JR,Ivanova NV, Janzen DH, HallwachsW, Naik S, Sones JE, ZakharovEV (2018) A sequel to sanger: amplicon sequencing that scales.BMC Genomics 19:1–14. https://doi.org/10.1186/s12864-018-4611-3

Hillis DM, Dixon MT (1991) Ribosomal DNA: molecular evolution andphylogenetic inference. Q Rev Biol 66:411–453

Hiruki LM, Schwartz MK, Boveng PL (1999) Hunting and social behav-iour of leopard seals (Hydrurga leptonyx) at Seal Island, SouthShetland Islands, Antarctica. J Zool 249:97–109

Hubbell SP (2011) The unified neutral theory of biodiversity and bioge-ography. Princeton University Press, Princeton

Hurst GD, Jiggins FM (2005) Problems with mitochondrial DNA as amarker in population, phylogeographic and phylogenetic studies:the effects of inherited symbionts. P Roy Soc Lond B Bio 272:1525–1534

Irwin DE, Rubtsov AS, Panov EN (2009) Mitochondrial introgressionand replacement between yellowhammers (Emberiza citrinella) andpine buntings (Emberiza leucocephalos)(Aves: Passeriformes). BiolJ Linn Soc 98:422–438

Jabot F, Chave J (2011) Analyzing tropical forest tree species abundancedistributions using a nonneutral model and through approximateBayesian inference. Am Nat 178:E37–E47

Jones MB, Highlander SK, Anderson EL, Li W, Dayrit M, Klitgord N,Fabani MM, Seguritan V, Green J, Pride DT, Yooseph S (2015)Library preparation methodology can influence genomic and func-tional predictions in human microbiome research. P Natl Acad SciUSA 112:14024–14029

Kamenova S, Mayer R, Rubbmark OR, Coissac E, Plantegenest M,Traugott M (2018) Comparing three types of dietary samples forprey DNA decay in an insect generalist predator. Mol Ecol Resour18:966–973

Kennedy S, Lim JY, Clavel J, Krehenwinkel H, Gillespie RG (2019)Spider webs, stable isotopes and molecular gut content analysis:multiple lines of evidence support trophic niche differentiation in acommunity of Hawaiian spiders. Funct Ecol 33:1722–1733

Kimura M (1983) The neutral theory of molecular evolution. CambridgeUniversity Press, Cambridge

King RA, Moreno-Ripoll R, Agustí N, Shayler SP, Bell JR, Bohan DA,Symondson WO (2011) Multiplex reactions for the molecular de-tection of predation on pest and nonpest invertebrates inagroecosystems. Mol Ecol Resour 11:370–373

Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD (2013)Development of a dual-index sequencing strategy and curation pipe-line for analyzing amplicon sequence data on the MiSeq Illuminasequencing platform. Appl Environ Microbiol 79:5112–5120

Krehenwinkel H, Pekár S (2015) An analysis of factors affectinggenotyping success from museum specimens reveals an increaseof genetic and morphological variation during a historical rangeexpansion of a European spider. PLoS One. https://doi.org/10.1371/journal.pone.0136337

Krehenwinkel H, Tautz D (2013) Northern range expansion of Europeanpopulations of thewasp spiderArgiope bruennichi is associated with

global warming–correlated genetic admixture and population-specific temperature adaptations. Mol Ecol 22:2232–2248

Krehenwinkel H, Graze M, Rödder D, Tanaka K, Baba YG, Muster C,Uhl G (2016) A phylogeographical survey of a highly dispersivespider reveals eastern Asia as a major glacial refugium forPalaearctic fauna. J Biogeogr 43:1583–1594

Krehenwinkel H, Wolf M, Lim JY, Rominger AJ, SimisonWB, GillespieRG (2017a) Estimating and mitigating amplification bias in qualita-tive and quantitative arthropodmetabarcoding. Sci Rep UK. 7:1–12.https://doi.org/10.1038/s41598-017-17333-x

Krehenwinkel H, Kennedy S, Pekár S, Gillespie RG (2017b) A cost-efficient and simple protocol to enrich prey DNA from extractionsof predatory arthropods for large-scale gut content analysis byIllumina sequencing. Methods Ecol Evol 8:126–134

Krehenwinkel H, Kennedy SR, Rueda A, Lam A, Gillespie RG (2018)Scaling up DNA barcoding – primer sets for simple and cost effi-cient arthropod systematics by multiplex PCR and Illuminaamplicon sequencing. Methods Ecol Evol 9:2181–2193

Krehenwinkel H, Pomerantz A, Henderson JB, Kennedy SR, Lim JY,Swamy V, Shoobridge JD, Graham N, Patel NH, Gillespie RG,Prost S (2019a) Nanopore sequencing of long ribosomal DNAamplicons enables portable and simple biodiversity assessmentswith high phylogenetic resolution across broad taxonomic scale.GigaScience. https://doi.org/10.1093/gigascience/giz006

Krehenwinkel H, Pomerantz A, Prost S (2019b) Genetic biomonitoringand biodiversity assessment using portable sequencing technolo-gies: current uses and future directions. Genes. https://doi.org/10.3390/genes10110858

Krehenwinkel H, Kennedy SR, Adams SA, Stephenson GT, Roy K,Gillespie RG (2019c) Multiplex PCR targeting lineage-specificSNPs: a highly efficient and simple approach to block out predatorsequences in molecular gut content analysis. Methods Ecol Evol 10:982–993

Krushelnycky PD, Loope LL, Gillespie RG (2007) Inventory of arthro-pods of the west slope shrubland and alpine ecosystems ofHaleakala National Park. Honolulu (HI): Pacific cooperative studiesunit, University of Hawaii at Manoa, Department of Botany. PCSUtechnical report, 148. 52 pages

Kulkarni S, Wood H, Lloyd M, Hormiga G (2020) Spider-specific probeset for ultraconserved elements offers new perspectives on the evo-lutionary history of spiders (Arachnida, Araneae). Mol Ecol Resour20:185–203

Kwok S, Kellogg DE, McKinney N, Spasic D, Goda L, Levenson C,Sninsky JJ (1990) Effects of primer-template mismatches on thepolymerase chain reaction: human immunodeficiency virus type 1model studies. Nucleic Acids Res 18:999–1005

Lafferty KD, Page CJ (1997) Predation on the endangered tidewatergoby, Eucyclogobius newberryi, by the introduced African clawedfrog, Xenopus laevis, with notes on the frog's parasites. Copeia1997:589–592

Lange V, Böhme I, Hofmann J, Lang K, Sauter J, Schöne B, Paul P,Albrecht V, Andreas JM, Baier DM, Nething J (2014) Cost-efficient high-throughput HLA typing by MiSeq amplicon sequenc-ing. BMC Genomics. https://doi.org/10.1186/1471-2164-15-63

Leavitt DH, Starrett J, Westphal MF, Hedin M (2015) Multilocus se-quence data reveal dozens of putative cryptic species in a radiationof endemic Californian mygalomorph spiders (Araneae,Mygalomorphae, Nemesiidae). Mol Phylogenet Evol 91:56–67

Leray M, Yang JY, Meyer CP, Mills SC, Agudelo N, Ranwez V, BoehmJT, Machida RJ (2013) A new versatile primer set targeting a shortfragment of the mitochondrial COI region for metabarcoding meta-zoan diversity: application for characterizing coral reef fish gut con-tents. Front Zool. https://doi.org/10.1186/1742-9994-10-34

Longhorn SJ, Nicholas M, Chuter J, Vogler AP (2007) The utility ofmolecular markers from non-lethal DNA samples of the CITES II

198 Dev Genes Evol (2020) 230:185–201

Page 15: High-throughput sequencing for community analysis: the ......ORIGINAL ARTICLE High-throughput sequencing for community analysis: the promise of DNA barcoding to uncover diversity,

protected "tarantula" Brachypelma vegans (Araneae,Theraphosidae). J Arachnol 35:278–292

Macías-Hernández N, Athey K, Tonzo V, Wangensteen OS, Arnedo M,Harwood JD (2018) Molecular gut content analysis of different spi-der body parts. PLoS One. https://doi.org/10.1371/journal.pone.0196589

Manceau M, Lambert A, Morlon H (2015) Phylogenies support out-of-equilibrium models of biodiversity. Ecol Lett 18:347–356

Manley LJ, Ma D, Levine SS (2016) Monitoring error rates in Illuminasequencing. J Biomol Tech 27:125–128

Marquina D, Esparza-Salas R, Roslin T, Ronquist F (2019) Establishingarthropod community composition using metabarcoding: surprisinginconsistencies between soil samples and preservative ethanol andhomogenate frommalaise trap catches.Mol Ecol Resour. https://doi.org/10.1111/1755-0998.13071

Martins FM, Galhardo M, Filipe AF, Teixeira A, Pinheiro P, Paupério J,Alves PC, Beja P (2019) Have the cake and eat it: optimizing non-destructive DNA metabarcoding of macroinvertebrate samples forfreshwater biomonitoring. Mol Ecol Resour. https://doi.org/10.1111/1755-0998.13012

Meehan CJ, Olson EJ, Reudink MW, Kyser TK, Curry RL (2009)Herbivory in a spider through exploitation of an ant–plant mutual-ism. Curr Biol 19:R892–R893

Meier R, Wong W, Srivathsan A, Foo M (2016) $1 DNA barcodes forreconstructing complex phenomes and finding rare species inspecimen-rich samples. Cladistics 32:100–110

Meierotto S, Sharkey MJ, Janzen DH, Hallwachs W, Hebert PD,Chapman EG, Smith MA (2019) A revolutionary protocol to de-scribe understudied hyperdiverse taxa and overcome the taxonomicimpediment. Deut Entomol Z 66:119–145

Menegon M, Cantaloni C, Rodriguez-Prieto A, Centomo C, AbdelfattahA, Rossato M, Bernardi M, Xumerle L, Loader S, Delledonne M(2017) On site DNA barcoding by nanopore sequencing. PLoSOne.https://doi.org/10.1371/journal.pone.0184741

Michálek O, Petráková L, Pekár S (2017) Capture efficiency and trophicadaptations of a specialist and generalist predator: a comparison.Ecol Evol 7:2756–2766

Miller JA, Beentjes KK, van Helsdingen P, IJland S (2013) Which spec-imens from a museum collection will yield DNA barcodes? A timeseries study of spiders in alcohol. ZooKeys 365:245–261

Moritz C, Cicero C (2004) DNA barcoding: promise and pitfalls. PLoSBiol. https://doi.org/10.1371/journal.pbio.0020354

Morlon H (2014) Phylogenetic approaches for studying diversification.Ecol Lett 17:508–525

Nichols RV, Vollmers C, NewsomLA,WangY, Heintzman PD, LeightonM, Green RE, Shapiro B (2018) Minimizing polymerase biases inmetabarcoding. Mol Ecol Resour 18:927–939

Noguchi H, Park J, Takagi T (2006) MetaGene: prokaryotic gene findingfrom environmental shotgun sequences. Nucleic Acids Res 34:5623–5630

Nyffeler M, Birkhofer K (2017) An estimated 400–800 million tons ofprey are annually killed by the global spider community. Sci Nat-Heidelberg. https://doi.org/10.1007/s00114-017-1440-1

Nyffeler M, Olson EJ, Symondson WO (2016) Plant-eating by spiders. JArachnol 44:15–28

Nyffeler M, Şekercioğlu ÇH, Whelan CJ (2018) Insectivorous birds con-sume an estimated 400–500 million tons of prey annually. Sci NatHeidelberg 105:1–13. https://doi.org/10.1007/s00114-018-1571-z

Obertegger U, Cieplinski A, Fontaneto D, Papakostas S (2018)Mitonuclear discordance as a confounding factor in the DNA tax-onomy of monogonont rotifers. Zool Scr 47:122–132

Overcast I, Emerson BC, Hickerson MJ (2019) An integrated model ofpopulation genetics and community ecology. J Biogeogr 46:816–829

Overcast I, Ruffley M, Rosindell J, Harmon L, Borges P, Chase J,Emerson B, Etienne RS, Gillespie R, Krehenwinkel H, Mahler L,

Massol F, Parent C, Patiño J, Peter B,Week B,Wagner C, HickersonMJ, Rominger AJ (2020) What a MESS!: On the distribution ofabundance, genetic, and functional diversity in ecological commu-nities. bioRxiv

Papadopoulou A, Taberlet P, Zinger L (2015) Metagenome skimming forphylogenetic community ecology: a new era in biodiversity re-search. Mol Ecol 24:3515–3517

Payne A, Holmes N, Rakyan V, Loose M (2018) BulkVis: a graphicalviewer for Oxford nanopore bulk FAST5 files. Bioinformatics 35:2193–2198

PearseWD, Purvis A, Cavender-Bares J, HelmusMR (2014) Metrics andmodels of community phylogenetics. In: Garamszegi LZ (ed)Modern phylogenetic comparative methods and their applicationin evolutionary biology. Springer, Berlin, pp 451–464

Petersen SD, Mason T, Akber S, West R, White B, Wilson P (2007)Species identification of tarantulas using exuviae for internationalwildlife law enforcement. Conserv Genet 8:497–502

Petráková L, Líznarová E, Pekár S, Haddad CR, Sentenská L,SymondsonWO (2015) Discovery of a monophagous true predator,a specialist termite-eating spider (Araneae: Ammoxenidae). Sci RepUK 5:1–10. https://doi.org/10.1038/srep14013

Piñol J, San Andrés V, Clare EL, Mir G, Symondson WOC (2014) Apragmatic approach to the analysis of diets of generalist predators:the use of next-generation sequencing with no blocking probes. MolEcol Resour 14:18–26

Piñol J, Mir G, Gomez-Polo P, Agustí N (2015) Universal and blockingprimer mismatches limit the use of high-throughput DNA sequenc-ing for the quantitative metabarcoding of arthropods. Mol EcolResour 15:819–830

Piñol J, Senar MA, Symondson WO (2019) The choice of universalprimers and the characteristics of the species mixture determinewhen DNA metabarcoding can be quantitative. Mol Ecol 28:407–419

Piper AM, Batovska J, Cogan NO, Weiss J, Cunningham JP, Rodoni BC,Blacket MJ (2019) Prospects and challenges of implementing DNAmetabarcoding for high-throughput insect surveillance.GigaScience. https://doi.org/10.1093/gigascience/giz092

Pomerantz A, Peñafiel N, Arteaga A, Bustamante L, Pichardo F, ColomaLA, Barrio-Amorós CL, Salazar-Valenzuela D, Prost S (2018) Real-time DNA barcoding in a rainforest using nanopore sequencing:opportunities for rapid biodiversity assessments and local capacitybuilding. GigaScience. https://doi.org/10.1093/gigascience/giy033

Porco D, Rougerie R, Deharveng L, Hebert P (2010) Coupling non-destructive DNA extraction and voucher retrieval for small soft-bodied arthropods in a high-throughput context: the example ofCollembola. Mol Ecol Resour 10:942–945

Puillandre N, Lambert A, Brouillet S, Achaz G (2012) ABGD, automaticbarcode gap discovery for primary species delimitation. Mol Ecol21:1864–1877

Quick J, Loman NJ, Duraffour S, Simpson JT, Severi E, Cowley L, BoreJA, Koundouno R, Dudas G, Mikhail A, Ouédraogo N (2016) Real-time, portable genome sequencing for Ebola surveillance. Nature530:228–232

Raso L, Sint D, Mayer R, Plangg S, Recheis T, Brunner S, Kaufmann R,Traugott M (2014) Intraguild predation in pioneer predator commu-nities of alpine glacier forelands. Mol Ecol 23:3744–3754

Riechert SE, Lockley T (1984) Spiders as biological control agents. AnnuRev Entomol 29:299–320

Robinson E, Blagoev G, Hebert P, Adamowicz S (2009) Prospects forusing DNA barcoding to identify spiders in species-rich genera.ZooKeys 16:27–46

Rominger AJ, Merow C (2017) meteR: an r package for testing themaximum entropy theory of ecology.Methods Ecol Evol 8:241–247

Roubinet E, Birkhofer K, Malsher G, Staudacher K, Ekbom B, TraugottM, Jonsson M (2017) Diet of generalist predators reflects effects of

Dev Genes Evol (2020) 230:185–201 199

Page 16: High-throughput sequencing for community analysis: the ......ORIGINAL ARTICLE High-throughput sequencing for community analysis: the promise of DNA barcoding to uncover diversity,

cropping period and farming system on extra-and intraguild prey.Ecol Appl 27:1167–1177

RuffleyM, Peterson K,Week B, Tank DC, Harmon LJ (2019) Identifyingmodels of trait-mediated community assembly using random forestsand approximate Bayesian computation. Ecol Evol in press

Saitoh S, Aoyama H, Fujii S, Sunagawa H, Nagahama H, Akutsu M,Shinzato N, Kaneko N, Nakamori T (2016) A quantitative protocolfor DNA metabarcoding of springtails (Collembola). Genome 59:705–723

Satler JD, Carstens BC, Hedin M (2013) Multilocus species delimitationin a complex of morphologically conserved trapdoor spiders(Mygalomorphae, Antrodiaetidae, Aliatypus). Syst Biol 62:805–823

Schmidt JM, Barney SK, Williams MA, Bessin RT, Coolong TW,Harwood JD (2014) Predator–prey trophic relationships in responseto organic management practices. Mol Ecol 23:3777–3789

Schmidt NM, Mosbacher JB, Eitzinger B, Vesterinen EJ, Roslin T (2018)High resistance towards herbivore-induced habitat change in a highArctic arthropod community. Biol Lett UK. https://doi.org/10.1098/rsbl.2018.0054

Schoch CL, Seifert KA, Huhndorf S, Robert V, Spouge JL, Levesque CA,Chen W, Fungal Barcoding Consortium (2012) Nuclear ribosomalinternal transcribed spacer (ITS) region as a universal DNA barcodemarker for Fungi. P Natl Acad Sci USA 109:6241–6246

Shokralla S, Porter TM, Gibson JF, Dobosz R, Janzen DH, HallwachsW,Golding GB, Hajibabaei M (2015) Massively parallel multiplexDNA sequencing for specimen identification using an IlluminaMiSeq platform. Sci Rep UK. https://doi.org/10.1038/srep09687

Shokralla S, Gibson J, King I, Baird D, Janzen D, Hallwachs W,Hajibabaei M (2016) Environmental DNA barcode sequence cap-ture: Targeted, PCR-free sequence capture for biodiversity analysisfrom bulk environmental samples. BioRxiv. https://doi.org/10.1101/087437

Sint D, Thurner I, Kaufmann R, Traugott M (2015) Sparing spiders:Faeces as a non-invasive source of DNA. Front Zool 12:1–5.https://doi.org/10.1186/s12983-015-0096-y

Sint D, Kaufmann R,Mayer R, Traugott M (2019) Resolving the predatorfirst paradox: arthropod predator food webs in pioneer sites of gla-cier forelands. Mol Ecol 28:336–347

Srivathsan A, Hartop E, Puniamoorthy J, Lee WT, Kutty SN, Kurina O,Meier R (2019) Rapid, large-scale species discovery in hyperdiversetaxa using 1DMinION sequencing. BMC Biol 17:1–20. https://doi.org/10.1186/s12915-019-0706-9

Starrett J, HedinM (2007) Multilocus genealogies reveal multiple crypticspecies and biogeographical complexity in the California turret spi-der Antrodiaetus riversi (Mygalomorphae, Antrodiaetidae). MolEcol 16:583–604

Sternes PR, Lee D, Kutyna DR, Borneman AR (2017) A combined meta-barcoding and shotgun metagenomic analysis of spontaneous winefermentation. GigaScience. https://doi.org/10.1093/gigascience/gix040

Taberlet P, Coissac E, Pompanon F, Brochmann C, Willerslev E (2012)Towards next-generation biodiversity assessment using DNAmetabarcoding. Mol Ecol 21:2045–2050

Tautz D, Arctander P, Minelli A, Thomas RH, Vogler AP (2003) A pleafor DNA taxonomy. Trends Ecol Evol 18:70–74

Tedersoo L, Anslan S (2019) Towards PacBio-based pan-eukaryotemetabarcoding using full-length ITS sequences. Environ MicrobiolRep. https://doi.org/10.1111/1758-2229.12776

Tedersoo L, Tooming-Klunderud A, Anslan S (2018) PacBiometabarcoding of Fungi and other eukaryotes: errors, biases andperspectives. New Phytol 217:1370–1385

Thomas AC, Deagle BE, Eveson JP, Harsch CH, Trites AW (2016)Quantitative DNA metabarcoding: improved estimates of speciesproportional biomass using correction factors derived from controlmaterial. Mol Ecol Resour 16:714–726

Thomsen PF, Sigsgaard EE (2019) Environmental DNA metabarcodingof wild flowers reveals diverse communities of terrestrial arthro-pods. Ecol Evol 9:1665–1679

Thomson LJ, Hoffmann AA (2010) Natural enemy responses and pestcontrol: importance of local vegetation. Biol Control 52:160–166

Thongjued K, Chotigeat W, Bumrungsri S, Thanakiatkrai P, Kitpipit T(2019) A new cost-effective and fast direct PCR protocol for insectsbased on PBS buffer. Mol Ecol Resour 19:691–701

Tilman D (2004) Niche tradeoffs, neutrality, and community structure: astochastic theory of resource competition, invasion, and communityassembly. P Natl Acad Sci USA 101:10854–10861

Toju H, Baba YG (2018) DNA metabarcoding of spiders, insects, andspringtails for exploring potential linkage between above-and be-low-ground food webs. Zool Lett 4:1–12. https://doi.org/10.1186/s40851-018-0088-9

Valentini A, Taberlet P, Miaud C, Civade R, Herder J, Thomsen PF,Bellemain E, Besnard A, Coissac E, Boyer F, Gaboriaud C (2016)Next-generation monitoring of aquatic biodiversity using environ-mental DNA metabarcoding. Mol Ecol 25:929–942

Vellend M (2005) Species diversity and genetic diversity: parallel pro-cesses and correlated patterns. Am Nat 166:199–215

VellendM, Lajoie G, Bourret A, Múrria C, Kembel SW, Garant D (2014)Drawing ecological inferences from coincident patterns ofpopulation- and community-level biodiversity. Mol Ecol 23:2890–2901

Verschut V, Strandmark A, Esparza-Salas R, Hambäck PA (2019)Seasonally varying marine influences on the coastal ecosystem de-tected through molecular gut analysis. Mol Ecol 28:307–317

WalterMC, Zwirglmaier K, Vette P, Holowachuk SA, Stoecker K, GenzelGH, AntwerpenMH (2017) MinION as part of a biomedical rapidlydeployable laboratory. J Biotechnol 250:16–22

Wang WY, Srivathsan A, Foo M, Yamane SK, Meier R (2018) Sortingspecimen-rich invertebrate samples with cost-effective NGSbarcodes: validating a reverse workflow for specimen processing.Mol Ecol Resour 18:490–501

Weber MG, Wagner CE, Best RJ, Harmon LJ, Matthews B (2017)Evolution in a community context: on integrating ecological inter-actions and macroevolution. Trends Ecol Evol 32:291–304

Wenger AM, Peluso P, Rowell WJ, Chang PC, Hall RJ, Concepcion GT,Ebler J, Fungtammasan A, Kolesnikov A, Olson ND, Töpfer A(2019) Accurate circular consensus long-read sequencing improvesvariant detection and assembly of a human genome. Nat Biotechnol37:1155–1162

Whitney TD, Sitvarin MI, Roualdes EA, Bonner SJ, Harwood JD (2018)Selectivity underlies the dissociation between seasonal prey avail-ability and prey consumption in a generalist predator. Mol Ecol 27:1739–1748

Wick RR, Judd LM, Holt KE (2018) Deepbinner: Demultiplexingbarcoded Oxford Nanopore reads with deep convolutional neuralnetworks. PLoS Comput Biol. https://doi.org/10.1371/journal.pcbi.1006583

Wise DH, Moldenhauer DM, Halaj J (2006) Using stable isotopes toreveal shifts in prey consumption by generalist predators. EcolAppl 16:865–876

Wong WH, Tay YC, Puniamoorthy J, Balke M, Cranston PS, Meier R(2014) ‘Direct PCR’ optimization yields a rapid, cost-effective, non-destructive and efficient method for obtaining DNA barcodes with-out DNA extraction. Mol Ecol Resour 14:1271–1280

Wood HM, Griswold CE, Gillespie RG (2012) Phylogenetic placementof pelican spiders (Archaeidae, Araneae), with insight into evolutionof the “neck” and predatory behaviours of the superfamilyPalpimanoidea. Cladistics 28:598–626

Yeo D, Puniamoorthy J, Ngiam RWJ, Meier R (2018) Towardsholomorphology in entomology: rapid and cost-effective adult–larva matching using NGS barcodes. Syst Entomol 43:678–691

200 Dev Genes Evol (2020) 230:185–201

Page 17: High-throughput sequencing for community analysis: the ......ORIGINAL ARTICLE High-throughput sequencing for community analysis: the promise of DNA barcoding to uncover diversity,

Yu DW, Ji Y, Emerson BC, Wang X, Ye C, Yang C, Ding Z (2012)Biodiversity soup: Metabarcoding of arthropods for rapid biodiver-sity assessment and biomonitoring. Methods Ecol Evol 3:613–623

Zeale MR, Butlin RK, Barker GL, Lees DC, Jones G (2011) Taxon-specific PCR for DNA barcoding arthropod prey in bat faeces.Mol Ecol Resour 11:236–244

Zhang J, Kapli P, Pavlidis P, Stamatakis A (2013) A general speciesdelimitation method with applications to phylogenetic placements.Bioinformatics 29:2869–2876

Zhang GK, Chain FJ, Abbott CL, Cristescu ME (2018) Metabarcodingusing multiplexed markers increases species detection in complexzooplankton communities. Evol Appl 11:1901–1914

ZhouX, Li Y, Liu S, Yang Q, SuX, Zhou L, TangM, FuR, Li J, HuangQ(2013) Ultra-deep sequencing enables high-fidelity recovery of bio-diversity for bulk arthropod samples without PCR amplification.GigaScience. https://doi.org/10.1186/2047-217X-2-4

Publisher’s note Springer Nature remains neutral with regard to jurisdic-tional claims in published maps and institutional affiliations.

Dev Genes Evol (2020) 230:185–201 201