

Conservation Genetics https://doi.org/10.1007/s10592-019-01224-x

REVIEW ARTICLE

Genome projects in invasion biology

Michael A. McCartney1  · Sophie Mallez1 · Daryl M. Gohl2,3

Received: 9 July 2018 / Accepted: 4 September 2019 © Springer Nature B.V. 2019

Abstract

Advances in sequencing and informatics and rapidly falling costs have made genome sequencing projects far more accessible to researchers in all of the life sciences, including invasion biology. A complete genome is now the most efficient first step towards identifying and characterizing candidate genes that control invasiveness. At the genomic level, fundamental problems in invasion science can be pursued with great precision and rigor. This includes reconstruction of the history of invasions, analysis of demographic dynamics within colonizing populations, and study of the rapid, adaptive evolution of invasiveness. This update documents new developments in the emerging field of invasion genomics. Our review found that of 100 of the world’s most damaging invasive species, assembled genomes are available for 27, a minority but still a considerable resource. This calls for a larger investment in genomics, but also highlights publicly available genomic resources for invasive species that remain underutilized. We examine the value of reference genomes. We conclude that while some technologies (e.g. genotyping by Next Generation Sequencing) can be applied without reference genomes or with fragmented ones, investments in high quality genome assemblies will provide considerable long-term benefits in invasion and conservation genomics research programs.

Keywords Invasion genomics · Invasive species · Population genomics · Genome assembly

Introduction

A sequenced genome will soon become a routine element of biological research. Costs drop almost monthly, and advances in data collection and analysis occur so rapidly that projects often take advantage of new inventions while underway. One notable recent example is the Mexican axolotl, whose 32 gigabase genome, about 10 times the length of the Homo sapiens genome, required a new algorithm to be written for its assembly (Nowoshilow et al. 2018). Many of the life sciences can now benefit from the power of genomics, and this includes invasion biology. Here we review contributions of genomics to the study of biological invasions to date, highlight some future directions, and comment on research strategies.

Historical framework

Although both fields have a similarly brief history, progress in invasion genomics has been slow relative to conservation genomics, which has been the subject of excellent reviews (Luikart et al. 2003; Allendorf et al. 2010, 2013; Allendorf 2017). Allendorf (2017) notes that genomics has been applied to natural populations of non-model species for less than two decades. He cites Black et al. (2001) as the first publication to use the term “population genomics.” This is an important paper, aimed at entomologists but of interest to a broader audience in population genetics. The authors made a strong case for genomic approaches to previously unresolved issues in molecular population genetics theory, and to insect pest control (Black et al. 2001). Luikart et al. (2003) is another important foundational review and perspective, focused on the utility of genomics to approach long-standing problems in conservation and population genetics, such as the relationship between census size and effective population size (Ne), and outlier testing for loci under selection.

* Michael A. McCartney

1 Minnesota Aquatic Invasive Species Research Center and Dept. of Fisheries, Wildlife and Conservation Biology, University of Minnesota, 2003 Upper Buford Circle, St. Paul, MN 55108, USA

2 University of Minnesota Genomics Center, 2003 6th Street SE, Minneapolis, MN 55455, USA

3 Department of Genetics, Cell Biology, and Developmental Biology, University of Minnesota, Minneapolis, MN 55455, USA


While its origins can be traced to work on the evolutionary genetics of colonizing species more than 50 years ago (e.g. Baker and Stebbins 1965), “invasion genetics” emerged as a discipline in the late 1990s (Barrett 2017). Estoup and Guillemaud (2010) is an influential paper that helped transform invasion genetics into the active, vibrant field it is today by popularizing statistically robust methods for contrasting scenarios for the history of invasions. They considered natural and human-mediated range expansions under the same umbrella, stressed their complementary value for research (different timescales capture different phases of an historical process), and offered pragmatic advice for analysis. For example, given the short time scales over which “biological” (i.e. human-mediated) invasions occur, they emphasized analyses that do not depend on population genetic equilibria, such as genetic clustering methods, and provided a how-to for using coalescent simulations to test alternative invasion scenarios using Approximate Bayesian Computation (ABC). Importantly, they recognized that resolving invasion sources and pathways would inform management, but also that such information is needed in basic research. For example, identification of geographic source populations is required for evolutionary comparisons between native and invasive-range populations (Estoup and Guillemaud 2010).
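The logic of ABC scenario testing can be illustrated with a toy rejection sampler. The sketch below is ours and is not the procedure of Estoup and Guillemaud (2010): in place of coalescent simulations it uses a simple forward Wright-Fisher drift model, and the two scenario names, the founder-size ranges, the starting allele frequency of 0.5, the tolerance, and the "observed" heterozygosity of 0.33 are all hypothetical values chosen for illustration.

```python
import random

# Hypothetical competing scenarios: (min, max) number of diploid founders.
SCENARIOS = {
    "severe_bottleneck": (2, 10),
    "mild_bottleneck": (30, 60),
}


def simulate_heterozygosity(n_founders, generations, n_loci, rng):
    """Mean expected heterozygosity across loci after drift at constant size.

    Forward Wright-Fisher sampling is used here as a stand-in for the
    coalescent simulations normally used in ABC.
    """
    copies = 2 * n_founders  # gene copies in a diploid population
    total = 0.0
    for _ in range(n_loci):
        p = 0.5  # assumed allele frequency in the source population
        for _ in range(generations):
            count = sum(1 for _ in range(copies) if rng.random() < p)
            p = count / copies
        total += 2.0 * p * (1.0 - p)
    return total / n_loci


def abc_model_choice(observed, n_sims=400, tolerance=0.03,
                     generations=3, n_loci=30, seed=1):
    """Rejection ABC: approximate posterior probability of each scenario.

    Simulations whose summary statistic lies within `tolerance` of the
    observed value are accepted; with equal prior weight and equal numbers
    of simulations per scenario, each scenario's share of the accepted
    simulations approximates its posterior probability.
    """
    rng = random.Random(seed)
    accepted = {name: 0 for name in SCENARIOS}
    for name, (low, high) in SCENARIOS.items():
        for _ in range(n_sims):
            n_founders = rng.randint(low, high)  # draw from a uniform prior
            h = simulate_heterozygosity(n_founders, generations, n_loci, rng)
            if abs(h - observed) <= tolerance:
                accepted[name] += 1
    total = sum(accepted.values())
    if total == 0:
        raise ValueError("no simulations accepted; widen the tolerance")
    return {name: count / total for name, count in accepted.items()}


if __name__ == "__main__":
    # 0.33 is a made-up observed mean heterozygosity, not real data.
    print(abc_model_choice(observed=0.33))
```

Real analyses use many summary statistics, explicit sampling schemes, and far more simulations; the point here is only that scenarios are compared by how often their simulated data resemble the observed data, with no assumption of population genetic equilibrium.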

The literature of invasion genetics since 2010 is filled with attempts to reconstruct colonization history (Cristescu 2015). Until very recently, these studies utilized single or few loci (a few using multilocus PCR-based AFLP fingerprinting are exceptions), but genome-scale studies are gaining momentum (Cristescu 2015; Elleouet and Aitken 2018). Like Allendorf (2017), Cristescu (2015) recognized work on natural range expansions of the threespine stickleback [Gasterosteus aculeatus: (Hohenlohe et al. 2010; Catchen et al. 2013a)] as the first genuine population genomic studies. Reviews of invasion genetics document the growing interest in genomic technologies; examples are Chown et al. (2015) on climate change and invasiveness, Sherman et al. (2016) and Bourne et al. (2018) on genomics of marine invasive species, and Pelissie et al. (2018) on insect pest species.

The stickleback studies (and many others since) employed RAD-seq (Restriction-site Associated DNA sequencing), a technology for genome-wide discovery and genotyping of single nucleotide polymorphisms (SNPs) via next generation sequencing (NGS) of the genomic DNA regions that flank restriction sites in restriction-digested DNA. Genotyping at SNP loci via RAD-seq, Genotyping By Sequencing (GBS), and related protocols has been covered in several excellent reviews that report on the continuous improvement of molecular and analytical methods (Davey et al. 2011; Nielsen et al. 2011; Catchen et al. 2013b; Narum et al. 2013; Mastretta-Yanes et al. 2015; Andrews et al. 2016). These are “reduced representation” methods: low coverage NGS protocols that sequence portions of the genome adjacent to the ends of genomic DNA fragments generated by various methods, and broadly survey the genome to discover and genotype thousands of SNPs.
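The idea of reduced representation can be made concrete with an in silico digest. The following is a minimal sketch under our own assumptions: the function name `rad_tags`, the toy sequence, and the 10 bp flank length are hypothetical, and the default recognition sequence CCTGCAGG (that of the enzyme SbfI) is chosen only as an example. Real protocols also involve cut-site offsets, adapters, and size selection that are ignored here.

```python
import re


def rad_tags(sequence, site="CCTGCAGG", flank=10):
    """Return (site_start, upstream_flank, downstream_flank) for each site.

    Only the short regions flanking each restriction site are kept,
    mimicking how reduced-representation methods sample the genome.
    """
    seq = sequence.upper()
    tags = []
    # A zero-width lookahead finds every site start, including overlaps.
    for match in re.finditer(f"(?={re.escape(site)})", seq):
        start = match.start()
        end = start + len(site)
        upstream = seq[max(0, start - flank):start]
        downstream = seq[end:end + flank]
        tags.append((start, upstream, downstream))
    return tags


if __name__ == "__main__":
    # Toy sequence with two restriction sites (not real genomic data).
    toy = "ACGTACGTAC" + "CCTGCAGG" + "TTGACCATGA" + "GGGG" + "CCTGCAGG" + "ATAT"
    for start, up, down in rad_tags(toy):
        print(start, up, down)
```

Because the same sites are cut in every individual, the same flanking regions are sequenced across a sample, which is what allows SNPs to be genotyped at thousands of shared loci without sequencing whole genomes.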

Scope of this review

Our group is involved in a de novo genome sequencing project on the zebra mussel, a highly damaging aquatic invasive species. The present report was inspired by the review of NGS studies of biological invasions by Rius et al. (2015). Our goals are to update this valuable review, to revisit progress in invasion genomics and to evaluate the contributions of de novo genome sequencing projects. Rius et al. (2015) tallied few whole-genome sequencing projects on invasive species; in fact the NGS applications they reviewed were dominated by analysis without a reference genome or with a fragmented one. In the 4 years since, a high-quality reference genome has become increasingly accessible: one is already available for several invasive species, and generating one is well within reach for a growing number of researchers working on their species of interest. Still, since a reference genome represents a substantial up-front cost for projects in invasion genomics, one of our aims was to examine the advantages of having one.

Genomes in invasion biology

To re-evaluate contributions of genomics to invasive species research to date, we first searched the 100 alien invasive species that are the “world’s worst” [according to the International Union for the Conservation of Nature (IUCN: Lowe et al. 2000)] for assembled genomes deposited at the US National Center for Biotechnology Information (NCBI 2019a, accessed 12 Aug 2019). We found assemblies for 27 of these 100 species (Table 1). At first glance, this represents a sizable investment in genomics. But of course, there are many reasons to sequence a genome, so next we examined the publications that announced the public release of these genomes, and in some cases those published soon after, to recheck our conclusions. About half of the projects made no mention of the invasiveness of the species, and stressed instead the economic value and/or use of the species as a research model [e.g. Oncorhynchus mykiss (rainbow trout), Sus scrofa (pig), Capra hircus (goat), Mus musculus (mouse)]. In 13 of these projects, authors referred to the species’ invasiveness in the rationale for the sequencing project; in most of these cases they also conducted explicit analysis of the genome to address questions relevant to invasion biology (Table 2). The earlier review by Rius et al. (2015) tallied 118 projects on invasive species that had used NGS technologies. At that time, only 7 were de novo genome sequencing projects. They also found that invasion biology per se had motivated only 40, or about 33%, of projects, a lower proportion than in our count (13 of 27, or about 48%).

Table 1 Sequenced genomes from 100 of the world’s worst alien invasive species

From these 100 species, genomes have been sequenced, assembled and deposited at NCBI for the 27 species below. The five rightmost columns provide metrics for the length and quality of the sequenced genomes (Box 1). Assembly level: chromosome level is when scaffolds are linked together such that biological chromosomes are assembled to completion, or nearly so. Because gaps remain, assemblies at the chromosome level typically have contigs that were not assigned to chromosomes, such that the number of scaffolds exceeds the number of biological chromosomes in most cases. Genome length is the total length of the assembled genome in gigabase pairs (Gb)

Species | Common name | Higher taxon or group | Major impacts | Strain or isolate (year): assembly accession # | Assembly level | Number of scaffolds | Scaffold N50 (bp) | Number of contigs | Contig N50 (bp) | Genome length (Gb)
Rana catesbeiana | Bullfrog | Amphibian | Preys upon and outcompetes native amphibians | Bruno isolate (2017): GCA_002284835.2 | Scaffold | 1,544,635 | 39,363 | 2,124,505 | 5415 | 6.25
Rhinella marina | Cane toad | Amphibian | Toxic skin glands poison predators upon ingestion, endangering native species | Wild (2018): GCA_900303285.1 | Contig | – | – | 31,391 | 167,498 | 2.552
Pomacea canaliculata | Golden apple snail | Aquatic invertebrate | Voracious feeder on crops and native vegetation | Isolate SZHN2017 (2018): GCA_003073045.1 | Chromosome | 24 | 31,531,291 | 746 | 1,072,857 | 0.44
Mnemiopsis leidyi | Comb jelly | Aquatic invertebrate | Invasive carnivore that consumes zooplankton | Wild (2011): GCA_000226015.1 | Scaffold | 5100 | 187,314 | 24,927 | 11,914 | 0.156
Mytilus galloprovincialis | Mediterranean blue mussel | Aquatic invertebrate | Marine mussel that displaces native species | Wild (2017): GCA_001676915.1 | Scaffold | 1,002,334 | 2931 | 1,136,100 | 2627 | 1.5
Sturnus vulgaris | Starling | Bird | Outcompetes native birds for nesting sites and damages fruits and other crops | Isolate 715 (2015): GCA_001447265.1 | Scaffold | 2361 | 3,416,708 | 22,666 | 151,865 | 1.037
Gambusia affinis | Western mosquito fish | Fish | Causes decline and extinction of other small native fishes through competition | NE01/NJP1002.9 (2018): GCA_003097735.1 | Scaffold | 2943 | 6,651,460 | 73,682 | 17,511 | 0.599
Cyprinus carpio | Common carp | Fish | Uproots aquatic vegetation, causing declines in plants, other fishes and water quality | NA (2014): GCA_000951615.2 | Chromosome | 9378 | 7,828,959 | 53,088 | 75,080 | 1.714
Oncorhynchus mykiss | Rainbow trout | Fish | Preys upon and outcompetes native fishes; often hybridizes with native trout | Swanson isolate (2017): GCA_002163495.1 | Chromosome | 139,800 | 1,670,138 | 559,855 | 13,827 | 2.179
Aphanomyces astaci | Crayfish plague | Fungus | Water mold lethal to European crayfish but endemic in North American host species | Strain APO3 (2014): GCA_000520075.1 | Scaffold | 835 | 657,536 | 4659 | 36,439 | 0.076
Batrachochytrium dendrobatidis | Frog chytrid fungus | Fungus | Cause of population declines and extinctions of amphibian species | Isolate JAM81 (2011): GCA_000203795.1 | Scaffold | 127 | 1,484,462 | 510 | 318,114 | 0.024
Phytophthora cinnamomi | Phytophthora root rot | Fungus | Emerging plant pathogen infecting ~5000 native forest trees and crop plants worldwide | Strain MP94-48 (2015): GCA_001314365.1 | Scaffold | 5777 | 24,869 | 5831 | 24,715 | 0.054
Euphorbia esula | Leafy spurge | Land plant | Aggressive weed in rangelands of North America | Cultivar 1984-ND001 (2018): GCA_002919075.1 | Scaffold | 1,633,094 | 1035 | 2,242,201 | 605 | 1.125
Mus musculus | Mouse | Mammal | Economic pests, carriers of human disease, several negative impacts on invaded ecosystems | Strain C57BL/6J (2017): GCA_000001635.8 | Chromosome | 162 | 54,517,951 | 605 | 32,813,180 | 2.731
Oryctolagus cuniculus | Rabbit | Mammal | Degrades biodiversity, particularly in introduced areas that lack predators | Thorbecke inbred breed (2005): GCA_000003625.1 | Chromosome | 3318 | 35,972,871 | 84,024 | 64,648 | 2.737
Felis catus | Domestic cat | Mammal | Voracious predators on native birds, reptiles and mammals, causing local extinctions | Cinnamon isolate (2017): GCA_000181335.4 | Chromosome | 4525 | 83,967,707 | 4909 | 41,915,695 | 2.522
Macaca fascicularis | Crab-eating macaque | Mammal | Lower native bird diversity by eating eggs and chicks, and competing for food | Wild (2013): GCA_000364345.1 | Chromosome | 7625 | 88,649,475 | 87,764 | 86,040 | 2.947
Cervus elaphus | Red deer | Mammal | Strong impacts on native forest flora and fauna in invaded range | Subspecies hippelaphus, Hungarian isolate (2017): GCA_002197005.1 | Chromosome | 11,479 | 107,358,006 | 406,637 | 7944 | 3.439
Sus scrofa | Pig | Mammal | Feral pigs are pests of crops and property, dig up native vegetation, prey on several native species | Crossbreed isolate 201423004 (2017): GCA_000003025.6 | Chromosome | 14,157 | 131,458,098 | 14,818 | 6,372,407 | 2.755
Capra hircus | Goat | Mammal | Voracious grazers with great impacts to vegetation and cascading effects, particularly on islands | San Clemente breed (2016): GCA_001704415.1 | Chromosome | 29,907 | 87,277,232 | 30,399 | 26,244,591 | 2.923
Plasmodium relictum | Avian malaria | Protist | Parasites of birds, causing wide-ranging levels of mortality; extinctions of Hawaiian bird species | Strain SGS1 (2015): GCA_900005765.1 | Chromosome | 514 | 1,287,098 | 724 | 583,861 | 0.023
Linepithema humile | Argentine ant | Terrestrial invertebrate | Often displaces native ants | Wild (2011): GCA_000217595.1 | Scaffold | 3030 | 1,402,257 | 18,227 | 35,858 | 0.22
Anoplophora glabripennis | Asian long-horned beetle | Terrestrial invertebrate | Wood feeding pest of trees in forests and urban settings | ALB-LARVAE (2016): GCA_000390285.2 | Scaffold | 9867 | 678,234 | 26,749 | 80,490 | 0.707
Bemisia tabaci | Sweet potato whitefly | Terrestrial invertebrate | Pest of vegetable crops and ornamentals with vast host range | Isolate MEAM1 (2016): GCA_001854935.1 | Scaffold | 19,751 | 3,232,964 | 31,571 | 84,501 | 0.615
Solenopsis invicta | Red imported fire ant | Terrestrial invertebrate | Highly damaging nuisance species and pest of crop plants, livestock | Wild (2018): GCA_000188075.2 | Scaffold | 66,904 | 621,039 | 87,016 | 21,161 | 0.398
Wasmannia auropunctata | Little fire ant | Terrestrial invertebrate | Stinging ants that displace native species and harm crop plants | Strain WASHAW1 (2015): GCA_000956235.1 | Scaffold | 77,788 | 1,175,369 | 103,610 | 37,912 | 0.324
Aedes albopictus | Asian tiger mosquito | Terrestrial invertebrate | Widespread vector of yellow, dengue and Chikungunya fever viruses | Foshan isolate (2015): GCA_001444175.2 | Scaffold | 154,782 | 201,017 | 355,061 | 18,430 | 1.923
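The scaffold N50 and contig N50 values reported in Table 1 are summary statistics of assembly contiguity. As a minimal sketch of the usual definition (the function name and the example lengths below are ours and hypothetical, not taken from any assembly in the table), N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length:

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths (bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths in bp, for illustration only.
    contigs = [80, 70, 50, 40, 30, 20]
    print(n50(contigs))
```

A larger N50 for the same genome length indicates that more of the assembly sits in long, uninterrupted sequences, which is why the statistic is commonly used to compare fragmented and chromosome-level assemblies.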

Meanwhile, whole genomes of invasive species are being sequenced at an increasingly brisk pace. Of the 27 we reviewed, 13 genomes were completed from 2005 to 2015, with the remaining 14 in the last three years. This parallels the rise in the number of “Genome sequencing and assembly” BioProjects deposited over the same time intervals (NCBI 2019b, accessed 13 August 2019). So our first conclusion is that the majority of high-priority invasive species lack sequenced genomes, pointing to a need for more. On the other hand, a growing number of genomes can be mined to ask interesting, fundamental questions.

Topics in invasion genomics research

History and routes of invasion

Determining the history and geographic pathways of colonization is the most common goal of invasion genetic and, more recently, genomic studies. Below we describe several cases in which both approaches have been used, to allow comparison. We begin with examples of natural range expansions. In a recent review Cristescu (2015) concluded that natural and human-mediated range expansions share similar evolutionary dynamics, played out over different time scales. We examine two natural range expansions that followed the retreat of Pleistocene glaciers across North America.

Earlier population genetic work with multiple-locus nuclear markers (microsatellites and single nuclear loci) demonstrated that threespine stickleback (Gasterosteus aculeatus) populations in freshwater drainages along the Pacific coast of North America originated from multiple independent colonizations from ancestral marine populations (Cresko et al. 2004; Catchen et al. 2013a). These events have provided an opportunity to study the evolutionary consequences of these “replicated” freshwater invasions during the past 10,000–20,000 years. For example, freshwater populations have evolved, in parallel, reductions in the bony plate armor covering the body of these small fish. One likely cause is that selection favoring the protective armor in the more predator-rich marine environment was relaxed as sticklebacks entered lakes and streams (Cresko et al. 2004).

Hohenlohe et al. (2010) used genomic approaches to investigate parallel phenotypic evolution in two marine and three freshwater stickleback populations in southeast Alaska. The first stickleback reference genome (NCBI 2019c, accessed 13 August 2019) came from one of the freshwater populations, and aided in the scoring of SNP genotypes by RAD-seq. Genomic data supported the “replicate invasion scenario” described above. Next, the authors used a moving average to scan windows of DNA sequence, and

Page 8: Genome projects in invasion biologyConservation Genetics 1 3 Whileitsoriginscanbetracedtoworkontheevolution-arygeneticsofcolonizingspeciesmorethan50yearsago (ya)(e.g.BakerandStebbins1965

Conservation Genetics

1 3

Tabl

e 2

Gen

ome

sequ

enci

ng o

f the

wor

ld’s

wor

st al

ien

inva

sive

spec

ies:

pro

ject

goa

ls

Spec

ies

Com

mon

nam

eH

ighe

r tax

on o

r gro

upSt

ated

goa

ls fo

r gen

ome

proj

ect

Refe

renc

e(s)

Inva

sion

ge

nom

ic

anal

ysis

Rana

cat

esbe

iana

Bul

lfrog

Am

phib

ian

Gen

omic

reso

urce

for b

iolo

gy o

f tru

e fro

gs

(Ran

idae

); fo

cus o

n de

velo

pmen

tal b

iol-

ogy

Ham

mon

d et

 al.

(201

7)N

Rhin

ella

mar

ina

Can

e to

adA

mph

ibia

nLa

ck o

f dra

ft ge

nom

e to

und

erst

and

inva

-si

vene

ssEd

war

ds e

t al.

(201

8)N

Pom

acea

can

alic

ulat

aG

olde

n ap

ple

snai

lA

quat

ic in

verte

brat

eH

igh

qual

ity g

enom

e to

stud

y ge

nes e

volv

-in

g ad

aptiv

ely;

gen

es c

ontro

lling

stre

ss

tole

ranc

e; m

etag

enom

ics o

f gut

flor

a

Liu

et a

l. (2

018)

Y

Mne

mio

psis

leid

yiC

omb

jelly

Aqu

atic

inve

rtebr

ate

Early

met

azoa

n ev

olut

ion,

orig

ins o

f cel

l ty

pes

Ryan

et a

l. (2

013)

N

Myt

ilus g

allo

prov

inci

alis

Med

iterr

enea

n bl

ue m

usse

lA

quat

ic in

verte

brat

eG

enom

ic k

now

ledg

e of

myt

ilid

mus

sels

Mur

gare

lla e

t al.

(201

6)N

Stur

nus v

ulga

ris

Star

ling

Bird

Gen

omic

reso

urce

inva

sive

spec

ies

Unp

ublis

hed

Gam

busi

a affi

nis

Wes

tern

mos

quito

fish

Fish

Gen

omic

reso

urce

inva

sive

spec

…ies | Hoffberg et al. (2018) | Y
Cyprinus carpio | Common carp | Fish | Genomic resource for aquaculture and marker-assisted breeding of ornamental varieties (e.g. koi) | Xu et al. (2014) | N
Oncorhynchus mykiss | Rainbow trout | Fish | Evolutionary analysis of vertebrate whole genome duplication; resource for biological research on model fish species and for aquaculture | Berthelot et al. (2014) | N
Aphanomyces astaci | Crayfish plague | Fungus | Reference genome for mitogenomic analysis and source tracking of invasive genotypes | Makkonen et al. (2016) and Minardi et al. (2018) | Y
Batrachochytrium dendrobatidis | Frog chytrid fungus | Fungus | Population genomics of emerging disease and evolutionary transition to pathogenicity | Farrer et al. (2011) and Joneson et al. (2011) | Y
Phytophthora cinnamomi | Phytophthora root rot | Fungus | Factors involved in plant-pathogen interactions and marker development to study spread of emerging disease | Studholme et al. (2016) and Engelbrecht et al. (2017) | Y
Euphorbia esula | Leafy spurge | Land plant | Draft genome (of allohexaploid) to understand weediness genes in species with more genomic resources | Horvath et al. (2018) | Y
Mus musculus | Mouse | Mammal | Genomic resource, biomedical model species | Roberts et al. (2009) | N
Oryctolagus cuniculus | Rabbit | Mammal | Evolutionary genomics of domestication | Carneiro et al. (2014) | N
Felis catus | Domestic cat | Mammal | Evolutionary genomics of domestication | Tamazian et al. (2014) | N
Macaca fascicularis | Crab-eating macaque | Mammal | Knowledge of introgression from congener advises use of species as biomedical model | Higashino et al. (2012) | N


Table 2 (continued)

Species | Common name | Higher taxon or group | Stated goals for genome project | Reference(s) | Invasion genomic analysis
Cervus elaphus | Red deer | Mammal | GWAS for knowledge of half-domesticated farm-bred animals | Bana et al. (2018) | N
Sus scrofa | Pig | Mammal | De novo assembly and comparison of SNPs across pig breeds worldwide | Li et al. (2017) | N
Capra hircus | Goat | Mammal | Hi-C scaffolding of reference genome: domestic goat breeds (most contiguous mammalian genome at the time) | Bickhart et al. (2017) | N
Plasmodium relictum | Avian malaria | Protist | Draft genomes from lineage that infects birds and study of evolution of invasion-related genes | Bohme et al. (2018) | Y
Linepithema humile | Argentine ant | Terrestrial invertebrate | Genomic resource for highly invasive species; model ant species for invasion genomics | Smith et al. (2011) | Y
Anoplophora glabripennis | Asian longhorned beetle | Terrestrial invertebrate | Genomics of plant-feeding invasive species | McKenna et al. (2016) | Y
Bemisia tabaci | Sweet potato whitefly | Terrestrial invertebrate | Genomic resource for highly invasive crop pest and virus vector | Chen et al. (2016) | Y
Solenopsis invicta | Red imported fire ant | Terrestrial invertebrate | Genomic resource for highly invasive pest; evolution of genes associated with sociality in Hymenoptera | Wurm et al. (2011) | Y
Wasmannia auropunctata | Little fire ant | Terrestrial invertebrate | Genomic resource | Unpublished |
Aedes albopictus | Asian tiger mosquito | Terrestrial invertebrate | Evolutionary genomic analysis of invasive pest and vector of Dengue and Chikungunya | Chen et al. (2015) | Y

For the 27 species tallied in Table 1, the reference(s) listed below (and NCBI resources in the case of unpublished genomes) were examined for the stated goals of these genome projects. “Invasion genomic analysis” was evidenced by published application to research problems in invasion biology and citations to literature in the field


found that most of the signals of balancing and divergence-promoting selection mapped to the same genomic regions in each of the three freshwater populations (Hohenlohe et al. 2010), which is strong evidence for parallel adaptive evolution post-colonization. Included was the genomic region previously implicated in the control of bony plate reduction, from genome-wide association studies in offspring of crosses between (the fully interfertile) marine and freshwater populations (Cresko et al. 2004; Hohenlohe et al. 2010).

A second case of recent, postglacial range expansion comes from US Atlantic coast populations of the pitcher plant mosquito Wyeomyia smithii (Emerson et al. 2010). Most of this history could not be resolved using phylogeographic studies of mitochondrial DNA (mtDNA) or microsatellite DNAs; only a northern and a southern clade were revealed. With >3700 SNPs from a RAD-seq screen, several subclades within both the southern and northern groups were revealed, as was evidence that populations north of the line of southernmost advance of the Laurentide ice sheet were recolonized from refugia in the southern Appalachians (Emerson et al. 2010).

Human-mediated biological invasions occur over decades to centuries, at most, and not over the tens to hundreds of thousands of years spanned by natural range expansions. Nevertheless, in several cases, population genetic and phylogeographic analysis has allowed inference of invasion sources and pathways (e.g. Miller et al. 2005; Darling et al. 2008; Ascunce et al. 2011; Lombaert et al. 2014). What new information can be gleaned from genomes? We examine two well-documented cases. European green crabs (Carcinus maenas) are one of the most successful and well-studied marine invasive species, having spread from their large Eastern Atlantic native range (from the North African Atlantic coast to Norway) to invade all continents except Antarctica. Jeffery et al. (2017) focused on the Western Atlantic invaded range along the East Coast of the US and Atlantic Canada. This was the region where the first invasive population, anywhere, was reported in the early 1800s, followed by a much more recent one in the 1980s. Jeffery et al. (2017) superimposed a study of 9100 SNP loci (scored using RAD-seq) onto an updated mtDNA survey. As in earlier work (Roman 2006; Darling et al. 2008, 2014), the mtDNA haplotypes detected the presence of two differentiated populations. But the SNP data provide the best evidence to date that these derive from two independent introductions, followed by secondary contact between two sets of descendant populations with strong genome-wide divergence.

A second example of a population genomic study of invasion history focuses on the Asian tiger mosquito, Aedes albopictus (Chen et al. 2015; Kotsakiozi et al. 2017), one of the 100 “world’s worst” (Tables 1 and 2) and an important vector of human arboviruses (e.g. dengue, chikungunya and Zika). Aedes albopictus is native to Southeast Asia, China and Japan. Its spread began about 1500 ya (SW Indian Ocean Islands), with another wave about 100 ya (Hawaii and Guam), followed 30–40 ya by spread to Europe, Africa, and North and South America. While it is less competent at carrying virus than A. aegypti, public health concerns center around the ability of A. albopictus to persist in temperate environments at higher latitudes, due to facultative, hormonally controlled diapause tied to photoperiod.

Recent invasion genetic studies include an analysis of its global expansion, using ABC model tests based on 17 microsatellite loci (Manni et al. 2017). This study revealed an intricate invasion history, further complicated by different levels of diversity in source populations, so-called “chaotic” dispersion patterns, and extensive admixture post-colonization. A population genomic study used a panel of 58,000 SNPs derived from double digest RAD-seq (ddRAD-seq), focusing on the native range and the most recent wave of invasions (Kotsakiozi et al. 2017). The ddRAD study brought more order to the results, due to greater resolution of genetic differentiation within the native Asian and the invaded ranges. The authors question the “chaotic dispersion out of Asia” conclusion from the microsatellite study (Manni et al. 2017), which lacked the power to discern genomic differentiation across the native range. Colonization and admixture patterns are still very complex, but the SNP dataset sets the stage for future work to clarify the patterns of spread and, if possible, to use the results to plan prevention efforts.

Genomic studies even have the potential to resolve invasion history at the most local scales relevant to management. In North America, for example, invasive species prevention is often implemented by US States and Canadian Provinces. Sard et al. (2019) examined the genomics of populations of the round goby, a highly damaging Ponto-Caspian invader of the Laurentian Great Lakes, using SNPs genotyped with RAD-seq. Gobies were collected from 18 sites along the shorelines of Lakes Michigan and Huron, and from three rivers draining into Lake Huron; all sites are in the state of Michigan and all were colonized within <15 generations. ABC model testing was used to infer sources and estimate the size and number of introductions. The Flint River population showed the most unambiguous result: it was derived from a single introduction from the adjacent Saginaw Bay in Lake Huron, and experienced the strongest bottleneck. But models for the Au Sable and Cheboygan Rivers also showed evidence of origins from Saginaw Bay, rejected other sites in Lake Huron and more distant sites in Lake Michigan as sources, and in all cases inferred founding populations of <50 gobies. Since bait buckets dumped into inland waters are one likely vector for spread, management recommendations included enhancing education for boaters in Saginaw Bay, and encouraging bait dealers to more effectively sort out and remove gobies from their harvests (Sard et al. 2019).


Demography of colonizing populations

Founder effects and other demographic outcomes are often the focus of genetic studies of colonizing populations. Interest rose with the popularization of the so-called “genetic paradox” of biological invasions, i.e. the question of how colonizers, despite expected reductions in genetic diversity, can establish growing populations and maintain the adaptive capacity to persist over time (Allendorf and Lundquist 2003; Frankham 2004; Roman and Darling 2007). The deep theoretical roots of the issue trace to the authors of the Modern Synthesis (Baker and Stebbins 1965; Dobzhansky 1965). Based on population genetic studies, Roman and Darling (2007) found that the paradox is far from universal across aquatic species; invasive populations often show no detectable change in diversity compared to their native ranges. Estoup et al. (2016), moreover, listed several ways in which the paradox may be spurious or, even more interestingly, how evolution post-invasion may come to overshadow evidence for it. What do genome-scale studies have to say about the “paradox”?

Genomic markers confirm the lack of universal reduction in genetic diversity that has been found with population genetic markers. For example, RAD-seq revealed high levels of diversity and cryptic variation within invasive populations of Carcinus maenas in North America (Jeffery et al. 2017), which was attributed to secondary contact between independent introductions with admixture between them, as well as geographic gradients in selection driving clinal variation at selected loci. Levels of genomic variation were found to be similar among native and invasive populations at a global scale in Aedes albopictus. Lack of diversity loss was attributed to multiple introductions and high propagule pressure, although sampling was inadequate to distinguish those alternatives (Kotsakiozi et al. 2017). With invasive plants, the genetic paradox has been a topic of considerable study, with some complex findings. While founder effects have been found to be common (Dlugosch and Parker 2007), changes in molecular marker diversity often do not parallel those in quantitative characters, nor do they correlate with invasive potential (Dlugosch and Parker 2007; Dlugosch et al. 2015; Barker et al. 2017).

Genomes from invasive plants are confirming these results. For example, yellow starthistle (Centaurea solstitialis) is a damaging invasive weed in Western Europe and the Americas. Populations in its native Eurasian and introduced ranges on three continents were genotyped at ~1000 SNP loci developed from a ddRAD screen (Barker et al. 2017). ABC contrasts favored a well-supported invasion scenario in which western European populations derived from admixture, long ago, between populations in eastern Europe and Asia. More recently, western Europe functioned as a genetic “bridgehead” in which evolution produced adaptations that facilitated the success of subsequent introductions to South and North America. California invasions came from a single source region (Chile). Then divergent genotypes founded multiple, independent introductions into the Pacific Northwest (from California and from admixed Western European populations). The bridgehead scenario was first introduced to invasion genetics in studies of Harmonia axyridis, the multicolored Asian ladybird beetle (Lombaert et al. 2010, 2014). Admixed bridgehead source populations in eastern North America were suggested to be where adaptations evolved that enabled rapid spread to western North America, western Europe, South America and Africa. In both H. axyridis and yellow starthistle, the lack of evidence for the paradox is attributed to admixture in bridgehead locations, and to adaptations evolved within genetically diverse bridgehead populations that facilitated their later invasive spread.

Evolution of invasiveness

Of great theoretical and applied interest is whether and how invasions are facilitated by adaptive evolution (Lee 2002; Sax et al. 2007; Cristescu 2015). Genomic analysis is a powerful approach for identifying evolutionary change in genes and functional networks that contribute to invasiveness. [By “invasiveness,” we refer to the ability to establish and spread, and not to the broader definition that includes impact on biodiversity (e.g. Colautti and MacIsaac 2004; Ricciardi and Cohen 2006).] Also, whether adaptations that favor invasiveness occur prior to, during or after establishment is an important question for research and management (e.g. Hufbauer et al. 2012).

The chytrid fungus Batrachochytrium dendrobatidis is a high-impact invasive species (Tables 1 and 2) responsible for epizootics that caused widespread extinctions of amphibians. Previous invasion genetic studies with molecular markers lacked the power to resolve invasion sources or to identify evolutionary events responsible for the emergence of this virulent pathogen. The species is a member of an early-branching lineage of non-pathogenic fungi (Division Chytridiomycota) in which all other known species are saprobes, feeding on decaying organic matter. Population genomic studies (Farrer et al. 2011; Joneson et al. 2011) have shed some light on the mystery. Whole genomes (just 24 Mb) were sequenced from 20 isolates collected worldwide (Farrer et al. 2011). Phylogenomic analysis identified a “global panzootic lineage” (termed BdGPL) that included all of the isolates from regional epizootic infections across five continents. This lineage showed evidence for positive selection on multiple gene products and numerous recombination breakpoints both within and between chromosomes, the latter a pattern consistent with the hypothesized origin of BdGPL as an ancient hybrid lineage whose clonal descendants spread across the globe. Experimental exposures of common toads (Bufo bufo) to all lineages revealed higher virulence of BdGPL. Finally, molecular clock dating of divergence between isolates collected over a decade dated the emergence of BdGPL to recent times (25–257 ya), overlapping with its hypothesized origin in the pet trade. While these studies have yet to identify virulence genes, they provide the important testable hypothesis that recombination between genomes initiates epizootics (Farrer et al. 2011).

Joneson et al. (2011) sought to further examine the evolutionary events that may have promoted the emergence of the single known vertebrate pathogen among this group of fungi. By sampling across 5 fungal phyla, they identified a long list of duplication events in protease gene families that are unique to the B. dendrobatidis lineage. But because the phylogeny of the available genomes spans hundreds of millions of years, duplication events cannot be placed with confidence on the tree. Using molecular clock dating of gene duplication events, Joneson et al. (2011) acknowledge that gene family expansions occurred millions of years prior to the outbreak of this pathogenic lineage, and suggest that finer-scale intraspecific comparisons of paralogs will be required to determine whether protease duplications are involved in pathogenicity.

A comparative genomics study that also asks whether gene duplications promote invasiveness, but in a better-resolved phylogenetic framework, was performed with the Southeast Asian fruit fly Drosophila suzukii (Asplen et al. 2015). The species has been rapidly expanding in Europe and North America since arriving about 2008. Unlike other (genetically better-characterized) Drosophila, D. suzukii shows the unusual behaviors of egg laying and larval feeding on ripening rather than fermenting fruit, and as a consequence has become a damaging pest of soft fruits (e.g. blueberries, blackberries, strawberries). As part of research to develop integrated pest management, nuclear genomes, mitogenomes, and transcriptomes were recently sequenced and analyzed (Ometto et al. 2013).

To examine whether adaptive molecular changes helped facilitate the ecological shift to ripening fruit, Ramasamy et al. (2016) analyzed the repertoire of 131 genes involved in olfaction throughout the genus in three gene families: the odorant receptors (ORs), the antennal expressed ionotropic receptors (aIRs: odorant-responsive ion channels expressed in the sensory neurons of antennae), and the odorant binding proteins (OBPs). To do so they annotated the entire repertoire of these olfactory genes in two D. suzukii strains and in the closely related species Drosophila biarmipes. They then searched 12 Drosophila genomes in FlyBase (Drysdale et al. 2005), along with the D. suzukii assembly of Ometto et al. (2013), for orthologs within the three gene families. The authors were able to establish several instances of gene loss, gene duplication and positive selection within these gene families along the D. suzukii lineage. These are candidate adaptations that facilitated the switch in larval feeding and egg laying behaviors and promoted the success of this shift to new host plants (Ramasamy et al. 2016).

Also on our list of high-impact invasive species (Tables 1 and 2) is the Asian longhorned beetle (Anoplophora glabripennis), which causes damage to >100 tree species worldwide. It belongs to the insect family Cerambycidae, which contains the greatest diversity of animals capable of feeding on woody plants (>35,000 species) within the order Coleoptera (>400,000 species). The A. glabripennis project (McKenna et al. 2016) allowed a detailed comparative genomic analysis of polyphagy and invasiveness. This analysis utilized 14 additional genomes across 6 insect orders, all of which are from species capable of digesting woody plants. It included 2 other beetle species whose genomes were analyzed for the first time (one being the emerald ash borer, a devastating pest of ash trees in North America) and a termite whose plant cell wall degrading enzymes are produced by symbiotic gut protists. The A. glabripennis genome encodes one of the largest repertoires of enzymes that can digest polysaccharides in wood. The analysis found large expansions of arthropod gene families that encode these enzymes, and many horizontal gene transfers (HGTs) from bacteria and fungi. Several of the HGTs are in the glycoside hydrolase gene family that digests plant cell walls, and are ancient insertions that have evolved into functional genes in A. glabripennis. The genome also contains large tandem gene expansions of detoxifying enzymes, allowing the plasticity to feed on a diversity of woody plants, and a large number of genes with chemosensory functions involved in locating host plants and finding mates.

Populations of the fall armyworm Spodoptera frugiperda (a noctuid moth) exist as two sympatric host races: the “corn strain” (C strain), feeding mostly on maize, cotton and sorghum, and the “rice strain” (R strain), feeding mostly on rice and pasture grasses. The C and R strains are morphologically indistinguishable, but differ in numerous physiological, behavioral and developmental traits. They also show fitness differences on their respective host plants, are estimated to have diverged 2 million ya, and have evolved partial pre- and postzygotic reproductive isolation, due in part to differential response by females to pheromones, i.e. they may represent incipient pheromonal host races. Native to North and South America and Caribbean islands, they are damaging crop pests in their home range. Gouin et al. (2017) used a fragmented genome assembly that, due to synteny across order Lepidoptera, still enabled establishment of orthology. The authors compared the diversity of genes associated with host plant use in this highly polyphagous species to that in some Lepidoptera with single host plants (i.e. the monophagous species Bombyx mori, Manduca sexta, Danaus plexippus, and Heliconius melpomene), and between the C and R host races. The most striking of their findings is the vast diversification of the gustatory receptor genes (expressed on taste sensillae on tarsi, mouthparts and ovipositors) in S. frugiperda compared to the monophagous species. The C and R races showed no differences in their complement of chemosensory genes, but they show differences in copy number and sequence of genes controlling digestive and detoxification functions.

The elements of a genome sequencing project: essential concepts and quality metrics

Genome projects are complex, involving multiple steps (most conducted sequentially: Fig. 1), several technologies, and a team of investigators who collaborate across disciplines. Below we discuss the major elements in more detail.

Genome sequencing

The efforts leading up to the completion of the first draft of the human genome in 2001 marked a shift in the general strategies employed for genome sequencing (Fleischmann et al. 1995; Goffeau et al. 1996; Adams et al. 2000). Historically, due to the high cost of obtaining sequencing data, genome projects relied on an ordered assembly approach in which genome fragments were first cloned into vectors such as bacterial artificial chromosomes (BACs), yeast artificial chromosomes (YACs) or fosmids, and the order and chromosomal locations of these fragments were determined by various methods. This approach was employed by the public consortium sequencing the human genome (Lander et al. 2001). The massive parallelization and automation of capillary electrophoresis-based Sanger sequencing dramatically reduced cost and increased throughput, which allowed the Celera Genomics team to employ a shotgun sequencing approach, in which random genome fragments that had not been previously mapped or ordered were sequenced and subsequently assembled computationally (Venter et al. 2001). As sequencing costs have continued to fall with the advent of NGS technologies, shotgun sequencing has become standard practice for genome sequencing projects.

Fig. 1 Elements of a genome project. A flow chart connecting the elements or steps in a genome project. Arrows connect steps that are sequential, i.e. the previous step outputs information necessary to conduct the next one. Transcriptome work can be done in parallel with genome sequencing. In parentheses are some prominent molecular technologies, with examples of some popular analysis programs in italics

Initial NGS technologies had read lengths comparable to or shorter than typical Sanger sequencing reads (in the 600–1000 bp range). [Sanger sequencing is used in many “1st generation” DNA sequencing technologies, in which DNA molecules are synthesized enzymatically, and random incorporation of modified “chain terminating” nucleotides into the growing molecule allows the DNA base present at each position to be identified.] Next generation “short-read” sequencing technologies are exemplified by sequencing-by-synthesis platforms on Illumina (San Diego, CA) instruments. Illumina sequencing allows for either single-end or paired-end reads (in the latter, sequence is read from both ends of a DNA molecule), with read lengths between 50 and 300 bp. Illumina sequencing has several advantages that make it a standard in genome sequencing projects as a means to generate at least some short-read data. Illumina data are both cost-effective and highly accurate, with quality scores often exceeding modified Phred scores of 30 (one error in a thousand bases). While short reads are insufficient to bridge many repetitive sequences, additional strategies can be used to improve the quality of a short-read assembly, including mate-pair sequencing (Metzker 2010), which can span larger intervening segments of difficult-to-assemble DNA, or information from BAC or fosmid mapping (e.g. Zhang et al. 2012). More recently, techniques such as Hi-C [chromosome conformation capture with high-throughput sequencing (Burton et al. 2013)] and synthetic long-read approaches, such as 10x Genomics (Zheng et al. 2016), have been successfully used to improve genome assemblies and for long-range mapping of polymorphisms to parental chromosomes [i.e. haplotype phasing (Seo et al. 2016; Moll et al. 2017)].
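The Phred scale mentioned above relates a quality score Q to a per-base error probability p through Q = -10 log10(p), so Q30 corresponds to one error in a thousand bases. The following is a minimal sketch of that conversion; the function names are hypothetical, and the Phred+33 ASCII offset used to decode a quality string is an assumption about the file encoding that the text does not specify.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to the probability that the base call is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a per-base error probability (0 < p <= 1) to a Phred quality score."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10 * math.log10(p)


def decode_quality_string(quality: str, offset: int = 33) -> list[int]:
    """Decode an ASCII quality string into Phred scores.

    The default offset of 33 is an assumption (Phred+33 encoding);
    older files may use a different offset.
    """
    return [ord(char) - offset for char in quality]


# Q30 is one expected error per 1000 bases; Q20 is one per 100.
q30_error = phred_to_error_prob(30)
q20_error = phred_to_error_prob(20)
```

Under these assumptions, a read whose decoded scores are all 30 or higher would be expected to contain fewer than one error per thousand bases.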

The availability and rapid improvement of long-read sequencing technologies, such as Pacific Biosciences (PacBio, Menlo Park, CA) and Oxford Nanopore (Oxford, UK), have been a major boon for genome sequencing and assembly. PacBio sequencing can routinely achieve read lengths of tens of kilobases (and up to hundreds of kilobases) and has been successfully used for de novo assembly of a large number of eukaryotic genomes (Rhoads and Au 2015). Such long reads are extremely powerful for assembly and have been used to complete a single 25 Mb contig that spans all of Drosophila melanogaster chromosome arm 3L (Berlin et al. 2015). Oxford Nanopore sequencing was recently used for de novo assembly of the human genome and can in extreme cases attain read lengths approaching 1 megabase. Both of these long-read technologies rely on the sequencing of single molecules, and thus have much higher intrinsic error rates than Illumina (Koren et al. 2012). However, several strategies for mitigating the higher intrinsic error rates have been employed, including circular consensus sequencing (CCS) in the case of PacBio, and reading both DNA strands in the case of Oxford Nanopore (Rhoads and Au 2015; Jain et al. 2016). Often, long-read sequencing projects employ a step known as “polishing” (Fig. 1), in which an additional bout of Illumina sequencing is performed to check and correct errors in sequences generated by long-read technologies.

Genome assembly and annotation

Assembly can be the most time-consuming and expensive step in de novo genome sequencing of eukaryotes. Completeness and contiguity of the assembly depend on several key factors, including intrinsic properties of the genome (in particular, the number and types of repetitive sequences) and technical factors such as the length of the sequencing reads and the sequencing depth that can be obtained (Kingsford et al. 2010; Schatz et al. 2010; Henson et al. 2012). Genomes of eukaryotic organisms typically contain millions of DNA segments consisting of repeated sequence motifs that do not code for genes. In fact, over half the genome of humans and other mammals is composed of repetitive DNA (de Koning et al. 2011; Padeken et al. 2015), most of which is associated with insertions of transposable elements (DNA transposons and retrotransposons).

The quality of a genome assembly can be assessed in multiple ways (Box 1; Fig. 2). Contiguity measures, such as contig numbers and contig N50 values, provide one metric of assembly quality. The contig N50 is defined as the contig length such that 50% of the assembled genome is in contigs of that length or longer. Depending on genome complexity, N50s in the range of kilobases to tens of kilobases are achievable with short-read NGS technologies, such as Illumina. By incorporating long-read technologies such as PacBio Single Molecule Real Time (SMRT) sequencing (Kingan et al. 2019) or Oxford Nanopore (Goodwin et al. 2015), contig N50 values in the megabase range can be obtained. The completeness of an assembly can be assessed by examining its gene content in an evolutionary context. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis (Simao et al. 2015) compares an assembly to a database of single-copy orthologs present in >90% of species within a clade (e.g. Metazoa) to provide a measure of the representation and complete annotation (discussed next) of these expected, highly conserved genes.
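The N50 definition above, and the companion L50 statistic shown in Fig. 2, can be sketched in a few lines. This is an illustrative sketch only: the function name is hypothetical, and the contig lengths in the example are invented to sum to 1 Mb; they are not the values drawn in Fig. 2.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a collection of contig lengths.

    Contigs are sorted from longest to shortest and their lengths are
    accumulated until at least half of the total assembly length is covered.
    N50 is the length of the contig that crosses that threshold, and L50 is
    the number of contigs needed to reach it.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running_total = 0
    for count, length in enumerate(lengths, start=1):
        running_total += length
        if running_total >= half_total:
            return length, count


# Hypothetical assembly of a 1 Mb genome in seven contigs (lengths in bp).
example_contigs = [300_000, 200_000, 150_000, 100_000, 100_000, 80_000, 70_000]
n50, l50 = n50_l50(example_contigs)  # n50 == 200_000, l50 == 2
```

In this example the two longest contigs already hold half of the assembly, so the N50 is 200 kb and the L50 is 2. A more fragmented assembly of the same genome would give a smaller N50 and a larger L50.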

Once a draft assembly is completed, the genome can be annotated (i.e. the locations in the genome and functions of genes are predicted and the genes are named). There exist a number of different gene prediction programs, which look for open reading frames and other predicted gene features (Stanke and Morgenstern 2005; Goodswen et al. 2012). Additional algorithms can be used to functionally annotate the genome based on homology of the predicted genes to known proteins or protein domains. Annotation can be further improved by acquiring transcriptome data (Fig. 1), via RNA-sequencing (RNA-seq). As with genome assembly, long-read approaches for RNA-seq can also help to refine gene models and provide a fuller picture of the transcriptome. They do so by generating full-length sequences of transcripts, to distinguish differences among transcript splice isoforms that short-read RNA-seq cannot resolve (Sharon et al. 2013).

In contrast to the steadily changing and improving approaches to sequencing and assembly (in fact, to virtually all other steps in a genome project; Fig. 1), annotations are done just about as they were nearly 25 ya (Salzberg 2019). The first step, automated annotation, is particularly challenging for the growing number of large, fragmented, “draft” eukaryotic genomes that are accumulating, and accuracy is worsened by the tendency for errors to propagate as investigators annotate new genome drafts. RNA-seq will reduce errors, and nanopore technologies on the horizon that will sequence RNAs without first converting them to DNA (as in RNA-seq) even more so (Salzberg 2019). In our experience, moreover, the highly unfragmented zebra mussel genome remained difficult to annotate, in part due to the absence of closely related sequenced bivalves (McCartney et al. 2019). Oysters, scallops and other sequenced bivalves diverged from zebra mussels hundreds of millions of years ago (McCartney et al. 2019). A strong argument in favor of generating a high-quality zebra mussel genome assembly was to begin to fill this gap in the mollusk phylogenomic tree, and the same argument can be made for many other taxa in our disciplines.

Fig. 2 Genome assembly and scaffolding. Top: Reads, Contigs, and Scaffolds. Illustration of how reads are assembled into contigs, which are then placed on scaffolds that contain ordered contigs interspersed with gaps. Bottom: Contiguity Measures. Illustration of how N50 and L50 statistics are calculated using a hypothetical 1 Mb genome. Figure adapted from: https://github.com/schatzlab/teachingarchive/blob/master/2012/CSHL.Sequencing/Whole%20Genome%20Assembly%20and%20Alignment.pdf


Box 1: Key terms and concepts in genome sequencing and assembly

Sequencing depth and coverage These terms are often used interchangeably depending on the application. Coverage or depth of 100 × means that, on average, 100 reads contain a given DNA base at a position along the sequence that is “reconstructed” after piling up all of the sequencing reads. Depth is a function of the total length of DNA being sequenced, the capacity of the technology (e.g. the number of bases that can be sequenced on a lane of an Illumina instrument) and whether the reads are sequenced from only one end, or from both ends (as in paired-end and mate-pair sequencing, discussed below).
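As a rough illustration of how these quantities relate, expected average depth can be approximated as total sequenced bases divided by genome size. The sketch below assumes every read maps uniquely and uniformly across the genome, which real data do not; the function names and the example numbers are hypothetical.

```python
import math


def mean_depth(n_fragments, read_length, genome_size, paired=False):
    """Expected average depth: total sequenced bases / genome size.

    n_fragments is the number of sequenced fragments; with paired=True
    each fragment contributes two reads of read_length bases.
    """
    ends = 2 if paired else 1
    return n_fragments * read_length * ends / genome_size


def fragments_needed(target_depth, read_length, genome_size, paired=False):
    """Smallest number of fragments that reaches target_depth on average."""
    ends = 2 if paired else 1
    return math.ceil(target_depth * genome_size / (read_length * ends))


if __name__ == "__main__":
    # Hypothetical 3 Mb genome, 150 bp reads
    print(mean_depth(1_000_000, 150, 3_000_000))               # single-end
    print(mean_depth(1_000_000, 150, 3_000_000, paired=True))  # paired-end
    print(fragments_needed(100, 150, 3_000_000, paired=True))
```

Because depth varies along a real genome, projects usually plan for a higher average than the minimum an analysis requires.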

Contig A contig is a contiguous block of sequence that can be assembled at high confidence (Fig. 2). Contigs are assembled by stitching together overlapping sequencing reads. The full set of reads at a given genomic position (often referred to as a “pileup”) can be used to correct sequencing errors by consensus. Ploidy can be used to set an a priori assumption about the number of different reads expected at a given position; however, the existence of closely related paralogs can complicate such analysis. [Paralogs are members of gene families resulting from gene duplication events that generate multiple gene copies at different positions on the chromosomes.]
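Error correction by consensus can be sketched as a per-position majority vote over a pileup. This toy example assumes the reads are already aligned to common coordinates and padded with "-" where they do not cover a position, and it ignores base qualities and ploidy; the function name `consensus` and the input format are assumptions made for illustration, not the procedure of any particular assembler.

```python
from collections import Counter


def consensus(aligned_reads):
    """Majority-vote consensus over equal-length, pre-aligned reads.

    '-' marks positions a read does not cover; uncovered columns yield 'N'.
    """
    length = len(aligned_reads[0])
    bases = []
    for pos in range(length):
        column = [read[pos] for read in aligned_reads if read[pos] != "-"]
        if column:
            # most_common(1) returns the single most frequent base in the column
            bases.append(Counter(column).most_common(1)[0][0])
        else:
            bases.append("N")
    return "".join(bases)


if __name__ == "__main__":
    pileup = [
        "ACGT--",
        "ACCTGA",  # third base is a sequencing error, outvoted by the others
        "-CGTGA",
    ]
    print(consensus(pileup))
```

With diploid samples or closely related paralogs, a minority base may be real rather than an error, which is the complication noted above.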

Scaffold A scaffold is a set of contigs which are linked by gaps of known or estimated length. Repetitive or difficult-to-sequence regions can create contig break points that cannot be unambiguously assembled (Fig. 2). However, it may be possible to generate evidence that supports the ordering of a set of contigs and in some cases measure the distance between them, which can be used to position contigs on scaffolds. This evidence can include paired-end or mate-pair reads that bridge two contigs, as well as other forms of evidence, such as optical mapping and Hi-C (Schwartz et al. 1993; Burton et al. 2013).

Contiguity measures: N50 and L50 The N50 and L50 values are measures of the contiguity of an assembly. These statistics can be applied to either contigs or scaffolds. N50 is defined as the contig (or scaffold) size where 50% of the genome is in contigs (or scaffolds) of the N50 size or larger (Fig. 2). L50 is the minimum number of contigs (or scaffolds) in which 50% of the genome is contained (Fig. 2). Other similar statistics, such as N75 and L75 (where the percentage of the genome described by these values is 75%), are also sometimes reported.
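These definitions translate directly into a short calculation. The sketch below assumes that "the genome" is taken to be the total length of the assembled contigs (or scaffolds), which is the usual convention when the true genome size is unknown; the function name and the toy lengths are hypothetical.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths.

    Pieces are sorted from longest to shortest and summed until at least
    half of the total assembled length is reached. N50 is the length of the
    piece that crosses the halfway point; L50 is how many pieces were needed.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must contain at least one contig")


if __name__ == "__main__":
    # Hypothetical 1 Mb assembly in five contigs
    print(n50_l50([400_000, 300_000, 150_000, 100_000, 50_000]))
    # A single closed contig: N50 equals the genome size and L50 is 1
    print(n50_l50([4_600_000]))
```

The second call shows why a small, complete genome can have a lower N50 than a large, gapped one.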

A note on N50 and L50 values While N50 and L50 (and similar contiguity measures) are typically used to assess the technical quality of an assembled genome, it should be noted that these statistics are only strictly comparable when assessing genomes of the same size and number of chromosomes. For instance, the current human genome release (Hg38, RefSeq accession: GCF_000001405.38) still has many gaps and exists in 472 scaffolds (more than an order of magnitude greater than the expected 23 chromosomes), while the Escherichia coli K-12 (E. coli) genome is perfectly complete (RefSeq accession: GCF_000005845.2; Blattner et al. 1997), existing as a single closed contig. Yet due to the much larger size of the human genome compared to that of E. coli, the scaffold N50 of the human Hg38 assembly is > 67 Mb, while the E. coli N50 is only 4.6 Mb (the size of the full genome). Most metazoan genomes are sufficiently large that N50 and L50 values at both contig and scaffold levels provide useful information about the technical quality of the assembly. Therefore, we provide them (Table 2) to evaluate the quality of invasive species genome assemblies. However, recent technological advances such as Hi-C (Burton et al. 2013) are capable of scaffolding reads at the scale of a full chromosome, in which case the scaffold N50 and L50 values are essentially determined by the properties of the genome being studied as opposed to technical factors.

Reference genomes in invasion and conservation genomics

Still, investigators have limited budgets, and whole genome sequences are overkill for some applications. GBS, RAD-seq and related technologies offer finer-scale resolution of population histories for invasive and endangered species than was ever before available, with more robust results from sampling thousands of independent gene genealogies, all without the need for high quality reference genomes. Elleouet and Aitken (2018) used coalescent simulations to explore the ability of ABC to accurately estimate parameters (e.g. size of founding populations; age of introductions) for scenarios of recent spatial expansion of one ancestral population into a descendant, growing population, a model relevant to range expansion and biological invasions (as well as recovery of an endangered population). They found that shallow sequencing of large numbers of individuals estimates known parameters more accurately than deep sequencing of fewer individuals per population, and they showed that phased haplotype sequences and linkage disequilibrium information provide no more accuracy than SNPs scored without this information. The message from this study and several reviews (Davey et al. 2011; Nielsen et al. 2011; Mastretta-Yanes et al. 2015) is to focus on reducing genotyping error and not on a genome assembly.


In contrast, high quality assemblies are required when knowledge of genomic architecture is needed. Genomic studies of inbreeding and inbreeding depression in wild populations (Kardos et al. 2017a, b) are an example. Kardos et al. (2017a) surveyed genomes of 97 grey wolves from bottlenecked Scandinavian populations for Runs of Homozygosity (ROH): identical-by-descent blocks of chromosomes evidenced by contiguous stretches of homozygous loci. Whole genome sequences mapped to a high-quality reference genome revealed that long ROH, those marking regions of inbreeding estimated to have arisen < 10 generations in the past, made up the majority of DNA regions that are identical by descent. This genomic approach offers a powerful alternative to Genome Wide Association (GWA) mapping, and its requirement for extensive pedigree information. In the wolf study, pedigree information was available, and correlations were calculated between the realized inbreeding coefficient F_ROH (from genome-wide ROH), the pedigree-derived value (F_P) and values of F estimated from 500 to 10,000 subsampled SNPs. Values of F from the smaller SNP panels were more closely correlated with the genome-wide F_ROH value than was F_P (Kardos et al. 2017a), confirming the accuracy of the genomic methods. The success of ROH studies in humans and domesticated mammals indicates their power to complement any available pedigree information from inbred natural populations that are, for example, candidates for genetic restoration (Kardos et al. 2016).
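The logic of an ROH-based inbreeding coefficient can be sketched in a few lines. This is a deliberately simplified illustration and not the procedure of Kardos et al. (2017a): it assumes error-free genotypes on a single chromosome with known physical positions, tolerates no heterozygous calls inside a run, and takes the realized inbreeding coefficient to be the fraction of the genome covered by ROH. The function names and the toy data are hypothetical.

```python
def runs_of_homozygosity(positions, is_het, min_length):
    """Return (start, end) for each run of consecutive homozygous SNPs.

    positions: sorted SNP coordinates (bp) on one chromosome
    is_het:    parallel list of booleans, True where the genotype is heterozygous
    min_length: minimum physical span (bp) for a run to be reported
    """
    runs = []
    run_start = None  # first homozygous position of the run currently open
    last_hom = None   # most recent homozygous position seen
    for pos, het in zip(positions, is_het):
        if het:
            if run_start is not None and last_hom - run_start >= min_length:
                runs.append((run_start, last_hom))
            run_start = None
        else:
            if run_start is None:
                run_start = pos
            last_hom = pos
    # close a run that extends to the last SNP
    if run_start is not None and last_hom - run_start >= min_length:
        runs.append((run_start, last_hom))
    return runs


def f_roh(runs, genome_length):
    """Fraction of the genome (in bp) that lies inside the reported runs."""
    return sum(end - start for start, end in runs) / genome_length


if __name__ == "__main__":
    positions = [100, 200, 300, 400, 500, 600, 700, 800]
    is_het = [False, False, False, False, True, False, False, True]
    runs = runs_of_homozygosity(positions, is_het, min_length=250)
    print(runs, f_roh(runs, genome_length=1000))
```

The dependence on physical positions is the point: on a fragmented assembly, a long run would be split across contigs and its length, and hence its inferred age, would be underestimated.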

Seven of the invasive-species projects that we reviewed were mammal genome projects motivated by domestic breeding programs, or by utility of the species as a biomedical research model (Table 2). Nevertheless, these provide information useful to our disciplines, as they can be used to gauge the utility of improved genome assemblies. The most recent release of the goat genome (Bickhart et al. 2017) employed long and short read sequencing, combined with optical mapping and Hi-C to scaffold the assembly, resulting in what the authors claimed to be the most well-assembled genome of any non-human mammal. The new assembly mapped to chromosomes 90% of the 1723 previously unmapped SNP markers on the 52K SNP Chip, a high throughput genotyping tool used in breeding programs, and improved their scoring call rate, apparently because these unmapped markers fall within repeat regions. Immune system genes showed particular improvements with the new assembly due to their repetitive nature and extreme polymorphism, with two of the major gene regions mapping to a single scaffold. The project was completed at a total cost of $100,000 (Bickhart et al. 2017).

Scans for differentiation (based on FST and related indices) have been used for decades now to survey genomes for selection. A population genomic perspective on this approach in birds recommends caution (Wolf and Ellegren 2017; Peona et al. 2018). Surveys of genomes of hooded and carrion crows in their European hybrid zone located a putative “speciation island” of high interspecific differentiation on chromosome 18, but this island spans an assembly gap of unknown size. Addition of long-read sequencing and optical mapping found the gap to coincide with a large repeat region potentially associated with a centromere, and population resequencing data confirmed that this region showed depressed recombination, complicating its analysis (Weissensteiner et al. 2017). Recombination rate variation across the genomes of another avian speciation model, Ficedula albicollis and F. hypoleuca flycatchers, is associated with structural features of the genome, such as transposable elements and locations of promoters in intergenic regions. Without a high quality assembly, the fine scale “landscape” of recombination hotspots and coldspots would be missed, confounding analysis of the closely related genomes of these hybridizing species (Burri et al. 2015).
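A differentiation scan of the kind described here can be sketched as a per-SNP FST followed by averaging in windows along a chromosome. The example below is a simplified illustration that assumes two populations of equal weight, biallelic SNPs with known allele frequencies, and the classical heterozygosity-based definition FST = (H_T - H_S) / H_T; published scans typically use estimators that correct for sample size. The function names and numbers are hypothetical.

```python
def fst_two_pops(p1, p2):
    """Heterozygosity-based FST at one biallelic SNP for two populations.

    p1, p2 are the frequencies of the same allele in each population.
    Monomorphic sites (H_T == 0) are reported as 0.0 by assumption.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def windowed_fst(positions, fst_values, window_size):
    """Mean FST in non-overlapping windows, keyed by window start (bp)."""
    sums = {}
    for pos, fst in zip(positions, fst_values):
        start = (pos // window_size) * window_size
        total, n = sums.get(start, (0.0, 0))
        sums[start] = (total + fst, n + 1)
    return {start: total / n for start, (total, n) in sorted(sums.items())}


if __name__ == "__main__":
    freqs = [(0.45, 0.55), (0.2, 0.8), (0.0, 1.0)]
    per_snp = [fst_two_pops(a, b) for a, b in freqs]
    print(windowed_fst([100, 900, 1500], per_snp, window_size=1000))
```

Window coordinates come from the reference assembly, so an assembly gap of unknown size inside a window makes both the window's true extent and the SNP density within it uncertain.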

Here we point out that genome sequencing projects are works in progress, in which assembly and annotation are continuously improved upon, and technological advancements as well as investigator effort drive progress. To illustrate, the first release of the human genome assembly (WGSA: from the December 2001 assembly by Celera Genomics) yielded a contig N50 of 23,350 bp. Now, 221 assembly versions later, the latest update of the human reference genome (GRCh38.p12) boasts a contig N50 of 57,879,411 bp (NCBI 2019d, accessed 1 August 2019).

These and other examples underscore the value of community efforts. Projects on model organisms often use networks of investigators to undertake whole genome sequencing projects on strategically sampled populations worldwide, such as the 1001 Genomes Consortium (2016) for Arabidopsis and the Drosophila melanogaster Genome Nexus [623 genomes: (Lack et al. 2015)]. One effort in a non-model taxon is a low-coverage (20 ×) whole-genome sequencing project to study genomic variation associated with domestication, phenotypic variation and disease states across > 5000 dog breeds and other canids (Ostrander et al. 2019). Another is in birds. There were three avian assemblies in 2001 and 101 by 2018, and the B10K initiative has the stated goal of sequencing 10,500 genomes (i.e. nearly all bird species on Earth) for phylogenomics and a variety of other applications (Jarvis 2016; Peona et al. 2018). Variation in assembly and annotation quality has been problematic in the phylogenomics project, and Jarvis (2016) suggests that researchers can help alleviate these issues and make genome projects more efficient by cooperatively generating reference genomes from strategically chosen species.

Genomics in invasive species management: development of biocontrols

Some invasive species genome projects have been launched to discover biocontrols. For example, vector-directed biocontrol drove the sequencing of the genomes of the invasive mosquito species that carry malaria [Anopheles gambiae: (Holt et al. 2002)] and yellow fever, dengue and Zika viruses [Aedes aegypti: (Nene et al. 2007; Matthews et al. 2018)]. Sequencing of the genome of the crown-of-thorns sea star (Acanthaster planci spp. group) identified the genes for an array of molecules released when animals aggregate to spawn, including a large number of unique ependymin-family proteins active in the central nervous system of many animals, and their putative receptors (Hall et al. 2017). This communication system may be a target for biocontrol using synthetic peptides that mimic aggregation cues.

Genetic modification technologies for biocontrol of invasive species, crop pests and vectors of human disease are drawing much recent attention. Molecular biologists have invented several techniques with which they can deliver foreign DNA, or make precise edits in native DNA. The CRISPR/Cas9 gene editing system has received the greatest recent attention for applications in biological conservation, including control of invasive species (Champer et al. 2016; Noble et al. 2018; Rode et al. 2019). This approach offers the potential for spreading genetically edited alleles throughout wild populations (even when they lower fitness), through a mechanism known as a “gene drive.” The enzyme Cas9 cleaves target genes and directs their precise editing. When properly engineered, the Cas9 editing system can initiate conversion of the non-modified allele on the homologous chromosome to the modified allele, making the edited organism homozygous and leading to super-Mendelian inheritance (Burt 2003; Gantz and Bier 2015; Gantz et al. 2015). Mosquito vectors of disease have been used in several laboratory trials (e.g., Gantz et al. 2015), including one in which a sex ratio distortion gene drive caused complete collapse of caged populations of Anopheles gambiae (Kyrou et al. 2018).
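Super-Mendelian inheritance can be illustrated with a simple deterministic recursion. The sketch assumes random mating in an infinite population, no fitness cost, no resistance alleles, and conversion of the wild-type allele in the germline of heterozygotes with probability c, so that heterozygotes transmit the drive allele with probability (1 + c) / 2. Under those assumptions the drive frequency q changes each generation as q' = q + c q (1 - q). The function name and parameter values are hypothetical and are not taken from the studies cited.

```python
def drive_frequency(q0, conversion, generations):
    """Drive-allele frequency over time under idealized homing.

    q0:         starting frequency of the drive allele
    conversion: probability c that the wild-type allele in a heterozygote's
                germline is converted to the drive allele (0 = Mendelian)
    Returns a list of frequencies for generations 0..generations.
    """
    freqs = [q0]
    q = q0
    for _ in range(generations):
        # q' = q^2 + 2q(1-q)(1+c)/2, which simplifies to q + c*q*(1-q)
        q = q + conversion * q * (1 - q)
        freqs.append(q)
    return freqs


if __name__ == "__main__":
    # Release at 1% frequency: Mendelian allele vs. 95% conversion efficiency
    print(drive_frequency(0.01, 0.0, 10)[-1])
    print(drive_frequency(0.01, 0.95, 10)[-1])
```

With c = 0 the allele frequency stays constant, as expected for a neutral Mendelian allele, whereas any c > 0 increases it every generation; adding fitness costs or resistance, which this sketch omits, can slow or stop the spread.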

The first step forward in genetic modification requires selection of target genes and biological processes that, when modified, will produce the desired fitness effect (lethality, reduced viability, infertility). Genome sequence information is required for selecting target genes and designing editing constructs. For example, Drury et al. (2017) generated genomic sequences from 4 global populations of the flour beetle Tribolium castaneum to examine population DNA variation at Cas9 sites. Edits at these sites are expected to produce a range of fitness costs, due to their effects on eye pigmentation, female and male fertility, and insecticide sensitivity. Maselko et al. (2017) searched the yeast genome for target gene promoter regions that when modified [using a “dead” Cas9 enzyme (dCas9) that does not cause a gene drive] would direct lethal overexpression of the gene product. To screen for efficacy, they searched population genomic data from rice and fruit flies for variants in dCas9 sites within promoter regions.
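Screening a genome for candidate Cas9 sites, and then asking how often those sites are intact in sampled haplotypes, can be sketched as follows. The example assumes the commonly used Streptococcus pyogenes Cas9, which recognizes a 20-nt protospacer immediately followed by an NGG motif, searches the forward strand only, and requires an exact match, whereas real nucleases tolerate some mismatches. It is not the pipeline of Drury et al. (2017) or Maselko et al. (2017); the function names and sequences are hypothetical.

```python
import re


def find_cas9_sites(seq, spacer_len=20):
    """Return (position, protospacer) for each forward-strand site
    consisting of spacer_len bases followed by an NGG motif.

    A lookahead is used so that overlapping sites are all reported.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % spacer_len)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


def fraction_targetable(spacer, haplotypes):
    """Fraction of haplotypes carrying the exact spacer followed by NGG."""
    pattern = re.compile(re.escape(spacer.upper()) + "[ACGT]GG")
    hits = sum(1 for hap in haplotypes if pattern.search(hap.upper()))
    return hits / len(haplotypes)


if __name__ == "__main__":
    spacer = "A" * 20  # toy protospacer, for illustration only
    reference = spacer + "TGG"
    print(find_cas9_sites(reference))
    haplotypes = [
        spacer + "TGG",              # intact site
        spacer + "CGG",              # variant at the N of NGG, still a site
        "A" * 19 + "C" + "TGG",      # variant inside the protospacer
        spacer + "TGA",              # NGG motif lost
    ]
    print(fraction_targetable(spacer, haplotypes))
```

Haplotypes in which the site is disrupted would be expected to resist cleavage, which is why standing variation at target sites matters when choosing where to edit.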

Conclusions

Genomic analyses have already demonstrated their greater power for the most common application of genetics in invasion biology: reconstruction of invasion history. Source populations and invasion pathways can be identified with much finer resolution, even offering the potential to perform “invasion forensics” at the geographic scales at which agencies implement invasive species management programs. Genomics has also shown greater power to investigate the “genetic paradox” of invasions, and to provide better estimates of the number and size of introductions.

In addition to these largely quantitative improvements in power (e.g. from genotyping thousands of unlinked SNPs), genomics provides qualitatively different information about genome structure that is just beginning to be explored in studies of endangered and invasive species. In endangered species, studies of the genomic architecture of homozygosity have the potential to revolutionize studies of inbreeding and inbreeding depression, as well as the design of captive breeding and genetic restoration programs. In genomes of invasive species, one of the most commonly reported structural features is gene family expansion, and comparative genomics is providing improved rigor to determine its role in the evolution of invasiveness.

A major challenge remaining for both invasion and conservation genomics is the identification of genes that control invasiveness, disease resistance, or countless other traits of conservation relevance. The perspective of Salzberg (2019) reminds us that success will be limited by the quality of the annotations of genomes of the mostly non-model species that we study. We suggest that this leads to the strongest argument in favor of community efforts to gather the needed genomic resources, and for efficient strategies to generate high quality reference genomes from carefully chosen species.

Acknowledgements We thank Benjamin Auch and Kenneth Beckman in the University of Minnesota Genomics Center, Adam Herman, Thomas Kono, Kevin Silverstein, and Ying Zhang of the Minnesota Supercomputing Institute and many other collaborators for their exceptional contributions to the zebra mussel genome project that inspired this review. Funding was provided by grants from the Minnesota Environment and Natural Resources Trust Fund and the Minnesota Aquatic Invasive Species Research Center, and from private donations.

References

Adams MD, Celniker SE, Holt RA et al (2000) The genome sequence of Drosophila melanogaster. Science 287:2185–2195

Allendorf FW (2017) Genetics and the conservation of natural populations: allozymes to genomes. Mol Ecol 26:420–430


Allendorf FW, Lundquist LL (2003) Introduction: population biology, evolution, and control of invasive species. Conserv Biol 17:24–30

Allendorf FW, Hohenlohe PA, Luikart G (2010) Genomics and the future of conservation genetics. Nat Rev Genet 11:697–709

Allendorf FW, Luikart G, Aitken SN (2013) Conservation and the genetics of populations, 2nd edn. Wiley-Blackwell, Chichester

Andrews KR, Good JM, Miller MR, Luikart G, Hohenlohe PA (2016) Harnessing the power of RADseq for ecological and evolutionary genomics. Nat Rev Genet 17:81–92

Ascunce MS, Yang C-C, Oakey J, Calcaterra L, Wu W-J, Shih C-J, Goudet J, Ross KG, Shoemaker D (2011) Global invasion history of the fire ant Solenopsis invicta. Science 331:1066–1068

Asplen MK, Anfora G, Biondi A et al (2015) Invasion biology of spotted wing Drosophila (Drosophila suzukii): a global perspective and future priorities. J Pest Sci 88:469–494

Baker HG, Stebbins GL (eds) (1965) The genetics of colonizing species. Academic Press, New York

Bana NA, Nyiri A, Nagy J et al (2018) The red deer Cervus elaphus genome CerEla1.0: sequencing, annotating, genes, and chromosomes. Mol Genet Genom 293:665–684

Barker BS, Andonian K, Swope SM, Luster DG, Dlugosch KM (2017) Population genomic analyses reveal a history of range expansion and trait evolution across the native and invaded range of yellow starthistle (Centaurea solstitialis). Mol Ecol 26:1131–1147

Barrett SCH (2017) Foundations of invasion genetics: The Baker and Stebbins legacy. In: Barrett SCH, Colautti RI, Dlugosch KM, Rieseberg LH (eds) Invasion genetics: the Baker and Stebbins legacy. Wiley, Chichester, pp 1–20

Berlin K, Koren S, Chin C-S, Drake JP, Landolin JM, Phillippy AM (2015) Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat Biotechnol 33:623–630

Berthelot C, Brunet F, Chalopin D et al (2014) The rainbow trout genome provides novel insights into evolution after whole-genome duplication in vertebrates. Nat Commun 5:3657

Bickhart DM, Rosen BD, Koren S et al (2017) Single-molecule sequencing and chromatin conformation capture enable de novo reference assembly of the domestic goat genome. Nat Genet 49:643–650

Black WC IV, Baer CF, Antolin MF, DuTeau NM (2001) Population genomics: genome-wide sampling of insect populations. Annu Rev Entomol 46:441–469

Blattner FR, Plunkett G 3rd, Bloch CA et al (1997) The complete genome sequence of Escherichia coli K-12. Science 277:1453–1462

Bohme U, Otto TD, Cotton JA et al (2018) Complete avian malaria parasite genomes reveal features associated with lineage-specific evolution in birds and mammals. Genome Res 28:547–560

Bourne SD, Hudson J, Holman LE, Rius M (2018) Marine invasion genomics: revealing ecological and evolutionary consequences of biological invasions. In: Rajora OP (ed) Population genomics: concepts, approaches and applications. Springer, Switzerland, pp 1–36

Burri R, Nater A, Kawakami T et al (2015) Linked selection and recombination rate variation drive the evolution of the genomic landscape of differentiation across the speciation continuum of Ficedula flycatchers. Genome Res 25:1656–1665

Burt A (2003) Site-specific selfish genes as tools for the control and genetic engineering of natural populations. Proc R Soc of Lond Ser B 270:921–928

Burton JN, Adey A, Patwardhan RP, Qiu R, Kitzman JO, Shendure J (2013) Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat Biotechnol 31:1119–1125

Carneiro M, Rubin C-J, Di Palma F et al (2014) Rabbit genome analysis reveals a polygenic basis for phenotypic change during domestication. Science 345:1074–1079

Catchen J, Bassham S, Wilson T, Currey M, O’Brien C, Yeates Q, Cresko WA (2013a) The population structure and recent colonization history of Oregon threespine stickleback determined using restriction-site associated DNA-sequencing. Mol Ecol 22:2864–2883

Catchen J, Hohenlohe PA, Bassham S, Amores A, Cresko WA (2013b) Stacks: an analysis tool set for population genomics. Mol Ecol 22:3124–3140

Champer J, Buchman A, Akbari OS (2016) Cheating evolution: engineering gene drives to manipulate the fate of wild populations. Nat Rev Genet 17:146–159

Chen X-G, Jiang X, Gu J et al (2015) Genome sequence of the Asian Tiger mosquito, Aedes albopictus, reveals insights into its biology, genetics, and evolution. Proc Natl Acad Sci USA 112:E5907–E5915

Chen W, Hasegawa DK, Kaur N et al (2016) The draft genome of whitefly Bemisia tabaci MEAM1, a global crop pest, provides novel insights into virus transmission, host adaptation, and insecticide resistance. BMC Biol 14:110

Chown SL, Hodgins KA, Griffin PC, Oakeshott JG, Byrne M, Hoffmann AA (2015) Biological invasions, climate change and genomics. Evol Appl 8:23–46

Colautti RI, MacIsaac HJ (2004) A neutral terminology to define “invasive” species. Divers Distrib 10:135–141

Cresko WA, Amores A, Wilson C, Murphy J, Currey M, Phillips P, Bell MA, Kimmel CB, Postlethwait JH (2004) Parallel genetic basis for repeated evolution of armor loss in Alaskan threespine stickleback populations. Proc Natl Acad Sci USA 101:6050–6055

Cristescu ME (2015) Genetic reconstructions of invasion history. Mol Ecol 24:2212–2225

Darling JA, Bagley MJ, Roman J, Tepolt CK, Geller JB (2008) Genetic patterns across multiple introductions of the globally invasive crab genus Carcinus. Mol Ecol 17:4992–5007

Darling JA, Tsai YH, Blakeslee AM, Roman J (2014) Are genes faster than crabs? Mitochondrial introgression exceeds larval dispersal during population expansion of the invasive crab Carcinus maenas. R Soc Open Sci 1:140202

Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML (2011) Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nat Rev Genet 12:499–510

de Koning APJ, Gu W, Castoe TA, Batzer MA, Pollock DD (2011) Repetitive elements may comprise over two-thirds of the human genome. PLoS Genet 7:e1002384

Dlugosch KM, Parker IM (2007) Founding events in species invasions: genetic variation, adaptive evolution, and the role of multiple introductions. Mol Ecol 17:431–449

Dlugosch KM, Anderson SR, Braasch J, Cang FA, Gillette HD (2015) The devil is in the details: genetic variation in introduced populations and its contributions to invasion. Mol Ecol 24:2095–2111

Dobzhansky T (1965) “Wild” and “domestic” species of Drosophila. In: Baker HG, Stebbins GL (eds) The genetics of colonizing species. Academic Press, New York, pp 533–546

Drury DW, Dapper AL, Siniard DJ, Zentner GE, Wade MJ (2017) CRISPR/Cas9 gene drives in genetically variable and nonrandomly mating wild populations. Sci Adv 3:e1601910

Drysdale RA, Crosby MA, FlyBase C (2005) FlyBase: genes and gene models. Nucleic Acids Res 33:D390–D395

Edwards RJ, Tuipulotu DE, Amos TG et al (2018) Draft genome assembly of the invasive cane toad, Rhinella marina. GigaScience 7:1–13

Elleouet JS, Aitken SN (2018) Exploring Approximate Bayesian Computation for inferring recent demographic history with genomic markers in nonmodel species. Mol Ecol Resour 18:525–540

Emerson KJ, Merz CR, Catchen JM, Hohenlohe PA, Cresko WA, Bradshaw WE, Holzapfel CM (2010) Resolving postglacial phylogeography using high-throughput sequencing. Proc Natl Acad Sci USA 107:16196–16200

Engelbrecht J, Duong TA, Berg NVD (2017) New microsatellite markers for population studies of Phytophthora cinnamomi, an important global pathogen. Sci Rep 7:17631

Estoup A, Guillemaud T (2010) Reconstructing routes of invasion using genetic data: why, how and so what? Mol Ecol 19:4113–4130

Estoup A, Ravigné V, Hufbauer R, Vitalis R, Gautier M, Facon B (2016) Is there a genetic paradox of biological invasion? Annu Rev Ecol Evol Syst 47:51–72

Farrer RA, Weinert LA, Bielby J et al (2011) Multiple emergences of genetically diverse amphibian-infecting chytrids include a globalized hypervirulent recombinant lineage. Proc Natl Acad Sci USA 108:18732–18736

Fleischmann RD, Adams MD, White O et al (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269:496–512

Frankham R (2004) Resolving the genetic paradox in invasive species. Heredity 94:385

Gantz VM, Bier E (2015) The mutagenic chain reaction: a method for converting heterozygous to homozygous mutations. Science 348:442–444

Gantz VM, Jasinskiene N, Tatarenkova O, Fazekas A, Macias VM, Bier E, James AA (2015) Highly efficient Cas9-mediated gene drive for population modification of the malaria vector mosquito Anopheles stephensi. Proc Natl Acad Sci USA 112:E6736

1001 Genomes Consortium (2016) 1135 genomes reveal the global pattern of polymorphism in Arabidopsis thaliana. Cell 166:481–491

Goffeau A, Barrell BG, Bussey H et al (1996) Life with 6000 genes. Science 274:546

Goodswen SJ, Kennedy PJ, Ellis JT (2012) Evaluating high-throughput ab initio gene finders to discover proteins encoded in eukaryotic pathogen genomes missed by laboratory techniques. PLoS ONE 7:e50609

Goodwin S, Gurtowski J, Ethe-Sayers S, Deshpande P, Schatz MC, McCombie WR (2015) Oxford Nanopore sequencing, hybrid error correction, and de novo assembly of a eukaryotic genome. Genome Res 25:1750–1756

Gouin A, Bretaudeau A, Nam K et al (2017) Two genomes of highly polyphagous lepidopteran pests (Spodoptera frugiperda, Noctuidae) with different host-plant ranges. Sci Rep 7:11816

Hall MR, Kocot KM, Baughman KW et al (2017) The crown-of-thorns starfish genome as a guide for biocontrol of this coral reef pest. Nature 544:231–234

Hammond SA, Warren RL, Vandervalk BP et al (2017) The North American bullfrog draft genome provides insight into hormonal regulation of long noncoding RNA. Nat Commun 8:1433

Henson J, Tischler G, Ning Z (2012) Next-generation sequencing and large genome assemblies. Pharmacogenomics 13:901–915

Higashino A, Sakate R, Kameoka Y, Takahashi I, Hirata M, Tanuma R, Masui T, Yasutomi Y, Osada N (2012) Whole-genome sequencing and analysis of the Malaysian cynomolgus macaque (Macaca fascicularis) genome. Genome Biol 13:R58

Hoffberg SL, Troendle NJ, Glenn TC, Mahmud O, Louha S, Chalopin D, Bennetzen JL, Mauricio R (2018) A high-quality reference genome for the invasive mosquitofish Gambusia affinis using a Chicago Library. G3 (Bethesda) 8:1855–1861

Hohenlohe PA, Bassham S, Etter PD, Stiffler N, Johnson EA, Cresko WA (2010) Population genomics of parallel adaptation in threespine stickleback using sequenced RAD tags. PLoS Genet 6:e1000862

Holt RA, Subramanian GM, Halpern A et al (2002) The genome sequence of the malaria mosquito Anopheles gambiae. Science 298:129–149

Horvath DP, Patel S, Doğramaci M et al (2018) Gene space and transcriptome assemblies of leafy spurge (Euphorbia esula) identify promoter sequences, repetitive elements, high-quality markers, and a full-length chloroplast genome. Weed Sci 66:355–367

Hufbauer RA, Facon B, Ravigne V, Turgeon J, Foucaud J, Lee CE, Rey O, Estoup A (2012) Anthropogenically induced adaptation to invade (AIAI): contemporary adaptation to human-altered habitats within the native range can promote invasions. Evol Appl 5:89–101

Jain M, Olsen HE, Paten B, Akeson M (2016) The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol 17:239

Jarvis ED (2016) Perspectives from the avian phylogenomics project: questions that can be answered with sequencing all genomes of a vertebrate class. Annu Rev Anim Biosci 4:45–59

Jeffery NW, DiBacco C, Van Wyngaarden M et al (2017) RAD sequencing reveals genomewide divergence between independent invasions of the European green crab (Carcinus maenas) in the Northwest Atlantic. Ecol Evol 7:2513–2524

Joneson S, Stajich JE, Shiu S-H, Rosenblum EB (2011) Genomic transition to pathogenicity in chytrid fungi. PLoS Pathol 7:e1002338

Kardos M, Taylor HR, Ellegren H, Luikart G, Allendorf FW (2016) Genomics advances the study of inbreeding depression in the wild. Evol Appl 9:1205–1218

Kardos M, Åkesson M, Fountain T et al (2017a) Genomic consequences of intensive inbreeding in an isolated wolf population. Nat Ecol Evol 2:124–131

Kardos M, Qvarnstrom A, Ellegren H (2017b) Inferring individual inbreeding and demographic history from segments of identity by descent in Ficedula flycatcher genome sequences. Genetics 205:1319–1334

Kingan SB, Heaton H, Cudini J, Lambert CC, Baybayan P, Galvin BD, Durbin R, Korlach J, Lawniczak MKN (2019) A high-quality de novo genome assembly from a single mosquito using PacBio sequencing. Genes (Basel) 10:1–11

Kingsford C, Schatz MC, Pop M (2010) Assembly complexity of prokaryotic genomes using short reads. BMC Bioinformatics 11:21

Koren S, Schatz MC, Walenz BP et al (2012) Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat Biotechnol 30:693–700

Kotsakiozi P, Richardson JB, Pichler V, Favia G, Martins AJ, Urbanelli S, Armbruster PA, Caccone A (2017) Population genomics of the Asian tiger mosquito, Aedes albopictus: insights into the recent worldwide invasion. Ecol Evol 7:10143–10157

Kyrou K, Hammond AM, Galizi R, Kranjc N, Burt A, Beaghton AK, Nolan T, Crisanti A (2018) A CRISPR-Cas9 gene drive targeting doublesex causes complete population suppression in caged Anopheles gambiae mosquitoes. Nat Biotechnol 36:1062–1066

Lack JB, Cardeno CM, Crepeau MW, Taylor W, Corbett-Detig RB, Stevens KA, Langley CH, Pool JE (2015) The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population. Genetics 199:1229–1241

Lander ES, Linton LM, Birren B et al (2001) Initial sequencing and analysis of the human genome. Nature 409:860–921

Lee CE (2002) Evolutionary genetics of invasive species. Trends Ecol Evol 17:386–391

Li M, Chen L, Tian S et al (2017) Comprehensive variation discovery and recovery of missing sequence in the pig genome using multiple de novo assemblies. Genome Res 27:865–874

Liu C, Jiang F, Wang H et al (2018) The genome of the golden apple snail Pomacea canaliculata provides insight into stress tolerance and invasive adaptation. GigaScience 7:giy101

Lombaert E, Guillemaud T, Cornuet J-M, Malausa T, Facon B, Estoup A (2010) Bridgehead effect in the worldwide invasion of the biocontrol harlequin ladybird. PLoS ONE 5:e9743

Lombaert E, Guillemaud T, Lundgren J et al (2014) Complementarity of statistical treatments to reconstruct worldwide routes of invasion: the case of the Asian ladybird Harmonia axyridis. Mol Ecol 23:5979–5997

Lowe S, Browne M, Boudjelas S, De Poorter M (2000) 100 of the world’s worst invasive alien species: a selection from the global invasive species database. Auckland, New Zealand, p 12

Luikart G, England PR, Tallmon D, Jordan S, Taberlet P (2003) The power and promise of population genomics: from genotyping to genome typing. Nat Rev Genet 4:981

Makkonen J, Vesterbacka A, Martin F, Jussila J, Dieguez-Uribeondo J, Kortet R, Kokko H (2016) Mitochondrial genomes and comparative genomics of Aphanomyces astaci and Aphanomyces invadans. Sci Rep 6:36089

Manni M, Guglielmino CR, Scolari F et al (2017) Genetic evidence for a worldwide chaotic dispersion pattern of the arbovirus vector, Aedes albopictus. PLoS Negl Trop Dis 11:e0005332

Maselko M, Heinsch SC, Chacón JM, Harcombe WR, Smanski MJ (2017) Engineering species-like barriers to sexual reproduction. Nat Commun 8:883

Mastretta-Yanes A, Arrigo N, Alvarez N, Jorgensen TH, Pinero D, Emerson BC (2015) Restriction site-associated DNA sequencing, genotyping error estimation and de novo assembly optimization for population genetic inference. Mol Ecol Resour 15:28–41

Matthews BJ, Dudchenko O, Kingan SB et al (2018) Improved reference genome of Aedes aegypti informs arbovirus vector control. Nature 563:501–507

McCartney MA, Auch B, Kono T et al (2019) The genome of the zebra mussel, Dreissena polymorpha: a resource for invasive species research. bioRxiv. https://doi.org/10.1101/696732v1

McKenna DD, Scully ED, Pauchet Y et al (2016) Genome of the Asian longhorned beetle (Anoplophora glabripennis), a globally significant invasive species, reveals key functional and evolutionary innovations at the beetle–plant interface. BMC Genome Biol 17:227

Metzker ML (2010) Sequencing technologies—the next generation. Nat Rev Genet 11:31–46

Miller N, Estoup A, Toepfer S et al (2005) Multiple transatlantic introductions of the western corn rootworm. Science 310:992

Minardi D, Studholme DJ, van der Giezen M, Pretto T, Oidtmann B (2018) New genotyping method for the causative agent of crayfish plague (Aphanomyces astaci) based on whole genome data. J Invertebr Pathol 156:6–13

Moll KM, Zhou P, Ramaraj T et al (2017) Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula. BMC Genom 18:578

Murgarella M, Puiu D, Novoa B, Figueras A, Posada D, Canchaya C (2016) A first insight into the genome of the filter-feeder mussel Mytilus galloprovincialis. PLoS ONE 11:e0151561

Narum SR, Buerkle CA, Davey JW, Miller MR, Hohenlohe PA (2013) Genotyping-by-sequencing in ecological and conservation genomics. Mol Ecol 22:2841–2847

NCBI (2019a) [National Center for Biotechnology Information, National Institutes of Health, US National Library of Medicine], Genome resource. https://www.ncbi.nlm.nih.gov/genome/

NCBI (2019b) List of BioProjects, filtered for data type “Genome sequencing and assembly”. https://www.ncbi.nlm.nih.gov/bioproject/browse/

NCBI (2019c) Gasterosteus aculeatus reference genome. https://www.ncbi.nlm.nih.gov/assembly/GCA_000180675.1/#/s

NCBI (2019d) Homo sapiens genome assemblies from 2001 to present. https://www.ncbi.nlm.nih.gov/assembly/?term=Homo+sapiens

Nene V, Wortman JR, Lawson D et al (2007) Genome sequence of Aedes aegypti, a major arbovirus vector. Science 316:1718

Nielsen R, Paul JS, Albrechtsen A, Song YS (2011) Genotype and SNP calling from next-generation sequencing data. Nat Rev Genet 12:443–451

Noble C, Adlam B, Church GM, Esvelt KM, Nowak MA (2018) Current CRISPR gene drive systems are likely to be highly invasive in wild populations. eLife 7:e33423

Nowoshilow S, Schloissnig S, Fei J-F et al (2018) The axolotl genome and the evolution of key tissue formation regulators. Nature 554:50–55

Ometto L, Cestaro A, Ramasamy S et al (2013) Linking genomics and ecology to investigate the complex evolution of an invasive Drosophila pest. Genome Biol Evol 5:745–757

Ostrander EA, Wang G-D, Larson G et al (2019) Dog10K: an international sequencing effort to advance studies of canine domestication, phenotypes and health. Natl Sci Rev. https://doi.org/10.1093/nsr/nwz049/5437695

Padeken J, Zeller P, Gasser SM (2015) Repeat DNA in genome organization and stability. Curr Opin Genet Dev 31:12–19

Pelissie B, Crossley MS, Cohen ZP, Schoville SD (2018) Rapid evolution in insect pests: the importance of space and time in population genomics studies. Curr Opin Insect Sci 26:8–16

Peona V, Weissensteiner MH, Suh A (2018) How complete are “complete” genome assemblies? An avian perspective. Mol Ecol Resour 18:1188–1195

Ramasamy S, Ometto L, Crava CM et al (2016) The evolution of olfactory gene families in Drosophila and the genomic basis of chemical-ecological adaptation in Drosophila suzukii. Genome Biol Evol 8:2297–2311

Rhoads A, Au KF (2015) PacBio sequencing and its applications. Genomics Proteomics Bioinf 13:278–289

Ricciardi A, Cohen J (2006) The invasiveness of an introduced species does not predict its impact. Biol Invas 9:309–315

Rius M, Bourne S, Hornsby HG, Chapman MA (2015) Applications of next-generation sequencing to the study of biological invasions. Curr Zool 61:488–504

Roberts RJ, Church DM, Goodstadt L et al (2009) Lineage-specific biology revealed by a finished genome assembly of the mouse. PLoS Biol 7:e1000112

Rode NO, Estoup A, Bourguet D, Courtier-Orgogozo V, Débarre F (2019) Population management using gene drive: molecular design, models of spread dynamics and assessment of ecological risks. Conserv Genet 20:671–690

Roman J (2006) Diluting the founder effect: cryptic invasions expand a marine invader’s range. Proc R Soc Lond Ser B 273:2453–2459

Roman J, Darling JA (2007) Paradox lost: genetic diversity and the success of aquatic invasions. Trends Ecol Evol 22:454–464

Ryan JF, Pang K, Schnitzler CE et al (2013) The genome of the ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science 342:1242592

Salzberg SL (2019) Next-generation genome annotation: we still struggle to get it right. Genome Biol 20:92

Sard N, Robinson J, Kanefsky J, Herbst S, Scribner K (2019) Coalescent models characterize sources and demographic history of recent round goby colonization of Great Lakes and inland waters. Evol Appl 12:1034–1049

Sax DF, Stachowicz JJ, Brown JH et al (2007) Ecological and evolutionary insights from species invasions. Trends Ecol Evol 22:465–471

Schatz MC, Delcher AL, Salzberg SL (2010) Assembly of large genomes using second-generation sequencing. Genome Res 20:1165–1173

Schwartz DC, Li X, Hernandez LI, Ramnarain SP, Huff EJ, Wang YK (1993) Ordered restriction maps of Saccharomyces cerevisiae chromosomes constructed by optical mapping. Science 262:110–114

Seo JS, Rhie A, Kim J et al (2016) De novo assembly and phasing of a Korean human genome. Nature 538:243–247

Sharon D, Tilgner H, Grubert F, Snyder M (2013) A single-molecule long-read survey of the human transcriptome. Nat Biotechnol 31:1009–1014

Sherman CDH, Lotterhos KE, Richardson MF, Tepolt CK, Rollins LA, Palumbi SR, Miller AD (2016) What are we missing about marine invasions? Filling in the gaps with evolutionary genomics. Mar Biol 163:198

Simao FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM (2015) BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31:3210–3212

Smith CD, Zimin A, Holt C et al (2011) Draft genome of the globally widespread and invasive Argentine ant (Linepithema humile). Proc Natl Acad Sci USA 108:5673–5678

Stanke M, Morgenstern B (2005) AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints. Nucleic Acids Res 33:W465–W467

Studholme DJ, McDougal RL, Sambles C, Hansen E, Hardy G, Grant M, Ganley RJ, Williams NM (2016) Genome sequences of six Phytophthora species associated with forests in New Zealand. Genom Data 7:54–56

Tamazian G, Simonov S, Dobrynin P et al (2014) Annotated features of domestic cat–Felis catus genome. GigaScience 3:13

Venter JC, Adams MD, Myers EW et al (2001) The sequence of the human genome. Science 291:1304–1351

Weissensteiner MH, Pang AWC, Bunikis I, Hoijer I, Vinnere-Petterson O, Suh A, Wolf JBW (2017) Combination of short-read, long-read, and optical mapping assemblies reveals large-scale tandem repeat arrays with population genetic implications. Genome Res 27:697–708

Wolf JB, Ellegren H (2017) Making sense of genomic islands of differentiation in light of speciation. Nat Rev Genet 18:87–100

Xu P, Zhang X, Wang X et al (2014) Genome sequence and genetic diversity of the common carp, Cyprinus carpio. Nat Genet 46:1212–1219

Zhang G, Fang X, Guo X et al (2012) The oyster genome reveals stress adaptation and complexity of shell formation. Nature 490:49–54

Zheng GXY, Lau BT, Schnall-Levin M et al (2016) Haplotyping germline and cancer genomes with high-throughput linked-read sequencing. Nat Biotechnol 34:303–311

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.