genome-wide association study of gossypium arboreum ......resistance was conferred by renbarb2 and...

11
RESEARCH ARTICLE Open Access Genome-wide association study of Gossypium arboreum resistance to reniform nematode Ruijuan Li 1 , John E. Erpelding 2* and Salliana R. Stetina 2 Abstract Background: Reniform nematode (Rotylenchulus reniformis) has emerged as one of the most destructive root pathogens of upland cotton (Gossypium hirsutum) in the United States. Management of R. reniformis has been hindered by the lack of resistant G. hirsutum cultivars; however, resistance has been frequently identified in germplasm accessions from the G. arboreum collection. To determine the genetic basis of reniform nematode resistance, a genome-wide association study (GWAS) was performed using 246 G. arboreum germplasm accessions that were genotyped with 7220 single nucleotide polymorphic (SNP) sequence markers generated from genotyping- by-sequencing. Results: Fifteen SNPs representing 12 genomic loci distributed over eight chromosomes showed association with reniform nematode resistance. For 14 SNPs, major alleles were shown to be associated with resistance. From the 15 significantly associated SNPs, 146 genes containing or physically close to these loci were identified as putative reniform nematode resistance candidate genes. These genes are involved in a broad range of biological pathways, including plant innate immunity, transcriptional regulation, and redox reaction that may have a role in the expression of resistance. Eighteen of these genes corresponded to differentially expressed genes identified from G. hirsutum in response to reniform nematode infection. Conclusions: The identification of multiple genomic loci associated with reniform nematode resistance would indicate that the G. arboreum collection is a significant resource of novel resistance genes. The significantly associated markers identified from this GWAS can be used for the development of molecular tools for breeding improved reniform nematode resistant upland cotton with resistance introgressed from G. arboreum. Additionally, a greater understanding of the molecular mechanisms of reniform nematode resistance can be determined through genetic structure and functional analyses of candidate genes, which will aid in the pyramiding of multiple resistance genes. Keywords: Cotton, Genome-wide association study, Genotyping-by-sequencing, Germplasm, Gossypium arboreum, Host-plant resistance, reniform nematode, Single nucleotide polymorphism Background Reniform nematode (RN, Rotylenchulus reniformis) is an obligate, semi-endoparasitic nematode that feeds on the root system of upland cotton (Gossypium hirsutum) causing significant yield losses [1, 2]. The United States ranks third in worldwide cotton production with 2.8 million metric tons (MMT) produced in 2016 (http:// apps.fas.usda.gov/psdonline/psdDataPublications.aspx). Yield losses from RN typically range from 1 to 5% in the southeastern cotton producing states [2] with an esti- mated loss of 24,211 MT in 2015 valued at approximately US$36 million [3]. According to Robinson [1], RN has replaced root-knot nematode (Meloidogyne spp.) as the major pathogenic nematode of cotton in the mid-south region of United States. RN infection in cotton is initiated when vermiform preadult females penetrate the stele of the roots, where successful parasitism results in the estab- lishment of a multi-nucleus syncytium inside the nema- tode feeding cells, which serves as the sole nutrient source [1, 2]. Symptoms of RN infestation include plant stunting, * Correspondence: [email protected] 2 USDA-ARS, Crop Genetics Research Unit, 141 Experiment Station Road, PO Box 345, Stoneville, MS 38776, USA Full list of author information is available at the end of the article © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Li et al. BMC Genetics (2018) 19:52 https://doi.org/10.1186/s12863-018-0662-3

Upload: others

Post on 27-Jan-2021

0 views

Category:

Documents


0 download

TRANSCRIPT

  • RESEARCH ARTICLE Open Access

    Genome-wide association study ofGossypium arboreum resistance toreniform nematodeRuijuan Li1, John E. Erpelding2* and Salliana R. Stetina2

    Abstract

    Background: Reniform nematode (Rotylenchulus reniformis) has emerged as one of the most destructive rootpathogens of upland cotton (Gossypium hirsutum) in the United States. Management of R. reniformis has beenhindered by the lack of resistant G. hirsutum cultivars; however, resistance has been frequently identified ingermplasm accessions from the G. arboreum collection. To determine the genetic basis of reniform nematoderesistance, a genome-wide association study (GWAS) was performed using 246 G. arboreum germplasm accessionsthat were genotyped with 7220 single nucleotide polymorphic (SNP) sequence markers generated from genotyping-by-sequencing.

    Results: Fifteen SNPs representing 12 genomic loci distributed over eight chromosomes showed association withreniform nematode resistance. For 14 SNPs, major alleles were shown to be associated with resistance. From the 15significantly associated SNPs, 146 genes containing or physically close to these loci were identified as putative reniformnematode resistance candidate genes. These genes are involved in a broad range of biological pathways, includingplant innate immunity, transcriptional regulation, and redox reaction that may have a role in the expression of resistance.Eighteen of these genes corresponded to differentially expressed genes identified from G. hirsutum in response toreniform nematode infection.

    Conclusions: The identification of multiple genomic loci associated with reniform nematode resistance would indicatethat the G. arboreum collection is a significant resource of novel resistance genes. The significantly associated markersidentified from this GWAS can be used for the development of molecular tools for breeding improved reniformnematode resistant upland cotton with resistance introgressed from G. arboreum. Additionally, a greater understandingof the molecular mechanisms of reniform nematode resistance can be determined through genetic structure andfunctional analyses of candidate genes, which will aid in the pyramiding of multiple resistance genes.

    Keywords: Cotton, Genome-wide association study, Genotyping-by-sequencing, Germplasm, Gossypium arboreum,Host-plant resistance, reniform nematode, Single nucleotide polymorphism

    BackgroundReniform nematode (RN, Rotylenchulus reniformis) is anobligate, semi-endoparasitic nematode that feeds on theroot system of upland cotton (Gossypium hirsutum)causing significant yield losses [1, 2]. The United Statesranks third in worldwide cotton production with 2.8million metric tons (MMT) produced in 2016 (http://apps.fas.usda.gov/psdonline/psdDataPublications.aspx).

    Yield losses from RN typically range from 1 to 5% in thesoutheastern cotton producing states [2] with an esti-mated loss of 24,211 MT in 2015 valued at approximatelyUS$36 million [3]. According to Robinson [1], RN hasreplaced root-knot nematode (Meloidogyne spp.) as themajor pathogenic nematode of cotton in the mid-southregion of United States. RN infection in cotton is initiatedwhen vermiform preadult females penetrate the stele ofthe roots, where successful parasitism results in the estab-lishment of a multi-nucleus syncytium inside the nema-tode feeding cells, which serves as the sole nutrient source[1, 2]. Symptoms of RN infestation include plant stunting,

    * Correspondence: [email protected], Crop Genetics Research Unit, 141 Experiment Station Road, POBox 345, Stoneville, MS 38776, USAFull list of author information is available at the end of the article

    © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, andreproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link tothe Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

    Li et al. BMC Genetics (2018) 19:52 https://doi.org/10.1186/s12863-018-0662-3

    http://crossmark.crossref.org/dialog/?doi=10.1186/s12863-018-0662-3&domain=pdfhttp://orcid.org/0000-0002-6412-0023http://apps.fas.usda.gov/psdonline/psdDataPublications.aspxhttp://apps.fas.usda.gov/psdonline/psdDataPublications.aspxmailto:[email protected]://creativecommons.org/licenses/by/4.0/http://creativecommons.org/publicdomain/zero/1.0/

  • suppressed root growth, and reduced yield; but thesesymptoms are typically uniform across the field making itdifficult to visually assess nematode damage [2].As the most widely planted cotton species, G. hirsutum

    is a natural allotetraploid (genome AADD referred to as2(AD)1) that likely resulted from the interspecifichybridization between diploid ancestors of G. arboreum(genome A2) and G. raimondii (genome D5) [4, 5]. Never-theless, no RN resistant G. hirsutum cultivars have beenidentified to effectively manage the nematode [1, 6, 7].Instead, related cotton species are being used as a sourceof resistance. RN resistance has been identified in at least10 of the 50 cotton species, including nine diploid speciesG. anomalum (genome B1B1), G. herbaceum (genomeA1A1), G. raimondii (genome D5D5), G. somalense (gen-ome E2E2), G. stocksii (genome E1E1), G. thurberi (genomeD1D1) [8], G. arboreum (genome A2A2) [8–10], G. aridum(genome D4D4) [11], G. longicalyx (genome F1F1) [8, 12],and one allotetraploid species G. barbadense (genomeAADD referred to as 2(AD)2) [7, 8, 13]. Among them,only G. longicalyx showed immunity to RN [8, 12]. Intro-gression of resistance from G. longicalyx into G. hirsutumwas successful [12] resulting in the release of two breedinglines, LONREN-1 (PI 669509) and LONREN-2 (PI669510) [14]. However, severe root necrosis and progres-sive root mass loss under high levels of RN inoculum havebeen reported for the G. longicalyx source of resistance,which are typical of hypersensitive responses [15]. Thisroot damage results in severe plant stunting, poor growthrate, and reduced lint yields [16], which has hindered theuse of the G. longicalyx source of resistance in cottonimprovement programs. RN resistance from G. aridumand G. arboreum has also been introgressed into G. hirsu-tum [10, 11]; however, the difficulties of introgressing anddeveloping cultivars with stable resistance from theseother cotton diploid species have slowed progress in culti-var development. Greater success has been achieved withRN resistance derived from G. barbadense; although, fewG. barbadense germplasm accessions showed resistance[1, 6–8]. Currently, RN resistance from two G. barbadensegermplasm accessions (PI 163608 and PI 608139) hasbeen introgressed into G. hirsutum with the release ofbreeding lines [17–20]. To increase the diversity of RNresistance, additional resistant cultivars derived fromdifferent resources are required and the G. arboreumgermplasm collection has been shown to be a majorsource of resistant accessions [8, 21].Studies of the genetic inheritance of RN resistance

    introgressed from G. longicalyx identified one dominantlyinherited gene that conferred resistance, which was local-ized to the A sub-genome of G. hirsutum on chromosome11 (Renlon) and flanked by codominant simple sequencerepeat (SSR) marker BNL3279_114 [22]. The RN resist-ance derived from G. barbadense was governed by three

    quantitative trait loci (QTL), Renbarb1 and Renbarb2 onchromosome 21 and Renbarb3 on chromosome 18, thatshowed a D sub-genome origin with each having a signifi-cant additive and dominance effect [13]. Recently, Wub-ben et al. [23] reported that QTL Renbarb1 and Renbarb2 onchromosome 21 can be resolved to a single locus(Renbarb2). They also showed that the majority of RNresistance was conferred by Renbarb2 and that locusRenbarb3 did not confer resistance in the absent ofRenbarb2; although, the two QTL were required to showexpression of resistance similar to the G. barbadense par-ental accession. Locus Renbarb2 associated with the sameSSR marker (BNL3279_109) [13] reported for locus Renlon

    on chromosome 11. These regions on chromosomes 11and 21 are putative homeologous regions [24] and resist-ance genes for several other cotton disease were associ-ated with these regions [25–28]. RN resistance derivedfrom G. aridum was also of D sub-genome origin andmapped to a single locus (Renari) on G. hirsutum chromo-some 21 flanked by two SSR markers BNL3279_132 andBNL2662_090 [11]. QTL Renari and Renbarb2 may beallelic [13]; whereas, Renari and Renlon are possibly dupli-cated genes on homeologous regions of the G. hirsutumgenome [24]. Thus, marker BNL3279 that showed differ-ent amplicon sizes was common among the three differentRN resistance sources. DNA sequences of bacterial artifi-cial chromosome clones [28] and RNA-seq [29] quantifi-cation of genes close to marker BNL3279 suggested thatresistance (R) genes, including leucine-rich repeat (LRR)receptor-like kinases and nucleotide binding site-LRRdomain-containing protein, could be associated with RNresistance.The traditional approach for mapping genes responsible

    for a biological trait is family-based linkage mapping com-monly using bi-parental populations, which capture therecombination events between chromosomes inheritedfrom two inbred strains. This approach has been widelyused in cotton improvement programs to determine gen-omic regions harboring genes for disease resistance, fiberyield, and lint quality traits in order to identify markerslinked to these traits for marker-assisted breeding [30, 31].While this method is powerful to map the causativegenomic locations for various traits specific to a definedpopulation, the low genomic resolution and time requiredto generate a population for more precise mapping limitsits application [32]. In comparison, a Genome-Wide Asso-ciation Study (GWAS) exploits all the historical recombin-ation present in a natural population or in a collection ofgermplasm accessions. By employing the non-randomassociation (LD, linkage disequilibrium) between alleles atunique loci along the chromosomes, GWAS is able toidentify the causative marker-trait associations and/orassociations in LD with the causative loci, given sufficientgenome-wide markers are provided. Thus, with GWAS, a

    Li et al. BMC Genetics (2018) 19:52 Page 2 of 11

  • higher genomic resolution can be reached and more alleles,which are not present in the mapping parents used for thefamily-based method, can be captured, although rare allelesassociated with important traits might be lost [33].GWAS has been conducted in cotton to identify SSR

    markers associated with fiber quality traits [34] and Verti-cillium wilt resistance [35]. These studies had limitationsfor both mapping resolution and genome coverage, be-cause of the availability of a small number of SSR markersfor genotyping. Next generation sequencing technologyhas greatly increased the number of markers available forgenotyping. Genotyping-by-sequencing (GBS), whichemploys reduced representation sequencing strategy, cangenerate genome-wide single nucleotide polymorphism(SNP) markers at a low cost and high efficiency [36].Alternately, a CottonSNP63K array with over 45 k puta-tive intraspecific and over 17 k interspecific SNP markerswas recently released [37]; although, the costs associatedwith this technology remain high. Application of thesepowerful technologies and resources in GWAS will helpthe advancement of genetic studies of different traits ofcotton including RN resistance.In an effort to utilize new sources of RN resistance

    identified from the G. arboreum germplasm collection[21] and dissect the genetic basis of this resistance, thisstudy genotyped 246 G. arboreum accessions for GWASof RN resistance. The aims were (i) to detect QTL andSNPs associated with G. arboreum RN resistance, and(ii) to identify candidate resistance genes in the associ-ated genomic regions. The detected SNPs from thisstudy can serve as important tools in marker-assistedbreeding for the introgression of resistance and thedevelopment of G. hirsutum cultivars with improved RNresistance and to increase the diversity of resistance.Additionally, the candidate genes identified from thisstudy will help in understanding and evaluation of themolecular mechanisms of resistance to improve theutilization of RN resistance sources.

    MethodsEvaluation of reniform nematode infection response of G.arboreum accessionsAn association mapping panel consisting of 246 G. arbor-eum accessions (Additional file 1: Table S1) was selectedfrom the USDA National Plant Germplasm System cottoncollection (https://npgsweb.ars-grin.gov). These accessionsrepresent landraces and cultivars that were donated to thecollection before 2001 [38]. To minimize within-accessionvariation, self-pollinated bolls from one plant wereselected for each accession. These self-pollinated seedswere used to evaluate nematode resistance and for DNAisolation. The reniform nematode infection response forthe accessions was evaluated in growth chamber testsmaintained at 28 °C using a 16 h photoperiod as described

    by Stetina and Erpelding [21]. Briefly, accessions wereplanted in plastic pots (Ray Leach SC10U Cone-tainer,Stuewe and Sons Inc., Tangent, OR) containing a steampasteurized soil mixture composed of two parts sand andone part sandy loam soil. One seedling representing anindividual accession was maintained in each pot and potswere arranged in a completely randomized experimentaldesign with three replications. The G. hirsutum accessionPI 529251 (cv. Deltapine 16) was included as a susceptiblecontrol [8]. The G. arboreum accession PI 615699 wasused as a resistant control [9] and was included in themapping panel. Pots were watered daily using an auto-matic watering system and the duration was increasedwith seedling growth to maintain adequate soil moisture.Seven days after planting, seedlings were inoculated withapproximately 1000 vermiform reniform nematodes of alocal field collected Mississippi isolate MSRR04 [39],which had been maintained on tomato (Solanum lycoper-sicon cv. Rutgers). Approximately 28 days after inocula-tion, individual plants were removed from pots and theroot system was gently agitated in tap water to removesoil. Roots were stained with red food coloring [40] andthe number of swollen females attached to the root systemwere counted. To compensate for differences in root size,root samples were placed on paper towels to removeexcess moisture and weighed with results expressed as thenumber of females per gram of fresh root weight. Becausethe accessions were screened as part of the breedingprogram and not evaluated in a single test, the mean levelof infection for each accession was expressed as a femaleindex (FI) value [41]; where the infection response isexpressed as a percentage of the average number offemales infecting the root system of the susceptible con-trol included in each test. Accessions with FI values lessthan 10% were classified as resistant, moderately resistantwith values ranging from 10 to 30%, moderately suscep-tible with values ranging from 31 to 60%, and susceptiblewith index values greater than 60%.

    Genotyping of G. arboreum accessionsOne self-pollinated seed for each accession was planted inthe greenhouse and leaf samples were collected from eachaccession 30 days after planting. DNA was extracted fromthe 246 accessions using the DNeasy Plant Mini Kit (Qia-gen, Valencia, CA) following the manufacturer’s protocol.Quantification of the DNA samples was conducted usingthe Quant-iT PicoGreen dsDNA Assay Kit (MolecularProbes, Inc., Eugene, OR) with DNA quality assayed byelectrophoresis of a 1 uL sample on a 1% agarose gel using1× TBE running buffer.GBS was performed at the Institute for Genomic Diver-

    sity, Cornell University, Ithaca, New York as described byElshire et al. [42]. Restriction enzyme ApeKI was used forthe preparation of DNA libraries. SNPs were identified

    Li et al. BMC Genetics (2018) 19:52 Page 3 of 11

    https://npgsweb.ars-grin.gov

  • using the TASSEL-GBS pipeline [43] in TASSEL: 3.0.166following the method described in Li and Erpelding [38].Briefly, raw fastQ sequences were trimmed of barcodesand reads from the fastQ files were collapsed into onemaster TagCounts file containing unique tags along withtheir associated read count information. These uniquetags were then mapped to the G. arboreum (cv. Shixiyal,SXY1) reference genome [44] by Burrows-WheelerAligner [45] and tags that aligned to unique positions onthe genome were retained for SNP calling. SNP discoverywas performed for each set of tags that aligned to theexact same starting genomic position and strand. Thegenotype of the SNP was then determined by the defaultbinomial likelihood ratio method of quantitative SNP call-ing in TASSEL: 3.0.166 [43]. Additional filtering stepswere conducted in TASSEL to generate SNPs for GWASusing minimum minor allele frequency of 0.05, minimumtaxa coverage of 0.8, minimum site coverage of 0.1, andmaximum heterozygosity ratio of 0.2.

    Association mapping and linkage disequilibriumGWAS was performed using the compressed mixed linearmodel (CMLM) [46] implemented in Genome Associationand Prediction Integrated Tool (GAPIT) [47]. Both kinship,calculated using the Efficient Mixed-Model Association(EMMA) method, and principle component (PC) analyseswere conducted in GAPIT to correct for false-positive asso-ciations from familial relatedness and population structure.The optimal number of PCs for GWAS was determined byactivating the Bayesian information criterion (BIC)-basedmodel selection in GAPIT. To correct SNP genotypesignificance level for multiple testing, a permutation testwas performed. In brief, phenotypic values for the differentaccessions were swapped 1000 times. Association tests wereran on the original genotypic dataset with each phenotypicdataset harboring swapped trait values. Then, P-values forthe most significant SNPs from each association test wereextracted and ordered from smallest to largest. P-valuethresholds for the genome-wide significant SNPs weredecided as the 50th of the 1000 P-values (i.e. 0.05 thresh-old) generated from association tests. Genome-wide LD forlinked loci (loci on the same chromosome) was calculatedin TASSEL 5.0 [43] in terms of squared allele frequencycorrelation (r2). Decay of LD was displayed by calculatingthe averages of r2 values between the linked pairs of SNPloci with physical distance from 0 to 2000 kb using acustom Perl script. The Manhattan and quantile-quantileplots were generated using R package qqman [48]. The LDplot for significantly associated SNPs was constructed withHaploview software 4.2 [49].

    Allelic effectsEstimates of allelic effects generated by the linear modelof GAPIT were examined for the significantly associated

    SNPs to determine the resistance allele. Resistance al-leles are those having negative allelic effects on FI values.The additive effects of the identified significantly associ-ated SNPs were examined by calculating the Pearsonproduct-moment correlation coefficient between thenumber of resistance alleles carried by each accessionand their respective reniform nematode resistance level(in terms of FI values). Figures and statistical tests weregenerated and performed in R.

    Candidate genes of reniform nematode resistanceThe G. arboreum reference genomic sequences along withits gene model information were retrieved from the Cot-ton Genome Project database (http://cgp.genomics.org.cn/page/species/index.jsp). Candidate genes were obtained byextracting gene models within 72 kb of the significantlyassociated SNP, which is the total contig N50 length forthe published reference genome [44]. Expression levels ofthe identified candidate genes in different G. hirsutum ge-notypes with and without reniform nematode infestationwere retrieved from Li et al. [29].

    ResultsPhenotypic variation in reniform nematode resistanceRN resistance based on FI values relative to the RN suscep-tible G. hirsutum control genotype revealed a broad rangein disease response with values from 2.8 to 140.3 for the246 G. arboreum accessions (Additional file 1: Table S1).An approximate continuous distribution of RN resistancewas observed with more accessions identified as moderatelyresistant (43%) or moderately susceptible (41%) comparedto accessions categorized as resistant (4%) or susceptible(13%) (Fig. 1).

    Linkage disequilibrium, genome-wide association study,and allelic effectsUsing 7220 SNPs (Additional file 2: Table S2), a meanpairwise r2 of 0.27 for the 13 chromosomes was deter-mined. With an increase in the physical distance betweenthe pair of SNPs, r2 decreased (Fig. 2). LD decayed to halfof the maximum value of r2 at a distance of ~ 20 kb, to avalue of 0.2 at a distance of ~ 100 kb, and to 0.1 when thedistance between the pair of SNPs reached ~ 300 kb.GWAS was conducted using CMLM model with the

    optimal number of covariates included based on the re-sults from model fitness test (Additional file 3: Table S3).Fifteen SNPs were identified as significantly associatedmarkers with RN resistance based on a P-value thresholdgenerated from permutation test (Additional file 4: TableS4; Table 1). The 15 SNPs were distributed over 8chromosomes with one SNP on chromosomes 1, 2, 3, and5, two SNPs on chromosomes 6 and 9, three SNPs onchromosome 12, and four SNPs on chromosome 7 (Table1, Fig. 3a). One of the genomic regions (S1_1431996298)

    Li et al. BMC Genetics (2018) 19:52 Page 4 of 11

    http://cgp.genomics.org.cn/page/species/index.jsphttp://cgp.genomics.org.cn/page/species/index.jsp

  • on Chromosome 6 showed the highest peak and explained15% of the phenotypic variation, which was higher thanthe variation (~ 7.8%) estimated without this SNP. Thephenotypic variation explained by the other SNPs rangedfrom 4.2 to 5.8% (Table 1). Local LD analysis for all signifi-cantly associated SNPs showed that markers on Chromo-some 7 and 12 were in high intra-chromosomal LD, andthat the 15 SNPs could be assigned to 12 loci (Fig. 4). Thequantile-quantile plot (Fig. 3b) showed an upward devi-ation from the linear line around −log10 (P) = 3 indicatingtrue positives for the 15 SNPs.

    The allelic effect estimates generated by the linearmodel of GAPIT showed major alleles as the resistancealleles except for S1_352262064. Additionally, additiveeffects were observed for RN resistance in G. arboreumwith a negative correlation between the number of resist-ance alleles carried by each accession and their respectiveFI value (correlation value of − 0.43, P-value < 0.001),which resulted in higher RN resistance levels observed foraccessions carrying a greater number of resistance alleles(Fig. 5). The majority of these accessions had six or moreresistance alleles with only 11 accessions having less thansix alleles. Two accessions were observed for each genemodel having 3 or 5 resistance alleles.

    Candidate genes for reniform nematode resistanceTwelve candidate genes for RN resistance were identifiedby searching genes at or physically close to the detectedsignificantly associated loci (Table 1). These 12 genes fellinto various biological pathways, including redox reaction,signaling, translation, RNA processing and binding, andresponse to stress and wounding. By searching 72 kbregions flanking the significantly associated SNPs on theG. arboreum reference genome [44], 146 annotated geneswere detected (Additional file 5: Table S5). By comparingthe 146 genes to the differentially expressed genes (DEGs)[29], 18 out of the 146 genes were found to be significantlyDEGs (fold change value > 2 and FDR P-value < 0.01) inresponse to RN infection (Additional file 6: Table S6).

    DiscussionLD has an essential role in GWAS. The distance betweenloci in which LD persists determines the number anddensity of markers needed for a reasonable resolution ofassociation analysis [50]. Many factors affect the extent ofLD, including genetic diversity, population size, admixturelevel, selection and mating pattern, as well as marker sys-tem [50, 51]. The LD decay estimates for predominantlyself-pollinated species, such as cotton, are much highercompared to the extent of LD in out-crossing species suchas maize (1–100 kb) [50]. As an example, LD decayedwithin 250 kb for Arabidopsis thaliana [52]; 75 to 500 kbfor rice [53] and 100 to 600 kb for soybean [54]. Agenome-wide LD extent level of 25 cM at a significancethreshold of r2 = 0.1 was reported for G. hirsutum usingSSR markers [34]. In the present study, LD decayed within~ 300 kb at r2 = 0.1, which is similar to otherself-pollinated species, but lower than G. hirsutum basedon SSR markers. The variation between the cotton speciesinvestigated in the two studies and the marker systemsemployed would be factors contributing to the differentrates of LD decay observed. In comparison, Su et al. [55]reported a mean LD decay distance of 100 kb with a meanmarker density of 1 SNP per 24.85 kb for the 81,675 SNPsused in the study. Based on the extent of LD and marker

    Fig. 1 Frequency distribution of female index (FI) values for the246 Gossypium arboreum germplasm accessions. FI = (average numberof female reniform nematode on an accession per gram of root tissue/ average number of female reniform nematode on the susceptibleGossypium hirsutum control genotype PI 529251 per gram of roottissue) * 100

    Fig. 2 Genome-wide linkage disequilibrium (LD) pattern for thepaired loci on the same chromosome. The red horizontal lineindicates that LD (r2) decayed to 0.1 when the physical distancebetween a pair of loci reached 300 kb

    Li et al. BMC Genetics (2018) 19:52 Page 5 of 11

  • density (1 SNP per 200 kb) in the present study, some locimay be left untagged, especially genomic regions with lowlocal LD and few SNPs. The relatively low genome widemarker density in the present study can result fromsequencing errors along with the high rate of missing data

    that frequently occur in GBS assays [43]. Genetic imput-ation can boost the number of SNPs by predicting geneticvariants that were not observed for individual genotypesby using a reference panel or by estimating haplotypefrom observed genotype data. However, when a reference

    Table 1 Candidate gene data for the 15 SNPs significantly associated with reniform nematode resistance identified from genome-wide association study

    Marker Chr Position Candidate gene Description R2a R2

    S1_1431996298 6 21296008 Cotton_A_17782 DNA-directed primase polymerase protein isoform ×1 0.08 0.15

    S1_1432469614 6 21769324 Cotton_A_17731 cct motif family protein 0.08 0.14

    S1_1387000550 9 72903560 Cotton_A_02972 chaperonin cpn60- mitochondrial 0.08 0.13

    S1_634724595 12 93797765 Cotton_A_06029 pentatricopeptide repeat-containing protein at1g16830 0.08 0.13

    S1_634724647 12 93797817 Cotton_A_06029 pentatricopeptide repeat-containing protein at1g16830 0.08 0.13

    S1_1238998071 3 27838614 Cotton_A_03279 coproporphyrinogen iii oxidase 0.08 0.13

    S1_842793872 7 18184817 Cotton_A_14237 leucine-rich repeat protein kinase family 0.08 0.13

    S1_842793960 7 18184905 Cotton_A_14237 leucine-rich repeat protein kinase family 0.08 0.13

    S1_917690183 7 93081128 Cotton_A_29428 apo protein chloroplastic-like 0.08 0.12

    S1_501185184 1 107382510 Cotton_A_24022 restin homolog 0.08 0.12

    S1_1380184051 9 66087061 Cotton_A_12928 cytochrome p450 86a1 0.08 0.12

    S1_303538481 2 70449510 Cotton_A_00283 wound-induced protein 1-like 0.08 0.12

    S1_662779880 12 121853050 Cotton_A_18631 eukaryotic translation initiation factor 2 family protein isoform 2 0.08 0.12

    S1_352262064 5 18150534 Cotton_A_17988 pre-mrna-processing factor 6-like 0.08 0.12

    S1_842793792 7 18184737 Cotton_A_14237 leucine-rich repeat protein kinase family 0.08 0.12

    Chr chromosome, R2a R square of model without SNP, R2 R square of model with SNP

    Fig. 3 Marker-trait associations for reniform nematode resistance evaluated for the 246 Gossypium arboreum germplasm accessions. a Genomewide Manhattan plot of association from compressed mixed linear model; x-axis showed the SNPs along each chromosome; y-axis was the –log10 (P-value) for the association. S1_842793872 and S1_842793960 on chromosome 7, S1_634724647 and S1_634724595 on chromosome12 are physically close to each other with nearly the same P-values (Additional file 4), so they overlapped as one dot in the figure. The alternateorange and blue dots represent SNPs mapped to different chromosomes, with the blue dots on the most right of the Manhattan plot locatedon scaffolds, which could not be assembled onto the 13 chromosomes of the G. arboreum reference genome. Dots above the blue horizontalline are SNPs with P-value < 0.001, whereas the one dot above the red line is the SNP with P-value < 0.0001. b Quantile-quantile (Q-Q) plot forthe compressed mixed linear model

    Li et al. BMC Genetics (2018) 19:52 Page 6 of 11

  • Fig. 4 Linkage disequilibrium (LD) plot showing LD patterns among the 15 reniform nematode resistance-associated SNPs. The LD between theSNPs is shown in the intersection of the diagonals from each SNP. The numbers are based on D’ values and multiplied by 100. Values close to 0indicate absence of LD (white shaded boxes), while those close to 100 indicate complete LD (black shaded boxes). The white block with blackvertical lines on the top indicates the 15 SNPs on the reference genome (placed according to their physical positions on the reference genome).SNP markers on different chromosomes are listed, with their corresponding mapped chromosomes indicated as C1, C2, C3, C5, C6, C7, C9, andC12 in front of their ID. The two haplotype blocks (outlined in bold black line) indicate markers that are in high LD

    Fig. 5 Female index (FI) for Gossypium arboreum accessions carrying the specific number of reniform nematode resistance alleles. The gray boxrepresents the 25th through 75th percentile with the black bar representing the mean FI value for each of the gene models. The range in FIvalues is represented by the dashed line. Fewer than five accessions comprised each of the gene models with less than seven resistance alleles.Two accessions were represented in the gene models having three or five resistance alleles

    Li et al. BMC Genetics (2018) 19:52 Page 7 of 11

  • panel is not available such as the case with G. arboreum, ahigh false positive rate of imputation and minimal differ-ences between imputed and non-imputed data could intro-duce an ascertainment bias to GWAS [56]. In the presentstudy, non-imputed data were therefore used for GWAS.The genome scan showed several peaks of moderate sig-

    nificance rather than major dominant peaks. This relativeweak association may be the result of a narrow geneticbase for the G. arboreum accessions used in this study[38] and the phenotypic variation typically observed inassaying for nematode resistance [9, 57]. While populationstructure and genetic relatedness can contribute to theseresults, it also suggested that RN resistance in G.arboreum is a complex trait conferred by multiple genes,where each gene contributes a moderate level of resist-ance. Although major genes for RN resistance have beenreported [9, 11, 13, 22], extensive quantitative variation isfrequently observed in segregating populations [9, 10].This result also suggests that compared to maker-assistedselection, genomic selection might be a better strategy totake for breeding lines with higher RN resistance levels,which aims to utilize genome wide molecular markers andphenotypic data in large breeding populations to predictbreeding values with high accuracy before they are fieldtested [58]. However, the difficulty in transferring RN re-sistance between cotton species and the inability to assayresistance for large populations could limit this approach.The 15 SNPs that showed significant associations with

    RN resistance were distributed over 8 chromosomes. Theoccurrence of multiple genes and the greater frequency ofresistant accessions in the G. arboreum germplasm collec-tion would suggest this collection is a key source of RNresistance for cotton improvement. These significantlyassociated loci were not mapped to the same loci orhomeologous loci as reported for the five RN resistanceQTL introgressed into G. hirsutum [11, 13, 22] and thesenew sources of resistance would be of A-genome origin.Additionally, major alleles were identified for 14 loci thatassociated with the resistant phenotype indicating majoralleles may have better molecular fitness in regard to RNresistance. The additivity of the resistance loci detectedfrom this study, along with the fact that these loci repre-sent novel genetic resources for RN resistance, suggestthat higher RN resistance levels should be achieved bypyramiding resistance alleles identified from G. arboreumwith alleles from other genetic resources. Genetic inherit-ance studies would support this conclusion as plants withhigher levels of resistance than the parents have been ob-served in segregating populations [9, 10]. The occurrenceof transgressive segregation in mapping populations hasalso been reported for root-knot nematode resistance [27].Several biological pathways were suggested to be in-

    volved in RN resistance for G. arboreum, by assessing thegene annotation and gene ontology term for the 146 genes

    co-localized within the genomic regions of the 15 signifi-cantly associated SNPs. The closest gene model to thethree SNPs on chromosome 7, which were in highintra-chromosomal LD, belongs to the leucine-rich repeat(LRR) protein kinase family. One class of genes in thisfamily encoding LRR domain containing receptor likekinases (LRR-RLKs), constitute a main type of R genesinvolved in plant immunity responses [59]. Upon sensinginvading pathogens, R genes can activate a series of down-stream defense-related responses leading to plant resist-ance. In cotton, some R genes conferring resistance havebeen reported for plant-nematode interactions [60]. Rgenes were also suggested to mediate different levels ofRN resistance through active regulation of expressionlevels [29]. The result presented here would furthersupport the suggestion that R genes might be criticalregulators in cotton resistance to RN, although furtherfunctional molecular studies are required to confirm thishypothesis. The SNP S1_1380184051 on chromosome 9co-localized with a gene encoding cytochrome P450,which showed differential expression in response to RNinfestation [29]. Cytochrome P450 enzymes function inthe biosynthesis of many secondary metabolites such asgibberellins, abscisic acid, and other plant defense com-pounds [61]. Additionally, one gene encoding heat shockprotein (HSP) 20 that co-localized with the two loci onchromosome 9 also showed differentially expression [29],suggesting its potential role in RN resistance. Otherpossible RN resistance candidate genes identified in thisstudy included WRKY transcription factor coding genesco-localized on chromosomes 6 and 12, an ethyleneresponsive transcription factor (ERF) coding gene onchromosome 6, and GRAS transcription factor on chromo-some 3. WRKYs regulate the expression of defense relatedgenes in plant immunity responses [62] and in G. hirsutumtheir accumulation levels were dynamically regulated inresponse to RN infection [29]. Pathogenesis-related ERFshave been shown to be involved in soybean cyst nematoderesistance responses [63, 64].The gene expression analysis data used in this study were

    generated from G. hirsutum, which used the published gen-ome of G. arboreum [44] as the reference A sub-genomefor transcriptome assembly and read mapping [29]. By inte-grating this gene expression data, more insight on the pos-sible genes and pathways associated with RN resistance inthe A-genome background could be achieved, although thegene expression pattern in G. arboreum could be differentcompared to G. hirsutum as a result of genetic divergenceduring genome evolution. In the future, it is desired toinclude gene expression data from G. arboreum accessionswith differential levels of RN resistance, so that genes exhi-biting differential expression associated with RN resistancecould be inferred as possible cis-regulatory elements affect-ing RN resistance. The inclusion of gene expression data

    Li et al. BMC Genetics (2018) 19:52 Page 8 of 11

  • could also help correct for false positives, because thequality of the reference G. arboreum genome is known tobe relatively low, which could pose potential problems onread mapping. That is, the genomic physical positions ofthe identified SNPs could be inaccurate. Nonetheless, SNPsand potential candidate genes identified from this study willbe a useful resource for further characterization of RNresistance in G. arboreum to aid in the transfer of resistanceto G. hirsutum.

    ConclusionsHigh-throughput GBS data integrated with phenotypicdata for 246 G. arboreum accessions with diverse RNresistance levels resulted in the identification of 15 SNPsrepresenting 12 loci distributed over eight chromosomesthat significantly associated with RN resistance. Majoralleles were associated with resistance for 14 SNPs. Thiswould indicate natural selection for RN resistance and/orselection during the breeding of G. arboreum germplasm,such as the selection of varieties more tolerant to environ-mental extremes. Also, additive effects were observed forRN resistance with accessions typically having more thanfive alleles; thus, higher resistance levels should beachievable by pyramiding resistance alleles from multiplesources. This would suggest that the accumulation ofresistance alleles may be advantageous. Because breedingfor RN resistance is hindered by the difficulty of introgres-sing resistance from G. arboreum and the lack of rapidprocedures to assess resistance in large populations, theSNPs identified in this study or the identification of SSRmarkers in the associated regions would benefitmarker-assisted selection approaches for the developmentof RN resistant varieties and the pyramiding of multipleresistance alleles. Populations have been developed for theRN resistant accessions to further characterize the genet-ics of resistance and for QTL mapping. As such, it wouldbe obvious to someone working in the field of plant breed-ing to use the reported SNPs or to develop additionalmolecular markers in the associated regions identified inthis study in order to breed elite upland cotton varietieswith RN resistance.Compared to the known loci involved in cotton RN re-

    sistance, the 12 loci identified in this study represent novelgenetic resources. A survey of genes that co-localized withthe RN resistance-associated loci identified 146 putative RNresistance candidate genes. These genes were involved in awide range of biological pathways that may play a signifi-cant role in resistance. A cluster of LRR domain containingkinase genes, representing one type of R genes, weredetected in the vicinity of three RN resistance-associatedSNPs on chromosome 7. This result, together with the find-ing that RN resistance loci co-localized R genes had a sig-nificant higher accumulation level in resistant G. hirsutumgenotypes, suggested the critical regulatory role of R genes

    in cotton resistance to RN. However, additional experi-ments need to be conducted to test the trait-loci associa-tions identified for RN resistance in this study. RNA-seqexperiments should be conducted for G. arboreum acces-sions to further explore the expression pattern of genesnear the potential resistance loci and to identify candidategenes responsible for RN resistance.

    Additional files

    Additional file 1: The 246 Gossypium arboreum germplasm accessionsselected for genome-wide association study including the reniformnematode female index value for each accession. (XLSX 23 kb)

    Additional file 2: The 7220 SNPs used for linkage disequilibrium analysisand genome-wide association study including major and minor alleles.SNPs highlighted in yellow identify chromosomal regions associated withreniform nematode resistance. (XLSX 6162 kb)

    Additional file 3: Bayesian information criterion (BIC) values ofcompressed mixed linear model with different numbers of principlecomponents used for association analysis in GAPIT. (XLSX 8 kb)

    Additional file 4: Genome-wide association results for the 7220 SNPsincluded in the study. (XLSX 564 kb)

    Additional file 5 The 146 annotated genes within 72 kb of the 15significantly associated SNPs. (XLSX 19 kb)

    Additional file 6: The 18 genes out of the 146 genes that weresignificantly differentially expressed in Gossypium hirsutum in responseto reniform nematode infestation. (XLSX 13 kb)

    AbbreviationsBIC: Bayesian information criteria; CMLM: Compressed mixed linear model;DEG: Differentially expressed gene; EMMA: Efficient mixed-model association;ERF: Ethylene responsive factor; FI: Female index; GAPIT: Genome Associationand Prediction Integrated Tool; GBS: Genotyping-by-sequencing;GWAS: Genome-wide association study; HSP: Heat shock protein; LD: Linkagedisequilibrium; LRR: Leucine-rich repeat; MMT: Million metric tons;PC: Principle component; QTL: Quantitative trait loci; RN: Reniform nematode;SNP: Single nucleotide polymorphism; SSR: Simple sequence repeat

    AcknowledgementsTechnical assistance provided by Kristi Jordan in evaluating reniformnematode infection response for germplasm accessions is appreciated.

    FundingThis research was funded by the USDA-ARS project 6066–22000-074-00D. Thisresearch was supported in part by an appointment to the Agricultural ResearchService (ARS) Research Participation Program administered by the Oak RidgeInstitute for Science and Education (ORISE) through an interagency agreementbetween the US Department of Energy (DOE) and the US Department ofAgriculture (USDA). ORISE is managed by ORAU under DOE contract numberDE-AC05-06OR23100. All opinions expressed in this paper are the author’s anddo not necessarily reflect the policies and views of USDA, ARS, DOE, or ORAU/ORISE. Mention of trade names or commercial products in this article is solelyfor the purpose of providing specific information and does not implyrecommendation or endorsement by the US Department of Agriculture.USDA is an equal opportunity provider and employer.

    Availability of data and materialsThe datasets supporting the conclusions of this article are includedwithin the article and its additional files. The Gossypium arboreumgermplasm accessions evaluated in this study are freely availablefor research and breeding projects. The accessions can be requested from theUSDA National Plant Germplasm System (https://npgsweb.ars-grin.gov) apartfrom any importation restrictions.

    Li et al. BMC Genetics (2018) 19:52 Page 9 of 11

    https://doi.org/10.1186/s12863-018-0662-3https://doi.org/10.1186/s12863-018-0662-3https://doi.org/10.1186/s12863-018-0662-3https://doi.org/10.1186/s12863-018-0662-3https://doi.org/10.1186/s12863-018-0662-3https://doi.org/10.1186/s12863-018-0662-3https://npgsweb.ars-grin.gov

  • Authors’ contributionsJEE conceived and designed the experiment. JEE conducted samplecollection and DNA preparation. RL carried out data analysis. RL and JEEwrote the manuscript. SRS conducted the screening of accessions for RNresistance and contributed to the writing of the manuscript. The authorshave read and approved the manuscript.

    Ethics approval and consent to participateNot applicable. The research in this study did not involve humans, animals,wild plants, endangered plants, or engineered plants. The plant materialsused in this study were provided by the USDA National Plant GermplasmSystem for research purposes requiring no special permissions incompliance with institutional, national, and international guidelines.The Gossypium arboreum germplasm accessions used in this study weredonated to the USDA National Plant Germplasm System by researchersor other Genebanks and are freely available for research and breedingpurposes following national and international guidelines. Research on theplant materials was conducted under local legislation and permissionsfollowing standard agronomic practices.

    Consent for publicationNot applicable.

    Competing interestsThe authors declare that they have no competing interests.

    Publisher’s NoteSpringer Nature remains neutral with regard to jurisdictional claims inpublished maps and institutional affiliations.

    Author details1Present address: Department of Plant Biology, University of California, Davis,One Shields Avenue, Davis, CA 95616, USA. 2USDA-ARS, Crop GeneticsResearch Unit, 141 Experiment Station Road, PO Box 345, Stoneville, MS38776, USA.

    Received: 1 September 2017 Accepted: 26 July 2018

    References1. Robinson AF. Reniform in U.S. cotton: when, where, why, and some

    remedies. Annu Rev Phytopathol. 2007;45:263–88.2. Koenning SR, Wrather JA, Kirkpatrick TL, Walker NR, Starr JL, Mueller JD.

    Plant-parasitic nematodes attacking cotton in the United States: old andemerging production challenges. Plant Dis. 2004;88:100–13.

    3. Lawrence K, Hagan A, Olsen M, Faske T, Hutmacher R, Mueller J, et al.Cotton disease loss estimate committee report, 2015. In: Proceedings of the2016 Beltwide Cotton Conferences. New Orleans; 2016. p. 113–5. http://www.cotton.org/beltwide/index.cfm?page=proceedings.

    4. Zhang T, Hu Y, Jiang W, Fang L, Guan X, Chen J, et al. Sequencing ofallotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resourcefor fiber improvement. Nat Biotechnol. 2015;33:531–7.

    5. Li F, Fan G, Lu C, Xiao G, Zou C, Kohel RJ, et al. Genome sequence ofcultivated upland cotton (Gossypium hirsutum TM-1) provides insights intogenome evolution. Nat Biotechnol. 2015;33:524–30.

    6. Robinson AF, Percival AE. Resistance to Meloidogyne incognita race 3 andRotylenchulus reniformis in wild accessions of Gossypium hirsutum and G.barbadense from Mexico. J Nematol. 1997;29(4S):746–55.

    7. Robinson AF, Bridges AC, Percival AE. New sources of resistance to thereniform (Rotylenchulus reniformis Linford and Oliveira) and root-knot(Meloidogyne incognita (Kofoid & White) Chitwood) nematode in upland(Gossypium hirsutum L.) and sea island (G. barbadense L.) cotton. J CottonSci. 2004;8:191–7.

    8. Yik C-P, Birchfield W. Resistant germplasm in Gossypium species and relatedplants to Rotylenchulus reniformis. J Nematol. 1984;16:146–53.

    9. Erpelding JE, Stetina SR. Genetics of reniform nematode resistance in Gossypiumarboreum germplasm line PI 529728. World J Agric Res. 2013;1:48–53.

    10. Sacks EJ, Robinson AF. Introgression of resistance to reniform nematode(Rotylenchulus reniformis) into upland cotton (Gossypium hirsutum) fromGossypium arboreum and a G. hirsutum/Gossypium aridum bridging line.Field Crops Res. 2009;112:1–6.

    11. Romano GB, Sacks EJ, Stetina SR, Robinson AF, Fang DD, Gutierrez OA, et al.Identification and genomic location of a reniform nematode (Rotylenchulusreniformis) resistance locus (Renari) introgressed from Gossypium aridum intoupland cotton (G. hirsutum). Theor Appl Genet. 2009;120:139–50.

    12. Robinson AF, Bell AA, Dighe ND, Menz MA, Nichols RL, Stelly DM.Introgression of resistance to nematode Rotylenchulus reniformis intoupland cotton (Gossypium hirsutum) from Gossypium longicalyx. Crop Sci.2007;47:1865–77.

    13. Gutiérrez OA, Robinson AF, Jenkins JN, McCarty JC, Wubben MJ, CallahanFE, et al. Identification of QTL regions and SSR markers associated withresistance to reniform nematode in Gossypium barbadense L. accessionGB713. Theor Appl Genet. 2011;122:271–80.

    14. Bell AA, Robinson AF, Quintana J, Nilesh ND, Menz MA, Stelly DM, et al.Registration of LONREN-1 and LONREN-2 germplasm lines of upland cottonresistant to reniform nematode. J Plant Reg. 2014;8:187–90.

    15. Sikkens RB, Weaver DB, Lawrence KS, Moore SR, van Santen E. LONRENupland cotton germplasm response to Rotylenchulus reniformis inoculumlevel. Nematropica. 2011;41:68–74.

    16. Schrimsher DW, Lawrence KS, Sikkens RB, Weaver DB. Nematicides enhancegrowth and yield of Rotylenchulus reniformis resistant cotton genotypes. JNematol. 2014;46:365–75.

    17. Bell AA, Robinson AF, Quintana J, Duke SE, Starr JL, Stelly DM, et al.Registration of BARBREN-713 germplasm line of upland cotton resistant toreniform and root-knot nematodes. J Plant Reg. 2015;9:89–93.

    18. Starr JL, Smith CW, Ripple K, Zhou E, Nichols RL, Faske TR. Registration ofTAM RKRNR-9 and TAM RKRNR-12 germplasm lines of upland cottonresistant to reniform and root-knot nematodes. J Plant Reg. 2011;5:393–6.

    19. McCarty JC Jr, Jenkins JN, Wubben MJ, Gutiérrez OA, Hayes RW, Callahan FE,Deng D. Registration of three germplasm lines of cotton derived fromGossypium barbadense L. accession GB713 with resistance to the reniformnematode. J Plant Reg. 2013;7:220–3.

    20. McCarty JC Jr, Jenkins JN, Wubben MJ, Hayes RW, Callahan FE, Deng D.Registration of six germplasm lines of cotton with resistance to the root-knot and reniform nematodes. J Plant Reg. 2017; https://doi.org/10.3198/jpr2016.09.0044crg.

    21. Stetina SR, Erpelding JE. Gossypium arboreum accessions resistant toRotylenchulus reniformis. J Nematol. 2016;48:223–30.

    22. Dighe ND, Robinson AF, Bell AA, Menz MA, Cantrell RG, Stelly DM. Linkagemapping of resistance to reniform nematode in cotton following introgressionfrom Gossypium longicalyx (hutch. & lee). Crop Sci. 2009;49:1151–64.

    23. Wubben MJ, McCarty JC Jr, Jenkins JN, Callahan FE, Deng DD.Individual and combined contributions of the Renbarb1, Renbarb2, andRenbarb3 quantitative trait loci to reniform nematode (Rotylenchulusreniformis Linford & Oliveira) resistance in upland cotton (Gossypiumhirsutum L.). Euphytica. 2017;213:47.

    24. Fang DD, Stetina SR. Improving cotton (Gossypium hirsutum L.) plantresistance to reniform nematodes by pyramiding Ren1 and Ren2. PlantBreed. 2011;130:673–8.

    25. Bolek Y, El-Zik KM, Pepper AE, Bell AA, Magill CW, Thaxton PM, ReddyOUK. Mapping of Verticillium wilt resistance genes in cotton. Plant Sci.2005;168:1581–90.

    26. Ulloa M, Hutmacher RB, Roberts PA, Wright SD, Nichols RL, Davis RM.Inheritance and QTL mapping of Fusarium wilt race 4 resistance in cotton.Theor Appl Genet. 2013;126:1405–18.

    27. Wang C, Ulloa M, Roberts PA. A transgressive segregation factor (RKN2) inGossypium barbadense for nematode resistance clusters with gene rkn1 in G.hirsutum. Mol Gen Genomics. 2008;279:41–52.

    28. Wang C, Ulloa M, Shi X, Yuan X, Saski C, Yu JZ, Roberts PA. Sequencecomposition of BAC clones and SSR markers mapped to upland cottonchromosomes 11 and 21 targeting resistance to soil-borne pathogens.Front Plant Sci. 2015;6:791.

    29. Li R, Rashotte AM, Singh NK, Lawrence KS, Weaver DB, Locy RD.Transcriptome analysis of cotton (Gossypium hirsutum L.) genotypes that aresusceptible, resistant, and hypersensitive to reniform nematode(Rotylenchulus reniformis). PLoS One. 2015;10:e0143261.

    30. Said JI, Lin Z, Zhang X, Song M, Zhang J. A comprehensive meta QTLanalysis for fiber quality, yield, yield related and morphological traits,drought tolerance, and disease resistance in tetraploid cotton. BMCGen. 2013;14:776.

    31. Zhang J, Yu J, Pei W, Li X, Said J, Song M, Sanogo S. Genetic analysis ofVerticillium wilt resistance in a backcross inbred line population and a

    Li et al. BMC Genetics (2018) 19:52 Page 10 of 11

    http://www.cotton.org/beltwide/index.cfm?page=proceedingshttp://www.cotton.org/beltwide/index.cfm?page=proceedingshttps://doi.org/10.3198/jpr2016.09.0044crghttps://doi.org/10.3198/jpr2016.09.0044crg

  • meta-analysis of quantitative trait loci for disease resistance in cotton. BMCGen. 2015;16:577.

    32. Mackay I, Powell W. Methods for linkage disequilibrium mapping in crops.Trends Plant Sci. 2007;12:57–63.

    33. Lipka AE, Kandianis CB, Hudson ME, Yu J, Drnevich J, Bradbury PJ, et al.From association to prediction: statistical methods for the dissection andselection of complex traits in plants. Curr Opin Plant Biol. 2015;24:110–8.

    34. Abdurakhmonov IY, Saha S, Jenkins JN, Buriev ZT, Shermatov SE, SchefflerBE, et al. Linkage disequilibrium based association mapping of fiber qualitytraits in G hirsutum L variety germplasm. Genetica. 2009;136:401–17.

    35. Zhao Y, Wang H, Chen W, Li Y. Genetic structure, linkage disequilibrium andassociation mapping of Verticillium wilt resistance in elite cotton (Gossypiumhirsutum L.) germplasm population. PLoS One. 2014;9:e86308.

    36. Poland JA, Rife TW. Genotyping-by-sequencing for plant breeding andgenetics. Plant Gen. 2012;5:92–102.

    37. Hulse-Kemp AM, Lemm J, Plieske J, Ashrafi H, Buyyarapu R, Fang DD, et al.Development of a 63K SNP array for cotton and high-density mapping ofintraspecific and interspecific populations of Gossypium spp. G3 GenesGenomes Genet. 2015;5:1187–209.

    38. Li R, Erpelding JE. Genetic diversity analysis of Gossypium arboreum germplasmaccessions using genotyping-by-sequencing. Genetica. 2016;144:535–45.

    39. Arias RS, Stetina SR, Tonos JL, Scheffler JA, Scheffler BE. Microsatellitesreveal genetic diversity in Rotylenchulus reniformis populations. JNematol. 2009;41:146–56.

    40. Thies JA, Merrill SB, Corley EL. Red food coloring stain: new, saferprocedures for staining nematodes in roots and egg masses on rootsurfaces. J Nematol. 2002;34:179–81.

    41. Schmitt DP, Shannon G. Differentiating soybean responses to Heteroderaglycines races. Crop Sci. 1992;32:275–7.

    42. Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, MitchellSE. A robust, simple genotyping-by-sequencing (GBS) approach for highdiversity species. PLoS One. 2011;6:e19379.

    43. Glaubitz JC, Casstevens TM, Lu F, Harriman J, Elshire RJ, Sun Q, et al.TASSEL-GBS: a high capacity genotyping by sequencing analysispipeline. PLoS One. 2014;9:e90346.

    44. Li F, Fan G, Wang K, Sun F, Yuan Y, Song G, et al. Genome sequence of thecultivated cotton Gossypium arboreum. Nat Genet. 2014;46:567–72.

    45. Li H, Durbin R. Fast and accurate short read alignment with burrows–wheeler transform. Bioinformatics. 2009;25:1754–60.

    46. Zhang Z, Ersoz E, Lai C-Q, Todhunter RJ, Tiwari HK, Gore MA, et al.Mixed linear model approach adapted for genome-wide associationstudies. Nat Genet. 2010;42:355–60.

    47. Lipka AE, Tian F, Wang Q, Peiffer J, Li M, Bradbury PJ, et al. GAPIT: genomeassociation and prediction integrated tool. Bioinformatics. 2012;28:2397–9.

    48. Turner SD. qqman: an R package for visualizing GWAS results using Q-Qand Manhattan plots. bioRxiv. 2014. https://doi.org/10.1101/005165.

    49. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization ofLD and haplotype maps. Bioinformatics. 2005;21:263–5.

    50. Flint-Garcia SA, Thornsberry JM, Buckler ES. Structure of linkagedisequilibrium in plants. Annu Rev Plant Biol. 2003;54:357–74.

    51. Stich B, Melchinger AE, Piepho H-P, Heckenberger M, Maurer HP, Reif JC. Anew test for family-based association mapping with inbred lines from plantbreeding programs. Theor Appl Genet. 2006;113:1121–30.

    52. Nordborg M, Borevitz JO, Bergelson J, Berry CC, Chory J, Hagenblad J,et al. The extent of linkage disequilibrium in Arabidopsis thaliana. NatGenet. 2002;30:190–3.

    53. Mather KA, Caicedo AL, Polato NR, Olsen KM, McCouch S, PuruggananMD. The extent of linkage disequilibrium in rice (Oryza sativa L.).Genetics. 2007;177:2223–32.

    54. Hyten DL, Choi I-Y, Song Q, Shoemaker RC, Nelson RL, Costa JM, et al.Highly variable patterns of linkage disequilibrium in multiple soybeanpopulations. Genetics. 2007;175:1937–44.

    55. Su J, Fan S, Li L, Wei H, Wang C, Wang H, et al. Detection of favorable QTLalleles and candidate genes for lint percentage by GWAS in Chinese uplandcotton. Front Plant Sci. 2016;7:1576.

    56. Marchini J, Howie B. Genotype imputation for genome-wide associationstudies. Nat Rev Genet. 2010;11:499–511.

    57. Weaver DB, Lawrence KS, van Santen E. Reniform nematode resistance inupland cotton germplasm. Crop Sci. 2007;47:19–24.

    58. Desta ZA, Ortiz R. Genomic selection: genome-wide prediction in plantimprovement. Trends Plant Sci. 2014;19:592–601.

    59. Jones JDG, Dangl JL. The plant immune system. Nature. 2006;444:323–9.60. Li R, Rashotte AM, Singh NK, Weaver DB, Lawrence KS, Locy RD. Integrated

    signaling networks in plant responses to sedentary endoparasiticnematodes: a perspective. Plant Cell Rep. 2015;34:5–22.

    61. Zhao Y-J, Cheng Q-Q, Su P, Chen X, Wang X-J, Gao W, et al. Researchprogress relating to the role of cytochrome P450 in the biosynthesis ofterpenoids in medicinal plants. Appl Microbiol Biotechnol. 2014;98:2371–83.

    62. Rushton PJ, Somssich IE, Ringler P, Shen QJ. WRKY transcription factors.Trends Plant Sci. 2010;15:247–58.

    63. Mazarei M, Puthoff DP, Hart JK, Rodermel SR, Baum TJ. Identification andcharacterization of a soybean ethylene-responsive element-binding proteingene whose mRNA expression changes during soybean cyst nematodeinfection. Mol Plant-Microbe Interact. 2002;15:577–86.

    64. Mazarei M, Liu W, Al-Ahmad H, Arelli PR, Pantalone VR, Stewart CN. Geneexpression profiling of resistant and susceptible soybean lines infected withsoybean cyst nematode. Theor Appl Genet. 2011;123:1193–206.

    Li et al. BMC Genetics (2018) 19:52 Page 11 of 11

    https://doi.org/10.1101/005165

    AbstractBackgroundResultsConclusions

    BackgroundMethodsEvaluation of reniform nematode infection response of G. arboreum accessionsGenotyping of G. arboreum accessionsAssociation mapping and linkage disequilibriumAllelic effectsCandidate genes of reniform nematode resistance

    ResultsPhenotypic variation in reniform nematode resistanceLinkage disequilibrium, genome-wide association study, and allelic effectsCandidate genes for reniform nematode resistance

    DiscussionConclusionsAdditional filesAbbreviationsAcknowledgementsFundingAvailability of data and materialsAuthors’ contributionsEthics approval and consent to participateConsent for publicationCompeting interestsPublisher’s NoteAuthor detailsReferences