

    Arch. Biol. Sci., Belgrade, 67(2), 467-481, 2015 DOI:10.2298/ABS140902011Y

    IDENTIFICATION OF EXPRESSED RESISTANCE GENE ANALOGS (RGA) AND DEVELOPMENT OF RGA-SSR MARKERS IN TOBACCO

    Qinghua Yuan1, Ruihong Xie1, Zhenchen Zhang1, Zhuwen Ma1, Jiqin Li1, Shuling Li1, Junbiao Chen1 and Yonghua Lü2,*

    1 Guangdong Academy of Agricultural Sciences, Crops Research Institute, Guangzhou, China

    2 Guangdong Province Tobacco Monopoly Bureau, Guangzhou, China

    *Corresponding author: [email protected]

    Abstract: Tobacco is an important cash crop and an ideal experimental system for studies of plant–pathogen interactions. Identifying tobacco resistance (R) genes and resistance gene analogs (RGAs) helps to elucidate the underlying resistance mechanisms. In recent years, the public tobacco EST (expressed sequence tag) data set, which provides a rich source for identifying expressed RGAs, has grown substantially. In this study, 149606 Uni-ESTs were assembled from 412325 tobacco ESTs available in GenBank and scanned with 112 published plant R-gene protein sequences, and 1113 Nicotiana (tobacco) RGAs (NtRGAs) were identified. The majority of them contained common R-gene domains, such as NBS-LRR, LRR-PK, LRR, PK and Mlo, while no published R-gene domain could be identified in 109 RGAs. Upon sequence alignment, 1079 NtRGAs were assigned to 712 loci within the Nicotiana benthamiana genome. A total of 78 simple sequence repeats (SSRs) were identified from 72 NtRGAs, and out of 64 newly designed primer pairs, 54 generated clear bands upon PCR amplification of tobacco genomic DNA. Only nine primer pairs displayed polymorphism in 24 varieties of tobacco, with 2-4 alleles per locus (2.56 alleles on average), while 41 primer pairs detected polymorphisms in six wild species of the genus Nicotiana, with 2-4 alleles per locus (2.61 alleles on average).

    Key words: Tobacco; EST; RGAs; SSR; identification

    Received September 2, 2014; Revised December 4, 2014; Accepted January 12, 2015

    INTRODUCTION

    Tobacco (Nicotiana tabacum) is an important cash crop worldwide and an ideal experimental system for studies of plant–pathogen interaction. In tobacco production, severe losses in yield and quality are caused by various diseases and pests, including bacterial wilt, mosaic virus and black shank. According to statistical data released by the China tobacco disease and pest forecast, prediction and integrated prevention website, in 2010 and 2011 the total area affected by diseases and pests in the 16 main tobacco-producing provinces amounted to 800 000 ha, causing a yield loss of 60 million kg and a value loss of 700 million RMB. Therefore, effective disease and pest control is of great significance for tobacco production, and the identification and cloning of tobacco disease resistance genes (R-genes) and resistance gene analogs (RGAs) will play a fundamental role in elucidating the underlying disease resistance mechanisms and in formulating appropriate disease and pest control measures.

    Plant disease resistance genes play a crucial role in recognizing the proteins encoded by the avirulence genes of pathogens. In recent years, more than 100 plant disease resistance genes have been cloned by either map-based cloning or transposon tagging (Sanseverino et al., 2010; Johal et al., 1992; Whitham et al., 1996; Dixon et al., 1996) (http://prgdb.crg.eu/wiki/Species_with_R-genes). Although plant disease resistance genes confer defense against a broad range of pathogens, they share only a few highly conserved domains, such as the nucleotide binding site (NBS), leucine-rich repeats (LRR), serine-threonine kinase (STK), leucine zippers (LZ), transmembrane domain (TM), Toll/Interleukin-1 Receptor (TIR) and so on (Bent et al., 1996; Meyers et al., 1999; Hulbert et al., 2001; Dangl et al., 2001). These conserved domains provide a convenient and reliable basis for the rapid identification and cloning of R-genes and RGAs.

    Table 1. Varieties and wild species of tobacco used in this study.

    No. | Material | Type | Classification
    1 | Aboyan | Burley | Common tobacco
    2 | ATNARELLO | Burley | Common tobacco
    3 | Big White Burley 599 | Burley | Common tobacco
    4 | KY26 | Burley | Common tobacco
    5 | Margland 609 | Burley | Common tobacco
    6 | White Burley 9 | Burley | Common tobacco
    7 | Fengzi 1 | Flue-cured tobacco | Common tobacco
    8 | Gexin 6 | Flue-cured tobacco | Common tobacco
    9 | Gezaji | Flue-cured tobacco | Common tobacco
    10 | Guanghuang 5 | Flue-cured tobacco | Common tobacco
    11 | Honghuadajinyuan | Flue-cured tobacco | Common tobacco
    12 | Jinxing | Flue-cured tobacco | Common tobacco
    13 | Baiguniuli | Sun-cured tobacco | Common tobacco
    14 | Daqiugen 2 | Sun-cured tobacco | Common tobacco
    15 | Dashangou | Sun-cured tobacco | Common tobacco
    16 | Datong | Sun-cured tobacco | Common tobacco
    17 | Dayemihe | Sun-cured tobacco | Common tobacco
    18 | Dazhongbaimao | Sun-cured tobacco | Common tobacco
    19 | 2040 | Cigar | Common tobacco
    20 | 74-16 | Cigar | Common tobacco
    21 | Black sea Samsun | Cigar | Common tobacco
    22 | Cuban Havana | Cigar | Common tobacco
    23 | Dark Virginia | Cigar | Common tobacco
    24 | Xiawangna | Cigar | Common tobacco
    25 | Nicotiana clevelandii A. Gray | Nicotiana sect. Polydicliae | Wild Nicotiana species
    26 | Nicotiana debneyi Domin | Nicotiana sect. Suaveolentes | Wild Nicotiana species
    27 | Nicotiana repanda Willd | Nicotiana sect. Repandae | Wild Nicotiana species
    28 | Nicotiana rustica L. | Nicotiana sect. Rusticae | Wild Nicotiana species
    29 | Nicotiana sylvestris Speg. & Comes | Nicotiana sect. Sylvestres | Wild Nicotiana species
    30 | Nicotiana tomentosiformis Goodsp. | Nicotiana sect. Tomentosae | Wild Nicotiana species

    Fig. 1. Alignment of CL1421Contig1 and CL1421Contig2 to the N. benthamiana genome. A – CL1421Contig1 matches 1078-1448 bp and 1547-2749 bp of the genomic scaffold (sequence ID: Niben.v0.3.Scf24878592). B – CL1421Contig2 matches 1276-1449 bp and 1549-2786 bp of the genomic scaffold (sequence ID: Niben.v0.3.Scf24878592).

    Plant disease resistance genes can be divided into five major classes according to the conserved domains of their amino acid sequences. The first class, containing NBS-LRR domains, may be further divided into two sub-classes based on the presence or absence of an N-terminal TIR domain (i.e. TIR-NBS-LRR and non-TIR-NBS-LRR R-genes). For instance, the tobacco mosaic virus resistance gene, N, contains a TIR-NBS-LRR domain (Meyers et al., 1999; Meyers et al., 2003), while the Rps2 gene of Arabidopsis thaliana, which confers resistance to Pseudomonas syringae, contains a coiled-coil (CC)-NBS-LRR domain (Bent et al., 1994). The second class contains LRR-PK domains, such as the Fls2 gene of Arabidopsis thaliana and the Xa21 gene of rice (Dunning et al., 2007; Song et al., 1995). The third class is characterized by an extracellular LRR domain, such as the RPP27 gene of Arabidopsis thaliana (Tor et al., 2004). The fourth class contains only the PK domain, such as the Pto gene of tomato and the At1 gene of melon (Martin et al., 1993; Taler et al., 2004). The fifth class comprises all remaining R-genes, which are characterized by different mechanisms of resistance to pathogens, such as the Hm1 gene of maize and the Mlo gene of barley (Johal et al., 1992; Buschges et al., 1997).
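    The five-class scheme above can be expressed as a simple lookup on domain composition. The following Python sketch is illustrative only: the function name `classify_r_gene` and the precedence order of the checks are assumptions made here, not part of the study, and the sketch does not model whether an LRR domain is extracellular.

```python
def classify_r_gene(domains):
    """Assign one of the five classes from an iterable of domain names.

    Assumption: NBS-LRR takes precedence over LRR-PK when both apply.
    """
    d = {name.strip().upper() for name in domains}
    if "NBS" in d and "LRR" in d:
        # Class 1 splits on the presence of an N-terminal TIR domain.
        return "class 1 (TIR-NBS-LRR)" if "TIR" in d else "class 1 (non-TIR-NBS-LRR)"
    if "LRR" in d and "PK" in d:
        return "class 2 (LRR-PK)"
    if "LRR" in d:
        return "class 3 (LRR)"
    if "PK" in d:
        return "class 4 (PK)"
    return "class 5 (other)"


# Examples taken from the genes named in the text.
print(classify_r_gene(["TIR", "NBS", "LRR"]))  # N gene of tobacco
print(classify_r_gene(["LRR", "PK"]))          # Xa21 of rice
print(classify_r_gene(["PK"]))                 # Pto of tomato
print(classify_r_gene(["Mlo"]))                # Mlo of barley
```

    With this ordering, a gene annotated with both NBS-LRR and PK domains falls into class 1; a different precedence would be equally defensible.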

    In the past, RGAs were isolated by PCR amplification of the conserved domains of R genes. With this approach, a number of RGAs were successfully cloned from Arabidopsis thaliana (Botella et al., 1997; Aarts et al., 1998), soybean (Graham et al., 2000), rice (Mago et al., 1999), corn (Collins et al., 1998), wheat (Seah 1998), tobacco (Leng et al., 2010; Gao et al., 2010) and other plants (Wan et al., 2010; Huettel et al., 2002; Nair et al., 2007).

    Table 2. Information on the name, protein ID and structure of 112 known R genes from plants.

    Plant | R Gene | Protein ID | Structure
    Aegilops tauschii | Cre1 | AAM94164 | NBS, LRR, others
    Arabidopsis thaliana | EFR | NP_197548 | LRR, PK
    Arabidopsis thaliana | ER - Erecta | NP_180201 | LRR, PK
    Arabidopsis thaliana | FLS2 | NP_199445 | LRR, PK
    Arabidopsis thaliana | HRT | AAF36987 | NBS, LRR, PK, others
    Arabidopsis thaliana | PEPR1 | NP_177451 | LRR, PK
    Arabidopsis thaliana | RAC1 | AAS01763 | TIR, NBS, LRR, others
    Arabidopsis thaliana | RCY1 | BAC67706 | NBS, LRR, others
    Arabidopsis thaliana | RFO1 | NP_178085 | PK
    Arabidopsis thaliana | RLM3 | NP_001031652 | NBS, TIR, others
    Arabidopsis thaliana | RPM1 | NP_187360 | NBS, LRR, others
    Arabidopsis thaliana | RPP1 | NP_190034 | NBS, LRR, others
    Arabidopsis thaliana | RPP13 | NP_190237 | NBS, LRR, others
    Arabidopsis thaliana | RPP27 | CAE51863 | LRR
    Arabidopsis thaliana | RPP4 | NP_193420 | TIR, NBS, LRR, others
    Arabidopsis thaliana | RPP5 | NP_193428 | TIR, NBS, LRR, others
    Arabidopsis thaliana | RPP8 | NP_199160 | NBS, LRR, others
    Arabidopsis thaliana | Rps2 | NP_194339 | NBS, LRR, others
    Arabidopsis thaliana | Rps4 | NP_199338 | TIR, NBS, LRR, others
    Arabidopsis thaliana | RPS5 | NP_17268 | NBS, LRR, others
    Arabidopsis thaliana | RPW8.1 | AAK09266 | RPW8, others
    Arabidopsis thaliana | RPW8.2 | AAK09267 | RPW8, others
    Arabidopsis thaliana | RRS1 | NP_001078715 | NBS, LRR, others
    Arabidopsis thaliana | RTM1 | NP_172067 | n.a.
    Arabidopsis thaliana | RTM2 | NP_568144 | n.a.
    Arabidopsis thaliana | SSI4 | AAN86124 | TIR, NBS, LRR, others
    Beta vulgaris | Hs1 | AAB48305 | n.a.
    Capsicum annuum | Bs3 | ABW82012 | NBS, LRR, others
    Capsicum annuum | Bs3-E | ABW82011 | NBS
    Capsicum chacoense | Bs2 | AAF09256 | NBS, LRR, others
    Cucumis melo | At1 | AAL47679 | PK
    Cucumis melo | At2 | AAL62332 | PK
    Cucumis melo | FOM-2 | ABB91438 | NBS, LRR, others
    Cucumis melo | VAT | - | NBS, LRR, others
    Glycine max | KR1 | AAL56987 | TIR, NBS, LRR, others
    Glycine max | Rps1-k-1 | AAX89382 | NBS, LRR, others
    Glycine max | Rps1-k-2 | AAX89383 | NBS, LRR, others
    Helianthus annuus | Pl8 | AAT08955 | NBS, LRR, others
    Hordeum vulgare | MLA1 | ACZ65507 | NBS, LRR, others
    Hordeum vulgare | MLA10 | AAQ55541 | NBS, LRR, others
    Hordeum vulgare | MLA13 | AAO16014 | NBS, LRR, others
    Hordeum vulgare | Mlo | CAB06083 | Mlo
    Hordeum vulgare | RPG1 | ABK51311 | PK
    Hordeum vulgare | Mla12 | AAO43441 | NBS, LRR, others
    Hordeum vulgare | Mla6 | CAC29242 | NBS, LRR, others
    Hordeum vulgare | Rdg2a | ADK47521 | NBS, LRR, others
    Lactuca sativa | Dm3 | AAD03156 | NBS, LRR, others
    Linum usitatissimum | L6 | AAA91022 | TIR, NBS, LRR, others
    Linum usitatissimum | M | AAB47618 | TIR, NBS, LRR, others
    Linum usitatissimum | P2 | AAK28805 | TIR, NBS, LRR, others
    Nicotiana benthamiana | Serk3A | ADO86982 | LRR, PK
    Nicotiana benthamiana | Serk3B | ADO86983 | LRR, PK
    Nicotiana glutinosa | N | AAA50763 | TIR, NBS, LRR, others
    Nicotiana tabacum | IVR | CAA08776 | n.a.
    Oryza sativa | PIB | BAA76282 | NBS, LRR, others
    Oryza sativa | Pi-ta | AAO45178 | NBS, LRR, others
    Oryza sativa | XA1 | BAA25068 | NBS, LRR, others
    Oryza sativa | xa21 | BAE93934 | LRR, PK
    Oryza sativa | Pi2 | ABC94598 | NBS, LRR, others
    Oryza sativa | Pi36 | ABI64281 | NBS, LRR, others
    Oryza sativa | Pi9 | ABC18336 | NBS, LRR, others
    Oryza sativa | Pid2 | ACR15163 | PK
    Oryza sativa | Xa13 | ABD78944 | n.a.
    Oryza sativa | Xa27 | AAY54163 | n.a.
    Oryza sativa | Xa5 | AAV53715 | n.a.
    Oryza sativa | Pi5-1 | ACJ54697 | NBS, LRR, others
    Oryza sativa | Pi5-2 | ACJ54698 | NBS, LRR, others
    Oryza sativa | Pid3 | ACN79514 | NBS, LRR, others
    Oryza sativa | Pikm1-TS | BAG72135 | NBS, LRR, others
    Oryza sativa | Pikm2-TS | BAG72136 | NBS, LRR, others
    Oryza sativa | Pikp-2 | ADV58351 | NBS, LRR, others
    Oryza sativa | Pit | BAH20862 | NBS, LRR, others
    Oryza sativa | Piz-t | ABC73398 | NBS, LRR, others
    Oryza sativa | Xa26 | ABD84047 | LRR, PK
    Phaseolus vulgaris | PGIP | CAA46016 | LRR, PK
    Solanum acaule | Rx2 | CAB56299 | NBS, LRR, others
    Solanum bulbocastanum | Rpi-blb1 | AAP86601 | NBS, LRR, others
    Solanum bulbocastanum | Rpi-blb2 | AAZ95005 | NBS, LRR, others
    Solanum demissum | R1 | AAL39063 | NBS, LRR, others
    Solanum habrochaites | Cf-4 | CAA05270 | LRR, PK
    Solanum habrochaites | Cf4A | CAA05265 | LRR, PK
    Solanum lycopersicum | I-2 | AAD27815 | NBS, LRR, others
    Solanum lycopersicum | Asc-1 | AAF67518 | n.a.
    Solanum lycopersicum | Bs4 | AAR21295 | TIR, NBS, LRR, others
    Solanum lycopersicum | Hero | CAD29729 | NBS, LRR, others
    Solanum lycopersicum | LeEIX1 | AAR28377 | LRR, PK
    Solanum lycopersicum | LeEIX2 | AAR28378 | LRR, PK
    Solanum lycopersicum | Mi1.2 | AAC67238 | NBS, LRR, others
    Solanum lycopersicum | Sw-5 | AAG31013 | NBS, LRR, others
    Solanum lycopersicum | Tm-2 | AAQ10735 | NBS, LRR, others
    Solanum lycopersicum | Tm-2a | AAQ10736 | NBS, LRR, others
    Solanum lycopersicum | Ve1 | AAK58682 | LRR, PK
    Solanum lycopersicum | Ve2 | AAK58011 | LRR, PK
    Solanum lycopersicum | Cf-5 | AAC78591 | LRR, PK
    Solanum pimpinellifolium | Cf-2 | AAC15779 | LRR, PK
    Solanum pimpinellifolium | Prf | AAF7630 | NBS, LRR, others
    Solanum pimpinellifolium | Pto | AAC48914 | PK
    Solanum pimpinellifolium | Cf-9 | CAA05277 | LRR, PK
    Solanum pimpinellifolium | Cf9B | CAA05273 | LRR, PK
    Solanum tuberosum | Gpa2 | AAF04603 | NBS, LRR, others
    Solanum tuberosum | Gro1.4 | AAP44390 | TIR, NBS, LRR, others
    Solanum tuberosum | R3a | AAW48299 | NBS, LRR, others
    Solanum tuberosum | Rx | CAB50786 | NBS, LRR, others
    Solanum tuberosum | RY-1 | CAC82811 | TIR, NBS, LRR, others
    Triticum aestivum | Lr1 | ABS29034 | NBS, LRR, others
    Triticum aestivum | Lr10 | AAQ01784 | NBS, LRR, others
    Triticum aestivum | Lr21 | ACO53397 | NBS, LRR, others
    Triticum aestivum | Lr34 | ADK62371 | NBS, LRR, others
    Triticum aestivum | Pm3 | AAQ96158 | NBS, LRR, others
    Zea mays | Hm1 | NP_001105920 | n.a.
    Zea mays | Hm2 | ABY68564 | n.a.
    Zea mays | Rp1-D | AAD47197 | NBS, LRR, others


    Bertioli et al. (2009) isolated 78 RGAs from the peanut and its wild relatives with degenerate primers designed based on an NBS domain. Gao et al. (2010) isolated 100 RGAs from Nicotiana repanda based on NBS and PK domains.

    Compared with PCR amplification, data mining is an effective and efficient strategy for the identification of RGAs from genomes. Meyers et al. (2003) identified 149 NBS-LRR-encoding genes and 58 other types of genes in the genome of Arabidopsis thaliana. Ameline-Torregrosa et al. (2008) identified 333 non-redundant NBS-LRR genes in the draft genome sequence of Medicago truncatula, and predicted that its whole genome contains 400-500 NBS-LRR genes. Recently, Li et al. (2010) successfully identified 158 NBS-encoding R genes in the genome of Lotus corniculatus.

    Plant genomes contain abundant pseudogenes that have lost their biological functions. Inevitably, most of the RGAs identified on the basis of genomic sequences are unexpressed pseudogenes (Li et al., 2010), which severely hampers the effective cloning of R-genes. Therefore, true R-genes need to be distinguished from the plentiful pseudogenes. Recently, a number of RGAs have been identified from plant EST sequences through data mining. Liu et al. (2012) obtained 1.69 million ESTs of common bean by 454 sequencing technology, from which 364 RGAs were identified. Liu et al. (2013) successfully identified 385 expressed peanut RGAs by scanning the peanut EST sequences available in GenBank with 54 plant resistance gene sequences.

    Once the RGAs are identified, the next logical step is to develop RGA markers, which could be restriction fragment length polymorphisms (RFLP) (Sanz et al., 2013), sequence-tagged sites (STS) (Loarce et al., 2009), single-strand conformation polymorphisms (SSCP) (Tantasawat et al., 2012), cleaved amplified polymorphic sequences (CAPS) (Palomino et al., 2009), simple sequence repeats (SSR) (Liu et al., 2013), etc. Sanz et al. (2013) designed 31 RFLP probes based on the RGA sequences of oat, and successfully mapped 53 RGA-RFLP markers on the hexaploid map of A. byzantina cv. Kanota × A. sativa cv. Ogle. Recently, Liu et al. (2013) developed 28 SSR markers using 25 peanut RGAs, and mapped one of the markers, RGA121, onto the linkage group AhIV. SSR markers possess many advantages over other types of markers, such as codominance, high polymorphism, and easy manipulation with good reproducibility (Agarwal et al., 2008); therefore, it is more practical to develop SSR markers from RGAs for the mapping and cloning of plant disease resistance genes.

    By June 2012, the number of tobacco EST sequences in the public nucleotide database GenBank had reached 412325, covering almost all genes expressed in different growth stages and different tissues, thus enabling the identification of expressed tobacco RGAs. Therefore, this study was intended to identify expressed RGAs from tobacco EST sequences by data mining and to use them to develop RGA-SSR markers that will provide a useful basis for the future identification and cloning of tobacco disease resistance genes.

    MATERIALS AND METHODS

    Plant materials

    Twenty-four varieties of tobacco and six wild species of the genus Nicotiana were used in this study (Table 1). The plant materials were cultivated on the experimental farm of the Guangdong Academy of Agricultural Sciences, China, and young leaves were sampled in the summer of 2012. Genomic DNA was extracted from fresh leaf samples with a DNA extraction kit (Cat. No. DP320, Tiangen, Beijing, China).

    Tobacco EST sequence assembly

    The tobacco EST sequences available in GenBank (http://www.ncbi.nlm.nih.gov/) were downloaded and assembled using the TIGR Gene Indices Clustering Tools (TGICL) (http://compbio.dfci.harvard.edu/tgi/software/). EST sequences were considered to meet the assembly requirements if a) the length of overlapping nucleotides exceeded 50; b) the similarity reached 90%; and c) the non-matching length did not exceed 20 nucleotides.
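    The three assembly requirements amount to a simple predicate on each pairwise overlap. The Python sketch below illustrates that predicate only; it is not TGICL's implementation, and the `Overlap` record and its field names are hypothetical. "Exceeded 50" is read as strictly greater than 50, "reached 90%" as at least 90%, and "did not exceed 20" as at most 20.

```python
from dataclasses import dataclass


@dataclass
class Overlap:
    """Hypothetical summary of one pairwise EST overlap."""
    overlap_len: int      # overlapping nucleotides
    identity_pct: float   # percent similarity within the overlap
    overhang_len: int     # non-matching (overhanging) nucleotides


def meets_assembly_criteria(o, min_overlap=50, min_identity=90.0, max_overhang=20):
    """Apply criteria a) to c) from the text to a single overlap."""
    return (
        o.overlap_len > min_overlap
        and o.identity_pct >= min_identity
        and o.overhang_len <= max_overhang
    )


print(meets_assembly_criteria(Overlap(120, 97.5, 4)))   # passes all three criteria
print(meets_assembly_criteria(Overlap(50, 99.0, 0)))    # overlap does not exceed 50
print(meets_assembly_criteria(Overlap(200, 88.0, 10)))  # similarity below 90%
```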

    Data mining of tobacco RGAs

    The amino acid sequences of 112 published plant R-genes (Table 2) were used to scan the tobacco Uni-ESTs in order to identify RGAs. BLAST searches were carried out with the tBLASTn tool, and Uni-ESTs with a BLAST score ≥100 and an E-value ≤1e-10 were considered candidate Nicotiana tabacum RGAs (NtRGAs).
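    The cut-offs above can be applied to BLAST results with a short filter. The Python sketch below assumes the hits are available as 12-column tab-delimited text in the layout of BLAST+ tabular output (query, subject, ..., E-value in column 11, bit score in column 12); the paper does not state which output format or which score (raw or bit score) was used, so both are assumptions. The function name `candidate_rgas` and the toy scores are invented for illustration.

```python
import csv
import io


def candidate_rgas(tabular_text, min_score=100.0, max_evalue=1e-10):
    """Return {uni_est_id: set of R-gene query ids} for hits passing both cut-offs."""
    hits = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < 12:  # skip blank or malformed lines
            continue
        r_gene, uni_est = row[0], row[1]
        evalue, score = float(row[10]), float(row[11])
        if score >= min_score and evalue <= max_evalue:
            # One Uni-EST may match several R genes; it is kept only once.
            hits.setdefault(uni_est, set()).add(r_gene)
    return hits


# Toy hits: identifiers appear in the paper, all numbers are invented.
rows = [
    ["N", "CL1421Contig1", "62.0", "180", "60", "2", "1", "180", "1", "540", "3e-45", "172"],
    ["Bs4", "CL1421Contig1", "55.0", "175", "70", "3", "5", "179", "1", "525", "1e-30", "128"],
    ["Pto", "FS423566", "41.0", "90", "50", "1", "10", "99", "3", "272", "2e-06", "48.5"],
]
toy = "\n".join("\t".join(r) for r in rows)
print(candidate_rgas(toy))
```

    Keying the result by Uni-EST rather than by R gene means that a Uni-EST matched by several R genes with shared domains is counted once.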

    Fig. 2. Alignment of FS423566 and CL17188Contig1 to N. benthamiana genome. A – FS423566 match with 4904-5308 bp in genomic scaffold (sequence ID: Niben.v0.3.Scf24993858). B – CL17188Contig1 match with 5494-6282 bp in genomic scaffold (sequence ID: Niben.v0.3.Scf24993858).


    Table 3. Characteristics of RGA-SSR in Nicotiana

    Marker | Motif | Forward primer (5'-3') | Reverse primer (5'-3') | Ta (°C) | Length (bp) | Amplified length (bp) | Alleles (cultivars) | Alleles (wild species)
    RGA-1 | (ACA)5 | AGACACTAAAATGGATAGAGTCTTAT | CAATGGTCTATCGGAAACAG | 47.3 | 201 | 200 | - | -
    RGA-3 | (TCT)5 | TCTTGCCACAACCACAAGTT | GGTCTTTTATTCCCTTATTATTCAC | 49.3 | 198 | NO | - | -
    RGA-4 | (AGG)5 | AAAGAGCCTCGACCAATAACC | CAATGAAAATGCCAAGGAAAA | 51.4 | 251 | 250 | 2 | 4
    RGA-5 | (CTT)5 | GAGAATAAAGACCGAGTCCA | TAGTTCATTCAAACACCACC | 49.5 | 292 | 290 | 3 | 4
    RGA-7 | (TTC)5 | CTACCAAAGCCTCCTTCTCC | TTGCTTTCCCTTTTCTCAAA | 50.5 | 299 | 750 | - | 3
    RGA-8 | (CCACCG)5 | AATTCCGATGCCCACTTT | CAGCATTCAAGAAACCCAGTA | 52.4 | 296 | 290 | - | 2
    RGA-9 | (CCTGCT)4 | GCCTCGTGGACAACAGAGT | AATGAAACTAGAGCCCTTATGAC | 52.8 | 263 | 530 | - | 2
    RGA-10 | (TTTG)4 | ATTTCCTACTTCCTTCCTTTTA | TATTCCCTTATTATTCACGACT | 47.7 | 313 | 310 | - | -
    RGA-11 | (CCA)6 | GGAAATAATCATCGGCGGAGGT | GCAAGGAGGTTCTGGTGACGGA | 55.6 | 109 | 110 | - | 3
    RGA-12 | (AGT)6 | CCTGAGGGTGAAATGGTT | TTACATTAAGATTGGAGGTAAGA | 47.3 | 136 | 135 | 2 | 3
    RGA-13 | (ATAAA)4 | AAAACAGGTGGTAAATGGCG | AATCAAACTTGGGTGGAATA | 45.5 | 182 | Multi-band | - | -
    RGA-14 | (TATTT)4 | TAGTAATTGTTATGGGGAGTTTAG | ACACTGTTAGCGAAGGTGAA | 49.5 | 300 | 300 | - | 2
    RGA-15 | (ACA)5 | ACAGAAGTAAAGCAATACAGACA | ACATTTACCCTCCCCAGA | 47.1 | 180 | 180 | - | 3
    RGA-16 | (AAAAT)4 | TACCATTTGTATAATGTATACTACACTG | AACTTTGATTAGGCACTGGG | 45.7 | 130 | 130 | - | 3
    RGA-17 | (AGA)13 | TGCTTTCCCTTTTCTCAA | AATGTAAATGTATGTGCTGCTA | 48.6 | 250 | Multi-band | - | -
    RGA-18 | (CAA)6 | TTTTACGTGCCAGGTCTT | GTTTCAATCATCCCACTTTT | 47.3 | 245 | 245 | 3 | 4
    RGA-20 | (ATA)5 | AAAGGAACAAGCAAGATTGAAGT | TTCTCAAATACCATCAAGTAGGC | 49.0 | 286 | 285 | - | 3
    RGA-21 | (T)11 | AGCGGAATGTATAGAGCAGATAG | TTTATTTGGACTAGAAAGTTTGC | 50.3 | 232 | 232 | - | 3
    RGA-23 | (AAC)6 | TTTTCAAAGAAGAATAACAACCTC | ATAATATGCAGGAAAGGAGCTAAC | 47.7 | 213 | 215 | - | 2
    RGA-24 | (ACA)6 | GTTTCGGTGGTTCTGTCTATG | AGGTAAGGTCTGCGTACACTCT | 50.4 | 238 | 240 | - | 3
    RGA-25 | (GTA)9 | TCAAAGCAGCAATTACAAACAC | AAAATCAAAACTTTCCCACAGA | 50.9 | 320 | 320 | - | 2
    RGA-26 | (A)11 | CCGTATTTGAAAGTGGCGTGT | TCGTGGATAGTATAAATCATATCGT | 47.9 | 198 | 200 | - | -
    RGA-27 | (CGGTGG)5 | CCAGCATTCAAGAAACCCAGTA | CCCACTTATTGCCACTTCCTC | 55.2 | 132 | 130 | - | -
    RGA-28 | (T)10 | CTCTTCTCGCCTAATTTCACTA | CAGAGGCAGAGCCGTAGA | 48.3 | 151 | 150 | - | 2
    RGA-29 | (T)10 | CCCCTGAATATGACACCATC | TGTATGAATAGGCTACAGAGTGAG | 48.3 | 222 | 220 | - | 2
    RGA-30 | (GTT)6 | AGGTAGGGCTAAGACTGCGTAT | TAGGGTCTTCATCAACTAGCACA | 50.6 | 141 | 1200 | - | 2
    RGA-31 | (T)10 | AAAAGATAGACTTAGCCTACCAAT | CACAAACTTGGAACTCAATAAAA | 46.0 | 168 | 170 | - | -
    RGA-32 | (A)10 | TCCTACCTCCTGACTTTTCTTTA | GAACTTCCCTTCTTCTACCACA | 48.9 | 299 | 300 | - | 2
    RGA-33 | (T)13 | ATCTTCTTATCTTTCCTACCTCCTG | ATTCTAGCACTCAAACAAATCCC | 48.7 | 145 | 380 | - | 2
    RGA-34 | (TTG)5 | TTCCCTAAAACTATTTGTCGCT | CTGCCTATGTAACCGATTTTGT | 51.5 | 281 | 220 | - | 2
    RGA-35 | (T)11 | CTTACTAATCCACCCCAATCTA | TTTTACTTCTTGCCCCTTTTAT | 48.7 | 242 | 245 | - | -
    RGA-36 | (TCT)5 | TGCCTACTTCTTGACTTCTCAC | TATTAGTATTGCCCAAATCCAG | 51.3 | 288 | Multi-band | - | -
    RGA-37 | (A)16 | TAGCAATAGTCTACAACAAATGAT | AAGGAATACTTACTGGTGGGA | 46.9 | 213 | NO | - | -
    RGA-38 | (CACCC)4 | TGCTGTCTGAACAACGGCTCT | AGCACGAAGTGAGAAAGACGAG | 50.9 | 125 | NO | - | -
    RGA-39 | (A)11 | GATGATAAACTAAGACAGAATGCCA | AGAGGACAGGTACCCTCATGC | 48.9 | 154 | Multi-band | - | -
    RGA-40 | (A)17 | GCATTGATTTCAGTGCCCTATT | ATTCTCACGCTCTGCTCTTCC | 50.3 | 298 | Multi-band | - | -
    RGA-41 | (A)12 | AATGGTGAACAGGATTGTGTAA | AAATGAGAGAAAAAGATGGGAC | 47.5 | 277 | 280 | - | 3
    RGA-42 | (GAG)6 | ATCTGGTTCTTCGCCTTCAC | ACACTGCCAGAATCAGAACG | 56.2 | 182 | 180 | - | 2
    RGA-43 | (T)10 | ACACCACCGATGACTGAAAT | ACTCGGAGTACCTGCATGTG | 48.3 | 174 | Multi-band | - | -
    RGA-44 | (A)11 | TACTGTTCACCACCTATAATACATC | CAGAAGAGGAGCCAAGAGCA | 47.9 | 234 | 235 | - | 2
    RGA-45 | (ATAC)4 | AGTGGGAACTTAAAGTTGGTGA | TCCTGCTAAACCTACATCTTGC | 49.6 | 177 | 175 | - | 3
    RGA-46 | (T)21 | CCAACATAGTTAATGTGCAGGAT | GGGGTAAATAACATGTAGCTCAG | 47.7 | 163 | 165 | - | 2
    RGA-47 | (A)20 | GAGTTGCTAGGCTGGCAGAA | AATGGGGGAGACAGAGGGAG | 50.8 | 110 | 110 | 3 | 4
    RGA-48 | (ATA)5 | TCTGAATTTACCCGGTTTAGTTC | TTTCCCGTTCATTTCCGTTA | 50.1 | 173 | 175 | - | -
    RGA-49 | (T)12 | TGTCAGCAAGAACACCACCAT | CTGAAAACAAGTGAAGCAAGAGC | 51.0 | 244 | 245 | - | 2
    RGA-50 | (C)13 | TACCAGCATAGTCTGGGGATC | GGCAAAGGGAAGATACAACAA | 51.8 | 280 | 280 | 2 | 3
    RGA-51 | (AAG)6 | AAGAAGAAGGAGAAGAAAGACGA | AGAGTAGGGTAAAGGAACAAGGA | 48.3 | 277 | 275 | - | -
    RGA-52 | (GTCAA)4 | ACAATAAACTAGCAGGGACTGTC | TTTCCACTGAAATCTTTTGG | 47.9 | 299 | 400 | - | -
    RGA-53 | (TTG)5 | AGCAAGACACCATTTCCCACA | CAGAAGTTCCCTGACAAGTGATT | 51.4 | 228 | Multi-band | - | -
    RGA-54 | (TC)8(AC)10 | TTTCTATTTCTGTACTACACTTCCAAC | CTCCTTTGGACATACAGTTGACA | 50.2 | 239 | 750 | - | 2
    RGA-55 | (AGG)9 | CTTCAGTTGATTGGGAGAAAGA | GTGAACCAAGGCTCATTACAAG | 50.8 | 301 | 400 | - | -
    RGA-56 | (TTC)6 | GCTTTTACTACGCCCTTTTCCC | TGTCTCGTCGTGTTCATTCCAG | 51.6 | 268 | 270 | 2 | 3
    RGA-57 | (ATTC)4(CTA)5 | CGAGCGACTATACAATCGGTCT | GCAAAACTTACTGTTCCAACTCCTA | 53.0 | 280 | 280 | - | 3
    RGA-58 | (T)10 | AGATTCATTGATTCAACCATAAAC | AGAGTTCTCGTGTCAAATGGG | 47.9 | 126 | 125 | - | 2
    RGA-59 | (T)13 | ATAGCAAAGCAATCTTGTCTATA | CACCACCATTTGTCACTTCTAT | 45.3 | 113 | 115 | - | 2
    RGA-60 | (T)10 | AAAGCCAAGCTATTATCTGATTA | TAGTGCTATGCTACCAATCTTAT | 43.4 | 103 | 95 | 2 | 4
    RGA-61 | (G)10 | GAATTAAATCCGGGAGGGA | AACAAAGAGGCAGGAAAGGT | 51.5 | 195 | 195 | - | -
    RGA-62 | (CGGTGG)5 | CCAGCATTCAAGAAACCCAGTA | TCCCACTTATTGCCACTTCCTC | 55.3 | 133 | 135 | - | 2
    RGA-63 | (TA)18 | AACCAAATATGTACAAAGGGTCA | CCGAGGGATATGATGTAGATAGC | 49.4 | 301 | 300 | 4 | 4
    RGA-64 | (T)11 | AATAGAGGGATAGCTGCAATGAC | GATTTGGTGTAGAGGATGGGACT | 52.7 | 246 | 245 | - | -
    RGA-65 | (A)10 | CATTTTCCTTCATACTACACCTTTC | TACGAGAAACACCAATCCAAC | 49.2 | 259 | 260 | - | 2
    RGA-66 | (A)10 | ATGCCATTCTCAATTTTCCAAG | CACCATAACAGCCACCACCTAA | 49.9 | 241 | 240 | - | 2
    RGA-67 | (T)11 | TGTACCAAGACCAAGTCTAACC | TAGCAAGTCCTATGAACGCAAT | 46.4 | 199 | 200 | - | 2
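    The average allele numbers quoted in the abstract can be recomputed from the allele columns of Table 3. The Python sketch below does so under one assumption: where a row lists a single allele count, that count is read as belonging to the wild-species column (rows with two counts give cultivars first, wild species second). The variable names are illustrative.

```python
# Allele counts transcribed from Table 3 (markers with scorable polymorphism only).
cultivar_alleles = {
    "RGA-4": 2, "RGA-5": 3, "RGA-12": 2, "RGA-18": 3, "RGA-47": 3,
    "RGA-50": 2, "RGA-56": 2, "RGA-60": 2, "RGA-63": 4,
}
wild_alleles = {
    "RGA-4": 4, "RGA-5": 4, "RGA-7": 3, "RGA-8": 2, "RGA-9": 2, "RGA-11": 3,
    "RGA-12": 3, "RGA-14": 2, "RGA-15": 3, "RGA-16": 3, "RGA-18": 4, "RGA-20": 3,
    "RGA-21": 3, "RGA-23": 2, "RGA-24": 3, "RGA-25": 2, "RGA-28": 2, "RGA-29": 2,
    "RGA-30": 2, "RGA-32": 2, "RGA-33": 2, "RGA-34": 2, "RGA-41": 3, "RGA-42": 2,
    "RGA-44": 2, "RGA-45": 3, "RGA-46": 2, "RGA-47": 4, "RGA-49": 2, "RGA-50": 3,
    "RGA-54": 2, "RGA-56": 3, "RGA-57": 3, "RGA-58": 2, "RGA-59": 2, "RGA-60": 4,
    "RGA-62": 2, "RGA-63": 4, "RGA-65": 2, "RGA-66": 2, "RGA-67": 2,
}


def mean_alleles(counts):
    """Average number of alleles per polymorphic locus."""
    return sum(counts.values()) / len(counts)


print(len(cultivar_alleles), round(mean_alleles(cultivar_alleles), 2))
print(len(wild_alleles), round(mean_alleles(wild_alleles), 2))
```

    Under this reading, the table yields nine polymorphic markers in cultivars and 41 in wild species, with means of 2.56 and 2.61 alleles per locus, matching the figures given in the abstract.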

    Mapping of NtRGAs in Nicotiana benthamiana genome

    The NtRGAs were mapped to the genome of Nicotiana benthamiana by BLAST search using the BLAST tool (http://solgenomics.net/organism/Nicotiana_benthamiana/genome). According to the BLAST results, the genome sequences with the highest BLAST scores (>50) and the smallest E-values ([...]

    [...] the shortest length of 15 bp, dinucleotide motifs repeated at least 8 times, trinucleotide motifs repeated at least 5 times, 4- to 6-nucleotide motifs repeated at least 4 times, and single-nucleotide motifs excluded. Using the Primer Premier 5 program (http://www.premierbiosoft.com/primerdesign/), primer pairs were designed based on the following core criteria: (1) melting temperature (Tm) between 52°C and 63°C, with 60°C as the optimum; (2) product size ranging from 100 bp to 350 bp; (3) primer length ranging from 18 bp to 24 bp with amplification rate larger than 80%; and (4) a GC content between 40% and 60%. The parameters were modified when unsuitable primer pairs were retrieved by the program.
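    The SSR criteria listed above (minimum length 15 bp; at least 8, 5 and 4 repeats for di-, tri- and tetra- to hexanucleotide motifs; mononucleotide runs excluded) can be sketched with regular expressions. The Python code below is a minimal illustration, not the search tool used in the study, which is not named in the surviving text; the names `find_ssrs` and `is_primitive` are hypothetical.

```python
import re

# Minimum repeat number per motif length, taken from the criteria in the text.
MIN_REPEATS = {2: 8, 3: 5, 4: 4, 5: 4, 6: 4}
MIN_LENGTH = 15  # shortest accepted SSR, in bp


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT is AT x 2)."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    found = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One copy of the unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            span = m.end() - m.start()
            if is_primitive(motif) and span >= MIN_LENGTH:
                found.append((m.start(), motif, span // unit_len))
    return sorted(found)


print(find_ssrs("GGC" + "ACA" * 5 + "TTG"))  # one trinucleotide SSR
print(find_ssrs("TA" * 18))                  # one dinucleotide SSR
print(find_ssrs("A" * 20))                   # mononucleotide run, excluded
```

    The primitive-motif check prevents a dinucleotide repeat from being reported again as a tetra- or hexanucleotide repeat, and also discards mononucleotide runs.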

    Newly designed primers were used for PCR amplification in 24 varieties of tobacco and 6 wild species of the genus Nicotiana in order to detect polymorphism at the species and/or genus levels. PCR was performed in a total volume of 20 μL using standard PCR conditions {20 ng DNA, 2.0 μL 10× buffer [0.8 mol/L Tris-HCl, 0.2 mol/L (NH4)2SO4, 0.2% (v/v) Tween 20], 2.0 μL 10× dNTPs (2.5 mmol/L each), 0.4 μL each PCR primer (10 mmol/L), 2.4 μL MgCl2 (25 mmol/L), 1 unit Taq polymerase (Cat. No. ET101, Tiangen, Beijing, China)}. The PCR profile was as follows: 1 cycle for 5 min at 94°C, 35 cycles of 1 min at 94°C, 30 s at 55°C and 45 s at 72°C, and an additional cycle for final extension for 10 min at 72°C. All primers were initially screened using Taq DNA polymerase. A negative control containing all PCR reaction components except template DNA served to validate the PCR. Each of the primer pairs was screened twice to confirm the repeatability of the observed bands in each genotype. PCR products were separated on a 6% denaturing polyacrylamide gel. The gels were silver stained for SSR band detection. Alleles were scored visually by comparing the position of the bands to the DNA marker.

    RESULTS

    Tobacco EST sequence assembly

    By June 7, 2012, the number of ESTs available for the genus Nicotiana in GenBank had reached 412325, of which 334384 were from N. tabacum, 56102 from N. benthamiana, 12448 from N. langsdorffii × N. sanderae, 8583 from N. sylvestris, 355 from N. attenuata, and 453 from other species. All these EST sequences were downloaded from GenBank in FASTA format and used for the development of tobacco RGAs. However, they comprised a large number of redundant EST sequences. In order to improve the quality of the EST sequences and to obtain sequences longer than the original ones, as well as consensus sequences derived from the same loci, the tobacco EST sequences from GenBank were assembled with TGICL. The results showed that a total of 149606 potential unique EST sequences, including 45137 contigs and 101169 singletons, were generated, with the longest sequence being 2312 bp, the shortest 431 bp, and the average length 874 bp.

    Identification of NtRGAs

    A total of 112 R genes were used to search against the tobacco Uni-EST sequences with an E-value cut-off of 1e-10. Of the 112 R genes, 109 bore similarity to 6963 Uni-EST sequences; the remaining three (RPW8.1, RPW8.2 and Xa27) had no matches. Since different R-genes often harbor the same or similar domains and such genes tend to match the same Uni-EST sequence, many of the matched Uni-EST sequences were counted repeatedly in the BLAST results. After removal of the repeated counts, we found a total of 1113 Uni-EST sequences matching the 109 R-genes (Additional file 1). Of these sequences, 273 harbored NBS-LRR domains, 546 harbored LRR-PK domains, 53 harbored extracellular LRR domains, 102 harbored only the PK domain, 30 harbored an Mlo domain, and for the remaining 109 EST sequences no domains were found. These Uni-ESTs, delineated in the present study and matching the R-genes, were identified as tobacco RGAs and designated NtRGAs.
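    As a quick consistency check on the domain breakdown reported above, the category counts can be tallied and expressed as shares of the total. This Python snippet only restates the numbers given in the text; the dictionary labels are shorthand chosen here.

```python
# Domain composition of the NtRGAs, as reported in the text.
domain_counts = {
    "NBS-LRR": 273,
    "LRR-PK": 546,
    "extracellular LRR": 53,
    "PK only": 102,
    "Mlo": 30,
    "no domain found": 109,
}

total = sum(domain_counts.values())
shares = {name: 100.0 * n / total for name, n in domain_counts.items()}

for name, n in domain_counts.items():
    print(f"{name:18s} {n:5d} {shares[name]:5.1f}%")
print(f"{'total':18s} {total:5d}")
```

    The six categories add up to the 1113 NtRGAs reported, with LRR-PK the largest class at just under half of the total.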

    Mapping of NtRGAs in the Nicotiana benthamiana genome

    Mapping the NtRGAs onto the tobacco genome is of great significance for the isolation of specific candidate disease resistance genes/QTLs. Because a draft genome sequence of N. benthamiana has been published, it was possible to map the NtRGAs directly onto that genome by BLAST. In the present study, the draft genome sequence of N. benthamiana was screened for loci matching the NtRGA sequences. With the score set to greater than 50 and the E-value to less than 1e-10, 1071 of the 1113 NtRGAs matched similar sequences in the N. benthamiana genome. Of the 1071 matched sequences, 965 (90.7%) matched more than one fragment. On average, one NtRGA sequence matched 9.67 fragments, and one (CL7158Contig2) matched 529 genome fragments. Since plant genes have frequently acquired multiple copies during the long course of evolution, accurately mapping the NtRGAs onto the genome remains difficult. In our study, the genome sequence with the highest BLAST score and the lowest E-value was regarded as the most probable genomic locus of each NtRGA. In this way, the 1071 NtRGAs were allocated to 712 genomic loci, of which 218 matched more than one NtRGA (Supplementary file 2). Further analysis revealed two types of NtRGAs matching the same genomic locus: (1) NtRGAs with high homology matching the same genomic region (Fig. 1), and (2) NtRGAs without homology matching different genomic regions (Fig. 2). Because an alternatively spliced gene may be transcribed into different mRNAs, Type 1 NtRGAs could originate from homologous genes of different tobacco species, but also from the same gene. ESTs usually represent only part of a complete gene; EST sequences from the same gene may therefore lack an overlapping region and cannot be assembled. Thus, Type 2 NtRGAs were possibly derived from different segments of the same gene.
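
    The best-hit rule described above (score above 50, E-value below 1e-10, then the hit with the highest score and lowest E-value) can be sketched as follows. This is a simplified illustration, not the authors' implementation: each locus is reduced to a single label, and all identifiers and values in the toy data are hypothetical.

```python
from collections import defaultdict

MIN_SCORE = 50        # hits must score above this value
MAX_E_VALUE = 1e-10   # hits must have an E-value below this value


def assign_loci(hits):
    """Pick the most probable locus for each NtRGA.

    hits: iterable of (ntrga, locus, score, e_value) tuples.
    Returns {ntrga: locus} using the highest score, then the lowest E-value.
    """
    best = {}
    for ntrga, locus, score, e_value in hits:
        if score <= MIN_SCORE or e_value >= MAX_E_VALUE:
            continue
        key = (score, -e_value)  # larger key means a better hit
        if ntrga not in best or key > best[ntrga][0]:
            best[ntrga] = (key, locus)
    return {ntrga: locus for ntrga, (_, locus) in best.items()}


def group_by_locus(assignment):
    """Invert {ntrga: locus} into {locus: [ntrga, ...]}."""
    loci = defaultdict(list)
    for ntrga, locus in assignment.items():
        loci[locus].append(ntrga)
    return dict(loci)


# Toy hits (hypothetical identifiers, scores and E-values).
hits = [
    ("RGA_a", "locus_1", 300.0, 1e-80),
    ("RGA_a", "locus_2", 120.0, 1e-25),  # weaker second copy, not chosen
    ("RGA_b", "locus_1", 250.0, 1e-60),
    ("RGA_c", "locus_3", 40.0, 1e-3),    # fails both thresholds
]

assignment = assign_loci(hits)
loci = group_by_locus(assignment)
shared = [locus for locus, members in loci.items() if len(members) > 1]
print(assignment, shared)
```

    Loci listed in `shared` correspond to the case in which more than one NtRGA is allocated to the same locus.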

    Regarding the distribution of NtRGAs in the N. benthamiana genome, we found that the NtRGAs were not evenly distributed throughout the genome, and that several NtRGAs occurred in tandem in 17 genomic scaffolds. For example, the 63 kb genomic scaffold (sequence ID: Niben.v0.3.Scf25265845) contained four NtRGAs.

    Fig. 3. Amplification results of RGA-63 in Nicotiana. 1 – ATNARELLO; 2 – KY26; 3 – Big White Burley 599; 4 – White Burley 9; 5 – A Bo Yan; 6 – Margland 609; 7 – Guang Huang 5; 8 – Feng Zi 1; 9 – Ge Xin 6; 10 – Ge Za Ji; 11 – Hong Hua Da Jin Yuan; 12 – Jin Xing; 13 – Bai Gu Niu Li; 14 – Da Qiu Gen 2; 15 – Da Shan Gou; 16 – Da Tong; 17 – Da Ye Mi He; 18 – Da Zhong Bai Mao; 19 – 2040; 20 – 74-16; 21 – Xia Wang Na; 22 – BLACK SEA SAMSUN; 23 – Burley Hampton; 24 – Dark_Virginia; 25 – N. clevelandii A. Gray; 26 – N. repanda Willd; 27 – N. debneyi Domin; 28 – N. rustica L.; 29 – N. tomentosiformis; 30 – N. sylvestris Speg. & Comes; M: ΦX174/HincII digest DNA marker (13 bands: 79-1057 bp, Cat. No. DNA-114, TOYOBO, Shanghai, China).

    Development of SSR markers

    A total of 78 SSR loci were detected in the 1113 NtRGAs using the Perl script MISA; they were distributed across 72 sequences, so that one SSR was present every 939348 bp on average. Six NtRGAs harbored more than one SSR. We designed 64 primer pairs within the sequences flanking the SSRs using the software Primer Premier 5, but were unable to design primers for 8 NtRGAs because the flanking sequences were too short or too complex in structure. When these 64 primer pairs were tested, 54 generated clear bands in tobacco, and the remaining 10 pairs either failed to produce PCR products or generated RAPD-like non-specific bands (Table 3). Of the 54 primer pairs that generated clear bands, 46 produced amplified products of the expected length, seven produced products longer than expected, and one produced a fragment shorter than expected. Nine of the 54 primer pairs (16.7%) displayed polymorphism in the 24 varieties of tobacco. The total number of alleles detected at these nine loci was 23; the number of alleles per locus ranged from 2 to 4, with an average of 2.56 alleles per locus. All 54 primer pairs tested in cultivated varieties also amplified successfully in the 6 wild species of the genus Nicotiana, and higher levels of polymorphism were detected than in the cultivated varieties: a total of 41 primer pairs displayed polymorphism, accounting for 75.9% of the amplified primers. The total number of alleles detected at the 41 loci was 92; the number of alleles per locus ranged from 2 to 4, with an average of 2.61. The amplification results of RGA-63 in Nicotiana are shown in Fig. 3. At this locus, four alleles were observed in the 24 varieties of tobacco and 6 wild species of the genus Nicotiana.
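
    The kind of SSR scan performed by a tool such as MISA can be illustrated with a short regular-expression search. The sketch below is not MISA itself; the minimum repeat numbers (six for dinucleotide, five for trinucleotide and four for tetranucleotide motifs) are illustrative assumptions rather than the settings used in this study, and the test sequence is hypothetical.

```python
import re

# Hypothetical thresholds: {motif length: minimum number of repeats}.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a list of (start, motif, repeat_count) for perfect SSRs."""
    seq = seq.upper()
    found = []
    for unit_len, n in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # homopolymer run, e.g. AAAA
            if unit_len == 4 and motif[:2] == motif[2:]:
                continue  # e.g. ATAT is really a dinucleotide repeat
            found.append((match.start(), motif, len(match.group(0)) // unit_len))
    return found


# Toy sequence (hypothetical): an (AG)7 and a (CTT)5 repeat in short flanks.
toy_seq = "GGCC" + "AG" * 7 + "TTGACC" + "CTT" * 5 + "GA"
ssrs = find_ssrs(toy_seq)
print(ssrs)
```

    Primers would then be designed in the sequence flanking each reported start position, which is why SSRs lying too close to the end of a sequence cannot be converted into markers.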

    DISCUSSION

    As an integral part of the gene-for-gene disease resistance mechanism, plant R-genes play a crucial role in the recognition of pathogen-specific proteins encoded by avirulence genes (Flor, 1956). In the present study, 1113 RGAs were identified from the tobacco EST data submitted to GenBank and then mapped onto the N. benthamiana genome, indicating that EST data can be used to identify RGAs efficiently.

    To date, RGAs have been successfully identified from sugarcane, wheat, corn and other crops by data mining (Rossi et al., 2003; Dilbirligi and Gill, 2003; Collins et al., 1998). Dilbirligi and Gill (2003) tested four different strategies to search for RGAs in wheat: domain search, single or multiple motif search, consensus sequence search, and single full-length sequence search. The authors found that the last strategy performed best, detecting 243 NBS-LRR-type RGAs and 101 RGAs of other types with the E-value set at ≤ 1e-10. Xiao et al. (2006) applied modified amplified fragment length polymorphism (AFLP), rapid amplification of cDNA ends (RACE) and data mining to identify R-gene-like ESTs (or RGAs) in maize and found that data mining was the most effective. Using the strictest BLAST condition (E < 1e-50), Rossi et al. (2003) detected 88 RGAs from sugarcane EST sequences, representing three main R-gene families, namely NBS-LRR, LRR-TM and PK. These reports demonstrate that the results of RGA searches are influenced largely by the E-value applied. In the present study, the E-value was set at ≤ 1e-10, and a total of 1113 RGAs were identified, about three times as many as the number of RGAs obtained from wheat (Dilbirligi and Gill, 2003). Two factors likely account for the large difference in the number of RGAs detected in these two crops. First, the number of tobacco ESTs used in the present study for data mining of NtRGAs was larger than that of wheat: we used a total of 412325 tobacco EST sequences, whereas Dilbirligi and Gill (2003) used only 78221 wheat EST sequences. Second, 112 R-genes were employed for the BLAST search in this study, many more than the number of R-genes applied in wheat (Dilbirligi and Gill, 2003).

    One of the advantages of identifying RGAs by data mining of EST sequences is that all identified RGAs are expressed genes, whereas RGAs identified from genome sequences may be unexpressed pseudogenes. For example, 65 RGAs of Lotus japonicus identified by Li et al. (2010) were ultimately found to be pseudogenes. Meyers et al. (2003) searched for RGAs containing NBS-LRR domains in Arabidopsis thaliana and found that at least 12 NBS-LRR genes had evolved into pseudogenes through frameshift or nonsense mutations.

    In this study, 1071 of the identified NtRGAs were allocated to 712 loci of the N. benthamiana genome, which provides a basis for the future cloning of tobacco R-genes. However, 42 of the identified NtRGAs could not be mapped onto the N. benthamiana genome, most likely for the following reasons: (1) the whole genome of N. benthamiana used in the present study is still a draft and includes regions that have not yet been sequenced; (2) only some of the NtRGAs identified in this study were derived from N. benthamiana, and NtRGAs derived from other species may fail to map because of variation between the genomes of different species. We found that the NtRGAs were not distributed evenly throughout the genome, with a number of RGAs occurring in clusters, which is consistent with previous reports in other plants (He et al., 2004; Peñuela et al., 2002). The tandem arrangement of R-genes facilitates their genetic variation and evolution. Bertioli et al. (2009) analyzed the synteny of Arachis with Lotus and Medicago and found that retrotransposons are associated with some disease resistance gene families. Other mechanisms, such as replication, gene conversion and non-allelic exchange, have also been proposed to explain the clustering and evolution of plant R-genes (Ellis et al., 2000).

    How to utilize the identified RGAs remains an open question. In this study, 62 RGA-SSR markers were developed, and they will facilitate the future mapping and cloning of disease resistance genes. Recent studies have revealed that SSRs exhibit high polymorphism in common tobacco (Bindler et al., 2011). In constructing a tobacco genetic map, Bindler et al. (2011) found that 2415 (47%) of 5119 SSR primer pairs detected polymorphism between the parents. In this study, however, only 16.7% of the primers detected polymorphism in tobacco, and the number of alleles detected per locus was just 2-4. The low polymorphism of the SSR markers developed in this study might be because they were derived from expressed genes, which are under stronger selective pressure than the genome as a whole and thus have to maintain high sequence conservation to retain their biological functions (Fay and Wu, 2003; Flowers and Purugganan, 2008).

    Acknowledgments: This research was funded by grants from the Science and Technology Planning Project of Guangdong Province (No. 2012B020302006) and the Science and Technology Project of China Tobacco Company Guangdong Province branch (No. 201205; No. 201002).


    Authors’ contributions: Qinghua Yuan performed the EST sequence assembly and RGA identification, and prepared the first draft of the manuscript. Ruihong Xie was responsible for the development of the RGA-SSR markers. Zhuwen Ma detected RGA-SSR polymorphism in Nicotiana. Jiqin Li performed the DNA extraction. Shuling Li participated in RGA-SSR primer design. Junbiao Chen cultivated the plant material. Yonghua Lü oversaw the project, and revised and submitted the manuscript. All authors read and approved the final manuscript.

    Conflict of interest disclosure: The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. We declare no conflict of interest.

    Supplementary Material

    Supplementary file 1 – NtRGAs sequence.fasta; available at: http://serbiosoc.org.rs/sup/file1.fasta

    Supplementary file 2 – Mapping of NtRGAs in Nicotiana benthamiana genome.XLS; available at: http://serbiosoc.org.rs/sup/file2.xls

    REFERENCES

    Aarts M. G. M., Hekkert B. L., Holub E. B., Beynon J. L., Stiekema W. J. and A. Pereira (1998). Identification of R-gene homologous DNA fragments genetically linked to disease resistance loci in Arabidopsis thaliana. Mol. Plant Microbe. Interact. 11, 251-258.

    Agarwal M., Shrivastava N. and H. Padh (2008). Advances in molecular marker techniques and their applications in plant sciences. Plant Cell Rep. 27, 617-31.

    Ameline-Torregrosa C., Wang B. B., O’Bleness M. S., Deshpande S., Zhu H. Y., Roe B., Young N. D. and S. B. Cannon (2008). Identification and characterization of nucleotide-binding site-leucine-rich repeat genes in the model plant Medicago truncatula. Plant Physiol. 146, 5-21.

    Bent A. F. (1996). Plant disease resistance genes: function meets structure. Plant Cell 8, 1757-1771.

    Bent A. F., Kunkel B. N., Dahlbeck D., Brown K. L., Schmidt R., Giraudat J., Leung J. and B. J. Staskawicz (1994). RPS2 of Arabidopsis thaliana: a leucine-rich repeat class of plant disease resistance genes. Science 265, 1856-1860.

    Bertioli D. J., Moretzsohn M. C., Madsen L. H., Sandal N., Leal-Bertioli S. C., Guimaraes P. M., Hougaard B. K., Fredslund J., Schauser L., Nielsen A. M., Sato S., Tabata S., Cannon S. B. and J. Stougaard (2009). An analysis of synteny of Arachis with Lotus and Medicago sheds new light on the structure, stability and evolution of legume genomes. BMC Genomics 10, 45. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2656529/

    Bindler G., Plieske J., Bakaher N., Gunduz I., Ivanov N., Van der Hoeven R., Ganal M. and P. Donini (2011). A high density genetic map of tobacco (Nicotiana tabacum L.) obtained from large scale microsatellite marker development. Theor. Appl. Genet. 123, 219-230.

    Botella M. A., Coleman M. J., Hughes D. E., Nishimura M. T., Jones J. D. G. and S. C. Somerville (1997). Map positions of 47 Arabidopsis sequences with sequence similarity to disease resistance genes. Plant J. 12, 1197-1211.

    Buschges R., Hollricher K., Panstruga R., Simons G., Wolter M., Frijters A., van Daelen R., van der Lee T., Diergaarde P., Groenendijk J., Topsch S., Vos P., Salamini F. and P. Schulze-Lefert (1997). The barley Mlo gene: a novel control element of plant pathogen resistance. Cell 88, 695-705.

    Collins N. C., Webb C. A., Seah S., Ellis J. G., Hulbert S. H. and A. Pryor (1998). The isolation and mapping of disease resistance gene analogs in maize. Mol. Plant Microbe. Interact. 11, 968-978.

    Dangl J. L. and J. D. Jones (2001). Plant pathogens and integrated defence responses to infection. Nature 411, 826-833.

    Dilbirligi M. and K. S. Gill (2003). Identification and analysis of expressed resistance gene sequences in wheat. Plant Mol. Biol. 53, 771-787.

    Dixon M. S., Jones D. A., Keddie J. S., Thomas C. M., Harrison K., Jones J. D. G. and C. Lane (1996). The tomato Cf-2 disease resistance locus comprises two functional genes encoding leucine-rich repeat protein. Cell 84, 451-459.

    Dunning F. M., Sun W., Jansen K. L., Helft L. and A. F. Bent (2007). Identification and mutational analysis of Arabidopsis FLS2 leucine-rich repeat domain residues that contribute to flagellin perception. Plant Cell 19, 3297-3313.

    Ellis J., Dodds P. and T. Pryor (2000). The generation of plant disease resistance gene specificities. Trends Plant Sci. 5, 373-379.

    Fay J. C. and C. I. Wu (2003). Sequence divergence, functional constraint and selection in protein evolution. Annu. Rev. Genomics Hum. Genet. 4, 213-235.

    Flor H. H. (1956). The complementary genic systems in flax and flax rust. Adv. Genet. 8, 29-54.

    Flowers J. M. and M. D. Purugganan (2008). The evolution of plant genomes: scaling up from a population perspective. Curr. Opin. Genet. Dev. 18, 565-570.

    Gao Y. L., Xu Z. L., Jiao F. C., Yu H. Q., Xiao B. G., Li Y. P. and X. P. Lu (2010). Cloning, structural features and expression analysis of resistance gene analogs in tobacco. Mol. Biol. Rep. 37, 345-354.

    Graham M. A., Marek L. F., Lohnes D., Cregan P. and R. C. Shoemaker (2000). Expression and genome organization of resistance gene analogs in soybean. Genome 43, 86-93.

    He L., Du C., Covaleda L., Xu Z., Robinson A. F., Yu J. Z., Kohel R. J. and H. B. Zhang (2004). Cloning, characterization and evolution of the NBS-LRR-encoding resistance gene analogue family in polyploid cotton (Gossypium hirsutum L.). Mol. Plant Microbe. Interact. 17, 1234-1241.

    Huettel B., Santra D., Muehlbauer J. and G. Kahl (2002). Resistance gene analogues of chickpea (Cicer arietinum L.): isolation, genetic mapping and association with a Fusarium resistance gene cluster. Theor. Appl. Genet. 105, 479-490.

    Hulbert S. H., Webb C. A., Smith S. M. and Q. Sun (2001). Resistance gene complexes: evolution and utilization. Annu. Rev. Phytopathol. 39, 285-312.

    Johal G. S. and S. P. Briggs (1992). Reductase activity encoded by the Hm1 disease resistance gene in maize. Science 258, 985-987.

    Leng X., Xiao B., Wang S., Gui Y., Wang Y., Lu X., Xie J., Li Y. and L. Fan (2010). Identification of NBS-type resistance gene homologs in tobacco genome. Plant Mol. Biol. Rep. 28, 152-161.

    Li X. Y., Cheng Y., Ma W., Zhao Y., Jiang H. Y. and M. Zhang (2010). Identification and characterization of NBS-encoding disease resistance genes in Lotus japonicus. Plant Syst. Evol. 289, 101-110.

    Liu Z., Crampton M., Todd A. and V. Kalavacharla (2012). Identification of expressed resistance gene-like sequences by data mining in 454-derived transcriptomic sequences of common bean (Phaseolus vulgaris L.). BMC Plant Biol. 12, 42. http://www.biomedcentral.com/1471-2229/12/42

    Liu Z., Feng S., Pandey M. K., Chen X., Culbreath A. K., Varshney R. K. and B. Guo (2013). Identification of expressed resistance gene analogs from peanut (Arachis hypogaea L.) expressed sequence tags. J. Integr. Plant Biol. 55, 453-461.

    Loarce Y., Sanz M. J., Irigoyen M. L., Fominaya A. and E. Ferrer (2009). Mapping of STS markers obtained from oat resistance gene analog sequences. Genome 52, 608-619.

    Mago R., Nair S. and M. Mohan (1999). Resistance gene analogues from rice: cloning, sequencing and mapping. Theor. Appl. Genet. 99, 50-57.

    Martin G. B., Brommonschenkel S. H., Chunwongse J., Frary A., Ganal M. W., Spivey R., Wu T., Earle E. D. and S. D. Tanksley (1993). Map-based cloning of a protein kinase gene conferring disease resistance in tomato. Science 262, 1432-1436.

    Meyers B. C., Dickerman A. W., Michelmore R. W., Sivaramakrishnan S., Sobral B. W. and N. D. Young (1999). Plant disease resistance genes encode members of an ancient and diverse protein family within the nucleotide-binding superfamily. Plant J. 20, 317-332.

    Meyers B. C., Kozik A., Griego A., Kuang H. and R. W. Michelmore (2003). Genome-wide analysis of NBS-LRR-encoding genes in Arabidopsis. Plant Cell 15, 809-834.

    Nair R. A. and G. Thomas (2007). Isolation, characterization and expression studies of resistance gene candidates (RGCs) from Zingiber spp. Theor. Appl. Genet. 116, 123-134.

    Palomino C., Fernández-Romero M. D., Rubio J., Torres A., Moreno M. T. and T. Millán (2009). Integration of new CAPS and dCAPS-RGA markers into a composite chickpea genetic map and their association with disease resistance. Theor. Appl. Genet. 118, 671-682.

    Peñuela S., Danesh D. and N. D. Young (2002). Targeted isolation, sequence analysis and physical mapping of nonTIR NBS-LRR genes in soybean. Theor. Appl. Genet. 104, 261-272.

    Rossi M., Araujo P. G., Paulet F., Garsmeur O., Dias V. M., Chen H., Van Sluys M. A. and A. D’Hont (2003). Genomic distribution and characterization of EST-derived resistance gene analogs (RGAs) in sugarcane. Mol. Genet. Genomics 269, 406-419.

    Sanseverino W., Roma G., Simone M. D., Faino L., Melito S., Stupka E., Frusciante L. and M. R. Ercolano (2010). PRGdb: a bioinformatics platform for plant resistance gene analysis. Nucleic Acids Res. 38, D814-D821.

    Sanz M. J., Loarce Y., Fominaya A., Vossen J. H. and E. Ferrer (2013). Identification of RFLP and NBS/PK profiling markers for disease resistance loci in genetic maps of oats. Theor. Appl. Genet. 126, 203-218.

    Seah S., Sivasithamparam K., Karakousis A. and E. S. Lagudah (1998). Cloning and characterization of a family of disease resistance gene analogs from wheat and barley. Theor. Appl. Genet. 97, 937-945.

    Song W. Y., Wang G. L., Chen L. L., Kim H. S., Pi L. Y., Holsten T., Gardner J., Wang B., Zhai W. X., Zhu L. H., Fauquet C. and P. Ronald (1995). A receptor kinase-like protein encoded by the rice disease resistance gene, Xa21. Science 270, 1804-1806.

    Taler D., Galperin M., Benjamin I., Cohen Y. and D. Kenigsbuch (2004). Plant eR genes that encode photorespiratory enzymes confer resistance against disease. Plant Cell 16, 172-184.

    Tantasawat P. A., Poolsawat O., Prajongjai T., Chaowiset W. and A. Tharapreuksapong (2012). Association of RGA-SSCP markers with resistance to downy mildew and anthracnose in grapevines. Genet. Mol. Res. 11, 1799-1809.

    Tor M., Brown D., Cooper A., Woods-Tor A., Sjolander K., Jones J. D. and B. Holub (2004). Arabidopsis downy mildew resistance gene RPP27 encodes a receptor-like protein similar to CLAVATA2 and tomato Cf-9. Plant Physiol. 135, 1100-1112.

    Wan H. J., Zhao Z. G., Malik A., Qian C. T. and J. F. Chen (2010). Identification and characterization of potential NBS-encoding resistance genes and induction kinetics of a putative candidate gene associated with downy mildew resistance in Cucumis. BMC Plant Biol. 10, 186. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2956536/

    Whitham S., McCormick S. and B. Baker (1996). The N gene of tobacco confers resistance to tobacco mosaic virus in trans-genic tomato. Proc. Natl. Acad. Sci. USA. 93, 8776-8781.

    Xiao W. K., Xu M. L., Zhao J. R., Wang F. G., Li J. S. and J. R. Dai (2006). Genome-wide isolation of resistance gene analogs in maize (Zea mays L.). Theor. Appl. Genet. 113, 63-72.