Application of transcriptomics to plant breeding
Advances in genetics and molecular breeding of three legume crops of semi-arid tropics using next-generation sequencing
and high-throughput genotyping technologies
RAJEEV K VARSHNEY1,2,*, HIMABINDU KUDAPA1, MANISH ROORKIWAL1, MAHENDAR THUDI1, MANISH K PANDEY1, RACHIT K SAXENA1, SIVA K CHAMARTHI1, MURALI MOHAN S1, NALINI MALLIKARJUNA1, HARI UPADHYAYA1, POORAN M GAUR1, L KRISHNAMURTHY1, KB SAXENA1, SHYAM N NIGAM1 and SURESH PANDE1
1International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502 324, India
2CGIAR Generation Challenge Programme (GCP), c/o CIMMYT, 06600, Mexico, DF, Mexico
*Corresponding author (Fax, 0091-40-30713074/75; Email, [email protected])
Molecular markers are the most powerful genomic tools to increase the efficiency and precision of breeding practices for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics (SAT), namely, chickpea (Cicer arietinum), pigeonpea (Cajanus cajan) and groundnut (Arachis hypogaea), as compared to other crop species like cereals, has been very slow. With the advances in next-generation sequencing (NGS) and high-throughput (HTP) genotyping methods, there is a shift in the development of genomic resources, including molecular markers, in these crops. For instance, 2,000 to 3,000 novel simple sequence repeat (SSR) markers have been developed each for chickpea, pigeonpea and groundnut. Based on Sanger, 454/FLX and Illumina transcript reads, transcriptome assemblies have been developed for chickpea (44,845 transcript assembly contigs, or TACs) and pigeonpea (21,434 TACs). Illumina sequencing of some parental genotypes of mapping populations has resulted in the development of 120 million reads for chickpea and 128.9 million reads for pigeonpea. Alignment of these Illumina reads with the respective transcriptome assemblies has provided >10,000 SNPs each in chickpea and pigeonpea. A variety of SNP genotyping platforms, including GoldenGate, VeraCode and Competitive Allele Specific PCR (KASPar) assays, have been developed in chickpea and pigeonpea. Using the above resources, first-generation or comprehensive genetic maps have been developed in the three legume species mentioned above. Analysis of phenotyping data together with genotyping data has provided candidate markers for drought-tolerance-related root traits in chickpea, resistance to foliar diseases in groundnut, and sterility mosaic disease (SMD) and fertility restoration in pigeonpea. Using these trait-associated markers along with those already available, molecular breeding programmes have been initiated for enhancing drought tolerance and resistance to fusarium wilt and ascochyta blight in chickpea, and resistance to foliar diseases in groundnut. These robust trait-associated markers, along with other genomic resources including genetic maps, will certainly accelerate crop improvement programmes in the SAT legumes.
http://www.ias.ac.in/jbiosci J. Biosci. 37(5), November 2012, 811–820, © Indian Academy of Sciences
Keywords. Chickpea; genomic resource; genotyping; groundnut; legume; pigeonpea; sequencing
Abbreviations used: AB, ascochyta blight; AFLP, amplified fragment length polymorphism; BAC, bacterial artificial chromosome; BES, BAC-end derived sequence; CAPS, cleaved amplified polymorphic sequences; CISR, conserved intron spanning region; COS, conserved orthologous sequence; DArT, diversity array technology; FW, Fusarium wilt; GMM, genic molecular marker; GS, genomic selection; HTP, high-throughput; LLS, late leaf spot; MABC, marker-assisted backcrossing; MAS, marker-assisted selection; NGS, next-generation sequencing; RAPD, random amplified polymorphic DNA; RFLP, restriction fragment length polymorphism; RIL, recombinant inbred line; SAT, semi-arid tropics; SCMR, SPAD chlorophyll meter readings; SLA, specific leaf area; SMD, sterility mosaic disease; SNP, single nucleotide polymorphism; SPAD, soil plant analytical development; SSR, simple sequence repeats; TAC, transcript assembly contig; TE, transpiration efficiency; TUS, tentative unique sequences
Published online: 15 October 2012
Legumes include a number of important crops, namely, soybean (Glycine max), groundnut (Arachis hypogaea), cowpea (Vigna unguiculata), common bean (Phaseolus vulgaris), chickpea (Cicer arietinum), pigeonpea (Cajanus cajan), lentil (Lens culinaris), pea (Pisum sativum), mungbean (Vigna radiata), etc.
Global production of grain and forage legumes is about 300 million metric tons, grown on about 190 million hectares. Legumes are a rich source of proteins, vitamins, minerals and dietary fibre.
Several food legume crops are grown in the semi-arid tropics (SAT) of Africa, Asia and South America. In these areas, the legume crops are exposed to various biotic and abiotic stresses. The productivity of these legume crops can be enhanced through the use of biotechnological tools in breeding programmes.
Marker-trait associations are a prerequisite for marker-assisted selection (MAS). Marker-trait associations for a number of traits in all major crops have now become available owing to the accessibility of an array of molecular markers and dense molecular genetic maps. However, the majority of legume crops have remained untouched by the genomics revolution.
Two NGS technologies, namely 454 and Illumina, together with Sanger sequencing technology, have been used to characterize the transcriptomes of chickpea and pigeonpea. A total of 20,162 and 9,888 ESTs were developed for chickpea and pigeonpea, respectively, from drought- and salinity-challenged cDNA libraries for chickpea and from Fusarium wilt (FW)- and sterility mosaic disease (SMD)-challenged cDNA libraries for pigeonpea.
To improve these transcriptomic resources, 454/FLX sequencing was undertaken on normalized and pooled RNA samples collected from >20 tissues representing different developmental stages of the plant. As a result, 435,018 transcript reads for chickpea and 494,353 transcript reads for pigeonpea were generated. Cluster analysis of these transcript reads provided a transcript assembly of chickpea with 103,215 tentative unique sequences (TUSs) and of pigeonpea with 127,754 TUSs.
Illumina sequencing was carried out on parental genotypes of mapping populations of chickpea and pigeonpea. Alignment of these short reads onto the transcript assemblies (TAs) of the respective species has provided a large number (tens of thousands) of SNPs in each species. This allowed the development of cost-effective genotyping platforms.
Applying this information in plant breeding programmes led to the mapping of QTLs involved in drought response in chickpea and in response to sterility mosaic disease in pigeonpea.
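The SNP discovery step described above (aligning reads from two parental genotypes onto the same transcript assembly and looking for positions where they differ) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the per-position base counts, the function names and the thresholds `min_depth` and `min_fraction` are all assumptions made for illustration.

```python
from collections import Counter


def major_allele(counts, min_depth=5, min_fraction=0.9):
    """Return the dominant base at a position, or None if the call is unreliable.

    min_depth and min_fraction are illustrative thresholds, not values
    taken from the study.
    """
    depth = sum(counts.values())
    if depth < min_depth:
        return None  # too few reads cover this position
    base, n = counts.most_common(1)[0]
    # Require the parent to look homozygous at this position.
    return base if n / depth >= min_fraction else None


def call_snps(pileup_a, pileup_b, **thresholds):
    """Report positions where both parents have a confident, different base."""
    snps = []
    for pos in sorted(pileup_a.keys() & pileup_b.keys()):
        a = major_allele(pileup_a[pos], **thresholds)
        b = major_allele(pileup_b[pos], **thresholds)
        if a and b and a != b:
            snps.append((pos, a, b))
    return snps


# Toy base counts per transcript position for two hypothetical parents.
parent_a = {
    101: Counter("AAAAAAAA"),
    150: Counter("GGGGGG"),
    200: Counter("CCCT"),        # depth below min_depth
    230: Counter("TTTTTTTTTT"),
}
parent_b = {
    101: Counter("GGGGGGG"),
    150: Counter("GGGGGGG"),
    200: Counter("TTTTTT"),
    230: Counter("TTTTTCCCCC"),  # mixed signal, not called
}

snps = call_snps(parent_a, parent_b)
print(snps)  # [(101, 'A', 'G')]
```

Only position 101 is reported: position 150 is identical in both parents, 200 has too little coverage in one parent, and 230 shows a mixed signal in the other. Markers of this kind are useful precisely because they are polymorphic between the two parents of a mapping population.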
RESEARCH ARTICLE Open Access
De novo assembly and Characterisation of the Transcriptome during seed development, and generation of genic-SSR markers in Peanut (Arachis hypogaea L.)
Jianan Zhang1,2, Shan Liang3, Jialei Duan4, Jin Wang1, Silong Chen1, Zengshu Cheng1, Qiang Zhang1, Xuanqiang Liang5 and Yurong Li1*
Abstract
Background: The peanut (Arachis hypogaea L.) is an important oilseed crop in tropical and subtropical regions of the world. However, little about the molecular biology of the peanut is currently known. Recently, next-generation sequencing technology, termed RNA-seq, has provided a powerful approach for analysing the transcriptome, and for shedding light on the molecular biology of peanut.
Results: In this study, we employed RNA-seq to analyse the transcriptomes of the immature seeds of three different peanut varieties with different oil contents. A total of 26.1-27.2 million paired-end reads with lengths of 100 bp were generated from the three varieties and 59,077 unigenes were assembled with N50 of 823 bp. Based on sequence similarity search with known proteins, a total of 40,100 genes were identified. Among these unigenes, only 8,252 unigenes were annotated with 42 gene ontology (GO) functional categories, and 18,028 unigenes mapped to 125 pathways by searching against the Kyoto Encyclopedia of Genes and Genomes pathway database (KEGG). In addition, 3,919 microsatellite markers were developed in the unigene library, and 160 PCR primers of SSR loci were used for validation of the amplification and the polymorphism.
Conclusion: We completed a successful global analysis of the peanut transcriptome using RNA-seq; a large number of unigenes were assembled, and almost four thousand SSR primers were developed. These data will facilitate gene discovery and functional genomic studies of the peanut plant. In addition, this study provides insight into the complex transcriptome of the peanut and establishes a biotechnological platform for future research.
Background
The peanut (Arachis hypogaea L.), also known as the groundnut, is an important oilseed crop in the tropical and subtropical regions of the world. It is grown on six continents but mainly in Asia, Africa and America. Peanuts are cultivated on 23.51 million hectares worldwide, with a total global production of approximately 35.52 million tons (the weight includes the shell). China is the largest producer in the world, accounting for 37.6% (13.34 million tons) of the total world production (FAO, 2009, http://faostat.fao.org).
Peanuts have a desirable fatty acid profile and are rich in vitamins, minerals and bioactive materials, including several known heart-healthy nutrients, such as monounsaturated and polyunsaturated fatty acids, potassium, magnesium, copper, niacin, arginine, fibre, α-tocopherol, folates, phytosterols, and flavonoids. Indeed, peanut consumption has been associated with an improvement in the overall quality of the diet and nutrient [1-4].
In China, almost 60% of the peanuts are used to produce peanut oil [5]. Peanut oil, due to its high monounsaturated fat content, is considered healthier than saturated oils and is resistant to rancidity. Monounsaturated fat,
* Correspondence: [email protected] of Food and Oil Crops, Hebei Academy of Agriculture and Forestry Sciences/Laboratory of Crop Genetics and Breeding of Hebei Province, Shijiazhuang 050031, China
Full list of author information is available at the end of the article
Zhang et al. BMC Genomics 2012, 13:90
http://www.biomedcentral.com/1471-2164/13/90
© 2012 Zhang et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The peanut (Arachis hypogaea L.), also known as the groundnut, is an important oilseed crop in the tropical and subtropical regions of the world. It is grown on six continents but mainly in Asia, Africa and America. Peanuts are cultivated on 23.51 million hectares worldwide, with a total global production of approximately 35.52 million tons.
Peanut consumption has been associated with an improvement in the overall quality of the diet and nutrient.
The development of the peanut seed has been studied intensely to understand the physiological, biochemical, and molecular characteristics that determine oil quality and its beneficial nutritional contributions. However, seed development is a complex process involving a cascade of biochemical changes driven by the transcriptional modulation of many genes, and little is known about these transcriptional changes and their regulation.
We generated 27.2, 26.9 and 26.1 million 100-bp paired-end reads for the JH4, K01 and T21 varieties, encompassing 2.44, 2.42 and 2.35 Gb of sequence data, respectively.
Assembling these reads produced 44,028, 47,110 and 44,157 unigenes for the JH4, K01 and T21 varieties, respectively. After the final clustering, 59,077 unigenes were obtained.
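The abstract summarises this assembly with an N50 of 823 bp. As a reminder of what that statistic means, here is a minimal Python sketch of the usual N50 definition: the length L such that sequences of length L or longer account for at least half of the total assembled bases. The function name and the toy lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy unigene lengths in bp (hypothetical values).
toy_lengths = [2000, 1500, 1000, 800, 500, 200]
print(n50(toy_lengths))  # 1500
```

In the toy example the total is 6,000 bp; the two longest sequences (2,000 + 1,500 bp) already cover more than half of it, so the N50 is 1,500 bp.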
The unigenes were compared against the NCBI NR protein database using blastx. Among the 59,077 unigenes, 40,100 (67.88%) had at least one significant match with an E-value below 1e-5. A total of 18,977 unigenes had no significant match to any known protein, a result that may partly reflect novel or highly divergent genes; alternatively, these unigenes could represent untranslated regions.
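The annotation criterion above (at least one hit with an E-value below 1e-5) can be sketched in Python as a filter over BLAST tabular output. This assumes the standard 12-column tabular format (query ID in the first column, E-value in the eleventh); the unigene and protein identifiers are made up, and the study's own scripts are not described in the source.

```python
import csv
import io


def annotated_queries(blast_tab, max_evalue=1e-5):
    """Return the query IDs with at least one hit whose E-value is below max_evalue."""
    hits = set()
    for row in csv.reader(blast_tab, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, evalue = row[0], float(row[10])
        if evalue < max_evalue:
            hits.add(qseqid)
    return hits


# Toy tabular rows: qseqid, sseqid, pident, length, mismatch, gapopen,
# qstart, qend, sstart, send, evalue, bitscore (all values hypothetical).
rows = [
    ["unigene1", "protA", "85.0", "100", "15", "0", "1", "300", "1", "100", "1e-30", "200"],
    ["unigene1", "protB", "60.0", "90", "36", "0", "1", "270", "5", "94", "2e-08", "80"],
    ["unigene2", "protC", "40.0", "50", "30", "0", "1", "150", "1", "50", "0.5", "25"],
    ["unigene3", "protD", "70.0", "120", "36", "0", "1", "360", "1", "120", "3e-12", "110"],
]
toy_table = io.StringIO("\n".join("\t".join(r) for r in rows))

annotated = annotated_queries(toy_table)
print(sorted(annotated))  # ['unigene1', 'unigene3']

# Consistency check on the figures reported in the text.
total, matched = 59077, 40100
print(round(100 * matched / total, 2), total - matched)  # 67.88 18977
```

The last two lines only confirm that the reported percentage and the count of unmatched unigenes agree with each other.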
Microsatellite markers (SSR markers) are some of the most successful molecular markers in the construction of a peanut genetic map and in diversity analysis. In this study, 5,883 microsatellites were detected in 4,993 unigenes, of which 728 sequences contained more than 1 SSR.
Based on the 5,883 SSRs, 3,919 primer pairs were successfully designed.
A total of 160 primer pairs were randomly selected to validate these polymorphisms in six varieties. All 160 of the markers yielded amplification products, and 65 (40.63%) exhibited polymorphisms among the six varieties.
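The microsatellite detection step mentioned above can be illustrated with a small regular-expression search in Python. This is only a sketch: the source does not state which tool or thresholds were used, so the minimum repeat counts in `MIN_REPEATS`, the function names and the toy sequence are all assumptions.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative only).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq):
    """Return (start, motif, repeat_count) for each perfect SSR found in seq."""
    seq = seq.upper()
    found = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                found.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(found)


# Toy unigene containing an (AT)6 and a (GAA)5 repeat.
toy_unigene = "GGG" + "AT" * 6 + "CCCGT" + "GAA" * 5 + "TTT"
print(find_ssrs(toy_unigene))  # [(3, 'AT', 6), (20, 'GAA', 5)]
```

A unigene with two or more entries in the returned list corresponds to the "more than 1 SSR" case counted in the text; primer pairs would then be designed in the sequence flanking each repeat.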